                       =============================
                       p0f v3: passive fingerprinter
                       =============================

                   http://lcamtuf.coredump.cx/p0f3.shtml

         Copyright (C) 2012 by Michal Zalewski <lcamtuf@coredump.cx>

---------------
1. What's this?
---------------

P0f is a tool that utilizes an array of sophisticated, purely passive traffic
fingerprinting mechanisms to identify the players behind any incidental TCP/IP
communications (often as little as a single normal SYN) without interfering in
any way.

Some of its capabilities include:

  - Highly scalable and extremely fast identification of the operating system
    and software on both endpoints of a vanilla TCP connection - especially in
    settings where NMap probes are blocked, too slow, unreliable, or would
    simply set off alarms.

  - Measurement of system uptime and network hookup, distance (including
    topology behind NAT or packet filters), and so on.

  - Automated detection of connection sharing / NAT, load balancing, and
    application-level proxying setups.

  - Detection of dishonest clients / servers that forge declarative statements
    such as X-Mailer or User-Agent.

The tool can be operated in the foreground or as a daemon, and offers a simple
real-time API for third-party components that wish to obtain additional
information about the actors they are talking to.

Common uses for p0f include reconnaissance during penetration tests; routine
network monitoring; detection of unauthorized network interconnects in corporate
environments; providing signals for abuse-prevention tools; and miscellaneous
forensics.

A snippet of typical p0f output may look like this:

.-[ 1.2.3.4/1524 -> 4.3.2.1/80 (syn) ]-
|
| client   = 1.2.3.4
| os       = Windows XP
| dist     = 8
| params   = none
| raw_sig  = 4:120+8:0:1452:65535,0:mss,nop,nop,sok:df,id+:0
|
`----

.-[ 1.2.3.4/1524 -> 4.3.2.1/80 (syn+ack) ]-
|
| server   = 4.3.2.1
| os       = Linux 3.x
| dist     = 0
| params   = none
| raw_sig  = 4:64+0:0:1460:mss*10,0:mss,nop,nop,sok:df:0
|
`----

.-[ 1.2.3.4/1524 -> 4.3.2.1/80 (mtu) ]-
|
| client   = 1.2.3.4
| link     = DSL
| raw_mtu  = 1492
|
`----

.-[ 1.2.3.4/1524 -> 4.3.2.1/80 (uptime) ]-
|
| client   = 1.2.3.4
| uptime   = 0 days 11 hrs 16 min (modulo 198 days)
| raw_freq = 250.00 Hz
|
`----

A live demonstration can be seen here:

http://lcamtuf.coredump.cx/p0f3/

--------------------
2. How does it work?
--------------------

A vast majority of metrics used by p0f were invented specifically for this tool,
and include data extracted from IPv4 and IPv6 headers, TCP headers, the dynamics
of the TCP handshake, and the contents of application-level payloads.

For TCP/IP, the tool fingerprints the client-originating SYN packet and the
first SYN+ACK response from the server, paying attention to factors such as the
ordering of TCP options, the relation between maximum segment size and window
size, the progression of TCP timestamps, and the state of about a dozen possible
implementation quirks (e.g. non-zero values in "must be zero" fields).

The metrics used for application-level traffic vary from one module to another;
where possible, the tool relies on signals such as the ordering or syntax of
HTTP headers or SMTP commands, rather than any declarative statements such as
User-Agent. Application-level fingerprinting modules currently support HTTP.
Before the tool leaves "beta", I want to add SMTP and FTP. Other protocols,
such as POP3, IMAP, SSH, and SSL, may follow.

The list of all the measured parameters is reviewed in section 5 later on.
Some of the analysis also happens on a higher level: inconsistencies in the
data collected from various sources, or in the data from the same source
obtained over time, may be indicative of address translation, proxying, or
just plain trickery. For example, a system where TCP timestamps jump back
and forth, or where TTLs and MTUs change subtly, is probably a NAT device.

-------------------------------
3. How do I compile and use it?
-------------------------------

To compile p0f, try running './build.sh'; if that fails, you will probably be
given some tips about the likely cause. If the tips are useless, send me a
mean-spirited mail.

It is also possible to build a debug binary ('./build.sh debug'), in which case
verbose packet parsing and signature matching information will be written to
stderr. This is useful when troubleshooting problems, but that's about it.

The tool should compile cleanly under any reasonably new version of Linux,
FreeBSD, OpenBSD, MacOS X, and so forth. You can also build it on Windows using
cygwin and winpcap. I have not tested it on all possible varieties of un*x, but
if there are issues, they should be fairly superficial.

Once you have the binary compiled, you should be aware of the following
command-line options:

  -f fname  - reads fingerprint database (p0f.fp) from the specified location.
              See section 5 for more information about the contents of this
              file.

              The default location is ./p0f.fp. If you want to install p0f, you
              may want to change FP_FILE in config.h to /usr/local/etc/p0f.fp.

  -i iface  - asks p0f to listen on a specific network interface. On un*x, you
              should reference the interface by name (e.g., eth0). On Windows,
              you can use adapter index instead (0, 1, 2...).

              Multiple -i parameters are not supported; you need to run
              separate instances of p0f for that. On Linux, you can specify
              'any' to access a pseudo-device that combines the traffic on
              all other interfaces; the only limitation is that libpcap will
              not recognize VLAN-tagged frames in this mode, which may be
              an issue in some of the more exotic setups.

              If you do not specify an interface, libpcap will probably pick
              the first working interface in your system.

  -L        - lists all available network interfaces, then quits. Particularly
              useful on Windows, where the system-generated interface names
              are impossible to memorize.

  -r fname  - instead of listening for live traffic, reads pcap captures from
              the specified file. The data can be collected with tcpdump or any
              other compatible tool. Make sure that snapshot length (-s
              option in tcpdump) is large enough not to truncate packets; the
              default may be too small.

              As with -i, only one -r option can be specified at any given
              time.

  -o fname  - appends grep-friendly log data to the specified file. The log
              contains all observations made by p0f about every matching
              connection, and may grow large; plan accordingly.

              Only one instance of p0f should be writing to a particular file
              at any given time; where supported, advisory locking is used to
              avoid problems.

  -s fname  - listens for API queries on the specified filesystem socket. This
              allows other programs to ask p0f about its current thoughts about
              a particular host. More information about the API protocol can be
              found in section 4 below.

              Only one instance of p0f can be listening on a particular socket
              at any given time. The mode is also incompatible with -r.

  -d        - runs p0f in daemon mode: the program will fork into background
              and continue writing to the specified log file or API socket.
              It will continue running until killed, until the listening
              interface is shut down, or until some other fatal error is
              encountered.

              This mode requires either -o or -s to be specified.

              To continue capturing p0f debug output and error messages (but
              not signatures), redirect stderr to another non-TTY destination,
              e.g.:

              ./p0f -o /var/log/p0f.log -d 2>>/var/log/p0f.error

              Note that if -d is specified and stderr points to a TTY, error
              messages will be lost.

  -u user   - causes p0f to drop privileges, switching to the specified user
              and chroot()ing itself to said user's home directory.

              This mode is *highly* advisable (but not required) on un*x
              systems, especially in daemon mode. See section 7 for more info.

More arcane settings (you probably don't need to touch these):

  -p        - puts the interface specified with -i in promiscuous mode. If
              supported by the firmware, the card will also process frames not
              addressed to it.

  -S num    - sets the maximum number of simultaneous API connections. The
              default is 20; the upper cap is 100.

  -m c,h    - sets the maximum number of connections (c) and hosts (h) to be
              tracked at the same time (default: c = 1,000, h = 10,000). Once
              the limit is reached, the oldest 10% of entries get pruned to
              make room for new data.

              This setting effectively controls the memory footprint of p0f.
              The cost of tracking a single host is under 400 bytes; active
              connections have a worst-case footprint of about 18 kB. High
              limits have some CPU impact, too, by virtue of complicating
              data lookups in the cache.

              NOTE: P0f tracks connections only until the handshake is done,
              and if protocol-level fingerprinting is possible, until the
              first few kilobytes of data have been exchanged.
              This means that most connections are dropped from the cache in
              under 5 seconds; consequently, the 'c' variable can be much lower
              than the real number of parallel connections happening on the
              wire.

  -t c,h    - sets the timeout for collecting signatures for any connection
              (c); and for purging idle hosts from in-memory cache (h). The
              first parameter is given in seconds, and defaults to 30 s; the
              second one is in minutes, and defaults to 120 min.

              The first value must be just high enough to reliably capture
              SYN, SYN+ACK, and the initial few kB of traffic. Low-performance
              sites may want to increase it slightly.

              The second value governs for how long API queries about a
              previously seen host can be made; and what's the maximum interval
              between signatures to still trigger NAT detection and so on.
              Raising it is usually not advisable; lowering it to 5-10 minutes
              may make sense for high-traffic servers, where it is possible to
              see several unrelated visitors subsequently obtaining the same
              dynamic IP from their ISP.

Well, that's about it. You probably need to run the tool as root. Some of the
most common use cases:

# ./p0f -i eth0

# ./p0f -i eth0 -d -u p0f-user -o /var/log/p0f.log

# ./p0f -r some_capture.cap

The greppable log format (-o) uses pipe ('|') as a delimiter, with name=value
pairs describing the signature in a manner very similar to the pretty-printed
output generated on stdout:

[2012/01/04 10:26:14] mod=mtu|cli=1.2.3.4/1234|srv=4.3.2.1/80|subj=cli|link=DSL|raw_mtu=1492

The 'mod' parameter identifies the subsystem that generated the entry; the
'cli' and 'srv' parameters always describe the direction in which the TCP
session is established; and 'subj' describes which of these two parties is
actually being fingerprinted.
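
A log line in this format is easy to split back into fields. The Python sketch
below assumes only the layout shown above (a bracketed timestamp followed by
pipe-delimited name=value pairs). The function name parse_log_line is made up
for this example and is not part of p0f; the code also naively assumes that
values never contain a '|' character.

```python
def parse_log_line(line):
    """Split one p0f -o log line into (timestamp, fields)."""
    # The timestamp sits between the leading '[' and the first '] '.
    stamp, _, rest = line.strip().partition("] ")
    stamp = stamp.lstrip("[")
    fields = {}
    for pair in rest.split("|"):
        # Split on the first '=' only, in case a value itself contains '='.
        name, _, value = pair.partition("=")
        fields[name] = value
    return stamp, fields


line = ("[2012/01/04 10:26:14] mod=mtu|cli=1.2.3.4/1234|srv=4.3.2.1/80|"
        "subj=cli|link=DSL|raw_mtu=1492")
stamp, fields = parse_log_line(line)
print(stamp, fields["mod"], fields["subj"], fields["raw_mtu"])
```

Keeping the fields in a dictionary means that entries from different modules,
which carry different sets of names, can be handled by the same code.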

Command-line options may be followed by a single parameter containing a
pcap-style traffic filtering rule. This allows you to reject some of the less
interesting packets for performance or privacy reasons. Simple examples include:

  'dst net 10.0.0.0/8 and port 80'

  'not src host 10.1.2.3'

  'port 22 or port 443'

You can read more about the supported syntax by doing 'man pcap-filter'; if
that fails, try this URL:

  http://www.manpagez.com/man/7/pcap-filter/

Filters work both for online capture (-i) and for previously collected data
produced by any other tool (-r).

-------------
4. API access
-------------

The API allows other applications running on the same system to get p0f's
current opinion about a particular host. This is useful for integrating it with
spam filters, web apps, and so on.

Clients are welcome to connect to the unix socket specified with -s using the
SOCK_STREAM protocol, and may issue any number of fixed-length queries. The
queries will be answered in the order they are received.

Note that there is no response caching, nor any software limits in place on p0f
end, so it is your responsibility to write reasonably well-behaved clients.

Queries have exactly 21 bytes. The format is:

  - Magic dword (0x50304601), in native endian of the platform.

  - Address type byte: 4 for IPv4, 6 for IPv6.

  - 16 bytes of address data, network endian. IPv4 addresses should be
    aligned to the left.

To such a query, p0f responds with:

  - Another magic dword (0x50304602), native endian.

  - Status dword: 0x00 for 'bad query', 0x10 for 'OK', and 0x20 for 'no match'.

  - Host information, valid only if status is 'OK' (byte width in square
    brackets):

    [4]  first_seen  - unix time (seconds) of first observation of the host.

    [4]  last_seen   - unix time (seconds) of most recent traffic.

    [4]  total_conn  - total number of connections seen.

    [4]  uptime_min  - calculated system uptime, in minutes. Zero if not known.

    [4]  up_mod_days - uptime wrap-around interval, in days.

    [4]  last_nat    - time of the most recent detection of IP sharing (NAT,
                       load balancing, proxying). Zero if never detected.

    [4]  last_chg    - time of the most recent individual OS mismatch (e.g.,
                       due to multiboot or IP reuse).

    [2]  distance    - system distance (derived from TTL; -1 if no data).

    [1]  bad_sw      - p0f thinks the User-Agent or Server strings aren't
                       accurate. The value of 1 means OS difference (possibly
                       due to proxying), while 2 means an outright mismatch.

                       NOTE: If User-Agent is not present at all, this value
                       stays at 0.

    [1]  os_match_q  - OS match quality: 0 for a normal match; 1 for fuzzy
                       (e.g., TTL or DF difference); 2 for a generic signature;
                       and 3 for both.

    [32] os_name     - NUL-terminated name of the most recent positively matched
                       OS. If OS not known, os_name[0] is NUL.

                       NOTE: If the host is first seen using a known system and
                       then switches to an unknown one, this field is not
                       reset.

    [32] os_flavor   - OS version. May be empty if no data.

    [32] http_name   - most recent positively identified HTTP application
                       (e.g. 'Firefox').

    [32] http_flavor - version of the HTTP application, if any.

    [32] link_type   - network link type, if recognized.

    [32] language    - system language, if recognized.

A simple reference implementation of an API client is provided in p0f-client.c.
Implementations in C / C++ may reuse api.h from p0f source code, too.

Developers using the API should be aware of several important constraints:

  - The maximum number of simultaneous API connections is capped to 20.
    The limit may be adjusted with the -S parameter, but rampant parallelism
    may lead to poorly controlled latency; consider a single query pipeline,
    possibly with prioritization and caching.

  - The maximum number of hosts and connections tracked at any given time is
    subject to configurable limits. You should look at your traffic stats and
    see if the defaults are suitable.

    You should also keep in mind that whenever you are subject to an ongoing
    DDoS or SYN spoofing DoS attack, p0f may end up dropping entries faster
    than you could query for them. It's that or running out of memory, so
    don't fret.

  - Cache entries with no activity for more than 120 minutes will be dropped
    even if the cache is nearly empty. The timeout is adjustable with -t, but
    you should not use the API to obtain ancient data; if you routinely need to
    go back hours or days, parse the logs instead of wasting RAM.

-----------------------
5. Fingerprint database
-----------------------

Whenever p0f obtains a fingerprint from the observed traffic, it defers to
the data read from p0f.fp to identify the operating system and obtain some
ancillary data needed for other analysis tasks. The fingerprint database is a
simple text file where lines starting with ; are ignored.

== Module specification ==

The file is split into sections based on the type of traffic the fingerprints
apply to. Section identifiers are enclosed in square brackets, like so:

[module:direction]

  module    - the name of the fingerprinting module (e.g. 'tcp' or 'http').

  direction - the direction of fingerprinted traffic: 'request' (from client to
              server) or 'response' (from server to client).

              For the TCP module, 'request' matches the initial SYN; and
              'response' matches SYN+ACK.

The 'direction' part is omitted for MTU signatures, as they work equally well
both ways.
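
The section layout can be walked with a few lines of code. The Python sketch
below is illustrative only and is not how p0f itself reads the file: the
function name read_sections is made up, skipping blank lines is an assumption
(the text above only promises that ';' lines are ignored), and the sample
'label' line under [tcp:request] is a hypothetical entry.

```python
def read_sections(lines):
    """Yield (module, direction, line) for every data line in a p0f.fp file."""
    module, direction = None, None
    for raw in lines:
        line = raw.strip()
        if not line or line.startswith(";"):
            continue  # blank lines (assumed) and ';' comments carry no data
        if line.startswith("[") and line.endswith("]"):
            # '[tcp:request]' -> ('tcp', 'request'); '[mtu]' -> ('mtu', None)
            module, _, direction = line[1:-1].partition(":")
            direction = direction or None
            continue
        yield module, direction, line


sample = [
    "; a comment",
    "[mtu]",
    "label = Ethernet",
    "sig = 1500",
    "",
    "[tcp:request]",
    "label = s:unix:Linux:3.x",   # hypothetical entry, for illustration
]
for row in read_sections(sample):
    print(row)
```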

== Signature groups ==

The actual signatures must be preceded by a 'label' line, describing the
fingerprinted software:

label = type:class:name:flavor

  type      - some signatures in p0f.fp offer broad, last-resort matching for
              less researched corner cases. The goal there is to give an
              answer slightly better than "unknown", but less precise than
              what the user may be expecting.

              Normal, reasonably specific signatures that can't be radically
              improved should have their type specified as 's'; while generic,
              last-resort ones should be tagged with 'g'.

              Note that generic signatures are considered only if no specific
              matches are found in the database.

  class     - the tool needs to distinguish between OS-identifying signatures
              (only one of which should be matched for any given host) and
              signatures that just identify user applications (many of which
              may be seen concurrently).

              To assist with this, OS-specific signatures should specify the
              OS architecture family here (e.g., 'win', 'unix', 'cisco'); while
              application-related sigs (NMap, MSIE, Apache) should use a
              special value of '!'.

              Most TCP signatures are OS-specific, and should have OS family
              defined. Other signatures, such as HTTP, should use '!' unless
              the fingerprinted component is deeply intertwined with the
              platform (e.g., Windows Update).

              NOTE: To avoid variations (e.g. 'win' and 'windows' or 'unix'
              and 'linux'), all classes need to be pre-registered using a
              'classes' directive, seen near the beginning of p0f.fp.

  name      - a human-readable short name for what the fingerprint actually
              helps identify - say, 'Linux', 'Sendmail', or 'NMap'. The tool
              doesn't care about the exact value, but requires consistency - so
              don't switch between 'Internet Explorer' and 'MSIE', or 'MacOS'
              and 'Mac OS'.

  flavor    - anything you want to say to further qualify the observation. Can
              be the version of the identified software, or a description of
              what the application seems to be doing (e.g. 'SYN scan' for NMap).

              NOTE: Don't be too specific: if you have a signature for Apache
              2.2.16, but have no reason to suspect that other recent versions
              behave in a radically different way, just say '2.x'.

P0f uses labels to group similar signatures that may be plausibly generated by
the same system or application, and should not be considered a strong signal for
NAT detection.

To further assist the tool in deciding which OS and application combinations are
reasonable, and which ones are indicative of foul play, any 'label' line for
applications (class '!') should be followed by a 'sys' line with a
comma-delimited list of OS names or @-prefixed OS architecture classes on which
this software is known to be used. For example:

label = s:!:Uncle John's Networked ls Utility:2.3.0.1
sys   = Linux,FreeBSD,OpenBSD

...or:

label = s:!:Mom's Homestyle Browser:1.x
sys   = @unix,@win

The label can be followed by any number of module-specific signatures; all of
them will be linked to the most recent label, and will be reported the same
way.

All sections except for 'name' are omitted for [mtu] signatures, which do not
convey any OS-specific information, and just describe link types.

== MTU signatures ==

Many operating systems derive the maximum segment size specified in TCP options
from the MTU of their network interface; that value, in turn, normally depends
on the design of the link-layer protocol. A different MTU is associated with
PPPoE, a different one with IPSec, and a different one with Juniper VPN.
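
The relation between MSS and MTU is simple arithmetic if one assumes minimal
header sizes: 20 bytes of IPv4 header (or 40 bytes of fixed IPv6 header) plus
20 bytes of TCP header without options. The Python sketch below works on that
assumption; it is an illustration rather than code taken from p0f, and the
function name mtu_from_mss is made up.

```python
IP_HEADER = {4: 20, 6: 40}   # minimal IPv4 / fixed IPv6 header, in bytes
TCP_HEADER = 20              # TCP header without options, in bytes


def mtu_from_mss(mss, ip_version=4):
    """Estimate the link MTU from the MSS advertised in a SYN or SYN+ACK."""
    return mss + IP_HEADER[ip_version] + TCP_HEADER


print(mtu_from_mss(1452))    # MSS taken from the sample SYN in section 1
```

For the sample SYN shown in section 1 (MSS of 1452 over IPv4), this gives 1492,
which agrees with the raw_mtu value reported for that client.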

The format of the signatures in the [mtu] section is exceedingly simple,
consisting just of a description and a list of values:

label = Ethernet
sig   = 1500

These will be matched for any wildcard MSS TCP packets (see below) not generated
by userspace TCP tools.

== TCP signatures ==

For TCP traffic, signature layout is as follows:

sig = ver:ittl:olen:mss:wsize,scale:olayout:quirks:pclass

  ver     - signature for IPv4 ('4'), IPv6 ('6'), or both ('*').

            NEW SIGNATURES: P0f documents the protocol observed on the wire,
            but you should replace it with '*' unless you have observed some
            actual differences between IPv4 and IPv6 traffic, or unless the
            software supports only one of these versions to begin with.

  ittl    - initial TTL used by the OS. Almost all operating systems use
            64, 128, or 255; ancient versions of Windows sometimes used
            32, and several obscure systems sometimes resort to odd values
            such as 60.

            NEW SIGNATURES: P0f will usually suggest something, using the
            format of 'observed_ttl+distance' (e.g. 54+10). Consider using
            traceroute to check that the distance is accurate, then sum up
            the values. If initial TTL can't be guessed, p0f will output
            'nnn+?', and you need to use traceroute to estimate the '?'.

            A handful of userspace tools will generate random TTLs. In these
            cases, determine maximum initial TTL and then add a - suffix to
            the value to avoid confusion.

  olen    - length of IPv4 options or IPv6 extension headers. Usually zero
            for normal IPv4 traffic; always zero for IPv6 due to the
            limitations of libpcap.

            NEW SIGNATURES: Copy p0f output literally.

  mss     - maximum segment size, if specified in TCP options. Special value
            of '*' can be used to denote that MSS varies depending on the
            parameters of sender's network link, and should not be a part of
            the signature.
            In this case, MSS will be used to guess the type of network hookup
            according to the [mtu] rules.

            NEW SIGNATURES: Use '*' for any commodity OSes where MSS is
            around 1300 - 1500, unless you know for sure that it's fixed.
            If the value is outside that range, you can probably copy it
            literally.

  wsize   - window size. Can be expressed as a fixed value, but many
            operating systems set it to a multiple of MSS or MTU, or a
            multiple of some random integer. P0f automatically detects these
            cases, and allows notation such as 'mss*4', 'mtu*4', or '%8192'
            to be used. Wildcard ('*') is possible too.

            NEW SIGNATURES: Copy p0f output literally. If frequent variations
            are seen, look for obvious patterns. If there are no patterns,
            '*' is a possible alternative.

  scale   - window scaling factor, if specified in TCP options. Fixed value
            or '*'.

            NEW SIGNATURES: Copy literally, unless the value varies randomly.
            Many systems alternate between 2 or 3 scaling factors, in which
            case, it's better to have several 'sig' lines, rather than a
            wildcard.

  olayout - comma-delimited layout and ordering of TCP options, if any. This
            is one of the most valuable TCP fingerprinting signals. Supported
            values:

            eol+n  - explicit end of options, followed by n bytes of padding
            nop    - no-op option
            mss    - maximum segment size
            ws     - window scaling
            sok    - selective ACK permitted
            sack   - selective ACK (should not be seen)
            ts     - timestamp
            ?n     - unknown option ID n

            NEW SIGNATURES: Copy this string literally.

  quirks  - comma-delimited properties and quirks observed in IP or TCP
            headers:

            df     - "don't fragment" set (probably PMTUD); ignored for IPv6
            id+    - DF set but IPID non-zero; ignored for IPv6
            id-    - DF not set but IPID is zero; ignored for IPv6
            ecn    - explicit congestion notification support
            0+     - "must be zero" field not zero; ignored for IPv6
            flow   - non-zero IPv6 flow ID; ignored for IPv4

            seq-   - sequence number is zero
            ack+   - ACK number is non-zero, but ACK flag not set
            ack-   - ACK number is zero, but ACK flag set
            uptr+  - URG pointer is non-zero, but URG flag not set
            urgf+  - URG flag used
            pushf+ - PUSH flag used

            ts1-   - own timestamp specified as zero
            ts2+   - non-zero peer timestamp on initial SYN
            opt+   - trailing non-zero data in options segment
            exws   - excessive window scaling factor (> 14)
            bad    - malformed TCP options

            If a signature scoped to both IPv4 and IPv6 contains quirks valid
            for just one of these protocols, such quirks will be ignored for
            packets using the other protocol. For example, any combination
            of 'df', 'id+', and 'id-' is always matched by any IPv6 packet.

            NEW SIGNATURES: Copy literally.

  pclass  - payload size classification: '0' for zero, '+' for non-zero,
            '*' for any. The packets we fingerprint right now normally have
            no payloads, but some corner cases exist.

            NEW SIGNATURES: Copy literally.

NOTE: The TCP module allows some fuzziness when an exact match can't be found:
'df' and 'id+' quirks are allowed to disappear; 'id-' or 'ecn' may appear; and
TTLs can change.

To gather new SYN ('request') signatures, simply connect to the fingerprinted
system, and p0f will provide you with the necessary data. To gather SYN+ACK
('response') signatures, you should use the bundled p0f-sendsyn utility while
p0f is running in the background; creating them manually is not advisable.
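
The colon-delimited layout described above can be taken apart mechanically.
The Python sketch below is only an illustration of the field order, not p0f's
own parser: the names TcpSig, parse_tcp_sig and initial_ttl are made up, and
the code assumes that none of the fields contains a ':' character.

```python
from collections import namedtuple

TcpSig = namedtuple(
    "TcpSig", "ver ittl olen mss wsize scale olayout quirks pclass")


def parse_tcp_sig(sig):
    """Split a 'ver:ittl:olen:mss:wsize,scale:olayout:quirks:pclass' string."""
    ver, ittl, olen, mss, win, olayout, quirks, pclass = sig.split(":")
    wsize, scale = win.split(",")
    return TcpSig(ver, ittl, olen, mss, wsize, scale,
                  olayout.split(",") if olayout else [],
                  quirks.split(",") if quirks else [],
                  pclass)


def initial_ttl(ittl):
    """Sum an 'observed_ttl+distance' value; None if it cannot be computed."""
    parts = ittl.rstrip("-").split("+")     # drop the random-TTL '-' suffix
    if not all(p.isdigit() for p in parts):
        return None                         # e.g. '54+?', distance unknown
    return sum(int(p) for p in parts)


sig = parse_tcp_sig("4:120+8:0:1452:65535,0:mss,nop,nop,sok:df,id+:0")
print(sig.olayout, sig.quirks, initial_ttl(sig.ittl))
```

The string used here is the raw_sig value from the sample SYN in section 1;
summing its '120+8' field gives 128, one of the common initial TTLs listed
above.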

== HTTP signatures ==

A special directive should appear at the beginning of the [http:request]
section, structured the following way:

ua_os = Linux,Windows,iOS=[iPad],iOS=[iPhone],Mac OS X,...

This list should specify OS names that should be looked for within the
User-Agent string if the string is otherwise deemed to be honest. This input
is not used for fingerprinting, but aids NAT detection in some useful ways.

The names have to match the names used in 'sig' specifiers across p0f.fp. If a
particular name used by p0f differs from what typically appears in User-Agent,
the name=[string] syntax may be used to define any number of aliases.

Other than that, HTTP signatures for GET and HEAD requests have the following
layout:

sig = ver:horder:habsent:expsw

  ver     - 0 for HTTP/1.0, 1 for HTTP/1.1, or '*' for any.

            NEW SIGNATURES: Copy the value literally, unless you have a
            specific reason to do otherwise.

  horder  - comma-separated, ordered list of headers that should appear in
            matching traffic. Substrings to match within each of these
            headers may be specified using a name=[value] notation.

            The signature will be matched even if other headers appear in
            between, as long as the list itself is matched in the specified
            sequence.

            Headers that usually do appear in the traffic, but may go away
            (e.g. Accept-Language if the user has no languages defined, or
            Referer if no referring site exists) should be prefixed with '?',
            e.g. "?Referer". P0f will accept their disappearance, but will
            not allow them to appear at any other location.

            NEW SIGNATURES: Review the list and remove any headers that
            appear to be irrelevant to the fingerprinted software, and mark
            transient ones with '?'. Remove header values that do not add
            anything to the signature, or are request- or user-specific.
            In particular, pay attention to Accept, Accept-Language, and
            Accept-Charset, as they are highly specific to request type
            and user settings.

            P0f automatically removes some headers, prefixes others with '?',
            and inhibits the value of fields such as 'Referer' or 'Cookie' -
            but this is not a substitute for manual review.

            NOTE: Server signatures may differ depending on the request
            (HTTP/1.1 versus 1.0, keep-alive versus one-shot, etc) and on the
            returned resource (e.g., CGI versus static content). Play around,
            browse to several URLs, also try curl and wget.

  habsent - comma-separated list of headers that must *not* appear in
            matching traffic. This is particularly useful for noting the
            absence of standard headers (e.g. 'Host'), or for differentiating
            between otherwise very similar signatures.

            NEW SIGNATURES: P0f will automatically highlight the absence of
            any normally present headers; other entries may be added where
            necessary.

  expsw   - expected substring in 'User-Agent' or 'Server'. This is not
            used to match traffic, and merely serves to detect dishonest
            software. If you want to explicitly match User-Agent, you need
            to do this in the 'horder' section, e.g.:

            User-Agent=[Firefox]

Any of these sections except for 'ver' may be blank.

There are many protocol-level quirks that p0f could be detecting - for example,
the use of non-standard newlines, or missing or extra spacing between header
field names and values. There is also some information to be gathered from
responses to OPTIONS or POST. That said, it does not seem to be worth the
effort: the protocol is so verbose, and implemented so arbitrarily, that we are
getting more than enough information just with a simple GET / HEAD fingerprint.
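
The ordering rules for 'horder' and 'habsent' can be sketched in a few lines of
Python. This is a simplified illustration of the rules as described above, not
p0f's actual matcher: the function name headers_match is made up, header names
are compared case-sensitively, and name=[value] substring checks are left out.

```python
def headers_match(horder, habsent, observed):
    """Check an ordered list of observed header names against a signature."""
    pos = 0
    for entry in horder:
        optional = entry.startswith("?")
        name = entry.lstrip("?")
        try:
            # Other headers may sit in between, so search forward from pos.
            idx = observed.index(name, pos)
        except ValueError:
            # An optional header may vanish, but must not show up elsewhere.
            if optional and name not in observed:
                continue
            return False
        pos = idx + 1
    # Headers listed in habsent must not appear at all.
    return not any(name in observed for name in habsent)


horder = ["Host", "User-Agent", "?Referer", "Connection"]
habsent = ["Via"]
print(headers_match(horder, habsent,
                    ["Host", "User-Agent", "Accept", "Connection"]))
print(headers_match(horder, habsent,
                    ["Host", "Referer", "User-Agent", "Connection"]))
```

The first call succeeds because the optional Referer is simply missing and the
extra Accept header is tolerated; the second fails because Referer shows up
ahead of User-Agent, that is, at a location other than the one the signature
allows.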

== SMTP signatures ==

  *** NOT IMPLEMENTED YET ***

== FTP signatures ==

  *** NOT IMPLEMENTED YET ***

----------------
6. NAT detection
----------------

In addition to fairly straightforward measurements of intrinsic properties of
a single TCP session, p0f also tries to compare signatures across sessions to
detect client-side connection sharing (NAT, HTTP proxies) or server-side load
balancing.

This is done in two steps: the first significant deviation usually prompts a
"host change" entry (which may be also indicative of multi-boot, address reuse,
or other one-off events); and a persistent pattern of changes prompts an
"ip sharing" notification later on.

All of these messages are accompanied by a set of reason codes:

  os_sig     - the OS detected right now doesn't match the OS detected earlier
               on.

  sig_diff   - no definite OS detection data available, but protocol-level
               characteristics have changed drastically (e.g., different
               TCP option layout).

  app_vs_os  - the application detected running on the host is not supposed
               to work on the host's operating system.

  x_known    - the signature progressed from known to unknown, or vice versa.

The following additional codes are specific to TCP:

  tstamp     - TCP timestamps went back or jumped forward.

  ttl        - TTL values have changed.

  port       - source port number has decreased.

  mtu        - system MTU has changed.

  fuzzy      - the precision with which a TCP signature is matched has
               changed.

The following codes are also issued by the HTTP module:

  via        - data explicitly includes Via / X-Forwarded-For.

  us_vs_os   - OS fingerprint doesn't match User-Agent data, and the
               User-Agent value otherwise looks honest.

  app_srv_lb - server application signatures change, suggesting load
               balancing.

  date       - server-advertised date changes inconsistently.

Different reasons have different weights, balanced to keep p0f very sensitive
even to very homogeneous environments behind NAT. If you end up seeing false
positives or other detection problems in your environment, please let me know!

-----------
7. Security
-----------

You should treat the output from this tool as advisory; the fingerprinting can
be gamed with some minor effort, and it's also possible to evade it altogether
(e.g. with excessive IP fragmentation or bad TCP checksums). Plan accordingly.

P0f should be reasonably secure to operate as a daemon. That said, un*x
users should employ the -u option to drop privileges and chroot() when running
the tool continuously. This greatly reduces the consequences of any mishaps -
and mishaps in C just tend to happen.

To make this step meaningful, the user you are running p0f as should be
completely unprivileged, and should have an empty, read-only home directory. For
example, you can do:

# useradd -d /var/empty/p0f -M -r -s /bin/nologin p0f-user
# mkdir -p -m 755 /var/empty/p0f

Please don't put the p0f binary itself, or any other valuable assets, inside
that user's home directory; and certainly do not use any generic locations such
as / or /bin/ in lieu of a proper home.

P0f running in the background should be fairly difficult to DoS, especially
compared to any real TCP services it will be watching. Nevertheless, there are
so many deployment-specific factors at play that you should always preemptively
stress-test your setup, and see how it behaves.

Other than that, let's talk filesystem security. When using the tool in the
API mode (-s), the listening socket is always re-created with 666
permissions, so that applications running as other uids can query it at will.
If you want to preserve the privacy of captured traffic in a multi-user system,
please ensure that the socket is created in a directory with finer-grained
permissions; or change API_MODE in config.h.

The default file mode for binary log data (-o) is 600, on the grounds that
others probably don't need access to historical data; if you need to share logs,
you can pre-create the file or change LOG_MODE in config.h.

Don't build p0f, and do not store its source, binary, configuration files, logs,
or query sockets in world-writable locations such as /tmp (or any
subdirectories created therein).

Last but not least, please do not attempt to make p0f setuid, or otherwise
grant it privileges higher than those of the calling user. Neither the tool
itself, nor the third-party components it depends on, are designed to keep rogue
less-privileged callers at bay. If you use /etc/sudoers to list p0f as the only
program that user X should be able to run as root, that user will probably be
able to compromise your system. The same goes for many other uses of sudo, by
the way.

--------------
8. Limitations
--------------

Here are some of the known issues you may run into:

== General ==

1) RST, ACK, and other experimental fingerprinting modes offered in p0f v2 are
   no longer supported in v3. This is because they proved to have very low
   specificity. The consequence is that you can no longer fingerprint
   "connection refused" responses.

2) API queries or daemon execution are not supported when reading offline pcaps.
   While there may be some fringe use cases for that, offline pcaps use a
   much simpler event loop, and so supporting these features would require some
   extra effort.

3) P0f needs to observe at least about 25 milliseconds worth of qualifying
   traffic to estimate system uptime.
   This means that if you're testing it over loopback or LAN, you may need
   to let it see more than one connection.

   Systems with extremely slow timestamp clocks may need longer acquisition
   periods (up to several seconds); very fast clocks (over 1.5 kHz) are rejected
   completely on account of being prohibited by the RFC. Almost all OSes are
   between 100 Hz and 1 kHz, which should work fine.

4) Some systems vary SYN+ACK responses based on the contents of the initial SYN,
   sometimes removing TCP options not supported by the other endpoint.
   Unfortunately, there is no easy way to account for this, so several SYN+ACK
   signatures may be required per system. The bundled p0f-sendsyn utility helps
   with collecting them.

   Another consequence of this is that you will sometimes see server uptime only
   if your own system has RFC1323 timestamps enabled. Linux has done that since
   version 2.2; on Windows, you need version 7 or newer. Client uptimes are not
   affected.

== Windows port ==

1) API sockets do not work on Windows. This is due to a limitation of winpcap;
   see live_event_loop(...) in p0f.c for more info.

2) The chroot() jail (-u) on Windows doesn't offer any real security. This is
   due to the limitations of cygwin.

3) The p0f-sendsyn utility doesn't work because of the limited capabilities of
   Windows raw sockets (this should be relatively easy to fix if there are any
   users who care).

---------------------------
9. Acknowledgments and more
---------------------------

P0f is made possible thanks to the contributions of several good souls,
including:

  Phil Ames
  Jannich Brendle
  Matthew Dempsky
  Jason DePriest
  Dalibor Dukic
  Mark Martinec
  Damien Miller
  Josh Newton
  Nibbler
  Bernhard Rabe
  Chris John Riley
  Sebastian Roschke
  Peter Valchev
  Jeff Weisberg
  Anthony Howe
  Tomoyuki Murakami
  Michael Petch

If you wish to help, the most immediate way to do so is to simply gather new
signatures, especially from less popular or older platforms (servers, networking
equipment, portable / embedded / specialty OSes, etc).

Problems? Suggestions? Complaints? Compliments? You can reach the author at
<lcamtuf@coredump.cx>. The author is very lonely and appreciates your mail.