@(#) $Id: README,v 1.8 2007/10/07 16:41:05 deraadt Exp $ (LBL)

TCPDUMP 3.4
Lawrence Berkeley National Laboratory
Network Research Group
tcpdump@ee.lbl.gov
ftp://ftp.ee.lbl.gov/tcpdump.tar.Z

This directory contains source code for tcpdump, a tool for network
monitoring and data acquisition. The original distribution is
available via anonymous ftp to ftp.ee.lbl.gov, in tcpdump.tar.Z.

Tcpdump now uses libpcap, a system-independent interface for user-level
packet capture. Before building tcpdump, you must first retrieve and
build libpcap, also from LBL, in:

    ftp://ftp.ee.lbl.gov/libpcap.tar.Z

Once libpcap is built (either install it or make sure it's in
../libpcap), you can build tcpdump using the procedure in the INSTALL
file.

The program is loosely based on SMI's "etherfind" although none of the
etherfind code remains. It was originally written by Van Jacobson as
part of an ongoing research project to investigate and improve tcp and
internet gateway performance. The parts of the program originally
taken from Sun's etherfind were later re-written by Steven McCanne of
LBL. To ensure that there would be no vestige of proprietary code in
tcpdump, Steve wrote these pieces from the specification given by the
manual entry, with no access to the source of tcpdump or etherfind.

Over the past few years, tcpdump has been steadily improved by the
excellent contributions from the Internet community (just browse
through the CHANGES file). We are grateful for all the input.

Richard Stevens gives an excellent treatment of the Internet protocols
in his book ``TCP/IP Illustrated, Volume 1''. If you want to learn more
about tcpdump and how to interpret its output, pick up this book.

Some tools for viewing and analyzing tcpdump trace files are available
from the Internet Traffic Archive:

    http://www.acm.org/sigcomm/ITA/

Another tool that tcpdump users might find useful is tcpslice:

    ftp://ftp.ee.lbl.gov/tcpslice.tar.gz

It is a program that can be used to extract portions of tcpdump binary
trace files. See the above distribution for further details and
documentation.

Problems, bugs, questions, desirable enhancements, source code
contributions, etc., should be sent to the email address
"tcpdump@ee.lbl.gov".

 - Steve McCanne
   Craig Leres
   Van Jacobson
-------------------------------------
This directory also contains some short awk programs intended as
examples of ways to reduce tcpdump data when you're tracking
particular network problems:

send-ack.awk
    Simplifies the tcpdump trace for an ftp (or other unidirectional
    tcp transfer). Since we assume that one host only sends and
    the other only acks, all address information is left off and
    we just note if the packet is a "send" or an "ack".

    There is one output line per line of the original trace.
    Field 1 is the packet time in decimal seconds, relative
    to the start of the conversation. Field 2 is delta-time
    from last packet. Field 3 is packet type/direction.
    "Send" means data going from sender to receiver, "ack"
    means an ack going from the receiver to the sender. A
    preceding "*" indicates that the data is a retransmission.
    A preceding "-" indicates a hole in the sequence space
    (i.e., missing packet(s)), a "#" means an odd-size (not max
    seg size) packet. Field 4 has the packet flags
    (same format as raw trace). Field 5 is the sequence
    number (start seq. num for sender, next expected seq number
    for acks). The number in parens following an ack is
    the delta-time from the first send of the packet to the
    ack.
    A number in parens following a send is the
    delta-time from the first send of the packet to the
    current send (on duplicate packets only). Duplicate
    sends or acks have a number in square brackets showing
    the number of duplicates so far.

    Here is a short sample from near the start of an ftp:
        3.00    0.20   send . 512
        3.20    0.20    ack . 1024  (0.20)
        3.20    0.00   send P 1024
        3.40    0.20    ack . 1536  (0.20)
        3.80    0.40 * send . 0  (3.80) [2]
        3.82    0.02 *  ack . 1536  (0.62) [2]
    Three seconds into the conversation, bytes 512 through 1023
    were sent. 200ms later they were acked. Shortly thereafter
    bytes 1024-1535 were sent and again acked after 200ms.
    Then, for no apparent reason, 0-511 is retransmitted, 3.8
    seconds after its initial send (the round trip time for this
    ftp was 1sec, +-500ms). Since the receiver is expecting
    1536, 1536 is re-acked when 0 arrives.

packetdat.awk
    Computes chunk summary data for an ftp (or similar
    unidirectional tcp transfer). [A "chunk" refers to
    a chunk of the sequence space -- essentially the packet
    sequence number divided by the max segment size.]

    A summary line is printed showing the number of chunks,
    the number of packets it took to send that many chunks
    (if there are no lost or duplicated packets, the number
    of packets should equal the number of chunks) and the
    number of acks.

    Following the summary line is one line of information
    per chunk. The line contains eight fields:
       1 - the chunk number
       2 - the start sequence number for this chunk
       3 - time of first send
       4 - time of last send
       5 - time of first ack
       6 - time of last ack
       7 - number of times chunk was sent
       8 - number of times chunk was acked
    (all times are in decimal seconds, relative to the start
    of the conversation.)

    As an example, here is the first part of the output for
    an ftp trace:

        # 134 chunks. 536 packets sent. 508 acks.
        1     1    0.00   5.80   0.20   0.20   4    1
        2   513    0.28   6.20   0.40   0.40   4    1
        3  1025    1.16   6.32   1.20   1.20   4    1
        4  1561    1.86  15.00   2.00   2.00   6    1
        5  2049    2.16  15.44   2.20   2.20   5    1
        6  2585    2.64  16.44   2.80   2.80   5    1
        7  3073    3.00  16.66   3.20   3.20   4    1
        8  3609    3.20  17.24   3.40   5.82   4   11
        9  4097    6.02   6.58   6.20   6.80   2    5

    This says that 134 chunks were transferred (about 70K
    since the average packet size was 512 bytes). It took
    536 packets to transfer the data (i.e., on the average
    each chunk was transmitted four times). Looking at,
    say, chunk 4, we see it represents the 512 bytes of
    sequence space from 1561 to 2048. It was first sent
    1.86 seconds into the conversation. It was last
    sent 15 seconds into the conversation and was sent
    a total of 6 times (i.e., it was retransmitted every
    2 seconds on the average). It was acked once, 140ms
    after it first arrived.

stime.awk
atime.awk
    Output one line per send or ack, respectively, in the form
        <time> <seq. number>
    where <time> is the time in seconds since the start of the
    transfer and <seq. number> is the sequence number being sent
    or acked. I typically plot this data looking for suspicious
    patterns.


The problem I was looking at was the bulk-data-transfer
throughput of medium delay network paths (1-6 sec. round trip
time) under typical DARPA Internet conditions. The trace of the
ftp transfer of a large file was used as the raw data source.
The method was:

  - On a local host (but not the Sun running tcpdump), connect to
    the remote ftp.

  - On the monitor Sun, start the trace going. E.g.,
      tcpdump host local-host and remote-host and port ftp-data >tracefile

  - On local, do either a get or put of a large file (~500KB),
    preferably to the null device (to minimize effects like
    closing the receive window while waiting for a disk write).

  - When transfer is finished, stop tcpdump.
    Use awk to make up
    two files of summary data (maxsize is the maximum packet size,
    tracedata is the file of tcpdump tracedata):
      awk -f send-ack.awk packetsize=avgsize tracedata >sa
      awk -f packetdat.awk packetsize=avgsize tracedata >pd

  - While the summary data files are printing, take a look at
    how the transfer behaved:
      awk -f stime.awk tracedata | xgraph
    (90% of what you learn seems to happen in this step).

  - Do all of the above steps several times, both directions,
    at different times of day, with different protocol
    implementations on the other end.

  - Using one of the Unix data analysis packages (in my case,
    S and Gary Perlman's Unix|Stat), spend a few months staring
    at the data.

  - Change something in the local protocol implementation and
    redo the steps above.

  - Once a week, tell your funding agent that you're discovering
    wonderful things and you'll write up that research report
    "real soon now".
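
The per-chunk lines described under packetdat.awk are easy to
post-process once they are in a file such as "pd". What follows is
only a sketch in Python and is not part of this distribution. It
assumes the output looks exactly like the sample shown earlier: a "#"
summary line followed by eight whitespace-separated fields per chunk.
The names Chunk, parse_packetdat and SAMPLE are made up for this
illustration.

```python
import re
from typing import NamedTuple, Optional


class Chunk(NamedTuple):
    """One per-chunk line, in the field order listed under packetdat.awk."""
    number: int
    start_seq: int
    first_send: float
    last_send: float
    first_ack: float
    last_ack: float
    times_sent: int
    times_acked: int


# Assumed summary-line layout, copied from the README sample.
SUMMARY_RE = re.compile(
    r"#\s*(\d+) chunks\.\s*(\d+) packets sent\.\s*(\d+) acks\."
)


def parse_packetdat(text: str):
    """Return ((chunks, packets, acks) or None, [Chunk, ...])."""
    summary: Optional[tuple] = None
    chunks = []
    for line in text.splitlines():
        line = line.strip()
        if not line:
            continue
        if line.startswith("#"):
            m = SUMMARY_RE.match(line)
            if m:
                summary = tuple(int(g) for g in m.groups())
            continue
        f = line.split()
        if len(f) != 8:
            continue  # skip anything that is not an eight-field chunk line
        chunks.append(Chunk(int(f[0]), int(f[1]), float(f[2]), float(f[3]),
                            float(f[4]), float(f[5]), int(f[6]), int(f[7])))
    return summary, chunks


# The first part of the sample output quoted in this README.
SAMPLE = """\
# 134 chunks. 536 packets sent. 508 acks.
1     1    0.00   5.80   0.20   0.20   4    1
2   513    0.28   6.20   0.40   0.40   4    1
3  1025    1.16   6.32   1.20   1.20   4    1
4  1561    1.86  15.00   2.00   2.00   6    1
5  2049    2.16  15.44   2.20   2.20   5    1
6  2585    2.64  16.44   2.80   2.80   5    1
7  3073    3.00  16.66   3.20   3.20   4    1
8  3609    3.20  17.24   3.40   5.82   4   11
9  4097    6.02   6.58   6.20   6.80   2    5
"""

if __name__ == "__main__":
    summary, chunks = parse_packetdat(SAMPLE)
    n_chunks, n_packets, n_acks = summary
    print(f"average sends per chunk: {n_packets / n_chunks:.1f}")
    for c in chunks:
        # extra sends beyond the first, and wait from first send to first ack
        print(c.number, c.times_sent - 1, f"{c.first_ack - c.first_send:.2f}")
```

Run on the sample, this reports four sends per chunk on average and a
0.14 second wait between the first send and the first ack of chunk 4,
which agrees with the figures worked out by hand above.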