@(#) $Id: README,v 1.9 2015/12/05 21:41:29 mmcc Exp $ (LBL)

The program is loosely based on SMI's "etherfind" although none of the
etherfind code remains.  It was originally written by Van Jacobson as
part of an ongoing research project to investigate and improve tcp and
internet gateway performance.  The parts of the program originally
taken from Sun's etherfind were later re-written by Steven McCanne of
LBL.  To ensure that there would be no vestige of proprietary code in
tcpdump, Steve wrote these pieces from the specification given by the
manual entry, with no access to the source of tcpdump or etherfind.

Richard Stevens gives an excellent treatment of the Internet protocols
in his book ``TCP/IP Illustrated, Volume 1''.  If you want to learn more
about tcpdump and how to interpret its output, pick up this book.

Some tools for viewing and analyzing tcpdump trace files are available
from the Internet Traffic Archive:

    http://www.acm.org/sigcomm/ITA/

Another tool that tcpdump users might find useful is tcpslice:

    ftp://ftp.ee.lbl.gov/tcpslice.tar.gz

It is a program that can be used to extract portions of tcpdump binary
trace files.  See the above distribution for further details and
documentation.

        - Steve McCanne
          Craig Leres
          Van Jacobson
-------------------------------------
This directory also contains some short awk programs intended as
examples of ways to reduce tcpdump data when you're tracking
particular network problems:

send-ack.awk
    Simplifies the tcpdump trace for an ftp (or other unidirectional
    tcp transfer).  Since we assume that one host only sends and
    the other only acks, all address information is left off and
    we just note if the packet is a "send" or an "ack".

    There is one output line per line of the original trace.
    Field 1 is the packet time in decimal seconds, relative
    to the start of the conversation.  Field 2 is delta-time
    from last packet.
    Field 3 is packet type/direction.
    "Send" means data going from sender to receiver, "ack"
    means an ack going from the receiver to the sender.  A
    preceding "*" indicates that the data is a retransmission.
    A preceding "-" indicates a hole in the sequence space
    (i.e., missing packet(s)), a "#" means an odd-size (not max
    seg size) packet.  Field 4 has the packet flags
    (same format as raw trace).  Field 5 is the sequence
    number (start seq. num for sender, next expected seq number
    for acks).  The number in parens following an ack is
    the delta-time from the first send of the packet to the
    ack.  A number in parens following a send is the
    delta-time from the first send of the packet to the
    current send (on duplicate packets only).  Duplicate
    sends or acks have a number in square brackets showing
    the number of duplicates so far.

    Here is a short sample from near the start of an ftp:
        3.00    0.20   send . 512
        3.20    0.20    ack . 1024  (0.20)
        3.20    0.00   send P 1024
        3.40    0.20    ack . 1536  (0.20)
        3.80    0.40 * send . 0  (3.80) [2]
        3.82    0.02 *  ack . 1536  (0.62) [2]
    Three seconds into the conversation, bytes 512 through 1023
    were sent.  200ms later they were acked.  Shortly thereafter
    bytes 1024-1535 were sent and again acked after 200ms.
    Then, for no apparent reason, 0-511 is retransmitted, 3.8
    seconds after its initial send (the round trip time for this
    ftp was 1sec, +-500ms).  Since the receiver is expecting
    1536, 1536 is re-acked when 0 arrives.

packetdat.awk
    Computes chunk summary data for an ftp (or similar
    unidirectional tcp transfer).  [A "chunk" refers to
    a chunk of the sequence space -- essentially the packet
    sequence number divided by the max segment size.]
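
The chunk numbering can be sketched as a one-line calculation.  This is
only an illustration, not the code in packetdat.awk: the helper name
chunk_of is hypothetical, and it assumes sequence numbers are relative
to the start of the conversation and begin at 1, with chunks also
numbered from 1.

```shell
# chunk_of SEQ PACKETSIZE
# Hypothetical helper (not part of this directory): print the chunk
# that sequence number SEQ falls in, assuming 1-based sequence numbers
# and 1-based chunk numbers.
chunk_of() {
    awk -v seq="$1" -v size="$2" 'BEGIN { print int((seq - 1) / size) + 1 }'
}
```

For example, "chunk_of 513 512" prints 2.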

    A summary line is printed showing the number of chunks,
    the number of packets it took to send that many chunks
    (if there are no lost or duplicated packets, the number
    of packets should equal the number of chunks) and the
    number of acks.

    Following the summary line is one line of information
    per chunk.  The line contains eight fields:
       1 - the chunk number
       2 - the start sequence number for this chunk
       3 - time of first send
       4 - time of last send
       5 - time of first ack
       6 - time of last ack
       7 - number of times chunk was sent
       8 - number of times chunk was acked
    (all times are in decimal seconds, relative to the start
    of the conversation.)

    As an example, here is the first part of the output for
    an ftp trace:

        # 134 chunks.  536 packets sent.  508 acks.
        1       1       0.00    5.80    0.20    0.20    4       1
        2       513     0.28    6.20    0.40    0.40    4       1
        3       1025    1.16    6.32    1.20    1.20    4       1
        4       1561    1.86    15.00   2.00    2.00    6       1
        5       2049    2.16    15.44   2.20    2.20    5       1
        6       2585    2.64    16.44   2.80    2.80    5       1
        7       3073    3.00    16.66   3.20    3.20    4       1
        8       3609    3.20    17.24   3.40    5.82    4       11
        9       4097    6.02    6.58    6.20    6.80    2       5

    This says that 134 chunks were transferred (about 70K
    since the average packet size was 512 bytes).  It took
    536 packets to transfer the data (i.e., on the average
    each chunk was transmitted four times).  Looking at,
    say, chunk 4, we see it represents the 512 bytes of
    sequence space from 1561 to 2048.  It was first sent
    1.86 seconds into the conversation.  It was last
    sent 15 seconds into the conversation and was sent
    a total of 6 times (i.e., it was retransmitted every
    2 seconds on the average).  It was acked once, 140ms
    after it first arrived.

stime.awk
atime.awk
    Output one line per send or ack, respectively, in the form
        <time> <seq. number>
    where <time> is the time in seconds since the start of the
    transfer and <seq.
    number> is the sequence number being sent
    or acked.  I typically plot this data looking for suspicious
    patterns.


The problem I was looking at was the bulk-data-transfer
throughput of medium delay network paths (1-6 sec. round trip
time) under typical DARPA Internet conditions.  The trace of the
ftp transfer of a large file was used as the raw data source.
The method was:

  - On a local host (but not the Sun running tcpdump), connect to
    the remote ftp.

  - On the monitor Sun, start the trace going.  E.g.,
      tcpdump host local-host and remote-host and port ftp-data >tracefile

  - On local, do either a get or put of a large file (~500KB),
    preferably to the null device (to minimize effects like
    closing the receive window while waiting for a disk write).

  - When the transfer is finished, stop tcpdump.  Use awk to make up
    two files of summary data (avgsize is the packet size passed
    to the scripts as packetsize, tracedata is the file of tcpdump
    trace data):
      awk -f send-ack.awk packetsize=avgsize tracedata >sa
      awk -f packetdat.awk packetsize=avgsize tracedata >pd

  - While the summary data files are printing, take a look at
    how the transfer behaved:
      awk -f stime.awk tracedata | xgraph
    (90% of what you learn seems to happen in this step).

  - Do all of the above steps several times, both directions,
    at different times of day, with different protocol
    implementations on the other end.

  - Using one of the Unix data analysis packages (in my case,
    S and Gary Perlman's Unix|Stat), spend a few months staring
    at the data.

  - Change something in the local protocol implementation and
    redo the steps above.

  - Once a week, tell your funding agent that you're discovering
    wonderful things and you'll write up that research report
    "real soon now".
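
If you repeat the summary step for many traces, the two awk commands
can be wrapped in a small shell function.  The sketch below is only an
illustration: the function name summarize_trace, the optional script
directory argument and the ".sa"/".pd" output names are hypothetical
and not part of this directory.  It assumes the awk scripts take the
packet size through the packetsize variable, as in the commands above.

```shell
# summarize_trace TRACEFILE PACKETSIZE [SCRIPTDIR]
# Hypothetical wrapper: run send-ack.awk and packetdat.awk over one
# trace and write the results to TRACEFILE.sa and TRACEFILE.pd.
# SCRIPTDIR is where the awk scripts live (default: current directory).
summarize_trace() {
    trace=$1
    size=$2
    dir=${3:-.}
    # The packetsize=... operand is an awk variable assignment that
    # takes effect before the trace file is read.
    awk -f "$dir/send-ack.awk" packetsize="$size" "$trace" >"$trace.sa" &&
    awk -f "$dir/packetdat.awk" packetsize="$size" "$trace" >"$trace.pd"
}
```

For example, "summarize_trace tracedata 512" leaves the two summaries
in tracedata.sa and tracedata.pd.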