tcpp -- Parallel TCP Exercise Tool

This is a new tool, and is rife with bugs. However, it appears to create
even more problems for device drivers and the kernel, so that's OK.

This tool generates large numbers of TCP connections and stuffs lots of data
into them. One binary encapsulates both a client and a server. The client
and the server each create a number of worker processes (set with -p), each
of which in turn uses its own TCP port. The number of server processes must
be >= the number of client processes, or some of the ports required by the
client won't have a listener. The client then proceeds to make connections
and send data to the server. Each worker multiplexes many connections at
once, up to a maximum parallelism limit. The client can use one or many IP
addresses, in order to make more 4-tuples available for testing, and will
automatically spread the load of new connections across available source
addresses.

You will need to retune your TCP stack for high volume; see Configuration
Notes below.
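
As a rough illustration of why extra source addresses help: each worker
talks to its own server port, so for one worker the destination IP and port
are fixed, and every concurrent connection needs a distinct (source IP,
source port) pair. The sketch below estimates how many 4-tuples one worker
can draw on. The function name is hypothetical and the port range is an
example value only; the real range depends on the net.inet.ip.portrange
settings on the client.

```sh
#!/bin/sh
# Estimate the distinct 4-tuples available to one client worker.
# Assumption: the worker's destination IP and port are fixed, so only the
# source IP and source port vary.
# usage: tuples_per_worker <source-ip-count> <first-port> <last-port>
tuples_per_worker() {
    echo $(( $1 * ($3 - $2 + 1) ))
}

# Example values only: one vs. four source IPs with a hypothetical
# source port range of 10000-65535.
tuples_per_worker 1 10000 65535
tuples_per_worker 4 10000 65535
```

Connections lingering in TIME_WAIT also occupy 4-tuples, so the usable
number during a long run is likely lower than this estimate.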

The server has very little to configure; use the following command-line
flags:

    -s                           Select server mode
    -p <numprocs>                Number of workers, should be >= client -p arg
    -r <baseport>                Non-default base TCP port, should match client
    -T                           Print CPU usage every ten seconds
    -m <maxconnectionsperproc>   Maximum simultaneous connections/proc, should
                                 be >= client setting.

Typical use:

    ./tcpp -s -p 4 -m 1000000

This selects server mode, four workers, and at most 1 million TCP connections
per worker at a time.

The client has more to configure, with the following flags:

    -c <remoteIP>                Select client mode, and specific dest IP
    -C                           Print connections/second instead of GBps
    -P                           Pin each worker to a CPU
    -M <localIPcount>            Number of sequential local IPs to use;
                                 requires -l
    -T                           Include CPU use summary in stats at end of run
    -b <bytespertcp>             Data bytes per connection
    -l <localIPbase>             Starting local IP address to bind
    -m <maxtcpsatonce>           Max simultaneous conn/worker (see server -m)
    -p <numprocs>                Number of workers, should be <= server -p
    -r <baseport>                Non-default base TCP port, should match server
    -t <tcpsperproc>             How many connections to use per worker

Typical use:

    ./tcpp -c 192.168.100.201 -p 4 -t 100000 -m 10000 -b 100000 \
        -l 192.168.100.101 -M 4

This creates four workers, each of which will (over its lifetime) set up and
use 100,000 TCP connections, each carrying 100,000 bytes of data, with up to
10,000 simultaneous connections at any given moment. tcpp will use four
source IP addresses, starting with 192.168.100.101, and all connections will
be to the single destination IP of 192.168.100.201.

Keeping the number of workers (-p) <= the number of cores is advisable. When
multiple IPs are used on the client, they will be sequential, starting with
the localIPbase set with -l.

Known Issues
------------

The bandwidth estimate doesn't handle failures well. It also has serious
rounding errors and probably conceptual problems.

It's not clear that kevent() is "fair" to multiple connections.

Rather than passing the length for each connection, we might want to pass
it once with a control connection up front. On the other hand, the server
is quite dumb right now, so we could take advantage of this to do size
mixes.

Configuration Notes
-------------------

In my testing, I use loader.conf entries of:

kern.ipc.maxsockets=1000000
net.inet.tcp.maxtcptw=3000000
kern.ipc.somaxconn=49152
kern.ipc.nmbjumbo16=262144
kern.ipc.nmbjumbo9=262144
kern.ipc.nmbjumbop=262144
kern.ipc.nmbclusters=262144
net.inet.tcp.syncache.cachelimit=65536
net.inet.tcp.syncache.bucketlimit=512

# May be useful if you can't use multiple IP addresses
net.inet.ip.portrange.first=100

# For running !multiq, do this before loading the driver:
kenv hw.cxgb.singleq="1"

kldload if_cxgb

# Consider turning off TSO and/or adjusting the MTU for some scenarios:
ifconfig cxgb0 -tso
ifconfig cxgb0 mtu 1500
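
The client's -l and -M flags bind to sequential local addresses, so those
addresses have to exist on the client host before the run. The following is
a minimal sketch, not part of tcpp, that prints the ifconfig commands for
the addresses beyond the base one. It assumes only the last octet changes
and does not overflow, that the base address is already configured, and that
a host netmask is appropriate for aliases on your network; the helper name
is hypothetical and cxgb0 is simply the interface used in the notes above.

```sh
#!/bin/sh
# Print (not run) the ifconfig commands that add the extra sequential
# source addresses used by "tcpp -c ... -l <base> -M <count>".
# usage: print_aliases <ifname> <base-ip> <count>
print_aliases() {
    ifname=$1
    prefix=${2%.*}      # first three octets, e.g. 192.168.100
    last=${2##*.}       # last octet, e.g. 101
    i=1
    while [ "$i" -lt "$3" ]; do
        echo "ifconfig $ifname inet $prefix.$((last + i)) netmask 255.255.255.255 alias"
        i=$((i + 1))
    done
}

# Matches the client example: -l 192.168.100.101 -M 4
print_aliases cxgb0 192.168.100.101 4
```

Review the printed commands before running them as root; printing rather
than executing keeps the sketch safe to try on any machine.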