.. include:: header.rst

.. contents:: Table of contents
  :depth: 2
  :backlinks: none

tuning libtorrent
=================

libtorrent exposes most parameters used in the bittorrent engine for
customization through the settings_pack. This makes it possible to
test and tweak the parameters of certain algorithms to build a client
that fits a wide range of needs, from low-memory embedded devices to
servers seeding thousands of torrents. The default settings in libtorrent
are tuned for an end-user bittorrent client running on a normal desktop
computer.

This document describes techniques to benchmark libtorrent performance
and how parameters are likely to affect it.

profiling
=========

libtorrent is instrumented with a number of counters and gauges you can
access via the ``session_stats_alert``. First, enable these alerts in the
alert mask::

	settings_pack p;
	p.set_int(settings_pack::alert_mask, alert::stats_notification);
	ses.apply_settings(p);

Then pop the alerts and print them (for instance with standard output
redirected to a file)::

	std::vector<alert*> alerts;
	ses.pop_alerts(&alerts);

	for (auto* a : alerts) {
		std::cout << a->message() << "\n";
	}

If you want to separate generic alerts from session stats, you can filter on
the alert's category, returned by ``alert::category()``.
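
As a minimal sketch of that split (assuming ``ses`` is your session and
``stats_log`` is a ``std::ofstream`` your client has opened for the stats
lines), the stats rows can be routed to their own file::

	std::vector<alert*> alerts;
	ses.pop_alerts(&alerts);

	for (auto* a : alerts) {
		if (auto* ss = alert_cast<session_stats_alert>(a)) {
			// one row of counter values per alert
			stats_log << ss->message() << "\n";
		} else {
			// everything else goes to the regular log
			std::cout << a->message() << "\n";
		}
	}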

The alerts with data will have the type session_stats_alert and there is one
session_log_alert that will be posted on startup containing the column names
for all metrics. Logging this line will greatly simplify interpreting the output.

The python script in ``tools/parse_session_stats.py`` can parse the resulting
file and produce graphs of relevant stats. It requires gnuplot_.

.. _gnuplot: http://www.gnuplot.info

reducing memory footprint
=========================

These are things you can do to reduce the memory footprint of libtorrent. You get
some of this by basing your default settings_pack on the min_memory_usage()
setting preset function.

Keep in mind that lowering memory usage will affect performance; always profile
and benchmark your settings to determine if it's worth the trade-off.

The typical buffer usage of libtorrent, for a single download, with the cache
size set to 256 blocks (256 * 16 kiB = 4 MiB) is::

	read cache:      128.6 (2058 kiB)
	write cache:     103.5 (1656 kiB)
	receive buffers: 7.3   (117 kiB)
	send buffers:    4.8   (77 kiB)
	hash temp:       0.001 (19 Bytes)

The receive buffer usage is proportional to the number of connections we make, and
is limited by the total number of connections in the session (default is 200).

The send buffer usage is proportional to the number of upload slots that are allowed
in the session. The default is auto-configured based on the observed upload rate.

The read and write cache can be controlled (see section below).

The "hash temp" entry size depends on whether or not hashing is optimized for
speed or memory usage. In this test run it was optimized for memory usage.

disable disk cache
------------------

The bulk of the memory libtorrent uses goes to the disk cache. To save
the most memory, you can disable the cache by setting
settings_pack::cache_size to 0. You might want to consider using the cache
but just disable caching read operations. You do this by setting
settings_pack::use_read_cache to false. This is the main factor in how much
memory will be used by the client. Keep in mind that you will degrade performance
by disabling the cache. You should benchmark the disk access in order to make an
informed trade-off.
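
As a sketch, either variant comes down to a couple of settings (assuming
``ses`` is your session)::

	settings_pack p;
	// disable the disk cache entirely
	p.set_int(settings_pack::cache_size, 0);
	// or keep the write cache but skip caching read operations
	p.set_bool(settings_pack::use_read_cache, false);
	ses.apply_settings(p);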

remove torrents
---------------

Torrents that have been added to libtorrent will inevitably use up memory, even
when they are paused. A paused torrent will not use any peer connection objects
or any send or receive buffers, though. Any added torrent holds the entire
.torrent file in memory and remembers the entire list of peers it has heard about
(which can be fairly long unless it's capped). It also retains information about
which blocks and pieces we have on disk, which can be significant for torrents
with many pieces.

If you need to minimize the memory footprint, consider removing torrents from
the session rather than pausing them. This will likely only make a difference
when you have a very large number of torrents in a session.
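
Removing a torrent is a single call (assuming ``h`` is its torrent_handle); if
you intend to re-add it later, ask for resume data first::

	// optionally: h.save_resume_data(); and wait for the resulting alert
	ses.remove_torrent(h);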

The downside of removing them is that they will no longer be auto-managed. Paused
auto-managed torrents are scraped periodically, to determine which torrents are
in the greatest need of seeding, and libtorrent will prioritize seeding those.

socket buffer sizes
-------------------

You can make libtorrent explicitly set the kernel buffer sizes of all its peer
sockets. If you set this to a low number, you may see reduced throughput, especially
for high-latency connections. It is however an opportunity to save memory per
connection, and might be worth considering if you have a very large number of
peer connections. This memory will not be visible in your process; it controls
the amount of kernel memory used for your sockets.

Change this by setting settings_pack::recv_socket_buffer_size and
settings_pack::send_socket_buffer_size.
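
For example (the sizes below are purely illustrative; benchmark before settling
on values)::

	settings_pack p;
	// kernel-side buffer sizes, in bytes, per socket
	p.set_int(settings_pack::recv_socket_buffer_size, 16 * 1024);
	p.set_int(settings_pack::send_socket_buffer_size, 16 * 1024);
	ses.apply_settings(p);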

peer list size
--------------

The default maximum for the peer list is 4000 peers. For IPv4 peers, each peer
entry uses 32 bytes, which ends up using 128 kB per torrent. If seeding 4 popular
torrents, the peer lists alone use about half a megabyte.

The default limit is the same for paused torrents as well, so if you have a
large number of paused torrents (that are popular) it will be even more
significant.

If you're short of memory, you should consider lowering the limit. 500 is probably
enough. You can do this by setting settings_pack::max_peerlist_size to
the max number of peers you want in a torrent's peer list. This limit applies per
torrent. For 5 torrents, the total number of peers in peer lists will be 5 times
the setting.

You should also lower the corresponding limit for paused torrents. It can be set
even lower, since you only need a few peers to start up while waiting
for the tracker and DHT to give you fresh ones. The max peer list size for paused
torrents is set by settings_pack::max_paused_peerlist_size.
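
For example (illustrative values)::

	settings_pack p;
	// per-torrent cap on known peers
	p.set_int(settings_pack::max_peerlist_size, 500);
	// paused torrents only need enough peers to get going again
	p.set_int(settings_pack::max_paused_peerlist_size, 50);
	ses.apply_settings(p);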

The drawback of lowering this number is that if you end up in a position where
the tracker is down for an extended period of time, your only hope of finding live
peers is to go through your list of all peers you've ever seen. Having a large
peer list will also help increase performance when starting up, since the torrent
can start connecting to peers in parallel with connecting to the tracker.

send buffer watermark
---------------------

The send buffer watermark controls when libtorrent will ask the disk I/O thread
to read blocks from disk and append them to a peer's send buffer.

When the send buffer holds no more bytes than
settings_pack::send_buffer_watermark, the peer will ask the disk I/O thread
for more data to send. The trade-off here is between wasting memory by having too
much data in the send buffer, and hurting send rate by starving out the socket,
waiting for the disk read operation to complete.

If your main objective is memory usage and you're not concerned about being able
to achieve high send rates, you can set the watermark to 9 bytes. This will guarantee
that no more than a single (16 kiB) block will be in the send buffer at a time, for
all peers. This is the least amount of memory possible for the send buffer.
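
A sketch of the minimal-memory configuration::

	settings_pack p;
	// at most one 16 kiB block queued per peer
	p.set_int(settings_pack::send_buffer_watermark, 9);
	ses.apply_settings(p);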

You should benchmark your max send rate when adjusting this setting. If you have
a very fast disk, you are less likely to see a performance hit.

reduce executable size
----------------------

Compilers generally add a significant number of bytes to executables that make use
of C++ exceptions. By disabling exceptions (``-fno-exceptions`` on GCC), you can
reduce the executable size by up to 45%. In order to build without exception
support, you need to patch parts of boost.

Also make sure to optimize for size when compiling.

Another way of reducing the executable size is to disable code that isn't used.
There are a number of ``TORRENT_*`` macros that control which features are included
in libtorrent. If these macros are used to strip down libtorrent, make sure the same
macros are defined when building libtorrent as when linking against it. If they
differ, the structures will have different layouts on the libtorrent side and on
the client side, and memory corruption will follow.

One macro that is probably safe to define is ``TORRENT_NO_DEPRECATE``, which removes
all deprecated functions and struct members. As long as no deprecated functions are
relied upon, this should be a simple way to eliminate a little bit of code.

For all available options, see the `building libtorrent`_ section.

.. _`building libtorrent`: building.html

high performance seeding
========================

In the case of a high volume seed, there are two main concerns: performance and
scalability. This translates into high send rates, and low memory and CPU usage
per peer connection.

file pool
---------

libtorrent keeps an LRU file cache. Each file that is opened is stuck in the cache. The
main reason for this is anti-virus software that hooks into file open and file close to
scan the file. Anti-virus software that does that will significantly increase the cost of
opening and closing files. However, for a high performance seed, the file open/close might
be so frequent that it becomes a significant cost. It might therefore be a good idea to allow
a large file descriptor cache. Adjust this through settings_pack::file_pool_size.

Don't forget to set a high rlimit for file descriptors in your process as well. This limit
must be high enough to keep all connections and files open.
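
For example (the value is illustrative; it has to fit within the process' file
descriptor limit together with all peer sockets)::

	settings_pack p;
	// number of files libtorrent keeps open at a time
	p.set_int(settings_pack::file_pool_size, 500);
	ses.apply_settings(p);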

disk cache
----------

You typically want to set the cache size as high as possible. The
settings_pack::cache_size is specified in 16 kiB blocks. Since you're seeding,
the cache would be useless unless you also set settings_pack::use_read_cache
to true.

In order to increase the possibility of read cache hits, set the
settings_pack::cache_expiry to a large number. This won't degrade anything as
long as the client is only seeding, and not downloading any torrents.

There's a *guided cache* mode. This means the size of the read cache line that's
stored in the cache is determined based on the upload rate to the peer that
triggered the read operation. The idea is that slow peers don't use up a
disproportionate amount of space in the cache. This is enabled through
settings_pack::guided_read_cache.

In cases where the assumption is that the cache is only used as a read-ahead, and that no
other peer will ever request the same block while it's still in the cache, the read
cache can be set to be *volatile*. This means that every block that is requested out of
the read cache is removed immediately. This saves a significant amount of cache space
which can be used as read-ahead for other peers. To enable the volatile read cache, set
settings_pack::volatile_read_cache to true.
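
A sketch of a seed-oriented cache configuration (the numbers are illustrative,
and cache_expiry is assumed to be in seconds)::

	settings_pack p;
	// 65536 blocks * 16 kiB = 1 GiB of cache
	p.set_int(settings_pack::cache_size, 65536);
	p.set_bool(settings_pack::use_read_cache, true);
	// keep read-cache blocks around for a long time
	p.set_int(settings_pack::cache_expiry, 3600);
	// optional refinements described above
	p.set_bool(settings_pack::guided_read_cache, true);
	p.set_bool(settings_pack::volatile_read_cache, true);
	ses.apply_settings(p);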

uTP-TCP mixed mode
------------------

libtorrent supports uTP_, which has a delay-based congestion controller. In order to
avoid having a single TCP bittorrent connection completely starve out any uTP connection,
there is a mixed mode algorithm. This attempts to detect congestion on the uTP peers and
throttle TCP to avoid it taking over all bandwidth. This balances the bandwidth resources
between the two protocols. When running on a network where the bandwidth is in such an
abundance that it's virtually infinite, this algorithm is no longer necessary, and might
even be harmful to throughput. It is advised to experiment with
settings_pack::mixed_mode_algorithm, setting it to settings_pack::prefer_tcp.
This setting entirely disables the balancing and un-throttles all connections. On a typical
home connection, this would mean that none of the benefits of uTP would be preserved
(the modem's send buffer would be full at all times) and uTP connections would for the most
part be squashed by the TCP traffic.
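
On such a high-bandwidth network, disabling the balancing could look like::

	settings_pack p;
	p.set_int(settings_pack::mixed_mode_algorithm, settings_pack::prefer_tcp);
	ses.apply_settings(p);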

.. _`uTP`: utp.html

send buffer low watermark
-------------------------

libtorrent uses a low watermark for send buffers to determine when a new piece should
be requested from the disk I/O subsystem, to be appended to the send buffer. The low
watermark is determined based on the send rate of the socket. It needs to be large
enough to avoid draining the socket's send buffer before the disk operation completes.

The watermark is bound to a max value, to avoid buffer sizes growing out of control.
The default max send buffer size might not be enough to sustain very high upload rates,
and you might have to increase it. It's specified in bytes in
settings_pack::send_buffer_watermark.
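
For example, a high-throughput seed might raise the cap (the value is purely
illustrative)::

	settings_pack p;
	// allow up to 3 MiB of queued send data per peer
	p.set_int(settings_pack::send_buffer_watermark, 3 * 1024 * 1024);
	ses.apply_settings(p);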

peers
-----

First of all, in order to allow many connections, set the global connection limit
high via settings_pack::connections_limit. Also set the upload rate limit to
infinite via settings_pack::upload_rate_limit (0 means infinite).

When dealing with a large number of peers, it might be a good idea to have slightly
stricter timeouts, to get rid of lingering connections as soon as possible.

There are a couple of relevant settings: settings_pack::request_timeout,
settings_pack::peer_timeout and settings_pack::inactivity_timeout.

For seeds that are critical for a delivery system, you most likely want to allow
multiple connections from the same IP. That way two people from behind the same NAT
can use the service simultaneously. This is controlled by
settings_pack::allow_multiple_connections_per_ip.

In order to always unchoke peers, turn off automatic unchoking by setting
settings_pack::choking_algorithm to settings_pack::fixed_slots_choker and set the number
of upload slots to a large number via settings_pack::unchoke_slots_limit,
or use -1 (which means infinite).
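
Putting these together, a dedicated seed might use something like (the timeout
and connection numbers are illustrative)::

	settings_pack p;
	p.set_int(settings_pack::connections_limit, 5000);
	p.set_int(settings_pack::upload_rate_limit, 0); // unlimited
	// stricter timeouts (seconds) to drop lingering connections
	p.set_int(settings_pack::request_timeout, 10);
	p.set_int(settings_pack::peer_timeout, 20);
	p.set_int(settings_pack::inactivity_timeout, 20);
	p.set_bool(settings_pack::allow_multiple_connections_per_ip, true);
	// always unchoke everyone
	p.set_int(settings_pack::choking_algorithm, settings_pack::fixed_slots_choker);
	p.set_int(settings_pack::unchoke_slots_limit, -1);
	ses.apply_settings(p);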

torrent limits
--------------

To seed thousands of torrents, you need to increase settings_pack::active_limit
and settings_pack::active_seeds.
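
For example (illustrative values)::

	settings_pack p;
	p.set_int(settings_pack::active_limit, 5000);
	p.set_int(settings_pack::active_seeds, 5000);
	ses.apply_settings(p);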

SHA-1 hashing
-------------

When downloading at very high rates, it is possible to have the CPU be the
bottleneck for passing every downloaded byte through SHA-1. In order to enable
calculating SHA-1 hashes in parallel on multi-core systems, set
settings_pack::aio_threads to the number of threads libtorrent should use for
disk I/O and SHA-1 hashing. Only if those threads are close to saturating
a core does it make sense to increase the number of threads.
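
For example (the thread count is illustrative; pick it based on profiling)::

	settings_pack p;
	// threads used for disk I/O and SHA-1 hashing
	p.set_int(settings_pack::aio_threads, 4);
	ses.apply_settings(p);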

scalability
===========

In order to make more efficient use of the libtorrent interface when running a large
number of torrents simultaneously, one can use the ``session::get_torrent_status()`` call
together with ``session::post_torrent_updates()``. Keep in mind that every call into
libtorrent that returns a value has to block your thread while posting a message to
the main network thread and waiting for a response. Calls that don't return any data
will simply post the message and then immediately return, performing the work
asynchronously. The time this takes might become significant once you reach a
few hundred torrents, depending on how many calls you make to each torrent and how often.
``session::get_torrent_status()`` lets you query the status of all torrents in a single call.
This will actually loop through all torrents and run a provided predicate function to
determine whether or not to include each torrent in the returned vector.

To use ``session::post_torrent_updates()``, torrents need to have the ``flag_update_subscribe``
flag set. When post_torrent_updates() is called, a state_update_alert
is posted, with all the torrents that have updated since the last time this
function was called. The client has to keep its own state of all torrents, and
update it based on this alert.
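
A rough sketch of that update loop, assuming the torrents were added with the
update-subscribe flag and ``ses`` is your session::

	// ask for one state_update_alert covering every torrent that changed
	// since the last call
	ses.post_torrent_updates();

	std::vector<alert*> alerts;
	ses.pop_alerts(&alerts);

	for (auto* a : alerts) {
		if (auto* su = alert_cast<state_update_alert>(a)) {
			for (torrent_status const& st : su->status) {
				// merge st into the client's own view of the torrent,
				// keyed on st.handle
			}
		}
	}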

benchmarking
============

There is a fair amount of built-in instrumentation in libtorrent that can be used to get
insight into what it's doing and how well it performs. This instrumentation is enabled by
defining preprocessor symbols when building.

There are also a number of scripts that parse the log files and generate graphs (requires
gnuplot and python).

understanding the disk threads
==============================

*This section is somewhat outdated; there may be more than one disk
thread.*

All disk operations are funneled through a separate thread, referred to as the
disk thread. The main interface to the disk thread is a queue where disk jobs
are posted, and the results of these jobs are then posted back on the main
thread's io_service.

A disk job is essentially one of:

1. write this block to disk, i.e. a write job. For the most part this is just a
	matter of sticking the block in the disk cache, but if we've run out of
	cache space or completed a whole piece, we'll also flush blocks to disk.
	This is typically very fast, since the OS just sticks these buffers in its
	write cache which will be flushed at a later time, presumably when the drive
	head will pass the place on the platter where the blocks go.

2. read this block from disk. The first thing that happens is we look in the
	cache to see if the block is already in RAM. If it is, we'll return
	immediately with this block. If it's a cache miss, we'll have to hit the
	disk. Here we decide to defer this job. We find the physical offset on the
	drive for this block and insert the job in an ordered queue, sorted by the
	physical location. At a later time, once we don't have any more non-read
	jobs left in the queue, we pick one read job out of the ordered queue and
	service it. The order we pick jobs out of the queue is according to an
	elevator cursor moving up and down along the ordered queue of read jobs. If
	we have enough space in the cache we'll read read_cache_line_size number of
	blocks and stick those in the cache. This defaults to 32 blocks. If the
	system supports asynchronous I/O (Windows, Linux, Mac OS X, BSD, Solaris for
	instance), jobs will be issued immediately to the OS. This especially
	increases read throughput, since the OS has a much greater flexibility to
	reorder the read jobs.

Other disk jobs consist of operations that need to be synchronized with the
disk I/O, like renaming files, closing files, flushing the cache, updating the
settings etc. These are relatively rare though.

contributions
=============

If you have added instrumentation for some part of libtorrent that is not
covered here, or if you have improved any of the parser scripts, please consider
contributing it back to the project.

If you have run tests and found that some algorithm or default value in
libtorrent is suboptimal, please contribute that knowledge back as well, to
allow us to improve the library.

If you have additional suggestions on how to tune libtorrent for any specific
use case, please let us know and we'll update this document.