@node Multithreading
@chapter Multithreading

Multithreading is a programming paradigm. In a multithreaded program,
multiple threads execute concurrently (or quasi-concurrently) at different
places in the program.

There are three motivations for using multithreading in a program:
@itemize @bullet
@item
Exploiting CPU hardware with multiple execution units. Nowadays, many CPUs
have 2 to 8 execution cores in a single chip. Additionally, often multiple
CPU chips are combined in a single package. Thus, some CPU packages support
64 or 96 simultaneous threads of execution.
@item
Simplifying program architecture. When a program has to read from different
file descriptors, network sockets, or event channels at the same time, the
classical single-threaded architecture is to have a main loop which uses
@code{select} or @code{poll} on all the descriptors and then dispatches
according to the descriptor on which input arrived. In a multithreaded
program, you allocate one thread for each descriptor, and these threads can
be programmed and managed independently.
@item
Offloading work from signal handlers. A signal handler is not allowed to
call @code{malloc}; therefore you are very limited in what you can do in
a signal handler. But a signal handler can notify a thread, and the thread
can then do the appropriate processing, as complex as it needs to be.
@end itemize

A multithreading API offers
@itemize @bullet
@item
Primitives for creating threads, for waiting until threads are terminated,
and for reaping their results.
@item
Primitives through which different threads can operate on the same data or
use some data structures for communicating between the threads. These are
called ``mutexes'' or ``locks''.
@item
Primitives for executing certain (initialization) code at most once.
@item
Primitives for notifying one or more other threads. These are called wait
queues or ``condition variables''.
@item
Primitives for allowing different threads to have different values for a
variable. Such a variable is said to reside in ``thread-local storage'' or
``thread-specific storage''.
@item
Primitives for relinquishing control for some time and letting other threads
run.
@end itemize

Note: Programs that achieve multithreading through OpenMP (cf. the gnulib
module @samp{openmp}) don't create and manage their threads themselves.
Nevertheless, they need to use mutexes/locks in many cases.

@menu
* Multithreading APIs::
* Choosing a multithreading API::
* POSIX multithreading::
* ISO C multithreading::
* Gnulib multithreading::
* Multithreading Optimizations::
@end menu

@node Multithreading APIs
@section The three multithreading APIs

Three multithreading APIs are available to Gnulib users:
@itemize @bullet
@item
POSIX multithreading,
@item
ISO C multithreading,
@item
Gnulib multithreading.
@end itemize

They are supported on all platforms that have multithreading in one form or
another. Currently, these are all platforms supported by Gnulib, except
for Minix.

The main differences are:
@itemize @bullet
@item
The exit code of a thread is a pointer in the POSIX and Gnulib APIs, but
only an @code{int} in the ISO C API.
@item
The POSIX API has additional facilities for detaching threads, setting the
priority of a thread, assigning a thread to a certain set of processors,
and much more.
@item
In the POSIX and ISO C APIs, most functions have a return code, and you
are supposed to check the return code; even locking and unlocking a lock
can fail. In the Gnulib API, many functions don't have a return code; if
they cannot complete, the program aborts. This sounds harsh, but such
aborts have not been reported in 12 years.
@item
In the ISO C API, the initialization of a statically allocated lock is
clumsy: You have to initialize it through a once-only function.
@end itemize

@node Choosing a multithreading API
@section Choosing the right multithreading API

Here are guidelines for determining which multithreading API is best for
your code.

In programs that use advanced POSIX APIs, such as spin locks,
detached threads (@code{pthread_detach}),
signal blocking (@code{pthread_sigmask}),
priorities (@code{pthread_setschedparam}), or
processor affinity (@code{pthread_setaffinity_np}), it is best to use
the POSIX API. This is because you cannot convert an ISO C @code{thrd_t}
or a Gnulib @code{gl_thread_t} to a POSIX @code{pthread_t}.

In code that is shared with glibc, it is best to use the POSIX API as well.

In libraries, it is best to use the Gnulib API. This is because it gives
the person who builds the library an option
@samp{--enable-threads=@{isoc,posix,windows@}}, which determines which
native multithreading API of the platform to rely on. In other words, with
this choice, you can minimize the amount of glue code that your library
needs to contain.

In the other cases, the POSIX API and the Gnulib API are equally well suited.

The ISO C API is never the best choice, as of this writing (2020).

@node POSIX multithreading
@section The POSIX multithreading API

The POSIX multithreading API is documented in POSIX
@url{https://pubs.opengroup.org/onlinepubs/9699919799/}.
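
As a minimal sketch of the POSIX primitives for thread creation, joining,
and locking, assuming a platform with native POSIX threads, consider the
following. The names @code{worker}, @code{run_workers}, @code{counter},
and @code{counter_lock} are hypothetical and exist only for this
illustration. Note that every return code is checked and that the exit
value of a thread is a pointer.

```c
#include <pthread.h>
#include <stddef.h>

#define MAX_WORKERS 16
#define INCREMENTS_PER_WORKER 1000

/* Shared state (hypothetical names), protected by counter_lock.  */
static pthread_mutex_t counter_lock = PTHREAD_MUTEX_INITIALIZER;
static long counter;

/* Thread body: increment the shared counter under the lock.
   Returns NULL on success and a non-NULL pointer on failure.  */
static void *
worker (void *arg)
{
  (void) arg;
  for (int i = 0; i < INCREMENTS_PER_WORKER; i++)
    {
      if (pthread_mutex_lock (&counter_lock) != 0)
        return &counter_lock;   /* any non-NULL pointer signals failure */
      counter++;
      if (pthread_mutex_unlock (&counter_lock) != 0)
        return &counter_lock;
    }
  return NULL;
}

/* Start N workers, wait for all of them, and return the final counter
   value, or -1 if N is out of range or any call failed.  */
long
run_workers (int n)
{
  pthread_t threads[MAX_WORKERS];
  int started = 0;
  int failed = 0;

  if (n < 0 || n > MAX_WORKERS)
    return -1;
  counter = 0;
  for (; started < n; started++)
    if (pthread_create (&threads[started], NULL, worker, NULL) != 0)
      {
        failed = 1;
        break;
      }
  /* Join every thread that was actually started and reap its result.  */
  for (int i = 0; i < started; i++)
    {
      void *result;
      if (pthread_join (threads[i], &result) != 0 || result != NULL)
        failed = 1;
    }
  return failed ? -1 : counter;
}
```

Calling @code{run_workers (4)} starts four threads that each increment
the counter 1000 times. On older systems the program may need to be
linked with @code{-lpthread}.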

To make use of POSIX multithreading, even on platforms that don't support it
natively (most prominently, native Windows), use the following Gnulib modules:
@multitable @columnfractions .75 .25
@headitem Purpose @tab Module
@item For thread creation and management:@tie{} @tab @code{pthread-thread}
@item For simple and recursive locks:@tie{} @tab @code{pthread-mutex}
@item For read-write locks:@tie{} @tab @code{pthread-rwlock}
@item For once-only execution:@tie{} @tab @code{pthread-once}
@item For ``condition variables'' (wait queues):@tie{} @tab @code{pthread-cond}
@item For thread-local storage:@tie{} @tab @code{pthread-tss}
@item For relinquishing control:@tie{} @tab @code{sched_yield}
@item For spin locks:@tie{} @tab @code{pthread-spin}
@end multitable

There is also a convenience module named @code{pthread} which depends on all
of these (except @code{sched_yield}); so you don't need to enumerate these
modules one by one.

@node ISO C multithreading
@section The ISO C multithreading API

The ISO C multithreading API is documented in ISO C 11
@url{http://www.open-std.org/jtc1/sc22/wg14/www/docs/n1570.pdf}.

To make use of ISO C multithreading, even on platforms that don't support it
or have severe bugs, use the following Gnulib modules:
@multitable @columnfractions .85 .15
@headitem Purpose @tab Module
@item For thread creation and management:@tie{} @tab @code{thrd}
@item For simple and recursive locks:@tie{} @tab @code{mtx}
@item For once-only execution:@tie{} @tab @code{mtx}
@item For ``condition variables'' (wait queues):@tie{} @tab @code{cnd}
@item For thread-local storage:@tie{} @tab @code{tss}
@end multitable

There is also a convenience module named @code{threads} which depends on all
of these; so you don't need to enumerate these modules one by one.
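
Two traits of the ISO C API, the @code{int} exit code of a thread and the
once-only initialization of a statically allocated lock, can be sketched as
follows. This assumes a C library that provides @code{<threads.h>}; the
names @code{worker}, @code{run_workers_isoc}, @code{init_counter_lock},
@code{counter}, and @code{counter_lock} are hypothetical and exist only for
this illustration.

```c
#include <threads.h>

#define MAX_WORKERS 16
#define INCREMENTS_PER_WORKER 1000

/* A statically allocated lock cannot be initialized statically in ISO C;
   it is initialized through a once-only function instead.  */
static mtx_t counter_lock;
static once_flag counter_lock_once = ONCE_FLAG_INIT;
static int counter_lock_ok;
static long counter;

static void
init_counter_lock (void)
{
  counter_lock_ok = (mtx_init (&counter_lock, mtx_plain) == thrd_success);
}

/* Thread body: returns 0 on success, 1 on failure.  The exit code is an
   int, not a pointer.  */
static int
worker (void *arg)
{
  (void) arg;
  call_once (&counter_lock_once, init_counter_lock);
  if (!counter_lock_ok)
    return 1;
  for (int i = 0; i < INCREMENTS_PER_WORKER; i++)
    {
      if (mtx_lock (&counter_lock) != thrd_success)
        return 1;
      counter++;
      if (mtx_unlock (&counter_lock) != thrd_success)
        return 1;
    }
  return 0;
}

/* Start N workers, wait for all of them, and return the final counter
   value, or -1 if N is out of range or any call failed.  */
long
run_workers_isoc (int n)
{
  thrd_t threads[MAX_WORKERS];
  int started = 0;
  int failed = 0;

  if (n < 0 || n > MAX_WORKERS)
    return -1;
  counter = 0;
  for (; started < n; started++)
    if (thrd_create (&threads[started], worker, NULL) != thrd_success)
      {
        failed = 1;
        break;
      }
  /* Join every thread that was actually started and check its exit code.  */
  for (int i = 0; i < started; i++)
    {
      int exit_code;
      if (thrd_join (threads[i], &exit_code) != thrd_success
          || exit_code != 0)
        failed = 1;
    }
  return failed ? -1 : counter;
}
```

Compared with a POSIX version of the same program, the extra
@code{once_flag} and initialization function are the price of not having a
static mutex initializer.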

@node Gnulib multithreading
@section The Gnulib multithreading API

The Gnulib multithreading API is documented in the respective include files:
@itemize
@item
@code{<glthread/thread.h>}
@item
@code{<glthread/lock.h>}
@item
@code{<glthread/cond.h>}
@item
@code{<glthread/tls.h>}
@item
@code{<glthread/yield.h>}
@end itemize

To make use of Gnulib multithreading, use the following Gnulib modules:
@multitable @columnfractions .85 .15
@headitem Purpose @tab Module
@item For thread creation and management:@tie{} @tab @code{thread}
@item For simple locks, recursive locks, and read-write locks:@tie{}
      @tab @code{lock}
@item For once-only execution:@tie{} @tab @code{lock}
@item For ``condition variables'' (wait queues):@tie{} @tab @code{cond}
@item For thread-local storage:@tie{} @tab @code{tls}
@item For relinquishing control:@tie{} @tab @code{yield}
@end multitable

The Gnulib multithreading API supports a configure option
@samp{--enable-threads=@{isoc,posix,windows@}}, which chooses the underlying
thread implementation. Currently (2020):
@itemize @bullet
@item
@code{--enable-threads=posix} is supported and is the best choice on all
platforms except for native Windows. It may also work, to a limited extent,
on mingw with the @code{winpthreads} library, but is not recommended there.
@item
@code{--enable-threads=windows} is supported and is the best choice on
native Windows platforms (mingw and MSVC).
@item
@code{--enable-threads=isoc} is supported on all platforms that have the
ISO C multithreading API. However, @code{--enable-threads=posix} is always
a better choice.
@end itemize

@node Multithreading Optimizations
@section Optimizations of multithreaded code

Many optimizations of multithreading primitives have been implemented over
the years:
@url{https://en.wikipedia.org/wiki/Compare-and-swap,
atomic operations in hardware},
@url{https://en.wikipedia.org/wiki/Futex, futexes} and
@url{https://www.efficios.com/blog/2019/02/08/linux-restartable-sequences/,
restartable sequences}
in the Linux kernel, and lock elision
(@url{https://lwn.net/Articles/534758/, [1]}
@url{https://www.gnu.org/software/libc/manual/html_node/Elision-Tunables.html,
[2]}).
Even so, single-threaded programs can still profit performance-wise from the
assertion that they are single-threaded.

Gnulib defines several facilities that help optimize for the single-threaded
case.

@itemize @bullet
@item
The Gnulib multithreading API, when used on glibc @leq{} 2.32 and *BSD systems,
uses weak symbols to detect whether the program is linked with
@code{libpthread}. If not, the program has no way to create additional
threads and must therefore be single-threaded. This optimization applies
to the entire Gnulib multithreading API (locks, thread-local storage, and more).
@item
The @code{thread-optim} module, on glibc @geq{} 2.32 systems, allows your code
to skip locking between threads (regardless of which of the three
multithreading APIs you use). You need extra code for this: include the
@code{"thread-optim.h"} header file, and use the macro @code{gl_multithreaded}
like this:
@smallexample
bool mt = gl_multithreaded ();
if (mt) gl_lock_lock (some_lock);
...
if (mt) gl_lock_unlock (some_lock);
@end smallexample
@item
You may use the @code{unlocked-io} module if you want the @code{FILE} stream
functions @code{getc}, @code{putc}, etc.@: to use unlocked I/O if available,
throughout the package.
Unlocked I/O can improve performance, sometimes
dramatically. But unlocked I/O is safe only in single-threaded programs,
as well as in multithreaded programs for which you can guarantee that
every @code{FILE} stream, including @code{stdin}, @code{stdout}, @code{stderr},
is used only in a single thread.

You need extra code for this optimization to be effective: include the
@code{"unlocked-io.h"} header file. Some Gnulib modules that do operations
on @code{FILE} streams have these preparations already included.
@item
You may define the C macro @code{GNULIB_REGEX_SINGLE_THREAD}, if all the
programs in your package invoke the functions of the @code{regex} module
only from a single thread.
@item
You may define the C macro @code{GNULIB_MBRTOWC_SINGLE_THREAD}, if all the
programs in your package invoke the functions @code{mbrtowc}, @code{mbrtoc32},
and the functions of the @code{regex} module only from a single thread. (The
@code{regex} module uses @code{mbrtowc} under the hood.)
@item
You may define the C macro @code{GNULIB_WCHAR_SINGLE_LOCALE}, if all the
programs in your package set the locale early and
@itemize
@item
don't change the locale after it has been initialized, and
@item
don't call locale sensitive functions (@code{mbrtowc}, @code{wcwidth}, etc.@:)
before the locale has been initialized.
@end itemize
This macro optimizes the functions @code{mbrtowc}, @code{mbrtoc32}, and
@code{wcwidth}.
@item
You may define the C macro @code{GNULIB_GETUSERSHELL_SINGLE_THREAD}, if all the
programs in your package invoke the functions @code{setusershell},
@code{getusershell}, @code{endusershell} only from a single thread.
@item
You may define the C macro @code{GNULIB_EXCLUDE_SINGLE_THREAD}, if all the
programs in your package invoke the functions of the @code{exclude} module
only from a single thread.
@end itemize

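
To illustrate the unlocked I/O item above, here is a minimal sketch of a
read loop that uses a stream from one thread only. It calls the POSIX
function @code{getc_unlocked} directly, which is an assumption about what
``unlocked I/O'' amounts to on a POSIX system; with the @code{unlocked-io}
module you would keep writing @code{getc} and include
@code{"unlocked-io.h"} instead. The names @code{count_bytes_unlocked} and
@code{demo_count} are hypothetical.

```c
#define _POSIX_C_SOURCE 200809L
#include <stdio.h>

/* Count the bytes remaining in STREAM without taking the stream lock for
   each character.  The caller must guarantee that no other thread uses
   STREAM at the same time.  Returns -1 on a read error.  */
long
count_bytes_unlocked (FILE *stream)
{
  long n = 0;
  while (getc_unlocked (stream) != EOF)
    n++;
  return ferror (stream) ? -1 : n;
}

/* Write TEXT to a temporary file, read it back with the unlocked loop,
   and return the number of bytes read, or -1 on failure.  */
long
demo_count (const char *text)
{
  FILE *fp = tmpfile ();
  long n;

  if (fp == NULL)
    return -1;
  if (fputs (text, fp) == EOF)
    {
      fclose (fp);
      return -1;
    }
  rewind (fp);
  n = count_bytes_unlocked (fp);
  if (fclose (fp) != 0)
    return -1;
  return n;
}
```

The temporary stream in @code{demo_count} is private to the calling thread,
which is what makes skipping the lock safe here.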