@node Multithreading
@chapter Multithreading

Multithreading is a programming paradigm.  In a multithreaded program,
multiple threads execute concurrently (or quasi-concurrently) at different
places in the program.

There are three motivations for using multithreading in a program:
@itemize @bullet
@item
Exploiting CPU hardware with multiple execution units.  Nowadays, many CPUs
have 2 to 8 execution cores in a single chip.  Additionally, multiple
CPU chips are often combined in a single package.  Thus, some CPU packages
support 64 or 96 simultaneous threads of execution.
@item
Simplifying program architecture.  When a program has to read from different
file descriptors, network sockets, or event channels at the same time, the
classical single-threaded architecture is to have a main loop which uses
@code{select} or @code{poll} on all the descriptors and then dispatches
according to the descriptor on which input arrived.  In a multithreaded
program, you allocate one thread for each descriptor, and these threads can
be programmed and managed independently.
@item
Offloading work from signal handlers.  A signal handler may call only
async-signal-safe functions; for example, it is not allowed to call
@code{malloc}.  Therefore you are very limited in what you can do in
a signal handler.  But a signal handler can notify a thread, and the thread
can then do the appropriate processing, as complex as it needs to be.
@end itemize
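
The last motivation can be sketched as follows, assuming a POSIX system
that supports unnamed semaphores (@code{sem_init}).  The handler only calls
@code{sem_post}, which is async-signal-safe, and a separate thread does the
processing that would be forbidden in the handler.  The names
@code{on_signal}, @code{processor}, and @code{handle_one_signal} are
hypothetical, chosen for this illustration only; this is one possible
arrangement, not the only one.

@example
#include <errno.h>
#include <pthread.h>
#include <semaphore.h>
#include <signal.h>
#include <stdlib.h>
#include <string.h>

/* Counts the notifications sent by the signal handler.  */
static sem_t signal_arrived;

/* Signal handler: only async-signal-safe calls are allowed here.
   sem_post is one of them.  */
static void
on_signal (int sig)
@{
  (void) sig;
  sem_post (&signal_arrived);
@}

/* Thread body: wait for a notification, then do work that the
   handler itself must not do (here: call malloc).  */
static void *
processor (void *arg)
@{
  (void) arg;
  while (sem_wait (&signal_arrived) != 0)
    if (errno != EINTR)
      return NULL;
  char *message = malloc (16);
  if (message != NULL)
    strcpy (message, "handled");
  return message;
@}

/* Hypothetical helper: install the handler, start the thread,
   raise SIGUSR1 once, and return the thread's result (to be
   freed by the caller), or NULL on failure.  */
char *
handle_one_signal (void)
@{
  struct sigaction action;
  pthread_t thread;
  void *result = NULL;

  if (sem_init (&signal_arrived, 0, 0) != 0)
    return NULL;
  memset (&action, 0, sizeof action);
  action.sa_handler = on_signal;
  sigemptyset (&action.sa_mask);
  if (sigaction (SIGUSR1, &action, NULL) != 0)
    return NULL;
  if (pthread_create (&thread, NULL, processor, NULL) != 0)
    return NULL;
  raise (SIGUSR1);
  if (pthread_join (thread, &result) != 0)
    return NULL;
  return result;
@}
@end example

Because the semaphore counts notifications, the sketch works regardless of
whether the signal arrives before or after the thread starts waiting.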

A multithreading API offers
@itemize @bullet
@item
Primitives for creating threads, for waiting until threads are terminated,
and for reaping their results.
@item
Primitives through which different threads can safely operate on the same
data or use some data structures for communicating between the threads.
These are called ``mutexes'' or ``locks''.
@item
Primitives for executing certain (initialization) code at most once.
@item
Primitives for notifying one or more other threads.  These are called wait
queues or ``condition variables''.
@item
Primitives for allowing different threads to have different values for a
variable.  Such a variable is said to reside in ``thread-local storage'' or
``thread-specific storage''.
@item
Primitives for relinquishing control for some time and letting other threads
run.
@end itemize
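
As a rough sketch of the first two kinds of primitives, the following code
uses the POSIX API to create threads, protect a shared counter with a mutex,
and reap each thread's result.  The names @code{worker},
@code{counter_lock}, and @code{run_workers} are hypothetical, and the limit
of 16 threads is an arbitrary choice made for this illustration.

@example
#include <pthread.h>
#include <stddef.h>

/* Shared state, protected by counter_lock.  */
static pthread_mutex_t counter_lock = PTHREAD_MUTEX_INITIALIZER;
static long counter;

/* Thread body: increment the shared counter *(long *) ARG times.
   Returns ARG on success, NULL if locking or unlocking failed.  */
static void *
worker (void *arg)
@{
  long iterations = *(long *) arg;
  for (long i = 0; i < iterations; i++)
    @{
      if (pthread_mutex_lock (&counter_lock) != 0)
        return NULL;
      counter++;
      if (pthread_mutex_unlock (&counter_lock) != 0)
        return NULL;
    @}
  return arg;
@}

/* Hypothetical helper: run NTHREADS workers (at most 16), wait
   for all of them, and return the final counter value, or -1 if
   any step failed.  */
long
run_workers (int nthreads, long iterations)
@{
  pthread_t threads[16];
  int created = 0;
  int failed = 0;

  if (nthreads < 0 || nthreads > 16)
    return -1;
  counter = 0;
  for (; created < nthreads; created++)
    if (pthread_create (&threads[created], NULL, worker,
                        &iterations) != 0)
      @{
        failed = 1;
        break;
      @}
  /* Join every thread that was started, and reap its result.  */
  for (int i = 0; i < created; i++)
    @{
      void *result;
      if (pthread_join (threads[i], &result) != 0 || result == NULL)
        failed = 1;
    @}
  return failed ? -1 : counter;
@}
@end example

Without the mutex, the increments from different threads could interleave
and some of them could be lost.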

Note: Programs that achieve multithreading through OpenMP (cf. the Gnulib
module @samp{openmp}) don't create and manage their threads themselves.
Nevertheless, they need to use mutexes/locks in many cases.

@menu
* Multithreading APIs::
* Choosing a multithreading API::
* POSIX multithreading::
* ISO C multithreading::
* Gnulib multithreading::
* Multithreading Optimizations::
@end menu

@node Multithreading APIs
@section The three multithreading APIs

Three multithreading APIs are available to Gnulib users:
@itemize @bullet
@item
POSIX multithreading,
@item
ISO C multithreading,
@item
Gnulib multithreading.
@end itemize

They are supported on all platforms that have multithreading in one form or
another.  Currently, these are all platforms supported by Gnulib, except
for Minix.

The main differences are:
@itemize @bullet
@item
The exit code of a thread is a pointer in the POSIX and Gnulib APIs, but
only an @code{int} in the ISO C API.
@item
The POSIX API has additional facilities for detaching threads, setting the
priority of a thread, assigning a thread to a certain set of processors,
and much more.
@item
In the POSIX and ISO C APIs, most functions have a return code, and you
are supposed to check the return code; even locking and unlocking a lock
can fail.  In the Gnulib API, many functions don't have a return code; if
they cannot complete, the program aborts.  This sounds harsh, but such
aborts have not been reported in 12 years.
@item
In the ISO C API, the initialization of a statically allocated lock is
clumsy: you have to initialize it through a once-only function.
@end itemize
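
The last difference can be illustrated with a short sketch, assuming a
platform that provides @code{<threads.h>}.  In the POSIX API, a static lock
can be initialized with @code{PTHREAD_MUTEX_INITIALIZER}; in the ISO C API,
every user of the lock first has to go through @code{call_once}.  The names
@code{table_lock}, @code{init_table_lock}, and @code{locked_increment} are
hypothetical.

@example
#include <threads.h>

/* A statically allocated lock.  ISO C has no static initializer
   for mtx_t, so the lock is initialized on first use.  */
static mtx_t table_lock;
static once_flag table_lock_once = ONCE_FLAG_INIT;
static int table_lock_ok;

/* Once-only function that initializes table_lock.  */
static void
init_table_lock (void)
@{
  table_lock_ok = (mtx_init (&table_lock, mtx_plain) == thrd_success);
@}

/* Hypothetical helper: increment *VALUE while holding the lock.
   Returns 0 on success, -1 on failure.  */
int
locked_increment (int *value)
@{
  call_once (&table_lock_once, init_table_lock);
  if (!table_lock_ok)
    return -1;
  if (mtx_lock (&table_lock) != thrd_success)
    return -1;
  ++*value;
  if (mtx_unlock (&table_lock) != thrd_success)
    return -1;
  return 0;
@}
@end example

The sketch also shows the return code checks that the POSIX and ISO C APIs
expect from the caller.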

@node Choosing a multithreading API
@section Choosing the right multithreading API

Here are guidelines for determining which multithreading API is best for
your code.

In programs that use advanced POSIX APIs, such as spin locks,
detached threads (@code{pthread_detach}),
signal blocking (@code{pthread_sigmask}),
priorities (@code{pthread_setschedparam}), or
processor affinity (@code{pthread_setaffinity_np}), it is best to use
the POSIX API.  This is because you cannot convert an ISO C @code{thrd_t}
or a Gnulib @code{gl_thread_t} to a POSIX @code{pthread_t}.

In code that is shared with glibc, it is best to use the POSIX API as well.

In libraries, it is best to use the Gnulib API.  This is because it gives
the person who builds the library an option,
@samp{--enable-threads=@{isoc,posix,windows@}}, that determines which
native multithreading API of the platform to rely on.  In other words, with
this choice, you can minimize the amount of glue code that your library
needs to contain.

In the other cases, the POSIX API and the Gnulib API are equally well suited.

The ISO C API is never the best choice, as of this writing (2020).

@node POSIX multithreading
@section The POSIX multithreading API

The POSIX multithreading API is documented in POSIX
@url{https://pubs.opengroup.org/onlinepubs/9699919799/}.

To make use of POSIX multithreading, even on platforms that don't support it
natively (most prominently, native Windows), use the following Gnulib modules:
@multitable @columnfractions .75 .25
@headitem Purpose @tab Module
@item For thread creation and management:@tie{} @tab @code{pthread-thread}
@item For simple and recursive locks:@tie{} @tab @code{pthread-mutex}
@item For read-write locks:@tie{} @tab @code{pthread-rwlock}
@item For once-only execution:@tie{} @tab @code{pthread-once}
@item For ``condition variables'' (wait queues):@tie{} @tab @code{pthread-cond}
@item For thread-local storage:@tie{} @tab @code{pthread-tss}
@item For relinquishing control:@tie{} @tab @code{sched_yield}
@item For spin locks:@tie{} @tab @code{pthread-spin}
@end multitable

There is also a convenience module named @code{pthread} which depends on all
of these (except @code{sched_yield}); so you don't need to enumerate these
modules one by one.

@node ISO C multithreading
@section The ISO C multithreading API

The ISO C multithreading API is documented in ISO C 11
@url{http://www.open-std.org/jtc1/sc22/wg14/www/docs/n1570.pdf}.

To make use of ISO C multithreading, even on platforms that don't support it
or whose support has severe bugs, use the following Gnulib modules:
@multitable @columnfractions .85 .15
@headitem Purpose @tab Module
@item For thread creation and management:@tie{} @tab @code{thrd}
@item For simple locks and recursive locks:@tie{} @tab @code{mtx}
@item For once-only execution:@tie{} @tab @code{mtx}
@item For ``condition variables'' (wait queues):@tie{} @tab @code{cnd}
@item For thread-local storage:@tie{} @tab @code{tss}
@end multitable

There is also a convenience module named @code{threads} which depends on all
of these; so you don't need to enumerate these modules one by one.

@node Gnulib multithreading
@section The Gnulib multithreading API

The Gnulib multithreading API is documented in the respective include files:
@itemize
@item
@code{<glthread/thread.h>}
@item
@code{<glthread/lock.h>}
@item
@code{<glthread/cond.h>}
@item
@code{<glthread/tls.h>}
@item
@code{<glthread/yield.h>}
@end itemize

To make use of Gnulib multithreading, use the following Gnulib modules:
@multitable @columnfractions .85 .15
@headitem Purpose @tab Module
@item For thread creation and management:@tie{} @tab @code{thread}
@item For simple locks, recursive locks, and read-write locks:@tie{}
      @tab @code{lock}
@item For once-only execution:@tie{} @tab @code{lock}
@item For ``condition variables'' (wait queues):@tie{} @tab @code{cond}
@item For thread-local storage:@tie{} @tab @code{tls}
@item For relinquishing control:@tie{} @tab @code{yield}
@end multitable

The Gnulib multithreading API supports a configure option,
@samp{--enable-threads=@{isoc,posix,windows@}}, that chooses the underlying
thread implementation.  Currently (2020):
@itemize @bullet
@item
@code{--enable-threads=posix} is supported and is the best choice on all
platforms except for native Windows.  It may also work, to a limited extent,
on mingw with the @code{winpthreads} library, but is not recommended there.
@item
@code{--enable-threads=windows} is supported and is the best choice on
native Windows platforms (mingw and MSVC).
@item
@code{--enable-threads=isoc} is supported on all platforms that have the
ISO C multithreading API.  However, @code{--enable-threads=posix} is always
a better choice.
@end itemize

@node Multithreading Optimizations
@section Optimizations of multithreaded code

Despite all the optimizations of multithreading primitives that have been
implemented over the years (from
@url{https://en.wikipedia.org/wiki/Compare-and-swap,
atomic operations in hardware},
through @url{https://en.wikipedia.org/wiki/Futex, futexes} and
@url{https://www.efficios.com/blog/2019/02/08/linux-restartable-sequences/,
restartable sequences}
in the Linux kernel, to lock elision
@url{https://lwn.net/Articles/534758/, [1]}
@url{https://www.gnu.org/software/libc/manual/html_node/Elision-Tunables.html,
[2]}),
single-threaded programs can still gain performance from the
knowledge that they are single-threaded.

Gnulib defines several facilities that help optimize for the single-threaded
case.

@itemize @bullet
@item
The Gnulib multithreading API, when used on glibc @leq{} 2.32 and *BSD systems,
uses weak symbols to detect whether the program is linked with
@code{libpthread}.  If not, the program has no way to create additional
threads and must therefore be single-threaded.  This optimization applies
to the entire Gnulib multithreading API (locks, thread-local storage, and
more).
@item
The @code{thread-optim} module, on glibc @geq{} 2.32 systems, allows your code
to skip locking while the program is single-threaded (regardless of which of
the three multithreading APIs you use).  You need extra code for this:
include the @code{"thread-optim.h"} header file, and use the macro
@code{gl_multithreaded} like this:
@smallexample
bool mt = gl_multithreaded ();
if (mt) gl_lock_lock (some_lock);
...
if (mt) gl_lock_unlock (some_lock);
@end smallexample
@item
You may use the @code{unlocked-io} module if you want the @code{FILE} stream
functions @code{getc}, @code{putc}, etc.@: to use unlocked I/O if available,
throughout the package.  Unlocked I/O can improve performance, sometimes
dramatically.  But unlocked I/O is safe only in single-threaded programs,
as well as in multithreaded programs for which you can guarantee that
every @code{FILE} stream, including @code{stdin}, @code{stdout}, @code{stderr},
is used only in a single thread.

You need extra code for this optimization to be effective: include the
@code{"unlocked-io.h"} header file.  Some Gnulib modules that do operations
on @code{FILE} streams have these preparations already included.
@item
You may define the C macro @code{GNULIB_REGEX_SINGLE_THREAD}, if all the
programs in your package invoke the functions of the @code{regex} module
only from a single thread.
@item
You may define the C macro @code{GNULIB_MBRTOWC_SINGLE_THREAD}, if all the
programs in your package invoke the functions @code{mbrtowc}, @code{mbrtoc32},
and the functions of the @code{regex} module only from a single thread.  (The
@code{regex} module uses @code{mbrtowc} under the hood.)
@item
You may define the C macro @code{GNULIB_WCHAR_SINGLE_LOCALE}, if all the
programs in your package set the locale early and
@itemize
@item
don't change the locale after it has been initialized, and
@item
don't call locale-sensitive functions (@code{mbrtowc}, @code{wcwidth}, etc.@:)
before the locale has been initialized.
@end itemize
This macro optimizes the functions @code{mbrtowc}, @code{mbrtoc32}, and
@code{wcwidth}.
@item
You may define the C macro @code{GNULIB_GETUSERSHELL_SINGLE_THREAD}, if all the
programs in your package invoke the functions @code{setusershell},
@code{getusershell}, @code{endusershell} only from a single thread.
@item
You may define the C macro @code{GNULIB_EXCLUDE_SINGLE_THREAD}, if all the
programs in your package invoke the functions of the @code{exclude} module
only from a single thread.
@end itemize