1\input texinfo @c -*-texinfo-*- 2 3@c %**start of header 4@setfilename libitm.info 5@settitle GNU libitm 6@c %**end of header 7 8 9@copying 10Copyright @copyright{} 2011-2016 Free Software Foundation, Inc. 11 12Permission is granted to copy, distribute and/or modify this document 13under the terms of the GNU Free Documentation License, Version 1.2 or 14any later version published by the Free Software Foundation; with no 15Invariant Sections, no Front-Cover Texts, and no Back-Cover Texts. 16A copy of the license is included in the section entitled ``GNU 17Free Documentation License''. 18@end copying 19 20@ifinfo 21@dircategory GNU Libraries 22@direntry 23* libitm: (libitm). GNU Transactional Memory Library 24@end direntry 25 26This manual documents the GNU Transactional Memory Library. 27 28@insertcopying 29@end ifinfo 30 31 32@setchapternewpage odd 33 34@titlepage 35@title The GNU Transactional Memory Library 36@page 37@vskip 0pt plus 1filll 38@comment For the @value{version-GCC} Version* 39@sp 1 40@insertcopying 41@end titlepage 42 43@summarycontents 44@contents 45@page 46 47 48@node Top 49@top Introduction 50@cindex Introduction 51 52This manual documents the usage and internals of libitm, the GNU Transactional 53Memory Library. It provides transaction support for accesses to a process' 54memory, enabling easy-to-use synchronization of accesses to shared memory by 55several threads. 56 57 58@comment 59@comment When you add a new menu item, please keep the right hand 60@comment aligned to the same column. Do not use tabs. This provides 61@comment better formatting. 62@comment 63@menu 64* Enabling libitm:: How to enable libitm for your applications. 65* C/C++ Language Constructs for TM:: 66 Notes on the language-level interface supported 67 by gcc. 68* The libitm ABI:: Notes on the external ABI provided by libitm. 69* Internals:: Notes on libitm's internal synchronization. 70* GNU Free Documentation License:: 71 How you can copy and share this manual. 72* Library Index:: Index of this documentation. 73@end menu 74 75 76@c --------------------------------------------------------------------- 77@c Enabling libitm 78@c --------------------------------------------------------------------- 79 80@node Enabling libitm 81@chapter Enabling libitm 82 83To activate support for TM in C/C++, the compile-time flag @option{-fgnu-tm} 84must be specified. This enables TM language-level constructs such as 85transaction statements (e.g., @code{__transaction_atomic}, @pxref{C/C++ 86Language Constructs for TM} for details). 87 88@c --------------------------------------------------------------------- 89@c C/C++ Language Constructs for TM 90@c --------------------------------------------------------------------- 91 92@node C/C++ Language Constructs for TM 93@chapter C/C++ Language Constructs for TM 94 95Transactions are supported in C++ and C in the form of transaction statements, 96transaction expressions, and function transactions. In the following example, 97both @code{a} and @code{b} will be read and the difference will be written to 98@code{c}, all atomically and isolated from other transactions: 99 100@example 101__transaction_atomic @{ c = a - b; @} 102@end example 103 104Therefore, another thread can use the following code to concurrently update 105@code{b} without ever causing @code{c} to hold a negative value (and without 106having to use other synchronization constructs such as locks or C++11 107atomics): 108 109@example 110__transaction_atomic @{ if (a > b) b++; @} 111@end example 112 113GCC follows the @uref{https://sites.google.com/site/tmforcplusplus/, Draft 114Specification of Transactional Language Constructs for C++ (v1.1)} in its 115implementation of transactions. 116 117The precise semantics of transactions are defined in terms of the C++11/C11 118memory model (see the specification). Roughly, transactions provide 119synchronization guarantees that are similar to what would be guaranteed when 120using a single global lock as a guard for all transactions. Note that like 121other synchronization constructs in C/C++, transactions rely on a 122data-race-free program (e.g., a nontransactional write that is concurrent 123with a transactional read to the same memory location is a data race). 124 125@c --------------------------------------------------------------------- 126@c The libitm ABI 127@c --------------------------------------------------------------------- 128 129@node The libitm ABI 130@chapter The libitm ABI 131 132The ABI provided by libitm is basically equal to the Linux variant of Intel's 133current TM ABI specification document (Revision 1.1, May 6 2009) but with the 134differences listed in this chapter. It would be good if these changes would 135eventually be merged into a future version of this specification. To ease 136look-up, the following subsections mirror the structure of this specification. 137 138@section [No changes] Objectives 139@section [No changes] Non-objectives 140 141@section Library design principles 142@subsection [No changes] Calling conventions 143@subsection [No changes] TM library algorithms 144@subsection [No changes] Optimized load and store routines 145@subsection [No changes] Aligned load and store routines 146 147@subsection Data logging functions 148 149The memory locations accessed with transactional loads and stores and the 150memory locations whose values are logged must not overlap. This required 151separation only extends to the scope of the execution of one transaction 152including all the executions of all nested transactions. 153 154The compiler must be consistent (within the scope of a single transaction) 155about which memory locations are shared and which are not shared with other 156threads (i.e., data must be accessed either transactionally or 157nontransactionally). Otherwise, non-write-through TM algorithms would not work. 158 159For memory locations on the stack, this requirement extends to only the 160lifetime of the stack frame that the memory location belongs to (or the 161lifetime of the transaction, whichever is shorter). Thus, memory that is 162reused for several stack frames could be target of both data logging and 163transactional accesses; however, this is harmless because these stack frames' 164lifetimes will end before the transaction finishes. 165 166@subsection [No changes] Scatter/gather calls 167@subsection [No changes] Serial and irrevocable mode 168@subsection [No changes] Transaction descriptor 169@subsection Store allocation 170 171There is no @code{getTransaction} function. 172 173@subsection [No changes] Naming conventions 174 175@subsection Function pointer encryption 176 177Currently, this is not implemented. 178 179 180@section Types and macros list 181 182@code{_ITM_codeProperties} has changed, @pxref{txn-code-properties,,Starting a 183transaction}. 184@code{_ITM_srcLocation} is not used. 185 186 187@section Function list 188 189@subsection Initialization and finalization functions 190These functions are not part of the ABI. 191 192@subsection [No changes] Version checking 193@subsection [No changes] Error reporting 194@subsection [No changes] inTransaction call 195 196@subsection State manipulation functions 197There is no @code{getTransaction} function. Transaction identifiers for 198nested transactions will be ordered but not necessarily sequential (i.e., for 199a nested transaction's identifier @var{IN} and its enclosing transaction's 200identifier @var{IE}, it is guaranteed that @math{IN >= IE}). 201 202@subsection [No changes] Source locations 203 204@subsection Starting a transaction 205 206@subsubsection Transaction code properties 207 208@anchor{txn-code-properties} 209The bit @code{hasNoXMMUpdate} is instead called @code{hasNoVectorUpdate}. 210Iff it is set, vector register save/restore is not necessary for any target 211machine. 212 213The @code{hasNoFloatUpdate} bit (@code{0x0010}) is new. Iff it is set, floating 214point register save/restore is not necessary for any target machine. 215 216@code{undoLogCode} is not supported and a fatal runtime error will be raised 217if this bit is set. It is not properly defined in the ABI why barriers 218other than undo logging are not present; Are they not necessary (e.g., a 219transaction operating purely on thread-local data) or have they been omitted by 220the compiler because it thinks that some kind of global synchronization 221(e.g., serial mode) might perform better? The specification suggests that the 222latter might be the case, but the former seems to be more useful. 223 224The @code{readOnly} bit (@code{0x4000}) is new. @strong{TODO} Lexical or dynamic 225scope? 226 227@code{hasNoRetry} is not supported. If this bit is not set, but 228@code{hasNoAbort} is set, the library can assume that transaction 229rollback will not be requested. 230 231It would be useful if the absence of externally-triggered rollbacks would be 232reported for the dynamic scope as well, not just for the lexical scope 233(@code{hasNoAbort}). Without this, a library cannot exploit this together 234with flat nesting. 235 236@code{exceptionBlock} is not supported because exception blocks are not used. 237 238@subsubsection [No changes] Windows exception state 239@subsubsection [No changes] Other machine state 240 241@subsubsection [No changes] Results from beginTransaction 242 243@subsection Aborting a transaction 244 245@code{_ITM_rollbackTransaction} is not supported. @code{_ITM_abortTransaction} 246is supported but the abort reasons @code{exceptionBlockAbort}, 247@code{TMConflict}, and @code{userRetry} are not supported. There are no 248exception blocks in general, so the related cases also do not have to be 249considered. To encode @code{__transaction_cancel [[outer]]}, compilers must 250set the new @code{outerAbort} bit (@code{0x10}) additionally to the 251@code{userAbort} bit in the abort reason. 252 253@subsection Committing a transaction 254 255The exception handling (EH) scheme is different. The Intel ABI requires the 256@code{_ITM_tryCommitTransaction} function that will return even when the 257commit failed and will have to be matched with calls to either 258@code{_ITM_abortTransaction} or @code{_ITM_commitTransaction}. In contrast, 259gcc relies on transactional wrappers for the functions of the Exception 260Handling ABI and on one additional commit function (shown below). This allows 261the TM to keep track of EH internally and thus it does not have to embed the 262cleanup of EH state into the existing EH code in the program. 263@code{_ITM_tryCommitTransaction} is not supported. 264@code{_ITM_commitTransactionToId} is also not supported because the 265propagation of thrown exceptions will not bypass commits of nested 266transactions. 267 268@example 269void _ITM_commitTransactionEH(void *exc_ptr) ITM_REGPARM; 270void *_ITM_cxa_allocate_exception (size_t); 271void _ITM_cxa_free_exception (void *exc_ptr); 272void _ITM_cxa_throw (void *obj, void *tinfo, void *dest); 273void *_ITM_cxa_begin_catch (void *exc_ptr); 274void _ITM_cxa_end_catch (void); 275@end example 276 277The EH scheme changed in version 6 of GCC. Previously, the compiler 278added a call to @code{_ITM_commitTransactionEH} to commit a transaction if 279an exception could be in flight at this position in the code; @code{exc_ptr} is 280the address of the current exception and must be non-zero. Now, the 281compiler must catch all exceptions that are about to be thrown out of a 282transaction and call @code{_ITM_commitTransactionEH} from the catch clause, 283with @code{exc_ptr} being zero. 284 285Note that the old EH scheme never worked completely in GCC's implementation; 286libitm currently does not try to be compatible with the old scheme. 287 288The @code{_ITM_cxa...} functions are transactional wrappers for the respective 289@code{__cxa...} functions and must be called instead of these in transactional 290code. @code{_ITM_cxa_free_exception} is new in GCC 6. 291 292To support this EH scheme, libstdc++ needs to provide one additional function 293(@code{_cxa_tm_cleanup}), which is used by the TM to clean up the exception 294handling state while rolling back a transaction: 295 296@example 297void __cxa_tm_cleanup (void *unthrown_obj, void *cleanup_exc, 298 unsigned int caught_count); 299@end example 300 301Since GCC 6, @code{unthrown_obj} is not used anymore and always null; 302prior to that, @code{unthrown_obj} is non-null if the program called 303@code{__cxa_allocate_exception} for this exception but did not yet called 304@code{__cxa_throw} for it. @code{cleanup_exc} is non-null if the program is 305currently processing a cleanup along an exception path but has not caught this 306exception yet. @code{caught_count} is the nesting depth of 307@code{__cxa_begin_catch} within the transaction (which can be counted by the TM 308using @code{_ITM_cxa_begin_catch} and @code{_ITM_cxa_end_catch}); 309@code{__cxa_tm_cleanup} then performs rollback by essentially performing 310@code{__cxa_end_catch} that many times. 311 312 313 314@subsection Exception handling support 315 316Currently, there is no support for functionality like 317@code{__transaction_cancel throw} as described in the C++ TM specification. 318Supporting this should be possible with the EH scheme explained previously 319because via the transactional wrappers for the EH ABI, the TM is able to 320observe and intercept EH. 321 322 323@subsection [No changes] Transition to serial--irrevocable mode 324@subsection [No changes] Data transfer functions 325@subsection [No changes] Transactional memory copies 326 327@subsection Transactional versions of memmove 328 329If either the source or destination memory region is to be accessed 330nontransactionally, then source and destination regions must not be 331overlapping. The respective @code{_ITM_memmove} functions are still 332available but a fatal runtime error will be raised if such regions do overlap. 333To support this functionality, the ABI would have to specify how the 334intersection of the regions has to be accessed (i.e., transactionally or 335nontransactionally). 336 337@subsection [No changes] Transactional versions of memset 338@subsection [No changes] Logging functions 339 340@subsection User-registered commit and undo actions 341 342Commit actions will get executed in the same order in which the respective 343calls to @code{_ITM_addUserCommitAction} happened. Only 344@code{_ITM_noTransactionId} is allowed as value for the 345@code{resumingTransactionId} argument. Commit actions get executed after 346privatization safety has been ensured. 347 348Undo actions will get executed in reverse order compared to the order in which 349the respective calls to @code{_ITM_addUserUndoAction} happened. The ordering of 350undo actions w.r.t. the roll-back of other actions (e.g., data transfers or 351memory allocations) is undefined. 352 353@code{_ITM_getThreadnum} is not supported currently because its only purpose 354is to provide a thread ID that matches some assumed performance tuning output, 355but this output is not part of the ABI nor further defined by it. 356 357@code{_ITM_dropReferences} is not supported currently because its semantics and 358the intention behind it is not entirely clear. The 359specification suggests that this function is necessary because of certain 360orderings of data transfer undos and the releasing of memory regions (i.e., 361privatization). However, this ordering is never defined, nor is the ordering of 362dropping references w.r.t. other events. 363 364@subsection [New] Transactional indirect calls 365 366Indirect calls (i.e., calls through a function pointer) within transactions 367should execute the transactional clone of the original function (i.e., a clone 368of the original that has been fully instrumented to use the TM runtime), if 369such a clone is available. The runtime provides two functions to 370register/deregister clone tables: 371 372@example 373struct clone_entry 374@{ 375 void *orig, *clone; 376@}; 377 378void _ITM_registerTMCloneTable (clone_entry *table, size_t entries); 379void _ITM_deregisterTMCloneTable (clone_entry *table); 380@end example 381 382Registered tables must be writable by the TM runtime, and must be live 383throughout the life-time of the TM runtime. 384 385@strong{TODO} The intention was always to drop the registration functions 386entirely, and create a new ELF Phdr describing the linker-sorted table. Much 387like what currently happens for @code{PT_GNU_EH_FRAME}. 388This work kept getting bogged down in how to represent the @var{N} different 389code generation variants. We clearly needed at least two---SW and HW 390transactional clones---but there was always a suggestion of more variants for 391different TM assumptions/invariants. 392 393The compiler can then use two TM runtime functions to perform indirect calls in 394transactions: 395@example 396void *_ITM_getTMCloneOrIrrevocable (void *function) ITM_REGPARM; 397void *_ITM_getTMCloneSafe (void *function) ITM_REGPARM; 398@end example 399 400If there is a registered clone for supplied function, both will return a 401pointer to the clone. If not, the first runtime function will attempt to switch 402to serial--irrevocable mode and return the original pointer, whereas the second 403will raise a fatal runtime error. 404 405@subsection [New] Transactional dynamic memory management 406 407@example 408void *_ITM_malloc (size_t) 409 __attribute__((__malloc__)) ITM_PURE; 410void *_ITM_calloc (size_t, size_t) 411 __attribute__((__malloc__)) ITM_PURE; 412void _ITM_free (void *) ITM_PURE; 413@end example 414 415These functions are essentially transactional wrappers for @code{malloc}, 416@code{calloc}, and @code{free}. Within transactions, the compiler should 417replace calls to the original functions with calls to the wrapper functions. 418 419libitm also provides transactional clones of C++ memory management functions 420such as global operator new and delete. They are part of libitm for historic 421reasons but do not need to be part of this ABI. 422 423 424@section [No changes] Future Enhancements to the ABI 425 426@section Sample code 427 428The code examples might not be correct w.r.t. the current version of the ABI, 429especially everything related to exception handling. 430 431 432@section [New] Memory model 433 434The ABI should define a memory model and the ordering that is guaranteed for 435data transfers and commit/undo actions, or at least refer to another memory 436model that needs to be preserved. Without that, the compiler cannot ensure the 437memory model specified on the level of the programming language (e.g., by the 438C++ TM specification). 439 440For example, if a transactional load is ordered before another load/store, then 441the TM runtime must also ensure this ordering when accessing shared state. If 442not, this might break the kind of publication safety used in the C++ TM 443specification. Likewise, the TM runtime must ensure privatization safety. 444 445 446 447@c --------------------------------------------------------------------- 448@c Internals 449@c --------------------------------------------------------------------- 450 451@node Internals 452@chapter Internals 453 454@section TM methods and method groups 455 456libitm supports several ways of synchronizing transactions with each other. 457These TM methods (or TM algorithms) are implemented in the form of 458subclasses of @code{abi_dispatch}, which provide methods for 459transactional loads and stores as well as callbacks for rollback and commit. 460All methods that are compatible with each other (i.e., that let concurrently 461running transactions still synchronize correctly even if different methods 462are used) belong to the same TM method group. Pointers to TM methods can be 463obtained using the factory methods prefixed with @code{dispatch_} in 464@file{libitm_i.h}. There are two special methods, @code{dispatch_serial} and 465@code{dispatch_serialirr}, that are compatible with all methods because they 466run transactions completely in serial mode. 467 468@subsection TM method life cycle 469 470The state of TM methods does not change after construction, but they do alter 471the state of transactions that use this method. However, because 472per-transaction data gets used by several methods, @code{gtm_thread} is 473responsible for setting an initial state that is useful for all methods. 474After that, methods are responsible for resetting/clearing this state on each 475rollback or commit (of outermost transactions), so that the transaction 476executed next is not affected by the previous transaction. 477 478There is also global state associated with each method group, which is 479initialized and shut down (@code{method_group::init()} and @code{fini()}) 480when switching between method groups (see @file{retry.cc}). 481 482@subsection Selecting the default method 483 484The default method that libitm uses for freshly started transactions (but 485not necessarily for restarted transactions) can be set via an environment 486variable (@env{ITM_DEFAULT_METHOD}), whose value should be equal to the name 487of one of the factory methods returning abi_dispatch subclasses but without 488the "dispatch_" prefix (e.g., "serialirr" instead of 489@code{GTM::dispatch_serialirr()}). 490 491Note that this environment variable is only a hint for libitm and might not 492be supported in the future. 493 494 495@section Nesting: flat vs. closed 496 497We support two different kinds of nesting of transactions. In the case of 498@emph{flat nesting}, the nesting structure is flattened and all nested 499transactions are subsumed by the enclosing transaction. In contrast, 500with @emph{closed nesting}, nested transactions that have not yet committed 501can be rolled back separately from the enclosing transactions; when they 502commit, they are subsumed by the enclosing transaction, and their effects 503will be finally committed when the outermost transaction commits. 504@emph{Open nesting} (where nested transactions can commit independently of the 505enclosing transactions) are not supported. 506 507Flat nesting is the default nesting mode, but closed nesting is supported and 508used when transactions contain user-controlled aborts 509(@code{__transaction_cancel} statements). We assume that user-controlled 510aborts are rare in typical code and used mostly in exceptional situations. 511Thus, it makes more sense to use flat nesting by default to avoid the 512performance overhead of the additional checkpoints required for closed 513nesting. User-controlled aborts will correctly abort the innermost enclosing 514transaction, whereas the whole (i.e., outermost) transaction will be restarted 515otherwise (e.g., when a transaction encounters data conflicts during 516optimistic execution). 517 518 519@section Locking conventions 520 521This section documents the locking scheme and rules for all uses of locking 522in libitm. We have to support serial(-irrevocable) mode, which is implemented 523using a global lock as explained next (called the @emph{serial lock}). To 524simplify the overall design, we use the same lock as catch-all locking 525mechanism for other infrequent tasks such as (de)registering clone tables or 526threads. Besides the serial lock, there are @emph{per-method-group locks} that 527are managed by specific method groups (i.e., groups of similar TM concurrency 528control algorithms), and lock-like constructs for quiescence-based operations 529such as ensuring privatization safety. 530 531Thus, the actions that participate in the libitm-internal locking are either 532@emph{active transactions} that do not run in serial mode, @emph{serial 533transactions} (which (are about to) run in serial mode), and management tasks 534that do not execute within a transaction but have acquired the serial mode 535like a serial transaction would do (e.g., to be able to register threads with 536libitm). Transactions become active as soon as they have successfully used the 537serial lock to announce this globally (@pxref{serial-lock-impl,,Serial lock 538implementation}). Likewise, transactions become serial transactions as soon as 539they have acquired the exclusive rights provided by the serial lock (i.e., 540serial mode, which also means that there are no other concurrent active or 541serial transactions). Note that active transactions can become serial 542transactions when they enter serial mode during the runtime of the 543transaction. 544 545@subsection State-to-lock mapping 546 547Application data is protected by the serial lock if there is a serial 548transaction and no concurrently running active transaction (i.e., non-serial). 549Otherwise, application data is protected by the currently selected method 550group, which might use per-method-group locks or other mechanisms. Also note 551that application data that is about to be privatized might not be allowed to be 552accessed by nontransactional code until privatization safety has been ensured; 553the details of this are handled by the current method group. 554 555libitm-internal state is either protected by the serial lock or accessed 556through custom concurrent code. The latter applies to the public/shared part 557of a transaction object and most typical method-group-specific state. 558 559The former category (protected by the serial lock) includes: 560@itemize @bullet 561@item The list of active threads that have used transactions. 562@item The tables that map functions to their transactional clones. 563@item The current selection of which method group to use. 564@item Some method-group-specific data, or invariants of this data. For example, 565resetting a method group to its initial state is handled by switching to the 566same method group, so the serial lock protects such resetting as well. 567@end itemize 568In general, such state is immutable whenever there exists an active 569(non-serial) transaction. If there is no active transaction, a serial 570transaction (or a thread that is not currently executing a transaction but has 571acquired the serial lock) is allowed to modify this state (but must of course 572be careful to not surprise the current method group's implementation with such 573modifications). 574 575@subsection Lock acquisition order 576 577To prevent deadlocks, locks acquisition must happen in a globally agreed-upon 578order. Note that this applies to other forms of blocking too, but does not 579necessarily apply to lock acquisitions that do not block (e.g., trylock() 580calls that do not get retried forever). Note that serial transactions are 581never return back to active transactions until the transaction has committed. 582Likewise, active transactions stay active until they have committed. 583Per-method-group locks are typically also not released before commit. 584 585Lock acquisition / blocking rules: 586@itemize @bullet 587 588@item Transactions must become active or serial before they are allowed to 589use method-group-specific locks or blocking (i.e., the serial lock must be 590acquired before those other locks, either in serial or nonserial mode). 591 592@item Any number of threads that do not currently run active transactions can 593block while trying to get the serial lock in exclusive mode. Note that active 594transactions must not block when trying to upgrade to serial mode unless there 595is no other transaction that is trying that (the latter is ensured by the 596serial lock implementation. 597 598@item Method groups must prevent deadlocks on their locks. In particular, they 599must also be prepared for another active transaction that has acquired 600method-group-specific locks but is blocked during an attempt to upgrade to 601being a serial transaction. See below for details. 602 603@item Serial transactions can acquire method-group-specific locks because there 604will be no other active nor serial transaction. 605 606@end itemize 607 608There is no single rule for per-method-group blocking because this depends on 609when a TM method might acquire locks. If no active transaction can upgrade to 610being a serial transaction after it has acquired per-method-group locks (e.g., 611when those locks are only acquired during an attempt to commit), then the TM 612method does not need to consider a potential deadlock due to serial mode. 613 614If there can be upgrades to serial mode after the acquisition of 615per-method-group locks, then TM methods need to avoid those deadlocks: 616@itemize @bullet 617@item When upgrading to a serial transaction, after acquiring exclusive rights 618to the serial lock but before waiting for concurrent active transactions to 619finish (@pxref{serial-lock-impl,,Serial lock implementation} for details), 620we have to wake up all active transactions waiting on the upgrader's 621per-method-group locks. 622@item Active transactions blocking on per-method-group locks need to check the 623serial lock and abort if there is a pending serial transaction. 624@item Lost wake-ups have to be prevented (e.g., by changing a bit in each 625per-method-group lock before doing the wake-up, and only blocking on this lock 626using a futex if this bit is not group). 627@end itemize 628 629@strong{TODO}: Can reuse serial lock for gl-*? And if we can, does it make 630sense to introduce further complexity in the serial lock? For gl-*, we can 631really only avoid an abort if we do -wb and -vbv. 632 633 634@subsection Serial lock implementation 635@anchor{serial-lock-impl} 636 637The serial lock implementation is optimized towards assuming that serial 638transactions are infrequent and not the common case. However, the performance 639of entering serial mode can matter because when only few transactions are run 640concurrently or if there are few threads, then it can be efficient to run 641transactions serially. 642 643The serial lock is similar to a multi-reader-single-writer lock in that there 644can be several active transactions but only one serial transaction. However, 645we do want to avoid contention (in the lock implementation) between active 646transactions, so we split up the reader side of the lock into per-transaction 647flags that are true iff the transaction is active. The exclusive writer side 648remains a shared single flag, which is acquired using a CAS, for example. 649On the fast-path, the serial lock then works similar to Dekker's algorithm but 650with several reader flags that a serial transaction would have to check. 651A serial transaction thus requires a list of all threads with potentially 652active transactions; we can use the serial lock itself to protect this list 653(i.e., only threads that have acquired the serial lock can modify this list). 654 655We want starvation-freedom for the serial lock to allow for using it to ensure 656progress for potentially starved transactions (@pxref{progress-guarantees,, 657Progress Guarantees} for details). However, this is currently not enforced by 658the implementation of the serial lock. 659 660Here is pseudo-code for the read/write fast paths of acquiring the serial 661lock (read-to-write upgrade is similar to write_lock: 662@example 663// read_lock: 664tx->shared_state |= active; 665__sync_synchronize(); // or STLD membar, or C++0x seq-cst fence 666while (!serial_lock.exclusive) 667 if (spinning_for_too_long) goto slowpath; 668 669// write_lock: 670if (CAS(&serial_lock.exclusive, 0, this) != 0) 671 goto slowpath; // writer-writer contention 672// need a membar here, but CAS already has full membar semantics 673bool need_blocking = false; 674for (t: all txns) 675 @{ 676 for (;t->shared_state & active;) 677 if (spinning_for_too_long) @{ need_blocking = true; break; @} 678 @} 679if (need_blocking) goto slowpath; 680@end example 681 682Releasing a lock in this spin-lock version then just consists of resetting 683@code{tx->shared_state} to inactive or clearing @code{serial_lock.exclusive}. 684 685However, we can't rely on a pure spinlock because we need to get the OS 686involved at some time (e.g., when there are more threads than CPUs to run on). 687Therefore, the real implementation falls back to a blocking slow path, either 688based on pthread mutexes or Linux futexes. 689 690 691@subsection Reentrancy 692 693libitm has to consider the following cases of reentrancy: 694@itemize @bullet 695 696@item Transaction calls unsafe code that starts a new transaction: The outer 697transaction will become a serial transaction before executing unsafe code. 698Therefore, nesting within serial transactions must work, even if the nested 699transaction is called from within uninstrumented code. 700 701@item Transaction calls either a transactional wrapper or safe code, which in 702turn starts a new transaction: It is not yet defined in the specification 703whether this is allowed. Thus, it is undefined whether libitm supports this. 704 705@item Code that starts new transactions might be called from within any part 706of libitm: This kind of reentrancy would likely be rather complex and can 707probably be avoided. Therefore, it is not supported. 708 709@end itemize 710 711@subsection Privatization safety 712 713Privatization safety is ensured by libitm using a quiescence-based approach. 714Basically, a privatizing transaction waits until all concurrent active 715transactions will either have finished (are not active anymore) or operate on 716a sufficiently recent snapshot to not access the privatized data anymore. This 717happens after the privatizing transaction has stopped being an active 718transaction, so waiting for quiescence does not contribute to deadlocks. 719 720In method groups that need to ensure publication safety explicitly, active 721transactions maintain a flag or timestamp in the public/shared part of the 722transaction descriptor. Before blocking, privatizers need to let the other 723transactions know that they should wake up the privatizer. 724 725@strong{TODO} Ho to implement the waiters? Should those flags be 726per-transaction or at a central place? We want to avoid one wake/wait call 727per active transactions, so we might want to use either a tree or combining 728to reduce the syscall overhead, or rather spin for a long amount of time 729instead of doing blocking. Also, it would be good if only the last transaction 730that the privatizer waits for would do the wake-up. 731 732@subsection Progress guarantees 733@anchor{progress-guarantees} 734 735Transactions that do not make progress when using the current TM method will 736eventually try to execute in serial mode. Thus, the serial lock's progress 737guarantees determine the progress guarantees of the whole TM. Obviously, we at 738least need deadlock-freedom for the serial lock, but it would also be good to 739provide starvation-freedom (informally, all threads will finish executing a 740transaction eventually iff they get enough cycles). 741 742However, the scheduling of transactions (e.g., thread scheduling by the OS) 743also affects the handling of progress guarantees by the TM. First, the TM 744can only guarantee deadlock-freedom if threads do not get stopped. Likewise, 745low-priority threads can starve if they do not get scheduled when other 746high-priority threads get those cycles instead. 747 748If all threads get scheduled eventually, correct lock implementations will 749provide deadlock-freedom, but might not provide starvation-freedom. We can 750either enforce the latter in the TM's lock implementation, or assume that 751the scheduling is sufficiently random to yield a probabilistic guarantee that 752no thread will starve (because eventually, a transaction will encounter a 753scheduling that will allow it to run). This can indeed work well in practice 754but is not necessarily guaranteed to work (e.g., simple spin locks can be 755pretty efficient). 756 757Because enforcing stronger progress guarantees in the TM has a higher runtime 758overhead, we focus on deadlock-freedom right now and assume that the threads 759will get scheduled eventually by the OS (but don't consider threads with 760different priorities). We should support starvation-freedom for serial 761transactions in the future. Everything beyond that is highly related to proper 762contention management across all of the TM (including with TM method to 763choose), and is future work. 764 765@strong{TODO} Handling thread priorities: We want to avoid priority inversion 766but it's unclear how often that actually matters in practice. Workloads that 767have threads with different priorities will likely also require lower latency 768or higher throughput for high-priority threads. Therefore, it probably makes 769not that much sense (except for eventual progress guarantees) to use 770priority inheritance until the TM has priority-aware contention management. 771 772 773@c --------------------------------------------------------------------- 774@c GNU Free Documentation License 775@c --------------------------------------------------------------------- 776 777@include fdl.texi 778 779@c --------------------------------------------------------------------- 780@c Index 781@c --------------------------------------------------------------------- 782 783@node Library Index 784@unnumbered Library Index 785 786@printindex cp 787 788@bye 789