src/backend/storage/lmgr/README-SSI

Serializable Snapshot Isolation (SSI) and Predicate Locking
===========================================================

This code is in the lmgr directory because about 90% of it is an
implementation of predicate locking, which is required for SSI,
rather than being directly related to SSI itself.  When another use
for predicate locking justifies the effort to tease these two things
apart, this README file should probably be split.


Credits
-------

This feature was developed by Kevin Grittner and Dan R. K. Ports,
with review and suggestions from Joe Conway, Heikki Linnakangas, and
Jeff Davis.  It is based on work published in these papers:

	Michael J. Cahill, Uwe Röhm, and Alan D. Fekete. 2008.
	Serializable isolation for snapshot databases.
	In SIGMOD '08: Proceedings of the 2008 ACM SIGMOD
	international conference on Management of data,
	pages 729-738, New York, NY, USA. ACM.
	http://doi.acm.org/10.1145/1376616.1376690

	Michael James Cahill. 2009.
	Serializable Isolation for Snapshot Databases.
	Sydney Digital Theses.
	University of Sydney, School of Information Technologies.
	http://hdl.handle.net/2123/5353


Overview
--------

With true serializable transactions, if you can show that your
transaction will do the right thing if there are no concurrent
transactions, it will do the right thing in any mix of serializable
transactions or be rolled back with a serialization failure.  This
feature has been implemented in PostgreSQL using SSI.


Serializable and Snapshot Transaction Isolation Levels
------------------------------------------------------

Serializable transaction isolation is attractive for shops with
active development by many programmers against a complex schema
because it guarantees data integrity with very little staff time --
if a transaction can be shown to always do the right thing when it is
run alone (before or after any other transaction), it will always do
the right thing in any mix of concurrent serializable transactions.
Where conflicts with other transactions would result in an
inconsistent state within the database or an inconsistent view of
the data, a serializable transaction will block or roll back to
prevent the anomaly. The SQL standard provides a specific SQLSTATE
for errors generated when a transaction rolls back for this reason,
so that transactions can be retried automatically.

Before version 9.1, PostgreSQL did not support a full serializable
isolation level. A request for serializable transaction isolation
actually provided snapshot isolation. This has well-known anomalies
which can allow data corruption or inconsistent views of the data
during concurrent transactions, although these anomalies only occur
when certain patterns of read-write dependencies exist within a set
of concurrent transactions. Where these patterns exist, the anomalies
can be prevented by introducing conflicts through explicitly
programmed locks or otherwise unnecessary writes to the database.
Snapshot isolation is popular because performance is better than
serializable isolation and the integrity guarantees which it does
provide allow anomalies to be avoided or managed with reasonable
effort in many environments.
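
As a concrete illustration of such an anomaly, consider the classic
"write skew" pattern.  (The table, data, and business rule below are
invented for this sketch; REPEATABLE READ is PostgreSQL's snapshot
isolation level.)  Two transactions each verify that "at least one
doctor is on call", and each then takes a different doctor off call:

    -- Hypothetical setup:
    CREATE TABLE doctors (name text PRIMARY KEY, on_call boolean);
    INSERT INTO doctors VALUES ('alice', true), ('bob', true);

    -- Session 1:
    BEGIN ISOLATION LEVEL REPEATABLE READ;
    SELECT count(*) FROM doctors WHERE on_call;    -- sees 2, rule holds
    -- Session 2:
    BEGIN ISOLATION LEVEL REPEATABLE READ;
    SELECT count(*) FROM doctors WHERE on_call;    -- also sees 2
    -- Session 1:
    UPDATE doctors SET on_call = false WHERE name = 'alice';
    COMMIT;
    -- Session 2:
    UPDATE doctors SET on_call = false WHERE name = 'bob';
    COMMIT;    -- succeeds; now nobody is on call, a result no serial
               -- ordering of the two transactions could produce

Run at the SERIALIZABLE level instead, the same interleaving causes
one of the two transactions to be rolled back with SQLSTATE 40001,
preserving the rule.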


Serializable Isolation Implementation Strategies
------------------------------------------------

Techniques for implementing full serializable isolation have been
published and in use in many database products for decades. The
primary technique which has been used is Strict Two-Phase Locking
(S2PL), which operates by blocking writes against data which has been
read by concurrent transactions and blocking any access (read or
write) against data which has been written by concurrent
transactions. A cycle in a graph of blocking indicates a deadlock,
requiring a rollback. Blocking and deadlocks under S2PL in high
contention workloads can be debilitating, crippling throughput and
response time.

A new technique for implementing full serializable isolation in an
MVCC database appears in the literature beginning in 2008. This
technique, known as Serializable Snapshot Isolation (SSI), has many of
the advantages of snapshot isolation. In particular, reads don't
block anything and writes don't block reads. Essentially, it runs
snapshot isolation but monitors the read-write conflicts between
transactions to identify dangerous structures in the transaction
graph which indicate that a set of concurrent transactions might
produce an anomaly, and rolls back transactions to ensure that no
anomalies occur. It will produce some false positives (where a
transaction is rolled back even though there would not have been an
anomaly), but will never let an anomaly occur. In the two known
prototype implementations, performance for many workloads (even with
the need to restart transactions which are rolled back) is very close
to snapshot isolation and generally far better than an S2PL
implementation.


Apparent Serial Order of Execution
----------------------------------

One way to understand when snapshot anomalies can occur, and to
visualize the difference between the serializable implementations
described above, is to consider that among transactions executing at
the serializable transaction isolation level, the results are
required to be consistent with some serial (one-at-a-time) execution
of the transactions [1]. How is that order determined in each?

In S2PL, each transaction locks any data it accesses. It holds the
locks until committing, preventing other transactions from making
conflicting accesses to the same data in the interim. Some
transactions may have to be rolled back to prevent deadlock. But
successful transactions can always be viewed as having occurred
sequentially, in the order they committed.

With snapshot isolation, reads never block writes, nor vice versa, so
more concurrency is possible. The order in which transactions appear
to have executed is determined by something more subtle than in S2PL:
read/write dependencies. If a transaction reads data, it appears to
execute after the transaction that wrote the data it is reading.
Similarly, if it updates data, it appears to execute after the
transaction that wrote the previous version. These dependencies, which
we call "wr-dependencies" and "ww-dependencies", are consistent with
the commit order, because the first transaction must have committed
before the second starts. However, there can also be dependencies
between two *concurrent* transactions, i.e. where one was running when
the other acquired its snapshot.  These "rw-conflicts" occur when one
transaction attempts to read data which is not visible to it because
the transaction which wrote it (or will later write it) is
concurrent. The reading transaction appears to have executed first,
regardless of the actual sequence of transaction starts or commits,
because it sees a database state prior to that in which the other
transaction leaves it.
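
For example (the "accounts" table here is hypothetical), an
rw-conflict arises between two concurrent transactions like this:

    -- Session 1 (T1):
    BEGIN ISOLATION LEVEL SERIALIZABLE;
    SELECT balance FROM accounts WHERE id = 1;    -- reads the old version
    -- Session 2 (T2):
    BEGIN ISOLATION LEVEL SERIALIZABLE;
    UPDATE accounts SET balance = 0 WHERE id = 1; -- writes a new version
    COMMIT;                                       -- T2 commits first
    -- Session 1 (T1):
    COMMIT;

Even though T2 committed first, T1 appears to have executed before it
in any consistent serial ordering, because T1 saw the database state
from before T2's update.  By itself this is harmless; it only becomes
a problem if it is part of a cycle.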

Anomalies occur when a cycle is created in the graph of dependencies:
when a dependency or series of dependencies causes transaction A to
appear to have executed before transaction B, but another series of
dependencies causes B to appear before A. If that's the case, then
the results can't be consistent with any serial execution of the
transactions.


SSI Algorithm
-------------

As of 9.1, serializable transactions in PostgreSQL are implemented using
Serializable Snapshot Isolation (SSI), based on the work of Cahill
et al. Fundamentally, this allows snapshot isolation to run as it
previously did, while monitoring for conditions which could create a
serialization anomaly.

SSI is based on the observation [2] that each snapshot isolation
anomaly corresponds to a cycle that contains a "dangerous structure"
of two adjacent rw-conflict edges:

      Tin ------> Tpivot ------> Tout
            rw             rw

SSI works by watching for this dangerous structure, and rolling
back a transaction when needed to prevent any anomaly. This means it
only needs to track rw-conflicts between concurrent transactions, not
wr- and ww-dependencies. It also means there is a risk of false
positives, because not every dangerous structure is embedded in an
actual cycle.  The number of false positives is low in practice, so
this represents an acceptable tradeoff for keeping the detection
overhead low.
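
The following sketch shows how such a structure can arise and why it
may be a false positive.  (The table "t", its two rows, and the exact
interleaving are assumptions for illustration only.)

    -- Session A (Tin):
    BEGIN ISOLATION LEVEL SERIALIZABLE;
    SELECT val FROM t WHERE id = 1;             -- row Tpivot will write
    -- Session B (Tpivot):
    BEGIN ISOLATION LEVEL SERIALIZABLE;
    SELECT val FROM t WHERE id = 2;             -- row Tout will write
    UPDATE t SET val = val + 1 WHERE id = 1;    -- rw-conflict Tin -> Tpivot
    -- Session C (Tout):
    BEGIN ISOLATION LEVEL SERIALIZABLE;
    UPDATE t SET val = val + 1 WHERE id = 2;    -- rw-conflict Tpivot -> Tout
    COMMIT;

Tpivot now has both a conflict in and a conflict out, so SSI may roll
back one of the still-active transactions, even though no edge leads
back from Tout to Tin and there is therefore no actual cycle -- one of
the false positives mentioned above.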

The PostgreSQL implementation uses two additional optimizations:

* Tout must commit before any other transaction in the cycle
  (see proof of Theorem 2.1 of [2]). We only roll back a transaction
  if Tout commits before Tpivot and Tin.

* If Tin is read-only, there can only be an anomaly if Tout committed
  before Tin takes its snapshot. This optimization is an original
  one. Proof:

  - Because there is a cycle, there must be some transaction T0 that
    precedes Tin in the cycle. (T0 might be the same as Tout.)

  - The edge between T0 and Tin can't be a rw-conflict or ww-dependency,
    because Tin was read-only, so it must be a wr-dependency.
    Those can only occur if T0 committed before Tin took its snapshot,
    else Tin would have ignored T0's output.

  - Because Tout must commit before any other transaction in the
    cycle, it must commit before T0 commits -- and thus before Tin
    starts.


PostgreSQL Implementation
-------------------------

    * Since this technique is based on Snapshot Isolation (SI), those
areas in PostgreSQL which don't use SI can't be brought under SSI.
This includes system tables, temporary tables, sequences, hint bit
rewrites, etc.  SSI cannot eliminate existing anomalies in these
areas.

    * Any transaction which is run at a transaction isolation level
other than SERIALIZABLE will not be affected by SSI.  If you want to
enforce business rules through SSI, all transactions should be run at
the SERIALIZABLE transaction isolation level, and that should
probably be set as the default (a configuration sketch follows this
list).

    * If all transactions are run at the SERIALIZABLE transaction
isolation level, business rules can be enforced in triggers or
application code without ever having a need to acquire an explicit
lock or to use SELECT FOR SHARE or SELECT FOR UPDATE.

    * Those who want to continue to use snapshot isolation without
the additional protections of SSI (and the associated costs of
enforcing those protections) can use the REPEATABLE READ transaction
isolation level.  This level retains its legacy behavior, which
is identical to the old SERIALIZABLE implementation and fully
consistent with the standard's requirements for the REPEATABLE READ
transaction isolation level.

    * Performance under this SSI implementation will be significantly
improved if transactions which don't modify permanent tables are
declared to be READ ONLY before they begin reading data.

    * Performance under SSI will tend to degrade more rapidly with a
large number of active database transactions than under less strict
isolation levels.  Limiting the number of active transactions through
use of a connection pool or similar techniques may be necessary to
maintain good performance.

    * Any transaction which must be rolled back to prevent
serialization anomalies will fail with SQLSTATE 40001, which has a
standard meaning of "serialization failure".

    * This SSI implementation makes an effort to choose the
transaction to be canceled such that an immediate retry of the
transaction will not fail due to conflicts with exactly the same
transactions.  Pursuant to this goal, no transaction is canceled
until one of the other transactions in the set of conflicts which
could generate an anomaly has successfully committed.  This is
conceptually similar to how write conflicts are handled.  To fully
implement this guarantee there needs to be a way to roll back the
active transaction for another process with a serialization failure
SQLSTATE, even if it is "idle in transaction".
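
The points above about defaults and retry behavior look like this in
practice (a sketch; the parameter and SQLSTATE shown are standard
PostgreSQL, though the exact error wording can vary between releases):

    -- Make serializable the default for new transactions, either in
    -- postgresql.conf:
    --     default_transaction_isolation = 'serializable'
    -- or per session, role, or database:
    SET default_transaction_isolation = 'serializable';

    -- A transaction cancelled by SSI fails with SQLSTATE 40001
    -- (serialization_failure); the client should simply retry it:
    --     ERROR:  could not serialize access due to read/write
    --             dependencies among transactions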


Predicate Locking
-----------------

Both S2PL and SSI require some form of predicate locking to handle
situations where reads conflict with later inserts or with later
updates which move data into the selected range.  PostgreSQL didn't
already have predicate locking, so it needed to be added to support
full serializable transactions under either strategy. Practical
implementations of predicate locking generally involve acquiring
locks against data as it is accessed, using multiple granularities
(tuple, page, table, etc.) with escalation as needed to keep the lock
count to a number which can be tracked within RAM structures.  This
approach was used in PostgreSQL.  Coarse granularities can cause some
false positive indications of conflict. The number of false positives
can be influenced by plan choice.
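
The locks acquired this way are visible in the pg_locks view with mode
SIReadLock, which is a convenient way to see what granularity a given
plan produced.  (The table name and the particular rows shown below
are illustrative only.)

    -- From another session, while a serializable transaction that has
    -- read from "accounts" is still open:
    SELECT locktype, relation::regclass, page, tuple
      FROM pg_locks
     WHERE mode = 'SIReadLock';

    --  locktype | relation | page | tuple
    -- ----------+----------+------+-------
    --  tuple    | accounts |    5 |     2    -- individual tuple
    --  page     | accounts |    7 |          -- promoted to page
    --  relation | accounts |      |          -- promoted to whole relation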


Implementation overview
-----------------------

New RAM structures, inspired by those used to track traditional locks
in PostgreSQL, but tailored to the needs of SIREAD predicate locking,
are used.  These refer to physical objects actually accessed in the
course of executing the query, to model the predicates through
inference.  Anyone interested in this subject should review the
Hellerstein, Stonebraker and Hamilton paper [3], along with the
locking papers referenced from that and the Cahill papers.

Because the SIREAD locks don't block, traditional locking techniques
have to be modified.  Intent locking (locking higher level objects
before locking lower level objects) doesn't work with non-blocking
"locks" (which are, in some respects, more like flags than locks).

A configurable amount of shared memory is reserved at postmaster
start-up to track predicate locks. This size cannot be changed
without a restart.

To prevent resource exhaustion, multiple fine-grained locks may
be promoted to a single coarser-grained lock as needed.

An attempt to acquire an SIREAD lock on a tuple when the same
transaction already holds an SIREAD lock on the page or the relation
will be ignored. Likewise, an attempt to lock a page when the
relation is locked will be ignored, and the acquisition of a coarser
lock will result in the automatic release of all finer-grained locks
it covers.
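
The relevant setting is max_pred_locks_per_transaction, which sizes
the shared predicate-lock tables at server start (the default shown
below is from the documentation; later releases also add parameters
controlling when fine-grained locks are promoted):

    -- Postmaster-context setting; changing it requires a restart.
    SHOW max_pred_locks_per_transaction;     -- 64 by default
    -- In postgresql.conf:
    --     max_pred_locks_per_transaction = 64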


Heap locking
------------

Predicate locks will be acquired for the heap based on the following:

    * For a table scan, the entire relation will be locked.

    * Each tuple read which is visible to the reading transaction
will be locked, whether or not it meets selection criteria; except
that there is no need to acquire an SIREAD lock on a tuple when the
transaction already holds a write lock on any tuple representing the
row, since a rw-conflict would also create a ww-dependency which
has more aggressive enforcement and thus will prevent any anomaly.

    * Modifying a heap tuple creates a rw-conflict with any transaction
that holds a SIREAD lock on that tuple, or on the page or relation
that contains it.

    * Inserting a new tuple creates a rw-conflict with any transaction
holding a SIREAD lock on the entire relation. It doesn't conflict with
page-level locks, because page-level locks are only used to aggregate
tuple locks. Unlike index page locks, they don't lock "gaps" on the page.


Index AM implementations
------------------------

Since predicate locks only exist to detect writes which conflict with
earlier reads, and heap tuple locks are acquired to cover all heap
tuples actually read, including those read through indexes, the index
tuples which were actually scanned are not of interest in themselves;
we only care about their "new neighbors" -- later inserts into the
index which would have been included in the scan had they existed at
the time.  Conceptually, we want to lock the gaps between and
surrounding index entries within the scanned range.

Correctness requires that any insert into an index generates a
rw-conflict with a concurrent serializable transaction if, after that
insert, re-execution of any index scan of the other transaction would
access the heap for a row not accessed during the previous execution.
Note that a non-HOT update which expires an old index entry covered
by the scan and adds a new entry for the modified row's new tuple
need not generate a conflict, although an update which "moves" a row
into the scan must generate a conflict.  While correctness allows
false positives, they should be minimized for performance reasons.
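
As a hypothetical illustration (the table, index, and dates are
invented, and we assume the planner chooses an index scan), a later
insert into a key range that an earlier serializable transaction
scanned must be detected:

    CREATE TABLE events (id int PRIMARY KEY, happened date);
    CREATE INDEX events_happened_idx ON events (happened);

    -- Session 1 (T1):
    BEGIN ISOLATION LEVEL SERIALIZABLE;
    SELECT * FROM events
     WHERE happened BETWEEN '2011-01-01' AND '2011-01-31';
    -- The index scan predicate-locks the scanned key range ("gaps"
    -- included), not just the entries it happened to find.

    -- Session 2 (T2):
    BEGIN ISOLATION LEVEL SERIALIZABLE;
    INSERT INTO events VALUES (42, '2011-01-15');
    -- This lands inside the range T1 scanned, so it must generate an
    -- rw-conflict from T1 to T2, even though T1 saw no such row.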

Several optimizations are possible, though not all are implemented yet:

    * An index scan which is just finding the right position for an
index insertion or deletion need not acquire a predicate lock.

    * An index scan which is comparing for equality on the entire key
for a unique index need not acquire a predicate lock as long as a key
is found corresponding to a visible tuple which has not been modified
by another transaction -- there are no "between or around" gaps to
cover.

    * As long as built-in foreign key enforcement continues to use
its current "special tricks" to deal with MVCC issues, predicate
locks should not be needed for scans done by enforcement code.

    * If a search determines that no rows can be found regardless of
index contents because the search conditions are contradictory (e.g.,
x = 1 AND x = 2), then no predicate lock is needed.

Other index AM implementation considerations:

    * For an index AM that doesn't have support for predicate locking,
we just acquire a predicate lock on the whole index for any search.

    * B-tree index searches acquire predicate locks only on the
index *leaf* pages needed to lock the appropriate index range. If,
however, a search discovers that no root page has yet been created, a
predicate lock on the index relation is required.

    * GiST searches can determine that there are no matches at any
level of the index, so there must be a predicate lock at each index
level during a GiST search. An index insert at the leaf level can
then be trusted to ripple up to all levels and locations where
conflicting predicate locks may exist.

    * The effects of page splits, overflows, consolidations, and
removals must be carefully reviewed to ensure that predicate locks
aren't "lost" during those operations, or kept with pages which could
get re-used for different parts of the index.


Innovations
-----------

The PostgreSQL implementation of Serializable Snapshot Isolation
differs from what is described in the cited papers for several
reasons:

   1. PostgreSQL didn't have any existing predicate locking. It had
to be added from scratch.

   2. The existing in-memory lock structures were not suitable for
tracking SIREAD locks.
          * In PostgreSQL, tuple level locks are not held in RAM for
any length of time; lock information is written to the tuples
involved in the transactions.
          * In PostgreSQL, existing lock structures have pointers to
memory which is related to a session. SIREAD locks need to persist
past the end of the originating transaction and even the session
which ran it.
          * PostgreSQL needs to be able to tolerate a large number of
transactions executing while one long-running transaction stays open
-- the in-RAM techniques discussed in the papers wouldn't support
that.

   3. Unlike the database products used for the prototypes described
in the papers, PostgreSQL didn't already have a true serializable
isolation level distinct from snapshot isolation.

   4. PostgreSQL supports subtransactions -- an issue not mentioned
in the papers.

   5. PostgreSQL doesn't assign a transaction number to a database
transaction until and unless necessary (normally, when the transaction
attempts to modify data).

   6. PostgreSQL has pluggable data types with user-definable
operators, as well as pluggable index types, not all of which are
based around data types which support ordering.

   7. Some possible optimizations became apparent during development
and testing.

Differences from the implementation described in the papers are
listed below.

    * New structures needed to be created in shared memory to track
the proper information for serializable transactions and their SIREAD
locks.

    * Because PostgreSQL does not have the same concept of an "oldest
transaction ID" for all serializable transactions as assumed in the
Cahill thesis, we track the oldest snapshot xmin among serializable
transactions, and a count of how many active transactions use that
xmin. When the count hits zero we find the new oldest xmin and run a
clean-up based on that.

    * Because reads in a subtransaction may cause that subtransaction
to roll back, thereby affecting what is written by the top level
transaction, predicate locks must survive a subtransaction rollback.
As a consequence, all xid usage in SSI, including predicate locking,
is based on the top level xid.  When looking at an xid that comes
from a tuple's xmin or xmax, for example, we always call
SubTransGetTopmostTransaction() before doing much else with it.

    * PostgreSQL does not use "update in place" with a rollback log
for its MVCC implementation.  Where possible it uses "HOT" updates on
the same page (if there is room and no indexed value is changed).
For non-HOT updates the old tuple is expired in place and a new tuple
is inserted at a new location.  Because of this difference, a tuple
lock in PostgreSQL doesn't automatically lock any other versions of a
row.  We don't try to copy or expand a tuple lock to any other
versions of the row, based on the following proof that any additional
serialization failures we would get from that would be false
positives:

          o If transaction T1 reads a row version (thus acquiring a
predicate lock on it) and a second transaction T2 updates that row
version (thus creating a rw-conflict graph edge from T1 to T2), must a
third transaction T3 which re-updates the new version of the row also
have a rw-conflict in from T1 to prevent anomalies?  In other words,
does it matter whether we recognize the edge T1 -> T3?

          o If T1 has a conflict in, it certainly doesn't matter.
Adding the edge T1 -> T3 would create a dangerous structure, but we
already had one from the edge T1 -> T2, so we would have aborted
something anyway.  (T2 has already committed, else T3 could not have
updated its output; but we would have aborted either T1 or T1's
predecessor(s).  Hence no cycle involving T1 and T3 can survive.)

          o Now let's consider the case where T1 doesn't have a
rw-conflict in. If that's the case, for this edge T1 -> T3 to make a
difference, T3 must have a rw-conflict out that induces a cycle in the
dependency graph, i.e. a conflict out to some transaction preceding T1
in the graph. (A conflict out to T1 itself would be problematic too,
but that would mean T1 has a conflict in, the case we already
eliminated.)

          o So now we're trying to figure out if there can be an
rw-conflict edge T3 -> T0, where T0 is some transaction that precedes
T1. For T0 to precede T1, there has to be some edge, or sequence of
edges, from T0 to T1. At least the last edge has to be a wr-dependency
or ww-dependency rather than a rw-conflict, because T1 doesn't have a
rw-conflict in. And that gives us enough information about the order
of transactions to see that T3 can't have a rw-conflict to T0:
 - T0 committed before T1 started (the wr/ww-dependency implies this)
 - T1 started before T2 committed (the T1->T2 rw-conflict implies this)
 - T2 committed before T3 started (otherwise, T3 would get aborted
                                   because of an update conflict)

          o That means T0 committed before T3 started, and therefore
there can't be a rw-conflict from T3 to T0.

          o So in all cases, we don't need the T1 -> T3 edge to
recognize cycles.  Therefore it's not necessary for T1's SIREAD lock
on the original tuple version to cover later versions as well.

    * Predicate locking in PostgreSQL starts at the tuple level
when possible. Multiple fine-grained locks are promoted to a single
coarser-granularity lock as needed to avoid resource exhaustion.  The
amount of memory used for these structures is configurable, to balance
RAM usage against SIREAD lock granularity.

    * Each backend keeps a process-local table of the locks it holds.
To support granularity promotion decisions with low CPU and locking
overhead, this table also includes the coarser covering locks and the
number of finer-granularity locks they cover.

    * Conflicts are identified by looking for predicate locks
when tuples are written, and by looking at the MVCC information when
tuples are read. There is no matching between two RAM-based locks.

    * Because write locks are stored in the heap tuples rather than a
RAM-based lock table, the optimization described in the Cahill thesis
which eliminates an SIREAD lock where there is a write lock is
implemented by the following:
         1. When checking a heap write for conflicts against existing
predicate locks, a tuple lock on the tuple being written is removed.
         2. When acquiring a predicate lock on a heap tuple, we
return quickly without doing anything if it is a tuple written by the
reading transaction.

    * Rather than using conflictIn and conflictOut pointers which use
NULL to indicate no conflict and a self-reference to indicate
multiple conflicts or conflicts with committed transactions, we use a
list of rw-conflicts. With the more complete information, false
positives are reduced and we have sufficient data for more aggressive
clean-up and other optimizations:

          o We can avoid ever rolling back a transaction until and
unless there is a pivot where a transaction on the conflict *out*
side of the pivot committed before either of the other transactions.

          o We can avoid ever rolling back a transaction when the
transaction on the conflict *in* side of the pivot is explicitly or
implicitly READ ONLY unless the transaction on the conflict *out*
side of the pivot committed before the READ ONLY transaction acquired
its snapshot. (An implicit READ ONLY transaction is one which
committed without writing, even though it was not explicitly declared
to be READ ONLY.)

          o We can more aggressively clean up conflicts, predicate
locks, and SSI transaction information.

    * We allow a READ ONLY transaction to "opt out" of SSI if there are
no READ WRITE transactions which could cause the READ ONLY
transaction to ever become part of a "dangerous structure" of
overlapping transaction dependencies.

    * We allow the user to request that a READ ONLY transaction wait
until the conditions are right for it to start in the "opt out" state
described above. We add a DEFERRABLE state to transactions, which is
specified and maintained in a way similar to READ ONLY. It is
ignored for transactions that are not SERIALIZABLE and READ ONLY.
(An example appears after this list.)

    * When a transaction must be rolled back, we pick among the
active transactions such that an immediate retry will not fail again
on conflicts with the same transactions.

    * We use the PostgreSQL SLRU system to hold summarized
information about older committed transactions to put an upper bound
on RAM used. Beyond that limit, information spills to disk.
Performance can degrade in a pessimal situation, but it should be
tolerable, and transactions won't need to be canceled or blocked
from starting.
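
For instance, the DEFERRABLE property mentioned above is requested
like this (default_transaction_deferrable is the corresponding
configuration parameter; the table name is hypothetical):

    BEGIN TRANSACTION ISOLATION LEVEL SERIALIZABLE READ ONLY DEFERRABLE;
    -- May block briefly before taking its snapshot; after that it runs
    -- in the "opt out" state, with no risk of serialization failure
    -- and no need to take SIREAD locks.
    SELECT count(*) FROM very_big_table;
    COMMIT;

    -- It can also be made the default:
    SET default_transaction_deferrable = on;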


R&D Issues
----------

This is intended to be the place to record specific issues which need
more detailed review or analysis.

    * WAL file replay. While serializable implementations using S2PL
can guarantee that the write-ahead log contains commits in a sequence
consistent with some serial execution of serializable transactions,
SSI cannot make that guarantee. While the WAL replay is no less
consistent than under snapshot isolation, it is possible that under
PITR recovery or hot standby a database could reach a readable state
where some transactions appear before other transactions which would
have had to precede them to maintain serializable consistency. In
essence, if we do nothing, WAL replay will be at snapshot isolation
even for serializable transactions. Is this OK? If not, how do we
address it?

    * External replication. Look at how this impacts external
replication solutions, like Postgres-R, Slony, pgpool, HS/SR, etc.
This is related to the "WAL file replay" issue.

    * UNIQUE btree search for equality on all columns. Since a search
of a UNIQUE index using equality tests on all columns will lock the
heap tuple if an entry is found, it appears that there is no need to
get a predicate lock on the index in that case. A predicate lock is
still needed for such a search if a matching index entry which points
to a visible tuple is not found.

    * Minimize touching of shared memory. Should lists in shared
memory push entries which have just been returned to the front of the
available list, so they will be popped back off soon and some memory
might never be touched, or should we keep adding returned items to
the end of the available list?


References
----------

[1] http://www.contrib.andrew.cmu.edu/~shadow/sql/sql1992.txt
Search for "serial execution" to find the relevant section.

[2] A. Fekete et al. Making Snapshot Isolation Serializable. In ACM
Transactions on Database Systems 30:2, Jun. 2005.
http://dx.doi.org/10.1145/1071610.1071615

[3] Joseph M. Hellerstein, Michael Stonebraker and James Hamilton. 2007.
Architecture of a Database System. Foundations and Trends(R) in
Databases Vol. 1, No. 2 (2007) 141-259.
http://db.cs.berkeley.edu/papers/fntdb07-architecture.pdf
  Of particular interest:
    * 6.1 A Note on ACID
    * 6.2 A Brief Review of Serializability
    * 6.3 Locking and Latching
    * 6.3.1 Transaction Isolation Levels
    * 6.5.3 Next-Key Locking: Physical Surrogates for Logical Properties