1===============================
2Convenient API for Transactions
3===============================
4
5:Spec Title: Convenient API for Transactions
6:Spec Version: 1.2
7:Author: Jeremy Mikola
8:Lead: Jeff Yemin
9:Advisors: A\. Jesse Jiryu Davis, Kris Brandow, Oleg Pudeyev, Sam Ritter, Tess Avitabile
10:Status: Accepted
11:Type: Standards
12:Minimum Server Version: 4.0
13:Last Modified: 2019-04-24
14
15.. contents::
16
17--------
18
19Abstract
20========
21
22Reliably committing a transaction in the face of errors can be a complicated
23endeavor using the MongoDB 4.0 drivers API.  This specification introduces a
24``withTransaction`` method on the ClientSession object that allows application
25logic to be executed within a transaction. This method is capable of retrying
26either the commit operation or entire transaction as needed (and when the error
27permits) to better ensure that the transaction can complete successfully.
28
29META
30====
31
32The keywords "MUST", "MUST NOT", "REQUIRED", "SHALL", "SHALL NOT", "SHOULD",
33"SHOULD NOT", "RECOMMENDED", "MAY", and "OPTIONAL" in this document are to be
34interpreted as described in `RFC 2119 <https://www.ietf.org/rfc/rfc2119.txt>`_.
35
36Specification
37=============
38
39Terms
40-----
41
42.. _callback:
43
44Callback
45   A user-defined function that will be passed to the helper method defined in
46   this specification. Depending on the implementing language, this may be a
47   closure, function pointer, or other callable type.
48
49ClientSession
50   Driver object representing a client session, as defined in the
51   `Driver Session`_ specification. The name of this object MAY vary across
52   drivers.
53
54.. _Driver Session: ../sessions/driver-sessions.rst
55
56MongoClient
57   The root object of a driver's API. The name of this object MAY vary across
58   drivers.
59
60.. _TransactionOptions:
61
62TransactionOptions
63   Options for ``ClientSession.startTransaction``, as defined in the
64   `Transactions`_ specification. The structure of these options MAY vary across
65   drivers (e.g. dictionary, typed class).
66
67.. _Transactions: ../transactions/transactions.rst
68
69Naming Deviations
70-----------------
71
72This specification defines the name for a new ClientSession method,
73``withTransaction``. Drivers SHOULD use the defined name but MAY deviate to
74comply with their existing conventions. For example, a driver may use
75``with_transaction`` instead of ``withTransaction``.
76
77Callback Semantics
78------------------
79
80The purpose of the callback is to allow the application to specify some sequence
81of operations to be executed within the body of a transaction. In an ideal
82scenario, ``withTransaction`` will start a transaction, execute the callback,
and commit the transaction. In the event of an error, the commit or entire
transaction may need to be retried, and thus the callback could be invoked
multiple times.
86
87Drivers MUST ensure that the application can access the ClientSession within the
88callback, since the ClientSession will be needed to associate operations with
89the transaction. Drivers may elect an idiomatic approach to satisfy this
90requirement (e.g. require the callback to receive the ClientSession as its first
91argument, expect the callback to access the ClientSession from its lexical
92scope). Drivers MAY allow the callback to support additional parameters as
93needed (e.g. user data parameter, error output parameter). Drivers MAY allow the
94callback to return a value to be propagated as the return value of
95``withTransaction``.
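
As an illustration, the two ways of giving the callback access to the
ClientSession might look as follows. This is only a sketch: the ``Session``
interface and the ``runOnce`` function are hypothetical stand-ins and are not
part of any driver API.

.. code:: typescript

    // Hypothetical minimal session type, for illustration only.
    interface Session {
        readonly id: string;
    }

    // The callback may receive the session as its first argument.
    type SessionCallback<T> = (session: Session) => T;

    // Hypothetical stand-in for withTransaction. It only shows how the
    // callback is invoked and how its return value may be propagated.
    function runOnce<T>(session: Session, callback: SessionCallback<T>): T {
        return callback(session);
    }

    const session: Session = { id: "s1" };

    // Approach 1: the session is passed explicitly to the callback.
    const viaArgument = runOnce(session, (s) => `used ${s.id}`);

    // Approach 2: the callback ignores its argument and captures the
    // session from its lexical scope.
    const viaClosure = runOnce(session, () => `used ${session.id}`);

Both calls propagate the callback's return value to the caller, which the
specification permits but does not require.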
96
97ClientSession Changes
98---------------------
99
100This specification introduces a ``withTransaction`` method on the ClientSession
101class:
102
.. code:: typescript

    interface ClientSession {
        withTransaction(function<any(...)> callback,
                        Optional<TransactionOptions> options,
                        ... /* other arguments as needed */): any

        // other existing members of ClientSession
    }
112
113ClientSession.withTransaction
114~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
115
116This method is responsible for starting a transaction, invoking a callback, and
117committing a transaction. The callback is expected to execute one or more
operations within the transaction; however, that is not enforced. The callback
is allowed to execute other operations not associated with the transaction.
120
121Since ``withTransaction`` includes logic to retry transactions and commits,
122drivers MUST enforce a 120-second timeout to limit retry behavior and safeguard
123applications from long-running (or infinite) retry loops. Drivers SHOULD use a
124monotonic clock to determine elapsed time.
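
One possible way to track the retry window is sketched below. The
``startRetryWindow`` helper and its injected ``now`` parameter are assumptions
made for illustration (the clock is injected so that the logic can be exercised
with a fake time source); the specification does not define such a helper.

.. code:: typescript

    // The fixed retry timeout required by the specification, in milliseconds.
    const RETRY_TIMEOUT_MS = 120000;

    // Hypothetical helper: records the start time and returns a function that
    // reports whether the retry window is still open. `now` is expected to
    // return milliseconds from a monotonic source.
    function startRetryWindow(now: () => number,
                              timeoutMs: number = RETRY_TIMEOUT_MS): () => boolean {
        const startTime = now();
        return () => now() - startTime < timeoutMs;
    }

    // Usage with a fake clock that the example advances by hand.
    let fakeTime = 1000;
    const mayRetry = startRetryWindow(() => fakeTime);

    fakeTime += 119999;
    const beforeLimit = mayRetry(); // 119999 ms elapsed, window still open

    fakeTime += 1;
    const atLimit = mayRetry();     // 120000 ms elapsed, window closed

In Node.js or a browser, ``performance.now()`` is one monotonic source that
could be passed as ``now``. ``Date.now()`` follows the wall clock and may jump
if the system time is adjusted.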
125
126If an UnknownTransactionCommitResult error is encountered for a commit, the
127driver MUST retry the commit if and only if the error is not MaxTimeMSExpired
and the retry timeout has not been exceeded. Otherwise, the driver MUST NOT
retry the commit and MUST allow ``withTransaction`` to propagate the error to
its caller.
131
132If a TransientTransactionError is encountered at any point, the entire
133transaction may be retried. If the retry timeout has not been exceeded, the
134driver MUST retry a transaction that fails with an error bearing the
135"TransientTransactionError" label. Since retrying the entire transaction will
136entail invoking the callback again, drivers MUST document that the callback may
137be invoked multiple times (i.e. one additional time per retry attempt) and MUST
138document the risk of side effects from using a non-idempotent callback. If the
retry timeout has been exceeded, drivers MUST NOT retry the transaction and
MUST allow ``withTransaction`` to propagate the error to its caller.
141
142If an error bearing neither the UnknownTransactionCommitResult nor the
143TransientTransactionError label is encountered at any point, the driver MUST NOT
144retry and MUST allow ``withTransaction`` to propagate the error to its caller.
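
The retry rules above, as they apply to an error from ``commitTransaction``,
can be condensed into a single decision function. The sketch below assumes a
hypothetical ``LabeledError`` shape and a precomputed ``withinTimeout`` flag;
real drivers expose error labels and error codes in their own ways.

.. code:: typescript

    type RetryAction = "retryCommit" | "retryTransaction" | "propagate";

    // Hypothetical error shape, for illustration only.
    interface LabeledError {
        labels: string[];
        isMaxTimeMSExpired: boolean;
    }

    function hasLabel(error: LabeledError, label: string): boolean {
        return error.labels.indexOf(label) !== -1;
    }

    // Decide what to do after commitTransaction reports an error.
    function onCommitError(error: LabeledError,
                           withinTimeout: boolean): RetryAction {
        if (!withinTimeout) {
            return "propagate"; // never retry once the timeout is exceeded
        }
        if (hasLabel(error, "UnknownTransactionCommitResult") &&
            !error.isMaxTimeMSExpired) {
            return "retryCommit";
        }
        if (hasLabel(error, "TransientTransactionError")) {
            return "retryTransaction";
        }
        return "propagate"; // neither label applies
    }

    const unknownResult: LabeledError = {
        labels: ["UnknownTransactionCommitResult"],
        isMaxTimeMSExpired: false,
    };
    const action = onCommitError(unknownResult, true); // "retryCommit"

An error raised by the callback follows the same rules, except that there is no
commit to retry: only the "TransientTransactionError" label leads to a retry.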
145
146This method MUST receive a `callback`_ as its first parameter.  An optional
147`TransactionOptions`_ MUST be provided as its second parameter (with deviations
148permitted as outlined in the `CRUD`_ specification). Drivers MAY support other
149parameters or options as needed (e.g. user data to pass as a parameter to the
150callback).
151
152.. _CRUD: ../crud/crud.rst#deviations
153
154~~~~~~~~~~~~~~~~~~~
155Sequence of Actions
156~~~~~~~~~~~~~~~~~~~
157
158This method should perform the following sequence of actions:
159
1601. Record the current monotonic time, which will be used to enforce the
161   120-second timeout before later retry attempts.
162
1632. Invoke `startTransaction`_ on the session. If TransactionOptions were
164   specified in the call to ``withTransaction``, those MUST be used for
165   ``startTransaction``. Note that ``ClientSession.defaultTransactionOptions``
166   will be used in the absence of any explicit TransactionOptions.
167
1683. If ``startTransaction`` reported an error, propagate that error to the caller
169   of ``withTransaction`` and return immediately.
170
1714. Invoke the callback. Drivers MUST ensure that the ClientSession can be
172   accessed within the callback (e.g. pass ClientSession as the first parameter,
173   rely on lexical scoping). Drivers MAY pass additional parameters as needed
174   (e.g. user data solicited by withTransaction).
175
1765. Control returns to ``withTransaction``. Determine the current `state`_ of the
177   ClientSession and whether the callback reported an error (e.g. thrown
178   exception, error output parameter).
179
1806. If the callback reported an error:
181
182   a. If the ClientSession is in the "starting transaction" or "transaction in
183      progress" state, invoke `abortTransaction`_ on the session.
184
185   b. If the callback's error includes a "TransientTransactionError" label and
186      the elapsed time of ``withTransaction`` is less than 120 seconds, jump
187      back to step two.
188
   c. If the callback's error includes an "UnknownTransactionCommitResult"
      label, the callback must have manually committed the transaction;
      propagate the callback's error to the caller of ``withTransaction`` and
      return immediately.
193
194   d. Otherwise, propagate the callback's error to the caller of
195      ``withTransaction`` and return immediately.
196
1977. If the ClientSession is in the "no transaction", "transaction aborted", or
198   "transaction committed" state, assume the callback intentionally aborted or
199   committed the transaction and return immediately.
200
2018. Invoke `commitTransaction`_ on the session.
202
2039. If ``commitTransaction`` reported an error:
204
205   a. If the ``commitTransaction`` error includes a
206      "UnknownTransactionCommitResult" label and the error is not
207      MaxTimeMSExpired and the elapsed time of ``withTransaction`` is less
208      than 120 seconds, jump back to step eight. We will trust
209      ``commitTransaction`` to apply a majority write concern on
210      retry attempts (see:
211      `Majority write concern is used when retrying commitTransaction`_).
212
213   b. If the ``commitTransaction`` error includes a "TransientTransactionError"
214      label and the elapsed time of ``withTransaction`` is less than 120
215      seconds, jump back to step two.
216
217   c. Otherwise, propagate the ``commitTransaction`` error to the caller of
218      ``withTransaction`` and return immediately.
219
22010. The transaction was committed successfully. Return immediately.
221
222.. _startTransaction: ../transactions/transactions.rst#starttransaction
223.. _state: ../transactions/transactions.rst#clientsession-changes
224.. _abortTransaction: ../transactions/transactions.rst#aborttransaction
225.. _commitTransaction: ../transactions/transactions.rst#committransaction
226
227~~~~~~~~~~~
228Pseudo-code
229~~~~~~~~~~~
230
231This method can be expressed by the following pseudo-code:
232
.. code:: typescript

    withTransaction(callback, options) {
        // Note: drivers SHOULD use a monotonic clock to determine elapsed time
        var startTime = Date.now(); // milliseconds since Unix epoch

        retryTransaction: while (true) {
            this.startTransaction(options); // may throw on error

            try {
                callback(this);
            } catch (error) {
                if (this.transactionState == STARTING ||
                    this.transactionState == IN_PROGRESS) {
                    this.abortTransaction();
                }

                if (error.hasErrorLabel("TransientTransactionError") &&
                    Date.now() - startTime < 120000) {
                    continue retryTransaction;
                }

                throw error;
            }

            if (this.transactionState == NO_TXN ||
                this.transactionState == COMMITTED ||
                this.transactionState == ABORTED) {
                return; // Assume callback intentionally ended the transaction
            }

            retryCommit: while (true) {
                try {
                    /* We will rely on ClientSession.commitTransaction() to
                     * apply a majority write concern if commitTransaction is
                     * being retried (see: DRIVERS-601) */
                    this.commitTransaction();
                } catch (error) {
                    /* Note: a maxTimeMS error will have the MaxTimeMSExpired
                     * code (50) and can be reported as a top-level error or
                     * inside writeConcernError, i.e.:
                     * {ok:0, code: 50, codeName: "MaxTimeMSExpired"}
                     * {ok:1, writeConcernError: {code: 50, codeName: "MaxTimeMSExpired"}}
                     */
                    if (!isMaxTimeMSExpiredError(error) &&
                        error.hasErrorLabel("UnknownTransactionCommitResult") &&
                        Date.now() - startTime < 120000) {
                        continue retryCommit;
                    }

                    if (error.hasErrorLabel("TransientTransactionError") &&
                        Date.now() - startTime < 120000) {
                        continue retryTransaction;
                    }

                    throw error;
                }
                break; // Commit was successful
            }
            break; // Transaction was successful
        }
    }
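
The pseudo-code calls ``isMaxTimeMSExpiredError`` without defining it. A
minimal sketch of such a check, based only on the two reply shapes quoted in
the comment above, might look like this. The ``CommandError`` interface is a
hypothetical shape; actual driver error types will differ.

.. code:: typescript

    // Hypothetical error shape mirroring the two server replies quoted in
    // the pseudo-code comment.
    interface CommandError {
        code?: number;
        writeConcernError?: { code?: number };
    }

    const MAX_TIME_MS_EXPIRED = 50;

    // Returns true if code 50 appears either as the top-level error code or
    // inside writeConcernError.
    function isMaxTimeMSExpiredError(error: CommandError): boolean {
        if (error.code === MAX_TIME_MS_EXPIRED) {
            return true;
        }
        return error.writeConcernError !== undefined &&
            error.writeConcernError.code === MAX_TIME_MS_EXPIRED;
    }

    const topLevel = isMaxTimeMSExpiredError({ code: 50 });
    const nested = isMaxTimeMSExpiredError({ writeConcernError: { code: 50 } });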
295
296ClientSession must provide access to a MongoClient
297--------------------------------------------------
298
299The callback invoked by ``withTransaction`` is only guaranteed to receive a
300ClientSession parameter. Drivers MUST ensure that it is possible to obtain a
301MongoClient within the callback in order to execute operations within the
302transaction. Per the `Driver Session`_ specification, ClientSessions should
303already provide access to a client object.
304
305Test Plan
306=========
307
308See the `README <tests/README.rst>`_ for tests.
309
310Motivation for Change
311=====================
312
313Reliably committing a transaction in the face of errors can be a complicated
endeavor using the MongoDB 4.0 drivers API. Providing a helper method in the
315driver to execute a transaction (and retry when possible) will enable our users
316to make better use of transactions in their applications.
317
318Design Rationale
319================
320
321This specification introduces a helper method on the ClientSession object that
322applications may optionally employ to execute a user-defined function within a
323transaction. An application does not need to be modified unless it wants to take
324advantage of this helper method.
325
326Majority write concern is used when retrying commitTransaction
327--------------------------------------------------------------
328
329Drivers should apply a majority write concern when retrying commitTransaction to
330guard against a transaction being applied twice. This requirement was originally
331enforced in the implementation of ``withTransaction``, but will now be handled
332by the transaction spec itself in order to benefit applications irrespective of
333whether they use ``withTransaction`` (see the corresponding section in the
334`Transactions spec Design Rationale`_).
335
336.. _Transactions spec Design Rationale: ../transactions/transactions.rst#majority-write-concern-is-used-when-retrying-committransaction
337
338The callback function has a flexible signature
339----------------------------------------------
340
341An original design considered requiring the callback to accept a ClientSession
342as its first parameter. That could be superfluous for languages where the
343callback might already have access to ClientSession through its lexical scope.
344Instead, the spec simply requires that drivers ensure the callback will be able
345to access the ClientSession.
346
347Similarly, the specification does not concern itself with the return type of the
348callback function. If drivers allow the callback to return a value, they may
349also choose to propagate that value as the return value of withTransaction.
350
351An earlier design also considered using the callback's return value to indicate
352whether control should break out of ``withTransaction`` (and its retry loop) and
353return to the application. The design allows this to be accomplished in one of
354two ways:
355
356- The callback aborts the transaction directly and returns to
357  ``withTransaction``, which will then return to its caller.
358
359- The callback raises an error without the "TransientTransactionError" label,
360  in which case ``withTransaction`` will abort the transaction and return to
361  its caller.
362
363Applications are responsible for passing ClientSession for operations within a transaction
364------------------------------------------------------------------------------------------
365
366It remains the responsibility of the application to pass a ClientSession to all
367operations that should be included in a transaction. With regard to
368``withTransaction``, applications are free to execute any operations within the
369callback, irrespective of whether those operations are associated with the
370transaction.
371
372It is assumed that the callback will not start a new transaction on the ClientSession
373-------------------------------------------------------------------------------------
374
375Under normal circumstances, the callback should not commit the transaction nor
376should it start a new transaction. The ``withTransaction`` method will inspect
377the ClientSession's transaction state after the callback returns and take the
378most sensible course of action; however, it will not detect whether the callback
379has started a new transaction.
380
381The callback function may be executed multiple times
382----------------------------------------------------
383
384The implementation of withTransaction is based on the original examples for
385`Retry Transactions and Commit Operation`_ from the MongoDB Manual. As such, the
386callback may be executed any number of times. Drivers are free to encourage
387their users to design idempotent callbacks.
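
The risk can be illustrated with a small simulation. The ``simulateRetries``
function below is hypothetical and does not talk to a server; it simply invokes
the callback once more for each pretended transient failure, which is how a
retried transaction would behave from the callback's point of view.

.. code:: typescript

    // Hypothetical simulation, not a driver API: the callback runs once, plus
    // one additional time per simulated transient failure.
    function simulateRetries(callback: () => void,
                             transientFailures: number): void {
        for (let attempt = 0; attempt <= transientFailures; attempt++) {
            callback();
        }
    }

    // Non-idempotent side effect outside the transaction: it is repeated on
    // every attempt.
    let emailsSent: number = 0;
    simulateRetries(() => { emailsSent += 1; }, 2);

    // Idempotent alternative: repeating the assignment leaves the same state.
    let status: string = "pending";
    simulateRetries(() => { status = "shipped"; }, 2);

With two simulated failures the first callback runs three times, so
``emailsSent`` ends at 3, while ``status`` is simply "shipped" no matter how
many attempts were made.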
388
389.. _Retry Transactions and Commit Operation: https://docs.mongodb.com/manual/core/transactions/#retry-transaction-and-commit-operation
390
391The commit is retried after a write concern timeout (i.e. wtimeout) error
392-------------------------------------------------------------------------
393
394Per the Transactions specification, drivers internally retry
395``commitTransaction`` once if it fails due to a retryable error (as defined in
396the `Retryable Writes`_ specification). Beyond that, applications may manually
397retry ``commitTransaction`` if it fails with any error bearing the
`UnknownTransactionCommitResult`_ error label. This label is applied for the
following errors:
400
401.. _Retryable Writes: ../retryable-writes/retryable-writes.rst#terms
402
403.. _UnknownTransactionCommitResult: ../transactions/transactions.rst#unknowntransactioncommitresult
404
405- Server selection failure
406- Retryable error (as defined in the `Retryable Writes`_ specification)
407- Write concern failure or timeout (excluding UnsatisfiableWriteConcern and
408  UnknownReplWriteConcern)
- MaxTimeMSExpired errors, i.e.
  ``{ok:0, code: 50, codeName: "MaxTimeMSExpired"}`` and
  ``{ok:1, writeConcernError: {code: 50, codeName: "MaxTimeMSExpired"}}``.
411
412A previous design for ``withTransaction`` retried for all of these errors
413*except* for write concern timeouts, so as not to exceed the user's original
414intention for ``wtimeout``. The current design of this specification no longer
415excludes write concern timeouts, and simply retries ``commitTransaction`` within
416its timeout period for all errors bearing the "UnknownTransactionCommitResult"
417label.
418
419This change was made in light of the forthcoming Client-side Operations Timeout
420specification (see: `Future Work`_), which we expect will allow the current
421120-second timeout for ``withTransaction`` to be customized and also obviate the
422need for users to specify ``wtimeout``.
423
424The commit is not retried after a MaxTimeMSExpired error
425--------------------------------------------------------
426
427This specification intentionally chooses not to retry commit operations after a
428MaxTimeMSExpired error as doing so would exceed the user's original intention
429for ``maxTimeMS``.
430
431The transaction and commit may be retried any number of times within a timeout period
432-------------------------------------------------------------------------------------
433
434The implementation of withTransaction is based on the original examples for
435`Retry Transactions and Commit Operation`_ from the MongoDB Manual. As such, the
436transaction and commit may be continually retried as long as the error label
437indicates that retrying is possible.
438
439A previous design had no limits for retrying commits or entire transactions. The
callback is always able to indicate that ``withTransaction`` should return to
its caller (without future retry attempts) by aborting the transaction directly;
however, that puts the onus of avoiding very long (or infinite) retry loops on
the application. We expect the most common cause of retry loops will be
444TransientTransactionErrors caused by write conflicts, as those can occur
445regularly in a healthy application, as opposed to
446UnknownTransactionCommitResult, which would typically be caused by an election.
447
448In order to avoid blocking the application with infinite retry loops,
449``withTransaction`` will cease retrying invocations of the callback or
450commitTransaction if it has exceeded a fixed timeout period of 120 seconds. This
451limit is a non-configurable default and is intentionally twice the value of
452MongoDB 4.0's default for the `transactionLifetimeLimitSeconds`_ parameter (60
453seconds). Applications that desire longer retry periods may call
454``withTransaction`` additional times as needed. Applications that desire shorter
455retry periods should not use this method.
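
An application that wants a longer retry period could wrap its
``withTransaction`` call along the following lines. This is a sketch under
stated assumptions: ``callRepeatedly``, ``run``, and ``isTransient`` are
hypothetical names, ``run`` stands in for a function that calls
``withTransaction``, and the synchronous style is a simplification.

.. code:: typescript

    // Hypothetical wrapper: invokes `run` up to `maxCalls` times, as long as
    // each failure is one that the application considers transient.
    function callRepeatedly<T>(run: () => T,
                               isTransient: (error: unknown) => boolean,
                               maxCalls: number): T {
        let lastError: unknown = new Error("maxCalls must be at least 1");
        for (let call = 0; call < maxCalls; call++) {
            try {
                return run();
            } catch (error) {
                if (!isTransient(error)) {
                    throw error; // not retryable, give up immediately
                }
                lastError = error;
            }
        }
        throw lastError; // every permitted call failed
    }

    // Usage with a stand-in that fails twice before succeeding.
    let calls: number = 0;
    const result = callRepeatedly(() => {
        calls += 1;
        if (calls < 3) {
            throw new Error("transient");
        }
        return "committed";
    }, (e) => e instanceof Error && e.message === "transient", 5);

Each call to ``withTransaction`` would get its own 120-second window, so the
total retry period is bounded by ``maxCalls`` times that limit.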
456
457.. _transactionLifetimeLimitSeconds: https://docs.mongodb.com/manual/reference/parameters/#param.transactionLifetimeLimitSeconds
458
459Backwards Compatibility
460=======================
461
462The specification introduces a new method on the ClientSession class and does
not introduce any backward-breaking changes. Existing programs that do not make
464use of this new method will continue to compile and run correctly.
465
466Reference Implementation
467========================
468
469The C, Java, and Ruby drivers will provide reference implementations. The
470corresponding tickets for those implementations may be found via
471`DRIVERS-556`_.
472
473.. _DRIVERS-556: https://jira.mongodb.org/browse/DRIVERS-556
474
475Security Implication
476====================
477
478Applications that use transaction guarantees to enforce security rules will
479benefit from a less error-prone API. Adding a helper method to execute a
480user-defined function within a transaction has few security implications, as it
481only provides an implementation of a technique already described in the MongoDB
4824.0 documentation (`DRIVERS-488`_).
483
484.. _DRIVERS-488: https://jira.mongodb.org/browse/DRIVERS-488
485
486Future Work
487===========
488
489The forthcoming Client-side Operations Timeout specification (`DRIVERS-555`_)
490may allow users to alter the default retry timeout, as a client-side timeout
491could be applied to ``withTransaction`` and its retry logic. In the absence of a
492client-side operation timeout, withTransaction can continue to use the
493120-second default and thus preserve backwards compatibility.
494
495.. _DRIVERS-555: https://jira.mongodb.org/browse/DRIVERS-555
496
497Changes
498=======
499
5002019-04-24: withTransaction does not retry when commit fails with
501            MaxTimeMSExpired.
502
5032018-02-13: withTransaction should retry commits after a wtimeout
504