Design
======

.. currentmodule:: websockets

This document describes the design of ``websockets``. It assumes familiarity
with the specification of the WebSocket protocol in :rfc:`6455`.

It's primarily intended for maintainers. It may also be useful for users who
wish to understand what happens under the hood.

.. warning::

    Internals described in this document may change at any time.

    Backwards compatibility is only guaranteed for :doc:`public APIs <api>`.


Lifecycle
---------

State
.....

WebSocket connections go through a trivial state machine:

- ``CONNECTING``: initial state,
- ``OPEN``: when the opening handshake is complete,
- ``CLOSING``: when the closing handshake is started,
- ``CLOSED``: when the TCP connection is closed.

Transitions happen in the following places:

- ``CONNECTING -> OPEN``: in
  :meth:`~protocol.WebSocketCommonProtocol.connection_open` which runs when
  the :ref:`opening handshake <opening-handshake>` completes and the WebSocket
  connection is established (not to be confused with
  :meth:`~asyncio.Protocol.connection_made`, which runs when the TCP
  connection is established);
- ``OPEN -> CLOSING``: in
  :meth:`~protocol.WebSocketCommonProtocol.write_frame` immediately before
  sending a close frame; since receiving a close frame triggers sending a
  close frame, this does the right thing regardless of which side started the
  :ref:`closing handshake <closing-handshake>`; also in
  :meth:`~protocol.WebSocketCommonProtocol.fail_connection` which duplicates
  a few lines of code from ``write_close_frame()`` and ``write_frame()``;
- ``* -> CLOSED``: in
  :meth:`~protocol.WebSocketCommonProtocol.connection_lost` which is always
  called exactly once when the TCP connection is closed.
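
The transitions listed above can be summarized as a small lookup table. The
following is a minimal sketch for illustration, not the library's code:
``ALLOWED`` and ``transition()`` are hypothetical names, and ``* -> CLOSED``
is expanded to every state other than ``CLOSED`` itself.

.. code-block:: python

    import enum


    class State(enum.IntEnum):
        CONNECTING, OPEN, CLOSING, CLOSED = range(4)


    # Hypothetical table built from the list of transitions above.
    ALLOWED = {
        State.CONNECTING: {State.OPEN, State.CLOSED},
        State.OPEN: {State.CLOSING, State.CLOSED},
        State.CLOSING: {State.CLOSED},
        State.CLOSED: set(),
    }


    def transition(current: State, new: State) -> State:
        """Return the new state, or raise if the transition isn't listed."""
        if new not in ALLOWED[current]:
            raise ValueError(f"invalid transition: {current.name} -> {new.name}")
        return new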

Coroutines
..........

The following diagram shows which coroutines are running at each stage of the
connection lifecycle on the client side.

.. image:: lifecycle.svg
   :target: _images/lifecycle.svg

The lifecycle is identical on the server side, except that inversion of
control makes the equivalent of :func:`~client.connect` implicit.

Coroutines shown in green are called by the application. Multiple coroutines
may interact with the WebSocket connection concurrently.

Coroutines shown in gray manage the connection. When the opening handshake
succeeds, :meth:`~protocol.WebSocketCommonProtocol.connection_open` starts
three tasks:

- :attr:`~protocol.WebSocketCommonProtocol.transfer_data_task` runs
  :meth:`~protocol.WebSocketCommonProtocol.transfer_data` which handles
  incoming data and lets :meth:`~protocol.WebSocketCommonProtocol.recv`
  consume it. It may be canceled to terminate the connection. It never exits
  with an exception other than :exc:`~asyncio.CancelledError`. See :ref:`data
  transfer <data-transfer>` below.

- :attr:`~protocol.WebSocketCommonProtocol.keepalive_ping_task` runs
  :meth:`~protocol.WebSocketCommonProtocol.keepalive_ping` which sends Ping
  frames at regular intervals and ensures that corresponding Pong frames are
  received. It is canceled when the connection terminates. It never exits
  with an exception other than :exc:`~asyncio.CancelledError`.

- :attr:`~protocol.WebSocketCommonProtocol.close_connection_task` runs
  :meth:`~protocol.WebSocketCommonProtocol.close_connection` which waits for
  the data transfer to terminate, then takes care of closing the TCP
  connection. It must not be canceled. It never exits with an exception. See
  :ref:`connection termination <connection-termination>` below.

In addition, :meth:`~protocol.WebSocketCommonProtocol.fail_connection` starts
the same :attr:`~protocol.WebSocketCommonProtocol.close_connection_task` when
the opening handshake fails, in order to close the TCP connection.

Splitting the responsibilities for data transfer and connection termination
between two separate tasks makes it easier to guarantee that ``websockets``
can terminate connections:

- within a fixed timeout,
- without leaking pending tasks,
- without leaking open TCP connections,

regardless of whether the connection terminates normally or abnormally.

:attr:`~protocol.WebSocketCommonProtocol.transfer_data_task` completes when no
more data will be received on the connection. Under normal circumstances, it
exits after exchanging close frames.

:attr:`~protocol.WebSocketCommonProtocol.close_connection_task` completes when
the TCP connection is closed.
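
The relationship between the data transfer task and the connection closing
task can be sketched with plain :mod:`asyncio`. This is a simplified
illustration under stated assumptions, not the actual implementation: a queue
stands in for the network, ``None`` stands in for a close frame, and appending
to ``log`` stands in for closing the TCP connection.

.. code-block:: python

    import asyncio


    async def transfer_data(incoming: asyncio.Queue) -> None:
        """Consume frames until a close frame (represented by None) arrives."""
        while await incoming.get() is not None:
            pass


    async def close_connection(transfer_data_task: asyncio.Task, log: list) -> None:
        """Wait for the data transfer to end, then release the connection."""
        try:
            await transfer_data_task
        except asyncio.CancelledError:
            pass  # the data transfer was canceled; closing must still happen
        log.append("tcp closed")


    async def demo(cancel: bool) -> list:
        log: list = []
        incoming: asyncio.Queue = asyncio.Queue()
        transfer_task = asyncio.ensure_future(transfer_data(incoming))
        close_task = asyncio.ensure_future(close_connection(transfer_task, log))
        await asyncio.sleep(0)  # let both tasks start
        if cancel:
            transfer_task.cancel()  # abnormal termination
        else:
            incoming.put_nowait(None)  # normal termination
        await close_task
        return log

In both cases ``demo()`` returns after the closing task has run, which is the
property the split is meant to guarantee.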


.. _opening-handshake:

Opening handshake
-----------------

``websockets`` performs the opening handshake when establishing a WebSocket
connection. On the client side, :func:`~client.connect` executes it before
returning the protocol to the caller. On the server side, it's executed before
passing the protocol to the ``ws_handler`` coroutine handling the connection.

While the opening handshake is asymmetrical (the client sends an HTTP Upgrade
request and the server replies with an HTTP Switching Protocols response),
``websockets`` aims to keep the implementation of both sides consistent with
one another.

On the client side, :meth:`~client.WebSocketClientProtocol.handshake`:

- builds an HTTP request based on the ``uri`` and parameters passed to
  :func:`~client.connect`;
- writes the HTTP request to the network;
- reads an HTTP response from the network;
- checks the HTTP response, validates ``extensions`` and ``subprotocol``, and
  configures the protocol accordingly;
- moves to the ``OPEN`` state.

On the server side, :meth:`~server.WebSocketServerProtocol.handshake`:

- reads an HTTP request from the network;
- calls :meth:`~server.WebSocketServerProtocol.process_request` which may
  abort the WebSocket handshake and return an HTTP response instead; this
  hook only makes sense on the server side;
- checks the HTTP request, negotiates ``extensions`` and ``subprotocol``, and
  configures the protocol accordingly;
- builds an HTTP response based on the above and parameters passed to
  :func:`~server.serve`;
- writes the HTTP response to the network;
- moves to the ``OPEN`` state;
- returns the ``path`` part of the ``uri``.

The most significant asymmetry between the two sides of the opening handshake
lies in the negotiation of extensions and, to a lesser extent, of the
subprotocol. The server knows everything about both sides and decides what the
parameters should be for the connection. The client merely applies them.

If anything goes wrong during the opening handshake, ``websockets``
:ref:`fails the connection <connection-failure>`.
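
One of the checks that both sides rely on is defined by :rfc:`6455`: the
server derives the ``Sec-WebSocket-Accept`` header from the client's
``Sec-WebSocket-Key`` and the client verifies it. The sketch below shows that
computation only; ``accept_key()`` is a name chosen for this example and
isn't claimed to match the library's internals.

.. code-block:: python

    import base64
    import hashlib

    # GUID defined by RFC 6455 for the opening handshake.
    GUID = "258EAFA5-E914-47DA-95CA-C5AB0DC85B11"


    def accept_key(key: str) -> str:
        """Compute the Sec-WebSocket-Accept value for a Sec-WebSocket-Key."""
        digest = hashlib.sha1((key + GUID).encode()).digest()
        return base64.b64encode(digest).decode()

With the sample key from the RFC, ``dGhlIHNhbXBsZSBub25jZQ==``, this returns
``s3pPLMBiTxaQ9kYGzzhZRbK+xOo=``.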


.. _data-transfer:

Data transfer
-------------

Symmetry
........

Once the opening handshake has completed, the WebSocket protocol enters the
data transfer phase. This part is almost symmetrical. There are only two
differences between a server and a client:

- `client-to-server masking`_: the client masks outgoing frames; the server
  unmasks incoming frames;
- `closing the TCP connection`_: the server closes the connection immediately;
  the client waits for the server to do it.

.. _client-to-server masking: https://tools.ietf.org/html/rfc6455#section-5.3
.. _closing the TCP connection: https://tools.ietf.org/html/rfc6455#section-5.5.1

These differences are so minor that all the logic for `data framing`_, for
`sending and receiving data`_ and for `closing the connection`_ is implemented
in the same class, :class:`~protocol.WebSocketCommonProtocol`.

.. _data framing: https://tools.ietf.org/html/rfc6455#section-5
.. _sending and receiving data: https://tools.ietf.org/html/rfc6455#section-6
.. _closing the connection: https://tools.ietf.org/html/rfc6455#section-7

The :attr:`~protocol.WebSocketCommonProtocol.is_client` attribute tells which
side a protocol instance is managing. This attribute is defined on the
:class:`~server.WebSocketServerProtocol` and
:class:`~client.WebSocketClientProtocol` classes.
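
Client-to-server masking, the first difference listed above, is a byte-wise
XOR with a 4-byte key, as specified in :rfc:`6455`. The function below is a
pure-Python sketch of that operation; ``apply_mask()`` is a name used here for
illustration and the library may implement it differently, for instance with
an optimized routine.

.. code-block:: python

    def apply_mask(data: bytes, mask: bytes) -> bytes:
        """XOR each byte of data with the 4-byte mask, repeating the mask."""
        if len(mask) != 4:
            raise ValueError("mask must contain 4 bytes")
        return bytes(byte ^ mask[index % 4] for index, byte in enumerate(data))

Since XOR is its own inverse, the same function masks frames on the client and
unmasks them on the server.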

Data flow
.........

The following diagram shows how data flows between an application built on top
of ``websockets`` and a remote endpoint. It applies regardless of which side
is the server or the client.

.. image:: protocol.svg
   :target: _images/protocol.svg

Public methods are shown in green, private methods in yellow, and buffers in
orange. Methods related to connection termination are omitted; connection
termination is discussed in another section below.

Receiving data
..............

The left side of the diagram shows how ``websockets`` receives data.

Incoming data is written to a :class:`~asyncio.StreamReader` in order to
implement flow control and provide backpressure on the TCP connection.

:attr:`~protocol.WebSocketCommonProtocol.transfer_data_task`, which is started
when the WebSocket connection is established, processes this data.

When it receives data frames, it reassembles fragments and puts the resulting
messages in the :attr:`~protocol.WebSocketCommonProtocol.messages` queue.

When it encounters a control frame:

- if it's a close frame, it starts the closing handshake;
- if it's a ping frame, it answers with a pong frame;
- if it's a pong frame, it acknowledges the corresponding ping (unless it's an
  unsolicited pong).

Running this process in a task guarantees that control frames are processed
promptly. Without such a task, ``websockets`` would depend on the application
to drive the connection by having exactly one coroutine awaiting
:meth:`~protocol.WebSocketCommonProtocol.recv` at any time. While this
happens naturally in many use cases, it cannot be relied upon.

Then :meth:`~protocol.WebSocketCommonProtocol.recv` fetches the next message
from the :attr:`~protocol.WebSocketCommonProtocol.messages` queue, with some
complexity added for handling backpressure and termination correctly.
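
Reassembling fragments while control frames arrive in between can be sketched
as follows. The ``(fin, opcode, payload)`` tuples and the string opcodes are a
hypothetical frame representation chosen for this example; the sketch simply
skips control frames, whereas the real task reacts to them as described above.

.. code-block:: python

    from typing import Iterable, Iterator, List, Tuple

    # Hypothetical frame representation: (fin, opcode, payload).
    Frame = Tuple[bool, str, bytes]


    def reassemble(frames: Iterable[Frame]) -> Iterator[bytes]:
        """Yield complete messages built from data frames."""
        fragments: List[bytes] = []
        for fin, opcode, payload in frames:
            if opcode in ("ping", "pong", "close"):
                continue  # control frames may be interleaved with fragments
            fragments.append(payload)
            if fin:  # last fragment of the message
                yield b"".join(fragments)
                fragments = []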

Sending data
............

The right side of the diagram shows how ``websockets`` sends data.

:meth:`~protocol.WebSocketCommonProtocol.send` writes one or several data
frames containing the message. While sending a fragmented message, concurrent
calls to :meth:`~protocol.WebSocketCommonProtocol.send` are put on hold until
all fragments are sent. This makes concurrent calls safe.

:meth:`~protocol.WebSocketCommonProtocol.ping` writes a ping frame and
yields a :class:`~asyncio.Future` which will be completed when a matching pong
frame is received.
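
Matching pongs to pings with futures can be sketched like this. The
``PingRegistry`` class is hypothetical and only tracks the futures; writing
the frames to the network is left out.

.. code-block:: python

    import asyncio
    from typing import Dict


    class PingRegistry:
        def __init__(self) -> None:
            self.pings: Dict[bytes, asyncio.Future] = {}

        def ping(self, data: bytes) -> asyncio.Future:
            # A real implementation would also write the ping frame here.
            waiter = asyncio.get_running_loop().create_future()
            self.pings[data] = waiter
            return waiter

        def pong_received(self, data: bytes) -> None:
            waiter = self.pings.pop(data, None)
            if waiter is not None and not waiter.done():
                waiter.set_result(None)
            # Unsolicited pongs match no pending ping and are ignored.


    async def demo() -> bool:
        registry = PingRegistry()
        waiter = registry.ping(b"abcd")
        registry.pong_received(b"zzzz")  # unsolicited: ignored
        registry.pong_received(b"abcd")  # matching: completes the future
        await asyncio.wait_for(waiter, timeout=1)
        return waiter.done()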

:meth:`~protocol.WebSocketCommonProtocol.pong` writes a pong frame.

:meth:`~protocol.WebSocketCommonProtocol.close` writes a close frame and
waits for the TCP connection to terminate.

Outgoing data is written to a :class:`~asyncio.StreamWriter` in order to
implement flow control and provide backpressure from the TCP connection.

.. _closing-handshake:

Closing handshake
.................

When the other side of the connection initiates the closing handshake,
:meth:`~protocol.WebSocketCommonProtocol.read_message` receives a close
frame while in the ``OPEN`` state. It moves to the ``CLOSING`` state, sends a
close frame, and returns ``None``, causing
:attr:`~protocol.WebSocketCommonProtocol.transfer_data_task` to terminate.

When this side of the connection initiates the closing handshake with
:meth:`~protocol.WebSocketCommonProtocol.close`, it moves to the ``CLOSING``
state and sends a close frame. When the other side sends a close frame,
:meth:`~protocol.WebSocketCommonProtocol.read_message` receives it in the
``CLOSING`` state and returns ``None``, also causing
:attr:`~protocol.WebSocketCommonProtocol.transfer_data_task` to terminate.

If the other side doesn't send a close frame within the connection's close
timeout, ``websockets`` :ref:`fails the connection <connection-failure>`.

The closing handshake can take up to ``2 * close_timeout``: one
``close_timeout`` to write a close frame and one ``close_timeout`` to receive
a close frame.

Then ``websockets`` terminates the TCP connection.


.. _connection-termination:

Connection termination
----------------------

:attr:`~protocol.WebSocketCommonProtocol.close_connection_task`, which is
started when the WebSocket connection is established, is responsible for
eventually closing the TCP connection.

First :attr:`~protocol.WebSocketCommonProtocol.close_connection_task` waits
for :attr:`~protocol.WebSocketCommonProtocol.transfer_data_task` to terminate,
which may happen as a result of:

- a successful closing handshake: as explained above, this exits the infinite
  loop in :attr:`~protocol.WebSocketCommonProtocol.transfer_data_task`;
- a timeout while waiting for the closing handshake to complete: this cancels
  :attr:`~protocol.WebSocketCommonProtocol.transfer_data_task`;
- a protocol error, including connection errors: depending on the exception,
  :attr:`~protocol.WebSocketCommonProtocol.transfer_data_task` :ref:`fails the
  connection <connection-failure>` with a suitable code and exits.

:attr:`~protocol.WebSocketCommonProtocol.close_connection_task` is separate
from :attr:`~protocol.WebSocketCommonProtocol.transfer_data_task` to make it
easier to implement the timeout on the closing handshake. Canceling
:attr:`~protocol.WebSocketCommonProtocol.transfer_data_task` creates no risk
of canceling :attr:`~protocol.WebSocketCommonProtocol.close_connection_task`
and failing to close the TCP connection, thus leaking resources.

Then :attr:`~protocol.WebSocketCommonProtocol.close_connection_task` cancels
:attr:`~protocol.WebSocketCommonProtocol.keepalive_ping_task`. This task has
no protocol compliance responsibilities. Terminating it to avoid leaking it is
the only concern.

Terminating the TCP connection can take up to ``2 * close_timeout`` on the
server side and ``3 * close_timeout`` on the client side. Clients start by
waiting for the server to close the connection, hence the extra
``close_timeout``. Then both sides go through the following steps until the
TCP connection is lost: half-closing the connection (only for non-TLS
connections), closing the connection, aborting the connection. At this point
the connection drops regardless of what happens on the network.


.. _connection-failure:

Connection failure
------------------

If the opening handshake doesn't complete successfully, ``websockets`` fails
the connection by closing the TCP connection.

Once the opening handshake has completed, ``websockets`` fails the connection
by canceling :attr:`~protocol.WebSocketCommonProtocol.transfer_data_task` and
sending a close frame if appropriate.

:attr:`~protocol.WebSocketCommonProtocol.transfer_data_task` exits, unblocking
:attr:`~protocol.WebSocketCommonProtocol.close_connection_task`, which closes
the TCP connection.


.. _server-shutdown:

Server shutdown
---------------

:class:`~websockets.server.WebSocketServer` closes asynchronously like
:class:`asyncio.Server`. The shutdown happens in two steps:

1. Stop listening and accepting new connections;
2. Close established connections with close code 1001 (going away) or, if
   the opening handshake is still in progress, with HTTP status code 503
   (Service Unavailable).

The first call to :meth:`~websockets.server.WebSocketServer.close` starts a
task that performs this sequence. Further calls are ignored. This is the
easiest way to make :meth:`~websockets.server.WebSocketServer.close` and
:meth:`~websockets.server.WebSocketServer.wait_closed` idempotent.
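
The idempotent shutdown described above can be sketched as follows. The
``Server`` class, its ``close_task`` attribute and the ``steps`` list are
hypothetical and only illustrate the pattern of starting the shutdown task at
most once.

.. code-block:: python

    import asyncio
    from typing import List, Optional


    class Server:
        def __init__(self) -> None:
            self.close_task: Optional[asyncio.Task] = None
            self.steps: List[str] = []

        async def _close(self) -> None:
            self.steps.append("stop accepting connections")
            await asyncio.sleep(0)  # stands in for waiting on the network
            self.steps.append("close established connections")

        def close(self) -> None:
            if self.close_task is None:  # further calls are ignored
                self.close_task = asyncio.ensure_future(self._close())

        async def wait_closed(self) -> None:
            if self.close_task is not None:
                # Shield so that canceling a waiter doesn't cancel the shutdown.
                await asyncio.shield(self.close_task)


    async def demo() -> List[str]:
        server = Server()
        server.close()
        server.close()  # ignored
        await server.wait_closed()
        await server.wait_closed()  # returns immediately
        return server.steps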


.. _cancellation:

Cancellation
------------

User code
.........

``websockets`` provides a WebSocket application server. It manages connections
and passes them to user-provided connection handlers. This is an *inversion of
control* scenario: library code calls user code.

If a connection drops, the corresponding handler should terminate. If the
server shuts down, all connection handlers must terminate. Canceling
connection handlers would terminate them.

However, using cancellation for this purpose would require all connection
handlers to handle it properly. For example, if a connection handler starts
some tasks, it should catch :exc:`~asyncio.CancelledError`, terminate or
cancel these tasks, and then re-raise the exception.
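
A handler written to cope with cancellation would look roughly like the sketch
below. It illustrates the burden that cancellation would place on user code;
``handler()`` and ``events`` are hypothetical and long sleeps stand in for
real work.

.. code-block:: python

    import asyncio


    async def handler(events: list) -> None:
        worker = asyncio.ensure_future(asyncio.sleep(3600))  # background task
        try:
            await asyncio.sleep(3600)  # stands in for the handler's own work
        except asyncio.CancelledError:
            worker.cancel()
            # Wait until the background task has actually terminated.
            await asyncio.gather(worker, return_exceptions=True)
            events.append("worker canceled")
            raise


    async def demo() -> list:
        events: list = []
        task = asyncio.ensure_future(handler(events))
        await asyncio.sleep(0)  # let the handler start
        task.cancel()
        try:
            await task
        except asyncio.CancelledError:
            events.append("handler canceled")
        return events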

Cancellation is tricky in :mod:`asyncio` applications, especially when it
interacts with finalization logic. In the example above, what if a handler
gets interrupted with :exc:`~asyncio.CancelledError` while it's finalizing
the tasks it started, after detecting that the connection dropped?

``websockets`` considers that cancellation may only be triggered by the caller
of a coroutine when it doesn't care about the results of that coroutine
anymore (source: `Guido van Rossum <https://groups.google.com/forum/#!msg
/python-tulip/LZQe38CR3bg/7qZ1p_q5yycJ>`_). Since connection handlers run
arbitrary user code, ``websockets`` has no way of deciding whether that code
is still doing something worth caring about.

For these reasons, ``websockets`` never cancels connection handlers. Instead
it expects them to detect when the connection is closed, execute finalization
logic if needed, and exit.

Conversely, cancellation isn't a concern for WebSocket clients because they
don't involve inversion of control.

Library
.......

Most :doc:`public APIs <api>` of ``websockets`` are coroutines. They may be
canceled, for example if the user starts a task that calls these coroutines
and cancels the task later. ``websockets`` must handle this situation.

Cancellation during the opening handshake is handled like any other exception:
the TCP connection is closed and the exception is re-raised. This can only
happen on the client side. On the server side, the opening handshake is
managed by ``websockets`` and nothing results in a cancellation.

Once the WebSocket connection is established, internal tasks
:attr:`~protocol.WebSocketCommonProtocol.transfer_data_task` and
:attr:`~protocol.WebSocketCommonProtocol.close_connection_task` mustn't get
accidentally canceled if a coroutine that awaits them is canceled. In other
words, they must be shielded from cancellation.

:meth:`~protocol.WebSocketCommonProtocol.recv` waits for the next message in
the queue or for :attr:`~protocol.WebSocketCommonProtocol.transfer_data_task`
to terminate, whichever comes first. It relies on :func:`~asyncio.wait` for
waiting on two futures in parallel. As a consequence, even though it's waiting
on a :class:`~asyncio.Future` signaling the next message and on
:attr:`~protocol.WebSocketCommonProtocol.transfer_data_task`, it doesn't
propagate cancellation to them.
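
The property of :func:`asyncio.wait` that this relies on can be demonstrated
in isolation. In this sketch, ``next_message`` and ``transfer_data_task`` are
stand-ins created for the example and ``recv_like()`` is a hypothetical
coroutine that only waits; it isn't the actual ``recv()`` implementation.

.. code-block:: python

    import asyncio


    async def demo() -> bool:
        loop = asyncio.get_running_loop()
        next_message = loop.create_future()  # never completed in this demo
        transfer_data_task = asyncio.ensure_future(
            asyncio.sleep(0.05, result="done")
        )

        async def recv_like() -> None:
            await asyncio.wait(
                [next_message, transfer_data_task],
                return_when=asyncio.FIRST_COMPLETED,
            )

        task = asyncio.ensure_future(recv_like())
        await asyncio.sleep(0)  # let recv_like() start waiting
        task.cancel()
        try:
            await task
        except asyncio.CancelledError:
            pass

        # Canceling the waiter left both futures untouched.
        survived = not next_message.cancelled() and not transfer_data_task.cancelled()
        result = await transfer_data_task
        return survived and result == "done"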

:meth:`~protocol.WebSocketCommonProtocol.ensure_open` is called by
:meth:`~protocol.WebSocketCommonProtocol.send`,
:meth:`~protocol.WebSocketCommonProtocol.ping`, and
:meth:`~protocol.WebSocketCommonProtocol.pong`. When the connection state is
``CLOSING``, it waits for
:attr:`~protocol.WebSocketCommonProtocol.transfer_data_task` but shields it to
prevent cancellation.

:meth:`~protocol.WebSocketCommonProtocol.close` waits for the data transfer
task to terminate with :func:`~asyncio.wait_for`. If it's canceled or if the
timeout elapses, :attr:`~protocol.WebSocketCommonProtocol.transfer_data_task`
is canceled, which is correct at this point.
:meth:`~protocol.WebSocketCommonProtocol.close` then waits for
:attr:`~protocol.WebSocketCommonProtocol.close_connection_task` but shields it
to prevent cancellation.
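
The combination of :func:`asyncio.wait_for` and :func:`asyncio.shield`
described above can be sketched as a standalone function. This is an
approximation for illustration: the tasks are passed as arguments instead of
being attributes, and the demo uses a data transfer task that never finishes
in order to exercise the timeout.

.. code-block:: python

    import asyncio


    async def close(transfer_data_task, close_connection_task, close_timeout):
        try:
            # On timeout or cancellation, wait_for cancels transfer_data_task.
            await asyncio.wait_for(transfer_data_task, close_timeout)
        except (asyncio.TimeoutError, asyncio.CancelledError):
            pass
        # Shield the closing task so that canceling close() can't cancel it.
        await asyncio.shield(close_connection_task)


    async def demo() -> tuple:
        transfer_data_task = asyncio.ensure_future(asyncio.sleep(3600))

        async def close_connection() -> str:
            try:
                await transfer_data_task
            except asyncio.CancelledError:
                pass
            return "tcp closed"

        close_connection_task = asyncio.ensure_future(close_connection())
        await close(transfer_data_task, close_connection_task, close_timeout=0.01)
        return transfer_data_task.cancelled(), close_connection_task.result()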

:meth:`~protocol.WebSocketCommonProtocol.close` and
:meth:`~protocol.WebSocketCommonProtocol.fail_connection` are the only
places where :attr:`~protocol.WebSocketCommonProtocol.transfer_data_task` may
be canceled.

:attr:`~protocol.WebSocketCommonProtocol.close_connection_task` starts by
waiting for :attr:`~protocol.WebSocketCommonProtocol.transfer_data_task`. It
catches :exc:`~asyncio.CancelledError` to prevent a cancellation of
:attr:`~protocol.WebSocketCommonProtocol.transfer_data_task` from propagating
to :attr:`~protocol.WebSocketCommonProtocol.close_connection_task`.

.. _backpressure:

Backpressure
------------

.. note::

    This section discusses backpressure from the perspective of a server but
    the concept applies to clients symmetrically.

With a naive implementation, if a server receives inputs faster than it can
process them, or if it generates outputs faster than it can send them, data
accumulates in buffers, eventually causing the server to run out of memory and
crash.

The solution to this problem is backpressure. Any part of the server that
receives inputs faster than it can process them and send the outputs
must propagate that information back to the previous part in the chain.

``websockets`` is designed to make it easy to get backpressure right.

For incoming data, ``websockets`` builds upon :class:`~asyncio.StreamReader`
which propagates backpressure to its own buffer and to the TCP stream. Frames
are parsed from the input stream and added to a bounded queue. If the queue
fills up, parsing halts until the application reads a frame.

For outgoing data, ``websockets`` builds upon :class:`~asyncio.StreamWriter`
which implements flow control. If the output buffers grow too large, it waits
until they're drained. That's why all APIs that write frames are asynchronous.

Of course, it's still possible for an application to create its own unbounded
buffers and break the backpressure. Be careful with queues.
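
The effect of a bounded queue can be observed with :class:`asyncio.Queue`.
This sketch is independent from the library's code: the producer stands in for
frame parsing, the consumer for a slow application, and ``maxsize=2`` is an
arbitrary value chosen for the demo.

.. code-block:: python

    import asyncio


    async def demo() -> int:
        queue: asyncio.Queue = asyncio.Queue(maxsize=2)
        high_water_mark = 0

        async def producer() -> None:
            nonlocal high_water_mark
            for item in range(10):
                await queue.put(item)  # waits while the queue is full
                high_water_mark = max(high_water_mark, queue.qsize())
            await queue.put(None)  # sentinel: no more items

        async def consumer() -> None:
            while await queue.get() is not None:
                await asyncio.sleep(0)  # simulate slow processing

        await asyncio.gather(producer(), consumer())
        return high_water_mark

The value returned by ``demo()`` never exceeds ``maxsize``: the producer is
paused whenever the consumer falls behind. With ``asyncio.Queue()`` and no
``maxsize``, nothing would slow the producer down.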


.. _buffers:

Buffers
-------

.. note::

    This section discusses buffers from the perspective of a server but it
    applies to clients as well.

An asynchronous system works best when its buffers are almost always empty.

For example, if a client sends data too fast for a server, the queue of
incoming messages will be constantly full. The server will always be 32
messages (by default) behind the client. This consumes memory and increases
latency for no good reason. The problem is called bufferbloat.

If buffers are almost always full and that problem cannot be solved by adding
capacity (typically because the system is bottlenecked by the output and
constantly regulated by backpressure), reducing the size of buffers minimizes
negative consequences.

By default ``websockets`` has rather high limits. You can decrease them
according to your application's characteristics.

Bufferbloat can happen at every level in the stack where there is a buffer.
For each connection, the receiving side contains these buffers:

- OS buffers: tuning them is an advanced optimization.
- :class:`~asyncio.StreamReader` bytes buffer: the default limit is 64 KiB.
  You can set another limit by passing a ``read_limit`` keyword argument to
  :func:`~client.connect` or :func:`~server.serve`.
- Incoming messages :class:`~collections.deque`: its size depends both on
  the size and the number of messages it contains. By default the maximum
  UTF-8 encoded size is 1 MiB and the maximum number is 32. In the worst case,
  after UTF-8 decoding, a single message could take up to 4 MiB of memory and
  the overall memory consumption could reach 128 MiB. You should adjust these
  limits by setting the ``max_size`` and ``max_queue`` keyword arguments of
  :func:`~client.connect` or :func:`~server.serve` according to your
  application's requirements.

For each connection, the sending side contains these buffers:

- :class:`~asyncio.StreamWriter` bytes buffer: the default size is 64 KiB.
  You can set another limit by passing a ``write_limit`` keyword argument to
  :func:`~client.connect` or :func:`~server.serve`.
- OS buffers: tuning them is an advanced optimization.
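
The worst-case figure for the incoming message queue follows from simple
arithmetic, reproduced below. ``worst_case_queue_memory()`` is a helper
written for this example; the factor of 4 bytes per character comes from the
estimate in the text above, and the reduced limits in the note below are
arbitrary values, not recommendations.

.. code-block:: python

    KiB = 2 ** 10
    MiB = 2 ** 20


    def worst_case_queue_memory(
        max_size: int, max_queue: int, bytes_per_char: int = 4
    ) -> int:
        """Upper bound, in bytes, for a full queue of decoded text messages."""
        return max_queue * max_size * bytes_per_char


    default_bound = worst_case_queue_memory(max_size=1 * MiB, max_queue=32)
    reduced_bound = worst_case_queue_memory(max_size=256 * KiB, max_queue=8)

With the default limits the bound is 128 MiB per connection. With the reduced
limits it drops to 8 MiB.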


Concurrency
-----------

Awaiting any combination of :meth:`~protocol.WebSocketCommonProtocol.recv`,
:meth:`~protocol.WebSocketCommonProtocol.send`,
:meth:`~protocol.WebSocketCommonProtocol.close`,
:meth:`~protocol.WebSocketCommonProtocol.ping`, or
:meth:`~protocol.WebSocketCommonProtocol.pong` concurrently is safe, including
multiple calls to the same method, with one exception and one limitation.

* **Only one coroutine can receive messages at a time.** This constraint
  avoids non-deterministic behavior (and simplifies the implementation). If a
  coroutine is awaiting :meth:`~protocol.WebSocketCommonProtocol.recv`,
  awaiting it again in another coroutine raises :exc:`RuntimeError`.

* **Sending a fragmented message forces serialization.** Indeed, the WebSocket
  protocol doesn't support multiplexing messages. If a coroutine is awaiting
  :meth:`~protocol.WebSocketCommonProtocol.send` to send a fragmented message,
  awaiting it again in another coroutine waits until the first call completes.
  This will be transparent in many cases. It may be a concern if the
  fragmented message is generated slowly by an asynchronous iterator.
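
The limitation on fragmented messages can be illustrated with a lock. This is
a sketch under assumptions: the ``Sender`` class is hypothetical, it uses an
:class:`asyncio.Lock` as a stand-in for whatever mechanism the library uses
internally, and it records fragments in a list instead of writing frames.

.. code-block:: python

    import asyncio
    from typing import AsyncIterator, List


    class Sender:
        def __init__(self) -> None:
            self.frames: List[str] = []
            self._fragmented = asyncio.Lock()

        async def send(self, fragments: AsyncIterator[str]) -> None:
            async with self._fragmented:  # other send() calls wait here
                async for fragment in fragments:
                    self.frames.append(fragment)


    async def slow(name: str) -> AsyncIterator[str]:
        """Asynchronous iterator that yields control between fragments."""
        for index in range(3):
            await asyncio.sleep(0)
            yield f"{name}{index}"


    async def demo() -> List[str]:
        sender = Sender()
        await asyncio.gather(sender.send(slow("a")), sender.send(slow("b")))
        return sender.frames

Even though both iterators yield control between fragments, the fragments of
the two messages aren't interleaved: the second call waits for the first one
to complete.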

Receiving frames is independent from sending frames. This isolates
:meth:`~protocol.WebSocketCommonProtocol.recv`, which receives frames, from
the other methods, which send frames.

While the connection is open, each frame is sent with a single write. Combined
with the concurrency model of :mod:`asyncio`, this enforces serialization. The
only other requirement is to prevent interleaving other data frames in the
middle of a fragmented message.

After the connection is closed, sending a frame raises
:exc:`~websockets.exceptions.ConnectionClosed`, which is safe.