1.. highlight:: c
2
3.. index::
4   single: buffer protocol
5   single: buffer interface; (see buffer protocol)
6   single: buffer object; (see buffer protocol)
7
8.. _bufferobjects:
9
10Buffer Protocol
11---------------
12
13.. sectionauthor:: Greg Stein <gstein@lyra.org>
14.. sectionauthor:: Benjamin Peterson
15.. sectionauthor:: Stefan Krah
16
17
18Certain objects available in Python wrap access to an underlying memory
19array or *buffer*.  Such objects include the built-in :class:`bytes` and
20:class:`bytearray`, and some extension types like :class:`array.array`.
21Third-party libraries may define their own types for special purposes, such
22as image processing or numeric analysis.
23
While each of these types has its own semantics, they share the common
25characteristic of being backed by a possibly large memory buffer.  It is
26then desirable, in some situations, to access that buffer directly and
27without intermediate copying.
28
29Python provides such a facility at the C level in the form of the :ref:`buffer
30protocol <bufferobjects>`.  This protocol has two sides:
31
32.. index:: single: PyBufferProcs
33
34- on the producer side, a type can export a "buffer interface" which allows
35  objects of that type to expose information about their underlying buffer.
36  This interface is described in the section :ref:`buffer-structs`;
37
38- on the consumer side, several means are available to obtain a pointer to
39  the raw underlying data of an object (for example a method parameter).
40
41Simple objects such as :class:`bytes` and :class:`bytearray` expose their
42underlying buffer in byte-oriented form.  Other forms are possible; for example,
43the elements exposed by an :class:`array.array` can be multi-byte values.
44
45An example consumer of the buffer interface is the :meth:`~io.BufferedIOBase.write`
46method of file objects: any object that can export a series of bytes through
47the buffer interface can be written to a file.  While :meth:`write` only
48needs read-only access to the internal contents of the object passed to it,
49other methods such as :meth:`~io.BufferedIOBase.readinto` need write access
50to the contents of their argument.  The buffer interface allows objects to
51selectively allow or reject exporting of read-write and read-only buffers.
52
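The difference between the two kinds of access can be observed from Python.
The following is a minimal sketch, assuming an :class:`io.BytesIO` object as
the consumer; the variable names are illustrative only:

.. code-block:: python

   import array
   import io

   stream = io.BytesIO()
   # write() only reads from its argument, so any bytes-like exporter works,
   # including a read-only bytes object and an array of unsigned bytes.
   written = stream.write(b"abc") + stream.write(array.array("B", [100, 101, 102]))

   stream.seek(0)
   target = bytearray(6)
   filled = stream.readinto(target)      # needs a writable buffer

   stream.seek(0)
   try:
       stream.readinto(bytes(6))         # bytes only exports read-only buffers
       rejected = False
   except TypeError:
       rejected = True

Here ``write`` accepts both exporters, while ``readinto`` fills the
:class:`bytearray` and rejects the :class:`bytes` object.
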
53There are two ways for a consumer of the buffer interface to acquire a buffer
54over a target object:
55
56* call :c:func:`PyObject_GetBuffer` with the right parameters;
57
58* call :c:func:`PyArg_ParseTuple` (or one of its siblings) with one of the
59  ``y*``, ``w*`` or ``s*`` :ref:`format codes <arg-parsing>`.
60
61In both cases, :c:func:`PyBuffer_Release` must be called when the buffer
62isn't needed anymore.  Failure to do so could lead to various issues such as
63resource leaks.
64
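At the Python level, :class:`memoryview` acts as such a consumer, and its
``release()`` method plays the role of :c:func:`PyBuffer_Release`. The sketch
below, which assumes a :class:`bytearray` exporter, illustrates one visible
consequence of an outstanding buffer: the exporter refuses to be resized
until the buffer is released.

.. code-block:: python

   data = bytearray(b"abc")
   view = memoryview(data)        # acquires a buffer over ``data``

   try:
       data.extend(b"def")        # resizing would invalidate the exported memory
       resized_while_exported = True
   except BufferError:
       resized_while_exported = False

   view.release()                 # gives the buffer back to the exporter
   data.extend(b"def")            # allowed again once no export is outstanding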
65
66.. _buffer-structure:
67
68Buffer structure
69================
70
71Buffer structures (or simply "buffers") are useful as a way to expose the
72binary data from another object to the Python programmer.  They can also be
73used as a zero-copy slicing mechanism.  Using their ability to reference a
74block of memory, it is possible to expose any data to the Python programmer
75quite easily.  The memory could be a large, constant array in a C extension,
76it could be a raw block of memory for manipulation before passing to an
77operating system library, or it could be used to pass around structured data
78in its native, in-memory format.
79
80Contrary to most data types exposed by the Python interpreter, buffers
81are not :c:type:`PyObject` pointers but rather simple C structures.  This
82allows them to be created and copied very simply.  When a generic wrapper
83around a buffer is needed, a :ref:`memoryview <memoryview-objects>` object
84can be created.
85
For short instructions on how to write an exporting object, see
:ref:`Buffer Object Structures <buffer-structs>`. To obtain
a buffer, see :c:func:`PyObject_GetBuffer`.
89
90.. c:type:: Py_buffer
91
92   .. c:member:: void *buf
93
94      A pointer to the start of the logical structure described by the buffer
95      fields. This can be any location within the underlying physical memory
96      block of the exporter. For example, with negative :c:member:`~Py_buffer.strides`
97      the value may point to the end of the memory block.
98
99      For :term:`contiguous` arrays, the value points to the beginning of
100      the memory block.
101
   .. c:member:: PyObject *obj
103
104      A new reference to the exporting object. The reference is owned by
105      the consumer and automatically decremented and set to ``NULL`` by
106      :c:func:`PyBuffer_Release`. The field is the equivalent of the return
107      value of any standard C-API function.
108
109      As a special case, for *temporary* buffers that are wrapped by
110      :c:func:`PyMemoryView_FromBuffer` or :c:func:`PyBuffer_FillInfo`
111      this field is ``NULL``. In general, exporting objects MUST NOT
112      use this scheme.
113
114   .. c:member:: Py_ssize_t len
115
116      ``product(shape) * itemsize``. For contiguous arrays, this is the length
117      of the underlying memory block. For non-contiguous arrays, it is the length
118      that the logical structure would have if it were copied to a contiguous
119      representation.
120
      Accessing ``((char *)buf)[0]`` up to ``((char *)buf)[len-1]`` is only valid
122      if the buffer has been obtained by a request that guarantees contiguity. In
123      most cases such a request will be :c:macro:`PyBUF_SIMPLE` or :c:macro:`PyBUF_WRITABLE`.
124
125   .. c:member:: int readonly
126
127      An indicator of whether the buffer is read-only. This field is controlled
128      by the :c:macro:`PyBUF_WRITABLE` flag.
129
130   .. c:member:: Py_ssize_t itemsize
131
132      Item size in bytes of a single element. Same as the value of :func:`struct.calcsize`
133      called on non-``NULL`` :c:member:`~Py_buffer.format` values.
134
135      Important exception: If a consumer requests a buffer without the
136      :c:macro:`PyBUF_FORMAT` flag, :c:member:`~Py_buffer.format` will
      be set to ``NULL``, but :c:member:`~Py_buffer.itemsize` still has
138      the value for the original format.
139
140      If :c:member:`~Py_buffer.shape` is present, the equality
141      ``product(shape) * itemsize == len`` still holds and the consumer
142      can use :c:member:`~Py_buffer.itemsize` to navigate the buffer.
143
144      If :c:member:`~Py_buffer.shape` is ``NULL`` as a result of a :c:macro:`PyBUF_SIMPLE`
145      or a :c:macro:`PyBUF_WRITABLE` request, the consumer must disregard
146      :c:member:`~Py_buffer.itemsize` and assume ``itemsize == 1``.
147
148   .. c:member:: const char *format
149
      A NUL-terminated string in :mod:`struct` module style syntax describing
151      the contents of a single item. If this is ``NULL``, ``"B"`` (unsigned bytes)
152      is assumed.
153
154      This field is controlled by the :c:macro:`PyBUF_FORMAT` flag.
155
156   .. c:member:: int ndim
157
158      The number of dimensions the memory represents as an n-dimensional array.
159      If it is ``0``, :c:member:`~Py_buffer.buf` points to a single item representing
160      a scalar. In this case, :c:member:`~Py_buffer.shape`, :c:member:`~Py_buffer.strides`
161      and :c:member:`~Py_buffer.suboffsets` MUST be ``NULL``.
162
      The macro :c:macro:`PyBUF_MAX_NDIM` limits the maximum number of dimensions
      to 64. Exporters MUST respect this limit; consumers of multi-dimensional
      buffers SHOULD be able to handle up to :c:macro:`PyBUF_MAX_NDIM` dimensions.
166
167   .. c:member:: Py_ssize_t *shape
168
169      An array of :c:type:`Py_ssize_t` of length :c:member:`~Py_buffer.ndim`
170      indicating the shape of the memory as an n-dimensional array. Note that
171      ``shape[0] * ... * shape[ndim-1] * itemsize`` MUST be equal to
172      :c:member:`~Py_buffer.len`.
173
174      Shape values are restricted to ``shape[n] >= 0``. The case
175      ``shape[n] == 0`` requires special attention. See `complex arrays`_
176      for further information.
177
178      The shape array is read-only for the consumer.
179
180   .. c:member:: Py_ssize_t *strides
181
182      An array of :c:type:`Py_ssize_t` of length :c:member:`~Py_buffer.ndim`
183      giving the number of bytes to skip to get to a new element in each
184      dimension.
185
186      Stride values can be any integer. For regular arrays, strides are
187      usually positive, but a consumer MUST be able to handle the case
188      ``strides[n] <= 0``. See `complex arrays`_ for further information.
189
190      The strides array is read-only for the consumer.
191
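      Non-positive strides can be produced from Python by slicing a
      :class:`memoryview` with a negative step. A minimal sketch, with
      illustrative variable names:

      .. code-block:: python

         forward = memoryview(bytes([1, 2, 3, 4]))
         backward = forward[::-1]       # same memory, walked from the last byte

         stride_forward = forward.strides
         stride_backward = backward.strides
         reversed_items = backward.tolist()

      The reversed view reports a stride of ``-1`` and yields the items in
      reverse order without copying the underlying bytes.
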
192   .. c:member:: Py_ssize_t *suboffsets
193
194      An array of :c:type:`Py_ssize_t` of length :c:member:`~Py_buffer.ndim`.
195      If ``suboffsets[n] >= 0``, the values stored along the nth dimension are
196      pointers and the suboffset value dictates how many bytes to add to each
197      pointer after de-referencing. A suboffset value that is negative
198      indicates that no de-referencing should occur (striding in a contiguous
199      memory block).
200
201      If all suboffsets are negative (i.e. no de-referencing is needed), then
202      this field must be ``NULL`` (the default value).
203
      This type of array representation is used by the Python Imaging Library
      (PIL). See `complex arrays`_ for further information on how to access elements
      of such an array.
207
208      The suboffsets array is read-only for the consumer.
209
210   .. c:member:: void *internal
211
212      This is for use internally by the exporting object. For example, this
213      might be re-cast as an integer by the exporter and used to store flags
214      about whether or not the shape, strides, and suboffsets arrays must be
215      freed when the buffer is released. The consumer MUST NOT alter this
216      value.
217
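Most of these fields have read-only counterparts on :class:`memoryview`
objects, which makes it possible to inspect them without writing C. The
following sketch assumes an :class:`array.array` of three doubles as the
exporter; ``nbytes`` is the attribute that corresponds to
:c:member:`~Py_buffer.len`.

.. code-block:: python

   import array
   import struct

   values = array.array("d", [1.0, 2.0, 3.0])
   with memoryview(values) as view:
       fields = {
           "len": view.nbytes,
           "readonly": view.readonly,
           "itemsize": view.itemsize,
           "format": view.format,
           "ndim": view.ndim,
           "shape": view.shape,
           "strides": view.strides,
           "suboffsets": view.suboffsets,
       }
       # itemsize agrees with struct.calcsize() applied to the format string
       size_from_format = struct.calcsize(view.format)

For this exporter, ``product(shape) * itemsize`` is ``3 * 8``, which matches
the reported length of 24 bytes.
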
218.. _buffer-request-types:
219
220Buffer request types
221====================
222
223Buffers are usually obtained by sending a buffer request to an exporting
224object via :c:func:`PyObject_GetBuffer`. Since the complexity of the logical
225structure of the memory can vary drastically, the consumer uses the *flags*
226argument to specify the exact buffer type it can handle.
227
All :c:type:`Py_buffer` fields are unambiguously defined by the request
type.
230
231request-independent fields
232~~~~~~~~~~~~~~~~~~~~~~~~~~
233The following fields are not influenced by *flags* and must always be filled in
234with the correct values: :c:member:`~Py_buffer.obj`, :c:member:`~Py_buffer.buf`,
235:c:member:`~Py_buffer.len`, :c:member:`~Py_buffer.itemsize`, :c:member:`~Py_buffer.ndim`.
236
237
238readonly, format
239~~~~~~~~~~~~~~~~
240
241   .. c:macro:: PyBUF_WRITABLE
242
243      Controls the :c:member:`~Py_buffer.readonly` field. If set, the exporter
244      MUST provide a writable buffer or else report failure. Otherwise, the
245      exporter MAY provide either a read-only or writable buffer, but the choice
246      MUST be consistent for all consumers.
247
248   .. c:macro:: PyBUF_FORMAT
249
250      Controls the :c:member:`~Py_buffer.format` field. If set, this field MUST
251      be filled in correctly. Otherwise, this field MUST be ``NULL``.
252
253
:c:macro:`PyBUF_WRITABLE` can be combined, using bitwise OR (``|``), with any of
the flags in the next section.
Since :c:macro:`PyBUF_SIMPLE` is defined as 0, :c:macro:`PyBUF_WRITABLE`
can be used as a stand-alone flag to request a simple writable buffer.

:c:macro:`PyBUF_FORMAT` can be combined in the same way with any of the flags
except :c:macro:`PyBUF_SIMPLE`.
The latter already implies format ``B`` (unsigned bytes).
260
261
262shape, strides, suboffsets
263~~~~~~~~~~~~~~~~~~~~~~~~~~
264
265The flags that control the logical structure of the memory are listed
266in decreasing order of complexity. Note that each flag contains all bits
267of the flags below it.
268
.. tabularcolumns:: |p{0.35\linewidth}|l|l|l|

+-----------------------------+-------+---------+------------+
|  Request                    | shape | strides | suboffsets |
+=============================+=======+=========+============+
| .. c:macro:: PyBUF_INDIRECT |  yes  |   yes   | if needed  |
+-----------------------------+-------+---------+------------+
| .. c:macro:: PyBUF_STRIDES  |  yes  |   yes   |    NULL    |
+-----------------------------+-------+---------+------------+
| .. c:macro:: PyBUF_ND       |  yes  |   NULL  |    NULL    |
+-----------------------------+-------+---------+------------+
| .. c:macro:: PyBUF_SIMPLE   |  NULL |   NULL  |    NULL    |
+-----------------------------+-------+---------+------------+
282
283
284.. index:: contiguous, C-contiguous, Fortran contiguous
285
286contiguity requests
287~~~~~~~~~~~~~~~~~~~
288
289C or Fortran :term:`contiguity <contiguous>` can be explicitly requested,
290with and without stride information. Without stride information, the buffer
291must be C-contiguous.
292
.. tabularcolumns:: |p{0.35\linewidth}|l|l|l|l|

+-----------------------------------+-------+---------+------------+--------+
|  Request                          | shape | strides | suboffsets | contig |
+===================================+=======+=========+============+========+
| .. c:macro:: PyBUF_C_CONTIGUOUS   |  yes  |   yes   |    NULL    |   C    |
+-----------------------------------+-------+---------+------------+--------+
| .. c:macro:: PyBUF_F_CONTIGUOUS   |  yes  |   yes   |    NULL    |   F    |
+-----------------------------------+-------+---------+------------+--------+
| .. c:macro:: PyBUF_ANY_CONTIGUOUS |  yes  |   yes   |    NULL    | C or F |
+-----------------------------------+-------+---------+------------+--------+
| :c:macro:`PyBUF_ND`               |  yes  |   NULL  |    NULL    |   C    |
+-----------------------------------+-------+---------+------------+--------+
306
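The three kinds of contiguity can be explored from Python through the
``c_contiguous``, ``f_contiguous`` and ``contiguous`` attributes of
:class:`memoryview`. The sketch below uses illustrative names and a small
six-byte exporter:

.. code-block:: python

   flat = memoryview(bytes(range(6)))
   grid = flat.cast("B", shape=[2, 3])    # 2 rows of 3 bytes, row-major
   every_other = flat[::2]                # stride of 2 bytes, gaps in memory

   # Each tuple is (C-contiguous, Fortran-contiguous, either one).
   report = {
       "flat": (flat.c_contiguous, flat.f_contiguous, flat.contiguous),
       "grid": (grid.c_contiguous, grid.f_contiguous, grid.contiguous),
       "every_other": (every_other.c_contiguous, every_other.f_contiguous,
                       every_other.contiguous),
   }

A one-dimensional contiguous buffer satisfies both orders, the row-major
grid is only C-contiguous, and the strided slice is not contiguous at all.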
307
308compound requests
309~~~~~~~~~~~~~~~~~
310
311All possible requests are fully defined by some combination of the flags in
312the previous section. For convenience, the buffer protocol provides frequently
313used combinations as single flags.
314
315In the following table *U* stands for undefined contiguity. The consumer would
316have to call :c:func:`PyBuffer_IsContiguous` to determine contiguity.
317
.. tabularcolumns:: |p{0.35\linewidth}|l|l|l|l|l|l|

+-------------------------------+-------+---------+------------+--------+----------+--------+
|  Request                      | shape | strides | suboffsets | contig | readonly | format |
+===============================+=======+=========+============+========+==========+========+
| .. c:macro:: PyBUF_FULL       |  yes  |   yes   | if needed  |   U    |     0    |  yes   |
+-------------------------------+-------+---------+------------+--------+----------+--------+
| .. c:macro:: PyBUF_FULL_RO    |  yes  |   yes   | if needed  |   U    |  1 or 0  |  yes   |
+-------------------------------+-------+---------+------------+--------+----------+--------+
| .. c:macro:: PyBUF_RECORDS    |  yes  |   yes   |    NULL    |   U    |     0    |  yes   |
+-------------------------------+-------+---------+------------+--------+----------+--------+
| .. c:macro:: PyBUF_RECORDS_RO |  yes  |   yes   |    NULL    |   U    |  1 or 0  |  yes   |
+-------------------------------+-------+---------+------------+--------+----------+--------+
| .. c:macro:: PyBUF_STRIDED    |  yes  |   yes   |    NULL    |   U    |     0    |  NULL  |
+-------------------------------+-------+---------+------------+--------+----------+--------+
| .. c:macro:: PyBUF_STRIDED_RO |  yes  |   yes   |    NULL    |   U    |  1 or 0  |  NULL  |
+-------------------------------+-------+---------+------------+--------+----------+--------+
| .. c:macro:: PyBUF_CONTIG     |  yes  |   NULL  |    NULL    |   C    |     0    |  NULL  |
+-------------------------------+-------+---------+------------+--------+----------+--------+
| .. c:macro:: PyBUF_CONTIG_RO  |  yes  |   NULL  |    NULL    |   C    |  1 or 0  |  NULL  |
+-------------------------------+-------+---------+------------+--------+----------+--------+
339
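The way compound requests are built from the basic flags can be modelled with
plain integers. The numeric values below are those found in CPython's headers
at the time of writing and are reproduced as an assumption for illustration
only; C code should always use the macros by name. ``contains`` is a
hypothetical helper.

.. code-block:: python

   # Assumed numeric values, shown only to illustrate the bit relationships.
   PyBUF_SIMPLE = 0
   PyBUF_WRITABLE = 0x0001
   PyBUF_FORMAT = 0x0004
   PyBUF_ND = 0x0008
   PyBUF_STRIDES = 0x0010 | PyBUF_ND
   PyBUF_INDIRECT = 0x0100 | PyBUF_STRIDES

   def contains(request, flag):
       """Return True if every bit of *flag* is set in *request*."""
       return (request & flag) == flag

   # Compound requests expressed as combinations of the basic flags.
   PyBUF_CONTIG_RO = PyBUF_ND
   PyBUF_CONTIG = PyBUF_ND | PyBUF_WRITABLE
   PyBUF_STRIDED_RO = PyBUF_STRIDES
   PyBUF_RECORDS = PyBUF_STRIDES | PyBUF_WRITABLE | PyBUF_FORMAT
   PyBUF_FULL_RO = PyBUF_INDIRECT | PyBUF_FORMAT
   PyBUF_FULL = PyBUF_INDIRECT | PyBUF_WRITABLE | PyBUF_FORMAT

With these definitions, each structural flag contains the bits of the flags
listed below it, and the ``_RO`` variants differ from their counterparts only
in the :c:macro:`PyBUF_WRITABLE` bit.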
340
341Complex arrays
342==============
343
344NumPy-style: shape and strides
345~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
346
347The logical structure of NumPy-style arrays is defined by :c:member:`~Py_buffer.itemsize`,
348:c:member:`~Py_buffer.ndim`, :c:member:`~Py_buffer.shape` and :c:member:`~Py_buffer.strides`.
349
350If ``ndim == 0``, the memory location pointed to by :c:member:`~Py_buffer.buf` is
351interpreted as a scalar of size :c:member:`~Py_buffer.itemsize`. In that case,
352both :c:member:`~Py_buffer.shape` and :c:member:`~Py_buffer.strides` are ``NULL``.
353
354If :c:member:`~Py_buffer.strides` is ``NULL``, the array is interpreted as
355a standard n-dimensional C-array. Otherwise, the consumer must access an
356n-dimensional array as follows:
357
.. code-block:: c

   ptr = (char *)buf + indices[0] * strides[0] + ... + indices[n-1] * strides[n-1];
   item = *((typeof(item) *)ptr);
362
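The same address computation can be tried out in Python. The sketch below
assumes a 2 x 3 array of native ``int`` values; ``item_offset`` is a
hypothetical helper that mirrors the expression above, and the result is
compared with the multi-dimensional indexing that :class:`memoryview` offers.

.. code-block:: python

   import struct

   def item_offset(indices, strides):
       """Byte offset of an element relative to ``buf``."""
       return sum(i * s for i, s in zip(indices, strides))

   raw = struct.pack("6i", 10, 11, 12, 20, 21, 22)
   view = memoryview(raw).cast("i", shape=[2, 3])

   # Row 1, column 2: skip one full row, then two items within the row.
   offset = item_offset((1, 2), view.strides)
   (item,) = struct.unpack_from(view.format, raw, offset)
   same_item = view[1, 2]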
363
364As noted above, :c:member:`~Py_buffer.buf` can point to any location within
365the actual memory block. An exporter can check the validity of a buffer with
366this function:
367
.. code-block:: python

   def verify_structure(memlen, itemsize, ndim, shape, strides, offset):
       """Verify that the parameters represent a valid array within
          the bounds of the allocated memory:
              char *mem: start of the physical memory block
              memlen: length of the physical memory block
              offset: (char *)buf - mem
       """
       if offset % itemsize:
           return False
       if offset < 0 or offset+itemsize > memlen:
           return False
       if any(v % itemsize for v in strides):
           return False

       if ndim <= 0:
           return ndim == 0 and not shape and not strides
       if 0 in shape:
           return True

       imin = sum(strides[j]*(shape[j]-1) for j in range(ndim)
                  if strides[j] <= 0)
       imax = sum(strides[j]*(shape[j]-1) for j in range(ndim)
                  if strides[j] > 0)

       return 0 <= offset+imin and offset+imax+itemsize <= memlen
395
396
397PIL-style: shape, strides and suboffsets
398~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
399
400In addition to the regular items, PIL-style arrays can contain pointers
401that must be followed in order to get to the next element in a dimension.
402For example, the regular three-dimensional C-array ``char v[2][2][3]`` can
403also be viewed as an array of 2 pointers to 2 two-dimensional arrays:
404``char (*v[2])[2][3]``. In suboffsets representation, those two pointers
405can be embedded at the start of :c:member:`~Py_buffer.buf`, pointing
406to two ``char x[2][3]`` arrays that can be located anywhere in memory.
407
408
409Here is a function that returns a pointer to the element in an N-D array
410pointed to by an N-dimensional index when there are both non-``NULL`` strides
411and suboffsets::
412
   void *get_item_pointer(int ndim, void *buf, Py_ssize_t *strides,
                          Py_ssize_t *suboffsets, Py_ssize_t *indices) {
       char *pointer = (char*)buf;
       int i;
       for (i = 0; i < ndim; i++) {
           pointer += strides[i] * indices[i];
           if (suboffsets[i] >= 0) {
               pointer = *((char**)pointer) + suboffsets[i];
           }
       }
       return (void*)pointer;
   }
425
426
427Buffer-related functions
428========================
429
430.. c:function:: int PyObject_CheckBuffer(PyObject *obj)
431
   Return ``1`` if *obj* supports the buffer interface, otherwise ``0``.  When ``1`` is
433   returned, it doesn't guarantee that :c:func:`PyObject_GetBuffer` will
434   succeed.  This function always succeeds.
435
436
437.. c:function:: int PyObject_GetBuffer(PyObject *exporter, Py_buffer *view, int flags)
438
   Send a request to *exporter* to fill in *view* as specified by *flags*.
440   If the exporter cannot provide a buffer of the exact type, it MUST raise
441   :c:data:`PyExc_BufferError`, set ``view->obj`` to ``NULL`` and
442   return ``-1``.
443
   On success, fill in *view*, set ``view->obj`` to a new reference
   to *exporter* and return ``0``. In the case of chained buffer providers
   that redirect requests to a single object, ``view->obj`` MAY
   refer to this object instead of *exporter* (see :ref:`Buffer Object Structures <buffer-structs>`).
448
449   Successful calls to :c:func:`PyObject_GetBuffer` must be paired with calls
450   to :c:func:`PyBuffer_Release`, similar to :c:func:`malloc` and :c:func:`free`.
451   Thus, after the consumer is done with the buffer, :c:func:`PyBuffer_Release`
452   must be called exactly once.
453
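   The same pairing is visible at the Python level, where :class:`memoryview`
   supports the ``with`` statement so that the buffer is released exactly
   once. A minimal sketch, assuming a :class:`bytearray` exporter:

   .. code-block:: python

      data = bytearray(b"spam")

      with memoryview(data) as view:     # buffer acquired on entry
          view[0] = ord("S")             # writes go straight to ``data``
      # leaving the block releases the buffer exactly once

      try:
          view[0]
          usable_after_release = True
      except ValueError:
          usable_after_release = False

   After the block, the exporter holds the modified bytes and the released
   view can no longer be used to reach them.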
454
455.. c:function:: void PyBuffer_Release(Py_buffer *view)
456
457   Release the buffer *view* and decrement the reference count for
458   ``view->obj``. This function MUST be called when the buffer
459   is no longer being used, otherwise reference leaks may occur.
460
461   It is an error to call this function on a buffer that was not obtained via
462   :c:func:`PyObject_GetBuffer`.
463
464
.. c:function:: Py_ssize_t PyBuffer_SizeFromFormat(const char *format)

   Return the implied :c:member:`~Py_buffer.itemsize` from :c:member:`~Py_buffer.format`.
468   This function is not yet implemented.
469
470
471.. c:function:: int PyBuffer_IsContiguous(Py_buffer *view, char order)
472
473   Return ``1`` if the memory defined by the *view* is C-style (*order* is
474   ``'C'``) or Fortran-style (*order* is ``'F'``) :term:`contiguous` or either one
475   (*order* is ``'A'``).  Return ``0`` otherwise.  This function always succeeds.
476
477
478.. c:function:: void* PyBuffer_GetPointer(Py_buffer *view, Py_ssize_t *indices)
479
480   Get the memory area pointed to by the *indices* inside the given *view*.
481   *indices* must point to an array of ``view->ndim`` indices.
482
483
484.. c:function:: int PyBuffer_FromContiguous(Py_buffer *view, void *buf, Py_ssize_t len, char fort)
485
486   Copy contiguous *len* bytes from *buf* to *view*.
487   *fort* can be ``'C'`` or ``'F'`` (for C-style or Fortran-style ordering).
488   ``0`` is returned on success, ``-1`` on error.
489
490
491.. c:function:: int PyBuffer_ToContiguous(void *buf, Py_buffer *src, Py_ssize_t len, char order)
492
493   Copy *len* bytes from *src* to its contiguous representation in *buf*.
494   *order* can be ``'C'`` or ``'F'`` or ``'A'`` (for C-style or Fortran-style
495   ordering or either one). ``0`` is returned on success, ``-1`` on error.
496
497   This function fails if *len* != *src->len*.
498
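   A comparable copy can be made from Python with the ``tobytes()`` method of
   :class:`memoryview`, which accepts an *order* argument in recent Python
   versions. The sketch below uses illustrative names and a 2 x 3 array of
   bytes:

   .. code-block:: python

      flat = memoryview(bytes(range(6)))
      grid = flat.cast("B", shape=[2, 3])    # [[0, 1, 2], [3, 4, 5]]

      c_copy = grid.tobytes(order="C")       # rows first
      f_copy = grid.tobytes(order="F")       # columns first

      sparse = flat[::2]                     # non-contiguous view of 0, 2, 4
      packed = sparse.tobytes()              # contiguous copy of the items

   The length of ``packed`` equals ``sparse.nbytes``, that is, the length the
   logical structure has once it is copied to a contiguous representation.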
499
500.. c:function:: void PyBuffer_FillContiguousStrides(int ndims, Py_ssize_t *shape, Py_ssize_t *strides, int itemsize, char order)
501
502   Fill the *strides* array with byte-strides of a :term:`contiguous` (C-style if
503   *order* is ``'C'`` or Fortran-style if *order* is ``'F'``) array of the
504   given shape with the given number of bytes per element.
505
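   The computation can be sketched in pure Python. ``contiguous_strides`` is
   a hypothetical helper written for illustration, not part of any API; it
   only models the arithmetic for a given shape and item size.

   .. code-block:: python

      def contiguous_strides(shape, itemsize, order="C"):
          """Return byte strides of a contiguous array with the given shape."""
          ndim = len(shape)
          strides = [0] * ndim
          step = itemsize
          # C order: the last index varies fastest; Fortran order: the first.
          dims = range(ndim - 1, -1, -1) if order == "C" else range(ndim)
          for d in dims:
              strides[d] = step
              step *= shape[d]
          return tuple(strides)

      c_strides = contiguous_strides((2, 3, 4), 8, "C")
      f_strides = contiguous_strides((2, 3, 4), 8, "F")

      # Cross-check the C-order result against a memoryview of doubles.
      view = memoryview(bytes(2 * 3 * 4 * 8)).cast("d", shape=[2, 3, 4])
      view_strides = view.strides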
506
507.. c:function:: int PyBuffer_FillInfo(Py_buffer *view, PyObject *exporter, void *buf, Py_ssize_t len, int readonly, int flags)
508
509   Handle buffer requests for an exporter that wants to expose *buf* of size *len*
510   with writability set according to *readonly*. *buf* is interpreted as a sequence
511   of unsigned bytes.
512
513   The *flags* argument indicates the request type. This function always fills in
514   *view* as specified by flags, unless *buf* has been designated as read-only
515   and :c:macro:`PyBUF_WRITABLE` is set in *flags*.
516
   On success, set ``view->obj`` to a new reference to *exporter* and
   return ``0``. Otherwise, raise :c:data:`PyExc_BufferError`, set
   ``view->obj`` to ``NULL`` and return ``-1``.
520
521   If this function is used as part of a :ref:`getbufferproc <buffer-structs>`,
522   *exporter* MUST be set to the exporting object and *flags* must be passed
523   unmodified. Otherwise, *exporter* MUST be ``NULL``.
524