.. highlightlang:: c

.. _stringobjects:

String/Bytes Objects
--------------------

These functions raise :exc:`TypeError` when they expect a string parameter and
are called with a non-string parameter.

.. note::

   These functions have been renamed to PyBytes_* in Python 3.x. Unless
   otherwise noted, the PyBytes functions available in 3.x are aliased to their
   PyString_* equivalents to help porting.

.. index:: object: string


.. c:type:: PyStringObject

   This subtype of :c:type:`PyObject` represents a Python string object.


.. c:var:: PyTypeObject PyString_Type

   .. index:: single: StringType (in module types)

   This instance of :c:type:`PyTypeObject` represents the Python string type; it is
   the same object as ``str`` and ``types.StringType`` in the Python layer.


.. c:function:: int PyString_Check(PyObject *o)

   Return true if the object *o* is a string object or an instance of a subtype of
   the string type.

   .. versionchanged:: 2.2
      Allowed subtypes to be accepted.


.. c:function:: int PyString_CheckExact(PyObject *o)

   Return true if the object *o* is a string object, but not an instance of a
   subtype of the string type.

   .. versionadded:: 2.2


.. c:function:: PyObject* PyString_FromString(const char *v)

   Return a new string object with a copy of the string *v* as value on success,
   and *NULL* on failure.  The parameter *v* must not be *NULL*; it will not be
   checked.


.. c:function:: PyObject* PyString_FromStringAndSize(const char *v, Py_ssize_t len)

   Return a new string object with a copy of the string *v* as value and length
   *len* on success, and *NULL* on failure.  If *v* is *NULL*, the contents of the
   string are uninitialized.

   .. versionchanged:: 2.5
      This function used an :c:type:`int` type for *len*. This might require
      changes in your code for properly supporting 64-bit systems.


.. c:function:: PyObject* PyString_FromFormat(const char *format, ...)

   Take a C :c:func:`printf`\ -style *format* string and a variable number of
   arguments, calculate the size of the resulting Python string and return a string
   with the values formatted into it.  The variable arguments must be C types and
   must correspond exactly to the format characters in the *format* string.  The
   following format characters are allowed:

   .. % This should be exactly the same as the table in PyErr_Format.
   .. % One should just refer to the other.
   .. % The descriptions for %zd and %zu are wrong, but the truth is complicated
   .. % because not all compilers support the %z width modifier -- we fake it
   .. % when necessary via interpolating PY_FORMAT_SIZE_T.
   .. % Similar comments apply to the %ll width modifier and
   .. % PY_FORMAT_LONG_LONG.
   .. % %u, %lu, %zu should have "new in Python 2.5" blurbs.

   +-------------------+---------------+--------------------------------+
   | Format Characters | Type          | Comment                        |
   +===================+===============+================================+
   | :attr:`%%`        | *n/a*         | The literal % character.       |
   +-------------------+---------------+--------------------------------+
   | :attr:`%c`        | int           | A single character,            |
   |                   |               | represented as a C int.        |
   +-------------------+---------------+--------------------------------+
   | :attr:`%d`        | int           | Exactly equivalent to          |
   |                   |               | ``printf("%d")``.              |
   +-------------------+---------------+--------------------------------+
   | :attr:`%u`        | unsigned int  | Exactly equivalent to          |
   |                   |               | ``printf("%u")``.              |
   +-------------------+---------------+--------------------------------+
   | :attr:`%ld`       | long          | Exactly equivalent to          |
   |                   |               | ``printf("%ld")``.             |
   +-------------------+---------------+--------------------------------+
   | :attr:`%lu`       | unsigned long | Exactly equivalent to          |
   |                   |               | ``printf("%lu")``.             |
   +-------------------+---------------+--------------------------------+
   | :attr:`%lld`      | long long     | Exactly equivalent to          |
   |                   |               | ``printf("%lld")``.            |
   +-------------------+---------------+--------------------------------+
   | :attr:`%llu`      | unsigned      | Exactly equivalent to          |
   |                   | long long     | ``printf("%llu")``.            |
   +-------------------+---------------+--------------------------------+
   | :attr:`%zd`       | Py_ssize_t    | Exactly equivalent to          |
   |                   |               | ``printf("%zd")``.             |
   +-------------------+---------------+--------------------------------+
   | :attr:`%zu`       | size_t        | Exactly equivalent to          |
   |                   |               | ``printf("%zu")``.             |
   +-------------------+---------------+--------------------------------+
   | :attr:`%i`        | int           | Exactly equivalent to          |
   |                   |               | ``printf("%i")``.              |
   +-------------------+---------------+--------------------------------+
   | :attr:`%x`        | int           | Exactly equivalent to          |
   |                   |               | ``printf("%x")``.              |
   +-------------------+---------------+--------------------------------+
   | :attr:`%s`        | char\*        | A null-terminated C character  |
   |                   |               | array.                         |
   +-------------------+---------------+--------------------------------+
   | :attr:`%p`        | void\*        | The hex representation of a C  |
   |                   |               | pointer. Mostly equivalent to  |
   |                   |               | ``printf("%p")`` except that   |
   |                   |               | it is guaranteed to start with |
   |                   |               | the literal ``0x`` regardless  |
   |                   |               | of what the platform's         |
   |                   |               | ``printf`` yields.             |
   +-------------------+---------------+--------------------------------+

   An unrecognized format character causes all the rest of the format string to be
   copied as-is to the result string, and any extra arguments discarded.

   .. note::

      The ``"%lld"`` and ``"%llu"`` format specifiers are only available
      when :const:`HAVE_LONG_LONG` is defined.

   .. versionchanged:: 2.7
      Support for ``"%lld"`` and ``"%llu"`` added.


.. c:function:: PyObject* PyString_FromFormatV(const char *format, va_list vargs)

   Identical to :c:func:`PyString_FromFormat` except that it takes exactly two
   arguments.


.. c:function:: Py_ssize_t PyString_Size(PyObject *string)

   Return the length of the string in string object *string*.

   .. versionchanged:: 2.5
      This function returned an :c:type:`int` type. This might require changes
      in your code for properly supporting 64-bit systems.


.. c:function:: Py_ssize_t PyString_GET_SIZE(PyObject *string)

   Macro form of :c:func:`PyString_Size` but without error checking.

   .. versionchanged:: 2.5
      This macro returned an :c:type:`int` type. This might require changes in
      your code for properly supporting 64-bit systems.


.. c:function:: char* PyString_AsString(PyObject *string)

   Return a NUL-terminated representation of the contents of *string*.  The pointer
   refers to the internal buffer of *string*, not a copy.  The data must not be
   modified in any way, unless the string was just created using
   ``PyString_FromStringAndSize(NULL, size)``. It must not be deallocated.  If
   *string* is a Unicode object, this function computes the default encoding of
   *string* and operates on that.  If *string* is not a string object at all,
   :c:func:`PyString_AsString` returns *NULL* and raises :exc:`TypeError`.


.. c:function:: char* PyString_AS_STRING(PyObject *string)

   Macro form of :c:func:`PyString_AsString` but without error checking.  Only
   string objects are supported; no Unicode objects should be passed.


.. c:function:: int PyString_AsStringAndSize(PyObject *obj, char **buffer, Py_ssize_t *length)

   Return a NUL-terminated representation of the contents of the object *obj*
   through the output variables *buffer* and *length*.

   The function accepts both string and Unicode objects as input. For Unicode
   objects it returns the default encoded version of the object.  If *length* is
   *NULL*, the resulting buffer may not contain NUL characters; if it does, the
   function returns ``-1`` and a :exc:`TypeError` is raised.

   The buffer refers to an internal string buffer of *obj*, not a copy. The data
   must not be modified in any way, unless the string was just created using
   ``PyString_FromStringAndSize(NULL, size)``.  It must not be deallocated.  If
   *obj* is a Unicode object, this function computes the default encoding of
   *obj* and operates on that.  If *obj* is not a string object at all,
   :c:func:`PyString_AsStringAndSize` returns ``-1`` and raises :exc:`TypeError`.

   .. versionchanged:: 2.5
      This function used an :c:type:`int *` type for *length*. This might
      require changes in your code for properly supporting 64-bit systems.


.. c:function:: void PyString_Concat(PyObject **string, PyObject *newpart)

   Create a new string object in *\*string* containing the contents of *newpart*
   appended to *string*; the caller will own the new reference.  The reference to
   the old value of *string* will be stolen.  If the new string cannot be created,
   the old reference to *string* will still be discarded and the value of
   *\*string* will be set to *NULL*; the appropriate exception will be set.


.. c:function:: void PyString_ConcatAndDel(PyObject **string, PyObject *newpart)

   Create a new string object in *\*string* containing the contents of *newpart*
   appended to *string*.  This version decrements the reference count of *newpart*.


.. c:function:: int _PyString_Resize(PyObject **string, Py_ssize_t newsize)

   A way to resize a string object even though it is "immutable". Only use this to
   build up a brand new string object; don't use this if the string may already be
   known in other parts of the code.  It is an error to call this function if the
   refcount on the input string object is not one. Pass the address of an existing
   string object as an lvalue (it may be written into), and the new size desired.
   On success, *\*string* holds the resized string object and ``0`` is returned;
   the address in *\*string* may differ from its input value.  If the reallocation
   fails, the original string object at *\*string* is deallocated, *\*string* is
   set to *NULL*, a memory exception is set, and ``-1`` is returned.

   .. versionchanged:: 2.5
      This function used an :c:type:`int` type for *newsize*. This might
      require changes in your code for properly supporting 64-bit systems.

.. c:function:: PyObject* PyString_Format(PyObject *format, PyObject *args)

   Return a new string object from *format* and *args*. Analogous to ``format %
   args``.  The *args* argument must be a tuple or dict.


.. c:function:: void PyString_InternInPlace(PyObject **string)

   Intern the argument *\*string* in place.  The argument must be the address of a
   pointer variable pointing to a Python string object.  If there is an existing
   interned string that is the same as *\*string*, it sets *\*string* to it
   (decrementing the reference count of the old string object and incrementing the
   reference count of the interned string object), otherwise it leaves *\*string*
   alone and interns it (incrementing its reference count).  (Clarification: even
   though there is a lot of talk about reference counts, think of this function as
   reference-count-neutral; you own the object after the call if and only if you
   owned it before the call.)

   .. note::

      This function is not available in 3.x and does not have a PyBytes alias.


.. c:function:: PyObject* PyString_InternFromString(const char *v)

   A combination of :c:func:`PyString_FromString` and
   :c:func:`PyString_InternInPlace`, returning either a new string object that has
   been interned, or a new ("owned") reference to an earlier interned string object
   with the same value.

   .. note::

      This function is not available in 3.x and does not have a PyBytes alias.


.. c:function:: PyObject* PyString_Decode(const char *s, Py_ssize_t size, const char *encoding, const char *errors)

   Create an object by decoding *size* bytes of the encoded buffer *s* using the
   codec registered for *encoding*.  *encoding* and *errors* have the same meaning
   as the parameters of the same name in the :func:`unicode` built-in function.
   The codec to be used is looked up using the Python codec registry.  Return
   *NULL* if an exception was raised by the codec.

   .. note::

      This function is not available in 3.x and does not have a PyBytes alias.

   .. versionchanged:: 2.5
      This function used an :c:type:`int` type for *size*. This might require
      changes in your code for properly supporting 64-bit systems.


.. c:function:: PyObject* PyString_AsDecodedObject(PyObject *str, const char *encoding, const char *errors)

   Decode a string object by passing it to the codec registered for *encoding* and
   return the result as a Python object. *encoding* and *errors* have the same
   meaning as the parameters of the same name in the string :meth:`encode` method.
   The codec to be used is looked up using the Python codec registry. Return *NULL*
   if an exception was raised by the codec.

   .. note::

      This function is not available in 3.x and does not have a PyBytes alias.


.. c:function:: PyObject* PyString_Encode(const char *s, Py_ssize_t size, const char *encoding, const char *errors)

   Encode the :c:type:`char` buffer of the given size by passing it to the codec
   registered for *encoding* and return a Python object. *encoding* and *errors*
   have the same meaning as the parameters of the same name in the string
   :meth:`encode` method. The codec to be used is looked up using the Python codec
   registry.  Return *NULL* if an exception was raised by the codec.

   .. note::

      This function is not available in 3.x and does not have a PyBytes alias.

   .. versionchanged:: 2.5
      This function used an :c:type:`int` type for *size*. This might require
      changes in your code for properly supporting 64-bit systems.


.. c:function:: PyObject* PyString_AsEncodedObject(PyObject *str, const char *encoding, const char *errors)

   Encode a string object using the codec registered for *encoding* and return the
   result as a Python object. *encoding* and *errors* have the same meaning as the
   parameters of the same name in the string :meth:`encode` method. The codec to be
   used is looked up using the Python codec registry. Return *NULL* if an exception
   was raised by the codec.

   .. note::

      This function is not available in 3.x and does not have a PyBytes alias.
