1:mod:`xml.etree.ElementTree` --- The ElementTree XML API
2========================================================
3
4.. module:: xml.etree.ElementTree
5   :synopsis: Implementation of the ElementTree API.
6
7.. moduleauthor:: Fredrik Lundh <fredrik@pythonware.com>
8
9**Source code:** :source:`Lib/xml/etree/ElementTree.py`
10
11--------------
12
13The :mod:`xml.etree.ElementTree` module implements a simple and efficient API
14for parsing and creating XML data.
15
16.. versionchanged:: 3.3
17   This module will use a fast implementation whenever available.
18   The :mod:`xml.etree.cElementTree` module is deprecated.
19
20
21.. warning::
22
23   The :mod:`xml.etree.ElementTree` module is not secure against
24   maliciously constructed data.  If you need to parse untrusted or
25   unauthenticated data see :ref:`xml-vulnerabilities`.
26
27Tutorial
28--------
29
30This is a short tutorial for using :mod:`xml.etree.ElementTree` (``ET`` in
31short).  The goal is to demonstrate some of the building blocks and basic
32concepts of the module.
33
34XML tree and elements
35^^^^^^^^^^^^^^^^^^^^^
36
37XML is an inherently hierarchical data format, and the most natural way to
38represent it is with a tree.  ``ET`` has two classes for this purpose -
39:class:`ElementTree` represents the whole XML document as a tree, and
40:class:`Element` represents a single node in this tree.  Interactions with
41the whole document (reading and writing to/from files) are usually done
42on the :class:`ElementTree` level.  Interactions with a single XML element
43and its sub-elements are done on the :class:`Element` level.
44
45.. _elementtree-parsing-xml:
46
47Parsing XML
48^^^^^^^^^^^
49
50We'll be using the following XML document as the sample data for this section:
51
52.. code-block:: xml
53
54   <?xml version="1.0"?>
55   <data>
56       <country name="Liechtenstein">
57           <rank>1</rank>
58           <year>2008</year>
59           <gdppc>141100</gdppc>
60           <neighbor name="Austria" direction="E"/>
61           <neighbor name="Switzerland" direction="W"/>
62       </country>
63       <country name="Singapore">
64           <rank>4</rank>
65           <year>2011</year>
66           <gdppc>59900</gdppc>
67           <neighbor name="Malaysia" direction="N"/>
68       </country>
69       <country name="Panama">
70           <rank>68</rank>
71           <year>2011</year>
72           <gdppc>13600</gdppc>
73           <neighbor name="Costa Rica" direction="W"/>
74           <neighbor name="Colombia" direction="E"/>
75       </country>
76   </data>
77
78We can import this data by reading from a file::
79
80   import xml.etree.ElementTree as ET
81   tree = ET.parse('country_data.xml')
82   root = tree.getroot()
83
84Or directly from a string::
85
86   root = ET.fromstring(country_data_as_string)
87
88:func:`fromstring` parses XML from a string directly into an :class:`Element`,
89which is the root element of the parsed tree.  Other parsing functions may
90create an :class:`ElementTree`.  Check the documentation to be sure.
91
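The difference in return types can be checked directly.  The following is a
minimal sketch; the inline ``xml_text`` sample and the use of
:class:`io.StringIO` in place of a file on disk are assumptions made to keep
the example self-contained:

.. code-block:: python

   import io
   import xml.etree.ElementTree as ET

   xml_text = "<data><country name='Panama'/></data>"

   # parse() reads from a file name or file object and returns an ElementTree.
   tree = ET.parse(io.StringIO(xml_text))
   root_from_file = tree.getroot()

   # fromstring() returns the root Element directly.
   root_from_string = ET.fromstring(xml_text)

   print(type(tree).__name__, root_from_file.tag, root_from_string.tag)
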
92As an :class:`Element`, ``root`` has a tag and a dictionary of attributes::
93
94   >>> root.tag
95   'data'
96   >>> root.attrib
97   {}
98
It also has child nodes over which we can iterate::
100
101   >>> for child in root:
102   ...     print(child.tag, child.attrib)
103   ...
104   country {'name': 'Liechtenstein'}
105   country {'name': 'Singapore'}
106   country {'name': 'Panama'}
107
108Children are nested, and we can access specific child nodes by index::
109
110   >>> root[0][1].text
111   '2008'
112
113
114.. note::
115
116   Not all elements of the XML input will end up as elements of the
117   parsed tree. Currently, this module skips over any XML comments,
118   processing instructions, and document type declarations in the
119   input. Nevertheless, trees built using this module's API rather
120   than parsing from XML text can have comments and processing
121   instructions in them; they will be included when generating XML
122   output. A document type declaration may be accessed by passing a
123   custom :class:`TreeBuilder` instance to the :class:`XMLParser`
124   constructor.
125
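As a hedged illustration of the note above, the sketch below parses a small
made-up document twice: once with the default parser, which drops the comment,
and once with a :class:`TreeBuilder` created with ``insert_comments=True``
(a keyword available since Python 3.8), which keeps it.  The sample XML is an
assumption chosen for the example:

.. code-block:: python

   import xml.etree.ElementTree as ET

   xml_text = "<root><!-- a note --><child/></root>"

   # Default behaviour: the comment is skipped.
   plain = ET.fromstring(xml_text)

   # Opt in to comment nodes through a custom TreeBuilder target.
   parser = ET.XMLParser(target=ET.TreeBuilder(insert_comments=True))
   with_comments = ET.fromstring(xml_text, parser=parser)

   print(len(plain), len(with_comments))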
126
127.. _elementtree-pull-parsing:
128
129Pull API for non-blocking parsing
130^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
131
132Most parsing functions provided by this module require the whole document
133to be read at once before returning any result.  It is possible to use an
134:class:`XMLParser` and feed data into it incrementally, but it is a push API that
135calls methods on a callback target, which is too low-level and inconvenient for
136most needs.  Sometimes what the user really wants is to be able to parse XML
137incrementally, without blocking operations, while enjoying the convenience of
138fully constructed :class:`Element` objects.
139
140The most powerful tool for doing this is :class:`XMLPullParser`.  It does not
141require a blocking read to obtain the XML data, and is instead fed with data
142incrementally with :meth:`XMLPullParser.feed` calls.  To get the parsed XML
143elements, call :meth:`XMLPullParser.read_events`.  Here is an example::
144
145   >>> parser = ET.XMLPullParser(['start', 'end'])
146   >>> parser.feed('<mytag>sometext')
147   >>> list(parser.read_events())
148   [('start', <Element 'mytag' at 0x7fa66db2be58>)]
149   >>> parser.feed(' more text</mytag>')
150   >>> for event, elem in parser.read_events():
151   ...     print(event)
152   ...     print(elem.tag, 'text=', elem.text)
153   ...
   end
   mytag text= sometext more text
155
156The obvious use case is applications that operate in a non-blocking fashion
157where the XML data is being received from a socket or read incrementally from
158some storage device.  In such cases, blocking reads are unacceptable.
159
160Because it's so flexible, :class:`XMLPullParser` can be inconvenient to use for
161simpler use-cases.  If you don't mind your application blocking on reading XML
162data but would still like to have incremental parsing capabilities, take a look
163at :func:`iterparse`.  It can be useful when you're reading a large XML document
164and don't want to hold it wholly in memory.
165
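A minimal sketch of that pattern with :func:`iterparse` follows.  The
in-memory :class:`io.BytesIO` source stands in for a large file, and the
sample content is an assumption made for the example; clearing each
``country`` element once it has been handled is one way to keep memory use
lower:

.. code-block:: python

   import io
   import xml.etree.ElementTree as ET

   source = io.BytesIO(
       b"<data>"
       b"<country name='Liechtenstein'><rank>1</rank></country>"
       b"<country name='Singapore'><rank>4</rank></country>"
       b"</data>"
   )

   ranks = {}
   # With no events argument, only "end" events are reported, so each
   # element is complete when it is yielded.
   for event, elem in ET.iterparse(source):
       if elem.tag == "country":
           ranks[elem.get("name")] = int(elem.find("rank").text)
           elem.clear()  # drop children, text and attributes already handled

   print(ranks)
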
166Finding interesting elements
167^^^^^^^^^^^^^^^^^^^^^^^^^^^^
168
169:class:`Element` has some useful methods that help iterate recursively over all
170the sub-tree below it (its children, their children, and so on).  For example,
171:meth:`Element.iter`::
172
173   >>> for neighbor in root.iter('neighbor'):
174   ...     print(neighbor.attrib)
175   ...
176   {'name': 'Austria', 'direction': 'E'}
177   {'name': 'Switzerland', 'direction': 'W'}
178   {'name': 'Malaysia', 'direction': 'N'}
179   {'name': 'Costa Rica', 'direction': 'W'}
180   {'name': 'Colombia', 'direction': 'E'}
181
182:meth:`Element.findall` finds only elements with a tag which are direct
183children of the current element.  :meth:`Element.find` finds the *first* child
184with a particular tag, and :attr:`Element.text` accesses the element's text
185content.  :meth:`Element.get` accesses the element's attributes::
186
187   >>> for country in root.findall('country'):
188   ...     rank = country.find('rank').text
189   ...     name = country.get('name')
190   ...     print(name, rank)
191   ...
192   Liechtenstein 1
193   Singapore 4
194   Panama 68
195
196More sophisticated specification of which elements to look for is possible by
197using :ref:`XPath <elementtree-xpath>`.
198
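To see the difference between :meth:`Element.findall` and
:meth:`Element.iter` in isolation, consider the following sketch.  The
one-country document is an assumption made to keep the example short:

.. code-block:: python

   import xml.etree.ElementTree as ET

   root = ET.fromstring(
       "<data>"
       "<country name='Singapore'><neighbor name='Malaysia'/></country>"
       "</data>"
   )

   direct = root.findall("neighbor")      # only direct children are searched
   nested = list(root.iter("neighbor"))   # the whole subtree is searched
   missing = root.find("capital")         # find() returns None on no match

   print(len(direct), len(nested), missing)

Because :meth:`Element.find` returns ``None`` when nothing matches, code that
reads ``.text`` from its result may want to check for ``None`` first.
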
199Modifying an XML File
200^^^^^^^^^^^^^^^^^^^^^
201
202:class:`ElementTree` provides a simple way to build XML documents and write them to files.
203The :meth:`ElementTree.write` method serves this purpose.
204
205Once created, an :class:`Element` object may be manipulated by directly changing
206its fields (such as :attr:`Element.text`), adding and modifying attributes
207(:meth:`Element.set` method), as well as adding new children (for example
208with :meth:`Element.append`).
209
210Let's say we want to add one to each country's rank, and add an ``updated``
211attribute to the rank element::
212
213   >>> for rank in root.iter('rank'):
214   ...     new_rank = int(rank.text) + 1
215   ...     rank.text = str(new_rank)
216   ...     rank.set('updated', 'yes')
217   ...
218   >>> tree.write('output.xml')
219
220Our XML now looks like this:
221
222.. code-block:: xml
223
224   <?xml version="1.0"?>
225   <data>
226       <country name="Liechtenstein">
227           <rank updated="yes">2</rank>
228           <year>2008</year>
229           <gdppc>141100</gdppc>
230           <neighbor name="Austria" direction="E"/>
231           <neighbor name="Switzerland" direction="W"/>
232       </country>
233       <country name="Singapore">
234           <rank updated="yes">5</rank>
235           <year>2011</year>
236           <gdppc>59900</gdppc>
237           <neighbor name="Malaysia" direction="N"/>
238       </country>
239       <country name="Panama">
240           <rank updated="yes">69</rank>
241           <year>2011</year>
242           <gdppc>13600</gdppc>
243           <neighbor name="Costa Rica" direction="W"/>
244           <neighbor name="Colombia" direction="E"/>
245       </country>
246   </data>
247
248We can remove elements using :meth:`Element.remove`.  Let's say we want to
249remove all countries with a rank higher than 50::
250
251   >>> for country in root.findall('country'):
252   ...     # using root.findall() to avoid removal during traversal
253   ...     rank = int(country.find('rank').text)
254   ...     if rank > 50:
255   ...         root.remove(country)
256   ...
257   >>> tree.write('output.xml')
258
259Note that concurrent modification while iterating can lead to problems,
260just like when iterating and modifying Python lists or dicts.
261Therefore, the example first collects all matching elements with
262``root.findall()``, and only then iterates over the list of matches.
263
264Our XML now looks like this:
265
266.. code-block:: xml
267
268   <?xml version="1.0"?>
269   <data>
270       <country name="Liechtenstein">
271           <rank updated="yes">2</rank>
272           <year>2008</year>
273           <gdppc>141100</gdppc>
274           <neighbor name="Austria" direction="E"/>
275           <neighbor name="Switzerland" direction="W"/>
276       </country>
277       <country name="Singapore">
278           <rank updated="yes">5</rank>
279           <year>2011</year>
280           <gdppc>59900</gdppc>
281           <neighbor name="Malaysia" direction="N"/>
282       </country>
283   </data>
284
285Building XML documents
286^^^^^^^^^^^^^^^^^^^^^^
287
288The :func:`SubElement` function also provides a convenient way to create new
289sub-elements for a given element::
290
291   >>> a = ET.Element('a')
292   >>> b = ET.SubElement(a, 'b')
293   >>> c = ET.SubElement(a, 'c')
294   >>> d = ET.SubElement(c, 'd')
295   >>> ET.dump(a)
296   <a><b /><c><d /></c></a>
297
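:func:`SubElement` can be combined with :meth:`Element.set`,
:attr:`Element.text` and :meth:`Element.append` to build a small document
from scratch.  The element and attribute names below are assumptions borrowed
from the sample data, and this is a sketch rather than the only way to do it:

.. code-block:: python

   import xml.etree.ElementTree as ET

   data = ET.Element("data")

   # Create a child in one step and set an attribute on it.
   country = ET.SubElement(data, "country")
   country.set("name", "Liechtenstein")

   # Create a standalone element, give it text, then attach it.
   rank = ET.Element("rank")
   rank.text = "1"
   country.append(rank)

   xml_text = ET.tostring(data, encoding="unicode")
   print(xml_text)
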
298Parsing XML with Namespaces
299^^^^^^^^^^^^^^^^^^^^^^^^^^^
300
301If the XML input has `namespaces
302<https://en.wikipedia.org/wiki/XML_namespace>`__, tags and attributes
303with prefixes in the form ``prefix:sometag`` get expanded to
304``{uri}sometag`` where the *prefix* is replaced by the full *URI*.
305Also, if there is a `default namespace
306<https://www.w3.org/TR/xml-names/#defaulting>`__,
307that full URI gets prepended to all of the non-prefixed tags.
308
309Here is an XML example that incorporates two namespaces, one with the
310prefix "fictional" and the other serving as the default namespace:
311
312.. code-block:: xml
313
314    <?xml version="1.0"?>
315    <actors xmlns:fictional="http://characters.example.com"
316            xmlns="http://people.example.com">
317        <actor>
318            <name>John Cleese</name>
319            <fictional:character>Lancelot</fictional:character>
320            <fictional:character>Archie Leach</fictional:character>
321        </actor>
322        <actor>
323            <name>Eric Idle</name>
324            <fictional:character>Sir Robin</fictional:character>
325            <fictional:character>Gunther</fictional:character>
326            <fictional:character>Commander Clement</fictional:character>
327        </actor>
328    </actors>
329
330One way to search and explore this XML example is to manually add the
331URI to every tag or attribute in the xpath of a
332:meth:`~Element.find` or :meth:`~Element.findall`::
333
    root = ET.fromstring(xml_text)
335    for actor in root.findall('{http://people.example.com}actor'):
336        name = actor.find('{http://people.example.com}name')
337        print(name.text)
338        for char in actor.findall('{http://characters.example.com}character'):
339            print(' |-->', char.text)
340
341A better way to search the namespaced XML example is to create a
342dictionary with your own prefixes and use those in the search functions::
343
344    ns = {'real_person': 'http://people.example.com',
345          'role': 'http://characters.example.com'}
346
347    for actor in root.findall('real_person:actor', ns):
348        name = actor.find('real_person:name', ns)
349        print(name.text)
350        for char in actor.findall('role:character', ns):
351            print(' |-->', char.text)
352
353These two approaches both output::
354
355    John Cleese
356     |--> Lancelot
357     |--> Archie Leach
358    Eric Idle
359     |--> Sir Robin
360     |--> Gunther
361     |--> Commander Clement
362
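The tag expansion described at the start of this section can be observed by
listing the tags of a parsed tree.  The sketch below uses a shortened version
of the actors document (an assumption made for brevity); the ``{*}`` wildcard
in the last query requires Python 3.8 or later:

.. code-block:: python

   import xml.etree.ElementTree as ET

   xml_text = (
       '<actors xmlns:fictional="http://characters.example.com" '
       'xmlns="http://people.example.com">'
       '<actor><name>John Cleese</name>'
       '<fictional:character>Lancelot</fictional:character></actor>'
       '</actors>'
   )
   root = ET.fromstring(xml_text)

   # Every tag carries its namespace URI in braces, not the original prefix.
   tags = [elem.tag for elem in root.iter()]

   # Match 'name' elements regardless of namespace.
   names = [elem.text for elem in root.findall(".//{*}name")]

   print(tags)
   print(names)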
363
364Additional resources
365^^^^^^^^^^^^^^^^^^^^
366
367See http://effbot.org/zone/element-index.htm for tutorials and links to other
368docs.
369
370
371.. _elementtree-xpath:
372
373XPath support
374-------------
375
376This module provides limited support for
377`XPath expressions <https://www.w3.org/TR/xpath>`_ for locating elements in a
378tree.  The goal is to support a small subset of the abbreviated syntax; a full
379XPath engine is outside the scope of the module.
380
381Example
382^^^^^^^
383
384Here's an example that demonstrates some of the XPath capabilities of the
385module.  We'll be using the ``countrydata`` XML document from the
386:ref:`Parsing XML <elementtree-parsing-xml>` section::
387
388   import xml.etree.ElementTree as ET
389
390   root = ET.fromstring(countrydata)
391
392   # Top-level elements
393   root.findall(".")
394
395   # All 'neighbor' grand-children of 'country' children of the top-level
396   # elements
397   root.findall("./country/neighbor")
398
399   # Nodes with name='Singapore' that have a 'year' child
400   root.findall(".//year/..[@name='Singapore']")
401
402   # 'year' nodes that are children of nodes with name='Singapore'
403   root.findall(".//*[@name='Singapore']/year")
404
405   # All 'neighbor' nodes that are the second child of their parent
406   root.findall(".//neighbor[2]")
407
408For XML with namespaces, use the usual qualified ``{namespace}tag`` notation::
409
410   # All dublin-core "title" tags in the document
411   root.findall(".//{http://purl.org/dc/elements/1.1/}title")
412
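Some of the queries above can be checked against a cut-down copy of the
sample data.  This is only a sketch: the ``countrydata`` string below is an
abbreviated stand-in for the full document, so the results apply to this
reduced input only:

.. code-block:: python

   import xml.etree.ElementTree as ET

   countrydata = (
       '<data>'
       '<country name="Liechtenstein"><year>2008</year>'
       '<neighbor name="Austria"/><neighbor name="Switzerland"/></country>'
       '<country name="Singapore"><year>2011</year>'
       '<neighbor name="Malaysia"/></country>'
       '</data>'
   )
   root = ET.fromstring(countrydata)

   # Grandchildren reached through an explicit child step.
   neighbors = [n.get("name") for n in root.findall("./country/neighbor")]

   # Step up to the parent of each 'year' and filter on its attribute.
   matches = root.findall(".//year/..[@name='Singapore']")
   parents = [country.get("name") for country in matches]

   # Only the second 'neighbor' child of each parent.
   second = [n.get("name") for n in root.findall(".//neighbor[2]")]

   print(neighbors, parents, second)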
413
414Supported XPath syntax
415^^^^^^^^^^^^^^^^^^^^^^
416
417.. tabularcolumns:: |l|L|
418
419+-----------------------+------------------------------------------------------+
420| Syntax                | Meaning                                              |
421+=======================+======================================================+
422| ``tag``               | Selects all child elements with the given tag.       |
423|                       | For example, ``spam`` selects all child elements     |
424|                       | named ``spam``, and ``spam/egg`` selects all         |
425|                       | grandchildren named ``egg`` in all children named    |
426|                       | ``spam``.  ``{namespace}*`` selects all tags in the  |
427|                       | given namespace, ``{*}spam`` selects tags named      |
428|                       | ``spam`` in any (or no) namespace, and ``{}*``       |
429|                       | only selects tags that are not in a namespace.       |
430|                       |                                                      |
431|                       | .. versionchanged:: 3.8                              |
432|                       |    Support for star-wildcards was added.             |
433+-----------------------+------------------------------------------------------+
434| ``*``                 | Selects all child elements, including comments and   |
435|                       | processing instructions.  For example, ``*/egg``     |
436|                       | selects all grandchildren named ``egg``.             |
437+-----------------------+------------------------------------------------------+
438| ``.``                 | Selects the current node.  This is mostly useful     |
439|                       | at the beginning of the path, to indicate that it's  |
440|                       | a relative path.                                     |
441+-----------------------+------------------------------------------------------+
442| ``//``                | Selects all subelements, on all levels beneath the   |
443|                       | current  element.  For example, ``.//egg`` selects   |
444|                       | all ``egg`` elements in the entire tree.             |
445+-----------------------+------------------------------------------------------+
446| ``..``                | Selects the parent element.  Returns ``None`` if the |
447|                       | path attempts to reach the ancestors of the start    |
448|                       | element (the element ``find`` was called on).        |
449+-----------------------+------------------------------------------------------+
450| ``[@attrib]``         | Selects all elements that have the given attribute.  |
451+-----------------------+------------------------------------------------------+
452| ``[@attrib='value']`` | Selects all elements for which the given attribute   |
453|                       | has the given value.  The value cannot contain       |
454|                       | quotes.                                              |
455+-----------------------+------------------------------------------------------+
456| ``[tag]``             | Selects all elements that have a child named         |
457|                       | ``tag``.  Only immediate children are supported.     |
458+-----------------------+------------------------------------------------------+
459| ``[.='text']``        | Selects all elements whose complete text content,    |
460|                       | including descendants, equals the given ``text``.    |
461|                       |                                                      |
462|                       | .. versionadded:: 3.7                                |
463+-----------------------+------------------------------------------------------+
464| ``[tag='text']``      | Selects all elements that have a child named         |
465|                       | ``tag`` whose complete text content, including       |
466|                       | descendants, equals the given ``text``.              |
467+-----------------------+------------------------------------------------------+
468| ``[position]``        | Selects all elements that are located at the given   |
469|                       | position.  The position can be either an integer     |
470|                       | (1 is the first position), the expression ``last()`` |
471|                       | (for the last position), or a position relative to   |
472|                       | the last position (e.g. ``last()-1``).               |
473+-----------------------+------------------------------------------------------+
474
475Predicates (expressions within square brackets) must be preceded by a tag
476name, an asterisk, or another predicate.  ``position`` predicates must be
477preceded by a tag name.
478
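The sketch below exercises several of the predicate forms from the table.
The document and the ``names()`` helper are hypothetical and exist only for
this illustration:

.. code-block:: python

   import xml.etree.ElementTree as ET

   root = ET.fromstring(
       '<data>'
       '<country name="Liechtenstein"><year>2008</year>'
       '<neighbor name="Austria" direction="E"/></country>'
       '<country name="Singapore"><year>2011</year></country>'
       '<country name="Panama"><year>2011</year>'
       '<neighbor name="Costa Rica" direction="W"/></country>'
       '</data>'
   )

   def names(path):
       """Return the 'name' attribute of every element matched by *path*."""
       return [elem.get("name") for elem in root.findall(path)]

   by_child_text = names("country[year='2011']")         # [tag='text']
   has_child = names("country[neighbor]")                # [tag]
   last_one = names("country[last()]")                   # [position]
   by_attribute = names(".//neighbor[@direction='W']")   # [@attrib='value']

   print(by_child_text, has_child, last_one, by_attribute)
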
479Reference
480---------
481
482.. _elementtree-functions:
483
484Functions
485^^^^^^^^^
486
487.. function:: canonicalize(xml_data=None, *, out=None, from_file=None, **options)
488
489   `C14N 2.0 <https://www.w3.org/TR/xml-c14n2/>`_ transformation function.
490
491   Canonicalization is a way to normalise XML output in a way that allows
   byte-by-byte comparisons and digital signatures.  It reduces the freedom
493   that XML serializers have and instead generates a more constrained XML
494   representation.  The main restrictions regard the placement of namespace
495   declarations, the ordering of attributes, and ignorable whitespace.
496
497   This function takes an XML data string (*xml_data*) or a file path or
498   file-like object (*from_file*) as input, converts it to the canonical
499   form, and writes it out using the *out* file(-like) object, if provided,
500   or returns it as a text string if not.  The output file receives text,
501   not bytes.  It should therefore be opened in text mode with ``utf-8``
502   encoding.
503
504   Typical uses::
505
506      xml_data = "<root>...</root>"
507      print(canonicalize(xml_data))
508
509      with open("c14n_output.xml", mode='w', encoding='utf-8') as out_file:
510          canonicalize(xml_data, out=out_file)
511
512      with open("c14n_output.xml", mode='w', encoding='utf-8') as out_file:
513          canonicalize(from_file="inputfile.xml", out=out_file)
514
515   The configuration *options* are as follows:
516
517   - *with_comments*: set to true to include comments (default: false)
518   - *strip_text*: set to true to strip whitespace before and after text content
519                   (default: false)
520   - *rewrite_prefixes*: set to true to replace namespace prefixes by "n{number}"
521                         (default: false)
522   - *qname_aware_tags*: a set of qname aware tag names in which prefixes
523                         should be replaced in text content (default: empty)
524   - *qname_aware_attrs*: a set of qname aware attribute names in which prefixes
525                          should be replaced in text content (default: empty)
526   - *exclude_attrs*: a set of attribute names that should not be serialised
527   - *exclude_tags*: a set of tag names that should not be serialised
528
   In the option list above, "a set" refers to any collection or iterable of
   strings; no ordering is expected.
531
532   .. versionadded:: 3.8
533
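   As a hedged sketch of why canonicalization is useful, the two made-up
   inputs below differ only in attribute order, quoting and empty-element
   syntax, so they are expected to canonicalize to the same text:

   .. code-block:: python

      from xml.etree.ElementTree import canonicalize

      first = canonicalize('<root b="2" a="1"/>')
      second = canonicalize("<root a='1' b='2'></root>")

      # Both serializations describe the same document.
      print(first == second)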
534
535.. function:: Comment(text=None)
536
537   Comment element factory.  This factory function creates a special element
538   that will be serialized as an XML comment by the standard serializer.  The
539   comment string can be either a bytestring or a Unicode string.  *text* is a
540   string containing the comment string.  Returns an element instance
541   representing a comment.
542
543   Note that :class:`XMLParser` skips over comments in the input
544   instead of creating comment objects for them. An :class:`ElementTree` will
   only contain comment nodes if they have been inserted into
546   the tree using one of the :class:`Element` methods.
547
548.. function:: dump(elem)
549
   Writes an element tree or element structure to ``sys.stdout``.  This function
551   should be used for debugging only.
552
553   The exact output format is implementation dependent.  In this version, it's
554   written as an ordinary XML file.
555
556   *elem* is an element tree or an individual element.
557
558   .. versionchanged:: 3.8
559      The :func:`dump` function now preserves the attribute order specified
560      by the user.
561
562
563.. function:: fromstring(text, parser=None)
564
565   Parses an XML section from a string constant.  Same as :func:`XML`.  *text*
566   is a string containing XML data.  *parser* is an optional parser instance.
567   If not given, the standard :class:`XMLParser` parser is used.
568   Returns an :class:`Element` instance.
569
570
571.. function:: fromstringlist(sequence, parser=None)
572
573   Parses an XML document from a sequence of string fragments.  *sequence* is a
574   list or other sequence containing XML data fragments.  *parser* is an
575   optional parser instance.  If not given, the standard :class:`XMLParser`
576   parser is used.  Returns an :class:`Element` instance.
577
578   .. versionadded:: 3.2
579
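   A short sketch, assuming the fragments arrive as a list of strings that
   may split the markup at arbitrary points (the sample fragments are made
   up for the example):

   .. code-block:: python

      import xml.etree.ElementTree as ET

      # A fragment boundary may fall in the middle of text content.
      fragments = ["<root>", "<child>te", "xt</child>", "</root>"]
      root = ET.fromstringlist(fragments)

      print(root.tag, root[0].text)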
580
581.. function:: iselement(element)
582
583   Check if an object appears to be a valid element object.  *element* is an
584   element instance.  Return ``True`` if this is an element object.
585
586
587.. function:: iterparse(source, events=None, parser=None)
588
589   Parses an XML section into an element tree incrementally, and reports what's
590   going on to the user.  *source* is a filename or :term:`file object`
591   containing XML data.  *events* is a sequence of events to report back.  The
592   supported events are the strings ``"start"``, ``"end"``, ``"comment"``,
593   ``"pi"``, ``"start-ns"`` and ``"end-ns"``
594   (the "ns" events are used to get detailed namespace
595   information).  If *events* is omitted, only ``"end"`` events are reported.
596   *parser* is an optional parser instance.  If not given, the standard
597   :class:`XMLParser` parser is used.  *parser* must be a subclass of
598   :class:`XMLParser` and can only use the default :class:`TreeBuilder` as a
599   target.  Returns an :term:`iterator` providing ``(event, elem)`` pairs.
600
601   Note that while :func:`iterparse` builds the tree incrementally, it issues
602   blocking reads on *source* (or the file it names).  As such, it's unsuitable
603   for applications where blocking reads can't be made.  For fully non-blocking
604   parsing, see :class:`XMLPullParser`.
605
606   .. note::
607
608      :func:`iterparse` only guarantees that it has seen the ">" character of a
609      starting tag when it emits a "start" event, so the attributes are defined,
610      but the contents of the text and tail attributes are undefined at that
611      point.  The same applies to the element children; they may or may not be
612      present.
613
614      If you need a fully populated element, look for "end" events instead.
615
616   .. deprecated:: 3.4
617      The *parser* argument.
618
619   .. versionchanged:: 3.8
620      The ``comment`` and ``pi`` events were added.
621
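   The following sketch shows the shape of the reported pairs when namespace
   events are requested.  For ``"start-ns"`` the second item is a
   ``(prefix, uri)`` tuple rather than an element.  The in-memory source and
   the ``http://example.com/x`` URI are assumptions made for the example:

   .. code-block:: python

      import io
      import xml.etree.ElementTree as ET

      source = io.BytesIO(
          b'<root xmlns:x="http://example.com/x"><x:item/></root>'
      )

      seen = []
      for event, payload in ET.iterparse(source, events=("start-ns", "end")):
          if event == "start-ns":
              prefix, uri = payload          # namespace declaration
              seen.append((event, prefix, uri))
          else:
              seen.append((event, payload.tag))  # fully parsed element

      print(seen)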
622
623.. function:: parse(source, parser=None)
624
625   Parses an XML section into an element tree.  *source* is a filename or file
626   object containing XML data.  *parser* is an optional parser instance.  If
627   not given, the standard :class:`XMLParser` parser is used.  Returns an
628   :class:`ElementTree` instance.
629
630
631.. function:: ProcessingInstruction(target, text=None)
632
633   PI element factory.  This factory function creates a special element that
634   will be serialized as an XML processing instruction.  *target* is a string
635   containing the PI target.  *text* is a string containing the PI contents, if
636   given.  Returns an element instance, representing a processing instruction.
637
638   Note that :class:`XMLParser` skips over processing instructions
   in the input instead of creating PI objects for them. An
   :class:`ElementTree` will only contain processing instruction nodes if
   they have been inserted into the tree using one of the
642   :class:`Element` methods.
643
644.. function:: register_namespace(prefix, uri)
645
646   Registers a namespace prefix.  The registry is global, and any existing
647   mapping for either the given prefix or the namespace URI will be removed.
648   *prefix* is a namespace prefix.  *uri* is a namespace uri.  Tags and
649   attributes in this namespace will be serialized with the given prefix, if at
650   all possible.
651
652   .. versionadded:: 3.2
653
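   A minimal sketch of the effect on serialization follows.  The ``ex``
   prefix and the ``http://example.com/ns`` URI are placeholders chosen for
   the example; keep in mind that the registration affects the whole process:

   .. code-block:: python

      import xml.etree.ElementTree as ET

      ET.register_namespace("ex", "http://example.com/ns")

      elem = ET.Element("{http://example.com/ns}title")
      xml_text = ET.tostring(elem, encoding="unicode")

      # The registered prefix is used instead of a generated one.
      print(xml_text)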
654
655.. function:: SubElement(parent, tag, attrib={}, **extra)
656
657   Subelement factory.  This function creates an element instance, and appends
658   it to an existing element.
659
660   The element name, attribute names, and attribute values can be either
661   bytestrings or Unicode strings.  *parent* is the parent element.  *tag* is
662   the subelement name.  *attrib* is an optional dictionary, containing element
663   attributes.  *extra* contains additional attributes, given as keyword
664   arguments.  Returns an element instance.
665
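   For instance, attributes can be supplied through *attrib*, through keyword
   arguments, or both.  The tag and attribute names in this sketch are
   arbitrary:

   .. code-block:: python

      import xml.etree.ElementTree as ET

      parent = ET.Element("list")

      # 'id' comes from the attrib dictionary, 'lang' from a keyword argument.
      item = ET.SubElement(parent, "item", {"id": "1"}, lang="en")

      print(item.attrib, parent[0] is item)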
666
667.. function:: tostring(element, encoding="us-ascii", method="xml", *, \
668                       xml_declaration=None, default_namespace=None, \
669                       short_empty_elements=True)
670
671   Generates a string representation of an XML element, including all
672   subelements.  *element* is an :class:`Element` instance.  *encoding* [1]_ is
673   the output encoding (default is US-ASCII).  Use ``encoding="unicode"`` to
674   generate a Unicode string (otherwise, a bytestring is generated).  *method*
675   is either ``"xml"``, ``"html"`` or ``"text"`` (default is ``"xml"``).
   *xml_declaration*, *default_namespace* and *short_empty_elements* have the same
677   meaning as in :meth:`ElementTree.write`. Returns an (optionally) encoded string
678   containing the XML data.
679
680   .. versionadded:: 3.4
681      The *short_empty_elements* parameter.
682
683   .. versionadded:: 3.8
684      The *xml_declaration* and *default_namespace* parameters.
685
686   .. versionchanged:: 3.8
687      The :func:`tostring` function now preserves the attribute order
688      specified by the user.
689
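   The sketch below contrasts a few parameter combinations on a single empty
   element (the element name is arbitrary).  Note that the result is a
   :class:`bytes` object unless ``encoding="unicode"`` is given:

   .. code-block:: python

      import xml.etree.ElementTree as ET

      elem = ET.Element("a")

      as_bytes = ET.tostring(elem)                      # default: US-ASCII bytes
      as_text = ET.tostring(elem, encoding="unicode")   # str instead of bytes
      expanded = ET.tostring(elem, encoding="unicode",
                             short_empty_elements=False)
      declared = ET.tostring(elem, encoding="utf-8", xml_declaration=True)

      print(as_bytes, as_text, expanded, declared)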
690
691.. function:: tostringlist(element, encoding="us-ascii", method="xml", *, \
692                           xml_declaration=None, default_namespace=None, \
693                           short_empty_elements=True)
694
695   Generates a string representation of an XML element, including all
696   subelements.  *element* is an :class:`Element` instance.  *encoding* [1]_ is
697   the output encoding (default is US-ASCII).  Use ``encoding="unicode"`` to
698   generate a Unicode string (otherwise, a bytestring is generated).  *method*
699   is either ``"xml"``, ``"html"`` or ``"text"`` (default is ``"xml"``).
   *xml_declaration*, *default_namespace* and *short_empty_elements* have the same
701   meaning as in :meth:`ElementTree.write`. Returns a list of (optionally) encoded
702   strings containing the XML data. It does not guarantee any specific sequence,
703   except that ``b"".join(tostringlist(element)) == tostring(element)``.
704
705   .. versionadded:: 3.2
706
707   .. versionadded:: 3.4
708      The *short_empty_elements* parameter.
709
710   .. versionadded:: 3.8
711      The *xml_declaration* and *default_namespace* parameters.
712
713   .. versionchanged:: 3.8
714      The :func:`tostringlist` function now preserves the attribute order
715      specified by the user.
716
717
718.. function:: XML(text, parser=None)
719
720   Parses an XML section from a string constant.  This function can be used to
721   embed "XML literals" in Python code.  *text* is a string containing XML
722   data.  *parser* is an optional parser instance.  If not given, the standard
723   :class:`XMLParser` parser is used.  Returns an :class:`Element` instance.
724
725
726.. function:: XMLID(text, parser=None)
727
728   Parses an XML section from a string constant, and also returns a dictionary
   which maps from element IDs to elements.  *text* is a string containing XML
730   data.  *parser* is an optional parser instance.  If not given, the standard
731   :class:`XMLParser` parser is used.  Returns a tuple containing an
732   :class:`Element` instance and a dictionary.
733
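   A short sketch, assuming elements are identified by a plain ``id``
   attribute (the sample document is made up):

   .. code-block:: python

      import xml.etree.ElementTree as ET

      root, ids = ET.XMLID(
          '<root><item id="a"/><item id="b">second</item></root>'
      )

      # The dictionary maps each id value to its element.
      print(sorted(ids), ids["b"].text)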
734
735.. _elementtree-xinclude:
736
737XInclude support
738----------------
739
740This module provides limited support for
741`XInclude directives <https://www.w3.org/TR/xinclude/>`_, via the :mod:`xml.etree.ElementInclude` helper module.  This module can be used to insert subtrees and text strings into element trees, based on information in the tree.
742
743Example
744^^^^^^^
745
746Here's an example that demonstrates use of the XInclude module. To include an XML document in the current document, use the ``{http://www.w3.org/2001/XInclude}include`` element and set the **parse** attribute to ``"xml"``, and use the **href** attribute to specify the document to include.
747
748.. code-block:: xml
749
750    <?xml version="1.0"?>
751    <document xmlns:xi="http://www.w3.org/2001/XInclude">
752      <xi:include href="source.xml" parse="xml" />
753    </document>
754
755By default, the **href** attribute is treated as a file name. You can use custom loaders to override this behaviour. Also note that the standard helper does not support XPointer syntax.
756
To process this file, load it as usual, and pass the root element to the ``include()`` function of the :mod:`xml.etree.ElementInclude` module:
758
759.. code-block:: python
760
761   from xml.etree import ElementTree, ElementInclude
762
763   tree = ElementTree.parse("document.xml")
764   root = tree.getroot()
765
766   ElementInclude.include(root)
767
768The ElementInclude module replaces the ``{http://www.w3.org/2001/XInclude}include`` element with the root element from the **source.xml** document. The result might look something like this:
769
770.. code-block:: xml
771
772    <document xmlns:xi="http://www.w3.org/2001/XInclude">
773      <para>This is a paragraph.</para>
774    </document>
775
If the **parse** attribute is omitted, it defaults to ``"xml"``. The **href** attribute is required.

To include a text document, use the ``{http://www.w3.org/2001/XInclude}include`` element, and set the **parse** attribute to ``"text"``:
779
780.. code-block:: xml
781
782    <?xml version="1.0"?>
783    <document xmlns:xi="http://www.w3.org/2001/XInclude">
784      Copyright (c) <xi:include href="year.txt" parse="text" />.
785    </document>
786
787The result might look something like:
788
789.. code-block:: xml
790
791    <document xmlns:xi="http://www.w3.org/2001/XInclude">
792      Copyright (c) 2003.
793    </document>
794
795Reference
796---------
797
798.. _elementinclude-functions:
799
800Functions
801^^^^^^^^^
802
.. function:: xml.etree.ElementInclude.default_loader(href, parse, encoding=None)

   Default loader.  It reads an included resource from disk.  *href* is the
   location of the resource, which this loader treats as a file name.
   *parse* is the parse mode, either ``"xml"`` or ``"text"``.  *encoding*
   is an optional text encoding.  If not given, the encoding is ``utf-8``.
   Returns the expanded resource.  If the parse mode is ``"xml"``, this is the
   root :class:`~xml.etree.ElementTree.Element` of the parsed document.  If the
   parse mode is ``"text"``, this is a Unicode string.  If the loader fails, it
   can return ``None`` or raise an exception.
811
812
.. function:: xml.etree.ElementInclude.include(elem, loader=None)

   This function expands XInclude directives.  *elem* is the root element.  *loader* is
   an optional resource loader.  If omitted, it defaults to :func:`default_loader`.
   If given, it should be a callable that implements the same interface as
   :func:`default_loader`: it returns the expanded resource, which is a parsed
   element if the parse mode is ``"xml"`` and a Unicode string if the parse
   mode is ``"text"``, and it can return ``None`` or raise an exception if it
   fails.  The include elements found below *elem* are replaced in place.
822
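   A sketch of a custom loader, assuming the included resources live in a
   dictionary rather than on disk.  ``RESOURCES`` and ``memory_loader`` are
   hypothetical names introduced for this example; only the loader interface
   itself comes from the description above:

   .. code-block:: python

      from xml.etree import ElementTree, ElementInclude

      # Hypothetical in-memory "files" standing in for documents on disk.
      RESOURCES = {
          "source.xml": "<para>This is a paragraph.</para>",
          "year.txt": "2003",
      }

      def memory_loader(href, parse, encoding=None):
          """Loader with the same call signature as default_loader()."""
          data = RESOURCES[href]          # raises KeyError for unknown names
          if parse == "xml":
              return ElementTree.fromstring(data)   # parsed element
          return data                               # plain text

      root = ElementTree.fromstring(
          '<document xmlns:xi="http://www.w3.org/2001/XInclude">'
          '<xi:include href="source.xml" parse="xml" />'
          'Copyright (c) <xi:include href="year.txt" parse="text" />.'
          '</document>'
      )
      ElementInclude.include(root, loader=memory_loader)

      print(len(root))      # 1
      print(root[0].tag)    # para
      print(root[0].tail)   # Copyright (c) 2003.

   The included text is merged into the tail of the preceding element because
   element trees store text on elements rather than as separate nodes.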
823
824.. _elementtree-element-objects:
825
826Element Objects
827^^^^^^^^^^^^^^^
828
829.. class:: Element(tag, attrib={}, **extra)
830
831   Element class.  This class defines the Element interface, and provides a
832   reference implementation of this interface.
833
834   The element name, attribute names, and attribute values can be either
835   bytestrings or Unicode strings.  *tag* is the element name.  *attrib* is
836   an optional dictionary, containing element attributes.  *extra* contains
837   additional attributes, given as keyword arguments.
838
839
840   .. attribute:: tag
841
842      A string identifying what kind of data this element represents (the
843      element type, in other words).
844
845
846   .. attribute:: text
847                  tail
848
849      These attributes can be used to hold additional data associated with
850      the element.  Their values are usually strings but may be any
851      application-specific object.  If the element is created from
852      an XML file, the *text* attribute holds either the text between
853      the element's start tag and its first child or end tag, or ``None``, and
854      the *tail* attribute holds either the text between the element's
855      end tag and the next tag, or ``None``.  For the XML data
856
857      .. code-block:: xml
858
859         <a><b>1<c>2<d/>3</c></b>4</a>
860
861      the *a* element has ``None`` for both *text* and *tail* attributes,
862      the *b* element has *text* ``"1"`` and *tail* ``"4"``,
863      the *c* element has *text* ``"2"`` and *tail* ``None``,
864      and the *d* element has *text* ``None`` and *tail* ``"3"``.
865
866      To collect the inner text of an element, see :meth:`itertext`, for
867      example ``"".join(element.itertext())``.
868
869      Applications may store arbitrary objects in these attributes.
870
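      The values listed above can be checked with a short script (a minimal
      sketch that parses the same sample document):

      .. code-block:: python

         import xml.etree.ElementTree as ET

         a = ET.fromstring("<a><b>1<c>2<d/>3</c></b>4</a>")
         b = a[0]
         c = b[0]
         d = c[0]

         for elem in (a, b, c, d):
             print(elem.tag, repr(elem.text), repr(elem.tail))
         # a None None
         # b '1' '4'
         # c '2' None
         # d None '3'

         # itertext() walks text and tail values in document order.
         print("".join(a.itertext()))   # 1234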
871
872   .. attribute:: attrib
873
874      A dictionary containing the element's attributes.  Note that while the
875      *attrib* value is always a real mutable Python dictionary, an ElementTree
876      implementation may choose to use another internal representation, and
877      create the dictionary only if someone asks for it.  To take advantage of
878      such implementations, use the dictionary methods below whenever possible.
879
880   The following dictionary-like methods work on the element attributes.
881
882
883   .. method:: clear()
884
885      Resets an element.  This function removes all subelements, clears all
886      attributes, and sets the text and tail attributes to ``None``.
887
888
889   .. method:: get(key, default=None)
890
891      Gets the element attribute named *key*.
892
893      Returns the attribute value, or *default* if the attribute was not found.
894
895
896   .. method:: items()
897
898      Returns the element attributes as a sequence of (name, value) pairs.  The
899      attributes are returned in an arbitrary order.
900
901
902   .. method:: keys()
903
      Returns the element's attribute names as a list.  The names are returned
905      in an arbitrary order.
906
907
908   .. method:: set(key, value)
909
910      Set the attribute *key* on the element to *value*.
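
      A brief sketch of the attribute methods working together; the
      ``neighbor`` element and its attribute values are sample data:

      .. code-block:: python

         import xml.etree.ElementTree as ET

         elem = ET.Element("neighbor", {"name": "Austria"})
         elem.set("direction", "E")           # add or overwrite an attribute

         print(elem.get("name"))              # Austria
         print(elem.get("missing", "n/a"))    # n/a
         print(sorted(elem.keys()))           # ['direction', 'name']
         print(sorted(elem.items()))          # [('direction', 'E'), ('name', 'Austria')]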
911
912   The following methods work on the element's children (subelements).
913
914
915   .. method:: append(subelement)
916
917      Adds the element *subelement* to the end of this element's internal list
918      of subelements.  Raises :exc:`TypeError` if *subelement* is not an
919      :class:`Element`.
920
921
922   .. method:: extend(subelements)
923
924      Appends *subelements* from a sequence object with zero or more elements.
925      Raises :exc:`TypeError` if a subelement is not an :class:`Element`.
926
927      .. versionadded:: 3.2
928
929
930   .. method:: find(match, namespaces=None)
931
932      Finds the first subelement matching *match*.  *match* may be a tag name
933      or a :ref:`path <elementtree-xpath>`.  Returns an element instance
934      or ``None``.  *namespaces* is an optional mapping from namespace prefix
935      to full name.  Pass ``''`` as prefix to move all unprefixed tag names
936      in the expression into the given namespace.
937
938
939   .. method:: findall(match, namespaces=None)
940
941      Finds all matching subelements, by tag name or
942      :ref:`path <elementtree-xpath>`.  Returns a list containing all matching
943      elements in document order.  *namespaces* is an optional mapping from
944      namespace prefix to full name.  Pass ``''`` as prefix to move all
945      unprefixed tag names in the expression into the given namespace.
946
947
948   .. method:: findtext(match, default=None, namespaces=None)
949
950      Finds text for the first subelement matching *match*.  *match* may be
951      a tag name or a :ref:`path <elementtree-xpath>`.  Returns the text content
952      of the first matching element, or *default* if no element was found.
953      Note that if the matching element has no text content an empty string
954      is returned. *namespaces* is an optional mapping from namespace prefix
955      to full name.  Pass ``''`` as prefix to move all unprefixed tag names
956      in the expression into the given namespace.
957
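      The three lookup methods can be compared side by side.  This is a
      sketch only; the namespace URI and the element names are invented for
      the example:

      .. code-block:: python

         import xml.etree.ElementTree as ET

         root = ET.fromstring(
             '<data xmlns:p="http://example.com/people">'
             '<p:person><p:name>Ada</p:name><p:note/></p:person>'
             '<p:person><p:name>Grace</p:name></p:person>'
             '</data>'
         )
         ns = {"p": "http://example.com/people"}   # prefix -> full name

         first = root.find("p:person", ns)
         print(first.findtext("p:name", namespaces=ns))        # Ada
         print(len(root.findall("p:person", ns)))              # 2

         # An element that exists but has no text gives an empty string ...
         print(repr(first.findtext("p:note", namespaces=ns)))  # ''
         # ... while a missing element gives the default.
         print(first.findtext("p:email", default="unknown", namespaces=ns))  # unknown
         print(root.find("p:company", ns))                     # None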
958
959   .. method:: getchildren()
960
961      .. deprecated-removed:: 3.2 3.9
962         Use ``list(elem)`` or iteration.
963
964
965   .. method:: getiterator(tag=None)
966
967      .. deprecated-removed:: 3.2 3.9
968         Use method :meth:`Element.iter` instead.
969
970
971   .. method:: insert(index, subelement)
972
973      Inserts *subelement* at the given position in this element.  Raises
974      :exc:`TypeError` if *subelement* is not an :class:`Element`.
975
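      A short sketch combining :meth:`append`, :meth:`extend` and
      :meth:`insert`; the tag names are placeholders:

      .. code-block:: python

         import xml.etree.ElementTree as ET

         root = ET.Element("list")
         root.append(ET.Element("b"))
         root.extend([ET.Element("c"), ET.Element("d")])
         root.insert(0, ET.Element("a"))      # becomes the first child

         print([child.tag for child in root])  # ['a', 'b', 'c', 'd']

         # Anything that is not an Element is rejected.
         try:
             root.append("not an element")
         except TypeError:
             rejected = True
         else:
             rejected = False
         print(rejected)                       # True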
976
977   .. method:: iter(tag=None)
978
979      Creates a tree :term:`iterator` with the current element as the root.
980      The iterator iterates over this element and all elements below it, in
981      document (depth first) order.  If *tag* is not ``None`` or ``'*'``, only
982      elements whose tag equals *tag* are returned from the iterator.  If the
983      tree structure is modified during iteration, the result is undefined.
984
985      .. versionadded:: 3.2
986
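      The difference between iterating the whole subtree and searching the
      direct children can be sketched as follows (the sample document is made
      up):

      .. code-block:: python

         import xml.etree.ElementTree as ET

         root = ET.fromstring("<a><b><c/></b><c/><d><c/></d></a>")

         # The element itself comes first, then its descendants depth first.
         print([elem.tag for elem in root.iter()])
         # ['a', 'b', 'c', 'c', 'd', 'c']

         print(len(list(root.iter("c"))))   # 3, found at any depth
         print(len(root.findall("c")))      # 1, direct children only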
987
988   .. method:: iterfind(match, namespaces=None)
989
990      Finds all matching subelements, by tag name or
991      :ref:`path <elementtree-xpath>`.  Returns an iterable yielding all
992      matching elements in document order. *namespaces* is an optional mapping
993      from namespace prefix to full name.
994
995
996      .. versionadded:: 3.2
997
998
999   .. method:: itertext()
1000
1001      Creates a text iterator.  The iterator loops over this element and all
1002      subelements, in document order, and returns all inner text.
1003
1004      .. versionadded:: 3.2
1005
1006
1007   .. method:: makeelement(tag, attrib)
1008
      Creates a new element object of the same type as this element.  Do not
      call this method; use the :func:`SubElement` factory function instead.
1011
1012
1013   .. method:: remove(subelement)
1014
1015      Removes *subelement* from the element.  Unlike the find\* methods this
1016      method compares elements based on the instance identity, not on tag value
1017      or contents.
1018
1019   :class:`Element` objects also support the following sequence type methods
1020   for working with subelements: :meth:`~object.__delitem__`,
1021   :meth:`~object.__getitem__`, :meth:`~object.__setitem__`,
1022   :meth:`~object.__len__`.
1023
   Caution: Elements with no subelements will test as ``False``.  This behavior
   will change in future versions.  Use an explicit ``len(elem)`` or ``elem is
   None`` test instead. ::
1027
1028     element = root.find('foo')
1029
1030     if not element:  # careful!
1031         print("element not found, or element has no subelements")
1032
1033     if element is None:
1034         print("element not found")
1035
1036   Prior to Python 3.8, the serialisation order of the XML attributes of
1037   elements was artificially made predictable by sorting the attributes by
1038   their name. Based on the now guaranteed ordering of dicts, this arbitrary
1039   reordering was removed in Python 3.8 to preserve the order in which
1040   attributes were originally parsed or created by user code.
1041
1042   In general, user code should try not to depend on a specific ordering of
1043   attributes, given that the `XML Information Set
1044   <https://www.w3.org/TR/xml-infoset/>`_ explicitly excludes the attribute
1045   order from conveying information. Code should be prepared to deal with
1046   any ordering on input. In cases where deterministic XML output is required,
1047   e.g. for cryptographic signing or test data sets, canonical serialisation
1048   is available with the :func:`canonicalize` function.
1049
1050   In cases where canonical output is not applicable but a specific attribute
1051   order is still desirable on output, code should aim for creating the
1052   attributes directly in the desired order, to avoid perceptual mismatches
1053   for readers of the code. In cases where this is difficult to achieve, a
1054   recipe like the following can be applied prior to serialisation to enforce
1055   an order independently from the Element creation::
1056
1057     def reorder_attributes(root):
1058         for el in root.iter():
1059             attrib = el.attrib
1060             if len(attrib) > 1:
1061                 # adjust attribute order, e.g. by sorting
1062                 attribs = sorted(attrib.items())
1063                 attrib.clear()
1064                 attrib.update(attribs)
1065
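   Where canonical output is acceptable, a sketch along these lines shows that
   :func:`canonicalize` produces the same serialisation regardless of the
   attribute order and empty-element style of the input.  The two sample
   documents are assumptions made for the example:

   .. code-block:: python

      import xml.etree.ElementTree as ET

      first = ET.canonicalize('<doc z="1" a="2"><item/></doc>')
      second = ET.canonicalize('<doc a="2"   z="1"><item></item></doc>')

      print(first == second)                           # True
      print(first.index('a="2"') < first.index('z="1"'))  # True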
1066
1067.. _elementtree-elementtree-objects:
1068
1069ElementTree Objects
1070^^^^^^^^^^^^^^^^^^^
1071
1072
1073.. class:: ElementTree(element=None, file=None)
1074
1075   ElementTree wrapper class.  This class represents an entire element
1076   hierarchy, and adds some extra support for serialization to and from
1077   standard XML.
1078
1079   *element* is the root element.  The tree is initialized with the contents
1080   of the XML *file* if given.
1081
1082
1083   .. method:: _setroot(element)
1084
1085      Replaces the root element for this tree.  This discards the current
1086      contents of the tree, and replaces it with the given element.  Use with
1087      care.  *element* is an element instance.
1088
1089
1090   .. method:: find(match, namespaces=None)
1091
1092      Same as :meth:`Element.find`, starting at the root of the tree.
1093
1094
1095   .. method:: findall(match, namespaces=None)
1096
1097      Same as :meth:`Element.findall`, starting at the root of the tree.
1098
1099
1100   .. method:: findtext(match, default=None, namespaces=None)
1101
1102      Same as :meth:`Element.findtext`, starting at the root of the tree.
1103
1104
1105   .. method:: getiterator(tag=None)
1106
1107      .. deprecated-removed:: 3.2 3.9
1108         Use method :meth:`ElementTree.iter` instead.
1109
1110
1111   .. method:: getroot()
1112
1113      Returns the root element for this tree.
1114
1115
1116   .. method:: iter(tag=None)
1117
      Creates and returns a tree iterator for the root element.  The iterator
      loops over all elements in this tree, in document order.  *tag* is the tag
      to look for (default is to return all elements).
1121
1122
1123   .. method:: iterfind(match, namespaces=None)
1124
1125      Same as :meth:`Element.iterfind`, starting at the root of the tree.
1126
1127      .. versionadded:: 3.2
1128
1129
1130   .. method:: parse(source, parser=None)
1131
1132      Loads an external XML section into this element tree.  *source* is a file
1133      name or :term:`file object`.  *parser* is an optional parser instance.
1134      If not given, the standard :class:`XMLParser` parser is used.  Returns the
1135      section root element.
1136
1137
1138   .. method:: write(file, encoding="us-ascii", xml_declaration=None, \
1139                     default_namespace=None, method="xml", *, \
1140                     short_empty_elements=True)
1141
1142      Writes the element tree to a file, as XML.  *file* is a file name, or a
1143      :term:`file object` opened for writing.  *encoding* [1]_ is the output
1144      encoding (default is US-ASCII).
      *xml_declaration* controls if an XML declaration should be added to the
      file.  Use ``False`` to never add one, ``True`` to always add one, or
      ``None`` (the default) to add one only if the encoding is not US-ASCII,
      UTF-8 or Unicode.
1148      *default_namespace* sets the default XML namespace (for "xmlns").
1149      *method* is either ``"xml"``, ``"html"`` or ``"text"`` (default is
1150      ``"xml"``).
1151      The keyword-only *short_empty_elements* parameter controls the formatting
1152      of elements that contain no content.  If ``True`` (the default), they are
1153      emitted as a single self-closed tag, otherwise they are emitted as a pair
1154      of start/end tags.
1155
1156      The output is either a string (:class:`str`) or binary (:class:`bytes`).
1157      This is controlled by the *encoding* argument.  If *encoding* is
1158      ``"unicode"``, the output is a string; otherwise, it's binary.  Note that
1159      this may conflict with the type of *file* if it's an open
1160      :term:`file object`; make sure you do not try to write a string to a
1161      binary stream and vice versa.
1162
1163      .. versionadded:: 3.4
1164         The *short_empty_elements* parameter.
1165
1166      .. versionchanged:: 3.8
1167         The :meth:`write` method now preserves the attribute order specified
1168         by the user.
1169
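      The interplay between *encoding* and the type of *file* can be sketched
      with in-memory streams.  The use of :class:`io.BytesIO` and
      :class:`io.StringIO` is a choice made for this example so that it does
      not touch the file system:

      .. code-block:: python

         import io
         import xml.etree.ElementTree as ET

         tree = ET.ElementTree(ET.fromstring("<root><empty/></root>"))

         # A real encoding produces bytes, so the target must be binary.
         binary = io.BytesIO()
         tree.write(binary, encoding="utf-8", xml_declaration=True)
         print(binary.getvalue())

         # encoding="unicode" produces str, so the target must be a text stream.
         text = io.StringIO()
         tree.write(text, encoding="unicode", short_empty_elements=False)
         print(text.getvalue())   # <root><empty></empty></root>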
1170
1171This is the XML file that is going to be manipulated::
1172
1173    <html>
1174        <head>
1175            <title>Example page</title>
1176        </head>
1177        <body>
1178            <p>Moved to <a href="http://example.org/">example.org</a>
1179            or <a href="http://example.com/">example.com</a>.</p>
1180        </body>
1181    </html>
1182
Example of changing the attribute "target" of every link in the first paragraph::
1184
1185    >>> from xml.etree.ElementTree import ElementTree
1186    >>> tree = ElementTree()
1187    >>> tree.parse("index.xhtml")
1188    <Element 'html' at 0xb77e6fac>
1189    >>> p = tree.find("body/p")     # Finds first occurrence of tag p in body
1190    >>> p
1191    <Element 'p' at 0xb77ec26c>
1192    >>> links = list(p.iter("a"))   # Returns list of all links
1193    >>> links
1194    [<Element 'a' at 0xb77ec2ac>, <Element 'a' at 0xb77ec1cc>]
1195    >>> for i in links:             # Iterates through all found links
1196    ...     i.attrib["target"] = "blank"
1197    >>> tree.write("output.xhtml")
1198
1199.. _elementtree-qname-objects:
1200
1201QName Objects
1202^^^^^^^^^^^^^
1203
1204
1205.. class:: QName(text_or_uri, tag=None)
1206
   QName wrapper.  This can be used to wrap a QName attribute value, in order
   to get proper namespace handling on output.  *text_or_uri* is a string
   containing the QName value, in the form ``{uri}local``, or, if the *tag*
   argument is given, the URI part of a QName.  If *tag* is given, the first
   argument is interpreted as a URI, and *tag* is interpreted as a local name.
   :class:`QName` instances are opaque.
1213
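   A sketch of both constructor forms and of a wrapped attribute value.  The
   namespace URI is a placeholder, and the prefix that appears in the output
   is generated by the serializer:

   .. code-block:: python

      import xml.etree.ElementTree as ET

      # Both forms describe the same qualified name.
      one = ET.QName("{http://example.com/ns}thing")
      two = ET.QName("http://example.com/ns", "thing")
      print(str(one))     # {http://example.com/ns}thing
      print(one == two)   # True

      # Wrapping an attribute value lets the serializer emit a prefixed
      # name together with the matching namespace declaration.
      elem = ET.Element("doc")
      elem.set("type", two)
      xml = ET.tostring(elem, encoding="unicode")
      print(xml)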
1214
1215
1216.. _elementtree-treebuilder-objects:
1217
1218TreeBuilder Objects
1219^^^^^^^^^^^^^^^^^^^
1220
1221
1222.. class:: TreeBuilder(element_factory=None, *, comment_factory=None, \
1223                       pi_factory=None, insert_comments=False, insert_pis=False)
1224
1225   Generic element structure builder.  This builder converts a sequence of
1226   start, data, end, comment and pi method calls to a well-formed element
1227   structure.  You can use this class to build an element structure using
1228   a custom XML parser, or a parser for some other XML-like format.
1229
1230   *element_factory*, when given, must be a callable accepting two positional
1231   arguments: a tag and a dict of attributes.  It is expected to return a new
1232   element instance.
1233
1234   The *comment_factory* and *pi_factory* functions, when given, should behave
1235   like the :func:`Comment` and :func:`ProcessingInstruction` functions to
1236   create comments and processing instructions.  When not given, the default
1237   factories will be used.  When *insert_comments* and/or *insert_pis* is true,
1238   comments/pis will be inserted into the tree if they appear within the root
1239   element (but not outside of it).
1240
1241   .. method:: close()
1242
1243      Flushes the builder buffers, and returns the toplevel document
1244      element.  Returns an :class:`Element` instance.
1245
1246
1247   .. method:: data(data)
1248
1249      Adds text to the current element.  *data* is a string.  This should be
1250      either a bytestring, or a Unicode string.
1251
1252
1253   .. method:: end(tag)
1254
1255      Closes the current element.  *tag* is the element name.  Returns the
1256      closed element.
1257
1258
1259   .. method:: start(tag, attrs)
1260
1261      Opens a new element.  *tag* is the element name.  *attrs* is a dictionary
1262      containing element attributes.  Returns the opened element.
1263
1264
1265   .. method:: comment(text)
1266
1267      Creates a comment with the given *text*.  If ``insert_comments`` is true,
1268      this will also add it to the tree.
1269
1270      .. versionadded:: 3.8
1271
1272
1273   .. method:: pi(target, text)
1274
      Creates a processing instruction with the given *target* name and *text*.
      If ``insert_pis`` is true, this will also add it to the tree.
1277
1278      .. versionadded:: 3.8
1279
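   As an illustration of the call sequence described above, a tree can be
   built by driving the builder by hand.  This is a sketch with made-up tag
   names; in practice a parser would issue these calls:

   .. code-block:: python

      import xml.etree.ElementTree as ET

      builder = ET.TreeBuilder()
      builder.start("greeting", {"lang": "en"})
      builder.data("Hello, ")
      builder.start("b", {})
      builder.data("world")
      builder.end("b")
      builder.data("!")          # becomes the tail of <b>
      builder.end("greeting")
      root = builder.close()

      print(ET.tostring(root, encoding="unicode"))
      # <greeting lang="en">Hello, <b>world</b>!</greeting>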
1280
1281   In addition, a custom :class:`TreeBuilder` object can provide the
1282   following methods:
1283
1284   .. method:: doctype(name, pubid, system)
1285
1286      Handles a doctype declaration.  *name* is the doctype name.  *pubid* is
1287      the public identifier.  *system* is the system identifier.  This method
1288      does not exist on the default :class:`TreeBuilder` class.
1289
1290      .. versionadded:: 3.2
1291
1292   .. method:: start_ns(prefix, uri)
1293
      Called whenever the parser encounters a new namespace declaration,
1295      before the ``start()`` callback for the opening element that defines it.
1296      *prefix* is ``''`` for the default namespace and the declared
1297      namespace prefix name otherwise.  *uri* is the namespace URI.
1298
1299      .. versionadded:: 3.8
1300
1301   .. method:: end_ns(prefix)
1302
      Called after the ``end()`` callback of an element that declared
1304      a namespace prefix mapping, with the name of the *prefix* that went
1305      out of scope.
1306
1307      .. versionadded:: 3.8
1308
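   A sketch of a parser target that only records namespace declarations.
   ``NamespaceCollector`` is a hypothetical class written for this example,
   and the namespace URIs are placeholders:

   .. code-block:: python

      import xml.etree.ElementTree as ET

      class NamespaceCollector:
          """Parser target that remembers every declared prefix."""

          def __init__(self):
              self.mappings = []

          def start_ns(self, prefix, uri):
              # prefix is '' for the default namespace
              self.mappings.append((prefix, uri))

          def close(self):
              return self.mappings

      parser = ET.XMLParser(target=NamespaceCollector())
      parser.feed(
          '<root xmlns="http://example.com/default" '
          'xmlns:x="http://example.com/x"><x:child/></root>'
      )
      mappings = parser.close()
      print(dict(mappings))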
1309
1310.. class:: C14NWriterTarget(write, *, \
1311             with_comments=False, strip_text=False, rewrite_prefixes=False, \
1312             qname_aware_tags=None, qname_aware_attrs=None, \
1313             exclude_attrs=None, exclude_tags=None)
1314
1315   A `C14N 2.0 <https://www.w3.org/TR/xml-c14n2/>`_ writer.  Arguments are the
1316   same as for the :func:`canonicalize` function.  This class does not build a
1317   tree but translates the callback events directly into a serialised form
1318   using the *write* function.
1319
1320   .. versionadded:: 3.8
1321
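   A minimal sketch of using the writer as a parser target, assuming the
   output is collected in an :class:`io.StringIO` buffer (any callable that
   accepts a string would do as the *write* function):

   .. code-block:: python

      import io
      import xml.etree.ElementTree as ET

      source = '<doc b="2" a="1">\n  <item>  text  </item>\n</doc>'

      out = io.StringIO()
      parser = ET.XMLParser(target=ET.C14NWriterTarget(out.write, strip_text=True))
      parser.feed(source)
      parser.close()

      print(out.getvalue())
      # <doc a="1" b="2"><item>text</item></doc>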
1322
1323.. _elementtree-xmlparser-objects:
1324
1325XMLParser Objects
1326^^^^^^^^^^^^^^^^^
1327
1328
1329.. class:: XMLParser(*, target=None, encoding=None)
1330
1331   This class is the low-level building block of the module.  It uses
1332   :mod:`xml.parsers.expat` for efficient, event-based parsing of XML.  It can
1333   be fed XML data incrementally with the :meth:`feed` method, and parsing
1334   events are translated to a push API - by invoking callbacks on the *target*
1335   object.  If *target* is omitted, the standard :class:`TreeBuilder` is used.
1336   If *encoding* [1]_ is given, the value overrides the
1337   encoding specified in the XML file.
1338
1339   .. versionchanged:: 3.8
1340      Parameters are now :ref:`keyword-only <keyword-only_parameter>`.
      The *html* argument is no longer supported.
1342
1343
1344   .. method:: close()
1345
1346      Finishes feeding data to the parser.  Returns the result of calling the
1347      ``close()`` method of the *target* passed during construction; by default,
1348      this is the toplevel document element.
1349
1350
1351   .. method:: feed(data)
1352
1353      Feeds data to the parser.  *data* is encoded data.
1354
   :meth:`XMLParser.feed` calls *target*\'s ``start(tag, attrs_dict)`` method
   for each opening tag, its ``end(tag)`` method for each closing tag, and its
   ``data(data)`` method for character data.  For further supported callback
   methods, see the :class:`TreeBuilder` class.  :meth:`XMLParser.close` calls
   *target*\'s ``close()`` method.  :class:`XMLParser` is not limited to
   building a tree structure.  This example computes the maximum depth
   of an XML file::
1362
1363    >>> from xml.etree.ElementTree import XMLParser
1364    >>> class MaxDepth:                     # The target object of the parser
1365    ...     maxDepth = 0
1366    ...     depth = 0
1367    ...     def start(self, tag, attrib):   # Called for each opening tag.
1368    ...         self.depth += 1
1369    ...         if self.depth > self.maxDepth:
1370    ...             self.maxDepth = self.depth
1371    ...     def end(self, tag):             # Called for each closing tag.
1372    ...         self.depth -= 1
1373    ...     def data(self, data):
1374    ...         pass            # We do not need to do anything with data.
1375    ...     def close(self):    # Called when all data has been parsed.
1376    ...         return self.maxDepth
1377    ...
1378    >>> target = MaxDepth()
1379    >>> parser = XMLParser(target=target)
1380    >>> exampleXml = """
1381    ... <a>
1382    ...   <b>
1383    ...   </b>
1384    ...   <b>
1385    ...     <c>
1386    ...       <d>
1387    ...       </d>
1388    ...     </c>
1389    ...   </b>
1390    ... </a>"""
1391    >>> parser.feed(exampleXml)
1392    >>> parser.close()
1393    4
1394
1395
1396.. _elementtree-xmlpullparser-objects:
1397
1398XMLPullParser Objects
1399^^^^^^^^^^^^^^^^^^^^^
1400
1401.. class:: XMLPullParser(events=None)
1402
1403   A pull parser suitable for non-blocking applications.  Its input-side API is
1404   similar to that of :class:`XMLParser`, but instead of pushing calls to a
1405   callback target, :class:`XMLPullParser` collects an internal list of parsing
1406   events and lets the user read from it. *events* is a sequence of events to
1407   report back.  The supported events are the strings ``"start"``, ``"end"``,
1408   ``"comment"``, ``"pi"``, ``"start-ns"`` and ``"end-ns"`` (the "ns" events
1409   are used to get detailed namespace information).  If *events* is omitted,
1410   only ``"end"`` events are reported.
1411
1412   .. method:: feed(data)
1413
1414      Feed the given bytes data to the parser.
1415
1416   .. method:: close()
1417
1418      Signal the parser that the data stream is terminated. Unlike
1419      :meth:`XMLParser.close`, this method always returns :const:`None`.
1420      Any events not yet retrieved when the parser is closed can still be
1421      read with :meth:`read_events`.
1422
1423   .. method:: read_events()
1424
      Return an iterator over the events which have been encountered in the
      data fed to the parser.  The iterator yields ``(event, elem)`` pairs,
      where *event* is a string representing the type of event (e.g.
      ``"end"``) and *elem* is the encountered :class:`Element` object, or
      other context value as follows.
1430
1431      * ``start``, ``end``: the current Element.
1432      * ``comment``, ``pi``: the current comment / processing instruction
1433      * ``start-ns``: a tuple ``(prefix, uri)`` naming the declared namespace
1434        mapping.
1435      * ``end-ns``: :const:`None` (this may change in a future version)
1436
1437      Events provided in a previous call to :meth:`read_events` will not be
1438      yielded again.  Events are consumed from the internal queue only when
1439      they are retrieved from the iterator, so multiple readers iterating in
1440      parallel over iterators obtained from :meth:`read_events` will have
1441      unpredictable results.
1442
1443   .. note::
1444
1445      :class:`XMLPullParser` only guarantees that it has seen the ">"
1446      character of a starting tag when it emits a "start" event, so the
1447      attributes are defined, but the contents of the text and tail attributes
1448      are undefined at that point.  The same applies to the element children;
1449      they may or may not be present.
1450
1451      If you need a fully populated element, look for "end" events instead.
1452
1453   .. versionadded:: 3.4
1454
1455   .. versionchanged:: 3.8
1456      The ``comment`` and ``pi`` events were added.
1457
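   A sketch of incremental use, where the input arrives in arbitrary chunks.
   The chunk boundaries and the ``drain`` helper are made up for the example.
   Events are gathered after every :meth:`feed` call and once more after
   :meth:`close`, because the point at which an event becomes available can
   depend on how much input the underlying parser has processed:

   .. code-block:: python

      import xml.etree.ElementTree as ET

      parser = ET.XMLPullParser()   # only "end" events by default
      seen = []

      def drain():
          # Collect whatever events the parser has produced so far.
          for event, elem in parser.read_events():
              seen.append((event, elem.tag, elem.text))

      # Chunks may split the document anywhere, even in the middle of a tag.
      chunks = (b"<log><entry>fir", b"st</entry><entry>second</ent", b"ry></log>")
      for chunk in chunks:
          parser.feed(chunk)
          drain()

      parser.close()
      drain()   # events still queued at close time remain readable

      print(seen)
      # [('end', 'entry', 'first'), ('end', 'entry', 'second'), ('end', 'log', None)]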
1458
1459Exceptions
1460^^^^^^^^^^
1461
1462.. class:: ParseError
1463
1464   XML parse error, raised by the various parsing methods in this module when
1465   parsing fails.  The string representation of an instance of this exception
1466   will contain a user-friendly error message.  In addition, it will have
1467   the following attributes available:
1468
1469   .. attribute:: code
1470
1471      A numeric error code from the expat parser. See the documentation of
1472      :mod:`xml.parsers.expat` for the list of error codes and their meanings.
1473
1474   .. attribute:: position
1475
1476      A tuple of *line*, *column* numbers, specifying where the error occurred.
1477
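   A sketch of inspecting a parse failure.  The malformed input is an
   assumption chosen to trigger a tag mismatch; the exact message text and
   column number come from the underlying parser:

   .. code-block:: python

      import xml.etree.ElementTree as ET
      from xml.parsers import expat

      error = None
      try:
          ET.fromstring("<a><b></a>")   # </a> does not match the open <b>
      except ET.ParseError as exc:
          error = exc

      print(error)                          # human-readable message
      print(expat.ErrorString(error.code))  # mismatched tag
      line, column = error.position
      print(line)                           # 1
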
1478.. rubric:: Footnotes
1479
1480.. [1] The encoding string included in XML output should conform to the
1481   appropriate standards.  For example, "UTF-8" is valid, but "UTF8" is
1482   not.  See https://www.w3.org/TR/2006/REC-xml11-20060816/#NT-EncodingDecl
1483   and https://www.iana.org/assignments/character-sets/character-sets.xhtml.
1484