:mod:`xml.sax` --- Support for SAX2 parsers
===========================================

.. module:: xml.sax
   :synopsis: Package containing SAX2 base classes and convenience functions.

.. moduleauthor:: Lars Marius Garshol <larsga@garshol.priv.no>
.. sectionauthor:: Fred L. Drake, Jr. <fdrake@acm.org>
.. sectionauthor:: Martin v. Löwis <martin@v.loewis.de>

**Source code:** :source:`Lib/xml/sax/__init__.py`

--------------

The :mod:`xml.sax` package provides a number of modules which implement the
Simple API for XML (SAX) interface for Python.  The package itself provides the
SAX exceptions and the convenience functions that users of the SAX API will
need most often.


.. warning::

   The :mod:`xml.sax` module is not secure against maliciously
   constructed data.  If you need to parse untrusted or unauthenticated data, see
   :ref:`xml-vulnerabilities`.

.. versionchanged:: 3.7.1

   The SAX parser no longer processes general external entities by default
   to increase security. Previously, the parser created network connections
   to fetch remote files or loaded local files from the file
   system for DTDs and entities. The feature can be enabled again by calling
   :meth:`~xml.sax.xmlreader.XMLReader.setFeature` on the parser object
   with the argument :data:`~xml.sax.handler.feature_external_ges`.

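As a minimal sketch of re-enabling the feature, assuming the default
expat-based reader and input that is fully trusted, the flag can be inspected
with ``getFeature`` and switched on before parsing starts:

```python
import xml.sax
from xml.sax.handler import feature_external_ges

parser = xml.sax.make_parser()

# Since Python 3.7.1 the flag is off by default.
default_state = parser.getFeature(feature_external_ges)

# Opt back in. Only do this for documents from a trusted source.
parser.setFeature(feature_external_ges, True)
enabled_state = parser.getFeature(feature_external_ges)
```

The feature has to be set before parsing begins; the expat-based reader
refuses feature changes while a parse is in progress.
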
The convenience functions are:


.. function:: make_parser(parser_list=[])

   Create and return a SAX :class:`~xml.sax.xmlreader.XMLReader` object.  The
   first parser found will be used.  If *parser_list* is provided, it must be
   an iterable of strings which name modules that have a function named
   :func:`create_parser`.  Modules listed in *parser_list* will be used before
   modules in the default list of parsers.

   .. versionchanged:: 3.8
      The *parser_list* argument can be any iterable, not just a list.

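The following sketch shows both ways of calling ``make_parser()``.  It assumes
a CPython installation, where ``xml.sax.expatreader`` is the driver module
that provides ``create_parser()``:

```python
import xml.sax
from xml.sax.xmlreader import XMLReader

# Use the default list of parser modules.
default_parser = xml.sax.make_parser()

# Name a driver module explicitly; it is tried before the defaults.
explicit_parser = xml.sax.make_parser(["xml.sax.expatreader"])

# Both objects implement the XMLReader interface.
both_are_readers = (
    isinstance(default_parser, XMLReader)
    and isinstance(explicit_parser, XMLReader)
)
```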

.. function:: parse(filename_or_stream, handler, error_handler=handler.ErrorHandler())

   Create a SAX parser and use it to parse a document.  The document, passed in as
   *filename_or_stream*, can be a filename or a file object.  The *handler*
   parameter needs to be a SAX :class:`~handler.ContentHandler` instance.  If
   *error_handler* is given, it must be a SAX :class:`~handler.ErrorHandler`
   instance; if omitted, :exc:`SAXParseException` will be raised on all errors.
   There is no return value; all work must be done by the *handler* passed in.

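Because ``parse()`` returns nothing, results have to be collected on the
handler.  A minimal sketch, using a hypothetical ``ElementCounter`` handler
and an in-memory byte stream in place of a real file:

```python
import io
import xml.sax
from xml.sax.handler import ContentHandler


class ElementCounter(ContentHandler):
    """Hypothetical handler that counts how often each element name occurs."""

    def __init__(self):
        super().__init__()
        self.counts = {}

    def startElement(self, name, attrs):
        self.counts[name] = self.counts.get(name, 0) + 1


document = io.BytesIO(b"<catalog><book/><book/><cd/></catalog>")
counter = ElementCounter()

# parse() has no return value; the handler holds the result afterwards.
result = xml.sax.parse(document, counter)
```

After the call, ``counter.counts`` maps each element name to the number of
times its start tag was seen.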

.. function:: parseString(string, handler, error_handler=handler.ErrorHandler())

   Similar to :func:`parse`, but parses from a buffer *string* received as a
   parameter.  *string* must be a :class:`str` instance or a
   :term:`bytes-like object`.

   .. versionchanged:: 3.5
      Added support for :class:`str` instances.

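A short sketch of ``parseString()`` with both accepted input types.  The
``TextCollector`` handler is a hypothetical helper written for this example:

```python
import xml.sax
from xml.sax.handler import ContentHandler


class TextCollector(ContentHandler):
    """Hypothetical handler that gathers all character data."""

    def __init__(self):
        super().__init__()
        self.chunks = []

    def characters(self, content):
        # Character data may arrive in several pieces, so collect them all.
        self.chunks.append(content)

    @property
    def text(self):
        return "".join(self.chunks)


from_str = TextCollector()
xml.sax.parseString("<greeting>hello</greeting>", from_str)

from_bytes = TextCollector()
xml.sax.parseString(b"<greeting>hello</greeting>", from_bytes)
```
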
A typical SAX application uses three kinds of objects: readers, handlers and
input sources.  "Reader" in this context is another term for parser, i.e. some
piece of code that reads the bytes or characters from the input source, and
produces a sequence of events. The events then get distributed to the handler
objects, i.e. the reader invokes a method on the handler.  A SAX application
must therefore obtain a reader object, create or open the input sources, create
the handlers, and connect all of these objects together.  As the final step of
preparation, the reader is called to parse the input. During parsing, methods on
the handler objects are called based on structural and syntactic events from the
input data.

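The steps above can be sketched as follows.  ``EventLogger`` is a hypothetical
handler that only records the order in which the reader calls its methods:

```python
import io
import xml.sax
from xml.sax.handler import ContentHandler
from xml.sax.xmlreader import InputSource


class EventLogger(ContentHandler):
    """Hypothetical handler that records each event it receives."""

    def __init__(self):
        super().__init__()
        self.events = []

    def startDocument(self):
        self.events.append("startDocument")

    def startElement(self, name, attrs):
        self.events.append("start:" + name)

    def endElement(self, name):
        self.events.append("end:" + name)

    def endDocument(self):
        self.events.append("endDocument")


# 1. Obtain a reader.
reader = xml.sax.make_parser()

# 2. Create the input source.
source = InputSource()
source.setByteStream(io.BytesIO(b"<a><b/></a>"))

# 3. Create the handler and connect it to the reader.
handler = EventLogger()
reader.setContentHandler(handler)

# 4. Ask the reader to parse; it calls back into the handler.
reader.parse(source)
```
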
For these objects, only the interfaces are relevant; they are normally not
instantiated by the application itself.  Since Python does not have an explicit
notion of interface, they are formally introduced as classes, but applications
may use implementations which do not inherit from the provided classes.  The
:class:`~xml.sax.xmlreader.InputSource`, :class:`~xml.sax.xmlreader.Locator`,
:class:`~xml.sax.xmlreader.Attributes`, :class:`~xml.sax.xmlreader.AttributesNS`,
and :class:`~xml.sax.xmlreader.XMLReader` interfaces are defined in the
module :mod:`xml.sax.xmlreader`.  The handler interfaces are defined in
:mod:`xml.sax.handler`.  For convenience,
:class:`~xml.sax.xmlreader.InputSource` (which is often instantiated directly)
and the handler classes are also available from :mod:`xml.sax`.  These
interfaces are described below.

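As a small illustration of the convenience names, the sketch below checks that
the classes reachable from ``xml.sax`` are the same objects as those in the
submodules, then instantiates ``InputSource`` directly:

```python
import io
import xml.sax
import xml.sax.handler
import xml.sax.xmlreader

# The package-level names refer to the classes defined in the submodules.
same_input_source = xml.sax.InputSource is xml.sax.xmlreader.InputSource
same_content_handler = xml.sax.ContentHandler is xml.sax.handler.ContentHandler
same_error_handler = xml.sax.ErrorHandler is xml.sax.handler.ErrorHandler

# InputSource is one of the few interface classes applications create directly.
source = xml.sax.InputSource()
source.setByteStream(io.BytesIO(b"<root/>"))

# The base ContentHandler ignores every event, so this only checks that
# the document is well-formed.
xml.sax.parse(source, xml.sax.ContentHandler())
```
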
In addition to these classes, :mod:`xml.sax` provides the following exception
classes.


.. exception:: SAXException(msg, exception=None)

   Encapsulate an XML error or warning.  This class can contain basic error or
   warning information from either the XML parser or the application: it can be
   subclassed to provide additional functionality or to add localization.  Note
   that although the handlers defined in the
   :class:`~xml.sax.handler.ErrorHandler` interface
   receive instances of this exception, it is not required to actually raise the
   exception; it is also useful as a container for information.

   When instantiated, *msg* should be a human-readable description of the error.
   The optional *exception* parameter, if given, should be ``None`` or an exception
   that was caught by the parsing code and is being passed along as information.

   This is the base class for the other SAX exception classes.

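The "container for information" use can be sketched as below.
``CollectingErrorHandler`` is a hypothetical error handler that stores the
warnings it receives; the exception is created and handed over without ever
being raised:

```python
from xml.sax import SAXException
from xml.sax.handler import ErrorHandler


class CollectingErrorHandler(ErrorHandler):
    """Hypothetical handler that stores warnings instead of printing them."""

    def __init__(self):
        self.warnings = []

    def warning(self, exception):
        self.warnings.append(exception)


error_handler = CollectingErrorHandler()

# The exception object only carries the message; nothing is raised here.
error_handler.warning(SAXException("element <font> is deprecated"))

messages = [w.getMessage() for w in error_handler.warnings]
```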

.. exception:: SAXParseException(msg, exception, locator)

   Subclass of :exc:`SAXException` raised on parse errors. Instances of this
   class are passed to the methods of the SAX
   :class:`~xml.sax.handler.ErrorHandler` interface to provide information
   about the parse error.  This class supports the SAX
   :class:`~xml.sax.xmlreader.Locator` interface as well as the
   :class:`SAXException` interface.

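A minimal sketch of catching a parse error and reading its position through
the ``Locator`` methods.  The malformed document is made up for the example,
and the exact message text depends on the underlying parser:

```python
import xml.sax
from xml.sax.handler import ContentHandler

# <item> is never closed, so the closing </root> tag on line 3 is an error.
malformed = "<root>\n  <item>\n</root>"

try:
    xml.sax.parseString(malformed, ContentHandler())
except xml.sax.SAXParseException as err:
    # Copy what is needed; the name bound by "as" is cleared after the block.
    line = err.getLineNumber()
    column = err.getColumnNumber()
    message = err.getMessage()
    is_sax_exception = isinstance(err, xml.sax.SAXException)
```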

.. exception:: SAXNotRecognizedException(msg, exception=None)

   Subclass of :exc:`SAXException` raised when a SAX
   :class:`~xml.sax.xmlreader.XMLReader` is
   confronted with an unrecognized feature or property.  SAX applications and
   extensions may use this class for similar purposes.


.. exception:: SAXNotSupportedException(msg, exception=None)

   Subclass of :exc:`SAXException` raised when a SAX
   :class:`~xml.sax.xmlreader.XMLReader` is asked to
   enable a feature that is not supported, or to set a property to a value that the
   implementation does not support.  SAX applications and extensions may use this
   class for similar purposes.

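The difference between the two exceptions can be sketched as follows.  This
assumes the default expat-based reader, which recognizes the validation
feature but cannot enable it; the feature URI in the first call is a
made-up name:

```python
import xml.sax
from xml.sax import SAXNotRecognizedException, SAXNotSupportedException
from xml.sax.handler import feature_validation

parser = xml.sax.make_parser()

# A feature name the reader has never heard of (hypothetical URI).
try:
    parser.getFeature("http://example.org/features/hypothetical")
    unknown_feature_outcome = "recognized"
except SAXNotRecognizedException:
    unknown_feature_outcome = "not recognized"

# A known feature that the expat-based reader cannot turn on.
try:
    parser.setFeature(feature_validation, True)
    validation_outcome = "enabled"
except SAXNotSupportedException:
    validation_outcome = "not supported"
```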

.. seealso::

   `SAX: The Simple API for XML <http://www.saxproject.org/>`_
      This site is the focal point for the definition of the SAX API.  It provides a
      Java implementation and online documentation.  Links to implementations and
      historical information are also available.

   Module :mod:`xml.sax.handler`
      Definitions of the interfaces for application-provided objects.

   Module :mod:`xml.sax.saxutils`
      Convenience functions for use in SAX applications.

   Module :mod:`xml.sax.xmlreader`
      Definitions of the interfaces for parser-provided objects.


.. _sax-exception-objects:

SAXException Objects
--------------------

The :class:`SAXException` exception class supports the following methods:


.. method:: SAXException.getMessage()

   Return a human-readable message describing the error condition.


.. method:: SAXException.getException()

   Return an encapsulated exception object, or ``None``.

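A brief sketch of both methods.  ``parse_price`` is a hypothetical helper that
wraps a lower-level error in a ``SAXException``:

```python
from xml.sax import SAXException


def parse_price(text):
    """Hypothetical helper: convert text to a float or raise SAXException."""
    try:
        return float(text)
    except ValueError as exc:
        # Pass the original error along as the encapsulated exception.
        raise SAXException(f"invalid price: {text!r}", exc) from exc


try:
    parse_price("abc")
except SAXException as err:
    message = err.getMessage()
    cause = err.getException()

# Without the optional argument, getException() returns None.
plain = SAXException("no underlying cause")
plain_cause = plain.getException()
```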