1Metadata-Version: 1.2
2Name: defusedxml
3Version: 0.6.0
4Summary: XML bomb protection for Python stdlib modules
5Home-page: https://github.com/tiran/defusedxml
6Author: Christian Heimes
7Author-email: christian@python.org
8Maintainer: Christian Heimes
9Maintainer-email: christian@python.org
10License: PSFL
11Download-URL: https://pypi.python.org/pypi/defusedxml
12Description: ===================================================
13        defusedxml -- defusing XML bombs and other exploits
14        ===================================================
15
16        .. image:: https://img.shields.io/pypi/v/defusedxml.svg
17            :target: https://pypi.org/project/defusedxml/
18            :alt: Latest Version
19
20        .. image:: https://img.shields.io/pypi/pyversions/defusedxml.svg
21            :target: https://pypi.org/project/defusedxml/
22            :alt: Supported Python versions
23
24        .. image:: https://travis-ci.org/tiran/defusedxml.svg?branch=master
25            :target: https://travis-ci.org/tiran/defusedxml
26            :alt: Travis CI
27
28        .. image:: https://codecov.io/github/tiran/defusedxml/coverage.svg?branch=master
29            :target: https://codecov.io/github/tiran/defusedxml?branch=master
30            :alt: codecov
31
32        .. image:: https://img.shields.io/pypi/dm/defusedxml.svg
33            :target: https://pypistats.org/packages/defusedxml
34            :alt: PyPI downloads
35
36        .. image:: https://img.shields.io/badge/code%20style-black-000000.svg
37            :target: https://github.com/ambv/black
38            :alt: Code style: black
39
40        ..
41
42            "It's just XML, what could probably go wrong?"
43
44        Christian Heimes <christian@python.org>
45
46        Synopsis
47        ========
48
49        The results of an attack on a vulnerable XML library can be fairly dramatic.
50        With just a few hundred **Bytes** of XML data an attacker can occupy several
51        **Gigabytes** of memory within **seconds**. An attacker can also keep
52        CPUs busy for a long time with a small to medium size request. Under some
53        circumstances it is even possible to access local files on your
54        server, to circumvent a firewall, or to abuse services to rebound attacks to
55        third parties.
56
57        The attacks use and abuse less common features of XML and its parsers. The
58        majority of developers are unacquainted with features such as processing
59        instructions and entity expansions that XML inherited from SGML. At best
60        they know about ``<!DOCTYPE>`` from experience with HTML but they are not
61        aware that a document type definition (DTD) can generate an HTTP request
62        or load a file from the file system.
63
64        None of the issues is new. They have been known for a long time. Billion
65        laughs was first reported in 2003. Nevertheless some XML libraries and
66        applications are still vulnerable and even heavy users of XML are
67        surprised by these features. It's hard to say whom to blame for the
68        situation. It's too short sighted to shift all blame on XML parsers and
69        XML libraries for using insecure default settings. After all they
        properly implement XML specifications. Application developers must not rely
        on a library being configured for security and potentially harmful data
        by default.
73
74
75        .. contents:: Table of Contents
76           :depth: 2
77
78
79        Attack vectors
80        ==============
81
82        billion laughs / exponential entity expansion
83        ---------------------------------------------
84
85        The `Billion Laughs`_ attack -- also known as exponential entity expansion --
86        uses multiple levels of nested entities. The original example uses 9 levels
87        of 10 expansions in each level to expand the string ``lol`` to a string of
88        3 * 10 :sup:`9` bytes, hence the name "billion laughs". The resulting string
89        occupies 3 GB (2.79 GiB) of memory; intermediate strings require additional
90        memory. Because most parsers don't cache the intermediate step for every
91        expansion it is repeated over and over again. It increases the CPU load even
92        more.
93
94        An XML document of just a few hundred bytes can disrupt all services on a
95        machine within seconds.
96
97        Example XML::
98
99            <!DOCTYPE xmlbomb [
100            <!ENTITY a "1234567890" >
101            <!ENTITY b "&a;&a;&a;&a;&a;&a;&a;&a;">
102            <!ENTITY c "&b;&b;&b;&b;&b;&b;&b;&b;">
103            <!ENTITY d "&c;&c;&c;&c;&c;&c;&c;&c;">
104            ]>
105            <bomb>&d;</bomb>
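
The growth is easy to check with a little arithmetic. The sketch below uses a
hypothetical helper, ``expanded_size()``, that is not part of any library. It
computes the length of the final string for a chain of nested entities in
which every level references the level below a fixed number of times.

```python
def expanded_size(base_len: int, fanout: int, levels: int) -> int:
    """Length of the fully expanded text for nested entities.

    base_len: length of the innermost replacement text
    fanout:   number of references each level makes to the level below
    levels:   number of nesting levels above the innermost entity
    """
    return base_len * fanout ** levels


# The classic attack: "lol" (3 bytes), 10 references per level, 9 levels.
classic = expanded_size(3, 10, 9)
print(classic, "bytes =", round(classic / 2**30, 2), "GiB")

# The reduced example above: entity a holds 10 characters; b, c and d
# each reference the previous entity 8 times.
print(expanded_size(10, 8, 3), "characters")
```

The reduced example is harmless on purpose; the classic parameters show why a
document of a few hundred bytes is enough to exhaust memory.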
106
107
108        quadratic blowup entity expansion
109        ---------------------------------
110
111        A quadratic blowup attack is similar to a `Billion Laughs`_ attack; it abuses
112        entity expansion, too. Instead of nested entities it repeats one large entity
113        with a couple of thousand chars over and over again. The attack isn't as
114        efficient as the exponential case but it avoids triggering countermeasures of
115        parsers against heavily nested entities. Some parsers limit the depth and
116        breadth of a single entity but not the total amount of expanded text
117        throughout an entire XML document.
118
119        A medium-sized XML document with a couple of hundred kilobytes can require a
120        couple of hundred MB to several GB of memory. When the attack is combined
121        with some level of nested expansion an attacker is able to achieve a higher
122        ratio of success.
123
124        ::
125
126            <!DOCTYPE bomb [
127            <!ENTITY a "xxxxxxx... a couple of ten thousand chars">
128            ]>
129            <bomb>&a;&a;&a;... repeat</bomb>
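
A rough size estimate illustrates the ratio. The helper name
``quadratic_blowup()`` and the figures of 50,000 characters and 50,000
references are assumptions chosen for illustration, not measurements.

```python
def quadratic_blowup(entity_len: int, references: int) -> tuple:
    """Return (approximate document size, expanded size) in characters."""
    reference_len = len("&a;")
    # The document carries the entity text once plus one short reference
    # per repetition; DOCTYPE and tag overhead are ignored here.
    document_size = entity_len + references * reference_len
    # Every reference is replaced by the full entity text.
    expanded_size = entity_len * references
    return document_size, expanded_size


doc, expanded = quadratic_blowup(entity_len=50_000, references=50_000)
print(f"document: ~{doc / 1000:.0f} kB, expanded: ~{expanded / 1e9:.1f} GB")
```

Because the entity is not nested, a limit on nesting depth alone does not
catch this case; only a limit on the total expanded size does.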
130
131
132        external entity expansion (remote)
133        ----------------------------------
134
135        Entity declarations can contain more than just text for replacement. They can
136        also point to external resources by public identifiers or system identifiers.
137        System identifiers are standard URIs. When the URI is a URL (e.g. a
        ``http://`` locator) some parsers download the resource from the remote
        location and embed it into the XML document verbatim.
140
141        Simple example of a parsed external entity::
142
143            <!DOCTYPE external [
144            <!ENTITY ee SYSTEM "http://www.python.org/some.xml">
145            ]>
146            <root>&ee;</root>
147
148        The case of parsed external entities works only for valid XML content. The
149        XML standard also supports unparsed external entities with a
150        ``NData declaration``.
151
152        External entity expansion opens the door to plenty of exploits. An attacker
153        can abuse a vulnerable XML library and application to rebound and forward
        network requests with the IP address of the server. Which exploits are
        possible depends heavily on the parser and the application. For
        example:
157
158        * An attacker can circumvent firewalls and gain access to restricted
159          resources as all the requests are made from an internal and trustworthy
160          IP address, not from the outside.
161        * An attacker can abuse a service to attack, spy on or DoS your servers but
162          also third party services. The attack is disguised with the IP address of
163          the server and the attacker is able to utilize the high bandwidth of a big
164          machine.
165        * An attacker can exhaust additional resources on the machine, e.g. with
166          requests to a service that doesn't respond or responds with very large
167          files.
        * An attacker may learn when, how often, and from which IP address
          an XML document is accessed.
170        * An attacker could send mail from inside your network if the URL handler
171          supports ``smtp://`` URIs.
172
173
174        external entity expansion (local file)
175        --------------------------------------
176
177        External entities with references to local files are a sub-case of external
178        entity expansion. It's listed as an extra attack because it deserves extra
179        attention. Some XML libraries such as lxml disable network access by default
180        but still allow entity expansion with local file access by default. Local
181        files are either referenced with a ``file://`` URL or by a file path (either
182        relative or absolute).
183
184        An attacker may be able to access and download all files that can be read by
185        the application process. This may include critical configuration files, too.
186
187        ::
188
189            <!DOCTYPE external [
190            <!ENTITY ee SYSTEM "file:///PATH/TO/simple.xml">
191            ]>
192            <root>&ee;</root>
193
194
195        DTD retrieval
196        -------------
197
198        This case is similar to external entity expansion, too. Some XML libraries
199        like Python's xml.dom.pulldom retrieve document type definitions from remote
200        or local locations. Several attack scenarios from the external entity case
201        apply to this issue as well.
202
203        ::
204
205            <?xml version="1.0" encoding="utf-8"?>
206            <!DOCTYPE html PUBLIC "-//W3C//DTD XHTML 1.0 Transitional//EN"
207              "http://www.w3.org/TR/xhtml1/DTD/xhtml1-transitional.dtd">
208            <html>
209                <head/>
210                <body>text</body>
211            </html>
212
213
214        Python XML Libraries
215        ====================
216
217        .. csv-table:: vulnerabilities and features
218           :header: "kind", "sax", "etree", "minidom", "pulldom", "xmlrpc", "lxml", "genshi"
219           :widths: 24, 7, 8, 8, 7, 8, 8, 8
220           :stub-columns: 0
221
222           "billion laughs", "**True**", "**True**", "**True**", "**True**", "**True**", "False (1)", "False (5)"
223           "quadratic blowup", "**True**", "**True**", "**True**", "**True**", "**True**", "**True**", "False (5)"
           "external entity expansion (remote)", "**True**", "False (3)", "False (4)", "**True**", "False", "False (1)", "False (5)"
           "external entity expansion (local file)", "**True**", "False (3)", "False (4)", "**True**", "False", "**True**", "False (5)"
           "DTD retrieval", "**True**", "False", "False", "**True**", "False", "False (1)", "False"
227           "gzip bomb", "False", "False", "False", "False", "**True**", "**partly** (2)", "False"
228           "xpath support (7)", "False", "False", "False", "False", "False", "**True**", "False"
229           "xsl(t) support (7)", "False", "False", "False", "False", "False", "**True**", "False"
230           "xinclude support (7)", "False", "**True** (6)", "False", "False", "False", "**True** (6)", "**True**"
231           "C library", "expat", "expat", "expat", "expat", "expat", "libxml2", "expat"
232
233        1. Lxml is protected against billion laughs attacks and doesn't do network
234           lookups by default.
235        2. libxml2 and lxml are not directly vulnerable to gzip decompression bombs
236           but they don't protect you against them either.
        3. xml.etree doesn't expand entities and raises a ParseError when an entity
           occurs.
239        4. minidom doesn't expand entities and simply returns the unexpanded entity
240           verbatim.
241        5. genshi.input of genshi 0.6 doesn't support entity expansion and raises a
242           ParserError when an entity occurs.
243        6. Library has (limited) XInclude support but requires an additional step to
244           process inclusion.
245        7. These are features but they may introduce exploitable holes, see
246           `Other things to consider`_
247
248
249        Settings in standard library
250        ----------------------------
251
252
253        xml.sax.handler Features
254        ........................
255
256        feature_external_ges (http://xml.org/sax/features/external-general-entities)
257          disables external entity expansion
258
259        feature_external_pes (http://xml.org/sax/features/external-parameter-entities)
260          the option is ignored and doesn't modify any functionality
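
As a minimal sketch, the first feature can be switched off explicitly on a
stdlib SAX parser before parsing. The class ``TextCollector`` and the function
``parse_text_without_external_entities()`` are hypothetical names used only
for this example. Note that this addresses external entities only; it does
not protect against entity expansion bombs.

```python
import io
import xml.sax
from xml.sax.handler import ContentHandler, feature_external_ges


class TextCollector(ContentHandler):
    """Collect all character data reported by the parser."""

    def __init__(self):
        super().__init__()
        self.chunks = []

    def characters(self, content):
        self.chunks.append(content)


def parse_text_without_external_entities(data: bytes) -> str:
    parser = xml.sax.make_parser()
    # Explicitly disable resolution of external general entities.
    # Features must be set before parsing starts.
    parser.setFeature(feature_external_ges, False)
    handler = TextCollector()
    parser.setContentHandler(handler)
    parser.parse(io.BytesIO(data))
    return "".join(handler.chunks)
```

Python 3.7.1 and later already leave this feature disabled by default;
setting it explicitly documents the intent and covers older interpreters.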
261
262        DOM xml.dom.xmlbuilder.Options
263        ..............................
264
265        external_parameter_entities
266          ignored
267
268        external_general_entities
269          ignored
270
271        external_dtd_subset
272          ignored
273
274        entities
275          unsure
276
277
278        defusedxml
279        ==========
280
281        The `defusedxml package`_ (`defusedxml on PyPI`_)
282        contains several Python-only workarounds and fixes
283        for denial of service and other vulnerabilities in Python's XML libraries.
284        In order to benefit from the protection you just have to import and use the
285        listed functions / classes from the right defusedxml module instead of the
        original module. Only `defusedxml.xmlrpc`_ is implemented as a monkey patch.
287
288        Instead of::
289
290           >>> from xml.etree.ElementTree import parse
291           >>> et = parse(xmlfile)
292
293        alter code to::
294
295           >>> from defusedxml.ElementTree import parse
296           >>> et = parse(xmlfile)
297
298        Additionally the package has an **untested** function to monkey patch
299        all stdlib modules with ``defusedxml.defuse_stdlib()``.
300
301        All functions and parser classes accept three additional keyword arguments.
302        They return either the same objects as the original functions or compatible
303        subclasses.
304
305        forbid_dtd (default: False)
          disallow XML with a ``<!DOCTYPE>`` declaration and raise a
          *DTDForbidden* exception when a DTD declaration is found.
308
309        forbid_entities (default: True)
310          disallow XML with ``<!ENTITY>`` declarations inside the DTD and raise an
311          *EntitiesForbidden* exception when an entity is declared.
312
313        forbid_external (default: True)
          disallow any access to remote or local resources in external entities
          or DTD and raise an *ExternalReferenceForbidden* exception when a DTD
          or entity references an external resource.
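
The general idea behind such checks can be sketched with the stdlib expat
bindings alone. This is an illustration of the technique, not defusedxml's
actual implementation; ``EntityDeclared`` and ``parse_rejecting_entities()``
are hypothetical names. The sketch hooks expat's ``EntityDeclHandler`` and
aborts parsing as soon as any entity is declared, before a reference to it
can be expanded.

```python
import xml.parsers.expat


class EntityDeclared(ValueError):
    """Hypothetical exception for this sketch (not a defusedxml class)."""


def parse_rejecting_entities(data: bytes) -> None:
    """Parse `data` with expat and fail as soon as an entity is declared."""
    parser = xml.parsers.expat.ParserCreate()

    def on_entity_decl(name, is_parameter_entity, value, base,
                       system_id, public_id, notation_name):
        # External entities carry a system identifier and no inline value.
        kind = "external" if system_id is not None else "internal"
        raise EntityDeclared(f"{kind} entity {name!r} declared")

    parser.EntityDeclHandler = on_entity_decl
    # An exception raised inside a handler propagates out of Parse().
    parser.Parse(data, True)
```

A document without a DTD parses normally, while both the quadratic blowup
example and the external entity example shown earlier are rejected at the
declaration.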
317
318
319        defusedxml (package)
320        --------------------
321
322        DefusedXmlException, DTDForbidden, EntitiesForbidden,
323        ExternalReferenceForbidden, NotSupportedError
324
325        defuse_stdlib() (*experimental*)
326
327
328        defusedxml.cElementTree
329        -----------------------
330
331        parse(), iterparse(), fromstring(), XMLParser
332
333
334        defusedxml.ElementTree
335        -----------------------
336
337        parse(), iterparse(), fromstring(), XMLParser
338
339
340        defusedxml.expatreader
341        ----------------------
342
343        create_parser(), DefusedExpatParser
344
345
346        defusedxml.sax
347        --------------
348
349        parse(), parseString(), make_parser()
350
351
352        defusedxml.expatbuilder
353        -----------------------
354
355        parse(), parseString(), DefusedExpatBuilder, DefusedExpatBuilderNS
356
357
358        defusedxml.minidom
359        ------------------
360
361        parse(), parseString()
362
363
364        defusedxml.pulldom
365        ------------------
366
367        parse(), parseString()
368
369
370        defusedxml.xmlrpc
371        -----------------
372
        The fix is implemented as a monkey patch for the stdlib's xmlrpc package (3.x)
374        or xmlrpclib module (2.x). The function `monkey_patch()` enables the fixes,
375        `unmonkey_patch()` removes the patch and puts the code in its former state.
376
377        The monkey patch protects against XML related attacks as well as
378        decompression bombs and excessively large requests or responses. The default
379        setting is 30 MB for requests, responses and gzip decompression. You can
380        modify the default by changing the module variable `MAX_DATA`. A value of
381        `-1` disables the limit.
382
383
384        defusedxml.lxml
385        ---------------
386
387        **DEPRECATED** The module is deprecated and will be removed in a future
388        release.
389
        The module serves as an *example* of how you could protect code that uses
391        lxml.etree. It implements a custom Element class that filters out
392        Entity instances, a custom parser factory and a thread local storage for
393        parser instances. It also has a check_docinfo() function which inspects
394        a tree for internal or external DTDs and entity declarations. In order to
395        check for entities lxml > 3.0 is required.
396
397        parse(), fromstring()
398        RestrictedElement, GlobalParserTLS, getDefaultParser(), check_docinfo()
399
400
401        defusedexpat
402        ============
403
404        The `defusedexpat package`_ (`defusedexpat on PyPI`_)
405        comes with binary extensions and a
406        `modified expat`_ library instead of the standard `expat parser`_. It's
407        basically a stand-alone version of the patches for Python's standard
408        library C extensions.
409
410        Modifications in expat
411        ----------------------
412
413        new definitions::
414
415          XML_BOMB_PROTECTION
416          XML_DEFAULT_MAX_ENTITY_INDIRECTIONS
417          XML_DEFAULT_MAX_ENTITY_EXPANSIONS
418          XML_DEFAULT_RESET_DTD
419
420        new XML_FeatureEnum members::
421
422          XML_FEATURE_MAX_ENTITY_INDIRECTIONS
423          XML_FEATURE_MAX_ENTITY_EXPANSIONS
424          XML_FEATURE_IGNORE_DTD
425
426        new XML_Error members::
427
428          XML_ERROR_ENTITY_INDIRECTIONS
429          XML_ERROR_ENTITY_EXPANSION
430
431        new API functions::
432
433          int XML_GetFeature(XML_Parser parser,
434                             enum XML_FeatureEnum feature,
435                             long *value);
436          int XML_SetFeature(XML_Parser parser,
437                             enum XML_FeatureEnum feature,
438                             long value);
439          int XML_GetFeatureDefault(enum XML_FeatureEnum feature,
440                                    long *value);
441          int XML_SetFeatureDefault(enum XML_FeatureEnum feature,
442                                    long value);
443
444        XML_FEATURE_MAX_ENTITY_INDIRECTIONS
           Limit the number of indirections that are allowed to occur during the
446           expansion of a nested entity. A counter starts when an entity reference
447           is encountered. It resets after the entity is fully expanded. The limit
448           protects the parser against exponential entity expansion attacks (aka
449           billion laughs attack). When the limit is exceeded the parser stops and
450           fails with `XML_ERROR_ENTITY_INDIRECTIONS`.
451           A value of 0 disables the protection.
452
453           Supported range
454             0 .. UINT_MAX
455           Default
456             40
457
458        XML_FEATURE_MAX_ENTITY_EXPANSIONS
459           Limit the total length of all entity expansions throughout the entire
460           document. The lengths of all entities are accumulated in a parser variable.
461           The setting protects against quadratic blowup attacks (lots of expansions
462           of a large entity declaration). When the sum of all entities exceeds
463           the limit, the parser stops and fails with `XML_ERROR_ENTITY_EXPANSION`.
464           A value of 0 disables the protection.
465
466           Supported range
467             0 .. UINT_MAX
468           Default
469             8 MiB
470
471        XML_FEATURE_RESET_DTD
           Reset all DTD information after the <!DOCTYPE> block has been parsed. When
           the flag is set (default: false), all DTD information is discarded after
           the endDoctypeDeclHandler has been called. The flag can be set inside the
475           endDoctypeDeclHandler. Without DTD information any entity reference in
476           the document body leads to `XML_ERROR_UNDEFINED_ENTITY`.
477
478           Supported range
479             0, 1
480           Default
481             0
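
The two entity limits can be illustrated with a small pure-Python simulation.
It is not the C implementation: it interprets the indirection counter as a
nesting depth, which is an assumption, and it does not model the "0 disables
the protection" rule. ``expand()`` and ``EntityLimitExceeded`` are
hypothetical names.

```python
import re

ENTITY_REF = re.compile(r"&(\w+);")


class EntityLimitExceeded(Exception):
    """Hypothetical error type for this sketch."""


def expand(text, entities, max_indirections=40,
           max_expansions=8 * 1024 * 1024):
    """Expand entity references in `text` while enforcing both limits."""
    total = 0  # accumulated length of all replacement texts

    def _expand(fragment, depth):
        nonlocal total
        if depth > max_indirections:
            raise EntityLimitExceeded("too many entity indirections")
        parts = []
        pos = 0
        for match in ENTITY_REF.finditer(fragment):
            parts.append(fragment[pos:match.start()])
            value = entities[match.group(1)]
            # Every replacement text counts against the document total.
            total += len(value)
            if total > max_expansions:
                raise EntityLimitExceeded("entity expansion limit exceeded")
            parts.append(_expand(value, depth + 1))
            pos = match.end()
        parts.append(fragment[pos:])
        return "".join(parts)

    return _expand(text, 0)
```

With the default limits the reduced billion laughs example from above expands
normally, while a small ``max_expansions`` or ``max_indirections`` value makes
the same input fail early.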
482
483
484        How to avoid XML vulnerabilities
485        ================================
486
487        Best practices
488        --------------
489
490        * Don't allow DTDs
491        * Don't expand entities
492        * Don't resolve externals
493        * Limit parse depth
494        * Limit total input size
495        * Limit parse time
496        * Favor a SAX or iterparse-like parser for potential large data
497        * Validate and properly quote arguments to XSL transformations and
498          XPath queries
        * Don't use XPath expressions from untrusted sources
        * Don't apply XSL transformations that come from untrusted sources
501
502        (based on Brad Hill's `Attacking XML Security`_)
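
Two of these points, limiting the total input size and favoring an
iterparse-like parser, can be combined in a short sketch. ``LimitedReader``
and ``count_tags()`` are hypothetical names. The example uses the stdlib
``iterparse()`` to stay self-contained; for untrusted input the
``iterparse()`` from ``defusedxml.ElementTree`` would be the safer choice,
because a size limit on the input does nothing against entity expansion.

```python
import io
import xml.etree.ElementTree as ET


class LimitedReader:
    """File-like wrapper that refuses to deliver more than `max_bytes`."""

    def __init__(self, raw, max_bytes):
        self._raw = raw
        self._remaining = max_bytes

    def read(self, size=-1):
        chunk = self._raw.read(size)
        self._remaining -= len(chunk)
        if self._remaining < 0:
            raise ValueError("XML input exceeds size limit")
        return chunk


def count_tags(stream, max_bytes):
    """Count element tags incrementally without keeping the whole tree."""
    counts = {}
    reader = LimitedReader(stream, max_bytes)
    for _event, elem in ET.iterparse(reader, events=("end",)):
        counts[elem.tag] = counts.get(elem.tag, 0) + 1
        elem.clear()  # drop children, text and attributes already handled
    return counts


data = b"<root>" + b"<item/>" * 100 + b"</root>"
print(count_tags(io.BytesIO(data), max_bytes=10_000))
```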
503
504
505        Other things to consider
506        ========================
507
        XML, XML parsers and processing libraries have more features and possible
        issues that could lead to DoS vulnerabilities or security exploits in
510        applications. I have compiled an incomplete list of theoretical issues that
511        need further research and more attention. The list is deliberately pessimistic
512        and a bit paranoid, too. It contains things that might go wrong under daffy
513        circumstances.
514
515
516        attribute blowup / hash collision attack
517        ----------------------------------------
518
519        XML parsers may use an algorithm with quadratic runtime O(n :sup:`2`) to
520        handle attributes and namespaces. If it uses hash tables (dictionaries) to
521        store attributes and namespaces the implementation may be vulnerable to
522        hash collision attacks, thus reducing the performance to O(n :sup:`2`) again.
523        In either case an attacker is able to forge a denial of service attack with
524        an XML document that contains thousands upon thousands of attributes in
525        a single node.
526
527        I haven't researched yet if expat, pyexpat or libxml2 are vulnerable.
528
529
530        decompression bomb
531        ------------------
532
        Decompression bombs (aka `ZIP bomb`_) affect all XML libraries
        that can parse compressed XML streams such as gzipped HTTP streams or
        LZMA-compressed files. Compression lets an attacker reduce the amount of
        transmitted data by three orders of magnitude or more. Gzip is able to
        compress 1 GiB of zeros to roughly 1 MB, lzma is even better::
538
539            $ dd if=/dev/zero bs=1M count=1024 | gzip > zeros.gz
540            $ dd if=/dev/zero bs=1M count=1024 | lzma -z > zeros.xy
541            $ ls -sh zeros.*
542            1020K zeros.gz
543             148K zeros.xy
544
545        None of Python's standard XML libraries decompress streams except for
        ``xmlrpclib``. The module is vulnerable to decompression bombs; see
        https://bugs.python.org/issue16043.
548
549        lxml can load and process compressed data through libxml2 transparently.
550        libxml2 can handle even very large blobs of compressed data efficiently
551        without using too much memory. But it doesn't protect applications from
552        decompression bombs. A carefully written SAX or iterparse-like approach can
553        be safe.
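
One way to guard against decompression bombs is to cap the amount of
decompressed data before it reaches the XML parser. The sketch below uses
only the stdlib ``gzip`` module. ``bounded_gunzip()`` is a hypothetical
helper, and the ``MAX_DATA`` value merely mirrors the 30 MB default mentioned
for ``defusedxml.xmlrpc``; it is not taken from that module's code.

```python
import gzip
import io

MAX_DATA = 30 * 1024 * 1024  # assumed limit in bytes for this sketch


def bounded_gunzip(payload: bytes, limit: int = MAX_DATA) -> bytes:
    """Decompress gzip data, refusing results larger than `limit` bytes."""
    with gzip.GzipFile(fileobj=io.BytesIO(payload)) as stream:
        # Ask for one byte more than allowed; getting it means the
        # decompressed stream is too large.
        data = stream.read(limit + 1)
    if len(data) > limit:
        raise ValueError("decompressed data exceeds limit")
    return data
```

Reading at most ``limit + 1`` bytes keeps memory use bounded even when the
compressed payload would expand to gigabytes.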
554
555
556        Processing Instruction
557        ----------------------
558
559        `PI`_'s like::
560
561          <?xml-stylesheet type="text/xsl" href="style.xsl"?>
562
        may pose additional threats for XML processing, depending on whether and
        how a processor handles processing instructions. The issue of URL retrieval
        with network or local file access applies to processing instructions, too.
566
567
568        Other DTD features
569        ------------------
570
571        `DTD`_ has more features like ``<!NOTATION>``. I haven't researched how
572        these features may be a security threat.
573
574
575        XPath
576        -----
577
578        XPath statements may introduce DoS vulnerabilities. Code should never execute
579        queries from untrusted sources. An attacker may also be able to create an XML
580        document that makes certain XPath queries costly or resource hungry.
581
582
583        XPath injection attacks
584        -----------------------
585
        XPath injection attacks work much like SQL injection attacks.
        Arguments to XPath queries must be quoted and validated properly, especially
        when they are taken from the user. The page `Avoid the dangers of XPath injection`_
        lists some ramifications of XPath injections.

        Python's standard library doesn't have XPath support. Lxml supports
        parameterized XPath queries which do proper quoting. You just have to use
        its xpath() method correctly::
594
595           # DON'T
596           >>> tree.xpath("/tag[@id='%s']" % value)
597
598           # instead do
           >>> tree.xpath("/tag[@id=$tagid]", tagid=value)
600
601
602        XInclude
603        --------
604
605        `XML Inclusion`_ is another way to load and include external files::
606
607           <root xmlns:xi="http://www.w3.org/2001/XInclude">
608             <xi:include href="filename.txt" parse="text" />
609           </root>
610
611        This feature should be disabled when XML files from an untrusted source are
612        processed. Some Python XML libraries and libxml2 support XInclude but don't
613        have an option to sandbox inclusion and limit it to allowed directories.
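
For the stdlib, a sandbox can be approximated by passing a custom loader to
``xml.etree.ElementInclude.include()``. The sketch below restricts inclusion
to one directory; ``make_sandboxed_loader()`` and ``include_sandboxed()`` are
hypothetical names. It covers path restriction only, and the stdlib parser
used here is not hardened against the entity attacks described above.

```python
import os
from xml.etree import ElementInclude, ElementTree


def make_sandboxed_loader(allowed_dir):
    """Return an ElementInclude loader restricted to `allowed_dir`."""
    root_dir = os.path.realpath(allowed_dir)

    def loader(href, parse, encoding=None):
        # Resolve symlinks and ".." before checking the location.
        target = os.path.realpath(os.path.join(root_dir, href))
        if os.path.commonpath([root_dir, target]) != root_dir:
            raise PermissionError(f"XInclude target outside sandbox: {href!r}")
        if parse == "xml":
            with open(target, "rb") as handle:
                return ElementTree.parse(handle).getroot()
        with open(target, "r", encoding=encoding or "utf-8") as handle:
            return handle.read()

    return loader


def include_sandboxed(xml_text, allowed_dir):
    """Parse `xml_text` and process XInclude with the sandboxed loader."""
    root = ElementTree.fromstring(xml_text)
    ElementInclude.include(root, loader=make_sandboxed_loader(allowed_dir))
    return root
```

An ``href`` such as ``../secret.txt`` or an absolute path resolves outside the
allowed directory and is rejected before any file is opened.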
614
615
616        XMLSchema location
617        ------------------
618
619        A validating XML parser may download schema files from the information in a
620        ``xsi:schemaLocation`` attribute.
621
622        ::
623
624          <ead xmlns="urn:isbn:1-931666-22-9"
625               xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance"
626               xsi:schemaLocation="urn:isbn:1-931666-22-9 http://www.loc.gov/ead/ead.xsd">
627          </ead>
628
629
630        XSL Transformation
631        ------------------
632
633        You should keep in mind that XSLT is a Turing complete language. Never
        process XSLT code from unknown or untrusted sources! XSLT processors may
635        allow you to interact with external resources in ways you can't even imagine.
636        Some processors even support extensions that allow read/write access to file
637        system, access to JRE objects or scripting with Jython.
638
639        Example from `Attacking XML Security`_ for Xalan-J::
640
641            <xsl:stylesheet version="1.0"
642             xmlns:xsl="http://www.w3.org/1999/XSL/Transform"
643             xmlns:rt="http://xml.apache.org/xalan/java/java.lang.Runtime"
644             xmlns:ob="http://xml.apache.org/xalan/java/java.lang.Object"
645             exclude-result-prefixes= "rt ob">
646             <xsl:template match="/">
647               <xsl:variable name="runtimeObject" select="rt:getRuntime()"/>
648               <xsl:variable name="command"
649                 select="rt:exec($runtimeObject, &apos;c:\Windows\system32\cmd.exe&apos;)"/>
650               <xsl:variable name="commandAsString" select="ob:toString($command)"/>
651               <xsl:value-of select="$commandAsString"/>
652             </xsl:template>
653            </xsl:stylesheet>
654
655
656        Related CVEs
657        ============
658
659        CVE-2013-1664
660          Unrestricted entity expansion induces DoS vulnerabilities in Python XML
661          libraries (XML bomb)
662
663        CVE-2013-1665
664          External entity expansion in Python XML libraries inflicts potential
665          security flaws and DoS vulnerabilities
666
667
668        Other languages / frameworks
669        =============================
670
671        Several other programming languages and frameworks are vulnerable as well. A
672        couple of them are affected by the fact that libxml2 up to 2.9.0 has no
        protection against quadratic blowup attacks. Most of them have potentially
        dangerous default settings for entity expansion and external entities, too.
675
676        Perl
677        ----
678
679        Perl's XML::Simple is vulnerable to quadratic entity expansion and external
680        entity expansion (both local and remote).
681
682
683        Ruby
684        ----
685
686        Ruby's REXML document parser is vulnerable to entity expansion attacks
687        (both quadratic and exponential) but it doesn't do external entity
688        expansion by default. In order to counteract entity expansion you have to
689        disable the feature::
690
691          REXML::Document.entity_expansion_limit = 0
692
693        libxml-ruby and hpricot don't expand entities in their default configuration.
694
695
696        PHP
697        ---
698
699        PHP's SimpleXML API is vulnerable to quadratic entity expansion and loads
700        entities from local and remote resources. The option ``LIBXML_NONET`` disables
701        network access but still allows local file access. ``LIBXML_NOENT`` seems to
702        have no effect on entity expansion in PHP 5.4.6.
703
704
705        C# / .NET / Mono
706        ----------------
707
        Information in `XML DoS and Defenses (MSDN)`_ suggests that .NET is
        vulnerable with its default settings. The article contains code snippets
        that show how to create a secure XML reader::
711
712          XmlReaderSettings settings = new XmlReaderSettings();
713          settings.ProhibitDtd = false;
714          settings.MaxCharactersFromEntities = 1024;
715          settings.XmlResolver = null;
716          XmlReader reader = XmlReader.Create(stream, settings);
717
718
719        Java
720        ----
721
        Untested. The documentation of Xerces and its `Xerces SecurityMananger`_
        suggests that Xerces is also vulnerable to billion laughs attacks with its
724        default settings. It also does entity resolving when an
725        ``org.xml.sax.EntityResolver`` is configured. I'm not yet sure about the
726        default setting here.
727
        Java specialists suggest using a custom builder factory::

          DocumentBuilderFactory builderFactory = DocumentBuilderFactory.newInstance();
          builderFactory.setXIncludeAware(false);
          builderFactory.setExpandEntityReferences(false);
          builderFactory.setFeature(XMLConstants.FEATURE_SECURE_PROCESSING, true);
          // either
          builderFactory.setFeature("http://apache.org/xml/features/disallow-doctype-decl", true);
          // or if you need DTDs
          builderFactory.setFeature("http://xml.org/sax/features/external-general-entities", false);
          builderFactory.setFeature("http://xml.org/sax/features/external-parameter-entities", false);
          builderFactory.setFeature("http://apache.org/xml/features/nonvalidating/load-external-dtd", false);
          builderFactory.setFeature("http://apache.org/xml/features/nonvalidating/load-dtd-grammar", false);
741
742
743        TODO
744        ====
745
746        * DOM: Use xml.dom.xmlbuilder options for entity handling
747        * SAX: take feature_external_ges and feature_external_pes (?) into account
748        * test experimental monkey patching of stdlib modules
749        * improve documentation
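
As an illustration of the SAX item above, the following sketch uses only the
standard library (not defusedxml) to switch off ``feature_external_ges`` and
``feature_external_pes`` on a stock parser. The helper names
``make_restricted_parser``, ``TextCollector`` and ``parse_text`` are made up
for this example. It does not show how defusedxml handles, or will handle,
these features.

```python
import io
import xml.sax
from xml.sax.handler import (
    ContentHandler,
    feature_external_ges,
    feature_external_pes,
)


def make_restricted_parser():
    """Return a stdlib SAX parser with external entity loading switched off."""
    parser = xml.sax.make_parser()
    # Do not load external general entities
    # (e.g. <!ENTITY x SYSTEM "file:///...">).
    parser.setFeature(feature_external_ges, False)
    # Do not load external parameter entities. Setting the feature to False
    # is accepted by the expat-based reader and makes the intent explicit.
    parser.setFeature(feature_external_pes, False)
    return parser


class TextCollector(ContentHandler):
    """Hypothetical handler that collects character data."""

    def __init__(self):
        ContentHandler.__init__(self)
        self.chunks = []

    def characters(self, content):
        self.chunks.append(content)


def parse_text(xml_bytes):
    """Parse ``xml_bytes`` with the restricted parser and return its text."""
    parser = make_restricted_parser()
    handler = TextCollector()
    parser.setContentHandler(handler)
    parser.parse(io.BytesIO(xml_bytes))
    return "".join(handler.chunks)
```

Calling ``parse_text(b"<a>hi</a>")`` returns the character data of the
document. The two features only cover external entities, so this sketch does
not protect against entity expansion attacks such as billion laughs.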
750
751
752        License
753        =======
754
755        Copyright (c) 2013-2017 by Christian Heimes <christian@python.org>
756
757        Licensed to PSF under a Contributor Agreement.
758
759        See https://www.python.org/psf/license for licensing details.
760
761
762        Acknowledgements
763        ================
764
765        Brett Cannon (Python Core developer)
766          review and code cleanup
767
768        Antoine Pitrou (Python Core developer)
769          code review
770
771        Aaron Patterson, Ben Murphy and Michael Koziarski (Ruby community)
772          Many thanks to Aaron, Ben and Michael from the Ruby community for their
773          report and assistance.
774
775        Thierry Carrez (OpenStack)
776          Many thanks to Thierry for his report to the Python Security Response
777          Team on behalf of the OpenStack security team.
778
779        Carl Meyer (Django)
780          Many thanks to Carl for his report to PSRT on behalf of the Django security
781          team.
782
783        Daniel Veillard (libxml2)
784          Many thanks to Daniel for his insight and assistance with libxml2.
785
786        semantics GmbH (https://www.semantics.de/)
787          Many thanks to my employer semantics for letting me work on the issue
788          during working hours as part of semantics's open source initiative.
789
790
791        References
792        ==========
793
794        * `XML DoS and Defenses (MSDN)`_
795        * `Billion Laughs`_ on Wikipedia
796        * `ZIP bomb`_ on Wikipedia
797        * `Configure SAX parsers for secure processing`_
798        * `Testing for XML Injection`_
799
800        .. _defusedxml package: https://github.com/tiran/defusedxml
801        .. _defusedxml on PyPI: https://pypi.python.org/pypi/defusedxml
802        .. _defusedexpat package: https://github.com/tiran/defusedexpat
803        .. _defusedexpat on PyPI: https://pypi.python.org/pypi/defusedexpat
804        .. _modified expat: https://github.com/tiran/expat
805        .. _expat parser: http://expat.sourceforge.net/
806        .. _Attacking XML Security: https://www.isecpartners.com/media/12976/iSEC-HILL-Attacking-XML-Security-bh07.pdf
807        .. _Billion Laughs: https://en.wikipedia.org/wiki/Billion_laughs
808        .. _XML DoS and Defenses (MSDN): https://msdn.microsoft.com/en-us/magazine/ee335713.aspx
809        .. _ZIP bomb: https://en.wikipedia.org/wiki/Zip_bomb
810        .. _DTD: https://en.wikipedia.org/wiki/Document_Type_Definition
811        .. _PI: https://en.wikipedia.org/wiki/Processing_Instruction
812        .. _Avoid the dangers of XPath injection: http://www.ibm.com/developerworks/xml/library/x-xpathinjection/index.html
813        .. _Configure SAX parsers for secure processing: http://www.ibm.com/developerworks/xml/library/x-tipcfsx/index.html
814        .. _Testing for XML Injection: https://www.owasp.org/index.php/Testing_for_XML_Injection_(OWASP-DV-008)
815        .. _Xerces SecurityMananger: https://xerces.apache.org/xerces2-j/javadocs/xerces2/org/apache/xerces/util/SecurityManager.html
816        .. _XML Inclusion: https://www.w3.org/TR/xinclude/#include_element
817
818        Changelog
819        =========
820
821        defusedxml 0.6.0
822        ----------------
823
824        *Release date: 17-Apr-2019*
825
826        - Increase test coverage.
827        - Add badges to README.
828
829
830        defusedxml 0.6.0rc1
831        -------------------
832
833        *Release date: 14-Apr-2019*
834
835        - Test on Python 3.7 stable and 3.8-dev
836        - Drop support for Python 3.4
        - No longer pass *html* argument to XMLParser. It has been deprecated and
          ignored for a long time. The DefusedXMLParser still takes an html argument.
          A deprecation warning is issued when the argument is False and a TypeError
          is raised when it's True.
        - defusedxml now fails early when the pyexpat stdlib module is not available
          or broken.
        - defusedxml.ElementTree.__all__ now lists ParseError as a public attribute.
844        - The defusedxml.ElementTree and defusedxml.cElementTree modules had a typo
845          and used XMLParse instead of XMLParser as an alias for DefusedXMLParser.
846          Both the old and fixed name are now available.
847
848
849        defusedxml 0.5.0
850        ----------------
851
852        *Release date: 07-Feb-2017*
853
854        - No changes
855
856
857        defusedxml 0.5.0.rc1
858        --------------------
859
860        *Release date: 28-Jan-2017*
861
862        - Add compatibility with Python 3.6
863        - Drop support for Python 2.6, 3.1, 3.2, 3.3
864        - Fix lxml tests (XMLSyntaxError: Detected an entity reference loop)
865
866
867        defusedxml 0.4.1
868        ----------------
869
870        *Release date: 28-Mar-2013*
871
872        - Add more demo exploits, e.g. python_external.py and Xalan XSLT demos.
873        - Improved documentation.
874
875
876        defusedxml 0.4
877        --------------
878
879        *Release date: 25-Feb-2013*
880
881        - As per http://seclists.org/oss-sec/2013/q1/340 please REJECT
882          CVE-2013-0278, CVE-2013-0279 and CVE-2013-0280 and use CVE-2013-1664,
883          CVE-2013-1665 for OpenStack/etc.
884        - Add missing parser_list argument to sax.make_parser(). The argument is
885          ignored, though. (thanks to Florian Apolloner)
886        - Add demo exploit for external entity attack on Python's SAX parser, XML-RPC
887          and WebDAV.
888
889
890        defusedxml 0.3
891        --------------
892
893        *Release date: 19-Feb-2013*
894
895        - Improve documentation
896
897
898        defusedxml 0.2
899        --------------
900
901        *Release date: 15-Feb-2013*
902
903        - Rename ExternalEntitiesForbidden to ExternalReferenceForbidden
904        - Rename defusedxml.lxml.check_dtd() to check_docinfo()
905        - Unify argument names in callbacks
906        - Add arguments and formatted representation to exceptions
907        - Add forbid_external argument to all functions and classes
908        - More tests
909        - LOTS of documentation
910        - Add example code for other languages (Ruby, Perl, PHP) and parsers (Genshi)
911        - Add protection against XML and gzip attacks to xmlrpclib
912
913        defusedxml 0.1
914        --------------
915
916        *Release date: 08-Feb-2013*
917
918        - Initial and internal release for PSRT review
919
920Keywords: xml bomb DoS
921Platform: all
922Classifier: Development Status :: 5 - Production/Stable
923Classifier: Intended Audience :: Developers
924Classifier: License :: OSI Approved :: Python Software Foundation License
925Classifier: Natural Language :: English
926Classifier: Programming Language :: Python
927Classifier: Programming Language :: Python :: 2
928Classifier: Programming Language :: Python :: 2.7
929Classifier: Programming Language :: Python :: 3
930Classifier: Programming Language :: Python :: 3.5
931Classifier: Programming Language :: Python :: 3.6
932Classifier: Programming Language :: Python :: 3.7
933Classifier: Programming Language :: Python :: 3.8
934Classifier: Topic :: Text Processing :: Markup :: XML
935Requires-Python: >=2.7, !=3.0.*, !=3.1.*, !=3.2.*, !=3.3.*, !=3.4.*
936