Metadata-Version: 1.2
Name: defusedxml
Version: 0.6.0
Summary: XML bomb protection for Python stdlib modules
Home-page: https://github.com/tiran/defusedxml
Author: Christian Heimes
Author-email: christian@python.org
Maintainer: Christian Heimes
Maintainer-email: christian@python.org
License: PSFL
Download-URL: https://pypi.python.org/pypi/defusedxml
Description: ===================================================
        defusedxml -- defusing XML bombs and other exploits
        ===================================================
        
        .. image:: https://img.shields.io/pypi/v/defusedxml.svg
            :target: https://pypi.org/project/defusedxml/
            :alt: Latest Version
        
        .. image:: https://img.shields.io/pypi/pyversions/defusedxml.svg
            :target: https://pypi.org/project/defusedxml/
            :alt: Supported Python versions
        
        .. image:: https://travis-ci.org/tiran/defusedxml.svg?branch=master
            :target: https://travis-ci.org/tiran/defusedxml
            :alt: Travis CI
        
        .. image:: https://codecov.io/github/tiran/defusedxml/coverage.svg?branch=master
            :target: https://codecov.io/github/tiran/defusedxml?branch=master
            :alt: codecov
        
        .. image:: https://img.shields.io/pypi/dm/defusedxml.svg
            :target: https://pypistats.org/packages/defusedxml
            :alt: PyPI downloads
        
        .. image:: https://img.shields.io/badge/code%20style-black-000000.svg
            :target: https://github.com/ambv/black
            :alt: Code style: black
        
        ..
        
            "It's just XML, what could probably go wrong?"
        
        Christian Heimes <christian@python.org>
        
        Synopsis
        ========
        
        The results of an attack on a vulnerable XML library can be fairly dramatic.
        With just a few hundred **Bytes** of XML data an attacker can occupy several
        **Gigabytes** of memory within **seconds**. An attacker can also keep
        CPUs busy for a long time with a small to medium size request.
        Under some
        circumstances it is even possible to access local files on your
        server, to circumvent a firewall, or to abuse services to rebound attacks to
        third parties.
        
        The attacks use and abuse less common features of XML and its parsers. The
        majority of developers are unacquainted with features such as processing
        instructions and entity expansions that XML inherited from SGML. At best
        they know about ``<!DOCTYPE>`` from experience with HTML but they are not
        aware that a document type definition (DTD) can generate an HTTP request
        or load a file from the file system.
        
        None of the issues is new. They have been known for a long time. Billion
        laughs was first reported in 2003. Nevertheless some XML libraries and
        applications are still vulnerable and even heavy users of XML are
        surprised by these features. It's hard to say whom to blame for the
        situation. It's too short-sighted to shift all the blame onto XML parsers
        and XML libraries for using insecure default settings. After all they
        properly implement XML specifications. Application developers must not rely
        on a library always being configured for security and for potentially
        harmful data by default.
        
        
        .. contents:: Table of Contents
           :depth: 2
        
        
        Attack vectors
        ==============
        
        billion laughs / exponential entity expansion
        ---------------------------------------------
        
        The `Billion Laughs`_ attack -- also known as exponential entity expansion --
        uses multiple levels of nested entities. The original example uses 9 levels
        of 10 expansions in each level to expand the string ``lol`` to a string of
        3 * 10 :sup:`9` bytes, hence the name "billion laughs". The resulting string
        occupies 3 GB (2.79 GiB) of memory; intermediate strings require additional
        memory. Because most parsers don't cache the intermediate step for every
        expansion it is repeated over and over again. It increases the CPU load even
        more.
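
The size arithmetic above can be checked without ever expanding anything. The following is a minimal sketch, not part of defusedxml: the helper ``expanded_length`` and the entity names ``lol0`` .. ``lol9`` are hypothetical, and it assumes the entity definitions are not recursive (which XML forbids anyway).

```python
import re
from functools import lru_cache

# Matches a general entity reference such as "&a;" (simplified name syntax).
ENTITY_REF = re.compile(r"&(\w+);")


def expanded_length(entities, name):
    """Return the length of entity `name` after full expansion.

    `entities` maps entity names to their replacement text. Only lengths
    are computed, so no large string is ever built in memory.
    """

    @lru_cache(maxsize=None)
    def length(entity_name):
        text = entities[entity_name]
        literal = len(ENTITY_REF.sub("", text))  # characters that are not references
        return literal + sum(length(ref) for ref in ENTITY_REF.findall(text))

    return length(name)


# The classic layout: "lol" expanded through 9 levels of 10 references each.
billion_laughs = {"lol0": "lol"}
for level in range(1, 10):
    billion_laughs["lol%d" % level] = ("&lol%d;" % (level - 1)) * 10

total_bytes = expanded_length(billion_laughs, "lol9")
total_gib = total_bytes / 2 ** 30

# A small four-level bomb with eight references per level.
small_bomb = {
    "a": "1234567890",
    "b": "&a;" * 8,
    "c": "&b;" * 8,
    "d": "&c;" * 8,
}
small_total = expanded_length(small_bomb, "d")
```

Computing lengths with memoization is a deliberate choice here: the parser's cost grows exponentially with the nesting depth, while this check stays linear in the size of the declarations.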
        
        An XML document of just a few hundred bytes can disrupt all services on a
        machine within seconds.
        
        Example XML::
        
            <!DOCTYPE xmlbomb [
            <!ENTITY a "1234567890" >
            <!ENTITY b "&a;&a;&a;&a;&a;&a;&a;&a;">
            <!ENTITY c "&b;&b;&b;&b;&b;&b;&b;&b;">
            <!ENTITY d "&c;&c;&c;&c;&c;&c;&c;&c;">
            ]>
            <bomb>&d;</bomb>
        
        
        quadratic blowup entity expansion
        ---------------------------------
        
        A quadratic blowup attack is similar to a `Billion Laughs`_ attack; it abuses
        entity expansion, too. Instead of nested entities it repeats one large entity
        with a couple of thousand chars over and over again. The attack isn't as
        efficient as the exponential case but it avoids triggering countermeasures of
        parsers against heavily nested entities. Some parsers limit the depth and
        breadth of a single entity but not the total amount of expanded text
        throughout an entire XML document.
        
        A medium-sized XML document with a couple of hundred kilobytes can require a
        couple of hundred MB to several GB of memory. When the attack is combined
        with some level of nested expansion an attacker is able to achieve a higher
        ratio of success.
        
        ::
        
            <!DOCTYPE bomb [
            <!ENTITY a "xxxxxxx... a couple of ten thousand chars">
            ]>
            <bomb>&a;&a;&a;... repeat</bomb>
        
        
        external entity expansion (remote)
        ----------------------------------
        
        Entity declarations can contain more than just text for replacement. They can
        also point to external resources by public identifiers or system identifiers.
        System identifiers are standard URIs. When the URI is a URL (e.g. a
        ``http://`` locator) some parsers download the resource from the remote
        location and embed it into the XML document verbatim.
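
How a system identifier reaches application code can be sketched with the standard library's expat binding. This is a minimal illustration, not defusedxml's implementation: ``ExternalReferenceRefused`` and ``parse_refusing_externals`` are hypothetical names, and the sketch only covers references to external general entities in the document body.

```python
import xml.parsers.expat


class ExternalReferenceRefused(ValueError):
    """Raised when a document references an external entity (hypothetical name)."""


def parse_refusing_externals(data):
    """Parse `data` (bytes) and refuse any reference to an external entity."""
    parser = xml.parsers.expat.ParserCreate()

    def refuse(context, base, system_id, public_id):
        # expat hands over the identifiers instead of fetching anything itself;
        # raising here aborts the parse before any resource is touched.
        raise ExternalReferenceRefused(system_id)

    parser.ExternalEntityRefHandler = refuse
    parser.Parse(data, True)


DOCUMENT = b"""<!DOCTYPE external [
<!ENTITY ee SYSTEM "http://www.python.org/some.xml">
]>
<root>&ee;</root>"""

try:
    parse_refusing_externals(DOCUMENT)
except ExternalReferenceRefused as exc:
    print("refused external reference to", exc)
```

The handler receives the system identifier exactly as declared, so an application could also log it or compare it against an allow list instead of refusing everything.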
        
        Simple example of a parsed external entity::
        
            <!DOCTYPE external [
            <!ENTITY ee SYSTEM "http://www.python.org/some.xml">
            ]>
            <root>&ee;</root>
        
        The case of parsed external entities works only for valid XML content. The
        XML standard also supports unparsed external entities with a
        ``NData declaration``.
        
        External entity expansion opens the door to plenty of exploits. An attacker
        can abuse a vulnerable XML library and application to rebound and forward
        network requests with the IP address of the server. Which kind of exploit
        is possible depends heavily on the parser and the application. For
        example:
        
        * An attacker can circumvent firewalls and gain access to restricted
          resources as all the requests are made from an internal and trustworthy
          IP address, not from the outside.
        * An attacker can abuse a service to attack, spy on or DoS your servers but
          also third party services. The attack is disguised with the IP address of
          the server and the attacker is able to utilize the high bandwidth of a big
          machine.
        * An attacker can exhaust additional resources on the machine, e.g. with
          requests to a service that doesn't respond or responds with very large
          files.
        * An attacker may learn when, how often and from which IP address
          an XML document is accessed.
        * An attacker could send mail from inside your network if the URL handler
          supports ``smtp://`` URIs.
        
        
        external entity expansion (local file)
        --------------------------------------
        
        External entities with references to local files are a sub-case of external
        entity expansion. It's listed as an extra attack because it deserves extra
        attention. Some XML libraries such as lxml disable network access by default
        but still allow entity expansion with local file access by default.
        Local
        files are either referenced with a ``file://`` URL or by a file path (either
        relative or absolute).
        
        An attacker may be able to access and download all files that can be read by
        the application process. This may include critical configuration files, too.
        
        ::
        
            <!DOCTYPE external [
            <!ENTITY ee SYSTEM "file:///PATH/TO/simple.xml">
            ]>
            <root>&ee;</root>
        
        
        DTD retrieval
        -------------
        
        This case is similar to external entity expansion, too. Some XML libraries
        like Python's xml.dom.pulldom retrieve document type definitions from remote
        or local locations. Several attack scenarios from the external entity case
        apply to this issue as well.
        
        ::
        
            <?xml version="1.0" encoding="utf-8"?>
            <!DOCTYPE html PUBLIC "-//W3C//DTD XHTML 1.0 Transitional//EN"
              "http://www.w3.org/TR/xhtml1/DTD/xhtml1-transitional.dtd">
            <html>
                <head/>
                <body>text</body>
            </html>
        
        
        Python XML Libraries
        ====================
        
        .. csv-table:: vulnerabilities and features
           :header: "kind", "sax", "etree", "minidom", "pulldom", "xmlrpc", "lxml", "genshi"
           :widths: 24, 7, 8, 8, 7, 8, 8, 8
           :stub-columns: 0
        
           "billion laughs", "**True**", "**True**", "**True**", "**True**", "**True**", "False (1)", "False (5)"
           "quadratic blowup", "**True**", "**True**", "**True**", "**True**", "**True**", "**True**", "False (5)"
           "external entity expansion (remote)", "**True**", "False (3)", "False (4)", "**True**", "False", "False (1)", "False (5)"
           "external entity expansion (local file)", "**True**", "False (3)", "False (4)", "**True**", "False", "**True**", "False (5)"
           "DTD retrieval", "**True**", "False", "False", "**True**", "False", "False (1)", "False"
           "gzip bomb", "False", "False", "False", "False", "**True**", "**partly** (2)", "False"
           "xpath support (7)", "False", "False", "False", "False", "False", "**True**", "False"
           "xsl(t) support (7)", "False", "False", "False", "False", "False", "**True**", "False"
           "xinclude support (7)", "False", "**True** (6)", "False", "False", "False", "**True** (6)", "**True**"
           "C library", "expat", "expat", "expat", "expat", "expat", "libxml2", "expat"
        
        1. Lxml is protected against billion laughs attacks and doesn't do network
           lookups by default.
        2. libxml2 and lxml are not directly vulnerable to gzip decompression bombs
           but they don't protect you against them either.
        3. xml.etree doesn't expand entities and raises a ParseError when an entity
           occurs.
        4. minidom doesn't expand entities and simply returns the unexpanded entity
           verbatim.
        5. genshi.input of genshi 0.6 doesn't support entity expansion and raises a
           ParseError when an entity occurs.
        6. Library has (limited) XInclude support but requires an additional step to
           process inclusion.
        7. These are features but they may introduce exploitable holes, see
           `Other things to consider`_
        
        
        Settings in standard library
        ----------------------------
        
        
        xml.sax.handler Features
        ........................
        
        feature_external_ges (http://xml.org/sax/features/external-general-entities)
          disables external entity expansion
        
        feature_external_pes (http://xml.org/sax/features/external-parameter-entities)
          the option is ignored and doesn't modify any functionality
        
        DOM xml.dom.xmlbuilder.Options
        ..............................
        
        external_parameter_entities
          ignored
        
        external_general_entities
          ignored
        
        external_dtd_subset
          ignored
        
        entities
          unsure
        
        
        defusedxml
        ==========
        
        The `defusedxml package`_ (`defusedxml on PyPI`_)
        contains several Python-only workarounds and fixes
        for denial of service and other vulnerabilities in Python's XML libraries.
        In order to benefit from the protection you just have to import and use the
        listed functions / classes from the right defusedxml module instead of the
        original module. Only `defusedxml.xmlrpc`_ is implemented as a monkey patch.
        
        Instead of::
        
           >>> from xml.etree.ElementTree import parse
           >>> et = parse(xmlfile)
        
        alter code to::
        
           >>> from defusedxml.ElementTree import parse
           >>> et = parse(xmlfile)
        
        Additionally the package has an **untested** function to monkey patch
        all stdlib modules with ``defusedxml.defuse_stdlib()``.
        
        All functions and parser classes accept three additional keyword arguments.
        They return either the same objects as the original functions or compatible
        subclasses.
        
        forbid_dtd (default: False)
          disallow XML with a ``<!DOCTYPE>`` declaration and raise a
          *DTDForbidden* exception when a DTD declaration is found.
        
        forbid_entities (default: True)
          disallow XML with ``<!ENTITY>`` declarations inside the DTD and raise an
          *EntitiesForbidden* exception when an entity is declared.
        
        forbid_external (default: True)
          disallow any access to remote or local resources in external entities
          or DTD and raise an *ExternalReferenceForbidden* exception when a DTD
          or entity references an external resource.
        
        
        defusedxml (package)
        --------------------
        
        DefusedXmlException, DTDForbidden, EntitiesForbidden,
        ExternalReferenceForbidden, NotSupportedError
        
        defuse_stdlib() (*experimental*)
        
        
        defusedxml.cElementTree
        -----------------------
        
        parse(), iterparse(), fromstring(), XMLParser
        
        
        defusedxml.ElementTree
        -----------------------
        
        parse(), iterparse(), fromstring(), XMLParser
        
        
        defusedxml.expatreader
        ----------------------
        
        create_parser(), DefusedExpatParser
        
        
        defusedxml.sax
        --------------
        
        parse(), parseString(), make_parser()
        
        
        defusedxml.expatbuilder
        -----------------------
        
        parse(), parseString(), DefusedExpatBuilder, DefusedExpatBuilderNS
        
        
        defusedxml.minidom
        ------------------
        
        parse(), parseString()
        
        
        defusedxml.pulldom
        ------------------
        
        parse(), parseString()
        
        
        defusedxml.xmlrpc
        -----------------
        
        The fix is implemented as a monkey patch for the stdlib's xmlrpc package (3.x)
        or xmlrpclib module (2.x). The function `monkey_patch()` enables the fixes,
        `unmonkey_patch()` removes the patch and puts the code in its former state.
        
        The monkey patch protects against XML related attacks as well as
        decompression bombs and excessively large requests or responses. The default
        setting is 30 MB for requests, responses and gzip decompression. You can
        modify the default by changing the module variable `MAX_DATA`.
        A value of
        `-1` disables the limit.
        
        
        defusedxml.lxml
        ---------------
        
        **DEPRECATED** The module is deprecated and will be removed in a future
        release.
        
        The module serves as an *example* of how you could protect code that uses
        lxml.etree. It implements a custom Element class that filters out
        Entity instances, a custom parser factory and a thread local storage for
        parser instances. It also has a check_docinfo() function which inspects
        a tree for internal or external DTDs and entity declarations. In order to
        check for entities lxml > 3.0 is required.
        
        parse(), fromstring()
        RestrictedElement, GlobalParserTLS, getDefaultParser(), check_docinfo()
        
        
        defusedexpat
        ============
        
        The `defusedexpat package`_ (`defusedexpat on PyPI`_)
        comes with binary extensions and a
        `modified expat`_ library instead of the standard `expat parser`_. It's
        basically a stand-alone version of the patches for Python's standard
        library C extensions.
        
        Modifications in expat
        ----------------------
        
        new definitions::
        
          XML_BOMB_PROTECTION
          XML_DEFAULT_MAX_ENTITY_INDIRECTIONS
          XML_DEFAULT_MAX_ENTITY_EXPANSIONS
          XML_DEFAULT_RESET_DTD
        
        new XML_FeatureEnum members::
        
          XML_FEATURE_MAX_ENTITY_INDIRECTIONS
          XML_FEATURE_MAX_ENTITY_EXPANSIONS
          XML_FEATURE_IGNORE_DTD
        
        new XML_Error members::
        
          XML_ERROR_ENTITY_INDIRECTIONS
          XML_ERROR_ENTITY_EXPANSION
        
        new API functions::
        
          int XML_GetFeature(XML_Parser parser,
                             enum XML_FeatureEnum feature,
                             long *value);
          int XML_SetFeature(XML_Parser parser,
                             enum XML_FeatureEnum feature,
                             long value);
          int XML_GetFeatureDefault(enum XML_FeatureEnum feature,
                                    long *value);
          int XML_SetFeatureDefault(enum XML_FeatureEnum feature,
                                    long value);
        
        XML_FEATURE_MAX_ENTITY_INDIRECTIONS
           Limit the number of indirections that are allowed to occur during the
           expansion of a nested entity. A counter starts when an entity reference
           is encountered. It resets after the entity is fully expanded. The limit
           protects the parser against exponential entity expansion attacks (aka
           billion laughs attack). When the limit is exceeded the parser stops and
           fails with `XML_ERROR_ENTITY_INDIRECTIONS`.
           A value of 0 disables the protection.
        
           Supported range
             0 .. UINT_MAX
           Default
             40
        
        XML_FEATURE_MAX_ENTITY_EXPANSIONS
           Limit the total length of all entity expansions throughout the entire
           document. The lengths of all entities are accumulated in a parser variable.
           The setting protects against quadratic blowup attacks (lots of expansions
           of a large entity declaration). When the sum of all entities exceeds
           the limit, the parser stops and fails with `XML_ERROR_ENTITY_EXPANSION`.
           A value of 0 disables the protection.
        
           Supported range
             0 .. UINT_MAX
           Default
             8 MiB
        
        XML_FEATURE_RESET_DTD
           Reset all DTD information after the <!DOCTYPE> block has been parsed. When
           the flag is set (default: false) all DTD information is reset after the
           endDoctypeDeclHandler has been called. The flag can be set inside the
           endDoctypeDeclHandler. Without DTD information any entity reference in
           the document body leads to `XML_ERROR_UNDEFINED_ENTITY`.
        
           Supported range
             0, 1
           Default
             0
        
        
        How to avoid XML vulnerabilities
        ================================
        
        Best practices
        --------------
        
        * Don't allow DTDs
        * Don't expand entities
        * Don't resolve externals
        * Limit parse depth
        * Limit total input size
        * Limit parse time
        * Favor a SAX or iterparse-like parser for potentially large data
        * Validate and properly quote arguments to XSL transformations and
          XPath queries
        * Don't use XPath expressions from untrusted sources
        * Don't apply XSL transformations that come from untrusted sources
        
        (based on Brad Hill's `Attacking XML Security`_)
        
        
        Other things to consider
        ========================
        
        XML, XML parsers and processing libraries have more features and possible
        issues that could lead to DoS vulnerabilities or security exploits in
        applications. I have compiled an incomplete list of theoretical issues that
        need further research and more attention. The list is deliberately pessimistic
        and a bit paranoid, too. It contains things that might go wrong under daffy
        circumstances.
        
        
        attribute blowup / hash collision attack
        ----------------------------------------
        
        XML parsers may use an algorithm with quadratic runtime O(n :sup:`2`) to
        handle attributes and namespaces. If it uses hash tables (dictionaries) to
        store attributes and namespaces the implementation may be vulnerable to
        hash collision attacks, thus reducing the performance to O(n :sup:`2`) again.
        In either case an attacker is able to forge a denial of service attack with
        an XML document that contains thousands upon thousands of attributes in
        a single node.
        
        I haven't researched yet if expat, pyexpat or libxml2 are vulnerable.
        
        
        decompression bomb
        ------------------
        
        The issue of decompression bombs (aka `ZIP bomb`_) applies to all XML libraries
        that can parse compressed XML streams like gzipped HTTP streams or LZMA-ed
        files. For an attacker it can reduce the amount of transmitted data by three
        orders of magnitude or more. Gzip is able to compress 1 GiB of zeros to
        roughly 1 MB, lzma is even better::
        
            $ dd if=/dev/zero bs=1M count=1024 | gzip > zeros.gz
            $ dd if=/dev/zero bs=1M count=1024 | lzma -z > zeros.xy
            $ ls -sh zeros.*
            1020K zeros.gz
             148K zeros.xy
        
        None of Python's standard XML libraries decompress streams except for
        ``xmlrpclib``. The module is vulnerable <https://bugs.python.org/issue16043>
        to decompression bombs.
        
        lxml can load and process compressed data through libxml2 transparently.
        libxml2 can handle even very large blobs of compressed data efficiently
        without using too much memory. But it doesn't protect applications from
        decompression bombs. A carefully written SAX or iterparse-like approach can
        be safe.
        
        
        Processing Instruction
        ----------------------
        
        `PI`_'s like::
        
          <?xml-stylesheet type="text/xsl" href="style.xsl"?>
        
        may pose additional threats for XML processing. It depends on whether and
        how a processor handles processing instructions. The issue of URL retrieval
        with network or local file access applies to processing instructions, too.
        
        
        Other DTD features
        ------------------
        
        `DTD`_ has more features like ``<!NOTATION>``. I haven't researched how
        these features may be a security threat.
        
        
        XPath
        -----
        
        XPath statements may introduce DoS vulnerabilities.
        Code should never execute
        queries from untrusted sources. An attacker may also be able to create an XML
        document that makes certain XPath queries costly or resource hungry.
        
        
        XPath injection attacks
        -----------------------
        
        XPath injection attacks pretty much work like SQL injection attacks.
        Arguments to XPath queries must be quoted and validated properly, especially
        when they are taken from the user. The page `Avoid the dangers of XPath injection`_
        lists some ramifications of XPath injections.
        
        Python's standard library doesn't have XPath support. Lxml supports
        parameterized XPath queries which do proper quoting. You just have to use
        its xpath() method correctly::
        
           # DON'T
           >>> tree.xpath("/tag[@id='%s']" % value)
        
           # instead do
           >>> tree.xpath("/tag[@id=$tagid]", tagid=value)
        
        
        XInclude
        --------
        
        `XML Inclusion`_ is another way to load and include external files::
        
           <root xmlns:xi="http://www.w3.org/2001/XInclude">
             <xi:include href="filename.txt" parse="text" />
           </root>
        
        This feature should be disabled when XML files from an untrusted source are
        processed. Some Python XML libraries and libxml2 support XInclude but don't
        have an option to sandbox inclusion and limit it to allowed directories.
        
        
        XMLSchema location
        ------------------
        
        A validating XML parser may download schema files from the information in a
        ``xsi:schemaLocation`` attribute.
        
        ::
        
          <ead xmlns="urn:isbn:1-931666-22-9"
               xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance"
               xsi:schemaLocation="urn:isbn:1-931666-22-9 http://www.loc.gov/ead/ead.xsd">
          </ead>
        
        
        XSL Transformation
        ------------------
        
        You should keep in mind that XSLT is a Turing complete language. Never
        process XSLT code from unknown or untrusted sources!
        XSLT processors may
        allow you to interact with external resources in ways you can't even imagine.
        Some processors even support extensions that allow read/write access to the
        file system, access to JRE objects or scripting with Jython.
        
        Example from `Attacking XML Security`_ for Xalan-J::
        
            <xsl:stylesheet version="1.0"
             xmlns:xsl="http://www.w3.org/1999/XSL/Transform"
             xmlns:rt="http://xml.apache.org/xalan/java/java.lang.Runtime"
             xmlns:ob="http://xml.apache.org/xalan/java/java.lang.Object"
             exclude-result-prefixes= "rt ob">
             <xsl:template match="/">
               <xsl:variable name="runtimeObject" select="rt:getRuntime()"/>
               <xsl:variable name="command"
                 select="rt:exec($runtimeObject, 'c:\Windows\system32\cmd.exe')"/>
               <xsl:variable name="commandAsString" select="ob:toString($command)"/>
               <xsl:value-of select="$commandAsString"/>
             </xsl:template>
            </xsl:stylesheet>
        
        
        Related CVEs
        ============
        
        CVE-2013-1664
          Unrestricted entity expansion induces DoS vulnerabilities in Python XML
          libraries (XML bomb)
        
        CVE-2013-1665
          External entity expansion in Python XML libraries inflicts potential
          security flaws and DoS vulnerabilities
        
        
        Other languages / frameworks
        =============================
        
        Several other programming languages and frameworks are vulnerable as well. A
        couple of them are affected by the fact that libxml2 up to 2.9.0 has no
        protection against quadratic blowup attacks. Most of them have potentially
        dangerous default settings for entity expansion and external entities, too.
        
        Perl
        ----
        
        Perl's XML::Simple is vulnerable to quadratic entity expansion and external
        entity expansion (both local and remote).
        
        
        Ruby
        ----
        
        Ruby's REXML document parser is vulnerable to entity expansion attacks
        (both quadratic and exponential) but it doesn't do external entity
        expansion by default.
        In order to counteract entity expansion you have to
        disable the feature::
        
          REXML::Document.entity_expansion_limit = 0
        
        libxml-ruby and hpricot don't expand entities in their default configuration.
        
        
        PHP
        ---
        
        PHP's SimpleXML API is vulnerable to quadratic entity expansion and loads
        entities from local and remote resources. The option ``LIBXML_NONET`` disables
        network access but still allows local file access. ``LIBXML_NOENT`` seems to
        have no effect on entity expansion in PHP 5.4.6.
        
        
        C# / .NET / Mono
        ----------------
        
        Information in `XML DoS and Defenses (MSDN)`_ suggests that .NET is
        vulnerable with its default settings. The article contains code snippets
        that show how to create a secure XML reader::
        
          XmlReaderSettings settings = new XmlReaderSettings();
          settings.ProhibitDtd = false;
          settings.MaxCharactersFromEntities = 1024;
          settings.XmlResolver = null;
          XmlReader reader = XmlReader.Create(stream, settings);
        
        
        Java
        ----
        
        Untested. The documentation of Xerces and its `Xerces SecurityMananger`_
        sounds like Xerces is also vulnerable to billion laughs attacks with its
        default settings. It also does entity resolving when an
        ``org.xml.sax.EntityResolver`` is configured. I'm not yet sure about the
        default setting here.
        
        Java specialists suggest using a custom builder factory::
        
          DocumentBuilderFactory builderFactory = DocumentBuilderFactory.newInstance();
          builderFactory.setXIncludeAware(false);
          builderFactory.setExpandEntityReferences(false);
          builderFactory.setFeature(XMLConstants.FEATURE_SECURE_PROCESSING, true);
          // either
          builderFactory.setFeature("http://apache.org/xml/features/disallow-doctype-decl", true);
          // or if you need DTDs
          builderFactory.setFeature("http://xml.org/sax/features/external-general-entities", false);
          builderFactory.setFeature("http://xml.org/sax/features/external-parameter-entities", false);
          builderFactory.setFeature("http://apache.org/xml/features/nonvalidating/load-external-dtd", false);
          builderFactory.setFeature("http://apache.org/xml/features/nonvalidating/load-dtd-grammar", false);
        
        
        TODO
        ====
        
        * DOM: Use xml.dom.xmlbuilder options for entity handling
        * SAX: take feature_external_ges and feature_external_pes (?) into account
        * test experimental monkey patching of stdlib modules
        * improve documentation
        
        
        License
        =======
        
        Copyright (c) 2013-2017 by Christian Heimes <christian@python.org>
        
        Licensed to PSF under a Contributor Agreement.
        
        See https://www.python.org/psf/license for licensing details.
        
        
        Acknowledgements
        ================
        
        Brett Cannon (Python Core developer)
          review and code cleanup
        
        Antoine Pitrou (Python Core developer)
          code review
        
        Aaron Patterson, Ben Murphy and Michael Koziarski (Ruby community)
          Many thanks to Aaron, Ben and Michael from the Ruby community for their
          report and assistance.
        
        Thierry Carrez (OpenStack)
          Many thanks to Thierry for his report to the Python Security Response
          Team on behalf of the OpenStack security team.
        
        Carl Meyer (Django)
          Many thanks to Carl for his report to PSRT on behalf of the Django security
          team.
        
        Daniel Veillard (libxml2)
          Many thanks to Daniel for his insight and assistance with libxml2.
        
        semantics GmbH (https://www.semantics.de/)
          Many thanks to my employer semantics for letting me work on the issue
          during working hours as part of semantics's open source initiative.
        
        
        References
        ==========
        
        * `XML DoS and Defenses (MSDN)`_
        * `Billion Laughs`_ on Wikipedia
        * `ZIP bomb`_ on Wikipedia
        * `Configure SAX parsers for secure processing`_
        * `Testing for XML Injection`_
        
        .. _defusedxml package: https://github.com/tiran/defusedxml
        .. _defusedxml on PyPI: https://pypi.python.org/pypi/defusedxml
        .. _defusedexpat package: https://github.com/tiran/defusedexpat
        .. _defusedexpat on PyPI: https://pypi.python.org/pypi/defusedexpat
        .. _modified expat: https://github.com/tiran/expat
        .. _expat parser: http://expat.sourceforge.net/
        .. _Attacking XML Security: https://www.isecpartners.com/media/12976/iSEC-HILL-Attacking-XML-Security-bh07.pdf
        .. _Billion Laughs: https://en.wikipedia.org/wiki/Billion_laughs
        .. _XML DoS and Defenses (MSDN): https://msdn.microsoft.com/en-us/magazine/ee335713.aspx
        .. _ZIP bomb: https://en.wikipedia.org/wiki/Zip_bomb
        .. _DTD: https://en.wikipedia.org/wiki/Document_Type_Definition
        .. _PI: https://en.wikipedia.org/wiki/Processing_Instruction
        .. _Avoid the dangers of XPath injection: http://www.ibm.com/developerworks/xml/library/x-xpathinjection/index.html
        .. _Configure SAX parsers for secure processing: http://www.ibm.com/developerworks/xml/library/x-tipcfsx/index.html
        .. _Testing for XML Injection: https://www.owasp.org/index.php/Testing_for_XML_Injection_(OWASP-DV-008)
        .. _Xerces SecurityMananger: https://xerces.apache.org/xerces2-j/javadocs/xerces2/org/apache/xerces/util/SecurityManager.html
        .. _XML Inclusion: https://www.w3.org/TR/xinclude/#include_element
        
        Changelog
        =========
        
        defusedxml 0.6.0
        ----------------
        
        *Release date: 17-Apr-2019*
        
        - Increase test coverage.
        - Add badges to README.
        
        
        defusedxml 0.6.0rc1
        -------------------
        
        *Release date: 14-Apr-2019*
        
        - Test on Python 3.7 stable and 3.8-dev
        - Drop support for Python 3.4
        - No longer pass *html* argument to XMLParse. It has been deprecated and
          ignored for a long time. The DefusedXMLParser still takes a html argument.
          A deprecation warning is issued when the argument is False and a TypeError
          when it's True.
        - defusedxml now fails early when pyexpat stdlib module is not available or
          broken.
        - defusedxml.ElementTree.__all__ now lists ParseError as public attribute.
        - The defusedxml.ElementTree and defusedxml.cElementTree modules had a typo
          and used XMLParse instead of XMLParser as an alias for DefusedXMLParser.
          Both the old and fixed name are now available.
        
        
        defusedxml 0.5.0
        ----------------
        
        *Release date: 07-Feb-2017*
        
        - No changes
        
        
        defusedxml 0.5.0.rc1
        --------------------
        
        *Release date: 28-Jan-2017*
        
        - Add compatibility with Python 3.6
        - Drop support for Python 2.6, 3.1, 3.2, 3.3
        - Fix lxml tests (XMLSyntaxError: Detected an entity reference loop)
        
        
        defusedxml 0.4.1
        ----------------
        
        *Release date: 28-Mar-2013*
        
        - Add more demo exploits, e.g. python_external.py and Xalan XSLT demos.
        - Improved documentation.
        
        
        defusedxml 0.4
        --------------
        
        *Release date: 25-Feb-2013*
        
        - As per http://seclists.org/oss-sec/2013/q1/340 please REJECT
          CVE-2013-0278, CVE-2013-0279 and CVE-2013-0280 and use CVE-2013-1664,
          CVE-2013-1665 for OpenStack/etc.
        - Add missing parser_list argument to sax.make_parser(). The argument is
          ignored, though.
          (thanks to Florian Apolloner)
        - Add demo exploit for external entity attack on Python's SAX parser, XML-RPC
          and WebDAV.
        
        
        defusedxml 0.3
        --------------
        
        *Release date: 19-Feb-2013*
        
        - Improve documentation
        
        
        defusedxml 0.2
        --------------
        
        *Release date: 15-Feb-2013*
        
        - Rename ExternalEntitiesForbidden to ExternalReferenceForbidden
        - Rename defusedxml.lxml.check_dtd() to check_docinfo()
        - Unify argument names in callbacks
        - Add arguments and formatted representation to exceptions
        - Add forbid_external argument to all functions and classes
        - More tests
        - LOTS of documentation
        - Add example code for other languages (Ruby, Perl, PHP) and parsers (Genshi)
        - Add protection against XML and gzip attacks to xmlrpclib
        
        defusedxml 0.1
        --------------
        
        *Release date: 08-Feb-2013*
        
        - Initial and internal release for PSRT review
        
Keywords: xml bomb DoS
Platform: all
Classifier: Development Status :: 5 - Production/Stable
Classifier: Intended Audience :: Developers
Classifier: License :: OSI Approved :: Python Software Foundation License
Classifier: Natural Language :: English
Classifier: Programming Language :: Python
Classifier: Programming Language :: Python :: 2
Classifier: Programming Language :: Python :: 2.7
Classifier: Programming Language :: Python :: 3
Classifier: Programming Language :: Python :: 3.5
Classifier: Programming Language :: Python :: 3.6
Classifier: Programming Language :: Python :: 3.7
Classifier: Programming Language :: Python :: 3.8
Classifier: Topic :: Text Processing :: Markup :: XML
Requires-Python: >=2.7, !=3.0.*, !=3.1.*, !=3.2.*, !=3.3.*, !=3.4.*