1title:      Extensions API
2
3# Writing Extensions for Python-Markdown
4
5Python-Markdown includes an API for extension writers to plug their own custom functionality and syntax into the
6parser. An extension will patch into one or more stages of the parser:
7
8* [*Preprocessors*](#preprocessors) alter the source before it is passed to the parser.
9* [*Block Processors*](#blockprocessors) work with blocks of text separated by blank lines.
* [*Tree Processors*](#treeprocessors) modify the constructed ElementTree.
* [*Inline Processors*](#inlineprocessors) are common tree processors for inline elements, such as `**strong**`.
* [*Postprocessors*](#postprocessors) munge the output of the parser just before it is returned.
13
The parser loads text, applies the preprocessors, builds an [ElementTree][ElementTree] object using the block
processors and inline processors, renders the ElementTree object as Unicode text, and then applies the
postprocessors.
17
There are classes and helpers provided to ease writing your extension. Each part of the API is discussed in its
respective section below. Additionally, you can walk through the [Tutorial on Writing Extensions][tutorial] or look at
some of the [Available Extensions][] and their [source code][extension source]. As always, you may report bugs, ask
for help, and discuss various other issues on the [bug tracker].
22
23## Phases of processing {: #stages }
24
25### Preprocessors {: #preprocessors }
26
27Preprocessors munge the source text before it is passed to the Markdown parser. This is an excellent place to clean up
28bad characters or to extract portions for later processing that the parser may otherwise choke on.
29
30Preprocessors inherit from `markdown.preprocessors.Preprocessor` and implement a `run` method, which takes a single
31parameter `lines`. This parameter is the entire source text stored as a list of Unicode strings, one per line.  `run`
32should return its processed list of Unicode strings, one per line.
33
34#### Example
35
36This simple example removes any lines with 'NO RENDER' before processing:
37
```python
from markdown.preprocessors import Preprocessor
import re

class NoRender(Preprocessor):
    """ Skip any line with words 'NO RENDER' in it. """
    def run(self, lines):
        new_lines = []
        for line in lines:
            m = re.search("NO RENDER", line)
            if not m:
                # any line without NO RENDER is passed through
                new_lines.append(line)
        return new_lines
```
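A preprocessor only takes effect once an `Extension` registers it. The following is a minimal sketch of that step, repeating `NoRender` in compact form so the block runs on its own. The `NoRenderExtension` wrapper, the registry name `'norender'`, and the priority of 25 are assumptions made for illustration only.

```python
import markdown
from markdown.extensions import Extension
from markdown.preprocessors import Preprocessor


class NoRender(Preprocessor):
    """Drop any line containing the words 'NO RENDER'."""
    def run(self, lines):
        return [line for line in lines if "NO RENDER" not in line]


class NoRenderExtension(Extension):  # hypothetical wrapper name
    def extendMarkdown(self, md):
        # 'norender' and 25 are illustrative choices, not library defaults
        md.preprocessors.register(NoRender(md), 'norender', 25)


source = "Keep this.\nNO RENDER drop this.\nKeep this too."
html = markdown.markdown(source, extensions=[NoRenderExtension()])
print(html)
```

The dropped line never reaches the block processors, so nothing later in the pipeline needs to know about it.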
53
54#### Usages
55
56Some preprocessors in the Markdown source tree include:
57
| Class                         | Kind      | Description |
| ------------------------------|-----------|------------------------------------------------- |
| [`NormalizeWhiteSpace`][c1]   | built-in  | Normalizes whitespace by expanding tabs, fixing `\r` line endings, etc. |
| [`HtmlBlockPreprocessor`][c2] | built-in  | Removes HTML blocks from the text and stores them for later processing |
| [`ReferencePreprocessor`][c3] | built-in  | Removes reference definitions from text and stores for later processing |
| [`MetaPreprocessor`][c4]      | extension | Strips and records meta data at top of documents |
| [`FootnotesPreprocessor`][c5] | extension | Removes footnote blocks from the text and stores them for later processing |
65
66[c1]: https://github.com/Python-Markdown/markdown/blob/master/markdown/preprocessors.py
67[c2]: https://github.com/Python-Markdown/markdown/blob/master/markdown/preprocessors.py
68[c3]: https://github.com/Python-Markdown/markdown/blob/master/markdown/preprocessors.py
69[c4]: https://github.com/Python-Markdown/markdown/blob/master/markdown/extensions/meta.py
70[c5]: https://github.com/Python-Markdown/markdown/blob/master/markdown/extensions/footnotes.py
71
72### Block Processors {: #blockprocessors }
73
74A block processor parses blocks of text and adds new elements to the `ElementTree`. Blocks of text, separated from
75other text by blank lines, may have a different syntax and produce a differently structured tree than other Markdown.
76Block processors excel at code formatting, equation layouts, and tables.
77
78Block processors inherit from `markdown.blockprocessors.BlockProcessor`, are passed `md.parser` on initialization, and
79implement both the `test` and `run` methods:
80
81* `test(self, parent, block)` takes two parameters: `parent` is the parent `ElementTree` element and `block` is a
82  single, multi-line, Unicode string of the current block. `test`, often a regular expression match, returns a true
83  value if the block processor's `run` method should be called to process starting at that block.
84* `run(self, parent, blocks)` has the same `parent` parameter as `test`; and `blocks` is the list of all remaining
85  blocks in the document, starting with the `block` passed to `test`. `run` may return `False` (not `None`) to signal
86  failure, meaning that it did not process the blocks after all. On success, `run` is expected to `pop` one or more
87  blocks from the front of `blocks` and attach new nodes to `parent`.
88
Crafting block processors is more involved and flexible than the other processors, involving controlling recursive
parsing of the block's contents and managing state across invocations. For example, a blank line is allowed in
indented code, so the second invocation of the code block processor appends to the element tree generated by the
previous call.  Other block processors may insert new text into the `blocks` list, signal to future calls of
themselves, and more.
94
To make writing these complex beasts more tractable, three convenience methods are provided by the
`BlockProcessor` parent class:

* `lastChild(parent)` returns the last child of the given element or `None` if it has no children.
* `detab(text)` removes one level of indent (four spaces by default) from the front of each line of the given
  multi-line text string, until a non-blank line is indented less.
* `looseDetab(text, level)` removes multiple levels of indent from the front of each line of `text` but does not
  affect lines indented less.
103
104Also, `BlockProcessor` provides the fields `self.tab_length`, the tab length (default 4), and `self.parser`, the
105current `BlockParser` instance.
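The helpers are easiest to understand by calling them directly. The sketch below instantiates a bare `BlockProcessor` purely for experimentation (real code would call the helpers on `self` inside a subclass). It assumes a Python-Markdown 3.x install, where, to the best of our knowledge, `detab` returns a pair of strings rather than a single string.

```python
import xml.etree.ElementTree as etree

from markdown import Markdown
from markdown.blockprocessors import BlockProcessor

# A bare BlockProcessor is enough to try the helpers; real code would call
# them as self.detab(...), self.looseDetab(...), and self.lastChild(...).
helper = BlockProcessor(Markdown().parser)

# detab() returns a pair: the de-indented leading lines and the untouched rest.
detabbed, remainder = helper.detab("    first\n\n    second\nthird")

# looseDetab() strips the indent where present and leaves other lines alone.
loosened = helper.looseDetab("    first\nsecond\n        third")

# lastChild() returns None for an empty element, otherwise the final child.
parent = etree.Element('div')
empty_result = helper.lastChild(parent)
child = etree.SubElement(parent, 'p')
last = helper.lastChild(parent)

print(repr(detabbed), repr(remainder), repr(loosened))
```

Note that `detab` stops at the first non-blank line that is indented less, which is why `third` ends up in the remainder.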
106
107#### BlockParser
108
109`BlockParser`, not to be confused with `BlockProcessor`, is the class used by Markdown to cycle through all the
110registered block processors.  You should never need to create your own instance; use `self.parser` instead.
111
The `BlockParser` instance provides a stack of strings for its current state in `self.parser.state`, which your
processor can push with `self.parser.state.set(state)`, pop with `self.parser.state.reset()`, or check the top state
with `self.parser.state.isstate(state)`. Be sure your code pops the states it pushes.
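A short sketch of the state stack in use follows. It assumes Python-Markdown 3.x, where the stack is exposed as the `state` attribute of the `BlockParser` instance; the state name `'box'` is made up for this example.

```python
from markdown import Markdown

parser = Markdown().parser  # inside a block processor this is self.parser

parser.state.set('box')                    # push a state
inside_box = parser.state.isstate('box')   # check the top of the stack
parser.state.reset()                       # pop it again
after_reset = parser.state.isstate('box')  # the stack is empty, so this is False

print(inside_box, after_reset)
```

Inside a processor, wrapping the work between `set` and `reset` in `try`/`finally` is one way to make sure the stack stays balanced even if parsing raises.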
115
116The `BlockParser` instance can also be called recursively, that is, to process blocks from within your block
117processor.  There are three methods:
118
119* `parseDocument(lines)` parses a list of lines, each a single-line Unicode string, returning a complete
120  `ElementTree`.
121* `parseChunk(parent, text)` parses a single, multi-line, possibly multi-block, Unicode string `text` and attaches the
122  resulting tree to `parent`.
123* `parseBlocks(parent, blocks)` takes a list of `blocks`, each a multi-line Unicode string without blank lines, and
124  attaches the resulting tree to `parent`.
125
126For perspective, Markdown calls `parseDocument` which calls `parseChunk` which calls `parseBlocks` which calls your
127block processor, which, in turn, might call one of these routines.
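The three entry points can be tried outside a processor to see what each one expects and produces. This is a sketch assuming a default `Markdown` instance with no extensions; the sample text is arbitrary.

```python
import xml.etree.ElementTree as etree

from markdown import Markdown

parser = Markdown().parser  # inside a block processor this is self.parser

# parseChunk() splits the text into blocks and attaches the result to parent.
parent = etree.Element('div')
parser.parseChunk(parent, "First paragraph.\n\nSecond paragraph.")
chunk_tags = [child.tag for child in parent]

# parseBlocks() takes blocks that are already split on blank lines.
other = etree.Element('div')
parser.parseBlocks(other, ["# Heading", "Body text."])
block_tags = [child.tag for child in other]

# parseDocument() takes a list of lines and returns a whole ElementTree.
tree = parser.parseDocument(["# Title", "", "Body text."])
root_tag = tree.getroot().tag

print(chunk_tags, block_tags, root_tag)
```

Within a block processor, `parseBlocks` or `parseChunk` is usually the one to reach for, because both attach to an element you already hold.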
128
129#### Example
130
131This example calls out important paragraphs by giving them a border.  It looks for a fence line of exclamation points
132before and after and renders the fenced blocks into a new, styled `div`.  If it does not find the ending fence line,
133it does nothing.
134
135Our code, like most block processors, is longer than other examples:
136
```python
import re
import xml.etree.ElementTree as etree

from markdown.blockprocessors import BlockProcessor
from markdown.extensions import Extension


class BoxBlockProcessor(BlockProcessor):
    RE_FENCE_START = r'^ *!{3,} *\n'  # start line, e.g., `   !!!! `
    RE_FENCE_END = r'\n *!{3,}\s*$'   # last non-blank line, e.g., '!!!\n  \n\n'

    def test(self, parent, block):
        return re.match(self.RE_FENCE_START, block)

    def run(self, parent, blocks):
        original_block = blocks[0]
        blocks[0] = re.sub(self.RE_FENCE_START, '', blocks[0])

        # Find block with ending fence
        for block_num, block in enumerate(blocks):
            if re.search(self.RE_FENCE_END, block):
                # remove fence
                blocks[block_num] = re.sub(self.RE_FENCE_END, '', block)
                # render fenced area inside a new div
                e = etree.SubElement(parent, 'div')
                e.set('style', 'display: inline-block; border: 1px solid red;')
                self.parser.parseBlocks(e, blocks[0:block_num + 1])
                # remove used blocks
                for i in range(0, block_num + 1):
                    blocks.pop(0)
                return True  # or could have had no return statement
        # No closing marker!  Restore and do nothing
        blocks[0] = original_block
        return False  # equivalent to our test() routine returning False


class BoxExtension(Extension):
    def extendMarkdown(self, md):
        md.parser.blockprocessors.register(BoxBlockProcessor(md.parser), 'box', 175)
```
171
172Start with this example input:
173
``` text
A regular paragraph of text.

!!!!!
First paragraph of wrapped text.

Second Paragraph of **wrapped** text.
!!!!!

Another regular paragraph of text.
```
185
186The fenced text adds one node with two children to the tree:
187
188* `div`, with a `style` attribute.  It renders as
189  `<div style="display: inline-block; border: 1px solid red;">...</div>`
190    * `p` with text `First paragraph of wrapped text.`
191    * `p` with text `Second Paragraph of **wrapped** text`.  The conversion to a `<strong>` tag will happen when
192      running the inline processors, which will happen after all of the block processors have completed.
193
194The example output might display as follows:
195
196!!! note ""
197    <p>A regular paragraph of text.</p>
198    <div style="display: inline-block; border: 1px solid red;">
199    <p>First paragraph of wrapped text.</p>
200    <p>Second Paragraph of **wrapped** text.</p>
201    </div>
202    <p>Another regular paragraph of text.</p>
203
204#### Usages
205
206Some block processors in the Markdown source tree include:
207
208| Class                       | Kind      |  Description                                |
209| ----------------------------|-----------|---------------------------------------------|
210| [`HashHeaderProcessor`][b1] | built-in  |  Title hashes (`#`), which may split blocks |
211| [`HRProcessor`][b2]         | built-in  |  Horizontal lines, e.g., `---`              |
212| [`OListProcessor`][b3]      | built-in  |  Ordered lists; complex and using `state`   |
213| [`Admonition`][b4]          | extension |  Render each [Admonition][] in a new `div`  |
214
215[b1]: https://github.com/Python-Markdown/markdown/blob/master/markdown/blockprocessors.py
216[b2]: https://github.com/Python-Markdown/markdown/blob/master/markdown/blockprocessors.py
217[b3]: https://github.com/Python-Markdown/markdown/blob/master/markdown/blockprocessors.py
218[Admonition]: https://python-markdown.github.io/extensions/admonition/
219[b4]: https://github.com/Python-Markdown/markdown/blob/master/markdown/extensions/admonition.py
220
### Tree Processors {: #treeprocessors }
222
223Tree processors manipulate the tree created by block processors.  They can even create an entirely new ElementTree
224object. This is an excellent place for creating summaries, adding collected references, or last minute adjustments.
225
226A tree processor must inherit from `markdown.treeprocessors.Treeprocessor` (note the capitalization). A tree processor
227must implement a `run` method which takes a single argument `root`. In most cases `root` would be an
228`xml.etree.ElementTree.Element` instance; however, in rare cases it could be some other type of ElementTree object.
229The `run` method may return `None`, in which case the (possibly modified) original `root` object is used, or it may
230return an entirely new `Element` object, which will replace the existing `root` object and all of its children.  It is
231generally preferred to modify `root` in place and return `None`, which avoids creating multiple copies of the entire
232document tree in memory.
233
234For specifics on manipulating the ElementTree, see [Working with the ElementTree][workingwithetree] below.
235
236#### Example
237
238A pseudo example:
239
```python
from markdown.treeprocessors import Treeprocessor

class MyTreeprocessor(Treeprocessor):
    def run(self, root):
        root.text = 'modified content'
        # No return statement is same as `return None`
```
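As a slightly fuller sketch than the pseudo example, the following tree processor modifies `root` in place and is registered through an extension. The class names, the registry name `'paragraph_class'`, the `lead` CSS class, and the priority of 15 (assumed here to run it after the built-in inline tree processor) are illustrative choices, not part of the library.

```python
import markdown
from markdown.extensions import Extension
from markdown.treeprocessors import Treeprocessor


class ParagraphClassTreeprocessor(Treeprocessor):  # hypothetical name
    def run(self, root):
        # Modify the tree in place; returning None keeps the original root.
        for paragraph in root.iter('p'):
            paragraph.set('class', 'lead')


class ParagraphClassExtension(Extension):  # hypothetical name
    def extendMarkdown(self, md):
        md.treeprocessors.register(ParagraphClassTreeprocessor(md), 'paragraph_class', 15)


html = markdown.markdown("Hello *world*", extensions=[ParagraphClassExtension()])
print(html)
```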
248
249#### Usages
250
The core `InlineProcessor` class in `markdown.treeprocessors` is a tree processor.  It walks the tree, matches
patterns, and splits and creates nodes on matches.
253
254Additional tree processors in the Markdown source tree include:
255
| Class                             | Kind      | Description                                                   |
| ----------------------------------|-----------|---------------------------------------------------------------|
| [`PrettifyTreeprocessor`][e1]     | built-in  |  Add line breaks to the HTML document                         |
| [`TocTreeprocessor`][e2]          | extension |  Builds a [table of contents][] from the finished tree        |
| [`FootnoteTreeprocessor`][e3]     | extension |  Create [footnote][] div at end of document                   |
| [`FootnotePostTreeprocessor`][e4] | extension |  Amend div created by `FootnoteTreeprocessor` with duplicates |
262
263[e1]: https://github.com/Python-Markdown/markdown/blob/master/markdown/treeprocessors.py
264[e2]: https://github.com/Python-Markdown/markdown/blob/master/markdown/extensions/toc.py
265[e3]: https://github.com/Python-Markdown/markdown/blob/master/markdown/extensions/footnotes.py
266[e4]: https://github.com/Python-Markdown/markdown/blob/master/markdown/extensions/footnotes.py
267[table of contents]: https://python-markdown.github.io/extensions/toc/
268[footnote]: https://python-markdown.github.io/extensions/footnotes/
269
270### Inline Processors {: #inlineprocessors }
271
Inline processors, previously called inline patterns, are used to add formatting, such as `**emphasis**`, by replacing
a matched pattern with a new element tree node. They are an excellent place for adding new syntax for inline tags.
Inline processor code is often quite short.
275
276Inline processors inherit from `InlineProcessor`, are initialized, and implement `handleMatch`:
277
*   `__init__(self, pattern, md=None)` is the inherited constructor.  You do not need to implement your own.
    * `pattern` is the regular expression string that must match within the block of text in order for the
      `handleMatch` method to be called.
    * `md`, an optional parameter, is a pointer to the instance of `markdown.Markdown` and is available as `self.md`
      on the `InlineProcessor` instance.

*   `handleMatch(self, m, data)` must be implemented in all `InlineProcessor` subclasses.
    * `m` is the regular expression [match object][] found by the `pattern` passed to `__init__`.
    * `data` is a single, multi-line, Unicode string containing the entire block of text around the pattern.  A block
      is text set apart by blank lines.
    * Returns either `(None, None, None)`, indicating the provided match was rejected, or `(el, start, end)`, if the
      match was successfully processed.  On success, `el` is the element being added to the tree, and `start` and
      `end` are indexes in `data` that were "consumed" by the pattern.  The "consumed" span will be replaced by a
      placeholder.  The same inline processor may be called several times on the same block.
292
293Inline Processors can define the property `ANCESTOR_EXCLUDES` which is either a list or tuple of undesirable ancestors.
294The processor will be skipped if it would cause the content to be a descendant of one of the listed tag names.
295
296##### Convenience Classes
297
Convenience subclasses of `InlineProcessor` are provided for common operations:
299
300* [`SimpleTextInlineProcessor`][i1] returns the text of `group(1)` of the match.
301* [`SubstituteTagInlineProcessor`][i4] is initialized as `SubstituteTagInlineProcessor(pattern, tag)`. It returns a
302  new element `tag` whenever `pattern` is matched.
303* [`SimpleTagInlineProcessor`][i3] is initialized as `SimpleTagInlineProcessor(pattern, tag)`. It returns an element
304  `tag` with a text field of `group(2)` of the match.
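For instance, `SubstituteTagInlineProcessor` can turn a marker into an empty element without any custom class. The sketch below assumes a hypothetical `[br]` marker, a made-up registry name `'br_marker'`, and a priority of 175 chosen so that it runs before the built-in link and reference processors.

```python
import markdown
from markdown.extensions import Extension
from markdown.inlinepatterns import SubstituteTagInlineProcessor


class BreakExtension(Extension):  # hypothetical name
    def extendMarkdown(self, md):
        # Each `[br]` marker is replaced by an empty <br> element.
        md.inlinePatterns.register(
            SubstituteTagInlineProcessor(r'\[br\]', 'br'), 'br_marker', 175)


html = markdown.markdown("line one[br]line two", extensions=[BreakExtension()])
print(html)
```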
305
306##### Example
307
308This example changes `--strike--` to `<del>strike</del>`.
309
```python
from markdown.inlinepatterns import InlineProcessor
from markdown.extensions import Extension
import xml.etree.ElementTree as etree


class DelInlineProcessor(InlineProcessor):
    def handleMatch(self, m, data):
        el = etree.Element('del')
        el.text = m.group(1)
        return el, m.start(0), m.end(0)

class DelExtension(Extension):
    def extendMarkdown(self, md):
        DEL_PATTERN = r'--(.*?)--'  # like --del--
        md.inlinePatterns.register(DelInlineProcessor(DEL_PATTERN, md), 'del', 175)
```
327
328Use this input example:
329
``` text
First line of the block.
This is --strike one--.
This is --strike two--.
End of the block.
```
336
337The example output might display as follows:
338
339!!! note ""
340    <p>First line of the block.
341    This is <del>strike one</del>.
342    This is <del>strike two</del>.
343    End of the block.</p>
344
345* On the first call to `handleMatch`
346    * `m` will be the match for `--strike one--`
347    * `data` will be the string:
348      `First line of the block.\nThis is --strike one--.\nThis is --strike two--.\nEnd of the block.`
349
    Because the match was successful, the region between the returned `start` and `end` is replaced with a
    placeholder token and the new element is added to the tree.
352
353* On the second call to `handleMatch`
354    * `m` will be the match for `--strike two--`
355    * `data` will be the string
356      `First line of the block.\nThis is klzzwxh:0000.\nThis is --strike two--.\nEnd of the block.`
357
Note the placeholder token `klzzwxh:0000`. This allows the regular expression to be run against the entire block,
not just the text contained in an individual element. The placeholders will later be swapped back out for the
actual elements by the parser.
361
Strictly speaking, it is not necessary to write the above inline processor. That example is not very DRY (Don't
Repeat Yourself): a pattern for `**strong**` text would be almost identical, with the exception that it would
create a `strong` element. Therefore, Markdown provides a number of generic `InlineProcessor` subclasses that can
provide some common functionality. For example, strike could be implemented with an instance of the
`SimpleTagInlineProcessor` class as demonstrated below. Feel free to use or extend any of the `InlineProcessor`
subclasses found at `markdown.inlinepatterns`.
368
```python
from markdown.inlinepatterns import SimpleTagInlineProcessor
from markdown.extensions import Extension

class DelExtension(Extension):
    def extendMarkdown(self, md):
        # The empty first group is needed because the element text is taken from group(2)
        md.inlinePatterns.register(SimpleTagInlineProcessor(r'()--(.*?)--', 'del'), 'del', 175)
```
377
378
379##### Usages
380
Here are some inline processors from the Markdown source tree:
382
383| Class                            | Kind      | Description                                                   |
384| ---------------------------------|-----------|---------------------------------------------------------------|
385| [`AsteriskProcessor`][i5]        | built-in  | Emphasis processor for handling strong and em matches inside asterisks |
386| [`AbbrInlineProcessor`][i6]      | extension | Apply tag to abbreviation registered by preprocessor          |
387| [`WikiLinksInlineProcessor`][i7] | extension | Link `[[article names]]` to wiki given in metadata            |
388| [`FootnoteInlineProcessor`][i8]  | extension | Replaces footnote in text with link to footnote div at bottom |
389
390[i1]: https://github.com/Python-Markdown/markdown/blob/master/markdown/inlinepatterns.py
391[i2]: https://github.com/Python-Markdown/markdown/blob/master/markdown/inlinepatterns.py
392[i3]: https://github.com/Python-Markdown/markdown/blob/master/markdown/inlinepatterns.py
393[i4]: https://github.com/Python-Markdown/markdown/blob/master/markdown/inlinepatterns.py
394[i5]: https://github.com/Python-Markdown/markdown/blob/master/markdown/inlinepatterns.py
395[i6]: https://github.com/Python-Markdown/markdown/blob/master/markdown/extensions/abbr.py
396[i7]: https://github.com/Python-Markdown/markdown/blob/master/markdown/extensions/wikilinks.py
397[i8]: https://github.com/Python-Markdown/markdown/blob/master/markdown/extensions/footnotes.py
398
399### Patterns
400
In version 3.0, a new, more flexible inline processor was added, `markdown.inlinepatterns.InlineProcessor`. The
original inline patterns, which inherit from `markdown.inlinepatterns.Pattern` or one of its children, are still
supported, though users are encouraged to migrate.
404
405#### Comparison with new `InlineProcessor`
406
407The new `InlineProcessor` provides two major enhancements to `Patterns`:
408
1. Inline Processors no longer need to match the entire block, so regular expressions no longer need to start with
   `r'^(.*?)'` and end with `r'(.*?)$'`. This runs faster. The returned [match object][] will only contain what is
   explicitly matched in the pattern, and extension pattern groups now start with `m.group(1)`.
412
4132.  The `handleMatch` method now takes an additional input called `data`, which is the entire block under analysis,
414    not just what is matched with the specified pattern. The method now returns the element *and* the indexes relative
415    to `data` that the return element is replacing (usually `m.start(0)` and `m.end(0)`).  If the boundaries are
416    returned as `None`, it is assumed that the match did not take place, and nothing will be altered in `data`.
417
418    This allows handling of more complex constructs than regular expressions can handle, e.g., matching nested
419    brackets, and explicit control of the span "consumed" by the processor.
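The second point can be sketched with a processor whose regular expression only finds the opening delimiter, while `handleMatch` scans `data` for the balanced closing one. The `@(...)` syntax, the class names, the registry name `'paren_span'`, and the priority of 175 are all invented for this illustration.

```python
import xml.etree.ElementTree as etree

import markdown
from markdown.extensions import Extension
from markdown.inlinepatterns import InlineProcessor


class ParenSpanProcessor(InlineProcessor):
    """Hypothetical syntax: `@(...)`, with nested parentheses, becomes a <span>."""
    def handleMatch(self, m, data):
        depth = 0
        # Scan forward from the opening parenthesis to find its balanced partner.
        for index in range(m.end(0) - 1, len(data)):
            if data[index] == '(':
                depth += 1
            elif data[index] == ')':
                depth -= 1
                if depth == 0:
                    el = etree.Element('span')
                    el.text = data[m.end(0):index]
                    # Consume everything from '@' through the closing ')'.
                    return el, m.start(0), index + 1
        return None, None, None  # unbalanced: reject the match


class ParenSpanExtension(Extension):
    def extendMarkdown(self, md):
        md.inlinePatterns.register(ParenSpanProcessor(r'@\(', md), 'paren_span', 175)


html = markdown.markdown("Value: @(f(a, (b))) done", extensions=[ParenSpanExtension()])
rejected = markdown.markdown("Oops @(unclosed", extensions=[ParenSpanExtension()])
print(html)
print(rejected)
```

Because the returned `end` index comes from the scan rather than from `m.end(0)`, the processor consumes more text than the regular expression matched.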
420
421#### Inline Patterns
422
423Inline Patterns can implement inline HTML element syntax for Markdown such as `*emphasis*` or
424`[links](http://example.com)`. Pattern objects should be instances of classes that inherit from
425`markdown.inlinepatterns.Pattern` or one of its children. Each pattern object uses a single regular expression and
426must have the following methods:
427
428* **`getCompiledRegExp()`**:
429
430    Returns a compiled regular expression.
431
432* **`handleMatch(m)`**:
433
    Accepts a match object and returns an ElementTree element or a plain Unicode string.
435
Inline Patterns can define the property `ANCESTOR_EXCLUDES` which is either a list or tuple of undesirable ancestors.
The pattern will be skipped if it would cause the content to be a descendant of one of the listed tag names.
438
Note that any regular expression returned by `getCompiledRegExp` must capture the whole block. Therefore, they should
all start with `r'^(.*?)'` and end with `r'(.*?)$'`. When using the default `getCompiledRegExp()` method provided in
the `Pattern` class, you can pass in a regular expression without that and `getCompiledRegExp` will wrap your
expression for you and set the `re.DOTALL` and `re.UNICODE` flags. This means that the first group of your match will
be `m.group(2)` as `m.group(1)` will match everything before the pattern.
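The group shift is easy to see by matching with the wrapped expression directly. This is a small sketch using the base `Pattern` class and an arbitrary sample string; it assumes the default wrapping described above.

```python
from markdown.inlinepatterns import Pattern

# The pattern itself has a single group, but the wrapped expression adds one
# group before it and one after it.
pattern = Pattern(r'\*([^*]+)\*')
m = pattern.getCompiledRegExp().match("before *word* after")
groups = (m.group(1), m.group(2), m.group(3))
print(groups)
```

Here `m.group(1)` holds the text before the match and `m.group(2)` holds the pattern's own first group, which is why legacy `handleMatch` methods start counting at 2.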
444
445For an example, consider this simplified emphasis pattern:
446
```python
from markdown.inlinepatterns import Pattern
import xml.etree.ElementTree as etree

class EmphasisPattern(Pattern):
    def handleMatch(self, m):
        el = etree.Element('em')
        el.text = m.group(2)
        return el
```
457
458As discussed in [Integrating Your Code Into Markdown][], an instance of this class will need to be provided to
459Markdown. That instance would be created like so:
460
```python
# an oversimplified regex
MYPATTERN = r'\*([^*]+)\*'
# pass in pattern and create instance
emphasis = EmphasisPattern(MYPATTERN)
```
467
468### Postprocessors {: #postprocessors }
469
Postprocessors munge the document after the ElementTree has been serialized into a string. Postprocessors should be
used to work with the text just before output.  Usually, they are used to add back sections that were extracted in a
preprocessor, fix up outgoing encodings, or wrap the whole document.
473
474Postprocessors inherit from `markdown.postprocessors.Postprocessor` and implement a `run` method which takes a single
475parameter `text`, the entire HTML document as a single Unicode string.  `run` should return a single Unicode string
476ready for output.  Note that preprocessors use a list of lines while postprocessors use a single multi-line string.
477
478#### Example
479
Here is a simple example that changes the output to one big page showing the raw HTML.
481
```python
from markdown.postprocessors import Postprocessor
import re

class ShowActualHtmlPostprocessor(Postprocessor):
    """ Wrap entire output in <pre> tags as a diagnostic. """
    def run(self, text):
        return '<pre>\n' + re.sub('<', '&lt;', text) + '</pre>\n'
```
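Registering a postprocessor follows the same pattern as the other stages. The sketch below wraps the whole document in an `<article>` element; the class names, the registry name `'wrap_article'`, and the priority of 5 (assumed low enough to run after the built-in postprocessors) are illustrative only.

```python
import markdown
from markdown.extensions import Extension
from markdown.postprocessors import Postprocessor


class WrapPostprocessor(Postprocessor):  # hypothetical name
    def run(self, text):
        # `text` is the whole serialized document as one string.
        return '<article>\n' + text + '\n</article>'


class WrapExtension(Extension):  # hypothetical name
    def extendMarkdown(self, md):
        md.postprocessors.register(WrapPostprocessor(md), 'wrap_article', 5)


html = markdown.markdown("Hello", extensions=[WrapExtension()])
print(html)
```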
491
492#### Usages
493
494Some postprocessors in the Markdown source tree include:
495
| Class                         | Kind      |  Description                                       |
| ------------------------------|-----------|----------------------------------------------------|
| [`raw_html`][p1]              | built-in  | Restore raw HTML from `htmlStash`, stored by `HtmlBlockPreprocessor`, and code highlighters |
| [`amp_substitute`][p2]        | built-in  | Convert ampersand substitutes to `&`; used in links |
| [`unescape`][p3]              | built-in  | Convert some escaped characters back from integers; used in links |
| [`FootnotePostprocessor`][p4] | extension | Replace footnote placeholders with HTML entities; as set by other stages |
502
503 [p1]: https://github.com/Python-Markdown/markdown/blob/master/markdown/postprocessors.py
504 [p2]: https://github.com/Python-Markdown/markdown/blob/master/markdown/postprocessors.py
505 [p3]: https://github.com/Python-Markdown/markdown/blob/master/markdown/postprocessors.py
506 [p4]: https://github.com/Python-Markdown/markdown/blob/master/markdown/extensions/footnotes.py
507
508
509## Working with the ElementTree {: #working_with_et }
510
511As mentioned, the Markdown parser converts a source document to an [ElementTree][ElementTree] object before
512serializing that back to Unicode text. Markdown has provided some helpers to ease that manipulation within the context
513of the Markdown module.
514
515First, import the ElementTree module:
516
```python
import xml.etree.ElementTree as etree
```

Sometimes you may want text inserted into an element to be parsed by [Inline Patterns][]. In such a situation, simply
insert the text as you normally would and the text will be automatically run through the Inline Patterns. However, if
you do *not* want some text to be parsed by Inline Patterns, then insert the text as an `AtomicString`.
523
```python
from markdown.util import AtomicString
some_element.text = AtomicString(some_text)
```
528
Here's a basic example which creates an HTML table (note that the contents of the second cell (`td2`) will be run
through Inline Patterns later):
531
```python
import xml.etree.ElementTree as etree
from markdown.util import AtomicString

table = etree.Element("table")
table.set("cellpadding", "2")                      # Set cellpadding to 2
tr = etree.SubElement(table, "tr")                 # Add child tr to table
td1 = etree.SubElement(tr, "td")                   # Add child td1 to tr
td1.text = AtomicString("Cell content")            # Add plain text content
td2 = etree.SubElement(tr, "td")                   # Add second td to tr
td2.text = "*text* with **inline** formatting."    # Add markup text
table.tail = "Text after table"                    # Add text after table
```
542
543You can also manipulate an existing tree. Consider the following example which adds a `class` attribute to `<a>`
544elements:
545
```python
def set_link_class(element):
    for child in element:
        if child.tag == "a":
            child.set("class", "myclass")  # set the class attribute
        set_link_class(child)  # run recursively on children
```
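The same traversal can also be written with `Element.iter()` from the standard library, which avoids the explicit recursion. The sketch below is one possible variant; the `class_name` parameter and the sample markup are assumptions added for illustration.

```python
import xml.etree.ElementTree as etree


def set_link_class(root, class_name="myclass"):
    # Element.iter() visits the element itself and all of its descendants
    # in document order, so no explicit recursion is needed.
    for link in root.iter("a"):
        link.set("class", class_name)


root = etree.fromstring(
    '<div><p>See <a href="#x">this</a> and <em><a href="#y">that</a></em>.</p></div>')
set_link_class(root)
classes = [a.get("class") for a in root.iter("a")]
print(classes)
```

One difference from the recursive version is that `iter()` also considers the starting element itself, which only matters if the root could be an `<a>` element.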
553
554For more information about working with ElementTree see the [ElementTree
555Documentation][ElementTree].
556
557## Working with Raw HTML {: #working_with_raw_html }
558
559Occasionally an extension may need to call out to a third party library which returns a pre-made string
560of raw HTML that needs to be inserted into the document unmodified. Raw strings can be stashed for later
561retrieval using an `htmlStash` instance, rather than converting them into `ElementTree` objects. A raw string
562(which may or may not be raw HTML) passed to `self.md.htmlStash.store()` will be saved to the stash and a
563placeholder string will be returned which should be inserted into the tree instead. After the tree is
564serialized, a postprocessor will replace the placeholder with the raw string. This prevents subsequent
565processing steps from modifying the HTML data. For example,
566
```python
# Inside a processor method, where `self.md` is the `markdown.Markdown` instance
html = "<p>This is some <em>raw</em> HTML data</p>"
el = etree.Element("div")
el.text = self.md.htmlStash.store(html)
```
572
573For the global `htmlStash` instance to be available from a processor, the `markdown.Markdown` instance must
574be passed to the processor from [extendMarkdown](#extendmarkdown) and will be available as `self.md.htmlStash`.
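Put together, a processor that stashes raw HTML might look like the sketch below. It is an inline processor that returns the placeholder string in place of an element, an approach believed to mirror how the built-in inline HTML handling works. The `!!text!!` syntax, the class names, the registry name `'badge'`, and the priority are assumptions for illustration.

```python
from html import escape

import markdown
from markdown.extensions import Extension
from markdown.inlinepatterns import InlineProcessor


class BadgeInlineProcessor(InlineProcessor):  # hypothetical name
    def handleMatch(self, m, data):
        # Build the raw HTML ourselves, escaping the user-supplied text.
        raw = '<span class="badge">%s</span>' % escape(m.group(1))
        # store() returns a placeholder that a postprocessor swaps back later.
        placeholder = self.md.htmlStash.store(raw)
        return placeholder, m.start(0), m.end(0)


class BadgeExtension(Extension):  # hypothetical name
    def extendMarkdown(self, md):
        # Passing `md` makes self.md (and so self.md.htmlStash) available.
        md.inlinePatterns.register(BadgeInlineProcessor(r'!!(.+?)!!', md), 'badge', 175)


html = markdown.markdown("Status: !!ok!!", extensions=[BadgeExtension()])
print(html)
```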
575
576## Integrating Your Code Into Markdown {: #integrating_into_markdown }
577
Once you have the various pieces of your extension built, you need to tell Markdown about them and ensure that they
are run in the proper sequence. Markdown accepts an `Extension` instance for each extension. Therefore, you will need
to define a class that extends `markdown.extensions.Extension` and overrides the `extendMarkdown` method. Within this
class you will manage configuration options for your extension and attach the various processors and patterns to the
Markdown instance.
583
584It is important to note that the order of the various processors and patterns matters. For example, if we replace
585`http://...` links with `<a>` elements, and *then* try to deal with  inline HTML, we will end up with a mess.
586Therefore, the various types of processors and patterns are stored within an instance of the `markdown.Markdown` class
587in a [Registry][]. Your `Extension` class will need to manipulate those registries appropriately. You may `register`
588instances of your processors and patterns with an appropriate priority, `deregister` built-in instances, or replace a
589built-in instance with your own.
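
As an illustration of removing a built-in instance, the sketch below deregisters the inline pattern that turns two
trailing spaces into a hard line break. It assumes that the pattern is registered under the name `linebreak`, as it
is in Python-Markdown 3.x; the class name `NoHardBreaksExtension` is made up for this example.

```python
import markdown
from markdown.extensions import Extension


class NoHardBreaksExtension(Extension):
    """Hypothetical extension that disables two-space hard line breaks."""

    def extendMarkdown(self, md):
        # 'linebreak' is assumed to be the name of the built-in pattern.
        # Check the name against the Markdown version you target.
        md.inlinePatterns.deregister("linebreak")


text = "line one  \nline two"
default_html = markdown.markdown(text)
custom_html = markdown.markdown(text, extensions=[NoHardBreaksExtension()])

print(default_html)
print(custom_html)
```

With the pattern removed, the trailing spaces are no longer converted to a `<br />` element.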
590
591### `extendMarkdown` {: #extendmarkdown }
592
593The `extendMarkdown` method of a `markdown.extensions.Extension` class accepts one argument:
594
595* **`md`**:
596
    A reference to the instance of the `markdown.Markdown` class. You should use this to access the
598    [Registries][Registry] of processors and patterns. They are found under the following attributes:
599
600    * `md.preprocessors`
601    * `md.inlinePatterns`
602    * `md.parser.blockprocessors`
603    * `md.treeprocessors`
604    * `md.postprocessors`
605
606    Some other things you may want to access on the `markdown.Markdown` instance are:
607
608    * `md.htmlStash`
609    * `md.output_formats`
610    * `md.set_output_format()`
611    * `md.output_format`
612    * `md.serializer`
613    * `md.registerExtension()`
614    * `md.tab_length`
615    * `md.block_level_elements`
616    * `md.isBlockLevel()`
617
618!!! Warning
619    With access to the above items, theoretically you have the option to change anything through various
620    [monkey_patching][] techniques. However, you should be aware that the various undocumented parts of Markdown may
621    change without notice and your monkey_patches may break with a new release. Therefore, what you really should be
622    doing is inserting processors and patterns into the Markdown pipeline. Consider yourself warned!
623
624[monkey_patching]: https://en.wikipedia.org/wiki/Monkey_patch
625
626A simple example:
627
628```python
629from markdown.extensions import Extension
630
631class MyExtension(Extension):
632    def extendMarkdown(self, md):
633        # Register instance of 'mypattern' with a priority of 175
634        md.inlinePatterns.register(MyPattern(md), 'mypattern', 175)
635```
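
`MyPattern` is left undefined in the example above. A complete, runnable variant might look like the following
sketch, which wraps text between double hyphens in a `<del>` element. The names `DelInlineProcessor`,
`DelExtension`, and `'del'` are chosen for this illustration only.

```python
import xml.etree.ElementTree as etree

import markdown
from markdown.extensions import Extension
from markdown.inlinepatterns import InlineProcessor


class DelInlineProcessor(InlineProcessor):
    def handleMatch(self, m, data):
        # Build the replacement element and report which span of `data` it covers.
        el = etree.Element("del")
        el.text = m.group(1)
        return el, m.start(0), m.end(0)


class DelExtension(Extension):
    def extendMarkdown(self, md):
        # Register the processor under the name 'del' with a priority of 175.
        md.inlinePatterns.register(DelInlineProcessor(r"--(.*?)--", md), "del", 175)


html = markdown.markdown("Some --deleted-- text", extensions=[DelExtension()])
print(html)  # <p>Some <del>deleted</del> text</p>
```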
636
### `registerExtension` {: #registerextension }
638
639Some extensions may need to have their state reset between multiple runs of the `markdown.Markdown` class. For
640example, consider the following use of the [Footnotes][] extension:
641
642```python
643md = markdown.Markdown(extensions=['footnotes'])
644html1 = md.convert(text_with_footnote)
645md.reset()
646html2 = md.convert(text_without_footnote)
647```
648
649Without calling `reset`, the footnote definitions from the first document will be inserted into the second document as
650they are still stored within the class instance. Therefore the `Extension` class needs to define a `reset` method that
will reset the state of the extension (for example, `self.footnotes = {}`). However, as many extensions do not have a need
652for `reset`, `reset` is only called on extensions that are registered.
653
654To register an extension, call `md.registerExtension` from within your `extendMarkdown` method:
655
656```python
657def extendMarkdown(self, md):
658    md.registerExtension(self)
659    # insert processors and patterns here
660```
661
662Then, each time `reset` is called on the `markdown.Markdown` instance, the `reset` method of each registered extension
663will be called as well. You should also note that `reset` will be called on each registered extension after it is
initialized the first time. Keep that in mind when overriding the extension's `reset` method.
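
The behavior described above can be observed with a small, hypothetical extension that counts how many documents
have been converted since the last reset. `CounterExtension`, `CountingPreprocessor`, the attribute `conversions`,
and the priority of `5` are all assumptions made for this sketch.

```python
import markdown
from markdown.extensions import Extension
from markdown.preprocessors import Preprocessor


class CountingPreprocessor(Preprocessor):
    def __init__(self, md, ext):
        super().__init__(md)
        self.ext = ext

    def run(self, lines):
        # Record one more conversion on the owning extension.
        self.ext.conversions += 1
        return lines


class CounterExtension(Extension):
    def extendMarkdown(self, md):
        # Without this call, `reset` below would never be invoked.
        md.registerExtension(self)
        md.preprocessors.register(CountingPreprocessor(md, self), "counter", 5)

    def reset(self):
        self.conversions = 0


ext = CounterExtension()
md = markdown.Markdown(extensions=[ext])
after_init = ext.conversions      # `reset` already ran once during initialization
md.convert("first")
md.convert("second")
without_reset = ext.conversions   # state carried over between the two runs
md.reset()
after_reset = ext.conversions

print(after_init, without_reset, after_reset)
```

Because `reset` runs once when the `markdown.Markdown` instance is created, the sketch can rely on it to create
the `conversions` attribute rather than setting it in `__init__`.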
665
666### Configuration Settings {: #configsettings }
667
668If an extension uses any parameters that the user may want to change, those parameters should be stored in
669`self.config` of your `markdown.extensions.Extension` class in the following format:
670
```python
import markdown

class MyExtension(markdown.extensions.Extension):
    def __init__(self, **kwargs):
        # Each option maps to a [default value, description] pair.
        self.config = {
            'option1': ['value1', 'description1'],
            'option2': ['value2', 'description2']
        }
        # Pass the keyword arguments on so that user-supplied values replace the defaults.
        super().__init__(**kwargs)
```
680
When implemented this way, the configuration parameters can be overridden at run time (thus the call to `super`). For
example:
683
684```python
685markdown.Markdown(extensions=[MyExtension(option1='other value')])
686```
687
688Note that if a keyword is passed in that is not already defined in `self.config`, then a `KeyError` is raised.
689
690The `markdown.extensions.Extension` class and its subclasses have the following methods available to assist in working
691with configuration settings:
692
693* **`getConfig(key [, default])`**:
694
    Returns the stored value for the given `key`, or `default` if the `key` does not exist. If `default` is not
    provided, an empty string is returned.
697
698* **`getConfigs()`**:
699
700    Returns a dict of all key/value pairs.
701
702* **`getConfigInfo()`**:
703
704    Returns all configuration descriptions as a list of tuples.
705
706* **`setConfig(key, value)`**:
707
708    Sets a configuration setting for `key` with the given `value`. If `key` is unknown, a `KeyError` is raised. If the
709    previous value of `key` was a Boolean value, then `value` is converted to a Boolean value. If the previous value
710    of `key` is `None`, then `value` is converted to a Boolean value except when it is `None`. No conversion takes
711    place when the previous value of `key` is a string.
712
713* **`setConfigs(items)`**:
714
715    Sets multiple configuration settings given a dict of key/value pairs.
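
The following sketch exercises these methods on a small extension. The option names `title` and `enabled` and
their descriptions are hypothetical and serve only to show the string and Boolean cases.

```python
from markdown.extensions import Extension


class MyExtension(Extension):
    def __init__(self, **kwargs):
        self.config = {
            "title": ["Notes", "Heading text (hypothetical option)"],
            "enabled": [True, "Toggle the feature (hypothetical option)"],
        }
        super().__init__(**kwargs)


ext = MyExtension(title="Other")

title = ext.getConfig("title")                  # value overridden via keyword
missing = ext.getConfig("missing")              # unknown key, no default given
fallback = ext.getConfig("missing", "fallback") # unknown key, explicit default

# The previous value of 'enabled' is a Boolean, so the string is converted.
ext.setConfig("enabled", "false")
configs = ext.getConfigs()

try:
    ext.setConfig("unknown", 1)
    raised = False
except KeyError:
    raised = True

print(title, repr(missing), fallback, configs, raised)
```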
716
717### Naming an Extension { #naming_an_extension }
718
As noted in the [library reference], an instance of an extension can be passed directly to `markdown.Markdown`. In
720fact, this is the preferred way to use third-party extensions.
721
722For example:
723
724```python
725import markdown
726from path.to.module import MyExtension
727md = markdown.Markdown(extensions=[MyExtension(option='value')])
728```
729
However, Markdown also accepts "named" third-party extensions for those occasions when it is impractical to import an
731extension directly (from the command line or from within templates). A "name" can either be a registered [entry
732point](#entry_point) or a string using Python's [dot notation](#dot_notation).
733
734#### Entry Point { #entry_point }
735
736[Entry points] are defined in a Python package's `setup.py` script. The script must use [setuptools] to support entry
737points. Python-Markdown extensions must be assigned to the `markdown.extensions` group. An entry point definition
738might look like this:
739
740```python
741from setuptools import setup
742
743setup(
744    # ...
745    entry_points={
746        'markdown.extensions': ['myextension = path.to.module:MyExtension']
747    }
748)
749```
750
751After a user installs your extension using the above script, they could then call the extension using the
752`myextension` string name like this:
753
754```python
755markdown.markdown(text, extensions=['myextension'])
756```
757
758Note that if two or more entry points within the same group are assigned the same name, Python-Markdown will only ever
759use the first one found and ignore all others. Therefore, be sure to give your extension a unique name.
760
761For more information on writing `setup.py` scripts, see the Python documentation on [Packaging and Distributing
762Projects].
763
764#### Dot Notation { #dot_notation }
765
766If an extension does not have a registered entry point, Python's dot notation may be used instead. The extension must
767be installed as a Python module on your PYTHONPATH. Generally, a class should be specified in the name. The class must
768be at the end of the name and be separated by a colon from the module.
769
770Therefore, if you were to import the class like this:
771
772```python
773from path.to.module import MyExtension
774```
775
776Then the extension can be loaded as follows:
777
778```python
779markdown.markdown(text, extensions=['path.to.module:MyExtension'])
780```
781
782You do not need to do anything special to support this feature. As long as your extension class is able to be
783imported, a user can include it with the above syntax.
784
785The above two methods are especially useful if you need to implement a large number of extensions with more than one
786residing in a module. However, if you do not want to require that your users include the class name in their string,
787you must define only one extension per module and that module must contain a module-level function called
788`makeExtension` that accepts `**kwargs` and returns an extension instance.
789
790For example:
791
```python
import markdown

class MyExtension(markdown.extensions.Extension):
    # Define extension here...
    pass

def makeExtension(**kwargs):
    return MyExtension(**kwargs)
```
799
When `markdown.Markdown` is passed the "name" of your extension as a dot notation string that does not include a class
(for example `path.to.module`), it will import the module and call the `makeExtension` function to instantiate your
extension.
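
Both forms of dot notation can be tried without installing a package, as in the sketch below. It places a stand-in
module object into `sys.modules` so that the import succeeds. This is a shortcut for demonstration only; a real
extension would live in an importable file. The module name `demo_ext_module` and all class names are hypothetical.

```python
import sys
import types

import markdown
from markdown.extensions import Extension
from markdown.postprocessors import Postprocessor


class SignaturePostprocessor(Postprocessor):
    def run(self, text):
        # Append a marker so the effect of the extension is visible.
        return text + "\n<!-- demo -->"


class DemoExtension(Extension):
    def extendMarkdown(self, md):
        md.postprocessors.register(SignaturePostprocessor(md), "demo_signature", 5)


def makeExtension(**kwargs):
    return DemoExtension(**kwargs)


# Stand-in for a real, importable module named `demo_ext_module` (hypothetical).
module = types.ModuleType("demo_ext_module")
module.DemoExtension = DemoExtension
module.makeExtension = makeExtension
sys.modules["demo_ext_module"] = module

# With a class name: the class is looked up on the module and instantiated.
by_class = markdown.markdown("hi", extensions=["demo_ext_module:DemoExtension"])
# Without a class name: the module-level `makeExtension` function is called.
by_module = markdown.markdown("hi", extensions=["demo_ext_module"])

print(by_class)
print(by_module)
```

Both calls should load the same extension and therefore produce identical output.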
803
## Registries {: #registry }
805
806The `markdown.util.Registry` class is a priority sorted registry which Markdown uses internally to determine the
807processing order of its various processors and patterns.
808
809A `Registry` instance provides two public methods to alter the data of the registry: `register` and `deregister`. Use
810`register` to add items and `deregister` to remove items. See each method for specifics.
811
812When registering an item, a "name" and a "priority" must be provided. All items are automatically sorted by the value
813of the "priority" parameter such that the item with the highest value will be processed first. The "name" is used to
814remove (`deregister`) and get items.
815
816A `Registry` instance is like a list (which maintains order) when reading data. You may iterate over the items, get an
817item and get a count (length) of all items. You may also check that the registry contains an item.
818
819When getting an item you may use either the index of the item or the string-based "name". For example:
820
```python
from markdown.util import Registry

# `SomeItem` stands in for any processor or pattern class.
registry = Registry()
registry.register(SomeItem(), 'itemname', 20)
# Get the item by index
item = registry[0]
# Get the item by name
item = registry['itemname']
```
829
830When checking that the registry contains an item, you may use either the string-based "name", or a reference to the
831actual item. For example:
832
833```python
834someitem = SomeItem()
835registry.register(someitem, 'itemname', 20)
836# Contains the name
837assert 'itemname' in registry
838# Contains the item instance
839assert someitem in registry
840```
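
The ordering and removal behavior of a registry can be checked in isolation. The sketch below registers plain
strings as stand-ins for processors; the names `a`, `b`, and `c` and the priorities are arbitrary.

```python
from markdown.util import Registry

registry = Registry()
registry.register("low", "a", 10)
registry.register("high", "b", 30)
registry.register("mid", "c", 20)

order = list(registry)                         # highest priority first
index_of_c = registry.get_index_for_name("c")

# Registering under an existing name replaces the old item, and the new
# item is positioned by its own priority.
registry.register("replacement", "a", 40)
new_order = list(registry)

registry.deregister("b")
registry.deregister("missing", strict=False)   # unknown name, fails silently
try:
    registry.deregister("missing")             # unknown name, strict by default
    raised = False
except ValueError:
    raised = True
remaining = list(registry)

print(order, index_of_c, new_order, remaining, raised)
```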
841
842`markdown.util.Registry` has the following methods:
843
844### `Registry.register(self, item, name, priority)` {: #registry.register data-toc-label='Registry.register'}
845
846:   Add an item to the registry with the given name and priority.
847
848    Parameters:
849
850    * `item`: The item being registered.
851    * `name`: A string used to reference the item.
852    * `priority`: An integer or float used to sort against all items.
853
854    If an item is registered with a "name" which already exists, the existing item is replaced with the new item.
855    Tread carefully as the old item is lost with no way to recover it. The new item will be sorted according to its
856    priority and will **not** retain the position of the old item.
857
### `Registry.deregister(self, name, strict=True)` {: #registry.deregister data-toc-label='Registry.deregister'}
859
860:   Remove an item from the registry.
861
    If no item is registered under `name`, a `ValueError` is raised. Set `strict=False` to fail silently.
863
864### `Registry.get_index_for_name(self, name)` {: #registry.get_index_for_name data-toc-label='Registry.get_index_for_name'}
865
866:   Return the index of the given `name`.
867
868[match object]: https://docs.python.org/3/library/re.html#match-objects
869[bug tracker]: https://github.com/Python-Markdown/markdown/issues
870[extension source]:  https://github.com/Python-Markdown/markdown/tree/master/markdown/extensions
871[tutorial]: https://github.com/Python-Markdown/markdown/wiki/Tutorial:-Writing-Extensions-for-Python-Markdown
872[workingwithetree]: #working_with_et
873[Integrating your code into Markdown]: #integrating_into_markdown
874[extendMarkdown]: #extendmarkdown
875[Registry]: #registry
876[registerExtension]: #registerextension
877[Config Settings]: #configsettings
878[makeExtension]: #makeextension
879[ElementTree]: https://docs.python.org/3/library/xml.etree.elementtree.html
880[Available Extensions]: index.md
881[Footnotes]: https://github.com/Python-Markdown/markdown/blob/master/markdown/extensions/footnotes.py
882[Definition Lists]: https://github.com/Python-Markdown/markdown/blob/master/markdown/extensions/definition_lists
883[library reference]: ../reference.md
884[setuptools]: https://packaging.python.org/key_projects/#setuptools
885[Entry points]: https://setuptools.readthedocs.io/en/latest/setuptools.html#dynamic-discovery-of-services-and-plugins
886[Packaging and Distributing Projects]: https://packaging.python.org/tutorials/distributing-packages/
887