1# Introduction
2
3## What is Markdown?
4
5Markdown is a plain text format for writing structured documents,
6based on conventions for indicating formatting in email
7and usenet posts.  It was developed by John Gruber (with
8help from Aaron Swartz) and released in 2004 in the form of a
9[syntax description](http://daringfireball.net/projects/markdown/syntax)
10and a Perl script (`Markdown.pl`) for converting Markdown to
11HTML.  In the next decade, dozens of implementations were
12developed in many languages.  Some extended the original
13Markdown syntax with conventions for footnotes, tables, and
14other document elements.  Some allowed Markdown documents to be
15rendered in formats other than HTML.  Websites like Reddit,
16StackOverflow, and GitHub had millions of people using Markdown.
17And Markdown started to be used beyond the web, to author books,
18articles, slide shows, letters, and lecture notes.
19
20What distinguishes Markdown from many other lightweight markup
21syntaxes, which are often easier to write, is its readability.
22As Gruber writes:
23
24> The overriding design goal for Markdown's formatting syntax is
25> to make it as readable as possible. The idea is that a
26> Markdown-formatted document should be publishable as-is, as
27> plain text, without looking like it's been marked up with tags
28> or formatting instructions.
29> (<http://daringfireball.net/projects/markdown/>)
30
31The point can be illustrated by comparing a sample of
32[AsciiDoc](http://www.methods.co.nz/asciidoc/) with
33an equivalent sample of Markdown.  Here is a sample of
34AsciiDoc from the AsciiDoc manual:
35
36```
371. List item one.
38+
39List item one continued with a second paragraph followed by an
40Indented block.
41+
42.................
43$ ls *.sh
44$ mv *.sh ~/tmp
45.................
46+
47List item continued with a third paragraph.
48
492. List item two continued with an open block.
50+
51--
52This paragraph is part of the preceding list item.
53
54a. This list is nested and does not require explicit item
55continuation.
56+
57This paragraph is part of the preceding list item.
58
59b. List item b.
60
61This paragraph belongs to item two of the outer list.
62--
63```
64
65And here is the equivalent in Markdown:
66```
671.  List item one.
68
69    List item one continued with a second paragraph followed by an
70    Indented block.
71
72        $ ls *.sh
73        $ mv *.sh ~/tmp
74
75    List item continued with a third paragraph.
76
772.  List item two continued with an open block.
78
79    This paragraph is part of the preceding list item.
80
81    1. This list is nested and does not require explicit item continuation.
82
83       This paragraph is part of the preceding list item.
84
85    2. List item b.
86
87    This paragraph belongs to item two of the outer list.
88```
89
90The AsciiDoc version is, arguably, easier to write. You don't need
91to worry about indentation.  But the Markdown version is much easier
92to read.  The nesting of list items is apparent to the eye in the
93source, not just in the processed document.
94
95## Why is a spec needed?
96
97John Gruber's [canonical description of Markdown's
98syntax](http://daringfireball.net/projects/markdown/syntax)
99does not specify the syntax unambiguously.  Here are some examples of
100questions it does not answer:
101
1021.  How much indentation is needed for a sublist?  The spec says that
103    continuation paragraphs need to be indented four spaces, but is
104    not fully explicit about sublists.  It is natural to think that
105    they, too, must be indented four spaces, but `Markdown.pl` does
106    not require that.  This is hardly a "corner case," and divergences
107    between implementations on this issue often lead to surprises for
108    users in real documents. (See [this comment by John
109    Gruber](http://article.gmane.org/gmane.text.markdown.general/1997).)
110
1112.  Is a blank line needed before a block quote or heading?
112    Most implementations do not require the blank line.  However,
113    this can lead to unexpected results in hard-wrapped text, and
114    also to ambiguities in parsing (note that some implementations
115    put the heading inside the blockquote, while others do not).
116    (John Gruber has also spoken [in favor of requiring the blank
117    lines](http://article.gmane.org/gmane.text.markdown.general/2146).)
118
1193.  Is a blank line needed before an indented code block?
120    (`Markdown.pl` requires it, but this is not mentioned in the
121    documentation, and some implementations do not require it.)
122
123    ``` markdown
124    paragraph
125        code?
126    ```
127
1284.  What is the exact rule for determining when list items get
129    wrapped in `<p>` tags?  Can a list be partially "loose" and partially
130    "tight"?  What should we do with a list like this?
131
132    ``` markdown
133    1. one
134
135    2. two
136    3. three
137    ```
138
139    Or this?
140
141    ``` markdown
142    1.  one
143        - a
144
145        - b
146    2.  two
147    ```
148
149    (There are some relevant comments by John Gruber
150    [here](http://article.gmane.org/gmane.text.markdown.general/2554).)
151
1525.  Can list markers be indented?  Can ordered list markers be right-aligned?
153
154    ``` markdown
155     8. item 1
156     9. item 2
157    10. item 2a
158    ```
159
1606.  Is this one list with a thematic break in its second item,
161    or two lists separated by a thematic break?
162
163    ``` markdown
164    * a
165    * * * * *
166    * b
167    ```
168
1697.  When list markers change from numbers to bullets, do we have
170    two lists or one?  (The Markdown syntax description suggests two,
171    but the perl scripts and many other implementations produce one.)
172
173    ``` markdown
174    1. fee
175    2. fie
176    -  foe
177    -  fum
178    ```
179
1808.  What are the precedence rules for the markers of inline structure?
181    For example, is the following a valid link, or does the code span
182    take precedence ?
183
184    ``` markdown
185    [a backtick (`)](/url) and [another backtick (`)](/url).
186    ```
187
1889.  What are the precedence rules for markers of emphasis and strong
189    emphasis?  For example, how should the following be parsed?
190
191    ``` markdown
192    *foo *bar* baz*
193    ```
194
19510. What are the precedence rules between block-level and inline-level
196    structure?  For example, how should the following be parsed?
197
198    ``` markdown
199    - `a long code span can contain a hyphen like this
200      - and it can screw things up`
201    ```
202
20311. Can list items include section headings?  (`Markdown.pl` does not
204    allow this, but does allow blockquotes to include headings.)
205
206    ``` markdown
207    - # Heading
208    ```
209
21012. Can list items be empty?
211
212    ``` markdown
213    * a
214    *
215    * b
216    ```
217
21813. Can link references be defined inside block quotes or list items?
219
220    ``` markdown
221    > Blockquote [foo].
222    >
223    > [foo]: /url
224    ```
225
22614. If there are multiple definitions for the same reference, which takes
227    precedence?
228
229    ``` markdown
230    [foo]: /url1
231    [foo]: /url2
232
233    [foo][]
234    ```
235
236In the absence of a spec, early implementers consulted `Markdown.pl`
237to resolve these ambiguities.  But `Markdown.pl` was quite buggy, and
238gave manifestly bad results in many cases, so it was not a
239satisfactory replacement for a spec.
240
241Because there is no unambiguous spec, implementations have diverged
242considerably.  As a result, users are often surprised to find that
243a document that renders one way on one system (say, a GitHub wiki)
244renders differently on another (say, converting to docbook using
245pandoc).  To make matters worse, because nothing in Markdown counts
246as a "syntax error," the divergence often isn't discovered right away.
247
248## About this document
249
250This document attempts to specify Markdown syntax unambiguously.
251It contains many examples with side-by-side Markdown and
252HTML.  These are intended to double as conformance tests.  An
253accompanying script `spec_tests.py` can be used to run the tests
254against any Markdown program:
255
256    python test/spec_tests.py --spec spec.txt --program PROGRAM
257
258Since this document describes how Markdown is to be parsed into
259an abstract syntax tree, it would have made sense to use an abstract
260representation of the syntax tree instead of HTML.  But HTML is capable
261of representing the structural distinctions we need to make, and the
262choice of HTML for the tests makes it possible to run the tests against
263an implementation without writing an abstract syntax tree renderer.
264
265This document is generated from a text file, `spec.txt`, written
266in Markdown with a small extension for the side-by-side tests.
267The script `tools/makespec.py` can be used to convert `spec.txt` into
268HTML or CommonMark (which can then be converted into other formats).
269
270In the examples, the `→` character is used to represent tabs.
271
272# Preliminaries
273
274## Characters and lines
275
276Any sequence of [characters] is a valid CommonMark
277document.
278
279A [character](@) is a Unicode code point.  Although some
280code points (for example, combining accents) do not correspond to
281characters in an intuitive sense, all code points count as characters
282for purposes of this spec.
283
284This spec does not specify an encoding; it thinks of lines as composed
285of [characters] rather than bytes.  A conforming parser may be limited
286to a certain encoding.
287
288A [line](@) is a sequence of zero or more [characters]
289other than newline (`U+000A`) or carriage return (`U+000D`),
290followed by a [line ending] or by the end of file.
291
292A [line ending](@) is a newline (`U+000A`), a carriage return
293(`U+000D`) not followed by a newline, or a carriage return and a
294following newline.
295
296A line containing no characters, or a line containing only spaces
297(`U+0020`) or tabs (`U+0009`), is called a [blank line](@).
298
299The following definitions of character classes will be used in this spec:
300
301A [whitespace character](@) is a space
302(`U+0020`), tab (`U+0009`), newline (`U+000A`), line tabulation (`U+000B`),
303form feed (`U+000C`), or carriage return (`U+000D`).
304
305[Whitespace](@) is a sequence of one or more [whitespace
306characters].
307
308A [Unicode whitespace character](@) is
309any code point in the Unicode `Zs` general category, or a tab (`U+0009`),
310carriage return (`U+000D`), newline (`U+000A`), or form feed
311(`U+000C`).
312
313[Unicode whitespace](@) is a sequence of one
314or more [Unicode whitespace characters].
315
316A [space](@) is `U+0020`.
317
318A [non-whitespace character](@) is any character
319that is not a [whitespace character].
320
321An [ASCII punctuation character](@)
322is `!`, `"`, `#`, `$`, `%`, `&`, `'`, `(`, `)`,
323`*`, `+`, `,`, `-`, `.`, `/` (U+0021–2F),
324`:`, `;`, `<`, `=`, `>`, `?`, `@` (U+003A–0040),
325`[`, `\`, `]`, `^`, `_`, `` ` `` (U+005B–0060),
326`{`, `|`, `}`, or `~` (U+007B–007E).
327
328A [punctuation character](@) is an [ASCII
329punctuation character] or anything in
330the general Unicode categories  `Pc`, `Pd`, `Pe`, `Pf`, `Pi`, `Po`, or `Ps`.
331
332## Tabs
333
334Tabs in lines are not expanded to [spaces].  However,
335in contexts where whitespace helps to define block structure,
336tabs behave as if they were replaced by spaces with a tab stop
337of 4 characters.
338
339Thus, for example, a tab can be used instead of four spaces
340in an indented code block.  (Note, however, that internal
341tabs are passed through as literal tabs, not expanded to
342spaces.)
343
344```````````````````````````````` example
345→foo→baz→→bim
346.
347<pre><code>foo→baz→→bim
348</code></pre>
349````````````````````````````````
350
351```````````````````````````````` example
352  →foo→baz→→bim
353.
354<pre><code>foo→baz→→bim
355</code></pre>
356````````````````````````````````
357
358```````````````````````````````` example
359    a→a
360    ὐ→a
361.
362<pre><code>a→a
363ὐ→a
364</code></pre>
365````````````````````````````````
366
367In the following example, a continuation paragraph of a list
368item is indented with a tab; this has exactly the same effect
369as indentation with four spaces would:
370
371```````````````````````````````` example
372  - foo
373
374→bar
375.
376<ul>
377<li>
378<p>foo</p>
379<p>bar</p>
380</li>
381</ul>
382````````````````````````````````
383
384```````````````````````````````` example
385- foo
386
387→→bar
388.
389<ul>
390<li>
391<p>foo</p>
392<pre><code>  bar
393</code></pre>
394</li>
395</ul>
396````````````````````````````````
397
398Normally the `>` that begins a block quote may be followed
399optionally by a space, which is not considered part of the
400content.  In the following case `>` is followed by a tab,
401which is treated as if it were expanded into three spaces.
402Since one of these spaces is considered part of the
403delimiter, `foo` is considered to be indented six spaces
404inside the block quote context, so we get an indented
405code block starting with two spaces.
406
407```````````````````````````````` example
408>→→foo
409.
410<blockquote>
411<pre><code>  foo
412</code></pre>
413</blockquote>
414````````````````````````````````
415
416```````````````````````````````` example
417-→→foo
418.
419<ul>
420<li>
421<pre><code>  foo
422</code></pre>
423</li>
424</ul>
425````````````````````````````````
426
427
428```````````````````````````````` example
429    foo
430→bar
431.
432<pre><code>foo
433bar
434</code></pre>
435````````````````````````````````
436
437```````````````````````````````` example
438 - foo
439   - bar
440→ - baz
441.
442<ul>
443<li>foo
444<ul>
445<li>bar
446<ul>
447<li>baz</li>
448</ul>
449</li>
450</ul>
451</li>
452</ul>
453````````````````````````````````
454
455```````````````````````````````` example
456#→Foo
457.
458<h1>Foo</h1>
459````````````````````````````````
460
461```````````````````````````````` example
462*→*→*→
463.
464<hr />
465````````````````````````````````
466
467
468## Insecure characters
469
470For security reasons, the Unicode character `U+0000` must be replaced
471with the REPLACEMENT CHARACTER (`U+FFFD`).
472
473# Blocks and inlines
474
475We can think of a document as a sequence of
476[blocks](@)---structural elements like paragraphs, block
477quotations, lists, headings, rules, and code blocks.  Some blocks (like
478block quotes and list items) contain other blocks; others (like
479headings and paragraphs) contain [inline](@) content---text,
480links, emphasized text, images, code spans, and so on.
481
482## Precedence
483
484Indicators of block structure always take precedence over indicators
485of inline structure.  So, for example, the following is a list with
486two items, not a list with one item containing a code span:
487
488```````````````````````````````` example
489- `one
490- two`
491.
492<ul>
493<li>`one</li>
494<li>two`</li>
495</ul>
496````````````````````````````````
497
498
499This means that parsing can proceed in two steps:  first, the block
500structure of the document can be discerned; second, text lines inside
501paragraphs, headings, and other block constructs can be parsed for inline
502structure.  The second step requires information about link reference
503definitions that will be available only at the end of the first
504step.  Note that the first step requires processing lines in sequence,
505but the second can be parallelized, since the inline parsing of
506one block element does not affect the inline parsing of any other.
507
508## Container blocks and leaf blocks
509
510We can divide blocks into two types:
511[container blocks](@),
512which can contain other blocks, and [leaf blocks](@),
513which cannot.
514
515# Leaf blocks
516
517This section describes the different kinds of leaf block that make up a
518Markdown document.
519
520## Thematic breaks
521
522A line consisting of 0-3 spaces of indentation, followed by a sequence
523of three or more matching `-`, `_`, or `*` characters, each followed
524optionally by any number of spaces or tabs, forms a
525[thematic break](@).
526
527```````````````````````````````` example
528***
529---
530___
531.
532<hr />
533<hr />
534<hr />
535````````````````````````````````
536
537
538Wrong characters:
539
540```````````````````````````````` example
541+++
542.
543<p>+++</p>
544````````````````````````````````
545
546
547```````````````````````````````` example
548===
549.
550<p>===</p>
551````````````````````````````````
552
553
554Not enough characters:
555
556```````````````````````````````` example
557--
558**
559__
560.
561<p>--
562**
563__</p>
564````````````````````````````````
565
566
567One to three spaces indent are allowed:
568
569```````````````````````````````` example
570 ***
571  ***
572   ***
573.
574<hr />
575<hr />
576<hr />
577````````````````````````````````
578
579
580Four spaces is too many:
581
582```````````````````````````````` example
583    ***
584.
585<pre><code>***
586</code></pre>
587````````````````````````````````
588
589
590```````````````````````````````` example
591Foo
592    ***
593.
594<p>Foo
595***</p>
596````````````````````````````````
597
598
599More than three characters may be used:
600
601```````````````````````````````` example
602_____________________________________
603.
604<hr />
605````````````````````````````````
606
607
608Spaces are allowed between the characters:
609
610```````````````````````````````` example
611 - - -
612.
613<hr />
614````````````````````````````````
615
616
617```````````````````````````````` example
618 **  * ** * ** * **
619.
620<hr />
621````````````````````````````````
622
623
624```````````````````````````````` example
625-     -      -      -
626.
627<hr />
628````````````````````````````````
629
630
631Spaces are allowed at the end:
632
633```````````````````````````````` example
634- - - -
635.
636<hr />
637````````````````````````````````
638
639
640However, no other characters may occur in the line:
641
642```````````````````````````````` example
643_ _ _ _ a
644
645a------
646
647---a---
648.
649<p>_ _ _ _ a</p>
650<p>a------</p>
651<p>---a---</p>
652````````````````````````````````
653
654
655It is required that all of the [non-whitespace characters] be the same.
656So, this is not a thematic break:
657
658```````````````````````````````` example
659 *-*
660.
661<p><em>-</em></p>
662````````````````````````````````
663
664
665Thematic breaks do not need blank lines before or after:
666
667```````````````````````````````` example
668- foo
669***
670- bar
671.
672<ul>
673<li>foo</li>
674</ul>
675<hr />
676<ul>
677<li>bar</li>
678</ul>
679````````````````````````````````
680
681
682Thematic breaks can interrupt a paragraph:
683
684```````````````````````````````` example
685Foo
686***
687bar
688.
689<p>Foo</p>
690<hr />
691<p>bar</p>
692````````````````````````````````
693
694
695If a line of dashes that meets the above conditions for being a
696thematic break could also be interpreted as the underline of a [setext
697heading], the interpretation as a
698[setext heading] takes precedence. Thus, for example,
699this is a setext heading, not a paragraph followed by a thematic break:
700
701```````````````````````````````` example
702Foo
703---
704bar
705.
706<h2>Foo</h2>
707<p>bar</p>
708````````````````````````````````
709
710
711When both a thematic break and a list item are possible
712interpretations of a line, the thematic break takes precedence:
713
714```````````````````````````````` example
715* Foo
716* * *
717* Bar
718.
719<ul>
720<li>Foo</li>
721</ul>
722<hr />
723<ul>
724<li>Bar</li>
725</ul>
726````````````````````````````````
727
728
729If you want a thematic break in a list item, use a different bullet:
730
731```````````````````````````````` example
732- Foo
733- * * *
734.
735<ul>
736<li>Foo</li>
737<li>
738<hr />
739</li>
740</ul>
741````````````````````````````````
742
743
744## ATX headings
745
746An [ATX heading](@)
747consists of a string of characters, parsed as inline content, between an
748opening sequence of 1--6 unescaped `#` characters and an optional
749closing sequence of any number of unescaped `#` characters.
750The opening sequence of `#` characters must be followed by a
751[space] or by the end of line. The optional closing sequence of `#`s must be
752preceded by a [space] and may be followed by spaces only.  The opening
753`#` character may be indented 0-3 spaces.  The raw contents of the
754heading are stripped of leading and trailing spaces before being parsed
755as inline content.  The heading level is equal to the number of `#`
756characters in the opening sequence.
757
758Simple headings:
759
760```````````````````````````````` example
761# foo
762## foo
763### foo
764#### foo
765##### foo
766###### foo
767.
768<h1>foo</h1>
769<h2>foo</h2>
770<h3>foo</h3>
771<h4>foo</h4>
772<h5>foo</h5>
773<h6>foo</h6>
774````````````````````````````````
775
776
777More than six `#` characters is not a heading:
778
779```````````````````````````````` example
780####### foo
781.
782<p>####### foo</p>
783````````````````````````````````
784
785
786At least one space is required between the `#` characters and the
787heading's contents, unless the heading is empty.  Note that many
788implementations currently do not require the space.  However, the
789space was required by the
790[original ATX implementation](http://www.aaronsw.com/2002/atx/atx.py),
791and it helps prevent things like the following from being parsed as
792headings:
793
794```````````````````````````````` example
795#5 bolt
796
797#hashtag
798.
799<p>#5 bolt</p>
800<p>#hashtag</p>
801````````````````````````````````
802
803
804This is not a heading, because the first `#` is escaped:
805
806```````````````````````````````` example
807\## foo
808.
809<p>## foo</p>
810````````````````````````````````
811
812
813Contents are parsed as inlines:
814
815```````````````````````````````` example
816# foo *bar* \*baz\*
817.
818<h1>foo <em>bar</em> *baz*</h1>
819````````````````````````````````
820
821
822Leading and trailing [whitespace] is ignored in parsing inline content:
823
824```````````````````````````````` example
825#                  foo
826.
827<h1>foo</h1>
828````````````````````````````````
829
830
831One to three spaces indentation are allowed:
832
833```````````````````````````````` example
834 ### foo
835  ## foo
836   # foo
837.
838<h3>foo</h3>
839<h2>foo</h2>
840<h1>foo</h1>
841````````````````````````````````
842
843
844Four spaces are too much:
845
846```````````````````````````````` example
847    # foo
848.
849<pre><code># foo
850</code></pre>
851````````````````````````````````
852
853
854```````````````````````````````` example
855foo
856    # bar
857.
858<p>foo
859# bar</p>
860````````````````````````````````
861
862
863A closing sequence of `#` characters is optional:
864
865```````````````````````````````` example
866## foo ##
867  ###   bar    ###
868.
869<h2>foo</h2>
870<h3>bar</h3>
871````````````````````````````````
872
873
874It need not be the same length as the opening sequence:
875
876```````````````````````````````` example
877# foo ##################################
878##### foo ##
879.
880<h1>foo</h1>
881<h5>foo</h5>
882````````````````````````````````
883
884
885Spaces are allowed after the closing sequence:
886
887```````````````````````````````` example
888### foo ###
889.
890<h3>foo</h3>
891````````````````````````````````
892
893
894A sequence of `#` characters with anything but [spaces] following it
895is not a closing sequence, but counts as part of the contents of the
896heading:
897
898```````````````````````````````` example
899### foo ### b
900.
901<h3>foo ### b</h3>
902````````````````````````````````
903
904
905The closing sequence must be preceded by a space:
906
907```````````````````````````````` example
908# foo#
909.
910<h1>foo#</h1>
911````````````````````````````````
912
913
914Backslash-escaped `#` characters do not count as part
915of the closing sequence:
916
917```````````````````````````````` example
918### foo \###
919## foo #\##
920# foo \#
921.
922<h3>foo ###</h3>
923<h2>foo ###</h2>
924<h1>foo #</h1>
925````````````````````````````````
926
927
928ATX headings need not be separated from surrounding content by blank
929lines, and they can interrupt paragraphs:
930
931```````````````````````````````` example
932****
933## foo
934****
935.
936<hr />
937<h2>foo</h2>
938<hr />
939````````````````````````````````
940
941
942```````````````````````````````` example
943Foo bar
944# baz
945Bar foo
946.
947<p>Foo bar</p>
948<h1>baz</h1>
949<p>Bar foo</p>
950````````````````````````````````
951
952
953ATX headings can be empty:
954
955```````````````````````````````` example
956##
957#
958### ###
959.
960<h2></h2>
961<h1></h1>
962<h3></h3>
963````````````````````````````````
964
965
966## Setext headings
967
968A [setext heading](@) consists of one or more
969lines of text, each containing at least one [non-whitespace
970character], with no more than 3 spaces indentation, followed by
971a [setext heading underline].  The lines of text must be such
972that, were they not followed by the setext heading underline,
973they would be interpreted as a paragraph:  they cannot be
974interpretable as a [code fence], [ATX heading][ATX headings],
975[block quote][block quotes], [thematic break][thematic breaks],
976[list item][list items], or [HTML block][HTML blocks].
977
978A [setext heading underline](@) is a sequence of
979`=` characters or a sequence of `-` characters, with no more than 3
980spaces indentation and any number of trailing spaces.  If a line
981containing a single `-` can be interpreted as an
982empty [list items], it should be interpreted this way
983and not as a [setext heading underline].
984
985The heading is a level 1 heading if `=` characters are used in
986the [setext heading underline], and a level 2 heading if `-`
987characters are used.  The contents of the heading are the result
988of parsing the preceding lines of text as CommonMark inline
989content.
990
991In general, a setext heading need not be preceded or followed by a
992blank line.  However, it cannot interrupt a paragraph, so when a
993setext heading comes after a paragraph, a blank line is needed between
994them.
995
996Simple examples:
997
998```````````````````````````````` example
999Foo *bar*
1000=========
1001
1002Foo *bar*
1003---------
1004.
1005<h1>Foo <em>bar</em></h1>
1006<h2>Foo <em>bar</em></h2>
1007````````````````````````````````
1008
1009
1010The content of the header may span more than one line:
1011
1012```````````````````````````````` example
1013Foo *bar
1014baz*
1015====
1016.
1017<h1>Foo <em>bar
1018baz</em></h1>
1019````````````````````````````````
1020
1021The contents are the result of parsing the headings's raw
1022content as inlines.  The heading's raw content is formed by
1023concatenating the lines and removing initial and final
1024[whitespace].
1025
1026```````````````````````````````` example
1027  Foo *bar
1028baz*→
1029====
1030.
1031<h1>Foo <em>bar
1032baz</em></h1>
1033````````````````````````````````
1034
1035
1036The underlining can be any length:
1037
1038```````````````````````````````` example
1039Foo
1040-------------------------
1041
1042Foo
1043=
1044.
1045<h2>Foo</h2>
1046<h1>Foo</h1>
1047````````````````````````````````
1048
1049
1050The heading content can be indented up to three spaces, and need
1051not line up with the underlining:
1052
1053```````````````````````````````` example
1054   Foo
1055---
1056
1057  Foo
1058-----
1059
1060  Foo
1061  ===
1062.
1063<h2>Foo</h2>
1064<h2>Foo</h2>
1065<h1>Foo</h1>
1066````````````````````````````````
1067
1068
1069Four spaces indent is too much:
1070
1071```````````````````````````````` example
1072    Foo
1073    ---
1074
1075    Foo
1076---
1077.
1078<pre><code>Foo
1079---
1080
1081Foo
1082</code></pre>
1083<hr />
1084````````````````````````````````
1085
1086
1087The setext heading underline can be indented up to three spaces, and
1088may have trailing spaces:
1089
1090```````````````````````````````` example
1091Foo
1092   ----
1093.
1094<h2>Foo</h2>
1095````````````````````````````````
1096
1097
1098Four spaces is too much:
1099
1100```````````````````````````````` example
1101Foo
1102    ---
1103.
1104<p>Foo
1105---</p>
1106````````````````````````````````
1107
1108
1109The setext heading underline cannot contain internal spaces:
1110
1111```````````````````````````````` example
1112Foo
1113= =
1114
1115Foo
1116--- -
1117.
1118<p>Foo
1119= =</p>
1120<p>Foo</p>
1121<hr />
1122````````````````````````````````
1123
1124
1125Trailing spaces in the content line do not cause a line break:
1126
1127```````````````````````````````` example
1128Foo
1129-----
1130.
1131<h2>Foo</h2>
1132````````````````````````````````
1133
1134
1135Nor does a backslash at the end:
1136
1137```````````````````````````````` example
1138Foo\
1139----
1140.
1141<h2>Foo\</h2>
1142````````````````````````````````
1143
1144
1145Since indicators of block structure take precedence over
1146indicators of inline structure, the following are setext headings:
1147
1148```````````````````````````````` example
1149`Foo
1150----
1151`
1152
1153<a title="a lot
1154---
1155of dashes"/>
1156.
1157<h2>`Foo</h2>
1158<p>`</p>
1159<h2>&lt;a title=&quot;a lot</h2>
1160<p>of dashes&quot;/&gt;</p>
1161````````````````````````````````
1162
1163
1164The setext heading underline cannot be a [lazy continuation
1165line] in a list item or block quote:
1166
1167```````````````````````````````` example
1168> Foo
1169---
1170.
1171<blockquote>
1172<p>Foo</p>
1173</blockquote>
1174<hr />
1175````````````````````````````````
1176
1177
1178```````````````````````````````` example
1179> foo
1180bar
1181===
1182.
1183<blockquote>
1184<p>foo
1185bar
1186===</p>
1187</blockquote>
1188````````````````````````````````
1189
1190
1191```````````````````````````````` example
1192- Foo
1193---
1194.
1195<ul>
1196<li>Foo</li>
1197</ul>
1198<hr />
1199````````````````````````````````
1200
1201
1202A blank line is needed between a paragraph and a following
1203setext heading, since otherwise the paragraph becomes part
1204of the heading's content:
1205
1206```````````````````````````````` example
1207Foo
1208Bar
1209---
1210.
1211<h2>Foo
1212Bar</h2>
1213````````````````````````````````
1214
1215
1216But in general a blank line is not required before or after
1217setext headings:
1218
1219```````````````````````````````` example
1220---
1221Foo
1222---
1223Bar
1224---
1225Baz
1226.
1227<hr />
1228<h2>Foo</h2>
1229<h2>Bar</h2>
1230<p>Baz</p>
1231````````````````````````````````
1232
1233
1234Setext headings cannot be empty:
1235
1236```````````````````````````````` example
1237
1238====
1239.
1240<p>====</p>
1241````````````````````````````````
1242
1243
1244Setext heading text lines must not be interpretable as block
1245constructs other than paragraphs.  So, the line of dashes
1246in these examples gets interpreted as a thematic break:
1247
1248```````````````````````````````` example
1249---
1250---
1251.
1252<hr />
1253<hr />
1254````````````````````````````````
1255
1256
1257```````````````````````````````` example
1258- foo
1259-----
1260.
1261<ul>
1262<li>foo</li>
1263</ul>
1264<hr />
1265````````````````````````````````
1266
1267
1268```````````````````````````````` example
1269    foo
1270---
1271.
1272<pre><code>foo
1273</code></pre>
1274<hr />
1275````````````````````````````````
1276
1277
1278```````````````````````````````` example
1279> foo
1280-----
1281.
1282<blockquote>
1283<p>foo</p>
1284</blockquote>
1285<hr />
1286````````````````````````````````
1287
1288
1289If you want a heading with `> foo` as its literal text, you can
1290use backslash escapes:
1291
1292```````````````````````````````` example
1293\> foo
1294------
1295.
1296<h2>&gt; foo</h2>
1297````````````````````````````````
1298
1299
1300**Compatibility note:**  Most existing Markdown implementations
1301do not allow the text of setext headings to span multiple lines.
1302But there is no consensus about how to interpret
1303
1304``` markdown
1305Foo
1306bar
1307---
1308baz
1309```
1310
1311One can find four different interpretations:
1312
13131. paragraph "Foo", heading "bar", paragraph "baz"
13142. paragraph "Foo bar", thematic break, paragraph "baz"
13153. paragraph "Foo bar --- baz"
13164. heading "Foo bar", paragraph "baz"
1317
1318We find interpretation 4 most natural, and interpretation 4
1319increases the expressive power of CommonMark, by allowing
1320multiline headings.  Authors who want interpretation 1 can
1321put a blank line after the first paragraph:
1322
1323```````````````````````````````` example
1324Foo
1325
1326bar
1327---
1328baz
1329.
1330<p>Foo</p>
1331<h2>bar</h2>
1332<p>baz</p>
1333````````````````````````````````
1334
1335
1336Authors who want interpretation 2 can put blank lines around
1337the thematic break,
1338
1339```````````````````````````````` example
1340Foo
1341bar
1342
1343---
1344
1345baz
1346.
1347<p>Foo
1348bar</p>
1349<hr />
1350<p>baz</p>
1351````````````````````````````````
1352
1353
1354or use a thematic break that cannot count as a [setext heading
1355underline], such as
1356
1357```````````````````````````````` example
1358Foo
1359bar
1360* * *
1361baz
1362.
1363<p>Foo
1364bar</p>
1365<hr />
1366<p>baz</p>
1367````````````````````````````````
1368
1369
1370Authors who want interpretation 3 can use backslash escapes:
1371
1372```````````````````````````````` example
1373Foo
1374bar
1375\---
1376baz
1377.
1378<p>Foo
1379bar
1380---
1381baz</p>
1382````````````````````````````````
1383
1384
1385## Indented code blocks
1386
1387An [indented code block](@) is composed of one or more
1388[indented chunks] separated by blank lines.
1389An [indented chunk](@) is a sequence of non-blank lines,
1390each indented four or more spaces. The contents of the code block are
1391the literal contents of the lines, including trailing
1392[line endings], minus four spaces of indentation.
1393An indented code block has no [info string].
1394
1395An indented code block cannot interrupt a paragraph, so there must be
1396a blank line between a paragraph and a following indented code block.
1397(A blank line is not needed, however, between a code block and a following
1398paragraph.)
1399
1400```````````````````````````````` example
1401    a simple
1402      indented code block
1403.
1404<pre><code>a simple
1405  indented code block
1406</code></pre>
1407````````````````````````````````
1408
1409
1410If there is any ambiguity between an interpretation of indentation
1411as a code block and as indicating that material belongs to a [list
1412item][list items], the list item interpretation takes precedence:
1413
1414```````````````````````````````` example
1415  - foo
1416
1417    bar
1418.
1419<ul>
1420<li>
1421<p>foo</p>
1422<p>bar</p>
1423</li>
1424</ul>
1425````````````````````````````````
1426
1427
1428```````````````````````````````` example
14291.  foo
1430
1431    - bar
1432.
1433<ol>
1434<li>
1435<p>foo</p>
1436<ul>
1437<li>bar</li>
1438</ul>
1439</li>
1440</ol>
1441````````````````````````````````
1442
1443
1444
1445The contents of a code block are literal text, and do not get parsed
1446as Markdown:
1447
1448```````````````````````````````` example
1449    <a/>
1450    *hi*
1451
1452    - one
1453.
1454<pre><code>&lt;a/&gt;
1455*hi*
1456
1457- one
1458</code></pre>
1459````````````````````````````````
1460
1461
1462Here we have three chunks separated by blank lines:
1463
1464```````````````````````````````` example
1465    chunk1
1466
1467    chunk2
1468
1469
1470
1471    chunk3
1472.
1473<pre><code>chunk1
1474
1475chunk2
1476
1477
1478
1479chunk3
1480</code></pre>
1481````````````````````````````````
1482
1483
1484Any initial spaces beyond four will be included in the content, even
1485in interior blank lines:
1486
1487```````````````````````````````` example
1488    chunk1
1489
1490      chunk2
1491.
1492<pre><code>chunk1
1493
1494  chunk2
1495</code></pre>
1496````````````````````````````````
1497
1498
1499An indented code block cannot interrupt a paragraph.  (This
1500allows hanging indents and the like.)
1501
1502```````````````````````````````` example
1503Foo
1504    bar
1505
1506.
1507<p>Foo
1508bar</p>
1509````````````````````````````````
1510
1511
1512However, any non-blank line with fewer than four leading spaces ends
1513the code block immediately.  So a paragraph may occur immediately
1514after indented code:
1515
1516```````````````````````````````` example
1517    foo
1518bar
1519.
1520<pre><code>foo
1521</code></pre>
1522<p>bar</p>
1523````````````````````````````````
1524
1525
1526And indented code can occur immediately before and after other kinds of
1527blocks:
1528
1529```````````````````````````````` example
1530# Heading
1531    foo
1532Heading
1533------
1534    foo
1535----
1536.
1537<h1>Heading</h1>
1538<pre><code>foo
1539</code></pre>
1540<h2>Heading</h2>
1541<pre><code>foo
1542</code></pre>
1543<hr />
1544````````````````````````````````
1545
1546
1547The first line can be indented more than four spaces:
1548
1549```````````````````````````````` example
1550        foo
1551    bar
1552.
1553<pre><code>    foo
1554bar
1555</code></pre>
1556````````````````````````````````
1557
1558
1559Blank lines preceding or following an indented code block
1560are not included in it:
1561
1562```````````````````````````````` example
1563
1564
1565    foo
1566
1567
1568.
1569<pre><code>foo
1570</code></pre>
1571````````````````````````````````
1572
1573
1574Trailing spaces are included in the code block's content:
1575
1576```````````````````````````````` example
1577    foo
1578.
1579<pre><code>foo
1580</code></pre>
1581````````````````````````````````
1582
1583
1584
1585## Fenced code blocks
1586
1587A [code fence](@) is a sequence
1588of at least three consecutive backtick characters (`` ` ``) or
1589tildes (`~`).  (Tildes and backticks cannot be mixed.)
1590A [fenced code block](@)
1591begins with a code fence, indented no more than three spaces.
1592
1593The line with the opening code fence may optionally contain some text
1594following the code fence; this is trimmed of leading and trailing
1595whitespace and called the [info string](@). If the [info string] comes
1596after a backtick fence, it may not contain any backtick
1597characters.  (The reason for this restriction is that otherwise
1598some inline code would be incorrectly interpreted as the
1599beginning of a fenced code block.)
1600
1601The content of the code block consists of all subsequent lines, until
1602a closing [code fence] of the same type as the code block
1603began with (backticks or tildes), and with at least as many backticks
1604or tildes as the opening code fence.  If the leading code fence is
1605indented N spaces, then up to N spaces of indentation are removed from
1606each line of the content (if present).  (If a content line is not
1607indented, it is preserved unchanged.  If it is indented less than N
1608spaces, all of the indentation is removed.)
1609
1610The closing code fence may be indented up to three spaces, and may be
1611followed only by spaces, which are ignored.  If the end of the
1612containing block (or document) is reached and no closing code fence
1613has been found, the code block contains all of the lines after the
1614opening code fence until the end of the containing block (or
1615document).  (An alternative spec would require backtracking in the
1616event that a closing code fence is not found.  But this makes parsing
1617much less efficient, and there seems to be no real down side to the
1618behavior described here.)
1619
1620A fenced code block may interrupt a paragraph, and does not require
1621a blank line either before or after.
1622
1623The content of a code fence is treated as literal text, not parsed
1624as inlines.  The first word of the [info string] is typically used to
1625specify the language of the code sample, and rendered in the `class`
1626attribute of the `code` tag.  However, this spec does not mandate any
1627particular treatment of the [info string].
1628
1629Here is a simple example with backticks:
1630
1631```````````````````````````````` example
1632```
1633<
1634 >
1635```
1636.
1637<pre><code>&lt;
1638 &gt;
1639</code></pre>
1640````````````````````````````````
1641
1642
1643With tildes:
1644
1645```````````````````````````````` example
1646~~~
1647<
1648 >
1649~~~
1650.
1651<pre><code>&lt;
1652 &gt;
1653</code></pre>
1654````````````````````````````````
1655
1656Fewer than three backticks is not enough:
1657
1658```````````````````````````````` example
1659``
1660foo
1661``
1662.
1663<p><code>foo</code></p>
1664````````````````````````````````
1665
1666The closing code fence must use the same character as the opening
1667fence:
1668
1669```````````````````````````````` example
1670```
1671aaa
1672~~~
1673```
1674.
1675<pre><code>aaa
1676~~~
1677</code></pre>
1678````````````````````````````````
1679
1680
1681```````````````````````````````` example
1682~~~
1683aaa
1684```
1685~~~
1686.
1687<pre><code>aaa
1688```
1689</code></pre>
1690````````````````````````````````
1691
1692
1693The closing code fence must be at least as long as the opening fence:
1694
1695```````````````````````````````` example
1696````
1697aaa
1698```
1699``````
1700.
1701<pre><code>aaa
1702```
1703</code></pre>
1704````````````````````````````````
1705
1706
1707```````````````````````````````` example
1708~~~~
1709aaa
1710~~~
1711~~~~
1712.
1713<pre><code>aaa
1714~~~
1715</code></pre>
1716````````````````````````````````
1717
1718
1719Unclosed code blocks are closed by the end of the document
1720(or the enclosing [block quote][block quotes] or [list item][list items]):
1721
1722```````````````````````````````` example
1723```
1724.
1725<pre><code></code></pre>
1726````````````````````````````````
1727
1728
1729```````````````````````````````` example
1730`````
1731
1732```
1733aaa
1734.
1735<pre><code>
1736```
1737aaa
1738</code></pre>
1739````````````````````````````````
1740
1741
1742```````````````````````````````` example
1743> ```
1744> aaa
1745
1746bbb
1747.
1748<blockquote>
1749<pre><code>aaa
1750</code></pre>
1751</blockquote>
1752<p>bbb</p>
1753````````````````````````````````
1754
1755
1756A code block can have all empty lines as its content:
1757
1758```````````````````````````````` example
1759```
1760
1761
1762```
1763.
1764<pre><code>
1765
1766</code></pre>
1767````````````````````````````````
1768
1769
1770A code block can be empty:
1771
1772```````````````````````````````` example
1773```
1774```
1775.
1776<pre><code></code></pre>
1777````````````````````````````````
1778
1779
1780Fences can be indented.  If the opening fence is indented,
1781content lines will have equivalent opening indentation removed,
1782if present:
1783
1784```````````````````````````````` example
1785 ```
1786 aaa
1787aaa
1788```
1789.
1790<pre><code>aaa
1791aaa
1792</code></pre>
1793````````````````````````````````
1794
1795
1796```````````````````````````````` example
1797  ```
1798aaa
1799  aaa
1800aaa
1801  ```
1802.
1803<pre><code>aaa
1804aaa
1805aaa
1806</code></pre>
1807````````````````````````````````
1808
1809
1810```````````````````````````````` example
1811   ```
1812   aaa
1813    aaa
1814  aaa
1815   ```
1816.
1817<pre><code>aaa
1818 aaa
1819aaa
1820</code></pre>
1821````````````````````````````````
1822
1823
1824Four spaces indentation produces an indented code block:
1825
1826```````````````````````````````` example
1827    ```
1828    aaa
1829    ```
1830.
1831<pre><code>```
1832aaa
1833```
1834</code></pre>
1835````````````````````````````````
1836
1837
1838Closing fences may be indented by 0-3 spaces, and their indentation
1839need not match that of the opening fence:
1840
1841```````````````````````````````` example
1842```
1843aaa
1844  ```
1845.
1846<pre><code>aaa
1847</code></pre>
1848````````````````````````````````
1849
1850
1851```````````````````````````````` example
1852   ```
1853aaa
1854  ```
1855.
1856<pre><code>aaa
1857</code></pre>
1858````````````````````````````````
1859
1860
1861This is not a closing fence, because it is indented 4 spaces:
1862
1863```````````````````````````````` example
1864```
1865aaa
1866    ```
1867.
1868<pre><code>aaa
1869    ```
1870</code></pre>
1871````````````````````````````````
1872
1873
1874
1875Code fences (opening and closing) cannot contain internal spaces:
1876
1877```````````````````````````````` example
1878``` ```
1879aaa
1880.
1881<p><code> </code>
1882aaa</p>
1883````````````````````````````````
1884
1885
1886```````````````````````````````` example
1887~~~~~~
1888aaa
1889~~~ ~~
1890.
1891<pre><code>aaa
1892~~~ ~~
1893</code></pre>
1894````````````````````````````````
1895
1896
1897Fenced code blocks can interrupt paragraphs, and can be followed
1898directly by paragraphs, without a blank line between:
1899
1900```````````````````````````````` example
1901foo
1902```
1903bar
1904```
1905baz
1906.
1907<p>foo</p>
1908<pre><code>bar
1909</code></pre>
1910<p>baz</p>
1911````````````````````````````````
1912
1913
1914Other blocks can also occur before and after fenced code blocks
1915without an intervening blank line:
1916
1917```````````````````````````````` example
1918foo
1919---
1920~~~
1921bar
1922~~~
1923# baz
1924.
1925<h2>foo</h2>
1926<pre><code>bar
1927</code></pre>
1928<h1>baz</h1>
1929````````````````````````````````
1930
1931
1932An [info string] can be provided after the opening code fence.
1933Although this spec doesn't mandate any particular treatment of
1934the info string, the first word is typically used to specify
1935the language of the code block. In HTML output, the language is
1936normally indicated by adding a class to the `code` element consisting
1937of `language-` followed by the language name.
1938
1939```````````````````````````````` example
1940```ruby
1941def foo(x)
1942  return 3
1943end
1944```
1945.
1946<pre><code class="language-ruby">def foo(x)
1947  return 3
1948end
1949</code></pre>
1950````````````````````````````````
1951
1952
1953```````````````````````````````` example
1954~~~~    ruby startline=3 $%@#$
1955def foo(x)
1956  return 3
1957end
1958~~~~~~~
1959.
1960<pre><code class="language-ruby">def foo(x)
1961  return 3
1962end
1963</code></pre>
1964````````````````````````````````
1965
1966
1967```````````````````````````````` example
1968````;
1969````
1970.
1971<pre><code class="language-;"></code></pre>
1972````````````````````````````````
1973
1974
1975[Info strings] for backtick code blocks cannot contain backticks:
1976
1977```````````````````````````````` example
1978``` aa ```
1979foo
1980.
1981<p><code>aa</code>
1982foo</p>
1983````````````````````````````````
1984
1985
1986[Info strings] for tilde code blocks can contain backticks and tildes:
1987
1988```````````````````````````````` example
1989~~~ aa ``` ~~~
1990foo
1991~~~
1992.
1993<pre><code class="language-aa">foo
1994</code></pre>
1995````````````````````````````````
1996
1997
1998Closing code fences cannot have [info strings]:
1999
2000```````````````````````````````` example
2001```
2002``` aaa
2003```
2004.
2005<pre><code>``` aaa
2006</code></pre>
2007````````````````````````````````
2008
2009
2010
2011## HTML blocks
2012
2013An [HTML block](@) is a group of lines that is treated
2014as raw HTML (and will not be escaped in HTML output).
2015
2016There are seven kinds of [HTML block], which can be defined by their
2017start and end conditions.  The block begins with a line that meets a
2018[start condition](@) (after up to three spaces optional indentation).
2019It ends with the first subsequent line that meets a matching [end
2020condition](@), or the last line of the document, or the last line of
2021the [container block](#container-blocks) containing the current HTML
2022block, if no line is encountered that meets the [end condition].  If
2023the first line meets both the [start condition] and the [end
2024condition], the block will contain just that line.
2025
20261.  **Start condition:**  line begins with the string `<script`,
2027`<pre`, or `<style` (case-insensitive), followed by whitespace,
2028the string `>`, or the end of the line.\
2029**End condition:**  line contains an end tag
2030`</script>`, `</pre>`, or `</style>` (case-insensitive; it
2031need not match the start tag).
2032
20332.  **Start condition:** line begins with the string `<!--`.\
2034**End condition:**  line contains the string `-->`.
2035
20363.  **Start condition:** line begins with the string `<?`.\
2037**End condition:** line contains the string `?>`.
2038
20394.  **Start condition:** line begins with the string `<!`
2040followed by an uppercase ASCII letter.\
2041**End condition:** line contains the character `>`.
2042
20435.  **Start condition:**  line begins with the string
2044`<![CDATA[`.\
2045**End condition:** line contains the string `]]>`.
2046
20476.  **Start condition:** line begins the string `<` or `</`
2048followed by one of the strings (case-insensitive) `address`,
2049`article`, `aside`, `base`, `basefont`, `blockquote`, `body`,
2050`caption`, `center`, `col`, `colgroup`, `dd`, `details`, `dialog`,
2051`dir`, `div`, `dl`, `dt`, `fieldset`, `figcaption`, `figure`,
2052`footer`, `form`, `frame`, `frameset`,
2053`h1`, `h2`, `h3`, `h4`, `h5`, `h6`, `head`, `header`, `hr`,
2054`html`, `iframe`, `legend`, `li`, `link`, `main`, `menu`, `menuitem`,
2055`nav`, `noframes`, `ol`, `optgroup`, `option`, `p`, `param`,
2056`section`, `source`, `summary`, `table`, `tbody`, `td`,
2057`tfoot`, `th`, `thead`, `title`, `tr`, `track`, `ul`, followed
2058by [whitespace], the end of the line, the string `>`, or
2059the string `/>`.\
2060**End condition:** line is followed by a [blank line].
2061
20627.  **Start condition:**  line begins with a complete [open tag]
2063(with any [tag name] other than `script`,
2064`style`, or `pre`) or a complete [closing tag],
2065followed only by [whitespace] or the end of the line.\
2066**End condition:** line is followed by a [blank line].
2067
2068HTML blocks continue until they are closed by their appropriate
2069[end condition], or the last line of the document or other [container
2070block](#container-blocks).  This means any HTML **within an HTML
2071block** that might otherwise be recognised as a start condition will
2072be ignored by the parser and passed through as-is, without changing
2073the parser's state.
2074
2075For instance, `<pre>` within a HTML block started by `<table>` will not affect
2076the parser state; as the HTML block was started in by start condition 6, it
2077will end at any blank line. This can be surprising:
2078
2079```````````````````````````````` example
2080<table><tr><td>
2081<pre>
2082**Hello**,
2083
2084_world_.
2085</pre>
2086</td></tr></table>
2087.
2088<table><tr><td>
2089<pre>
2090**Hello**,
2091<p><em>world</em>.
2092</pre></p>
2093</td></tr></table>
2094````````````````````````````````
2095
2096In this case, the HTML block is terminated by the newline — the `**Hello**`
2097text remains verbatim — and regular parsing resumes, with a paragraph,
2098emphasised `world` and inline and block HTML following.
2099
2100All types of [HTML blocks] except type 7 may interrupt
2101a paragraph.  Blocks of type 7 may not interrupt a paragraph.
2102(This restriction is intended to prevent unwanted interpretation
2103of long tags inside a wrapped paragraph as starting HTML blocks.)
2104
2105Some simple examples follow.  Here are some basic HTML blocks
2106of type 6:
2107
2108```````````````````````````````` example
2109<table>
2110  <tr>
2111    <td>
2112           hi
2113    </td>
2114  </tr>
2115</table>
2116
2117okay.
2118.
2119<table>
2120  <tr>
2121    <td>
2122           hi
2123    </td>
2124  </tr>
2125</table>
2126<p>okay.</p>
2127````````````````````````````````
2128
2129
2130```````````````````````````````` example
2131 <div>
2132  *hello*
2133         <foo><a>
2134.
2135 <div>
2136  *hello*
2137         <foo><a>
2138````````````````````````````````
2139
2140
2141A block can also start with a closing tag:
2142
2143```````````````````````````````` example
2144</div>
2145*foo*
2146.
2147</div>
2148*foo*
2149````````````````````````````````
2150
2151
2152Here we have two HTML blocks with a Markdown paragraph between them:
2153
2154```````````````````````````````` example
2155<DIV CLASS="foo">
2156
2157*Markdown*
2158
2159</DIV>
2160.
2161<DIV CLASS="foo">
2162<p><em>Markdown</em></p>
2163</DIV>
2164````````````````````````````````
2165
2166
2167The tag on the first line can be partial, as long
2168as it is split where there would be whitespace:
2169
2170```````````````````````````````` example
2171<div id="foo"
2172  class="bar">
2173</div>
2174.
2175<div id="foo"
2176  class="bar">
2177</div>
2178````````````````````````````````
2179
2180
2181```````````````````````````````` example
2182<div id="foo" class="bar
2183  baz">
2184</div>
2185.
2186<div id="foo" class="bar
2187  baz">
2188</div>
2189````````````````````````````````
2190
2191
2192An open tag need not be closed:
2193```````````````````````````````` example
2194<div>
2195*foo*
2196
2197*bar*
2198.
2199<div>
2200*foo*
2201<p><em>bar</em></p>
2202````````````````````````````````
2203
2204
2205
2206A partial tag need not even be completed (garbage
2207in, garbage out):
2208
2209```````````````````````````````` example
2210<div id="foo"
2211*hi*
2212.
2213<div id="foo"
2214*hi*
2215````````````````````````````````
2216
2217
2218```````````````````````````````` example
2219<div class
2220foo
2221.
2222<div class
2223foo
2224````````````````````````````````
2225
2226
2227The initial tag doesn't even need to be a valid
2228tag, as long as it starts like one:
2229
2230```````````````````````````````` example
2231<div *???-&&&-<---
2232*foo*
2233.
2234<div *???-&&&-<---
2235*foo*
2236````````````````````````````````
2237
2238
2239In type 6 blocks, the initial tag need not be on a line by
2240itself:
2241
2242```````````````````````````````` example
2243<div><a href="bar">*foo*</a></div>
2244.
2245<div><a href="bar">*foo*</a></div>
2246````````````````````````````````
2247
2248
2249```````````````````````````````` example
2250<table><tr><td>
2251foo
2252</td></tr></table>
2253.
2254<table><tr><td>
2255foo
2256</td></tr></table>
2257````````````````````````````````
2258
2259
2260Everything until the next blank line or end of document
2261gets included in the HTML block.  So, in the following
2262example, what looks like a Markdown code block
2263is actually part of the HTML block, which continues until a blank
2264line or the end of the document is reached:
2265
2266```````````````````````````````` example
2267<div></div>
2268``` c
2269int x = 33;
2270```
2271.
2272<div></div>
2273``` c
2274int x = 33;
2275```
2276````````````````````````````````
2277
2278
2279To start an [HTML block] with a tag that is *not* in the
2280list of block-level tags in (6), you must put the tag by
2281itself on the first line (and it must be complete):
2282
2283```````````````````````````````` example
2284<a href="foo">
2285*bar*
2286</a>
2287.
2288<a href="foo">
2289*bar*
2290</a>
2291````````````````````````````````
2292
2293
2294In type 7 blocks, the [tag name] can be anything:
2295
2296```````````````````````````````` example
2297<Warning>
2298*bar*
2299</Warning>
2300.
2301<Warning>
2302*bar*
2303</Warning>
2304````````````````````````````````
2305
2306
2307```````````````````````````````` example
2308<i class="foo">
2309*bar*
2310</i>
2311.
2312<i class="foo">
2313*bar*
2314</i>
2315````````````````````````````````
2316
2317
2318```````````````````````````````` example
2319</ins>
2320*bar*
2321.
2322</ins>
2323*bar*
2324````````````````````````````````
2325
2326
2327These rules are designed to allow us to work with tags that
2328can function as either block-level or inline-level tags.
2329The `<del>` tag is a nice example.  We can surround content with
2330`<del>` tags in three different ways.  In this case, we get a raw
2331HTML block, because the `<del>` tag is on a line by itself:
2332
2333```````````````````````````````` example
2334<del>
2335*foo*
2336</del>
2337.
2338<del>
2339*foo*
2340</del>
2341````````````````````````````````
2342
2343
2344In this case, we get a raw HTML block that just includes
2345the `<del>` tag (because it ends with the following blank
2346line).  So the contents get interpreted as CommonMark:
2347
2348```````````````````````````````` example
2349<del>
2350
2351*foo*
2352
2353</del>
2354.
2355<del>
2356<p><em>foo</em></p>
2357</del>
2358````````````````````````````````
2359
2360
2361Finally, in this case, the `<del>` tags are interpreted
2362as [raw HTML] *inside* the CommonMark paragraph.  (Because
2363the tag is not on a line by itself, we get inline HTML
2364rather than an [HTML block].)
2365
2366```````````````````````````````` example
2367<del>*foo*</del>
2368.
2369<p><del><em>foo</em></del></p>
2370````````````````````````````````
2371
2372
2373HTML tags designed to contain literal content
2374(`script`, `style`, `pre`), comments, processing instructions,
2375and declarations are treated somewhat differently.
2376Instead of ending at the first blank line, these blocks
2377end at the first line containing a corresponding end tag.
2378As a result, these blocks can contain blank lines:
2379
2380A pre tag (type 1):
2381
2382```````````````````````````````` example
2383<pre language="haskell"><code>
2384import Text.HTML.TagSoup
2385
2386main :: IO ()
2387main = print $ parseTags tags
2388</code></pre>
2389okay
2390.
2391<pre language="haskell"><code>
2392import Text.HTML.TagSoup
2393
2394main :: IO ()
2395main = print $ parseTags tags
2396</code></pre>
2397<p>okay</p>
2398````````````````````````````````
2399
2400
2401A script tag (type 1):
2402
2403```````````````````````````````` example
2404<script type="text/javascript">
2405// JavaScript example
2406
2407document.getElementById("demo").innerHTML = "Hello JavaScript!";
2408</script>
2409okay
2410.
2411<script type="text/javascript">
2412// JavaScript example
2413
2414document.getElementById("demo").innerHTML = "Hello JavaScript!";
2415</script>
2416<p>okay</p>
2417````````````````````````````````
2418
2419
2420A style tag (type 1):
2421
2422```````````````````````````````` example
2423<style
2424  type="text/css">
2425h1 {color:red;}
2426
2427p {color:blue;}
2428</style>
2429okay
2430.
2431<style
2432  type="text/css">
2433h1 {color:red;}
2434
2435p {color:blue;}
2436</style>
2437<p>okay</p>
2438````````````````````````````````
2439
2440
2441If there is no matching end tag, the block will end at the
2442end of the document (or the enclosing [block quote][block quotes]
2443or [list item][list items]):
2444
2445```````````````````````````````` example
2446<style
2447  type="text/css">
2448
2449foo
2450.
2451<style
2452  type="text/css">
2453
2454foo
2455````````````````````````````````
2456
2457
2458```````````````````````````````` example
2459> <div>
2460> foo
2461
2462bar
2463.
2464<blockquote>
2465<div>
2466foo
2467</blockquote>
2468<p>bar</p>
2469````````````````````````````````
2470
2471
2472```````````````````````````````` example
2473- <div>
2474- foo
2475.
2476<ul>
2477<li>
2478<div>
2479</li>
2480<li>foo</li>
2481</ul>
2482````````````````````````````````
2483
2484
2485The end tag can occur on the same line as the start tag:
2486
2487```````````````````````````````` example
2488<style>p{color:red;}</style>
2489*foo*
2490.
2491<style>p{color:red;}</style>
2492<p><em>foo</em></p>
2493````````````````````````````````
2494
2495
2496```````````````````````````````` example
2497<!-- foo -->*bar*
2498*baz*
2499.
2500<!-- foo -->*bar*
2501<p><em>baz</em></p>
2502````````````````````````````````
2503
2504
2505Note that anything on the last line after the
2506end tag will be included in the [HTML block]:
2507
2508```````````````````````````````` example
2509<script>
2510foo
2511</script>1. *bar*
2512.
2513<script>
2514foo
2515</script>1. *bar*
2516````````````````````````````````
2517
2518
2519A comment (type 2):
2520
2521```````````````````````````````` example
2522<!-- Foo
2523
2524bar
2525   baz -->
2526okay
2527.
2528<!-- Foo
2529
2530bar
2531   baz -->
2532<p>okay</p>
2533````````````````````````````````
2534
2535
2536
2537A processing instruction (type 3):
2538
2539```````````````````````````````` example
2540<?php
2541
2542  echo '>';
2543
2544?>
2545okay
2546.
2547<?php
2548
2549  echo '>';
2550
2551?>
2552<p>okay</p>
2553````````````````````````````````
2554
2555
2556A declaration (type 4):
2557
2558```````````````````````````````` example
2559<!DOCTYPE html>
2560.
2561<!DOCTYPE html>
2562````````````````````````````````
2563
2564
2565CDATA (type 5):
2566
2567```````````````````````````````` example
2568<![CDATA[
2569function matchwo(a,b)
2570{
2571  if (a < b && a < 0) then {
2572    return 1;
2573
2574  } else {
2575
2576    return 0;
2577  }
2578}
2579]]>
2580okay
2581.
2582<![CDATA[
2583function matchwo(a,b)
2584{
2585  if (a < b && a < 0) then {
2586    return 1;
2587
2588  } else {
2589
2590    return 0;
2591  }
2592}
2593]]>
2594<p>okay</p>
2595````````````````````````````````
2596
2597
2598The opening tag can be indented 1-3 spaces, but not 4:
2599
2600```````````````````````````````` example
2601  <!-- foo -->
2602
2603    <!-- foo -->
2604.
2605  <!-- foo -->
2606<pre><code>&lt;!-- foo --&gt;
2607</code></pre>
2608````````````````````````````````
2609
2610
2611```````````````````````````````` example
2612  <div>
2613
2614    <div>
2615.
2616  <div>
2617<pre><code>&lt;div&gt;
2618</code></pre>
2619````````````````````````````````
2620
2621
2622An HTML block of types 1--6 can interrupt a paragraph, and need not be
2623preceded by a blank line.
2624
2625```````````````````````````````` example
2626Foo
2627<div>
2628bar
2629</div>
2630.
2631<p>Foo</p>
2632<div>
2633bar
2634</div>
2635````````````````````````````````
2636
2637
2638However, a following blank line is needed, except at the end of
2639a document, and except for blocks of types 1--5, [above][HTML
2640block]:
2641
2642```````````````````````````````` example
2643<div>
2644bar
2645</div>
2646*foo*
2647.
2648<div>
2649bar
2650</div>
2651*foo*
2652````````````````````````````````
2653
2654
2655HTML blocks of type 7 cannot interrupt a paragraph:
2656
2657```````````````````````````````` example
2658Foo
2659<a href="bar">
2660baz
2661.
2662<p>Foo
2663<a href="bar">
2664baz</p>
2665````````````````````````````````
2666
2667
2668This rule differs from John Gruber's original Markdown syntax
2669specification, which says:
2670
2671> The only restrictions are that block-level HTML elements —
2672> e.g. `<div>`, `<table>`, `<pre>`, `<p>`, etc. — must be separated from
2673> surrounding content by blank lines, and the start and end tags of the
2674> block should not be indented with tabs or spaces.
2675
2676In some ways Gruber's rule is more restrictive than the one given
2677here:
2678
2679- It requires that an HTML block be preceded by a blank line.
2680- It does not allow the start tag to be indented.
2681- It requires a matching end tag, which it also does not allow to
2682  be indented.
2683
2684Most Markdown implementations (including some of Gruber's own) do not
2685respect all of these restrictions.
2686
2687There is one respect, however, in which Gruber's rule is more liberal
2688than the one given here, since it allows blank lines to occur inside
2689an HTML block.  There are two reasons for disallowing them here.
2690First, it removes the need to parse balanced tags, which is
2691expensive and can require backtracking from the end of the document
2692if no matching end tag is found. Second, it provides a very simple
2693and flexible way of including Markdown content inside HTML tags:
2694simply separate the Markdown from the HTML using blank lines:
2695
2696Compare:
2697
2698```````````````````````````````` example
2699<div>
2700
2701*Emphasized* text.
2702
2703</div>
2704.
2705<div>
2706<p><em>Emphasized</em> text.</p>
2707</div>
2708````````````````````````````````
2709
2710
2711```````````````````````````````` example
2712<div>
2713*Emphasized* text.
2714</div>
2715.
2716<div>
2717*Emphasized* text.
2718</div>
2719````````````````````````````````
2720
2721
2722Some Markdown implementations have adopted a convention of
2723interpreting content inside tags as text if the open tag has
2724the attribute `markdown=1`.  The rule given above seems a simpler and
2725more elegant way of achieving the same expressive power, which is also
2726much simpler to parse.
2727
2728The main potential drawback is that one can no longer paste HTML
2729blocks into Markdown documents with 100% reliability.  However,
2730*in most cases* this will work fine, because the blank lines in
2731HTML are usually followed by HTML block tags.  For example:
2732
2733```````````````````````````````` example
2734<table>
2735
2736<tr>
2737
2738<td>
2739Hi
2740</td>
2741
2742</tr>
2743
2744</table>
2745.
2746<table>
2747<tr>
2748<td>
2749Hi
2750</td>
2751</tr>
2752</table>
2753````````````````````````````````
2754
2755
2756There are problems, however, if the inner tags are indented
2757*and* separated by spaces, as then they will be interpreted as
2758an indented code block:
2759
2760```````````````````````````````` example
2761<table>
2762
2763  <tr>
2764
2765    <td>
2766      Hi
2767    </td>
2768
2769  </tr>
2770
2771</table>
2772.
2773<table>
2774  <tr>
2775<pre><code>&lt;td&gt;
2776  Hi
2777&lt;/td&gt;
2778</code></pre>
2779  </tr>
2780</table>
2781````````````````````````````````
2782
2783
2784Fortunately, blank lines are usually not necessary and can be
2785deleted.  The exception is inside `<pre>` tags, but as described
2786[above][HTML blocks], raw HTML blocks starting with `<pre>`
2787*can* contain blank lines.
2788
2789## Link reference definitions
2790
2791A [link reference definition](@)
2792consists of a [link label], indented up to three spaces, followed
2793by a colon (`:`), optional [whitespace] (including up to one
2794[line ending]), a [link destination],
2795optional [whitespace] (including up to one
2796[line ending]), and an optional [link
2797title], which if it is present must be separated
2798from the [link destination] by [whitespace].
2799No further [non-whitespace characters] may occur on the line.
2800
2801A [link reference definition]
2802does not correspond to a structural element of a document.  Instead, it
2803defines a label which can be used in [reference links]
2804and reference-style [images] elsewhere in the document.  [Link
2805reference definitions] can come either before or after the links that use
2806them.
2807
2808```````````````````````````````` example
2809[foo]: /url "title"
2810
2811[foo]
2812.
2813<p><a href="/url" title="title">foo</a></p>
2814````````````````````````````````
2815
2816
2817```````````````````````````````` example
2818   [foo]:
2819      /url
2820           'the title'
2821
2822[foo]
2823.
2824<p><a href="/url" title="the title">foo</a></p>
2825````````````````````````````````
2826
2827
2828```````````````````````````````` example
2829[Foo*bar\]]:my_(url) 'title (with parens)'
2830
2831[Foo*bar\]]
2832.
2833<p><a href="my_(url)" title="title (with parens)">Foo*bar]</a></p>
2834````````````````````````````````
2835
2836
2837```````````````````````````````` example
2838[Foo bar]:
2839<my url>
2840'title'
2841
2842[Foo bar]
2843.
2844<p><a href="my%20url" title="title">Foo bar</a></p>
2845````````````````````````````````
2846
2847
2848The title may extend over multiple lines:
2849
2850```````````````````````````````` example
2851[foo]: /url '
2852title
2853line1
2854line2
2855'
2856
2857[foo]
2858.
2859<p><a href="/url" title="
2860title
2861line1
2862line2
2863">foo</a></p>
2864````````````````````````````````
2865
2866
2867However, it may not contain a [blank line]:
2868
2869```````````````````````````````` example
2870[foo]: /url 'title
2871
2872with blank line'
2873
2874[foo]
2875.
2876<p>[foo]: /url 'title</p>
2877<p>with blank line'</p>
2878<p>[foo]</p>
2879````````````````````````````````
2880
2881
2882The title may be omitted:
2883
2884```````````````````````````````` example
2885[foo]:
2886/url
2887
2888[foo]
2889.
2890<p><a href="/url">foo</a></p>
2891````````````````````````````````
2892
2893
2894The link destination may not be omitted:
2895
2896```````````````````````````````` example
2897[foo]:
2898
2899[foo]
2900.
2901<p>[foo]:</p>
2902<p>[foo]</p>
2903````````````````````````````````
2904
2905 However, an empty link destination may be specified using
2906 angle brackets:
2907
2908```````````````````````````````` example
2909[foo]: <>
2910
2911[foo]
2912.
2913<p><a href="">foo</a></p>
2914````````````````````````````````
2915
2916The title must be separated from the link destination by
2917whitespace:
2918
2919```````````````````````````````` example
2920[foo]: <bar>(baz)
2921
2922[foo]
2923.
2924<p>[foo]: <bar>(baz)</p>
2925<p>[foo]</p>
2926````````````````````````````````
2927
2928
2929Both title and destination can contain backslash escapes
2930and literal backslashes:
2931
2932```````````````````````````````` example
2933[foo]: /url\bar\*baz "foo\"bar\baz"
2934
2935[foo]
2936.
2937<p><a href="/url%5Cbar*baz" title="foo&quot;bar\baz">foo</a></p>
2938````````````````````````````````
2939
2940
2941A link can come before its corresponding definition:
2942
2943```````````````````````````````` example
2944[foo]
2945
2946[foo]: url
2947.
2948<p><a href="url">foo</a></p>
2949````````````````````````````````
2950
2951
2952If there are several matching definitions, the first one takes
2953precedence:
2954
2955```````````````````````````````` example
2956[foo]
2957
2958[foo]: first
2959[foo]: second
2960.
2961<p><a href="first">foo</a></p>
2962````````````````````````````````
2963
2964
2965As noted in the section on [Links], matching of labels is
2966case-insensitive (see [matches]).
2967
2968```````````````````````````````` example
2969[FOO]: /url
2970
2971[Foo]
2972.
2973<p><a href="/url">Foo</a></p>
2974````````````````````````````````
2975
2976
2977```````````````````````````````` example
2978[ΑΓΩ]: /φου
2979
2980[αγω]
2981.
2982<p><a href="/%CF%86%CE%BF%CF%85">αγω</a></p>
2983````````````````````````````````
2984
2985
2986Here is a link reference definition with no corresponding link.
2987It contributes nothing to the document.
2988
2989```````````````````````````````` example
2990[foo]: /url
2991.
2992````````````````````````````````
2993
2994
2995Here is another one:
2996
2997```````````````````````````````` example
2998[
2999foo
3000]: /url
3001bar
3002.
3003<p>bar</p>
3004````````````````````````````````
3005
3006
3007This is not a link reference definition, because there are
3008[non-whitespace characters] after the title:
3009
3010```````````````````````````````` example
3011[foo]: /url "title" ok
3012.
3013<p>[foo]: /url &quot;title&quot; ok</p>
3014````````````````````````````````
3015
3016
3017This is a link reference definition, but it has no title:
3018
3019```````````````````````````````` example
3020[foo]: /url
3021"title" ok
3022.
3023<p>&quot;title&quot; ok</p>
3024````````````````````````````````
3025
3026
3027This is not a link reference definition, because it is indented
3028four spaces:
3029
3030```````````````````````````````` example
3031    [foo]: /url "title"
3032
3033[foo]
3034.
3035<pre><code>[foo]: /url &quot;title&quot;
3036</code></pre>
3037<p>[foo]</p>
3038````````````````````````````````
3039
3040
3041This is not a link reference definition, because it occurs inside
3042a code block:
3043
3044```````````````````````````````` example
3045```
3046[foo]: /url
3047```
3048
3049[foo]
3050.
3051<pre><code>[foo]: /url
3052</code></pre>
3053<p>[foo]</p>
3054````````````````````````````````
3055
3056
3057A [link reference definition] cannot interrupt a paragraph.
3058
3059```````````````````````````````` example
3060Foo
3061[bar]: /baz
3062
3063[bar]
3064.
3065<p>Foo
3066[bar]: /baz</p>
3067<p>[bar]</p>
3068````````````````````````````````
3069
3070
3071However, it can directly follow other block elements, such as headings
3072and thematic breaks, and it need not be followed by a blank line.
3073
3074```````````````````````````````` example
3075# [Foo]
3076[foo]: /url
3077> bar
3078.
3079<h1><a href="/url">Foo</a></h1>
3080<blockquote>
3081<p>bar</p>
3082</blockquote>
3083````````````````````````````````
3084
3085```````````````````````````````` example
3086[foo]: /url
3087bar
3088===
3089[foo]
3090.
3091<h1>bar</h1>
3092<p><a href="/url">foo</a></p>
3093````````````````````````````````
3094
3095```````````````````````````````` example
3096[foo]: /url
3097===
3098[foo]
3099.
3100<p>===
3101<a href="/url">foo</a></p>
3102````````````````````````````````
3103
3104
3105Several [link reference definitions]
3106can occur one after another, without intervening blank lines.
3107
3108```````````````````````````````` example
3109[foo]: /foo-url "foo"
3110[bar]: /bar-url
3111  "bar"
3112[baz]: /baz-url
3113
3114[foo],
3115[bar],
3116[baz]
3117.
3118<p><a href="/foo-url" title="foo">foo</a>,
3119<a href="/bar-url" title="bar">bar</a>,
3120<a href="/baz-url">baz</a></p>
3121````````````````````````````````
3122
3123
3124[Link reference definitions] can occur
3125inside block containers, like lists and block quotations.  They
3126affect the entire document, not just the container in which they
3127are defined:
3128
3129```````````````````````````````` example
3130[foo]
3131
3132> [foo]: /url
3133.
3134<p><a href="/url">foo</a></p>
3135<blockquote>
3136</blockquote>
3137````````````````````````````````
3138
3139
3140Whether something is a [link reference definition] is
3141independent of whether the link reference it defines is
3142used in the document.  Thus, for example, the following
3143document contains just a link reference definition, and
3144no visible content:
3145
3146```````````````````````````````` example
3147[foo]: /url
3148.
3149````````````````````````````````
3150
3151
3152## Paragraphs
3153
3154A sequence of non-blank lines that cannot be interpreted as other
3155kinds of blocks forms a [paragraph](@).
3156The contents of the paragraph are the result of parsing the
3157paragraph's raw content as inlines.  The paragraph's raw content
3158is formed by concatenating the lines and removing initial and final
3159[whitespace].
3160
3161A simple example with two paragraphs:
3162
3163```````````````````````````````` example
3164aaa
3165
3166bbb
3167.
3168<p>aaa</p>
3169<p>bbb</p>
3170````````````````````````````````
3171
3172
3173Paragraphs can contain multiple lines, but no blank lines:
3174
3175```````````````````````````````` example
3176aaa
3177bbb
3178
3179ccc
3180ddd
3181.
3182<p>aaa
3183bbb</p>
3184<p>ccc
3185ddd</p>
3186````````````````````````````````
3187
3188
3189Multiple blank lines between paragraph have no effect:
3190
3191```````````````````````````````` example
3192aaa
3193
3194
3195bbb
3196.
3197<p>aaa</p>
3198<p>bbb</p>
3199````````````````````````````````
3200
3201
3202Leading spaces are skipped:
3203
3204```````````````````````````````` example
3205  aaa
3206 bbb
3207.
3208<p>aaa
3209bbb</p>
3210````````````````````````````````
3211
3212
3213Lines after the first may be indented any amount, since indented
3214code blocks cannot interrupt paragraphs.
3215
3216```````````````````````````````` example
3217aaa
3218             bbb
3219                                       ccc
3220.
3221<p>aaa
3222bbb
3223ccc</p>
3224````````````````````````````````
3225
3226
3227However, the first line may be indented at most three spaces,
3228or an indented code block will be triggered:
3229
3230```````````````````````````````` example
3231   aaa
3232bbb
3233.
3234<p>aaa
3235bbb</p>
3236````````````````````````````````
3237
3238
3239```````````````````````````````` example
3240    aaa
3241bbb
3242.
3243<pre><code>aaa
3244</code></pre>
3245<p>bbb</p>
3246````````````````````````````````
3247
3248
3249Final spaces are stripped before inline parsing, so a paragraph
3250that ends with two or more spaces will not end with a [hard line
3251break]:
3252
3253```````````````````````````````` example
3254aaa
3255bbb
3256.
3257<p>aaa<br />
3258bbb</p>
3259````````````````````````````````
3260
3261
3262## Blank lines
3263
3264[Blank lines] between block-level elements are ignored,
3265except for the role they play in determining whether a [list]
3266is [tight] or [loose].
3267
3268Blank lines at the beginning and end of the document are also ignored.
3269
3270```````````````````````````````` example
3271
3272
3273aaa
3274
3275
3276# aaa
3277
3278
3279.
3280<p>aaa</p>
3281<h1>aaa</h1>
3282````````````````````````````````
3283
3284
3285
3286# Container blocks
3287
3288A [container block](#container-blocks) is a block that has other
3289blocks as its contents.  There are two basic kinds of container blocks:
3290[block quotes] and [list items].
3291[Lists] are meta-containers for [list items].
3292
3293We define the syntax for container blocks recursively.  The general
3294form of the definition is:
3295
3296> If X is a sequence of blocks, then the result of
3297> transforming X in such-and-such a way is a container of type Y
3298> with these blocks as its content.
3299
3300So, we explain what counts as a block quote or list item by explaining
3301how these can be *generated* from their contents. This should suffice
3302to define the syntax, although it does not give a recipe for *parsing*
3303these constructions.  (A recipe is provided below in the section entitled
3304[A parsing strategy](#appendix-a-parsing-strategy).)
3305
3306## Block quotes
3307
3308A [block quote marker](@)
3309consists of 0-3 spaces of initial indent, plus (a) the character `>` together
3310with a following space, or (b) a single character `>` not followed by a space.
3311
3312The following rules define [block quotes]:
3313
33141.  **Basic case.**  If a string of lines *Ls* constitute a sequence
3315    of blocks *Bs*, then the result of prepending a [block quote
3316    marker] to the beginning of each line in *Ls*
3317    is a [block quote](#block-quotes) containing *Bs*.
3318
33192.  **Laziness.**  If a string of lines *Ls* constitute a [block
3320    quote](#block-quotes) with contents *Bs*, then the result of deleting
3321    the initial [block quote marker] from one or
3322    more lines in which the next [non-whitespace character] after the [block
3323    quote marker] is [paragraph continuation
3324    text] is a block quote with *Bs* as its content.
3325    [Paragraph continuation text](@) is text
3326    that will be parsed as part of the content of a paragraph, but does
3327    not occur at the beginning of the paragraph.
3328
33293.  **Consecutiveness.**  A document cannot contain two [block
3330    quotes] in a row unless there is a [blank line] between them.
3331
3332Nothing else counts as a [block quote](#block-quotes).
3333
3334Here is a simple example:
3335
3336```````````````````````````````` example
3337> # Foo
3338> bar
3339> baz
3340.
3341<blockquote>
3342<h1>Foo</h1>
3343<p>bar
3344baz</p>
3345</blockquote>
3346````````````````````````````````
3347
3348
3349The spaces after the `>` characters can be omitted:
3350
3351```````````````````````````````` example
3352># Foo
3353>bar
3354> baz
3355.
3356<blockquote>
3357<h1>Foo</h1>
3358<p>bar
3359baz</p>
3360</blockquote>
3361````````````````````````````````
3362
3363
3364The `>` characters can be indented 1-3 spaces:
3365
3366```````````````````````````````` example
3367   > # Foo
3368   > bar
3369 > baz
3370.
3371<blockquote>
3372<h1>Foo</h1>
3373<p>bar
3374baz</p>
3375</blockquote>
3376````````````````````````````````
3377
3378
3379Four spaces gives us a code block:
3380
3381```````````````````````````````` example
3382    > # Foo
3383    > bar
3384    > baz
3385.
3386<pre><code>&gt; # Foo
3387&gt; bar
3388&gt; baz
3389</code></pre>
3390````````````````````````````````
3391
3392
3393The Laziness clause allows us to omit the `>` before
3394[paragraph continuation text]:
3395
3396```````````````````````````````` example
3397> # Foo
3398> bar
3399baz
3400.
3401<blockquote>
3402<h1>Foo</h1>
3403<p>bar
3404baz</p>
3405</blockquote>
3406````````````````````````````````
3407
3408
3409A block quote can contain some lazy and some non-lazy
3410continuation lines:
3411
3412```````````````````````````````` example
3413> bar
3414baz
3415> foo
3416.
3417<blockquote>
3418<p>bar
3419baz
3420foo</p>
3421</blockquote>
3422````````````````````````````````
3423
3424
3425Laziness only applies to lines that would have been continuations of
3426paragraphs had they been prepended with [block quote markers].
3427For example, the `> ` cannot be omitted in the second line of
3428
3429``` markdown
3430> foo
3431> ---
3432```
3433
3434without changing the meaning:
3435
3436```````````````````````````````` example
3437> foo
3438---
3439.
3440<blockquote>
3441<p>foo</p>
3442</blockquote>
3443<hr />
3444````````````````````````````````
3445
3446
3447Similarly, if we omit the `> ` in the second line of
3448
3449``` markdown
3450> - foo
3451> - bar
3452```
3453
3454then the block quote ends after the first line:
3455
3456```````````````````````````````` example
3457> - foo
3458- bar
3459.
3460<blockquote>
3461<ul>
3462<li>foo</li>
3463</ul>
3464</blockquote>
3465<ul>
3466<li>bar</li>
3467</ul>
3468````````````````````````````````
3469
3470
3471For the same reason, we can't omit the `> ` in front of
3472subsequent lines of an indented or fenced code block:
3473
3474```````````````````````````````` example
3475>     foo
3476    bar
3477.
3478<blockquote>
3479<pre><code>foo
3480</code></pre>
3481</blockquote>
3482<pre><code>bar
3483</code></pre>
3484````````````````````````````````
3485
3486
3487```````````````````````````````` example
3488> ```
3489foo
3490```
3491.
3492<blockquote>
3493<pre><code></code></pre>
3494</blockquote>
3495<p>foo</p>
3496<pre><code></code></pre>
3497````````````````````````````````
3498
3499
3500Note that in the following case, we have a [lazy
3501continuation line]:
3502
3503```````````````````````````````` example
3504> foo
3505    - bar
3506.
3507<blockquote>
3508<p>foo
3509- bar</p>
3510</blockquote>
3511````````````````````````````````
3512
3513
3514To see why, note that in
3515
3516```markdown
3517> foo
3518>     - bar
3519```
3520
3521the `- bar` is indented too far to start a list, and can't
3522be an indented code block because indented code blocks cannot
3523interrupt paragraphs, so it is [paragraph continuation text].
3524
3525A block quote can be empty:
3526
3527```````````````````````````````` example
3528>
3529.
3530<blockquote>
3531</blockquote>
3532````````````````````````````````
3533
3534
3535```````````````````````````````` example
3536>
3537>
3538>
3539.
3540<blockquote>
3541</blockquote>
3542````````````````````````````````
3543
3544
3545A block quote can have initial or final blank lines:
3546
3547```````````````````````````````` example
3548>
3549> foo
3550>
3551.
3552<blockquote>
3553<p>foo</p>
3554</blockquote>
3555````````````````````````````````
3556
3557
3558A blank line always separates block quotes:
3559
3560```````````````````````````````` example
3561> foo
3562
3563> bar
3564.
3565<blockquote>
3566<p>foo</p>
3567</blockquote>
3568<blockquote>
3569<p>bar</p>
3570</blockquote>
3571````````````````````````````````
3572
3573
3574(Most current Markdown implementations, including John Gruber's
3575original `Markdown.pl`, will parse this example as a single block quote
3576with two paragraphs.  But it seems better to allow the author to decide
3577whether two block quotes or one are wanted.)
3578
3579Consecutiveness means that if we put these block quotes together,
3580we get a single block quote:
3581
3582```````````````````````````````` example
3583> foo
3584> bar
3585.
3586<blockquote>
3587<p>foo
3588bar</p>
3589</blockquote>
3590````````````````````````````````
3591
3592
3593To get a block quote with two paragraphs, use:
3594
3595```````````````````````````````` example
3596> foo
3597>
3598> bar
3599.
3600<blockquote>
3601<p>foo</p>
3602<p>bar</p>
3603</blockquote>
3604````````````````````````````````
3605
3606
3607Block quotes can interrupt paragraphs:
3608
3609```````````````````````````````` example
3610foo
3611> bar
3612.
3613<p>foo</p>
3614<blockquote>
3615<p>bar</p>
3616</blockquote>
3617````````````````````````````````
3618
3619
3620In general, blank lines are not needed before or after block
3621quotes:
3622
3623```````````````````````````````` example
3624> aaa
3625***
3626> bbb
3627.
3628<blockquote>
3629<p>aaa</p>
3630</blockquote>
3631<hr />
3632<blockquote>
3633<p>bbb</p>
3634</blockquote>
3635````````````````````````````````
3636
3637
3638However, because of laziness, a blank line is needed between
3639a block quote and a following paragraph:
3640
3641```````````````````````````````` example
3642> bar
3643baz
3644.
3645<blockquote>
3646<p>bar
3647baz</p>
3648</blockquote>
3649````````````````````````````````
3650
3651
3652```````````````````````````````` example
3653> bar
3654
3655baz
3656.
3657<blockquote>
3658<p>bar</p>
3659</blockquote>
3660<p>baz</p>
3661````````````````````````````````
3662
3663
3664```````````````````````````````` example
3665> bar
3666>
3667baz
3668.
3669<blockquote>
3670<p>bar</p>
3671</blockquote>
3672<p>baz</p>
3673````````````````````````````````
3674
3675
3676It is a consequence of the Laziness rule that any number
3677of initial `>`s may be omitted on a continuation line of a
3678nested block quote:
3679
3680```````````````````````````````` example
3681> > > foo
3682bar
3683.
3684<blockquote>
3685<blockquote>
3686<blockquote>
3687<p>foo
3688bar</p>
3689</blockquote>
3690</blockquote>
3691</blockquote>
3692````````````````````````````````
3693
3694
3695```````````````````````````````` example
3696>>> foo
3697> bar
3698>>baz
3699.
3700<blockquote>
3701<blockquote>
3702<blockquote>
3703<p>foo
3704bar
3705baz</p>
3706</blockquote>
3707</blockquote>
3708</blockquote>
3709````````````````````````````````
3710
3711
3712When including an indented code block in a block quote,
3713remember that the [block quote marker] includes
3714both the `>` and a following space.  So *five spaces* are needed after
3715the `>`:
3716
3717```````````````````````````````` example
3718>     code
3719
3720>    not code
3721.
3722<blockquote>
3723<pre><code>code
3724</code></pre>
3725</blockquote>
3726<blockquote>
3727<p>not code</p>
3728</blockquote>
3729````````````````````````````````
3730
3731
3732
3733## List items
3734
3735A [list marker](@) is a
3736[bullet list marker] or an [ordered list marker].
3737
3738A [bullet list marker](@)
3739is a `-`, `+`, or `*` character.
3740
3741An [ordered list marker](@)
3742is a sequence of 1--9 arabic digits (`0-9`), followed by either a
3743`.` character or a `)` character.  (The reason for the length
3744limit is that with 10 digits we start seeing integer overflows
3745in some browsers.)
3746
3747The following rules define [list items]:
3748
37491.  **Basic case.**  If a sequence of lines *Ls* constitute a sequence of
3750    blocks *Bs* starting with a [non-whitespace character], and *M* is a
3751    list marker of width *W* followed by 1 ≤ *N* ≤ 4 spaces, then the result
3752    of prepending *M* and the following spaces to the first line of
3753    *Ls*, and indenting subsequent lines of *Ls* by *W + N* spaces, is a
3754    list item with *Bs* as its contents.  The type of the list item
3755    (bullet or ordered) is determined by the type of its list marker.
3756    If the list item is ordered, then it is also assigned a start
3757    number, based on the ordered list marker.
3758
3759    Exceptions:
3760
3761    1. When the first list item in a [list] interrupts
3762       a paragraph---that is, when it starts on a line that would
3763       otherwise count as [paragraph continuation text]---then (a)
3764       the lines *Ls* must not begin with a blank line, and (b) if
3765       the list item is ordered, the start number must be 1.
3766    2. If any line is a [thematic break][thematic breaks] then
3767       that line is not a list item.
3768
3769For example, let *Ls* be the lines
3770
3771```````````````````````````````` example
3772A paragraph
3773with two lines.
3774
3775    indented code
3776
3777> A block quote.
3778.
3779<p>A paragraph
3780with two lines.</p>
3781<pre><code>indented code
3782</code></pre>
3783<blockquote>
3784<p>A block quote.</p>
3785</blockquote>
3786````````````````````````````````
3787
3788
3789And let *M* be the marker `1.`, and *N* = 2.  Then rule #1 says
3790that the following is an ordered list item with start number 1,
3791and the same contents as *Ls*:
3792
3793```````````````````````````````` example
37941.  A paragraph
3795    with two lines.
3796
3797        indented code
3798
3799    > A block quote.
3800.
3801<ol>
3802<li>
3803<p>A paragraph
3804with two lines.</p>
3805<pre><code>indented code
3806</code></pre>
3807<blockquote>
3808<p>A block quote.</p>
3809</blockquote>
3810</li>
3811</ol>
3812````````````````````````````````
3813
3814
3815The most important thing to notice is that the position of
3816the text after the list marker determines how much indentation
3817is needed in subsequent blocks in the list item.  If the list
3818marker takes up two spaces, and there are three spaces between
3819the list marker and the next [non-whitespace character], then blocks
3820must be indented five spaces in order to fall under the list
3821item.
3822
3823Here are some examples showing how far content must be indented to be
3824put under the list item:
3825
3826```````````````````````````````` example
3827- one
3828
3829 two
3830.
3831<ul>
3832<li>one</li>
3833</ul>
3834<p>two</p>
3835````````````````````````````````
3836
3837
3838```````````````````````````````` example
3839- one
3840
3841  two
3842.
3843<ul>
3844<li>
3845<p>one</p>
3846<p>two</p>
3847</li>
3848</ul>
3849````````````````````````````````
3850
3851
3852```````````````````````````````` example
3853 -    one
3854
3855     two
3856.
3857<ul>
3858<li>one</li>
3859</ul>
3860<pre><code> two
3861</code></pre>
3862````````````````````````````````
3863
3864
3865```````````````````````````````` example
3866 -    one
3867
3868      two
3869.
3870<ul>
3871<li>
3872<p>one</p>
3873<p>two</p>
3874</li>
3875</ul>
3876````````````````````````````````
3877
3878
3879It is tempting to think of this in terms of columns:  the continuation
3880blocks must be indented at least to the column of the first
3881[non-whitespace character] after the list marker. However, that is not quite right.
3882The spaces after the list marker determine how much relative indentation
3883is needed.  Which column this indentation reaches will depend on
3884how the list item is embedded in other constructions, as shown by
3885this example:
3886
3887```````````````````````````````` example
3888   > > 1.  one
3889>>
3890>>     two
3891.
3892<blockquote>
3893<blockquote>
3894<ol>
3895<li>
3896<p>one</p>
3897<p>two</p>
3898</li>
3899</ol>
3900</blockquote>
3901</blockquote>
3902````````````````````````````````
3903
3904
3905Here `two` occurs in the same column as the list marker `1.`,
3906but is actually contained in the list item, because there is
3907sufficient indentation after the last containing blockquote marker.
3908
3909The converse is also possible.  In the following example, the word `two`
3910occurs far to the right of the initial text of the list item, `one`, but
3911it is not considered part of the list item, because it is not indented
3912far enough past the blockquote marker:
3913
3914```````````````````````````````` example
3915>>- one
3916>>
3917  >  > two
3918.
3919<blockquote>
3920<blockquote>
3921<ul>
3922<li>one</li>
3923</ul>
3924<p>two</p>
3925</blockquote>
3926</blockquote>
3927````````````````````````````````
3928
3929
3930Note that at least one space is needed between the list marker and
3931any following content, so these are not list items:
3932
3933```````````````````````````````` example
3934-one
3935
39362.two
3937.
3938<p>-one</p>
3939<p>2.two</p>
3940````````````````````````````````
3941
3942
3943A list item may contain blocks that are separated by more than
3944one blank line.
3945
3946```````````````````````````````` example
3947- foo
3948
3949
3950  bar
3951.
3952<ul>
3953<li>
3954<p>foo</p>
3955<p>bar</p>
3956</li>
3957</ul>
3958````````````````````````````````
3959
3960
3961A list item may contain any kind of block:
3962
3963```````````````````````````````` example
39641.  foo
3965
3966    ```
3967    bar
3968    ```
3969
3970    baz
3971
3972    > bam
3973.
3974<ol>
3975<li>
3976<p>foo</p>
3977<pre><code>bar
3978</code></pre>
3979<p>baz</p>
3980<blockquote>
3981<p>bam</p>
3982</blockquote>
3983</li>
3984</ol>
3985````````````````````````````````
3986
3987
3988A list item that contains an indented code block will preserve
3989empty lines within the code block verbatim.
3990
3991```````````````````````````````` example
3992- Foo
3993
3994      bar
3995
3996
3997      baz
3998.
3999<ul>
4000<li>
4001<p>Foo</p>
4002<pre><code>bar
4003
4004
4005baz
4006</code></pre>
4007</li>
4008</ul>
4009````````````````````````````````
4010
4011Note that ordered list start numbers must be nine digits or less:
4012
4013```````````````````````````````` example
4014123456789. ok
4015.
4016<ol start="123456789">
4017<li>ok</li>
4018</ol>
4019````````````````````````````````
4020
4021
4022```````````````````````````````` example
40231234567890. not ok
4024.
4025<p>1234567890. not ok</p>
4026````````````````````````````````
4027
4028
4029A start number may begin with 0s:
4030
4031```````````````````````````````` example
40320. ok
4033.
4034<ol start="0">
4035<li>ok</li>
4036</ol>
4037````````````````````````````````
4038
4039
4040```````````````````````````````` example
4041003. ok
4042.
4043<ol start="3">
4044<li>ok</li>
4045</ol>
4046````````````````````````````````
4047
4048
4049A start number may not be negative:
4050
4051```````````````````````````````` example
4052-1. not ok
4053.
4054<p>-1. not ok</p>
4055````````````````````````````````
4056
4057
4058
40592.  **Item starting with indented code.**  If a sequence of lines *Ls*
4060    constitute a sequence of blocks *Bs* starting with an indented code
4061    block, and *M* is a list marker of width *W* followed by
4062    one space, then the result of prepending *M* and the following
4063    space to the first line of *Ls*, and indenting subsequent lines of
4064    *Ls* by *W + 1* spaces, is a list item with *Bs* as its contents.
4065    If a line is empty, then it need not be indented.  The type of the
4066    list item (bullet or ordered) is determined by the type of its list
4067    marker.  If the list item is ordered, then it is also assigned a
4068    start number, based on the ordered list marker.
4069
4070An indented code block will have to be indented four spaces beyond
4071the edge of the region where text will be included in the list item.
4072In the following case that is 6 spaces:
4073
4074```````````````````````````````` example
4075- foo
4076
4077      bar
4078.
4079<ul>
4080<li>
4081<p>foo</p>
4082<pre><code>bar
4083</code></pre>
4084</li>
4085</ul>
4086````````````````````````````````
4087
4088
4089And in this case it is 11 spaces:
4090
4091```````````````````````````````` example
4092  10.  foo
4093
4094           bar
4095.
4096<ol start="10">
4097<li>
4098<p>foo</p>
4099<pre><code>bar
4100</code></pre>
4101</li>
4102</ol>
4103````````````````````````````````
4104
4105
4106If the *first* block in the list item is an indented code block,
4107then by rule #2, the contents must be indented *one* space after the
4108list marker:
4109
4110```````````````````````````````` example
4111    indented code
4112
4113paragraph
4114
4115    more code
4116.
4117<pre><code>indented code
4118</code></pre>
4119<p>paragraph</p>
4120<pre><code>more code
4121</code></pre>
4122````````````````````````````````
4123
4124
4125```````````````````````````````` example
41261.     indented code
4127
4128   paragraph
4129
4130       more code
4131.
4132<ol>
4133<li>
4134<pre><code>indented code
4135</code></pre>
4136<p>paragraph</p>
4137<pre><code>more code
4138</code></pre>
4139</li>
4140</ol>
4141````````````````````````````````
4142
4143
4144Note that an additional space indent is interpreted as space
4145inside the code block:
4146
4147```````````````````````````````` example
41481.      indented code
4149
4150   paragraph
4151
4152       more code
4153.
4154<ol>
4155<li>
4156<pre><code> indented code
4157</code></pre>
4158<p>paragraph</p>
4159<pre><code>more code
4160</code></pre>
4161</li>
4162</ol>
4163````````````````````````````````
4164
4165
4166Note that rules #1 and #2 only apply to two cases:  (a) cases
4167in which the lines to be included in a list item begin with a
4168[non-whitespace character], and (b) cases in which
4169they begin with an indented code
4170block.  In a case like the following, where the first block begins with
4171a three-space indent, the rules do not allow us to form a list item by
4172indenting the whole thing and prepending a list marker:
4173
4174```````````````````````````````` example
4175   foo
4176
4177bar
4178.
4179<p>foo</p>
4180<p>bar</p>
4181````````````````````````````````
4182
4183
4184```````````````````````````````` example
4185-    foo
4186
4187  bar
4188.
4189<ul>
4190<li>foo</li>
4191</ul>
4192<p>bar</p>
4193````````````````````````````````
4194
4195
4196This is not a significant restriction, because when a block begins
4197with 1-3 spaces indent, the indentation can always be removed without
4198a change in interpretation, allowing rule #1 to be applied.  So, in
4199the above case:
4200
4201```````````````````````````````` example
4202-  foo
4203
4204   bar
4205.
4206<ul>
4207<li>
4208<p>foo</p>
4209<p>bar</p>
4210</li>
4211</ul>
4212````````````````````````````````
4213
4214
42153.  **Item starting with a blank line.**  If a sequence of lines *Ls*
4216    starting with a single [blank line] constitute a (possibly empty)
4217    sequence of blocks *Bs*, not separated from each other by more than
4218    one blank line, and *M* is a list marker of width *W*,
4219    then the result of prepending *M* to the first line of *Ls*, and
4220    indenting subsequent lines of *Ls* by *W + 1* spaces, is a list
4221    item with *Bs* as its contents.
4222    If a line is empty, then it need not be indented.  The type of the
4223    list item (bullet or ordered) is determined by the type of its list
4224    marker.  If the list item is ordered, then it is also assigned a
4225    start number, based on the ordered list marker.
4226
4227Here are some list items that start with a blank line but are not empty:
4228
4229```````````````````````````````` example
4230-
4231  foo
4232-
4233  ```
4234  bar
4235  ```
4236-
4237      baz
4238.
4239<ul>
4240<li>foo</li>
4241<li>
4242<pre><code>bar
4243</code></pre>
4244</li>
4245<li>
4246<pre><code>baz
4247</code></pre>
4248</li>
4249</ul>
4250````````````````````````````````
4251
4252When the list item starts with a blank line, the number of spaces
4253following the list marker doesn't change the required indentation:
4254
4255```````````````````````````````` example
4256-
4257  foo
4258.
4259<ul>
4260<li>foo</li>
4261</ul>
4262````````````````````````````````
4263
4264
4265A list item can begin with at most one blank line.
4266In the following example, `foo` is not part of the list
4267item:
4268
4269```````````````````````````````` example
4270-
4271
4272  foo
4273.
4274<ul>
4275<li></li>
4276</ul>
4277<p>foo</p>
4278````````````````````````````````
4279
4280
4281Here is an empty bullet list item:
4282
4283```````````````````````````````` example
4284- foo
4285-
4286- bar
4287.
4288<ul>
4289<li>foo</li>
4290<li></li>
4291<li>bar</li>
4292</ul>
4293````````````````````````````````
4294
4295
4296It does not matter whether there are spaces following the [list marker]:
4297
4298```````````````````````````````` example
4299- foo
4300-
4301- bar
4302.
4303<ul>
4304<li>foo</li>
4305<li></li>
4306<li>bar</li>
4307</ul>
4308````````````````````````````````
4309
4310
4311Here is an empty ordered list item:
4312
4313```````````````````````````````` example
43141. foo
43152.
43163. bar
4317.
4318<ol>
4319<li>foo</li>
4320<li></li>
4321<li>bar</li>
4322</ol>
4323````````````````````````````````
4324
4325
4326A list may start or end with an empty list item:
4327
4328```````````````````````````````` example
4329*
4330.
4331<ul>
4332<li></li>
4333</ul>
4334````````````````````````````````
4335
4336However, an empty list item cannot interrupt a paragraph:
4337
4338```````````````````````````````` example
4339foo
4340*
4341
4342foo
43431.
4344.
4345<p>foo
4346*</p>
4347<p>foo
43481.</p>
4349````````````````````````````````
4350
4351
43524.  **Indentation.**  If a sequence of lines *Ls* constitutes a list item
4353    according to rule #1, #2, or #3, then the result of indenting each line
4354    of *Ls* by 1-3 spaces (the same for each line) also constitutes a
4355    list item with the same contents and attributes.  If a line is
4356    empty, then it need not be indented.
4357
4358Indented one space:
4359
4360```````````````````````````````` example
4361 1.  A paragraph
4362     with two lines.
4363
4364         indented code
4365
4366     > A block quote.
4367.
4368<ol>
4369<li>
4370<p>A paragraph
4371with two lines.</p>
4372<pre><code>indented code
4373</code></pre>
4374<blockquote>
4375<p>A block quote.</p>
4376</blockquote>
4377</li>
4378</ol>
4379````````````````````````````````
4380
4381
4382Indented two spaces:
4383
4384```````````````````````````````` example
4385  1.  A paragraph
4386      with two lines.
4387
4388          indented code
4389
4390      > A block quote.
4391.
4392<ol>
4393<li>
4394<p>A paragraph
4395with two lines.</p>
4396<pre><code>indented code
4397</code></pre>
4398<blockquote>
4399<p>A block quote.</p>
4400</blockquote>
4401</li>
4402</ol>
4403````````````````````````````````
4404
4405
4406Indented three spaces:
4407
4408```````````````````````````````` example
4409   1.  A paragraph
4410       with two lines.
4411
4412           indented code
4413
4414       > A block quote.
4415.
4416<ol>
4417<li>
4418<p>A paragraph
4419with two lines.</p>
4420<pre><code>indented code
4421</code></pre>
4422<blockquote>
4423<p>A block quote.</p>
4424</blockquote>
4425</li>
4426</ol>
4427````````````````````````````````
4428
4429
4430Four spaces indent gives a code block:
4431
4432```````````````````````````````` example
4433    1.  A paragraph
4434        with two lines.
4435
4436            indented code
4437
4438        > A block quote.
4439.
4440<pre><code>1.  A paragraph
4441    with two lines.
4442
4443        indented code
4444
4445    &gt; A block quote.
4446</code></pre>
4447````````````````````````````````
4448
4449
4450
44515.  **Laziness.**  If a string of lines *Ls* constitute a [list
4452    item](#list-items) with contents *Bs*, then the result of deleting
4453    some or all of the indentation from one or more lines in which the
4454    next [non-whitespace character] after the indentation is
4455    [paragraph continuation text] is a
4456    list item with the same contents and attributes.  The unindented
4457    lines are called
4458    [lazy continuation line](@)s.
4459
4460Here is an example with [lazy continuation lines]:
4461
4462```````````````````````````````` example
4463  1.  A paragraph
4464with two lines.
4465
4466          indented code
4467
4468      > A block quote.
4469.
4470<ol>
4471<li>
4472<p>A paragraph
4473with two lines.</p>
4474<pre><code>indented code
4475</code></pre>
4476<blockquote>
4477<p>A block quote.</p>
4478</blockquote>
4479</li>
4480</ol>
4481````````````````````````````````
4482
4483
4484Indentation can be partially deleted:
4485
4486```````````````````````````````` example
4487  1.  A paragraph
4488    with two lines.
4489.
4490<ol>
4491<li>A paragraph
4492with two lines.</li>
4493</ol>
4494````````````````````````````````
4495
4496
4497These examples show how laziness can work in nested structures:
4498
4499```````````````````````````````` example
4500> 1. > Blockquote
4501continued here.
4502.
4503<blockquote>
4504<ol>
4505<li>
4506<blockquote>
4507<p>Blockquote
4508continued here.</p>
4509</blockquote>
4510</li>
4511</ol>
4512</blockquote>
4513````````````````````````````````
4514
4515
4516```````````````````````````````` example
4517> 1. > Blockquote
4518> continued here.
4519.
4520<blockquote>
4521<ol>
4522<li>
4523<blockquote>
4524<p>Blockquote
4525continued here.</p>
4526</blockquote>
4527</li>
4528</ol>
4529</blockquote>
4530````````````````````````````````
4531
4532
4533
45346.  **That's all.** Nothing that is not counted as a list item by rules
4535    #1--5 counts as a [list item](#list-items).
4536
4537The rules for sublists follow from the general rules
4538[above][List items].  A sublist must be indented the same number
4539of spaces a paragraph would need to be in order to be included
4540in the list item.
4541
4542So, in this case we need two spaces indent:
4543
4544```````````````````````````````` example
4545- foo
4546  - bar
4547    - baz
4548      - boo
4549.
4550<ul>
4551<li>foo
4552<ul>
4553<li>bar
4554<ul>
4555<li>baz
4556<ul>
4557<li>boo</li>
4558</ul>
4559</li>
4560</ul>
4561</li>
4562</ul>
4563</li>
4564</ul>
4565````````````````````````````````
4566
4567
4568One is not enough:
4569
4570```````````````````````````````` example
4571- foo
4572 - bar
4573  - baz
4574   - boo
4575.
4576<ul>
4577<li>foo</li>
4578<li>bar</li>
4579<li>baz</li>
4580<li>boo</li>
4581</ul>
4582````````````````````````````````
4583
4584
4585Here we need four, because the list marker is wider:
4586
4587```````````````````````````````` example
458810) foo
4589    - bar
4590.
4591<ol start="10">
4592<li>foo
4593<ul>
4594<li>bar</li>
4595</ul>
4596</li>
4597</ol>
4598````````````````````````````````
4599
4600
4601Three is not enough:
4602
4603```````````````````````````````` example
460410) foo
4605   - bar
4606.
4607<ol start="10">
4608<li>foo</li>
4609</ol>
4610<ul>
4611<li>bar</li>
4612</ul>
4613````````````````````````````````
4614
4615
4616A list may be the first block in a list item:
4617
4618```````````````````````````````` example
4619- - foo
4620.
4621<ul>
4622<li>
4623<ul>
4624<li>foo</li>
4625</ul>
4626</li>
4627</ul>
4628````````````````````````````````
4629
4630
4631```````````````````````````````` example
46321. - 2. foo
4633.
4634<ol>
4635<li>
4636<ul>
4637<li>
4638<ol start="2">
4639<li>foo</li>
4640</ol>
4641</li>
4642</ul>
4643</li>
4644</ol>
4645````````````````````````````````
4646
4647
4648A list item can contain a heading:
4649
4650```````````````````````````````` example
4651- # Foo
4652- Bar
4653  ---
4654  baz
4655.
4656<ul>
4657<li>
4658<h1>Foo</h1>
4659</li>
4660<li>
4661<h2>Bar</h2>
4662baz</li>
4663</ul>
4664````````````````````````````````
4665
4666
4667### Motivation
4668
4669John Gruber's Markdown spec says the following about list items:
4670
46711. "List markers typically start at the left margin, but may be indented
4672   by up to three spaces. List markers must be followed by one or more
4673   spaces or a tab."
4674
46752. "To make lists look nice, you can wrap items with hanging indents....
4676   But if you don't want to, you don't have to."
4677
46783. "List items may consist of multiple paragraphs. Each subsequent
4679   paragraph in a list item must be indented by either 4 spaces or one
4680   tab."
4681
46824. "It looks nice if you indent every line of the subsequent paragraphs,
4683   but here again, Markdown will allow you to be lazy."
4684
46855. "To put a blockquote within a list item, the blockquote's `>`
4686   delimiters need to be indented."
4687
46886. "To put a code block within a list item, the code block needs to be
4689   indented twice — 8 spaces or two tabs."
4690
4691These rules specify that a paragraph under a list item must be indented
4692four spaces (presumably, from the left margin, rather than the start of
4693the list marker, but this is not said), and that code under a list item
4694must be indented eight spaces instead of the usual four.  They also say
4695that a block quote must be indented, but not by how much; however, the
4696example given has four spaces indentation.  Although nothing is said
4697about other kinds of block-level content, it is certainly reasonable to
4698infer that *all* block elements under a list item, including other
4699lists, must be indented four spaces.  This principle has been called the
4700*four-space rule*.
4701
4702The four-space rule is clear and principled, and if the reference
4703implementation `Markdown.pl` had followed it, it probably would have
4704become the standard.  However, `Markdown.pl` allowed paragraphs and
4705sublists to start with only two spaces indentation, at least on the
4706outer level.  Worse, its behavior was inconsistent: a sublist of an
4707outer-level list needed two spaces indentation, but a sublist of this
4708sublist needed three spaces.  It is not surprising, then, that different
4709implementations of Markdown have developed very different rules for
4710determining what comes under a list item.  (Pandoc and python-Markdown,
4711for example, stuck with Gruber's syntax description and the four-space
4712rule, while discount, redcarpet, marked, PHP Markdown, and others
4713followed `Markdown.pl`'s behavior more closely.)
4714
4715Unfortunately, given the divergences between implementations, there
4716is no way to give a spec for list items that will be guaranteed not
4717to break any existing documents.  However, the spec given here should
4718correctly handle lists formatted with either the four-space rule or
4719the more forgiving `Markdown.pl` behavior, provided they are laid out
4720in a way that is natural for a human to read.
4721
4722The strategy here is to let the width and indentation of the list marker
4723determine the indentation necessary for blocks to fall under the list
4724item, rather than having a fixed and arbitrary number.  The writer can
4725think of the body of the list item as a unit which gets indented to the
4726right enough to fit the list marker (and any indentation on the list
4727marker).  (The laziness rule, #5, then allows continuation lines to be
4728unindented if needed.)
4729
4730This rule is superior, we claim, to any rule requiring a fixed level of
4731indentation from the margin.  The four-space rule is clear but
4732unnatural. It is quite unintuitive that
4733
4734``` markdown
4735- foo
4736
4737  bar
4738
4739  - baz
4740```
4741
4742should be parsed as two lists with an intervening paragraph,
4743
4744``` html
4745<ul>
4746<li>foo</li>
4747</ul>
4748<p>bar</p>
4749<ul>
4750<li>baz</li>
4751</ul>
4752```
4753
4754as the four-space rule demands, rather than a single list,
4755
4756``` html
4757<ul>
4758<li>
4759<p>foo</p>
4760<p>bar</p>
4761<ul>
4762<li>baz</li>
4763</ul>
4764</li>
4765</ul>
4766```
4767
4768The choice of four spaces is arbitrary.  It can be learned, but it is
4769not likely to be guessed, and it trips up beginners regularly.
4770
4771Would it help to adopt a two-space rule?  The problem is that such
4772a rule, together with the rule allowing 1--3 spaces indentation of the
4773initial list marker, allows text that is indented *less than* the
4774original list marker to be included in the list item. For example,
4775`Markdown.pl` parses
4776
4777``` markdown
4778   - one
4779
4780  two
4781```
4782
4783as a single list item, with `two` a continuation paragraph:
4784
4785``` html
4786<ul>
4787<li>
4788<p>one</p>
4789<p>two</p>
4790</li>
4791</ul>
4792```
4793
4794and similarly
4795
4796``` markdown
4797>   - one
4798>
4799>  two
4800```
4801
4802as
4803
4804``` html
4805<blockquote>
4806<ul>
4807<li>
4808<p>one</p>
4809<p>two</p>
4810</li>
4811</ul>
4812</blockquote>
4813```
4814
4815This is extremely unintuitive.
4816
4817Rather than requiring a fixed indent from the margin, we could require
4818a fixed indent (say, two spaces, or even one space) from the list marker (which
4819may itself be indented).  This proposal would remove the last anomaly
4820discussed.  Unlike the spec presented above, it would count the following
4821as a list item with a subparagraph, even though the paragraph `bar`
4822is not indented as far as the first paragraph `foo`:
4823
4824``` markdown
4825 10. foo
4826
4827   bar
4828```
4829
4830Arguably this text does read like a list item with `bar` as a subparagraph,
4831which may count in favor of the proposal.  However, on this proposal indented
4832code would have to be indented six spaces after the list marker.  And this
4833would break a lot of existing Markdown, which has the pattern:
4834
4835``` markdown
48361.  foo
4837
4838        indented code
4839```
4840
4841where the code is indented eight spaces.  The spec above, by contrast, will
4842parse this text as expected, since the code block's indentation is measured
4843from the beginning of `foo`.
4844
4845The one case that needs special treatment is a list item that *starts*
4846with indented code.  How much indentation is required in that case, since
4847we don't have a "first paragraph" to measure from?  Rule #2 simply stipulates
4848that in such cases, we require one space indentation from the list marker
4849(and then the normal four spaces for the indented code).  This will match the
4850four-space rule in cases where the list marker plus its initial indentation
4851takes four spaces (a common case), but diverge in other cases.
4852
4853## Lists
4854
4855A [list](@) is a sequence of one or more
4856list items [of the same type].  The list items
4857may be separated by any number of blank lines.
4858
4859Two list items are [of the same type](@)
4860if they begin with a [list marker] of the same type.
4861Two list markers are of the
4862same type if (a) they are bullet list markers using the same character
4863(`-`, `+`, or `*`) or (b) they are ordered list numbers with the same
4864delimiter (either `.` or `)`).
4865
4866A list is an [ordered list](@)
4867if its constituent list items begin with
4868[ordered list markers], and a
4869[bullet list](@) if its constituent list
4870items begin with [bullet list markers].
4871
4872The [start number](@)
4873of an [ordered list] is determined by the list number of
4874its initial list item.  The numbers of subsequent list items are
4875disregarded.
4876
4877A list is [loose](@) if any of its constituent
4878list items are separated by blank lines, or if any of its constituent
4879list items directly contain two block-level elements with a blank line
4880between them.  Otherwise a list is [tight](@).
4881(The difference in HTML output is that paragraphs in a loose list are
4882wrapped in `<p>` tags, while paragraphs in a tight list are not.)
4883
4884Changing the bullet or ordered list delimiter starts a new list:
4885
4886```````````````````````````````` example
4887- foo
4888- bar
4889+ baz
4890.
4891<ul>
4892<li>foo</li>
4893<li>bar</li>
4894</ul>
4895<ul>
4896<li>baz</li>
4897</ul>
4898````````````````````````````````
4899
4900
4901```````````````````````````````` example
49021. foo
49032. bar
49043) baz
4905.
4906<ol>
4907<li>foo</li>
4908<li>bar</li>
4909</ol>
4910<ol start="3">
4911<li>baz</li>
4912</ol>
4913````````````````````````````````
4914
4915
4916In CommonMark, a list can interrupt a paragraph. That is,
4917no blank line is needed to separate a paragraph from a following
4918list:
4919
4920```````````````````````````````` example
4921Foo
4922- bar
4923- baz
4924.
4925<p>Foo</p>
4926<ul>
4927<li>bar</li>
4928<li>baz</li>
4929</ul>
4930````````````````````````````````
4931
4932`Markdown.pl` does not allow this, through fear of triggering a list
4933via a numeral in a hard-wrapped line:
4934
4935``` markdown
4936The number of windows in my house is
493714.  The number of doors is 6.
4938```
4939
4940Oddly, though, `Markdown.pl` *does* allow a blockquote to
4941interrupt a paragraph, even though the same considerations might
4942apply.
4943
4944In CommonMark, we do allow lists to interrupt paragraphs, for
4945two reasons.  First, it is natural and not uncommon for people
4946to start lists without blank lines:
4947
4948``` markdown
4949I need to buy
4950- new shoes
4951- a coat
4952- a plane ticket
4953```
4954
4955Second, we are attracted to a
4956
4957> [principle of uniformity](@):
4958> if a chunk of text has a certain
4959> meaning, it will continue to have the same meaning when put into a
4960> container block (such as a list item or blockquote).
4961
4962(Indeed, the spec for [list items] and [block quotes] presupposes
4963this principle.) This principle implies that if
4964
4965``` markdown
4966  * I need to buy
4967    - new shoes
4968    - a coat
4969    - a plane ticket
4970```
4971
4972is a list item containing a paragraph followed by a nested sublist,
4973as all Markdown implementations agree it is (though the paragraph
4974may be rendered without `<p>` tags, since the list is "tight"),
4975then
4976
4977``` markdown
4978I need to buy
4979- new shoes
4980- a coat
4981- a plane ticket
4982```
4983
4984by itself should be a paragraph followed by a nested sublist.
4985
4986Since it is well established Markdown practice to allow lists to
4987interrupt paragraphs inside list items, the [principle of
4988uniformity] requires us to allow this outside list items as
4989well.  ([reStructuredText](http://docutils.sourceforge.net/rst.html)
4990takes a different approach, requiring blank lines before lists
4991even inside other list items.)
4992
4993In order to solve of unwanted lists in paragraphs with
4994hard-wrapped numerals, we allow only lists starting with `1` to
4995interrupt paragraphs.  Thus,
4996
4997```````````````````````````````` example
4998The number of windows in my house is
499914.  The number of doors is 6.
5000.
5001<p>The number of windows in my house is
500214.  The number of doors is 6.</p>
5003````````````````````````````````
5004
5005We may still get an unintended result in cases like
5006
5007```````````````````````````````` example
5008The number of windows in my house is
50091.  The number of doors is 6.
5010.
5011<p>The number of windows in my house is</p>
5012<ol>
5013<li>The number of doors is 6.</li>
5014</ol>
5015````````````````````````````````
5016
5017but this rule should prevent most spurious list captures.
5018
5019There can be any number of blank lines between items:
5020
5021```````````````````````````````` example
5022- foo
5023
5024- bar
5025
5026
5027- baz
5028.
5029<ul>
5030<li>
5031<p>foo</p>
5032</li>
5033<li>
5034<p>bar</p>
5035</li>
5036<li>
5037<p>baz</p>
5038</li>
5039</ul>
5040````````````````````````````````
5041
5042```````````````````````````````` example
5043- foo
5044  - bar
5045    - baz
5046
5047
5048      bim
5049.
5050<ul>
5051<li>foo
5052<ul>
5053<li>bar
5054<ul>
5055<li>
5056<p>baz</p>
5057<p>bim</p>
5058</li>
5059</ul>
5060</li>
5061</ul>
5062</li>
5063</ul>
5064````````````````````````````````
5065
5066
5067To separate consecutive lists of the same type, or to separate a
5068list from an indented code block that would otherwise be parsed
5069as a subparagraph of the final list item, you can insert a blank HTML
5070comment:
5071
5072```````````````````````````````` example
5073- foo
5074- bar
5075
5076<!-- -->
5077
5078- baz
5079- bim
5080.
5081<ul>
5082<li>foo</li>
5083<li>bar</li>
5084</ul>
5085<!-- -->
5086<ul>
5087<li>baz</li>
5088<li>bim</li>
5089</ul>
5090````````````````````````````````
5091
5092
5093```````````````````````````````` example
5094-   foo
5095
5096    notcode
5097
5098-   foo
5099
5100<!-- -->
5101
5102    code
5103.
5104<ul>
5105<li>
5106<p>foo</p>
5107<p>notcode</p>
5108</li>
5109<li>
5110<p>foo</p>
5111</li>
5112</ul>
5113<!-- -->
5114<pre><code>code
5115</code></pre>
5116````````````````````````````````
5117
5118
5119List items need not be indented to the same level.  The following
5120list items will be treated as items at the same list level,
5121since none is indented enough to belong to the previous list
5122item:
5123
5124```````````````````````````````` example
5125- a
5126 - b
5127  - c
5128   - d
5129  - e
5130 - f
5131- g
5132.
5133<ul>
5134<li>a</li>
5135<li>b</li>
5136<li>c</li>
5137<li>d</li>
5138<li>e</li>
5139<li>f</li>
5140<li>g</li>
5141</ul>
5142````````````````````````````````
5143
5144
5145```````````````````````````````` example
51461. a
5147
5148  2. b
5149
5150   3. c
5151.
5152<ol>
5153<li>
5154<p>a</p>
5155</li>
5156<li>
5157<p>b</p>
5158</li>
5159<li>
5160<p>c</p>
5161</li>
5162</ol>
5163````````````````````````````````
5164
5165Note, however, that list items may not be indented more than
5166three spaces.  Here `- e` is treated as a paragraph continuation
5167line, because it is indented more than three spaces:
5168
5169```````````````````````````````` example
5170- a
5171 - b
5172  - c
5173   - d
5174    - e
5175.
5176<ul>
5177<li>a</li>
5178<li>b</li>
5179<li>c</li>
5180<li>d
5181- e</li>
5182</ul>
5183````````````````````````````````
5184
5185And here, `3. c` is treated as in indented code block,
5186because it is indented four spaces and preceded by a
5187blank line.
5188
5189```````````````````````````````` example
51901. a
5191
5192  2. b
5193
5194    3. c
5195.
5196<ol>
5197<li>
5198<p>a</p>
5199</li>
5200<li>
5201<p>b</p>
5202</li>
5203</ol>
5204<pre><code>3. c
5205</code></pre>
5206````````````````````````````````
5207
5208
5209This is a loose list, because there is a blank line between
5210two of the list items:
5211
5212```````````````````````````````` example
5213- a
5214- b
5215
5216- c
5217.
5218<ul>
5219<li>
5220<p>a</p>
5221</li>
5222<li>
5223<p>b</p>
5224</li>
5225<li>
5226<p>c</p>
5227</li>
5228</ul>
5229````````````````````````````````
5230
5231
5232So is this, with a empty second item:
5233
5234```````````````````````````````` example
5235* a
5236*
5237
5238* c
5239.
5240<ul>
5241<li>
5242<p>a</p>
5243</li>
5244<li></li>
5245<li>
5246<p>c</p>
5247</li>
5248</ul>
5249````````````````````````````````
5250
5251
5252These are loose lists, even though there is no space between the items,
5253because one of the items directly contains two block-level elements
5254with a blank line between them:
5255
5256```````````````````````````````` example
5257- a
5258- b
5259
5260  c
5261- d
5262.
5263<ul>
5264<li>
5265<p>a</p>
5266</li>
5267<li>
5268<p>b</p>
5269<p>c</p>
5270</li>
5271<li>
5272<p>d</p>
5273</li>
5274</ul>
5275````````````````````````````````
5276
5277
5278```````````````````````````````` example
5279- a
5280- b
5281
5282  [ref]: /url
5283- d
5284.
5285<ul>
5286<li>
5287<p>a</p>
5288</li>
5289<li>
5290<p>b</p>
5291</li>
5292<li>
5293<p>d</p>
5294</li>
5295</ul>
5296````````````````````````````````
5297
5298
5299This is a tight list, because the blank lines are in a code block:
5300
5301```````````````````````````````` example
5302- a
5303- ```
5304  b
5305
5306
5307  ```
5308- c
5309.
5310<ul>
5311<li>a</li>
5312<li>
5313<pre><code>b
5314
5315
5316</code></pre>
5317</li>
5318<li>c</li>
5319</ul>
5320````````````````````````````````
5321
5322
5323This is a tight list, because the blank line is between two
5324paragraphs of a sublist.  So the sublist is loose while
5325the outer list is tight:
5326
5327```````````````````````````````` example
5328- a
5329  - b
5330
5331    c
5332- d
5333.
5334<ul>
5335<li>a
5336<ul>
5337<li>
5338<p>b</p>
5339<p>c</p>
5340</li>
5341</ul>
5342</li>
5343<li>d</li>
5344</ul>
5345````````````````````````````````
5346
5347
5348This is a tight list, because the blank line is inside the
5349block quote:
5350
5351```````````````````````````````` example
5352* a
5353  > b
5354  >
5355* c
5356.
5357<ul>
5358<li>a
5359<blockquote>
5360<p>b</p>
5361</blockquote>
5362</li>
5363<li>c</li>
5364</ul>
5365````````````````````````````````
5366
5367
5368This list is tight, because the consecutive block elements
5369are not separated by blank lines:
5370
5371```````````````````````````````` example
5372- a
5373  > b
5374  ```
5375  c
5376  ```
5377- d
5378.
5379<ul>
5380<li>a
5381<blockquote>
5382<p>b</p>
5383</blockquote>
5384<pre><code>c
5385</code></pre>
5386</li>
5387<li>d</li>
5388</ul>
5389````````````````````````````````
5390
5391
5392A single-paragraph list is tight:
5393
5394```````````````````````````````` example
5395- a
5396.
5397<ul>
5398<li>a</li>
5399</ul>
5400````````````````````````````````
5401
5402
5403```````````````````````````````` example
5404- a
5405  - b
5406.
5407<ul>
5408<li>a
5409<ul>
5410<li>b</li>
5411</ul>
5412</li>
5413</ul>
5414````````````````````````````````
5415
5416
5417This list is loose, because of the blank line between the
5418two block elements in the list item:
5419
5420```````````````````````````````` example
54211. ```
5422   foo
5423   ```
5424
5425   bar
5426.
5427<ol>
5428<li>
5429<pre><code>foo
5430</code></pre>
5431<p>bar</p>
5432</li>
5433</ol>
5434````````````````````````````````
5435
5436
5437Here the outer list is loose, the inner list tight:
5438
5439```````````````````````````````` example
5440* foo
5441  * bar
5442
5443  baz
5444.
5445<ul>
5446<li>
5447<p>foo</p>
5448<ul>
5449<li>bar</li>
5450</ul>
5451<p>baz</p>
5452</li>
5453</ul>
5454````````````````````````````````
5455
5456
5457```````````````````````````````` example
5458- a
5459  - b
5460  - c
5461
5462- d
5463  - e
5464  - f
5465.
5466<ul>
5467<li>
5468<p>a</p>
5469<ul>
5470<li>b</li>
5471<li>c</li>
5472</ul>
5473</li>
5474<li>
5475<p>d</p>
5476<ul>
5477<li>e</li>
5478<li>f</li>
5479</ul>
5480</li>
5481</ul>
5482````````````````````````````````
5483
5484
5485# Inlines
5486
5487Inlines are parsed sequentially from the beginning of the character
5488stream to the end (left to right, in left-to-right languages).
5489Thus, for example, in
5490
5491```````````````````````````````` example
5492`hi`lo`
5493.
5494<p><code>hi</code>lo`</p>
5495````````````````````````````````
5496
5497`hi` is parsed as code, leaving the backtick at the end as a literal
5498backtick.
5499
5500
5501## Backslash escapes
5502
5503Any ASCII punctuation character may be backslash-escaped:
5504
5505```````````````````````````````` example
5506\!\"\#\$\%\&\'\(\)\*\+\,\-\.\/\:\;\<\=\>\?\@\[\\\]\^\_\`\{\|\}\~
5507.
5508<p>!&quot;#$%&amp;'()*+,-./:;&lt;=&gt;?@[\]^_`{|}~</p>
5509````````````````````````````````
5510
5511
5512Backslashes before other characters are treated as literal
5513backslashes:
5514
5515```````````````````````````````` example
5516\→\A\a\ \3\φ\«
5517.
5518<p>\→\A\a\ \3\φ\«</p>
5519````````````````````````````````
5520
5521
5522Escaped characters are treated as regular characters and do
5523not have their usual Markdown meanings:
5524
5525```````````````````````````````` example
5526\*not emphasized*
5527\<br/> not a tag
5528\[not a link](/foo)
5529\`not code`
55301\. not a list
5531\* not a list
5532\# not a heading
5533\[foo]: /url "not a reference"
5534\&ouml; not a character entity
5535.
5536<p>*not emphasized*
5537&lt;br/&gt; not a tag
5538[not a link](/foo)
5539`not code`
55401. not a list
5541* not a list
5542# not a heading
5543[foo]: /url &quot;not a reference&quot;
5544&amp;ouml; not a character entity</p>
5545````````````````````````````````
5546
5547
5548If a backslash is itself escaped, the following character is not:
5549
5550```````````````````````````````` example
5551\\*emphasis*
5552.
5553<p>\<em>emphasis</em></p>
5554````````````````````````````````
5555
5556
5557A backslash at the end of the line is a [hard line break]:
5558
5559```````````````````````````````` example
5560foo\
5561bar
5562.
5563<p>foo<br />
5564bar</p>
5565````````````````````````````````
5566
5567
5568Backslash escapes do not work in code blocks, code spans, autolinks, or
5569raw HTML:
5570
5571```````````````````````````````` example
5572`` \[\` ``
5573.
5574<p><code>\[\`</code></p>
5575````````````````````````````````
5576
5577
5578```````````````````````````````` example
5579    \[\]
5580.
5581<pre><code>\[\]
5582</code></pre>
5583````````````````````````````````
5584
5585
5586```````````````````````````````` example
5587~~~
5588\[\]
5589~~~
5590.
5591<pre><code>\[\]
5592</code></pre>
5593````````````````````````````````
5594
5595
5596```````````````````````````````` example
5597<http://example.com?find=\*>
5598.
5599<p><a href="http://example.com?find=%5C*">http://example.com?find=\*</a></p>
5600````````````````````````````````
5601
5602
5603```````````````````````````````` example
5604<a href="/bar\/)">
5605.
5606<a href="/bar\/)">
5607````````````````````````````````
5608
5609
5610But they work in all other contexts, including URLs and link titles,
5611link references, and [info strings] in [fenced code blocks]:
5612
5613```````````````````````````````` example
5614[foo](/bar\* "ti\*tle")
5615.
5616<p><a href="/bar*" title="ti*tle">foo</a></p>
5617````````````````````````````````
5618
5619
5620```````````````````````````````` example
5621[foo]
5622
5623[foo]: /bar\* "ti\*tle"
5624.
5625<p><a href="/bar*" title="ti*tle">foo</a></p>
5626````````````````````````````````
5627
5628
5629```````````````````````````````` example
5630``` foo\+bar
5631foo
5632```
5633.
5634<pre><code class="language-foo+bar">foo
5635</code></pre>
5636````````````````````````````````
5637
5638
5639
5640## Entity and numeric character references
5641
5642Valid HTML entity references and numeric character references
5643can be used in place of the corresponding Unicode character,
5644with the following exceptions:
5645
5646- Entity and character references are not recognized in code
5647  blocks and code spans.
5648
5649- Entity and character references cannot stand in place of
5650  special characters that define structural elements in
5651  CommonMark.  For example, although `&#42;` can be used
5652  in place of a literal `*` character, `&#42;` cannot replace
5653  `*` in emphasis delimiters, bullet list markers, or thematic
5654  breaks.
5655
5656Conforming CommonMark parsers need not store information about
5657whether a particular character was represented in the source
5658using a Unicode character or an entity reference.
5659
5660[Entity references](@) consist of `&` + any of the valid
5661HTML5 entity names + `;`. The
5662document <https://html.spec.whatwg.org/multipage/entities.json>
5663is used as an authoritative source for the valid entity
5664references and their corresponding code points.
5665
5666```````````````````````````````` example
5667&nbsp; &amp; &copy; &AElig; &Dcaron;
5668&frac34; &HilbertSpace; &DifferentialD;
5669&ClockwiseContourIntegral; &ngE;
5670.
5671<p>  &amp; © Æ Ď
5672¾ ℋ ⅆ
5673∲ ≧̸</p>
5674````````````````````````````````
5675
5676
5677[Decimal numeric character
5678references](@)
5679consist of `&#` + a string of 1--7 arabic digits + `;`. A
5680numeric character reference is parsed as the corresponding
5681Unicode character. Invalid Unicode code points will be replaced by
5682the REPLACEMENT CHARACTER (`U+FFFD`).  For security reasons,
5683the code point `U+0000` will also be replaced by `U+FFFD`.
5684
5685```````````````````````````````` example
5686&#35; &#1234; &#992; &#0;
5687.
5688<p># Ӓ Ϡ �</p>
5689````````````````````````````````
5690
5691
5692[Hexadecimal numeric character
5693references](@) consist of `&#` +
5694either `X` or `x` + a string of 1-6 hexadecimal digits + `;`.
5695They too are parsed as the corresponding Unicode character (this
5696time specified with a hexadecimal numeral instead of decimal).
5697
5698```````````````````````````````` example
5699&#X22; &#XD06; &#xcab;
5700.
5701<p>&quot; ആ ಫ</p>
5702````````````````````````````````
5703
5704
5705Here are some nonentities:
5706
5707```````````````````````````````` example
5708&nbsp &x; &#; &#x;
5709&#87654321;
5710&#abcdef0;
5711&ThisIsNotDefined; &hi?;
5712.
5713<p>&amp;nbsp &amp;x; &amp;#; &amp;#x;
5714&amp;#87654321;
5715&amp;#abcdef0;
5716&amp;ThisIsNotDefined; &amp;hi?;</p>
5717````````````````````````````````
5718
5719
5720Although HTML5 does accept some entity references
5721without a trailing semicolon (such as `&copy`), these are not
5722recognized here, because it makes the grammar too ambiguous:
5723
5724```````````````````````````````` example
5725&copy
5726.
5727<p>&amp;copy</p>
5728````````````````````````````````
5729
5730
5731Strings that are not on the list of HTML5 named entities are not
5732recognized as entity references either:
5733
5734```````````````````````````````` example
5735&MadeUpEntity;
5736.
5737<p>&amp;MadeUpEntity;</p>
5738````````````````````````````````
5739
5740
5741Entity and numeric character references are recognized in any
5742context besides code spans or code blocks, including
5743URLs, [link titles], and [fenced code block][] [info strings]:
5744
5745```````````````````````````````` example
5746<a href="&ouml;&ouml;.html">
5747.
5748<a href="&ouml;&ouml;.html">
5749````````````````````````````````
5750
5751
5752```````````````````````````````` example
5753[foo](/f&ouml;&ouml; "f&ouml;&ouml;")
5754.
5755<p><a href="/f%C3%B6%C3%B6" title="föö">foo</a></p>
5756````````````````````````````````
5757
5758
5759```````````````````````````````` example
5760[foo]
5761
5762[foo]: /f&ouml;&ouml; "f&ouml;&ouml;"
5763.
5764<p><a href="/f%C3%B6%C3%B6" title="föö">foo</a></p>
5765````````````````````````````````
5766
5767
5768```````````````````````````````` example
5769``` f&ouml;&ouml;
5770foo
5771```
5772.
5773<pre><code class="language-föö">foo
5774</code></pre>
5775````````````````````````````````
5776
5777
5778Entity and numeric character references are treated as literal
5779text in code spans and code blocks:
5780
5781```````````````````````````````` example
5782`f&ouml;&ouml;`
5783.
5784<p><code>f&amp;ouml;&amp;ouml;</code></p>
5785````````````````````````````````
5786
5787
5788```````````````````````````````` example
5789    f&ouml;f&ouml;
5790.
5791<pre><code>f&amp;ouml;f&amp;ouml;
5792</code></pre>
5793````````````````````````````````
5794
5795
5796Entity and numeric character references cannot be used
5797in place of symbols indicating structure in CommonMark
5798documents.
5799
5800```````````````````````````````` example
5801&#42;foo&#42;
5802*foo*
5803.
5804<p>*foo*
5805<em>foo</em></p>
5806````````````````````````````````
5807
5808```````````````````````````````` example
5809&#42; foo
5810
5811* foo
5812.
5813<p>* foo</p>
5814<ul>
5815<li>foo</li>
5816</ul>
5817````````````````````````````````
5818
5819```````````````````````````````` example
5820foo&#10;&#10;bar
5821.
5822<p>foo
5823
5824bar</p>
5825````````````````````````````````
5826
5827```````````````````````````````` example
5828&#9;foo
5829.
5830<p>→foo</p>
5831````````````````````````````````
5832
5833
5834```````````````````````````````` example
5835[a](url &quot;tit&quot;)
5836.
5837<p>[a](url &quot;tit&quot;)</p>
5838````````````````````````````````
5839
5840
5841## Code spans
5842
5843A [backtick string](@)
5844is a string of one or more backtick characters (`` ` ``) that is neither
5845preceded nor followed by a backtick.
5846
5847A [code span](@) begins with a backtick string and ends with
5848a backtick string of equal length.  The contents of the code span are
5849the characters between the two backtick strings, normalized in the
5850following ways:
5851
5852- First, [line endings] are converted to [spaces].
5853- If the resulting string both begins *and* ends with a [space]
5854  character, but does not consist entirely of [space]
5855  characters, a single [space] character is removed from the
5856  front and back.  This allows you to include code that begins
5857  or ends with backtick characters, which must be separated by
5858  whitespace from the opening or closing backtick strings.
5859
5860This is a simple code span:
5861
5862```````````````````````````````` example
5863`foo`
5864.
5865<p><code>foo</code></p>
5866````````````````````````````````
5867
5868
5869Here two backticks are used, because the code contains a backtick.
5870This example also illustrates stripping of a single leading and
5871trailing space:
5872
5873```````````````````````````````` example
5874`` foo ` bar ``
5875.
5876<p><code>foo ` bar</code></p>
5877````````````````````````````````
5878
5879
5880This example shows the motivation for stripping leading and trailing
5881spaces:
5882
5883```````````````````````````````` example
5884` `` `
5885.
5886<p><code>``</code></p>
5887````````````````````````````````
5888
5889Note that only *one* space is stripped:
5890
5891```````````````````````````````` example
5892`  ``  `
5893.
5894<p><code> `` </code></p>
5895````````````````````````````````
5896
5897The stripping only happens if the space is on both
5898sides of the string:
5899
5900```````````````````````````````` example
5901` a`
5902.
5903<p><code> a</code></p>
5904````````````````````````````````
5905
5906Only [spaces], and not [unicode whitespace] in general, are
5907stripped in this way:
5908
5909```````````````````````````````` example
5910` b `
5911.
5912<p><code> b </code></p>
5913````````````````````````````````
5914
5915No stripping occurs if the code span contains only spaces:
5916
5917```````````````````````````````` example
5918` `
5919`  `
5920.
5921<p><code> </code>
5922<code>  </code></p>
5923````````````````````````````````
5924
5925
5926[Line endings] are treated like spaces:
5927
5928```````````````````````````````` example
5929``
5930foo
5931bar
5932baz
5933``
5934.
5935<p><code>foo bar   baz</code></p>
5936````````````````````````````````
5937
5938```````````````````````````````` example
5939``
5940foo
5941``
5942.
5943<p><code>foo </code></p>
5944````````````````````````````````
5945
5946
5947Interior spaces are not collapsed:
5948
5949```````````````````````````````` example
5950`foo   bar
5951baz`
5952.
5953<p><code>foo   bar  baz</code></p>
5954````````````````````````````````
5955
5956Note that browsers will typically collapse consecutive spaces
5957when rendering `<code>` elements, so it is recommended that
5958the following CSS be used:
5959
5960    code{white-space: pre-wrap;}
5961
5962
5963Note that backslash escapes do not work in code spans. All backslashes
5964are treated literally:
5965
5966```````````````````````````````` example
5967`foo\`bar`
5968.
5969<p><code>foo\</code>bar`</p>
5970````````````````````````````````
5971
5972
5973Backslash escapes are never needed, because one can always choose a
5974string of *n* backtick characters as delimiters, where the code does
5975not contain any strings of exactly *n* backtick characters.
5976
5977```````````````````````````````` example
5978``foo`bar``
5979.
5980<p><code>foo`bar</code></p>
5981````````````````````````````````
5982
5983```````````````````````````````` example
5984` foo `` bar `
5985.
5986<p><code>foo `` bar</code></p>
5987````````````````````````````````
5988
5989
5990Code span backticks have higher precedence than any other inline
5991constructs except HTML tags and autolinks.  Thus, for example, this is
5992not parsed as emphasized text, since the second `*` is part of a code
5993span:
5994
5995```````````````````````````````` example
5996*foo`*`
5997.
5998<p>*foo<code>*</code></p>
5999````````````````````````````````
6000
6001
6002And this is not parsed as a link:
6003
6004```````````````````````````````` example
6005[not a `link](/foo`)
6006.
6007<p>[not a <code>link](/foo</code>)</p>
6008````````````````````````````````
6009
6010
6011Code spans, HTML tags, and autolinks have the same precedence.
6012Thus, this is code:
6013
6014```````````````````````````````` example
6015`<a href="`">`
6016.
6017<p><code>&lt;a href=&quot;</code>&quot;&gt;`</p>
6018````````````````````````````````
6019
6020
6021But this is an HTML tag:
6022
6023```````````````````````````````` example
6024<a href="`">`
6025.
6026<p><a href="`">`</p>
6027````````````````````````````````
6028
6029
6030And this is code:
6031
6032```````````````````````````````` example
6033`<http://foo.bar.`baz>`
6034.
6035<p><code>&lt;http://foo.bar.</code>baz&gt;`</p>
6036````````````````````````````````
6037
6038
6039But this is an autolink:
6040
6041```````````````````````````````` example
6042<http://foo.bar.`baz>`
6043.
6044<p><a href="http://foo.bar.%60baz">http://foo.bar.`baz</a>`</p>
6045````````````````````````````````
6046
6047
6048When a backtick string is not closed by a matching backtick string,
6049we just have literal backticks:
6050
6051```````````````````````````````` example
6052```foo``
6053.
6054<p>```foo``</p>
6055````````````````````````````````
6056
6057
6058```````````````````````````````` example
6059`foo
6060.
6061<p>`foo</p>
6062````````````````````````````````
6063
6064The following case also illustrates the need for opening and
6065closing backtick strings to be equal in length:
6066
6067```````````````````````````````` example
6068`foo``bar``
6069.
6070<p>`foo<code>bar</code></p>
6071````````````````````````````````
6072
6073
6074## Emphasis and strong emphasis
6075
6076John Gruber's original [Markdown syntax
6077description](http://daringfireball.net/projects/markdown/syntax#em) says:
6078
6079> Markdown treats asterisks (`*`) and underscores (`_`) as indicators of
6080> emphasis. Text wrapped with one `*` or `_` will be wrapped with an HTML
6081> `<em>` tag; double `*`'s or `_`'s will be wrapped with an HTML `<strong>`
6082> tag.
6083
6084This is enough for most users, but these rules leave much undecided,
6085especially when it comes to nested emphasis.  The original
6086`Markdown.pl` test suite makes it clear that triple `***` and
6087`___` delimiters can be used for strong emphasis, and most
6088implementations have also allowed the following patterns:
6089
6090``` markdown
6091***strong emph***
6092***strong** in emph*
6093***emph* in strong**
6094**in strong *emph***
6095*in emph **strong***
6096```
6097
6098The following patterns are less widely supported, but the intent
6099is clear and they are useful (especially in contexts like bibliography
6100entries):
6101
6102``` markdown
6103*emph *with emph* in it*
6104**strong **with strong** in it**
6105```
6106
6107Many implementations have also restricted intraword emphasis to
6108the `*` forms, to avoid unwanted emphasis in words containing
6109internal underscores.  (It is best practice to put these in code
6110spans, but users often do not.)
6111
6112``` markdown
6113internal emphasis: foo*bar*baz
6114no emphasis: foo_bar_baz
6115```
6116
6117The rules given below capture all of these patterns, while allowing
6118for efficient parsing strategies that do not backtrack.
6119
6120First, some definitions.  A [delimiter run](@) is either
6121a sequence of one or more `*` characters that is not preceded or
6122followed by a non-backslash-escaped `*` character, or a sequence
6123of one or more `_` characters that is not preceded or followed by
6124a non-backslash-escaped `_` character.
6125
6126A [left-flanking delimiter run](@) is
6127a [delimiter run] that is (1) not followed by [Unicode whitespace],
6128and either (2a) not followed by a [punctuation character], or
6129(2b) followed by a [punctuation character] and
6130preceded by [Unicode whitespace] or a [punctuation character].
6131For purposes of this definition, the beginning and the end of
6132the line count as Unicode whitespace.
6133
6134A [right-flanking delimiter run](@) is
6135a [delimiter run] that is (1) not preceded by [Unicode whitespace],
6136and either (2a) not preceded by a [punctuation character], or
6137(2b) preceded by a [punctuation character] and
6138followed by [Unicode whitespace] or a [punctuation character].
6139For purposes of this definition, the beginning and the end of
6140the line count as Unicode whitespace.
6141
6142Here are some examples of delimiter runs.
6143
6144  - left-flanking but not right-flanking:
6145
6146    ```
6147    ***abc
6148      _abc
6149    **"abc"
6150     _"abc"
6151    ```
6152
6153  - right-flanking but not left-flanking:
6154
6155    ```
6156     abc***
6157     abc_
6158    "abc"**
6159    "abc"_
6160    ```
6161
6162  - Both left and right-flanking:
6163
6164    ```
6165     abc***def
6166    "abc"_"def"
6167    ```
6168
6169  - Neither left nor right-flanking:
6170
6171    ```
6172    abc *** def
6173    a _ b
6174    ```
6175
6176(The idea of distinguishing left-flanking and right-flanking
6177delimiter runs based on the character before and the character
6178after comes from Roopesh Chander's
6179[vfmd](http://www.vfmd.org/vfmd-spec/specification/#procedure-for-identifying-emphasis-tags).
6180vfmd uses the terminology "emphasis indicator string" instead of "delimiter
6181run," and its rules for distinguishing left- and right-flanking runs
6182are a bit more complex than the ones given here.)
6183
6184The following rules define emphasis and strong emphasis:
6185
61861.  A single `*` character [can open emphasis](@)
6187    iff (if and only if) it is part of a [left-flanking delimiter run].
6188
61892.  A single `_` character [can open emphasis] iff
6190    it is part of a [left-flanking delimiter run]
6191    and either (a) not part of a [right-flanking delimiter run]
6192    or (b) part of a [right-flanking delimiter run]
6193    preceded by punctuation.
6194
61953.  A single `*` character [can close emphasis](@)
6196    iff it is part of a [right-flanking delimiter run].
6197
61984.  A single `_` character [can close emphasis] iff
6199    it is part of a [right-flanking delimiter run]
6200    and either (a) not part of a [left-flanking delimiter run]
6201    or (b) part of a [left-flanking delimiter run]
6202    followed by punctuation.
6203
62045.  A double `**` [can open strong emphasis](@)
6205    iff it is part of a [left-flanking delimiter run].
6206
62076.  A double `__` [can open strong emphasis] iff
6208    it is part of a [left-flanking delimiter run]
6209    and either (a) not part of a [right-flanking delimiter run]
6210    or (b) part of a [right-flanking delimiter run]
6211    preceded by punctuation.
6212
62137.  A double `**` [can close strong emphasis](@)
6214    iff it is part of a [right-flanking delimiter run].
6215
62168.  A double `__` [can close strong emphasis] iff
6217    it is part of a [right-flanking delimiter run]
6218    and either (a) not part of a [left-flanking delimiter run]
6219    or (b) part of a [left-flanking delimiter run]
6220    followed by punctuation.
6221
62229.  Emphasis begins with a delimiter that [can open emphasis] and ends
6223    with a delimiter that [can close emphasis], and that uses the same
6224    character (`_` or `*`) as the opening delimiter.  The
6225    opening and closing delimiters must belong to separate
6226    [delimiter runs].  If one of the delimiters can both
6227    open and close emphasis, then the sum of the lengths of the
6228    delimiter runs containing the opening and closing delimiters
6229    must not be a multiple of 3 unless both lengths are
6230    multiples of 3.
6231
623210. Strong emphasis begins with a delimiter that
6233    [can open strong emphasis] and ends with a delimiter that
6234    [can close strong emphasis], and that uses the same character
6235    (`_` or `*`) as the opening delimiter.  The
6236    opening and closing delimiters must belong to separate
6237    [delimiter runs].  If one of the delimiters can both open
6238    and close strong emphasis, then the sum of the lengths of
6239    the delimiter runs containing the opening and closing
6240    delimiters must not be a multiple of 3 unless both lengths
6241    are multiples of 3.
6242
624311. A literal `*` character cannot occur at the beginning or end of
6244    `*`-delimited emphasis or `**`-delimited strong emphasis, unless it
6245    is backslash-escaped.
6246
624712. A literal `_` character cannot occur at the beginning or end of
6248    `_`-delimited emphasis or `__`-delimited strong emphasis, unless it
6249    is backslash-escaped.
6250
6251Where rules 1--12 above are compatible with multiple parsings,
6252the following principles resolve ambiguity:
6253
625413. The number of nestings should be minimized. Thus, for example,
6255    an interpretation `<strong>...</strong>` is always preferred to
6256    `<em><em>...</em></em>`.
6257
625814. An interpretation `<em><strong>...</strong></em>` is always
6259    preferred to `<strong><em>...</em></strong>`.
6260
626115. When two potential emphasis or strong emphasis spans overlap,
6262    so that the second begins before the first ends and ends after
6263    the first ends, the first takes precedence. Thus, for example,
6264    `*foo _bar* baz_` is parsed as `<em>foo _bar</em> baz_` rather
6265    than `*foo <em>bar* baz</em>`.
6266
626716. When there are two potential emphasis or strong emphasis spans
6268    with the same closing delimiter, the shorter one (the one that
6269    opens later) takes precedence. Thus, for example,
6270    `**foo **bar baz**` is parsed as `**foo <strong>bar baz</strong>`
6271    rather than `<strong>foo **bar baz</strong>`.
6272
627317. Inline code spans, links, images, and HTML tags group more tightly
6274    than emphasis.  So, when there is a choice between an interpretation
6275    that contains one of these elements and one that does not, the
6276    former always wins.  Thus, for example, `*[foo*](bar)` is
6277    parsed as `*<a href="bar">foo*</a>` rather than as
6278    `<em>[foo</em>](bar)`.
6279
6280These rules can be illustrated through a series of examples.
6281
6282Rule 1:
6283
6284```````````````````````````````` example
6285*foo bar*
6286.
6287<p><em>foo bar</em></p>
6288````````````````````````````````
6289
6290
6291This is not emphasis, because the opening `*` is followed by
6292whitespace, and hence not part of a [left-flanking delimiter run]:
6293
6294```````````````````````````````` example
6295a * foo bar*
6296.
6297<p>a * foo bar*</p>
6298````````````````````````````````
6299
6300
6301This is not emphasis, because the opening `*` is preceded
6302by an alphanumeric and followed by punctuation, and hence
6303not part of a [left-flanking delimiter run]:
6304
6305```````````````````````````````` example
6306a*"foo"*
6307.
6308<p>a*&quot;foo&quot;*</p>
6309````````````````````````````````
6310
6311
6312Unicode nonbreaking spaces count as whitespace, too:
6313
6314```````````````````````````````` example
6315* a *
6316.
6317<p>* a *</p>
6318````````````````````````````````
6319
6320
6321Intraword emphasis with `*` is permitted:
6322
6323```````````````````````````````` example
6324foo*bar*
6325.
6326<p>foo<em>bar</em></p>
6327````````````````````````````````
6328
6329
6330```````````````````````````````` example
63315*6*78
6332.
6333<p>5<em>6</em>78</p>
6334````````````````````````````````
6335
6336
6337Rule 2:
6338
6339```````````````````````````````` example
6340_foo bar_
6341.
6342<p><em>foo bar</em></p>
6343````````````````````````````````
6344
6345
6346This is not emphasis, because the opening `_` is followed by
6347whitespace:
6348
6349```````````````````````````````` example
6350_ foo bar_
6351.
6352<p>_ foo bar_</p>
6353````````````````````````````````
6354
6355
6356This is not emphasis, because the opening `_` is preceded
6357by an alphanumeric and followed by punctuation:
6358
6359```````````````````````````````` example
6360a_"foo"_
6361.
6362<p>a_&quot;foo&quot;_</p>
6363````````````````````````````````
6364
6365
6366Emphasis with `_` is not allowed inside words:
6367
6368```````````````````````````````` example
6369foo_bar_
6370.
6371<p>foo_bar_</p>
6372````````````````````````````````
6373
6374
6375```````````````````````````````` example
63765_6_78
6377.
6378<p>5_6_78</p>
6379````````````````````````````````
6380
6381
6382```````````````````````````````` example
6383пристаням_стремятся_
6384.
6385<p>пристаням_стремятся_</p>
6386````````````````````````````````
6387
6388
6389Here `_` does not generate emphasis, because the first delimiter run
6390is right-flanking and the second left-flanking:
6391
6392```````````````````````````````` example
6393aa_"bb"_cc
6394.
6395<p>aa_&quot;bb&quot;_cc</p>
6396````````````````````````````````
6397
6398
6399This is emphasis, even though the opening delimiter is
6400both left- and right-flanking, because it is preceded by
6401punctuation:
6402
6403```````````````````````````````` example
6404foo-_(bar)_
6405.
6406<p>foo-<em>(bar)</em></p>
6407````````````````````````````````
6408
6409
6410Rule 3:
6411
6412This is not emphasis, because the closing delimiter does
6413not match the opening delimiter:
6414
6415```````````````````````````````` example
6416_foo*
6417.
6418<p>_foo*</p>
6419````````````````````````````````
6420
6421
6422This is not emphasis, because the closing `*` is preceded by
6423whitespace:
6424
6425```````````````````````````````` example
6426*foo bar *
6427.
6428<p>*foo bar *</p>
6429````````````````````````````````
6430
6431
6432A newline also counts as whitespace:
6433
6434```````````````````````````````` example
6435*foo bar
6436*
6437.
6438<p>*foo bar
6439*</p>
6440````````````````````````````````
6441
6442
6443This is not emphasis, because the second `*` is
6444preceded by punctuation and followed by an alphanumeric
6445(hence it is not part of a [right-flanking delimiter run]:
6446
6447```````````````````````````````` example
6448*(*foo)
6449.
6450<p>*(*foo)</p>
6451````````````````````````````````
6452
6453
6454The point of this restriction is more easily appreciated
6455with this example:
6456
6457```````````````````````````````` example
6458*(*foo*)*
6459.
6460<p><em>(<em>foo</em>)</em></p>
6461````````````````````````````````
6462
6463
6464Intraword emphasis with `*` is allowed:
6465
6466```````````````````````````````` example
6467*foo*bar
6468.
6469<p><em>foo</em>bar</p>
6470````````````````````````````````
6471
6472
6473
6474Rule 4:
6475
6476This is not emphasis, because the closing `_` is preceded by
6477whitespace:
6478
6479```````````````````````````````` example
6480_foo bar _
6481.
6482<p>_foo bar _</p>
6483````````````````````````````````
6484
6485
6486This is not emphasis, because the second `_` is
6487preceded by punctuation and followed by an alphanumeric:
6488
6489```````````````````````````````` example
6490_(_foo)
6491.
6492<p>_(_foo)</p>
6493````````````````````````````````
6494
6495
6496This is emphasis within emphasis:
6497
6498```````````````````````````````` example
6499_(_foo_)_
6500.
6501<p><em>(<em>foo</em>)</em></p>
6502````````````````````````````````
6503
6504
6505Intraword emphasis is disallowed for `_`:
6506
6507```````````````````````````````` example
6508_foo_bar
6509.
6510<p>_foo_bar</p>
6511````````````````````````````````
6512
6513
6514```````````````````````````````` example
6515_пристаням_стремятся
6516.
6517<p>_пристаням_стремятся</p>
6518````````````````````````````````
6519
6520
6521```````````````````````````````` example
6522_foo_bar_baz_
6523.
6524<p><em>foo_bar_baz</em></p>
6525````````````````````````````````
6526
6527
6528This is emphasis, even though the closing delimiter is
6529both left- and right-flanking, because it is followed by
6530punctuation:
6531
6532```````````````````````````````` example
6533_(bar)_.
6534.
6535<p><em>(bar)</em>.</p>
6536````````````````````````````````
6537
6538
6539Rule 5:
6540
6541```````````````````````````````` example
6542**foo bar**
6543.
6544<p><strong>foo bar</strong></p>
6545````````````````````````````````
6546
6547
6548This is not strong emphasis, because the opening delimiter is
6549followed by whitespace:
6550
6551```````````````````````````````` example
6552** foo bar**
6553.
6554<p>** foo bar**</p>
6555````````````````````````````````
6556
6557
6558This is not strong emphasis, because the opening `**` is preceded
6559by an alphanumeric and followed by punctuation, and hence
6560not part of a [left-flanking delimiter run]:
6561
6562```````````````````````````````` example
6563a**"foo"**
6564.
6565<p>a**&quot;foo&quot;**</p>
6566````````````````````````````````
6567
6568
6569Intraword strong emphasis with `**` is permitted:
6570
6571```````````````````````````````` example
6572foo**bar**
6573.
6574<p>foo<strong>bar</strong></p>
6575````````````````````````````````
6576
6577
6578Rule 6:
6579
6580```````````````````````````````` example
6581__foo bar__
6582.
6583<p><strong>foo bar</strong></p>
6584````````````````````````````````
6585
6586
6587This is not strong emphasis, because the opening delimiter is
6588followed by whitespace:
6589
6590```````````````````````````````` example
6591__ foo bar__
6592.
6593<p>__ foo bar__</p>
6594````````````````````````````````
6595
6596
6597A newline counts as whitespace:
6598```````````````````````````````` example
6599__
6600foo bar__
6601.
6602<p>__
6603foo bar__</p>
6604````````````````````````````````
6605
6606
6607This is not strong emphasis, because the opening `__` is preceded
6608by an alphanumeric and followed by punctuation:
6609
6610```````````````````````````````` example
6611a__"foo"__
6612.
6613<p>a__&quot;foo&quot;__</p>
6614````````````````````````````````
6615
6616
6617Intraword strong emphasis is forbidden with `__`:
6618
6619```````````````````````````````` example
6620foo__bar__
6621.
6622<p>foo__bar__</p>
6623````````````````````````````````
6624
6625
6626```````````````````````````````` example
66275__6__78
6628.
6629<p>5__6__78</p>
6630````````````````````````````````
6631
6632
6633```````````````````````````````` example
6634пристаням__стремятся__
6635.
6636<p>пристаням__стремятся__</p>
6637````````````````````````````````
6638
6639
6640```````````````````````````````` example
6641__foo, __bar__, baz__
6642.
6643<p><strong>foo, <strong>bar</strong>, baz</strong></p>
6644````````````````````````````````
6645
6646
6647This is strong emphasis, even though the opening delimiter is
6648both left- and right-flanking, because it is preceded by
6649punctuation:
6650
6651```````````````````````````````` example
6652foo-__(bar)__
6653.
6654<p>foo-<strong>(bar)</strong></p>
6655````````````````````````````````
6656
6657
6658
6659Rule 7:
6660
6661This is not strong emphasis, because the closing delimiter is preceded
6662by whitespace:
6663
6664```````````````````````````````` example
6665**foo bar **
6666.
6667<p>**foo bar **</p>
6668````````````````````````````````
6669
6670
6671(Nor can it be interpreted as an emphasized `*foo bar *`, because of
6672Rule 11.)
6673
6674This is not strong emphasis, because the second `**` is
6675preceded by punctuation and followed by an alphanumeric:
6676
6677```````````````````````````````` example
6678**(**foo)
6679.
6680<p>**(**foo)</p>
6681````````````````````````````````
6682
6683
6684The point of this restriction is more easily appreciated
6685with these examples:
6686
6687```````````````````````````````` example
6688*(**foo**)*
6689.
6690<p><em>(<strong>foo</strong>)</em></p>
6691````````````````````````````````
6692
6693
6694```````````````````````````````` example
6695**Gomphocarpus (*Gomphocarpus physocarpus*, syn.
6696*Asclepias physocarpa*)**
6697.
6698<p><strong>Gomphocarpus (<em>Gomphocarpus physocarpus</em>, syn.
6699<em>Asclepias physocarpa</em>)</strong></p>
6700````````````````````````````````
6701
6702
6703```````````````````````````````` example
6704**foo "*bar*" foo**
6705.
6706<p><strong>foo &quot;<em>bar</em>&quot; foo</strong></p>
6707````````````````````````````````
6708
6709
6710Intraword emphasis:
6711
6712```````````````````````````````` example
6713**foo**bar
6714.
6715<p><strong>foo</strong>bar</p>
6716````````````````````````````````
6717
6718
6719Rule 8:
6720
6721This is not strong emphasis, because the closing delimiter is
6722preceded by whitespace:
6723
6724```````````````````````````````` example
6725__foo bar __
6726.
6727<p>__foo bar __</p>
6728````````````````````````````````
6729
6730
6731This is not strong emphasis, because the second `__` is
6732preceded by punctuation and followed by an alphanumeric:
6733
6734```````````````````````````````` example
6735__(__foo)
6736.
6737<p>__(__foo)</p>
6738````````````````````````````````
6739
6740
6741The point of this restriction is more easily appreciated
6742with this example:
6743
6744```````````````````````````````` example
6745_(__foo__)_
6746.
6747<p><em>(<strong>foo</strong>)</em></p>
6748````````````````````````````````
6749
6750
6751Intraword strong emphasis is forbidden with `__`:
6752
6753```````````````````````````````` example
6754__foo__bar
6755.
6756<p>__foo__bar</p>
6757````````````````````````````````
6758
6759
6760```````````````````````````````` example
6761__пристаням__стремятся
6762.
6763<p>__пристаням__стремятся</p>
6764````````````````````````````````
6765
6766
6767```````````````````````````````` example
6768__foo__bar__baz__
6769.
6770<p><strong>foo__bar__baz</strong></p>
6771````````````````````````````````
6772
6773
6774This is strong emphasis, even though the closing delimiter is
6775both left- and right-flanking, because it is followed by
6776punctuation:
6777
6778```````````````````````````````` example
6779__(bar)__.
6780.
6781<p><strong>(bar)</strong>.</p>
6782````````````````````````````````
6783
6784
6785Rule 9:
6786
6787Any nonempty sequence of inline elements can be the contents of an
6788emphasized span.
6789
6790```````````````````````````````` example
6791*foo [bar](/url)*
6792.
6793<p><em>foo <a href="/url">bar</a></em></p>
6794````````````````````````````````
6795
6796
6797```````````````````````````````` example
6798*foo
6799bar*
6800.
6801<p><em>foo
6802bar</em></p>
6803````````````````````````````````
6804
6805
6806In particular, emphasis and strong emphasis can be nested
6807inside emphasis:
6808
6809```````````````````````````````` example
6810_foo __bar__ baz_
6811.
6812<p><em>foo <strong>bar</strong> baz</em></p>
6813````````````````````````````````
6814
6815
6816```````````````````````````````` example
6817_foo _bar_ baz_
6818.
6819<p><em>foo <em>bar</em> baz</em></p>
6820````````````````````````````````
6821
6822
6823```````````````````````````````` example
6824__foo_ bar_
6825.
6826<p><em><em>foo</em> bar</em></p>
6827````````````````````````````````
6828
6829
6830```````````````````````````````` example
6831*foo *bar**
6832.
6833<p><em>foo <em>bar</em></em></p>
6834````````````````````````````````
6835
6836
6837```````````````````````````````` example
6838*foo **bar** baz*
6839.
6840<p><em>foo <strong>bar</strong> baz</em></p>
6841````````````````````````````````
6842
6843```````````````````````````````` example
6844*foo**bar**baz*
6845.
6846<p><em>foo<strong>bar</strong>baz</em></p>
6847````````````````````````````````
6848
6849Note that in the preceding case, the interpretation
6850
6851``` markdown
6852<p><em>foo</em><em>bar<em></em>baz</em></p>
6853```
6854
6855
6856is precluded by the condition that a delimiter that
6857can both open and close (like the `*` after `foo`)
6858cannot form emphasis if the sum of the lengths of
6859the delimiter runs containing the opening and
6860closing delimiters is a multiple of 3 unless
6861both lengths are multiples of 3.
6862
6863
6864For the same reason, we don't get two consecutive
6865emphasis sections in this example:
6866
6867```````````````````````````````` example
6868*foo**bar*
6869.
6870<p><em>foo**bar</em></p>
6871````````````````````````````````
6872
6873
6874The same condition ensures that the following
6875cases are all strong emphasis nested inside
6876emphasis, even when the interior spaces are
6877omitted:
6878
6879
6880```````````````````````````````` example
6881***foo** bar*
6882.
6883<p><em><strong>foo</strong> bar</em></p>
6884````````````````````````````````
6885
6886
6887```````````````````````````````` example
6888*foo **bar***
6889.
6890<p><em>foo <strong>bar</strong></em></p>
6891````````````````````````````````
6892
6893
6894```````````````````````````````` example
6895*foo**bar***
6896.
6897<p><em>foo<strong>bar</strong></em></p>
6898````````````````````````````````
6899
6900
6901When the lengths of the interior closing and opening
6902delimiter runs are *both* multiples of 3, though,
6903they can match to create emphasis:
6904
6905```````````````````````````````` example
6906foo***bar***baz
6907.
6908<p>foo<em><strong>bar</strong></em>baz</p>
6909````````````````````````````````
6910
6911```````````````````````````````` example
6912foo******bar*********baz
6913.
6914<p>foo<strong><strong><strong>bar</strong></strong></strong>***baz</p>
6915````````````````````````````````
6916
6917
6918Indefinite levels of nesting are possible:
6919
6920```````````````````````````````` example
6921*foo **bar *baz* bim** bop*
6922.
6923<p><em>foo <strong>bar <em>baz</em> bim</strong> bop</em></p>
6924````````````````````````````````
6925
6926
6927```````````````````````````````` example
6928*foo [*bar*](/url)*
6929.
6930<p><em>foo <a href="/url"><em>bar</em></a></em></p>
6931````````````````````````````````
6932
6933
6934There can be no empty emphasis or strong emphasis:
6935
6936```````````````````````````````` example
6937** is not an empty emphasis
6938.
6939<p>** is not an empty emphasis</p>
6940````````````````````````````````
6941
6942
6943```````````````````````````````` example
6944**** is not an empty strong emphasis
6945.
6946<p>**** is not an empty strong emphasis</p>
6947````````````````````````````````
6948
6949
6950
6951Rule 10:
6952
6953Any nonempty sequence of inline elements can be the contents of an
6954strongly emphasized span.
6955
6956```````````````````````````````` example
6957**foo [bar](/url)**
6958.
6959<p><strong>foo <a href="/url">bar</a></strong></p>
6960````````````````````````````````
6961
6962
6963```````````````````````````````` example
6964**foo
6965bar**
6966.
6967<p><strong>foo
6968bar</strong></p>
6969````````````````````````````````
6970
6971
6972In particular, emphasis and strong emphasis can be nested
6973inside strong emphasis:
6974
6975```````````````````````````````` example
6976__foo _bar_ baz__
6977.
6978<p><strong>foo <em>bar</em> baz</strong></p>
6979````````````````````````````````
6980
6981
6982```````````````````````````````` example
6983__foo __bar__ baz__
6984.
6985<p><strong>foo <strong>bar</strong> baz</strong></p>
6986````````````````````````````````
6987
6988
6989```````````````````````````````` example
6990____foo__ bar__
6991.
6992<p><strong><strong>foo</strong> bar</strong></p>
6993````````````````````````````````
6994
6995
6996```````````````````````````````` example
6997**foo **bar****
6998.
6999<p><strong>foo <strong>bar</strong></strong></p>
7000````````````````````````````````
7001
7002
7003```````````````````````````````` example
7004**foo *bar* baz**
7005.
7006<p><strong>foo <em>bar</em> baz</strong></p>
7007````````````````````````````````
7008
7009
7010```````````````````````````````` example
7011**foo*bar*baz**
7012.
7013<p><strong>foo<em>bar</em>baz</strong></p>
7014````````````````````````````````
7015
7016
7017```````````````````````````````` example
7018***foo* bar**
7019.
7020<p><strong><em>foo</em> bar</strong></p>
7021````````````````````````````````
7022
7023
7024```````````````````````````````` example
7025**foo *bar***
7026.
7027<p><strong>foo <em>bar</em></strong></p>
7028````````````````````````````````
7029
7030
7031Indefinite levels of nesting are possible:
7032
7033```````````````````````````````` example
7034**foo *bar **baz**
7035bim* bop**
7036.
7037<p><strong>foo <em>bar <strong>baz</strong>
7038bim</em> bop</strong></p>
7039````````````````````````````````
7040
7041
7042```````````````````````````````` example
7043**foo [*bar*](/url)**
7044.
7045<p><strong>foo <a href="/url"><em>bar</em></a></strong></p>
7046````````````````````````````````
7047
7048
7049There can be no empty emphasis or strong emphasis:
7050
7051```````````````````````````````` example
7052__ is not an empty emphasis
7053.
7054<p>__ is not an empty emphasis</p>
7055````````````````````````````````
7056
7057
7058```````````````````````````````` example
7059____ is not an empty strong emphasis
7060.
7061<p>____ is not an empty strong emphasis</p>
7062````````````````````````````````
7063
7064
7065
7066Rule 11:
7067
7068```````````````````````````````` example
7069foo ***
7070.
7071<p>foo ***</p>
7072````````````````````````````````
7073
7074
7075```````````````````````````````` example
7076foo *\**
7077.
7078<p>foo <em>*</em></p>
7079````````````````````````````````
7080
7081
7082```````````````````````````````` example
7083foo *_*
7084.
7085<p>foo <em>_</em></p>
7086````````````````````````````````
7087
7088
7089```````````````````````````````` example
7090foo *****
7091.
7092<p>foo *****</p>
7093````````````````````````````````
7094
7095
7096```````````````````````````````` example
7097foo **\***
7098.
7099<p>foo <strong>*</strong></p>
7100````````````````````````````````
7101
7102
7103```````````````````````````````` example
7104foo **_**
7105.
7106<p>foo <strong>_</strong></p>
7107````````````````````````````````
7108
7109
7110Note that when delimiters do not match evenly, Rule 11 determines
7111that the excess literal `*` characters will appear outside of the
7112emphasis, rather than inside it:
7113
7114```````````````````````````````` example
7115**foo*
7116.
7117<p>*<em>foo</em></p>
7118````````````````````````````````
7119
7120
7121```````````````````````````````` example
7122*foo**
7123.
7124<p><em>foo</em>*</p>
7125````````````````````````````````
7126
7127
7128```````````````````````````````` example
7129***foo**
7130.
7131<p>*<strong>foo</strong></p>
7132````````````````````````````````
7133
7134
7135```````````````````````````````` example
7136****foo*
7137.
7138<p>***<em>foo</em></p>
7139````````````````````````````````
7140
7141
7142```````````````````````````````` example
7143**foo***
7144.
7145<p><strong>foo</strong>*</p>
7146````````````````````````````````
7147
7148
7149```````````````````````````````` example
7150*foo****
7151.
7152<p><em>foo</em>***</p>
7153````````````````````````````````
7154
7155
7156
7157Rule 12:
7158
7159```````````````````````````````` example
7160foo ___
7161.
7162<p>foo ___</p>
7163````````````````````````````````
7164
7165
7166```````````````````````````````` example
7167foo _\__
7168.
7169<p>foo <em>_</em></p>
7170````````````````````````````````
7171
7172
7173```````````````````````````````` example
7174foo _*_
7175.
7176<p>foo <em>*</em></p>
7177````````````````````````````````
7178
7179
7180```````````````````````````````` example
7181foo _____
7182.
7183<p>foo _____</p>
7184````````````````````````````````
7185
7186
7187```````````````````````````````` example
7188foo __\___
7189.
7190<p>foo <strong>_</strong></p>
7191````````````````````````````````
7192
7193
7194```````````````````````````````` example
7195foo __*__
7196.
7197<p>foo <strong>*</strong></p>
7198````````````````````````````````
7199
7200
7201```````````````````````````````` example
7202__foo_
7203.
7204<p>_<em>foo</em></p>
7205````````````````````````````````
7206
7207
7208Note that when delimiters do not match evenly, Rule 12 determines
7209that the excess literal `_` characters will appear outside of the
7210emphasis, rather than inside it:
7211
7212```````````````````````````````` example
7213_foo__
7214.
7215<p><em>foo</em>_</p>
7216````````````````````````````````
7217
7218
7219```````````````````````````````` example
7220___foo__
7221.
7222<p>_<strong>foo</strong></p>
7223````````````````````````````````
7224
7225
7226```````````````````````````````` example
7227____foo_
7228.
7229<p>___<em>foo</em></p>
7230````````````````````````````````
7231
7232
7233```````````````````````````````` example
7234__foo___
7235.
7236<p><strong>foo</strong>_</p>
7237````````````````````````````````
7238
7239
7240```````````````````````````````` example
7241_foo____
7242.
7243<p><em>foo</em>___</p>
7244````````````````````````````````
7245
7246
7247Rule 13 implies that if you want emphasis nested directly inside
7248emphasis, you must use different delimiters:
7249
7250```````````````````````````````` example
7251**foo**
7252.
7253<p><strong>foo</strong></p>
7254````````````````````````````````
7255
7256
7257```````````````````````````````` example
7258*_foo_*
7259.
7260<p><em><em>foo</em></em></p>
7261````````````````````````````````
7262
7263
7264```````````````````````````````` example
7265__foo__
7266.
7267<p><strong>foo</strong></p>
7268````````````````````````````````
7269
7270
7271```````````````````````````````` example
7272_*foo*_
7273.
7274<p><em><em>foo</em></em></p>
7275````````````````````````````````
7276
7277
7278However, strong emphasis within strong emphasis is possible without
7279switching delimiters:
7280
7281```````````````````````````````` example
7282****foo****
7283.
7284<p><strong><strong>foo</strong></strong></p>
7285````````````````````````````````
7286
7287
7288```````````````````````````````` example
7289____foo____
7290.
7291<p><strong><strong>foo</strong></strong></p>
7292````````````````````````````````
7293
7294
7295
7296Rule 13 can be applied to arbitrarily long sequences of
7297delimiters:
7298
7299```````````````````````````````` example
7300******foo******
7301.
7302<p><strong><strong><strong>foo</strong></strong></strong></p>
7303````````````````````````````````
7304
7305
7306Rule 14:
7307
7308```````````````````````````````` example
7309***foo***
7310.
7311<p><em><strong>foo</strong></em></p>
7312````````````````````````````````
7313
7314
7315```````````````````````````````` example
7316_____foo_____
7317.
7318<p><em><strong><strong>foo</strong></strong></em></p>
7319````````````````````````````````
7320
7321
7322Rule 15:
7323
7324```````````````````````````````` example
7325*foo _bar* baz_
7326.
7327<p><em>foo _bar</em> baz_</p>
7328````````````````````````````````
7329
7330
7331```````````````````````````````` example
7332*foo __bar *baz bim__ bam*
7333.
7334<p><em>foo <strong>bar *baz bim</strong> bam</em></p>
7335````````````````````````````````
7336
7337
7338Rule 16:
7339
7340```````````````````````````````` example
7341**foo **bar baz**
7342.
7343<p>**foo <strong>bar baz</strong></p>
7344````````````````````````````````
7345
7346
7347```````````````````````````````` example
7348*foo *bar baz*
7349.
7350<p>*foo <em>bar baz</em></p>
7351````````````````````````````````
7352
7353
7354Rule 17:
7355
7356```````````````````````````````` example
7357*[bar*](/url)
7358.
7359<p>*<a href="/url">bar*</a></p>
7360````````````````````````````````
7361
7362
7363```````````````````````````````` example
7364_foo [bar_](/url)
7365.
7366<p>_foo <a href="/url">bar_</a></p>
7367````````````````````````````````
7368
7369
7370```````````````````````````````` example
7371*<img src="foo" title="*"/>
7372.
7373<p>*<img src="foo" title="*"/></p>
7374````````````````````````````````
7375
7376
7377```````````````````````````````` example
7378**<a href="**">
7379.
7380<p>**<a href="**"></p>
7381````````````````````````````````
7382
7383
7384```````````````````````````````` example
7385__<a href="__">
7386.
7387<p>__<a href="__"></p>
7388````````````````````````````````
7389
7390
7391```````````````````````````````` example
7392*a `*`*
7393.
7394<p><em>a <code>*</code></em></p>
7395````````````````````````````````
7396
7397
7398```````````````````````````````` example
7399_a `_`_
7400.
7401<p><em>a <code>_</code></em></p>
7402````````````````````````````````
7403
7404
7405```````````````````````````````` example
7406**a<http://foo.bar/?q=**>
7407.
7408<p>**a<a href="http://foo.bar/?q=**">http://foo.bar/?q=**</a></p>
7409````````````````````````````````
7410
7411
7412```````````````````````````````` example
7413__a<http://foo.bar/?q=__>
7414.
7415<p>__a<a href="http://foo.bar/?q=__">http://foo.bar/?q=__</a></p>
7416````````````````````````````````
7417
7418
7419
7420## Links
7421
7422A link contains [link text] (the visible text), a [link destination]
7423(the URI that is the link destination), and optionally a [link title].
7424There are two basic kinds of links in Markdown.  In [inline links] the
7425destination and title are given immediately after the link text.  In
7426[reference links] the destination and title are defined elsewhere in
7427the document.
7428
7429A [link text](@) consists of a sequence of zero or more
7430inline elements enclosed by square brackets (`[` and `]`).  The
7431following rules apply:
7432
7433- Links may not contain other links, at any level of nesting. If
7434  multiple otherwise valid link definitions appear nested inside each
7435  other, the inner-most definition is used.
7436
7437- Brackets are allowed in the [link text] only if (a) they
7438  are backslash-escaped or (b) they appear as a matched pair of brackets,
7439  with an open bracket `[`, a sequence of zero or more inlines, and
7440  a close bracket `]`.
7441
7442- Backtick [code spans], [autolinks], and raw [HTML tags] bind more tightly
7443  than the brackets in link text.  Thus, for example,
7444  `` [foo`]` `` could not be a link text, since the second `]`
7445  is part of a code span.
7446
7447- The brackets in link text bind more tightly than markers for
7448  [emphasis and strong emphasis]. Thus, for example, `*[foo*](url)` is a link.
7449
7450A [link destination](@) consists of either
7451
7452- a sequence of zero or more characters between an opening `<` and a
7453  closing `>` that contains no line breaks or unescaped
7454  `<` or `>` characters, or
7455
7456- a nonempty sequence of characters that does not start with
7457  `<`, does not include ASCII space or control characters, and
7458  includes parentheses only if (a) they are backslash-escaped or
7459  (b) they are part of a balanced pair of unescaped parentheses.
7460  (Implementations may impose limits on parentheses nesting to
7461  avoid performance issues, but at least three levels of nesting
7462  should be supported.)
7463
7464A [link title](@)  consists of either
7465
7466- a sequence of zero or more characters between straight double-quote
7467  characters (`"`), including a `"` character only if it is
7468  backslash-escaped, or
7469
7470- a sequence of zero or more characters between straight single-quote
7471  characters (`'`), including a `'` character only if it is
7472  backslash-escaped, or
7473
7474- a sequence of zero or more characters between matching parentheses
7475  (`(...)`), including a `(` or `)` character only if it is
7476  backslash-escaped.
7477
7478Although [link titles] may span multiple lines, they may not contain
7479a [blank line].
7480
7481An [inline link](@) consists of a [link text] followed immediately
7482by a left parenthesis `(`, optional [whitespace], an optional
7483[link destination], an optional [link title] separated from the link
7484destination by [whitespace], optional [whitespace], and a right
7485parenthesis `)`. The link's text consists of the inlines contained
7486in the [link text] (excluding the enclosing square brackets).
7487The link's URI consists of the link destination, excluding enclosing
7488`<...>` if present, with backslash-escapes in effect as described
7489above.  The link's title consists of the link title, excluding its
7490enclosing delimiters, with backslash-escapes in effect as described
7491above.
7492
7493Here is a simple inline link:
7494
7495```````````````````````````````` example
7496[link](/uri "title")
7497.
7498<p><a href="/uri" title="title">link</a></p>
7499````````````````````````````````
7500
7501
7502The title may be omitted:
7503
7504```````````````````````````````` example
7505[link](/uri)
7506.
7507<p><a href="/uri">link</a></p>
7508````````````````````````````````
7509
7510
7511Both the title and the destination may be omitted:
7512
7513```````````````````````````````` example
7514[link]()
7515.
7516<p><a href="">link</a></p>
7517````````````````````````````````
7518
7519
7520```````````````````````````````` example
7521[link](<>)
7522.
7523<p><a href="">link</a></p>
7524````````````````````````````````
7525
7526The destination can only contain spaces if it is
7527enclosed in pointy brackets:
7528
7529```````````````````````````````` example
7530[link](/my uri)
7531.
7532<p>[link](/my uri)</p>
7533````````````````````````````````
7534
7535```````````````````````````````` example
7536[link](</my uri>)
7537.
7538<p><a href="/my%20uri">link</a></p>
7539````````````````````````````````
7540
7541The destination cannot contain line breaks,
7542even if enclosed in pointy brackets:
7543
7544```````````````````````````````` example
7545[link](foo
7546bar)
7547.
7548<p>[link](foo
7549bar)</p>
7550````````````````````````````````
7551
7552```````````````````````````````` example
7553[link](<foo
7554bar>)
7555.
7556<p>[link](<foo
7557bar>)</p>
7558````````````````````````````````
7559
7560The destination can contain `)` if it is enclosed
7561in pointy brackets:
7562
7563```````````````````````````````` example
7564[a](<b)c>)
7565.
7566<p><a href="b)c">a</a></p>
7567````````````````````````````````
7568
7569Pointy brackets that enclose links must be unescaped:
7570
7571```````````````````````````````` example
7572[link](<foo\>)
7573.
7574<p>[link](&lt;foo&gt;)</p>
7575````````````````````````````````
7576
7577These are not links, because the opening pointy bracket
7578is not matched properly:
7579
7580```````````````````````````````` example
7581[a](<b)c
7582[a](<b)c>
7583[a](<b>c)
7584.
7585<p>[a](&lt;b)c
7586[a](&lt;b)c&gt;
7587[a](<b>c)</p>
7588````````````````````````````````
7589
7590Parentheses inside the link destination may be escaped:
7591
7592```````````````````````````````` example
7593[link](\(foo\))
7594.
7595<p><a href="(foo)">link</a></p>
7596````````````````````````````````
7597
7598Any number of parentheses are allowed without escaping, as long as they are
7599balanced:
7600
7601```````````````````````````````` example
7602[link](foo(and(bar)))
7603.
7604<p><a href="foo(and(bar))">link</a></p>
7605````````````````````````````````
7606
7607However, if you have unbalanced parentheses, you need to escape or use the
7608`<...>` form:
7609
7610```````````````````````````````` example
7611[link](foo\(and\(bar\))
7612.
7613<p><a href="foo(and(bar)">link</a></p>
7614````````````````````````````````
7615
7616
7617```````````````````````````````` example
7618[link](<foo(and(bar)>)
7619.
7620<p><a href="foo(and(bar)">link</a></p>
7621````````````````````````````````
7622
7623
7624Parentheses and other symbols can also be escaped, as usual
7625in Markdown:
7626
7627```````````````````````````````` example
7628[link](foo\)\:)
7629.
7630<p><a href="foo):">link</a></p>
7631````````````````````````````````
7632
7633
7634A link can contain fragment identifiers and queries:
7635
7636```````````````````````````````` example
7637[link](#fragment)
7638
7639[link](http://example.com#fragment)
7640
7641[link](http://example.com?foo=3#frag)
7642.
7643<p><a href="#fragment">link</a></p>
7644<p><a href="http://example.com#fragment">link</a></p>
7645<p><a href="http://example.com?foo=3#frag">link</a></p>
7646````````````````````````````````
7647
7648
7649Note that a backslash before a non-escapable character is
7650just a backslash:
7651
7652```````````````````````````````` example
7653[link](foo\bar)
7654.
7655<p><a href="foo%5Cbar">link</a></p>
7656````````````````````````````````
7657
7658
7659URL-escaping should be left alone inside the destination, as all
7660URL-escaped characters are also valid URL characters. Entity and
7661numerical character references in the destination will be parsed
7662into the corresponding Unicode code points, as usual.  These may
7663be optionally URL-escaped when written as HTML, but this spec
7664does not enforce any particular policy for rendering URLs in
7665HTML or other formats.  Renderers may make different decisions
7666about how to escape or normalize URLs in the output.
7667
7668```````````````````````````````` example
7669[link](foo%20b&auml;)
7670.
7671<p><a href="foo%20b%C3%A4">link</a></p>
7672````````````````````````````````
7673
7674
7675Note that, because titles can often be parsed as destinations,
7676if you try to omit the destination and keep the title, you'll
7677get unexpected results:
7678
7679```````````````````````````````` example
7680[link]("title")
7681.
7682<p><a href="%22title%22">link</a></p>
7683````````````````````````````````
7684
7685
7686Titles may be in single quotes, double quotes, or parentheses:
7687
7688```````````````````````````````` example
7689[link](/url "title")
7690[link](/url 'title')
7691[link](/url (title))
7692.
7693<p><a href="/url" title="title">link</a>
7694<a href="/url" title="title">link</a>
7695<a href="/url" title="title">link</a></p>
7696````````````````````````````````
7697
7698
7699Backslash escapes and entity and numeric character references
7700may be used in titles:
7701
7702```````````````````````````````` example
7703[link](/url "title \"&quot;")
7704.
7705<p><a href="/url" title="title &quot;&quot;">link</a></p>
7706````````````````````````````````
7707
7708
7709Titles must be separated from the link using a [whitespace].
7710Other [Unicode whitespace] like non-breaking space doesn't work.
7711
7712```````````````````````````````` example
7713[link](/url "title")
7714.
7715<p><a href="/url%C2%A0%22title%22">link</a></p>
7716````````````````````````````````
7717
7718
7719Nested balanced quotes are not allowed without escaping:
7720
7721```````````````````````````````` example
7722[link](/url "title "and" title")
7723.
7724<p>[link](/url &quot;title &quot;and&quot; title&quot;)</p>
7725````````````````````````````````
7726
7727
7728But it is easy to work around this by using a different quote type:
7729
7730```````````````````````````````` example
7731[link](/url 'title "and" title')
7732.
7733<p><a href="/url" title="title &quot;and&quot; title">link</a></p>
7734````````````````````````````````
7735
7736
7737(Note:  `Markdown.pl` did allow double quotes inside a double-quoted
7738title, and its test suite included a test demonstrating this.
7739But it is hard to see a good rationale for the extra complexity this
7740brings, since there are already many ways---backslash escaping,
7741entity and numeric character references, or using a different
7742quote type for the enclosing title---to write titles containing
7743double quotes.  `Markdown.pl`'s handling of titles has a number
7744of other strange features.  For example, it allows single-quoted
7745titles in inline links, but not reference links.  And, in
7746reference links but not inline links, it allows a title to begin
7747with `"` and end with `)`.  `Markdown.pl` 1.0.1 even allows
7748titles with no closing quotation mark, though 1.0.2b8 does not.
7749It seems preferable to adopt a simple, rational rule that works
7750the same way in inline links and link reference definitions.)
7751
7752[Whitespace] is allowed around the destination and title:
7753
7754```````````````````````````````` example
7755[link](   /uri
7756  "title"  )
7757.
7758<p><a href="/uri" title="title">link</a></p>
7759````````````````````````````````
7760
7761
7762But it is not allowed between the link text and the
7763following parenthesis:
7764
7765```````````````````````````````` example
7766[link] (/uri)
7767.
7768<p>[link] (/uri)</p>
7769````````````````````````````````
7770
7771
7772The link text may contain balanced brackets, but not unbalanced ones,
7773unless they are escaped:
7774
7775```````````````````````````````` example
7776[link [foo [bar]]](/uri)
7777.
7778<p><a href="/uri">link [foo [bar]]</a></p>
7779````````````````````````````````
7780
7781
7782```````````````````````````````` example
7783[link] bar](/uri)
7784.
7785<p>[link] bar](/uri)</p>
7786````````````````````````````````
7787
7788
7789```````````````````````````````` example
7790[link [bar](/uri)
7791.
7792<p>[link <a href="/uri">bar</a></p>
7793````````````````````````````````
7794
7795
7796```````````````````````````````` example
7797[link \[bar](/uri)
7798.
7799<p><a href="/uri">link [bar</a></p>
7800````````````````````````````````
7801
7802
7803The link text may contain inline content:
7804
7805```````````````````````````````` example
7806[link *foo **bar** `#`*](/uri)
7807.
7808<p><a href="/uri">link <em>foo <strong>bar</strong> <code>#</code></em></a></p>
7809````````````````````````````````
7810
7811
7812```````````````````````````````` example
7813[![moon](moon.jpg)](/uri)
7814.
7815<p><a href="/uri"><img src="moon.jpg" alt="moon" /></a></p>
7816````````````````````````````````
7817
7818
7819However, links may not contain other links, at any level of nesting.
7820
7821```````````````````````````````` example
7822[foo [bar](/uri)](/uri)
7823.
7824<p>[foo <a href="/uri">bar</a>](/uri)</p>
7825````````````````````````````````
7826
7827
7828```````````````````````````````` example
7829[foo *[bar [baz](/uri)](/uri)*](/uri)
7830.
7831<p>[foo <em>[bar <a href="/uri">baz</a>](/uri)</em>](/uri)</p>
7832````````````````````````````````
7833
7834
7835```````````````````````````````` example
7836![[[foo](uri1)](uri2)](uri3)
7837.
7838<p><img src="uri3" alt="[foo](uri2)" /></p>
7839````````````````````````````````
7840
7841
7842These cases illustrate the precedence of link text grouping over
7843emphasis grouping:
7844
7845```````````````````````````````` example
7846*[foo*](/uri)
7847.
7848<p>*<a href="/uri">foo*</a></p>
7849````````````````````````````````
7850
7851
7852```````````````````````````````` example
7853[foo *bar](baz*)
7854.
7855<p><a href="baz*">foo *bar</a></p>
7856````````````````````````````````
7857
7858
7859Note that brackets that *aren't* part of links do not take
7860precedence:
7861
7862```````````````````````````````` example
7863*foo [bar* baz]
7864.
7865<p><em>foo [bar</em> baz]</p>
7866````````````````````````````````
7867
7868
7869These cases illustrate the precedence of HTML tags, code spans,
7870and autolinks over link grouping:
7871
7872```````````````````````````````` example
7873[foo <bar attr="](baz)">
7874.
7875<p>[foo <bar attr="](baz)"></p>
7876````````````````````````````````
7877
7878
7879```````````````````````````````` example
7880[foo`](/uri)`
7881.
7882<p>[foo<code>](/uri)</code></p>
7883````````````````````````````````
7884
7885
7886```````````````````````````````` example
7887[foo<http://example.com/?search=](uri)>
7888.
7889<p>[foo<a href="http://example.com/?search=%5D(uri)">http://example.com/?search=](uri)</a></p>
7890````````````````````````````````
7891
7892
7893There are three kinds of [reference link](@)s:
7894[full](#full-reference-link), [collapsed](#collapsed-reference-link),
7895and [shortcut](#shortcut-reference-link).
7896
7897A [full reference link](@)
7898consists of a [link text] immediately followed by a [link label]
7899that [matches] a [link reference definition] elsewhere in the document.
7900
7901A [link label](@)  begins with a left bracket (`[`) and ends
7902with the first right bracket (`]`) that is not backslash-escaped.
7903Between these brackets there must be at least one [non-whitespace character].
7904Unescaped square bracket characters are not allowed inside the
7905opening and closing square brackets of [link labels].  A link
7906label can have at most 999 characters inside the square
7907brackets.
7908
7909One label [matches](@)
7910another just in case their normalized forms are equal.  To normalize a
7911label, strip off the opening and closing brackets,
7912perform the *Unicode case fold*, strip leading and trailing
7913[whitespace] and collapse consecutive internal
7914[whitespace] to a single space.  If there are multiple
7915matching reference link definitions, the one that comes first in the
7916document is used.  (It is desirable in such cases to emit a warning.)
7917
7918The contents of the first link label are parsed as inlines, which are
7919used as the link's text.  The link's URI and title are provided by the
7920matching [link reference definition].
7921
7922Here is a simple example:
7923
7924```````````````````````````````` example
7925[foo][bar]
7926
7927[bar]: /url "title"
7928.
7929<p><a href="/url" title="title">foo</a></p>
7930````````````````````````````````
7931
7932
7933The rules for the [link text] are the same as with
7934[inline links].  Thus:
7935
7936The link text may contain balanced brackets, but not unbalanced ones,
7937unless they are escaped:
7938
7939```````````````````````````````` example
7940[link [foo [bar]]][ref]
7941
7942[ref]: /uri
7943.
7944<p><a href="/uri">link [foo [bar]]</a></p>
7945````````````````````````````````
7946
7947
7948```````````````````````````````` example
7949[link \[bar][ref]
7950
7951[ref]: /uri
7952.
7953<p><a href="/uri">link [bar</a></p>
7954````````````````````````````````
7955
7956
7957The link text may contain inline content:
7958
7959```````````````````````````````` example
7960[link *foo **bar** `#`*][ref]
7961
7962[ref]: /uri
7963.
7964<p><a href="/uri">link <em>foo <strong>bar</strong> <code>#</code></em></a></p>
7965````````````````````````````````
7966
7967
7968```````````````````````````````` example
7969[![moon](moon.jpg)][ref]
7970
7971[ref]: /uri
7972.
7973<p><a href="/uri"><img src="moon.jpg" alt="moon" /></a></p>
7974````````````````````````````````
7975
7976
7977However, links may not contain other links, at any level of nesting.
7978
7979```````````````````````````````` example
7980[foo [bar](/uri)][ref]
7981
7982[ref]: /uri
7983.
7984<p>[foo <a href="/uri">bar</a>]<a href="/uri">ref</a></p>
7985````````````````````````````````
7986
7987
7988```````````````````````````````` example
7989[foo *bar [baz][ref]*][ref]
7990
7991[ref]: /uri
7992.
7993<p>[foo <em>bar <a href="/uri">baz</a></em>]<a href="/uri">ref</a></p>
7994````````````````````````````````
7995
7996
7997(In the examples above, we have two [shortcut reference links]
7998instead of one [full reference link].)
7999
8000The following cases illustrate the precedence of link text grouping over
8001emphasis grouping:
8002
8003```````````````````````````````` example
8004*[foo*][ref]
8005
8006[ref]: /uri
8007.
8008<p>*<a href="/uri">foo*</a></p>
8009````````````````````````````````
8010
8011
8012```````````````````````````````` example
8013[foo *bar][ref]
8014
8015[ref]: /uri
8016.
8017<p><a href="/uri">foo *bar</a></p>
8018````````````````````````````````
8019
8020
8021These cases illustrate the precedence of HTML tags, code spans,
8022and autolinks over link grouping:
8023
8024```````````````````````````````` example
8025[foo <bar attr="][ref]">
8026
8027[ref]: /uri
8028.
8029<p>[foo <bar attr="][ref]"></p>
8030````````````````````````````````
8031
8032
8033```````````````````````````````` example
8034[foo`][ref]`
8035
8036[ref]: /uri
8037.
8038<p>[foo<code>][ref]</code></p>
8039````````````````````````````````
8040
8041
8042```````````````````````````````` example
8043[foo<http://example.com/?search=][ref]>
8044
8045[ref]: /uri
8046.
8047<p>[foo<a href="http://example.com/?search=%5D%5Bref%5D">http://example.com/?search=][ref]</a></p>
8048````````````````````````````````
8049
8050
8051Matching is case-insensitive:
8052
8053```````````````````````````````` example
8054[foo][BaR]
8055
8056[bar]: /url "title"
8057.
8058<p><a href="/url" title="title">foo</a></p>
8059````````````````````````````````
8060
8061
8062Unicode case fold is used:
8063
8064```````````````````````````````` example
8065[Толпой][Толпой] is a Russian word.
8066
8067[ТОЛПОЙ]: /url
8068.
8069<p><a href="/url">Толпой</a> is a Russian word.</p>
8070````````````````````````````````
8071
8072
8073Consecutive internal [whitespace] is treated as one space for
8074purposes of determining matching:
8075
8076```````````````````````````````` example
8077[Foo
8078  bar]: /url
8079
8080[Baz][Foo bar]
8081.
8082<p><a href="/url">Baz</a></p>
8083````````````````````````````````
8084
8085
8086No [whitespace] is allowed between the [link text] and the
8087[link label]:
8088
8089```````````````````````````````` example
8090[foo] [bar]
8091
8092[bar]: /url "title"
8093.
8094<p>[foo] <a href="/url" title="title">bar</a></p>
8095````````````````````````````````
8096
8097
8098```````````````````````````````` example
8099[foo]
8100[bar]
8101
8102[bar]: /url "title"
8103.
8104<p>[foo]
8105<a href="/url" title="title">bar</a></p>
8106````````````````````````````````
8107
8108
8109This is a departure from John Gruber's original Markdown syntax
8110description, which explicitly allows whitespace between the link
8111text and the link label.  It brings reference links in line with
8112[inline links], which (according to both original Markdown and
8113this spec) cannot have whitespace after the link text.  More
8114importantly, it prevents inadvertent capture of consecutive
8115[shortcut reference links]. If whitespace is allowed between the
8116link text and the link label, then in the following we will have
8117a single reference link, not two shortcut reference links, as
8118intended:
8119
8120``` markdown
8121[foo]
8122[bar]
8123
8124[foo]: /url1
8125[bar]: /url2
8126```
8127
8128(Note that [shortcut reference links] were introduced by Gruber
8129himself in a beta version of `Markdown.pl`, but never included
8130in the official syntax description.  Without shortcut reference
8131links, it is harmless to allow space between the link text and
8132link label; but once shortcut references are introduced, it is
8133too dangerous to allow this, as it frequently leads to
8134unintended results.)
8135
8136When there are multiple matching [link reference definitions],
8137the first is used:
8138
8139```````````````````````````````` example
8140[foo]: /url1
8141
8142[foo]: /url2
8143
8144[bar][foo]
8145.
8146<p><a href="/url1">bar</a></p>
8147````````````````````````````````
8148
8149
8150Note that matching is performed on normalized strings, not parsed
8151inline content.  So the following does not match, even though the
8152labels define equivalent inline content:
8153
8154```````````````````````````````` example
8155[bar][foo\!]
8156
8157[foo!]: /url
8158.
8159<p>[bar][foo!]</p>
8160````````````````````````````````
8161
8162
8163[Link labels] cannot contain brackets, unless they are
8164backslash-escaped:
8165
8166```````````````````````````````` example
8167[foo][ref[]
8168
8169[ref[]: /uri
8170.
8171<p>[foo][ref[]</p>
8172<p>[ref[]: /uri</p>
8173````````````````````````````````
8174
8175
8176```````````````````````````````` example
8177[foo][ref[bar]]
8178
8179[ref[bar]]: /uri
8180.
8181<p>[foo][ref[bar]]</p>
8182<p>[ref[bar]]: /uri</p>
8183````````````````````````````````
8184
8185
8186```````````````````````````````` example
8187[[[foo]]]
8188
8189[[[foo]]]: /url
8190.
8191<p>[[[foo]]]</p>
8192<p>[[[foo]]]: /url</p>
8193````````````````````````````````
8194
8195
8196```````````````````````````````` example
8197[foo][ref\[]
8198
8199[ref\[]: /uri
8200.
8201<p><a href="/uri">foo</a></p>
8202````````````````````````````````
8203
8204
8205Note that in this example `]` is not backslash-escaped:
8206
8207```````````````````````````````` example
8208[bar\\]: /uri
8209
8210[bar\\]
8211.
8212<p><a href="/uri">bar\</a></p>
8213````````````````````````````````
8214
8215
8216A [link label] must contain at least one [non-whitespace character]:
8217
8218```````````````````````````````` example
8219[]
8220
8221[]: /uri
8222.
8223<p>[]</p>
8224<p>[]: /uri</p>
8225````````````````````````````````
8226
8227
8228```````````````````````````````` example
8229[
8230 ]
8231
8232[
8233 ]: /uri
8234.
8235<p>[
8236]</p>
8237<p>[
8238]: /uri</p>
8239````````````````````````````````
8240
8241
8242A [collapsed reference link](@)
8243consists of a [link label] that [matches] a
8244[link reference definition] elsewhere in the
8245document, followed by the string `[]`.
8246The contents of the first link label are parsed as inlines,
8247which are used as the link's text.  The link's URI and title are
8248provided by the matching reference link definition.  Thus,
8249`[foo][]` is equivalent to `[foo][foo]`.
8250
8251```````````````````````````````` example
8252[foo][]
8253
8254[foo]: /url "title"
8255.
8256<p><a href="/url" title="title">foo</a></p>
8257````````````````````````````````
8258
8259
8260```````````````````````````````` example
8261[*foo* bar][]
8262
8263[*foo* bar]: /url "title"
8264.
8265<p><a href="/url" title="title"><em>foo</em> bar</a></p>
8266````````````````````````````````
8267
8268
8269The link labels are case-insensitive:
8270
8271```````````````````````````````` example
8272[Foo][]
8273
8274[foo]: /url "title"
8275.
8276<p><a href="/url" title="title">Foo</a></p>
8277````````````````````````````````
8278
8279
8280
8281As with full reference links, [whitespace] is not
8282allowed between the two sets of brackets:
8283
8284```````````````````````````````` example
8285[foo]
8286[]
8287
8288[foo]: /url "title"
8289.
8290<p><a href="/url" title="title">foo</a>
8291[]</p>
8292````````````````````````````````
8293
8294
8295A [shortcut reference link](@)
8296consists of a [link label] that [matches] a
8297[link reference definition] elsewhere in the
8298document and is not followed by `[]` or a link label.
8299The contents of the first link label are parsed as inlines,
8300which are used as the link's text.  The link's URI and title
8301are provided by the matching link reference definition.
8302Thus, `[foo]` is equivalent to `[foo][]`.
8303
8304```````````````````````````````` example
8305[foo]
8306
8307[foo]: /url "title"
8308.
8309<p><a href="/url" title="title">foo</a></p>
8310````````````````````````````````
8311
8312
8313```````````````````````````````` example
8314[*foo* bar]
8315
8316[*foo* bar]: /url "title"
8317.
8318<p><a href="/url" title="title"><em>foo</em> bar</a></p>
8319````````````````````````````````
8320
8321
8322```````````````````````````````` example
8323[[*foo* bar]]
8324
8325[*foo* bar]: /url "title"
8326.
8327<p>[<a href="/url" title="title"><em>foo</em> bar</a>]</p>
8328````````````````````````````````
8329
8330
8331```````````````````````````````` example
8332[[bar [foo]
8333
8334[foo]: /url
8335.
8336<p>[[bar <a href="/url">foo</a></p>
8337````````````````````````````````
8338
8339
8340The link labels are case-insensitive:
8341
8342```````````````````````````````` example
8343[Foo]
8344
8345[foo]: /url "title"
8346.
8347<p><a href="/url" title="title">Foo</a></p>
8348````````````````````````````````
8349
8350
8351A space after the link text should be preserved:
8352
8353```````````````````````````````` example
8354[foo] bar
8355
8356[foo]: /url
8357.
8358<p><a href="/url">foo</a> bar</p>
8359````````````````````````````````
8360
8361
8362If you just want bracketed text, you can backslash-escape the
8363opening bracket to avoid links:
8364
8365```````````````````````````````` example
8366\[foo]
8367
8368[foo]: /url "title"
8369.
8370<p>[foo]</p>
8371````````````````````````````````
8372
8373
8374Note that this is a link, because a link label ends with the first
8375following closing bracket:
8376
8377```````````````````````````````` example
8378[foo*]: /url
8379
8380*[foo*]
8381.
8382<p>*<a href="/url">foo*</a></p>
8383````````````````````````````````
8384
8385
8386Full and compact references take precedence over shortcut
8387references:
8388
8389```````````````````````````````` example
8390[foo][bar]
8391
8392[foo]: /url1
8393[bar]: /url2
8394.
8395<p><a href="/url2">foo</a></p>
8396````````````````````````````````
8397
8398```````````````````````````````` example
8399[foo][]
8400
8401[foo]: /url1
8402.
8403<p><a href="/url1">foo</a></p>
8404````````````````````````````````
8405
8406Inline links also take precedence:
8407
8408```````````````````````````````` example
8409[foo]()
8410
8411[foo]: /url1
8412.
8413<p><a href="">foo</a></p>
8414````````````````````````````````
8415
8416```````````````````````````````` example
8417[foo](not a link)
8418
8419[foo]: /url1
8420.
8421<p><a href="/url1">foo</a>(not a link)</p>
8422````````````````````````````````
8423
8424In the following case `[bar][baz]` is parsed as a reference,
8425`[foo]` as normal text:
8426
8427```````````````````````````````` example
8428[foo][bar][baz]
8429
8430[baz]: /url
8431.
8432<p>[foo]<a href="/url">bar</a></p>
8433````````````````````````````````
8434
8435
8436Here, though, `[foo][bar]` is parsed as a reference, since
8437`[bar]` is defined:
8438
8439```````````````````````````````` example
8440[foo][bar][baz]
8441
8442[baz]: /url1
8443[bar]: /url2
8444.
8445<p><a href="/url2">foo</a><a href="/url1">baz</a></p>
8446````````````````````````````````
8447
8448
8449Here `[foo]` is not parsed as a shortcut reference, because it
8450is followed by a link label (even though `[bar]` is not defined):
8451
8452```````````````````````````````` example
8453[foo][bar][baz]
8454
8455[baz]: /url1
8456[foo]: /url2
8457.
8458<p>[foo]<a href="/url1">bar</a></p>
8459````````````````````````````````
8460
8461
8462
8463## Images
8464
8465Syntax for images is like the syntax for links, with one
8466difference. Instead of [link text], we have an
8467[image description](@).  The rules for this are the
8468same as for [link text], except that (a) an
8469image description starts with `![` rather than `[`, and
8470(b) an image description may contain links.
8471An image description has inline elements
8472as its contents.  When an image is rendered to HTML,
8473this is standardly used as the image's `alt` attribute.
8474
8475```````````````````````````````` example
8476![foo](/url "title")
8477.
8478<p><img src="/url" alt="foo" title="title" /></p>
8479````````````````````````````````
8480
8481
8482```````````````````````````````` example
8483![foo *bar*]
8484
8485[foo *bar*]: train.jpg "train & tracks"
8486.
8487<p><img src="train.jpg" alt="foo bar" title="train &amp; tracks" /></p>
8488````````````````````````````````
8489
8490
8491```````````````````````````````` example
8492![foo ![bar](/url)](/url2)
8493.
8494<p><img src="/url2" alt="foo bar" /></p>
8495````````````````````````````````
8496
8497
8498```````````````````````````````` example
8499![foo [bar](/url)](/url2)
8500.
8501<p><img src="/url2" alt="foo bar" /></p>
8502````````````````````````````````
8503
8504
8505Though this spec is concerned with parsing, not rendering, it is
8506recommended that in rendering to HTML, only the plain string content
8507of the [image description] be used.  Note that in
8508the above example, the alt attribute's value is `foo bar`, not `foo
8509[bar](/url)` or `foo <a href="/url">bar</a>`.  Only the plain string
8510content is rendered, without formatting.
8511
8512```````````````````````````````` example
8513![foo *bar*][]
8514
8515[foo *bar*]: train.jpg "train & tracks"
8516.
8517<p><img src="train.jpg" alt="foo bar" title="train &amp; tracks" /></p>
8518````````````````````````````````
8519
8520
8521```````````````````````````````` example
8522![foo *bar*][foobar]
8523
8524[FOOBAR]: train.jpg "train & tracks"
8525.
8526<p><img src="train.jpg" alt="foo bar" title="train &amp; tracks" /></p>
8527````````````````````````````````
8528
8529
8530```````````````````````````````` example
8531![foo](train.jpg)
8532.
8533<p><img src="train.jpg" alt="foo" /></p>
8534````````````````````````````````
8535
8536
8537```````````````````````````````` example
8538My ![foo bar](/path/to/train.jpg  "title"   )
8539.
8540<p>My <img src="/path/to/train.jpg" alt="foo bar" title="title" /></p>
8541````````````````````````````````
8542
8543
8544```````````````````````````````` example
8545![foo](<url>)
8546.
8547<p><img src="url" alt="foo" /></p>
8548````````````````````````````````
8549
8550
8551```````````````````````````````` example
8552![](/url)
8553.
8554<p><img src="/url" alt="" /></p>
8555````````````````````````````````
8556
8557
8558Reference-style:
8559
8560```````````````````````````````` example
8561![foo][bar]
8562
8563[bar]: /url
8564.
8565<p><img src="/url" alt="foo" /></p>
8566````````````````````````````````
8567
8568
8569```````````````````````````````` example
8570![foo][bar]
8571
8572[BAR]: /url
8573.
8574<p><img src="/url" alt="foo" /></p>
8575````````````````````````````````
8576
8577
8578Collapsed:
8579
8580```````````````````````````````` example
8581![foo][]
8582
8583[foo]: /url "title"
8584.
8585<p><img src="/url" alt="foo" title="title" /></p>
8586````````````````````````````````
8587
8588
8589```````````````````````````````` example
8590![*foo* bar][]
8591
8592[*foo* bar]: /url "title"
8593.
8594<p><img src="/url" alt="foo bar" title="title" /></p>
8595````````````````````````````````
8596
8597
8598The labels are case-insensitive:
8599
8600```````````````````````````````` example
8601![Foo][]
8602
8603[foo]: /url "title"
8604.
8605<p><img src="/url" alt="Foo" title="title" /></p>
8606````````````````````````````````
8607
8608
8609As with reference links, [whitespace] is not allowed
8610between the two sets of brackets:
8611
8612```````````````````````````````` example
8613![foo]
8614[]
8615
8616[foo]: /url "title"
8617.
8618<p><img src="/url" alt="foo" title="title" />
8619[]</p>
8620````````````````````````````````
8621
8622
8623Shortcut:
8624
8625```````````````````````````````` example
8626![foo]
8627
8628[foo]: /url "title"
8629.
8630<p><img src="/url" alt="foo" title="title" /></p>
8631````````````````````````````````
8632
8633
8634```````````````````````````````` example
8635![*foo* bar]
8636
8637[*foo* bar]: /url "title"
8638.
8639<p><img src="/url" alt="foo bar" title="title" /></p>
8640````````````````````````````````
8641
8642
8643Note that link labels cannot contain unescaped brackets:
8644
8645```````````````````````````````` example
8646![[foo]]
8647
8648[[foo]]: /url "title"
8649.
8650<p>![[foo]]</p>
8651<p>[[foo]]: /url &quot;title&quot;</p>
8652````````````````````````````````
8653
8654
8655The link labels are case-insensitive:
8656
8657```````````````````````````````` example
8658![Foo]
8659
8660[foo]: /url "title"
8661.
8662<p><img src="/url" alt="Foo" title="title" /></p>
8663````````````````````````````````
8664
8665
8666If you just want a literal `!` followed by bracketed text, you can
8667backslash-escape the opening `[`:
8668
8669```````````````````````````````` example
8670!\[foo]
8671
8672[foo]: /url "title"
8673.
8674<p>![foo]</p>
8675````````````````````````````````
8676
8677
8678If you want a link after a literal `!`, backslash-escape the
8679`!`:
8680
8681```````````````````````````````` example
8682\![foo]
8683
8684[foo]: /url "title"
8685.
8686<p>!<a href="/url" title="title">foo</a></p>
8687````````````````````````````````
8688
8689
8690## Autolinks
8691
8692[Autolink](@)s are absolute URIs and email addresses inside
8693`<` and `>`. They are parsed as links, with the URL or email address
8694as the link label.
8695
8696A [URI autolink](@) consists of `<`, followed by an
8697[absolute URI] followed by `>`.  It is parsed as
8698a link to the URI, with the URI as the link's label.
8699
8700An [absolute URI](@),
8701for these purposes, consists of a [scheme] followed by a colon (`:`)
8702followed by zero or more characters other than ASCII
8703[whitespace] and control characters, `<`, and `>`.  If
8704the URI includes these characters, they must be percent-encoded
8705(e.g. `%20` for a space).
8706
8707For purposes of this spec, a [scheme](@) is any sequence
8708of 2--32 characters beginning with an ASCII letter and followed
8709by any combination of ASCII letters, digits, or the symbols plus
8710("+"), period ("."), or hyphen ("-").
8711
8712Here are some valid autolinks:
8713
8714```````````````````````````````` example
8715<http://foo.bar.baz>
8716.
8717<p><a href="http://foo.bar.baz">http://foo.bar.baz</a></p>
8718````````````````````````````````
8719
8720
8721```````````````````````````````` example
8722<http://foo.bar.baz/test?q=hello&id=22&boolean>
8723.
8724<p><a href="http://foo.bar.baz/test?q=hello&amp;id=22&amp;boolean">http://foo.bar.baz/test?q=hello&amp;id=22&amp;boolean</a></p>
8725````````````````````````````````
8726
8727
8728```````````````````````````````` example
8729<irc://foo.bar:2233/baz>
8730.
8731<p><a href="irc://foo.bar:2233/baz">irc://foo.bar:2233/baz</a></p>
8732````````````````````````````````
8733
8734
8735Uppercase is also fine:
8736
8737```````````````````````````````` example
8738<MAILTO:FOO@BAR.BAZ>
8739.
8740<p><a href="MAILTO:FOO@BAR.BAZ">MAILTO:FOO@BAR.BAZ</a></p>
8741````````````````````````````````
8742
8743
8744Note that many strings that count as [absolute URIs] for
8745purposes of this spec are not valid URIs, because their
8746schemes are not registered or because of other problems
8747with their syntax:
8748
8749```````````````````````````````` example
8750<a+b+c:d>
8751.
8752<p><a href="a+b+c:d">a+b+c:d</a></p>
8753````````````````````````````````
8754
8755
8756```````````````````````````````` example
8757<made-up-scheme://foo,bar>
8758.
8759<p><a href="made-up-scheme://foo,bar">made-up-scheme://foo,bar</a></p>
8760````````````````````````````````
8761
8762
8763```````````````````````````````` example
8764<http://../>
8765.
8766<p><a href="http://../">http://../</a></p>
8767````````````````````````````````
8768
8769
8770```````````````````````````````` example
8771<localhost:5001/foo>
8772.
8773<p><a href="localhost:5001/foo">localhost:5001/foo</a></p>
8774````````````````````````````````
8775
8776
8777Spaces are not allowed in autolinks:
8778
8779```````````````````````````````` example
8780<http://foo.bar/baz bim>
8781.
8782<p>&lt;http://foo.bar/baz bim&gt;</p>
8783````````````````````````````````
8784
8785
8786Backslash-escapes do not work inside autolinks:
8787
8788```````````````````````````````` example
8789<http://example.com/\[\>
8790.
8791<p><a href="http://example.com/%5C%5B%5C">http://example.com/\[\</a></p>
8792````````````````````````````````
8793
8794
8795An [email autolink](@)
8796consists of `<`, followed by an [email address],
8797followed by `>`.  The link's label is the email address,
8798and the URL is `mailto:` followed by the email address.
8799
8800An [email address](@),
8801for these purposes, is anything that matches
8802the [non-normative regex from the HTML5
8803spec](https://html.spec.whatwg.org/multipage/forms.html#e-mail-state-(type=email)):
8804
8805    /^[a-zA-Z0-9.!#$%&'*+/=?^_`{|}~-]+@[a-zA-Z0-9](?:[a-zA-Z0-9-]{0,61}[a-zA-Z0-9])?
8806    (?:\.[a-zA-Z0-9](?:[a-zA-Z0-9-]{0,61}[a-zA-Z0-9])?)*$/
8807
8808Examples of email autolinks:
8809
8810```````````````````````````````` example
8811<foo@bar.example.com>
8812.
8813<p><a href="mailto:foo@bar.example.com">foo@bar.example.com</a></p>
8814````````````````````````````````
8815
8816
8817```````````````````````````````` example
8818<foo+special@Bar.baz-bar0.com>
8819.
8820<p><a href="mailto:foo+special@Bar.baz-bar0.com">foo+special@Bar.baz-bar0.com</a></p>
8821````````````````````````````````
8822
8823
8824Backslash-escapes do not work inside email autolinks:
8825
8826```````````````````````````````` example
8827<foo\+@bar.example.com>
8828.
8829<p>&lt;foo+@bar.example.com&gt;</p>
8830````````````````````````````````
8831
8832
8833These are not autolinks:
8834
8835```````````````````````````````` example
8836<>
8837.
8838<p>&lt;&gt;</p>
8839````````````````````````````````
8840
8841
8842```````````````````````````````` example
8843< http://foo.bar >
8844.
8845<p>&lt; http://foo.bar &gt;</p>
8846````````````````````````````````
8847
8848
8849```````````````````````````````` example
8850<m:abc>
8851.
8852<p>&lt;m:abc&gt;</p>
8853````````````````````````````````
8854
8855
8856```````````````````````````````` example
8857<foo.bar.baz>
8858.
8859<p>&lt;foo.bar.baz&gt;</p>
8860````````````````````````````````
8861
8862
8863```````````````````````````````` example
8864http://example.com
8865.
8866<p>http://example.com</p>
8867````````````````````````````````
8868
8869
8870```````````````````````````````` example
8871foo@bar.example.com
8872.
8873<p>foo@bar.example.com</p>
8874````````````````````````````````
8875
8876
8877## Raw HTML
8878
8879Text between `<` and `>` that looks like an HTML tag is parsed as a
8880raw HTML tag and will be rendered in HTML without escaping.
8881Tag and attribute names are not limited to current HTML tags,
8882so custom tags (and even, say, DocBook tags) may be used.
8883
8884Here is the grammar for tags:
8885
8886A [tag name](@) consists of an ASCII letter
8887followed by zero or more ASCII letters, digits, or
8888hyphens (`-`).
8889
8890An [attribute](@) consists of [whitespace],
8891an [attribute name], and an optional
8892[attribute value specification].
8893
8894An [attribute name](@)
8895consists of an ASCII letter, `_`, or `:`, followed by zero or more ASCII
8896letters, digits, `_`, `.`, `:`, or `-`.  (Note:  This is the XML
8897specification restricted to ASCII.  HTML5 is laxer.)
8898
8899An [attribute value specification](@)
8900consists of optional [whitespace],
8901a `=` character, optional [whitespace], and an [attribute
8902value].
8903
8904An [attribute value](@)
8905consists of an [unquoted attribute value],
8906a [single-quoted attribute value], or a [double-quoted attribute value].
8907
8908An [unquoted attribute value](@)
8909is a nonempty string of characters not
8910including [whitespace], `"`, `'`, `=`, `<`, `>`, or `` ` ``.
8911
8912A [single-quoted attribute value](@)
8913consists of `'`, zero or more
8914characters not including `'`, and a final `'`.
8915
8916A [double-quoted attribute value](@)
8917consists of `"`, zero or more
8918characters not including `"`, and a final `"`.
8919
8920An [open tag](@) consists of a `<` character, a [tag name],
8921zero or more [attributes], optional [whitespace], an optional `/`
8922character, and a `>` character.
8923
8924A [closing tag](@) consists of the string `</`, a
8925[tag name], optional [whitespace], and the character `>`.
8926
8927An [HTML comment](@) consists of `<!--` + *text* + `-->`,
8928where *text* does not start with `>` or `->`, does not end with `-`,
8929and does not contain `--`.  (See the
8930[HTML5 spec](http://www.w3.org/TR/html5/syntax.html#comments).)
8931
8932A [processing instruction](@)
8933consists of the string `<?`, a string
8934of characters not including the string `?>`, and the string
8935`?>`.
8936
8937A [declaration](@) consists of the
8938string `<!`, a name consisting of one or more uppercase ASCII letters,
8939[whitespace], a string of characters not including the
8940character `>`, and the character `>`.
8941
8942A [CDATA section](@) consists of
8943the string `<![CDATA[`, a string of characters not including the string
8944`]]>`, and the string `]]>`.
8945
8946An [HTML tag](@) consists of an [open tag], a [closing tag],
8947an [HTML comment], a [processing instruction], a [declaration],
8948or a [CDATA section].
8949
8950Here are some simple open tags:
8951
8952```````````````````````````````` example
8953<a><bab><c2c>
8954.
8955<p><a><bab><c2c></p>
8956````````````````````````````````
8957
8958
8959Empty elements:
8960
8961```````````````````````````````` example
8962<a/><b2/>
8963.
8964<p><a/><b2/></p>
8965````````````````````````````````
8966
8967
8968[Whitespace] is allowed:
8969
8970```````````````````````````````` example
8971<a  /><b2
8972data="foo" >
8973.
8974<p><a  /><b2
8975data="foo" ></p>
8976````````````````````````````````
8977
8978
8979With attributes:
8980
8981```````````````````````````````` example
8982<a foo="bar" bam = 'baz <em>"</em>'
8983_boolean zoop:33=zoop:33 />
8984.
8985<p><a foo="bar" bam = 'baz <em>"</em>'
8986_boolean zoop:33=zoop:33 /></p>
8987````````````````````````````````
8988
8989
8990Custom tag names can be used:
8991
8992```````````````````````````````` example
8993Foo <responsive-image src="foo.jpg" />
8994.
8995<p>Foo <responsive-image src="foo.jpg" /></p>
8996````````````````````````````````
8997
8998
8999Illegal tag names, not parsed as HTML:
9000
9001```````````````````````````````` example
9002<33> <__>
9003.
9004<p>&lt;33&gt; &lt;__&gt;</p>
9005````````````````````````````````
9006
9007
9008Illegal attribute names:
9009
9010```````````````````````````````` example
9011<a h*#ref="hi">
9012.
9013<p>&lt;a h*#ref=&quot;hi&quot;&gt;</p>
9014````````````````````````````````
9015
9016
9017Illegal attribute values:
9018
9019```````````````````````````````` example
9020<a href="hi'> <a href=hi'>
9021.
9022<p>&lt;a href=&quot;hi'&gt; &lt;a href=hi'&gt;</p>
9023````````````````````````````````
9024
9025
9026Illegal [whitespace]:
9027
9028```````````````````````````````` example
9029< a><
9030foo><bar/ >
9031<foo bar=baz
9032bim!bop />
9033.
9034<p>&lt; a&gt;&lt;
9035foo&gt;&lt;bar/ &gt;
9036&lt;foo bar=baz
9037bim!bop /&gt;</p>
9038````````````````````````````````
9039
9040
9041Missing [whitespace]:
9042
9043```````````````````````````````` example
9044<a href='bar'title=title>
9045.
9046<p>&lt;a href='bar'title=title&gt;</p>
9047````````````````````````````````
9048
9049
9050Closing tags:
9051
9052```````````````````````````````` example
9053</a></foo >
9054.
9055<p></a></foo ></p>
9056````````````````````````````````
9057
9058
9059Illegal attributes in closing tag:
9060
9061```````````````````````````````` example
9062</a href="foo">
9063.
9064<p>&lt;/a href=&quot;foo&quot;&gt;</p>
9065````````````````````````````````
9066
9067
9068Comments:
9069
9070```````````````````````````````` example
9071foo <!-- this is a
9072comment - with hyphen -->
9073.
9074<p>foo <!-- this is a
9075comment - with hyphen --></p>
9076````````````````````````````````
9077
9078
9079```````````````````````````````` example
9080foo <!-- not a comment -- two hyphens -->
9081.
9082<p>foo &lt;!-- not a comment -- two hyphens --&gt;</p>
9083````````````````````````````````
9084
9085
9086Not comments:
9087
9088```````````````````````````````` example
9089foo <!--> foo -->
9090
9091foo <!-- foo--->
9092.
9093<p>foo &lt;!--&gt; foo --&gt;</p>
9094<p>foo &lt;!-- foo---&gt;</p>
9095````````````````````````````````
9096
9097
9098Processing instructions:
9099
9100```````````````````````````````` example
9101foo <?php echo $a; ?>
9102.
9103<p>foo <?php echo $a; ?></p>
9104````````````````````````````````
9105
9106
9107Declarations:
9108
9109```````````````````````````````` example
9110foo <!ELEMENT br EMPTY>
9111.
9112<p>foo <!ELEMENT br EMPTY></p>
9113````````````````````````````````
9114
9115
9116CDATA sections:
9117
9118```````````````````````````````` example
9119foo <![CDATA[>&<]]>
9120.
9121<p>foo <![CDATA[>&<]]></p>
9122````````````````````````````````
9123
9124
9125Entity and numeric character references are preserved in HTML
9126attributes:
9127
9128```````````````````````````````` example
9129foo <a href="&ouml;">
9130.
9131<p>foo <a href="&ouml;"></p>
9132````````````````````````````````
9133
9134
9135Backslash escapes do not work in HTML attributes:
9136
9137```````````````````````````````` example
9138foo <a href="\*">
9139.
9140<p>foo <a href="\*"></p>
9141````````````````````````````````
9142
9143
9144```````````````````````````````` example
9145<a href="\"">
9146.
9147<p>&lt;a href=&quot;&quot;&quot;&gt;</p>
9148````````````````````````````````
9149
9150
9151## Hard line breaks
9152
9153A line break (not in a code span or HTML tag) that is preceded
9154by two or more spaces and does not occur at the end of a block
9155is parsed as a [hard line break](@) (rendered
9156in HTML as a `<br />` tag):
9157
9158```````````````````````````````` example
9159foo
9160baz
9161.
9162<p>foo<br />
9163baz</p>
9164````````````````````````````````
9165
9166
9167For a more visible alternative, a backslash before the
9168[line ending] may be used instead of two spaces:
9169
9170```````````````````````````````` example
9171foo\
9172baz
9173.
9174<p>foo<br />
9175baz</p>
9176````````````````````````````````
9177
9178
9179More than two spaces can be used:
9180
9181```````````````````````````````` example
9182foo
9183baz
9184.
9185<p>foo<br />
9186baz</p>
9187````````````````````````````````
9188
9189
9190Leading spaces at the beginning of the next line are ignored:
9191
9192```````````````````````````````` example
9193foo
9194     bar
9195.
9196<p>foo<br />
9197bar</p>
9198````````````````````````````````
9199
9200
9201```````````````````````````````` example
9202foo\
9203     bar
9204.
9205<p>foo<br />
9206bar</p>
9207````````````````````````````````
9208
9209
9210Line breaks can occur inside emphasis, links, and other constructs
9211that allow inline content:
9212
9213```````````````````````````````` example
9214*foo
9215bar*
9216.
9217<p><em>foo<br />
9218bar</em></p>
9219````````````````````````````````
9220
9221
9222```````````````````````````````` example
9223*foo\
9224bar*
9225.
9226<p><em>foo<br />
9227bar</em></p>
9228````````````````````````````````
9229
9230
9231Line breaks do not occur inside code spans
9232
9233```````````````````````````````` example
9234`code
9235span`
9236.
9237<p><code>code  span</code></p>
9238````````````````````````````````
9239
9240
9241```````````````````````````````` example
9242`code\
9243span`
9244.
9245<p><code>code\ span</code></p>
9246````````````````````````````````
9247
9248
9249or HTML tags:
9250
9251```````````````````````````````` example
9252<a href="foo
9253bar">
9254.
9255<p><a href="foo
9256bar"></p>
9257````````````````````````````````
9258
9259
9260```````````````````````````````` example
9261<a href="foo\
9262bar">
9263.
9264<p><a href="foo\
9265bar"></p>
9266````````````````````````````````
9267
9268
9269Hard line breaks are for separating inline content within a block.
9270Neither syntax for hard line breaks works at the end of a paragraph or
9271other block element:
9272
9273```````````````````````````````` example
9274foo\
9275.
9276<p>foo\</p>
9277````````````````````````````````
9278
9279
9280```````````````````````````````` example
9281foo
9282.
9283<p>foo</p>
9284````````````````````````````````
9285
9286
9287```````````````````````````````` example
9288### foo\
9289.
9290<h3>foo\</h3>
9291````````````````````````````````
9292
9293
9294```````````````````````````````` example
9295### foo
9296.
9297<h3>foo</h3>
9298````````````````````````````````
9299
9300
9301## Soft line breaks
9302
9303A regular line break (not in a code span or HTML tag) that is not
9304preceded by two or more spaces or a backslash is parsed as a
9305[softbreak](@).  (A softbreak may be rendered in HTML either as a
9306[line ending] or as a space. The result will be the same in
9307browsers. In the examples here, a [line ending] will be used.)
9308
9309```````````````````````````````` example
9310foo
9311baz
9312.
9313<p>foo
9314baz</p>
9315````````````````````````````````
9316
9317
9318Spaces at the end of the line and beginning of the next line are
9319removed:
9320
9321```````````````````````````````` example
9322foo
9323 baz
9324.
9325<p>foo
9326baz</p>
9327````````````````````````````````
9328
9329
9330A conforming parser may render a soft line break in HTML either as a
9331line break or as a space.
9332
9333A renderer may also provide an option to render soft line breaks
9334as hard line breaks.
9335
9336## Textual content
9337
9338Any characters not given an interpretation by the above rules will
9339be parsed as plain textual content.
9340
9341```````````````````````````````` example
9342hello $.;'there
9343.
9344<p>hello $.;'there</p>
9345````````````````````````````````
9346
9347
9348```````````````````````````````` example
9349Foo χρῆν
9350.
9351<p>Foo χρῆν</p>
9352````````````````````````````````
9353
9354
9355Internal spaces are preserved verbatim:
9356
9357```````````````````````````````` example
9358Multiple     spaces
9359.
9360<p>Multiple     spaces</p>
9361````````````````````````````````
9362
9363
9364<!-- END TESTS -->
9365
9366# Appendix: A parsing strategy
9367
9368In this appendix we describe some features of the parsing strategy
9369used in the CommonMark reference implementations.
9370
9371## Overview
9372
9373Parsing has two phases:
9374
93751. In the first phase, lines of input are consumed and the block
9376structure of the document---its division into paragraphs, block quotes,
9377list items, and so on---is constructed.  Text is assigned to these
9378blocks but not parsed. Link reference definitions are parsed and a
9379map of links is constructed.
9380
93812. In the second phase, the raw text contents of paragraphs and headings
9382are parsed into sequences of Markdown inline elements (strings,
9383code spans, links, emphasis, and so on), using the map of link
9384references constructed in phase 1.
9385
9386At each point in processing, the document is represented as a tree of
9387**blocks**.  The root of the tree is a `document` block.  The `document`
9388may have any number of other blocks as **children**.  These children
9389may, in turn, have other blocks as children.  The last child of a block
9390is normally considered **open**, meaning that subsequent lines of input
9391can alter its contents.  (Blocks that are not open are **closed**.)
9392Here, for example, is a possible document tree, with the open blocks
9393marked by arrows:
9394
9395``` tree
9396-> document
9397  -> block_quote
9398       paragraph
9399         "Lorem ipsum dolor\nsit amet."
9400    -> list (type=bullet tight=true bullet_char=-)
9401         list_item
9402           paragraph
9403             "Qui *quodsi iracundia*"
9404      -> list_item
9405        -> paragraph
9406             "aliquando id"
9407```
9408
9409## Phase 1: block structure
9410
9411Each line that is processed has an effect on this tree.  The line is
9412analyzed and, depending on its contents, the document may be altered
9413in one or more of the following ways:
9414
94151. One or more open blocks may be closed.
94162. One or more new blocks may be created as children of the
9417   last open block.
94183. Text may be added to the last (deepest) open block remaining
9419   on the tree.
9420
9421Once a line has been incorporated into the tree in this way,
9422it can be discarded, so input can be read in a stream.
9423
9424For each line, we follow this procedure:
9425
94261. First we iterate through the open blocks, starting with the
9427root document, and descending through last children down to the last
9428open block.  Each block imposes a condition that the line must satisfy
9429if the block is to remain open.  For example, a block quote requires a
9430`>` character.  A paragraph requires a non-blank line.
9431In this phase we may match all or just some of the open
9432blocks.  But we cannot close unmatched blocks yet, because we may have a
9433[lazy continuation line].
9434
94352.  Next, after consuming the continuation markers for existing
9436blocks, we look for new block starts (e.g. `>` for a block quote).
9437If we encounter a new block start, we close any blocks unmatched
9438in step 1 before creating the new block as a child of the last
9439matched block.
9440
94413.  Finally, we look at the remainder of the line (after block
9442markers like `>`, list markers, and indentation have been consumed).
9443This is text that can be incorporated into the last open
9444block (a paragraph, code block, heading, or raw HTML).
9445
9446Setext headings are formed when we see a line of a paragraph
9447that is a [setext heading underline].
9448
9449Reference link definitions are detected when a paragraph is closed;
9450the accumulated text lines are parsed to see if they begin with
9451one or more reference link definitions.  Any remainder becomes a
9452normal paragraph.
9453
9454We can see how this works by considering how the tree above is
9455generated by four lines of Markdown:
9456
9457``` markdown
9458> Lorem ipsum dolor
9459sit amet.
9460> - Qui *quodsi iracundia*
9461> - aliquando id
9462```
9463
9464At the outset, our document model is just
9465
9466``` tree
9467-> document
9468```
9469
9470The first line of our text,
9471
9472``` markdown
9473> Lorem ipsum dolor
9474```
9475
9476causes a `block_quote` block to be created as a child of our
9477open `document` block, and a `paragraph` block as a child of
9478the `block_quote`.  Then the text is added to the last open
9479block, the `paragraph`:
9480
9481``` tree
9482-> document
9483  -> block_quote
9484    -> paragraph
9485         "Lorem ipsum dolor"
9486```
9487
9488The next line,
9489
9490``` markdown
9491sit amet.
9492```
9493
9494is a "lazy continuation" of the open `paragraph`, so it gets added
9495to the paragraph's text:
9496
9497``` tree
9498-> document
9499  -> block_quote
9500    -> paragraph
9501         "Lorem ipsum dolor\nsit amet."
9502```
9503
9504The third line,
9505
9506``` markdown
9507> - Qui *quodsi iracundia*
9508```
9509
9510causes the `paragraph` block to be closed, and a new `list` block
9511opened as a child of the `block_quote`.  A `list_item` is also
9512added as a child of the `list`, and a `paragraph` as a child of
9513the `list_item`.  The text is then added to the new `paragraph`:
9514
9515``` tree
9516-> document
9517  -> block_quote
9518       paragraph
9519         "Lorem ipsum dolor\nsit amet."
9520    -> list (type=bullet tight=true bullet_char=-)
9521      -> list_item
9522        -> paragraph
9523             "Qui *quodsi iracundia*"
9524```
9525
9526The fourth line,
9527
9528``` markdown
9529> - aliquando id
9530```
9531
9532causes the `list_item` (and its child the `paragraph`) to be closed,
9533and a new `list_item` opened up as child of the `list`.  A `paragraph`
9534is added as a child of the new `list_item`, to contain the text.
9535We thus obtain the final tree:
9536
9537``` tree
9538-> document
9539  -> block_quote
9540       paragraph
9541         "Lorem ipsum dolor\nsit amet."
9542    -> list (type=bullet tight=true bullet_char=-)
9543         list_item
9544           paragraph
9545             "Qui *quodsi iracundia*"
9546      -> list_item
9547        -> paragraph
9548             "aliquando id"
9549```
9550
9551## Phase 2: inline structure
9552
9553Once all of the input has been parsed, all open blocks are closed.
9554
9555We then "walk the tree," visiting every node, and parse raw
9556string contents of paragraphs and headings as inlines.  At this
9557point we have seen all the link reference definitions, so we can
9558resolve reference links as we go.
9559
9560``` tree
9561document
9562  block_quote
9563    paragraph
9564      str "Lorem ipsum dolor"
9565      softbreak
9566      str "sit amet."
9567    list (type=bullet tight=true bullet_char=-)
9568      list_item
9569        paragraph
9570          str "Qui "
9571          emph
9572            str "quodsi iracundia"
9573      list_item
9574        paragraph
9575          str "aliquando id"
9576```
9577
9578Notice how the [line ending] in the first paragraph has
9579been parsed as a `softbreak`, and the asterisks in the first list item
9580have become an `emph`.
9581
9582### An algorithm for parsing nested emphasis and links
9583
9584By far the trickiest part of inline parsing is handling emphasis,
9585strong emphasis, links, and images.  This is done using the following
9586algorithm.
9587
9588When we're parsing inlines and we hit either
9589
9590- a run of `*` or `_` characters, or
9591- a `[` or `![`
9592
9593we insert a text node with these symbols as its literal content, and we
9594add a pointer to this text node to the [delimiter stack](@).
9595
9596The [delimiter stack] is a doubly linked list.  Each
9597element contains a pointer to a text node, plus information about
9598
9599- the type of delimiter (`[`, `![`, `*`, `_`)
9600- the number of delimiters,
9601- whether the delimiter is "active" (all are active to start), and
9602- whether the delimiter is a potential opener, a potential closer,
9603  or both (which depends on what sort of characters precede
9604  and follow the delimiters).
9605
9606When we hit a `]` character, we call the *look for link or image*
9607procedure (see below).
9608
9609When we hit the end of the input, we call the *process emphasis*
9610procedure (see below), with `stack_bottom` = NULL.
9611
9612#### *look for link or image*
9613
9614Starting at the top of the delimiter stack, we look backwards
9615through the stack for an opening `[` or `![` delimiter.
9616
9617- If we don't find one, we return a literal text node `]`.
9618
9619- If we do find one, but it's not *active*, we remove the inactive
9620  delimiter from the stack, and return a literal text node `]`.
9621
9622- If we find one and it's active, then we parse ahead to see if
9623  we have an inline link/image, reference link/image, compact reference
9624  link/image, or shortcut reference link/image.
9625
9626  + If we don't, then we remove the opening delimiter from the
9627    delimiter stack and return a literal text node `]`.
9628
9629  + If we do, then
9630
9631    * We return a link or image node whose children are the inlines
9632      after the text node pointed to by the opening delimiter.
9633
9634    * We run *process emphasis* on these inlines, with the `[` opener
9635      as `stack_bottom`.
9636
9637    * We remove the opening delimiter.
9638
9639    * If we have a link (and not an image), we also set all
9640      `[` delimiters before the opening delimiter to *inactive*.  (This
9641      will prevent us from getting links within links.)
9642
9643#### *process emphasis*
9644
9645Parameter `stack_bottom` sets a lower bound to how far we
9646descend in the [delimiter stack].  If it is NULL, we can
9647go all the way to the bottom.  Otherwise, we stop before
9648visiting `stack_bottom`.
9649
9650Let `current_position` point to the element on the [delimiter stack]
9651just above `stack_bottom` (or the first element if `stack_bottom`
9652is NULL).
9653
9654We keep track of the `openers_bottom` for each delimiter
9655type (`*`, `_`) and each length of the closing delimiter run
9656(modulo 3).  Initialize this to `stack_bottom`.
9657
9658Then we repeat the following until we run out of potential
9659closers:
9660
9661- Move `current_position` forward in the delimiter stack (if needed)
9662  until we find the first potential closer with delimiter `*` or `_`.
9663  (This will be the potential closer closest
9664  to the beginning of the input -- the first one in parse order.)
9665
9666- Now, look back in the stack (staying above `stack_bottom` and
9667  the `openers_bottom` for this delimiter type) for the
9668  first matching potential opener ("matching" means same delimiter).
9669
9670- If one is found:
9671
9672  + Figure out whether we have emphasis or strong emphasis:
9673    if both closer and opener spans have length >= 2, we have
9674    strong, otherwise regular.
9675
9676  + Insert an emph or strong emph node accordingly, after
9677    the text node corresponding to the opener.
9678
9679  + Remove any delimiters between the opener and closer from
9680    the delimiter stack.
9681
9682  + Remove 1 (for regular emph) or 2 (for strong emph) delimiters
9683    from the opening and closing text nodes.  If they become empty
9684    as a result, remove them and remove the corresponding element
9685    of the delimiter stack.  If the closing node is removed, reset
9686    `current_position` to the next element in the stack.
9687
9688- If none is found:
9689
9690  + Set `openers_bottom` to the element before `current_position`.
9691    (We know that there are no openers for this kind of closer up to and
9692    including this point, so this puts a lower bound on future searches.)
9693
9694  + If the closer at `current_position` is not a potential opener,
9695    remove it from the delimiter stack (since we know it can't
9696    be a closer either).
9697
9698  + Advance `current_position` to the next element in the stack.
9699
9700After we're done, we remove all delimiters above `stack_bottom` from the
9701delimiter stack.
9702
9703