# commonmark

[![hackage release](https://img.shields.io/hackage/v/commonmark.svg?label=hackage)](http://hackage.haskell.org/package/commonmark)

This package provides the core parsing functionality
for commonmark, together with HTML renderers.

:construction: This library is still in an **experimental state**.
Comments on the API and implementation are very much welcome.
Further changes should be expected.

The library is **fully commonmark-compliant** and passes the
test suite for version 0.30 of the commonmark spec.
It is designed to be **customizable and easily
extensible.**  To customize the output, create an
AST, or support a new output format, one need only define some
new typeclass instances.  It is also easy to add new syntax
elements or modify existing ones.

**Accurate information about source positions** is available
for all block and inline elements.  Thus the library can be
used to create an accurate syntax highlighter or
an editor with synced live preview.

Finally, the library has been designed for **robust performance
even in pathological cases**. The parser behaves well on
inputs that tend to cause stack overflows or
exponential slowdowns in other parsers, with parsing time that
grows linearly with input length.

## Related libraries

- **[`commonmark-extensions`](https://github.com/jgm/commonmark-hs/tree/master/commonmark-extensions)**
  provides a set of useful extensions to core commonmark syntax,
  including all GitHub-flavored Markdown extensions and many
  pandoc extensions.  For convenience, the package of extensions
  defining GitHub-flavored Markdown is exported as `gfmExtensions`.

- **[`commonmark-pandoc`](https://github.com/jgm/commonmark-hs/tree/master/commonmark-pandoc)** defines
  type instances for parsing commonmark as a Pandoc AST.

- **[`commonmark-cli`](https://github.com/jgm/commonmark-hs/tree/master/commonmark-cli)** is a
  command-line program that uses this library to convert
  and syntax-highlight commonmark documents.

## Simple usage example

This program reads commonmark from stdin and renders HTML to stdout:

``` haskell
{-# LANGUAGE ScopedTypeVariables #-}
import Commonmark
import qualified Data.Text.IO as TIO
import qualified Data.Text.Lazy.IO as TLIO

main :: IO ()
main = do
  -- Parse stdin; "stdin" is the source name used in positions and errors
  res <- commonmark "stdin" <$> TIO.getContents
  case res of
    Left e                  -> error (show e)
    Right (html :: Html ()) -> TLIO.putStr $ renderHtml html
```

## Notes on the design

The input is a token stream (`[Tok]`), which can be
produced from a `Text` using `tokenize`.  The `Tok`
elements record source positions, making these easier
to track.

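As a rough illustration of the token stream, the sketch below tokenizes a
short string and prints each token.  It assumes that `Tok` and `tokenize`
are re-exported from `Commonmark` (as in the usage example above) and that
`Tok` has a `Show` instance; the source name `example.md` is made up for
the example.

```haskell
{-# LANGUAGE OverloadedStrings #-}
import Commonmark (Tok(..), tokenize)

-- Tokenize a short input.  The first argument is a source name
-- (hypothetical here) that is recorded in each token's position.
exampleTokens :: [Tok]
exampleTokens = tokenize "example.md" "Hi *there*"

main :: IO ()
main = mapM_ print exampleTokens
```

Each printed token should show its type, its source position, and the
text it covers.
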
Extensibility is emphasized throughout.  There are two ways in
which one might want to extend a commonmark converter.  First,
one might want to support an alternate output format, or to
change the output for a given format.  Second, one might want
to add new syntactic elements (e.g., definition lists).

To support both kinds of extension, we export the function

```haskell
parseCommonmarkWith :: (Monad m, IsBlock il bl, IsInline il)
                    => SyntaxSpec m il bl -- ^ Defines syntax
                    -> [Tok] -- ^ Tokenized commonmark input
                    -> m (Either ParseError bl)  -- ^ Result or error
```

The parser function takes two arguments:  a `SyntaxSpec` which
defines parsing for the various syntactic elements, and a list
of tokens.  Output is polymorphic:  you can
convert commonmark to any type that is an instance of the
`IsBlock` typeclass.  This gives tremendous flexibility.
Want to produce HTML? You can use the `Html ()` type defined
in `Commonmark.Html` for basic HTML, or `Html SourceRange`
for HTML with source range attributes on every element.

```haskell
GHCI> :set -XOverloadedStrings
GHCI> import Commonmark
GHCI> parseCommonmarkWith defaultSyntaxSpec (tokenize "source" "Hi there") :: IO (Either ParseError (Html ()))
Right <p>Hi there</p>
GHCI> parseCommonmarkWith defaultSyntaxSpec (tokenize "source" "Hi there") :: IO (Either ParseError (Html SourceRange))
Right <p data-sourcepos="source@1:1-1:9">Hi there</p>
```

Want to produce a Pandoc AST?  You can use the type
`Cm a Text.Pandoc.Builder.Blocks` defined in `commonmark-pandoc`.

```haskell
GHCI> import Commonmark.Pandoc
GHCI> import qualified Text.Pandoc.Builder as B
GHCI> parseCommonmarkWith defaultSyntaxSpec (tokenize "source" "Hi there") :: Maybe (Either ParseError (Cm () B.Blocks))
Just (Right (Cm {unCm = Many {unMany = fromList [Para [Str "Hi",Space,Str "there"]]}}))
GHCI> parseCommonmarkWith defaultSyntaxSpec (tokenize "source" "Hi there") :: Maybe (Either ParseError (Cm SourceRange B.Blocks))
Just (Right (Cm {unCm = Many {unMany = fromList [Div ("",[],[("data-pos","source@1:1-1:9")]) [Para [Span ("",[],[("data-pos","source@1:1-1:3")]) [Str "Hi"],Span ("",[],[("data-pos","source@1:3-1:4")]) [Space],Span ("",[],[("data-pos","source@1:4-1:9")]) [Str "there"]]]]}}))
```

If you want to support another format (for example, Haddock's `DocH`),
just define typeclass instances of `IsBlock` and `IsInline` for
your type.

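To make this concrete, here is a minimal sketch of a custom output type
that flattens a document to plain text.  The `PlainText` type and its
rendering choices are hypothetical, invented for this example.  The method
lists are written from memory of `Commonmark.Types`, including the
`Rangeable` and `HasAttributes` superclass instances, so check them against
the version you have installed.

```haskell
{-# LANGUAGE GeneralizedNewtypeDeriving #-}
{-# LANGUAGE MultiParamTypeClasses #-}
{-# LANGUAGE OverloadedStrings #-}
import Commonmark
import Data.Text (Text)
import qualified Data.Text as T
import qualified Data.Text.IO as TIO

-- Hypothetical output type: the document flattened to plain text.
newtype PlainText = PlainText { unPlainText :: Text }
  deriving (Show, Semigroup, Monoid)

-- Source ranges and attributes are simply ignored in this sketch.
instance Rangeable PlainText where
  ranged _ = id

instance HasAttributes PlainText where
  addAttributes _ = id

instance IsInline PlainText where
  lineBreak = PlainText "\n"
  softBreak = PlainText " "
  str = PlainText
  entity = PlainText                 -- kept as written, not decoded
  escapedChar = PlainText . T.singleton
  emph = id
  strong = id
  link _dest _title desc = desc      -- keep only the link text
  image _src _title desc = desc      -- keep only the description
  code = PlainText
  rawInline _ _ = mempty             -- drop raw HTML and the like

instance IsBlock PlainText PlainText where
  paragraph il = il <> PlainText "\n"
  plain il = il <> PlainText "\n"
  thematicBreak = PlainText "\n"
  blockQuote = id
  codeBlock _info t = PlainText t
  heading _level il = il <> PlainText "\n"
  rawBlock _ _ = mempty
  referenceLinkDefinition _ _ = mempty
  list _ _ items = mconcat items

main :: IO ()
main =
  case commonmark "example.md" "# Title\n\nSome *emphasized* text.\n" of
    Left err  -> error (show err)
    Right doc -> TIO.putStr (unPlainText doc)
```

Because the result type is fixed by `unPlainText`, the same `commonmark`
function used for HTML picks these instances and should print the heading
and paragraph text with the markup removed.
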
Supporting a new syntactic element generally requires (a) adding
a `SyntaxSpec` for it and (b) defining relevant typeclass
instances for the element.  See the examples in
`Commonmark.Extensions.*`.  Note that `SyntaxSpec` is a Monoid,
so you can specify `myNewSyntaxSpec <> defaultSyntaxSpec`.

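As a sketch of combining specs, the example below prepends one extension
to the default syntax.  It assumes the separate `commonmark-extensions`
package is installed and that it exports `smartPunctuationSpec`.  The
helper name `renderSmart` and the source name `example.md` are made up for
illustration.

```haskell
{-# LANGUAGE OverloadedStrings #-}
import Commonmark
import Commonmark.Extensions (smartPunctuationSpec)
import Data.Functor.Identity (runIdentity)
import Data.Text (Text)
import qualified Data.Text.Lazy as TL
import qualified Data.Text.Lazy.IO as TLIO

-- Hypothetical helper: parse with smart punctuation enabled and
-- render to HTML, running the parser in the Identity monad.
renderSmart :: Text -> Either String TL.Text
renderSmart input =
  case runIdentity
         (commonmarkWith (smartPunctuationSpec <> defaultSyntaxSpec)
                         "example.md" input) of
    Left err   -> Left (show err)
    Right html -> Right (renderHtml (html :: Html ()))

main :: IO ()
main = either error TLIO.putStr (renderSmart "\"Hello\" -- world")
```

With the extension in place, straight quotes in the input should come out
as curly quotes in the rendered HTML.
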
## Performance

Here are some benchmarks on real-world commonmark documents,
using `make benchmark`.  To get `benchmark.md`, we concatenated
a number of real-world commonmark documents.  The resulting file
was 355K.  The
[`bench`](http://hackage.haskell.org/package/bench) tool was
used to run the benchmarks.

| program                   | time (ms) |
| -------                   | ---------:|
| cmark                     |        12 |
| cheapskate                |       105 |
| commonmark.js             |       217 |
| **commonmark-hs**         |       229 |
| pandoc -f commonmark      |       948 |

It would be good to improve performance.  I'd welcome help
with this.