# pulldown-cmark

[![Build Status](https://dev.azure.com/raphlinus/pulldown-cmark/_apis/build/status/pulldown-cmark-CI?branchName=master)](https://dev.azure.com/raphlinus/pulldown-cmark/_build/latest?definitionId=2&branchName=master)
[![Docs](https://docs.rs/pulldown-cmark/badge.svg)](https://docs.rs/pulldown-cmark)
[![Crates.io](https://img.shields.io/crates/v/pulldown-cmark.svg?maxAge=2592000)](https://crates.io/crates/pulldown-cmark)

[Documentation](https://docs.rs/pulldown-cmark/)

This library is a pull parser for [CommonMark](http://commonmark.org/), written
in [Rust](http://www.rust-lang.org/). It comes with a simple command-line tool,
useful for rendering to HTML, and is also designed to be easy to use as
a library.

It is designed to be:

* Fast; a bare minimum of allocation and copying
* Safe; written in pure Rust with no unsafe blocks
* Versatile; in particular source-maps are supported
* Correct; the goal is 100% compliance with the [CommonMark spec](http://spec.commonmark.org/)

Further, it optionally supports parsing footnotes,
[GitHub flavored tables](https://github.github.com/gfm/#tables-extension-),
[GitHub flavored task lists](https://github.github.com/gfm/#task-list-items-extension-) and
[strikethrough](https://github.github.com/gfm/#strikethrough-extension-).

Rustc 1.36 or newer is required to build the crate.

## Why a pull parser?

There are many parsers for Markdown and its variants, but to my knowledge none
use pull parsing. Pull parsing has become popular for XML, especially for
memory-conscious applications, because it uses dramatically less memory than
constructing a document tree, but is much easier to use than push parsers. Push
parsers are notoriously difficult to use, and also often error-prone because of
the need for the user to delicately juggle state in a series of callbacks.

In a clean design, the parsing and rendering stages are neatly separated, but
this is often sacrificed in the name of performance and expedience. Many Markdown
implementations mix parsing and rendering together, and even designs that try
to separate them (such as the popular [hoedown](https://github.com/hoedown/hoedown))
make the assumption that the rendering process can be fully represented as a
serialized string.

Pull parsing is in some sense the most versatile architecture. It's possible to
drive a push interface, also with minimal memory, and quite straightforward to
construct an AST. Another advantage is that source-map information (the mapping
between parsed blocks and offsets within the source text) is readily available;
you can call `into_offset_iter()` to create an iterator that yields `(Event, Range)`
pairs, where the second element is the event's corresponding range in the source
document.

While manipulating ASTs is the most flexible way to transform documents,
operating on iterators is surprisingly easy, and quite efficient. Here, for
example, is the code to transform soft line breaks into hard breaks:

```rust
let parser = parser.map(|event| match event {
    Event::SoftBreak => Event::HardBreak,
    _ => event
});
```

Or expanding an abbreviation in text:

```rust
let parser = parser.map(|event| match event {
    Event::Text(text) => Event::Text(text.replace("abbr", "abbreviation").into()),
    _ => event
});
```

Another simple example is code to determine the max nesting level:

```rust
let mut max_nesting = 0;
let mut level = 0;
for event in parser {
    match event {
        Event::Start(_) => {
            level += 1;
            max_nesting = std::cmp::max(max_nesting, level);
        }
        Event::End(_) => level -= 1,
        _ => ()
    }
}
```

There are some basic but fully functional examples of the usage of the crate in the
`examples` directory of this repository.
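
As a rough illustration of the claim that constructing an AST from a pull
parser is straightforward, here is a minimal sketch that folds a stream of
start/end/text events into a tree using an explicit stack. The `Event` and
`Node` types and the `build_tree` function are hypothetical stand-ins written
for this example so that it compiles without any dependencies; they are not
the crate's own types, and the crate's real `Event` enum has more variants.

```rust
// Toy stand-in for the crate's `Event` type (an assumption for illustration):
// tags are plain strings here.
#[derive(Debug, Clone, PartialEq)]
enum Event {
    Start(&'static str),
    End(&'static str),
    Text(&'static str),
}

// Hypothetical tree node type; not part of pulldown-cmark.
#[derive(Debug, PartialEq)]
enum Node {
    Element { tag: &'static str, children: Vec<Node> },
    Text(&'static str),
}

/// Builds a list of top-level nodes from an event stream.
/// Assumes the events are well nested; a stray `End` is ignored.
fn build_tree<I: Iterator<Item = Event>>(events: I) -> Vec<Node> {
    // Each stack entry holds an open tag and the children collected so far.
    // The bottom entry (with an empty tag) collects the top-level nodes.
    let mut stack: Vec<(&'static str, Vec<Node>)> = vec![("", Vec::new())];
    for event in events {
        match event {
            Event::Start(tag) => stack.push((tag, Vec::new())),
            Event::Text(text) => {
                stack.last_mut().unwrap().1.push(Node::Text(text));
            }
            Event::End(_) => {
                // Only close an element if one is actually open.
                if stack.len() > 1 {
                    let (tag, children) = stack.pop().unwrap();
                    stack
                        .last_mut()
                        .unwrap()
                        .1
                        .push(Node::Element { tag, children });
                }
            }
        }
    }
    // Elements still open at the end of input are dropped; only the nodes
    // attached to the root entry are returned.
    stack.into_iter().next().unwrap().1
}

fn main() {
    let events = vec![
        Event::Start("paragraph"),
        Event::Text("Hello, "),
        Event::Start("emphasis"),
        Event::Text("world"),
        Event::End("emphasis"),
        Event::End("paragraph"),
    ];
    let tree = build_tree(events.into_iter());
    println!("{:#?}", tree);
}
```

The same stack-based shape should carry over to the real parser, since it also
yields `Event::Start` and `Event::End` items from an iterator, though the node
type would then need to store the crate's tag values rather than strings.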

## Using Rust idiomatically

A lot of the internal scanning code is written at a pretty low level (it
pretty much scans byte patterns for the bits of syntax), but the external
interface is designed to be idiomatic Rust.

Pull parsers are at heart an iterator of events (start and end tags, text,
and other bits and pieces). The parser data structure implements the
Rust `Iterator` trait directly, and `Event` is an enum. Thus, you can use the
full power and expressivity of Rust's iterator infrastructure, including
for loops and `map` (as in the examples above), collecting the events into
a vector (for recording, playback, and manipulation), and more.

Further, the `Text` event (representing text) is a small copy-on-write string.
The vast majority of text fragments are just
slices of the source document. For these, copy-on-write gives a convenient
representation that requires no allocation or copying, but allocated
strings are available when they're needed. Thus, when rendering text to
HTML, most text is copied just once, from the source document to the
HTML buffer.

When using pulldown-cmark's own HTML renderer, make sure to write to a buffered
target like a `Vec<u8>` or `String`. Since it performs many (very) small writes, writing
directly to stdout, files, or sockets is detrimental to performance. Such writers can
be wrapped in a [`BufWriter`](https://doc.rust-lang.org/std/io/struct.BufWriter.html).

## Build options

By default, the binary is built as well. If you don't want or need it, then build like this:

```bash
> cargo build --no-default-features
```

Or put in your `Cargo.toml` file:

```toml
pulldown-cmark = { version = "0.8", default-features = false }
```

SIMD accelerated scanners are available for the x64 platform from version 0.5 onwards.
To enable them, build with the `simd` feature:

```bash
> cargo build --release --features simd
```

Or add the feature to your project's `Cargo.toml`:

```toml
pulldown-cmark = { version = "0.8", default-features = false, features = ["simd"] }
```

## Authors

The main author is Raph Levien. The implementation of the new design (v0.3+) was completed by Marcus Klaas de Vries.

## Contributions

We gladly accept contributions via GitHub pull requests. Please see
[CONTRIBUTING.md](CONTRIBUTING.md) for more details.