1# PEG Parser Combinators Implemented in Rust
2
3This article introduces [pom](https://github.com/J-F-Liu/pom), a PEG parser combinator library implemented in Rust, using operator overloading without macros.
4
5## Why Rust?
6
7![Rust](rust.png)
8
9After I've learned C/C++ and C#, I found that choosing a new programming language can greatly affect a programmer's productivity.
10On one hand I keep sorting out new languages, there are hundreds of them, I examine and choose what I like best, my favorites are C#, Ruby, TypeScript and Rust.
11On the other hand I try to design a new language and implement a compiler by myself.
12
13I like the syntax provided by C#, but hate the huge .NET runtime. Dependency on CLR makes distribution of an application written in C# very hard. Compiling to native code is always what I longed for a programming language.
14In year 2003 I thought a compiler can get rid of garbage collector by generating free memory instructions in appropriate locations in the target program.
15But I didn't go deep into the design of the details of this mechanism, I decided to firstly write a working compiler, then improve the design of the language and implementation of the compiler bit by bit.
16
17The first stage of compilation is parsing. I tried some parser generators, but not satisfied with the result.
18Then I dig into the parsing theory, followed several books, implemented DFA, NFA, NFA to DFA conversion, LL(1), LR, LALR algorithms,
19then wrote a parser to parse BNF, EBNF or TBNF grammar file, and generate parser code corresponding to the grammar.
20
21The syntax/semantics analysis and code generation parts of a compiler are more difficult.
22I even tried to define a intermediate assembly language, at that time I didn't know LLVM. My effort of writing a compiler ceased for years, then Rust was born.
23
24At first glance, the Rust's syntax is a bit strange, why use `fn` instaed of `def`, why use `let mut` instead of `var`, I was not attracted by it.
25After read a publication on O'Reilly [*Why Rust?*](http://www.oreilly.com/programming/free/files/why-rust.pdf) I suddenly realized that this is language I'm trying to build,
26when you actually start using Rust you'll find that `fn` and `let mut` fits Rust's logic well. For me, **Rust is once a dream now a reality.**
27
28Rust has a steep learning curve, more challenging than any of the previous programming languages I learned. All this learning is worthwhile when you finally get your program working and polished.
29Object oriented class hierarchy is not good enough for code reuse, Rust's enum, tuple, struct and trait type system is a better solution.
30I still wondering whether the Rust compiler can be smart enough to elide all the lifetime parameters, they are mostly noise and obstacle when reading and writing programs.
31
32## What is PEG?
33
34When I discovered [PEG](http://bford.info/packrat/), I knew that all my previous work on LALR can be thrown away.
35I rewrote my parser generator using and working with PEG. Using this parser generator I created a [YAML parser](https://www.codeproject.com/Articles/28720/YAML-Parser-in-C) and a [Lua Interpreter](https://www.codeproject.com/Articles/228212/Lua-Interpreter).
36
37[Parsing Expression Grammars](http://en.wikipedia.org/wiki/Parsing_expression_grammar) (PEGs) are an alternative to [Context-Free Grammars](http://en.wikipedia.org/wiki/Context-free_grammar) (CFGs) for formally specifying syntax.
38CFG describe a rule system to generate language strings while PEG describe a rule system to recognize language strings.
39
40![CFG](cfg.png)
41
42![PEG](peg.png)
43
44Unlike CFGs, PEGs cannot be ambiguous; if a string parses, it has exactly one valid parse tree.
45We normally specify our languages directly by how to recognize it, so PEG is both closer match to syntax practices and more powerful than nondeterministic CFG.
46
47### Parsing expressions
48
49| Expr       | Description                         |
50| ---------- | ----------------------------------- |
51| ε          | the empty string                    |
52| a          | terminal (a ∈ Σ)                    |
53| A          | non-terminal (A ∈ N)                |
54| e1 e2      | sequence                            |
55| e1 / e2    | prioritized choice                  |
56| e?, e*, e+ | optional, zero-or-more, one-or-more |
57| &e, !e     | syntactic predicates                |
58
59## What is parser combinator?
60
61When I heard of Parsec in the Haskell world, I got the concept of parser combinator for my first time.
62
63A *parser* is a function which takes a *string* (a series of *symbols*) as input, and returns matching result as *output*.
64
65A *combinator* is a higher-order function (a "functional") which takes zero or more functions (each of the same type) as input and returns a new function of the same type as output.
66
67A *parser combinator* is a higher-order function which takes parsers as input and returns a new parser as output.
68
69Parser combinators allow you write grammar rules and create a parser directly in the host language, without a separated parser generation step, so the whole procedure is more fluent.
70
71## How to implement parser combinators?
72
73I thought deeply about how to implement parser combinator using language constructs provided by Rust. In summary, there are four approaches:
74
751. Parser as closure
76
77   ```rust
78   pub fn empty<I>() -> impl Fn(&mut Input<I>) -> Result<()> {
79     |_: &mut Input<I>| Ok(())
80   }
81
82   pub fn term<I>(t: I) -> impl Fn(&mut Input<I>) -> Result<I> {
83       ...
84   }
85
86   pub fn seq<'a, I>(tag: &'a [I]) -> impl Fn(&mut Input<I>) -> Result<&'a [I]> {
87     ...
88   }
89   ...
90   // To create a parser for integer
91   let parser = concatenate(optional(one_of("+-")), one_or_more(one_of("0123456789")));
92   ```
93
94   *Pros*: Less implementation code.
95
96   *Cons*: Cannot overload operators, poor readability.
97
982. Parser as struct
99
100   ```rust
101   pub struct Parser<I, O> {
102       method: Box<Fn(&mut Input<I>) -> Result<O>>,
103   }
104
105   impl<I, O> Parser<I, O> {
106       /// Create new parser.
107       pub fn new<P>(parse: P) -> Parser<I, O>
108           where P: Fn(&mut Input<I>) -> Result<O> + 'static
109       {
110           Parser { method: Box::new(parse) }
111       }
112
113       /// Apply the parser to parse input.
114       pub fn parse(&self, input: &mut Input<I>) -> Result<O> {
115           (self.method)(input)
116       }
117       ...
118   }
119
120   pub fn empty<I>() -> Parser<I, ()> {
121       Parser::new(|_: &mut Input<I>| Ok(()))
122   }
123
124   pub fn term<I>(t: I) -> Parser<I, I> {
125       ...
126   }
127   ...
128   impl<I: Copy, O, U> Add<Parser<I, U>> for Parser<I, O> {
129       type Output = Parser<I, (O, U)>;
130
131       fn add(self, other: Parser<I, U>) -> Self::Output
132           where I: 'static,
133                 O: 'static,
134                 U: 'static
135       {
136           Parser::new(move |input: &mut Input<I>| {
137               let start = input.position();
138               let result = self.parse(input)
139                   .and_then(|out1| other.parse(input).map(|out2| (out1, out2)));
140               if result.is_err() {
141                   input.jump_to(start);
142               }
143               result
144           })
145       }
146   }
147   ...
148   // To create a parser for integer
149   let parser = one_of("+-").opt() + one_of("0123456789").repeat(1..);
150   ```
151
152   *Pros*: Can overload operators, elegant code.
153
154   *Cons*: Depends on compiler's zero-cost abstractions to optimize runtime performance.
155
156   Crate [pom](https://github.com/J-F-Liu/pom) is using this approach.
157
1583. Parser as trait
159
160   ```rust
161   pub trait Parser  {
162     type I: ?Sized;
163     type O;
164
165     fn parse<'a>(&self, data: &'a Self::I) -> ParseResult<&'a Self::I, Self::O>;
166   }
167
168   pub trait ParserCombinator : Parser + Clone {
169     fn then<P: Parser<I=Self::I>>(&self, p: P) -> ChainedParser<Self,P> {
170       ChainedParser{first: self.clone(), second: p}
171     }
172     ...
173   }
174
175   pub fn opt<T: Parser>(t: T) -> OptionParser<T> {
176     OptionParser{parser: t}
177   }
178
179   pub fn recursive<I:?Sized,O, F:  Fn() -> Box<Parser<I=I,O=O>>>(f: F) -> RecursiveParser<I,O,F> {
180     RecursiveParser{parser: Rc::new(f)}
181   }
182
183   ...
184
185   pub struct ChainedParser<A,B> {
186     first: A,
187     second: B,
188   }
189   ...
190   impl<C: ?Sized, A: Parser<I=C>, B: Parser<I=C>> Parser for ChainedParser<A, B> {
191     type I = C;
192     type O = (A::O,B::O);
193
194     fn parse<'a>(&self, data: &'a Self::I) -> ParseResult<&'a Self::I, Self::O>{
195       match self.first.parse(data) {
196         Ok((a, d2)) => match self.second.parse(d2) {
197           Ok((b, remain)) => Ok(((a, b), remain)),
198           Err(err) => Err(err)
199         },
200         Err(err) => Err(err)
201       }
202     }
203   }
204
205   impl<C: ?Sized, A: ParserCombinator<I=C>, B: ParserCombinator<I=C>>  Clone for ChainedParser<A, B> {
206       ...
207   }
208   ...
209   ```
210
211   *Pros*: Can overload operators.
212
213   *Cons*: Bloated code.
214
215   Crate [peruse](https://github.com/DanSimon/peruse) is using this approach.
216
2174. Parser as macro
218
219   ```rust
220   #[macro_export]
221   macro_rules! do_parse (
222     (__impl $i:expr, $consumed:expr, ( $($rest:expr),* )) => (
223       $crate::IResult::Done($i, ( $($rest),* ))
224     );
225
226     (__impl $i:expr, $consumed:expr, $e:ident >> $($rest:tt)*) => (
227       do_parse!(__impl $i, $consumed, call!($e) >> $($rest)*);
228     );
229     (__impl $i:expr, $consumed:expr, $submac:ident!( $($args:tt)* ) >> $($rest:tt)*) => (
230       {
231         match $submac!($i, $($args)*) {
232           $crate::IResult::Error(e)      => $crate::IResult::Error(e),
233           $crate::IResult::Incomplete($crate::Needed::Unknown) =>
234             $crate::IResult::Incomplete($crate::Needed::Unknown),
235           $crate::IResult::Incomplete($crate::Needed::Size(i)) =>
236             $crate::IResult::Incomplete($crate::Needed::Size($consumed + i)),
237           $crate::IResult::Done(i,_)     => {
238             do_parse!(__impl i,
239               $consumed + ($crate::InputLength::input_len(&($i)) -
240                            $crate::InputLength::input_len(&i)), $($rest)*)
241           },
242         }
243       }
244     );
245
246     (__impl $i:expr, $consumed:expr, $field:ident : $e:ident >> $($rest:tt)*) => (
247       do_parse!(__impl $i, $consumed, $field: call!($e) >> $($rest)*);
248     );
249
250     (__impl $i:expr, $consumed:expr, $field:ident : $submac:ident!( $($args:tt)* ) >> $($rest:tt)*) => (
251       {
252         match  $submac!($i, $($args)*) {
253           $crate::IResult::Error(e)      => $crate::IResult::Error(e),
254           $crate::IResult::Incomplete($crate::Needed::Unknown) =>
255             $crate::IResult::Incomplete($crate::Needed::Unknown),
256           $crate::IResult::Incomplete($crate::Needed::Size(i)) =>
257             $crate::IResult::Incomplete($crate::Needed::Size($consumed + i)),
258           $crate::IResult::Done(i,o)     => {
259             let $field = o;
260             do_parse!(__impl i,
261               $consumed + ($crate::InputLength::input_len(&($i)) -
262                            $crate::InputLength::input_len(&i)), $($rest)*)
263           },
264         }
265       }
266     );
267
268     // ending the chain
269     (__impl $i:expr, $consumed:expr, $e:ident >> ( $($rest:tt)* )) => (
270       do_parse!(__impl $i, $consumed, call!($e) >> ( $($rest)* ));
271     );
272
273     (__impl $i:expr, $consumed:expr, $submac:ident!( $($args:tt)* ) >> ( $($rest:tt)* )) => (
274       match $submac!($i, $($args)*) {
275         $crate::IResult::Error(e)      => $crate::IResult::Error(e),
276         $crate::IResult::Incomplete($crate::Needed::Unknown) =>
277           $crate::IResult::Incomplete($crate::Needed::Unknown),
278         $crate::IResult::Incomplete($crate::Needed::Size(i)) =>
279           $crate::IResult::Incomplete($crate::Needed::Size($consumed + i)),
280         $crate::IResult::Done(i,_)     => {
281           $crate::IResult::Done(i, ( $($rest)* ))
282         },
283       }
284     );
285
286     (__impl $i:expr, $consumed:expr, $field:ident : $e:ident >> ( $($rest:tt)* )) => (
287       do_parse!(__impl $i, $consumed, $field: call!($e) >> ( $($rest)* ) );
288     );
289
290     (__impl $i:expr, $consumed:expr, $field:ident : $submac:ident!( $($args:tt)* ) >> ( $($rest:tt)* )) => (
291       match $submac!($i, $($args)*) {
292         $crate::IResult::Error(e)      => $crate::IResult::Error(e),
293         $crate::IResult::Incomplete($crate::Needed::Unknown) =>
294           $crate::IResult::Incomplete($crate::Needed::Unknown),
295         $crate::IResult::Incomplete($crate::Needed::Size(i)) =>
296           $crate::IResult::Incomplete($crate::Needed::Size($consumed + i)),
297         $crate::IResult::Done(i,o)     => {
298           let $field = o;
299           $crate::IResult::Done(i, ( $($rest)* ))
300         },
301       }
302     );
303
304     ($i:expr, $($rest:tt)*) => (
305       {
306         do_parse!(__impl $i, 0usize, $($rest)*)
307       }
308     );
309   );
310   ...
311   // To create a parser for integer
312   named!(integer<&[u8], i64>, map!(
313     pair!(
314       opt!(sign),
315       map_res!(map_res!(digit, str::from_utf8), i64::from_str)
316     ),
317     |(sign, value): (Option<i64>, i64)| { sign.unwrap_or(1) * value }
318   ));
319   ```
320
321   *Pros*: Can create DSL syntax, high performance.
322
323   *Cons*: Macros themselves are difficult to read, write and debug.
324
325   Crate [nom](https://github.com/Geal/nom) is using this approach.
326
327According to above comparison, parser as struct is the best approach. At first I choose to use nom to create a PDF parser, it turns out a special PDF feature blocked me.
328When parsing a PDF stream object, it's length may be a referenced object, hence the need to get the length from a reader.
329The `named! ` macro cannot accept extra parameters, there is no obvious way to read a length object inside a stream object parser.
330This is the primary reason why I started to develop pom.
331
332## List of predefined parsers and combinators in pom
333
334| Basic Parsers  | Description                              |
335| -------------- | ---------------------------------------- |
336| empty()        | Always succeeds, consume no input.       |
337| end()          | Match end of input.                      |
338| sym(t)        | Match a single terminal symbol *t*.       |
339| seq(s)         | Match sequence of symbols.               |
340| list(p,s)      | Match list of *p*, separated by *s*.     |
341| one_of(set)    | Success when current input symbol is one of the set. |
342| none_of(set)   | Success when current input symbol is none of the set. |
343| is_a(predicate)  | Success when predicate return true on current input symbol. |
344| not_a(predicate) | Success when predicate return false on current input symbol. |
345| take(n)        | Read *n* symbols.                        |
346| skip(n)        | Skip *n* symbols.                        |
347| call(pf)       | Call a parser factory, can used to create recursive parsers. |
348
349These are functions to create basic parsers.
350
351
352| Parser Combinators | Description                              |
353| ------------------ | ---------------------------------------- |
354| p &#124; q         | Match p or q, return result of the first success. |
355| p + q              | Match p and q, if both success return a pair of results. |
356| p - q              | Match p and q, if both success return result of p. |
357| p * q              | Match p and q, if both success return result of q. |
358| p >> q             | Parse p and get result P, then parse and return result of q(P). |
359| -p                 | Success when p success, doen't consume input. |
360| !p                 | Success when p fail, doen't consume input. |
361| p.opt()            | Make parser optional.                    |
362| p.repeat(m..n)     | `p.repeat(0..)` repeat p zero or more times<br>`p.repeat(1..)` repeat p one or more times<br>`p.repeat(1..4)` match p at least 1 and at most 3 times |
363| p.map(f)           | Convert parser result to desired value.  |
364| p.convert(f)       | Convert parser result to desired value, fail in case of conversion error. |
365| p.pos()            | Get input position after matching p.     |
366| p.collect()        | Collect all matched input symbols.       |
367| p.discard()        | Discard parser output.                   |
368| p.name(_)          | Give parser a name to identify parsing errors. |
369
370These are operations to create new parsers based on other parsers. The choice of operators is established by their operator precedence, arity and "meaning".
371
372Use `*` to ignore the result of first operand on the start of an expression, `+` and `-` can fulfill the need on the rest of the expression.
373For example, `A * B * C - D + E - F` will return the results of C and E as a pair.
374
375## Using the code
376
377There are three ways to create a parser:
378
3791. As a variable, normally used to construct another parser.
380
381   ```rust
382   let integer = one_of(b"+-").opt() + one_of(b"0123456789").repeat(1..);
383   ```
384
3852. As a closure, when referenced several times in constructing another parser.
386
387   ```rust
388   let integer = || one_of(b"+-").opt() + one_of(b"0123456789").repeat(1..);
389   let pair = sym(b'(') * integer() - sym(b',') + integer() - sym(b')');
390   ```
391
3923. As a function, provides a high level construct.
393
394   ```rust
395   fn integer() -> Parser<u8, (Option<u8>, Vec<u8>)> {
396       one_of(b"+-").opt() + one_of(b"0123456789").repeat(1..)
397   }
398   ```
399
400## Example JSON Parser
401
402Let me explain the parser combinators in more detail by creating a JSON parser. Syntax diagrams can be found on [json.org](http://www.json.org/).
403
404```rust
405extern crate pom;
406use pom::{Parser, DataInput};
407use pom::char_class::hex_digit;
408use pom::parser::*;
409
410use std::str::FromStr;
411use std::char::{decode_utf16, REPLACEMENT_CHARACTER};
412use std::collections::HashMap;
413
414#[derive(Debug, PartialEq)]
415pub enum JsonValue {
416    Null,
417    Bool(bool),
418    Str(String),
419    Num(f64),
420    Array(Vec<JsonValue>),
421    Object(HashMap<String,JsonValue>)
422}
423```
424
425Import predefined parser combinators and utility functions, define the JSON parser's output value as an enum.
426
427```rust
428fn space() -> Parser<u8, ()> {
429    one_of(b" \t\r\n").repeat(0..).discard()
430}
431```
432
433Match zero or more space characters, the output is ignored.
434
435```rust
436fn number() -> Parser<u8, f64> {
437    let integer = one_of(b"123456789") - one_of(b"0123456789").repeat(0..) | sym(b'0');
438    let frac = sym(b'.') + one_of(b"0123456789").repeat(1..);
439    let exp = one_of(b"eE") + one_of(b"+-").opt() + one_of(b"0123456789").repeat(1..);
440    let number = sym(b'-').opt() + integer + frac.opt() + exp.opt();
441    number.collect().convert(String::from_utf8).convert(|s|f64::from_str(&s))
442}
443```
444
445Don't care each output of integer, frac or exp, collect() method get all the match character as a Vec<u8>, then it is converted to a string, and further converted to a float number.
446
447```rust
448fn string() -> Parser<u8, String> {
449    let special_char = sym(b'\\') | sym(b'/') | sym(b'"')
450        | sym(b'b').map(|_|b'\x08') | sym(b'f').map(|_|b'\x0C')
451        | sym(b'n').map(|_|b'\n') | sym(b'r').map(|_|b'\r') | sym(b't').map(|_|b'\t');
452    let escape_sequence = sym(b'\\') * special_char;
453    let char_string = (none_of(b"\\\"") | escape_sequence).repeat(1..).convert(String::from_utf8);
454    let utf16_char = seq(b"\\u") * is_a(hex_digit).repeat(4).convert(String::from_utf8).convert(|digits|u16::from_str_radix(&digits, 16));
455    let utf16_string = utf16_char.repeat(1..).map(|chars|decode_utf16(chars).map(|r| r.unwrap_or(REPLACEMENT_CHARACTER)).collect::<String>());
456    let string = sym(b'"') * (char_string | utf16_string).repeat(0..) - sym(b'"');
457    string.map(|strings|strings.concat())
458}
459```
460
461The bulk of code is written to parse escape sequences.
462According to [Wikipedia](https://en.wikipedia.org/wiki/JSON#Data_portability_issues), UTF-16 surrogate pairs is a detail missed by some JSON parsers.
463We implement this easily with Rust's Unicode support.
464
465```rust
466fn array() -> Parser<u8, Vec<JsonValue>> {
467    let elems = list(call(value), sym(b',') * space());
468    sym(b'[') * space() * elems - sym(b']')
469}
470
471fn object() -> Parser<u8, HashMap<String, JsonValue>> {
472    let member = string() - space() - sym(b':') - space() + call(value);
473    let members = list(member, sym(b',') * space());
474    let obj = sym(b'{') * space() * members - sym(b'}');
475    obj.map(|members|members.into_iter().collect::<HashMap<_,_>>())
476}
477
478fn value() -> Parser<u8, JsonValue> {
479    ( seq(b"null").map(|_|JsonValue::Null)
480    | seq(b"true").map(|_|JsonValue::Bool(true))
481    | seq(b"false").map(|_|JsonValue::Bool(false))
482    | number().map(|num|JsonValue::Num(num))
483    | string().map(|text|JsonValue::Str(text))
484    | array().map(|arr|JsonValue::Array(arr))
485    | object().map(|obj|JsonValue::Object(obj))
486    ) - space()
487}
488```
489
490array and object are very straight to parse, notice `call(value)`, at the first attempt I write it as `value()`, then an infinite loop is created. Recursive parsing is solved by adding `call()` to `pom`.
491
492```rust
493pub fn json() -> Parser<u8, JsonValue> {
494    space() * value() - end()
495}
496```
497
498The final JSON parser, declared as public. According to [RFC 7159](https://tools.ietf.org/html/rfc7159) a JSON text is a serialized value of any of the six types.
499`end()` is used to ensure there is no extra text in the input.
500
501```rust
502fn main() {
503    let test = br#"
504    {
505        "Image": {
506            "Width":  800,
507            "Height": 600,
508            "Title":  "View from 15th Floor",
509            "Thumbnail": {
510                "Url":    "http://www.example.com/image/481989943",
511                "Height": 125,
512                "Width":  100
513            },
514            "Animated" : false,
515            "IDs": [116, 943, 234, 38793]
516        },
517        "escaped characters": "\u2192\uD83D\uDE00\"\t\uD834\uDD1E"
518    }"#;
519
520    let mut input = DataInput::new(test);
521    println!("{:?}", json().parse(&mut input));
522}
523```
524
525Use the JSON parser to parse JSON text, the output is:
526
527```
528cargo run --example json
529   Compiling pom v0.6.0 (file:///work/pom)
530    Finished debug [unoptimized + debuginfo] target(s) in 2.20 secs
531     Running `target/debug/examples/json`
532Ok(Object({"Image": Object({"Width": Num(800), "Title": Str("View from 15th Floor"), "Height": Num(600), "Animated": Bool(false), "IDs": Array([Num(116), Num(943), Num(234), Num(38793)]), "Thumbnail": Object({"Height
533": Num(125), "Url": Str("http://www.example.com/image/481989943"), "Width": Num(100)})}), "escaped characters": Str("→��\"\t��")}))
534```
535
536The above parser assumes that the input bytes is UTF-8 encoded text; otherwise, you can use the [char version of JSON parser](https://github.com/J-F-Liu/pom/blob/master/examples/json_char.rs).
537
538`p >> q` is not covered in the JSON example. It is used to pass the output of `p` into parser creation of `p`.
539
540```rust
541let mut input = DataInput::new(b"5oooooooo");
542let parser = one_of(b"0123456789").map(|c|c - b'0') >> |n| {
543    take(n as usize) + sym(b'o').repeat(0..)
544};
545let output = parser.parse(&mut input);
546assert_eq!(output, Ok( (vec![b'o';5], vec![b'o';3]) ));
547```
548
549The first character indicates the number of `o`s to parse, then the number is used in the closure `|n| take(n)`.
550
551## More examples
552
553- A [simple PDF parser](https://github.com/J-F-Liu/lopdf/blob/491dece5867a2b81878208bcb5e07ff1007c0d89/src/parser.rs), you can compare it with the equivalent [nom version](https://github.com/J-F-Liu/lopdf/blob/dff82c49fea9ac9ea23edf42ad80e480bd5edb46/src/parser.rs).
554- A [complete PDF parser](https://github.com/J-F-Liu/lopdf/blob/master/src/parser.rs) which can read length object when parsing stream object.
555
556## Conclusion
557
558I think I created something really cool, you can use pom to write all kinds of parsers elegantly.
559I helped pom to evolve version by version into what it is, and pom also helps me to grow my Rust programming skills a lot.
560Of course there is still room for improvement, any feed back is welcome.
561
562## Points of interest
563
564I try to add a `cache()` method to `Parser`. Memorize the result on given input position, return the result directly when called again, effectively implementing the Packrat Parsing algorithm.
565But there are two problems, 1) save result means mutate a Hashmap, so Parser's method field should be a Box of `FnMut`,
5662) Hashmap returns an reference of value for a given key, the value cannot be moved, so need to make the value cloneable.
567
568## Pain points where Rust needs to improve
569
5701. Implement trait for `[T]` should automatically implement `[T; N]`.
5712. The standard library should provide a char_at() method return the char and the number of bytes consumed, like:
572	```rust
573	pub trait Encoding {
574		/// Get char at a byte index, return the char and the number of bytes read.
575		fn char_at(&self, data: &[u8], index: usize) -> Result<(char, u32)>;
576	}
577	```
5783. Can ellide 'static lifetime parameter, allow `Parser<'static, I, O>` written as `Parser<I, O>`.
5794. Should `impl Copy for closure`, so that FnOnce closure can be passed to map() inside Fn closure.
580	```rust
581	pub fn map<U, F>(self, f: F) -> Parser<'a, I, U>
582		where F: FnOnce(O) -> U + Copy + 'a,
583			  I: 'static,
584			  O: 'static,
585			  U: 'static
586	{
587		Parser::new(move |input: &mut Input<I>| {
588			self.parse(input).map(f)
589		})
590	}
591	```
592
593## More Readings
594
595- [The Rust programming language, in the words of its practitioners](https://brson.github.io/fireflowers/)
596- [PEGs, Packrats and Parser Combinators](http://scg.unibe.ch/download/lectures/cc2011/10PEGs.pptx.pdf)
597- [An introduction to parsing text in Haskell with Parsec](http://unbui.lt/#!/post/haskell-parsec-basics/)
598