• Home
  • History
  • Annotate
Name Date Size #Lines LOC

..03-May-2022-

.vscode/H03-May-2022-4745

assets/H03-May-2022-9392

benches/H03-May-2022-4735

doc/H03-May-2022-598478

examples/H03-May-2022-300260

src/H03-May-2022-1,063902

tests/H03-May-2022-2922

.cargo-checksum.jsonH A D03-May-202289 11

.cargo_vcs_info.jsonH A D01-Jan-197074 65

.editorconfigH A D21-Jan-2017186 1310

.gitignoreH A D21-Jan-201718 32

.travis.ymlH A D07-Dec-2018443 2319

Cargo.tomlH A D01-Jan-19701 KiB2826

Cargo.toml.orig-cargoH A D29-Mar-2019511 1715

LICENSEH A D21-Jan-20171 KiB2217

README.mdH A D12-Dec-20187.6 KiB189156

rustfmt.tomlH A D12-Dec-201863 43

README.md

1# pom
2
3[![Crates.io](https://img.shields.io/crates/v/pom.svg)](https://crates.io/crates/pom)
4[![Build Status](https://travis-ci.org/J-F-Liu/pom.png)](https://travis-ci.org/J-F-Liu/pom)
5[![Docs](https://docs.rs/pom/badge.svg)](https://docs.rs/pom)
6[![Discord](https://img.shields.io/badge/discord-pom-red.svg)](https://discord.gg/CVy85pg)
7
8PEG parser combinators created using operator overloading without macros.
9
10## Document
11
12- [Tutorial](https://github.com/J-F-Liu/pom/blob/master/doc/article.md)
13- [API Reference](https://docs.rs/crate/pom/)
14
15## What is PEG?
16
17PEG stands for parsing expression grammar, is a type of analytic formal grammar, i.e. it describes a formal language in terms of a set of rules for recognizing strings in the language.
18Unlike CFGs, PEGs cannot be ambiguous; if a string parses, it has exactly one valid parse tree.
19Each parsing function conceptually takes an input string as its argument, and yields one of the following results:
20- success, in which the function may optionally move forward or consume one or more characters of the input string supplied to it, or
21- failure, in which case no input is consumed.
22
23Read more on [Wikipedia](https://en.wikipedia.org/wiki/Parsing_expression_grammar).
24
25## What is parser combinator?
26
27A parser combinator is a higher-order function that accepts several parsers as input and returns a new parser as its output.
28Parser combinators enable a recursive descent parsing strategy that facilitates modular piecewise construction and testing.
29
30Parsers built using combinators are straightforward to construct, readable, modular, well-structured and easily maintainable.
31With operator overloading, a parser combinator can take the form of an infix operator, used to glue different parsers to form a complete rule.
32Parser combinators thereby enable parsers to be defined in an embedded style, in code which is similar in structure to the rules of the formal grammar.
33And the code is easier to debug than macros.
34
35The main advantage is that you don't need to go through any kind of code generation step, you're always using the vanilla language underneath.
36Aside from build issues (and the usual issues around error messages and debuggability, which in fairness are about as bad with macros as with code generation), it's usually easier to freely intermix grammar expressions and plain code.
37
38## List of predefined parsers and combinators
39
40|Basic Parsers|Description|
41| --- | --- |
42|empty()|Always succeeds, consume no input.|
43|end()  |Match end of input.|
44|sym(t)|Match a single terminal symbol *t*.|
45|seq(s) |Match sequence of symbols.|
46|list(p,s) |Match list of *p*, separated by *s*.|
47|one_of(set) |Success when current input symbol is one of the set.|
48|none_of(set)|Success when current input symbol is none of the set.|
49|is_a(predicate) |Success when predicate return true on current input symbol.|
50|not_a(predicate)|Success when predicate return false on current input symbol.|
51|take(n)|Read *n* symbols.|
52|skip(n)|Skip *n* symbols.|
53|call(pf)|Call a parser factory, can used to create recursive parsers.|
54
55|Parser Combinators|Description|
56| --- | --- |
57| p | q | Match p or q, return result of the first success. |
58| p + q | Match p and q, if both success return a pair of results. |
59| p - q | Match p and q, if both success return result of p. |
60| p * q | Match p and q, if both success return result of q. |
61| p >> q | Parse p and get result P, then parse and return result of q(P). |
62| -p | Success when p success, doesn't consume input. |
63| !p | Success when p fail, doesn't consume input. |
64|p.opt()|Make parser optional. Returns an `Option`.|
65|p.repeat(m..n)| `p.repeat(0..)` repeat p zero or more times<br>`p.repeat(1..)` repeat p one or more times<br>`p.repeat(1..4)` match p at least 1 and at most 3 times<br>`p.repeat(5)` repeat p exactly 5 times|
66|p.map(f)|Convert parser result to desired value.|
67|p.convert(f)|Convert parser result to desired value, fail in case of conversion error.|
68|p.pos() |Get input position after matching p.|
69|p.collect()|Collect all matched input symbols.|
70|p.discard()|Discard parser output.|
71|p.name(_)|Give parser a name to identify parsing errors.|
72|p.expect(_)|Mark parser as expected, abort early when failed in ordered choice.|
73
74The choice of operators is established by their operator precedence, arity and "meaning".
75Use `*` to ignore the result of first operand on the start of an expression, `+` and `-` can fulfill the need on the rest of the expression.
76
77For example, `A * B * C - D + E - F` will return the results of C and E as a pair.
78
79## Example code
80```rust
81extern crate pom;
82use pom::parser::*;
83
84let input = b"abcde";
85let parser = sym(b'a') * none_of(b"AB") - sym(b'c') + seq(b"de");
86let output = parser.parse(input);
87assert_eq!(output, Ok( (b'b', vec![b'd', b'e']) ) );
88```
89
90### Example JSON parser
91```rust
92extern crate pom;
93use pom::parser::*;
94use pom::Parser;
95
96use std::collections::HashMap;
97use std::str::{self, FromStr};
98
99#[derive(Debug, PartialEq)]
100pub enum JsonValue {
101	Null,
102	Bool(bool),
103	Str(String),
104	Num(f64),
105	Array(Vec<JsonValue>),
106	Object(HashMap<String,JsonValue>)
107}
108
109fn space() -> Parser<u8, ()> {
110	one_of(b" \t\r\n").repeat(0..).discard()
111}
112
113fn number() -> Parser<u8, f64> {
114	let integer = one_of(b"123456789") - one_of(b"0123456789").repeat(0..) | sym(b'0');
115	let frac = sym(b'.') + one_of(b"0123456789").repeat(1..);
116	let exp = one_of(b"eE") + one_of(b"+-").opt() + one_of(b"0123456789").repeat(1..);
117	let number = sym(b'-').opt() + integer + frac.opt() + exp.opt();
118	number.collect().convert(str::from_utf8).convert(|s|f64::from_str(&s))
119}
120
121fn string() -> Parser<u8, String> {
122	let special_char = sym(b'\\') | sym(b'/') | sym(b'"')
123		| sym(b'b').map(|_|b'\x08') | sym(b'f').map(|_|b'\x0C')
124		| sym(b'n').map(|_|b'\n') | sym(b'r').map(|_|b'\r') | sym(b't').map(|_|b'\t');
125	let escape_sequence = sym(b'\\') * special_char;
126	let string = sym(b'"') * (none_of(b"\\\"") | escape_sequence).repeat(0..) - sym(b'"');
127	string.convert(String::from_utf8)
128}
129
130fn array() -> Parser<u8, Vec<JsonValue>> {
131	let elems = list(call(value), sym(b',') * space());
132	sym(b'[') * space() * elems - sym(b']')
133}
134
135fn object() -> Parser<u8, HashMap<String, JsonValue>> {
136	let member = string() - space() - sym(b':') - space() + call(value);
137	let members = list(member, sym(b',') * space());
138	let obj = sym(b'{') * space() * members - sym(b'}');
139	obj.map(|members|members.into_iter().collect::<HashMap<_,_>>())
140}
141
142fn value() -> Parser<u8, JsonValue> {
143	( seq(b"null").map(|_|JsonValue::Null)
144	| seq(b"true").map(|_|JsonValue::Bool(true))
145	| seq(b"false").map(|_|JsonValue::Bool(false))
146	| number().map(|num|JsonValue::Num(num))
147	| string().map(|text|JsonValue::Str(text))
148	| array().map(|arr|JsonValue::Array(arr))
149	| object().map(|obj|JsonValue::Object(obj))
150	) - space()
151}
152
153pub fn json() -> Parser<u8, JsonValue> {
154	space() * value() - end()
155}
156
157fn main() {
158	let input = br#"
159	{
160        "Image": {
161            "Width":  800,
162            "Height": 600,
163            "Title":  "View from 15th Floor",
164            "Thumbnail": {
165                "Url":    "http://www.example.com/image/481989943",
166                "Height": 125,
167                "Width":  100
168            },
169            "Animated" : false,
170            "IDs": [116, 943, 234, 38793]
171        }
172    }"#;
173
174	println!("{:?}", json().parse(input));
175}
176```
177You can run this example with the following command:
178```
179cargo run --example json
180```
181
182## Benchmark
183
184| Parser           | Time to parse the same JSON file |
185|------------------|----------------------------------|
186| pom: json_byte   | 621,319 ns/iter (+/- 20,318)     |
187| pom: json_char   | 627,110 ns/iter (+/- 11,463)     |
188| [pest](https://github.com/dragostis/pest): json_char  | 13,359 ns/iter (+/- 811)|
189