# Philosophy

Parsers are innately complicated and confusing. They're difficult to understand, difficult to write, and difficult to use. Even experts on the subject can become baffled by the nuances of these complicated state-machines.

Lark's mission is to make the process of writing them as simple and abstract as possible, by following these design principles:

## Design Principles

1. Readability matters

2. Keep the grammar clean and simple

3. Don't force the user to decide on things that the parser can figure out on its own

4. Usability is more important than performance

5. Performance is still very important

6. Follow the Zen of Python, whenever possible and applicable


In accordance with these principles, I arrived at the following design choices:

-----------

## Design Choices

### 1. Separation of code and grammar

Grammars are the de facto reference for your language, and for the structure of your parse-tree. For any non-trivial language, conflating code and grammar always turns out convoluted and difficult to read.

The grammars in Lark are EBNF-inspired, so they are especially easy to read and work with.

### 2. Always build a parse-tree (unless told not to)

Trees are always simpler to work with than state-machines.

1. Trees allow you to see the "state-machine" visually

2. Trees allow your computation to be aware of previous and future states

3. Trees allow you to process the parse in steps, instead of forcing you to do it all at once

And anyway, every parse-tree can be replayed as a state-machine, so there is no loss of information.

For a more detailed answer, see [this discussion](https://github.com/erezsh/lark/issues/4).

To improve performance when using LALR(1), you can skip building the tree by providing Lark with a transformer (see the [JSON example](https://github.com/erezsh/lark/blob/master/examples/json_parser.py)).

### 3. Earley is the default

The Earley algorithm can accept *any* context-free grammar you throw at it (i.e. it can parse any grammar you can write in EBNF). That makes it extremely friendly to beginners, who are not aware of the strange and arbitrary restrictions that LALR(1) places on its grammars.

As users grow to understand the structure of their grammar, the scope of their target language, and their performance requirements, they may choose to switch over to LALR(1) to gain a huge performance boost, possibly at the cost of some language features.

Both Earley and LALR(1) can use the same grammar, as long as all constraints are satisfied.

In short, "Premature optimization is the root of all evil."

### Other design features

- Automatically resolve terminal collisions whenever possible

- Automatically keep track of line & column numbers
65
66