# Philosophy

Parsers are innately complicated and confusing. They're difficult to understand, difficult to write, and difficult to use. Even experts on the subject can become baffled by the nuances of these complicated state-machines.

Lark's mission is to make the process of writing them as simple and abstract as possible, by following these design principles:

## Design Principles

1. Readability matters

2. Keep the grammar clean and simple

3. Don't force the user to decide on things that the parser can figure out on its own

4. Usability is more important than performance

5. Performance is still very important

6. Follow the Zen of Python, whenever possible and applicable


In accordance with these principles, I arrived at the following design choices:

-----------

## Design Choices

### 1. Separation of code and grammar

Grammars are the de-facto reference for your language, and for the structure of your parse-tree. For any non-trivial language, the conflation of code and grammar always turns out convoluted and difficult to read.

The grammars in Lark are EBNF-inspired, so they are especially easy to read & work with.

### 2. Always build a parse-tree (unless told not to)

Trees are always simpler to work with than state-machines.

1. Trees allow you to see the "state-machine" visually

2. Trees allow your computation to be aware of previous and future states

3. Trees allow you to process the parse in steps, instead of forcing you to do it all at once.

And anyway, every parse-tree can be replayed as a state-machine, so there is no loss of information.

This point is discussed in more detail [here](https://github.com/erezsh/lark/issues/4).

To improve performance, you can skip building the tree for LALR(1), by providing Lark with a transformer (see the [JSON example](https://github.com/erezsh/lark/blob/master/examples/json_parser.py)).

### 3. Earley is the default

The Earley algorithm can accept *any* context-free grammar you throw at it (i.e. any grammar you can write in EBNF, it can parse). That makes it extremely friendly to beginners, who are not aware of the strange and arbitrary restrictions that LALR(1) places on its grammars.

As users grow to understand the structure of their grammar, the scope of their target language, and their performance requirements, they may choose to switch over to LALR(1) to gain a huge performance boost, possibly at the cost of some language features.

Both Earley and LALR(1) can use the same grammar, as long as all constraints are satisfied.

In short, "Premature optimization is the root of all evil."

### Other design features

- Automatically resolve terminal collisions whenever possible

- Automatically keep track of line & column numbers