| Name | Date | Size |
|---|---|---|
| .. | 03-May-2022 | - |
| .github/ | 11-Sep-2021 | - |
| buffer/ | 11-Sep-2021 | - |
| css/ | 11-Sep-2021 | - |
| html/ | 11-Sep-2021 | - |
| js/ | 11-Sep-2021 | - |
| json/ | 11-Sep-2021 | - |
| strconv/ | 11-Sep-2021 | - |
| tests/ | 11-Sep-2021 | - |
| xml/ | 11-Sep-2021 | - |
| .gitattributes | 11-Sep-2021 | 36 |
| .gitignore | 11-Sep-2021 | 97 |
| .golangci.yml | 11-Sep-2021 | 195 |
| LICENSE.md | 11-Sep-2021 | 1 KiB |
| README.md | 11-Sep-2021 | 5.4 KiB |
| common.go | 11-Sep-2021 | 5.3 KiB |
| common_test.go | 11-Sep-2021 | 4.5 KiB |
| error.go | 11-Sep-2021 | 1.2 KiB |
| error_test.go | 11-Sep-2021 | 1 KiB |
| go.mod | 11-Sep-2021 | 86 |
| go.sum | 11-Sep-2021 | 167 |
| input.go | 11-Sep-2021 | 4.1 KiB |
| input_test.go | 11-Sep-2021 | 4.1 KiB |
| position.go | 11-Sep-2021 | 2 KiB |
| position_test.go | 11-Sep-2021 | 2.8 KiB |
| util.go | 11-Sep-2021 | 14.8 KiB |
| util_test.go | 11-Sep-2021 | 9.3 KiB |

README.md

# Parse [![API reference](https://img.shields.io/badge/godoc-reference-5272B4)](https://pkg.go.dev/github.com/tdewolff/parse/v2?tab=doc) [![Go Report Card](https://goreportcard.com/badge/github.com/tdewolff/parse)](https://goreportcard.com/report/github.com/tdewolff/parse) [![Coverage Status](https://coveralls.io/repos/github/tdewolff/parse/badge.svg?branch=master)](https://coveralls.io/github/tdewolff/parse?branch=master) [![Donate](https://img.shields.io/badge/patreon-donate-DFB317)](https://www.patreon.com/tdewolff)

This package contains several lexers and parsers written in [Go][1]. All subpackages are built to be streaming, high-performance, and in accordance with the latest official specifications.

The lexers are implemented using `buffer.Lexer` in https://github.com/tdewolff/parse/buffer and the parsers work on top of the lexers. Some subpackages have hashes defined (using [Hasher](https://github.com/tdewolff/hasher)) that speed up common byte-slice comparisons.

## Buffer
### Reader
Reader is a wrapper around a `[]byte` that implements the `io.Reader` interface. It is comparable to `bytes.Reader` but has slightly different semantics (and a slightly smaller memory footprint).
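
As a rough illustration of what wrapping a `[]byte` in an `io.Reader` involves, here is a minimal sketch using only the standard library. The `sliceReader` type and the `readAll` helper are hypothetical names made up for this example; they are not the package's implementation and make no claim about its exact semantics.

```go
package main

import (
	"fmt"
	"io"
)

// sliceReader is a hypothetical stand-in for a []byte-backed reader.
type sliceReader struct {
	buf []byte
	pos int
}

// Read copies up to len(p) unread bytes into p and reports io.EOF
// once the slice is exhausted.
func (r *sliceReader) Read(p []byte) (int, error) {
	if r.pos >= len(r.buf) {
		return 0, io.EOF
	}
	n := copy(p, r.buf[r.pos:])
	r.pos += n
	return n, nil
}

// readAll drains r through a small fixed-size chunk to show that the
// reader can be consumed incrementally through the io.Reader interface.
func readAll(r io.Reader) string {
	var out []byte
	chunk := make([]byte, 4)
	for {
		n, err := r.Read(chunk)
		out = append(out, chunk[:n]...)
		if err != nil {
			return string(out)
		}
	}
}

func main() {
	fmt.Println(readAll(&sliceReader{buf: []byte("streaming input")}))
}
```

Because the type only needs the slice and an offset, it can be passed anywhere an `io.Reader` is accepted.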

### Writer
Writer is a buffer that implements the `io.Writer` interface and expands the buffer as needed. The reset functionality allows for better memory reuse. After calling `Reset`, it will overwrite the current buffer and thus reduce allocations.
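
The memory-reuse idea behind `Reset` can be sketched as follows. `reusableWriter` is a hypothetical type written for this example, assuming that resetting truncates the length while keeping the allocated capacity; the package's own Writer may differ in detail.

```go
package main

import "fmt"

// reusableWriter is a hypothetical growable buffer, not the package's type.
type reusableWriter struct {
	buf []byte
}

// Write appends p, growing the underlying array only when needed.
func (w *reusableWriter) Write(p []byte) (int, error) {
	w.buf = append(w.buf, p...)
	return len(p), nil
}

// Bytes returns the bytes written since the last Reset.
func (w *reusableWriter) Bytes() []byte { return w.buf }

// Reset truncates the length to zero but keeps the capacity, so the
// next writes overwrite the existing memory instead of allocating.
func (w *reusableWriter) Reset() { w.buf = w.buf[:0] }

func main() {
	w := &reusableWriter{}
	w.Write([]byte("first payload"))
	capBefore := cap(w.buf)

	w.Reset()
	w.Write([]byte("second"))

	// The shorter second payload fits in the memory kept by Reset.
	fmt.Println(string(w.Bytes()), cap(w.buf) == capBefore) // second true
}
```

Note that slices obtained from `Bytes` before a `Reset` share the same memory and are overwritten by later writes.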

### Lexer
Lexer is a read buffer specifically designed for building lexers. It keeps track of two positions: a start and an end position. The start position is the beginning of the current token being parsed; the end position is moved forward until a valid token is found. Calling `Shift` will collapse the positions to the end and return the parsed `[]byte`.

The end position is moved with `Move(int)`, which also accepts negative integers. One can also save the current position with `Pos() int` before trying to parse a token and, if that fails, rewind with `Rewind(int)`, passing the previously saved position.

`Peek(int) byte` will peek forward (relative to the end position) and return the byte at that location. `PeekRune(int) (rune, int)` returns the UTF-8 rune and its length in bytes at the given **byte** position. Upon an error `Peek` will return `0`. The **user must peek at every character** and not skip any, otherwise it may skip a `0` and panic on out-of-bounds indexing.

`Lexeme() []byte` will return the currently selected bytes, and `Skip()` will collapse the selection. `Shift() []byte` is a combination of `Lexeme() []byte` and `Skip()`.

If the passed `io.Reader` returned an error, `Err() error` will return that error even if not at the end of the buffer.
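
The two-position scheme described in this section can be sketched with a small, self-contained type. `miniLexer`, `isDigit` and `tokens` are hypothetical names for this example. The method names mirror the ones listed above, but their bodies are assumptions (for instance that `Pos` is relative to the start position and that the input is terminated by a single `0` byte), not the package's code.

```go
package main

import "fmt"

// miniLexer is a hypothetical two-position read buffer.
type miniLexer struct {
	buf   []byte // input followed by a terminating 0 byte
	start int    // beginning of the current token
	pos   int    // end position, moved forward while scanning
}

func newMiniLexer(b []byte) *miniLexer {
	buf := make([]byte, len(b)+1) // the extra byte stays 0
	copy(buf, b)
	return &miniLexer{buf: buf}
}

// Peek returns the byte n positions past the end position; 0 marks the end.
// Skipping over the 0 unseen would index out of bounds.
func (l *miniLexer) Peek(n int) byte { return l.buf[l.pos+n] }
func (l *miniLexer) Move(n int)      { l.pos += n }
func (l *miniLexer) Pos() int        { return l.pos - l.start }
func (l *miniLexer) Rewind(p int)    { l.pos = l.start + p }
func (l *miniLexer) Lexeme() []byte  { return l.buf[l.start:l.pos] }
func (l *miniLexer) Skip()           { l.start = l.pos }

// Shift returns the selected bytes and collapses the selection.
func (l *miniLexer) Shift() []byte {
	b := l.Lexeme()
	l.Skip()
	return b
}

func isDigit(c byte) bool { return c >= '0' && c <= '9' }

// tokens splits input into numbers (with an optional fraction) and
// single bytes, dropping spaces.
func tokens(input string) []string {
	l := newMiniLexer([]byte(input))
	var out []string
	for l.Peek(0) != 0 {
		c := l.Peek(0)
		switch {
		case c == ' ':
			l.Move(1)
			l.Skip() // drop whitespace without emitting a token
			continue
		case isDigit(c):
			for isDigit(l.Peek(0)) {
				l.Move(1)
			}
			mark := l.Pos() // save the position before trying a fraction
			if l.Peek(0) == '.' {
				l.Move(1)
				if !isDigit(l.Peek(0)) {
					l.Rewind(mark) // not a fraction: back out of the '.'
				}
				for isDigit(l.Peek(0)) {
					l.Move(1)
				}
			}
		default:
			l.Move(1)
		}
		out = append(out, string(l.Shift()))
	}
	return out
}

func main() {
	fmt.Printf("%q\n", tokens("12.5 7.x")) // ["12.5" "7" "." "x"]
}
```

The `7.` case shows the save-and-rewind pattern: the lexer tentatively consumes the `.`, finds no digit after it, and rewinds so that `7` and `.` come out as separate tokens.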

### StreamLexer
StreamLexer behaves like Lexer but uses a buffer pool to read in chunks from `io.Reader`, retaining old buffers in memory that are still in use, and re-using old buffers otherwise. Calling `Free(n int)` frees up `n` bytes from the internal buffer(s). It holds an array of buffers so that everything still in use can be kept in memory. Calling `ShiftLen() int` returns the number of bytes that have been shifted since the previous call to `ShiftLen`, which can be used to specify how many bytes need to be freed up from the buffer. If you don't need to keep returned byte slices around, call `Free(ShiftLen())` after every `Shift` call.
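
The `ShiftLen` and `Free` bookkeeping can be modeled with a deliberately simplified sketch. The `window` type below is hypothetical: it uses a single buffer instead of a buffer pool and a `Fill` method instead of an `io.Reader`, so it only illustrates the accounting and why returned slices should be copied before freeing. It does not reflect how the package manages its buffers.

```go
package main

import "fmt"

// window is a hypothetical single-buffer model of the bookkeeping.
type window struct {
	buf     []byte // bytes currently retained in memory
	start   int    // start of the current token
	pos     int    // end position
	shifted int    // bytes shifted since the last ShiftLen call
}

// Fill appends a freshly read chunk, as if it came from an io.Reader.
func (w *window) Fill(chunk []byte) { w.buf = append(w.buf, chunk...) }

// Move advances the end position by n bytes.
func (w *window) Move(n int) { w.pos += n }

// Shift returns the current token and records its length.
func (w *window) Shift() []byte {
	b := w.buf[w.start:w.pos]
	w.shifted += len(b)
	w.start = w.pos
	return b
}

// ShiftLen reports the bytes shifted since the previous call and
// resets the counter.
func (w *window) ShiftLen() int {
	n := w.shifted
	w.shifted = 0
	return n
}

// Free releases up to n already-shifted bytes by compacting the buffer
// in place. Slices returned by earlier Shift calls may be overwritten.
func (w *window) Free(n int) {
	if n > w.start {
		n = w.start
	}
	w.buf = w.buf[:copy(w.buf, w.buf[n:])]
	w.start -= n
	w.pos -= n
}

func main() {
	w := &window{}
	w.Fill([]byte("alpha"))
	w.Move(5)
	tok := string(w.Shift()) // copy the token before its memory is reused
	w.Free(w.ShiftLen())

	w.Fill([]byte("beta"))
	w.Move(4)
	fmt.Println(tok, string(w.Shift()), len(w.buf)) // alpha beta 4
}
```

In this model the retained buffer never grows beyond the largest token, because each token's bytes are released as soon as it has been shifted and copied.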

## Strconv
This package contains string conversion functions, much like the standard library's `strconv` package, but it is specifically tailored to the performance needs of the `minify` package.

For example, the floating-point to string conversion function is approximately twice as fast as the standard library's, but it is not as precise.
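
One way conversion helpers of this kind can save time is by working directly on byte slices, which avoids building an intermediate `string`. The sketch below shows that idea for unsigned integers. `parseUint` is a hypothetical function written for this example, and the assumption that byte-slice input is where the speed comes from is ours, not a statement about the package's code.

```go
package main

import (
	"fmt"
	"math"
)

// parseUint is a hypothetical helper: it reads leading decimal digits
// from b and returns the value and the number of bytes consumed.
// It returns 0, 0 when there are no digits or the value overflows.
func parseUint(b []byte) (uint64, int) {
	var n uint64
	i := 0
	for i < len(b) && b[i] >= '0' && b[i] <= '9' {
		d := uint64(b[i] - '0')
		if n > (math.MaxUint64-d)/10 {
			return 0, 0 // n*10 + d would overflow uint64
		}
		n = n*10 + d
		i++
	}
	return n, i
}

func main() {
	v, n := parseUint([]byte("1280px"))
	fmt.Println(v, n) // 1280 4
}
```

Returning the consumed length lets a lexer continue right after the number, here at the `px` unit.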

## CSS
This package is a CSS3 lexer and parser. Both follow the specification at [CSS Syntax Module Level 3](http://www.w3.org/TR/css-syntax-3/). The lexer takes an `io.Reader` and converts it into tokens until the EOF. The parser returns a parse tree of the full `io.Reader` input stream, but the low-level `Next` function can be used for stream parsing to return grammar units until the EOF.

[See README here](https://github.com/tdewolff/parse/tree/master/css).

## HTML
This package is an HTML5 lexer. It follows the specification at [The HTML syntax](http://www.w3.org/TR/html5/syntax.html). The lexer takes an `io.Reader` and converts it into tokens until the EOF.

[See README here](https://github.com/tdewolff/parse/tree/master/html).

## JS
This package is a JS lexer (ECMA-262, edition 6.0). It follows the specification at [ECMAScript Language Specification](http://www.ecma-international.org/ecma-262/6.0/). The lexer takes an `io.Reader` and converts it into tokens until the EOF.

[See README here](https://github.com/tdewolff/parse/tree/master/js).

## JSON
This package is a JSON parser (ECMA-404). It follows the specification at [JSON](http://json.org/). The parser takes an `io.Reader` and converts it into tokens until the EOF.

[See README here](https://github.com/tdewolff/parse/tree/master/json).

## SVG
This package contains common hashes for SVG1.1 tags and attributes.

## XML
This package is an XML1.0 lexer. It follows the specification at [Extensible Markup Language (XML) 1.0 (Fifth Edition)](http://www.w3.org/TR/xml/). The lexer takes an `io.Reader` and converts it into tokens until the EOF.

[See README here](https://github.com/tdewolff/parse/tree/master/xml).

## License
Released under the [MIT license](LICENSE.md).

[1]: http://golang.org/ "Go Language"