| Name | Date | Size | #Lines | LOC |
|------|------|------|--------|-----|
| .. | 03-May-2022 | - | | |
| benches/ | 03-May-2022 | - | 165 | 119 |
| data/ | 03-May-2022 | - | 2,234 | 2,233 |
| examples/ | 03-May-2022 | - | 619 | 439 |
| html5lib-tests/ | 03-May-2022 | - | 75,935 | 71,484 |
| macros/ | 03-May-2022 | - | 483 | 302 |
| scripts/ | 03-May-2022 | - | 152 | 91 |
| src/ | 03-May-2022 | - | 7,358 | 5,508 |
| tests/ | 03-May-2022 | - | 865 | 676 |
| .cargo-checksum.json | 03-May-2022 | 89 | 1 | 1 |
| .gitignore | 24-Mar-2016 | 53 | 5 | 4 |
| .gitmodules | 27-Feb-2017 | 207 | 7 | 6 |
| .travis.yml | 24-Mar-2016 | 190 | 14 | 11 |
| AUTHORS | 24-Mar-2016 | 724 | 21 | 19 |
| COPYRIGHT | 24-Mar-2016 | 416 | 9 | 7 |
| Cargo.toml | 27-Feb-2017 | 1.1 KiB | 61 | 47 |
| LICENSE-APACHE | 24-Mar-2016 | 10.6 KiB | 202 | 169 |
| LICENSE-MIT | 24-Mar-2016 | 1.1 KiB | 26 | 22 |
| README.md | 17-Oct-2016 | 2.6 KiB | 54 | 28 |
| STRUCTURE.md | 26-Oct-2016 | 1.5 KiB | 26 | 13 |
| build.rs | 18-Oct-2016 | 2.6 KiB | 75 | 51 |

README.md

# html5ever

[![Build Status](https://travis-ci.org/servo/html5ever.svg?branch=master)](https://travis-ci.org/servo/html5ever)

[API Documentation][API documentation]

html5ever is an HTML parser developed as part of the [Servo](https://github.com/servo/servo) project.

It can parse and serialize HTML according to the [WHATWG](https://whatwg.org/) specs (aka "HTML5").  There are some omissions at present, most of which are documented [in the bug tracker](https://github.com/servo/html5ever/issues?q=is%3Aopen+is%3Aissue+label%3Aweb-compat).  html5ever passes all tokenizer tests from [html5lib-tests](https://github.com/html5lib/html5lib-tests), and most tree builder tests outside of the unimplemented features.  The goal is to pass all html5lib tests, and also provide all hooks needed by a production web browser, e.g. `document.write`.

Note that the HTML syntax is a language almost, but not quite, entirely unlike XML.  For correct parsing of XHTML, use an XML parser.  (That said, many XHTML documents in the wild are serialized in an HTML-compatible form.)

html5ever is written in [Rust](http://www.rust-lang.org/), so it avoids the most notorious security problems from C, but has performance similar to a parser written in C.  You can call html5ever as if it were a C library, without pulling in a garbage collector or other heavy runtime requirements.


## Getting started in Rust

Add html5ever as a dependency in your [`Cargo.toml`](http://crates.io/) file:

```toml
[dependencies]
# "*" accepts any published version; pin a specific release for reproducible builds.
html5ever = "*"
```

Then take a look at [`examples/html2html.rs`](https://github.com/servo/html5ever/blob/master/examples/html2html.rs), [`examples/print-rcdom.rs`](https://github.com/servo/html5ever/blob/master/examples/print-rcdom.rs), and the [API documentation][].

## Getting started in other languages

Bindings for Python and other languages are much desired.


## Working on html5ever

To fetch the test suite, run:

```
git submodule update --init
```

Run `cargo doc` in the repository root to build local documentation under `target/doc/`.


## Details

html5ever uses callbacks to manipulate the DOM, so it works with your choice of DOM representation.  A simple reference-counted DOM is included.

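To illustrate the callback idea, here is a minimal, std-only sketch.  It is not html5ever's actual API: the `TreeCallbacks` trait, the `RcTree` type, and the `build_paragraph` driver are hypothetical names invented for this example, and the real trait in the crate is richer.  The point is only that the driver never touches the node type directly, so any DOM that implements the callbacks can be plugged in.  Calling `demo()` builds a one-paragraph tree and renders it back to a string.

```rust
use std::cell::RefCell;
use std::rc::Rc;

// Hypothetical callback interface (NOT html5ever's real trait).
// The driver only sees an opaque `Handle`, so the DOM type is the caller's choice.
trait TreeCallbacks {
    type Handle: Clone;
    fn document(&self) -> Self::Handle;
    fn create_element(&mut self, name: &str) -> Self::Handle;
    fn append_text(&mut self, parent: &Self::Handle, text: &str);
    fn append(&mut self, parent: &Self::Handle, child: Self::Handle);
}

// A toy reference-counted DOM node.
struct Node {
    name: String, // "#document", "#text", or an element name
    text: String, // only used by "#text" nodes
    children: Vec<NodeRef>,
}
type NodeRef = Rc<RefCell<Node>>;

fn new_node(name: &str, text: &str) -> NodeRef {
    Rc::new(RefCell::new(Node {
        name: name.to_string(),
        text: text.to_string(),
        children: Vec::new(),
    }))
}

struct RcTree {
    root: NodeRef,
}

impl RcTree {
    fn new() -> Self {
        RcTree { root: new_node("#document", "") }
    }
}

impl TreeCallbacks for RcTree {
    type Handle = NodeRef;

    fn document(&self) -> NodeRef {
        Rc::clone(&self.root)
    }
    fn create_element(&mut self, name: &str) -> NodeRef {
        new_node(name, "")
    }
    fn append_text(&mut self, parent: &NodeRef, text: &str) {
        parent.borrow_mut().children.push(new_node("#text", text));
    }
    fn append(&mut self, parent: &NodeRef, child: NodeRef) {
        parent.borrow_mut().children.push(child);
    }
}

// Stand-in for a parser: issues the callbacks a parser might issue for `<p>text</p>`.
fn build_paragraph<S: TreeCallbacks>(sink: &mut S, text: &str) {
    let doc = sink.document();
    let p = sink.create_element("p");
    sink.append_text(&p, text);
    sink.append(&doc, p);
}

// Serialize the toy DOM back to markup (no escaping; illustration only).
fn render(node: &NodeRef) -> String {
    let n = node.borrow();
    let inner = n.children.iter().map(render).collect::<String>();
    match n.name.as_str() {
        "#text" => n.text.clone(),
        "#document" => inner,
        name => format!("<{}>{}</{}>", name, inner, name),
    }
}

// Usage: build a tree through the callbacks, then render it.
fn demo() -> String {
    let mut tree = RcTree::new();
    build_paragraph(&mut tree, "hi");
    render(&tree.root)
}
```
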
html5ever exclusively uses UTF-8 to represent strings.  In the future it will support other document encodings (and UCS-2 `document.write`) by converting input.

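Until such support exists, a caller holding non-UTF-8 input has to convert it before parsing.  The sketch below shows one possible way to do that for UTF-16 data using only the Rust standard library.  The function names are hypothetical and this is not html5ever's planned design; it simply assumes that replacing unpaired surrogates with U+FFFD is acceptable for the caller.

```rust
/// Convert UTF-16 code units (e.g. a script-supplied string) to UTF-8.
/// Unpaired surrogates are replaced with U+FFFD rather than causing an error.
fn utf16_to_utf8(units: &[u16]) -> String {
    String::from_utf16_lossy(units)
}

/// Convert little-endian UTF-16 bytes to UTF-8.
/// A trailing odd byte, if any, is ignored by `chunks_exact`.
fn utf16le_bytes_to_utf8(bytes: &[u8]) -> String {
    let units: Vec<u16> = bytes
        .chunks_exact(2)
        .map(|pair| u16::from_le_bytes([pair[0], pair[1]]))
        .collect();
    String::from_utf16_lossy(&units)
}
```
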
The code is cross-referenced with the WHATWG syntax spec, and eventually we will have a way to present code and spec side-by-side.

html5ever builds against the official stable releases of Rust, though some optimizations are only supported on nightly releases.

[API documentation]: http://doc.servo.org/html5ever/index.html
