| Name | Date | Size | #Lines | LOC |
| --- | --- | --- | --- | --- |
| .. | 03-May-2022 | - | | |
| .github/workflows/ | 21-Jul-2021 | - | 85 | 72 |
| c_src/ | 21-Jul-2021 | - | 1,364 | 1,105 |
| include/ | 21-Jul-2021 | - | 90 | 35 |
| lib/ | 21-Jul-2021 | - | 45 | 33 |
| priv/lib/ | 21-Jul-2021 | - | 2 | 1 |
| spec/ | 21-Jul-2021 | - | 171 | 113 |
| src/ | 03-May-2022 | - | 6,388 | 5,684 |
| test/ | 21-Jul-2021 | - | 1,228 | 1,087 |
| CHANGELOG.md | 21-Jul-2021 | 5.2 KiB | 230 | 136 |
| CODE_OF_CONDUCT.md | 21-Jul-2021 | 3.3 KiB | 77 | 57 |
| CONTRIBUTING.md | 21-Jul-2021 | 5.9 KiB | 140 | 97 |
| Makefile | 21-Jul-2021 | 665 | 35 | 23 |
| Makefile.mix | 21-Jul-2021 | 1.2 KiB | 54 | 33 |
| README.md | 21-Jul-2021 | 5 KiB | 179 | 120 |
| configure | 21-Jul-2021 | 141 KiB | 5,003 | 4,160 |
| configure.ac | 21-Jul-2021 | 1.1 KiB | 50 | 36 |
| mix.exs | 21-Jul-2021 | 1.7 KiB | 53 | 47 |
| mix.lock | 21-Jul-2021 | 725 | 6 | 5 |
| rebar.config | 03-May-2022 | 2.2 KiB | 55 | 46 |
| rebar.config.script | 21-Jul-2021 | 5.2 KiB | 162 | 145 |
| vars.config.in | 21-Jul-2021 | 160 | 9 | 7 |

README.md

# Erlang and Elixir XML Parsing

[![CI](https://github.com/processone/fast_xml/actions/workflows/ci.yml/badge.svg?branch=master)](https://github.com/processone/fast_xml/actions/workflows/ci.yml)
[![Coverage Status](https://coveralls.io/repos/processone/fast_xml/badge.svg?branch=master&service=github)](https://coveralls.io/github/processone/fast_xml?branch=master)
[![Hex version](https://img.shields.io/hexpm/v/fast_xml.svg "Hex version")](https://hex.pm/packages/fast_xml)

Fast Expat-based Erlang XML parsing and manipulation library, with a
strong focus on parsing XML streams from the network.

It supports:

- Full XML structure parsing: suitable for small but complete XML chunks.
- XML stream parsing: suitable for large XML documents or unbounded
  network XML streams such as XMPP.

This module can parse files much faster than the built-in `xmerl` module.
Depending on file complexity and size, `fxml_stream:parse_element/1` can
be 8-18 times faster than calling `xmerl_scan:string/2`.

This application was previously called
[p1_xml](https://github.com/processone/xml) and was renamed after
major optimisations to emphasise that it is damn fast.

## Building

Erlang XML parser can be built as follows:

    ./configure && make

Erlang XML parser is a rebar-compatible OTP
application. Alternatively, you can build it with rebar:

    rebar compile

## Dependencies

Erlang XML parser depends on the Expat XML parser. You need the
development headers for the Expat library to build it.

You can use `configure` options to pass custom paths to the Expat libraries and headers:

    --with-expat=[ARG]      use Expat XML Parser from given prefix (ARG=path);
                            check standard prefixes (ARG=yes); disable (ARG=no)
    --with-expat-inc=[DIR]  path to Expat XML Parser headers
    --with-expat-lib=[ARG]  link options for Expat XML Parser libraries

## xmlel record and types

XML elements are provided as Erlang xmlel records.

The record format allows defining a simple tree-like
structure. The xmlel record has the following fields:

- name     :: binary()
- attrs    :: [attr()]
- children :: [xmlel() | cdata()]

The cdata type is a tuple of the form:

    {xmlcdata, CData::binary()}

The attr type is a tuple of the form:

    {Name::binary(), Value::binary()}

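As an illustration of the tree shape described above, here is a minimal sketch that builds an element by hand and reads it back with pattern matching. The module name `xmlel_sketch` and its functions are hypothetical, and the record is redeclared locally only to keep the example self-contained; in a real project the definition would normally come from the library's own header.

```erlang
-module(xmlel_sketch).
-export([example/0, text/1, attr/2]).

%% Local copy of the record shape described in this section. This is an
%% assumption made for a standalone example, not the library's header.
-record(xmlel, {name = <<>>, attrs = [], children = []}).

%% Build <message to="alice"><body>hello</body></message> as nested records.
example() ->
    #xmlel{name = <<"message">>,
           attrs = [{<<"to">>, <<"alice">>}],
           children = [#xmlel{name = <<"body">>,
                              children = [{xmlcdata, <<"hello">>}]}]}.

%% Concatenate the character data found directly under an element,
%% ignoring child elements.
text(#xmlel{children = Children}) ->
    iolist_to_binary([Data || {xmlcdata, Data} <- Children]).

%% Look up an attribute value by name; returns `undefined` when absent.
attr(Name, #xmlel{attrs = Attrs}) ->
    case lists:keyfind(Name, 1, Attrs) of
        {Name, Value} -> Value;
        false -> undefined
    end.
```

Because a record is a tagged tuple at runtime, the value returned by `example/0` has the same `{xmlel, Name, Attrs, Children}` shape that appears in the parser output later in this document.
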
## XML full structure parsing

You can parse a complete XML structure with `fast_xml`:

```shell
$ erl -pa ebin
Erlang/OTP 17 [erts-6.3] [source] [64-bit] [smp:4:4] [async-threads:10] [hipe] [kernel-poll:false] [dtrace]

Eshell V6.3  (abort with ^G)
1> application:start(fast_xml).
ok
2> rr(fxml).
[xmlel]
3> fxml_stream:parse_element(<<"<test>content cdata</test>">>).
#xmlel{name = <<"test">>,attrs = [],
       children = [{xmlcdata,<<"content cdata">>}]}
```

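In application code it can be convenient to wrap this call so that callers always get a tagged result. The following is a sketch only: the module name `parse_sketch` is hypothetical, and the `{error, Reason}` return for malformed input is an assumption about `fxml_stream:parse_element/1` that should be checked against the library's type specs.

```erlang
-module(parse_sketch).
-export([parse/1]).

%% Assumes the fast_xml application is started and on the code path.
%% The element is matched as a plain tagged tuple so no header is needed.
parse(Bin) when is_binary(Bin) ->
    case fxml_stream:parse_element(Bin) of
        {xmlel, _Name, _Attrs, _Children} = El ->
            {ok, El};
        {error, Reason} ->
            %% Assumption: malformed input yields an error tuple.
            {error, Reason};
        Other ->
            %% Defensive fallback for any return shape not covered above.
            {error, {unexpected, Other}}
    end.
```

With the input from the session above, `parse_sketch:parse(<<"">>)` would wrap the same element in an `{ok, ...}` tuple.

<test>
{ok, _} = application:ensure_all_started(fast_xml),
{ok, {xmlel, <<"test">>, [], [{xmlcdata, <<"content cdata">>}]}} =
    parse_sketch:parse(<<"<test>content cdata</test>">>).
</test>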
## XML Stream parsing example

You can also parse a continuous stream. Our design makes it easy to
decouple the process receiving the raw XML to parse from the
process receiving the parsed content.

The workflow is as follows:

    state = new(CallbackPID); parse(state, data); parse(state, moredata); ...

and the parsed XML fragments (stanzas) are sent to CallbackPID.

This approach leaves you a lot of flexibility in how you architect
your own application.

Here is an example of XML stream parsing:

```shell
$ erl -pa ebin
Erlang/OTP 17 [erts-6.3] [source] [64-bit] [smp:4:4] [async-threads:10] [hipe] [kernel-poll:false] [dtrace]

Eshell V6.3  (abort with ^G)

% Start the application:
1> application:start(fast_xml).
ok

% Create a new stream, using the shell's own PID to receive XML parsing events:
2> S1 = fxml_stream:new(self()).
<<>>

% Start feeding content to the XML parser.
3> S2 = fxml_stream:parse(S1, <<"<root>">>).
<<>>

% Receive the Erlang message sent to the shell process:
4> flush().
Shell got {'$gen_event',{xmlstreamstart,<<"root">>,[]}}
ok

% Feed more content:
5> S3 = fxml_stream:parse(S2, <<"<xmlelement>content cdata</xmlelement>">>).
<<>>

% Receive more messages:
6> flush().
Shell got {'$gen_event',
              {xmlstreamelement,
                  {xmlel,<<"xmlelement">>,[],
                      [{xmlcdata,<<"content cdata">>}]}}}
ok

% Feed more content:
7> S4 = fxml_stream:parse(S3, <<"</root>">>).
<<>>

% Receive messages:
8> flush().
Shell got {'$gen_event',{xmlstreamend,<<"root">>}}
ok

9> fxml_stream:close(S4).
true
```

Note how the root element is important. The root element serves as
the boundary that produces the stream start and stream end
events. Lower-level tags are then delivered as stream elements.

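The same session can be sketched as a small module that feeds a list of chunks to the parser and then collects the `'$gen_event'` messages from the caller's mailbox. The module name `stream_sketch` and the 100 ms collection timeout are arbitrary choices for illustration; the sketch assumes the `fast_xml` application is already started and that events are delivered to the calling process as in the transcript above.

```erlang
-module(stream_sketch).
-export([parse_chunks/1]).

%% Feed binary chunks to a stream parser whose callback is the calling
%% process, close the stream, then return the events in arrival order.
parse_chunks(Chunks) ->
    S0 = fxml_stream:new(self()),
    %% parse/2 returns the updated state, so thread it through the fold.
    S = lists:foldl(fun(Chunk, State) -> fxml_stream:parse(State, Chunk) end,
                    S0, Chunks),
    fxml_stream:close(S),
    collect([]).

%% Drain '$gen_event' messages until none arrives for 100 ms.
collect(Acc) ->
    receive
        {'$gen_event', Event} -> collect([Event | Acc])
    after 100 ->
        lists:reverse(Acc)
    end.
```

Calling `stream_sketch:parse_chunks([<<"<root>">>, <<"<xmlelement>content cdata</xmlelement>">>, <<"</root>">>])` should return the three events seen in the transcript: an `xmlstreamstart` tuple, an `xmlstreamelement` tuple and an `xmlstreamend` tuple. In a real application the collecting loop would more likely live in a separate process, which is the decoupling the design is meant to allow.
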
## How does this module relate to exmpp?

This module is a low-level, fast XML parser. It is not an XMPP client
library like [exmpp](https://processone.github.io/exmpp/).

## References

This module is used at large scale for parsing massive XML content in
the [ejabberd](https://www.ejabberd.im) XMPP server project. It is used in
production in thousands of real-life deployments.

## Development

### Test

#### Unit test

You can run the eunit tests with the command:

    $ rebar eunit

#### Elixir / Quickcheck test

You can run the tests written with Elixir / Quickcheck using the mix command:

    MIX_EXS=test/elixir/mix.exs mix test
