README
NAME
    XML::Fast - Simple and very fast XML - hash conversion

SYNOPSIS
        use XML::Fast;

        my $hash = xml2hash $xml;
        my $hash2 = xml2hash $xml, attr => '.', text => '~';

DESCRIPTION
    This module implements a simple, state-machine-based XML parser written
    in C.

    It can parse, and recover from, some kinds of broken XML. If you need an
    XML validator, use XML::LibXML.

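    As a rough illustration of the conversion, the sketch below parses a
    small document and reads values out of the resulting hash. The sample
    XML and its element names are made up for the example; the attr and
    text options are the ones shown in the SYNOPSIS.

```perl
use strict;
use warnings;
use XML::Fast;

# Hypothetical input; any well-formed XML string would do.
my $xml = '<root><item id="1">one</item></root>';

# Store attributes under a '.' prefix and text under the '~' key.
my $hash = xml2hash $xml, attr => '.', text => '~';

# Attributes and text of <item> end up as keys of the same hash.
my $item = $hash->{root}{item};
print "id:   $item->{'.id'}\n";
print "text: $item->{'~'}\n";
```

    Because attributes and text share one hash, choosing a prefix and a
    text key that cannot collide with element names keeps lookups
    unambiguous.
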
RATIONALE
    Another similar module is XML::Bare. I used it for some time, but it has
    some shortcomings:

    * If your XML has a node with a TextNode, then a CDATANode, then another
      TextNode, you get a broken value.

    * It doesn't support charsets.

    * It doesn't support any kind of entities.

    So, after a number of attempts to fix XML::Bare, I decided to write a
    parser from scratch.

    Here are some features and principles:

    * It uses a minimal number of memory allocations.

    * All XML is parsed in a single scan.

    * All values are copied from the source XML only once (to the
      destination keys/values).

    * If some types of nodes (for example, comments) are ignored, no memory
      is allocated or copied for them.

    I've removed the benchmark results, since they vary widely between
    different XML documents. Sometimes XML::Bare is faster, sometimes not.
    So XML::Fast should mainly be considered not "faster-than-bare" but
    "format-other-than-bare".

EXPORT
    xml2hash $xml, [ %options ]
    hash2xml $hash, [ %options ]

OPTIONS
    order [ = 0 ]
        Not implemented yet. Strictly keep the output order. When enabled,
        structures become more complex, but the XML can be completely
        reconstructed.

    attr [ = '-' ]
        Attribute prefix

            <node attr="test" /> => { node => { -attr => "test" } }

    text [ = '#text' ]
        Key name for storing text

        When undef, text nodes will be ignored

            <node>text<sub /></node> => { node => { sub => '', '#text' => "text" } }

    join [ = '' ]
        Join separator for text nodes that are split by subnodes

        Ignored when "order" is in effect

            # default:
            xml2hash( '<item>Test1<sub />Test2</item>' )
            : { item => { sub => '', '#text' => 'Test1Test2' } };

            xml2hash( '<item>Test1<sub />Test2</item>', join => '+' )
            : { item => { sub => '', '#text' => 'Test1+Test2' } };

    trim [ = 1 ]
        Trim leading and trailing whitespace from text nodes

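        A minimal sketch of the trim option, assuming a made-up <node>
        element with padded text. With the default the value is expected
        to come back without the surrounding spaces; with trim => 0 the
        whitespace should be left in place.

```perl
use strict;
use warnings;
use XML::Fast;

# Hypothetical input with whitespace around the text.
my $xml = '<node>  padded  </node>';

my $trimmed = xml2hash $xml;             # trim defaults to 1
my $raw     = xml2hash $xml, trim => 0;  # keep whitespace as written

# Brackets make any leading or trailing whitespace visible.
printf "trimmed: [%s]\n", $trimmed->{node};
printf "raw:     [%s]\n", $raw->{node};
```
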
    cdata [ = undef ]
        When defined, CDATA sections will be stored under this key

            # cdata = undef
            <node><![CDATA[ test ]]></node> => { node => 'test' }

            # cdata = '#'
            <node><![CDATA[ test ]]></node> => { node => { '#' => 'test' } }

    comm [ = undef ]
        When defined, comments will be stored under this key

        When undef, comments will be ignored

            # comm = undef
            <node><!-- comm --><sub/></node> => { node => { sub => '' } }

            # comm = '/'
            <node><!-- comm --><sub/></node> => { node => { sub => '', '/' => 'comm' } }

    array => 1
        Force all nodes to be kept as arrays.

            # no array
            <node><sub/></node> => { node => { sub => '' } }

            # array = 1
            <node><sub/></node> => { node => [ { sub => [ '' ] } ] }

    array => [ 'node', 'names' ]
        Force nodes with the listed names to be stored as arrays

            # no array
            <node><sub/></node> => { node => { sub => '' } }

            # array => ['sub']
            <node><sub/></node> => { node => { sub => [ '' ] } }

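        One place where forcing arrays by name seems useful is iteration:
        the calling code can loop over a node without first checking
        whether it occurred once or several times. The sketch below
        assumes made-up <list> and <item> elements.

```perl
use strict;
use warnings;
use XML::Fast;

# Hypothetical inputs: one <item>, then two.
my @docs = (
    '<list><item>a</item></list>',
    '<list><item>a</item><item>b</item></list>',
);

for my $xml (@docs) {
    # Ask for <item> to be an array reference in every case.
    my $hash  = xml2hash $xml, array => ['item'];
    my @items = @{ $hash->{list}{item} };
    print scalar(@items), " item(s): @items\n";
}
```
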
    utf8decode => 1
        Force decoding of UTF-8 sequences, instead of just upgrading them
        (may be useful for broken XML)

SEE ALSO
    * XML::Bare

      Another fast parser

    * XML::LibXML

      The most powerful XML parser for Perl, as long as you don't need to
      parse gigabytes of XML ;)

    * XML::Hash::LX

      An XML parser that uses XML::LibXML for parsing and then constructs a
      hash structure identical to the one generated by this module (at
      least, it should ;)). It is, of course, much slower than XML::Fast.

LIMITATIONS
    * Does not support wide charsets (UTF-16/32) (see RT71534
      <https://rt.cpan.org/Ticket/Display.html?id=71534>)

TODO
    * Ordered mode (as implemented in XML::Hash::LX)

    * Create hash2xml, identical to the one in XML::Hash::LX

    * Partial content event-based parsing (I need this for reading XML
      streams)

    Patches, suggestions and bug reports are welcome ;)

AUTHOR
    Mons Anderson, <mons@cpan.org>

COPYRIGHT AND LICENSE
    Copyright (C) 2010 Mons Anderson

    This library is free software; you can redistribute it and/or modify it
    under the same terms as Perl itself.

166