<?xml version="1.0" encoding="ISO-8859-1" ?>
<!DOCTYPE package SYSTEM "http://pear.php.net/dtd/package-1.0">
<package version="1.0">
  <name>XML_HTMLSax</name>
  <summary>A SAX-based parser for HTML and other badly formed XML documents</summary>
  <description>XML_HTMLSax is a SAX-based XML parser for badly formed XML documents, such as HTML.
  The original code base was developed by Alexander Zhukov and published at http://sourceforge.net/projects/phpshelve/. Alexander kindly gave permission to modify the code and license for inclusion in PEAR.

  PEAR::XML_HTMLSax provides an API very similar to the native PHP Expat extension, allowing handlers written for one to be easily adapted to the other. The key difference is that HTMLSax will not break on badly formed XML, allowing it to be used for parsing HTML documents. Otherwise HTMLSax supports all the handlers available from Expat except the namespace and external entity handlers. It also provides methods for handling XML escapes as well as JSP/ASP opening and closing tags.

  Version 2 has had its internals completely overhauled to use a Lexer, delivering performance *approaching* that of the native XML extension, as well as a radically improved, modular design that makes adding further functionality easy.

  The public API remains the same as in older versions, except for the set_option() method, whose available options have been renamed. Additional options are now also available, which allow HTMLSax to behave almost exactly like the native Expat extension. For example, if the contents of XML elements contain linefeeds, tabs and XML entities, HTMLSax can be instructed to trigger additional data handler calls.

  A big thanks to Jeff Moore (lead developer of WACT: http://wact.sourceforge.net), who is largely responsible for the new design, as well as for input from other members at Sitepoint's Advanced PHP forums: http://www.sitepointforums.com/showthread.php?threadid=121246.

  Thanks also to Marcus Baker (lead developer of SimpleTest: http://www.lastcraft.com/simple_test.php) for sorting out the unit tests.</description>
  <maintainers>
    <maintainer>
      <user>hfuecks</user>
      <name>Harry Fuecks</name>
      <email>hfuecks@phppatterns.com</email>
      <role>lead</role>
    </maintainer>
  </maintainers>
  <release>
    <version>2.1.2</version>
    <date>2003-12-05</date>
    <license>PHP</license>
    <state>stable</state>
    <notes>* Bug fixed (thanks Jeff) where badly formed attributes resulted in an infinite loop
* Added additional boolean argument to open and close handler calls to spot empty tags like br/ - should not break existing APIs
* Added XML_OPTION_FULL_ESCAPES which (when = 1) passes through the complete content in an XML escape, allowing comment / cdata reconstruction</notes>
    <deps>
      <dep type="php" rel="ge" version="4.0.5"/>
    </deps>
    <provides type="class" name="XML_HTMLSax_StateParser" />
    <provides type="class" name="XML_HTMLSax_StateParser_Lt430" extends="XML_HTMLSax_StateParser" />
    <provides type="class" name="XML_HTMLSax_StateParser_Gtet430" extends="XML_HTMLSax_StateParser" />
    <provides type="class" name="XML_HTMLSax_NullHandler" />
    <provides type="class" name="XML_HTMLSax" extends="Pear" />
    <provides type="function" name="XML_HTMLSax_StateParser::unscanCharacter" />
    <provides type="function" name="XML_HTMLSax_StateParser::ignoreCharacter" />
    <provides type="function" name="XML_HTMLSax_StateParser::scanCharacter" />
    <provides type="function" name="XML_HTMLSax_StateParser::scanUntilString" />
    <provides type="function" name="XML_HTMLSax_StateParser::scanUntilCharacters" />
    <provides type="function" name="XML_HTMLSax_StateParser::ignoreWhitespace" />
    <provides type="function" name="XML_HTMLSax_StateParser::parse" />
    <provides type="function" name="XML_HTMLSax_StateParser_Lt430::scanUntilCharacters" />
    <provides type="function" name="XML_HTMLSax_StateParser_Lt430::ignoreWhitespace" />
    <provides type="function" name="XML_HTMLSax_StateParser_Lt430::parse" />
    <provides type="function" name="XML_HTMLSax_StateParser_Gtet430::scanUntilCharacters" />
    <provides type="function" name="XML_HTMLSax_StateParser_Gtet430::ignoreWhitespace" />
    <provides type="function" name="XML_HTMLSax_StateParser_Gtet430::parse" />
    <provides type="function" name="XML_HTMLSax_NullHandler::DoNothing" />
    <provides type="function" name="XML_HTMLSax::set_object" />
    <provides type="function" name="XML_HTMLSax::set_option" />
    <provides type="function" name="XML_HTMLSax::set_data_handler" />
    <provides type="function" name="XML_HTMLSax::set_element_handler" />
    <provides type="function" name="XML_HTMLSax::set_pi_handler" />
    <provides type="function" name="XML_HTMLSax::set_escape_handler" />
    <provides type="function" name="XML_HTMLSax::set_jasp_handler" />
    <provides type="function" name="XML_HTMLSax::get_current_position" />
    <provides type="function" name="XML_HTMLSax::get_length" />
    <provides type="function" name="XML_HTMLSax::parse" />
    <provides type="class" name="XML_HTMLSax_StartingState" />
    <provides type="class" name="XML_HTMLSax_TagState" />
    <provides type="class" name="XML_HTMLSax_ClosingTagState" />
    <provides type="class" name="XML_HTMLSax_OpeningTagState" />
    <provides type="class" name="XML_HTMLSax_EscapeState" />
    <provides type="class" name="XML_HTMLSax_JaspState" />
    <provides type="class" name="XML_HTMLSax_PiState" />
    <provides type="function" name="XML_HTMLSax_StartingState::parse" />
    <provides type="function" name="XML_HTMLSax_TagState::parse" />
    <provides type="function" name="XML_HTMLSax_ClosingTagState::parse" />
    <provides type="function" name="XML_HTMLSax_OpeningTagState::parseAttributes" />
    <provides type="function" name="XML_HTMLSax_OpeningTagState::parse" />
    <provides type="function" name="XML_HTMLSax_EscapeState::parse" />
    <provides type="function" name="XML_HTMLSax_JaspState::parse" />
    <provides type="function" name="XML_HTMLSax_PiState::parse" />
    <provides type="class" name="XML_HTMLSax_Trim" />
    <provides type="class" name="XML_HTMLSax_CaseFolding" />
    <provides type="class" name="XML_HTMLSax_Linefeed" />
    <provides type="class" name="XML_HTMLSax_Tab" />
    <provides type="class" name="XML_HTMLSax_Entities_Parsed" />
    <provides type="class" name="XML_HTMLSax_Entities_Unparsed" />
    <provides type="function" name="XML_HTMLSax_Trim::trimData" />
    <provides type="function" name="XML_HTMLSax_CaseFolding::foldOpen" />
    <provides type="function" name="XML_HTMLSax_CaseFolding::foldClose" />
    <provides type="function" name="XML_HTMLSax_Linefeed::breakData" />
    <provides type="function" name="XML_HTMLSax_Tab::breakData" />
    <provides type="function" name="XML_HTMLSax_Entities_Parsed::breakData" />
    <provides type="function" name="XML_HTMLSax_Entities_Unparsed::breakData" />
    <provides type="function" name="html_entity_decode" />
    <filelist>
      <file role="php" baseinstalldir="XML" md5sum="4646f0e3b0b6cb1af1f8d2f0eb558fcc" name="XML_HTMLSax.php"/>
      <file role="php" baseinstalldir="XML" md5sum="04bd2e034cfa78902c883103549d952b" name="HTMLSax/XML_HTMLSax_States.php"/>
      <file role="php" baseinstalldir="XML" md5sum="3bf6c70e6e4a3692f0833cdb4e6c077b" name="HTMLSax/XML_HTMLSax_Decorators.php"/>
      <file role="doc" baseinstalldir="XML" md5sum="fa5e91af821291a1bd3f90ce2c8557a4" name="docs/Readme"/>
      <file role="doc" baseinstalldir="XML" md5sum="212961f0b0437c92ce65128bf1e33740" name="docs/examples/SimpleExample.php"/>
      <file role="doc" baseinstalldir="XML" md5sum="ee798189d1ff9b1f614ab4f13c916cc4" name="docs/examples/HTMLtoXHTML.php"/>
      <file role="doc" baseinstalldir="XML" md5sum="23f122dad1412ef196b880c9b646662c" name="docs/examples/ExpatvsHtmlSax.php"/>
      <file role="doc" baseinstalldir="XML" md5sum="6d8e0358d7581138624843192f29b1fc" name="docs/examples/example.html"/>
      <file role="doc" baseinstalldir="XML" md5sum="90aaba50fabb9de12d0b4664f008d2dd" name="docs/tests/index.php"/>
      <file role="doc" baseinstalldir="XML" md5sum="341998c9086a1196e2e8fcbbe2d0c9f1" name="docs/tests/unit_tests.php"/>
      <file role="doc" baseinstalldir="XML" md5sum="db45e0a797ffece8914464c8e24b75a3" name="docs/tests/xml_htmlsax_test.php"/>
    </filelist>
  </release>
  <changelog>
    <release>
      <version>2.1.1</version>
      <date>2003-10-08</date>
      <license>PHP</license>
      <state>stable</state>
      <notes>* Reporting of byte index with get_current_position() is more accurate on opening tags (thanks to Alexander Orlov at x-code.com)
* All parser options are now available to PHP versions lt 4.3.x, using an implementation of html_entity_decode written in PHP

</notes>
    </release>
    <release>
      <version>2.1.0</version>
      <date>2003-09-10</date>
      <license>PHP</license>
      <state>stable</state>
      <notes>* Well (unit) tested with SimpleTest

</notes>
    </release>
    <release>
      <version>2.0.2</version>
      <date>2003-08-11</date>
      <license>PHP</license>
      <state>alpha</state>
      <notes>* API is backwards compatible apart from the renaming of parser options
* Performance dramatically increased. Not much slower than Expat
* Better handling of XML comments and CDATA
* Option to trigger additional data handler calls for linefeeds and tabs
* Option to trigger additional data handler calls for XML entities and parse them if required.
* Added public get_current_position() and get_length() methods

</notes>
    </release>
    <release>
      <version>1.1</version>
      <date>2003-06-26</date>
      <license>PHP</license>
      <state>stable</state>
      <notes>* Bug fixes to Attribute_Parser to cope with newline, tag, forward slash and whitespace issues.
</notes>
    </release>
    <release>
      <version>1.0</version>
      <date>2003-06-08</date>
      <state>stable</state>
      <notes>* Modifications to file structure to place Attributes_Parser.php
  and State_Machine.php in subdirectory HTMLSax
* XML_HTMLSax.php includes Attributes_Parser.php and State_Machine.php
  using require_once()

</notes>
    </release>
    <release>
      <version>0.9.0rc2</version>
      <date>2003-05-18</date>
      <state>beta</state>
      <notes>* First release under PEAR
* Changed package name to XML_HTMLSax
* Added patch from John Luxford to parse single quoted attributes
* Modified State_Machine to be a simple variable store

</notes>
    </release>
    <release>
      <version>0.9.0rc1</version>
      <date>2003-05-09</date>
      <state>beta</state>
      <notes>A summary of the main differences between this version
      of HTML_Sax and HTMLSax2002082201 is as follows:
      * Instead of extending HTMLSax with your own &quot;handlers&quot; class,
        you now use the set_object() method to pass an instance of the
        class to HTMLSax.
      * Class method callbacks are specified using the following methods:
      * set_element_handler('startHandler','endHandler') for &lt;tag&gt; and &lt;/tag&gt;
      * set_data_handler('dataHandler') for contents of an element
      * set_pi_handler('piHandler') for &lt;?php ?&gt;, &lt;?xml ?&gt; etc.
      * set_escape_handler() for anything beginning with &lt;!
      * set_jasp_handler() - set listener for &lt;% %&gt; tags
      * Attributes which have no value are created and set to true
      * Comments are handled and may contain entities: &lt; &gt;
      * The callback handlers will all be passed an instance of HTMLSax
        in the same way as the native PHP XML Expat extension
      * Setting of parser options is handled specifically by the set_option()
        method. Available options are:
      * skipWhiteSpace: instruct the parser to ignore whitespace characters
      * trimDataNodes: trim whitespace inside character data
      * breakOnNewLine: newline characters found in character data are treated
        as new events triggering another data callback
      * caseFolding: converts element names to uppercase

</notes>
    </release>
  </changelog>
</package>