README

THIS VERSION BREAKS BACKWARDS COMPATIBILITY
It is very similar and should require only minor changes,
but it is not a drop-in upgrade.

parseXML is now parseRSS

Okay, apparently a few people actually use this, so I should probably release
a new version. I've actually had this done for almost a year now, but I wanted
to write a real test suite for it first. Maybe next time. Instead, all I can
say is that I've been *using* it for the last year without a problem. When
making a bug report, please provide me with a copy of any document which it
chokes on.

=============================================================================

XML::RSSLite is meant as a relaxed parser+,* and lightweight+,++
replacement for XML::RSS. In fact, it contains a generic lightweight
XML pseudo-parser** that can be used for other content.

For RSS/RDF/weblog/Scripting News content, parseRSS does the following:

 o Remove HTML tags to leave plain text

 o Remove characters other than 0-9~!@#$%^&*()-+=a-zA-Z[];',.:"<>?\s

 o Use <url> tags when <link> is empty

 o Use misplaced URLs in <title> when <link> is empty

 o Extract links from <a href=...> if required

 o Limit links to ftp and http

 o Join relative URLs to the site base
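
To make the list above concrete, here is a rough sketch of most of those
steps in plain Perl. It is not the module's actual code: plain_text,
pick_link and tidy_item are made-up names, the fallback order simply follows
the list, and rejecting https is an assumption based on the list naming only
ftp and http. Joining relative URLs to the site base is left out; the URI
module's new_abs would be one way to do that.

```perl
use strict;
use warnings;

# Hypothetical helpers for illustration only; XML::RSSLite does not export them.

# Strip HTML tags, then drop every character outside the whitelist above.
sub plain_text {
    my ($text) = @_;
    $text = '' unless defined $text;
    $text =~ s/<[^>]*>//g;
    $text =~ s/[^0-9~!\@#\$%^&*()\-+=a-zA-Z\[\];',.:"<>?\s]//g;
    return $text;
}

# Choose a link for one item, trying each fallback in turn.
sub pick_link {
    my ($item) = @_;
    my $link = defined $item->{link} ? $item->{link} : '';

    # Fall back to <url>, then to a URL sitting in <title>.
    $link = $item->{url} if $link eq '' and defined $item->{url};
    if ($link eq '' and defined $item->{title}
        and $item->{title} =~ m{((?:ftp|http)://[^\s<>"']+)}i) {
        $link = $1;
    }

    # As a last resort, take the first <a href=...> in the description.
    if ($link eq '' and defined $item->{description}
        and $item->{description} =~ /<a\s[^>]*?href\s*=\s*["']?([^"'\s>]+)/i) {
        $link = $1;
    }

    # Keep ftp:// and http:// links only (assumption: https is rejected too).
    return $link =~ m{^(?:ftp|http)://}i ? $link : '';
}

# Return a cleaned copy of one item hash; the link is chosen from the raw
# fields before the text is stripped, so <a href=...> is still visible.
sub tidy_item {
    my ($item) = @_;
    my %clean = (
        title       => plain_text($item->{title}),
        description => plain_text($item->{description}),
        link        => pick_link($item),
    );
    return \%clean;
}

my $clean = tidy_item({
    title       => "Big <b>news</b>!",
    description => 'Read <a href="http://example.org/story">more</a>',
    link        => '',
});
print "$clean->{title} | $clean->{description} | $clean->{link}\n";
# prints: Big news! | Read more | http://example.org/story
```

Note that the whitelist has no "/" in it, so the character filter is only
applied to text fields here, never to the link itself.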

If you can make a convincing argument against any of these behaviors, they
may be relaxed. Otherwise, you might use parseXML.

+ Under certain circumstances; not valid during leap years, full
 moons, high tides, vernal equinoxes, or Wednesdays. YMMV.

* We hope. The new parser may be too strict; please provide samples
 of content which you believe should parse but does not.

++ The newfound "correctness" comes at a performance cost: it is
 slower than prior versions, but still faster than XML::RSS.

** Not fully compliant with the W3C specifications.