README
NAME
    HTML::HTML5::Entities - drop-in replacement for HTML::Entities

SYNOPSIS
     use HTML::HTML5::Entities;

     my $enc = encode_entities('fish & chips');
     print "$enc\n";  # fish &amp; chips

     my $dec = decode_entities($enc);
     print "$dec\n";  # fish & chips

DESCRIPTION
    This is a drop-in replacement for HTML::Entities, providing the character
    entities defined in HTML5. Some caveats:

    *   The implementation is pure Perl, so it is slower in some cases,
        especially when decoding.

    *   It requires Perl 5.8.1 or later.

  Functions
    `decode_entities($string, ...)`
        This routine replaces HTML entities found in the $string with the
        corresponding Unicode character. If multiple strings are provided as
        arguments they are each decoded separately and the same number of
        strings are returned.

        If called in void context the arguments are decoded in-place.

        This routine is exported by default.

    `_decode_entities($string, \%entity2char)`
    `_decode_entities($string, \%entity2char, $expand_prefix)`
        This replaces HTML entities in $string in-place. The %entity2char
        hash must be provided. Named entities not found in the %entity2char
        hash are left alone. Numeric entities are always expanded.

        If $expand_prefix is TRUE then entities without trailing ";" in
        %entity2char will even be expanded as a prefix of a longer
        unrecognized name.

            $string = "foo&nbspbar";
            _decode_entities($string, { nb => "@", nbsp => "\xA0" }, 1);
            print $string;  # will print "foo bar" (with a no-break space)

        This routine is exported by default.

    `encode_entities($string)`
    `encode_entities($string, $unsafe_chars)`
        This routine replaces unsafe characters in $string with their entity
        representation. A second argument can be given to specify which
        characters to consider unsafe (i.e., which to escape). This may be a
        regular expression.

        If called in void context the string is encoded in-place.

        This routine is exported by default.

    `encode_entities_numeric($string)`
        This routine works just like encode_entities, except that the
        replacement entities are always numeric.

        This routine is not exported by default.

    `num_entity($string)`
        Given a single-character string, encodes it as a numeric entity.

        This routine is not exported by default.

    The following functions cannot be exported. They behave the same as the
    exportable functions.

    `HTML::HTML5::Entities::decode($string, ...)`
    `HTML::HTML5::Entities::encode($string)`
    `HTML::HTML5::Entities::encode($string, $unsafe_characters)`
    `HTML::HTML5::Entities::encode_numeric($string)`
    `HTML::HTML5::Entities::encode_numeric($string, $unsafe_characters)`
    `HTML::HTML5::Entities::encode_numerically($string)`
    `HTML::HTML5::Entities::encode_numerically($string, $unsafe_characters)`

  Variables
    $HTML::HTML5::Entities::hex
        This variable controls whether numeric entities will use hexadecimal
        or decimal notation. It is TRUE (hexadecimal) by default, but can be
        set to FALSE.

        It only affects the encoding functions. Decoding always understands
        both notations.

    %HTML::HTML5::Entities::char2entity
    %HTML::HTML5::Entities::entity2char
        These contain the mapping from all characters to the corresponding
        entities (and vice versa, respectively). These variables may be
        exported.

        Note that %char2entity is a more conservative set of mappings,
        intended to be safe for serialising strings to HTML4, HTML5 and XHTML
        1.x. And for hysterical raisins, %entity2char does not include the
        leading ampersands, while %char2entity does.

BUGS
    Please report any bugs to
    <http://rt.cpan.org/Dist/Display.html?Queue=HTML-HTML5-Entities>.

SEE ALSO
    HTML::Entities, HTML::HTML5::Parser, HTML::HTML5::Writer.

AUTHOR
    Toby Inkster <tobyink@cpan.org>.

COPYRIGHT AND LICENCE
  Encoding and Decoding Functions
    Copyright (c) 1995-2006 by Gisle Aas.

    Copyright (c) 2012 by Toby Inkster.

    This is free software; you can redistribute it and/or modify it under the
    same terms as the Perl 5 programming language system itself.

  Entity Tables
    Copyright (c) 2004-2007 by Apple Computer Inc, Mozilla Foundation, and
    Opera Software ASA.

    Copyright (c) 2007-2011 by Wakaba <w@suika.fam.cx>.

    Copyright (c) 2009-2012 by Toby Inkster <tobyink@cpan.org>.

DISCLAIMER OF WARRANTIES
    THIS PACKAGE IS PROVIDED "AS IS" AND WITHOUT ANY EXPRESS OR IMPLIED
    WARRANTIES, INCLUDING, WITHOUT LIMITATION, THE IMPLIED WARRANTIES OF
    MERCHANTABILITY AND FITNESS FOR A PARTICULAR PURPOSE.