NAME

    XML::XPath - a set of modules for parsing and evaluating XPath
    statements

DESCRIPTION

    This module aims to comply exactly with the XPath specification
    at http://www.w3.org/TR/xpath and yet allow extensions to be
    added in the form of functions. Modules such as XSLT and
    XPointer may need to do this as they support functionality
    beyond XPath.

INSTALLATION

    To install this module, run the following commands:

        perl Makefile.PL
        make
        make test
        make install

SYNOPSIS

        use XML::XPath;
        use XML::XPath::XMLParser;

        my $xp = XML::XPath->new(filename => 'test.xhtml');

        my $nodeset = $xp->find('/html/body/p'); # find all paragraphs

        foreach my $node ($nodeset->get_nodelist) {
            print "FOUND\n\n",
                XML::XPath::XMLParser::as_string($node),
                "\n\n";
        }

DETAILS

    There's an awful lot to all of this, so bear with it - if you
    stick it out it should be worth it. Please get a good
    understanding of XPath by reading the spec before asking me
    questions. All of the classes and parts herein are named to be
    synonymous with the names in the specification, so consult that
    if you don't understand why I'm doing something in the code.

API

    The API of XML::XPath itself is extremely simple, to allow you
    to get going almost immediately. The deeper APIs are more
    complex, but you shouldn't have to touch most of that.

  new()

    This constructor takes named parameters. The parameters you can
    use are: filename, parser, xml, ioref and context. The filename
    parameter specifies an XML file to parse. The xml parameter
    specifies a string to parse, and the ioref parameter specifies
    an IO reference (an open filehandle) to parse. The context
    option allows you to specify a context node. The context node
    has to be in the format of a node as specified in the
    XML::XPath::XMLParser manpage. The four parameters filename,
    xml, ioref and context are mutually exclusive - you should only
    specify one (if you specify anything other than context, the
    context node is the root of your document). The parser option
    allows you to pass in an already prepared XML::Parser object, to
    save you having to create more than one in your application (if,
    for example, you're doing more than just XPath).

        my $xp = XML::XPath->new( context => $node );
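
    As a minimal sketch of two of the other input sources, the
    following parses the same document once from a string (xml) and
    once from a file (filename). The sample document, the temporary
    file and the variable names are made up for illustration.

```perl
use strict;
use warnings;
use File::Temp qw(tempfile);
use XML::XPath;

# Hypothetical sample document, used only for illustration.
my $xml = '<doc><title>Hello</title><p>one</p><p>two</p></doc>';

# Parse from a string with the xml parameter.
my $xp_string = XML::XPath->new(xml => $xml);
print $xp_string->findvalue('/doc/title'), "\n";

# Parse from a file with the filename parameter (a temporary file here).
my ($fh, $path) = tempfile(SUFFIX => '.xml', UNLINK => 1);
print {$fh} $xml;
close $fh or die "cannot close $path: $!";

my $xp_file = XML::XPath->new(filename => $path);
print $xp_file->findvalue('/doc/title'), "\n";
```

    Only one source is given per object, in line with the rule
    above that these parameters are mutually exclusive.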

    It is very much recommended that you use only one XPath object
    throughout the life of your application. This is because the
    object (and its sub-objects) maintains certain bits of state
    information (such as XPath variables) that will be useful to
    later calls to find(). It's also a good idea because you'll use
    less memory this way.

  *nodeset* = find($path, [$context])

    The find function takes an XPath expression (a string) and
    returns either an XML::XPath::NodeSet object containing the
    nodes it found (empty if no nodes matched the path), or one of
    XML::XPath::Literal (a string), XML::XPath::Number, or
    XML::XPath::Boolean. It should always return something, and you
    can use ->isa() to find out what it returned. If you need to
    check how many nodes it found, check $nodeset->size. See the
    XML::XPath::NodeSet manpage. The optional second parameter, a
    context node, lets you call this method repeatedly against
    different nodes; XSLT, for example, needs to do this.
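
    The following sketch shows one way to tell the return types
    apart with ->isa(), and how the optional context node can be
    used to re-run a relative path. The sample document and the
    describe() helper are invented for this example and are not
    part of XML::XPath.

```perl
use strict;
use warnings;
use XML::XPath;

# Hypothetical sample document, used only for illustration.
my $xp = XML::XPath->new(
    xml => '<doc><sec><p>one</p><p>two</p></sec><sec><p>three</p></sec></doc>',
);

# describe() is a helper invented for this sketch; it is not part of XML::XPath.
sub describe {
    my ($result) = @_;
    return 'nodeset of ' . $result->size . ' node(s)'
        if $result->isa('XML::XPath::NodeSet');
    return 'boolean' if $result->isa('XML::XPath::Boolean');
    return 'number'  if $result->isa('XML::XPath::Number');
    return 'literal' if $result->isa('XML::XPath::Literal');
    return 'something else';
}

for my $path ('/doc/sec', 'count(/doc/sec/p)',
              'string(/doc/sec/p)', 'boolean(/doc/sec)') {
    print "$path: ", describe($xp->find($path)), "\n";
}

# Re-run a relative path against each section by passing a context node.
for my $sec ($xp->find('/doc/sec')->get_nodelist) {
    print 'paragraphs in this section: ', $xp->find('p', $sec)->size, "\n";
}
```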

  findnodes($path, [$context])

    Returns a list of nodes found by $path, optionally in context
    $context. In scalar context returns an XML::XPath::NodeSet
    object.
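
    A short sketch of the two calling contexts, using a made-up
    sample document:

```perl
use strict;
use warnings;
use XML::XPath;

# Hypothetical sample document, used only for illustration.
my $xp = XML::XPath->new(xml => '<doc><p>one</p><p>two</p></doc>');

# List context: a plain Perl list of node objects.
my @nodes = $xp->findnodes('/doc/p');
print scalar(@nodes), " nodes\n";
print $_->string_value, "\n" for @nodes;

# Scalar context: a single XML::XPath::NodeSet object.
my $nodeset = $xp->findnodes('/doc/p');
print ref($nodeset), ' holding ', $nodeset->size, " nodes\n";
```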

  findnodes_as_string($path, [$context])

    Returns the nodes found reproduced as XML. The result is not
    guaranteed to be valid XML though.
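
    As a hedged illustration of why the result may not be a valid
    document: a path that matches several sibling elements yields
    their markup one after another, with no single root element
    around them. The sample document is made up for this sketch.

```perl
use strict;
use warnings;
use XML::XPath;

# Hypothetical sample document, used only for illustration.
my $xp = XML::XPath->new(xml => '<doc><p>one</p><p>two</p></doc>');

# Both <p> elements are serialised back to back, so the string is an
# XML fragment rather than a complete document.
my $fragment = $xp->findnodes_as_string('/doc/p');
print $fragment, "\n";
```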

  findvalue($path, [$context])

    Returns either an `XML::XPath::Literal', an
    `XML::XPath::Boolean' or an `XML::XPath::Number' object. If the
    path returns a NodeSet, $nodeset->to_literal is called
    automatically for you (and thus an `XML::XPath::Literal' is
    returned). Note that for each of the objects stringification is
    overloaded, so you can just print the value found, or
    manipulate it in the ways you would a normal Perl value (e.g.
    using regular expressions).
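
    A small sketch of working with the returned objects. The
    sample document is invented; the values are interpolated into
    plain strings before the arithmetic so that only the
    stringification overload described above is relied on.

```perl
use strict;
use warnings;
use XML::XPath;

# Hypothetical sample document, used only for illustration.
my $xp = XML::XPath->new(
    xml => '<order id="A17"><qty>3</qty><price>2.50</price></order>',
);

# A node-set result is converted to a literal, which stringifies on use.
my $id = $xp->findvalue('/order/@id');
print "order id: $id\n";

# Regular expressions work on the stringified value.
if ($id =~ /^([A-Z]+)(\d+)$/) {
    print "series $1, number $2\n";
}

# Turn the objects into plain Perl strings before doing arithmetic.
my ($qty, $price) = map { "$_" }
    $xp->findvalue('/order/qty'), $xp->findvalue('/order/price');
printf "total: %.2f\n", $qty * $price;
```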

  matches($node, $path, [$context])

    Returns true if the node matches the path (optionally in context
    $context).
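
    For instance, the following sketch checks two nodes against a
    path with a predicate. The sample document is made up for
    illustration.

```perl
use strict;
use warnings;
use XML::XPath;

# Hypothetical sample document, used only for illustration.
my $xp = XML::XPath->new(
    xml => '<doc><p class="intro">one</p><p>two</p></doc>',
);

my ($first, $second) = $xp->findnodes('/doc/p');

# Ask whether each node is among those selected by the path.
for my $node ($first, $second) {
    my $verdict = $xp->matches($node, '/doc/p[@class="intro"]')
        ? 'matches'
        : 'does not match';
    print $node->string_value, ": $verdict\n";
}
```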

  set_namespace($prefix, $uri)

    Maps the namespace prefix to the URI.

    Normally in XML::XPath the prefixes in XPath node tests take
    their context from the current node. This means that foo:bar
    will always match an element <foo:bar> regardless of the
    namespace that the prefix foo is mapped to (which might even
    change within the document, leading to unexpected results). To
    make prefixes in XPath node tests actually map to a real URI,
    you need to enable that via a call to the set_namespace method
    of your XML::XPath object.

  clear_namespaces()

    Clears all previously set namespace mappings.
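
    The difference can be sketched with a document that binds the
    same prefix to two different URIs. The document, the URIs and
    the prefix x are all made up for illustration, and the node
    counts in the comments are what the description above implies.

```perl
use strict;
use warnings;
use XML::XPath;

# Hypothetical sample document: the prefix "a" is bound to two different URIs.
my $xp = XML::XPath->new(
    xml => '<root>'
         . '<a:item xmlns:a="urn:example:one">first</a:item>'
         . '<a:item xmlns:a="urn:example:two">second</a:item>'
         . '</root>',
);

# Without a mapping, the prefix is resolved against each node in turn,
# so both elements are expected to be selected.
my @by_doc_prefix = $xp->findnodes('/root/a:item');

# With a mapping, the prefix in the path stands for one specific URI.
$xp->set_namespace(x => 'urn:example:two');
my @by_uri = $xp->findnodes('/root/x:item');

print scalar(@by_doc_prefix), " by document prefix\n";
print scalar(@by_uri), " by mapped URI\n";

# Drop all mappings again when they are no longer wanted.
$xp->clear_namespaces;
```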

  $XML::XPath::Namespaces

    Set this to 0 if you *don't* want namespace processing to occur.
    This will make everything a little (tiny) bit faster, but you'll
    suffer for it, probably.

Node Object Model

    See the XML::XPath::Node manpage, the XML::XPath::Node::Element
    manpage, the XML::XPath::Node::Text manpage, the
    XML::XPath::Node::Comment manpage, the
    XML::XPath::Node::Attribute manpage, the
    XML::XPath::Node::Namespace manpage, and the
    XML::XPath::Node::PI manpage.

On Garbage Collection
    XPath nodes work in a special way that allows circular
    references, and yet still lets Perl's reference counting garbage
    collector clean up the nodes after use. This should be totally
    transparent to the user, with one caveat: if you free your tree
    before letting go of a sub-tree, consider that playing with fire
    and you may get burned. What does this mean to the average user?
    Not much. You'll be fine provided that whatever owns the tree
    outlives the results of find() and its friends: if you passed a
    tree to XML::XPath->new, don't free it (or let it go out of
    scope) first; if you passed a filename or IO-ref instead, don't
    let the XML::XPath object go out of scope first. Even if you do
    let the tree go out of scope before the results, you'll probably
    still be fine. The only case where you may get stung is when the
    last part of your path/query is either an ancestor or parent
    axis. In that case the worst that will happen is you'll end up
    with a circular reference that won't get cleared until
    interpreter destruction time. You can get around that by
    explicitly calling $node->DESTROY on each of your result nodes,
    if you really need to do that.

    Mail me direct if that's not clear. Note that it's not all doom
    and gloom. It's by no means perfect, but the worst that will
    happen is that a long running process could leak memory. Most
    long running processes can avoid the problem simply by taking
    care not to free the tree (or XML::XPath object) before freeing
    results. AxKit, an application that uses XML::XPath, does this
    and I didn't have to make any changes to the code - it's already
    sensible programming.
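
    One way to follow that advice, sketched under the assumption
    that plain string copies are all the caller needs: keep the
    XML::XPath object and its result nodes inside one scope and
    hand back only copied values. The paragraph_texts() helper and
    the sample document are invented for this example.

```perl
use strict;
use warnings;
use XML::XPath;

# paragraph_texts() is a helper invented for this sketch. It copies the
# text out of the result nodes, so no result node is still referenced
# when the XML::XPath object goes out of scope.
sub paragraph_texts {
    my ($xml) = @_;
    my $xp    = XML::XPath->new(xml => $xml);
    my @texts = map { $_->string_value } $xp->findnodes('/doc/p');
    return @texts;    # plain strings only
}

print "$_\n" for paragraph_texts('<doc><p>one</p><p>two</p></doc>');
```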

    If you *really* don't want all this to happen, then set the
    variable $XML::XPath::SafeMode to a true value, and call
    $xp->cleanup() on the XML::XPath object when you're finished, or
    $tree->dispose() if you have a tree instead.
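
    A minimal sketch of that safe mode workflow. It assumes the
    variable has to be set before the document is parsed, and the
    sample document is made up for illustration.

```perl
use strict;
use warnings;
use XML::XPath;

# Assumption: safe mode is switched on before any tree is built.
$XML::XPath::SafeMode = 1;

# Hypothetical sample document, used only for illustration.
my $xp = XML::XPath->new(xml => '<doc><p>one</p><p>two</p></doc>');

# Copy what is needed out of the nodes while the tree is still alive.
my @texts = map { $_->string_value } $xp->findnodes('/doc/p');
print "$_\n" for @texts;

# Release the tree explicitly once finished with it.
$xp->cleanup();
```

    Nodes from this tree should not be used after cleanup() has
    been called, which is why the sketch copies their text first.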

Example
    Please see the test files in t/ for examples of how to use
    XPath.

Support/Author
    This module is copyright 2000 AxKit.com Ltd. This is free
    software, and as such comes with NO WARRANTY. No dates are used
    in this module. You may distribute this module under the terms
    of either the GNU GPL, or the Artistic License (the same terms
    as Perl itself).

    For support, please subscribe to the Perl-XML mailing list at
    the URL
    http://listserv.activestate.com/mailman/listinfo/perl-xml

    Matt Sergeant, matt@sergeant.org

SEE ALSO
    the XML::XPath::Literal manpage, the XML::XPath::Boolean
    manpage, the XML::XPath::Number manpage, the
    XML::XPath::XMLParser manpage, the XML::XPath::NodeSet manpage,
    the XML::XPath::PerlSAX manpage, the XML::XPath::Builder
    manpage.