1NAME
2 Data::Stag - Structured Tags datastructures
3
4SYNOPSIS
5 # PROCEDURAL USAGE
6 use Data::Stag qw(:all);
7 $doc = stag_parse($file);
8 @persons = stag_find($doc, "person");
9 foreach $p (@persons) {
10 printf "%s, %s phone: %s\n",
11 stag_sget($p, "family_name"),
12 stag_sget($p, "given_name"),
13 stag_sget($p, "phone_no"),
14 ;
15 }
16
17 # OBJECT-ORIENTED USAGE
18 use Data::Stag;
19 $doc = Data::Stag->parse($file);
20 @persons = $doc->find("person");
 foreach $p (@persons) {
22 printf "%s, %s phone:%s\n",
23 $p->sget("family_name"),
24 $p->sget("given_name"),
25 $p->sget("phone_no"),
26 ;
27 }
28
29DESCRIPTION
30 This module is for manipulating data as hierarchical tag/value pairs
 (Structured TAGs or Simple Tree AGgregates). These datastructures can
32 be represented as nested arrays, which have the advantage of being
33 native to perl. A simple example is shown below:
34
35 [ person=> [ [ family_name => $family_name ],
36 [ given_name => $given_name ],
37 [ phone_no => $phone_no ] ] ],
38
 Data::Stag uses a subset of XML for import and export. This
40 means the module can also be used as a general XML parser/writer (with
41 certain caveats).
42
43 The above set of structured tags can be represented in XML as
44
45 <person>
46 <family_name>...</family_name>
47 <given_name>...</given_name>
48 <phone_no>...</phone_no>
49 </person>
50
51 This datastructure can be examined, manipulated and exported using Stag
52 functions or methods:
53
54 $document = Data::Stag->parse($file);
55 @persons = $document->find('person');
 foreach my $person (@persons) {
57 $person->set('full_name',
58 $person->sget('given_name') . ' ' .
59 $person->sget('family_name'));
60 }
61
62 Advanced querying is performed by passing functions, for example:
63
64 # get all people in dataset with name starting 'A'
65 @persons =
66 $document->where('person',
67 sub {shift->sget('family_name') =~ /^A/});
68
 One of the things that marks this module out against other XML modules
 is its emphasis on a functional approach, offered alongside the
 object-oriented and procedural approaches.
72
73 PROCEDURAL VS OBJECT-ORIENTED USAGE
74
 Depending on your preference, this module can be used as a set of
76 procedural subroutine calls, or as method calls upon Data::Stag objects,
77 or both.
78
79 In procedural mode, all the subroutine calls are prefixed "stag_" to
80 avoid namespace clashes. The following three calls are equivalent:
81
82 $person = stag_find($doc, "person");
83 $person = $doc->find("person");
84 $person = $doc->find_person;
85
86 In object mode, you can treat any tree element as if it is an object
87 with automatically defined methods for getting/setting the tag values.
88
89 USE OF XML
90
91 Nested arrays can be imported and exported as XML, as well as other
92 formats. XML can be slurped into memory all at once (using less memory
93 than an equivalent DOM tree), or a simplified SAX style event handling
94 model can be used. Similarly, data can be exported all at once, or as a
95 series of events.
96
97 Although this module can be used as a general XML tool, it is intended
98 primarily as a tool for manipulating hierarchical data using nested
99 tag/value pairs.
100
101 By using a simpler subset of XML equivalent to a basic data tree
102 structure, we can write simpler, cleaner code. This simplicity comes at
103 a price - this module is not very suitable for XML with attributes or
104 mixed content.
105
106 All attributes are turned into elements. This means that it will not
107 round-trip a piece of xml with attributes in it. For some applications
108 this is acceptable, for others it is not.
109
110 Mixed content cannot be represented in a simple tree format, so this is
111 also expanded.
112
113 The following piece of XML
114
115 <paragraph id="1">
116 example of <bold>mixed</bold>content
117 </paragraph>
118
119 gets parsed as if it were actually:
120
121 <paragraph>
122 <paragraph-id>1</paragraph-id>
123 <paragraph-text>example of</paragraph-text>
124 <bold>mixed</bold>
125 <paragraph-text>content</paragraph-text>
126 </paragraph>
127
128 This module is more suited to dealing with data-oriented documents than
129 text-oriented documents.
130
131 It can also be used as part of a SAX-style event generation / handling
132 framework - see the Data::Stag::BaseHandler manpage
133
134 Because nested arrays are native to perl, we can specify an XML
135 datastructure directly in perl without going through multiple object
136 calls.
137
138 For example, instead of the lengthy
139
140 $obj->startTag("record");
141 $obj->startTag("field1");
142 $obj->characters("foo");
143 $obj->endTag("field1");
144 $obj->startTag("field2");
145 $obj->characters("bar");
146 $obj->endTag("field2");
 $obj->endTag("record");
148
149 We can instead write
150
151 $struct = [ record => [
152 [ field1 => 'foo'],
153 [ field2 => 'bar']]];
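
 To make the correspondence between the nested array and its XML form
 concrete, here is a minimal sketch in plain Perl. The helper node_to_xml
 is hypothetical (it is not part of Data::Stag, whose own xml method is
 the one to use in practice), and it does no escaping of special
 characters.

```perl
use strict;
use warnings;

# Hypothetical helper (not part of Data::Stag): serialise a
# [ name => data ] node as XML, recursing into child nodes.
sub node_to_xml {
    my ($node, $depth) = @_;
    $depth //= 0;
    my ($name, $data) = @$node;
    my $pad = '  ' x $depth;
    if (ref($data) eq 'ARRAY') {
        # non-terminal: data is a list of child nodes
        my $kids = join '', map { node_to_xml($_, $depth + 1) } @$data;
        return "${pad}<${name}>\n${kids}${pad}</${name}>\n";
    }
    # terminal: data is a plain scalar (no XML escaping in this sketch)
    return "${pad}<${name}>${data}</${name}>\n";
}

my $struct = [ record => [
                 [ field1 => 'foo' ],
                 [ field2 => 'bar' ] ] ];
my $xml = node_to_xml($struct);
print $xml;
```

 The point of the sketch is that the whole document is one ordinary Perl
 value, so a single recursive function is enough to walk it.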
154
155 PARSING
156
157 The following example is for parsing out subsections of a tree and
158 changing sub-elements
159
160 use Data::Stag qw(:all);
161 my $tree = stag_parse($xmlfile);
162 my ($subtree) = stag_findnode($tree, $element);
 stag_set($subtree, $sub_element, $new_val);
164 print stag_xml($subtree);
165
166 OBJECT ORIENTED
167
168 The same can be done in a more OO fashion
169
170 use Data::Stag qw(:all);
171 my $tree = Data::Stag->parse($xmlfile);
172 my ($subtree) = $tree->findnode($element);
 $subtree->set($sub_element, $new_val);
174 print $subtree->xml;
175
176 IN A STREAM
177
178 Rather than parsing in a whole file into memory all at once (which may
179 not be suitable for very large files), you can take an event handling
 approach. The easiest way to do this is to register which nodes in the file
181 you are interested in using the makehandler method. The parser will
182 sweep through the file, building objects as it goes, and handing the
183 object to a subroutine that you specify.
184
185 For example:
186
187 use Data::Stag;
188 # catch the end of 'person' elements
189 my $h = Data::Stag->makehandler( person=> sub {
190 my ($self, $person) = @_;
191 printf "name:%s phone:%s\n",
192 $person->get_name,
193 $person->get_phone;
194 return; # clear node
195 });
196 Data::Stag->parse(-handler=>$h,
197 -file=>$f);
198
199 see the Data::Stag::BaseHandler manpage for writing handlers
200
201 See the Stag website at http://stag.sourceforge.net for more examples.
202
203 STRUCTURED TAGS TREE DATA STRUCTURE
204
 A tree of structured tags is represented as a recursively nested array;
 the elements of the array represent nodes in the tree.

 A node is a name/data pair that can represent tags and values. A node
 is represented using a reference to an array, where the first element of
 the array is the tagname, or element, and the second element is the data.
211
212 This can be visualised as a box:
213
214 +-----------+
215 |Name | Data|
216 +-----------+
217
218 In perl, we represent this pair as a reference to an array
219
220 [ Name => $Data ]
221
222 The Data can either be a list of child nodes (subtrees), or a data
223 value.
224
 The terminal nodes (leaves of the tree) contain data values; these are
 represented in perl using primitive scalars.
227
228 For example:
229
230 [ Name => 'Fred' ]
231
 For non-terminal nodes, the Data is a reference to an array, where each
 element of the array is a new node.
234
 +-----------+
 |Name | Data|
 +-----------+
          |||   +-----------+
          ||+-->|Name | Data|
          ||    +-----------+
          ||
          ||    +-----------+
          |+--->|Name | Data|
          |     +-----------+
          |
          |     +-----------+
          +---->|Name | Data|
                +-----------+
249
250 In perl this would be:
251
252 [ Name => [
253 [Name1 => $Data1],
254 [Name2 => $Data2],
255 [Name3 => $Data3],
256 ]
257 ];
258
 The extra level of nesting is required to be able to store any node in
 the tree using a single variable. Unlike a hash-based or mixed
 hash/array representation, nested arrays preserve the order of child
 elements and allow the same element name to appear more than once
 under one parent.
262
263 MANIPULATION AND QUERYING
264
265 The following example is taken from biology; we have a list of species
 (mouse, human, fly) and a list of genes found in those species. These are
267 cross-referenced by an identifier called tax_id. We can do a
268 relational-style inner join on this identifier, as follows -
269
270 use Data::Stag qw(:all);
271 my $tree =
272 Data::Stag->new(
273 'db' => [
274 [ 'species_set' => [
275 [ 'species' => [
276 [ 'common_name' => 'house mouse' ],
277 [ 'binomial' => 'Mus musculus' ],
278 [ 'tax_id' => '10090' ]]],
279 [ 'species' => [
280 [ 'common_name' => 'fruit fly' ],
281 [ 'binomial' => 'Drosophila melanogaster' ],
282 [ 'tax_id' => '7227' ]]],
283 [ 'species' => [
284 [ 'common_name' => 'human' ],
285 [ 'binomial' => 'Homo sapiens' ],
286 [ 'tax_id' => '9606' ]]]]],
287 [ 'gene_set' => [
288 [ 'gene' => [
289 [ 'symbol' => 'HGNC' ],
290 [ 'tax_id' => '9606' ],
291 [ 'phenotype' => 'Hemochromatosis' ],
292 [ 'phenotype' => 'Porphyria variegata' ],
293 [ 'GO_term' => 'iron homeostasis' ],
294 [ 'map' => '6p21.3' ]]],
295 [ 'gene' => [
296 [ 'symbol' => 'Hfe' ],
297 [ 'synonym' => 'MR2' ],
298 [ 'tax_id' => '10090' ],
299 [ 'GO_term' => 'integral membrane protein' ],
300 [ 'map' => '13 A2-A4' ]]]]]]
301 );
302
303 # inner join of species and gene parts of tree,
304 # based on 'tax_id' element
305 my $gene_set = $tree->find("gene_set"); # get <gene_set> element
306 my $species_set = $tree->find("species_set"); # get <species_set> element
307 $gene_set->ijoin("gene", "tax_id", $species_set); # INNER JOIN
308
309 print "Reorganised data:\n";
310 print $gene_set->xml;
311
 # find all genes whose symbol starts with 'H' and whose species/common_name is 'human'
313 my @genes =
314 $gene_set->where('gene',
315 sub { my $g = shift;
316 $g->get_symbol =~ /^H/ &&
317 $g->findval("common_name") eq ('human')});
318
319 print "Human genes beginning 'H'\n";
320 print $_->xml foreach @genes;
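
 For readers who want to see what a relational-style inner join means on
 nested arrays, the following is a rough sketch in plain Perl. It does
 not use Data::Stag, and it assumes that the join replaces each gene's
 tax_id leaf with the matching species node; the exact shape of the tree
 that ijoin produces may differ. The helper child_val is hypothetical.

```perl
use strict;
use warnings;

# Hypothetical helper: data of the first direct child with a given name.
sub child_val {
    my ($node, $name) = @_;
    my ($hit) = grep { $_->[0] eq $name } @{ $node->[1] };
    return $hit ? $hit->[1] : undef;
}

my $species_set = [ species_set => [
    [ species => [ [ common_name => 'house mouse' ], [ tax_id => '10090' ] ] ],
    [ species => [ [ common_name => 'human' ],       [ tax_id => '9606'  ] ] ],
] ];
my $gene_set = [ gene_set => [
    [ gene => [ [ symbol => 'HGNC' ], [ tax_id => '9606'  ] ] ],
    [ gene => [ [ symbol => 'Hfe'  ], [ tax_id => '10090' ] ] ],
] ];

# Index the species nodes by their tax_id value.
my %species_by_id;
$species_by_id{ child_val($_, 'tax_id') } = $_ for @{ $species_set->[1] };

# Assumed join behaviour: swap each gene's tax_id leaf for the species node.
for my $gene (@{ $gene_set->[1] }) {
    for my $kid (@{ $gene->[1] }) {
        next unless $kid->[0] eq 'tax_id' && !ref $kid->[1];
        my $match = $species_by_id{ $kid->[1] };
        $kid = $match if $match;    # foreach aliasing updates the slot in place
    }
}

# After the join, species data is reachable from inside each gene.
my @report;
for my $gene (@{ $gene_set->[1] }) {
    my ($sp) = grep { $_->[0] eq 'species' } @{ $gene->[1] };
    push @report, child_val($gene, 'symbol') . ' is from ' . child_val($sp, 'common_name');
}
print "$_\n" for @report;
```

 This is why the where() query above can test common_name on a gene
 node: after the join, the species data sits underneath each gene.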
321
322 S-Expression (Lisp) representation
323
324 The data represented using this module can be represented as Lisp-style
325 S-Expressions.
326
327 See the Data::Stag::SxprParser manpage and the Data::Stag::SxprWriter
328 manpage
329
330 If we execute this code on the XML from the example above
331
332 $stag = Data::Stag->parse($xmlfile);
333 print $stag->sxpr;
334
335 The following S-Expression will be printed:
336
337 '(db
338 (species_set
339 (species
340 (common_name "house mouse")
341 (binomial "Mus musculus")
342 (tax_id "10090"))
343 (species
344 (common_name "fruit fly")
345 (binomial "Drosophila melanogaster")
346 (tax_id "7227"))
347 (species
348 (common_name "human")
349 (binomial "Homo sapiens")
350 (tax_id "9606")))
351 (gene_set
352 (gene
353 (symbol "HGNC")
354 (tax_id "9606")
355 (phenotype "Hemochromatosis")
356 (phenotype "Porphyria variegata")
357 (GO_term "iron homeostasis")
358 (map
359 (cytological
360 (chromosome "6")
361 (band "p21.3"))))
362 (gene
363 (symbol "Hfe")
364 (synonym "MR2")
365 (tax_id "10090")
366 (GO_term "integral membrane protein")))
367 (similarity_set
368 (pair
369 (symbol "HGNC")
370 (symbol "Hfe"))
371 (pair
372 (symbol "WNT3A")
373 (symbol "Wnt3a"))))
374
375 TIPS FOR EMACS USERS AND LISP PROGRAMMERS
376
377 If you use emacs, you can save this as a file with the ".el" suffix and
378 get syntax highlighting for editing this file. Quotes around the
379 terminal node data items are optional.
380
381 If you know emacs lisp or any other lisp, this also turns out to be a
382 very nice language for manipulating these datastructures. Try copying
383 and pasting the above s-expression to the emacs scratch buffer and
384 playing with it in lisp.
385
386 INDENTED TEXT REPRESENTATION
387
388 Data::Stag has its own text format for writing data trees. Again, this
389 is only possible because we are working with a subset of XML (no
390 attributes, no mixed elements). The data structure above can be written
391 as follows -
392
393 db:
394 species_set:
395 species:
396 common_name: house mouse
397 binomial: Mus musculus
398 tax_id: 10090
399 species:
400 common_name: fruit fly
401 binomial: Drosophila melanogaster
402 tax_id: 7227
403 species:
404 common_name: human
405 binomial: Homo sapiens
406 tax_id: 9606
407 gene_set:
408 gene:
409 symbol: HGNC
410 tax_id: 9606
411 phenotype: Hemochromatosis
412 phenotype: Porphyria variegata
413 GO_term: iron homeostasis
414 map: 6p21.3
415 gene:
416 symbol: Hfe
417 synonym: MR2
418 tax_id: 10090
419 GO_term: integral membrane protein
420 map: 13 A2-A4
421 similarity_set:
422 pair:
423 symbol: HGNC
424 symbol: Hfe
425 pair:
426 symbol: WNT3A
427 symbol: Wnt3a
428
429 See the Data::Stag::ITextParser manpage and the Data::Stag::ITextWriter
430 manpage
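
 The mapping from a nested array to this indented format is simple
 enough to sketch in a few lines of plain Perl. The helper node_to_itext
 below is hypothetical and is not the module's writer; the two-space
 indent is an assumption made for the illustration.

```perl
use strict;
use warnings;

# Hypothetical helper (not the module's writer): render a node as
# indented text, one "name: value" line per terminal node.
sub node_to_itext {
    my ($node, $depth) = @_;
    $depth //= 0;
    my ($name, $data) = @$node;
    my $pad = '  ' x $depth;
    # terminal node: name and value on one line
    return "${pad}${name}: ${data}\n" unless ref($data) eq 'ARRAY';
    # non-terminal node: name on its own line, children one level deeper
    return "${pad}${name}:\n" . join('', map { node_to_itext($_, $depth + 1) } @$data);
}

my $species = [ species => [
    [ common_name => 'human' ],
    [ tax_id      => '9606' ] ] ];
my $text = node_to_itext($species);
print $text;
```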
431
432 NESTED ARRAY SPECIFICATION II
433
434 To avoid excessive square bracket usage, you can specify a structure
435 like this:
436
437 use Data::Stag qw(:all);
438
439 *N = \&stag_new;
440 my $tree =
441 N(top=>[
442 N('personset'=>[
443 N('person'=>[
444 N('name'=>'davey'),
445 N('address'=>'here'),
446 N('description'=>[
447 N('hair'=>'green'),
448 N('eyes'=>'two'),
449 N('teeth'=>5),
450 ]
451 ),
452 N('pets'=>[
453 N('petname'=>'igor'),
454 N('petname'=>'ginger'),
455 ]
456 ),
457
458 ],
459 ),
460 N('person'=>[
461 N('name'=>'shuggy'),
462 N('address'=>'there'),
463 N('description'=>[
464 N('hair'=>'red'),
465 N('eyes'=>'three'),
466 N('teeth'=>1),
467 ]
468 ),
469 N('pets'=>[
470 N('petname'=>'thud'),
471 N('petname'=>'spud'),
472 ]
473 ),
474 ]
475 ),
476 ]
477 ),
478 N('animalset'=>[
479 N('animal'=>[
480 N('name'=>'igor'),
481 N('class'=>'rat'),
482 N('description'=>[
483 N('fur'=>'white'),
484 N('eyes'=>'red'),
485 N('teeth'=>50),
486 ],
487 ),
488 ],
489 ),
490 ]
491 ),
492
493 ]
494 );
495
496 # find all people
497 my @persons = stag_find($tree, 'person');
498
499 # write xml for all red haired people
500 foreach my $p (@persons) {
501 print stag_xml($p)
502 if stag_tmatch($p, "hair", "red");
503 } ;
504
505 # find all people that have name == shuggy
506 my @p =
507 stag_qmatch($tree,
508 "person",
509 "name",
510 "shuggy");
511
512NODES AS DATA OBJECTS
513 As well as the methods listed below, a node can be treated as if it is a
514 data object of a class determined by the element.
515
516 For example, the following are equivalent.
517
518 $node->get_name;
519 $node->get('name');
520
521 $node->set_name('fred');
522 $node->set('name', 'fred');
523
524 This is really just syntactic sugar. The autoloaded methods are not
525 checked against any schema, although this may be added in future.
526
527STAG METHODS
528 All method calls are also available as procedural subroutine calls;
529 unless otherwise noted, the subroutine call is the same as the method
530 call, but with the string stag_ prefixed to the method name. The first
531 argument should be a Data::Stag datastructure.
532
533 To import all subroutines into the current namespace, use this idiom:
534
535 use Data::Stag qw(:all);
536 $doc = stag_parse($file);
537 @persons = stag_find($doc, 'person');
538
539 If you wish to use this module procedurally, and you are too lazy to
540 prefix all calls with stag_, use this idiom:
541
542 use Data::Stag qw(:lazy);
543 $doc = parse($file);
544 @persons = find($doc, 'person');
545
546 But beware of clashes!
547
548 Most method calls also have a handy short mnemonic. Use of these is
549 optional. Software engineering types prefer longer names, in the belief
550 that this leads to clearer code. Hacker types prefer shorter names, as
 this requires fewer keystrokes, and leads to a more compact
552 representation of the code. It is expected that if you do use this
553 module, then its usage will be fairly ubiquitous within your code, and
554 the mnemonics will become familiar, much like the qw and s/ operators in
555 perl. As always with perl, the decision is yours.
556
557 Some methods take a single parameter or list of parameters; some have
558 large lists of parameters that can be passed in any order. If the
559 documentation states:
560
561 Args: [x str], [y int], [z ANY]
562
563 Then the method can be called like this:
564
565 $stag->foo("this is x", 55, $ref);
566
567 or like this:
568
569 $stag->foo(-z=>$ref, -x=>"this is x", -y=>55);
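
 The dual calling convention can be illustrated with a small sketch in
 plain Perl. The function rearrange_args is hypothetical and only shows
 one way such dispatch could work; it is not taken from the module. It
 assumes a call is "named" when the first argument is a string starting
 with a dash.

```perl
use strict;
use warnings;

# Hypothetical helper: accept either positional values or -name => value
# pairs, and return the values in the declared order.
sub rearrange_args {
    my ($order, @args) = @_;
    # positional call: first argument does not look like -name
    return @args
        unless @args && defined $args[0] && !ref $args[0] && $args[0] =~ /^-\w/;
    # named call: look each declared name up with its dash prefix
    my %named = @args;
    return map { $named{"-$_"} } @$order;
}

my @positional = rearrange_args([qw(x y z)], "this is x", 55, 'ref');
my @named      = rearrange_args([qw(x y z)], -z => 'ref', -x => "this is x", -y => 55);
print "@positional\n";
print "@named\n";
```

 Both calls yield the same list. A positional first argument that
 happens to begin with a dash would be misread by this simple test.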
570
571 INITIALIZATION METHODS
572
573 new
574
575 Title: new
576
577 Args: element str, data STAG-DATA
578 Returns: Data::Stag node
579 Example: $node = stag_new();
580 Example: $node = Data::Stag->new;
581 Example: $node = Data::Stag->new(person => [[name=>$n], [phone=>$p]]);
582
583 creates a new instance of a Data::Stag node
584
585 stagify (nodify)
586
587 Title: stagify
588 Synonym: nodify
589 Args: data ARRAY-REF
590 Returns: Data::Stag node
591 Example: $node = stag_stagify([person => [[name=>$n], [phone=>$p]]]);
592
593 turns a perl array reference into a Data::Stag node.
594
595 similar to new
596
597 parse
598
599 Title: parse
600
601 Args: [file str], [format str], [handler obj], [fh FileHandle]
602 Returns: Data::Stag node
603 Example: $node = stag_parse($fn);
604 Example: $node = stag_parse(-fh=>$fh, -handler=>$h, -errhandler=>$eh);
605 Example: $node = Data::Stag->parse(-file=>$fn, -handler=>$myhandler);
606
607 slurps a file or string into a Data::Stag node structure. Will guess the
608 format (xml, sxpr, itext) from the suffix if it is not given.
609
610 The format can also be the name of a parsing module, or an actual parser
611 object;
612
613 The handler is any object that can take nested Stag events (start_event,
614 end_event, evbody) which are generated from the parse. If the handler is
615 omitted, all events will be cached and the resulting tree will be
616 returned.
617
618 See the Data::Stag::BaseHandler manpage for writing your own handlers
619
620 See the Data::Stag::BaseGenerator manpage for details on parser classes,
621 and error handling
622
623 parsestr
624
625 Title: parsestr
626
627 Args: [str str], [format str], [handler obj]
628 Returns: Data::Stag node
629 Example: $node = stag_parsestr('(a (b (c "1")))');
630 Example: $node = Data::Stag->parsestr(-str=>$str, -handler=>$myhandler);
631
632 Similar to parse(), except the first argument is a string
633
634 from
635
636 Title: from
637
638 Args: format str, source str
639 Returns: Data::Stag node
640 Example: $node = stag_from('xml', $fn);
641 Example: $node = stag_from('xmlstr', q[<top><x>1</x></top>]);
642 Example: $node = Data::Stag->from($parser, $fn);
643
644 Similar to parse
645
646 slurps a file or string into a Data::Stag node structure.
647
648 The format can also be the name of a parsing module, or an actual parser
649 object
650
651 unflatten
652
653 Title: unflatten
654
655 Args: data array
656 Returns: Data::Stag node
657 Example: $node = stag_unflatten(person=>[name=>$n, phone=>$p, address=>[street=>$s, city=>$c]]);
658
659 Creates a node structure from a semi-flattened representation, in which
660 children of a node are represented as a flat list of data rather than a
661 list of array references.
662
663 This means a structure can be specified as:
664
665 person=>[name=>$n,
666 phone=>$p,
667 address=>[street=>$s,
668 city=>$c]]
669
670 Instead of:
671
672 [person=>[ [name=>$n],
673 [phone=>$p],
674 [address=>[ [street=>$s],
675 [city=>$c] ] ]
676 ]
677 ]
678
679 The former gets converted into the latter for the internal
680 representation
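
 The conversion from the semi-flattened form to the nested form can be
 sketched in plain Perl as follows. The function unflatten_pairs is a
 hypothetical stand-in written for this illustration, not the module's
 own code, and the sample values are made up.

```perl
use strict;
use warnings;

# Hypothetical helper: turn name => data pairs, where an array-ref value
# holds further flat pairs, into nested [ name => data ] nodes.
sub unflatten_pairs {
    my ($name, $data) = @_;
    return [ $name => $data ] unless ref($data) eq 'ARRAY';
    my @flat = @$data;
    my @kids;
    # consume the flat list two items at a time: (child name, child data)
    while (my ($k, $v) = splice(@flat, 0, 2)) {
        push @kids, unflatten_pairs($k, $v);
    }
    return [ $name => \@kids ];
}

my $node = unflatten_pairs(
    person => [ name    => 'fred',
                phone   => '555-1234',
                address => [ street => '1 Main St',
                             city   => 'Springfield' ] ]);
print $node->[1][2][0], " has ", scalar(@{ $node->[1][2][1] }), " children\n";
```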
681
682 makehandler
683
684 Title: makehandler
685
686 Args: hash of CODEREFs keyed by element name
687 OR a string containing the name of a module
 Returns: Data::Stag::BaseHandler
689 Example: $h = Data::Stag->makehandler(%subs);
690 Example: $h = Data::Stag->makehandler("My::FooHandler");
691
692 This creates a Stag event handler. The argument is a hash of subroutines
693 keyed by element/node name. After each node is fired by the
694 parser/generator, the subroutine is called, passing the handler object
 and the stag node as arguments. Whatever the subroutine returns is
 placed back into the tree.

 For example, for a parser/generator that fires events with the
 following tree form
700
701 <person>
702 <name>foo</name>
703 ...
704 </person>
705
706 we can create a handler that writes person/name like this:
707
 $h = Data::Stag->makehandler(
     person => sub { my ($self, $stag) = @_;
                     print $stag->get_name;
                     return $stag; # don't change tree
                   });
 $stag = Data::Stag->parse(-str=>"(...)", -handler=>$h);
714
715 See the Data::Stag::BaseHandler manpage for details on handlers
716
717 getformathandler
718
719 Title: getformathandler
720
 Args: format str OR Data::Stag::BaseHandler
 Returns: Data::Stag::BaseHandler
 Example: $h = Data::Stag->getformathandler('xml');
 $h->file("my.xml");
 Data::Stag->parse(-file=>$fn, -handler=>$h);
726
727 Creates a Stag event handler - this handler can be passed to an event
728 generator / parser. Built in handlers include:
729
730 xml Generates xml tags from events
731
732 sxpr
733 Generates S-Expressions from events
734
735 itext
736 Generates indented text from events
737
 All of the above are kinds of Data::Stag::Writer.
739
740 chainhandler
741
742 Title: chainhandler
743
744 Args: blocked events - str or str[]
745 initial handler - handler object
746 final handler - handler object
747 Returns:
748 Example: $h = Data::Stag->chainhandler('foo', $processor, 'xml')
749
750 chains handlers together - for example, you may want to make transforms
751 on an event stream, and then pass the event stream to another handler -
 for example, an xml handler
753
754 $processor = Data::Stag->makehandler(
755 a => sub { my ($self,$stag) = @_;
756 $stag->set_foo("bar");
757 return $stag
758 },
759 b => sub { my ($self,$stag) = @_;
760 $stag->set_blah("eek");
761 return $stag
762 },
763 );
764 $chainh = Data::Stag->chainhandler(['a', 'b'], $processor, 'xml');
765 $stag = Data::Stag->parse(-str=>"(...)", -handler=>$chainh)
766
767 chains together two handlers (see also the script stag-handle.pl)
768
769 RECURSIVE SEARCHING
770
771 find (f)
772
773 Title: find
774 Synonym: f
775
776 Args: element str
777 Returns: node[] or ANY
778 Example: @persons = stag_find($struct, 'person');
779 Example: @persons = $struct->find('person');
780
781 recursively searches tree for all elements of the given type, and
782 returns all nodes or data elements found.
783
 if the element found is a non-terminal node, will return the node; if
 the element found is a terminal (leaf) node, will return the data value
786
787 the element argument can be a path
788
789 @names = $struct->find('department/person/name');
790
791 will find name in the nested structure below:
792
793 (department
794 (person
795 (name "foo")))
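
 The search rule described above (nodes for non-terminal matches, data
 values for terminal matches) can be sketched in plain Perl. The function
 find_all is hypothetical and handles only a single element name, not
 paths; it is an illustration of the rule, not the module's code.

```perl
use strict;
use warnings;

# Hypothetical helper: depth-first search for every node named $name.
sub find_all {
    my ($node, $name) = @_;
    my ($elt, $data) = @$node;
    my @found;
    if ($elt eq $name) {
        # non-terminal match: return the node; terminal match: its value
        push @found, ref($data) eq 'ARRAY' ? $node : $data;
    }
    if (ref($data) eq 'ARRAY') {
        # keep searching below this node
        push @found, map { find_all($_, $name) } @$data;
    }
    return @found;
}

my $tree = [ department => [
    [ person => [ [ name => 'foo' ] ] ],
    [ person => [ [ name => 'bar' ] ] ],
] ];
my @names   = find_all($tree, 'name');      # terminal matches: data values
my @persons = find_all($tree, 'person');    # non-terminal matches: nodes
print "names: @names\n";
print "persons found: ", scalar(@persons), "\n";
```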
796
797 findnode (fn)
798
799 Title: findnode
800 Synonym: fn
801
802 Args: element str
803 Returns: node[]
804 Example: @persons = stag_findnode($struct, 'person');
805 Example: @persons = $struct->findnode('person');
806
807 recursively searches tree for all elements of the given type, and
808 returns all nodes found.
809
810 paths can also be used (see find)
811
812 findval (fv)
813
814 Title: findval
815 Synonym: fv
816
817 Args: element str
818 Returns: ANY[] or ANY
819 Example: @names = stag_findval($struct, 'name');
820 Example: @names = $struct->findval('name');
821 Example: $firstname = $struct->findval('name');
822
823 recursively searches tree for all elements of the given type, and
824 returns all data values found. the data values could be primitive
825 scalars or nodes.
826
827 paths can also be used (see find)
828
829 sfindval (sfv)
830
831 Title: sfindval
832 Synonym: sfv
833
834 Args: element str
835 Returns: ANY
836 Example: $name = stag_sfindval($struct, 'name');
837 Example: $name = $struct->sfindval('name');
838
839 as findval, but returns the first value found
840
841 paths can also be used (see find)
842
843 findvallist (fvl)
844
845 Title: findvallist
846 Synonym: fvl
847
848 Args: element str[]
849 Returns: ANY[]
850 Example: ($name, $phone) = stag_findvallist($personstruct, 'name', 'phone');
851 Example: ($name, $phone) = $personstruct->findvallist('name', 'phone');
852
853 recursively searches tree for all elements in the list
854
855 DEPRECATED
856
857 DATA ACCESSOR METHODS
858
859 these allow getting and setting of elements directly underneath the
860 current one
861
862 get (g)
863
864 Title: get
865 Synonym: g
866
867 Args: element str
868 Return: node[] or ANY
869 Example: $name = $person->get('name');
870 Example: @phone_nos = $person->get('phone_no');
871
872 gets the value of the named sub-element
873
 if the sub-element is a non-terminal, will return the node(s); if the
 sub-element is a terminal (leaf), will return the data value(s)
876
877 the examples above would work on a data structure like this:
878
879 [person => [ [name => 'fred'],
880 [phone_no => '1-800-111-2222'],
881 [phone_no => '1-415-555-5555']]]
882
883 will return an array or single value depending on the context
884
885 [equivalent to findval(), except that only direct children (as opposed
 to all descendants) are checked]
887
888 paths can also be used, like this:
889
890 @phones_nos = $struct->get('person/phone_no')
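
 The two properties described above, looking only at direct children and
 returning a list or a single value depending on context, can be
 sketched in plain Perl. The function get_child is hypothetical, does
 not handle paths, and is not the module's implementation.

```perl
use strict;
use warnings;

# Hypothetical helper: values of the direct children named $name.
# Non-terminal children are returned as nodes, terminal ones as data.
sub get_child {
    my ($node, $name) = @_;
    my @vals = map  { ref($_->[1]) eq 'ARRAY' ? $_ : $_->[1] }
               grep { $_->[0] eq $name } @{ $node->[1] };
    # list context: all matches; scalar context: the first match
    return wantarray ? @vals : $vals[0];
}

my $person = [ person => [ [ name     => 'fred' ],
                           [ phone_no => '1-800-111-2222' ],
                           [ phone_no => '1-415-555-5555' ] ] ];
my $name      = get_child($person, 'name');
my @phone_nos = get_child($person, 'phone_no');
print "$name has ", scalar(@phone_nos), " phone numbers\n";
```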
891
892 sget (sg)
893
894 Title: sget
895 Synonym: sg
896
897 Args: element str
898 Return: ANY
899 Example: $name = $person->sget('name');
900 Example: $phone = $person->sget('phone_no');
 Example: $name = $struct->sget('department/person/name');
902
903 as get but always returns a single value
904
905 [equivalent to sfindval(), except that only direct children (as opposed
 to all descendants) are checked]
907
908 getl (gl getlist)
909
 Title: getl
 Synonym: gl
 Synonym: getlist
913
914 Args: element str[]
915 Return: node[] or ANY[]
916 Example: ($name, @phone) = $person->getl('name', 'phone_no');
917
918 returns the data values for a list of sub-elements of a node
919
920 [equivalent to findvallist(), except that only direct children (as
 opposed to all descendants) are checked]
922
923 getn (gn getnode)
924
925 Title: getn
926 Synonym: gn
927 Synonym: getnode
928
929 Args: element str
930 Return: node[]
931 Example: $namestruct = $person->getn('name');
932 Example: @pstructs = $person->getn('phone_no');
933
934 as get but returns the whole node rather than just the data value
935
936 [equivalent to findnode(), except that only direct children (as opposed
 to all descendants) are checked]
938
939 sgetmap (sgm)
940
941 Title: sgetmap
942 Synonym: sgm
943
944 Args: hash
945 Return: hash
946 Example: %h = $person->sgetmap('social-security-no'=>'id',
947 'name' =>'label',
948 'job' =>0,
949 'address' =>'location');
950
 returns a hash of key/val pairs built from the data values of the
 subnodes of the current element; keys are mapped according to the
 hash passed (a value of '' or 0 leaves the key unchanged).
954
955 no multivalued data elements are allowed
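
 The key mapping can be sketched in plain Perl as below. The function
 child_map is hypothetical and the sample data is invented for the
 illustration; it assumes, as stated above, that each element occurs at
 most once.

```perl
use strict;
use warnings;

# Hypothetical helper: build a hash from the direct terminal children,
# renaming keys according to %map; a false mapping keeps the key as is.
sub child_map {
    my ($node, %map) = @_;
    my %out;
    for my $kid (@{ $node->[1] }) {
        my ($name, $val) = @$kid;
        next unless exists $map{$name};     # only the requested elements
        $out{ $map{$name} || $name } = $val;
    }
    return %out;
}

my $person = [ person => [ [ 'social-security-no' => '123-45-6789' ],
                           [ name    => 'fred' ],
                           [ job     => 'bus driver' ],
                           [ address => 'here' ] ] ];
my %h = child_map($person,
                  'social-security-no' => 'id',
                  'name'               => 'label',
                  'job'                => 0,
                  'address'            => 'location');
print join(', ', map { "$_=$h{$_}" } sort keys %h), "\n";
```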
956
957 set (s)
958
959 Title: set
960 Synonym: s
961
962 Args: element str, datavalue ANY (list)
963 Return: ANY
964 Example: $person->set('name', 'fred'); # single val
965 Example: $person->set('phone_no', $cellphone, $homephone);
966
967 sets the data value of an element for any node. if the element is
968 multivalued, all the old values will be replaced with the new ones
969 specified.
970
971 ordering will be preserved, unless the element specified does not exist,
972 in which case, the new tag/value pair will be placed at the end.
973
974 for example, if we have a stag node $person
975
976 person:
977 name: shuggy
978 job: bus driver
979
980 if we do this
981
982 $person->set('name', ());
983
984 we will end up with
985
986 person:
987 job: bus driver
988
989 then if we do this
990
991 $person->set('name', 'shuggy');
992
 the 'name' node will be placed as the last child element
994
995 person:
996 job: bus driver
997 name: shuggy
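
 The ordering rule (replace in place if the element exists, otherwise
 append) can be sketched in plain Perl. The function set_child is
 hypothetical and handles terminal values only; it illustrates the rule
 described above rather than reproducing the module's code.

```perl
use strict;
use warnings;

# Hypothetical helper: replace all direct children named $name with new
# terminal nodes built from @vals.
sub set_child {
    my ($node, $name, @vals) = @_;
    my $kids = $node->[1];
    my @new  = map { [ $name => $_ ] } @vals;
    # index of the first existing child with this name, if any
    my ($first) = grep { $kids->[$_][0] eq $name } 0 .. $#$kids;
    if (defined $first) {
        # keep the old position and drop every old node of this name
        my @before = @$kids[0 .. $first - 1];
        my @after  = grep { $_->[0] ne $name } @$kids[$first .. $#$kids];
        @$kids = (@before, @new, @after);
    }
    else {
        # element not present: append at the end
        push @$kids, @new;
    }
    return;
}

my $person = [ person => [ [ name => 'shuggy' ], [ job => 'bus driver' ] ] ];
set_child($person, 'name');                       # no values: name is removed
set_child($person, 'name', 'shuggy');             # re-added, now after job
set_child($person, 'job', 'bus driver', 'poet');  # replaced in place
my @order = map { join('=', @$_) } @{ $person->[1] };
print join(', ', @order), "\n";
```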
998
999 You can also use magic methods, for example
1000
1001 $person->set_name('shuggy');
1002 $person->set_job('bus driver', 'poet');
1003 print $person->itext;
1004
1005 will print
1006
1007 person:
1008 name: shuggy
1009 job: bus driver
1010 job: poet
1011
1012 note that if the datavalue is a non-terminal node as opposed to a
1013 primitive value, then you have to do it like this:
1014
1015 $people = Data::Stag->new(people=>[
1016 [person=>[[name=>'Sherlock Holmes']]],
1017 [person=>[[name=>'Moriarty']]],
1018 ]);
1019 $address = Data::Stag->new(address=>[
1020 [address_line=>"221B Baker Street"],
1021 [city=>"London"],
1022 [country=>"Great Britain"]]);
1023 ($person) = $people->qmatch('person', (name => "Sherlock Holmes"));
1024 $person->set("address", $address->data);
1025
1026 unset (u)
1027
1028 Title: unset
1029 Synonym: u
1030
1031 Args: element str, datavalue ANY
1032 Return: ANY
1033 Example: $person->unset('name');
1034 Example: $person->unset('phone_no');
1035
1036 prunes all nodes of the specified element from the current node
1037
1038 You can use magic methods, like this
1039
1040 $person->unset_name;
1041 $person->unset_phone_no;
1042
1043 free
1044
 Title: free
1047
1048 Args:
1049 Return:
1050 Example: $person->free;
1051
1052 removes all data from a node. If that node is a subnode of another node,
1053 it is removed altogether
1054
1055 for instance, if we had the data below:
1056
1057 <person>
1058 <name>fred</name>
1059 <address>
1060 ..
1061 </address>
1062 </person>
1063
1064 and called
1065
1066 $person->get_address->free
1067
1068 then the person node would look like this:
1069
1070 <person>
1071 <name>fred</name>
1072 </person>
1073
1074 add (a)
1075
1076 Title: add
1077 Synonym: a
1078
1079 Args: element str, datavalues ANY[]
1080 OR
1081 Data::Stag
1082 Return: ANY
1083 Example: $person->add('phone_no', $cellphone, $homephone);
1084 Example: $person->add_phone_no('1-555-555-5555');
1085 Example: $dataset->add($person)
1086
1087 adds a datavalue or list of datavalues. appends if already existing,
1088 creates new element value pairs if not already existing.
1089
1090 if the argument is a stag node, it will add this node under the current
1091 one
1092
1093 element (e name)
1094
1095 Title: element
1096 Synonym: e
1097 Synonym: name
1098
1099 Args:
1100 Return: element str
1101 Example: $element = $struct->element
1102
1103 returns the element name of the current node.
1104
1105 This is illustrated in the different representation formats below
1106
1107 sxpr
1108 (element "data")
1109
1110 or
1111
1112 (element
1113 (sub_element "..."))
1114
1115 xml
1116 <element>data</element>
1117
1118 or
1119
1120 <element>
1121 <sub_element>...</sub_element>
1122 </element>
1123
1124 perl
1125 [element => $data ]
1126
1127 or
1128
1129 [element => [
1130 [sub_element => "..." ]]]
1131
1132 itext
1133 element: data
1134
1135 or
1136
1137 element:
1138 sub_element: ...
1139
1140 kids (k children)
1141
1142 Title: kids
1143 Synonym: k
1144 Synonym: children
1145
1146 Args:
1147 Return: ANY or ANY[]
1148 Example: @nodes = $person->kids
1149 Example: $name = $namestruct->kids
1150
1151 returns the data value(s) of the current node; if it is a terminal node,
1152 returns a single value which is the data. if it is non-terminal, returns
1153 an array of nodes
1154
1155 addkid (ak addchild)
1156
1157 Title: addkid
1158 Synonym: ak
1159 Synonym: addchild
1160
1161 Args: kid node
1162 Return: ANY
 Example: $person->addkid($job);
1164
1165 adds a new child node to a non-terminal node, after all the existing
1166 child nodes
1167
1168 subnodes
1169
1170 Title: subnodes
1171
1172 Args:
1173 Return: ANY[]
1174 Example: @nodes = $person->subnodes
1175
1176 returns the non-terminal data value(s) of the current node;
1177
1178 QUERYING AND ADVANCED DATA MANIPULATION
1179
1180 ijoin (j)
1181
1182 Title: ijoin
1183 Synonym: j
1184 Synonym: ij
1185
1186 Args: element str, key str, data Node
1187 Return: undef
1188
1189 does a relational style inner join - see previous example in this doc
1190
 key can either be a single node name that must be shared (analogous to
 SQL INNER JOIN .. USING), or a key1=key2 equivalence relation (analogous
 to SQL INNER JOIN ... ON)

   qmatch (qm)

           Title: qmatch
         Synonym: qm

            Args: return-element str, match-element str, match-value str
          Return: node[]
         Example: @persons = $s->qmatch('person', 'name', 'fred');
         Example: @persons = $s->qmatch('person', (job=>'bus driver'));

    Queries the node tree for all elements that satisfy the specified
    key=val match (see the previous example in this doc).

    For those inclined to think relationally, this can be thought of as
    a query that returns a stag object:

      SELECT <return-element> FROM <stag-node> WHERE <match-element> = <match-value>

    This always returns an array, so calling it in a scalar context
    returns the number of matching elements. For example, after

      $n = $s->qmatch('person', (name=>'fred'));

    the value of $n will be equal to the number of persons called fred.
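
    A small sketch of both calling contexts, assuming Data::Stag is
    installed and that the tree is built with Data::Stag->new(NAME =>
    DATA). The personset data and the variable names are invented for
    illustration.

```perl
use strict;
use warnings;
use Data::Stag;

# Invented example data: three persons, two of them called fred.
my $s = Data::Stag->new(personset => [
    [ person => [ [ name => 'fred' ], [ job => 'bus driver' ] ] ],
    [ person => [ [ name => 'jim'  ], [ job => 'teacher'    ] ] ],
    [ person => [ [ name => 'fred' ], [ job => 'teacher'    ] ] ],
]);

# List context: the person nodes whose name is fred.
my @freds = $s->qmatch('person', name => 'fred');
my @jobs  = map { $_->sget('job') } @freds;

# Scalar context: the number of matching person nodes.
my $n = $s->qmatch('person', name => 'fred');

print "jobs held by freds: ", join(', ', @jobs), "\n";
print "number of freds: $n\n";
```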

   tmatch (tm)

           Title: tmatch
         Synonym: tm

            Args: element str, value str
          Return: bool
         Example: @persons = grep {$_->tmatch('name', 'fred')} @persons

    Returns true if the value of the specified element matches (see the
    previous example in this doc).

   tmatchhash (tmh)

           Title: tmatchhash
         Synonym: tmh

            Args: match hashref
          Return: bool
         Example: @persons = grep {$_->tmatchhash({name=>'fred', hair_colour=>'green'})} @persons

    Returns true if the node matches a set of constraints, specified as
    a hash.

   tmatchnode (tmn)

           Title: tmatchnode
         Synonym: tmn

            Args: match node
          Return: bool
         Example: @persons = grep {$_->tmatchnode([person=>[[name=>'fred'], [hair_colour=>'green']]])} @persons

    Returns true if the node matches a set of constraints, specified as
    a node.

   cmatch (cm)

           Title: cmatch
         Synonym: cm

            Args: element str, value str
          Return: int
         Example: $n_freds = $personset->cmatch('name', 'fred');

    Counts the number of matches.
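
    A short sketch of counting matches, assuming Data::Stag is installed
    and that the tree is built with Data::Stag->new(NAME => DATA). The
    personset data is invented for the example.

```perl
use strict;
use warnings;
use Data::Stag;

# Invented example data.
my $personset = Data::Stag->new(personset => [
    [ person => [ [ name => 'fred' ] ] ],
    [ person => [ [ name => 'jim'  ] ] ],
    [ person => [ [ name => 'fred' ] ] ],
]);

# Count the name elements whose value is fred, then bob.
my $n_freds = $personset->cmatch('name', 'fred');
my $n_bobs  = $personset->cmatch('name', 'bob');

print "freds: $n_freds, bobs: $n_bobs\n";
```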

   where (w)

           Title: where
         Synonym: w

            Args: element str, test CODE
          Return: Node[]
         Example: @rich_persons = $data->where('person', sub {shift->get_salary > 100000});

    The tree is queried for all elements of the specified type that
    satisfy the coderef (which must return a boolean). Calls can be
    nested:

      my @rich_dog_or_cat_owners =
        $data->where('person',
                     sub {my $p = shift;
                          $p->get_salary > 100000 &&
                          $p->where('pet',
                                    sub {shift->get_type =~ /(dog|cat)/})});
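
    The same two queries as a self-contained sketch, assuming Data::Stag
    is installed and that the tree is built with Data::Stag->new(NAME =>
    DATA). The dataset is invented, and the explicit sget calls stand in
    for the get_salary and get_type shortcuts used above.

```perl
use strict;
use warnings;
use Data::Stag;

# Invented example data.
my $data = Data::Stag->new(dataset => [
    [ person => [ [ name => 'ann' ], [ salary => 150000 ],
                  [ pet => [ [ type => 'dog' ] ] ] ] ],
    [ person => [ [ name => 'bob' ], [ salary => 50000 ],
                  [ pet => [ [ type => 'cat' ] ] ] ] ],
    [ person => [ [ name => 'cy' ], [ salary => 200000 ],
                  [ pet => [ [ type => 'fish' ] ] ] ] ],
]);

# Persons whose salary is above the threshold.
my @rich_persons =
  $data->where('person', sub { shift->sget('salary') > 100000 });

# Rich persons who also own at least one dog or cat (nested where).
my @rich_dog_or_cat_owners =
  $data->where('person', sub {
      my $p = shift;
      my @pets = $p->where('pet',
                           sub { shift->sget('type') =~ /^(?:dog|cat)$/ });
      return $p->sget('salary') > 100000 && @pets > 0;
  });

my @rich_names  = map { $_->sget('name') } @rich_persons;
my @owner_names = map { $_->sget('name') } @rich_dog_or_cat_owners;
print "rich: @rich_names\n";
print "rich dog or cat owners: @owner_names\n";
```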

   iterate (i)

           Title: iterate
         Synonym: i

            Args: CODE
          Return: Node[]
         Example: $data->iterate(sub {
                      my $stag = shift;
                      my $parent = shift;
                      if ($stag->element eq 'pet') {
                          $parent->set_pet_name($stag->get_name);
                      }
                  });

    Iterates through the whole tree, calling the specified subroutine.

    The first arg passed to the subroutine is the stag node representing
    the tree at that point; the second arg is its parent.

    For instance, the example code above would turn this

      (person
        (name "jim")
        (pet
          (name "fluffy")))

    into this

      (person
        (name "jim")
        (pet_name "fluffy")
        (pet
          (name "fluffy")))
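
    The transformation above as a runnable sketch, assuming Data::Stag is
    installed and that the tree is built with Data::Stag->new(NAME =>
    DATA). Explicit set and sget calls are used here in place of the
    set_pet_name and get_name shortcuts; the guard on $parent is an
    addition for the top node, which has no parent.

```perl
use strict;
use warnings;
use Data::Stag;

# The person/pet tree from the example above.
my $person = Data::Stag->new(person => [
    [ name => 'jim' ],
    [ pet  => [ [ name => 'fluffy' ] ] ],
]);

$person->iterate(sub {
    my ($stag, $parent) = @_;
    # Copy each pet's name up to its parent as pet_name.
    if ($stag->element eq 'pet' && defined $parent) {
        $parent->set('pet_name', $stag->sget('name'));
    }
});

my $pet_name = $person->sget('pet_name');
print "pet_name is now: $pet_name\n";
```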

  MISCELLANEOUS METHODS

   duplicate (d)

           Title: duplicate
         Synonym: d

            Args:
          Return: Node
         Example: $node2 = $node->duplicate;

    Does a deep copy of a stag structure.
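
    A brief sketch of what a deep copy means in practice, assuming
    Data::Stag is installed and that the tree is built with
    Data::Stag->new(NAME => DATA). The data is invented for the example.

```perl
use strict;
use warnings;
use Data::Stag;

# Invented example data.
my $node = Data::Stag->new(person => [
    [ name => 'jim' ],
    [ pet  => [ [ name => 'fluffy' ] ] ],
]);

# Copy the tree, then change the copy only.
my $node2 = $node->duplicate;
$node2->set('name', 'bob');

my $original_name = $node->sget('name');
my $copy_name     = $node2->sget('name');
print "original: $original_name, copy: $copy_name\n";
```

    Because the copy is deep, changes made through $node2 are not expected
    to show up in $node.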

   isanode

           Title: isanode

            Args:
          Return: bool
         Example: if (stag_isanode($node)) { ... }

   hash

           Title: hash

            Args:
          Return: hash
         Example: $h = $node->hash;

    Turns a tree into a hash. All data values will be arrayrefs.

   pairs

           Title: pairs

    Turns a tree into a hash. All data values will be scalar (IMPORTANT:
    this means duplicate values will be lost).

   write

           Title: write

            Args: filename str, format str[optional]
          Return:
         Example: $node->write("myfile.xml");
         Example: $node->write("myfile", "itext");

    If the format is not specified, tries to guess it from the filename
    extension.

   xml

           Title: xml

            Args:
          Return: xml str
         Example: print $node->xml;
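
    A sketch that uses xml and write together, assuming Data::Stag is
    installed and that the tree is built with Data::Stag->new(NAME =>
    DATA). The data, the temporary directory and the file name person.xml
    are invented for illustration; File::Temp and File::Spec are core
    Perl modules.

```perl
use strict;
use warnings;
use Data::Stag;
use File::Temp qw(tempdir);
use File::Spec;

# Invented example data.
my $node = Data::Stag->new(person => [
    [ name     => 'jim' ],
    [ phone_no => '555-0100' ],
]);

# Get the XML as a string.
my $xml = $node->xml;
print $xml;

# Write the same tree to a file; the format is taken from the extension.
my $dir  = tempdir(CLEANUP => 1);
my $file = File::Spec->catfile($dir, 'person.xml');
$node->write($file);

print "wrote $file\n" if -e $file;
```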

  XML METHODS

   sax

           Title: sax

            Args: saxhandler SAX-CLASS
          Return:
         Example: $node->sax($mysaxhandler);

    Turns a tree into a series of SAX events.

   xpath (xp tree2xpath)

           Title: xpath
         Synonym: xp
         Synonym: tree2xpath

            Args:
          Return: xpath object
         Example: $xp = $node->xpath; $q = $xp->find($xpathquerystr);

   xpquery (xpq xpathquery)

           Title: xpquery
         Synonym: xpq
         Synonym: xpathquery

            Args: xpathquery str
          Return: Node[]
         Example: @nodes = $node->xpq($xpathquerystr);

STAG SCRIPTS
    The following scripts come with the stag module:

    stag-autoschema.pl
        writes the implicit stag-schema for a stag file

    stag-db.pl
        persistent storage and retrieval for stag data (xml, sxpr, itext)

    stag-diff.pl
        finds the difference between two stag files

    stag-drawtree.pl
        draws a stag file (xml, itext, sxpr) as a PNG diagram

    stag-filter.pl
        filters a stag file (xml, itext, sxpr) for nodes of interest

    stag-findsubtree.pl
        finds nodes in a stag file

    stag-flatten.pl
        turns stag data into a flat table

    stag-grep.pl
        filters a stag file (xml, itext, sxpr) for nodes of interest

    stag-handle.pl
        streams a stag file through a handler into a writer

    stag-join.pl
        joins two stag files together based around a common key

    stag-mogrify.pl
        mangles stag files

    stag-parse.pl
        parses a file and fires events (e.g. sxpr to xml)

    stag-query.pl
        aggregate queries

    stag-split.pl
        splits a stag file (xml, itext, sxpr) into multiple files

    stag-splitter.pl
        splits a stag file into multiple files

    stag-view.pl
        draws an expandable Tk tree diagram showing stag data

    To get more documentation, type

      stag-<script>.pl -h

BUGS
    None known so far; possibly quite a few undocumented features!

    Not a bug, but the underlying default datastructure of nested arrays is
    more heavyweight than it needs to be. More lightweight implementations
    are possible. Some time I will write a C implementation.

WEBSITE
    http://stag.sourceforge.net

AUTHOR
    Chris Mungall <cjm AT fruitfly DOT org>

COPYRIGHT
    Copyright (c) 2004 Chris Mungall

    This module is free software. You may distribute this module under the
    same terms as Perl itself.