README

NAME
      Data::Stag - Structured Tags datastructures

SYNOPSIS
      # PROCEDURAL USAGE
      use Data::Stag qw(:all);
      $doc = stag_parse($file);
      @persons = stag_find($doc, "person");
      foreach $p (@persons) {
        printf "%s, %s phone: %s\n",
          stag_sget($p, "family_name"),
          stag_sget($p, "given_name"),
          stag_sget($p, "phone_no"),
        ;
      }

      # OBJECT-ORIENTED USAGE
      use Data::Stag;
      $doc = Data::Stag->parse($file);
      @persons = $doc->find("person");
      foreach $p (@persons) {
        printf "%s, %s phone: %s\n",
          $p->sget("family_name"),
          $p->sget("given_name"),
          $p->sget("phone_no"),
        ;
      }

DESCRIPTION
    This module is for manipulating data as hierarchical tag/value pairs
    (Structured TAGs or Simple Tree AGgregates). These datastructures can
    be represented as nested arrays, which have the advantage of being
    native to perl. A simple example is shown below:

      [ person=> [  [ family_name => $family_name ],
                    [ given_name  => $given_name  ],
                    [ phone_no    => $phone_no    ] ] ]

    Data::Stag uses a subset of XML for import and export. This means the
    module can also be used as a general XML parser/writer (with certain
    caveats).

    The above set of structured tags can be represented in XML as

      <person>
        <family_name>...</family_name>
        <given_name>...</given_name>
        <phone_no>...</phone_no>
      </person>
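
    The correspondence between the nested array and its XML form can be
    sketched in plain perl. The to_xml subroutine below is a hypothetical
    helper written for this illustration; it is not part of Data::Stag,
    performs no escaping, and assumes only the [ name => data ] layout
    shown above.

```perl
use strict;
use warnings;

# Hypothetical helper (not part of Data::Stag): serialise a nested-array
# node as XML. No escaping of special characters is attempted.
sub to_xml {
    my ($node, $depth) = @_;
    $depth //= 0;
    my ($name, $data) = @$node;
    my $pad = '  ' x $depth;
    if (ref $data eq 'ARRAY') {
        # non-terminal node: recurse into each child node
        my $inner = join '', map { to_xml($_, $depth + 1) } @$data;
        return "$pad<$name>\n$inner$pad</$name>\n";
    }
    # terminal node: the data is a plain scalar
    return "$pad<$name>$data</$name>\n";
}

my $person = [ person => [ [ family_name => 'Smith' ],
                           [ given_name  => 'Jo'    ],
                           [ phone_no    => '555-1234' ] ] ];
print to_xml($person);
```

    The names and values in $person are made up for the example.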

    This datastructure can be examined, manipulated and exported using Stag
    functions or methods:

      $document = Data::Stag->parse($file);
      @persons = $document->find('person');
      foreach my $person (@persons) {
        $person->set('full_name',
                     $person->sget('given_name') . ' ' .
                     $person->sget('family_name'));
      }

    Advanced querying is performed by passing functions, for example:

      # get all people in dataset with name starting 'A'
      @persons =
        $document->where('person',
                         sub {shift->sget('family_name') =~ /^A/});
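
    The idea of querying with a function can be sketched on plain nested
    arrays. The subroutines child_value and where_nodes below are
    hypothetical helpers invented for this illustration, not the module's
    own implementation of where; the sample data is also made up.

```perl
use strict;
use warnings;

# Hypothetical helpers (not Data::Stag's own code) working on plain
# nested arrays of the form [ name => data ].

# data of the first direct child called $name, or undef if there is none
sub child_value {
    my ($node, $name) = @_;
    my $data = $node->[1];
    return unless ref $data eq 'ARRAY';
    for my $child (@$data) {
        return $child->[1] if $child->[0] eq $name;
    }
    return;
}

# all nodes called $name, anywhere in the tree, for which $test is true
sub where_nodes {
    my ($node, $name, $test) = @_;
    my @hits;
    push @hits, $node if $node->[0] eq $name && $test->($node);
    my $data = $node->[1];
    if (ref $data eq 'ARRAY') {
        push @hits, where_nodes($_, $name, $test) for @$data;
    }
    return @hits;
}

my $doc = [ people => [
    [ person => [ [ family_name => 'Adams' ], [ given_name => 'Ann' ] ] ],
    [ person => [ [ family_name => 'Brown' ], [ given_name => 'Bob' ] ] ],
] ];

# keep only the people whose family name starts with 'A'
my @a_people = where_nodes($doc, 'person',
    sub { (child_value(shift, 'family_name') // '') =~ /^A/ });
print child_value($_, 'given_name'), "\n" for @a_people;
```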

    One of the things that marks this module out against other XML modules
    is its emphasis on a functional approach, in addition to the
    object-oriented and procedural approaches.

  PROCEDURAL VS OBJECT-ORIENTED USAGE

    Depending on your preference, this module can be used as a set of
    procedural subroutine calls, or as method calls upon Data::Stag objects,
    or both.

    In procedural mode, all the subroutine calls are prefixed "stag_" to
    avoid namespace clashes. The following three calls are equivalent:

      $person = stag_find($doc, "person");
      $person = $doc->find("person");
      $person = $doc->find_person;

    In object mode, you can treat any tree element as if it is an object
    with automatically defined methods for getting/setting the tag values.

  USE OF XML

    Nested arrays can be imported and exported as XML, as well as other
    formats. XML can be slurped into memory all at once (using less memory
    than an equivalent DOM tree), or a simplified SAX style event handling
    model can be used. Similarly, data can be exported all at once, or as a
    series of events.

    Although this module can be used as a general XML tool, it is intended
    primarily as a tool for manipulating hierarchical data using nested
    tag/value pairs.

    By using a simpler subset of XML equivalent to a basic data tree
    structure, we can write simpler, cleaner code. This simplicity comes at
    a price: this module is not very suitable for XML with attributes or
    mixed content.

    All attributes are turned into elements. This means that it will not
    round-trip a piece of XML with attributes in it. For some applications
    this is acceptable, for others it is not.

    Mixed content cannot be represented in a simple tree format, so this is
    also expanded.

    The following piece of XML

      <paragraph id="1">
        example of <bold>mixed</bold>content
      </paragraph>

    gets parsed as if it were actually:

      <paragraph>
        <paragraph-id>1</paragraph-id>
        <paragraph-text>example of</paragraph-text>
        <bold>mixed</bold>
        <paragraph-text>content</paragraph-text>
      </paragraph>

    This module is more suited to dealing with data-oriented documents than
    text-oriented documents.

    It can also be used as part of a SAX-style event generation / handling
    framework; see the Data::Stag::BaseHandler manpage.

    Because nested arrays are native to perl, we can specify an XML
    datastructure directly in perl without going through multiple object
    calls.

    For example, instead of the lengthy

      $obj->startTag("record");
      $obj->startTag("field1");
      $obj->characters("foo");
      $obj->endTag("field1");
      $obj->startTag("field2");
      $obj->characters("bar");
      $obj->endTag("field2");
      $obj->endTag("record");

    we can instead write

      $struct = [ record => [
                  [ field1 => 'foo'],
                  [ field2 => 'bar']]];
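
    To make the equivalence between the two forms concrete, here is a
    minimal sketch of an object that accepts the event-style calls and
    builds the nested array from them. The TreeBuilder package and its
    tree method are hypothetical names used only for this illustration;
    they are not Data::Stag classes, and mixed content is not handled.

```perl
use strict;
use warnings;

# Hypothetical event collector (not a Data::Stag class): turns a stream of
# startTag / characters / endTag calls into the nested-array form.
package TreeBuilder;

sub new {
    my $class = shift;
    # the stack starts with an artificial root that collects the top node
    return bless { stack => [ [ root => [] ] ] }, $class;
}

sub startTag {
    my ($self, $name) = @_;
    push @{ $self->{stack} }, [ $name => [] ];
    return;
}

sub characters {
    my ($self, $text) = @_;
    # a node that receives text becomes a terminal node
    $self->{stack}[-1][1] = $text;
    return;
}

sub endTag {
    my ($self, $name) = @_;
    my $node = pop @{ $self->{stack} };
    die "mismatched end tag: $name" unless $node->[0] eq $name;
    push @{ $self->{stack}[-1][1] }, $node;   # attach to the parent node
    return;
}

# the first (and only) child of the artificial root
sub tree { return $_[0]{stack}[0][1][0] }

package main;

my $obj = TreeBuilder->new;
$obj->startTag("record");
$obj->startTag("field1");
$obj->characters("foo");
$obj->endTag("field1");
$obj->startTag("field2");
$obj->characters("bar");
$obj->endTag("field2");
$obj->endTag("record");

# same shape as [ record => [ [ field1 => 'foo' ], [ field2 => 'bar' ] ] ]
my $struct = $obj->tree;
print scalar @{ $struct->[1] }, " fields\n";
```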

   PARSING

    The following example parses out a subsection of a tree and changes
    one of its sub-elements:

      use Data::Stag qw(:all);
      my $tree = stag_parse($xmlfile);
      my ($subtree) = stag_findnode($tree, $element);
      stag_set($subtree, $sub_element, $new_val);
      print stag_xml($subtree);

   OBJECT ORIENTED

    The same can be done in a more OO fashion:

      use Data::Stag qw(:all);
      my $tree = Data::Stag->parse($xmlfile);
      my ($subtree) = $tree->findnode($element);
      $subtree->set($sub_element, $new_val);
      print $subtree->xml;

   IN A STREAM

    Rather than parsing a whole file into memory all at once (which may
    not be suitable for very large files), you can take an event handling
    approach. The easiest way to do this is to register which nodes in the
    file you are interested in using the makehandler method. The parser will
    sweep through the file, building objects as it goes, and handing the
    object to a subroutine that you specify.

    For example:

      use Data::Stag;
      # catch the end of 'person' elements
      my $h = Data::Stag->makehandler(
          person => sub {
              my ($self, $person) = @_;
              printf "name:%s phone:%s\n",
                $person->get_name,
                $person->get_phone;
              return;   # clear node
          });
      Data::Stag->parse(-handler=>$h,
                        -file=>$f);

    See the Data::Stag::BaseHandler manpage for writing handlers.

    See the Stag website at http://stag.sourceforge.net for more examples.

  STRUCTURED TAGS TREE DATA STRUCTURE

    A tree of structured tags is represented as a recursively nested array;
    the elements of the array represent nodes in the tree.

    A node is a name/data pair that can represent tags and values. A node
    is represented using a reference to an array, where the first element of
    the array is the tagname, or element, and the second element is the
    data.

    This can be visualised as a box:

      +-----------+
      |Name | Data|
      +-----------+

    In perl, we represent this pair as a reference to an array:

      [ Name => $Data ]

    The Data can either be a list of child nodes (subtrees), or a data
    value.

    The terminal nodes (leaves of the tree) contain data values; these are
    represented in perl using primitive scalars.

    For example:

      [ Name => 'Fred' ]

    For non-terminal nodes, the Data is a reference to an array, where each
    element of the array is a new node.

      +-----------+
      |Name | Data|
      +-----------+
              |||   +-----------+
              ||+-->|Name | Data|
              ||    +-----------+
              ||
              ||    +-----------+
              |+--->|Name | Data|
              |     +-----------+
              |
              |     +-----------+
              +---->|Name | Data|
                    +-----------+

    In perl this would be:

      [ Name => [
                  [Name1 => $Data1],
                  [Name2 => $Data2],
                  [Name3 => $Data3],
                ]
      ];

    The extra level of nesting is required to be able to store any node in
    the tree using a single variable. This representation has lots of
    advantages over others, e.g. hashes and mixed hash/array structures.
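
    Because every node has the same two-element shape, a tree can be
    walked with one short recursive subroutine that uses ref to tell
    terminal from non-terminal nodes. The leaf_paths subroutine below is a
    hypothetical helper written for this illustration, not a Data::Stag
    function; the sample tree is made up.

```perl
use strict;
use warnings;

# Hypothetical walker (not part of Data::Stag): visits every node of a
# nested-array tree and reports each terminal node together with its path.
sub leaf_paths {
    my ($node, $prefix) = @_;
    my ($name, $data) = @$node;
    my $path = defined $prefix ? "$prefix/$name" : $name;
    # a reference to an array means child nodes; anything else is a value
    return map { leaf_paths($_, $path) } @$data if ref $data eq 'ARRAY';
    return "$path=$data";
}

my $tree = [ person => [ [ name    => 'Fred' ],
                         [ address => [ [ city => 'Perth' ] ] ] ] ];
print "$_\n" for leaf_paths($tree);
```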

  MANIPULATION AND QUERYING

    The following example is taken from biology; we have a list of species
    (mouse, human, fly) and a list of genes found in those species. These
    are cross-referenced by an identifier called tax_id. We can do a
    relational-style inner join on this identifier, as follows:

      use Data::Stag qw(:all);
      my $tree =
      Data::Stag->new(
        'db' => [
        [ 'species_set' => [
          [ 'species' => [
            [ 'common_name' => 'house mouse' ],
            [ 'binomial' => 'Mus musculus' ],
            [ 'tax_id' => '10090' ]]],
          [ 'species' => [
            [ 'common_name' => 'fruit fly' ],
            [ 'binomial' => 'Drosophila melanogaster' ],
            [ 'tax_id' => '7227' ]]],
          [ 'species' => [
            [ 'common_name' => 'human' ],
            [ 'binomial' => 'Homo sapiens' ],
            [ 'tax_id' => '9606' ]]]]],
        [ 'gene_set' => [
          [ 'gene' => [
            [ 'symbol' => 'HGNC' ],
            [ 'tax_id' => '9606' ],
            [ 'phenotype' => 'Hemochromatosis' ],
            [ 'phenotype' => 'Porphyria variegata' ],
            [ 'GO_term' => 'iron homeostasis' ],
            [ 'map' => '6p21.3' ]]],
          [ 'gene' => [
            [ 'symbol' => 'Hfe' ],
            [ 'synonym' => 'MR2' ],
            [ 'tax_id' => '10090' ],
            [ 'GO_term' => 'integral membrane protein' ],
            [ 'map' => '13 A2-A4' ]]]]]]
      );

      # inner join of species and gene parts of tree,
      # based on 'tax_id' element
      my $gene_set = $tree->find("gene_set");       # get <gene_set> element
      my $species_set = $tree->find("species_set"); # get <species_set> element
      $gene_set->ijoin("gene", "tax_id", $species_set);   # INNER JOIN

      print "Reorganised data:\n";
      print $gene_set->xml;

      # find all genes starting with letter 'H' where species/common_name = human
      my @genes =
        $gene_set->where('gene',
                         sub { my $g = shift;
                               $g->get_symbol =~ /^H/ &&
                               $g->findval("common_name") eq 'human'});

      print "Human genes beginning 'H'\n";
      print $_->xml foreach @genes;

  S-EXPRESSION (LISP) REPRESENTATION

    The data represented using this module can be represented as Lisp-style
    S-Expressions.

    See the Data::Stag::SxprParser manpage and the Data::Stag::SxprWriter
    manpage.

    If we execute this code on the XML from the example above

      $stag = Data::Stag->parse($xmlfile);
      print $stag->sxpr;

    the following S-Expression will be printed:

      '(db
        (species_set
          (species
            (common_name "house mouse")
            (binomial "Mus musculus")
            (tax_id "10090"))
          (species
            (common_name "fruit fly")
            (binomial "Drosophila melanogaster")
            (tax_id "7227"))
          (species
            (common_name "human")
            (binomial "Homo sapiens")
            (tax_id "9606")))
        (gene_set
          (gene
            (symbol "HGNC")
            (tax_id "9606")
            (phenotype "Hemochromatosis")
            (phenotype "Porphyria variegata")
            (GO_term "iron homeostasis")
            (map
              (cytological
                (chromosome "6")
                (band "p21.3"))))
          (gene
            (symbol "Hfe")
            (synonym "MR2")
            (tax_id "10090")
            (GO_term "integral membrane protein")))
        (similarity_set
          (pair
            (symbol "HGNC")
            (symbol "Hfe"))
          (pair
            (symbol "WNT3A")
            (symbol "Wnt3a"))))
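
    The mapping from a nested array to this notation is direct: each node
    becomes a parenthesised list whose head is the element name. The
    to_sxpr subroutine below is a hypothetical sketch of that mapping, not
    the code used by Data::Stag::SxprWriter; its quoting and layout rules
    are assumptions chosen to resemble the output above.

```perl
use strict;
use warnings;

# Hypothetical writer (not Data::Stag::SxprWriter): renders a nested-array
# tree as an S-Expression, quoting every terminal value.
sub to_sxpr {
    my ($node, $depth) = @_;
    $depth //= 0;
    my ($name, $data) = @$node;
    my $pad = '  ' x $depth;
    if (ref $data eq 'ARRAY') {
        # non-terminal node: one indented line per child
        my @kids = map { to_sxpr($_, $depth + 1) } @$data;
        return "$pad($name\n" . join("\n", @kids) . ")";
    }
    (my $text = $data) =~ s/(["\\])/\\$1/g;   # escape quotes and backslashes
    return qq{$pad($name "$text")};
}

my $species = [ species => [ [ common_name => 'human' ],
                             [ binomial    => 'Homo sapiens' ],
                             [ tax_id      => '9606' ] ] ];
print "'", to_sxpr($species), "\n";
```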

   TIPS FOR EMACS USERS AND LISP PROGRAMMERS

    If you use emacs, you can save this as a file with the ".el" suffix and
    get syntax highlighting for editing this file. Quotes around the
    terminal node data items are optional.

    If you know emacs lisp or any other lisp, this also turns out to be a
    very nice language for manipulating these datastructures. Try copying
    and pasting the above s-expression to the emacs scratch buffer and
    playing with it in lisp.

  INDENTED TEXT REPRESENTATION

    Data::Stag has its own text format for writing data trees. Again, this
    is only possible because we are working with a subset of XML (no
    attributes, no mixed elements). The data structure above can be written
    as follows:

      db:
        species_set:
          species:
            common_name: house mouse
            binomial: Mus musculus
            tax_id: 10090
          species:
            common_name: fruit fly
            binomial: Drosophila melanogaster
            tax_id: 7227
          species:
            common_name: human
            binomial: Homo sapiens
            tax_id: 9606
        gene_set:
          gene:
            symbol: HGNC
            tax_id: 9606
            phenotype: Hemochromatosis
            phenotype: Porphyria variegata
            GO_term: iron homeostasis
            map: 6p21.3
          gene:
            symbol: Hfe
            synonym: MR2
            tax_id: 10090
            GO_term: integral membrane protein
            map: 13 A2-A4
        similarity_set:
          pair:
            symbol: HGNC
            symbol: Hfe
          pair:
            symbol: WNT3A
            symbol: Wnt3a

    See the Data::Stag::ITextParser manpage and the Data::Stag::ITextWriter
    manpage.
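
    As a rough illustration of how little is needed to produce this
    layout from a nested array, consider the sketch below. The to_itext
    subroutine is a hypothetical helper, not the code used by
    Data::Stag::ITextWriter, and it assumes two spaces of indentation per
    level as in the listing above.

```perl
use strict;
use warnings;

# Hypothetical writer (not Data::Stag::ITextWriter): renders a nested-array
# tree in the indented "name: value" layout, two spaces per level.
sub to_itext {
    my ($node, $depth) = @_;
    $depth //= 0;
    my ($name, $data) = @$node;
    my $pad = '  ' x $depth;
    # terminal node: name and value share one line
    return "${pad}${name}: $data\n" unless ref $data eq 'ARRAY';
    # non-terminal node: name on its own line, children one level deeper
    return "${pad}${name}:\n" . join '', map { to_itext($_, $depth + 1) } @$data;
}

my $set = [ similarity_set => [ [ pair => [ [ symbol => 'HGNC' ],
                                            [ symbol => 'Hfe'  ] ] ] ] ];
print to_itext($set);
```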

  NESTED ARRAY SPECIFICATION II

    To avoid excessive square bracket usage, you can specify a structure
    like this:

      use Data::Stag qw(:all);

      *N = \&stag_new;
      my $tree =
        N(top=>[
          N('personset'=>[
            N('person'=>[
              N('name'=>'davey'),
              N('address'=>'here'),
              N('description'=>[
                N('hair'=>'green'),
                N('eyes'=>'two'),
                N('teeth'=>5),
              ]),
              N('pets'=>[
                N('petname'=>'igor'),
                N('petname'=>'ginger'),
              ]),
            ]),
            N('person'=>[
              N('name'=>'shuggy'),
              N('address'=>'there'),
              N('description'=>[
                N('hair'=>'red'),
                N('eyes'=>'three'),
                N('teeth'=>1),
              ]),
              N('pets'=>[
                N('petname'=>'thud'),
                N('petname'=>'spud'),
              ]),
            ]),
          ]),
          N('animalset'=>[
            N('animal'=>[
              N('name'=>'igor'),
              N('class'=>'rat'),
              N('description'=>[
                N('fur'=>'white'),
                N('eyes'=>'red'),
                N('teeth'=>50),
              ]),
            ]),
          ]),
        ]);

      # find all people
      my @persons = stag_find($tree, 'person');

      # write xml for all red haired people
      foreach my $p (@persons) {
        print stag_xml($p)
          if stag_tmatch($p, "hair", "red");
      }

      # find all people that have name == shuggy
      my @p =
        stag_qmatch($tree,
                    "person",
                    "name",
                    "shuggy");

NODES AS DATA OBJECTS
    As well as the methods listed below, a node can be treated as if it is a
    data object of a class determined by the element.

    For example, the following are equivalent.

      $node->get_name;
      $node->get('name');

      $node->set_name('fred');
      $node->set('name', 'fred');

    This is really just syntactic sugar. The autoloaded methods are not
    checked against any schema, although this may be added in future.
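
    The general perl technique behind this kind of syntactic sugar is
    AUTOLOAD. The sketch below shows one way get_X and set_X calls could
    be routed to generic get and set methods. TinyNode is a hypothetical
    class written for this illustration and makes no claim about how
    Data::Stag implements its own accessors.

```perl
use strict;
use warnings;

# Hypothetical class (not Data::Stag itself) showing how get_X / set_X
# calls can be routed to generic get / set methods via AUTOLOAD.
package TinyNode;

our $AUTOLOAD;

sub new {
    my ($class, $name, $data) = @_;
    return bless [ $name, $data // [] ], $class;
}

# value of the first direct child called $tag
sub get {
    my ($self, $tag) = @_;
    for my $child (@{ $self->[1] }) {
        return $child->[1] if $child->[0] eq $tag;
    }
    return;
}

# replace the value of child $tag, or add the child if it is missing
sub set {
    my ($self, $tag, $value) = @_;
    for my $child (@{ $self->[1] }) {
        if ($child->[0] eq $tag) { $child->[1] = $value; return $value }
    }
    push @{ $self->[1] }, [ $tag => $value ];
    return $value;
}

# called for any method that is not defined above
sub AUTOLOAD {
    my $self = shift;
    (my $method = $AUTOLOAD) =~ s/.*:://;   # strip the package name
    return if $method eq 'DESTROY';
    if ($method =~ /^(get|set)_(\w+)$/) {
        my ($op, $tag) = ($1, $2);
        return $self->$op($tag, @_);
    }
    die "no such method: $method";
}

package main;

my $node = TinyNode->new(person => [ [ name => 'joe' ] ]);
$node->set_name('fred');          # same as $node->set('name', 'fred')
print $node->get_name, "\n";      # same as $node->get('name')
```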

STAG METHODS
    All method calls are also available as procedural subroutine calls;
    unless otherwise noted, the subroutine call is the same as the method
    call, but with the string stag_ prefixed to the method name. The first
    argument should be a Data::Stag datastructure.

    To import all subroutines into the current namespace, use this idiom:

      use Data::Stag qw(:all);
      $doc = stag_parse($file);
      @persons = stag_find($doc, 'person');

    If you wish to use this module procedurally, and you are too lazy to
    prefix all calls with stag_, use this idiom:

      use Data::Stag qw(:lazy);
      $doc = parse($file);
      @persons = find($doc, 'person');

    But beware of clashes!

    Most method calls also have a handy short mnemonic. Use of these is
    optional. Software engineering types prefer longer names, in the belief
    that this leads to clearer code. Hacker types prefer shorter names, as
    this requires fewer keystrokes, and leads to a more compact
    representation of the code. It is expected that if you do use this
    module, then its usage will be fairly ubiquitous within your code, and
    the mnemonics will become familiar, much like the qw and s/ operators in
    perl. As always with perl, the decision is yours.

    Some methods take a single parameter or list of parameters; some have
    large lists of parameters that can be passed in any order. If the
    documentation states:

      Args: [x str], [y int], [z ANY]

    Then the method can be called like this:

      $stag->foo("this is x", 55, $ref);

    or like this:

      $stag->foo(-z=>$ref, -x=>"this is x", -y=>55);
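
    One way such dual calling conventions can be supported is sketched
    below. The rearrange_args subroutine and the foo subroutine are
    hypothetical, written only to illustrate the idea; this is an
    assumption about how the dispatch could work, not the module's actual
    code. Note that this sketch would misread a positional call whose
    first argument happens to look like -name.

```perl
use strict;
use warnings;

# Hypothetical helper (not Data::Stag's actual code): accept arguments
# either positionally or as -name => value pairs, and return them in the
# declared order.
sub rearrange_args {
    my ($order, @args) = @_;
    # treat the call as named only if the first argument looks like -name
    return @args unless @args && defined $args[0] && $args[0] =~ /^-[a-z]/i;
    my %named = @args;
    return map { $named{"-$_"} } @$order;
}

sub foo {
    my ($x, $y, $z) = rearrange_args([qw(x y z)], @_);
    return "x=$x y=$y z=$z";
}

# both calls produce the same string
print foo("this is x", 55, "ref"), "\n";
print foo(-z => "ref", -x => "this is x", -y => 55), "\n";
```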

  INITIALIZATION METHODS

   new

           Title: new

            Args: element str, data STAG-DATA
         Returns: Data::Stag node
         Example: $node = stag_new();
         Example: $node = Data::Stag->new;
         Example: $node = Data::Stag->new(person => [[name=>$n], [phone=>$p]]);

    Creates a new instance of a Data::Stag node.

   stagify (nodify)

           Title: stagify
         Synonym: nodify
            Args: data ARRAY-REF
         Returns: Data::Stag node
         Example: $node = stag_stagify([person => [[name=>$n], [phone=>$p]]]);

    Turns a perl array reference into a Data::Stag node.

    Similar to new.

   parse

           Title: parse

            Args: [file str], [format str], [handler obj], [fh FileHandle]
         Returns: Data::Stag node
         Example: $node = stag_parse($fn);
         Example: $node = stag_parse(-fh=>$fh, -handler=>$h, -errhandler=>$eh);
         Example: $node = Data::Stag->parse(-file=>$fn, -handler=>$myhandler);

    Slurps a file or string into a Data::Stag node structure. Will guess the
    format (xml, sxpr, itext) from the suffix if it is not given.

    The format can also be the name of a parsing module, or an actual parser
    object.

    The handler is any object that can take nested Stag events (start_event,
    end_event, evbody) which are generated from the parse. If the handler is
    omitted, all events will be cached and the resulting tree will be
    returned.

    See the Data::Stag::BaseHandler manpage for writing your own handlers.

    See the Data::Stag::BaseGenerator manpage for details on parser classes
    and error handling.

   parsestr

           Title: parsestr

            Args: [str str], [format str], [handler obj]
         Returns: Data::Stag node
         Example: $node = stag_parsestr('(a (b (c "1")))');
         Example: $node = Data::Stag->parsestr(-str=>$str, -handler=>$myhandler);

    Similar to parse(), except the first argument is a string.

   from

           Title: from

            Args: format str, source str
         Returns: Data::Stag node
         Example: $node = stag_from('xml', $fn);
         Example: $node = stag_from('xmlstr', q[<top><x>1</x></top>]);
         Example: $node = Data::Stag->from($parser, $fn);

    Similar to parse.

    Slurps a file or string into a Data::Stag node structure.

    The format can also be the name of a parsing module, or an actual parser
    object.

   unflatten

           Title: unflatten

            Args: data array
         Returns: Data::Stag node
         Example: $node = stag_unflatten(person=>[name=>$n, phone=>$p, address=>[street=>$s, city=>$c]]);

    Creates a node structure from a semi-flattened representation, in which
    children of a node are represented as a flat list of data rather than a
    list of array references.

    This means a structure can be specified as:

      person=>[name=>$n,
               phone=>$p,
               address=>[street=>$s,
                         city=>$c]]

    Instead of:

      [person=>[ [name=>$n],
                 [phone=>$p],
                 [address=>[ [street=>$s],
                             [city=>$c] ] ]
               ]
      ]

    The former gets converted into the latter for the internal
    representation.
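
    The conversion itself can be sketched as a short recursive subroutine
    that consumes the flat list two items at a time. The unflatten_sketch
    subroutine below is a hypothetical re-implementation for illustration,
    not the module's own unflatten, and it returns a plain nested array
    rather than a Data::Stag node.

```perl
use strict;
use warnings;

# Hypothetical re-implementation for illustration (not Data::Stag's own
# unflatten): convert a semi-flattened name => data list into nested nodes.
sub unflatten_sketch {
    my ($name, $data) = @_;
    # a plain scalar is already a terminal node
    return [ $name => $data ] unless ref $data eq 'ARRAY';
    my @flat = @$data;
    my @children;
    while (@flat) {
        # take the next name/data pair off the flat list
        my ($child_name, $child_data) = splice @flat, 0, 2;
        push @children, unflatten_sketch($child_name, $child_data);
    }
    return [ $name => \@children ];
}

my $node = unflatten_sketch(person => [ name    => 'fred',
                                        phone   => '555-1234',
                                        address => [ street => '1 Main St',
                                                     city   => 'Perth' ] ]);
print $node->[1][2][1][1][1], "\n";   # the city value
```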

   makehandler

           Title: makehandler

            Args: hash of CODEREFs keyed by element name
                  OR a string containing the name of a module
         Returns: Data::Stag::BaseHandler
         Example: $h = Data::Stag->makehandler(%subs);
         Example: $h = Data::Stag->makehandler("My::FooHandler");

    This creates a Stag event handler. The argument is a hash of subroutines
    keyed by element/node name. After each node is fired by the
    parser/generator, the subroutine is called, passing the handler object
    and the stag node as arguments. Whatever the subroutine returns is
    placed back into the tree.

    For example, for a parser/generator that fires events with the
    following tree form

      <person>
        <name>foo</name>
        ...
      </person>

    we can create a handler that writes person/name like this:

      $h = Data::Stag->makehandler(
                                   person => sub { my ($self,$stag) = @_;
                                                   print $stag->get_name;
                                                   return $stag; # don't change tree
                                                 });
      $stag = Data::Stag->parse(-str=>"(...)", -handler=>$h);

    See the Data::Stag::BaseHandler manpage for details on handlers.

   getformathandler

           Title: getformathandler

            Args: format str OR Data::Stag::BaseHandler
         Returns: Data::Stag::BaseHandler
         Example: $h = Data::Stag->getformathandler('xml');
                  $h->file("my.xml");
                  Data::Stag->parse(-file=>$fn, -handler=>$h);

    Creates a Stag event handler. This handler can be passed to an event
    generator / parser. Built in handlers include:

    xml
        Generates xml tags from events

    sxpr
        Generates S-Expressions from events

    itext
        Generates indented text from events

    All of the above are kinds of Data::Stag::Writer (see the
    Data::Stag::Writer manpage).

   chainhandler

           Title: chainhandler

            Args: blocked events - str or str[]
                  initial handler - handler object
                  final handler - handler object
         Returns:
         Example: $h = Data::Stag->chainhandler('foo', $processor, 'xml');

    Chains handlers together. For example, you may want to make transforms
    on an event stream, and then pass the event stream to another handler,
    such as an xml handler:

      $processor = Data::Stag->makehandler(
                                           a => sub { my ($self,$stag) = @_;
                                                      $stag->set_foo("bar");
                                                      return $stag
                                                    },
                                           b => sub { my ($self,$stag) = @_;
                                                      $stag->set_blah("eek");
                                                      return $stag
                                                    },
                                           );
      $chainh = Data::Stag->chainhandler(['a', 'b'], $processor, 'xml');
      $stag = Data::Stag->parse(-str=>"(...)", -handler=>$chainh);

    This chains together two handlers (see also the script stag-handle.pl).

  RECURSIVE SEARCHING

   find (f)

           Title: find
         Synonym: f

            Args: element str
         Returns: node[] or ANY
         Example: @persons = stag_find($struct, 'person');
         Example: @persons = $struct->find('person');

    Recursively searches the tree for all elements of the given type, and
    returns all nodes or data elements found.

    If the element found is a non-terminal node, the node is returned; if
    the element found is a terminal (leaf) node, the data value is returned.

    The element argument can be a path:

      @names = $struct->find('department/person/name');

    will find name in the nested structure below:

      (department
       (person
        (name "foo")))
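
    The return convention described above (node for a non-terminal match,
    data value for a terminal match) can be sketched on plain nested
    arrays as follows. The find_sketch subroutine is a hypothetical
    helper, not the module's own find; it handles single element names
    only, not paths, and the sample data is made up.

```perl
use strict;
use warnings;

# Hypothetical sketch of a recursive search (not Data::Stag's own find):
# returns the node for non-terminal matches and the data value for
# terminal ones. Only single element names are handled, not paths.
sub find_sketch {
    my ($node, $name) = @_;
    my ($this, $data) = @$node;
    my $is_branch = ref $data eq 'ARRAY';
    my @found;
    push @found, ($is_branch ? $node : $data) if $this eq $name;
    if ($is_branch) {
        # keep searching below this node, at any depth
        push @found, find_sketch($_, $name) for @$data;
    }
    return @found;
}

my $struct = [ department => [
    [ name   => 'sales' ],
    [ person => [ [ name => 'foo' ] ] ],
    [ person => [ [ name => 'bar' ] ] ],
] ];

my @persons = find_sketch($struct, 'person');   # two array references
my @names   = find_sketch($struct, 'name');     # three plain values
print scalar(@persons), " persons; names: @names\n";
```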

   findnode (fn)

           Title: findnode
         Synonym: fn

            Args: element str
         Returns: node[]
         Example: @persons = stag_findnode($struct, 'person');
         Example: @persons = $struct->findnode('person');

    Recursively searches the tree for all elements of the given type, and
    returns all nodes found.

    Paths can also be used (see find).

   findval (fv)

           Title: findval
         Synonym: fv

            Args: element str
         Returns: ANY[] or ANY
         Example: @names = stag_findval($struct, 'name');
         Example: @names = $struct->findval('name');
         Example: $firstname = $struct->findval('name');

    Recursively searches the tree for all elements of the given type, and
    returns all data values found. The data values could be primitive
    scalars or nodes.

    Paths can also be used (see find).

   sfindval (sfv)

           Title: sfindval
         Synonym: sfv

            Args: element str
         Returns: ANY
         Example: $name = stag_sfindval($struct, 'name');
         Example: $name = $struct->sfindval('name');

    As findval, but returns the first value found.

    Paths can also be used (see find).

   findvallist (fvl)

           Title: findvallist
         Synonym: fvl

            Args: element str[]
         Returns: ANY[]
         Example: ($name, $phone) = stag_findvallist($personstruct, 'name', 'phone');
         Example: ($name, $phone) = $personstruct->findvallist('name', 'phone');

    Recursively searches the tree for all elements in the list.

    DEPRECATED
856
857  DATA ACCESSOR METHODS
858
859    these allow getting and setting of elements directly underneath the
860    current one
861
862   get (g)
863
864           Title: get
865         Synonym: g
866
867            Args: element str
868          Return: node[] or ANY
869         Example: $name = $person->get('name');
870         Example: @phone_nos = $person->get('phone_no');
871
872    gets the value of the named sub-element
873
874    if the sub-element is a non-terminal, it will return node(s); if the
875    sub-element is a terminal (leaf), it will return the data value(s)
876
877    the examples above would work on a data structure like this:
878
879      [person => [ [name => 'fred'],
880                   [phone_no => '1-800-111-2222'],
881                   [phone_no => '1-415-555-5555']]]
882
883    will return an array or single value depending on the context
884
885    [equivalent to findval(), except that only direct children (as opposed
886    to all descendents) are checked]
887
888    paths can also be used, like this:
889
890     @phone_nos = $struct->get('person/phone_no');
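
    The difference between get (direct children only) and findval (all
    descendants) can be sketched as follows. The department/person data
    is a made-up example, and the snippet assumes Data::Stag is
    installed:

```perl
use strict;
use warnings;
use Data::Stag;

# Hypothetical sample data with 'name' at two levels.
my $dept = Data::Stag->new(department => [
    [ name   => 'accounts' ],
    [ person => [ [ name     => 'fred' ],
                  [ phone_no => '1-800-111-2222' ],
                  [ phone_no => '1-415-555-5555' ] ] ],
]);

# get only looks at direct children of the department node.
my @direct = $dept->get('name');

# findval descends through the whole tree.
my @all = $dept->findval('name');

# A path reaches below the direct children.
my @phone_nos = $dept->get('person/phone_no');

print "direct: @direct\n";
print "all:    @all\n";
print "phones: @phone_nos\n";
```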
891
892   sget (sg)
893
894           Title: sget
895         Synonym: sg
896
897            Args: element str
898          Return: ANY
899         Example: $name = $person->sget('name');
900         Example: $phone = $person->sget('phone_no');
901         Example: $name = $struct->sget('department/person/name');
902
903    as get but always returns a single value
904
905    [equivalent to sfindval(), except that only direct children (as opposed
906    to all descendents) are checked]
907
908   getl (gl getlist)
909
910           Title: getl
911         Synonym: gl
912         Synonym: getlist
913
914            Args: element str[]
915          Return: node[] or ANY[]
916         Example: ($name, @phone) = $person->getl('name', 'phone_no');
917
918    returns the data values for a list of sub-elements of a node
919
920    [equivalent to findvallist(), except that only direct children (as
921    opposed to all descendents) are checked]
922
923   getn (gn getnode)
924
925           Title: getn
926         Synonym: gn
927         Synonym: getnode
928
929            Args: element str
930          Return: node[]
931         Example: $namestruct = $person->getn('name');
932         Example: @pstructs = $person->getn('phone_no');
933
934    as get but returns the whole node rather than just the data value
935
936    [equivalent to findnode(), except that only direct children (as opposed
937    to all descendents) are checked]
938
939   sgetmap (sgm)
940
941           Title: sgetmap
942         Synonym: sgm
943
944            Args: hash
945          Return: hash
946         Example: %h = $person->sgetmap('social-security-no'=>'id',
947                                        'name'              =>'label',
948                                        'job'               =>0,
949                                        'address'           =>'location');
950
951    returns a hash of key/val pairs built from the data values of the
952    subnodes in the current element; keys are renamed according to the
953    hash passed (a value of '' or 0 keeps the original key).
954
955    no multivalued data elements are allowed
956
957   set (s)
958
959           Title: set
960         Synonym: s
961
962            Args: element str, datavalue ANY (list)
963          Return: ANY
964         Example: $person->set('name', 'fred');    # single val
965         Example: $person->set('phone_no', $cellphone, $homephone);
966
967    sets the data value of an element for any node. if the element is
968    multivalued, all the old values will be replaced with the new ones
969    specified.
970
971    ordering will be preserved, unless the element specified does not exist,
972    in which case, the new tag/value pair will be placed at the end.
973
974    for example, if we have a stag node $person
975
976      person:
977        name: shuggy
978        job:  bus driver
979
980    if we do this
981
982      $person->set('name', ());
983
984    we will end up with
985
986      person:
987        job:  bus driver
988
989    then if we do this
990
991      $person->set('name', 'shuggy');
992
993    the 'name' node will be placed as the last attribute
994
995      person:
996        job:  bus driver
997        name: shuggy
998
999    You can also use magic methods, for example
1000
1001      $person->set_name('shuggy');
1002      $person->set_job('bus driver', 'poet');
1003      print $person->itext;
1004
1005    will print
1006
1007      person:
1008        name: shuggy
1009        job:  bus driver
1010        job:  poet
1011
1012    note that if the datavalue is a non-terminal node as opposed to a
1013    primitive value, then you have to do it like this:
1014
1015      $people  = Data::Stag->new(people=>[
1016                                          [person=>[[name=>'Sherlock Holmes']]],
1017                                          [person=>[[name=>'Moriarty']]],
1018                                         ]);
1019      $address = Data::Stag->new(address=>[
1020                                           [address_line=>"221B Baker Street"],
1021                                           [city=>"London"],
1022                                           [country=>"Great Britain"]]);
1023      ($person) = $people->qmatch('person', (name => "Sherlock Holmes"));
1024      $person->set("address", $address->data);
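
    The replacement and ordering rules described above can be sketched
    like this. The phone numbers are placeholder values, and the snippet
    assumes Data::Stag is installed:

```perl
use strict;
use warnings;
use Data::Stag;

my $person = Data::Stag->new(person => [
    [ name     => 'shuggy' ],
    [ phone_no => '1-800-111-2222' ],
    [ phone_no => '1-415-555-5555' ],
]);

# Both old phone_no values are replaced by the single new one.
$person->set('phone_no', '1-555-555-5555');
my @phone_nos = $person->get('phone_no');

# 'job' does not exist yet, so it is placed after the existing children.
$person->set('job', 'bus driver');

# Element names of the child nodes, in their current order.
my @order = map { $_->element } $person->kids;
print "@order\n";
```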
1025
1026   unset (u)
1027
1028           Title: unset
1029         Synonym: u
1030
1031            Args: element str, datavalue ANY
1032          Return: ANY
1033         Example: $person->unset('name');
1034         Example: $person->unset('phone_no');
1035
1036    prunes all nodes of the specified element from the current node
1037
1038    You can use magic methods, like this
1039
1040      $person->unset_name;
1041      $person->unset_phone_no;
1042
1043   free
1044
1045           Title: free
1047
1048            Args:
1049          Return:
1050         Example: $person->free;
1051
1052    removes all data from a node. If that node is a subnode of another node,
1053    it is removed altogether
1054
1055    for instance, if we had the data below:
1056
1057      <person>
1058        <name>fred</name>
1059        <address>
1060        ..
1061        </address>
1062      </person>
1063
1064    and called
1065
1066      $person->get_address->free
1067
1068    then the person node would look like this:
1069
1070      <person>
1071        <name>fred</name>
1072      </person>
1073
1074   add (a)
1075
1076           Title: add
1077         Synonym: a
1078
1079            Args: element str, datavalues ANY[]
1080                  OR
1081                  Data::Stag
1082          Return: ANY
1083         Example: $person->add('phone_no', $cellphone, $homephone);
1084         Example: $person->add_phone_no('1-555-555-5555');
1085         Example: $dataset->add($person)
1086
1087    adds a datavalue or list of datavalues. appends if already existing,
1088    creates new element value pairs if not already existing.
1089
1090    if the argument is a stag node, it will add this node under the current
1091    one
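
    A short sketch of the three cases described above: appending to an
    existing element, creating a new one, and adding a whole stag node.
    The dataset and its values are hypothetical, and Data::Stag is
    assumed to be installed:

```perl
use strict;
use warnings;
use Data::Stag;

my $person = Data::Stag->new(person => [
    [ name     => 'fred' ],
    [ phone_no => '1-800-111-2222' ],
]);

# 'phone_no' already exists, so the new value is appended to it.
$person->add('phone_no', '1-415-555-5555');

# 'job' does not exist, so a new element/value pair is created.
$person->add('job', 'bus driver');

my @phone_nos = $person->get('phone_no');
my $job       = $person->sget('job');

# Passing a stag node adds that node under the current one.
my $dataset = Data::Stag->new(dataset => [
    [ person => [ [ name => 'jim' ] ] ],
]);
$dataset->add($person);
my @persons = $dataset->get('person');

print scalar(@phone_nos), " phone numbers, job: $job\n";
print scalar(@persons), " persons in dataset\n";
```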
1092
1093   element (e name)
1094
1095           Title: element
1096         Synonym: e
1097         Synonym: name
1098
1099            Args:
1100          Return: element str
1101         Example: $element = $struct->element
1102
1103    returns the element name of the current node.
1104
1105    This is illustrated in the different representation formats below
1106
1107    sxpr
1108          (element "data")
1109
1110        or
1111
1112          (element
1113           (sub_element "..."))
1114
1115    xml
1116          <element>data</element>
1117
1118        or
1119
1120          <element>
1121            <sub_element>...</sub_element>
1122          </element>
1123
1124    perl
1125          [element => $data ]
1126
1127        or
1128
1129          [element => [
1130                        [sub_element => "..." ]]]
1131
1132    itext
1133          element: data
1134
1135        or
1136
1137          element:
1138            sub_element: ...
1139
1140   kids (k children)
1141
1142           Title: kids
1143         Synonym: k
1144         Synonym: children
1145
1146            Args:
1147          Return: ANY or ANY[]
1148         Example: @nodes = $person->kids
1149         Example: $name = $namestruct->kids
1150
1151    returns the data value(s) of the current node; if it is a terminal node,
1152    returns a single value which is the data. if it is non-terminal, returns
1153    an array of nodes
1154
1155   addkid (ak addchild)
1156
1157           Title: addkid
1158         Synonym: ak
1159         Synonym: addchild
1160
1161            Args: kid node
1162          Return: ANY
1163         Example: $person->addkid('job', $job);
1164
1165    adds a new child node to a non-terminal node, after all the existing
1166    child nodes
1167
1168   subnodes
1169
1170           Title: subnodes
1171
1172            Args:
1173          Return: ANY[]
1174         Example: @nodes = $person->subnodes
1175
1176    returns the non-terminal data value(s) of the current node.
1177
1178  QUERYING AND ADVANCED DATA MANIPULATION
1179
1180   ijoin (j)
1181
1182           Title: ijoin
1183         Synonym: j
1184         Synonym: ij
1185
1186            Args: element str, key str, data Node
1187          Return: undef
1188
1189    does a relational style inner join - see previous example in this doc
1190
1191    key can either be a single node name that must be shared (analogous to
1192    SQL INNER JOIN ... USING), or a key1=key2 equivalence relation (analogous
1193    to SQL INNER JOIN ... ON)
1194
1195   qmatch (qm)
1196
1197           Title: qmatch
1198         Synonym: qm
1199
1200            Args: return-element str, match-element str, match-value str
1201          Return: node[]
1202         Example: @persons = $s->qmatch('person', 'name', 'fred');
1203         Example: @persons = $s->qmatch('person', (job=>'bus driver'));
1204
1205    queries the node tree for all elements that satisfy the specified
1206    key=val match - see previous example in this doc
1207
1208    for those inclined to thinking relationally, this can be thought of as a
1209    query that returns a stag object:
1210
1211      SELECT <return-element> FROM <stag-node> WHERE <match-element> = <match-value>
1212
1213    this always returns an array; this means that calling in a scalar
1214    context will return the number of elements; for example
1215
1216      $n = $s->qmatch('person', (name=>'fred'));
1217
1218    the value of $n will be equal to the number of persons called fred
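
    The SELECT analogy above can be sketched on a small, made-up people
    tree (this assumes Data::Stag is installed). Each returned node can
    be queried further with the accessor methods:

```perl
use strict;
use warnings;
use Data::Stag;

# Hypothetical sample data.
my $s = Data::Stag->new(people => [
    [ person => [ [ name => 'fred' ], [ job => 'bus driver' ] ] ],
    [ person => [ [ name => 'jim'  ], [ job => 'poet'       ] ] ],
    [ person => [ [ name => 'fred' ], [ job => 'poet'       ] ] ],
]);

# Roughly: SELECT person FROM people WHERE name = 'fred'
my @freds = $s->qmatch('person', name => 'fred');

# The results are nodes, so accessors such as sget work on them.
my @jobs = sort map { $_->sget('job') } @freds;

print scalar(@freds), " persons called fred\n";
print "their jobs: ", join(', ', @jobs), "\n";
```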
1219
1220   tmatch (tm)
1221
1222           Title: tmatch
1223         Synonym: tm
1224
1225            Args: element str, value str
1226          Return: bool
1227         Example: @persons = grep {$_->tmatch('name', 'fred')} @persons
1228
1229    returns true if the value of the specified element matches - see
1230    previous example in this doc
1231
1232   tmatchhash (tmh)
1233
1234           Title: tmatchhash
1235         Synonym: tmh
1236
1237            Args: match hashref
1238          Return: bool
1239         Example: @persons = grep {$_->tmatchhash({name=>'fred', hair_colour=>'green'})} @persons
1240
1241    returns true if the node matches a set of constraints, specified as a
1242    hash.
1243
1244   tmatchnode (tmn)
1245
1246           Title: tmatchnode
1247         Synonym: tmn
1248
1249            Args: match node
1250          Return: bool
1251         Example: @persons = grep {$_->tmatchnode([person=>[[name=>'fred'], [hair_colour=>'green']]])} @persons
1252
1253    returns true if the node matches a set of constraints, specified as a node
1254
1255   cmatch (cm)
1256
1257           Title: cmatch
1258         Synonym: cm
1259
1260            Args: element str, value str
1261          Return: int
1262         Example: $n_freds = $personset->cmatch('name', 'fred');
1263
1264    counts the number of matches
1265
1266   where (w)
1267
1268           Title: where
1269         Synonym: w
1270
1271            Args: element str, test CODE
1272          Return: Node[]
1273         Example: @rich_persons = $data->where('person', sub {shift->get_salary > 100000});
1274
1275    the tree is queried for all elements of the specified type that satisfy
1276    the coderef (must return a boolean)
1277
1278      my @rich_dog_or_cat_owners =
1279        $data->where('person',
1280                     sub {my $p = shift;
1281                          $p->get_salary > 100000 &&
1282                          $p->where('pet',
1283                                    sub {shift->get_type =~ /(dog|cat)/})});
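
    A self-contained variant of the salary test, as a sketch on made-up
    data (assuming Data::Stag is installed). It uses sget rather than the
    magic get_salary method, but the coderef plays the same role:

```perl
use strict;
use warnings;
use Data::Stag;

# Hypothetical sample data.
my $data = Data::Stag->new(people => [
    [ person => [ [ name => 'fred' ], [ salary => 120000 ] ] ],
    [ person => [ [ name => 'jim'  ], [ salary => 30000  ] ] ],
]);

# Keep only the person nodes for which the coderef returns true.
my @rich_persons = $data->where('person',
                                sub { shift->sget('salary') > 100000 });

my @rich_names = map { $_->sget('name') } @rich_persons;
print "rich: @rich_names\n";
```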
1284
1285   iterate (i)
1286
1287           Title: iterate
1288         Synonym: i
1289
1290            Args: CODE
1291          Return: Node[]
1292         Example: $data->iterate(sub {
1293                                     my $stag = shift;
1294                                     my $parent = shift;
1295                                     if ($stag->element eq 'pet') {
1296                                         $parent->set_pet_name($stag->get_name);
1297                                     }
1298                                 });
1299
1300    iterates through whole tree calling the specified subroutine.
1301
1302    the first arg passed to the subroutine is the stag node representing the
1303    tree at that point; the second arg is for the parent.
1304
1305    for instance, the example code above would turn this
1306
1307      (person
1308       (name "jim")
1309       (pet
1310        (name "fluffy")))
1311
1312    into this
1313
1314      (person
1315       (name "jim")
1316       (pet_name "fluffy")
1317       (pet
1318        (name "fluffy")))
1319
1320  MISCELLANEOUS METHODS
1321
1322   duplicate (d)
1323
1324           Title: duplicate
1325         Synonym: d
1326
1327            Args:
1328          Return: Node
1329         Example: $node2 = $node->duplicate;
1330
1331    does a deep copy of a stag structure
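
    Because the copy is deep, changing the duplicate should leave the
    original untouched. A minimal sketch, assuming Data::Stag is
    installed and using placeholder names:

```perl
use strict;
use warnings;
use Data::Stag;

my $node  = Data::Stag->new(person => [ [ name => 'fred' ] ]);
my $node2 = $node->duplicate;

# Modify only the copy.
$node2->set('name', 'jim');

my $original_name = $node->sget('name');
my $copy_name     = $node2->sget('name');
print "original: $original_name, copy: $copy_name\n";
```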
1332
1333   isanode
1334
1335           Title: isanode
1336
1337            Args:
1338          Return: bool
1339         Example: if (stag_isanode($node)) { ... }
1340
1341   hash
1342
1343           Title: hash
1344
1345            Args:
1346          Return: hash
1347         Example: $h = $node->hash;
1348
1349    turns a tree into a hash. all data values will be arrayrefs
1350
1351   pairs
1352
1353           Title: pairs
1354
1355    turns a tree into a hash. all data values will be scalar (IMPORTANT:
1356    this means duplicate values will be lost)
1357
1358   write
1359
1360           Title: write
1361
1362            Args: filename str, format str[optional]
1363          Return:
1364         Example: $node->write("myfile.xml");
1365         Example: $node->write("myfile", "itext");
1366
1367    will try and guess the format from the extension if not specified
1368
1369   xml
1370
1371           Title: xml
1372
1378            Args:
1379          Return: xml str
1380         Example: print $node->xml;
1381
1382  XML METHODS
1383
1384   sax
1385
1386           Title: sax
1387
1388            Args: saxhandler SAX-CLASS
1389          Return:
1390         Example: $node->sax($mysaxhandler);
1391
1392    turns a tree into a series of SAX events
1393
1394   xpath (xp tree2xpath)
1395
1396           Title: xpath
1397         Synonym: xp
1398         Synonym: tree2xpath
1399
1400            Args:
1401          Return: xpath object
1402         Example: $xp = $node->xpath; $q = $xp->find($xpathquerystr);
1403
1404   xpquery (xpq xpathquery)
1405
1406           Title: xpquery
1407         Synonym: xpq
1408         Synonym: xpathquery
1409
1410            Args: xpathquery str
1411          Return: Node[]
1412         Example: @nodes = $node->xpq($xpathquerystr);
1413
1414STAG SCRIPTS
1415    The following scripts come with the stag module
1416
1417    stag-autoschema.pl
1418        writes the implicit stag-schema for a stag file
1419
1420    stag-db.pl
1421        persistent storage and retrieval for stag data (xml, sxpr, itext)
1422
1423    stag-diff.pl
1424        finds the difference between two stag files
1425
1426    stag-drawtree.pl
1427        draws a stag file (xml, itext, sxpr) as a PNG diagram
1428
1429    stag-filter.pl
1430        filters a stag file (xml, itext, sxpr) for nodes of interest
1431
1432    stag-findsubtree.pl
1433        finds nodes in a stag file
1434
1435    stag-flatten.pl
1436        turns stag data into a flat table
1437
1438    stag-grep.pl
1439        filters a stag file (xml, itext, sxpr) for nodes of interest
1440
1441    stag-handle.pl
1442        streams a stag file through a handler into a writer
1443
1444    stag-join.pl
1445        joins two stag files together based around common key
1446
1447    stag-mogrify.pl
1448        mangle stag files
1449
1450    stag-parse.pl
1451        parses a file and fires events (e.g. sxpr to xml)
1452
1453    stag-query.pl
1454        aggregate queries
1455
1456    stag-split.pl
1457        splits a stag file (xml, itext, sxpr) into multiple files
1458
1459    stag-splitter.pl
1460        splits a stag file into multiple files
1461
1462    stag-view.pl
1463        draws an expandable Tk tree diagram showing stag data
1464
1465    To get more documentation, type
1466
1467      stag-<script>.pl -h
1468
1469BUGS
1470    none known so far, possibly quite a few undocumented features!
1471
1472    Not a bug, but the underlying default datastructure of nested arrays is
1473    more heavyweight than it needs to be. More lightweight implementations
1474    are possible. Some time I will write a C implementation.
1475
1476WEBSITE
1477    http://stag.sourceforge.net
1478
1482AUTHOR
1483    Chris Mungall <cjm AT fruitfly DOT org>
1484
1485COPYRIGHT
1486    Copyright (c) 2004 Chris Mungall
1487
1488    This module is free software. You may distribute this module under the
1489    same terms as perl itself
1490