Name          Date          Size       #Lines   LOC
..            03-May-2022   -
Regex/        02-Feb-2011   -          1,134    238
extra/        02-Feb-2011   -          180      0
Changes       02-Feb-2011   1.1 KiB    33       30
MANIFEST      02-Feb-2011   163        10       9
META.yml      02-Feb-2011   332        12       10
Makefile.PL   02-Feb-2011   294        11       8
README        02-Feb-2011   6.6 KiB    189      148
Regex.pm      02-Feb-2011   26.4 KiB   914      593
test.pl       02-Feb-2011   3.3 KiB    171      117

README

NAME
    YAPE::Regex - Yet Another Parser/Extractor for Regular
    Expressions

SYNOPSIS
      use YAPE::Regex;
      use strict;

      my $regex = qr/reg(ular\s+)?exp?(ression)?/i;
      my $parser = YAPE::Regex->new($regex);

      # here is the tokenizing part
      while (my $chunk = $parser->next) {
        # ...
      }

`YAPE' MODULES
    The `YAPE' hierarchy of modules is an attempt at a unified means
    of parsing and extracting content. It attempts to maintain a
    generic interface, to promote simplicity and reusability. The
    API is powerful, yet simple. The modules do tokenization (which
    can be intercepted) and build trees, so that extraction of
    specific nodes is doable.

DESCRIPTION
    This module is yet another (?) parser and tree-builder for Perl
    regular expressions. It builds a tree out of a regex, but at the
    moment, the extent of the extraction tool for the tree is quite
    limited (see the section on "Extracting Sections"). However, the
    tree can be useful to extension modules.

USAGE
    In addition to the base class, `YAPE::Regex', there is the
    auxiliary class `YAPE::Regex::Element' (common to all `YAPE'
    base classes) that holds the individual nodes' classes. The
    node classes are described in that module's documentation.

  Methods for `YAPE::Regex'

    * `use YAPE::Regex;'
    * `use YAPE::Regex qw( MyExt::Mod );'
        If supplied no arguments, the module is loaded normally, and
        the node classes are given the proper inheritance (from
        `YAPE::Regex::Element'). If you supply a module (or list of
        modules), `import' will automatically include them (if
        needed) and set up *their* node classes with the proper
        inheritance -- that is, it will append `YAPE::Regex' to
        `@MyExt::Mod::ISA', and `YAPE::Regex::xxx' to each node
        class's `@ISA' (where `xxx' is the name of the specific node
        class).

          package MyExt::Mod;
          use YAPE::Regex 'MyExt::Mod';

          # does the work of:
          # @MyExt::Mod::ISA = 'YAPE::Regex'
          # @MyExt::Mod::text::ISA = 'YAPE::Regex::text'
          # ...

    * `my $p = YAPE::Regex->new($REx);'
        Creates a `YAPE::Regex' object, using the contents of `$REx'
        as a regular expression. The `new' method will *attempt* to
        convert `$REx' to a compiled regex (using `qr//') if `$REx'
        isn't already one. If there is an error in the regex, the
        conversion will fail, but the parser will carry on as if it
        had succeeded. It will then report the bad token when it
        gets to it, in the course of parsing.

    * `my $text = $p->chunk($len);'
        Returns the next `$len' characters in the input string;
        `$len' defaults to 30 characters. This is useful for
        figuring out why a parsing error occurs.

    * `my $done = $p->done;'
        Returns true if the parser is done with the input string,
        and false otherwise.

    * `my $errstr = $p->error;'
        Returns the parser error message.

    * `my $backref = $p->extract;'
        Returns a code reference that, each time it is called,
        returns the next back-reference in the regex. For more
        information on enhancements in upcoming versions of this
        module, check the section on "Extracting Sections".

    * `my $string = $p->display(...);'
        Returns a string representation of the entire content. It
        calls the `parse' method in case there is more data that has
        not yet been parsed. It then calls the `fullstring' method
        on the root nodes. Check the `YAPE::Regex::Element' docs on
        the arguments to `fullstring'.

    * `my $node = $p->next;'
        Returns the next token, or `undef' if there is no valid
        token. There will be an error message (accessible with the
        `error' method) if there was a problem in the parsing.

    * `my $node = $p->parse;'
        Calls `next' until all the data has been parsed.

    * `my $node = $p->root;'
        Returns the root node of the tree structure.

    * `my $state = $p->state;'
        Returns the current state of the parser. It is one of the
        following values: `alt', `anchor', `any', `backref',
        `capture(N)', `Cchar', `class', `close', `code', `comment',
        `cond(TYPE)', `ctrl', `cut', `done', `error', `flags',
        `group', `hex', `later', `lookahead(neg|pos)',
        `lookbehind(neg|pos)', `macro', `named', `oct', `slash',
        `text', and `utf8hex'.

        For `capture(N)', *N* will be the number the captured
        pattern represents.

        For `cond(TYPE)', *TYPE* will either be a number
        representing the back-reference that the conditional depends
        on, or the string `assert'.

        For `lookahead' and `lookbehind', one of `neg' and `pos'
        will be there, depending on the type of assertion.

    * `my $node = $p->top;'
        Synonymous with `root'.
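
    The methods above can be combined into a small driver. What
    follows is only a sketch, not taken from the module's own
    documentation. It assumes that `state' names the token most
    recently returned by `next', and that `done' stays false when
    parsing stops on an error. The array `@states' and the
    20-character window passed to `chunk' are illustrative choices.

```perl
use strict;
use warnings;
use YAPE::Regex;

my $parser = YAPE::Regex->new(qr/reg(ular\s+)?exp?(ression)?/i);

# Tokenize by hand, recording the parser state after each token.
# Assumption: state() describes the token that next() just returned.
my @states;
while (my $node = $parser->next) {
    push @states, $parser->state;
}

if ($parser->done) {
    # display() stringifies the whole tree that was built.
    print "tokens: @states\n";
    print "regex:  ", $parser->display, "\n";
}
else {
    # chunk() shows the unparsed input at the point of failure.
    warn sprintf "parse error: %s (near `%s')\n",
        $parser->error, $parser->chunk(20);
}
```

    No sample output is shown, since the token names and any error
    text depend on the installed Perl and module versions.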

  Extracting Sections

    While extraction of nodes is the goal of the `YAPE' modules, the
    author is at a loss for words as to what needs to be extracted
    from a regex. At the current time, all the `extract' method does
    is allow you access to the regex's set of back-references:

      my $extor = $parser->extract;
      while (my $backref = $extor->()) {
        # ...
      }

    `japhy' is very open to suggestions as to the approach to node
    extraction (in how the API should look, in addition to what
    should be proffered). Preliminary ideas include extraction
    keywords like the output of -Dr (or the `re' module's `debug'
    option).
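
    As a slightly fuller sketch of the loop above, the
    back-references can be collected as regex text. This assumes
    that each value returned by the extractor is a node object whose
    `fullstring' method may be called without arguments; check the
    `YAPE::Regex::Element' documentation before relying on that. The
    sample pattern and the name `@captures' are illustrative.

```perl
use strict;
use warnings;
use YAPE::Regex;

my $parser = YAPE::Regex->new(qr/(\d+)-(\w+)/);

# Parse everything first; this is harmless if extract() already does so.
$parser->parse;

my $extor = $parser->extract;

# Assumption: each back-reference is a node that supports fullstring().
my @captures;
while (my $backref = $extor->()) {
    push @captures, $backref->fullstring;
}

print "$_\n" for @captures;
```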

EXTENSIONS
    * `YAPE::Regex::Explain' 3.00
        Presents an explanation of a regular expression, node by
        node.

    * `YAPE::Regex::Reverse' (Not released)
        Reverses the nodes of a regular expression.

TO DO
    This is a listing of things to add to future versions of this
    module.

  API

    * Create a robust `extract' method
        Open to suggestions.

BUGS
    The following is a list of known or reported bugs.

  Pending

    * `use charnames ':full''
        To understand `\N{...}' properly, you must be using Perl
        5.6.0 or higher. However, the parser only knows how to
        resolve full names (those made using `use charnames
        ':full''). There might be an option in the future to specify
        a class name.
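
    For reference, the full-name form mentioned above looks like
    this in plain Perl. The sketch does not use `YAPE::Regex'
    itself, and the character chosen is arbitrary.

```perl
use strict;
use warnings;
use charnames ':full';

# Full-name form: the complete Unicode character name is spelled out.
my $alpha = qr/\N{GREEK SMALL LETTER ALPHA}/;

# U+03B1 is GREEK SMALL LETTER ALPHA, so this match succeeds.
my $matched = "\x{3b1}" =~ $alpha ? 1 : 0;
print "matched: $matched\n";    # prints "matched: 1"

# By contrast, a short name such as \N{greek:alpha} (enabled with
# `use charnames ':short'') is the kind of name that the note above
# says the parser does not resolve.
```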

SUPPORT
    Visit `YAPE''s web site at http://www.pobox.com/~japhy/YAPE/.

SEE ALSO
    The `YAPE::Regex::Element' documentation, for information on the
    node classes. Also, `Text::Balanced', Damian Conway's excellent
    module, used for the matching of `(?{ ... })' and `(??{ ... })'
    blocks.

AUTHOR
      Jeff "japhy" Pinyan
      CPAN ID: PINYAN
      japhy@pobox.com
      http://www.pobox.com/~japhy/