1NOTE to myself -- this pod needs to be updated to have option
2patterns described!
3
4
5=head1 NAME
6
7Getopt::Tabular - table-driven argument parsing for Perl 5
8
9=head1 SYNOPSIS
10
11    use Getopt::Tabular;
12
13(or)
14
15    use Getopt::Tabular qw/GetOptions
16                           SetHelp SetHelpOption
17                           SetError GetError/;
18
19    ...
20
21    &Getopt::Tabular::SetHelp (long_help, usage_string);
22
23    @opt_table = (
24                  [section_description, "section"],
25                  [option, type, num_values, option_data, help_string],
26                  ...
27                 );
28    &GetOptions (\@opt_table, \@ARGV [, \@newARGV]) || exit 1;
29
30=head1 DESCRIPTION
31
32B<Getopt::Tabular> is a Perl 5 module for table-driven argument parsing,
33vaguely inspired by John Ousterhout's Tk_ParseArgv.  All you really need
34to do to use the package is set up a table describing all your
35command-line options, and call &GetOptions with three arguments: a
36reference to your option table, a reference to C<@ARGV> (or something
37like it), and an optional third array reference (say, to C<@newARGV>).
38&GetOptions will process all arguments in C<@ARGV>, and copy any
39leftover arguments (i.e. those that are not options or arguments to some
40option) to the C<@newARGV> array.  (If the C<@newARGV> argument is not
41supplied, C<GetOptions> will replace C<@ARGV> with the stripped-down
42argument list.)  If there are any invalid options, C<GetOptions> will
43print an error message and return 0.
44
45Before I tell you all about why Getopt::Tabular is a wonderful thing, let me
46explain some of the terminology that will keep popping up here.
47
48=over 4
49
50=item argument
51
52any single word appearing on the command-line, i.e. one element of the
53C<@ARGV> array.
54
55=item option
56
57an argument that starts with a certain sequence of characters; the default
58is "-".  (If you like GNU-style options, you can change this to "--".)  In
59most Getopt::Tabular-based applications, options can come anywhere on the
60command line, and their order is unimportant (unless one option overrides a
61previous option).  Also, Getopt::Tabular will allow any non-ambiguous
62abbreviation of options.
63
64=item option argument
65
66(or I<value>) an argument that immediately follows certain types of
67options.  For instance, if C<-foo> is a scalar-valued integer option, and
68C<-foo 3> appears on the command line, then C<3> will be the argument to
69C<-foo>.
70
71=item option type
72
73controls how C<GetOptions> deals with an option and the arguments that
74follow it.  (Actually, for most option types, the type interacts with the
75C<num_values> field, which determines whether the option is scalar- or
76vector-valued.  This will be fully explained in due course.)
77
78=back
79
80=head1 FEATURES
81
82Now for the advertising, i.e. why Getopt::Tabular is a good thing.
83
84=over 4
85
86=item *
87
88Command-line arguments are carefully type-checked, both by pattern and
89number---e.g. if an option requires two integers, GetOptions makes sure
90that exactly two integers follow it!
91
92=item *
93
94The valid command-line arguments are specified in a data structure
95separate from the call to GetOptions; this makes it easier to have very
96long lists of options, and to parse options from multiple sources (e.g. the
97command line, an environment variable, and a configuration file).
98
99=item *
100
101Getopt::Tabular can intelligently generate help text based on your option
102descriptions.
103
104=item *
105
106The type system is extensible, and if you can define your desired argument
107type using a single Perl regular expression then it's particularly easy to
108extend.
109
110=item *
111
112To make your program look smarter, options can be abbreviated and come in
113any order.
114
115=item *
116
You can parse options in a "spoof" mode that has no side-effects -- this
is useful for making a validation pass over the command line without
actually doing anything.
120
121=back
122
123In general, I have found that Getopt::Tabular tends to encourage programs
124with long lists of sophisticated options, leading to great flexibility,
125intelligent operation, and the potential for insanely long command lines.
126
127=head1 BASIC OPERATION
128
129The basic operation of Getopt::Tabular is driven by an I<option table>,
130which is just a list of I<option descriptions> (otherwise known as option
131table entries, or just entries).  Each option description tells
132C<GetOptions> everything it needs to know when it encounters a particular
133option on the command line.  For instance,
134
135    ["-foo", "integer", 2, \@Foo, "set the foo values"]
136
137means that whenever C<-foo> is seen on the command line, C<GetOptions> is
138to make sure that the next two arguments are integers, and copy them into
139the caller's C<@Foo> array.  (Well, really into the C<@Foo> array where the
140option table is defined.  This is almost always the same as C<GetOptions>'
141caller, though.)
142
143Typically, you'll group a bunch of option descriptions together like
144this:
145
146    @options =
147        (["-range", "integer", 2, \@Range,
148          "set the range of allowed values"],
149         ["-file", "string", 1, \$File,
150           "set the output file"],
151         ["-clobber", "boolean", 0, \$Clobber,
152           "clobber existing files"],
153         ...
154        );
155
156and then call C<GetOptions> like this:
157
158    &GetOptions (\@options, \@ARGV) || exit 1;
159
160which replaces C<@ARGV> with a new array containing all the arguments
161left-over after options and their arguments have been removed.  You can
162also call C<GetOptions> with three arguments, like this:
163
164    &GetOptions (\@options, \@ARGV, \@newARGV) || exit 1;
165
166in which case C<@ARGV> is untouched, and C<@newARGV> gets the leftover
167arguments.
168
169In case of error, C<GetOptions> prints enough information for the user to
170figure out what's going wrong.  If you supply one, it'll even print out a
171brief usage message in case of error.  Thus, it's enough to just C<exit 1>
172when C<GetOptions> indicates an error by returning 0.
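
Putting these pieces together, a complete script might look like the
following sketch.  The variable names, their defaults, and the
hard-coded argument list are made up for illustration; a real program
would normally pass C<\@ARGV> instead of C<\@args>.

    use strict;
    use warnings;
    use Getopt::Tabular qw/GetOptions/;

    # Hypothetical destination variables, with their defaults.
    my @Range   = (0, 100);
    my $File    = "out.dat";
    my $Clobber = 0;

    my @options =
        (["-range", "integer", 2, \@Range,
          "set the range of allowed values"],
         ["-file", "string", 1, \$File,
          "set the output file"],
         ["-clobber", "boolean", 0, \$Clobber,
          "clobber existing files"]);

    # Stand-in for @ARGV, so that the example is self-contained.
    my @args = qw(-range 1 10 input.dat -clobber -file result.dat);
    my @leftover;

    # Three-argument form: @args is left alone, @leftover gets the rest.
    GetOptions (\@options, \@args, \@leftover) || exit 1;

    print "range:    @Range\n";
    print "file:     $File\n";
    print "clobber:  $Clobber\n";
    print "leftover: @leftover\n";

Going by the description above, this should leave C<@Range> holding the
two integers after C<-range>, C<$File> set to F<result.dat>, C<$Clobber>
set to 1, and F<input.dat> as the only leftover argument.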
173
174Detailed descriptions of the contents of an option table entry are given
175next, followed by the complete run-down of available types, full details on
176error handling, and how help text is generated.
177
178=head1 OPTION TABLE ENTRIES
179
180The fields in the option table control how arguments are parsed, so it's
181important to understand each one in turn.  First, the format of entries in
182the table is fairly rigid, even though this isn't really necessary with
183Perl.  It's done that way to make the Getopt::Tabular code a little easier;
184the drawback is that some entries will have unused values (e.g. the
185C<num_values> field is never used for boolean options, but you still have
186to put something there as a place-holder).  The fields are as follows:
187
188=over 4
189
190=item option
191
192This is the option name, e.g. "-verbose" or "-some_value".  For most option
193types, this is simply an option prefix followed by text; for boolean
194options, however, it can be a little more complicated.  (The exact rules
195are discussed under L<"OPTION TYPES">.)  And yes, even though you tell
196Getopt::Tabular the valid option prefixes, you still have to put one onto
197the option names in the table.
198
199=item type
200
201The option type decides what action will be taken when this option is seen
202on the command line, and (if applicable) what sort of values will be
203accepted for this option.  There are three broad classes of types: those
204that imply copying data from the command line into some variable in the
205caller's space; those that imply copying constant data into the caller's
206space without taking any more arguments from the command line; and those
that imply some other action to be taken.  The available option types are
covered in greater detail below (see L<"OPTION TYPES">), but briefly:
C<string>, C<integer>, and C<float> all imply copying values from the
command line to a variable; C<const>, C<boolean>, C<copy>,
C<arrayconst>, and C<hashconst> all imply copying some pre-defined data
into a variable; C<call> and C<eval> allow the execution of some arbitrary
subroutine or chunk of code; and C<help> options will cause C<GetOptions>
to print out all available help text and return 0.
215
216=item num_values
217
For C<string>, C<integer>, and C<float> options, this determines whether
the option is a scalar (B<num_values> = 1) or vector (B<num_values> > 1)
option.  (Note that whether the option is scalar- or vector-valued has an
important influence on what you must supply in the B<option_data> field!)
For C<const>, C<copy>, C<arrayconst>, and C<hashconst> option types,
223B<num_values> is a bit of a misnomer: it actually contains the value (or a
224reference to it, if array or hash) to be copied when the option is
225encountered.  For C<call> options, B<num_values> can be used to supply
226extra arguments to the called subroutine.  In any case, though, you can
227think of B<num_values> as an input value.  For C<boolean> and C<eval>
228options, B<num_values> is ignored and should be C<undef> or 0.
229
230=item option_data
231
For C<string>, C<integer>, C<float>, C<boolean>, C<const>, C<copy>,
C<arrayconst>, and C<hashconst> types, this must be a reference to the
variable into which you want C<GetOptions> to copy the appropriate thing.
The "appropriate thing" is either the argument(s) following the option, the
constant supplied as B<num_values>, or 1 or 0 (for boolean options).

For C<boolean>, C<const>, C<copy>, and scalar-valued C<string>,
239C<integer>, and C<float> options, this must be a scalar reference.  For
240vector-valued C<string>, C<integer>, and C<float> options (B<num_values> >
2411), and for C<arrayconst> options, this must be an array reference.  For
242C<hashconst> options, this must be a hash reference.
243
244Finally, B<option_data> is also used as an input value for C<call> and
245C<eval> options: for C<call>, it should be a subroutine reference, and for
246C<eval> options, it should be a string containing valid Perl code to
247evaluate when the option is seen.  The subroutine called by a C<call>
248option should take at least two arguments: a string, which is the actual
249option that triggered the call (because the same subroutine could be tied
250to many options), and an array reference, which contains all command line
251arguments after that option.  (Further arguments can be supplied in the
252B<num_values> field.)  The subroutine may freely modify this array, and
253those modifications will affect the behaviour of C<GetOptions> afterwards.
254
255The chunk of code passed to an C<eval> option is evaluated in the package
256from which C<GetOptions> is called, and does not have access to any
257internal Getopt::Tabular data.
258
259=item help_string
260
261(optional) a brief description of the option.  Don't worry about formatting
262this in any way; when C<GetOptions> has to print out your help, it will do so
263quite nicely without any intervention.  If the help string is not defined,
264then that option will not be included in the option help text.  (However,
265you could supply an empty string -- which is defined -- to make C<GetOptions>
266just print out the option name, but nothing else.)
267
268=item arg_desc
269
270(optional) an even briefer description of the values that you expect to
271follow your option.  This is mainly used to supply place-holders in the help
272string, and is specified separately so that C<GetOptions> can act fairly
273intelligently when formatting a help message.  See L<"HELP TEXT"> for more
274information.
275
276=back
277
278
279=head1 OPTION TYPES
280
The option type field is the single most important field in the table, as
282the type for an option C<-foo> determines (along with B<num_values>) what
283action C<GetOptions> takes when it sees C<-foo> on the command line: how many
284following arguments become C<-foo>'s arguments, what regular expression
285those arguments must conform to, or whether some other action should be
286taken.
287
As mentioned above, there are three main classes of option types:
289
290=over 4
291
292=item argument-driven options
293
294These are options that imply taking one or more option arguments from
295the command line after the option itself is taken.  The arguments are
296then copied into some variable supplied (by reference) in the option
297table entry.
298
299=item constant-valued options
300
301These are options that have a constant value associated with them; when
302the option is seen on the command line, that constant is copied to some
variable in the caller's space.  (Both the constant and a reference to
the destination variable are supplied in the option table entry.)
Constants can be scalars, arrays, or hashes.
306
307=item other options
308
309These imply some other action to be taken, usually supplied as a string
310to C<eval> or a subroutine to call.
311
312=back
313
314=head2 Argument-driven option types
315
316=over 4
317
318=item string, integer, float
319
320These are the option types that imply "option arguments", i.e. arguments
321after the option that will be consumed when that option is encountered on
322the command line and copied into the caller's space via some reference.
323For instance, if you want an option C<-foo> to take a single string as
324an argument, with that string being copied to the scalar variable
325C<$Foo>, then you would have this entry in your option table:
326
327    ["-foo", "string", 1, \$Foo]
328
(For conciseness, I've omitted the B<help_string> and B<arg_desc> entries
330in all of the example entries in this section.  In reality, you should
331religiously supply help text in order to make your programs easier to
332use and easier to maintain.)
333
334If B<num_values> is some I<n> greater than one, then the B<option_data>
335field must be an I<array> reference, and I<n> arguments are copied from
336the command line into that array.  (The array is clobbered each time
337C<-foo> is encountered, not appended to.)  In this case, C<-foo> is
338referred to as a I<vector-valued> option, as it must be followed by a
339fixed number of arguments.  (Eventually, I plan to add I<list-valued>
options, which take a variable number of arguments.)  For example, an
option table entry like
342
343    ["-foo", "string", 3, \@Foo]
344
345would result in the C<@Foo> array being set to the three strings
346immediately following any C<-foo> option on the command line.
347
348The only difference between B<string>, B<integer>, and B<float> options is
349how picky C<GetOptions> is about the value(s) it will accept.  For
350B<string> options, anything is OK; for B<integer> options, the values must
351look like integers (i.e., they must match C</[+-]?\d+/>); for B<float>
352options, the values must look like C floating point numbers (trust me, you
353don't want to see the regexp for this).  Note that since string options
354will accept anything, they might accidentally slurp up arguments that are
355meant to be further options, if the user forgets to put the correct string.
356For instance, if C<-foo> and C<-bar> are both scalar-valued string options,
357and the arguments C<-foo -bar> are seen on the command-line, then "-bar"
358will become the argument to C<-foo>, and never be processed as an option
359itself.  (This could be construed as either a bug or a feature.  If you
360feel really strongly that it's a bug, then complain and I'll consider doing
361something about it.)
362
363If not enough arguments are found that match the required regular
364expression, C<GetOptions> prints to standard error a clear and useful error
365message, followed by the usage summary (if you supplied one), and returns
3660.  The error messages look something like "-foo option must be followed by
367an integer", or "-foo option must be followed by 3 strings", so it really
368is enough for your program to C<exit 1> without printing any further
369message.
370
371=item User-defined patterns
372
373Since the three option types described above are defined by nothing more
374than a regular expression, it's easy to define your own option types.  For
375instance, let's say you want an option to accept only strings of upper-case
376letters.  You could then call C<&Getopt::Tabular::AddPatternType> as
377follows:
378
379    &Getopt::Tabular::AddPatternType
      ("upperstring", "[A-Z]+", "uppercase string");
381
382Note that the third parameter is optional, and is only supplied to make
383error messages clearer.  For instance, if you now have a scalar-valued
384option C<-zap> of type C<upperstring>:
385
386   ["-zap", "upperstring", 1, \$Zap]
387
388and the user gets it wrong and puts an argument that doesn't consist of
389all uppercase letters after C<-zap>, then C<GetOptions> will complain
390that "-zap option must be followed by an uppercase string".  If you
hadn't supplied the third argument to C<&AddPatternType>, then the error
message would have been the slightly less helpful "-zap option must be
followed by an upperstring".  Also, you might have to worry about how
C<GetOptions> pluralizes your description: in this case, it will simply
add an "s", which works fine much of the time, but not always.
Alternatively, you could supply a two-element list containing the singular
and plural forms:
398
399    &Getopt::Tabular::AddPatternType
400      ("upperstring", "[A-Z]+",
        ["string of uppercase letters", "strings of uppercase letters"]);
402
So, if C<-zap> instead expects three C<upperstring>s, and the user
goofs, then the error message would be "-zap option must be followed by
3 uppercase strings" (first example) or "-zap option must be followed by
3 strings of uppercase letters" (second example).
407
408Of course, if you don't intend to have vector-valued options of your new
409type, pluralization hardly matters.  Also, while it might seem that this
410is a nice stab in the direction of multi-lingual support, the error
411messages are still hard-coded to English in other places.  Maybe in the
412next version...
413
414=back
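
To make the pattern-and-count checking described above more concrete,
here is a rough sketch in plain Perl.  It is I<not> the code used inside
Getopt::Tabular: the C<%types> table and the C<check_values> helper are
hypothetical, and the sketch assumes that a pattern has to match the
whole argument, not just part of it.

    use strict;
    use warnings;

    # Hypothetical table: type name => [pattern, singular, plural].
    my %types =
        (integer     => ['[+-]?\d+', "integer", "integers"],
         upperstring => ['[A-Z]+', "uppercase string",
                         "uppercase strings"]);

    # Return an error message if the first $n elements of @$args do not
    # all match the pattern for $type; return nothing if they do.
    sub check_values
    {
       my ($opt, $type, $n, $args) = @_;
       my ($pattern, $singular, $plural) = @{$types{$type}};

       # Take at most $n values (fewer if the argument list runs out).
       my @values = (@$args < $n) ? @$args : @$args[0 .. $n-1];
       my @bad = grep { !/\A(?:$pattern)\z/ } @values;
       return if @values == $n && !@bad;

       my $article = ($singular =~ /^[aeiou]/i) ? "an" : "a";
       my $what = ($n == 1) ? "$article $singular" : "$n $plural";
       return "$opt option must be followed by $what";
    }

    my $err = check_values ("-zap", "upperstring", 3, ["AB", "CD"]);
    print "$err\n" if defined $err;

Here only two values follow C<-zap> where three are required, so the
sketch prints "-zap option must be followed by 3 uppercase strings".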
415
416=head2 Constant-valued option types
417
418=over 4
419
420=item boolean
421
422For B<boolean> options, B<option_data> must be a scalar reference;
423B<num_values> is ignored (you can just set it to C<undef> or 0).
424Booleans are slightly weird in that every boolean option implies I<two>
425possible arguments that will be accepted on the command line, called the
426positive and negative alternatives.  The positive alternative (which is
427what you specify as the option name) results in a true value, while the
428negative alternative results in false.  Most of the time, you can let
429C<GetOptions> pick the negative alternative for you: it just inserts
430"no" after the option prefix, so "-clobber" becomes "-noclobber".  (More
431precisely, C<GetOptions> tests all option prefixes until one of them
432matches at the beginning of the option name.  It then inserts "no"
433between this prefix and the rest of the string.  So, if you want to
434support both GNU-style options (like C<--clobber>) and one-hyphen
435options (C<-c>), be sure to give "--" I<first> when setting the option
436patterns with C<&SetOptionPatterns>.  Otherwise, the negative
437alternative to "--clobber" will be "-no-clobber", which might not be
438what you wanted.)  Sometimes, though, you want to explicitly specify the
439negative alternative.  This is done by putting both alternatives in the
440option name, separated by a vertical bar, e.g. "-verbose|-quiet".
441
For example, the two options above might be specified as

    ["-clobber", "boolean", undef, \$Clobber],
    ["-verbose|-quiet", "boolean", undef, \$Verbose],
446
447If C<-clobber> is seen on the command line, C<$Clobber> will be set to 1;
448if C<-noclobber> is seen, then C<$Clobber> will be set to 0.  Likewise,
449C<-verbose> results in C<$Verbose> being set to 1, and C<-quiet> will set
450C<$Verbose> to 0.
451
452=item const
453
454For B<const> options, put a scalar value (I<not> reference) in
455B<num_values>, and a scalar reference in B<option_data>.  For example:
456
457    ["-foo", "const", "hello there", \$Foo]
458
459On encountering C<-foo>, C<GetOptions> will copy "hello there" to C<$Foo>.
460
461=item arrayconst
462
463For B<arrayconst> options, put an array reference (input) (I<not> an array
464value) in B<num_values>, and another array reference (output) in
465B<option_data>.  For example:
466
467    ["-foo", "arrayconst", [3, 6, 2], \@Foo]
468
469On encountering C<-foo>, C<GetOptions> will copy the array C<(3,6,2)> into
470C<@Foo>.
471
472=item hashconst
473
474For B<hashconst> options, put a hash reference (input) (I<not> a hash
475value) in B<num_values>, and another hash reference (output) in
476B<option_data>.  For example:
477
478    ["-foo", "hashconst", { "Perl"   => "Larry Wall",
479                            "C"      => "Dennis Ritchie",
480                            "Pascal" => "Niklaus Wirth" },
481     \%Inventors]
482
483On encountering C<-foo>, C<GetOptions> will copy into C<%Inventors> a hash
484relating various programming languages to the culprits primarily
485responsible for their invention.
486
487=item copy
488
489B<copy> options act just like B<const> options, except when
490B<num_values> is undefined.  In that case, the option name itself will
491be copied to the scalar referenced by B<option_data>, rather than the
492C<undef> value that would be copied under these circumstances with a
493B<const> option.  This is useful when one program accepts options that
494it simply passes to a sub-program; for instance, if F<prog1> calls
495F<prog2>, and F<prog2> might be run with the -foo option, then
496F<prog1>'s argument table might have this option:
497
498    ["-foo", "copy", undef, \$Foo,
499     "run prog2 with the -foo option"]
500
501and later on, you would run F<prog2> like this:
502
503    system ("prog2 $Foo ...");
504
505That way, if C<-foo> is never seen on F<prog1>'s command line, C<$Foo> will
506be untouched, and will expand to the empty string when building the command
507line for F<prog2>.
508
509If B<num_values> is anything other than C<undef>, then B<copy> options
behave just like B<const> options.
511
512=back
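
The rule for deriving the negative alternative of a B<boolean> option
can be sketched in a few lines of plain Perl.  This is only an
illustration of the rule as described above: C<boolean_alternatives> is
a hypothetical helper, and it takes literal prefixes, whereas the module
itself is configured with option patterns via C<&SetOptionPatterns>.

    use strict;
    use warnings;

    # Hypothetical helper: return the (positive, negative) spellings of
    # a boolean option, trying the prefixes in the order given.
    sub boolean_alternatives
    {
       my ($name, @prefixes) = @_;

       # An explicit "positive|negative" pair is used as-is.
       return split (/\|/, $name, 2) if $name =~ /\|/;

       # Otherwise, insert "no" after the first prefix that matches.
       foreach my $prefix (@prefixes)
       {
          if (index ($name, $prefix) == 0)
          {
             my $rest = substr ($name, length $prefix);
             return ($name, $prefix . "no" . $rest);
          }
       }
       return ($name, undef);          # no prefix matched
    }

    my ($pos, $neg) = boolean_alternatives ("--clobber", "--", "-");
    print "$pos / $neg\n";

With "--" tried first, the negative alternative comes out as
"--noclobber"; with the prefixes in the other order, the same call
yields "-no-clobber", which is the pitfall mentioned under B<boolean>.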
513
514=head2 Other option types
515
516=over 4
517
518=item call
519
520For B<call> options, B<option_data> must be a reference to a subroutine.
521The subroutine will be called with at least two arguments: a string
522containing the option that triggered the call (because the same
523subroutine might be activated by many options), a reference to an array
524containing all remaining command-line arguments after the option, and
525other arguments specified using the B<num_values> field.  (To be used
526for this purpose, B<num_values> must be an array reference; otherwise,
527it is ignored.)  For example, you might define a subroutine
528
529    sub process_foo
530    {
531       my ($opt, $args, $dest) = @_;
532
533       $$dest = shift @$args;    # not quite right! (see below)
534    }
535
536with a corresponding option table entry:
537
538    ["-foo", "call", [\$Foo], \&process_foo]
539
540and then C<-foo> would act just like a scalar-valued string option that
541copies into C<$Foo>.  (Well, I<almost> ... read on.)
542
543A subtle point that might be missed from the above code: the value returned
544by C<&process_foo> I<does> matter: if it is false, then C<GetOptions> will
545return 0 to its caller, indicating failure.  To make sure that the user
546gets a useful error message, you should supply one by calling C<SetError>;
547doing so will prevent C<GetOptions> from printing out a rather mysterious
548(to the end user, at least) message along the lines of "subroutine call
549failed".  The above example has two subtle problems: first, if the argument
550following C<-foo> is an empty string, then C<process_foo> will return
551the empty string---a false value---thus causing C<GetOptions> to fail
confusingly.  Second, if there are no arguments after C<-foo>, then
553C<process_foo> will return C<undef>---again, a false value, causing
554C<GetOptions> to fail.
555
556To solve these problems, we have to define the requirements for the
557C<-foo> option a little more rigorously.  Let's say that any string
558(including the empty string) is valid, but that there must be something
559there.  Then C<process_foo> is written as follows:
560
561    sub process_foo
562    {
563       my ($opt, $args, $dest) = @_;
564
565       $$dest = shift @$args;
566       (defined $$dest) && return 1;
567       &Getopt::Tabular::SetError
568         ("bad_foo", "$opt option must be followed by a string");
569       return 0;
570    }
571
572The C<SetError> routine actually takes two arguments: an error class and
573an error message.  This is explained fully in the L<ERROR HANDLING>
574section, below.  And, if you find yourself writing a lot of routines
575like this, C<SetError> is optionally exported from C<Getopt::Tabular>,
576so you can of course import it into your main package like this:
577
578    use Getopt::Tabular qw/GetOptions SetError/;
579
580=item eval
581
582An B<eval> option specifies a chunk of Perl code to be executed
583(C<eval>'d) when the option is encountered on the command line.  The
584code is supplied (as a string) in the B<option_data> field; again,
585B<num_values> is ignored.  For example:
586
587    ["-foo", "eval", undef,
588     'print "-foo seen on command line\n"']
589
will cause C<GetOptions> to print out (via an C<eval>) the string "-foo seen
on command line\n" when C<-foo> is seen.  No other action is taken
592apart from what you include in the eval string.  The code is evaluated
593in the package from which C<GetOptions> was called, so you can access
594variables and subroutines in your program easily.  If any error occurs
595in the C<eval>, C<GetOptions> complains loudly and returns 0.
596
597Note that the supplied code is always evaluated in a C<no strict>
598environment---that's because F<Getopt::Tabular> is itself C<use
599strict>-compliant, and I didn't want to force strictness on every quick
600hack that uses the module.  (Especially since B<eval> options seem to be
601used mostly in quick hacks.)  (Anyone who knows how to fetch the
602strictness state for another package or scope is welcome to send me
603hints!)  However, the B<-w> state is untouched.
604
605=item section
606
B<section> options are used only to help format the help text.  See
L<HELP TEXT> below for more details.
609
610=back
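
Because a B<call> subroutine may modify the list of remaining arguments,
it can be used to approximate an option that takes a variable number of
values.  The following is a sketch only: C<collect_list>, the C<-files>
option, and the rule "stop at the next argument starting with a hyphen"
are all assumptions made for the example (and that rule would also stop
at a negative number).  The subroutine is called directly here, with the
arguments a B<call> option is described as receiving.

    use strict;
    use warnings;

    # Hypothetical handler: move arguments from @$args into @$dest until
    # the next argument that looks like an option, or the end of the list.
    sub collect_list
    {
       my ($opt, $args, $dest) = @_;

       @$dest = ();
       push (@$dest, shift @$args)
          while (@$args && $args->[0] !~ /^-/);
       return 1;                       # an empty list is acceptable here
    }

    my @Files;
    my @remaining = qw(a.dat b.dat -verbose c.dat);
    collect_list ("-files", \@remaining, \@Files);

    print "files:     @Files\n";
    print "remaining: @remaining\n";

Afterwards C<@Files> holds F<a.dat> and F<b.dat>, and C<@remaining>
starts at C<-verbose>.  The matching option table entry would be

    ["-files", "call", [\@Files], \&collect_list]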
611
612
613=head1 ERROR HANDLING
614
615Generally, handling errors in the argument list is pretty transparent:
616C<GetOptions> (or one of its minions) generates an error message and
617assigns an error class, C<GetOptions> prints the message to the standard
618error, and returns 0.  You can access the error class and error message
619using the C<GetError> routine:
620
621    ($err_class, $err_msg) = &Getopt::Tabular::GetError ();
622
623(Like C<SetError>, C<GetError> can also be exported from
624F<Getopt::Tabular>.)  The error message is pretty simple---it is an
625explanation for the end user of what went wrong, which is why
626C<GetOptions> just prints it out and forgets about it.  The error class
627is further information that might be useful for your program; the
628current values are:
629
630=over 4
631
632=item bad_option
633
634set when something that looks like an option is found on the command
635line, but it's either unknown or an ambiguous abbreviation.
636
637=item bad_value
638
639set when an option is followed by an invalid argument (i.e., one that
640doesn't match the regexp for that type), or the wrong number of
641arguments.
642
643=item bad_call
644
645set when a subroutine called via a B<call> option or the code evaluated
646for an B<eval> option returns a false value.  The subroutine or eval'd
647code can override this by calling C<SetError> itself.
648
649=item bad_eval
650
set when the code evaluated for an B<eval> option has an error in it.
652
653=item help
654
set when the user requests help.
656
657=back
658
659Note that most of these are errors on the end user's part, such as bad
660or missing arguments.  There are also errors that can be caused by you,
661the programmer, such as bad or missing values in the option table; these
662generally result in C<GetOptions> croaking so that your program dies
663immediately with enough information that you can figure out where the
664mistake is.  B<bad_eval> is a borderline case; there are conceivably
665cases where the end user's input can result in bogus code to evaluate,
666so I grouped this one in the "user errors" class.  Finally, asking for
667help isn't really an error, but the assumption is that you probably
668shouldn't continue normal processing after printing out the help---so
669C<GetOptions> returns 0 in this case.  You can always fetch the error
670class with C<GetError> if you want to treat real errors differently from
671help requests.
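
As a sketch of how a program might act on the error class, the
following wrapper turns a failed parse into an exit status.  The
C<parse_or_status> name and the choice of status 0 for help requests are
assumptions made for the example; only C<GetOptions>, C<GetError>, and
the class names come from the description above.

    use strict;
    use warnings;
    use Getopt::Tabular qw/GetOptions GetError/;

    my $Verbose = 0;
    my @options = (["-verbose", "boolean", 0, \$Verbose, "be noisy"]);

    # Hypothetical wrapper: parse @$args and return an exit status, or
    # undef if parsing succeeded and the program should carry on.
    sub parse_or_status
    {
       my ($args) = @_;
       my @leftover;

       return if GetOptions (\@options, $args, \@leftover);

       # GetOptions has already printed the message; use the class
       # only to tell a help request apart from a real error.
       my ($err_class, $err_msg) = GetError ();
       return ($err_class eq "help") ? 0 : 1;
    }

    my $status = parse_or_status ([@ARGV]);
    exit $status if defined $status;

An unknown option such as C<-bogus> should produce the C<bad_option>
class, and so an exit status of 1.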
672
673=head1 HELP TEXT
674
675One of Getopt::Tabular's niftier features is the ability to generate and
676format a pile of useful help text from the snippets of help you include
677in your option table.  The best way to illustrate this is with a couple
678of brief examples.  First, it's helpful to know how the user can trigger
679a help display.  This is quite simple: by default, C<GetOptions> always
680has a "-help" option, presence of which on the command line triggers a
help display.  (Actually, the help option is really your preferred
option prefix plus "help".  So, if you make GNU-style options take
precedence as follows:

    &Getopt::Tabular::SetOptionPatterns qw|(--)([\w-]+) (-)(\w+)|;

then the help option will be "--help".)  There is only one help option
available, and you can set it by calling C<&SetHelpOption> (another
optional export).
690
691Note that in addition to the option help embedded in the option table,
692C<GetOptions> can optionally print out two other messages: a descriptive
693text (usually a short paragraph giving a rough overview of what your
694program does, possibly referring the user to the fine manual page), and
695a usage text.  These are both supplied by calling C<&SetHelp>, e.g.
696
697    $Help = <<HELP;
698    This is the foo program.  It reads one file (specified by -infile),
699    operates on it some unspecified way (possibly modified by
700    -threshold), and does absolutely nothing with the results.
701    (The utility of the -clobber option has yet to be established.)
702    HELP
703
704    $Usage = <<USAGE;
705    usage: foo [options]
706           foo -help to list options
707    USAGE
708
    &Getopt::Tabular::SetHelp ($Help, $Usage);
710
Note that either the long help or the usage string may be empty, in
which case C<GetOptions> simply won't print it.  In the case where both
713are supplied, the long help message is printed first, followed by the
714option help summary, followed by the usage.  C<GetOptions> inserts enough
715blank lines to make the output look just fine on its own, so you
716shouldn't pad either the long help or usage message with blanks.  (It
717looks best if each ends with a newline, though, so setting the help
718strings with here-documents---as in this example---is the recommended
719approach.)
720
721As an example of the help display generated by a typical option table,
722let's take a look at the following:
723
724    $Verbose = 1;
725    $Clobber = 0;
726    undef $InFile;
727    @Threshold = (0, 1);
728
729    @argtbl = (["-verbose|-quiet", "boolean", 0, \$Verbose,
730                "be noisy"],
731               ["-clobber", "boolean", 0, \$Clobber,
732                "overwrite existing files"],
733               ["-infile", "string", 1, \$InFile,
734                "specify the input file from which to read a large " .
735                "and sundry variety of data, to which many " .
736                "interesting operations will be applied", "<f>"],
737               ["-threshold", "float", 2, \@Threshold,
738                "only consider values between <v1> and <v2>",
739                "<v1> <v2>"]);
740
741
742Assuming you haven't supplied long help or usage strings, then when
743C<GetOptions> encounters the help option, it will immediately stop
744parsing arguments and print out the following option summary:
745
746    Summary of options:
747       -verbose    be noisy [default]
748       -quiet      opposite of -verbose
749       -clobber    overwrite existing files
750       -noclobber  opposite of -clobber [default]
751       -infile <f> specify the input file from which to read a large and
752                   sundry variety of data, to which many interesting
753                   operations will be applied
754       -threshold <v1> <v2>
755                   only consider values between <v1> and <v2> [default: 0 1]
756
757There are a number of interesting things to note here.  First, there are
758three option table fields that affect the generation of help text:
759B<option>, B<help_string>, and B<argdesc>.  Note how the B<argdesc>
760strings are simply option placeholders, usually used to 1) indicate how
761many values are expected to follow an option, 2) (possibly) imply what
762form they take (although that's not really shown here), and 3) explain
763the exact meaning of the values in the help text.  B<argdesc> is just a
764string like the help string; you can put whatever you like in it.  What
765I've shown above is just my personal preference (which may well evolve).
766
767A new feature with version 0.3 of Getopt::Tabular is the inclusion of
768default values with the help for certain options.  A number of
769conditions must be fulfilled for this to happen for a given option:
770first, the option type must be one of the "argument-driven" types, such
771as C<integer>, C<float>, C<string>, or a user-defined type.  Second, the
772option data field must refer either to a defined scalar value (for
773scalar-valued options) or to a list of one or more defined values (for
774vector-valued options).  Thus, in the above example, the C<-infile>
775option doesn't have its default printed because the C<$InFile> scalar is
776undefined.  Likewise, if the C<@Threshold> array were the empty list
777C<()>, or a list of undefined values C<(undef,undef)>, then the default
778value for C<-threshold> also would not have been printed.
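
The second condition can be restated as a small stand-alone sketch.
This is an illustration only, not the module's own code: C<default_text>
is a hypothetical helper, and showing only the defined elements of a
list that mixes defined and undefined values is an assumption.

    use strict;
    use warnings;

    # Hypothetical helper: returns the "[default: ...]" text for the
    # option data of an argument-driven option, or nothing at all when
    # the rule says no default should be shown.
    sub default_text
    {
       my ($data) = @_;

       if (ref ($data) eq 'SCALAR')
       {
          return defined ($$data) ? "[default: $$data]" : ();
       }
       if (ref ($data) eq 'ARRAY')
       {
          # an empty list, or a list of undefs, has nothing to show
          my @values = grep { defined ($_) } @$data;
          return @values ? "[default: @values]" : ();
       }
       return;
    }

    my $InFile;
    my @Threshold = (0, 1);

    for my $data (\$InFile, \@Threshold)
    {
       my $text = default_text ($data);
       print defined ($text) ? "$text\n" : "(no default shown)\n";
    }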
779
The formatting is done as follows: enough room is made on the left-hand
side for the longest option name, initially omitting the argument
782placeholders.  Then, if an option has placeholders, and there is room
783for them in between the option and the help string, everything (option,
784placeholders, help string) is printed together.  An example of this is
785the C<-infile> option: here, "-infile <f>" is just small enough to fit
786in the 12-character column (10 characters because that is the length of
787the longest option, and 2 blanks), so the help text is placed right
788after it on the same line.  However, the C<-threshold> option becomes
789too long when its argument placeholders are appended to it, so the help
790text is pushed onto the next line.
791
792In any event, the help string supplied by the caller starts at the same
793column, and is filled to make a nice paragraph of help.  C<GetOptions> will
794fill to the width of the terminal (or 80 columns if it fails to find the
795terminal width).
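
The layout rule can be sketched in a few lines of stand-alone Perl.
This is only an illustration, not the code C<Getopt::Tabular> actually
uses: C<format_entry> is a made-up name, the three-blank left margin is
read off the sample output above, and treating "fits" as "leaves at
least one blank before the help text" is an assumption.  The standard
C<Text::Wrap> module does the filling.

    use strict;
    use warnings;
    use Text::Wrap qw(wrap);

    # Hypothetical helper: lays out one option entry.  $longest is the
    # length of the longest option name; $width is the terminal width.
    sub format_entry
    {
       my ($option, $argdesc, $help, $longest, $width) = @_;

       my $margin = ' ' x 3;
       my $column = $longest + 2;     # longest option name + 2 blanks
       my $lhs    = length ($argdesc) ? "$option $argdesc" : $option;
       my $hang   = $margin . (' ' x $column);   # where help text starts

       local $Text::Wrap::columns  = $width;
       local $Text::Wrap::unexpand = 0;          # keep blanks, no tabs

       if (length ($lhs) < $column)
       {
          # option and placeholders fit: help starts on the same line
          my $first = $margin . $lhs . (' ' x ($column - length ($lhs)));
          return wrap ($first, $hang, $help) . "\n";
       }

       # too long: the help text is pushed onto the next line
       return $margin . $lhs . "\n" . wrap ($hang, $hang, $help) . "\n";
    }

    print format_entry ('-infile', '<f>', 'specify the input file',
                        10, 80);
    print format_entry ('-threshold', '<v1> <v2>',
                        'only consider values between <v1> and <v2>',
                        10, 80);

With a longest option name of 10 characters, the C<-infile> entry stays
on one line, while the C<-threshold> entry has its help text moved to
the following line, starting at the same column.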
796
797Finally, you can have pseudo entries of type B<section>, which are
798important to make long option lists readable (and one consequence of
799using Getopt::Tabular is programs with ridiculously long option lists -- not
800altogether a bad thing, I suppose).  For example, this table fragment:
801
802    @argtbl = (...,
803               ["-foo", "integer", 1, \$Foo,
804                "set the foo value", "f"],
805               ["-enterfoomode", "call", 0, \&enter_foo_mode,
806                "enter foo mode"],
807               ["Non-foo related options", "section"],
               ["-bar", "string", 2, \@Bar,
                "set the bar strings (which have nothing whatsoever " .
                "to do with foo)", "b1 b2"],
               ...);

results in the following chunk of help text:

       -foo f         set the foo value
       -enterfoomode  enter foo mode

    -- Non-foo related options ---------------------------------
       -bar b1 b2     set the bar strings (which have nothing
                      whatsoever to do with foo)
821
822(This example also illustrates a slightly different style of argument
823placeholder.  Take your pick, or invent your own!)
824
825=head1 SPOOF MODE
826
827Since callbacks from the command line (C<call> and C<eval> options) can
828do anything, they might be quite expensive.  In certain cases, then, you
829might want to make an initial pass over the command line to ensure that
830everything is OK before parsing it "for real" and incurring all those
831expensive callbacks.  Thus, C<Getopt::Tabular> provides a "spoof" mode
832for parsing a command line without side-effects.  In the simplest case,
833you can access spoof mode like this:
834
835   use Getopt::Tabular qw(SpoofGetOptions GetOptions);
836     .
837     .
838     .
839   &SpoofGetOptions (\@options, \@ARGV, \@newARGV) || exit 1;
840
841and then later on, you would call C<GetOptions> with the I<original>
842C<@ARGV> (so it can do what C<SpoofGetOptions> merely pretended to do):
843
844   &GetOptions (\@options, \@ARGV, \@newARGV) || exit 1;
845
For most option types, any errors that C<GetOptions> would catch should
also be caught by C<SpoofGetOptions> -- so you might initially think
that you can get away without that C<|| exit 1> after calling
C<GetOptions>.  However, keeping it is a good idea for a couple of
reasons.  First, you might have inadvertently changed C<@ARGV> -- this
is usually a bug and a silly thing to do, so you'd probably want your
program to crash loudly rather than fail mysteriously later on.  Second,
and more likely, some of those expensive operations that you're
initially avoiding by using C<SpoofGetOptions> might themselves fail --
which would cause C<GetOptions> to return false where
C<SpoofGetOptions> completes without a problem.  (Finally, there's the
faint possibility of bugs in C<Getopt::Tabular> that would cause
different behaviour in spoof mode and real mode -- this really shouldn't
happen, though.)
859
860In reality, using spoof mode requires a bit more work.  In particular,
861the whole reason for spoof argument parsing is to avoid expensive
862callbacks, but since callbacks can eat any number of command line
863arguments, you have to emulate them in some way.  It's not possible for
864C<SpoofGetOptions> to do this for you, so you have to help out by
865supplying "spoof" callbacks.  As an example, let's say you have a
866callback option that eats one argument (a filename) and immediately
867reads that file:
868
869   @filedata = ();
870
871   sub read_file
872   {
873      my ($opt, $args) = @_;
874
875      warn ("$opt option requires an argument\n"), return 0 unless @$args;
876      my $file = shift @$args;
877      open (FILE, $file) ||
878         (warn ("$file: $!\n"), return 0);
879      push (@filedata, <FILE>);
880      close (FILE);
881      return 1;
882   }
883
884   @options =
885      (['-read_file', 'call', undef, \&read_file]);
886
887Since C<-read_file> could occur any number of times on the command line,
888we might end up reading an awful lot of files, and thus it might be a
889long time before we catch errors late in the command line.  Thus, we'd
890like to do a "spoof" pass over the command line to catch all errors.  A
891simplistic approach would be to supply a spoof callback that just eats
892one argument and returns success:
893
894   sub spoof_read_file
895   {
896      my ($opt, $args) = @_;
897      (warn ("$opt option requires an argument\n"), return 0)
898         unless @$args;
899      shift @$args;
900      return 1;
901   }
902
903Then, you have to tell C<Getopt::Tabular> about this alternate callback
904with no side-effects (apart from eating that one argument):
905
906   &Getopt::Tabular::SetSpoofCodes (-read_file => \&spoof_read_file);
907
908(C<SetSpoofCodes> just takes a list of key/value pairs, where the keys
909are C<call> or C<eval> options, and the values are the "no side-effects"
910callbacks.  Naturally, the replacement callback for an C<eval> option
911should be a string, and for a C<call> option it should be a code
912reference.  This is not actually checked, however, until you call
913C<SpoofGetOptions>, because C<SetSpoofCodes> doesn't know whether
914options are C<call> or C<eval> or what.)
915
916A more useful C<spoof_read_file>, however, would actually check if the
917requested file exists -- i.e., we should try to catch as many errors as
918possible, as early as possible:
919
920   sub spoof_read_file
921   {
922      my ($opt, $args) = @_;
923      warn ("$opt option requires an argument\n"), return 0
924         unless @$args;
925      my $file = shift @$args;
926      warn ("$file does not exist or is not readable\n"), return 0
927         unless -r $file;
928      return 1;
929   }
930
931Finally, you can frequently merge the "real" and "spoof" callback into
932one subroutine:
933
934   sub read_file
935   {
936      my ($opt, $args, $spoof) = @_;
937
938      warn ("$opt option requires an argument\n"), return 0 unless @$args;
939      my $file = shift @$args;
940      warn ("$file does not exist or is not readable\n"), return 0
941         unless -r $file;
942      return 1 if $spoof;
943      open (FILE, $file) ||
944         (warn ("$file: $!\n"), return 0);
945      push (@filedata, <FILE>);
946      close (FILE);
947      return 1;
948   }
949
950And then, when specifying the replacement callback to C<SetSpoofCodes>,
951just create an anonymous sub that calls C<read_file> with C<$spoof>
952true:
953
954   &Getopt::Tabular::SetSpoofCodes
955      (-read_file => sub { &read_file (@_[0,1], 1) });
956
957Even though this means a bigger and more complicated callback, you only
958need I<one> such callback -- the alternative is to carry around both
959C<read_file> and C<spoof_read_file>, which might do redundant processing
960of the argument list.
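
One way to check that a merged callback behaves sensibly in both modes
is to call it directly, without going through C<Getopt::Tabular> at
all.  The sketch below assumes nothing beyond the calling convention
shown above (the option, a reference to the remaining arguments, and the
extra C<$spoof> flag supplied by the wrapper).  The temporary file and
the C<@rest> array exist only for the demonstration, and a lexical
filehandle is used in place of C<FILE>.

    use strict;
    use warnings;
    use File::Temp qw(tempfile);

    my @filedata = ();

    sub read_file
    {
       my ($opt, $args, $spoof) = @_;

       unless (@$args)
       {
          warn "$opt option requires an argument\n";
          return 0;
       }
       my $file = shift @$args;
       unless (-r $file)
       {
          warn "$file does not exist or is not readable\n";
          return 0;
       }
       return 1 if $spoof;            # checks done; skip the real work

       open (my $fh, '<', $file)
          or do { warn "$file: $!\n"; return 0 };
       push (@filedata, <$fh>);
       close ($fh);
       return 1;
    }

    # A small file to read (for the demonstration only).
    my ($tmp, $tmpname) = tempfile (UNLINK => 1);
    print $tmp "line one\nline two\n";
    close ($tmp);

    # Spoof pass: the argument is eaten and checked, nothing is read.
    my @rest = ($tmpname, 'leftover');
    my $spoof_ok = read_file ('-read_file', \@rest, 1);
    my $spoof_lines = scalar (@filedata);

    # Real pass, starting again from the original arguments.
    @rest = ($tmpname, 'leftover');
    my $real_ok = read_file ('-read_file', \@rest, 0);

    print "spoof: ok=$spoof_ok, lines read=$spoof_lines\n";
    print "real:  ok=$real_ok, lines read=", scalar (@filedata), "\n";

Both passes eat exactly one argument, so whatever follows the filename
is left for the caller in either mode.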
961
962=head1 AUTHOR
963
964Greg Ward <greg@bic.mni.mcgill.ca>
965
966Started in July, 1995 as ParseArgs.pm, with John Ousterhout's
967Tk_ParseArgv.c as a loose inspiration.  Many many features added over
968the ensuing months; documentation written in a mad frenzy 16-18 April,
9691996.  Renamed to Getopt::Tabular, revamped, reorganized, and
970documentation expanded 8-11 November, 1996.
971
972Copyright (c) 1995-97 Greg Ward. All rights reserved.  This is
973free software; you can redistribute it and/or modify it under the same
974terms as Perl itself.
975
976=head1 BUGS
977
978The documentation is bigger than the code, and I still haven't covered
979option patterns or extending the type system (apart from pattern types).
980Yow!
981
982No support for list-valued options, although you can roll your own with
983B<call> options.  (See the demo program included with the distribution
984for an example.)
985
986Error messages are hard-coded to English.
987