NOTE to myself -- this pod needs to be updated to have option
patterns described!


=head1 NAME

Getopt::Tabular - table-driven argument parsing for Perl 5

=head1 SYNOPSIS

    use Getopt::Tabular;

(or)

    use Getopt::Tabular qw/GetOptions
                           SetHelp SetHelpOption
                           SetError GetError/;

    ...

    &Getopt::Tabular::SetHelp (long_help, usage_string);

    @opt_table = (
                  [section_description, "section"],
                  [option, type, num_values, option_data, help_string],
                  ...
                 );
    &GetOptions (\@opt_table, \@ARGV [, \@newARGV]) || exit 1;

=head1 DESCRIPTION

B<Getopt::Tabular> is a Perl 5 module for table-driven argument parsing,
vaguely inspired by John Ousterhout's Tk_ParseArgv. All you really need
to do to use the package is set up a table describing all your
command-line options, and call &GetOptions with three arguments: a
reference to your option table, a reference to C<@ARGV> (or something
like it), and an optional third array reference (say, to C<@newARGV>).
&GetOptions will process all arguments in C<@ARGV>, and copy any
leftover arguments (i.e. those that are not options or arguments to some
option) to the C<@newARGV> array. (If the C<@newARGV> argument is not
supplied, C<GetOptions> will replace C<@ARGV> with the stripped-down
argument list.) If there are any invalid options, C<GetOptions> will
print an error message and return 0.

Before I tell you all about why Getopt::Tabular is a wonderful thing, let me
explain some of the terminology that will keep popping up here.

=over 4

=item argument

any single word appearing on the command-line, i.e. one element of the
C<@ARGV> array.

=item option

an argument that starts with a certain sequence of characters; the default
is "-". (If you like GNU-style options, you can change this to "--".)
In most Getopt::Tabular-based applications, options can come anywhere on the
command line, and their order is unimportant (unless one option overrides a
previous option). Also, Getopt::Tabular will allow any non-ambiguous
abbreviation of options.

=item option argument

(or I<value>) an argument that immediately follows certain types of
options. For instance, if C<-foo> is a scalar-valued integer option, and
C<-foo 3> appears on the command line, then C<3> will be the argument to
C<-foo>.

=item option type

controls how C<GetOptions> deals with an option and the arguments that
follow it. (Actually, for most option types, the type interacts with the
C<num_values> field, which determines whether the option is scalar- or
vector-valued. This will be fully explained in due course.)

=back

=head1 FEATURES

Now for the advertising, i.e. why Getopt::Tabular is a good thing.

=over 4

=item *

Command-line arguments are carefully type-checked, both by pattern and
number---e.g. if an option requires two integers, GetOptions makes sure
that exactly two integers follow it!

=item *

The valid command-line arguments are specified in a data structure
separate from the call to GetOptions; this makes it easier to have very
long lists of options, and to parse options from multiple sources (e.g. the
command line, an environment variable, and a configuration file).

=item *

Getopt::Tabular can intelligently generate help text based on your option
descriptions.

=item *

The type system is extensible, and if you can define your desired argument
type using a single Perl regular expression then it's particularly easy to
extend.

=item *

To make your program look smarter, options can be abbreviated and come in
any order.


=item *

You can parse options in a "spoof" mode that has no side-effects -- this
is useful for making a validation pass over the command line without
actually doing anything.

=back

In general, I have found that Getopt::Tabular tends to encourage programs
with long lists of sophisticated options, leading to great flexibility,
intelligent operation, and the potential for insanely long command lines.

=head1 BASIC OPERATION

The basic operation of Getopt::Tabular is driven by an I<option table>,
which is just a list of I<option descriptions> (otherwise known as option
table entries, or just entries). Each option description tells
C<GetOptions> everything it needs to know when it encounters a particular
option on the command line. For instance,

    ["-foo", "integer", 2, \@Foo, "set the foo values"]

means that whenever C<-foo> is seen on the command line, C<GetOptions> is
to make sure that the next two arguments are integers, and copy them into
the caller's C<@Foo> array. (Well, really into the C<@Foo> array where the
option table is defined. This is almost always the same as C<GetOptions>'
caller, though.)

Typically, you'll group a bunch of option descriptions together like
this:

    @options =
      (["-range", "integer", 2, \@Range,
        "set the range of allowed values"],
       ["-file", "string", 1, \$File,
        "set the output file"],
       ["-clobber", "boolean", 0, \$Clobber,
        "clobber existing files"],
       ...
      );

and then call C<GetOptions> like this:

    &GetOptions (\@options, \@ARGV) || exit 1;

which replaces C<@ARGV> with a new array containing all the arguments
left-over after options and their arguments have been removed.
You can
also call C<GetOptions> with three arguments, like this:

    &GetOptions (\@options, \@ARGV, \@newARGV) || exit 1;

in which case C<@ARGV> is untouched, and C<@newARGV> gets the leftover
arguments.

In case of error, C<GetOptions> prints enough information for the user to
figure out what's going wrong. If you supply one, it'll even print out a
brief usage message in case of error. Thus, it's enough to just C<exit 1>
when C<GetOptions> indicates an error by returning 0.

Detailed descriptions of the contents of an option table entry are given
next, followed by the complete run-down of available types, full details on
error handling, and how help text is generated.

=head1 OPTION TABLE ENTRIES

The fields in the option table control how arguments are parsed, so it's
important to understand each one in turn. First, the format of entries in
the table is fairly rigid, even though this isn't really necessary with
Perl. It's done that way to make the Getopt::Tabular code a little easier;
the drawback is that some entries will have unused values (e.g. the
C<num_values> field is never used for boolean options, but you still have
to put something there as a place-holder). The fields are as follows:

=over 4

=item option

This is the option name, e.g. "-verbose" or "-some_value". For most option
types, this is simply an option prefix followed by text; for boolean
options, however, it can be a little more complicated. (The exact rules
are discussed under L<"OPTION TYPES">.) And yes, even though you tell
Getopt::Tabular the valid option prefixes, you still have to put one onto
the option names in the table.

=item type

The option type decides what action will be taken when this option is seen
on the command line, and (if applicable) what sort of values will be
accepted for this option.
There are three broad classes of types: those
that imply copying data from the command line into some variable in the
caller's space; those that imply copying constant data into the caller's
space without taking any more arguments from the command line; and those
that imply some other action to be taken. The available option types are
covered in greater detail below (see L<OPTION TYPES>), but briefly:
C<string>, C<integer>, and C<float> all imply copying values from the
command line to a variable; C<const>, C<boolean>, C<copy>,
C<arrayconst>, and C<hashconst> all imply copying some pre-defined data
into a variable; C<call> and C<eval> allow the execution of some arbitrary
subroutine or chunk of code; and C<help> options will cause C<GetOptions>
to print out all available help text and return 0.

=item num_values

For C<string>, C<integer>, and C<float> options, this determines whether
the option is a scalar (B<num_values> = 1) or vector (B<num_values> > 1)
option. (Note that whether the option is scalar- or vector-valued has an
important influence on what you must supply in the B<option_data> field!)
For C<const>, C<copy>, C<arrayconst>, and C<hashconst> option types,
B<num_values> is a bit of a misnomer: it actually contains the value (or a
reference to it, if array or hash) to be copied when the option is
encountered. For C<call> options, B<num_values> can be used to supply
extra arguments to the called subroutine. In any case, though, you can
think of B<num_values> as an input value. For C<boolean> and C<eval>
options, B<num_values> is ignored and should be C<undef> or 0.

=item option_data

For C<string>, C<integer>, C<float>, C<boolean>, C<const>, C<copy>,
C<arrayconst>, and C<hashconst> types, this must be a reference to the
variable into which you want C<GetOptions> to copy the appropriate thing.
The "appropriate thing" is either the argument(s) following the option, the
constant supplied as B<num_values>, or 1 or 0 (for boolean options).

For C<boolean>, C<const>, C<copy>, and scalar-valued C<string>,
C<integer>, and C<float> options, this must be a scalar reference. For
vector-valued C<string>, C<integer>, and C<float> options (B<num_values> >
1), and for C<arrayconst> options, this must be an array reference. For
C<hashconst> options, this must be a hash reference.

Finally, B<option_data> is also used as an input value for C<call> and
C<eval> options: for C<call>, it should be a subroutine reference, and for
C<eval> options, it should be a string containing valid Perl code to
evaluate when the option is seen. The subroutine called by a C<call>
option should take at least two arguments: a string, which is the actual
option that triggered the call (because the same subroutine could be tied
to many options), and an array reference, which contains all command line
arguments after that option. (Further arguments can be supplied in the
B<num_values> field.) The subroutine may freely modify this array, and
those modifications will affect the behaviour of C<GetOptions> afterwards.

The chunk of code passed to an C<eval> option is evaluated in the package
from which C<GetOptions> is called, and does not have access to any
internal Getopt::Tabular data.

=item help_string

(optional) a brief description of the option. Don't worry about formatting
this in any way; when C<GetOptions> has to print out your help, it will do so
quite nicely without any intervention. If the help string is not defined,
then that option will not be included in the option help text. (However,
you could supply an empty string -- which is defined -- to make C<GetOptions>
just print out the option name, but nothing else.)


=item arg_desc

(optional) an even briefer description of the values that you expect to
follow your option. This is mainly used to supply place-holders in the help
string, and is specified separately so that C<GetOptions> can act fairly
intelligently when formatting a help message. See L<"HELP TEXT"> for more
information.

=back


=head1 OPTION TYPES

The option type field is the single most important field in the table, as
the type for an option C<-foo> determines (along with B<num_values>) what
action C<GetOptions> takes when it sees C<-foo> on the command line: how many
following arguments become C<-foo>'s arguments, what regular expression
those arguments must conform to, or whether some other action should be
taken.

As mentioned above, there are three main classes of argument types:

=over 4

=item argument-driven options

These are options that imply taking one or more option arguments from
the command line after the option itself is taken. The arguments are
then copied into some variable supplied (by reference) in the option
table entry.

=item constant-valued options

These are options that have a constant value associated with them; when
the option is seen on the command line, that constant is copied to some
variable in the caller's space. (Both the constant and a reference to
the variable are supplied in the option table entry.) Constants can be
scalars, arrays, or hashes.

=item other options

These imply some other action to be taken, usually supplied as a string
to C<eval> or a subroutine to call.

=back

=head2 Argument-driven option types

=over 4

=item string, integer, float

These are the option types that imply "option arguments", i.e. arguments
after the option that will be consumed when that option is encountered on
the command line and copied into the caller's space via some reference.
For instance, if you want an option C<-foo> to take a single string as
an argument, with that string being copied to the scalar variable
C<$Foo>, then you would have this entry in your option table:

    ["-foo", "string", 1, \$Foo]

(For conciseness, I've omitted the B<help_string> and B<arg_desc> entries
in all of the example entries in this section. In reality, you should
religiously supply help text in order to make your programs easier to
use and easier to maintain.)

If B<num_values> is some I<n> greater than one, then the B<option_data>
field must be an I<array> reference, and I<n> arguments are copied from
the command line into that array. (The array is clobbered each time
C<-foo> is encountered, not appended to.) In this case, C<-foo> is
referred to as a I<vector-valued> option, as it must be followed by a
fixed number of arguments. (Eventually, I plan to add I<list-valued>
options, which take a variable number of arguments.) For example, an
option table entry like

    ["-foo", "string", 3, \@Foo]

would result in the C<@Foo> array being set to the three strings
immediately following any C<-foo> option on the command line.

The only difference between B<string>, B<integer>, and B<float> options is
how picky C<GetOptions> is about the value(s) it will accept. For
B<string> options, anything is OK; for B<integer> options, the values must
look like integers (i.e., they must match C</[+-]?\d+/>); for B<float>
options, the values must look like C floating point numbers (trust me, you
don't want to see the regexp for this). Note that since string options
will accept anything, they might accidentally slurp up arguments that are
meant to be further options, if the user forgets to put the correct string.
For instance, if C<-foo> and C<-bar> are both scalar-valued string options,
and the arguments C<-foo -bar> are seen on the command-line, then "-bar"
will become the argument to C<-foo>, and never be processed as an option
itself. (This could be construed as either a bug or a feature. If you
feel really strongly that it's a bug, then complain and I'll consider doing
something about it.)

If not enough arguments are found that match the required regular
expression, C<GetOptions> prints to standard error a clear and useful error
message, followed by the usage summary (if you supplied one), and returns
0. The error messages look something like "-foo option must be followed by
an integer", or "-foo option must be followed by 3 strings", so it really
is enough for your program to C<exit 1> without printing any further
message.

=item User-defined patterns

Since the three option types described above are defined by nothing more
than a regular expression, it's easy to define your own option types. For
instance, let's say you want an option to accept only strings of upper-case
letters. You could then call C<&Getopt::Tabular::AddPatternType> as
follows:

    &Getopt::Tabular::AddPatternType
      ("upperstring", "[A-Z]+", "uppercase string");

Note that the third parameter is optional, and is only supplied to make
error messages clearer. For instance, if you now have a scalar-valued
option C<-zap> of type C<upperstring>:

    ["-zap", "upperstring", 1, \$Zap]

and the user gets it wrong and puts an argument that doesn't consist of
all uppercase letters after C<-zap>, then C<GetOptions> will complain
that "-zap option must be followed by an uppercase string". If you
hadn't supplied the third argument to C<&AddPatternType>, then the error
message would have been the slightly less helpful "-zap option must be
followed by an upperstring".
Also, you might have to worry about how
C<GetOptions> pluralizes your description: in this case, it will simply
add an "s", which works fine much of the time, but not always.
Alternatively, you could supply a two-element list containing the singular
and plural forms:

    &Getopt::Tabular::AddPatternType
      ("upperstring", "[A-Z]+",
       ["string of uppercase letters", "strings of uppercase letters"]);

So, if C<-zap> instead expects three C<upperstring>s, and the user
goofs, then the error message would be (in the first example) "-zap
option must be followed by 3 uppercase strings" or "-zap option must be
followed by 3 strings of uppercase letters" (second example).

Of course, if you don't intend to have vector-valued options of your new
type, pluralization hardly matters. Also, while it might seem that this
is a nice stab in the direction of multi-lingual support, the error
messages are still hard-coded to English in other places. Maybe in the
next version...

=back

=head2 Constant-valued option types

=over 4

=item boolean

For B<boolean> options, B<option_data> must be a scalar reference;
B<num_values> is ignored (you can just set it to C<undef> or 0).
Booleans are slightly weird in that every boolean option implies I<two>
possible arguments that will be accepted on the command line, called the
positive and negative alternatives. The positive alternative (which is
what you specify as the option name) results in a true value, while the
negative alternative results in false. Most of the time, you can let
C<GetOptions> pick the negative alternative for you: it just inserts
"no" after the option prefix, so "-clobber" becomes "-noclobber". (More
precisely, C<GetOptions> tests all option prefixes until one of them
matches at the beginning of the option name. It then inserts "no"
between this prefix and the rest of the string.
So, if you want to
support both GNU-style options (like C<--clobber>) and one-hyphen
options (C<-c>), be sure to give "--" I<first> when setting the option
patterns with C<&SetOptionPatterns>. Otherwise, the negative
alternative to "--clobber" will be "-no-clobber", which might not be
what you wanted.) Sometimes, though, you want to explicitly specify the
negative alternative. This is done by putting both alternatives in the
option name, separated by a vertical bar, e.g. "-verbose|-quiet".

For example, the two options above might be specified as

    ["-clobber", "boolean", undef, \$Clobber],
    ["-verbose|-quiet", "boolean", undef, \$Verbose],...);

If C<-clobber> is seen on the command line, C<$Clobber> will be set to 1;
if C<-noclobber> is seen, then C<$Clobber> will be set to 0. Likewise,
C<-verbose> results in C<$Verbose> being set to 1, and C<-quiet> will set
C<$Verbose> to 0.

=item const

For B<const> options, put a scalar value (I<not> reference) in
B<num_values>, and a scalar reference in B<option_data>. For example:

    ["-foo", "const", "hello there", \$Foo]

On encountering C<-foo>, C<GetOptions> will copy "hello there" to C<$Foo>.

=item arrayconst

For B<arrayconst> options, put an array reference (input) (I<not> an array
value) in B<num_values>, and another array reference (output) in
B<option_data>. For example:

    ["-foo", "arrayconst", [3, 6, 2], \@Foo]

On encountering C<-foo>, C<GetOptions> will copy the array C<(3,6,2)> into
C<@Foo>.

=item hashconst

For B<hashconst> options, put a hash reference (input) (I<not> a hash
value) in B<num_values>, and another hash reference (output) in
B<option_data>.
For example:

    ["-foo", "hashconst", { "Perl"   => "Larry Wall",
                            "C"      => "Dennis Ritchie",
                            "Pascal" => "Niklaus Wirth" },
     \%Inventors]

On encountering C<-foo>, C<GetOptions> will copy into C<%Inventors> a hash
relating various programming languages to the culprits primarily
responsible for their invention.

=item copy

B<copy> options act just like B<const> options, except when
B<num_values> is undefined. In that case, the option name itself will
be copied to the scalar referenced by B<option_data>, rather than the
C<undef> value that would be copied under these circumstances with a
B<const> option. This is useful when one program accepts options that
it simply passes to a sub-program; for instance, if F<prog1> calls
F<prog2>, and F<prog2> might be run with the -foo option, then
F<prog1>'s argument table might have this option:

    ["-foo", "copy", undef, \$Foo,
     "run prog2 with the -foo option"]

and later on, you would run F<prog2> like this:

    system ("prog2 $Foo ...");

That way, if C<-foo> is never seen on F<prog1>'s command line, C<$Foo> will
be untouched, and will expand to the empty string when building the command
line for F<prog2>.

If B<num_values> is anything other than C<undef>, then B<copy> options
behave just like B<const> options.

=back

=head2 Other option types

=over 4

=item call

For B<call> options, B<option_data> must be a reference to a subroutine.
The subroutine will be called with at least two arguments: a string
containing the option that triggered the call (because the same
subroutine might be activated by many options), a reference to an array
containing all remaining command-line arguments after the option, and
other arguments specified using the B<num_values> field. (To be used
for this purpose, B<num_values> must be an array reference; otherwise,
it is ignored.)
For example, you might define a subroutine

    sub process_foo
    {
       my ($opt, $args, $dest) = @_;

       $$dest = shift @$args;      # not quite right! (see below)
    }

with a corresponding option table entry:

    ["-foo", "call", [\$Foo], \&process_foo]

and then C<-foo> would act just like a scalar-valued string option that
copies into C<$Foo>. (Well, I<almost> ... read on.)

A subtle point that might be missed from the above code: the value returned
by C<&process_foo> I<does> matter: if it is false, then C<GetOptions> will
return 0 to its caller, indicating failure. To make sure that the user
gets a useful error message, you should supply one by calling C<SetError>;
doing so will prevent C<GetOptions> from printing out a rather mysterious
(to the end user, at least) message along the lines of "subroutine call
failed". The above example has two subtle problems: first, if the argument
following C<-foo> is an empty string, then C<process_foo> will return
the empty string---a false value---thus causing C<GetOptions> to fail
confusingly. Second, if there are no arguments after C<-foo>, then
C<process_foo> will return C<undef>---again, a false value, causing
C<GetOptions> to fail.

To solve these problems, we have to define the requirements for the
C<-foo> option a little more rigorously. Let's say that any string
(including the empty string) is valid, but that there must be something
there. Then C<process_foo> is written as follows:

    sub process_foo
    {
       my ($opt, $args, $dest) = @_;

       $$dest = shift @$args;
       (defined $$dest) && return 1;
       &Getopt::Tabular::SetError
         ("bad_foo", "$opt option must be followed by a string");
       return 0;
    }

The C<SetError> routine actually takes two arguments: an error class and
an error message. This is explained fully in the L<ERROR HANDLING>
section, below.
And, if you find yourself writing a lot of routines
like this, C<SetError> is optionally exported from C<Getopt::Tabular>,
so you can of course import it into your main package like this:

    use Getopt::Tabular qw/GetOptions SetError/;

=item eval

An B<eval> option specifies a chunk of Perl code to be executed
(C<eval>'d) when the option is encountered on the command line. The
code is supplied (as a string) in the B<option_data> field; again,
B<num_values> is ignored. For example:

    ["-foo", "eval", undef,
     'print "-foo seen on command line\n"']

will cause C<GetOptions> to print out (via an C<eval>) the string "-foo seen
on command line\n" when -foo is seen. No other action is taken
apart from what you include in the eval string. The code is evaluated
in the package from which C<GetOptions> was called, so you can access
variables and subroutines in your program easily. If any error occurs
in the C<eval>, C<GetOptions> complains loudly and returns 0.

Note that the supplied code is always evaluated in a C<no strict>
environment---that's because F<Getopt::Tabular> is itself
C<use strict>-compliant, and I didn't want to force strictness on every quick
hack that uses the module. (Especially since B<eval> options seem to be
used mostly in quick hacks.) (Anyone who knows how to fetch the
strictness state for another package or scope is welcome to send me
hints!) However, the B<-w> state is untouched.

=item section

B<section> options are just used to help formatting the help text. See
L<HELP TEXT> below for more details.

=back


=head1 ERROR HANDLING

Generally, handling errors in the argument list is pretty transparent:
C<GetOptions> (or one of its minions) generates an error message and
assigns an error class, C<GetOptions> prints the message to the standard
error, and returns 0.
You can access the error class and error message
using the C<GetError> routine:

    ($err_class, $err_msg) = &Getopt::Tabular::GetError ();

(Like C<SetError>, C<GetError> can also be exported from
F<Getopt::Tabular>.) The error message is pretty simple---it is an
explanation for the end user of what went wrong, which is why
C<GetOptions> just prints it out and forgets about it. The error class
is further information that might be useful for your program; the
current values are:

=over 4

=item bad_option

set when something that looks like an option is found on the command
line, but it's either unknown or an ambiguous abbreviation.

=item bad_value

set when an option is followed by an invalid argument (i.e., one that
doesn't match the regexp for that type), or the wrong number of
arguments.

=item bad_call

set when a subroutine called via a B<call> option or the code evaluated
for an B<eval> option returns a false value. The subroutine or eval'd
code can override this by calling C<SetError> itself.

=item bad_eval

set when the code evaluated for an B<eval> option has an error in it.

=item help

set when the user requests help

=back

Note that most of these are errors on the end user's part, such as bad
or missing arguments. There are also errors that can be caused by you,
the programmer, such as bad or missing values in the option table; these
generally result in C<GetOptions> croaking so that your program dies
immediately with enough information that you can figure out where the
mistake is. B<bad_eval> is a borderline case; there are conceivably
cases where the end user's input can result in bogus code to evaluate,
so I grouped this one in the "user errors" class.
Finally, asking for
help isn't really an error, but the assumption is that you probably
shouldn't continue normal processing after printing out the help---so
C<GetOptions> returns 0 in this case. You can always fetch the error
class with C<GetError> if you want to treat real errors differently from
help requests.

=head1 HELP TEXT

One of Getopt::Tabular's niftier features is the ability to generate and
format a pile of useful help text from the snippets of help you include
in your option table. The best way to illustrate this is with a couple
of brief examples. First, it's helpful to know how the user can trigger
a help display. This is quite simple: by default, C<GetOptions> always
has a "-help" option, presence of which on the command line triggers a
help display. (Actually, the help option is really your preferred
option prefix plus "help". So, if you make GNU-style options take
precedence as follows:

    &Getopt::Tabular::SetOptionPatterns qw|(--)([\w-]+) (-)(\w+)|;

then the help option will be "--help".) There is only one help option
available, and you can set it by calling C<&SetHelpOption> (another
optional export).

Note that in addition to the option help embedded in the option table,
C<GetOptions> can optionally print out two other messages: a descriptive
text (usually a short paragraph giving a rough overview of what your
program does, possibly referring the user to the fine manual page), and
a usage text. These are both supplied by calling C<&SetHelp>, e.g.

    $Help = <<HELP;
    This is the foo program. It reads one file (specified by -infile),
    operates on it some unspecified way (possibly modified by
    -threshold), and does absolutely nothing with the results.
    (The utility of the -clobber option has yet to be established.)
    HELP

    $Usage = <<USAGE;
    usage: foo [options]
           foo -help to list options
    USAGE

    &Getopt::Tabular::SetHelp ($Help, $Usage);

Note that either of the long help or usage strings may be empty, in
which case C<GetOptions> simply won't print them. In the case where both
are supplied, the long help message is printed first, followed by the
option help summary, followed by the usage. C<GetOptions> inserts enough
blank lines to make the output look just fine on its own, so you
shouldn't pad either the long help or usage message with blanks. (It
looks best if each ends with a newline, though, so setting the help
strings with here-documents---as in this example---is the recommended
approach.)

As an example of the help display generated by a typical option table,
let's take a look at the following:

    $Verbose = 1;
    $Clobber = 0;
    undef $InFile;
    @Threshold = (0, 1);

    @argtbl = (["-verbose|-quiet", "boolean", 0, \$Verbose,
                "be noisy"],
               ["-clobber", "boolean", 0, \$Clobber,
                "overwrite existing files"],
               ["-infile", "string", 1, \$InFile,
                "specify the input file from which to read a large " .
                "and sundry variety of data, to which many " .
                "interesting operations will be applied", "<f>"],
               ["-threshold", "float", 2, \@Threshold,
                "only consider values between <v1> and <v2>",
                "<v1> <v2>"]);


Assuming you haven't supplied long help or usage strings, then when
C<GetOptions> encounters the help option, it will immediately stop
parsing arguments and print out the following option summary:

    Summary of options:
       -verbose    be noisy [default]
       -quiet      opposite of -verbose
       -clobber    overwrite existing files
       -noclobber  opposite of -clobber [default]
       -infile <f> specify the input file from which to read a large and
                   sundry variety of data, to which many interesting
                   operations will be applied
       -threshold <v1> <v2>
                   only consider values between <v1> and <v2> [default: 0 1]

There are a number of interesting things to note here. First, there are
three option table fields that affect the generation of help text:
B<option>, B<help_string>, and B<arg_desc>. Note how the B<arg_desc>
strings are simply option placeholders, usually used to 1) indicate how
many values are expected to follow an option, 2) (possibly) imply what
form they take (although that's not really shown here), and 3) explain
the exact meaning of the values in the help text. B<arg_desc> is just a
string like the help string; you can put whatever you like in it. What
I've shown above is just my personal preference (which may well evolve).

A new feature with version 0.3 of Getopt::Tabular is the inclusion of
default values with the help for certain options. A number of
conditions must be fulfilled for this to happen for a given option:
first, the option type must be one of the "argument-driven" types, such
as C<integer>, C<float>, C<string>, or a user-defined type. Second, the
option data field must refer either to a defined scalar value (for
scalar-valued options) or to a list of one or more defined values (for
vector-valued options).
Thus, in the above example, the C<-infile>
option doesn't have its default printed because the C<$InFile> scalar is
undefined.  Likewise, if the C<@Threshold> array were the empty list
C<()>, or a list of undefined values C<(undef,undef)>, then the default
value for C<-threshold> also would not have been printed.

The formatting is done as follows: enough room is made on the left-hand
side for the longest option name, initially omitting the argument
placeholders.  Then, if an option has placeholders, and there is room
for them in between the option and the help string, everything (option,
placeholders, help string) is printed together.  An example of this is
the C<-infile> option: here, "-infile <f>" is just small enough to fit
in the 12-character column (10 characters because that is the length of
the longest option, and 2 blanks), so the help text is placed right
after it on the same line.  However, the C<-threshold> option becomes
too long when its argument placeholders are appended to it, so the help
text is pushed onto the next line.

In any event, the help string supplied by the caller starts at the same
column, and is filled to make a nice paragraph of help.  C<GetOptions> will
fill to the width of the terminal (or 80 columns if it fails to find the
terminal width).

Finally, you can have pseudo entries of type B<section>, which are
important to make long option lists readable (and one consequence of
using Getopt::Tabular is programs with ridiculously long option lists -- not
altogether a bad thing, I suppose).  For example, this table fragment:

   @argtbl = (...,
              ["-foo", "integer", 1, \$Foo,
               "set the foo value", "f"],
              ["-enterfoomode", "call", 0, \&enter_foo_mode,
               "enter foo mode"],
              ["Non-foo related options", "section"],
              ["-bar", "string", 2, \@Bar,
               "set the bar strings (which have nothing whatsoever " .
               "to do with foo)", "b1 b2"],
              ...);

results in the following chunk of help text:

      -foo f         set the foo value
      -enterfoomode  enter foo mode

   -- Non-foo related options ---------------------------------
      -bar b1 b2     set the bar strings (which have nothing
                     whatsoever to do with foo)

(This example also illustrates a slightly different style of argument
placeholder.  Take your pick, or invent your own!)

=head1 SPOOF MODE

Since callbacks from the command line (C<call> and C<eval> options) can
do anything, they might be quite expensive.  In certain cases, then, you
might want to make an initial pass over the command line to ensure that
everything is OK before parsing it "for real" and incurring all those
expensive callbacks.  Thus, C<Getopt::Tabular> provides a "spoof" mode
for parsing a command line without side-effects.  In the simplest case,
you can access spoof mode like this:

   use Getopt::Tabular qw(SpoofGetOptions GetOptions);
     .
     .
     .
   &SpoofGetOptions (\@options, \@ARGV, \@newARGV) || exit 1;

and then later on, you would call C<GetOptions> with the I<original>
C<@ARGV> (so it can do what C<SpoofGetOptions> merely pretended to do):

   &GetOptions (\@options, \@ARGV, \@newARGV) || exit 1;

For most option types, any errors that C<GetOptions> would catch should
also be caught by C<SpoofGetOptions> -- so you might initially think
that you can get away without that C<|| exit 1> after calling
C<GetOptions>.  However, it's a good idea for a couple of reasons.
First, you might inadvertently have changed C<@ARGV> between the two
calls -- this is usually a bug and a silly thing to do, so you'd
probably want your program to crash loudly rather than fail
mysteriously later on.
Second, and more likely,
some of those expensive operations that you're initially avoiding by
using C<SpoofGetOptions> might themselves fail -- which would cause
C<GetOptions> to return false where C<SpoofGetOptions> completes without
a problem.  (Finally, there's the faint possibility of bugs in
C<Getopt::Tabular> that would cause different behaviour in spoof mode
and real mode -- this really shouldn't happen, though.)

In reality, using spoof mode requires a bit more work.  In particular,
the whole reason for spoof argument parsing is to avoid expensive
callbacks, but since callbacks can eat any number of command line
arguments, you have to emulate them in some way.  It's not possible for
C<SpoofGetOptions> to do this for you, so you have to help out by
supplying "spoof" callbacks.  As an example, let's say you have a
callback option that eats one argument (a filename) and immediately
reads that file:

   @filedata = ();

   sub read_file
   {
      my ($opt, $args) = @_;

      warn ("$opt option requires an argument\n"), return 0 unless @$args;
      my $file = shift @$args;
      open (FILE, $file) ||
         (warn ("$file: $!\n"), return 0);
      push (@filedata, <FILE>);
      close (FILE);
      return 1;
   }

   @options =
      (['-read_file', 'call', undef, \&read_file]);

Since C<-read_file> could occur any number of times on the command line,
we might end up reading an awful lot of files, and thus it might be a
long time before we catch errors late in the command line.  Thus, we'd
like to do a "spoof" pass over the command line to catch all errors.
A simplistic approach would be to supply a spoof callback that just eats
one argument and returns success:

   sub spoof_read_file
   {
      my ($opt, $args) = @_;
      (warn ("$opt option requires an argument\n"), return 0)
         unless @$args;
      shift @$args;
      return 1;
   }

Then, you have to tell C<Getopt::Tabular> about this alternate callback
with no side-effects (apart from eating that one argument):

   &Getopt::Tabular::SetSpoofCodes (-read_file => \&spoof_read_file);

(C<SetSpoofCodes> just takes a list of key/value pairs, where the keys
are C<call> or C<eval> options, and the values are the "no side-effects"
callbacks.  Naturally, the replacement callback for an C<eval> option
should be a string, and for a C<call> option it should be a code
reference.  This is not actually checked, however, until you call
C<SpoofGetOptions>, because C<SetSpoofCodes> doesn't know whether
options are C<call> or C<eval> or what.)

A more useful C<spoof_read_file>, however, would actually check if the
requested file exists -- i.e., we should try to catch as many errors as
possible, as early as possible:

   sub spoof_read_file
   {
      my ($opt, $args) = @_;
      warn ("$opt option requires an argument\n"), return 0
         unless @$args;
      my $file = shift @$args;
      warn ("$file does not exist or is not readable\n"), return 0
         unless -r $file;
      return 1;
   }

Finally, you can frequently merge the "real" and "spoof" callbacks into
one subroutine:

   sub read_file
   {
      my ($opt, $args, $spoof) = @_;

      warn ("$opt option requires an argument\n"), return 0 unless @$args;
      my $file = shift @$args;
      warn ("$file does not exist or is not readable\n"), return 0
         unless -r $file;
      return 1 if $spoof;
      open (FILE, $file) ||
         (warn ("$file: $!\n"), return 0);
      push (@filedata, <FILE>);
      close (FILE);
      return 1;
   }

And then, when specifying the
replacement callback to C<SetSpoofCodes>,
just create an anonymous sub that calls C<read_file> with C<$spoof>
true:

   &Getopt::Tabular::SetSpoofCodes
      (-read_file => sub { &read_file (@_[0,1], 1) });

Even though this means a bigger and more complicated callback, you only
need I<one> such callback -- the alternative is to carry around both
C<read_file> and C<spoof_read_file>, which might do redundant processing
of the argument list.

=head1 AUTHOR

Greg Ward <greg@bic.mni.mcgill.ca>

Started in July, 1995 as ParseArgs.pm, with John Ousterhout's
Tk_ParseArgv.c as a loose inspiration.  Many many features added over
the ensuing months; documentation written in a mad frenzy 16-18 April,
1996.  Renamed to Getopt::Tabular, revamped, reorganized, and
documentation expanded 8-11 November, 1996.

Copyright (c) 1995-97 Greg Ward.  All rights reserved.  This is
free software; you can redistribute it and/or modify it under the same
terms as Perl itself.

=head1 BUGS

The documentation is bigger than the code, and I still haven't covered
option patterns or extending the type system (apart from pattern types).
Yow!

No support for list-valued options, although you can roll your own with
B<call> options.  (See the demo program included with the distribution
for an example.)

Error messages are hard-coded to English.