1package CGI::Struct; 2 3use strict; 4use warnings; 5 6=head1 NAME 7 8CGI::Struct - Build structures from CGI data 9 10=head1 VERSION 11 12Version 1.21 13 14=cut 15 16our $VERSION = '1.21'; 17 18 19=head1 SYNOPSIS 20 21This module allows transforming CGI GET/POST data into intricate data 22structures. It is reminiscent of PHP's building arrays from form data, 23but with a perl twist. 24 25 use CGI; 26 use CGI::Struct; 27 my $cgi = CGI->new; 28 my %params = $cgi->Vars; 29 my $struct = build_cgi_struct \%params; 30 31=head1 DESCRIPTION 32 33CGI::Struct lets you transform CGI data keys that I<look like> perl data 34structures into I<actual> perl data structures. 35 36CGI::Struct makes no attempt to actually I<read in> the variables from 37the request. You should be using L<CGI> or some equivalent for that. 38CGI::Struct expects to be handed a reference to a hash containing all the 39keys/values you care about. The common way is to use something like 40C<CGI-E<gt>Vars> or (as the author does) 41C<Plack::Request-E<gt>parameters-E<gt>mixed>. 42 43Whatever you use should give you a hash mapping the request variable 44names (keys) to the values sent in by the users (values). Any of the 45major CGIish modules will have such a method; consult the documentation 46for yours if you don't know it offhand. 47 48Of course, this isn't necessarily tied strictly to CGI; you I<could> use 49it to build data structures from any other source with similar syntax. 50All CGI::Struct does is take one hash (reference) and turn it into 51another hash (reference). However, it's aimed at CGI uses, so it may or 52may not work for something else. 53 54 55=head1 EXAMPLES 56 57=head2 Basic Usage 58 59 <form action="request.cgi"> 60 Name: <input type="text" name="uinfo{name}"> 61 Address: <input type="text" name="uinfo{addr}"> 62 Email: <input type="text" name="uinfo{email}"> 63 </form> 64 65When filled out and submitted the data will come in to request.cgi, which 66will use something like C<CGI-E<gt>Vars> to parse it out into a hash 67 68 use CGI; 69 my $cgi = CGI->new; 70 my %params = $cgi->Vars; 71 72You'll wind up with something like 73 74 %params = ( 75 'uinfo{name}' => 'Bob', 76 'uinfo{addr}' => '123 Main Street', 77 'uinfo{email}' => 'bob@bob.bob', 78 ) 79 80Now we use CGI::Struct to parse that out 81 82 use CGI::Struct; 83 my $struct = build_cgi_struct \%params; 84 85and we wind up with a structure that looks more like 86 87 $struct = { 88 'uinfo' => { 89 name => 'Bob', 90 addr => '123 Main Street', 91 email => 'bob@bob.bob', 92 } 93 } 94 95which is much simpler to use in your code. 96 97=head2 Arrays 98 99CGI::Struct also has the ability to build out arrays. 100 101 First cousin: <input type="text" name="cousins[0]"> 102 Second cousin: <input type="text" name="cousins[1]"> 103 Third cousin: <input type="text" name="cousins[2]"> 104 105Run it through CGI to get the parameters, run through 106L</build_cgi_struct>, and we get 107 108 $struct = { 109 'cousins' => [ 110 'Jill', 111 'Joe', 112 'Judy' 113 ] 114 } 115 116Of course, most CGIish modules will roll that up into an array if you 117just call it 'cousins' and have multiple inputs. But this lets you 118specify the indices. For instance, you may want to base the array from 1 119instead of 0: 120 121 First cousin: <input type="text" name="cousins[1]"> 122 Second cousin: <input type="text" name="cousins[2]"> 123 Third cousin: <input type="text" name="cousins[3]"> 124 125 $struct = { 126 'cousins' => [ 127 undef, 128 'Jill', 129 'Joe', 130 'Judy' 131 ] 132 } 133 134See also the L</Auto-arrays> section. 135 136=head3 NULL delimited multiple values 137 138When using L<CGI>'s C<-E<gt>Vars> and similar, multiple passed values 139will wind up as a C<\0>-delimited string, rather than an array ref. By 140default, CGI::Struct will split it out into an array ref. This behavior 141can by disabled by using the C<nullsplit> config param; see the 142L<function doc below|/build_cgi_struct>. 143 144=head2 Deeper and deeper 145 146Specifying arrays explicitly is also useful when building arbitrarily 147deep structures, since the array doesn't have to be at the end 148 149 <select name="users{bob}{cousins}[5]{firstname}"> 150 151After a quick trip through L</build_cgi_struct>, that'll turn into 152C<$struct-E<gt>{users}{bob}{cousins}[5]{firstname}> just like you'd expect. 153 154=head2 Dotted hashes 155 156Also supported is dot notation for hash keys. This saves you a few 157keystrokes, and can look neater. Hashes may be specified with either 158the C<.> or with C<{}>. Arrays can only be written with C<[]>. 159 160The above C<select> could be written using dots for some or all of the 161hash keys instead, looking a little Javascript-ish 162 163 <select name="users.bob.cousins[5].firstname"> 164 <select name="users.bob{cousins}[5].firstname"> 165 <select name="users{bob}.cousins[5]{firstname}"> 166 167of course, you wouldn't really want to mix-and-match in one field in 168practice; it just looks silly. 169 170Sometimes, though, you may want to have dots in field names, and you 171wouldn't want this parsing to happen then. It can be disabled for a run 172of L</build_cgi_struct> by passing a config param in; see the L<function 173doc below|/build_cgi_struct>. 174 175=head2 Auto-arrays 176 177CGI::Struct also builds 'auto-arrays', which is to say it turns 178parameters ending with an empty C<[]> into arrays and pushes things onto 179them. 180 181 <select multiple="multiple" name="users[]"> 182 183turns into 184 185 $struct->{users} = ['lots', 'of', 'choices']; 186 187This may seem unnecessary, given the ability of most CGI modules to 188already build the array just by having multiple C<users> params given. 189Also, since L</build_cgi_struct> only sees the data after your CGI module 190has already parsed it out, it will only ever see a single key in its 191input hash for any name anyway, since hashes can't have multiple keys 192with the same name anyway. 193 194However, there are a few uses for it. PHP does this, so it makes for an 195easier transition. Also, it forces an array, so if you only chose one 196entry in the list, L</build_cgi_struct> would still make that element in 197the structure a (single-element) array 198 199 $struct->{users} = ['one choice']; 200 201which makes your code a bit simpler, since you don't have to expect both 202a scalar and an array in that place (though of course you should make 203sure it's what you expect for robustness). 204 205 206=head1 FUNCTIONS 207 208=cut 209 210 211# Delimiters/groupers 212my $delims = "[{."; 213 214# Tuple types for each delim 215my %dtypes = ( '[' => 'array', '{' => 'hash', '.' => 'hash' ); 216 217# Correponding ending groups 218my %dcorr = ( '[' => ']', '{' => '}', '.' => undef ); 219 220# Yeah, export it 221require Exporter; 222our @ISA = qw(Exporter); 223our @EXPORT = qw(build_cgi_struct); 224 225use Storable qw(dclone); 226 227 228 229 230 231=head2 build_cgi_struct 232 233 $struct = build_cgi_struct \%params; 234 235 $struct = build_cgi_struct \%params, \@errs; 236 237 $struct = build_cgi_struct \%params, \@errs, \%conf; 238 239C<build_cgi_struct()> is the only function provided by this module. It 240takes as an argument a reference to a hash of parameter name keys and 241parameter value values. It returns a reference to a hash with the fully 242built up structure. Any keys that can't be figured out are not present 243in the returned hash. 244 245An optional array reference can be passed as the second argument, in 246which case the array will be filled in with any warnings or errors found 247in trying to build the structure. This should be taken as a debugging 248tool for the developer's eyes to parse, not a source of friendly-looking 249warnings to hand to non-technical users or as strongly formatted strings 250for automated error mining. 251 252A hash reference may be supplied as a third argument for passing config 253parameters. The currently supported parameters are: 254 255=over 256 257=item nodot 258 259This allows you to disable processing of C<.> as a hash element 260separator. There may be cases where you want a C<.> as part of a field 261name, so this lets you still use C<{}> and C<[]> structure in those 262cases. 263 264The default is B<false> (i.e., I<do> use C<.> as separator). Pass a true 265value (like C<1>) to B<not> do so. 266 267=item nullsplit 268 269C<CGI-E<gt>Vars> and compatible functions tend to, in hash form, wind up 270with a NULL-delimited list rather than an array ref when passed multiple 271values with the same key. CGI::Struct will check string values for 272embedded C<\0>'s and, if found, C<split> the string on them and create an 273arrayref. 274 275The C<nullsplit> config param lets you disable this if you want strings 276with embedded C<\0> to pass through unmolested. Pass a false value (like 277C<0>) to disable the splitting. 278 279=item dclone 280 281By default, CGI::Struct uses L<Storable>'s C<dclone> to do deep copies of 282incoming data structures. This ensures that whatever changes you might 283make to C<$struct> later on don't change stuff in C<%params> too. By 284setting dclone to a B<false> value (like C<0>) you can disable this, and 285make it so deeper refs in the data structures point to the same items. 286 287You probably don't want to do this, unless some data is so huge you don't 288want to keep 2 copies around, or you really I<do> want to edit the 289original C<%params> for some reason. 290 291=back 292 293=cut 294 295sub build_cgi_struct 296{ 297 my ($iv, $errs, $conf) = @_; 298 299 my (%ret, @errs); 300 301 # Allow disabling '.' 302 my $delims = $delims; 303 $delims =~ s/\.// if($conf && $conf->{nodot}); 304 305 # nullsplit defaults on 306 $conf->{nullsplit} = 1 unless exists $conf->{nullsplit}; 307 308 # So does deep cloning 309 $conf->{dclone} = 1 unless exists $conf->{dclone}; 310 my $dclone = sub { @_ > 1 ? @_ : $_[0] }; 311 $dclone = \&dclone if $conf->{dclone}; 312 313 # Loop over keys, one at a time. 314 DKEYS: for my $k (keys %$iv) 315 { 316 # Shortcut; if it doesn't contain any special chars, just assign 317 # to the output and go back around. 318 unless( $k =~ /[$delims]/) 319 { 320 my $nval = ref $iv->{$k} ? $dclone->($iv->{$k}) : $iv->{$k}; 321 $nval = [split /\0/, $nval] 322 if($conf->{nullsplit} && ref($nval) eq '' 323 && $nval =~ /\0/); 324 $ret{$k} = $nval; 325 next; 326 } 327 328 # Bomb if it starts with a special 329 if($k =~ /^[$delims]/) 330 { 331 push @errs, "Bad key; unexpected initial char in $k"; 332 next; 333 } 334 335 # Break it up into the pieces. Use the capture in split's 336 # pattern so we get the bits it matched, so we can differentiate 337 # between hashes and arrays. 338 my @kps = split /([$delims])/, $k; 339 340 # The first of that is our top-level key. Use that to initialize 341 # our pointer to walk down the structure. 342 # $p remains a reference to a reference all the way down the 343 # walk. That's necessary; if we just make it a single reference, 344 # then it couldn't be used to replace a level as necessary (e.g., 345 # from undef to [] or {} when we initialize). 346 my $p; 347 { 348 my $topname = shift @kps; 349 350 # Make sure the key exists, then ref at it. 351 $ret{$topname} ||= undef; 352 353 # A reference to a reference 354 $p = \$ret{$topname}; 355 } 356 357 # Flag for autoarr'ing the value 358 my $autoarr = 0; 359 360 # Now walk over the rest of the pieces and create the structure 361 # all the way down 362 my $i = 0; 363 while($i <= $#kps) 364 { 365 # First bit should be a special 366 if(length($kps[$i]) != 1 || $kps[$i] !~ /^[$delims]$/) 367 { 368 # This should only be possible via internal error. If 369 # deliminters aren't properly matched anywhere along the 370 # way, we _could_ end up with a case where the 371 # even-numbered items here aren't valid openers, but if 372 # that's the case then some error will have already 373 # triggered about the mismatch. 374 die "Internal error: Bad type $kps[$i] found at $i for $k"; 375 } 376 377 # OK, pull out that delimiter, and the name of the piece 378 my $sdel = $kps[$i++]; 379 my $sname = $kps[$i++]; 380 381 # The name should end with the corresponding ender... 382 if($dcorr{$sdel} && $dcorr{$sdel} ne substr($sname, -1)) 383 { 384 push @errs, "Didn't find ender for ${sdel} in $sname for $k"; 385 next DKEYS; 386 } 387 # ... and remove it, leaving just the name 388 chop $sname if $dcorr{$sdel}; 389 390 # Better be >0 chars... 391 unless(defined($sname) && length $sname) 392 { 393 # Special case: if this is the last bit, and it's an 394 # array, then we do the auto-array stuff. 395 if($i > $#kps && $dtypes{$sdel} eq "array") 396 { 397 $autoarr = 1; 398 last; 399 } 400 401 # Otherwise a 0-length label is an error. 402 push @errs, "Zero-length name element found in $k"; 403 next DKEYS; 404 } 405 406 # If it's an array, better be a number 407 if($dtypes{$sdel} eq "array" && $sname !~ /^\d+$/) 408 { 409 push @errs, "Array subscript should be a number, " 410 . "not $sname in $k"; 411 next DKEYS; 412 } 413 414 415 # Now we know the type, so fill in that level of the 416 # structure 417 my $stype = $dtypes{$sdel}; 418 419 # Initialize if necessary. 420 if($stype eq "array") 421 { ($$p) ||= [] } 422 elsif($stype eq "hash") 423 { ($$p) ||= {} } 424 else 425 { die "Internal error: unknown type $stype in $k" } 426 427 # Check type 428 unless(ref($$p) eq uc($stype)) 429 { 430 push @errs, "Type mismatch: already have " . ref($$p) 431 . ", expecting $stype for $sname in $k"; 432 # Give up on this key totally; who knows what to do 433 next DKEYS; 434 } 435 436 # Set. Move our pointer down a step, and loop back around to 437 # the next component in this path 438 if($stype eq "array") 439 { $p = \($$p)->[$sname] } 440 elsif($stype eq "hash") 441 { $p = \($$p)->{$sname} } 442 443 # And back around 444 } 445 446 447 # OK, we're now all the way to the bottom, and $p is a reference 448 # to that last step in the structure. Fill in the value ($p 449 # becomes a reference to a reference to that value). 450 # Special case: for autoarrays, we make sure the value ends up 451 # being a single-element array rather than a scalar, if it isn't 452 # already an array. 453 my $nval = ref $iv->{$k} ? $dclone->($iv->{$k}) : $iv->{$k}; 454 $nval = [split /\0/, $nval] 455 if($conf->{nullsplit} && ref($nval) eq '' && $nval =~ /\0/); 456 if($autoarr && $nval && ref($nval) ne 'ARRAY') 457 { $$p = [$nval]; } 458 else 459 { $$p = $nval; } 460 461 # And around to the next key 462 } 463 464 465 # If they asked for error details, give it to 'em 466 push @$errs, @errs if $errs; 467 468 # Done! 469 return \%ret; 470} 471 472=head1 SEE ALSO 473 474L<CGI>, L<CGI::Simple>, L<CGI::Minimal>, L<Plack>, and many other choices 475for handling transforming a browser's request info a data structure 476suitable for parsing. 477 478L<CGI::State> is somewhat similar to CGI::Struct, but is extremely 479tightly coupled to L<CGI> and doesn't have as much flexibility in the 480structures it can build. 481 482L<CGI::Expand> also does similar things, but is more closely tied to 483L<CGI> or a near-equivalent. It tries to DWIM hashes and arrays using 484only a single separator. 485 486The structure building done here is a perlish equivalent to the structure 487building PHP does with passed-in parameters. 488 489=head1 AUTHOR 490 491Matthew Fuller, C<< <fullermd@over-yonder.net> >> 492 493=head1 BUGS 494 495Please report any bugs or feature requests to C<bug-cgi-struct at 496rt.cpan.org>, or through the web interface at 497L<http://rt.cpan.org/NoAuth/ReportBug.html?Queue=CGI-Struct>. I will be 498notified, and then you'll automatically be notified of progress on your 499bug as I make changes. 500 501=head1 SUPPORTED VERSIONS 502 503CGI::Struct should work on perl 5.6 and later. It includes a 504comprehensive test suite, so passing that should be an indicator that it 505works. If that's not the case, I want to hear about it so the testing 506can be improved! 507 508=head1 SUPPORT 509 510You can find documentation for this module with the perldoc command. 511 512 perldoc CGI::Struct 513 514 515You can also look for information at: 516 517=over 4 518 519=item * RT: CPAN's request tracker 520 521L<http://rt.cpan.org/NoAuth/Bugs.html?Dist=CGI-Struct> 522 523=item * AnnoCPAN: Annotated CPAN documentation 524 525L<http://annocpan.org/dist/CGI-Struct> 526 527=item * CPAN Ratings 528 529L<http://cpanratings.perl.org/d/CGI-Struct> 530 531=item * Search CPAN 532 533L<http://search.cpan.org/dist/CGI-Struct/> 534 535=back 536 537 538=head1 LICENSE AND COPYRIGHT 539 540Copyright 2010-2012 Matthew Fuller. 541 542This software is licensed under the 2-clause BSD license. See the 543LICENSE file in the distribution for details. 544 545=cut 546 5471; # End of CGI::Struct 548