1package Hash::MultiKey; 2 3use 5.006; 4use strict; 5use warnings; 6 7use Carp; 8 9use vars qw($VERSION); 10$VERSION = '0.06'; 11 12# ---[ Implementation Overview ]---------------------------------------- 13# 14# The first implementation of this module was based in an explicit tree. 15# Right after its announcement in news:comp.lang.perl.modules Benjamin 16# Goldberg suggested a radically different approach, far much simple and 17# efficient. The current code is entirely based on his idea. 18# 19# Multi-key hashes are implemented now with a plain hash. There is no 20# nesting involved. 21# 22# Lists of keys are converted to strings with pack(): 23# 24# $key = pack 'N' . ('w/a*' x @$keys), scalar(@$keys), @$keys; 25# 26# and that $key is what's used in the underlying hash. The first chunk 27# stores the number of keys, to be used afterwards when we decode it. 28# Then, pairs length_of_key/key follow. 29# 30# Conversely, to retrieve the original list of keys from a real key we 31# use unpack(): 32# 33# $n = unpack 'N', $key; 34# [ unpack 'x4' . ('w/a*' x $n), $key ]; 35# 36# Iteration is delegated to the iterator of the very hash. 37# 38# Knowing that the following code is crystal clear, so comments have 39# been removed altogether. 40# 41# ---------------------------------------------------------------------- 42 43 44sub TIEHASH { 45 bless {}, shift; 46} 47 48sub CLEAR { 49 %{ shift() } = (); 50} 51 52sub FETCH { 53 my ($self, $keys) = @_; 54 $keys = [$keys eq '' ? ('') : split /$;/, $keys, -1] unless ref $keys eq 'ARRAY'; 55 @$keys or croak "Empty multi-key\n"; 56 $self->{pack 'N' . ('w/a*' x @$keys), scalar(@$keys), @$keys}; 57} 58 59sub STORE { 60 my ($self, $keys, $value) = @_; 61 $keys = [$keys eq '' ? ('') : split /$;/, $keys, -1] unless ref $keys eq 'ARRAY'; 62 @$keys or croak "Empty multi-key\n"; 63 $self->{pack 'N' . ('w/a*' x @$keys), scalar(@$keys), @$keys} = $value; 64} 65 66sub DELETE { 67 my ($self, $keys) = @_; 68 $keys = [$keys eq '' ? ('') : split /$;/, $keys, -1] unless ref $keys eq 'ARRAY'; 69 @$keys or croak "Empty multi-key\n"; 70 delete $self->{pack 'N' . ('w/a*' x @$keys), scalar(@$keys), @$keys}; 71} 72 73sub EXISTS { 74 my ($self, $keys) = @_; 75 $keys = [$keys eq '' ? ('') : split /$;/, $keys, -1] unless ref $keys eq 'ARRAY'; 76 @$keys or croak "Empty multi-key\n"; 77 exists $self->{pack 'N' . ('w/a*' x @$keys), scalar(@$keys), @$keys}; 78} 79 80sub FIRSTKEY { 81 my ($self) = @_; 82 keys %$self; # reset iterator 83 $self->NEXTKEY; 84} 85 86sub NEXTKEY { 87 my ($self) = @_; 88 defined(my $key = each %$self) or return; 89 my $n = unpack 'N', $key; 90 [ unpack 'x4' . ('w/a*' x $n), $key ]; 91} 92 93sub SCALAR { 94 my ($self) = @_; 95 scalar %$self; 96} 97 981; 99 100 101__END__ 102 103=head1 NAME 104 105Hash::MultiKey - hashes whose keys can be multiple 106 107=head1 SYNOPSIS 108 109 use Hash::MultiKey; 110 111 # tie first 112 tie %hmk, 'Hash::MultiKey'; 113 114 # store 115 $hmk{['foo', 'bar', 'baz']} = 1; 116 117 # fetch 118 $v = $hmk{['foo', 'bar', 'baz']}; 119 120 # exists 121 exists $hmk{['foo', 'bar', 'baz']}; # true 122 123 # each 124 while (($mk, $v) = each %hmk) { 125 @keys = @$mk; 126 # ... 127 } 128 129 # keys 130 foreach $mk (keys %hmk) { 131 @keys = @$mk; 132 # ... 133 } 134 135 # values 136 foreach $v (values %hmk) { 137 $v =~ s/foo/bar/g; # alias, modifies value in %hmk 138 # ... 139 } 140 141 # delete 142 $rmed_value = delete $hmk{['foo', 'bar', 'baz']}; 143 144 # clear 145 %hmk = (); 146 147 # syntactic sugar, but see risks below 148 $hmk{'foo', 'bar', 'baz', 'zoo'} = 2; 149 150 # finally, untie 151 untie %hmk; 152 153=head1 DESCRIPTION 154 155Hash::MultiKey provides hashes that accept arrayrefs of strings as keys. 156 157Two multi-keys are regarded as being equal if their I<contents> are 158equal, there is no need to use the same reference to refer to the same 159hash entry: 160 161 $hmk{['foo', 'bar', 'baz']} = 1; 162 exists $hmk{['foo', 'bar', 'baz']}; # different arrayref, but true 163 164A given hash can have multi-keys of different lengths: 165 166 $hmk{['foo']} = 1; # length 1 167 $hmk{['foo', 'bar', 'baz']} = 3; # length 3, no problem 168 169In addition, multi-keys cannot be empty: 170 171 $hmk{[]} = 1; # ERROR 172 173The next sections document how hash-related operations work in a 174multi-key hash. Some parts have been copied from standard documentation, 175since everything has standard semantics. 176 177=head2 tie 178 179Once you have tied a hash variable to Hash::MultiKey as in 180 181 tie my (%hmk), 'Hash::MultiKey'; 182 183you've got a hash whose keys are arrayrefs of strings. Having that in 184mind everything works as expected. 185 186=head2 store 187 188Assignment is this easy: 189 190 $hmk{['foo', 'bar', 'baz']} = 1; 191 192=head2 fetch 193 194So is fetching: 195 196 $v = $hmk{['foo', 'bar', 'baz']}; 197 198=head2 exists 199 200Testing for existence works as usual: 201 202 exists $hmk{['foo', 'bar', 'baz']}; # true 203 204Only whole multi-keys as they were used in assigments have entries. 205Sub-chains do not exist unless they were assigned some value. 206 207For instance, C<['foo']> is a sub-chain of C<['foo', 'bar', 'baz']>, but 208if it has no entry in %hmk so far 209 210 exists $hmk{['foo']}; # false 211 212=head2 each 213 214As with everyday C<each()>, when called in list context returns a 2152-element list consisting of the key and value for the next element of 216the hash, so that you can iterate over it. When called in scalar 217context, returns only the key for the next element in the hash. 218 219Remember keys are arrayrefs of strings here: 220 221 while (($mk, $v) = each %hmk) { 222 @keys = @$mk; 223 # ... 224 } 225 226The order in which entries are returned is guaranteed to be the same one 227as either the C<keys()> or C<values()> function would produce on the 228same (unmodified) hash. 229 230When the hash is entirely read, a null array is returned in list context 231(which when assigned produces a false (0) value), and C<undef> in scalar 232context. The next call to C<each()> after that will start iterating 233again. 234 235There is a single iterator for each hash, shared by all C<each()>, 236C<keys()>, and C<values()> function calls in the program. 237 238Adding or deleting entries while we're iterating over the hash results 239in undefined behaviour. Nevertheless, it is always safe to delete the 240item most recently returned by C<each()>, which means that the following 241code will work: 242 243 while (($mk, $v) = each %hmk) { 244 print "@$mk\n"; 245 delete $hmk{$mk}; # this is safe 246 } 247 248=head2 keys 249 250Returns a list consisting of all the keys of the named hash. (In scalar 251context, returns the number of keys.) The keys are returned in an 252apparently random order. The actual random order is subject to change in 253future versions of perl, but it is guaranteed to be the same order as 254either the C<values()> or C<each()> function produces (given that the 255hash has not been modified). As a side effect, it resets hash's 256iterator. 257 258Remember keys are arrayrefs of strings here: 259 260 foreach $mk (keys %hmk) { 261 @keys = @$mk; 262 # ... 263 } 264 265There is a single iterator for each hash, shared by all C<each()>, 266C<keys()>, and C<values()> function calls in the program. 267 268The returned values are copies of the original keys in the hash, so 269modifying them will not affect the original hash. Compare C<values()>. 270 271=head2 values 272 273Returns a list consisting of all the values of the named hash. (In a 274scalar context, returns the number of values.) The values are returned 275in an apparently random order. The actual random order is subject to 276change in future versions of perl, but it is guaranteed to be the same 277order as either the C<keys()> or C<each()> function would produce on the 278same (unmodified) hash. 279 280Note that the values are not copied, which means modifying them will 281modify the contents of the hash: 282 283 s/foo/bar/g foreach values %hmk; # modifies %hmk's values 284 s/foo/bar/g foreach @hash{keys %hash}; # same 285 286As a side effect, calling C<values()> resets hash's internal iterator. 287 288There is a single iterator for each hash, shared by all C<each()>, 289C<keys()>, and C<values()> function calls in the program. 290 291 292=head2 delete 293 294Deletes the specified element(s) from the hash. Returns each element so 295deleted or the undefined value if there was no such element. 296 297The following (inefficiently) deletes all the values of %hmk: 298 299 foreach $mk (keys %hmk) { 300 delete $hmk{$mk}; 301 } 302 303And so do this: 304 305 delete @hmk{keys %hmk}; 306 307But both methods are slower than just assigning the empty list to %hmk: 308 309 %hmk = (); # clear %hmk, the efficient way 310 311=head2 untie 312 313Untie the variable when you're done: 314 315 untie %hmk; 316 317=head1 SYNTACTIC SUGAR 318 319Hash::MultiKey supports also this syntax: 320 321 $hash{'see', '$;', 'in', 'perldoc', 'perlvar'} = 1; 322 323If the key is a string instead of an arrayref the underlying code splits 324it using C<$;> (see why in L<MOTIVATION>) and from then on the key is an 325arrayref as any true multi-key. Thus, the assigment above is equivalent 326to 327 328 $hash{['see', '$;', 'in', 'perldoc', 'perlvar']} = 1; 329 330once it has been processed. 331 332You I<don't> need to split the string back while iterating with 333C<each()> or C<keys()>, it already comes as an arrayref of strings. 334 335Nevertheless take into account that this is B<slower>, and B<broken> if 336any of the components contains C<$;>. It is supported just for 337consistency's sake. 338 339 340=head1 MOTIVATION 341 342Perl comes already with some support for hashes with multi-keys. As you 343surely know, if perl sees 344 345 $hash{'foo', 'bar', 'baz'} = 1; 346 347it joins C<('foo', 'bar', 'baz')> with C<$;> to obtain the actual key, 348thus resulting in a string. Then you retrieve the components of the 349multi-key like this: 350 351 while (($k, $v) = each %hash) { 352 @keys = $k eq '' ? ('') : split /$;/, $k, -1; 353 # ... 354 } 355 356Since C<$;> is C<\034> by default, a non-printable character, this is 357often enough. 358 359Sometimes, however, that's not the most convenient way to work with 360multi-keys. For instance, that magic join doesn't work with arrays: 361 362 @array = ('foo', 'bar', 'baz'); 363 $hash{@array} = 1; # WARNING, @array evaluated in scalar context! 364 365You could be dealing with binary data. Or you could be writing a public 366module that uses user input in such a hash and don't want to rely on 367input not coming with C<$;>, or don't want to document such an obscure, 368gratuitous, and implementation dependent constraint. 369 370In such cases, Hash::MultiKey can help. 371 372=head1 AUTHORS 373 374Xavier Noria (FXN), Benjamin Goldberg (GOLDBB). 375 376=head1 THANKS 377 378Iain Truskett (SPOON) kindly checked whether this module works in perl 3795.005 and found out the use of "/" in C<pack()>, introduced in perl 3805.006, prevents that. 381 382Philip Monsen reported some tests of Hash::MultiKey 0.05 failed with 383perl 5.8.4. 384 385=head1 COPYRIGHT and LICENSE 386 387Copyright (C) 2003, Xavier Noria E<lt>fxn@cpan.orgE<gt>. All rights 388reserved. This module is free software; you can redistribute it and/or 389modify it under the same terms as Perl itself. 390 391=head1 SEE ALSO 392 393L<perlvar>, L<perltie> 394 395=cut 396