1package Hash::MultiKey;
2
3use 5.006;
4use strict;
5use warnings;
6
7use Carp;
8
9use vars qw($VERSION);
10$VERSION = '0.06';
11
12# ---[ Implementation Overview ]----------------------------------------
13#
14# The first implementation of this module was based in an explicit tree.
15# Right after its announcement in news:comp.lang.perl.modules Benjamin
16# Goldberg suggested a radically different approach, far much simple and
17# efficient. The current code is entirely based on his idea.
18#
19# Multi-key hashes are implemented now with a plain hash. There is no
20# nesting involved.
21#
22# Lists of keys are converted to strings with pack():
23#
24#     $key = pack 'N' . ('w/a*' x @$keys), scalar(@$keys), @$keys;
25#
26# and that $key is what's used in the underlying hash. The first chunk
27# stores the number of keys, to be used afterwards when we decode it.
28# Then, pairs length_of_key/key follow.
29#
30# Conversely, to retrieve the original list of keys from a real key we
31# use unpack():
32#
33#     $n = unpack 'N', $key;
34#     [ unpack 'x4' . ('w/a*' x $n), $key ];
35#
36# Iteration is delegated to the iterator of the very hash.
37#
38# Knowing that the following code is crystal clear, so comments have
39# been removed altogether.
40#
41# ----------------------------------------------------------------------
42
43
44sub TIEHASH {
45    bless {}, shift;
46}
47
48sub CLEAR {
49    %{ shift() } = ();
50}
51
52sub FETCH {
53    my ($self, $keys) = @_;
54    $keys = [$keys eq '' ? ('') : split /$;/, $keys, -1] unless ref $keys eq 'ARRAY';
55    @$keys or croak "Empty multi-key\n";
56    $self->{pack 'N' . ('w/a*' x @$keys), scalar(@$keys), @$keys};
57}
58
59sub STORE {
60    my ($self, $keys, $value) = @_;
61    $keys = [$keys eq '' ? ('') : split /$;/, $keys, -1] unless ref $keys eq 'ARRAY';
62    @$keys or croak "Empty multi-key\n";
63    $self->{pack 'N' . ('w/a*' x @$keys), scalar(@$keys), @$keys} = $value;
64}
65
66sub DELETE {
67    my ($self, $keys) = @_;
68    $keys = [$keys eq '' ? ('') : split /$;/, $keys, -1] unless ref $keys eq 'ARRAY';
69    @$keys or croak "Empty multi-key\n";
70    delete $self->{pack 'N' . ('w/a*' x @$keys), scalar(@$keys), @$keys};
71}
72
73sub EXISTS {
74    my ($self, $keys) = @_;
75    $keys = [$keys eq '' ? ('') : split /$;/, $keys, -1] unless ref $keys eq 'ARRAY';
76    @$keys or croak "Empty multi-key\n";
77    exists $self->{pack 'N' . ('w/a*' x @$keys), scalar(@$keys), @$keys};
78}
79
80sub FIRSTKEY {
81    my ($self) = @_;
82    keys %$self; # reset iterator
83    $self->NEXTKEY;
84}
85
86sub NEXTKEY {
87    my ($self) = @_;
88    defined(my $key = each %$self) or return;
89    my $n = unpack 'N', $key;
90    [ unpack 'x4' . ('w/a*' x $n), $key ];
91}
92
93sub SCALAR {
94    my ($self) = @_;
95    scalar %$self;
96}
97
981;
99
100
101__END__
102
103=head1 NAME
104
105Hash::MultiKey - hashes whose keys can be multiple
106
107=head1 SYNOPSIS
108
109  use Hash::MultiKey;
110
111  # tie first
112  tie %hmk, 'Hash::MultiKey';
113
114  # store
115  $hmk{['foo', 'bar', 'baz']} = 1;
116
117  # fetch
118  $v = $hmk{['foo', 'bar', 'baz']};
119
120  # exists
121  exists $hmk{['foo', 'bar', 'baz']}; # true
122
123  # each
124  while (($mk, $v) = each %hmk) {
125      @keys = @$mk;
126      # ...
127  }
128
129  # keys
130  foreach $mk (keys %hmk) {
131      @keys = @$mk;
132      # ...
133  }
134
135  # values
136  foreach $v (values %hmk) {
137      $v =~ s/foo/bar/g; # alias, modifies value in %hmk
138      # ...
139  }
140
141  # delete
142  $rmed_value = delete $hmk{['foo', 'bar', 'baz']};
143
144  # clear
145  %hmk = ();
146
147  # syntactic sugar, but see risks below
148  $hmk{'foo', 'bar', 'baz', 'zoo'} = 2;
149
150  # finally, untie
151  untie %hmk;
152
153=head1 DESCRIPTION
154
155Hash::MultiKey provides hashes that accept arrayrefs of strings as keys.
156
157Two multi-keys are regarded as being equal if their I<contents> are
158equal, there is no need to use the same reference to refer to the same
159hash entry:
160
161    $hmk{['foo', 'bar', 'baz']} = 1;
162    exists $hmk{['foo', 'bar', 'baz']}; # different arrayref, but true
163
164A given hash can have multi-keys of different lengths:
165
166    $hmk{['foo']}               = 1; # length 1
167    $hmk{['foo', 'bar', 'baz']} = 3; # length 3, no problem
168
169In addition, multi-keys cannot be empty:
170
171    $hmk{[]} = 1; # ERROR
172
173The next sections document how hash-related operations work in a
174multi-key hash. Some parts have been copied from standard documentation,
175since everything has standard semantics.
176
177=head2 tie
178
179Once you have tied a hash variable to Hash::MultiKey as in
180
181    tie my (%hmk), 'Hash::MultiKey';
182
183you've got a hash whose keys are arrayrefs of strings. Having that in
184mind everything works as expected.
185
186=head2 store
187
188Assignment is this easy:
189
190    $hmk{['foo', 'bar', 'baz']} = 1;
191
192=head2 fetch
193
194So is fetching:
195
196    $v = $hmk{['foo', 'bar', 'baz']};
197
198=head2 exists
199
200Testing for existence works as usual:
201
202    exists $hmk{['foo', 'bar', 'baz']}; # true
203
204Only whole multi-keys as they were used in assigments have entries.
205Sub-chains do not exist unless they were assigned some value.
206
207For instance, C<['foo']> is a sub-chain of C<['foo', 'bar', 'baz']>, but
208if it has no entry in %hmk so far
209
210    exists $hmk{['foo']}; # false
211
212=head2 each
213
214As with everyday C<each()>, when called in list context returns a
2152-element list consisting of the key and value for the next element of
216the hash, so that you can iterate over it. When called in scalar
217context, returns only the key for the next element in the hash.
218
219Remember keys are arrayrefs of strings here:
220
221    while (($mk, $v) = each %hmk) {
222        @keys = @$mk;
223        # ...
224    }
225
226The order in which entries are returned is guaranteed to be the same one
227as either the C<keys()> or C<values()> function would produce on the
228same (unmodified) hash.
229
230When the hash is entirely read, a null array is returned in list context
231(which when assigned produces a false (0) value), and C<undef> in scalar
232context. The next call to C<each()> after that will start iterating
233again.
234
235There is a single iterator for each hash, shared by all C<each()>,
236C<keys()>, and C<values()> function calls in the program.
237
238Adding or deleting entries while we're iterating over the hash results
239in undefined behaviour. Nevertheless, it is always safe to delete the
240item most recently returned by C<each()>, which means that the following
241code will work:
242
243    while (($mk, $v) = each %hmk) {
244        print "@$mk\n";
245        delete $hmk{$mk}; # this is safe
246    }
247
248=head2 keys
249
250Returns a list consisting of all the keys of the named hash. (In scalar
251context, returns the number of keys.) The keys are returned in an
252apparently random order. The actual random order is subject to change in
253future versions of perl, but it is guaranteed to be the same order as
254either the C<values()> or C<each()> function produces (given that the
255hash has not been modified). As a side effect, it resets hash's
256iterator.
257
258Remember keys are arrayrefs of strings here:
259
260    foreach $mk (keys %hmk) {
261        @keys = @$mk;
262        # ...
263    }
264
265There is a single iterator for each hash, shared by all C<each()>,
266C<keys()>, and C<values()> function calls in the program.
267
268The returned values are copies of the original keys in the hash, so
269modifying them will not affect the original hash. Compare C<values()>.
270
271=head2 values
272
273Returns a list consisting of all the values of the named hash. (In a
274scalar context, returns the number of values.) The values are returned
275in an apparently random order. The actual random order is subject to
276change in future versions of perl, but it is guaranteed to be the same
277order as either the C<keys()> or C<each()> function would produce on the
278same (unmodified) hash.
279
280Note that the values are not copied, which means modifying them will
281modify the contents of the hash:
282
283   s/foo/bar/g foreach values %hmk;       # modifies %hmk's values
284   s/foo/bar/g foreach @hash{keys %hash}; # same
285
286As a side effect, calling C<values()> resets hash's internal iterator.
287
288There is a single iterator for each hash, shared by all C<each()>,
289C<keys()>, and C<values()> function calls in the program.
290
291
292=head2 delete
293
294Deletes the specified element(s) from the hash. Returns each element so
295deleted or the undefined value if there was no such element.
296
297The following (inefficiently) deletes all the values of %hmk:
298
299    foreach $mk (keys %hmk) {
300        delete $hmk{$mk};
301    }
302
303And so do this:
304
305    delete @hmk{keys %hmk};
306
307But both methods are slower than just assigning the empty list to %hmk:
308
309    %hmk = (); # clear %hmk, the efficient way
310
311=head2 untie
312
313Untie the variable when you're done:
314
315    untie %hmk;
316
317=head1 SYNTACTIC SUGAR
318
319Hash::MultiKey supports also this syntax:
320
321    $hash{'see', '$;', 'in', 'perldoc', 'perlvar'} = 1;
322
323If the key is a string instead of an arrayref the underlying code splits
324it using C<$;> (see why in L<MOTIVATION>) and from then on the key is an
325arrayref as any true multi-key. Thus, the assigment above is equivalent
326to
327
328    $hash{['see', '$;', 'in', 'perldoc', 'perlvar']} = 1;
329
330once it has been processed.
331
332You I<don't> need to split the string back while iterating with
333C<each()> or C<keys()>, it already comes as an arrayref of strings.
334
335Nevertheless take into account that this is B<slower>, and B<broken> if
336any of the components contains C<$;>. It is supported just for
337consistency's sake.
338
339
340=head1 MOTIVATION
341
342Perl comes already with some support for hashes with multi-keys. As you
343surely know, if perl sees
344
345    $hash{'foo', 'bar', 'baz'} = 1;
346
347it joins C<('foo', 'bar', 'baz')> with C<$;> to obtain the actual key,
348thus resulting in a string. Then you retrieve the components of the
349multi-key like this:
350
351    while (($k, $v) = each %hash) {
352        @keys = $k eq '' ? ('') : split /$;/, $k, -1;
353        # ...
354    }
355
356Since C<$;> is C<\034> by default, a non-printable character, this is
357often enough.
358
359Sometimes, however, that's not the most convenient way to work with
360multi-keys. For instance, that magic join doesn't work with arrays:
361
362    @array = ('foo', 'bar', 'baz');
363    $hash{@array} = 1; # WARNING, @array evaluated in scalar context!
364
365You could be dealing with binary data. Or you could be writing a public
366module that uses user input in such a hash and don't want to rely on
367input not coming with C<$;>, or don't want to document such an obscure,
368gratuitous, and implementation dependent constraint.
369
370In such cases, Hash::MultiKey can help.
371
372=head1 AUTHORS
373
374Xavier Noria (FXN), Benjamin Goldberg (GOLDBB).
375
376=head1 THANKS
377
378Iain Truskett (SPOON) kindly checked whether this module works in perl
3795.005 and found out the use of "/" in C<pack()>, introduced in perl
3805.006, prevents that.
381
382Philip Monsen reported some tests of Hash::MultiKey 0.05 failed with
383perl 5.8.4.
384
385=head1 COPYRIGHT and LICENSE
386
387Copyright (C) 2003, Xavier Noria E<lt>fxn@cpan.orgE<gt>. All rights
388reserved. This module is free software; you can redistribute it and/or
389modify it under the same terms as Perl itself.
390
391=head1 SEE ALSO
392
393L<perlvar>, L<perltie>
394
395=cut
396