• Home
  • History
  • Annotate
Name Date Size #Lines LOC

..03-May-2022-

eg/H15-Jan-2013-22883

t/H15-Jan-2013-198149

Bucketizer.pmH A D15-Jan-201316.9 KiB622191

ChangesH A D15-Jan-2013923 4129

MANIFESTH A D15-Jan-2013311 1413

MANIFEST.SKIPH A D15-Jan-2013104 1211

META.jsonH A D15-Jan-20131.1 KiB4948

META.ymlH A D15-Jan-2013640 2726

Makefile.PLH A D05-Mar-2012888 2722

READMEH A D15-Jan-20139.5 KiB271200

README

1######################################################################
2    Algorithm::Bucketizer 0.13
3######################################################################
4
5NAME
6    Algorithm::Bucketizer - Distribute sized items to buckets with limited
7    size
8
9SYNOPSIS
10      use Algorithm::Bucketizer;
11
12          # Create a bucketizer
13      my $bucketizer = Algorithm::Bucketizer->new(bucketsize => $size);
14
15          # Add items to it
16      $bucketizer->add_item($item, $size);
17
18          # Optimize distribution
19      $bucketizer->optimize(maxrounds => 100);
20
21          # When done adding, get the buckets
22          # (they're of type Algorithm::Bucketizer::Bucket)
23      my @buckets = $bucketizer->buckets();
24
25          # Access bucket content by using
26          # Algorithm::Bucketizer::Bucket methods
27      my @items  = $bucket->items();
28      my $serial = $bucket->serial();
29
30DESCRIPTION
31    So, you own a number of mp3-Songs on your hard disc and want to copy
32    them to a number of CDs, maxing out the space available on each of them?
33    You want to distribute your picture collection into several folders, so
34    each of them doesn't exceed a certain size? "Algorithm::Bucketizer"
35    comes to the rescue.
36
37    "Algorithm::Bucketizer" distributes items of a defined size into a
38    number of dynamically created buckets, each of them capable of holding
39    items of a defined total size.
40
41    By calling the "$bucketizer->add_item()" method with the item (can be a
42    scalar or an object reference) and its size as parameters, you're adding
43    items to the system. The bucketizer will determine if the item fits into
44    one of the existing buckets and put it in there if possible. If none of
45    the existing buckets has enough space left to hold the new item (or if
46    no buckets exist yet for that matter), the bucketizer will create a new
47    bucket and put the item in there.
48
49    After adding all items to the system, the bucketizer lets you iterate
50    over all buckets with the "$bucketizer->items()" method and determine
51    what's in each of them.
52
53  Algorithms
54    Currently, "Algorithm::Bucketizer" comes with two algorithms, "simple"
55    and "retry".
56
57    In "simple" mode, the algorithm will just try to fit in your items in
58    the order in which they're arriving. If an item fits into the current
59    bucket, it's being dropped in, if not, the algorithm moves on to the
60    next bucket. It never goes back to previous buckets, although a new item
61    might as well fit in there. This mode might be useful if preserving the
62    original order of items is required. To query/manipulate the bucket the
63    Bucketizer will try to fit in the next item, use
64    "current_bucket_index()" explained below.
65
66    In "retry" mode, the algorithm will try each existing bucket first,
67    before opening a new one. If you have many items of various sizes,
68    "retry" allows you to fit them into less buckets than in "simple" mode.
69
70    The "new()" method chooses the algorithm:
71
72        my $dumb = Algorithm::Bucketizer->new( algorithm => "simple" );
73
74        my $smart = Algorithm::Bucketizer->new( algorithm => "retry" );
75
76    In addition to these inserting algorithms, check "Optimize" to optimize
77    the distribution, minimizing the number of required buckets.
78
79  Prefilling Buckets
80    Sometimes you will have preexisting buckets, which you need to tell the
81    algorithm about before it starts adding new items. The
82    "prefill_bucket()" method does exactly that, simply putting an item into
83    a specified bucket:
84
85        $b->prefill_bucket($bucket_idx, $item, $itemsize);
86
87    $bucket_idx is the index of the bucket, starting from 0. Non-existing
88    buckets are automatically created for you. Make sure you have a
89    consecutive number of buckets at the end of the prefill.
90
91  Optimize
92    Once you've inserted all items, you might choose to optimize the
93    distribution over the buckets, in order to *minimize* the number of
94    required buckets to hold all the elements.
95
96    Optimally distributing a number discrete-sized items into a number of
97    discrete-sized buckets, however, is a non-trivial task. It's the
98    "bin-packing problem", related to the "knapsack problem", which are both
99    *NP-complete*.
100
101    "Algorithm::Bucketize" therefore provides different optimization
102    techniques to (stupidly) approximate an ideal solution, which can't be
103    obtained otherwise (yet).
104
105    Currently, it implements "random" and "brute_force".
106
107    "random" tries to randomly vary the distribution until a time or round
108    limit is reached.
109
110            # Try randomly to improve distribution,
111            # timing out after 100 rounds
112        $b->optimize(algorithm => "random", maxrounds => 100);
113
114            # Try randomly to improve distribution,
115            # timing out after 60 secs
116        $b->optimize(algorithm => "random", maxtime => 60);
117
118            # Try to improve distribution by brute_force trying
119            # all possible combinations (watch out: can take forever)
120        $b->optimize(algorithm => "brute_force",
121                     maxtime => ...,
122                     maxrounds => ...,
123                    );
124
125    I'm currently evaluating more sophisticated methods suggested by more
126    mathematically inclined people :).
127
128FUNCTIONS
129    *
130            my $b = Algorithm::Bucketizer->new(
131                bucketsize => $size,
132                algorithm  => $algorithm
133               );
134
135        Creates a new "Algorithm::Bucketizer" object and returns a reference
136        to it.
137
138        The "bucketsize" name-value pair is somewhat mandatory, because you
139        want to set the size of your buckets, otherwise they will default to
140        100, which isn't what you want in most cases.
141
142        "algorithm" can be left out, it defaults to "simple". If you want
143        retry behaviour, specify "retry" (see "Algorithms").
144
145        Another optional parameter, "add_buckets" specifies if the
146        bucketizer is allowed to add new buckets to the end of the brigade
147        as it sees fit. It defaults to 1. If set to 0, the bucketizer will
148        operate with a limited number of buckets, usually defined by
149        "add_bucket" calls.
150
151    *
152            $b->add_item($item_name, $item_size);
153
154        Adds an item with the specified name and size to the next available
155        bucket, according to the chosen algorithm. If you want to place an
156        item into a specific bucket (e.g. in order to prefill buckets), use
157        "prefill_bucket()" instead, which is described below.
158
159        Returns the Algorithm::Bucketizer::Bucket object of the lucky bucket
160        on sucess and "undef" if something goes badly wrong (e.g. the bucket
161        size is smaller than the item, i.e. there's no way it's ever going
162        to fit in *any* bucket).
163
164    *
165            my @buckets = $b->buckets();
166
167        Return a list of buckets. The list contains elements of type
168        "Algorithm::Bucketizer::Bucket", which understand the following
169        methods:
170
171        *
172                my @items = $bucket->items();
173
174            Returns a list of names of items in the bucket. Returns an empty
175            list if the bucket is empty.
176
177        *
178                my $level = $bucket->level();
179
180            Return how full the bucket is. That's the size of all items in
181            the bucket combined.
182
183        *
184                my $bucket_index = $bucket->idx();
185
186            Return the bucket's index. The first bucket has index 0.
187
188        *
189                my $serial_number = $bucket->serial();
190
191            Return the bucket serial number. That's the bucket index plus 1.
192
193    *
194            $b->add_bucket(
195                maxsize => $maxsize
196            );
197
198        Adds a new bucket to the end of the bucket brigade. This method is
199        useful for building brigades with buckets of various sizes.
200
201    *
202            $b->current_bucket_idx( $idx );
203
204        Set/retrieve the index of the bucket that the "simple" algorithm
205        will use first in order to try to insert the next item.
206
207    *
208            $b->optimize(
209                algorithm  => $algorithm,
210                maxtime    => $seconds,
211                maxrounds  => $number_of_rounds
212               );
213
214        Optimize bucket distribution. Currently "random" and "brute_force"
215        are implemented. Both can be ("random" *must* be) terminated by
216        either the maximum number of seconds ("maxtime") or iterations
217        ("maxrounds").
218
219EXAMPLE
220    We've got buckets which hold a weight of 100 each, and we've got 10
221    items weighing 30, 31, 32, ... 39. Distribute them into buckets.
222
223        use Algorithm::Bucketizer;
224
225        my $b = Algorithm::Bucketizer->new( bucketsize => 100 );
226        for my $i (1..10) {
227            $b->add_item($i, 30+$i);
228        }
229
230        for my $bucket ($b->buckets()) {
231            for my $item ($bucket->items()) {
232                print "Bucket ", $bucket->serial(), ": Item $item\n";
233            }
234            print "\n";
235        }
236
237    Output:
238
239        Bucket 1: Item 1
240        Bucket 1: Item 2
241        Bucket 1: Item 3
242
243        Bucket 2: Item 4
244        Bucket 2: Item 5
245
246        Bucket 3: Item 6
247        Bucket 3: Item 7
248
249        Bucket 4: Item 8
250        Bucket 4: Item 9
251
252        Bucket 5: Item 10
253
254REQUIRES
255    Algorithm::Permute 0.04 if you want to use the "brute_force" method.
256
257SCRIPTS
258    This distribution comes with a script *bucketize* which puts files into
259    directory buckets with limited size. Run "perldoc bucketize" for
260    details.
261
262AUTHOR
263    Mike Schilli, <m@perlmeister.com>
264
265COPYRIGHT AND LICENSE
266    Copyright 2002-2007 by Mike Schilli
267
268    This library is free software; you can redistribute it and/or modify it
269    under the same terms as Perl itself.
270
271