1######################################################################
2 Algorithm::Bucketizer 0.13
3######################################################################
4
5NAME
6 Algorithm::Bucketizer - Distribute sized items to buckets with limited
7 size
8
9SYNOPSIS
10 use Algorithm::Bucketizer;
11
12 # Create a bucketizer
13 my $bucketizer = Algorithm::Bucketizer->new(bucketsize => $size);
14
15 # Add items to it
16 $bucketizer->add_item($item, $size);
17
18 # Optimize distribution
19 $bucketizer->optimize(maxrounds => 100);
20
21 # When done adding, get the buckets
22 # (they're of type Algorithm::Bucketizer::Bucket)
23 my @buckets = $bucketizer->buckets();
24
25 # Access bucket content by using
26 # Algorithm::Bucketizer::Bucket methods
27 my @items = $bucket->items();
28 my $serial = $bucket->serial();
29
30DESCRIPTION
31 So, you own a number of mp3-Songs on your hard disc and want to copy
32 them to a number of CDs, maxing out the space available on each of them?
33 You want to distribute your picture collection into several folders, so
34 each of them doesn't exceed a certain size? "Algorithm::Bucketizer"
35 comes to the rescue.
36
37 "Algorithm::Bucketizer" distributes items of a defined size into a
38 number of dynamically created buckets, each of them capable of holding
39 items of a defined total size.
40
41 By calling the "$bucketizer->add_item()" method with the item (can be a
42 scalar or an object reference) and its size as parameters, you're adding
43 items to the system. The bucketizer will determine if the item fits into
44 one of the existing buckets and put it in there if possible. If none of
45 the existing buckets has enough space left to hold the new item (or if
46 no buckets exist yet for that matter), the bucketizer will create a new
47 bucket and put the item in there.
48
49 After adding all items to the system, the bucketizer lets you iterate
50 over all buckets with the "$bucketizer->items()" method and determine
51 what's in each of them.
52
53 Algorithms
54 Currently, "Algorithm::Bucketizer" comes with two algorithms, "simple"
55 and "retry".
56
57 In "simple" mode, the algorithm will just try to fit in your items in
58 the order in which they're arriving. If an item fits into the current
59 bucket, it's being dropped in, if not, the algorithm moves on to the
60 next bucket. It never goes back to previous buckets, although a new item
61 might as well fit in there. This mode might be useful if preserving the
62 original order of items is required. To query/manipulate the bucket the
63 Bucketizer will try to fit in the next item, use
64 "current_bucket_index()" explained below.
65
66 In "retry" mode, the algorithm will try each existing bucket first,
67 before opening a new one. If you have many items of various sizes,
68 "retry" allows you to fit them into less buckets than in "simple" mode.
69
70 The "new()" method chooses the algorithm:
71
72 my $dumb = Algorithm::Bucketizer->new( algorithm => "simple" );
73
74 my $smart = Algorithm::Bucketizer->new( algorithm => "retry" );
75
76 In addition to these inserting algorithms, check "Optimize" to optimize
77 the distribution, minimizing the number of required buckets.
78
79 Prefilling Buckets
80 Sometimes you will have preexisting buckets, which you need to tell the
81 algorithm about before it starts adding new items. The
82 "prefill_bucket()" method does exactly that, simply putting an item into
83 a specified bucket:
84
85 $b->prefill_bucket($bucket_idx, $item, $itemsize);
86
87 $bucket_idx is the index of the bucket, starting from 0. Non-existing
88 buckets are automatically created for you. Make sure you have a
89 consecutive number of buckets at the end of the prefill.
90
91 Optimize
92 Once you've inserted all items, you might choose to optimize the
93 distribution over the buckets, in order to *minimize* the number of
94 required buckets to hold all the elements.
95
96 Optimally distributing a number discrete-sized items into a number of
97 discrete-sized buckets, however, is a non-trivial task. It's the
98 "bin-packing problem", related to the "knapsack problem", which are both
99 *NP-complete*.
100
101 "Algorithm::Bucketize" therefore provides different optimization
102 techniques to (stupidly) approximate an ideal solution, which can't be
103 obtained otherwise (yet).
104
105 Currently, it implements "random" and "brute_force".
106
107 "random" tries to randomly vary the distribution until a time or round
108 limit is reached.
109
110 # Try randomly to improve distribution,
111 # timing out after 100 rounds
112 $b->optimize(algorithm => "random", maxrounds => 100);
113
114 # Try randomly to improve distribution,
115 # timing out after 60 secs
116 $b->optimize(algorithm => "random", maxtime => 60);
117
118 # Try to improve distribution by brute_force trying
119 # all possible combinations (watch out: can take forever)
120 $b->optimize(algorithm => "brute_force",
121 maxtime => ...,
122 maxrounds => ...,
123 );
124
125 I'm currently evaluating more sophisticated methods suggested by more
126 mathematically inclined people :).
127
128FUNCTIONS
129 *
130 my $b = Algorithm::Bucketizer->new(
131 bucketsize => $size,
132 algorithm => $algorithm
133 );
134
135 Creates a new "Algorithm::Bucketizer" object and returns a reference
136 to it.
137
138 The "bucketsize" name-value pair is somewhat mandatory, because you
139 want to set the size of your buckets, otherwise they will default to
140 100, which isn't what you want in most cases.
141
142 "algorithm" can be left out, it defaults to "simple". If you want
143 retry behaviour, specify "retry" (see "Algorithms").
144
145 Another optional parameter, "add_buckets" specifies if the
146 bucketizer is allowed to add new buckets to the end of the brigade
147 as it sees fit. It defaults to 1. If set to 0, the bucketizer will
148 operate with a limited number of buckets, usually defined by
149 "add_bucket" calls.
150
151 *
152 $b->add_item($item_name, $item_size);
153
154 Adds an item with the specified name and size to the next available
155 bucket, according to the chosen algorithm. If you want to place an
156 item into a specific bucket (e.g. in order to prefill buckets), use
157 "prefill_bucket()" instead, which is described below.
158
159 Returns the Algorithm::Bucketizer::Bucket object of the lucky bucket
160 on sucess and "undef" if something goes badly wrong (e.g. the bucket
161 size is smaller than the item, i.e. there's no way it's ever going
162 to fit in *any* bucket).
163
164 *
165 my @buckets = $b->buckets();
166
167 Return a list of buckets. The list contains elements of type
168 "Algorithm::Bucketizer::Bucket", which understand the following
169 methods:
170
171 *
172 my @items = $bucket->items();
173
174 Returns a list of names of items in the bucket. Returns an empty
175 list if the bucket is empty.
176
177 *
178 my $level = $bucket->level();
179
180 Return how full the bucket is. That's the size of all items in
181 the bucket combined.
182
183 *
184 my $bucket_index = $bucket->idx();
185
186 Return the bucket's index. The first bucket has index 0.
187
188 *
189 my $serial_number = $bucket->serial();
190
191 Return the bucket serial number. That's the bucket index plus 1.
192
193 *
194 $b->add_bucket(
195 maxsize => $maxsize
196 );
197
198 Adds a new bucket to the end of the bucket brigade. This method is
199 useful for building brigades with buckets of various sizes.
200
201 *
202 $b->current_bucket_idx( $idx );
203
204 Set/retrieve the index of the bucket that the "simple" algorithm
205 will use first in order to try to insert the next item.
206
207 *
208 $b->optimize(
209 algorithm => $algorithm,
210 maxtime => $seconds,
211 maxrounds => $number_of_rounds
212 );
213
214 Optimize bucket distribution. Currently "random" and "brute_force"
215 are implemented. Both can be ("random" *must* be) terminated by
216 either the maximum number of seconds ("maxtime") or iterations
217 ("maxrounds").
218
219EXAMPLE
220 We've got buckets which hold a weight of 100 each, and we've got 10
221 items weighing 30, 31, 32, ... 39. Distribute them into buckets.
222
223 use Algorithm::Bucketizer;
224
225 my $b = Algorithm::Bucketizer->new( bucketsize => 100 );
226 for my $i (1..10) {
227 $b->add_item($i, 30+$i);
228 }
229
230 for my $bucket ($b->buckets()) {
231 for my $item ($bucket->items()) {
232 print "Bucket ", $bucket->serial(), ": Item $item\n";
233 }
234 print "\n";
235 }
236
237 Output:
238
239 Bucket 1: Item 1
240 Bucket 1: Item 2
241 Bucket 1: Item 3
242
243 Bucket 2: Item 4
244 Bucket 2: Item 5
245
246 Bucket 3: Item 6
247 Bucket 3: Item 7
248
249 Bucket 4: Item 8
250 Bucket 4: Item 9
251
252 Bucket 5: Item 10
253
254REQUIRES
255 Algorithm::Permute 0.04 if you want to use the "brute_force" method.
256
257SCRIPTS
258 This distribution comes with a script *bucketize* which puts files into
259 directory buckets with limited size. Run "perldoc bucketize" for
260 details.
261
262AUTHOR
263 Mike Schilli, <m@perlmeister.com>
264
265COPYRIGHT AND LICENSE
266 Copyright 2002-2007 by Mike Schilli
267
268 This library is free software; you can redistribute it and/or modify it
269 under the same terms as Perl itself.
270
271