1/* tag: Tom Lord Tue Dec  4 14:57:15 2001 (bitsets-data-sheet.doc)
2 */
3/************************************************************************
4 *(h0 "Data Sheet for the Hackerlab Bitsets Functions"
5 *    :appendix)
6 *
7 insert*/
8
9Package:
10        Hackerlab Bitsets (bitset data structures)
11
12Supplier:
13	regexps.com
14
15Function:
16
17	Hackerlab Bitsets provides functions for manipulating ordinary
18	bitsets, and bitset trees (very large, sparsely populated
19	bitsets).
20
21
22Key Features:
23
24	Fast and accurate.
25
26	Clean and simple "classic C" interface.
27
28	Extensive test suite included.
29
30	Postscript and HTML documentation included.
31
32	A complete set of bitset functions.  These operations are
33	provided for both ordinary bitsets and bitset trees:
34
35	    alloc
36	    free
37	    dup
38	    is_member
39	    is_equal
40	    is_subset
41	    is_empty
42	    is_empty_range
43	    is_full
44	    is_full_range
45	    adjoin
46	    remove
47	    toggle
48	    clear
49	    clear_range
50	    fill
51	    fill_range
52	    complement
53	    assign
54	    union
55	    intersection
56	    difference
57	    revdifference
58	    xor
59	    population
60	    population_range
61	    ffs			("find first set")
62	    ffs_range
63	    ffc			("find first clear")
64	    ffc_range
65
66
67
68Licensing:
69
70	Hackerlab Bitsets is part of the Hackerlab C Library which is
71	distributed under the terms of the GNU General Public License,
72	Version 2, as published by the Free Software Foundation.
73
74	Alternative licenses (such as licenses permitting binary-only
75	re-distribution) can be purchased from regexps.com.
76
77
78Prerequisites:
79
80        standard C compiler
81        Posix libc.a (standard C library)
82        GNU Make
83
84
85Recommended and Disrecommended Applications:
86
87	Recommended for any application needing a fast and complete
88	implementation of the "standard" bitset operations.
89
90	Sparse bitset support is ideal for representing sets
91	of Unicode code points.
92
93
94Limitations:
95
96	This is the first public (non-prototype) release.
97
98
99Contribution to Executable Size:
100
101	Ordinary Bitsets
102	----------------
103
104        On a Pentium based machine, running gcc (egcs version
105        2.91.66) we compiled this simple (nonsense) program and
106	linked it against the Hackerlab C Library:
107
108
109	    int
110	    main (int argc, char * argv[])
111	    {
112	      bitset_alloc (); bitset_free (); bitset_dup ();
113	      bitset_is_member (); bitset_is_equal ();
114	      bitset_is_subset (); bitset_is_empty ();
115	      bitset_is_empty_range (); bitset_is_full ();
116	      bitset_is_full_range (); bitset_adjoin ();
117	      bitset_remove (); bitset_toggle (); bitset_clear ();
118	      bitset_clear_range (); bitset_fill ();
119	      bitset_fill_range (); bitset_complement ();
120	      bitset_assign (); bitset_union ();
121	      bitset_intersection (); bitset_difference ();
122	      bitset_revdifference (); bitset_xor ();
123	      bitset_population (); bitset_population_range ();
124	      bitset_ffs (); bitset_ffs_range ();
125	      bitset_ffc (); bitset_ffc_range ();
126	    }
127
128        Both the library and the program were compiled with the "-g"
129        option.
130
131        Total executable size:
132
133	   text	   data
134	  16681	   4864
135
136        The following list of object files from the Hackerlab
137	C library were linked into the executable:
138
139	    bitset.o alloc-limits.o mem.o must-malloc.o panic.o
140	    str.o cvt.o panic-exit.o char-class.o
141
142        The contribution of those files to the executable size is:
143
144           text    data
145	  15405    4672
146
147
148
149	Bitset Trees (Large Sparsely Populated Bitsets)
150	-----------------------------------------------
151
152	For these, we used the (nonsense) program:
153
154	    int
155	    main (int argc, char * argv[])
156	    {
157	      bits_alloc (); bits_free (); bits_dup ();
158	      bits_is_member (); bits_is_equal ();
159	      bits_is_subset (); bits_is_empty ();
160	      bits_is_empty_range (); bits_is_full ();
161	      bits_is_full_range (); bits_adjoin ();
162	      bits_remove (); bits_toggle (); bits_clear ();
163	      bits_clear_range (); bits_fill ();
164	      bits_fill_range (); bits_complement ();
165	      bits_assign (); bits_union ();
166	      bits_intersection (); bits_difference ();
167	      bits_revdifference (); bits_xor ();
168	      bits_population (); bits_population_range ();
169	      bits_ffs (); bits_ffs_range ();
170	      bits_ffc (); bits_ffc_range ();
171	    }
172
173        Total executable size:
174
175	   text	   data
176	  39097    4864
177
178        The following list of object files from the Hackerlab
179	C library were linked into the executable:
180
181	    bits.o bitset-tree.o bitset.o alloc-limits.o mem.o
182	    must-malloc.o panic.o str.o cvt.o panic-exit.o
183	    char-class.o
184
185        The contribution of those files to the executable size is:
186
187           text    data
188	  37814	   4672
189
190
191External Linkage Dependencies:
192
193
194	When compiled under FreeBSD 3.0, the simple programs used to
195        test executable sizes depends on the following symbols
196        defined in "libc":
197
198	    _exit
199	    free
200	    malloc
201	    realloc
202	    write
203
204
205Accuracy:
206
207	The Hackerlab Bitsets library passes all of the tests
208	described here:
209
210	To test the ordinary bitset functions, we wrote a Scheme
211	program to automatically generate test cases.  The tests
212	generated exercise all of the bitset functions, testing a
213	variety of bitset sizes and boundary conditions.  The Scheme
214	program uses a list-based set library to compute the expected
215	results of these tests and generates an answer key.  Over
216	32,000 tests are generated.
217
218	To test bitset trees, we wrote a program that performs a
219	randomly selected sequence of bitset operations on both an
220	ordinary bitset and on an equivalent bitset tree.  Parameters
221	are randomly selected to exercise a variety of special cases
222	such as boundary conditions.  All bitset operations are
223	tested.  The ordinary and bitset tree results of each
224	operation are compared for equality.
225
226	The totality of Hackerlab C Library tests, including the tests
227	described here and tests of other subsystems that use the
228	bitset library, achieve 98.9% test coverage (measured in lines
229	of source executed) for the ordinary bitset functions, 86.8%
230	coverage for bitset tree functions.  The code not tested is
231	primarily code concerned with cleanly returning error codes in
232	the (presumably unlikely) case of memory allocation failure.
233
234
235Execution Speed:
236
237	There are no industry-wide standards for measuring the
238	performance of bitset operations and we know of similar
239	library against which to make comparative measurements.
240
241	The implementation is conceptually straightforward, and
242	its performance characteristics can be accurately inferred
243	from a description of the implementation.
244
245	The ordinary bitset functions perform straightforward bit
246	manipulations using C operators.  Bitsets are represented as
247	an array of subsets, each subset containing (up to) the number
248	of bits in a machine word.  Most functions which operate on a
249	sequence of contiguous bits contain a loop which operates on
250	an entire word at a time, with special cases for the
251	boundaries of the loop.
252
253	Bitset trees are recursively represented as a tree of bitset
254	trees, with ordinary bitsets as leaf nodes.  Subtrees
255	which are all 0 or all 1 are (in most circumstances) replaced
256	by a NULL or -1 pointer (which speeds up operations on those
257	subtrees).
258
259
260Run-time Allocation Requirements:
261
262	Ordinary bitsets are stored as a contiguous array of bits.
263
264	Bitset trees afford programmers to make explicit space/time
265	trade-offs by determining the depth and branching structure
266	of the trees.
267
268
269Support:
270
271	To purchase an alternative license, request additional
272	features, or for any kind of support assistance, you can
273	contact us at "hackerlab@regexps.com" or via our web site
274	"www.regexps.com".  We can also be reached at (412) 401-5204
275	or at:
276
277		regexps.com
278		1717 Murray Ave. P.M.B. #10
279		Pittsburgh, PA 15217
280
281        Available support includes (but is not limited to):
282
283                - porting assistance
284                - customized extensions
285                - consultation concerning regexp-intensive applications
286                - help with undocumented interfaces
287                - bug fixing
288
289        In most cases, support is offered as a commercial
290        service.
291
292
293/*end-insert
294 */
295
296
297