1David Anderson.
2January 2014.
3
4In 2013 I realized I could simplify some code in libdwarf
5(another project) by removing some complicated and messy
6special memory allocation code.  Substituting by use of
7malloc/free and a table of used values.  The used-values needed
8to a preserve a special feature of the original allocations
9related to a special handle: any allocations would be freed
10by closing the special handle.
11
12So I decided to implement various searches (mostly tree
13searches) using the tsearch interface definitions.
14I wrote these using a dwarf_ prefix to distinguish
15these from any libc implementation and to make the
16source fit in easily with libdwarf conventions.
17
18I added a tsearch() implementation using hashing too.
19The GNU-only hsearch() function  does not let the hash grow
20(the tsearch() hash here grows automatically as needed)
21and has no hdelete() or hwalk() capability, so hsearch() and
22friends did not seem like a good interface set even though
23the GNU-designed hsearch_r() interface set is nicely designed.
24
25The attached C code is a small set of implementations of
26C search code.  Based on algorithms in Knuth Volume 3
27(2nd Ed) and Sedgewick 4th Edition.
28
29All the code here is implemented by me. I have never read the
30source to any other implementations of the tsearch algorighms
31or interfaces so this is a clean-room implementation.
32
33The hash version weakens the correspondence to tree based
34tsearch because, well, it's not a tree.  So twalk() behaves
35differently than a tree-based version and an initialization.
36
37Non-GNU libc usually has no tdestroy().   The set of functions
38here provides a tdestroy() for all the tree/hash function
39sets here.
40
41
42==================
43WHY TSEARCH?
44
45The interfaces are of a well-known design.  While
46the interfaces are based on an obsolete style of writing
47interfaces, they are a known set.  So the routines provided
48here are a direct substitution.
49
50The tdestroy() function is available in GNU libc, but is
51not part of the standard tsearch() set, nor is tdestroy()
52defined in the Single Unix Specification.
53
54The hash version of tdelete() cannot always return a non-null
55value even if there are records left.  Not being a tree at
56all, it would cost runtime to provide a non-null return in
57some cases even when there are records left from tdelete().
58The return value from tdelete() is problematic in all versions,
59since it purports to return the parent of the deleted node,
60but what if the deleted node was the root?
61
62==================
63LICENSING:
64
65I wanted this to be usable in any context, hence the use of
66the BSD open-source license.
67
68==================
69INTERFACE ODDITIES:
70
71It's not really easy to understand whether any given
72call is returning
73	success, action succeeded
74        tsearch success - added record
75        tsearch success - record already in tree
76        fail due to internal error or improper argument.
77        requested action impossible
78            (tfind() fails to find a record, or tdelete() fails
79            to find the record to delete)
80
81
82Speaking of C here, so, there is no try/catch available.
83
84It might have been nice if twalk() let the user-provided
85action code indicate to twalk() that the user wanted it to stop
86walking the tree.
87
88I won't be changing the interface though.
89I am staying with the standard interfaces.
90
91==================
92DOCUMENTATION ODDITIES:
93
94The GNU/Linux man pages on tsearch() and friends are nearly
95useless as you won't understand how to use the functions from
96reading that man page.
97
98The prototype for tfind() in the man page is slightly wrong
99while the headers in <search.h> are correct.
100
101==================
102IMPLEMENTATION ODDITIES:
103
104The code uses names like dwarf_tsearch() instead of just
105tsearch() so one can have both this tsearch() and GNU or
106a standard tsearch code available simultaneously with no
107interference at runtime.   The code here operates like GNU or
108UNIX tsearch() but is internally incompatible.  We'll usually
109refer to tsearch() not dwarf_tsearch() (etc) but we mean either
110and both unless otherwise specified here.
111
112The use of tsearch() and friends is a little tricky.
113That is just the way it is.  The best reference is the code
114in tsearch_tester.c.
115
116See the trivial set of #defines in tsearch_tester.c that
117result in a standard-based naming of the functions.
118
119The return value from tdelete() is particularly problematic
120because it purports to return a pointer to another record
121than the one deleted, yet no such record necessarily exists.
122And in the case of
123
124The hash version requires use of the new function
125dwarf_initialize_search_hash() to provide a hashing
126function. And the hash version does not set the root pointer
127to NULL on tdelete() of the last record, since that would
128result in losing the hashing function pointer.  Use tdestroy()
129to free up any remaining space.
130
131dwarf_tdump() is an invention and unique to this code.  It is
132for debugging. It prints a representation of the data in the
133tree or hash to stdout.
134
135The #include .h file set used here makes it much easier
136for me to move these files to another project as necessary.
137You will probably want to revise the #include set a little.
138
139
140==================
141OTHER DIRECTORIES:
142
143testdata contains a set of sample allocations, where
144lines starting with 'a' mean add and 'd' means delete.
145These are used for testing the library code.
146
147scripts contains some python scripts for converting
148results produced by printf added to libdwarf alloca()
149code.  These were used massage the print output
150from running dwarfdump on real object files
151into the testcase/* format.
152It is unlikely you will find them much use
153except as starting points.
154
155
156==================
157REFERENCES:
158
159Donald E Knuth, The Art of Computer Programming Volume 3, Second Edition.
160(Quite difficult to interpret some parts of Knuth's
161algorithms, but then I always found Knuth hard to
162understand. Still, Knuth is the essential source for
163algorithms.)
164
165The Open Group, The Single UNIX Specification, Version 3(2001).
166The Single Unix Specification (or whatever it is called now)
167is a reference source everyone should have.
168
169Robert Sedgewick and Kevin Wayne, Algorithms (4th Ed).
170This is the crucial reference on red-black trees.
171There is a crucial fix in the errata of the 3rd printing.
172In addition, in dwarf_tsearchred.c in two places I had to fix things,
173look for 'Mistake' in the source.
174
175
176http://www.prevanders.net/tsearch.html
177
178
179