1Smart Memory Allocation
2=======================
3
4Few things are as embarrassing as a program that leaks, yet few errors
5are so easy to commit or as difficult to track down in a large,
6complicated program as failure to release allocated memory. SMARTALLOC
7replaces the standard C library memory allocation functions with
8versions which keep track of buffer allocations and releases and report
9all orphaned buffers at the end of program execution. By including this
10package in your program during development and testing, you can identify
11code that loses buffers right when it’s added and most easily fixed,
12rather than as part of a crisis debugging push when the problem is
13identified much later in the testing cycle (or even worse, when the code
14is in the hands of a customer). When program testing is complete, simply
15recompiling with different flags removes SMARTALLOC from your program,
16permitting it to run without speed or storage penalties.
17
18In addition to detecting orphaned buffers, SMARTALLOC also helps to find
19other common problems in management of dynamic storage including storing
20before the start or beyond the end of an allocated buffer, referencing
21data through a pointer to a previously released buffer, attempting to
22release a buffer twice or releasing storage not obtained from the
23allocator, and assuming the initial contents of storage allocated by
24functions that do not guarantee a known value. SMARTALLOC’s checking
25does not usually add a large amount of overhead to a program (except for
26programs which use realloc() extensively; see below). SMARTALLOC focuses
27on proper storage management rather than internal consistency of the
28heap as checked by the malloc_debug facility available on some systems.
29SMARTALLOC does not conflict with malloc_debug and both may be used
30together, if you wish. SMARTALLOC makes no assumptions regarding the
31internal structure of the heap and thus should be compatible with any C
32language implementation of the standard memory allocation functions.
33
34Installing SMARTALLOC
35~~~~~~~~~~~~~~~~~~~~~
36
SMARTALLOC is provided as a Zipped archive; see the download
instructions below.
39
To install SMARTALLOC in your program, simply add the statement:

        #include "smartall.h"

to every C program file which calls any of the memory allocation
functions (malloc, calloc, free, etc.). SMARTALLOC must be used for all
memory allocation within a program, so the best place for this statement
is the global include file for your entire program, if you have such a
thing. Next, define the symbol SMARTALLOC in the compilation before the
inclusion of smartall.h. I usually do this by having my Makefile add the
“-DSMARTALLOC” option to the C compiler for non-production builds. You
can define the symbol manually, if you prefer, by adding the statement:

        #define SMARTALLOC
52
53At the point where your program is all done and ready to relinquish
54control to the operating system, add the call:
55
        sm_dump(\ *datadump*\ );
57
where *datadump* specifies whether the contents of orphaned buffers are
to be dumped in addition to printing their size and place of allocation.
The data are dumped only if *datadump* is nonzero, so most programs will
normally use “sm_dump(0);”. If a mysterious orphaned buffer appears that
can’t be identified from the information this prints about it, replace
the statement with “sm_dump(1);”. Usually the dump of the buffer’s data
will furnish the additional clues you need to excavate and extirpate the
elusive error that left the buffer allocated.
66
67Finally, add the files “smartall.h” and “smartall.c” from this release
68to your source directory, make dependencies, and linker input. You
69needn’t make inclusion of smartall.c in your link optional; if compiled
70with SMARTALLOC not defined it generates no code, so you may always
71include it knowing it will waste no storage in production builds. Now
72when you run your program, if it leaves any buffers around when it’s
73done, each will be reported by sm_dump() on stderr as follows:
74
::

    Orphaned buffer:     120 bytes allocated at line 50 of gutshot.c
78
79Squelching a SMARTALLOC
80~~~~~~~~~~~~~~~~~~~~~~~
81
82Usually, when you first install SMARTALLOC in an existing program you’ll
83find it nattering about lots of orphaned buffers. Some of these turn out
84to be legitimate errors, but some are storage allocated during program
85initialisation that, while dynamically allocated, is logically static
86storage not intended to be released. Of course, you can get rid of the
87complaints about these buffers by adding code to release them, but by
88doing so you’re adding unnecessary complexity and code size to your
89program just to silence the nattering of a SMARTALLOC, so an escape
90hatch is provided to eliminate the need to release these buffers.
91
92Normally all storage allocated with the functions malloc(), calloc(),
93and realloc() is monitored by SMARTALLOC. If you make the function call:
94
::

            sm_static(1);
98
99you declare that subsequent storage allocated by malloc(), calloc(), and
100realloc() should not be considered orphaned if found to be allocated
101when sm_dump() is called. I use a call on “sm_static(1);” before I
102allocate things like program configuration tables so I don’t have to add
103code to release them at end of program time. After allocating
104unmonitored data this way, be sure to add a call to:
105
::

            sm_static(0);
109
110to resume normal monitoring of buffer allocations. Buffers allocated
111while sm_static(1) is in effect are not checked for having been orphaned
112but all the other safeguards provided by SMARTALLOC remain in effect.
113You may release such buffers, if you like; but you don’t have to.
114
115Living with Libraries
116~~~~~~~~~~~~~~~~~~~~~
117
118Some library functions for which source code is unavailable may
119gratuitously allocate and return buffers that contain their results, or
120require you to pass them buffers which they subsequently release. If you
121have source code for the library, by far the best approach is to simply
122install SMARTALLOC in it, particularly since this kind of ill-structured
123dynamic storage management is the source of so many storage leaks.
124Without source code, however, there’s no option but to provide a way to
125bypass SMARTALLOC for the buffers the library allocates and/or releases
126with the standard system functions.
127
128For each function *xxx* redefined by SMARTALLOC, a corresponding routine
129named “actually\ *xxx*” is furnished which provides direct access to the
130underlying system function, as follows:
131
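=========================  =================================
Standard function          Direct-access routine
=========================  =================================
malloc(*size*)             actuallymalloc(*size*)
calloc(*nelem*, *elsize*)  actuallycalloc(*nelem*, *elsize*)
realloc(*ptr*, *size*)     actuallyrealloc(*ptr*, *size*)
free(*ptr*)                actuallyfree(*ptr*)
=========================  =================================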
138
139For example, suppose there exists a system library function named
140“getimage()” which reads a raster image file and returns the address of
141a buffer containing it. Since the library routine allocates the image
142directly with malloc(), you can’t use SMARTALLOC’s free(), as that call
143expects information placed in the buffer by SMARTALLOC’s special version
144of malloc(), and hence would report an error. To release the buffer you
145should call actuallyfree(), as in this code fragment:
146
::

            struct image *ibuf = getimage("ratpack.img");
            display_on_screen(ibuf);
            actuallyfree(ibuf);
152
153Conversely, suppose we are to call a library function, “putimage()”,
154which writes an image buffer into a file and then releases the buffer
155with free(). Since the system free() is being called, we can’t pass a
156buffer allocated by SMARTALLOC’s allocation routines, as it contains
157special information that the system free() doesn’t expect to be there.
158The following code uses actuallymalloc() to obtain the buffer passed to
159such a routine.
160
::

            struct image *obuf =
               (struct image *) actuallymalloc(sizeof(struct image));
            dump_screen_to_image(obuf);
            putimage("scrdump.img", obuf);  /* putimage() releases obuf */
167
168It’s unlikely you’ll need any of the “actually” calls except under very
169odd circumstances (in four products and three years, I’ve only needed
170them once), but they’re there for the rare occasions that demand them.
171Don’t use them to subvert the error checking of SMARTALLOC; if you want
172to disable orphaned buffer detection, use the sm_static(1) mechanism
173described above. That way you don’t forfeit all the other advantages of
174SMARTALLOC as you do when using actuallymalloc() and actuallyfree().
175
176SMARTALLOC Details
177~~~~~~~~~~~~~~~~~~
178
179When you include “smartall.h” and define SMARTALLOC, the following
180standard system library functions are redefined with the #define
181mechanism to call corresponding functions within smartall.c instead.
182(For details of the redefinitions, please refer to smartall.h.)
183
::

            void *malloc(size_t size)
            void *calloc(size_t nelem, size_t elsize)
            void *realloc(void *ptr, size_t size)
            void free(void *ptr)
            void cfree(void *ptr)
191
192cfree() is a historical artifact identical to free().
193
194In addition to allocating storage in the same way as the standard
195library functions, the SMARTALLOC versions expand the buffers they
196allocate to include information that identifies where each buffer was
197allocated and to chain all allocated buffers together. When a buffer is
198released, it is removed from the allocated buffer chain. A call on
199sm_dump() is able, by scanning the chain of allocated buffers, to find
200all orphaned buffers. Buffers allocated while sm_static(1) is in effect
201are specially flagged so that, despite appearing on the allocated buffer
202chain, sm_dump() will not deem them orphans.
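The bookkeeping just described can be sketched in a few dozen lines. This is a minimal illustration and not the code in smartall.c: the names track_head, track_alloc(), track_free(), track_static(), track_dump() and TRACK_ALLOC are hypothetical, and returning the orphan count from the dump routine is an assumption made so the result is easy to check.

```c
#include <assert.h>
#include <stddef.h>
#include <stdio.h>
#include <stdlib.h>

/* Hypothetical header stored in front of each user buffer. */
typedef union track_head {
    struct {
        union track_head *next, *prev; /* links in the allocated-buffer chain */
        const char *file;              /* caller's __FILE__ (pointer, not a copy) */
        int line;                      /* caller's __LINE__ */
        size_t size;                   /* bytes the caller asked for */
        int is_static;                 /* nonzero: exempt from orphan reports */
    } h;
    max_align_t align;                 /* keeps the user area suitably aligned */
} track_head;

/* Empty circular chain: the anchor points at itself. */
static track_head chain = { { &chain, &chain, NULL, 0, 0, 0 } };
static int static_mode = 0;

/* Stand-in for sm_static(): flag subsequent allocations as static. */
void track_static(int on) { static_mode = on; }

void *track_alloc(size_t size, const char *file, int line)
{
    track_head *b;

    assert(size > 0);                  /* zero-length requests are rejected */
    b = malloc(sizeof *b + size);
    if (b == NULL)
        return NULL;
    b->h.file = file;
    b->h.line = line;
    b->h.size = size;
    b->h.is_static = static_mode;
    b->h.next = &chain;                /* append at the tail of the chain */
    b->h.prev = chain.h.prev;
    chain.h.prev->h.next = b;
    chain.h.prev = b;
    return b + 1;                      /* user area starts after the header */
}

void track_free(void *p)
{
    track_head *b = (track_head *) p - 1;

    /* Damaged links suggest a store before the start of the buffer. */
    assert(b->h.next->h.prev == b && b->h.prev->h.next == b);
    b->h.prev->h.next = b->h.next;     /* unlink from the chain */
    b->h.next->h.prev = b->h.prev;
    free(b);
}

/* Stand-in for sm_dump(): report and count buffers still on the chain. */
int track_dump(FILE *out)
{
    int orphans = 0;

    for (track_head *b = chain.h.next; b != &chain; b = b->h.next) {
        if (b->h.is_static)
            continue;                  /* flagged static: not an orphan */
        fprintf(out, "Orphaned buffer: %7lu bytes allocated at line %d of %s\n",
                (unsigned long) b->h.size, b->h.line, b->h.file);
        orphans++;
    }
    return orphans;
}

#define TRACK_ALLOC(n) track_alloc((n), __FILE__, __LINE__)
```

A caller would allocate through TRACK_ALLOC(), release with track_free(), and call track_dump(stderr) just before exit. Buffers allocated between track_static(1) and track_static(0) stay on the chain but are skipped by the report.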
203
204When a buffer is allocated by malloc() or expanded with realloc(), all
205bytes of newly allocated storage are set to the hexadecimal value 0x55
206(alternating one and zero bits). Note that for realloc() this applies
207only to the bytes added at the end of buffer; the original contents of
208the buffer are not modified. Initializing allocated storage to a
209distinctive nonzero pattern is intended to catch code that erroneously
210assumes newly allocated buffers are cleared to zero; in fact their
211contents are random. The calloc() function, defined as returning a
212buffer cleared to zero, continues to zero its buffers under SMARTALLOC.
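As a rough illustration of the fill pattern, the sketch below wraps the system malloc() so that fresh storage holds 0x55 in every byte. The name patterned_malloc() is hypothetical and the wrapper omits all of SMARTALLOC's other bookkeeping.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

#define FILL_NEW 0x55   /* alternating one and zero bits */

/* Hypothetical wrapper: allocate, then fill with a distinctive nonzero
   pattern so code that assumes zeroed storage misbehaves visibly. */
void *patterned_malloc(size_t size)
{
    void *p;

    assert(size > 0);
    p = malloc(size);
    if (p != NULL)
        memset(p, FILL_NEW, size);
    return p;
}
```

A pointer or counter read from such a buffer before it is initialised comes back as 0x5555... rather than zero, which tends to fail quickly instead of appearing to work.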
213
214Buffers obtained with the SMARTALLOC functions contain a special
215sentinel byte at the end of the user data area. This byte is set to a
216special key value based upon the buffer’s memory address. When the
217buffer is released, the key is tested and if it has been overwritten an
218assertion in the free function will fail. This catches incorrect program
219code that stores beyond the storage allocated for the buffer. At free()
220time the queue links are also validated and an assertion failure will
221occur if the program has destroyed them by storing before the start of
222the allocated storage.
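The sentinel check can be sketched as follows. The key formula (low byte of the address XORed with 0xC5) and the names guard_key(), guarded_alloc(), guard_intact() and guarded_free() are assumptions made for illustration. The caller passes the size explicitly here, whereas a real implementation would keep it in its private header.

```c
#include <assert.h>
#include <stdint.h>
#include <stdlib.h>

/* Key derived from the buffer's own address (formula is an assumption). */
static unsigned char guard_key(const unsigned char *buf)
{
    return (unsigned char) (((uintptr_t) buf & 0xFF) ^ 0xC5);
}

/* Allocate one extra byte and plant the sentinel just past the user data. */
unsigned char *guarded_alloc(size_t size)
{
    unsigned char *buf;

    assert(size > 0);
    buf = malloc(size + 1);
    if (buf != NULL)
        buf[size] = guard_key(buf);
    return buf;
}

/* Nonzero while the sentinel still holds its key value. */
int guard_intact(const unsigned char *buf, size_t size)
{
    return buf[size] == guard_key(buf);
}

/* Release the buffer, failing an assertion if the sentinel was overwritten. */
void guarded_free(unsigned char *buf, size_t size)
{
    assert(guard_intact(buf, size));
    free(buf);
}
```

Because the key depends on the address, a stray copy of another buffer's sentinel is unlikely to pass the test. A single sentinel byte only catches stores that touch the first byte past the end.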
223
In addition, when a buffer is released with free(), its contents are
immediately destroyed by overwriting them with the hexadecimal pattern
0xAA (alternating bits, the one’s complement of the initial value
pattern). This will usually trip up code that keeps a pointer to a
buffer that’s been freed and later attempts to reference data within the
released buffer. Incredibly, this is *legal* in the standard Unix memory
allocation package, which permits programs to free() buffers, then raise
them from the grave with realloc(). Such program “logic” should be
fixed, not accommodated, and SMARTALLOC brooks no such “Lazarus buffer”
nonsense.
234
235Some C libraries allow a zero size argument in calls to malloc(). Since
236this is far more likely to indicate a program error than a defensible
237programming stratagem, SMARTALLOC disallows it with an assertion.
238
239When the standard library realloc() function is called to expand a
240buffer, it attempts to expand the buffer in place if possible, moving it
241only if necessary. Because SMARTALLOC must place its own private storage
242in the buffer and also to aid in error detection, its version of
243realloc() always moves and copies the buffer except in the trivial case
244where the size of the buffer is not being changed. By forcing the buffer
245to move on every call and destroying the contents of the old buffer when
246it is released, SMARTALLOC traps programs which keep pointers into a
247buffer across a call on realloc() which may move it. This strategy may
248prove very costly to programs which make extensive use of realloc(). If
249this proves to be a problem, such programs may wish to use
250actuallymalloc(), actuallyrealloc(), and actuallyfree() for such
251frequently-adjusted buffers, trading error detection for performance.
252Although not specified in the System V Interface Definition, many C
253library implementations of realloc() permit an old buffer argument of
254NULL, causing realloc() to allocate a new buffer. The SMARTALLOC version
255permits this.
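The always-move policy might look roughly like the sketch below. The function moving_realloc() is hypothetical, and it takes the old size as an argument because this sketch keeps no header. It fills new storage with 0x55, copies the surviving bytes, then overwrites the old buffer with 0xAA before releasing it.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical realloc() replacement that never resizes in place. */
void *moving_realloc(void *old, size_t old_size, size_t new_size)
{
    unsigned char *fresh;
    size_t keep = old_size < new_size ? old_size : new_size;

    assert(new_size > 0);
    if (old != NULL && new_size == old_size)
        return old;                      /* trivial case: nothing changes */

    fresh = malloc(new_size);
    if (fresh == NULL)
        return NULL;                     /* old buffer is left untouched */

    memset(fresh, 0x55, new_size);       /* pattern for any added bytes */
    if (old != NULL) {                   /* NULL acts like a plain allocation */
        memcpy(fresh, old, keep);        /* carry over the surviving data */
        memset(old, 0xAA, old_size);     /* destroy the stale copy; an optimizer
                                            may drop this store before free() */
        free(old);
    }
    return fresh;
}
```

Every resize costs an allocation and a copy, which is the overhead the text warns about for programs that call realloc() heavily.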
256
257When SMARTALLOC is Disabled
258~~~~~~~~~~~~~~~~~~~~~~~~~~~
259
When SMARTALLOC is disabled by compiling a program with the symbol
SMARTALLOC not defined, calls on the functions otherwise redefined by
SMARTALLOC go directly to the system functions. In addition,
compile-time definitions translate calls on the “actually…()” functions
into the corresponding library calls; “actuallymalloc(100)”, for
example, compiles into “malloc(100)”. The two special SMARTALLOC
functions, sm_dump() and sm_static(), are defined to generate no code
(hence the null statement). Finally, if SMARTALLOC is not defined,
compilation of the file smartall.c generates no code or data at all,
effectively removing it from the program even if named in the link
instructions.
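One plausible shape for the disabled-mode definitions is sketched below. It illustrates the technique and is not a copy of smartall.h, whose actual macros may differ. The function run_job() is a hypothetical caller, and the sketch assumes it is compiled without SMARTALLOC defined.

```c
#include <assert.h>
#include <stdlib.h>

#ifndef SMARTALLOC
/* Map the bypass routines straight onto the system allocator. */
#define actuallymalloc(size)          malloc(size)
#define actuallycalloc(nelem, elsize) calloc(nelem, elsize)
#define actuallyrealloc(ptr, size)    realloc(ptr, size)
#define actuallyfree(ptr)             free(ptr)
/* The two SMARTALLOC-only calls expand to nothing at all. */
#define sm_dump(datadump)
#define sm_static(mode)
#endif

/* Hypothetical caller: the same source builds with or without checking. */
int run_job(void)
{
    char *buf;

    sm_static(1);                 /* becomes a null statement */
    buf = actuallymalloc(100);    /* becomes malloc(100) */
    sm_static(0);
    if (buf == NULL)
        return -1;
    buf[0] = 'x';
    actuallyfree(buf);            /* becomes free(buf) */
    sm_dump(0);                   /* becomes a null statement */
    return 0;
}
```

Since each macro disappears or collapses to a library call at compile time, the production build carries no extra code for these calls.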
271
272Thus, except for unusual circumstances, a program that works with
273SMARTALLOC defined for testing should require no changes when built
274without it for production release.
275
276The alloc() Function
277~~~~~~~~~~~~~~~~~~~~
278
Many programs I’ve worked on use very few direct calls to malloc(),
using the identically declared alloc() function instead. The alloc()
function detects out-of-memory conditions and aborts, removing the need
for error checking on every call of malloc() (and the temptation to skip
checking for out-of-memory).
284
285As a convenience, SMARTALLOC supplies a compatible version of alloc() in
286the file alloc.c, with its definition in the file alloc.h. This version
287of alloc() is sensitive to the definition of SMARTALLOC and cooperates
288with SMARTALLOC’s orphaned buffer detection. In addition, when
289SMARTALLOC is defined and alloc() detects an out of memory condition, it
290takes advantage of the SMARTALLOC diagnostic information to identify the
291file and line number of the call on alloc() that failed.
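The idea behind such a function can be sketched as below. The names alloc_at() and ALLOC() are hypothetical and the message format is an assumption, so the alloc.c shipped with SMARTALLOC may look quite different.

```c
#include <assert.h>
#include <stdio.h>
#include <stdlib.h>

/* Hypothetical checked allocator: never returns NULL to the caller. */
void *alloc_at(size_t size, const char *file, int line)
{
    void *p;

    assert(size > 0);
    p = malloc(size);
    if (p == NULL) {
        /* Report where the failing request came from, then give up. */
        fprintf(stderr, "Out of memory: %lu bytes requested at line %d of %s\n",
                (unsigned long) size, line, file);
        abort();
    }
    return p;
}

/* Capture the call site automatically. */
#define ALLOC(size) alloc_at((size), __FILE__, __LINE__)
```

Callers can then write “buf = ALLOC(n);” and use the result directly, with no NULL test at each call site.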
292
293Overlays and Underhandedness
294~~~~~~~~~~~~~~~~~~~~~~~~~~~~
295
String constants in the C language are considered to be static arrays of
characters accessed through a pointer constant. The arrays are
potentially writable even though their pointer is a constant. SMARTALLOC
uses the compile-time definition ``__FILE__`` to obtain the name of
the file in which a call on buffer allocation was performed. Rather than
reserve space in a buffer to save this information, SMARTALLOC simply
stores the pointer to the compiled-in text of the file name. This works
fine as long as the program does not overlay its data among modules. If
data are overlaid, the area of memory which contained the file name at
the time it was saved in the buffer may contain something else entirely
when sm_dump() gets around to using the pointer to print the name of the
file which allocated the buffer.
308
If you want to use SMARTALLOC in a program with overlaid data, you’ll
have to modify smartall.c to either copy the file name to a fixed-length
field added to the abufhead structure, or else allocate storage with
malloc(), copy the file name there, and set the abfname pointer to that
buffer, then remember to release the buffer in sm_free. Either of these
approaches is wasteful of storage and time, and should be considered
only if there is no alternative. Since most initial debugging is done in
non-overlaid environments, the restrictions on SMARTALLOC with data
overlaying may never prove a problem. Note that conventional overlaying
of code, by far the most common form of overlaying, poses no problems
for SMARTALLOC; you need only be concerned if you’re using exotic tools
for data overlaying on MS-DOS or other address-space-challenged systems.
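The first alternative, copying the name into a fixed-length field, could be sketched as follows. The structure named_head, the function record_origin() and the 32-byte limit are all hypothetical, and the real abufhead layout is not reproduced here.

```c
#include <assert.h>
#include <string.h>

#define FNAME_MAX 32   /* assumed field size, including the terminator */

/* Hypothetical header fragment holding a private copy of the file name. */
struct named_head {
    char fname[FNAME_MAX];
    int line;
};

/* Copy the call site into the header instead of saving a pointer. */
void record_origin(struct named_head *h, const char *file, int line)
{
    size_t len = strlen(file);

    if (len >= FNAME_MAX) {
        /* Too long: keep the tail, where the base name lives. */
        file += len - (FNAME_MAX - 1);
        len = FNAME_MAX - 1;
    }
    memcpy(h->fname, file, len + 1);   /* includes the terminating NUL */
    h->line = line;
}
```

Each header grows by FNAME_MAX bytes and each allocation pays for a string copy, which is the storage and time cost the text cautions against.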
321
Since a C language “constant” string can actually be written into,
most C compilers generate a unique copy of each string used in a module,
even if the same constant string appears many times. In modules that
contain many calls on allocation functions, this results in substantial
wasted storage for the strings that identify the file name. If your
compiler permits optimization of multiple occurrences of constant
strings, enabling this mode will eliminate the overhead for these
strings. Of course, it’s up to you to make sure choosing this compiler
mode won’t wreak havoc on some other part of your program.
331
332Test and Demonstration Program
333~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
334
335A test and demonstration program, smtest.c, is supplied with SMARTALLOC.
336You can build this program with the Makefile included. Please refer to
337the comments in smtest.c and the Makefile for information on this
338program. If you’re attempting to use SMARTALLOC on a new machine or with
339a new compiler or operating system, it’s a wise first step to check it
340out with smtest first.
341
342Invitation to the Hack
343~~~~~~~~~~~~~~~~~~~~~~
344
SMARTALLOC is not intended to be a panacea for storage management
problems, nor is it universally applicable or effective; it’s another
weapon in the arsenal of the defensive professional programmer
attempting to create reliable products. It represents the current state
of evolution of expedient debug code which has been used in several
commercial software products which have, collectively, sold more than
a third of a million copies in the retail market, and can be expected to
continue to develop through time as it is applied to ever more demanding
projects.
354
The version of SMARTALLOC here has been tested on a Sun SPARCStation,
Silicon Graphics Indigo2, and on MS-DOS using both Borland and Microsoft
C. Moving from compiler to compiler requires the usual small changes to
resolve disputes about prototyping of functions, whether the type
returned by buffer allocation is char \* or void \*, and so forth, but
following those changes it works in a variety of environments. I hope
you’ll find SMARTALLOC as useful for your projects as I’ve found it in
mine.
363