sfsexp
======

This library is intended for developers who wish to manipulate (read,
parse, modify, and create) symbolic expressions from C or C++
programs. A symbolic expression, or s-expression, is essentially a
LISP-like expression such as `(a (b c))`. S-expressions can
represent complex, structured data without requiring additional
metadata describing the structure. They are recursively defined: an
s-expression is a list whose elements are either atoms or
s-expressions. In the example above, the expression contains an atom
"a" and an s-expression, which in turn contains two atoms, "b" and
"c". S-expressions are simple, useful, and well understood.

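As a rough illustration of the recursive definition, the following self-contained C sketch counts the atoms in an s-expression string and reports how deeply its lists nest. The helper name `sx_stats` is hypothetical and is not part of this library. It also assumes a simplified syntax in which an atom is any run of characters other than whitespace and parentheses, so quoted strings are not handled. For `(a (b c))` it should report three atoms and a nesting depth of two.

```c
#include <assert.h>
#include <ctype.h>
#include <stddef.h>

/* Hypothetical helper, not part of sfsexp: scan an s-expression string and
 * report how many atoms it contains and how deeply its lists nest.
 * Simplification: an atom is any run of characters other than whitespace
 * and parentheses (no quoted strings or escapes).
 * Returns 0 if the parentheses balance, -1 otherwise. */
int sx_stats(const char *s, int *atoms, int *max_depth)
{
    int depth = 0;
    int in_atom = 0;

    assert(s != NULL && atoms != NULL && max_depth != NULL);
    *atoms = 0;
    *max_depth = 0;

    for (; *s != '\0'; s++) {
        if (*s == '(') {
            in_atom = 0;
            depth++;
            if (depth > *max_depth)
                *max_depth = depth;
        } else if (*s == ')') {
            in_atom = 0;
            if (--depth < 0)
                return -1;          /* ")" with no matching "(" */
        } else if (isspace((unsigned char)*s)) {
            in_atom = 0;
        } else if (!in_atom) {
            in_atom = 1;            /* first character of a new atom */
            (*atoms)++;
        }
    }
    return depth == 0 ? 0 : -1;     /* a leftover "(" means unbalanced */
}
```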
This library is designed to provide a minimal set of functions and
data structures for the four operations listed above: reading
s-expressions (I/O), parsing strings containing them into an AST
equivalent, modifying the AST representation, and converting the AST
back into a well-formatted string. The primary goals are efficiency
and simplicity. This library forms the basis of the data
representation and transmission protocol for the
[Supermon](https://dl.acm.org/doi/10.5555/792762.793324) high-speed
cluster monitoring system from the LANL Advanced Computing
Laboratory. The library's usefulness and the lack of choice among
available open source s-expression libraries around 2003 motivated
releasing it independently of Supermon.

## Building

(Attention Windows users: if you are using Cygwin, this section
applies, since Cygwin looks much like Unix for compilation. Visual
Studio users should see the note at the end and look inside the win32
directory.)

Configure the sources via autoconf.

```
% ./configure
```

Currently, the only feature that can be enabled via autoconf is the
library's debugging code, by specifying `--enable-debug`. Other
features, such as disabling memory management by the library, are
toggled by setting appropriate options in `CFLAGS`:

```
% CFLAGS=-D_NO_MEMORY_MANAGEMENT_ ./configure
```

Note that you should use the vanilla configuration unless you know that you
want to use debug mode or other options, and understand precisely what they
mean.

After configuring, just invoke make:

```
% make
```

What comes out is a library called `libsexp.a`. If you wish to copy
the headers and libraries to an installation location, you should then
run:

```
% make install
```

At the current time, this is not recommended if the installation
prefix is /usr or /usr/local, since the headers do not go into an
isolated subdirectory. In the future, we will hopefully have a single
consolidated header that encapsulates all of the headers currently
included in the library. If you do not want to point the project that
uses the library directly at the library source and build directories,
creating a separate installation prefix and installing there
is recommended. For example, from the root of the sexpr tree:

```
% mkdir installTree
% ./configure --prefix=`pwd`/installTree
% make
% make install
```

If you want the docs, make sure you have doxygen installed and that
the DOXYGEN variable in the Makefile.in in this directory points at
the right place. Re-run autoconf to regenerate the makefiles, and
then type:

```
% make doc
```

## Usage notes

In any code that wants to use this library, just include `sexp.h`. That
header contains the one data structure, enumeration, and five functions for
manipulating and parsing strings and s-expressions. Compilation
typically will look like:

```
% cc -I/path/to/sexp/include -L/path/to/sexp/library \
     -o foo  foo.o -lsexp
```

The critical parts are to ensure that the include path points at the
directory containing the library headers, the library path points at the
directory containing the compiled library, and the library is linked
into the executable.

The API is well documented in the header files, especially
[sexp.h](src/sexp.h) and [sexp_ops.h](src/sexp_ops.h). The latter
contains some convenience operators that may make your life slightly
easier; for example, it defines `hd_sexp`, `tl_sexp`, and `next_sexp`,
which make it a little easier to navigate sexps (see below for a
schematic representation of sexp structure).

The library includes a basic (optional) string library as well; see
[cstring.h](src/cstring.h). This string library is useful for avoiding
fixed-size buffers: it grows strings automatically as necessary.

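The idea of a string that grows on demand can be sketched in a few lines of standalone C. The `gbuf_t` type and the `gbuf_*` functions below are hypothetical names invented for this illustration. They are not the API declared in `cstring.h`, and the doubling growth policy is an assumption rather than a description of how the library grows its strings.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical growable buffer; NOT the API declared in cstring.h. */
typedef struct {
    char  *data;   /* NUL-terminated contents */
    size_t len;    /* bytes in use, excluding the terminator */
    size_t cap;    /* bytes allocated */
} gbuf_t;

/* Allocate an empty buffer. Returns 0 on success, -1 on allocation failure. */
int gbuf_init(gbuf_t *b, size_t initial_cap)
{
    assert(b != NULL);
    if (initial_cap == 0)
        initial_cap = 1;            /* always leave room for the terminator */
    b->data = malloc(initial_cap);
    if (b->data == NULL)
        return -1;
    b->data[0] = '\0';
    b->len = 0;
    b->cap = initial_cap;
    return 0;
}

/* Append n bytes from src, doubling the capacity until they fit.
 * Returns 0 on success, -1 on allocation failure. */
int gbuf_append(gbuf_t *b, const char *src, size_t n)
{
    size_t need, new_cap;
    char *p;

    assert(b != NULL && src != NULL);
    need = b->len + n + 1;          /* +1 for the terminator */
    if (need > b->cap) {
        new_cap = b->cap;
        while (new_cap < need)
            new_cap *= 2;
        p = realloc(b->data, new_cap);
        if (p == NULL)
            return -1;              /* the original buffer is still valid */
        b->data = p;
        b->cap = new_cap;
    }
    memcpy(b->data + b->len, src, n);
    b->len += n;
    b->data[b->len] = '\0';
    return 0;
}

/* Release the storage and reset the bookkeeping fields. */
void gbuf_free(gbuf_t *b)
{
    assert(b != NULL);
    free(b->data);
    b->data = NULL;
    b->len = b->cap = 0;
}
```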
If you are parsing a set of smallish sexps, as you might have in a
lispy source file, you probably want to use `init_iowrap` and
`read_one_sexp`. The [examples](examples) and [tests](tests)
directories contain multiple examples showing how to do this.

The drawback of `read_one_sexp` is that it uses a read buffer of
size `BUFSIZ`, which may be relatively small (1024 on macOS Big Sur). If
you try to read a sexp that is larger than `BUFSIZ` using this method,
you will get the error `SEXP_ERR_INCOMPLETE`, meaning "parsing is
incomplete and needs more data to complete it"
([sexp_errors.h](src/sexp_errors.h)). This can easily happen if you
are reading data encoded as sexps. For example, it's not uncommon for
a big hunk o' data to be encoded as a single sexp in a file. One way
to handle this situation is to get the file size, dynamically allocate
a buffer big enough to hold it, and then use `parse_sexp`. Here's a
minimal example (error checking omitted):

```
/* needs <stdio.h>, <stdlib.h>, and "sexp.h"; fname is the path of the input file */
FILE *fp = fopen(fname, "r");
fseek(fp, 0, SEEK_END);
size_t fsize = (size_t) ftell(fp);
fseek(fp, 0, SEEK_SET);  /* reset file pos to beginning of file, for reading */
char *work_buf = (char*) malloc(fsize + 1);
char *check_buf = (char*) malloc(fsize + 1); /* for debugging */
size_t read_len = fread(work_buf, 1, fsize, fp);
work_buf[read_len] = '\0';       /* make sure it's properly terminated */
fclose(fp);                      /* the stream came from fopen, so use fclose */
sexp_t *the_sexp = parse_sexp(work_buf, read_len); /* assumption: file contains one sexp */
/* check the parse result by serializing it to check_buf */
int write_len = print_sexp(check_buf, fsize + 1, the_sexp);
printf("sexp: '%s'\n", check_buf);
/* process the_sexp ... */
destroy_sexp(the_sexp);
free(work_buf);
free(check_buf);
```

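The snippet above omits error checking. One possible way to add it to the file-reading step is sketched below, using only the standard C library. The helper name `read_whole_file` is hypothetical and not part of this library; on success, its buffer and length could be passed to `parse_sexp` as in the example above.

```c
#include <assert.h>
#include <stdio.h>
#include <stdlib.h>

/* Hypothetical helper, not part of sfsexp: read an entire file into a
 * freshly allocated, NUL-terminated buffer. On success it returns the
 * buffer (the caller frees it) and stores the byte count in *len_out.
 * On any failure it returns NULL. */
char *read_whole_file(const char *path, size_t *len_out)
{
    FILE *fp;
    long fsize;
    char *buf = NULL;
    size_t n;

    assert(path != NULL && len_out != NULL);
    fp = fopen(path, "rb");
    if (fp == NULL)
        return NULL;

    if (fseek(fp, 0, SEEK_END) != 0)
        goto fail;
    fsize = ftell(fp);               /* ftell returns -1 on error */
    if (fsize < 0 || fseek(fp, 0, SEEK_SET) != 0)
        goto fail;

    buf = malloc((size_t)fsize + 1); /* +1 for the terminator */
    if (buf == NULL)
        goto fail;

    n = fread(buf, 1, (size_t)fsize, fp);
    if (n != (size_t)fsize && ferror(fp))
        goto fail;                   /* short read caused by an I/O error */
    buf[n] = '\0';
    fclose(fp);
    *len_out = n;
    return buf;

fail:
    free(buf);                       /* free(NULL) is a no-op */
    fclose(fp);
    return NULL;
}
```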
The internal representation of sexps is lispy. For example, `(a (b c) d)`
looks something like this:

```
    sexp_t
      |
    list
      |
     \_/
    sexp_t  -- next --> sexp_t  -- next --> sexp_t
      |                   |                   |
     val                 list                val
      |                   |                   |
     \_/                  |                  \_/
      a                   |                   d
                          |
                         \_/
                        sexp_t  -- next --> sexp_t
                          |                   |
                         val                 val
                          |                   |
                         \_/                 \_/
                          b                   c
```

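To make the diagram concrete, here is a standalone sketch that wires up the same links by hand and walks them. The `node_t` type is a simplified stand-in invented for this example: it borrows the `val`, `list`, and `next` names from the diagram, but it is not the library's `sexp_t`, which is defined in `sexp.h` and may differ. The hypothetical `demo_diagram` function builds `(a (b c) d)` and serializes it back to text.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Simplified stand-in for the diagram, not the real sexp_t from sexp.h.
 * A node is an atom when val is non-NULL, otherwise it is a list. */
typedef struct node {
    const char  *val;   /* atom text, or NULL for a list node */
    struct node *list;  /* first child when this node is a list */
    struct node *next;  /* next sibling in the enclosing list */
} node_t;

/* Append s to the NUL-terminated string in out (capacity cap bytes). */
static int append(char *out, size_t cap, const char *s)
{
    size_t used = strlen(out), n = strlen(s);
    if (used + n + 1 > cap)
        return -1;
    memcpy(out + used, s, n + 1);   /* copies the terminator too */
    return 0;
}

/* Serialize one element: an atom prints its text, a list prints its
 * children (reached through next) between parentheses. out must already
 * hold a NUL-terminated string. Returns 0 on success, -1 if out is full. */
int node_to_string(const node_t *n, char *out, size_t cap)
{
    const node_t *child;

    assert(n != NULL && out != NULL);
    if (n->val != NULL)
        return append(out, cap, n->val);

    if (append(out, cap, "(") != 0)
        return -1;
    for (child = n->list; child != NULL; child = child->next) {
        if (node_to_string(child, out, cap) != 0)
            return -1;
        if (child->next != NULL && append(out, cap, " ") != 0)
            return -1;
    }
    return append(out, cap, ")");
}

/* Build (a (b c) d) by hand, wiring the same links the diagram draws. */
int demo_diagram(char *out, size_t cap)
{
    node_t c = { "c", NULL, NULL };
    node_t b = { "b", NULL, &c };
    node_t d = { "d", NULL, NULL };
    node_t inner = { NULL, &b, &d };     /* the list (b c), followed by d */
    node_t a = { "a", NULL, &inner };
    node_t top = { NULL, &a, NULL };     /* the outermost list */

    assert(out != NULL && cap > 0);
    out[0] = '\0';
    return node_to_string(&top, out, cap);
}
```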
## Windows users

Please look in the win32/ subdirectory. Note that as of 9/2013, this
has not been looked at or tested in a number of years. If you try to
use it and find it broken, fixes and updates to bring it up to speed
with modern Windows development environments would be appreciated!

## Credits

The library is by Matt Sottile. Steve James of Linux Labs has
contributed bug fixes and features while developing for the related
Supermon project. Sung-Eun Choi and Paul Ruth have contributed many
bug reports as the library has grown. Erik Hendriks contributed the
malloc debugging tools now used when building with the
`-D_DEBUG_MALLOCS_` option. Brad Green contributed code (in win32/) and
testing for the Windows Visual Studio build target. Others who have
contributed can be found in the GitHub repository contributor list.

### Funding acknowledgement

This work was funded by the Department of Energy, Office of
Science. The original development was performed at the Los Alamos
National Laboratory between 2002 and 2007. It is currently maintained
independently of any funding source as a service to the community
(plus, it's fun to work on).