1@c PSPP - a program for statistical analysis.
2@c Copyright (C) 2019 Free Software Foundation, Inc.
3@c Permission is granted to copy, distribute and/or modify this document
4@c under the terms of the GNU Free Documentation License, Version 1.3
5@c or any later version published by the Free Software Foundation;
6@c with no Invariant Sections, no Front-Cover Texts, and no Back-Cover Texts.
7@c A copy of the license is included in the section entitled "GNU
8@c Free Documentation License".
9@c
10
11@node Basic Concepts
12@chapter Basic Concepts
13
14This chapter introduces basic data structures and other concepts
15needed for developing in PSPP.
16
17@menu
18* Values::
19* Input and Output Formats::
20* User-Missing Values::
21* Value Labels::
22* Variables::
23* Dictionaries::
24* Coding Conventions::
25* Cases::
26* Data Sets::
27* Pools::
28@end menu
29
30@node Values
31@section Values
32
33@cindex value
34The unit of data in PSPP is a @dfn{value}.
35
36@cindex width
37@cindex string value
38@cindex numeric value
39@cindex MAX_STRING
40Values are classified by @dfn{type} and @dfn{width}.  The
41type of a value is either @dfn{numeric} or @dfn{string} (sometimes
42called alphanumeric).  The width of a string value ranges from 1 to
43@code{MAX_STRING} bytes.  The width of a numeric value is artificially
44defined to be 0; thus, the type of a value can be inferred from its
45width.
46
47Some support is provided for working with value types and widths, in
48@file{data/val-type.h}:
49
50@deftypefn Macro int MAX_STRING
51Maximum width of a string value, in bytes, currently 32,767.
52@end deftypefn
53
54@deftypefun bool val_type_is_valid (enum val_type @var{val_type})
55Returns true if @var{val_type} is a valid value type, that is,
56either @code{VAL_NUMERIC} or @code{VAL_STRING}.  Useful for
57assertions.
58@end deftypefun
59
60@deftypefun {enum val_type} val_type_from_width (int @var{width})
61Returns @code{VAL_NUMERIC} if @var{width} is 0 and thus represents the
62width of a numeric value, otherwise @code{VAL_STRING} to indicate that
63@var{width} is the width of a string value.
64@end deftypefun
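
The relationship between width and type can be sketched in a few lines
of standalone C.  The names below (@code{sketch_val_type},
@code{SKETCH_MAX_STRING}, and so on) are hypothetical stand-ins invented
for this illustration, not the declarations in @file{data/val-type.h};
the limit of 32,767 is the value quoted for @code{MAX_STRING} above.

@example
#include <stdbool.h>

/* Hypothetical stand-ins for illustration only; PSPP's real
   declarations live in data/val-type.h. */
enum sketch_val_type @{ SKETCH_NUMERIC, SKETCH_STRING @};
#define SKETCH_MAX_STRING 32767

/* A width is meaningful if it is 0 (numeric) or a string width
   between 1 and SKETCH_MAX_STRING. */
static bool
sketch_width_is_valid (int width)
@{
  return width >= 0 && width <= SKETCH_MAX_STRING;
@}

/* Width 0 means numeric; any other width means string. */
static enum sketch_val_type
sketch_type_from_width (int width)
@{
  return width == 0 ? SKETCH_NUMERIC : SKETCH_STRING;
@}
@end example

Real client code should call @func{val_type_from_width} rather than
repeating this test.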
65
66The following subsections describe how values of each type are
67represented.
68
69@menu
70* Numeric Values::
71* String Values::
72* Runtime Typed Values::
73@end menu
74
75@node Numeric Values
76@subsection Numeric Values
77
78A value known to be numeric at compile time is represented as a
79@code{double}.  PSPP provides three values of @code{double} for
80special purposes, defined in @file{data/val-type.h}:
81
82@deftypefn Macro double SYSMIS
83The @dfn{system-missing value}, used to represent a datum whose true
84value is unknown, such as a survey question that was not answered by
85the respondent, or undefined, such as the result of division by zero.
86PSPP propagates the system-missing value through calculations and
87compensates for missing values in statistical analyses.  @xref{Missing
88Observations,,,pspp, PSPP Users Guide}, for a PSPP user's view of
89missing values.
90
91PSPP currently defines @code{SYSMIS} as @code{-DBL_MAX}, that is, the
92greatest finite negative value of @code{double}.  It is best not to
93depend on this definition, because PSPP may transition to using an
94IEEE NaN (not a number) instead at some point in the future.
95@end deftypefn
96
97@deftypefn Macro double LOWEST
98@deftypefnx Macro double HIGHEST
99The greatest finite negative (except for @code{SYSMIS}) and positive
100values of @code{double}, respectively.  These values do not ordinarily
101appear in user data files.  Instead, they are used to implement
102endpoints of open-ended ranges that are occasionally permitted in PSPP
103syntax, e.g.@: @code{5 THRU HI} as a range of missing values
104(@pxref{MISSING VALUES,,,pspp, PSPP Users Guide}).
105@end deftypefn
106
107@node String Values
108@subsection String Values
109
110A value known at compile time to have string type is represented as an
111array of @code{char}.  String values do not necessarily represent
112readable text strings and may contain arbitrary 8-bit data, including
113null bytes, control codes, and bytes with the high bit set.  Thus,
114string values are not null-terminated strings, but rather opaque
115arrays of bytes.
116
117@code{SYSMIS}, @code{LOWEST}, and @code{HIGHEST} have no equivalents
118as string values.  Usually, PSPP fills an unknown or undefined string
value with spaces, but PSPP does not treat such a string as a special
120case when it processes it later.
121
122@cindex MAX_STRING
123@code{MAX_STRING}, the maximum length of a string value, is defined in
124@file{data/val-type.h}.
125
126@node Runtime Typed Values
127@subsection Runtime Typed Values
128
129When a value's type is only known at runtime, it is often represented
130as a @union{value}, defined in @file{data/value.h}.  A @union{value}
131does not identify the type or width of the data it contains.  Code
that works with @union{value}s must therefore have external knowledge
133of its content, often through the type and width of a
134@struct{variable} (@pxref{Variables}).
135
136@union{value} has one member that clients are permitted to access
137directly, a @code{double} named @samp{f} that stores the content of a
138numeric @union{value}.  It has other members that store the content of
string @union{value}s, but client code should use accessor functions
140instead of referring to these directly.
141
142PSPP provides some functions for working with @union{value}s.  The
143most useful are described below.  To use these functions, recall that
144a numeric value has a width of 0.
145
146@deftypefun void value_init (union value *@var{value}, int @var{width})
147Initializes @var{value} as a value of the given @var{width}.  After
148initialization, the data in @var{value} are indeterminate; the caller
149is responsible for storing initial data in it.
150@end deftypefun
151
152@deftypefun void value_destroy (union value *@var{value}, int @var{width})
153Frees auxiliary storage associated with @var{value}, which must have
154the given @var{width}.
155@end deftypefun
156
157@deftypefun bool value_needs_init (int @var{width})
158For some widths, @func{value_init} and @func{value_destroy} do not
159actually do anything, because no additional storage is needed beyond
160the size of @union{value}.  This function returns true if @var{width}
is such a width, in which case there is no actual need to call those
162functions.  This can be a useful optimization if a large number of
163@union{value}s of such a width are to be initialized or destroyed.
164
165This function returns false if @func{value_init} and
166@func{value_destroy} are actually required for the given @var{width}.
167@end deftypefun
168
169@deftypefun void value_copy (union value *@var{dst}, @
170                             const union value *@var{src}, @
171                             int @var{width})
172Copies the contents of @union{value} @var{src} to @var{dst}.  Both
173@var{dst} and @var{src} must have been initialized with the specified
174@var{width}.
175@end deftypefun
176
177@deftypefun void value_set_missing (union value *@var{value}, int @var{width})
178Sets @var{value} to @code{SYSMIS} if it is numeric or to all spaces if
179it is alphanumeric, according to @var{width}.  @var{value} must have
180been initialized with the specified @var{width}.
181@end deftypefun
182
183@anchor{value_is_resizable}
184@deftypefun bool value_is_resizable (const union value *@var{value}, int @var{old_width}, int @var{new_width})
185Determines whether @var{value}, which must have been initialized with
186the specified @var{old_width}, may be resized to @var{new_width}.
187Resizing is possible if the following criteria are met.  First,
188@var{old_width} and @var{new_width} must be both numeric or both
189string widths.  Second, if @var{new_width} is a short string width and
190less than @var{old_width}, resizing is allowed only if bytes
191@var{new_width} through @var{old_width} in @var{value} contain only
192spaces.
193
194These rules are part of those used by @func{mv_is_resizable} and
195@func{val_labs_can_set_width}.
196@end deftypefun
197
198@deftypefun void value_resize (union value *@var{value}, int @var{old_width}, int @var{new_width})
199Resizes @var{value} from @var{old_width} to @var{new_width}, which
200must be allowed by the rules stated above.  @var{value} must have been
201initialized with the specified @var{old_width} before calling this
202function.  After resizing, @var{value} has width @var{new_width}.
203
204If @var{new_width} is greater than @var{old_width}, @var{value} will
205be padded on the right with spaces to the new width.  If
206@var{new_width} is less than @var{old_width}, the rightmost bytes of
207@var{value} are truncated.
208@end deftypefun
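
The resizing rules can be modeled on a plain byte array.  The sketch
below is not PSPP code: @code{sketch_is_resizable} and
@code{sketch_resize} are hypothetical names, and for simplicity the
sketch applies the trailing-space check to every narrowing, ignoring the
short-string qualification mentioned for @func{value_is_resizable}.

@example
#include <stdbool.h>
#include <string.h>

/* Hypothetical model of the rule described for value_is_resizable(),
   applied to a byte array instead of a union value.  Assumption: the
   space check is applied to every narrowing. */
static bool
sketch_is_resizable (const char *bytes, int old_width, int new_width)
@{
  /* Numeric (width 0) and string (width > 0) never convert. */
  if ((old_width == 0) != (new_width == 0))
    return false;

  /* Narrowing may only discard trailing spaces. */
  for (int i = new_width; i < old_width; i++)
    if (bytes[i] != ' ')
      return false;
  return true;
@}

/* Hypothetical model of value_resize() for strings.  BUF must have
   room for NEW_WIDTH bytes.  Widening pads on the right with spaces;
   narrowing simply ignores the bytes past NEW_WIDTH. */
static void
sketch_resize (char *buf, int old_width, int new_width)
@{
  if (new_width > old_width)
    memset (buf + old_width, ' ', new_width - old_width);
@}
@end example

Under these assumptions, a 4-byte value containing @samp{ab} followed by
two spaces may be narrowed to width 2, but one containing @samp{abc}
followed by one space may not.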
209
210@deftypefun bool value_equal (const union value *@var{a}, const union value *@var{b}, int @var{width})
Compares @var{a} and @var{b}, which must both have width
212@var{width}.  Returns true if their contents are the same, false if
213they differ.
214@end deftypefun
215
216@deftypefun int value_compare_3way (const union value *@var{a}, const union value *@var{b}, int @var{width})
Compares @var{a} and @var{b}, which must both have width
218@var{width}.  Returns -1 if @var{a} is less than @var{b}, 0 if they
219are equal, or 1 if @var{a} is greater than @var{b}.
220
221Numeric values are compared numerically, with @code{SYSMIS} comparing
222less than any real number.  String values are compared
223lexicographically byte-by-byte.
224@end deftypefun
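
The ordering described for @func{value_compare_3way} can be illustrated
without PSPP's headers.  In this sketch the function names are
hypothetical, and @code{SKETCH_SYSMIS} assumes the current definition of
@code{SYSMIS} as @code{-DBL_MAX}, which, as noted earlier, real code
should not depend on.

@example
#include <float.h>
#include <string.h>

/* Assumption: mirrors the current definition of SYSMIS. */
#define SKETCH_SYSMIS (-DBL_MAX)

/* Numeric comparison.  Because SKETCH_SYSMIS is the most negative
   finite double, it sorts below every other finite number. */
static int
sketch_compare_num (double a, double b)
@{
  return a < b ? -1 : a > b;
@}

/* String comparison: byte by byte over exactly WIDTH bytes, so
   embedded null bytes take part in the comparison. */
static int
sketch_compare_str (const char *a, const char *b, int width)
@{
  int cmp = memcmp (a, b, width);
  return cmp < 0 ? -1 : cmp > 0;
@}
@end example

The string case uses @code{memcmp} rather than @code{strcmp} because
string values are opaque byte arrays, not null-terminated strings.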
225
226@deftypefun size_t value_hash (const union value *@var{value}, int @var{width}, unsigned int @var{basis})
227Computes and returns a hash of @var{value}, which must have the
228specified @var{width}.  The value in @var{basis} is folded into the
229hash.
230@end deftypefun
231
232@node Input and Output Formats
233@section Input and Output Formats
234
235Input and output formats specify how to convert data fields to and
236from data values (@pxref{Input and Output Formats,,,pspp, PSPP Users
237Guide}).  PSPP uses @struct{fmt_spec} to represent input and output
238formats.
239
240Function prototypes and other declarations related to formats are in
the @file{data/format.h} header.
242
243@deftp {Structure} {struct fmt_spec}
244An input or output format, with the following members:
245
246@table @code
247@item enum fmt_type type
248The format type (see below).
249
250@item int w
251Field width, in bytes.  The width of numeric fields is always between
2521 and 40 bytes, and the width of string fields is always between 1 and
25365534 bytes.  However, many individual types of formats place stricter
254limits on field width (see @ref{fmt_max_input_width},
255@ref{fmt_max_output_width}).
256
257@item int d
258Number of decimal places, in character positions.  For format types
259that do not allow decimal places to be specified, this value must be
2600.  Format types that do allow decimal places have type-specific and
261often width-specific restrictions on @code{d} (see
262@ref{fmt_max_input_decimals}, @ref{fmt_max_output_decimals}).
263@end table
264@end deftp
265
266@deftp {Enumeration} {enum fmt_type}
267An enumerated type representing an input or output format type.  Each
268PSPP input and output format has a corresponding enumeration constant
269prefixed by @samp{FMT}: @code{FMT_F}, @code{FMT_COMMA},
270@code{FMT_DOT}, and so on.
271@end deftp
272
273The following sections describe functions for manipulating formats and
274the data in fields represented by formats.
275
276@menu
277* Constructing and Verifying Formats::
278* Format Utility Functions::
279* Obtaining Properties of Format Types::
280* Numeric Formatting Styles::
281* Formatted Data Input and Output::
282@end menu
283
284@node Constructing and Verifying Formats
285@subsection Constructing and Verifying Formats
286
287These functions construct @struct{fmt_spec}s and verify that they are
288valid.
289
290
291
292@deftypefun {struct fmt_spec} fmt_for_input (enum fmt_type @var{type}, int @var{w}, int @var{d})
293@deftypefunx {struct fmt_spec} fmt_for_output (enum fmt_type @var{type}, int @var{w}, int @var{d})
294Constructs a @struct{fmt_spec} with the given @var{type}, @var{w}, and
295@var{d}, asserts that the result is a valid input (or output) format,
296and returns it.
297@end deftypefun
298
299@anchor{fmt_for_output_from_input}
300@deftypefun {struct fmt_spec} fmt_for_output_from_input (const struct fmt_spec *@var{input})
301Given @var{input}, which must be a valid input format, returns the
302equivalent output format.  @xref{Input and Output Formats,,,pspp, PSPP
303Users Guide}, for the rules for converting input formats into output
304formats.
305@end deftypefun
306
307@deftypefun {struct fmt_spec} fmt_default_for_width (int @var{width})
308Returns the default output format for a variable of the given
309@var{width}.  For a numeric variable, this is F8.2 format; for a
310string variable, it is the A format of the given @var{width}.
311@end deftypefun
312
313The following functions check whether a @struct{fmt_spec} is valid for
314various uses and return true if so, false otherwise.  When any of them
315returns false, it also outputs an explanatory error message using
316@func{msg}.  To suppress error output, enclose a call to one of these
317functions by a @func{msg_disable}/@func{msg_enable} pair.
318
319@deftypefun bool fmt_check (const struct fmt_spec *@var{format}, bool @var{for_input})
320@deftypefunx bool fmt_check_input (const struct fmt_spec *@var{format})
321@deftypefunx bool fmt_check_output (const struct fmt_spec *@var{format})
322Checks whether @var{format} is a valid input format (for
323@func{fmt_check_input}, or @func{fmt_check} if @var{for_input}) or
324output format (for @func{fmt_check_output}, or @func{fmt_check} if not
325@var{for_input}).
326@end deftypefun
327
328@deftypefun bool fmt_check_type_compat (const struct fmt_spec *@var{format}, enum val_type @var{type})
329Checks whether @var{format} matches the value type @var{type}, that
330is, if @var{type} is @code{VAL_NUMERIC} and @var{format} is a numeric
331format or @var{type} is @code{VAL_STRING} and @var{format} is a string
332format.
333@end deftypefun
334
335@deftypefun bool fmt_check_width_compat (const struct fmt_spec *@var{format}, int @var{width})
336Checks whether @var{format} may be used as an output format for a
337value of the given @var{width}.
338
@func{fmt_var_width}, described in
the following section, can also be used to determine the value
width needed by a format.
342@end deftypefun
343
344@node Format Utility Functions
345@subsection Format Utility Functions
346
347These functions work with @struct{fmt_spec}s.
348
349@deftypefun int fmt_var_width (const struct fmt_spec *@var{format})
350Returns the width for values associated with @var{format}.  If
351@var{format} is a numeric format, the width is 0; if @var{format} is
an A format, then the width is @code{@var{format}->w}; otherwise,
353@var{format} is an AHEX format and its width is @code{@var{format}->w
354/ 2}.
355@end deftypefun
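
The three cases can be written out as a small standalone model.  The
@code{sketch_} types below are hypothetical miniatures that cover only
three format types; they are not the declarations in
@file{data/format.h}.

@example
/* Hypothetical miniature of struct fmt_spec with three format
   types, for illustration only. */
enum sketch_fmt_type @{ SKETCH_FMT_F, SKETCH_FMT_A, SKETCH_FMT_AHEX @};

struct sketch_fmt_spec
  @{
    enum sketch_fmt_type type;
    int w;                      /* Field width. */
    int d;                      /* Decimal places. */
  @};

/* Value width implied by a format: 0 for numeric formats, w for A,
   and w / 2 for AHEX, which uses two hex digits per byte. */
static int
sketch_var_width (const struct sketch_fmt_spec *format)
@{
  switch (format->type)
    @{
    case SKETCH_FMT_A:
      return format->w;
    case SKETCH_FMT_AHEX:
      return format->w / 2;
    default:
      return 0;
    @}
@}
@end example

With this model, F8.2 yields 0, A10 yields 10, and AHEX16 yields 8.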
356
357@deftypefun char *fmt_to_string (const struct fmt_spec *@var{format}, char @var{s}[FMT_STRING_LEN_MAX + 1])
358Converts @var{format} to a human-readable format specifier in @var{s}
359and returns @var{s}.  @var{format} need not be a valid input or output
360format specifier, e.g.@: it is allowed to have an excess width or
361decimal places.  In particular, if @var{format} has decimals, they are
362included in the output string, even if @var{format}'s type does not
363allow decimals, to allow accurately presenting incorrect formats to
364the user.
365@end deftypefun
366
367@deftypefun bool fmt_equal (const struct fmt_spec *@var{a}, const struct fmt_spec *@var{b})
368Compares @var{a} and @var{b} memberwise and returns true if they are
identical, false otherwise.  @var{a} and @var{b} need not be valid input or
output format specifiers.
371@end deftypefun
372
373@deftypefun void fmt_resize (struct fmt_spec *@var{fmt}, int @var{width})
Adjusts the width of @var{fmt} so that it is a valid format for a
@union{value} of width @var{width}.
375@end deftypefun
376
377@node Obtaining Properties of Format Types
378@subsection Obtaining Properties of Format Types
379
380These functions work with @enum{fmt_type}s instead of the higher-level
381@struct{fmt_spec}s.  Their primary purpose is to report properties of
382each possible format type, which in turn allows clients to abstract
383away many of the details of the very heterogeneous requirements of
384each format type.
385
386The first group of functions works with format type names.
387
388@deftypefun const char *fmt_name (enum fmt_type @var{type})
389Returns the name for the given @var{type}, e.g.@: @code{"COMMA"} for
390@code{FMT_COMMA}.
391@end deftypefun
392
393@deftypefun bool fmt_from_name (const char *@var{name}, enum fmt_type *@var{type})
394Tries to find the @enum{fmt_type} associated with @var{name}.  If
395successful, sets @code{*@var{type}} to the type and returns true;
396otherwise, returns false without modifying @code{*@var{type}}.
397@end deftypefun
398
399The functions below query basic limits on width and decimal places for
400each kind of format.
401
402@deftypefun bool fmt_takes_decimals (enum fmt_type @var{type})
403Returns true if a format of the given @var{type} is allowed to have a
404nonzero number of decimal places (the @code{d} member of
405@struct{fmt_spec}), false if not.
406@end deftypefun
407
408@anchor{fmt_min_input_width}
409@anchor{fmt_max_input_width}
410@anchor{fmt_min_output_width}
411@anchor{fmt_max_output_width}
412@deftypefun int fmt_min_input_width (enum fmt_type @var{type})
413@deftypefunx int fmt_max_input_width (enum fmt_type @var{type})
414@deftypefunx int fmt_min_output_width (enum fmt_type @var{type})
415@deftypefunx int fmt_max_output_width (enum fmt_type @var{type})
416Returns the minimum or maximum width (the @code{w} member of
417@struct{fmt_spec}) allowed for an input or output format of the
418specified @var{type}.
419@end deftypefun
420
421@anchor{fmt_max_input_decimals}
422@anchor{fmt_max_output_decimals}
423@deftypefun int fmt_max_input_decimals (enum fmt_type @var{type}, int @var{width})
424@deftypefunx int fmt_max_output_decimals (enum fmt_type @var{type}, int @var{width})
425Returns the maximum number of decimal places allowed for an input or
426output format, respectively, of the given @var{type} and @var{width}.
427Returns 0 if the specified @var{type} does not allow any decimal
428places or if @var{width} is too narrow to allow decimal places.
429@end deftypefun
430
431@deftypefun int fmt_step_width (enum fmt_type @var{type})
432Returns the ``width step'' for a @struct{fmt_spec} of the given
433@var{type}.  A @struct{fmt_spec}'s width must be a multiple of its
434type's width step.  Most format types have a width step of 1, so that
435their formats' widths may be any integer within the valid range, but
436hexadecimal numeric formats and AHEX string formats have a width step
437of 2.
438@end deftypefun
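
Taken together, the minimum width, maximum width, and width step define
which widths a format type accepts.  The helper below is a hypothetical
standalone sketch; in PSPP the three limits would come from functions
such as @func{fmt_min_output_width}, @func{fmt_max_output_width}, and
@func{fmt_step_width} rather than being passed in by hand.

@example
#include <stdbool.h>

/* Returns true if W lies within [MIN_W, MAX_W] and is a multiple
   of STEP.  Hypothetical helper; the limits are supplied by the
   caller. */
static bool
sketch_width_ok (int w, int min_w, int max_w, int step)
@{
  return w >= min_w && w <= max_w && w % step == 0;
@}
@end example

For a type with a width step of 2, an odd width is rejected even when it
lies between the minimum and maximum.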
439
440These functions allow clients to broadly determine how each kind of
441input or output format behaves.
442
443@deftypefun bool fmt_is_string (enum fmt_type @var{type})
444@deftypefunx bool fmt_is_numeric (enum fmt_type @var{type})
Returns true if @var{type} is a format for string or numeric values,
respectively, false otherwise.
447@end deftypefun
448
449@deftypefun enum fmt_category fmt_get_category (enum fmt_type @var{type})
450Returns the category within which @var{type} falls.
451
452@deftp {Enumeration} {enum fmt_category}
453A group of format types.  Format type categories correspond to the
454input and output categories described in the PSPP user documentation
455(@pxref{Input and Output Formats,,,pspp, PSPP Users Guide}).
456
457Each format is in exactly one category.  The categories have bitwise
458disjoint values to make it easy to test whether a format type is in
459one of multiple categories, e.g.@:
460
461@example
462if (fmt_get_category (type) & (FMT_CAT_DATE | FMT_CAT_TIME))
463  @{
464    /* @dots{}@r{@code{type} is a date or time format}@dots{} */
465  @}
466@end example
467
468The format categories are:
469@table @code
470@item FMT_CAT_BASIC
471Basic numeric formats.
472
473@item FMT_CAT_CUSTOM
474Custom currency formats.
475
476@item FMT_CAT_LEGACY
477Legacy numeric formats.
478
479@item FMT_CAT_BINARY
480Binary formats.
481
482@item FMT_CAT_HEXADECIMAL
483Hexadecimal formats.
484
485@item FMT_CAT_DATE
486Date formats.
487
488@item FMT_CAT_TIME
489Time formats.
490
491@item FMT_CAT_DATE_COMPONENT
492Date component formats.
493
494@item FMT_CAT_STRING
495String formats.
496@end table
497@end deftp
498@end deftypefun
499
500The PSPP input and output routines use the following pair of functions
501to convert @enum{fmt_type}s to and from the separate set of codes used
502in system and portable files:
503
504@deftypefun int fmt_to_io (enum fmt_type @var{type})
505Returns the format code used in system and portable files that
506corresponds to @var{type}.
507@end deftypefun
508
509@deftypefun bool fmt_from_io (int @var{io}, enum fmt_type *@var{type})
510Converts @var{io}, a format code used in system and portable files,
511into a @enum{fmt_type} in @code{*@var{type}}.  Returns true if
512successful, false if @var{io} is not valid.
513@end deftypefun
514
515These functions reflect the relationship between input and output
516formats.
517
518@deftypefun enum fmt_type fmt_input_to_output (enum fmt_type @var{type})
519Returns the output format type that is used by default by DATA LIST
520and other input procedures when @var{type} is specified as an input
521format.  The conversion from input format to output format is more
complicated than simply changing the format type.
523@xref{fmt_for_output_from_input}, for a function that performs the
524entire conversion.
525@end deftypefun
526
527@deftypefun bool fmt_usable_for_input (enum fmt_type @var{type})
528Returns true if @var{type} may be used as an input format type, false
529otherwise.  The custom currency formats, in particular, may be used
530for output but not for input.
531
532All format types are valid for output.
533@end deftypefun
534
The final group of format type property functions obtains
536human-readable templates that illustrate the formats graphically.
537
538@deftypefun const char *fmt_date_template (enum fmt_type @var{type})
539Returns a formatting template for @var{type}, which must be a date or
540time format type.  These formats are used by @func{data_in} and
541@func{data_out} to guide parsing and formatting date and time data.
542@end deftypefun
543
544@deftypefun char *fmt_dollar_template (const struct fmt_spec *@var{format})
545Returns a string of the form @code{$#,###.##} according to
546@var{format}, which must be of type @code{FMT_DOLLAR}.  The caller
547must free the string with @code{free}.
548@end deftypefun
549
550@node Numeric Formatting Styles
551@subsection Numeric Formatting Styles
552
553Each of the basic numeric formats (F, E, COMMA, DOT, DOLLAR, PCT) and
554custom currency formats (CCA, CCB, CCC, CCD, CCE) has an associated
555numeric formatting style, represented by @struct{fmt_number_style}.
556Input and output conversion of formats that have numeric styles is
557determined mainly by the style, although the formatting rules have
558special cases that are not represented within the style.
559
560@deftp {Structure} {struct fmt_number_style}
561A structure type with the following members:
562
563@table @code
564@item struct substring neg_prefix
565@itemx struct substring prefix
566@itemx struct substring suffix
567@itemx struct substring neg_suffix
A set of strings used as a prefix to negative numbers, a prefix to every
number, a suffix to every number, and a suffix to negative numbers,
respectively.  Each of these strings is no more than
@code{FMT_STYLE_AFFIX_MAX} bytes (currently 16) in length.
572These strings must be freed with @func{ss_dealloc} when no longer
573needed.
574
575@item decimal
576The character used as a decimal point.  It must be either @samp{.} or
577@samp{,}.
578
579@item grouping
580The character used for grouping digits to the left of the decimal
581point.  It may be @samp{.} or @samp{,}, in which case it must not be
582equal to @code{decimal}, or it may be set to 0 to disable grouping.
583@end table
584@end deftp
585
586The following functions are provided for working with numeric
587formatting styles.
588
589@deftypefun void fmt_number_style_init (struct fmt_number_style *@var{style})
Initializes a @struct{fmt_number_style} with all of the
prefixes and suffixes set to the empty string, @samp{.} as the decimal
point character, and grouping disabled.
593@end deftypefun
594
595
596@deftypefun void fmt_number_style_destroy (struct fmt_number_style *@var{style})
597Destroys @var{style}, freeing its storage.
598@end deftypefun
599
@deftypefun {struct fmt_number_style *} fmt_create (void)
Creates an array of all the styles used by PSPP and
calls @func{fmt_number_style_init} on each of them.
603@end deftypefun
604
605@deftypefun void fmt_done (struct fmt_number_style *@var{styles})
A wrapper function that takes an array of @struct{fmt_number_style}s, calls
@func{fmt_number_style_destroy} on each of them, and then frees the array.
608@end deftypefun
609
610
611
612@deftypefun int fmt_affix_width (const struct fmt_number_style *@var{style})
613Returns the total length of @var{style}'s @code{prefix} and @code{suffix}.
614@end deftypefun
615
616@deftypefun int fmt_neg_affix_width (const struct fmt_number_style *@var{style})
617Returns the total length of @var{style}'s @code{neg_prefix} and
618@code{neg_suffix}.
619@end deftypefun
620
621PSPP maintains a global set of number styles for each of the basic
622numeric formats and custom currency formats.  The following functions
623work with these global styles:
624
625@deftypefun {const struct fmt_number_style *} fmt_get_style (enum fmt_type @var{type})
626Returns the numeric style for the given format @var{type}.
627@end deftypefun
628
629@deftypefun {const char *} fmt_name (enum fmt_type @var{type})
630Returns the name of the given format @var{type}.
631@end deftypefun
632
633
634
635@node Formatted Data Input and Output
636@subsection Formatted Data Input and Output
637
638These functions provide the ability to convert data fields into
639@union{value}s and vice versa.
640
641@deftypefun bool data_in (struct substring @var{input}, const char *@var{encoding}, enum fmt_type @var{type}, int @var{implied_decimals}, int @var{first_column}, const struct dictionary *@var{dict}, union value *@var{output}, int @var{width})
642Parses @var{input} as a field containing data in the given format
643@var{type}.  The resulting value is stored in @var{output}, which the
644caller must have initialized with the given @var{width}.  For
645consistency, @var{width} must be 0 if
646@var{type} is a numeric format type and greater than 0 if @var{type}
647is a string format type.
648@var{encoding} should be set to indicate the character
649encoding of @var{input}.
650@var{dict} must be a pointer to the dictionary with which @var{output}
651is associated.
652
653If @var{input} is the empty string (with length 0), @var{output} is
654set to the value set on SET BLANKS (@pxref{SET BLANKS,,,pspp, PSPP
655Users Guide}) for a numeric value, or to all spaces for a string
656value.  This applies regardless of the usual parsing requirements for
657@var{type}.
658
659If @var{implied_decimals} is greater than zero, then the numeric
660result is shifted right by @var{implied_decimals} decimal places if
661@var{input} does not contain a decimal point character or an exponent.
662Only certain numeric format types support implied decimal places; for
663string formats and other numeric formats, @var{implied_decimals} has
664no effect.  DATA LIST FIXED is the primary user of this feature
665(@pxref{DATA LIST FIXED,,,pspp, PSPP Users Guide}).  Other callers
666should generally specify 0 for @var{implied_decimals}, to disable this
667feature.
668
669When @var{input} contains invalid input data, @func{data_in} outputs a
670message using @func{msg}.
671@c (@pxref{msg}).
672If @var{first_column} is
673nonzero, it is included in any such error message as the 1-based
674column number of the start of the field.  The last column in the field
is calculated as @var{first_column} plus the length of @var{input} minus 1.  To
676suppress error output, enclose the call to @func{data_in} by calls to
677@func{msg_disable} and @func{msg_enable}.
678
679This function returns true on success, false if a message was output
680(even if suppressed).  Overflow and underflow provoke warnings but are
681not propagated to the caller as errors.
682
683This function is declared in @file{data/data-in.h}.
684@end deftypefun
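
The effect of implied decimals and the column arithmetic used in error
messages can be illustrated with a standalone model.  This is not how
@func{data_in} is implemented: the function names are hypothetical, and
the sketch assumes a null-terminated field of plain decimal digits in
which @samp{.} is the decimal point and @samp{e} or @samp{E} introduces
an exponent.

@example
#include <stdlib.h>
#include <string.h>

/* Hypothetical model of implied decimals.  If FIELD contains no
   decimal point and no exponent, the parsed number is divided by
   10 once per implied decimal place. */
static double
sketch_parse_implied (const char *field, int implied_decimals)
@{
  double value = strtod (field, NULL);
  if (implied_decimals > 0 && strpbrk (field, ".eE") == NULL)
    for (int i = 0; i < implied_decimals; i++)
      value /= 10.0;
  return value;
@}

/* 1-based column of the last byte of a field of LENGTH bytes that
   starts in FIRST_COLUMN. */
static int
sketch_last_column (int first_column, size_t length)
@{
  return first_column + (int) length - 1;
@}
@end example

With two implied decimals, the field @samp{12345} is read as
approximately 123.45, while @samp{123.45} is left alone because it
already contains a decimal point.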
685
686@deftypefun char * data_out (const union value *@var{input}, const struct fmt_spec *@var{format})
687@deftypefunx char * data_out_legacy (const union value *@var{input}, const char *@var{encoding}, const struct fmt_spec *@var{format})
688Converts the data pointed to by @var{input} into a string value, which
will be encoded in UTF-8, according to output format specifier @var{format}.
@var{format} must be a valid output format.  The width of @var{input} is
692inferred from @var{format} using an algorithm equivalent to
693@func{fmt_var_width}.
694
695When @var{input} contains data that cannot be represented in the given
696@var{format}, @func{data_out} may output a message using @func{msg},
697@c (@pxref{msg}),
698although the current implementation does not
699consistently do so.  To suppress error output, enclose the call to
700@func{data_out} by calls to @func{msg_disable} and @func{msg_enable}.
701
702This function is declared in @file{data/data-out.h}.
703@end deftypefun
704
705@node User-Missing Values
706@section User-Missing Values
707
708In addition to the system-missing value for numeric values, each
709variable has a set of user-missing values (@pxref{MISSING
710VALUES,,,pspp, PSPP Users Guide}).  A set of user-missing values is
711represented by @struct{missing_values}.
712
713It is rarely necessary to interact directly with a
714@struct{missing_values} object.  Instead, the most common operation,
715querying whether a particular value is a missing value for a given
716variable, is most conveniently executed through functions on
717@struct{variable}.  @xref{Variable Missing Values}, for details.
718
719A @struct{missing_values} is essentially a set of @union{value}s that
720have a common value width (@pxref{Values}).  For a set of
721missing values associated with a variable (the common case), the set's
722width is the same as the variable's width.
723
724Function prototypes and other declarations related to missing values
725are declared in @file{data/missing-values.h}.
726
727@deftp {Structure} {struct missing_values}
728Opaque type that represents a set of missing values.
729@end deftp
730
The contents of a set of missing values are subject to some
732restrictions.  Regardless of width, a set of missing values is allowed
733to be empty.  A set of numeric missing values may contain up to three
734discrete numeric values, or a range of numeric values (which includes
735both ends of the range), or a range plus one discrete numeric value.
736A set of string missing values may contain up to three discrete string
737values (with the same width as the set), but ranges are not supported.
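
The shape of a numeric set can be modeled as follows.  Because
@struct{missing_values} is opaque, the structure and function names here
are hypothetical and say nothing about PSPP's actual representation;
they only restate the restrictions above in code.

@example
#include <stdbool.h>

/* Hypothetical model of a numeric user-missing value set. */
struct sketch_num_missing
  @{
    int n_values;               /* Number of discrete values. */
    double values[3];           /* The discrete values. */
    bool has_range;             /* True if a range is present. */
    double lo, hi;              /* Range endpoints, both inclusive. */
  @};

/* At most three discrete values without a range, or at most one
   discrete value alongside a range. */
static bool
sketch_set_is_valid (const struct sketch_num_missing *mv)
@{
  int max_values = mv->has_range ? 1 : 3;
  return mv->n_values >= 0 && mv->n_values <= max_values;
@}

/* Returns true if X is one of the discrete values or lies within
   the inclusive range. */
static bool
sketch_is_user_missing (const struct sketch_num_missing *mv, double x)
@{
  for (int i = 0; i < mv->n_values; i++)
    if (mv->values[i] == x)
      return true;
  return mv->has_range && x >= mv->lo && x <= mv->hi;
@}
@end example

A set holding the range 1 through 5 plus the discrete value 99 treats 1,
5, and 99 as missing, but not 6.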
738
739In addition, values in string missing values wider than
740@code{MV_MAX_STRING} bytes may contain non-space characters only in
741their first @code{MV_MAX_STRING} bytes; all the bytes after the first
742@code{MV_MAX_STRING} must be spaces.  @xref{mv_is_acceptable}, for a
743function that tests a value against these constraints.
744
745@deftypefn Macro int MV_MAX_STRING
746Number of bytes in a string missing value that are not required to be
747spaces.  The current value is 8, a value which is fixed by the system
748file format.  In PSPP we could easily eliminate this restriction, but
749doing so would also require us to extend the system file format in an
750incompatible way, which we consider a bad tradeoff.
751@end deftypefn
752
The most commonly useful functions for missing values are those for
testing whether a given value is missing, described in the following
section.  Several other functions for creating, inspecting, and
modifying @struct{missing_values} objects are described afterward, but
these functions are needed much less often.
758
759@menu
760* Testing for Missing Values::
761* Creating and Destroying User-Missing Values::
762* Changing User-Missing Value Set Width::
763* Inspecting User-Missing Value Sets::
764* Modifying User-Missing Value Sets::
765@end menu
766
767@node Testing for Missing Values
768@subsection Testing for Missing Values
769
770The most often useful functions for missing values are those for
771testing whether a given value is missing, described here.  However,
772using one of the corresponding missing value testing functions for
773variables can be even easier (@pxref{Variable Missing Values}).
774
775@deftypefun bool mv_is_value_missing (const struct missing_values *@var{mv}, const union value *@var{value}, enum mv_class @var{class})
776@deftypefunx bool mv_is_num_missing (const struct missing_values *@var{mv}, double @var{value}, enum mv_class @var{class})
777@deftypefunx bool mv_is_str_missing (const struct missing_values *@var{mv}, const char @var{value}[], enum mv_class @var{class})
778Tests whether @var{value} is in one of the categories of missing
779values given by @var{class}.  Returns true if so, false otherwise.
780
781@var{mv} determines the width of @var{value} and provides the set of
782user-missing values to test.
783
The only difference among these functions is the form in which
785@var{value} is provided, so you may use whichever function is most
786convenient.
787
788The @var{class} argument determines the exact kinds of missing values
789that the functions test for:
790
791@deftp Enumeration {enum mv_class}
792@table @t
793@item MV_USER
794Returns true if @var{value} is in the set of user-missing values given
795by @var{mv}.
796
797@item MV_SYSTEM
798Returns true if @var{value} is system-missing.  (If @var{mv}
799represents a set of string values, then @var{value} is never
800system-missing.)
801
802@item MV_ANY
803@itemx MV_USER | MV_SYSTEM
804Returns true if @var{value} is user-missing or system-missing.
805
806@item MV_NONE
807Always returns false, that is, @var{value} is never considered
808missing.
809@end table
810@end deftp
811@end deftypefun
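
To illustrate how the @var{class} argument selects between the two
kinds of missingness, the following self-contained sketch models a
numeric set of discrete user-missing values.  It is not PSPP's
implementation: every name prefixed with @code{sketch_} or
@code{SKETCH_} is hypothetical, the enumerator values are assumed, and
@code{-DBL_MAX} merely stands in for the system-missing value.

```c
#include <assert.h>
#include <float.h>
#include <stdbool.h>

/* Stand-in for the system-missing value (an assumption of this sketch). */
#define SKETCH_SYSMIS (-DBL_MAX)

/* Hypothetical bit-mask that mirrors the classes of enum mv_class. */
enum sketch_mv_class
  {
    SKETCH_MV_NONE = 0,
    SKETCH_MV_USER = 1,
    SKETCH_MV_SYSTEM = 2,
    SKETCH_MV_ANY = SKETCH_MV_USER | SKETCH_MV_SYSTEM
  };

/* Simplified numeric set: up to three discrete values and no range. */
struct sketch_mv
  {
    double values[3];
    int n_values;
  };

/* Returns true if X falls in one of the categories selected by CLS. */
bool
sketch_is_num_missing (const struct sketch_mv *mv, double x,
                       enum sketch_mv_class cls)
{
  assert (mv->n_values >= 0 && mv->n_values <= 3);

  if ((cls & SKETCH_MV_SYSTEM) && x == SKETCH_SYSMIS)
    return true;
  if (cls & SKETCH_MV_USER)
    for (int i = 0; i < mv->n_values; i++)
      if (mv->values[i] == x)
        return true;
  return false;
}
```

With a set that holds 9 and 99, the sketch reports 9 as missing for
the user and ``any'' classes only, and reports the stand-in
system-missing value as missing for the system and ``any'' classes
only.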
812
813@node Creating and Destroying User-Missing Values
814@subsection Creation and Destruction
815
816These functions create and destroy @struct{missing_values} objects.
817
818@deftypefun void mv_init (struct missing_values *@var{mv}, int @var{width})
819Initializes @var{mv} as a set of user-missing values.  The set is
820initially empty.  Any values added to it must have the specified
821@var{width}.
822@end deftypefun
823
824@deftypefun void mv_destroy (struct missing_values *@var{mv})
825Destroys @var{mv}, which must not be referred to again.
826@end deftypefun
827
828@deftypefun void mv_copy (struct missing_values *@var{mv}, const struct missing_values *@var{old})
829Initializes @var{mv} as a copy of the existing set of user-missing
830values @var{old}.
831@end deftypefun
832
833@deftypefun void mv_clear (struct missing_values *@var{mv})
834Empties the user-missing value set @var{mv}, retaining its existing
835width.
836@end deftypefun
837
838@node Changing User-Missing Value Set Width
839@subsection Changing User-Missing Value Set Width
840
841A few PSPP language constructs copy sets of user-missing values from
842one variable to another.  When the source and target variables have
843the same width, this is simple.  But when the target variable's width
844might be different from the source variable's, it takes a little more
845work.  The functions described here can help.
846
847In fact, it is usually unnecessary to call these functions directly.
848Most of the time @func{var_set_missing_values}, which uses
849@func{mv_resize} internally to resize the new set of missing values to
850the required width, may be used instead.
851@xref{var_set_missing_values}, for more information.
852
853@deftypefun bool mv_is_resizable (const struct missing_values *@var{mv}, int @var{new_width})
854Tests whether @var{mv}'s width may be changed to @var{new_width} using
855@func{mv_resize}.  Returns true if it is allowed, false otherwise.
856
857If @var{mv} contains any missing values, then it may be resized only
858if each missing value may be resized, as determined by
859@func{value_is_resizable} (@pxref{value_is_resizable}).
860@end deftypefun
861
862@anchor{mv_resize}
863@deftypefun void mv_resize (struct missing_values *@var{mv}, int @var{width})
Changes @var{mv}'s width to @var{width}.  @var{mv} and @var{width}
must satisfy the constraints that @func{mv_is_resizable} tests.
866
867When a string missing value set's width is increased, each
868user-missing value is padded on the right with spaces to the new
869width.
870@end deftypefun
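
The resizing rule for a single string value can be sketched as
follows.  This is a minimal illustration, not PSPP code: the
@code{sketch_} functions are hypothetical, and the narrowing rule
(only trailing spaces may be dropped) is an assumption about what
@func{value_is_resizable} checks.

```c
#include <assert.h>
#include <stdbool.h>
#include <string.h>

/* Returns true if the OLD_WIDTH bytes at S may be resized to NEW_WIDTH.
   Widening always succeeds; narrowing is assumed to succeed only if
   every dropped byte is a space. */
bool
sketch_str_is_resizable (const char *s, int old_width, int new_width)
{
  for (int i = new_width; i < old_width; i++)
    if (s[i] != ' ')
      return false;
  return true;
}

/* Copies the OLD_WIDTH bytes at SRC into DST, which must have room for
   NEW_WIDTH bytes, padding on the right with spaces when widening. */
void
sketch_str_resize (char *dst, const char *src, int old_width, int new_width)
{
  assert (sketch_str_is_resizable (src, old_width, new_width));

  int n = old_width < new_width ? old_width : new_width;
  memcpy (dst, src, n);
  if (new_width > n)
    memset (dst + n, ' ', new_width - n);
}
```

Widening the two-byte value @samp{ab} to width 5 yields @samp{ab}
followed by three spaces.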
871
872@node Inspecting User-Missing Value Sets
873@subsection Inspecting User-Missing Value Sets
874
875These functions inspect the properties and contents of
876@struct{missing_values} objects.
877
878The first set of functions inspects the discrete values that sets of
879user-missing values may contain:
880
881@deftypefun bool mv_is_empty (const struct missing_values *@var{mv})
882Returns true if @var{mv} contains no user-missing values, false if it
883contains at least one user-missing value (either a discrete value or a
884numeric range).
885@end deftypefun
886
887@deftypefun int mv_get_width (const struct missing_values *@var{mv})
888Returns the width of the user-missing values that @var{mv} represents.
889@end deftypefun
890
891@deftypefun int mv_n_values (const struct missing_values *@var{mv})
892Returns the number of discrete user-missing values included in
893@var{mv}.  The return value will be between 0 and 3.  For sets of
894numeric user-missing values that include a range, the return value
895will be 0 or 1.
896@end deftypefun
897
898@deftypefun bool mv_has_value (const struct missing_values *@var{mv})
Returns true if @var{mv} has at least one discrete user-missing
value, that is, if @func{mv_n_values} would return nonzero for
901@var{mv}.
902@end deftypefun
903
904@deftypefun {const union value *} mv_get_value (const struct missing_values *@var{mv}, int @var{index})
905Returns the discrete user-missing value in @var{mv} with the given
906@var{index}.  The caller must not modify or free the returned value or
907refer to it after modifying or freeing @var{mv}.  The index must be
908less than the number of discrete user-missing values in @var{mv}, as
909reported by @func{mv_n_values}.
910@end deftypefun
911
912The second set of functions inspects the single range of values that
913numeric sets of user-missing values may contain:
914
915@deftypefun bool mv_has_range (const struct missing_values *@var{mv})
916Returns true if @var{mv} includes a range, false otherwise.
917@end deftypefun
918
919@deftypefun void mv_get_range (const struct missing_values *@var{mv}, double *@var{low}, double *@var{high})
920Stores the low endpoint of @var{mv}'s range in @code{*@var{low}} and
921the high endpoint of the range in @code{*@var{high}}.  @var{mv} must
922include a range.
923@end deftypefun
924
925@node Modifying User-Missing Value Sets
926@subsection Modifying User-Missing Value Sets
927
928These functions modify the contents of @struct{missing_values}
929objects.
930
The first set of functions applies to all sets of user-missing values:
932
933@deftypefun bool mv_add_value (struct missing_values *@var{mv}, const union value *@var{value})
934@deftypefunx bool mv_add_str (struct missing_values *@var{mv}, const char @var{value}[])
935@deftypefunx bool mv_add_num (struct missing_values *@var{mv}, double @var{value})
Attempts to add the given discrete @var{value} to the set of user-missing
937values @var{mv}.  @var{value} must have the same width as @var{mv}.
938Returns true if @var{value} was successfully added, false if the set
939could not accept any more discrete values or if @var{value} is not an
940acceptable user-missing value (see @func{mv_is_acceptable} below).
941
942These functions are equivalent, except for the form in which
943@var{value} is provided, so you may use whichever function is most
944convenient.
945@end deftypefun
946
947@deftypefun void mv_pop_value (struct missing_values *@var{mv}, union value *@var{value})
948Removes a discrete value from @var{mv} (which must contain at least
949one discrete value) and stores it in @var{value}.
950@end deftypefun
951
952@deftypefun bool mv_replace_value (struct missing_values *@var{mv}, const union value *@var{value}, int @var{index})
953Attempts to replace the discrete value with the given @var{index} in
954@var{mv} (which must contain at least @var{index} + 1 discrete values)
955by @var{value}.  Returns true if successful, false if @var{value} is
956not an acceptable user-missing value (see @func{mv_is_acceptable}
957below).
958@end deftypefun
959
960@deftypefun bool mv_is_acceptable (const union value *@var{value}, int @var{width})
961@anchor{mv_is_acceptable}
962Returns true if @var{value}, which must have the specified
963@var{width}, may be added to a missing value set of the same
964@var{width}, false if it cannot.  As described above, all numeric
965values and string values of width @code{MV_MAX_STRING} or less may be
added, but string values of greater width may be added only if bytes
967beyond the first @code{MV_MAX_STRING} are all spaces.
968@end deftypefun
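
For string values, the acceptance rule amounts to a scan of the bytes
past the first @code{MV_MAX_STRING}.  The sketch below is illustrative
only: @code{sketch_str_is_acceptable} and @code{SKETCH_MV_MAX_STRING}
are hypothetical names, and it works on a plain byte array instead of
a @union{value}.

```c
#include <assert.h>
#include <stdbool.h>

/* Mirrors the documented current value of MV_MAX_STRING. */
#define SKETCH_MV_MAX_STRING 8

/* Returns true if the WIDTH bytes at S could serve as a string
   user-missing value: any bytes past the first SKETCH_MV_MAX_STRING
   must all be spaces. */
bool
sketch_str_is_acceptable (const char *s, int width)
{
  assert (width >= 1);

  for (int i = SKETCH_MV_MAX_STRING; i < width; i++)
    if (s[i] != ' ')
      return false;
  return true;
}
```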
969
970The second set of functions applies only to numeric sets of
971user-missing values:
972
973@deftypefun bool mv_add_range (struct missing_values *@var{mv}, double @var{low}, double @var{high})
974Attempts to add a numeric range covering @var{low}@dots{}@var{high}
975(inclusive on both ends) to @var{mv}, which must be a numeric set of
user-missing values.  Returns true if the range is successfully added,
977false on failure.  Fails if @var{mv} already contains a range, or if
978@var{mv} contains more than one discrete value, or if @var{low} >
979@var{high}.
980@end deftypefun
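
The capacity rules for numeric sets (up to three discrete values, or
a range, or a range plus one discrete value) can be modeled with a
small structure.  The following is a sketch under those documented
rules, not PSPP's implementation; all @code{sketch_} names are
hypothetical.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical model of a numeric set of user-missing values. */
struct sketch_num_mv
  {
    double values[3];
    int n_values;
    bool has_range;
    double low, high;
  };

/* Makes MV an empty set. */
void
sketch_init (struct sketch_num_mv *mv)
{
  mv->n_values = 0;
  mv->has_range = false;
  mv->low = mv->high = 0.0;
}

/* Adds discrete value X.  A set that holds a range has room for only
   one discrete value; otherwise it has room for three. */
bool
sketch_add_num (struct sketch_num_mv *mv, double x)
{
  assert (mv->n_values >= 0 && mv->n_values <= 3);

  int max_values = mv->has_range ? 1 : 3;
  if (mv->n_values >= max_values)
    return false;
  mv->values[mv->n_values++] = x;
  return true;
}

/* Adds the range LOW...HIGH, failing if a range is already present,
   if more than one discrete value is present, or if LOW > HIGH. */
bool
sketch_add_range (struct sketch_num_mv *mv, double low, double high)
{
  if (mv->has_range || mv->n_values > 1 || low > high)
    return false;
  mv->has_range = true;
  mv->low = low;
  mv->high = high;
  return true;
}
```

In this model, adding two discrete values and then a range fails at
the range, whereas adding a range first leaves room for exactly one
discrete value.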
981
982@deftypefun void mv_pop_range (struct missing_values *@var{mv}, double *@var{low}, double *@var{high})
983Given @var{mv}, which must be a numeric set of user-missing values
984that contains a range, removes that range from @var{mv} and stores its
985low endpoint in @code{*@var{low}} and its high endpoint in
986@code{*@var{high}}.
987@end deftypefun
988
989@node Value Labels
990@section Value Labels
991
992Each variable has a set of value labels (@pxref{VALUE LABELS,,,pspp,
993PSPP Users Guide}), represented as @struct{val_labs}.  A
994@struct{val_labs} is essentially a map from @union{value}s to strings.
995All of the values in a set of value labels have the same width, which
996for a set of value labels owned by a variable (the common case) is the
same as the variable's width.
998
999Sets of value labels may contain any number of entries.
1000
1001It is rarely necessary to interact directly with a @struct{val_labs}
1002object.  Instead, the most common operation, looking up the label for
1003a value of a given variable, can be conveniently executed through
1004functions on @struct{variable}.  @xref{Variable Value Labels}, for
1005details.
1006
Function prototypes and other declarations related to value labels
are found in @file{data/value-labels.h}.
1009
1010@deftp {Structure} {struct val_labs}
1011Opaque type that represents a set of value labels.
1012@end deftp
1013
1014The most often useful function for value labels is
1015@func{val_labs_find}, for looking up the label associated with a
1016value.
1017
1018@deftypefun {char *} val_labs_find (const struct val_labs *@var{val_labs}, union value @var{value})
1019Looks in @var{val_labs} for a label for the given @var{value}.
1020Returns the label, if one is found, or a null pointer otherwise.
1021@end deftypefun
1022
Several other functions for working with value labels are described in
the following sections, but these are needed less often.
1025
1026@menu
1027* Value Labels Creation and Destruction::
1028* Value Labels Properties::
1029* Value Labels Adding and Removing Labels::
1030* Value Labels Iteration::
1031@end menu
1032
1033@node Value Labels Creation and Destruction
1034@subsection Creation and Destruction
1035
1036These functions create and destroy @struct{val_labs} objects.
1037
1038@deftypefun {struct val_labs *} val_labs_create (int @var{width})
1039Creates and returns an initially empty set of value labels with the
1040given @var{width}.
1041@end deftypefun
1042
1043@deftypefun {struct val_labs *} val_labs_clone (const struct val_labs *@var{val_labs})
1044Creates and returns a set of value labels whose width and contents are
the same as those of @var{val_labs}.
1046@end deftypefun
1047
@deftypefun void val_labs_clear (struct val_labs *@var{val_labs})
Deletes all value labels from @var{val_labs}.
1050@end deftypefun
1051
@deftypefun void val_labs_destroy (struct val_labs *@var{val_labs})
Destroys @var{val_labs}, which must not be referenced again.
1054@end deftypefun
1055
1056@node Value Labels Properties
1057@subsection Value Labels Properties
1058
1059These functions inspect and manipulate basic properties of
1060@struct{val_labs} objects.
1061
1062@deftypefun size_t val_labs_count (const struct val_labs *@var{val_labs})
1063Returns the number of value labels in @var{val_labs}.
1064@end deftypefun
1065
1066@deftypefun bool val_labs_can_set_width (const struct val_labs *@var{val_labs}, int @var{new_width})
1067Tests whether @var{val_labs}'s width may be changed to @var{new_width}
1068using @func{val_labs_set_width}.  Returns true if it is allowed, false
1069otherwise.
1070
1071A set of value labels may be resized to a given width only if each
1072value in it may be resized to that width, as determined by
1073@func{value_is_resizable} (@pxref{value_is_resizable}).
1074@end deftypefun
1075
1076@deftypefun void val_labs_set_width (struct val_labs *@var{val_labs}, int @var{new_width})
1077Changes the width of @var{val_labs}'s values to @var{new_width}, which
1078must be a valid new width as determined by
1079@func{val_labs_can_set_width}.
1080@end deftypefun
1081
1082@node Value Labels Adding and Removing Labels
1083@subsection Adding and Removing Labels
1084
1085These functions add and remove value labels from a @struct{val_labs}
1086object.
1087
1088@deftypefun bool val_labs_add (struct val_labs *@var{val_labs}, union value @var{value}, const char *@var{label})
Adds @var{label} to @var{val_labs} as a label for @var{value},
1090which must have the same width as the set of value labels.  Returns
1091true if successful, false if @var{value} already has a label.
1092@end deftypefun
1093
1094@deftypefun void val_labs_replace (struct val_labs *@var{val_labs}, union value @var{value}, const char *@var{label})
Adds @var{label} to @var{val_labs} as a label for @var{value},
which must have the same width as the set of value labels.  If
@var{value} already has a label in @var{val_labs}, it is replaced.
1098@end deftypefun
1099
1100@deftypefun bool val_labs_remove (struct val_labs *@var{val_labs}, union value @var{value})
1101Removes from @var{val_labs} any label for @var{value}, which must have
1102the same width as the set of value labels.  Returns true if a label
1103was removed, false otherwise.
1104@end deftypefun
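
The difference between adding and replacing a label is easiest to see
in a toy model.  The sketch below is not PSPP code: the
@code{sketch_} names are hypothetical, values are plain
@code{double}s, the capacity is fixed at an arbitrary 16 entries (a
real @struct{val_labs} has no such limit), and label strings are
borrowed instead of copied.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

#define SKETCH_MAX_LABELS 16    /* Arbitrary limit for this sketch only. */

struct sketch_lab
  {
    double value;
    const char *label;
  };

struct sketch_labs
  {
    struct sketch_lab labs[SKETCH_MAX_LABELS];
    size_t n;
  };

/* Returns the index of VALUE's entry in LS, or LS->n if it has none. */
size_t
sketch_index (const struct sketch_labs *ls, double value)
{
  size_t i;
  for (i = 0; i < ls->n; i++)
    if (ls->labs[i].value == value)
      break;
  return i;
}

/* Returns VALUE's label, or a null pointer if it has none. */
const char *
sketch_find (const struct sketch_labs *ls, double value)
{
  size_t i = sketch_index (ls, value);
  return i < ls->n ? ls->labs[i].label : NULL;
}

/* Adds LABEL for VALUE; fails if VALUE already has a label. */
bool
sketch_add (struct sketch_labs *ls, double value, const char *label)
{
  if (sketch_index (ls, value) < ls->n)
    return false;
  assert (ls->n < SKETCH_MAX_LABELS);
  ls->labs[ls->n].value = value;
  ls->labs[ls->n].label = label;
  ls->n++;
  return true;
}

/* Adds LABEL for VALUE, overwriting any existing label. */
void
sketch_replace (struct sketch_labs *ls, double value, const char *label)
{
  size_t i = sketch_index (ls, value);
  if (i < ls->n)
    ls->labs[i].label = label;
  else
    sketch_add (ls, value, label);
}

/* Removes VALUE's label; returns true if there was one. */
bool
sketch_remove (struct sketch_labs *ls, double value)
{
  size_t i = sketch_index (ls, value);
  if (i == ls->n)
    return false;
  ls->n--;
  ls->labs[i] = ls->labs[ls->n];   /* Move the last entry into the gap. */
  return true;
}
```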
1105
1106@node Value Labels Iteration
1107@subsection Iterating through Value Labels
1108
1109These functions allow iteration through the set of value labels
1110represented by a @struct{val_labs} object.  They may be used in the
1111context of a @code{for} loop:
1112
1113@example
struct val_labs *val_labs;
1115const struct val_lab *vl;
1116
1117@dots{}
1118
1119for (vl = val_labs_first (val_labs); vl != NULL;
1120     vl = val_labs_next (val_labs, vl))
1121  @{
1122    @dots{}@r{do something with @code{vl}}@dots{}
1123  @}
1124@end example
1125
Value labels should not be added to or deleted from a @struct{val_labs}
while it is undergoing iteration.
1128
1129@deftypefun {const struct val_lab *} val_labs_first (const struct val_labs *@var{val_labs})
Returns the first value label in @var{val_labs}, if it contains at
1131least one value label, or a null pointer if it does not contain any
1132value labels.
1133@end deftypefun
1134
@deftypefun {const struct val_lab *} val_labs_next (const struct val_labs *@var{val_labs}, const struct val_lab *@var{vl})
Returns the value label in @var{val_labs} following @var{vl}, if
1137@var{vl} is not the last value label in @var{val_labs}, or a null
1138pointer if there are no value labels following @var{vl}.
1139@end deftypefun
1140
1141@deftypefun {const struct val_lab **} val_labs_sorted (const struct val_labs *@var{val_labs})
1142Allocates and returns an array of pointers to value labels, which are
1143sorted in increasing order by value.  The array has
1144@code{val_labs_count (@var{val_labs})} elements.  The caller is
1145responsible for freeing the array with @func{free} (but must not free
1146any of the @struct{val_lab} elements that the array points to).
1147@end deftypefun
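
The ownership convention of @func{val_labs_sorted} (the caller frees
the array but not the elements it points to) follows from building an
array of pointers into storage owned by someone else.  The sketch
below shows that pattern with the standard @code{qsort}; it is not
PSPP's implementation, and @code{sketch_lab} and
@code{sketch_sorted} are hypothetical names.

```c
#include <assert.h>
#include <stdlib.h>

struct sketch_lab
  {
    double value;
    const char *label;
  };

/* qsort() callback that orders two pointers to labels by value. */
static int
compare_labs (const void *a_, const void *b_)
{
  const struct sketch_lab *a = *(const struct sketch_lab *const *) a_;
  const struct sketch_lab *b = *(const struct sketch_lab *const *) b_;
  return (a->value > b->value) - (a->value < b->value);
}

/* Returns a newly allocated array of N pointers to the elements of
   LABS, sorted in increasing order by value.  The caller frees the
   array itself; LABS keeps ownership of the elements. */
const struct sketch_lab **
sketch_sorted (const struct sketch_lab *labs, size_t n)
{
  const struct sketch_lab **v = malloc ((n ? n : 1) * sizeof *v);
  assert (v != NULL);

  for (size_t i = 0; i < n; i++)
    v[i] = &labs[i];
  qsort (v, n, sizeof *v, compare_labs);
  return v;
}
```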
1148
The iteration functions above work with pointers to @struct{val_lab},
1150which is an opaque data structure that users of @struct{val_labs} must
1151not modify or free directly.  The following functions work with
1152objects of this type:
1153
1154@deftypefun {const union value *} val_lab_get_value (const struct val_lab *@var{vl})
Returns the value of value label @var{vl}.  The caller must not modify
or free the returned value.  (To change the value that a label applies
to, remove the value label with @func{val_labs_remove}, then add the
label under the new value with @func{val_labs_add}.)
1159
1160The width of the returned value cannot be determined directly from
1161@var{vl}.  It may be obtained by calling @func{val_labs_get_width} on
1162the @struct{val_labs} that @var{vl} is in.
1163@end deftypefun
1164
1165@deftypefun {const char *} val_lab_get_label (const struct val_lab *@var{vl})
1166Returns the label in @var{vl} as a null-terminated string.  The caller
1167must not modify or free the returned string.  (Use
1168@func{val_labs_replace} to change a value label.)
1169@end deftypefun
1170
1171@node Variables
1172@section Variables
1173
1174A PSPP variable is represented by @struct{variable}, an opaque type
1175declared in @file{data/variable.h} along with related declarations.
1176@xref{Variables,,,pspp, PSPP Users Guide}, for a description of PSPP
1177variables from a user perspective.
1178
1179PSPP is unusual among computer languages in that, by itself, a PSPP
1180variable does not have a value.  Instead, a variable in PSPP takes on
1181a value only in the context of a case, which supplies one value for
1182each variable in a set of variables (@pxref{Cases}).  The set of
1183variables in a case, in turn, are ordinarily part of a dictionary
1184(@pxref{Dictionaries}).
1185
1186Every variable has several attributes, most of which correspond
1187directly to one of the variable attributes visible to PSPP users
1188(@pxref{Attributes,,,pspp, PSPP Users Guide}).
1189
1190The following sections describe variable-related functions and macros.
1191
1192@menu
1193* Variable Name::
1194* Variable Type and Width::
1195* Variable Missing Values::
1196* Variable Value Labels::
1197* Variable Print and Write Formats::
1198* Variable Labels::
1199* Variable GUI Attributes::
1200* Variable Leave Status::
1201* Dictionary Class::
1202* Variable Creation and Destruction::
1203* Variable Short Names::
1204* Variable Relationships::
1205* Variable Auxiliary Data::
1206* Variable Categorical Values::
1207@end menu
1208
1209@node Variable Name
1210@subsection Variable Name
1211
1212A variable name is a string between 1 and @code{ID_MAX_LEN} bytes
1213long that satisfies the rules for PSPP identifiers
1214(@pxref{Tokens,,,pspp, PSPP Users Guide}).  Variable names are
1215mixed-case and treated case-insensitively.
1216
1217@deftypefn Macro int ID_MAX_LEN
1218Maximum length of a variable name, in bytes, currently 64.
1219@end deftypefn
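
As a rough illustration of the length limit and of case-insensitive
matching, consider the sketch below.  It is deliberately simplified
and is not how PSPP compares names: it folds case for ASCII letters
only, it checks length but none of the other identifier rules, and
the @code{sketch_} names are hypothetical.

```c
#include <assert.h>
#include <ctype.h>
#include <stdbool.h>
#include <string.h>

/* Mirrors the documented current value of ID_MAX_LEN. */
#define SKETCH_ID_MAX_LEN 64

/* Returns true if NAME's length in bytes is within the allowed range. */
bool
sketch_name_length_ok (const char *name)
{
  size_t len = strlen (name);
  return len >= 1 && len <= SKETCH_ID_MAX_LEN;
}

/* Returns true if A and B are equal, ignoring ASCII case. */
bool
sketch_names_equal (const char *a, const char *b)
{
  assert (a != NULL && b != NULL);

  for (; *a != '\0' && *b != '\0'; a++, b++)
    if (tolower ((unsigned char) *a) != tolower ((unsigned char) *b))
      return false;
  return *a == *b;   /* Equal only if both strings ended together. */
}
```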
1220
1221Only one commonly useful function relates to variable names:
1222
1223@deftypefun {const char *} var_get_name (const struct variable *@var{var})
1224Returns @var{var}'s variable name as a C string.
1225@end deftypefun
1226
1227A few other functions are much more rarely used.  Some of these
1228functions are used internally by the dictionary implementation:
1229
1230@anchor{var_set_name}
1231@deftypefun {void} var_set_name (struct variable *@var{var}, const char *@var{new_name})
1232Changes the name of @var{var} to @var{new_name}, which must be a
1233``plausible'' name as defined below.
1234
1235This function cannot be applied to a variable that is part of a
1236dictionary.  Use @func{dict_rename_var} instead (@pxref{Dictionary
1237Renaming Variables}).
1238@end deftypefun
1239
1240@deftypefun {enum dict_class} var_get_dict_class (const struct variable *@var{var})
1241Returns the dictionary class of @var{var}'s name (@pxref{Dictionary
1242Class}).
1243@end deftypefun
1244
1245@node Variable Type and Width
1246@subsection Variable Type and Width
1247
1248A variable's type and width are the type and width of its values
1249(@pxref{Values}).
1250
1251@deftypefun {enum val_type} var_get_type (const struct variable *@var{var})
1252Returns the type of variable @var{var}.
1253@end deftypefun
1254
1255@deftypefun int var_get_width (const struct variable *@var{var})
1256Returns the width of variable @var{var}.
1257@end deftypefun
1258
1259@deftypefun void var_set_width (struct variable *@var{var}, int @var{width})
1260Sets the width of variable @var{var} to @var{width}.  The width of a
1261variable should not normally be changed after the variable is created,
1262so this function is rarely used.  This function cannot be applied to a
1263variable that is part of a dictionary.
1264@end deftypefun
1265
1266@deftypefun bool var_is_numeric (const struct variable *@var{var})
1267Returns true if @var{var} is a numeric variable, false otherwise.
1268@end deftypefun
1269
1270@deftypefun bool var_is_alpha (const struct variable *@var{var})
1271Returns true if @var{var} is an alphanumeric (string) variable, false
1272otherwise.
1273@end deftypefun
1274
1275@node Variable Missing Values
1276@subsection Variable Missing Values
1277
1278A numeric or short string variable may have a set of user-missing
1279values (@pxref{MISSING VALUES,,,pspp, PSPP Users Guide}), represented
1280as a @struct{missing_values} (@pxref{User-Missing Values}).
1281
1282The most frequent operation on a variable's missing values is to query
1283whether a value is user- or system-missing:
1284
1285@deftypefun bool var_is_value_missing (const struct variable *@var{var}, const union value *@var{value}, enum mv_class @var{class})
1286@deftypefunx bool var_is_num_missing (const struct variable *@var{var}, double @var{value}, enum mv_class @var{class})
1287@deftypefunx bool var_is_str_missing (const struct variable *@var{var}, const char @var{value}[], enum mv_class @var{class})
1288Tests whether @var{value} is a missing value of the given @var{class}
1289for variable @var{var} and returns true if so, false otherwise.
1290@func{var_is_num_missing} may only be applied to numeric variables;
1291@func{var_is_str_missing} may only be applied to string variables.
1292@var{value} must have been initialized with the same width as
1293@var{var}.
1294
1295@code{var_is_@var{type}_missing (@var{var}, @var{value}, @var{class})}
1296is equivalent to @code{mv_is_@var{type}_missing
1297(var_get_missing_values (@var{var}), @var{value}, @var{class})}.
1298@end deftypefun
1299
1300In addition, a few functions are provided to work more directly with a
1301variable's @struct{missing_values}:
1302
1303@deftypefun {const struct missing_values *} var_get_missing_values (const struct variable *@var{var})
1304Returns the @struct{missing_values} associated with @var{var}.  The
1305caller must not modify the returned structure.  The return value is
1306always non-null.
1307@end deftypefun
1308
1309@anchor{var_set_missing_values}
1310@deftypefun {void} var_set_missing_values (struct variable *@var{var}, const struct missing_values *@var{miss})
1311Changes @var{var}'s missing values to a copy of @var{miss}, or if
1312@var{miss} is a null pointer, clears @var{var}'s missing values.  If
1313@var{miss} is non-null, it must have the same width as @var{var} or be
1314resizable to @var{var}'s width (@pxref{mv_resize}).  The caller
1315retains ownership of @var{miss}.
1316@end deftypefun
1317
1318@deftypefun void var_clear_missing_values (struct variable *@var{var})
1319Clears @var{var}'s missing values.  Equivalent to
1320@code{var_set_missing_values (@var{var}, NULL)}.
1321@end deftypefun
1322
1323@deftypefun bool var_has_missing_values (const struct variable *@var{var})
1324Returns true if @var{var} has any missing values, false if it has
none.  Equivalent to @code{!mv_is_empty (var_get_missing_values (@var{var}))}.
1326@end deftypefun
1327
1328@node Variable Value Labels
1329@subsection Variable Value Labels
1330
1331A numeric or short string variable may have a set of value labels
1332(@pxref{VALUE LABELS,,,pspp, PSPP Users Guide}), represented as a
1333@struct{val_labs} (@pxref{Value Labels}).  The most commonly useful
1334functions for value labels return the value label associated with a
1335value:
1336
1337@deftypefun {const char *} var_lookup_value_label (const struct variable *@var{var}, const union value *@var{value})
1338Looks for a label for @var{value} in @var{var}'s set of value labels.
1339@var{value} must have the same width as @var{var}.  Returns the label
1340if one exists, otherwise a null pointer.
1341@end deftypefun
1342
1343@deftypefun void var_append_value_name (const struct variable *@var{var}, const union value *@var{value}, struct string *@var{str})
1344Looks for a label for @var{value} in @var{var}'s set of value labels.
1345@var{value} must have the same width as @var{var}.
1346If a label exists, it will be appended to the string pointed to by @var{str}.
1347Otherwise, it formats @var{value}
1348using @var{var}'s print format (@pxref{Input and Output Formats})
1349and appends the formatted string.
1350@end deftypefun
1351
1352The underlying @struct{val_labs} structure may also be accessed
1353directly using the functions described below.
1354
1355@deftypefun bool var_has_value_labels (const struct variable *@var{var})
1356Returns true if @var{var} has at least one value label, false
1357otherwise.
1358@end deftypefun
1359
1360@deftypefun {const struct val_labs *} var_get_value_labels (const struct variable *@var{var})
1361Returns the @struct{val_labs} associated with @var{var}.  If @var{var}
1362has no value labels, then the return value may or may not be a null
1363pointer.
1364
1365The variable retains ownership of the returned @struct{val_labs},
1366which the caller must not attempt to modify.
1367@end deftypefun
1368
1369@deftypefun void var_set_value_labels (struct variable *@var{var}, const struct val_labs *@var{val_labs})
1370Replaces @var{var}'s value labels by a copy of @var{val_labs}.  The
1371caller retains ownership of @var{val_labs}.  If @var{val_labs} is a
1372null pointer, then @var{var}'s value labels, if any, are deleted.
1373@end deftypefun
1374
1375@deftypefun void var_clear_value_labels (struct variable *@var{var})
1376Deletes @var{var}'s value labels.  Equivalent to
1377@code{var_set_value_labels (@var{var}, NULL)}.
1378@end deftypefun
1379
1380A final group of functions offers shorthands for operations that would
1381otherwise require getting the value labels from a variable, copying
1382them, modifying them, and then setting the modified value labels into
1383the variable (making a second copy):
1384
1385@deftypefun bool var_add_value_label (struct variable *@var{var}, const union value *@var{value}, const char *@var{label})
1386Attempts to add a copy of @var{label} as a label for @var{value} for
1387the given @var{var}.  @var{value} must have the same width as
1388@var{var}.  If @var{value} already has a label, then the old label is
1389retained.  Returns true if a label is added, false if there was an
1390existing label for @var{value}.  Either way, the caller retains
1391ownership of @var{value} and @var{label}.
1392@end deftypefun
1393
1394@deftypefun void var_replace_value_label (struct variable *@var{var}, const union value *@var{value}, const char *@var{label})
1395Attempts to add a copy of @var{label} as a label for @var{value} for
1396the given @var{var}.  @var{value} must have the same width as
1397@var{var}.  If @var{value} already has a label, then
1398@var{label} replaces the old label.  Either way, the caller retains
1399ownership of @var{value} and @var{label}.
1400@end deftypefun
1401
1402@node Variable Print and Write Formats
1403@subsection Variable Print and Write Formats
1404
1405Each variable has an associated pair of output formats, called its
1406@dfn{print format} and @dfn{write format}.  @xref{Input and Output
1407Formats,,,pspp, PSPP Users Guide}, for an introduction to formats.
1408@xref{Input and Output Formats}, for a developer's description of
1409format representation.
1410
1411The print format is used to convert a variable's data values to
1412strings for human-readable output.  The write format is used similarly
1413for machine-readable output, primarily by the WRITE transformation
1414(@pxref{WRITE,,,pspp, PSPP Users Guide}).  Most often a variable's
1415print and write formats are the same.
1416
1417A newly created variable by default has format F8.2 if it is numeric
1418or an A format with the same width as the variable if it is string.
1419Many creators of variables override these defaults.
1420
1421Both the print format and write format are output formats.  Input
1422formats are not part of @struct{variable}.  Instead, input programs
1423and transformations keep track of variable input formats themselves.
1424
1425The following functions work with variable print and write formats.
1426
1427@deftypefun {const struct fmt_spec *} var_get_print_format (const struct variable *@var{var})
1428@deftypefunx {const struct fmt_spec *} var_get_write_format (const struct variable *@var{var})
1429Returns @var{var}'s print or write format, respectively.
1430@end deftypefun
1431
1432@deftypefun void var_set_print_format (struct variable *@var{var}, const struct fmt_spec *@var{format})
1433@deftypefunx void var_set_write_format (struct variable *@var{var}, const struct fmt_spec *@var{format})
1434@deftypefunx void var_set_both_formats (struct variable *@var{var}, const struct fmt_spec *@var{format})
1435Sets @var{var}'s print format, write format, or both formats,
1436respectively, to a copy of @var{format}.
1437@end deftypefun
1438
1439@node Variable Labels
1440@subsection Variable Labels
1441
1442A variable label is a string that describes a variable.  Variable
1443labels may contain spaces and punctuation not allowed in variable
1444names.  @xref{VARIABLE LABELS,,,pspp, PSPP Users Guide}, for a
1445user-level description of variable labels.
1446
1447The most commonly useful functions for variable labels are those to
1448retrieve a variable's label:
1449
1450@deftypefun {const char *} var_to_string (const struct variable *@var{var})
1451Returns @var{var}'s variable label, if it has one, otherwise
1452@var{var}'s name.  In either case the caller must not attempt to
1453modify or free the returned string.
1454
1455This function is useful for user output.
1456@end deftypefun
1457
1458@deftypefun {const char *} var_get_label (const struct variable *@var{var})
1459Returns @var{var}'s variable label, if it has one, or a null pointer
1460otherwise.
1461@end deftypefun
1462
1463A few other variable label functions are also provided:
1464
1465@deftypefun void var_set_label (struct variable *@var{var}, const char *@var{label})
1466Sets @var{var}'s variable label to a copy of @var{label}, or removes
1467any label from @var{var} if @var{label} is a null pointer or contains
1468only spaces.  Leading and trailing spaces are removed from the
1469variable label and its remaining content is truncated at 255 bytes.
1470@end deftypefun
1471
1472@deftypefun void var_clear_label (struct variable *@var{var})
1473Removes any variable label from @var{var}.
1474@end deftypefun
1475
1476@deftypefun bool var_has_label (const struct variable *@var{var})
1477Returns true if @var{var} has a variable label, false otherwise.
1478@end deftypefun
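
The difference between @func{var_get_label} and @func{var_to_string}
might be sketched as follows.  The helper @code{print_heading} is
hypothetical, and the example assumes that the functions above are
declared in @file{<data/variable.h>}.

@example
#include <stdio.h>
#include <data/variable.h>

/* Prints a one-line description of VAR on standard output.
   Hypothetical helper, not part of PSPP. */
static void
print_heading (const struct variable *var)
@{
  const char *label = var_get_label (var);

  if (label != NULL)
    printf ("labeled: %s\n", label);
  else
    /* With no label, var_to_string falls back to the name. */
    printf ("unlabeled: %s\n", var_to_string (var));
@}
@end example

Neither returned string may be modified or freed by the caller.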
1479
1480@node Variable GUI Attributes
1481@subsection GUI Attributes
1482
1483These functions and types access and set attributes that are mainly
1484used by graphical user interfaces.  Their values are also stored in
1485and retrieved from system files (but not portable files).
1486
The first group of functions relates to the measurement level of
numeric data.  By default, new numeric variables are assigned a scale
level of measurement and new string variables a nominal level
(@pxref{var_create}).
1490
1491@deftp {Enumeration} {enum measure}
1492Measurement level.  Available values are:
1493
1494@table @code
1495@item MEASURE_NOMINAL
1496Numeric data values are arbitrary.  Arithmetic operations and
1497numerical comparisons of such data are not meaningful.
1498
1499@item MEASURE_ORDINAL
1500Numeric data values indicate progression along a rank order.
1501Arbitrary arithmetic operations such as addition are not meaningful on
1502such data, but inequality comparisons (less, greater, etc.) have
1503straightforward interpretations.
1504
1505@item MEASURE_SCALE
1506Ratios, sums, etc. of numeric data values have meaningful
1507interpretations.
1508@end table
1509
1510PSPP does not have a separate category for interval data, which would
1511naturally fall between the ordinal and scale measurement levels.
1512@end deftp
1513
1514@deftypefun bool measure_is_valid (enum measure @var{measure})
1515Returns true if @var{measure} is a valid level of measurement, that
1516is, if it is one of the @code{enum measure} constants listed above,
1517and false otherwise.
1518@end deftypefun
1519
@deftypefun {enum measure} var_get_measure (const struct variable *@var{var})
1521@deftypefunx void var_set_measure (struct variable *@var{var}, enum measure @var{measure})
1522Gets or sets @var{var}'s measurement level.
1523@end deftypefun
1524
1525The following set of functions relates to the width of on-screen
1526columns used for displaying variable data in a graphical user
1527interface environment.  The unit of measurement is the width of a
1528character.  For proportionally spaced fonts, this is based on the
1529average width of a character.
1530
1531@deftypefun int var_get_display_width (const struct variable *@var{var})
1532@deftypefunx void var_set_display_width (struct variable *@var{var}, int @var{display_width})
1533Gets or sets @var{var}'s display width.
1534@end deftypefun
1535
1536@anchor{var_default_display_width}
1537@deftypefun int var_default_display_width (int @var{width})
1538Returns the default display width for a variable with the given
1539@var{width}.  The default width of a numeric variable is 8.  The
1540default width of a string variable is @var{width} or 32, whichever is
1541less.
1542@end deftypefun
1543
The final group of functions works with the justification of data when
it is displayed in on-screen columns.  By default, new numeric
variables are right-justified and new string variables are
left-justified (@pxref{var_create}).
1547
1548@deftp {Enumeration} {enum alignment}
1549Text justification.  Possible values are @code{ALIGN_LEFT},
1550@code{ALIGN_RIGHT}, and @code{ALIGN_CENTRE}.
1551@end deftp
1552
1553@deftypefun bool alignment_is_valid (enum alignment @var{alignment})
1554Returns true if @var{alignment} is a valid alignment, that is, if it
1555is one of the @code{enum alignment} constants listed above, and false
1556otherwise.
1557@end deftypefun
1558
@deftypefun {enum alignment} var_get_alignment (const struct variable *@var{var})
1560@deftypefunx void var_set_alignment (struct variable *@var{var}, enum alignment @var{alignment})
1561Gets or sets @var{var}'s alignment.
1562@end deftypefun
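
The three groups of GUI attributes can be set together, as in the
sketch below.  The helper @code{show_as_category} is hypothetical.
The example also relies on @func{var_get_width}, which is assumed to
return the variable's width as described elsewhere in this manual.

@example
#include <data/variable.h>

/* Configures VAR for display as a left-aligned categorical column.
   Hypothetical helper, not part of PSPP. */
static void
show_as_category (struct variable *var)
@{
  var_set_measure (var, MEASURE_NOMINAL);
  var_set_alignment (var, ALIGN_LEFT);

  /* Use the default column width for a variable of this width. */
  var_set_display_width (var,
                         var_default_display_width (var_get_width (var)));
@}
@end example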
1563
1564@node Variable Leave Status
1565@subsection Variable Leave Status
1566
1567Commonly, most or all data in a case come from an input file, read
1568with a command such as DATA LIST or GET, but data can also be
1569generated with transformations such as COMPUTE.  In the latter case
1570the question of a datum's ``initial value'' can arise.  For example,
1571the value of a piece of generated data can recursively depend on its
1572own value:
1573@example
1574COMPUTE X = X + 1.
1575@end example
1576Another situation where the initial value of a variable arises is when
1577its value is not set at all for some cases, e.g.@: below, @code{Y} is
1578set only for the first 10 cases:
1579@example
DO IF $CASENUM <= 10.
1581+ COMPUTE Y = 1.
1582END IF.
1583@end example
1584
1585By default, the initial value of a datum in either of these situations
1586is the system-missing value for numeric values and spaces for string
1587values.  This means that, above, X would be system-missing and that Y
1588would be 1 for the first 10 cases and system-missing for the
1589remainder.
1590
1591PSPP also supports retaining the value of a variable from one case to
1592another, using the LEAVE command (@pxref{LEAVE,,,pspp, PSPP Users
1593Guide}).  The initial value of such a variable is 0 if it is numeric
1594and spaces if it is a string.  If the command @samp{LEAVE X Y} is
1595appended to the above example, then X would have value 1 in the first
1596case and increase by 1 in every succeeding case, and Y would have
1597value 1 for the first 10 cases and 0 for later cases.
1598
1599The LEAVE command has no effect on data that comes from an input file
1600or whose values do not depend on a variable's initial value.
1601
The values of scratch variables (@pxref{Scratch Variables,,,pspp, PSPP
1603Users Guide}) are always left from one case to another.
1604
1605The following functions work with a variable's leave status.
1606
1607@deftypefun bool var_get_leave (const struct variable *@var{var})
1608Returns true if @var{var}'s value is to be retained from case to case,
1609false if it is reinitialized to system-missing or spaces.
1610@end deftypefun
1611
1612@deftypefun void var_set_leave (struct variable *@var{var}, bool @var{leave})
1613If @var{leave} is true, marks @var{var} to be left from case to case;
1614if @var{leave} is false, marks @var{var} to be reinitialized for each
1615case.
1616
1617If @var{var} is a scratch variable, @var{leave} must be true.
1618@end deftypefun
1619
1620@deftypefun bool var_must_leave (const struct variable *@var{var})
1621Returns true if @var{var} must be left from case to case, that is, if
1622@var{var} is a scratch variable.
1623@end deftypefun
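
Because a scratch variable must always be left, code that sets the
leave status from user input may want to guard against clearing it.
One possible approach is sketched below.  The helper name
@code{set_leave_checked} is hypothetical.

@example
#include <stdbool.h>
#include <data/variable.h>

/* Sets VAR's leave status to LEAVE, except that a variable that
   must be left (a scratch variable) always remains left.
   Hypothetical helper, not part of PSPP. */
static void
set_leave_checked (struct variable *var, bool leave)
@{
  var_set_leave (var, leave || var_must_leave (var));
@}
@end example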
1624
1625@node Dictionary Class
1626@subsection Dictionary Class
1627
1628Occasionally it is useful to classify variables into @dfn{dictionary
1629classes} based on their names.  Dictionary classes are represented by
1630@enum{dict_class}.  This type and other declarations for dictionary
1631classes are in the @file{<data/dict-class.h>} header.
1632
1633@deftp {Enumeration} {enum dict_class}
1634The dictionary classes are:
1635
1636@table @code
1637@item DC_ORDINARY
1638An ordinary variable, one whose name does not begin with @samp{$} or
1639@samp{#}.
1640
1641@item DC_SYSTEM
1642A system variable, one whose name begins with @samp{$}.  @xref{System
1643Variables,,,pspp, PSPP Users Guide}.
1644
1645@item DC_SCRATCH
1646A scratch variable, one whose name begins with @samp{#}.
1647@xref{Scratch Variables,,,pspp, PSPP Users Guide}.
1648@end table
1649
1650The values for dictionary classes are bitwise disjoint, which allows
1651them to be used in bit-masks.  An extra enumeration constant
1652@code{DC_ALL}, whose value is the bitwise-@i{or} of all of the above
1653constants, is provided to aid in this purpose.
1654@end deftp
1655
1656One example use of dictionary classes arises in connection with PSPP
1657syntax that uses @code{@var{a} TO @var{b}} to name the variables in a
1658dictionary from @var{a} to @var{b} (@pxref{Sets of Variables,,,pspp,
1659PSPP Users Guide}).  This syntax requires @var{a} and @var{b} to be in
1660the same dictionary class.  It limits the variables that it includes
1661to those in that dictionary class.
1662
1663The following functions relate to dictionary classes.
1664
1665@deftypefun {enum dict_class} dict_class_from_id (const char *@var{name})
1666Returns the ``dictionary class'' for the given variable @var{name}, by
looking at its first character.
1668@end deftypefun
1669
1670@deftypefun {const char *} dict_class_to_name (enum dict_class @var{dict_class})
1671Returns a name for the given @var{dict_class} as an adjective, e.g.@:
1672@code{"scratch"}.
1673
1674This function should probably not be used in new code as it can lead
1675to difficulties for internationalization.
1676@end deftypefun
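
Since the dictionary class values are bitwise disjoint, a set of
classes can be tested with a single mask operation.  The following is
a minimal sketch.  The helper @code{name_in_classes} is hypothetical.

@example
#include <stdbool.h>
#include <data/dict-class.h>

/* Returns true if NAME belongs to one of the dictionary classes
   in CLASSES, a bitwise-or of DC_* constants.
   Hypothetical helper, not part of PSPP. */
static bool
name_in_classes (const char *name, unsigned int classes)
@{
  return (dict_class_from_id (name) & classes) != 0;
@}
@end example

Given the classes described above, a call such as
@code{name_in_classes ("#tmp", DC_SCRATCH | DC_SYSTEM)} would be
expected to return true, and the same call with an ordinary name such
as @code{"age"} to return false.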
1677
1678@node Variable Creation and Destruction
1679@subsection Variable Creation and Destruction
1680
1681Only rarely should PSPP code create or destroy variables directly.
Ordinarily, variables are created within a dictionary and destroyed
1683by individual deletion from the dictionary or by destroying the entire
1684dictionary at once.  The functions here enable the exceptional case,
1685of creation and destruction of variables that are not associated with
1686any dictionary.  These functions are used internally in the dictionary
1687implementation.
1688
1689@anchor{var_create}
1690@deftypefun {struct variable *} var_create (const char *@var{name}, int @var{width})
1691Creates and returns a new variable with the given @var{name} and
1692@var{width}.  The new variable is not part of any dictionary.  Use
1693@func{dict_create_var}, instead, to create a variable in a dictionary
1694(@pxref{Dictionary Creating Variables}).
1695
1696@var{name} should be a valid variable name and must be a ``plausible''
1697variable name (@pxref{Variable Name}).  @var{width} must be between 0
1698and @code{MAX_STRING}, inclusive (@pxref{Values}).
1699
1700The new variable has no user-missing values, value labels, or variable
1701label.  Numeric variables initially have F8.2 print and write formats,
1702right-justified display alignment, and scale level of measurement.
1703String variables are created with A print and write formats,
1704left-justified display alignment, and nominal level of measurement.
1705The initial display width is determined by
1706@func{var_default_display_width} (@pxref{var_default_display_width}).
1707
1708The new variable initially has no short name (@pxref{Variable Short
1709Names}) and no auxiliary data (@pxref{Variable Auxiliary Data}).
1710@end deftypefun
1711
1712@anchor{var_clone}
1713@deftypefun {struct variable *} var_clone (const struct variable *@var{old_var})
1714Creates and returns a new variable with the same attributes as
1715@var{old_var}, with a few exceptions.  First, the new variable is not
1716part of any dictionary, regardless of whether @var{old_var} was in a
1717dictionary.  Use @func{dict_clone_var}, instead, to add a clone of a
1718variable to a dictionary.
1719
1720Second, the new variable is not given any short name, even if
1721@var{old_var} had a short name.  This is because the new variable is
1722likely to be immediately renamed, in which case the short name would
1723be incorrect (@pxref{Variable Short Names}).
1724
1725Finally, @var{old_var}'s auxiliary data, if any, is not copied to the
1726new variable (@pxref{Variable Auxiliary Data}).
1727@end deftypefun
1728
1729@deftypefun {void} var_destroy (struct variable *@var{var})
1730Destroys @var{var} and frees all associated storage, including its
1731auxiliary data, if any.  @var{var} must not be part of a dictionary.
1732To delete a variable from a dictionary and destroy it, use
1733@func{dict_delete_var} (@pxref{Dictionary Deleting Variables}).
1734@end deftypefun
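
The life cycle of variables that are outside any dictionary might look
like the following sketch.  The function name and the variable names
are chosen only for illustration.

@example
#include <data/variable.h>

/* Creates, clones, and destroys variables that do not belong to
   any dictionary.  Hypothetical example, not part of PSPP. */
static void
standalone_variable_demo (void)
@{
  struct variable *num = var_create ("score", 0);     /* Numeric. */
  struct variable *str = var_create ("comment", 20);  /* 20-byte string. */
  struct variable *copy = var_clone (num);

  /* ... use the variables here ... */

  /* Each variable must be destroyed individually. */
  var_destroy (copy);
  var_destroy (str);
  var_destroy (num);
@}
@end example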
1735
1736@node Variable Short Names
1737@subsection Variable Short Names
1738
1739PSPP variable names may be up to 64 (@code{ID_MAX_LEN}) bytes long.
1740The system and portable file formats, however, were designed when
1741variable names were limited to 8 bytes in length.  Since then, the
1742system file format has been augmented with an extension record that
1743explains how the 8-byte short names map to full-length names
1744(@pxref{Long Variable Names Record}), but the short names are still
1745present.  Thus, the continued presence of the short names is more or
1746less invisible to PSPP users, but every variable in a system file
1747still has a short name that must be unique.
1748
1749PSPP can generate unique short names for variables based on their full
1750names at the time it creates the data file.  If all variables' full
1751names are unique in their first 8 bytes, then the short names are
1752simply prefixes of the full names; otherwise, PSPP changes them so
1753that they are unique.
1754
1755By itself this algorithm interoperates well with other software that
1756can read system files, as long as that software understands the
1757extension record that maps short names to long names.  When the other
1758software does not understand the extension record, it can produce
1759surprising results.  Consider a situation where PSPP reads a system
file that contains a variable named RANKINGSCORE, then the user
1761adds a new variable named RANKINGSTATUS, then saves the modified data
1762as a new system file.  A program that does not understand long names
1763would then see one of these variables under the name RANKINGS---either
1764one, depending on the algorithm's details---and the other under a
1765different name.  The effect could be very confusing: by adding a new
1766and apparently unrelated variable in PSPP, the user effectively
1767renamed the existing variable.
1768
1769To counteract this potential problem, every @struct{variable} may have
1770a short name.  A variable created by the system or portable file
1771reader receives the short name from that data file.  When a variable
1772with a short name is written to a system or portable file, that
variable receives priority over other variables whose names begin
1774with the same 8 bytes but which were not read from a data file under
1775that short name.
1776
1777Variables not created by the system or portable file reader have no
1778short name by default.
1779
1780A variable with a full name of 8 bytes or less in length has absolute
1781priority for that name when the variable is written to a system file,
1782even over a second variable with that assigned short name.
1783
1784PSPP does not enforce uniqueness of short names, although the short
1785names read from any given data file will always be unique.  If two
1786variables with the same short name are written to a single data file,
1787neither one receives priority.
1788
1789The following macros and functions relate to short names.
1790
1791@defmac SHORT_NAME_LEN
1792Maximum length of a short name, in bytes.  Its value is 8.
1793@end defmac
1794
1795@deftypefun {const char *} var_get_short_name (const struct variable *@var{var})
1796Returns @var{var}'s short name, or a null pointer if @var{var} has not
1797been assigned a short name.
1798@end deftypefun
1799
1800@deftypefun void var_set_short_name (struct variable *@var{var}, const char *@var{short_name})
1801Sets @var{var}'s short name to @var{short_name}, or removes
@var{var}'s short name if @var{short_name} is a null pointer.  If it
is non-null, then @var{short_name} must be a plausible name for a
variable.  The name will be truncated to 8 bytes in length and
converted to all-uppercase.
1806@end deftypefun
1807
1808@deftypefun void var_clear_short_name (struct variable *@var{var})
1809Removes @var{var}'s short name.
1810@end deftypefun
1811
1812@node Variable Relationships
1813@subsection Variable Relationships
1814
1815Variables have close relationships with dictionaries
1816(@pxref{Dictionaries}) and cases (@pxref{Cases}).  A variable is
1817usually a member of some dictionary, and a case is often used to store
1818data for the set of variables in a dictionary.
1819
1820These functions report on these relationships.  They may be applied
1821only to variables that are in a dictionary.
1822
1823@deftypefun size_t var_get_dict_index (const struct variable *@var{var})
1824Returns @var{var}'s index within its dictionary.  The first variable
1825in a dictionary has index 0, the next variable index 1, and so on.
1826
The dictionary index can be influenced using dictionary functions such
as @func{dict_reorder_var} (@pxref{dict_reorder_var}).
1829@end deftypefun
1830
1831@deftypefun size_t var_get_case_index (const struct variable *@var{var})
1832Returns @var{var}'s index within a case.  The case index is an index
1833into an array of @union{value} large enough to contain all the data in
1834the dictionary.
1835
1836The returned case index can be used to access the value of @var{var}
1837within a case for its dictionary, as in e.g.@: @code{case_data_idx
1838(case, var_get_case_index (@var{var}))}, but ordinarily it is more
1839convenient to use the data access functions that do variable-to-index
1840translation internally, as in e.g.@: @code{case_data (case,
1841@var{var})}.
1842@end deftypefun
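
The two ways of reaching a variable's data within a case can be
compared in a short sketch.  It assumes that @code{case_data_idx} and
@code{case_data} are declared in @file{<data/case.h>} and return a
@code{const union value *} (@pxref{Cases}).  The helper name
@code{value_by_index} is hypothetical.

@example
#include <data/case.h>
#include <data/variable.h>

/* Returns VAR's value within case C, looked up by case index.
   VAR must be in the dictionary that describes C.
   Hypothetical helper, not part of PSPP. */
static const union value *
value_by_index (const struct ccase *c, const struct variable *var)
@{
  /* The more convenient equivalent is case_data (c, var). */
  return case_data_idx (c, var_get_case_index (var));
@}
@end example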
1843
1844@node Variable Auxiliary Data
1845@subsection Variable Auxiliary Data
1846
1847Each @struct{variable} can have a single pointer to auxiliary data of
1848type @code{void *}.  These functions manipulate a variable's auxiliary
1849data.
1850
1851Use of auxiliary data is discouraged because of its lack of
1852flexibility.  Only one client can make use of auxiliary data on a
1853given variable at any time, even though many clients could usefully
1854associate data with a variable.
1855
1856To prevent multiple clients from attempting to use a variable's single
1857auxiliary data field at the same time, we adopt the convention that
1858use of auxiliary data in the active dataset dictionary is restricted to
1859the currently executing command.  In particular, transformations must
1860not attach auxiliary data to a variable in the active dataset in the
1861expectation that it can be used later when the active dataset is read and
1862the transformation is executed.  To help enforce this restriction,
1863auxiliary data is deleted from all variables in the active dataset
1864dictionary after the execution of each PSPP command.
1865
1866This convention for safe use of auxiliary data applies only to the
1867active dataset dictionary.  Rules for other dictionaries may be
1868established separately.
1869
1870Auxiliary data should be replaced by a more flexible mechanism at some
1871point, but no replacement mechanism has been designed or implemented
1872so far.
1873
1874The following functions work with variable auxiliary data.
1875
1876@deftypefun {void *} var_get_aux (const struct variable *@var{var})
1877Returns @var{var}'s auxiliary data, or a null pointer if none has been
1878assigned.
1879@end deftypefun
1880
1881@deftypefun {void *} var_attach_aux (const struct variable *@var{var}, void *@var{aux}, void (*@var{aux_dtor}) (struct variable *))
1882Sets @var{var}'s auxiliary data to @var{aux}, which must not be null.
1883@var{var} must not already have auxiliary data.
1884
1885Before @var{var}'s auxiliary data is cleared by @code{var_clear_aux},
1886@var{aux_dtor}, if non-null, will be called with @var{var} as its
1887argument.  It should free any storage associated with @var{aux}, if
1888necessary.  @code{var_dtor_free} may be appropriate for use as
1889@var{aux_dtor}:
1890
1891@deffn {Function} void var_dtor_free (struct variable *@var{var})
1892Frees @var{var}'s auxiliary data by calling @code{free}.
1893@end deffn
1894@end deftypefun
1895
1896@deftypefun void var_clear_aux (struct variable *@var{var})
1897Removes auxiliary data, if any, from @var{var}, first calling the
1898destructor passed to @code{var_attach_aux}, if one was provided.
1899
1900Use @code{dict_clear_aux} to remove auxiliary data from every variable
1901in a dictionary. @c (@pxref{dict_clear_aux}).
1902@end deftypefun
1903
1904@deftypefun {void *} var_detach_aux (struct variable *@var{var})
1905Removes auxiliary data, if any, from @var{var}, and returns it.
1906Returns a null pointer if @var{var} had no auxiliary data.
1907
1908Any destructor passed to @code{var_attach_aux} is not called, so the
1909caller is responsible for freeing storage associated with the returned
1910auxiliary data.
1911@end deftypefun
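
A typical use of auxiliary data, under the conventions above, might
look like the following sketch.  The structure @code{struct
hit_counter} and both helper functions are hypothetical.  The data is
allocated with @code{malloc} so that @code{var_dtor_free} is a
suitable destructor.

@example
#include <stdbool.h>
#include <stdlib.h>
#include <data/variable.h>

/* Hypothetical client data, not part of PSPP. */
struct hit_counter
  @{
    int hits;
  @};

/* Attaches a zeroed counter to VAR.  Returns false, without
   changing VAR, if VAR already has auxiliary data or if memory
   allocation fails. */
static bool
attach_counter (struct variable *var)
@{
  struct hit_counter *aux;

  if (var_get_aux (var) != NULL)
    return false;

  aux = malloc (sizeof *aux);
  if (aux == NULL)
    return false;
  aux->hits = 0;

  /* var_dtor_free will free AUX when the data is cleared. */
  var_attach_aux (var, aux, var_dtor_free);
  return true;
@}

/* Increments the counter previously attached to VAR. */
static void
count_hit (const struct variable *var)
@{
  struct hit_counter *aux = var_get_aux (var);
  aux->hits++;
@}
@end example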
1912
1913@node Variable Categorical Values
1914@subsection Variable Categorical Values
1915
1916Some statistical procedures require a list of all the values that a
1917categorical variable takes on.  Arranging such a list requires making
1918a pass through the data, so PSPP caches categorical values in
1919@struct{variable}.
1920
1921When variable auxiliary data is revamped to support multiple clients
1922as described in the previous section, categorical values are an
1923obvious candidate.  The form in which they are currently supported is
1924inelegant.
1925
1926Categorical values are not robust against changes in the data.  That
1927is, there is currently no way to detect that a transformation has
1928changed data values, meaning that categorical values lists for the
1929changed variables must be recomputed.  PSPP is in fact in need of a
1930general-purpose caching and cache-invalidation mechanism, but none
1931has yet been designed and built.
1932
1933The following functions work with cached categorical values.
1934
1935@deftypefun {struct cat_vals *} var_get_obs_vals (const struct variable *@var{var})
1936Returns @var{var}'s set of categorical values.  Yields undefined
1937behavior if @var{var} does not have any categorical values.
1938@end deftypefun
1939
1940@deftypefun void var_set_obs_vals (const struct variable *@var{var}, struct cat_vals *@var{cat_vals})
1941Destroys @var{var}'s categorical values, if any, and replaces them by
1942@var{cat_vals}, ownership of which is transferred to @var{var}.  If
1943@var{cat_vals} is a null pointer, then @var{var}'s categorical values
1944are cleared.
1945@end deftypefun
1946
1947@deftypefun bool var_has_obs_vals (const struct variable *@var{var})
1948Returns true if @var{var} has a set of categorical values, false
1949otherwise.
1950@end deftypefun
1951
1952@node Dictionaries
1953@section Dictionaries
1954
1955Each data file in memory or on disk has an associated dictionary,
1956whose primary purpose is to describe the data in the file.
1957@xref{Variables,,,pspp, PSPP Users Guide}, for a PSPP user's view of a
1958dictionary.
1959
1960A data file stored in a PSPP format, either as a system or portable
1961file, has a representation of its dictionary embedded in it.  Other
1962kinds of data files are usually not self-describing enough to
1963construct a dictionary unassisted, so the dictionaries for these files
1964must be specified explicitly with PSPP commands such as @cmd{DATA
1965LIST}.
1966
1967The most important content of a dictionary is an array of variables,
1968which must have unique names.  A dictionary also conceptually contains
1969a mapping from each of its variables to a location within a case
1970(@pxref{Cases}), although in fact these mappings are stored within
1971individual variables.
1972
1973System variables are not members of any dictionary (@pxref{System
1974Variables,,,pspp, PSPP Users Guide}).
1975
1976Dictionaries are represented by @struct{dictionary}.  Declarations
1977related to dictionaries are in the @file{<data/dictionary.h>} header.
1978
1979The following sections describe functions for use with dictionaries.
1980
1981@menu
1982* Dictionary Variable Access::
1983* Dictionary Creating Variables::
1984* Dictionary Deleting Variables::
1985* Dictionary Reordering Variables::
1986* Dictionary Renaming Variables::
1987* Dictionary Weight Variable::
1988* Dictionary Filter Variable::
1989* Dictionary Case Limit::
1990* Dictionary Split Variables::
1991* Dictionary File Label::
1992* Dictionary Documents::
1993@end menu
1994
1995@node Dictionary Variable Access
1996@subsection Accessing Variables
1997
1998The most common operations on a dictionary simply retrieve a
1999@code{struct variable *} of an individual variable based on its name
2000or position.
2001
2002@deftypefun {struct variable *} dict_lookup_var (const struct dictionary *@var{dict}, const char *@var{name})
2003@deftypefunx {struct variable *} dict_lookup_var_assert (const struct dictionary *@var{dict}, const char *@var{name})
2004Looks up and returns the variable with the given @var{name} within
2005@var{dict}.  Name lookup is not case-sensitive.
2006
2007@code{dict_lookup_var} returns a null pointer if @var{dict} does not
2008contain a variable named @var{name}.  @code{dict_lookup_var_assert}
2009asserts that such a variable exists.
2010@end deftypefun
2011
2012@deftypefun {struct variable *} dict_get_var (const struct dictionary *@var{dict}, size_t @var{position})
2013Returns the variable at the given @var{position} in @var{dict}.
2014@var{position} must be less than the number of variables in @var{dict}
2015(see below).
2016@end deftypefun
2017
2018@deftypefun size_t dict_get_var_cnt (const struct dictionary *@var{dict})
2019Returns the number of variables in @var{dict}.
2020@end deftypefun
2021
2022Another pair of functions allows retrieving a number of variables at
2023once.  These functions are more rarely useful.
2024
2025@deftypefun void dict_get_vars (const struct dictionary *@var{dict}, const struct variable ***@var{vars}, size_t *@var{cnt}, enum dict_class @var{exclude})
2026@deftypefunx void dict_get_vars_mutable (const struct dictionary *@var{dict}, struct variable ***@var{vars}, size_t *@var{cnt}, enum dict_class @var{exclude})
2027Retrieves all of the variables in @var{dict}, in their original order,
except that any variables in the dictionary classes specified by
2029@var{exclude}, if any, are excluded (@pxref{Dictionary Class}).
2030Pointers to the variables are stored in an array allocated with
2031@code{malloc}, and a pointer to the first element of this array is
2032stored in @code{*@var{vars}}.  The caller is responsible for freeing
2033this memory when it is no longer needed.  The number of variables
2034retrieved is stored in @code{*@var{cnt}}.
2035
2036The presence or absence of @code{DC_SYSTEM} in @var{exclude} has no
2037effect, because dictionaries never include system variables.
2038@end deftypefun
2039
2040One additional function is available.  This function is most often
2041used in assertions, but it is not restricted to such use.
2042
2043@deftypefun bool dict_contains_var (const struct dictionary *@var{dict}, const struct variable *@var{var})
2044Tests whether @var{var} is one of the variables in @var{dict}.
2045Returns true if so, false otherwise.
2046@end deftypefun
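
The access functions above might be combined as in the following
sketch, which counts a dictionary's variables in two ways.  The helper
names are hypothetical.  The second helper shows that the caller is
responsible for freeing the array that @func{dict_get_vars} allocates.

@example
#include <stdlib.h>
#include <data/dict-class.h>
#include <data/dictionary.h>
#include <data/variable.h>

/* Returns the number of variables in DICT that have a variable
   label.  Hypothetical helper, not part of PSPP. */
static size_t
count_labeled (const struct dictionary *dict)
@{
  size_t n = 0;
  size_t i;

  for (i = 0; i < dict_get_var_cnt (dict); i++)
    if (var_has_label (dict_get_var (dict, i)))
      n++;
  return n;
@}

/* Returns the number of variables in DICT that are not scratch
   variables.  Hypothetical helper, not part of PSPP. */
static size_t
count_non_scratch (const struct dictionary *dict)
@{
  const struct variable **vars;
  size_t cnt;

  dict_get_vars (dict, &vars, &cnt, DC_SCRATCH);
  free (vars);   /* The array belongs to the caller. */
  return cnt;
@}
@end example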
2047
2048@node Dictionary Creating Variables
2049@subsection Creating Variables
2050
2051These functions create a new variable and insert it into a dictionary
2052in a single step.
2053
2054There is no provision for inserting an already created variable into a
2055dictionary.  There is no reason that such a function could not be
2056written, but so far there has been no need for one.
2057
2058The names provided to one of these functions should be valid variable
2059names and must be plausible variable names. @c (@pxref{Variable Names}).
2060
2061If a variable with the same name already exists in the dictionary, the
2062non-@code{assert} variants of these functions return a null pointer,
2063without modifying the dictionary.  The @code{assert} variants, on the
2064other hand, assert that no duplicate name exists.
2065
2066A variable may be in only one dictionary at any given time.
2067
2068@deftypefun {struct variable *} dict_create_var (struct dictionary *@var{dict}, const char *@var{name}, int @var{width})
2069@deftypefunx {struct variable *} dict_create_var_assert (struct dictionary *@var{dict}, const char *@var{name}, int @var{width})
2070Creates a new variable with the given @var{name} and @var{width}, as
2071if through a call to @code{var_create} with those arguments
2072(@pxref{var_create}), appends the new variable to @var{dict}'s array
2073of variables, and returns the new variable.
2074@end deftypefun
2075
2076@deftypefun {struct variable *} dict_clone_var (struct dictionary *@var{dict}, const struct variable *@var{old_var})
2077@deftypefunx {struct variable *} dict_clone_var_assert (struct dictionary *@var{dict}, const struct variable *@var{old_var})
Creates a new variable as a clone of @var{old_var}, inserts the new
variable into @var{dict}, and returns the new variable.  The
properties of the new variable are copied from @var{old_var}, except
for those not copied by @code{var_clone} (@pxref{var_clone}).

@var{old_var} does not need to be a member of any dictionary.
2084@end deftypefun
2085
2086@deftypefun {struct variable *} dict_clone_var_as (struct dictionary *@var{dict}, const struct variable *@var{old_var}, const char *@var{name})
2087@deftypefunx {struct variable *} dict_clone_var_as_assert (struct dictionary *@var{dict}, const struct variable *@var{old_var}, const char *@var{name})
2088These functions are similar to @code{dict_clone_var} and
2089@code{dict_clone_var_assert}, respectively, except that the new
2090variable is named @var{name} instead of keeping @var{old_var}'s name.
2091@end deftypefun
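
Since the non-@code{assert} variants return a null pointer when the
name is already in use, a caller that can accept an existing variable
might look it up first.  The following is a minimal sketch of that
pattern.  The helper name @code{get_or_create_numeric} is
hypothetical.

@example
#include <data/dictionary.h>
#include <data/variable.h>

/* Returns the variable named NAME in DICT, creating it as a
   numeric variable if it does not yet exist.  An existing
   variable is returned as is, whatever its width.
   Hypothetical helper, not part of PSPP. */
static struct variable *
get_or_create_numeric (struct dictionary *dict, const char *name)
@{
  struct variable *var = dict_lookup_var (dict, name);

  if (var == NULL)
    var = dict_create_var (dict, name, 0);   /* Width 0: numeric. */
  return var;
@}
@end example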
2092
2093@node Dictionary Deleting Variables
2094@subsection Deleting Variables
2095
2096These functions remove variables from a dictionary's array of
2097variables.  They also destroy the removed variables and free their
2098associated storage.
2099
2100Deleting a variable to which there might be external pointers is a bad
2101idea.  In particular, deleting variables from the active dataset
2102dictionary is a risky proposition, because transformations can retain
2103references to arbitrary variables.  Therefore, no variable should be
2104deleted from the active dataset dictionary when any transformations are
2105active, because those transformations might reference the variable to
2106be deleted.  The safest time to delete a variable is just after a
2107procedure has been executed, as done by @cmd{DELETE VARIABLES}.
2108
2109Deleting a variable automatically removes references to that variable
2110from elsewhere in the dictionary as a weighting variable, filter
2111variable, @cmd{SPLIT FILE} variable, or member of a vector.
2112
2113No functions are provided for removing a variable from a dictionary
2114without destroying that variable.  As with insertion of an existing
2115variable, there is no reason that this could not be implemented, but
2116so far there has been no need.
2117
2118@deftypefun void dict_delete_var (struct dictionary *@var{dict}, struct variable *@var{var})
2119Deletes @var{var} from @var{dict}, of which it must be a member.
2120@end deftypefun
2121
2122@deftypefun void dict_delete_vars (struct dictionary *@var{dict}, struct variable *const *@var{vars}, size_t @var{count})
2123Deletes the @var{count} variables in array @var{vars} from @var{dict}.
2124All of the variables in @var{vars} must be members of @var{dict}.  No
2125variable may be included in @var{vars} more than once.
2126@end deftypefun
2127
2128@deftypefun void dict_delete_consecutive_vars (struct dictionary *@var{dict}, size_t @var{idx}, size_t @var{count})
Deletes the @var{count} variables at consecutive positions @var{idx}
through @var{idx} + @var{count} - 1 from @var{dict}, which must
contain at least @var{idx} + @var{count} variables.
2132@end deftypefun
2133
2134@deftypefun void dict_delete_scratch_vars (struct dictionary *@var{dict})
2135Deletes all scratch variables from @var{dict}.
2136@end deftypefun
2137
2138@node Dictionary Reordering Variables
2139@subsection Changing Variable Order
2140
2141The variables in a dictionary are stored in an array.  These functions
2142change the order of a dictionary's array of variables without changing
2143which variables are in the dictionary.
2144
2145@anchor{dict_reorder_var}
2146@deftypefun void dict_reorder_var (struct dictionary *@var{dict}, struct variable *@var{var}, size_t @var{new_index})
2147Moves @var{var}, which must be in @var{dict}, so that it is at
2148position @var{new_index} in @var{dict}'s array of variables.  Other
2149variables in @var{dict}, if any, retain their relative positions.
2150@var{new_index} must be less than the number of variables in
2151@var{dict}.
2152@end deftypefun
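
The effect of @func{dict_reorder_var} on the order of the array can be
modeled with an ordinary array.  The following is a minimal sketch
rather than PSPP code: @code{move_element} is a hypothetical helper,
and the strings stand in for variables.

@verbatim
```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical stand-in for dict_reorder_var(): moves the element at
   OLD_INDEX in ARRAY, which has N elements, to NEW_INDEX.  All other
   elements keep their relative order. */
void
move_element (const char **array, size_t n,
              size_t old_index, size_t new_index)
{
  assert (old_index < n);
  assert (new_index < n);

  const char *moved = array[old_index];
  if (old_index < new_index)
    {
      /* Shift the elements in between one slot toward the front. */
      memmove (&array[old_index], &array[old_index + 1],
               (new_index - old_index) * sizeof *array);
    }
  else if (old_index > new_index)
    {
      /* Shift the elements in between one slot toward the back. */
      memmove (&array[new_index + 1], &array[new_index],
               (old_index - new_index) * sizeof *array);
    }
  array[new_index] = moved;
}
```
@end verbatim

In this model, moving the element at index 0 of @code{a b c d} to
index 2 yields @code{b c a d}.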
2153
2154@deftypefun void dict_reorder_vars (struct dictionary *@var{dict}, struct variable *const *@var{new_order}, size_t @var{count})
2155Moves the @var{count} variables in @var{new_order} to the beginning of
2156@var{dict}'s array of variables in the specified order.  Other
2157variables in @var{dict}, if any, retain their relative positions.
2158
2159All of the variables in @var{new_order} must be in @var{dict}.  No
2160duplicates are allowed within @var{new_order}, which means that
2161@var{count} must be no greater than the number of variables in
2162@var{dict}.
2163@end deftypefun
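
The ordering rule for @func{dict_reorder_vars} can be sketched the same
way.  This is an illustrative model under stated assumptions, not the
PSPP implementation: @code{move_to_front} is a hypothetical helper, and
distinct integers stand in for variables.

@verbatim
```c
#include <assert.h>
#include <stdbool.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical model of dict_reorder_vars(): moves the COUNT distinct
   values in NEW_ORDER, each of which must appear in ARRAY, to the
   front of the N-element ARRAY in the given order.  The remaining
   values keep their relative order. */
void
move_to_front (int *array, size_t n, const int *new_order, size_t count)
{
  assert (count <= n);
  if (n == 0)
    return;

  int *result = malloc (n * sizeof *result);
  assert (result != NULL);

  /* The listed values come first, in the requested order. */
  size_t k = 0;
  for (size_t i = 0; i < count; i++)
    result[k++] = new_order[i];

  /* Then every unlisted value, in its original order. */
  for (size_t i = 0; i < n; i++)
    {
      bool listed = false;
      for (size_t j = 0; j < count; j++)
        if (array[i] == new_order[j])
          listed = true;
      if (!listed)
        {
          assert (k < n);       /* Fails if NEW_ORDER was invalid. */
          result[k++] = array[i];
        }
    }
  assert (k == n);

  memcpy (array, result, n * sizeof *array);
  free (result);
}
```
@end verbatim

Applied to @code{1 2 3 4 5} with a new order of @code{4 2}, the model
produces @code{4 2 1 3 5}.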
2164
2165@node Dictionary Renaming Variables
2166@subsection Renaming Variables
2167
2168These functions change the names of variables within a dictionary.
2169The @func{var_set_name} function (@pxref{var_set_name}) cannot be
2170applied directly to a variable that is in a dictionary, because
2171@struct{dictionary} contains an index by name that @func{var_set_name}
2172would not update.  The following functions take care to update the
2173index as well.  They also ensure that variable renaming does not cause
2174a dictionary to contain a duplicate variable name.
2175
2176@deftypefun void dict_rename_var (struct dictionary *@var{dict}, struct variable *@var{var}, const char *@var{new_name})
2177Changes the name of @var{var}, which must be in @var{dict}, to
2178@var{new_name}.  A variable named @var{new_name} must not already be
2179in @var{dict}, unless @var{new_name} is the same as @var{var}'s
2180current name.
2181@end deftypefun
2182
@deftypefun bool dict_rename_vars (struct dictionary *@var{dict}, struct variable **@var{vars}, char **@var{new_names}, size_t @var{count}, char **@var{err_name})
2184Renames each of the @var{count} variables in @var{vars} to the name in
2185the corresponding position of @var{new_names}.  If the renaming would
2186result in a duplicate variable name, returns false and stores one of
the names that would be duplicated into @code{*@var{err_name}}, if
2188@var{err_name} is non-null.  Otherwise, the renaming is successful,
2189and true is returned.
2190@end deftypefun
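
The duplicate check described for @func{dict_rename_vars} can be
illustrated with a toy dictionary that is just an array of names.  The
sketch below is a model, not PSPP code: @code{rename_all} and
@code{name_after_rename} are hypothetical, variables are identified by
array position, names are compared byte for byte (a simplification),
and leaving the array untouched on failure is a choice made for the
model.

@verbatim
```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <string.h>

/* Returns the name that position POS would have after the renames. */
const char *
name_after_rename (const char *const *names, const size_t *idx,
                   const char *const *new_names, size_t count, size_t pos)
{
  for (size_t i = 0; i < count; i++)
    if (idx[i] == pos)
      return new_names[i];
  return names[pos];
}

/* Hypothetical model of dict_rename_vars().  NAMES holds the N names
   in the toy dictionary.  The entries at the COUNT distinct positions
   in IDX are to be renamed to the corresponding NEW_NAMES.  Applies
   every rename and returns true if the resulting names are all
   distinct.  Otherwise leaves NAMES unchanged, stores a conflicting
   name in *ERR_NAME if ERR_NAME is non-null, and returns false. */
bool
rename_all (const char **names, size_t n, const size_t *idx,
            const char *const *new_names, size_t count,
            const char **err_name)
{
  /* First pass: look for two positions that would share a name. */
  for (size_t a = 0; a < n; a++)
    for (size_t b = a + 1; b < n; b++)
      {
        const char *name_a
          = name_after_rename (names, idx, new_names, count, a);
        const char *name_b
          = name_after_rename (names, idx, new_names, count, b);
        if (!strcmp (name_a, name_b))
          {
            if (err_name != NULL)
              *err_name = name_a;
            return false;
          }
      }

  /* Second pass: no conflicts, so apply every rename. */
  for (size_t i = 0; i < count; i++)
    {
      assert (idx[i] < n);
      names[idx[i]] = new_names[i];
    }
  return true;
}
```
@end verbatim

Because the model checks only the final set of names, it accepts a
swap of two names in a single call, which could not be expressed as a
sequence of single renames without a temporary name.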
2191
2192@node Dictionary Weight Variable
2193@subsection Weight Variable
2194
2195A data set's cases may optionally be weighted by the value of a
2196numeric variable.  @xref{WEIGHT,,,pspp, PSPP Users Guide}, for a user
2197view of weight variables.
2198
2199The weight variable is written to and read from system and portable
2200files.
2201
2202The most commonly useful function related to weighting is a
2203convenience function to retrieve a weighting value from a case.
2204
2205@deftypefun double dict_get_case_weight (const struct dictionary *@var{dict}, const struct ccase *@var{case}, bool *@var{warn_on_invalid})
2206Retrieves and returns the value of the weighting variable specified by
2207@var{dict} from @var{case}.  Returns 1.0 if @var{dict} has no
2208weighting variable.
2209
Returns 0.0 if @var{case}'s weight value is user- or system-missing,
2211zero, or negative.  In such a case, if @var{warn_on_invalid} is
2212non-null and @code{*@var{warn_on_invalid}} is true,
2213@func{dict_get_case_weight} also issues an error message and sets
2214@code{*@var{warn_on_invalid}} to false.  To disable error reporting,
2215pass a null pointer or a pointer to false as @var{warn_on_invalid} or
2216use a @func{msg_disable}/@func{msg_enable} pair.
2217@end deftypefun
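
The rules above, including the warn-once behavior of
@var{warn_on_invalid}, can be summarized in a small self-contained
model.  This is a sketch under stated assumptions, not the PSPP
implementation: @code{model_case_weight} is hypothetical, the
dictionary and case are reduced to two flags and a number, and the
warning is written to @code{stderr} instead of going through PSPP's
message facility.

@verbatim
```c
#include <assert.h>
#include <stdbool.h>
#include <stdio.h>

/* Hypothetical model of the rules documented for
   dict_get_case_weight().  HAS_WEIGHT_VAR says whether a weighting
   variable is set, WEIGHT is the case's value for that variable, and
   IS_MISSING says whether that value is user- or system-missing. */
double
model_case_weight (bool has_weight_var, double weight, bool is_missing,
                   bool *warn_on_invalid)
{
  /* Without a weighting variable, every case has weight 1. */
  if (!has_weight_var)
    return 1.0;

  if (is_missing || weight <= 0.0)
    {
      /* Report the problem at most once per flag. */
      if (warn_on_invalid != NULL && *warn_on_invalid)
        {
          fprintf (stderr, "invalid case weight treated as 0\n");
          *warn_on_invalid = false;
        }
      return 0.0;
    }

  return weight;
}
```
@end verbatim

A caller that processes many cases can share one flag across all
calls, so that at most one message is issued for the whole pass.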
2218
2219The dictionary also has a pair of functions for getting and setting
2220the weight variable.
2221
2222@deftypefun {struct variable *} dict_get_weight (const struct dictionary *@var{dict})
2223Returns @var{dict}'s current weighting variable, or a null pointer if
2224the dictionary does not have a weighting variable.
2225@end deftypefun
2226
2227@deftypefun void dict_set_weight (struct dictionary *@var{dict}, struct variable *@var{var})
2228Sets @var{dict}'s weighting variable to @var{var}.  If @var{var} is
2229non-null, it must be a numeric variable in @var{dict}.  If @var{var}
2230is null, then @var{dict}'s weighting variable, if any, is cleared.
2231@end deftypefun
2232
2233@node Dictionary Filter Variable
2234@subsection Filter Variable
2235
2236When the active dataset is read by a procedure, cases can be excluded
2237from analysis based on the values of a @dfn{filter variable}.
2238@xref{FILTER,,,pspp, PSPP Users Guide}, for a user view of filtering.
2239
2240These functions store and retrieve the filter variable.  They are
2241rarely useful, because the data analysis framework automatically
2242excludes from analysis the cases that should be filtered.
2243
2244@deftypefun {struct variable *} dict_get_filter (const struct dictionary *@var{dict})
2245Returns @var{dict}'s current filter variable, or a null pointer if the
2246dictionary does not have a filter variable.
2247@end deftypefun
2248
2249@deftypefun void dict_set_filter (struct dictionary *@var{dict}, struct variable *@var{var})
2250Sets @var{dict}'s filter variable to @var{var}.  If @var{var} is
2251non-null, it must be a numeric variable in @var{dict}.  If @var{var}
2252is null, then @var{dict}'s filter variable, if any, is cleared.
2253@end deftypefun
2254
2255@node Dictionary Case Limit
2256@subsection Case Limit
2257
2258The limit on cases analyzed by a procedure, set by the @cmd{N OF
2259CASES} command (@pxref{N OF CASES,,,pspp, PSPP Users Guide}), is
2260stored as part of the dictionary.  The dictionary does not, on the
2261other hand, play any role in enforcing the case limit (a job done by
2262data analysis framework code).
2263
2264A case limit of 0 means that the number of cases is not limited.
2265
2266These functions are rarely useful, because the data analysis framework
2267automatically excludes from analysis any cases beyond the limit.
2268
2269@deftypefun casenumber dict_get_case_limit (const struct dictionary *@var{dict})
2270Returns the current case limit for @var{dict}.
2271@end deftypefun
2272
2273@deftypefun void dict_set_case_limit (struct dictionary *@var{dict}, casenumber @var{limit})
2274Sets @var{dict}'s case limit to @var{limit}.
2275@end deftypefun
2276
2277@node Dictionary Split Variables
2278@subsection Split Variables
2279
2280The user may use the @cmd{SPLIT FILE} command (@pxref{SPLIT
2281FILE,,,pspp, PSPP Users Guide}) to select a set of variables on which
2282to split the active dataset into groups of cases to be analyzed
2283independently in each statistical procedure.  The set of split
2284variables is stored as part of the dictionary, although the effect on
2285data analysis is implemented by each individual statistical procedure.
2286
2287Split variables may be numeric or short or long string variables.
2288
2289The most useful functions for split variables are those to retrieve
2290them.  Even these functions are rarely useful directly: for the
2291purpose of breaking cases into groups based on the values of the split
2292variables, it is usually easier to use
2293@func{casegrouper_create_splits}.
2294
2295@deftypefun {const struct variable *const *} dict_get_split_vars (const struct dictionary *@var{dict})
2296Returns a pointer to an array of pointers to split variables.  If and
2297only if there are no split variables, returns a null pointer.  The
2298caller must not modify or free the returned array.
2299@end deftypefun
2300
2301@deftypefun size_t dict_get_split_cnt (const struct dictionary *@var{dict})
2302Returns the number of split variables.
2303@end deftypefun
2304
2305The following functions are also available for working with split
2306variables.
2307
2308@deftypefun void dict_set_split_vars (struct dictionary *@var{dict}, struct variable *const *@var{vars}, size_t @var{cnt})
2309Sets @var{dict}'s split variables to the @var{cnt} variables in
2310@var{vars}.  If @var{cnt} is 0, then @var{dict} will not have any
2311split variables.  The caller retains ownership of @var{vars}.
2312@end deftypefun
2313
2314@deftypefun void dict_unset_split_var (struct dictionary *@var{dict}, struct variable *@var{var})
2315Removes @var{var}, which must be a variable in @var{dict}, from
@var{dict}'s set of split variables.
2317@end deftypefun
2318
2319@node Dictionary File Label
2320@subsection File Label
2321
2322A dictionary may optionally have an associated string that describes
2323its contents, called its file label.  The user may set the file label
2324with the @cmd{FILE LABEL} command (@pxref{FILE LABEL,,,pspp, PSPP
2325Users Guide}).
2326
2327These functions set and retrieve the file label.
2328
2329@deftypefun {const char *} dict_get_label (const struct dictionary *@var{dict})
2330Returns @var{dict}'s file label.  If @var{dict} does not have a label,
2331returns a null pointer.
2332@end deftypefun
2333
2334@deftypefun void dict_set_label (struct dictionary *@var{dict}, const char *@var{label})
2335Sets @var{dict}'s label to @var{label}.  If @var{label} is non-null,
2336then its content, truncated to at most 60 bytes, becomes the new file
2337label.  If @var{label} is null, then @var{dict}'s label is removed.
2338
2339The caller retains ownership of @var{label}.
2340@end deftypefun
2341
2342@node Dictionary Documents
2343@subsection Documents
2344
2345A dictionary may include an arbitrary number of lines of explanatory
2346text, called the dictionary's documents.  For compatibility, document
2347lines have a fixed width, and lines that are not exactly this width
2348are truncated or padded with spaces as necessary to bring them to the
2349correct width.
2350
2351PSPP users can use the @cmd{DOCUMENT} (@pxref{DOCUMENT,,,pspp, PSPP
2352Users Guide}), @cmd{ADD DOCUMENT} (@pxref{ADD DOCUMENT,,,pspp, PSPP
2353Users Guide}), and @cmd{DROP DOCUMENTS} (@pxref{DROP DOCUMENTS,,,pspp,
2354PSPP Users Guide}) commands to manipulate documents.
2355
2356@deftypefn Macro int DOC_LINE_LENGTH
The fixed length of a document line, in bytes, defined as 80.
2358@end deftypefn
2359
2360The following functions work with whole sets of documents.  They
2361accept or return sets of documents formatted as null-terminated
2362strings that are an exact multiple of @code{DOC_LINE_LENGTH}
2363bytes in length.
2364
2365@deftypefun {const char *} dict_get_documents (const struct dictionary *@var{dict})
2366Returns the documents in @var{dict}, or a null pointer if @var{dict}
2367has no documents.
2368@end deftypefun
2369
2370@deftypefun void dict_set_documents (struct dictionary *@var{dict}, const char *@var{new_documents})
2371Sets @var{dict}'s documents to @var{new_documents}.  If
2372@var{new_documents} is a null pointer or an empty string, then
2373@var{dict}'s documents are cleared.  The caller retains ownership of
2374@var{new_documents}.
2375@end deftypefun
2376
2377@deftypefun void dict_clear_documents (struct dictionary *@var{dict})
2378Clears the documents from @var{dict}.
2379@end deftypefun
2380
2381The following functions work with individual lines in a dictionary's
2382set of documents.
2383
2384@deftypefun void dict_add_document_line (struct dictionary *@var{dict}, const char *@var{content})
2385Appends @var{content} to the documents in @var{dict}.  The text in
2386@var{content} will be truncated or padded with spaces as necessary to
2387make it exactly @code{DOC_LINE_LENGTH} bytes long.  The caller retains
2388ownership of @var{content}.
2389
If @var{content} is longer than @code{DOC_LINE_LENGTH} bytes, this
function also issues a warning using @func{msg}.  To suppress the
warning, enclose the call to this function in a
@func{msg_disable}/@func{msg_enable} pair.
2394@end deftypefun
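
The truncate-or-pad rule for document lines can be sketched as
follows.  This is a minimal model, not PSPP code:
@code{format_document_line} is a hypothetical helper, the macro is
redefined locally with the value given above, and truncation is done
on raw bytes without regard to character encoding.

@verbatim
```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <string.h>

#define DOC_LINE_LENGTH 80

/* Hypothetical helper: copies CONTENT into LINE, which must have room
   for DOC_LINE_LENGTH + 1 bytes, truncating it or padding it with
   spaces so that the result is exactly DOC_LINE_LENGTH bytes long,
   followed by a null terminator.  Returns true if CONTENT had to be
   truncated, which is the case in which a warning would be issued. */
bool
format_document_line (const char *content, char *line)
{
  size_t len = strlen (content);
  bool truncated = len > DOC_LINE_LENGTH;
  if (truncated)
    len = DOC_LINE_LENGTH;

  memcpy (line, content, len);
  memset (line + len, ' ', DOC_LINE_LENGTH - len);
  line[DOC_LINE_LENGTH] = '\0';
  return truncated;
}
```
@end verbatim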
2395
2396@deftypefun size_t dict_get_document_line_cnt (const struct dictionary *@var{dict})
Returns the number of lines of documents in @var{dict}.  If the
2398dictionary contains no documents, returns 0.
2399@end deftypefun
2400
2401@deftypefun void dict_get_document_line (const struct dictionary *@var{dict}, size_t @var{idx}, struct string *@var{content})
2402Replaces the text in @var{content} (which must already have been
2403initialized by the caller) by the document line in @var{dict} numbered
2404@var{idx}, which must be less than the number of lines of documents in
2405@var{dict}.  Any trailing white space in the document line is trimmed,
2406so that @var{content} will have a length between 0 and
2407@code{DOC_LINE_LENGTH}.
2408@end deftypefun
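
The trimming step can be modeled as below.  This is an illustrative
sketch only: @code{trimmed_length} is a hypothetical helper, and it
treats only the space character as white space.

@verbatim
```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical helper: returns the length of the LENGTH-byte document
   line LINE once its trailing spaces have been dropped.  The result
   is between 0 and LENGTH. */
size_t
trimmed_length (const char *line, size_t length)
{
  while (length > 0 && line[length - 1] == ' ')
    length--;
  return length;
}
```
@end verbatim

A stored line that consists entirely of padding therefore comes back
with a length of 0.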
2409
2410@node Coding Conventions
2411@section Coding Conventions
2412
2413Every @file{.c} file should have @samp{#include <config.h>} as its
2414first non-comment line.  No @file{.h} file should include
2415@file{config.h}.
2416
2417This section needs to be finished.
2418
2419@node Cases
2420@section Cases
2421
2422This section needs to be written.
2423
2424@node Data Sets
2425@section Data Sets
2426
2427This section needs to be written.
2428
2429@node Pools
2430@section Pools
2431
2432This section needs to be written.
2433
2434@c  LocalWords:  bool
2435