@c PSPP - a program for statistical analysis.
@c Copyright (C) 2019 Free Software Foundation, Inc.
@c Permission is granted to copy, distribute and/or modify this document
@c under the terms of the GNU Free Documentation License, Version 1.3
@c or any later version published by the Free Software Foundation;
@c with no Invariant Sections, no Front-Cover Texts, and no Back-Cover Texts.
@c A copy of the license is included in the section entitled "GNU
@c Free Documentation License".
@c

@node Basic Concepts
@chapter Basic Concepts

This chapter introduces basic data structures and other concepts
needed for developing in PSPP.

@menu
* Values::
* Input and Output Formats::
* User-Missing Values::
* Value Labels::
* Variables::
* Dictionaries::
* Coding Conventions::
* Cases::
* Data Sets::
* Pools::
@end menu

@node Values
@section Values

@cindex value
The unit of data in PSPP is a @dfn{value}.

@cindex width
@cindex string value
@cindex numeric value
@cindex MAX_STRING
Values are classified by @dfn{type} and @dfn{width}. The
type of a value is either @dfn{numeric} or @dfn{string} (sometimes
called alphanumeric). The width of a string value ranges from 1 to
@code{MAX_STRING} bytes. The width of a numeric value is artificially
defined to be 0; thus, the type of a value can be inferred from its
width.

Some support is provided for working with value types and widths, in
@file{data/val-type.h}:

@deftypefn Macro int MAX_STRING
Maximum width of a string value, in bytes, currently 32,767.
@end deftypefn

@deftypefun bool val_type_is_valid (enum val_type @var{val_type})
Returns true if @var{val_type} is a valid value type, that is,
either @code{VAL_NUMERIC} or @code{VAL_STRING}. Useful for
assertions.
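
The rule that a value's type follows from its width can be modeled in
a few lines of stand-alone C. The sketch below does not include the
PSPP headers: it re-declares @code{MAX_STRING} and @code{enum val_type}
locally, and the numbering of the enumeration constants and the use of
@code{assert} are assumptions made only for illustration, not a copy
of @file{data/val-type.h}.

```c
#include <assert.h>
#include <stdbool.h>

/* Local stand-ins for declarations in data/val-type.h, repeated
   here so that the sketch compiles without the PSPP headers. */
#define MAX_STRING 32767

enum val_type
  {
    VAL_NUMERIC,                /* Assumed numbering, for illustration. */
    VAL_STRING
  };

/* Returns true if VAL_TYPE is one of the two valid value types. */
bool
val_type_is_valid (enum val_type val_type)
{
  return val_type == VAL_NUMERIC || val_type == VAL_STRING;
}

/* Infers the value type from WIDTH: width 0 means numeric, any
   width from 1 to MAX_STRING means string. */
enum val_type
val_type_from_width (int width)
{
  assert (width >= 0 && width <= MAX_STRING);
  return width == 0 ? VAL_NUMERIC : VAL_STRING;
}
```

In this model, code that receives only a width can recover the type
with @code{val_type_from_width (width)} and guard the result with
@code{assert (val_type_is_valid (type))}.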
@end deftypefun

@deftypefun {enum val_type} val_type_from_width (int @var{width})
Returns @code{VAL_NUMERIC} if @var{width} is 0 and thus represents the
width of a numeric value, otherwise @code{VAL_STRING} to indicate that
@var{width} is the width of a string value.
@end deftypefun

The following subsections describe how values of each type are
represented.

@menu
* Numeric Values::
* String Values::
* Runtime Typed Values::
@end menu

@node Numeric Values
@subsection Numeric Values

A value known to be numeric at compile time is represented as a
@code{double}. PSPP provides three values of @code{double} for
special purposes, defined in @file{data/val-type.h}:

@deftypefn Macro double SYSMIS
The @dfn{system-missing value}, used to represent a datum whose true
value is unknown, such as a survey question that was not answered by
the respondent, or undefined, such as the result of division by zero.
PSPP propagates the system-missing value through calculations and
compensates for missing values in statistical analyses. @xref{Missing
Observations,,,pspp, PSPP Users Guide}, for a PSPP user's view of
missing values.

PSPP currently defines @code{SYSMIS} as @code{-DBL_MAX}, that is, the
greatest finite negative value of @code{double}. It is best not to
depend on this definition, because PSPP may transition to using an
IEEE NaN (not a number) instead at some point in the future.
@end deftypefn

@deftypefn Macro double LOWEST
@deftypefnx Macro double HIGHEST
The greatest finite negative (except for @code{SYSMIS}) and positive
values of @code{double}, respectively. These values do not ordinarily
appear in user data files. Instead, they are used to implement
endpoints of open-ended ranges that are occasionally permitted in PSPP
syntax, e.g.@: @code{5 THRU HI} as a range of missing values
(@pxref{MISSING VALUES,,,pspp, PSPP Users Guide}).
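
The following stand-alone sketch illustrates why open-ended ranges can
be stored as ordinary closed ranges. It defines local stand-ins for
the three macros: @code{SYSMIS} follows the definition given above,
while the hexadecimal literal used for @code{LOWEST} (the next
@code{double} above @code{-DBL_MAX} on IEEE 754 systems) and the helper
@code{range_contains} are assumptions for illustration and are not
taken from the PSPP sources.

```c
#include <assert.h>
#include <float.h>
#include <stdbool.h>

/* Local stand-ins for the macros in data/val-type.h. */
#define SYSMIS (-DBL_MAX)
/* Assumption: on IEEE 754 doubles, the next value above -DBL_MAX. */
#define LOWEST (-0x1.ffffffffffffep+1023)
#define HIGHEST DBL_MAX

/* Hypothetical helper: returns true if X lies in the closed range
   [LOW, HIGH].  Because LOW may be no smaller than LOWEST, and
   LOWEST is greater than SYSMIS, the system-missing value never
   falls inside any range, so it needs no special case here. */
bool
range_contains (double low, double high, double x)
{
  assert (low > SYSMIS && low <= high);
  return x >= low && x <= high;
}
```

Under these assumptions, a range such as @code{5 THRU HI} corresponds
to @code{range_contains (5.0, HIGHEST, x)}, and a range open at the
low end corresponds to @code{range_contains (LOWEST, 0.0, x)}.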
@end deftypefn

@node String Values
@subsection String Values

A value known at compile time to have string type is represented as an
array of @code{char}. String values do not necessarily represent
readable text strings and may contain arbitrary 8-bit data, including
null bytes, control codes, and bytes with the high bit set. Thus,
string values are not null-terminated strings, but rather opaque
arrays of bytes.

@code{SYSMIS}, @code{LOWEST}, and @code{HIGHEST} have no equivalents
as string values. Usually, PSPP fills an unknown or undefined string
value with spaces, but PSPP does not treat such a string as a special
case when it processes it later.

@cindex MAX_STRING
@code{MAX_STRING}, the maximum length of a string value, is defined in
@file{data/val-type.h}.

@node Runtime Typed Values
@subsection Runtime Typed Values

When a value's type is only known at runtime, it is often represented
as a @union{value}, defined in @file{data/value.h}. A @union{value}
does not identify the type or width of the data it contains. Code
that works with @union{value}s must therefore have external knowledge
of their content, often through the type and width of a
@struct{variable} (@pxref{Variables}).

@union{value} has one member that clients are permitted to access
directly, a @code{double} named @samp{f} that stores the content of a
numeric @union{value}. It has other members that store the content of
string @union{value}s, but client code should use accessor functions
instead of referring to these directly.

PSPP provides some functions for working with @union{value}s. The
most useful are described below. To use these functions, recall that
a numeric value has a width of 0.

@deftypefun void value_init (union value *@var{value}, int @var{width})
Initializes @var{value} as a value of the given @var{width}.
After initialization, the data in @var{value} are indeterminate; the caller
is responsible for storing initial data in it.
@end deftypefun

@deftypefun void value_destroy (union value *@var{value}, int @var{width})
Frees auxiliary storage associated with @var{value}, which must have
the given @var{width}.
@end deftypefun

@deftypefun bool value_needs_init (int @var{width})
For some widths, @func{value_init} and @func{value_destroy} do not
actually do anything, because no additional storage is needed beyond
the size of @union{value}. This function returns false if @var{width}
is such a width, in which case there is no actual need to call those
functions. This can be a useful optimization if a large number of
@union{value}s of such a width are to be initialized or destroyed.

This function returns true if @func{value_init} and
@func{value_destroy} are actually required for the given @var{width}.
@end deftypefun

@deftypefun void value_copy (union value *@var{dst}, @
                             const union value *@var{src}, @
                             int @var{width})
Copies the contents of @union{value} @var{src} to @var{dst}. Both
@var{dst} and @var{src} must have been initialized with the specified
@var{width}.
@end deftypefun

@deftypefun void value_set_missing (union value *@var{value}, int @var{width})
Sets @var{value} to @code{SYSMIS} if it is numeric or to all spaces if
it is alphanumeric, according to @var{width}. @var{value} must have
been initialized with the specified @var{width}.
@end deftypefun

@anchor{value_is_resizable}
@deftypefun bool value_is_resizable (const union value *@var{value}, int @var{old_width}, int @var{new_width})
Determines whether @var{value}, which must have been initialized with
the specified @var{old_width}, may be resized to @var{new_width}.
Resizing is possible if the following criteria are met.
First, @var{old_width} and @var{new_width} must be both numeric or both
string widths. Second, if @var{new_width} is a short string width and
less than @var{old_width}, resizing is allowed only if bytes
@var{new_width} through @var{old_width} in @var{value} contain only
spaces.

These rules are part of those used by @func{mv_is_resizable} and
@func{val_labs_can_set_width}.
@end deftypefun

@deftypefun void value_resize (union value *@var{value}, int @var{old_width}, int @var{new_width})
Resizes @var{value} from @var{old_width} to @var{new_width}, which
must be allowed by the rules stated above. @var{value} must have been
initialized with the specified @var{old_width} before calling this
function. After resizing, @var{value} has width @var{new_width}.

If @var{new_width} is greater than @var{old_width}, @var{value} will
be padded on the right with spaces to the new width. If
@var{new_width} is less than @var{old_width}, the rightmost bytes of
@var{value} are truncated.
@end deftypefun

@deftypefun bool value_equal (const union value *@var{a}, const union value *@var{b}, int @var{width})
Compares @var{a} and @var{b}, which must both have width
@var{width}. Returns true if their contents are the same, false if
they differ.
@end deftypefun

@deftypefun int value_compare_3way (const union value *@var{a}, const union value *@var{b}, int @var{width})
Compares @var{a} and @var{b}, which must both have width
@var{width}. Returns -1 if @var{a} is less than @var{b}, 0 if they
are equal, or 1 if @var{a} is greater than @var{b}.

Numeric values are compared numerically, with @code{SYSMIS} comparing
less than any real number. String values are compared
lexicographically byte-by-byte.
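
The ordering described above can be modeled without the PSPP headers.
In the sketch below, the helper names @code{compare_numeric_3way} and
@code{compare_string_3way} are hypothetical, @code{SYSMIS} is
re-declared locally using the definition given earlier in this
chapter, and the use of @code{memcmp} for the byte-by-byte string
comparison is an assumption rather than a statement about how PSPP
implements @func{value_compare_3way}.

```c
#include <assert.h>
#include <float.h>
#include <string.h>

/* Local stand-in for the macro in data/val-type.h. */
#define SYSMIS (-DBL_MAX)

/* Hypothetical helper: returns -1, 0, or 1 as A is less than,
   equal to, or greater than B.  With SYSMIS defined as -DBL_MAX,
   plain numeric comparison already orders it below every other
   finite value. */
int
compare_numeric_3way (double a, double b)
{
  return a < b ? -1 : a > b;
}

/* Hypothetical helper: compares two string values of WIDTH bytes.
   String values are opaque byte arrays that may contain null
   bytes, so the comparison covers exactly WIDTH bytes instead of
   stopping at a terminator. */
int
compare_string_3way (const char *a, const char *b, int width)
{
  assert (width > 0);
  int cmp = memcmp (a, b, (size_t) width);
  return cmp < 0 ? -1 : cmp > 0;
}
```

Normalizing the result to -1, 0, or 1 matters for the string case,
because @code{memcmp} only promises a negative, zero, or positive
result.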
@end deftypefun

@deftypefun size_t value_hash (const union value *@var{value}, int @var{width}, unsigned int @var{basis})
Computes and returns a hash of @var{value}, which must have the
specified @var{width}. The value in @var{basis} is folded into the
hash.
@end deftypefun

@node Input and Output Formats
@section Input and Output Formats

Input and output formats specify how to convert data fields to and
from data values (@pxref{Input and Output Formats,,,pspp, PSPP Users
Guide}). PSPP uses @struct{fmt_spec} to represent input and output
formats.

Function prototypes and other declarations related to formats are in
the @file{<data/format.h>} header.

@deftp {Structure} {struct fmt_spec}
An input or output format, with the following members:

@table @code
@item enum fmt_type type
The format type (see below).

@item int w
Field width, in bytes. The width of numeric fields is always between
1 and 40 bytes, and the width of string fields is always between 1 and
65534 bytes. However, many individual types of formats place stricter
limits on field width (see @ref{fmt_max_input_width},
@ref{fmt_max_output_width}).

@item int d
Number of decimal places, in character positions. For format types
that do not allow decimal places to be specified, this value must be
0. Format types that do allow decimal places have type-specific and
often width-specific restrictions on @code{d} (see
@ref{fmt_max_input_decimals}, @ref{fmt_max_output_decimals}).
@end table
@end deftp

@deftp {Enumeration} {enum fmt_type}
An enumerated type representing an input or output format type. Each
PSPP input and output format has a corresponding enumeration constant
prefixed by @samp{FMT_}: @code{FMT_F}, @code{FMT_COMMA},
@code{FMT_DOT}, and so on.
@end deftp

The following sections describe functions for manipulating formats and
the data in fields represented by formats.

@menu
* Constructing and Verifying Formats::
* Format Utility Functions::
* Obtaining Properties of Format Types::
* Numeric Formatting Styles::
* Formatted Data Input and Output::
@end menu

@node Constructing and Verifying Formats
@subsection Constructing and Verifying Formats

These functions construct @struct{fmt_spec}s and verify that they are
valid.

@deftypefun {struct fmt_spec} fmt_for_input (enum fmt_type @var{type}, int @var{w}, int @var{d})
@deftypefunx {struct fmt_spec} fmt_for_output (enum fmt_type @var{type}, int @var{w}, int @var{d})
Constructs a @struct{fmt_spec} with the given @var{type}, @var{w}, and
@var{d}, asserts that the result is a valid input (or output) format,
and returns it.
@end deftypefun

@anchor{fmt_for_output_from_input}
@deftypefun {struct fmt_spec} fmt_for_output_from_input (const struct fmt_spec *@var{input})
Given @var{input}, which must be a valid input format, returns the
equivalent output format. @xref{Input and Output Formats,,,pspp, PSPP
Users Guide}, for the rules for converting input formats into output
formats.
@end deftypefun

@deftypefun {struct fmt_spec} fmt_default_for_width (int @var{width})
Returns the default output format for a variable of the given
@var{width}. For a numeric variable, this is F8.2 format; for a
string variable, it is the A format of the given @var{width}.
@end deftypefun

The following functions check whether a @struct{fmt_spec} is valid for
various uses and return true if so, false otherwise. When any of them
returns false, it also outputs an explanatory error message using
@func{msg}. To suppress error output, enclose a call to one of these
functions by a @func{msg_disable}/@func{msg_enable} pair.

@deftypefun bool fmt_check (const struct fmt_spec *@var{format}, bool @var{for_input})
@deftypefunx bool fmt_check_input (const struct fmt_spec *@var{format})
@deftypefunx bool fmt_check_output (const struct fmt_spec *@var{format})
Checks whether @var{format} is a valid input format (for
@func{fmt_check_input}, or @func{fmt_check} if @var{for_input}) or
output format (for @func{fmt_check_output}, or @func{fmt_check} if not
@var{for_input}).
@end deftypefun

@deftypefun bool fmt_check_type_compat (const struct fmt_spec *@var{format}, enum val_type @var{type})
Checks whether @var{format} matches the value type @var{type}, that
is, if @var{type} is @code{VAL_NUMERIC} and @var{format} is a numeric
format or @var{type} is @code{VAL_STRING} and @var{format} is a string
format.
@end deftypefun

@deftypefun bool fmt_check_width_compat (const struct fmt_spec *@var{format}, int @var{width})
Checks whether @var{format} may be used as an output format for a
value of the given @var{width}.

@func{fmt_var_width}, described in
the following section, can also be used to determine the value
width needed by a format.
@end deftypefun

@node Format Utility Functions
@subsection Format Utility Functions

These functions work with @struct{fmt_spec}s.

@deftypefun int fmt_var_width (const struct fmt_spec *@var{format})
Returns the width for values associated with @var{format}. If
@var{format} is a numeric format, the width is 0; if @var{format} is
an A format, then the width is @code{@var{format}->w}; otherwise,
@var{format} is an AHEX format and its width is @code{@var{format}->w
/ 2}.
@end deftypefun

@deftypefun char *fmt_to_string (const struct fmt_spec *@var{format}, char @var{s}[FMT_STRING_LEN_MAX + 1])
Converts @var{format} to a human-readable format specifier in @var{s}
and returns @var{s}.
@var{format} need not be a valid input or output
format specifier, e.g.@: it is allowed to have an excess width or
decimal places. In particular, if @var{format} has decimals, they are
included in the output string, even if @var{format}'s type does not
allow decimals, to allow accurately presenting incorrect formats to
the user.
@end deftypefun

@deftypefun bool fmt_equal (const struct fmt_spec *@var{a}, const struct fmt_spec *@var{b})
Compares @var{a} and @var{b} memberwise and returns true if they are
identical, false otherwise. Neither @var{a} nor @var{b} need be a
valid input or output format specifier.
@end deftypefun

@deftypefun void fmt_resize (struct fmt_spec *@var{fmt}, int @var{width})
Sets the width of @var{fmt} to a valid format for a @union{value} of size @var{width}.
@end deftypefun

@node Obtaining Properties of Format Types
@subsection Obtaining Properties of Format Types

These functions work with @enum{fmt_type}s instead of the higher-level
@struct{fmt_spec}s. Their primary purpose is to report properties of
each possible format type, which in turn allows clients to abstract
away many of the details of the very heterogeneous requirements of
each format type.

The first group of functions works with format type names.

@deftypefun const char *fmt_name (enum fmt_type @var{type})
Returns the name for the given @var{type}, e.g.@: @code{"COMMA"} for
@code{FMT_COMMA}.
@end deftypefun

@deftypefun bool fmt_from_name (const char *@var{name}, enum fmt_type *@var{type})
Tries to find the @enum{fmt_type} associated with @var{name}. If
successful, sets @code{*@var{type}} to the type and returns true;
otherwise, returns false without modifying @code{*@var{type}}.
@end deftypefun

The functions below query basic limits on width and decimal places for
each kind of format.

@deftypefun bool fmt_takes_decimals (enum fmt_type @var{type})
Returns true if a format of the given @var{type} is allowed to have a
nonzero number of decimal places (the @code{d} member of
@struct{fmt_spec}), false if not.
@end deftypefun

@anchor{fmt_min_input_width}
@anchor{fmt_max_input_width}
@anchor{fmt_min_output_width}
@anchor{fmt_max_output_width}
@deftypefun int fmt_min_input_width (enum fmt_type @var{type})
@deftypefunx int fmt_max_input_width (enum fmt_type @var{type})
@deftypefunx int fmt_min_output_width (enum fmt_type @var{type})
@deftypefunx int fmt_max_output_width (enum fmt_type @var{type})
Returns the minimum or maximum width (the @code{w} member of
@struct{fmt_spec}) allowed for an input or output format of the
specified @var{type}.
@end deftypefun

@anchor{fmt_max_input_decimals}
@anchor{fmt_max_output_decimals}
@deftypefun int fmt_max_input_decimals (enum fmt_type @var{type}, int @var{width})
@deftypefunx int fmt_max_output_decimals (enum fmt_type @var{type}, int @var{width})
Returns the maximum number of decimal places allowed for an input or
output format, respectively, of the given @var{type} and @var{width}.
Returns 0 if the specified @var{type} does not allow any decimal
places or if @var{width} is too narrow to allow decimal places.
@end deftypefun

@deftypefun int fmt_step_width (enum fmt_type @var{type})
Returns the ``width step'' for a @struct{fmt_spec} of the given
@var{type}. A @struct{fmt_spec}'s width must be a multiple of its
type's width step. Most format types have a width step of 1, so that
their formats' widths may be any integer within the valid range, but
hexadecimal numeric formats and AHEX string formats have a width step
of 2.
@end deftypefun

These functions allow clients to broadly determine how each kind of
input or output format behaves.
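
As a brief aside on the width limits just described, the sketch below
models how a width step combines with minimum and maximum widths, and
how a format width maps to a value width as described for
@func{fmt_var_width}. It is a stand-alone illustration: the
@code{sketch_} names and the three-member enumeration are
hypothetical, and the minimum and maximum widths are supplied by the
caller rather than taken from PSPP's tables.

```c
#include <assert.h>
#include <stdbool.h>

/* A few illustrative format types.  The real enum fmt_type in
   data/format.h has many more members; these names are
   hypothetical. */
enum sketch_fmt_type
  {
    SKETCH_F,                   /* A numeric format. */
    SKETCH_A,                   /* String format, one byte per column. */
    SKETCH_AHEX                 /* String format, two hex digits per byte. */
  };

/* Width step: AHEX widths must be even, the others may be any
   integer in range. */
int
sketch_step_width (enum sketch_fmt_type type)
{
  return type == SKETCH_AHEX ? 2 : 1;
}

/* Returns true if W lies within [MIN_W, MAX_W] and is a multiple
   of TYPE's width step. */
bool
sketch_width_ok (enum sketch_fmt_type type, int w, int min_w, int max_w)
{
  assert (min_w <= max_w);
  return w >= min_w && w <= max_w && w % sketch_step_width (type) == 0;
}

/* Value width implied by a format of TYPE with field width W:
   0 for numeric formats, W for A, and W / 2 for AHEX. */
int
sketch_var_width (enum sketch_fmt_type type, int w)
{
  switch (type)
    {
    case SKETCH_A:
      return w;
    case SKETCH_AHEX:
      return w / 2;
    default:
      return 0;
    }
}
```

In this model, an AHEX field of width 8 describes a 4-byte string
value, and an AHEX width of 7 is rejected because it is not a multiple
of the width step.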

@deftypefun bool fmt_is_string (enum fmt_type @var{type})
@deftypefunx bool fmt_is_numeric (enum fmt_type @var{type})
Returns true if @var{type} is a format for string or numeric values,
respectively, false otherwise.
@end deftypefun

@deftypefun enum fmt_category fmt_get_category (enum fmt_type @var{type})
Returns the category within which @var{type} falls.

@deftp {Enumeration} {enum fmt_category}
A group of format types. Format type categories correspond to the
input and output categories described in the PSPP user documentation
(@pxref{Input and Output Formats,,,pspp, PSPP Users Guide}).

Each format is in exactly one category. The categories have bitwise
disjoint values to make it easy to test whether a format type is in
one of multiple categories, e.g.@:

@example
if (fmt_get_category (type) & (FMT_CAT_DATE | FMT_CAT_TIME))
  @{
    /* @dots{}@r{@code{type} is a date or time format}@dots{} */
  @}
@end example

The format categories are:
@table @code
@item FMT_CAT_BASIC
Basic numeric formats.

@item FMT_CAT_CUSTOM
Custom currency formats.

@item FMT_CAT_LEGACY
Legacy numeric formats.

@item FMT_CAT_BINARY
Binary formats.

@item FMT_CAT_HEXADECIMAL
Hexadecimal formats.

@item FMT_CAT_DATE
Date formats.

@item FMT_CAT_TIME
Time formats.

@item FMT_CAT_DATE_COMPONENT
Date component formats.

@item FMT_CAT_STRING
String formats.
@end table
@end deftp
@end deftypefun

The PSPP input and output routines use the following pair of functions
to convert @enum{fmt_type}s to and from the separate set of codes used
in system and portable files:

@deftypefun int fmt_to_io (enum fmt_type @var{type})
Returns the format code used in system and portable files that
corresponds to @var{type}.
@end deftypefun

@deftypefun bool fmt_from_io (int @var{io}, enum fmt_type *@var{type})
Converts @var{io}, a format code used in system and portable files,
into a @enum{fmt_type} in @code{*@var{type}}. Returns true if
successful, false if @var{io} is not valid.
@end deftypefun

These functions reflect the relationship between input and output
formats.

@deftypefun enum fmt_type fmt_input_to_output (enum fmt_type @var{type})
Returns the output format type that is used by default by DATA LIST
and other input procedures when @var{type} is specified as an input
format. The conversion from input format to output format is more
complicated than simply changing the format type.
@xref{fmt_for_output_from_input}, for a function that performs the
entire conversion.
@end deftypefun

@deftypefun bool fmt_usable_for_input (enum fmt_type @var{type})
Returns true if @var{type} may be used as an input format type, false
otherwise. The custom currency formats, in particular, may be used
for output but not for input.

All format types are valid for output.
@end deftypefun

The final group of format type property functions obtain
human-readable templates that illustrate the formats graphically.

@deftypefun const char *fmt_date_template (enum fmt_type @var{type})
Returns a formatting template for @var{type}, which must be a date or
time format type. These formats are used by @func{data_in} and
@func{data_out} to guide parsing and formatting date and time data.
@end deftypefun

@deftypefun char *fmt_dollar_template (const struct fmt_spec *@var{format})
Returns a string of the form @code{$#,###.##} according to
@var{format}, which must be of type @code{FMT_DOLLAR}. The caller
must free the string with @code{free}.
@end deftypefun

@node Numeric Formatting Styles
@subsection Numeric Formatting Styles

Each of the basic numeric formats (F, E, COMMA, DOT, DOLLAR, PCT) and
custom currency formats (CCA, CCB, CCC, CCD, CCE) has an associated
numeric formatting style, represented by @struct{fmt_number_style}.
Input and output conversion of formats that have numeric styles is
determined mainly by the style, although the formatting rules have
special cases that are not represented within the style.

@deftp {Structure} {struct fmt_number_style}
A structure type with the following members:

@table @code
@item struct substring neg_prefix
@itemx struct substring prefix
@itemx struct substring suffix
@itemx struct substring neg_suffix
A set of strings used as a prefix to negative numbers, a prefix to every
number, a suffix to every number, and a suffix to negative numbers,
respectively. Each of these strings is no more than
@code{FMT_STYLE_AFFIX_MAX} bytes (currently 16) in length.
These strings must be freed with @func{ss_dealloc} when no longer
needed.

@item decimal
The character used as a decimal point. It must be either @samp{.} or
@samp{,}.

@item grouping
The character used for grouping digits to the left of the decimal
point. It may be @samp{.} or @samp{,}, in which case it must not be
equal to @code{decimal}, or it may be set to 0 to disable grouping.
@end table
@end deftp

The following functions are provided for working with numeric
formatting styles.

@deftypefun void fmt_number_style_init (struct fmt_number_style *@var{style})
Initializes a @struct{fmt_number_style} with all of the
prefixes and suffixes set to the empty string, @samp{.} as the decimal
point character, and grouping disabled.
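
The constraints on the @code{decimal} and @code{grouping} members can
be expressed compactly. The sketch below is a simplified stand-alone
model, not PSPP code: the structure @code{sketch_number_style} omits
the four affix strings, and both helper functions are hypothetical
names introduced only for this illustration.

```c
#include <assert.h>
#include <stdbool.h>

/* Simplified stand-in for struct fmt_number_style with the affix
   strings omitted. */
struct sketch_number_style
  {
    char decimal;               /* '.' or ','. */
    char grouping;              /* '.', ',', or 0 for no grouping. */
  };

/* Returns a style in the initial state described above: '.' as
   the decimal point and grouping disabled. */
struct sketch_number_style
sketch_style_default (void)
{
  struct sketch_number_style style = { '.', 0 };
  return style;
}

/* Checks the documented constraints: DECIMAL must be '.' or ',',
   and GROUPING must be 0 or the one of '.' and ',' that differs
   from DECIMAL. */
bool
sketch_style_chars_ok (char decimal, char grouping)
{
  bool decimal_ok = decimal == '.' || decimal == ',';
  bool grouping_ok = (grouping == 0
                      || ((grouping == '.' || grouping == ',')
                          && grouping != decimal));
  return decimal_ok && grouping_ok;
}
```

Under this model, a DOT-like style with @samp{,} as the decimal point
and @samp{.} for grouping passes the check, while a style that uses
the same character for both roles does not.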
@end deftypefun

@deftypefun void fmt_number_style_destroy (struct fmt_number_style *@var{style})
Destroys @var{style}, freeing its storage.
@end deftypefun

@deftypefun {struct fmt_number_style} *fmt_create (void)
Creates an array of all the styles used by PSPP and
calls @func{fmt_number_style_init} on each of them.
@end deftypefun

@deftypefun void fmt_done (struct fmt_number_style *@var{styles})
A wrapper function which takes an array of @struct{fmt_number_style}, calls
@func{fmt_number_style_destroy} on each of them, and then frees the array.
@end deftypefun

@deftypefun int fmt_affix_width (const struct fmt_number_style *@var{style})
Returns the total length of @var{style}'s @code{prefix} and @code{suffix}.
@end deftypefun

@deftypefun int fmt_neg_affix_width (const struct fmt_number_style *@var{style})
Returns the total length of @var{style}'s @code{neg_prefix} and
@code{neg_suffix}.
@end deftypefun

PSPP maintains a global set of number styles for each of the basic
numeric formats and custom currency formats. The following functions
work with these global styles:

@deftypefun {const struct fmt_number_style *} fmt_get_style (enum fmt_type @var{type})
Returns the numeric style for the given format @var{type}.
@end deftypefun

@deftypefun {const char *} fmt_name (enum fmt_type @var{type})
Returns the name of the given format @var{type}.
@end deftypefun

@node Formatted Data Input and Output
@subsection Formatted Data Input and Output

These functions provide the ability to convert data fields into
@union{value}s and vice versa.

@deftypefun bool data_in (struct substring @var{input}, const char *@var{encoding}, enum fmt_type @var{type}, int @var{implied_decimals}, int @var{first_column}, const struct dictionary *@var{dict}, union value *@var{output}, int @var{width})
Parses @var{input} as a field containing data in the given format
@var{type}. The resulting value is stored in @var{output}, which the
caller must have initialized with the given @var{width}. For
consistency, @var{width} must be 0 if
@var{type} is a numeric format type and greater than 0 if @var{type}
is a string format type.
@var{encoding} should be set to indicate the character
encoding of @var{input}.
@var{dict} must be a pointer to the dictionary with which @var{output}
is associated.

If @var{input} is the empty string (with length 0), @var{output} is
set to the value set on SET BLANKS (@pxref{SET BLANKS,,,pspp, PSPP
Users Guide}) for a numeric value, or to all spaces for a string
value. This applies regardless of the usual parsing requirements for
@var{type}.

If @var{implied_decimals} is greater than zero, then the numeric
result is shifted right by @var{implied_decimals} decimal places if
@var{input} does not contain a decimal point character or an exponent.
Only certain numeric format types support implied decimal places; for
string formats and other numeric formats, @var{implied_decimals} has
no effect. DATA LIST FIXED is the primary user of this feature
(@pxref{DATA LIST FIXED,,,pspp, PSPP Users Guide}). Other callers
should generally specify 0 for @var{implied_decimals}, to disable this
feature.

When @var{input} contains invalid input data, @func{data_in} outputs a
message using @func{msg}.
@c (@pxref{msg}).
If @var{first_column} is
nonzero, it is included in any such error message as the 1-based
column number of the start of the field.
The last column in the field
is calculated as @var{first_column} plus the length of @var{input},
minus 1. To
suppress error output, enclose the call to @func{data_in} by calls to
@func{msg_disable} and @func{msg_enable}.

This function returns true on success, false if a message was output
(even if suppressed). Overflow and underflow provoke warnings but are
not propagated to the caller as errors.

This function is declared in @file{data/data-in.h}.
@end deftypefun

@deftypefun char * data_out (const union value *@var{input}, const struct fmt_spec *@var{format})
@deftypefunx char * data_out_legacy (const union value *@var{input}, const char *@var{encoding}, const struct fmt_spec *@var{format})
Converts the data pointed to by @var{input} into a string value, which
will be encoded in UTF-8, according to output format specifier @var{format}.
@var{format}
must be a valid output format. The width of @var{input} is
inferred from @var{format} using an algorithm equivalent to
@func{fmt_var_width}.

When @var{input} contains data that cannot be represented in the given
@var{format}, @func{data_out} may output a message using @func{msg},
@c (@pxref{msg}),
although the current implementation does not
consistently do so. To suppress error output, enclose the call to
@func{data_out} by calls to @func{msg_disable} and @func{msg_enable}.

This function is declared in @file{data/data-out.h}.
@end deftypefun

@node User-Missing Values
@section User-Missing Values

In addition to the system-missing value for numeric values, each
variable has a set of user-missing values (@pxref{MISSING
VALUES,,,pspp, PSPP Users Guide}). A set of user-missing values is
represented by @struct{missing_values}.

It is rarely necessary to interact directly with a
@struct{missing_values} object.
Instead, the most common operation,
querying whether a particular value is a missing value for a given
variable, is most conveniently executed through functions on
@struct{variable}. @xref{Variable Missing Values}, for details.

A @struct{missing_values} is essentially a set of @union{value}s that
have a common value width (@pxref{Values}). For a set of
missing values associated with a variable (the common case), the set's
width is the same as the variable's width.

Function prototypes and other declarations related to missing values
are declared in @file{data/missing-values.h}.

@deftp {Structure} {struct missing_values}
Opaque type that represents a set of missing values.
@end deftp

The contents of a set of missing values are subject to some
restrictions. Regardless of width, a set of missing values is allowed
to be empty. A set of numeric missing values may contain up to three
discrete numeric values, or a range of numeric values (which includes
both ends of the range), or a range plus one discrete numeric value.
A set of string missing values may contain up to three discrete string
values (with the same width as the set), but ranges are not supported.

In addition, values in string missing values wider than
@code{MV_MAX_STRING} bytes may contain non-space characters only in
their first @code{MV_MAX_STRING} bytes; all the bytes after the first
@code{MV_MAX_STRING} must be spaces. @xref{mv_is_acceptable}, for a
function that tests a value against these constraints.

@deftypefn Macro int MV_MAX_STRING
Number of bytes in a string missing value that are not required to be
spaces. The current value is 8, a value which is fixed by the system
file format. In PSPP we could easily eliminate this restriction, but
doing so would also require us to extend the system file format in an
incompatible way, which we consider a bad tradeoff.
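
The width restriction on string missing values can be checked with a
short loop. The sketch below is stand-alone and only models the rule
stated above: @code{MV_MAX_STRING} is re-declared locally with its
documented value of 8, and @code{sketch_mv_string_ok} is a
hypothetical helper, not the PSPP function that performs this test.

```c
#include <assert.h>
#include <stdbool.h>

/* Local stand-in for the macro in data/missing-values.h. */
#define MV_MAX_STRING 8

/* Hypothetical helper: returns true if VALUE, a string value of
   WIDTH bytes that need not be null-terminated, satisfies the
   rule that every byte after the first MV_MAX_STRING bytes is a
   space.  Values no wider than MV_MAX_STRING always pass. */
bool
sketch_mv_string_ok (const char *value, int width)
{
  assert (width > 0);
  for (int i = MV_MAX_STRING; i < width; i++)
    if (value[i] != ' ')
      return false;
  return true;
}
```

For a 12-byte variable, this model accepts @code{"abcdefgh    "} as a
missing value and rejects @code{"abcdefghi   "}, whose ninth byte is
not a space.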
@end deftypefn

The most frequently useful functions for missing values are those for
testing whether a given value is missing, described in the following
section.  Several other functions for creating, inspecting, and
modifying @struct{missing_values} objects are described afterward, but
these functions are much more rarely useful.

@menu
* Testing for Missing Values::
* Creating and Destroying User-Missing Values::
* Changing User-Missing Value Set Width::
* Inspecting User-Missing Value Sets::
* Modifying User-Missing Value Sets::
@end menu

@node Testing for Missing Values
@subsection Testing for Missing Values

The most frequently useful functions for missing values are those for
testing whether a given value is missing, described here.  However,
using one of the corresponding missing value testing functions for
variables can be even easier (@pxref{Variable Missing Values}).

@deftypefun bool mv_is_value_missing (const struct missing_values *@var{mv}, const union value *@var{value}, enum mv_class @var{class})
@deftypefunx bool mv_is_num_missing (const struct missing_values *@var{mv}, double @var{value}, enum mv_class @var{class})
@deftypefunx bool mv_is_str_missing (const struct missing_values *@var{mv}, const char @var{value}[], enum mv_class @var{class})
Tests whether @var{value} is in one of the categories of missing
values given by @var{class}.  Returns true if so, false otherwise.

@var{mv} determines the width of @var{value} and provides the set of
user-missing values to test.

The only difference among these functions is the form in which
@var{value} is provided, so you may use whichever function is most
convenient.

The @var{class} argument determines the exact kinds of missing values
that the functions test for:

@deftp Enumeration {enum mv_class}
@table @t
@item MV_USER
Returns true if @var{value} is in the set of user-missing values given
by @var{mv}.

@item MV_SYSTEM
Returns true if @var{value} is system-missing.  (If @var{mv}
represents a set of string values, then @var{value} is never
system-missing.)

@item MV_ANY
@itemx MV_USER | MV_SYSTEM
Returns true if @var{value} is user-missing or system-missing.

@item MV_NONE
Always returns false, that is, @var{value} is never considered
missing.
@end table
@end deftp
@end deftypefun

@node Creating and Destroying User-Missing Values
@subsection Creation and Destruction

These functions create and destroy @struct{missing_values} objects.

@deftypefun void mv_init (struct missing_values *@var{mv}, int @var{width})
Initializes @var{mv} as a set of user-missing values.  The set is
initially empty.  Any values added to it must have the specified
@var{width}.
@end deftypefun

@deftypefun void mv_destroy (struct missing_values *@var{mv})
Destroys @var{mv}, which must not be referred to again.
@end deftypefun

@deftypefun void mv_copy (struct missing_values *@var{mv}, const struct missing_values *@var{old})
Initializes @var{mv} as a copy of the existing set of user-missing
values @var{old}.
@end deftypefun

@deftypefun void mv_clear (struct missing_values *@var{mv})
Empties the user-missing value set @var{mv}, retaining its existing
width.
@end deftypefun

@node Changing User-Missing Value Set Width
@subsection Changing User-Missing Value Set Width

A few PSPP language constructs copy sets of user-missing values from
one variable to another.  When the source and target variables have
the same width, this is simple.
But when the target variable's width
might be different from the source variable's, it takes a little more
work.  The functions described here can help.

In fact, it is usually unnecessary to call these functions directly.
Most of the time @func{var_set_missing_values}, which uses
@func{mv_resize} internally to resize the new set of missing values to
the required width, may be used instead.
@xref{var_set_missing_values}, for more information.

@deftypefun bool mv_is_resizable (const struct missing_values *@var{mv}, int @var{new_width})
Tests whether @var{mv}'s width may be changed to @var{new_width} using
@func{mv_resize}.  Returns true if it is allowed, false otherwise.

If @var{mv} contains any missing values, then it may be resized only
if each missing value may be resized, as determined by
@func{value_is_resizable} (@pxref{value_is_resizable}).
@end deftypefun

@anchor{mv_resize}
@deftypefun void mv_resize (struct missing_values *@var{mv}, int @var{width})
Changes @var{mv}'s width to @var{width}.  @var{mv} and @var{width}
must satisfy the constraints explained above.

When a string missing value set's width is increased, each
user-missing value is padded on the right with spaces to the new
width.
@end deftypefun

@node Inspecting User-Missing Value Sets
@subsection Inspecting User-Missing Value Sets

These functions inspect the properties and contents of
@struct{missing_values} objects.

The first set of functions inspects the discrete values that sets of
user-missing values may contain:

@deftypefun bool mv_is_empty (const struct missing_values *@var{mv})
Returns true if @var{mv} contains no user-missing values, false if it
contains at least one user-missing value (either a discrete value or a
numeric range).
@end deftypefun

@deftypefun int mv_get_width (const struct missing_values *@var{mv})
Returns the width of the user-missing values that @var{mv} represents.
@end deftypefun

@deftypefun int mv_n_values (const struct missing_values *@var{mv})
Returns the number of discrete user-missing values included in
@var{mv}.  The return value will be between 0 and 3.  For sets of
numeric user-missing values that include a range, the return value
will be 0 or 1.
@end deftypefun

@deftypefun bool mv_has_value (const struct missing_values *@var{mv})
Returns true if @var{mv} has at least one discrete user-missing
value, that is, if @func{mv_n_values} would return nonzero for
@var{mv}.
@end deftypefun

@deftypefun {const union value *} mv_get_value (const struct missing_values *@var{mv}, int @var{index})
Returns the discrete user-missing value in @var{mv} with the given
@var{index}.  The caller must not modify or free the returned value or
refer to it after modifying or freeing @var{mv}.  The index must be
less than the number of discrete user-missing values in @var{mv}, as
reported by @func{mv_n_values}.
@end deftypefun

The second set of functions inspects the single range of values that
numeric sets of user-missing values may contain:

@deftypefun bool mv_has_range (const struct missing_values *@var{mv})
Returns true if @var{mv} includes a range, false otherwise.
@end deftypefun

@deftypefun void mv_get_range (const struct missing_values *@var{mv}, double *@var{low}, double *@var{high})
Stores the low endpoint of @var{mv}'s range in @code{*@var{low}} and
the high endpoint of the range in @code{*@var{high}}.  @var{mv} must
include a range.
@end deftypefun

@node Modifying User-Missing Value Sets
@subsection Modifying User-Missing Value Sets

These functions modify the contents of @struct{missing_values}
objects.
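
The restrictions on the contents of a numeric set (at most three
discrete values, or a range plus at most one discrete value) determine
when the functions in this section can fail.  The following
self-contained sketch models only that bookkeeping.  It assumes a
hypothetical @code{struct toy_mv} with hypothetical @code{toy_mv_*}
functions, and it neither uses nor reproduces PSPP's
@struct{missing_values}.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical model of a numeric user-missing value set. */
struct toy_mv
  {
    double values[3];     /* Discrete values. */
    int n_values;         /* Number of discrete values, 0...3. */
    bool has_range;       /* True if [low, high] is in use. */
    double low, high;     /* Inclusive range endpoints. */
  };

/* Adds discrete VALUE.  Fails if the set is full: three discrete
   values, or one discrete value alongside a range. */
static bool
toy_mv_add_num (struct toy_mv *mv, double value)
{
  int max = mv->has_range ? 1 : 3;
  if (mv->n_values >= max)
    return false;
  mv->values[mv->n_values++] = value;
  return true;
}

/* Adds the inclusive range LOW...HIGH.  Fails if a range is already
   present, if more than one discrete value is present, or if
   LOW > HIGH. */
static bool
toy_mv_add_range (struct toy_mv *mv, double low, double high)
{
  if (mv->has_range || mv->n_values > 1 || low > high)
    return false;
  mv->has_range = true;
  mv->low = low;
  mv->high = high;
  return true;
}

/* Returns true if X is a discrete member or lies within the range. */
static bool
toy_mv_is_missing (const struct toy_mv *mv, double x)
{
  assert (mv->n_values >= 0 && mv->n_values <= 3);
  for (int i = 0; i < mv->n_values; i++)
    if (mv->values[i] == x)
      return true;
  return mv->has_range && x >= mv->low && x <= mv->high;
}
```

In this model, a set holding one discrete value and one range is full:
further additions of either kind are refused.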

The first set of functions applies to all sets of user-missing values:

@deftypefun bool mv_add_value (struct missing_values *@var{mv}, const union value *@var{value})
@deftypefunx bool mv_add_str (struct missing_values *@var{mv}, const char @var{value}[])
@deftypefunx bool mv_add_num (struct missing_values *@var{mv}, double @var{value})
Attempts to add the given discrete @var{value} to the set of user-missing
values @var{mv}.  @var{value} must have the same width as @var{mv}.
Returns true if @var{value} was successfully added, false if the set
could not accept any more discrete values or if @var{value} is not an
acceptable user-missing value (see @func{mv_is_acceptable} below).

These functions are equivalent, except for the form in which
@var{value} is provided, so you may use whichever function is most
convenient.
@end deftypefun

@deftypefun void mv_pop_value (struct missing_values *@var{mv}, union value *@var{value})
Removes a discrete value from @var{mv} (which must contain at least
one discrete value) and stores it in @var{value}.
@end deftypefun

@deftypefun bool mv_replace_value (struct missing_values *@var{mv}, const union value *@var{value}, int @var{index})
Attempts to replace the discrete value with the given @var{index} in
@var{mv} (which must contain at least @var{index} + 1 discrete values)
by @var{value}.  Returns true if successful, false if @var{value} is
not an acceptable user-missing value (see @func{mv_is_acceptable}
below).
@end deftypefun

@deftypefun bool mv_is_acceptable (const union value *@var{value}, int @var{width})
@anchor{mv_is_acceptable}
Returns true if @var{value}, which must have the specified
@var{width}, may be added to a missing value set of the same
@var{width}, false if it cannot.
As described above, all numeric
values and string values of width @code{MV_MAX_STRING} or less may be
added, but string values of greater width may be added only if bytes
beyond the first @code{MV_MAX_STRING} are all spaces.
@end deftypefun

The second set of functions applies only to numeric sets of
user-missing values:

@deftypefun bool mv_add_range (struct missing_values *@var{mv}, double @var{low}, double @var{high})
Attempts to add a numeric range covering @var{low}@dots{}@var{high}
(inclusive on both ends) to @var{mv}, which must be a numeric set of
user-missing values.  Returns true if the range is successfully added,
false on failure.  Fails if @var{mv} already contains a range, or if
@var{mv} contains more than one discrete value, or if @var{low} >
@var{high}.
@end deftypefun

@deftypefun void mv_pop_range (struct missing_values *@var{mv}, double *@var{low}, double *@var{high})
Given @var{mv}, which must be a numeric set of user-missing values
that contains a range, removes that range from @var{mv} and stores its
low endpoint in @code{*@var{low}} and its high endpoint in
@code{*@var{high}}.
@end deftypefun

@node Value Labels
@section Value Labels

Each variable has a set of value labels (@pxref{VALUE LABELS,,,pspp,
PSPP Users Guide}), represented as @struct{val_labs}.  A
@struct{val_labs} is essentially a map from @union{value}s to strings.
All of the values in a set of value labels have the same width, which
for a set of value labels owned by a variable (the common case) is the
same as the variable's width.

Sets of value labels may contain any number of entries.

It is rarely necessary to interact directly with a @struct{val_labs}
object.  Instead, the most common operation, looking up the label for
a value of a given variable, can be conveniently executed through
functions on @struct{variable}.  @xref{Variable Value Labels}, for
details.
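
The idea of a map from values to strings, and the difference between
adding a label and replacing one, can be sketched in a few lines.
This is a toy under stated assumptions, not PSPP code: the
@code{toy_labs} type and functions are hypothetical, only numeric
values are handled, capacity is fixed, and lookup is a linear search.
It makes no claim about how @struct{val_labs} is implemented.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

#define TOY_MAX_LABELS 16

/* Hypothetical numeric-only label map.  Labels are borrowed, not
   copied. */
struct toy_labs
  {
    double values[TOY_MAX_LABELS];
    const char *labels[TOY_MAX_LABELS];
    size_t n;
  };

/* Returns the label for VALUE, or a null pointer if there is none. */
static const char *
toy_labs_find (const struct toy_labs *labs, double value)
{
  for (size_t i = 0; i < labs->n; i++)
    if (labs->values[i] == value)
      return labs->labels[i];
  return NULL;
}

/* Adds LABEL for VALUE.  Returns false, leaving the map unchanged,
   if VALUE already has a label. */
static bool
toy_labs_add (struct toy_labs *labs, double value, const char *label)
{
  if (toy_labs_find (labs, value) != NULL)
    return false;
  assert (labs->n < TOY_MAX_LABELS);
  labs->values[labs->n] = value;
  labs->labels[labs->n] = label;
  labs->n++;
  return true;
}

/* Sets LABEL for VALUE, replacing any existing label. */
static void
toy_labs_replace (struct toy_labs *labs, double value, const char *label)
{
  for (size_t i = 0; i < labs->n; i++)
    if (labs->values[i] == value)
      {
        labs->labels[i] = label;
        return;
      }
  toy_labs_add (labs, value, label);
}
```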

Function prototypes and other declarations related to value labels
are found in @file{data/value-labels.h}.

@deftp {Structure} {struct val_labs}
Opaque type that represents a set of value labels.
@end deftp

The most frequently useful function for value labels is
@func{val_labs_find}, for looking up the label associated with a
value.

@deftypefun {char *} val_labs_find (const struct val_labs *@var{val_labs}, union value @var{value})
Looks in @var{val_labs} for a label for the given @var{value}.
Returns the label, if one is found, or a null pointer otherwise.
@end deftypefun

Several other functions for working with value labels are described in
the following sections, but these are more rarely useful.

@menu
* Value Labels Creation and Destruction::
* Value Labels Properties::
* Value Labels Adding and Removing Labels::
* Value Labels Iteration::
@end menu

@node Value Labels Creation and Destruction
@subsection Creation and Destruction

These functions create and destroy @struct{val_labs} objects.

@deftypefun {struct val_labs *} val_labs_create (int @var{width})
Creates and returns an initially empty set of value labels with the
given @var{width}.
@end deftypefun

@deftypefun {struct val_labs *} val_labs_clone (const struct val_labs *@var{val_labs})
Creates and returns a set of value labels whose width and contents are
the same as those of @var{val_labs}.
@end deftypefun

@deftypefun void val_labs_clear (struct val_labs *@var{val_labs})
Deletes all value labels from @var{val_labs}.
@end deftypefun

@deftypefun void val_labs_destroy (struct val_labs *@var{val_labs})
Destroys @var{val_labs}, which must not be referenced again.
@end deftypefun

@node Value Labels Properties
@subsection Value Labels Properties

These functions inspect and manipulate basic properties of
@struct{val_labs} objects.

@deftypefun size_t val_labs_count (const struct val_labs *@var{val_labs})
Returns the number of value labels in @var{val_labs}.
@end deftypefun

@deftypefun bool val_labs_can_set_width (const struct val_labs *@var{val_labs}, int @var{new_width})
Tests whether @var{val_labs}'s width may be changed to @var{new_width}
using @func{val_labs_set_width}.  Returns true if it is allowed, false
otherwise.

A set of value labels may be resized to a given width only if each
value in it may be resized to that width, as determined by
@func{value_is_resizable} (@pxref{value_is_resizable}).
@end deftypefun

@deftypefun void val_labs_set_width (struct val_labs *@var{val_labs}, int @var{new_width})
Changes the width of @var{val_labs}'s values to @var{new_width}, which
must be a valid new width as determined by
@func{val_labs_can_set_width}.
@end deftypefun

@node Value Labels Adding and Removing Labels
@subsection Adding and Removing Labels

These functions add and remove value labels from a @struct{val_labs}
object.

@deftypefun bool val_labs_add (struct val_labs *@var{val_labs}, union value @var{value}, const char *@var{label})
Adds @var{label} to @var{val_labs} as a label for @var{value},
which must have the same width as the set of value labels.  Returns
true if successful, false if @var{value} already has a label.
@end deftypefun

@deftypefun void val_labs_replace (struct val_labs *@var{val_labs}, union value @var{value}, const char *@var{label})
Adds @var{label} to @var{val_labs} as a label for @var{value},
which must have the same width as the set of value labels.
If
@var{value} already has a label in @var{val_labs}, it is replaced.
@end deftypefun

@deftypefun bool val_labs_remove (struct val_labs *@var{val_labs}, union value @var{value})
Removes from @var{val_labs} any label for @var{value}, which must have
the same width as the set of value labels.  Returns true if a label
was removed, false otherwise.
@end deftypefun

@node Value Labels Iteration
@subsection Iterating through Value Labels

These functions allow iteration through the set of value labels
represented by a @struct{val_labs} object.  They may be used in the
context of a @code{for} loop:

@example
struct val_labs *val_labs;
const struct val_lab *vl;

@dots{}

for (vl = val_labs_first (val_labs); vl != NULL;
     vl = val_labs_next (val_labs, vl))
  @{
    @dots{}@r{do something with @code{vl}}@dots{}
  @}
@end example

Value labels should not be added to or deleted from a @struct{val_labs}
while it is being iterated.

@deftypefun {const struct val_lab *} val_labs_first (const struct val_labs *@var{val_labs})
Returns the first value label in @var{val_labs}, if it contains at
least one value label, or a null pointer if it does not contain any
value labels.
@end deftypefun

@deftypefun {const struct val_lab *} val_labs_next (const struct val_labs *@var{val_labs}, const struct val_lab *@var{vl})
Returns the value label in @var{val_labs} following @var{vl}, if
@var{vl} is not the last value label in @var{val_labs}, or a null
pointer if there are no value labels following @var{vl}.
@end deftypefun

@deftypefun {const struct val_lab **} val_labs_sorted (const struct val_labs *@var{val_labs})
Allocates and returns an array of pointers to value labels, which are
sorted in increasing order by value.  The array has
@code{val_labs_count (@var{val_labs})} elements.
The caller is
responsible for freeing the array with @func{free} (but must not free
any of the @struct{val_lab} elements that the array points to).
@end deftypefun

The iteration functions above work with pointers to @struct{val_lab},
which is an opaque data structure that users of @struct{val_labs} must
not modify or free directly.  The following functions work with
objects of this type:

@deftypefun {const union value *} val_lab_get_value (const struct val_lab *@var{vl})
Returns the value of value label @var{vl}.  The caller must not modify
or free the returned value.  (To change the value that a label applies
to, remove the
value label with @func{val_labs_remove}, then add the new value with
@func{val_labs_add}.)

The width of the returned value cannot be determined directly from
@var{vl}.  It may be obtained by calling @func{val_labs_get_width} on
the @struct{val_labs} that @var{vl} is in.
@end deftypefun

@deftypefun {const char *} val_lab_get_label (const struct val_lab *@var{vl})
Returns the label in @var{vl} as a null-terminated string.  The caller
must not modify or free the returned string.  (Use
@func{val_labs_replace} to change a value label.)
@end deftypefun

@node Variables
@section Variables

A PSPP variable is represented by @struct{variable}, an opaque type
declared in @file{data/variable.h} along with related declarations.
@xref{Variables,,,pspp, PSPP Users Guide}, for a description of PSPP
variables from a user perspective.

PSPP is unusual among computer languages in that, by itself, a PSPP
variable does not have a value.  Instead, a variable in PSPP takes on
a value only in the context of a case, which supplies one value for
each variable in a set of variables (@pxref{Cases}).  The set of
variables in a case, in turn, is ordinarily part of a dictionary
(@pxref{Dictionaries}).
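
The separation between a variable and its values can be pictured with
a small sketch.  Everything in it is a hypothetical stand-in
(@code{struct toy_var}, @code{struct toy_case}, and
@code{toy_case_num} are not PSPP types or functions), and it handles
numeric values only.  The point is just that the variable carries
metadata, while each case supplies the value.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical variable: metadata only, no value of its own. */
struct toy_var
  {
    const char *name;
    size_t case_index;    /* Position of this variable's value in a case. */
  };

/* Hypothetical case: one numeric value per variable. */
struct toy_case
  {
    const double *values;
    size_t n_values;
  };

/* Returns the value that VAR takes on in case C. */
static double
toy_case_num (const struct toy_case *c, const struct toy_var *var)
{
  assert (var->case_index < c->n_values);
  return c->values[var->case_index];
}
```

The same @code{toy_var} yields a different value for each case it is
applied to, which mirrors the relationship described above.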

Every variable has several attributes, most of which correspond
directly to one of the variable attributes visible to PSPP users
(@pxref{Attributes,,,pspp, PSPP Users Guide}).

The following sections describe variable-related functions and macros.

@menu
* Variable Name::
* Variable Type and Width::
* Variable Missing Values::
* Variable Value Labels::
* Variable Print and Write Formats::
* Variable Labels::
* Variable GUI Attributes::
* Variable Leave Status::
* Dictionary Class::
* Variable Creation and Destruction::
* Variable Short Names::
* Variable Relationships::
* Variable Auxiliary Data::
* Variable Categorical Values::
@end menu

@node Variable Name
@subsection Variable Name

A variable name is a string between 1 and @code{ID_MAX_LEN} bytes
long that satisfies the rules for PSPP identifiers
(@pxref{Tokens,,,pspp, PSPP Users Guide}).  Variable names are
mixed-case and treated case-insensitively.

@deftypefn Macro int ID_MAX_LEN
Maximum length of a variable name, in bytes, currently 64.
@end deftypefn

Only one commonly useful function relates to variable names:

@deftypefun {const char *} var_get_name (const struct variable *@var{var})
Returns @var{var}'s variable name as a C string.
@end deftypefun

A few other functions are much more rarely used.  Some of these
functions are used internally by the dictionary implementation:

@anchor{var_set_name}
@deftypefun {void} var_set_name (struct variable *@var{var}, const char *@var{new_name})
Changes the name of @var{var} to @var{new_name}, which must be a
``plausible'' name as defined below.

This function cannot be applied to a variable that is part of a
dictionary.  Use @func{dict_rename_var} instead (@pxref{Dictionary
Renaming Variables}).
@end deftypefun

@deftypefun {enum dict_class} var_get_dict_class (const struct variable *@var{var})
Returns the dictionary class of @var{var}'s name (@pxref{Dictionary
Class}).
@end deftypefun

@node Variable Type and Width
@subsection Variable Type and Width

A variable's type and width are the type and width of its values
(@pxref{Values}).

@deftypefun {enum val_type} var_get_type (const struct variable *@var{var})
Returns the type of variable @var{var}.
@end deftypefun

@deftypefun int var_get_width (const struct variable *@var{var})
Returns the width of variable @var{var}.
@end deftypefun

@deftypefun void var_set_width (struct variable *@var{var}, int @var{width})
Sets the width of variable @var{var} to @var{width}.  The width of a
variable should not normally be changed after the variable is created,
so this function is rarely used.  This function cannot be applied to a
variable that is part of a dictionary.
@end deftypefun

@deftypefun bool var_is_numeric (const struct variable *@var{var})
Returns true if @var{var} is a numeric variable, false otherwise.
@end deftypefun

@deftypefun bool var_is_alpha (const struct variable *@var{var})
Returns true if @var{var} is an alphanumeric (string) variable, false
otherwise.
@end deftypefun

@node Variable Missing Values
@subsection Variable Missing Values

A numeric or short string variable may have a set of user-missing
values (@pxref{MISSING VALUES,,,pspp, PSPP Users Guide}), represented
as a @struct{missing_values} (@pxref{User-Missing Values}).
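
The missing-value testing functions take an @code{enum mv_class}
argument that selects which categories of missing values count
(@pxref{Testing for Missing Values}).  The sketch below shows how such
a class argument can behave as a bitmask.  It is an illustration under
assumptions, not PSPP's code: the @code{TOY_} names are hypothetical,
the bit values assigned to them are assumed, the set of user-missing
values is reduced to a single number, and system-missing is modeled as
@code{-DBL_MAX}.

```c
#include <assert.h>
#include <float.h>
#include <stdbool.h>

/* Hypothetical counterparts of enum mv_class; the bit values are
   assumed for illustration only. */
enum toy_mv_class
  {
    TOY_MV_NONE = 0,
    TOY_MV_USER = 1,
    TOY_MV_SYSTEM = 2,
    TOY_MV_ANY = TOY_MV_USER | TOY_MV_SYSTEM
  };

/* Stand-in for the system-missing value. */
#define TOY_SYSMIS (-DBL_MAX)

/* Tests VALUE against the categories selected by CLASSES.
   USER_MISSING is the single user-missing value of this toy
   variable. */
static bool
toy_is_num_missing (double user_missing, double value,
                    enum toy_mv_class classes)
{
  assert ((classes & ~TOY_MV_ANY) == 0);
  if ((classes & TOY_MV_SYSTEM) && value == TOY_SYSMIS)
    return true;
  if ((classes & TOY_MV_USER) && value == user_missing)
    return true;
  return false;
}
```

With this model, a procedure that should treat user-missing values as
valid data would pass only the system-missing bit, and one that
should exclude both kinds would pass the combined mask.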

The most frequent operation on a variable's missing values is to query
whether a value is user- or system-missing:

@deftypefun bool var_is_value_missing (const struct variable *@var{var}, const union value *@var{value}, enum mv_class @var{class})
@deftypefunx bool var_is_num_missing (const struct variable *@var{var}, double @var{value}, enum mv_class @var{class})
@deftypefunx bool var_is_str_missing (const struct variable *@var{var}, const char @var{value}[], enum mv_class @var{class})
Tests whether @var{value} is a missing value of the given @var{class}
for variable @var{var} and returns true if so, false otherwise.
@func{var_is_num_missing} may only be applied to numeric variables;
@func{var_is_str_missing} may only be applied to string variables.
@var{value} must have been initialized with the same width as
@var{var}.

@code{var_is_@var{type}_missing (@var{var}, @var{value}, @var{class})}
is equivalent to @code{mv_is_@var{type}_missing
(var_get_missing_values (@var{var}), @var{value}, @var{class})}.
@end deftypefun

In addition, a few functions are provided to work more directly with a
variable's @struct{missing_values}:

@deftypefun {const struct missing_values *} var_get_missing_values (const struct variable *@var{var})
Returns the @struct{missing_values} associated with @var{var}.  The
caller must not modify the returned structure.  The return value is
always non-null.
@end deftypefun

@anchor{var_set_missing_values}
@deftypefun {void} var_set_missing_values (struct variable *@var{var}, const struct missing_values *@var{miss})
Changes @var{var}'s missing values to a copy of @var{miss}, or if
@var{miss} is a null pointer, clears @var{var}'s missing values.  If
@var{miss} is non-null, it must have the same width as @var{var} or be
resizable to @var{var}'s width (@pxref{mv_resize}).  The caller
retains ownership of @var{miss}.
@end deftypefun

@deftypefun void var_clear_missing_values (struct variable *@var{var})
Clears @var{var}'s missing values.  Equivalent to
@code{var_set_missing_values (@var{var}, NULL)}.
@end deftypefun

@deftypefun bool var_has_missing_values (const struct variable *@var{var})
Returns true if @var{var} has any missing values, false if it has
none.  Equivalent to @code{!mv_is_empty (var_get_missing_values (@var{var}))}.
@end deftypefun

@node Variable Value Labels
@subsection Variable Value Labels

A numeric or short string variable may have a set of value labels
(@pxref{VALUE LABELS,,,pspp, PSPP Users Guide}), represented as a
@struct{val_labs} (@pxref{Value Labels}).  The most commonly useful
functions for value labels return the value label associated with a
value:

@deftypefun {const char *} var_lookup_value_label (const struct variable *@var{var}, const union value *@var{value})
Looks for a label for @var{value} in @var{var}'s set of value labels.
@var{value} must have the same width as @var{var}.  Returns the label
if one exists, otherwise a null pointer.
@end deftypefun

@deftypefun void var_append_value_name (const struct variable *@var{var}, const union value *@var{value}, struct string *@var{str})
Looks for a label for @var{value} in @var{var}'s set of value labels.
@var{value} must have the same width as @var{var}.
If a label exists, it is appended to the string pointed to by @var{str}.
Otherwise, the function formats @var{value}
using @var{var}'s print format (@pxref{Input and Output Formats})
and appends the formatted string.
@end deftypefun

The underlying @struct{val_labs} structure may also be accessed
directly using the functions described below.

@deftypefun bool var_has_value_labels (const struct variable *@var{var})
Returns true if @var{var} has at least one value label, false
otherwise.
@end deftypefun

@deftypefun {const struct val_labs *} var_get_value_labels (const struct variable *@var{var})
Returns the @struct{val_labs} associated with @var{var}.  If @var{var}
has no value labels, then the return value may or may not be a null
pointer.

The variable retains ownership of the returned @struct{val_labs},
which the caller must not attempt to modify.
@end deftypefun

@deftypefun void var_set_value_labels (struct variable *@var{var}, const struct val_labs *@var{val_labs})
Replaces @var{var}'s value labels by a copy of @var{val_labs}.  The
caller retains ownership of @var{val_labs}.  If @var{val_labs} is a
null pointer, then @var{var}'s value labels, if any, are deleted.
@end deftypefun

@deftypefun void var_clear_value_labels (struct variable *@var{var})
Deletes @var{var}'s value labels.  Equivalent to
@code{var_set_value_labels (@var{var}, NULL)}.
@end deftypefun

A final group of functions offers shorthands for operations that would
otherwise require getting the value labels from a variable, copying
them, modifying them, and then setting the modified value labels into
the variable (making a second copy):

@deftypefun bool var_add_value_label (struct variable *@var{var}, const union value *@var{value}, const char *@var{label})
Attempts to add a copy of @var{label} as a label for @var{value} for
the given @var{var}.  @var{value} must have the same width as
@var{var}.  If @var{value} already has a label, then the old label is
retained.  Returns true if a label is added, false if there was an
existing label for @var{value}.  Either way, the caller retains
ownership of @var{value} and @var{label}.
@end deftypefun

@deftypefun void var_replace_value_label (struct variable *@var{var}, const union value *@var{value}, const char *@var{label})
Attempts to add a copy of @var{label} as a label for @var{value} for
the given @var{var}.  @var{value} must have the same width as
@var{var}.  If @var{value} already has a label, then
@var{label} replaces the old label.  Either way, the caller retains
ownership of @var{value} and @var{label}.
@end deftypefun

@node Variable Print and Write Formats
@subsection Variable Print and Write Formats

Each variable has an associated pair of output formats, called its
@dfn{print format} and @dfn{write format}.  @xref{Input and Output
Formats,,,pspp, PSPP Users Guide}, for an introduction to formats.
@xref{Input and Output Formats}, for a developer's description of
format representation.

The print format is used to convert a variable's data values to
strings for human-readable output.  The write format is used similarly
for machine-readable output, primarily by the WRITE transformation
(@pxref{WRITE,,,pspp, PSPP Users Guide}).  Most often a variable's
print and write formats are the same.

A newly created variable by default has format F8.2 if it is numeric
or an A format with the same width as the variable if it is string.
Many creators of variables override these defaults.

Both the print format and write format are output formats.  Input
formats are not part of @struct{variable}.  Instead, input programs
and transformations keep track of variable input formats themselves.

The following functions work with variable print and write formats.

@deftypefun {const struct fmt_spec *} var_get_print_format (const struct variable *@var{var})
@deftypefunx {const struct fmt_spec *} var_get_write_format (const struct variable *@var{var})
Returns @var{var}'s print or write format, respectively.
@end deftypefun

@deftypefun void var_set_print_format (struct variable *@var{var}, const struct fmt_spec *@var{format})
@deftypefunx void var_set_write_format (struct variable *@var{var}, const struct fmt_spec *@var{format})
@deftypefunx void var_set_both_formats (struct variable *@var{var}, const struct fmt_spec *@var{format})
Sets @var{var}'s print format, write format, or both formats,
respectively, to a copy of @var{format}.
@end deftypefun

@node Variable Labels
@subsection Variable Labels

A variable label is a string that describes a variable.  Variable
labels may contain spaces and punctuation not allowed in variable
names.  @xref{VARIABLE LABELS,,,pspp, PSPP Users Guide}, for a
user-level description of variable labels.

The most commonly useful functions for variable labels are those to
retrieve a variable's label:

@deftypefun {const char *} var_to_string (const struct variable *@var{var})
Returns @var{var}'s variable label, if it has one, otherwise
@var{var}'s name.  In either case the caller must not attempt to
modify or free the returned string.

This function is useful for user output.
@end deftypefun

@deftypefun {const char *} var_get_label (const struct variable *@var{var})
Returns @var{var}'s variable label, if it has one, or a null pointer
otherwise.
@end deftypefun

A few other variable label functions are also provided:

@deftypefun void var_set_label (struct variable *@var{var}, const char *@var{label})
Sets @var{var}'s variable label to a copy of @var{label}, or removes
any label from @var{var} if @var{label} is a null pointer or contains
only spaces.  Leading and trailing spaces are removed from the
variable label and its remaining content is truncated at 255 bytes.
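
The normalization just described (strip surrounding spaces, then
truncate, and treat an all-space label as no label) can be sketched as
a standalone function.  This is an illustration only:
@code{toy_normalize_label} and @code{TOY_MAX_LABEL} are hypothetical
names, the result is written into a caller-supplied buffer rather than
stored in a variable, and nothing here is taken from PSPP's
implementation.

```c
#include <assert.h>
#include <stdbool.h>
#include <string.h>

/* Hypothetical limit mirroring the 255-byte truncation above. */
#define TOY_MAX_LABEL 255

/* Copies LABEL into OUT with leading and trailing spaces removed and
   the remainder truncated to TOY_MAX_LABEL bytes.  OUT must have room
   for TOY_MAX_LABEL + 1 bytes.  Returns false, leaving OUT empty, if
   LABEL is null or contains only spaces, which corresponds to
   removing the label. */
static bool
toy_normalize_label (const char *label, char out[TOY_MAX_LABEL + 1])
{
  assert (out != NULL);
  out[0] = '\0';
  if (label == NULL)
    return false;

  /* Skip leading spaces, then drop trailing ones. */
  const char *start = label + strspn (label, " ");
  size_t len = strlen (start);
  while (len > 0 && start[len - 1] == ' ')
    len--;
  if (len == 0)
    return false;

  /* Truncate what remains.  This byte-based cut may split a
     multibyte character; the sketch does not address that. */
  if (len > TOY_MAX_LABEL)
    len = TOY_MAX_LABEL;
  memcpy (out, start, len);
  out[len] = '\0';
  return true;
}
```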
@end deftypefun

@deftypefun void var_clear_label (struct variable *@var{var})
Removes any variable label from @var{var}.
@end deftypefun

@deftypefun bool var_has_label (const struct variable *@var{var})
Returns true if @var{var} has a variable label, false otherwise.
@end deftypefun

@node Variable GUI Attributes
@subsection GUI Attributes

These functions and types access and set attributes that are mainly
used by graphical user interfaces. Their values are also stored in
and retrieved from system files (but not portable files).

The first group of functions relates to the measurement level of
numeric data. New variables are assigned a nominal level of
measurement by default.

@deftp {Enumeration} {enum measure}
Measurement level. Available values are:

@table @code
@item MEASURE_NOMINAL
Numeric data values are arbitrary. Arithmetic operations and
numerical comparisons of such data are not meaningful.

@item MEASURE_ORDINAL
Numeric data values indicate progression along a rank order.
Arbitrary arithmetic operations such as addition are not meaningful on
such data, but inequality comparisons (less, greater, etc.) have
straightforward interpretations.

@item MEASURE_SCALE
Ratios, sums, etc. of numeric data values have meaningful
interpretations.
@end table

PSPP does not have a separate category for interval data, which would
naturally fall between the ordinal and scale measurement levels.
@end deftp

@deftypefun bool measure_is_valid (enum measure @var{measure})
Returns true if @var{measure} is a valid level of measurement, that
is, if it is one of the @code{enum measure} constants listed above,
and false otherwise.
@end deftypefun

@deftypefun {enum measure} var_get_measure (const struct variable *@var{var})
@deftypefunx void var_set_measure (struct variable *@var{var}, enum measure @var{measure})
Gets or sets @var{var}'s measurement level.
@end deftypefun

The following set of functions relates to the width of on-screen
columns used for displaying variable data in a graphical user
interface environment. The unit of measurement is the width of a
character. For proportionally spaced fonts, this is based on the
average width of a character.

@deftypefun int var_get_display_width (const struct variable *@var{var})
@deftypefunx void var_set_display_width (struct variable *@var{var}, int @var{display_width})
Gets or sets @var{var}'s display width.
@end deftypefun

@anchor{var_default_display_width}
@deftypefun int var_default_display_width (int @var{width})
Returns the default display width for a variable with the given
@var{width}. The default width of a numeric variable is 8. The
default width of a string variable is @var{width} or 32, whichever is
less.
@end deftypefun

The final group of functions works with the justification of data when
it is displayed in on-screen columns. New variables are by default
right-justified.

@deftp {Enumeration} {enum alignment}
Text justification. Possible values are @code{ALIGN_LEFT},
@code{ALIGN_RIGHT}, and @code{ALIGN_CENTRE}.
@end deftp

@deftypefun bool alignment_is_valid (enum alignment @var{alignment})
Returns true if @var{alignment} is a valid alignment, that is, if it
is one of the @code{enum alignment} constants listed above, and false
otherwise.
@end deftypefun

@deftypefun {enum alignment} var_get_alignment (const struct variable *@var{var})
@deftypefunx void var_set_alignment (struct variable *@var{var}, enum alignment @var{alignment})
Gets or sets @var{var}'s alignment.
@end deftypefun

@node Variable Leave Status
@subsection Variable Leave Status

Commonly, most or all data in a case come from an input file, read
with a command such as DATA LIST or GET, but data can also be
generated with transformations such as COMPUTE. In the latter case
the question of a datum's ``initial value'' can arise. For example,
the value of a piece of generated data can recursively depend on its
own value:
@example
COMPUTE X = X + 1.
@end example
The question of a variable's initial value also arises when its value
is not set at all for some cases, e.g.@: below, @code{Y} is
set only for the first 10 cases:
@example
DO IF $CASENUM <= 10.
+ COMPUTE Y = 1.
END IF.
@end example

By default, the initial value of a datum in either of these situations
is the system-missing value for numeric values and spaces for string
values. This means that, above, X would be system-missing and that Y
would be 1 for the first 10 cases and system-missing for the
remainder.

PSPP also supports retaining the value of a variable from one case to
another, using the LEAVE command (@pxref{LEAVE,,,pspp, PSPP Users
Guide}). The initial value of such a variable is 0 if it is numeric
and spaces if it is a string. If the command @samp{LEAVE X Y} is
appended to the above example, then X would have value 1 in the first
case and increase by 1 in every succeeding case, and Y would have
value 1 for the first 10 cases and 0 for later cases.

The LEAVE command has no effect on data that comes from an input file
or whose values do not depend on a variable's initial value.
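
The reinitialization rules above can be illustrated with a small
standalone model in C. This is only a sketch, not PSPP's
implementation: @code{model_compute_increment} and
@code{MODEL_SYSMIS} are hypothetical names, and the system-missing
value is assumed to be @code{-DBL_MAX}, as PSPP currently defines it
(@pxref{Numeric Values}). The function mimics
@samp{COMPUTE X = X + 1.} with and without @samp{LEAVE X}:

@example
#include <assert.h>
#include <float.h>
#include <stdbool.h>
#include <stddef.h>

/* Stand-in for PSPP's system-missing value (assumption). */
#define MODEL_SYSMIS (-DBL_MAX)

/* Models "COMPUTE X = X + 1." over N_CASES cases, storing the value
   of X after each case in OUT.  If LEAVE is true, X starts at 0 and
   is retained from case to case; otherwise X is reset to
   system-missing before every case.  A system-missing operand makes
   the result of the addition system-missing. */
void
model_compute_increment (bool leave, size_t n_cases, double out[])
@{
  double x = leave ? 0.0 : MODEL_SYSMIS;
  size_t i;

  assert (out != NULL || n_cases == 0);
  for (i = 0; i < n_cases; i++)
    @{
      if (!leave)
        x = MODEL_SYSMIS;
      x = x == MODEL_SYSMIS ? MODEL_SYSMIS : x + 1.0;
      out[i] = x;
    @}
@}
@end example

With @code{leave} true and three cases, the model yields 1, 2, and 3;
with @code{leave} false, every result is system-missing.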

The values of scratch variables (@pxref{Scratch Variables,,,pspp, PSPP
Users Guide}) are always left from one case to another.

The following functions work with a variable's leave status.

@deftypefun bool var_get_leave (const struct variable *@var{var})
Returns true if @var{var}'s value is to be retained from case to case,
false if it is reinitialized to system-missing or spaces.
@end deftypefun

@deftypefun void var_set_leave (struct variable *@var{var}, bool @var{leave})
If @var{leave} is true, marks @var{var} to be left from case to case;
if @var{leave} is false, marks @var{var} to be reinitialized for each
case.

If @var{var} is a scratch variable, @var{leave} must be true.
@end deftypefun

@deftypefun bool var_must_leave (const struct variable *@var{var})
Returns true if @var{var} must be left from case to case, that is, if
@var{var} is a scratch variable.
@end deftypefun

@node Dictionary Class
@subsection Dictionary Class

Occasionally it is useful to classify variables into @dfn{dictionary
classes} based on their names. Dictionary classes are represented by
@enum{dict_class}. This type and other declarations for dictionary
classes are in the @file{<data/dict-class.h>} header.

@deftp {Enumeration} {enum dict_class}
The dictionary classes are:

@table @code
@item DC_ORDINARY
An ordinary variable, one whose name does not begin with @samp{$} or
@samp{#}.

@item DC_SYSTEM
A system variable, one whose name begins with @samp{$}. @xref{System
Variables,,,pspp, PSPP Users Guide}.

@item DC_SCRATCH
A scratch variable, one whose name begins with @samp{#}.
@xref{Scratch Variables,,,pspp, PSPP Users Guide}.
@end table

The values for dictionary classes are bitwise disjoint, which allows
them to be used in bit-masks.
An extra enumeration constant
@code{DC_ALL}, whose value is the bitwise-@i{or} of all of the above
constants, is provided for this purpose.
@end deftp

One example use of dictionary classes arises in connection with PSPP
syntax that uses @code{@var{a} TO @var{b}} to name the variables in a
dictionary from @var{a} to @var{b} (@pxref{Sets of Variables,,,pspp,
PSPP Users Guide}). This syntax requires @var{a} and @var{b} to be in
the same dictionary class. It limits the variables that it includes
to those in that dictionary class.

The following functions relate to dictionary classes.

@deftypefun {enum dict_class} dict_class_from_id (const char *@var{name})
Returns the ``dictionary class'' for the given variable @var{name}, by
looking at its first character.
@end deftypefun

@deftypefun {const char *} dict_class_to_name (enum dict_class @var{dict_class})
Returns a name for the given @var{dict_class} as an adjective, e.g.@:
@code{"scratch"}.

This function should probably not be used in new code as it can lead
to difficulties for internationalization.
@end deftypefun

@node Variable Creation and Destruction
@subsection Variable Creation and Destruction

Only rarely should PSPP code create or destroy variables directly.
Ordinarily, variables are created within a dictionary and destroyed
by individual deletion from the dictionary or by destroying the entire
dictionary at once. The functions here enable the exceptional case
of creation and destruction of variables that are not associated with
any dictionary. These functions are used internally in the dictionary
implementation.

@anchor{var_create}
@deftypefun {struct variable *} var_create (const char *@var{name}, int @var{width})
Creates and returns a new variable with the given @var{name} and
@var{width}. The new variable is not part of any dictionary.
Use
@func{dict_create_var}, instead, to create a variable in a dictionary
(@pxref{Dictionary Creating Variables}).

@var{name} should be a valid variable name and must be a ``plausible''
variable name (@pxref{Variable Name}). @var{width} must be between 0
and @code{MAX_STRING}, inclusive (@pxref{Values}).

The new variable has no user-missing values, value labels, or variable
label. Numeric variables initially have F8.2 print and write formats,
right-justified display alignment, and scale level of measurement.
String variables are created with A print and write formats,
left-justified display alignment, and nominal level of measurement.
The initial display width is determined by
@func{var_default_display_width} (@pxref{var_default_display_width}).

The new variable initially has no short name (@pxref{Variable Short
Names}) and no auxiliary data (@pxref{Variable Auxiliary Data}).
@end deftypefun

@anchor{var_clone}
@deftypefun {struct variable *} var_clone (const struct variable *@var{old_var})
Creates and returns a new variable with the same attributes as
@var{old_var}, with a few exceptions. First, the new variable is not
part of any dictionary, regardless of whether @var{old_var} was in a
dictionary. Use @func{dict_clone_var}, instead, to add a clone of a
variable to a dictionary.

Second, the new variable is not given any short name, even if
@var{old_var} had a short name. This is because the new variable is
likely to be immediately renamed, in which case the short name would
be incorrect (@pxref{Variable Short Names}).

Finally, @var{old_var}'s auxiliary data, if any, is not copied to the
new variable (@pxref{Variable Auxiliary Data}).
@end deftypefun

@deftypefun {void} var_destroy (struct variable *@var{var})
Destroys @var{var} and frees all associated storage, including its
auxiliary data, if any.
@var{var} must not be part of a dictionary.
To delete a variable from a dictionary and destroy it, use
@func{dict_delete_var} (@pxref{Dictionary Deleting Variables}).
@end deftypefun

@node Variable Short Names
@subsection Variable Short Names

PSPP variable names may be up to 64 (@code{ID_MAX_LEN}) bytes long.
The system and portable file formats, however, were designed when
variable names were limited to 8 bytes in length. Since then, the
system file format has been augmented with an extension record that
explains how the 8-byte short names map to full-length names
(@pxref{Long Variable Names Record}), but the short names are still
present. Thus, the continued presence of the short names is more or
less invisible to PSPP users, but every variable in a system file
still has a short name that must be unique.

PSPP can generate unique short names for variables based on their full
names at the time it creates the data file. If all variables' full
names are unique in their first 8 bytes, then the short names are
simply prefixes of the full names; otherwise, PSPP changes them so
that they are unique.

By itself this algorithm interoperates well with other software that
can read system files, as long as that software understands the
extension record that maps short names to long names. When the other
software does not understand the extension record, it can produce
surprising results. Consider a situation where PSPP reads a system
file that contains a variable named RANKINGSCORE, then the user
adds a new variable named RANKINGSTATUS, then saves the modified data
as a new system file. A program that does not understand long names
would then see one of these variables under the name RANKINGS---either
one, depending on the algorithm's details---and the other under a
different name.
The effect could be very confusing: by adding a new
and apparently unrelated variable in PSPP, the user effectively
renamed the existing variable.

To counteract this potential problem, every @struct{variable} may have
a short name. A variable created by the system or portable file
reader receives the short name from that data file. When a variable
with a short name is written to a system or portable file, that
variable receives priority over other variables whose names begin
with the same 8 bytes but which were not read from a data file under
that short name.

Variables not created by the system or portable file reader have no
short name by default.

A variable with a full name of 8 bytes or less in length has absolute
priority for that name when the variable is written to a system file,
even over a second variable with that assigned short name.

PSPP does not enforce uniqueness of short names, although the short
names read from any given data file will always be unique. If two
variables with the same short name are written to a single data file,
neither one receives priority.

The following macros and functions relate to short names.

@defmac SHORT_NAME_LEN
Maximum length of a short name, in bytes. Its value is 8.
@end defmac

@deftypefun {const char *} var_get_short_name (const struct variable *@var{var})
Returns @var{var}'s short name, or a null pointer if @var{var} has not
been assigned a short name.
@end deftypefun

@deftypefun void var_set_short_name (struct variable *@var{var}, const char *@var{short_name})
Sets @var{var}'s short name to @var{short_name}, or removes
@var{var}'s short name if @var{short_name} is a null pointer. If it
is non-null, then @var{short_name} must be a plausible name for a
variable. The name will be truncated
to 8 bytes in length and converted to all-uppercase.
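
The truncation and case conversion just described can be sketched in
standalone C as follows. This is an illustration under stated
assumptions rather than PSPP's code: @code{model_normalize_short_name}
and @code{MODEL_SHORT_NAME_LEN} are hypothetical names, and only ASCII
names are considered (multibyte encodings are ignored).

@example
#include <assert.h>
#include <ctype.h>
#include <stddef.h>

#define MODEL_SHORT_NAME_LEN 8   /* Mirrors SHORT_NAME_LEN. */

/* Stores in DST a copy of NAME truncated to 8 bytes and converted
   to uppercase, followed by a null terminator. */
void
model_normalize_short_name (const char *name,
                            char dst[MODEL_SHORT_NAME_LEN + 1])
@{
  size_t i;

  assert (name != NULL);
  for (i = 0; i < MODEL_SHORT_NAME_LEN && name[i] != '\0'; i++)
    dst[i] = (char) toupper ((unsigned char) name[i]);
  dst[i] = '\0';
@}
@end example

Under this model both @code{"rankingscore"} and
@code{"rankingstatus"} normalize to @code{"RANKINGS"}, which is the
kind of collision that the priority rules above are meant to resolve.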
@end deftypefun

@deftypefun void var_clear_short_name (struct variable *@var{var})
Removes @var{var}'s short name.
@end deftypefun

@node Variable Relationships
@subsection Variable Relationships

Variables have close relationships with dictionaries
(@pxref{Dictionaries}) and cases (@pxref{Cases}). A variable is
usually a member of some dictionary, and a case is often used to store
data for the set of variables in a dictionary.

These functions report on these relationships. They may be applied
only to variables that are in a dictionary.

@deftypefun size_t var_get_dict_index (const struct variable *@var{var})
Returns @var{var}'s index within its dictionary. The first variable
in a dictionary has index 0, the next variable index 1, and so on.

The dictionary index can be influenced using dictionary functions such
as @func{dict_reorder_var} (@pxref{dict_reorder_var}).
@end deftypefun

@deftypefun size_t var_get_case_index (const struct variable *@var{var})
Returns @var{var}'s index within a case. The case index is an index
into an array of @union{value} large enough to contain all the data in
the dictionary.

The returned case index can be used to access the value of @var{var}
within a case for its dictionary, as in e.g.@: @code{case_data_idx
(case, var_get_case_index (@var{var}))}, but ordinarily it is more
convenient to use the data access functions that do variable-to-index
translation internally, as in e.g.@: @code{case_data (case,
@var{var})}.
@end deftypefun

@node Variable Auxiliary Data
@subsection Variable Auxiliary Data

Each @struct{variable} can have a single pointer to auxiliary data of
type @code{void *}. These functions manipulate a variable's auxiliary
data.

Use of auxiliary data is discouraged because of its lack of
flexibility.
Only one client can make use of auxiliary data on a
given variable at any time, even though many clients could usefully
associate data with a variable.

To prevent multiple clients from attempting to use a variable's single
auxiliary data field at the same time, we adopt the convention that
use of auxiliary data in the active dataset dictionary is restricted to
the currently executing command. In particular, transformations must
not attach auxiliary data to a variable in the active dataset in the
expectation that it can be used later when the active dataset is read and
the transformation is executed. To help enforce this restriction,
auxiliary data is deleted from all variables in the active dataset
dictionary after the execution of each PSPP command.

This convention for safe use of auxiliary data applies only to the
active dataset dictionary. Rules for other dictionaries may be
established separately.

Auxiliary data should be replaced by a more flexible mechanism at some
point, but no replacement mechanism has been designed or implemented
so far.

The following functions work with variable auxiliary data.

@deftypefun {void *} var_get_aux (const struct variable *@var{var})
Returns @var{var}'s auxiliary data, or a null pointer if none has been
assigned.
@end deftypefun

@deftypefun {void *} var_attach_aux (const struct variable *@var{var}, void *@var{aux}, void (*@var{aux_dtor}) (struct variable *))
Sets @var{var}'s auxiliary data to @var{aux}, which must not be null.
@var{var} must not already have auxiliary data.

Before @var{var}'s auxiliary data is cleared by @code{var_clear_aux},
@var{aux_dtor}, if non-null, will be called with @var{var} as its
argument. It should free any storage associated with @var{aux}, if
necessary.
@code{var_dtor_free} may be appropriate for use as
@var{aux_dtor}:

@deffn {Function} void var_dtor_free (struct variable *@var{var})
Frees @var{var}'s auxiliary data by calling @code{free}.
@end deffn
@end deftypefun

@deftypefun void var_clear_aux (struct variable *@var{var})
Removes auxiliary data, if any, from @var{var}, first calling the
destructor passed to @code{var_attach_aux}, if one was provided.

Use @code{dict_clear_aux} to remove auxiliary data from every variable
in a dictionary. @c (@pxref{dict_clear_aux}).
@end deftypefun

@deftypefun {void *} var_detach_aux (struct variable *@var{var})
Removes auxiliary data, if any, from @var{var}, and returns it.
Returns a null pointer if @var{var} had no auxiliary data.

Any destructor passed to @code{var_attach_aux} is not called, so the
caller is responsible for freeing storage associated with the returned
auxiliary data.
@end deftypefun

@node Variable Categorical Values
@subsection Variable Categorical Values

Some statistical procedures require a list of all the values that a
categorical variable takes on. Arranging such a list requires making
a pass through the data, so PSPP caches categorical values in
@struct{variable}.

When variable auxiliary data is revamped to support multiple clients
as described in the previous section, categorical values are an
obvious candidate. The form in which they are currently supported is
inelegant.

Categorical values are not robust against changes in the data. That
is, there is currently no way to detect that a transformation has
changed data values and that the lists of categorical values for the
changed variables must therefore be recomputed. PSPP is in fact in
need of a general-purpose caching and cache-invalidation mechanism,
but none has yet been designed and built.

The following functions work with cached categorical values.

@deftypefun {struct cat_vals *} var_get_obs_vals (const struct variable *@var{var})
Returns @var{var}'s set of categorical values. Yields undefined
behavior if @var{var} does not have any categorical values.
@end deftypefun

@deftypefun void var_set_obs_vals (const struct variable *@var{var}, struct cat_vals *@var{cat_vals})
Destroys @var{var}'s categorical values, if any, and replaces them by
@var{cat_vals}, ownership of which is transferred to @var{var}. If
@var{cat_vals} is a null pointer, then @var{var}'s categorical values
are cleared.
@end deftypefun

@deftypefun bool var_has_obs_vals (const struct variable *@var{var})
Returns true if @var{var} has a set of categorical values, false
otherwise.
@end deftypefun

@node Dictionaries
@section Dictionaries

Each data file in memory or on disk has an associated dictionary,
whose primary purpose is to describe the data in the file.
@xref{Variables,,,pspp, PSPP Users Guide}, for a PSPP user's view of a
dictionary.

A data file stored in a PSPP format, either as a system or portable
file, has a representation of its dictionary embedded in it. Other
kinds of data files are usually not self-describing enough to
construct a dictionary unassisted, so the dictionaries for these files
must be specified explicitly with PSPP commands such as @cmd{DATA
LIST}.

The most important content of a dictionary is an array of variables,
which must have unique names. A dictionary also conceptually contains
a mapping from each of its variables to a location within a case
(@pxref{Cases}), although in fact these mappings are stored within
individual variables.

System variables are not members of any dictionary (@pxref{System
Variables,,,pspp, PSPP Users Guide}).
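
Because a variable's dictionary class is determined by the first
character of its name (@pxref{Dictionary Class}), membership rules of
this kind can be expressed as simple bit-mask tests. The following
standalone sketch models the idea. All @code{model_} and
@code{MODEL_} names are hypothetical, and the numeric values chosen
for the constants are assumptions; the only property relied upon is
that they are bitwise disjoint.

@example
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-ins for enum dict_class.  The particular
   values are assumed; they only need to be bitwise disjoint. */
enum model_dict_class
  @{
    MODEL_DC_ORDINARY = 0x0001,
    MODEL_DC_SYSTEM = 0x0002,
    MODEL_DC_SCRATCH = 0x0004,
    MODEL_DC_ALL = (MODEL_DC_ORDINARY | MODEL_DC_SYSTEM
                    | MODEL_DC_SCRATCH)
  @};

/* Classifies NAME by its first character. */
enum model_dict_class
model_class_from_name (const char *name)
@{
  assert (name != NULL);
  switch (name[0])
    @{
    case '$':
      return MODEL_DC_SYSTEM;
    case '#':
      return MODEL_DC_SCRATCH;
    default:
      return MODEL_DC_ORDINARY;
    @}
@}

/* Returns true if a variable named NAME falls in one of the
   classes in the bit-mask EXCLUDE. */
bool
model_is_excluded (const char *name, unsigned int exclude)
@{
  return (model_class_from_name (name) & exclude) != 0;
@}
@end example

For instance, a mask of @code{MODEL_DC_SCRATCH} excludes
@code{#tmp} but not @code{age}, and @code{MODEL_DC_ALL} excludes
every name.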

Dictionaries are represented by @struct{dictionary}. Declarations
related to dictionaries are in the @file{<data/dictionary.h>} header.

The following sections describe functions for use with dictionaries.

@menu
* Dictionary Variable Access::
* Dictionary Creating Variables::
* Dictionary Deleting Variables::
* Dictionary Reordering Variables::
* Dictionary Renaming Variables::
* Dictionary Weight Variable::
* Dictionary Filter Variable::
* Dictionary Case Limit::
* Dictionary Split Variables::
* Dictionary File Label::
* Dictionary Documents::
@end menu

@node Dictionary Variable Access
@subsection Accessing Variables

The most common operations on a dictionary simply retrieve a
@code{struct variable *} of an individual variable based on its name
or position.

@deftypefun {struct variable *} dict_lookup_var (const struct dictionary *@var{dict}, const char *@var{name})
@deftypefunx {struct variable *} dict_lookup_var_assert (const struct dictionary *@var{dict}, const char *@var{name})
Looks up and returns the variable with the given @var{name} within
@var{dict}. Name lookup is not case-sensitive.

@code{dict_lookup_var} returns a null pointer if @var{dict} does not
contain a variable named @var{name}. @code{dict_lookup_var_assert}
asserts that such a variable exists.
@end deftypefun

@deftypefun {struct variable *} dict_get_var (const struct dictionary *@var{dict}, size_t @var{position})
Returns the variable at the given @var{position} in @var{dict}.
@var{position} must be less than the number of variables in @var{dict}
(see below).
@end deftypefun

@deftypefun size_t dict_get_var_cnt (const struct dictionary *@var{dict})
Returns the number of variables in @var{dict}.
@end deftypefun

Another pair of functions allows retrieving a number of variables at
once.
These functions are more rarely useful.

@deftypefun void dict_get_vars (const struct dictionary *@var{dict}, const struct variable ***@var{vars}, size_t *@var{cnt}, enum dict_class @var{exclude})
@deftypefunx void dict_get_vars_mutable (const struct dictionary *@var{dict}, struct variable ***@var{vars}, size_t *@var{cnt}, enum dict_class @var{exclude})
Retrieves all of the variables in @var{dict}, in their original order,
except that any variables in the dictionary classes specified by
@var{exclude}, if any, are excluded (@pxref{Dictionary Class}).
Pointers to the variables are stored in an array allocated with
@code{malloc}, and a pointer to the first element of this array is
stored in @code{*@var{vars}}. The caller is responsible for freeing
this memory when it is no longer needed. The number of variables
retrieved is stored in @code{*@var{cnt}}.

The presence or absence of @code{DC_SYSTEM} in @var{exclude} has no
effect, because dictionaries never include system variables.
@end deftypefun

One additional function is available. This function is most often
used in assertions, but it is not restricted to such use.

@deftypefun bool dict_contains_var (const struct dictionary *@var{dict}, const struct variable *@var{var})
Tests whether @var{var} is one of the variables in @var{dict}.
Returns true if so, false otherwise.
@end deftypefun

@node Dictionary Creating Variables
@subsection Creating Variables

These functions create a new variable and insert it into a dictionary
in a single step.

There is no provision for inserting an already created variable into a
dictionary. There is no reason that such a function could not be
written, but so far there has been no need for one.

The names provided to one of these functions should be valid variable
names and must be plausible variable names. @c (@pxref{Variable Names}).

If a variable with the same name already exists in the dictionary, the
non-@code{assert} variants of these functions return a null pointer,
without modifying the dictionary. The @code{assert} variants, on the
other hand, assert that no duplicate name exists.

A variable may be in only one dictionary at any given time.

@deftypefun {struct variable *} dict_create_var (struct dictionary *@var{dict}, const char *@var{name}, int @var{width})
@deftypefunx {struct variable *} dict_create_var_assert (struct dictionary *@var{dict}, const char *@var{name}, int @var{width})
Creates a new variable with the given @var{name} and @var{width}, as
if through a call to @code{var_create} with those arguments
(@pxref{var_create}), appends the new variable to @var{dict}'s array
of variables, and returns the new variable.
@end deftypefun

@deftypefun {struct variable *} dict_clone_var (struct dictionary *@var{dict}, const struct variable *@var{old_var})
@deftypefunx {struct variable *} dict_clone_var_assert (struct dictionary *@var{dict}, const struct variable *@var{old_var})
Creates a new variable as a clone of @var{old_var}, inserts the new
variable into @var{dict}, and returns the new variable. The
properties of the new variable are copied from @var{old_var}, except
for those not copied by @code{var_clone} (@pxref{var_clone}).

@var{old_var} does not need to be a member of any dictionary.
@end deftypefun

@deftypefun {struct variable *} dict_clone_var_as (struct dictionary *@var{dict}, const struct variable *@var{old_var}, const char *@var{name})
@deftypefunx {struct variable *} dict_clone_var_as_assert (struct dictionary *@var{dict}, const struct variable *@var{old_var}, const char *@var{name})
These functions are similar to @code{dict_clone_var} and
@code{dict_clone_var_assert}, respectively, except that the new
variable is named @var{name} instead of keeping @var{old_var}'s name.
@end deftypefun

@node Dictionary Deleting Variables
@subsection Deleting Variables

These functions remove variables from a dictionary's array of
variables. They also destroy the removed variables and free their
associated storage.

Deleting a variable to which there might be external pointers is a bad
idea. In particular, deleting variables from the active dataset
dictionary is a risky proposition, because transformations can retain
references to arbitrary variables. Therefore, no variable should be
deleted from the active dataset dictionary when any transformations are
active, because those transformations might reference the variable to
be deleted. The safest time to delete a variable is just after a
procedure has been executed, as done by @cmd{DELETE VARIABLES}.

Deleting a variable automatically removes references to that variable
from elsewhere in the dictionary as a weighting variable, filter
variable, @cmd{SPLIT FILE} variable, or member of a vector.

No functions are provided for removing a variable from a dictionary
without destroying that variable. As with insertion of an existing
variable, there is no reason that this could not be implemented, but
so far there has been no need.

@deftypefun void dict_delete_var (struct dictionary *@var{dict}, struct variable *@var{var})
Deletes @var{var} from @var{dict}, of which it must be a member.
@end deftypefun

@deftypefun void dict_delete_vars (struct dictionary *@var{dict}, struct variable *const *@var{vars}, size_t @var{count})
Deletes the @var{count} variables in array @var{vars} from @var{dict}.
All of the variables in @var{vars} must be members of @var{dict}. No
variable may be included in @var{vars} more than once.
@end deftypefun

@deftypefun void dict_delete_consecutive_vars (struct dictionary *@var{dict}, size_t @var{idx}, size_t @var{count})
Deletes the variables in sequential positions
@var{idx}@dots{}@var{idx} + @var{count} (exclusive) from @var{dict},
which must contain at least @var{idx} + @var{count} variables.
@end deftypefun

@deftypefun void dict_delete_scratch_vars (struct dictionary *@var{dict})
Deletes all scratch variables from @var{dict}.
@end deftypefun

@node Dictionary Reordering Variables
@subsection Changing Variable Order

The variables in a dictionary are stored in an array. These functions
change the order of a dictionary's array of variables without changing
which variables are in the dictionary.

@anchor{dict_reorder_var}
@deftypefun void dict_reorder_var (struct dictionary *@var{dict}, struct variable *@var{var}, size_t @var{new_index})
Moves @var{var}, which must be in @var{dict}, so that it is at
position @var{new_index} in @var{dict}'s array of variables. Other
variables in @var{dict}, if any, retain their relative positions.
@var{new_index} must be less than the number of variables in
@var{dict}.
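
The effect on the array can be pictured with the following standalone
sketch, which moves one element of a plain @code{int} array. It is
an illustration only: @code{model_move_element} is a hypothetical
helper, and integers stand in for the dictionary's variables.

@example
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Moves the element at OLD_INDEX in the N-element array ELEMS to
   NEW_INDEX, shifting the elements in between by one position so
   that all other elements keep their relative order. */
void
model_move_element (int elems[], size_t n,
                    size_t old_index, size_t new_index)
@{
  int moved;

  assert (old_index < n && new_index < n);
  moved = elems[old_index];
  if (new_index < old_index)
    memmove (&elems[new_index + 1], &elems[new_index],
             (old_index - new_index) * sizeof *elems);
  else if (new_index > old_index)
    memmove (&elems[old_index], &elems[old_index + 1],
             (new_index - old_index) * sizeof *elems);
  elems[new_index] = moved;
@}
@end example

Moving the element at index 3 of the array 10, 20, 30, 40, 50 to
index 1 gives 10, 40, 20, 30, 50.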
@end deftypefun

@deftypefun void dict_reorder_vars (struct dictionary *@var{dict}, struct variable *const *@var{new_order}, size_t @var{count})
Moves the @var{count} variables in @var{new_order} to the beginning of
@var{dict}'s array of variables in the specified order. Other
variables in @var{dict}, if any, retain their relative positions.

All of the variables in @var{new_order} must be in @var{dict}. No
duplicates are allowed within @var{new_order}, which means that
@var{count} must be no greater than the number of variables in
@var{dict}.
@end deftypefun

@node Dictionary Renaming Variables
@subsection Renaming Variables

These functions change the names of variables within a dictionary.
The @func{var_set_name} function (@pxref{var_set_name}) cannot be
applied directly to a variable that is in a dictionary, because
@struct{dictionary} contains an index by name that @func{var_set_name}
would not update. The following functions take care to update the
index as well. They also ensure that variable renaming does not cause
a dictionary to contain a duplicate variable name.

@deftypefun void dict_rename_var (struct dictionary *@var{dict}, struct variable *@var{var}, const char *@var{new_name})
Changes the name of @var{var}, which must be in @var{dict}, to
@var{new_name}. A variable named @var{new_name} must not already be
in @var{dict}, unless @var{new_name} is the same as @var{var}'s
current name.
@end deftypefun

@deftypefun bool dict_rename_vars (struct dictionary *@var{dict}, struct variable **@var{vars}, char **@var{new_names}, size_t @var{count}, char **@var{err_name})
Renames each of the @var{count} variables in @var{vars} to the name in
the corresponding position of @var{new_names}.
If the renaming would
result in a duplicate variable name, returns false and stores one of
the names that would be duplicated into @code{*@var{err_name}}, if
@var{err_name} is non-null. Otherwise, the renaming is successful,
and true is returned.
@end deftypefun

@node Dictionary Weight Variable
@subsection Weight Variable

A data set's cases may optionally be weighted by the value of a
numeric variable. @xref{WEIGHT,,,pspp, PSPP Users Guide}, for a user
view of weight variables.

The weight variable is written to and read from system and portable
files.

The most commonly useful function related to weighting is a
convenience function to retrieve a weighting value from a case.

@deftypefun double dict_get_case_weight (const struct dictionary *@var{dict}, const struct ccase *@var{case}, bool *@var{warn_on_invalid})
Retrieves and returns the value of the weighting variable specified by
@var{dict} from @var{case}. Returns 1.0 if @var{dict} has no
weighting variable.

Returns 0.0 if @var{case}'s weight value is user- or system-missing,
zero, or negative. In such a case, if @var{warn_on_invalid} is
non-null and @code{*@var{warn_on_invalid}} is true,
@func{dict_get_case_weight} also issues an error message and sets
@code{*@var{warn_on_invalid}} to false. To disable error reporting,
pass a null pointer or a pointer to false as @var{warn_on_invalid} or
use a @func{msg_disable}/@func{msg_enable} pair.
@end deftypefun

The dictionary also has a pair of functions for getting and setting
the weight variable.

@deftypefun {struct variable *} dict_get_weight (const struct dictionary *@var{dict})
Returns @var{dict}'s current weighting variable, or a null pointer if
the dictionary does not have a weighting variable.
@end deftypefun

@deftypefun void dict_set_weight (struct dictionary *@var{dict}, struct variable *@var{var})
Sets @var{dict}'s weighting variable to @var{var}. If @var{var} is
non-null, it must be a numeric variable in @var{dict}. If @var{var}
is null, then @var{dict}'s weighting variable, if any, is cleared.
@end deftypefun

@node Dictionary Filter Variable
@subsection Filter Variable

When the active dataset is read by a procedure, cases can be excluded
from analysis based on the values of a @dfn{filter variable}.
@xref{FILTER,,,pspp, PSPP Users Guide}, for a user view of filtering.

These functions store and retrieve the filter variable. They are
rarely useful, because the data analysis framework automatically
excludes from analysis the cases that should be filtered.

@deftypefun {struct variable *} dict_get_filter (const struct dictionary *@var{dict})
Returns @var{dict}'s current filter variable, or a null pointer if the
dictionary does not have a filter variable.
@end deftypefun

@deftypefun void dict_set_filter (struct dictionary *@var{dict}, struct variable *@var{var})
Sets @var{dict}'s filter variable to @var{var}. If @var{var} is
non-null, it must be a numeric variable in @var{dict}. If @var{var}
is null, then @var{dict}'s filter variable, if any, is cleared.
@end deftypefun

@node Dictionary Case Limit
@subsection Case Limit

The limit on cases analyzed by a procedure, set by the @cmd{N OF
CASES} command (@pxref{N OF CASES,,,pspp, PSPP Users Guide}), is
stored as part of the dictionary. The dictionary does not, on the
other hand, play any role in enforcing the case limit (a job done by
data analysis framework code).

A case limit of 0 means that the number of cases is not limited.
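
The convention that 0 means no limit can be sketched as follows. This
is a standalone illustration, not PSPP's enforcement code:
@code{case_within_limit} is a hypothetical helper, and @code{long}
stands in for @code{casenumber} as an assumption so that the sketch
compiles on its own.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical helper, not part of PSPP.  Returns true if the case
   with 0-based index CASE_IDX should be analyzed under LIMIT, where
   a LIMIT of 0 means that the number of cases is not limited. */
bool
case_within_limit (long case_idx, long limit)
{
  assert (case_idx >= 0);
  assert (limit >= 0);
  return limit == 0 || case_idx < limit;
}
```

With a limit of 5, this accepts the cases at indexes 0 through 4 and
rejects every later case. With a limit of 0, it accepts every case.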

These functions are rarely useful, because the data analysis framework
automatically excludes from analysis any cases beyond the limit.

@deftypefun casenumber dict_get_case_limit (const struct dictionary *@var{dict})
Returns the current case limit for @var{dict}.
@end deftypefun

@deftypefun void dict_set_case_limit (struct dictionary *@var{dict}, casenumber @var{limit})
Sets @var{dict}'s case limit to @var{limit}.
@end deftypefun

@node Dictionary Split Variables
@subsection Split Variables

The user may use the @cmd{SPLIT FILE} command (@pxref{SPLIT
FILE,,,pspp, PSPP Users Guide}) to select a set of variables on which
to split the active dataset into groups of cases to be analyzed
independently in each statistical procedure. The set of split
variables is stored as part of the dictionary, although the effect on
data analysis is implemented by each individual statistical procedure.

Split variables may be numeric or short or long string variables.

The most useful functions for split variables are those to retrieve
them. Even these functions are rarely useful directly: for the
purpose of breaking cases into groups based on the values of the split
variables, it is usually easier to use
@func{casegrouper_create_splits}.

@deftypefun {const struct variable *const *} dict_get_split_vars (const struct dictionary *@var{dict})
Returns a pointer to an array of pointers to split variables. If and
only if there are no split variables, returns a null pointer. The
caller must not modify or free the returned array.
@end deftypefun

@deftypefun size_t dict_get_split_cnt (const struct dictionary *@var{dict})
Returns the number of split variables.
@end deftypefun

The following functions are also available for working with split
variables.

@deftypefun void dict_set_split_vars (struct dictionary *@var{dict}, struct variable *const *@var{vars}, size_t @var{cnt})
Sets @var{dict}'s split variables to the @var{cnt} variables in
@var{vars}. If @var{cnt} is 0, then @var{dict} will not have any
split variables. The caller retains ownership of @var{vars}.
@end deftypefun

@deftypefun void dict_unset_split_var (struct dictionary *@var{dict}, struct variable *@var{var})
Removes @var{var}, which must be a variable in @var{dict}, from
@var{dict}'s set of split variables.
@end deftypefun

@node Dictionary File Label
@subsection File Label

A dictionary may optionally have an associated string that describes
its contents, called its file label. The user may set the file label
with the @cmd{FILE LABEL} command (@pxref{FILE LABEL,,,pspp, PSPP
Users Guide}).

These functions set and retrieve the file label.

@deftypefun {const char *} dict_get_label (const struct dictionary *@var{dict})
Returns @var{dict}'s file label. If @var{dict} does not have a label,
returns a null pointer.
@end deftypefun

@deftypefun void dict_set_label (struct dictionary *@var{dict}, const char *@var{label})
Sets @var{dict}'s label to @var{label}. If @var{label} is non-null,
then its content, truncated to at most 60 bytes, becomes the new file
label. If @var{label} is null, then @var{dict}'s label is removed.

The caller retains ownership of @var{label}.
@end deftypefun

@node Dictionary Documents
@subsection Documents

A dictionary may include an arbitrary number of lines of explanatory
text, called the dictionary's documents. For compatibility, document
lines have a fixed width, and lines that are not exactly this width
are truncated or padded with spaces as necessary to bring them to the
correct width.
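
As a rough illustration of the fixed-width rule, the following
standalone C sketch truncates or pads a string to a fixed width. It
is not taken from PSPP: @code{fit_document_line} and
@code{LINE_WIDTH} are hypothetical names, the width of 80 bytes is
assumed to match the documented document line length, and truncation
is done bytewise without regard to multibyte characters.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Assumed document line width in bytes; defined locally so that the
   sketch compiles without any PSPP headers. */
#define LINE_WIDTH 80

/* Hypothetical helper, not part of PSPP.  Copies TEXT into OUT as
   exactly LINE_WIDTH bytes, truncating text that is too long and
   padding text that is too short with spaces.  OUT is
   null-terminated, so it must have room for LINE_WIDTH + 1 bytes. */
void
fit_document_line (const char *text, char out[LINE_WIDTH + 1])
{
  assert (text != NULL);
  assert (out != NULL);

  size_t len = strlen (text);
  if (len > LINE_WIDTH)
    len = LINE_WIDTH;                         /* Truncate. */

  memcpy (out, text, len);
  memset (out + len, ' ', LINE_WIDTH - len);  /* Pad with spaces. */
  out[LINE_WIDTH] = '\0';
}
```

Whatever the length of the input, the result always has a length of
exactly @code{LINE_WIDTH} bytes, which is what allows a whole set of
lines to be stored back to back in one string.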

PSPP users can use the @cmd{DOCUMENT} (@pxref{DOCUMENT,,,pspp, PSPP
Users Guide}), @cmd{ADD DOCUMENT} (@pxref{ADD DOCUMENT,,,pspp, PSPP
Users Guide}), and @cmd{DROP DOCUMENTS} (@pxref{DROP DOCUMENTS,,,pspp,
PSPP Users Guide}) commands to manipulate documents.

@deftypefn Macro int DOC_LINE_LENGTH
The fixed length of a document line, in bytes, defined to be 80.
@end deftypefn

The following functions work with whole sets of documents. They
accept or return sets of documents formatted as null-terminated
strings that are an exact multiple of @code{DOC_LINE_LENGTH}
bytes in length.

@deftypefun {const char *} dict_get_documents (const struct dictionary *@var{dict})
Returns the documents in @var{dict}, or a null pointer if @var{dict}
has no documents.
@end deftypefun

@deftypefun void dict_set_documents (struct dictionary *@var{dict}, const char *@var{new_documents})
Sets @var{dict}'s documents to @var{new_documents}. If
@var{new_documents} is a null pointer or an empty string, then
@var{dict}'s documents are cleared. The caller retains ownership of
@var{new_documents}.
@end deftypefun

@deftypefun void dict_clear_documents (struct dictionary *@var{dict})
Clears the documents from @var{dict}.
@end deftypefun

The following functions work with individual lines in a dictionary's
set of documents.

@deftypefun void dict_add_document_line (struct dictionary *@var{dict}, const char *@var{content})
Appends @var{content} to the documents in @var{dict}. The text in
@var{content} will be truncated or padded with spaces as necessary to
make it exactly @code{DOC_LINE_LENGTH} bytes long. The caller retains
ownership of @var{content}.

If @var{content} is longer than @code{DOC_LINE_LENGTH} bytes, this
function also issues a warning using @func{msg}.
To suppress the warning, enclose a
call to this function in a @func{msg_disable}/@func{msg_enable}
pair.
@end deftypefun

@deftypefun size_t dict_get_document_line_cnt (const struct dictionary *@var{dict})
Returns the number of lines of documents in @var{dict}. If the
dictionary contains no documents, returns 0.
@end deftypefun

@deftypefun void dict_get_document_line (const struct dictionary *@var{dict}, size_t @var{idx}, struct string *@var{content})
Replaces the text in @var{content} (which must already have been
initialized by the caller) by the document line in @var{dict} numbered
@var{idx}, which must be less than the number of lines of documents in
@var{dict}. Any trailing white space in the document line is trimmed,
so that @var{content} will have a length between 0 and
@code{DOC_LINE_LENGTH}.
@end deftypefun

@node Coding Conventions
@section Coding Conventions

Every @file{.c} file should have @samp{#include <config.h>} as its
first non-comment line. No @file{.h} file should include
@file{config.h}.

This section needs to be finished.

@node Cases
@section Cases

This section needs to be written.

@node Data Sets
@section Data Sets

This section needs to be written.

@node Pools
@section Pools

This section needs to be written.

@c LocalWords: bool