The \eslmod{fileparser} module parses simple text data files that
consist of whitespace-delimited tokens.

Data files can contain blank lines and comments. A comment is
introduced by a single character; for instance, a \verb+#+ character
commonly means that everything following the \verb+#+ on the line is a
comment.

Two styles of token input are supported. The simpler style reads
tokens one at a time, regardless of what line they occur on, until the
file ends. Alternatively, you can read in a line-oriented way: you get
one data line at a time, then read all the tokens on that line. This
style lets you count how many tokens occur on each data line, which
allows stricter checking of your input.

The module implements one object, an \ccode{ESL\_FILEPARSER}, that
holds the open input stream and the state of the parser.  The
functions in the API are summarized in Table~\ref{tbl:fileparser_api}.
\begin{table}[hbp]
\begin{center}
{\scriptsize
\begin{tabular}{|lp{3.5in}|}\hline
\hyperlink{func:esl_fileparser_Open()}{\ccode{esl\_fileparser\_Open()}}
& Open a file for parsing.\\
\hyperlink{func:esl_fileparser_Create()}{\ccode{esl\_fileparser\_Create()}}
& Associate an already open stream with a new parser.\\
\hyperlink{func:esl_fileparser_SetCommentChar()}{\ccode{esl\_fileparser\_SetCommentChar()}}
& Set the character that defines the start of a comment.\\
\hyperlink{func:esl_fileparser_NextLine()}{\ccode{esl\_fileparser\_NextLine()}}
& Advance the parser to the next line containing a token.\\
\hyperlink{func:esl_fileparser_GetToken()}{\ccode{esl\_fileparser\_GetToken()}}
& Get the next token in the file.\\
\hyperlink{func:esl_fileparser_GetTokenOnLine()}{\ccode{esl\_fileparser\_GetTokenOnLine()}}
& Get the next token on the current line.\\
\hyperlink{func:esl_fileparser_Destroy()}{\ccode{esl\_fileparser\_Destroy()}}
& Deallocate a parser that was \ccode{Create()}'d.\\
\hyperlink{func:esl_fileparser_Close()}{\ccode{esl\_fileparser\_Close()}}
& Close a parser that was \ccode{Open()}'d.\\
\hline
\end{tabular}
}
\end{center}
\caption{The \eslmod{fileparser} API.}
\label{tbl:fileparser_api}
\end{table}
\subsection{Example of using the fileparser API}

An example that opens a file, reads all its tokens one at a time, and
prints out the token number, token length, and the token itself:

\input{cexcerpts/fileparser_example}

A single character can be defined to serve as the comment character
(often \ccode{\#}), using the \ccode{esl\_fileparser\_SetCommentChar()}
call. The parser ignores the comment character and the remainder of
the line that follows it.
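
As a rough model of this rule, the following sketch truncates a line
at the first comment character. It uses only the standard C library;
\ccode{strip\_comment()} is a hypothetical helper written for this
illustration, not an Easel function, and the parser's internal
handling of comments may differ in detail.

\begin{verbatim}
#include <string.h>

/* Truncate `line` in place at the first occurrence of `commentchar`.
 * Hypothetical helper for illustration; it is not part of Easel. */
void
strip_comment(char *line, char commentchar)
{
  char *p = strchr(line, commentchar);  /* NULL if there is no comment */
  if (p != NULL) *p = '\0';
}
\end{verbatim}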

Each call to \ccode{esl\_fileparser\_GetToken()} retrieves one
whitespace-delimited token from the input stream. The call returns
\ccode{eslOK} if a token is parsed, and \ccode{eslEOF} when there are
no more tokens in the file. Whitespace is defined as space, tab,
newline, or carriage return (\verb+" \t\n\r"+).
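
The whitespace rule can be illustrated with a minimal tokenizer over
an in-memory string. This is a sketch under stated assumptions, not
the module's implementation: \ccode{next\_token()} is a hypothetical
helper built on the standard \ccode{strspn()} and \ccode{strcspn()}
functions, and it returns \ccode{NULL} where the real call would
return \ccode{eslEOF}.

\begin{verbatim}
#include <string.h>

/* Return a pointer to the next token in *s, or NULL if none remain.
 * Hypothetical helper for illustration; it is not part of Easel.
 * The token is NUL-terminated in place and *s is advanced past it.
 * If toklen is non-NULL, the token length is stored there. */
char *
next_token(char **s, int *toklen)
{
  const char *ws = " \t\n\r";     /* the whitespace set named in the text */
  char       *tok;
  size_t      n;

  if (*s == NULL) return NULL;
  tok = *s + strspn(*s, ws);      /* skip leading whitespace              */
  if (*tok == '\0') return NULL;  /* only whitespace was left             */
  n  = strcspn(tok, ws);          /* token runs up to the next whitespace */
  *s = tok + n;
  if (**s != '\0') {              /* terminate token, step past delimiter */
    **s = '\0';
    (*s)++;
  }
  if (toklen != NULL) *toklen = (int) n;
  return tok;
}
\end{verbatim}

Calling \ccode{next\_token()} repeatedly until it returns
\ccode{NULL} mirrors the token-at-a-time loop in the example above.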

When the caller is done, the fileparser is closed with
\ccode{esl\_fileparser\_Close()}.
\subsection{A second example: line-oriented parsing}

The \ccode{esl\_fileparser\_GetToken()} call provides a simple style
of parsing a file: read one token at a time until the file ends,
regardless of what line the tokens are on. However, you may want to
know how many tokens are on a given data line, either because you know
how many there should be (and you want to verify that) or because you
don't (and you need to allocate some variable-size data structure
appropriately). The following example reads a file line by line:

\input{cexcerpts/fileparser_example2}

For each data line, this example prints the actual line number in the
file (starting from 1), the data line number (a count that excludes
comments and blank lines), and the number of tokens on the line.
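
To make the per-line count concrete, here is a minimal standalone
sketch that counts the tokens on a single line held in memory. It
assumes that a comment character ends the data on the line wherever
it appears; \ccode{count\_tokens()} is a hypothetical helper, not an
Easel function. A count of zero corresponds to a blank or
comment-only line, which a line-oriented reader would skip.

\begin{verbatim}
#include <string.h>

/* Count the whitespace-delimited tokens on one line, stopping at the
 * first `commentchar` (assumed to end the data wherever it appears).
 * Hypothetical helper for illustration; it is not part of Easel. */
int
count_tokens(const char *line, char commentchar)
{
  const char *ws      = " \t\n\r";
  char        stop[6] = " \t\n\r";  /* whitespace plus the comment char */
  const char *p       = line;
  int         n       = 0;

  stop[4] = commentchar;
  stop[5] = '\0';
  for (;;) {
    p += strspn(p, ws);                          /* skip whitespace     */
    if (*p == '\0' || *p == commentchar) break;  /* no more data        */
    p += strcspn(p, stop);                       /* step over one token */
    n++;
  }
  return n;
}
\end{verbatim}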

Note the use of \ccode{efp->linenumber} to obtain the current line
number in the file. You can use this to produce informative error
messages. If a token is not what you expected, you probably want to
provide some diagnostic output to the user, and
\ccode{efp->linenumber} lets you direct the user to the line where the
failure occurred.
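
One possible shape for such a diagnostic is sketched below. The
helper \ccode{format\_parse\_error()} and its message layout are
assumptions made for this illustration, not part of Easel; in a real
program the \ccode{linenumber} argument would be taken from
\ccode{efp->linenumber}.

\begin{verbatim}
#include <stdio.h>

/* Format a diagnostic naming the file, the line, and the bad token.
 * Hypothetical helper for illustration; it is not part of Easel.
 * Returns snprintf()'s result: the length of the full message. */
int
format_parse_error(char *buf, size_t n, const char *filename,
                   int linenumber, const char *tok)
{
  return snprintf(buf, n, "%s, line %d: unexpected token '%s'",
                  filename, linenumber, tok);
}
\end{verbatim}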