
libutils.a now provides parameter lists. This file documents them.

The idea is that every cado tool (so far polyselect/*, sieve/makefb, and
sieve/sieve, for example) can have its working configuration either
completely within a file, or completely on the command line.

The old way still works unchanged. There are now simply more ways to
call programs, such as:

polyselect/polyselect 71641520761751435455133616475667090434063332228247871795429

or with a config file (here supplied on stdin):
polyselect/kleinjung << EOF
M=5e20
degree=5
l=8
n=3835722565249558322197842586047190634357074639921908543369929770877061053472916307274359341359614012333625966961056935597194003176666977
pb=256
EOF

The syntax of config files that can be read is the union of the
ggnfs/cado format, and the params file for the now obsolete cadofactor.pl
script:

file=(<comment-line> | <non-comment>)*
comment-line = <empty-line>|<white>#.*
non-comment = <white><key><white><separator><white><value><white>(#.*)?
white=<whitespace>*
key=<alphabetic>[<alphanumeric>_]*
separator=(:= | : | =)
value=<non-space>+
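The grammar above can be turned into a small line parser. What follows is
only a sketch under a literal reading of that grammar: the function
parse_config_line and its return codes are hypothetical, and this is not
the code found in params.[ch].

```c
#include <assert.h>
#include <ctype.h>
#include <string.h>

/* Hypothetical helper, not part of params.[ch]: split one config line
 * into key and value following the grammar above. Returns 1 if the line
 * holds a (key, value) pair, 0 for an empty or comment line, and -1 on
 * a syntax error. key and value must each have room for len bytes. */
static int parse_config_line(const char *line, char *key, char *value,
                             size_t len)
{
    assert(line != NULL && key != NULL && value != NULL && len > 0);
    const char *p = line;
    size_t n;

    while (isspace((unsigned char) *p)) p++;
    if (*p == '\0' || *p == '#') return 0;      /* comment-line */

    /* key = <alphabetic>[<alphanumeric>_]* */
    if (!isalpha((unsigned char) *p)) return -1;
    for (n = 0; isalnum((unsigned char) p[n]) || p[n] == '_'; n++) ;
    if (n >= len) return -1;
    memcpy(key, p, n);
    key[n] = '\0';
    p += n;

    /* separator = (:= | : | =), with optional whitespace around it */
    while (isspace((unsigned char) *p)) p++;
    if (p[0] == ':' && p[1] == '=') p += 2;
    else if (*p == ':' || *p == '=') p++;
    else return -1;
    while (isspace((unsigned char) *p)) p++;

    /* value = <non-space>+ */
    for (n = 0; p[n] != '\0' && !isspace((unsigned char) p[n]); n++) ;
    if (n == 0 || n >= len) return -1;
    memcpy(value, p, n);
    value[n] = '\0';
    p += n;

    /* only whitespace or a trailing comment may follow the value */
    while (isspace((unsigned char) *p)) p++;
    return (*p == '\0' || *p == '#') ? 1 : -1;
}
```

With this sketch, a line such as "  pb := 256  # prime bound" yields the
key "pb" and the value "256", while a line holding only whitespace or a
comment yields 0.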

I see several benefits.
- for very quick tests, putting everything on the cmdline is handy.
- for regression tests, being able to put everything in a file is handy.
- this deletes some code (but adds more). The cost per extra parameter is
  lower.


The processing goes through two steps:
- build a dictionary of (key, value) pairs (as character strings).
- parse some values to set variables in the program.
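The two steps above can be pictured with a miniature dictionary. This is
a sketch only: struct dict, dict_add, dict_lookup and dict_parse_int are
hypothetical names, and the real param_list type is richer than this. In
particular, returning 0 on a malformed value is an assumption of the
sketch, not a statement about the library.

```c
#include <assert.h>
#include <errno.h>
#include <limits.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical miniature of the two-step scheme. */
#define MAX_PAIRS 32
struct pair { const char *key; const char *value; };
struct dict { struct pair p[MAX_PAIRS]; int n; };

/* Step 1: record a (key, value) pair, both kept as strings. Duplicate
 * keys are not handled here. Returns 0 if the table is full. */
static int dict_add(struct dict *d, const char *key, const char *value)
{
    assert(d != NULL && key != NULL && value != NULL);
    if (d->n == MAX_PAIRS) return 0;
    d->p[d->n].key = key;
    d->p[d->n].value = value;
    d->n++;
    return 1;
}

/* Returns the stored string for key, or NULL if there is none. */
static const char *dict_lookup(const struct dict *d, const char *key)
{
    assert(d != NULL && key != NULL);
    for (int i = 0; i < d->n; i++) {
        if (strcmp(d->p[i].key, key) == 0) return d->p[i].value;
    }
    return NULL;
}

/* Step 2: parse the value into a program variable. Returns 1 and sets
 * *out when key is present and holds a valid int; otherwise returns 0
 * and leaves *out (the default set by the caller) untouched. */
static int dict_parse_int(const struct dict *d, const char *key, int *out)
{
    assert(out != NULL);
    const char *s = dict_lookup(d, key);
    char *end;
    long v;

    if (s == NULL) return 0;
    errno = 0;
    v = strtol(s, &end, 10);
    if (end == s || *end != '\0' || errno == ERANGE
            || v < INT_MIN || v > INT_MAX)
        return 0;
    *out = (int) v;
    return 1;
}
```

The point of the split is that step 1 does not need to know which keys
the program cares about, nor their types; only step 2 does.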

###################################################################
# Quick documentation for the impatient.

Commonly accepted syntaxes for options are --foo blah, -foo blah, or
foo=blah. On a first pass, keys are only recognized as strings; parsing
of the values comes later, once lexing has completed.

Some terminology first.

``switches'' is the term I use for options which trigger something by
their mere presence, not requiring any other ``value'' input. The
archetypal example is --verbose.

``aliases'' are a means of indicating that a given option may be
referenced in more than one manner.

Effort has been put towards making it possible and easy to source a
config file hosting a (possibly large) number of options.



To use params.[ch], your main() function should follow this pattern.

    param_list pl;

    param_list_init(pl);

    argv++,argc--;
    /* switches, if any. See below */
    /* aliases, if any. See below */

    for( ; argc ; ) {
        if (param_list_update_cmdline(pl, &argc, &argv)) { continue; }
        /* Do perhaps some other things on the arguments that haven't
         * been eaten at all. Like check whether it is a valid file to
         * source in order to get more options. See
         * param_list_read_stream and param_list_read_file for that. */
        fprintf (stderr, "Unknown option: %s\n", argv[0]);
        usage();
    }

    /* Now parse the values corresponding to options, and map their
     * meaning somewhere */
    const char * tmp;

    /* param_list_lookup_string gives a const char * pointer which will
     * live only for as long as the param_list structure isn't freed. The
     * return value is NULL if no such key exists */

    if ((tmp = param_list_lookup_string(pl, "subdir")) != NULL) {
        fprintf(stderr, "--subdir is no longer supported."
                " prepend --out with a path instead.\n");
        exit(1);
    }

    /* ... */

    /* It is possible to strdup() it of course. Check for NULL first:
     * passing NULL to strdup() is undefined behavior. */
    if ((tmp = param_list_lookup_string(pl, "out")) == NULL) {
        fprintf(stderr, "Required argument --out is missing\n");
        exit(1);
    }
    working_filename = strdup(tmp);
    /* ... */


    /* all parsing functions return 1 when the key was found, 0 if not.
     * If no key was found, no assignment is performed and the default
     * value set beforehand remains */
    int split[2] = {1,1};   /* defaults, declared before any parsing */
    param_list_parse_int(pl, "hslices", &split[0]);
    param_list_parse_int(pl, "vslices", &split[1]);

    param_list_parse_intxint(pl, "nbuckets", split);

    /* Good practice mandates that unused parameters trigger a warning,
     * if not an error. Only unused parameters from the command line are
     * considered for issuance of a warning. Unused parameters found in
     * config files are considered normal and trigger nothing.
     */
    if (param_list_warn_unused(pl)) {
        usage();
    }
    param_list_clear(pl);


It is also possible to configure switches. Configuring a switch goes
together with specifying a pointer to the switch value, which must have
type int.

    param_list_configure_switch(pl, "--legacy", &legacy);
    param_list_configure_switch(pl, "--remove-input", &remove_input);
    param_list_configure_switch(pl, "--pad", &pad_to_square);
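To illustrate what tying a switch to an int pointer amounts to, here is a
minimal stand-alone sketch. The names struct switch_table,
configure_switch and match_switch are hypothetical and are not the
params.[ch] implementation. The sketch assumes that meeting a switch
sets the int to 1; whether the library sets or increments the value is
not stated here.

```c
#include <assert.h>
#include <string.h>

/* Hypothetical table tying switch names to int variables. */
#define MAX_SWITCHES 16
struct switch_table {
    const char *name[MAX_SWITCHES];
    int *flag[MAX_SWITCHES];
    int n;
};

/* Register a switch together with the int it controls. */
static void configure_switch(struct switch_table *t, const char *name,
                             int *flag)
{
    assert(t != NULL && name != NULL && flag != NULL);
    assert(t->n < MAX_SWITCHES);
    t->name[t->n] = name;
    t->flag[t->n] = flag;
    t->n++;
}

/* If argv[0] is a configured switch, set its int to 1 (assumption of
 * this sketch), consume that single argument and return 1. Otherwise
 * return 0 and leave argc, argv untouched. */
static int match_switch(const struct switch_table *t, int *argc,
                        char ***argv)
{
    assert(t != NULL && argc != NULL && argv != NULL);
    if (*argc < 1) return 0;
    for (int i = 0; i < t->n; i++) {
        if (strcmp((*argv)[0], t->name[i]) == 0) {
            *t->flag[i] = 1;
            (*argv)++;
            (*argc)--;
            return 1;
        }
    }
    return 0;
}
```

A switch consumes exactly one argument, which is what distinguishes it
from the --key value form, where two arguments are consumed.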

Aliases:
    param_list_configure_alias(pl, "--pad", "--square");
    // means that the switch --pad may be aliased as --square

    param_list_configure_alias(pl, "out", "output-name");
    // means that --out, -out, out=, and so on, may be aliased as
    // --output-name, etc.

    There's one pitfall. If you use the former syntax for a non-switch
    argument (e.g. --out -> --output-name), then the other forms will
    not be aliased.


###################################################################
# Slightly longer doc.

Roadmap for use:

** BUILDING THE DICTIONARY **

Start with:

    param_list pl;
    param_list_init(pl);

    argv++, argc--;
    for( ; argc ; ) {

Then we're parsing arguments in argv,argc as we always have been. The
only difference (for some programs) is that now argv is shifted so that
the ``current'' argument is at argv[0].

The param_list routines are good at:
- parsing arguments of the form --degree 42 , or -d 42 , or degree=42.
- parsing config files.

So far, they are _not_ made for:
- booleans (-skip-something)
- switches (-verbose)
- things not foreseen.

Therefore, before giving your argv,argc to the param_list routines, you
have the opportunity to parse your command line as you always have.
Do so only where strictly necessary, since for those arguments you
won't get the benefit of allowing information to come from config files:

    if (strcmp(argv[0], "-v") == 0) { verbose++; argv++,argc--; continue; }

You may call the param_list routines to recognize certain specific
arguments on the command line, like:

    if (param_list_update_cmdline(pl, "degree", &argc, &argv)) { continue; }

This will recognize --degree 42, as well as degree=42, on the command
line. Upon success, 1 is returned, and argv, argc advance by the right
number of positions.

Or you could accept just everything that ``resembles'' a parameter.
Formally, a parameter matches
    (-<key> <value> | --<key> <value> | <key>=<value>)
    where <key>=<alphabetic>[<alphanumeric>_]*
To match in such a wildcard way (which is the easy way to go, really):

    if (param_list_update_cmdline(pl, NULL, &argc, &argv)) { continue; }
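The wildcard match described above can be sketched as a stand-alone
function. The names key_length and match_param are hypothetical, and the
sketch only hands back the key and value instead of storing them in a
param_list; it is meant to show how many arguments each of the three
forms consumes, not how the library is written.

```c
#include <assert.h>
#include <ctype.h>
#include <string.h>

/* Hypothetical: length of the <key> starting at s, 0 if there is none.
 * <key> = <alphabetic>[<alphanumeric>_]* */
static size_t key_length(const char *s)
{
    size_t n = 0;
    if (!isalpha((unsigned char) s[0])) return 0;
    while (isalnum((unsigned char) s[n]) || s[n] == '_') n++;
    return n;
}

/* Hypothetical stand-in for the wildcard match. On a match, copies the
 * key into key[] (room for len bytes), points *value at the value,
 * advances *argc and *argv past what was consumed and returns 1.
 * Otherwise returns 0 and leaves *argc and *argv untouched. */
static int match_param(int *argc, char ***argv, char *key, size_t len,
                       const char **value)
{
    assert(argc != NULL && argv != NULL && key != NULL && value != NULL);
    if (*argc < 1) return 0;
    const char *a = (*argv)[0];
    size_t n = key_length(a);

    /* <key>=<value>: a single argument */
    if (n > 0 && n < len && a[n] == '=') {
        memcpy(key, a, n);
        key[n] = '\0';
        *value = a + n + 1;
        (*argv)++;
        (*argc)--;
        return 1;
    }
    /* -<key> <value> or --<key> <value>: two arguments */
    if (a[0] == '-' && *argc >= 2) {
        const char *k = a + (a[1] == '-' ? 2 : 1);
        n = key_length(k);
        if (n > 0 && n < len && k[n] == '\0') {
            memcpy(key, k, n);
            key[n] = '\0';
            *value = (*argv)[1];
            *argv += 2;
            *argc -= 2;
            return 1;
        }
    }
    return 0;
}
```

Anything that falls through with a return value of 0 is left at argv[0]
for the caller to deal with.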

That's mostly it.  If we haven't hit a 'continue' statement within the
loop already, then we might have some unparsed garbage on the command
line (1). Or just something special (2). Or we might be very polite and
allow the user to specify the name of an input parameter file, freeform,
to be parsed right now (3). Or we've got some very peculiar way of
receiving config arguments, so we'll catch them now (4).

To illustrate these cases, here are some snippets. For case (2):

      if (strspn(argv[0], "0123456789") == strlen(argv[0])) {
          param_list_add_key(pl, "n", argv[0], PARAMETER_FROM_CMDLINE);
          argv++,argc--;
          continue;
      }

The call to param_list_add_key above exposes what is almost an internal
routine, and having it in the API is just an ugly convenience measure.
In particular, notice that it's your job at this point to position
argv,argc correctly for the next round.

After such a test, case (3) can be handled like this:

      /* f is a FILE * declared beforehand */
      if ((f = fopen(argv[0], "r")) != NULL) {
          param_list_read_stream(pl, f, 0);
          fclose(f);
          argv++,argc--;
          continue;
      }

[begin digression: another way of accepting input files is to do so
while parsing arguments (before calling param_list_update_cmdline):

      if (argc >= 2 && strcmp (argv[0], "-poly") == 0) {
          param_list_read_file(pl, argv[1]);
          argv++,argc--;
          argv++,argc--;
          continue;
      }

Notice that param_list_read_file fails when the file does not exist,
while the former option above allows for the file not to exist.
end digression]

Finally, case (1) would go like this:

      fprintf(stderr, "Unhandled parameter %s\n", argv[0]);
      usage();

The ``peculiar cases'' (case (4)) are to be handled in a way which
depends exactly on how parameters are expected. Look into the handling
of amin amax bmin bmax within sieve/sieve.c for an example.

Extra: parameters may have aliases. For the moment, aliases are better
recognized before calling param_list_update_cmdline, by special calls to
param_list_update_cmdline_alias (see polyselect/polyselect.c). But this
may change.

** PARSING VALUES **

All the param_list_parse_* routines do parsing, and set variables. They
return 1 on success, 0 on failure.

It is possible to update the dictionary again depending on the result of
a parsing operation. For example, polyselect reads stdin only if it needs
it for knowing n:

  int have_n = param_list_parse_mpz(pl, "n", poly->n);

  if (!have_n) {
      if (verbose) {
          fprintf(stderr, "Reading n from stdin\n");
      }
      param_list_read_stream(pl, stdin, 0);
      have_n = param_list_parse_mpz(pl, "n", poly->n);
  }

The param_list_parse_* functions should be easily understood.

Finally, it is possible to make sure that all command-line parameters
were used, and warn otherwise:

  if (param_list_warn_unused(pl)) {
      usage();
  }

And clearing the parameter list can be done as soon as it's no longer
needed:

  param_list_clear(pl);

Before that, notice that cado_poly_read has been superseded by
cado_poly_set_plist, to be used like this:

    cado_poly_init(cpoly);
    if (!cado_poly_set_plist (cpoly, pl)) {
        exit (EXIT_FAILURE);
    }


###################################################################

There is some more documentation text in params.h