1namespace Dakota {
2
3/** \page SpecChange Instructions for Modifying Dakota's Input Specification
4
5\htmlonly
6<b>Specification Change Table of Contents</b>
7<ul>
8
9<li> <a href="SpecChange.html#ModXMLSpec"> XML input specification</a>
10  <ul>
11  <li> <a href="SpecChange.html#ModXMLSpecReq"> XML Build Requirements</a>
12  <li> <a href="SpecChange.html#ModXMLSpecTools"> XML Editing Tools</a>
13  <li> <a href="SpecChange.html#ModXMLSpecFeatures"> XML Features (with map to NIDR)</a>
14  </ul>
15
16<li> <a href="SpecChange.html#RebuildGenFiles"> Rebuild generated files</a>
17
18<li> <a href="SpecChange.html#UpdateNIDRPDDB"> Update parser source NIDRProblemDescDB.cpp</a>
19
20<li> <a href="SpecChange.html#UpdateData"> Update Corresponding Data Classes</a>
21  <ul>
22  <li> <a href="SpecChange.html#UpdateDatap1"> Update the Data class header file</a>
23  <li> <a href="SpecChange.html#UpdateDatap2"> Update the .cpp file</a>
24  </ul>
25<li> <a href="SpecChange.html#UpdatePDDB"> Update database source ProblemDescDB.cpp</a>
26  <ul>
  <li> <a href="SpecChange.html#UpdatePDDBp1"> Augment/update get_&lt;data_type&gt;() functions</a>
28  </ul>
29
<li> <a href="SpecChange.html#UseFns"> Use get_&lt;data_type&gt;() Functions</a>
31
32<li> <a href="SpecChange.html#UpdateDocs"> Update the Documentation</a>
33
34</ul>
35\endhtmlonly
36
To modify %Dakota's input specification (for maintenance or addition of
new input syntax), specification maintenance mode must be enabled at
%Dakota configure time with the \c -DENABLE_SPEC_MAINT option, e.g.,
\code
  cmake -DENABLE_SPEC_MAINT:BOOL=ON ..
\endcode
This enables regeneration of the NIDR and %Dakota components that must
be updated following a spec change.
45
46
47\section ModXMLSpec XML Input Specification
48
49The authoritative source for valid %Dakota input grammar is \c
50dakota/src/dakota.xml.  The schema defining valid content for this XML
51file is in \c dakota/src/dakota.xsd.  NIDR remains %Dakota's user
52input file parser, so \c dakota.xml is translated to \c
53dakota/src/dakota.input.nspec during the %Dakota build process.  To
54update the XML input definition: <ul>
55
56<li> Make sure ENABLE_SPEC_MAINT is enabled in your build and necessary
57     Java development tools are installed (see below).</li>
58
59<li> Edit the XML spec in \c dakota.xml.</li>
60
<li> Perform a make in \c dakota.build/src, which will regenerate \c
     dakota.source/src/dakota.input.nspec and related files.</li>

<li> Verify that any changes induced in the \c dakota.input.nspec file
     are as expected.</li>
66
67<li> Proceed with verifying code changes and making downstream parse
68     handler changes as normal (described below).</li>
69
70<li> Commit the modified \c dakota.xml, \c dakota.input.nspec, and
71     other files generated to \c dakota.source/src along with your
72     other code changes.</li>
73
74</ul>
75
76
77\subsection ModXMLSpecReq XML Build Requirements
78
Editing the XML and then compiling %Dakota requires:

 - Java Development Kit (JDK) providing the Java compiler \c javac.  Java
   6 (version 1.6) or newer should work, with Java 8 recommended.  On
   RHEL6 this can be satisfied with the RPM packages \c
   java-1.8.0-openjdk-devel and \c java-1.8.0-openjdk.  The JDK is
   needed to build the Java-based XML to NIDR translator.  If this
   becomes too burdensome, we can check in the generated \c
   xml2nidr.jar file.
87
88\subsection ModXMLSpecTools XML Editing Tools
89
The following tools will make editing \c dakota.xml easier.
91
92<ul>
93
94<li> <b>Recommended: Eclipse Web Tools Platform.</b> Includes both
95     graphical and text editors.
96
97  <ol>
98  <li> Download Eclipse Standard (Classic)</li>
  <li> Configure proxy if needed, setting to manual:
       Window > Preferences > General > Network Connection > Proxy</li>
101  <li> Install Web Tools Platform
102       - Help > Install New Software
103       - Work With: Kepler - http://download.eclipse.org/releases/kepler
104       - Search "Eclipse X" and install two packages under Web, XML, Java
105         - Eclipse XML Editors and Tools
106         - Eclipse XSL Developer Tools
107       - Optionally install C/C++ Development Tools
108  </li>
109
  <li> Optional: add Subclipse for Subversion (Subversive is the other
       major competing tool and may not require JavaHL)
       - Help > Install New Software
       - Work With: http://subclipse.tigris.org/update_1.6.x
       - Install Subclipse
       - On Linux: yum install subversion-javahl.x86_64
  </li>
117
  <li> Alternatively, install Eclipse for Java or Eclipse Java EE
       development, which includes the Web Tools Platform, then
       optionally add Subclipse and C/C++ Development Tools</li>
121
122  </ol>
123</li>
124
125<li> <b>Alternate: Emacs or your usual editor.</b> For example, Emacs
126     supports an Nxml mode.  You can tell it where to find the schema,
127     edit XML, and have it perform validation against the schema.  See
128     help at
129     http://www.gnu.org/software/emacs/manual/html_mono/nxml-mode.html
130     </li>
131
132<li> <b>Other Suggested Alternates:</b> XMLSpy, DreamWeaver, XML Copy
133     Editor</li>
134
135</ul>
136
137
138\subsection ModXMLSpecFeatures XML Features (with map to NIDR)
139
Out of necessity, %Dakota XML \c dakota.xml closely mirrors
\c dakota.input.nspec.  Valid %Dakota input grammar is constrained by
\c dakota.xml, an XML document which must validate against \c
dakota.xsd.  The top-level element of interest is \c <input>, which
consists of a sequence of content elements (keywords, alternates,
etc.), which may themselves contain additional child content elements.
The key content types are:
147
148<ul>
149
<li> <b>Keyword (\c <keyword>):</b> specified with the \c <keyword>
     element whose definition is given by keywordType in \c
     dakota.xsd.  The required attributes are:
153     <ul>
154
     <li><b>name:</b> The keyword name (lower case with underscores)
         as it will be given in user input; must follow the same
         uniqueness rules as historical NIDR.  User input is allowed
         in mixed case, but the XML must use lower case names.

         Since the NIDR parser allows keyword abbreviation, you \e
         must not add a keyword that could be misinterpreted as
         an abbreviation for a different keyword within the same
         top-level keyword (such as "environment" or "method").  For
         example, adding the keyword "expansion" within the method
         specification would be a mistake if the keyword
         "expansion_factor" were already being used in this block.

         The NIDR input is somewhat order-dependent, allowing the same
         keyword to be reused multiple times in the specification.
         This often happens with aliases, such as \c lower_bounds, \c
         upper_bounds and \c initial_point.  Ambiguities are resolved
         by attaching a keyword to the most recently seen context in
         which it could appear, if such exists, or to the first
         relevant context that subsequently comes along in the input
         file.</li>
176
     <li><b>code:</b> The verbatim NIDR handler to be invoked when
         this keyword is parsed.  In NIDR this was specified with
         {N_macro(...)}.</li>
180
181     </ul>
182
183     Optional/useful parser-related elements/attributes in order of
184     importance are: <ul>
185
     <li><b>param sub-element:</b> Parameters and data types: A keyword
         may have an associated parameter element with a specified
         data type: \c <param \c type="PARAMTYPE" />.  NIDR data types
         remain the same (INTEGER, REAL, STRING, and LISTs thereof),
         but new data types INPUT_FILE and OUTPUT_FILE add convenience
         for the GUI, mapping to STRING for NIDR purposes.  Parameters
         can also include attributes \c constraint, \c in_taglist, or
         \c taglist, which are used to help validate the user-specified
         parameter value.  For example: <code>constraint >= 0 LEN
         normal_uncertain</code></li>
196
197     <li><b>alias sub-element:</b> Historical aliases for this keyword
198         (can appear multiple times).  Alias has a single attribute
199         <b>name</b> which must be lower case with underscores.</li>
200
201     <li><b>id:</b> Unique ID for the keyword, usually name with an
202         integer appended, but not currently used/enforced.</li>
203
204     <li><b>minOccurs:</b> Minimum occurrences of the keyword in
205         current context (set to 1 for required, 0 for optional)</li>
206
207     <li><b>maxOccurs:</b> Maximum occurrences of the keyword in
208         current context (for example environment may appear at most
209         once)</li>
210
211     </ul>
212
213     And optional/useful GUI-related attributes are:
214     <ul>
215
     <li><b>help:</b> (Don't add this attribute to new keywords!) A
         pointer to the corresponding reference manual section
         (deprecated; not needed with the new reference manual format,
         which mirrors the keyword hierarchy).</li>
220
221     <li><b>label:</b> a short, friendly label string for the keyword
222         in the GUI.  Format these like titles, e.g., "Initial Point
223         for Search".</li>
224
     <li><b>group:</b> Category or group for this keyword in the GUI,
         e.g., optimization vs. parameter study, if keywords are to be
         grouped for GUI purposes.</li>
228
229     </ul>
230
<li> <b>Alternation (\c <oneOf>):</b> Alternation of groups of content
     is done with the element \c <oneOf>, which indicates that its
     immediate children are alternates.  In NIDR this was done with
     the pipe symbol: OptionA | OptionB.  \c <oneOf> accepts the label
     attribute, and its use is recommended. </li>
236
237<li> <b>Required Group (\c <required>):</b> A required group can be specified by
238     enclosing the contents in the \c <required> element.  In NIDR
239     this was done by enclosing the content in parentheses: ( required
240     group... ) </li>
241
242<li> <b>Optional Group (\c <optional>):</b> An optional group can be
243     specified by enclosing the contents in the \c <optional> element.
244     In NIDR this was done by enclosing the content in brackets: [
245     optional group... ] </li>
246
247</ul>
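
The abbreviation rule for keyword names can be checked mechanically
before editing \c dakota.xml.  The following is a minimal sketch, not
part of %Dakota or NIDR: the function names are hypothetical, and the
check is deliberately conservative in that it flags a candidate when
either name is a leading substring of the other.

```cpp
#include <cassert>
#include <string>
#include <vector>

// Returns true if one name is a leading substring (prefix) of the other,
// the situation the abbreviation rule warns about.  Hypothetical helper.
bool is_prefix_conflict(const std::string& a, const std::string& b)
{
  const std::string& shorter = (a.size() <= b.size()) ? a : b;
  const std::string& longer  = (a.size() <= b.size()) ? b : a;
  return longer.compare(0, shorter.size(), shorter) == 0;
}

// Returns the existing keywords (assumed to belong to one top-level
// block) that a candidate keyword would collide with.
std::vector<std::string>
find_conflicts(const std::string& candidate,
               const std::vector<std::string>& existing)
{
  assert(!candidate.empty());  // an empty name would match everything
  std::vector<std::string> hits;
  for (const std::string& kw : existing)
    if (is_prefix_conflict(candidate, kw))
      hits.push_back(kw);
  return hits;
}
```

Calling \c find_conflicts("expansion", ...) against a list containing
\c "expansion_factor" reports the collision described above.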
248
249
250\section RebuildGenFiles Rebuild Generated Files
251
When configured with \c -DENABLE_SPEC_MAINT, performing a make in \c
dakota.build/src will regenerate all files which derive from \c
dakota.xml, including dakota.input.nspec, NIDR_keywds.hpp, and
dakota.input.summary.  If you commit changes to a source repository,
be sure to commit any automatically generated files in addition to any
modified in the following steps.  It is not strictly necessary to run
make at this point in the sequence, and doing so may generate errors
if necessary handlers aren't yet available.
260
261\warning Please do not manually modify generated files!
262
263
264\section UpdateNIDRPDDB Update Parser Source NIDRProblemDescDB.cpp
265
266Many keywords have data associated with them: an integer, a
267floating-point number, a string, or arrays of such entities.  Data
268requirements are specified in dakota.input.nspec by the tokens
269INTEGER, REAL, STRING, INTEGERLIST, REALLIST, STRINGLIST.  (Some
270keywords have no associated data and hence no such token.)  After each
271keyword and data token, the dakota.input.nspec file specifies
272functions that the NIDR parser should call to record the appearance of
273the keyword and deal with any associated data.  The general form of
274this specification is
275
276{ startfcn, startdata, stopfcn, stopdata }
277
278i.e., a brace-enclosed list of one to four functions and data
279pointers, with trailing entities taken to be zero if not present; zero
280for a function means no function will be called.  The startfcn must
281deal with any associated data.  Otherwise, the distinction between
282startfcn and stopfcn is relevant only to keywords that begin a group
283of keywords (enclosed in parentheses or square brackets).  The
284startfcn is called before other entities in the group are processed,
285and the stop function is called after they are processed.  Top-level
286keywords often have both startfcn and stopfcn; stopfcn is uncommon but
287possible for lower-level keywords.  The startdata and (if needed)
288stopdata values are usually pointers to little structures that provide
289keyword-specific details to generic functions for startfcn and
290stopfcn.  Some keywords that begin groups (such as "approx_problem"
291within the top-level "environment" keyword) have no need of either a
292startfcn or a stopfcn; this is indicated by "{0}".
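
The calling order described above can be illustrated with a small
sketch.  It assumes a hypothetical \c ToyKeyword type and handler
signature invented for this illustration, which are not the NIDR data
structures.  A null function pointer plays the role of a zero entry,
and the enclosed group is modeled as a list of child keywords.

```cpp
#include <cassert>
#include <string>
#include <vector>

// Hypothetical stand-in for a parsed keyword; not the NIDR KeyWord type.
struct ToyKeyword {
  typedef void (*Handler)(const char* keyname, void* data,
                          std::vector<std::string>& log);
  const char* name;
  Handler startfcn;   // null means no function is called
  void* startdata;
  Handler stopfcn;    // null means no function is called
  void* stopdata;
  std::vector<ToyKeyword> children;  // keywords in the enclosed group
};

void toy_note_start(const char* keyname, void*, std::vector<std::string>& log)
{ log.push_back(std::string("start ") + keyname); }

void toy_note_stop(const char* keyname, void*, std::vector<std::string>& log)
{ log.push_back(std::string("stop ") + keyname); }

// startfcn runs before the group is processed, stopfcn afterwards.
void toy_process(const ToyKeyword& kw, std::vector<std::string>& log)
{
  assert(kw.name != nullptr);
  if (kw.startfcn) kw.startfcn(kw.name, kw.startdata, log);
  for (const ToyKeyword& child : kw.children)
    toy_process(child, log);
  if (kw.stopfcn) kw.stopfcn(kw.name, kw.stopdata, log);
}
```

A keyword with neither function, the analogue of "{0}", contributes
nothing to the log, but its children are still visited.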
293
294Most of the things within braces in dakota.input.nspec are invocations
295of macros defined in \c dakota.source/src/NIDRProblemDescDB.cpp.  The
296macros simplify writing dakota.input.nspec and make it more readable.
297Most macro invocations refer to little structures defined in
298NIDRProblemDescDB.cpp, usually with the help of other macros, some of
299which have different definitions in different parts of
300NIDRProblemDescDB.cpp.  When adding a keyword to dakota.input.nspec,
301you may need to add a structure definition or even introduce a new
302data type.  NIDRProblemDescDB.cpp has sections corresponding to each
303top-level keyword.  The top-level keywords are in alphabetical order,
304and most entities in the section for a top-level keyword are also in
305alphabetical order.  While not required, it is probably good practice
306to maintain this structure, as it makes things easier to find.
307
308
309Any integer, real, or string data associated with a keyword are
310provided to the keyword's startfcn, whose second argument is a pointer
311to a \c Values structure, defined in header file \c nidr.h.
312
313<b>Example 1:</b> if you added the specification:
314\verbatim
315    [method_setting REAL {method_setting_start, &method_setting_details} ]
316\endverbatim
317you would provide a function
\code
    void NIDRProblemDescDB::
    method_setting_start(const char *keyname, Values *val, void **g, void *v)
    { ... }
\endcode
323in NIDRProblemDescDB.cpp.  In this example, argument \c &method_setting_details
324would be passed as \c v, \c val->n (the number of values) would be 1 and \c *val->r
325would be the REAL value given for the \c method_setting keyword.  The
326\c method_setting_start function would suitably store this value with the
327help of \c method_setting_details.
328
329For some top-level keywords, \c g
330(the third argument to the startfcn and stopfcn) provides access to a relevant context.
331For example, \c method_start (the startfcn for the top-level \c method keyword)
332executes
333\code
334    DataMethod *dm = new DataMethod;
335    *g = (void*)dm;
336\endcode
337(and supplies a couple of default values to \c dm).  The start functions for
338lower-level keywords within the \c method keyword get access to \c dm
339through their \c g arguments.  Here is an example:
\code
    void NIDRProblemDescDB::
    method_str(const char *keyname, Values *val, void **g, void *v)
    {
      (*(DataMethod**)g)->**(String DataMethod::**)v = *val->s;
    }
\endcode
347In this example, \c v points to a pointer-to-member, and an assignment is made
348to one of the components of the DataMethod object pointed to by \c *g.
349The corresponding stopfcn for the top-level \c method keyword is
\code
    void NIDRProblemDescDB::
    method_stop(const char *keyname, Values *val, void **g, void *v)
    {
      DataMethod *p = *(DataMethod**)g;
      pDDBInstance->dataMethodList.insert(*p);
      delete p;
    }
\endcode
359which copies the now populated DataMethod object to the right place
360and cleans up.
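
The start/store/stop pattern above can be tried in isolation with the
following sketch.  All types and names in it (\c ToyValues,
\c ToyMethodData, \c toy_method_list, and the \c toy_ functions) are
hypothetical stand-ins, not the definitions from \c nidr.h or %Dakota,
and a \c std::vector is assumed in place of the real method list.

```cpp
#include <cassert>
#include <string>
#include <vector>

struct ToyValues {        // hypothetical stand-in for the nidr.h Values
  int n;                  // number of values
  const double* r;        // REAL values, if any
  const char** s;         // STRING values, if any
};

struct ToyMethodData {    // hypothetical stand-in for DataMethod
  std::string methodName;
  std::string outputLevel;
};

typedef std::string ToyMethodData::* StringField;  // pointer-to-member

static std::vector<ToyMethodData> toy_method_list; // plays dataMethodList

// startfcn for the top-level keyword: publish a fresh context through g.
void toy_method_start(const char*, ToyValues*, void** g, void*)
{ *g = new ToyMethodData; }

// Generic string handler: v points to a pointer-to-member that selects
// which field of the context receives the value.
void toy_method_str(const char*, ToyValues* val, void** g, void* v)
{
  assert(val && val->n == 1);
  ToyMethodData* dm = static_cast<ToyMethodData*>(*g);
  StringField field = *static_cast<StringField*>(v);
  dm->*field = val->s[0];
}

// stopfcn for the top-level keyword: copy the populated object, clean up.
void toy_method_stop(const char*, ToyValues*, void** g, void*)
{
  ToyMethodData* p = static_cast<ToyMethodData*>(*g);
  toy_method_list.push_back(*p);
  delete p;
  *g = nullptr;
}
```

One generic handler such as \c toy_method_str can serve many keywords,
because the per-keyword detail (which member to set) travels in \c v.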
361
362
363<b>Example 2:</b> if you added the specification
364\verbatim
    [method_setting REALLIST {N_mdm(RealL,methodCoeffs)} ]
366\endverbatim
367then method_RealL (defined in NIDRProblemDescDB.cpp) would be called
368as the startfcn, and methodCoeffs would be the name of a
369(currently nonexistent) component of DataMethod.  The N_mdm macro
370is defined in NIDRProblemDescDB.cpp; among other things, it turns
371\c RealL into \c NIDRProblemDescDB::method_RealL.  This function is
372used to process lists of REAL values for several keywords.  By looking
373at the source, you can see that the list values are val->r[i]
374for 0 <= \c i < val->n.
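
As a rough illustration of how a list handler might consume
\c val->r[i], the sketch below copies the values into a vector member
selected through \c v.  It is not the actual \c method_RealL
implementation; \c ToyListValues, \c ToyListData, and \c toy_real_list
are hypothetical, and \c std::vector<double> is assumed in place of
%Dakota's real-vector type.

```cpp
#include <cassert>
#include <vector>

struct ToyListValues {   // hypothetical stand-in for the nidr.h Values
  int n;                 // number of list entries
  const double* r;       // the REAL values r[0] .. r[n-1]
};

struct ToyListData {     // hypothetical stand-in for DataMethod
  std::vector<double> methodCoeffs;
};

typedef std::vector<double> ToyListData::* RealListField;

// Copies val->r[i], 0 <= i < val->n, into the member selected by v.
void toy_real_list(const char*, ToyListValues* val, void** g, void* v)
{
  assert(val && val->n >= 0);
  ToyListData* dm = static_cast<ToyListData*>(*g);
  RealListField field = *static_cast<RealListField*>(v);
  (dm->*field).assign(val->r, val->r + val->n);
}
```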
375
376
377\section UpdateData Update Corresponding Data Classes
378
379The Data classes (\ref DataEnvironment "DataEnvironment", \ref
380DataMethod "DataMethod", \ref DataModel "DataModel", \ref
381DataVariables "DataVariables", \ref DataInterface "DataInterface", and
382\ref DataResponses "DataResponses") store the parsed user input data.
In this step, we extend the Data class definitions to
include any new attributes referred to in \c dakota.xml or \c
NIDRProblemDescDB.cpp.
385
386\subsection UpdateDatap1 Update the Data Class Header File
387
388Add a new attribute to the public data for each of the new
389specifications.  Follow the style guide for class attribute naming
390conventions (or mimic the existing code).
391
392\subsection UpdateDatap2 Update the .cpp File
393
Define defaults for the new attributes in the constructor
initialization list (if not a container with a sensible default
constructor) in the same order as they appear in the header.  Add the
new attributes to the write(MPIPackBuffer&), read(MPIUnpackBuffer&),
and write(ostream&) functions, paying careful attention to the use of a
consistent ordering.
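
The need for consistent ordering can be seen in a miniature example.
The sketch below uses a hypothetical \c ToyDataMethod class, with
standard streams standing in for the MPI pack and unpack buffers (an
assumption made so that the example is self-contained).  The new
attribute \c methodSetting is appended in the same position in the
initialization list, \c write, and \c read.

```cpp
#include <cassert>
#include <istream>
#include <ostream>
#include <sstream>
#include <string>

// Hypothetical miniature of a Data class; not a Dakota class.
class ToyDataMethod {
public:
  ToyDataMethod():   // defaults listed in header (declaration) order
    methodName("toy"), maxIterations(100), methodSetting(0.)
  { }

  // Pack the attributes; the order here defines the wire format.
  void write(std::ostream& s) const
  { s << methodName << ' ' << maxIterations << ' ' << methodSetting; }

  // Unpack in exactly the order used by write().
  void read(std::istream& s)
  {
    s >> methodName >> maxIterations >> methodSetting;
    assert(!s.fail());  // a mismatch in order or count shows up here
  }

  std::string methodName;
  int maxIterations;
  double methodSetting;  // the newly added attribute
};
```

If \c read extracted \c methodSetting before \c maxIterations, the
values would silently land in the wrong members, which is why the two
functions are best edited together.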
400
401
402\section UpdatePDDB Update Database Source ProblemDescDB.cpp
403
404\subsection UpdatePDDBp1 Augment/update get_<data_type>() Functions
405
406The next update step involves extending the database retrieval
407functions in \c dakota.source/src/ProblemDescDB.cpp.  These retrieval
408functions accept an identifier string and return a database attribute
409of a particular type, e.g., a RealVector:
410
411\code
412    const RealVector& get_rv(const String& entry_name);
413\endcode
414
415The implementation of each of these functions contains tables of
416possible entry_name values and associated pointer-to-member values.
417There is one table for each relevant top-level keyword, with the
418top-level keyword omitted from the names in the table.  Since binary
419search is used to look for names in these tables, each table must be
420kept in alphabetical order of its entry names.  For example,
421
\code
  ...
  else if ((L = Begins(entry_name, "model."))) {
    if (dbRep->methodDBLocked)
      Locked_db();

    #define P &DataModelRep::
    static KW<RealVector, DataModelRep> RVdmo[] = { // must be sorted
      {"nested.primary_response_mapping", P primaryRespCoeffs},
      {"nested.secondary_response_mapping", P secondaryRespCoeffs},
      {"surrogate.kriging_conmin_seed", P krigingConminSeed},
      {"surrogate.kriging_correlations", P krigingCorrelations},
      {"surrogate.kriging_max_correlations", P krigingMaxCorrelations},
      {"surrogate.kriging_min_correlations", P krigingMinCorrelations}};
    #undef P

    KW<RealVector, DataModelRep> *kw;
    if ((kw = (KW<RealVector, DataModelRep>*)Binsearch(RVdmo, L)))
      return dbRep->dataModelIter->dataModelRep->*kw->p;
  }
\endcode
443
444is the "model" portion of \ref ProblemDescDB::get_rv
445"ProblemDescDB::get_rv()".  Based on entry_name, it returns the
446relevant attribute from a \ref DataModel "DataModel" object.  Since
447there may be multiple model specifications, the \c dataModelIter list
448iterator identifies which node in the list of \ref DataModel
"DataModel" objects is used.  In particular, \c dataModelList contains
one \ref DataModel "DataModel" object for each top-level \c model
keyword seen by the parser.  The particular model object used for data
retrieval is selected by \c dataModelIter, which is set in a \c
set_db_list_nodes() operation that is not described here.
455
456There may be multiple \ref DataMethod "DataMethod",
457\ref DataModel "DataModel", \ref DataVariables "DataVariables",
458\ref DataInterface "DataInterface", and/or
\ref DataResponses "DataResponses" objects.  However, only one
environment specification is currently allowed, so a list of
\ref DataEnvironment "DataEnvironment" objects is not needed.  Rather,
462\ref ProblemDescDB::environmentSpec "ProblemDescDB::environmentSpec"
463is the lone \ref DataEnvironment "DataEnvironment" object.
464
465To augment the get_<data_type>() functions, add table entries with new
466identifier strings and pointer-to-member values that address the
467appropriate data attributes from the Data class object.  The style for
468the identifier strings is a top-down hierarchical description, with
469specification levels separated by periods and words separated with
470underscores, e.g., \c
471"keyword.group_specification.individual_specification".  Use the \c
472dbRep->listIter->attribute syntax for variables, interface, responses,
and method specifications.  For example, the \c method_setting example
attribute would be added to \c get_rv() as:
475
476\code
477  {"method_name.method_setting", P methodSetting},
478\endcode
479
480inserted at the beginning of the \c RVdmo array shown above (since the
481name in the existing first entry, i.e.,
482"nested.primary_response_mapping", comes alphabetically after
483"method_name.method_setting").
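
Why the tables must stay sorted is easy to demonstrate.  The sketch
below is an illustration only: \c ToyModelRep, \c ToyKW, the entry
names, and the two functions are hypothetical, and
\c std::lower_bound is used in place of %Dakota's \c Binsearch.  Names
are compared with \c std::strcmp, so "alphabetical" here means
byte-wise order.

```cpp
#include <algorithm>
#include <cassert>
#include <cstring>
#include <iterator>

struct ToyModelRep {       // hypothetical stand-in for DataModelRep
  double nestedScale;
  double krigingSeed;
  double trustSize;
};

struct ToyKW {             // entry name paired with a pointer-to-member
  const char* name;
  double ToyModelRep::* p;
};

// Must be kept sorted by name for the binary search to find entries.
static const ToyKW toy_table[] = {
  {"nested.scale",           &ToyModelRep::nestedScale},
  {"surrogate.kriging_seed", &ToyModelRep::krigingSeed},
  {"surrogate.trust_size",   &ToyModelRep::trustSize}
};

// Binary search by name; returns null when the entry is not present.
const ToyKW* toy_lookup(const char* entry)
{
  assert(entry != nullptr);
  const ToyKW* last = std::end(toy_table);
  const ToyKW* it = std::lower_bound(std::begin(toy_table), last, entry,
    [](const ToyKW& kw, const char* key)
    { return std::strcmp(kw.name, key) < 0; });
  return (it != last && std::strcmp(it->name, entry) == 0) ? it : nullptr;
}

// Sanity check that a newly added entry did not break the ordering.
bool toy_table_sorted()
{
  return std::is_sorted(std::begin(toy_table), std::end(toy_table),
    [](const ToyKW& a, const ToyKW& b)
    { return std::strcmp(a.name, b.name) < 0; });
}
```

A check like \c toy_table_sorted() in a debug build or unit test is one
possible way to catch an entry inserted out of order, since a failed
binary search otherwise looks the same as a missing keyword.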
484
485
486\section UseFns Use get_<data_type>() Functions
487
488At this point, the new specifications have been mapped through all of
489the database classes.  The only remaining step is to retrieve the new
490data within the constructors of the classes that need it.  This is
491done by invoking the get_<data_type>() function on the
492\ref ProblemDescDB "ProblemDescDB"
493object using the identifier string you selected in \ref UpdatePDDBp1.
494For example:
495\code
496  const String& interface_type = problem_db.get_string("interface.type");
497\endcode
498passes the \c "interface.type" identifier string to the
499\ref ProblemDescDB::get_string "ProblemDescDB::get_string()"
500retrieval function, which returns the desired attribute from the active
501\ref DataInterface "DataInterface" object.
502
503\warning Use of the get_<data_type>() functions is restricted to class
504constructors, since only in class constructors are the data list
505iterators (i.e., \c dataMethodIter, \c dataModelIter, \c
506dataVariablesIter, \c dataInterfaceIter, and \c dataResponsesIter)
507guaranteed to be set correctly.  Outside of the constructors, the
508database list nodes will correspond to the last set operation, and may
509not return data from the desired list node.
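
The reasoning behind this restriction can be sketched with a toy
database whose single cursor plays the role of the list iterators.
\c ToyDB and \c ToyMethod are hypothetical classes invented for this
illustration; they only mimic the idea that a lookup follows whichever
node was set last.

```cpp
#include <cassert>
#include <cstddef>
#include <string>
#include <vector>

// Hypothetical miniature database: one entry per parsed method block
// and a cursor playing the role of dataMethodIter.
class ToyDB {
public:
  explicit ToyDB(const std::vector<std::string>& names):
    methodNames(names), active(0)
  { assert(!names.empty()); }

  void set_active(std::size_t i)
  { assert(i < methodNames.size()); active = i; }

  // Returns data for whichever node was set most recently.
  const std::string& get_string() const { return methodNames[active]; }

private:
  std::vector<std::string> methodNames;
  std::size_t active;
};

class ToyMethod {
public:
  // Copy the value during construction, while the cursor is known to
  // point at this object's specification.
  explicit ToyMethod(const ToyDB& db): methodName(db.get_string()) { }
  const std::string& name() const { return methodName; }

private:
  std::string methodName;
};
```

Once the cursor moves on, the cached copy in \c ToyMethod still holds
the intended value, whereas a later call to \c get_string() would
return data for a different node.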
510
511
512\section UpdateDocs Update the Documentation
513
Doxygen comments should be added to the Data class headers for the new
attributes, and the reference manual sections describing the modified
portions of \c dakota.xml should be updated by editing the files in \c
dakota.source/docs/KeywordMetaData/.  \c dakota.xml, together with
these metadata files, generates the reference manual and GUI
context-aware help documentation.
520
521*/
522
523} // namespace Dakota
524