\label{sec:generic}

The high-level flow of control within NWChem was broadly outlined in
the discussion of NWChem architecture (see Section \ref{sec:arch}).  This
chapter covers the details of internal communication between modules
and the control of program execution.  This information is needed if
NWChem is to be embedded in another application or when new modules
are developed for the code.

\section{Flow of Control in NWChem}

The Generic Task Interface controls the execution of NWChem.  The flow
of control proceeds in the following steps:

\begin{enumerate}
\item Begin initialization of the parallel environment.
\item Identify and open the input file.
\item Complete the initialization of the parallel environment.
\item Scan the input file for a memory directive.
\item Process start-up directives.
\item Summarize start-up information and write it to the output file.
\item Open the runtime database with the appropriate mode.
\item Process the input sequentially (ignoring start-up directives),
including the first \verb+task+ directive.
\item Execute the task.
\item Repeat steps 8 and 9 until reaching the end of the input file, or
encountering a fatal error condition.
\end{enumerate}

In Step 1, the parallel environment is initialized by calling the
TCGMSG wrapper routine \verb+pbeginf()+.  This creates the parallel processes
and provides basic message passing.  Before the global arrays can be
initialized, however, user-specified memory parameters must be obtained
from the input file, which is why the input file is opened in Step 2.

The input file is opened {\em only} by process zero.
The name of the input file is determined by the routine
\verb+get_input_file_name()+.  (The convention for the input file name is
documented in the user manual; the default name is \verb+nwchem.nw+.)
Process zero scans the input file for a memory directive using the
routine \verb+input_mem_size()+.  Defaults are supplied for any memory parameters
not specified by the user, and the results are broadcast to all nodes.  At this point the
local memory allocator (MA) and then the global array library (GA) are initialized,
which completes the initialization of the parallel environment.

The next step is to process the startup directives contained in the
input file.  These determine the type of calculation being undertaken
(i.e., startup, restart, or continue), the name of the database, and the
locations of the permanent and scratch directories.  Only process zero scans
the input file, using the routine \verb+input_file_info()+, but the information it
obtains is broadcast to all nodes and summarized in the output file.
Process zero then
opens the database with the appropriate mode ({\em empty} for startup, {\em old}
for restart or continue).

At this point NWChem is fully functional and ready to process user
input beyond the startup information.
If the startup mode is {\em continue}, however, no further input is processed at first.
Instead, the code attempts to continue the
previously executing task from the information in the database, and no new input
is processed until that task is completed.  Once the continued task is finished,
or if the startup mode is startup or restart,
the input file is read sequentially from the beginning (ignoring
startup directives, since they have already been processed).

As long as input is
available from the input file, the input module (routine \verb+input_parse()+) is
invoked to read it,
up to and including a \verb+task+ directive.  Each input line is processed, and the data is
inserted into the database for
later retrieval.  Within the input module only process zero executes
code, whether reading input or putting data into the database.  To enable this, the database
is switched into sequential mode at the beginning of the input module and back to
parallel mode at the end.

Once a task directive has been processed and entered into the database,
control returns to the main program so that the task can be carried out.
The main program initiates execution of the task by
calling the routine \verb+task()+.  If the task fails, a fatal error is generated
either by
\verb+task()+ itself or by a lower-level routine.  The task information remains in the database
so that the task may be continued in another job.  If the task finishes successfully, \verb+task()+
removes the information about the completed task from the database, and the main program
invokes the input module once again.

The input module continues to process the input sequentially.  If it encounters another
task directive, control returns to the main program and the task
is executed as described above.  Upon successful completion, the main program again
returns control to the input module and input processing continues.  If the input
module runs out of input (physical or logical end of file) before encountering
a task directive, it returns false and the loop in the main
program terminates.

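The loop described above can be pictured with the following sketch.  It is
not the actual main program: the routine name \verb+sketch_main_loop+ is
hypothetical, and it is assumed here, for illustration only, that
\verb+input_parse()+ is a logical function whose sole argument is the
database handle.

\begin{verbatim}
c     Sketch only: the shape of the top-level loop, not the actual
c     main program.  The argument list of input_parse is assumed.
      subroutine sketch_main_loop(rtdb)
      implicit none
      integer rtdb              ! [input] data base handle
      logical input_parse
      external input_parse
c
c     input_parse returns .true. once a task directive has been
c     read and stored, and .false. at the end of the input
c
 10   if (input_parse(rtdb)) then
         call task(rtdb)
         goto 10
      endif
c
      end
\end{verbatim}
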
Once all input has been processed and there are no more tasks to execute,
the code cleans up by closing the database,
shutting down GA, and finally terminating the parallel processes gracefully.  Statistics
concerning the database, MA, and GA are printed to the output file, and execution
terminates.

When a new module is introduced into NWChem, it must conform to this
orderly control process.  The new module must be invoked appropriately
by the task routines.  In addition, if it requires new input, its
input routine must be
invoked by \verb+input_parse()+ (see Section \ref{sec:parser} for details
on the input parser).  The input routine must also be structured so that
only
process zero executes the code that reads user input.

Any new module developed for NWChem must also conform to the design
goal that restart/continuation jobs with no repeated input behave exactly as
if all input and tasks had been specified in a single file.  This goal implies that
{\em all} input data must be processed and entered into the database or another persistent
file.  In particular, in-core data structures should not be initialized
within the input module.  (Doing so leaves the information on
process zero only, and
restarts will not work correctly.)  In addition,
input routines must not require basis set or geometry information to be
available, since these are not known until a task is actually invoked.

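As an illustration of these rules, the following sketch shows what the input
routine of a hypothetical module \verb+mymod+ might look like.  It assumes the
input-parsing functions \verb+inp_read+, \verb+inp_a+, \verb+inp_i+ and
\verb+inp_compare+, the database function \verb+rtdb_put+, and the
two-argument form of \verb+errquit+; the argument lists should be checked
against the version of the code being modified.  The module name, its
\verb+maxiter+ keyword and the database entry \verb+mymod:maxiter+ are
invented for the example, and the fragment compiles only inside the NWChem
source tree.

\begin{verbatim}
      subroutine mymod_input(rtdb)
c
c     Hypothetical input routine for a module "mymod" that accepts
c
c        mymod
c          maxiter <integer>
c        end
c
c     Runs on process zero only (called from the input module).
c     Results go to the database, not to common blocks.
c
      implicit none
#include "mafdecls.fh"
#include "rtdb.fh"
#include "inp.fh"
      integer rtdb              ! [input] data base handle
      character*32 field
      integer maxiter
c
 10   if (.not. inp_read())
     $     call errquit('mymod_input: unexpected end of input', 0)
      if (.not. inp_a(field))
     $     call errquit('mymod_input: failed to read keyword', 0)
c
      if (inp_compare(.false., 'end', field)) then
         return
      else if (inp_compare(.false., 'maxiter', field)) then
         if (.not. inp_i(maxiter))
     $        call errquit('mymod_input: maxiter needs an integer', 0)
         if (.not. rtdb_put(rtdb, 'mymod:maxiter', mt_int, 1,
     $        maxiter))
     $        call errquit('mymod_input: rtdb_put failed', 0)
      else
         call errquit('mymod_input: unknown keyword', 0)
      endif
      goto 10
c
      end
\end{verbatim}

Because the routine only writes to the database, the value is available to
every process once the task runs, and a restart job finds it without the
input being repeated.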
\section{Task Execution in NWChem}
As described above, NWChem executes all tasks by invoking the routine
\verb+task()+.  The main program does not actually know
what a particular task is; the necessary
information is passed from the input module to the task library via the
database.  This makes the top-level structure of NWChem very simple.  The same simplicity
is desirable in many applications.  For instance, molecular geometry optimizers
(or QM/MM programs) should work for all levels of theory and should not have to be
modified when a new theory is introduced into the code.  Similarly, routines that
compute gradients and Hessians by finite difference need to be able
to save and restore the state associated with each type of wavefunction.

NWChem contains a layer of routines that can perform
the most common tasks/computations for all available wavefunctions.
The following subsection lists the routines in this layer, with their arguments and
calling conventions.

\subsection{Task Routines for NWChem Operations}

The highest-level
task routine is the subroutine \verb+task()+, which
is invoked only by the main NWChem program.
The other task routines can be invoked from almost any module.
(Nested calls to the same routine should be avoided, however,
since most NWChem routines are
not reentrant.)  The conventions by which NWChem modules pass arguments
through the database
were developed in their
present form mainly
to support this layer.

\subsubsection{{\tt task}}

\begin{verbatim}
      subroutine task(rtdb)
      integer rtdb              ! [input] data base handle
\end{verbatim}

Called by ALL processes.  After \verb+task_input+ has read the
task directive and stored its contents in the database, this routine retrieves the
data and invokes the desired action.

If the operation is in the list of those supported by generic
routines,
the generic routine is called.  Otherwise, a match is attempted
against the specialized routines.  If no operation is specified
and no specialized routine is found, it is assumed that
a generic energy calculation is required.

This routine needs extending to accommodate QM/MM and other mixed methods
by allowing both MM and QM pieces to be specified (e.g., \verb+task md dft+).

\subsubsection{{\tt task\_energy}}

\begin{verbatim}
      logical function task_energy(rtdb)
      integer rtdb
c
c     RTDB input parameters
c     ---------------------
c     task:theory (string) - name of (QM) level of theory to use
c
c     RTDB output parameters
c     ----------------------
c     task:status (logical)- T/F for success/failure
c     if (status) then
c     .  task:energy (real)   - total energy
c     .  task:dipole(real(3)) - total dipole moment if available
c     .  task:cputime (real)  - cpu time to execute the task
c     .  task:walltime (real) - wall time to execute the task
c
c     Also returns status through the function value
c
\end{verbatim}

Generic NWChem interface to compute the energy.  Currently
only the QM components are supported.

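A calling sequence for this interface might look like the sketch below, which
sets the input parameter, runs the calculation, and reads the result back
from the database.  The helper name \verb+sketch_scf_energy+ is hypothetical,
and the database accessors \verb+rtdb_cput+ and \verb+rtdb_get+ are assumed
to have the argument lists shown; the fragment compiles only inside the
NWChem source tree and, like \verb+task_energy()+ itself, is meant to be
called by all processes.

\begin{verbatim}
      logical function sketch_scf_energy(rtdb, energy)
c
c     Hypothetical helper: request an SCF energy through the
c     generic interface and fetch the result from the database.
c
      implicit none
#include "mafdecls.fh"
#include "rtdb.fh"
      integer rtdb              ! [input] data base handle
      double precision energy   ! [output] total energy
      logical task_energy
      external task_energy
c
      sketch_scf_energy = .false.
c
c     input parameter: the level of theory
      if (.not. rtdb_cput(rtdb, 'task:theory', 1, 'scf')) return
c
c     run the calculation; the function value mirrors task:status
      if (.not. task_energy(rtdb)) return
c
c     output parameter: the total energy
      if (.not. rtdb_get(rtdb, 'task:energy', mt_dbl, 1, energy))
     $     return
c
      sketch_scf_energy = .true.
      end
\end{verbatim}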
\subsubsection{{\tt task\_gradient}}

\begin{verbatim}
      logical function task_gradient(rtdb)

c     RTDB input parameters
c     ---------------------
c     task:theory (string) - name of (QM) level of theory to use
c     task:numerical (logical) - optional - if true use numerical
c         differentiation. If absent or false use default selection.
c
c     RTDB output parameters
c     ----------------------
c     task:status (logical)- T/F for success/failure
c     if (status) then
c     .  task:energy (real)   - total energy
c     .  task:gradient (real array) - derivative w.r.t. geometry cart. coords.
c     .  task:dipole (real(3)) - total dipole if available
c     .  task:cputime (real)  - cpu time to execute the task
c     .  task:walltime (real) - wall time to execute the task
c
c     Also returns status through the function value
c
\end{verbatim}

Generic NWChem interface to compute the energy and gradient.
Currently only the QM components are supported.

Since this routine is invoked directly by application modules,
it processes no input.
If the method does not have analytic derivatives,
the numerical derivative routine is called automatically.

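The optional \verb+task:numerical+ parameter and the retrieval of the
gradient array could be exercised as in the following sketch.  The helper
name \verb+sketch_num_gradient+ is hypothetical, the number of atoms is
assumed to be known to the caller, \verb+task:theory+ is assumed to have been
set already, and \verb+rtdb_put+ and \verb+rtdb_get+ are assumed to have the
argument lists shown.  As with the other task routines, the fragment compiles
only inside the NWChem source tree.

\begin{verbatim}
      logical function sketch_num_gradient(rtdb, nat, grad)
c
c     Hypothetical helper: force numerical differentiation and
c     fetch the cartesian gradient from the database.
c
      implicit none
#include "mafdecls.fh"
#include "rtdb.fh"
      integer rtdb              ! [input] data base handle
      integer nat               ! [input] number of atoms
      double precision grad(3,nat) ! [output] cartesian gradient
      logical task_gradient
      external task_gradient
c
      sketch_num_gradient = .false.
c
c     optional input parameter: force numerical differentiation
      if (.not. rtdb_put(rtdb, 'task:numerical', mt_log, 1, .true.))
     $     return
c
      if (.not. task_gradient(rtdb)) return
c
c     output parameter: 3*nat gradient components
      if (.not. rtdb_get(rtdb, 'task:gradient', mt_dbl, 3*nat, grad))
     $     return
c
      sketch_num_gradient = .true.
      end
\end{verbatim}

The sketch leaves \verb+task:numerical+ set in the database, so later
gradient calls would also be numerical unless the caller resets the entry.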
\subsubsection{{\tt task\_freq}}

\begin{verbatim}
      logical function task_freq(rtdb)

c     RTDB input parameters
c     ---------------------
c     task:theory
c
c     RTDB output parameters
c     ----------------------
c     task:hessian file name (string) - name of file containing hessian
c     task:status (logical)      - T/F on success/failure
c     task:cputime
c     task:walltime
\end{verbatim}

Central-difference calculation of the Hessian using
the generic energy/gradient interface.  The finite differences are currently
performed by a routine inside stepper; this needs to be
cleaned up so that the calculation is independent of stepper.

The routine will also be hooked up to analytic methods as they become available.

Since this routine is invoked directly by application modules,
it processes no input.

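As an illustration of the central-difference scheme, and assuming that
gradients (rather than energies) are differenced, each element of the
Hessian could be estimated as
\[
  H_{ij} \approx
  \frac{g_i(\mathbf{x} + h\,\mathbf{e}_j) - g_i(\mathbf{x} - h\,\mathbf{e}_j)}{2h},
\]
where $\mathbf{x}$ holds the Cartesian coordinates, $g_i$ is the $i$-th
gradient component returned by the generic gradient interface,
$\mathbf{e}_j$ is the unit vector along coordinate $j$, and $h$ is the step
length.  This notation is introduced here for illustration only; the step
length and any use of symmetry are determined by the actual routine.  The
truncation error of this formula is of order $h^2$, and without symmetry it
requires two gradient evaluations per Cartesian coordinate.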
\subsubsection{{\tt task\_hessian}}

\begin{verbatim}
      logical function task_hessian(rtdb)

c     RTDB input parameters
c     ---------------------
c     task:theory (string) - name of (QM) level of theory to use
c     task:numerical (logical) - optional - if true use numerical
c         differentiation.
c     task:analytic  (logical) - force analytic hessian
c
c     RTDB output parameters (none for analytic hessian at the moment)
c     ----------------------
c     task:hessian file name - that has a lower triangular
c                              (double precision) array
c                              derivative w.r.t. geometry cart. coords.
c     task:status (logical)  - T/F for success/failure
c     task:cputime (real)    - cpu time to execute the task
c     task:walltime (real)   - wall time to execute the task
c
c     Also returns status through the function value
\end{verbatim}

Generic NWChem interface to compute the Hessian.  Analytic derivatives are
used where the method provides them.

If the method does not have analytic derivatives, the numerical derivative
routine is called automatically.

\subsubsection{{\tt task\_optimize}}

\begin{verbatim}
      logical function task_optimize(rtdb)

c     RTDB input parameters
c     ---------------------
c     task:theory (string) - must be set for task_gradient to work
c
c     RTDB output parameters
c     ----------------------
c     task:energy (real)   - final energy from optimization
c     task:gradient (real) - final gradient from optimization
c     task:status (logical) - T/F on success/failure
c     task:cputime
c     task:walltime
c     geometry             - final geometry from optimization
c
\end{verbatim}

Optimize a geometry using stepper and the generic
task energy/gradient interface.  Eventually another
layer will be needed below this routine to handle the selection of other optimizers.

Since this routine can be invoked directly by application modules,
it processes no input.

\subsubsection{{\tt task\_num\_grad}}

\begin{verbatim}
      logical function task_num_grad(rtdb)
      integer rtdb              ! [input]
\end{verbatim}

Returns the energy and gradient at the current geometry.

Computes derivatives of \verb+task_energy()+ with respect to nuclear displacements
by numerical finite differences.

Uses symmetry and projects out rotations and translations.

\subsubsection{{\tt task\_save\_state} and {\tt task\_restore\_state}}

\begin{verbatim}
      logical function task_save_state(rtdb,suffix)

      integer rtdb              ! [input]
      character*(*) suffix      ! [input]
c
c     Input argument ... the suffix
c
c     RTDB arguments ... the theory name
c
c     Output ... function value T/F on success/failure
c
\end{verbatim}

Each module saves any files and database entries necessary
to restart the calculation from its current point, appending the
given suffix to their names.

The main (and perhaps only) application of this routine is the
computation of derivatives by finite difference.  The energy/gradient
is computed at a reference geometry (or zero field), and
the wavefunction is then saved by calling this routine.  Subsequent
calculations at displaced geometries (or non-zero fields) call
\verb+task_restore_state()+ in order to use the wavefunction at the
reference geometry as the starting guess for the calculation
at the displaced geometry.  Thus, there is no need to save basis
or geometry (or field) information.  For example, in the SCF module only the
MO vector file is saved.

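The finite-difference pattern described above might be organized as in the
following sketch.  Several things are assumptions made for illustration:
\verb+task_restore_state()+ is taken to have the same argument list as
\verb+task_save_state()+, the suffix \verb+'ref'+ is arbitrary, and
\verb+sketch_displace+ is a hypothetical routine standing in for whatever
stores displaced geometry \verb+i+ in the database.

\begin{verbatim}
      logical function sketch_fd_loop(rtdb, ndisp)
c
c     Hypothetical driver: energies at ndisp displaced geometries,
c     each started from the reference wavefunction.
c
      implicit none
      integer rtdb              ! [input] data base handle
      integer ndisp             ! [input] number of displacements
      integer i
      logical task_energy
      logical task_save_state, task_restore_state
      external task_energy
      external task_save_state, task_restore_state
c
      sketch_fd_loop = .false.
c
c     reference point: converge and keep the wavefunction
      if (.not. task_energy(rtdb)) return
      if (.not. task_save_state(rtdb, 'ref')) return
c
      do i = 1, ndisp
c        hypothetical routine that stores displaced geometry i
         call sketch_displace(rtdb, i)
c        start from the reference wavefunction, not the last one
         if (.not. task_restore_state(rtdb, 'ref')) return
         if (.not. task_energy(rtdb)) return
      enddo
c
      sketch_fd_loop = .true.
      end
\end{verbatim}

Restoring the reference state before every displacement keeps each
calculation independent of the order in which the displacements are visited.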