\label{sec:generic}

The high-level flow of control within NWChem was broadly outlined in
the discussion of NWChem architecture (see Section \ref{sec:arch}). This
chapter covers the details of internal communication between modules
and the control of program execution. This information is needed if
NWChem is to be embedded in another application or when new modules
are developed for the code.

\section{Flow of Control in NWChem}

The Generic Task Interface controls the execution of NWChem. The flow
of control proceeds in the following steps:

\begin{enumerate}
\item Begin initialization of the parallel environment.
\item Identify and open the input file.
\item Complete the initialization of the parallel environment.
\item Scan the input file for a memory directive.
\item Process start-up directives.
\item Summarize start-up information and write it to the output file.
\item Open the runtime database with the appropriate mode.
\item Process the input sequentially (ignoring start-up directives),
including the first \verb+task+ directive.
\item Execute the task.
\item Repeat steps 8 and 9 until reaching the end of the input file, or
encountering a fatal error condition.
\end{enumerate}

In Step 1, the parallel environment is initialized by calling the
TCGMSG wrapper routine \verb+pbeginf()+. This creates the parallel processes
and provides basic message-passing. Before the global arrays can be
initialized, however, user-specified memory parameters must be obtained
from the input file. This requires execution of Step 2 to open the input file.

The input file is opened {\em only} by process zero.
The name of the input file is determined by the routine
\verb+get_input_file_name()+. (The convention for the input file name is
documented in the user manual. The default name is \verb+nwchem.nw+.)
The input file is scanned by process zero for a memory directive using the
routine \verb+input_mem_size()+.
Defaults are supplied for any memory parameters
the user does not specify, and the results are broadcast to all nodes. At this point the
local memory allocator (MA) and then the global array library (GA) are initialized.
Completion of these steps fully initializes the parallel environment.

The next step is to process the startup directives that are contained in the
input file. This is done to determine the type of calculation being undertaken
(i.e., startup, restart, or continue), the name of the database, and the
location of the permanent and scratch directories. Note that only process zero scans
the input file, using routine \verb+input_file_info()+. The information obtained by
process zero is then broadcast to all nodes and
summarized in the output file. Process zero then
opens the database with the appropriate mode ({\em empty} for startup, {\em old}
for restart or continue).

At this point NWChem is fully functional and ready to process user
input beyond the startup information.
If the startup mode is {\em continue}, no more input is processed at first, and
the code attempts to continue the
previously executing task from the information in the database. No new input
is processed until that task is completed. Once the continued task has finished,
or if the startup mode is for a new or restarted input file,
the input file is read sequentially from the beginning (ignoring
startup directives, since they have already been processed).

As long as input is
available from the input file, the input module (routine \verb+input_parse()+) is
invoked to read it,
up to and including a task directive. Each input line is processed, and data is
inserted into the database for
later retrieval. Note that within the input module, only process zero executes
code, whether reading input or putting data into the database.
To enable this, the database
is switched into sequential mode at the beginning of the input module, and back to
parallel at the end.

Once a task directive is processed and entered into the database,
control is returned to the main program so that the task can be carried out.
The main program initiates the execution of the task by
calling the routine \verb+task()+. If the task fails, a fatal error is generated
either by
\verb+task()+ itself or a lower level routine. The task information remains in the database
so that the task may be continued in another job. If the task finishes successfully, \verb+task()+
removes information about the completed task from the database, and the main program
invokes the input module once again.

The input module continues to sequentially process the input. If it encounters another
task directive, control is returned to the main program and the execution of the task
is initiated, as described above. Upon successful completion, the main program again
returns control to the input module and input processing continues. If the input
module does not encounter a task directive before running out of input
(physical or logical end of file) it returns false and the loop in the main
program terminates.

Once all input has been processed and there are no more tasks to execute,
the code attempts to clean up by closing the database,
tidying up GA, and finally gracefully killing the parallel processes. Statistics
concerning the database, MA and GA are printed to the output file, and execution
terminates.

When a new module is introduced into NWChem, it must conform to this
orderly control process. The new module must be appropriately
invoked by the task routines. In addition, if it requires new input, the new module's
input routine must be
appropriately invoked by \verb+input_parse()+ (see Section \ref{sec:parser} for details
on the input parser).
The new module's input routine must also be structured so that
it allows only
process zero to execute the code that reads user input.

Any new module developed for NWChem must also conform to the design
goal that restart/continuation jobs with no repeated input behave exactly the same as
if all input and tasks were specified in a single file. This goal requires that
{\em all} input data be processed and entered into the database or another persistent
file. In particular, in-core data structures should not be initialized
within the input module. (Doing so will result in
only process zero having the information, and
restarts will not work correctly.) In addition,
input routines must not require basis set or geometry information
to be available, since these are not known until a task is actually invoked.

\section{Task Execution in NWChem}
As described above, NWChem executes all tasks by invoking the routine
\verb+task()+. The main program does not actually know
what a particular task is; the necessary
information is passed from the input module to the task library via the
database. This makes the top-level structure of NWChem very simple. The same simplicity
is desirable in many applications. For instance, molecular geometry optimizers
(or QM/MM programs) should work for all levels of theory and should not have to be
modified if a new theory is introduced into the code. Similarly, routines that
compute gradients and Hessians by finite difference need to be able
to save and restore the state associated with each type of wavefunction.

NWChem contains a layer of routines that can perform
the most common tasks/computations for all available wavefunctions.
The following subsection lists the routines in this layer, with their arguments and
calling conventions.
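
Before turning to the individual routines, the control loop described in this
chapter can be illustrated with a small language-neutral model. The sketch below
is {\em not} NWChem code: it is written in Python, a plain dictionary stands in
for the runtime database, and the functions \verb+input_parse+, \verb+task+ and
\verb+run+ are hypothetical stand-ins that only mimic the behavior described in
the text. The key \verb+task:operation+ is an assumed name used for illustration.

```python
from collections import deque


def input_parse(lines, db):
    """Read input up to and including the next task directive.

    Every non-task line is stored in the stand-in database. Returns True
    when a task directive was stored, False when the input ran out first.
    """
    while lines:
        words = lines.popleft().split()
        if not words:
            continue  # skip blank lines
        if words[0] == "task":
            db["task:theory"] = words[1]
            # Assumed key name; with no operation given, default to energy.
            db["task:operation"] = words[2] if len(words) > 2 else "energy"
            return True
        db[words[0]] = " ".join(words[1:])
    return False


def task(db, log):
    """Fetch the task description from the database and 'execute' it."""
    log.append((db["task:theory"], db["task:operation"]))
    # On success, information about the completed task is removed.
    del db["task:theory"], db["task:operation"]


def run(text):
    """Main loop: alternate between the input module and task execution."""
    db, log = {}, []
    lines = deque(text.splitlines())
    while input_parse(lines, db):
        task(db, log)
    return db, log
```

In this model the main program never inspects the task itself; everything it
needs passes through the dictionary, which mirrors the role the database plays
in the description above. Input that follows the last task directive is still
read and stored, but no further task is executed.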

\subsection{Task Routines for NWChem Operations}

The highest-level task routine is the subroutine \verb+task()+, which
is invoked only by the main NWChem program.
The other task routines, however, can be invoked from almost any module.
(Nested calls to the same subroutine should be avoided,
since most NWChem routines are
not reentrant.) The database argument-passing conventions of modules in NWChem
were developed in their
present form mainly
to support this layer.

\subsubsection{{\tt task}}

\begin{verbatim}
      subroutine task(rtdb)
      integer rtdb              ! [input] data base handle
\end{verbatim}

Called by ALL processes. After \verb+task_input+ has read the
task directive and stored the task information in the database, this routine
retrieves the data and invokes the desired action.

If the operation is in the list of those supported by generic
routines, then
the generic routine is called. Otherwise, a match is attempted
for a specialized routine. If no operation is specified
and no specialized routine is located, it is assumed that
a generic energy calculation is required.

This routine needs to be extended to accommodate QM/MM and other mixed methods
by having both MM and QM pieces specified (e.g., \verb+task md dft+).


\subsubsection{{\tt task\_energy}}


\begin{verbatim}
      logical function task_energy(rtdb)
      integer rtdb
c
c     RTDB input parameters
c     ---------------------
c     task:theory (string) - name of (QM) level of theory to use
c
c     RTDB output parameters
c     ----------------------
c     task:status (logical) - T/F for success/failure
c     if (status) then
c     .  task:energy (real)    - total energy
c     .  task:dipole (real(3)) - total dipole moment if available
c     .  task:cputime (real)   - cpu time to execute the task
c     .  task:walltime (real)  - wall time to execute the task
c
c     Also returns status through the function value
c
\end{verbatim}

Generic NWChem interface to compute the energy. Currently
only the QM components are supported.

\subsubsection{{\tt task\_gradient}}

\begin{verbatim}
      logical function task_gradient(rtdb)

c     RTDB input parameters
c     ---------------------
c     task:theory (string) - name of (QM) level of theory to use
c     task:numerical (logical) - optional - if true use numerical
c         differentiation. If absent or false use default selection.
c
c     RTDB output parameters
c     ----------------------
c     task:status (logical) - T/F for success/failure
c     if (status) then
c     .  task:energy (real)   - total energy
c     .  task:gradient (real array) - derivative w.r.t. geometry cart. coords.
c     .  task:dipole (real(3)) - total dipole if available
c     .  task:cputime (real)  - cpu time to execute the task
c     .  task:walltime (real) - wall time to execute the task
c
c     Also returns status through the function value
c
\end{verbatim}

Generic NWChem interface to compute the energy and gradient.
Currently only the QM components are supported.

Since this routine is directly invoked by application modules,
no input is processed in this routine.
If the method does not have analytic derivatives,
the numerical derivative routine is automatically called.


\subsubsection{{\tt task\_freq}}

\begin{verbatim}
      logical function task_freq(rtdb)

c     RTDB input parameters
c     ---------------------
c     task:theory
c
c     RTDB output parameters
c     ----------------------
c     task:hessian file name (string) - name of file containing hessian
c     task:status (logical)           - T/F on success/failure
c     task:cputime
c     task:walltime
\end{verbatim}

Central-difference calculation of the Hessian using
the generic energy/gradient interface.
It uses a routine inside
stepper to perform the finite differencing; this needs to be
cleaned up so that it is independent of stepper.

It will also be connected to analytic methods as they become available.

Since this routine is directly invoked by application modules,
no input is processed in this routine.

\subsubsection{{\tt task\_hessian}}

\begin{verbatim}
      logical function task_hessian(rtdb)

c     RTDB input parameters
c     ---------------------
c     task:theory (string) - name of (QM) level of theory to use
c     task:numerical (logical) - optional - if true use numerical
c         differentiation. if
c     task:analytic  (logical) - force analytic hessian
c
c     RTDB output parameters no for analytic hessian at the moment.
c     ----------------------
c     task:hessian file name - that has a lower triangular
c                              (double precision) array
c                              derivative w.r.t. geometry cart. coords.
c     task:status (logical)  - T/F for success/failure
c     task:cputime (real)    - cpu time to execute the task
c     task:walltime (real)   - wall time to execute the task
c
c     Also returns status through the function value
\end{verbatim}

Generic NWChem interface to compute the analytic Hessian.

If the method does not have analytic derivatives, the
numerical derivative routine is called automatically.
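
The central-difference scheme that the numerical Hessian path relies on can be
illustrated with a short model. The sketch below is written in Python and is only
an illustration of the formula
$H_{ij} \approx [g_i(x + h e_j) - g_i(x - h e_j)] / 2h$; it is not the routine
used inside NWChem, it ignores symmetry and the projection of
rotations/translations, and the function names and the step size are assumptions
chosen for the example.

```python
def central_difference_hessian(gradient, coords, step=1.0e-3):
    """Approximate the Hessian from gradients at displaced coordinates.

    gradient: callable returning the gradient (list of floats) at a point.
    coords:   list of floats, the reference point.
    step:     displacement size (an arbitrary illustrative value).
    """
    n = len(coords)
    hessian = [[0.0] * n for _ in range(n)]
    for j in range(n):
        plus = list(coords)
        minus = list(coords)
        plus[j] += step
        minus[j] -= step
        g_plus = gradient(plus)
        g_minus = gradient(minus)
        for i in range(n):
            hessian[i][j] = (g_plus[i] - g_minus[i]) / (2.0 * step)
    # Average the off-diagonal pairs so the result is exactly symmetric.
    for i in range(n):
        for j in range(i):
            average = 0.5 * (hessian[i][j] + hessian[j][i])
            hessian[i][j] = average
            hessian[j][i] = average
    return hessian


def model_gradient(x):
    """Gradient of the model energy E = x0^2 + x0*x1 + 2*x1^2."""
    return [2.0 * x[0] + x[1], x[0] + 4.0 * x[1]]
```

For the quadratic model energy the exact Hessian has rows $(2, 1)$ and $(1, 4)$,
and the central difference reproduces it to within rounding error. Each column
costs two gradient evaluations, which is why a good starting guess for the
wavefunction at each displaced geometry matters in practice.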



\subsubsection{{\tt task\_optimize}}

\begin{verbatim}
      logical function task_optimize(rtdb)

c     RTDB input parameters
c     ---------------------
c     task:theory (string) - must be set for task_gradient to work
c
c     RTDB output parameters
c     ----------------------
c     task:energy (real)    - final energy from optimization
c     task:gradient (real)  - final gradient from optimization
c     task:status (logical) - T/F on success/failure
c     task:cputime
c     task:walltime
c     geometry              - final geometry from optimization
c
\end{verbatim}

Optimize a geometry using stepper and the generic
task energy/gradient interface. Eventually another
layer will be needed below this one to handle the selection of other optimizers.

Since this routine can be directly invoked by application modules,
no input is processed in this routine.

\subsubsection{{\tt task\_num\_grad}}

\begin{verbatim}
      logical function task_num_grad(rtdb)
      integer rtdb              ! [input]
\end{verbatim}

Returns the energy and gradient at the current geometry.

Computes derivatives of \verb+task_energy()+ with respect to nuclear displacements
using numerical finite differences.

Uses symmetry and projects out rotations/translations.


\subsubsection{{\tt task\_save\_state} and {\tt task\_restore\_state}}

\begin{verbatim}
      logical function task_save_state(rtdb,suffix)

      integer rtdb              ! [input]
      character*(*) suffix      ! [input]
c
c     Input argument ... the suffix
c
c     RTDB arguments ... the theory name
c
c     Output ... function value T/F on success/failure
c
\end{verbatim}

Each module saves any files/database entries necessary
to restart the calculation at its current point by appending the
given suffix to any names.

The main (and perhaps only) application of this routine is in the
computation of derivatives by finite difference.
The energy/gradient
is computed at a reference geometry (or zero field) and then
the wavefunction is saved by calling this routine. Subsequent
calculations at displaced geometries (or non-zero fields) call
\verb+task_restore_state()+ in order to use the wavefunction at the
reference geometry as a starting guess for the calculation
at the displaced geometry. Thus, there is no need to save basis
or geometry (or field) information. For example, in the SCF module only the
MO vector file is saved.
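
The save/restore pattern described above can be illustrated with a small model.
The sketch below is written in Python and is not NWChem code: a dictionary stands
in for the files on disk, the file name \verb+movecs+ and the suffix \verb+ref+
are assumed names, and \verb+fake_scf+ is a hypothetical placeholder for a
wavefunction calculation that merely records which starting guess it received.

```python
def save_state(files, names, suffix):
    """Copy each named file to '<name>.<suffix>'; return True on success."""
    for name in names:
        if name not in files:
            return False
        files[name + "." + suffix] = files[name]
    return True


def restore_state(files, names, suffix):
    """Copy each '<name>.<suffix>' back to '<name>'; return True on success."""
    for name in names:
        saved = name + "." + suffix
        if saved not in files:
            return False
        files[name] = files[saved]
    return True


def fake_scf(files, geometry):
    """Placeholder calculation: report the starting guess, then overwrite it."""
    guess = files.get("movecs", "atomic guess")
    files["movecs"] = "converged at " + geometry
    return guess


def displaced_guesses(displacements):
    """Run a reference calculation, then one calculation per displacement."""
    files = {}
    guesses = [fake_scf(files, "reference")]
    if not save_state(files, ["movecs"], "ref"):
        raise RuntimeError("could not save the reference state")
    for geometry in displacements:
        # Restore first, so each displaced run starts from the reference.
        if not restore_state(files, ["movecs"], "ref"):
            raise RuntimeError("could not restore the reference state")
        guesses.append(fake_scf(files, geometry))
    return guesses
```

In this model every displaced calculation starts from the reference
wavefunction rather than from the result of the previous displacement, and only
the one file that carries the wavefunction is copied, in line with the remark
that basis and geometry information need not be saved.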