%\chapter{Dynamical Nucleation Theory Monte Carlo}

% $Id$
%
% Updated 6/24/08 for new version (lcrosby)

\label{sec:dntmc} 1.) Schenter, G. K.; Kathmann, S. M.; Garrett, B. C.
{\it J. Chem. Phys.} ({\bf 1999}), {\it 110}, 7951. 2.) Crosby, L. D.;
Kathmann, S. M.; Windus, T. L. {\it J. Comput. Chem.} ({\bf 2008}),
submitted.

The Dynamical Nucleation Theory Monte Carlo
(DNTMC) module uses Dynamical Nucleation Theory (DNT) to compute
monomer evaporation rate constants at a given temperature.  The
reactant is a molecular cluster of $i$ rigid monomers, while the
product is a molecular cluster of $i-1$ monomers plus a free
monomer.  A Metropolis Monte Carlo (MC) methodology is used to
sample the configurational space of these $i$ rigid monomers.  Both
homogeneous and heterogeneous clusters are supported.

\section{Subgroups}

The DNTMC module supports the use of subgroups in the MC
simulations.  The number of subgroups is defined in the input
through a set directive:
\begin{verbatim}
     set subgroup_number <integer number>
\end{verbatim}
where the argument is the number of subgroups requested.  The
number of processors available to each subgroup is the total number
of processors divided by subgroup\_number.  A separate MC simulation
is performed within each subgroup.  To use this functionality, NWChem
must be compiled with the USE\_SUBGROUPS environment variable set.
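
As a worked illustration of this division, assume a hypothetical job
on 64 processors that requests 8 subgroups (both numbers are chosen
only for illustration):
\[
  \frac{N_{\mathrm{total}}}{N_{\mathrm{subgroups}}} = \frac{64}{8} = 8
  \quad \mbox{processors per subgroup},
\]
so eight independent MC simulations would run side by side, each on
8 processors.  A subgroup count that divides the processor count
evenly is presumably the safest choice, since the handling of a
remainder is not described here.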

Each MC simulation starts from a different configuration; the
starting configurations are equally spaced along the reaction
coordinate.  The statistical distributions produced by these MC
simulations are averaged to form the final statistical distribution.
Output from the subgroups consists of various files whose names are
of the form (*.\#num).  These files include restart files and other
data files.  The NWChem runtime database (RTDB) is used as input for
the subgroups and must be globally accessible (set through the
PERMANENT\_DIR directive) to all processes.

\newpage
\section{Input Syntax}

The input block has the following form:

\begin{verbatim}
DNTMC
     [nspecies <integer number>]
     [species  <list of strings name[nspecies]>]
     [nmol     <list of integers number[nspecies]>]

     [temp     <real temperature>]
     [rmin     <real rmin>]
     [rmax     <real rmax>]
     [nob      <integer nob>]
     [mcsteps  <integer number>]
     [tdisp    <real disp>]
     [rdisp    <real rot>]
     [rsim || rconfig]
     [mprnt    <integer number>]
     [convergence <real limit>]
     [norestart]
     [dntmc_dir <string directory>]

     [print &&|| noprint]

     [procrestart <integer number>]
END
\end{verbatim}

\section{Definition of Monomers}

Geometry information is required for each unique monomer (species).
See Section~\ref{sec:geom} on geometry input for more information.
A unique label must be given for each monomer geometry.
Additionally, the noautosym and nocenter options are suggested for
use with the DNTMC module to prevent NWChem from changing the input
geometries.  Symmetry should also not be used, since cluster
configurations will seldom exhibit any symmetry even though the
monomers themselves may.

\begin{verbatim}
  GEOMETRY [<string name species_1>] noautosym nocenter ...
      ...
  symmetry c1
  END

  GEOMETRY [<string name species_2>] noautosym nocenter ...
      ...
  symmetry c1
  END

  ...
\end{verbatim}

The molecular cluster is defined by the number of unique monomers
(nspecies).  The geometry labels for the unique monomers are given in
a space-delimited list (species).  Also required is the number of
each unique monomer in the molecular cluster, given as a
space-delimited list (nmol).  These keywords are required and thus
have no default values.

\begin{verbatim}
     [nspecies <integer number>]
     [species  <list of strings name[nspecies]>]
     [nmol     <list of integers number[nspecies]>]
\end{verbatim}

An example is shown in Section~\ref{sec:dntmc_example} for a
10-monomer cluster consisting of a 50/50 mixture of water and ammonia.
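
For comparison, a homogeneous cluster needs only a single species.
The fragment below is a minimal sketch for a cluster of ten identical
monomers; the geometry label \verb|geom1| is a hypothetical name and
must match the label given on the corresponding GEOMETRY block.
\begin{verbatim}
     nspecies 1
     species  geom1
     nmol     10
\end{verbatim}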

\section{DNTMC runtime options}

Several options control the behavior of the DNTMC module.  Some
required options, such as the simulation temperature (temp), cluster
radius (rmin and rmax), and maximum number of MC steps (mcsteps), are
used to control the MC simulation.

\begin{verbatim}
     [temp     <real temperature>]
\end{verbatim}
This required option gives the temperature at which the MC
simulation is run.  The temperature is given in kelvin.

\begin{verbatim}
     [rmin     <real rmin>]
     [rmax     <real rmax>]
     [nob      <integer nob>]
\end{verbatim}
These required options define the minimum and maximum extent of the
projected reaction coordinate (the radius of a sphere centered at
the center of mass).  Rmin should be large enough to contain the
entire molecular cluster of monomers, and Rmax should be large enough
to include any relevant configurational space (such as the position
of the reaction bottleneck).  These values are given in Angstroms.

The probability distributions obtained along this projected reaction
coordinate extend from a minimum of Rmin to a maximum of Rmax.
The distributions are created by dividing this range into a number
of smaller bins.  The number of bins is controlled by the nob option.
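
If the bins are assumed to be of equal width (the description above
does not state this explicitly), the bin width follows directly from
the three options:
\[
  \Delta r = \frac{R_{\max} - R_{\min}}{\mathrm{nob}} .
\]
For instance, with rmin = 3.25 and rmax = 12.25 Angstroms and a
hypothetical nob of 90, each bin would be $(12.25 - 3.25)/90 = 0.1$
Angstroms wide.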

\begin{verbatim}
     [mcsteps  <integer number>]
     [tdisp    <real disp default 0.04>]
     [rdisp    <real rot default 0.06>]
     [convergence <real limit default 0.00>]
\end{verbatim}
These options define some characteristics of the MC simulations.  The
maximum number of MC steps (mcsteps) to take in the course of the
calculation is a required option.  Once the MC simulation has
performed this number of steps, the calculation ends.  This is a
per-Markov-chain quantity.  The maximum translational step size
(tdisp) and rotational step size (rdisp) are optional inputs with
defaults of 0.04 Angstroms and 0.06 radians, respectively.  The
convergence keyword sets the convergence threshold: once the
measure of convergence falls below this threshold, the calculation
ends.  The default is 0.00, which effectively turns off this check.
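
A minimal sketch of how these four keywords might appear together
inside the DNTMC block is given below.  The step sizes repeat the
defaults; the values of mcsteps and convergence are hypothetical and
chosen only for illustration.
\begin{verbatim}
     mcsteps     50000
     tdisp       0.04
     rdisp       0.06
     convergence 0.01
\end{verbatim}
With this input, each Markov chain would stop after 50000 steps or as
soon as the convergence measure drops below 0.01, whichever comes
first.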

\begin{verbatim}
     [rsim || rconfig]
\end{verbatim}
These optional keywords select between two MC sampling methods.
The rsim keyword selects a Metropolis MC methodology that samples
configurations according to a canonical ensemble.  The rconfig
keyword selects an MC methodology that samples configurations
according to the derivative of the canonical ensemble with respect
to the projected reaction coordinate.  The default method is
rconfig.
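
For orientation, the textbook Metropolis acceptance rule for
canonical sampling is sketched below.  It is quoted as the standard
form, not taken from the DNTMC implementation; the symbols $\Delta U$
(change in potential energy for a trial move), $k_B$ (Boltzmann
constant), and $T$ (the temperature set by temp) are introduced here
for illustration.
\[
  P_{\mathrm{acc}} = \min\left[ 1,
      \exp\left( -\frac{\Delta U}{k_B T} \right) \right]
\]
A trial move that lowers the energy is always accepted, while an
uphill move is accepted with a probability that decreases
exponentially with $\Delta U / k_B T$.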

\begin{verbatim}
     [mprnt    <integer number default 10>]
     [dntmc_dir <string directory default ./>]
     [norestart]
\end{verbatim}
These three options define some of the output and data analysis
behavior.  The mprnt option controls how often data analysis
occurs during the simulation.  Currently, data analysis is performed
every mprnt*nob MC steps, and results are written to files and/or to
the log file.  Restart files are also written every mprnt MC steps
during the simulation.  The default value is 10.  The
keyword dntmc\_dir specifies an alternate directory in which
to place DNTMC-specific output files.  These files can be very large,
so be sure enough space is available.  This directory should be
accessible by every process (although not necessarily globally
accessible).  The default is to place these files in the directory
in which NWChem is run (./).  The keyword norestart turns off the
production of restart files, which are otherwise written every mprnt
MC steps.

\section{Print Control}
The DNTMC module supports the use of the PRINT and NOPRINT keywords.  The
specific labels that DNTMC recognizes are listed below.

\begin{table}[htbp!]
\begin{center}
\begin{tabular}{lcc}
  {\bf Name}          & {\bf Print Level} & {\bf Description} \\
``debug'' & debug & \begin{minipage}{0.6\textwidth}Debug
information written to the output
file.\end{minipage} \\
\\
``information'' & none & \begin{minipage}{0.6\textwidth}General
information such as energies and
geometries.\end{minipage}\\
\\
``mcdata'' & low & \begin{minipage}{0.6\textwidth}Production of a
set of files (Prefix.MCdata.\#num).  These files are a concatenated
list of structures, energies, and dipole moments for each accepted
configuration sampled in the MC run.\end{minipage}\\
\\
``alldata'' & low & \begin{minipage}{0.6\textwidth}Production of a
set of files (Prefix.Alldata.\#num).  These files include the same
information as the MCdata files, but for ALL
configurations (accepted or
rejected).\end{minipage}\\
\\
``mcout'' & debug -- low  &
\begin{minipage}{0.6\textwidth}Production of a set of files
(Prefix.MCout.\#num).  These files contain informational and
debug output, together with the same information as the
Alldata files.\end{minipage}\\
\\
``fdist'' & low & \begin{minipage}{0.6\textwidth}Production of a
file (Prefix.fdist) which contains
a concatenated list of distributions every mprnt*100 MC steps.\end{minipage}\\
\\
``timers'' & debug &
\begin{minipage}{0.6\textwidth}Enables some timers in the code.
These timers report performance statistics in the output file every
time data analysis is performed.  Two timers are used: one for the
mcloop itself and one for the communication step.\end{minipage}
\end{tabular}
\end{center}
\end{table}

\section{Selected File Formats}
Several output files are available in the DNTMC module.  This
section defines the format of some of these files.

\begin{enumerate}
\item *.fdist

This file is a concatenated list of radial distribution functions
printed out every mprnt MC steps.  Each distribution is normalized
(sum equal to one) with respect to the entire (all species)
distribution.  The error is the RMS deviation of the average at each
point.  Each entry is as follows:
\subitem [1] \# Total Configurations
\subitem [2] Species number \#
\subitem [3] (R coordinate in Angstroms) (Probability) (Error)
\subitem [3 repeats nob times]
\subitem [2 and 3 repeat for each species]
\subitem [4] *** separator.

\item *.MCdata.\#

This file is a concatenated list of accepted configurations.  Each
file corresponds to a single Markov chain.  The dipole is set to
zero for methods which do not produce a dipole moment with energy
calculations.  Rsim is either the radial extent of the cluster
(r-config) or the simulation radius (r-simulation).
Each entry is as follows:
\subitem [1] (Atomic label) (X Coord.) (Y Coord.) (Z Coord.)
\subitem [1 repeats for each atom in the cluster configuration; units are Angstroms]
\subitem [2] Ucalc = \# hartree
\subitem [3] Dipole = (X) (Y) (Z) au
\subitem [4] Rsim = \# Angstrom
\subitem [1 through 4 repeat for each accepted configuration]

\item *.MCout.\#

This file has the same format and information content as the MCdata
file, except that additional output is included.  This additional
output includes summary statistics such as acceptance ratios,
average potential energy, and average radius.  The information
included for accepted configurations does not include the dipole
moment or radius.

\item *.MCall.\#

This file has the same format as the MCdata file, except that it
includes information for all configurations for which an energy is
determined.  All accepted and rejected configurations are included
in this file.

\item *.restart.\#

This file contains the restart information for each subgroup.  Its
format is not easily human readable, but the basic fields are
described briefly here.
\subitem Random number seed
\subitem Potential energy in hartrees
\subitem Sum of potential energy
\subitem Average potential energy
\subitem Sum of the squared potential energy
\subitem Squared potential energy
\subitem Dipole moment in au (X) (Y) (Z)
\subitem Rmin and Rmax
\subitem Rsim (radius corresponding to the r-config or r-sim method)
\subitem Array of nspecies length; each value is the number of monomers of that type which lie at radius Rsim from the center of mass [r-simulation sets these to zero]
\subitem Sum of Rsim
\subitem Average of Rsim
\subitem Number of accepted translational moves
\subitem Number of accepted rotational moves
\subitem Number of accepted volume moves
\subitem Number of attempted moves (volume) (translational) (rotational)
\subitem Number of accepted moves (Zero)
\subitem Number of accepted moves (Zero)
\subitem Number of MC steps completed
\subitem [1] (Atom label) (X Coord.) (Y Coord.) (Z Coord.)
\subitem [1 repeats for each atom in the cluster configuration; units are Angstroms]
\subitem [2] Array of nspecies length, number of configurations in bin
\subitem [3] Array of nspecies length, normalized number of configurations in each bin
\subitem [4] (Value of bin in Angstroms) (Array of nspecies length, normalized probability of bin)
\subitem [2 through 4 repeat nob times]

\end{enumerate}
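
Read literally, the description of the MCdata file above corresponds
to a repeating record of the following shape.  This is a schematic
sketch only: the angle-bracket placeholders stand for actual values,
and the exact spacing and number formatting written by DNTMC are
assumptions.
\begin{verbatim}
<label> <x> <y> <z>
   ... one such line per atom in the cluster ...
Ucalc = <energy> hartree
Dipole = <x> <y> <z> au
Rsim = <radius> Angstrom
\end{verbatim}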

\section{DNTMC Restart}

\begin{verbatim}
     [procrestart <integer number>]
\end{verbatim}
This keyword is a flag that requests restart postprocessing.  It is
suggested that the postprocessing run use only one processor.

To restart a DNTMC run, postprocessing is required to put the
necessary information into the runtime database (RTDB).  During a run,
restart information is written to files (Prefix.restart.\#num) every
mprnt MC steps.  This information must be read and deposited into the
RTDB before a restart run can be done.  The integer argument is the
number of files to read and must equal the number of subgroups the
calculation uses.  The start directive must also be changed to
restart for this to work properly.  All input is read as usual;
however, values from the restart files take precedence over input
values.  Some keywords, such as mcsteps, are not defined in the
restart files, so their values come from the input.  Task directives
are ignored.  An RTDB must be present in the permanent directory.

Once postprocessing is done, a standard restart can be performed from
the RTDB by removing the procrestart keyword and including the
restart directive.
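
The two stages can be sketched as follows, assuming a hypothetical
calculation that was run with 8 subgroups at the SCF level; the
ellipses stand for the unchanged remainder of the original input.
The first input performs the postprocessing:
\begin{verbatim}
    restart
    ...
    dntmc
        ...
        procrestart 8
    end
    task scf dntmc
    # the task directive is ignored during postprocessing
\end{verbatim}
The second input is identical except that the procrestart line is
removed, so that the MC simulation resumes from the information now
stored in the RTDB:
\begin{verbatim}
    restart
    ...
    dntmc
        ...
    end
    task scf dntmc
\end{verbatim}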

\section{Task Directives}
The DNTMC module can be used with any level of theory that can
produce energies.  Gradients and Hessians are not required by
this methodology.  If dipole moments are available, they are also
used.  The task directive for the DNTMC module is shown below:
\begin{verbatim}
     task <string theory> dntmc
\end{verbatim}

\section{Example}
\label{sec:dntmc_example} This example is for a molecular cluster of
10 monomers: a 50/50 mixture of water and ammonia.  The energies
are computed at the SCF/6-31++G** level of theory.

\begin{verbatim}
    start
    # start or restart directive if a restart run
    MEMORY 1000 mb

    PERMANENT_DIR /home/bill
    # Globally accessible directory in which the
    # rtdb (*.db) file will reside or already resides.

    basis "ao basis" spherical noprint
        * library 6-31++G**
    end
    # basis set directive for scf energies

    scf
        singlet
        rhf
        tol2e 1.0e-12
        vectors input atomic
        thresh 1.0e-06
        maxiter 200
        print none
    end
    # scf directive for scf energies

    geometry geom1 units angstroms noautosym nocenter noprint
    O  0.393676503613369      -1.743794626956820      -0.762291912129271
    H -0.427227157125777      -1.279138812526320      -0.924898279781319
    H  1.075463952717060      -1.095883929075060      -0.940073459864222
    symmetry c1
    end
    # geometry of a monomer with title "geom1"

    geometry geom2 units angstroms noautosym nocenter noprint
    N   6.36299e-08   0.00000    -0.670378
    H   0.916275      0.00000    -0.159874
    H  -0.458137      0.793517   -0.159874
    H  -0.458137     -0.793517   -0.159874
    symmetry c1
    end
    # geometry of another monomer with title "geom2"
    # other monomers may be included with different titles

    set subgroups_number 8
    # set directive which gives the number of subgroups
    # each group runs a separate MC simulation

    dntmc
    # DNTMC input block
        nspecies 2
        # The number of unique species (number of titled geometries
        # above)
        species geom1 geom2
        # An array of geometry titles (one for each
        # nspecies/geometry)
        nmol    5  5
        # An array stating the number of each
        # monomer/nspecies/geometry in simulation.
        temp  243.0
        mcsteps 1000000
        rmin 3.25
        rmax 12.25
        mprnt 10
        tdisp 0.04
        rdisp 0.06
        print none fdist mcdata
        # this print line first sets the print-level to none
        # then it states that the *.fdist and *.mcdata.(#num)
        # files are to be written
        rconfig
        dntmc_dir /home/bill/largefile
        # An accessible directory in which to place the *.fdist,
        # *.mcdata.(#num), and *.restart.(#num) files.
        convergence 1.0D+00
    end

    task scf dntmc
    # task directive stating that energies are to be done at the scf
    # level of theory.
\end{verbatim}
437