%\chapter{Dynamical Nucleation Theory Monte Carlo}

% $Id$
%
% Updated 6/24/08 for new version (lcrosby)

\label{sec:dntmc} 1.) Schenter, G. K.; Kathmann, S. M.; Garrett, B.
C. {\it J. Chem. Phys.} ({\bf 1999}), {\it 110}, 7951. 2.) Crosby, L.
D.; Kathmann, S. M.; Windus, T. L. {\it J. Comput. Chem.} ({\bf
2008}), submitted.

The Dynamical Nucleation Theory Monte Carlo
(DNTMC) module uses Dynamical Nucleation Theory (DNT) to compute
monomer evaporation rate constants at a given temperature. The
reactant is a molecular cluster of $i$ rigid monomers, while the
product is a molecular cluster of $i-1$ monomers plus a free
monomer. A Metropolis Monte Carlo (MC) methodology is used to
sample the configurational space of these $i$ rigid monomers. Both
homogeneous and heterogeneous clusters are supported.

\section{SubGroups}

The DNTMC module supports the use of subgroups in the MC
simulations. The number of subgroups is defined in the input
through a set directive:
\begin{verbatim}
  set subgroup_number <integer number>
\end{verbatim}
where the argument is the number of subgroups requested. The
number of processors available to each subgroup is the total number
of processors divided by subgroup\_number. A separate MC simulation
is performed within each subgroup. To use this functionality, NWChem
must be compiled with the USE\_SUBGROUPS environment variable set.

Each MC simulation starts from a different configuration; the
starting configurations are equally spaced along the reaction
coordinate. The statistical distributions produced by these MC
simulations are averaged to form the final statistical distribution.
Output from these subgroups consists of various files whose names
are of the form (*.\#num). These files include restart files and
other data files.
The NWChem runtime database (RTDB) is used as input for
these subgroups and must be globally accessible (set through the
Permanent\_Dir directive) to all processes.

\newpage
\section{Input Syntax}

The input block has the following form:

\begin{verbatim}
DNTMC
  [nspecies <integer number>]
  [species <list of strings name[nspecies]>]
  [nmol <list of integers number[nspecies]>]

  [temp <real temperature>]
  [rmin <real rmin>]
  [rmax <real rmax>]
  [nob <integer nob>]
  [mcsteps <integer number>]
  [tdisp <real disp>]
  [rdisp <real rot>]
  [rsim || rconfig]
  [mprnt <integer number>]
  [convergence <real limit>]
  [norestart]
  [dntmc_dir <string directory>]

  [print &&|| noprint]

  [procrestart <integer number>]
END
\end{verbatim}

\section{Definition of Monomers}

Geometry information is required for each unique monomer (species).
See the geometry input section \ref{sec:geom} for more information.
A unique label must be given for each monomer geometry.
Additionally, the noautosym and nocenter options are suggested for
use with the DNTMC module to prevent NWChem from changing the input
geometries. Symmetry should also not be used, since cluster
configurations will seldom exhibit any symmetry, although the
monomers themselves may.

\begin{verbatim}
 GEOMETRY [<string name species_1>] noautosym nocenter ...
   ...
   symmetry c1
 END

 GEOMETRY [<string name species_2>] noautosym nocenter ...
   ...
   symmetry c1
 END

 ...
\end{verbatim}

The molecular cluster is defined by the number of unique monomers
(nspecies). The geometry labels of the unique monomers are given in
a space delimited list (species). The number of each unique monomer
in the molecular cluster is also required and is given as a space
delimited list (nmol). These keywords are required and thus have no
default values.
\begin{verbatim}
  [nspecies <integer number>]
  [species <list of strings name[nspecies]>]
  [nmol <list of integers number[nspecies]>]
\end{verbatim}

An example is shown in section \ref{sec:dntmc_example} for a 10
monomer cluster consisting of a 50/50 mixture of water and ammonia.

\section{DNTMC runtime options}

Several options control the behavior of the DNTMC module. Some
required options, such as the simulation temperature (temp), the
cluster radius (rmin and rmax), and the maximum number of MC steps
(mcsteps), are used to control the MC simulation.

\begin{verbatim}
  [temp <real temperature>]
\end{verbatim}
This required option gives the temperature at which the MC
simulation is run. The temperature is given in kelvin.

\begin{verbatim}
  [rmin <real rmin>]
  [rmax <real rmax>]
  [nob <integer nob>]
\end{verbatim}
These required options define the minimum and maximum extent of the
projected reaction coordinate (the radius of a sphere centered at
the center of mass). Rmin should be large enough to contain the
entire molecular cluster of monomers, and Rmax should be large enough
to include any relevant configurational space (such as the position
of the reaction bottleneck). These values are given in Angstroms.

The probability distributions obtained along this projected reaction
coordinate span the range from Rmin to Rmax. The distributions are
created by dividing this range into a number of smaller bins. The
number of bins is set by the nob option.

\begin{verbatim}
  [mcsteps <integer number>]
  [tdisp <real disp default 0.04>]
  [rdisp <real rot default 0.06>]
  [convergence <real limit default 0.00>]
\end{verbatim}
These options define some characteristics of the MC simulations.
The maximum number of MC steps (mcsteps) to take in the course of
the calculation is a required option. Once the MC simulation has
performed this number of steps, the calculation ends. This is a
per Markov chain quantity. The maximum translational step size
(tdisp) and rotational step size (rdisp) are optional inputs with
defaults of 0.04 Angstroms and 0.06 radians, respectively. The
convergence keyword sets the convergence threshold. Once the
measure of convergence goes below this threshold, the calculation
ends. The default is 0.00, which effectively turns off this check.

\begin{verbatim}
  [rsim || rconfig]
\end{verbatim}
These optional keywords select between two MC sampling methods.
rsim selects a Metropolis MC methodology which samples
configurations according to a canonical ensemble. rconfig selects
an MC methodology which samples configurations according to a
derivative of the canonical ensemble with respect to the projected
reaction coordinate. The default method is rconfig.

\begin{verbatim}
  [mprnt <integer number default 10>]
  [dntmc_dir <string directory default ./>]
  [norestart]
\end{verbatim}
These three options define some of the output and data analysis
behavior. mprnt controls how often data analysis occurs during the
simulation. Currently, data analysis is performed every mprnt*nob
MC steps, and the results are written to files and/or to the log
file. Restart files are also written every mprnt MC steps during
the simulation. The default value of mprnt is 10. The keyword
dntmc\_dir defines an alternate directory in which to place DNTMC
specific output files. These files can be very large, so be sure
enough space is available. This directory should be accessible by
every process (although not necessarily globally accessible).
The default is to place these files in the directory in
which NWChem is run (./). The keyword norestart turns off the
production of restart files. By default, restart files are produced
every mprnt MC steps.

\section{Print Control}
The DNTMC module supports the use of PRINT and NOPRINT keywords. The
specific labels which DNTMC recognizes are listed below.

\begin{table}[htbp!]
\begin{center}
\begin{tabular}{lcc}
 {\bf Name} & {\bf Print Level} & {\bf Description} \\
``debug'' & debug & \begin{minipage}{0.6\textwidth}Some debug
information written to the output
file.\end{minipage} \\
\\
``information'' & none & \begin{minipage}{0.6\textwidth}Some
information such as energies and
geometries.\end{minipage}\\
\\
``mcdata'' & low & \begin{minipage}{0.6\textwidth}Production of a
set of files (Prefix.MCdata.\#num). These files are a concatenated
list of structures, energies, and dipole moments for each accepted
configuration sampled in the MC run.\end{minipage}\\
\\
``alldata'' & low & \begin{minipage}{0.6\textwidth}Production of a
set of files (Prefix.Alldata.\#num). These files include the same
information as the MCdata files. However, they include ALL
configurations (accepted or
rejected).\end{minipage}\\
\\
``mcout'' & debug -- low &
\begin{minipage}{0.6\textwidth}Production of a set of files
(Prefix.MCout.\#num). These files contain informative and
debug output. Also included is the set of information which
mirrors the Alldata files.\end{minipage}\\
\\
``fdist'' & low & \begin{minipage}{0.6\textwidth}Production of a
file (Prefix.fdist) which contains
a concatenated list of distributions every mprnt*100 MC steps.\end{minipage}\\
\\
``timers'' & debug &
\begin{minipage}{0.6\textwidth}Enables some timers in the code.
These timers return performance statistics in the output file every
time data analysis is performed.
Two timers are used: one for the
mcloop itself and one for the communication step.\end{minipage}
\end{tabular}
\end{center}
\end{table}

\section{Selected File Formats}
Several output files are available in the DNTMC module. This
section defines the format of some of these files.

\begin{enumerate}
\item *.fdist

This file is a concatenated list of radial distribution functions
printed out every mprnt MC steps. Each distribution is normalized
(sum equal to one) with respect to the entire (all species)
distribution. The error is the RMS deviation of the average at each
point. Each entry is as follows:
\subitem [1] \# Total Configurations
\subitem [2] Species number \#
\subitem [3] (R coordinate in Angstroms) (Probability) (Error)
\subitem [Repeats nob times]
\subitem [2 and 3 repeat for each species]
\subitem [4] *** separator.

\item *.MCdata.\#

This file is a concatenated list of accepted configurations. Each
file corresponds to a single Markov chain. The dipole is set to
zero for methods which do not produce a dipole moment with energy
calculations. Rsim is either the radial extent of the cluster
(r-config) or the simulation radius (r-simulation).
Each entry is as follows:
\subitem [1] (Atomic label) (X Coord.) (Y Coord.) (Z Coord.)
\subitem [1 repeats for each atom in the cluster configuration,
units are in Angstroms]
\subitem [2] Ucalc = \# hartree
\subitem [3] Dipole = (X) (Y) (Z) au
\subitem [4] Rsim = \# Angstrom
\subitem [1 through 4 repeat for each accepted configuration]

\item *.MCout.\#

This file has the same format and information content as the MCdata
file, except that additional output is included. This additional
output includes summary statistics such as acceptance ratios,
average potential energy, and average radius. The information
included for accepted configurations does not include the dipole
moment or radius.

\item *.MCall.\#

This file has the same format as the MCdata file, except that it
includes information for all configurations for which an energy is
determined. All accepted and rejected configurations are included
in this file.

\item *.restart.\#

This file contains the restart information for each subgroup. Its
format is not very human readable, but the basic fields are briefly
described here.
\subitem Random number seed
\subitem Potential energy in hartrees
\subitem Sum of potential energy
\subitem Average potential energy
\subitem Sum of the squared potential energy
\subitem Squared potential energy
\subitem Dipole moment in au (X) (Y) (Z)
\subitem Rmin and Rmax
\subitem Rsim (Radius corresponds to r-config or r-sim methods)
\subitem Array of nspecies length, value indicates the number of
each type of monomer which lies at radius Rsim from the center of
mass [r-simulation sets these to zero]
\subitem Sum of Rsim
\subitem Average of Rsim
\subitem Number of accepted translational moves
\subitem Number of accepted rotational moves
\subitem Number of accepted volume moves
\subitem Number of attempted moves (volume) (translational) (rotational)
\subitem Number of accepted moves (Zero)
\subitem Number of accepted moves (Zero)
\subitem Number of MC steps completed
\subitem [1] (Atom label) (X Coord.) (Y Coord.) (Z Coord.)
\subitem [1 repeats for each atom in the cluster configuration,
units are in Angstroms]
\subitem [2] Array of nspecies length, number of configurations in bin
\subitem [3] Array of nspecies length, normalized number of
configurations in each bin
\subitem [4] (Value of bin in Angstroms) (Array of nspecies
length, normalized probability of bin)
\subitem [2 through 4 repeat nob times]

\end{enumerate}

\section{DNTMC Restart}

\begin{verbatim}
  [procrestart <integer number>]
\end{verbatim}
This keyword flags a restart postprocessing run. It is suggested
that this postprocessing run use only one processor.

In order to restart a DNTMC run, postprocessing is required to put
the necessary information into the runtime database (RTDB). During a
run, restart information is written to files (Prefix.restart.\#num)
every mprnt MC steps. This information must be read and deposited
into the RTDB before a restart run can be done. The number taken as
an argument is the number of files to read and must also equal the
number of subgroups the calculation uses. The start directive
must also be set to restart for this to work properly. All input is
read as usual. However, values from the restart files take
precedence over input values. Some keywords, such as mcsteps, are
not defined in the restart files. Task directives are ignored. You
must have an RTDB present in your permanent directory.

Once postprocessing is done, a standard restart can be done from the
RTDB by removing the procrestart keyword and including the restart
directive.

\section{Task Directives}
The DNTMC module can be used with any level of theory which can
produce energies. Gradients and Hessians are not required within
this methodology. If dipole moments are available, they are also
used.
The task directive for the DNTMC module is shown below:
\begin{verbatim}
  task <string theory> dntmc
\end{verbatim}

\section{Example}
\label{sec:dntmc_example} This example is for a molecular cluster of
10 monomers: a 50/50 mixture of water and ammonia. The energies
are computed at the SCF/6-31++G** level of theory.

\begin{verbatim}
 start
 # start or restart directive if a restart run
 MEMORY 1000 mb

 PERMANENT_DIR /home/bill
 # Globally accessible directory in which the
 # rtdb (*.db) file will/does reside.

 basis "ao basis" spherical noprint
   * library 6-31++G**
 end
 # basis set directive for scf energies

 scf
  singlet
  rhf
  tol2e 1.0e-12
  vectors input atomic
  thresh 1.0e-06
  maxiter 200
  print none
 end
 # scf directive for scf energies

 geometry geom1 units angstroms noautosym nocenter noprint
  O  0.393676503613369 -1.743794626956820 -0.762291912129271
  H -0.427227157125777 -1.279138812526320 -0.924898279781319
  H  1.075463952717060 -1.095883929075060 -0.940073459864222
 symmetry c1
 end
 # geometry of a monomer with title "geom1"

 geometry geom2 units angstroms noautosym nocenter noprint
  N  6.36299e-08  0.00000  -0.670378
  H  0.916275     0.00000  -0.159874
  H -0.458137     0.793517 -0.159874
  H -0.458137    -0.793517 -0.159874
 symmetry c1
 end
 # geometry of another monomer with title "geom2"
 # other monomers may be included with different titles

 set subgroups_number 8
 # set directive which gives the number of subgroups
 # each group runs a separate MC simulation

 dntmc
 # DNTMC input block
  nspecies 2
  # The number of unique species (number of titled geometries
  # above)
  species geom1 geom2
  # An array of geometry titles (one for each
  # nspecies/geometry)
  nmol 5 5
  # An array stating the number of each
  # monomer/nspecies/geometry in simulation.
  temp 243.0
  mcsteps 1000000
  rmin 3.25
  rmax 12.25
  mprnt 10
  tdisp 0.04
  rdisp 0.06
  print none fdist mcdata
  # this print line first sets the print-level to none
  # then it states that the *.fdist and *.mcdata.(#num)
  # files are to be written
  rconfig
  dntmc_dir /home/bill/largefile
  # An accessible directory in which to place the *.fdist,
  # *.mcdata.(#num), and *.restart.(#num) files.
  convergence 1.0D+00
 end

 task scf dntmc
 # task directive stating that energies are to be done at the scf
 # level of theory.
\end{verbatim}
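
The restart postprocessing described in the DNTMC Restart section
could be set up for the example above roughly as follows. This is a
minimal, untested sketch rather than a definitive input. It assumes
that the eight-subgroup run above has already written its
Prefix.restart.\#num files, that the RTDB is still present in
/home/bill, and that all directives not shown are left exactly as in
the example. The value 8 is chosen only because it must equal the
number of subgroups.

\begin{verbatim}
 restart
 # restart replaces start for the postprocessing run

 PERMANENT_DIR /home/bill
 # must contain the rtdb (*.db) file from the original run

 # memory, basis, scf, geometry, and set directives:
 # assumed unchanged from the example above

 dntmc
  # all other DNTMC keywords: assumed unchanged from the
  # example above
  procrestart 8
  # number of restart files to read; must equal the number
  # of subgroups
 end

 task scf dntmc
 # kept unchanged here; task directives are ignored during
 # postprocessing
\end{verbatim}

After this postprocessing run completes, removing the procrestart
line and keeping the restart directive gives the input for the
actual restart run.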