1%
2% $Id$
3%
4\label{sec:analysis}
5\def\bmu{\mbox{\boldmath $\mu$}}
6\def\bE{\mbox{\bf E}}
7\def\br{\mbox{\bf r}}
8\def\tT{\tilde{T}}
9\def\t{\tilde{1}}
10\def\ip{i\prime}
11\def\jp{j\prime}
12\def\ipp{i\prime\prime}
13\def\jpp{j\prime\prime}
14\def\etal{{\sl et al.}}
15\def\nwchem{{\bf NWChem}}
16\def\nwargos{{\bf nwargos}}
17\def\nwtop{{\bf nwtop}}
18\def\nwrst{{\bf nwrst}}
19\def\nwsgm{{\bf nwsgm}}
20\def\esp{{\bf esp}}
21\def\md{{\bf md}}
22\def\prepare{{\bf prepare}}
23\def\analysis{{\bf analysis}}
24\def\argos{{\bf ARGOS}}
25\def\amber{{\bf AMBER}}
26\def\charmm{{\bf CHARMM}}
27\def\discover{{\bf DISCOVER}}
28\def\povray{{\bf povray}}
29\def\gopenmol{{\bf gOpenMol}}
30\def\ecce{{\bf ecce}}
31
32The \analysis\ module is used to analyze molecular trajectories generated
33by the \nwchem\ molecular dynamics module, or partial charges generated
34by the \nwchem\ electrostatic potential fit module. This module should
35not de run in parallel mode.
36
37Directives for the \analysis\ module are read from an input deck,
38
39\begin{verbatim}
40analysis
41 ...
42end
43\end{verbatim}
44
45The analysis is performed  as post-analysis of trajectory files through
46using the {\rm task} directive
47
48\begin{verbatim}
49task analysis
50\end{verbatim}
51or
52\begin{verbatim}
53task analyze
54\end{verbatim}
55
56\section{System specification}
57
58\begin{verbatim}
59system <string systemid>_<string calcid>
60\end{verbatim}
61
62where the strings \verb+systemid+ and \verb+calcid+ are user defined names
63for the chemical system and the type of calculation to ber performed,
64respectively. These names are used to derive the filenames used for the
65calculation. The topoly file used will be \verb+systemid.top+, while all
66other files are named \verb+systemid_calcid.ext+.
67
68\section{Reference coordinates}
69
70Most analyses require a set of reference coordinates. These
71coordinates are read from a \nwchem\ restart file by the directive,
72
73\begin{verbatim}
74reference <string filename>
75\end{verbatim}
76
77where {\rm filename} is the name of an existing restart file.
78This input directive is required.
79
80\section{File specification}
81
82The trajectory file(s) to be analyzed are specified with
83
84\begin{verbatim}
85file <string filename> [<integer firstfile> <integer lastfile>]
86\end{verbatim}
87
88where {\rm filename} is an existing {\rm trj} trajectory file.
89If {\rm firstfile} and {\rm lastfile} are specified, the specified
90{\rm filename} needs to have a {\rm ?} wild card character that will
91be substituted by the 3-character integer number from {\rm firstfile}
92to {\rm lastfile}, and the analysis will be performed on the series
93of files.
94For example,
95
96\begin{verbatim}
97file tr_md?.trj 3 6
98\end{verbatim}
99
100will instruct the analysis to be performed on files {\it tr\_md003.trj},
101{\it tr\_md004.trj}, {\it tr\_md005.trj} and {\it tr\_md006.trj}.
102
103\par
104From the specified files the subset of frames to be analyzed is
105specified by
106
107\begin{verbatim}
108frames [<integer firstframe default 1>] <integer lastframe> \
109       [<integer frequency default 1>]
110\end{verbatim}
111
112For example, to analyze the first 100 frames from the specified
113trajectory files, use
114
115\begin{verbatim}
116frames 100
117\end{verbatim}
118
119To analyze every 10-th frame between frames 200 and 400 recorded on
120the specified trajectory files, use
121
122\begin{verbatim}
123frames 200 400 10
124\end{verbatim}
125
126A time offset can be specified with
127
128\begin{verbatim}
129time <real timoff>
130\end{verbatim}
131
132Solute coordinates of the reference set and ech subsequent frame
133read from a trajectory file are translated to have the center of
134geometry of the specified solute molecule at the center of the
135simulation box. After this translation all molecules are folded
136back into the box according to the periodic boundary conditions.
137The directive for this operation is
138
139\begin{verbatim}
140center <integer imol> [<integer jmol default imol>]
141\end{verbatim}
142
143Coordinates of each frame read from a trajectory file can be
144rotated using
145
146\begin{verbatim}
147rotate ( off | x | y | z ) <real angle units degrees>
148\end{verbatim}
149
150If \verb+center+ was defined, rotation takes place after
151the system has been centered. The \verb+rotate+ directives
152only apply to frames read from the trajectory files, and not
153to the reference coordinates. Upto 100 \verb+rotate+ directives
154can be specified, which will be carried out in the order in which
155they appear in the input deck. \verb+rotate off+ cancels all
156previously defined \verb+rotate+ directives.
157
158To perform a hydrogen bond analysis:
159
160\begin{verbatim}
161hbond [distance [[<real rhbmin default 0.0>] <real rhbmin>]] \
162      [angle [<real hbdmin> [ <real hbdmax default pi>]]] \
163      [solvent [<integer numwhb>]]
164\end{verbatim}
165
166\section{Selection}
167
168Analyses can be applied to a selection of solute atoms and solvent molecules.
169The selection is determined by
170
171\begin{verbatim}
172select ( [ super ] [ { <string atomlist> } ] |
173	solvent <real range> | save <string filename> | read <string filename> )
174\end{verbatim}
175
176where {\rm \{atomlist\}} is the set of atom names selected from the specified residues.
177By default all solute atoms are selected. When keyword \verb+super+ is specified the selecion
178applies to the superimposition option.
179
180\par
181The selected atoms are specified by the string \verb+atomlist+ which
182takes the form
183
184\begin{verbatim}
185[{isgm [ - jsgm ] [,]} [:] [{aname[,]}]
186\end{verbatim}
187where \verb+isgm+ and \verb+jsgm+ are the first and last residue numbers,
188and \verb+aname+ is an atom name. In the atomname a question mark may be
189used as a wildcard character.
190
191For example, all protein backbone atoms are selected by
192
193\begin{verbatim}
194select _N,_CA,_C
195\end{verbatim}
196
197To select the backbone atoms in residues 20 to 80 and 90 to 100 only, use
198
199\begin{verbatim}
200select 20-80,90-100:_N,_CA,_C
201\end{verbatim}
202
203This selection is reset to apply to all atoms after each file
204directive.
205
206Solvent molecules within \verb+range+ nm from any selected solute atom
207are selected by
208
209\begin{verbatim}
210select solvent <real range>
211\end{verbatim}
212
213After solvent selection, the solute atom selection is reset to being all
214selected.
215
216The current selection can be saved to, or read from a file using the
217\verb+save+ and \verb+read+ keywords, respectively.
218
219\par
220
221Some analysis are performed on groups of atoms. These groups of atoms
222are defined by
223
224\begin{verbatim}
225define <integer igroup> [<real rsel>] [solvent] { <string atomlist> }
226\end{verbatim}
227
228The string atom in this definitions again takes the form
229
230\begin{verbatim}
231[{isgm [ - jsgm ] [,]} [:] [{aname[,]}]
232\end{verbatim}
233where \verb+isgm+ and \verb+jsgm+ are the first and last residue numbers,
234and \verb+aname+ is an atom name. In the atomname a question mark may be
235used as a wildcard character.
236
237Multiple define directive can be used to define a single set of atoms.
238
239\section{Coordinate analysis}
240
241To analyze the root mean square deviation from the specified reference
242coordinates:
243
244\begin{verbatim}
245rmsd
246\end{verbatim}
247
248To analyze protein $\phi$-$\psi$ and backbone hydrogen bonding:
249
250\begin{verbatim}
251ramachandran
252\end{verbatim}
253
254To define a distance:
255
256\begin{verbatim}
257distance <integer ibond> <string atomi> <string atomj>
258\end{verbatim}
259
260To define an angle:
261
262\begin{verbatim}
263angle <integer iangle> <string atomi> <string atomj> <string atomk>
264\end{verbatim}
265
266To define a torsion:
267
268\begin{verbatim}
269torsion <integer itorsion> <string atomi> <string atomj> \
270                       <string atomk> <string atoml>
271\end{verbatim}
272
273To define a vector:
274
275\begin{verbatim}
276vector <integer ivector> <string atomi> <string atomj>
277\end{verbatim}
278
279The atom string in these definitions takes the form
280
281\begin{verbatim}
282<integer segment>:<string atomname> | w<integer molecule>:<string atomname>
283\end{verbatim}
284
285for solute and solvent atom specification, respectively.
286
287To define charge distribution in z-direction:
288
289\begin{verbatim}
290charge_distribution <integer bins>
291\end{verbatim}
292
293Analyses on atoms in a predefined group are specified by
294
295\begin{verbatim}
296group [<integer igroup> [periodic <integer ipbc>] \
297      ( local [<real rsel default 0.0>] [<real rval default rsel>]
298        <string function> )
299\end{verbatim}
300where \verb+igroup+ specifies the group of atoms defined with a
301\verb+define+ directive. Keyword \verb+periodic+ can be used to
302specify the periodicity, \verb+ipbc=1+ for periodicity in \verb+z+,
303\verb+ipbc=2+ for periodicity in \verb+x+ and \verb+y+, and
304\verb+ipbc=3+ for periodicity in \verb+x+, \verb+y+ and \verb+z+.
305Currently the only option is \verb+local+ which prints all selected
306solute atom with a distance between \verb+rsel+ and \verb+rval+ from
307the atoms defined in \verb+igroup+. The actual analysis is done by the
308\verb+scan+ deirective. A formatted report is printed from
309\verb+group+ analyses using
310
311\begin{verbatim}
312report <string filename> local
313\end{verbatim}
314
315Analyses on pairs of atoms in predefined groups are specified by
316
317\begin{verbatim}
318groups [<integer igroup> [<integer jgroup>]] [periodic [<integer ipbc default 3>]] \
319       <string function> [<real value1> [<real value2>]] [<string filename>]
320\end{verbatim}
321
322where $igroup$ and $jgroup$ are groups of atoms defined with a
323\verb+define+ directive. Keyword \verb+periodic+ specifies that
324periodic boundary conditions need to be applied in $ipbc$ dimensions.
325The type of analysis is define by $function$, $value1$ and $value2$.
326If $filename$ is specified, the analysis is applied to the reference
327coordinates and written to the specified file. If no filename is
328given, the analysis is applied to the specified trajectory and
329performed as part of the \verb+scan+ directive.
330Implemented analyses defined by
331\verb+<string function> [<real value1> [<real value2>]]+ include\\
332\\
333\verb+distance+ to calculate the distance between the centers of geometry of the
334two specified groups of atoms, and\\
335\verb+distances+ to calculate all atomic distances between atoms
336in the specified groups that lie between $value1$ and $value2$.
337
338
339Coordinate histograms are specified by
340
341\begin{verbatim}
342histogram <integer idef> [<integer length>] zcoordinate <string filename>
343\end{verbatim}
344
345where $idef$ is the atom group definition number, $length$ is the size
346of the histogram, \verb+zcoordinate+ is the currently only histogram option,
347and $filename$ is the filname to which the histogram is written.
348
349Order parameters are evalated using
350
351\begin{verbatim}
352order <integer isel> <integer jsel> <string atomi> <string atomj>
353\end{verbatim}
354This is an experimental feature.
355
356To write the average coordinates of a trajectory
357
358\begin{verbatim}
359average [super] <string filename>
360\end{verbatim}
361
362
363To perform the coordinate analysis:
364
365\begin{verbatim}
366scan [ super ] <string filename>
367\end{verbatim}
368
369which will create, depending on the specified analysis options
370files filename.rms and filename.ana. After the scan directive
371previously defined coordinate analysis options are all reset.
372Optional keyword \verb+super+ specifies that frames read from
373the trajectory file(s) are superimposed to the reference structure
374before the analysis is performed.
375
376\section{Essential dynamics analysis}
377
378Essential dynamics analysis is performed by
379
380\begin{verbatim}
381essential
382\end{verbatim}
383
384This can be followed by one or more
385
386\begin{verbatim}
387project <integer vector> <string filename>
388\end{verbatim}
389
390to project the trajectory onto the specified vector. This will
391create files filename with extensions frm or trj, val, vec, \_min.pdb
392and \_max.pdb, with the projected trajectory, the projection
393value, the eigenvector, and the minimum and maximum projection
394structure.
395
396For example, an essential dynamics analysis with projection onto
397the first vector generating files firstvec.\{trj, val, vec, \_min.pdb, \_max.pdb\}
398is generated by
399
400\begin{verbatim}
401essential
402project 1 firstvec
403\end{verbatim}
404
405\section{Trajectory format conversion}
406
407To write a single frame in PDB or XYZ format, use
408
409\begin{verbatim}
410write [<integer number default 1>] [super] [solute] <string filename>
411\end{verbatim}
412
413To copy the selected frames from the specified trejctory file(s),
414onto a new file, use
415
416\begin{verbatim}
417copy  [solute] [rotate <real tangle>] <string filename>
418\end{verbatim}
419
420To superimpose the selected atoms for each specified frame to the
421reference coordinates before copying onto a new file, use
422
423\begin{verbatim}
424super [solute] [rotate <real tangle>] <string filename>
425\end{verbatim}
426
427The \verb+rotate+ directive specifies that the structure will make
428a full ratation every tangle ps. This directive only has effect when
429writing povray files.
430
431The format of the new file is determined from the extension, which
432can be one of
433
434\begin{tabular}{rl}
435amb & \amber\ formatted trajectory file (obsolete)\\
436arc & \discover\ archive file\\
437bam & \amber\ unformatted trajectory file\\
438crd & \amber\ formatted trajectory file\\
439dcd & \charmm\ formatted trajectory file\\
440esp & \gopenmol\ formatted electrostatic potential files\\
441frm & \ecce\ frames file (obsolete)\\
442pov & \povray\ input files\\
443trj & \nwchem\ trajectory file\\
444\end{tabular}
445
446If no extension is specified, a {\rm trj} formatted file will be written.
447
448A special tag can be added to {\rm frm} and {\rm pov} formatted files  using
449
450\begin{verbatim}
451label <integer itag> <string tag>  [ <real rval default 1.0> ] \\
452    [ <integer iatag> [ <integer jatag default iatag> ] [ <real rtag default 0.0> ] ]
453    [ <string anam> ]
454\end{verbatim}
455
456where tag number $itag$ is set to the string $tag$ for all atoms
457anam within a distance $rtag$ from segments $iatag$ through $jatag$.
458A question mark can be used in anam as a wild card character.
459\par
460
461Atom rendering is specified using
462
463\begin{verbatim}
464render ( cpk | stick )  [ <real rval default 1.0> ] \\
465    [ <integer iatag> [ <integer jatag default iatag> ] [ <real rtag default 0.0> ] ]
466    [ <string anam> ]
467\end{verbatim}
468
469for all atoms anam within a distance $rtag$ from segments $iatag$ through $jatag$,
470and a scaling factor of $rval$. A question mark can be used in anam as a wild card
471character.
472\par
473
474Atom color is specified using
475
476\begin{verbatim}
477color ( <string color> | atom ) \\
478    [ <integer iatag> [ <integer jatag default iatag> ] [ <real rtag default 0.0> ] ]
479    [ <string anam> ]
480\end{verbatim}
481
482for all atoms anam within a distance $rtag$ from segments $iatag$ through $jatag$.
483A question mark can be used in anam as a wild card character.
484\par
485For example, to display all carbon atoms in segments 34 through 45
486in green and rendered cpk in povray files can be specified with
487
488\begin{verbatim}
489render cpk 34 45 _C??
490color green 34 45 _C??
491\end{verbatim}
492
493Coordinates written to a pov file can be scaled using
494
495\begin{verbatim}
496scale <real factor>
497\end{verbatim}
498
499A zero or negative scaling factor will scale the coordinates to
500lie within [-1,1] in all dimensions.
501\par
502The cpk rendering in povray files can be scaled by
503
504\begin{verbatim}
505cpk <real factor default 1.0>
506\end{verbatim}
507\par
508
509The stick rendering in povray files can be scaled by
510
511\begin{verbatim}
512stick <real factor default 1.0>
513\end{verbatim}
514
515The initial sequence number of esp related files is defined by
516
517\begin{verbatim}
518index <integer index default 1>
519\end{verbatim}
520
521A sequence of trajectory files with unequal lengths can be converted to files
522with all $nclean$ frames using
523
524\begin{verbatim}
525clean <integer nclean>
526\end{verbatim}
527
528\section{Electrostatic potentials}
529
530A file in plt format of the electrostatic potential resulting
531from partial charges generated by the ESP module is generated
532by the command
533
534\begin{verbatim}
535esp  [ <integer spacing default 10> ] \
536     [ <real rcut default 1.0> ] [periodic [<integer iper default 3>]] \
537     [ <string xfile> [ <string pltfile> ] ]
538\end{verbatim}
539
540The input coordinates are taken from the {\rm xyzq} file that can
541be generated from a {\rm rst} by the prepare module. Parameter
542spacing specifies the number of gridpoints per nm, rcut specifies
543extent of the charge grid beyond the molecule.
544Periodic boundaries will be used if \verb+periodic+
545is specified. If \verb+iper+ is set to 2, periodic boundary
546conditions are applied in x and y dimensions only. If \verb+periodic+
547is specified, a negative value of \verb+rcut+ will extend the grid
548in the periodic dimensions by abs(\verb+rcut+), otherwise this value
549will be ignored in the periodic dimensions.
550The resulting {\rm plt} formatted file pltfile can be
551viewed with the gOpenMol program. The resulting electrostatic
552potential grid is in units of kJ\ mol$^{-1}$e$^{-1}$.
553If no files are specified, only the parameters are set. This
554analysis applies to solute(s) only.
555
556The electrostatic potential at specific point are evaluated using
557
558\begin{verbatim}
559esp_points [<string filpin> [<string filhol> [<string filpou> [<string filavg>]]]]
560\end{verbatim}