1% 2% $Id$ 3% 4\label{sec:analysis} 5\def\bmu{\mbox{\boldmath $\mu$}} 6\def\bE{\mbox{\bf E}} 7\def\br{\mbox{\bf r}} 8\def\tT{\tilde{T}} 9\def\t{\tilde{1}} 10\def\ip{i\prime} 11\def\jp{j\prime} 12\def\ipp{i\prime\prime} 13\def\jpp{j\prime\prime} 14\def\etal{{\sl et al.}} 15\def\nwchem{{\bf NWChem}} 16\def\nwargos{{\bf nwargos}} 17\def\nwtop{{\bf nwtop}} 18\def\nwrst{{\bf nwrst}} 19\def\nwsgm{{\bf nwsgm}} 20\def\esp{{\bf esp}} 21\def\md{{\bf md}} 22\def\prepare{{\bf prepare}} 23\def\analysis{{\bf analysis}} 24\def\argos{{\bf ARGOS}} 25\def\amber{{\bf AMBER}} 26\def\charmm{{\bf CHARMM}} 27\def\discover{{\bf DISCOVER}} 28\def\povray{{\bf povray}} 29\def\gopenmol{{\bf gOpenMol}} 30\def\ecce{{\bf ecce}} 31 32The \analysis\ module is used to analyze molecular trajectories generated 33by the \nwchem\ molecular dynamics module, or partial charges generated 34by the \nwchem\ electrostatic potential fit module. This module should 35not de run in parallel mode. 36 37Directives for the \analysis\ module are read from an input deck, 38 39\begin{verbatim} 40analysis 41 ... 42end 43\end{verbatim} 44 45The analysis is performed as post-analysis of trajectory files through 46using the {\rm task} directive 47 48\begin{verbatim} 49task analysis 50\end{verbatim} 51or 52\begin{verbatim} 53task analyze 54\end{verbatim} 55 56\section{System specification} 57 58\begin{verbatim} 59system <string systemid>_<string calcid> 60\end{verbatim} 61 62where the strings \verb+systemid+ and \verb+calcid+ are user defined names 63for the chemical system and the type of calculation to ber performed, 64respectively. These names are used to derive the filenames used for the 65calculation. The topoly file used will be \verb+systemid.top+, while all 66other files are named \verb+systemid_calcid.ext+. 67 68\section{Reference coordinates} 69 70Most analyses require a set of reference coordinates. These 71coordinates are read from a \nwchem\ restart file by the directive, 72 73\begin{verbatim} 74reference <string filename> 75\end{verbatim} 76 77where {\rm filename} is the name of an existing restart file. 78This input directive is required. 79 80\section{File specification} 81 82The trajectory file(s) to be analyzed are specified with 83 84\begin{verbatim} 85file <string filename> [<integer firstfile> <integer lastfile>] 86\end{verbatim} 87 88where {\rm filename} is an existing {\rm trj} trajectory file. 89If {\rm firstfile} and {\rm lastfile} are specified, the specified 90{\rm filename} needs to have a {\rm ?} wild card character that will 91be substituted by the 3-character integer number from {\rm firstfile} 92to {\rm lastfile}, and the analysis will be performed on the series 93of files. 94For example, 95 96\begin{verbatim} 97file tr_md?.trj 3 6 98\end{verbatim} 99 100will instruct the analysis to be performed on files {\it tr\_md003.trj}, 101{\it tr\_md004.trj}, {\it tr\_md005.trj} and {\it tr\_md006.trj}. 102 103\par 104From the specified files the subset of frames to be analyzed is 105specified by 106 107\begin{verbatim} 108frames [<integer firstframe default 1>] <integer lastframe> \ 109 [<integer frequency default 1>] 110\end{verbatim} 111 112For example, to analyze the first 100 frames from the specified 113trajectory files, use 114 115\begin{verbatim} 116frames 100 117\end{verbatim} 118 119To analyze every 10-th frame between frames 200 and 400 recorded on 120the specified trajectory files, use 121 122\begin{verbatim} 123frames 200 400 10 124\end{verbatim} 125 126A time offset can be specified with 127 128\begin{verbatim} 129time <real timoff> 130\end{verbatim} 131 132Solute coordinates of the reference set and ech subsequent frame 133read from a trajectory file are translated to have the center of 134geometry of the specified solute molecule at the center of the 135simulation box. After this translation all molecules are folded 136back into the box according to the periodic boundary conditions. 137The directive for this operation is 138 139\begin{verbatim} 140center <integer imol> [<integer jmol default imol>] 141\end{verbatim} 142 143Coordinates of each frame read from a trajectory file can be 144rotated using 145 146\begin{verbatim} 147rotate ( off | x | y | z ) <real angle units degrees> 148\end{verbatim} 149 150If \verb+center+ was defined, rotation takes place after 151the system has been centered. The \verb+rotate+ directives 152only apply to frames read from the trajectory files, and not 153to the reference coordinates. Upto 100 \verb+rotate+ directives 154can be specified, which will be carried out in the order in which 155they appear in the input deck. \verb+rotate off+ cancels all 156previously defined \verb+rotate+ directives. 157 158To perform a hydrogen bond analysis: 159 160\begin{verbatim} 161hbond [distance [[<real rhbmin default 0.0>] <real rhbmin>]] \ 162 [angle [<real hbdmin> [ <real hbdmax default pi>]]] \ 163 [solvent [<integer numwhb>]] 164\end{verbatim} 165 166\section{Selection} 167 168Analyses can be applied to a selection of solute atoms and solvent molecules. 169The selection is determined by 170 171\begin{verbatim} 172select ( [ super ] [ { <string atomlist> } ] | 173 solvent <real range> | save <string filename> | read <string filename> ) 174\end{verbatim} 175 176where {\rm \{atomlist\}} is the set of atom names selected from the specified residues. 177By default all solute atoms are selected. When keyword \verb+super+ is specified the selecion 178applies to the superimposition option. 179 180\par 181The selected atoms are specified by the string \verb+atomlist+ which 182takes the form 183 184\begin{verbatim} 185[{isgm [ - jsgm ] [,]} [:] [{aname[,]}] 186\end{verbatim} 187where \verb+isgm+ and \verb+jsgm+ are the first and last residue numbers, 188and \verb+aname+ is an atom name. In the atomname a question mark may be 189used as a wildcard character. 190 191For example, all protein backbone atoms are selected by 192 193\begin{verbatim} 194select _N,_CA,_C 195\end{verbatim} 196 197To select the backbone atoms in residues 20 to 80 and 90 to 100 only, use 198 199\begin{verbatim} 200select 20-80,90-100:_N,_CA,_C 201\end{verbatim} 202 203This selection is reset to apply to all atoms after each file 204directive. 205 206Solvent molecules within \verb+range+ nm from any selected solute atom 207are selected by 208 209\begin{verbatim} 210select solvent <real range> 211\end{verbatim} 212 213After solvent selection, the solute atom selection is reset to being all 214selected. 215 216The current selection can be saved to, or read from a file using the 217\verb+save+ and \verb+read+ keywords, respectively. 218 219\par 220 221Some analysis are performed on groups of atoms. These groups of atoms 222are defined by 223 224\begin{verbatim} 225define <integer igroup> [<real rsel>] [solvent] { <string atomlist> } 226\end{verbatim} 227 228The string atom in this definitions again takes the form 229 230\begin{verbatim} 231[{isgm [ - jsgm ] [,]} [:] [{aname[,]}] 232\end{verbatim} 233where \verb+isgm+ and \verb+jsgm+ are the first and last residue numbers, 234and \verb+aname+ is an atom name. In the atomname a question mark may be 235used as a wildcard character. 236 237Multiple define directive can be used to define a single set of atoms. 238 239\section{Coordinate analysis} 240 241To analyze the root mean square deviation from the specified reference 242coordinates: 243 244\begin{verbatim} 245rmsd 246\end{verbatim} 247 248To analyze protein $\phi$-$\psi$ and backbone hydrogen bonding: 249 250\begin{verbatim} 251ramachandran 252\end{verbatim} 253 254To define a distance: 255 256\begin{verbatim} 257distance <integer ibond> <string atomi> <string atomj> 258\end{verbatim} 259 260To define an angle: 261 262\begin{verbatim} 263angle <integer iangle> <string atomi> <string atomj> <string atomk> 264\end{verbatim} 265 266To define a torsion: 267 268\begin{verbatim} 269torsion <integer itorsion> <string atomi> <string atomj> \ 270 <string atomk> <string atoml> 271\end{verbatim} 272 273To define a vector: 274 275\begin{verbatim} 276vector <integer ivector> <string atomi> <string atomj> 277\end{verbatim} 278 279The atom string in these definitions takes the form 280 281\begin{verbatim} 282<integer segment>:<string atomname> | w<integer molecule>:<string atomname> 283\end{verbatim} 284 285for solute and solvent atom specification, respectively. 286 287To define charge distribution in z-direction: 288 289\begin{verbatim} 290charge_distribution <integer bins> 291\end{verbatim} 292 293Analyses on atoms in a predefined group are specified by 294 295\begin{verbatim} 296group [<integer igroup> [periodic <integer ipbc>] \ 297 ( local [<real rsel default 0.0>] [<real rval default rsel>] 298 <string function> ) 299\end{verbatim} 300where \verb+igroup+ specifies the group of atoms defined with a 301\verb+define+ directive. Keyword \verb+periodic+ can be used to 302specify the periodicity, \verb+ipbc=1+ for periodicity in \verb+z+, 303\verb+ipbc=2+ for periodicity in \verb+x+ and \verb+y+, and 304\verb+ipbc=3+ for periodicity in \verb+x+, \verb+y+ and \verb+z+. 305Currently the only option is \verb+local+ which prints all selected 306solute atom with a distance between \verb+rsel+ and \verb+rval+ from 307the atoms defined in \verb+igroup+. The actual analysis is done by the 308\verb+scan+ deirective. A formatted report is printed from 309\verb+group+ analyses using 310 311\begin{verbatim} 312report <string filename> local 313\end{verbatim} 314 315Analyses on pairs of atoms in predefined groups are specified by 316 317\begin{verbatim} 318groups [<integer igroup> [<integer jgroup>]] [periodic [<integer ipbc default 3>]] \ 319 <string function> [<real value1> [<real value2>]] [<string filename>] 320\end{verbatim} 321 322where $igroup$ and $jgroup$ are groups of atoms defined with a 323\verb+define+ directive. Keyword \verb+periodic+ specifies that 324periodic boundary conditions need to be applied in $ipbc$ dimensions. 325The type of analysis is define by $function$, $value1$ and $value2$. 326If $filename$ is specified, the analysis is applied to the reference 327coordinates and written to the specified file. If no filename is 328given, the analysis is applied to the specified trajectory and 329performed as part of the \verb+scan+ directive. 330Implemented analyses defined by 331\verb+<string function> [<real value1> [<real value2>]]+ include\\ 332\\ 333\verb+distance+ to calculate the distance between the centers of geometry of the 334two specified groups of atoms, and\\ 335\verb+distances+ to calculate all atomic distances between atoms 336in the specified groups that lie between $value1$ and $value2$. 337 338 339Coordinate histograms are specified by 340 341\begin{verbatim} 342histogram <integer idef> [<integer length>] zcoordinate <string filename> 343\end{verbatim} 344 345where $idef$ is the atom group definition number, $length$ is the size 346of the histogram, \verb+zcoordinate+ is the currently only histogram option, 347and $filename$ is the filname to which the histogram is written. 348 349Order parameters are evalated using 350 351\begin{verbatim} 352order <integer isel> <integer jsel> <string atomi> <string atomj> 353\end{verbatim} 354This is an experimental feature. 355 356To write the average coordinates of a trajectory 357 358\begin{verbatim} 359average [super] <string filename> 360\end{verbatim} 361 362 363To perform the coordinate analysis: 364 365\begin{verbatim} 366scan [ super ] <string filename> 367\end{verbatim} 368 369which will create, depending on the specified analysis options 370files filename.rms and filename.ana. After the scan directive 371previously defined coordinate analysis options are all reset. 372Optional keyword \verb+super+ specifies that frames read from 373the trajectory file(s) are superimposed to the reference structure 374before the analysis is performed. 375 376\section{Essential dynamics analysis} 377 378Essential dynamics analysis is performed by 379 380\begin{verbatim} 381essential 382\end{verbatim} 383 384This can be followed by one or more 385 386\begin{verbatim} 387project <integer vector> <string filename> 388\end{verbatim} 389 390to project the trajectory onto the specified vector. This will 391create files filename with extensions frm or trj, val, vec, \_min.pdb 392and \_max.pdb, with the projected trajectory, the projection 393value, the eigenvector, and the minimum and maximum projection 394structure. 395 396For example, an essential dynamics analysis with projection onto 397the first vector generating files firstvec.\{trj, val, vec, \_min.pdb, \_max.pdb\} 398is generated by 399 400\begin{verbatim} 401essential 402project 1 firstvec 403\end{verbatim} 404 405\section{Trajectory format conversion} 406 407To write a single frame in PDB or XYZ format, use 408 409\begin{verbatim} 410write [<integer number default 1>] [super] [solute] <string filename> 411\end{verbatim} 412 413To copy the selected frames from the specified trejctory file(s), 414onto a new file, use 415 416\begin{verbatim} 417copy [solute] [rotate <real tangle>] <string filename> 418\end{verbatim} 419 420To superimpose the selected atoms for each specified frame to the 421reference coordinates before copying onto a new file, use 422 423\begin{verbatim} 424super [solute] [rotate <real tangle>] <string filename> 425\end{verbatim} 426 427The \verb+rotate+ directive specifies that the structure will make 428a full ratation every tangle ps. This directive only has effect when 429writing povray files. 430 431The format of the new file is determined from the extension, which 432can be one of 433 434\begin{tabular}{rl} 435amb & \amber\ formatted trajectory file (obsolete)\\ 436arc & \discover\ archive file\\ 437bam & \amber\ unformatted trajectory file\\ 438crd & \amber\ formatted trajectory file\\ 439dcd & \charmm\ formatted trajectory file\\ 440esp & \gopenmol\ formatted electrostatic potential files\\ 441frm & \ecce\ frames file (obsolete)\\ 442pov & \povray\ input files\\ 443trj & \nwchem\ trajectory file\\ 444\end{tabular} 445 446If no extension is specified, a {\rm trj} formatted file will be written. 447 448A special tag can be added to {\rm frm} and {\rm pov} formatted files using 449 450\begin{verbatim} 451label <integer itag> <string tag> [ <real rval default 1.0> ] \\ 452 [ <integer iatag> [ <integer jatag default iatag> ] [ <real rtag default 0.0> ] ] 453 [ <string anam> ] 454\end{verbatim} 455 456where tag number $itag$ is set to the string $tag$ for all atoms 457anam within a distance $rtag$ from segments $iatag$ through $jatag$. 458A question mark can be used in anam as a wild card character. 459\par 460 461Atom rendering is specified using 462 463\begin{verbatim} 464render ( cpk | stick ) [ <real rval default 1.0> ] \\ 465 [ <integer iatag> [ <integer jatag default iatag> ] [ <real rtag default 0.0> ] ] 466 [ <string anam> ] 467\end{verbatim} 468 469for all atoms anam within a distance $rtag$ from segments $iatag$ through $jatag$, 470and a scaling factor of $rval$. A question mark can be used in anam as a wild card 471character. 472\par 473 474Atom color is specified using 475 476\begin{verbatim} 477color ( <string color> | atom ) \\ 478 [ <integer iatag> [ <integer jatag default iatag> ] [ <real rtag default 0.0> ] ] 479 [ <string anam> ] 480\end{verbatim} 481 482for all atoms anam within a distance $rtag$ from segments $iatag$ through $jatag$. 483A question mark can be used in anam as a wild card character. 484\par 485For example, to display all carbon atoms in segments 34 through 45 486in green and rendered cpk in povray files can be specified with 487 488\begin{verbatim} 489render cpk 34 45 _C?? 490color green 34 45 _C?? 491\end{verbatim} 492 493Coordinates written to a pov file can be scaled using 494 495\begin{verbatim} 496scale <real factor> 497\end{verbatim} 498 499A zero or negative scaling factor will scale the coordinates to 500lie within [-1,1] in all dimensions. 501\par 502The cpk rendering in povray files can be scaled by 503 504\begin{verbatim} 505cpk <real factor default 1.0> 506\end{verbatim} 507\par 508 509The stick rendering in povray files can be scaled by 510 511\begin{verbatim} 512stick <real factor default 1.0> 513\end{verbatim} 514 515The initial sequence number of esp related files is defined by 516 517\begin{verbatim} 518index <integer index default 1> 519\end{verbatim} 520 521A sequence of trajectory files with unequal lengths can be converted to files 522with all $nclean$ frames using 523 524\begin{verbatim} 525clean <integer nclean> 526\end{verbatim} 527 528\section{Electrostatic potentials} 529 530A file in plt format of the electrostatic potential resulting 531from partial charges generated by the ESP module is generated 532by the command 533 534\begin{verbatim} 535esp [ <integer spacing default 10> ] \ 536 [ <real rcut default 1.0> ] [periodic [<integer iper default 3>]] \ 537 [ <string xfile> [ <string pltfile> ] ] 538\end{verbatim} 539 540The input coordinates are taken from the {\rm xyzq} file that can 541be generated from a {\rm rst} by the prepare module. Parameter 542spacing specifies the number of gridpoints per nm, rcut specifies 543extent of the charge grid beyond the molecule. 544Periodic boundaries will be used if \verb+periodic+ 545is specified. If \verb+iper+ is set to 2, periodic boundary 546conditions are applied in x and y dimensions only. If \verb+periodic+ 547is specified, a negative value of \verb+rcut+ will extend the grid 548in the periodic dimensions by abs(\verb+rcut+), otherwise this value 549will be ignored in the periodic dimensions. 550The resulting {\rm plt} formatted file pltfile can be 551viewed with the gOpenMol program. The resulting electrostatic 552potential grid is in units of kJ\ mol$^{-1}$e$^{-1}$. 553If no files are specified, only the parameters are set. This 554analysis applies to solute(s) only. 555 556The electrostatic potential at specific point are evaluated using 557 558\begin{verbatim} 559esp_points [<string filpin> [<string filhol> [<string filpou> [<string filavg>]]]] 560\end{verbatim}