Name             Date         Size      #Lines  LOC
Makefile.proto   03-Nov-2020  7.6 KiB      331  259
README           03-Nov-2020  5.4 KiB      106   86
blkdat120lin.f   03-Nov-2020  3.1 KiB       60   49
blkdat15.f       03-Nov-2020  768           21   10
blkdat240lin.f   03-Nov-2020  5.7 KiB      102   91
blkdat30.f       03-Nov-2020  1.1 KiB       27   16
blkdat60.f       03-Nov-2020  1.9 KiB       39   28
blkdat60lin.f    03-Nov-2020  1.8 KiB       39   28
cscf.h           03-Nov-2020  1.7 KiB       43   40
cscf120lin.h     03-Nov-2020  1.7 KiB       43   40
cscf15.h         03-Nov-2020  1.7 KiB       43   40
cscf240lin.h     03-Nov-2020  1.7 KiB       43   40
cscf30.h         03-Nov-2020  1.7 KiB       43   40
cscf60.h         03-Nov-2020  1.7 KiB       43   40
cscf60lin.h      03-Nov-2020  1.7 KiB       43   40
daxpy.f          03-Nov-2020  1.2 KiB       49   31
daxpy1.s         03-Nov-2020  3.9 KiB      145  106
ddot.f           03-Nov-2020  1.2 KiB       50   32
demo.proto       03-Nov-2020  2.4 KiB      131   87
diagon.f         03-Nov-2020  6.4 KiB      266  220
dscal.f          03-Nov-2020  939           43   26
fexit.f.proto    03-Nov-2020  297           14   13
getmem.c         03-Nov-2020  873           36   20
grid.c           03-Nov-2020  22.4 KiB     830  597
ieeetrap.c       03-Nov-2020  506           25   19
integ.f          03-Nov-2020  2.9 KiB      114   90
jacobi.f         03-Nov-2020  6.4 KiB      225  139
mc.f             03-Nov-2020  5.3 KiB      191  112
mc.input         03-Nov-2020  10             3    2
md.f             03-Nov-2020  15.4 KiB     518  374
mxv_daxpy1.f     03-Nov-2020  801           26   15
mxv_dgemv.f      03-Nov-2020  753           24    6
mxv_fortran.f    03-Nov-2020  723           24   16
output.f         03-Nov-2020  2.7 KiB       66   39
prtri.f          03-Nov-2020  757           32   26
random.c         03-Nov-2020  258           12    9
runit            03-Nov-2020  403           18   12
runit.grid       03-Nov-2020  209            9    6
scf.f            03-Nov-2020  19.1 KiB     606  422
scfblas.f        03-Nov-2020  24.5 KiB     877  364
timer.f          03-Nov-2020  562           23    7
trace.out        03-Nov-2020  309            8    7
xpix.shar        03-Nov-2020  16.4 KiB     610  604

README

TCGMSG Examples
===============

These example programs are realistic (?) models of actual applications or
algorithms from chemical physics.  They should make cleanly once the Makefile
has been appropriately modified (which is done automatically for all supported
machines).  Serial and shared-memory parallel (and possibly CM and Linda)
versions are also available but not included here.

The programs may be run using the csh script demo in this directory.  The
script takes a single argument which is the name of the desired demo (scf,
md, mc, jacobi, grid).  The script uses a template PROCGRP file (template.p)
to generate the actual PROCGRP file used ... it makes a default file if one
does not exist ... look in that for details.
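
For example, a short SCF run is started with

    demo scf

the single argument being one of the demo names listed above.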

Self Consistent Field (scf)
---------------------------

This SCF code is a cleaned up and much enhanced version of the one in Szabo
and Ostlund.  It uses distributed primitive 1s gaussian functions as a basis
(thus emulating use of s,p,... functions) and computes integrals to
essentially full accuracy.  It is a direct SCF (integrals are computed each
iteration, using the Schwarz inequality for screening).  An atomic density is
used for a starting guess.  Damping and level shifting are used to aid
convergence.
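
(The Schwarz inequality referred to here is the bound
|(ij|kl)| <= sqrt((ij|ij)) * sqrt((kl|kl)); since the diagonal quantities can
be precomputed, a cheap test identifies whole batches of negligible integrals
before they are ever evaluated, which is what makes the direct approach
practical.)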

Rather than complicate the program with code for parsing input, the include
file 'cscf.h' and block data file 'blkdata.f' contain all the data, and thus
there are three versions, one for each of the available problem sizes.  The
three sizes correspond to 15 basis functions (Be), 30 basis functions (Be2)
and 60 basis functions (tetrahedral Be4).

[In addition to these three cases there are files for 60, 120 and 240
functions, which are not built by default (type 'make extra' for
these).  These are 4, 8 and 16 Be atoms, respectively, arranged in a line.]

The O(N**4) step has been parallelized with the assumption that each process
can hold all of the density and Fock matrices, which is reasonable for up to
O(1000) basis functions on most workstation networks and many MIMD machines
(e.g. iPSC-i860).  The work is dynamically load-balanced, with tasks
comprising 10 sets of integrals (ij|**) (see TWOEL() and NXTASK() in scf.f).
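
Dynamic load balancing of this sort is typically driven by a shared task
counter: whichever process asks next gets the next chunk of work.  A minimal
C sketch of the pattern (not the actual Fortran in scf.f; next_task() and
do_ij_block() are hypothetical stand-ins for whatever NXTASK() and TWOEL()
actually do) is:

    #define NCHUNK 10                   /* integral sets per task (per the text above) */

    extern long next_task(void);        /* assumed: returns 0,1,2,..., each value to one process only */
    extern void do_ij_block(long ij);   /* assumed: compute (ij|**) and update the Fock matrix */

    void twoel_sketch(long npair)
    {
        long mine = next_task();
        for (long task = 0; task * NCHUNK < npair; task++) {
            if (task != mine)
                continue;                        /* another process owns this chunk */
            long hi = (task + 1) * NCHUNK;
            if (hi > npair) hi = npair;
            for (long ij = task * NCHUNK; ij < hi; ij++)
                do_ij_block(ij);                 /* Schwarz screening happens in here */
            mine = next_task();                  /* claim the next available chunk */
        }
    }

Handing out work in chunks of 10 (ij|**) sets keeps the counter traffic low
while still letting faster processes take on more of the work.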

The O(N**3) work has not been parallelized, but it has been optimized to use
BLAS and a tweaked Jacobi diagonalizer with dynamic threshold selection.

Molecular Dynamics (md)
-----------------------

This program bounces a few thousand argon atoms around in a box with periodic
boundary conditions.  Pairwise interactions (Lennard-Jones) are used with a
simple integration of the Newtonian equations of motion.  This program is
derived from the serial code of Dieter Heermann, but many modifications have
been made.  Prof. Frank Harris has a related FORTRAN 9X Connection Machine
version.

The O(N) work constructing the forces has been parallelized, as has the
computation of the pair distribution function.  The neighbour list is computed
in parallel every 20 steps with a simple static decomposition.  This then
drives the parallelization of the forces computation.  To make the simulation
bigger, increase the value of mm in the parameter statement at the top of md.f
(mm=8 gives 2048 particles, mm=13 gives 8878).  Each particle interacts with
about 80 others, and the neighbour list is computed for about 130 neighbours
to allow for movement before it is updated.
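
In outline, the force evaluation over the neighbour list looks like the C
sketch below (hypothetical data layout; the cutoff, periodic images and unit
conventions of md.f are omitted, and each process would run this loop only
over its statically assigned slice of the pair list):

    typedef struct { double x, y, z; } vec;

    /* Lennard-Jones forces accumulated over a precomputed list of pairs (i,j). */
    void lj_forces(int npair, const int (*pair)[2], const vec *r, vec *f,
                   double eps, double sigma)
    {
        double s6 = sigma*sigma*sigma*sigma*sigma*sigma;   /* sigma^6 */
        for (int p = 0; p < npair; p++) {
            int i = pair[p][0], j = pair[p][1];
            double dx = r[i].x - r[j].x;
            double dy = r[i].y - r[j].y;
            double dz = r[i].z - r[j].z;
            double r2 = dx*dx + dy*dy + dz*dz;
            double ir2 = 1.0 / r2;
            double ir6 = ir2*ir2*ir2 * s6;                 /* (sigma/r)^6 */
            /* differentiating 4*eps*((sigma/r)^12 - (sigma/r)^6) gives this radial factor */
            double fr = 24.0 * eps * ir6 * (2.0*ir6 - 1.0) * ir2;
            f[i].x += fr*dx;  f[i].y += fr*dy;  f[i].z += fr*dz;
            f[j].x -= fr*dx;  f[j].y -= fr*dy;  f[j].z -= fr*dz;
        }
    }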

Monte Carlo (mc)
----------------

This code evaluates the energy of the simplest explicitly correlated
electronic wavefunction for the He atom ground state using a variational
Monte Carlo method without importance sampling.  It is boringly parallel and
for realistic problem sizes gives completely linear speed-ups on several
hundred processes.  You have to give it the no. of moves to equilibrate the
system for (neq) and the no. of moves to compute averages over (nstep).
Appropriate values for a very short run are 200 and 500 respectively.
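
One common way to organise such a calculation (not necessarily the exact
scheme used in mc.f) is a Metropolis walk that samples |psi|^2 and averages
the local energy.  Sketched in C, with psi2() and local_energy() as
hypothetical placeholders for the trial wavefunction actually used:

    #include <stdlib.h>

    extern double psi2(const double r[6]);          /* assumed: |psi(r1,r2)|^2 */
    extern double local_energy(const double r[6]);  /* assumed: (H psi)/psi */

    static double uniform(void) { return rand() / (RAND_MAX + 1.0); }

    /* neq equilibration moves, then nstep moves over which the energy is averaged. */
    double vmc_energy(long neq, long nstep, double step)
    {
        double r[6] = {0.5, 0.0, 0.0, -0.5, 0.0, 0.0};   /* two electron positions */
        double p_old = psi2(r), esum = 0.0;

        for (long n = 0; n < neq + nstep; n++) {
            double trial[6];
            for (int k = 0; k < 6; k++)                  /* propose a random move */
                trial[k] = r[k] + step * (uniform() - 0.5);
            double p_new = psi2(trial);
            if (p_new > p_old * uniform()) {             /* Metropolis accept/reject */
                for (int k = 0; k < 6; k++) r[k] = trial[k];
                p_old = p_new;
            }
            if (n >= neq)                                /* only average after equilibration */
                esum += local_energy(r);
        }
        return esum / nstep;
    }

Each process can run an independent walker, so only the final averages need
to be combined, which is consistent with the linear speed-up noted above.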

Jacobi iterative linear equation solver (jacobi)
------------------------------------------------

Uses a naive Jacobi iterative algorithm to solve a set of linear equations.
This algorithm is not applicable to real linear equations (sic) and neither
is it the most parallel algorithm available.  The code as implemented here
gets 780+ MFLOPS on a 128-node iPSC-i860 ... a paltry 30% efficiency, but it
is not hard to improve upon either.

All the time is spent in a large matrix-vector product which is statically
distributed across the processes.  You need to give it the matrix dimension
(pick it as big as will fit in memory).
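
For reference, one Jacobi sweep for A x = b updates every component from the
previous iterate only, which is why the work splits so easily across
processes.  A serial C sketch (the parallel code statically divides this
matrix-vector product among the processes; here a is stored row-major) is:

    void jacobi_sweep(int n, const double *a, const double *b,
                      const double *x_old, double *x_new)
    {
        for (int i = 0; i < n; i++) {
            double s = b[i];
            for (int j = 0; j < n; j++)
                if (j != i)
                    s -= a[i*n + j] * x_old[j];   /* subtract the off-diagonal terms */
            x_new[i] = s / a[i*n + i];            /* then divide by the diagonal */
        }
    }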

Solution of Laplace's equation on a 2-D grid (grid)
---------------------------------------------------

Solves Laplace's equation on a 2-D square grid subject to boundary conditions
specified on the boundary.  It uses a 5-point discretization of the operator
and a hierarchy of grids with red/black Gauss-Seidel w-relaxation.  This is
not the most efficient means of solving this equation (a fast Poisson solver
would probably do better) but it provides a 'real-world' example of spatial
decomposition determining the parallel decomposition.  It is also the only
example of a full application in C that is included here.
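
For orientation, a single red/black Gauss-Seidel sweep on one grid level
looks roughly like the C below (a sketch only: the relaxation factor, the
grid hierarchy and the spatial decomposition across processes in grid.c are
all omitted; u is an (n+2) x (n+2) row-major array whose boundary ring is
held fixed):

    void rb_sweep(int n, double *u)
    {
        int stride = n + 2;
        for (int colour = 0; colour < 2; colour++) {       /* 0 = red, 1 = black */
            for (int i = 1; i <= n; i++)
                for (int j = 1; j <= n; j++) {
                    if ((i + j) % 2 != colour)
                        continue;
                    /* 5-point stencil: the new value is the average of the 4 neighbours */
                    u[i*stride + j] = 0.25 * (u[(i-1)*stride + j] + u[(i+1)*stride + j] +
                                              u[i*stride + j-1]  + u[i*stride + j+1]);
                }
        }
    }

Updating one colour at a time is what makes the method parallel: every red
point depends only on black points, so all red updates can proceed at once
(and vice versa), with communication needed only along sub-grid boundaries.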

If the code is compiled with -DPLOT and run with the option '-plot XXX',
where XXX is one of 'value', 'residual' or 'error', then grids are dumped at
intervals to the file 'plot' (in the directory of process zero).  This file
may be displayed with the X11-R4/5 program xpix.  Xpix is not built
automatically and must be extracted and built from the shar file in this
directory.