               __                __
              / /   ____  ____  / /_  ____  _____
             / /   / __ `/ __ `/ __ \/ __ \/ ___/
            / /___/ /_/ / /_/ / / / / /_/ (__  )
           /_____/\__,_/\__, /_/ /_/\____/____/
                       /____/

        High-order Lagrangian Hydrodynamics Miniapp


## Purpose

**Laghos** (LAGrangian High-Order Solver) is a miniapp that solves the
time-dependent Euler equations of compressible gas dynamics in a moving
Lagrangian frame using unstructured high-order finite element spatial
discretization and explicit high-order time-stepping.

Laghos is based on the discretization method described in the following article:

> V. Dobrev, Tz. Kolev and R. Rieben <br>
> [High-order curvilinear finite element methods for Lagrangian hydrodynamics](https://doi.org/10.1137/120864672) <br>
> *SIAM Journal on Scientific Computing*, (34) 2012, pp. B606–B641.

Laghos captures the basic structure of many compressible shock hydrocodes,
including the [BLAST code](http://llnl.gov/casc/blast) at [Lawrence Livermore
National Laboratory](http://llnl.gov). The miniapp is built on top of a general
discretization library, [MFEM](http://mfem.org), thus separating the pointwise
physics from finite element and meshing concerns.

The Laghos miniapp is part of the [CEED software suite](http://ceed.exascaleproject.org/software),
a collection of software benchmarks, miniapps, libraries and APIs for
efficient exascale discretizations based on high-order finite element
and spectral element methods. See http://github.com/ceed for more
information and source code availability.

The CEED research is supported by the [Exascale Computing Project](https://exascaleproject.org/exascale-computing-project)
(17-SC-20-SC), a collaborative effort of two U.S. Department of Energy
organizations (Office of Science and the National Nuclear Security
Administration) responsible for the planning and preparation of a
[capable exascale ecosystem](https://exascaleproject.org/what-is-exascale),
including software, applications, hardware, advanced system engineering and early
testbed platforms, in support of the nation’s exascale computing imperative.

## Characteristics

The problem that Laghos is solving is formulated as a big (block) system of
ordinary differential equations (ODEs) for the unknown (high-order) velocity,
internal energy and mesh nodes (position). The left-hand side of this system of
ODEs is controlled by *mass matrices* (one for velocity and one for energy),
while the right-hand side is constructed from a *force matrix*.

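In the notation of the paper cited above, this block system can be sketched as
follows (a schematic summary; the symbols are our shorthand, not identifiers
from the code):

```latex
\begin{aligned}
\mathbf{M_v}\,\frac{d\mathbf{v}}{dt} &= -\,\mathbf{F}\cdot\mathbf{1},\\
\mathbf{M_e}\,\frac{d\mathbf{e}}{dt} &= \mathbf{F}^{T}\cdot\mathbf{v},\\
\frac{d\mathbf{x}}{dt} &= \mathbf{v},
\end{aligned}
```

where \(\mathbf{v}\), \(\mathbf{e}\) and \(\mathbf{x}\) are the velocity,
internal energy and mesh-position unknowns, \(\mathbf{M_v}\) and \(\mathbf{M_e}\)
are the kinematic and thermodynamic mass matrices, and \(\mathbf{F}\) is the
force matrix.
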
Laghos supports two options for deriving and solving the ODE system, namely the
*full assembly* and the *partial assembly* methods. Partial assembly is the main
algorithm of interest for high orders. For low orders (e.g. 2nd order in 3D),
both algorithms are of interest.

The full assembly option relies on constructing and utilizing global mass and
force matrices stored in compressed sparse row (CSR) format. In contrast, the
[partial assembly](http://ceed.exascaleproject.org/ceed-code) option defines
only the local action of those matrices, which is then used to perform all
necessary operations. As the local action is defined by utilizing the tensor
structure of the finite element spaces, the required data storage, memory
transfers and FLOPs are all reduced (especially at higher orders).

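To make the contrast concrete, here is a minimal, self-contained sketch of the
partial assembly idea (our own illustration with assumed names, not MFEM or
Laghos code):

```cpp
// Illustrative sketch of partial assembly: only per-element quadrature data
// D_e is stored, and the global action y = sum_e P_e^T B^T diag(D_e) B P_e x
// is applied on the fly, never forming a global CSR matrix.
#include <algorithm>
#include <vector>

struct ElemData {
    std::vector<int> dofs;   // local-to-global dof map (the gather P_e)
    std::vector<double> D;   // nq precomputed quadrature values
};

void PAMassMult(const std::vector<ElemData> &elems,
                const std::vector<double> &B,  // nq x nd basis table, row-major
                int nq, int nd,
                const std::vector<double> &x, std::vector<double> &y)
{
    std::fill(y.begin(), y.end(), 0.0);
    std::vector<double> xe(nd), qe(nq);
    for (const ElemData &e : elems) {
        for (int i = 0; i < nd; ++i) { xe[i] = x[e.dofs[i]]; }  // gather
        for (int q = 0; q < nq; ++q) {                          // qe = B * xe
            double s = 0.0;
            for (int i = 0; i < nd; ++i) { s += B[q*nd + i] * xe[i]; }
            qe[q] = e.D[q] * s;                 // pointwise quadrature scaling
        }
        for (int i = 0; i < nd; ++i) {          // y += B^T * qe, scattered
            double s = 0.0;
            for (int q = 0; q < nq; ++q) { s += B[q*nd + i] * qe[q]; }
            y[e.dofs[i]] += s;
        }
    }
}
```

In the real miniapp the `B` application is further factored into 1D tensor
contractions per spatial direction, which is what lowers storage and FLOPs at
high order.
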
Other computational motifs in Laghos include the following:

- Support for unstructured meshes, in 2D and 3D, with quadrilateral and
  hexahedral elements (triangular and tetrahedral elements can also be used, but
  with the less efficient full assembly option). Serial and parallel mesh
  refinement options can be set via a command-line flag.
- Explicit time-stepping loop with a variety of time integrator options. Laghos
  supports Runge-Kutta ODE solvers of orders 1, 2, 3, 4 and 6 (a schematic
  update step is sketched after this list).
- Continuous and discontinuous high-order finite element discretization spaces
  of runtime-specified order.
- Moving (high-order) meshes.
- Separation between the assembly and the quadrature point-based computations.
- Point-wise definition of mesh size, time-step estimate and artificial
  viscosity coefficient.
- Constant-in-time velocity mass operator that is inverted iteratively on
  each time step. This is an example of an operator that is prepared once (fully
  or partially assembled), but is applied many times. The application cost is
  dominant for this operator.
- Time-dependent force matrix that is prepared every time step (fully or
  partially assembled) and is applied just twice per "assembly". Both the
  preparation and the application costs are important for this operator.
- Domain-decomposed MPI parallelism.
- Optional in-situ visualization with [GLVis](http://glvis.org) and data output
  for visualization and data analysis with [VisIt](http://visit.llnl.gov).

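The schematic update step referenced in the time-stepping bullet above looks
like this (a hypothetical illustration; `State` and `EulerStep` are our names,
not Laghos classes):

```cpp
// Hypothetical illustration of one explicit update of the block ODE system.
#include <cstddef>
#include <functional>
#include <vector>

// Block state of the semi-discrete system: velocity, energy, mesh positions.
struct State { std::vector<double> v, e, x; };

// One forward-Euler (RK1) step of ds/dt = rhs(s), where rhs would invert the
// mass matrices and apply the force matrix; the higher-order RK options
// (orders 2, 3, 4 and 6) combine several such stage evaluations per step.
State EulerStep(const State &s, double dt,
                const std::function<State(const State &)> &rhs)
{
    State k = rhs(s), next = s;
    for (std::size_t i = 0; i < s.v.size(); ++i) { next.v[i] += dt * k.v[i]; }
    for (std::size_t i = 0; i < s.e.size(); ++i) { next.e[i] += dt * k.e[i]; }
    for (std::size_t i = 0; i < s.x.size(); ++i) { next.x[i] += dt * k.x[i]; }
    return next;
}
```
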
## Code Structure

- The file `laghos.cpp` contains the main driver with the time integration loop
  starting around line 431.
- In each time step, the ODE system of interest is constructed and solved by
  the class `LagrangianHydroOperator`, defined around line 375 of `laghos.cpp`
  and implemented in files `laghos_solver.hpp` and `laghos_solver.cpp`.
- All quadrature-based computations are performed in the function
  `LagrangianHydroOperator::UpdateQuadratureData` in `laghos_solver.cpp`.
- Depending on the chosen option (`-pa` for partial assembly or `-fa` for full
  assembly), the function `LagrangianHydroOperator::Mult` uses the corresponding
  method to construct and solve the final ODE system.
- The full assembly computations for all mass matrices are performed by the MFEM
  library, e.g., classes `MassIntegrator` and `VectorMassIntegrator`. Full
  assembly of the ODE's right hand side is performed by utilizing the class
  `ForceIntegrator` defined in `laghos_assembly.hpp`.
- The partial assembly computations are performed by the classes
  `ForcePAOperator` and `MassPAOperator` defined in `laghos_assembly.hpp`.
- When partial assembly is used, the main computational kernels are the
  `Mult*` functions of the classes `MassPAOperator` and `ForcePAOperator`
  implemented in file `laghos_assembly.cpp`. These functions have specific
  versions for quadrilateral and hexahedral elements.
- The orders of the velocity and position (continuous kinematic space)
  and the internal energy (discontinuous thermodynamic space) are given
  by the `-ok` and `-ot` input parameters, respectively.

## Building on CPU

Laghos has the following external dependencies:

- *hypre*, used for parallel linear algebra; we recommend version 2.10.0b<br>
   https://computation.llnl.gov/casc/hypre/software.html

- METIS, used for parallel domain decomposition (optional); we recommend [version 4.0.3](http://glaros.dtc.umn.edu/gkhome/fetch/sw/metis/OLD/metis-4.0.3.tar.gz) <br>
   http://glaros.dtc.umn.edu/gkhome/metis/metis/download

- MFEM, used for (high-order) finite element discretization; we recommend its GitHub master branch <br>
  https://github.com/mfem/mfem

To build the miniapp, first download *hypre* and METIS from the links above
and put everything on the same level as the `Laghos` directory:
```sh
~> ls
Laghos/  hypre-2.10.0b.tar.gz  metis-4.0.3.tar.gz
```

Build *hypre*:
```sh
~> tar -zxvf hypre-2.10.0b.tar.gz
~> cd hypre-2.10.0b/src/
~/hypre-2.10.0b/src> ./configure --disable-fortran
~/hypre-2.10.0b/src> make -j
~/hypre-2.10.0b/src> cd ../..
```
For large runs (problem size above 2 billion unknowns), add the
`--enable-bigint` option to the above `configure` line.

Build METIS:
```sh
~> tar -zxvf metis-4.0.3.tar.gz
~> cd metis-4.0.3
~/metis-4.0.3> make
~/metis-4.0.3> cd ..
~> ln -s metis-4.0.3 metis-4.0
```
This build is optional, as MFEM can be built without METIS by specifying
`MFEM_USE_METIS = NO` below.

Clone and build the parallel version of MFEM:
```sh
~> git clone https://github.com/mfem/mfem.git ./mfem
~> cd mfem/
~/mfem> git checkout laghos-v1.0
~/mfem> make parallel -j
~/mfem> cd ..
```
The above uses the `laghos-v1.0` tag of MFEM, which is guaranteed to work with
Laghos v1.0. Alternatively, one can use the latest versions of the MFEM and
Laghos `master` branches (provided there are no conflicts). See the [MFEM
building page](http://mfem.org/building/) for additional details.

(Optional) Clone and build GLVis:
```sh
~> git clone https://github.com/GLVis/glvis.git ./glvis
~> cd glvis/
~/glvis> make
~/glvis> cd ..
```
The easiest way to visualize Laghos results is to have GLVis running in a
separate terminal. Then the `-vis` option in Laghos will stream results directly
to the GLVis socket.

Build Laghos:
```sh
~> cd Laghos/
~/Laghos> make
```
This can be followed by `make test` and `make install` to check and install the
build, respectively. See `make help` for additional options.


## Building on GPU with CUDA or RAJA

### Environment setup
```sh
export MPI_PATH=~/usr/local/openmpi/3.0.0
```

### Hypre
- <https://computation.llnl.gov/projects/hypre-scalable-linear-solvers-multigrid-methods/download/hypre-2.11.2.tar.gz>
- `tar xzvf hypre-2.11.2.tar.gz`
- `cd hypre-2.11.2/src`
- `./configure --disable-fortran --with-MPI --with-MPI-include=$MPI_PATH/include --with-MPI-lib-dirs=$MPI_PATH/lib`
- `make -j`
- `cd ../..`

### Metis
- <http://glaros.dtc.umn.edu/gkhome/fetch/sw/metis/metis-5.1.0.tar.gz>
- `tar xzvf metis-5.1.0.tar.gz`
- `cd metis-5.1.0`
- ``make config shared=1 prefix=`pwd` ``
- `make && make install`
- `cd ..`

### MFEM
- `git clone git@github.com:mfem/mfem.git`
- `cd mfem`
- ``make config MFEM_USE_MPI=YES HYPRE_DIR=`pwd`/../hypre-2.11.2/src/hypre MFEM_USE_METIS_5=YES METIS_DIR=`pwd`/../metis-5.1.0``
- `make status` to verify that all the include paths are correct
- `make -j`
- `cd ..`

### Laghos
- `git clone git@github.com:CEED/Laghos.git`
- `cd Laghos`
- `git checkout raja-dev`
- edit the `makefile`: set `NV_ARCH` to the desired architecture and set the absolute paths `CUDA_DIR`, `MFEM_DIR` and `MPI_HOME`
- `make` to build the CPU version
- `./laghos -cfl 0.1` should give `step 78, t = 0.5000, dt = 0.001835, |e| = 7.0537801760`
- `cp ./laghos ./laghos.cpu`
- `make clean && make cuda`
- `./laghos -cfl 0.1` should again give `step 78, t = 0.5000, dt = 0.001835, |e| = 7.0537801760`
- `cp ./laghos ./laghos.gpu`
- if you set the `RAJA_DIR` path in the `makefile`, you can `make clean && make raja` and `cp ./laghos ./laghos.raja`

### Options
- `-m <string>`: Mesh file to use
- `-ok <int>`: Order (degree) of the kinematic finite element space
- `-rs <int>`: Number of times to refine the mesh uniformly in serial
- `-p <int>`: Problem setup to use; the Sedov problem is `1`
- `-cfl <double>`: CFL condition number
- `-ms <int>`: Maximum number of steps (negative means no restriction)
- `-mult`: Enable or disable MULT test kernels
- `-cuda`: Enable or disable CUDA kernels if you are using RAJA
- `-uvm`: Enable or disable Unified Memory
- `-aware`: Enable or disable CUDA-aware MPI
- `-hcpo`: Enable or disable Host Conforming Prolongation Operations,
  which transfer all data to the host before communication

## Running

#### Sedov blast

The main problem of interest for Laghos is the Sedov blast wave (`-p 1`) with
the partial assembly option (`-pa`).

Some sample runs in 2D and 3D respectively are:
```sh
mpirun -np 8 laghos -p 1 -m data/square01_quad.mesh -rs 3 -tf 0.8 -no-vis -pa
mpirun -np 8 laghos -p 1 -m data/cube01_hex.mesh -rs 2 -tf 0.6 -no-vis -pa
```

The latter produces the following density plot (when run with the `-vis`
instead of the `-no-vis` option):

![Sedov blast image](data/sedov.png)

#### Taylor-Green vortex

Laghos also includes a smooth test problem that exposes all the principal
computational kernels of the problem except for the artificial viscosity
evaluation.

Some sample runs in 2D and 3D respectively are:
```sh
mpirun -np 8 laghos -p 0 -m data/square01_quad.mesh -rs 3 -tf 0.5 -no-vis -pa
mpirun -np 8 laghos -p 0 -m data/cube01_hex.mesh -rs 1 -cfl 0.1 -tf 0.25 -no-vis -pa
```

The latter produces the following velocity magnitude plot (when run with the
`-vis` instead of the `-no-vis` option):

![Taylor-Green image](data/tg.png)

#### Triple-point problem

This well-known three-material problem combines shock waves and vorticity,
thus exercising the more complex computational abilities of Laghos.

Some sample runs in 2D and 3D respectively are:
```sh
mpirun -np 8 laghos -p 3 -m data/rectangle01_quad.mesh -rs 2 -tf 2.5 -cfl 0.025 -no-vis -pa
mpirun -np 8 laghos -p 3 -m data/box01_hex.mesh -rs 1 -tf 2.5 -cfl 0.05 -no-vis -pa
```

The latter produces the following specific internal energy plot (when run with
the `-vis` instead of the `-no-vis` option):

![Triple-point image](data/tp.png)

## Verification of Results

To make sure the results are correct, we tabulate reference final iterations
(`step`), time steps (`dt`) and energies (`|e|`) for the seven runs listed
below:

1. `mpirun -np 8 laghos -p 0 -m data/square01_quad.mesh -rs 3 -tf 0.75 -no-vis -pa`
2. `mpirun -np 8 laghos -p 0 -m data/cube01_hex.mesh -rs 1 -tf 0.75 -no-vis -pa`
3. `mpirun -np 8 laghos -p 1 -m data/square01_quad.mesh -rs 3 -tf 0.8 -no-vis -pa`
4. `mpirun -np 8 laghos -p 1 -m data/cube01_hex.mesh -rs 2 -tf 0.6 -no-vis -pa`
5. `mpirun -np 8 laghos -p 2 -m data/segment01.mesh -rs 5 -tf 0.2 -no-vis -fa`
6. `mpirun -np 8 laghos -p 3 -m data/rectangle01_quad.mesh -rs 2 -tf 3.0 -no-vis -pa`
7. `mpirun -np 8 laghos -p 3 -m data/box01_hex.mesh -rs 1 -tf 3.0 -no-vis -pa`

| `run` | `step` | `dt`     | `\|e\|`         |
| ----- | ------ | -------- | --------------- |
|  1.   |  339   | 0.000702 | 49.6955373491   |
|  2.   | 1041   | 0.000121 | 3390.9635545458 |
|  3.   | 1150   | 0.002271 | 46.3055694501   |
|  4.   |  561   | 0.000360 | 134.0937837919  |
|  5.   |  414   | 0.000339 | 32.0120759615   |
|  6.   | 5310   | 0.000264 | 141.8348694390  |
|  7.   |  937   | 0.002285 | 144.0012514765  |


An implementation is considered valid if the final energy values are all within
round-off distance from the above reference values.

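One simple way to script such a check is shown below; the relative tolerance is
our assumption of what "round-off distance" means in practice:

```cpp
// Illustrative verification helper; the 1e-12 tolerance is an assumption.
#include <cmath>
#include <cstdio>

bool MatchesReference(double e, double e_ref, double rel_tol = 1e-12)
{
    return std::fabs(e - e_ref) <= rel_tol * std::fabs(e_ref);
}

int main()
{
    // Compare a computed final energy against run 4's reference value.
    double e_computed = 134.0937837919;  // stand-in for a real run's output
    std::printf(MatchesReference(e_computed, 134.0937837919) ? "ok\n"
                                                             : "mismatch\n");
    return 0;
}
```
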
## Performance Timing and FOM

Each time step in Laghos contains three major distinct computations:

1. The inversion of the global kinematic mass matrix (CG H1).
2. The force operator evaluation from degrees of freedom to quadrature points (Forces).
3. The physics kernel in quadrature points (UpdateQuadData).

By default Laghos is instrumented to report the total execution times and rates,
in terms of millions of degrees of freedom per second (megadofs), for each of
these computational phases. (The time for inversion of the local thermodynamic
mass matrices (CG L2) is also reported, but that takes a small part of the
overall computation.)

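As a rough illustration of what such a rate means, the sketch below computes
millions of dof-updates per second; this is our assumed bookkeeping, and the
miniapp's own timing code defines the exact accounting:

```cpp
// Illustrative only; the precise rate bookkeeping lives in Laghos itself.
#include <cstdio>

// Millions of dof-updates processed per second for one computational phase.
double Megadofs(long long dofs, long long steps, double seconds)
{
    return static_cast<double>(dofs) * static_cast<double>(steps)
           / (seconds * 1.0e6);
}

int main()
{
    // Example: 1,000,000 dofs advanced over 100 steps in 50 s -> 2 megadofs.
    std::printf("%.1f megadofs\n", Megadofs(1000000, 100, 50.0));
    return 0;
}
```
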
Laghos also reports the total rate for these major kernels, which is a proposed
**Figure of Merit (FOM)** for benchmarking purposes. Given a computational
allocation, the FOM should be reported for different problem sizes and finite
element orders, as illustrated in the sample scripts in the [timing](./timing)
directory.

A sample run on the [Vulcan](https://computation.llnl.gov/computers/vulcan) BG/Q
machine at LLNL is:

```sh
srun -n 393216 laghos -pa -p 1 -tf 0.6 -no-vis \
                      -pt 322 -m data/cube_12_hex.mesh \
                      --cg-tol 0 --cg-max-iter 50 --max-steps 2 \
                      -ok 3 -ot 2 -rs 5 -rp 3
```
This is a Q3-Q2 3D computation on 393,216 MPI ranks (24,576 nodes) that produces
rates of approximately 168497, 74221, and 16696 megadofs, and a total FOM of
about 2073 megadofs.

To make the above run 8 times bigger (each additional uniform refinement
multiplies the number of elements by 8 in 3D), one can either weak scale by
using 8 times as many MPI tasks and increasing the number of serial
refinements: `srun -n 3145728 ... -rs 6 -rp 3`, or use the same number of MPI
tasks but increase the local problem on each of them by doing one more parallel
refinement: `srun -n 393216 ... -rs 5 -rp 4`.

## Versions

In addition to the main MPI-based CPU implementation in https://github.com/CEED/Laghos,
the following versions of Laghos have been developed:

- A serial version in the [serial](./serial) directory.
- A [GPU version](https://github.com/CEED/Laghos/tree/occa-dev) based on
  [OCCA](http://libocca.org/).
- A [RAJA](https://software.llnl.gov/RAJA/)-based version in the
  [raja-dev](https://github.com/CEED/Laghos/tree/raja-dev) branch.

## Contact

You can reach the Laghos team by emailing laghos@llnl.gov or by leaving a
comment in the [issue tracker](https://github.com/CEED/Laghos/issues).

## Copyright

The following copyright applies to each file in the CEED software suite,
unless otherwise stated in the file:

> Copyright (c) 2017, Lawrence Livermore National Security, LLC. Produced at the
> Lawrence Livermore National Laboratory. LLNL-CODE-734707. All Rights reserved.

See files LICENSE and NOTICE for details.
391