# CP2K Artificial Benchmark

The first code sample provided for LIBXSMM was a performance reproducer exercising the same set of kernels that is usually generated for CP2K's SMM library. The sample attempted to model the way "matrix stacks" are processed in CP2K. However, there are two different code paths in CP2K: (1) the "main" code path used when processing stacks on the host side, and (2) a code path targeting offload devices. Besides the host-side parallelization via MPI (and perhaps OpenMP), the second code path relies on an additional level of parallelization, which is necessary to drive a potentially highly parallel offload device. This additional level of parallelism is also not exactly "nested" in the sense of sharing the same resources as the host side. In fact, this "artificial benchmark" (the cp2k code sample) models the second code path (offload device).
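
To make the idea of a "matrix stack" with an additional level of parallelism more concrete, the following is a minimal sketch and not the code of this sample or of CP2K. It assumes a stack is a list of offset triples into packed A, B, and C buffers, that all blocks share one shape (m, n, k) in column-major layout, and that entries updating the same C block are adjacent in the stack. The names `StackEntry`, `smm`, and `process_stack` are hypothetical, the plain loop nest stands in for a specialized small-matrix kernel, and host threads stand in for the parallelism of an offload device.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <thread>
#include <vector>

// Hypothetical stack entry: element offsets of one A, B, and C block.
struct StackEntry { std::size_t a, b, c; };

// C(m x n) += A(m x k) * B(k x n), all column-major.
// Plain loops stand in for a specialized small-matrix kernel.
inline void smm(const double* a, const double* b, double* c, int m, int n, int k) {
  for (int j = 0; j < n; ++j)
    for (int p = 0; p < k; ++p)
      for (int i = 0; i < m; ++i)
        c[j * m + i] += a[p * m + i] * b[j * k + p];
}

// Processes the stack with an extra level of parallelism (assumption: entries
// that update the same C block are adjacent, so chunks can be cut between them).
void process_stack(const std::vector<StackEntry>& stack,
                   const double* a, const double* b, double* c,
                   int m, int n, int k, unsigned nthreads) {
  assert(nthreads > 0);
  const std::size_t size = stack.size();
  std::vector<std::thread> workers;
  std::size_t begin = 0;
  for (unsigned t = 0; t < nthreads; ++t) {
    std::size_t end = size;  // the last worker takes the remainder
    if (t + 1 < nthreads) {
      const std::size_t target = (static_cast<std::size_t>(t) + 1) * size / nthreads;
      end = std::max(begin, target);
      // Extend the chunk so that one C block is never split across workers.
      while (end > begin && end < size && stack[end].c == stack[end - 1].c) ++end;
    }
    workers.emplace_back([&stack, a, b, c, m, n, k, begin, end] {
      for (std::size_t i = begin; i < end; ++i)
        smm(a + stack[i].a, b + stack[i].b, c + stack[i].c, m, n, k);
    });
    begin = end;
  }
  for (std::thread& w : workers) w.join();
}
```

Cutting chunks only where the C offset changes keeps the accumulation free of data races without locks or atomics. Other strategies, such as synchronizing updates of shared C blocks, are equally possible, and this sketch makes no claim about which one the sample uses.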

| Name | Date | Size | #Lines | LOC |
|---|---|---|---|---|
| Makefile | 13-Oct-2021 | 4.2 KiB | 116 | 93 |
| README.md | 13-Oct-2021 | 956 B | 5 | 2 |
| cp2k-collocate.cc | 13-Oct-2021 | 18.6 KiB | 509 | 411 |
| cp2k-collocate.vcxproj | 13-Oct-2021 | 23.1 KiB | 406 | 406 |
| cp2k-dbcsr.cpp | 13-Oct-2021 | 14.8 KiB | 367 | 322 |
| cp2k-dbcsr.sh | 03-May-2022 | 2.5 KiB | 69 | 50 |
| cp2k-dbcsr.vcxproj | 13-Oct-2021 | 23.6 KiB | 418 | 418 |
| cp2k-perf-jit.sh | 03-May-2022 | 3.6 KiB | 70 | 51 |
| cp2k-perf.plt | 13-Oct-2021 | 12.1 KiB | 235 | 214 |
| cp2k-plot.sh | 03-May-2022 | 3.3 KiB | 111 | 83 |
| mdarray.hpp | 13-Oct-2021 | 32.7 KiB | 1,127 | 903 |
| rt_graph.cc | 13-Oct-2021 | 18 KiB | 546 | 420 |
| rt_graph.hpp | 13-Oct-2021 | 7.8 KiB | 213 | 128 |