ABOUT PARABANDS
===============

ParaBands is a code that builds and diagonalizes the DFT Hamiltonian. Instead
of using iterative algorithms like most DFT programs, ParaBands builds the
dense DFT Hamiltonian in a plane-wave basis set and calls a direct solver to
obtain the requested number of empty bands. This approach is suitable when the
desired number of bands is a large fraction (more than ~5%) of the size of the
Hamiltonian.

ParaBands has the following features:

  - Support for three different classes of solvers:
    - LAPACK/ScaLAPACK (Bisection, Divide and Conquer, and MR3)
    - ELPA
    - PRIMME

  - Parallelization over k-point pools

  - Fully parallel HDF5 output for the wave function file

  - Hybrid MPI and OpenMP parallelism

Note that you MUST compile BerkeleyGW with HDF5 support, otherwise ParaBands
will not be able to write any wave function file. Also, ParaBands currently
only works on top of calculations performed with Quantum ESPRESSO.

ParaBands was written by Felipe H. da Jornada, and it is inspired and motivated
by the SAPO code written by Georgy Samsonidze.


BASIC USAGE
===========

In order to run ParaBands, you first need to perform a mean-field calculation
with Quantum ESPRESSO with the desired k-points and a small number of bands
(containing at least one fully empty band), and export the following
quantities:
- WFN_in: the wave function file (we need this to get the electron occupations);
- VSC: the self-consistent potential in G-space; and
- VKB: the Kleinman-Bylander non-local projectors in G-space.
Note that only QE can export the VSC and VKB files right now.

If you plan on doing self-energy calculations, you should also export the
exchange-correlation potential in G-space (VXC), because degenerate wave
functions generated by ParaBands will likely differ from those generated by the
mean-field program, so vxc.dat = <nk|VXC|nk> will not be consistent.

The recommended input for pw2bgw.inp has the following structure:

```
&input_pw2bgw
  prefix = 'SYSTEM_NAME'
  wfng_flag = .true.
  wfng_file = 'WFN_in'
  real_or_complex = 2
  wfng_kgrid = .true.
  wfng_nk1 = 6
  wfng_nk2 = 6
  wfng_nk3 = 6
  wfng_dk1 = 0.0
  wfng_dk2 = 0.0
  wfng_dk3 = 0.0
  rhog_flag = .false.
  vxc_flag = .false.
  vxcg_flag = .true.
  vxcg_file = 'VXC'
  vscg_flag = .true.
  vscg_file = 'VSC'
  vkbg_flag = .true.
  vkbg_file = 'VKB'
/
```

The documentation for the ParaBands input file is in parabands.inp. Note that
only the input and output file names are actually required to run ParaBands.

Note that the output wave function file is written in HDF5 format. However, only
Fortran binary I/O of wave functions is supported in the rest of BerkeleyGW. You
will need to use the utility `hdf2wfn.x` to convert to Fortran binary format.


FAQs
====

Q) How many atoms can ParaBands handle?
A) ParaBands does not care (much) about the number of atoms. The bottleneck is
the size of the Hamiltonian in G-space.

Q) Ok, what is the maximum size of the Hamiltonian ParaBands can handle?
A) This is equivalent to asking "how long does it take to diagonalize a matrix
of size NxN?". Here are some benchmarks on the Cori2 supercomputer at NERSC,
using the ELPA solver, for a single k-point, obtaining all bands:

```
  |      N | Nodes |  Time  |
  |--------|-------|--------|
  |  17000 |     4 |   10 s |
  |  60000 |    16 | 10 min |
  | 100000 |    64 | 15 min |
  | 150000 |   256 | 17 min |
  | 200000 |   512 | 25 min |
  | 250000 |   512 | 38 min |
  | 320000 |   512 | 68 min |
  | 416000 |  2560 | 80 min |
```

Note that I/O time is not included.

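These timings are roughly consistent with the O(N^3) cost of dense
diagonalization. A minimal extrapolation sketch (not part of ParaBands; it
assumes perfect strong scaling with node count, which the larger runs in the
table show is optimistic, so treat the result as a lower bound):

```python
def estimate_time_minutes(n, nodes, n_ref=100_000, nodes_ref=64, t_ref=15.0):
    """Extrapolate dense-diagonalization wall time from a reference run.

    Assumes cost ~ N^3 and perfect strong scaling with node count. The
    defaults use the N=100000, 64-node, 15-minute row of the table above.
    """
    return t_ref * (n / n_ref) ** 3 * (nodes_ref / nodes)

# Example: predict the N=200000, 512-node run from the N=100000, 64-node one.
# The model gives 15 min; the measured time was 25 min, the gap being lost
# parallel efficiency at scale.
print(estimate_time_minutes(200_000, 512))
```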
Q) How do I determine the size of the Hamiltonian before I run ParaBands?
A) The easiest way is to run an SCF calculation with QE and look for the
"Parallelization info" section at the very top of the output. Example:

```
     Parallelization info
     --------------------
     sticks:   dense  smooth     PW     G-vecs:    dense   smooth      PW
     Min         282     282     70                19098    19098    2384
     Max         284     284     74                19112    19112    2396
     Sum       18167   18167   4543              1222827  1222827  152845
```

If your calculation has a single k-point, the row "Sum", column "PW" has the
number of G-vectors needed to represent the wave function (152845), which is
the size of the Hamiltonian constructed by ParaBands. If you have more k-points
in your calculation, that number counts all G-vectors needed to represent any
wave function at any k-point, which is typically ~10% larger than the number of
G-vectors at any given k-point.
Q) Are there any other caveats?
A) ParaBands sometimes doesn't work with the real version of the code. If you
see a warning that the difference between input and output energies is large,
try running ParaBands again with the complex version of the code and complex
input wavefunctions.

PERFORMANCE TIPS
================

  - The best solver algorithm for ParaBands is ELPA. Make sure you compile
    BerkeleyGW with support for ELPA (see the section below).

  - If you are using a parallel file system, make sure to stripe the output
    directory before running ParaBands.

  - Out of the ScaLAPACK solvers, Bisection is typically the best. But you
    might experience some performance gain with the Divide and Conquer
    algorithm if you ask for all bands of your DFT Hamiltonian.

  - When using a direct solver (LAPACK or ScaLAPACK, which is the default), you
    won't save much run time on the diagonalization if you ask for a smaller
    number of bands. But you may save some run time overall because of the I/O.

  - The performance of each solver depends a lot on the LAPACK/ScaLAPACK
    implementation, MPI stack, and interconnect. You are encouraged to test
    all solvers on a small system when you run ParaBands on a new architecture.

  - Use the default block_size of 32 when running ParaBands without threads.
    If you increase the number of threads, you'll need to increase the block
    size to 64 or 128 to get any performance gain.

  - Note that the memory required to run ScaLAPACK increases with the
    block_size!

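To put the memory tip in perspective, the dense complex double-precision
Hamiltonian alone takes 16*N^2 bytes, distributed across the MPI processes;
the solver's work arrays add a further multiple of this. A minimal estimate
sketch (illustrative only, ignoring work arrays and the block-cyclic padding):

```python
def hamiltonian_memory_gib(n, n_procs=1):
    """Memory for the dense complex(8) Hamiltonian: 16 bytes per element.

    Returns (total, per-process) in GiB, assuming an even distribution
    across processes; solver work arrays are not included.
    """
    total = 16.0 * n * n / 2 ** 30
    return total, total / n_procs

# Example: N = 152845 (the "Parallelization info" example above) on 512
# MPI processes.
total, per_proc = hamiltonian_memory_gib(152845, 512)
print(f"{total:.0f} GiB total, {per_proc:.2f} GiB per process")
```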


MR3 SOLVER
==========

Not all ScaLAPACK implementations support the Multiple Relatively Robust
Representations (MR3) algorithm. You'll need to include the flag -DUSEMR3 in
the MATHFLAGS variable in your arch.mk file to enable this solver.


ELPA SOLVER
===========

Using the ELPA solver is **HIGHLY RECOMMENDED**. To enable it, you should:

  1) Download and compile ELPA **VERSION 2017.05.001 OR ABOVE**.

  2) Add the flag -DUSEELPA in the MATHFLAGS variable in your arch.mk file.

  3) Add a variable ELPALIB to your arch.mk which points to libelpa.a.

  4) Add a variable ELPAINCLUDE to your arch.mk which points to the directory
     that contains the modules generated by ELPA. This typically has the
     structure ELPA_PREFIX/include/elpa-VERSION/modules.

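For instance, steps 2-4 might look like this in arch.mk (the paths below are
hypothetical placeholders; substitute your actual ELPA installation prefix):

```makefile
# Hypothetical paths -- replace with your actual ELPA installation.
MATHFLAGS += -DUSEELPA
ELPALIB = /opt/elpa/lib/libelpa.a
ELPAINCLUDE = /opt/elpa/include/elpa-2017.05.001/modules
```
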
You can control the kernel that ELPA uses by setting the environment variables
COMPLEX_ELPA_KERNEL and REAL_ELPA_KERNEL. Take a look at
ELPA/src/elpa2_utilities.F90 for the kernels available on your architecture. A
reasonably safe and fast option is to include the following lines in your
submission script:

```
export REAL_ELPA_KERNEL=REAL_ELPA_KERNEL_SSE
export COMPLEX_ELPA_KERNEL=COMPLEX_ELPA_KERNEL_SSE
```


PRIMME SOLVER
=============

The PRIMME solver is only useful if you want to obtain a very small number of
bands, though it is a very robust solver. If you wish to run ParaBands with
PRIMME, you will need to:

  1) Download PRIMME version 2.1.

  2) Compile PRIMME.

  3) Add the flag -DUSEPRIMME in the MATHFLAGS variable in your arch.mk file.

  4) Add a variable PRIMMELIB to your arch.mk which points to libprimme.a.

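As with the ELPA setup, steps 3-4 amount to two lines in arch.mk (the path
below is a hypothetical placeholder):

```makefile
MATHFLAGS += -DUSEPRIMME
PRIMMELIB = /opt/primme/lib/libprimme.a
```
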

TODO
====

FHJ's wish list:

  - Memory report
  - Restart feature
  - Build VKB from PP file (almost there!)