RELAPACK Configuration
======================

ReLAPACK has two configuration files: `make.inc`, which is included by the
Makefile, and `config.h`, which is included in the source files.


Build and Testing Environment
-----------------------------
The build environment (compiler and flags) and the test configuration (linker
flags for BLAS and LAPACK) are specified in `make.inc`.  The test matrix size
and error bounds are defined in `test/config.h`.

The library `librelapack.a` is compiled by invoking `make`.  The tests are
run by either `make test` or calling `make` in the `test` folder.


BLAS/LAPACK complex function interfaces
---------------------------------------
For BLAS and LAPACK functions that return a complex number, there exist two
conflicting (FORTRAN compiler dependent) calling conventions: either the result
is returned as a `struct` of two floating point numbers, or an additional first
argument with a pointer to such a `struct` is used.  By default, ReLAPACK uses
the former (which is what gfortran uses), but it can switch to the latter by
setting `COMPLEX_FUNCTIONS_AS_ROUTINES` (or explicitly the BLAS and LAPACK
specific counterparts) to `1` in `config.h`.

**For MKL, `COMPLEX_FUNCTIONS_AS_ROUTINES` must be set to `1`.**

(Using the wrong convention will break `ctrsyl` and `ztrsyl` and the test cases
will segfault or return errors on the order of 1 or larger.)

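To make the two conventions described above concrete, here is a minimal,
self-contained C sketch that implements a complex dot product both ways.  All
names in it (`complex_float`, `demo_dotu_function`, `demo_dotu_routine`,
`demo_dotu`, `DEMO_COMPLEX_FUNCTIONS_AS_ROUTINES`) are hypothetical and are not
part of ReLAPACK or BLAS; the macro only mimics how a compile-time switch such
as `COMPLEX_FUNCTIONS_AS_ROUTINES` could select the call form at a call site.

```c
#include <assert.h>

/* Two floats laid out like a Fortran COMPLEX value. */
typedef struct { float re, im; } complex_float;

/* Convention 1: the result is returned by value as a struct. */
complex_float demo_dotu_function(int n, const complex_float *x,
                                 const complex_float *y) {
    assert(n >= 0);
    complex_float s = {0.0f, 0.0f};
    for (int i = 0; i < n; i++) {
        s.re += x[i].re * y[i].re - x[i].im * y[i].im;
        s.im += x[i].re * y[i].im + x[i].im * y[i].re;
    }
    return s;
}

/* Convention 2: the result is written through an extra first argument. */
void demo_dotu_routine(complex_float *result, int n, const complex_float *x,
                       const complex_float *y) {
    *result = demo_dotu_function(n, x, y);
}

/* Hypothetical switch modeled on COMPLEX_FUNCTIONS_AS_ROUTINES. */
#define DEMO_COMPLEX_FUNCTIONS_AS_ROUTINES 0

/* Call site that selects the convention at compile time. */
complex_float demo_dotu(int n, const complex_float *x,
                        const complex_float *y) {
#if DEMO_COMPLEX_FUNCTIONS_AS_ROUTINES
    complex_float s;
    demo_dotu_routine(&s, n, x, y);
    return s;
#else
    return demo_dotu_function(n, x, y);
#endif
}
```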

BLAS extension `xgemmt`
-----------------------
The LDL decompositions require `xgemmt`, a general matrix-matrix product that
updates only one triangle of the result matrix.  If the BLAS implementation
linked against provides such a routine, set the flag `HAVE_XGEMMT` to `1` in
`config.h`; otherwise, ReLAPACK uses its own recursive implementation of these
kernels.

`xgemmt` is provided by MKL.

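As a rough illustration of what such a triangular update computes, the
following C sketch evaluates `C := alpha * A * B + beta * C` while touching only
the lower triangle of `C`.  It is a plain reference loop under simplifying
assumptions (no transposition options, lower triangle only, column-major
storage), not ReLAPACK's recursive kernel and not the interface of any BLAS
library; `demo_gemmt_lower` and `IDX` are hypothetical names.  The actual BLAS
extension takes additional arguments, for example to select the triangle and
the transpositions.

```c
#include <assert.h>

/* Column-major element access with leading dimension ld. */
#define IDX(i, j, ld) ((i) + (j) * (ld))

/* C := alpha * A * B + beta * C, updating only the lower triangle of the
   n x n matrix C.  A is n x k, B is k x n; all matrices are column-major. */
void demo_gemmt_lower(int n, int k, double alpha,
                      const double *A, int ldA,
                      const double *B, int ldB,
                      double beta, double *C, int ldC) {
    assert(n >= 0 && k >= 0);
    for (int j = 0; j < n; j++) {
        for (int i = j; i < n; i++) {   /* rows on or below the diagonal */
            double sum = 0.0;
            for (int p = 0; p < k; p++)
                sum += A[IDX(i, p, ldA)] * B[IDX(p, j, ldB)];
            C[IDX(i, j, ldC)] = alpha * sum + beta * C[IDX(i, j, ldC)];
        }
    }
}
```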

Routine Selection
-----------------
ReLAPACK's routines are named `RELAPACK_X` (e.g., `RELAPACK_dgetrf`).  If the
corresponding `INCLUDE_X` flag in `config.h` (e.g., `INCLUDE_DGETRF`) is set to
`1`, ReLAPACK additionally provides a wrapper under the LAPACK name (e.g.,
`dgetrf_`).  By default, wrappers for all routines are enabled.

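The wrapper mechanism described above can be pictured with a small mock.  The
sketch below does not reproduce ReLAPACK's source: `RELAPACK_demo_dscale`,
`demo_dscale_`, and `INCLUDE_DEMO_DSCALE` are hypothetical stand-ins that only
follow the `RELAPACK_X` / `INCLUDE_X` naming pattern, and it assumes the flag is
a plain 0/1 macro tested with `#if`.

```c
/* Hypothetical flag following the INCLUDE_X pattern. */
#define INCLUDE_DEMO_DSCALE 1

/* Stand-in for a routine exported under the RELAPACK_ prefix.  Arguments are
   passed by pointer, as in the Fortran-style LAPACK interfaces. */
void RELAPACK_demo_dscale(const int *n, const double *alpha, double *x) {
    for (int i = 0; i < *n; i++)
        x[i] *= *alpha;
}

#if INCLUDE_DEMO_DSCALE
/* Wrapper under the plain Fortran-style name; it forwards its arguments
   unchanged, so existing callers pick up the prefixed routine. */
void demo_dscale_(const int *n, const double *alpha, double *x) {
    RELAPACK_demo_dscale(n, alpha, x);
}
#endif
```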

Crossover Size
--------------
The crossover size is the matrix size below which ReLAPACK's recursive
algorithms switch to LAPACK's unblocked routines to avoid tiny BLAS Level 3
calls.  The crossover size is set in `config.h` and can be chosen either
globally for the entire library, by operation, or individually by routine.

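The role of the crossover size can be sketched with a recursive inversion of a
lower triangular matrix.  This is an illustration only: plain loops stand in
for the LAPACK unblocked routine and the BLAS Level 3 calls, and the names
`DEMO_CROSSOVER`, `demo_trtri_unblocked`, and `demo_trtri_recursive` are
hypothetical, as is the threshold value.  The matrix is assumed to be
column-major, non-singular, and non-unit lower triangular.

```c
#include <assert.h>

#define IDX(i, j, ld) ((i) + (j) * (ld))
#define DEMO_CROSSOVER 2   /* hypothetical threshold; must be at least 1 */

/* Unblocked in-place inverse of an n x n lower triangular matrix. */
static void demo_trtri_unblocked(int n, double *A, int ld) {
    for (int j = 0; j < n; j++) {
        A[IDX(j, j, ld)] = 1.0 / A[IDX(j, j, ld)];
        for (int i = j + 1; i < n; i++) {
            double s = A[IDX(i, j, ld)] * A[IDX(j, j, ld)];
            for (int p = j + 1; p < i; p++)
                s += A[IDX(i, p, ld)] * A[IDX(p, j, ld)];
            A[IDX(i, j, ld)] = -s / A[IDX(i, i, ld)];
        }
    }
}

/* Recursive in-place inverse; small blocks go to the unblocked code. */
void demo_trtri_recursive(int n, double *A, int ld) {
    assert(n >= 0 && ld >= n);
    if (n <= DEMO_CROSSOVER) {          /* below the crossover: stop recursing */
        demo_trtri_unblocked(n, A, ld);
        return;
    }
    const int n1 = n / 2, n2 = n - n1;  /* split into a 2 x 2 block layout */
    double *A11 = A;
    double *A21 = A + IDX(n1, 0, ld);
    double *A22 = A + IDX(n1, n1, ld);

    demo_trtri_recursive(n1, A11, ld);  /* A11 := inv(A11) */

    /* A21 := -A21 * A11, with A11 already inverted (trmm-like step). */
    for (int i = 0; i < n2; i++)
        for (int j = 0; j < n1; j++) {
            double s = 0.0;
            for (int p = j; p < n1; p++)
                s += A21[IDX(i, p, ld)] * A11[IDX(p, j, ld)];
            A21[IDX(i, j, ld)] = -s;
        }

    /* A21 := inv(A22) * A21 by forward substitution (trsm-like step). */
    for (int j = 0; j < n1; j++)
        for (int i = 0; i < n2; i++) {
            double s = A21[IDX(i, j, ld)];
            for (int p = 0; p < i; p++)
                s -= A22[IDX(i, p, ld)] * A21[IDX(p, j, ld)];
            A21[IDX(i, j, ld)] = s / A22[IDX(i, i, ld)];
        }

    demo_trtri_recursive(n2, A22, ld);  /* A22 := inv(A22) */
}
```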

Allowing Temporary Buffers
--------------------------
Two of ReLAPACK's routines make use of temporary buffers, which are allocated
and freed within ReLAPACK.  Setting `ALLOW_MALLOC` (or one of the routine
specific counterparts) to `0` in `config.h` will disable these buffers.  The
affected routines are:

 * `xsytrf`: The LDL decomposition requires a buffer of size n^2 / 2.  As in
   LAPACK, this size can be queried by setting `lWork = -1`, and the passed
   buffer will be used if it is large enough; only if it is not, a local buffer
   will be allocated.

   The advantage of this mechanism is that ReLAPACK will seamlessly work even
   with codes that statically provide too little memory instead of breaking
   them.

 * `xsygst`: The reduction of a real symmetric-definite generalized eigenproblem
   to standard form can use an auxiliary buffer of size n^2 / 2 to avoid
   redundant computations.  It thereby performs about 30% fewer FLOPs than
   LAPACK.

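The buffer handling described for `xsytrf` above (size query with
`lWork = -1`, use of the caller's buffer when it is large enough, local
allocation otherwise) can be sketched with a mock routine.  Everything here is
hypothetical: `demo_factor`, `DEMO_ALLOW_MALLOC`, the `used_local` reporting
argument, and the error code `-1` are illustrations, and returning an error
when allocation is disabled and the buffer is too small is an assumption, not
documented ReLAPACK behavior.

```c
#include <assert.h>
#include <stdlib.h>

/* Hypothetical switch modeled on ALLOW_MALLOC. */
#define DEMO_ALLOW_MALLOC 1

/* Mock routine that needs n*n/2 doubles of workspace.
   *info:       0 on success, -1 if no usable workspace is available.
   *used_local: 1 if a local buffer was allocated (reported for the demo). */
void demo_factor(int n, double *work, int lwork, int *info, int *used_local) {
    assert(n >= 0);
    const int needed = n * n / 2;
    *info = 0;
    *used_local = 0;

    if (lwork == -1) {           /* workspace query: report the size only */
        work[0] = (double)needed;
        return;
    }

    double *buf = work;
    if (lwork < needed) {        /* caller's buffer is too small */
#if DEMO_ALLOW_MALLOC
        buf = malloc((size_t)needed * sizeof *buf);
        if (!buf) {
            *info = -1;
            return;
        }
        *used_local = 1;
#else
        *info = -1;              /* assumption: fail instead of allocating */
        return;
#endif
    }

    for (int i = 0; i < needed; i++)   /* placeholder for the computation */
        buf[i] = 0.0;

    if (*used_local)
        free(buf);
}
```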

FORTRAN symbol names
--------------------
ReLAPACK is commonly linked to BLAS and LAPACK with standard FORTRAN interfaces.
Since these libraries usually append an underscore to their symbol names,
ReLAPACK has configuration switches in `config.h` to adjust the corresponding
routine names.

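One common way to implement such a switch in C is a token-pasting macro.  The
sketch below shows the idea under assumed names: `DEMO_UNDERSCORE`,
`DEMO_FORTRAN_NAME`, `DEMO_STR`, and `demo_ddot_symbol` are hypothetical and do
not claim to match the macros in ReLAPACK's `config.h`.  The `ddot` prototype
follows the standard Fortran BLAS interface, in which every argument is passed
by pointer.

```c
/* Hypothetical switch: 1 if the linked library appends an underscore. */
#define DEMO_UNDERSCORE 1

#if DEMO_UNDERSCORE
#define DEMO_FORTRAN_NAME(routine) routine ## _
#else
#define DEMO_FORTRAN_NAME(routine) routine
#endif

/* Two-step stringification so the argument is expanded before quoting. */
#define DEMO_STR_(x) #x
#define DEMO_STR(x) DEMO_STR_(x)

/* Declaration of the BLAS dot product under the configured symbol name.
   It is only declared here, never called, so no BLAS library is needed. */
double DEMO_FORTRAN_NAME(ddot)(const int *n, const double *x, const int *incx,
                               const double *y, const int *incy);

/* Returns the symbol name that the declaration above refers to. */
const char *demo_ddot_symbol(void) {
    return DEMO_STR(DEMO_FORTRAN_NAME(ddot));
}
```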