ReLAPACK Configuration
======================

ReLAPACK has two configuration files: `make.inc`, which is included by the
Makefile, and `config.h`, which is included in the source files.


Build and Testing Environment
-----------------------------
The build environment (compiler and flags) and the test configuration (linker
flags for BLAS and LAPACK) are specified in `make.inc`.  The test matrix size
and error bounds are defined in `test/config.h`.

The library `librelapack.a` is compiled by invoking `make`.  The tests are
run by either `make test` or calling `make` in the test folder.


BLAS/LAPACK complex function interfaces
---------------------------------------
For BLAS and LAPACK functions that return a complex number, there exist two
conflicting (FORTRAN compiler dependent) calling conventions: either the result
is returned as a `struct` of two floating point numbers, or an additional first
argument with a pointer to such a `struct` is used.  By default ReLAPACK uses
the former (which is what gfortran uses), but it can switch to the latter by
setting `COMPLEX_FUNCTIONS_AS_ROUTINES` (or explicitly the BLAS and LAPACK
specific counterparts) to `1` in `config.h`.

**For MKL, `COMPLEX_FUNCTIONS_AS_ROUTINES` must be set to `1`.**

(Using the wrong convention will break `ctrsyl` and `ztrsyl`, and the test cases
will segfault or return errors on the order of 1 or larger.)


BLAS extension `xgemmt`
-----------------------
The LDL decompositions require a general matrix-matrix product that updates only
a triangular matrix, called `xgemmt`.  If the BLAS implementation linked against
provides such a routine, set the flag `HAVE_XGEMMT` to `1` in `config.h`;
otherwise, ReLAPACK uses its own recursive implementation of these kernels.

`xgemmt` is provided by MKL.


Routine Selection
-----------------
ReLAPACK's routines are named `RELAPACK_X` (e.g., `RELAPACK_dgetrf`).
If the corresponding `INCLUDE_X` flag in `config.h` (e.g., `INCLUDE_DGETRF`) is
set to `1`, ReLAPACK additionally provides a wrapper under the LAPACK name
(e.g., `dgetrf_`).  By default, wrappers for all routines are enabled.


Crossover Size
--------------
The crossover size determines below which matrix sizes ReLAPACK's recursive
algorithms switch to LAPACK's unblocked routines to avoid tiny BLAS Level 3
routines.  The crossover size is set in `config.h` and can be chosen either
globally for the entire library, by operation, or individually by routine.


Allowing Temporary Buffers
--------------------------
Two of ReLAPACK's routines make use of temporary buffers, which are allocated
and freed within ReLAPACK.  Setting `ALLOW_MALLOC` (or one of the routine
specific counterparts) to `0` in `config.h` will disable these buffers.  The
affected routines are:

 * `xsytrf`: The LDL decomposition requires a buffer of size n^2 / 2.  As in
   LAPACK, this size can be queried by setting `lWork = -1`, and the passed
   buffer will be used if it is large enough; only if it is not, a local buffer
   will be allocated.

   The advantage of this mechanism is that ReLAPACK will seamlessly work even
   with codes that statically provide too little memory instead of breaking
   them.

 * `xsygst`: The reduction of a real symmetric-definite generalized eigenproblem
   to standard form can use an auxiliary buffer of size n^2 / 2 to avoid
   redundant computations.  It thereby performs about 30% fewer FLOPs than
   LAPACK.


FORTRAN symbol names
--------------------
ReLAPACK is commonly linked to BLAS and LAPACK with standard FORTRAN interfaces.
Since these libraries usually append an underscore to their symbol names,
ReLAPACK has configuration switches in `config.h` to adjust the corresponding
routine names.
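
As a rough illustration of how such a switch could work, the following C sketch
appends the underscore with preprocessor token pasting.  The macro
`EXAMPLE_UNDERSCORE` and all `example_*` names are hypothetical and chosen for
this example only; the actual switch names are the ones defined in `config.h`.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical switch for illustration only; not a ReLAPACK macro.
 * 1: symbols carry a trailing underscore (e.g., dgemm_), 0: they do not. */
#define EXAMPLE_UNDERSCORE 1

#if EXAMPLE_UNDERSCORE
#define EXAMPLE_NAME(name) name##_  /* paste an underscore onto the name */
#else
#define EXAMPLE_NAME(name) name     /* leave the name unchanged */
#endif

/* Two-level stringification so that the argument is macro-expanded first. */
#define EXAMPLE_STR_(x) #x
#define EXAMPLE_STR(x) EXAMPLE_STR_(x)

/* Returns the symbol name that a reference to dgemm would expand to. */
static const char *example_dgemm_symbol(void) {
    return EXAMPLE_STR(EXAMPLE_NAME(dgemm));
}

/* Reports whether a symbol name ends in an underscore. */
static int example_has_trailing_underscore(const char *symbol) {
    assert(symbol != NULL);
    size_t len = strlen(symbol);
    return len > 0 && symbol[len - 1] == '_';
}
```

With the hypothetical switch set to `1`, `example_dgemm_symbol()` yields
`"dgemm_"`; setting it to `0` would yield `"dgemm"`, matching a library that
exports names without the underscore.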