# esl_vectorops

The `vectorops` module contains routines for simple operations on
vectors of various types, of length `n`.

Different functions allow an operation to be performed on vectors
containing elements of different scalar types (`double`, `float`,
`int`, `int64_t`, `char`). The appropriate routine is prefixed with
`D`, `F`, `I`, `L`, or `C`. For example, `esl_vec_DSet()` is the Set
routine for a vector of doubles; `esl_vec_LSet()` is for 64-bit
("Long") integers.
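As an illustration of the one-letter type convention, here is a minimal sketch of what a `D` and an `I` variant of the same operation could look like. The `sketch_` names are hypothetical stand-ins, not Easel's functions, and the bodies are an assumption about the obvious implementation; only the argument order `(v, n, x)` is taken from the quick reference.

```c
/* Hypothetical stand-ins for a D (double) and an I (int) variant of Set.
 * Same operation, same argument order; only the element type differs. */
void sketch_vec_DSet(double *v, int n, double x)
{
    int i;
    for (i = 0; i < n; i++) v[i] = x;
}

void sketch_vec_ISet(int *v, int n, int x)
{
    int i;
    for (i = 0; i < n; i++) v[i] = x;
}
```

A caller picks the variant that matches the element type of its array; the length argument is an `int` in both cases.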

## quick reference

Replace `*` with `esl_vec_[DFIL]`, sometimes `C`:

| functions                | in English                      | in math                                            | notes |
|--------------------------|---------------------------------|---------------------------------------------------:|------:|
| `*Set(v, n, x)`          | Set all elements to x           | $v_i = x \quad \forall i$                          |       |
| `*Scale(v, n, x)`        | Multiply all elements by x      | $v_i = x v_i \quad \forall i$                      |       |
| `*Increment(v, n, x)`    | Add x to all elements           | $v_i = v_i + x \quad \forall i$                    |       |
| `*Add(v, w, n)`          | Vector addition                 | $v_i = v_i + w_i \quad \forall i$                  |       |
| `*AddScaled(v, w, n, a)` | Vector add a scaled vector      | $v_i = v_i + a w_i \quad \forall i$                |       |
| `*Sum(v, n)`             | Return sum of elements          | (return) $\sum_i v_i$                              | [1]   |
| `*Dot(v, w, n)`          | Return dot product              | (return) $\mathbf{v} \cdot \mathbf{w}$             |       |
| `*Max(v, n)`             | Return maximum element          | (return) $\max_i v_i$                              |       |
| `*Min(v, n)`             | Return minimum element          | (return) $\min_i v_i$                              |       |
| `*ArgMax(v, n)`          | Return index of maximum element | (return) $\mathrm{argmax}_i v_i$                   |       |
| `*ArgMin(v, n)`          | Return index of minimum element | (return) $\mathrm{argmin}_i v_i$                   |       |
| `*Copy(v, n, w)`         | Copy a vector                   | $w_i = v_i \quad \forall i$                        |       |
| `*Swap(v, w, n)`         | Swap contents of two vectors    | $\mathbf{w} = \mathbf{v}, \mathbf{v} = \mathbf{w}$ |       |
| `*Reverse(v, w, n)`      | Reverse order of v, store in w  | $w_i = v_{n-i-1}$                                  |       |
| `*SortIncreasing(v, n)`  | Sort from smallest to largest   | []()                                               |       |
| `*SortDecreasing(v, n)`  | Sort from largest to smallest   | []()                                               |       |
| `*Shuffle(rng, v, n)`    | Shuffle vector in place         | []()                                               |       |
| `*Compare(v, w, n ...)`  | Compare vectors for equality    | (return `eslOK` or `eslFAIL`)                      | [2]   |
| `*Dump(fp, v, n, label)` | Dump contents for inspection    |                                                    |       |
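To make the "in math" column concrete, the following is a minimal sketch of three of the rows above for `double` vectors. The `sketch_` names are hypothetical and the bodies are assumptions, not Easel's source. In particular, the tie-breaking rule in the ArgMax sketch (lowest index wins) and the requirement that `n >= 1` are choices made here, not something the table specifies.

```c
/* v_i = v_i + a * w_i for all i (the *AddScaled(v, w, n, a) row). */
void sketch_vec_DAddScaled(double *v, const double *w, int n, double a)
{
    int i;
    for (i = 0; i < n; i++) v[i] += a * w[i];
}

/* Returns the dot product of v and w (the *Dot(v, w, n) row). */
double sketch_vec_DDot(const double *v, const double *w, int n)
{
    double result = 0.0;
    int    i;
    for (i = 0; i < n; i++) result += v[i] * w[i];
    return result;
}

/* Returns the index of the maximum element (the *ArgMax(v, n) row).
 * Assumptions: n >= 1, and ties resolve to the lowest index. */
int sketch_vec_DArgMax(const double *v, int n)
{
    int i;
    int best = 0;
    for (i = 1; i < n; i++)
        if (v[i] > v[best]) best = i;
    return best;
}
```

Note that the operations in the first group of rows modify `v` in place, while `Sum`, `Dot`, `Max`, `Min`, `ArgMax`, and `ArgMin` leave their arguments untouched and return a value.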


Floating point only, `[DF]` (generally for probability or log probability vectors):

| functions                          | in English                               | in math                                                                        |
|------------------------------------|------------------------------------------|-------------------------------------------------------------------------------:|
| `*Norm(v, n)`                      | Normalize probability vector             | $v_i = \frac{v_i}{\sum_j v_j}$                                                 |
| `*LogNorm(v, n)`                   | Normalize log probability vector         | $v_i = \frac{e^{v_i}}{\sum_j e^{v_j}}$                                         |
| `*Log(v, n)`                       | Take log of each element                 | $v_i = \log v_i$                                                               |
| `*Exp(v, n)`                       | Exponentiate each element                | $v_i = e^{v_i}$                                                                |
| `*LogSum(v, n)`                    | Return log sum of log probabilities      | (return) $\log \sum_i e^{v_i}$                                                 |
| `*Entropy(p, n)`                   | Return Shannon entropy                   | (return) $H = - \sum_i p_i \log_2 p_i$                                         |
| `*RelEntropy(p, q, n)`             | Return Kullback-Leibler divergence       | (return) $D_{\mathrm{KL}}(p \parallel q) = \sum_i p_i \log_2 \frac{p_i}{q_i}$ |
| `*CDF(p, n, C)`                    | Cumulative distribution of a prob vector | $C_i = \sum_{j=0}^{j=i} p_j$                                                   |
| `*Validate(p, n, atol, errbuf)`    | Validate probability vector              | (return `eslOK` or `eslFAIL`)                                                  |
| `*LogValidate(p, n, atol, errbuf)` | Validate log prob vector                 | (return `eslOK` or `eslFAIL`)                                                  |


[1] Floating point `*Sum()` functions use the Kahan compensated
    summation algorithm to reduce numerical error
    [[Kahan, 1965]](https://doi.org/10.1145/363707.363723).

[2] Floating point `*Compare()` functions take an additional argument
    `tol`, and use `esl_*Compare(v[i], w[i], tol)` to compare each
    element with **relative** tolerance `tol`. We are phasing the
    `esl_*Compare()` functions out. They are to be replaced by
    `esl_*CompareNew()`, which take both an absolute tolerance `atol`
    and a relative tolerance `rtol`.
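For readers unfamiliar with the technique named in note [1], here is a minimal sketch of Kahan compensated summation. It is the textbook form of the algorithm, not a copy of Easel's code, and `sketch_kahan_sum` is a hypothetical name. It should be compiled without value-changing optimizations such as `-ffast-math`, which can remove the compensation term.

```c
/* Textbook Kahan compensated summation. The variable c carries the
 * low-order bits that were lost when the previous term was added. */
double sketch_kahan_sum(const double *v, int n)
{
    double sum = 0.0;
    double c   = 0.0;   /* running compensation for lost low-order bits */
    int    i;

    for (i = 0; i < n; i++) {
        double y = v[i] - c;    /* apply the correction to the next term  */
        double t = sum + y;     /* low-order bits of y may be lost here   */
        c   = (t - sum) - y;    /* recover what was lost (negated)        */
        sum = t;
    }
    return sum;
}
```

For example, adding `1e-16` to `1.0` four times with a plain running sum leaves `1.0` unchanged, because each term is below half a unit in the last place; the compensated version accumulates the small terms and returns a value greater than `1.0`.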

## notes to future

* We could provide SIMD accelerated versions and runtime dispatchers.

* The vector length `n` is always an `int`, so vectors can be up to a
  couple billion ($2^{31}-1$) long. If we ever need longer vectors
  (!), we'll write an `esl_vec64` module. That is, the element type
  of the vector is a separate issue from its length. Don't be
  confused to see the `L` versions of the functions using
  `int64_t *vec` and `int n`, and using `int i` to index it; this is
  deliberate.
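The distinction between element type and length type can be sketched as follows. This is a hypothetical `L`-style sum, written here only to show the pattern; the name `sketch_vec_LSum` and its return type are assumptions, not Easel's declaration.

```c
#include <stdint.h>

/* Elements are 64-bit integers, but the length n and the loop index i
 * are plain int, as described in the note above. */
int64_t sketch_vec_LSum(const int64_t *vec, int n)
{
    int64_t sum = 0;
    int     i;
    for (i = 0; i < n; i++) sum += vec[i];
    return sum;
}
```

The individual values and their sum can exceed the range of a 32-bit `int` even though the number of elements cannot.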