1######
2Vector
3######
4
5**Python** :doc:`Java <vector-java>`
6
7Goal
8====
9
10Create a vector data structure.
11
12Challenge
13=========
14
15Maintain the performance characteristics of vectors, including efficient random lookup, append, updates, scanning, and truncation.
16
17Explanation
18===========
19
20Using the vector index to form a key allows efficient retrieval of individual vector elements. The ordering of keys further allows scanning of an entire vector.
21
22Ordering
23========
24
25We can exploit the ordering of FoundationDB’s keys to place adjacent vector elements into adjacent keys. This approach supports efficient retrieval of an entire vector using a single range read.
26
27Pattern
28=======
29
30We store each element of the vector as a single key-value pair. The key is defined by using the tuple layer built into FoundationDB.
31
32For each index in the vector, we store::
33
34 fdb.tuple.pack((index,)) = vector[index]
35
36The tuple packing ensures that adjacent vector elements are stored as adjacent key values. This means that scanning an vector can be completed as a single range-read operation, which is very efficient. Likewise, looking up an individual vector element translates to a single random database read. Truncating the vector becomes a range-clear operation, which is O(log n) in FoundationDB.
37
38Extensions
39==========
40
41Multi-dimensional
42-----------------
43
44This approach can easily be extended to multidimensional vectors by adding additional vector indexes to the tuples. Like in an in-memory vector, the ordering of the dimensions determine what kind of range operations are most efficient. For example:
45
46For each index1, index2 in a 2D vector, we can store::
47
48 fdb.tuple.pack((index1, index2)) = vector[index1][index2]
49
50This order means that reads with a fixed index1 value and over a range of index2 values are most efficient.
51
52Sparse
53------
54
55Since we are using the presence of a key-value pair to denote the presence of each individual vector element, we can efficiently represent sparse vectors by not storing key-value pairs for absent elements and ranges.
56
57Multiple values
58---------------
59
60As described, each vector element stores only one value. An obvious extension would be to pack multiple logical values together in a single physical value for storage.
61
62External values/composition
63---------------------------
64
65Another useful extension is to store a “pointer” in the value of the array to reference another source for the logical value. This allows you to compose an array with other design patterns and data structures. The advantage of transactions is that even data structures with indirection can maintain consistency with concurrent client modifications.
66
67Code
68====
69
70Here’s the basic pattern::
71
72    vector_subspace = fdb.Subspace(('V',))
73
74    @fdb.transactional
75    def vector_get(tr, index, value):
76        return tr[vector_subspace[index]]
77
78    @fdb.transactional
79    def vector_set(tr, index, value):
80        tr[vector_subspace[index]] = value
81