# Overview of LMDB databases in /var/cfengine/state

This document explains the data stored in our LMDB databases.
It is not meant as user-level documentation.
It covers technical details and shows C source code.

## What is LMDB?

Lightning Memory-mapped Database (LMDB) is a high-performance local key-value store.
We use LMDB to store data which is shared between different CFEngine binaries and agent runs.
Examples include incoming/outgoing network connections (cf_lastseen), promise locks (cf_lock), performance information, and state data about the host.

In our LMDB databases, we use NUL-terminated C strings as keys, and values may be either plain text (C strings) or binary data (C structs).
LMDB database files (`*.lmdb`) are platform-specific, because the representation and sizes of C types differ between systems and architectures.

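
Because binary values are raw C structs, a reader on a different platform may see a value whose size does not match its own struct. The following is a minimal sketch of a defensive copy that refuses mismatched sizes. `CopyValueChecked` is a hypothetical helper introduced here for illustration; it is not part of CFEngine or LMDB.

```C
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <string.h>

/* Copy a raw database value into a struct only if the sizes agree.
 * Hypothetical helper, not taken from the CFEngine source. */
bool CopyValueChecked(void *dest, size_t dest_size,
                      const void *value, size_t value_size)
{
    assert(dest != NULL && value != NULL);

    if (value_size != dest_size)
    {
        /* Likely written by a different platform or struct version. */
        return false;
    }
    memcpy(dest, value, dest_size);
    return true;
}
```

A size check like this cannot detect differences in byte order or field layout, so it is only a first line of defense.
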
## Getting a list of all databases

### Community policy server

On a fresh community install, these LMDB databases are created:

```
$ ls -al /var/cfengine/state/*.lmdb
-rw-r--r-- 1 root root  12288 Aug 23 08:37  /var/cfengine/state/cf_lastseen.lmdb
-rw------- 1 root root  32768 Aug 23 08:49  /var/cfengine/state/cf_lock.lmdb
-rw------- 1 root root  49152 Aug 23 08:49  /var/cfengine/state/cf_observations.lmdb
-rw-r--r-- 1 root root   8192 Aug 23 08:37  /var/cfengine/state/cf_state.lmdb
-rw------- 1 root root  16384 Aug 23 08:37  /var/cfengine/state/history.lmdb
-rw------- 1 root root   8192 Aug 23 08:37  /var/cfengine/state/nova_measures.lmdb
-rw------- 1 root root   8192 Aug 23 08:37  /var/cfengine/state/nova_static.lmdb
-rw------- 1 root root 581632 Aug 23 08:47  /var/cfengine/state/packages_installed_apt_get.lmdb
-rw-r--r-- 1 root root   8192 Aug 23 08:37 '/var/cfengine/state/packages_installed_$(package_module_knowledge.platform_default).lmdb'
-rw------- 1 root root  28672 Aug 23 08:47  /var/cfengine/state/packages_updates_apt_get.lmdb
-rw-r--r-- 1 root root  32768 Aug 23 08:47  /var/cfengine/state/performance.lmdb
```

### Enterprise hub package

Similarly, a fresh install of an enterprise hub package creates these LMDB databases:

```
$ ls -al /var/cfengine/state/*.lmdb
-rw------- 1 root root  32768 Aug 23 09:00  /var/cfengine/state/cf_changes.lmdb
-rw------- 1 root root  28672 Aug 23 09:00  /var/cfengine/state/cf_lastseen.lmdb
-rw------- 1 root root  61440 Aug 23 09:01  /var/cfengine/state/cf_lock.lmdb
-rw------- 1 root root  28672 Aug 23 09:00  /var/cfengine/state/cf_observations.lmdb
-rw------- 1 root root  20480 Aug 23 09:00  /var/cfengine/state/cf_state.lmdb
-rw------- 1 root root  16384 Aug 23 09:00  /var/cfengine/state/history.lmdb
-rw------- 1 root root  32768 Aug 23 09:01  /var/cfengine/state/nova_agent_execution.lmdb
-rw------- 1 root root   8192 Aug 23 09:00  /var/cfengine/state/nova_measures.lmdb
-rw------- 1 root root  20480 Aug 23 09:00  /var/cfengine/state/nova_static.lmdb
-rw------- 1 root root  12288 Aug 23 09:00  /var/cfengine/state/nova_track.lmdb
-rw------- 1 root root 425984 Aug 23 09:01  /var/cfengine/state/packages_installed_apt_get.lmdb
-rw------- 1 root root   8192 Aug 23 09:00 '/var/cfengine/state/packages_installed_$(package_module_knowledge.platform_default).lmdb'
-rw------- 1 root root  32768 Aug 23 09:01  /var/cfengine/state/packages_updates_apt_get.lmdb
-rw------- 1 root root  32768 Aug 23 09:01  /var/cfengine/state/performance.lmdb
```

### Examining the source code

According to `dbm_api.c`, these appear to be all the possible database file names:

```C
static const char *const DB_PATHS_STATEDIR[] = {
    [dbid_classes] = "cf_classes",
    [dbid_variables] = "cf_variables",
    [dbid_performance] = "performance",
    [dbid_checksums] = "checksum_digests",
    [dbid_filestats] = "stats",
    [dbid_changes] = "cf_changes",
    [dbid_observations] = "cf_observations",
    [dbid_state] = "cf_state",
    [dbid_lastseen] = "cf_lastseen",
    [dbid_audit] = "cf_audit",
    [dbid_locks] = "cf_lock",
    [dbid_history] = "history",
    [dbid_measure] = "nova_measures",
    [dbid_static] = "nova_static",
    [dbid_scalars] = "nova_pscalar",
    [dbid_windows_registry] = "mswin",
    [dbid_cache] = "nova_cache",
    [dbid_license] = "nova_track",
    [dbid_value] = "nova_value",
    [dbid_agent_execution] = "nova_agent_execution",
    [dbid_bundles] = "bundles",
    [dbid_packages_installed] = "packages_installed",
    [dbid_packages_updates] = "packages_updates"
};
```

Some of them are deprecated, as the comments in the `dbid` enum show:

```C
typedef enum
{
    dbid_classes,   // Deprecated
    dbid_variables, // Deprecated
    dbid_performance,
    dbid_checksums, // Deprecated
    dbid_filestats, // Deprecated
    dbid_changes,
    dbid_observations,
    dbid_state,
    dbid_lastseen,
    dbid_audit,
    dbid_locks,
    dbid_history,
    dbid_measure,
    dbid_static,
    dbid_scalars,
    dbid_windows_registry,
    dbid_cache,
    dbid_license,
    dbid_value,
    dbid_agent_execution,
    dbid_bundles,   // Deprecated
    dbid_packages_installed, //new package promise installed packages list
    dbid_packages_updates,   //new package promise list of available updates

    dbid_max
} dbid;
```
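
The directory listings above suggest how the base names map to files: the name is placed in the state directory with a `.lmdb` extension, and the package databases get an extra suffix such as `apt_get`. The following is a minimal sketch of that mapping, inferred only from the listings. `BuildDbFilename` and its parameters are hypothetical and are not the function used in `dbm_api.c`.

```C
#include <assert.h>
#include <stdio.h>
#include <string.h>

/* Build "<statedir>/<name>[_<sub_name>].lmdb" into buf.
 * Returns the string length, or -1 if buf is too small.
 * Hypothetical helper; the naming rule is inferred from the listings. */
int BuildDbFilename(char *buf, size_t size, const char *statedir,
                    const char *name, const char *sub_name)
{
    assert(buf != NULL && statedir != NULL && name != NULL);

    int len;
    if (sub_name != NULL && strlen(sub_name) > 0)
    {
        len = snprintf(buf, size, "%s/%s_%s.lmdb", statedir, name, sub_name);
    }
    else
    {
        len = snprintf(buf, size, "%s/%s.lmdb", statedir, name);
    }

    if (len < 0 || (size_t) len >= size)
    {
        return -1; /* Encoding error or truncated output */
    }
    return len;
}
```

With the state directory `/var/cfengine/state`, the name `packages_installed`, and the suffix `apt_get`, this produces the `packages_installed_apt_get.lmdb` path seen in the listings.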

## Individual database files

### cf_lastseen.lmdb

See `lastseen.c` for more details.

#### Example cf-check dump

```
$ cf-check dump -a /var/cfengine/state/cf_lastseen.lmdb
key: 0x7fba331acf34[16] a192.168.100.10, data: 0x7fba331acf44[37] MD5=6fcc943142f461f4c0aa59abe871bc57
key: 0x7fba331acf72[38] kMD5=6fcc943142f461f4c0aa59abe871bc57, data: 0x7fba331acf98[15] 192.168.100.10
key: 0x7fba331acfb0[39] qoMD5=6fcc943142f461f4c0aa59abe871bc57, data: 0x7fba331acfd7[40] Fc]
```

#### Example mdb_dump

```
$ mdb_dump -n -p /var/cfengine/state/cf_lastseen.lmdb
VERSION=3
format=print
type=btree
mapsize=104857600
maxreaders=126
db_pagesize=4096
HEADER=END
 a192.168.100.10\00
 MD5=6fcc943142f461f4c0aa59abe871bc57\00
 kMD5=6fcc943142f461f4c0aa59abe871bc57\00
 192.168.100.10\00
 qoMD5=6fcc943142f461f4c0aa59abe871bc57\00
 F\bfc]\00\00\00\00\00\00\00\00\00\00\00\00\00\00\00\00\00\00\00\00\00\00\00\00\00\00\00\00\00\00\00\00\00\00\00\00
DATA=END
```

#### Version entry

Denotes the database schema version used by this database file.

```
key: "version"
value: "1"
```

#### Address entries (reverse lookup)

```
key: a<address> (IPv4 or IPv6)
value: <hostkey>
```

#### Quality entries

```
key: q<direction><hostkey> (direction: 'i' for incoming, 'o' for outgoing)
value: struct KeyHostSeen
```

##### KeyHostSeen struct

From `lastseen.h`:

```C
typedef struct
{
    time_t lastseen;
    QPoint Q;
} KeyHostSeen;
```

The QPoint is a rolling weighted average of the time between connections:

```C
    newq.Q = QAverage(q.Q, newq.lastseen - q.lastseen, 0.4);
```
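
To illustrate what a rolling weighted average does, here is a minimal sketch. It assumes that the third argument (0.4 in the call above) is the weight given to the newest interval; `WeightedAverage` is a hypothetical stand-in operating on plain doubles and is not the real `QAverage`, which works on `QPoint` values not shown in this document.

```C
#include <assert.h>

/* Blend a new sample into a running average.
 * weight is the share given to the new sample, between 0 and 1.
 * Hypothetical simplification, not the CFEngine QAverage(). */
double WeightedAverage(double old_average, double new_sample, double weight)
{
    assert(weight >= 0.0 && weight <= 1.0);
    return (1.0 - weight) * old_average + weight * new_sample;
}
```

Under that assumption, an old average of 100 seconds and a new interval of 200 seconds would give roughly 140 seconds with a weight of 0.4, so recent connections influence the value more than old ones.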

#### Hostkey entries

```
key: k<hostkey> ("MD5=ffffefefeefef..." or "SHA=abacabaca...")
value: <address> (IPv4 or IPv6)
```
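
The address, hostkey, and quality keys described above differ only in their prefix. The following is a minimal sketch of how such keys could be composed, using hypothetical helper names that do not come from `lastseen.c`. Each function returns the key size including the terminating NUL, since the NUL is part of the stored key.

```C
#include <assert.h>
#include <stdio.h>
#include <string.h>

/* Convert an snprintf() result into a key size that includes the NUL. */
static int FinishKey(int written, size_t size)
{
    if (written < 0 || (size_t) written >= size)
    {
        return -1; /* Encoding error or truncated key */
    }
    return written + 1;
}

/* "a<address>", used for reverse lookup from address to hostkey */
int FormatAddressKey(char *buf, size_t size, const char *address)
{
    assert(buf != NULL && address != NULL);
    return FinishKey(snprintf(buf, size, "a%s", address), size);
}

/* "k<hostkey>", used for lookup from hostkey to address */
int FormatHostkeyKey(char *buf, size_t size, const char *hostkey)
{
    assert(buf != NULL && hostkey != NULL && strlen(hostkey) > 0);
    return FinishKey(snprintf(buf, size, "k%s", hostkey), size);
}

/* "q<direction><hostkey>", direction is 'i' (incoming) or 'o' (outgoing) */
int FormatQualityKey(char *buf, size_t size, char direction,
                     const char *hostkey)
{
    assert(buf != NULL && hostkey != NULL);
    assert(direction == 'i' || direction == 'o');
    return FinishKey(snprintf(buf, size, "q%c%s", direction, hostkey), size);
}
```

For the host in the example dumps, these sizes come out as 16, 38, and 39 bytes, which matches the bracketed key sizes printed by `cf-check dump`.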

### cf_lock.lmdb

See `locks.c` for more details.

#### Example cf-check dump

```
root@dev ~ $ cf-check dump -a /var/cfengine/state/cf_lock.lmdb
key: 0x7f6b0f4e4b36[33] 05315dcd049a9e89c6d85520d505f600, data: 0x7f6b0f4e4b57[24] =
key: 0x7f6b0f4e4c3e[33] 0c8f50c64db3673c7f2d8eca5d8a475b, data: 0x7f6b0f4e4c5f[24] =
key: 0x7f6b0f4e48a2[33] 0f121ab519b8b23bae54714d3f1a83e1, data: 0x7f6b0f4e48c3[24] =
key: 0x7f6b0f4e4d74[33] 37e5b5e86d2e7474b4106be6f395401c, data: 0x7f6b0f4e4d95[24] =
key: 0x7f6b0f4e49aa[33] 398e0e0964c2608b419b50cf5d038fd6, data: 0x7f6b0f4e49cb[24] =
[...]
key: 0x7f6b0f4e4d46[13] lock_horizon, data: 0x7f6b0f4e4d53[24]
root@dev ~ $
```

#### Example mdb_dump

```
root@dev ~ $ mdb_dump -n -p /var/cfengine/state/cf_lock.lmdb
VERSION=3
format=print
type=btree
mapsize=104857600
maxreaders=210
db_pagesize=4096
HEADER=END
 05315dcd049a9e89c6d85520d505f600\00
 \f1=\00\00\00\00\00\00,\d4c]\00\00\00\00r\16\00\00\00\00\00\00
 0c8f50c64db3673c7f2d8eca5d8a475b\00
 \f1=\00\00\00\00\00\00,\d4c]\00\00\00\00r\16\00\00\00\00\00\00
 0f121ab519b8b23bae54714d3f1a83e1\00
 \f1=\00\00\00\00\00\00,\d4c]\00\00\00\00r\16\00\00\00\00\00\00
 37e5b5e86d2e7474b4106be6f395401c\00
 \f1=\00\00\00\00\00\00,\d4c]\00\00\00\00r\16\00\00\00\00\00\00
 398e0e0964c2608b419b50cf5d038fd6\00
 \f1=\00\00\00\00\00\00,\d4c]\00\00\00\00r\16\00\00\00\00\00\00
[...]
 lock_horizon\00
 \00\00\00\00\00\00\00\00F\bfc]\00\00\00\00\00\00\00\00\00\00\00\00
DATA=END
root@dev ~ $
```

#### Lock keys

For each promise, two strings are generated like this:

```C
char cflock[CF_BUFSIZE] = "";
snprintf(cflock, CF_BUFSIZE, "lock.%.100s.%s.%.100s_%d_%s",
         bundle_name, cc_operator, cc_operand, sum, str_digest);

char cflast[CF_BUFSIZE] = "";
snprintf(cflast, CF_BUFSIZE, "last.%.100s.%s.%.100s_%d_%s",
         bundle_name, cc_operator, cc_operand, sum, str_digest);

Log(LOG_LEVEL_DEBUG, "Locking bundle '%s' with lock '%s'",
    bundle_name, cflock);
```

The result can be seen in the debug log:

```
root@dev ~ $ cf-agent --log-level debug | grep -F Locking
   debug: Locking bundle 'inventory_control' with lock 'lock.inventory_control.reports.-dev.inventory_control__LSB_module_enabled_5317_MD5=17cc5c06fc415f344762d927f53723b7'
   debug: Locking bundle 'inventory_control' with lock 'lock.inventory_control.reports.-dev.inventory_control__dmidecode_module_enabled_4618_MD5=ecdad15e8a457659d321be7aaf3465e0'
   debug: Locking bundle 'inventory_control' with lock 'lock.inventory_control.reports.-dev.inventory_control__mtab_module_enabled_5516_MD5=926e972f2d6bee317a0c1b09c0a05029'
   debug: Locking bundle 'inventory_control' with lock 'lock.inventory_control.reports.-dev.inventory_control__fstab_module_enabled_4364_MD5=f743f4810ceac5dc7f29c246b27ef4f4'
   debug: Locking bundle 'inventory_control' with lock 'lock.inventory_control.reports.-dev.inventory_control__proc_module_enabled_5328_MD5=b55e6f18cba2d498fc49cf9643edc9fe'
   debug: Locking bundle 'inventory_control' with lock 'lock.inventory_control.reports.-dev.inventory_control__package_refresh_module_enabled_5998_MD5=8ed765eab0a379b899bb20ffff361149'
   debug: Locking bundle 'inventory_autorun' with lock 'lock.inventory_autorun.methods.usebundle.handle.-dev.method_packages_refresh__7849_MD5=bf16d30d8791bfd0e9b9bdde5a2e3f1d'
[...]
root@dev ~ $
```

These strings are then hashed using MD5, and the hex-encoded MD5 digest is used as the key in LMDB (32 characters plus the terminating NUL, which matches the 33-byte keys in the dumps above).

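
To make the format string concrete, the following sketch wraps the same `snprintf` pattern in a standalone function. `BuildLockName` and its `prefix` parameter are hypothetical and are not in `locks.c`. The way the first debug log line is split into operator and operand in the usage note is an assumption. The MD5 hashing step is left out, since it needs a crypto library.

```C
#include <assert.h>
#include <stdio.h>
#include <string.h>

/* Compose "<prefix>.<bundle>.<operator>.<operand>_<sum>_<digest>".
 * The bundle name and operand are cut to 100 characters by "%.100s".
 * Returns the string length, or -1 if buf is too small.
 * Hypothetical helper mirroring the snprintf() calls shown above. */
int BuildLockName(char *buf, size_t size, const char *prefix,
                  const char *bundle_name, const char *cc_operator,
                  const char *cc_operand, int sum, const char *str_digest)
{
    assert(buf != NULL && prefix != NULL && strlen(prefix) > 0);

    int len = snprintf(buf, size, "%s.%.100s.%s.%.100s_%d_%s",
                       prefix, bundle_name, cc_operator, cc_operand,
                       sum, str_digest);
    if (len < 0 || (size_t) len >= size)
    {
        return -1; /* Encoding error or truncated output */
    }
    return len;
}
```

Calling it with the prefix `lock`, bundle `inventory_control`, operator `reports`, operand `-dev.inventory_control__LSB_module_enabled`, sum `5317`, and digest `MD5=17cc5c06fc415f344762d927f53723b7` reproduces the first lock string in the debug log. Using the prefix `last` gives the second string of the pair.
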
#### Lock values

This is the C struct stored as a value in LMDB:

```C
typedef struct
{
    pid_t pid;
    time_t time;
    time_t process_start_time;
} LockData;
```

It appears to be created like this:

```C
static bool WriteLockDataCurrent(CF_DB *dbp, const char *lock_id)
{
    LockData lock_data = { 0 };
    lock_data.pid = getpid();
    lock_data.time = time(NULL);
    lock_data.process_start_time = GetProcessStartTime(getpid());

    return WriteLockData(dbp, lock_id, &lock_data);
}
```

In short, each lock value records:

* The PID and start time of the process which locked this promise.
* The time the promise was locked.

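
The 24-byte values in the `mdb_dump` output above can be decoded by hand. The sketch below assumes the layout of a typical 64-bit little-endian Linux build: a 4-byte `pid_t`, 4 bytes of padding, and two 8-byte `time_t` fields. That layout is an assumption, not something the source guarantees, and `DecodedLock`, `ReadLittleEndian`, and `DecodeLockValue` are hypothetical names.

```C
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Fixed-width view of a lock value, independent of the host's own types. */
typedef struct
{
    int32_t pid;
    int64_t time;
    int64_t process_start_time;
} DecodedLock;

/* Read count bytes (at most 8) as a little-endian unsigned integer. */
static uint64_t ReadLittleEndian(const unsigned char *bytes, size_t count)
{
    uint64_t value = 0;
    for (size_t i = count; i > 0; i--)
    {
        value = (value << 8) | bytes[i - 1];
    }
    return value;
}

/* Decode a 24-byte lock value under the assumed layout:
 * bytes 0-3 pid, 4-7 padding, 8-15 time, 16-23 process_start_time.
 * Returns 0 on success, -1 if the size does not match. */
int DecodeLockValue(const unsigned char *value, size_t size, DecodedLock *out)
{
    assert(value != NULL && out != NULL);

    if (size != 24)
    {
        return -1;
    }
    out->pid = (int32_t) ReadLittleEndian(value, 4);
    out->time = (int64_t) ReadLittleEndian(value + 8, 8);
    out->process_start_time = (int64_t) ReadLittleEndian(value + 16, 8);
    return 0;
}
```

Under these assumptions, the repeated value `\f1=\00...,\d4c]\00...r\16\00...` from the dump decodes to PID 15857, a lock time of 1566823468 seconds since the epoch, and a process start time of 5746.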