.. _stats:

**********
Statistics
**********

Statistics Overview
===================

Both Kea DHCP servers support statistics gathering. A working DHCP
server encounters various events that can cause certain statistics to be
collected. For example, a DHCPv4 server may receive a packet
(the ``pkt4-received`` statistic increases by one) that after parsing is
identified as a DHCPDISCOVER (``pkt4-discover-received``). The server
processes it and decides to send a DHCPOFFER representing its answer
(the ``pkt4-offer-sent`` and ``pkt4-sent`` statistics increase by one). Such events
happen frequently, so it is not uncommon for the statistics to have
values in the high thousands. They can serve as an easy and powerful
tool for observing a server's and a network's health. For example, if
the ``pkt4-received`` statistic stops growing, it means that the clients'
packets are not reaching the server.

There are four types of statistics:

-  *integer* - this is the most common type. It is implemented as a
   64-bit integer (``int64_t`` in C++), so it can hold any value from -2^63
   to 2^63-1.

-  *floating point* - this type is intended to store floating-point
   values. It is implemented as a C++ ``double`` type.

-  *duration* - this type is intended for recording time periods. It
   uses the ``boost::posix_time::time_duration`` type, which stores hours,
   minutes, seconds, and microseconds.

-  *string* - this type is intended for recording statistics in textual
   form. It uses the C++ ``std::string`` type.

During normal operation, the DHCPv4 and DHCPv6 servers gather
statistics. For a list of DHCPv4 and DHCPv6 statistics, see
:ref:`dhcp4-stats` and :ref:`dhcp6-stats`, respectively.

To extract data from the statistics module, the control channel can be
used. See :ref:`ctrl-channel` for details. It is possible to
retrieve a single statistic or all statistics, reset statistics (i.e.
set to a neutral value, typically zero), or even completely remove a
single statistic or all statistics. See the section :ref:`command-stats`
for a list of statistics-oriented commands.

Statistics can be used by external tools to monitor Kea. One example of such a tool is Stork.
See :ref:`stork` for details on how to use it to retrieve statistics periodically (and use
other data sources) to get better insight into Kea health and operational status.

.. _stats-lifecycle:

Statistics Lifecycle
====================

In Kea 1.6.0 and earlier, some statistics are not initialized when the
server starts. For example, the ``pkt4-received``
statistic is not available until the first DHCP packet is received.
In later Kea versions, this behavior has been changed and all of the
statistics supported by the servers are initialized at server startup
and should be returned in response to commands such as
``statistic-get-all``. The runtime statistics concerning processed DHCP
packets are initially set to 0 and are reset when the server
restarts.

Per-subnet statistics are recalculated when reconfiguration takes place.

In general, once a statistic is initialized it is held in the manager until
it is explicitly removed by ``statistic-remove`` or ``statistic-remove-all``,
or until the server is shut down.

Removing a statistic that is updated frequently makes little sense, as
it will be re-added when the server code next records that statistic.
The ``statistic-remove`` and ``statistic-remove-all`` commands are
intended to remove statistics that are not expected to be observed in
the near future. For example, a misconfigured device in a network may
cause clients to report duplicate addresses, so the server will report
increasing values of ``pkt4-decline-received``. Once the problem is found
and the device is removed, the system administrator may want to remove
the ``pkt4-decline-received`` statistic, so it will not be reported anymore. If
a duplicate address is ever detected again, the server will add this
statistic back.

.. _command-stats:

Commands for Manipulating Statistics
====================================

Several commands are defined for accessing
(-get), resetting to zero or a neutral value (-reset), or removing a
statistic completely (-remove). Further commands change the time-based
limit (-sample-age-set) and the size-based limit (-sample-count-set), which
control how long or how many samples of a given statistic are retained.

The difference between reset and remove is somewhat subtle.
The reset command sets the value of the statistic to zero or a neutral value,
so after this operation, the statistic will have a value of 0 (integer),
0.0 (float), 0h0m0s0us (duration), or "" (string).
When requested, a statistic with the values mentioned will be returned.
``Remove`` removes a statistic completely, so the statistic will no longer
be reported. Please note that the server code may add it back if there is a reason
to record it.
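
The distinction can be illustrated with a small toy model. This is only a
sketch of the behavior described above, not Kea's implementation; the
``ToyStats`` class and its method names are hypothetical.

.. code-block:: python

   class ToyStats:
       """Toy registry that mimics reset and remove; not Kea's implementation."""

       def __init__(self):
           self.values = {}

       def record(self, name, value):
           # Recording a value (re)creates the statistic if it is missing.
           self.values[name] = value

       def get(self, name):
           # A missing statistic yields an empty map.
           return {name: self.values[name]} if name in self.values else {}

       def reset(self, name):
           # type(value)() is the neutral value: 0, 0.0, "" or a zero timedelta.
           self.values[name] = type(self.values[name])()

       def remove(self, name):
           # The statistic disappears until it is recorded again.
           del self.values[name]

After ``reset`` the statistic is still reported with its neutral value; after
``remove`` a lookup returns an empty map until the next ``record`` call.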

.. note::

   The following sections describe commands that can be sent to the
   server; the examples are not fragments of a configuration file. For
   more information on sending commands to Kea, see
   :ref:`ctrl-channel`.
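
As an illustration, the following Python sketch sends one command over a
UNIX control socket and parses the reply. It assumes that the server has a
UNIX control socket configured and that the caller knows its path; the
function names ``build_command`` and ``send_command`` are hypothetical
helpers, not part of Kea.

.. code-block:: python

   import json
   import socket


   def build_command(command, arguments=None):
       """Serialize a control command as a JSON string."""
       return json.dumps({"command": command, "arguments": arguments or {}})


   def send_command(socket_path, command, arguments=None):
       """Send one command over a UNIX socket and return the parsed reply."""
       payload = build_command(command, arguments).encode("utf-8")
       buffer = b""
       with socket.socket(socket.AF_UNIX, socket.SOCK_STREAM) as sock:
           sock.connect(socket_path)
           sock.sendall(payload)
           while True:
               chunk = sock.recv(4096)
               if not chunk:
                   break  # the peer closed the connection
               buffer += chunk
               try:
                   # Stop as soon as the bytes read so far form valid JSON.
                   return json.loads(buffer.decode("utf-8"))
               except ValueError:
                   continue  # the reply is still incomplete, keep reading
       return json.loads(buffer.decode("utf-8"))

A call such as
``send_command("/path/to/the/kea/socket", "statistic-get", {"name": "pkt4-received"})``
would then return the response as a Python dictionary; the path shown is a
placeholder.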

.. _command-statistic-get:

The statistic-get Command
-------------------------

The ``statistic-get`` command retrieves a single statistic. It takes a
single-string parameter called ``name``, which specifies the statistic
name. An example command may look like this:

::

   {
       "command": "statistic-get",
       "arguments": {
           "name": "pkt4-received"
       }
   }

The server returns details of the requested statistic, with a result of
0 indicating success and the specified statistic as the value of the
"arguments" parameter. If the requested statistic is not found, the
response will contain an empty map, i.e. only { } as an argument, but
the status code will still indicate success (0).
An example response:

::

   {
       "command": "statistic-get",
       "arguments": {
           "pkt4-received": [ [ 125, "2019-07-30 10:11:19.498739" ], [ 100, "2019-07-30 10:11:19.498662" ] ]
       },
       "result": 0
   }
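
A client might extract the most recent sample from such a response as
sketched below. The helper name ``latest_sample`` is hypothetical, and the
timestamp format is inferred from the example response above.

.. code-block:: python

   import json
   from datetime import datetime

   RESPONSE_TEXT = """
   {
       "command": "statistic-get",
       "arguments": {
           "pkt4-received": [ [ 125, "2019-07-30 10:11:19.498739" ],
                              [ 100, "2019-07-30 10:11:19.498662" ] ]
       },
       "result": 0
   }
   """


   def latest_sample(response, name):
       """Return (value, timestamp) of the newest sample, or None if absent."""
       if response.get("result") != 0:
           raise RuntimeError(response.get("text", "command failed"))
       samples = response.get("arguments", {}).get(name)
       if not samples:
           return None  # an empty map means the statistic was not found
       parsed = [
           (value, datetime.strptime(stamp, "%Y-%m-%d %H:%M:%S.%f"))
           for value, stamp in samples
       ]
       # Pick the sample with the latest timestamp rather than relying on order.
       return max(parsed, key=lambda sample: sample[1])


   value, when = latest_sample(json.loads(RESPONSE_TEXT), "pkt4-received")

Because a missing statistic still produces a result of 0, the sketch treats an
empty ``arguments`` map as "not found" rather than as an error.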

.. _command-statistic-reset:

The statistic-reset Command
---------------------------

The ``statistic-reset`` command sets the specified statistic to its
neutral value: 0 for integer, 0.0 for float, 0h0m0s0us for time
duration, and "" for string type. It takes a single-string parameter
called ``name``, which specifies the statistic name. An example command
may look like this:

::

   {
       "command": "statistic-reset",
       "arguments": {
           "name": "pkt4-received"
       }
   }

If the specified statistic is found and the reset is successful, the
server responds with a status of 0, indicating success, and an empty
parameters field. If an error is encountered (e.g. the requested
statistic was not found), the server returns a status code of 1 (error)
and the text field contains the error description.

.. _command-statistic-remove:

The statistic-remove Command
----------------------------

The ``statistic-remove`` command attempts to delete a single statistic. It
takes a single-string parameter called ``name``, which specifies the
statistic name. An example command may look like this:

::

   {
       "command": "statistic-remove",
       "arguments": {
           "name": "pkt4-received"
       }
   }

If the specified statistic is found and its removal is successful, the
server responds with a status of 0, indicating success, and an empty
parameters field. If an error is encountered (e.g. the requested
statistic was not found), the server returns a status code of 1 (error)
and the text field contains the error description.

.. _command-statistic-get-all:

The statistic-get-all Command
-----------------------------

The ``statistic-get-all`` command retrieves all statistics recorded. An
example command may look like this:

::

   {
       "command": "statistic-get-all",
       "arguments": { }
   }

The server responds with details of all recorded statistics, with a
result set to 0 to indicate that it iterated over all statistics (even
when the total number of statistics is zero).
An example response returning all collected statistics:

::

   {
       "command": "statistic-get-all",
       "arguments": {
           "cumulative-assigned-addresses": [ [ 0, "2019-07-30 10:04:28.386740" ] ],
           "declined-addresses": [ [ 0, "2019-07-30 10:04:28.386733" ] ],
           "reclaimed-declined-addresses": [ [ 0, "2019-07-30 10:04:28.386735" ] ],
           "reclaimed-leases": [ [ 0, "2019-07-30 10:04:28.386736" ] ],
           "subnet[1].assigned-addresses": [ [ 0, "2019-07-30 10:04:28.386740" ] ],
           "subnet[1].cumulative-assigned-addresses": [ [ 0, "2019-07-30 10:04:28.386740" ] ],
           "subnet[1].declined-addresses": [ [ 0, "2019-07-30 10:04:28.386743" ] ],
           "subnet[1].reclaimed-declined-addresses": [ [ 0, "2019-07-30 10:04:28.386745" ] ],
           "subnet[1].reclaimed-leases": [ [ 0, "2019-07-30 10:04:28.386747" ] ],
           "subnet[1].total-addresses": [ [ 200, "2019-07-30 10:04:28.386719" ] ]
       },
       "result": 0
   }
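
A monitoring script might group the per-subnet entries of such a response by
subnet ID, as in the sketch below. The helper name ``per_subnet`` is
hypothetical, and the ``subnet[N].name`` pattern is inferred from the example
response above.

.. code-block:: python

   import re

   # Matches names such as "subnet[1].total-addresses".
   SUBNET_STAT = re.compile(r"^subnet\[(\d+)\]\.(.+)$")


   def per_subnet(arguments):
       """Group the newest value of each subnet[N].* statistic by subnet ID."""
       subnets = {}
       for name, samples in arguments.items():
           match = SUBNET_STAT.match(name)
           if not match or not samples:
               continue  # skip global statistics and empty sample lists
           # Timestamps in this fixed-width format sort correctly as strings.
           newest = max(samples, key=lambda sample: sample[1])
           subnet_id = int(match.group(1))
           subnets.setdefault(subnet_id, {})[match.group(2)] = newest[0]
       return subnets


   example = {
       "declined-addresses": [[0, "2019-07-30 10:04:28.386733"]],
       "subnet[1].assigned-addresses": [[0, "2019-07-30 10:04:28.386740"]],
       "subnet[1].total-addresses": [[200, "2019-07-30 10:04:28.386719"]],
   }
   grouped = per_subnet(example)

Passing the ``arguments`` map of a real response instead of ``example`` yields
one dictionary per subnet, from which values such as address utilization can be
derived.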

.. _command-statistic-reset-all:

The statistic-reset-all Command
-------------------------------

The ``statistic-reset-all`` command sets all statistics to their neutral
values: 0 for integer, 0.0 for float, 0h0m0s0us for time duration, and
"" for string type. An example command may look like this:

::

   {
       "command": "statistic-reset-all",
       "arguments": { }
   }

If the operation is successful, the server responds with a status of 0,
indicating success, and an empty parameters field. If an error is
encountered, the server returns a status code of 1 (error) and the text
field contains the error description.

.. _command-statistic-remove-all:

The statistic-remove-all Command
--------------------------------

The ``statistic-remove-all`` command attempts to delete all statistics. An
example command may look like this:

::

   {
       "command": "statistic-remove-all",
       "arguments": { }
   }

If the removal of all statistics is successful, the server responds with
a status of 0, indicating success, and an empty parameters field. If an
error is encountered, the server returns a status code of 1 (error) and
the text field contains the error description.

.. _command-statistic-sample-age-set:

The statistic-sample-age-set Command
----------------------------------------

The ``statistic-sample-age-set`` command sets a time-based limit
for collecting samples for a given statistic. It takes two parameters: a string
called ``name``, which specifies the statistic name, and an integer value called
``duration``, which specifies the time limit for the given statistic in seconds.
An example command may look like this:

::

   {
       "command": "statistic-sample-age-set",
       "arguments": {
           "name": "pkt4-received",
           "duration": 1245
       }
   }

The server responds with a message confirming that the limit was set
for the given statistic, with a result of 0 indicating success
and an empty parameters field. If an error is encountered (e.g. the
requested statistic was not found), the server returns a status code
of 1 (error) and the text field contains the error description.

.. _command-statistic-sample-age-set-all:

The statistic-sample-age-set-all Command
--------------------------------------------

The ``statistic-sample-age-set-all`` command sets time-based limits
for collecting samples for all statistics. It takes a single-integer parameter
called ``duration``, which specifies the time limit for the statistics
in seconds. An example command may look like this:

::

   {
       "command": "statistic-sample-age-set-all",
       "arguments": {
           "duration": 1245
       }
   }

The server responds with a message confirming that the limit was set
for all statistics, with a result of 0 indicating success
and an empty parameters field. If an error is encountered, the server returns
a status code of 1 (error) and the text field contains the error description.

.. _command-statistic-sample-count-set:

The statistic-sample-count-set Command
------------------------------------------

The ``statistic-sample-count-set`` command sets a size-based limit
for collecting samples for a given statistic. It takes two parameters: a string
called ``name``, which specifies the statistic name, and an integer value called
``max-samples``, which specifies the maximum number of samples retained for the
given statistic. An example command may look like this:

::

   {
       "command": "statistic-sample-count-set",
       "arguments": {
           "name": "pkt4-received",
           "max-samples": 100
       }
   }

The server responds with a message confirming that the limit was set
for the given statistic, with a result of 0 indicating success
and an empty parameters field. If an error is encountered (e.g. the
requested statistic was not found), the server returns a status code
of 1 (error) and the text field contains the error description.

.. _command-statistic-sample-count-set-all:

The statistic-sample-count-set-all Command
----------------------------------------------

The ``statistic-sample-count-set-all`` command sets size-based limits
for collecting samples for all statistics. It takes a single-integer parameter
called ``max-samples``, which specifies the maximum number of samples retained
for each statistic. An example command may look like this:

::

   {
       "command": "statistic-sample-count-set-all",
       "arguments": {
           "max-samples": 100
       }
   }

The server responds with a message confirming that the limit was set
for all statistics, with a result of 0 indicating success
and an empty parameters field. If an error is encountered, the server returns
a status code of 1 (error) and the text field contains the error description.

.. _time-series:

Time Series
===========

Previously, by default, each statistic held only a single data point. When Kea
attempted to record a new value, the existing value was overwritten.
That approach has the benefit of taking up little memory and it covers most
cases reasonably well. However, there may be cases where many
data points are needed for a statistic. For example, some measurements, such as received
packet size, packet processing time, or the number of database queries needed to
process a packet, are not cumulative and it would be useful to keep many data
points, perhaps to do some form of statistical analysis afterwards.

Since Kea 1.6, by default, each statistic holds 20 data points. Setting such
a limit prevents unlimited memory growth.
There are two ways to define the limits: time-based (e.g. keep samples from
the last 5 minutes) and size-based. The size-based
limit can be changed with one of two commands: ``statistic-sample-count-set``
sets the size limit for a single statistic, and ``statistic-sample-count-set-all``
sets size-based limits for all statistics. To set a time-based
limit for a single statistic, use ``statistic-sample-age-set``; use
``statistic-sample-age-set-all`` to set time-based limits for all statistics.
For a given statistic only one type of limit can be active: storage
is limited either by the time-based limit or by the size-based limit, never by both.
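
The rule that only one limit is active at a time can be sketched with a toy
sample store. This is an illustration of the idea only, not Kea's
implementation; the ``SampleStore`` class and its method names are
hypothetical, and the default of 20 samples mirrors the default mentioned above.

.. code-block:: python

   from collections import deque
   from datetime import datetime, timedelta


   class SampleStore:
       """Toy time series with one active limit: sample count or sample age."""

       def __init__(self, max_samples=20):
           self.samples = deque()  # newest sample first
           self.max_samples = max_samples
           self.max_age = None

       def set_max_samples(self, count):
           # Activating the size-based limit disables the time-based one.
           self.max_samples, self.max_age = count, None

       def set_max_age(self, age):
           # Activating the time-based limit disables the size-based one.
           self.max_age, self.max_samples = age, None

       def add(self, value, when):
           self.samples.appendleft((value, when))
           if self.max_samples is not None:
               while len(self.samples) > self.max_samples:
                   self.samples.pop()  # drop the oldest sample
           else:
               while when - self.samples[-1][1] > self.max_age:
                   self.samples.pop()  # drop samples that are too old


   start = datetime(2019, 7, 30, 10, 0, 0)
   store = SampleStore(max_samples=2)
   for offset, value in enumerate([1, 2, 3]):
       store.add(value, start + timedelta(seconds=offset))
   by_count = [value for value, _ in store.samples]

   store.set_max_age(timedelta(seconds=10))
   store.add(4, start + timedelta(seconds=100))
   by_age = [value for value, _ in store.samples]

With the size-based limit of 2, only the two newest samples survive; after
switching to a 10-second age limit, a sample recorded 100 seconds later evicts
everything older than 10 seconds.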