.. _tags:

Graphite Tag Support
====================
Starting with the 1.1.x series, Graphite supports storing data using tags to identify each series.  This allows for much more flexibility than the traditional hierarchical layout.  When using tag support, each series is uniquely identified by its name and set of tag/value pairs.

Carbon
------

To enter tagged series into Graphite, pass them to Carbon with the tags appended to the series name:

.. code-block:: none

    my.series;tag1=value1;tag2=value2

Carbon will automatically decode the tags, normalize the tag order, and register the series in the tag database.
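As an illustration, the following minimal Python sketch builds a tagged series name on the client side and applies the character restrictions on tag names and values that this section describes. The helper names (``validate_tag``, ``tagged_path``, ``plaintext_line``) are hypothetical and not part of Carbon. The sketch also assumes that datapoints are sent with Carbon's plaintext protocol (``<path> <value> <timestamp>``). Sorting the tags locally is optional, since Carbon normalizes the order itself; it only mirrors the canonical form.

```python
# Illustrative only: these helper names are not part of Carbon or graphite-web.
FORBIDDEN_IN_TAG_NAME = set(";!^=")


def validate_tag(name, value):
    """Raise ValueError if a tag breaks the naming rules in this section."""
    if not name or any(ch in FORBIDDEN_IN_TAG_NAME for ch in name):
        raise ValueError("invalid tag name: %r" % (name,))
    if not value or ";" in value or value.startswith("~"):
        raise ValueError("invalid tag value: %r" % (value,))


def tagged_path(metric, tags):
    """Append tags to a metric name, sorted by tag name."""
    for name, value in tags.items():
        validate_tag(name, value)
    return metric + "".join(";%s=%s" % (name, tags[name]) for name in sorted(tags))


def plaintext_line(metric, tags, value, timestamp):
    """Format one datapoint, assuming the plaintext protocol is in use."""
    return "%s %s %d\n" % (tagged_path(metric, tags), value, timestamp)


print(tagged_path("my.series", {"tag2": "value2", "tag1": "value1"}))
# prints: my.series;tag1=value1;tag2=value2
```

The string returned by ``plaintext_line`` could then be written to the socket of a Carbon plaintext listener.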

Tag names must have a length >= 1 and may contain any ASCII characters except ``;!^=``. Tag values must also have a length >= 1; they may contain any ASCII characters except ``;``, and the first character must not be ``~``. UTF-8 characters may work for names and values, but they are not well tested, so using non-ASCII characters in metric names or tags is not recommended.

Metric names are indexed under the special tag `name`. If a metric name starts with one or more `~` characters, they are removed from the derived tag value, because `~` is not allowed as the first character of a tag value. A metric name that consists only of `~` characters is considered invalid and may be dropped.

.. _querying-tagged-series:

Querying
--------

When querying tagged series, we start with the `seriesByTag <functions.html#graphite.render.functions.seriesByTag>`_ function:

.. code-block:: none

    # find all series that have tag1 set to value1
    seriesByTag('tag1=value1')

This function returns a `seriesList` that can then be used by any other Graphite functions:

.. code-block:: none

    # find all series that have tag1 set to value1, sorted by total
    seriesByTag('tag1=value1') | sortByTotal()

The `seriesByTag <functions.html#graphite.render.functions.seriesByTag>`_ function supports specifying any number of tag expressions to refine the list of matches.  When multiple tag expressions are specified, only series that match all the expressions will be returned.

Tag expressions are strings, and may have the following formats:

.. code-block:: none

    tag=spec    tag value exactly matches spec
    tag!=spec   tag value does not exactly match spec
    tag=~spec   tag value matches the regular expression spec
    tag!=~spec  tag value does not match the regular expression spec

Any tag spec that matches an empty value is considered to match series that don't have that tag.  At least one tag spec must require a non-empty value.

Regular expression conditions are treated as being anchored at the start of the value.

A more complex example:

.. code-block:: none

    # find all series where name matches the regular expression cpu\..*, AND tag1 is not value1
    seriesByTag('name=~cpu\..*', 'tag1!=value1')
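To make the matching rules concrete, here is a minimal Python sketch that evaluates tag expressions against series held in memory. It is an illustration of the semantics described above, not Graphite's implementation: the names ``matches`` and ``series_by_tag`` are hypothetical, and each series is assumed to be a plain dict of its tags, including the special ``name`` tag.

```python
import re

# Longest operators first so that '!=~' is not read as '!=' followed by '~'.
EXPRESSION = re.compile(r"([^;!^=]+)(!=~|=~|!=|=)(.*)")


def matches(tags, expr):
    """Return True if a series' tags (including 'name') satisfy one expression."""
    parsed = EXPRESSION.fullmatch(expr)
    if parsed is None:
        raise ValueError("invalid tag expression: %r" % (expr,))
    tag, operator, spec = parsed.groups()
    value = tags.get(tag, "")  # a missing tag is treated as an empty value
    if operator == "=":
        return value == spec
    if operator == "!=":
        return value != spec
    hit = re.match(spec, value) is not None  # re.match anchors at the start
    return hit if operator == "=~" else not hit


def series_by_tag(series, *exprs):
    """Keep the series that match every expression."""
    # A series with no tags at all has only empty values, so if it satisfies
    # every expression, none of them requires a non-empty value.
    if all(matches({}, expr) for expr in exprs):
        raise ValueError("at least one expression must require a non-empty value")
    return [tags for tags in series if all(matches(tags, e) for e in exprs)]


series = [
    {"name": "cpu.user", "tag1": "value1"},
    {"name": "cpu.system", "tag1": "value2"},
    {"name": "cpu.idle"},
    {"name": "disk.used", "tag1": "value2"},
]
selected = series_by_tag(series, r"name=~cpu\..*", "tag1!=value1")
print([tags["name"] for tags in selected])
# prints: ['cpu.system', 'cpu.idle']
```

Note that ``cpu.idle`` is selected even though it has no ``tag1`` at all, because ``tag1!=value1`` also matches an empty value.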

Once you have selected a seriesList, it is possible to group series together using the `groupByTags <functions.html#graphite.render.functions.groupByTags>`_ function, which operates on tags in the same way that `groupByNodes <functions.html#graphite.render.functions.groupByNodes>`_ works on nodes within a traditional naming hierarchy.

.. code-block:: none

    # get a list of disk space used per datacenter for all webheads
    seriesByTag('name=disk.used', 'server=~web.*') | groupByTags('sumSeries', 'datacenter')

    # given series like:
    # disk.used;datacenter=dc1;rack=a1;server=web01
    # disk.used;datacenter=dc1;rack=b2;server=web02
    # disk.used;datacenter=dc2;rack=c3;server=web01
    # disk.used;datacenter=dc2;rack=d4;server=web02

    # will return the following new series, each containing the sum of the values for that datacenter:
    # disk.used;datacenter=dc1
    # disk.used;datacenter=dc2
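The grouping step can be pictured with a small Python sketch. It is only an approximation of what ``groupByTags('sumSeries', 'datacenter')`` does: the function name ``group_by_tags_sum`` and the sample datapoints are made up for illustration, every series is assumed to carry all the grouping tags, and every datapoint is assumed to be numeric.

```python
from collections import defaultdict


def group_by_tags_sum(series, *group_tags):
    """Sum the datapoints of all series that share the same grouping tag values."""
    groups = defaultdict(list)
    for tags, values in series:
        # The new series keeps the name plus only the tags used for grouping.
        key = tags["name"] + "".join(";%s=%s" % (t, tags[t]) for t in sorted(group_tags))
        groups[key].append(values)
    # zip(*rows) walks the grouped series one timestamp at a time.
    return {key: [sum(column) for column in zip(*rows)] for key, rows in groups.items()}


series = [
    ({"name": "disk.used", "datacenter": "dc1", "rack": "a1", "server": "web01"}, [10, 20]),
    ({"name": "disk.used", "datacenter": "dc1", "rack": "b2", "server": "web02"}, [1, 2]),
    ({"name": "disk.used", "datacenter": "dc2", "rack": "c3", "server": "web01"}, [5, 5]),
    ({"name": "disk.used", "datacenter": "dc2", "rack": "d4", "server": "web02"}, [7, 8]),
]
print(group_by_tags_sum(series, "datacenter"))
# prints: {'disk.used;datacenter=dc1': [11, 22], 'disk.used;datacenter=dc2': [12, 13]}
```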

Finally, the `aliasByTags <functions.html#graphite.render.functions.aliasByTags>`_ function is used to help format series names for display.  It is the tag-based equivalent of the `aliasByNode <functions.html#graphite.render.functions.aliasByNode>`_ function.

.. code-block:: none

    # given series like:
    # disk.used;datacenter=dc1;rack=a1;server=web01
    # disk.used;datacenter=dc1;rack=b2;server=web02

    # format series name using the server tag followed by the name:
    seriesByTag('name=disk.used','datacenter=dc1') | aliasByTags('server', 'name')

    # will return
    # web01.disk.used
    # web02.disk.used
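In Python terms, the alias in this example is the requested tag values joined with dots. The sketch below is illustrative only: ``alias_by_tags`` is a hypothetical helper, and skipping tags that a series does not have is an assumption rather than documented behavior.

```python
def alias_by_tags(tags, *alias_tags):
    """Join the values of the requested tags with dots, in the order given."""
    # Assumption: tags that the series does not carry are skipped.
    return ".".join(tags[t] for t in alias_tags if t in tags)


tags = {"name": "disk.used", "datacenter": "dc1", "rack": "a1", "server": "web01"}
print(alias_by_tags(tags, "server", "name"))
# prints: web01.disk.used
```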

If a tag name or value contains quotes (``'"``), then they will need to be escaped properly. For example, a series with a tag ``tagName='quotedValue'`` could be queried with ``seriesByTag('tagName=\'quotedValue\'')`` or alternatively ``seriesByTag("tagName='quotedValue'")``.

Database Storage
----------------
As Whisper and other storage backends are designed to hold simple time-series data (metric key, value, and timestamp), Graphite stores tag information in a separate tag database (TagDB).  The TagDB is a pluggable store.  By default it uses the Graphite SQLite, MySQL or PostgreSQL database, but it can also be configured to use an external Redis server or a custom plugin.

.. note::

  Tag support requires Graphite webapp & carbon version 1.1.1 or newer.

Local Database TagDB
^^^^^^^^^^^^^^^^^^^^

The Local TagDB stores tag information in tables inside the graphite-web database.  It supports SQLite, MySQL and PostgreSQL, and is enabled by default.

Redis TagDB
^^^^^^^^^^^

The Redis TagDB will store the tag information on a Redis server, and is selected by setting ``TAGDB='graphite.tags.redis.RedisTagDB'`` in `local_settings.py`.  There are 4 additional config settings for the Redis TagDB::

    TAGDB_REDIS_HOST = 'localhost'
    TAGDB_REDIS_PORT = 6379
    TAGDB_REDIS_DB = 0
    TAGDB_REDIS_PASSWORD = ''

The default settings (above) will connect to a local Redis server on the default port, and use the default database without a password.

HTTP(S) TagDB
^^^^^^^^^^^^^

The HTTP(S) TagDB is used to delegate all tag operations to an external server that implements the Graphite tagging HTTP API.  It can be used in clustered Graphite scenarios, or with custom data stores.  It is selected by setting ``TAGDB='graphite.tags.http.HttpTagDB'`` in `local_settings.py`.  There are 4 additional config settings for the HTTP(S) TagDB::

    TAGDB_HTTP_URL = 'https://another.server'
    TAGDB_HTTP_USER = ''
    TAGDB_HTTP_PASSWORD = ''
    TAGDB_HTTP_AUTOCOMPLETE = False

The ``TAGDB_HTTP_URL`` is required. ``TAGDB_HTTP_USER`` and ``TAGDB_HTTP_PASSWORD`` are optional; if specified, they are used to send a Basic Authorization header with every request.

``TAGDB_HTTP_AUTOCOMPLETE`` is also optional.  If set to ``True``, auto-complete requests are forwarded to the remote TagDB; otherwise calls to `/tags/findSeries`, `/tags` & `/tags/<tag>` are used to provide auto-complete functionality.

If ``REMOTE_STORE_FORWARD_HEADERS`` is defined, those headers will also be forwarded to the remote TagDB.

Adding Series to the TagDB
--------------------------
Normally `carbon` takes care of this: it submits all new series to the TagDB, and periodically re-submits all series to ensure that the TagDB is kept up to date.  Two `carbon` configuration settings relate to tagging: the `GRAPHITE_URL` setting specifies the URL of your graphite-web installation (default `http://127.0.0.1:8000`), and the `TAG_UPDATE_INTERVAL` setting specifies how often each series is re-submitted to the TagDB (default is every 100th update).

Series can also be submitted via HTTP POST to the ``/tags/tagSeries`` endpoint, using command-line tools such as ``curl`` or any of a variety of HTTP programming libraries.

.. code-block:: none

    $ curl -X POST "http://graphite/tags/tagSeries" \
      --data-urlencode 'path=disk.used;rack=a1;datacenter=dc1;server=web01'

    "disk.used;datacenter=dc1;rack=a1;server=web01"

This endpoint returns the canonicalized version of the path, with the tags sorted in alphabetical order.

To add multiple series with a single HTTP request, use the ``/tags/tagMultiSeries`` endpoint, which supports multiple ``path`` parameters:

.. code-block:: none

    $ curl -X POST "http://graphite/tags/tagMultiSeries" \
      --data-urlencode 'path=disk.used;rack=a1;datacenter=dc1;server=web01' \
      --data-urlencode 'path=disk.used;rack=a1;datacenter=dc1;server=web02' \
      --data-urlencode 'pretty=1'

    [
      "disk.used;datacenter=dc1;rack=a1;server=web01",
      "disk.used;datacenter=dc1;rack=a1;server=web02"
    ]

This endpoint returns a list of the canonicalized paths, in the same order they are specified.
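The same request can be made from an HTTP library. The sketch below uses only the Python standard library; the function names are hypothetical, and ``http://graphite`` is the placeholder host from the ``curl`` examples. It sends one form-encoded ``path`` parameter per series, which is what ``--data-urlencode`` does above.

```python
import json
import urllib.parse
import urllib.request


def build_tag_request(base_url, paths):
    """Build a POST request for /tags/tagMultiSeries without sending it."""
    # A list of pairs lets the 'path' key repeat once per series.
    body = urllib.parse.urlencode([("path", p) for p in paths]).encode("ascii")
    url = base_url.rstrip("/") + "/tags/tagMultiSeries"
    return urllib.request.Request(url, data=body, method="POST")


def tag_multi_series(base_url, paths):
    """Send the request and return the list of canonicalized paths."""
    with urllib.request.urlopen(build_tag_request(base_url, paths)) as response:
        return json.load(response)


request = build_tag_request("http://graphite", [
    "disk.used;rack=a1;datacenter=dc1;server=web01",
    "disk.used;rack=a1;datacenter=dc1;server=web02",
])
print(request.get_method(), request.full_url)
# prints: POST http://graphite/tags/tagMultiSeries
```

Calling ``tag_multi_series`` requires a reachable graphite-web instance, so only the request construction is exercised here.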

Exploring Tags
--------------
You can use the HTTP API to get lists of defined tags, values for each tag, and to find series using the same logic as the `seriesByTag <functions.html#graphite.render.functions.seriesByTag>`_ function.

To get a list of defined tags:

.. code-block:: none

    $ curl -s "http://graphite/tags?pretty=1"

    [
      {
        "tag": "datacenter"
      },
      {
        "tag": "name"
      },
      {
        "tag": "rack"
      },
      {
        "tag": "server"
      }
    ]

You can filter the returned list by providing a regular expression in the `filter` parameter:

.. code-block:: none

    $ curl -s "http://graphite/tags?pretty=1&filter=data"

    [
      {
        "tag": "datacenter"
      }
    ]

To get a list of values for a specific tag:

.. code-block:: none

    $ curl -s "http://graphite/tags/datacenter?pretty=1"

    {
      "tag": "datacenter",
      "values": [
        {
          "count": 2,
          "value": "dc1"
        },
        {
          "count": 2,
          "value": "dc2"
        }
      ]
    }

You can filter the returned list of values using the `filter` parameter:

.. code-block:: none

    $ curl -s "http://graphite/tags/datacenter?pretty=1&filter=dc1"

    {
      "tag": "datacenter",
      "values": [
        {
          "count": 2,
          "value": "dc1"
        }
      ]
    }

Finally, to search for series matching a set of tag expressions:

.. code-block:: none

    $ curl -s "http://graphite/tags/findSeries?pretty=1&expr=datacenter=dc1&expr=server=web01"

    [
      "disk.used;datacenter=dc1;rack=a1;server=web01"
    ]
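Tag expressions contain ``=`` and, for regular expressions, often other characters that have a special meaning in URLs. When building such a query in code, it is safer to percent-encode each expression. The following Python sketch shows one way to do that with the standard library; ``find_series_url`` is a hypothetical helper, and ``http://graphite`` is the placeholder host used in the examples above.

```python
import urllib.parse


def find_series_url(base_url, exprs, pretty=False):
    """Build a /tags/findSeries URL with one 'expr' parameter per expression."""
    params = [("pretty", "1")] if pretty else []
    params += [("expr", expr) for expr in exprs]
    return base_url.rstrip("/") + "/tags/findSeries?" + urllib.parse.urlencode(params)


print(find_series_url("http://graphite", ["datacenter=dc1", "server=web01"], pretty=True))
# prints: http://graphite/tags/findSeries?pretty=1&expr=datacenter%3Ddc1&expr=server%3Dweb01
```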

Auto-complete Support
---------------------
The HTTP API provides 2 endpoints to support auto-completion of tags and values based on the series which match a provided set of tag expressions.

Each of these endpoints accepts an optional list of tag expressions using the same syntax as the `/tags/findSeries` endpoint.

The provided expressions are used to filter the results, so that the suggested list of tags will only include tags that occur in series matching the expressions.

Results are limited to 100 by default; this can be overridden by passing `limit=X` in the request parameters.  The returned JSON is a compact representation by default; if `pretty=1` is passed in the request parameters, the returned JSON will be formatted with newlines and indentation.

To get an auto-complete list of tags:

.. code-block:: none

    $ curl -s "http://graphite/tags/autoComplete/tags?pretty=1&limit=100"

    [
      "datacenter",
      "name",
      "rack",
      "server"
    ]

To filter by prefix:

.. code-block:: none

    $ curl -s "http://graphite/tags/autoComplete/tags?pretty=1&tagPrefix=d"

    [
      "datacenter"
    ]

If you provide a list of tag expressions, the specified tags are excluded and the result is filtered to only tags that occur in series matching those expressions:

.. code-block:: none

    $ curl -s "http://graphite/tags/autoComplete/tags?pretty=1&expr=datacenter=dc1&expr=server=web01"

    [
      "name",
      "rack"
    ]
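The filtering behavior of the tag auto-complete endpoint can be mimicked locally, which may help when reasoning about what a given request should return. The Python sketch below is a simplified model, not the server's implementation: the function names are hypothetical, only exact-match (``tag=value``) expressions are handled, and the series list is the ``disk.used`` sample set used earlier on this page.

```python
def parse_path(path):
    """Split a canonical path such as 'name;tag=value' into a dict of tags."""
    name, *pairs = path.split(";")
    tags = dict(pair.split("=", 1) for pair in pairs)
    tags["name"] = name
    return tags


def autocomplete_tags(paths, exprs=(), tag_prefix="", limit=100):
    """Suggest tags from series matching every expression (exact match only)."""
    wanted = dict(expr.split("=", 1) for expr in exprs)
    found = set()
    for tags in map(parse_path, paths):
        if all(tags.get(tag, "") == value for tag, value in wanted.items()):
            # Tags already used in an expression are excluded from suggestions.
            found.update(t for t in tags if t not in wanted and t.startswith(tag_prefix))
    return sorted(found)[:limit]


paths = [
    "disk.used;datacenter=dc1;rack=a1;server=web01",
    "disk.used;datacenter=dc1;rack=b2;server=web02",
    "disk.used;datacenter=dc2;rack=c3;server=web01",
    "disk.used;datacenter=dc2;rack=d4;server=web02",
]
print(autocomplete_tags(paths, exprs=["datacenter=dc1", "server=web01"]))
# prints: ['name', 'rack']
```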

To get an auto-complete list of values for a specified tag:

.. code-block:: none

    $ curl -s "http://graphite/tags/autoComplete/values?pretty=1&tag=rack"

    [
      "a1",
      "a2",
      "b1",
      "b2"
    ]

To filter by prefix:

.. code-block:: none

    $ curl -s "http://graphite/tags/autoComplete/values?pretty=1&tag=rack&valuePrefix=a"

    [
      "a1",
      "a2"
    ]

If you provide a list of tag expressions, the result is filtered to only values that occur for the specified tag in series matching those expressions:

.. code-block:: none

    $ curl -s "http://graphite/tags/autoComplete/values?pretty=1&tag=rack&expr=datacenter=dc1&expr=server=web01"

    [
      "a1"
    ]

Removing Series from the TagDB
------------------------------
When a series is deleted from the data store (for example, by deleting `.wsp` files from the whisper storage folders), it should also be removed from the tag database.  Series that exist in the tag database but not in the data store won't cause any problems with graphing, but they make the system do unnecessary work during graph rendering, so it is recommended to clean up the tag database whenever series are removed from the data store.

Series can be deleted via HTTP POST to the `/tags/delSeries` endpoint:

.. code-block:: none

    $ curl -X POST "http://graphite/tags/delSeries" \
      --data-urlencode 'path=disk.used;datacenter=dc1;rack=a1;server=web01'

    true

To delete multiple series at once, pass multiple ``path`` parameters:

.. code-block:: none

    $ curl -X POST "http://graphite/tags/delSeries" \
      --data-urlencode 'path=disk.used;datacenter=dc1;rack=a1;server=web01' \
      --data-urlencode 'path=disk.used;datacenter=dc1;rack=a1;server=web02'

    true
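A cleanup job could combine these pieces: work out which series are registered in the TagDB but no longer exist in the data store, then delete them in one request. The Python sketch below is one possible shape for that job. Both input lists are assumed to be collected elsewhere (for example, the TagDB side from ``/tags/findSeries``), and the helper names are hypothetical.

```python
import urllib.parse
import urllib.request


def stale_series(tagdb_paths, store_paths):
    """Return the series known to the TagDB but absent from the data store."""
    return sorted(set(tagdb_paths) - set(store_paths))


def del_series_request(base_url, paths):
    """Build a POST request for /tags/delSeries without sending it."""
    body = urllib.parse.urlencode([("path", p) for p in paths]).encode("ascii")
    url = base_url.rstrip("/") + "/tags/delSeries"
    return urllib.request.Request(url, data=body, method="POST")


tagdb_paths = [
    "disk.used;datacenter=dc1;rack=a1;server=web01",
    "disk.used;datacenter=dc1;rack=a1;server=web02",
]
store_paths = ["disk.used;datacenter=dc1;rack=a1;server=web01"]

stale = stale_series(tagdb_paths, store_paths)
print(stale)
# prints: ['disk.used;datacenter=dc1;rack=a1;server=web02']
request = del_series_request("http://graphite", stale)
```

The request can then be sent with ``urllib.request.urlopen(request)`` against a live graphite-web instance.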