| Name | | Date | Size | #Lines | LOC |
|---|---|---|---|---|---|
| .. | | 03-May-2022 | - |
| .circleci/ | H | 11-Dec-2019 | - | 132 | 124 |
| .github/ | H | 11-Dec-2019 | - | 48 | 22 |
| api/ | H | 11-Dec-2019 | - | 16,165 | 10,422 |
| asset/ | H | 11-Dec-2019 | - | 493 | 379 |
| cli/ | H | 11-Dec-2019 | - | 2,449 | 1,712 |
| client/ | H | 11-Dec-2019 | - | 818 | 659 |
| cluster/ | H | 11-Dec-2019 | - | 2,338 | 1,920 |
| cmd/ | H | 11-Dec-2019 | - | 794 | 653 |
| config/ | H | 11-Dec-2019 | - | 3,381 | 2,709 |
| dispatch/ | H | 11-Dec-2019 | - | 1,489 | 1,144 |
| doc/ | H | 03-May-2022 | - | 251 | 197 |
| examples/ | H | 11-Dec-2019 | - | 125 | 106 |
| inhibit/ | H | 11-Dec-2019 | - | 606 | 476 |
| nflog/ | H | 11-Dec-2019 | - | 2,345 | 2,004 |
| notify/ | H | 11-Dec-2019 | - | 5,234 | 4,015 |
| pkg/ | H | 11-Dec-2019 | - | 487 | 370 |
| provider/ | H | 11-Dec-2019 | - | 614 | 437 |
| scripts/ | H | 03-May-2022 | - | 73 | 37 |
| silence/ | H | 11-Dec-2019 | - | 3,714 | 3,256 |
| store/ | H | 11-Dec-2019 | - | 251 | 177 |
| template/ | H | 03-May-2022 | - | 960 | 809 |
| test/ | H | 11-Dec-2019 | - | 5,355 | 3,773 |
| types/ | H | 11-Dec-2019 | - | 1,295 | 937 |
| ui/ | H | 11-Dec-2019 | - | 9,422 | 7,922 |
| vendor/ | H | 03-May-2022 | - | 576,490 | 463,759 |
| .dockerignore | H A D | 11-Dec-2019 | 83 | 7 | 5 |
| .gitignore | H A D | 11-Dec-2019 | 278 | 20 | 18 |
| .golangci.yml | H A D | 11-Dec-2019 | 199 | 14 | 11 |
| .promu.yml | H A D | 11-Dec-2019 | 1.3 KiB | 49 | 48 |
| CHANGELOG.md | H A D | 11-Dec-2019 | 28 KiB | 546 | 463 |
| Dockerfile | H A D | 11-Dec-2019 | 714 | 22 | 18 |
| LICENSE | H A D | 11-Dec-2019 | 11.1 KiB | 202 | 169 |
| MAINTAINERS.md | H A D | 11-Dec-2019 | 118 | 4 | 3 |
| Makefile | H A D | 11-Dec-2019 | 2 KiB | 57 | 30 |
| Makefile.common | H A D | 11-Dec-2019 | 9 KiB | 278 | 203 |
| NOTICE | H A D | 11-Dec-2019 | 457 | 19 | 13 |
| Procfile | H A D | 11-Dec-2019 | 621 | 6 | 4 |
| README.md | H A D | 11-Dec-2019 | 15.7 KiB | 400 | 301 |
| VERSION | H A D | 11-Dec-2019 | 7 | 2 | 1 |
| go.mod | H A D | 11-Dec-2019 | 1.4 KiB | 38 | 35 |
| go.sum | H A D | 11-Dec-2019 | 25.4 KiB | 268 | 267 |
README.md
# Alertmanager [![CircleCI](https://circleci.com/gh/prometheus/alertmanager/tree/master.svg?style=shield)][circleci]

[![Docker Repository on Quay](https://quay.io/repository/prometheus/alertmanager/status "Docker Repository on Quay")][quay]
[![Docker Pulls](https://img.shields.io/docker/pulls/prom/alertmanager.svg?maxAge=604800)][hub]

The Alertmanager handles alerts sent by client applications such as the Prometheus server. It takes care of deduplicating, grouping, and routing them to the correct receiver integrations such as email, PagerDuty, or OpsGenie. It also takes care of silencing and inhibition of alerts.

* [Documentation](http://prometheus.io/docs/alerting/alertmanager/)

## Install

There are various ways of installing Alertmanager.

### Precompiled binaries

Precompiled binaries for released versions are available in the
[*download* section](https://prometheus.io/download/)
on [prometheus.io](https://prometheus.io). Using the latest production release binary
is the recommended way of installing Alertmanager.

### Docker images

Docker images are available on [Quay.io](https://quay.io/repository/prometheus/alertmanager).

### Compiling the binary

You can either `go get` it:

```
$ GO15VENDOREXPERIMENT=1 go get github.com/prometheus/alertmanager/cmd/...
# cd $GOPATH/src/github.com/prometheus/alertmanager
$ alertmanager --config.file=<your_file>
```

Or clone the repository and build manually:

```
$ mkdir -p $GOPATH/src/github.com/prometheus
$ cd $GOPATH/src/github.com/prometheus
$ git clone https://github.com/prometheus/alertmanager.git
$ cd alertmanager
$ make build
$ ./alertmanager --config.file=<your_file>
```

You can also build just one of the binaries in this repo by passing a name to the build function:

```
$ make build BINARIES=amtool
```

## Example

This is an example configuration that should cover most relevant aspects of the new YAML configuration format. The full documentation of the configuration can be found [here](https://prometheus.io/docs/alerting/configuration/).

```yaml
global:
  # The smarthost and SMTP sender used for mail notifications.
  smtp_smarthost: 'localhost:25'
  smtp_from: 'alertmanager@example.org'

# The root route on which each incoming alert enters.
route:
  # The root route must not have any matchers as it is the entry point for
  # all alerts. It needs to have a receiver configured so alerts that do not
  # match any of the sub-routes are sent to someone.
  receiver: 'team-X-mails'

  # The labels by which incoming alerts are grouped together. For example,
  # multiple alerts coming in for cluster=A and alertname=LatencyHigh would
  # be batched into a single group.
  #
  # To aggregate by all possible labels use '...' as the sole label name.
  # This effectively disables aggregation entirely, passing through all
  # alerts as-is. This is unlikely to be what you want, unless you have
  # a very low alert volume or your upstream notification system performs
  # its own grouping. Example: group_by: [...]
  group_by: ['alertname', 'cluster']

  # When a new group of alerts is created by an incoming alert, wait at
  # least 'group_wait' to send the initial notification.
  # This ensures that multiple alerts for the same group that start
  # firing shortly after one another are batched together in the first
  # notification.
  group_wait: 30s

  # After the first notification has been sent, wait 'group_interval' to send
  # a batch of new alerts that started firing for that group.
  group_interval: 5m

  # If an alert has successfully been sent, wait 'repeat_interval' to
  # resend it.
  repeat_interval: 3h

  # All the above attributes are inherited by all child routes and can be
  # overwritten on each.

  # The child route trees.
  routes:
  # This route performs a regular expression match on alert labels to
  # catch alerts that are related to a list of services.
  - match_re:
      service: ^(foo1|foo2|baz)$
    receiver: team-X-mails

    # The service has a sub-route for critical alerts. Any alerts
    # that do not match, i.e. severity != critical, fall back to the
    # parent node and are sent to 'team-X-mails'.
    routes:
    - match:
        severity: critical
      receiver: team-X-pager

  - match:
      service: files
    receiver: team-Y-mails

    routes:
    - match:
        severity: critical
      receiver: team-Y-pager

  # This route handles all alerts coming from a database service. If there's
  # no team to handle it, it defaults to the DB team.
  - match:
      service: database

    receiver: team-DB-pager
    # Also group alerts by affected database.
    group_by: [alertname, cluster, database]

    routes:
    - match:
        owner: team-X
      receiver: team-X-pager

    - match:
        owner: team-Y
      receiver: team-Y-pager


# Inhibition rules allow muting a set of alerts while another alert is
# firing.
# We use this to mute any warning-level notifications if the same alert is
# already critical.
inhibit_rules:
- source_match:
    severity: 'critical'
  target_match:
    severity: 'warning'
  # Apply inhibition if the alertname is the same.
  equal: ['alertname']


receivers:
- name: 'team-X-mails'
  email_configs:
  - to: 'team-X+alerts@example.org, team-Y+alerts@example.org'

- name: 'team-X-pager'
  email_configs:
  - to: 'team-X+alerts-critical@example.org'
  pagerduty_configs:
  - routing_key: <team-X-key>

- name: 'team-Y-mails'
  email_configs:
  - to: 'team-Y+alerts@example.org'

- name: 'team-Y-pager'
  pagerduty_configs:
  - routing_key: <team-Y-key>

- name: 'team-DB-pager'
  pagerduty_configs:
  - routing_key: <team-DB-key>
```
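
The way an alert travels through the route tree can be illustrated with a small Python sketch. This is not Alertmanager's implementation, only a simplified model under stated assumptions: the tree is hand-copied from the example into a dictionary, the helper names `node_matches` and `resolve_receiver` are hypothetical, regular expressions are treated as fully anchored, and grouping, timing and the `continue` option are ignored.

```python
import re

# Hand-copied, trimmed version of the example route tree (assumption: only
# matchers, receivers and child routes matter for this illustration).
ROOT = {
    "receiver": "team-X-mails",
    "routes": [
        {
            "match_re": {"service": "^(foo1|foo2|baz)$"},
            "receiver": "team-X-mails",
            "routes": [{"match": {"severity": "critical"}, "receiver": "team-X-pager"}],
        },
        {
            "match": {"service": "files"},
            "receiver": "team-Y-mails",
            "routes": [{"match": {"severity": "critical"}, "receiver": "team-Y-pager"}],
        },
        {
            "match": {"service": "database"},
            "receiver": "team-DB-pager",
            "routes": [
                {"match": {"owner": "team-X"}, "receiver": "team-X-pager"},
                {"match": {"owner": "team-Y"}, "receiver": "team-Y-pager"},
            ],
        },
    ],
}


def node_matches(node, labels):
    """True if every matcher on the node is satisfied by the alert's labels."""
    for name, value in node.get("match", {}).items():
        if labels.get(name, "") != value:
            return False
    for name, pattern in node.get("match_re", {}).items():
        if re.fullmatch(pattern, labels.get(name, "")) is None:
            return False
    return True


def resolve_receiver(node, labels):
    """Return the receiver of the deepest matching route, or None if node does not match."""
    if not node_matches(node, labels):
        return None
    for child in node.get("routes", []):
        receiver = resolve_receiver(child, labels)
        if receiver is not None:
            return receiver  # the first matching child wins
    return node["receiver"]  # no child matched: fall back to this node


print(resolve_receiver(ROOT, {"service": "database", "owner": "team-X"}))
print(resolve_receiver(ROOT, {"service": "foo1", "severity": "warning"}))
```

The root node carries no matchers, so every alert matches it and ends up at `team-X-mails` unless a child route claims it first.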

## API

The current Alertmanager API is version 2. This API is fully generated via the
[OpenAPI project](https://github.com/OAI/OpenAPI-Specification/blob/master/versions/2.0.md)
and [Go Swagger](https://github.com/go-swagger/go-swagger/) with the exception
of the HTTP handlers themselves. The API specification can be found in
[api/v2/openapi.yaml](api/v2/openapi.yaml). An HTML-rendered version can be
accessed [here](http://petstore.swagger.io/?url=https://raw.githubusercontent.com/prometheus/alertmanager/master/api/v2/openapi.yaml).
Clients can be easily generated via any OpenAPI generator for all major languages.

With the default config, endpoints are accessed under a `/api/v1` or `/api/v2` prefix.
For example, the v2 `/status` endpoint is `/api/v2/status`. If `--web.route-prefix` is set, API routes are
prefixed with that as well, so with `--web.route-prefix=/alertmanager/` the same endpoint
is served at `/alertmanager/api/v2/status`.
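
How the prefix rules combine can be shown with a short Python sketch. The helper name `api_url` and the base URL `http://localhost:9093` are assumptions for illustration only; the function just joins path segments and does not contact a server.

```python
def api_url(base_url, endpoint, route_prefix="/", version="v2"):
    """Join base URL, optional route prefix, API version and endpoint into one URL."""
    parts = [route_prefix.strip("/"), "api", version, endpoint.strip("/")]
    path = "/".join(part for part in parts if part)  # drop an empty prefix
    return base_url.rstrip("/") + "/" + path


print(api_url("http://localhost:9093", "/status"))
print(api_url("http://localhost:9093", "/status", route_prefix="/alertmanager/"))
```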

_API v2 is still under heavy development and therefore subject to change._

## amtool

`amtool` is a CLI tool for interacting with the Alertmanager API. It is bundled with all releases of Alertmanager.

### Install

Alternatively you can install with:

```
go get github.com/prometheus/alertmanager/cmd/amtool
```

### Examples

View all currently firing alerts:

```
$ amtool alert
Alertname        Starts At                Summary
Test_Alert       2017-08-02 18:30:18 UTC  This is a testing alert!
Test_Alert       2017-08-02 18:30:18 UTC  This is a testing alert!
Check_Foo_Fails  2017-08-02 18:30:18 UTC  This is a testing alert!
Check_Foo_Fails  2017-08-02 18:30:18 UTC  This is a testing alert!
```

View all currently firing alerts with extended output:

```
$ amtool -o extended alert
Labels                                        Annotations                                                    Starts At                Ends At                  Generator URL
alertname="Test_Alert" instance="node0"       link="https://example.com" summary="This is a testing alert!"  2017-08-02 18:31:24 UTC  0001-01-01 00:00:00 UTC  http://my.testing.script.local
alertname="Test_Alert" instance="node1"       link="https://example.com" summary="This is a testing alert!"  2017-08-02 18:31:24 UTC  0001-01-01 00:00:00 UTC  http://my.testing.script.local
alertname="Check_Foo_Fails" instance="node0"  link="https://example.com" summary="This is a testing alert!"  2017-08-02 18:31:24 UTC  0001-01-01 00:00:00 UTC  http://my.testing.script.local
alertname="Check_Foo_Fails" instance="node1"  link="https://example.com" summary="This is a testing alert!"  2017-08-02 18:31:24 UTC  0001-01-01 00:00:00 UTC  http://my.testing.script.local
```

In addition to viewing alerts, you can use the rich query syntax provided by Alertmanager:

```
$ amtool -o extended alert query alertname="Test_Alert"
Labels                                   Annotations                                                    Starts At                Ends At                  Generator URL
alertname="Test_Alert" instance="node0"  link="https://example.com" summary="This is a testing alert!"  2017-08-02 18:31:24 UTC  0001-01-01 00:00:00 UTC  http://my.testing.script.local
alertname="Test_Alert" instance="node1"  link="https://example.com" summary="This is a testing alert!"  2017-08-02 18:31:24 UTC  0001-01-01 00:00:00 UTC  http://my.testing.script.local

$ amtool -o extended alert query instance=~".+1"
Labels                                        Annotations                                                    Starts At                Ends At                  Generator URL
alertname="Test_Alert" instance="node1"       link="https://example.com" summary="This is a testing alert!"  2017-08-02 18:31:24 UTC  0001-01-01 00:00:00 UTC  http://my.testing.script.local
alertname="Check_Foo_Fails" instance="node1"  link="https://example.com" summary="This is a testing alert!"  2017-08-02 18:31:24 UTC  0001-01-01 00:00:00 UTC  http://my.testing.script.local

$ amtool -o extended alert query alertname=~"Test.*" instance=~".+1"
Labels                                   Annotations                                                    Starts At                Ends At                  Generator URL
alertname="Test_Alert" instance="node1"  link="https://example.com" summary="This is a testing alert!"  2017-08-02 18:31:24 UTC  0001-01-01 00:00:00 UTC  http://my.testing.script.local
```
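
The query results above follow from two rules: `=` compares a label exactly, `=~` matches it against a regular expression, and several matchers are combined with a logical AND. The Python sketch below models only those two operators; the function names are hypothetical, the regular expressions are assumed to be fully anchored, and the alert list is a hand-written stand-in for the four alerts shown in the output.

```python
import re


def parse_matcher(text):
    """Split 'name=value' or 'name=~regex' into (name, is_regex, value)."""
    name, sep, value = text.partition("=")
    if not sep or not name:
        raise ValueError(f"not a matcher: {text!r}")
    is_regex = value.startswith("~")
    if is_regex:
        value = value[1:]
    return name, is_regex, value.strip('"')  # quotes are optional


def alert_matches(labels, matchers):
    """An alert matches only if every matcher is satisfied (logical AND)."""
    for name, is_regex, value in matchers:
        actual = labels.get(name, "")
        if is_regex:
            if re.fullmatch(value, actual) is None:
                return False
        elif actual != value:
            return False
    return True


def query(alerts, *matcher_texts):
    """Return the alerts whose labels satisfy all given matcher strings."""
    matchers = [parse_matcher(text) for text in matcher_texts]
    return [alert for alert in alerts if alert_matches(alert, matchers)]


# Stand-in for the four alerts in the example output.
ALERTS = [
    {"alertname": "Test_Alert", "instance": "node0"},
    {"alertname": "Test_Alert", "instance": "node1"},
    {"alertname": "Check_Foo_Fails", "instance": "node0"},
    {"alertname": "Check_Foo_Fails", "instance": "node1"},
]

for alert in query(ALERTS, 'alertname=~"Test.*"', 'instance=~".+1"'):
    print(alert)
```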

Silence an alert:

```
$ amtool silence add alertname=Test_Alert
b3ede22e-ca14-4aa0-932c-ca2f3445f926

$ amtool silence add alertname="Test_Alert" instance=~".+0"
e48cb58a-0b17-49ba-b734-3585139b1d25
```

View silences:

```
$ amtool silence query
ID                                    Matchers              Ends At                  Created By  Comment
b3ede22e-ca14-4aa0-932c-ca2f3445f926  alertname=Test_Alert  2017-08-02 19:54:50 UTC  kellel

$ amtool silence query instance=~".+0"
ID                                    Matchers                            Ends At                  Created By  Comment
e48cb58a-0b17-49ba-b734-3585139b1d25  alertname=Test_Alert instance=~.+0  2017-08-02 22:41:39 UTC  kellel
```

Expire a silence:

```
$ amtool silence expire b3ede22e-ca14-4aa0-932c-ca2f3445f926
```

Expire all silences matching a query:

```
$ amtool silence query instance=~".+0"
ID                                    Matchers                            Ends At                  Created By  Comment
e48cb58a-0b17-49ba-b734-3585139b1d25  alertname=Test_Alert instance=~.+0  2017-08-02 22:41:39 UTC  kellel

$ amtool silence expire $(amtool silence query -q instance=~".+0")

$ amtool silence query instance=~".+0"

```

Expire all silences:

```
$ amtool silence expire $(amtool silence query -q)
```

### Configuration

`amtool` allows a configuration file to specify some options for convenience. The default configuration file paths are `$HOME/.config/amtool/config.yml` or `/etc/amtool/config.yml`.

An example configuration file might look like the following:

```
# Define the path that `amtool` can find your `alertmanager` instance at
alertmanager.url: "http://localhost:9093"

# Override the default author. (unset defaults to your username)
author: me@example.com

# Force amtool to give you an error if you don't include a comment on a silence
comment_required: true

# Set a default output format. (unset defaults to simple)
output: extended

# Set a default receiver
receiver: team-X-pager
```
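
The "unset defaults to ..." comments describe a fallback chain. The Python sketch below shows one plausible reading of it, not amtool's actual code: it assumes that an explicitly passed command-line flag overrides the configuration file, which in turn overrides the built-in default. The function name `resolve_options` and the dictionary shapes are hypothetical.

```python
def resolve_options(cli_flags, config_file, username):
    """Merge option sources; later sources override earlier ones.

    Assumed order (lowest to highest priority): built-in defaults,
    configuration file, command-line flags that were actually set.
    """
    resolved = {"output": "simple", "author": username}  # defaults named in the comments
    resolved.update(config_file)
    resolved.update({key: value for key, value in cli_flags.items() if value is not None})
    return resolved


config = {"output": "extended", "author": "me@example.com"}
print(resolve_options({"output": None}, config, username="kellel"))
print(resolve_options({"output": "json"}, {}, username="kellel"))
```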

### Routes

`amtool` allows you to visualize the routes of your configuration in the form of a text tree view.
You can also use it to test the routing by passing it the label set of an alert;
it prints out all receivers the alert would match, in order and separated by `,`.
(If you use `--verify.receivers`, amtool returns error code 1 on mismatch.)

Example of usage:

```
# View routing tree of remote Alertmanager
$ amtool config routes --alertmanager.url=http://localhost:9093

# Test if alert matches expected receiver
$ amtool config routes test --config.file=doc/examples/simple.yml --tree --verify.receivers=team-X-pager service=database owner=team-X
```

## High Availability

Alertmanager's high availability is in production use at many companies and is enabled by default.

> Important: Both UDP and TCP are needed in Alertmanager 0.15 and higher for the cluster to work.

To create a highly available Alertmanager cluster, the instances need to
be configured to communicate with each other. This is configured using the
`--cluster.*` flags.

- `--cluster.listen-address` string: cluster listen address (default "0.0.0.0:9094"; empty string disables HA mode)
- `--cluster.advertise-address` string: cluster advertise address
- `--cluster.peer` value: initial peers (repeat flag for each additional peer)
- `--cluster.peer-timeout` value: peer timeout period (default "15s")
- `--cluster.gossip-interval` value: cluster message propagation speed
  (default "200ms")
- `--cluster.pushpull-interval` value: lower values will increase
  convergence speeds at expense of bandwidth (default "1m0s")
- `--cluster.settle-timeout` value: maximum time to wait for cluster
  connections to settle before evaluating notifications.
- `--cluster.tcp-timeout` value: timeout value for TCP connections, reads and writes (default "10s")
- `--cluster.probe-timeout` value: time to wait for ack before marking node unhealthy
  (default "500ms")
- `--cluster.probe-interval` value: interval between random node probes (default "1s")
- `--cluster.reconnect-interval` value: interval between attempting to reconnect to lost peers (default "10s")
- `--cluster.reconnect-timeout` value: length of time to attempt to reconnect to a lost peer (default "6h0m0s")

The chosen port in the `cluster.listen-address` flag is the port that needs to be
specified in the `cluster.peer` flag of the other peers.
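
To make the relationship between `--cluster.listen-address` and `--cluster.peer` concrete, the Python sketch below builds the argument list for each member of a small local cluster. The helper name `peer_commands`, the addresses `127.0.0.1:8001` to `127.0.0.1:8003`, the `data/peerN` storage directories and the `alertmanager.yml` file name are illustrative assumptions; the sketch only assembles strings and does not start any process.

```python
def peer_commands(cluster_addrs, first_web_port=9093, config_file="alertmanager.yml"):
    """Build one alertmanager argument list per peer.

    Each peer listens on its own cluster address and names every other
    peer's cluster address in a repeated --cluster.peer flag.
    """
    commands = []
    for index, addr in enumerate(cluster_addrs):
        args = [
            "./alertmanager",
            f"--config.file={config_file}",
            f"--storage.path=data/peer{index}",  # separate state per local peer
            f"--web.listen-address=:{first_web_port + index}",
            f"--cluster.listen-address={addr}",
        ]
        args += [f"--cluster.peer={other}" for other in cluster_addrs if other != addr]
        commands.append(args)
    return commands


for command in peer_commands(["127.0.0.1:8001", "127.0.0.1:8002", "127.0.0.1:8003"]):
    print(" ".join(command))
```

Listing every other peer is more than strictly needed, since the flag only supplies initial peers, but it keeps the example symmetric.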

The `cluster.advertise-address` flag is required if the instance doesn't have
an IP address that is part of [RFC 6890](https://tools.ietf.org/html/rfc6890)
with a default route.

To start a cluster of three peers on your local machine use [`goreman`](https://github.com/mattn/goreman) and the
Procfile within this repository.

    goreman start

To point your Prometheus 1.4, or later, instance to multiple Alertmanagers, configure them
in your `prometheus.yml` configuration file, for example:

```yaml
alerting:
  alertmanagers:
  - static_configs:
    - targets:
      - alertmanager1:9093
      - alertmanager2:9093
      - alertmanager3:9093
```

> Important: Do not load balance traffic between Prometheus and its Alertmanagers, but instead point Prometheus to a list of all Alertmanagers. The Alertmanager implementation expects all alerts to be sent to all Alertmanagers to ensure high availability.

### Turn off high availability

If running Alertmanager in high availability mode is not desired, setting `--cluster.listen-address=` prevents Alertmanager from listening to incoming peer requests.

## Contributing

Check the [Prometheus contributing page](https://github.com/prometheus/prometheus/blob/master/CONTRIBUTING.md).

To contribute to the user interface, refer to [ui/app/CONTRIBUTING.md](ui/app/CONTRIBUTING.md).

## Architecture

![](doc/arch.svg)

## License

Apache License 2.0, see [LICENSE](https://github.com/prometheus/alertmanager/blob/master/LICENSE).

[hub]: https://hub.docker.com/r/prom/alertmanager/
[circleci]: https://circleci.com/gh/prometheus/alertmanager
[quay]: https://quay.io/repository/prometheus/alertmanager
