# Alertmanager [![CircleCI](https://circleci.com/gh/prometheus/alertmanager/tree/main.svg?style=shield)][circleci]

[![Docker Repository on Quay](https://quay.io/repository/prometheus/alertmanager/status "Docker Repository on Quay")][quay]
[![Docker Pulls](https://img.shields.io/docker/pulls/prom/alertmanager.svg?maxAge=604800)][hub]

The Alertmanager handles alerts sent by client applications such as the Prometheus server. It takes care of deduplicating, grouping, and routing them to the correct receiver integrations such as email, PagerDuty, or OpsGenie. It also takes care of silencing and inhibition of alerts.

* [Documentation](http://prometheus.io/docs/alerting/alertmanager/)

## Install

There are various ways of installing Alertmanager.

### Precompiled binaries

Precompiled binaries for released versions are available in the
[*download* section](https://prometheus.io/download/)
on [prometheus.io](https://prometheus.io). Using the latest production release binary
is the recommended way of installing Alertmanager.

### Docker images

Docker images are available on [Quay.io](https://quay.io/repository/prometheus/alertmanager) or [Docker Hub](https://hub.docker.com/r/prom/alertmanager/).

You can launch an Alertmanager container for trying it out with

    $ docker run --name alertmanager -d -p 127.0.0.1:9093:9093 quay.io/prometheus/alertmanager

Alertmanager will now be reachable at http://localhost:9093/.

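
To check that the container is accepting alerts, you can push a test alert to the v2 API with `curl`. This is only a quick sketch; the alert name and labels below are arbitrary placeholders:

```
$ curl -XPOST http://localhost:9093/api/v2/alerts \
    -H "Content-Type: application/json" \
    -d '[{"labels": {"alertname": "TestAlert", "severity": "warning"}, "annotations": {"summary": "This is only a test"}}]'
```

The alert should then show up in the web UI at http://localhost:9093/.
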
### Compiling the binary

You can either `go get` it:

```
$ GO15VENDOREXPERIMENT=1 go get github.com/prometheus/alertmanager/cmd/...
# cd $GOPATH/src/github.com/prometheus/alertmanager
$ alertmanager --config.file=<your_file>
```

Or clone the repository and build manually:

```
$ mkdir -p $GOPATH/src/github.com/prometheus
$ cd $GOPATH/src/github.com/prometheus
$ git clone https://github.com/prometheus/alertmanager.git
$ cd alertmanager
$ make build
$ ./alertmanager --config.file=<your_file>
```

You can also build just one of the binaries in this repo by passing a name to the build function:
```
$ make build BINARIES=amtool
```

## Example

This is an example configuration that should cover most relevant aspects of the new YAML configuration format. The full documentation of the configuration can be found [here](https://prometheus.io/docs/alerting/configuration/).

```yaml
global:
  # The smarthost and SMTP sender used for mail notifications.
  smtp_smarthost: 'localhost:25'
  smtp_from: 'alertmanager@example.org'

# The root route on which each incoming alert enters.
route:
  # The root route must not have any matchers as it is the entry point for
  # all alerts. It needs to have a receiver configured so alerts that do not
  # match any of the sub-routes are sent to someone.
  receiver: 'team-X-mails'

  # The labels by which incoming alerts are grouped together. For example,
  # multiple alerts coming in for cluster=A and alertname=LatencyHigh would
  # be batched into a single group.
  #
  # To aggregate by all possible labels use '...' as the sole label name.
  # This effectively disables aggregation entirely, passing through all
  # alerts as-is. This is unlikely to be what you want, unless you have
  # a very low alert volume or your upstream notification system performs
  # its own grouping. Example: group_by: [...]
  group_by: ['alertname', 'cluster']

  # When a new group of alerts is created by an incoming alert, wait at
  # least 'group_wait' to send the initial notification.
  # This ensures that multiple alerts for the same group that start
  # firing shortly after one another are batched together in the first
  # notification.
  group_wait: 30s

  # Once the first notification has been sent, wait 'group_interval' to send a batch
  # of new alerts that started firing for that group.
  group_interval: 5m

  # If an alert has successfully been sent, wait 'repeat_interval' to
  # resend it.
  repeat_interval: 3h

  # All the above attributes are inherited by all child routes and can
  # be overwritten on each.

  # The child route trees.
  routes:
  # This route performs a regular expression match on alert labels to
  # catch alerts that are related to a list of services.
  - match_re:
      service: ^(foo1|foo2|baz)$
    receiver: team-X-mails

    # The service has a sub-route for critical alerts; any alerts
    # that do not match, i.e. severity != critical, fall back to the
    # parent node and are sent to 'team-X-mails'.
    routes:
    - match:
        severity: critical
      receiver: team-X-pager

  - match:
      service: files
    receiver: team-Y-mails

    routes:
    - match:
        severity: critical
      receiver: team-Y-pager

  # This route handles all alerts coming from a database service. If there's
  # no team to handle it, it defaults to the DB team.
  - match:
      service: database

    receiver: team-DB-pager
    # Also group alerts by affected database.
    group_by: [alertname, cluster, database]

    routes:
    - match:
        owner: team-X
      receiver: team-X-pager

    - match:
        owner: team-Y
      receiver: team-Y-pager


# Inhibition rules allow muting a set of alerts given that another alert is
# firing.
# We use this to mute any warning-level notifications if the same alert is
# already critical.
inhibit_rules:
- source_matchers:
    - severity="critical"
  target_matchers:
    - severity="warning"
  # Apply inhibition if the alertname is the same.
  # CAUTION:
  #   If all label names listed in `equal` are missing
  #   from both the source and target alerts,
  #   the inhibition rule will apply!
  equal: ['alertname']


receivers:
- name: 'team-X-mails'
  email_configs:
  - to: 'team-X+alerts@example.org, team-Y+alerts@example.org'

- name: 'team-X-pager'
  email_configs:
  - to: 'team-X+alerts-critical@example.org'
  pagerduty_configs:
  - routing_key: <team-X-key>

- name: 'team-Y-mails'
  email_configs:
  - to: 'team-Y+alerts@example.org'

- name: 'team-Y-pager'
  pagerduty_configs:
  - routing_key: <team-Y-key>

- name: 'team-DB-pager'
  pagerduty_configs:
  - routing_key: <team-DB-key>
```

## API

The current Alertmanager API is version 2. This API is fully generated via the
[OpenAPI project](https://github.com/OAI/OpenAPI-Specification/blob/master/versions/2.0.md)
and [Go Swagger](https://github.com/go-swagger/go-swagger/) with the exception
of the HTTP handlers themselves. The API specification can be found in
[api/v2/openapi.yaml](api/v2/openapi.yaml). An HTML-rendered version can be
accessed [here](http://petstore.swagger.io/?url=https://raw.githubusercontent.com/prometheus/alertmanager/main/api/v2/openapi.yaml).
Clients can be easily generated via any OpenAPI generator for all major languages.

With the default config, endpoints are accessed under a `/api/v1` or `/api/v2` prefix.
The v2 `/status` endpoint would be `/api/v2/status`. If `--web.route-prefix` is set then API routes are
prefixed with that as well, so `--web.route-prefix=/alertmanager/` would
relate to `/alertmanager/api/v2/status`.

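
For example, with a default local setup you can query the v2 status endpoint with `curl` (the host and route prefix below are placeholders for your own setup):

```
# Default prefix
$ curl http://localhost:9093/api/v2/status

# With --web.route-prefix=/alertmanager/
$ curl http://localhost:9093/alertmanager/api/v2/status
```
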
_API v2 is still under heavy development and therefore subject to change._

## amtool

`amtool` is a CLI tool for interacting with the Alertmanager API. It is bundled with all releases of Alertmanager.

### Install

Alternatively, you can install it with:
```
go get github.com/prometheus/alertmanager/cmd/amtool
```

### Examples

View all currently firing alerts:
```
$ amtool alert
Alertname        Starts At                Summary
Test_Alert       2017-08-02 18:30:18 UTC  This is a testing alert!
Test_Alert       2017-08-02 18:30:18 UTC  This is a testing alert!
Check_Foo_Fails  2017-08-02 18:30:18 UTC  This is a testing alert!
Check_Foo_Fails  2017-08-02 18:30:18 UTC  This is a testing alert!
```

View all currently firing alerts with extended output:
```
$ amtool -o extended alert
Labels                                        Annotations                                                    Starts At                Ends At                  Generator URL
alertname="Test_Alert" instance="node0"       link="https://example.com" summary="This is a testing alert!"  2017-08-02 18:31:24 UTC  0001-01-01 00:00:00 UTC  http://my.testing.script.local
alertname="Test_Alert" instance="node1"       link="https://example.com" summary="This is a testing alert!"  2017-08-02 18:31:24 UTC  0001-01-01 00:00:00 UTC  http://my.testing.script.local
alertname="Check_Foo_Fails" instance="node0"  link="https://example.com" summary="This is a testing alert!"  2017-08-02 18:31:24 UTC  0001-01-01 00:00:00 UTC  http://my.testing.script.local
alertname="Check_Foo_Fails" instance="node1"  link="https://example.com" summary="This is a testing alert!"  2017-08-02 18:31:24 UTC  0001-01-01 00:00:00 UTC  http://my.testing.script.local
```

In addition to viewing alerts, you can use the rich query syntax provided by Alertmanager:
```
$ amtool -o extended alert query alertname="Test_Alert"
Labels                                   Annotations                                                    Starts At                Ends At                  Generator URL
alertname="Test_Alert" instance="node0"  link="https://example.com" summary="This is a testing alert!"  2017-08-02 18:31:24 UTC  0001-01-01 00:00:00 UTC  http://my.testing.script.local
alertname="Test_Alert" instance="node1"  link="https://example.com" summary="This is a testing alert!"  2017-08-02 18:31:24 UTC  0001-01-01 00:00:00 UTC  http://my.testing.script.local

$ amtool -o extended alert query instance=~".+1"
Labels                                        Annotations                                                    Starts At                Ends At                  Generator URL
alertname="Test_Alert" instance="node1"       link="https://example.com" summary="This is a testing alert!"  2017-08-02 18:31:24 UTC  0001-01-01 00:00:00 UTC  http://my.testing.script.local
alertname="Check_Foo_Fails" instance="node1"  link="https://example.com" summary="This is a testing alert!"  2017-08-02 18:31:24 UTC  0001-01-01 00:00:00 UTC  http://my.testing.script.local

$ amtool -o extended alert query alertname=~"Test.*" instance=~".+1"
Labels                                   Annotations                                                    Starts At                Ends At                  Generator URL
alertname="Test_Alert" instance="node1"  link="https://example.com" summary="This is a testing alert!"  2017-08-02 18:31:24 UTC  0001-01-01 00:00:00 UTC  http://my.testing.script.local
```

Silence an alert:
```
$ amtool silence add alertname=Test_Alert
b3ede22e-ca14-4aa0-932c-ca2f3445f926

$ amtool silence add alertname="Test_Alert" instance=~".+0"
e48cb58a-0b17-49ba-b734-3585139b1d25
```

View silences:
```
$ amtool silence query
ID                                    Matchers              Ends At                  Created By  Comment
b3ede22e-ca14-4aa0-932c-ca2f3445f926  alertname=Test_Alert  2017-08-02 19:54:50 UTC  kellel

$ amtool silence query instance=~".+0"
ID                                    Matchers                            Ends At                  Created By  Comment
e48cb58a-0b17-49ba-b734-3585139b1d25  alertname=Test_Alert instance=~.+0  2017-08-02 22:41:39 UTC  kellel
```

Expire a silence:
```
$ amtool silence expire b3ede22e-ca14-4aa0-932c-ca2f3445f926
```

Expire all silences matching a query:
```
$ amtool silence query instance=~".+0"
ID                                    Matchers                            Ends At                  Created By  Comment
e48cb58a-0b17-49ba-b734-3585139b1d25  alertname=Test_Alert instance=~.+0  2017-08-02 22:41:39 UTC  kellel

$ amtool silence expire $(amtool silence query -q instance=~".+0")

$ amtool silence query instance=~".+0"

```

Expire all silences:
```
$ amtool silence expire $(amtool silence query -q)
```

Try out how a template works. Let's say you have this in your configuration file:
```
templates:
  - '/foo/bar/*.tmpl'
```

Then you can test how a template would render with example data by using this command:
```
amtool template render --template.glob='/foo/bar/*.tmpl' --template.text='{{ template "slack.default.markdown.v1" . }}'
```

### Configuration

`amtool` allows a configuration file to specify some options for convenience. The default configuration file paths are `$HOME/.config/amtool/config.yml` or `/etc/amtool/config.yml`.

An example configuration file might look like the following:

```
# Define the URL where `amtool` can find your `alertmanager` instance
alertmanager.url: "http://localhost:9093"

# Override the default author. (unset defaults to your username)
author: me@example.com

# Force amtool to give you an error if you don't include a comment on a silence
comment_required: true

# Set a default output format. (unset defaults to simple)
output: extended

# Set a default receiver
receiver: team-X-pager
```

### Routes

`amtool` allows you to visualize the routes of your configuration as a text-based tree view.
You can also use it to test routing by passing it the label set of an alert;
it prints out all receivers the alert would match, in order and separated by `,`.
(If you use `--verify.receivers`, amtool returns exit code 1 on a mismatch.)

Example of usage:
```
# View routing tree of remote Alertmanager
$ amtool config routes --alertmanager.url=http://localhost:9093

# Test if alert matches expected receiver
$ amtool config routes test --config.file=doc/examples/simple.yml --tree --verify.receivers=team-X-pager service=database owner=team-X
```

## High Availability

Alertmanager's high availability is in production use at many companies and is enabled by default.

> Important: Both UDP and TCP are needed in Alertmanager 0.15 and higher for the cluster to work.
>  - If you are using a firewall, make sure to whitelist the clustering port for both protocols.
>  - If you are running in a container, make sure to expose the clustering port for both protocols.

To create a highly available cluster of Alertmanagers, the instances need to
be configured to communicate with each other. This is configured using the
`--cluster.*` flags.

- `--cluster.listen-address` string: cluster listen address (default "0.0.0.0:9094"; empty string disables HA mode)
- `--cluster.advertise-address` string: cluster advertise address
- `--cluster.peer` value: initial peers (repeat flag for each additional peer)
- `--cluster.peer-timeout` value: peer timeout period (default "15s")
- `--cluster.gossip-interval` value: cluster message propagation speed
  (default "200ms")
- `--cluster.pushpull-interval` value: lower values will increase
  convergence speeds at expense of bandwidth (default "1m0s")
- `--cluster.settle-timeout` value: maximum time to wait for cluster
  connections to settle before evaluating notifications.
- `--cluster.tcp-timeout` value: timeout value for tcp connections, reads and writes (default "10s")
- `--cluster.probe-timeout` value: time to wait for an ack before marking a node unhealthy
  (default "500ms")
- `--cluster.probe-interval` value: interval between random node probes (default "1s")
- `--cluster.reconnect-interval` value: interval between attempting to reconnect to lost peers (default "10s")
- `--cluster.reconnect-timeout` value: length of time to attempt to reconnect to a lost peer (default: "6h0m0s")

The chosen port in the `cluster.listen-address` flag is the port that needs to be
specified in the `cluster.peer` flag of the other peers.

The `cluster.advertise-address` flag is required if the instance doesn't have
an IP address that is part of [RFC 6890](https://tools.ietf.org/html/rfc6890)
with a default route.

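
As a rough sketch (the host names, addresses, and config file paths here are placeholders, not part of this repository), two peers on separate hosts could be started like this:

```
# First peer
$ alertmanager --config.file=alertmanager.yml \
    --cluster.listen-address=0.0.0.0:9094 \
    --cluster.peer=alertmanager-2.example.org:9094

# Second peer; --cluster.advertise-address is only needed when the host lacks
# a suitable routable IP address, e.g. in some container or NAT setups.
$ alertmanager --config.file=alertmanager.yml \
    --cluster.listen-address=0.0.0.0:9094 \
    --cluster.advertise-address=192.0.2.11:9094 \
    --cluster.peer=alertmanager-1.example.org:9094
```
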
To start a cluster of three peers on your local machine use [`goreman`](https://github.com/mattn/goreman) and the
Procfile within this repository.

    goreman start

To point your Prometheus 1.4 (or later) instance to multiple Alertmanagers, configure them
in your `prometheus.yml` configuration file, for example:

```yaml
alerting:
  alertmanagers:
  - static_configs:
    - targets:
      - alertmanager1:9093
      - alertmanager2:9093
      - alertmanager3:9093
```

> Important: Do not load balance traffic between Prometheus and its Alertmanagers, but instead point Prometheus to a list of all Alertmanagers. The Alertmanager implementation expects all alerts to be sent to all Alertmanagers to ensure high availability.

### Turn off high availability

If running Alertmanager in high availability mode is not desired, setting `--cluster.listen-address=` prevents Alertmanager from listening to incoming peer requests.
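
For example (the configuration file name is a placeholder):

```
$ alertmanager --config.file=alertmanager.yml --cluster.listen-address=
```
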

## Contributing

Check the [Prometheus contributing page](https://github.com/prometheus/prometheus/blob/main/CONTRIBUTING.md).

To contribute to the user interface, refer to [ui/app/CONTRIBUTING.md](ui/app/CONTRIBUTING.md).

## Architecture

![](doc/arch.svg)

## License

Apache License 2.0, see [LICENSE](https://github.com/prometheus/alertmanager/blob/main/LICENSE).

[hub]: https://hub.docker.com/r/prom/alertmanager/
[circleci]: https://circleci.com/gh/prometheus/alertmanager
[quay]: https://quay.io/repository/prometheus/alertmanager