# Alertmanager [![CircleCI](https://circleci.com/gh/prometheus/alertmanager/tree/master.svg?style=shield)][circleci]

[![Docker Repository on Quay](https://quay.io/repository/prometheus/alertmanager/status "Docker Repository on Quay")][quay]
[![Docker Pulls](https://img.shields.io/docker/pulls/prom/alertmanager.svg?maxAge=604800)][hub]

The Alertmanager handles alerts sent by client applications such as the Prometheus server. It takes care of deduplicating, grouping, and routing them to the correct receiver integrations such as email, PagerDuty, or OpsGenie. It also takes care of silencing and inhibition of alerts.

* [Documentation](http://prometheus.io/docs/alerting/alertmanager/)

## Install

There are various ways of installing Alertmanager.

### Precompiled binaries

Precompiled binaries for released versions are available in the
[*download* section](https://prometheus.io/download/)
on [prometheus.io](https://prometheus.io). Using the latest production release binary
is the recommended way of installing Alertmanager.

### Docker images

Docker images are available on [Quay.io](https://quay.io/repository/prometheus/alertmanager).

### Compiling the binary

You can either `go get` it:

```
$ GO15VENDOREXPERIMENT=1 go get github.com/prometheus/alertmanager/cmd/...
$ cd $GOPATH/src/github.com/prometheus/alertmanager
$ alertmanager --config.file=<your_file>
```

Or clone the repository and build manually:

```
$ mkdir -p $GOPATH/src/github.com/prometheus
$ cd $GOPATH/src/github.com/prometheus
$ git clone https://github.com/prometheus/alertmanager.git
$ cd alertmanager
$ make build
$ ./alertmanager --config.file=<your_file>
```

You can also build just one of the binaries in this repo by passing a name to the build function:

```
$ make build BINARIES=amtool
```

## Example

This is an example configuration that should cover most relevant aspects of the YAML configuration format. The full documentation of the configuration can be found [here](https://prometheus.io/docs/alerting/configuration/).

```yaml
global:
  # The smarthost and SMTP sender used for mail notifications.
  smtp_smarthost: 'localhost:25'
  smtp_from: 'alertmanager@example.org'

# The root route on which each incoming alert enters.
route:
  # The root route must not have any matchers as it is the entry point for
  # all alerts. It needs to have a receiver configured so alerts that do not
  # match any of the sub-routes are sent to someone.
  receiver: 'team-X-mails'

  # The labels by which incoming alerts are grouped together. For example,
  # multiple alerts coming in for cluster=A and alertname=LatencyHigh would
  # be batched into a single group.
  #
  # To aggregate by all possible labels use '...' as the sole label name.
  # This effectively disables aggregation entirely, passing through all
  # alerts as-is. This is unlikely to be what you want, unless you have
  # a very low alert volume or your upstream notification system performs
  # its own grouping. Example: group_by: [...]
  group_by: ['alertname', 'cluster']

  # When a new group of alerts is created by an incoming alert, wait at
  # least 'group_wait' to send the initial notification.
  # This ensures that multiple alerts for the same group that start
  # firing shortly after one another are batched together in the first
  # notification.
  group_wait: 30s

  # Once the first notification has been sent, wait 'group_interval' to send
  # a batch of new alerts that started firing for that group.
  group_interval: 5m

  # If an alert has successfully been sent, wait 'repeat_interval' to
  # resend it.
  repeat_interval: 3h

  # All the above attributes are inherited by all child routes and can be
  # overwritten on each.

  # The child route trees.
  routes:
  # This route performs a regular expression match on alert labels to
  # catch alerts that are related to a list of services.
  - match_re:
      service: ^(foo1|foo2|baz)$
    receiver: team-X-mails

    # The service has a sub-route for critical alerts. Any alerts
    # that do not match, i.e. severity != critical, fall back to the
    # parent node and are sent to 'team-X-mails'.
    routes:
    - match:
        severity: critical
      receiver: team-X-pager

  - match:
      service: files
    receiver: team-Y-mails

    routes:
    - match:
        severity: critical
      receiver: team-Y-pager

  # This route handles all alerts coming from a database service. If there's
  # no team to handle it, it defaults to the DB team.
  - match:
      service: database

    receiver: team-DB-pager
    # Also group alerts by affected database.
    group_by: [alertname, cluster, database]

    routes:
    - match:
        owner: team-X
      receiver: team-X-pager

    - match:
        owner: team-Y
      receiver: team-Y-pager


# Inhibition rules allow muting a set of alerts while another alert is
# firing.
# We use this to mute any warning-level notifications if the same alert is
# already critical.
inhibit_rules:
- source_match:
    severity: 'critical'
  target_match:
    severity: 'warning'
  # Apply inhibition if the alertname is the same.
  equal: ['alertname']


receivers:
- name: 'team-X-mails'
  email_configs:
  - to: 'team-X+alerts@example.org, team-Y+alerts@example.org'

- name: 'team-X-pager'
  email_configs:
  - to: 'team-X+alerts-critical@example.org'
  pagerduty_configs:
  - routing_key: <team-X-key>

- name: 'team-Y-mails'
  email_configs:
  - to: 'team-Y+alerts@example.org'

- name: 'team-Y-pager'
  pagerduty_configs:
  - routing_key: <team-Y-key>

- name: 'team-DB-pager'
  pagerduty_configs:
  - routing_key: <team-DB-key>
```

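You can validate a configuration file like the one above before loading it, using `amtool`'s `check-config` subcommand (the file name here is an assumption):

```
# validate the configuration file (file name is an example)
$ amtool check-config alertmanager.yml
```
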
## API

The current Alertmanager API is version 2. This API is fully generated via the
[OpenAPI project](https://github.com/OAI/OpenAPI-Specification/blob/master/versions/2.0.md)
and [Go Swagger](https://github.com/go-swagger/go-swagger/) with the exception
of the HTTP handlers themselves. The API specification can be found in
[api/v2/openapi.yaml](api/v2/openapi.yaml). An HTML-rendered version can be
accessed [here](http://petstore.swagger.io/?url=https://raw.githubusercontent.com/prometheus/alertmanager/master/api/v2/openapi.yaml).
Clients can be easily generated via any OpenAPI generator for all major languages.

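As a sketch, a Go client could be generated from the specification with go-swagger (the target directory is an assumption):

```
# generate a Go client from the OpenAPI spec (target directory is an example)
$ swagger generate client -f api/v2/openapi.yaml -t ./amclient
```
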
With the default config, endpoints are accessed under a `/api/v1` or `/api/v2` prefix.
The v2 `/status` endpoint would be `/api/v2/status`. If `--web.route-prefix` is set then API routes are
prefixed with that as well, so `--web.route-prefix=/alertmanager/` would
relate to `/alertmanager/api/v2/status`.

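For instance, with a locally running Alertmanager on its default port, the v2 status endpoint can be queried directly (host and port are assumptions):

```
# query the v2 status endpoint of a local Alertmanager (default port assumed)
$ curl -s http://localhost:9093/api/v2/status
```
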
_API v2 is still under heavy development and therefore subject to change._

## amtool

`amtool` is a CLI tool for interacting with the Alertmanager API. It is bundled with all releases of Alertmanager.

### Install

In addition to being bundled with Alertmanager releases, `amtool` can be installed with:

```
go get github.com/prometheus/alertmanager/cmd/amtool
```

### Examples

View all currently firing alerts:

```
$ amtool alert
Alertname        Starts At                Summary
Test_Alert       2017-08-02 18:30:18 UTC  This is a testing alert!
Test_Alert       2017-08-02 18:30:18 UTC  This is a testing alert!
Check_Foo_Fails  2017-08-02 18:30:18 UTC  This is a testing alert!
Check_Foo_Fails  2017-08-02 18:30:18 UTC  This is a testing alert!
```

View all currently firing alerts with extended output:

```
$ amtool -o extended alert
Labels                                        Annotations                                                    Starts At                Ends At                  Generator URL
alertname="Test_Alert" instance="node0"       link="https://example.com" summary="This is a testing alert!"  2017-08-02 18:31:24 UTC  0001-01-01 00:00:00 UTC  http://my.testing.script.local
alertname="Test_Alert" instance="node1"       link="https://example.com" summary="This is a testing alert!"  2017-08-02 18:31:24 UTC  0001-01-01 00:00:00 UTC  http://my.testing.script.local
alertname="Check_Foo_Fails" instance="node0"  link="https://example.com" summary="This is a testing alert!"  2017-08-02 18:31:24 UTC  0001-01-01 00:00:00 UTC  http://my.testing.script.local
alertname="Check_Foo_Fails" instance="node1"  link="https://example.com" summary="This is a testing alert!"  2017-08-02 18:31:24 UTC  0001-01-01 00:00:00 UTC  http://my.testing.script.local
```

In addition to viewing alerts, you can use the rich query syntax provided by Alertmanager:

```
$ amtool -o extended alert query alertname="Test_Alert"
Labels                                   Annotations                                                    Starts At                Ends At                  Generator URL
alertname="Test_Alert" instance="node0"  link="https://example.com" summary="This is a testing alert!"  2017-08-02 18:31:24 UTC  0001-01-01 00:00:00 UTC  http://my.testing.script.local
alertname="Test_Alert" instance="node1"  link="https://example.com" summary="This is a testing alert!"  2017-08-02 18:31:24 UTC  0001-01-01 00:00:00 UTC  http://my.testing.script.local

$ amtool -o extended alert query instance=~".+1"
Labels                                        Annotations                                                    Starts At                Ends At                  Generator URL
alertname="Test_Alert" instance="node1"       link="https://example.com" summary="This is a testing alert!"  2017-08-02 18:31:24 UTC  0001-01-01 00:00:00 UTC  http://my.testing.script.local
alertname="Check_Foo_Fails" instance="node1"  link="https://example.com" summary="This is a testing alert!"  2017-08-02 18:31:24 UTC  0001-01-01 00:00:00 UTC  http://my.testing.script.local

$ amtool -o extended alert query alertname=~"Test.*" instance=~".+1"
Labels                                   Annotations                                                    Starts At                Ends At                  Generator URL
alertname="Test_Alert" instance="node1"  link="https://example.com" summary="This is a testing alert!"  2017-08-02 18:31:24 UTC  0001-01-01 00:00:00 UTC  http://my.testing.script.local
```

Silence an alert:

```
$ amtool silence add alertname=Test_Alert
b3ede22e-ca14-4aa0-932c-ca2f3445f926

$ amtool silence add alertname="Test_Alert" instance=~".+0"
e48cb58a-0b17-49ba-b734-3585139b1d25
```

View silences:

```
$ amtool silence query
ID                                    Matchers              Ends At                  Created By  Comment
b3ede22e-ca14-4aa0-932c-ca2f3445f926  alertname=Test_Alert  2017-08-02 19:54:50 UTC  kellel

$ amtool silence query instance=~".+0"
ID                                    Matchers                            Ends At                  Created By  Comment
e48cb58a-0b17-49ba-b734-3585139b1d25  alertname=Test_Alert instance=~.+0  2017-08-02 22:41:39 UTC  kellel
```

Expire a silence:

```
$ amtool silence expire b3ede22e-ca14-4aa0-932c-ca2f3445f926
```

Expire all silences matching a query:

```
$ amtool silence query instance=~".+0"
ID                                    Matchers                            Ends At                  Created By  Comment
e48cb58a-0b17-49ba-b734-3585139b1d25  alertname=Test_Alert instance=~.+0  2017-08-02 22:41:39 UTC  kellel

$ amtool silence expire $(amtool silence -q query instance=~".+0")

$ amtool silence query instance=~".+0"

```

Expire all silences:

```
$ amtool silence expire $(amtool silence query -q)
```

### Configuration

`amtool` allows a configuration file to specify some options for convenience. The default configuration file paths are `$HOME/.config/amtool/config.yml` or `/etc/amtool/config.yml`.

An example configuration file might look like the following:

```yaml
# Define the path that `amtool` can find your `alertmanager` instance at
alertmanager.url: "http://localhost:9093"

# Override the default author. (unset defaults to your username)
author: me@example.com

# Force amtool to give you an error if you don't include a comment on a silence
comment_required: true

# Set a default output format. (unset defaults to simple)
output: extended

# Set a default receiver
receiver: team-X-pager
```

308
### Routes

`amtool` allows you to visualize the routes of your configuration in the form of a text tree view.
You can also use it to test routing by passing it the label set of an alert;
it prints out all receivers the alert would match, ordered and separated by `,`.
(If you use `--verify.receivers`, amtool returns exit code 1 on a mismatch.)

Example of usage:

```
# View routing tree of remote Alertmanager
$ amtool config routes --alertmanager.url=http://localhost:9093

# Test if alert matches expected receiver
$ amtool config routes test --config.file=doc/examples/simple.yml --tree --verify.receivers=team-X-pager service=database owner=team-X
```

## High Availability

Alertmanager's high availability is in production use at many companies and is enabled by default.

> Important: Both UDP and TCP are needed in Alertmanager 0.15 and higher for the cluster to work.

To create a highly available cluster of the Alertmanager, the instances need to
be configured to communicate with each other. This is configured using the
`--cluster.*` flags, as in the startup example below.

- `--cluster.listen-address` string: cluster listen address (default "0.0.0.0:9094"; empty string disables HA mode)
- `--cluster.advertise-address` string: cluster advertise address
- `--cluster.peer` value: initial peers (repeat flag for each additional peer)
- `--cluster.peer-timeout` value: peer timeout period (default "15s")
- `--cluster.gossip-interval` value: cluster message propagation speed (default "200ms")
- `--cluster.pushpull-interval` value: lower values will increase convergence speed at the expense of bandwidth (default "1m0s")
- `--cluster.settle-timeout` value: maximum time to wait for cluster connections to settle before evaluating notifications
- `--cluster.tcp-timeout` value: timeout value for TCP connections, reads and writes (default "10s")
- `--cluster.probe-timeout` value: time to wait for an ack before marking a node unhealthy (default "500ms")
- `--cluster.probe-interval` value: interval between random node probes (default "1s")
- `--cluster.reconnect-interval` value: interval between attempts to reconnect to lost peers (default "10s")
- `--cluster.reconnect-timeout` value: length of time to attempt to reconnect to a lost peer (default "6h0m0s")

The chosen port in the `cluster.listen-address` flag is the port that needs to be
specified in the `cluster.peer` flag of the other peers.

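As a sketch, a minimal two-instance cluster on a single host could be started like this (file names, ports, and storage paths are assumptions):

```
# first instance, listening for peers on the default cluster port
$ alertmanager --config.file=alertmanager.yml --storage.path=data-1/ \
    --cluster.listen-address=0.0.0.0:9094

# second instance, using different ports and joining the first as a peer
$ alertmanager --config.file=alertmanager.yml --storage.path=data-2/ \
    --web.listen-address=:9095 --cluster.listen-address=0.0.0.0:9096 \
    --cluster.peer=127.0.0.1:9094
```
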
The `cluster.advertise-address` flag is required if the instance doesn't have
an IP address that is part of [RFC 6890](https://tools.ietf.org/html/rfc6890)
with a default route.

To start a cluster of three peers on your local machine, use [`goreman`](https://github.com/mattn/goreman) and the
Procfile within this repository.

```
goreman start
```

To point your Prometheus 1.4 (or later) instance to multiple Alertmanagers, configure them
in your `prometheus.yml` configuration file, for example:

```yaml
alerting:
  alertmanagers:
  - static_configs:
    - targets:
      - alertmanager1:9093
      - alertmanager2:9093
      - alertmanager3:9093
```

> Important: Do not load balance traffic between Prometheus and its Alertmanagers, but instead point Prometheus to a list of all Alertmanagers. The Alertmanager implementation expects all alerts to be sent to all Alertmanagers to ensure high availability.

### Turn off high availability

If running Alertmanager in high availability mode is not desired, setting `--cluster.listen-address=` prevents Alertmanager from listening to incoming peer requests.

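For example (the config file name is an assumption):

```
# start a standalone instance with clustering disabled
$ alertmanager --config.file=alertmanager.yml --cluster.listen-address=
```
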
## Contributing

Check the [Prometheus contributing page](https://github.com/prometheus/prometheus/blob/master/CONTRIBUTING.md).

To contribute to the user interface, refer to [ui/app/CONTRIBUTING.md](ui/app/CONTRIBUTING.md).

## Architecture

![](doc/arch.svg)

## License

Apache License 2.0, see [LICENSE](https://github.com/prometheus/alertmanager/blob/master/LICENSE).

[hub]: https://hub.docker.com/r/prom/alertmanager/
[circleci]: https://circleci.com/gh/prometheus/alertmanager
[quay]: https://quay.io/repository/prometheus/alertmanager