# Monitoring Basics <a id="monitoring-basics"></a>

This part of the Icinga 2 documentation provides an overview of all the basic
monitoring concepts you need to know to run Icinga 2.
Keep in mind these examples are made with a Linux server. If you are
using Windows, you will need to change the services accordingly. See the [ITL reference](10-icinga-template-library.md#windows-plugins)
for further information.

## Attribute Value Types <a id="attribute-value-types"></a>

The Icinga 2 configuration uses different value types for attributes.

  Type                                                   | Example
  -------------------------------------------------------|---------------------------------------------------------
  [Number](17-language-reference.md#numeric-literals)    | `5`
  [Duration](17-language-reference.md#duration-literals) | `1m`
  [String](17-language-reference.md#string-literals)     | `"These are notes"`
  [Boolean](17-language-reference.md#boolean-literals)   | `true`
  [Array](17-language-reference.md#array)                | `[ "value1", "value2" ]`
  [Dictionary](17-language-reference.md#dictionary)      | `{ "key1" = "value1", "key2" = false }`

It is important to use the correct value type for object attributes
as otherwise the [configuration validation](11-cli-commands.md#config-validation) will fail.

## Hosts and Services <a id="hosts-services"></a>

Icinga 2 can be used to monitor the availability of hosts and services. Hosts
and services can be virtually anything which can be checked in some way:

* Network services (HTTP, SMTP, SNMP, SSH, etc.)
* Printers
* Switches or routers
* Temperature sensors
* Other local or network-accessible services

Host objects provide a mechanism to group services that are running
on the same physical device.

Here is an example of a host object which defines two child services:

```
object Host "my-server1" {
  address = "10.0.0.1"
  check_command = "hostalive"
}

object Service "ping4" {
  host_name = "my-server1"
  check_command = "ping4"
}

object Service "http" {
  host_name = "my-server1"
  check_command = "http"
}
```

The example creates two services `ping4` and `http` which belong to the
host `my-server1`.

It also specifies that the host should perform its own check using the `hostalive`
check command.

The `address` attribute is used by check commands to determine which network
address is associated with the host object.

Details on troubleshooting check problems can be found [here](15-troubleshooting.md#troubleshooting).

### Host States <a id="host-states"></a>

Hosts can be in any one of the following states:

  Name        | Description
  ------------|--------------
  UP          | The host is available.
  DOWN        | The host is unavailable.

### Service States <a id="service-states"></a>

Services can be in any one of the following states:

  Name        | Description
  ------------|--------------
  OK          | The service is working properly.
  WARNING     | The service is experiencing some problems but is still considered to be in working condition.
  CRITICAL    | The check successfully determined that the service is in a critical state.
  UNKNOWN     | The check could not determine the service's state.

### Check Result State Mapping <a id="check-result-state-mapping"></a>

[Check plugins](05-service-monitoring.md#service-monitoring-plugins) return
with an exit code which is converted into a state number.
Services map the states directly while hosts will treat `0` or `1` as `UP`
for example.

  Value | Host State | Service State
  ------|------------|--------------
  0     | Up         | OK
  1     | Up         | Warning
  2     | Down       | Critical
  3     | Down       | Unknown

### Hard and Soft States <a id="hard-soft-states"></a>

When detecting a problem with a host/service, Icinga re-checks the object a number of
times (based on the `max_check_attempts` and `retry_interval` settings) before sending
notifications. This ensures that no unnecessary notifications are sent for
transient failures. During this time the object is in a `SOFT` state.

After all re-checks have been executed and the object is still in a non-OK
state, the host/service switches to a `HARD` state and notifications are sent.

  Name        | Description
  ------------|--------------
  HARD        | The host/service's state hasn't recently changed. `check_interval` applies here.
  SOFT        | The host/service has recently changed state and is being re-checked with `retry_interval`.

### Host and Service Checks <a id="host-service-checks"></a>

Hosts and services determine their state by running checks at regular intervals.

```
object Host "router" {
  check_command = "hostalive"
  address = "10.0.0.1"
}
```

The `hostalive` command is one of several built-in check commands. It sends ICMP
echo requests to the IP address specified in the `address` attribute to determine
whether a host is online.

> **Tip**
>
> `hostalive` is the same as `ping` but with different default thresholds.
> Both use the `ping` CLI command to execute sequential checks.
>
> If you need faster ICMP checks, look into the [icmp](10-icinga-template-library.md#plugin-check-command-icmp) CheckCommand.

A number of other [built-in check commands](10-icinga-template-library.md#icinga-template-library) are also
available. In addition to these commands the next few chapters will explain in
detail how to set up your own check commands.
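
To tie the check example above to the hard and soft states described earlier, here is a minimal sketch of a host that sets the scheduling attributes explicitly. The host name `router-intervals` and the chosen values are assumptions made for illustration only, not recommended defaults.

```
object Host "router-intervals" {
  check_command = "hostalive"
  address = "10.0.0.1"

  check_interval = 5m     // used while the host is in a HARD state
  retry_interval = 30s    // used while a state change is being re-checked (SOFT state)
  max_check_attempts = 3  // check attempts before a problem state becomes HARD
}
```

With values like these, a failing host would be re-checked at the shorter `retry_interval` and would only switch to a `HARD` state, and trigger notifications, once `max_check_attempts` is reached.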

#### Host Check Alternatives <a id="host-check-alternatives"></a>

If the host is not reachable with ICMP, HTTP, etc. you can
also use the [dummy](10-icinga-template-library.md#itl-dummy) CheckCommand to set a default state.

```
object Host "dummy-host" {
  check_command = "dummy"
  vars.dummy_state = 0 //Up
  vars.dummy_text = "Everything OK."
}
```

This method is also used when you send in [external check results](08-advanced-topics.md#external-check-results).

A more advanced technique is to calculate an overall state
based on all services. This is described [here](08-advanced-topics.md#access-object-attributes-at-runtime-cluster-check).


## Templates <a id="object-inheritance-using-templates"></a>

Templates may be used to apply a set of identical attributes to more than one
object:

```
template Service "generic-service" {
  max_check_attempts = 3
  check_interval = 5m
  retry_interval = 1m
  enable_perfdata = true
}

apply Service "ping4" {
  import "generic-service"

  check_command = "ping4"

  assign where host.address
}

apply Service "ping6" {
  import "generic-service"

  check_command = "ping6"

  assign where host.address6
}
```


In this example the `ping4` and `ping6` services inherit properties from the
template `generic-service`.

Objects as well as templates themselves can import an arbitrary number of
other templates. Attributes inherited from a template can be overridden in the
object if necessary.

You can also import existing non-template objects.

> **Note**
>
> Templates and objects share the same namespace, i.e. you can't define a template
> that has the same name as an object.
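
The override behaviour mentioned above can be sketched as follows. This is only an illustration: it assumes the `generic-service` template from the example above and reuses the host name `my-server1` from the earlier host example; the service name and the `1m` value are arbitrary.

```
object Service "ping4-frequent" {
  import "generic-service"

  host_name = "my-server1"
  check_command = "ping4"

  check_interval = 1m // overrides the 5m inherited from "generic-service"
}
```

Because the assignment appears after the `import` statement, the value set in the object replaces the inherited one, while `max_check_attempts`, `retry_interval` and `enable_perfdata` still come from the template.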


### Multiple Templates <a id="object-inheritance-using-multiple-templates"></a>

The following example uses [custom variables](03-monitoring-basics.md#custom-variables) which
are provided in each template. The `web-server` template is used as the
base template for any host providing web services. In addition to that it
specifies the custom variable `webserver_type`, e.g. `apache`. Since this
template is also the base template, we import the `generic-host` template here.
This provides the `check_command` attribute by default and we don't need
to set it anywhere later on.

```
template Host "web-server" {
  import "generic-host"
  vars = {
    webserver_type = "apache"
  }
}
```

The `wp-server` host template specifies a WordPress instance and sets
the `application_type` custom variable. Please note the `+=` [operator](17-language-reference.md#dictionary-operators)
which adds [dictionary](17-language-reference.md#dictionary) items,
but does not override any previous `vars` attribute.

```
template Host "wp-server" {
  vars += {
    application_type = "wordpress"
  }
}
```

The final host object imports both templates. The order is important here:
First the base template `web-server` is added to the object, then additional
attributes are imported from the `wp-server` template.

```
object Host "wp.example.com" {
  import "web-server"
  import "wp-server"

  address = "192.168.56.200"
}
```

If you want to override specific attributes inherited from templates, you can
specify them on the host object.

```
object Host "wp1.example.com" {
  import "web-server"
  import "wp-server"

  vars.webserver_type = "nginx" //overrides attribute from base template

  address = "192.168.56.201"
}
```

<!-- Keep this for compatibility -->
<a id="custom-attributes"></a>

## Custom Variables <a id="custom-variables"></a>

In addition to built-in object attributes you can define your own custom
attributes inside the `vars` attribute.

> **Tip**
>
> This is called `custom variables` throughout the documentation, backends and web interfaces.
>
> Older documentation versions referred to this as `custom attribute`.

The following example specifies the key `ssh_port` as a custom
variable and assigns an integer value.

```
object Host "localhost" {
  check_command = "ssh"
  vars.ssh_port = 2222
}
```

`vars` is a [dictionary](17-language-reference.md#dictionary) where you
can set specific keys to values. The example above uses the shorter
[indexer](17-language-reference.md#indexer) syntax.

An alternative representation can be written like this:

```
  vars = {
    ssh_port = 2222
  }
```

or

```
  vars["ssh_port"] = 2222
```

### Custom Variable Values <a id="custom-variables-values"></a>

Valid values for custom variables include:

* [Strings](17-language-reference.md#string-literals), [numbers](17-language-reference.md#numeric-literals) and [booleans](17-language-reference.md#boolean-literals)
* [Arrays](17-language-reference.md#array) and [dictionaries](17-language-reference.md#dictionary)
* [Functions](03-monitoring-basics.md#custom-variables-functions)

You can also define nested values such as dictionaries in dictionaries.

This example defines the custom variable `disks` as a dictionary.
Its first key, `disk /`, is itself set to a dictionary
with one key-value pair.

```
  vars.disks["disk /"] = {
    disk_partitions = "/"
  }
```

This can be written as a resolved structure like this:

```
  vars = {
    disks = {
      "disk /" = {
        disk_partitions = "/"
      }
    }
  }
```

Keep this in mind when trying to access specific sub-keys
in apply rules or functions.

Another example which is shown in the example configuration:

```
  vars.notification["mail"] = {
    groups = [ "icingaadmins" ]
  }
```

This defines the `notification` custom variable as a dictionary
with the key `mail`. Its value is a dictionary with the key `groups`
which itself has an array as value. Note: This array has exactly the
format that the `user_groups` attribute for [notification apply rules](03-monitoring-basics.md#using-apply-notifications)
expects.

```
  vars.notification = {
    mail = {
      groups = [
        "icingaadmins"
      ]
    }
  }
```

<!-- Keep this for compatibility -->
<a id="custom-attributes-functions"></a>

### Functions as Custom Variables <a id="custom-variables-functions"></a>

Icinga 2 lets you specify [functions](17-language-reference.md#functions) for custom variables.
The special case here is that whenever Icinga 2 needs the value for such a custom variable it runs
the function and uses whatever value the function returns:

```
object CheckCommand "random-value" {
  command = [ PluginDir + "/check_dummy", "0", "$text$" ]

  vars.text = {{ Math.random() * 100 }}
}
```

This example uses the [abbreviated lambda syntax](17-language-reference.md#nullary-lambdas).

These functions have access to a number of variables:

  Variable     | Description
  -------------|---------------
  user         | The User object (for notifications).
  service      | The Service object (for service checks/notifications/event handlers).
  host         | The Host object.
  command      | The command object (e.g. a CheckCommand object for checks).

Here's an example:

```
vars.text = {{ host.check_interval }}
```

In addition to these variables the [macro](18-library-reference.md#scoped-functions-macro) function can be used to retrieve the
value of arbitrary macro expressions:

```
vars.text = {{
  if (macro("$address$") == "127.0.0.1") {
    log("Running a check for localhost!")
  }

  return "Some text"
}}
```

The `resolve_arguments` function can be used to resolve a command and its arguments in much
the same fashion as Icinga does for the `command` and `arguments` attributes of
commands. The `by_ssh` command uses this functionality to let users specify a
command and arguments that should be executed via SSH:

```
arguments = {
  "-C" = {{
    var command = macro("$by_ssh_command$")
    var arguments = macro("$by_ssh_arguments$")

    if (typeof(command) == String && !arguments) {
      return command
    }

    var escaped_args = []
    for (arg in resolve_arguments(command, arguments)) {
      escaped_args.add(escape_shell_arg(arg))
    }
    return escaped_args.join(" ")
  }}
  ...
}
```

Accessing object attributes at runtime inside these functions is described in the
[advanced topics](08-advanced-topics.md#access-object-attributes-at-runtime) chapter.


## Runtime Macros <a id="runtime-macros"></a>

Macros can be used to access other objects' attributes and [custom variables](03-monitoring-basics.md#custom-variables)
at runtime. For example they are used in command definitions to figure out
which IP address a check should be run against:

```
object CheckCommand "my-ping" {
  command = [ PluginDir + "/check_ping" ]

  arguments = {
    "-H" = "$ping_address$"
    "-w" = "$ping_wrta$,$ping_wpl$%"
    "-c" = "$ping_crta$,$ping_cpl$%"
    "-p" = "$ping_packets$"
  }

  // Resolve from a host attribute, or custom variable.
  vars.ping_address = "$address$"

  // Default values
  vars.ping_wrta = 100
  vars.ping_wpl = 5

  vars.ping_crta = 250
  vars.ping_cpl = 10

  vars.ping_packets = 5
}

object Host "router" {
  check_command = "my-ping"
  address = "10.0.0.1"
}
```

In this example we are using the `$address$` macro to refer to the host's `address`
attribute.

We can also directly refer to custom variables, e.g. by using `$ping_wrta$`. Icinga
automatically tries to find the closest match for the attribute you specified. The
exact rules for this are explained in the next section.

> **Note**
>
> When using the `$` sign as a single character you must escape it with an
> additional dollar character (`$$`).


### Evaluation Order <a id="macro-evaluation-order"></a>

When executing commands Icinga 2 checks the following objects in this order to look
up macros and their respective values:

1. User object (only for notifications)
2. Service object
3. Host object
4. Command object
5. Global custom variables in the `Vars` constant

This evaluation order allows you to define default values for custom variables
in your command objects.

Here's how you can override the custom variable `ping_packets` from the previous
example:

```
object Service "ping" {
  host_name = "localhost"
  check_command = "my-ping"

  vars.ping_packets = 10 // Overrides the default value of 5 given in the command
}
```

If a custom variable isn't defined anywhere, an empty value is used and a warning is
written to the Icinga 2 log.

You can also directly refer to a specific attribute -- thereby ignoring these evaluation
rules -- by specifying the full attribute name:

```
$service.vars.ping_wrta$
```

This retrieves the value of the `ping_wrta` custom variable for the service.
This returns an empty value if the service does not have such a custom variable, no matter
whether another object such as the host has this attribute.


### Host Runtime Macros <a id="host-runtime-macros"></a>

The following host macros are available in all commands that are executed for
hosts or services:

  Name                         | Description
  -----------------------------|--------------
  host.name                    | The name of the host object.
  host.display\_name           | The value of the `display_name` attribute.
  host.state                   | The host's current state. Can be one of `UNREACHABLE`, `UP` and `DOWN`.
  host.state\_id               | The host's current state. Can be one of `0` (up), `1` (down) and `2` (unreachable).
  host.state\_type             | The host's current state type. Can be one of `SOFT` and `HARD`.
  host.check\_attempt          | The current check attempt number.
  host.max\_check\_attempts    | The maximum number of checks which are executed before changing to a hard state.
  host.last\_state             | The host's previous state. Can be one of `UNREACHABLE`, `UP` and `DOWN`.
  host.last\_state\_id         | The host's previous state. Can be one of `0` (up), `1` (down) and `2` (unreachable).
  host.last\_state\_type       | The host's previous state type. Can be one of `SOFT` and `HARD`.
  host.last\_state\_change     | The last state change's timestamp.
  host.downtime\_depth         | The number of active downtimes.
  host.duration\_sec           | The time since the last state change.
  host.latency                 | The host's check latency.
  host.execution\_time         | The host's check execution time.
  host.output                  | The last check's output.
  host.perfdata                | The last check's performance data.
  host.last\_check             | The timestamp when the last check was executed.
  host.check\_source           | The monitoring instance that performed the last check.
  host.num\_services           | Number of services associated with the host.
  host.num\_services\_ok       | Number of services associated with the host which are in an `OK` state.
  host.num\_services\_warning  | Number of services associated with the host which are in a `WARNING` state.
  host.num\_services\_unknown  | Number of services associated with the host which are in an `UNKNOWN` state.
  host.num\_services\_critical | Number of services associated with the host which are in a `CRITICAL` state.

In addition to these specific runtime macros [host object](09-object-types.md#objecttype-host)
attributes can be accessed too.

### Service Runtime Macros <a id="service-runtime-macros"></a>

The following service macros are available in all commands that are executed for
services:

  Name                         | Description
  -----------------------------|--------------
  service.name                 | The short name of the service object.
  service.display\_name        | The value of the `display_name` attribute.
  service.check\_command       | The short name of the command along with any arguments to be used for the check.
  service.state                | The service's current state. Can be one of `OK`, `WARNING`, `CRITICAL` and `UNKNOWN`.
  service.state\_id            | The service's current state. Can be one of `0` (ok), `1` (warning), `2` (critical) and `3` (unknown).
  service.state\_type          | The service's current state type. Can be one of `SOFT` and `HARD`.
  service.check\_attempt       | The current check attempt number.
  service.max\_check\_attempts | The maximum number of checks which are executed before changing to a hard state.
  service.last\_state          | The service's previous state. Can be one of `OK`, `WARNING`, `CRITICAL` and `UNKNOWN`.
  service.last\_state\_id      | The service's previous state. Can be one of `0` (ok), `1` (warning), `2` (critical) and `3` (unknown).
  service.last\_state\_type    | The service's previous state type. Can be one of `SOFT` and `HARD`.
  service.last\_state\_change  | The last state change's timestamp.
  service.downtime\_depth      | The number of active downtimes.
  service.duration\_sec        | The time since the last state change.
  service.latency              | The service's check latency.
  service.execution\_time      | The service's check execution time.
  service.output               | The last check's output.
  service.perfdata             | The last check's performance data.
  service.last\_check          | The timestamp when the last check was executed.
  service.check\_source        | The monitoring instance that performed the last check.

In addition to these specific runtime macros [service object](09-object-types.md#objecttype-service)
attributes can be accessed too.

### Command Runtime Macros <a id="command-runtime-macros"></a>

The following macros are available in all commands:

  Name                   | Description
  -----------------------|--------------
  command.name           | The name of the command object.

### User Runtime Macros <a id="user-runtime-macros"></a>

The following macros are available in all commands that are executed for
users:

  Name                   | Description
  -----------------------|--------------
  user.name              | The name of the user object.
  user.display\_name     | The value of the `display_name` attribute.

In addition to these specific runtime macros [user object](09-object-types.md#objecttype-user)
attributes can be accessed too.

### Notification Runtime Macros <a id="notification-runtime-macros"></a>

  Name                   | Description
  -----------------------|--------------
  notification.type      | The type of the notification.
  notification.author    | The author of the notification comment if existing.
  notification.comment   | The comment of the notification if existing.

In addition to these specific runtime macros [notification object](09-object-types.md#objecttype-notification)
attributes can be accessed too.
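
As a rough sketch of how the host, service, user and notification macros from the tables above might be combined, the following notification command passes a few of them to a script as environment variables. The command name `my-service-notification`, the script path and the environment variable names are hypothetical and only serve as illustration; the macros themselves are taken from the tables above.

```
object NotificationCommand "my-service-notification" {
  // Hypothetical script path, adjust it to wherever your notification script lives.
  command = [ SysconfDir + "/icinga2/scripts/my-service-notification.sh" ]

  // Each value is resolved at runtime, right before the command is executed.
  env = {
    NOTIFICATIONTYPE = "$notification.type$"
    NOTIFICATIONAUTHOR = "$notification.author$"
    NOTIFICATIONCOMMENT = "$notification.comment$"
    USERNAME = "$user.name$"
    HOSTNAME = "$host.name$"
    SERVICENAME = "$service.name$"
    SERVICESTATE = "$service.state$"
    SERVICEOUTPUT = "$service.output$"
  }
}
```

Passing values through `env` rather than as command line arguments is one possible design choice here; it keeps potentially long values such as the check output out of the argument list.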

### Global Runtime Macros <a id="global-runtime-macros"></a>

The following macros are available in all executed commands:

  Name                     | Description
  -------------------------|--------------
  icinga.timet             | Current UNIX timestamp.
  icinga.long\_date\_time  | Current date and time including timezone information. Example: `2014-01-03 11:23:08 +0000`
  icinga.short\_date\_time | Current date and time. Example: `2014-01-03 11:23:08`
  icinga.date              | Current date. Example: `2014-01-03`
  icinga.time              | Current time including timezone information. Example: `11:23:08 +0000`
  icinga.uptime            | Current uptime of the Icinga 2 process.

The following macros provide global statistics:

  Name                                | Description
  ------------------------------------|------------------------------------
  icinga.num\_services\_ok            | Current number of services in state 'OK'.
  icinga.num\_services\_warning       | Current number of services in state 'Warning'.
  icinga.num\_services\_critical      | Current number of services in state 'Critical'.
  icinga.num\_services\_unknown       | Current number of services in state 'Unknown'.
  icinga.num\_services\_pending       | Current number of pending services.
  icinga.num\_services\_unreachable   | Current number of unreachable services.
  icinga.num\_services\_flapping      | Current number of flapping services.
  icinga.num\_services\_in\_downtime  | Current number of services in downtime.
  icinga.num\_services\_acknowledged  | Current number of acknowledged service problems.
  icinga.num\_hosts\_up               | Current number of hosts in state 'Up'.
  icinga.num\_hosts\_down             | Current number of hosts in state 'Down'.
  icinga.num\_hosts\_unreachable      | Current number of unreachable hosts.
  icinga.num\_hosts\_pending          | Current number of pending hosts.
  icinga.num\_hosts\_flapping         | Current number of flapping hosts.
  icinga.num\_hosts\_in\_downtime     | Current number of hosts in downtime.
  icinga.num\_hosts\_acknowledged     | Current number of acknowledged host problems.


## Apply Rules <a id="using-apply"></a>

Several object types require an object relation, e.g. [Service](09-object-types.md#objecttype-service),
[Notification](09-object-types.md#objecttype-notification), [Dependency](09-object-types.md#objecttype-dependency),
[ScheduledDowntime](09-object-types.md#objecttype-scheduleddowntime) objects. The
object relations are documented in the linked chapters.

If you create a service object, for example, you have to specify the [host_name](09-object-types.md#objecttype-service)
attribute and reference an existing host object.

```
object Service "ping4" {
  check_command = "ping4"
  host_name = "icinga2-agent1.localdomain"
}
```

This isn't comfortable when managing a huge set of configuration objects which could
[match](03-monitoring-basics.md#using-apply-expressions) on a common pattern.

Instead you want to use **[apply](17-language-reference.md#apply) rules**.

If you want basic monitoring for all your hosts, add a `ping4` service apply rule
for all hosts which have the `address` attribute specified. Just one rule for 1000 hosts
instead of 1000 service objects. Apply rules will automatically generate them for you.

```
apply Service "ping4" {
  check_command = "ping4"
  assign where host.address
}
```

More explanations on assign where expressions can be found [here](03-monitoring-basics.md#using-apply-expressions).

### Apply Rules: Prerequisites <a id="using-apply-prerquisites"></a>

Before you start with apply rules keep the following in mind:

* Define the best match.
    * A set of unique [custom variables](03-monitoring-basics.md#custom-variables) for these hosts/services?
    * Or [group](03-monitoring-basics.md#groups) memberships, e.g. a host being a member of a hostgroup which should have a service set?
    * A generic pattern [match](18-library-reference.md#global-functions-match) on the host/service name?
    * [Multiple expressions combined](03-monitoring-basics.md#using-apply-expressions) with `&&` or `||` [operators](17-language-reference.md#expression-operators)
* All expressions must return a boolean value (an empty string, for example, is treated as `false`).

More specific object type requirements are described in these chapters:

* [Apply services to hosts](03-monitoring-basics.md#using-apply-services)
* [Apply notifications to hosts and services](03-monitoring-basics.md#using-apply-notifications)
* [Apply dependencies to hosts and services](03-monitoring-basics.md#using-apply-dependencies)
* [Apply scheduled downtimes to hosts and services](03-monitoring-basics.md#using-apply-scheduledowntimes)

### Apply Rules: Usage Examples <a id="using-apply-usage-examples"></a>

You can set/override object attributes in apply rules using the respectively available
objects in that scope (host and/or service objects).

```
vars.application_type = host.vars.application_type
```

[Custom variables](03-monitoring-basics.md#custom-variables) can also store
nested dictionaries and arrays. That way you can use them not only for matching
on their existence or values in apply expressions, but also to assign
("inherit") their values to the objects generated from apply rules.

Remember the examples shown for [custom variable values](03-monitoring-basics.md#custom-variables-values):

```
  vars.notification["mail"] = {
    groups = [ "icingaadmins" ]
  }
```

You can do two things here:

* Check for the existence of the `notification` custom variable and its nested dictionary key `mail`.
If this is boolean true, the notification object will be generated.
* Assign the value of the `groups` key to the `user_groups` attribute.

```
apply Notification "mail-icingaadmin" to Host {
  [...]

  user_groups = host.vars.notification.mail.groups

  assign where host.vars.notification.mail
}

```

A more advanced example is to use [apply rules with for loops on arrays or
dictionaries](03-monitoring-basics.md#using-apply-for) provided by
[custom variables](03-monitoring-basics.md#custom-variables) or groups.

Remember the examples shown for [custom variable values](03-monitoring-basics.md#custom-variables-values):

```
  vars.disks["disk /"] = {
    disk_partitions = "/"
  }
```

You can iterate over all dictionary keys defined in `disks`.
You can optionally use the value to specify additional object attributes.

```
apply Service for (disk => config in host.vars.disks) {
  [...]

  vars.disk_partitions = config.disk_partitions
}
```

Please read the [apply for chapter](03-monitoring-basics.md#using-apply-for)
for more specific insights.


> **Tip**
>
> Building configuration in that dynamic way requires detailed information
> of the generated objects. Use the `object list` [CLI command](11-cli-commands.md#cli-command-object)
> after successful [configuration validation](11-cli-commands.md#config-validation).


### Apply Rules Expressions <a id="using-apply-expressions"></a>

You can use simple or advanced combinations of apply rule expressions. Each
expression must evaluate to the boolean `true` value in order to match. An empty string,
for instance, will be interpreted as `false`. In a similar fashion undefined
attributes will return `false`.

Returns `false`:

```
assign where host.vars.attribute_does_not_exist
```

Multiple `assign where` condition rows are evaluated as `OR` condition.

You can combine multiple expressions for matching only a subset of objects.
In some cases,
you want to be able to add more than one assign/ignore where expression which matches
a specific condition. To achieve this you can use the logical `&&` (and) and `||` (or) operators.

#### Apply Rules Expressions Examples <a id="using-apply-expressions-examples"></a>

Assign a service to a specific host in a host group [array](18-library-reference.md#array-type) using the [in operator](17-language-reference.md#expression-operators):

```
assign where "hostgroup-dev" in host.groups
```

Assign an object when a custom variable is [equal](17-language-reference.md#expression-operators) to a value:

```
assign where host.vars.application_type == "database"

assign where service.vars.sms_notify == true
```

Assign an object if a dictionary [contains](18-library-reference.md#dictionary-contains) a given key:

```
assign where host.vars.app_dict.contains("app")
```

Match the host name by using a [case insensitive match](18-library-reference.md#global-functions-match):

```
assign where match("webserver*", host.name)
```

Match the host name by using a [regular expression](18-library-reference.md#global-functions-regex). Please note the [escaped](17-language-reference.md#string-literals-escape-sequences) backslash character:

```
assign where regex("^webserver-[\\d+]", host.name)
```

[Match](18-library-reference.md#global-functions-match) all hosts whose name matches the `*mysql*` pattern and (`&&`) whose custom variable `prod_mysql_db`
matches the `db-*` pattern. All hosts with the custom variable `test_server` set to `true`
should be ignored, as should any host whose name matches the `*internal` pattern.

```
object HostGroup "mysql-server" {
  display_name = "MySQL Server"

  assign where match("*mysql*", host.name) && match("db-*", host.vars.prod_mysql_db)
  ignore where host.vars.test_server == true
  ignore where match("*internal", host.name)
}
```

A similar example for advanced notification apply rule filters: The notification is assigned if the service
attribute `notes` [matches](18-library-reference.md#global-functions-match) the `has gold support 24x7` string `AND` one of
two conditions passes: either the `customer` host custom variable is set to `customer-xy`
`OR` the host custom variable `always_notify` is set to `true`.

The notification is ignored for services whose host name ends with `*internal`
`OR` whose `priority` custom variable is [less than](17-language-reference.md#expression-operators) `2` while the host
custom variable `is_clustered` is set to `true`.

```
template Notification "cust-xy-notification" {
  users = [ "noc-xy", "mgmt-xy" ]
  command = "mail-service-notification"
}

apply Notification "notify-cust-xy-mysql" to Service {
  import "cust-xy-notification"

  assign where match("*has gold support 24x7*", service.notes) && (host.vars.customer == "customer-xy" || host.vars.always_notify == true)
  ignore where match("*internal", host.name) || (service.vars.priority < 2 && host.vars.is_clustered == true)
}
```

More advanced examples are covered [here](08-advanced-topics.md#use-functions-assign-where).

### Apply Services to Hosts <a id="using-apply-services"></a>

The sample configuration already includes a detailed example in [hosts.conf](04-configuration.md#hosts-conf)
and [services.conf](04-configuration.md#services-conf) for this use case.

The example for `ssh` applies a service object to all hosts with the `address`
attribute being defined and the custom variable `os` set to the string `Linux` in `vars`.

```
apply Service "ssh" {
  import "generic-service"

  check_command = "ssh"

  assign where host.address && host.vars.os == "Linux"
}
```

Other detailed examples are used in their respective chapters, for example
[apply services with custom command arguments](03-monitoring-basics.md#command-passing-parameters).

### Apply Notifications to Hosts and Services <a id="using-apply-notifications"></a>

Notifications are applied to specific targets (`Host` or `Service`) and work in a similar
manner:

```
apply Notification "mail-noc" to Service {
  import "mail-service-notification"

  user_groups = [ "noc" ]

  assign where host.vars.notification.mail
}
```

In this example the `mail-noc` notification is created as an object for all services whose host has the
`notification.mail` custom variable defined. The notification command is set to `mail-service-notification`
and all members of the user group `noc` will get notified.

It is also possible to generally apply a notification template and dynamically overwrite values from
the template by checking for custom variables.
This can be achieved by using [conditional statements](17-language-reference.md#conditional-statements):

```
apply Notification "host-mail-noc" to Host {
  import "mail-host-notification"

  // replace interval inherited from `mail-host-notification` template with new notification interval set by a host custom variable
  if (host.vars.notification_interval) {
    interval = host.vars.notification_interval
  }

  // same with notification period
  if (host.vars.notification_period) {
    period = host.vars.notification_period
  }

  // Send SMS instead of email if the host's custom variable `notification_type` is set to `sms`
  if (host.vars.notification_type == "sms") {
    command = "sms-host-notification"
  } else {
    command = "mail-host-notification"
  }

  user_groups = [ "noc" ]

  assign where host.address
}
```

In the example above the notification template `mail-host-notification`
contains all relevant notification settings.
The apply rule is applied on all host objects where the `host.address` is defined.

If the host object has a specific custom variable set, its value is inherited
into the local notification object scope, e.g. `host.vars.notification_interval`,
`host.vars.notification_period` and `host.vars.notification_type`.
This overwrites attributes already specified in the imported `mail-host-notification`
template.

The corresponding host object could look like this:

```
object Host "host1" {
  import "host-linux-prod"
  display_name = "host1"
  address = "192.168.1.50"
  vars.notification_interval = 1h
  vars.notification_period = "24x7"
  vars.notification_type = "sms"
}
```

### Apply Dependencies to Hosts and Services <a id="using-apply-dependencies"></a>

Detailed examples can be found in the [dependencies](03-monitoring-basics.md#dependencies) chapter.
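
As a rough orientation only, a dependency apply rule follows the same pattern as the
service and notification rules above. The following is a minimal sketch, assuming a
parent host named `dsl-router` exists; both the dependency name `internet` and the
host name are hypothetical and not part of this chapter's configuration:

```
apply Dependency "internet" to Host {
  // hypothetical parent host; replace with a host object from your own setup
  parent_host_name = "dsl-router"

  // suppress checks and notifications for child hosts while the parent is down
  disable_checks = true
  disable_notifications = true

  // apply to every host except the parent itself
  assign where host.name != "dsl-router"
}
```

The `assign where` expression excludes the parent so that the host does not depend on itself.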

### Apply Recurring Downtimes to Hosts and Services <a id="using-apply-scheduledowntimes"></a>

The sample configuration includes an example in [downtimes.conf](04-configuration.md#downtimes-conf).

Detailed examples can be found in the [recurring downtimes](08-advanced-topics.md#recurring-downtimes) chapter.


### Using Apply For Rules <a id="using-apply-for"></a>

In addition to the standard way of using [apply rules](03-monitoring-basics.md#using-apply),
you may need to generate objects from a set (array or
dictionary). This is done with [apply for](17-language-reference.md#apply-for) expressions.

The sample configuration already includes a detailed example in [hosts.conf](04-configuration.md#hosts-conf)
and [services.conf](04-configuration.md#services-conf) for this use case.

Take the following example: A host provides the SNMP OIDs for different service check
types. The host object could look like this:

```
object Host "router-v6" {
  check_command = "hostalive"
  address6 = "2001:db8:1234::42"

  vars.oids["if01"] = "1.1.1.1.1"
  vars.oids["temp"] = "1.1.1.1.2"
  vars.oids["bgp"] = "1.1.1.1.5"
}
```

The idea is to create service objects for `if01` and `temp` but not `bgp`.
The OID value should also be used as service custom variable `snmp_oid`.
This is the command argument required by the [snmp](10-icinga-template-library.md#plugin-check-command-snmp)
check command.
The service's `display_name` should be set to the identifier inside the dictionary,
e.g. `if01`.

```
apply Service for (identifier => oid in host.vars.oids) {
  check_command = "snmp"
  display_name = identifier
  vars.snmp_oid = oid

  ignore where identifier == "bgp" //don't generate service for bgp checks
}
```

Icinga 2 evaluates the `apply for` rule for all objects with the custom variable
`oids` set.
It iterates over all dictionary items inside the `for` loop and evaluates the
`assign/ignore where` expressions. You can access the loop variable
in these expressions, e.g. to ignore specific values.

In this example the `bgp` identifier is ignored. This avoids generating
unwanted services. A different approach would be to match the `oid` value with a
[regex](18-library-reference.md#global-functions-regex)/[wildcard match](18-library-reference.md#global-functions-match) pattern for example.

```
  ignore where regex("^\\d\\.\\d\\.\\d\\.\\d\\.5$", oid)
```

> **Note**
>
> You don't need an `assign where` expression which checks for the existence of the
> `oids` custom variable.

This method saves you from creating multiple apply rules. It also moves
the attribute specification logic from the service to the host.

<!-- Keep this for compatibility -->
<a id="using-apply-for-custom-attribute-override"></a>

#### Apply For and Custom Variable Override <a id="using-apply-for-custom-variable-override"></a>

Consider a more advanced example: You are monitoring a network device (host)
with many interfaces (services). The following requirements/problems apply:

* Each interface service should be named with a prefix and a name defined in your host object (which could be generated from your CMDB, etc.)
* Each interface has its own VLAN tag
* Some interfaces have QoS enabled
* Additional attributes such as `display_name` or `notes`, `notes_url` and `action_url` must be
dynamically generated.


> **Tip**
>
> Define the SNMP community as global constant in your [constants.conf](04-configuration.md#constants-conf) file.

```
const IftrafficSnmpCommunity = "public"
```

Define the `interfaces` [custom variable](03-monitoring-basics.md#custom-variables)
on the `cisco-catalyst-6509-34` host object and add three example interfaces as dictionary keys.

Specify additional attributes inside the nested dictionary
as learned with [custom variable values](03-monitoring-basics.md#custom-variables-values):

```
object Host "cisco-catalyst-6509-34" {
  import "generic-host"
  display_name = "Catalyst 6509 #34 VIE21"
  address = "127.0.1.4"

  /* "GigabitEthernet0/2" is the interface name,
   * and key name in service apply for later on
   */
  vars.interfaces["GigabitEthernet0/2"] = {
    /* define all custom variables with the
     * same name required for command parameters/arguments
     * in service apply (look into your CheckCommand definition)
     */
    iftraffic_units = "g"
    iftraffic_community = IftrafficSnmpCommunity
    iftraffic_bandwidth = 1
    vlan = "internal"
    qos = "disabled"
  }
  vars.interfaces["GigabitEthernet0/4"] = {
    iftraffic_units = "g"
    //iftraffic_community = IftrafficSnmpCommunity
    iftraffic_bandwidth = 1
    vlan = "remote"
    qos = "enabled"
  }
  vars.interfaces["MgmtInterface1"] = {
    iftraffic_community = IftrafficSnmpCommunity
    vlan = "mgmt"
    interface_address = "127.99.0.100" #special management ip
  }
}
```

Start with the apply for definition and iterate over `host.vars.interfaces`.
This is a dictionary and should use the variables `interface_name` as key
and `interface_config` as value for each generated object scope.

`"if-"` specifies the object name prefix for each service which results
in `if-<interface_name>` for each iteration.

```
/* loop over the host.vars.interfaces dictionary
 * for (key => value in dict) means `interface_name` as key
 * and `interface_config` as value. Access config attributes
 * with the indexer (`.`) character.
 */
apply Service "if-" for (interface_name => interface_config in host.vars.interfaces) {
```

Import the `generic-service` template, assign the [iftraffic](10-icinga-template-library.md#plugin-contrib-command-iftraffic)
`check_command`. Use the dictionary key `interface_name` to set a proper `display_name`
string for external interfaces.

```
  import "generic-service"
  check_command = "iftraffic"
  display_name = "IF-" + interface_name
```

The `interface_name` key's value is the same string used as command parameter for
`iftraffic`:

```
  /* use the key as command argument (no duplication of values in host.vars.interfaces) */
  vars.iftraffic_interface = interface_name
```

Remember that `interface_config` is a nested dictionary. In the first iteration it looks
like this:

```
interface_config = {
  iftraffic_units = "g"
  iftraffic_community = IftrafficSnmpCommunity
  iftraffic_bandwidth = 1
  vlan = "internal"
  qos = "disabled"
}
```

Access the dictionary keys with the [indexer](17-language-reference.md#indexer) syntax
and assign them to custom variables used as command parameters for the `iftraffic`
check command.

```
  /* map the custom variables as command arguments */
  vars.iftraffic_units = interface_config.iftraffic_units
  vars.iftraffic_community = interface_config.iftraffic_community
```

If you just want to inherit all attributes specified inside the `interface_config`
dictionary, add it to the generated service custom variables like this:

```
  /* the above can be achieved in a shorter fashion if the names inside host.vars.interfaces
   * are the _exact_ same as required as command parameter by the check command
   * definition.
   */
  vars += interface_config
```

If the host object did not specify values for required service custom variables,
add defaults here. This also helps to avoid unwanted configuration validation errors or
runtime failures. Please read more about conditional statements [here](17-language-reference.md#conditional-statements).

```
  /* set a default value for units and bandwidth */
  if (interface_config.iftraffic_units == "") {
    vars.iftraffic_units = "m"
  }
  if (interface_config.iftraffic_bandwidth == "") {
    vars.iftraffic_bandwidth = 1
  }
  if (interface_config.vlan == "") {
    vars.vlan = "not set"
  }
  if (interface_config.qos == "") {
    vars.qos = "not set"
  }
```

If the host object did not specify a custom SNMP community,
set a default value specified by the [global constant](17-language-reference.md#constants) `IftrafficSnmpCommunity`.

```
  /* set the global constant if not explicitly
   * provided by the `interfaces` dictionary on the host
   */
  if (len(interface_config.iftraffic_community) == 0 || len(vars.iftraffic_community) == 0) {
    vars.iftraffic_community = IftrafficSnmpCommunity
  }
```

Use the provided values to [calculate](17-language-reference.md#expression-operators)
more object attributes which can be e.g.
seen in external interfaces.

```
  /* Calculate some additional object attributes after populating the `vars` dictionary */
  notes = "Interface check for " + interface_name + " (units: '" + interface_config.iftraffic_units + "') in VLAN '" + vars.vlan + "' with QoS '" + vars.qos + "'"
  notes_url = "https://foreman.company.com/hosts/" + host.name
  action_url = "https://snmp.checker.company.com/" + host.name + "/if-" + interface_name
}
```

> **Tip**
>
> Building configuration in that dynamic way requires detailed information
> of the generated objects. Use the `object list` [CLI command](11-cli-commands.md#cli-command-object)
> after successful [configuration validation](11-cli-commands.md#config-validation).

Verify that the apply-for-rule successfully created the service objects with the
inherited custom variables:

```
# icinga2 daemon -C
# icinga2 object list --type Service --name *catalyst*

Object 'cisco-catalyst-6509-34!if-GigabitEthernet0/2' of type 'Service':
......
  * vars
    % = modified in '/etc/icinga2/conf.d/iftraffic.conf', lines 59:3-59:26
    * iftraffic_bandwidth = 1
    * iftraffic_community = "public"
      % = modified in '/etc/icinga2/conf.d/iftraffic.conf', lines 53:3-53:65
    * iftraffic_interface = "GigabitEthernet0/2"
      % = modified in '/etc/icinga2/conf.d/iftraffic.conf', lines 49:3-49:43
    * iftraffic_units = "g"
      % = modified in '/etc/icinga2/conf.d/iftraffic.conf', lines 52:3-52:57
    * qos = "disabled"
    * vlan = "internal"


Object 'cisco-catalyst-6509-34!if-GigabitEthernet0/4' of type 'Service':
...
  * vars
    % = modified in '/etc/icinga2/conf.d/iftraffic.conf', lines 59:3-59:26
    * iftraffic_bandwidth = 1
    * iftraffic_community = "public"
      % = modified in '/etc/icinga2/conf.d/iftraffic.conf', lines 53:3-53:65
      % = modified in '/etc/icinga2/conf.d/iftraffic.conf', lines 79:5-79:53
    * iftraffic_interface = "GigabitEthernet0/4"
      % = modified in '/etc/icinga2/conf.d/iftraffic.conf', lines 49:3-49:43
    * iftraffic_units = "g"
      % = modified in '/etc/icinga2/conf.d/iftraffic.conf', lines 52:3-52:57
    * qos = "enabled"
    * vlan = "remote"

Object 'cisco-catalyst-6509-34!if-MgmtInterface1' of type 'Service':
...
  * vars
    % = modified in '/etc/icinga2/conf.d/iftraffic.conf', lines 59:3-59:26
    * iftraffic_bandwidth = 1
      % = modified in '/etc/icinga2/conf.d/iftraffic.conf', lines 66:5-66:32
    * iftraffic_community = "public"
      % = modified in '/etc/icinga2/conf.d/iftraffic.conf', lines 53:3-53:65
    * iftraffic_interface = "MgmtInterface1"
      % = modified in '/etc/icinga2/conf.d/iftraffic.conf', lines 49:3-49:43
    * iftraffic_units = "m"
      % = modified in '/etc/icinga2/conf.d/iftraffic.conf', lines 52:3-52:57
      % = modified in '/etc/icinga2/conf.d/iftraffic.conf', lines 63:5-63:30
    * interface_address = "127.99.0.100"
    * qos = "not set"
      % = modified in '/etc/icinga2/conf.d/iftraffic.conf', lines 72:5-72:24
    * vlan = "mgmt"
```

### Use Object Attributes in Apply Rules <a id="using-apply-object-attributes"></a>

Since apply rules are evaluated after the generic objects, you
can reference existing host and/or service object attributes as
values for any object attribute specified in that apply rule.
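
As a small illustration before the dictionary-based example that follows, here is a
minimal sketch of a plain apply rule that copies attributes of the matched host into
the generated service. The service name `ssh-labelled` and the string wording are
hypothetical; `host.name` and `host.address` are regular host attributes, and the
`generic-service` template is assumed to exist as in the sample configuration:

```
apply Service "ssh-labelled" {
  import "generic-service"
  check_command = "ssh"

  // `host` refers to the host object this rule is currently evaluated for
  display_name = "SSH on " + host.name
  notes = "Checked via address " + host.address

  assign where host.address
}
```

The host object and apply for rule that follow extend the same idea to values stored
in a dictionary custom variable.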

```
object Host "opennebula-host" {
  import "generic-host"
  address = "10.1.1.2"

  vars.hosting["cust1"] = {
    http_uri = "/shop"
    customer_name = "Customer 1"
    customer_id = "7568"
    support_contract = "gold"
  }
  vars.hosting["cust2"] = {
    http_uri = "/"
    customer_name = "Customer 2"
    customer_id = "7569"
    support_contract = "silver"
  }
}
```

`hosting` is a custom variable with the Dictionary value type.
This is required in order to iterate with the `key => value` notation
in the apply for rule below.

```
apply Service for (customer => config in host.vars.hosting) {
  import "generic-service"
  check_command = "ping4"

  vars.qos = "disabled"

  vars += config

  vars.http_uri = "/" + customer + "/" + config.http_uri

  display_name = "Shop Check for " + vars.customer_name + "-" + vars.customer_id

  notes = "Support contract: " + vars.support_contract + " for Customer " + vars.customer_name + " (" + vars.customer_id + ")."

  notes_url = "https://foreman.company.com/hosts/" + host.name
  action_url = "https://snmp.checker.company.com/" + host.name + "/" + vars.customer_id
}
```

Each loop iteration has different values for `customer` and `config`
in the local scope.

1.

```
customer = "cust1"
config = {
  http_uri = "/shop"
  customer_name = "Customer 1"
  customer_id = "7568"
  support_contract = "gold"
}
```

2.

```
customer = "cust2"
config = {
  http_uri = "/"
  customer_name = "Customer 2"
  customer_id = "7569"
  support_contract = "silver"
}
```

You can now add the `config` dictionary into `vars`.

```
vars += config
```

Now it looks like the following in the first iteration:

```
customer = "cust1"
vars = {
  http_uri = "/shop"
  customer_name = "Customer 1"
  customer_id = "7568"
  support_contract = "gold"
}
```

Remember, you know this structure already. Custom
variables can also be accessed by using the [indexer](17-language-reference.md#indexer)
syntax.

```
  vars.http_uri = ... + config.http_uri
```

can also be written as

```
  vars += config
  vars.http_uri = ... + vars.http_uri
```


## Groups <a id="groups"></a>

A group is a collection of similar objects. Groups are primarily used as a
visualization aid in web interfaces.

Group membership is defined at the respective object itself. If
you have a host group named `windows` for example, and want to assign
specific hosts to this group for later viewing the group on your
alert dashboard, first create a HostGroup object:

```
object HostGroup "windows" {
  display_name = "Windows Servers"
}
```

Then add your hosts to this group:

```
template Host "windows-server" {
  groups += [ "windows" ]
}

object Host "mssql-srv1" {
  import "windows-server"

  vars.mssql_port = 1433
}

object Host "mssql-srv2" {
  import "windows-server"

  vars.mssql_port = 1433
}
```

This can be done for service and user groups the same way:

```
object UserGroup "windows-mssql-admins" {
  display_name = "Windows MSSQL Admins"
}

template User "generic-windows-mssql-users" {
  groups += [ "windows-mssql-admins" ]
}

object User "win-mssql-noc" {
  import "generic-windows-mssql-users"

  email = "noc@example.com"
}

object User "win-mssql-ops" {
  import "generic-windows-mssql-users"

  email = "ops@example.com"
}
```

### Group Membership Assign <a id="group-assign-intro"></a>

Instead of manually assigning each object to a group you can also assign objects
to a group based on their attributes:

```
object HostGroup "prod-mssql" {
  display_name = "Production MSSQL Servers"

  assign where host.vars.mssql_port && host.vars.prod_mysql_db
  ignore where host.vars.test_server == true
  ignore where match("*internal", host.name)
}
```

In this example all hosts with the custom variables `mssql_port` and `prod_mysql_db`
set will be added as members to the host group `prod-mssql`. However, all
hosts [matching](18-library-reference.md#global-functions-match) the string `*internal`
or with the `test_server` attribute set to `true` are **not** added to this group.

Details on the `assign where` syntax can be found in the
[Language Reference](17-language-reference.md#apply).

## Notifications <a id="alert-notifications"></a>

Notifications for service and host problems are an integral part of your
monitoring setup.

When a host or service is in a downtime, a problem has been acknowledged or
the dependency logic determined that the host/service is unreachable, no
notifications are sent. You can configure additional type and state filters
to refine which notifications are actually sent.

There are many ways of sending notifications, e.g. by email, XMPP,
IRC, Twitter, etc. On its own Icinga 2 does not know how to send notifications.
Instead it relies on external mechanisms such as shell scripts to notify users.
More notification methods are listed in the [addons and plugins](13-addons.md#notification-scripts-interfaces)
chapter.

A notification specification requires one or more users (and/or user groups)
who will be notified in case of problems.
These users must have all custom
variables defined which will be used in the `NotificationCommand` on execution.

The user `icingaadmin` in the example below will get notified only on `Warning` and
`Critical` problems. In addition to that `Recovery` notifications are sent (they require
the `OK` state).

```
object User "icingaadmin" {
  display_name = "Icinga 2 Admin"
  enable_notifications = true
  states = [ OK, Warning, Critical ]
  types = [ Problem, Recovery ]
  email = "icinga@localhost"
}
```

If you don't set the `states` and `types` configuration attributes for the `User`
object, notifications for all states and types will be sent.

Details on troubleshooting notification problems can be found [here](15-troubleshooting.md#troubleshooting).

> **Note**
>
> Make sure that the [notification](11-cli-commands.md#enable-features) feature is enabled
> in order to execute notification commands.

You should choose which information you (and your notified users) need in
case of emergency, and also which information does not provide any value to you and
your environment.

An example notification command is explained [here](03-monitoring-basics.md#notification-commands).

You can add all shared attributes to a `Notification` template which is inherited
by the defined notifications. That way you avoid duplicating attributes in each
`Notification` object. Attributes can be overridden locally.

```
template Notification "generic-notification" {
  interval = 15m

  command = "mail-service-notification"

  states = [ Warning, Critical, Unknown ]
  types = [ Problem, Acknowledgement, Recovery, Custom, FlappingStart,
            FlappingEnd, DowntimeStart, DowntimeEnd, DowntimeRemoved ]

  period = "24x7"
}
```

The time period `24x7` is included as example configuration with Icinga 2.

Use the `apply` keyword to create `Notification` objects for your services:

```
apply Notification "notify-cust-xy-mysql" to Service {
  import "generic-notification"

  users = [ "noc-xy", "mgmt-xy" ]

  assign where match("*has gold support 24x7*", service.notes) && (host.vars.customer == "customer-xy" || host.vars.always_notify == true)
  ignore where match("*internal", host.name) || (service.vars.priority < 2 && host.vars.is_clustered == true)
}
```


Instead of assigning users to notifications, you can also add the `user_groups`
attribute with a list of user groups to the `Notification` object. Icinga 2 will
send notifications to all group members.

> **Note**
>
> Only users who have been notified of a problem before (`Warning`, `Critical`, `Unknown`
> states for services, `Down` for hosts) will receive `Recovery` notifications.

Icinga 2 v2.10 allows you to configure `Acknowledgement` and/or `Recovery`
without a `Problem` notification. These notifications will be sent without
any problem notifications beforehand, and can be used for e.g. ticket systems.

```
  types = [ Acknowledgement, Recovery ]
```

### Notifications: Users from Host/Service <a id="alert-notifications-users-host-service"></a>

A common pattern is to store the users and user groups
on the host or service objects instead of the notification
object itself.

The sample configuration provided in [hosts.conf](04-configuration.md#hosts-conf) and [notifications.conf](04-configuration.md#notifications-conf)
already provides an example for this use case.

> **Tip**
>
> Please make sure to read the [apply](03-monitoring-basics.md#using-apply) and
> [custom variable values](03-monitoring-basics.md#custom-variables-values) chapter to
> fully understand these examples.


Specify the user and groups as nested custom variable on the host object:

```
object Host "icinga2-agent1.localdomain" {
  [...]

  vars.notification["mail"] = {
    groups = [ "icingaadmins" ]
    users = [ "icingaadmin" ]
  }
  vars.notification["sms"] = {
    users = [ "icingaadmin" ]
  }
}
```

As you can see, there is the option to use two different notification
apply rules here: One for `mail` and one for `sms`.

This example assigns the `users` and `groups` nested keys from the `notification`
custom variable to the actual notification object attributes.

Since errors are hard to debug if host objects don't specify the required
configuration attributes, you can add a safety condition which logs which
host object is affected.

```
critical/config: Host 'icinga2-client3.localdomain' does not specify required user/user_groups configuration attributes for notification 'mail-icingaadmin'.
```

You can also use the [script debugger](20-script-debugger.md#script-debugger) for more advanced insights.

```
apply Notification "mail-host-notification" to Host {
  [...]

  /* Log which host does not specify required user/user_groups attributes. This will fail immediately during config validation and help a lot. */
  if (len(host.vars.notification.mail.users) == 0 && len(host.vars.notification.mail.groups) == 0) {
    log(LogCritical, "config", "Host '" + host.name + "' does not specify required user/user_groups configuration attributes for notification '" + name + "'.")
  }

  users = host.vars.notification.mail.users
  user_groups = host.vars.notification.mail.groups

  assign where host.vars.notification.mail && typeof(host.vars.notification.mail) == Dictionary
}

apply Notification "sms-host-notification" to Host {
  [...]

  /* Log which host does not specify required user/user_groups attributes. This will fail immediately during config validation and help a lot. */
  if (len(host.vars.notification.sms.users) == 0 && len(host.vars.notification.sms.groups) == 0) {
    log(LogCritical, "config", "Host '" + host.name + "' does not specify required user/user_groups configuration attributes for notification '" + name + "'.")
  }

  users = host.vars.notification.sms.users
  user_groups = host.vars.notification.sms.groups

  assign where host.vars.notification.sms && typeof(host.vars.notification.sms) == Dictionary
}
```

The example above uses [typeof](18-library-reference.md#global-functions-typeof) as safety function to ensure that
the `mail` key really provides a dictionary as value. Otherwise
the configuration validation could fail if an admin adds something
like this on another host:

```
  vars.notification.mail = "yes"
```


You can also do a more fine-grained assignment on the service object:

```
apply Service "http" {
  [...]

  vars.notification["mail"] = {
    groups = [ "icingaadmins" ]
    users = [ "icingaadmin" ]
  }

  [...]
}
```

This notification apply rule is different to the one above.
The service
notification users and groups are inherited from the service and if not set,
from the host object. A default user is set too.

```
apply Notification "mail-service-notification" to Service {
  [...]

  if (service.vars.notification.mail.users) {
    users = service.vars.notification.mail.users
  } else if (host.vars.notification.mail.users) {
    users = host.vars.notification.mail.users
  } else {
    /* Default user who receives everything. */
    users = [ "icingaadmin" ]
  }

  if (service.vars.notification.mail.groups) {
    user_groups = service.vars.notification.mail.groups
  } else if (host.vars.notification.mail.groups) {
    user_groups = host.vars.notification.mail.groups
  }

  assign where ( host.vars.notification.mail && typeof(host.vars.notification.mail) == Dictionary ) || ( service.vars.notification.mail && typeof(service.vars.notification.mail) == Dictionary )
}
```

### Notification Escalations <a id="notification-escalations"></a>

When a problem notification is sent and a problem still exists at the time of re-notification
you may want to escalate the problem to the next support level. A different approach
is to configure the default notification by email, and escalate the problem via SMS
if not already solved.

You can define notification start and end times as additional configuration
attributes making the `Notification` object a so-called `notification escalation`.
Using templates you can share the basic notification attributes such as users or the
`interval` (and override them for the escalation then).

Using the example from above, you can define additional users being escalated for SMS
notifications between start and end time.

```
object User "icinga-oncall-2nd-level" {
  display_name = "Icinga 2nd Level"

  vars.mobile = "+1 555 424642"
}

object User "icinga-oncall-1st-level" {
  display_name = "Icinga 1st Level"

  vars.mobile = "+1 555 424642"
}
```

Define an additional [NotificationCommand](03-monitoring-basics.md#notification-commands) for SMS notifications.

> **Note**
>
> The example is not complete as there are many different SMS providers.
> Please note that sending SMS notifications will require an SMS provider
> or local hardware with an active SIM card.

```
object NotificationCommand "sms-notification" {
  command = [
    PluginDir + "/send_sms_notification",
    "$mobile$",
    "..."
  ]
}
```

The two new notification escalations are added onto the local host
and its service `ping4` using the `generic-notification` template.
The user `icinga-oncall-2nd-level` will get notified by SMS (`sms-notification`
command) after `30m` until `1h`.

> **Note**
>
> The `interval` was set to 15m in the `generic-notification`
> template example. Lower that value in your escalations by using a secondary
> template or by overriding the attribute directly in the `notifications` array
> position for `escalation-sms-2nd-level`.

If the problem is neither resolved nor acknowledged (an acknowledgement would prevent
further notifications), the `escalation-sms-1st-level` notification starts `1h` after the
initial problem notification, but only for one hour (`2h` is the `end` key in the `times` dictionary).
1774 1775``` 1776apply Notification "mail" to Service { 1777 import "generic-notification" 1778 1779 command = "mail-notification" 1780 users = [ "icingaadmin" ] 1781 1782 assign where service.name == "ping4" 1783} 1784 1785apply Notification "escalation-sms-2nd-level" to Service { 1786 import "generic-notification" 1787 1788 command = "sms-notification" 1789 users = [ "icinga-oncall-2nd-level" ] 1790 1791 times = { 1792 begin = 30m 1793 end = 1h 1794 } 1795 1796 assign where service.name == "ping4" 1797} 1798 1799apply Notification "escalation-sms-1st-level" to Service { 1800 import "generic-notification" 1801 1802 command = "sms-notification" 1803 users = [ "icinga-oncall-1st-level" ] 1804 1805 times = { 1806 begin = 1h 1807 end = 2h 1808 } 1809 1810 assign where service.name == "ping4" 1811} 1812``` 1813 1814### Notification Delay <a id="notification-delay"></a> 1815 1816Sometimes the problem in question should not be announced when the notification is due 1817(the object reaching the `HARD` state), but after a certain period. In Icinga 2 1818you can use the `times` dictionary and set `begin = 15m` as key and value if you want to 1819postpone the notification window for 15 minutes. Leave out the `end` key -- if not set, 1820Icinga 2 will not check against any end time for this notification. Make sure to 1821specify a relatively low notification `interval` to get notified soon enough again. 1822 1823``` 1824apply Notification "mail" to Service { 1825 import "generic-notification" 1826 1827 command = "mail-notification" 1828 users = [ "icingaadmin" ] 1829 1830 interval = 5m 1831 1832 times.begin = 15m // delay notification window 1833 1834 assign where service.name == "ping4" 1835} 1836``` 1837 1838### Disable Re-notifications <a id="disable-renotification"></a> 1839 1840If you prefer to be notified only once, you can disable re-notifications by setting the 1841`interval` attribute to `0`. 
1842 1843``` 1844apply Notification "notify-once" to Service { 1845 import "generic-notification" 1846 1847 command = "mail-notification" 1848 users = [ "icingaadmin" ] 1849 1850 interval = 0 // disable re-notification 1851 1852 assign where service.name == "ping4" 1853} 1854``` 1855 1856### Notification Filters by State and Type <a id="notification-filters-state-type"></a> 1857 1858If there are no notification state and type filter attributes defined at the `Notification` 1859or `User` object, Icinga 2 assumes that all states and types are being notified. 1860 1861Available state and type filters for notifications are: 1862 1863``` 1864template Notification "generic-notification" { 1865 1866 states = [ OK, Warning, Critical, Unknown ] 1867 types = [ Problem, Acknowledgement, Recovery, Custom, FlappingStart, 1868 FlappingEnd, DowntimeStart, DowntimeEnd, DowntimeRemoved ] 1869} 1870``` 1871 1872 1873## Commands <a id="commands"></a> 1874 1875Icinga 2 uses three different command object types to specify how 1876checks should be performed, notifications should be sent, and 1877events should be handled. 1878 1879### Check Commands <a id="check-commands"></a> 1880 1881[CheckCommand](09-object-types.md#objecttype-checkcommand) objects define the command line how 1882a check is called. 1883 1884[CheckCommand](09-object-types.md#objecttype-checkcommand) objects are referenced by 1885[Host](09-object-types.md#objecttype-host) and [Service](09-object-types.md#objecttype-service) objects 1886using the `check_command` attribute. 1887 1888> **Note** 1889> 1890> Make sure that the [checker](11-cli-commands.md#enable-features) feature is enabled in order to 1891> execute checks. 1892 1893#### Integrate the Plugin with a CheckCommand Definition <a id="command-plugin-integration"></a> 1894 1895Unless you have done so already, download your check plugin and put it 1896into the [PluginDir](04-configuration.md#constants-conf) directory. 
The following example uses the 1897`check_mysql` plugin contained in the Monitoring Plugins package. 1898 1899The plugin path and all command arguments are made a list of 1900double-quoted string arguments for proper shell escaping. 1901 1902Call the `check_disk` plugin with the `--help` parameter to see 1903all available options. Our example defines warning (`-w`) and 1904critical (`-c`) thresholds for the disk usage. Without any 1905partition defined (`-p`) it will check all local partitions. 1906 1907``` 1908icinga@icinga2 $ /usr/lib64/nagios/plugins/check_mysql --help 1909... 1910This program tests connections to a MySQL server 1911 1912Usage: 1913check_mysql [-d database] [-H host] [-P port] [-s socket] 1914[-u user] [-p password] [-S] [-l] [-a cert] [-k key] 1915[-C ca-cert] [-D ca-dir] [-L ciphers] [-f optfile] [-g group] 1916``` 1917 1918Next step is to understand how [command parameters](03-monitoring-basics.md#command-passing-parameters) 1919are being passed from a host or service object, and add a [CheckCommand](09-object-types.md#objecttype-checkcommand) 1920definition based on these required parameters and/or default values. 1921 1922Please continue reading in the [plugins section](05-service-monitoring.md#service-monitoring-plugins) for additional integration examples. 1923 1924#### Passing Check Command Parameters from Host or Service <a id="command-passing-parameters"></a> 1925 1926Check command parameters are defined as custom variables which can be accessed as runtime macros 1927by the executed check command. 1928 1929The check command parameters for ITL provided plugin check command definitions are documented 1930[here](10-icinga-template-library.md#icinga-template-library), for example 1931[disk](10-icinga-template-library.md#plugin-check-command-disk). 1932 1933In order to practice passing command parameters you should [integrate your own plugin](03-monitoring-basics.md#command-plugin-integration). 
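
As a quick illustration with an ITL command, the `disk` parameters referenced above can be passed from a service apply rule. This is a minimal sketch: the service name and the `host.vars.os` filter are assumptions borrowed from the example configuration, and the threshold values are arbitrary.

```
apply Service "disk" {
  import "generic-service"

  check_command = "disk"

  // Custom variables documented for the ITL "disk" CheckCommand.
  // They are resolved as runtime macros when the check is executed.
  vars.disk_wfree = "10%"
  vars.disk_cfree = "5%"
  vars.disk_partitions = [ "/", "/var" ]

  // Assumed filter; adjust to your own host attributes.
  assign where host.vars.os == "Linux"
}
```

Setting the same custom variables on the host object instead would work as well, since the service looks them up on the host if they are not set on the service itself.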

The following example will use `check_mysql` provided by the [Monitoring Plugins installation](02-installation.md#setting-up-check-plugins).

Define the default check command custom variables, for example `mysql_user` and `mysql_password`
(freely definable naming schema), and optionally their default threshold values. You can
then use these custom variables as runtime macros for [command arguments](03-monitoring-basics.md#command-arguments)
on the command line.

> **Tip**
>
> Use a common command type as prefix for your command arguments to increase
> readability. `mysql_user` conveys the context better than just
> `user` as argument.

The default custom variables can be overridden by the custom variables
defined in the host or service using the check command `my-mysql`. The custom variables
can also be inherited from a parent template using additive inheritance (`+=`).

```
# vim /etc/icinga2/conf.d/commands.conf

object CheckCommand "my-mysql" {
  command = [ PluginDir + "/check_mysql" ] //constants.conf -> const PluginDir

  arguments = {
    "-H" = "$mysql_host$"
    "-u" = {
      required = true
      value = "$mysql_user$"
    }
    "-p" = "$mysql_password$"
    "-P" = "$mysql_port$"
    "-s" = "$mysql_socket$"
    "-a" = "$mysql_cert$"
    "-d" = "$mysql_database$"
    "-k" = "$mysql_key$"
    "-C" = "$mysql_ca_cert$"
    "-D" = "$mysql_ca_dir$"
    "-L" = "$mysql_ciphers$"
    "-f" = "$mysql_optfile$"
    "-g" = "$mysql_group$"
    "-S" = {
      set_if = "$mysql_check_slave$"
      description = "Check if the slave thread is running properly."
    }
    "-l" = {
      set_if = "$mysql_ssl$"
      description = "Use ssl encryption"
    }
  }

  vars.mysql_check_slave = false
  vars.mysql_ssl = false
  vars.mysql_host = "$address$"
}
```

The check command definition also sets `mysql_host` to the `$address$` default value. You can override
this command parameter if, for example, your MySQL host is not running on the same server's IP address.

Make sure to pass all required command parameters, such as `mysql_user`, `mysql_password` and `mysql_database`.
`MysqlUsername` and `MysqlPassword` are specified as [global constants](04-configuration.md#constants-conf)
in this example.

```
# vim /etc/icinga2/conf.d/services.conf

apply Service "mysql-icinga-db-health" {
  import "generic-service"

  check_command = "my-mysql"

  vars.mysql_user = MysqlUsername
  vars.mysql_password = MysqlPassword

  vars.mysql_database = "icinga"
  vars.mysql_host = "192.168.33.11"

  assign where match("icinga2*", host.name)
  ignore where host.vars.no_health_check == true
}
```


Take a different example: The example host configuration in [hosts.conf](04-configuration.md#hosts-conf)
also applies an `ssh` service check. Your host's SSH port is not the default `22`, but set to `2022`.
You can pass the command parameter as custom variable `ssh_port` directly inside the service apply rule
inside [services.conf](04-configuration.md#services-conf):

```
apply Service "ssh" {
  import "generic-service"

  check_command = "ssh"
  vars.ssh_port = 2022 //custom command parameter

  assign where (host.address || host.address6) && host.vars.os == "Linux"
}
```

If you prefer this being configured at the host instead of the service, modify the host configuration
object instead.
The runtime macro resolving order is described [here](03-monitoring-basics.md#macro-evaluation-order).

```
object Host "icinga2-agent1.localdomain" {
...
  vars.ssh_port = 2022
}
```

#### Passing Check Command Parameters Using Apply For <a id="command-passing-parameters-apply-for"></a>

The host `my-server` with the generated services from the `basic-partitions` dictionary (see
[apply for](03-monitoring-basics.md#using-apply-for) for details) checks a basic set of disk partitions
with modified custom variables (warning thresholds at `10%`, critical thresholds at `5%`
free disk space).

The custom variable `disk_partitions` can either hold a single string or an array of
string values for passing multiple partitions to the `check_disk` check plugin.

```
object Host "my-server" {
  import "generic-host"
  address = "127.0.0.1"
  address6 = "::1"

  vars.local_disks["basic-partitions"] = {
    disk_partitions = [ "/", "/tmp", "/var", "/home" ]
  }
}

apply Service for (disk => config in host.vars.local_disks) {
  import "generic-service"
  check_command = "my-disk"

  vars += config

  vars.disk_wfree = "10%"
  vars.disk_cfree = "5%"
}
```


More details on using arrays in custom variables can be found in
[this chapter](03-monitoring-basics.md#custom-variables).


#### Command Arguments <a id="command-arguments"></a>

In addition to the short `command` array specified in the command object,
it is advised to define plugin/script parameters in the `arguments`
dictionary attribute.

The value of the `--parameter` key itself is a dictionary with additional
keys. They allow you to create generic command objects and also serve documentation
purposes, e.g. with the `description` field copying the plugin's help text in there.
2090The Icinga Director uses this field to show the argument's purpose when selecting it. 2091 2092``` 2093 arguments = { 2094 "--parameter" = { 2095 description = "..." 2096 value = "..." 2097 } 2098 } 2099``` 2100 2101Each argument is optional by default and is omitted if 2102the value is not set. 2103 2104Learn more about integrating plugins with CheckCommand 2105objects in [this chapter](05-service-monitoring.md#service-monitoring-plugin-checkcommand). 2106 2107There are additional possibilities for creating a command only once, 2108with different parameters and arguments, shown below. 2109 2110##### Command Arguments: Value <a id="command-arguments-value"></a> 2111 2112In order to find out about the command argument, call the plugin's help 2113or consult the README. 2114 2115``` 2116./check_systemd.py --help 2117 2118... 2119 2120 -u UNIT, --unit UNIT Name of the systemd unit that is beeing tested. 2121``` 2122 2123Whenever the long parameter name is available, prefer this over the short one. 2124 2125``` 2126 arguments = { 2127 "--unit" = { 2128 2129 } 2130 } 2131``` 2132 2133Define a unique `prefix` for the command's specific arguments. Best practice is to follow this schema: 2134 2135``` 2136<command name>_<parameter name> 2137``` 2138 2139Therefore use `systemd_` as prefix, and use the long plugin parameter name `unit` inside the [runtime macro](03-monitoring-basics.md#runtime-macros) 2140syntax. 2141 2142``` 2143 arguments = { 2144 "--unit" = { 2145 value = "$systemd_unit$" 2146 } 2147 } 2148``` 2149 2150In order to specify a default value, specify 2151a [custom variable](03-monitoring-basics.md#custom-variables) inside 2152the CheckCommand object. 2153 2154``` 2155 vars.systemd_unit = "icinga2" 2156``` 2157 2158This value can be overridden from the host/service 2159object as command parameters. 
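
Put together, the pieces above could look like the following sketch. The command name `my-systemd`, the plugin location below `PluginContribDir`, the service name, and the `sshd` unit are assumptions for illustration only.

```
object CheckCommand "my-systemd" {
  // Assumed plugin location; adjust to where check_systemd.py is installed.
  command = [ PluginContribDir + "/check_systemd.py" ]

  arguments = {
    "--unit" = {
      value = "$systemd_unit$"
    }
  }

  // Default value, used when neither host nor service sets systemd_unit.
  vars.systemd_unit = "icinga2"
}

apply Service "systemd-sshd" {
  import "generic-service"

  check_command = "my-systemd"

  // Overrides the default defined in the CheckCommand object.
  vars.systemd_unit = "sshd"

  // Assumed filter taken from the example configuration.
  assign where host.vars.os == "Linux"
}
```

Keeping the default inside the CheckCommand means services only need to set `systemd_unit` when they deviate from it.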
2160 2161 2162##### Command Arguments: Description <a id="command-arguments-description"></a> 2163 2164Best practice, also inside the [ITL](10-icinga-template-library.md#icinga-template-library), is to always 2165copy the command parameter help output into the `description` 2166field of your check command. 2167 2168Learn more about integrating plugins with CheckCommand 2169objects in [this chapter](05-service-monitoring.md#service-monitoring-plugin-checkcommand). 2170 2171With the [example above](03-monitoring-basics.md#command-arguments-value), 2172inspect the parameter's help text. 2173 2174``` 2175./check_systemd.py --help 2176 2177... 2178 2179 -u UNIT, --unit UNIT Name of the systemd unit that is beeing tested. 2180``` 2181 2182Copy this into the command arguments `description` entry. 2183 2184``` 2185 arguments = { 2186 "--unit" = { 2187 value = "$systemd_unit$" 2188 description = "Name of the systemd unit that is beeing tested." 2189 } 2190 } 2191``` 2192 2193##### Command Arguments: Required <a id="command-arguments-required"></a> 2194 2195Specifies whether this command argument is required, or not. By 2196default all arguments are optional. 2197 2198> **Tip** 2199> 2200> Good plugins provide optional parameters in square brackets, e.g. `[-w SECONDS]`. 2201 2202The `required` field can be toggled with a [boolean](17-language-reference.md#boolean-literals) value. 2203 2204``` 2205 arguments = { 2206 "--host" = { 2207 value = "..." 2208 description = "..." 2209 required = true 2210 } 2211 } 2212``` 2213 2214Whenever the check is executed and the argument is missing, Icinga 2215logs an error. This allows to better debug configuration errors 2216instead of sometimes unreadable plugin errors when parameters are 2217missing. 2218 2219##### Command Arguments: Skip Key <a id="command-arguments-skip-key"></a> 2220 2221The `arguments` attribute requires a key, empty values are not allowed. 
2222To overcome this for parameters which don't need the name in front of 2223the value, use the `skip_key` [boolean](17-language-reference.md#boolean-literals) toggle. 2224 2225``` 2226 command = [ PrefixDir + "/bin/icingacli", "businessprocess", "process", "check" ] 2227 2228 arguments = { 2229 "--process" = { 2230 value = "$icingacli_businessprocess_process$" 2231 description = "Business process to monitor" 2232 skip_key = true 2233 required = true 2234 order = -1 2235 } 2236 } 2237``` 2238 2239The service specifies the [custom variable](03-monitoring-basics.md#custom-variables) `icingacli_businessprocess_process`. 2240 2241``` 2242 vars.icingacli_businessprocess_process = "bp-shop-web" 2243``` 2244 2245This results in this command line without the `--process` parameter: 2246 2247```bash 2248'/bin/icingacli' 'businessprocess' 'process' 'check' 'bp-shop-web' 2249``` 2250 2251You can use this method to put everything into the `arguments` attribute 2252in a defined order and without keys. This avoids entries in the `command` 2253attributes too. 2254 2255 2256##### Command Arguments: Set If <a id="command-arguments-set-if"></a> 2257 2258This can be used for the following scenarios: 2259 2260**Parameters without value, e.g. `--sni`.** 2261 2262``` 2263 command = [ PluginDir + "/check_http"] 2264 2265 arguments = { 2266 "--sni" = { 2267 set_if = "$http_sni$" 2268 } 2269 } 2270``` 2271 2272Whenever a host/service object sets the `http_sni` [custom variable](03-monitoring-basics.md#custom-variables) 2273to `true`, the parameter is added to the command line. 2274 2275```bash 2276'/usr/lib64/nagios/plugins/check_http' '--sni' 2277``` 2278 2279[Numeric](17-language-reference.md#numeric-literals) values are allowed too. 2280 2281**Parameters with value, but additionally controlled with an extra custom variable boolean flag.** 2282 2283The following example is taken from the [postgres]() CheckCommand. 
The host parameter should use a `value`, but only when the `postgres_unixsocket`
[custom variable](03-monitoring-basics.md#custom-variables) is set to `false`.

Note: `set_if` is using a runtime lambda function because the value
is evaluated at runtime. This is explained in [this chapter](08-advanced-topics.md#use-functions-object-config).

```
  command = [ PluginContribDir + "/check_postgres.pl" ]

  arguments = {
    "-H" = {
      value = "$postgres_host$"
      set_if = {{ macro("$postgres_unixsocket$") == false }}
      description = "hostname(s) to connect to; defaults to none (Unix socket)"
    }
  }
```

An executed check for this host and its services ...

```
object Host "postgresql-cluster" {
  // ...

  vars.postgres_host = "192.168.56.200"
  vars.postgres_unixsocket = false
}
```

... uses the following command line:

```bash
'/usr/lib64/nagios/plugins/check_postgres.pl' '-H' '192.168.56.200'
```

Host/service objects which set `postgres_unixsocket` to `true` don't add the `-H` parameter
and its value to the command line.

References: [abbreviated lambda syntax](17-language-reference.md#nullary-lambdas), [macro](18-library-reference.md#scoped-functions-macro).

##### Command Arguments: Order <a id="command-arguments-order"></a>

Plugins may require parameters in a specific order, e.g. one after the other,
or one parameter always in the first position.

```
  arguments = {
    "--first" = {
      value = "..."
      description = "..."
      order = -5
    }
    "--second" = {
      value = "..."
      description = "..."
      order = -4
    }
    "--last" = {
      value = "..."
      description = "..."
      order = 99
    }
  }
```

Keep in mind that positional arguments need to be tested thoroughly.
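
To make the effect of `order` more tangible, here is a minimal sketch with concrete values. The plugin `check_example`, the command name, and the `example_*` custom variables are hypothetical and only serve to illustrate the sorting.

```
object CheckCommand "my-example" {
  // Hypothetical plugin, not part of the Monitoring Plugins package.
  command = [ PluginDir + "/check_example" ]

  arguments = {
    // Deliberately defined out of sequence: the position inside the
    // dictionary is not what determines the command line order.
    "--last" = {
      value = "$example_last$"
      order = 99
    }
    "--first" = {
      value = "$example_first$"
      order = -5
    }
    "--second" = {
      value = "$example_second$"
      order = -4
    }
  }

  vars.example_first = "a"
  vars.example_second = "b"
  vars.example_last = "c"
}
```

With these values the arguments should be emitted sorted by their `order` value, i.e. `--first`, `--second`, `--last`. Verify the resulting command line in the debug log before relying on it.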
2349 2350##### Command Arguments: Repeat Key <a id="command-arguments-repeat-key"></a> 2351 2352Parameters can use [Array](17-language-reference.md#array) as value type. Whenever Icinga encounters 2353an array, it repeats the parameter key and each value element by default. 2354 2355``` 2356 command = [ NscpPath + "\\nscp.exe", "client" ] 2357 2358 arguments = { 2359 "-a" = { 2360 value = "$nscp_arguments$" 2361 description = "..." 2362 repeat_key = true 2363 } 2364 } 2365``` 2366 2367On a host/service object, specify the `nscp_arguments` [custom variable](03-monitoring-basics.md#custom-variables) 2368as an array. 2369 2370``` 2371 vars.nscp_arguments = [ "exclude=sppsvc", "exclude=ShellHWDetection" ] 2372``` 2373 2374This translates into the following command line: 2375 2376``` 2377nscp.exe 'client' '-a' 'exclude=sppsvc' '-a' 'exclude=ShellHWDetection' 2378``` 2379 2380If the plugin requires you to pass the list without repeating the key, 2381set `repeat_key = false` in the argument definition. 2382 2383``` 2384 command = [ NscpPath + "\\nscp.exe", "client" ] 2385 2386 arguments = { 2387 "-a" = { 2388 value = "$nscp_arguments$" 2389 description = "..." 2390 repeat_key = false 2391 } 2392 } 2393``` 2394 2395This translates into the following command line: 2396 2397``` 2398nscp.exe 'client' '-a' 'exclude=sppsvc' 'exclude=ShellHWDetection' 2399``` 2400 2401 2402##### Command Arguments: Key <a id="command-arguments-key"></a> 2403 2404The `arguments` attribute requires unique keys. Sometimes, you'll 2405need to override this in the resulting command line with same key 2406names. Therefore you can specifically override the arguments key. 2407 2408``` 2409arguments = { 2410 "--key1" = { 2411 value = "..." 2412 key = "-specialkey" 2413 } 2414 "--key2" = { 2415 value = "..." 2416 key = "-specialkey" 2417 } 2418} 2419``` 2420 2421This results in the following command line: 2422 2423``` 2424 '-specialkey' '...' '-specialkey' '...' 
2425``` 2426 2427#### Environment Variables <a id="command-environment-variables"></a> 2428 2429The `env` command object attribute specifies a list of environment variables with values calculated 2430from custom variables which should be exported as environment variables prior to executing the command. 2431 2432This is useful for example for hiding sensitive information on the command line output 2433when passing credentials to database checks: 2434 2435``` 2436object CheckCommand "mysql" { 2437 command = [ PluginDir + "/check_mysql" ] 2438 2439 arguments = { 2440 "-H" = "$mysql_address$" 2441 "-d" = "$mysql_database$" 2442 } 2443 2444 vars.mysql_address = "$address$" 2445 vars.mysql_database = "icinga" 2446 vars.mysql_user = "icinga_check" 2447 vars.mysql_pass = "password" 2448 2449 env.MYSQLUSER = "$mysql_user$" 2450 env.MYSQLPASS = "$mysql_pass$" 2451} 2452``` 2453 2454The executed command line visible with `ps` or `top` looks like this and hides 2455the database credentials in the user's environment. 2456 2457```bash 2458/usr/lib/nagios/plugins/check_mysql -H 192.168.56.101 -d icinga 2459``` 2460 2461> **Note** 2462> 2463> If the CheckCommand also supports setting the parameter in the command line, 2464> ensure to use a different name for the custom variable. Otherwise Icinga 2 2465> adds the command line parameter. 2466 2467If a specific CheckCommand object provided with the [Icinga Template Library](10-icinga-template-library.md#icinga-template-library) 2468needs additional environment variables, you can import it into a new custom 2469CheckCommand object and add additional `env` keys. 
Example for the [mysql_health](10-icinga-template-library.md#plugin-contrib-command-mysql_health) 2470CheckCommand: 2471 2472``` 2473object CheckCommand "mysql_health_env" { 2474 import "mysql_health" 2475 2476 // https://labs.consol.de/nagios/check_mysql_health/ 2477 env.NAGIOS__SERVICEMYSQL_USER = "$mysql_health_env_username$" 2478 env.NAGIOS__SERVICEMYSQL_PASS = "$mysql_health_env_password$" 2479} 2480``` 2481 2482Specify the custom variables `mysql_health_env_username` and `mysql_health_env_password` 2483in the service object then. 2484 2485> **Note** 2486> 2487> Keep in mind that the values are still visible with the [debug console](11-cli-commands.md#cli-command-console) 2488> and the inspect mode in the [Icinga Director](https://icinga.com/docs/director/latest/). 2489 2490You can also set global environment variables in the application's 2491sysconfig configuration file, e.g. `HOME` or specific library paths 2492for Oracle. Beware that these environment variables can be used 2493by any CheckCommand object and executed plugin and can leak sensitive 2494information. 2495 2496### Notification Commands <a id="notification-commands"></a> 2497 2498[NotificationCommand](09-object-types.md#objecttype-notificationcommand) 2499objects define how notifications are delivered to external interfaces 2500(email, XMPP, IRC, Twitter, etc.). 2501[NotificationCommand](09-object-types.md#objecttype-notificationcommand) 2502objects are referenced by [Notification](09-object-types.md#objecttype-notification) 2503objects using the `command` attribute. 2504 2505> **Note** 2506> 2507> Make sure that the [notification](11-cli-commands.md#enable-features) feature is enabled 2508> in order to execute notification commands. 
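
The relation between the two object types can be sketched as follows. The script `my-notification.sh`, the object names, and the chosen environment variable names are assumptions for illustration. The script is not shipped with Icinga 2.

```
object NotificationCommand "my-notification" {
  // Hypothetical script; create it yourself below the scripts directory.
  command = [ ConfigDir + "/scripts/my-notification.sh" ]

  // Runtime macros are resolved and exported to the script's environment.
  env = {
    NOTIFICATIONTYPE = "$notification.type$"
    HOSTNAME = "$host.name$"
    HOSTSTATE = "$host.state$"
    HOSTOUTPUT = "$host.output$"
    USEREMAIL = "$user.email$"
  }
}

apply Notification "my-host-notification" to Host {
  // References the NotificationCommand object by name.
  command = "my-notification"
  users = [ "icingaadmin" ]

  assign where host.address
}
```

The `Notification` object decides who is notified and when, while the `NotificationCommand` object only describes how the message is delivered.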

While it's possible to specify an entire notification command right
in the NotificationCommand object, it is generally advisable to create a
shell script in the `/etc/icinga2/scripts` directory and have the
NotificationCommand object refer to that.

A fresh Icinga 2 install comes with two example scripts for host
and service notifications by email. Based on the Icinga 2 runtime macros
(such as `$service.output$` for the current check output) it's possible
to send email to the user(s) associated with the notification itself
(`$user.email$`). Feel free to take these scripts as a starting point
for your own individual notification solution - and keep in mind that
nearly everything is technically possible.

Information needed to generate notifications is passed to the scripts as
arguments. The NotificationCommand objects `mail-host-notification` and
`mail-service-notification` correspond to the shell scripts
`mail-host-notification.sh` and `mail-service-notification.sh` in
`/etc/icinga2/scripts` and define default values for arguments. These
defaults can always be overridden locally.

> **Note**
>
> This example requires the `mail` binary installed on the Icinga 2
> master.
>
> Depending on the distribution, you need a local mail transfer
> agent (MTA) such as Postfix, Exim or Sendmail in order
> to send emails.
>
> These tools usually provide the `mail` binary executed
> by the notification scripts below.

#### mail-host-notification <a id="mail-host-notification"></a>

The `mail-host-notification` NotificationCommand object uses the
example notification script located in `/etc/icinga2/scripts/mail-host-notification.sh`.

Here is a quick overview of the arguments that can be used. See also [host runtime
macros](03-monitoring-basics.md#-host-runtime-macros) for further
information.
2550 2551 Name | Description 2552 -------------------------------|--------------------------------------- 2553 `notification_date` | **Required.** Date and time. Defaults to `$icinga.long_date_time$`. 2554 `notification_hostname` | **Required.** The host's `FQDN`. Defaults to `$host.name$`. 2555 `notification_hostdisplayname` | **Required.** The host's display name. Defaults to `$host.display_name$`. 2556 `notification_hostoutput` | **Required.** Output from host check. Defaults to `$host.output$`. 2557 `notification_useremail` | **Required.** The notification's recipient(s). Defaults to `$user.email$`. 2558 `notification_hoststate` | **Required.** Current state of host. Defaults to `$host.state$`. 2559 `notification_type` | **Required.** Type of notification. Defaults to `$notification.type$`. 2560 `notification_address` | **Optional.** The host's IPv4 address. Defaults to `$address$`. 2561 `notification_address6` | **Optional.** The host's IPv6 address. Defaults to `$address6$`. 2562 `notification_author` | **Optional.** Comment author. Defaults to `$notification.author$`. 2563 `notification_comment` | **Optional.** Comment text. Defaults to `$notification.comment$`. 2564 `notification_from` | **Optional.** Define a valid From: string (e.g. `"Icinga 2 Host Monitoring <icinga@example.com>"`). Requires `GNU mailutils` (Debian/Ubuntu) or `mailx` (RHEL/SUSE). 2565 `notification_icingaweb2url` | **Optional.** Define URL to your Icinga Web 2 (e.g. `"https://www.example.com/icingaweb2"`) 2566 `notification_logtosyslog` | **Optional.** Set `true` to log notification events to syslog; useful for debugging. Defaults to `false`. 2567 2568#### mail-service-notification <a id="mail-service-notification"></a> 2569 2570The `mail-service-notification` NotificationCommand object uses the 2571example notification script located in `/etc/icinga2/scripts/mail-service-notification.sh`. 2572 2573Here is a quick overview of the arguments that can be used. 
See also [service runtime
macros](03-monitoring-basics.md#-service-runtime-macros) for further
information.

  Name                              | Description
  ----------------------------------|---------------------------------------
  `notification_date`               | **Required.** Date and time. Defaults to `$icinga.long_date_time$`.
  `notification_hostname`           | **Required.** The host's `FQDN`. Defaults to `$host.name$`.
  `notification_servicename`        | **Required.** The service name. Defaults to `$service.name$`.
  `notification_hostdisplayname`    | **Required.** Host display name. Defaults to `$host.display_name$`.
  `notification_servicedisplayname` | **Required.** Service display name. Defaults to `$service.display_name$`.
  `notification_serviceoutput`      | **Required.** Output from service check. Defaults to `$service.output$`.
  `notification_useremail`          | **Required.** The notification's recipient(s). Defaults to `$user.email$`.
  `notification_servicestate`       | **Required.** Current state of service. Defaults to `$service.state$`.
  `notification_type`               | **Required.** Type of notification. Defaults to `$notification.type$`.
  `notification_address`            | **Optional.** The host's IPv4 address. Defaults to `$address$`.
  `notification_address6`           | **Optional.** The host's IPv6 address. Defaults to `$address6$`.
  `notification_author`             | **Optional.** Comment author. Defaults to `$notification.author$`.
  `notification_comment`            | **Optional.** Comment text. Defaults to `$notification.comment$`.
  `notification_from`               | **Optional.** Define a valid From: string (e.g. `"Icinga 2 Host Monitoring <icinga@example.com>"`). Requires `GNU mailutils` (Debian/Ubuntu) or `mailx` (RHEL/SUSE).
  `notification_icingaweb2url`      | **Optional.** Define URL to your Icinga Web 2 (e.g. `"https://www.example.com/icingaweb2"`)
  `notification_logtosyslog`        | **Optional.** Set `true` to log notification events to syslog; useful for debugging. Defaults to `false`.
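
The optional arguments from the table can be set as custom variables on the notification object. The following is a minimal sketch: the notification name, the recipient, and the `ping4` filter are assumptions, and the From: string and URL are placeholder values.

```
apply Notification "mail-icingaadmin" to Service {
  command = "mail-service-notification"
  users = [ "icingaadmin" ]

  // Optional arguments of the mail-service-notification command.
  vars.notification_from = "Icinga 2 Service Monitoring <icinga@example.com>"
  vars.notification_icingaweb2url = "https://www.example.com/icingaweb2"
  vars.notification_logtosyslog = true

  assign where service.name == "ping4"
}
```

Enabling `notification_logtosyslog` temporarily can help to verify that the script is actually being executed.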
2595 2596 2597## Dependencies <a id="dependencies"></a> 2598 2599Icinga 2 uses host and service [Dependency](09-object-types.md#objecttype-dependency) objects 2600for determining their network reachability. 2601 2602A service can depend on a host, and vice versa. A service has an implicit 2603dependency (parent) to its host. A host to host dependency acts implicitly 2604as host parent relation. 2605When dependencies are calculated, not only the immediate parent is taken into 2606account but all parents are inherited. 2607 2608The `parent_host_name` and `parent_service_name` attributes are mandatory for 2609service dependencies, `parent_host_name` is required for host dependencies. 2610[Apply rules](03-monitoring-basics.md#using-apply) will allow you to 2611[determine these attributes](03-monitoring-basics.md#dependencies-apply-custom-variables) in a more 2612dynamic fashion if required. 2613 2614``` 2615parent_host_name = "core-router" 2616parent_service_name = "uplink-port" 2617``` 2618 2619Notifications are suppressed by default if a host or service becomes unreachable. 2620You can control that option by defining the `disable_notifications` attribute. 2621 2622``` 2623disable_notifications = false 2624``` 2625 2626If the dependency should be triggered in the parent object's soft state, you 2627need to set `ignore_soft_states` to `false`. 2628 2629The dependency state filter must be defined based on the parent object being 2630either a host (`Up`, `Down`) or a service (`OK`, `Warning`, `Critical`, `Unknown`). 2631 2632The following example will make the dependency fail and trigger it if the parent 2633object is **not** in one of these states: 2634 2635``` 2636states = [ OK, Critical, Unknown ] 2637``` 2638 2639> **In other words** 2640> 2641> If the parent service object changes into the `Warning` state, this 2642> dependency will fail and render all child objects (hosts or services) unreachable. 
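
The attribute fragments above can be combined into one dependency rule, sketched below. The rule name and the custom variable `host.vars.uplink` are hypothetical. `core-router` and `uplink-port` are taken from the fragment above.

```
apply Dependency "uplink" to Service {
  parent_host_name = "core-router"
  parent_service_name = "uplink-port"

  // The dependency fails if the parent service is NOT in one of these
  // states, i.e. when it changes to Warning.
  states = [ OK, Critical, Unknown ]

  // Keep sending notifications for unreachable child services.
  disable_notifications = false

  // Also evaluate the parent's soft states.
  ignore_soft_states = false

  // Hypothetical custom variable marking hosts behind the router.
  assign where host.vars.uplink == "core-router"
}
```

Every service on a matching host then depends on the `uplink-port` service of `core-router`.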

You can determine the child's reachability by querying the `last_reachable` attribute
via the [REST API](12-icinga2-api.md#icinga2-api).

> **Note**
>
> Reachability calculation depends on fresh and processed check results. If dependencies
> disable checks for child objects, this won't work reliably.

### Implicit Dependencies for Services on Host <a id="dependencies-implicit-host-service"></a>

Icinga 2 automatically adds an implicit dependency for services on their host. That way
service notifications are suppressed when a host is `DOWN` or `UNREACHABLE`. This dependency
does not overwrite other dependencies and implicitly sets `disable_notifications = true` and
`states = [ Up ]` for all service objects.

Service checks are still executed. If you want to prevent them from happening, you can
apply the following dependency to all services setting their host as `parent_host_name`
and disabling the checks. `assign where true` matches on all `Service` objects.

```
apply Dependency "disable-host-service-checks" to Service {
  disable_checks = true
  assign where true
}
```

### Dependencies for Network Reachability <a id="dependencies-network-reachability"></a>

A common scenario is the Icinga 2 server behind a router. Checking internet
access by pinging the Google DNS server `google-dns` is a common method, but
will fail in case the `dsl-router` host is down. Therefore the example below
defines a host dependency which acts implicitly as a parent relation too.

Furthermore the host may be reachable but ping probes are dropped by the
router's firewall. In case the `dsl-router`'s `ping4` service check fails, all
further checks for the `ping4` service on host `google-dns` should
be suppressed. This is achieved by setting the `disable_checks` attribute to `true`.

```
object Host "dsl-router" {
  import "generic-host"
  address = "192.168.1.1"
}

object Host "google-dns" {
  import "generic-host"
  address = "8.8.8.8"
}

apply Service "ping4" {
  import "generic-service"

  check_command = "ping4"

  assign where host.address
}

apply Dependency "internet" to Host {
  parent_host_name = "dsl-router"
  disable_checks = true
  disable_notifications = true

  assign where host.name != "dsl-router"
}

apply Dependency "internet" to Service {
  parent_host_name = "dsl-router"
  parent_service_name = "ping4"
  disable_checks = true

  assign where host.name != "dsl-router"
}
```

<!-- Keep this for compatibility -->
<a id="dependencies-apply-custom-attríbutes"></a>

### Apply Dependencies based on Custom Variables <a id="dependencies-apply-custom-variables"></a>

You can use [apply rules](03-monitoring-basics.md#using-apply) to set parent or
child attributes, e.g. `parent_host_name`, to other objects'
attributes.

A common example is virtual machines hosted on a master. The object
name of that master is auto-generated from your CMDB or VMware inventory
into the host's custom variables (or a generic template for your
cloud).

Define your master host object:

```
/* your master */
object Host "master.example.com" {
  import "generic-host"
}
```

Add a generic template defining all common host attributes:

```
/* generic template for your virtual machines */
template Host "generic-vm" {
  import "generic-host"
}
```

Add a template for all hosts on your example.com cloud setting
custom variable `vm_parent` to `master.example.com`:

```
template Host "generic-vm-example.com" {
  import "generic-vm"
  vars.vm_parent = "master.example.com"
}
```

Define your guest hosts:

```
object Host "www.example1.com" {
  import "generic-vm-example.com"
}

object Host "www.example2.com" {
  import "generic-vm-example.com"
}
```

Apply the host dependency to all child hosts importing the
`generic-vm` template and set the `parent_host_name`
to the previously defined custom variable `host.vars.vm_parent`.

```
apply Dependency "vm-host-to-parent-master" to Host {
  parent_host_name = host.vars.vm_parent
  assign where "generic-vm" in host.templates
}
```

You can extend this example, and make your services depend on the
`master.example.com` host too. Their local scope allows you to use
`host.vars.vm_parent` similar to the example above.

```
apply Dependency "vm-service-to-parent-master" to Service {
  parent_host_name = host.vars.vm_parent
  assign where "generic-vm" in host.templates
}
```

That way you don't need to wait for your guest hosts to become
unreachable when the master host goes down. Instead the services
will detect their reachability immediately when executing checks.

> **Note**
>
> This method with setting locally scoped variables only works in
> apply rules, but not in object definitions.
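
For comparison, the same host relation written as a standalone object could look like the sketch below. It is only an illustration of the note: without the local scope of an apply rule, the parent and child have to be named explicitly. The dependency name is hypothetical.

```
/* Hypothetical standalone dependency for a single guest host. */
object Dependency "www-example1-to-master" {
  parent_host_name = "master.example.com"   // literal name, no host.vars lookup
  child_host_name = "www.example1.com"
}
```

One such object would be needed per guest host, whereas the apply rule covers every host importing the `generic-vm` template.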

### Dependencies for Agent Checks <a id="dependencies-agent-checks"></a>

Another good example is agent-based checks. You would define a health check
for the agent daemon responding to your requests, and make all other services
querying that daemon depend on that health check.

```
apply Service "agent-health" {
  check_command = "cluster-zone"

  display_name = "cluster-health-" + host.name

  /* This follows the convention that the agent zone name is the FQDN which is the same as the host object name. */
  vars.cluster_zone = host.name

  assign where host.vars.agent_endpoint
}
```

Now, make all other agent-based checks dependent on the OK state of the `agent-health`
service.

```
apply Dependency "agent-health-check" to Service {
  parent_service_name = "agent-health"

  states = [ OK ] // Fail if the parent service state switches to NOT-OK
  disable_notifications = true

  assign where host.vars.agent_endpoint // Automatically assigns all agent endpoint checks as child services on the matched host
  ignore where service.name == "agent-health" // Avoid a self reference from child to parent
}
```

This is described in detail in [this chapter](06-distributed-monitoring.md#distributed-monitoring-health-checks).
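
To see which objects the two apply rules above would match, consider the following sketch. It assumes that matching `Endpoint` and `Zone` objects for the agent already exist; the host name, address and the `disk` service are illustrative only.

```
/* Hypothetical agent host; the custom variable triggers the apply rules. */
object Host "icinga2-agent1.localdomain" {
  check_command = "hostalive"
  address = "192.0.2.10"

  vars.agent_endpoint = "icinga2-agent1.localdomain"
}

/* Hypothetical agent-based check executed on the agent itself. */
apply Service "disk" {
  check_command = "disk"
  command_endpoint = host.vars.agent_endpoint

  assign where host.vars.agent_endpoint
}
```

With this configuration the `disk` service becomes a child of `agent-health` on the same host, while `agent-health` itself is excluded by the `ignore where` expression.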

### Event Commands <a id="event-commands"></a>

Unlike notifications, event commands for hosts/services are called on every
check execution if one of these conditions matches:

* The host/service is in a [soft state](03-monitoring-basics.md#hard-soft-states)
* The host/service state changes into a [hard state](03-monitoring-basics.md#hard-soft-states)
* The host/service state recovers from a [soft or hard state](03-monitoring-basics.md#hard-soft-states) to [OK](03-monitoring-basics.md#service-states)/[Up](03-monitoring-basics.md#host-states)

[EventCommand](09-object-types.md#objecttype-eventcommand) objects are referenced by
[Host](09-object-types.md#objecttype-host) and [Service](09-object-types.md#objecttype-service) objects
with the `event_command` attribute.

Therefore the `EventCommand` object should define a command line
evaluating the current service state and other service runtime attributes
available through runtime variables. Runtime macros such as `$service.state_type$`
and `$service.state$` will be processed by Icinga 2 and help with fine-grained
event triggers.

If the host/service is located on a client as [command endpoint](06-distributed-monitoring.md#distributed-monitoring-top-down-command-endpoint),
the event command will be executed on the client itself (similar to the check
command).

A common use case is a failing HTTP check which requires an immediate
restart via event command. Another example would be an application that is not
responding and therefore requires a restart. You can also use event handlers
to forward more details on state changes and events than the typical notification
alerts provide.
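
Before looking at complete scenarios, the basic wiring can be sketched as follows. This is a minimal illustration under assumptions: it uses the `logger` utility, assumed to be installed at `/usr/bin/logger`, to write state information to syslog. The object names, the `example-host` host and the `log_state_change_message` custom variable are hypothetical.

```
/* Hypothetical event command: write the current state to syslog. */
object EventCommand "log_state_change" {
  command = [ "/usr/bin/logger" ]   // assumed path of the logger utility

  arguments = {
    "-t" = "icinga2-event"          // syslog tag
    "message" = {
      value = "$log_state_change_message$"
      skip_key = true               // pass only the value, not the key
      order = 1                     // place the message after the tag
    }
  }

  /* Runtime macros are resolved on each execution. */
  vars.log_state_change_message = "$host.name$ $service.name$ $service.state$ $service.state_type$ $service.check_attempt$"
}

/* Hypothetical service referencing the event command. */
object Service "http" {
  host_name = "example-host"
  check_command = "http"

  event_command = "log_state_change"
}
```

A real event handler would typically inspect `$service.state$` and `$service.state_type$` in a script before acting, as the scenarios in the following sections do.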

#### Use Event Commands to Send Information from the Master <a id="event-command-send-information-from-master"></a>

This example sends a web request from the master node to an external tool
for every event triggered on a `businessprocess` service.

Define an [EventCommand](09-object-types.md#objecttype-eventcommand)
object `send_to_businesstool` which sends state changes to the external tool.

```
object EventCommand "send_to_businesstool" {
  command = [
    "/usr/bin/curl",
    "-s",
    "-X", "PUT"
  ]

  arguments = {
    "-H" = {
      value = "$businesstool_url$"
      skip_key = true
    }
    "-d" = "$businesstool_message$"
  }

  vars.businesstool_url = "http://localhost:8080/businesstool"
  vars.businesstool_message = "$host.name$ $service.name$ $service.state$ $service.state_type$ $service.check_attempt$"
}
```

Set the `event_command` attribute to `send_to_businesstool` on the Service.

```
object Service "businessprocess" {
  host_name = "businessprocess"

  check_command = "icingacli-businessprocess"
  vars.icingacli_businessprocess_process = "icinga"
  vars.icingacli_businessprocess_config = "training"

  event_command = "send_to_businesstool"
}
```

In order to test this scenario you can run:

```bash
nc -l 8080
```

This allows you to catch the web request. You can also enable the [debug log](15-troubleshooting.md#troubleshooting-enable-debug-output)
and search for the event command execution log message.

```bash
tail -f /var/log/icinga2/debug.log | grep EventCommand
```

Feed in a check result via REST API action [process-check-result](12-icinga2-api.md#icinga2-api-actions-process-check-result)
or via Icinga Web 2.

Expected Result:

```
# nc -l 8080
PUT /businesstool HTTP/1.1
User-Agent: curl/7.29.0
Host: localhost:8080
Accept: */*
Content-Length: 47
Content-Type: application/x-www-form-urlencoded

businessprocess businessprocess CRITICAL SOFT 1
```

#### Use Event Commands to Restart Service Daemon via Command Endpoint on Linux <a id="event-command-restart-service-daemon-command-endpoint-linux"></a>

This example triggers a restart of the `httpd` service on the local system
when the `procs` service check executed via Command Endpoint fails. It only
triggers if the service state is `Critical` and attempts to restart the
service before a notification is sent.

Requirements:

* Icinga 2 as client on the remote node
* icinga user with sudo permissions to the httpd daemon

Example on CentOS 7:

```
# visudo
icinga  ALL=(ALL) NOPASSWD: /usr/bin/systemctl restart httpd
```

Note: Distributions might use a different name. On Debian/Ubuntu the service is called `apache2`.

Define an [EventCommand](09-object-types.md#objecttype-eventcommand) object `restart_service`
which allows you to trigger local service restarts. Put it into a [global zone](06-distributed-monitoring.md#distributed-monitoring-global-zone-config-sync)
to sync its configuration to all clients.

```
[root@icinga2-master1.localdomain /]# vim /etc/icinga2/zones.d/global-templates/eventcommands.conf

object EventCommand "restart_service" {
  command = [ PluginDir + "/restart_service" ]

  arguments = {
    "-s" = "$service.state$"
    "-t" = "$service.state_type$"
    "-a" = "$service.check_attempt$"
    "-S" = "$restart_service$"
  }

  vars.restart_service = "$procs_command$"
}
```

This event command triggers the following script which restarts the service.
The script only restarts the service if the service state is `CRITICAL`. `WARNING` and `UNKNOWN` states
are ignored as they do not indicate an immediate failure.

```
[root@icinga2-agent1.localdomain /]# vim /usr/lib64/nagios/plugins/restart_service

#!/bin/bash

# Parse the arguments passed by the EventCommand object
while getopts "s:t:a:S:" opt; do
  case $opt in
    s)
      servicestate=$OPTARG
      ;;
    t)
      servicestatetype=$OPTARG
      ;;
    a)
      serviceattempt=$OPTARG
      ;;
    S)
      service=$OPTARG
      ;;
  esac
done

# All four arguments are required
if [ -z "$servicestate" ] || [ -z "$servicestatetype" ] || [ -z "$serviceattempt" ] || [ -z "$service" ]; then
  echo "USAGE: $0 -s servicestate -t servicestatetype -a serviceattempt -S service"
  exit 3
else
  # Only restart on the third attempt of a critical event
  if [ "$servicestate" = "CRITICAL" ] && [ "$servicestatetype" = "SOFT" ] && [ "$serviceattempt" -eq 3 ]; then
    sudo /usr/bin/systemctl restart "$service"
  fi
fi

[root@icinga2-agent1.localdomain /]# chmod +x /usr/lib64/nagios/plugins/restart_service
```

Add a service on the master node which is executed via command endpoint on the client.
Set the `event_command` attribute to `restart_service`, the name of the previously defined
EventCommand object.

```
[root@icinga2-master1.localdomain /]# vim /etc/icinga2/zones.d/master/icinga2-agent1.localdomain.conf

object Service "Process httpd" {
  check_command = "procs"
  event_command = "restart_service"
  max_check_attempts = 4

  host_name = "icinga2-agent1.localdomain"
  command_endpoint = "icinga2-agent1.localdomain"

  vars.procs_command = "httpd"
  vars.procs_warning = "1:10"
  vars.procs_critical = "1:"
}
```

In order to test this configuration just stop the `httpd` service on the remote host `icinga2-agent1.localdomain`.

```
[root@icinga2-agent1.localdomain /]# systemctl stop httpd
```

You can enable the [debug log](15-troubleshooting.md#troubleshooting-enable-debug-output) and search for the
executed command line.

```
[root@icinga2-agent1.localdomain /]# tail -f /var/log/icinga2/debug.log | grep restart_service
```

#### Use Event Commands to Restart Service Daemon via Command Endpoint on Windows <a id="event-command-restart-service-daemon-command-endpoint-windows"></a>

This example triggers a restart of the `httpd` service on the remote system
when the `service-windows` service check executed via Command Endpoint fails.
It only triggers if the service state is `Critical` and attempts to restart the
service before a notification is sent.

Requirements:

* Icinga 2 as client on the remote node
* Icinga 2 service with permissions to execute PowerShell scripts (which is the default)

Define an [EventCommand](09-object-types.md#objecttype-eventcommand) object `restart_service-windows`
which allows you to trigger local service restarts. Put it into a [global zone](06-distributed-monitoring.md#distributed-monitoring-global-zone-config-sync)
to sync its configuration to all clients.

```
[root@icinga2-master1.localdomain /]# vim /etc/icinga2/zones.d/global-templates/eventcommands.conf

object EventCommand "restart_service-windows" {
  command = [
    "C:\\Windows\\SysWOW64\\WindowsPowerShell\\v1.0\\powershell.exe",
    PluginDir + "/restart_service.ps1"
  ]

  arguments = {
    "-ServiceState" = "$service.state$"
    "-ServiceStateType" = "$service.state_type$"
    "-ServiceAttempt" = "$service.check_attempt$"
    "-Service" = "$restart_service$"
    "; exit" = {
      order = 99
      value = "$$LASTEXITCODE"
    }
  }

  vars.restart_service = "$service_win_service$"
}
```

This event command triggers the following script which restarts the service.
The script only restarts the service if the service state is `CRITICAL`. `WARNING` and `UNKNOWN` states
are ignored as they do not indicate an immediate failure.

Add the `restart_service.ps1` PowerShell script into `C:\Program Files\Icinga2\sbin`:

```
param(
  [string]$Service          = '',
  [string]$ServiceState     = '',
  [string]$ServiceStateType = '',
  [int]$ServiceAttempt      = ''
)

if (!$Service -Or !$ServiceState -Or !$ServiceStateType -Or !$ServiceAttempt) {
  $scriptName = GCI $MyInvocation.PSCommandPath | Select -Expand Name;
  Write-Host "USAGE: $scriptName -ServiceState servicestate -ServiceStateType servicestatetype -ServiceAttempt serviceattempt -Service service" -ForegroundColor red;
  exit 3;
}

# Only restart on the third attempt of a critical event
if ($ServiceState -eq "CRITICAL" -And $ServiceStateType -eq "SOFT" -And $ServiceAttempt -eq 3) {
  Restart-Service $Service;
}

exit 0;
```

Add a service on the master node which is executed via command endpoint on the client.
Set the `event_command` attribute to `restart_service-windows`, the name of the previously defined
EventCommand object.

```
[root@icinga2-master1.localdomain /]# vim /etc/icinga2/zones.d/master/icinga2-agent2.localdomain.conf

object Service "Service httpd" {
  check_command = "service-windows"
  event_command = "restart_service-windows"
  max_check_attempts = 4

  host_name = "icinga2-agent2.localdomain"
  command_endpoint = "icinga2-agent2.localdomain"

  vars.service_win_service = "httpd"
}
```

In order to test this configuration just stop the `httpd` service on the remote host `icinga2-agent2.localdomain`.

```
C:> net stop httpd
```

You can enable the [debug log](15-troubleshooting.md#troubleshooting-enable-debug-output) and search for the
executed command line in `C:\ProgramData\icinga2\var\log\icinga2\debug.log`.

#### Use Event Commands to Restart Service Daemon via SSH <a id="event-command-restart-service-daemon-ssh"></a>

This example triggers a restart of the `httpd` daemon
via SSH when the `http` service check fails.

Requirements:

* SSH connection allowed (firewall, packet filters)
* icinga user with public key authentication
* icinga user with sudo permissions to restart the httpd daemon

Example on Debian:

```
# ls /home/icinga/.ssh/
authorized_keys

# visudo
icinga  ALL=(ALL) NOPASSWD: /etc/init.d/apache2 restart
```

Define a generic [EventCommand](09-object-types.md#objecttype-eventcommand) object `event_by_ssh`
which can be used for all event commands triggered using SSH:

```
[root@icinga2-master1.localdomain /]# vim /etc/icinga2/zones.d/master/local_eventcommands.conf

/* pass event commands through ssh */
object EventCommand "event_by_ssh" {
  command = [ PluginDir + "/check_by_ssh" ]

  arguments = {
    "-H" = "$event_by_ssh_address$"
    "-p" = "$event_by_ssh_port$"
    "-C" = "$event_by_ssh_command$"
    "-l" = "$event_by_ssh_logname$"
    "-i" = "$event_by_ssh_identity$"
    "-q" = {
      set_if = "$event_by_ssh_quiet$"
    }
    "-w" = "$event_by_ssh_warn$"
    "-c" = "$event_by_ssh_crit$"
    "-t" = "$event_by_ssh_timeout$"
  }

  vars.event_by_ssh_address = "$address$"
  vars.event_by_ssh_quiet = false
}
```

The actual event command only passes the `event_by_ssh_command` attribute.
The `event_by_ssh_service` custom variable takes care of passing the correct
daemon name, while `test $service.state_id$ -gt 0` makes sure that the daemon
is only restarted when the service is not in an `OK` state.

```
object EventCommand "event_by_ssh_restart_service" {
  import "event_by_ssh"

  //only restart the daemon if state > 0 (not-ok)
  //requires sudo permissions for the icinga user
  vars.event_by_ssh_command = "test $service.state_id$ -gt 0 && sudo systemctl restart $event_by_ssh_service$"
}
```

Now set the `event_command` attribute to `event_by_ssh_restart_service` and tell it
which service should be restarted using the `event_by_ssh_service` custom variable.

```
apply Service "http" {
  import "generic-service"
  check_command = "http"

  event_command = "event_by_ssh_restart_service"
  vars.event_by_ssh_service = "$host.vars.httpd_name$"

  //vars.event_by_ssh_logname = "icinga"
  //vars.event_by_ssh_identity = "/home/icinga/.ssh/id_rsa"

  assign where host.vars.httpd_name
}
```

Specify the `httpd_name` custom variable on the host to assign the
service and to set the name of the daemon which the event handler restarts.

```
object Host "remote-http-host" {
  import "generic-host"
  address = "192.168.1.100"

  vars.httpd_name = "apache2"
}
```

In order to test this configuration just stop the `apache2` daemon on the remote host `remote-http-host`.

```
[root@remote-http-host /]# systemctl stop apache2
```

You can enable the [debug log](15-troubleshooting.md#troubleshooting-enable-debug-output) on the master, where the
event command is executed, and search for the executed command line.

```
[root@icinga2-master1.localdomain /]# tail -f /var/log/icinga2/debug.log | grep by_ssh
```