= The OCF Resource Agent Developer's Guide

== Introduction

This document is to serve as a guide and reference for all developers,
maintainers, and contributors working on OCF (Open Cluster Framework)
compliant cluster resource agents. It explains the anatomy and general
functionality of a resource agent, illustrates the resource agent API,
and provides valuable hints and tips to resource agent authors.

=== What is a resource agent?

A resource agent is an executable that manages a cluster resource. No
formal definition of a cluster resource exists, other than "anything a
cluster manages is a resource." Cluster resources can be as diverse as
IP addresses, file systems, database services, and entire virtual
machines -- to name just a few examples.

=== Who or what uses a resource agent?

Any Open Cluster Framework (OCF) compliant cluster management
application is capable of managing resources using the resource agents
described in this document. At the time of writing, two OCF compliant
cluster management applications exist for the Linux platform:

* _Pacemaker_, a cluster manager supporting both the Corosync and
  Heartbeat cluster messaging frameworks. Pacemaker evolved out of the
  Linux-HA project.
* _RGmanager_, the cluster manager bundled in Red Hat Cluster
  Suite. It supports the Corosync cluster messaging framework
  exclusively.

=== Which language is a resource agent written in?

An OCF compliant resource agent can be implemented in _any_
programming language. The API is not language specific. However, most
resource agents are implemented as shell scripts, which is why this
guide primarily uses example code written in shell language.

=== Is there a naming convention?

Yes! We have agreed to the following convention for resource agent
names: Please name resource agents using lower case letters, with
words separated by dashes (+example-agent-name+).

Existing agents may or may not follow this convention, but it is the
intention to make sure future agents follow this rule.

== API definitions

=== Environment variables

A resource agent receives all configuration information about the
resource it manages via environment variables. The names of these
environment variables are always the name of the resource parameter,
prefixed with +OCF_RESKEY_+. For example, if the resource has an +ip+
parameter set to +192.168.1.1+, then the resource agent will have
access to an environment variable +OCF_RESKEY_ip+ holding that value.

For any resource parameter that is not required to be set by the user
-- that is, its parameter definition in the resource agent metadata
does not specify +required="true"+ -- the resource agent must do one
of the following:

* Provide a reasonable default. This should be advertised in the
  metadata. By convention, the resource agent uses a variable named
  +OCF_RESKEY_<parametername>_default+ that holds this default.
* Alternatively, cater correctly for the value being empty.

In addition, the cluster manager may also support _meta_ resource
parameters. These do not apply directly to the resource configuration,
but rather specify _how_ the cluster resource manager is expected to manage
the resource. For example, the Pacemaker cluster manager uses the
+target-role+ meta parameter to specify whether the resource should be
started or stopped.

Meta parameters are passed into the resource agent in the
+OCF_RESKEY_CRM_meta_+ namespace, with any hyphens converted to
underscores. Thus, the +target-role+ attribute maps to an environment
variable named +OCF_RESKEY_CRM_meta_target_role+.

The <<_script_variables>> section contains other system environment
variables.

=== Actions

Any resource agent must support one command-line argument which
specifies the action the resource agent is about to execute.
The following actions must be supported by any resource agent:

* +start+ -- starts the resource.
* +stop+ -- shuts down the resource.
* +monitor+ -- queries the resource for its state.
* +meta-data+ -- dumps the resource agent metadata.

In addition, resource agents may optionally support the following
actions:

* +promote+ -- turns a resource into the +Master+ role (Master/Slave
  resources only).
* +demote+ -- turns a resource into the +Slave+ role (Master/Slave
  resources only).
* +migrate_to+ and +migrate_from+ -- implement live migration of
  resources.
* +validate-all+ -- validates a resource's configuration.
* +usage+ or +help+ -- displays a usage message when the resource
  agent is invoked from the command line, rather than by the cluster
  manager.
* +notify+ -- informs the resource about changes in the state of
  other clones.
* +status+ -- historical (deprecated) synonym for +monitor+.

=== Timeouts

Action timeouts are enforced outside the resource agent proper. It is
the cluster manager's responsibility to monitor how long a resource
agent action has been running, and terminate it if it does not meet
its completion deadline. Thus, resource agents need not themselves
check for any timeout expiry.

Resource agents can, however, _advise_ the user of sensible timeout
values (which, when correctly set, will be duly enforced by the
cluster manager). See <<_metadata,the following section>> for details
on how a resource agent advertises its suggested timeouts.

=== Metadata

Every resource agent must describe its own purpose and supported
parameters in a set of XML metadata. This metadata is used by cluster
management applications for on-line help, and resource agent man pages
are generated from it as well.
The following is a fictitious set of
metadata from an imaginary resource agent:

[source,xml]
--------------------------------------------------------------------------
<?xml version="1.0"?>
<!DOCTYPE resource-agent SYSTEM "ra-api-1.dtd">
<resource-agent name="foobar">
  <version>0.1</version>
  <longdesc lang="en">
This is a fictitious example resource agent written for the
OCF Resource Agent Developers Guide.
  </longdesc>
  <shortdesc lang="en">Example resource agent
  for budding OCF RA developers</shortdesc>
  <parameters>
    <parameter name="eggs" unique="0" required="1">
      <longdesc lang="en">
      Number of eggs, an example numeric parameter
      </longdesc>
      <shortdesc lang="en">Number of eggs</shortdesc>
      <content type="integer"/>
    </parameter>
    <parameter name="superfrobnicate" unique="0" required="0">
      <longdesc lang="en">
      Enable superfrobnication, an example boolean parameter
      </longdesc>
      <shortdesc lang="en">Enable superfrobnication</shortdesc>
      <content type="boolean" default="false"/>
    </parameter>
    <parameter name="datadir" unique="0" required="1">
      <longdesc lang="en">
      Data directory, an example string parameter
      </longdesc>
      <shortdesc lang="en">Data directory</shortdesc>
      <content type="string"/>
    </parameter>
  </parameters>
  <actions>
    <action name="start"        timeout="20" />
    <action name="stop"         timeout="20" />
    <action name="monitor"      timeout="20"
                                interval="10" depth="0" />
    <action name="notify"       timeout="20" />
    <action name="reload"       timeout="20" />
    <action name="migrate_to"   timeout="20" />
    <action name="migrate_from" timeout="20" />
    <action name="meta-data"    timeout="5" />
    <action name="validate-all" timeout="20" />
  </actions>
</resource-agent>
--------------------------------------------------------------------------

The +resource-agent+ element, of which there must only be one per
resource
agent, defines the resource agent +name+ and +version+.

The +longdesc+ and +shortdesc+ elements in +resource-agent+ provide a
long and short description of the resource agent's
functionality. While +shortdesc+ is a one-line description of what
the resource agent does and is usually used in terse listings,
+longdesc+ should give a full-blown description of the resource agent
in as much detail as possible.

The +parameters+ element describes the resource agent parameters, and
should hold any number of +parameter+ children -- one for each
parameter that the resource agent supports.

Every +parameter+ should, like the +resource-agent+ as a whole, come
with a +shortdesc+ and a +longdesc+, and also a +content+ child that
describes the parameter's expected content.

Besides its +name+, a +parameter+ and its +content+ child may carry
four different attributes:

* +type+ (on +content+) describes the parameter type (+string+,
  +integer+, or +boolean+). If unset, +type+ defaults to +string+.

* +required+ (on +parameter+) indicates whether setting the parameter
  is mandatory (+required="true"+) or optional (+required="false"+).
  The example above writes these values as +1+ and +0+.

* For optional parameters, it is customary to provide a sensible
  default via the +default+ attribute (on +content+).

* Finally, the +unique+ attribute (on +parameter+; allowed values:
  +true+ or +false+) indicates that a specific value must be unique
  across the cluster, for this parameter of this particular resource
  type. For example, a highly available floating IP address is
  declared +unique+ -- as that one IP address should run only once
  throughout the cluster, avoiding duplicates.

The +actions+ list defines the actions that the resource agent
advertises as supported.

Every +action+ should list its own +timeout+ value. This is a
hint to the user as to what _minimal_ timeout should be configured for
the action.
This is meant to cater for the fact that some resources are
quick to start and stop (IP addresses or filesystems, for example),
while others may take several minutes to do so (such as databases).

In addition, recurring actions (such as +monitor+) should also specify
a recommended minimum +interval+, which is the time between two
consecutive invocations of the same action. Like +timeout+, this value
does not constitute a default -- it is merely a hint to the user
as to which action interval to configure, at minimum.

== Return codes

For any invocation, resource agents must exit with a defined return
code that informs the caller of the outcome of the invoked
action. The return codes are explained in detail in the following
subsections.

=== +OCF_SUCCESS+ (0)

The action completed successfully. This is the expected return code
for any successful +start+, +stop+, +promote+, +demote+,
+migrate_from+, +migrate_to+, +meta-data+, +help+, and +usage+ action.

For +monitor+ (and its deprecated alias, +status+), however, a
modified convention applies:

* For primitive (stateless) resources, +OCF_SUCCESS+ from +monitor+
  means that the resource is running. Non-running and gracefully
  shut-down resources must instead return +OCF_NOT_RUNNING+.

* For master/slave (stateful) resources, +OCF_SUCCESS+ from +monitor+
  means that the resource is running _in Slave mode_. Resources
  running in Master mode must instead return +OCF_RUNNING_MASTER+, and
  gracefully shut-down resources must instead return
  +OCF_NOT_RUNNING+.

=== +OCF_ERR_GENERIC+ (1)

The action returned a generic error. A resource agent should use this
exit code only when none of the more specific error codes, defined
below, accurately describes the problem.

The cluster resource manager interprets this exit code as a _soft_
error.
This means that unless specifically configured otherwise, the
resource manager will attempt to recover a resource which failed with
+OCF_ERR_GENERIC+ in-place -- usually by restarting the resource on
the same node.

=== +OCF_ERR_ARGS+ (2)

The resource's configuration is not valid on this machine; for
example, it refers to a location not found on the node.

NOTE: The resource agent should not return this error when instructed
to perform an action that it does not support. Instead, under those
circumstances, it should return +OCF_ERR_UNIMPLEMENTED+.

=== +OCF_ERR_UNIMPLEMENTED+ (3)

The resource agent was instructed to execute an action that the agent
does not implement.

Not all resource agent actions are mandatory. +promote+, +demote+,
+migrate_to+, +migrate_from+, and +notify+ are all optional actions
which the resource agent may or may not implement. When a non-stateful
resource agent is misconfigured as a master/slave resource, for
example, then the resource agent should alert the user about this
misconfiguration by returning +OCF_ERR_UNIMPLEMENTED+ on the +promote+
and +demote+ actions.

=== +OCF_ERR_PERM+ (4)

The action failed due to insufficient permissions. This may be due to
the agent not being able to open a certain file, to listen on a
specific socket, to write to a directory, or similar.

The cluster resource manager interprets this exit code as a _hard_
error. This means that unless specifically configured otherwise, the
resource manager will attempt to recover a resource which failed with
this error by restarting the resource on a different node (where the
permission problem may not exist).

=== +OCF_ERR_INSTALLED+ (5)

The action failed because a required component is missing on the node
where the action was executed. This may be due to a required binary
not being executable, or a vital configuration file being unreadable.

The cluster resource manager interprets this exit code as a _hard_
error. This means that unless specifically configured otherwise, the
resource manager will attempt to recover a resource which failed with
this error by restarting the resource on a different node (where the
required files or binaries may be present).

=== +OCF_ERR_CONFIGURED+ (6)

The action failed because the user misconfigured the resource. For
example, the user may have configured an alphanumeric string for a
parameter that really should be an integer.

The cluster resource manager interprets this exit code as a _fatal_
error. Since this is a configuration error that is present
cluster-wide, it would make no sense to recover such a resource on a
different node, let alone in-place. When a resource fails with this
error, the cluster manager will attempt to shut down the resource, and
wait for administrator intervention.

=== +OCF_NOT_RUNNING+ (7)

The resource was found not to be running. This is an exit code that
may be returned by the +monitor+ action exclusively. Note that this
implies that the resource has either _gracefully_ shut down, or has
never been started.

If the resource is not running due to an error condition, the
+monitor+ action should instead return one of the +OCF_ERR_+ exit
codes or +OCF_FAILED_MASTER+.

=== +OCF_RUNNING_MASTER+ (8)

The resource was found to be running in the +Master+ role. This
applies only to stateful (Master/Slave) resources, and only to
their +monitor+ action.

Note that there is no specific exit code for "running in slave
mode". This is because there is no functional distinction between a
primitive resource running normally, and a stateful resource running
as a slave. The +monitor+ action of a stateful resource running
normally in the +Slave+ role should simply return +OCF_SUCCESS+.
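
To make this mapping concrete, the following sketch translates the
state of a stateful resource into the +monitor+ exit codes described
in this chapter. The helper name +foobar_state_to_rc+ and the state
strings are hypothetical, and the numeric codes are spelled out only
to keep the example standalone; a real agent would obtain them by
sourcing +ocf-shellfuncs+.

[source,bash]
--------------------------------------------------------------------------
# Exit codes as listed in this chapter, defined locally so that the
# sketch runs on its own (an assumption of this example).
OCF_SUCCESS=0
OCF_ERR_GENERIC=1
OCF_NOT_RUNNING=7
OCF_RUNNING_MASTER=8
OCF_FAILED_MASTER=9

# Hypothetical helper: map a state string to the exit code that the
# monitor action of a stateful resource should return.
foobar_state_to_rc() {
    case "$1" in
        slave)         return $OCF_SUCCESS ;;        # no dedicated "slave" code
        master)        return $OCF_RUNNING_MASTER ;;
        stopped)       return $OCF_NOT_RUNNING ;;    # graceful stop or never started
        failed-master) return $OCF_FAILED_MASTER ;;
        *)             return $OCF_ERR_GENERIC ;;    # any other failure
    esac
}

# Usage: look up the exit code for a resource running as Master
rc=0
foobar_state_to_rc master || rc=$?
echo "monitor exit code: $rc"
--------------------------------------------------------------------------

With these definitions, the usage lines print +monitor exit code: 8+.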

=== +OCF_FAILED_MASTER+ (9)

The resource was found to have failed in the +Master+ role. This
applies only to stateful (Master/Slave) resources, and only to their
+monitor+ action.

The cluster resource manager interprets this exit code as a _soft_
error. This means that unless specifically configured otherwise, the
resource manager will attempt to recover a resource which failed with
+OCF_FAILED_MASTER+ in-place -- usually by demoting, stopping,
starting and then promoting the resource on the same node.


== Resource agent structure

A typical (shell-based) resource agent contains standard structural
items, in the order listed in this section. The section also describes
the expected behavior of a resource agent with respect to the various
actions it supports, using a fictitious resource agent named +foobar+
as an example.

=== Resource agent interpreter

Any resource agent implemented as a script must specify its
interpreter using standard "shebang" (+#!+) header syntax.

[source,bash]
--------------------------------------------------------------------------
#!/bin/sh
--------------------------------------------------------------------------

If a resource agent is written in shell, specifying the generic shell
interpreter (+#!/bin/sh+) is generally preferred, though not
required. Resource agents declared as +/bin/sh+ compatible must not
use constructs native to a specific shell (such as, for example,
+${!variable}+ syntax native to +bash+). It is advisable to
occasionally run such resource agents through a sanitization utility
such as +checkbashisms+.

It is considered a regression to introduce a patch that will make a
previously +sh+ compatible resource agent suitable only for +bash+,
+ksh+, or any other non-generic shell.
It is, however, perfectly
acceptable for a new resource agent to explicitly define a specific
shell, such as +/bin/bash+, as its interpreter.

=== Author and license information

The resource agent should contain a comment listing the resource agent
author(s) and/or copyright holder(s), and stating the license that
applies to the resource agent:

[source,bash]
--------------------------------------------------------------------------
#
#   Resource Agent for managing foobar resources.
#
#   License:      GNU General Public License (GPL)
#   (c) 2008-2010 John Doe, Jane Roe,
#                 and Linux-HA contributors
--------------------------------------------------------------------------

When a resource agent refers to a license for which multiple versions
exist, it is assumed that the current version applies.

=== Initialization

Any shell resource agent should source the +ocf-shellfuncs+ function
library. With the syntax below, this is done in terms of
+$OCF_FUNCTIONS_DIR+, which -- for testing purposes, and also for
generating documentation -- may be overridden from the command line.

[source,bash]
--------------------------------------------------------------------------
# Initialization:
: ${OCF_FUNCTIONS_DIR=${OCF_ROOT}/lib/heartbeat}
. ${OCF_FUNCTIONS_DIR}/ocf-shellfuncs
--------------------------------------------------------------------------

=== Functions implementing resource agent actions

What follows next are the functions implementing the resource agent's
advertised actions. The individual actions are described in detail in
<<_resource_agent_actions>>.

=== Execution block

This is the part of the resource agent that actually executes when the
resource agent is invoked.
It typically follows a fairly standard
structure:

[source,bash]
--------------------------------------------------------------------------
# Make sure meta-data and usage always succeed
case $__OCF_ACTION in
meta-data)      foobar_meta_data
                exit $OCF_SUCCESS
                ;;
usage|help)     foobar_usage
                exit $OCF_SUCCESS
                ;;
esac

# Anything other than meta-data and usage must pass validation
foobar_validate_all || exit $?

# Translate each action into the appropriate function call
case $__OCF_ACTION in
start)          foobar_start;;
stop)           foobar_stop;;
status|monitor) foobar_monitor;;
promote)        foobar_promote;;
demote)         foobar_demote;;
notify)         foobar_notify;;
reload)         ocf_log info "Reloading..."
                foobar_start
                ;;
validate-all)   ;;
*)              foobar_usage
                exit $OCF_ERR_UNIMPLEMENTED
                ;;
esac
rc=$?

# The resource agent may optionally log a debug message
ocf_log debug "${OCF_RESOURCE_INSTANCE} $__OCF_ACTION returned $rc"
exit $rc
--------------------------------------------------------------------------


== Resource agent actions

Each action is typically implemented in a separate function or method
in the resource agent. By convention, these are usually named
+<agent>_<action>+, so the function implementing the +start+ action in
+foobar+ would be named +foobar_start()+.

As a general rule, whenever the resource agent encounters an error
that it is not able to recover from, it is permitted to immediately
exit, throw an exception, or otherwise cease execution. Examples of
this include configuration issues, missing binaries, permission
problems, etc. It is not necessary to pass these errors up the call
stack.

It is the cluster manager's responsibility to initiate the appropriate
recovery action based on the user's configuration. The resource agent
should not guess at said configuration.
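
The "exit immediately" rule can be sketched as follows. The helper
+foobar_require_binary+ and the variable +FOOBAR_BINARY+ are
hypothetical names introduced for this example, and the exit codes are
defined locally only so that the sketch runs without
+ocf-shellfuncs+. Because the helper calls +exit+ directly,
+foobar_start+ does not need to check or propagate a return value.

[source,bash]
--------------------------------------------------------------------------
# Defined locally for this sketch; a real agent sources ocf-shellfuncs.
OCF_SUCCESS=0
OCF_ERR_INSTALLED=5

# Hypothetical helper: cease execution immediately if a required
# binary is missing on this node.
foobar_require_binary() {
    if ! command -v "$1" >/dev/null 2>&1; then
        echo "ERROR: required binary '$1' not found" >&2
        exit $OCF_ERR_INSTALLED
    fi
}

foobar_start() {
    # No error handling needed here: on failure, the helper has
    # already terminated the agent with the appropriate exit code.
    foobar_require_binary "$FOOBAR_BINARY"

    # ... start the resource here ...
    return $OCF_SUCCESS
}

# Usage: run the action in a subshell so that its exit code can be
# inspected without terminating the calling shell.
rc=0
( FOOBAR_BINARY=no-such-frobnicate; foobar_start ) 2>/dev/null || rc=$?
echo "start exited with $rc"
--------------------------------------------------------------------------

On a node where no +no-such-frobnicate+ binary exists, the usage lines
report exit code 5, which the cluster manager would treat as described
for +OCF_ERR_INSTALLED+.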

=== +start+ action

When invoked with the +start+ action, the resource agent must start
the resource if it is not yet running. This means that the agent must
verify the resource's configuration, query its state, and then start
it only if it is not running. A common way of doing this would be to
invoke the +validate_all+ and +monitor+ functions first, as in the
following example:

[source,bash]
--------------------------------------------------------------------------
foobar_start() {
    # exit immediately if configuration is not valid
    foobar_validate_all || exit $?

    # if resource is already running, bail out early
    if foobar_monitor; then
        ocf_log info "Resource is already running"
        return $OCF_SUCCESS
    fi

    # actually start up the resource here (make sure to immediately
    # exit with an $OCF_ERR_ error code if anything goes seriously
    # wrong)
    ...

    # After the resource has been started, check whether it started up
    # correctly. If the resource starts asynchronously, the agent may
    # spin on the monitor function here -- if the resource does not
    # start up within the defined timeout, the cluster manager will
    # consider the start action failed
    while ! foobar_monitor; do
        ocf_log debug "Resource has not started yet, waiting"
        sleep 1
    done

    # only return $OCF_SUCCESS if _everything_ succeeded as expected
    return $OCF_SUCCESS
}
--------------------------------------------------------------------------


=== +stop+ action

When invoked with the +stop+ action, the resource agent must stop the
resource, if it is running. This means that the agent must verify the
resource configuration, query its state, and then stop it only if it
is currently running. A common way of doing this would be to invoke
the +validate_all+ and +monitor+ functions first.
It is important to
understand that +stop+ is a force operation -- the resource agent must
do everything in its power to shut down the resource, short of
rebooting the node or shutting it off. Consider the following example:

[source,bash]
--------------------------------------------------------------------------
foobar_stop() {
    local rc

    # exit immediately if configuration is not valid
    foobar_validate_all || exit $?

    foobar_monitor
    rc=$?
    case "$rc" in
        "$OCF_SUCCESS")
            # Currently running. Normal, expected behavior.
            ocf_log debug "Resource is currently running"
            ;;
        "$OCF_RUNNING_MASTER")
            # Running as a Master. Need to demote before stopping.
            ocf_log info "Resource is currently running as Master"
            foobar_demote || \
                ocf_log warn "Demote failed, trying to stop anyway"
            ;;
        "$OCF_NOT_RUNNING")
            # Currently not running. Nothing to do.
            ocf_log info "Resource is already stopped"
            return $OCF_SUCCESS
            ;;
    esac

    # actually shut down the resource here (make sure to immediately
    # exit with an $OCF_ERR_ error code if anything goes seriously
    # wrong)
    ...

    # After the resource has been stopped, check whether it shut down
    # correctly. If the resource stops asynchronously, the agent may
    # spin on the monitor function here -- if the resource does not
    # shut down within the defined timeout, the cluster manager will
    # consider the stop action failed
    while foobar_monitor; do
        ocf_log debug "Resource has not stopped yet, waiting"
        sleep 1
    done

    # only return $OCF_SUCCESS if _everything_ succeeded as expected
    return $OCF_SUCCESS

}
--------------------------------------------------------------------------

NOTE: The expected exit code for a successful stop operation is
+$OCF_SUCCESS+, _not_ +$OCF_NOT_RUNNING+.
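
As an illustration of what "everything in its power" can mean, the
sketch below escalates from +SIGTERM+ to +SIGKILL+ before it reports a
failed stop. It assumes, purely for the sake of the example, that the
resource is a single process whose PID is known. The function
+foobar_stop_pid+ and its grace-period argument are hypothetical, and
the exit codes are defined locally so that the sketch runs without
+ocf-shellfuncs+.

[source,bash]
--------------------------------------------------------------------------
# Defined locally for this sketch; a real agent sources ocf-shellfuncs.
OCF_SUCCESS=0
OCF_ERR_GENERIC=1

# Hypothetical helper: stop the process with PID $1, waiting up to
# $2 seconds (default 5) before escalating to SIGKILL.
foobar_stop_pid() {
    _pid=$1
    _grace=${2:-5}

    # Already gone: stopping a stopped resource is still a success
    kill -0 "$_pid" 2>/dev/null || return $OCF_SUCCESS

    # Ask politely first, then poll until the grace period runs out
    kill -TERM "$_pid" 2>/dev/null
    while [ "$_grace" -gt 0 ] && kill -0 "$_pid" 2>/dev/null; do
        sleep 1
        _grace=$((_grace - 1))
    done

    # Still alive: use force
    if kill -0 "$_pid" 2>/dev/null; then
        kill -KILL "$_pid" 2>/dev/null
        sleep 1
    fi

    # Report an error only once every avenue has been exhausted
    if kill -0 "$_pid" 2>/dev/null; then
        return $OCF_ERR_GENERIC
    fi
    return $OCF_SUCCESS
}

# Usage: stop a throwaway background process
sleep 60 &
rc=0
foobar_stop_pid $! 3 || rc=$?
echo "stop returned $rc"
--------------------------------------------------------------------------

A real agent would typically try the resource's own shutdown command
before resorting to signals; the escalation order shown here is only
one possible design.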

IMPORTANT: A failed stop operation is a potentially dangerous
situation which the cluster manager will almost invariably try to
resolve by means of node fencing. In other words, the cluster manager
will forcibly evict from the cluster a node on which a stop operation
has failed. While this measure serves ultimately to protect data, it
does cause disruption to applications and their users. Thus, a
resource agent should make sure that it exits with an error only if
all avenues for proper resource shutdown have been exhausted.

=== +monitor+ action

The +monitor+ action queries the current status of a resource. It must
discern between three different states:

* resource is currently running (return +$OCF_SUCCESS+);
* resource has stopped gracefully (return +$OCF_NOT_RUNNING+);
* resource has run into a problem and must be considered failed
  (return the appropriate +$OCF_ERR_+ code to indicate the nature of the
  problem).


[source,bash]
--------------------------------------------------------------------------
foobar_monitor() {
    local rc

    # exit immediately if configuration is not valid
    foobar_validate_all || exit $?

    ocf_run frobnicate --test

    # This example assumes the following exit code convention
    # for frobnicate:
    # 0: running, and fully caught up with master
    # 1: gracefully stopped
    # any other: error
    case "$?" in
        0)
            rc=$OCF_SUCCESS
            ocf_log debug "Resource is running"
            ;;
        1)
            rc=$OCF_NOT_RUNNING
            ocf_log debug "Resource is not running"
            ;;
        *)
            ocf_log err "Resource has failed"
            exit $OCF_ERR_GENERIC
            ;;
    esac

    return $rc
}
--------------------------------------------------------------------------

Stateful (master/slave) resource agents may use a more elaborate
monitoring scheme where they can provide "hints" to the cluster
manager identifying which instance is best suited to assume the
+Master+ role. <<_specifying_a_master_preference>> explains the
details.

NOTE: The cluster manager may invoke the +monitor+ action for a
_probe_, which is a test of whether the resource is currently
running. Normally, the monitor operation would behave exactly the same
during a probe and a "real" monitor action. If a specific resource
does require special treatment for probes, however, the +ocf_is_probe+
convenience function is available in the OCF shell functions library
for that purpose.

=== +validate-all+ action

The +validate-all+ action tests for correct resource agent
configuration and a working environment. +validate-all+ should exit
with one of the following return codes:

* +$OCF_SUCCESS+ -- all is well, the configuration is valid and
  usable.
* +$OCF_ERR_CONFIGURED+ -- the user has misconfigured the resource.
* +$OCF_ERR_INSTALLED+ -- the resource has possibly been configured
  correctly, but a vital component is missing on the node where
  +validate-all+ is being executed.
* +$OCF_ERR_PERM+ -- the resource is configured correctly and is not
  missing any required components, but is suffering from a permission
  issue (such as not being able to create a necessary file).

+validate-all+ is usually wrapped in a function that is not only
called when explicitly invoking the corresponding action, but also --
as a sanity check -- from just about any other function. Therefore,
the resource agent author must keep in mind that the function may be
invoked during the +start+, +stop+, and +monitor+ operations, and also
during probes.

Probes pose a separate challenge for validation. During a probe (when
the cluster manager may expect the resource _not_ to be running on the
node where the probe is executed), some required components may be
_expected_ to not be available on the affected node. For example, this
includes any shared data on storage devices not available for reading
during the probe. The +validate-all+ function may thus need to treat
probes specially, using the +ocf_is_probe+ convenience function:

[source,bash]
--------------------------------------------------------------------------
foobar_validate_all() {
    # Test for configuration errors first
    if ! ocf_is_decimal "$OCF_RESKEY_eggs"; then
        ocf_log err "eggs is not numeric!"
        exit $OCF_ERR_CONFIGURED
    fi

    # Test for required binaries
    check_binary frobnicate

    # Check for data directory (this may be on shared storage, so
    # disable this test during probes). The variable is quoted so
    # that an empty value does not make the test succeed.
    if ! ocf_is_probe; then
        if ! [ -d "$OCF_RESKEY_datadir" ]; then
            ocf_log err "$OCF_RESKEY_datadir does not exist or is not a directory!"
            exit $OCF_ERR_INSTALLED
        fi
    fi

    return $OCF_SUCCESS
}
--------------------------------------------------------------------------

=== +meta-data+ action

The +meta-data+ action dumps the resource agent metadata to standard
output. The output must follow the metadata format as specified in
<<_metadata>>.
733 734[source,bash] 735-------------------------------------------------------------------------- 736foobar_meta_data { 737 cat <<EOF 738<?xml version="1.0"?> 739<!DOCTYPE resource-agent SYSTEM "ra-api-1.dtd"> 740<resource-agent name="foobar"> 741 <version>0.1</version> 742 <longdesc lang="en"> 743... 744EOF 745} 746-------------------------------------------------------------------------- 747 748=== +promote+ action 749 750The +promote+ action is optional. It must only be supported by 751_stateful_ resource agents, which means agents that discern between 752two distinct _roles_: +Master+ and +Slave+. +Slave+ is functionally 753identical to the +Started+ state in a stateless resource agent. Thus, 754while a regular (stateless) resource agent only needs to implement 755+start+ and +stop+, a stateful resource agent must also support the 756+promote+ action to be able to make a transition between the +Started+ 757(+Slave+) and +Master+ roles. 758 759[source,bash] 760-------------------------------------------------------------------------- 761foobar_promote() { 762 local rc 763 764 # exit immediately if configuration is not valid 765 foobar_validate_all || exit $? 766 767 # test the resource's current state 768 foobar_monitor 769 rc=$? 770 case "$rc" in 771 "$OCF_SUCCESS") 772 # Running as slave. Normal, expected behavior. 773 ocf_log debug "Resource is currently running as Slave" 774 ;; 775 "$OCF_RUNNING_MASTER") 776 # Already a master. Unexpected, but not a problem. 777 ocf_log info "Resource is already running as Master" 778 return $OCF_SUCCESS 779 ;; 780 "$OCF_NOT_RUNNING") 781 # Currently not running. Need to start before promoting. 782 ocf_log info "Resource is currently not running" 783 foobar_start 784 ;; 785 *) 786 # Failed resource. Let the cluster manager recover. 
787 ocf_log err "Unexpected error, cannot promote" 788 exit $rc 789 ;; 790 esac 791 792 # actually promote the resource here (make sure to immediately 793 # exit with an $OCF_ERR_ error code if anything goes seriously 794 # wrong) 795 ocf_run frobnicate --master-mode || exit $OCF_ERR_GENERIC 796 797 # After the resource has been promoted, check whether the 798 # promotion worked. If the resource promotion is asynchronous, the 799 # agent may spin on the monitor function here -- if the resource 800 # does not assume the Master role within the defined timeout, the 801 # cluster manager will consider the promote action failed. 802 while true; do 803 foobar_monitor 804 if [ $? -eq $OCF_RUNNING_MASTER ]; then 805 ocf_log debug "Resource promoted" 806 break 807 else 808 ocf_log debug "Resource still awaiting promotion" 809 sleep 1 810 fi 811 done 812 813 # only return $OCF_SUCCESS if _everything_ succeeded as expected 814 return $OCF_SUCCESS 815} 816-------------------------------------------------------------------------- 817 818=== +demote+ action 819 820The +demote+ action is optional. It must only be supported by 821_stateful_ resource agents, which means agents that discern between 822two distict _roles_: +Master+ and +Slave+. +Slave+ is functionally 823identical to the +Started+ state in a stateless resource agent. Thus, 824while a regular (stateless) resource agent only needs to implement 825+start+ and +stop+, a stateful resource agent must also support the 826+demote+ action to be able to make a transition between the +Master+ 827and +Started+ (+Slave+) roles. 828 829[source,bash] 830-------------------------------------------------------------------------- 831foobar_demote() { 832 local rc 833 834 # exit immediately if configuration is not valid 835 foobar_validate_all || exit $? 836 837 # test the resource's current state 838 foobar_monitor 839 rc=$? 840 case "$rc" in 841 "$OCF_RUNNING_MASTER") 842 # Running as master. Normal, expected behavior. 
            ocf_log debug "Resource is currently running as Master"
            ;;
        "$OCF_SUCCESS")
            # Already running as slave. Nothing to do.
            ocf_log debug "Resource is currently running as Slave"
            return $OCF_SUCCESS
            ;;
        "$OCF_NOT_RUNNING")
            # Currently not running. Getting a demote action
            # in this state is unexpected. Exit with an error
            # and let the cluster manager recover.
            ocf_log err "Resource is currently not running"
            exit $OCF_ERR_GENERIC
            ;;
        *)
            # Failed resource. Let the cluster manager recover.
            ocf_log err "Unexpected error, cannot demote"
            exit $rc
            ;;
    esac

    # actually demote the resource here (make sure to immediately
    # exit with an $OCF_ERR_ error code if anything goes seriously
    # wrong)
    ocf_run frobnicate --unset-master-mode || exit $OCF_ERR_GENERIC

    # After the resource has been demoted, check whether the
    # demotion worked. If the resource demotion is asynchronous, the
    # agent may spin on the monitor function here -- if the resource
    # does not assume the Slave role within the defined timeout, the
    # cluster manager will consider the demote action failed.
    while true; do
        foobar_monitor
        if [ $? -eq $OCF_RUNNING_MASTER ]; then
            ocf_log debug "Resource still demoting"
            sleep 1
        else
            ocf_log debug "Resource demoted"
            break
        fi
    done

    # only return $OCF_SUCCESS if _everything_ succeeded as expected
    return $OCF_SUCCESS
}
--------------------------------------------------------------------------

=== +migrate_to+ action

The +migrate_to+ action can serve one of two purposes:

* Initiate a native _push_ type migration for the resource. In other
  words, instruct the resource to move _to_ a specific node from the
  node it is currently running on. The resource agent knows about its
  destination node via the +$OCF_RESKEY_CRM_meta_migrate_target+ environment
  variable.

* Freeze the resource in a _freeze/thaw_ (also known as
  _suspend/resume_) type migration. In this mode, the resource does
  not need any information about its destination node at this point.

The example below illustrates a push type migration:

[source,bash]
--------------------------------------------------------------------------
foobar_migrate_to() {
    # exit immediately if configuration is not valid
    foobar_validate_all || exit $?

    # if resource is not running, bail out early
    if ! foobar_monitor; then
        ocf_log err "Resource is not running"
        exit $OCF_ERR_GENERIC
    fi

    # actually migrate the resource here (make sure to immediately
    # exit with an $OCF_ERR_ error code if anything goes seriously
    # wrong)
    ocf_run frobnicate --migrate \
                       --dest=$OCF_RESKEY_CRM_meta_migrate_target \
                       || exit $OCF_ERR_GENERIC
    ...

    # only return $OCF_SUCCESS if _everything_ succeeded as expected
    return $OCF_SUCCESS
}
--------------------------------------------------------------------------

In contrast, a freeze/thaw type migration may implement its freeze
operation like this:

[source,bash]
--------------------------------------------------------------------------
foobar_migrate_to() {
    # exit immediately if configuration is not valid
    foobar_validate_all || exit $?

    # if resource is not running, bail out early
    if ! foobar_monitor; then
        ocf_log err "Resource is not running"
        exit $OCF_ERR_GENERIC
    fi

    # actually freeze the resource here (make sure to immediately
    # exit with an $OCF_ERR_ error code if anything goes seriously
    # wrong)
    ocf_run frobnicate --freeze || exit $OCF_ERR_GENERIC
    ...

    # only return $OCF_SUCCESS if _everything_ succeeded as expected
    return $OCF_SUCCESS
}
--------------------------------------------------------------------------


=== +migrate_from+ action

The +migrate_from+ action can serve one of two purposes:

* Complete a native _push_ type migration for the resource. In other
  words, check whether the migration has succeeded properly, and the
  resource is running on the local node. The resource agent knows
  about the migration source via the
  +$OCF_RESKEY_CRM_meta_migrate_source+ environment variable.

* Thaw the resource in a _freeze/thaw_ (also known as
  _suspend/resume_) type migration. In this mode, the resource usually
  does not need any information about its source node at this point.

The example below illustrates a push type migration:

[source,bash]
--------------------------------------------------------------------------
foobar_migrate_from() {
    # exit immediately if configuration is not valid
    foobar_validate_all || exit $?

    # After the resource has been migrated, check whether it resumed
    # correctly. If the resource starts asynchronously, the agent may
    # spin on the monitor function here -- if the resource does not
    # run within the defined timeout, the cluster manager will
    # consider the migrate_from action failed
    while ! \
        foobar_monitor; do
        ocf_log debug "Resource has not yet migrated, waiting"
        sleep 1
    done

    # only return $OCF_SUCCESS if _everything_ succeeded as expected
    return $OCF_SUCCESS
}
--------------------------------------------------------------------------

In contrast, a freeze/thaw type migration may implement its thaw
operation like this:

[source,bash]
--------------------------------------------------------------------------
foobar_migrate_from() {
    # exit immediately if configuration is not valid
    foobar_validate_all || exit $?

    # actually thaw the resource here (make sure to immediately
    # exit with an $OCF_ERR_ error code if anything goes seriously
    # wrong)
    ocf_run frobnicate --thaw || exit $OCF_ERR_GENERIC

    # After the resource has been migrated, check whether it resumed
    # correctly. If the resource starts asynchronously, the agent may
    # spin on the monitor function here -- if the resource does not
    # run within the defined timeout, the cluster manager will
    # consider the migrate_from action failed
    while ! foobar_monitor; do
        ocf_log debug "Resource has not yet migrated, waiting"
        sleep 1
    done

    # only return $OCF_SUCCESS if _everything_ succeeded as expected
    return $OCF_SUCCESS
}
--------------------------------------------------------------------------


=== +notify+ action

With notifications, instances of clones (and of master/slave
resources, which are an extended kind of clone) can inform each other
about their state. When notifications are enabled, certain actions on
any instance of a clone carry a +pre+ and a +post+ notification.

List of actions that trigger notifications:

* start
* stop
* promote
* demote

The cluster manager invokes the +notify+ operation on _all_ clone
instances.
For +notify+ operations, additional environment variables
are passed into the resource agent during execution:

* +$OCF_RESKEY_CRM_meta_notify_type+ -- the notification type (+pre+
  or +post+)

* +$OCF_RESKEY_CRM_meta_notify_operation+ -- the operation (action)
  that the notification is about (+start+, +stop+, +promote+, +demote+
  etc.)

* +$OCF_RESKEY_CRM_meta_notify_start_uname+ -- node name of the node
  where the resource is being started (+start+ notifications only)

* +$OCF_RESKEY_CRM_meta_notify_stop_uname+ -- node name of the node
  where the resource is being stopped (+stop+ notifications only)

* +$OCF_RESKEY_CRM_meta_notify_master_uname+ -- node name of the node
  where the resource currently _is in_ the Master role

* +$OCF_RESKEY_CRM_meta_notify_promote_uname+ -- node name of the node
  where the resource currently _is being promoted to_ the Master role
  (+promote+ notifications only)

* +$OCF_RESKEY_CRM_meta_notify_demote_uname+ -- node name of the node
  where the resource currently _is being demoted to_ the Slave role
  (+demote+ notifications only)

Notifications come in particularly handy for master/slave resources
using a "pull" scheme, where the master is a publisher and the slave a
subscriber. Since the master is obviously only available as such when
a promotion has occurred, the slaves can use a "pre-promote"
notification to configure themselves to subscribe to the right
publisher.

Likewise, the subscribers may want to unsubscribe from the publisher
after it has relinquished its master status, and a "post-demote"
notification can be used for that purpose.

Consider the example below to illustrate the concept.

[source,bash]
--------------------------------------------------------------------------
foobar_notify() {
    local type_op
    type_op="${OCF_RESKEY_CRM_meta_notify_type}-${OCF_RESKEY_CRM_meta_notify_operation}"

    ocf_log debug "Received $type_op notification."
    case "$type_op" in
        'pre-promote')
            ocf_run frobnicate --slave-mode \
                               --master=$OCF_RESKEY_CRM_meta_notify_promote_uname \
                               || exit $OCF_ERR_GENERIC
            ;;
        'post-demote')
            ocf_run frobnicate --unset-slave-mode || exit $OCF_ERR_GENERIC
            ;;
    esac

    return $OCF_SUCCESS
}
--------------------------------------------------------------------------

NOTE: A master/slave resource agent may support a _multi-master_
configuration, where there is possibly more than one master at any
given time. If that is the case, then the
+$OCF_RESKEY_CRM_meta_notify_*_uname+ variables may each contain a
space-separated list of host names, rather than a single host name as
shown in the example. Under those circumstances the resource agent
would have to properly iterate over this list.

== Script variables

This section outlines variables typically available to resource agents,
primarily for convenience purposes. For additional variables
available while the agent is being executed, refer to
<<_environment_variables>> and <<_return_codes>>.

=== +$OCF_RA_VERSION_MAJOR+

The major version number of the resource agent API that the cluster
manager is currently using.

=== +$OCF_RA_VERSION_MINOR+

The minor version number of the resource agent API that the cluster
manager is currently using.

=== +$OCF_ROOT+

The root of the OCF resource agent hierarchy. This should never be
changed by a resource agent. This is usually +/usr/lib/ocf+.
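
As a minimal sketch of how an agent might read this variable without
ever modifying it, the fragment below derives the location of the shell
function library. The helper name +foobar_functions_dir+ is
hypothetical, and the +lib/heartbeat+ subdirectory and the fallback
path are assumptions for illustration only. A value already present in
+$OCF_FUNCTIONS_DIR+ (for example, one supplied on the command line
during testing) takes precedence.

```shell
# Hypothetical helper: print the directory expected to hold
# ocf-shellfuncs. Neither OCF_ROOT nor OCF_FUNCTIONS_DIR is modified;
# the fallback path and the lib/heartbeat layout are assumptions.
foobar_functions_dir() {
    echo "${OCF_FUNCTIONS_DIR:-${OCF_ROOT:-/usr/lib/ocf}/lib/heartbeat}"
}

echo "Function library expected in: $(foobar_functions_dir)"
```

Keeping the lookup in a function that only reads the environment
honors the rule that the agent never changes +$OCF_ROOT+ itself.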

=== +$OCF_FUNCTIONS_DIR+

The directory where the resource agents' shell function library,
+ocf-shellfuncs+, resides. This is usually defined in terms of
+$OCF_ROOT+ and should never be changed by a resource agent. This
variable may, however, be overridden from the command line while
testing a new or modified resource agent.

=== +$OCF_EXIT_REASON_PREFIX+

Used as a prefix when printing error messages from the resource agent.
Script functions use this automatically, so no explicit use is required
for shell based scripts.

=== +$OCF_RESOURCE_INSTANCE+

The resource instance name. For primitive (non-clone, non-stateful)
resources, this is simply the resource name. For clones and stateful
resources, this is the primitive name, followed by a colon and the
clone instance number (such as +p_foobar:0+).

=== +$OCF_RESOURCE_TYPE+

The resource type of the current resource, e.g. IPaddr2.

=== +$OCF_RESOURCE_PROVIDER+

The resource provider, e.g. heartbeat. This variable may not be set by
all cluster managers implementing Resource Agent API version 1.0.

=== +$__OCF_ACTION+

The currently invoked action. This is exactly the first command-line
argument that the cluster manager specifies when it invokes the
resource agent.

=== +$__SCRIPT_NAME+

The name of the resource agent. This is exactly the base name of the
resource agent script, with leading directory names removed.

=== +$HA_RSCTMP+

A temporary directory for use by resource agents. The system startup
sequence (on any LSB compliant Linux distribution) guarantees that
this directory is emptied on system startup, so this directory will
not contain any stale data after a node reboot.

== Convenience functions

=== Logging: +ocf_log+

Resource agents should use the +ocf_log+ function for logging
purposes.
This convenient logging wrapper is invoked as follows:

[source,bash]
--------------------------------------------------------------------------
ocf_log <severity> "Log message"
--------------------------------------------------------------------------

It supports the following severity levels:

* +debug+ -- for debugging messages. Most logging configurations
  suppress this level by default.
* +info+ -- for informational messages about the agent's behavior or
  status.
* +warn+ -- for warnings. This is for any messages which reflect
  unexpected behavior that does _not_ constitute an unrecoverable
  error.
* +err+ -- for errors. As a general rule, this logging level should
  only be used immediately prior to an +exit+ with the appropriate
  error code.
* +crit+ -- for critical errors. As with +err+, this logging level
  should not be used unless the resource agent also exits with an
  error code. Very rarely used.

=== Testing for binaries: +have_binary+ and +check_binary+

A resource agent may need to test for the availability of a specific
executable. The +have_binary+ convenience function comes in handy
here:

[source,bash]
--------------------------------------------------------------------------
if ! have_binary frobnicate; then
   ocf_log warn "Missing frobnicate binary, frobnication disabled!"
fi
--------------------------------------------------------------------------

If a missing binary is a fatal problem for the resource, then the
+check_binary+ function should be used:

[source,bash]
--------------------------------------------------------------------------
check_binary frobnicate
--------------------------------------------------------------------------

Using +check_binary+ is a shorthand method for testing for the
existence (and executability) of the specified binary, and exiting
with +$OCF_ERR_INSTALLED+ if it cannot be found or executed.

NOTE: Both +have_binary+ and +check_binary+ honor +$PATH+ when the
binary to test for is not specified as a full path. It is usually wise
to _not_ test for a full path, as binary installation paths may vary
by distribution or user policy.

=== Executing commands and capturing their output: +ocf_run+

Whenever a resource agent needs to execute a command and capture its
output, it should use the +ocf_run+ convenience function, invoked as
in this example:

[source,bash]
--------------------------------------------------------------------------
ocf_run frobnicate --spam=eggs || exit $OCF_ERR_GENERIC
--------------------------------------------------------------------------

With the command specified above, the resource agent will invoke
+frobnicate --spam=eggs+ and capture its output and
exit code. If the exit code is nonzero (indicating an error),
+ocf_run+ logs the command output with the +err+ logging severity, and
the resource agent subsequently exits. If the exit code is zero
(indicating success), any command output will be logged with the +info+
logging severity.

If the resource agent wishes to ignore the output of a successful
command execution, it can use the +-q+ flag with +ocf_run+.
In the
example below, +ocf_run+ will only log output if the command exit code
is nonzero.

[source,bash]
--------------------------------------------------------------------------
ocf_run -q frobnicate --spam=eggs || exit $OCF_ERR_GENERIC
--------------------------------------------------------------------------

Finally, if the resource agent wants to log the output of a command
with a nonzero exit code with a severity _other_ than error, it may do
so by adding the +-info+ or +-warn+ option to +ocf_run+:

[source,bash]
--------------------------------------------------------------------------
ocf_run -warn frobnicate --spam=eggs
--------------------------------------------------------------------------

=== Locks: +ocf_take_lock+ and +ocf_release_lock_on_exit+

Occasionally, there may be different resources of the same type in a
cluster configuration that should not execute actions in
parallel. When a resource agent needs to guard against parallel
execution on the same machine, it can use the +ocf_take_lock+ and
+ocf_release_lock_on_exit+ convenience functions:

[source,bash]
--------------------------------------------------------------------------
LOCKFILE=${HA_RSCTMP}/foobar
ocf_release_lock_on_exit $LOCKFILE

foobar_start() {
    ...
    ocf_take_lock $LOCKFILE
    ...
}
--------------------------------------------------------------------------

+ocf_take_lock+ attempts to acquire the designated +$LOCKFILE+. When
it is unavailable, it sleeps a random amount of time between 0 and 1
seconds, and retries. +ocf_release_lock_on_exit+ releases the lock
file when the agent exits (for any reason).

=== Testing for numerical values: +ocf_is_decimal+

Specifically for parameter validation, it can be helpful to test
whether a given value is numeric.
The +ocf_is_decimal+ function exists
for that purpose:

[source,bash]
--------------------------------------------------------------------------
foobar_validate_all() {
    if ! ocf_is_decimal $OCF_RESKEY_eggs; then
        ocf_log err "eggs is not numeric!"
        exit $OCF_ERR_CONFIGURED
    fi
    ...
}
--------------------------------------------------------------------------

=== Testing for boolean values: +ocf_is_true+

When a resource agent defines a boolean parameter, the value
for this parameter may be specified by the user as +0+/+1+,
+true+/+false+, or +on+/+off+. Since it is tedious to test for all
these values from within the resource agent, the agent should instead
use the +ocf_is_true+ convenience function:

[source,bash]
--------------------------------------------------------------------------
if ocf_is_true $OCF_RESKEY_superfrobnicate; then
    ocf_run frobnicate --super
fi
--------------------------------------------------------------------------

NOTE: If +ocf_is_true+ is used against an empty or non-existent
variable, it always returns an exit code of +1+, which is equivalent
to +false+.

=== Version comparison: +ocf_version_cmp+

A resource agent may want to check the version of software
installed. +ocf_version_cmp+ takes care of all the necessary
details.

The return codes are

* +0+ -- the first version is smaller (earlier) than the second
* +1+ -- the two versions are equal
* +2+ -- the first version is greater (later) than the second
* +3+ -- one of the arguments is not recognized as a version string

The versions are allowed to contain digits, dots, and dashes.

[source,bash]
--------------------------------------------------------------------------
local v=`gooey --version`
ocf_version_cmp "$v" 12.0.8-1
case $?
in
  0) ocf_log err "we do not support version $v, it is too old"
     exit $OCF_ERR_INSTALLED
  ;;
  [12]) ;; # we can work with versions >= 12.0.8-1
  3) ocf_log err "gooey produced version <$v>, too funky for me"
     exit $OCF_ERR_INSTALLED
  ;;
esac
--------------------------------------------------------------------------

=== Pseudo resources: +ha_pseudo_resource+

"Pseudo resources" are those where the resource agent in fact does not
actually start or stop something akin to a runnable process, but
merely executes a single action and then needs some form of tracking
whether that action has been executed or not. The +portblock+ resource
agent is an example of this.

Resource agents for pseudo resources can use a convenience function,
+ha_pseudo_resource+, which makes use of _tracking files_ to keep tabs
on the status of a resource. If +foobar+ was designed to manage a
pseudo resource, then its +start+ action could look like this:

[source,bash]
--------------------------------------------------------------------------
foobar_start() {
    # exit immediately if configuration is not valid
    foobar_validate_all || exit $?

    # if resource is already running, bail out early
    if foobar_monitor; then
        ocf_log info "Resource is already running"
        return $OCF_SUCCESS
    fi

    # start the pseudo resource
    ha_pseudo_resource ${OCF_RESOURCE_INSTANCE} start

    # After the resource has been started, check whether it started up
    # correctly. If the resource starts asynchronously, the agent may
    # spin on the monitor function here -- if the resource does not
    # start up within the defined timeout, the cluster manager will
    # consider the start action failed
    while ! \
        foobar_monitor; do
        ocf_log debug "Resource has not started yet, waiting"
        sleep 1
    done

    # only return $OCF_SUCCESS if _everything_ succeeded as expected
    return $OCF_SUCCESS
}
--------------------------------------------------------------------------


== Conventions

This section contains a collection of conventions that have emerged in
the resource agent repositories over the years. Following these
conventions is by no means mandatory for resource agent authors, but
it is a good idea based on the
http://en.wikipedia.org/wiki/Principle_of_least_surprise[Principle of
Least Surprise] -- resource agents following these conventions will be
easier to understand, review, and use than those that do not.

=== Well-known parameter names

Several parameter names are supported by a number of resource
agents. For new resource agents, following these examples is generally
a good idea:

* +binary+ -- the name of a binary that principally manages the
  resource, such as a server daemon
* +config+ -- the full path to a configuration file
* +pid+ -- the full path to a file holding a process ID (PID)
* +log+ -- the full path to a log file
* +socket+ -- the full path to a UNIX socket that the resource manages
* +ip+ -- an IP address that a daemon binds to
* +port+ -- a TCP or UDP port that a daemon binds to

Needless to say, resource agents should only implement any of these
parameters if they are sensible to use in the agent's context.
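
When an agent adopts one of these names, validating the value early
keeps configuration errors obvious. The following is a minimal sketch
for the +port+ parameter. The helper name +foobar_check_port+ is
hypothetical and not part of the shell function library, and the check
relies on plain shell tests only.

```shell
# Hypothetical helper, not part of ocf-shellfuncs: succeed only if the
# argument is a whole number within the valid TCP/UDP port range.
foobar_check_port() {
    case "$1" in
        ''|*[!0-9]*) return 1 ;;   # empty, or contains a non-digit
    esac
    [ "$1" -ge 1 ] && [ "$1" -le 65535 ]
}

if foobar_check_port "${OCF_RESKEY_port:-}"; then
    echo "port looks valid"
else
    echo "port is missing or out of range"
fi
```

In a real agent, a failed check of this kind would typically be logged
with +ocf_log err+ and followed by an exit with
+$OCF_ERR_CONFIGURED+, in the same way as the +ocf_is_decimal+ example
earlier in this guide.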

=== Parameter defaults

Defaults for resource agent parameters should be set by initializing
variables with the suffix +_default+:

[source,bash]
--------------------------------------------------------------------------
# Defaults
OCF_RESKEY_superfrobnicate_default=0

: ${OCF_RESKEY_superfrobnicate=${OCF_RESKEY_superfrobnicate_default}}
--------------------------------------------------------------------------

NOTE: The resource agent should make sure that it sets a default for
any parameter not marked as +required+ in the metadata.


=== Honoring +PATH+ for binaries

When a resource agent supports a parameter designed to hold the name
of a binary (such as a daemon, or a client utility for querying
status), then that parameter should honor the +PATH+ environment
variable. Do not supply full paths. Thus, the following approach:

[source,bash]
--------------------------------------------------------------------------
# Good example -- do it this way
OCF_RESKEY_frobnicate_default="frobnicate"
: ${OCF_RESKEY_frobnicate="${OCF_RESKEY_frobnicate_default}"}
--------------------------------------------------------------------------

is much preferred over specifying a full path, as shown here:

[source,bash]
--------------------------------------------------------------------------
# Bad example -- avoid if you can
OCF_RESKEY_frobnicate_default="/usr/local/sbin/frobnicate"
: ${OCF_RESKEY_frobnicate="${OCF_RESKEY_frobnicate_default}"}
--------------------------------------------------------------------------

This rule holds for defaults, as well.



== Special considerations

=== Licensing

Whenever possible, resource agent contributors are _encouraged_ to use
the GNU General Public License (GPL), version 2 and later, for any new
resource agents.
The shell functions library does not strictly mandate
this, however, as it is licensed under the GNU Lesser General Public
License (LGPL), version 2.1 and later (so it can be used by non-GPL
agents).

The resource agent _must_ explicitly state its own license in the
agent source code.


=== Locale settings

When sourcing +ocf-shellfuncs+ as explained in <<_initialization>>,
any resource agent automatically sets +LANG+ and +LC_ALL+ to the +C+
locale. Resource agents can thus expect to always operate in the +C+
locale, and need not reset +LANG+ or any of the +LC_+ environment
variables themselves.


=== Testing for running processes

For testing whether a particular process (with a known process ID) is
currently running, a frequently found method is to send it a +0+
signal and catch errors, similar to this example:

[source,bash]
--------------------------------------------------------------------------
if kill -s 0 `cat $daemon_pid_file`; then
    ocf_log debug "Process is currently running"
else
    ocf_log warn "Process is dead, removing pid file"
    rm -f $daemon_pid_file
fi
--------------------------------------------------------------------------

IMPORTANT: An approach far superior to this example is to instead test
the _functionality_ of the daemon by connecting to it with a client
process, as shown in the example in
<<_literal_monitor_literal_action>>.


=== Specifying a master preference

Stateful (master/slave) resources must set their own _master
preference_ -- they can thus provide hints to the cluster manager
which is the best instance to promote to the +Master+ role.

IMPORTANT: It is acceptable for multiple instances to have identical
positive master preferences. In that case, the cluster resource
manager will automatically select a resource agent to
promote.
However, if _all_ instances have the (default) master score
of zero, the cluster manager will not promote any instance at
all. Thus, it is crucial that at least one instance has a positive
master score.

For this purpose, +crm_master+ comes in handy. This convenience
wrapper around +crm_attribute+ sets a node attribute named
+master-<<_literal_ocf_resource_instance_literal,$OCF_RESOURCE_INSTANCE>>+
for the node it is being executed on, and fills this attribute with
the specified value. The cluster manager is then expected to translate
this into a promotion score for the corresponding instance, and base
its promotion preference on that score.

Stateful resource agents typically execute +crm_master+ during the
<<_literal_monitor_literal_action,+monitor+>> and/or
<<_literal_notify_literal_action,+notify+>> action.

The following example assumes that the +foobar+ resource agent can
test the application's status by executing a binary that returns
certain exit codes based on whether

* the resource is either in the master role, or is a slave that is
  fully caught up with the master (at any rate, it has current data),
  or
* the resource is in the slave role, but through some form of
  asynchronous replication has "fallen behind" the master, or
* the resource has gracefully stopped, or
* the resource has unexpectedly failed.

[source,bash]
--------------------------------------------------------------------------
foobar_monitor() {
    local rc

    # exit immediately if configuration is not valid
    foobar_validate_all || exit $?

    ocf_run frobnicate --test

    # This example assumes the following exit code convention
    # for frobnicate:
    # 0: running, and fully caught up with master
    # 1: gracefully stopped
    # 2: running, but lagging behind master
    # any other: error
    case "$?"
    in
        0)
            rc=$OCF_SUCCESS
            ocf_log debug "Resource is running"
            # Set a high master preference. The current master
            # will always get this, plus 1. Any current slaves
            # will get a high preference so that if the master
            # fails, they are next in line to take over.
            crm_master -l reboot -v 100
            ;;
        1)
            rc=$OCF_NOT_RUNNING
            ocf_log debug "Resource is not running"
            # Remove the master preference for this node
            crm_master -l reboot -D
            ;;
        2)
            rc=$OCF_SUCCESS
            ocf_log debug "Resource is lagging behind master"
            # Set a low master preference: if the master fails
            # right now, and there is another slave that does
            # not lag behind the master, its higher master
            # preference will win and that slave will become
            # the new master
            crm_master -l reboot -v 5
            ;;
        *)
            ocf_log err "Resource has failed"
            exit $OCF_ERR_GENERIC
    esac

    return $rc
}
--------------------------------------------------------------------------


== Testing resource agents

This section discusses automated testing for resource agents. Testing
is a vital aspect of development; it is crucial both for creating new
resource agents, and for modifying existing ones.


=== Testing with +ocf-tester+

The resource agents repository (and hence, any installed resource
agents package) contains a utility named +ocf-tester+. This shell
script allows you to conveniently and easily test the functionality of
your resource agent.

+ocf-tester+ is commonly invoked, as +root+, like this:

--------------------------------------------------------------------------
ocf-tester -n <name> [-o <param>=<value> ... ] <resource agent>
--------------------------------------------------------------------------

* +<name>+ is an arbitrary resource name.

* You may set any number of +<param>=<value>+ with the +-o+ option,
  corresponding to any resource parameters you wish to set for
  testing.

* +<resource agent>+ is the full path to your resource agent.

When invoked, +ocf-tester+ executes all mandatory actions and enforces
action behavior as explained in <<_resource_agent_actions>>.

It also tests for optional actions. Optional actions must behave as
expected when advertised, but do not cause +ocf-tester+ to flag an
error if not implemented.

IMPORTANT: +ocf-tester+ does not initiate "dry runs" of actions, nor
does it create resource dummies of any kind. Instead, it exercises the
actual resource agent as-is, whether that may include opening and
closing databases, mounting file systems, starting or stopping virtual
machines, etc. Use with care.

For example, you could run +ocf-tester+ on the +foobar+ resource agent
as follows:

--------------------------------------------------------------------------
# ocf-tester -n foobartest \
             -o superfrobnicate=true \
             -o datadir=/tmp \
             /home/johndoe/ra-dev/foobar
Beginning tests for /home/johndoe/ra-dev/foobar...
* Your agent does not support the notify action (optional)
* Your agent does not support the reload action (optional)
/home/johndoe/ra-dev/foobar passed all tests
--------------------------------------------------------------------------

If the resource agent exhibits behavior that is difficult to
understand, which is typically the case with newly developed software,
the +-v+ and +-d+ options produce more output. If that does not
help, instruct +ocf-tester+ to trace the resource agent with
+-X+ (make sure to redirect output to a file, unless you are a
really fast reader).

=== Testing with +ocft+

+ocft+ is a testing tool for resource agents.
The main difference
from +ocf-tester+ is that +ocft+ can automate creating complex
testing environments. That includes package installation and
arbitrary shell scripting.

==== +ocft+ components

+ocft+ consists of the following components:

* A test case generator (+/usr/sbin/ocft+) -- generates shell
  scripts from test case configuration files

* Configuration files (+/usr/share/resource-agents/ocft/configs/+) --
  a configuration file contains environment setup and test cases
  for one resource agent

* Generated test scripts (+/var/lib/resource-agents/ocft/cases/+) --
  normally there is no need to inspect them

==== Customizing the testing environment

+ocft+ modifies the runtime environment of the resource agent
either by changing environment variables (through the interface
defined by OCF) or by running ad-hoc shell scripts, which can, for
instance, change the permissions of a file or unmount a file system.

==== How to test

You need to know the software (resource) you want to test. Draw a
sketch of all interesting scenarios, with all expected and
unexpected conditions and how the resource agent should react to
them. Then you need to encode these conditions and the expected
outcomes as +ocft+ test cases. Running +ocft+ is then simple:

---------------------------------------
# ocft make <RA>
# ocft test <RA>
---------------------------------------

The first subcommand generates the scripts for your test cases,
whereas the second runs them and checks the outcome.

==== +ocft+ configuration file syntax

There are four top level options, each of which can contain
one or more sub-options.

===== +CONFIG+ (top level option)

This option is global and influences every test case.
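
As a rough orientation before the individual options are described,
a complete configuration file might look like the sketch below. It
uses only options documented in this section. The provider
directory, the +datadir+ parameter, the timeout value, and the use of
the symbolic names +OCF_SUCCESS+ and +OCF_NOT_RUNNING+ as expected
return values are illustrative assumptions, not taken from a shipped
configuration.

---------------------------------------
# Hypothetical test configuration for the foobar agent
CONFIG
    AgentRoot /usr/lib/ocf/resource.d/fortytwo
    HangTimeout 20

# Macro shared by all test cases (parameter name is an assumption)
CASE-BLOCK required_args
    Var OCF_RESKEY_datadir=/tmp/foobar-test

CASE "start, monitor and stop with valid parameters"
    Include required_args
    RunAgent start OCF_SUCCESS
    RunAgent monitor OCF_SUCCESS
    RunAgent stop OCF_SUCCESS

CASE "monitor reports a stopped resource"
    Include required_args
    RunAgent stop OCF_SUCCESS
    RunAgent monitor OCF_NOT_RUNNING
---------------------------------------

Putting the shared parameters into a +CASE-BLOCK+ keeps each +CASE+
short and makes it obvious which condition a given case varies.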

 ** +AgentRoot+ (sub-option)
---------------------------------------
AgentRoot /usr/lib/ocf/resource.d/xxx
---------------------------------------

Normally, we assume that the resource agent lives under the
+heartbeat+ provider. Use +AgentRoot+ to test an agent that is
distributed by another vendor.

 ** +InstallPackage+ (sub-option)
---------------------------------------
InstallPackage package [package2 [...]]
---------------------------------------

Install packages necessary for testing. The installation is
skipped if the packages have already been installed.

 ** +HangTimeout+ (sub-option)
---------------------------------------
HangTimeout secs
---------------------------------------

The maximum time allowed for a single RA action. If this timer
expires, the action is considered failed.

===== +SETUP-AGENT+ (top level option)
---------------------------------------
SETUP-AGENT
  bash commands
---------------------------------------

If the RA needs to be initialized before testing, you can put
bash code here for that purpose. The initialization is done only
once. If you need to reinitialize, delete the
+/tmp/.[AGENT_NAME]_set+ stamp file.

===== +CASE+ (top level option)
---------------------------------------
CASE "description"
---------------------------------------

This is the main building block of the test suite. Each test
case is described by one +CASE+ top level option.

One case consists of several suboptions, typically followed by the
+RunAgent+ suboption.

 ** +Var+ (sub-option)
---------------------------------------
Var VARIABLE=value
---------------------------------------

Sets an environment variable for the resource agent. Such variables
usually have the form +OCF_RESKEY_xxx+.
Note that there must
be no blanks on either side of the "=" sign.

 ** +Unvar+ (sub-option)
---------------------------------------
Unvar VARIABLE [VARIABLE2 [...]]
---------------------------------------

Removes the given environment variables.

 ** +Include+ (sub-option)
---------------------------------------
Include macro_name
---------------------------------------

Includes the statements defined in +macro_name+. See the description
of +CASE-BLOCK+ below.

** +Bash+ (sub-option)
---------------------------------------
Bash bash_codes
---------------------------------------

This option sets up the operating system environment: you can insert
arbitrary bash code to customize the system. Take care not to cause
unrecoverable damage to the system.

** +BashAtExit+ (sub-option)
---------------------------------------
BashAtExit bash_codes
---------------------------------------

This option restores the operating system environment so that the
next test case can run correctly. Of course, you can also use the
+Bash+ option for this purpose. However, if an error occurs while the
test case runs, the script quits immediately without running your
recovery code. Code given with +BashAtExit+, by contrast, runs before
the script quits, so use it to restore the system environment in
that situation.

** +RunAgent+ (sub-option)
---------------------------------------
RunAgent cmd [ret_value]
---------------------------------------

This option runs the resource agent. +cmd+ is the action passed to
the resource agent, such as +start+, +status+, or +stop+. The second
parameter is optional. If it is given, the value actually returned by
the resource agent is compared with this expected value; a mismatch
indicates a bug.

It is also possible to execute a suboption on a remote host
instead of locally.
The protocol used is ssh and the command is
run in the background. Just add the +@<ipaddr>+ suffix to the
suboption name. For instance:

---------------------------------------
Bash@192.168.1.100 date
---------------------------------------

would run the +date+ program on the host +192.168.1.100+.

NB: It is not clear how ssh can be automated, since the environment
is not known in advance. Perhaps "well-known" host names such as
"node2" could be used? Also, if the command runs in the background,
it is not clear how its exit code is checked. Finally, does
+Var@node+ make sense, or is the current environment somehow copied
over? An example is probably needed here.

Need examples in general.

===== +CASE-BLOCK+ (top level option)
---------------------------------------
CASE-BLOCK macro_name
---------------------------------------

The +CASE-BLOCK+ option defines a macro which can be +Include+d
in any +CASE+. All +CASE+ suboptions are valid in +CASE-BLOCK+.


== Installing and packaging resource agents

This section discusses what to do with your resource agent once it is
done and tested -- where to install it, and how to include it in either
your own application package or in the Linux-HA resource agents
repository.

=== Installing resource agents

If you choose to include your resource agent in your own project, make
sure it installs into the correct location. Resource agents should
install into the +/usr/lib/ocf/resource.d/<provider>+ directory, where
+<provider>+ is the name of your project or any other name you wish to
identify the resource agent with.

For example, if your +foobar+ resource agent is being packaged as part
of a project named +fortytwo+, then the correct full path to your
resource agent would be
+/usr/lib/ocf/resource.d/fortytwo/foobar+.
Make sure your resource
agent installs with +0755+ (+-rwxr-xr-x+) permission bits.

When installed this way, OCF-compliant cluster resource managers will
be able to properly identify, parse, and execute your resource
agent. The Pacemaker cluster manager, for example, would map the
above-mentioned installation path to the +ocf:fortytwo:foobar+
resource type identifier.

=== Packaging resource agents

When you package resource agents as part of your own project, you
should apply the considerations outlined in this section.

NOTE: If you instead prefer to submit your resource agent to the
Linux-HA resource agents repository, see
<<_submitting_resource_agents>> for information on doing so.

==== RPM packaging

It is recommended to put your OCF resource agent(s) in an RPM
sub-package, with the name +<toppackage>-resource-agents+. Ensure that
the package owns its provider directory, and depends on the upstream
+resource-agents+ package which lays out the directory hierarchy and
provides convenience shell functions. An example RPM spec snippet is
given below:

--------------------------------------------------------------------------
%package resource-agents
Summary: OCF resource agent for Foobar
Group: System Environment/Base
Requires: %{name} = %{version}-%{release}, resource-agents

%description resource-agents
This package contains the OCF-compliant resource agents for Foobar.

%files resource-agents
%defattr(755,root,root,-)
%dir %{_prefix}/lib/ocf/resource.d/fortytwo
%{_prefix}/lib/ocf/resource.d/fortytwo/foobar
--------------------------------------------------------------------------

NOTE: If an RPM spec file contains a +%package+ declaration, then RPM
considers this a sub-package which inherits top-level fields such as
+Name+, +Version+, +License+, etc.
Sub-packages have the top-level
package name automatically prepended to their own name. Thus the snippet
above would create a sub-package named +foobar-resource-agents+
(presuming the package +Name+ is +foobar+).

==== Debian packaging

For Debian packages, like for <<_rpm_packaging,RPMs>>, it is
recommended to create a separate package holding your resource agents,
which then should depend on the +cluster-agents+ package.

NOTE: This section assumes that you are packaging with +debhelper+.

An example +debian/control+ snippet is given below:

--------------------------------------------------------------------------
Package: fortytwo-cluster-agents
Priority: extra
Architecture: all
Depends: cluster-agents
Description: OCF-compliant resource agents for Foobar
--------------------------------------------------------------------------

You will also create a separate +.install+ file, named after the
binary package declared in +debian/control+. Sticking with the
example of installing the +foobar+ resource agent as a sub-package of
+fortytwo+, the +debian/fortytwo-cluster-agents.install+ file could
consist of the following content:

--------------------------------------------------------------------------
usr/lib/ocf/resource.d/fortytwo/foobar
--------------------------------------------------------------------------

=== Submitting resource agents

If you choose not to bundle your resource agent with your own package,
but instead wish to submit it to the upstream resource agent
repository hosted on
https://github.com/ClusterLabs/resource-agents[the ClusterLabs
repository on GitHub], please follow the steps outlined in this section.

Create a fork of the
https://github.com/ClusterLabs/resource-agents[upstream repository] and
clone it with the following commands:

--------------------------------------------------------------------------
git clone https://github.com/<your-username>/resource-agents.git
cd resource-agents
git remote add upstream git@github.com:ClusterLabs/resource-agents.git
git checkout -b <new-branch>
--------------------------------------------------------------------------

Then, copy your resource agent into the +heartbeat+ subdirectory:
--------------------------------------------------------------------------
cd heartbeat
cp /path/to/your/local/copy/of/foobar .
chmod 0755 foobar
cd ..
--------------------------------------------------------------------------

Next, modify the +Makefile.am+ file in +resource-agents/heartbeat+ and
add your new resource agent to the +ocf_SCRIPTS+ list. This will make
sure the agent is properly installed.

Lastly, open +Makefile.am+ in +resource-agents/doc/man+ and add
+ocf_heartbeat_<name>.7+ to the +man_MANS+ variable. This will
automatically generate a resource agent manual page from its metadata,
and then install that man page into the correct location.

Now, add your new resource agent and the two modified Makefiles to
your changeset:

--------------------------------------------------------------------------
git add heartbeat/foobar
git add heartbeat/Makefile.am
git add doc/man/Makefile.am
git commit
--------------------------------------------------------------------------

In your commit message, be sure to include a meaningful description,
for example:
--------------------------------------------------------------------------
High: foobar: new resource agent

This new resource agent adds functionality to manage a foobar service.
It supports being configured as a primitive or as a master/slave set,
and also optionally supports superfrobnication.
--------------------------------------------------------------------------

Now push your new branch to your fork on GitHub:
--------------------------------------------------------------------------
git push -u origin <new-branch>
--------------------------------------------------------------------------

Create a Pull Request (PR) on GitHub that will be reviewed by the
upstream developers.

Once your new resource agent has been accepted for merging, one of the
upstream developers will merge the Pull Request into the upstream
repository. At that point, you can update your master branch from
upstream, and remove your own branch.

--------------------------------------------------------------------------
git checkout master
git fetch upstream
git merge upstream/master
git branch -D <branch>
--------------------------------------------------------------------------

=== Maintaining resource agents

If you maintain a specific resource agent, or you are making repeated
contributions to the codebase, it's usually a good idea to maintain
your own _fork_ of the +ClusterLabs/resource-agents+ repository on
GitHub.

To do so,

* https://github.com/signup[Create a GitHub account] if you do not
  have one already.
* http://help.github.com/fork-a-repo/[Fork] the
  https://github.com/ClusterLabs/resource-agents[+resource-agents+
  repository].
* Clone your personal fork into a local working copy.

As you work on resource agents, *please* commit early, and commit
often. You can always fold commits later with +git rebase -i+.

Once you have made a number of changes that you would like others to
review, push them to your GitHub fork and send a post to the
+linux-ha-dev+ mailing list pointing people to it.

After the review is done, fix up your tree with any requested changes,
and then issue a pull request. There are two ways of doing so:

* You can use the +git request-pull+ utility to get a pre-populated
  email skeleton summarizing your changesets. Add any information you
  see fit, and send it to the list. It is a good idea to prefix your
  email subject with +[GIT PULL]+ so upstream maintainers can pick the
  message out easily.

* You can also issue a pull request directly on GitHub. GitHub
  automatically notifies upstream maintainers about new pull requests
  by email. Please refer to
  http://help.github.com/send-pull-requests/[github:help] for details
  on initiating pull requests.
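
The first of these two options can be sketched as the following
command sequence. It assumes that your fork is the +origin+ remote,
that the ClusterLabs repository is configured as a remote named
+upstream+, and that upstream development happens on +master+. The
branch and user names are placeholders.

--------------------------------------------------------------------------
# Fold fix-up commits and replay the branch on top of current upstream
git fetch upstream
git rebase -i upstream/master

# Publish the cleaned-up branch to your fork; rewritten history
# requires a forced push
git push --force-with-lease origin <branch>

# Print an email skeleton summarizing the commits since upstream/master
git request-pull upstream/master \
    https://github.com/<your-username>/resource-agents <branch>
--------------------------------------------------------------------------

+git request-pull+ only prints the summary to standard output; you
still paste it into an email and send it to the list yourself.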