.. _testing_strategies:

Testing Strategies
==================

.. _testing_intro:

Integrating Testing With Ansible Playbooks
``````````````````````````````````````````

Many times, people ask, "how can I best integrate testing with Ansible playbooks?"  There are many options.  Ansible is designed
to be a "fail-fast" and ordered system, which makes it easy to embed testing directly in Ansible playbooks.  In this chapter,
we'll go into some patterns for integrating tests of infrastructure and discuss the right level of testing that may be appropriate.

.. note:: This is a chapter about testing the application you are deploying, not the chapter on how to test Ansible modules during development.  For that content, please hop over to the Development section.

By incorporating a degree of testing into your deployment workflow, there will be fewer surprises when code hits production and, in many cases,
tests can be leveraged in production to prevent failed updates from migrating across an entire installation.  Since Ansible is push-based, it's
also very easy to run the steps on the localhost or testing servers.  Ansible lets you insert as many checks and balances into your upgrade workflow as you would like to have.

The Right Level of Testing
``````````````````````````

Ansible resources are models of desired state.  As such, it should not be necessary to test that services are started, packages are
installed, or other such things.  Ansible is the system that will ensure these things are declaratively true.  Instead, declare these
things in your playbooks.

.. code-block:: yaml

   tasks:
     - service:
         name: foo
         state: started
         enabled: yes

If you think the service may not be started, the best thing to do is request it to be started.  If the service fails to start, Ansible
will yell appropriately.  (This should not be confused with whether the service is doing something functional; later sections show how to
check that.)

.. _check_mode_drift:

Check Mode As A Drift Test
``````````````````````````

In the above setup, ``--check`` mode in Ansible can be used as a layer of testing as well.  If running a deployment playbook against an
existing system, passing the ``--check`` flag to the ``ansible-playbook`` command will report whether Ansible would have had to make any changes to
bring the system into the desired state.

This can let you know up front if there is any need to deploy onto the given system.  Ordinarily, scripts and commands don't run in check mode, so if you
want certain steps to always execute in check mode, such as calls to the script module, disable check mode for those tasks::

   roles:
     - webserver

   tasks:
     - script: verify.sh
       check_mode: no

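A minimal sketch of a drift check built on this idea follows.  The playbook name ``drift.yml``, the ``webservers`` group, and the
``verify_config`` script path are assumptions made for illustration; ``--diff`` is optional and only adds detail to the report.

.. code-block:: yaml

   # Hypothetical drift-check play.  Run it without changing anything with:
   #   ansible-playbook drift.yml --check --diff
   - hosts: webservers            # assumed inventory group
     roles:
       - webserver
     tasks:
       - name: Run a read-only verification even in check mode
         command: /usr/local/bin/verify_config   # hypothetical script
         check_mode: no
         changed_when: false      # a read-only check should not count as drift

Any task that reports "changed" in such a run points at drift between the playbook and the host.
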
Modules That Are Useful for Testing
```````````````````````````````````

Certain playbook modules are particularly good for testing.  Below is an example that ensures a port is open::

   tasks:

     - wait_for:
         host: "{{ inventory_hostname }}"
         port: 22
       delegate_to: localhost

Here's an example of using the URI module to make sure a web service returns the expected content::

   tasks:

     - uri:
         url: http://www.example.com
         return_content: yes
       register: webpage

     - fail:
         msg: 'service is not happy'
       when: "'AWESOME' not in webpage.content"

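When a service needs a moment to come up after a deploy, the same kind of check can be retried.  The following is a sketch only: the ``/health``
path is a hypothetical endpoint, and the retry count and delay are arbitrary values.

.. code-block:: yaml

   tasks:

     - name: Poll the health endpoint until it answers with HTTP 200
       uri:
         url: http://www.example.com/health   # hypothetical endpoint
         status_code: 200
       register: health
       until: health is succeeded
       retries: 5      # arbitrary; tune for your service
       delay: 3        # seconds between attempts

If the endpoint never returns 200 within the allotted attempts, the task fails and the play stops for that host.
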
It's easy to push an arbitrary script (in any language) to a remote host, and the task will automatically fail if the script exits with a non-zero return code::

   tasks:

     - script: test_script1
     - script: test_script2 --parameter value --parameter2 value

If using roles (you should be, roles are great!), scripts pushed by the script module can live in the 'files/' directory of a role.

And the assert module makes it very easy to validate various kinds of truth::

   tasks:

      - shell: /usr/bin/some-command --parameter value
        register: cmd_result

      - assert:
          that:
            - "'not ready' not in cmd_result.stderr"
            - "'gizmo enabled' in cmd_result.stdout"

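Where a failed assertion should explain itself, a message can be attached.  This sketch reuses ``cmd_result`` from the example above and assumes an
Ansible release recent enough to support the ``fail_msg`` option of the assert module; the wording of the message is illustrative.

.. code-block:: yaml

   tasks:

     - assert:
         that:
           - "'gizmo enabled' in cmd_result.stdout"
         # Illustrative message; shown only when the assertion fails
         fail_msg: "gizmo is not enabled on {{ inventory_hostname }}"

The message appears in the task output for the failing host.
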
Should you feel the need to test for existence of files that are not declaratively set by your Ansible configuration, the 'stat' module is a great choice::

   tasks:

      - stat:
          path: /path/to/something
        register: p

      - assert:
          that:
            - p.stat.exists and p.stat.isdir

As mentioned above, there's no need to check things like the return codes of commands.  Ansible checks them automatically.
Rather than checking for a user to exist, consider using the user module to make it exist.

Ansible is a fail-fast system, so when there is an error creating that user, it will stop the playbook run.  You do not have
to check up behind it.

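For instance, instead of a task that searches ``/etc/passwd``, a declarative task along these lines states the desired result.  The account name
``deploy`` is a placeholder.

.. code-block:: yaml

   tasks:

     - name: Ensure the application account exists
       user:
         name: deploy      # placeholder account name
         state: present

If the account cannot be created, the task fails and the play stops for that host, which is the check.
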
Testing Lifecycle
`````````````````

If you write some degree of basic validation of your application into your playbooks, it will run every time you deploy.

As such, deploying into a local development VM and a staging environment will both validate that things are going according to plan
ahead of your production deploy.

Your workflow may be something like this::

    - Use the same playbook all the time with embedded tests in development
    - Use the playbook to deploy to a staging environment (with the same playbooks) that simulates production
    - Run an integration test battery written by your QA team against staging
    - Deploy to production, with the same integrated tests.

Something like an integration test battery should be written by your QA team if you run a production web service.  This would include
things like Selenium tests or automated API tests and would usually not be something embedded into your Ansible playbooks.

However, it does make sense to include some basic health checks in your playbooks, and in some cases it may be possible to run
a subset of the QA battery against remote nodes.  This is what the next section covers.

Integrating Testing With Rolling Updates
````````````````````````````````````````

If you have read :ref:`playbooks_delegation`, it may quickly become apparent that the rolling update pattern can be extended, and you
can use the success or failure of the playbook run to decide whether to add a machine into a load balancer or not.

This is the great culmination of embedded tests::

    ---

    - hosts: webservers
      serial: 5

      pre_tasks:

        - name: take out of load balancer pool
          command: /usr/bin/take_out_of_pool {{ inventory_hostname }}
          delegate_to: 127.0.0.1

      roles:

         - common
         - webserver
         - apply_testing_checks

      post_tasks:

        - name: add back to load balancer pool
          command: /usr/bin/add_back_to_pool {{ inventory_hostname }}
          delegate_to: 127.0.0.1

Of course in the above, the "take out of the pool" and "add back" steps would be replaced with a call to an Ansible load balancer
module or appropriate shell command.  You might also have steps that use a monitoring module to start and end an outage window
for the machine.

However, what you can see from the above is that tests are used as a gate -- if the "apply_testing_checks" step does not succeed,
the machine will not go back into the pool.

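As an illustration only, the tasks of a role such as "apply_testing_checks" might look like the sketch below.  The file path, port 80,
the ``/status`` URL, and the expected ``OK`` body are all assumptions; substitute whatever signals health for your application.

.. code-block:: yaml

   # roles/apply_testing_checks/tasks/main.yml (hypothetical contents)

   - name: Check that the web server port accepts connections
     wait_for:
       host: "{{ inventory_hostname }}"
       port: 80          # assumed application port
       timeout: 30
     delegate_to: localhost

   - name: Fetch the application status page
     uri:
       url: "http://{{ inventory_hostname }}/status"   # hypothetical URL
       return_content: yes
     register: status_page
     delegate_to: localhost

   - name: Fail the host if the application does not report healthy
     assert:
       that:
         - "'OK' in status_page.content"   # assumed response body

Because any failing task stops the play for that host, the post_tasks that return the machine to the pool never run for a host that fails these checks.
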
Read the delegation chapter about "max_fail_percentage" and you can also control what share of failing hosts will stop a rolling update
from proceeding.

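As a sketch, the play header from the example above could carry the setting like this.  The threshold of 20 is an arbitrary value chosen for illustration.

.. code-block:: yaml

   - hosts: webservers
     serial: 5
     max_fail_percentage: 20   # illustrative threshold, not a recommendation

     roles:
       - common
       - webserver
       - apply_testing_checks

With ``serial: 5``, a single failed host (20%) does not exceed the threshold, while two failed hosts (40%) abort the rollout, because the percentage has to be exceeded rather than equaled.
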
The above approach can also be modified to run a step from a testing machine remotely against a machine::

    ---

    - hosts: webservers
      serial: 5

      pre_tasks:

        - name: take out of load balancer pool
          command: /usr/bin/take_out_of_pool {{ inventory_hostname }}
          delegate_to: 127.0.0.1

      roles:

         - common
         - webserver

      tasks:
         - script: /srv/qa_team/app_testing_script.sh --server {{ inventory_hostname }}
           delegate_to: testing_server

      post_tasks:

        - name: add back to load balancer pool
          command: /usr/bin/add_back_to_pool {{ inventory_hostname }}
          delegate_to: 127.0.0.1

In the above example, a script is run from the testing server against a remote node prior to bringing it back into
the pool.

In the event of a problem, fix the few servers that fail by using Ansible's retry file (generated when retry files are enabled)
to repeat the deploy on just those servers.

Achieving Continuous Deployment
```````````````````````````````

If desired, the above techniques may be extended to enable continuous deployment practices.

The workflow may look like this::

    - Write and use automation to deploy local development VMs
    - Have a CI system like Jenkins deploy to a staging environment on every code change
    - The deploy job calls testing scripts to pass/fail a build on every deploy
    - If the deploy job succeeds, it runs the same deploy playbook against production inventory

Some Ansible users use the above approach to deploy a half-dozen or dozen times an hour without taking all of their infrastructure
offline.  A culture of automated QA is vital if you wish to get to this level.

If you are still doing a large amount of manual QA, the decision on whether to deploy should still be made manually as well, but
it can still help to work in the rolling update patterns of the previous section and incorporate some basic health checks using
modules like 'script', 'stat', 'uri', and 'assert'.

Conclusion
``````````

Ansible believes you should not need another framework to validate that basic things about your infrastructure are true.  This is the case
because Ansible is an order-based system that will fail immediately on unhandled errors for a host, and prevent further configuration
of that host.  This forces errors to the top and shows them in a summary at the end of the Ansible run.

However, because Ansible is designed as a multi-tier orchestration system, it is very easy to incorporate tests into the end of
a playbook run, either using loose tasks or roles.  When used with rolling updates, testing steps can decide whether to put a machine
back into a load balanced pool or not.

Finally, because Ansible errors propagate all the way up to the return code of the Ansible program itself, and Ansible by default
runs in an easy push-based mode, Ansible is a great step to put into a build environment if you wish to use it to roll out systems
as part of a Continuous Integration/Continuous Delivery pipeline, as is covered in sections above.

The focus should not be on infrastructure testing, but on application testing, so we strongly encourage getting together with your
QA team and asking what sort of tests would make sense to run every time you deploy development VMs, and which sort of tests they would like
to run against the staging environment on every deploy.  Obviously at the development stage, unit tests are great too.  But don't unit
test your playbook.  Ansible describes states of resources declaratively, so you don't have to.  If there are cases where you want
to be sure of something though, that's great, and things like stat/assert are great go-to modules for that purpose.

In all, testing is a very organizational and site-specific thing.  Everybody should be doing it, and although what makes the most sense for your
environment will vary with what you are deploying and who is using it, everyone benefits from a more robust and reliable deployment
system.

.. seealso::

   :ref:`all_modules`
       All the documentation for Ansible modules
   :ref:`working_with_playbooks`
       An introduction to playbooks
   :ref:`playbooks_delegation`
       Delegation, useful for working with load balancers, clouds, and locally executed steps.
   `User Mailing List <https://groups.google.com/group/ansible-project>`_
       Have a question?  Stop by the Google group!
   `irc.libera.chat <https://libera.chat/>`_
       #ansible IRC chat channel