Configuring Betamax
===================

By now you've seen examples where we pass a great many keyword arguments to
:meth:`~betamax.Betamax.use_cassette`. You have also seen that we used
:meth:`betamax.Betamax.configure`. In this section, we'll describe the
different approaches in depth and explain why you might pick one over the
other.

Global Configuration
--------------------

Admittedly, I am not too proud of my decision to borrow this design from
`VCR`_, but I did and I use it and it isn't entirely terrible. (Note: I do
hope to come up with an elegant way to redesign it for v1.0.0 but that's a
long way off.)

The best way to configure Betamax globally is by using
:meth:`betamax.Betamax.configure`. This returns a
:class:`betamax.configure.Configuration` instance. This instance can be used
as a context manager in order to make the usage look more like `VCR`_'s way of
configuring the library. For example, in `VCR`_, you might do

.. code-block:: ruby

    VCR.configure do |config|
      config.cassette_library_dir = 'examples/cassettes'
      config.default_cassette_options[:record] = :none
      # ...
    end

Whereas with Betamax you might do

.. code-block:: python

    from betamax import Betamax

    with Betamax.configure() as config:
        config.cassette_library_dir = 'examples/cassettes'
        config.default_cassette_options['record_mode'] = 'none'

Alternatively, since the object returned is really just an object and does not
do anything special as a context manager, you could just as easily do

.. code-block:: python

    from betamax import Betamax

    config = Betamax.configure()
    config.cassette_library_dir = 'examples/cassettes'
    config.default_cassette_options['record_mode'] = 'none'

We'll now move on to specific use-cases when configuring Betamax. We'll
exclude the portion of each example where we create a
:class:`~betamax.configure.Configuration` instance.

Setting the Directory in which Betamax Should Store Cassette Files
``````````````````````````````````````````````````````````````````

Each and every time we use Betamax we need to tell it where to store (and
discover) cassette files. By default we do this by setting the
``cassette_library_dir`` attribute on our ``config`` object, e.g.,

.. code-block:: python

    config.cassette_library_dir = 'tests/integration/cassettes'

Note that these paths are relative to what Python thinks is the current
working directory. Wherever you run your tests from, write the path to be
relative to that directory.

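Because the path is resolved against the current working directory, running
the tests from a different directory can make Betamax look in the wrong place.
One way to sidestep this, sketched here as an assumption rather than anything
Betamax requires, is to build an absolute path from the location of the test
file. The helper name ``cassette_dir`` and the ``cassettes`` subdirectory are
hypothetical.

.. code-block:: python

    import os


    def cassette_dir(test_file, subdir='cassettes'):
        """Return an absolute cassette directory next to ``test_file``."""
        # abspath() resolves the file against the current working directory
        # once, so the result no longer depends on where later code runs.
        here = os.path.dirname(os.path.abspath(test_file))
        return os.path.join(here, subdir)

In a test module you could then write
``config.cassette_library_dir = cassette_dir(__file__)``.
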
Setting Default Cassette Options
````````````````````````````````

Cassettes have default options used by Betamax if none are set. For example,

- The default record mode is ``once``.

- The default matchers used are ``method`` and ``uri``.

- Cassettes do **not** preserve the exact body bytes by default.

These can all be configured as you please. For example, if you want to change
the default matchers and preserve exact body bytes, you would do

.. code-block:: python

    config.default_cassette_options['match_requests_on'] = [
        'method',
        'uri',
        'headers',
    ]
    config.preserve_exact_body_bytes = True

Filtering Sensitive Data
````````````````````````

It's unlikely that you'll want to record an interaction that will not require
authentication. For this we can define placeholders in our cassettes. Let's
use a very real example.

Let's say that you want to get your user data from GitHub using Requests. You
might have code that looks like this:

.. code-block:: python

    def me(username, password, session):
        r = session.get('https://api.github.com/user', auth=(username, password))
        r.raise_for_status()
        return r.json()

You might test it like this:

.. code-block:: python

    import os

    import betamax
    import requests

    from my_module import me

    session = requests.Session()
    recorder = betamax.Betamax(session)
    username = os.environ.get('USERNAME', 'testuser')
    password = os.environ.get('PASSWORD', 'testpassword')

    with recorder.use_cassette('test-me'):
        json = me(username, password, session)
        # assertions about the JSON returned

The problem is that your username and password will now be recorded in the
cassette, which you won't want to push to version control. How can we prevent
that from happening?

.. code-block:: python

    import base64

    username = os.environ.get('USERNAME', 'testuser')
    password = os.environ.get('PASSWORD', 'testpassword')
    config.define_cassette_placeholder(
        '<GITHUB-AUTH>',
        # b64encode() returns bytes on Python 3; decode it so the
        # placeholder value is text.
        base64.b64encode(
            '{0}:{1}'.format(username, password).encode('utf-8')
        ).decode('utf-8')
    )

.. note::

    Obviously you can refactor this a bit so you can pull those environment
    variables out in only one place, but I'd rather be clear than not here.

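The placeholder value is Base64-encoded because HTTP Basic authentication
sends the credentials in the ``Authorization`` header as ``Basic`` followed by
the Base64 encoding of ``username:password``, and that encoded string is what
ends up in the cassette. The following sketch uses only the standard library
to show the value being replaced; the helper name ``basic_auth_token`` is
hypothetical.

.. code-block:: python

    import base64


    def basic_auth_token(username, password):
        """Return the Base64 text that Basic auth places in the header."""
        raw = '{0}:{1}'.format(username, password).encode('utf-8')
        # b64encode() returns bytes; decode it so it can be joined with text.
        return base64.b64encode(raw).decode('ascii')


    token = basic_auth_token('testuser', 'testpassword')
    # The header value recorded for the request has this shape.
    header_value = 'Basic ' + token
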
The first time you run the test script you would invoke your tests like so:

.. code-block:: sh

    $ USERNAME='my-real-username' PASSWORD='supersecretep@55w0rd' \
      python test_script.py

Future runs of the script could simply be run without those environment
variables, e.g.,

.. code-block:: sh

    $ python test_script.py

This means that you can run these tests on a service like Travis-CI without
providing credentials.

In the event that you cannot anticipate what you will need to filter out,
version 0.7.0 of Betamax adds ``before_record`` and ``before_playback`` hooks.
Both hooks pass the
:class:`~betamax.cassette.interaction.Interaction` and
:class:`~betamax.cassette.cassette.Cassette` to the function provided. An
example callback would look like:

.. code-block:: python

    def hook(interaction, cassette):
        pass

You would then register this callback:

.. code-block:: python

    # Either
    config.before_record(callback=hook)
    # Or
    config.before_playback(callback=hook)

You can register callables for both hooks. If you wish to ignore an
interaction and prevent it from being recorded or replayed, you can call
:meth:`~betamax.cassette.interaction.Interaction.ignore`. You also have full
access to all of the methods and attributes on an instance of an Interaction.
This will allow you to inspect the response produced by the interaction and
then modify it.

Let's say, for example, that you are talking to an API that grants
authorization tokens on a specific request. In this example, you might
authenticate initially using a username and password and then use a token
after authenticating. You want the token, however, to be kept secret. In
that case you might configure Betamax to replace the username and password,
e.g.,

.. code-block:: python

    config.define_cassette_placeholder('<USERNAME>', username)
    config.define_cassette_placeholder('<PASSWORD>', password)

And you would also write a function that, prior to recording, finds the token,
saves it, and obscures it from the recorded version of the cassette:

.. code-block:: python

    from betamax.cassette import cassette


    def sanitize_token(interaction, current_cassette):
        # Exit early if the request did not return 200 OK because that's the
        # only time we want to look for Authorization-Token headers
        if interaction.data['response']['status']['code'] != 200:
            return

        headers = interaction.data['response']['headers']
        token = headers.get('Authorization-Token')
        # If there was no token header in the response, exit
        if token is None:
            return

        # Otherwise, create a new placeholder so that when the cassette is
        # saved, Betamax will replace the token with our placeholder.
        current_cassette.placeholders.append(
            cassette.Placeholder(placeholder='<AUTH_TOKEN>', replace=token)
        )

This will dynamically create a placeholder for that cassette only. Once we
have our hook, we merely need to register it like so:

.. code-block:: python

    config.before_record(callback=sanitize_token)

And we no longer need to worry about leaking sensitive data.

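The ``ignore`` method mentioned earlier can be used from a hook in the same
way. The sketch below skips any interaction whose response is a server error,
so that a transient failure is not written to the cassette. The choice of 5xx
responses and the name ``ignore_server_errors`` are illustrative assumptions,
and ``FakeInteraction`` is only a stand-in so the hook can be exercised
without making a request; in real use Betamax passes its own ``Interaction``
object.

.. code-block:: python

    def ignore_server_errors(interaction, current_cassette):
        """Skip interactions whose response status is 500 or greater."""
        status = interaction.data['response']['status']['code']
        if status >= 500:
            interaction.ignore()


    class FakeInteraction(object):
        """Hypothetical stand-in exposing only what the hook touches."""

        def __init__(self, status_code):
            self.data = {'response': {'status': {'code': status_code}}}
            self.ignored = False

        def ignore(self):
            self.ignored = True


    failed = FakeInteraction(503)
    ignore_server_errors(failed, current_cassette=None)

With a real configuration object, the hook would be registered with
``config.before_record(callback=ignore_server_errors)``.
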
Setting default serializer
``````````````````````````

If you want to use a specific serializer for every cassette, you can set
``serialize_with`` as a default cassette option. For example, if you wanted to
use the ``prettyjson`` serializer for every cassette you would do:

.. code-block:: python

    config.default_cassette_options['serialize_with'] = 'prettyjson'

Per-Use Configuration
---------------------

Each time you create a :class:`~betamax.Betamax` instance or use
:meth:`~betamax.Betamax.use_cassette`, you can pass some of the options from
above.

Setting the Directory in which Betamax Should Store Cassette Files
``````````````````````````````````````````````````````````````````

When using per-use configuration of Betamax, you can specify the cassette
directory when you instantiate a :class:`~betamax.Betamax` object:

.. code-block:: python

    session = requests.Session()
    recorder = betamax.Betamax(session,
                               cassette_library_dir='tests/cassettes/')

Setting Default Cassette Options
````````````````````````````````

You can also set default cassette options when instantiating a
:class:`~betamax.Betamax` object:

.. code-block:: python

    session = requests.Session()
    recorder = betamax.Betamax(session, default_cassette_options={
        'record_mode': 'once',
        'match_requests_on': ['method', 'uri', 'headers'],
        'preserve_exact_body_bytes': True
    })

You can also set the above when calling :meth:`~betamax.Betamax.use_cassette`:

.. code-block:: python

    session = requests.Session()
    recorder = betamax.Betamax(session)
    with recorder.use_cassette('cassette-name',
                               preserve_exact_body_bytes=True,
                               match_requests_on=['method', 'uri', 'headers'],
                               record='once'):
        session.get('https://httpbin.org/get')

Filtering Sensitive Data
````````````````````````

Filtering sensitive data on a per-usage basis is the only difficult (or
perhaps, less convenient) case. Cassette placeholders are part of the default
cassette options, so we'll set this value similarly to how we set the other
default cassette options. The catch is that placeholders have a specific
structure: they are stored as a list of dictionaries. Let's use our example
above and convert it.

.. code-block:: python

    import base64

    username = os.environ.get('USERNAME', 'testuser')
    password = os.environ.get('PASSWORD', 'testpassword')
    session = requests.Session()

    recorder = betamax.Betamax(session, default_cassette_options={
        'placeholders': [{
            'placeholder': '<GITHUB-AUTH>',
            # b64encode() returns bytes on Python 3; decode it to text.
            'replace': base64.b64encode(
                '{0}:{1}'.format(username, password).encode('utf-8')
            ).decode('utf-8'),
        }]
    })

Note that what we passed as our first argument to
``define_cassette_placeholder`` is assigned to the ``'placeholder'`` key while
the value we're replacing is assigned to the ``'replace'`` key.

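If you have several values to hide, writing each dictionary by hand gets
repetitive. As a small convenience, and not something Betamax provides, a
hypothetical helper such as ``build_placeholders`` can turn a mapping into the
list-of-dictionaries structure described above.

.. code-block:: python

    def build_placeholders(secrets):
        """Map {placeholder: secret} to a list of placeholder dictionaries."""
        # Sorting keeps the resulting list in a predictable order.
        return [
            {'placeholder': placeholder, 'replace': secret}
            for placeholder, secret in sorted(secrets.items())
        ]


    placeholders = build_placeholders({
        '<USERNAME>': 'testuser',
        '<PASSWORD>': 'testpassword',
    })

The result can then be passed as
``default_cassette_options={'placeholders': placeholders}``.
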
This isn't the typical way that people filter sensitive data because they tend
to want to do it globally.

Mixing and Matching
-------------------

It's not uncommon to mix and match configuration methodologies. I do this in
`github3.py`_. I use global configuration to filter sensitive data and set
defaults based on the environment the tests are running in. On Travis-CI, the
record mode is set to ``'none'``. I also set how we match requests and when we
preserve exact body bytes on a per-use basis.

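As a rough sketch of that arrangement, the record mode can be chosen from the
environment before it is handed to the global configuration. This assumes the
CI service exports ``TRAVIS=true``, as Travis-CI does; the helper name
``pick_record_mode`` is hypothetical, and this is not necessarily how
github3.py implements it.

.. code-block:: python

    import os


    def pick_record_mode(environ=None):
        """Return 'none' on Travis-CI so nothing new is recorded there."""
        environ = os.environ if environ is None else environ
        if environ.get('TRAVIS') == 'true':
            return 'none'
        # Everywhere else, fall back to Betamax's default record mode.
        return 'once'

The result would then be assigned globally with
``config.default_cassette_options['record_mode'] = pick_record_mode()``.
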
.. links

.. _VCR: https://relishapp.com/vcr/vcr
.. _github3.py: https://github.com/sigmavirus24/github3.py
