1.. currentmodule:: astropy.io.fits
2.. doctest-skip-all
3
4.. _header-transition-guide:
5
6*********************************
7Header Interface Transition Guide
8*********************************
9
10.. note::
11
12    This guide was originally included with the release of PyFITS 3.1, and
13    still references PyFITS in many places, though the examples have been
14    updated for ``astropy.io.fits``. It is still useful here for informational
15    purposes, though Astropy has always used the PyFITS 3.1 Header interface.
16
17PyFITS v3.1 included an almost complete rewrite of the :class:`Header`
18interface. Although the new interface is largely compatible with the old
19interface (whether due to similarities in the design, or backwards-compatibility
20support), there are enough differences that a full explanation of the new
21interface is merited.
22
23Background
24==========
25
26Prior to 3.1, PyFITS users interacted with FITS headers by way of three
27different classes: :class:`Card`, ``CardList``, and :class:`Header`.
28
29The Card class represents a single header card with a keyword, value, and
30comment. It also contains all of the machinery for parsing FITS header cards,
31given the 80-character string, or "card image" read from the header.
32
33The CardList class is actually a subclass of Python's `list` built-in. It was
34meant to represent the actual list of cards that make up a header. That is, it
35represents an ordered list of cards in the physical order that they appear in
36the header. It supports the usual list methods for inserting and appending new
37cards into the list. It also supports `dict`-like keyword access, where
38``cardlist['KEYWORD']`` would return the first card in the list with the given
39keyword.
40
41A lot of the functionality for manipulating headers was actually buried in the
42CardList class. The Header class was more of a wrapper around CardList that
43added a little bit of abstraction. It also implemented a partial dict-like
44interface, though for Headers a keyword lookup returned the header value
45associated with that keyword, not the Card object, and almost every
46method on the Header class was just performing some operations on the
47underlying CardList.
48
49The problem was that there were certain things a user could *only* do by
50directly accessing the CardList, such as look up the comments on a card or
access cards that have duplicate keywords, such as HISTORY. Another
long-standing misfeature was that slicing a Header object actually returned a
53CardList object, rather than a new Header. For all but the simplest use cases,
54working with CardList objects was largely unavoidable.
55
Eventually it became clear that CardList is really an implementation detail
that does not represent any element of a FITS file distinct from the header
itself. Users familiar with the FITS format know what a header is, but it is
not clear how a "card list" is distinct from that, or why some operations go
through the Header object while others have to be performed through the
CardList.
61
62So the primary goal of this redesign was to eliminate the ``CardList`` class
63altogether, and make it possible for users to perform all header manipulations
64directly through :class:`Header` objects. It also tried to present headers as
65similarly as possible to a more familiar data structure — an ordered mapping
66(or :class:`~collections.OrderedDict` in Python) for ease of use by new users
67less familiar with the FITS format, though there are still many added
68complexities for dealing with the idiosyncrasies of the FITS format.
69
70
71Deprecation Warnings
72====================
73
74A few older methods on the :class:`Header` class have been marked as deprecated,
either because they have been renamed to a more `PEP 8`_-compliant name, or
because they have become redundant due to new features. To check if your code is
77using any deprecated methods or features, run your code with ``python -Wd``.
78This will output any deprecation warnings to the console.
79
80Two of the most common deprecation warnings related to Headers are:
81
- ``Header.has_key``: this has been deprecated since PyFITS 3.0,
  just as Python's ``dict.has_key`` was deprecated in Python 2 (it no longer
  exists in Python 3). To check a key's presence in a mapping object like
  `dict` or :class:`Header`, use the ``key in d`` syntax. This has long been
  the preference in Python.
86
87- ``Header.ascardlist`` and ``Header.ascard``: these were used to
88  access the ``CardList`` object underlying a header. They should still
89  work, and return a skeleton CardList implementation that should support most
90  of the old CardList functionality. But try removing as much of this as
91  possible. If direct access to the :class:`Card` objects making up a header
92  is necessary, use :attr:`Header.cards`, which returns an iterator over the
93  cards. More on that below.
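
As a minimal sketch of the two replacements above (the header contents are
made up for illustration), a membership test takes the place of ``has_key``,
and :attr:`Header.cards` gives access to the Card objects without going
through a CardList:

.. code-block:: python

    from astropy.io import fits

    # A small in-memory header; keywords and values are arbitrary.
    header = fits.Header([('SIMPLE', True), ('BITPIX', 16, 'bits per pixel')])

    # Membership test instead of the deprecated Header.has_key().
    has_bitpix = 'BITPIX' in header
    has_naxis = 'NAXIS' in header

    # Header.cards instead of Header.ascard / Header.ascardlist().
    summary = [(card.keyword, card.value, card.comment)
               for card in header.cards]

Each entry of ``summary`` holds the keyword, value, and comment of one card,
in header order.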
94
95.. _PEP 8: https://www.python.org/dev/peps/pep-0008/
96
97New Header Design
98=================
99
100The new :class:`Header` class is designed to work as a drop-in replacement for
101a `dict` via `duck typing`_. That is, although it is not a subclass of `dict`,
102it implements all of the same methods and interfaces. In particular, it is
103similar to an :class:`~collections.OrderedDict` in that the order of insertions
104is preserved. However, Header also supports many additional features and
105behaviors specific to the FITS format. It should also be noted that while the
106old Header implementation also had a dict-like interface, it did not implement
107the *entire* dict interface as the new Header does.
108
109Although the new Header is used like a dict/mapping in most cases, it also
110supports a `list` interface. The list-like interface is a bit idiosyncratic in
111that in some contexts the Header acts like a list of values, in others like a
112list of keywords, and in a few contexts like a list of :class:`Card` objects.
113This may be the most difficult aspect of the new design, but there is a logic
114to it.
115
116As with the old Header implementation, integer index access is supported:
117``header[0]`` returns the value of the first keyword. However, the
118:meth:`Header.index` method treats the header as though it is a list of
119keywords and returns the index of a given keyword. For example::
120
121    >>> header.index('BITPIX')
122    2
123
124:meth:`Header.count` is similar to `list.count` and also takes a keyword as
125its argument::
126
127    >>> header.count('HISTORY')
128    20
129
A good rule of thumb is that any item access using square brackets ``[]``
returns a *value* in the header, whether using keyword or index lookup. Methods
132like :meth:`~Header.index` and :meth:`~Header.count` that deal with the order
133and quantity of items in the Header generally work on keywords. Finally,
134methods like :meth:`~Header.insert` and :meth:`~Header.append` that add new
135items to the header work on cards.
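
The rule of thumb can be illustrated with a small, hypothetical header. This
is only a sketch; the keywords and values are arbitrary:

.. code-block:: python

    from astropy.io import fits

    header = fits.Header([('SIMPLE', True), ('BITPIX', 16), ('NAXIS', 0)])

    # Square brackets return values, by index or by keyword.
    first_value = header[0]
    bitpix_value = header['BITPIX']

    # index() and count() work on keywords.
    bitpix_pos = header.index('BITPIX')
    naxis_count = header.count('NAXIS')

    # append() and insert() work on cards; tuples stand in for Card objects.
    header.append(('EXTEND', True, 'extensions may follow'))
    header.insert('NAXIS', ('OBSERVER', 'E. Hubble'))  # goes before NAXIS
    keywords = list(header.keys())

After these calls ``keywords`` lists OBSERVER immediately before NAXIS and
EXTEND at the end.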
136
137Aside from the list-like methods, the new Header class works very similarly to
138the old implementation for most basic use cases and should not present too many
139surprises. There are differences, however:
140
141- As before, the Header() initializer can take a list of :class:`Card` objects
142  with which to fill the header. However, now any iterable may be used. It is
143  also important to note that *any* Header method that accepts :class:`Card`
144  objects can also accept 2-tuples or 3-tuples in place of Cards. That is,
145  either a ``(keyword, value, comment)`` tuple or a ``(keyword, value)`` tuple
146  (comment is assumed blank) may be used anywhere in place of a Card object.
147  This is even preferred, as it involves less typing. For example::
148
149      >>> from astropy.io import fits
150      >>> header = fits.Header([('A', 1), ('B', 2), ('C', 3, 'A comment')])
151      >>> header
152      A       =                    1
153      B       =                    2
154      C       =                    3 / A comment
155
156- As demonstrated in the previous example, the ``repr()`` for a Header (that is,
157  the text that is displayed when entering a Header object in the Python
158  console as an expression), shows the header as it would appear in a FITS file.
159  This inserts newlines after each card so that it is readable regardless of
  terminal width. It is *not* necessary to use ``print(header)`` to view this.
161  Entering ``header`` displays the header contents as it would appear in the
162  file (sans the END card).
163
164- ``len(header)`` is now supported (previously it was necessary to do
165  ``len(header.ascard)``). This returns the total number of cards in the
166  header, including blank cards, but excluding the END card.
167
168- FITS supports having duplicate keywords, although they are generally in error
169  except for commentary keywords like COMMENT and HISTORY. PyFITS now supports
170  reading, updating, and deleting duplicate keywords; instead of using the
171  keyword by itself, use a ``(keyword, index)`` tuple. For example,
172  ``('HISTORY', 0)`` represents the first HISTORY card, ``('HISTORY', 1)``
173  represents the second HISTORY card, and so on. In fact, when a keyword is
174  used by itself, it is shorthand for ``(keyword, 0)``. It is now possible to
175  delete an accidental duplicate like so::
176
177      >>> del header[('NAXIS', 1)]
178
179  This will remove an accidental duplicate NAXIS card from the header.
180
181- Even if there are duplicate keywords, keyword lookups like
182  ``header['NAXIS']`` will always return the value associated with the first
183  copy of that keyword, with one exception: commentary keywords like COMMENT
184  and HISTORY are expected to have duplicates. So ``header['HISTORY']``, for
185  example, returns the whole sequence of HISTORY values in the correct order.
186  This list of values can be sliced arbitrarily. For example, to view the last
187  three history entries in a header::
188
189      >>> hdulist[0].header['HISTORY'][-3:]
190        reference table oref$laf13367o_pct.fits
191        reference table oref$laf13369o_apt.fits
192      Heliocentric correction = 16.225 km/s
193
194- Subscript assignment can now be used to add new keywords to the header. Just
195  as with a normal `dict`, ``header['NAXIS'] = 1`` will either update the NAXIS
196  keyword if it already exists, or add a new NAXIS keyword with a value of
  ``1`` if it does not exist. In the old interface this would raise a
  `KeyError` if NAXIS did not exist, and the only way to add a new
  keyword was through the update() method.
200
201  By default, new keywords added in this manner are added to the end of the
202  header, with a few FITS-specific exceptions:
203
204  * If the header contains extra blank cards at the end, new keywords are added
205    before the blanks.
206
207  * If the header ends with a list of commentary cards — for example, a sequence
208    of HISTORY cards — those are kept at the end, and new keywords are inserted
209    before the commentary cards.
210
211  * If the keyword is a commentary keyword like COMMENT or HISTORY (or an empty
212    string for blank keywords), a *new* commentary keyword is always added and
213    appended to the last commentary keyword of the same type. For example,
214    HISTORY keywords are always placed after the last history keyword::
215
216        >>> header = fits.Header()
217        >>> header['COMMENT'] = 'Comment 1'
218        >>> header['HISTORY'] = 'History 1'
219        >>> header['COMMENT'] = 'Comment 2'
220        >>> header['HISTORY'] = 'History 2'
221        >>> header
222        COMMENT Comment 1
223        COMMENT Comment 2
224        HISTORY History 1
225        HISTORY History 2
226
  These are sensible defaults for keyword assignment, and they match the
  behavior of :meth:`~Header.update` in the old Header implementation. The
  defaults may still be bypassed through the use of other assignment methods
  like the :meth:`Header.set` and :meth:`Header.append` methods described
  later.
232
233- It is now also possible to assign a value and a comment to a keyword
234  simultaneously using a tuple::
235
236      >>> header['NAXIS'] = (2, 'Number of axis')
237
238  This will update the value and comment of an existing keyword, or add a new
239  keyword with the given value and comment.
240
241- There is a new :attr:`Header.comments` attribute which lists all of the
242  comments associated with keywords in the header (not to be confused with
243  COMMENT cards). This allows viewing and updating the comments on specific
244  cards::
245
      >>> header.comments['NAXIS']
      'Number of axis'
      >>> header.comments['NAXIS'] = 'Number of axes'
      >>> header.comments['NAXIS']
      'Number of axes'
251
252- When deleting a keyword from a header, do not assume that the keyword already
253  exists. In the old Header implementation, this action would silently do
  nothing. For backwards-compatibility, it is still okay to delete a
  nonexistent keyword, but a warning will be issued. In the future this
256  *will* be changed so that trying to delete a nonexistent keyword raises a
257  `KeyError`. This is for consistency with the behavior of Python dicts. So
258  unless you know for certain that a keyword exists before deleting it, it is
259  best to do something like::
260
261      >>> try:
262      ...     del header['BITPIX']
263      ... except KeyError:
264      ...     pass
265
266  Or if you prefer to look before you leap::
267
268      >>> if 'BITPIX' in header:
269      ...     del header['BITPIX']
270
271- ``del header`` now supports slices. For example, to delete the last three
272  keywords from a header::
273
274      >>> del header[-3:]
275
- Two headers can now be compared for equality. Previously, no two Header
  objects compared as equal; now they do if they contain exactly the same
  content. That is, this requires strict equality.
279
- Two headers can now be added with the ``+`` operator, which returns a copy of
  the left header extended by the right header with :meth:`~Header.extend`.
  In-place addition with ``+=`` is also possible.
283
284- The Header.update() method used commonly with the old Header API has been
285  renamed to :meth:`Header.set`. The primary reason for this change is very
286  simple: Header implements the `dict` interface, which already has a method
287  called update(), but that behaves differently from the old Header.update().
288
289  The details of the new update() can be read in the API docs, but it is very
290  similar to `dict.update`. It also supports backwards compatibility with the
291  old update() by analysis of the arguments passed to it, so existing code will
  not break immediately. However, this *will* cause a deprecation warning to
  be output if such warnings are enabled. It is best, for starters, to replace all
294  update() calls with set(). Recall, also, that direct assignment is now
295  possible for adding new keywords to a header. So by and large the only
296  reason to prefer using :meth:`Header.set` is its capability of inserting or
297  moving a keyword to a specific location using the ``before`` or ``after``
298  arguments.
299
- Slicing a Header returns a new Header containing only the cards in the
  slice. As mentioned earlier, it used to be that slicing a Header returned a
  card list, something of a misfeature. In
303  general, objects that support slicing ought to return an object of the same
304  type when you slice them.
305
306  Likewise, wildcard keywords used to return a CardList object — now they
307  return a new Header similarly to a slice. For example::
308
309      >>> header['NAXIS*']
310
311  returns a new header containing only the NAXIS and NAXISn cards from the
312  original header.
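
Several of the points in the list above can be tried out on a header built in
memory. The following is an illustrative sketch rather than part of the
original guide: the values are arbitrary, and the duplicate NAXIS card is
created on purpose with ``append(..., end=True)`` so that there is something
to delete:

.. code-block:: python

    from astropy.io import fits

    header = fits.Header()
    header['NAXIS'] = (2, 'Number of axis')      # value and comment together
    header.comments['NAXIS'] = 'Number of axes'  # update the comment only

    # Commentary keywords accumulate instead of being overwritten.
    header['HISTORY'] = 'History 1'
    header['HISTORY'] = 'History 2'
    header['HISTORY'] = 'History 3'

    all_history = list(header['HISTORY'])     # every HISTORY value, in order
    last_two = list(header['HISTORY'][-2:])   # the sequence can be sliced
    second = header[('HISTORY', 1)]           # (keyword, index) picks one card

    # Create an accidental-looking duplicate, then remove only the second copy.
    header.append(('NAXIS', 3), end=True)
    del header[('NAXIS', 1)]
    naxis_count = header.count('NAXIS')

Plain ``header['NAXIS']`` refers to the first NAXIS card throughout, so the
original value and its corrected comment survive the deletion.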
313
314.. _duck typing: https://en.wikipedia.org/wiki/Duck_typing
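
As a further illustration of the list above, the sketch below exercises
:meth:`Header.set` with ``after``, slicing, wildcard lookup, equality, and
addition. The header contents are made up for the example:

.. code-block:: python

    from astropy.io import fits

    header = fits.Header([('SIMPLE', True), ('BITPIX', 16), ('NAXIS', 2),
                          ('NAXIS1', 100), ('NAXIS2', 200)])

    # set() can place a new keyword relative to an existing one.
    header.set('OBSERVER', 'E. Hubble', 'who took the data', after='BITPIX')
    order = list(header.keys())

    # Slices and wildcard lookups both return new Header objects.
    tail = header[-2:]
    axes = header['NAXIS*']

    # Headers with identical content compare equal.
    same = header.copy() == header

    # '+' returns a copy of the left header extended by the right one.
    combined = fits.Header([('A', 1)]) + fits.Header([('B', 2)])

Here ``tail`` and ``axes`` are independent Header objects, so modifying them
does not add or remove cards in ``header``.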
315
316
317Transition Tips
318===============
319
320The above may seem like a lot, but the majority of existing code using PyFITS
321to manipulate headers should not need to be updated, at least not immediately.
322The most common operations still work the same.
323
As mentioned above, it would be helpful to run your code with ``python -Wd`` to
enable deprecation warnings, which should give a good idea of where to look
when updating your code.
327
328If your code needs to be able to support older versions of PyFITS
329simultaneously with PyFITS 3.1, things are slightly trickier, but not by
330much — the deprecated interfaces will not be removed for several more versions
331because of this.
332
333- The first change worth making, which is supported by any PyFITS version in
334  the last several years, is to remove any use of ``Header.has_key`` and
335  replace it with ``keyword in header`` syntax. It is worth making this change
336  for any dict as well, since `dict.has_key` is deprecated. Running the
337  following regular expression over your code may help with most (but not all)
338  cases::
339
340      s/([^ ]+)\.has_key\(([^)]+)\)/\2 in \1/
341
342- If possible, replace any calls to Header.update() with Header.set() (though
343  do not bother with this if you need to support older PyFITS versions). Also,
344  if you have any calls to Header.update() that can be replaced with simple
345  subscript assignments (e.g., ``header['NAXIS'] = (2, 'Number of axes')``) do
346  that too, if possible.
347
348- Find any code that uses ``header.ascard`` or ``header.ascardlist()``. First
349  ascertain whether that code really needs to work directly on Card objects.
350  If that is definitely the case, go ahead and replace those with
351  ``header.cards`` — that should work without too much fuss. If you do need to
352  support older versions, you may keep using ``header.ascard`` for now.
353
- On the off chance that you have any code that slices a header, it is best to
355  take the result of that and create a new Header object from it. For
356  example::
357
358      >>> new_header = fits.Header(old_header[2:])
359
  Because the result is used to initialize a new Header object, this avoids the
  problem that slicing a Header returns a CardList in PyFITS <= 3.0. It works
  in both cases (in PyFITS 3.1, initializing a Header with an existing Header
  just copies it, à la `list`).
364
- As mentioned earlier, locate any code that deletes keywords with ``del`` and
  make sure it either looks before it leaps (``if keyword in header:``) or
  asks forgiveness (``try/except KeyError:``).
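
The tips above might look as follows in practice. This is a sketch under
stated assumptions: ``modernize`` is a hypothetical helper that applies the
substitution shown earlier using Python's ``re`` module, and the header
contents are arbitrary:

.. code-block:: python

    import re

    from astropy.io import fits

    # The has_key substitution from the first tip, as a compiled pattern.
    HAS_KEY = re.compile(r"([^ ]+)\.has_key\(([^)]+)\)")


    def modernize(line):
        """Rewrite ``obj.has_key(key)`` as ``key in obj`` on one source line."""
        return HAS_KEY.sub(r"\2 in \1", line)


    converted = modernize("if header.has_key('NAXIS'):")

    old_header = fits.Header([('A', 1), ('B', 2), ('C', 3), ('D', 4)])

    # Wrapping the slice in Header() yields a Header whatever the slice returned.
    new_header = fits.Header(old_header[2:])

    # Look before you leap when deleting, so a missing keyword is harmless.
    for keyword in ('D', 'MISSING'):
        if keyword in new_header:
            del new_header[keyword]

    # Subscript assignment in place of a simple Header.update() call.
    new_header['NAXIS'] = (2, 'Number of axes')

As the tip itself warns, the regular expression handles most but not all
cases, so its output should be reviewed by hand.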
368
369Other Gotchas
370-------------
371
- As mentioned above, it is not necessary to enter ``print(header)`` to display
  a header in an interactive Python prompt. Entering ``>>> header``
  by itself is sufficient. Using ``print`` usually will *not* display the
375  header readably, because it does not include line breaks between the header
376  cards. The reason is that Python has two types of string representations.
377  One is returned when a user calls ``str(header)``, which happens automatically
378  when you ``print`` a variable. In the case of the Header class this actually
379  returns the string value of the header as it is written literally in the
380  FITS file, which includes no line breaks.
381
382  The other type of string representation happens when one calls
383  ``repr(header)``. The `repr` of an object is meant to be a useful
384  string "representation" of the object; in this case the contents of the
385  header but with line breaks between the cards and with the END card and
386  trailing padding stripped off. This happens automatically when
387  a user enters a variable at the Python prompt by itself without a ``print``
388  call.
389
390- The current version of the FITS Standard (3.0) states in section 4.2.1
391  that trailing spaces in string values in headers are not significant and
392  should be ignored. PyFITS < 3.1 *did* treat trailing spaces as significant.
  For example, if a header contained::

      KEYWORD1= 'Value    '
396
397  then ``header['KEYWORD1']`` would return the string ``'Value    '`` exactly,
398  with the trailing spaces intact. The new Header interface fixes this by
399  automatically stripping trailing spaces, so that ``header['KEYWORD1']`` would
400  return just ``'Value'``.
401
  There is, however, one convention, used by the IRAF CCD mosaic task to
  represent its nonstandard TNX and ZPX World Coordinate Systems, that relies
  on a series of keywords in the form ``WATj_nnn``. These keywords store a text
  description of coefficients for a nonlinear distortion projection. The
  convention uses its own microformat for listing the coefficients as a
  string, but the string is long, and thus broken up into several of these
  ``WATj_nnn`` keywords. Correct recombination of these keywords requires
  treating all whitespace literally. This convention either overlooked or
  predated the prescribed treatment of whitespace in the FITS standard.
411
412  To get around this issue, a global variable ``fits.STRIP_HEADER_WHITESPACE``
413  was introduced. Temporarily setting
414  ``fits.STRIP_HEADER_WHITESPACE.set(False)`` before reading keywords affected
415  by this issue will return their values with all trailing whitespace intact.
416
  A future version of PyFITS may be able to detect use of conventions like this
  contextually and behave according to the convention, but in most cases
  PyFITS follows the FITS Standard by default.
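
Both gotchas can be observed directly. The sketch below assumes a current
Astropy, in which the whitespace switch is reached through the
``strip_header_whitespace`` item on ``fits.conf`` rather than through the
``fits.STRIP_HEADER_WHITESPACE`` variable described above. The keywords and
values are arbitrary:

.. code-block:: python

    from astropy.io import fits

    header = fits.Header([('A', 1), ('B', 2)])

    # str() gives the header as written to a file: card images run together,
    # followed by the END card and padding, with no line breaks.
    as_written = str(header)

    # repr() gives one card per line, without the END card or padding.
    as_shown = repr(header)

    # Trailing whitespace in string values is stripped by default.
    header.append(fits.Card.fromstring("KEYWORD1= 'Value    '"))
    stripped = header['KEYWORD1']

    # Assumption: the configuration item named here exists in the installed
    # Astropy version; set_temp() restores the previous setting on exit.
    with fits.conf.set_temp('strip_header_whitespace', False):
        preserved = header['KEYWORD1']

Using the context manager keeps the nonstandard behavior confined to the code
that reads the affected keywords.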
420