.. _astropy-table:

*****************************
Data Tables (`astropy.table`)
*****************************

Introduction
============

`astropy.table` provides functionality for storing and manipulating
heterogeneous tables of data in a way that is familiar to ``numpy`` users. A few
notable capabilities of this package are:

* Initialize a table from a wide variety of input data structures and types.
* Modify a table by adding or removing columns, changing column names,
  or adding new rows of data.
* Handle tables containing missing values.
* Include table and column metadata as flexible data structures.
* Specify a description, units, and output formatting for columns.
* Interactively scroll through long tables similar to using ``more``.
* Create a new table by selecting rows or columns from a table.
* Perform :ref:`table_operations` like database joins, concatenation, and binning.
* Maintain a table index for fast retrieval of table items or ranges.
* Manipulate multidimensional columns.
* Handle non-native (mixin) column types within a table.
* :ref:`Read and write tables <read_write_tables>` to and from files.
* Customize behavior by :ref:`subclassing Table <subclassing_table>` and its
  component classes.

Getting Started
===============

The basic workflow for creating a table, accessing table elements,
and modifying the table is shown below. These examples demonstrate a concise
case, while the full `astropy.table` documentation is available from the
:ref:`using_astropy_table` section.

First create a simple table with columns of data named ``a``, ``b``, ``c``, and
``d``. These columns have integer, float, string, and |Quantity| values
respectively::

  >>> from astropy.table import QTable
  >>> import astropy.units as u
  >>> import numpy as np

  >>> a = np.array([1, 4, 5], dtype=np.int32)
  >>> b = [2.0, 5.0, 8.5]
  >>> c = ['x', 'y', 'z']
  >>> d = [10, 20, 30] * u.m / u.s

  >>> t = QTable([a, b, c, d],
  ...            names=('a', 'b', 'c', 'd'),
  ...            meta={'name': 'first table'})

Comments:

- Column ``a`` is a |ndarray| with a specified ``dtype`` of ``int32``. If the
  data type is not provided, the ``numpy`` default integer type is used:
  ``int64`` on Mac and Linux, and ``int32`` on Windows with ``numpy`` versions
  earlier than 2.0.
- Column ``b`` is a list of ``float`` values, represented as ``float64``.
- Column ``c`` is a list of ``str`` values, represented as Unicode.
  See :ref:`bytestring-columns-python-3` for more information.
- Column ``d`` is a |Quantity| array. Since we used |QTable|, this stores a
  native |Quantity| within the table and brings the full power of
  :ref:`astropy-units` to this column in the table.

.. Note::

   If the table data have no units or you prefer to not use |Quantity|, then you
   can use the |Table| class to create tables. The **only** difference between
   |QTable| and |Table| is the behavior when adding a column that has units.
   See :ref:`quantity_and_qtable` and :ref:`columns_with_units` for details on
   the differences and use cases.

There are many other ways of :ref:`construct_table`, including from a list of
rows (either tuples or dicts), from a ``numpy`` structured or 2D array, by
adding columns or rows incrementally, or even converting from a |SkyCoord| or a
:class:`pandas.DataFrame`.

There are a few ways of :ref:`access_table`. You can get detailed information
about the table values and column definitions as follows::

  >>> t
  <QTable length=3>
    a      b     c      d
                      m / s
  int32 float64 str1 float64
  ----- ------- ---- -------
      1     2.0    x    10.0
      4     5.0    y    20.0
      5     8.5    z    30.0

You can get summary information about the table as follows::

  >>> t.info
  <QTable length=3>
  name  dtype   unit  class
  ---- ------- ----- --------
     a   int32         Column
     b float64         Column
     c    str1         Column
     d float64 m / s Quantity

From within a `Jupyter notebook <https://jupyter.org/>`_, the table is
displayed as a formatted HTML table (details of how it appears can be changed
by altering the `astropy.table.conf.default_notebook_table_class
<astropy.table.Conf.default_notebook_table_class>` item in the
:ref:`astropy_config`):

.. image:: table_repr_html.png
   :width: 450px

Or you can get a fancier notebook interface with in-browser search and sort
using :meth:`~astropy.table.Table.show_in_notebook`:

.. image:: table_show_in_nb.png
   :width: 450px

If you print the table (either from the notebook or in a text console session)
then a formatted version appears::

  >>> print(t)
   a   b   c    d
              m / s
  --- --- --- -----
    1 2.0   x  10.0
    4 5.0   y  20.0
    5 8.5   z  30.0


If you do not like the format of a particular column, you can change it through
:ref:`the 'info' property <mixin_attributes>`::

  >>> t['b'].info.format = '7.3f'
  >>> print(t)
   a     b     c    d
                  m / s
  --- ------- --- -----
    1   2.000   x  10.0
    4   5.000   y  20.0
    5   8.500   z  30.0

For a long table you can scroll up and down through the table one page at a
time::

  >>> t.more()  # doctest: +SKIP

You can also display it as an HTML-formatted table in the browser::

  >>> t.show_in_browser()  # doctest: +SKIP

Or as an interactive (searchable and sortable) JavaScript table::

  >>> t.show_in_browser(jsviewer=True)  # doctest: +SKIP

Now examine some high-level information about the table::

  >>> t.colnames
  ['a', 'b', 'c', 'd']
  >>> len(t)
  3
  >>> t.meta
  {'name': 'first table'}

Access the data by column or row using familiar ``numpy`` structured array
syntax::

  >>> t['a']       # Column 'a'
  <Column name='a' dtype='int32' length=3>
  1
  4
  5

  >>> t['a'][1]    # Row 1 of column 'a'
  4

  >>> t[1]         # Row 1 of the table
  <Row index=1>
    a      b     c      d
                      m / s
  int32 float64 str1 float64
  ----- ------- ---- -------
      4   5.000    y    20.0


  >>> t[1]['a']    # Column 'a' of row 1
  4

You can retrieve a subset of a table by rows (using a :class:`slice`) or by
columns (using column names), where the subset is returned as a new table::

  >>> print(t[0:2])      # Table object with rows 0 and 1
   a     b     c    d
                  m / s
  --- ------- --- -----
    1   2.000   x  10.0
    4   5.000   y  20.0


  >>> print(t['a', 'c'])  # Table with cols 'a' and 'c'
   a   c
  --- ---
    1   x
    4   y
    5   z

:ref:`modify_table` in place is flexible and works as you would expect::

  >>> t['a'][:] = [-1, -2, -3]    # Set all column values in place
  >>> t['a'][2] = 30              # Set row 2 of column 'a'
  >>> t[1] = (8, 9.0, "W", 4 * u.m / u.s) # Set all values of row 1
  >>> t[1]['b'] = -9              # Set column 'b' of row 1
  >>> t[0:2]['b'] = 100.0         # Set column 'b' of rows 0 and 1
  >>> print(t)
   a     b     c    d
                  m / s
  --- ------- --- -----
   -1 100.000   x  10.0
    8 100.000   W   4.0
   30   8.500   z  30.0

Replace, add, remove, and rename columns with the following::

  >>> t['b'] = ['a', 'new', 'dtype']   # Replace column 'b' (different from in-place)
  >>> t['e'] = [1, 2, 3]               # Add column 'e'
  >>> del t['c']                       # Delete column 'c'
  >>> t.rename_column('a', 'A')        # Rename column 'a' to 'A'
  >>> t.colnames
  ['A', 'b', 'd', 'e']

To add a new row of data to the table, use ``add_row``. Note that the value for
column ``d`` is given in ``cm / s`` but is stored in the table as ``0.1 m / s``,
in accord with the existing unit of the column::

  >>> t.add_row([-8, 'string', 10 * u.cm / u.s, 10])
  >>> t['d']
  <Quantity [10. ,  4. , 30. ,  0.1] m / s>

Tables can be used for data with missing values::

  >>> from astropy.table import MaskedColumn
  >>> a_masked = MaskedColumn(a, mask=[True, True, False])
  >>> t = QTable([a_masked, b, c], names=('a', 'b', 'c'),
  ...            dtype=('i4', 'f8', 'U1'))
  >>> t
  <QTable length=3>
    a      b     c
  int32 float64 str1
  ----- ------- ----
     --     2.0    x
     --     5.0    y
      5     8.5    z

In addition to |Quantity|, you can include certain object types like
`~astropy.time.Time`, `~astropy.coordinates.SkyCoord`, and
`~astropy.table.NdarrayMixin` in your table. These "mixin" columns behave like
a hybrid of a regular `~astropy.table.Column` and the native object type (see
:ref:`mixin_columns`). For example::

  >>> from astropy.time import Time
  >>> from astropy.coordinates import SkyCoord
  >>> tm = Time(['2000:002', '2002:345'])
  >>> sc = SkyCoord([10, 20], [-45, +40], unit='deg')
  >>> t = QTable([tm, sc], names=['time', 'skycoord'])
  >>> t
  <QTable length=2>
           time          skycoord
                         deg,deg
           Time          SkyCoord
  --------------------- ----------
  2000:002:00:00:00.000 10.0,-45.0
  2002:345:00:00:00.000  20.0,40.0

Now let us compute the interval since the launch of the `Chandra X-ray Observatory
<https://en.wikipedia.org/wiki/Chandra_X-ray_Observatory>`_ aboard `STS-93
<https://en.wikipedia.org/wiki/STS-93>`_ and store this in our table as a
|Quantity| in days::

  >>> dt = t['time'] - Time('1999-07-23 04:30:59.984')
  >>> t['dt_cxo'] = dt.to(u.d)
  >>> t['dt_cxo'].info.format = '.3f'
  >>> print(t)
           time          skycoord   dt_cxo
                         deg,deg      d
  --------------------- ---------- --------
  2000:002:00:00:00.000 10.0,-45.0  162.812
  2002:345:00:00:00.000  20.0,40.0 1236.812

.. _using_astropy_table:

Using ``table``
===============

The details of using `astropy.table` are provided in the following sections:

Construct Table
---------------

.. toctree::
   :maxdepth: 2

   construct_table.rst

Access Table
------------

.. toctree::
   :maxdepth: 2

   access_table.rst

Modify Table
------------

.. toctree::
   :maxdepth: 2

   modify_table.rst

Table Operations
----------------

.. toctree::
   :maxdepth: 2

   operations.rst

Indexing
--------

.. toctree::
   :maxdepth: 2

   indexing.rst

Masking
-------

.. toctree::
   :maxdepth: 2

   masking.rst

I/O with Tables
---------------

.. toctree::
   :maxdepth: 2

   io.rst
   pandas.rst

Mixin Columns
-------------

.. toctree::
   :maxdepth: 2

   mixin_columns.rst

Implementation
--------------

.. toctree::
   :maxdepth: 2

   implementation_details.rst

.. note that if this section gets too long, it should be moved to a separate
   doc page - see the top of performance.inc.rst for the instructions on how to do
   that
.. include:: performance.inc.rst

Reference/API
=============

.. automodapi:: astropy.table