ASDF - Advanced Scientific Data Format
======================================

.. image:: https://github.com/asdf-format/asdf/workflows/CI/badge.svg
    :target: https://github.com/asdf-format/asdf/actions
    :alt: CI Status

.. image:: https://github.com/asdf-format/asdf/workflows/s390x/badge.svg
    :target: https://github.com/asdf-format/asdf/actions
    :alt: s390x Status

.. image:: https://github.com/asdf-format/asdf/workflows/Downstream/badge.svg
    :target: https://github.com/asdf-format/asdf/actions
    :alt: Downstream CI Status

.. image:: https://readthedocs.org/projects/asdf/badge/?version=latest
    :target: https://asdf.readthedocs.io/en/latest/

.. image:: https://codecov.io/gh/asdf-format/asdf/branch/master/graphs/badge.svg
    :target: https://codecov.io/gh/asdf-format/asdf

.. image:: https://img.shields.io/pypi/l/asdf.svg
    :target: https://img.shields.io/pypi/l/asdf.svg

|

.. _begin-summary-text:

The **A**\ dvanced **S**\ cientific **D**\ ata **F**\ ormat (ASDF) is a
next-generation interchange format for scientific data. This package
contains the Python implementation of the ASDF Standard. More
information on the ASDF Standard itself can be found
`here <https://asdf-standard.readthedocs.io>`__.

The ASDF format has the following features:

* A hierarchical, human-readable metadata format (implemented using `YAML
  <http://yaml.org>`__)
* Numerical arrays are stored as binary data blocks which can be memory
  mapped. Data blocks can optionally be compressed.
* The structure of the data can be automatically validated using schemas
  (implemented using `JSON Schema <http://json-schema.org>`__)
* Native Python data types (numerical types, strings, dicts, lists) are
  serialized automatically
* ASDF can be extended to serialize custom data types

.. _end-summary-text:

ASDF is under active development `on GitHub
<https://github.com/asdf-format/asdf>`__. More information on contributing
can be found `below <#contributing>`__.

Overview
--------

This section outlines basic use cases of the ASDF package for creating
and reading ASDF files.

Creating a file
~~~~~~~~~~~~~~~

.. _begin-create-file-text:

We're going to store several `numpy` arrays and other data to an ASDF file. We
do this by creating a "tree", which is simply a `dict`, and we provide it as
input to the constructor of `AsdfFile`:

.. code:: python

    import asdf
    import numpy as np

    # Create some data
    sequence = np.arange(100)
    squares = sequence**2
    random = np.random.random(100)

    # Store the data in an arbitrarily nested dictionary
    tree = {
        'foo': 42,
        'name': 'Monty',
        'sequence': sequence,
        'powers': {'squares': squares},
        'random': random
    }

    # Create the ASDF file object from our data tree
    af = asdf.AsdfFile(tree)

    # Write the data to a new file
    af.write_to('example.asdf')

If we open the newly created file, we can see some of the key features
of ASDF on display:

::

    #ASDF 1.0.0
    #ASDF_STANDARD 1.2.0
    %YAML 1.1
    %TAG ! tag:stsci.edu:asdf/
    --- !core/asdf-1.1.0
    asdf_library: !core/software-1.0.0 {author: The ASDF Developers, homepage: 'http://github.com/asdf-format/asdf',
      name: asdf, version: 2.0.0}
    history:
      extensions:
      - !core/extension_metadata-1.0.0
        extension_class: asdf.extension.BuiltinExtension
        software: {name: asdf, version: 2.0.0}
    foo: 42
    name: Monty
    powers:
      squares: !core/ndarray-1.0.0
        source: 1
        datatype: int64
        byteorder: little
        shape: [100]
    random: !core/ndarray-1.0.0
      source: 2
      datatype: float64
      byteorder: little
      shape: [100]
    sequence: !core/ndarray-1.0.0
      source: 0
      datatype: int64
      byteorder: little
      shape: [100]
    ...

The metadata in the file mirrors the structure of the tree that was stored. It
is hierarchical and human-readable. Notice that metadata has been added to the
tree that was not explicitly given by the user. Notice also that the numerical
array data is not stored in the metadata tree itself. Instead, it is stored as
binary data blocks below the metadata section (not shown here).

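The layout described above, a text header followed by binary blocks, can be
illustrated without the ``asdf`` package. The following is a minimal sketch,
not the library's own parser: ``read_text_header`` is a hypothetical helper,
and it assumes that the YAML section ends at the first line consisting solely
of ``...``, as in the listing above. The demo file is a stand-in with the same
overall shape, not a valid ASDF file.

.. code:: python

    import os
    import tempfile

    # Assumption: the YAML document ends at the first line that is exactly "..."
    YAML_END_MARKER = b"..."


    def read_text_header(path):
        """Return the text lines up to and including the YAML end marker."""
        lines = []
        with open(path, "rb") as f:
            for raw in f:
                line = raw.rstrip(b"\r\n")
                lines.append(line.decode("utf-8"))
                if line == YAML_END_MARKER:
                    break  # stop before any binary data is touched
        return lines


    # Build a stand-in file: a short text header followed by arbitrary bytes
    fd, demo_path = tempfile.mkstemp(suffix=".txt")
    with os.fdopen(fd, "wb") as f:
        f.write(b"#ASDF 1.0.0\n%YAML 1.1\n---\nfoo: 42\n...\n")
        f.write(bytes(range(256)))  # placeholder for binary data blocks

    header = read_text_header(demo_path)
    os.remove(demo_path)
    print("\n".join(header))

In practice the ``asdf`` package does this parsing itself; the sketch only
shows that the metadata can be read as plain text while the binary section is
left untouched.
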
It is possible to compress the array data when writing the file:

.. code:: python

    af.write_to('compressed.asdf', all_array_compression='zlib')

Available compression algorithms are ``'zlib'``, ``'bzp2'``, and
``'lz4'``.

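To get a feel for what compression does to array data, the sketch below
compresses a buffer of 100 ``int64`` values with Python's standard ``zlib``
and ``bz2`` modules. It assumes that the ``'zlib'`` and ``'bzp2'`` labels
correspond to these standard codecs. It does not call ``asdf`` itself, and
``'lz4'`` is left out because it requires a third-party package.

.. code:: python

    import bz2
    import struct
    import zlib

    # 100 little-endian int64 values, like the 'squares' array above
    values = [n**2 for n in range(100)]
    raw = struct.pack("<100q", *values)

    # Compare the size of the raw buffer with its compressed forms
    sizes = {
        "uncompressed": len(raw),
        "zlib": len(zlib.compress(raw)),
        "bz2": len(bz2.compress(raw)),
    }
    for name, size in sizes.items():
        print(f"{name}: {size} bytes")

How much is saved depends heavily on the data: regular integer sequences
compress well, while random floating-point values usually do not.
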
.. _end-create-file-text:

Reading a file
~~~~~~~~~~~~~~

.. _begin-read-file-text:

To read an existing ASDF file, we simply use the top-level `open` function of
the `asdf` package:

.. code:: python

    import asdf

    af = asdf.open('example.asdf')

The `open` function also works as a context manager:

.. code:: python

    with asdf.open('example.asdf') as af:
        ...

To access the data stored in the file, use the top-level `AsdfFile.tree`
attribute:

.. code:: python

    >>> import asdf
    >>> af = asdf.open('example.asdf')
    >>> af.tree
    {'asdf_library': {'author': 'The ASDF Developers',
      'homepage': 'http://github.com/asdf-format/asdf',
      'name': 'asdf',
      'version': '1.3.1'},
     'foo': 42,
     'name': 'Monty',
     'powers': {'squares': <array (unloaded) shape: [100] dtype: int64>},
     'random': <array (unloaded) shape: [100] dtype: float64>,
     'sequence': <array (unloaded) shape: [100] dtype: int64>}

The tree is simply a Python `dict`, and nodes are accessed like any other
dictionary entry:

.. code:: python

    >>> af.tree['name']
    'Monty'
    >>> af.tree['powers']
    {'squares': <array (unloaded) shape: [100] dtype: int64>}

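Because the tree behaves like a nested `dict`, ordinary dictionary code can
walk it. The following is a small sketch using a hypothetical helper,
``iter_paths``, which is not part of ``asdf``. It runs on a plain-dict
stand-in for ``af.tree``, with a list in place of the array, so it needs no
ASDF file.

.. code:: python

    def iter_paths(node, prefix=()):
        """Yield (path, value) pairs for every leaf of a nested dict."""
        if isinstance(node, dict):
            for key, value in node.items():
                yield from iter_paths(value, prefix + (key,))
        else:
            yield prefix, node


    # Stand-in for af.tree; a real tree would hold arrays instead of lists
    tree = {
        "foo": 42,
        "name": "Monty",
        "powers": {"squares": [0, 1, 4, 9]},
    }

    # Map slash-separated paths to their leaf values
    paths = {"/".join(path): value for path, value in iter_paths(tree)}
    for path in sorted(paths):
        print(path)

The same helper should work on a tree read from a file, as long as its nodes
are dictionaries as described above.
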
Array data remains unloaded until it is explicitly accessed:

.. code:: python

    >>> af.tree['powers']['squares']
    array([   0,    1,    4,    9,   16,   25,   36,   49,   64,   81,  100,
            121,  144,  169,  196,  225,  256,  289,  324,  361,  400,  441,
            484,  529,  576,  625,  676,  729,  784,  841,  900,  961, 1024,
           1089, 1156, 1225, 1296, 1369, 1444, 1521, 1600, 1681, 1764, 1849,
           1936, 2025, 2116, 2209, 2304, 2401, 2500, 2601, 2704, 2809, 2916,
           3025, 3136, 3249, 3364, 3481, 3600, 3721, 3844, 3969, 4096, 4225,
           4356, 4489, 4624, 4761, 4900, 5041, 5184, 5329, 5476, 5625, 5776,
           5929, 6084, 6241, 6400, 6561, 6724, 6889, 7056, 7225, 7396, 7569,
           7744, 7921, 8100, 8281, 8464, 8649, 8836, 9025, 9216, 9409, 9604,
           9801])

    >>> import numpy as np
    >>> expected = [x**2 for x in range(100)]
    >>> np.equal(af.tree['powers']['squares'], expected).all()
    True

By default, uncompressed data blocks are memory mapped for efficient
access. Memory mapping can be disabled by using the ``copy_arrays``
option of `open` when reading:

.. code:: python

    af = asdf.open('example.asdf', copy_arrays=True)

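For readers unfamiliar with memory mapping, the idea can be sketched with
Python's standard ``mmap`` module. This is an illustration of the general
technique only, not of how ``asdf`` implements it: a file of 100 ``int64``
values is mapped into memory, and a single element is read without loading
the rest of the file.

.. code:: python

    import mmap
    import os
    import struct
    import tempfile

    # Write 100 little-endian int64 squares to a temporary file
    fd, path = tempfile.mkstemp()
    with os.fdopen(fd, "wb") as f:
        f.write(struct.pack("<100q", *(n**2 for n in range(100))))

    with open(path, "rb") as f:
        with mmap.mmap(f.fileno(), 0, access=mmap.ACCESS_READ) as mapped:
            # Read element 10 straight from the mapping (8 bytes per int64)
            (tenth,) = struct.unpack_from("<q", mapped, 10 * 8)

    os.remove(path)
    print(tenth)  # prints 100

Only the pages that are actually touched need to be read from disk, which is
why mapping suits large arrays that are accessed in part.
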
.. _end-read-file-text:

For more information and for advanced usage examples, see the
`documentation <#documentation>`__.

Extending ASDF
~~~~~~~~~~~~~~

Out of the box, the ``asdf`` package automatically serializes and
deserializes native Python types. It is possible to extend ``asdf`` by
implementing custom tag types that correspond to custom user types. More
information on extending ASDF can be found in the `official
documentation <http://asdf.readthedocs.io/en/latest/asdf/extensions.html>`__.

Installation
------------

.. _begin-pip-install-text:

Stable releases of the ASDF Python package are registered `at
PyPI <https://pypi.python.org/pypi/asdf>`__. The latest stable version
can be installed using ``pip``:

::

    $ pip install asdf

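One way to confirm that the installation succeeded is to query the installed
package metadata. The sketch below uses the standard ``importlib.metadata``
module (Python 3.8 or later is assumed); ``installed_version`` is a
hypothetical helper written for this example.

.. code:: python

    from importlib import metadata


    def installed_version(package):
        """Return the installed version of ``package``, or None if absent."""
        try:
            return metadata.version(package)
        except metadata.PackageNotFoundError:
            return None


    # Prints a version string if asdf is installed, otherwise None
    print(installed_version("asdf"))

A result of ``None`` suggests that ``pip`` installed the package into a
different Python environment from the one running the check.
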
.. _begin-source-install-text:

The latest development version of ASDF is available from the ``master`` branch
`on GitHub <https://github.com/asdf-format/asdf>`__. To clone the project:

::

    $ git clone https://github.com/asdf-format/asdf

To install:

::

    $ cd asdf
    $ git submodule update --init
    $ pip install .

To install in `development
mode <https://packaging.python.org/tutorials/distributing-packages/#working-in-development-mode>`__::

    $ pip install -e .

.. note::

    The source repository makes use of a git submodule for referencing the
    schemas provided by the ASDF standard. While this submodule is
    automatically initialized when installing the package (including in
    development mode), it may be necessary for developers to manually update
    the submodule if changes are made upstream. See the `documentation on git
    submodules <https://git-scm.com/docs/git-submodule>`__ for more
    information.

.. _end-source-install-text:

Testing
-------

.. _begin-testing-text:

To install the test dependencies from a source checkout of the repository:

::

    $ pip install -e .[tests]

To run the unit tests from a source checkout of the repository:

::

    $ pytest

It is also possible to run the test suite from an installed version of
the package:

::

    $ pip install asdf[tests]
    $ pytest --pyargs asdf

The tests can also be run using `tox
<https://tox.readthedocs.io/en/latest/>`__. To install it:

::

    $ pip install tox

To list all available environments:

::

    $ tox -va

To run a specific environment:

::

    $ tox -e <envname>


.. _end-testing-text:

Documentation
-------------

More detailed documentation on this software package can be found
`here <https://asdf.readthedocs.io>`__.

More information on the ASDF Standard itself can be found
`here <https://asdf-standard.readthedocs.io>`__.

There are two mailing lists for ASDF:

* `asdf-users <https://groups.google.com/forum/#!forum/asdf-users>`_
* `asdf-developers <https://groups.google.com/forum/#!forum/asdf-developers>`_

    If you are looking for the **A**\ daptable **S**\ eismic **D**\ ata
    **F**\ ormat, information can be found
    `here <https://seismic-data.org/>`__.

Contributing
------------

We welcome feedback and contributions to the project. Contributions of
code, documentation, or general feedback are all appreciated. Please
follow the `contributing guidelines <CONTRIBUTING.md>`__ to submit an
issue or a pull request.

We strive to provide a welcoming community to all of our users by
abiding by the `Code of Conduct <CODE_OF_CONDUCT.md>`__.
