.. _data-package-design:

Design of data packages for the nibabel and the nipy suite
==========================================================

See :ref:`data-package-discuss` for a more general discussion of design
issues.

When developing or using nipy, many data files can be useful.  We divide the
data files nipy uses into at least 3 categories:

#. *test data* - data files required for routine code testing
#. *template data* - data files required for algorithms to function,
   such as templates or atlases
#. *example data* - data files for running examples, or optional tests

Files used for routine testing are typically very small data files.  They are
shipped with the software, and live in the code repository.  For example, in
the case of ``nipy`` itself, there are some test files that live in the module
path ``nipy.testing.data``.  Nibabel ships data files in
``nibabel.tests.data``.  See :doc:`add_test_data` for discussion.

*template data* and *example data* are examples of *data packages*.  What
follows is a discussion of the design and use of data packages.

.. testsetup::

    # Make fake data and template directories
    import os
    from os.path import join as pjoin
    import tempfile
    tmpdir = tempfile.mkdtemp()
    os.environ['NIPY_USER_DIR'] = tmpdir
    for subdir in ('data', 'templates'):
        files_dir = pjoin(tmpdir, 'nipy', subdir)
        os.makedirs(files_dir)
        with open(pjoin(files_dir, 'config.ini'), 'wt') as fobj:
            fobj.write(
    """[DEFAULT]
    version = 0.2
    """)

Use cases for data packages
+++++++++++++++++++++++++++

Using the data package
``````````````````````

The programmer can use the data like this:

.. testcode::

   from nibabel.data import make_datasource

   templates = make_datasource(dict(relpath='nipy/templates'))
   fname = templates.get_filename('ICBM152', '2mm', 'T1.nii.gz')

where ``fname`` will be the absolute path to the template image
``ICBM152/2mm/T1.nii.gz``.

The programmer can insist on a particular version of a ``datasource``:

>>> if templates.version < '0.4':
...     raise ValueError('Need datasource version at least 0.4')
Traceback (most recent call last):
...
ValueError: Need datasource version at least 0.4

If the repository cannot find the data, then:

>>> make_datasource(dict(relpath='nipy/implausible'))
Traceback (most recent call last):
 ...
nibabel.data.DataError: ...

where ``DataError`` gives a helpful warning about why the data was not
found, and how it should be installed.

Warnings during installation
````````````````````````````

The example data and template data may be important, and so we want to warn
the user if NIPY cannot find either of the two sets of data when installing
the package.  Thus::

   python setup.py install

will import nipy after installation to check whether these raise an error:

>>> from nibabel.data import make_datasource
>>> templates = make_datasource(dict(relpath='nipy/templates'))
>>> example_data = make_datasource(dict(relpath='nipy/data'))

and warn the user accordingly, with some basic instructions for how to
install the data.

.. _find-data:

Finding the data
````````````````

The routine ``make_datasource`` will look for data packages that have been
installed.  For the following call:

>>> templates = make_datasource(dict(relpath='nipy/templates'))

the code will:

#. Get a list of paths where data is known to be stored with
   ``nibabel.data.get_data_path()``
#. For each of these paths, search for directory ``nipy/templates``.
   If found, and of the correct format (see below), return a datasource,
   otherwise raise an Exception

The paths collected by ``nibabel.data.get_data_path()`` are constructed from
':' (Unix) or ';' (Windows) separated strings.  The sources of the strings (in
the order in which they will be used in the search above) are:

#. The value of the ``NIPY_DATA_PATH`` environment variable, if set
#. A section = ``DATA``, parameter = ``path`` entry in a
   ``config.ini`` file in ``nipy_dir`` where ``nipy_dir`` is
   ``$HOME/.nipy`` or equivalent.
#. Section = ``DATA``, parameter = ``path`` entries in configuration
   ``.ini`` files, where the ``.ini`` files are found by
   ``glob.glob(os.path.join(etc_dir, '*.ini'))`` and ``etc_dir`` is
   ``/etc/nipy`` on Unix, and some suitable equivalent on Windows.
#. The result of ``os.path.join(sys.prefix, 'share', 'nipy')``
#. If ``sys.prefix`` is ``/usr``, we add ``/usr/local/share/nipy``.  We
   need this because Python >= 2.6 in Debian / Ubuntu does default installs to
   ``/usr/local``.
#. The result of ``get_nipy_user_dir()``

Requirements for a data package
```````````````````````````````

To be a valid NIPY project data package, you need to satisfy:

#. The installer installs the data in some place that can be found using
   the method defined in :ref:`find-data`.

We recommend that:

#. By default, you install data in a standard location such as
   ``<prefix>/share/nipy`` where ``<prefix>`` is the standard Python
   prefix obtained by ``>>> import sys; print(sys.prefix)``

Remember that there is a distinction between the NIPY project - the
umbrella of neuroimaging in python - and the NIPY package - the main
code package in the NIPY project.  Thus, if you want to install data
under the NIPY *package* umbrella, your data might go to
``/usr/share/nipy/nipy/packagename`` (on Unix).
Note ``nipy`` twice - once for the project, once for the package.  If you
want to install data under - say - the ``pbrain`` package umbrella, that would
go in ``/usr/share/nipy/pbrain/packagename``.

Data package format
```````````````````

The following tree is an example of the kind of pattern we would expect
in a data directory, where the ``nipy-data`` and ``nipy-templates``
packages have been installed::

  <ROOT>
  `-- nipy
      |-- data
      |   |-- config.ini
      |   `-- placeholder.txt
      `-- templates
          |-- ICBM152
          |   `-- 2mm
          |       `-- T1.nii.gz
          |-- colin27
          |   `-- 2mm
          |       `-- T1.nii.gz
          `-- config.ini

The ``<ROOT>`` directory is the directory that will appear somewhere in
the list from ``nibabel.data.get_data_path()``.  The ``nipy`` subdirectory
signifies data for the ``nipy`` package (as opposed to other
NIPY-related packages such as ``pbrain``).  The ``data`` subdirectory of
``nipy`` contains files from the ``nipy-data`` package.  Each of the
``nipy/data`` and ``nipy/templates`` directories contains a
``config.ini`` file that has at least an entry like this::

  [DEFAULT]
  version = 0.2

giving the version of the data package.

.. _data-package-design-install:

Installing the data
```````````````````

We use Python distutils to install data packages, and the ``data_files``
mechanism to install the data.  On Unix, with the following command::

   python setup.py install --prefix=/my/prefix

data will go to::

   /my/prefix/share/nipy

For the example above this will result in these subdirectories::

   /my/prefix/share/nipy/nipy/data
   /my/prefix/share/nipy/nipy/templates

because ``nipy`` is both the project, and the package to which the data
relates.
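
As a rough illustration of the ``data_files`` mechanism, the sketch below
builds the list of ``(target directory, files)`` pairs that a ``setup.py``
could pass to ``setup()``.  The helper name ``collect_data_files`` is
hypothetical and is not part of nipy or nibabel; it only shows how a source
tree can be mapped onto an install location such as
``share/nipy/nipy/templates``.

.. code-block:: python

    import os
    from os.path import join as pjoin


    def collect_data_files(src_root, install_root):
        """Return ``data_files``-style entries for every file under `src_root`.

        Hypothetical helper for illustration.  `install_root` is the target
        directory; a relative target is resolved by distutils against the
        install prefix.
        """
        entries = []
        for dirpath, dirnames, filenames in os.walk(src_root):
            dirnames.sort()  # walk subdirectories in a stable order
            if not filenames:
                continue  # directories holding only subdirectories add nothing
            rel = os.path.relpath(dirpath, src_root)
            # Mirror the source layout below the install root
            target = install_root if rel == os.curdir else pjoin(install_root, rel)
            entries.append((target, [pjoin(dirpath, f) for f in sorted(filenames)]))
        return entries

A ``setup.py`` could then call, for example,
``setup(..., data_files=collect_data_files('templates', pjoin('share', 'nipy', 'nipy', 'templates')))``.
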

If you install to a particular location, you will need to add that location to
the output of ``nibabel.data.get_data_path()`` using one of the mechanisms
above, for example, in your system configuration::

   export NIPY_DATA_PATH=/my/prefix/share/nipy

Packaging for distributions
```````````````````````````

For a particular data package - say ``nipy-templates`` - distributions
will want to:

#. Install the data in a set location.  The default from ``python setup.py
   install`` for the data packages will be ``/usr/share/nipy`` on Unix.
#. Point a system installation of NIPY to these data.

For the latter, the most obvious route is to copy an ``.ini`` file named for
the data package into the NIPY ``etc_dir``.  In this case, on Unix, we will
want a file called ``/etc/nipy/nipy_templates.ini`` with contents::

   [DATA]
   path = /usr/share/nipy

Current implementation
``````````````````````

This section describes how we (the nipy community) implement data packages at
the moment.

The data in the data packages will not usually be under source control.  This
is because images don't compress very well, and any change in the data will
result in a large extra storage cost in the repository.  If you are confident
that the data files will not change, then a repository could work well.

The data packages will be available at a central release location.  For now
this will be: http://nipy.org/data-packages/ .

A package, such as ``nipy-templates-0.2.tar.gz``, will have the following sort
of structure::

  <ROOT>
    |-- setup.py
    |-- README.txt
    |-- MANIFEST.in
    `-- templates
        |-- ICBM152
        |   |-- 1mm
        |   |   `-- T1_brain.nii.gz
        |   `-- 2mm
        |       `-- T1.nii.gz
        |-- colin27
        |   `-- 2mm
        |       `-- T1.nii.gz
        `-- config.ini

There should be only one ``nipy/packagename`` directory delivered by a
particular package.
For example, this package installs ``nipy/templates``,
but does not contain ``nipy/data``.

Making a new package tarball is simply:

#. Downloading and unpacking e.g. ``nipy-templates-0.1.tar.gz`` to form the
   directory structure above;
#. Making any changes to the directory;
#. Running ``setup.py sdist`` to recreate the package.

The process of making a release should be:

#. Increment the major or minor version number in the ``config.ini`` file;
#. Make a package tarball as above;
#. Upload to distribution site.

There is an example nipy data package ``nipy-examplepkg`` in the
``examples`` directory of the NIPY repository.

The machinery for creating and maintaining data packages is available at
https://github.com/nipy/data-packaging.

See the ``README.txt`` file there for more information.

.. testcleanup::

    import shutil
    shutil.rmtree(tmpdir)
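
The version increment in the release steps above could be scripted along
these lines.  This is only a sketch: the ``bump_minor_version`` helper is
hypothetical (it is not part of the data-packaging machinery), and it assumes
the version string has exactly two numeric components, as in ``0.2``.

.. code-block:: python

    import configparser


    def bump_minor_version(config_path):
        """Increment the minor version in a data package ``config.ini``.

        Hypothetical helper for illustration.  Assumes a ``[DEFAULT]`` section
        holding ``version = <major>.<minor>``; returns the new version string.
        """
        parser = configparser.ConfigParser()
        with open(config_path) as fobj:
            parser.read_file(fobj)
        # Raises ValueError if the version does not have exactly two parts
        major, minor = parser['DEFAULT']['version'].split('.')
        new_version = '{}.{}'.format(int(major), int(minor) + 1)
        parser['DEFAULT']['version'] = new_version
        with open(config_path, 'w') as fobj:
            parser.write(fobj)
        return new_version

Running this on the ``templates/config.ini`` file of an unpacked package
before ``setup.py sdist`` would cover the first release step; a major version
change would need the analogous edit to the first component.
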