1# emacs: -*- mode: python-mode; py-indent-offset: 4; indent-tabs-mode: nil -*- 2# vi: set ft=python sts=4 ts=4 sw=4 et: 3### ### ### ### ### ### ### ### ### ### ### ### ### ### ### ### ### ### ### ## 4# 5# See COPYING file distributed along with the NiBabel package for the 6# copyright and license terms. 7# 8### ### ### ### ### ### ### ### ### ### ### ### ### ### ### ### ### ### ### ## 9""" 10NetCDF reader/writer module. 11 12This module is used to read and create NetCDF files. NetCDF files are 13accessed through the `netcdf_file` object. Data written to and from NetCDF 14files are contained in `netcdf_variable` objects. Attributes are given 15as member variables of the `netcdf_file` and `netcdf_variable` objects. 16 17Notes 18----- 19NetCDF files are a self-describing binary data format. The file contains 20metadata that describes the dimensions and variables in the file. More 21details about NetCDF files can be found `here 22<http://www.unidata.ucar.edu/software/netcdf/docs/netcdf.html>`_. There 23are three main sections to a NetCDF data structure: 24 251. Dimensions 262. Variables 273. Attributes 28 29The dimensions section records the name and length of each dimension used 30by the variables. The variables would then indicate which dimensions it 31uses and any attributes such as data units, along with containing the data 32values for the variable. It is good practice to include a 33variable that is the same name as a dimension to provide the values for 34that axes. Lastly, the attributes section would contain additional 35information such as the name of the file creator or the instrument used to 36collect the data. 37 38When writing data to a NetCDF file, there is often the need to indicate the 39'record dimension'. A record dimension is the unbounded dimension for a 40variable. For example, a temperature variable may have dimensions of 41latitude, longitude and time. If one wants to add more temperature data to 42the NetCDF file as time progresses, then the temperature variable should 43have the time dimension flagged as the record dimension. 44 45This module implements the Scientific.IO.NetCDF API to read and create 46NetCDF files. The same API is also used in the PyNIO and pynetcdf 47modules, allowing these modules to be used interchangeably when working 48with NetCDF files. The major advantage of this module over other 49modules is that it doesn't require the code to be linked to the NetCDF 50C libraries. 51 52The code is based on the `NetCDF file format specification 53<http://www.unidata.ucar.edu/software/netcdf/docs/netcdf.html>`_. A 54NetCDF file is a self-describing binary format, with a header followed 55by data. The header contains metadata describing dimensions, variables 56and the position of the data in the file, so access can be done in an 57efficient manner without loading unnecessary data into memory. We use 58the ``mmap`` module to create Numpy arrays mapped to the data on disk, 59for the same purpose. 60 61The structure of a NetCDF file is as follows: 62 63 C D F <VERSION BYTE> <NUMBER OF RECORDS> 64 <DIMENSIONS> <GLOBAL ATTRIBUTES> <VARIABLES METADATA> 65 <NON-RECORD DATA> <RECORD DATA> 66 67Record data refers to data where the first axis can be expanded at 68will. All record variables share a same dimension at the first axis, 69and they are stored at the end of the file per record, ie 70 71 A[0], B[0], ..., A[1], B[1], ..., etc, 72 73so that new data can be appended to the file without changing its original 74structure. Non-record data are padded to a 4n bytes boundary. Record data 75are also padded, unless there is exactly one record variable in the file, 76in which case the padding is dropped. All data is stored in big endian 77byte order. 78 79The Scientific.IO.NetCDF API allows attributes to be added directly to 80instances of ``netcdf_file`` and ``netcdf_variable``. To differentiate 81between user-set attributes and instance attributes, user-set attributes 82are automatically stored in the ``_attributes`` attribute by overloading 83``__setattr__``. This is the reason why the code sometimes uses 84``obj.__dict__['key'] = value``, instead of simply ``obj.key = value``; 85otherwise the key would be inserted into userspace attributes. 86 87In addition, the NetCDF file header contains the position of the data in 88the file, so access can be done in an efficient manner without loading 89unnecessary data into memory. It uses the ``mmap`` module to create 90Numpy arrays mapped to the data on disk, for the same purpose. 91 92Examples 93-------- 94To create a NetCDF file: 95 96Make a temporary file for testing: 97 98 >>> import os 99 >>> from tempfile import mkdtemp 100 >>> tmp_pth = mkdtemp() 101 >>> fname = os.path.join(tmp_pth, 'test.nc') 102 103Then: 104 105 >>> f = netcdf_file(fname, 'w') 106 >>> f.history = 'Created for a test' 107 >>> f.createDimension('time', 10) 108 >>> time = f.createVariable('time', 'i', ('time',)) 109 >>> time[:] = range(10) 110 >>> time.units = 'days since 2008-01-01' 111 >>> f.close() 112 113Note the assignment of ``range(10)`` to ``time[:]``. Exposing the slice 114of the time variable allows for the data to be set in the object, rather 115than letting ``range(10)`` overwrite the ``time`` variable. 116 117To read the NetCDF file we just created: 118 119 >>> f = netcdf_file(fname, 'r') 120 >>> f.history #23dt next : bytes 121 'Created for a test' 122 >>> time = f.variables['time'] 123 >>> time.units #23dt next : bytes 124 'days since 2008-01-01' 125 >>> time.shape == (10,) 126 True 127 >>> print time[-1] 128 9 129 >>> f.close() 130 131 Delete our temporary directory and file: 132 133 >>> del f, time # needed for windows unlink 134 >>> os.unlink(fname) 135 >>> os.rmdir(tmp_pth) 136""" 137 138 139 140#TODO: 141# * properly implement ``_FillValue``. 142# * implement Jeff Whitaker's patch for masked variables. 143# * fix character variables. 144# * implement PAGESIZE for Python 2.6? 145 146#The Scientific.IO.NetCDF API allows attributes to be added directly to 147#instances of ``netcdf_file`` and ``netcdf_variable``. To differentiate 148#between user-set attributes and instance attributes, user-set attributes 149#are automatically stored in the ``_attributes`` attribute by overloading 150#``__setattr__``. This is the reason why the code sometimes uses 151#``obj.__dict__['key'] = value``, instead of simply ``obj.key = value``; 152#otherwise the key would be inserted into userspace attributes. 153 154 155__all__ = ['netcdf_file'] 156 157 158from operator import mul 159from mmap import mmap, ACCESS_READ 160 161import numpy as np 162from ..py3k import asbytes, asstr 163from numpy import fromstring, ndarray, dtype, empty, array, asarray 164from numpy import little_endian as LITTLE_ENDIAN 165 166 167ABSENT = asbytes('\x00\x00\x00\x00\x00\x00\x00\x00') 168ZERO = asbytes('\x00\x00\x00\x00') 169NC_BYTE = asbytes('\x00\x00\x00\x01') 170NC_CHAR = asbytes('\x00\x00\x00\x02') 171NC_SHORT = asbytes('\x00\x00\x00\x03') 172NC_INT = asbytes('\x00\x00\x00\x04') 173NC_FLOAT = asbytes('\x00\x00\x00\x05') 174NC_DOUBLE = asbytes('\x00\x00\x00\x06') 175NC_DIMENSION = asbytes('\x00\x00\x00\n') 176NC_VARIABLE = asbytes('\x00\x00\x00\x0b') 177NC_ATTRIBUTE = asbytes('\x00\x00\x00\x0c') 178 179 180TYPEMAP = { NC_BYTE: ('b', 1), 181 NC_CHAR: ('c', 1), 182 NC_SHORT: ('h', 2), 183 NC_INT: ('i', 4), 184 NC_FLOAT: ('f', 4), 185 NC_DOUBLE: ('d', 8) } 186 187REVERSE = { ('b', 1): NC_BYTE, 188 ('B', 1): NC_CHAR, 189 ('c', 1): NC_CHAR, 190 ('h', 2): NC_SHORT, 191 ('i', 4): NC_INT, 192 ('f', 4): NC_FLOAT, 193 ('d', 8): NC_DOUBLE, 194 195 # these come from asarray(1).dtype.char and asarray('foo').dtype.char, 196 # used when getting the types from generic attributes. 197 ('l', 4): NC_INT, 198 ('S', 1): NC_CHAR } 199 200 201class netcdf_file(object): 202 """ 203 A file object for NetCDF data. 204 205 A `netcdf_file` object has two standard attributes: `dimensions` and 206 `variables`. The values of both are dictionaries, mapping dimension 207 names to their associated lengths and variable names to variables, 208 respectively. Application programs should never modify these 209 dictionaries. 210 211 All other attributes correspond to global attributes defined in the 212 NetCDF file. Global file attributes are created by assigning to an 213 attribute of the `netcdf_file` object. 214 215 Parameters 216 ---------- 217 filename : string or file-like 218 string -> filename 219 mode : {'r', 'w'}, optional 220 read-write mode, default is 'r' 221 mmap : None or bool, optional 222 Whether to mmap `filename` when reading. Default is True 223 when `filename` is a file name, False when `filename` is a 224 file-like object 225 version : {1, 2}, optional 226 version of netcdf to read / write, where 1 means *Classic 227 format* and 2 means *64-bit offset format*. Default is 1. See 228 `here <http://www.unidata.ucar.edu/software/netcdf/docs/netcdf/Which-Format.html>`_ 229 for more info. 230 231 """ 232 def __init__(self, filename, mode='r', mmap=None, version=1): 233 """Initialize netcdf_file from fileobj (str or file-like). 234 235 Parameters 236 ---------- 237 filename : string or file-like 238 string -> filename 239 mode : {'r', 'w'}, optional 240 read-write mode, default is 'r' 241 mmap : None or bool, optional 242 Whether to mmap `filename` when reading. Default is True 243 when `filename` is a file name, False when `filename` is a 244 file-like object 245 version : {1, 2}, optional 246 version of netcdf to read / write, where 1 means *Classic 247 format* and 2 means *64-bit offset format*. Default is 1. See 248 http://www.unidata.ucar.edu/software/netcdf/docs/netcdf/Which-Format.html#Which-Format 249 """ 250 if hasattr(filename, 'seek'): # file-like 251 self.fp = filename 252 self.filename = 'None' 253 if mmap is None: 254 mmap = False 255 elif mmap and not hasattr(filename, 'fileno'): 256 raise ValueError('Cannot use file object for mmap') 257 else: # maybe it's a string 258 self.filename = filename 259 self.fp = open(self.filename, '%sb' % mode) 260 if mmap is None: 261 mmap = True 262 self.use_mmap = mmap 263 self.version_byte = version 264 265 if not mode in 'rw': 266 raise ValueError("Mode must be either 'r' or 'w'.") 267 self.mode = mode 268 269 self.dimensions = {} 270 self.variables = {} 271 272 self._dims = [] 273 self._recs = 0 274 self._recsize = 0 275 276 self._attributes = {} 277 278 if mode == 'r': 279 self._read() 280 281 def __setattr__(self, attr, value): 282 # Store user defined attributes in a separate dict, 283 # so we can save them to file later. 284 try: 285 self._attributes[attr] = value 286 except AttributeError: 287 pass 288 self.__dict__[attr] = value 289 290 def close(self): 291 """Closes the NetCDF file.""" 292 try: 293 if self.fp.closed: 294 return 295 except AttributeError: # gzip files don't have closed attr 296 pass 297 try: 298 self.flush() 299 finally: 300 self.fp.close() 301 __del__ = close 302 303 def createDimension(self, name, length): 304 """ 305 Adds a dimension to the Dimension section of the NetCDF data structure. 306 307 Note that this function merely adds a new dimension that the variables can 308 reference. The values for the dimension, if desired, should be added as 309 a variable using `createVariable`, referring to this dimension. 310 311 Parameters 312 ---------- 313 name : str 314 Name of the dimension (Eg, 'lat' or 'time'). 315 length : int 316 Length of the dimension. 317 318 See Also 319 -------- 320 createVariable 321 322 """ 323 self.dimensions[name] = length 324 self._dims.append(name) 325 326 def createVariable(self, name, type, dimensions): 327 """ 328 Create an empty variable for the `netcdf_file` object, specifying its data 329 type and the dimensions it uses. 330 331 Parameters 332 ---------- 333 name : str 334 Name of the new variable. 335 type : dtype or str 336 Data type of the variable. 337 dimensions : sequence of str 338 List of the dimension names used by the variable, in the desired order. 339 340 Returns 341 ------- 342 variable : netcdf_variable 343 The newly created ``netcdf_variable`` object. 344 This object has also been added to the `netcdf_file` object as well. 345 346 See Also 347 -------- 348 createDimension 349 350 Notes 351 ----- 352 Any dimensions to be used by the variable should already exist in the 353 NetCDF data structure or should be created by `createDimension` prior to 354 creating the NetCDF variable. 355 356 """ 357 shape = tuple([self.dimensions[dim] for dim in dimensions]) 358 shape_ = tuple([dim or 0 for dim in shape]) # replace None with 0 for numpy 359 360 if isinstance(type, basestring): type = dtype(type) 361 typecode, size = type.char, type.itemsize 362 if (typecode, size) not in REVERSE: 363 raise ValueError("NetCDF 3 does not support type %s" % type) 364 dtype_ = '>%s' % typecode 365 if size > 1: dtype_ += str(size) 366 367 data = empty(shape_, dtype=dtype_) 368 self.variables[name] = netcdf_variable(data, typecode, size, shape, dimensions) 369 return self.variables[name] 370 371 def flush(self): 372 """ 373 Perform a sync-to-disk flush if the `netcdf_file` object is in write mode. 374 375 See Also 376 -------- 377 sync : Identical function 378 379 """ 380 if hasattr(self, 'mode') and self.mode is 'w': 381 self._write() 382 sync = flush 383 384 def _write(self): 385 self.fp.write(asbytes('CDF')) 386 self.fp.write(array(self.version_byte, '>b').tostring()) 387 388 # Write headers and data. 389 self._write_numrecs() 390 self._write_dim_array() 391 self._write_gatt_array() 392 self._write_var_array() 393 394 def _write_numrecs(self): 395 # Get highest record count from all record variables. 396 for var in self.variables.values(): 397 if var.isrec and len(var.data) > self._recs: 398 self.__dict__['_recs'] = len(var.data) 399 self._pack_int(self._recs) 400 401 def _write_dim_array(self): 402 if self.dimensions: 403 self.fp.write(NC_DIMENSION) 404 self._pack_int(len(self.dimensions)) 405 for name in self._dims: 406 self._pack_string(name) 407 length = self.dimensions[name] 408 self._pack_int(length or 0) # replace None with 0 for record dimension 409 else: 410 self.fp.write(ABSENT) 411 412 def _write_gatt_array(self): 413 self._write_att_array(self._attributes) 414 415 def _write_att_array(self, attributes): 416 if attributes: 417 self.fp.write(NC_ATTRIBUTE) 418 self._pack_int(len(attributes)) 419 for name, values in attributes.items(): 420 self._pack_string(name) 421 self._write_values(values) 422 else: 423 self.fp.write(ABSENT) 424 425 def _write_var_array(self): 426 if self.variables: 427 self.fp.write(NC_VARIABLE) 428 self._pack_int(len(self.variables)) 429 430 # Sort variables non-recs first, then recs. We use a DSU 431 # since some people use pupynere with Python 2.3.x. 432 deco = [ (v._shape and not v.isrec, k) for (k, v) in self.variables.items() ] 433 deco.sort() 434 variables = [ k for (unused, k) in deco ][::-1] 435 436 # Set the metadata for all variables. 437 for name in variables: 438 self._write_var_metadata(name) 439 # Now that we have the metadata, we know the vsize of 440 # each record variable, so we can calculate recsize. 441 self.__dict__['_recsize'] = sum([ 442 var._vsize for var in self.variables.values() 443 if var.isrec]) 444 # Set the data for all variables. 445 for name in variables: 446 self._write_var_data(name) 447 else: 448 self.fp.write(ABSENT) 449 450 def _write_var_metadata(self, name): 451 var = self.variables[name] 452 453 self._pack_string(name) 454 self._pack_int(len(var.dimensions)) 455 for dimname in var.dimensions: 456 dimid = self._dims.index(dimname) 457 self._pack_int(dimid) 458 459 self._write_att_array(var._attributes) 460 461 nc_type = REVERSE[var.typecode(), var.itemsize()] 462 self.fp.write(asbytes(nc_type)) 463 464 if not var.isrec: 465 vsize = var.data.size * var.data.itemsize 466 vsize += -vsize % 4 467 else: # record variable 468 try: 469 vsize = var.data[0].size * var.data.itemsize 470 except IndexError: 471 vsize = 0 472 rec_vars = len([var for var in self.variables.values() 473 if var.isrec]) 474 if rec_vars > 1: 475 vsize += -vsize % 4 476 self.variables[name].__dict__['_vsize'] = vsize 477 self._pack_int(vsize) 478 479 # Pack a bogus begin, and set the real value later. 480 self.variables[name].__dict__['_begin'] = self.fp.tell() 481 self._pack_begin(0) 482 483 def _write_var_data(self, name): 484 var = self.variables[name] 485 486 # Set begin in file header. 487 the_beguine = self.fp.tell() 488 self.fp.seek(var._begin) 489 self._pack_begin(the_beguine) 490 self.fp.seek(the_beguine) 491 492 # Write data. 493 if not var.isrec: 494 self.fp.write(var.data.tostring()) 495 count = var.data.size * var.data.itemsize 496 self.fp.write(asbytes('0') * (var._vsize - count)) 497 else: # record variable 498 # Handle rec vars with shape[0] < nrecs. 499 if self._recs > len(var.data): 500 shape = (self._recs,) + var.data.shape[1:] 501 var.data.resize(shape) 502 503 pos0 = pos = self.fp.tell() 504 for rec in var.data: 505 # Apparently scalars cannot be converted to big endian. If we 506 # try to convert a ``=i4`` scalar to, say, '>i4' the dtype 507 # will remain as ``=i4``. 508 if not rec.shape and (rec.dtype.byteorder == '<' or 509 (rec.dtype.byteorder == '=' and LITTLE_ENDIAN)): 510 rec = rec.byteswap() 511 self.fp.write(rec.tostring()) 512 # Padding 513 count = rec.size * rec.itemsize 514 self.fp.write(asbytes('0') * (var._vsize - count)) 515 pos += self._recsize 516 self.fp.seek(pos) 517 self.fp.seek(pos0 + var._vsize) 518 519 def _write_values(self, values): 520 if hasattr(values, 'dtype'): 521 nc_type = REVERSE[values.dtype.char, values.dtype.itemsize] 522 else: 523 types = [ 524 (int, NC_INT), 525 (long, NC_INT), 526 (float, NC_FLOAT), 527 (basestring, NC_CHAR), 528 ] 529 try: 530 sample = values[0] 531 except TypeError: 532 sample = values 533 for class_, nc_type in types: 534 if isinstance(sample, class_): break 535 536 typecode, size = TYPEMAP[nc_type] 537 dtype_ = '>%s' % typecode 538 539 values = asarray(values, dtype=dtype_) 540 541 self.fp.write(asbytes(nc_type)) 542 543 if values.dtype.char == 'S': 544 nelems = values.itemsize 545 else: 546 nelems = values.size 547 self._pack_int(nelems) 548 549 if not values.shape and (values.dtype.byteorder == '<' or 550 (values.dtype.byteorder == '=' and LITTLE_ENDIAN)): 551 values = values.byteswap() 552 self.fp.write(values.tostring()) 553 count = values.size * values.itemsize 554 self.fp.write(asbytes('0') * (-count % 4)) # pad 555 556 def _read(self): 557 # Check magic bytes and version 558 magic = self.fp.read(3) 559 if not magic == asbytes('CDF'): 560 raise TypeError("Error: %s is not a valid NetCDF 3 file" % 561 self.filename) 562 self.__dict__['version_byte'] = fromstring(self.fp.read(1), '>b')[0] 563 564 # Read file headers and set data. 565 self._read_numrecs() 566 self._read_dim_array() 567 self._read_gatt_array() 568 self._read_var_array() 569 570 def _read_numrecs(self): 571 self.__dict__['_recs'] = self._unpack_int() 572 573 def _read_dim_array(self): 574 header = self.fp.read(4) 575 if not header in [ZERO, NC_DIMENSION]: 576 raise ValueError("Unexpected header.") 577 count = self._unpack_int() 578 579 for dim in range(count): 580 name = asstr(self._unpack_string()) 581 length = self._unpack_int() or None # None for record dimension 582 self.dimensions[name] = length 583 self._dims.append(name) # preserve order 584 585 def _read_gatt_array(self): 586 for k, v in self._read_att_array().items(): 587 self.__setattr__(k, v) 588 589 def _read_att_array(self): 590 header = self.fp.read(4) 591 if not header in [ZERO, NC_ATTRIBUTE]: 592 raise ValueError("Unexpected header.") 593 count = self._unpack_int() 594 595 attributes = {} 596 for attr in range(count): 597 name = asstr(self._unpack_string()) 598 attributes[name] = self._read_values() 599 return attributes 600 601 def _read_var_array(self): 602 header = self.fp.read(4) 603 if not header in [ZERO, NC_VARIABLE]: 604 raise ValueError("Unexpected header.") 605 606 begin = 0 607 dtypes = {'names': [], 'formats': []} 608 rec_vars = [] 609 count = self._unpack_int() 610 for var in range(count): 611 (name, dimensions, shape, attributes, 612 typecode, size, dtype_, begin_, vsize) = self._read_var() 613 # http://www.unidata.ucar.edu/software/netcdf/docs/netcdf.html 614 # Note that vsize is the product of the dimension lengths 615 # (omitting the record dimension) and the number of bytes 616 # per value (determined from the type), increased to the 617 # next multiple of 4, for each variable. If a record 618 # variable, this is the amount of space per record. The 619 # netCDF "record size" is calculated as the sum of the 620 # vsize's of all the record variables. 621 # 622 # The vsize field is actually redundant, because its value 623 # may be computed from other information in the header. The 624 # 32-bit vsize field is not large enough to contain the size 625 # of variables that require more than 2^32 - 4 bytes, so 626 # 2^32 - 1 is used in the vsize field for such variables. 627 if shape and shape[0] is None: # record variable 628 rec_vars.append(name) 629 # The netCDF "record size" is calculated as the sum of 630 # the vsize's of all the record variables. 631 self.__dict__['_recsize'] += vsize 632 if begin == 0: begin = begin_ 633 dtypes['names'].append(name) 634 dtypes['formats'].append(str(shape[1:]) + dtype_) 635 636 # Handle padding with a virtual variable. 637 if typecode in 'bch': 638 actual_size = reduce(mul, (1,) + shape[1:]) * size 639 padding = -actual_size % 4 640 if padding: 641 dtypes['names'].append('_padding_%d' % var) 642 dtypes['formats'].append('(%d,)>b' % padding) 643 644 # Data will be set later. 645 data = None 646 else: # not a record variable 647 # Calculate size to avoid problems with vsize (above) 648 a_size = reduce(mul, shape, 1) * size 649 if self.use_mmap: 650 mm = mmap(self.fp.fileno(), begin_+a_size, access=ACCESS_READ) 651 data = ndarray.__new__(ndarray, shape, dtype=dtype_, 652 buffer=mm, offset=begin_, order=0) 653 else: 654 pos = self.fp.tell() 655 self.fp.seek(begin_) 656 data = fromstring(self.fp.read(a_size), dtype=dtype_) 657 data.shape = shape 658 self.fp.seek(pos) 659 660 # Add variable. 661 self.variables[name] = netcdf_variable( 662 data, typecode, size, shape, dimensions, attributes) 663 664 if rec_vars: 665 # Remove padding when only one record variable. 666 if len(rec_vars) == 1: 667 dtypes['names'] = dtypes['names'][:1] 668 dtypes['formats'] = dtypes['formats'][:1] 669 670 # Build rec array. 671 if self.use_mmap: 672 mm = mmap(self.fp.fileno(), begin+self._recs*self._recsize, access=ACCESS_READ) 673 rec_array = ndarray.__new__(ndarray, (self._recs,), dtype=dtypes, 674 buffer=mm, offset=begin, order=0) 675 else: 676 pos = self.fp.tell() 677 self.fp.seek(begin) 678 rec_array = fromstring(self.fp.read(self._recs*self._recsize), dtype=dtypes) 679 rec_array.shape = (self._recs,) 680 self.fp.seek(pos) 681 682 for var in rec_vars: 683 self.variables[var].__dict__['data'] = rec_array[var] 684 685 def _read_var(self): 686 name = asstr(self._unpack_string()) 687 dimensions = [] 688 shape = [] 689 dims = self._unpack_int() 690 691 for i in range(dims): 692 dimid = self._unpack_int() 693 dimname = self._dims[dimid] 694 dimensions.append(dimname) 695 dim = self.dimensions[dimname] 696 shape.append(dim) 697 dimensions = tuple(dimensions) 698 shape = tuple(shape) 699 700 attributes = self._read_att_array() 701 nc_type = self.fp.read(4) 702 vsize = self._unpack_int() 703 begin = [self._unpack_int, self._unpack_int64][self.version_byte-1]() 704 705 typecode, size = TYPEMAP[nc_type] 706 dtype_ = '>%s' % typecode 707 708 return name, dimensions, shape, attributes, typecode, size, dtype_, begin, vsize 709 710 def _read_values(self): 711 nc_type = self.fp.read(4) 712 n = self._unpack_int() 713 714 typecode, size = TYPEMAP[nc_type] 715 716 count = n*size 717 values = self.fp.read(int(count)) 718 self.fp.read(-count % 4) # read padding 719 720 if typecode is not 'c': 721 values = fromstring(values, dtype='>%s' % typecode) 722 if values.shape == (1,): values = values[0] 723 else: 724 values = values.rstrip(asbytes('\x00')) 725 return values 726 727 def _pack_begin(self, begin): 728 if self.version_byte == 1: 729 self._pack_int(begin) 730 elif self.version_byte == 2: 731 self._pack_int64(begin) 732 733 def _pack_int(self, value): 734 self.fp.write(array(value, '>i').tostring()) 735 _pack_int32 = _pack_int 736 737 def _unpack_int(self): 738 return int(fromstring(self.fp.read(4), '>i')[0]) 739 _unpack_int32 = _unpack_int 740 741 def _pack_int64(self, value): 742 self.fp.write(array(value, '>q').tostring()) 743 744 def _unpack_int64(self): 745 return fromstring(self.fp.read(8), '>q')[0] 746 747 def _pack_string(self, s): 748 count = len(s) 749 self._pack_int(count) 750 self.fp.write(asbytes(s)) 751 self.fp.write(asbytes('0') * (-count % 4)) # pad 752 753 def _unpack_string(self): 754 count = self._unpack_int() 755 s = self.fp.read(count).rstrip(asbytes('\x00')) 756 self.fp.read(-count % 4) # read padding 757 return s 758 759 760class netcdf_variable(object): 761 """ 762 A data object for the `netcdf` module. 763 764 `netcdf_variable` objects are constructed by calling the method 765 `netcdf_file.createVariable` on the `netcdf_file` object. `netcdf_variable` 766 objects behave much like array objects defined in numpy, except that their 767 data resides in a file. Data is read by indexing and written by assigning 768 to an indexed subset; the entire array can be accessed by the index ``[:]`` 769 or (for scalars) by using the methods `getValue` and `assignValue`. 770 `netcdf_variable` objects also have attribute `shape` with the same meaning 771 as for arrays, but the shape cannot be modified. There is another read-only 772 attribute `dimensions`, whose value is the tuple of dimension names. 773 774 All other attributes correspond to variable attributes defined in 775 the NetCDF file. Variable attributes are created by assigning to an 776 attribute of the `netcdf_variable` object. 777 778 Parameters 779 ---------- 780 data : array_like 781 The data array that holds the values for the variable. 782 Typically, this is initialized as empty, but with the proper shape. 783 typecode : dtype character code 784 Desired data-type for the data array. 785 size : int 786 Desired element size for the data array. 787 shape : sequence of ints 788 The shape of the array. This should match the lengths of the 789 variable's dimensions. 790 dimensions : sequence of strings 791 The names of the dimensions used by the variable. Must be in the 792 same order of the dimension lengths given by `shape`. 793 attributes : dict, optional 794 Attribute values (any type) keyed by string names. These attributes 795 become attributes for the netcdf_variable object. 796 797 798 Attributes 799 ---------- 800 dimensions : list of str 801 List of names of dimensions used by the variable object. 802 isrec, shape 803 Properties 804 805 See also 806 -------- 807 isrec, shape 808 809 """ 810 def __init__(self, data, typecode, size, shape, dimensions, attributes=None): 811 self.data = data 812 self._typecode = typecode 813 self._size = size 814 self._shape = shape 815 self.dimensions = dimensions 816 817 self._attributes = attributes or {} 818 for k, v in self._attributes.items(): 819 self.__dict__[k] = v 820 821 def __setattr__(self, attr, value): 822 # Store user defined attributes in a separate dict, 823 # so we can save them to file later. 824 try: 825 self._attributes[attr] = value 826 except AttributeError: 827 pass 828 self.__dict__[attr] = value 829 830 def isrec(self): 831 """Returns whether the variable has a record dimension or not. 832 833 A record dimension is a dimension along which additional data could be 834 easily appended in the netcdf data structure without much rewriting of 835 the data file. This attribute is a read-only property of the 836 `netcdf_variable`. 837 838 """ 839 return self.data.shape and not self._shape[0] 840 isrec = property(isrec) 841 842 def shape(self): 843 """Returns the shape tuple of the data variable. 844 845 This is a read-only attribute and can not be modified in the 846 same manner of other numpy arrays. 847 """ 848 return self.data.shape 849 shape = property(shape) 850 851 def getValue(self): 852 """ 853 Retrieve a scalar value from a `netcdf_variable` of length one. 854 855 Raises 856 ------ 857 ValueError 858 If the netcdf variable is an array of length greater than one, 859 this exception will be raised. 860 861 """ 862 return self.data.item() 863 864 def assignValue(self, value): 865 """ 866 Assign a scalar value to a `netcdf_variable` of length one. 867 868 Parameters 869 ---------- 870 value : scalar 871 Scalar value (of compatible type) to assign to a length-one netcdf 872 variable. This value will be written to file. 873 874 Raises 875 ------ 876 ValueError 877 If the input is not a scalar, or if the destination is not a length-one 878 netcdf variable. 879 880 """ 881 self.data.itemset(value) 882 883 def typecode(self): 884 """ 885 Return the typecode of the variable. 886 887 Returns 888 ------- 889 typecode : char 890 The character typecode of the variable (eg, 'i' for int). 891 892 """ 893 return self._typecode 894 895 def itemsize(self): 896 """ 897 Return the itemsize of the variable. 898 899 Returns 900 ------- 901 itemsize : int 902 The element size of the variable (eg, 8 for float64). 903 904 """ 905 return self._size 906 907 def __getitem__(self, index): 908 return self.data[index] 909 910 def __setitem__(self, index, data): 911 # Expand data for record vars? 912 if self.isrec: 913 if isinstance(index, tuple): 914 rec_index = index[0] 915 else: 916 rec_index = index 917 if isinstance(rec_index, slice): 918 recs = (rec_index.start or 0) + len(data) 919 else: 920 recs = rec_index + 1 921 if recs > len(self.data): 922 shape = (recs,) + self._shape[1:] 923 self.data.resize(shape) 924 self.data[index] = data 925 926 927NetCDFFile = netcdf_file 928NetCDFVariable = netcdf_variable 929