/**
@if INTERNAL

@page inmemintern Internal Architecture for NC_INMEMORY Support

\tableofcontents

<!-- Note that this file has the .dox extension, but is mostly markdown -->
<!-- Begin MarkDown -->

# Introduction {#inmemintern_intro}

This document describes the internal workings
of the in-memory features of the netcdf-c library.
The companion document -- inmemory.md --
describes the "external" operation of the in-memory features.

This document describes how in-memory operation
is implemented for both netcdf-3 files and netcdf-4 files.

# Generic Capabilities {#inmemintern_general}

Both the netcdf-3 and netcdf-4 implementations assume that
they are initially given a (pointer,size) pair representing
a chunk of allocated memory of the specified size.

If a file is being created instead of opened, then only the size
is needed and the netcdf-c library will internally allocate the
corresponding memory chunk.

If NC_DISKLESS is being used, then a chunk of memory is allocated
whose size is the same as the length of the file, and the contents
of the file are then read into that chunk of memory.

This information is in general represented by the following struct
(see include/netcdf_mem.h).
````
typedef struct NC_memio {
    size_t size;
    void* memory;
    int flags;
} NC_memio;
````
The flags field describes properties and constraints to be applied
to the given memory. At the moment, only this one flag is defined.
````
#define NC_MEMIO_LOCKED 1
````
If this flag is set, then the netcdf library will ensure that
the original allocated memory is ```locked```, which means
that it will never be realloc'd nor free'd.
Note that this flag is ignored when creating a memory file: it is only
relevant when opening a pre-allocated chunk of memory via the
_nc_open_mem_ function.

Note that this flag does not prevent the memory from being modified.
If there is room, then the memory may be modified in place. If the size
of the memory needs to be increased and this flag is set, then
the operation will fail.

When the _nc_close_memio_ function is called instead of
_nc_close_, then the currently allocated memory (and its size)
is returned. If the _NC_MEMIO_LOCKED_ flag is set, then it
should be the case that the chunk of memory returned is the same
as originally provided. However, the size may be different
because it represents the amount of memory that contains
meaningful data; this value may be less than the originally provided size.
The actual allocated size for the memory chunk is the same as originally
provided, so if that value is needed, then the caller must save it somewhere.

Note also that ownership of the memory chunk is given to the
caller, and it is the caller's responsibility to _free_ the memory.

# NetCDF-4 Implementation {#inmemintern_nc4}

The implementation of in-memory support for netcdf-4 files
is quite complicated.

The netCDF-4 implementation relies on the HDF5 library. In order
to implement in-memory storage of data, the HDF5 core driver is
used to manage underlying storage of the netcdf-c file.

An HDF5 driver is an abstract interface that allows different
underlying storage implementations. So there is a standard file
driver as well as a core driver, which uses memory as the
underlying storage.

Generically, the memory is referred to as a file image [1].

## libhdf5/nc4mem

The primary API for in-memory operations is in the file
libhdf5/nc4mem.c, and the defined functions are described in the next sections.

### nc4mem.NC4_open_image_file

The signature is:
````
int NC4_open_image_file(NC_FILE_INFO_T* h5)
````
Basically, this function sets up the necessary state information
to use the HDF5 core driver.
It obtains the memory chunk and size from the _h5->mem.memio_ field.
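
The memio record read here carries the locking contract described under
Generic Capabilities. As a rough, self-contained illustration of that contract
(not the library's code), the sketch below repeats the _NC_memio_ struct and the
_NC_MEMIO_LOCKED_ flag from above and adds a hypothetical helper,
_sketch_ensure_capacity_, that is an assumption made only for this example.

```c
#include <assert.h>
#include <stdlib.h>

// Repeated from the struct shown earlier (include/netcdf_mem.h).
typedef struct NC_memio {
    size_t size;
    void* memory;
    int flags;
} NC_memio;
#define NC_MEMIO_LOCKED 1

// Hypothetical helper, not part of netcdf-c: make sure at least `needed`
// bytes are available in the chunk. Returns 0 on success, -1 on failure.
static int sketch_ensure_capacity(NC_memio* io, size_t needed)
{
    void* p;
    if (needed <= io->size)
        return 0;                 // there is room: modify in place
    if (io->flags & NC_MEMIO_LOCKED)
        return -1;                // locked memory is never realloc'd
    p = realloc(io->memory, needed);
    if (p == NULL)
        return -1;                // out of memory; original chunk untouched
    io->memory = p;
    io->size = needed;
    return 0;
}
```

With a locked 16-byte chunk, a request for 8 bytes succeeds and a request for
32 bytes fails while leaving the pointer and size unchanged; without the flag,
the same 32-byte request grows the chunk.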

Specifically, this function converts the
_NC_MEMIO_LOCKED_ flag into the HDF5 image-specific flags
_H5LT_FILE_IMAGE_DONT_COPY_ and _H5LT_FILE_IMAGE_DONT_RELEASE_.
It then invokes the _libhdf5/nc4memcb/NC4_image_init_
function to do the necessary HDF5-specific setup.

### nc4mem.NC4_create_image_file

The signature is:
````
int NC4_create_image_file(NC_FILE_INFO_T* h5, size_t initialsize)
````

This function sets up the necessary state information
to use the HDF5 core driver, but for a newly created file.
It initializes the size in the _h5->mem.memio_ field
from the _initialsize_ argument and leaves the memory chunk pointer NULL.
It ignores the _NC_MEMIO_LOCKED_ flag.
It then invokes the _libhdf5/nc4memcb/NC4_image_init_
function to do the necessary HDF5-specific setup.

### libhdf5/hdf5file.c/nc4_close_netcdf4_file

When a file is closed, this function is invoked. As part of its operation,
and if the file is an in-memory file, it does one of two things.

1. If the user provided an _NC_memio_ instance, then return the final image
in that instance; the user is then responsible for freeing it.
2. If no _NC_memio_ instance was provided, then just discard the final image.

## libhdf5/nc4memcb

The HDF5 core driver uses an abstract interface for managing the
allocation and free'ing of memory. This interface is defined
as a set of callback functions [2] that implement the functions
of this struct.

````
typedef struct {
    void *(*_malloc)(size_t size, H5_file_image_op_t op, void *udata);
    void *(*_memcpy)(void *dest, const void *src, size_t size,
                     H5_file_image_op_t op, void *udata);
    void *(*_realloc)(void *ptr, size_t size,
                      H5_file_image_op_t op, void *udata);
    herr_t (*_free)(void *ptr, H5_file_image_op_t op, void *udata);
    void *(*udata_copy)(void *udata);
    herr_t (*udata_free)(void *udata);
    void *udata;
} H5_file_image_callbacks_t;
````
The _udata_ field at the end defines any extra state needed by the functions.
Each function is passed the udata as its last argument. The structure of the
udata is arbitrary, and is passed as _void*_ to the functions.

The _udata_ structure and callback functions used by the netcdf-c library
are defined in the file _libhdf5/nc4memcb.c_. Setup is defined by the
function _NC4_image_init_ in that same file.

The _udata_ structure used by netcdf is as follows.
````
typedef struct {
    void *app_image_ptr;     /* Pointer to application buffer */
    size_t app_image_size;   /* Size of application buffer */
    void *fapl_image_ptr;    /* Pointer to FAPL buffer */
    size_t fapl_image_size;  /* Size of FAPL buffer */
    int fapl_ref_count;      /* Reference counter for FAPL buffer */
    void *vfd_image_ptr;     /* Pointer to VFD buffer (Note: VFD => used by core driver) */
    size_t vfd_image_size;   /* Size of VFD buffer */
    int vfd_ref_count;       /* Reference counter for VFD buffer */
    unsigned flags;          /* Flags indicate how the file image will be opened */
    int ref_count;           /* Reference counter on udata struct */
    NC_FILE_INFO_T* h5;      /* Pointer to the netcdf parent structure */
} H5LT_file_image_ud_t;
````

It is necessary to understand one more point about the callback functions.
The first four take an argument of type _H5_file_image_op_t_ -- the operator.
This is an enumeration that indicates additional context about the purpose for
which the callback is being invoked. For the purposes of the netcdf-4
implementation, only the following operators are used.

- H5FD_FILE_IMAGE_OP_PROPERTY_LIST_SET
- H5FD_FILE_IMAGE_OP_PROPERTY_LIST_COPY
- H5FD_FILE_IMAGE_OP_PROPERTY_LIST_GET
- H5FD_FILE_IMAGE_OP_PROPERTY_LIST_CLOSE
- H5FD_FILE_IMAGE_OP_FILE_OPEN
- H5FD_FILE_IMAGE_OP_FILE_RESIZE
- H5FD_FILE_IMAGE_OP_FILE_CLOSE

As can be seen, the operators indicate whether the operation is with respect to
an HDF5 property list or with respect to a file (i.e. a core image in this case).
For each callback described below, the per-operator actions will be described.
Not all operators are used with all callbacks.

Internally, the HDF5 core driver thinks it is doing the following:

1. Allocate memory and copy the incoming memory chunk into that newly allocated memory
   (call image_malloc followed by image_memcpy).
2. Periodically reallocate the memory to increase its size
   (call image_realloc).
3. Free the memory when it is no longer needed
   (call image_free).

It turns out that for property lists, realloc is never called.
However, the HDF5 core driver follows all of the above steps.

The following sections describe the callback function operation.

### libhdf5/nc4memcb/local_image_malloc

This function is called to allocate an internal chunk of memory so that
the original provided memory is no longer needed. In order to implement
the netcdf-c semantics, we modify this behavior.

#### Operator H5FD_FILE_IMAGE_OP_PROPERTY_LIST_SET
We assume that the property list image info will never need to be modified,
so we just copy the incoming buffer info (the app_image fields) into the fapl_image fields.

#### Operator H5FD_FILE_IMAGE_OP_PROPERTY_LIST_COPY
Basically just return the fapl_image_ptr field, so no actual copying.
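
The two property-list operators described so far amount to pointer bookkeeping
rather than allocation. The following is a minimal sketch of that idea only. It
uses simplified stand-in types (_sketch_op_t_, _sketch_udata_t_) and a
hypothetical callback name (_sketch_image_malloc_) instead of the real HDF5 and
netcdf-c declarations, and it leaves out the reference counts and any size
checking.

```c
#include <assert.h>
#include <stddef.h>

// Stand-in for the HDF5 operator enumeration; only two cases are modeled.
typedef enum {
    SKETCH_OP_PROPERTY_LIST_SET,
    SKETCH_OP_PROPERTY_LIST_COPY
} sketch_op_t;

// Trimmed-down stand-in for the udata struct shown earlier.
typedef struct {
    void*  app_image_ptr;    // buffer supplied by the application
    size_t app_image_size;
    void*  fapl_image_ptr;   // buffer as seen by the property list
    size_t fapl_image_size;
} sketch_udata_t;

// Hypothetical malloc-style callback: it never allocates, it only hands
// back the buffer the application already supplied.
static void* sketch_image_malloc(size_t size, sketch_op_t op, void* udata)
{
    sketch_udata_t* ud = (sketch_udata_t*)udata;
    (void)size; // assumption: the requested size is not checked in this sketch
    switch (op) {
    case SKETCH_OP_PROPERTY_LIST_SET:
        // Copy the app_image fields into the fapl_image fields.
        ud->fapl_image_ptr = ud->app_image_ptr;
        ud->fapl_image_size = ud->app_image_size;
        return ud->fapl_image_ptr;
    case SKETCH_OP_PROPERTY_LIST_COPY:
        // No copy is made: the same buffer is returned again.
        return ud->fapl_image_ptr;
    }
    return NULL;
}
```

Calling the sketch first with the SET operator and then with the COPY operator
returns the application's own buffer both times, which is the property the
netcdf-c semantics rely on.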

#### Operator H5FD_FILE_IMAGE_OP_PROPERTY_LIST_GET
As with H5FD_FILE_IMAGE_OP_PROPERTY_LIST_COPY, just return the fapl_image_ptr field,
so no actual copying or malloc is needed.

#### Operator H5FD_FILE_IMAGE_OP_FILE_OPEN
Since we always start by using the original incoming image buffer, we just
need to store that pointer and size into the vfd_image fields (remember, the vfd
buffer is the one used by the core driver).

### libhdf5/nc4memcb/local_image_memcpy
This function is supposed to be used to copy the incoming buffer into an internally
malloc'd buffer. Since we use the original buffer, no memcpy is actually needed.
As a safety check, we do perform a memcpy if, for some reason, the _src_ and _dest_
arguments are different. In practice, this never happens.

### libhdf5/nc4memcb/local_image_realloc
Since the property list image is never realloc'd, this is only called with
_H5FD_FILE_IMAGE_OP_FILE_RESIZE_.

If the memory is not locked (i.e. the _NC_MEMIO_LOCKED_ flag was not used),
then we are free to realloc the vfd_image_ptr. But if the memory is locked,
then we cannot realloc and we must fake it as follows:

1. If the chunk is big enough, then pretend to do a realloc by
   changing the vfd_image_size.
2. If the chunk is not big enough to accommodate the requested new size,
   then fail.

There is one important complication. It turns out that the image_realloc
callback is sometimes called with a ptr argument value of NULL. This assumes
that if realloc is called with a NULL buffer pointer, then it acts like _malloc_.
Since we have found that some systems do not implement this, we implement it
in our local_image_realloc code and do a _malloc_ instead of _realloc_.

### libhdf5/nc4memcb/local_image_free

This function is, of course, invoked to deallocate memory.
It is only invoked with the
H5FD_FILE_IMAGE_OP_PROPERTY_LIST_CLOSE
and H5FD_FILE_IMAGE_OP_FILE_CLOSE
operators.

#### Operator H5FD_FILE_IMAGE_OP_PROPERTY_LIST_CLOSE
Given the way the netcdf library uses it, it should still be the case that
the fapl pointer is the same as the original incoming app_image_ptr, so we do not need
to do anything for this operator.

#### Operator H5FD_FILE_IMAGE_OP_FILE_CLOSE
Since our implementation maintains control of the memory, this case
will never free any memory, but may save a pointer to the current vfd memory
so it can be returned to the original caller, if they want it.
Specifically, the vfd_image_ptr and vfd_image_size are always
copied to the _udata->h5->mem.memio_ field so they can
be referenced by higher level code.

### libhdf5/nc4memcb/local_udata_copy
Our version of this function only manipulates the reference count.

### libhdf5/nc4memcb/local_udata_free
Our version of this function only manipulates the reference count.

# NetCDF-3 Implementation {#inmemintern_nc3}

The netcdf-3 code -- in libsrc -- has its own, internal storage
management API as defined in the file _libsrc/ncio.h_. It implements
the API in the form of a set of function pointers as defined in the
structure _struct_ _ncio_. These functions have the following signatures
and semantics.

- int ncio_relfunc(ncio*, off_t offset, int rflags) --
  Indicate that you are done with the region which begins at offset.
- int ncio_getfunc(ncio*, off_t offset, size_t extent, int rflags, void **const vpp) --
  Request that the region (offset, extent) be made available through *vpp.
- int ncio_movefunc(ncio*, off_t to, off_t from, size_t nbytes, int rflags) --
  Like memmove(), safely move possibly overlapping data.
- int ncio_syncfunc(ncio*) --
  Write out any dirty buffers to disk and ensure that next read will get data from disk.
- int ncio_pad_lengthfunc(ncio*, off_t length) --
  Sync any changes to disk, then truncate or extend file so its size is length.
- int ncio_closefunc(ncio*, int doUnlink) --
  Write out any dirty buffers and ensure that next read will not get cached data.
  Then sync any changes, and then close the open file.

The _NC_INMEMORY_ semantics are implemented by creating an implementation of the above functions
specific for handling in-memory support. This is implemented in the file _libsrc/memio.c_.

## Open/Create/Close

Open and close related functions exist in _memio.c_ that are not specifically part of the API.
These functions are defined in the following sections.

### memio_create
Signature:
````
int memio_create(const char* path, int ioflags, size_t initialsz, off_t igeto, size_t igetsz, size_t* sizehintp, void* parameters /*ignored*/, ncio* *nciopp, void** const mempp)
````
Create a new file. Invoke _memio_new_ to create the _ncio_
instance. If it is intended that the resulting file be
persisted to the file system, then verify that writing such a
file is possible. Also create an initial in-memory buffer to
hold the file data. Otherwise act like e.g. _posixio_create_.

### memio_open
Signature:
````
int memio_open(const char* path, int ioflags, off_t igeto, size_t igetsz, size_t* sizehintp, void* parameters, ncio* *nciopp, void** const mempp)
````
Open an existing file. Invoke _memio_new_ to create the _ncio_
instance. If it is intended that the resulting file be
persisted to the file system, then verify that writing such a
file is possible. Also create an initial in-memory buffer to
hold the file data. Read the contents of the existing file into
the allocated memory.
Otherwise act like e.g. _posixio_open_.
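
The step of reading an existing file into an allocated buffer can be pictured
with plain C stdio. The sketch below is not the _memio.c_ code. The helper
_sketch_read_file_ is hypothetical and exists only for this illustration, and
its error handling is reduced to a single failure code.

```c
#include <assert.h>
#include <stdio.h>
#include <stdlib.h>

// Hypothetical helper: load all of `path` into a freshly malloc'd buffer.
// On success returns 0 and sets *memoryp and *sizep; the caller frees the
// buffer. On any failure returns -1 and leaves the outputs untouched.
static int sketch_read_file(const char* path, void** memoryp, size_t* sizep)
{
    FILE* f = fopen(path, "rb");
    long len;
    void* buf;

    if (f == NULL)
        return -1;
    // Determine the file length by seeking to the end.
    if (fseek(f, 0L, SEEK_END) != 0 || (len = ftell(f)) < 0
        || fseek(f, 0L, SEEK_SET) != 0) {
        fclose(f);
        return -1;
    }
    // Allocate at least one byte so an empty file still yields a buffer.
    buf = malloc(len > 0 ? (size_t)len : 1);
    if (buf == NULL) {
        fclose(f);
        return -1;
    }
    if (fread(buf, 1, (size_t)len, f) != (size_t)len) {
        free(buf);
        fclose(f);
        return -1;
    }
    fclose(f);
    *memoryp = buf;
    *sizep = (size_t)len;
    return 0;
}
```

The resulting (pointer,size) pair has the same shape as the pair that the
in-memory implementations are handed when a caller supplies the memory directly.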

### memio_extract
Signature:
````
int memio_extract(ncio* const nciop, size_t* sizep, void** memoryp)
````
This function is called as part of the NC3_close function in the event that
the user wants the final in-memory chunk returned to them via _nc_close_memio_.
It captures the existing in-memory chunk and returns it. At this point,
memio will no longer have access to that memory.

## API Semantics

The semantic interactions of the above API with NC_INMEMORY are described in the following sections.

### ncio_relfunc
Just unlock the in-memory chunk.

### ncio_getfunc
First guarantee that the requested region exists, and if necessary,
realloc to make it exist. If realloc is needed, and the
file is locked, then fail.

### ncio_movefunc
First guarantee that the requested destination region exists, and if necessary,
realloc to make it exist. If realloc is needed, and the
file is locked, then fail.

### ncio_syncfunc
This is a no-op as far as memio is concerned.

### ncio_pad_lengthfunc
This may realloc the allocated in-memory buffer to achieve padding
rounded up to the pagesize.

### ncio_filesizefunc
This just returns the used size of the in-memory chunk.
Note that the allocated size might be larger.

### ncio_closefunc
If the user wants the contents persisted, then write out the used portion
of the in-memory chunk to the target file.
Then, if the in-memory chunk is not locked, or for some reason has
been modified, go ahead and free that memory.

# References {#inmemintern_bib}

1. https://support.hdfgroup.org/HDF5/doc1.8/Advanced/FileImageOperations/HDF5FileImageOperations.pdf
2. 
   https://support.hdfgroup.org/HDF5/doc/RM/RM_H5P.html#Property-SetFileImageCallbacks

# Point of Contact {#inmemintern_poc}

__Author__: Dennis Heimbigner<br>
__Email__: dmh at ucar dot edu<br>
__Initial Version__: 8/28/2018<br>
__Last Revised__: 8/28/2018

<!-- End MarkDown -->

@endif

*/