1.. _rfc-26: 2 3================================================================================ 4RFC 26: GDAL Block Cache Improvements 5================================================================================ 6 7Authors: Tamas Szekeres, Even Rouault 8 9Contact: szekerest@gmail.com, even.rouault at spatialys.com 10 11Status: Adopted, implemented 12 13Implementation version: GDAL 2.1 14 15Summary and rationale 16--------------------- 17 18GDAL maintains an in-memory cache for the raster blocks fetched from the 19drivers and ensures that the second attempt to access the same block 20will be served from the cache instead of the driver. This cache is 21maintained in a per-band fashion and an array is allocated for the 22pointers for each blocks (or sub-blocks). This approach is not 23sufficient with large raster dimensions (or large virtual rasters ie. 24with the WMS/TMS driver), which may cause out of memory errors in 25GDALRasterBand::InitBlockInfo, as raised in #3224 26 27For example, a band of a dataset at level 21 with a GoogleMaps tiling 28requires 2097152x2097152 tiles of 256x256 pixels. This means that GDAL 29will try to allocate an array of 32768x32768 = 1 billion elements (32768 30= 2097152 / 64). The size of this array is 4 GB on a 32-bit build, so it 31cannot be allocated at all. And it is 8 GB on a 64-bit build (even if 32this is generally only virtual memory reservation but not actually 33allocation of physical pages of memory, due to over-commit mechanism of 34the operating system). At dataset closing, this means that those 1 35billion cells will have to be explored to discover remaining cached 36blocks. In reality, all above figures must be multiplied by 3 for a RGB 37(or 4 for a RGBA) dataset. 38 39In the hash set implementation, memory allocation depends directly on 40the number of cached blocks. Typically with the default GDAL_CACHEMAX 41size of 40 MB, only 640 blocks of 256x256 pixels can be simultaneously 42cached (for all datasets). 43 44Main concepts 45------------- 46 47Awareness of thread-safety issues is crucial in the design of block 48caching. In gdalrasterblock.cpp, a static linked list is maintained so 49as to track the access order of the blocks and keep the size of the 50cache within a desired limit by dropping the oldest blocks out of the 51list. This linked list is shared among all the datasets and bands in 52GDAL (protected by a hRBMutex) and a thread on each band, when reading a 53new block, may also trigger a GDALRasterBand::UnreferenceBlock call on 54another band within the scope of this mutex. GDALRasterBand::FlushBlock 55will also access the data structure of the band level cache by removing 56the corresponding tile from the array or the hashtable. 57 58In GDAL 2.0, some issues related to thread-safety (#3225, #3226) have 59been fixed and this RFC still preserves those scenarios as safe. 60 61The changes of this RFC consist in moving away from the GDALRasterBand 62class the logic to access to a cached block, to add or remove it. This 63is done with the new GDALAbstractBandBlockCache class. The current array 64based logic is moved into the new GDALArrayBandBlockCache class, and the 65new hashset based logic in GDALHashsetBandBlockCache. 66 67For the array based implementation, due to the "static" nature of the 68hosting structure (an array), no special care is needed when reading or 69writing a cell from concurrent threads. The only special care that must 70be taken is to prevent a given cell (block) to be accessed concurrently. 71For example we want to avoid TryGetLockedBlockRef() to return a block 72that is being freed by another thread from 73GDALRasterBlock::FlushCacheBlock() or Internalize(). For that, the 74nRefCount member of GDALRasterBlock is now accessed and modified only 75through atomic functions to increase, decrease or compare-and-swap its 76value. 77 78For the hash set based implementation, the base implementation of hash 79set data structure done in in cpl_hash_set.h / cpl_hash_set.cpp is not 80thread safe by default. So GDALHashsetBandBlockCache has a dedicated 81mutex to protect all reads, additions and removals from the hash set. No 82dead-lock with the hRBMutex can occurs since no operations done under 83the hashset mutex involves calling any method from GDALRasterBlock. 84 85We could potentially have reused the hRBMutex to protect the hash set, 86but this would have increased the contention of the hRBMutex 87unnecessarily. 88 89By default, the selection between the array based and the hashtable 90based approaches is based on the following rule: if the dataset has more 91than 1 million blocks, the hashset based implementation is used, 92otherwise the array based implementation is used. The new 93GDAL_OF_ARRAY_BLOCK_ACCESS and GDAL_OF_HASHSET_BLOCK_ACCESS open flags 94can also be passed to GDALOpenEx() to override this choice. The 95GDAL_BAND_BLOCK_CACHE configuration option can also be set to ARRAY or 96HASHSET. 97 98The hashset based implementation could potentially be the default 99implementation in all cases (performance comparisons done with the 100autotest/cpp/testblockcache utility with 4 or 8 cores show non 101measurable differences), but in theory the array based implementation 102offers less contention of the hRBMutex, so should be more scalable when 103using lots of cores. And as work has been done during GDAL 2.0 to 104improve the scalability, it might be prudent for now to remain on the 105array based implementation on rasters of modest size. 106 107Not completely linked with this RFC, a few changes have been done to 108limit the number of allocation/deallocation of objects (GDALRasterBlock 109instances, as well as an internal element of CPLHashSet), which has an 110effect on scalability since memory allocation routines involve 111synchronization between threads. 112 113Implementation 114-------------- 115 116To implement the addition the following changes is made in the GDAL 117codebase: 118 119- port/cpl_hash_set.cpp / port/cpl_hash_set.h: CPLHashSetClear() 120 function added to remove all the elements in one operation. 121 122- port/cpl_hash_set.cpp / port/cpl_hash_set.h: 123 CPLHashSetRemoveDeferRehash() function added to remove one element 124 quickly. That is to say the potential resizing of the array used 125 internally is deferred to a later operation 126 127- port/cpl_hash_set.cpp / port/cpl_hash_set.h: improvements to 128 "recycle" links from the linked lists and avoid useless 129 malloc()/free(). 130 131- port/cpl_atomic_ops.cpp: addition of CPLAtomicCompareAndExchange() 132 133- gcore/gdal.h: additions of GDAL_OF_DEFAULT_BLOCK_ACCESS, 134 GDAL_OF_ARRAY_BLOCK_ACCESS and GDAL_OF_HASHSET_BLOCK_ACCESS values. 135 136- gcore/gdal_priv.h: definition of GDALAbstractBandBlockCache class, 137 and GDALArrayBandBlockCacheCreate() and 138 GDALHashSetBandBlockCacheCreate() functions. Modifications of 139 GDALRasterBand, GDALDataset and GDALRasterBlock definitions. 140 141- gcore/gdalrasterband.cpp: InitBlockInfo() instantiates the 142 appropriate band block cache implementation. 143 144- gcore/gdalrasterband.cpp: the AdoptBlock(), UnreferenceBlock(), 145 FlushBlock() and TryGetLockedBlockRef() methods delegate to the 146 actual band block cache implementation. 147 148- gcore/gdalrasterband.cpp: AddBlockToFreeList() is added and delegate 149 to GDALAbstractBandBlockCache 150 151- gcore/gdalrasterblock.cpp: SafeLockBlock() is replaced by TakeLock() 152 153- gcore/gdalrasterblock.cpp: RecycleFor() method added to recycle an 154 existing block object to save a few new/delete calls (used by 155 GDALAbstractBandBlockCache::CreateBlock()) 156 157- gcore/gdalrasterblock.cpp: Internalize() or FlushCacheBlock() no 158 longer directly free a block (they still free or recycle its pData 159 member), but provide it to GDALRasterBand::AddBlockToFreeList() for 160 layer reuse. 161 162- gcore/gdalrasterblock.cpp: DropLockForRemovalFromStorage() is added 163 to avoid racing destruction of blocks between 164 GDALRasterBand::FlushCache() or FlushBlock() with 165 GDALRasterBlock::Internalize() or FlushCacheBlock(). 166 167- gcore/gdalabstractbandblockcache.cpp: added. Contains logic to keep 168 instantiated GDALRasterBlock that were discarded by the global block 169 manager for their later reuse. Saves a few new/delete calls. 170 171- gcore/gdalarraybandblockcache.cpp: the GDALArrayBandBlockCache class 172 implementation with mostly the existing code 173 174- gcore/gdalhashsetbandblockcache.cpp: the new 175 GDALHashsetBandBlockCache class implementation 176 177Backward Compatibility 178---------------------- 179 180This implementation retains the backward compatibility with the existing 181API. The C++ ABI of GDALRasterBand, GDALDataset and GDALRasterBlock is 182modified. 183 184Performance impacts 185------------------- 186 187The array based implementation after this RFC should still show the same 188performance than the current implementation (potentially very slightly 189improved with the recycling of blocks). Confirmed by tests with 190autotest/cpp/testblockcache. 191 192Documentation 193------------- 194 195This change doesn't affect the existing user documentation. 196 197Testing 198------- 199 200The autotest/cpp/testblockcache utility is now run by the "quick_test" 201target of autotest/cpp/Makefile with GDAL_BAND_BLOCK_CACHE=HASHSET in 202additions to the array based implementation. 203 204A new autotest/cpp/testblockcachelimits utility has been developed to 205test a few racing situations. As races are hard to trigger, the code of 206GDALRasterBlock has been instrumented to allow sleeping in particular 207places, enabling races to be reliably simulated. 208 209Implementation 210-------------- 211 212Tamas Szekeres had provided an initial version of this RFC. It has been 213restructured and ported on GDAL 2.0 by Even Rouault (sponsored by `LINZ 214(Land Information New Zealand) <http://www.linz.govt.nz/>`__) 215 216References 217---------- 218 219The proposed implementation lies in the "rfc26_bandblockcache" branch of 220the 221`https://github.com/rouault/gdal2/tree/rfc26_bandblockcache <https://github.com/rouault/gdal2/tree/rfc26_bandblockcache>`__ 222repository. 223 224The list of changes: 225`https://github.com/rouault/gdal2/compare/rfc26_bandblockcache <https://github.com/rouault/gdal2/compare/rfc26_bandblockcache>`__ 226 227Related bugs: #3264, #3224. 228 229Voting History 230-------------- 231 232+1 from EvenR, DanielM, TamasS. +0 from JukkaR 233