.. _rfc-40:

=======================================================================================
RFC 40: Improving performance of Raster Attribute Table implementation for large tables
=======================================================================================

Summary:
--------

Raster Attribute Tables from some applications (notably segmentation) can
be very large and are slow to access with the current API because only
one element can be read or written at a time. Also, when an application
requests an attribute table, the whole table must be read: there is no
way to defer this so that only the required subset is read from disk.
These changes will bring attribute table support more in line with the
way raster data is accessed.

Implementation:
---------------

It is proposed that GDALRasterAttributeTable be rewritten as an abstract
(pure virtual) base class. This will allow drivers to have their own
implementation that only reads and writes data when requested. A new
derived class, GDALDefaultRasterAttributeTable, will provide the
functionality of the GDAL 1.x GDALRasterAttributeTable (i.e., it holds
all data in memory).
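
The split between an abstract interface and an in-memory default can be
sketched as follows. This is only an illustration under simplifying
assumptions: the class names AttributeTable and InMemoryAttributeTable
are hypothetical stand-ins (not the GDAL classes), and every cell is
modelled as a double.

::

   #include <cassert>
   #include <cstddef>
   #include <vector>

   // Hypothetical, simplified stand-in for the proposed abstract base
   // class. Drivers would derive from it and fetch data on request.
   class AttributeTable
   {
   public:
       virtual ~AttributeTable() = default;
       virtual int GetRowCount() const = 0;
       virtual double GetValueAsDouble(int iRow, int iField) const = 0;
       virtual void SetValue(int iRow, int iField, double dfValue) = 0;
   };

   // Hypothetical stand-in for the default implementation: every value
   // is held in memory, stored row by row.
   class InMemoryAttributeTable : public AttributeTable
   {
   public:
       InMemoryAttributeTable(int nRows, int nFields)
           : m_nRows(nRows), m_nFields(nFields),
             m_adfValues(static_cast<std::size_t>(nRows) * nFields, 0.0)
       {
       }

       int GetRowCount() const override { return m_nRows; }

       double GetValueAsDouble(int iRow, int iField) const override
       {
           return m_adfValues[Index(iRow, iField)];
       }

       void SetValue(int iRow, int iField, double dfValue) override
       {
           m_adfValues[Index(iRow, iField)] = dfValue;
       }

   private:
       // Maps (row, field) to a position in the row-major value array.
       std::size_t Index(int iRow, int iField) const
       {
           assert(iRow >= 0 && iRow < m_nRows);
           assert(iField >= 0 && iField < m_nFields);
           return static_cast<std::size_t>(iRow) * m_nFields + iField;
       }

       int m_nRows;
       int m_nFields;
       std::vector<double> m_adfValues;
   };

Code written against the base class in this sketch does not need to know
whether the values come from memory or from a driver that reads them on
demand.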

Additional methods will be provided in the GDALRasterAttributeTable
class that allow 'chunks' of data from a column to be read or written in
one call. As with the GetValueAs functions, a column of one type can be
read as a value of a different type (e.g., an int column read as a
double), with the appropriate conversion taking place. The following
overloaded methods will be available:

::

   CPLErr ValuesIO(GDALRWFlag eRWFlag, int iField, int iStartRow, int iLength, double *pdfData);
   CPLErr ValuesIO(GDALRWFlag eRWFlag, int iField, int iStartRow, int iLength, int *pnData);
   CPLErr ValuesIO(GDALRWFlag eRWFlag, int iField, int iStartRow, int iLength, char **papszStrList);

It is expected that the application will allocate the required space for
reading in the same way as with the RasterIO() call.
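
The idea of a chunked read with type conversion into a caller-allocated
buffer can be sketched as follows. This is an illustration only: Status,
ReadIntColumnAsDouble and SumColumnInChunks are hypothetical names, and
the column is modelled as a plain std::vector rather than a GDAL table.

::

   #include <algorithm>
   #include <cassert>
   #include <cstddef>
   #include <vector>

   // Hypothetical result code standing in for CPLErr.
   enum class Status { Ok, Failure };

   // Copies iLength rows of an integer column, starting at iStartRow,
   // into a caller-allocated double buffer, converting each value.
   Status ReadIntColumnAsDouble(const std::vector<int> &anColumn,
                                int iStartRow, int iLength,
                                double *padfData)
   {
       if (padfData == nullptr || iStartRow < 0 || iLength < 0 ||
           static_cast<std::size_t>(iStartRow) +
                   static_cast<std::size_t>(iLength) > anColumn.size())
       {
           return Status::Failure;
       }
       for (int i = 0; i < iLength; ++i)
       {
           const std::size_t iRow =
               static_cast<std::size_t>(iStartRow) + i;
           padfData[i] = static_cast<double>(anColumn[iRow]);
       }
       return Status::Ok;
   }

   // Sums a column by reading at most nChunkSize rows per call, so the
   // buffer never holds more than one chunk.
   double SumColumnInChunks(const std::vector<int> &anColumn,
                            int nChunkSize)
   {
       assert(nChunkSize > 0);
       std::vector<double> adfBuffer(
           static_cast<std::size_t>(nChunkSize));
       const int nRows = static_cast<int>(anColumn.size());
       double dfSum = 0.0;
       for (int iStart = 0; iStart < nRows; iStart += nChunkSize)
       {
           const int nThisChunk = std::min(nChunkSize, nRows - iStart);
           if (ReadIntColumnAsDouble(anColumn, iStart, nThisChunk,
                                     adfBuffer.data()) != Status::Ok)
           {
               break;
           }
           for (int i = 0; i < nThisChunk; ++i)
               dfSum += adfBuffer[i];
       }
       return dfSum;
   }

In this sketch the caller chooses the chunk size, which bounds the memory
needed regardless of how many rows the column holds.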

The char\*\* type will be used for reading and writing strings. When
reading strings, the application is expected to allocate an array of the
correct size; ValuesIO will only create the individual strings for each
row. The application should call CPLFree on each of the strings before
de-allocating the array.
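
The ownership rule for strings can be sketched as follows. This is an
illustration only: DuplicateString, ReadStringColumn and FreeStrings are
hypothetical helpers, and std::malloc and std::free stand in for the CPL
allocation functions so that the example has no GDAL dependency.

::

   #include <cassert>
   #include <cstddef>
   #include <cstdlib>
   #include <cstring>
   #include <string>
   #include <vector>

   // Stand-in for a CPL string copy: returns a heap copy that the
   // caller must free.
   static char *DuplicateString(const std::string &osValue)
   {
       char *pszCopy =
           static_cast<char *>(std::malloc(osValue.size() + 1));
       if (pszCopy != nullptr)
           std::memcpy(pszCopy, osValue.c_str(), osValue.size() + 1);
       return pszCopy;
   }

   // Fills iLength slots of a caller-allocated array with newly
   // allocated strings, one per row starting at iStartRow.
   bool ReadStringColumn(const std::vector<std::string> &aosColumn,
                         int iStartRow, int iLength,
                         char **papszStrList)
   {
       if (papszStrList == nullptr || iStartRow < 0 || iLength < 0 ||
           static_cast<std::size_t>(iStartRow) +
                   static_cast<std::size_t>(iLength) > aosColumn.size())
       {
           return false;
       }
       for (int i = 0; i < iLength; ++i)
       {
           const std::size_t iRow =
               static_cast<std::size_t>(iStartRow) + i;
           papszStrList[i] = DuplicateString(aosColumn[iRow]);
       }
       return true;
   }

   // Caller-side cleanup: frees each string and clears its slot. The
   // array itself is released afterwards by whoever allocated it.
   void FreeStrings(char **papszStrList, int iLength)
   {
       assert(papszStrList != nullptr || iLength == 0);
       for (int i = 0; i < iLength; ++i)
       {
           std::free(papszStrList[i]);  // stand-in for CPLFree
           papszStrList[i] = nullptr;
       }
   }

The array and the strings have different owners in this sketch: the
caller allocates and releases the array, while the reader allocates each
string and the caller frees them one by one.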

These methods will be available from C as GDALRATValuesIOAsDouble,
GDALRATValuesIOAsInteger and GDALRATValuesIOAsString.

This is also an opportunity to remove unused functions on the attribute
table such as GetRowMin(), GetRowMax() and GetColorOfValue().

Language Bindings:
------------------

The Python bindings will be altered so that ValuesIO is supported using
numpy arrays for the data, with casting of types as appropriate. Strings
will be supported using the numpy support for string arrays.

Backward Compatibility:
-----------------------

The proposed additions will extend the C API. However, the C++ binary
interface will be broken and so GDAL 2.0 is suggested as an appropriate
time to introduce the changes.

Care will be taken to still support the use of Clone() and Serialize()
in derived implementations of the GDALRasterAttributeTable class as
these are called by existing code. For implementations where the table
is not held in memory, these may fail if the table exceeds some suitable
limit (for example, they could succeed only when
GetRowCount() \* GetColCount() < 1 000 000). Clone() should return an
instance of GDALDefaultRasterAttributeTable to prevent problems with
sharing memory between objects.
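
A size-limited Clone() for a table that is not held in memory can be
sketched as follows. This is an illustration only: OnDiskTable,
InMemoryTable, ReadCell and kMaxCloneCells are hypothetical names, the
limit is the example value from the text, and ReadCell fabricates a value
in place of a real read from disk.

::

   #include <cassert>
   #include <cstddef>
   #include <cstdint>
   #include <memory>
   #include <vector>

   // Hypothetical limit, taken from the example in the text.
   constexpr std::int64_t kMaxCloneCells = 1000000;

   // Hypothetical stand-in for the in-memory default table.
   struct InMemoryTable
   {
       int nRows = 0;
       int nCols = 0;
       std::vector<double> adfValues;  // row-major cell values
   };

   // Hypothetical driver table whose cells are fetched on request.
   class OnDiskTable
   {
   public:
       OnDiskTable(int nRows, int nCols)
           : m_nRows(nRows), m_nCols(nCols)
       {
           assert(nRows >= 0 && nCols >= 0);
       }

       // Returns an independent in-memory copy, or nullptr when the
       // table has too many cells to copy safely.
       std::unique_ptr<InMemoryTable> Clone() const
       {
           const std::int64_t nCells =
               static_cast<std::int64_t>(m_nRows) * m_nCols;
           if (nCells >= kMaxCloneCells)
               return nullptr;

           std::unique_ptr<InMemoryTable> poCopy(new InMemoryTable());
           poCopy->nRows = m_nRows;
           poCopy->nCols = m_nCols;
           poCopy->adfValues.reserve(static_cast<std::size_t>(nCells));
           for (int iRow = 0; iRow < m_nRows; ++iRow)
           {
               for (int iCol = 0; iCol < m_nCols; ++iCol)
                   poCopy->adfValues.push_back(ReadCell(iRow, iCol));
           }
           return poCopy;
       }

   private:
       // Placeholder for a real disk read.
       double ReadCell(int iRow, int iCol) const
       {
           return iRow * 10.0 + iCol;
       }

       int m_nRows;
       int m_nCols;
   };

Because the copy owns its own storage in this sketch, the clone and the
original never share memory, and callers must check for a null result
when the table is too large.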

Existing code may need to be altered to create instances of
GDALDefaultRasterAttributeTable rather than GDALRasterAttributeTable if
an in-memory implementation is still required.

Impact on Drivers
-----------------

The HFA driver will be updated to support all aspects of the new
interface, such as the new functions and reading/writing upon request.
Other drivers will be modified to continue to use the in-memory
implementation (GDALDefaultRasterAttributeTable).

Testing
-------

The Python autotest suite will be extended to test the new API, both for
the default implementation and for the specialised implementation in the
HFA driver.

Timeline
--------

We (Sam Gillingham and Pete Bunting) are prepared to undertake the work
required and have it ready for inclusion in GDAL 1.11. There needs to be
a discussion on the names of the methods and on the internal logic of
the methods.

Ticket
------

Ticket #5129 has been opened to track the progress of this RFC.