.. _rfc-15:

================================================================================
RFC 15: Band Masks
================================================================================

Author: Frank Warmerdam

Contact: warmerdam@pobox.com

Status: Adopted

Summary
-------

Some file formats support the concept of a bitmask that identifies pixels
that are not valid data. This is particularly valuable with byte image
formats, where a nodata pixel value cannot be used because all pixel
values have a valid meaning. This RFC formalizes a way of
recognising and accessing such null masks through the GDAL API, while
moving to a uniform means of representing other kinds of masking (nodata
values and alpha bands).

The basic approach is to treat such masks as raster bands, but not as
regular raster bands on the datasource. Instead they are freestanding
raster bands, in a manner similar to the overview raster band objects.
The masks are represented as GDT_Byte bands, with a value of zero
indicating nodata and non-zero values indicating valid data. Normally
the value 255 will be used for valid data pixels.
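
As an illustration of the mask convention, the following minimal sketch
derives a byte mask from a nodata value: pixels equal to the nodata value
become 0 and all other pixels become 255. MaskFromNodata is a hypothetical
helper written for this example, not a GDAL function, and it assumes byte
pixel data held in a std::vector.

.. code-block:: cpp

   #include <cstdint>
   #include <vector>

   // Hypothetical helper (not part of GDAL): build a byte mask in which
   // 0 marks nodata pixels and 255 marks valid pixels.
   std::vector<std::uint8_t> MaskFromNodata(
       const std::vector<std::uint8_t> &pixels, std::uint8_t nodata)
   {
       std::vector<std::uint8_t> mask;
       mask.reserve(pixels.size());
       for (std::uint8_t value : pixels)
           mask.push_back(value == nodata ? 0 : 255);
       return mask;
   }

With a nodata value of 0, the pixels {10, 0, 255, 0} produce the mask
{255, 0, 255, 0}.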

API
---

GDALRasterBand is extended with the following methods:

::

       virtual GDALRasterBand *GetMaskBand();
       virtual int             GetMaskFlags();
       virtual CPLErr          CreateMaskBand( int nFlags );

GDALDataset is extended with the following method:

::

       virtual CPLErr          CreateMaskBand( int nFlags );

Note that GetMaskBand() should always return a GDALRasterBand mask,
even if it is only an all-255 mask with the flags indicating
GMF_ALL_VALID.

The GetMaskFlags() method returns a bitwise OR-ed set of status flags
with the following available definitions, which may be extended in the
future:

-  GMF_ALL_VALID (0x01): There are no invalid pixels; all mask values
   will be 255. When used, this will normally be the only flag set.
-  GMF_PER_DATASET (0x02): The mask band is shared between all bands on
   the dataset.
-  GMF_ALPHA (0x04): The mask band is actually an alpha band and may have
   values other than 0 and 255.
-  GMF_NODATA (0x08): Indicates the mask is actually being generated from
   nodata values. Mutually exclusive with GMF_ALPHA.
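
The following is a minimal sketch of how a caller might decode a value
returned by GetMaskFlags(). The constants mirror the values listed above
and are redefined here only so the example stands alone;
DescribeMaskFlags is a hypothetical helper, not part of the GDAL API.

.. code-block:: cpp

   #include <string>

   // Flag values as listed above, redefined so the sketch is self-contained.
   constexpr int GMF_ALL_VALID = 0x01;
   constexpr int GMF_PER_DATASET = 0x02;
   constexpr int GMF_ALPHA = 0x04;
   constexpr int GMF_NODATA = 0x08;

   // Hypothetical helper: render a bitwise OR-ed flag set as readable text.
   std::string DescribeMaskFlags(int nFlags)
   {
       std::string text;
       auto append = [&text](const char *name) {
           if (!text.empty())
               text += "|";
           text += name;
       };
       if (nFlags & GMF_ALL_VALID)
           append("GMF_ALL_VALID");
       if (nFlags & GMF_PER_DATASET)
           append("GMF_PER_DATASET");
       if (nFlags & GMF_ALPHA)
           append("GMF_ALPHA");
       if (nFlags & GMF_NODATA)
           append("GMF_NODATA");
       if (text.empty())
           return "0";
       return text;
   }

For example, DescribeMaskFlags(GMF_PER_DATASET | GMF_ALPHA) yields the
text GMF_PER_DATASET|GMF_ALPHA.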

The CreateMaskBand() method will attempt to create a mask band
associated with the band on which it is invoked, issuing an error if it
is not supported. Currently the only flag that is meaningful to pass in
when creating a mask band is GMF_PER_DATASET. The rest are used to
represent special system-provided mask bands. GMF_PER_DATASET is assumed
when CreateMaskBand() is called on a dataset.

Default GetMaskBand() / GetMaskFlags() Implementation
-----------------------------------------------------

The GDALRasterBand class will include a default implementation of
GetMaskBand() that returns one of four default implementations.

-  If a corresponding .msk file exists, it will be used for the mask
   band.
-  If the band has a nodata value set, an instance of the new
   GDALNodataMaskRasterBand class will be returned. GetMaskFlags() will
   return GMF_NODATA.
-  If there is no nodata value, but the dataset has an alpha band that
   seems to apply to this band (specific rules yet to be determined) and
   that is of type GDT_Byte, then that alpha band will be returned, and
   GetMaskFlags() will return GMF_PER_DATASET and GMF_ALPHA.
-  If none of the above apply, an instance of the new
   GDALAllValidRasterBand class will be returned that has 255 values for
   all pixels. GetMaskFlags() will return GMF_ALL_VALID.
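
The selection rules above can be sketched as a simple decision function.
This is an illustration only: it assumes the rules are evaluated in the
order listed, and BandState, MaskKind, MaskChoice and ChooseDefaultMask
are hypothetical names that do not exist in GDAL. The flags for the .msk
case are taken as an input because they are stored with the .msk file.

.. code-block:: cpp

   // Flag values from the API section, redefined so the sketch stands alone.
   constexpr int GMF_ALL_VALID = 0x01;
   constexpr int GMF_PER_DATASET = 0x02;
   constexpr int GMF_ALPHA = 0x04;
   constexpr int GMF_NODATA = 0x08;

   // Hypothetical summary of what is known about a band.
   struct BandState
   {
       bool hasMskFile;    // a corresponding .msk file exists
       int mskFileFlags;   // flags recorded for this band in the .msk file
       bool hasNodata;     // the band has a nodata value set
       bool hasByteAlpha;  // an applicable GDT_Byte alpha band exists
   };

   enum class MaskKind { MskFile, Nodata, Alpha, AllValid };

   struct MaskChoice
   {
       MaskKind kind;
       int flags;
   };

   // Walk the rules top to bottom and stop at the first one that applies
   // (the ordering is an assumption of this sketch).
   MaskChoice ChooseDefaultMask(const BandState &band)
   {
       if (band.hasMskFile)
           return {MaskKind::MskFile, band.mskFileFlags};
       if (band.hasNodata)
           return {MaskKind::Nodata, GMF_NODATA};
       if (band.hasByteAlpha)
           return {MaskKind::Alpha, GMF_PER_DATASET | GMF_ALPHA};
       return {MaskKind::AllValid, GMF_ALL_VALID};
   }

Under this ordering, a band that has both a nodata value and an alpha
band gets the nodata-derived mask, because the alpha rule is only
considered when no nodata value is set.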

The GDALRasterBand will include a protected poMask instance variable and
a bOwnMask flag. The first call to the default GetMaskBand() will result
in the creation of a GDALNodataMaskRasterBand or GDALAllValidMaskRasterBand
and its assignment to poMask, with bOwnMask set to TRUE. If an alpha band
is identified for use, it will be assigned to poMask and bOwnMask set to
FALSE. The GDALRasterBand destructor will take care of deleting poMask if
it is set and bOwnMask is TRUE. Derived band classes may
safely use poMask and bOwnMask similarly, as long as these
semantics are maintained.
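
The ownership semantics can be sketched with a stand-in class. SketchBand
is hypothetical and only mimics the poMask / bOwnMask pair described
above; it is not the GDALRasterBand implementation. The g_liveBands
counter and the SetMask() method exist only to make the effect of the
destructor observable in this example.

.. code-block:: cpp

   // Counts live SketchBand objects so the example can show what is deleted.
   static int g_liveBands = 0;

   // Hypothetical stand-in for a raster band that may own its mask band.
   class SketchBand
   {
     public:
       SketchBand() { ++g_liveBands; }
       SketchBand(const SketchBand &) = delete;
       SketchBand &operator=(const SketchBand &) = delete;

       virtual ~SketchBand()
       {
           // Delete the mask only when this band owns it.
           if (bOwnMask)
               delete poMask;
           --g_liveBands;
       }

       // Attach a mask; own == false models a shared alpha band.
       void SetMask(SketchBand *mask, bool own)
       {
           if (bOwnMask)
               delete poMask;
           poMask = mask;
           bOwnMask = own;
       }

     protected:
       SketchBand *poMask = nullptr;
       bool bOwnMask = false;
   };

A band that was handed a freshly allocated mask with own set to true
deletes it on destruction, while a band pointing at a shared alpha band
with own set to false leaves that band untouched.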

For an external .msk file to be recognized by GDAL, it must be a valid
GDAL dataset with the same name as the main dataset, suffixed with
.msk, and with either one band (in the GMF_PER_DATASET case) or as many
bands as the main dataset. It must have INTERNAL_MASK_FLAGS_xx metadata
items set at the dataset level, where xx matches the band number of a
band of the main dataset. The value of each item is a combination of
the flags GMF_ALL_VALID, GMF_PER_DATASET, GMF_ALPHA and GMF_NODATA. If a
metadata item is missing for a band, the other rules explained
above will be used to generate an on-the-fly mask band.
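
A minimal sketch of the naming conventions just described. The three
helpers are hypothetical, not GDAL functions. The sketch assumes that the
.msk name is the full main dataset name with .msk appended, and that xx
is the band number written as a plain decimal without padding.

.. code-block:: cpp

   #include <string>

   // Flag value from the API section, redefined so the sketch stands alone.
   constexpr int GMF_PER_DATASET = 0x02;

   // Hypothetical helper: name of the external mask file for a dataset.
   std::string MaskFileName(const std::string &datasetName)
   {
       return datasetName + ".msk";
   }

   // Hypothetical helper: dataset-level metadata key holding the flags
   // of the mask for the given band (unpadded decimal is an assumption).
   std::string MaskFlagsKey(int bandNumber)
   {
       return "INTERNAL_MASK_FLAGS_" + std::to_string(bandNumber);
   }

   // Hypothetical helper: number of bands expected in the .msk dataset.
   int MaskBandCount(int mainBandCount, int nFlags)
   {
       return (nFlags & GMF_PER_DATASET) ? 1 : mainBandCount;
   }

For a three-band dataset named scene.tif, this gives scene.tif.msk with
either three bands or, in the GMF_PER_DATASET case, a single band.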

Default CreateMaskBand()
------------------------

The default implementation of the CreateMaskBand() method will be
based on rules similar to the .ovr handling implemented
by the GDALDefaultOverviews object. A TIFF file with the extension
.msk will be created with the same basename as the original file, and it
will have as many bands as the original image (or just one for
GMF_PER_DATASET). The mask images will be deflate-compressed tiled
images with the same block size as the original image if possible.

The default implementation of GetFileList() will also be modified to
know about the .msk files.

CreateCopy()
------------

The GDALDriver::DefaultCreateCopy() and GDALPamDataset::CloneInfo()
methods will be updated to copy mask information if it seems necessary
and is possible. Note that NODATA, ALL_VALID and ALPHA type masks are
not copied, since they are just derived information.
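
The rule about derived masks can be sketched as a flag test. IsDerivedMask
is a hypothetical helper that only restates the sentence above; it is not
the logic actually used by DefaultCreateCopy() or CloneInfo().

.. code-block:: cpp

   // Flag values from the API section, redefined so the sketch stands alone.
   constexpr int GMF_ALL_VALID = 0x01;
   constexpr int GMF_PER_DATASET = 0x02;
   constexpr int GMF_ALPHA = 0x04;
   constexpr int GMF_NODATA = 0x08;

   // Hypothetical helper: true when the mask is derived information
   // (all valid, alpha or nodata) and therefore need not be copied.
   bool IsDerivedMask(int nFlags)
   {
       return (nFlags & (GMF_ALL_VALID | GMF_ALPHA | GMF_NODATA)) != 0;
   }

A mask reporting only GMF_PER_DATASET, or no flags at all, is not derived
and is a candidate for copying.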

Alpha Bands
-----------

When a dataset has a normal GDT_Byte alpha (transparency) band that
applies, it should be returned as the null mask, but the GetMaskFlags()
method should include GMF_ALPHA. For processing purposes any value other
than 0 should be treated as valid data, though some algorithms will
treat values between 1 and 254 as partially transparent.
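
The two readings of an alpha-derived mask value can be sketched as
follows. IsValidData and ClassifyAlpha are hypothetical helpers written
for this example and are not part of GDAL.

.. code-block:: cpp

   #include <cstdint>

   // Validity test used for general processing: anything but 0 is valid.
   bool IsValidData(std::uint8_t maskValue)
   {
       return maskValue != 0;
   }

   enum class AlphaClass { Nodata, PartiallyTransparent, Opaque };

   // Finer reading that some algorithms may apply to alpha masks.
   AlphaClass ClassifyAlpha(std::uint8_t maskValue)
   {
       if (maskValue == 0)
           return AlphaClass::Nodata;
       if (maskValue == 255)
           return AlphaClass::Opaque;
       return AlphaClass::PartiallyTransparent;  // values 1 to 254
   }

A mask value of 128 is therefore valid data for general processing and
partially transparent for algorithms that honour alpha.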

Drivers Updated
---------------

These drivers will be updated:

-  JPEG Driver: support the "zlib compressed mask appended to the file"
   approach used by a few data providers.
-  GRASS Driver: updated to support handling null values as masks.

Possibly updated:

-  HDF4 Driver: This driver might be updated to return a real
   mask if we can figure out a way.
-  SDE Driver: This driver might be updated if Howard has sufficient
   time and enthusiasm.

Utilities
---------

The gdalwarp utility and the GDAL warper algorithm will be updated to
use null masks on input. The warper algorithm already uses essentially
this model internally. For now, gdalwarp output (nodata or alpha band)
will remain unchanged. At some point in the future, support may be
added for explicitly generating null masks, but for most purposes
producing an alpha band amounts to producing a null mask.

Implementation Plan
-------------------

This change will be implemented by Frank Warmerdam in trunk in time for
the 1.5.0 release.

SWIG Implications
-----------------

The GetMaskBand(), GetMaskFlags() and CreateMaskBand() methods (and
corresponding defines) will need to be added. The mask should work like
a normal raster band for SWIG purposes, so minimal special work should be
required.

Testing
-------

The gdalautotest will be extended with the following:

-  gcore/mask.py: test default mask implementation for nodata, alpha and
   all valid cases.
-  gdriver/jpeg.py: extend with a test for "appended bitmask" case -
   creation and reading.

Interactive testing will be done for gdalwarp.
