1.. _rfc-53: 2 3======================================================================================= 4RFC 53: OGR not-null constraints and default values 5======================================================================================= 6 7Authors: Even Rouault 8 9Contact: even dot rouault at spatialys.com 10 11Status: Adopted, implemented in GDAL 2.0 12 13Summary 14------- 15 16This RFC addresses handling of NOT NULL constraints and DEFAULT values 17for OGR fields. NOT NULL constraints are useful to maintain basic data 18integrity and are handled by most (all?) drivers that have SQL 19capabilities. Default fields values may be used complementary or 20independently of NOT NULL constraints to specify the value a field must 21be assigned to if it is not provided when inserting a feature into the 22layer. 23 24NOT NULL constraint 25------------------- 26 27Up to now, OGR fields did not have NOT NULL constraints, i.e. fields in 28layers/tables were created with the possibility for a field of a 29feature/record to be unset (i.e. having a NULL value). This will still 30be the default, i.e. fields are assumed to be nullable. The OGRFieldDefn 31class is extended with a boolean attribute bNullable that defaults to 32TRUE and can be set to FALSE to express a NOT NULL constraint (bNullable 33has been preferred over bNotNullable to avoid confusion with double 34negation). Drivers that can translate NOT NULL constraints in their 35storage will use that attribute to determine if the field definition 36must include a NOT NULL constraint. When opening a datasource, their 37metadata will be inspected to set the nullable attribute properly, so 38that round-tripping works. 39 40The following methods are added to the OGRFieldDefn class 41 42:: 43 44 int IsNullable() const { return bNullable; } 45 46 /** 47 * \brief Return whether this field can receive null values. 48 * 49 * By default, fields are nullable. 50 * 51 * Even if this method returns FALSE (i.e not-nullable field), it doesn't mean 52 * that OGRFeature::IsFieldSet() will necessary return TRUE, as fields can be 53 * temporary unset and null/not-null validation is usually done when 54 * OGRLayer::CreateFeature()/SetFeature() is called. 55 * 56 * This method is the same as the C function OGR_Fld_IsNullable(). 57 * 58 * @return TRUE if the field is authorized to be null. 59 * @since GDAL 2.0 60 */ 61 62 void SetNullable( int bNullableIn ) { bNullable = bNullableIn; } 63 64 /** 65 * \brief Set whether this field can receive null values. 66 * 67 * By default, fields are nullable, so this method is generally called with FALSE 68 * to set a not-null constraint. 69 * 70 * Drivers that support writing not-null constraint will advertise the 71 * GDAL_DCAP_NOTNULL_FIELDS driver metadata item. 72 * 73 * This method is the same as the C function OGR_Fld_SetNullable(). 74 * 75 * @param bNullableIn FALSE if the field must have a not-null constraint. 76 * @since GDAL 2.0 77 */ 78 79As this holds true for geometry fields, those 2 methods are also add to 80the OGRGeometryFieldDefn class. 81 82Note that adding a field with a NOT NULL constraint on a non-empty layer 83is generally impossible, unless a DEFAULT value is associated with it. 84 85The following method is added to the OGRFeature class : 86 87:: 88 89 int Validate( int nValidateFlags, int bEmitError ); 90 91 /** 92 * \brief Validate that a feature meets constraints of its schema. 93 * 94 * The scope of test is specified with the nValidateFlags parameter. 95 * 96 * Regarding OGR_F_VAL_WIDTH, the test is done assuming the string width must 97 * be interpreted as the number of UTF-8 characters. Some drivers might interpret 98 * the width as the number of bytes instead. So this test is rather conservative 99 * (if it fails, then it will fail for all interpretations). 100 * 101 * This method is the same as the C function OGR_F_Validate(). 102 * 103 * @param nValidateFlags OGR_F_VAL_ALL or combination of OGR_F_VAL_NULL, 104 * OGR_F_VAL_GEOM_TYPE, OGR_F_VAL_WIDTH and OGR_F_VAL_ALLOW_NULL_WHEN_DEFAULT 105 * with '|' operator 106 * @param bEmitError TRUE if a CPLError() must be emitted when a check fails 107 * @return TRUE if all enabled validation tests pass. 108 * @since GDAL 2.0 109 */ 110 111where nValidateFlags is a combination of : 112 113:: 114 115 /** Validate that fields respect not-null constraints. 116 * Used by OGR_F_Validate(). 117 * @since GDAL 2.0 118 */ 119 #define OGR_F_VAL_NULL 0x00000001 120 121 /** Validate that geometries respect geometry column type. 122 * Used by OGR_F_Validate(). 123 * @since GDAL 2.0 124 */ 125 #define OGR_F_VAL_GEOM_TYPE 0x00000002 126 127 /** Validate that (string) fields respect field width. 128 * Used by OGR_F_Validate(). 129 * @since GDAL 2.0 130 */ 131 #define OGR_F_VAL_WIDTH 0x00000004 132 133 /** Allow fields that are null when there's an associated default value. 134 * This can be used for drivers where the low-level layers will automatically set the 135 * field value to the associated default value. 136 * This flag only makes sense if OGR_F_VAL_NULL is set too. 137 * Used by OGR_F_Validate(). 138 * @since GDAL 2.0 139 */ 140 #define OGR_F_VAL_ALLOW_NULL_WHEN_DEFAULT 0x00000008 141 142 /** Enable all validation tests. 143 * Used by OGR_F_Validate(). 144 * @since GDAL 2.0 145 */ 146 #define OGR_F_VAL_ALL 0xFFFFFFFF 147 148Validation of NOT NULL constraints is generally let to the driver 149low-level layer, so OGRFeature::Validate() is only useful on a few cases 150(one of such case is the GML driver) 151 152A new flag ALTER_NULLABLE_FLAG = 0x8 is added to be passed to 153OGRLayer::AlterFieldDefn() so as to set or drop NULL / NOT-NULL 154constraints (for drivers that implement it). 155 156Drivers that handle NOT NULL constraint for regular attribute fields 157should advertise the new GDAL_DCAP_NOTNULL_FIELDS and/or 158GDAL_DCAP_NOTNULL_GEOMFIELDS driver metadata items. 159 160Drivers that do not implement the OGRLayer::CreateGeomField() interface 161(i.e. the ones that support single geometry field), but can create a 162layer with a NOT NULL constraint on the geometry field can expose a 163GEOMETRY_NULLABLE=YES/NO layer creation option. 164 165Note: due to the way they are commonly written, the CreateField() 166implementations of drivers that do not support NOT NULL constraint will 167generally copy the value of the nullable flag, which may be a bit 168misleading if querying the field definition just after having adding it 169(the same holds true for width/precision as well). 170 171All above methods are mapped into the C API : 172 173:: 174 175 int CPL_DLL OGR_Fld_IsNullable( OGRFieldDefnH hDefn ); 176 void CPL_DLL OGR_Fld_SetNullable( OGRFieldDefnH hDefn, int ); 177 178 int CPL_DLL OGR_GFld_IsNullable( OGRGeomFieldDefnH hDefn ); 179 void CPL_DLL OGR_GFld_SetNullable( OGRGeomFieldDefnH hDefn, int ); 180 181 int CPL_DLL OGR_F_Validate( OGRFeatureH, int nValidateFlags, int bEmitError ); 182 183Default field values 184-------------------- 185 186Fields with NOT NULL constraints are sometimes accompanied with a 187DEFAULT clause so as to be able to create a new feature without filling 188all fields, while maintaining integrity. DEFAULT values can also be set 189on nullable fields but for reasons exposed later it is recommended to 190avoid that. 191 192Drivers that can translate DEFAULT values in their storage will use that 193attribute to determine if the field definition must include a DEFAULT 194value. When opening a datasource, their metadata will be inspected to 195set the default value attribute properly, so that round-tripping works. 196 197There was an embryonic support for default values in GDAL 1.X but that 198never got implemented beyond the getter/setter methods on OGRFieldDefn. 199It relied on a "OGRField uDefault" member. The choice of OGRField 200restricts the default values to be expressed with the type of the field, 201but in some situations we want to be able to assign expressions or 202special keywords for non-string fields. For example the SQL standard 203defines CURRENT_TIMESTAMP for DateTime fields. So as to be general, we 204have remove this uDefault member and replaced it with a "char\* 205pszDefault" string. 206 207The values that can be set as default values are : 208 209- literal string values enclosed in single-quote characters and 210 properly escaped like: ``'Nice weather. Isn''t it ?'`` 211- numeric values (unquoted) 212- reserved keywords (unquoted): CURRENT_TIMESTAMP, CURRENT_DATE, 213 CURRENT_TIME, NULL 214- datetime literal values enclosed in single-quote characters with the 215 following defined format: 'YYYY/MM/DD HH:MM:SS[.sss]' 216- any other driver specific expression. e.g. for SQLite: 217 (strftime('%Y-%m-%dT%H:%M:%fZ','now')) 218 219The following methods are added/modified to the OGRFieldDefn class 220 221:: 222 223 void SetDefault( const char* ); 224 225 /** 226 * \brief Set default field value. 227 * 228 * The default field value is taken into account by drivers (generally those with 229 * a SQL interface) that support it at field creation time. OGR will generally not 230 * automatically set the default field value to null fields by itself when calling 231 * OGRFeature::CreateFeature() / OGRFeature::SetFeature(), but will let the 232 * low-level layers to do the job. So retrieving the feature from the layer is 233 * recommended. 234 * 235 * The accepted values are NULL, a numeric value, a literal value enclosed 236 * between single quote characters (and inner single quote characters escaped by 237 * repetition of the single quote character), 238 * CURRENT_TIMESTAMP, CURRENT_TIME, CURRENT_DATE or 239 * a driver specific expression (that might be ignored by other drivers). 240 * For a datetime literal value, format should be 'YYYY/MM/DD HH:MM:SS[.sss]' 241 * (considered as UTC time). 242 * 243 * Drivers that support writing DEFAULT clauses will advertise the 244 * GDAL_DCAP_DEFAULT_FIELDS driver metadata item. 245 * 246 * This function is the same as the C function OGR_Fld_SetDefault(). 247 * 248 * @param pszDefault new default field value or NULL pointer. 249 * 250 * @since GDAL 2.0 251 */ 252 253 254 const char *GetDefault() const; 255 256 /** 257 * \brief Get default field value. 258 * 259 * This function is the same as the C function OGR_Fld_GetDefault(). 260 * 261 * @return default field value or NULL. 262 * @since GDAL 2.0 263 */ 264 265 266 int IsDefaultDriverSpecific() const; 267 268 /** 269 * \brief Returns whether the default value is driver specific. 270 * 271 * Driver specific default values are those that are *not* NULL, a numeric value, 272 * a literal value enclosed between single quote characters, CURRENT_TIMESTAMP, 273 * CURRENT_TIME, CURRENT_DATE or datetime literal value. 274 * 275 * This method is the same as the C function OGR_Fld_IsDefaultDriverSpecific(). 276 * 277 * @return TRUE if the default value is driver specific. 278 * @since GDAL 2.0 279 */ 280 281SetDefault() validates that a string literal beginning with ' is 282properly escaped. 283 284IsDefaultDriverSpecific() returns TRUE if the value set does not belong 285to one of the 4 bullets in the above enumeration. This is used by 286drivers to determine if they can handle or not a default value. 287 288Drivers should do some effort to interpret and reformat default values 289in the above 4 standard formats so as to be able to propagate default 290values from one driver to another one. 291 292The following method is added to the OGRFeature class : 293 294:: 295 296 void FillUnsetWithDefault(int bNotNullableOnly, 297 char** papszOptions ); 298 /** 299 * \brief Fill unset fields with default values that might be defined. 300 * 301 * This method is the same as the C function OGR_F_FillUnsetWithDefault(). 302 * 303 * @param bNotNullableOnly if we should fill only unset fields with a not-null 304 * constraint. 305 * @param papszOptions unused currently. Must be set to NULL. 306 * @since GDAL 2.0 307 */ 308 309It will replace unset fields of a feature with their default values, but 310should rarely be used as most drivers will do that substitution 311automatically in their low-level layer. CreateFeature() cannot be 312trusted to automatically modify the passed OGRFeature object to set 313unset fields to their default values. For that, an explicit GetFeature() 314call should be issued to retrieve the record as stored in the database. 315 316A new flag ALTER_DEFAULT_FLAG = 0x8 is added to be passed to 317OGRLayer::AlterFieldDefn() so as to set, drop or modify default values 318(for drivers that implement it) 319 320Drivers that handle default values should advertise the new 321GDAL_DCAP_DEFAULT_FIELDS driver metadata items. 322 323Note: due to the way they are commonly written, the CreateField() 324implementations of drivers that do not support default values will 325generally copy the value of the default value string, which may be a bit 326misleading if querying the field definition just after having adding it. 327 328All above methods are mapped into the C API : 329 330:: 331 332 const char CPL_DLL *OGR_Fld_GetDefault( OGRFieldDefnH hDefn ); 333 void CPL_DLL OGR_Fld_SetDefault( OGRFieldDefnH hDefn, const char* ); 334 int CPL_DLL OGR_Fld_IsDefaultDriverSpecific( OGRFieldDefnH hDefn ); 335 336 void CPL_DLL OGR_F_FillUnsetWithDefault( OGRFeatureH hFeat, 337 int bNotNullableOnly, 338 char** papszOptions ); 339 340SWIG bindings (Python / Java / C# / Perl) changes 341------------------------------------------------- 342 343The following additions have been done : 344 345- SetNullable(), IsNullable() added on FieldDefn class 346- SetNullable(), IsNullable() added on GeomFieldDefn class 347- Validate() added on Feature class 348- SetDefault(), GetDefault(), IsDefaultDriverSpecific() available on 349 FieldDefn class 350- FillUnsetWithDefault() added on Feature class 351 352Utilities 353--------- 354 355ogrinfo has been updated to expose NOT NULL constraints and DEFAULT 356values. e.g. 357 358:: 359 360 Geometry Column 1 NOT NULL = WKT 361 Geometry Column 2 NOT NULL = geom2 362 id: Integer (0.0) NOT NULL DEFAULT 1234567 363 dbl: Real (0.0) NOT NULL DEFAULT 1.456 364 str: String (0.0) NOT NULL DEFAULT 'a' 365 d: Date (0.0) NOT NULL DEFAULT CURRENT_DATE 366 t: Time (0.0) NOT NULL DEFAULT CURRENT_TIME 367 dt: DateTime (0.0) NOT NULL DEFAULT CURRENT_TIMESTAMP 368 dt2: DateTime (0.0) NOT NULL DEFAULT '2013/12/11 01:23:45' 369 3702 news options have been added to ogr2ogr : 371 372- -forceNullable to remove NOT NULL constraint (NOT NULL constraints 373 are propagated by default from source to target layer) 374- -unsetDefault to remove DEFAULT values (DEFAULT values are propagated 375 by default from source to target layer) 376 377Unless it is explicitly specified, ogr2ogr will also automatically set 378the GEOMETRY_NULLABLE=NO creation option to target layers that support 379it, if the source layer has its first geometry field with a NOT NULL 380constraint. 381 382Documentation 383------------- 384 385New/modified API are documented. 386 387File Formats 388------------ 389 390The following OGR drivers have been updated to support the new 391interfaces. 392 393- PG: supports NOT NULL (for attribute and multiple geometry fields) 394 and DEFAULT on creation/read. AlterFieldDefn() implementation 395 modified to support ALTER_NULLABLE_FLAG and ALTER_DEFAULT_FLAG. 396- PGDump: supports NOT NULL (for attribute and multiple geometry 397 fields) and DEFAULT on creation. 398- CartoDB: supports NOT NULL (for attribute and single geometry fields) 399 and DEFAULT on creation. Supported also on read with authenticated 400 login only (relies on queries on PostgreSQL system tables) 401- GPKG: supports NOT NULL (for attribute and its single geometry field) 402 and DEFAULT on creation/read. GEOMETRY_NULLABLE layer creation added. 403- SQLite: supports NOT NULL (for attribute and multiple geometry 404 fields. Support for multiple geometry fields has been added recently 405 per #5494) and DEFAULT on creation/read. AlterFieldDefn() 406 implementation modified to support ALTER_NULLABLE_FLAG and 407 ALTER_DEFAULT_FLAG. 408- MySQL: supports NOT NULL (for attribute fields only) and DEFAULT on 409 creation/read. 410- OCI: supports NOT NULL (for attribute and its single geometry field) 411 and DEFAULT on creation/read. GEOMETRY_NULLABLE layer creation added. 412- VRT: supports NOT NULL (for attribute and multiple geometry fields) 413 and DEFAULT on read, through new attributes "nullable" and "default" 414 (driver documentation and data/ogrvrt.xsd updated) 415- GML: supports NOT NULL (for attribute and multiple geometry field) on 416 creation/read. DEFAULT not truly supported (no way to express it in 417 .xsd AFAIK), but on creation, unset fields with a NOT NULL constraint 418 and DEFAULT values will be filled by using FillUnsetWithDefault() so 419 as to generate valid XML. 420- WFS: supports NOT NULL (for attribute fields only) on read 421- FileGDB: supports NOT NULL (for attribute and its single geometry 422 field) on read/write. GEOMETRY_NULLABLE layer creation added. DEFAULT 423 supported for String,Integer and Real fieds on creation/read (with 424 some bugs/weird behavior seen in FileGDB SDK and E$RI tools, 425 workarounded by using the OpenFileGDB driver in problematic 426 cases...). DEFAULT supported for DateTime on read, but unsupported on 427 creation to bug in FileGDB SDK. 428- OpenFileGDB: supports NOT NULL (for attribute and its single geometry 429 field) and DEFAULT on read 430 431MSSQLSpatial could probably support NOT NULL / DEFAULT, but has not been 432updated as part of this work. 433 434Test Suite 435---------- 436 437The test suite is extended to test: 438 439- all new methods of OGRFieldDefn, OGRGeomFieldDefn and OGRFeature in 440 ogr_feature.py 441- updated drivers: PG, PGDump, CartoDB, GPKG, SQLite, MySQL, OCI, VRT, 442 GML, FileGDB, OpenFileGDB 443- new options of ogr2ogr, and default behavior with NOT NULL / DEFAULT 444 propagation 445 446Compatibility Issues 447-------------------- 448 449This RFC should cause few compatibility issues. 450 451Regarding API, the existing OGRFieldDefn::SetDefault() has been changed 452and GetDefaultRef() has been removed. Impact should be low as this 453wasn't used in any drivers, was documented as being prone to be removed 454in the future, and so was unlikely to be used in applications either 455(there was no C binding) 456 457When not using the new API, behavior should remain unchanged w.r.t GDAL 4581.X when operating on layers created by GDAL. If reading layers created 459by other tools, then NOT NULL and/or DEFAULT can be read, and 460propagated. We cannot exclude that propagation of NOT NULL / DEFAULT can 461cause problems in some situations. In which case the new options of 462ogr2ogr will revert to a behavior that was the one of the GDAL 1.X era. 463 464Related topics out of scope of this RFC 465--------------------------------------- 466 467There might be an ambiguity between a field that has not been set and a 468field that is set to NULL. Both concepts are not distinguished in OGR 469currently, but most RDBMS are able to make such a distinction. 470 471Consider the 2 following statements : 472 473:: 474 475 INSERT INTO mytable (COL1) VALUES (5) 476 INSERT INTO mytable (COL1, COL2) VALUES (5, NULL) 477 478They are not equivalent when COL2 has a default value. 479 480The behavior of the modified drivers by this RFC is to *NOT* emit NULL 481at CreateFeature() time when a field is unset, so that the low-level 482layer of the driver can replace it with its default value if it exists. 483This is generally the wished behavior. 484 485If explicit NULL insertion is wanted, then using SetFeature() afterwards 486might be needed, if supported by the drivers (some drivers will likely 487not force unset OGR fields to be NULL when composing an UPDATE 488statement), otherwise with a direct SQL UPDATE statement. 489 490In fact, this confusion between unset or NULL hurts only in the case of 491fields that are nullable and have a DEFAULT value. If making sure to 492always associate DEFAULT with NOT NULL, then it becomes a non-issue as 493the database would refuse explicit NULL values. 494 495Solving the confusion would require to add a new state to an 496instantiated field within a feature to distinguish explicit NULL from 497unset, but this would have deep impact in drivers and application code. 498 499Implementation 500-------------- 501 502Implementation will be done by Even Rouault 503(`Spatialys <http://spatialys.com>`__), and sponsored by `LINZ (Land 504Information New Zealand) <http://www.linz.govt.nz/>`__. 505 506The proposed implementation lies in the "rfc53_ogr_notnull_default" 507branch of the 508`https://github.com/rouault/gdal2/tree/rfc53_ogr_notnull_default <https://github.com/rouault/gdal2/tree/rfc53_ogr_notnull_default>`__ 509repository. 510 511The list of changes : 512`https://github.com/rouault/gdal2/compare/rfc53_ogr_notnull_default <https://github.com/rouault/gdal2/compare/rfc53_ogr_notnull_default>`__ 513 514Voting history 515-------------- 516 517+1 from JukkaR, DanielM and EvenR 518