1.. _rfc-53:
2
3=======================================================================================
4RFC 53: OGR not-null constraints and default values
5=======================================================================================
6
7Authors: Even Rouault
8
9Contact: even dot rouault at spatialys.com
10
11Status: Adopted, implemented in GDAL 2.0
12
13Summary
14-------
15
16This RFC addresses handling of NOT NULL constraints and DEFAULT values
17for OGR fields. NOT NULL constraints are useful to maintain basic data
18integrity and are handled by most (all?) drivers that have SQL
19capabilities. Default fields values may be used complementary or
20independently of NOT NULL constraints to specify the value a field must
21be assigned to if it is not provided when inserting a feature into the
22layer.
23
24NOT NULL constraint
25-------------------
26
27Up to now, OGR fields did not have NOT NULL constraints, i.e. fields in
28layers/tables were created with the possibility for a field of a
29feature/record to be unset (i.e. having a NULL value). This will still
30be the default, i.e. fields are assumed to be nullable. The OGRFieldDefn
31class is extended with a boolean attribute bNullable that defaults to
32TRUE and can be set to FALSE to express a NOT NULL constraint (bNullable
33has been preferred over bNotNullable to avoid confusion with double
34negation). Drivers that can translate NOT NULL constraints in their
35storage will use that attribute to determine if the field definition
36must include a NOT NULL constraint. When opening a datasource, their
37metadata will be inspected to set the nullable attribute properly, so
38that round-tripping works.
39
40The following methods are added to the OGRFieldDefn class
41
42::
43
44       int                 IsNullable() const { return bNullable; }
45
46   /**
47    * \brief Return whether this field can receive null values.
48    *
49    * By default, fields are nullable.
50    *
51    * Even if this method returns FALSE (i.e not-nullable field), it doesn't mean
52    * that OGRFeature::IsFieldSet() will necessary return TRUE, as fields can be
53    * temporary unset and null/not-null validation is usually done when
54    * OGRLayer::CreateFeature()/SetFeature() is called.
55    *
56    * This method is the same as the C function OGR_Fld_IsNullable().
57    *
58    * @return TRUE if the field is authorized to be null.
59    * @since GDAL 2.0
60    */
61
62       void                SetNullable( int bNullableIn ) { bNullable = bNullableIn; }
63
64   /**
65    * \brief Set whether this field can receive null values.
66    *
67    * By default, fields are nullable, so this method is generally called with FALSE
68    * to set a not-null constraint.
69    *
70    * Drivers that support writing not-null constraint will advertise the
71    * GDAL_DCAP_NOTNULL_FIELDS driver metadata item.
72    *
73    * This method is the same as the C function OGR_Fld_SetNullable().
74    *
75    * @param bNullableIn FALSE if the field must have a not-null constraint.
76    * @since GDAL 2.0
77    */
78
79As this holds true for geometry fields, those 2 methods are also add to
80the OGRGeometryFieldDefn class.
81
82Note that adding a field with a NOT NULL constraint on a non-empty layer
83is generally impossible, unless a DEFAULT value is associated with it.
84
85The following method is added to the OGRFeature class :
86
87::
88
89       int                 Validate( int nValidateFlags, int bEmitError );
90
91   /**
92    * \brief Validate that a feature meets constraints of its schema.
93    *
94    * The scope of test is specified with the nValidateFlags parameter.
95    *
96    * Regarding OGR_F_VAL_WIDTH, the test is done assuming the string width must
97    * be interpreted as the number of UTF-8 characters. Some drivers might interpret
98    * the width as the number of bytes instead. So this test is rather conservative
99    * (if it fails, then it will fail for all interpretations).
100    *
101    * This method is the same as the C function OGR_F_Validate().
102    *
103    * @param nValidateFlags OGR_F_VAL_ALL or combination of OGR_F_VAL_NULL,
104    *                       OGR_F_VAL_GEOM_TYPE, OGR_F_VAL_WIDTH and OGR_F_VAL_ALLOW_NULL_WHEN_DEFAULT
105    *                       with '|' operator
106    * @param bEmitError TRUE if a CPLError() must be emitted when a check fails
107    * @return TRUE if all enabled validation tests pass.
108    * @since GDAL 2.0
109    */
110
111where nValidateFlags is a combination of :
112
113::
114
115   /** Validate that fields respect not-null constraints.
116    * Used by OGR_F_Validate().
117    * @since GDAL 2.0
118    */
119   #define OGR_F_VAL_NULL           0x00000001
120
121   /** Validate that geometries respect geometry column type.
122    * Used by OGR_F_Validate().
123    * @since GDAL 2.0
124    */
125   #define OGR_F_VAL_GEOM_TYPE      0x00000002
126
127   /** Validate that (string) fields respect field width.
128    * Used by OGR_F_Validate().
129    * @since GDAL 2.0
130    */
131   #define OGR_F_VAL_WIDTH          0x00000004
132
133   /** Allow fields that are null when there's an associated default value.
134    * This can be used for drivers where the low-level layers will automatically set the
135    * field value to the associated default value.
136    * This flag only makes sense if OGR_F_VAL_NULL is set too.
137    * Used by OGR_F_Validate().
138    * @since GDAL 2.0
139    */
140   #define OGR_F_VAL_ALLOW_NULL_WHEN_DEFAULT       0x00000008
141
142   /** Enable all validation tests.
143    * Used by OGR_F_Validate().
144    * @since GDAL 2.0
145    */
146   #define OGR_F_VAL_ALL            0xFFFFFFFF
147
148Validation of NOT NULL constraints is generally let to the driver
149low-level layer, so OGRFeature::Validate() is only useful on a few cases
150(one of such case is the GML driver)
151
152A new flag ALTER_NULLABLE_FLAG = 0x8 is added to be passed to
153OGRLayer::AlterFieldDefn() so as to set or drop NULL / NOT-NULL
154constraints (for drivers that implement it).
155
156Drivers that handle NOT NULL constraint for regular attribute fields
157should advertise the new GDAL_DCAP_NOTNULL_FIELDS and/or
158GDAL_DCAP_NOTNULL_GEOMFIELDS driver metadata items.
159
160Drivers that do not implement the OGRLayer::CreateGeomField() interface
161(i.e. the ones that support single geometry field), but can create a
162layer with a NOT NULL constraint on the geometry field can expose a
163GEOMETRY_NULLABLE=YES/NO layer creation option.
164
165Note: due to the way they are commonly written, the CreateField()
166implementations of drivers that do not support NOT NULL constraint will
167generally copy the value of the nullable flag, which may be a bit
168misleading if querying the field definition just after having adding it
169(the same holds true for width/precision as well).
170
171All above methods are mapped into the C API :
172
173::
174
175     int    CPL_DLL OGR_Fld_IsNullable( OGRFieldDefnH hDefn );
176     void   CPL_DLL OGR_Fld_SetNullable( OGRFieldDefnH hDefn, int );
177
178   int                  CPL_DLL OGR_GFld_IsNullable( OGRGeomFieldDefnH hDefn );
179   void                 CPL_DLL OGR_GFld_SetNullable( OGRGeomFieldDefnH hDefn, int );
180
181   int    CPL_DLL OGR_F_Validate( OGRFeatureH, int nValidateFlags, int bEmitError );
182
183Default field values
184--------------------
185
186Fields with NOT NULL constraints are sometimes accompanied with a
187DEFAULT clause so as to be able to create a new feature without filling
188all fields, while maintaining integrity. DEFAULT values can also be set
189on nullable fields but for reasons exposed later it is recommended to
190avoid that.
191
192Drivers that can translate DEFAULT values in their storage will use that
193attribute to determine if the field definition must include a DEFAULT
194value. When opening a datasource, their metadata will be inspected to
195set the default value attribute properly, so that round-tripping works.
196
197There was an embryonic support for default values in GDAL 1.X but that
198never got implemented beyond the getter/setter methods on OGRFieldDefn.
199It relied on a "OGRField uDefault" member. The choice of OGRField
200restricts the default values to be expressed with the type of the field,
201but in some situations we want to be able to assign expressions or
202special keywords for non-string fields. For example the SQL standard
203defines CURRENT_TIMESTAMP for DateTime fields. So as to be general, we
204have remove this uDefault member and replaced it with a "char\*
205pszDefault" string.
206
207The values that can be set as default values are :
208
209-  literal string values enclosed in single-quote characters and
210   properly escaped like: ``'Nice weather. Isn''t it ?'``
211-  numeric values (unquoted)
212-  reserved keywords (unquoted): CURRENT_TIMESTAMP, CURRENT_DATE,
213   CURRENT_TIME, NULL
214-  datetime literal values enclosed in single-quote characters with the
215   following defined format: 'YYYY/MM/DD HH:MM:SS[.sss]'
216-  any other driver specific expression. e.g. for SQLite:
217   (strftime('%Y-%m-%dT%H:%M:%fZ','now'))
218
219The following methods are added/modified to the OGRFieldDefn class
220
221::
222
223       void                SetDefault( const char* );
224
225   /**
226    * \brief Set default field value.
227    *
228    * The default field value is taken into account by drivers (generally those with
229    * a SQL interface) that support it at field creation time. OGR will generally not
230    * automatically set the default field value to null fields by itself when calling
231    * OGRFeature::CreateFeature() / OGRFeature::SetFeature(), but will let the
232    * low-level layers to do the job. So retrieving the feature from the layer is
233    * recommended.
234    *
235    * The accepted values are NULL, a numeric value, a literal value enclosed
236    * between single quote characters (and inner single quote characters escaped by
237    * repetition of the single quote character),
238    * CURRENT_TIMESTAMP, CURRENT_TIME, CURRENT_DATE or
239    * a driver specific expression (that might be ignored by other drivers).
240    * For a datetime literal value, format should be 'YYYY/MM/DD HH:MM:SS[.sss]'
241    * (considered as UTC time).
242    *
243    * Drivers that support writing DEFAULT clauses will advertise the
244    * GDAL_DCAP_DEFAULT_FIELDS driver metadata item.
245    *
246    * This function is the same as the C function OGR_Fld_SetDefault().
247    *
248    * @param pszDefault new default field value or NULL pointer.
249    *
250    * @since GDAL 2.0
251    */
252
253
254       const char         *GetDefault() const;
255
256   /**
257    * \brief Get default field value.
258    *
259    * This function is the same as the C function OGR_Fld_GetDefault().
260    *
261    * @return default field value or NULL.
262    * @since GDAL 2.0
263    */
264
265
266       int                 IsDefaultDriverSpecific() const;
267
268   /**
269    * \brief Returns whether the default value is driver specific.
270    *
271    * Driver specific default values are those that are *not* NULL, a numeric value,
272    * a literal value enclosed between single quote characters, CURRENT_TIMESTAMP,
273    * CURRENT_TIME, CURRENT_DATE or datetime literal value.
274    *
275    * This method is the same as the C function OGR_Fld_IsDefaultDriverSpecific().
276    *
277    * @return TRUE if the default value is driver specific.
278    * @since GDAL 2.0
279    */
280
281SetDefault() validates that a string literal beginning with ' is
282properly escaped.
283
284IsDefaultDriverSpecific() returns TRUE if the value set does not belong
285to one of the 4 bullets in the above enumeration. This is used by
286drivers to determine if they can handle or not a default value.
287
288Drivers should do some effort to interpret and reformat default values
289in the above 4 standard formats so as to be able to propagate default
290values from one driver to another one.
291
292The following method is added to the OGRFeature class :
293
294::
295
296       void                FillUnsetWithDefault(int bNotNullableOnly,
297                                                char** papszOptions );
298   /**
299    * \brief Fill unset fields with default values that might be defined.
300    *
301    * This method is the same as the C function OGR_F_FillUnsetWithDefault().
302    *
303    * @param bNotNullableOnly if we should fill only unset fields with a not-null
304    *                     constraint.
305    * @param papszOptions unused currently. Must be set to NULL.
306    * @since GDAL 2.0
307    */
308
309It will replace unset fields of a feature with their default values, but
310should rarely be used as most drivers will do that substitution
311automatically in their low-level layer. CreateFeature() cannot be
312trusted to automatically modify the passed OGRFeature object to set
313unset fields to their default values. For that, an explicit GetFeature()
314call should be issued to retrieve the record as stored in the database.
315
316A new flag ALTER_DEFAULT_FLAG = 0x8 is added to be passed to
317OGRLayer::AlterFieldDefn() so as to set, drop or modify default values
318(for drivers that implement it)
319
320Drivers that handle default values should advertise the new
321GDAL_DCAP_DEFAULT_FIELDS driver metadata items.
322
323Note: due to the way they are commonly written, the CreateField()
324implementations of drivers that do not support default values will
325generally copy the value of the default value string, which may be a bit
326misleading if querying the field definition just after having adding it.
327
328All above methods are mapped into the C API :
329
330::
331
332   const char CPL_DLL *OGR_Fld_GetDefault( OGRFieldDefnH hDefn );
333   void   CPL_DLL OGR_Fld_SetDefault( OGRFieldDefnH hDefn, const char* );
334   int    CPL_DLL OGR_Fld_IsDefaultDriverSpecific( OGRFieldDefnH hDefn );
335
336   void   CPL_DLL OGR_F_FillUnsetWithDefault( OGRFeatureH hFeat,
337                                              int bNotNullableOnly,
338                                              char** papszOptions );
339
340SWIG bindings (Python / Java / C# / Perl) changes
341-------------------------------------------------
342
343The following additions have been done :
344
345-  SetNullable(), IsNullable() added on FieldDefn class
346-  SetNullable(), IsNullable() added on GeomFieldDefn class
347-  Validate() added on Feature class
348-  SetDefault(), GetDefault(), IsDefaultDriverSpecific() available on
349   FieldDefn class
350-  FillUnsetWithDefault() added on Feature class
351
352Utilities
353---------
354
355ogrinfo has been updated to expose NOT NULL constraints and DEFAULT
356values. e.g.
357
358::
359
360   Geometry Column 1 NOT NULL = WKT
361   Geometry Column 2 NOT NULL = geom2
362   id: Integer (0.0) NOT NULL DEFAULT 1234567
363   dbl: Real (0.0) NOT NULL DEFAULT 1.456
364   str: String (0.0) NOT NULL DEFAULT 'a'
365   d: Date (0.0) NOT NULL DEFAULT CURRENT_DATE
366   t: Time (0.0) NOT NULL DEFAULT CURRENT_TIME
367   dt: DateTime (0.0) NOT NULL DEFAULT CURRENT_TIMESTAMP
368   dt2: DateTime (0.0) NOT NULL DEFAULT '2013/12/11 01:23:45'
369
3702 news options have been added to ogr2ogr :
371
372-  -forceNullable to remove NOT NULL constraint (NOT NULL constraints
373   are propagated by default from source to target layer)
374-  -unsetDefault to remove DEFAULT values (DEFAULT values are propagated
375   by default from source to target layer)
376
377Unless it is explicitly specified, ogr2ogr will also automatically set
378the GEOMETRY_NULLABLE=NO creation option to target layers that support
379it, if the source layer has its first geometry field with a NOT NULL
380constraint.
381
382Documentation
383-------------
384
385New/modified API are documented.
386
387File Formats
388------------
389
390The following OGR drivers have been updated to support the new
391interfaces.
392
393-  PG: supports NOT NULL (for attribute and multiple geometry fields)
394   and DEFAULT on creation/read. AlterFieldDefn() implementation
395   modified to support ALTER_NULLABLE_FLAG and ALTER_DEFAULT_FLAG.
396-  PGDump: supports NOT NULL (for attribute and multiple geometry
397   fields) and DEFAULT on creation.
398-  CartoDB: supports NOT NULL (for attribute and single geometry fields)
399   and DEFAULT on creation. Supported also on read with authenticated
400   login only (relies on queries on PostgreSQL system tables)
401-  GPKG: supports NOT NULL (for attribute and its single geometry field)
402   and DEFAULT on creation/read. GEOMETRY_NULLABLE layer creation added.
403-  SQLite: supports NOT NULL (for attribute and multiple geometry
404   fields. Support for multiple geometry fields has been added recently
405   per #5494) and DEFAULT on creation/read. AlterFieldDefn()
406   implementation modified to support ALTER_NULLABLE_FLAG and
407   ALTER_DEFAULT_FLAG.
408-  MySQL: supports NOT NULL (for attribute fields only) and DEFAULT on
409   creation/read.
410-  OCI: supports NOT NULL (for attribute and its single geometry field)
411   and DEFAULT on creation/read. GEOMETRY_NULLABLE layer creation added.
412-  VRT: supports NOT NULL (for attribute and multiple geometry fields)
413   and DEFAULT on read, through new attributes "nullable" and "default"
414   (driver documentation and data/ogrvrt.xsd updated)
415-  GML: supports NOT NULL (for attribute and multiple geometry field) on
416   creation/read. DEFAULT not truly supported (no way to express it in
417   .xsd AFAIK), but on creation, unset fields with a NOT NULL constraint
418   and DEFAULT values will be filled by using FillUnsetWithDefault() so
419   as to generate valid XML.
420-  WFS: supports NOT NULL (for attribute fields only) on read
421-  FileGDB: supports NOT NULL (for attribute and its single geometry
422   field) on read/write. GEOMETRY_NULLABLE layer creation added. DEFAULT
423   supported for String,Integer and Real fieds on creation/read (with
424   some bugs/weird behavior seen in FileGDB SDK and E$RI tools,
425   workarounded by using the OpenFileGDB driver in problematic
426   cases...). DEFAULT supported for DateTime on read, but unsupported on
427   creation to bug in FileGDB SDK.
428-  OpenFileGDB: supports NOT NULL (for attribute and its single geometry
429   field) and DEFAULT on read
430
431MSSQLSpatial could probably support NOT NULL / DEFAULT, but has not been
432updated as part of this work.
433
434Test Suite
435----------
436
437The test suite is extended to test:
438
439-  all new methods of OGRFieldDefn, OGRGeomFieldDefn and OGRFeature in
440   ogr_feature.py
441-  updated drivers: PG, PGDump, CartoDB, GPKG, SQLite, MySQL, OCI, VRT,
442   GML, FileGDB, OpenFileGDB
443-  new options of ogr2ogr, and default behavior with NOT NULL / DEFAULT
444   propagation
445
446Compatibility Issues
447--------------------
448
449This RFC should cause few compatibility issues.
450
451Regarding API, the existing OGRFieldDefn::SetDefault() has been changed
452and GetDefaultRef() has been removed. Impact should be low as this
453wasn't used in any drivers, was documented as being prone to be removed
454in the future, and so was unlikely to be used in applications either
455(there was no C binding)
456
457When not using the new API, behavior should remain unchanged w.r.t GDAL
4581.X when operating on layers created by GDAL. If reading layers created
459by other tools, then NOT NULL and/or DEFAULT can be read, and
460propagated. We cannot exclude that propagation of NOT NULL / DEFAULT can
461cause problems in some situations. In which case the new options of
462ogr2ogr will revert to a behavior that was the one of the GDAL 1.X era.
463
464Related topics out of scope of this RFC
465---------------------------------------
466
467There might be an ambiguity between a field that has not been set and a
468field that is set to NULL. Both concepts are not distinguished in OGR
469currently, but most RDBMS are able to make such a distinction.
470
471Consider the 2 following statements :
472
473::
474
475   INSERT INTO mytable (COL1) VALUES (5)
476   INSERT INTO mytable (COL1, COL2) VALUES (5, NULL)
477
478They are not equivalent when COL2 has a default value.
479
480The behavior of the modified drivers by this RFC is to *NOT* emit NULL
481at CreateFeature() time when a field is unset, so that the low-level
482layer of the driver can replace it with its default value if it exists.
483This is generally the wished behavior.
484
485If explicit NULL insertion is wanted, then using SetFeature() afterwards
486might be needed, if supported by the drivers (some drivers will likely
487not force unset OGR fields to be NULL when composing an UPDATE
488statement), otherwise with a direct SQL UPDATE statement.
489
490In fact, this confusion between unset or NULL hurts only in the case of
491fields that are nullable and have a DEFAULT value. If making sure to
492always associate DEFAULT with NOT NULL, then it becomes a non-issue as
493the database would refuse explicit NULL values.
494
495Solving the confusion would require to add a new state to an
496instantiated field within a feature to distinguish explicit NULL from
497unset, but this would have deep impact in drivers and application code.
498
499Implementation
500--------------
501
502Implementation will be done by Even Rouault
503(`Spatialys <http://spatialys.com>`__), and sponsored by `LINZ (Land
504Information New Zealand) <http://www.linz.govt.nz/>`__.
505
506The proposed implementation lies in the "rfc53_ogr_notnull_default"
507branch of the
508`https://github.com/rouault/gdal2/tree/rfc53_ogr_notnull_default <https://github.com/rouault/gdal2/tree/rfc53_ogr_notnull_default>`__
509repository.
510
511The list of changes :
512`https://github.com/rouault/gdal2/compare/rfc53_ogr_notnull_default <https://github.com/rouault/gdal2/compare/rfc53_ogr_notnull_default>`__
513
514Voting history
515--------------
516
517+1 from JukkaR, DanielM and EvenR
518