1.. _vector.pg_advanced:
2
3PostgreSQL / PostGIS - Advanced Driver Information
4==================================================
5
6The information collected in that page deal with advanced topics, not
7found in the :ref:`OGR PostgreSQL driver Information <vector.pg>` page.
8
9Connection options related to schemas and tables
10------------------------------------------------
11
12The database opening should be significantly
13faster than in previous versions, so using tables= or schemas= options
14will not bring further noticeable speed-ups.
15
16The set of tables to be scanned can be
17overridden by specifying
18*tables=[schema.]table[(geom_column_name)][,[schema2.]table2[(geom_column_name2)],...]*
19within the connection string. If the parameter is found, the driver
20skips enumeration of the tables as described in the next paragraph.
21
22It is possible to restrict the schemas that
23will be scanned while establishing the list of tables. This can be done
24by specifying *schemas=schema_name[,schema_name2]* within the connection
25string. This can also be a way of speeding up the connection to a
26PostgreSQL database if there are a lot of schemas. Note that if only one
27schema is listed, it will also be made automatically the active schema
28(and the schema name will not prefix the layer name). Otherwise, the
29active schema is still 'public', unless otherwise specified by the
30*active_schema=* option.
31
32The active schema ('public' being the default)
33can be overridden by specifying *active_schema=schema_name* within the
34connection string. The active schema is the schema where tables are
35created or looked for when their name is not explicitly prefixed by a
36schema name. Note that this does not restrict the tables that will be
37listed (see *schemas=* option above). When getting the list of tables,
38the name of the tables within that active schema will not be prefixed by
39the schema name. For example, if you have a table 'foo' within the
40public schema, and a table 'foo' within the 'bar_schema' schema, and
41that you specify active_schema=bar_schema, 2 layers will be listed :
42'foo' (implicitly within 'bar_schema') and 'public.foo'.
43
44Multiple geometry columns
45-------------------------
46
47The PostgreSQL driver supports accessing
48tables with multiple PostGIS geometry columns.
49
50OGR supports reading, updating, creating tables with multiple
51PostGIS geometry columns (following :ref:`rfc-41`)
52For such a table, a single OGR layer will be reported with as many
53geometry fields as there are geometry columns in the table.
54
55For backward compatibility, it is also possible to query a layer with
56GetLayerByName() with a name formatted like 'foo(bar)' where 'foo' is a
57table and 'bar' a geometry column.
58
59Layers
60------
61
62Even when PostGIS is enabled, if the user
63defines the environment variable
64
65::
66
67   PG_LIST_ALL_TABLES=YES
68
69(and does not specify tables=), all regular user tables and named views
70will be treated as layers. However, tables with multiple geometry column
71will only be reported once in that mode. So this variable is mainly
72useful when PostGIS is enabled to find out tables with no spatial data,
73or views without an entry in *geometry_columns* table.
74
75In any case, all user tables can be queried explicitly with
76GetLayerByName()
77
78Regular (non-spatial) tables can be accessed, and will return features
79with attributes, but not geometry. If the table has a "wkb_geometry"
80field, it will be treated as a spatial table. The type of the field is
81inspected to determine how to read it. It can be a PostGIS **geometry**
82field, which is assumed to come back in OGC WKT, or type BYTEA or OID in
83which case it is used as a source of OGC WKB geometry.
84
85Tables inherited from spatial tables are
86supported.
87
88If there is an "ogc_fid" field, it will be used to set the feature id of
89the features, and not treated as a regular field.
90
91The layer name may be of the form "schema.table". The schema must exist,
92and the user needs to have write permissions for the target and the
93public schema.
94
95If the user defines the environment variable
96
97::
98
99   PG_SKIP_VIEWS=YES
100
101(and does not specify tables=), only the regular user tables will be
102treated as layers. The default action is to include the views. This
103variable is particularly useful when you want to copy the data into
104another format while avoiding the redundant data from the views.
105
106Named views
107-----------
108
109When PostGIS is enabled for the accessed database, named views are
110supported, provided that there is an entry in the *geometry_columns*
111tables. But, note that the AddGeometryColumn() SQL function doesn't
112accept adding an entry for a view (only for regular tables). So, that
113must usually be done by hand with a SQL statement like :
114
115::
116
117   "INSERT INTO geometry_columns VALUES ( '', 'public', 'name_of_my_view', 'name_of_geometry_column', 2, 4326, 'POINT');"
118
119It is also possible to use named views without
120inserting a row in the geometry_columns table. For that, you need to
121explicitly specify the name of the view in the "tables=" option of the
122connection string. See above. The drawback is that OGR will not be able
123to report a valid SRS and figure out the right geometry type.
124
125Retrieving FID of newly inserted feature
126----------------------------------------
127
128The FID of
129a feature (i.e. usually the value of the OGC_FID column for the feature)
130inserted into a table with CreateFeature(), in non-copy mode, will be
131retrieved from the database and can be obtained with GetFID(). One
132side-effect of this new behavior is that you must be careful if you
133re-use the same feature object in a loop that makes insertions. After
134the first iteration, the FID will be set to a non-null value, so at the
135second iteration, CreateFeature() will try to insert the new feature
136with the FID of the previous feature, which will fail as you cannot
137insert 2 features with same FID. So in that case you must explicitly
138reset the FID before calling CreateFeature(), or use a fresh feature
139object.
140
141Snippet example in Python :
142
143::
144
145       feat = ogr.Feature(lyr.GetLayerDefn())
146       for i in range(100):
147           feat.SetFID(-1)  # Reset FID to null value
148           lyr.CreateFeature(feat)
149           print('The feature has been assigned FID %d' % feat.GetFID())
150
151or :
152
153::
154
155       for i in range(100):
156           feat = ogr.Feature(lyr.GetLayerDefn())
157           lyr.CreateFeature(feat)
158           print('The feature has been assigned FID %d' % feat.GetFID())
159
160Old GDAL behavior can be obtained by setting the configuration
161option :decl_configoption:`OGR_PG_RETRIEVE_FID` to FALSE.
162
163Issues with transactions
164------------------------
165
166Efficient sequential reading in PostgreSQL requires to be done within a
167transaction (technically this is a CURSOR WITHOUT HOLD). So the PG
168driver will implicitly open such a transaction if none is currently
169opened as soon as a feature is retrieved. This transaction will be
170released if ResetReading() is called (provided that no other layer is
171still being read).
172
173If within such an implicit transaction, an explicit dataset level
174StartTransaction() is issued, the PG driver will use a SAVEPOINT to
175emulate properly the transaction behavior while making the active
176cursor on the read layer still opened.
177
178If an explicit transaction is opened with dataset level
179StartTransaction() before reading a layer, this transaction will be used
180for the cursor that iterates over the layer. When explicitly committing
181or rolling back the transaction, the cursor will become invalid, and
182ResetReading() should be issued again to restart reading from the
183beginning.
184
185As calling SetAttributeFilter() or SetSpatialFilter() implies an
186implicit ResetReading(), they have the same effect as ResetReading().
187That is to say, while an implicit transaction is in progress, the
188transaction will be committed (if no other layer is being read), and a
189new one will be started again at the next GetNextFeature() call. On the
190contrary, if they are called within an explicit transaction, the
191transaction is maintained.
192
193With the above rules, the below examples show the SQL instructions that
194are run when using the OGR API in different scenarios.
195
196::
197
198
199   lyr1->GetNextFeature()             BEGIN (implicit)
200                                      DECLARE cur1 CURSOR FOR SELECT * FROM lyr1
201                                      FETCH 1 IN cur1
202
203   lyr1->SetAttributeFilter('xxx')
204        --> lyr1->ResetReading()      CLOSE cur1
205                                      COMMIT (implicit)
206
207   lyr1->GetNextFeature()             BEGIN (implicit)
208                                      DECLARE cur1 CURSOR  FOR SELECT * FROM lyr1 WHERE xxx
209                                      FETCH 1 IN cur1
210
211   lyr2->GetNextFeature()             DECLARE cur2 CURSOR  FOR SELECT * FROM lyr2
212                                      FETCH 1 IN cur2
213
214   lyr1->GetNextFeature()             FETCH 1 IN cur1
215
216   lyr2->GetNextFeature()             FETCH 1 IN cur2
217
218   lyr1->CreateFeature(f)             INSERT INTO cur1 ...
219
220   lyr1->SetAttributeFilter('xxx')
221        --> lyr1->ResetReading()      CLOSE cur1
222                                      COMMIT (implicit)
223
224   lyr1->GetNextFeature()             DECLARE cur1 CURSOR  FOR SELECT * FROM lyr1 WHERE xxx
225                                      FETCH 1 IN cur1
226
227   lyr1->ResetReading()               CLOSE cur1
228
229   lyr2->ResetReading()               CLOSE cur2
230                                      COMMIT (implicit)
231
232   ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
233
234   ds->StartTransaction()             BEGIN
235
236   lyr1->GetNextFeature()             DECLARE cur1 CURSOR FOR SELECT * FROM lyr1
237                                      FETCH 1 IN cur1
238
239   lyr2->GetNextFeature()             DECLARE cur2 CURSOR FOR SELECT * FROM lyr2
240                                      FETCH 1 IN cur2
241
242   lyr1->CreateFeature(f)             INSERT INTO cur1 ...
243
244   lyr1->SetAttributeFilter('xxx')
245        --> lyr1->ResetReading()      CLOSE cur1
246                                      COMMIT (implicit)
247
248   lyr1->GetNextFeature()             DECLARE cur1 CURSOR  FOR SELECT * FROM lyr1 WHERE xxx
249                                      FETCH 1 IN cur1
250
251   lyr1->ResetReading()               CLOSE cur1
252
253   lyr2->ResetReading()               CLOSE cur2
254
255   ds->CommitTransaction()            COMMIT
256
257   ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
258
259   ds->StartTransaction()             BEGIN
260
261   lyr1->GetNextFeature()             DECLARE cur1 CURSOR FOR SELECT * FROM lyr1
262                                      FETCH 1 IN cur1
263
264   lyr1->CreateFeature(f)             INSERT INTO cur1 ...
265
266   ds->CommitTransaction()            CLOSE cur1 (implicit)
267                                      COMMIT
268
269   lyr1->GetNextFeature()             FETCH 1 IN cur1      ==> Error since the cursor was closed with the commit. Explicit ResetReading() required before
270
271   ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
272
273   lyr1->GetNextFeature()             BEGIN (implicit)
274                                      DECLARE cur1 CURSOR FOR SELECT * FROM lyr1
275                                      FETCH 1 IN cur1
276
277   ds->StartTransaction()             SAVEPOINT savepoint
278
279   lyr1->CreateFeature(f)             INSERT INTO cur1 ...
280
281   ds->CommitTransaction()            RELEASE SAVEPOINT savepoint
282
283   lyr1->ResetReading()               CLOSE cur1
284                                      COMMIT (implicit)
285
286Note: in reality, the PG drivers fetches 500 features at once. The FETCH
2871 is for clarity of the explanation.
288
289Advanced Examples
290-----------------
291
292-  This example shows using ogrinfo to list only the layers specified by
293   the *tables=* options.
294
295   ::
296
297      ogrinfo -ro PG:'dbname=warmerda tables=table1,table2'
298
299-  This example shows using ogrinfo to query a table 'foo' with multiple
300   geometry columns ('geom1' and 'geom2').
301
302   ::
303
304      ogrinfo -ro -al PG:dbname=warmerda 'foo(geom2)'
305
306-  This example show how to list only the layers inside the schema
307   apt200810 and apt200812. The layer names will be prefixed by the name
308   of the schema they belong to.
309
310   ::
311
312      ogrinfo -ro PG:'dbname=warmerda schemas=apt200810,apt200812'
313
314-  This example shows using ogrinfo to list only the layers inside the
315   schema named apt200810. Note that the layer names will not be
316   prefixed by apt200810 as only one schema is listed.
317
318   ::
319
320      ogrinfo -ro PG:'dbname=warmerda schemas=apt200810'
321
322-  This example shows how to convert a set of shapefiles inside the
323   apt200810 directory into an existing Postgres schema apt200810. In
324   that example, we could have use the schemas= option instead.
325
326   ::
327
328      ogr2ogr -f PostgreSQL "PG:dbname=warmerda active_schema=apt200810" apt200810
329
330-  This example shows how to convert all the tables inside the schema
331   apt200810 as a set of shapefiles inside the apt200810 directory. Note
332   that the layer names will not be prefixed by apt200810 as only one
333   schema is listed
334
335   ::
336
337      ogr2ogr apt200810 PG:'dbname=warmerda schemas=apt200810'
338
339-  This example shows how to overwrite an existing table in an existing
340   schema. Note the use of -nln to specify the qualified layer name.
341
342   ::
343
344      ogr2ogr -overwrite -f PostgreSQL "PG:dbname=warmerda" mytable.shp mytable -nln myschema.mytable
345
346   Note that using -lco SCHEMA=mytable instead of -nln would not have
347   worked in that case (see
348   `#2821 <http://trac.osgeo.org/gdal/ticket/2821>`__ for more details).
349
350   If you need to overwrite many tables located in a schema at once, the
351   -nln option is not the more appropriate, so it might be more
352   convenient to use the active_schema connection string.
353   The following example will overwrite, if necessary, all
354   the PostgreSQL tables corresponding to a set of shapefiles inside the
355   apt200810 directory :
356
357   ::
358
359      ogr2ogr -overwrite -f PostgreSQL "PG:dbname=warmerda active_schema=apt200810" apt200810
360
361See Also
362--------
363
364-  :ref:`OGR PostgreSQL driver Information <vector.pg>`
365