.. _vector.gmlas:

GMLAS - Geography Markup Language (GML) driven by application schemas
=====================================================================

.. versionadded:: 2.2

.. shortname:: GMLAS

.. build_dependencies:: Xerces

This driver can read and write XML files of arbitrary structure,
including those containing so-called Complex Features, provided that they
are accompanied by one or several XML schemas that describe the
structure of their content. While this driver is generic to any XML
schema, its main target is reading and writing documents that reference,
directly or indirectly, the GML namespace.

The driver requires Xerces-C >= 3.1.

The driver can deal with files of arbitrary size with very modest RAM
usage, because it works in streaming mode.

Driver capabilities
-------------------

.. supports_georeferencing::

.. supports_virtualio::

Opening syntax
--------------

The connection string is GMLAS:/path/to/the.gml. Note the GMLAS: prefix.
If this prefix is omitted, then the GML driver is likely to be used.

It is also possible to use only "GMLAS:" as the connection string, but
in that case the schemas must be explicitly provided with the XSD open
option.

Mapping of XML structure to OGR layers and fields
-------------------------------------------------

The driver scans the XML schemas referenced by the XML/GML to build the
OGR layers and fields. It is strictly required that the schemas,
directly or indirectly used, are fully valid. The content of the XML/GML
file itself is marginally used, mostly to determine the SRS of geometry
columns.

XML elements declared at the top level of a schema will generally be
exposed as OGR layers. Their attributes and sub-elements of simple XML
types (string, integer, real, ...) will be exposed as OGR fields. For
sub-elements of complex type, different cases can happen.
If the cardinality of the sub-element is at most one and it is not
referenced by other elements, then it is "flattened" into its enclosing
element. Otherwise it will be exposed as an OGR layer, with either a link
to its "parent" layer if the sub-element is specific to its parent
element, or through a junction table if the sub-element is shared by
several parents.

By default the driver tolerates documents that do not strictly conform to
the schemas. Unexpected content in the document will be silently
ignored, as will content required by the schema but absent from the
document.

Consult the :ref:`GMLAS mapping examples <gmlas_mapping_examples>`
page for more details.

By default in the configuration, swe:DataRecord and swe:DataArray
elements from the Sensor Web Enablement (SWE) Common Data Model
namespace receive special processing, so that they are mapped more
naturally to OGR concepts. The swe:field elements will be mapped as OGR
fields, and the swe:values element of a swe:DataArray will be parsed
into OGR features in a dedicated layer for each swe:DataArray. Note that
this convenience mapping is read-only. When using the write side of the
driver, only the content of the general mapping mechanisms will be used.

Metadata layers
---------------

Four special layers, "_ogr_fields_metadata", "_ogr_layers_metadata",
"_ogr_layer_relationships" and "_ogr_other_metadata", add extra
information to what the OGR data model already reports about OGR layers
and fields.

Those layers are exposed if the EXPOSE_METADATA_LAYERS open option is
set to YES (or if enabled in the configuration). They can also be
individually retrieved by specifying their name in calls to
GetLayerByName(), or as layer names with the ogrinfo and ogr2ogr
utilities.

Consult the :ref:`GMLAS metadata layers <gmlas_metadata_layers>`
page for more details.

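
As a minimal sketch of the two access methods described in this section,
assuming a document named my.gml (a placeholder) whose schemas can be
resolved, the metadata layers could be inspected with ogrinfo as follows:

::

   # Placeholder connection string; replace my.gml with a real document
   src="GMLAS:my.gml"

   # Expose all metadata layers alongside the regular layers
   ogrinfo -ro "$src" -oo EXPOSE_METADATA_LAYERS=YES

   # Or request a single metadata layer by name and dump its features
   ogrinfo -ro -al "$src" _ogr_layers_metadata

The second form does not need the open option, since the layer is
requested explicitly by name.
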

Configuration file
------------------

A default configuration file,
`gmlasconf.xml <http://github.com/OSGeo/gdal/blob/master/gdal/data/gmlasconf.xml>`__,
is provided in the data directory of the GDAL installation. Its
structure and content are documented in the
`gmlasconf.xsd <http://github.com/OSGeo/gdal/blob/master/gdal/data/gmlasconf.xsd>`__
schema.

This configuration file enables the user to modify the following
settings:

- whether remote schemas should be downloaded. Enabled by default.
- whether the local cache of schemas is enabled. Enabled by default.
- the path of the local cache. By default, $HOME/.gdal/gmlas_xsd_cache
- whether validation of the document against the schemas should be
  enabled. Disabled by default.
- whether a validation error should cause dataset opening to fail.
  Disabled by default.
- whether the metadata layers should be exposed by default. Disabled by
  default.
- whether an 'ogr_pkid' field should always be generated. Disabled by
  default. Turning that on can be useful on layers that have an ID
  attribute whose uniqueness is not guaranteed among various documents,
  which could cause issues when appending several documents into a
  target database table.
- whether layers and fields that are not used in the XML document
  should be removed. Disabled by default.
- whether OGR array data types can be used. Enabled by default.
- whether the XML definition of the GML geometry should be reported as
  an OGR string field. Disabled by default.
- whether only XML elements that derive from gml:_Feature or
  gml:AbstractFeature should be considered in the initial pass of the
  schema building, when at least one element in the schemas derives from
  them. Enabled by default.
- several rules to configure if and how xlink:href should be resolved.
- a definition of XPaths of elements and attributes that must be
  ignored, so as to reduce the number of OGR layers and fields.

This file can be adapted, and modified versions can be provided to the
driver with the CONFIG_FILE open option. None of the elements of the
configuration file are required. When they are absent, the default value
indicated in the schema documentation is used.

Configuration can also be provided through other open options. Note that
some open options have identical names to settings present in the
configuration file. When such an open option is provided, its value
will override the one of the configuration file (either the default one,
or the one provided through the CONFIG_FILE open option).

Geometry support
----------------

XML schemas only indicate the geometry type but do not constrain the
spatial reference systems (SRS), so it is theoretically possible to have
object instances of the same class having different SRS for the same
geometry field. This is not practical to deal with, so when geometry
fields are detected, an initial scan of the document is done to find the
first geometry of each geometry field that has an explicit srsName set.
This one will be used for the whole geometry field. If other geometries
of the same field have a different SRS, they will be reprojected.

By default, only the OGR geometry built from the GML geometry is exposed
in the OGR feature. It is possible to change the IncludeGeometryXML
setting of the configuration file to true so as to expose an OGR string
field with the XML definition of the GML geometry.

Performance issues with large multi-layer GML files.
----------------------------------------------------

Traditionally, to read an OGR datasource, one iterates over layers with
GDALDataset::GetLayer(), and for each layer one iterates over features
with OGRLayer::GetNextFeature().
While this approach still works for the GMLAS driver, it may result in
very poor performance on big documents or documents using complex
schemas that are translated into many OGR layers.

It is thus recommended to use GDALDataset::GetNextFeature() to iterate
over features as soon as they appear in the .gml/.xml file. This may
return features from non-sequential layers, when the features include
nested elements.

Open options
------------

- **XSD**\ =filename(s): to specify an explicit XSD application schema
  to use (or a list of filenames, provided they are comma separated).
  "http://" or "https://" URLs can be used. This option is not required
  when the XML/GML document has a schemaLocation attribute with valid
  links in its root element.
- **CONFIG_FILE**\ =filename or inline XML definition: filename of a
  XML configuration file conforming to the
  `gmlasconf.xsd <https://github.com/OSGeo/gdal/blob/master/gdal/data/gmlasconf.xsd>`__
  schema. It is also possible to provide the XML content directly
  inlined provided that the very first characters are <Configuration.
- **EXPOSE_METADATA_LAYERS**\ =YES/NO: whether the metadata layers
  "_ogr_fields_metadata", "_ogr_layers_metadata",
  "_ogr_layer_relationships" and "_ogr_other_metadata" should be
  reported by default. Default is NO.
- **VALIDATE**\ =YES/NO: whether the document should be validated
  against the schemas. Validation is done at dataset opening. Default
  is NO.
- **FAIL_IF_VALIDATION_ERROR**\ =YES/NO: Whether a validation error
  should cause dataset opening to fail (only used if VALIDATE=YES).
  Default is NO.
- **REFRESH_CACHE**\ =YES/NO: Whether remote schemas and documents
  pointed to by xlink:href links should be downloaded from the server
  even if already present in the local cache. If the cache is enabled,
  it will be refreshed with the newly downloaded resources. Default is
  NO.
- **SWAP_COORDINATES**\ =AUTO/YES/NO: Whether the order of the x/y or
  long/lat coordinates should be swapped. In AUTO mode, the driver will
  determine if swapping must be done from the srsName. If the srsName
  is urn:ogc:def:crs:EPSG::XXXX and the order of coordinates in
  the EPSG database for this SRS is lat,long or northing,easting, then
  the driver will swap them to the GIS friendly order (long,lat or
  easting,northing). For other forms of SRS (such as EPSG:XXXX), GIS
  friendly order is assumed and thus no swapping is done. When
  SWAP_COORDINATES is set to YES, coordinates will always be swapped
  relative to the order in which they appear in the GML, and when it is
  set to NO, they will be kept in the same order. The default is AUTO.
- **REMOVE_UNUSED_LAYERS**\ =YES/NO: Whether unused layers should be
  removed from the reported layers. Defaults to NO.
- **REMOVE_UNUSED_FIELDS**\ =YES/NO: Whether unused fields should be
  removed from the reported layers. Defaults to NO.
- **HANDLE_MULTIPLE_IMPORTS**\ =YES/NO: Whether multiple imports with
  the same namespace but different schema are allowed. Defaults to NO.
- **SCHEMA_FULL_CHECKING**\ =YES/NO: Whether to be pedantic with XSD
  checking or to be forgiving, e.g. if the invalid part of the schema is
  not referenced in the main document. Defaults to NO.

Creation support
----------------

The GMLAS driver can write XML documents in a schema-driven way by
converting a source dataset (unlike most other drivers with write
support, which implement the CreateLayer() and CreateFeature()
interfaces). The typical workflow is to use the read side of the GMLAS
driver to produce a SQLite/Spatialite/PostGIS database, potentially
modify the imported features, and re-export this database as a new XML
document.


The driver will identify in the source dataset "top-level" layers, and
in those layers will find which features are not referenced by other
top-level layers. As the creation of the output XML is schema-driven,
the schemas need to be available. There are two possible ways:

- either the result of the processing of the schemas was stored as the
  4 \_ogr_\* metadata tables in the source dataset (by using the
  EXPOSE_METADATA_LAYERS=YES open option when converting the source
  .xml),
- or the schemas can be specified at creation time with the INPUT_XSD
  creation option.

By default, the driver will "wrap" the features inside a WFS 2.0
wfs:FeatureCollection / wfs:member element. It is also possible to ask
the driver to create instead a custom wrapping .xsd file that declares
the ogr_gmlas:FeatureCollection / ogr_gmlas:featureMember XML elements.

Note that while the file resulting from the export should be well-formed
XML, there is no strong guarantee that it will validate against the
additional constraints expressed in the XML schema(s). This will depend
on the content of the features (for example, if converting from a GML
file that is not conformant to the schemas, the output of the driver
will generally not validate).

If the input layers have geometries stored as GML content in a \_xml
suffixed field, then the driver will compare the OGR geometry built from
that XML content with the OGR geometry stored in the dedicated geometry
field of the feature. If both match, then the GML content stored in the
\_xml suffixed field will be used, so as to preserve particularities
of the initial GML content. Otherwise GML will be exported from the OGR
geometry.

To increase export performance on very large databases, creating
attribute indexes on the fields pointed to by the 'layer_pkid_name'
attribute in '_ogr_layers_metadata' might help.

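
As a rough sketch of the indexing suggestion above, assume the source
database is a SQLite file named tmp.sqlite containing a layer named
mylayer whose 'layer_pkid_name' is ogr_pkid. The database, layer, field
and index names are all placeholders, and the 'layer_name' column is
assumed from the metadata layers page. The key fields could then be
listed and indexed with ogrinfo:

::

   db=tmp.sqlite     # placeholder database name
   layer=mylayer     # placeholder layer name
   pkid=ogr_pkid     # placeholder: layer_pkid_name value for that layer

   # List the key field recorded for each layer
   ogrinfo -ro "$db" -sql "SELECT layer_name, layer_pkid_name FROM _ogr_layers_metadata"

   # Create an attribute index on that field (the SQL is run by SQLite)
   ogrinfo "$db" -sql "CREATE INDEX idx_${layer}_${pkid} ON ${layer}(${pkid})"

For other database formats the index creation statement would need to
follow that database's own SQL syntax.
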

ogr2ogr behavior
~~~~~~~~~~~~~~~~

When using ogr2ogr / GDALVectorTranslate() to convert to XML/GML from a
source database, there are restrictions on the options that can be used.
Only the following options of ogr2ogr are supported:

- dataset creation options (see below)
- layer names
- spatial filter through the -spat option
- attribute filter through the -where option

Spatial and attribute filtering only applies to top-level layers.
Sub-features selected through joins will not be affected by those
filters.

Dataset creation options
~~~~~~~~~~~~~~~~~~~~~~~~

The supported dataset creation options are:

- **INPUT_XSD**\ =filename(s): to specify an explicit XSD application
  schema to use (or a list of filenames, provided they are comma
  separated). "http://" or "https://" URLs can be used. This option is
  not required when the source dataset has a \_ogr_other_metadata layer
  with schemas and locations filled.
- **CONFIG_FILE**\ =filename or inline XML definition: filename of a
  XML configuration file conforming to the
  `gmlasconf.xsd <https://github.com/OSGeo/gdal/blob/master/gdal/data/gmlasconf.xsd>`__
  schema. It is also possible to provide the XML content directly
  inlined provided that the very first characters are <Configuration>.
- **LAYERS**\ =layers. Comma separated list of layers to export as
  top-level features. The special value "{SPATIAL_LAYERS}" can also be
  used to specify all layers that have geometries. When LAYERS is not
  specified, the driver will identify in the source dataset "top-level"
  layers, and in those layers will find which features are not
  referenced by other top-level layers.
- **SRSNAME_FORMAT**\ =SHORT/OGC_URN/OGC_URL (Only valid for GML 3
  output). Defaults to OGC_URL. If SHORT, then srsName will be in the
  form AUTHORITY_NAME:AUTHORITY_CODE. If OGC_URN, then srsName will be
  in the form urn:ogc:def:crs:AUTHORITY_NAME::AUTHORITY_CODE. If
  OGC_URL, then srsName will be in the form
  http://www.opengis.net/def/crs/AUTHORITY_NAME/0/AUTHORITY_CODE. For
  OGC_URN and OGC_URL, if the SRS has no explicit AXIS order but the
  same SRS authority code imported with ImportFromEPSGA() would be
  treated as lat/long or northing/easting, then the driver will take
  care of coordinate order swapping.
- **INDENT_SIZE**\ =[0-8]. Number of spaces for each indentation level.
  Default is 2.
- **COMMENT**\ =string. Comment to add at the top of the generated XML
  file as an XML comment.
- **LINEFORMAT**\ =CRLF/LF. End-of-line sequence to use. Defaults to
  CRLF on Windows and LF on other platforms.
- **WRAPPING**\ =WFS2_FEATURECOLLECTION/GMLAS_FEATURECOLLECTION.
  Whether to wrap features in a wfs:FeatureCollection or in an
  ogr_gmlas:FeatureCollection. Defaults to WFS2_FEATURECOLLECTION.
- **TIMESTAMP**\ =XML date time. User-specified XML dateTime value for
  the timestamp to use in the wfs:FeatureCollection attribute. If not
  specified, the current date time is used. Only valid for
  WRAPPING=WFS2_FEATURECOLLECTION.
- **WFS20_SCHEMALOCATION**\ =Path or URL to wfs.xsd. Only valid for
  WRAPPING=WFS2_FEATURECOLLECTION. Default is
  "http://schemas.opengis.net/wfs/2.0/wfs.xsd"
- **GENERATE_XSD**\ =YES/NO. Whether to generate a .xsd file that has
  the structure of the wrapping ogr_gmlas:FeatureCollection /
  ogr_gmlas:featureMember elements. Only valid for
  WRAPPING=GMLAS_FEATURECOLLECTION. Defaults to YES.
- **OUTPUT_XSD_FILENAME**\ =string. Wrapping .xsd filename. If not
  specified, same basename as output file with .xsd extension. Note
  that it is possible to use this option even if GENERATE_XSD=NO, so
  that the wrapping .xsd appears in the schemaLocation attribute of the
  .xml file. Only valid for WRAPPING=GMLAS_FEATURECOLLECTION.

Examples
--------

Listing content of a data file:

::

   ogrinfo -ro GMLAS:my.gml

Converting to PostGIS:

::

   ogr2ogr -f PostgreSQL PG:'host=myserver dbname=warmerda' GMLAS:my.gml -nlt CONVERT_TO_LINEAR

Converting to Spatialite and back to GML:

::

   ogr2ogr -f SQLite tmp.sqlite GMLAS:in.gml -dsco SPATIALITE=YES -nlt CONVERT_TO_LINEAR -oo EXPOSE_METADATA_LAYERS=YES
   ogr2ogr -f GMLAS out.gml tmp.sqlite

See Also
--------

- :ref:`GML <vector.gml>`: general purpose driver not requiring the
  presence of schemas, but with limited support for complex features
- :ref:`NAS/ALKIS <vector.nas>`: specialized GML driver for cadastral
  data in Germany

Credits
-------

Initial implementation has been funded by the European Union's Earth
observation programme Copernicus, as part of the tasks delegated to the
European Environment Agency.

Development of special processing of some Sensor Web Enablement (SWE)
Common Data Model swe:DataRecord and swe:DataArray constructs has been
funded by Bureau des Recherches Géologiques et Minières (BRGM).

.. toctree::
   :maxdepth: 1
   :hidden:

   gmlas_mapping_examples
   gmlas_metadata_layers