.. _schema-components:

*****************
Schema components
*****************

After it is built, a schema object contains a set of components that represent
the definitions and declarations found in the loaded schema files. These components,
sometimes referred to as the *Post Schema Validation Infoset* or **PSVI**, constitute
an augmentation of the original information contained in the schema files.

.. testsetup:: collection

    import xmlschema
    import os
    import warnings

    if os.getcwd().endswith('/doc'):
        os.chdir('..')
    warnings.simplefilter("ignore", xmlschema.XMLSchemaIncludeWarning)
    schema = xmlschema.XMLSchema('tests/test_cases/examples/collection/collection.xsd')


Accessing schema components
===========================

Using *collection.xsd* as a sample schema, we can iterate over the entire set of
components, both global and local, with the *iter_components()* generator function:

.. doctest:: collection

    >>> import xmlschema
    >>> schema = xmlschema.XMLSchema('tests/test_cases/examples/collection/collection.xsd')
    >>> for xsd_component in schema.iter_components():
    ...     xsd_component
    ...
    XMLSchema10(name='collection.xsd', namespace='http://example.com/ns/collection')
    XsdComplexType(name='personType')
    XsdAttributeGroup(['id'])
    XsdAttribute(name='id')
    XsdGroup(model='sequence', occurs=[1, 1])
    XsdElement(name='name', occurs=[1, 1])
    ...
    ...
    XsdElement(name='object', occurs=[1, None])
    XsdElement(name='person', occurs=[1, 1])

To iterate over the global components only, use *iter_globals()* instead:

.. doctest:: collection

    >>> for xsd_component in schema.iter_globals():
    ...     xsd_component
    ...
    XsdComplexType(name='personType')
    XsdComplexType(name='objType')
    XsdElement(name='collection', occurs=[1, 1])
    XsdElement(name='person', occurs=[1, 1])


Access with XPath API
---------------------

Another way to retrieve the XSD elements and attributes of a schema is
to use XPath expressions with the *find* or *findall* methods:

.. doctest:: collection

    >>> from pprint import pprint
    >>> namespaces = {'': 'http://example.com/ns/collection'}
    >>> schema.find('collection/object', namespaces)
    XsdElement(name='object', occurs=[1, None])
    >>> pprint(schema.findall('collection/object/*', namespaces))
    [XsdElement(name='position', occurs=[1, 1]),
     XsdElement(name='title', occurs=[1, 1]),
     XsdElement(name='year', occurs=[1, 1]),
     XsdElement(name='author', occurs=[1, 1]),
     XsdElement(name='estimation', occurs=[0, 1]),
     XsdElement(name='characters', occurs=[0, 1])]


Access to global components
---------------------------

To access a specific kind of global component, dictionary access may be preferable:

.. doctest:: collection

    >>> schema.elements['person']
    XsdElement(name='person', occurs=[1, 1])
    >>> schema.types['personType']
    XsdComplexType(name='personType')

The schema object has a dictionary attribute for each kind of XSD declaration
(*elements*, *attributes* and *notations*) and for each kind of XSD definition
(*types*, *model groups*, *attribute groups*, *identity constraints* and *substitution
groups*).

These dictionaries are only views of common dictionaries, shared by all the
loaded schemas in a structure called *maps*:

.. doctest:: collection

    >>> schema.maps
    XsdGlobals(validator=XMLSchema10(name='collection.xsd', ...)

.. doctest:: collection

    >>> person = schema.elements['person']
    >>> person
    XsdElement(name='person', occurs=[1, 1])
    >>> schema.maps.elements[person.qualified_name]
    XsdElement(name='person', occurs=[1, 1])


Component structure
===================

Only the main component classes are available at package level:

XsdComponent
    The base class of every XSD component.

XsdType
    The base class of every XSD type, both complex and simple types.

XsdElement
    The XSD 1.0 element class, also the base of the XSD 1.1 element class.

XsdAttribute
    The XSD 1.0 attribute class, also the base of the XSD 1.1 attribute class.


The full set of component classes is available only through the `xmlschema.validators`
subpackage, for example:

.. doctest::

    >>> import xmlschema
    >>> xmlschema.validators.Xsd11Element
    <class 'xmlschema.validators.elements.Xsd11Element'>


Connection with the schema
--------------------------

Every component is linked to its container schema and to a reference node of its
XSD schema document:

.. doctest:: collection

    >>> person = schema.elements['person']
    >>> person.schema
    XMLSchema10(name='collection.xsd', namespace='http://example.com/ns/collection')
    >>> person.elem
    <Element '{http://www.w3.org/2001/XMLSchema}element' at ...>
    >>> person.tostring()
    '<xs:element xmlns:xs="http://www.w3.org/2001/XMLSchema" name="person" type="personType" />'


Naming options
--------------

A component that has a name (e.g. elements or global types) can be referenced using
different name formats, so there are properties for getting each of these formats:

.. doctest:: collection

    >>> vh_schema = xmlschema.XMLSchema('tests/test_cases/examples/vehicles/vehicles.xsd')
    >>> car = vh_schema.find('vh:vehicles/vh:cars/vh:car')
    >>> car.name
    '{http://example.com/vehicles}car'
    >>> car.local_name
    'car'
    >>> car.prefixed_name
    'vh:car'
    >>> car.qualified_name
    '{http://example.com/vehicles}car'
    >>> car.attributes['model'].name
    'model'
    >>> car.attributes['model'].qualified_name
    '{http://example.com/vehicles}model'

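How the three formats relate can be shown with plain string handling. The helpers
below, *split_qname* and *to_prefixed*, are hypothetical names written for this
illustration; they are not the library's implementation, only a sketch of the
conversion between the extended ``{namespace}local`` form, the local name and the
prefixed name:

```python
def split_qname(qname):
    """Split '{namespace}local' into (namespace, local).

    The namespace is an empty string when the name is not in extended form.
    """
    if qname.startswith('{'):
        namespace, _, local = qname[1:].partition('}')
        return namespace, local
    return '', qname


def to_prefixed(qname, namespaces):
    """Map '{namespace}local' to 'prefix:local' using a prefix-to-URI mapping."""
    namespace, local = split_qname(qname)
    for prefix, uri in namespaces.items():
        if uri == namespace:
            # An empty prefix stands for the default namespace.
            return f'{prefix}:{local}' if prefix else local
    return qname  # no matching prefix: keep the name unchanged


namespaces = {'vh': 'http://example.com/vehicles'}
qname = '{http://example.com/vehicles}car'

print(split_qname(qname))              # ('http://example.com/vehicles', 'car')
print(to_prefixed(qname, namespaces))  # vh:car
```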

Decoding and encoding
---------------------

Every schema component includes methods for data conversion:

.. doctest::

    >>> schema = xmlschema.XMLSchema('tests/test_cases/examples/vehicles/vehicles.xsd')
    >>> schema.types['vehicleType'].decode
    <bound method XsdComplexType.decode of XsdComplexType(name='vehicleType')>
    >>> schema.elements['cars'].encode
    <bound method ValidationMixin.encode of XsdElement(name='vh:cars', occurs=[1, 1])>


These methods can be used to decode the corresponding parts of an XML document:

.. doctest::

    >>> import xmlschema
    >>> from pprint import pprint
    >>> from xml.etree import ElementTree
    >>> xs = xmlschema.XMLSchema('tests/test_cases/examples/vehicles/vehicles.xsd')
    >>> xt = ElementTree.parse('tests/test_cases/examples/vehicles/vehicles.xml')
    >>> root = xt.getroot()
    >>> pprint(xs.elements['cars'].decode(root[0]))
    {'{http://example.com/vehicles}car': [{'@make': 'Porsche', '@model': '911'},
                                          {'@make': 'Porsche', '@model': '911'}]}
    >>> pprint(xs.elements['cars'].decode(xt.getroot()[1], validation='skip'))
    None
    >>> pprint(xs.elements['bikes'].decode(root[1], namespaces={'vh': 'http://example.com/vehicles'}))
    {'@xmlns:vh': 'http://example.com/vehicles',
     'vh:bike': [{'@make': 'Harley-Davidson', '@model': 'WL'},
                 {'@make': 'Yamaha', '@model': 'XS650'}]}


XSD types
=========

Every element or attribute declaration has a *type* attribute for accessing its XSD type:

.. doctest:: collection

    >>> person = schema.elements['person']
    >>> person.type
    XsdComplexType(name='personType')


Simple types
------------

Simple types are used for attributes and for elements that contain a text value:

.. doctest::

    >>> schema = xmlschema.XMLSchema('tests/test_cases/examples/vehicles/vehicles.xsd')
    >>> schema.attributes['step']
    XsdAttribute(name='vh:step')
    >>> schema.attributes['step'].type
    XsdAtomicBuiltin(name='xs:positiveInteger')

A simple type doesn't have attributes but can have facet-related validators or properties:

.. doctest::

    >>> schema.attributes['step'].type.attributes
    Traceback (most recent call last):
      File "<stdin>", line 1, in <module>
    AttributeError: 'XsdAtomicBuiltin' object has no attribute 'attributes'
    >>> schema.attributes['step'].type.validators
    [<function positive_int_validator at ...>]
    >>> schema.attributes['step'].type.white_space
    'collapse'

To check whether a type is a simpleType, use *is_simple()*:

.. doctest::

    >>> schema.attributes['step'].type.is_simple()
    True


Complex types
-------------

Complex types are used only for elements with attributes or with child elements.

For accessing the attributes, an attribute group is always defined, even
when the complex type has no attributes:

.. doctest:: collection

    >>> schema.types['objType']
    XsdComplexType(name='objType')
    >>> schema.types['objType'].attributes
    XsdAttributeGroup(['id', 'available'])
    >>> schema.types['objType'].attributes['available']
    XsdAttribute(name='available')

To access the content model, use the attribute *content*. In most
cases the element's type is a complexType with complex content, and in these
cases *content* is a non-empty `XsdGroup`:

.. doctest:: collection

    >>> person = schema.elements['person']
    >>> person.type.has_complex_content()
    True
    >>> person.type.content
    XsdGroup(model='sequence', occurs=[1, 1])
    >>> for item in person.type.content:
    ...     item
    ...
    XsdElement(name='name', occurs=[1, 1])
    XsdElement(name='born', occurs=[1, 1])
    XsdElement(name='dead', occurs=[0, 1])
    XsdElement(name='qualification', occurs=[0, 1])

.. note::

    The attribute *content_type* has been renamed to *content* in v1.2.1
    in order to avoid confusion between the complex type and its content.
    A property with the old name will be maintained until v2.0.


Model groups can be nested into very complex structures, so there is a generator
function *iter_elements()* to traverse a model group:

.. doctest:: collection

    >>> for e in person.type.content.iter_elements():
    ...     e
    ...
    XsdElement(name='name', occurs=[1, 1])
    XsdElement(name='born', occurs=[1, 1])
    XsdElement(name='dead', occurs=[0, 1])
    XsdElement(name='qualification', occurs=[0, 1])

Sometimes a complex type has simple content; in these cases *content* is a simple type.


Content types
-------------

An element can have four different content types:

- **empty**: deny child elements, deny text content
- **simple**: deny child elements, allow text content
- **element-only**: allow child elements, deny intermingled text content
- **mixed**: allow child elements and intermingled text content

For attributes, only the *empty* and *simple* content types are possible, because
they can have only a simpleType value.

The reference methods for checking the content type are, respectively, *is_empty()*,
*has_simple_content()*, *is_element_only()* and *has_mixed_content()*.


Access to content validator
---------------------------

Checking the content type can be cumbersome if you only want to know which
component validates the content, without resorting to type checks. To make this
simpler, two properties are defined for XSD types:

simple_type
    a simple type in case of *simple* content or when an *empty* content is
    based on an empty simple type, `None` otherwise.

model_group
    a model group in case of *mixed* or *element-only* content or when an
    *empty* content is based on an empty model group, `None` otherwise.
