As mentioned earlier, when working with XML documents, a data schema describing their structure is necessary.
In practice, a document is usually associated with an XML schema, which describes the elements a document based on this schema may contain, the attributes that belong to each element, and so on.
If we draw an analogy with a database, an XML schema resembles the description of attributes and data types for tables in a database.
A special language, XSD (XML Schema Definition Language), exists for describing schemas.
The process of comparing the contents of an XML document against a certain XML schema is called validation.
The schema itself may be placed directly inside the document, but more often it is stored in a separate file with the .xsd extension, while the XML document itself contains a reference to this file.
NOTE: A document may have no schema at all, neither embedded in it nor referenced from it. In this case, validation is carried out either manually or programmatically.
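As a rough illustration of programmatic validation without a schema, the sketch below checks a small employee record by hand using Python's standard xml.etree.ElementTree module. The function name check_employee and the rules it enforces (a non-empty name child and an integer salary child) are assumptions made for this example only.

```python
import xml.etree.ElementTree as ET


def check_employee(xml_text):
    """Return a list of problems found in an <employee> record (empty if none)."""
    try:
        root = ET.fromstring(xml_text)
    except ET.ParseError as exc:
        # Not well-formed XML: nothing else can be checked.
        return [f"not well-formed: {exc}"]

    problems = []
    if root.tag != "employee":
        problems.append(f"unexpected root element: {root.tag}")

    name = root.find("name")
    if name is None or not (name.text or "").strip():
        problems.append("missing or empty <name>")

    salary = root.find("salary")
    if salary is None:
        problems.append("missing <salary>")
    else:
        try:
            int((salary.text or "").strip())
        except ValueError:
            problems.append("<salary> is not an integer")
    return problems


good = "<employee><name>Petrov</name><salary>10000</salary></employee>"
bad = "<employee><name>Petrov</name><salary>ten thousand</salary></employee>"
print(check_employee(good))
print(check_employee(bad))
```

Hand-written checks like this are easy to start with, but every rule lives in code; a schema moves the same rules into a declarative description that any validating processor can apply.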
Namespaces
Several schemas can be used in a single XML document. This raises the problem of name conflicts: if a document refers to two schemas that define the same element name differently, which definition applies?
To solve this problem, the concept of a namespace is introduced. Every element name then belongs to a namespace, so the definition that applies can always be determined. Each namespace must itself have a unique name, and a URL (Uniform Resource Locator) is typically used for this purpose. Within a document, the namespace is bound to a short prefix that is attached to element names.
A namespace is defined inside the opening tag of an element:
<namespacePrefix:elementName xmlns:namespacePrefix = "URL">
The URL used does not necessarily have to point to a real file, since its main purpose is to ensure uniqueness.
A document may use several namespaces, one of which may be declared without a prefix. In this case, it is called the default namespace.
Using a Default Namespace
<?xml version="1.0" encoding="Windows-1251" ?>
<!-- Using a default namespace -->
<employee xmlns="http://www.myorg.ru/staff">
  <name> Петров </name>
  <salary currency="р."> 100000 </salary>
</employee>
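To see how a parser resolves names to namespaces, the following sketch parses a small document with Python's standard xml.etree.ElementTree, which reports each element name in the form {namespace URL}local-name. The second namespace URL (http://www.myorg.ru/payroll) and the prefixes pay, staff and p are invented for this illustration.

```python
import xml.etree.ElementTree as ET

# A default namespace for <employee> and <name>, plus a hypothetical
# prefixed namespace (pay) for <salary>.
DOC = """
<employee xmlns="http://www.myorg.ru/staff"
          xmlns:pay="http://www.myorg.ru/payroll">
  <name>Petrov</name>
  <pay:salary currency="RUB">100000</pay:salary>
</employee>
"""

root = ET.fromstring(DOC)

# Each tag is reported with its namespace URL, regardless of the prefix used.
tags = [elem.tag for elem in root.iter()]
for tag in tags:
    print(tag)

# Prefixes in this mapping are local to the lookup; they need not match
# the prefixes written in the document, only the URLs must match.
ns = {"staff": "http://www.myorg.ru/staff", "p": "http://www.myorg.ru/payroll"}
salary = root.find("p:salary", ns)
print(salary.text, salary.attrib)
```

Note that the unprefixed currency attribute is reported without a namespace: a default namespace declaration applies to element names, not to attribute names.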
Requirements for an XML Schema
Note that a schema is also an XML document and must satisfy the following requirements:
- all schemas must have a top-level element named schema;
- all schemas must use the same base namespace, whose URL is:
http://www.w3.org/2001/XMLSchema.
In addition to the base namespace, additional namespaces may also be used in the schema.
For example, an XML schema that binds the base namespace to the prefix bn can be defined as follows:
Example of an XML Schema
<?xml version="1.0" encoding="Windows-1251" ?>
<bn:schema xmlns:bn="http://www.w3.org/2001/XMLSchema">
  <bn:element name="employee">
    <bn:complexType>
      <bn:sequence>
        <bn:element name="name" type="bn:string"/>
        <bn:element name="salary" type="bn:integer"/>
      </bn:sequence>
    </bn:complexType>
  </bn:element>
</bn:schema>
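The two requirements listed above can be checked mechanically. The sketch below, which uses Python's standard xml.etree.ElementTree, tests whether the top-level element is schema in the base namespace and lists the element names the schema declares. The helper names looks_like_schema and declared_elements are hypothetical, and the check is far from a full schema validity test.

```python
import xml.etree.ElementTree as ET

XSD_NS = "http://www.w3.org/2001/XMLSchema"

SCHEMA = """
<bn:schema xmlns:bn="http://www.w3.org/2001/XMLSchema">
  <bn:element name="employee">
    <bn:complexType>
      <bn:sequence>
        <bn:element name="name" type="bn:string"/>
        <bn:element name="salary" type="bn:integer"/>
      </bn:sequence>
    </bn:complexType>
  </bn:element>
</bn:schema>
"""


def looks_like_schema(xml_text):
    """True if the root element is 'schema' in the XML Schema namespace."""
    root = ET.fromstring(xml_text)
    return root.tag == f"{{{XSD_NS}}}schema"


def declared_elements(xml_text):
    """Names of all elements declared with a name attribute, in document order."""
    root = ET.fromstring(xml_text)
    found = root.iter(f"{{{XSD_NS}}}element")
    return [e.get("name") for e in found if e.get("name")]


print(looks_like_schema(SCHEMA))
print(declared_elements(SCHEMA))
```

Because the comparison is made on the namespace URL rather than on the prefix, the same check works whether the schema uses bn, xs, xsd, or any other prefix.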
Schema Embedded in an XML Document
The XML schema given in Listing 10.6 can be directly inserted into an XML document (Listing 10.7).
Example of Using a Schema Inside an XML Document
<?xml version="1.0" encoding="Windows-1251" ?>
<employees>
  <!-- Beginning of schema -->
  <bn:schema xmlns:bn="http://www.w3.org/2001/XMLSchema">
    <bn:element name="employee">
      <bn:complexType>
        <bn:sequence>
          <bn:element name="name" type="bn:string"/>
          <bn:element name="salary" type="bn:integer"/>
        </bn:sequence>
      </bn:complexType>
    </bn:element>
  </bn:schema>
  <!-- End of schema -->
  <employee>
    <name> Петров </name>
    <salary>10000</salary>
  </employee>
  <employee>
    <name> Сидоров </name>
    <salary>15000</salary>
  </employee>
</employees>
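When a schema is embedded like this, the namespace is what lets software tell schema markup apart from the data. The sketch below, using Python's standard xml.etree.ElementTree, separates the two parts of such a document; it only locates the schema and does not validate against it. Whether a given validator actually honors an embedded schema depends on the processor, so treat this as an illustration of the document structure only.

```python
import xml.etree.ElementTree as ET

XSD_NS = "http://www.w3.org/2001/XMLSchema"

DOC = """
<employees>
  <bn:schema xmlns:bn="http://www.w3.org/2001/XMLSchema">
    <bn:element name="employee">
      <bn:complexType>
        <bn:sequence>
          <bn:element name="name" type="bn:string"/>
          <bn:element name="salary" type="bn:integer"/>
        </bn:sequence>
      </bn:complexType>
    </bn:element>
  </bn:schema>
  <employee><name>Petrov</name><salary>10000</salary></employee>
  <employee><name>Sidorov</name><salary>15000</salary></employee>
</employees>
"""

root = ET.fromstring(DOC)

# The embedded schema is the child element in the XML Schema namespace.
schema = root.find(f"{{{XSD_NS}}}schema")

# Every other top-level child is ordinary data.
records = [child for child in root if child is not schema]
pairs = [(r.findtext("name"), r.findtext("salary")) for r in records]

print(schema is not None)
print(pairs)
```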
External XML Schema
In the previous section, we considered using a schema inside an XML document.
However, the preferred approach is usually to use an external schema stored in a separate file.
Enter the code from Listing 10.6 in a text editor (e.g., Notepad) and save it under the name 5-Schema.xsd.
To specify in the document that it should be validated using the schema stored in 5-Schema.xsd, it is necessary to reference this file in a special attribute (from the namespace http://www.w3.org/2001/XMLSchema-instance).
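- If the elements being validated belong to a namespace (that is, the schema declares a target namespace), the schemaLocation attribute is used. Its value is a pair: the namespace URL followed by the location of the schema file.
- Otherwise, the noNamespaceSchemaLocation attribute is applied. Its value is just the location of the schema file.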
XML Document Referring to Schema 5-Schema.xsd
<?xml version="1.0" encoding="Windows-1251" ?>
<!-- Using an external XML schema -->
<employee xmlns:bni="http://www.w3.org/2001/XMLSchema-instance"
          bni:noNamespaceSchemaLocation="5-Schema.xsd">
  <name> Петров </name>
  <salary>10000</salary>
</employee>
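The difference between the two attributes is easiest to see by reading them back. The sketch below uses Python's standard xml.etree.ElementTree to collect the schema references from a document's root element. The function name schema_hints, the namespaced variant of the document, and the file name staff.xsd are assumptions made for the illustration; the sketch only reads the attributes and performs no validation.

```python
import xml.etree.ElementTree as ET

XSI_NS = "http://www.w3.org/2001/XMLSchema-instance"


def schema_hints(xml_text):
    """Map namespace URL (or None for 'no namespace') to a schema location."""
    root = ET.fromstring(xml_text)
    hints = {}

    no_ns = root.get(f"{{{XSI_NS}}}noNamespaceSchemaLocation")
    if no_ns:
        hints[None] = no_ns.strip()

    # schemaLocation holds whitespace-separated pairs: namespace URL, location.
    parts = (root.get(f"{{{XSI_NS}}}schemaLocation") or "").split()
    for url, location in zip(parts[0::2], parts[1::2]):
        hints[url] = location
    return hints


# Document without a namespace, referring to 5-Schema.xsd.
plain = """
<employee xmlns:bni="http://www.w3.org/2001/XMLSchema-instance"
          bni:noNamespaceSchemaLocation="5-Schema.xsd">
  <name>Petrov</name>
  <salary>10000</salary>
</employee>
"""

# Hypothetical namespaced variant; staff.xsd is an invented file name.
namespaced = """
<employee xmlns="http://www.myorg.ru/staff"
          xmlns:bni="http://www.w3.org/2001/XMLSchema-instance"
          bni:schemaLocation="http://www.myorg.ru/staff staff.xsd">
  <name>Petrov</name>
  <salary>10000</salary>
</employee>
"""

print(schema_hints(plain))
print(schema_hints(namespaced))
```

As with the schema itself, the prefix (bni here) is arbitrary; what matters is that it is bound to http://www.w3.org/2001/XMLSchema-instance.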
Another Example of an XML Schema
Let us now consider another XML schema, to be saved as 6-Schema.xsd, which allows a document to contain a list of multiple records.
The schema is shown first, followed by a corresponding XML document:
Example of an XML Schema for Verifying a List of Multiple Records
<?xml version="1.0" encoding="Windows-1251" ?>
<bn:schema xmlns:bn="http://www.w3.org/2001/XMLSchema">
  <bn:element name="employees">
    <bn:complexType>
      <bn:sequence>
        <bn:element ref="employee" maxOccurs="unbounded"/>
      </bn:sequence>
    </bn:complexType>
  </bn:element>
  <bn:element name="employee">
    <bn:complexType>
      <bn:sequence>
        <bn:element name="firstname" type="bn:string"/>
        <bn:element name="lastname" type="bn:string"/>
        <bn:element name="salary" type="bn:integer"/>
      </bn:sequence>
    </bn:complexType>
  </bn:element>
</bn:schema>
XML Document with a List of Multiple Employees
<?xml version="1.0" encoding="Windows-1251" ?>
<!-- Example of using an XML schema -->
<employees xmlns:bni="http://www.w3.org/2001/XMLSchema-instance"
           bni:noNamespaceSchemaLocation="6-Schema.xsd">
  <employee>
    <firstname> Иван </firstname>
    <lastname> Петров </lastname>
    <salary> 10000 </salary>
  </employee>
  <employee>
    <firstname> Дмитрий </firstname>
    <lastname> Федоров </lastname>
    <salary> 9000 </salary>
  </employee>
  <employee>
    <firstname> Анна </firstname>
    <lastname> Котова </lastname>
    <salary> 15000 </salary>
  </employee>
</employees>