Introduction to XML Schema and Simple Types
What is XSD?
XML Schema describes the structure of an XML document and is a more powerful alternative to DTD. The language for XML Schema is referred to as XML Schema Definition (XSD).
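Why use XML Schema:
- It supports data types.
- It is written in XML syntax.
- It makes data exchange more reliable, because sender and receiver agree on the expected structure and data formats.
- It is extensible: schemas can be reused in other schemas, and custom types can be derived from the standard types.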
The <schema> element
The <schema> element is the root element of every XML Schema.
<?xml version="1.0"?>
<xs:schema>
<!-- Code Block -->
</xs:schema>
The <schema> element's attributes:
<?xml version="1.0"?>
<xs:schema
xmlns:xs="http://www.w3.org/2001/XMLSchema"
targetNamespace="https://mehmetcan.sahin.dev"
xmlns="https://mehmetcan.sahin.dev">
<!-- Code Block -->
</xs:schema>
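The xmlns:xs attribute declares that the elements and data types used in the schema (schema, element, simpleType, string and so on) come from the http://www.w3.org/2001/XMLSchema namespace and must be written with the xs: prefix.
The targetNamespace attribute indicates that the elements declared in this schema belong to the https://mehmetcan.sahin.dev namespace.
The xmlns attribute sets the default namespace, so unprefixed names in the schema refer to https://mehmetcan.sahin.dev.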
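To reference an XSD from an XML document, we use the following:
<?xml version="1.0"?>
<kitaplar xmlns="https://mehmetcan.sahin.dev"
    xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance"
    xsi:schemaLocation="https://mehmetcan.sahin.dev https://mehmetcan.sahin.dev/kitaplar.xsd">
  <!-- XML -->
</kitaplar>
The default xmlns must match the targetNamespace of the schema. The xsi:schemaLocation attribute takes two values separated by a space: the namespace, followed by the location of the XSD to apply to that namespace.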
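As a minimal sketch of how the two files fit together, the schema below declares a single kitaplar element and the instance document after it points back to that schema. Declaring kitaplar as a plain xs:string is an assumption made only to keep the sketch short; a real library schema would describe child elements instead.

```xml
<?xml version="1.0"?>
<!-- kitaplar.xsd (sketch): one global element in the target namespace -->
<xs:schema
    xmlns:xs="http://www.w3.org/2001/XMLSchema"
    targetNamespace="https://mehmetcan.sahin.dev"
    xmlns="https://mehmetcan.sahin.dev">
  <!-- Assumption: plain text content, chosen for brevity -->
  <xs:element name="kitaplar" type="xs:string"/>
</xs:schema>
```

```xml
<?xml version="1.0"?>
<!-- Instance document: xsi:schemaLocation holds the namespace, then the schema location -->
<kitaplar xmlns="https://mehmetcan.sahin.dev"
    xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance"
    xsi:schemaLocation="https://mehmetcan.sahin.dev https://mehmetcan.sahin.dev/kitaplar.xsd">
  Yunus Emre Divanı
</kitaplar>
```

The default namespace of the instance is the same URI as the targetNamespace of the schema, which is what lets a validator match the kitaplar element to its declaration.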
Simple Elements
Element Declaration
To define elements in XSD:
<xs:element name="element-name" type="element-type"/>
Element Types:
- xs:string
- xs:decimal
- xs:integer
- xs:boolean
- xs:date
- xs:time
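To make the types above more concrete, here is a short sketch that pairs a declaration for each type with a value that matches it. The element names fiyat (price), sayfa (pages), stokta (in stock) and saat (time) are hypothetical and used only for illustration.

```xml
<!-- Declarations; the element names are hypothetical -->
<xs:element name="fiyat" type="xs:decimal"/>
<xs:element name="sayfa" type="xs:integer"/>
<xs:element name="stokta" type="xs:boolean"/>
<xs:element name="tarih" type="xs:date"/>
<xs:element name="saat" type="xs:time"/>

<!-- Instance fragments with a matching value for each declaration -->
<fiyat>24.90</fiyat>      <!-- xs:decimal: digits with an optional fraction -->
<sayfa>320</sayfa>        <!-- xs:integer: no fraction allowed -->
<stokta>true</stokta>     <!-- xs:boolean: true, false, 1 or 0 -->
<tarih>2014-04-01</tarih> <!-- xs:date: YYYY-MM-DD -->
<saat>09:30:00</saat>     <!-- xs:time: hh:mm:ss -->
```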
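Let's apply the element types to the library example. Sample XML fragment, with the elements adi (title), yazar (author), dil (language), baski (edition) and tarih (date). Note that an xs:date value must be written as YYYY-MM-DD:
<?xml version="1.0"?>
<adi>Yunus Emre Divanı</adi>
<yazar>Selim Yağmur</yazar>
<dil>Türkçe</dil>
<baski>8</baski>
<tarih>2014-04-01</tarih>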
Defining with XSD:
<xs:element name="adi" type="xs:string"/>
<xs:element name="yazar" type="xs:string"/>
<xs:element name="dil" type="xs:string"/>
<xs:element name="baski" type="xs:integer"/>
<xs:element name="tarih" type="xs:date"/>
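An element can have a default value or a fixed value. A default value is assigned automatically when no other value is given, and we set it with the "default" attribute. A fixed value is also assigned automatically, but no other value is allowed, and we set it with the "fixed" attribute.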
<xs:element name="dil" type="xs:string" fixed="Türkçe"/>
<xs:element name="baski" type="xs:integer" default="1"/>
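The effect of these two attributes is easiest to see on instance fragments. The sketch below repeats the two declarations and shows how a validating parser would treat a few sample values; the sample values themselves are made up for illustration.

```xml
<!-- Declarations from the example above -->
<xs:element name="dil" type="xs:string" fixed="Türkçe"/>
<xs:element name="baski" type="xs:integer" default="1"/>

<!-- Instance fragments -->
<baski/>            <!-- empty: the default value 1 is supplied -->
<baski>8</baski>    <!-- an explicit value replaces the default -->
<dil/>              <!-- empty: the fixed value "Türkçe" is supplied -->
<dil>Türkçe</dil>   <!-- valid: matches the fixed value -->
<dil>English</dil>  <!-- invalid: differs from the fixed value -->
```

Note that an element default applies when the element is present but empty; it does not add a missing element to the document.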
Attribute Declaration
To define attributes in XSD:
<xs:attribute name="attribute-name" type="attribute-type"/>
Attribute Types:
- xs:string
- xs:decimal
- xs:integer
- xs:boolean
- xs:date
- xs:time
Here is the library example with attributes in XML:
<?xml version="1.0"?>
<kitap isbn="9759954949">Yunus Emre Divanı</kitap>
<yazar gorev="Derleyici">Selim Yağmur</yazar>
Defining attributes with XSD:
<xs:attribute name="isbn" type="xs:integer"/>
<xs:attribute name="gorev" type="xs:string"/>
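An attribute declaration does not stand alone in practice: it has to be attached to the element that carries it. Complex types are outside the scope of this section, so the following is only a sketch of one common way to do it, giving the kitap element text content plus the isbn attribute.

```xml
<!-- Sketch: an element with text content and one attribute -->
<xs:element name="kitap">
  <xs:complexType>
    <xs:simpleContent>
      <!-- The text content of kitap is a string -->
      <xs:extension base="xs:string">
        <!-- The attribute declaration shown above, placed inside the element's type -->
        <xs:attribute name="isbn" type="xs:integer"/>
      </xs:extension>
    </xs:simpleContent>
  </xs:complexType>
</xs:element>

<!-- Matching instance fragment -->
<kitap isbn="9759954949">Yunus Emre Divanı</kitap>
```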
Restrictions
To restrict the values of elements, the general structure is as follows:
<xs:element name="element-name">
  <xs:simpleType>
    <xs:restriction base="element-type">
      <!-- Restrictions -->
    </xs:restriction>
  </xs:simpleType>
</xs:element>
We use minInclusive and maxInclusive to determine the minimum and maximum values a certain element can take.
<xs:element name="baski">
  <xs:simpleType>
    <xs:restriction base="xs:integer">
      <xs:minInclusive value="1"/>
      <xs:maxInclusive value="500"/>
    </xs:restriction>
  </xs:simpleType>
</xs:element>
For the "baski" (edition) element, we have set the minimum value as 1 and the maximum value as 500.
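A few instance fragments show where the boundaries of this restriction fall. The sample values are chosen for illustration; both limits are inclusive, so 1 and 500 themselves are accepted.

```xml
<baski>1</baski>    <!-- valid: equal to minInclusive -->
<baski>8</baski>    <!-- valid: between 1 and 500 -->
<baski>500</baski>  <!-- valid: equal to maxInclusive -->
<baski>0</baski>    <!-- invalid: less than 1 -->
<baski>501</baski>  <!-- invalid: greater than 500 -->
<baski>8.5</baski>  <!-- invalid: not an xs:integer -->
```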
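We use enumeration to limit a value to a fixed list of accepted values.
<xs:attribute name="gorev">
  <xs:simpleType>
    <xs:restriction base="xs:string">
      <xs:enumeration value="yazar"/>
      <xs:enumeration value="cevirmen"/>
      <xs:enumeration value="derleyici"/>
    </xs:restriction>
  </xs:simpleType>
</xs:attribute>
For the "gorev" (role) attribute of the "yazar" (author) element, we have defined 3 options: yazar (author), cevirmen (translator) and derleyici (compiler). Enumeration values are case-sensitive, so gorev="Derleyici" would not match "derleyici". To reuse a restriction in multiple places, we need to turn it into a named type.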
<!-- When defining an element, use the name of the restriction type you create as its type -->
<xs:element name="element-name" type="restriction-name"/>
<xs:simpleType name="restriction-name">
  <xs:restriction base="xs:string">
    <!-- Restrictions -->
  </xs:restriction>
</xs:simpleType>
<!-- Example -->
<xs:attribute name="gorev" type="gorevRestriction"/>
<xs:simpleType name="gorevRestriction">
  <xs:restriction base="xs:string">
    <xs:enumeration value="yazar"/>
    <xs:enumeration value="cevirmen"/>
    <xs:enumeration value="derleyici"/>
  </xs:restriction>
</xs:simpleType>
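The benefit of a named type shows up once more than one declaration refers to it. In the sketch below, the type name sayiAraligi (number range) and the element cilt (volume) are hypothetical; both elements share the same 1 to 500 range without repeating the restriction.

```xml
<!-- Hypothetical named type: an integer from 1 to 500 -->
<xs:simpleType name="sayiAraligi">
  <xs:restriction base="xs:integer">
    <xs:minInclusive value="1"/>
    <xs:maxInclusive value="500"/>
  </xs:restriction>
</xs:simpleType>

<!-- Two declarations reuse the same restriction by name -->
<xs:element name="baski" type="sayiAraligi"/>
<xs:element name="cilt" type="sayiAraligi"/>
```

Changing the range later then means editing one simpleType instead of every element that uses it.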
We can also define patterns:
<xs:element name="element-name">
  <xs:simpleType>
    <xs:restriction base="xs:string">
      <xs:pattern value="pattern"/>
    </xs:restriction>
  </xs:simpleType>
</xs:element>
Let's define a pattern that starts with one lowercase letter followed by zero or more uppercase letters.
<xs:element name="ad">
  <xs:simpleType>
    <xs:restriction base="xs:string">
      <xs:pattern value="[a-z]([A-Z])*"/>
    </xs:restriction>
  </xs:simpleType>
</xs:element>
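To check our reading of the pattern [a-z]([A-Z])*, here are a few instance fragments with made-up values. Keep in mind that an XSD pattern must match the whole value, not just a part of it.

```xml
<ad>aBC</ad>   <!-- valid: one lowercase letter, then uppercase letters -->
<ad>a</ad>     <!-- valid: the uppercase part may be empty -->
<ad>Abc</ad>   <!-- invalid: starts with an uppercase letter -->
<ad>aBc</ad>   <!-- invalid: lowercase letter after the first position -->
<ad>abBC</ad>  <!-- invalid: two lowercase letters at the start -->
```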
Patterns and their descriptions:
<xs:element name="ad">
  <xs:simpleType>
    <xs:restriction base="xs:string">
      <xs:pattern value="[a-z]"/> <!-- Single character, a lowercase letter -->
      <xs:pattern value="[A-Z]"/> <!-- Single character, an uppercase letter -->
      <xs:pattern value="[a-zA-Z]"/> <!-- Single character, a lowercase or uppercase letter -->
      <xs:pattern value="[a-zA-Z][a-zA-Z]"/> <!-- Two characters, lowercase or uppercase letters -->
      <xs:pattern value="[xyz]"/> <!-- Single character, one of the letters 'x', 'y', 'z' -->
      <xs:pattern value="[0-9]"/> <!-- A single digit from 0 to 9 -->
      <xs:pattern value="([a-z])*"/> <!-- Zero or more lowercase letters -->
      <xs:pattern value="([a-z])+"/> <!-- One or more lowercase letters -->
      <xs:pattern value="ahmet|mehmet"/> <!-- 'ahmet' or 'mehmet' -->
    </xs:restriction>
  </xs:simpleType>
</xs:element>
Try out the above patterns one by one: when several pattern facets appear in the same restriction, a value is accepted if it matches any one of them. For handling whitespace:
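<xs:element name="ad">
  <xs:simpleType>
    <xs:restriction base="xs:string">
      <!-- Use only one whiteSpace facet per restriction; the three options are listed together for comparison. -->
      <xs:whiteSpace value="preserve"/> <!-- Leaves #x9 (tab), #xA (line feed) and #xD (carriage return) unchanged. -->
      <xs:whiteSpace value="replace"/> <!-- Replaces each #x9, #xA and #xD character with #x20 (space). -->
      <xs:whiteSpace value="collapse"/> <!-- Applies replace, then removes leading and trailing spaces and reduces runs of spaces to a single space. -->
    </xs:restriction>
  </xs:simpleType>
</xs:element>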
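As a small sketch of what whitespace handling does in practice, the declaration below uses only the collapse option, and the instance fragment carries extra spaces. The sample value is made up for illustration.

```xml
<xs:element name="ad">
  <xs:simpleType>
    <xs:restriction base="xs:string">
      <xs:whiteSpace value="collapse"/>
    </xs:restriction>
  </xs:simpleType>
</xs:element>

<!-- Instance fragment with leading, trailing and repeated spaces -->
<ad>  Yunus   Emre  </ad>
<!-- Value after collapse: "Yunus Emre" -->
```

Whitespace is normalized before the other restrictions are checked, so a length or pattern restriction on the same type would see the collapsed value.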
For character length:
<xs:element name="ad">
  <xs:simpleType>
    <xs:restriction base="xs:string">
      <!-- Use either length, or minLength and maxLength; length cannot be combined with the other two in one restriction. -->
      <xs:length value="5"/> <!-- Specifies the exact number of characters -->
      <xs:minLength value="1"/> <!-- Specifies the minimum number of characters -->
      <xs:maxLength value="10"/> <!-- Specifies the maximum number of characters -->
    </xs:restriction>
  </xs:simpleType>
</xs:element>
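Here is a sketch of a length range in use, with minLength and maxLength only. The sample values are made up; spaces count as characters.

```xml
<xs:element name="ad">
  <xs:simpleType>
    <xs:restriction base="xs:string">
      <xs:minLength value="1"/>
      <xs:maxLength value="10"/>
    </xs:restriction>
  </xs:simpleType>
</xs:element>

<!-- Instance fragments -->
<ad>Emre</ad>               <!-- valid: 4 characters -->
<ad>Yunus Emre</ad>         <!-- valid: exactly 10 characters -->
<ad></ad>                   <!-- invalid: 0 characters, below minLength -->
<ad>Yunus Emre Divanı</ad>  <!-- invalid: 17 characters, above maxLength -->
```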
Facets used in restrictions:
- enumeration: defines a list of accepted values
- fractionDigits: specifies the maximum number of digits allowed after the decimal point. For example, a value of 2 accepts 3.14 but rejects 3.141.
- length: specifies the exact number of characters
- maxExclusive: specifies the maximum numeric value (value must be less than the specified value)
- maxInclusive: specifies the maximum numeric value (value must be less than or equal to the specified value)
- maxLength: specifies the maximum number of characters
- minExclusive: specifies the minimum numeric value (value must be greater than the specified value)
- minInclusive: specifies the minimum numeric value (value must be greater than or equal to the specified value)
- minLength: specifies the minimum number of characters
- pattern: specifies a pattern for the value to match
- totalDigits: specifies the maximum total number of digits, counting digits on both sides of the decimal point
- whiteSpace: specifies how to handle white space (spaces, tabs, line breaks)
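The numeric facets in this list can be combined in one restriction. As a sketch, the hypothetical fiyat (price) element below accepts positive decimals with at most 5 digits in total and at most 2 of them after the decimal point.

```xml
<xs:element name="fiyat">
  <xs:simpleType>
    <xs:restriction base="xs:decimal">
      <xs:totalDigits value="5"/>
      <xs:fractionDigits value="2"/>
      <xs:minExclusive value="0"/>
    </xs:restriction>
  </xs:simpleType>
</xs:element>

<!-- Instance fragments -->
<fiyat>123.45</fiyat>    <!-- valid: 5 digits in total, 2 after the point -->
<fiyat>24.9</fiyat>      <!-- valid: fewer digits than the limits -->
<fiyat>1234.567</fiyat>  <!-- invalid: 7 digits in total, 3 after the point -->
<fiyat>0</fiyat>         <!-- invalid: minExclusive requires a value greater than 0 -->
```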