XML Syntax and Rules
XML has a strict set of syntax rules, and every XML document must follow them exactly. A document that follows all of these rules is called a well-formed XML document.
If even one rule is broken, the document is not well-formed, and a conforming XML parser must report an error instead of trying to guess what was meant. This strictness is what makes XML reliable and predictable across different systems.
Rule 1: Every XML Document Must Have a Root Element
An XML document must have exactly one root element that contains all other elements. Think of the root element as the trunk of a tree — everything else branches out from it.
Correct Example
<?xml version="1.0" encoding="UTF-8"?>
<library>
  <book>
    <title>XML Basics</title>
  </book>
</library>
Here, <library> is the single root element. All other elements live inside it.
Incorrect Example
<book>
  <title>XML Basics</title>
</book>
<magazine>
  <title>Tech Today</title>
</magazine>
This is not well-formed because there are two root-level elements: <book> and <magazine>. There can only be one.
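The single-root rule can be checked with any XML parser. The following is a minimal sketch that assumes Python's standard xml.etree.ElementTree module; the helper name is_well_formed is hypothetical and only used for illustration.

```python
import xml.etree.ElementTree as ET


def is_well_formed(xml_text):
    """Return True if the parser accepts xml_text, False otherwise."""
    try:
        ET.fromstring(xml_text)
        return True
    except ET.ParseError:
        return False


one_root = "<library><book><title>XML Basics</title></book></library>"
two_roots = (
    "<book><title>XML Basics</title></book>"
    "<magazine><title>Tech Today</title></magazine>"
)

print(is_well_formed(one_root))   # True
print(is_well_formed(two_roots))  # False: content after the first root element
```

Wrapping both elements in a common parent such as <library> is the usual way to repair the second document.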
Rule 2: All Tags Must Be Closed
In HTML, it is acceptable to leave some tags unclosed. In XML, this is not allowed. Every opening tag must have a corresponding closing tag.
Correct Example
<city>New York</city>
Incorrect Example
<city>New York
The closing tag </city> is missing, so this is not well-formed XML.
Self-Closing Tags
If an element has no content, it can be written as a self-closing tag using a forward slash before the closing angle bracket.
<image src="photo.jpg" />
This is a valid shorthand for <image src="photo.jpg"></image>.
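To see that the two forms really are equivalent, both can be parsed and compared. This is a small sketch assuming Python's standard xml.etree.ElementTree module; the variable names are illustrative.

```python
import xml.etree.ElementTree as ET

short_form = ET.fromstring('<image src="photo.jpg" />')
long_form = ET.fromstring('<image src="photo.jpg"></image>')

# Compare everything the parser reports about each element.
same = (
    short_form.tag == long_form.tag
    and short_form.attrib == long_form.attrib
    and short_form.text == long_form.text
    and len(short_form) == len(long_form) == 0  # neither has child elements
)
print(same)  # True
```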
Rule 3: Tags Are Case-Sensitive
XML treats uppercase and lowercase letters as completely different. <Name>, <name>, and <NAME> are three distinct tags.
Correct Example
<Country>France</Country>
Incorrect Example
<Country>France</country>
The opening tag uses Country (capital C) but the closing tag uses country (lowercase c). These do not match, so the document is not well-formed.
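Case sensitivity affects both parsing and lookups. The sketch below assumes Python's standard xml.etree.ElementTree module; the <person> document and its values are made up for illustration.

```python
import xml.etree.ElementTree as ET

# <Name> and <name> are different elements and can sit side by side.
doc = ET.fromstring("<person><Name>Alice</Name><name>alice01</name></person>")
print(doc.find("Name").text)  # Alice
print(doc.find("name").text)  # alice01
print(doc.find("NAME"))       # None: no element with this exact name

# A closing tag that differs only in case does not close the element.
try:
    ET.fromstring("<Country>France</country>")
    mismatch_rejected = False
except ET.ParseError:
    mismatch_rejected = True
print(mismatch_rejected)  # True
```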
Rule 4: Elements Must Be Properly Nested
When one element is placed inside another, it must be fully contained within the outer element. Tags cannot overlap each other.
Correct Example
<person>
  <name>Alice</name>
</person>
Incorrect Example
<person>
<name>Alice</person>
</name>
Here, <person> closes before <name> closes. This is overlapping and is not allowed in XML.
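One way to picture the nesting rule is as a stack: each opening tag is pushed, and each closing tag must match the tag on top of the stack. The following is a deliberately simplified sketch of that idea, not a real parser. The function first_nesting_error and its regular expression are hypothetical, and they ignore comments, CDATA sections, and attribute values that contain angle brackets.

```python
import re

# Matches <name ...>, </name>, and <name ... />. Simplified on purpose.
TAG = re.compile(r"<(/?)([A-Za-z_:][\w.:\-]*)[^<>]*?(/?)>")


def first_nesting_error(xml_text):
    """Return a message for the first nesting problem found, or None."""
    open_tags = []
    for match in TAG.finditer(xml_text):
        closing, name, self_closing = match.groups()
        if self_closing:
            continue  # opens and closes in one step
        if not closing:
            open_tags.append(name)
        elif not open_tags:
            return f"</{name}> has no matching opening tag"
        elif open_tags[-1] != name:
            return f"expected </{open_tags[-1]}> but found </{name}>"
        else:
            open_tags.pop()
    if open_tags:
        return f"<{open_tags[-1]}> is never closed"
    return None


good = "<person>\n<name>Alice</name>\n</person>"
bad = "<person>\n<name>Alice</person>\n</name>"

print(first_nesting_error(good))  # None
print(first_nesting_error(bad))   # expected </name> but found </person>
```

A real XML parser performs the same bookkeeping internally, which is why overlapping tags are always reported as an error.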
Rule 5: Attribute Values Must Be Quoted
All attribute values in XML must be enclosed in either single quotes or double quotes. Unquoted attribute values are not allowed.
Correct Example
<product id="101" category="electronics">Laptop</product>
Incorrect Example
<product id=101>Laptop</product>
The value 101 is not quoted, so this is not well-formed.
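The sketch below shows that either quote style is accepted, that attribute values always arrive as strings, and that an unquoted value is rejected. It assumes Python's standard xml.etree.ElementTree and xml.sax.saxutils modules; the variable names are illustrative.

```python
import xml.etree.ElementTree as ET
from xml.sax.saxutils import quoteattr

# Single and double quotes can be mixed within one tag.
product = ET.fromstring(
    "<product id='101' category=\"electronics\">Laptop</product>"
)
print(product.attrib)  # {'id': '101', 'category': 'electronics'}

try:
    ET.fromstring("<product id=101>Laptop</product>")
    unquoted_rejected = False
except ET.ParseError:
    unquoted_rejected = True
print(unquoted_rejected)  # True

# When building XML text by hand, quoteattr adds the quotes for you.
tag = f"<product id={quoteattr('101')}>Laptop</product>"
print(tag)  # <product id="101">Laptop</product>
```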
Rule 6: The XML Declaration (Optional but Recommended)
An XML document typically starts with a declaration that specifies the XML version and the character encoding used. While this is technically optional, it is strongly recommended as best practice.
<?xml version="1.0" encoding="UTF-8"?>
- version="1.0": Specifies the XML version being used.
- encoding="UTF-8": Specifies the character set. UTF-8 supports a wide range of characters from different languages.
If the declaration is present, it must be the very first thing in the document. Nothing may come before it, not even whitespace, a blank line, or a comment.
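The placement rule is easy to verify. This is a minimal sketch assuming Python's standard xml.etree.ElementTree module; the helper parses and the <library /> body are illustrative only.

```python
import xml.etree.ElementTree as ET

declaration = b'<?xml version="1.0" encoding="UTF-8"?>'
body = b"<library />"


def parses(data):
    """Return True if the parser accepts the given bytes."""
    try:
        ET.fromstring(data)
        return True
    except ET.ParseError:
        return False


print(parses(declaration + body))                     # True
print(parses(body))                                   # True: declaration is optional
print(parses(b"\n" + declaration + body))             # False: blank line comes first
print(parses(b"<!-- note -->" + declaration + body))  # False: comment comes first
```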
Rule 7: Special Characters Must Be Escaped
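Some characters have special meaning in XML and cannot be used directly in text content or attribute values. The characters < and & must always be replaced with their escape sequences, also called entity references. The characters >, ', and " have entity references as well; these are needed in some contexts, such as a quote character inside an attribute value that is delimited by the same kind of quote.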
| Character | Use Instead | Meaning |
|---|---|---|
| `<` | `&lt;` | Less than |
| `>` | `&gt;` | Greater than |
| `&` | `&amp;` | Ampersand |
| `'` | `&apos;` | Apostrophe |
| `"` | `&quot;` | Double quote |
Example Using Escaped Characters
<description>Price is less than &lt;100 &amp; more than 10</description>
If the < and & symbols were used directly, the XML parser would read < as the start of a tag and & as the start of an entity reference, and it would report an error.
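Escaping is usually done by a library rather than by hand. The sketch below assumes Python's standard xml.sax.saxutils.escape function and xml.etree.ElementTree module, and shows the text surviving a round trip through the parser.

```python
import xml.etree.ElementTree as ET
from xml.sax.saxutils import escape

raw_text = "Price is less than <100 & more than 10"

# escape() replaces &, <, and > with their entity references.
escaped = escape(raw_text)
print(escaped)  # Price is less than &lt;100 &amp; more than 10

# The parser turns the entity references back into the original characters.
element = ET.fromstring(f"<description>{escaped}</description>")
print(element.text == raw_text)  # True

# The unescaped text is rejected.
try:
    ET.fromstring(f"<description>{raw_text}</description>")
    raw_rejected = False
except ET.ParseError:
    raw_rejected = True
print(raw_rejected)  # True
```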
Rule 8: XML Comments
Comments can be added to an XML document using the same syntax as HTML comments. Comments are ignored by XML processors.
<!-- This is a comment and will be ignored by the parser -->
Comments cannot appear before the XML declaration, cannot be nested inside each other, and cannot contain two consecutive hyphens (--) in their text.
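The following sketch illustrates how comments behave in practice. It assumes Python's standard xml.etree.ElementTree module, whose default parser discards comments; other parsers may expose them to the application. The helper parses is hypothetical.

```python
import xml.etree.ElementTree as ET

# The comment does not show up as a child of the root element.
root = ET.fromstring("<employee><!-- HR record --><id>E001</id></employee>")
print([child.tag for child in root])  # ['id']


def parses(xml_text):
    """Return True if the parser accepts xml_text."""
    try:
        ET.fromstring(xml_text)
        return True
    except ET.ParseError:
        return False


print(parses("<a><!-- fine --></a>"))                  # True
print(parses("<a><!-- outer <!-- inner --> --></a>"))  # False: nested comment
print(parses("<a><!-- two -- hyphens --></a>"))        # False: "--" inside comment
```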
A Complete Well-Formed XML Example
<?xml version="1.0" encoding="UTF-8"?>
<!-- Employee record for HR system -->
<employee>
  <id>E001</id>
  <name>Maria Garcia</name>
  <department>Engineering</department>
  <salary currency="USD">75000</salary>
  <active />
</employee>
This document follows all XML rules: it has a declaration, one root element, properly nested and closed tags, quoted attribute values, and no illegal characters.
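Because the document is well-formed, a parser can read it without any special handling. This is a minimal sketch assuming Python's standard xml.etree.ElementTree module. Treating the presence of the empty <active /> element as a true/false flag is an assumption made here for illustration, not something the document itself defines.

```python
import xml.etree.ElementTree as ET

document = b"""<?xml version="1.0" encoding="UTF-8"?>
<!-- Employee record for HR system -->
<employee>
  <id>E001</id>
  <name>Maria Garcia</name>
  <department>Engineering</department>
  <salary currency="USD">75000</salary>
  <active />
</employee>
"""

employee = ET.fromstring(document)
salary = employee.find("salary")

record = {
    "id": employee.findtext("id"),
    "name": employee.findtext("name"),
    "department": employee.findtext("department"),
    "salary": int(salary.text),            # element text is always a string
    "currency": salary.get("currency"),    # attribute lookup
    "active": employee.find("active") is not None,  # assumed meaning of <active />
}
print(record)
```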
Key Points
- A well-formed XML document follows all syntax rules.
- There must be exactly one root element.
- All tags must be opened and closed correctly.
- XML is case-sensitive — opening and closing tags must match exactly.
- Tags must be properly nested and not overlapping.
- All attribute values must be wrapped in quotes.
- Special characters like < and & must be escaped with entity references.
