XPath/Basic Syntax

= Basic XPath Syntax =

Expressions that start with a forward slash "/" are called absolute expressions. They start at the root of the document. All other expressions are relative to the current position within an XML document.

An expression is built as a sequence of steps separated by slashes, each with an optional predicate, of the form

step[predicate]/step[predicate]/step[predicate]

You can think of the predicate as a filter or conditional expression that serves like a WHERE clause in SQL.
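The step[predicate]/step pattern can be sketched in Python with the standard xml.etree.ElementTree module, which implements only a subset of XPath. The inline document and its element names are hypothetical stand-ins, not the actual books file.

```python
import xml.etree.ElementTree as ET

# Hypothetical stand-in for the books file; the element names are assumptions.
BOOKS_XML = """
<books>
  <book>
    <title>XQuery Examples</title>
    <format>wikibook</format>
  </book>
  <book>
    <title>Printed Reference</title>
    <format>print</format>
  </book>
</books>
"""

root = ET.fromstring(BOOKS_XML)  # root is the <books> element

# step[predicate]/step: keep only the book children whose format child
# equals 'wikibook' (the WHERE-like filter), then step down to title.
titles = [t.text for t in root.findall("./book[format='wikibook']/title")]
print(titles)
```

The path is written relative to the books element because ElementTree does not accept absolute paths on an element.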

== Sample XML file ==
Many of the examples use a "books" example such as the following:

http://raw.github.com/dmccreary/learn-xquery/master/data/books.xml

In general the books file has the following structure:

== Basic XPath Expressions ==
The root document node: /

Note that the forward slash returns the document root, not the full books element.

The root element that contains all the books: /books

All book elements: /books/book or //book. The first is a fully spelled-out absolute path. The second uses the // abbreviation (descendant-or-self), which matches book elements at any level of the file.

Note that the first expression is typically faster on unindexed XML, where // has to scan the whole tree, but within indexed native XML databases the second can be faster.

A count of the number of books: count(//book)

All the book titles: //book/title

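The second book in the collection: //book[2]. Strictly, this selects every book that is the second book child of its parent; in a flat file with a single books element that is just the second book.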

The title of the second book: //book[2]/title

The third author of the second book: //book[2]/author[3]

All books with the format "wikibook": //book[format='wikibook']

Get a list of all the publishers: //publisher

Get a distinct list of the publishers (duplicates removed; distinct-values is an XPath 2.0 function): distinct-values(//publisher)

Books that have at least one price over 30: //book[list-price > 30]
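Several of the expressions above can be tried out with a rough Python sketch. It assumes a small hypothetical books document (names and values are invented for illustration) and uses the standard xml.etree.ElementTree module. That module supports only a subset of XPath, so count(), distinct-values() and the numeric comparison are emulated in plain Python rather than evaluated as XPath.

```python
import xml.etree.ElementTree as ET

# Hypothetical stand-in for books.xml; names and values are assumptions.
BOOKS_XML = """
<books>
  <book>
    <title>XQuery Examples</title>
    <author>Ann</author>
    <format>wikibook</format>
    <publisher>Wikibooks</publisher>
    <list-price>0</list-price>
  </book>
  <book>
    <title>XML Databases</title>
    <author>Bob</author>
    <author>Cid</author>
    <author>Dee</author>
    <format>print</format>
    <publisher>Acme Press</publisher>
    <list-price>45.50</list-price>
  </book>
  <book>
    <title>XPath Pocket Guide</title>
    <author>Eve</author>
    <format>print</format>
    <publisher>Acme Press</publisher>
    <list-price>19.95</list-price>
  </book>
</books>
"""

root = ET.fromstring(BOOKS_XML)  # the <books> element

# count(//book), emulated with len()
book_count = len(root.findall(".//book"))

# //book/title
titles = [t.text for t in root.findall(".//book/title")]

# //book[2]/title and //book[2]/author[3]
second_title = root.find(".//book[2]/title").text
third_author = root.find(".//book[2]/author[3]").text

# //book[format='wikibook']
wikibook_titles = [
    b.find("title").text for b in root.findall(".//book[format='wikibook']")
]

# distinct-values(//publisher), emulated with a set
publishers = sorted({p.text for p in root.iter("publisher")})

# //book[list-price > 30], emulated because ElementTree has no comparisons
expensive = [
    b.find("title").text
    for b in root.iter("book")
    if any(float(p.text) > 30 for p in b.findall("list-price"))
]

print(book_count, second_title, third_author)
print(wikibook_titles, publishers, expensive)
```

A full XPath engine would evaluate the original expressions directly; the emulated parts are only there to keep the sketch free of third-party dependencies.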

== XPath abbreviations ==
. represents the current node

.. represents the parent of the current node

@ marks an attribute name (shorthand for the attribute:: axis)

$ marks a variable reference

[n] selects the n-th of the nodes matched by the step it follows

ancestor::div represents the set of all div ancestors of the current node (parent, grandparent, and so on)

normalize-space(firstname)="Paul" matches Paul regardless of leading, trailing, or repeated whitespace

boolean(string($myvar)) returns false when the string value of $myvar is empty

/ represents the document root node

@* represents all attributes of the current node
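A few of these abbreviations can be exercised with Python's standard xml.etree.ElementTree module, which understands ., .., @ and positional predicates. This is only a sketch: the document below, including its id and lang attributes, is hypothetical, and @* is approximated by reading the element's attribute dictionary.

```python
import xml.etree.ElementTree as ET

# Hypothetical document; the id and lang attributes are assumptions.
DOC = """
<books>
  <book id="b1" lang="en"><title>First</title></book>
  <book id="b2" lang="de"><title>Second</title></book>
</books>
"""

root = ET.fromstring(DOC)

current = root.find(".")                        # . : the context node itself
parent_of_title = root.find("./book/title/..")  # .. : parent of the first title
by_attribute = root.find("./book[@id='b2']")    # @ : test an attribute value
second = root.find("./book[2]")                 # [n] : second book child
all_attributes = dict(second.attrib)            # rough equivalent of @*

print(current is root)
print(parent_of_title.get("id"))
print(by_attribute.find("title").text)
print(second is by_attribute)
print(all_attributes)
```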

-Return all values using a union of attributes, child nodes, and text nodes:

@* | node() | text()

-Return all of a node's siblings using a union of the preceding-sibling and following-sibling axes:

preceding-sibling::node() | following-sibling::node()

-Return the following siblings of a specific type (append [1] to keep only the nearest one)

//div/following-sibling::h3

-Check string value of current node

[. = "Matthew Bob"]

-Node identity can be checked using the count function. Two node-sets of the same size are identical when the count of their union equals that size; for two single nodes, the union must contain exactly one node. For example, the following query returns true because both operands select the same books element:

count(/bk:books | /bk:books/bk:book[1]/parent::*) = 1
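The same identity test can be approximated in Python. The sketch below assumes a hypothetical two-book document with the bk: namespace prefix dropped for brevity, and it emulates the | operator with a small helper, since xml.etree.ElementTree supports neither unions nor count().

```python
import xml.etree.ElementTree as ET

# Hypothetical document; the bk: namespace prefix is dropped for brevity.
DOC = "<books><book><title>A</title></book><book><title>B</title></book></books>"
root = ET.fromstring(DOC)


def union(*node_lists):
    """Emulate the XPath | operator: merge node lists, dropping duplicate nodes."""
    seen = {}
    for nodes in node_lists:
        for node in nodes:
            seen.setdefault(id(node), node)  # compare by identity, not equality
    return list(seen.values())


books = [root]                                  # /books
parent_of_first = root.findall("./book[1]/..")  # /books/book[1]/parent::*
first_book = root.findall("./book[1]")          # /books/book[1]

# Same node on both sides: the union collapses to a single node.
same_node = len(union(books, parent_of_first)) == 1
# Different nodes: the union keeps both, so the test fails.
different_nodes = len(union(books, first_book)) == 1
print(same_node, different_nodes)
```

The helper keys on object identity because two distinct elements can have identical content, which is exactly the case the count-of-union test is meant to tell apart.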