XQuery/All Leaf Paths

Motivation
You want to generate a list of all leaf paths in a document or document collection.

This process is a useful way to get to know a new data set. In data-style markup, the leaf elements of an XML file carry much of the data, and they frequently carry most of the semantics or meaning within the document. They form the basis for a semantic inventory of the document. That is, each leaf element should be able to be associated with a data definition.

Leaf elements are also good targets for indexing within your index configuration file.

Method
We will use the functx leaf-elements function:

functx:leaf-elements($nodes*) xs:string*

This function takes one or more nodes as input and returns a sequence of strings.

Example Output
For the demo play Hamlet, which is included in the eXist demo set as the file /db/shakespeare/plays/hamlet.xml, the query will generate the following output:

Source Code to leaf-elements
This query uses the descendant-or-self::* axis step with the predicate [not(*)] to select only elements that have no child elements. Such elements may still contain text.
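
The selection described above can be sketched as follows. This is a minimal illustration, not the FunctX source: local:leaf-elements and local:path are hypothetical helper names, and the inline PLAY document is made-up sample data standing in for a stored file such as hamlet.xml.

```xquery
xquery version "1.0";

(: Hypothetical helper: select every element, at any depth,
   that has no child elements. :)
declare function local:leaf-elements($nodes as node()*) as element()* {
  $nodes/descendant-or-self::*[not(*)]
};

(: Hypothetical helper: build a slash-separated path of element
   names from the outermost ancestor down to the element itself. :)
declare function local:path($e as element()) as xs:string {
  string-join(for $a in $e/ancestor-or-self::* return name($a), '/')
};

(: Made-up sample data, not the real Hamlet file. :)
let $doc :=
  <PLAY>
    <TITLE>Demo</TITLE>
    <ACT>
      <SCENE>
        <SPEECH><SPEAKER>A</SPEAKER><LINE>One</LINE></SPEECH>
        <SPEECH><SPEAKER>B</SPEAKER><LINE>Two</LINE></SPEECH>
      </SCENE>
    </ACT>
  </PLAY>
return
  (: Collapse repeated leaves into one entry per distinct path. :)
  distinct-values(
    for $leaf in local:leaf-elements($doc)
    return local:path($leaf)
  )
```

For this sample the result contains three distinct paths: PLAY/TITLE, PLAY/ACT/SCENE/SPEECH/SPEAKER and PLAY/ACT/SCENE/SPEECH/LINE. To run the same sketch against a stored document, $doc could be bound to a doc() call instead of the inline element.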

Adding Attributes
You can also run a query that will get all the distinct attributes. Attributes are all considered leaf data types since they can never have child elements.

This query says, in effect, "get all the distinct attribute names in the input nodes".
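
A query of this kind might look like the sketch below. It is an illustration under stated assumptions: the inline element is made-up sample data loosely modeled on a MODS record, not the demo file itself.

```xquery
xquery version "1.0";

(: Made-up sample data standing in for a stored MODS record. :)
let $doc :=
  <mods>
    <titleInfo type="alternative"><title>Demo</title></titleInfo>
    <subject authority="lcsh"><topic>XML</topic></subject>
    <recordInfo>
      <recordContentSource authority="marcorg">DLC</recordContentSource>
    </recordInfo>
  </mods>
return
  (: Collect the name of every attribute at any depth,
     then keep one entry per distinct name. :)
  distinct-values(
    for $attr in $doc//@*
    return name($attr)
  )
```

For this sample the distinct names are type and authority. The authority attribute appears twice in the data but is reported only once.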

For the MODS demo file: doc('/db/mods/01c73f2b05650de2e6124d9d113f40be.xml')

You will get the following attributes: type encoding authority