XQuery/XQuery and Python

Cameron Laird has an example of Python code that extracts and lists the a tags on an XHTML page:

    import elementtree.ElementTree

    for element in elementtree.ElementTree.parse("draft2.xml").findall("//a"):
        if element.tag == "a":
            attributes = element.attrib
            if "href" in attributes:
                print "'%s' is at URL '%s'." % (element.text,
                                                attributes['href'])
            if "name" in attributes:
                print "'%s' anchors '%s'." % (element.text,
                                              attributes['name'])
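The listing above uses the Python 2 print statement and the standalone elementtree package. As a rough sketch only, the same logic might look like this on Python 3 with the standard library's xml.etree.ElementTree. The function name describe_anchors and the SAMPLE document are hypothetical stand-ins (the latter for the contents of draft2.xml), and the lines are collected into a list rather than printed directly so they are easier to check.

```python
import io
import xml.etree.ElementTree as ET

# Hypothetical stand-in for the contents of draft2.xml.
SAMPLE = """<html><body>
<p><a href="http://example.org/">Example</a> and <a name="top">Top</a></p>
</body></html>"""


def describe_anchors(source):
    """Return one line per href or name attribute found on an <a> element."""
    lines = []
    # iter("a") walks the whole tree, much like the "//a" search above.
    for element in ET.parse(source).getroot().iter("a"):
        attributes = element.attrib
        if "href" in attributes:
            lines.append("'%s' is at URL '%s'." % (element.text, attributes["href"]))
        if "name" in attributes:
            lines.append("'%s' anchors '%s'." % (element.text, attributes["name"]))
    return lines


if __name__ == "__main__":
    # ET.parse accepts a file name or a file-like object.
    for line in describe_anchors(io.StringIO(SAMPLE)):
        print(line)
```

Passing a file name such as "draft2.xml" instead of the StringIO object should work the same way, since ET.parse accepts either.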

The XQuery equivalent of this Python code is:

    for $a in doc("http://en.wikipedia.org/wiki/XQuery")//*:a
    return
        if ($a/@href)
        then concat("'", $a, "' is at URL '", $a/@href, "'&#10;")
        else if ($a/@name)
        then concat("'", $a, "' anchors '", $a/@name, "'&#10;")
        else ()


Here the namespace prefix in *:a is a wildcard, since we don't know which namespace the page's HTML elements are in.
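For comparison, Python's xml.etree.ElementTree has had a similar wildcard since Python 3.8: a path step written as {*}a matches a elements in any namespace or in none. The following is a small sketch with a made-up sample document, not part of the original example.

```python
import xml.etree.ElementTree as ET

XHTML_NS = "http://www.w3.org/1999/xhtml"

# Hypothetical sample: one <a> in the XHTML namespace, one in no namespace.
SAMPLE = (
    "<root>"
    f'<a xmlns="{XHTML_NS}" href="http://example.org/">namespaced</a>'
    '<a href="http://example.com/">plain</a>'
    "</root>"
)

root = ET.fromstring(SAMPLE)

# A bare "a" only matches elements that are in no namespace.
plain_only = [a.text for a in root.findall(".//a")]

# "{*}a" matches <a> in any namespace or none, like *:a in XQuery.
any_namespace = [a.text for a in root.findall(".//{*}a")]

print(plain_only)
print(any_namespace)
```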

More succinctly but less readably (and with quotes omitted from the output for clarity), this could be expressed as:

    string-join(
        doc("http://en.wikipedia.org/wiki/XQuery")//*:a/
            (if (@href) then concat(., " is at URL ", @href)
             else if (@name) then concat(., " anchors ", @name)
             else ()),
        '&#10;'
    )
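A comparable one-expression form in Python might join a generator with newlines. This is only a sketch: the function name summarize and the SAMPLE string are hypothetical, and "".join(a.itertext()) is used to approximate the string value that . yields in the XQuery version.

```python
import xml.etree.ElementTree as ET

# Hypothetical sample input; the last <a> has neither href nor name.
SAMPLE = (
    '<div><a href="http://example.org/">Example</a>'
    '<a name="top">Top</a><a>Bare</a></div>'
)


def summarize(xml_text):
    """Join one description per <a> that carries an href or a name."""
    anchors = ET.fromstring(xml_text).findall(".//{*}a")
    return "\n".join(
        # href takes precedence over name, as in the XQuery conditional.
        "".join(a.itertext())
        + (" is at URL " + a.get("href") if a.get("href") is not None
           else " anchors " + a.get("name"))
        for a in anchors
        if a.get("href") is not None or a.get("name") is not None
    )


print(summarize(SAMPLE))
```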


More usefully, we might supply the URL of any XHTML page as a parameter and generate an HTML page of its external links:

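    declare option exist:serialize "method=xhtml media-type=text/html";

    (: The HTML element names below are a minimal, illustrative wrapper. :)
    let $url := request:get-parameter("url", ())
    return
        <html>
            <body>
                <h1>External links in {$url}</h1>
                {
                    for $a in doc($url)//*:a[text()][starts-with(@href, 'http://')]
                    return <div>{string($a)} is at {string($a/@href)}</div>
                }
            </body>
        </html>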
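To round out the comparison, here is a hedged Python sketch of the same idea. It assumes the page source has already been fetched and is well-formed XHTML, so the function takes the markup as a string instead of retrieving it from the URL. The name external_links_page and the output markup are hypothetical. Note that it filters on non-empty text content, which is close to, but not identical to, the a[text()] test in the query.

```python
import html
import xml.etree.ElementTree as ET


def external_links_page(url, xhtml_text):
    """Build a small HTML page listing the external links in xhtml_text."""
    items = []
    for a in ET.fromstring(xhtml_text).findall(".//{*}a"):
        text = "".join(a.itertext()).strip()
        href = a.get("href", "")
        # Keep only links with visible text and an absolute http:// target.
        if text and href.startswith("http://"):
            items.append(
                "<div>%s is at %s</div>" % (html.escape(text), html.escape(href))
            )
    heading = "External links in %s" % html.escape(url)
    return "<html><body><h1>%s</h1>%s</body></html>" % (heading, "".join(items))


if __name__ == "__main__":
    # Hypothetical input page in the XHTML namespace.
    sample = (
        '<html xmlns="http://www.w3.org/1999/xhtml"><body>'
        '<a href="http://example.com/">Example</a>'
        '<a href="/local">Local</a>'
        "</body></html>"
    )
    print(external_links_page("http://example.org/page", sample))
```

Escaping the text and the URL with html.escape keeps characters such as & and < from breaking the generated page.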
