XQuery/Pachube feed

Motivation
You want to create a feed for Pachube. Pachube is a platform that lets you store, share and discover realtime sensor, energy and environment data from objects, devices and buildings around the world, providing a platform for sensor data integration. Histories gathered by Pachube can be presented in various formats and used by other applications to mash up feeds.

Modules and concepts

 * eXist httpclient to GET, POST and PUT
 * eXist scheduler for job scheduling
 * eXist update extension
 * server-side XSLT

Tower Bridge
The idea of a feed of the open/closed status of Tower Bridge in London was borrowed from @ni.

A Twitter stream provides the base data for a simple status feed. An XQuery script reads the RSS feed from this stream, deduces the status from the text of the latest item, and updates an XML file representing the current status.

This XML file has an attached XSLT stylesheet, so that when the file is pulled on schedule by Pachube from the eXist database it is first transformed server-side into the EEML format required for Pachube feeds. As configured on the UWE server, this uses Saxon XSLT 2.0.
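The status file itself can be very small. This sketch shows the idea: the element and attribute names are taken from the update script below, while the stylesheet name and the sample attribute values are assumptions.

```xml
<?xml-stylesheet type="text/xsl" href="bridge.xsl"?>
<data>
  <!-- bridge="1" when open, "0" when closed; timestamps are illustrative -->
  <status bridge="0"
          lastChange="Mon, 20 Jul 2009 10:15:00 +0000"
          lastUpdate="2009-07-20T10:16:02Z"/>
</data>
```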

XQuery script
let $rss := httpclient:get(xs:anyURI("http://twitter.com/statuses/user_timeline/14012942.rss"), false(), ())/httpclient:body
let $lastChange := $rss//item[1]
let $bridgeStatus := doc("/db/Wiki/Pachube/bridge.xml")/data/status
return
    if (exists($lastChange) and exists($bridgeStatus))
    then
        let $open := if (contains($lastChange/description, "opening")) then "1" else "0"
        return
            update replace $bridgeStatus with
                element status {
                    attribute bridge {$open},
                    attribute lastChange {$lastChange/pubDate},
                    attribute lastUpdate {current-dateTime()}
                }
    else ()

1. httpclient is used here because doc throws an error about duplicate namespace declarations - under investigation

XSLT
 

Job scheduling
The XQuery update script is invoked by the eXist job scheduler every minute. The Quartz cron expression "0 0/1 * * * ?" fires at second 0 of every minute:

let $login := xmldb:login("/db", "user", "password")
let $del := scheduler:delete-scheduled-job("BRIDGE")
let $job := scheduler:schedule-xquery-cron-job("/db/Wiki/Pachube/pollbridgerss.xq", "0 0/1 * * * ?", "BRIDGE")
return $job

Feed view
There is a public view of the feed as processed by Pachube: http://www.pachube.com/feeds/3922

Discussion
The Pachube interface refreshes automatic feeds every 15 minutes (for the free service). Since a typical bridge lift lasts 10 minutes, some lifts are likely to be missed. The alternative is to push changes to Pachube as they are detected.

Weather Push Feed
Many amateur weather stations use Weather Display software. This software writes current observations to a space-delimited text file to support interfaces to viewing software, such as the Flash-based Weather Display Live. The text files are generally web-accessible, so any client can read this raw data, although it is polite to ask for access.

One such station, located in Horfield, Bristol, is run by Martyn Hicks: http://www.martynhicks.co.uk/weather/data.php. The raw data file is http://www.martynhicks.co.uk/weather/clientraw.txt

In this Push implementation, a manual feed is defined in Pachube via the API by POSTing a full EEML document. An XML descriptor file defines the mapping between values in the data file and datastreams in the feed. A scheduled XQuery script reads the data file and transforms it via the mapping file to EEML format prior to PUTting it to the Pachube API.

XQuery feed creation
The feed is defined in an EEML document which is POSTed to the Pachube API. A return code of 201 indicates that the feed has been created.

let $url := xs:anyURI("http://www.pachube.com/api/feeds/")
let $headers :=
    (: HTTP header elements - the markup has been lost in this rendering :)
    ()
let $feed :=
    (: the EEML feed definition - the markup has been lost in this rendering.
       Its text content was: title "Horfield Weather"; description "The weather
       observed by a weather station run by Martyn Hicks. Public interface is
       http://www.martynhicks.co.uk/weather/data.php"; contact kit.wallace@gmail.com;
       location "Horfield", lat 51.4900, lon -2.5805 :)
    ()
return httpclient:post($url, $feed, false(), $headers)
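For reference, a feed-definition document along these lines might look like the following sketch. It assumes the EEML 0.5.1 schema; the element order and the location attributes are assumptions, while the text values are those surviving from the script above.

```xml
<eeml xmlns="http://www.eeml.org/xsd/005" version="5">
  <environment>
    <title>Horfield Weather</title>
    <description>The weather observed by a weather station run by Martyn Hicks.
      Public interface is http://www.martynhicks.co.uk/weather/data.php</description>
    <email>kit.wallace@gmail.com</email>
    <location domain="physical" exposure="outdoor" disposition="fixed">
      <name>Horfield</name>
      <lat>51.4900</lat>
      <lon>-2.5805</lon>
    </location>
  </environment>
</eeml>
```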

Mapping File
An XML document defines the origin of the raw data, the Pachube appid, and the mapping from data values (1-based positions in the raw file) to datastreams (numbered from 1 in document order). For the Horfield feed the appid is 4013 and the mapped fields are Average Wind Speed, Wind Direction, Temperature and Barometer. (The markup of this file has been lost in this rendering.)
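A mapping document consistent with the update script below might look like this sketch. The element and attribute names are taken from the script; the field positions in @n and the units are illustrative assumptions, not the real clientraw.txt positions.

```xml
<weatherfeed xmlns="http://www.cems.uwe.ac.uk/xmlwiki/wdl">
  <appid>4013</appid>
  <data>http://www.martynhicks.co.uk/weather/clientraw.txt</data>
  <format>
    <!-- @n is the 1-based position of the value in the raw data file -->
    <field n="2" unit="knots">Average Wind Speed</field>
    <field n="4" unit="degrees">Wind Direction</field>
    <field n="5" unit="C">Temperature</field>
    <field n="7" unit="hPa">Barometer</field>
  </format>
</weatherfeed>
```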

Update script
This script is scheduled to run every minute (as above).

The mapping file namespace needs to be declared:

declare namespace wdl = "http://www.cems.uwe.ac.uk/xmlwiki/wdl";

First, a function to read the raw data file and tokenize it into a sequence of values:

declare function local:client-data($rawuri) {
    let $headers :=
        element headers {
            element header {
                attribute name {"Cache-Control"},
                attribute value {"no-cache"}
            }
        }
    let $raw := httpclient:get(xs:anyURI($rawuri), false(), $headers)/httpclient:body
    return tokenize($raw, "\+")
};

Then, a function to transform the sequence of values into the Pachube data channels:

declare function local:data-to-eeml($data, $format) {
    for $field at $id in $format/wdl:field
    let $name := string($field)
    let $index := xs:integer($field/@n)
    return
        element data {
            attribute id {$id},
            element tag {$name},
            element value {$data[$index]},
            element unit {string($field/@unit)}
        }
};

The main line fetches the feed definition file (here hard-coded, but it could be passed in as a parameter), obtains the data values, generates the EEML and PUTs it to the Pachube API.

let $feed := doc("/db/Wiki/Pachube/horfieldweather.xml")/wdl:weatherfeed
let $data := local:client-data($feed/wdl:data)
let $appid := $feed/wdl:appid
let $APIKey := "eeda7c27ff8b7c49e8529e4eb4b3f57724c5b609db0d22904df11edd4742e92c"
let $url := xs:anyURI(concat("http://www.pachube.com/api/", $appid))
let $headers :=
    (: HTTP header elements, presumably carrying $APIKey - markup lost in this rendering :)
    ()
let $eeml :=
    (: the EEML envelope - markup lost in this rendering; it wraps :)
    local:data-to-eeml($data, $feed/wdl:format)
return httpclient:put($url, $eeml, false(), $headers)
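A sketch of the two stripped constructors, assuming the EEML 0.5.1 namespace and Pachube's X-PachubeApiKey authentication header (both assumptions):

```xquery
let $headers :=
    <headers>
        <header name="X-PachubeApiKey" value="{$APIKey}"/>
    </headers>
let $eeml :=
    <eeml xmlns="http://www.eeml.org/xsd/005" version="5">
        <environment>
            {local:data-to-eeml($data, $feed/wdl:format)}
        </environment>
    </eeml>
return httpclient:put($url, $eeml, false(), $headers)
```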

Feed view
There is a public view of the Pachube feed at http://www.pachube.com/feeds/4013

Weather Pull Feed
The alternative approach is simpler and relies on Pachube to pull data on their schedule.

In this example, weather station data consolidated by WeatherUnderground and republished as XML is transformed to EEML.

WeatherUnderground Feed
A typical XML feed for a WeatherUnderground station is http://api.wunderground.com/weatherstation/WXCurrentObXML.asp?ID=IBAYOFPL1

XSLT transform
This XML can be transformed to EEML using XSLT:
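The stylesheet itself has been lost in this rendering. A minimal sketch, assuming the WeatherUnderground current_observation element names (such as station_id, temp_c and pressure_mb) and the EEML 0.5.1 schema, might look like:

```xml
<xsl:stylesheet version="2.0"
                xmlns:xsl="http://www.w3.org/1999/XSL/Transform"
                xmlns="http://www.eeml.org/xsd/005">
  <xsl:template match="/current_observation">
    <eeml version="5">
      <environment>
        <title><xsl:value-of select="station_id"/></title>
        <data id="1">
          <tag>Temperature</tag>
          <value><xsl:value-of select="temp_c"/></value>
          <unit>C</unit>
        </data>
        <data id="2">
          <tag>Barometer</tag>
          <value><xsl:value-of select="pressure_mb"/></value>
          <unit>hPa</unit>
        </data>
      </environment>
    </eeml>
  </xsl:template>
</xsl:stylesheet>
```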

 

Connecting XML to XSLT
A simple XQuery script accepts one parameter, the station id, fetches the XML feed and transforms using XSLT to EEML:

let $id := request:get-parameter("id", ())
let $ss := doc("/db/Wiki/Pachube/weatherunderground.xsl")
let $data := doc(concat("http://api.wunderground.com/weatherstation/WXCurrentObXML.asp?ID=", $id))
return transform:transform($data, $ss, ())

The script can be invoked: http://www.cems.uwe.ac.uk/xmlwiki/Pachube/weatherunderground.xq?id=IBAYOFPL1.

Since this script is parameterised, it could be used with any WeatherUnderground station.

Pachube Feed
An automatic feed which uses this script has been created: http://www.pachube.com/feeds/4037

NOAA Feed
We can adopt a similar approach with the feeds for US ICAO stations. NOAA provides XML feeds such as http://www.weather.gov/xml/current_obs/KEWR.xml. The format is nearly the same as the WeatherUnderground feed and is documented at http://www.weather.gov/view/current_observation.xsd. The update rate is hourly, but there is currently no way to configure Pachube to update at that frequency.

XSLT
 

XQuery Script
let $id := request:get-parameter("id", ())
let $ss := doc("/db/Wiki/Pachube/NOAA.xsl")
let $data := doc(concat("http://www.weather.gov/xml/current_obs/", $id, ".xml"))
return transform:transform($data, $ss, ())

Feed
The transformed XML http://www.cems.uwe.ac.uk/xmlwiki/Pachube/NOAA.xq?id=KEWR is the basis for the manual feed http://www.pachube.com/feeds/4047

XSLT only
If Pachube supported server-side XSLT, the whole task could be handled by a single XSLT script. For the sake of generalisation, it is helpful to provide an interface which allows parameters to be passed to the script, but this is not necessary:

(the stylesheet body has been lost in this rendering; only the closing </xsl:template> and </xsl:stylesheet> tags survive)
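A self-contained version could run along these lines. This is a sketch: the station parameter, the use of document() to fetch the NOAA observation, and the EEML mapping are all assumptions.

```xml
<xsl:stylesheet version="2.0"
                xmlns:xsl="http://www.w3.org/1999/XSL/Transform"
                xmlns="http://www.eeml.org/xsd/005">
  <xsl:param name="station" select="'KEWR'"/>
  <xsl:template match="/">
    <!-- fetch the NOAA observation document for the requested station -->
    <xsl:variable name="obs"
        select="document(concat('http://www.weather.gov/xml/current_obs/', $station, '.xml'))/current_observation"/>
    <eeml version="5">
      <environment>
        <title><xsl:value-of select="$obs/station_id"/></title>
        <data id="1">
          <tag>Temperature</tag>
          <value><xsl:value-of select="$obs/temp_c"/></value>
          <unit>C</unit>
        </data>
      </environment>
    </eeml>
  </xsl:template>
</xsl:stylesheet>
```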

The server can just run this standalone to generate the EEML feed. This small XQuery script uses the SAXON processor on the eXist platform:

transform:transform((), doc("/db/Wiki/Pachube/NOAA3.xsl"), ())

XSLT Execute  (currently fails - under investigation)

Automatic Feed
The XSLT conversion could be provided as a service by an eXist db. It would need:


 * an XML database of connection definitions e.g.

<PachubeFeeds>
   <PachubeFeed id="2001">
      <!-- data, xslt and params elements; some markup lost in this rendering -->
      <xslt>http://www.cems.uwe.ac.uk/xmlwiki/Pachube/NOAA7.xsl</xslt>
   </PachubeFeed>
   <PachubeFeed id="2002">
      <xslt>http://www.cems.uwe.ac.uk/xmlwiki/Pachube/NOAA7.xsl</xslt>
   </PachubeFeed>
</PachubeFeeds>


 * a script to locate and execute a feed:

let $id := request:get-parameter("id", ())
let $feed := doc("/db/Wiki/Pachube/feeds.xml")//PachubeFeed[@id = $id]
return transform:transform(
    doc($feed/data),
    doc($feed/xslt),
    $feed/params
)

http://www.cems.uwe.ac.uk/xmlwiki/Pachube/getFeed.xq?id=2001
 * Pachube automatic feeds can now be created with a URL like the getFeed URL above, e.g. http://www.pachube.com/feeds/4661


 * User interface and database to allow users to register, create and edit feeds

There are issues here with loading and with unsafe code in the stored XSLT.

Output
Similarly, output processing of either the current EEML or a specific datastream's CSV history could be provided with a little code and XSLT. Since this may require authentication, API keys would also have to be stored in this database. Jobs could be generated and scheduled to implement triggers, but this would need a timed pull of the required data.

Code is needed to convert the history feeds provided by Pachube to XML, since these are only available as CSV. Once in XML, XSLT can transform them to the required format. Of course, it would be preferable if Pachube provided XML feeds in addition to the CSV feeds.

Archive
The full archive is provided as a CSV file. We can convert it to XML with the following script:

import module namespace csv = "http://www.cems.uwe.ac.uk/xmlwiki/csv" at "../lib/csv.xqm";

let $feed := request:get-parameter("feed", "")
let $stream := request:get-parameter("stream", "")
let $archiveurl := concat("http://www.pachube.com/feeds/", $feed, "/datastreams/", $stream, "/archive.csv")
let $data := csv:get-data($archiveurl)
let $rows := tokenize($data, $csv:newline)
let $now := current-dateTime()
return
    <history feed="{$feed}" stream="{$stream}" dateTime="{$now}" count="{count($rows)}">
        {for $row in $rows
         let $point := tokenize($row, ",")
         return <value dateTime="{$point[1]}">{$point[2]}</value>
        }
    </history>

http://www.cems.uwe.ac.uk/xmlwiki/Pachube/getArchive.xq?feed=4037&stream=2

24 Hour History
In the CSV stream these values are untimed. The time of each value has to be estimated, stepping back from the current time and forward in 15-minute increments using xs:dayTimeDuration:

import module namespace csv = "http://www.cems.uwe.ac.uk/xmlwiki/csv" at "../lib/csv.xqm";

let $feed := request:get-parameter("feed", "")
let $stream := request:get-parameter("stream", "")
let $historyurl := concat("http://www.pachube.com/feeds/", $feed, "/datastreams/", $stream, "/history.csv")
let $data := csv:get-data($historyurl)
let $values := tokenize($data, ",")
let $now := current-dateTime()
let $then := $now - xs:dayTimeDuration("P1D")
return
    <history feed="{$feed}" stream="{$stream}" dateTime="{$now}" count="{count($values)}">
        {for $value at $i in $values
         let $dt := $then + xs:dayTimeDuration(concat("PT", 15 * $i, "M"))
         return <value dateTime="{$dt}">{$value}</value>
        }
    </history>

http://www.cems.uwe.ac.uk/xmlwiki/Pachube/getHistory.xq?feed=4037&stream=2