XQuery/Caching and indexes

Motivation
The views of the data about individual teams or groups need to be supplemented with indexes to the resources for which those views are appropriate. Generating the indexes on demand is one approach, but it puts load on the SPARQL server. Given the batch nature of the DBpedia extract, it makes more sense to cache the index data and use the cache to generate an index page. Triggering the cache refresh is a separate problem.

Non-caching approach
The following script generates an index page with links to the HTML view and the timeline view of an artist's albums.

declare option exist:serialize "method=xhtml media-type=text/html";

declare variable $query := "
PREFIX skos:
PREFIX p:
SELECT * WHERE {
     ?group  skos:subject .
  }
";

declare function local:clean($text) {
  let $text := util:unescape-uri($text, "UTF-8")
  let $text := replace($text, "\(.*\)", "")
  let $text := replace($text, "_", " ")
  return $text
};

let $category := request:get-parameter("category", "")
let $categoryx := replace($category, "_", " ")
let $queryx := replace($query, "Rock_and_Roll_Hall_of_Fame_inductees", $category)
let $sparql := concat("http://dbpedia.org/sparql?default-graph-uri=", escape-uri("http://dbpedia.org", true()),
                      "&amp;query=", escape-uri($queryx, true()))
let $result := doc($sparql)
return {$categoryx}
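As a minimal sketch of how the rows of a SPARQL result might be turned into index links, the following assumes the endpoint returns an HTML table whose first column holds the resource URI. The inline $table stands in for the document fetched from DBpedia, the rows are sample data, and the script names group2html.xq and groupTimeline.xq are hypothetical placeholders for the HTML and timeline views.

```xquery
xquery version "1.0";

(: Turn a DBpedia resource URI into a readable name:
   drop the URI prefix, any bracketed qualifier, and underscores. :)
declare function local:name-of($uri as xs:string) as xs:string {
  let $id := substring-after($uri, "resource/")
  let $no-brackets := replace($id, "\(.*\)", "")
  return normalize-space(replace($no-brackets, "_", " "))
};

(: Sample data standing in for the table returned by the SPARQL endpoint. :)
let $table :=
  <table>
    <tr><th>group</th></tr>
    <tr><td>http://dbpedia.org/resource/The_Beatles</td></tr>
    <tr><td>http://dbpedia.org/resource/Queen_(band)</td></tr>
  </table>
return
  (: group2html.xq and groupTimeline.xq are hypothetical script names. :)
  <ul>{
    for $row in $table/tr[position() > 1]
    let $uri := string($row/td[1])
    let $id := substring-after($uri, "resource/")
    let $name := local:name-of($uri)
    order by $name
    return
      <li>{$name} (<a href="group2html.xq?group={encode-for-uri($id)}">HTML</a>, <a href="groupTimeline.xq?group={encode-for-uri($id)}">timeline</a>)</li>
  }</ul>
```

Because every request repeats the remote query, each page view in this style costs one round trip to the SPARQL server.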

Index examples

 * Rock and Roll Groups

Caching Approach
Two scripts are needed: one to generate the data to cache, the other to generate the index page. The approach is illustrated with an index to Rock and Roll groups based on the Wikipedia category Rock and Roll Hall of Fame inductees.

Generate the index data
This script, parameterised by a category, generates the index data as XML. A further development would store the XML directly in the database; for now the output can be saved manually to the appropriate location.

declare variable $query := "
PREFIX skos:
PREFIX p:
SELECT * WHERE {
     ?group  skos:subject .
  }
";

declare function local:clean($text) {
  let $text := util:unescape-uri($text, "UTF-8")
  let $text := replace($text, "\(.*\)", "")
  let $text := replace($text, "_", " ")
  return $text
};

declare function local:table-to-seq($table) {
  let $head := $table/tr[1]
  for $row in $table/tr[position() > 1]
  return {
    for $cell at $i in $row/td
    return element {$head/th[position() = $i]} {string($cell)}
  }
};

let $category := request:get-parameter("category", "Rock_and_Roll_Hall_of_Fame_inductees")
let $queryx := replace($query, "Rock_and_Roll_Hall_of_Fame_inductees", $category)
let $sparql := concat("http://dbpedia.org/sparql?default-graph-uri=", escape-uri("http://dbpedia.org", true()),
                      "&amp;query=", escape-uri($queryx, true()))
let $result := doc($sparql)/table
let $groups := local:table-to-seq($result)
return {
  for $group in $groups
  let $resource := substring-after($group/group, "resource/")
  let $name := local:clean($resource)
  order by $name
  return
}
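The following is a minimal sketch of the whole generation step, including the "further development" of storing the result in the database. Only the ResourceList element and its category attribute are taken from the index page script; the wrapper element tuple, the resource element with its id and name attributes, and the collection path /db/index are assumptions made for illustration. The inline table is sample data standing in for the SPARQL result, and xmldb:store needs an eXist session with write permission on the target collection.

```xquery
xquery version "1.0";

(: Convert an HTML result table into one element per data row,
   with one child per column named after the column header.
   The wrapper name "tuple" is an assumption. :)
declare function local:table-to-seq($table as element(table)) as element(tuple)* {
  let $head := $table/tr[1]
  for $row in $table/tr[position() > 1]
  return
    <tuple>{
      for $cell at $i in $row/td
      return element { string($head/th[$i]) } { string($cell) }
    }</tuple>
};

(: Make a readable name from a resource identifier, as in the scripts above. :)
declare function local:clean($text as xs:string) as xs:string {
  let $text := util:unescape-uri($text, "UTF-8")
  let $text := replace($text, "\(.*\)", "")
  return normalize-space(replace($text, "_", " "))
};

let $category := "Rock_and_Roll_Hall_of_Fame_inductees"
(: Sample data standing in for doc($sparql)/table. :)
let $table :=
  <table>
    <tr><th>group</th></tr>
    <tr><td>http://dbpedia.org/resource/The_Beatles</td></tr>
    <tr><td>http://dbpedia.org/resource/Queen_(band)</td></tr>
  </table>
let $list :=
  <ResourceList category="{$category}">{
    for $group in local:table-to-seq($table)
    let $id := substring-after($group/group, "resource/")
    let $name := local:clean($id)
    order by $name
    return <resource id="{$id}" name="{$name}"/>
  }</ResourceList>
(: "/db/index" is a hypothetical collection; the file is named after the category. :)
return xmldb:store("/db/index", concat($category, ".xml"), $list)
```

Storing one file per category keeps a refresh simple: rerunning the script for a category overwrites that category's file only.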

Note: I guess a better approach would be to use triples here, saved to a local triple store.

HTML index page
This script, groupList, uses the cached index data.

declare option exist:serialize "method=xhtml media-type=text/html";

let $list := //ResourceList[@category="Rock_and_Roll_Hall_of_Fame_inductees"]
return Rock Groups
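A minimal sketch of an index page built from the cached data might look like the following. It assumes each cached ResourceList contains resource elements with id and name attributes, and that group2html.xq is the script for the HTML view; both are hypothetical, since only ResourceList and its category attribute appear in the script above.

```xquery
xquery version "1.0";
declare option exist:serialize "method=xhtml media-type=text/html";

(: Render a cached list as an HTML page.
   resource/@id, resource/@name and group2html.xq are assumptions. :)
declare function local:index-page($list as element(ResourceList)?) as element(html) {
  let $title := replace(string($list/@category), "_", " ")
  return
    <html>
      <head><title>{$title}</title></head>
      <body>
        <h1>{$title}</h1>
        <ul>{
          for $r in $list/resource
          return
            <li><a href="group2html.xq?group={encode-for-uri($r/@id)}">{string($r/@name)}</a></li>
        }</ul>
      </body>
    </html>
};

let $category := request:get-parameter("category", "Rock_and_Roll_Hall_of_Fame_inductees")
(: Look the list up in the database; take the first match if several are cached. :)
return local:index-page((//ResourceList[@category = $category])[1])
```

No request to DBpedia is made here, so the page cost depends only on the local database lookup.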

Execute
Rock and Roll groups