ETD Guide/Technical Issues/Seamless access: Open Archives Initiative, federated search

Access to ETDs that are produced by students around the globe requires some mechanism for connecting with the many computers that house those ETDs. There are two basic approaches.

Federated Search

In federated search, a user’s information need, expressed as a query, is sent by the federated search system to every site that supports searching over its local ETD collection. Once the sites have run the search and returned results, either the user views each site that may hold relevant content (see Powell & Fox, 1998), or the results are fused into a single merged list (as with Dienst; see Lagoze & Davis, 1995). Federated search yields up-to-the-moment results, but such currency is rarely a high priority in the ETD world, where daily updates should suffice. It may also require complex handling of timeouts and backup sites when some remote sites are down or slow to respond. Even at best, federated search is often slow because of network delays, and it must cope with widely varying representations of data at remote sites, which in some cases leads to low data quality. Nevertheless, a federated search service for ETDs is available at www.theses.org.
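
As a rough illustration of the fan-out, timeout handling, and result fusion described above, the following Python sketch sends one query to several sites in parallel and merges whatever comes back in time. Everything named here is hypothetical: the site functions (site_a, site_b, slow_site, broken_site), the (score, title) result shape, and the timeout value are assumptions for illustration and are not taken from Dienst or from any of the cited systems.

```python
import time
from concurrent.futures import ThreadPoolExecutor, wait


def federated_search(query, sites, timeout=2.0):
    """Send `query` to every site and merge the results that arrive in time.

    `sites` maps a site name to a callable taking the query and returning
    a list of (score, title) pairs. This interface is an assumption.
    Returns (merged_hits, failed_site_names).
    """
    pool = ThreadPoolExecutor(max_workers=max(1, len(sites)))
    futures = {pool.submit(search, query): name for name, search in sites.items()}

    # Wait at most `timeout` seconds; slow sites are simply left out.
    done, _ = wait(futures, timeout=timeout)

    merged, failed = [], []
    for future, name in futures.items():
        if future in done and future.exception() is None:
            merged.extend((score, title, name) for score, title in future.result())
        else:
            failed.append(name)  # timed out or raised an error

    pool.shutdown(wait=False)  # do not block on sites that are still running

    # Naive fusion: assumes scores are comparable across sites.
    merged.sort(key=lambda hit: hit[0], reverse=True)
    return merged, failed


# Hypothetical stand-ins for remote ETD sites.
def site_a(query):
    return [(0.9, "Thesis A1"), (0.4, "Thesis A2")]


def site_b(query):
    return [(0.7, "Thesis B1")]


def slow_site(query):
    time.sleep(0.5)  # simulates a site that responds too late
    return [(1.0, "Late thesis")]


def broken_site(query):
    raise RuntimeError("site is down")


if __name__ == "__main__":
    sites = {"a": site_a, "b": site_b, "slow": slow_site, "broken": broken_site}
    hits, failed = federated_search("digital libraries", sites, timeout=0.1)
    for score, title, site in hits:
        print(f"{score:.1f}  {title}  ({site})")
    print("no answer from:", sorted(failed))
```

Sorting by raw score is the simplest possible fusion and works only if all sites score results on the same scale. With heterogeneous sites, the scores would first need to be normalized, which is one source of the data quality problems noted above.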

Harvesting

Harvesting is the second basic approach. Instead of querying remote sites at search time, a service gathers metadata from them ahead of time and searches its own local copy. As explained in the section "Harvest usage in Germany, France", the Harvest system was the first to clearly demonstrate this solution, and it is still in use in Germany and other locations. However, Harvest is being superseded by the Open Archives Initiative (www.openarchives.org; see Lagoze & Van de Sompel, 2001). NDLTD has developed a harvesting-based OAI access scheme for handling the global collection of ETDs; see Suleman et al., 2001 (parts 1 and 2). The basic outline of the approach is as follows:
 * Each ETD is described (with metadata) using MARC21 or ETD-MS (http://www.ndltd.org/standards/metadata/ETD-ms-v1.00.html).
 * Each ETD site runs an open archive, which responds to OAI requests for metadata by providing Dublin Core records, as well as MARC21 records, ETD-MS records, or both. For example, the software for ETD management developed by Virginia Tech has such a capability (http://www.dlib.vt.edu/projects/OAI/software/ndltd/ndltd.html).
 * State, provincial, national, regional or other organizations may harvest from these sites, and run their own open archives and related services.
 * Virginia Tech harvests from all sites (or from group sites that have themselves harvested from university sites) to develop a union collection (http://oai.dlib.vt.edu/~etdunion).
 * VTLS Inc., as a service to NDLTD, provides search access to the union collection (http://www.vtls.com/ndltd).
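
The request and response side of this outline can be sketched in Python as follows. This is a minimal illustration under several assumptions, not the NDLTD or Virginia Tech implementation: it uses the argument names and XML namespaces of version 2.0 of the OAI protocol (the source cites the 2001 version, which differs in detail), the base URL and record identifier are made up, and the sample response is hand-written so that the parsing can be shown without a network call.

```python
import xml.etree.ElementTree as ET
from urllib.parse import urlencode

# Namespaces as defined for OAI-PMH 2.0 and unqualified Dublin Core.
NS = {
    "oai": "http://www.openarchives.org/OAI/2.0/",
    "dc": "http://purl.org/dc/elements/1.1/",
}


def list_records_url(base_url, metadata_prefix="oai_dc", resumption_token=None):
    """Build a ListRecords request URL for an open archive."""
    if resumption_token:
        # A follow-up request carries only the verb and the token.
        params = {"verb": "ListRecords", "resumptionToken": resumption_token}
    else:
        params = {"verb": "ListRecords", "metadataPrefix": metadata_prefix}
    return base_url + "?" + urlencode(params)


def parse_records(xml_text):
    """Return (records, resumption_token) from a ListRecords response."""
    root = ET.fromstring(xml_text)
    records = []
    for record in root.iterfind(".//oai:record", NS):
        identifier = record.findtext("oai:header/oai:identifier", namespaces=NS)
        titles = [t.text for t in record.iterfind(".//dc:title", NS)]
        records.append({"identifier": identifier, "titles": titles})
    token = root.findtext(".//oai:resumptionToken", namespaces=NS)
    return records, (token or None)  # an empty token means the list is complete


# Hand-written sample response; identifier and title are invented.
SAMPLE_RESPONSE = """
<OAI-PMH xmlns="http://www.openarchives.org/OAI/2.0/">
  <ListRecords>
    <record>
      <header><identifier>oai:example.edu:etd-1</identifier></header>
      <metadata>
        <oai_dc:dc xmlns:oai_dc="http://www.openarchives.org/OAI/2.0/oai_dc/"
                   xmlns:dc="http://purl.org/dc/elements/1.1/">
          <dc:title>A Sample Thesis</dc:title>
        </oai_dc:dc>
      </metadata>
    </record>
    <resumptionToken>page-2</resumptionToken>
  </ListRecords>
</OAI-PMH>
""".strip()

if __name__ == "__main__":
    print(list_records_url("http://example.edu/oai"))
    records, token = parse_records(SAMPLE_RESPONSE)
    print(records, token)
```

A harvester would repeat the request with each returned resumption token until none is returned, then store the collected records locally. A union collection built this way can then be searched without contacting the remote sites at query time.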

References


 * Lagoze, C. and J. R. Davis. 1995. "Dienst - An Architecture for Distributed Document Libraries". Communications of the ACM, Vol. 38, No. 4, p. 47.
 * Lagoze, Carl and Herbert Van de Sompel. 2001. "The Open Archives Initiative Protocol for Metadata Harvesting". Open Archives Initiative, January 2001. Available at http://www.openarchives.org/OAI/openarchivesprotocol.htm
 * Powell, J. and E. Fox. 1998. "Multilingual Federated Searching Across Heterogeneous Collections". D-Lib Magazine, Sep. 1998. Available at http://www.dlib.org/dlib/september98/powell/09powell.html
 * Suleman, Hussein, Anthony Atkins, Marcos A. Gonçalves, Robert K. France, and Edward A. Fox (Virginia Tech); Vinod Chachra and Murray Crowder (VTLS, Inc.); and Jeff Young (OCLC). 2001. "Networked Digital Library of Theses and Dissertations: Bridging the Gaps for Global Access - Part 1: Mission and Progress". D-Lib Magazine, 7(9), Sept. 2001. Available at http://www.dlib.org/dlib/september01/suleman/09sulemanpt1.html
 * Suleman, Hussein, Anthony Atkins, Marcos A. Gonçalves, Robert K. France, and Edward A. Fox (Virginia Tech); Vinod Chachra and Murray Crowder (VTLS, Inc.); and Jeff Young (OCLC). 2001. "Networked Digital Library of Theses and Dissertations: Bridging the Gaps for Global Access - Part 2: Services and Research". D-Lib Magazine, 7(9), Sept. 2001. Available at http://www.dlib.org/dlib/september01/suleman/09suleman-pt2.html
