A case study in metadata harvesting: the NSDL

Pages: 228-237
Published date: 01 June 2003
DOI: https://doi.org/10.1108/07378830310479866
Authors: William Y. Arms, Naomi Dushay, Dave Fulker, Carl Lagoze
Subject matter: Information & knowledge management, Library & information science
Harvesting as a core NSDL strategy
A point on a spectrum of interoperability
The National Science Digital Library (NSDL)
is a wide-ranging program of the National
Science Foundation (NSF) to build library
collections and services for all aspects of science
education (Zia, 2001). The Core Integration
team, of which we are members, has the specific
task of integrating the individual projects,
together with all other relevant collections, into
a large-scale digital library. Eventually, the goal
is to integrate tens of thousands of collections,
ranging from simple Web sites to large and
sophisticated digital libraries, into a coherent
whole that is structured to support education
and facilitate incorporation of innovative,
value-adding services.
As described in an earlier paper (Arms et al.,
2002), the architecture is based on the
recognition that, with a library of this
complexity, it is impossible to impose detailed
requirements for standards that every collection
must follow. Although the Core Integration
team and the NSDL community can coax and
cajole collections towards preferred standards,
the architecture needs to accommodate a
spectrum of interoperability, which makes use
of widely varying protocols, formats and
metadata standards. A second paper (Lagoze
et al., 2002) describes the architecture that has
been developed.
The Metadata Repository
The Metadata Repository is a key component of
the NSDL architecture. Its function is to
support providers of services, such as the
NSDL Search Service. It holds collection-level
metadata about every collection known to the
NSDL and an item-level metadata record for
each known individual item.
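
As an illustration only, the two levels of metadata described above might be modeled as in the following sketch. It is not the NSDL implementation: the class names, the fields, and the rule that an item must belong to a known collection are all assumptions made for the example.

```python
from dataclasses import dataclass, field


@dataclass
class CollectionRecord:
    """Collection-level metadata for one known collection (hypothetical fields)."""
    collection_id: str
    name: str


@dataclass
class ItemRecord:
    """Item-level metadata for one known item (hypothetical fields)."""
    item_id: str
    collection_id: str
    title: str


@dataclass
class MetadataRepository:
    """Holds one record per known collection and one record per known item."""
    collections: dict[str, CollectionRecord] = field(default_factory=dict)
    items: dict[str, ItemRecord] = field(default_factory=dict)

    def add_collection(self, record: CollectionRecord) -> None:
        self.collections[record.collection_id] = record

    def add_item(self, record: ItemRecord) -> None:
        # Assumption for this sketch: items are only accepted for known collections.
        if record.collection_id not in self.collections:
            raise KeyError(f"unknown collection: {record.collection_id}")
        self.items[record.item_id] = record

    def items_in(self, collection_id: str) -> list[ItemRecord]:
        """The kind of lookup a service provider, such as a search service, might make."""
        return [r for r in self.items.values() if r.collection_id == collection_id]


repo = MetadataRepository()
repo.add_collection(CollectionRecord("c1", "Example collection"))
repo.add_item(ItemRecord("i1", "c1", "Example item"))
```

Keeping collection-level and item-level records in separate maps mirrors the distinction drawn in the text; a service provider would read from the repository rather than from the collections themselves.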
The authors
William Y. Arms is Professor of Computer Science and
Director of the Information Science program at Cornell
University, Ithaca, New York, USA.
Naomi Dushay is a programmer/analyst for the National
Science Digital Library project at Cornell University, Ithaca,
New York, USA.
Dave Fulker is Executive Director of the NSDL, University
Corporation for Atmospheric Research, Boulder, Colorado,
USA.
Carl Lagoze is a Senior Research Associate in Computing
and Information Science at Cornell University, Ithaca, New
York, USA.
Keywords
Data collection, Archiving, Digital libraries
Abstract
This paper describes the use of the Open Archives Initiative
Protocol for Metadata Harvesting in the NSF's National
Science Digital Library (NSDL). The protocol is used both as a
method to ingest metadata into a central Metadata
Repository and also as the means by which the repository
exports metadata to service providers. The NSDL Search
Service is used to illustrate this architecture. An early version
of the Metadata Repository was an alpha test site for
version 1 of the protocol and the production repository was
a beta test site for version 2. This paper describes the
implementation experience and early practical tests. Despite
some teething troubles and the long-term difficulties of
semantic compatibility, the overall conclusion is optimism
that the Open Archives Initiative will be a successful part of
the NSDL.
Electronic access
The Emerald Research Register for this journal is available at
http://www.emeraldinsight.com/researchregister
The current issue and full text archive of this journal is
available at
http://www.emeraldinsight.com/0737-8831.htm
This work was partially funded by the National
Science Foundation under grants 0127298 and
0127308. The ideas in this paper are those of the
authors and not of the National Science Foundation.
The development team that is implementing the
NSDL Metadata Repository and the OAI-PMH has
included Tim Cornwell, Naomi Dushay, Stoney
Gan, Chris Ingram, Rich Marisa, and Jon Phipps
Library Hi Tech
Volume 21 · Number 2 · 2003 · pp. 228-237
© MCB UP Limited · ISSN 0737-8831
DOI 10.1108/07378830310479866
