Enriching and enhancing moving images with Linked Data. An exploration in the alignment of metadata models

Pages: 354-371
Date: 12 March 2018
Published date: 12 March 2018
DOI: https://doi.org/10.1108/JD-07-2017-0106
Author: Karen F. Gracy
Subject Matter: Library & information science, Records management & preservation, Document management, Classification & cataloguing, Information behaviour & retrieval, Collection building & management, Scholarly communications/publishing, Information management & governance, Information & knowledge management, Information management, Information & communications technology, Internet
Enriching and enhancing moving images with Linked Data
An exploration in the alignment of metadata models
Karen F. Gracy
School of Information, Kent State University, Kent, Ohio, USA
Abstract
Purpose: The purpose of this paper is to examine the current state of Linked Data (LD) in archival moving
image description, and propose ways in which current metadata records can be enriched and enhanced by
interlinking such metadata with relevant information found in other data sets.
Design/methodology/approach: Several possible metadata models for moving image production and
archiving are considered, including models from records management, digital curation, and the recent
BIBFRAME AV Modeling Study. This research also explores how mappings between archival moving
image records and relevant external data sources might be drawn, and what gaps exist between current
vocabularies and what is needed to record and make accessible the full lifecycle of archiving through
production, use, and reuse.
Findings: The author notes several major impediments to implementation of LD for archival moving images.
The various pieces of information about creators, places, and events found in moving image records are not
easily connected to relevant information in other sources because they are often not semantically defined
within the record and can be hidden in unstructured fields. Libraries, archives, and museums must work on
aligning the various vocabularies and schemas of potential value for archival moving image description to
enable interlinking between vocabularies currently in use and those which are used by external data sets.
Alignment of vocabularies is often complicated by mismatches in granularity between vocabularies.
Research limitations/implications: The focus is on how these models inform functional requirements
for access and other archival activities, and how the field might benefit from having a common metadata
model for critical archival descriptive activities.
Practical implications: By having a shared model, archivists may more easily align current vocabularies
and develop new vocabularies and schemas to address the needs of moving image data creators and scholars.
Originality/value: Moving image archives, like other cultural institutions with significant heritage
holdings, can benefit tremendously from investing in the semantic definition of information found in their
information databases. While commercial entities such as search engines and data providers have already
embraced the opportunities that semantic search provides for resource discovery, most non-commercial
entities are just beginning to do so. Thus, this research addresses the benefits and challenges of enriching and
enhancing archival moving image records with semantically defined information via LD.
Keywords: Semantics, Linked Data, Information modelling, Alignment of metadata models,
Moving image archives, Time-based media
Paper type: Research paper
1. Introduction
Time-based media, such as moving images, sound recordings, animation, and contemporary
artworks that incorporate video, film, slide, audio, or computer technologies, offer a
significant descriptive challenge for heritage stewards given the wide variety of genres,
formats, and environments in which they are created, managed, and used. As a malleable
form of communication, time-based media can be information sources, archival records,
historical documents, artistic works, or commercial assets for entertainment companies and
news organizations. Information systems designed for time-based media must not only
provide access to descriptions of these objects to support resource discovery but also
sustain a wide variety of activities and functional requirements. Additionally, these systems
need to interact seamlessly with other systems to draw information from multiple sources
and present it to users in a unified manner.
Journal of Documentation
Vol. 74 No. 2, 2018
pp. 354-371
© Emerald Publishing Limited
0022-0418
DOI 10.1108/JD-07-2017-0106
Received 16 July 2017
Revised 9 October 2017
Accepted 29 October 2017
For decades, indeed centuries, galleries, libraries, archives, and museums (GLAMs)
have created and maintained information systems for describing objects and collections of
various kinds. These systems are known by a variety of names such as catalogs, databases,
finding aids, and indexes. While there has been much success at standardizing how object
and collection records are structured and shared among institutions, this approach to
description has significant limitations. It concentrates a significant amount of information at
the record level, with only a fraction of that information indexed or otherwise semantically
defined. Numerous valuable bits of information about creators, places, events, topics, object
characteristics, and institutional actions remain embedded within each object record.
Often this information is only found through serendipitous discovery or keyword searching.
If these info-bits are not defined and coded for meaning, they are difficult to access and
cannot be easily connected to relevant information about these same entities in other data
sets and information sources. Moving image records, particularly those for archival moving
images, tend to be rich in such hidden information and are thus ripe for study to examine
the possibilities of LD to expand access to information buried in descriptive records. If such
information were semantically defined and made available as open data, these information
systems could be the entry point and gathering place for a universe of knowledge about
moving image production, exhibition, preservation, and use.
The potential for such enhancement and enrichment can already be seen via next
generation search models such as Google's Knowledge Graph, which understands
real-world entities and their relationships to one another: "things, not strings" (Singhal, 2012).
A simple search for a recognizable entity such as a person, organization, place, event,
publication, or work of art invokes the Knowledge Graph intelligent search model. As part of
search results, the engine will return a summary of information related to that entity; for
example, a search for the title The Shining will return information related to the Stanley
Kubrick film, including year of release, director, screenwriter, cast members, a plot synopsis,
images from the film, a link to the trailer found on YouTube, average viewer ratings from
several internet movie sites like Rotten Tomatoes, program listings for when to catch the film
on television or via streaming sites, where to purchase the film online, and related resources,
such as the Stephen King novel upon which the film is based. Although some of this
information comes from Google's knowledge base (much of it gathered via open source data
sources such as DBpedia), other information is drawn from such sources as data sets of
vendors like Amazon and data service providers such as Rovi.
To gather all this information together on one page, each relevant bit of information
must be semantically defined using the Resource Description Framework (RDF) syntax
and connected to the larger universe of data by using a unique identifier for the entity or
concept: an HTTP Uniform Resource Identifier (URI). RDF facilitates the definition of
relationships among related entities, thus connecting users in a meaningful, purposive
way to related information about entities and topics.
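To make the role of RDF statements and HTTP URIs concrete, the following is a minimal sketch, not drawn from the paper, of how a few facts about The Shining might be expressed as subject-predicate-object triples and serialized in N-Triples form. The `http://example.org/archive/` namespace and the `Triple`, `to_ntriples`, and `objects` names are hypothetical, and the choice of schema.org properties and DBpedia resource URIs is an assumption made purely for illustration.

```python
from typing import List, NamedTuple


class Triple(NamedTuple):
    """One RDF-style statement: subject, predicate, object."""
    subject: str
    predicate: str
    obj: str
    obj_is_literal: bool = False  # True for plain strings, False for URIs


# Namespaces: EX is a hypothetical archive namespace; the others are
# assumed here only as illustrative external vocabularies and data sets.
EX = "http://example.org/archive/"
DBR = "http://dbpedia.org/resource/"
SCHEMA = "https://schema.org/"

film = EX + "film/the-shining"  # hypothetical identifier for a local record

triples: List[Triple] = [
    Triple(film, SCHEMA + "name", "The Shining", obj_is_literal=True),
    Triple(film, SCHEMA + "director", DBR + "Stanley_Kubrick"),
    Triple(film, SCHEMA + "isBasedOn", DBR + "The_Shining_(novel)"),
]


def to_ntriples(t: Triple) -> str:
    """Serialize one triple as a single N-Triples line."""
    if t.obj_is_literal:
        escaped = t.obj.replace("\\", "\\\\").replace('"', '\\"')
        obj = f'"{escaped}"'
    else:
        obj = f"<{t.obj}>"
    return f"<{t.subject}> <{t.predicate}> {obj} ."


def objects(data: List[Triple], subject: str, predicate: str) -> List[str]:
    """Return every object linked to `subject` by `predicate`."""
    return [t.obj for t in data if t.subject == subject and t.predicate == predicate]


if __name__ == "__main__":
    for t in triples:
        print(to_ntriples(t))
```

Because the director is identified by a URI rather than a text string, any other data set that uses the same URI can be joined to the local record without string matching; a call such as `objects(triples, film, SCHEMA + "director")` returns that shared identifier.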
Moving image archives, like other cultural institutions with significant heritage holdings,
can benefit tremendously from investing in the semantic definition of information found in
their information databases. While commercial entities such as search engines and data
providers have already embraced the opportunities that semantic search provides for resource
discovery, most non-commercial entities are just beginning to do so. Thus, this paper focuses
on the challenges of enriching and enhancing archival moving image records with
semantically defined information via Linked Data (LD) and Linked Open Data (LOD),
the latter of which refers to LD that is published under an open license and thus freely
available to all without restrictions on reuse (Heath, n.d.).
First, to explore the possibilities of LD for next generation moving image information
systems, several possible metadata models for moving image production and archiving
are considered, including models from records management, digital curation, and the recent