Extraction, analysis and publication of bibliographical references within an institutional repository

DOI: https://doi.org/10.1108/LHT-01-2016-0003
Date: 20 June 2016
Pages: 259-267
Published date: 20 June 2016
Author: Götz Hatop
Subject Matter: Library & information science, Librarianship/library management, Library technology
Götz Hatop
University Library, Philipps University Marburg, Marburg, Germany
Abstract
Purpose: The academic tradition of adding a reference section with references to cited and otherwise
related academic material to an article provides a natural starting point for finding links to other
publications. These links can then be published as linked data. Natural language processing
technologies are available today that can perform the task of bibliographical reference extraction from
text. Publishing references by means of semantic web technologies is a prerequisite for a broader
study and analysis of citations and thus can help to improve academic communication in
general. The paper aims to discuss these issues.
Design/methodology/approach: This paper examines the overall workflow required to extract,
analyze and semantically publish bibliographical references within an Institutional Repository with
the help of open source software components.
Findings: A publication infrastructure in which references are available to software agents would enable
additional benefits such as citation analysis, e.g. the collection of citations of a known paper and the investigation
of citation sentiment. The publication of reference information as demonstrated in this article is possible with
existing semantic web technologies based on established ontologies and open source software components.
Research limitations/implications: Only a limited number of metadata extraction programs have
been considered for performance evaluation, and reference extraction was tested for journal articles
only, whereas Institutional Repositories usually contain a large amount of other material such as
monographs. Also, citation analysis is in an experimental state and citation sentiment is currently not
published at all. For future work, distributing reference information between
repositories is an important problem that needs to be tackled.
Originality/value: Publishing reference information as linked data is new within the academic
publishing domain.
Keywords: Digital libraries, Library services, Linked data, Citation analysis, Reference extraction,
Semantic publishing
Paper type: Research paper
Introduction
The most common purpose of a citation is to show that a referenced work has
influenced the author's work. The reference section of an academic paper is valuable
not only because it gives credit to authors whose work was used, but also because it supports
the assertions and arguments of the author and helps readers to find more information
on the subject and beyond. Besides these reasons, the further development of tools
and techniques for citation analysis and other applications like author-topic evolution
or co-authorship graph analysis depends on published references.
As a consequence, the publication of reference information by means of semantic
web technologies is of paramount importance.
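As an illustration only, a single citation link could be expressed as one RDF statement in N-Triples syntax. The sketch below assumes the CiTO property `cito:cites`, which this text does not name, and uses a placeholder URI for the cited work; the function name `citation_triple` is hypothetical, and the sketch is not the workflow described in this paper.

```python
# Minimal sketch: state "work A cites work B" as a single N-Triples line.
# Assumption: the CiTO ontology is used for the citation property.
CITO_CITES = "http://purl.org/spar/cito/cites"


def citation_triple(citing_uri: str, cited_uri: str) -> str:
    """Return one N-Triples statement linking a citing work to a cited work."""
    return f"<{citing_uri}> <{CITO_CITES}> <{cited_uri}> ."


if __name__ == "__main__":
    citing = "https://doi.org/10.1108/LHT-01-2016-0003"  # DOI of this article
    cited = "https://example.org/cited-work"  # hypothetical placeholder URI
    print(citation_triple(citing, cited))
```

A software agent that collects such statements from several repositories could then gather all citations of a known paper by matching on the object URI.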
Unfortunately, because of the amount of work required to manually create machine
readable reference information, the publication of this information is almost constantly
Library Hi Tech
Vol. 34 No. 2, 2016
pp. 259-267
© Emerald Group Publishing Limited
0737-8831
DOI 10.1108/LHT-01-2016-0003
Received 7 January 2016
Revised 22 February 2016
15 March 2016
Accepted 6 April 2016
The current issue and full text archive of this journal is available on Emerald Insight at:
www.emeraldinsight.com/0737-8831.htm
