Managing and mining historical research data

Pages: 172-179
Date: 21 March 2016
Published date: 21 March 2016
DOI: https://doi.org/10.1108/LHT-09-2015-0086
Author: Michael S. Seadle
Subject Matter: Library & information science, Librarianship/library management, Library technology
Michael S. Seadle
Humboldt Universität zu Berlin, Berlin, Germany
Abstract
Purpose: The purpose of this paper is to review how historical research data are managed and
mined today.
Design/methodology/approach: The methodology builds on observations over the last decade.
Findings: Reading speed is a factor in managing the quantity of text in historical research. Twenty
years ago historical research involved visits to physical libraries and archives, but today much of the
information is online. The granularity of reading has changed over recent decades, and recognizing this
change is an important factor in improving access.
Practical implications: Computer-based humanities text mining could be simpler if publishers and
libraries managed the data in ways that facilitate the process. Some aspects still need
development, including better context awareness, either by writing context awareness into programs
or by encoding it in the text.
Social implications: Future researchers who want to make use of text mining and distant reading
techniques will need more thorough technical training than they get today.
Originality/value: There is relatively little discussion of text mining and distant reading in the
LIS literature.
Keywords: Case studies, Reading, Data collection, Techniques, Data mining, Knowledge mining
Paper type: Viewpoint
1. Introduction
Twenty years ago managing and mining historical research data almost exclusively
involved visits to physical libraries and archives, which often meant travel to remote
locations to research primary sources and rare works. The digitization of books, journals,
manuscripts, and other archival materials has changed the options for historical research,
and libraries and archives are still learning how to adjust. The chief issue today is not
whether to make digital content available, but how to make it available in ways that
facilitate research. The old paper-based paradigm remains in the consciousness of
librarians, archivists, and historical researchers, but digital content lends itself to greater
malleability. This paper discusses the kinds of changes that would facilitate the discovery
of historical research data.
2. Reading as text mining
Reading is the traditional basis of most historical research. Libraries and archives
make texts available and scholars work through them. This sounds easy because
people are accustomed to the process, but the traditional approach to reading has
significant problems that scholars tend to forget.
The sheer quantity of text to read has always been a problem in historical research
and the problem has grown in a world in which the number of published books and
journals has proliferated, and access to them has improved via better interlibrary loan
and digital copies. Commentary on published works has expanded into the online world
in the form of web pages, blogs, tweets, e-mails, and a host of social media opportunities
for communication. The old excuse that certain material is unavailable applies less and
Library Hi Tech
Vol. 34 No. 1, 2016
pp. 172-179
© Emerald Group Publishing Limited
0737-8831
DOI 10.1108/LHT-09-2015-0086
Received 3 September 2015
Revised 3 September 2015
Accepted 7 September 2015
The current issue and full text archive of this journal is available on Emerald Insight at:
www.emeraldinsight.com/0737-8831.htm