Can I have more of these please? Assisting researchers in finding similar research papers from a seed basket of papers

Published date: 04 June 2018
Pages: 568-587
DOI: https://doi.org/10.1108/EL-04-2017-0077
Authors: Aravind Sesagiri Raamkumar, Schubert Foo, Natalie Pang
Subject matter: Information & knowledge management, Information & communications technology, Internet
Aravind Sesagiri Raamkumar, Schubert Foo and Natalie Pang
Wee Kim Wee School of Communication and Information,
Nanyang Technological University, Singapore
Abstract
Purpose: During the literature review phase, the task of finding similar research papers can be a difficult
proposition for researchers due to the procedural complexity of the task. Current systems and approaches
help in finding similar papers for a given paper, even though researchers tend to additionally search using a
set of papers. This paper aims to focus on conceptualizing and developing recommendation techniques for
key literature review and manuscript preparatory tasks that are interconnected. In this paper, the user
evaluation results of the task where seed basket-based discovery of papers is performed are presented.
Design/methodology/approach: A user evaluation study was conducted on a corpus of papers
extracted from the ACM Digital Library. Participants in the study included 121 researchers who had
experience in authoring research papers. Participants, split into students and staff groups, had to select one of
the provided 43 topics and run the tasks offered by the developed assistive system. A questionnaire was
provided at the end of each task for evaluating the task performance.
Findings: The results show that the student group evaluated the task more favourably than the staff
group, even though the difference was statistically significant for only 5 of the 16 measures. The measures
topical relevance, interdisciplinarity, familiarity and usefulness were found to be significant predictors for
user satisfaction in this task. A majority of the participants, who explicitly stated the need for assistance in
finding similar papers, were satisfied with the recommended papers in the study.
Originality/value: The current research helps in bridging the gap between novices and experts in terms
of literature review skills. The hybrid recommendation technique evaluated in this study highlights the
effectiveness of combining the results of different approaches in finding similar papers.
Keywords: Digital libraries, Information retrieval, Scholarly article recommender systems,
Scholarly articles, Seed baskets
Paper type: Research paper
This research was supported by the National Research Foundation, Prime Minister’s Office,
Singapore under its International Research Centres in Singapore Funding Initiative and administered
by the Interactive Digital Media Programme Office.
Received 8 April 2017
Revised 1 September 2017
Accepted 30 October 2017
The Electronic Library, Vol. 36 No. 3, 2018, pp. 568-587
© Emerald Publishing Limited
0264-0473
The current issue and full text archive of this journal is available on Emerald Insight at:
www.emeraldinsight.com/0264-0473.htm
Introduction
Literature review (LR) is an important phase of a research project, as it has a direct impact on
the subsequent phases. During LR, there are transitions in focus state, activity type and
search style. The user moves through three stages, starting from the pre-focus stage to a problem
formulation stage and then on to the final post-focus stage (Vakkari, 2000). These stages
apply to both general-purpose and scientific/academic information-seeking domains. In a
typical pre-focus stage, researchers use exploratory search tactics to get an initial set of
papers for the given research area. These papers are either manually collated or acquired
from experts (Ellis et al., 1993). The initial set of papers can be referred to as the reading list,
and this list ideally comprises a mix of seminal, recent and literature survey papers covering the
sub-topics of the research area. After obtaining a holistic understanding of the research area,
researchers select a few papers from the list for finding more similar papers. The transition
to directed search happens during this task. A seed set of papers (called seed basket in the
context of the current study) from the reading list is used as the input to this task. During this
stage, the researcher executes multiple activities, such as chaining, metadata hyperlinking
and extended topical searching, to name a few. Current academic search systems, databases
and citation indices are tools used by researchers for performing the aforementioned
activities, even though these systems are mainly designed for ad hoc searching. As this task
involves variegated activities, information sources and relevance criteria, researchers need
both skills and additional time. Moreover, novice researchers need assistance in performing
this type of information-seeking task (Du and Evans, 2011). Two types of interventions
can mitigate this difficulty: process-based and technology-oriented interventions.
In process-based interventions, the role of librarians and experts in
helping other researchers has been underlined (Du and Evans, 2011; Spezi, 2016).
Under technology-oriented interventions, prior studies in the area of scientific paper
information retrieval (IR) and recommender systems (RS) have looked at proposing
techniques for finding similar papers. Most of the approaches have looked at either one or
two of the aforementioned sub-tasks for finding similar papers. In these studies, evaluations
have been conducted in an offline environment using simulations, without involving actual
users. Most importantly, the majority of the studies have proposed techniques for finding similar
papers for a single input paper. In a real-world scenario, researchers also tend to find similar
papers for a set of seed papers (Raamkumar et al., 2016).
To address the abovementioned issues, the authors developed a system
for assisting researchers in three LR and manuscript preparatory tasks. The three tasks are:
(1) building a reading list of research papers;
(2) finding similar papers based on a set of papers; and
(3) shortlisting papers from the final reading list for inclusion in a manuscript based
on article type.
These tasks are interconnected using two paper collection features – seed basket (SB) and
reading list. The recommendation techniques for these tasks have been conceptualized
using a set of pre-computed features (criteria) that capture the important characteristics
of a research paper and its relations with bibliographic references and citations.
Reproducibility in other environments has been taken as the key characteristic while
designing the techniques for tasks. A data set of research papers extracted from the ACM
Digital Library (ACM DL) comprising 103,739 articles is used as the corpus for the
system.
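The idea of using a whole seed basket, rather than a single paper, as the input can be illustrated with a small sketch. The code below is not the recommendation technique evaluated in this study, whose features are not detailed in this section. It is a simple citation-chaining heuristic chosen for illustration: a candidate paper scores one point for every seed paper it cites or is cited by. The function name `recommend_from_seed_basket`, the `references` mapping and the paper identifiers are all hypothetical.

```python
from collections import Counter


def recommend_from_seed_basket(seed_basket, references, top_k=5):
    """Rank papers outside the basket by the number of seed papers they are linked to.

    `references` maps a paper identifier to the list of papers it cites.
    This is an illustrative heuristic, not the method used in the study.
    """
    seeds = set(seed_basket)

    # Invert the reference map: for each paper, collect the papers that cite it.
    cited_by = {}
    for paper, refs in references.items():
        for ref in refs:
            cited_by.setdefault(ref, set()).add(paper)

    scores = Counter()
    for seed in seeds:
        # Backward chaining (papers the seed cites) and
        # forward chaining (papers that cite the seed).
        neighbours = set(references.get(seed, ())) | cited_by.get(seed, set())
        for candidate in neighbours - seeds:
            scores[candidate] += 1

    # Highest score first; ties are broken by identifier for a stable order.
    ranked = sorted(scores.items(), key=lambda item: (-item[1], item[0]))
    return ranked[:top_k]


# Toy citation data (hypothetical identifiers).
references = {
    "p1": ["p3", "p4"],
    "p2": ["p3", "p5"],
    "p6": ["p1", "p2"],
    "p7": ["p2"],
}

print(recommend_from_seed_basket(["p1", "p2"], references))
```

In this toy data, papers linked to both seed papers (`p3` and `p6`) rank above papers linked to only one of them, which is the kind of signal a single-paper input cannot provide.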
In this paper, the focus is on the second task addressed in the assistive system: that is, the
task of finding similar papers based on a seed set (seed basket) of papers. The conceptual
design of the task is first described, followed by the findings of a user evaluation study
conducted with 121 researchers. This research contributes to the existing literature in
several ways:
(1) the importance of considering multiple seed papers while designing
recommendation tasks for finding similar papers is highlighted;
(2) a technique for finding similar papers using an SB of papers is proposed; and
(3) measures that have predictive ability over user satisfaction in this task are identified.