Designing web-scale discovery systems using the VuFind open source software

Published date: 08 May 2018
Pages: 16-22
DOI: https://doi.org/10.1108/LHTN-12-2017-0088
Authors: Bijan Kumar Roy, Subal Chandra Biswas, Parthasarathi Mukhopadhyay
Subject matter: Library & information science, Librarianship/library management, Library technology, Library & information services
Introduction
In recent years, open source LIS software has helped create demand for web-scale discovery systems (WSDSs). We have witnessed a paradigm shift from library automation to resource discovery, as seen in the applications of several commercial resource discovery tools (e.g. Primo Central, EBSCO Discovery Service, Summon) and open source resource discovery tools (e.g. Blacklight, VuFind, LibraryFind). These discovery systems can harvest content from different locations (not limited to libraries) and provide access to that content in an open manner. They have become a de facto choice for librarians who wish to offer a unified search box that is as simple as Google's and as elegant as an OPAC's. The purpose of this paper is to help academic libraries make better use of library resources and integrate heterogeneous bibliographic data sources in VuFind.
At present, library environments work through two parallel systems: a library automation system and a digital repository system. These systems use different standards, follow different software architectures and provide different retrieval techniques. Each has its own controlled vocabulary, databases and interfaces, which creates a serious retrieval problem for end users. It has become difficult for users to find out what a system holds and how to access it. Existing systems can search their own collections separately but cannot search different bibliographic sources or databases at once through a single access point. Put simply, libraries are providing retrieval silos with no interconnection amongst retrieval interfaces. A user therefore has to visit each system, interface or database separately, and faces a wide range of platforms as well as many search entry points (Roy et al., 2016a, 2016b).
A prototype web-scale discovery system
This WSDS has been developed using different open source software (OSS) and open standards. The case study is not based on real-life examples or demonstrations of discovery tools, but the tool has been tested on several different configurations. It is essentially a prototype resource discovery framework that may be integrated with any web-enabled online information retrieval system. This paper shares test bed experiences of integrating Koha (an open source integrated library system, ILS) and DSpace (an open source repository system) with VuFind (an open source discovery tool), along with other external, commercial, licensed, license-free and open access databases. For the sample test, data in MARC format were exported from bibliographic databases such as IndCat (http://indcat.inflibnet.ac.in/) and the Library of Congress (LOC) (www.loc.gov/) into Koha (version 17.11) (http://koha-community.org) and then imported into VuFind (version 4.0). In the same fashion, resources were harvested from selected institutional repository systems built with DSpace software (version 5.5) (www.dspace.org/) and then imported into VuFind. Only three open source discovery tools have been compared against predefined parameters (Table I). Not all of the features in Table I are clearly stated in the documentation or supported in the present versions of the selected discovery tools. Some are still under development and may be available in the latest or next version. In that case, the term “in the pipeline” is used and represented by an asterisk “*”.
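
The harvesting step described above is not spelled out in protocol terms. As a minimal sketch, assuming the DSpace repositories expose an OAI-PMH endpoint (which DSpace supports) and that records are requested as Dublin Core (`oai_dc`), the request and the extraction of titles might look as follows. The base URL, the helper names and the sample response are hypothetical, and this is not VuFind's own harvester.

```python
"""Sketch: build an OAI-PMH ListRecords request and read Dublin Core titles.

Assumptions (not from the paper): the endpoint URL, the helper names and
the sample XML below are illustrative only.
"""
from urllib.parse import urlencode
import xml.etree.ElementTree as ET

# XML namespaces defined by the OAI-PMH 2.0 and Dublin Core specifications.
NS = {
    "oai": "http://www.openarchives.org/OAI/2.0/",
    "dc": "http://purl.org/dc/elements/1.1/",
}


def list_records_url(base_url, metadata_prefix="oai_dc", set_spec=None):
    """Return the URL of a ListRecords request against an OAI-PMH endpoint."""
    params = {"verb": "ListRecords", "metadataPrefix": metadata_prefix}
    if set_spec:
        params["set"] = set_spec  # optionally restrict to one collection
    return base_url + "?" + urlencode(params)


def extract_titles(xml_text):
    """Return (identifier, title) pairs from a ListRecords response."""
    root = ET.fromstring(xml_text)
    pairs = []
    for record in root.iterfind(".//oai:record", NS):
        identifier = record.findtext(
            "oai:header/oai:identifier", default="", namespaces=NS
        )
        title = record.findtext(".//dc:title", default="", namespaces=NS)
        pairs.append((identifier, title))
    return pairs


# Hypothetical response, trimmed to the elements the sketch reads.
SAMPLE_RESPONSE = """
<OAI-PMH xmlns="http://www.openarchives.org/OAI/2.0/">
  <ListRecords>
    <record>
      <header><identifier>oai:repository.example.org:123/1</identifier></header>
      <metadata>
        <oai_dc:dc xmlns:oai_dc="http://www.openarchives.org/OAI/2.0/oai_dc/"
                   xmlns:dc="http://purl.org/dc/elements/1.1/">
          <dc:title>Sample thesis</dc:title>
        </oai_dc:dc>
      </metadata>
    </record>
  </ListRecords>
</OAI-PMH>
""".strip()

if __name__ == "__main__":
    print(list_records_url("http://repository.example.org/oai/request"))
    print(extract_titles(SAMPLE_RESPONSE))
```

In a real harvest the response would be fetched over HTTP and paged with the protocol's resumption tokens; both are left out here to keep the sketch self-contained.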
This study presents a thematic overview of the methodology followed for the development of the prototype. The model uses a number of OSS, protocols and open standards-based technologies in different layers and levels of its implementation. There are three layers. Layer I (core layer) is based on the LAMP (Linux-Apache-MySQL-Perl/PHP) architecture and includes the installation of VuFind. It uses Linux (Ubuntu 16.04 LTS) as the operating system, MySQL (version 5.7.13) as the relational DBMS, Apache (version 2.4.18) as the web server and PHP (version 7.1.0) as the programming/scripting language. Layer II (full-text layer) uses Apache Tika (version 1.15) as a full-text extractor and Apache Solr (version 5.5.4) for indexing metadata harvested from different subscribed or external sources. Layer III (front layer) is the user interface, that is, the default interface of VuFind (version 4.0), which has been customized to provide value-added services to users. All software used in the different stages of the model's implementation was integrated and deployed in VuFind. The design of the proposed model can be grouped under three broad headings: development of the LAMP architecture; selection and installation of VuFind; and installation and configuration of Apache Tika with VuFind as an indexing enhancement facility. The process of extracting bibliographic data from different external sources is illustrated in Figure 1.
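
Layer II can be pictured as two small steps: extract text from a document with Tika, then query the Solr index that holds the harvested records. The following is a minimal sketch, not the authors' configuration. It assumes the Tika application jar is invoked from the command line with its `--text` option and that Solr is queried through its standard `/select` handler. The jar path, the Solr base URL, the core name `biblio` and both function names are assumptions chosen for illustration.

```python
"""Sketch of the full-text layer: a Tika extraction command and a Solr query URL.

Assumptions (not from the paper): jar location, Solr base URL, core name
and function names are illustrative only.
"""
from urllib.parse import urlencode


def tika_text_command(document_path, tika_jar="tika-app-1.15.jar"):
    """Return the argument list that asks the Tika app to print plain text."""
    return ["java", "-jar", tika_jar, "--text", document_path]


def solr_select_url(query, solr_base="http://localhost:8080/solr",
                    core="biblio", rows=10):
    """Return a Solr /select URL that asks for JSON results."""
    params = {"q": query, "rows": rows, "wt": "json"}
    return f"{solr_base}/{core}/select?{urlencode(params)}"


if __name__ == "__main__":
    # Neither call below contacts Tika or Solr; they only show the shapes.
    print(tika_text_command("thesis.pdf"))
    print(solr_select_url("title:discovery"))
```

Where Java and the jar are installed, the command list could be passed to `subprocess.run(..., capture_output=True, text=True)` to obtain the extracted text, and the URL could be fetched with any HTTP client.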
Before selecting VuFind, the other
two open source discovery tools,
namely, Blacklight (developed by the
University of Virginia) and eXtensible
Catalog (developed by the River
LIBRARY HI TECH NEWS Number 3 2018, pp. 16-22, © Emerald Publishing Limited, 0741-9058, DOI 10.1108/LHTN-12-2017-0088
