Data mining topics in the discipline of library and information science: analysis of influential terms and Dirichlet multinomial regression topic model

Published date: 19 December 2022
Pages: 65-85
DOI: https://doi.org/10.1108/AJIM-05-2022-0260
Subject matter: Library & information science, Information behaviour & retrieval, Information & knowledge management, Information management & governance, Information management
Authors: Sukjin You, Soohyung Joo, Marie Katsurai
Sukjin You
University of Wisconsin-Milwaukee, Milwaukee, Wisconsin, USA
Soohyung Joo
University of Kentucky, Lexington, Kentucky, USA, and
Marie Katsurai
Department of Intelligent Information Engineering and Sciences,
Doshisha University, Kyotanabe, Japan
Abstract
Purpose: This study explores the extent to which data mining research is associated with the
library and information science (LIS) discipline. It aims to identify data mining related subject
terms and topics in representative LIS scholarly publications.
Design/methodology/approach: A large set of more than 38,000 bibliographic records was collected
from a scholarly database, representing the fields of LIS and data mining, respectively. Multiple
text mining techniques, such as influential term analysis and Dirichlet multinomial regression
topic modeling, were applied to investigate prevailing subject terms and research topics.
Findings: The findings of this study revealed the relationship between the LIS and data mining
research domains. Various data mining method terms were observed in recent LIS publications, such
as machine learning, artificial intelligence and neural networks. The topic modeling result
identified prevailing data mining related research topics in LIS, such as machine learning, deep
learning and big data, among others. In addition, this study investigated the trends of popular
topics in LIS over the recent decade.
Originality/value: This investigation is one of a few studies that have empirically investigated
the relationships between the LIS and data mining research domains. Multiple text mining techniques
were employed to delineate the extent to which the two research domains are associated with each
other, based on both term-level and topic-level analysis. Methodologically, the study identified
influential terms in each domain using multiple feature selection indices. In addition, Dirichlet
multinomial regression was applied to explore LIS topics in relation to data mining.
Keywords: Data mining, Research topics, Library and information science, Trend analysis, Textual
analysis, Bibliographic records
Paper type: Research paper
Aslib Journal of Information Management, Vol. 76 No. 1, 2024, pp. 65-85
© Emerald Publishing Limited, 2050-3806, DOI 10.1108/AJIM-05-2022-0260
Received 17 May 2022; revised 13 August 2022 and 30 October 2022; accepted 13 November 2022
The current issue and full text archive of this journal is available on Emerald Insight at:
https://www.emerald.com/insight/2050-3806.htm
Introduction
The discipline of library and information science (LIS) is interdisciplinary in nature
(Chang, 2018), and it has shifted toward more technology- and data-focused research subjects
(Timakum et al., 2020). In LIS, data mining methods have been introduced and used for different
research problems in recent decades. With the popularity of computational research methods, LIS
researchers have become well aware of the importance of integrating data science methods into LIS
research. The LIS discipline has expanded its research scope to a wide variety of areas as many
traditional LIS schools transitioned to iSchools (Wang, 2018). Such transitions have emphasized
the adoption of data science methods, and accordingly, LIS researchers began to utilize more
computational methods in their research. In particular, data mining techniques have been applied
in LIS to analyze large data sets collected from various sources. Data mining is one of the key
areas in data science (Kelleher and Tierney, 2018; Provost and Fawcett, 2013). Data mining
techniques, such as classification, clustering, data visualization and deep learning, have been
beneficial for extracting meaningful patterns or implications from large numerical data in LIS
research (Sahoo and Mishra, 2015; Urs and Minhaj, 2022).
Given the increased use of data science methods in LIS, researchers have begun to explore the
relationships between the LIS and the data science disciplines (Virkus and Garoufallou, 2019,
2020). Recently, there have been attempts to investigate the adoption of data mining methods in
LIS research (e.g. Ma and Lund, 2020, 2021a) as data mining became an important part of research
in LIS. To better understand the potential benefits of data mining methods in LIS, it is
imperative to recognize the nature of the interdisciplinary relationships between the LIS and
data mining research domains. The primary purpose of this study is to explore which data mining
related topics have been studied in the LIS field in recent years, based on a text mining
approach. The study intends to identify influential data mining subject terms observed in
representative LIS publications. In addition, it explores data mining related topics and/or
concepts in recent LIS research using text mining and topic modeling.
This research is one of a few studies that have attempted to investigate LIS research trends
focusing on data mining by comparing datasets collected for LIS and data mining, respectively.
This study delves into the intersection between LIS and data mining to better understand the
recent trends of adopting data mining methods in LIS research. Methodologically, the unique
contribution of this study is that it employed multiple text mining techniques to delineate the
extent to which the two research domains are related to each other. Both conventional feature
selection methods (e.g. mutual information, F statistic and chi-square test) and Dirichlet
multinomial regression (DMR) were employed to analyze large textual data at both the term and
topic levels. Moreover, the findings of this study inform the integration of data mining related
topics in LIS education.
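To make the term-level analysis more concrete, the following is a minimal sketch of how two of the feature selection indices named above (the chi-square statistic and mutual information) can score a term by how strongly its presence separates two document collections. It is not the authors' implementation: the toy documents, the variable names `LIS_DOCS` and `DM_DOCS`, and the helper functions are all assumptions made for illustration, and the F statistic and DMR are not covered.

```python
import math

# Toy corpora (assumed for illustration only): each document is a set of subject terms.
LIS_DOCS = [
    {"library", "catalog", "user"},
    {"library", "metadata"},
    {"user", "survey", "library"},
    {"classification", "metadata"},
]
DM_DOCS = [
    {"classification", "clustering"},
    {"neural", "network", "classification"},
    {"clustering", "algorithm"},
    {"neural", "algorithm"},
]


def contingency(term, docs_a, docs_b):
    """Return the 2x2 term/collection counts (a, b, c, d).

    a: docs in A containing the term, b: docs in B containing the term,
    c: docs in A without the term,    d: docs in B without the term.
    """
    a = sum(1 for doc in docs_a if term in doc)
    b = sum(1 for doc in docs_b if term in doc)
    return a, b, len(docs_a) - a, len(docs_b) - b


def chi_square(term, docs_a, docs_b):
    """Pearson chi-square statistic of the 2x2 table (no continuity correction)."""
    a, b, c, d = contingency(term, docs_a, docs_b)
    n = a + b + c + d
    denom = (a + b) * (c + d) * (a + c) * (b + d)
    if denom == 0:  # term occurs in every document or in none
        return 0.0
    return n * (a * d - b * c) ** 2 / denom


def mutual_information(term, docs_a, docs_b):
    """Mutual information (in bits) between term presence and collection label."""
    a, b, c, d = contingency(term, docs_a, docs_b)
    n = a + b + c + d
    # Each cell: (joint count, row marginal, column marginal).
    cells = [
        (a, a + b, a + c),
        (b, a + b, b + d),
        (c, c + d, a + c),
        (d, c + d, b + d),
    ]
    mi = 0.0
    for count, row, col in cells:
        if count > 0:  # empty cells contribute nothing
            mi += (count / n) * math.log2(n * count / (row * col))
    return mi


def rank_terms(docs_a, docs_b, score):
    """Sort the joint vocabulary by descending score, breaking ties alphabetically."""
    vocab = set().union(*docs_a, *docs_b)
    return sorted(vocab, key=lambda t: (-score(t, docs_a, docs_b), t))


if __name__ == "__main__":
    for term in rank_terms(LIS_DOCS, DM_DOCS, chi_square)[:5]:
        print(term, round(chi_square(term, LIS_DOCS, DM_DOCS), 3),
              round(mutual_information(term, LIS_DOCS, DM_DOCS), 3))
```

In this sketch a term that appears almost exclusively in one collection receives a high score under both indices, while a term shared by both collections scores low. Both indices are symmetric, so a high score alone does not say which collection the term characterizes; the sign of `a * d - b * c` would indicate the direction.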
Literature review
LIS research has most widely used traditional social science methods, such as surveys,
interviews, experiments and content analysis (Chu, 2015; Chu and Ke, 2017; Togia and Malliari,
2017; Ullah and Ameen, 2018). Despite the popularity of these methods in LIS, data science
methods have been increasingly adopted by LIS researchers in recent years (Katsurai and Joo,
2021; Timakum et al., 2020; Ma and Lund, 2021b). Marchionini (2016) addressed the
interdisciplinary nature of LIS research and discussed the roles of information science, which
subsumes LIS, from the perspective of the emerging data science field. Computer science
researchers have made significant contributions to the information science field, and therefore,
information science schools have recently hired technical faculty with a computer science
background. Virkus and Garoufallou (2019) investigated data science related publications in LIS
based on an analysis of bibliographic data from the Web of Science. Their study affirmed the
interdisciplinary nature of data science and showed that LIS takes part in that
interdisciplinarity. Their findings revealed that the largest contributions to data science
methods were made by the computer science community, while only a small portion was made by LIS.
Virkus and Garoufallou (2020) further explored the emerging field of data science from the LIS
perspective on the basis of a content analysis of eighty relevant publications. They identified
that data science studies in LIS were categorized into data science education,