Extracting entity relations for “problem-solving” knowledge graph of scientific domains using word analogy
| Date | 08 June 2022 |
| Pages | 481-499 |
| DOI | https://doi.org/10.1108/AJIM-03-2022-0129 |
| Published date | 08 June 2022 |
| Subject Matter | Library & information science, Information behaviour & retrieval, Information & knowledge management, Information management & governance, Information management |
| Author | Guo Chen, Jiabin Peng, Tianxiang Xu, Lu Xiao |
Guo Chen, Jiabin Peng and Tianxiang Xu
Nanjing University of Science and Technology, Nanjing, China, and
Lu Xiao
School of Journalism, Nanjing University of Finance and Economics, Nanjing, China
Abstract
Purpose – “Problem-solving” is the core insight of scientific research. This study focuses on
constructing the “problem-solving” knowledge graph of scientific domains by extracting four entity relation
types: problem-solving, problem hierarchy, solution hierarchy and association.
Design/methodology/approach – This paper presents a low-cost method for identifying these relationships
in scientific papers based on word analogy. The problem-solving and hierarchical relations are represented as
offset vectors of the head and tail entities and then classified by referencing a small set of predefined entity
relations.
Findings – In an experiment with artificial intelligence papers from the Web of Science, the method
achieved good performance: the F1 scores for the problem hierarchy, problem-solving and
solution hierarchy relation types were 0.823, 0.815 and 0.748, respectively. Computer vision was used as an
example to demonstrate the application of the extracted relations in constructing domain knowledge graphs
and revealing historical research trends.
Originality/value – The proposed approach is highly efficient and generalizes well.
Instead of relying on a large-scale manually annotated corpus, it only requires a small set of entity relations that
can be easily extracted from external knowledge resources.
Keywords Entity relation extraction, Problem-solving, Word analogy, Knowledge graph,
Artificial intelligence, Research trend analysis
Paper type Research paper
1. Introduction
What problem does this paper solve, and what solution is used? This is what we try to
understand when reading a scientific article because “problem-solving” is the core aspect of
scientific research. Nasar et al. (2018) indicated that problems and their solutions form
the basis of phrase-level “key insights” in a scientific article. With the rapid growth in the
number of scientific papers, it is necessary to extract their key insights (research problems,
solutions and their relations) to construct scholarly knowledge graphs and conduct research
trend analyses of the corresponding scientific domains.
Several studies have focused on extracting domain-specific entities from scientific papers,
but less attention has been given to extracting entity relations. The current technologies for
extracting entity relations from scientific papers encounter some challenges despite the
growing demand for their application. The immediate problem is that most of these
technologies are supervised learning methods trained on a pre-annotated corpus, which is
scarce in most scientific domains. Moreover, manual annotation of a domain-specific corpus
is costly. As a result, the performance of supervised learning methods for extracting
entity relations from scientific papers is insufficient for application; the best F1 value of
existing models is approximately 50%, according to Nasar et al. (2018), and their domain
generalization ability is poor. Thus, related applications have primarily been conducted in
domains for which a manually annotated corpus is available, such as biomedicine and
computer science. It is unrealistic to build a manually annotated corpus for each domain for
extracting entity relations from scientific papers.
This study is supported by the MOE (Ministry of Education in China) Project of Humanities and Social
Sciences (No. 21YJC870003); and the Jiangsu Provincial Social Science Foundation of China (No.
21TQC002).
The current issue and full text archive of this journal is available on Emerald Insight at:
https://www.emerald.com/insight/2050-3806.htm
Received 15 March 2022; revised 1 May 2022 and 16 May 2022; accepted 17 May 2022
Aslib Journal of Information Management, Vol. 75 No. 3, 2023, pp. 481-499
© Emerald Publishing Limited, 2050-3806
DOI 10.1108/AJIM-03-2022-0129
We now face a dilemma: the demand for applying entity relations has become urgent, yet
the cost of utilizing current supervised learning technologies is extremely
high. Therefore, we must find a breakthrough according to the characteristics of entity
relations in scientific papers. We found two crucial clues based on a comprehensive survey of
related research. First, the instance distribution of different relation types in scientific papers is
highly unbalanced. For example, in the SemEval18 corpus, USAGE and COMPOSE relations
constituted approximately 50% of all instances (Gábor et al., 2016). Second, the existing
methods exhibit varying performance in extracting different relation types. For example,
Jiang et al. (2020) listed their model’s performance on seven relation types, in which the F1 of
USED-FOR reached 62.1%, whereas the F1 of CONJUNCTION was only 7.3%. Nonetheless,
the essential relation types in scientific papers (USAGE/USED-FOR, which can be viewed as
problem-solving relations) cover most instances and can be extracted effectively.
Under such circumstances, a feasible way of resolving the above dilemma in
practice is to focus on extracting relations between problems and solutions without
manual annotation, rather than extracting all relation types with a large-scale
manually annotated corpus. Four relation types can be identified between problems and
solutions: problem-solving, problem hierarchy, solution hierarchy and association relations.
These relations can be utilized to construct a comprehensive problem-solving-oriented
knowledge graph for a given domain.
As a solution to this problem, we propose a low-cost method of identifying the
abovementioned four relation types in scientific papers using a word analogy learning
scheme. Using this method, problem-solving and hierarchical relations can be classified
based on small-scale reference entity pairs.
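To make the word analogy idea concrete, the following is a minimal sketch and not the implementation used in this study. It assumes that each entity already has an embedding vector, represents an entity pair by the offset between the tail and head vectors, and assigns the relation type whose reference pairs have, on average, the most similar offsets. The toy three-dimensional embeddings, the entity names, the tail-minus-head direction, the use of cosine similarity and the averaging over reference pairs are all illustrative assumptions; the handling of association relations is omitted.

```python
import numpy as np

# Toy embeddings (hypothetical values chosen for illustration only).
emb = {
    "computer vision": np.array([0.0, 0.0, 0.1]),
    "image classification": np.array([0.0, 1.0, 0.2]),
    "object detection": np.array([0.1, 1.0, 0.0]),
    "convolutional neural network": np.array([1.0, 1.0, 0.3]),
    "speech recognition": np.array([0.0, 1.0, -0.2]),
    "hidden markov model": np.array([1.0, 1.0, -0.1]),
    "natural language processing": np.array([0.0, 0.0, 0.4]),
    "machine translation": np.array([0.0, 0.9, 0.5]),
    "transformer": np.array([1.0, 0.9, 0.6]),
}

# Small set of predefined (head, tail) reference pairs per relation type.
references = {
    "problem-solving": [
        ("image classification", "convolutional neural network"),
        ("speech recognition", "hidden markov model"),
    ],
    "problem hierarchy": [
        ("computer vision", "image classification"),
        ("computer vision", "object detection"),
    ],
}


def offset(emb, head, tail):
    """Offset vector of an entity pair (tail minus head)."""
    return emb[tail] - emb[head]


def cosine(u, v):
    """Cosine similarity between two vectors."""
    return float(np.dot(u, v) / (np.linalg.norm(u) * np.linalg.norm(v)))


def classify_pair(emb, pair, references):
    """Return the relation type whose reference offsets are most similar on average."""
    query = offset(emb, *pair)
    scores = {
        label: float(np.mean([cosine(query, offset(emb, h, t)) for h, t in pairs]))
        for label, pairs in references.items()
    }
    best = max(scores, key=scores.get)
    return best, scores


if __name__ == "__main__":
    for pair in [("machine translation", "transformer"),
                 ("natural language processing", "machine translation")]:
        label, scores = classify_pair(emb, pair, references)
        print(pair, "->", label, scores)
```

In this sketch, adding a new relation type only requires supplying a few reference pairs for it, which reflects why such a scheme needs no large annotated corpus.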
The remainder of this paper is organized as follows: Section 2 discusses the related works.
Section 3 presents the objectives of this study. Section 4 introduces the proposed framework
for extracting entity relations of problems and solutions from scientific papers. Section 5
describes the experiment in the artificial intelligence (AI) domain. Section 6 presents an
application case using experimental results. A brief discussion and the conclusions of this
study are presented in Section 7.
2. Related research
2.1 Problems and solutions in scientific papers
Scientific research is usually described as a problem-solving activity. Therefore, problems
and solutions are essential insights in scientific research (Heffernan and Teufel, 2018). After
analysing many scientific papers, Winter (1968) indicated that scientific papers can mainly be
described by a four-part “problem-solving” pattern: situation, problem, solution and evaluation.
Hoey (2001) adjusted this model by introducing the concept of a response instead of a solution
because a solution must be evaluated before it can be accepted. Heffernan and Teufel (2018)
discussed two connotations of “problem”: one is a challenge to be solved, and the other is
a “problematic problem,” that is, a problem that causes difficulties or negative
effects. Nasar et al. (2018) indicated that valuable general phrase-level key insights in
scientific papers include problems, domains, processes and results.