Publication Date: 19 Nov 2018
Author: Moritz Schubotz, Philipp Scharpf, Kaushal Dudhat, Yash Nagar, Felix Hamborg, Bela Gipp
Subject: Library & information science, Library & information services, Lending, Document delivery, Collection building & management, Stock revision, Consortia
Moritz Schubotz
Chair of Digital Media, University of Wuppertal, School of Electrical Information and Media Engineering, Wuppertal, Germany
Philipp Scharpf, Kaushal Dudhat and Yash Nagar
Information Science Group, University of Konstanz, Department of Computer and Information Science, Konstanz, Germany
Felix Hamborg
Universität Konstanz, Fachbereich Wirtschaftswissenschaften, Konstanz, Germany, and
Bela Gipp
Chair of Digital Media, University of Wuppertal, School of Electrical Information and Media Engineering, Wuppertal, Germany
Purpose This paper aims to present an open source math-aware Question Answering System based on Ask Platypus.
Design/methodology/approach The system returns a single mathematical formula in response to a natural language question in English or Hindi. These
formulae originate from the knowledge base Wikidata. The authors translate these formulae to computable data by integrating the calculation
engine sympy into the system. This way, users can enter numeric values for the variables occurring in the formula. Moreover, the system loads
numeric values for constants occurring in the formula from Wikidata.
Findings In a user study, this system outperformed a commercial computational mathematical knowledge engine by 13 per cent. However, the
performance of this system heavily depends on the size and quality of the formula data available in Wikidata. As only a few items in Wikidata
contained formulae when the project started, the authors facilitated the import process by suggesting formula edits to Wikidata editors. With the
simple heuristic that the first formula is significant for the paper, 80 per cent of the suggestions were correct.
Originality/value This research was presented at the JCDL17 KDD workshop.
Keywords Digital libraries, Information retrieval, Database management, Mathematical information retrieval, Question answering systems
Paper type Research paper
ACM Reference Format:
2018. Introducing MathQA - A Math-Aware Question
Answering System. In Proceedings of The 18th ACM/IEEE Joint
Conference on Digital Libraries (JCDL'18). ACM, NY, NY,
USA, 11 pages.
1. Introduction
Question answering (QA) systems are information retrieval
(IR) systems that allow the user to pose questions in natural
language and provide quick and succinct answers, in contrast to
search engines, which deliver ranked lists of documents. In this
project, we developed an open source QA system, which is
available at Our system
can answer mathematical questions posed in natural
language, yielding a formula, which is retrieved from Wikidata.
Wikidata is a free and open knowledge base that can be read
and edited by humans and machines. It serves as a common data source
for other Wikimedia projects, especially for Wikipedia
infoboxes. In addition, our system enables the user to perform
arithmetic operations using the retrieved formula (Figure 1).
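
To illustrate the kind of calculation this enables, the following is a minimal sketch and not the system's actual implementation. It assumes the retrieved formula is already available as a plain string that sympy can parse (conversion from the stored representation is not shown). The function name `solve_formula` and the `CONSTANTS` table are hypothetical; the table stands in for constant values that the system loads from Wikidata.

```python
import re
import sympy as sp

# Hypothetical stand-in for constant values loaded from Wikidata.
CONSTANTS = {"c": 299792458}  # speed of light in m/s


def solve_formula(formula, values):
    """Substitute known values into 'lhs = rhs' and solve for the one remaining symbol.

    Assumes the formula uses only arithmetic operators and plain identifiers.
    """
    # Treat every identifier as an ordinary symbol, so that names such as
    # "E" are not parsed as sympy's built-in constants.
    names = set(re.findall(r"[A-Za-z_]\w*", formula))
    table = {name: sp.Symbol(name) for name in names}

    lhs_text, rhs_text = formula.split("=", 1)
    equation = sp.Eq(
        sp.sympify(lhs_text, locals=table),
        sp.sympify(rhs_text, locals=table),
    )

    # User-supplied values take precedence over the constants table.
    known = {
        table[name]: value
        for name, value in {**CONSTANTS, **values}.items()
        if name in table
    }
    reduced = equation.subs(known)

    unknowns = reduced.free_symbols
    if len(unknowns) != 1:
        raise ValueError("exactly one variable must be left without a value")
    (unknown,) = unknowns
    return str(unknown), sp.solve(reduced, unknown)


# The user supplies the mass; the constant c comes from the constants table.
name, solutions = solve_formula("E = m*c**2", {"m": 2})
print(name, solutions)
```

Because the sketch solves for whichever symbol is left without a value, the same function also handles rearranged questions, for example finding `a` in `F = m*a` when `F` and `m` are given. Note that `sympify` evaluates its input, so it should only be applied to trusted formula strings.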
We developed three modules: The Question Parsing Module (1)
transforms questions into a triple representation and produces
a simplified dependency tree. The Formula Retrieval Module (2)
then queries the Wikidata knowledge base for the requested
formula and presents the result to the user. The user can
subsequently choose values for the occurring variables and
request a calculation that is done by the Calculation Module (3). If
Accepted 12 September 2018

