THC-DAT helps in reading a multi-topic document. Results from a user-centered evaluation of a within-document analysis tool

Published date: 21 November 2016
DOI: https://doi.org/10.1108/LHT-07-2016-0081
Pages: 685-704
Author: Jing Chen, Dan Wang, Quan Lu, Zeyuan Xu
Subject Matter: Library & information science, Librarianship/library management, Library technology, Information behaviour & retrieval, Information user studies, Metadata, Information & knowledge management, Information & communications technology, Internet
Jing Chen and Dan Wang
School of Information Management,
Central China Normal University, Wuhan, China
Quan Lu
School of Information Management, Wuhan University, Wuhan, China, and
Zeyuan Xu
Henry Samueli School of Engineering and Applied Science,
University of California, Los Angeles, California, USA
Abstract
Purpose: With a mass of electronic multi-topic documents available, there is an increasing need to
evaluate emerging analysis tools that help users and digital libraries analyze these documents better.
The purpose of this paper is to evaluate the effectiveness, efficiency and user satisfaction of THC-DAT,
a within-document analysis tool, in reading a multi-topic document.
Design/methodology/approach: The authors first reviewed related literature, then performed a
user-centered, comparative evaluation of two within-document analysis tools, THC-DAT and
BOOKMARK. THC-DAT extracts a topic hierarchy tree using the hierarchical latent Dirichlet allocation
(hLDA) method and takes context information into account. BOOKMARK provides functionality
similar to the Table of Contents bookmarks in Adobe Reader. Three novel kinds of tasks were
devised for participants to finish with the two tools, yielding objective results for assessing reading
effectiveness and efficiency. Post-system questionnaires were employed to obtain participants'
subjective judgments about the tools.
Findings: The results confirm that THC-DAT is significantly more effective than BOOKMARK,
while not inferior in efficiency. There is some evidence suggesting that THC-DAT can slow the
approach of cognitive overload and improve users' willingness to undertake difficult tasks.
Based on qualitative data from questionnaires, the results indicate that users were more satisfied when
using THC-DAT than BOOKMARK.
Practical implications: Adopting THC-DAT in digital libraries or electronic document reading
systems contributes to promoting users' reading performance, willingness to undertake difficult tasks
and general satisfaction. Moreover, THC-DAT is of great value in addressing the cognitive overload
problem in the information retrieval field.
Originality/value: This paper evaluates a novel within-document analysis tool in analyzing a
multi-topic document, and shows that this tool is superior to the benchmark in effectiveness and user
satisfaction, and not inferior in efficiency.
Keywords: Digital libraries, E-readers, Multi-topic document, THC-DAT, User-centered evaluation,
Within-document analysis
Paper type: Technical paper
Library Hi Tech
Vol. 34 No. 4, 2016
pp. 685-704
© Emerald Group Publishing Limited
0737-8831
DOI 10.1108/LHT-07-2016-0081
Received 25 July 2016
Accepted 11 September 2016
The current issue and full text archive of this journal is available on Emerald Insight at:
www.emeraldinsight.com/0737-8831.htm
The authors gratefully acknowledge the financial support for this work provided by the National
Natural Science Foundation of China (No: 71303089, 71273195 and 71420107026), the National
Basic Research Program of China (973 Program, No: 904171200) and the Major Projects of the
National Social Science Foundation (No: 13&ZD183).
1. Introduction
With the rapid growth of electronic documents, multi-topic documents are becoming
increasingly common, and reading them demands strong reading ability from users.
Compared with single-topic documents, multi-topic documents, including scientific
articles, news stories and patents, often have a complex hierarchical structure and
involve more than one topic; moreover, topics in separate paragraphs may be
related (Chen et al., 2016; Tagarelli and Karypis, 2013). Analyzing a document
from a within-document perspective and then presenting its content in an
organized way therefore helps users understand a multi-topic document.
With respect to visualizing and analyzing these multi-topic documents, various tools
have been proposed, such as TOPIC ISLANDS (Miller et al., 1998), HINATA
(Nishihara et al., 2011) and TopicNets (Gretarsson et al., 2012). Unfortunately, most
tools ignore the latent hierarchical structure and the context information within a
document. Hence, we have proposed a new multi-topic document analysis tool,
THC-DAT (an acronym of topic hierarchy and context-based document analysis tool),
which uses the hierarchical latent Dirichlet allocation (hLDA) method to extract a topic
hierarchy tree and takes context information into account, enabling users to
browse and analyze a document in a multi-grained, topic-oriented and context-based
way (Chen et al., 2016).
A text signal is a writing device that emphasizes aspects of a text's
content or structure without adding to the content of the text (Lorch, 1989). Signals
pre-announce or emphasize content and reveal content relationships (Lemarié et al.,
2006; Lorch et al., 2001; Spyridakis, 1989); they can direct readers' attention during
reading and help improve readers' ultimate comprehension of the text.
The hierarchical topic tree extracted by THC-DAT is itself one kind of text signal.
BOOKMARK, such as the Table of Contents bookmarks in Adobe Reader, provides
a directory tree that integrates all levels of titles of a document and is the most
common text signal and within-document analysis tool. In our previous work, we
provided a comprehensive overview of the approach, interface and functions of
THC-DAT, then conducted a case study to evaluate the tool and discussed
qualitative results indicating its effectiveness. In this paper, we conduct a
comparative evaluation of THC-DAT and BOOKMARK to determine whether
THC-DAT enables users to browse, analyze and understand a multi-topic document
more efficiently and effectively. To this end, we investigated the two tools within a
simulated work task situation in which participants were asked to finish three kinds
of tasks about a document, with each tool used to finish three tasks. On the basis of
quantitative performance data and qualitative data derived from questionnaires, we
assessed the comparative efficiency, effectiveness and user satisfaction of the tools.
The structure of the paper is as follows. Section 2 reviews related work. Section 3
introduces the interfaces of the tools used in the experiment. Research questions and
hypotheses are provided in Section 4, and the experiment design is presented
in Section 5. Section 6 shows the results of the experimental study and the discussion.
Finally, some concluding remarks are offered in Section 7.
2. Related work
With the growing availability of electronic documents in recent years, topic-based tools
that reveal the topic structure of a long document with multiple topics are becoming a
research hotspot. For example, TOPIC ISLANDS (Miller et al., 1998) applied Wavelet