An empirical analysis of user behaviour on multilingual information retrieval

Published date: 05 June 2017
Pages: 410-426
DOI: https://doi.org/10.1108/EL-01-2016-0004
Authors: Li Si, Qiuyu Pan, Xiaozhe Zhuang
Subject matter: Information & knowledge management, Information & communications technology, Internet
Li Si
Center for the Studies of Information Resources, Wuhan University,
Wuhan, China, and
Qiuyu Pan and Xiaozhe Zhuang
Department of Library Science, School of Information Management,
Wuhan University, Wuhan, China
Abstract
Purpose: This paper aims to understand users' information behaviours when they perform multilingual
information retrieval. It also offers a reference for the development of multilingual information retrieval
systems and relevant service platforms.
Design/methodology/approach: The authors designed an experiment on multilingual information
retrieval with WorldWideScience, used Camtasia Studio 7 (a screen capturing and recording tool) to record
the subjects' overall operational processes and collected participants' thought processes with think-aloud
protocols. In addition, a questionnaire survey and interviews were used to examine the subjects' background
information, their feelings about the experiment and their ideas about the experimental platform.
Thirty-two valid data points were obtained from 41 subjects.
Findings: The users preferred their own language for retrieval. Most users from the social sciences chose
general search or advanced search freely according to the task. The majority of the participants selected
keywords directly from the tasks as search terms. Doctoral candidates were more likely to construct a search
query with logic symbols. Translation tools were used to assist retrieval and to resolve doubts about
translations. When facing obstacles, users stayed on the original web page and continued exploring, followed
by returning to the homepage.
Originality/value: This paper provides a study of user behaviour by investigating how users
behave across the whole process of retrieving multilingual information. The findings offer advice for
optimizing the functions of multilingual information retrieval systems and service platforms.
Keywords: Information retrieval, User studies, Information seeking behaviours,
Multilingual information access, Think-aloud protocols
Paper type: Research paper
Introduction
According to the internet data statistics agency, Internet World Stats, by November 2015,
internet users from Asia, Europe, Latin America, Africa, North America, the Middle East and
Oceania accounted for 48.2, 18.0, 10.2, 9.8, 9.3, 3.7 and 0.8 per cent, respectively, of global
internet users. The top ten languages used by internet users were English, Chinese, Spanish,
Arabic, Portuguese, Japanese, Russian, Malay, French and German, accounting for 78.2
per cent (www.internetworldstats.com/stats.html). The United Nations Educational,
Scientific and Cultural Organization (UNESCO) reported that web pages in English made up
75 per cent of global web pages in 1998, but dropped sharply to 45 per cent in 2007 (Pimienta
et al., 2009). Online information tends to be multilingual, which creates difficulties for users
who may have mastered only one or a few languages. Web search in multilingual collections has long
been discussed primarily within the context of cross-language information retrieval (CLIR).
CLIR deals with retrieval situations in which users post queries in one language but expect
to receive search results in another language (Rieh and Rieh, 2005), while multilingual
information retrieval enables retrieving documents in multiple languages in response to a
user's query (Rahimi et al., 2015). In this paper, the focus is on the multilingual
information retrieval behaviours of web users who have mastered only one or two languages and have an
interest in finding documents written in foreign languages (more than two languages). Using
think-aloud protocols, experiments, a questionnaire and interviews, features of users'
behaviours were analysed with the expectation of offering inspiration for the development of
multilingual information retrieval systems and service platforms.
This paper is one of the research outcomes of a Humanities and Social Sciences Project supported by the
Ministry of Education of P.R. China (Project Name: Research on content-based multilingual information
organization and retrieval, Project No. 14JJD870001).
The current issue and full text archive of this journal is available on Emerald Insight at:
www.emeraldinsight.com/0264-0473.htm
Received 9 January 2016
Revised 6 June 2016, 12 August 2016, 20 September 2016
Accepted 2 October 2016
The Electronic Library
Vol. 35 No. 3, 2017
pp. 410-426
© Emerald Publishing Limited
0264-0473
DOI 10.1108/EL-01-2016-0004
Literature review
Understanding users' information behaviours is crucial to developing highly efficient information
retrieval tools. Several studies have explored users' behaviours in cross-language, as well as
multilingual, information retrieval.
Users’ behaviours on multilingual information retrieval
In the context of multilingual information retrieval, multilingual image retrieval was the
most covered issue. Vassilakaki et al. (2009) discussed image-seeking behaviours when
searching for known, non-annotated images in FlickLing, to reveal users' perceptions of the
tasks involved when searching across languages. Processing and analysing more than
one million log lines of multilingual image search, Peinado et al. (2010) found that users with
no competence in the annotation language of the image tended to ask for more hints; the fewer
language skills a user had, the more often he or she used cross-language facilities; and that
usage of relevance feedback was remarkably low, but successful users tended to use it more
frequently. Ruiz and Chin (2010) studied the challenges that users faced when searching for
images indexed in languages other than English, and their results showed that the users
found the task hard because of the difficulty of selecting terms that matched those assigned by
the creators of the images. In addition, Wu et al. (2012) investigated academic users' multilingual
needs and expectations for digital libraries with a questionnaire that was produced in
Chinese and English versions. They then divided the participants into a Chinese group, an
English group and a third group (non-Chinese and non-English language) to determine the
disparities in needs and expectations among different language groups, concluding
that Chinese participants demonstrated the strongest multilingual needs and
the English group the weakest.
Analysing search logs has become an important way to discover the features of users'
multilingual information retrieval behaviour. Cristea et al. (2009) tried to find correlations
between different search parameters from a subset of the FlickLing search log, and finally
found there was not a clear connection between the results of over-achieving users and their
particular actions. Navarro-Colorado et al. (2010) investigated user behaviour by mining
search logs from a multilingual search interface for Flickr to determine the necessity of word
sense disambiguation in information retrieval tasks, and they found there was a clear
influence of the lexical ambiguity of the queries on the precision of users. By analysing
search logs of the European Library, Ghorab et al. (2010) made efforts to probe into the
retrieval behaviour of users from different linguistic or cultural backgrounds, confirming
that users from different backgrounds behaved differently, and that user queries could