Mixed language queries in online searches. A study of intra-sentential code-switching from a qualitative perspective

Date: 21 January 2019
Pages: 72-89
DOI: https://doi.org/10.1108/AJIM-04-2018-0091
Published date: 21 January 2019
Author: Hengyi Fu
Subject Matter: Library & information science, Information behaviour & retrieval, Information & knowledge management, Information management & governance, Information management
Hengyi Fu
Florida State University, Tallahassee, Florida, USA
Abstract
Purpose – With the increasing number of online multilingual resources, cross-language information retrieval
(CLIR) has drawn much attention from the information retrieval (IR) research community. However, few
studies have examined how and why multilingual searchers seek information in two or more languages,
specifically how they switch and mix languages in queries to get satisfying results. The purpose of this paper
is to focus on Chinese-English bilinguals' intra-sentential code-switching behaviors in online searches. The
scenarios and reasons for code-switching, factors that may affect code-switching, the patterns of mixed
language query formulation and reformulation, and how current IR systems and other search tools can
facilitate such information needs were examined.
Design/methodology/approach – In-depth semi-structured interviews were used as the research method.
In total, 30 participants were recruited based on their English proficiency, location and profession, using a
purposive sampling method.
Findings – Four scenarios and four reasons for using Chinese-English mixed language queries to cover
information needs were identified, and results suggest that linguistic and cultural/social factors are of
equivalent importance in code-switching behaviors. English terms and Chinese terms in queries play different
roles in searches, and mixed language queries cannot be replaced by either single language queries or other
search facilitating features. Findings also suggest that current search engines and tools need greater emphasis in
the user interface and that more user education is required.
Originality/value – This study presents a qualitative analysis of bilinguals' code-switching behaviors in
online searches. Findings are expected to advance the theoretical understanding of bilingual users' search
strategies and interactions with IR systems, and provide insights for designing more effective IR systems and
tools to discover multilingual online resources, including cross-language controlled vocabularies, personalized
CLIR tools and mixed language query assistants.
Keywords Search behaviour, Code-switching, Cross-language information retrieval, Mixed language queries,
Online search tools, Query formulation and reformulation
Paper type Research paper
1. Introduction
In cross-language information retrieval (CLIR), search queries are translated from the source
language to the target language, and the original and translated queries are used to retrieve
documents in both the source and targeted languages. The assumption behind this
mechanism is that queries are consistently used in one language. This mechanism works
well when the user is monolingual and is looking for information in a certain language.
However, the user of a CLIR system may be bilingual to some extent. For example,
Hong Kong people typically speak Cantonese with English words (Gibbons, 1987). When
conducting online searches, they often combine Chinese and English terms to encompass
their information needs. This phenomenon, called code-switching, has been studied in
psycholinguistics and social linguistics for decades (Auer, 1988; Blom and Gumperz, 1972;
Myers-Scotton, 1979, 1993), but very few researchers have studied it in an online context,
especially for online searches. Originating from social linguistics, code-switching is defined
as the mixing of words, phrases or sentences from two different grammatical structures in a
single statement (Bokamba, 1989). This concept can be applied to information retrieval (IR),
when people switch between languages in their searches.
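The translate-then-retrieve mechanism described at the start of this section can be sketched as follows. This is a minimal illustration under stated assumptions, not the design of any actual CLIR system: the toy dictionary, the document collection, the function names and the whitespace tokenization of Chinese text are all hypothetical.

```python
# Toy bilingual dictionary (hypothetical entries, for illustration only).
TOY_DICTIONARY = {"weather": "天气", "price": "价格"}

# Tiny bilingual collection; the Chinese text is pre-segmented with spaces,
# which is a simplifying assumption (real Chinese text needs a word segmenter).
DOCUMENTS = {
    "en-1": "weather forecast for today",
    "zh-1": "今天 天气 预报",
    "en-2": "ticket price list",
}


def translate_query(terms, dictionary):
    """Translate each source-language term; keep unknown terms unchanged."""
    return [dictionary.get(term, term) for term in terms]


def retrieve(query, documents, dictionary):
    """Return ids of documents matching the original or the translated query."""
    source_terms = query.lower().split()
    target_terms = translate_query(source_terms, dictionary)
    all_terms = set(source_terms) | set(target_terms)
    return sorted(
        doc_id
        for doc_id, text in documents.items()
        if all_terms & set(text.split())
    )


print(retrieve("weather", DOCUMENTS, TOY_DICTIONARY))
```

The sketch treats the whole query as belonging to one source language, which is exactly the assumption that mixed language queries violate.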
Aslib Journal of Information
Management
Vol. 71 No. 1, 2019
pp. 72-89
© Emerald Publishing Limited
2050-3806
DOI 10.1108/AJIM-04-2018-0091
Received 26 April 2018
Revised 4 August 2018
20 September 2018
Accepted 1 October 2018
The current issue and full text archive of this journal is available on Emerald Insight at:
www.emeraldinsight.com/2050-3806.htm
In social linguistics, mixed language is defined as an important type of intra-sentential
code-switching, which embeds various phrases and clauses from two distinct grammatical
systems within the same sentence (Bokamba, 1989). In online searches, a mixed language
query is a search query including words mixed from two or more languages. For example,
the query "Caudalie grape water (effect)" is a Chinese-English mixed language query
looking for reviews about a certain skin care product. This unique type of query formulation
and search strategy have been frequently used by multilingual users when they search in
multiple languages to satisfy their information needs (Chau et al., 2007). However, very few
studies have addressed code-switching behaviors in online searches, especially why and
how multilingual searchers use mixed language queries, how they formulate and reformulate
such queries, and how current IR systems can better serve those information needs
for which mixed language queries are preferred. User interaction issues regarding
code-switching in searches, including search process, user experience, user preference and
user satisfaction, have also not drawn much attention.
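The definition of a mixed language query given above can be approximated in code. The following is a rough sketch, not a method used in this study: it uses the writing script of each character as a stand-in for language, which only works for pairs such as Chinese and English that use different scripts. The function names and the example query are hypothetical.

```python
def script_of(ch):
    """Return a coarse script label for one character, or None if unknown."""
    if "\u4e00" <= ch <= "\u9fff":  # basic CJK Unified Ideographs block only
        return "han"
    if ch.isascii() and ch.isalpha():  # unaccented Latin letters only
        return "latin"
    return None  # digits, punctuation, spaces and other scripts are ignored


def scripts_in_query(query):
    """Collect the set of script labels that occur in the query."""
    return {label for label in map(script_of, query) if label is not None}


def is_mixed_language_query(query):
    """Heuristic: a query is 'mixed' if it contains both Han and Latin letters."""
    return len(scripts_in_query(query)) >= 2


# Hypothetical example query: an English brand name plus the Chinese word for "price".
print(is_mixed_language_query("iPhone 价格"))
print(is_mixed_language_query("iphone price"))
```

A script check like this cannot separate languages that share a script (for example English and French), so it should be read as a first-pass filter rather than as language identification.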
To fill this gap, this study examines bilinguals' Chinese-English mixed language
querying behaviors from a qualitative perspective. Previous studies based on
Chinese-English mixed query log analysis discovered unique usage purposes and
patterns of query formulation and reformulation (Fu, 2016; Fu and Wu, 2014, 2015). This
follow-up study aims to further explore and deepen our understanding of multilingual
searchers' intra-sentential code-switching behaviors from the users' perspective, and
subsequently, to provide insights for improvements in the design of IR systems and tools to
support them. The research questions explored in this paper are:
RQ1. What are Chinese-English mixed language queries used to search for?
RQ2. What are the reasons for employing Chinese-English mixed language queries in
search? Why do users not use single language queries (Chinese-only queries or
English-only queries)?
RQ3. What are some query reformulation strategies regarding Chinese-English mixed
language queries? Why do users use certain reformulation strategies?
RQ4. What are users' perceptions of the performance of current search engines
regarding mixed language queries? What other online search facilitating
features and tools do they use to find cross-language online resources?
2. Related work
Multilingual users' online search behavior
A considerable number of empirical studies have attempted to characterize and report
findings on multilingual users' online search behaviors. Many factors can influence
multilingual users' online search behaviors, including domain knowledge (Berendt and
Kralisch, 2009; Gaspari, 2004; Ghorab et al., 2009; Steichen et al., 2014), search task/usage
purpose (Petrelli et al., 2004; Rieh and Rieh, 2005; Steichen et al., 2014), language proficiency
and culture (Artiles et al., 2006; Peinado et al., 2008; Steichen et al., 2014; Zazo et al., 2006),
language skills (Clough and Eleta, 2010; Marlow et al., 2008) and search facilitating features
and tools (Peinado et al., 2008; Zazo et al., 2006). In general, users with limited foreign
language skills tend to search in their own native languages and only use queries in other
languages when content in their native languages is not available. Artiles et al.'s (2006)
Flickr image search studies showed that users often avoid translating their query into
languages in which they are less fluent, even in the most favorable search setting. Users also
assumed that they could find everything in English even though English was not their first
language. Zazo et al. (2006) reported similar findings: compared to individuals with good
foreign language skills, users with poor skills were more likely to enter queries in their