Online Information Review, Vol. 42 No. 3, 2018, pp. 343-354
© Emerald Publishing Limited, ISSN 1468-4527
DOI: https://doi.org/10.1108/OIR-05-2017-0153
Received 23 May 2017; revised 21 September 2017; accepted 21 September 2017; published 11 June 2018
Subject matter: Library & information science, Information behaviour & retrieval, Collection building & management, Bibliometrics, Databases, Information & knowledge management, Information & communications technology, Internet, Records management & preservation, Document management
Gender bias in machine learning for sentiment analysis
Mike Thelwall
School of Mathematics and Computer Science, University of Wolverhampton, Wolverhampton, UK
Abstract
Purpose: The purpose of this paper is to investigate whether machine learning induces gender biases in the sense of results that are more accurate for male authors or for female authors. It also investigates whether training separate male and female variants could improve the accuracy of machine learning for sentiment analysis.
Design/methodology/approach: This paper uses ratings-balanced sets of reviews of restaurants and hotels (three sets) to train algorithms with and without gender selection (see the illustrative sketch after this abstract).
Findings: Accuracy is higher on female-authored reviews than on male-authored reviews for all data sets, so applications of sentiment analysis using mixed-gender data sets will over-represent the opinions of women. Training on same-gender data improves performance less than having additional data from both genders.
Practical implications: End users of sentiment analysis should be aware that its small gender biases can affect the conclusions drawn from it and apply correction factors when necessary. Users of systems that incorporate sentiment analysis should be aware that performance will vary by author gender. Developers do not need to create gender-specific algorithms unless they have more training data than their system can cope with.
Originality/value: This is the first demonstration of gender bias in machine learning sentiment analysis.
Keywords: Machine learning, Sentiment analysis, Gender, Opinion mining, Gender bias
Paper type: Research paper
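To make this design concrete, the sketch below trains the same classifier on size-matched mixed-gender and single-gender training sets, then scores each model separately on male- and female-authored held-out reviews. It is a minimal illustration under stated assumptions: the file name, column names and the TF-IDF plus logistic regression classifier are invented for this sketch and are not taken from the paper.

# Minimal sketch of the experimental design (not the paper's actual pipeline):
# train one classifier per training condition and score it separately on
# male- and female-authored test reviews. The file name, column names and
# the TF-IDF + logistic regression classifier are assumptions.
import pandas as pd
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.linear_model import LogisticRegression
from sklearn.metrics import accuracy_score
from sklearn.model_selection import train_test_split
from sklearn.pipeline import make_pipeline

# Hypothetical data: review text, star rating (the label), author gender.
df = pd.read_csv("reviews.csv")  # columns: text, rating, gender ("F" or "M")
train, test = train_test_split(df, test_size=0.2, random_state=0,
                               stratify=df["rating"])

# Size-matched training sets, so "same gender" is not confounded with "more data".
n = min((train["gender"] == "F").sum(), (train["gender"] == "M").sum())
conditions = {
    "mixed": pd.concat([train[train["gender"] == "F"].sample(n // 2, random_state=0),
                        train[train["gender"] == "M"].sample(n // 2, random_state=0)]),
    "female-only": train[train["gender"] == "F"].sample(n, random_state=0),
    "male-only": train[train["gender"] == "M"].sample(n, random_state=0),
}

for name, subset in conditions.items():
    model = make_pipeline(TfidfVectorizer(min_df=2),
                          LogisticRegression(max_iter=1000))
    model.fit(subset["text"], subset["rating"])
    for gender in ("F", "M"):
        fold = test[test["gender"] == gender]
        acc = accuracy_score(fold["rating"], model.predict(fold["text"]))
        print(f"train={name}, test={gender}-authored: accuracy={acc:.3f}")

Matching the training-set sizes matters here: without it, the mixed-gender condition would benefit from simply having more data, which is exactly the confound behind the finding that same-gender training helps less than additional data from both genders.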
Introduction
Automatic sentiment analysis algorithms (Liu, 2012; Pang and Lee, 2008) are part of the
standard toolkit for market research and customer relations management (Lamont, 2014;
Liyakasa, 2012). They can detect the opinions of customers unobtrusively on a large scale in
near real-time from their social web posts. Sentiment analysis is also used in government
and politics to detect public opinion about issues or candidates (Wood, 2016; Wright, 2015).
Analysts may implicitly assume that sentiment analysis results are unbiased because they
are automatic, but this is not necessarily true. Given the existence of clear gender differences
in communication styles on the social web (Burger et al., 2011; Mihalcea and Garimella, 2016;
Volkova and Yoram, 2015), including for expressing sentiment (Montero et al., 2014;
Thelwall, Wilkinson and Uppal, 2010), interpreting sentiment (Guerini et al., 2013) and
discussing products (Yang et al., 2015), gender biases in sentiment analysis seem likely.
In other words, sentiment analysis algorithms may be better able to detect sentiment from
one gender than from the other so that, in a gender-mixed collection of texts, sentiment
analysis results could over-represent the opinions of one gender.
Thus, for example, a company exploiting sentiment analysis for customer relations
management (Waller and Fawcett, 2013) might find that 50 per cent of female-authored
reviews of their product were positive in comparison to 40 per cent of male-authored
reviews. From this they might conclude that their product is more appealing to females,
whereas the 10 per cent difference might be due to the algorithms being better able to detect
sentiment expressed by females. Similarly, a company with two products, A and B, where B
is an upgraded variant of A might find that A has a 50 per cent approval rating whilst B has
a 60 per cent approval rating. If females preferred B but males preferred A and the
sentiment analysis algorithm detected female sentiment better, then it is possible that A and
B have the same overall approval rating but that gender bias in the algorithm (i.e. greater
ability to detect sentiment from females) caused the difference in the overall ratings.
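To see how this can arise, the short calculation below uses invented accuracy figures (the paper reports no such numbers at this point). Both products are given the same true approval of 60 per cent, with equal review volumes per gender, but the classifier is assumed to be more accurate on female-authored reviews, so product B, the one preferred by women, appears more approved.

# Hypothetical numbers illustrating the A/B example above: both products have
# the same true approval, but the classifier is assumed to be more accurate
# on female-authored reviews, inflating the product preferred by women.
def measured_positive_rate(pos_rate: float, accuracy: float) -> float:
    """Observed positive rate when a binary classifier with the given
    accuracy flips each review's polarity with probability 1 - accuracy."""
    return pos_rate * accuracy + (1.0 - pos_rate) * (1.0 - accuracy)

ACC = {"female": 0.85, "male": 0.70}  # assumed per-gender classifier accuracy

# True positive rates per gender, with equal review volumes per gender.
products = {
    "A": {"female": 0.4, "male": 0.8},  # males prefer A
    "B": {"female": 0.8, "male": 0.4},  # females prefer B
}

for name, true_rates in products.items():
    true_overall = sum(true_rates.values()) / 2
    measured = sum(measured_positive_rate(true_rates[g], ACC[g]) for g in ACC) / 2
    print(f"Product {name}: true approval {true_overall:.0%}, "
          f"measured approval {measured:.1%}")
# Product A: true approval 60%, measured approval 52.5%
# Product B: true approval 60%, measured approval 58.5%

The six-point gap in measured approval is produced entirely by the assumed per-gender accuracy difference, not by any difference in the products' true approval.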
