Protecting privacy on the web. A study of HTTPS and Google Analytics implementation in academic library websites

Published date: 08 October 2018
DOI: https://doi.org/10.1108/OIR-02-2018-0056
Pages: 734-751
Authors: Patrick O’Brien, Scott W.H. Young, Kenning Arlitsch, Karl Benedict
Subject matter: Library & information science, Information behaviour & retrieval, Collection building & management, Bibliometrics, Databases, Information & knowledge management, Information & communications technology, Internet, Records management & preservation, Document management
Protecting privacy on the web
A study of HTTPS and Google Analytics
implementation in academic library websites
Patrick O’Brien, Scott W.H. Young and Kenning Arlitsch
Montana State University, Bozeman, Montana, USA, and
Karl Benedict
University of New Mexico, Albuquerque, New Mexico, USA
Abstract
Purpose: The purpose of this paper is to examine the extent to which HTTPS encryption and Google
Analytics services have been implemented on academic library websites, and discuss the privacy implications
of free services that introduce web tracking of users.
Design/methodology/approach: The home pages of 279 academic libraries were analyzed for the
presence of HTTPS, Google Analytics services and privacy-protection features.
Findings: Results indicate that HTTPS implementation on library websites is not widespread, and many
libraries continue to offer non-secured connections without an automatically enforced redirect to a secure
connection. Furthermore, a large majority of library websites included in the study have implemented Google
Analytics and/or Google Tag Manager, yet only very few connect securely to Google via HTTPS or have
implemented Google Analytics IP anonymization.
Practical implications: Librarians are encouraged to increase awareness of this issue and take concerted
and coherent action across five interrelated areas: implementing secure web protocols (HTTPS), user
education, privacy policies, informed consent and risk/benefit analyses.
Originality/value: Third-party tracking of users is prevalent across the web, and yet few studies
demonstrate its extent and consequences for academic library websites.
Keywords: Web analytics, HTTPS, Third-party tracking, Web privacy
Paper type: Research paper
Introduction
Third-party tracking can occur when web analytics services, such as Google Analytics, are
utilized to measure visitation to websites. These services provide information about website
use and user behavior, which can help libraries improve their online services. However, the
analytics services operate sophisticated mechanisms through extensive networks to track
users and their behavior across sites, acquiring user demographics and behavioral patterns.
The detailed tracking enabled by Google Analytics is often performed without the fully
informed consent of individual users of the website. The extent to which Google Analytics
services have been implemented within the domain of library websites has been unknown
prior to this study. Unknown, also, has been the extent to which available privacy-protecting
features have been implemented on those websites.
The library profession has long supported the principles of privacy, but tracking used by
analytics service providers has rendered those principles nearly untenable. For example,
without proactive efforts to mitigate their impact, browser cookies set by Google Analytics
act as beacons for collecting and sharing user data through a vast network of commercial
trackers. By understanding the extent and significance of web tracking and the available
privacy-protection mechanisms, libraries can begin to minimize their participation in
third-party tracking on the web.
Online Information Review
Vol. 42 No. 6, 2018
pp. 734-751
Emerald Publishing Limited
1468-4527
DOI 10.1108/OIR-02-2018-0056
Received 19 February 2018
Revised 16 June 2018
Accepted 13 July 2018
The current issue and full text archive of this journal is available on Emerald Insight at:
www.emeraldinsight.com/1468-4527.htm
© Patrick O’Brien, Scott W.H. Young, Kenning Arlitsch and Karl Benedict. Published by Emerald
Publishing Limited. This article is published under the Creative Commons Attribution (CC BY 4.0) licence.
Anyone may reproduce, distribute, translate and create derivative works of this article (for both
commercial and non-commercial purposes), subject to full attribution to the original publication and
authors. The full terms of this licence may be seen at http://creativecommons.org/licences/by/4.0/legalcode
The results presented in this paper demonstrate conclusively that 279 academic libraries
from around the world must do much more to ensure user privacy if they hope to maintain
trust with their users. The principle of this trust is outlined in the privacy statements of the
American Library Association (ALA), Coalition for Networked Information (CNI), National
Information Standards Organization (NISO) and the International Federation of Library
Associations and Institutions (IFLA).
In presenting our research, we first explain web tracking, web analytics and web
privacy. We then detail our methods and results, followed by a discussion of the privacy
implications of third-party web tracking. We conclude by offering recommendations for
professional action and avenues for future research.
Literature review
Web tracking
The practice of third-party tracking on websites is widespread (Narayanan and Reisman,
2017), and has only increased in prevalence, variety and complexity over time (Lerner
et al., 2016; Englehardt and Narayanan, 2016). One of the most common trackers found on
the Web is produced by the Google Analytics web service, which is used to measure the
visitation to a website (Lerner et al., 2016; Schelter and Kunegis, 2016). In exchange for this
easy-to-implement and free-to-use analytics service, websites execute Google Analytics
JavaScript code and pass user visit data to Google through browser cookies set by Google
Analytics (Krishnamurthy and Wills, 2009). Such data are considered to be “leaked” if the
user is unaware of its collection and does not consent to the data being shared with
additional third parties (Sar and Al-Saggaf, 2013). An analysis of 1 million websites found that
nearly nine in ten websites leak user data to third parties without the user’s knowledge
(Libert, 2015).
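As a rough illustration of how tracker code embedded in a page can be recognized, the sketch below scans a page's HTML source for common Google Analytics and Google Tag Manager script references, and for the documented IP anonymization settings (`anonymizeIp` in analytics.js, `anonymize_ip` in gtag.js). The function name `scan_home_page`, the regular expressions and the sample snippet are assumptions made for this example; they are not the instrument used in the study, and real pages may load trackers in ways these patterns miss.

```python
import re

# Illustrative patterns only (assumed for this sketch, not the study's method).
GA_SCRIPT = re.compile(
    r"google-analytics\.com/(?:analytics|ga)\.js|googletagmanager\.com/gtag/js"
)
GTM_SCRIPT = re.compile(r"googletagmanager\.com/gtm\.js")
ANONYMIZE_IP = re.compile(r"anonymizeIp|anonymize_ip")
GA_HTTPS = re.compile(r"https://(?:www|ssl)\.google-analytics\.com/")


def scan_home_page(html: str) -> dict:
    """Report which tracker-related markers appear in a page's HTML source."""
    return {
        "google_analytics": bool(GA_SCRIPT.search(html)),
        "tag_manager": bool(GTM_SCRIPT.search(html)),
        "ip_anonymization": bool(ANONYMIZE_IP.search(html)),
        # True only when the script URL is written with an explicit https:// scheme.
        "ga_over_https": bool(GA_HTTPS.search(html)),
    }


# Hypothetical page fragment: analytics.js loaded via a protocol-relative URL,
# with no IP anonymization configured.
SAMPLE_HTML = """
<script>
  (function(i,s,o,g,r,a,m){ /* loader body omitted */ })
  (window, document, 'script', '//www.google-analytics.com/analytics.js', 'ga');
  ga('create', 'UA-XXXXX-Y', 'auto');
  ga('send', 'pageview');
</script>
"""

report = scan_home_page(SAMPLE_HTML)
print(report)
```

In this sample the tracker is detected, but neither IP anonymization nor an explicit HTTPS script URL is found. Adding a line such as `ga('set', 'anonymizeIp', true);` to the snippet would change the `ip_anonymization` flag.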
The Google Analytics tracker is not designed to leak user data across sites on its own,
but its tracking capabilities are enhanced when combined with Google AdSense, Google’s
popular cross-site advertising service that utilizes its Doubleclick tracker. When Google
AdSense and Google Analytics have both been implemented in a website, the unique
identifiers from each service can be linked by Google’s Doubleclick tracker such that
Google can create browsing profiles that track users across sites (Roesner et al., 2012).
Data leakage from Google Analytics can also occur when websites activate the additional
Google tracking service known as Tag Manager, which allows for cross-site tracking and
targeted advertising (Bashir et al., 2016). Under these expanded tracking conditions,
third-party trackers can match user behavior data with user profiles, thereby allowing users
to be tracked and targeted across the web (Olejnik et al., 2012; Falahrastegar et al., 2016;
Kalavri et al., 2016). While data about Google Tag Manager and Google AdSense were
collected during the course of this study, full analysis is beyond the scope of this paper.
Data leakage and user profiling via web tracking represent a privacy issue for users
because of a lack of transparency and the lack of opportunity for users to consent to the
sharing of their tracked behavior. The following example illustrates this case:
A user logs into Gmail and then visits a library website that has implemented Google Analytics or
Google Tag Manager. This user then searches for tax relief resources through the library website.
Because Google (1) identifies and authenticates users via their Google IDs and passwords and (2)
identifies and authenticates the library website through Google Analytics or Tag Manager, Google
can link users’ library website activity to individual users’ Google profiles. Depending on the
library’s Google implementation, this user activity may also be shared with Google’s advertising
network, which targets users with personalized ads, such as credit cards or personal loan services,
even after the user has left the library website.
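The linking step in this example can be pictured with a toy model. The sketch below is not a description of Google's actual systems: the record layout, the `browser_id` field, the `build_profiles` function and all sample values are hypothetical. It only shows why a shared identifier, once tied to a logged-in account, is enough to join page views from unrelated sites into one profile.

```python
from collections import defaultdict


def build_profiles(page_views, logins):
    """Group page views by account, joining on a shared browser identifier.

    page_views: list of dicts with 'browser_id', 'site' and 'page' keys.
    logins: dict mapping a browser identifier to an authenticated account.
    """
    profiles = defaultdict(list)
    for view in page_views:
        account = logins.get(view["browser_id"])
        if account is not None:  # views from unidentified browsers are not linked
            profiles[account].append((view["site"], view["page"]))
    return dict(profiles)


# Hypothetical data: one browser is logged in to an account, another is not.
page_views = [
    {"browser_id": "b-123", "site": "library.example.edu", "page": "/search?q=tax+relief"},
    {"browser_id": "b-123", "site": "news.example.com", "page": "/finance"},
    {"browser_id": "b-999", "site": "library.example.edu", "page": "/hours"},
]
logins = {"b-123": "user@example.com"}

profiles = build_profiles(page_views, logins)
print(profiles)
```

Here the library search for tax relief ends up in the same profile as the visit to an unrelated site, while the browser that never logged in remains unlinked.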