The lost academic home: institutional affiliation links in Google Scholar Citations

DOI: https://doi.org/10.1108/OIR-10-2016-0302
Pages: 762-781
Published date: 09 October 2017
Authors: Enrique Orduña-Malea, Juan M. Ayllón, Alberto Martín-Martín, Emilio Delgado López-Cózar
Subject matter: Library & information science, Information behaviour & retrieval, Collection building & management, Bibliometrics, Databases, Information & knowledge management, Information & communications technology, Internet, Records management & preservation, Document management
Enrique Orduña-Malea
Polytechnic University of Valencia, Valencia, Spain, and
Juan M. Ayllón, Alberto Martín-Martín and
Emilio Delgado López-Cózar
Universidad de Granada, Granada, Spain
Abstract
Purpose: Google Scholar Citations (GSC) provides an institutional affiliation link which groups together
authors who belong to the same institution. The purpose of this paper is to ascertain whether this feature is
able to identify and normalize all the institutions entered by the authors, and whether it is able to assign all
researchers to their own institution correctly.
Design/methodology/approach: Systematic queries to GSC's internal search box were performed in
two different forms (institution name and institutional e-mail web domain) in September 2015. The whole
Spanish academic system (82 institutions) was used as a test. Additionally, specific searches for companies
(Google) and world-class universities were performed to identify and classify potential errors in the
functioning of the feature.
Findings: Although the affiliation tool works well for most institutions, it is unable to detect all existing
institutions in the database, and it is not always able to create a unique standardized entry for each
institution. It also fails to group all the authors who belong to the same institution. A wide
variety of errors have been identified and classified.
Research limitations/implications: Even though the analyzed sample is good enough to empirically
answer the research questions initially proposed, a more comprehensive study should be performed to
calibrate the real volume of the errors.
Practical implications: The discovered affiliation link errors prevent institutions from being able
to access the profiles of all their respective authors using the institution lists offered by GSC. Additionally,
they introduce a shortcoming in the navigation features of Google Scholar which may impair web
user experience.
Social implications: Some institutions (mainly universities) are under-represented in the affiliation feature
provided by GSC. This fact might jeopardize the visibility of institutions as well as the use of this feature in
bibliometric or webometric analyses.
Originality/value: This work proves inconsistencies in the affiliation feature provided by GSC.
A whole national university system is systematically analyzed and several queries have been used to
reveal errors in its functioning. The completeness of the errors identified and the empirical data
examined are the most exhaustive to date regarding this topic. Finally, some recommendations about how
to correctly fill in the affiliation data (both for authors and institutions) and how to improve this feature are
provided as well.
Keywords: Universities, Google Scholar, Authority control, Academic search engines
Paper type: Research paper
1. Introduction
Google Scholar Citations (GSC) is the academic profile service created by Google
(now Alphabet) in 2011, based on the bibliographic data available in Google Scholar
(Jacsó, 2012b). Among its main features is the complete freedom users have to edit their
personal information (name, institutional affiliation, and areas of interest). In keeping with the
philosophy of the company, no terminology constraints are imposed, so a loss of precision in
affiliations and subject categorizations is to be expected; on the upside, this
allows a more flexible and open experience. This is in contrast to other, more constrained
profile services, like the one Microsoft Academic Search used to offer (Jacsó, 2012b).
Online Information Review
Vol. 41 No. 6, 2017
pp. 762-781
© Emerald Publishing Limited
1468-4527
DOI 10.1108/OIR-10-2016-0302
Received 12 October 2016
Revised 18 April 2017
Accepted 23 April 2017
However, in August 2015 GSC introduced a new feature aimed at facilitating affiliation
searches and browsing the profiles of authors who work in the same institution. Thus, if an
author has entered an affiliation in a more or less conventional form, a link will
automatically appear pointing to the list of all researchers working in the same institution,
while also displaying the usual information available in a GSC search: the total number of
citations received by each researcher and the areas of interest entered by the author (up to five of
them). This feature is therefore an evident improvement in the product, since it offers a
simple and quick method to search for authors by their institution. Although these searches
could already be carried out before (by searching the domain of the institutional e-mail), the
process was too slow and tedious to be used for large-scale quantitative studies, and
excluded authors using unofficial e-mail addresses.
This new functionality not only facilitates the identification of all authors working in a
university or other organization (provided their profiles are public), but is also useful for
learning about the scientific interests and subject profile of an institution. This
may, indirectly, make it easier to carry out institutional evaluation processes (especially
university rankings), as well as to evaluate authors from the same institution, as has
already been done in one way or another on other profile platforms such as the
aforementioned Microsoft Academic Search and, more recently, ResearchGate (Thelwall and
Kousha, 2015). The top universities by GSC (http://webometrics.info/en/node/169) and the
ranking of researchers by country (http://webometrics.info/en/node/116), both developed by
the Cybermetrics Lab at the Spanish National Research Council (CSIC), are examples of
this kind of initiative.
For this reason, the accuracy with which institutions are normalized
can have a major impact on the visibility of institutions in this particular product and,
therefore, on any institution-level studies (with evaluative purposes or otherwise)
that may be carried out using these data, as happens with other bibliographic products
(García-Zorita et al., 2006; Venets, 2014).
The identification and unification of institutions is a difficult and intricate task, full of
complex and unexpected cases, mainly due to the wide variety of forms researchers use to
enter their institutional affiliations (uncommon variations, spelling errors, or even complete
absence of information). The problem is even worse in non-English-speaking countries, where
institutions can be entered in their original form or translated into English. These issues
negatively affect the international visibility of universities, making bibliometric studies
based on affiliations and university rankings troublesome (Bador and Lafouge, 2005;
Van Raan, 2005; Praal et al., 2013; Taşkın and Al, 2014).
Huang et al. (2014) differentiate between two kinds of problems in institutional
affiliations: multiple to simple (MTS), when various institutions share the same
name; and simple to multiple (STM), when one institution can be referred to by several
names. Some of the reasons why the latter phenomenon may occur are translation,
spelling, institutions that change their name, errors, and divisions. These problems are
exacerbated when there is no authority control in the data entry process, and the system
gives users the freedom to identify themselves as they like. The main methods to fix these
errors and reach a complete unification of the variants are clustering through word
similarity, and using variant lists developed by competent authorities. To date, the
literature has proposed quite diverse automatic institutional name disambiguation and
normalization processes (Galvez and Moya-Anegón, 2006, 2007; Jiang et al., 2011;
Cuxac et al., 2013; Hu et al., 2013; Morillo, Aparicio, González-Albo, and Moreno, 2013;
Morillo, Santabárbara, and Aparicio, 2013; Huang et al., 2014), though none of them has
completely solved the problem.
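The two families of methods mentioned above (variant lists and clustering through word similarity) can be illustrated with a minimal sketch. It is not the procedure used by GSC or by any of the cited works: the variant list, the similarity measure (character-level matching from Python's standard `difflib`), the greedy grouping strategy, and the 0.85 threshold are all assumptions chosen for illustration. It only addresses the STM case (one institution, several names); string similarity alone cannot resolve MTS cases.

```python
import re
import unicodedata
from difflib import SequenceMatcher

# Hypothetical variant list; a real one would be maintained by a competent authority.
VARIANTS = {
    "universidad de granada": "University of Granada",
    "ugr": "University of Granada",
    "universitat politecnica de valencia": "Polytechnic University of Valencia",
}


def normalize(name: str) -> str:
    """Lowercase, strip accents and punctuation, and collapse whitespace."""
    decomposed = unicodedata.normalize("NFKD", name)
    unaccented = "".join(ch for ch in decomposed if not unicodedata.combining(ch))
    cleaned = re.sub(r"[^a-z0-9 ]+", " ", unaccented.lower())
    return " ".join(cleaned.split())


def canonical(name: str):
    """Look a name up in the variant list; return None when it is not listed."""
    return VARIANTS.get(normalize(name))


def similarity(a: str, b: str) -> float:
    """Similarity ratio in [0, 1] between two normalized affiliation strings."""
    return SequenceMatcher(None, normalize(a), normalize(b)).ratio()


def cluster(names, threshold=0.85):
    """Greedy single pass: a name joins the first group whose first member
    is similar enough; otherwise it starts a new group."""
    groups = []
    for name in names:
        for group in groups:
            if similarity(name, group[0]) >= threshold:
                group.append(name)
                break
        else:
            groups.append([name])
    return groups


if __name__ == "__main__":
    entries = ["Universidad de Granada", "Universidad de Granda", "University of Oxford"]
    print(cluster(entries))
    print(canonical("Universitat Politècnica de València"))
```

In this sketch the misspelled "Universidad de Granda" is grouped with the correct form, while a translated form such as "University of Granada" would only be unified through the variant list, which is one reason the two approaches tend to be combined.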
These normalization problems can entail a numbness for authors (not being appropriately
linked with their institutions), institutions (diminishing their web visibility by not showing all