Online Information Review, Vol. 42 No. 2, 2018, pp. 222-237
DOI: https://doi.org/10.1108/OIR-08-2015-0276
Received 19 August 2015; Revised 8 August 2017; Accepted 29 August 2017; Published 9 April 2018
Which research institution performs better than average in a subject category or better than selected other institutions?
Lutz Bornmann
Max Planck Society, Munich, Germany
Abstract
Purpose: Institutional bibliometric analyses, as a rule, compare the performance of different institutions. The purpose of this paper is to use a statistical approach which allows not only a comparison of the citation impact of papers from selected institutions, but also a comparison of the citation impact of these institutions' papers with all other papers published in a particular time frame.
Design/methodology/approach: The study is based on a randomly selected cluster sample (n = 4,327,013 articles and reviews from 2000 to 2004), which is drawn from a bibliometric in-house database including Web of Science data. Regression models are used to analyze citation impact scores. Subsequent to the models, average predictions at specific values of interest are calculated to analyze which factors could have an effect on the impact scores: the journal impact factor (JIF) of the journals which published the papers, and the number of affiliations given in a paper.
Findings: Three anonymous German institutions are compared with one another and with the set of all other papers in the time frame. As an indicator of institutional performance, fractionally counted PP(top 50%) values on the level of individual papers are used. This indicator is a normalized impact score whereby each paper is fractionally assigned to the 50 percent most frequently cited papers within its subject category and publication year. The results show that the JIF and the number of affiliations have a statistically significant effect on institutional performance.
Originality/value: Fractional regression models are introduced to analyze the fractionally counted PP(top 50%) on the level of individual papers.
Keywords Bibliometrics, Fractional counting, Highly cited papers, Normalized citation impact
Paper type Research paper
1. Introduction
The chief aim of evaluative bibliometrics, as applied to the performance of research
institutions, is to answer two questions: Is the performance of the institution in question
better or worse than that of particular other institutions? How do the publications of the
institution in question perform compared with all other publications? Normalized citation
impact indicators, which are widely used in bibliometrics today, are well suited to answering
the first question. The indicators allow the comparison of institutions with different subject and time profiles, since the citation impact of every single publication is normalized with respect to subject category and publication year. For example, in the size-independent Leiden Ranking 2017, the Rockefeller University has a PP(top 10%) of 31.2% (i.e. a proportion of ~31 percent of its papers belong to the 10 percent most-cited papers in their subject category and publication year) and thus performs better than Rice University with a PP(top 10%) of 20.4%. However, the approaches currently widespread in bibliometrics are generally incapable of answering the second question.
[Data note: The bibliometric data used in this paper are from an in-house database developed and maintained by the Max Planck Digital Library (MPDL, Munich) and derived from the Science Citation Index Expanded (SCI-E), the Social Sciences Citation Index (SSCI) and the Arts and Humanities Citation Index (AHCI) provided by Clarivate Analytics (Philadelphia, Pennsylvania, USA). The data in the in-house database cover the publication years from 1980 onward.]
Bornmann (2016) therefore presented a statistical approach using regression models
which not only allows a comparison of the citation impact of papers from selected
institutions, but also a comparison of the citation impact of the papers of these institutions
with all other papers published in a particular time frame. The chief advantage of this statistical approach is that it shows how the performance of a paper changes relative to all other papers in a publication year when a particular institution appears on the paper. The present study applies the approach of Bornmann (2016) and adapts it better to the requirements of an evaluation study: first, whereas the regression model of Bornmann (2016) used citation counts as the dependent variable, normalized impact scores are used here in the analysis. Thereby, a new possibility is suggested for analyzing fractionally counted PP(top x%) (Waltman and Schreiber, 2013) on the level of individual papers. The evaluation as
described in Section 2 includes the share of individual papers belonging to the
above-average cited papers in a subject category and publication year. Second, a design is
proposed for the extraction of a cluster sample from the population of all publications in
Web of Science (WoS, Clarivate Analytics). Since it is not possible to include all publications
from the population in the regression models, a sample should be drawn.
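To make the modelling idea concrete, here is a minimal sketch of one common way to fit such a fractional regression: a random cluster sample is drawn (with journals as clusters, purely for illustration) and a quasi-binomial GLM is fitted, which tolerates a fractional response in [0, 1]. All column names and the toy data below are hypothetical; the paper's actual sampling design and model specification may differ.

```python
# Sketch: cluster sampling plus a fractional-logit regression.
# Hypothetical column names (fractional_pp50, jif, n_affiliations,
# journal_id); not the paper's exact model specification.
import numpy as np
import pandas as pd
import statsmodels.api as sm

rng = np.random.default_rng(42)

# Toy stand-in for the bibliometric database: one row per paper.
papers = pd.DataFrame({
    "journal_id": rng.integers(0, 500, size=10_000),
    "jif": rng.gamma(shape=2.0, scale=1.5, size=10_000),
    "n_affiliations": rng.integers(1, 8, size=10_000),
})
# Fractional PP(top 50%) score per paper, bounded in [0, 1].
papers["fractional_pp50"] = rng.uniform(0, 1, size=10_000)

# Cluster sample: draw whole journals at random, keep all their papers.
sampled_journals = rng.choice(papers["journal_id"].unique(), size=100,
                              replace=False)
sample = papers[papers["journal_id"].isin(sampled_journals)]

# Fractional logit: a binomial GLM accepts a fractional response in
# [0, 1]; cluster-robust errors account for the sampling design.
X = sm.add_constant(sample[["jif", "n_affiliations"]])
model = sm.GLM(sample["fractional_pp50"], X,
               family=sm.families.Binomial())
result = model.fit(cov_type="cluster",
                   cov_kwds={"groups": sample["journal_id"]})
print(result.summary())
```

Average predictions at specific values of the covariates (e.g. a low vs a high JIF) can then be obtained from the fitted model, in the spirit of the predictions reported in the abstract.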
2. The evaluation of fractionally counted PP(top 50%) on the level of individual papers
Until recently, the mean normalized citation score (MNCS) (Waltman et al., 2011) was classed as the most important bibliometric indicator for normalizing the citation impact of publications. With this indicator, the citation impact of a paper is normalized with the citation impact of those papers which appeared in the same publication year and in the same subject category. However, percentile-based indicators are increasingly used, since their results are hardly or not at all influenced by a small number of extremely highly cited papers (Bornmann, Leydesdorff and Mutz, 2013; Hicks et al., 2015; Wilsdon et al., 2015). Scopus (Elsevier) has moved to providing percentiles for individual papers. The percentile-based indicator PP(top 10%) is used both in the SCImago Institutions Ranking (Bornmann et al., 2012) and in the Leiden Ranking (Waltman et al., 2012). The current Leiden Ranking in fact provides the MNCS in addition to the PP(top 10%), PP(top 50%) and PP(top 1%) of the individual institutions. Here, the proportion of the x percent most-cited papers in the particular subject category and publication year is provided for each institution.
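As a point of reference, the MNCS of a publication set is the average, over its papers, of each paper's citation count divided by the mean citation count of all papers from the same subject category and publication year. A minimal worked illustration (the field baselines and citation counts below are invented):

```python
# Minimal MNCS illustration: each paper's citations are divided by the
# expected (mean) citations of its subject category and publication
# year, and the resulting ratios are averaged.
field_year_mean = {  # expected citations per (category, year); invented
    ("Chemistry", 2003): 12.0,
    ("Mathematics", 2003): 4.0,
}

papers = [  # (category, year, citations)
    ("Chemistry", 2003, 24),    # ratio 2.0: twice the field average
    ("Mathematics", 2003, 2),   # ratio 0.5: half the field average
    ("Chemistry", 2003, 12),    # ratio 1.0: exactly average
]

ratios = [c / field_year_mean[(cat, yr)] for cat, yr, c in papers]
mncs = sum(ratios) / len(ratios)
print(f"MNCS = {mncs:.2f}")  # (2.0 + 0.5 + 1.0) / 3 = 1.17
```

The example also shows why a single extremely highly cited paper can dominate the MNCS, which is precisely the weakness that percentile-based indicators avoid.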
However, the PP(top x%) indicators are affected by a problem related to their accuracy. Because of the discrete nature of citation distributions, there are many cases where it is impossible to determine accurately the x percent most-cited publications in a subject category (Schreiber, 2012; Waltman and Schreiber, 2013). For example, if we have 100 publications in a subject category (5 with 30 citations each, 10 with 21 citations and 85 with 10 citations), then one could count either 5 percent of the papers among the 10 percent most-cited papers (the five papers with 30 citations) or 15 percent of the papers (the five papers plus the ten papers with 21 citations). However, 5 percent of the papers falls below the share of papers which belong to the 10 percent most cited, and 15 percent exceeds it. Because the papers at the threshold of the x percent most-cited publications exhibit the same number of citations (here 21 citations), they cannot be unambiguously assigned to the highly cited area. Waltman and Schreiber (2013) therefore suggested an approach in which one calculates PP(top x%) for a subject category (in one publication year) with exactly x percent most-cited publications.
With this approach, the publications in one subject category are fractionally assigned to the x percent most-cited publications. This approach is used for the current Leiden Ranking (see www.leidenranking.com). Let us take the example above to explain the approach: 5 percent of the publications in the subject category belong without a doubt to the 10 percent most-cited publications, since they were the most highly cited (with 30 citations). In order to reach the target of exactly 10 percent, the ten papers at the threshold (21 citations each) are each assigned fractionally, with a fraction of 0.5, so that together they contribute the missing 5 percent.
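The arithmetic of this worked example can be reproduced directly. The sketch below implements the fractional assignment rule as described above: groups of papers are taken in descending citation order, and the group straddling the x percent threshold receives exactly the fraction needed to arrive at x percent.

```python
# Fractional assignment for the example: 100 papers in one subject
# category and year -- 5 with 30 citations, 10 with 21, 85 with 10.
from collections import Counter

citations = [30] * 5 + [21] * 10 + [10] * 85
target = 0.10 * len(citations)          # 10 papers' worth of "mass"

counts = Counter(citations)             # papers per citation value
assigned = 0.0
fractions = {}                          # citation value -> fraction
for value in sorted(counts, reverse=True):
    n = counts[value]
    if assigned + n <= target:          # whole group fits below target
        fractions[value] = 1.0
        assigned += n
    else:                               # threshold group: split it
        remaining = max(target - assigned, 0.0)
        fractions[value] = remaining / n
        assigned += remaining

for value, frac in fractions.items():
    print(f"papers with {value} citations: fraction {frac:.2f}")
print(f"total mass: {sum(f * counts[v] for v, f in fractions.items()):.1f}"
      f" of {len(citations)} papers")
# papers with 30 citations: fraction 1.00  (5 papers, 5% of the set)
# papers with 21 citations: fraction 0.50  (10 * 0.5 = the missing 5%)
# papers with 10 citations: fraction 0.00
# total mass: 10.0 of 100 papers
```

Summed over all papers, the fractional assignments yield exactly the 10 percent target, which is the defining property of the Waltman and Schreiber (2013) approach.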