ON THE GROWTH OF BIBLIOGRAPHIES WITH TIME: AN EXERCISE IN BIBLIOMETRIC PREDICTION

DOI: https://doi.org/10.1108/eb026847
Published date: 01 April 1989
Pages: 302-317
Author: QUENTIN L. BURRELL
Subject Matter: Information & knowledge management, Library & information science
QUENTIN L. BURRELL
Statistical Laboratory, Department of Mathematics, University of Manchester,
Manchester M13 9PL
Recent work has shown that potentially useful predictions of the circulation of
library materials can be made which do not require very restrictive assumptions
about underlying probability distributions. In the same spirit, we here consider
one of the classic problems of bibliometrics, viz. predicting the number of 'new'
journals carrying 'relevant' articles in the future, using both established
parametric approaches and the newer, empirical methods.
1. INTRODUCTION
IN HIS FOUNDING PAPER ON BIBLIOMETRICS, S.C. Bradford [1]
sought ways of providing an 'efficient service for abstracting and indexing
scientific and technical literature'. As is surely well known, Bradford's
investigation concentrated on the productivity of journals; more particularly
he sought to identify those journals which were 'relevant' in the sense of
publishing articles in a particular subject area during the period of study. His
empirical studies gave rise to the so-called Bradford Law of bibliometrics
which has been considered by many authors and from various standpoints.
The Bradford Laws are not our concern here, although one aspect of
Bradford's original article, which seems to have been largely overlooked in the
following half-century, provides the motivation for this study.
In Bradford's context, we have a population of academic journals and in
compiling a bibliography we wish to identify those journals which produce
articles relevant to our field of interest and the numbers of articles so provided
by each. While a number of authors have concentrated on the importance of
having as nearly as possible a complete search of the possible sources, an
important practical point raised by Bradford is that 'even when the actual
producers during a period of years had been ascertained, new sources would
certainly appear during a further period' [1]. In other words, even if our
bibliography is complete so far as the period of study is concerned, if the
search is extended in time then new producers will inevitably transpire.
Similarly, in his reassessment of Bradford's work, M.G. Kendall [2] concluded
by remarking that 'the problem of the outliers is not completely described in
terms of existing bibliographies. There is also a non-observed class of journals
which have not carried a relevant article in the period examined but may do so
at any moment in the future'.
Journal of Documentation, Vol. 45, No. 4, December 1989, pp. 302-317.
We consider this problem of trying to predict the number of new producers
in the next section and illustrate the results with Bradford's and Kendall's
original data. Several of the methods used come within the realm of what is
usually termed an empirical Bayes approach and require a minimal set of
assumptions. The mathematical derivations of the prediction formulae are
given in an Appendix.
2. ADDRESSING THE PROBLEM
Suppose that we are interested in the journal coverage of some particular
well-defined subject area in that we wish to identify all of the papers relevant
to the subject and the journals in which they appear. In our study we therefore
carry out a complete search of journals over a certain period of time. The actual
length of this time period is not of direct interest but is chosen in some
convenient fashion. For instance, in the data we shall consider, Bradford's
data on papers in applied geophysics cover the years 1928 to 1931, while
Kendall's data relate to a bibliography on operational research issued by the
Operational Research Society of America in 1958 and covering all papers in
the area up to that time. Whatever may be the actual length of the period, it
does no harm to take this as defining the unit of time. (We shall see later that
in many circumstances we should drop the idea of 'real time', although it is
convenient to retain it for the present discussion.) The problem may then be
simply stated as: 'If we now continue our search for a further period of length
t, how many new producers (i.e. journals carrying a relevant paper in this
further period of study but which did not carry a relevant paper in the original
period) will come to light?'
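To make the quantity being predicted concrete, the following minimal Python sketch counts new producers from two sets of search results. The function name `count_new_producers`, the dictionary layout and the journal titles are illustrative assumptions only; they are not taken from Bradford's or Kendall's data.

```python
def count_new_producers(original_counts, further_counts):
    """Count journals with no relevant paper in the original period
    but at least one relevant paper in the further period.

    Both arguments map a journal title to its number of relevant papers
    (a hypothetical layout, chosen purely for illustration).
    """
    return sum(
        1
        for journal, n_further in further_counts.items()
        if n_further > 0 and original_counts.get(journal, 0) == 0
    )


# Made-up counts for three journals.
original = {"Journal A": 5, "Journal B": 1}
further = {"Journal A": 3, "Journal B": 0, "Journal C": 2}
new_producers = count_new_producers(original, further)
```

Here only "Journal C" qualifies: it carried nothing relevant in the original period and something relevant in the further one.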
Let us be clear at the outset that the 'classical' laws of bibliometrics -
Bradford, Lotka, Mandelbrot, Zipf, Price - have nothing to say regarding this
problem, their common drawback here being that they do not include a time
parameter in their standard forms. In this sense they are static and at best
descriptive laws while what is needed is an approach which is both dynamic
and predictive.
In the problem of direct interest here, and in many others, the crucial
observation is not just that there is variability but that there are two quite
distinct manifestations of the variability - there are variations between
journals so far as the number of relevant papers published is concerned and,
just as importantly, there is variation with time so far as the numbers
published by a particular journal are concerned. As has been argued
previously (e.g. Burrell [3-5], Sichel [6]), the natural way to model such
phenomena mathematically is by means of stochastic point processes and in
particular this is readily done through mixtures of counting processes.
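As a rough illustration of a mixture of counting processes, the sketch below simulates journals as independent Poisson processes whose rates are drawn from a gamma distribution. Both the Poisson form and the gamma mixing distribution are assumptions made for this example only, and the parameter values are arbitrary; the discussion above does not commit to either. Under these assumptions the expected number of new producers has a simple closed form, against which the simulation can be checked.

```python
import random


def simulate_new_producers(n_journals, shape, scale, t, rng):
    """Simulate one population and return the number of new producers.

    Each journal publishes relevant papers as a Poisson process on [0, 1 + t)
    with its own rate drawn from Gamma(shape, scale). The original period is
    [0, 1) and the further period is [1, 1 + t).
    """
    new_producers = 0
    for _ in range(n_journals):
        rate = rng.gammavariate(shape, scale)  # journal-specific rate
        if rate <= 0.0:
            continue  # guard against numerical underflow to zero
        in_original = 0
        in_further = 0
        clock = rng.expovariate(rate)  # time of the first paper
        while clock < 1.0 + t:
            if clock < 1.0:
                in_original += 1
            else:
                in_further += 1
            clock += rng.expovariate(rate)  # exponential gap to next paper
        if in_original == 0 and in_further > 0:
            new_producers += 1
    return new_producers


def expected_new_producers(n_journals, shape, scale, t):
    """Closed-form mean under the gamma-Poisson assumption.

    P(no paper in [0, 1), at least one in [1, 1 + t))
        = E[exp(-rate)] - E[exp(-rate * (1 + t))],
    and E[exp(-s * rate)] = (1 + scale * s) ** (-shape) for a gamma rate.
    """
    p_silent_original = (1.0 + scale) ** (-shape)
    p_silent_throughout = (1.0 + scale * (1.0 + t)) ** (-shape)
    return n_journals * (p_silent_original - p_silent_throughout)


rng = random.Random(0)  # fixed seed so the run is repeatable
simulated = simulate_new_producers(20000, shape=0.5, scale=2.0, t=1.0, rng=rng)
expected = expected_new_producers(20000, shape=0.5, scale=2.0, t=1.0)
```

With a large simulated population the two numbers should agree to within sampling error. The population size is treated as known here purely for convenience; in a real bibliography it generally is not.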
Thus we suppose that there is a population of N potentially productive
journals each of which publishes papers of interest according to some
stochastic point process, the rates of these varying over the population. Of
course the population here is ill-defined and, indeed, even its size is unknown
in general. So far as the mathematics is concerned, however, it turns out that