LOTKAIAN INFORMETRICS OF SYSTEMS IN WHICH ITEMS CAN HAVE MULTIPLE SOURCES
DOI | https://doi.org/10.1108/S1876-0562(2005)0000005008 |
Pages | 247-294 |
Published date | 20 January 2005 |
Author | Leo Egghe |
VI. LOTKAIAN INFORMETRICS OF SYSTEMS IN WHICH ITEMS CAN HAVE MULTIPLE SOURCES
VI.1 INTRODUCTION
This is a unique chapter in many senses. This chapter deals with informetric systems in which
items can have multiple sources. The most obvious application is, of course, the case where
items are articles which are written by (possibly) more than one author (i.e. multiple sources).
We underline that the goal of this chapter is not to study the informetric functions describing
the number of authors per paper (for this, see Subsection I.4.4) but the number of papers per
author, although the former will be used in the latter. Note that the terminology of this
application coincides with the terminology used in the historical paper Lotka (1926) in which
Lotka's size-frequency function was introduced for the first time. As remarked in Chapter I,
however, Lotka circumvented the problem of dealing with multiple authorship by only giving
a credit (of 1) to the senior author (the other authors were not given any credit), hence, in fact,
Lotka treated the articles as single-author articles. In the next Section VI.2 we will study
other, more realistic ways of crediting sources in systems where items can have multiple
sources. Let us here just mention two possibilities. In an item (e.g. article) with $n \in \mathbb{N}$ sources
(e.g. authors), each author could be given a credit of 1: each author of the article is hence fully
recognized, but an article then gives, in total, $n$ author credits, so articles with different
numbers of authors carry different weights. This is called the "total counting" system.
Alternatively, each author is given a credit of $1/n$, which keeps the total author "weight"
of each paper equal to 1 but makes an author's credit dependent on the total number of authors
of the article. This is called the "fractional counting" system. We will see that the total counting
system is very different from the fractional counting system, which causes a serious problem,
e.g. for evaluation studies. This fact and solutions for this problem will be given in Section
VI.2.
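The difference between the two crediting schemes can be illustrated with a minimal Python sketch. The function name `author_credits` and the toy paper list are hypothetical and chosen only for illustration; the sketch assumes each author appears at most once in a paper's author list.

```python
from collections import defaultdict
from fractions import Fraction


def author_credits(papers):
    """Return (total, fractional) credits per author.

    `papers` is a list of author lists (hypothetical toy data).
    Total counting: every author of a paper receives a credit of 1.
    Fractional counting: every author of an n-author paper receives 1/n.
    """
    total = defaultdict(int)
    fractional = defaultdict(Fraction)  # Fraction() starts at 0
    for authors in papers:
        n = len(authors)
        for author in authors:
            total[author] += 1
            fractional[author] += Fraction(1, n)
    return dict(total), dict(fractional)


# Three toy papers with 2, 1 and 3 authors respectively.
papers = [["A", "B"], ["A"], ["A", "B", "C"]]
total, fractional = author_credits(papers)

# Total counting hands out one credit per (author, paper) pair,
# fractional counting hands out exactly one credit per paper.
total_sum = sum(total.values())
fractional_sum = sum(fractional.values())
```

In this toy data the total credits sum to the number of author slots (6), while the fractional credits sum to the number of papers (3), which is the weighting difference described above.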
Although interesting and important in itself, the topic of Section VI.2 will not be the main
issue of this chapter. The fact that we encounter multiple sources for an item in informetrics
makes this field rather unique: apart from artificial examples that could be produced in any
field, we note that this problem is not encountered in other -metrics sciences (or at least we
have not seen a paper introducing this topic): classical econometrics deals with wealth and
poverty expressed e.g. by the income of people and it is clear that salaries are single sourced;
in biometrics each animal belongs to one species; in linguistics the "token" (= item) is
uniquely linked with one "type" (= source) since here a source is a word (as such) and an item
is the use of this word (e.g. in a text). Even in informetrics, the possibility that items have
multiple sources is a rare phenomenon. In Bradfordian terminology, a source is a journal and
an item is an article in such a journal: it is clear that the article is published in only one
journal. In citation analysis a reference is uniquely linked with the citing article and the same
for citations. In library sciences the borrowing of a book (= item) is uniquely linked with this
book (= source), obviously. The reader can check the many other examples given in Chapter I
to find out that they are all examples of single source items (except, as said, the case of
authors and their publications (and some derived examples, see Section VI.2) and some
artificial examples, discussed in Subsection I.4.4).
This special place of the author/publication system makes us think of a possible special
informetrics theory that is behind it. In Subsections II.2.2 and II.4.2 we showed that Lotka's
law is equivalent with functions such as the ones of Mandelbrot, Zipf, Leimkuhler and
Bradford, but how can this be when Lotka's function represents a multiple-source system for
items while the other functions represent single-source systems for items? Of course we can
answer this question: in the theories developed so far (hence also in Subsections II.2.2 and
II.4.2) we have always used the same informetric system (i.e. IPP) for all these laws, in fact
comparable with what Lotka did by treating the articles as single-authored (as mentioned
above).
Of course, this leaves us with the problem of deriving informetrics theories for the
multiple-source framework for items. Here we immediately encounter the problem of
choosing the scoring system for sources in such a multiple-source framework for items, e.g.
(but these are not the only possibilities) the total scoring system or the fractional scoring
system (briefly discussed above and more exhaustively in Section VI.2 to come). Let us
discuss the problem and possibilities of solution for both scoring systems separately since
they are completely different.
Problem VI.1.1 (Total scoring system)
To fix the ideas, let us consider the discrete case. In this case the only values for "the
number of items per source" are the natural numbers $n = 1, 2, 3, \ldots$. So the size-frequency
function $f(n)$, describing the number of sources with $n$ items, has the same arguments as the
one we have discussed already throughout this book. The same elementary observations,
developed in Subsection I.3.1, can be made here: $f > 0$ and $f$ decreasing in $n$; hence also
Proposition I.3.1.1 is valid: there exists a constant $D \geq 0$ such that
$\lim_{n \to \infty} f(n) = D$. We can refer to Kretschmer and Rousseau (2001) where $f(n)$, the
number of authors with $n$ papers, in the total counting system, is not decreasing but only in
those cases where there are more than 50 authors per paper: in all other cases, where the
number of authors per paper is below 40 (still very high!), $f(n)$ is empirically seen to be
decreasing and even of Lotka type.
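As a rough illustration of how $f(n)$ is obtained under total counting, the following Python sketch tabulates the number of authors with $n$ papers from a list of author lists. The function name `size_frequency_total` and the data are hypothetical toy values, not empirical results, and the sketch assumes each author appears at most once per paper.

```python
from collections import Counter


def size_frequency_total(papers):
    """Size-frequency function f(n) under total counting.

    `papers` is a list of author lists (hypothetical toy data).
    Returns a Counter mapping n -> number of authors credited with n papers,
    where each author of a paper receives a full credit of 1.
    """
    # Step 1: papers per author (every co-author gets credit 1 per paper).
    papers_per_author = Counter(
        author for authors in papers for author in authors
    )
    # Step 2: count how many authors share each value of n.
    return Counter(papers_per_author.values())


papers = [["A", "B"], ["A"], ["A", "C"], ["D"], ["B", "E"]]
f = size_frequency_total(papers)

# Check whether f is non-increasing on the observed values of n.
values = [f[n] for n in sorted(f)]
is_non_increasing = all(x >= y for x, y in zip(values, values[1:]))
```

Here authors C, D and E have one paper each, B has two and A has three, so the tabulated $f$ is non-increasing on this toy sample; real data with very large author teams need not behave this way, as noted above.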
We can remark here that Egghe (1995) extends the success-breeds-success principle,
introduced in Subsection I.3.6, to the case that items can have multiple sources and where,
possibly, non-decreasing size-frequency functions can occur. The model is, however, theoretic
and does not yield analytic forms of these functions (in a way understandable because of the
intricate nature of such exceptional size-frequency functions; see e.g. Fig. 1, p. 612 in
Kretschmer and Rousseau (2001)).
Also the following argument, based on the results of Subsection I.3.2, can be convincing in
accepting that the size-frequency function of the total counting system is Lotkaian.
Argument VI.1.1.1 (Egghe):
Lotkaian informetrics where items can only have one source implies Lotkaian informetrics
where items can have multiple sources and where the total scoring system for sources is
applied.