Date: 20 January 2005
Published date: 20 January 2005
Author: Leo Egghe
The purpose of this book was to present a unified theory of Lotkaian informetrics so that the
informetrics community (and even beyond informetrics) has a tool that can be used in further
applications. Indeed, although the law of Lotka and its equivalent expressions (Chapter II) are
interesting in themselves, we should always try to apply them in other informetrics issues. If
this can be done we are far ahead of research that always starts from zero and tries to explain
certain regularities without referring to earlier results. In this sense Chapters III, IV, V and
VI could be considered as applications in Lotkaian informetrics but, because of their special
importance and far-reaching consequences in informetrics, we devoted a special chapter to
each of them.
This book could end here were it not that there exist further interesting applications of the
Lotka function in diverse topics. They are gathered in this closing chapter. We start, in
Section VII.2, with a warning that some regularities found in informetrics (exact or
approximate) are not explained using informetric laws but simply by plain mathematics. We
give two examples: one on the well-known arcs at the end of a Leimkuhler curve, which are
exact exponential functions on a log-scale (i.e. linear functions of the original variable), and
one on a relation called the type-token identity (see Chen and Leimkuhler (1989)), which is not
correct but which can be explained mathematically in an approximate way.
Section VII.3 deals with a regularity that is, in essence, also not informetric in nature: a
probabilistic argument leads to the empirically verified power law. It concerns the graph of
Wallace (1986) on the relation between the number of articles per journal and the journal
296 Power laws in the information production process: Lotkaian informetrics
median citation age. We will prove that the graph is below a decreasing power function with
exponent 2, but this is not a consequence of Lotkaian theory but of the Central Limit Theorem
in probability theory. Only the fact that the cloud of points becomes thinner for high
numbers of articles per journal is explained by Lotka's law.
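As a hedged illustration of why a Central Limit Theorem argument can produce an exponent of 2, consider the following sketch. It is not the derivation of Section VII.3 itself. It assumes that a journal's citation age is summarised by an average of n independent article ages with common mean \mu and standard deviation \sigma (a mean is used here in place of the median purely for simplicity), and the symbols n, z and E are introduced only for this illustration.

```latex
\bar{X}_n \approx \mathcal{N}\!\left(\mu, \frac{\sigma^{2}}{n}\right)
\;\Longrightarrow\;
\left|\bar{X}_n - \mu\right| \le \frac{z\,\sigma}{\sqrt{n}}
\quad \text{(with high probability, for a fixed quantile } z\text{)}
\;\Longleftrightarrow\;
n \le \frac{E}{\left(\bar{X}_n - \mu\right)^{2}},
\qquad E = z^{2}\sigma^{2}.
```

Under these assumptions, a journal with many articles can only lie close to the overall mean age, so the points stay below a decreasing power function with exponent 2 in the deviation from that mean.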
Section VII.4 deals with another topic on multi-authored articles (cf. Chapter VI), namely
the distribution of the rank of an author in an m-authored paper. We show that, if the
number of authors per paper follows Lotka's law (cf. Subsection I.4.4 and Chapter VI), the
author rank distribution follows the same Lotka law. We further determine author ranks using
author seeds, i.e. universal numbers indicating the general place of an author name, e.g. as
expressed by the alphabetic order.
A very important application of Lotkaian informetrics is given in Section VII.5. There we
determine the so-called "first-citation distribution", i.e. (e.g. in a bibliography) the overall
cumulative distribution of the time period between the publication of an article and the time it
receives its first citation. We can explain it using an exponentially decreasing age function
(the time distribution of all citations) and a Lotka law for the number of citations for an article
in this bibliography. Note that both functions are applicable in one model, due to the
arguments given in Subsection
The application is remarkable since it is capable of
explaining concave as well as S-shaped cumulative distribution functions and - what is even
more interesting - the difference between the two types is characterized in terms of the
Lotkaian exponent α. Hence Lotka's size-frequency function is decisive in this matter. In this
section we also mention an application of aging functions (such as the exponential one) to the
explanation of the Price index as a function of the mean and median reference age.
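The dependence of the curve's shape on the Lotkaian exponent can be explored numerically. The sketch below is an illustration under a stated assumption: it takes the cumulative first-citation distribution to have the form gamma * (1 - aging_rate**t) ** (alpha - 1), where gamma is the fraction of articles that are ever cited, aging_rate (between 0 and 1) is the base of the exponential age function and alpha is the Lotkaian exponent. This functional form, the parameter names and the numerical values are assumptions made for the example and are not quoted from the text above.

```python
def first_citation_cdf(t, gamma, aging_rate, alpha):
    """Assumed model: cumulative fraction of articles first cited by time t."""
    return gamma * (1.0 - aging_rate ** t) ** (alpha - 1.0)


def is_s_shaped(gamma, aging_rate, alpha, t_max=30.0, steps=3000, tol=1e-12):
    """Return True if the curve has a convex part (hence an S-shape) on (0, t_max).

    Convexity is detected with central second differences on a regular grid;
    a purely concave curve never produces a positive second difference.
    """
    h = t_max / steps
    for i in range(1, steps):
        left = first_citation_cdf((i - 1) * h, gamma, aging_rate, alpha)
        mid = first_citation_cdf(i * h, gamma, aging_rate, alpha)
        right = first_citation_cdf((i + 1) * h, gamma, aging_rate, alpha)
        if right - 2.0 * mid + left > tol:
            return True
    return False


# Hypothetical parameter values, chosen only to show both regimes.
for alpha in (1.5, 2.0, 3.0):
    shape = "S-shaped" if is_s_shaped(0.5, 0.8, alpha) else "concave"
    print(f"alpha = {alpha}: {shape}")
```

Under the assumed form, the second derivative changes sign only when alpha - 1 exceeds 1, so the check reports a concave curve for exponents up to 2 and an S-shaped curve above 2.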
The closing Section VII.6 deals with the intricate problem of determining the rank-frequency
and size-frequency functions for N-grams and N-word phrases. We derive these functions
using Zipf's function as rank-frequency function for the constituting parts of the N-gram
(letters) or of the N-word phrase (words) and using a technique of N-product space, being the
Cartesian product of the IPPs of the constituting parts. The theory presented here is exact in
the sense that no approximations are used and in this sense improves earlier results of the
author. Using the size-frequency function we will also determine (as in Chapter III) formulae
Further applications in Lotkaian informetrics 297
for the (Type/Token) average μ_N and the Type/Token-Taken average μ*_N of N-grams and
N-word phrases, and the values of μ_N and μ*_N are compared. This final section also gives rise to
(hard) problems concerning N-grams and N-word phrases.
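The product-space idea can be illustrated with a small numerical sketch. It is not the exact derivation of Section VII.6. It assumes that each constituting part (letter or word) has Zipf frequencies proportional to 1/r**beta and that the parts of an N-gram are chosen independently, so the frequency of an N-gram is the product of the frequencies of its parts. The function names, the parameter beta and the sizes used are hypothetical.

```python
from itertools import product


def zipf_probabilities(m, beta=1.0):
    """Normalised Zipf frequencies for ranks 1..m (assumed form 1 / r**beta)."""
    weights = [1.0 / r ** beta for r in range(1, m + 1)]
    total = sum(weights)
    return [w / total for w in weights]


def ngram_rank_frequency(m, n, beta=1.0):
    """Rank-frequency table of N-grams built on the n-fold Cartesian product.

    Each N-gram is identified by the tuple of ranks of its parts; its
    frequency is the product of the part frequencies (independence assumed).
    """
    p = zipf_probabilities(m, beta)
    combos = []
    for ranks in product(range(1, m + 1), repeat=n):
        prob = 1.0
        for r in ranks:
            prob *= p[r - 1]
        combos.append((prob, ranks))
    # Sort by decreasing frequency; ties are broken by the rank tuple.
    combos.sort(key=lambda item: (-item[0], item[1]))
    return [(rank, ranks, prob) for rank, (prob, ranks) in enumerate(combos, start=1)]


# Hypothetical example: 2-grams over an alphabet of 5 symbols.
table = ngram_rank_frequency(m=5, n=2)
for rank, ranks, prob in table[:5]:
    print(rank, ranks, round(prob, 4))
```

Sorting the product space by decreasing frequency gives an empirical rank-frequency function for the N-grams, which can then be compared with any closed-form expression.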
Real regularities need explanations. Sometimes these explanations are elementary and are not
of an informetric nature (i.e. we do not need Lotka's law or subsequent results to explain them).
The researcher should be able to make a distinction between the various types of
explanations. In this section we will give two examples of regularities that can be explained
via plain (simple) mathematics.
VII.2.1 The arcs at the end of a Leimkuhler curve
One of the simplest regularities ever found in informetrics, but which is not an informetric
regularity at all, is the fact that, at the end of a Leimkuhler curve, one detects "arcs". One
obtains a Leimkuhler curve when graphing the cumulative number G(r) of items in the first
(largest) r sources, versus log r (any log can be used, e.g. ln). The graph looks as in Fig. VII.1
and can be found, for example, in Warren and Newill (1967), Brookes (1973), Praunlich and
Kroll (1978), Wilkinson (1973) and Summers (1983).
Fig. VII.1. A Leimkuhler curve, with arcs for large r.
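The construction of the curve, and the purely arithmetical origin of the arcs, can be checked with a short sketch. The data below are invented for illustration. The point is only that, among sources with the same number of items, G(r) grows linearly in r and therefore exponentially in log r.

```python
import math
from itertools import accumulate


def leimkuhler_points(item_counts):
    """Return (ln r, G(r)) pairs, where G(r) is the cumulative number of
    items in the r largest sources."""
    ranked = sorted(item_counts, reverse=True)
    cumulative = accumulate(ranked)
    return [(math.log(r), g) for r, g in enumerate(cumulative, start=1)]


# Hypothetical source sizes: a few large sources, then long runs of ties.
counts = [40, 22, 15, 9, 6, 4, 3, 3] + [2] * 10 + [1] * 30
points = leimkuhler_points(counts)

# The last 30 sources all have one item, so on this stretch G(r) = r + constant.
tail = points[-30:]
offsets = [g - math.exp(x) for x, g in tail]
print(min(offsets), max(offsets))
```

Since G(r) minus exp(ln r) stays constant along the tail, the final arc is an exact exponential function of the log-scale abscissa, with no informetric law involved.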
Although the graph (without the arcs) has an equation of the form (see formula (II.43))
