# BASIC THEORY OF LOTKAIAN INFORMETRICS

Chapter II. In: Power laws in the information production process: Lotkaian informetrics. DOI https://doi.org/10.1108/S1876-0562(2005)0000005004. Pages 101-156. Publication Date 20 Jan 2005. Author Leo Egghe.
II.1 GENERAL INFORMETRICS THEORY
Before developing the Lotkaian informetrics theory, we describe formally the functions
that are needed in general informetrics (e.g. we give a formal description of the functions
introduced in Section I.1) and we prove their properties. The basis of this general theory is
the duality between sources and items, i.e. their interchangeability, or the possibility of
replacing "produces" by "is produced by". Exact definitions will be given, each followed by a
heuristic interpretation. In this way we hope to give an intuitively clear description of the
necessary formalism, which limits itself to aspects of real analysis (mainly derivatives and
integrals). The reader who wants to refresh these techniques is referred to the vast
literature on real analysis, e.g. Apostol (1957) or Protter and Morrey (1977). The reasons for
using derivatives and integrals were given in the first chapter: the simplicity of the
functional formalism and its calculations (as compared to the evaluation of sums), the
simplicity of the relations between the different functions and their parameters, and the
applicability to practical cases of generalized bibliographies. Indeed, the larger the
datasets (and large datasets are more and more common because of literature growth and the
automation of its collection), the better they are represented by continuous models
(rational, non-integer numbers), as will become clear in Chapter VI.
II.1.1 Generalized bibliographies: Information Production Processes (IPPs)
An information production process (IPP) is a triple of the form $(S,I,V)$ where $S = [0,T]$ is the
set of sources, $I = [0,A]$ is the set of items and where

$$V : S \to I \tag{II.1}$$

is a strictly increasing differentiable function such that $V(0) = 0$ and $V(T) = A$.
Intuition: $S$ and $I$ stand for the discrete sets of sources $\{1,2,\ldots,T\}$ and of items
$\{1,2,\ldots,A\}$ respectively, hence the interval lengths ($T$ and $A$ respectively) stand for
the total number of sources and of items. The function $V$ represents the cumulative number of
items in the least productive sources. Since we will identify $S$ with the source rankings (in
decreasing order of number of items), $V(r)$ (for $r \in [0,T]$) denotes the cumulative number of
items in the sources belonging to the interval $[T-r,T]$ (i.e. the sources in the interval of
length $r$ containing the least productive sources).
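To make this intuition concrete, here is a minimal Python sketch of how a function $V$ could be obtained from discrete data. The name `make_V`, the sample productivities and the linear interpolation between integer ranks are illustrative assumptions, not part of the theory above. Note also that the interpolated $V$ is only piecewise differentiable, whereas the definition asks for a differentiable $V$.

```python
from itertools import accumulate

def make_V(items_per_source):
    """Return (V, T, A) for a toy IPP built from discrete productivities.

    Assumption: every source has at least one item, so V is strictly
    increasing. V(r) is the cumulative number of items in the r least
    productive sources, extended to real r by linear interpolation.
    """
    counts = sorted(items_per_source)        # least productive first
    cum = [0] + list(accumulate(counts))     # cum[k] = items in the k least productive sources
    T, A = len(counts), cum[-1]

    def V(r):
        if not 0 <= r <= T:
            raise ValueError("r must lie in [0, T]")
        k = min(int(r), T - 1)               # segment [k, k+1] that contains r
        return cum[k] + (r - k) * counts[k]  # interpolate inside that segment

    return V, T, A

# Hypothetical sample data: five sources with 1, 1, 2, 3 and 5 items.
V, T, A = make_V([1, 1, 2, 3, 5])
```

For this sample, $T = 5$ and $A = 12$, and `V(2.5)` equals 3: the two least productive sources hold 2 items and half of the third source adds 1 more.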
We next introduce the notion of duality. Let $(S,I,V) = ([0,T],[0,A],V)$ be an IPP as defined
above. The dual IPP of the IPP $(S,I,V)$ is defined to be the IPP

$$(I,S,U) = ([0,A],[0,T],U), \tag{II.2}$$

where $U : I \to S$ is the function

$$U(i) = T - V^{-1}(A-i) \tag{II.3}$$

for all $i \in [0,A]$. Note that $V^{-1}$ (the inverse of $V$) exists since $V$ is strictly increasing.
Note also that it follows from (II.3) that $i = V(r) \Leftrightarrow T - r = U(A-i)$.
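The construction of the dual function can be sketched numerically. The example below assumes a toy IPP with $V(r) = r + r^2$, $T = 2$ and $A = 6$ (chosen only because its inverse has a closed form; it is not taken from the text) and builds $U$ exactly as in (II.3).

```python
import math

# Assumed toy IPP: V(r) = r + r^2 on [0, T], so that V(0) = 0 and V(T) = A.
T, A = 2.0, 6.0

def V(r):
    return r + r * r

def V_inv(i):
    # Closed-form inverse of V on [0, A]: the non-negative root of r^2 + r - i = 0.
    return (-1.0 + math.sqrt(1.0 + 4.0 * i)) / 2.0

def U(i):
    # Dual function as defined in (II.3).
    return T - V_inv(A - i)

# Spot check of the equivalence i = V(r) <=> T - r = U(A - i) at r = 0.5.
r = 0.5
i = V(r)
gap = U(A - i) - (T - r)   # should vanish up to rounding
```

In this sketch `U(0.0)` is 0 and `U(A)` equals `T`, so $U$ satisfies the same boundary conditions on $[0,A]$ that $V$ satisfies on $[0,T]$.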
Intuition: When interchanging the words "source" and "item" in the IPP $(S,I,V)$ we obtain the IPP
$(I,S,U)$. Now, for each $i \in [0,A]$, $U(i)$ denotes the cumulative number of sources that produce
the items belonging to the interval $[0,i]$.
Note that, since $V$ is differentiable and strictly increasing, the same is true for $U$. Denote, for
every $i \in I$ and $r \in S$,

$$\sigma(i) = U'(i) \tag{II.4}$$

$$\rho(r) = V'(r) \tag{II.5}$$

We will (with an acceptable abuse of notation) denote

$$\rho(i) = V'(r)$$

if and only if $i = V(r)$. In other words

$$\rho(i) = V'(V^{-1}(i)) \tag{II.6}$$

for all $i \in I$.
Intuition: For each $r \in S$, $\rho(r)$ is the density function of the items in the source $T - r$ and, for
each $i \in I$, $\sigma(i)$ is the density function of the sources in the item $i$.
We have the following result:
Proposition II.1.1.1:

$$\sigma(i) = \frac{1}{\rho(A-i)} \tag{II.7}$$

for every $i \in I$.
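Before turning to the proof, the relation $\sigma(i) = 1/\rho(A-i)$ between the two densities can be checked numerically. The sketch below assumes the toy IPP $V(r) = r + r^2$ with $T = 2$ and $A = 6$ (an illustrative choice, not from the text) and approximates $U'$ by a central finite difference; the helper names `rho_item`, `sigma` and `check` are hypothetical.

```python
import math

# Assumed toy IPP: V(r) = r + r^2 on [0, T] with T = 2, A = 6.
T, A = 2.0, 6.0

def V_inv(i):
    # Inverse of V(r) = r + r^2 for i >= 0.
    return (-1.0 + math.sqrt(1.0 + 4.0 * i)) / 2.0

def U(i):
    # Dual function U(i) = T - V^{-1}(A - i).
    return T - V_inv(A - i)

def rho(r):
    # rho(r) = V'(r) for the assumed V.
    return 1.0 + 2.0 * r

def rho_item(i):
    # Abuse of notation: rho evaluated at the item i, i.e. V'(V^{-1}(i)).
    return rho(V_inv(i))

def sigma(i, h=1e-6):
    # sigma(i) = U'(i), approximated by a central finite difference.
    return (U(i + h) - U(i - h)) / (2.0 * h)

def check(i):
    # Return both sides of the relation sigma(i) = 1 / rho(A - i).
    return sigma(i), 1.0 / rho_item(A - i)

left, right = check(3.0)   # the two values should agree up to discretisation error
```

A finite difference is used here only to avoid assuming the result being checked; an interior point such as $i = 3$ keeps both $i - h$ and $i + h$ inside $[0,A]$.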
Proof: By (II.3)