Semiotics and indexing: an analysis of the subject indexing process

Published date: 01 October 2001
DOI: https://doi.org/10.1108/EUM0000000007095
Pages: 591-622
Author: Jens‐Erik Mai
Subject Matter: Information & knowledge management, Library & information science
Journal of Documentation, Vol. 57, No. 5, September 2001
© Aslib, The Association for Information Management.
All rights reserved. Except as otherwise permitted under the Copyright, Designs and Patents Act
1988, no part of this publication may be reproduced, stored in a retrieval system, or transmitted in
any form or by any means, electronic, mechanical, photocopying or otherwise without the prior
written permission of the publisher.
Aslib, The Association for Information Management
Staple Hall, Stone House Court, London EC3A 7PB
Tel: +44 (0) 20 7903 0000, Fax: +44 (0) 20 7903 0011
Email: pubs@aslib.com, WWW: http://www.aslib.com
SEMIOTICS AND INDEXING: AN ANALYSIS OF THE SUBJECT
INDEXING PROCESS
JENS-ERIK MAI
jemai@u.washington.edu
The Information School, University of Washington, Seattle
Washington 98195-2840
This paper explains at least some of the major problems related to
the subject indexing process and proposes a new approach to
understanding the process, which is ordinarily described as a process
that takes a number of steps. The subject is first determined, then it
is described in a few sentences and, lastly, the description of the
subject is converted into the indexing language. It is argued that this
typical approach characteristically lacks an understanding of what
the central nature of the process is. Indexing is not a neutral and
objective representation of a document’s subject matter but the
representation of an interpretation of a document for future use.
Semiotics is offered here as a framework for understanding the
‘interpretative’ nature of the subject indexing process. By placing
this process within Peirce’s semiotic framework of ideas and
terminology, a more detailed description of the process is offered
which shows that the uncertainty generally associated with this
process is created by the fact that the indexer goes through a
number of steps and creates the subject matter of the document
during this process. The creation of the subject matter is based on
the indexer’s social and cultural context. The paper offers an
explanation of what occurs in the indexing process and suggests that
there is little certainty about its result.
1. INTRODUCTION
In the literature, the indexing process is often described as a process of multiple
steps. However, discussions have not been concerned with the nature of the
indexing process, but mostly with the last step, that of producing an appropriate
subject entry. The aim of this paper is to present a theoretical framework for
understanding the nature of the indexing process that explains why a predictable
result cannot be expected. The attempt is to explain at least some of the major
problems related to representing the subject matter of documents; more
specifically, to explain the nature of the subject indexing process in a new way. This
study is based on the assumption that it is not possible to make a general
prescription of how to index and explores the indexing process from the perspective
that the process is one of interpretation.
The paper provides an understanding of the subject indexing process that
views the process as a number of interpretations that to some degree depend on
the specific cultural and social context of the indexer. The aim is not to provide a
new and improved method for indexing. The investigation is held at a level
independent of specific indexing languages and indexing practices.
The main problems of representing the subject matter of documents for
retrieval are concerned with meaning and language, more specifically how a
statement can be represented using a few words or symbols. Philosophy of
language is concerned with how meaning is determined and established and how
language can represent reality. There seems to be an overlap of interest between
understanding the subject indexing process and philosophy of language; the
subject indexing process is, therefore, explored here from a philosophy of language
perspective. Others have begun with similar assumptions. Fairthorne (1969), for
instance, noted that ‘special topics can be treated as isolated topics only at the risk
of sterility; therefore some acquaintance with the general problems of language
and meaning is essential’. Blair (1990, pp. vii–viii) notes that: ‘The central task of
information retrieval research is to understand how documents should be
represented for effective retrieval. This is primarily a problem of language and
meaning. Any theory of document representation ... must be based on a clear theory of
language and meaning’. In this respect, this study argues that the subject indexing
process consists of a number of steps that should be viewed as interpretations.
Benediktsson (1989, p. 218) has noted the interpretative nature of the indexing
process and the need for guidelines that recognise the significance of
interpretation: ‘Any sort of bibliographical description ... can be considered descriptive.
When it comes to interpretation, the question is: ought not the description to
follow a method or standard as any canon, which makes interpretation possible?’
The present study will explore the approach to studies of indexing and library
and information science (LIS) suggested by Fairthorne, Blair, Benediktsson and
others.
1.1 Steps in the indexing process
In the literature, the indexing process is often portrayed as involving two, three,
or sometimes even four steps. The two-step approach (cf. e.g. Benediktsson,
1989; Frohmann, 1990) consists of one step in which the subject matter is
determined and a second step in which the subject is translated into and expressed in
an indexing language, i.e.:
1. determine the subject matter of the document;
2. translate the subject matter into the indexing language.
The three-step approach (cf. e.g. Miksa, 1983; ISO, 1985; Farrow, 1991;
Taylor, 1994; Petersen, 1994) adds one more step to the process. The subject is
still determined first. However, a second step is then included in which the subject
matter found in step one is reformulated in more formal language. Thereafter, in
a third step, the more formally-stated subject is further translated into the
explicit terminology of an indexing language, i.e.:
1. determine the subject matter of the document;
2. reformulate the subject matter in a natural language statement;
3. translate the subject matter into the indexing language.
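The three steps listed above can be sketched in code, purely as an illustration. The sketch is not taken from any of the cited authors and is not a method for indexing. All names in it (Indexer, focus_terms, determine_subject, reformulate, translate, index_document, and the small vocabulary) are hypothetical. The indexer's interpretation is reduced, as a loud simplifying assumption, to a list of terms the indexer attends to. The only point the sketch is meant to show is that the same document and the same indexing language can lead to different subject entries when the first step is carried out by different indexers.

```python
from dataclasses import dataclass
from typing import Dict, List


@dataclass
class Indexer:
    """Hypothetical indexer; focus_terms stands in for the indexer's interpretation."""
    name: str
    focus_terms: List[str]


def determine_subject(document: str, indexer: Indexer) -> List[str]:
    # Step 1: determine the subject matter (here: the focus terms found in the text).
    words = [w.strip(".,;").lower() for w in document.split()]
    return [term for term in indexer.focus_terms if term in words]


def reformulate(subject: List[str]) -> str:
    # Step 2: reformulate the subject matter in a natural language statement.
    return "This document is about " + " and ".join(subject) + "."


def translate(statement: str, indexing_language: Dict[str, str]) -> List[str]:
    # Step 3: translate the statement into the indexing language
    # (here: a plain lookup of vocabulary terms that occur in the statement).
    lowered = statement.lower()
    return sorted({heading for term, heading in indexing_language.items()
                   if term in lowered})


def index_document(document: str, indexer: Indexer,
                   indexing_language: Dict[str, str]) -> List[str]:
    subject = determine_subject(document, indexer)
    statement = reformulate(subject)
    return translate(statement, indexing_language)


# Assumed toy data: one document, one indexing language, two indexers.
document = "Peirce treats signs, meaning and interpretation in catalogues."
vocabulary = {
    "signs": "Semiotics",
    "meaning": "Semantics",
    "catalogues": "Library catalogues",
}
indexer_a = Indexer("A", ["signs", "meaning"])
indexer_b = Indexer("B", ["catalogues", "interpretation"])

entries_a = index_document(document, indexer_a, vocabulary)
entries_b = index_document(document, indexer_b, vocabulary)
print(entries_a)
print(entries_b)
```

In this toy setting the two indexers produce different subject entries for the same document, because the outcome of step 1 differs and every later step depends on it. A real indexing process involves far richer interpretation than a term lookup can express.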
The four-step approach (cf. e.g. Langridge, 1989; Chu & O’Brien, 1993) is
similar to the three-step approach in the first two points. The first step determines
the document’s subject matter more or less informally. In the second step, the
indexer then summarises the subject matter of the document more formally,
usually in his or her own vocabulary and in the form of a more compressed statement.
From this point forward, this approach differs from the three-step approach.
Here the translation of the subject matter into an indexing language consists of
two steps rather than a single step. In a third step the indexer translates the
sentences into the vocabulary used in the indexing language. And in a fourth step the
indexer constructs one or more subject entries in the indexing language (in the
form of index terms, class marks or subject headings) with respect to their
syntax and relationships, i.e.:
1. determine the subject matter of the document;
2. reformulate the subject matter in a natural language statement;
3. reformulate the statement into the vocabulary of the indexing language;
4. translate the subject matter into the indexing language.
It should be noted that the idea of ‘steps’ as recounted here has to do chiefly
with the logic of the indexing process, not necessarily with the actual sequence of
mental and physical operations. Some indexers, particularly those who are
beginners in such work, may well accomplish their indexing ‘by the
numbers’, ticking off the steps as they go. However, this is less likely as
experience is gained. In reality, experienced indexers and cataloguers may not be
conscious of the various steps at all, and all steps, regardless of how many one
supposes are most accurate, may well take place almost simultaneously. In short,
an experienced indexer will perform the indexing process in just one complex
action [1]. It is useful, however, to operate with the idea of steps when analysing the
process, because breaking down the process into its individual parts will allow
one to examine it in greater detail.
The three-step approach is chosen here for several reasons. The two-step
model is too simplified in its conception of the subject indexing process. In fact,
the two-step approach appears to be used chiefly as a device to separate two
distinct activities in the subject indexing process: determining the subject of a
document and converting that subject to the terminology of an indexing language. It is
seldom used to discuss the details of the process itself. In contrast, the four-step
approach appears to add an unnecessary complication to the final part of the
process, which consists of the activity of translating the subject of a document into
the terminology of an indexing language. The four-step approach breaks that final
part of the process into two parts, which is not useful, as there is no essential
difference between these two steps but only a difference of general versus specific
activity. In the first of these two final steps, the subject of a document is said to be
translated into the language of a given subject access vocabulary, whereas the
next step only translates the results into indexing terms or strings of terms (i.e. the
syntax) in the system.
[1] Mai (1999) has explored this development of indexers from being novice indexers to
becoming experts.
