Converting and evolving a subject heading list into a thesaurus

Date: 10 July 2024
Pages: 1528-1545
DOI: https://doi.org/10.1108/JD-01-2024-0024
Published date: 10 July 2024
Subject Matter: Library & information science, Records management & preservation, Document management, Classification & cataloguing, Information behaviour & retrieval, Collection building & management, Scholarly communications/publishing, Information & knowledge management, Information management & governance, Information management, Information & communications technology, Internet
Author: Maria Teresa Guaglianone, Giovanna Aracri, Maria Taverniti
Maria Teresa Guaglianone, Giovanna Aracri and Maria Taverniti
National Research Council, Institute of Informatics and Telematics, Rende, Italy
Abstract
Purpose: The objective of this paper is to describe the evolution of the available subject heading list, i.e. the
CC Soggettario (Carabinieri Corps Soggettario), towards a thesaurus, that is CCThes (Carabinieri Corps
Thesaurus), to support subject indexing and retrieval of the documentary heritage held by the Historical Office
of the General Command of the Carabinieri Corps. This work follows the need to implement a controlled
vocabulary compliant with the state-of-the-art standards.
Design/methodology/approach: The methodology implements the practice of reengineering available
vocabularies, following standardised guidelines for thesaurus development. The conversion process balances
the preservation of what has been achieved in the CC Soggettario with the enrichment of the semantic
structure in the thesaurus, using both deductive and inductive methods.
Findings: The main result of this study is a thesaurus compliant with ISO 25964-1:2011 recommendations,
which improves information retrieval performance and interoperability with other vocabularies and applications.
It generally has a mono-hierarchical structure, with the possibility of admitting, as an exception, poly-hierarchy
for a few concepts. An introductory user guide has been created as a complementary tool to the CCThes.
Originality/value: This is an applied study which deals with Knowledge Organisation System (KOS)
reengineering and outlines this process using a pragmatic approach. The paper's strength lies in providing a
description of the performed activities and conveying a set of resources to approach KOS reengineering practice.
The study is also relevant for the preservation and diffusion of a part of the social memory and identity of Italy.
Keywords: Knowledge organisation systems, Controlled languages, Indexing, Information retrieval,
Thesaurus construction, Cultural heritage
Paper type: Research paper
1. Introduction
To properly convey to users what a document is about, to assign it a classification and to
improve document and information retrieval, it is appropriate to represent its content by
means of keywords taken from controlled vocabularies, that is, authoritative lists of
terms to be used in indexing, which are predefined and supported by international standards
(De Keyser, 2012).
Indexing is mainly based on the use of a subject heading list or a thesaurus. Both are
Knowledge Organisation Systems (KOSs), with differences in their development and in the
representation of semantic relationships (Chatterjee, 2017). The former is based on less strict
guidelines, deriving especially from explicit theory and practice; the latter is designed and
constructed by following international standard specifications, which guarantee
consistency and quality and favour interoperability (Dextre Clarke, 2019).
The paper is the result of the collaboration among the Historical Office of the General Command of
the Carabinieri Corps, the Informatics and Telematics Institute secondary branch of Rende (Cs) of the
National Research Council (IIT-CNR) of Italy and the Department of Culture, Education and Society of
the University of Calabria (DiCES-UniCal).
Author contributions: Conceptualization, M.T.G., G.A. and M.T.; methodology, M.T.G., G.A. and M.T.;
investigation, M.T.G., G.A. and M.T.; writing (original draft), M.T.G. (paragraphs 3-4-5) and G.A.
(paragraphs 1-2); writing (review and editing), M.T.G., G.A. and M.T. All authors have read and agreed to
the published version of the manuscript.
The current issue and full text archive of this journal is available on Emerald Insight at:
https://www.emerald.com/insight/0022-0418.htm
Received 31 January 2024
Revised 11 June 2024
Accepted 14 June 2024
Journal of Documentation, Vol. 80 No. 6, 2024, pp. 1528-1545
© Emerald Publishing Limited, 0022-0418
These vocabularies control the form of index terms, provide guidelines for their use and
specify the relationships between terms. These goals can be achieved in different
ways and with targeted strategies: for example, subject heading lists mainly use an alphabetical
presentation, whereas thesauri also provide a systematic organisation, given by the semantic
network. In fact, thesauri contain different types of semantic relationships and a richer, more
specific terminology. Moreover, they allow for an additional explicit visualisation of the relational
structure in the form of categorised lists or graphic displays, which support indexers and users
(Chatterjee, 2017) by helping them to gain a general comprehension of a domain of knowledge and by
outlining inter-relationships between concepts (Aitchison et al., 2000).
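The contrast between an alphabetical list and a semantic network can be illustrated with a minimal sketch in Python. It is not the data model of the CC Soggettario or of the CCThes: the class and method names are hypothetical, and the sample terms are invented for illustration only. The sketch assumes the three relationship types commonly found in thesauri, namely equivalence (UF, "used for"), hierarchy (BT/NT, broader and narrower term) and association (RT, related term).

```python
from __future__ import annotations

from dataclasses import dataclass, field


@dataclass
class Concept:
    """A thesaurus concept identified by its preferred term (hypothetical model)."""
    preferred: str
    used_for: set[str] = field(default_factory=set)  # UF: non-preferred entry terms
    broader: set[str] = field(default_factory=set)   # BT: hierarchical, upwards
    related: set[str] = field(default_factory=set)   # RT: associative


class Thesaurus:
    def __init__(self) -> None:
        self.concepts: dict[str, Concept] = {}

    def add(self, preferred, used_for=(), broader=(), related=()):
        self.concepts[preferred] = Concept(
            preferred, set(used_for), set(broader), set(related)
        )

    def alphabetical(self):
        """The only view a plain alphabetical list offers."""
        return sorted(self.concepts)

    def resolve(self, term):
        """Map any entry term (preferred or not) to its preferred term."""
        if term in self.concepts:
            return term
        for concept in self.concepts.values():
            if term in concept.used_for:
                return concept.preferred
        return None

    def narrower(self, preferred):
        """NT is derived as the inverse of BT."""
        return sorted(
            c.preferred for c in self.concepts.values() if preferred in c.broader
        )

    def polyhierarchical(self):
        """Concepts with more than one broader term."""
        return sorted(
            c.preferred for c in self.concepts.values() if len(c.broader) > 1
        )


# Illustrative terms only; they are NOT taken from the CC Soggettario or the CCThes.
t = Thesaurus()
t.add("armed forces")
t.add("police forces", used_for=["law enforcement agencies"])
t.add("carabinieri", broader=["armed forces", "police forces"],
      related=["military police"])
t.add("military police", broader=["armed forces"], related=["carabinieri"])

print(t.alphabetical())
print(t.resolve("law enforcement agencies"))
print(t.narrower("armed forces"))
print(t.polyhierarchical())
```

The alphabetical view returns only a sorted list of terms, whereas the other three methods rely on the explicitly recorded relationships. In a mostly mono-hierarchical vocabulary, a check such as `polyhierarchical()` would be one simple way to list the concepts admitted as exceptions.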
This paper focuses on the evolution of a subject heading list, i.e. the CC Soggettario
(Carabinieri Corps Soggettario), towards a thesaurus, serving as an indexing and retrieval
tool for the documentary heritage held by the Historical Office of the General Command of the
Carabinieri Corps, which is a fundamental piece of Italian history. The Carabinieri Corps is
one of the Italian police forces, with general functions of control, civil police and public
security service. At the same time, it is the fourth Italian Armed Force, and it contributes to
homeland defence by taking part in international peacekeeping missions, by performing
military police activities and by ensuring security in Italian diplomatic and consular offices
around the world. [1] The ultimate purpose of this conversion is to update and enhance the
content of the CC Soggettario and to implement the CCThes as a more efficient indexing tool,
which makes it possible to provide users with both on-site and remote access to resources and to
promote culture by making available and accessible the knowledge embedded in the vast historical
heritage preserved by the Carabinieri Corps. Finally, the evolution of the CC Soggettario
towards the CCThes arises from the need to be compliant with the state-of-the-art standards
in controlled vocabulary development and use. A direct consequence of conformity to a standard
is the possibility of enriching the range of subjects that can be used for indexing and
classification by including explicitly and conventionally formalised semantic relationships,
which ensure terminological control and interoperability.
The paper has an applied slant and is organised to provide the reader with a detailed
and pragmatic description of the performed activities. The reflections can be used as a sort of
toolkit by the community approaching the KOS reengineering practice. As is known, there are
several studies on how subject heading lists and thesauri should be developed, and on how
the latter can evolve into ontologies. Therefore, the theory on how to convert subject
heading lists into thesauri can be inferred from the literature. Nevertheless, this paper
intends to clarify explicitly the necessary steps and the methodological choices of the
conversion process, a perspective that can usefully complement the current state of knowledge. The
paper is organised as follows: firstly, some initiatives of terminology standardisation and
harmonisation in the military domain are presented; then, the CC Soggettario, as the starting
tool of the study, is described; subsequently, the methodological path that led to the
conversion process is described; lastly, some of the results achieved so far are shown.
2. Background
Reengineering KOSs is a common approach to reusing existing vocabularies instead of
redesigning them from scratch (Soergel, 2009). Several initiatives have focused on converting
glossaries (Hilera et al., 2010), thesauri (Wielinga et al., 2001) or subject headings (Rout, 2018)
into ontologies to improve subject retrieval and to encourage the use of ontology-based
systems. The improvement of a simple structure by means of the identification and
formalisation of conceptual relationships between entities (such as terms, concepts, resources
and data) is quite a common practice in several domains. The aim is to improve information
retrieval and to achieve higher levels of interoperability. This trend is also consolidated in the
military domain where several ontology-based applications have been developed over the
years (Bowman et al., 2001). It is a strategic domain that demands effective and unambiguous
communication to reduce misunderstanding and inefficiency.