Patent informatics. The issue of relevance in full‐text patent document searches

Date: 20 February 2009
Authors: Marica Starešinič, Bojana Boh
Subject Matter: Information & knowledge management, Library & information science
Marica Starešinič and Bojana Boh
Faculty of Natural Sciences and Engineering, University of Ljubljana,
Ljubljana, Slovenia
Purpose – The purpose of this paper is to discuss the issue of relevance in full-text patent document
searches from the viewpoint of end-users in science and technology. It aims to present three cases of
patent document analysis for relevance, with an additional case of improved search profile with
increased relevance, and to summarise the findings in the form of instructions for users.
Design/methodology/approach – Two methodological approaches were used for the analysis of
patent documents: value-added processing of the bibliographic part of patent documents for the
identification of trends; and structuring of data into systems for the determination of patent relevance.
Overall, four sets of full-text patent documents were analysed, covering the topics of:
microencapsulated phase change materials; digital photography and image sensors; patent
document processing; and patent analysis.
Findings – Value-added analysis of the bibliographic parts of patent documents is a quick and useful
option for the recognition of research trends. However, where non-relevant patent documents are
present in a data set, automatic bibliographic analysis may lead to conclusions that are
mathematically and statistically correct, but that are not reliable or may even be incorrect for the user’s
research. Inadequate terminology is one of the main obstacles to relevant patent searches, especially
where well-defined keywords do not yet exist, as in newly emerging and fast-developing
scientific and technological fields.
Originality/value – Based on the bibliographic and content analyses of patent documents, the paper
provides instructions for users in the form of ten general rules for increasing the relevance of full-text
patent document searches.
Keywords Patents, Databases, Text retrieval, Computer applications
Paper type Research paper
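The caution stated in the Findings can be illustrated with a minimal sketch. The records below are invented for illustration only and are not taken from the four data sets analysed in the paper; the tuple layout (year, relevance flag) and the helper `counts_per_year` are hypothetical. The sketch counts documents per publication year twice, once over the raw hit list and once after documents judged non-relevant have been removed, to show how the apparent trend may change.

```python
from collections import Counter

# Hypothetical hit list: (publication year, judged relevant by a human reader).
# The values are made up for illustration, not data from the paper.
hits = [
    (2005, True), (2005, False),
    (2006, True), (2006, False), (2006, False),
    (2007, True), (2007, False), (2007, False), (2007, False),
]

def counts_per_year(records, relevant_only=False):
    """Count documents per year, optionally keeping only relevant ones."""
    years = [year for year, relevant in records if relevant or not relevant_only]
    return dict(sorted(Counter(years).items()))

raw_trend = counts_per_year(hits)
relevant_trend = counts_per_year(hits, relevant_only=True)

print(raw_trend)       # counts over every hit
print(relevant_trend)  # counts over relevant hits only
```

In this made-up example the raw counts rise each year while the relevant counts stay flat, so a growth trend read from the raw hit list would be arithmetically correct but misleading for the user's research.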
Informatics is defined as the science of collecting, designing, distributing, exchanging,
managing and transforming information. Its area of study is the structure of, and the
interactions between, systems for the storage, processing and communication of
information. The expression was first used by Philippe Dreyfus in 1962 (Dreyfus,
1962), followed by the introduction of similar expressions in other languages, such as
“informatik” in German and “informática” in Spanish.
The increase in patent documents available in specialised databases together with
the development of computer software and hardware triggered the development of
patent informatics, for which some authors have been using a novel expression
“patinformatics” (Trippe, 2003).
The first publications to include the term “informatics” date from before the year 1980.
A preliminary search on the STN International database host showed the largest number
of hits containing the keyword “informatics” in the PROMT (Predicasts Overview of
Markets and Technology) database, with a significant increase in documents after the
year 2000, as illustrated in Figure 1. This observation correlates with the increasing
number of new patent documents in patent databases, as presented in Figure 2.
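A cumulative curve of the kind described for Figure 1 can be derived from yearly hit counts with a running sum. The sketch below assumes hypothetical yearly counts; the numbers are placeholders chosen for illustration, not values retrieved from STN International, and the variable names are not from the paper.

```python
from itertools import accumulate

# Placeholder yearly hit counts for a keyword; not real database results.
yearly_hits = {1998: 3, 1999: 5, 2000: 4, 2001: 10, 2002: 12}

# Sort the years, then pair each year with the running total up to that year.
years = sorted(yearly_hits)
cumulative = dict(zip(years, accumulate(yearly_hits[y] for y in years)))

print(cumulative)
```

Plotting `cumulative` against the year would give a curve that can only rise, which is why a marked change in slope, rather than the absolute level, is what signals faster growth in the number of documents.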
The greatest number of records on new patent documents was available in the
INPADOC (International Patent Documentation Centre) database, followed by
WPINDEX (Derwent World Patents Index) and DGENE (The Derwent GENESEQ
gene sequence). Large fluctuations in the gene patent database DGENE can be
Figure 1. Cumulative number of documents on informatics in the PROMT database
Figure 2. New patent documents in some major databases in science and technology

