A study of metadata element co‐occurrence

Published date	01 July 2006
DOI	https://doi.org/10.1108/14684520610686319
Pages	428-453
Author	Jin Zhang, Iris Jastram
Subject Matter	Information & knowledge management, Library & information science

A study of metadata element co-occurrence

Jin Zhang and Iris Jastram

School of Information Studies, University of Wisconsin, Milwaukee, Wisconsin, USA

Abstract

Purpose – This paper aims to investigate internet web page metadata usage behavior in terms of metadata element co-occurrences. Metadata are designed to help both web publishers/authors organize their web pages and search engines index those pages accurately.

Design/methodology/approach – This study examines the types of metadata elements employed by different professional groups of web authors, the number of elements they prefer to use, and the types of element combinations they typically embed in their pages’ HTML code.

Findings – The findings reveal that the “keyword” and “description” elements were the most popular single elements. The most popular combination of two elements was that of “keyword and description”. Very few authors included combinations of five elements. This study also shows that preferences for element combinations varied by domain.

Originality/value – This approach will enhance the current understanding of metadata usage behavior and may help search engine designers as they continue their quest for improved indexing and retrieval of web pages.

Keywords Behaviour, Internet, Information organizations

Paper type Research paper

Online Information Review, Vol. 30 No. 4, 2006, pp. 428-453. © Emerald Group Publishing Limited, 1468-4527.

Refereed article received 28 February 2006. Revision approved for publication 15 April 2006.

The current issue and full text archive of this journal is available at www.emeraldinsight.com/1468-4527.htm

1. Introduction

The proliferation of information on the internet has made information retrieval from that resource a challenging discussion topic for researchers, search engine and subject index developers, and internet users alike. Metadata could help this situation if it were used consistently and well. There is no centralized control over the form or content of embedded metadata, however, which causes many to fear that it is too easily misused or abused. Given the vast potential metadata possesses to enhance internet information organization and retrieval, and given the equally vast potential for internet resource creators to misrepresent their pages through metadata (either accidentally or maliciously), researchers have focused their efforts either on the theoretical side or the practical side of metadata implementation: how metadata can or should be used and how metadata is being used.

The most commonly used metadata scheme on the internet is the HTML “meta” tag. The researchers found that of the 2,400 pages visited, 62.83 percent included this type of metadata embedded in the HTML code. This is a much greater percentage than the 7.42 percent of pages containing Dublin Core and the 44.12 percent containing any other scheme of metadata. This scheme has no standardized element set, leaving the choice of the type and quantity of elements entirely up to the resource author. This type of metadata can be found in the source code of a web page in the format <META name="[tag name, such as keywords]" content="[metadata content, such as a list of keywords]"/>. Because the author has complete control over the type and quantity of metadata elements, this scheme allows metadata to be as simple or as complex as the author wishes.
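To make the META format described above concrete, the following minimal Python sketch reads the name attribute of each META tag in a page and tallies which element names occur together. It is an illustration only and should not be read as the procedure used in this study, which this section does not describe. The names `MetaNameCollector`, `meta_names`, and `pair_counts` are hypothetical, and only Python's standard library is assumed.

```python
from collections import Counter
from html.parser import HTMLParser
from itertools import combinations


class MetaNameCollector(HTMLParser):
    """Collects the name attribute of every META tag fed to the parser."""

    def __init__(self):
        super().__init__()
        self.names = set()

    def handle_starttag(self, tag, attrs):
        # HTMLParser lower-cases tag and attribute names, so <META NAME=...>
        # and <meta name=...> are handled alike. Self-closing tags also
        # arrive here through the default handle_startendtag.
        if tag != "meta":
            return
        name = dict(attrs).get("name")
        if name:
            # Normalise case so "Keywords" and "keywords" count as one element.
            self.names.add(name.strip().lower())


def meta_names(html):
    """Return the set of META element names found in one page's HTML."""
    collector = MetaNameCollector()
    collector.feed(html)
    collector.close()
    return collector.names


def pair_counts(pages):
    """Count how many pages contain each two-element combination."""
    counts = Counter()
    for html in pages:
        # Sorting gives every pair one canonical order before counting.
        counts.update(combinations(sorted(meta_names(html)), 2))
    return counts
```

Calling `pair_counts` on a list of HTML strings returns a `Counter` keyed by alphabetically ordered name pairs, so `counts.most_common(1)` gives the most frequent two-element combination in that list. Changing the `2` passed to `combinations` would tally larger combinations in the same way.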

It is this flexibility that causes much of the discussion among researchers and search engine developers. How much granularity is beneficial and how much dilutes the effectiveness of the scheme or renders the scheme too complex for the average web author? As Campbell points out, metadata is pulled in two directions: that of traditional information organization and bibliographic description on one side, and that of “the emerging standards that will form the web of the future” on the other (Campbell, 2002). On the one hand, metadata stems from a long history of describing resources using standardized formats and vocabularies. On the other hand, metadata of the future is as yet undetermined; this type of metadata is still evolving quietly on the web.

Those researchers who believe metadata should be governed by stricter standards argue that the lack of controlled vocabulary fundamentally dilutes metadata’s effectiveness. Chepesuik, for example, argues that metadata is really “cataloging by another name” (Chepesuik, 1999). As such, he maintains, controlled vocabulary is necessary to fend off “bibliographic chaos” (Chepesuik, 1999). He quotes Michael Gorman as saying:

There is no third way between cataloging, controlled vocabularies, etc. (expensive and effective) and the chaos of keyword searching on the web (inexpensive and utterly ineffective) (Chepesuik, 1999).

Other researchers also note that without some standardization and centralized control, metadata will have little value and therefore will not be used by search engines (see Sokvitne, 2000; Henshaw and Valauskas, 2001; Tennant, 2003, 2004). The stakes are therefore quite high. Lack of bibliographic control could lead to such inconsistent metadata that search engines completely disregard it, which would make the use of metadata by authors publishing pages to the open internet an exercise in futility.

Proponents of a looser metadata scheme, however, argue that if metadata is too difficult for the average web author to create, those authors will not use the scheme or will misuse the scheme, each of which could result in the ultimate demise of metadata as a tool of internet resource discovery. Carl Lagoze (2001), for example, argues that even though there is a place for greater granularity in metadata, there is also a strong argument for “pidgin” metadata on the internet. This type of metadata would be simple enough that multiple and diverse search algorithms could access its contents, and in this way the pidgin scheme would allow for basic cross-domain resource discovery (Lagoze, 2001). Diane Hillman (2003) agrees, saying that there are such differences in vocabulary preferences between spheres of knowledge that pidgin metadata schemes provide better cross-domain retrieval possibilities.

Those who give advice and do research on how to increase web page visibility seem to agree with Lagoze and Hillman. Most advocate using only “keyword”, “description”, or a combination of those two elements (see Richardson, 2003; Search Engine Optimization, n.d.; Search Engine Optimization 1-2-3, n.d.; Sullivan, 2003; Yahoo.com, n.d.). Other research indicates that the “keyword”, “description”, and “title” elements influence retrieval and ranking more than other elements do (Zhang and Dimitroff, 2005). This type of research tends to support the proposal to keep metadata simple.

Creating metadata is expensive, requiring time and thought. Every element added costs money, so in an age of tightening profit margins metadata’s strength is often seen
