Setting our bibliographic references free: towards open citation data

Date	09 March 2015
Published date	09 March 2015
DOI	https://doi.org/10.1108/JD-12-2013-0166
Pages	253-277
Author	Silvio Peroni, Alexander Dutton, Tanya Gray, David Shotton
Subject Matter	Library & information science, Records management & preservation, Document management

Silvio Peroni

Department of Computer Science and Engineering,

University of Bologna, Bologna, Italy

Alexander Dutton

IT Services, University of Oxford, Oxford, UK

Tanya Gray

Bodleian Libraries, University of Oxford, Oxford, UK, and

David Shotton

Oxford e-Research Centre, University of Oxford, Oxford, UK

Abstract

Purpose – Citation data needs to be recognised as a part of the Commons – those works that are freely and legally available for sharing – and placed in an open repository. The paper aims to discuss this issue.

Design/methodology/approach – The Open Citation Corpus is a new open repository of scholarly citation data, made available under a Creative Commons CC0 1.0 public domain dedication and encoded as Open Linked Data using the SPAR Ontologies.

Findings – The Open Citation Corpus presently provides open access (OA) to reference lists from 204,637 articles from the OA Subset of PubMed Central, containing 6,325,178 individual references to 3,373,961 unique papers.

Originality/value – Scholars, publishers and institutions may freely build upon, enhance and reuse the open citation data for any purpose, without restriction under copyright or database law.

Keywords Semantic publishing, Open access, Citations, Open citation corpus, References,

SPAR ontologies

Paper type Viewpoint

Journal of Documentation, Vol. 71 No. 2, 2015, pp. 253-277
© Emerald Group Publishing Limited, 0022-0418
DOI 10.1108/JD-12-2013-0166

Received 18 December 2013
Revised 26 February 2014
Accepted 3 March 2014

The current issue and full text archive of this journal is available on Emerald Insight at: www.emeraldinsight.com/0022-0418.htm

This paper has been developed from the same textual source material from which was distilled a short Comment piece entitled “Open Citations” recently published by David Shotton in Nature (Shotton, 2013). It thus has substantial textual elements in common with that publication.

The authors gratefully acknowledge the financial support of Jisc, which provided two small grants to David Shotton that, in addition to enabling the creation of the OCC of which he is the Director, in part also made possible his development of the SPAR ontologies in collaboration with Silvio Peroni, and of the CiTO Reference Annotation Tools in collaboration with Tanya Gray. The software development of the first public prototype of OCC was primarily undertaken by Alexander Dutton during the first Jisc grant. Work currently in progress on revising the data model, infrastructure and ingest pipeline of the OCC was initiated during the second Jisc grant, in collaboration with Richard Jones, Mark Macgillivray and Martyn Whitwell of Cottage Labs, acting as development consultants, who are sincerely thanked for their excellent work.

Silvio Peroni would like to thank Angelo Di Iorio and Andrea Giovanni Nuzzolese, who co-authored CiTalO, and Paolo Ciancarini and Fabio Vitali for their help and for many fruitful and proactive discussions about citations, citation functions and citation metrics.

1. Introduction

We are living in the early part of the decade of open information. Following a spate of recent reports and government policy statements (Boulton, 2012; Finch, 2012; American Meteorological Society, 2013; Burwell et al., 2013; New South Wales Government, 2013; Research Councils UK, 2013; Wellcome Trust, 2013), it can be fairly stated that the policy debate on open access (OA) has been won. Interest is now focused on implementation of the open agenda.

Over the past decade, several studies have demonstrated the importance and benefits of releasing articles and data as OA material. Lawrence (2001), Harnad and Brody (2004), Davis et al. (2008) and Swan (2009) gave empirical evidence of the advantages of OA in terms of better visibility, findability and accessibility for research articles. Following an initial study showing similar results (Piwowar et al., 2007), a new larger study by Piwowar and Vision shows that making research data publicly available can increase the citation rates of articles by between 9 and 30 per cent, depending on the publication dates of the data sets (Piwowar and Vision, 2013).

But what of OA to the citation data, in other words to the reference lists within scholarly papers that cite other bibliographic resources, from which citation rates can be calculated? Heather Piwowar, a resident of Vancouver, Canada, never anticipated the difficulties in collecting such citation data for that study (Piwowar and Vision, 2013). She needed to analyse citation counts for thousands of articles (she had 10,000 PubMed IDs to look up), but three of the major sources of citation data, Thomson Reuters’ Web of Science[1], Google Scholar[2] and Microsoft Academic Search[3], did not support PubMed ID queries. Scopus[4], Elsevier’s database of scholarly citations, did, but because Piwowar lacked institutional access to that resource, and with direct appeals to Scopus staff falling on deaf ears, she had a problem. She eventually obtained access through a Research Worker agreement with Canada’s National Science Library, but, because she had recently worked in the USA, this required her first to obtain a police clearance certificate and to have her fingerprints sent to the FBI.

A similar story can be told concerning Steven Greenberg’s striking analysis of citation distortion (Greenberg, 2009), revealing how hypotheses can be converted into “facts” simply by repeated citation. His work involved the manual construction and analysis of a citation network containing 242 papers, 675 citations, and 220,553 distinct citation paths relevant to a particular hypothesis relating to Alzheimer’s disease. Had those citation data been readily accessible online, he would have been saved considerable effort. These two examples demonstrate how actual research practice suffers because access to citation data is currently so difficult.

In this OA decade, we think it is a scandal that reference lists from academic articles, core elements of scholarly communication that permit the attribution of credit and integrate our independent research endeavours, are not already freely available for use by scholars. To rectify this, citation data now needs to be recognised as a part of the Commons – those works that are freely and legally available for sharing – and placed in an open repository, where it should be stored in appropriate machine-readable formats so as to be easily reused by machines to assist people in producing novel services. So there is work to be done.

In this paper, we first introduce the issues affecting the currently available sources of citation data, and then describe our own contributions to this field which attempt to improve the current situation: the Open Citations Corpus (OCC)[5], the Citations Typing Ontology (CiTO)[6] (Peroni and Shotton, 2012), the CiTO Reference Annotation Tools[7] and CiTalO[8]. OCC is an open repository for citation data, available under a Creative Commons CC0 1.0 public domain dedication and encoded as Open Linked Data. CiTO is an OWL 2 DL ontology (Motik et al., 2012) that enables the assertion of citations in RDF, and their machine-readable characterisation in terms of the reasons for such citations.
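
To make the idea of asserting a citation in RDF more concrete, the following minimal Python sketch builds two statements in N-Triples syntax, one plain citation and one that also records a reason for citing. The paper URIs under example.org and the helper name citation_triple are hypothetical and used only for illustration; the namespace and the properties cito:cites and cito:extends are taken from CiTO, but the sketch is not drawn from the OCC codebase.

```python
# Minimal sketch: express citations as RDF statements using CiTO properties.
# The paper URIs below are hypothetical placeholders, not real records.

CITO = "http://purl.org/spar/cito/"  # CiTO namespace


def citation_triple(citing_uri: str, cited_uri: str, relation: str = "cites") -> str:
    """Return one N-Triples line stating: citing_uri cito:<relation> cited_uri."""
    return f"<{citing_uri}> <{CITO}{relation}> <{cited_uri}> ."


citing = "https://example.org/paper/A"  # hypothetical citing article
cited = "https://example.org/paper/B"   # hypothetical cited article

triples = [
    citation_triple(citing, cited),             # the bare fact of citation
    citation_triple(citing, cited, "extends"),  # a reason for the citation
]

print("\n".join(triples))
```

Because each statement is an ordinary subject-predicate-object triple, the same data can be loaded into any RDF store and queried for, say, every paper that a given article extends.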
