Publishing legacy data as linked data: a state of the art survey

Pages: 520-535
DOI: https://doi.org/10.1108/LHT-09-2012-0075
Published: 02 September 2013
Authors: Ujjal Marjit, Kumar Sharma, Arup Sarkar, Madaiah Krishnamurthy
Subject: Library & information science, Librarianship/library management, Library technology
Ujjal Marjit and Kumar Sharma
CIRM, University of Kalyani, Kalyani, India
Arup Sarkar
Department of Computer Science & Engineering, University of Kalyani,
Kalyani, India, and
Madaiah Krishnamurthy
DRTC, Indian Statistical Institute, Bangalore, India
Abstract
Purpose – This article aims to discuss how the emergence of advanced semantic web technology has transformed the conventional web into a machine-processable and machine-understandable form.
Design/methodology/approach – The authors survey current research, tools and applications for publishing legacy data as linked data, with the aim of giving the reader a better understanding of the linked data landscape.
Findings – Today, a vast amount of data is stored in file formats other than RDF; such data are called legacy data. To publish them as linked data, they must be extracted and converted into RDF without altering the original data schema or losing information.
Originality/value – Several key issues remain to be addressed. Linked data offers a more sophisticated approach, making the transformation of the web of documents into the web of connected data possible.
Keywords Linked data, Web of documents, Legacy data, Semantic web, Web of data, Semantics, Data management
Paper type Research paper
1. Introduction
The World Wide Web (WWW) plays a pivotal role as a global knowledge base and as an international network for communication, information and business. Today, almost everyone in academia, business or education is familiar with the concept of the World Wide Web. The web is an interconnection of documents available from all around the world. Each of these documents matters to us in a different context, because it is a place where data or information of any kind is stored. Whenever we search for information, the search engine returns a list of web links based on the terms or keywords we have entered; we then have to follow these links to judge whether or not they lead to useful information. Today, every kind of data is available on the web: academic data, business data, and personal or social data. Without the web it would be difficult to reach high-quality information about events happening around the world. The web's main weakness, however, lies in the sharing of data between different data sources and the reuse of those data. Basically, the web is an interconnection of HTML documents, where the HTML dictates how the textual data in a document are structured and presented within a browser.
Received 4 September 2012
Revised 23 April 2013, 13 May 2013
Accepted 26 May 2013
Library Hi Tech, Vol. 31 No. 3, 2013, pp. 520-535
© Emerald Group Publishing Limited, ISSN 0737-8831
DOI 10.1108/LHT-09-2012-0075
HTML documents also contain outgoing links, or hyperlinks, which are defined by the href attribute. On the traditional web, documents are connected by means of these hyperlinks; a common phrase used to describe this web is the "Web of Documents". The Web of Documents is enough for us to search and analyze data ourselves, but for a computer it is almost impossible to identify what data are hidden in each document. A computer cannot identify the actual information in a document or its intended meaning, because that information is intended for humans, not for machines. A machine only knows how to render and display the textual data on the web of documents, as directed by HTML. It is thus impossible for machines to identify the relationships between data. The current generation of the web has therefore moved to a newer breed of web technology called the Semantic Web. In the Semantic Web, semantics are added to ordinary data to make them machine processable and comprehensible. It is a web (also known as web 3.0) that allows data to be self-described in a more structured way, so that machines can easily process and analyze them. But the semantic web alone is not enough to fulfill the current need: it makes data machine processable and understandable without creating any links among them. This creates separate data islands without any meaningful connections depicting their relations to one another, so identifying the relatedness of data held by two semantic web applications working in the same domain becomes cumbersome. In 2006, Tim Berners-Lee coined a new concept, a further extension of the existing semantic web called Linked Data, to resolve this problem.
The use of Linked Data applications within organizations is growing day by day. Moreover, for those who decide to publish all their organizational data online as Linked Data, a plethora of legacy data awaits attention. In the early days of Linked Data adoption it was quite a challenge to publish such huge amounts of legacy data as Linked Data. Today the scenario is somewhat different: many applications and tools are available, at both the academic and the commercial level, for publishing legacy data online as Linked Data. This survey presents a selection of these tools, applications, frameworks and projects, to give readers a glimpse of how Linked Data applications and tools are developing, one after another, with different kinds of approaches and possibilities. A comparative study of these tools and applications is given to develop a vision of the current state of the field and of its future possibilities.
2. Linked Data for all
The Linked Data initiative has taken further steps to overcome the shortcomings of the current web; its ultimate goal is to unite machine-readable datasets into a Web of Data. Linked Data refers to a set of best practices for publishing and interlinking structured data on the Web of Data. Using the web architecture and the HTTP protocol as a universal access mechanism, Linked Data enables users to publish structured data and interlink it on the Web of Data, thereby enabling data sharing and reuse. A set of principles for publishing structured data on the Web of Data was first proposed by Berners-Lee (2006/2009) as follows:
• use URIs (Uniform Resource Identifiers) as names for things;
• use HTTP URIs so that people can look up those names;
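The principles above can be sketched in code. The following minimal Python example converts one legacy tabular record into RDF triples in N-Triples syntax, using HTTP URIs to name things and linking one resource to another; the example.org namespace, the vocabulary terms and the sample record are all illustrative assumptions, not part of the survey, and a real deployment would use an established vocabulary and an RDF library.

```python
# Sketch: publish a legacy tabular record as RDF (N-Triples syntax),
# following the Linked Data principles: HTTP URIs name things, and
# each statement links a subject URI to a literal or to another URI.
# The http://example.org/ namespace and property names are hypothetical.

def record_to_ntriples(record):
    """Convert a dict holding one legacy row into a list of N-Triples lines."""
    base = "http://example.org/resource/"   # assumed dereferenceable HTTP URIs
    vocab = "http://example.org/vocab/"     # hypothetical vocabulary namespace
    subject = f"<{base}{record['id']}>"
    triples = []
    for key, value in record.items():
        if key == "id":
            continue                        # the id only names the subject
        predicate = f"<{vocab}{key}>"
        if isinstance(value, str) and value.startswith("http://"):
            obj = f"<{value}>"              # a link to another resource
        else:
            obj = f'"{value}"'              # a plain literal value
        triples.append(f"{subject} {predicate} {obj} .")
    return triples

# A legacy row as it might come from a CSV export or relational table:
legacy_row = {
    "id": "book42",
    "title": "Linked Data Basics",
    "creator": "http://example.org/resource/author7",  # interlinking
}
for line in record_to_ntriples(legacy_row):
    print(line)
```

Note that the conversion preserves the original schema: each column name becomes a predicate, so no information from the legacy record is lost, which is exactly the requirement stated in the Findings above.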