Cloud storage for digital preservation: optimal uses of Amazon S3 and Glacier

Date: 15 June 2015
Published date: 15 June 2015
DOI: https://doi.org/10.1108/LHT-12-2014-0118
Pages: 261-271
Author: Yan Han
Subject Matter: Library & information science, Librarianship/library management, Library technology
Yan Han
University Libraries, The University of Arizona, Tucson, Arizona, USA
Abstract
Purpose: The purpose of this paper is to examine the use of cloud storage in digital preservation by
analyzing its pricing and data retrieval models. The author recommends strategies to minimize costs
and believes cloud storage is worthy of serious consideration.
Design/methodology/approach: Few articles have been published to show the uses of cloud
storage in libraries. Cost is the main concern. An overview of cloud storage pricing shows a price
drop once every one to one-and-a-half years. The author emphasizes data transfer-out costs and
presents a case study. S3 and Glacier are compared and analyzed to show the differences in
retrieval and costs.
Findings: Cloud storage solutions like Glacier can be very attractive for long-term digital
preservation if data can be operated on within the provider's same data zone and data transfer-out
can be minimized.
Practical implications: Institutions can benefit from cloud storage by understanding the cost
models and data retrieval models. Multiple strategies are suggested to minimize costs.
Originality/value: The paper is intended to bridge the gap in the literature on uses of cloud
storage. Cloud storage pricing charts, especially for data transfer-out pricing, are presented to show
the price drops over the past eight years. The costs of storing and retrieving data in Amazon S3 and
Glacier are analyzed and discussed in detail. Comparisons of S3 and Glacier show that Glacier has
unique characteristics and advantages over other cloud storage solutions. Finally, strategies are
suggested to minimize the costs of using cloud storage. The analysis shows that cloud storage can
be very useful in digital preservation.
Keywords: Cloud computing, Cost analysis, Amazon S3, Cloud storage, Glacier
Paper type: Research paper
1. Introduction
Over the past 20 years, millions of manuscripts, serials, and audio and video resources
have been digitized globally for access and preservation. Almost every academic
library, archive, and museum has some digital initiative to make its unique
materials available on the internet. Best practices and digitization standards
have been published and adopted by libraries, museums, and archives to produce
high-quality uncompressed digital surrogates. In addition, the new trend of preserving
born-digital big data from research requires a way to store and save these critical
data for the future. In the past, the data (i.e. master copies) have typically been stored
and backed up in traditional local storage (e.g. hard disks and tapes), while derivatives
were loaded to repositories or web sites for daily access. Access to the master copies is
very infrequent and usually consists of two types: first, data integrity and verification:
the primary access is to verify the checksum on a predefined schedule set by the
preservation policy (e.g. once every six months) to ensure data integrity; and
second, data review and update: typically only a small portion of the data needs
to be accessed. Sometimes a master copy is untouched for a long period of time
(e.g. 5+ years).
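
The scheduled integrity check described above can be sketched as follows. This is a minimal illustration, not the author's procedure: the paper does not name a hash algorithm or a manifest format, so SHA-256, the `manifest` dictionary (relative path mapped to expected hex digest), and the function names `sha256_of` and `verify_fixity` are all assumptions made for the example.

```python
import hashlib
from pathlib import Path


def sha256_of(path: Path, chunk_size: int = 1 << 20) -> str:
    """Return the SHA-256 hex digest of a file, reading it in chunks.

    Chunked reads keep memory use flat for large uncompressed masters.
    SHA-256 is an assumption; the paper only says "checksum".
    """
    digest = hashlib.sha256()
    with path.open("rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def verify_fixity(manifest: dict[str, str], root: Path) -> list[str]:
    """Return the relative paths that are missing or whose digest differs.

    `manifest` is a hypothetical mapping of relative path -> expected digest,
    recorded when the master copy was first stored.
    """
    failures = []
    for relative_path, expected in manifest.items():
        target = root / relative_path
        if not target.is_file() or sha256_of(target) != expected.lower():
            failures.append(relative_path)
    return failures
```

An empty list from `verify_fixity` means every master copy still matches its recorded checksum. Note that a check of this kind reads every byte of every file, which is why the location where it runs matters once data transfer-out is billed.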
Library Hi Tech
Vol. 33 No. 2, 2015
pp. 261-271
© Emerald Group Publishing Limited
0737-8831
DOI 10.1108/LHT-12-2014-0118
Received 19 December 2014
Revised 19 December 2014
Accepted 11 February 2015
The current issue and full text archive of this journal is available on Emerald Insight at:
www.emeraldinsight.com/0737-8831.htm
