Beyond TIFF and JPEG2000: PDF/A as an OAIS submission information package container

Published date: 21 September 2015
Pages: 409-423
DOI: https://doi.org/10.1108/LHT-06-2015-0068
Author: Yan Han
Subject Matter: Library & information science, Librarianship/library management, Library technology
Yan Han
The University of Arizona Libraries,
The University of Arizona, Tucson, Arizona, USA
Abstract
Purpose: This paper introduces PDF/A as a replacement for TIFF as the preferred file format for
the digitization of textual documents. In addition, PDF/A can be used as an open archival
information system (OAIS) submission information package (SIP) container to reduce digitization and
digital preservation costs.
Design/methodology/approach: The author first reviews the current digitization guidelines and the
OAIS model, and provides an overview of the development of PDF and PDF/A as international
standards. A literature review of the uses of PDF/A is then presented. The author analyzes the
pitfalls of TIFF as the preferred format for digitization, and shows how to use PDF/A to encode a
digitization SIP.
Findings: The TIFF file format has been the preferred master file format in the Federal Agency
Digitization Guidelines Initiative digitization guidelines for the past 20 years. However, the TIFF
format has drawbacks. The literature review shows that PDF/A has been the preferred standard for
encoding born-digital documents in the court, government and business sectors. PDF/A-2 and PDF/A-3
are relatively new standards released after 2010. However, few practitioners understand the
standards or have utilized their full potential in digitization. The author shows that PDF/A can be
used as an OAIS SIP container.
Practical implications: To deliver OAIS SIPs, current practices require a combination of files,
directories and various types of metadata. The author shows that PDF/A (PDF/A-2 and/or PDF/A-3) can
be a better file format for textual document digitization: various types of metadata can be encoded
in the Extensible Metadata Platform (XMP), and arbitrary files/data can be embedded in PDF/A-3.
These features of PDF/A provide much better ways to deliver SIPs in a cost-efficient manner.
Originality/value: PDF/A has been recognized as the preferred standard for born-digital documents,
but it has not been used as the preferred file format for digitized materials. The author
recommends PDF/A with lossless JPX compression as the preferred file format, and PDF/A with
lossless JPX compression along with metadata/data as the preferred OAIS SIP container. These uses
reduce costs in digitization and digital preservation and also increase productivity. The author
recommends updating national and international digitization practices to use PDF/A.
Keywords: Digital documents, Digitization, Standards, Digital preservation, PDF/A
Paper type: Research paper
1. Background
1.1 Overview of current digitization guidelines
Libraries, museums and archives have been digitizing materials for preservation and
access since the 1990s. Over the past 20 years, Federal agencies such as the National
Archives and the Digital Library Federation (DLF) have published several critical
digitization guidelines and best practices, which have become the de facto standards for
digitization projects in libraries, archives and museums. These guidelines were written
by experts and specify in great detail every aspect of digitization, including file
format and various metadata considerations. These guidelines greatly influence almost

Library Hi Tech
Vol. 33 No. 3, 2015
pp. 409-423
© Emerald Group Publishing Limited
0737-8831
DOI 10.1108/LHT-06-2015-0068
Received 26 June 2015
Revised 13 July 2015
Accepted 15 July 2015
The current issue and full text archive of this journal is available on Emerald Insight at:
www.emeraldinsight.com/0737-8831.htm
The author would like to thank Leonard Rosenthol, Project leader for ISO PDF/A and Adobe PDF
Architect, for his comments on the TIFF, PDF and PDF/A file formats.