Analysis and optimization of PDF-to-EPUB in the digital publishing process
Published date | 03 April 2018 |
DOI | https://doi.org/10.1108/EL-11-2016-0247 |
Pages | 350-368 |
Author | Qian Pu, Xiaomin Zhu, Donghua Chen, Runtong Zhang |
Subject Matter | Information & knowledge management, Information & communications technology, Internet |
Qian Pu, Xiaomin Zhu and Donghua Chen
School of Mechanical, Electronic and Control Engineering,
Beijing Jiaotong University, Beijing, China, and
Runtong Zhang
School of Economics and Management, Beijing Jiaotong University,
Beijing, China
Abstract
Purpose – This paper aims to provide a workflow optimization method for publishing houses and
electronic book (e-book) studies in the field of digital publishing.
Design/methodology/approach – Based on studies of publishing houses in Beijing, the present
conversion workflow is illustrated using a functional modeling methodology. Then, the workflow is analyzed
using the 5W1H (why, who, what, where, when, how) methodology and optimized using the ECRSI (eliminate,
combine, rearrange, simplify and increase) principles. To validate the optimization effect, the workflows before
and after optimization are generated and implemented in the ExtendSim® simulation software.
Findings – The simulation results show that under similar circumstances, both the quantity and quality of the
products are improved after optimization, which indicates that the optimization method is effective.
Practical implications – Electronic PUBlication (EPUB) is strongly required to satisfy the needs
of the mobile reading market and to earn increased profits, whereas some e-books are still preserved in
portable document format (PDF). This study results in enhanced EPUB quality and production efficiency
of the PDF-to-EPUB format conversion workflow in publishing houses. Publishing houses around the world
can refer to this study to make a similar optimization when handling PDF-to-EPUB conversion.
Originality/value – This research introduces traditional industrial engineering analytical techniques
to the workflow optimization of e-book conversion. Compared with most other methods used to
optimize workflow, this method is simpler, more efficient and more suitable for e-book format conversion.
Keywords PDF, E-book format, EPUB, ExtendSim simulation, Workflow optimization
Paper type Research paper
1. Introduction
Electronic books (e-books) offer a remedial solution for traditional printed books: accessing
content online, avoiding tedious presentations and providing better interactivity (Jou et al.,
2016). Many people read e-books in their daily lives. A poll conducted by USA Today and
Bookish in 2013 showed that 40 per cent of adults (including 46 per cent aged between 18 and
39 years) own an e-reader or a tablet, and 60 per cent of college graduates say they have one
(Minzesheimer, 2013). E-books have been studied by many researchers who mainly focused on
the following: printed books (Hou et al., 2017; Knowlton, 2015; Kurata et al., 2017), e-book usage
in academia (Bennett and Landoni, 2005; Jindal and Pant, 2013; Raynard, 2017) and e-book
reading habits analysis (Gilbert and Fister, 2015; Hsiusen and Chen, 2014; Pinto et al., 2014).
This work was partially supported by a key project of the National Natural Science Foundation of China
under grant number 71532002, the Fundamental Research Funds for the Central Universities under
grant number B16JB00130 and the Beijing Logistics Informatics Research Base.
Received 28 December 2016
Revised 22 June 2017
Accepted 1 August 2017
The Electronic Library
Vol. 36 No. 2, 2018
pp. 350-368
© Emerald Publishing Limited
0264-0473
DOI 10.1108/EL-11-2016-0247
The current issue and full text archive of this journal is available on Emerald Insight at:
www.emeraldinsight.com/0264-0473.htm
Different e-book formats can lead to various reading experiences. Portable document
format (PDF) is widely used to preserve and present different document types (Khusro et al.,
2014), but it requires frequent size adjustments when applied to a small screen. Among all
e-book formats that can show pictures, electronic PUBlication (EPUB) is outstanding,
because it can be read by most reading applications and allows users to modify the font size
on the basis of the display area (Marinai et al., 2011). EPUB is an open and free standard e-book
format published by the International Digital Publishing Forum and contains different
media formats, such as PNG, JPEG and SVG. In particular, EPUB3 is a portable packaged
file format that can include supplementary material, enable repetition of experiments,
improve readability and understanding, interlink with research data and support open Web
standards, including HTML5, CSS3 and JavaScript (Meester et al., 2014; Sigarchian et al.,
2014). Thus, e-book conversion from PDF to EPUB deserves to be studied. EPUB is a
reflowable format based on the XHTML file standard, which implies software and hardware
dependence (Okuda et al., 2012). The basic conversion procedure has three main steps:
identification and analysis of the table of contents, identification of notes and illustrations
and their export in the EPUB format (Marinai et al., 2011). These steps involve standards
and technologies, such as document type definition (DTD), extensible markup language (XML),
extensible stylesheet language transformations (XSLT) and optical character recognition (OCR) (Hsiao
et al., 2014; Zhang, 2008). In many Chinese publishing houses, traditional conversion
systems cannot meet production efficiency and quality requirements; thus, this study
focuses on improving the efficiency and accuracy of the conversion workflow.
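As a rough illustration of the export step, the sketch below packages a single XHTML chapter into an EPUB container using only the Python standard library. It is not the conversion system examined in this study: the function name `build_minimal_epub`, the file names under `OEBPS/` and the sample metadata are hypothetical, and the result is a skeleton that has not been validated with an EPUB checker. The two structural rules it relies on come from the EPUB container specification: the `mimetype` entry is written first and stored uncompressed, and `META-INF/container.xml` points to the package document.

```python
import io
import uuid
import zipfile
from datetime import datetime, timezone
from xml.sax.saxutils import escape

# Points the reading system to the package document (path is a common convention).
CONTAINER_XML = """<?xml version="1.0" encoding="UTF-8"?>
<container version="1.0" xmlns="urn:oasis:names:tc:opendocument:xmlns:container">
  <rootfiles>
    <rootfile full-path="OEBPS/content.opf" media-type="application/oebps-package+xml"/>
  </rootfiles>
</container>
"""

# Package document: metadata, manifest (all files) and spine (reading order).
PACKAGE_OPF = """<?xml version="1.0" encoding="UTF-8"?>
<package xmlns="http://www.idpf.org/2007/opf" version="3.0" unique-identifier="book-id">
  <metadata xmlns:dc="http://purl.org/dc/elements/1.1/">
    <dc:identifier id="book-id">urn:uuid:{uid}</dc:identifier>
    <dc:title>{title}</dc:title>
    <dc:language>en</dc:language>
    <meta property="dcterms:modified">{modified}</meta>
  </metadata>
  <manifest>
    <item id="nav" href="nav.xhtml" media-type="application/xhtml+xml" properties="nav"/>
    <item id="ch1" href="chapter1.xhtml" media-type="application/xhtml+xml"/>
  </manifest>
  <spine>
    <itemref idref="ch1"/>
  </spine>
</package>
"""

# Navigation document: the table of contents identified during conversion.
NAV_XHTML = """<?xml version="1.0" encoding="UTF-8"?>
<html xmlns="http://www.w3.org/1999/xhtml" xmlns:epub="http://www.idpf.org/2007/ops">
  <head><title>Contents</title></head>
  <body>
    <nav epub:type="toc"><ol><li><a href="chapter1.xhtml">{title}</a></li></ol></nav>
  </body>
</html>
"""

CHAPTER_XHTML = """<?xml version="1.0" encoding="UTF-8"?>
<html xmlns="http://www.w3.org/1999/xhtml">
  <head><title>{title}</title></head>
  <body>{body}</body>
</html>
"""


def build_minimal_epub(title, body_xhtml):
    """Package one chapter as an EPUB 3 skeleton and return the archive bytes.

    body_xhtml is assumed to be a well-formed XHTML fragment supplied by the caller.
    """
    safe_title = escape(title)
    modified = datetime.now(timezone.utc).strftime("%Y-%m-%dT%H:%M:%SZ")
    files = {
        "META-INF/container.xml": CONTAINER_XML,
        "OEBPS/content.opf": PACKAGE_OPF.format(
            uid=uuid.uuid4(), title=safe_title, modified=modified
        ),
        "OEBPS/nav.xhtml": NAV_XHTML.format(title=safe_title),
        "OEBPS/chapter1.xhtml": CHAPTER_XHTML.format(title=safe_title, body=body_xhtml),
    }
    buffer = io.BytesIO()
    with zipfile.ZipFile(buffer, "w") as archive:
        # The mimetype entry must come first and must not be compressed.
        archive.writestr("mimetype", "application/epub+zip", compress_type=zipfile.ZIP_STORED)
        for name, text in files.items():
            archive.writestr(name, text, compress_type=zipfile.ZIP_DEFLATED)
    return buffer.getvalue()
```

Writing the returned bytes to a file with an `.epub` extension gives a package that a reading application can attempt to open. A real conversion would fill the navigation document and the chapters from the table of contents, notes and illustrations extracted from the PDF.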
Workflow is mainly studied using mathematical modeling, simulation and analysis.
Mathematical modeling is widely used to solve many process efficiency problems. Zhong
et al. (2016) utilized the Markov chain model to formulate the mammography testing process
of an exam room. Wang et al. (2012) also adopted the Markov chain model to estimate
patient length of stay and investigate different configurations of radiology specialists in a
computed tomography imaging department to reduce flow time and cost.
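To make the Markov chain idea concrete, a process can be treated as an absorbing chain whose transient states are the process stages. The expected number of steps before completion, t, then satisfies (I - Q) t = 1, where Q holds the transition probabilities among the transient states. The sketch below is a generic illustration and not the model of Zhong et al. (2016) or Wang et al. (2012); the three stages and every transition probability are invented for the example.

```python
def expected_steps(Q):
    """Expected number of steps to absorption from each transient state.

    Q[i][j] is the probability of moving from transient state i to transient
    state j. Solves (I - Q) t = 1 by Gauss-Jordan elimination with pivoting.
    """
    n = len(Q)
    # Augmented matrix [I - Q | 1].
    A = [
        [(1.0 if i == j else 0.0) - Q[i][j] for j in range(n)] + [1.0]
        for i in range(n)
    ]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(A[r][col]))
        if abs(A[pivot][col]) < 1e-12:
            raise ValueError("I - Q is singular; the chain may not be absorbing")
        A[col], A[pivot] = A[pivot], A[col]
        for r in range(n):
            if r != col:
                factor = A[r][col] / A[col][col]
                A[r] = [a - factor * b for a, b in zip(A[r], A[col])]
    return [A[i][n] / A[i][i] for i in range(n)]


# Hypothetical stages: 0 = intake, 1 = processing, 2 = review.
# Probability mass missing from a row is the chance of finishing (absorption).
Q = [
    [0.0, 1.0, 0.0],  # intake always proceeds to processing
    [0.0, 0.1, 0.9],  # processing repeats with probability 0.1
    [0.0, 0.2, 0.0],  # review sends 20% back; 80% finish
]
for stage, steps in zip(["intake", "processing", "review"], expected_steps(Q)):
    print(f"{stage}: {steps:.2f} expected steps to completion")
```

Multiplying the expected step counts by a time or cost per step gives a first estimate of flow time or cost, which is the kind of quantity such models are used to reduce.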
Simulation is also used by many researchers as an analysis tool. Mrowinski et al. (2016)
analyzed the theoretical probability distributions of review time and applied them to
simulate different editorial strategies and study their effects on the efficiency of the
entire process. Lee et al. (2015) used RealOpt© software to simulate and optimize
the emergency department workflow and verify the optimization effect through its
application in hospitals. Simulation is also implemented to validate the optimization effect
(Yılmaz and Baykal, 2016), support the decision-making process (Abo-Hamad and Arisha,
2013) and predict the effect of schemes (Reijers et al., 2016).
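The kind of before-and-after comparison that simulation supports can be sketched in a few lines. The example below is a plain-Python single-server queue, not ExtendSim or RealOpt, and every parameter (mean interarrival time, mean service times, number of jobs and the random seed) is an assumption chosen only for illustration.

```python
import random


def simulate_queue(mean_interarrival, mean_service, n_jobs, seed=0):
    """Mean time a job spends in a single-server FIFO queue (waiting plus service).

    Interarrival and service times are drawn from exponential distributions.
    """
    rng = random.Random(seed)
    arrival = 0.0         # arrival time of the current job
    server_free_at = 0.0  # time at which the server finishes its current job
    total_time = 0.0
    for _ in range(n_jobs):
        arrival += rng.expovariate(1.0 / mean_interarrival)
        start = max(arrival, server_free_at)  # wait if the server is busy
        server_free_at = start + rng.expovariate(1.0 / mean_service)
        total_time += server_free_at - arrival
    return total_time / n_jobs


# Hypothetical scenario: optimization shortens the mean service time from 8 to 6.
before = simulate_queue(mean_interarrival=10.0, mean_service=8.0, n_jobs=10_000)
after = simulate_queue(mean_interarrival=10.0, mean_service=6.0, n_jobs=10_000)
print(f"mean time in system: before={before:.1f}, after={after:.1f}")
```

Using the same seed for both runs feeds both workflows the same stream of arrivals, so the difference in the result reflects the change in service time rather than random variation between runs.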
Mathematical modeling and simulation are powerful but can be complex and time-
consuming. Zhang and Perry (2014) used the abstract syntax tree to describe a business
process. Several studies directly described and analyzed the clinic workflow from different
aspects to optimize it (Gocsik and Barton, 2014; Grischeau and Zenner, 2012). The lean
manufacturing technique is also used to optimize industry and medical workflows (Dickson
et al., 2009; Ng et al., 2010; Ramnath et al., 2014).
The current study deals with the workflow optimization problem of the PDF-to-EPUB
format conversion in publishing houses. First, the present workflow is described by a
functional modeling methodology, and the functional requirements of publishing houses are
defined. Second, the 5W1H (why, who, what, where, when, how) methodology is used to
analyze the workflow, whereas the ECRSI (eliminate, combine, rearrange, simplify and
increase) principles are applied to optimize the workflow. Note that the 5W1H and ECRSI