A study on industry and synthetic standard benchmarks in relational and object databases

Jia-Lang Seng
Department of MIS, National Chengchi University, Taipei, Taiwan ROC

Industrial Management & Data Systems, 103/7 [2003] 516-532
DOI 10.1108/02635570310489214

Keywords
Databases, Benchmarking, Performance monitoring, Online operation, Relational databases, Object-oriented databases

Abstract
Benchmarks are vital tools in the performance measurement and evaluation of database management systems (DBMS), including relational database management systems (RDBMS) and object-oriented/object-relational database management systems (OODBMS/ORDBMS). Standard synthetic benchmarks have been used to assess the performance of RDBMS software; other benchmarks have been used to appraise the performance of OODBMS/ORDBMS products. This paper presents an analytical framework of workload characterization for examining the rationale and design of industry standard and synthetic standard benchmarks. The framework comprises four main components: the schema analysis, the operation analysis, the control analysis, and the system analysis. The analysis results are compiled, new concepts and perspectives on benchmark design are collated, and each analysis aspect and its managerial implications are discussed in detail.
1. Introduction
This paper is organized into eight sections.
Section 1 introduces this research. Section 2
describes the concepts of benchmark and
workload. Section 3 reviews the RDBMS
benchmarks. Section 4 surveys the
OODBMS/ORDBMS benchmarks. Section 5
presents the schema analysis, the operation
analysis, the control analysis, and the system
analysis from a workload-centric analysis
framework. Section 6 discusses the RDBMS
benchmark analysis results and the
OODBMS/ORDBMS benchmark analysis
results. Section 7 details the managerial
implications of this survey. Section 8
concludes this paper with a brief summary
and future research directions.
Benchmarks are vital tools in the
performance measurement and evaluation of
database management systems (DBMS).
The DBMS is the crux of an enterprise
information system (EIS), and EIS performance
depends on DBMS operation and function. A
wide variety of standard database
benchmarks have been developed for
relational database management systems
(RDBMS) and object-oriented/object-
relational database management systems
(OODBMS/ORDBMS). Examples are the
Wisconsin, AS3AP, Debit-Credit, TPC-A, TPC-
B, TPC-C, TPC-D, and TPC-W, SAP R/3 SD,
PeopleSoft eBill, and Oracle AS-BM
benchmarks in the RDBMS area and the OO1,
HyperModel, ACOB, OO7, and BUCKY
benchmarks in the OODBMS/ORDBMS area.
These well-known RDBMS and
OODBMS/ORDBMS database benchmarks
can be roughly divided into two main classes.
One is the synthetic benchmark, which uses
artificial data and operations to simulate one
representative application in a problem
domain. The other is the empirical
benchmark, which uses real data and
operations run in an actual application system.
As we know, empirical data are ideal for
performance measures. However, it is difficult
to control their variables in a systematic and
scalable manner, and it is expensive to
re-implement an entire actual application
system on two or more different DBMSs. The
costs often outweigh the benefits of empirical
experiments. Hence, synthetic benchmarks
are the approach most often chosen to measure
and evaluate DBMS performance.
Synthetic benchmarks can be further
divided into three categories. One is the
system standard benchmark created from
academia, such as Wisconsin, AS3AP, and
OO7, which uses relational queries and
object operations to measure DBMS
fundamentals. Another is the industry
standard benchmark created by TPC and
other standards organizations such as SPEC
(www.spec.com), which emulates one
representative application system in an
industry, for instance, online transaction
processing system (OLTP), online analytical
processing system (OLAP)
(www.olapcouncil.org), and electronic
commerce system (EC). Yet another is the
application standard benchmark crafted by
major DBMS and ERP vendors, such as the SAP
R/3 SD, PeopleSoft eBill, and Oracle AS-BM
benchmarks, which model a set of DBMS
application modules to test an integrated
process of an EIS (Microsoft, 2002; Oracle,
2002). As we can see, the above are
mainly RDBMS benchmarks. OODBMS/
ORDBMS benchmarks are mostly system
standard benchmarks and industry standard
benchmarks (Cattell, 1993; Carey et al., 1993).
OO1 models the simple object database,
HyperModel simulates the hypertext
database, ACOB represents the client/
server object database, OO7 measures the
complex object database, and BUCKY
emulates the object-relational database.
Sometimes, when an application standard
benchmark and an industry standard
benchmark model one similar application
system, for instance, ordering-procurement-
payment-shipment, both categories can be
examined at the same time. Hence, in our
study, we focus on the first two categories of
synthetic benchmarks, that is, the system
standard benchmarks and the industry
standard benchmarks in RDBMS and
OODBMS/ORDBMS. Whichever category a
synthetic benchmark belongs to, it has a
typical problem domain to model and
measure.
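To make the notion of a synthetic workload concrete, the following sketch (in Python, using the standard sqlite3 module) generates a small relation loosely modeled on the Wisconsin benchmark's synthetic tables. The attribute names echo the published benchmark, but the cardinality, tuple layout, and generator logic are illustrative assumptions, not a reproduction of the specification.

import random
import sqlite3

# Illustrative Wisconsin-style synthetic relation: every attribute has a
# known distribution, so query selectivity can be controlled exactly.
def build_synthetic_relation(conn, cardinality=10_000):
    conn.execute(
        "CREATE TABLE tenktup ("
        " unique1 INTEGER,"   # random permutation of 0..cardinality-1
        " unique2 INTEGER,"   # sequential key 0..cardinality-1
        " hundred INTEGER,"   # uniform over 0..99: 1% selectivity per value
        " stringu1 TEXT)"     # padding to give tuples realistic width
    )
    perm = list(range(cardinality))
    random.shuffle(perm)
    rows = ((perm[i], i, perm[i] % 100, f"tuple-{perm[i]:09d}" + "x" * 40)
            for i in range(cardinality))
    conn.executemany("INSERT INTO tenktup VALUES (?, ?, ?, ?)", rows)
    conn.commit()

conn = sqlite3.connect(":memory:")
build_synthetic_relation(conn)
# By construction, exactly 100 of the 10,000 tuples satisfy this predicate.
print(conn.execute("SELECT COUNT(*) FROM tenktup WHERE hundred = 7").fetchone())

Because every attribute value's frequency is fixed in advance, a benchmark query can target an exact selectivity, which is precisely the control that empirical data cannot guarantee.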
However, due to the nature of the
synthetic data, benchmark results are
domain-dependent and domain-specific.
They are highly dependent upon the
standard workloads. Benchmark results
may vary as the user domain changes and
the application workload evolves. Ferrari
(1978), Turbyfill (1987), Highleyman (1989),
Yu et al. (1992) and Gray (1993) indicate that
the degree of variation depends on the
approximation of the user workload to those
defined in the standard and synthetic
benchmarks. It is important and crucial for
users and management to understand and
administer the different characteristics of
the standard and synthetic benchmarks in
order to select and use the right
performance measures.
It is the objective of this study to present a
simple survey of industry standard and
synthetic standard benchmarks in RDBMS
and OODBMS/ORDBMS from a workload
perspective. We examine the design and
rationale of each workload. The
examination is undertaken from three
dimensions. One is from the benchmark
development life cycle dimension. Another
is from the workload requirements
development dimension. Yet another is from
the benchmark analysis dimension.
Specifically, they are the schema analysis,
the operation analysis, the control analysis,
and the system analysis. We describe and
discuss each kind of analysis result in this
paper. Furthermore, from our results, we
take a new perspective and suggest
a more generic, more domain-independent,
and more application-representative
benchmark approach. We recommend that a
new concept of a common carrier be created to
carry the user's workload requirements. A
benchmark approach is treated as a process
of workload requirements acquisition,
analysis, and annotation. It is developed to
meet the user's requirements and to produce
more flexible and more realistic
performance models.
2. Database benchmark and
workload
A benchmark is a standard by which
something can be measured or judged. A
computer benchmark is a set of executable
instructions run in controlled
experiments to compare two or more
computer systems. Benchmarking is the
process of evaluating different hardware
systems or reviewing different software
systems on the same or different hardware
platforms. A database benchmark is
therefore a standard set of executable
instructions that are used to measure and
compare the relative and quantitative
performance of two or more database
systems through the execution of controlled
experiments. Benchmark data such as
throughput (jobs per time unit), response
time (time per job unit), and other measures
serve to predict performance and help us to
procure systems, plan capacity, and uncover
development bottlenecks for various user
groups (Turbyfill, 1987; Gray, 1993; Carey
et al., 1993).
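As a hedged illustration of how such controlled experiments yield these measures, the sketch below times a fixed query mix against an in-memory SQLite database and derives throughput (jobs per time unit) and mean response time (time per job unit). SQLite and the two queries merely stand in for whatever DBMS and workload a real benchmark would prescribe.

import sqlite3
import statistics
import time

# Run a fixed query mix repeatedly and derive the two classic metrics
# from wall-clock timings.
def run_benchmark(conn, queries, repetitions=100):
    latencies = []
    start = time.perf_counter()
    for _ in range(repetitions):
        for q in queries:
            t0 = time.perf_counter()
            conn.execute(q).fetchall()
            latencies.append(time.perf_counter() - t0)
    elapsed = time.perf_counter() - start
    return {"throughput_qps": len(latencies) / elapsed,    # jobs per time unit
            "mean_response_s": statistics.mean(latencies)} # time per job unit

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE account (id INTEGER PRIMARY KEY, balance REAL)")
conn.executemany("INSERT INTO account VALUES (?, ?)",
                 ((i, 100.0) for i in range(1_000)))
print(run_benchmark(conn, ["SELECT SUM(balance) FROM account",
                           "SELECT * FROM account WHERE id = 42"]))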
A workload is the core of a benchmark. A
workload is the amount of work assigned to
or performed by a worker or unit of workers
in a given time period. A workload is best
described by analyzing the amount of work,
the rate at which the work is created, and the
characteristics, distribution, and content of
the work, as shown in Figure 1. To analyze a
workload is to characterize a workload from
four main aspects. They are the schema
aspect, the operation aspect, the control
aspect, and the system aspect (Ferrari, 1978;
Turbyfill, 1987; Highleyman, 1989; Yu et al.,
1992; Gray, 1993; Carey et al., 1993; TPC, 2001).
These aspects are based on two main kinds of
variables that are required to design a
benchmark. They are the experimental
factors and performance metrics.
Experimental factors, also called the
explanatory variables, predictors, or
independent variables, affect
system performance. Examples are the
size of database, the complexity of query, and
the number of users. Performance metrics
are called the response variables, indicators,
or dependent variables that will be affected
by the experimental factors. Examples are
the through-put, the response time, and the
price/performance ratio.
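Under stated assumptions, the sketch below makes the factor/metric split concrete: database size and the number of matching rows act as the experimental factors, crossed in a full factorial design, and the observed response time is the dependent variable. The factor levels and the single test query are arbitrary illustrations.

import itertools
import sqlite3
import time

db_sizes = [1_000, 10_000]   # factor 1: size of database
matches = [1, 100]           # factor 2: rows satisfying the query predicate

for size, match in itertools.product(db_sizes, matches):
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE t (k INTEGER, v TEXT)")
    conn.executemany("INSERT INTO t VALUES (?, ?)",
                     ((i % (size // match), "x" * 50) for i in range(size)))
    t0 = time.perf_counter()
    conn.execute("SELECT COUNT(*) FROM t WHERE k = 0").fetchall()
    print(f"size={size:>6} matching_rows={match:>4} "
          f"response={time.perf_counter() - t0:.6f}s")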
A workload-centric and analytic
framework is used based on the above
benchmark analysis aspects. Specifically,
the schema analysis examines the database
size, the data type, the data distribution, the
data relationship, and the indexing type in
RDBMS and OODBMS/ORDBMS.
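A hedged sketch of how such a schema analysis might be recorded follows: the record's fields mirror the five dimensions just named, and the example instance summarizes the Wisconsin-style relation sketched in Section 1. The field names and values are illustrative assumptions, not part of any published analysis framework.

from dataclasses import dataclass

# The five schema-analysis dimensions named in the text, captured as a
# record so that benchmarks can be compared dimension by dimension.
@dataclass
class SchemaProfile:
    database_size: str       # e.g. cardinality and tuple width
    data_types: str          # attribute domains used
    data_distribution: str   # uniform, skewed, ...
    data_relationships: str  # flat, hierarchical, graph-shaped
    indexing: str            # e.g. clustered vs non-clustered, B-tree, hash

wisconsin_style = SchemaProfile(
    database_size="10,000 tuples per relation, fixed tuple width",
    data_types="integers and fixed-width strings",
    data_distribution="uniform, with exactly controllable selectivity",
    data_relationships="flat relational tables, no object nesting",
    indexing="optional clustered index on the sequential key",
)
print(wisconsin_style)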
[ 517 ]
Jia-Lang Seng
A study on industry and
synthetic standard
benchmarks in relational and
object databases
Industrial Management &
Data Systems
103/7 [2003] 516-532
