A review of data mining techniques

Published date: 01 February 2001
DOI: https://doi.org/10.1108/02635570110365989
Pages: 41-46
Authors: Sang Jun Lee, Keng Siau
Subject matter: Economics, Information & knowledge management, Management science & operations
Sang Jun Lee
University of Nebraska-Lincoln, Lincoln, Nebraska, USA
Keng Siau
University of Nebraska-Lincoln, Lincoln, Nebraska, USA
Introduction
The technologies for generating and
collecting data have been advancing rapidly.
At the current stage, lack of data is no longer
a problem; the inability to generate useful
information from data is! The explosive
growth in data and databases has created a
need for new technologies and tools that
process data into useful information and
knowledge intelligently and automatically.
Data mining (DM), therefore, has become a
research area with increasing importance
(Weiss and Indurkhya, 1998; Technology
Forecast, 1997; Fayyad et al., 1996; Piatetsky-
Shapiro and Frawley, 1991).
DM is the search for valuable information
in large volumes of data (Weiss and
Indurkhya, 1998). It is the process of
nontrivial extraction of implicit, previously
unknown and potentially useful information
such as knowledge rules, constraints, and
regularities from data stored in repositories
using pattern recognition technologies as
well as statistical and mathematical
techniques (Technology Forecast, 1997;
Piatetsky-Shapiro and Frawley, 1991). Many
companies have recognized DM as an
important technique that will affect their
performance.
DM is an active research area, and research
is ongoing to bring statistical analysis and
artificial intelligence (AI) techniques
together to address these issues.
Current trends in data mining
Just five years ago, only 50 researchers took
part in the knowledge discovery and data
mining conference workshop. Today,
however, Knowledge Discovery Nuggets, the
well-known monthly electronic newsletter by
Gregory Piatetsky-Shapiro, has more than
4,000 readers. Moreover, data mining
continues to attract more and more attention
in the business and scientific communities.
In a 1997 report, the Stamford, Connecticut-based
Gartner Group stated: "Data mining and
artificial intelligence are at the top of the five
key technology areas that will clearly have a
major impact across a wide range of
industries within the next three to five
years." Many companies currently use
computers to capture details of business
transactions such as banking and credit card
records, retail sales, manufacturing
warranty, telecommunications, and myriad
other transactions. Data mining tools are
then used to uncover useful patterns and
relationships from the data captured.
Currently, data mining techniques, tools,
and research are expanding into
various fields. For example, one DM tool,
an intelligent text-mining system, extracts text
fragments relevant to user queries,
automatically creates and processes a series
of new queries, and assembles a new text.
The output enables the user to see new
aspects of a given theme. This tool is a rule-
based system using complex heuristics.
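The query-expansion loop described above can be sketched roughly as follows. This is only an illustration: the paper does not state the system's rules or heuristics, so the two rules used here (term overlap as the relevance test, the most frequent new terms as follow-up queries), the stopword list, and every function name are assumptions made for the example.

```python
import re
from collections import Counter

# Hypothetical stopword list; a real system would use a far richer one.
STOPWORDS = {"the", "a", "an", "of", "and", "to", "in", "is", "are", "for", "on", "with"}


def tokenize(text):
    """Return lowercase word tokens with stopwords removed."""
    return [w for w in re.findall(r"[a-z]+", text.lower()) if w not in STOPWORDS]


def relevant_fragments(query, fragments):
    """Assumed rule 1: a fragment is relevant if it shares a term with the query."""
    terms = set(tokenize(query))
    return [f for f in fragments if terms & set(tokenize(f))]


def derive_queries(query, hits, limit=2):
    """Assumed rule 2: the most frequent new terms in the hits become new queries.

    Ties are broken alphabetically so the result is deterministic.
    """
    seen = set(tokenize(query))
    counts = Counter(w for f in hits for w in tokenize(f) if w not in seen)
    return sorted(counts, key=lambda term: (-counts[term], term))[:limit]


def assemble(query, fragments, rounds=1):
    """Gather fragments for the query and its derived queries, then join them."""
    collected, pending, asked = [], [query], set()
    for _ in range(rounds + 1):
        next_pending = []
        for q in pending:
            if q in asked:
                continue
            asked.add(q)
            hits = relevant_fragments(q, fragments)
            for fragment in hits:
                if fragment not in collected:
                    collected.append(fragment)
            next_pending.extend(derive_queries(q, hits))
        pending = next_pending
    return " ".join(collected)


# Illustrative input only; not data from the paper.
fragments = [
    "Mining sales data reveals patterns.",
    "Sales data are stored in a warehouse.",
    "Genetic algorithms evolve candidate solutions.",
]
print(assemble("mining", fragments))
```

With `rounds=0` only fragments matching the original query are returned; each extra round lets derived queries pull in related fragments, which is one simple way the assembled text could surface new aspects of a theme.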
Data warehousing is one of the most
important research areas related to DM. A
data warehouse is a read-only database
developed for analyzing business situations
and supporting decision makers. The data
warehouse holds large volumes of subject-
oriented data, in which all levels of an
organization can find information in a
timely manner. DM goes hand in hand with
data warehousing, which is necessary to
organize historical information gathered
from large-scale client/server-based
applications. In other words, DM can add
value to the information assets of
organizations in different sectors through
effective induction of large corporate data
warehouses into a client-server. Therefore,
developing an advanced client-server
induction system capable of supporting
efficient and effective data mining of large
The current issue and full text archive of this journal is available
at http://www.emerald-library.com/ft
Industrial Management & Data Systems
101/1 [2001] 41-46
© MCB University Press
[ISSN 0263-5577]
Keywords
Data mining,
Artificial intelligence, Algorithms,
Decision trees
Abstract
Terabytes of data are generated
every day in many organizations.
To extract hidden predictive
information from large volumes of
data, data mining (DM)
techniques are needed.
Organizations are starting to
realize the importance of data
mining in their strategic planning,
and successful application of DM
techniques can bring an enormous
payoff for the organizations. This
paper discusses the requirements
and challenges of DM, and
describes major DM techniques
such as statistics, artificial
intelligence, decision tree
approach, genetic algorithm, and
visualization.
