Behaviour analysis of internet survey completion using decision trees. An exploratory study

DOI: https://doi.org/10.1108/14684520910944427
Pages: 117-134
Published date: 20 February 2009
Authors: Che‐Chern Lin, Hung‐Jen Yang, Lung‐Hsing Kuo
Subject matter: Information & knowledge management, Library & information science
Che-Chern Lin, Hung-Jen Yang and Lung-Hsing Kuo
National Kaohsiung Normal University, Kaohsiung City, Taiwan
Abstract
Purpose – The purpose of this paper is to explore teachers’ behaviours in completing an internet
survey using decision trees. Furthermore, to reduce the complexity of the decision trees, a statistical
technique was used to decrease the number of input variables in the decision trees.
Design/methodology/approach – A dataset of 47,647 samples was used to build the decision trees.
These samples were collected from an internet survey of teachers in Taiwan. The output of the decision
trees was the answering time (the time taken to complete the internet questionnaire). Eight variables
were selected as the inputs for the decision trees. Two techniques were employed to build the decision
trees – the exhaustive chi-squared automatic interaction detector (ECHAID) and classification and
regression tree (CRT) analysis. To reduce the complexity of the decision models, factor analysis
was used to decrease the data dimensions (number of input variables) and to obtain a
simplified decision model. One-way ANOVA was used to validate the effects of the dimension reduction.
Findings – From the results of the factor analysis, a simplified decision tree is recommended using
four input variables – teaching years, school level, sex and area. The classification accuracy of the
simplified model is statistically equivalent to that of the original one, which used eight input variables.
Originality/value – The complexity of decision trees theoretically depends on the number of input
variables. This study used a statistical technique to decrease the number of input variables and
thereby reduce the complexity of the decision trees. A statistical technique was employed to validate
that the classification accuracy is not statistically different between the original decision model and the
simplified one. The decision models proposed in this paper can be applied in estimating the answering
time for completing a questionnaire during an internet survey.
Keywords Surveys, Internet, Teachers, Individual behaviour, Decision trees, Taiwan
Paper type Case study
Introduction
Owing to the rapid growth of network bandwidth and web-based applications,
internet surveys are becoming increasingly popular. An internet survey lets
respondents answer a questionnaire anytime and anywhere.
There are two main methodologies for implementing internet surveying systems. The
first is to develop the surveying system using a customised web application
implemented by internet programming languages such as ASP.NET, PHP and JSP. The
second is to perform internet surveys using existing software packages. PHPSurveyor
(now LimeSurvey – www.phpsurveyor.org) is one of the most popular tools used for
online surveying. It is powerful, user-friendly and, most importantly, free of charge. It
also provides an authoring and editing platform for questionnaire designers.
The current issue and full text archive of this journal is available at
www.emeraldinsight.com/1468-4527.htm
Refereed article received 23 January 2008; approved for publication 1 July 2008
Online Information Review, Vol. 33 No. 1, 2009, pp. 117-134
© Emerald Group Publishing Limited, 1468-4527
DOI 10.1108/14684520910944427
In this study, we used PHPSurveyor to conduct our internet survey and employed
an existing web site, one aimed at providing continuing education for teachers in
Taiwan (referred to as the In-Service Website in this paper), to perform user
identification and session control.
Built in 2003, the In-Service Website is an information portal providing in-service
educational course information for teachers in Taiwan. It is equipped with a search
engine for users to find appropriate courses to meet their demands. In addition, the web
site provides an area for users to share learning materials and issues e-journals to
subscribers quarterly. The web site is located at National Kaohsiung Normal University
(NKNU) and is financially supported by the Ministry of Education (MOE) of Taiwan.
In 2006 the MOE delegated to NKNU a research project focused on surveying the
in-service education demands of teachers in Taiwan based on two different viewpoints:
(1) personal demands from teachers; and
(2) administrative demands from schools.
Two types of questionnaires were designed to collect data from the two viewpoints:
(1) a personal questionnaire for teachers; and
(2) a school administrative questionnaire to be answered by schools’ principals or
academic supervisors.
In the study presented here, we used the 47,647 teacher personal questionnaires as our
samples. The internet survey was conducted from 24 February 2006 to 19 March 2006.
The goal of this study was to build decision models, using decision trees, that
classify the time teachers take to complete an internet survey. Since the
complexity of a decision tree depends on the number of input variables, we used
factor analysis, a statistical method, to reduce the number of
variables and thereby simplify the structures of the decision trees.
We used answering time (the time interval taken to answer the internet
questionnaire) as the dependent variable. We divided the values of the dependent
variable into three sub-ranges according to the length of the answering time: short,
medium and long. A total of eight background attributes were used as independent
variables:
(1) age;
(2) sex;
(3) number of teaching years;
(4) school level;
(5) school type;
(6) teacher training type;
(7) area; and
(8) level of education.
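The discretisation of the dependent variable described above can be sketched as follows. This is a minimal illustration only: the cut points between short, medium and long are not stated in this part of the paper, so the sketch assumes tertiles of the observed times, and the function name `make_time_binner` and the sample times are hypothetical.

```python
from statistics import quantiles


def make_time_binner(times_seconds):
    """Return a function mapping an answering time to 'short', 'medium' or 'long'.

    Assumption: the two cut points are the tertiles of the observed times.
    The study's actual thresholds are not given in this excerpt.
    """
    low_cut, high_cut = quantiles(times_seconds, n=3)

    def label(t):
        if t <= low_cut:
            return "short"
        if t <= high_cut:
            return "medium"
        return "long"

    return label


# Hypothetical answering times in seconds (not data from the survey).
sample_times = [120, 150, 180, 240, 300, 360, 480, 600, 900]
label = make_time_binner(sample_times)
labels = [label(t) for t in sample_times]
```

With tertile cut points each class receives roughly a third of the respondents, which keeps the three target classes balanced for the tree-building step.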
Two decision tree techniques were applied to perform the classifications:
(1) Exhaustive chi-squared automatic interaction detector (ECHAID); and
(2) Classification and regression tree (CRT) analysis.
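The two families of techniques differ mainly in how they score a candidate split: CHAID-type methods rely on a chi-squared test of association between a predictor and the target class, whereas CRT typically uses an impurity measure such as the Gini index. The sketch below computes both scores for one categorical predictor. It is an illustration under stated assumptions, not the procedure used in the study: the category-merging step of exhaustive CHAID is omitted, and the function names and the tiny dataset are hypothetical.

```python
from collections import Counter


def chi_squared(predictor, target):
    """Pearson chi-squared statistic for two categorical sequences."""
    n = len(target)
    row_totals = Counter(predictor)
    col_totals = Counter(target)
    observed = Counter(zip(predictor, target))
    stat = 0.0
    for r, r_count in row_totals.items():
        for c, c_count in col_totals.items():
            expected = r_count * c_count / n
            stat += (observed[(r, c)] - expected) ** 2 / expected
    return stat


def gini(labels):
    """Gini impurity of a list of class labels (0.0 for an empty list)."""
    n = len(labels)
    if n == 0:
        return 0.0
    return 1.0 - sum((k / n) ** 2 for k in Counter(labels).values())


def gini_gain(predictor, target, left_values):
    """Impurity decrease from a binary split: predictor in left_values vs. the rest."""
    left = [t for p, t in zip(predictor, target) if p in left_values]
    right = [t for p, t in zip(predictor, target) if p not in left_values]
    n = len(target)
    weighted = len(left) / n * gini(left) + len(right) / n * gini(right)
    return gini(target) - weighted


# Hypothetical records: school level and answering-time class (not survey data).
school_level = ["primary"] * 4 + ["secondary"] * 4
time_class = ["short", "short", "short", "long", "long", "long", "long", "short"]

chi2_score = chi_squared(school_level, time_class)
gain_score = gini_gain(school_level, time_class, {"primary"})
```

In a full tree builder, each of the eight input variables would be scored this way at every node and the best-scoring variable chosen for the split.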