Weblogs for market research: finding more relevant opinion documents using system fusion


Published date	25 September 2009
Pages	873-888
DOI	https://doi.org/10.1108/14684520911001882
Author	Deanna Osman, John Yearwood, Peter Vamplew
Subject Matter	Information & knowledge management, Library & information science


Deanna Osman, John Yearwood and Peter Vamplew

School of Information Technology and Mathematical Sciences,

University of Ballarat, Ballarat, Australia

Abstract

Purpose – The purpose of this paper is to examine the usefulness of fusion as a means of improving

the precision of automated opinion detection.

Design/methodology/approach – Five system fusion methods are proposed and tested using runs

submitted by the Text REtrieval Conference (TREC) Blog06 participants as input. The methods

include a voting method, an inverse rank method (IRM), a linear-normalised score method and two

weighted methods that use a weighted IRM score to rank the document.

Findings – Mean average precision (MAP) is used as an indicator of the performance of the runs in

this study. The best system fusion method achieves a 55.5 percent higher MAP result compared with

the highest MAP result of any individual run submitted by the Blog06 participants. This equates to an

increase in detection of 2,398 relevant opinion documents (21 percent).

Practical implications – System fusion can be used to improve upon the results achieved by

existing individual opinion detection systems. Alternatively, multiple opinion detection

approaches can be combined into one system and fusion used to combine the results to build in

diversity. Diversity within fusion inputs can increase the improvements achieved by fusion methods.

The improved output from a diverse opinion detection system will then contain a higher number of

relevant documents and reduce the incidence of high-ranking non-relevant documents and

low-ranking relevant documents.

Originality/value – The fusion methods proposed in this study demonstrate that simple fusion of

opinion detection systems can improve performance.

Keywords Market research, Internet, Communication technologies

Paper type Technical paper

Introduction

The number of people regularly accessing the internet is reported to have grown by

244.7 percent worldwide between 2000 and 2007 (Internet World Stats, 2007). One area

recording a high level of growth on the internet is weblogs (blogs). In December 2007 a

blog tracking company, Technorati (2007), reported that it was monitoring 112.8

million blogs worldwide, up from 4.2 million in October 2004 (Rosenbloom, 2004).

Along with the growth in the number of blogs on the internet, there is a growth in

interest in the content of blogs, particularly opinions within blogs. The majority of blog

authors surveyed by Lenhart and Fox (2006) indicated that their reason for blogging is

to share their knowledge, skills and life experiences. Often bloggers will express their

opinions about products, events and people affecting their lives.

These unsolicited opinions could prove invaluable for market research by organisations that wish to gauge reactions to products and services. For example, negative opinions about a competitor’s product may provide a competitive edge for a new design, or governments could search blogs for data in qualitative research regarding new policies or upcoming elections. Small businesses, which do not have a large “market research” budget, could gain access to millions of people who potentially have an opinion relating to them.

Refereed article received 23 August 2008; approved for publication 20 December 2008.

Online Information Review, Vol. 33 No. 5, 2009, pp. 873-888. © Emerald Group Publishing Limited, ISSN 1468-4527.

Most users will not read all documents returned by a search engine. Jansen et al. (2000)

found that 58 percent of users do not read more than the first page of a list of relevant

documents. Therefore, the aim of this research is not only to create a list of documents

with a higher proportion of relevant opinion documents, but also to float the documents

with relevant opinions to the top of the list and the remaining documents to the bottom of

the list. The resulting list will include a higher number of documents expressing a

relevant opinion on the topic, which can then be used as a list for a search engine or as

input into an automated opinion analysis system. An automated analysis system would

require a set of documents with a high proportion of relevant opinion documents to enable

the system to quantify positive and negative opinions toward the topic.

In 2005 and 2006, the Text REtrieval Conference (TREC) created a blog document

collection (Blog06), comprising 3.2 million blog posts and comments. The tasks in 2006 for

the Blog06 collection included an “Opinion Retrieval Task” where participants retrieved

blogs expressing an opinion on each of 50 given topics. Participants could submit up to

five runs (a “run” is a ranked list of relevant documents submitted to TREC by the

participants), which included retrieved documents expressing an opinion on a given topic

(Ounis et al., 2006). A total of 56 runs consisting of the top 1,000 opinion-bearing

documents for each topic (TREC provided the participants with 50 topics for this task)

were submitted by the 14 Blog06 participants. Of these, the top 100 documents from 27

runs were combined with the top ten documents from the remainder of the runs to create a

list of blog documents to be assessed by TREC assessors (Ounis et al., 2006). The runs

submitted in 2006 were used as input runs for the study discussed in this report.
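The pooling step described above can be sketched for a single topic as follows. This is a minimal illustration, not the TREC procedure itself: the function name `build_assessment_pool`, the run identifiers and the toy depths are hypothetical, how the 27 deeper runs were chosen is not stated here, and removing duplicate documents across runs is an assumption.

```python
from typing import Dict, List, Set


def build_assessment_pool(
    runs: Dict[str, List[str]],
    deep_runs: Set[str],
    deep_depth: int = 100,
    shallow_depth: int = 10,
) -> Set[str]:
    """Union of the top-ranked documents from each run for one topic.

    Runs named in `deep_runs` contribute their top `deep_depth` documents;
    every other run contributes its top `shallow_depth` documents.
    Using a set removes documents retrieved by more than one run (assumed).
    """
    pool: Set[str] = set()
    for run_id, ranked_docs in runs.items():
        depth = deep_depth if run_id in deep_runs else shallow_depth
        pool.update(ranked_docs[:depth])
    return pool


# Toy example with tiny depths so the effect is visible.
toy_runs = {
    "runA": ["d1", "d2", "d3"],
    "runB": ["d4", "d2", "d5"],
}
toy_pool = build_assessment_pool(
    toy_runs, deep_runs={"runA"}, deep_depth=2, shallow_depth=1
)
print(sorted(toy_pool))
```

With the default depths of 100 and 10, the same function would mirror the figures quoted in the text, under the assumptions listed above.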

Once the assessments were available, mean average precision (MAP) was calculated

for each run (NIST, 2005), measuring the precision of retrieval of documents relevant to

the given topic (irrespective of whether an opinion existed on the topic within the

document) and the retrieval of documents expressing an opinion on the given topic. MAP

has been previously used to measure relevance in TREC corpora; however, in Blog06

there were two measures: relevant documents and relevant opinion documents. The

MAP results were published in Ounis et al. (2006). MAP is a standard reporting method

of TREC corpora. Precision (P) is calculated using formula (1a) and formula (1b)

calculates average precision (AP), where N is the total number of documents in the run

for the topic – in the Blog06 corpus, there was a maximum of 1,000 documents in each

run for each topic. MAP is the mean of the AP values for 50 topics:

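\[
R_i =
\begin{cases}
0 & \text{if document } i \text{ is not relevant} \\
1 & \text{if document } i \text{ is relevant}
\end{cases}
\]

\[
P_i = \frac{\sum_{j=1}^{i} R_j}{i} \tag{1a}
\]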
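As a rough illustration of how these measures fit together, the sketch below computes precision at each rank as in formula (1a), then AP and MAP. Formula (1b) is not reproduced in this excerpt, so the AP function assumes the commonly used TREC-style definition (the sum of the precision values at the ranks of relevant documents, divided by the total number of relevant documents for the topic); the paper's own normalisation may differ. All function names and the toy relevance lists are hypothetical.

```python
from typing import Dict, List, Sequence


def precision_at_each_rank(relevance: Sequence[int]) -> List[float]:
    """P_i = (R_1 + ... + R_i) / i for i = 1..N, as in formula (1a)."""
    precisions: List[float] = []
    relevant_so_far = 0
    for i, r in enumerate(relevance, start=1):
        relevant_so_far += r
        precisions.append(relevant_so_far / i)
    return precisions


def average_precision(relevance: Sequence[int], total_relevant: int) -> float:
    """Assumed TREC-style AP: sum of P_i over relevant ranks / total relevant."""
    if total_relevant == 0:
        return 0.0
    precisions = precision_at_each_rank(relevance)
    return sum(p for p, r in zip(precisions, relevance) if r) / total_relevant


def mean_average_precision(
    relevance_by_topic: Dict[str, Sequence[int]],
    total_relevant_by_topic: Dict[str, int],
) -> float:
    """MAP is the mean of the per-topic AP values."""
    aps = [
        average_precision(rel, total_relevant_by_topic[topic])
        for topic, rel in relevance_by_topic.items()
    ]
    return sum(aps) / len(aps) if aps else 0.0


# Toy data: R_i values for the ranked documents of two made-up topics.
toy_relevance = {"topicA": [1, 0, 1, 0], "topicB": [0, 1]}
toy_totals = {"topicA": 2, "topicB": 1}
toy_map = mean_average_precision(toy_relevance, toy_totals)
print(round(toy_map, 4))
```

For the toy data, topicA has AP = (1 + 2/3) / 2 and topicB has AP = 0.5, so the printed MAP is their mean. In the Blog06 setting each relevance list would hold up to 1,000 entries and the mean would be taken over 50 topics.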

