RANKuser. A folksonomy and user profile based algorithm to identify experts in Community Question Answering sites

Date	02 July 2018
Pages	329-350
Published date	02 July 2018
DOI	https://doi.org/10.1108/DTA-10-2017-0080
Author	Abhishek Kumar Singh,Naresh Kumar Nagwani,Sudhakar Pandey
Subject Matter	Library & information science,Librarianship/library management,Library technology,Information behaviour & retrieval,Metadata,Information & knowledge management,Information & communications technology,Internet

RANKuser

A folksonomy and user profile based algorithm to identify experts in Community Question Answering sites

Abhishek Kumar Singh, Naresh Kumar Nagwani and Sudhakar Pandey

Department of Computer Science and Engineering, National Institute of Technology, Raipur, India

Abstract

Purpose – With the high volume of users and user-generated content on Community Question Answering (CQA) sites, the quality of the answers provided by users has become a major concern. One way to address this problem is to find expert users, i.e. suitable answerers who can provide high-quality, relevant answers. The purpose of this paper is to find the expert users for the newly posted questions of CQA sites.

Design/methodology/approach – In this paper, a new algorithm, RANKuser, is proposed for identifying the expert users of CQA sites. The proposed RANKuser algorithm consists of three major stages. In the first stage, the folksonomy relation between users, tags, and queries is established. User profile attributes, namely reputation, tags, and badges, are also considered in the folksonomy. In the second stage, the expertise scores of the users are calculated based on reputation, badges, and tags. Finally, in the third stage, the expert users are identified by extracting the top N users based on expertise score.

Findings – With the help of the proposed ranking algorithm, expert users are identified for newly posted questions. The proposed user ranking algorithm (RANKuser) is also compared with existing ranking algorithms, namely ML-KNN, rankSVM, LDA, STM, CQARank, and the EV-based model, using performance measures such as hamming loss, accuracy, average precision, one error, F-measure, and normalized discounted cumulative gain. The proposed ranking method is also compared with the original ranking of CQA sites using the paired t-test. The experimental results demonstrate the effectiveness of the proposed RANKuser algorithm in comparison with the existing ranking algorithms.

Originality/value – This paper proposes and implements a new algorithm for expert user identification in CQA sites. The algorithm identifies experts by utilizing the folksonomy of CQA sites together with user profile information.

Keywords Folksonomy, Online social networking, Community Question Answering, Expert identification, Ranking algorithms, Social text mining

Paper type Research paper

1. Introduction

Community Question Answering (CQA) systems have become popular in recent times. Some popular CQA systems are Yahoo Answers (Answers.yahoo.com, 2017), StackOverflow (StackOverflow, 2017), and Quora (Quora, 2017). The main objective of such communities is to contribute high-quality answers (Qu et al., 2009) and offer a wide variety of solutions or explanations. Expert finding is a crucial problem in CQA sites (Zhou et al., 2015). A main problem of CQA sites is the low involvement rate of users, for which there are two reasons. First, most users are not willing to answer questions, and second, users are not able to find the questions related to their expertise or interest (Riahi et al., 2012). The essential problem of expert finding is to choose the appropriate users for answering the questions, which has attracted considerable attention from researchers (Zhou et al., 2015; Riahi et al., 2012; Xu et al., 2012; Yang et al., 2013; Pal et al., 2012; Yang and Manandhar, 2014; Lin et al., 2017).

In CQA sites, the asker who posts a query needs to wait for other users' replies, which is a time-consuming process. Sometimes the replies to the questions may also be incorrect or irrelevant.

Data Technologies and Applications, Vol. 52 No. 3, 2018, pp. 329-350
ISSN 2514-9288
DOI 10.1108/DTA-10-2017-0080

Received 31 October 2017; Revised 15 January 2018; Accepted 9 February 2018

The current issue and full text archive of this journal is available on Emerald Insight at: www.emeraldinsight.com/2514-9288.htm


The main reason behind this is that the questions of a particular subject or area are not properly linked to the relevant expert users. Therefore, it is necessary to identify the expert users so that questions and experts can be properly linked. An essential part of the proposed work is to identify the expert users based on the folksonomy and user profiles.

Feature-based approaches for expert identification have been applied by several researchers. A number of existing studies have focused on the problem of user ranking, but none has considered folksonomy and user profiles for identifying experts for newly posted questions. The main difference between existing research and the proposed work is the use of tagging data and user profile attributes such as reputation and badges. Folksonomy involves tags, which do not require much pre-processing. In this work, the folksonomy is developed on CQA site attributes (Godoy et al., 2014; Min et al., 2012) and adopts the user profile information for expert identification. Given the large number of users and questions on CQA sites, folksonomy can be efficiently utilized for the expert identification task (Godoy et al., 2014; Min et al., 2012). A folksonomy is represented as a tuple $F = \langle U, P, T, Y \rangle$, where $U$ is the set of users, $P$ is the set of posts, $T$ is the set of tags, and $Y$ is a relation between users, tags, and questions. A question asked by a user on a CQA site consists of a title, a body, and tags. The body contains the query by the user, followed by the number of answers. Tags are popular on these sites. A tag provides metadata applied to the body of the question. Users add tags to a question so that other users can easily find, identify, and bookmark various questions (Rawashdeh et al., 2013; Schuster et al., 2013). In CQA sites, users can earn reputation and badges depending upon their contributions. Users can also earn reputation through various activities such as likes, dislikes, accepted answers, and answers. In CQA sites, the reputation score recognizes the contributor's expertise (Bosu et al., 2013). The work carried out in this research uses the database of one of the well-known CQA websites, StackOverflow (StackOverflow, 2017). In StackOverflow, users receive badges as a reward for their actions. Users can earn gold, silver, or bronze badges. Gold badges are hard to receive, while bronze badges are very easy to receive in StackOverflow.
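As an illustration only, the folksonomy tuple (U, P, T, Y) described above could be held in memory roughly as follows. This is a minimal sketch, not the authors' implementation: the class name `Folksonomy`, its methods, and the sample users and tags are all hypothetical, and Y is assumed here to be a plain set of (user, tag, post) triples.

```python
from collections import defaultdict
from dataclasses import dataclass, field


@dataclass
class Folksonomy:
    """F = (U, P, T, Y): users, posts, tags, and the relation Y between them."""
    users: set = field(default_factory=set)
    posts: set = field(default_factory=set)
    tags: set = field(default_factory=set)
    relation: set = field(default_factory=set)  # Y as (user, tag, post) triples (assumed)

    def annotate(self, user, tag, post):
        """Record that `user` is linked to `tag` on `post`."""
        self.users.add(user)
        self.posts.add(post)
        self.tags.add(tag)
        self.relation.add((user, tag, post))

    def tag_counts(self, user):
        """Count how often each tag occurs among the posts linked to `user`."""
        counts = defaultdict(int)
        for u, t, _ in self.relation:
            if u == user:
                counts[t] += 1
        return dict(counts)

    def users_for_tags(self, query_tags):
        """Return users linked to at least one tag of a newly posted question."""
        wanted = set(query_tags)
        return {u for u, t, _ in self.relation if t in wanted}


# Hypothetical sample data
f = Folksonomy()
f.annotate("alice", "python", "q1")
f.annotate("alice", "pandas", "q1")
f.annotate("alice", "python", "q2")
f.annotate("bob", "java", "q3")
```

With this layout, `f.users_for_tags(["python"])` yields the candidate answerers for a new question tagged `python`, and `f.tag_counts("alice")` gives the per-tag counts that a tag-based score could draw on.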

The proposed algorithm (RANKuser) uses folksonomy and user profile attributes to calculate user scores in order to identify expert users. A user's score is calculated in three parts, namely reputation score, tag-count score, and badge-count score. The reputation score of a user is calculated using the reputation earned by the user. The tag-count score is calculated using the tags used by the user. The badge-count score is calculated using the badges earned by the user. The performance of the proposed algorithm (RANKuser) is compared with ML-KNN, rankSVM (Zhang and Zhou, 2007), Latent Dirichlet Allocation (LDA), STM (Riahi et al., 2012), CQARank (Yang et al., 2013), and the EV-based model (Pal et al., 2012). Results are compared using six performance metrics, namely hamming loss, accuracy, average precision, one error, F-measure, and normalized discounted cumulative gain (nDCG). The proposed method is also compared with the original ranking of the CQA sites using a paired t-test. Hamming loss is the ratio of incorrectly predicted labels to the total number of labels. Accuracy here means subset accuracy, i.e. the set of predicted labels for a sample must exactly match the corresponding set of actual labels. Average precision is computed from the prediction scores and corresponds to the area under the precision-recall curve. One error evaluates how many times the top-ranked label is not in the set of proper labels (Sadiq and Helen, 2016). F-measure is a method to test the accuracy of the model (Pal et al., 2012). nDCG shows the usefulness of the user in the ranked users list (Scikit-learn.org, 2017). The paired t-test is a method for comparing the means of two populations where observations in one sample can be paired with observations in the other sample (Miyamoto et al., 2017).
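To make the three-part score and two of the metrics concrete, a rough Python sketch follows. It is an illustration under stated assumptions, not the RANKuser formula: this section does not give the exact way the reputation, tag-count, and badge-count scores are computed or combined, so the max-normalization, the equal weighting, and the equal treatment of gold, silver, and bronze badges are all assumptions. The function names, profile layout, and sample values are hypothetical as well. Hamming loss and one error follow the definitions given above.

```python
def normalize(values):
    """Scale a dict of non-negative numbers to [0, 1] by its maximum."""
    top = max(values.values(), default=0)
    return {k: (v / top if top else 0.0) for k, v in values.items()}


def rank_users(profiles, query_tags, n):
    """Return the top-n users by reputation + tag-count + badge-count score.

    ASSUMPTION: equal weights and max-normalization; the paper's actual
    formula is not given in this section.
    """
    rep = normalize({u: p["reputation"] for u, p in profiles.items()})
    tag = normalize({u: sum(p["tags"].get(t, 0) for t in query_tags)
                     for u, p in profiles.items()})
    badge = normalize({u: sum(p["badges"].values()) for u, p in profiles.items()})
    score = {u: rep[u] + tag[u] + badge[u] for u in profiles}
    # Sort by descending score; break ties alphabetically for determinism.
    return sorted(score, key=lambda u: (-score[u], u))[:n]


def hamming_loss(true_sets, pred_sets, labels):
    """Fraction of (sample, label) pairs whose membership is predicted wrongly."""
    wrong = sum(len(t ^ p) for t, p in zip(true_sets, pred_sets))
    return wrong / (len(true_sets) * len(labels))


def one_error(true_sets, rankings):
    """Fraction of samples whose top-ranked label is not a true label."""
    misses = sum(1 for t, r in zip(true_sets, rankings) if r[0] not in t)
    return misses / len(true_sets)


# Hypothetical profiles: reputation, per-tag counts, badge counts
profiles = {
    "alice": {"reputation": 1000, "tags": {"python": 40, "pandas": 10},
              "badges": {"gold": 1, "silver": 4, "bronze": 5}},
    "bob": {"reputation": 500, "tags": {"java": 30},
            "badges": {"gold": 0, "silver": 2, "bronze": 3}},
    "carol": {"reputation": 300, "tags": {"python": 20},
              "badges": {"gold": 0, "silver": 1, "bronze": 2}},
}
top2 = rank_users(profiles, ["python"], 2)
```

In this toy data, `carol` ranks above `bob` for a `python` question despite a lower reputation, because `bob` has no activity under that tag. A weighted combination, or different weights per badge class, would be equally plausible design choices.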
