Disambiguating USPTO inventor names with semantic fingerprinting and DBSCAN clustering

Published date	01 April 2019
DOI	https://doi.org/10.1108/EL-12-2018-0232
Pages	225-239
Author	Hongqi Han, Yongsheng Yu, Lijun Wang, Xiaorui Zhai, Yaxin Ran, Jingpeng Han
Subject Matter	Information & knowledge management

Hongqi Han

Data Mining Group, Institute of Scientific and Technical Information of China, Haidian-qu, China and Key Laboratory of Rich-media Knowledge Organization and Service of Digital Publishing Content, SAPPRFT, Beijing, China

Yongsheng Yu

Institute of Scientific and Technical Information of China, Haidian-qu, China

Lijun Wang, Xiaorui Zhai and Yaxin Ran

Data Mining Group, Institute of Scientific and Technical Information of China, Haidian-qu, China

Jingpeng Han

Beijing University of Technology, Beijing, China

Abstract

Purpose – The aim of this study is to present a novel approach to inventor name disambiguation based on semantic fingerprinting, which converts inventor records into 128-bit semantic fingerprints, and a clustering algorithm called density-based spatial clustering of applications with noise (DBSCAN). Inventor disambiguation is the task of discovering the unique set of underlying inventors and mapping a set of patents to their corresponding inventors. Resolving the ambiguities between inventors is necessary to improve the quality of the patent database and to ensure accurate entity-level analysis. Most existing methods are based on machine learning and, while they often show good performance, this comes at the cost of time, computational power and storage space.

Design/methodology/approach – Using semantic fingerprinting, the meta and textual data in inventor records are converted into 128-bit semantic fingerprints. Rather than using a string comparison or cosine similarity to calculate the distance between pair-wise fingerprint records, a binary number comparison function was used in DBSCAN. DBSCAN then clusters the inventor records based on this distance to disambiguate inventor names.
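
The pipeline described above can be sketched as follows. This is a minimal illustration under stated assumptions, not the authors' implementation: the abstract does not say which fingerprinting algorithm or which binary comparison function was used, so the sketch assumes a SimHash-style fingerprint built from MD5 token hashes and a Hamming distance computed by XOR and bit counting. The names fingerprint, hamming, dbscan, eps and min_pts are hypothetical.

```python
import hashlib
from collections import deque

BITS = 128  # fingerprint width stated in the abstract


def fingerprint(tokens):
    """SimHash-style 128-bit fingerprint (an assumed stand-in for the paper's method)."""
    weights = [0] * BITS
    for tok in tokens:
        # MD5 yields exactly 128 bits; it is used here only as a token hash.
        h = int.from_bytes(hashlib.md5(tok.encode("utf-8")).digest(), "big")
        for i in range(BITS):
            weights[i] += 1 if (h >> i) & 1 else -1
    # A bit is set when more tokens voted 1 than 0 at that position.
    return sum(1 << i for i in range(BITS) if weights[i] > 0)


def hamming(a, b):
    """Number of differing bits between two integer fingerprints (assumed distance)."""
    return bin(a ^ b).count("1")


def dbscan(points, eps, min_pts):
    """Plain DBSCAN over integer fingerprints; returns one label per point, -1 for noise."""
    labels = [None] * len(points)

    def neighbors(i):
        # Brute-force region query; includes the point itself.
        return [j for j in range(len(points)) if hamming(points[i], points[j]) <= eps]

    cluster = -1
    for i in range(len(points)):
        if labels[i] is not None:
            continue
        nbrs = neighbors(i)
        if len(nbrs) < min_pts:
            labels[i] = -1  # provisionally noise; may later become a border point
            continue
        cluster += 1
        labels[i] = cluster
        queue = deque(nbrs)
        while queue:
            j = queue.popleft()
            if labels[j] == -1:
                labels[j] = cluster  # noise reached from a core point becomes a border point
            if labels[j] is not None:
                continue
            labels[j] = cluster
            j_nbrs = neighbors(j)
            if len(j_nbrs) >= min_pts:
                queue.extend(j_nbrs)  # j is a core point, so keep expanding
    return labels


# Toy fingerprints chosen by hand so the distances are easy to verify.
ALL_ONES = (1 << BITS) - 1
toy = [0b000, 0b001, 0b011, ALL_ONES, ALL_ONES ^ 1, (1 << 64) - 1]
labels = dbscan(toy, eps=2, min_pts=2)
# labels == [0, 0, 0, 1, 1, -1]
```

Under these assumptions each record is stored as a single 128-bit integer (16 bytes via `int.to_bytes(16, "big")`), and each pairwise comparison is one XOR plus a bit count, which is in line with the small storage and running-time costs the abstract reports. In practice, real inventor records would be tokenized and passed through fingerprint before clustering, and eps and min_pts would need tuning.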

Findings – Experiments conducted on the PatentsView campaign database of the United States Patent and Trademark Office show that this method disambiguates inventor names with recall greater than 99 percent, in less time and with substantially smaller storage requirements.
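
For readers unfamiliar with how recall and precision are scored for a clustering, the sketch below uses pairwise precision and recall, one common way to evaluate disambiguation. The abstract does not state which definition the authors used, so this choice is an assumption, and the names same_pairs and pairwise_scores are hypothetical.

```python
from itertools import combinations


def same_pairs(labels):
    """Set of index pairs (i, j), i < j, whose records share a label."""
    return {
        (i, j)
        for i, j in combinations(range(len(labels)), 2)
        if labels[i] == labels[j]
    }


def pairwise_scores(predicted, truth):
    """Pairwise precision and recall of predicted cluster labels against true labels."""
    pred, gold = same_pairs(predicted), same_pairs(truth)
    true_positive = len(pred & gold)
    precision = true_positive / len(pred) if pred else 1.0
    recall = true_positive / len(gold) if gold else 1.0
    return precision, recall


# Over-merging: four records placed in one cluster, but two inventors in truth.
precision, recall = pairwise_scores([0, 0, 0, 0], [0, 0, 1, 1])
# precision == 2 / 6, recall == 1.0
```

The toy case shows why very high recall alone does not imply correct clusters: merging everything recovers every true pair but lowers precision. Note that DBSCAN noise points (label -1) would need distinct labels before scoring, otherwise they are counted as one cluster.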

Research limitations/implications – A better semantic fingerprint algorithm and a better distance function may improve precision. Setting different clustering parameters for each block, or using other clustering algorithms, will be considered to further improve the accuracy of the disambiguation results.

Originality/value – Compared with the existing methods, the proposed method does not rely on feature selection and complex feature comparison computation. Most importantly, running time and storage requirements are drastically reduced.

Keywords Cluster analysis, Patent analysis, Inventor name disambiguation, Semantic fingerprinting

Paper type Research paper

Received 1 December 2018

Revised 9 March 2019, 23 March 2019

Accepted 24 March 2019

The Electronic Library, Vol. 37 No. 2, 2019, pp. 225-239

ISSN 0264-0473

DOI 10.1108/EL-12-2018-0232

The current issue and full text archive of this journal is available on Emerald Insight at:

www.emeraldinsight.com/0264-0473.htm
