NG-DBSCAN: Scalable density-based clustering for arbitrary data

Lulli, Alessandro; Dell'Amico, Matteo; Michiardi, Pietro; Ricci, Laura

VLDB 2016, 42nd International Conference on Very Large Data Bases, September 5-9, 2016, New-Delhi, India / Proceedings of the VLDB Endowment, 2016, Vol.10, N°3


We present NG-DBSCAN, an approximate density-based clustering algorithm that operates on arbitrary data and any symmetric distance measure. The distributed design of our algorithm makes it scalable to very large datasets; its approximate nature makes it fast, yet capable of producing high quality clustering results. We provide a detailed overview of the steps of NG-DBSCAN, together with their analysis. Our results, obtained through an extensive experimental campaign with real and synthetic data, substantiate our claims about NG-DBSCAN's performance and scalability.


Type: Conference
City: New-Delhi
Date: 2016-09-05
Department: Data Science
Eurecom Ref: 5076

© ACM, 2016. This is the author's version of the work. It is posted here by permission of ACM for your personal use. Not for redistribution. The definitive version was published in VLDB 2016, 42nd International Conference on Very Large Data Bases, September 5-9, 2016, New-Delhi, India / Proceedings of the VLDB Endowment, 2016, Vol.10, N°3
 https://doi.org/10.14778/3021924.3021932