NG-DBSCAN: Scalable density-based clustering for arbitrary data

Lulli, Alessandro; Dell'Amico, Matteo; Michiardi, Pietro; Ricci, Laura
VLDB 2016, 42nd International Conference on Very Large Data Bases, September 5-9, 2016, New-Delhi, India / Proceedings of the VLDB Endowment, 2016, Vol.10, N°3


We present NG-DBSCAN, an approximate density-based clustering algorithm that operates on arbitrary data and any symmetric distance measure. The distributed design of our algorithm makes it scalable to very large datasets; its approximate nature makes it fast, yet capable of producing high-quality clustering results. We provide a detailed overview of the steps of NG-DBSCAN, together with their analysis. Our results, obtained through an extensive experimental campaign with real and synthetic data, substantiate our claims about NG-DBSCAN's performance and scalability.


Type: Conference
City: New-Delhi
Date: 2016-09-05
Department: Data Science
Eurecom Ref: 5076
Copyright: © ACM, 2016. This is the author's version of the work. It is posted here by permission of ACM for your personal use. Not for redistribution. The definitive version was published in VLDB 2016, 42nd International Conference on Very Large Data Bases, September 5-9, 2016, New-Delhi, India / Proceedings of the VLDB Endowment, 2016, Vol.10, N°3
DOI: https://doi.org/10.14778/3021924.3021932

PERMALINK : https://www.eurecom.fr/publication/5076