Fast online k-nn graph building

Debatty, Thibault; Michiardi, Pietro; Mees, Wim
Submitted on ArXiv on February 22nd, 2016

In this paper we propose an online approximate k-nn graph building algorithm, which is able to quickly update a k-nn graph using a flow of data points. One very important step of the algorithm consists in using the current distributed graph to search for the neighbors of a new node. Hence we also propose a distributed partitioning method based on balanced k-medoids clustering, that we use to optimize the distributed search process. Finally, we present the improved sequential search procedure that is used inside each partition. 

We also perform an experimental evaluation of the different algorithms, where we study the influence of the parameters and compare the result of our algorithms to existing state of the art. This experimental evaluation confirms that the fast online k-nn graph building algorithm produces a graph that is highly similar to the graph produced by an offline exhaustive algorithm, while it requires less similarity computations.

San Francisco
Data Science
Eurecom Ref:
© ACM, 2016. This is the author's version of the work. It is posted here by permission of ACM for your personal use. Not for redistribution. The definitive version was published in Submitted on ArXiv on February 22nd, 2016