
knn-cpp


knn-cpp is a header-only C++ library for k nearest neighbor search using the Eigen3 library.

It implements various interfaces for KNN search:

  • pure Eigen3 parallelized brute force search (a minimal usage sketch follows this list)
  • pure Eigen3 kdtree for efficient search with Manhattan, Euclidean and Minkowski distances
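
The brute force searcher follows the same build/query pattern as the kdtree shown in the Usage section below. The following is a minimal sketch only; the class name knncpp::BruteForce, its constructor and its build() step are assumptions modeled on that kdtree interface, so check knncpp.h and the examples/ directory for the exact API.

#include <iostream>
#include <knncpp.h>

int main()
{
    // Each column is one data point.
    Eigen::MatrixXd dataPoints(3, 4);
    dataPoints << 1, 2, 3, 4,
                  2, 1, 0, 3,
                  3, 1, 3, 1;

    // Assumed class name and template parameters, analogous to
    // KDTreeMinkowskiX<Scalar, Distance> in the Usage example below.
    knncpp::BruteForce<double, knncpp::EuclideanDistance<double>> bruteforce(dataPoints);
    bruteforce.build();

    Eigen::MatrixXd queryPoints(3, 1);
    queryPoints << 0, 1, 0;

    knncpp::Matrixi indices;
    Eigen::MatrixXd distances;
    // Search for the 2 nearest neighbors of the query point.
    bruteforce.query(queryPoints, 2, indices, distances);

    std::cout << indices << std::endl << distances << std::endl;
    return 0;
}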

Install

Simply copy the header files into your project or install them using the CMake build system by typing

cd path/to/repo
mkdir build
cd build
cmake ..
make install

The library requires Eigen3 to be installed on your system.

On Debian-based systems you can simply install this dependency using apt-get.

apt-get install libeigen3-dev

Make sure Eigen3 can be found by your build system.

You can use the CMake Find module in cmake/ to find the installed headers.

Usage

All search algorithms share a similar interface. Have a look at the files in the examples/ directory.

Here is a basic example of how to build a kdtree and query it.

#include <iostream>
#include <knncpp.h>

typedef Eigen::MatrixXd Matrix;
typedef knncpp::Matrixi Matrixi;

int main()
{
    // Define some data points which should be searched.
    // Each column defines one data point.
    Matrix dataPoints(3, 9);
    dataPoints << 1, 2, 3, 1, 2, 3, 1, 2, 3,
                  2, 1, 0, 3, 2, 1, 0, 3, 4,
                  3, 1, 3, 1, 3, 4, 4, 2, 1;

    // Create a KDTreeMinkowski object and set the data points.
    // Data is not copied by default. You can also pass an additional bool flag
    // to create a data copy. The tree is not built yet.
    // You can also use the setData() method to set the data at a later point.
    // The distance type is defined by the second template parameter.
    // Currently ManhattenDistance, EuclideanDistance, ChebyshevDistance and
    // MinkowskiDistance are available.
    knncpp::KDTreeMinkowskiX<double, knncpp::EuclideanDistance<double>> kdtree(dataPoints);

    // Set the bucket size for each leaf node in the tree. The higher the value,
    // the fewer leaves have to be visited to find the nearest neighbors. The
    // lower the value, the fewer distance evaluations have to be computed.
    // Default is 16.
    kdtree.setBucketSize(16);
    // Set if the resulting neighbors should be sorted in ascending order after
    // a successful search.
    // This consumes some time during the query.
    // Default is true.
    kdtree.setSorted(true);
    // Set if the root should be taken of the distances after a successful search.
    // This consumes some time during the query.
    // Default is false.
    kdtree.setTakeRoot(true);
    // Set the maximum inclusive distance for the query. Set to 0 or negative
    // to disable maximum distances.
    // Default is 0.
    kdtree.setMaxDistance(2.5 * 2.5);
    // Set how many threads should be used during the query. Set to 0 or
    // negative to autodetect the optimal number of threads (OpenMP only).
    // Default is 1.
    kdtree.setThreads(2);

    // Build the tree. This consumes some time.
    kdtree.build();

    // Create a query point. We will search for this point's nearest neighbors.
    Matrix queryPoints(3, 1);
    queryPoints << 0, 1, 0;

    Matrixi indices;
    Matrix distances;
    // Search for the 3 nearest neighbors.
    // The matrices indices and distances hold the index and distance of the
    // respective nearest neighbors.
    // Their value is set to -1 if no further neighbor was found.
    kdtree.query(queryPoints, 3, indices, distances);

    // Do something with the results.
    std::cout
        << "Data points:" << std::endl
        << dataPoints << std::endl
        << "Query points:" << std::endl
        << queryPoints << std::endl
        << "Neighbor indices:" << std::endl
        << indices << std::endl
        << "Neighbor distances:" << std::endl
        << distances << std::endl;

    return 0;
}
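
The returned indices refer to columns of dataPoints, so they can be mapped back to the actual neighbor coordinates with plain Eigen calls. The loop below is a small follow-up sketch meant to be appended at the end of main() in the example above; it assumes that, as in the example output, each column of indices and distances corresponds to one query point and each row to one requested neighbor.

// Continues the example above: map the returned neighbor indices back to the
// original data points. A value of -1 means fewer than 3 neighbors were found
// within the maximum distance.
for(Eigen::Index i = 0; i < indices.rows(); ++i)
{
    auto idx = indices(i, 0);
    if(idx < 0)
        continue;
    std::cout << "neighbor " << i << ": data point " << idx
              << " at (" << dataPoints.col(idx).transpose() << ")"
              << ", distance " << distances(i, 0) << std::endl;
}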

References

  1. Songrit Maneewongvatana and David M. Mount, Analysis of Approximate Nearest Neighbor Searching with Clustered Point Sets, DIMACS Series in Discrete Mathematics and Theoretical Computer Science, 2002

  2. Mohammad Norouzi, Ali Punjani and David J. Fleet, Fast Search in Hamming Space with Multi-Index Hashing, in Proceedings of the 2012 IEEE Conference on Computer Vision and Pattern Recognition