quitefastmst: Euclidean and Mutual Reachability Minimum Spanning Trees¶
Keywords: Euclidean minimum spanning tree, MST, EMST, mutual reachability distance, nearest neighbours, k-nn, k-d tree, Boruvka, Prim, Jarnik, Kruskal, Genie, HDBSCAN*, DBSCAN, clustering, outlier detection.
Package features:
Euclidean Minimum Spanning Trees using single-, sesqui-, and dual-tree Borůvka algorithms – quite fast in spaces of low intrinsic dimensionality,
Minimum spanning trees with respect to mutual reachability distances based on the Euclidean metric (used in the definition of the HDBSCAN* algorithm; see [1]),
Euclidean nearest neighbours with nicely-optimised K-d trees,
relatively fast fallback algorithms for spaces of higher dimensionality,
supports multiprocessing via OpenMP (on selected platforms).
Possible applications in topological data analysis: clustering (HDBSCAN*, Genie, Lumbermark, Single linkage, etc.), density estimation, dimensionality reduction, outlier and noise point detection, and many more.
Contributing¶
quitefastmst is distributed under the open source GNU AGPL v3 license. Its source code can be downloaded from GitHub.
The Python version is available from PyPI and the R one from CRAN.
However, the core functionality is implemented in the form of a C++ library, so it may be adapted to new projects relatively easily: any valuable contributions are welcome (Julia or Matlab bindings, etc.).
Author and Maintainer: Marek Gagolewski