Definitions

A Euclidean minimum spanning tree (MST) connects n data points via n-1 shortest segments. It is a connected acyclic graph where:

  • vertices represent the points,

  • edges are weighted by the distances between point pairs,

  • edges have minimal total weight.

The mst_euclid function determines the classic Euclidean minimum spanning trees as well as ones corresponding to mutual reachability distances [3].

MSTs can be used in topological data analysis (clustering, density estimation, dimensionality reduction, outlier detection, etc.); see: hdbscan, deadwood, genieclust, and lumbermark.

Note

This section is a work in progress.

In the meantime, take a look at the Python or R package reference manual.

Relevant citations — the Jarník-Prim algorithm: [6], [11], [12], the Borůvka algorithm: [2], [8], K-d trees: [1], [7], [13], mutual reachability distance: [3], [4].