| Title: | Almost Linear-Time k-Medoids Clustering |
|---|---|
| Description: | Interface to a high-performance implementation of k-medoids clustering described in Tiwari, Zhang, Mayclin, Thrun, Piech and Shomorony (2020) "BanditPAM: Almost Linear Time k-medoids Clustering via Multi-Armed Bandits" <https://proceedings.neurips.cc/paper/2020/file/73b817090081cef1bca77232f4532c5d-Paper.pdf>. |
| Authors: | Balasubramanian Narasimhan [aut, cre], Mo Tiwari [aut] (https://motiwari.com) |
| Maintainer: | Balasubramanian Narasimhan <[email protected]> |
| License: | MIT + file LICENSE |
| Version: | 1.0-2 |
| Built: | 2026-05-28 08:48:11 UTC |
| Source: | https://github.com/cran/banditpam |
banditpam is a high-performance package for almost linear-time k-medoids clustering. The methods are described in Tiwari et al. (2020), Advances in Neural Information Processing Systems 33.
Balasubramanian Narasimhan and Mo Tiwari
Useful links:
Report bugs at https://github.com/motiwari/BanditPAM/issues
Return the number of threads banditpam is using
bpam_num_threads()
the number of threads banditpam is using
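A minimal usage sketch, assuming the banditpam package is installed; the `requireNamespace()` guard and the variable name `n_threads` are illustrative additions, not part of the package documentation.

```r
# Run only when banditpam is available, so the snippet is safe to paste anywhere.
if (requireNamespace("banditpam", quietly = TRUE)) {
  # Ask the package how many threads it is using.
  n_threads <- banditpam::bpam_num_threads()
  print(n_threads)
}
```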
This class wraps around the C++ KMedoids class and exposes methods and fields of the C++ object.
k(integer(1))
The number of medoids/clusters to create
max_iter(integer(1))
The maximum number of SWAP steps the algorithm runs
build_conf(integer(1))
Parameter that affects the width of BUILD confidence intervals, default 1000
swap_conf(integer(1))
Parameter that affects the width of SWAP confidence intervals, default 10000
loss_fn(character(1))
The loss function, "lp" (for p integer > 0) or one of "manhattan", "cosine", "inf" or "euclidean"
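To make the loss names concrete, here is a small base-R sketch of the standard L_p distance between two points. The helper `lp_dist` is hypothetical and only illustrates the textbook definitions (Manhattan is p = 1, Euclidean is p = 2, and "inf" is the maximum absolute difference); it is not the package's C++ implementation, and the cosine loss is left out.

```r
# Hypothetical helper, not part of banditpam: L_p distance between two vectors.
lp_dist <- function(x, y, p = 2) {
  d <- abs(x - y)
  if (is.infinite(p)) {
    max(d)               # L-infinity: largest coordinate-wise difference
  } else {
    sum(d^p)^(1 / p)     # general L_p norm of the difference
  }
}

x <- c(0, 0)
y <- c(3, 4)
lp_dist(x, y, p = 1)    # Manhattan: 7
lp_dist(x, y, p = 2)    # Euclidean: 5
lp_dist(x, y, p = Inf)  # L-infinity: 4
```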
new()
Create a new KMedoids object
KMedoids$new(
k = 5L,
algorithm = c("BanditPAM", "PAM", "FastPAM1"),
max_iter = 1000L,
build_conf = 1000,
swap_conf = 10000L
)
k: number of medoids/clusters to create, default 5
algorithm: the algorithm to use, one of "BanditPAM", "PAM", "FastPAM1"
max_iter: the maximum number of SWAP steps the algorithm runs, default 1000
build_conf: parameter that affects the width of BUILD confidence intervals, default 1000
swap_conf: parameter that affects the width of SWAP confidence intervals, default 10000
a KMedoids object which can be used to fit the banditpam algorithm to data
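As a sketch of the constructor arguments listed above, the following creates an object with non-default settings and reads back the chosen algorithm. It assumes banditpam is installed; the specific values (`k = 3L`, `"PAM"`, `max_iter = 100L`) are arbitrary choices for illustration.

```r
if (requireNamespace("banditpam", quietly = TRUE)) {
  # Request 3 medoids, the exact PAM algorithm, and at most 100 SWAP steps.
  obj <- banditpam::KMedoids$new(k = 3L, algorithm = "PAM", max_iter = 100L)
  # Report which algorithm the object was configured with.
  print(obj$get_algorithm())
}
```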
get_algorithm()
Return the algorithm used
KMedoids$get_algorithm()
a string indicating the algorithm
fit()
Fit the KMedoids algorithm given the data and loss. It is advisable to set the seed before calling this method for reproducible results.
KMedoids$fit(data, loss, dist_mat = NULL)
data: the data matrix
loss: the loss function, either "lp" (where p is a positive integer indicating the L_p loss, e.g. "l2") or one of "manhattan", "cosine", "inf" or "euclidean"
dist_mat: an optional distance matrix
get_medoids_final()
Return the final medoid indices after clustering
KMedoids$get_medoids_final()
a vector of indices of the final medoids
get_labels()
Return the cluster labels after clustering
KMedoids$get_labels()
a vector of the cluster labels for the observations
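Conceptually, a k-medoids label says which medoid an observation is closest to. The base-R sketch below shows that assignment for Euclidean distance. The helper `assign_to_medoids` is hypothetical and only illustrates the idea; the label encoding returned by `get_labels()` may differ from the column positions used here.

```r
# Hypothetical helper, not part of banditpam: label each row of X with the
# position (1..k) of its nearest medoid under Euclidean distance.
assign_to_medoids <- function(X, medoid_idx) {
  # One column of distances per medoid: row i holds the distance from X[i, ] to it.
  d <- do.call(cbind, lapply(medoid_idx, function(m) {
    sqrt(rowSums(sweep(X, 2, X[m, ])^2))
  }))
  # Column index of the smallest distance in each row.
  max.col(-d, ties.method = "first")
}

X <- rbind(c(0, 0), c(0, 1), c(10, 10), c(10, 11))
labels <- assign_to_medoids(X, medoid_idx = c(1, 3))
labels  # 1 1 2 2
```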
get_statistic()
Get the specified statistic after clustering
KMedoids$get_statistic(what)
what: a string, which should be one of "dist_computations", "dist_computations_and_misc",
"misc_dist", "build_dist", "swap_dist", "cache_writes", "cache_hits",
or "cache_misses"
the requested statistic
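A short sketch of querying statistics after a fit, assuming banditpam is installed. The toy data and the choice of which statistics to print are illustrative; no particular values are implied.

```r
if (requireNamespace("banditpam", quietly = TRUE)) {
  set.seed(1)
  # Toy data: two well-separated groups of 60 two-dimensional points each.
  X <- rbind(matrix(rnorm(120), ncol = 2),
             matrix(rnorm(120, mean = 6), ncol = 2))
  obj <- banditpam::KMedoids$new(k = 2L)
  obj$fit(data = X, loss = "l2")
  # Look up a few of the documented statistics by name.
  for (s in c("dist_computations", "cache_hits", "cache_misses")) {
    cat(s, ":", obj$get_statistic(s), "\n")
  }
}
```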
print()
Printer.
KMedoids$print(...)
...: (ignored).
clone()
The objects of this class are cloneable with this method.
KMedoids$clone(deep = FALSE)
deep: Whether to make a deep clone.
    # Generate data from a Gaussian Mixture Model with the given means:
    set.seed(10)
    n_per_cluster <- 40
    means <- list(c(0, 0), c(-5, 5), c(5, 5))
    X <- do.call(rbind, lapply(means, MASS::mvrnorm, n = n_per_cluster, Sigma = diag(2)))
    obj <- KMedoids$new(k = 3)
    obj$fit(data = X, loss = "l2")
    meds <- obj$get_medoids_final()
    plot(X[, 1], X[, 2])
    points(X[meds, 1], X[meds, 2], col = "red", pch = 19)