| Title: | Buttler-Fickel Distance and R2 for Mixed-Scale Cluster Analysis |
|---|---|
| Description: | Implements the distance measure for mixed-scale variables proposed by Buttler and Fickel (1995), based on normalized mean pairwise distances (Gini mean difference), and an R2 statistic to assess clustering quality. |
| Authors: | Moritz Schäfer [aut, cre] |
| Maintainer: | Moritz Schäfer <[email protected]> |
| License: | MIT + file LICENSE |
| Version: | 1.0.0 |
| Built: | 2026-05-23 07:53:31 UTC |
| Source: | https://github.com/cran/bfcluster |
Computes the proportion of explained distance variation (R²) for a given clustering solution using a distance matrix derived from the Buttler-Fickel distance. The statistic reflects how well the clustering partitions the total pairwise distance structure.
    bf_R2(D, cluster)
| D | A distance object of class dist. |
|---|---|
| cluster | An integer or factor vector of cluster assignments, typically obtained from cutree(). |
The R² is defined as:

$$R^2 = 1 - \frac{D_{\text{within}}}{D_{\text{total}}}$$

where $D_{\text{total}}$ is the sum of all pairwise distances and $D_{\text{within}}$ is the sum of distances within clusters.
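To make the definition concrete, here is a minimal base R sketch. It assumes the statistic is one minus the ratio of the summed within-cluster distances to the summed distances over all pairs, with each pair counted once. The helper name `r2_from_dist` is hypothetical and is not the package's `bf_R2()`, whose internals may differ.

```r
# Hypothetical helper, not the package's bf_R2().
# Assumes R^2 = 1 - (sum of within-cluster distances) / (sum of all distances).
r2_from_dist <- function(D, cluster) {
  M <- as.matrix(D)                        # full symmetric distance matrix
  stopifnot(nrow(M) == length(cluster))
  upper <- upper.tri(M)                    # count each pair of cases once
  same  <- outer(cluster, cluster, "==")   # TRUE where both cases share a cluster
  d_total  <- sum(M[upper])
  d_within <- sum(M[upper & same])
  1 - d_within / d_total
}

# Toy data: two tight pairs that lie far apart.
x  <- c(1, 2, 10, 11)
D  <- dist(x)
cl <- c(1, 1, 2, 2)
r2_from_dist(D, cl)   # 1 - 2/38, about 0.947
```

With every case in a single cluster, the within-cluster sum equals the total and the sketch returns 0.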
A numeric value between 0 and 1 indicating the proportion of explained distance variation. Higher values represent better cluster fit.
    df <- data.frame(
      sex    = factor(c("m", "f", "m", "f")),
      height = c(180, 165, 170, 159),
      age    = c(25, 32, 29, 28)
    )
    types <- c("nominal", "metric", "metric")
    D  <- buttler_fickel_dist(df, types)
    hc <- hclust(D)
    cl <- cutree(hc, k = 2)
    bf_R2(D, cl)
Computes a distance matrix following Buttler & Fickel (1995) for mixed-scale variables. Each variable-specific distance matrix is normalized by its mean pairwise distance (Gini mean difference), ensuring equal contribution of all variables to the overall distance.
    buttler_fickel_dist(df, types)
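| df | A data.frame where rows are cases and columns are variables. |
|---|---|
| types | A character vector of the same length as the number of columns in df, giving the scale type of each variable (for example "nominal" or "metric"). |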
An object of class dist.
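The normalization described for buttler_fickel_dist can be illustrated with a rough base R sketch. Several parts are assumptions rather than the package's confirmed behavior: nominal variables use a 0/1 mismatch distance, metric variables use the absolute difference, other scale types are not handled, and the normalized per-variable matrices are combined by summation. The name `bf_dist_sketch` is hypothetical.

```r
# Hypothetical sketch, not the package's buttler_fickel_dist().
# Assumed per-variable distances: 0/1 mismatch for "nominal",
# absolute difference for "metric"; other scale types are rejected.
bf_dist_sketch <- function(df, types) {
  stopifnot(length(types) == ncol(df))
  n <- nrow(df)
  total <- matrix(0, n, n)
  for (j in seq_along(types)) {
    x <- df[[j]]
    d <- switch(types[j],
      nominal = outer(as.character(x), as.character(x), "!=") * 1,
      metric  = abs(outer(x, x, "-")),
      stop("scale type not covered by this sketch: ", types[j])
    )
    # Mean distance over distinct pairs (the Gini mean difference in the
    # metric case). Dividing by it gives every variable a mean distance of 1.
    m <- mean(d[upper.tri(d)])
    # Skip constant variables to avoid dividing by zero.
    if (m > 0) total <- total + d / m   # summation is an assumption
  }
  as.dist(total)
}

df <- data.frame(
  sex    = factor(c("m", "f", "m", "f")),
  height = c(180, 165, 170, 159),
  age    = c(25, 32, 29, 28)
)
types <- c("nominal", "metric", "metric")
D_sketch <- bf_dist_sketch(df, types)
D_sketch
```

Because each normalized matrix has a mean pairwise distance of 1, the mean of the combined distances in this sketch equals the number of non-constant variables, which is one way to see that no single variable dominates because of its units.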