Package: clusterability 0.1.1.0

Zachariah Neville

clusterability: Performs Tests for Cluster Tendency of a Data Set

Test for cluster tendency (clusterability) of a data set. The methods implemented - reducing the data set to a single dimension using principal component analysis or computing pairwise distances, and performing a multimodality test like the Dip Test or Silverman's Critical Bandwidth Test - are described in Adolfsson, Ackerman, and Brownstein (2019) <doi:10.1016/j.patcog.2018.10.026>. Such methods can inform whether clustering algorithms are appropriate for a data set.
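The two-step idea in the description (reduce to one dimension, then test for multimodality) can be sketched with base R plus the 'diptest' package. This is an illustrative sketch, not the package's own implementation: the simulated data, the seed, and the choice to center and scale before PCA are assumptions made for the example.

```r
# Hypothetical data: two well-separated 2-D Gaussian groups, 75 points each
set.seed(1)
x <- rbind(
  matrix(rnorm(150, mean = -3), ncol = 2),
  matrix(rnorm(150, mean = 3), ncol = 2)
)

# Reduction option 1: scores on the first principal component
pc1 <- prcomp(x, center = TRUE, scale. = TRUE)$x[, 1]

# Reduction option 2: the set of pairwise Euclidean distances
pairwise <- as.vector(dist(x))

# Multimodality test on each one-dimensional reduction.
# A small p-value suggests more than one mode, i.e. clusterable data.
if (requireNamespace("diptest", quietly = TRUE)) {
  print(diptest::dip.test(pc1))
  print(diptest::dip.test(pairwise))
}
```

Either reduction yields a single numeric vector, which is what a univariate multimodality test such as the Dip Test expects as input.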

Authors: Zachariah Neville [aut, cre], Naomi Brownstein [aut], Maya Ackerman [aut], Andreas Adolfsson [aut]

clusterability_0.1.1.0.tar.gz
clusterability_0.1.1.0.tar.gz (r-4.5-noble) | clusterability_0.1.1.0.tar.gz (r-4.4-noble)
clusterability_0.1.1.0.tgz (r-4.4-emscripten) | clusterability_0.1.1.0.tgz (r-4.3-emscripten)
clusterability.pdf | clusterability.html
clusterability/json (API)
NEWS

# Install 'clusterability' in R:
install.packages('clusterability', repos = c('https://cran.r-universe.dev', 'https://cloud.r-project.org'))

Datasets:
  • normals1 - Data generated from a single multivariate Normal distribution, 2 dimensions.
  • normals2 - Data generated from a mixture of two multivariate Normal distributions, 2 dimensions. A dataset containing 150 observations generated from a mixture of two multivariate Normal distributions. 75 observations come from a distribution with mean vector (-3, -2) and 75 observations come from a distribution with mean vector (1, 1). For each distribution, the variables each have unit variance and are uncorrelated. The dataset is clusterable.
  • normals3 - Data generated from a mixture of three multivariate Normal distributions, 2 dimensions. A dataset containing 150 observations generated from a mixture of three multivariate Normal distributions. 50 observations are from a distribution with mean vector (3, 0), 50 observations from a distribution with mean vector (0, 3), and 50 observations from a distribution with mean vector (3, 6). For each of these three distributions, the x and y variables have unit variance and are uncorrelated. The dataset is clusterable.
  • normals4 - Data generated from a mixture of two multivariate Normal distributions, 3 dimensions. A dataset containing 150 observations generated from a mixture of two multivariate Normal distributions. 75 observations come from a distribution with mean vector (1, 3, 2) and 75 observations come from a distribution with mean vector (4, 6, 0). For each distribution, the variables each have unit variance and are uncorrelated. The dataset is clusterable.
  • normals5 - Data generated from a mixture of three multivariate Normal distributions, 3 dimensions. A dataset containing 150 observations generated from a mixture of three multivariate Normal distributions. 50 observations come from a distribution with mean vector (1, 3, 3), 50 observations come from a distribution with mean vector (4, 6, 0), and 50 observations come from a distribution with mean vector (2, 8, -3). For each distribution, the variables each have unit variance and are uncorrelated. The dataset is clusterable.
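To make the dataset descriptions concrete, here is a minimal sketch that simulates data with the same design as normals2 using only base R. It does not reproduce the packaged dataset: the seed, the object names `sim` and `group`, and the column names are assumptions for illustration. Because the variables are uncorrelated with unit variance, each coordinate can be drawn independently with `rnorm()`.

```r
set.seed(42)
n_per_group <- 75

# Group 1 has mean vector (-3, -2); group 2 has mean vector (1, 1).
# Unit variance and zero correlation: draw each coordinate independently.
sim <- data.frame(
  x = c(rnorm(n_per_group, mean = -3), rnorm(n_per_group, mean = 1)),
  y = c(rnorm(n_per_group, mean = -2), rnorm(n_per_group, mean = 1))
)
group <- rep(1:2, each = n_per_group)

# Sample means per group should land near the stated mean vectors
print(aggregate(sim, by = list(group = group), FUN = mean))
```

The same pattern extends to the three-component and three-dimensional datasets by adding groups or columns with the mean vectors listed above.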

This package does not link to any GitHub/GitLab/R-Forge repository. No issue tracker or development information is available.

1 export, 0.00 score, 1 dependency, 20 scripts, 198 downloads

Last updated 5 years ago, from: 57a774c8c6. Checks: OK: 2. Indexed: yes.

Target | Result | Date
Doc / Vignettes | OK | Aug 21 2024
R-4.5-linux | OK | Aug 21 2024

Exports: clusterabilitytest

Dependencies: diptest
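A short usage sketch for the single exported function follows. The argument names `test` and `reduction` and their values are given as recalled from the package manual and should be checked against `?clusterabilitytest`. Selecting the first two columns is also an assumption: it guards against the packaged data frame carrying a label column alongside the two coordinates.

```r
# Run only when the package is installed
has_pkg <- requireNamespace("clusterability", quietly = TRUE)

if (has_pkg) {
  library(clusterability)
  data(normals2)

  # Keep the two coordinate columns only (assumed to come first)
  coords <- normals2[, 1:2]

  # Dip Test on the PCA-reduced data
  res_pca <- clusterabilitytest(coords, test = "dip", reduction = "pca")
  print(res_pca)

  # Dip Test on the pairwise distances instead
  res_dist <- clusterabilitytest(coords, test = "dip", reduction = "distance")
  print(res_dist)
}
```

Printing the result goes through the `print.clusterability` method listed in the help manual.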

Readme and manuals

Help Manual

Help page | Topics
clusterability: a package to perform tests of clusterability | clusterability-package, clusterability
Perform a test of clusterability | clusterabilitytest
Data generated from a single multivariate Normal distribution, 2 dimensions. | normals1
Data generated from a mixture of two multivariate Normal distributions, 2 dimensions. | normals2
Data generated from a mixture of three multivariate Normal distributions, 2 dimensions. | normals3
Data generated from a mixture of two multivariate Normal distributions, 3 dimensions. | normals4
Data generated from a mixture of three multivariate Normal distributions, 3 dimensions. | normals5
Print a clusterability object | print.clusterability