Title: | Distance Density Clustering Algorithm |
---|---|
Description: | A distance density clustering (DDC) algorithm in R. DDC uses dynamic time warping (DTW) to compute a similarity matrix, from which cluster centers and cluster assignments are found. DDC inherits the arguments and constraints of DTW. The cluster centers are centroid points calculated with the DTW Barycenter Averaging (DBA) algorithm. The clustering process is divisive: at each iteration, cluster centers are updated and data is reassigned to the cluster centers. Early stopping is possible. The output includes cluster centers and cluster assignments, as described in the paper (Ma et al. (2017) <doi:10.1109/ICDMW.2017.11>). |
Authors: | Ruizhe Ma [cre, aut], Bing Jiang [aut] |
Maintainer: | Ruizhe Ma <[email protected]> |
License: | GPL (>= 2) |
Version: | 1.0.1 |
Built: | 2024-10-29 06:41:05 UTC |
Source: | CRAN |
Use DTW to generate the dissimilarity matrix
createDistMatrix(standard_matrix, output_dir = NULL, mc.cores = 1, ...)
standard_matrix |
the matrix generated by the function 'createStandardMatrix' |
output_dir |
the file to save the dissimilarity matrix data |
mc.cores |
the number of cores to use in parallel |
... |
additional parameters passed on to 'dtw' for calculating the distances between events |
the matrix describing the pairwise dissimilarity between M objects. It is a square, symmetric 'MxM' matrix whose (i, j)th element is the value of the chosen dissimilarity measure between the i-th and the j-th object.
original_data <- data.frame("1"=c(1, 2, 1), "2"=c(5, 6, 7), "3"=c(4, 5, 8), "4"=c(3, 1, 9))
standard_matrix <- createStandardMatrix(data = original_data)
dist_matrix <- createDistMatrix(standard_matrix = standard_matrix)
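To make the shape of the returned matrix concrete, here is a minimal base-R sketch of a pairwise DTW dissimilarity matrix. It is not the package's implementation: 'dtw_distance' and 'pairwise_dtw' are hypothetical helpers written for illustration, and they use the textbook DTW recursion with an absolute-difference local cost, so the values will generally differ from those produced by the 'dtw' package under its own step patterns.

```r
# Textbook dynamic-programming DTW between two numeric vectors.
# Local cost is |x[i] - y[j]|; allowed steps are match, insertion, deletion.
dtw_distance <- function(x, y) {
  n <- length(x)
  m <- length(y)
  cost <- matrix(Inf, nrow = n + 1, ncol = m + 1)
  cost[1, 1] <- 0
  for (i in seq_len(n)) {
    for (j in seq_len(m)) {
      step_cost <- abs(x[i] - y[j])
      cost[i + 1, j + 1] <- step_cost +
        min(cost[i, j], cost[i, j + 1], cost[i + 1, j])
    }
  }
  cost[n + 1, m + 1]
}

# Hypothetical helper: fill a symmetric MxM matrix with the DTW distance
# between every pair of rows of a numeric matrix (one series per row).
pairwise_dtw <- function(series) {
  M <- nrow(series)
  d <- matrix(0, nrow = M, ncol = M)
  for (i in seq_len(M)) {
    for (j in seq_len(M)) {
      if (i < j) {
        d[i, j] <- dtw_distance(series[i, ], series[j, ])
        d[j, i] <- d[i, j]  # mirror the value to keep the matrix symmetric
      }
    }
  }
  d
}

# Toy input (an assumption for illustration): three series of length three.
series <- rbind(c(5, 4, 3), c(6, 5, 1), c(7, 8, 9))
dist_sketch <- pairwise_dtw(series)
```

The diagonal stays at zero because each series is at distance zero from itself, and only the upper triangle is computed before being mirrored.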
Create the data frame of event names and their labels
createLabelMatrix(data, output_dir = NULL)
data |
data structured like the files in the "UCR Time Series Classification Archive" |
output_dir |
the file to save the label matrix data |
the dataframe, including event names and labels
original_data <- data.frame("1"=c(1, 2, 1), "2"=c(5, 6, 7), "3"=c(4, 5, 8), "4"=c(3, 1, 9))
label_matrix <- createLabelMatrix(data = original_data)
Create the data frame containing only the event data
createStandardMatrix(data, output_dir = NULL)
data |
data structured like the files in the "UCR Time Series Classification Archive" |
output_dir |
the file to save the standard matrix data |
the dataframe of event data
original_data <- data.frame("1"=c(1, 2, 1), "2"=c(5, 6, 7), "3"=c(4, 5, 8), "4"=c(3, 1, 9))
standard_matrix <- createStandardMatrix(data = original_data)
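Both 'createLabelMatrix' and 'createStandardMatrix' expect data laid out like the UCR archive files, where each row holds one series and the first value in the row is its class label. The base-R sketch below splits such a table by hand to show the two pieces involved. It is an illustration only: the column names, the 'event_' naming scheme, and the exact shape of the results are assumptions and may not match what the package functions return.

```r
# Toy table in the assumed UCR-style layout: one row per series,
# first column = class label, remaining columns = observations.
ucr_like <- data.frame(label = c(1, 2, 1),
                       t1 = c(5, 6, 7),
                       t2 = c(4, 5, 8),
                       t3 = c(3, 1, 9))

# Hypothetical event names; the names the package generates may differ.
event_names <- paste0("event_", seq_len(nrow(ucr_like)))

# Label part: one row per event, pairing its name with its class label.
labels_sketch <- data.frame(event = event_names, label = ucr_like[[1]])

# Series part: drop the label column and keep only the observations.
series_sketch <- as.matrix(ucr_like[, -1])
rownames(series_sketch) <- event_names
```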
Execute DDC to cluster the dataset
ddc(dist_matrix, standard_matrix, label_matrix, end_cluster_num = NULL, ...)
dist_matrix |
the dissimilarity matrix created by 'createDistMatrix' |
standard_matrix |
the original data matrix created by 'createStandardMatrix' |
label_matrix |
the matrix of events and labels created by 'createLabelMatrix' |
end_cluster_num |
the maximum number of clusters when the procedure ends |
... |
additional arguments: mc.cores (the number of cores used in parallel) and 'dtw' parameters such as step.pattern and keep |
the resulting cluster array, including 'Centroid', 'Elements' and 'DBAValue' for each cluster
original_data <- data.frame("1"=c(1, 2, 1), "2"=c(5, 6, 7), "3"=c(4, 5, 8), "4"=c(3, 1, 9))
standard_matrix <- createStandardMatrix(data = original_data)
label_matrix <- createLabelMatrix(data = original_data)
dist_matrix <- createDistMatrix(standard_matrix = standard_matrix)
result <- ddc(dist_matrix = dist_matrix, standard_matrix = standard_matrix, label_matrix = label_matrix, end_cluster_num = 2)
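The package description states that data is reassigned to cluster centers at each iteration. That single step can be sketched in isolation with base R. This is a simplified stand-in, not the code 'ddc' runs: 'assign_to_centers' is a hypothetical helper, and the centers here are existing objects picked by row index (medoid-style), whereas DDC uses DBA centroids.

```r
# Hypothetical helper: assign every object to its nearest center,
# given a symmetric distance matrix and the row indices of the centers.
assign_to_centers <- function(dist_matrix, centers) {
  # Keep only the distances from each object to the chosen centers.
  to_centers <- dist_matrix[, centers, drop = FALSE]
  # For each object (row), pick the center with the smallest distance.
  centers[apply(to_centers, 1, which.min)]
}

# Toy distance matrix (an assumption for illustration): objects 1 and 2
# are close to each other, as are objects 3 and 4.
d <- matrix(c(0, 1, 6, 7,
              1, 0, 5, 6,
              6, 5, 0, 2,
              7, 6, 2, 0), nrow = 4, byrow = TRUE)

assignment <- assign_to_centers(d, centers = c(1, 3))
# assignment is c(1, 1, 3, 3): objects 1 and 2 go to center 1,
# objects 3 and 4 go to center 3.
```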