Title: | Combined Cluster and Discriminant Analysis |
---|---|
Description: | Implements the combined cluster and discriminant analysis (CCDA) method for finding homogeneous groups of data with known origin, as described in Kovacs et al. (2014): Classification into homogeneous groups using combined cluster and discriminant analysis (CCDA). Environmental Modelling & Software. <doi:10.1016/j.envsoft.2014.01.010>. |
Authors: | Solt Kovacs, Jozsef Kovacs, Peter Tanos |
Maintainer: | Solt Kovacs <[email protected]> |
License: | GPL-2 |
Version: | 1.1.1 |
Built: | 2024-12-02 06:34:25 UTC |
Source: | CRAN |
Classification into homogeneous groups using combined cluster and discriminant analysis (CCDA).
ccda.main(dataset, names_vector, nr, nameslist, prior = "proportions",return.RCDP=FALSE)
dataset |
Contains only the dataset as a matrix (without labels). |
names_vector |
Contains labels (names of sample origins) for each individual observation. |
nr |
Number of randomly coded datasets (RCD) investigated. |
nameslist |
Contains the names of sample origins as a list. |
prior |
A specified method that can be either "proportions" (in the case of different group sizes) or "equal" (in the case of equal group sizes). If unspecified, "proportions" is used as the default. |
return.RCDP |
A logical value indicating whether the percentages for the randomly coded datasets should be returned as a matrix. The matrix is returned only if this is set to TRUE. |
ccda.main first determines the basic grouping (Step I) by hierarchical clustering of the averages of the measured variables with Ward's method. The core cycle (Step II) then runs for every one of the groupings obtained. For a suggestion on the number of randomly coded datasets to investigate (nr), see the Appendix of Kovacs et al. (2014). Note that computing time grows linearly with nr.
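The basic grouping of Step I can be sketched in base R as follows. This is only an illustration of the idea, not the package's internal code: the variable names are hypothetical, and the choice of `method = "ward.D2"` is an assumption, since the text does not say which Ward variant ccda.main uses.

```r
# Averages of the measured variables, one row per sample origin
group_means <- aggregate(iris[, 1:4], by = list(origin = iris[, 5]), FUN = mean)
rownames(group_means) <- group_means$origin

# Hierarchical clustering of the averages.
# Assumption: "ward.D2"; the package may use a different Ward variant.
tree <- hclust(dist(group_means[, -1]), method = "ward.D2")

# One candidate grouping per possible number of groups (1, 2, 3 for iris);
# rows are sample origins, columns are the number of groups
groupings <- cutree(tree, k = seq_len(nrow(group_means)))
print(groupings)
```

Each column of `groupings` is one candidate grouping of the sample origins, which is the kind of input the core cycle then evaluates.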
Step III, the evaluation of the results, is left to the user and is based on the output of ccda.main. The function plotccda.results helps with the decision on further division.
The sub_groups component of the output contains the grouping with the highest corresponding difference value. These subgroups must then be investigated further, iteratively, to obtain homogeneous groups as the final result. For a given group, stop when the highest difference value belongs to the grouping in which every sampling location is in the same group.
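A minimal sketch of how the output might be inspected for Step III, assuming the ccda package is installed. The component names are the ones documented for the return value; the use of `set.seed` and the smaller value of `nr` are choices made here for a quick, repeatable illustration.

```r
library(ccda)

set.seed(1)  # the randomly coded datasets are random, so fix the seed for repeatability
result <- ccda.main(iris[, 1:4], iris[, 5], 100,
                    c("setosa", "versicolor", "virginica"),
                    "proportions", return.RCDP = FALSE)

# Difference values (ratio - q95) for the investigated groupings
print(result$difference)

# Index of the grouping with the highest difference value,
# and the subdivision suggested for it
best <- which.max(result$difference)
print(best)
print(result$sub_groups)
```

If the suggested subdivision contains more than one group, each subgroup would be passed to ccda.main again with the corresponding subset of the data and labels.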
nameslist |
Returns the input nameslist. |
q95 |
The 95 % quantiles of the ratios of correctly classified cases by LDA for the randomly coded datasets. |
ratio |
Ratios of correctly classified cases by LDA for each coded dataset. |
difference |
The difference ratio - q95 for each grouping. |
sub_groups |
Suggestion for subdivision according to the maximal difference value. |
RCDP |
Percentages for the randomly coded datasets as a matrix (returned only if return.RCDP=TRUE). |
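The relationship between ratio, q95 and difference can be illustrated for a single grouping with MASS::lda. This is a sketch under stated assumptions, not the package's implementation: the helper `lda_ratio`, the example grouping, and the way the randomly coded datasets are generated (shuffling the group labels while keeping the group sizes) are all assumptions made for illustration.

```r
library(MASS)

x <- as.matrix(iris[, 1:4])

# Illustrative candidate grouping: setosa alone, versicolor and virginica merged
grouping <- factor(ifelse(iris[, 5] == "setosa", "A", "B"))

# Hypothetical helper: share of observations that LDA assigns to their own group
lda_ratio <- function(x, g) mean(predict(lda(x, g), x)$class == g)

set.seed(42)
nr <- 200  # number of randomly coded datasets

# Ratio of correctly classified cases for the real grouping
ratio <- lda_ratio(x, grouping)

# Assumption: a randomly coded dataset keeps the group sizes but shuffles the labels
random_ratios <- replicate(nr, lda_ratio(x, sample(grouping)))

# 95% quantile of the random ratios, and the resulting difference value
q95 <- unname(quantile(random_ratios, 0.95))
difference <- ratio - q95
print(c(ratio = ratio, q95 = q95, difference = difference))
```

A clearly positive difference suggests that the grouping separates the data better than random coding would, which is why the grouping with the largest difference is proposed for subdivision.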
Jozsef Kovacs, Solt Kovacs, Norbert Magyar, Peter Tanos, Istvan Gabor Hatvani, Angela Anda (2014): Classification into homogeneous groups using combined cluster and discriminant analysis (CCDA). Environmental Modelling & Software. DOI: http://dx.doi.org/10.1016/j.envsoft.2014.01.010
percentage, plotccda.results, plotccda.q95, plotccda.cluster
ccda.main(iris[,1:4] , iris[,5], 500, c("setosa","versicolor","virginica"), "proportions",return.RCDP=FALSE)
Extracts the ratio of correctly classified cases from the output of lda.
percentage(dataset, starting_vector, prior)
dataset |
Contains only the dataset as a matrix (without labels). |
starting_vector |
A vector specifying the class for each observation. |
prior |
A specified method that can be either "proportions" (in the case of different group sizes) or "equal" (in the case of equal group sizes). |
perctg |
The ratio of correctly classified cases by lda for the input grouping. |
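How the two prior settings could translate into a call to MASS::lda is sketched below. The function `correct_ratio` is hypothetical and only mirrors the documented arguments; whether percentage handles the priors in exactly this way internally is an assumption.

```r
library(MASS)

# Hypothetical stand-in for percentage(): ratio of correctly classified cases
correct_ratio <- function(dataset, starting_vector,
                          prior = c("proportions", "equal")) {
  prior <- match.arg(prior)
  x <- as.matrix(dataset)
  g <- factor(starting_vector)
  k <- nlevels(g)
  fit <- if (prior == "equal") {
    lda(x, g, prior = rep(1 / k, k))  # every group gets the same prior probability
  } else {
    lda(x, g)                         # lda's default: priors follow the group sizes
  }
  mean(predict(fit, x)$class == g)
}

perctg <- correct_ratio(iris[, 1:4], iris[, 5], "proportions")
print(perctg)
```

For iris the three groups are equally sized, so both prior settings give the same result; the distinction matters when group sizes differ.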
The function plotccda.cluster draws the dendrogram for the basic grouping using hierarchical clustering for the averages with Ward's method (as used in ccda.main).
plotccda.cluster(x)
x |
The output list of ccda.main. |
ccda.main, plotccda.results, plotccda.q95
result <- ccda.main(iris[,1:4], iris[,5], 500, c("setosa","versicolor","virginica"), "proportions", return.RCDP=FALSE)
plotccda.cluster(result)
The function plotccda.q95 draws the simulated density for the randomly coded datasets.
plotccda.q95(x,pl="max")
x |
The output list of ccda.main, which must include the RCDP output (set return.RCDP=TRUE when running ccda.main). |
pl |
Either "max" (the default), to plot the grouping with the highest difference value, or the number of the grouping for which the plot is made. |
ccda.main, plotccda.results, plotccda.cluster
result <- ccda.main(iris[,1:4], iris[,5], 500, c("setosa","versicolor","virginica"), "proportions", return.RCDP=TRUE)
plotccda.q95(result)
plotccda.q95(result, pl=2)
Plots the summarized results of CCDA for all possible groupings based on the output of ccda.main.
plotccda.results(x)
x |
The output list of ccda.main. |
ccda.main, plotccda.cluster, plotccda.q95
result <- ccda.main(iris[,1:4], iris[,5], 500, c("setosa","versicolor","virginica"), "proportions", return.RCDP=FALSE)
plotccda.results(result)