Package 'ClustBlock'

Title: Clustering of Datasets
Description: Hierarchical and partitioning algorithms to cluster blocks of variables. The partitioning algorithm includes an option called noise cluster to set aside atypical blocks of variables. The CLUSTATIS method (for quantitative blocks) (Llobell, Cariou, Vigneau, Labenne & Qannari (2020) <doi:10.1016/j.foodqual.2018.05.013>, Llobell, Vigneau & Qannari (2019) <doi:10.1016/j.foodqual.2019.02.017>) and the CLUSCATA method (for Check-All-That-Apply data) (Llobell, Cariou, Vigneau, Labenne & Qannari (2019) <doi:10.1016/j.foodqual.2018.09.006>, Llobell, Giacalone, Labenne & Qannari (2019) <doi:10.1016/j.foodqual.2019.05.017>) are the core of this package. The CATATIS method computes indices and tests to assess the quality of CATA data. Multivariate analysis and clustering of subjects for quantitative multiblock data, CATA, RATA, Free Sorting and JAR experiments are available. Clustering of rows in a multi-block context (notably with the ClusMB strategy) is also included.
Authors: Fabien Llobell [aut, cre] (Oniris/XLSTAT), Evelyne Vigneau [ctb] (Oniris), Veronique Cariou [ctb] (Oniris), El Mostafa Qannari [ctb] (Oniris)
Maintainer: Fabien Llobell <[email protected]>
License: GPL-3
Version: 4.0.0
Built: 2024-12-19 06:32:51 UTC
Source: CRAN


Clustering of Datasets

Description

Hierarchical and partitioning algorithms to cluster blocks of variables. The CLUSTATIS method and the CLUSCATA method are the core of this package. The CATATIS method computes indices and tests to assess the quality of CATA data. Multivariate analysis and clustering of subjects for quantitative multiblock data, CATA, RATA, Free Sorting and JAR experiments are available. Clustering of rows in a multi-block context (notably with the ClusMB strategy) is also included.

Details

Package: ClustBlock
Type: Package
Version: 4.0.0
First version Date: 2019-03-06
Last version Date: 2024-05-21

Author(s)

Fabien Llobell, Evelyne Vigneau, Veronique Cariou, El Mostafa Qannari

Maintainer: [email protected]

References

Llobell, F., Cariou, V., Vigneau, E., Labenne, A., & Qannari, E. M. (2020). Analysis and clustering of multiblock datasets
by means of the STATIS and CLUSTATIS methods. Application to sensometrics. Food
Quality and Preference, 79, 103520.
Llobell, F., Vigneau, E., & Qannari, E. M. (2019). Clustering datasets by means of CLUSTATIS with identification
of atypical datasets. Application to sensometrics. Food Quality and Preference, 75, 97-104.
Llobell, F., Cariou, V., Vigneau, E., Labenne, A., & Qannari, E. M. (2019). A new approach for the analysis of data
and the clustering of subjects in a CATA experiment. Food Quality and Preference, 72, 31-39.
Llobell, F., Giacalone, D., Labenne, A., & Qannari, E. M. (2019). Assessment of the agreement and cluster analysis of
the respondents in a CATA experiment. Food Quality and Preference, 77, 184-190.
Llobell, F., & Qannari, E. M. (2020). CLUSTATIS: Cluster analysis of blocks of variables. Electronic
Journal of Applied Statistical Analysis, 13(2), 436-453.
Llobell, F. (2020). Classification de tableaux de données, applications en analyse sensorielle (Doctoral
dissertation, Nantes, Ecole nationale vétérinaire).

Perform the CATATIS method on different blocks from a CATA experiment

Description

CATATIS method. Additional outputs are also computed. Non-binary data are accepted and weights can be tested.

Usage

catatis(Data,nblo,NameBlocks=NULL, NameVar=NULL, Graph=TRUE, Graph_weights=TRUE,
 Test_weights=FALSE, nperm=100)

Arguments

Data

data frame or matrix where the blocks of binary variables are merged horizontally. If you have a different format, see change_cata_format

nblo

integer. Number of blocks (subjects).

NameBlocks

string vector. Name of each block (subject). Length must be equal to the number of blocks. If NULL, the names are S1,...Sm. Default: NULL

NameVar

string vector. Name of each variable (attribute, the same names for each subject). Length must be equal to the number of attributes. If NULL, the colnames of the first block are taken. Default: NULL

Graph

logical. Show the graphical representation? Default: TRUE

Graph_weights

logical. Should the barplot of the weights be plotted? Default: TRUE

Test_weights

logical. Should the weights be tested? Default: FALSE

nperm

integer. Number of permutations for the weight tests. Default: 100
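
The horizontal layout expected for Data can be illustrated with a small base R sketch. The toy blocks, the attribute names and the helper block() are made up for illustration and are not part of the package; the sketch only assumes that each subject contributes one block with the same products (rows) and the same attributes (columns).

```r
# Toy CATA data: 3 products (rows), 2 attributes, 2 subjects.
attrs <- c("sweet", "sour")
prods <- paste0("P", 1:3)
subj1 <- matrix(c(1, 0,
                  0, 1,
                  1, 1), nrow = 3, byrow = TRUE,
                dimnames = list(prods, attrs))
subj2 <- matrix(c(1, 1,
                  0, 0,
                  0, 1), nrow = 3, byrow = TRUE,
                dimnames = list(prods, attrs))

# Blocks merged horizontally: one row per product,
# nattr columns per subject, subject after subject.
Data <- cbind(subj1, subj2)
nblo <- 2
nattr <- ncol(Data) / nblo

# Hypothetical helper: recover the block of subject i from the merged data.
block <- function(i) Data[, ((i - 1) * nattr + 1):(i * nattr), drop = FALSE]
block(2)
```

With this layout, the number of columns of Data is the number of attributes times nblo.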

Value

a list with:

  • S: the S matrix: a matrix with the similarity coefficient among the subjects

  • compromise: a matrix which is the compromise of the subjects (akin to a weighted average)

  • weights: the weights associated with the subjects to build the compromise

  • weights_tests: the weights tests results

  • lambda: the first eigenvalue of the S matrix

  • overall error: the error for the CATATIS criterion

  • error_by_sub: the error by subject (CATATIS criterion)

  • error_by_prod: the error by product (CATATIS criterion)

  • s_with_compromise: the similarity coefficient of each subject with the compromise

  • homogeneity: homogeneity of the subjects (in percentage)

  • CA: the results of correspondence analysis performed on the compromise dataset

  • eigenvalues: the eigenvalues associated with the correspondence analysis

  • inertia: the percentage of total variance explained by each axis of the CA

  • scalefactors: the scaling factors of each subject

  • nb_1: the number of 1s in each block, i.e. the number of attributes checked by each subject.

  • param: parameters called
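
How S, lambda, weights and compromise relate to each other can be sketched in base R. This is an illustration under stated assumptions, not the package's code: it assumes that the similarity between two subjects is the inner product of their blocks after scaling each block to unit norm, that the weights come from the first eigenvector of S, and that the compromise is the weighted sum of the scaled blocks. The exact definitions are given in the cited paper.

```r
# Three toy binary blocks (4 products x 3 attributes). Cell [1, 1] is
# checked in every block so that all similarities are strictly positive.
X <- list(
  matrix(c(1,0,0, 0,1,0, 0,0,1, 1,1,0), nrow = 4, byrow = TRUE),
  matrix(c(1,0,0, 0,1,0, 0,1,1, 1,0,0), nrow = 4, byrow = TRUE),
  matrix(c(1,1,0, 0,0,1, 1,0,0, 0,0,1), nrow = 4, byrow = TRUE)
)
m <- length(X)

# Assumption: each block is scaled to unit Frobenius norm.
Xn <- lapply(X, function(x) x / sqrt(sum(x^2)))

# Assumption: S[i, j] is the inner product of the scaled blocks i and j.
S <- matrix(0, m, m)
for (i in seq_len(m)) {
  for (j in seq_len(m)) {
    S[i, j] <- sum(Xn[[i]] * Xn[[j]])
  }
}

eig <- eigen(S, symmetric = TRUE)
lambda <- eig$values[1]             # first eigenvalue of S
weights <- abs(eig$vectors[, 1])    # first eigenvector, sign fixed to be positive

# Compromise as a weighted sum of the scaled blocks.
compromise <- Reduce(`+`, Map(`*`, weights, Xn))
```

Subjects whose blocks resemble those of the other subjects get larger weights, so they contribute more to the compromise.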

References

Llobell, F., Cariou, V., Vigneau, E., Labenne, A., & Qannari, E. M. (2019). A new approach for the analysis of data and the clustering of subjects in a CATA experiment. Food Quality and Preference, 72, 31-39.
Bonnet, L., Ferney, T., Riedel, T., Qannari, E.M., Llobell, F. (September 14, 2022). Using CATA for sensory profiling: assessment of the panel performance. Eurosense, Turku, Finland.

See Also

plot.catatis, summary.catatis, cluscata, change_cata_format, change_cata_format2

Examples

data(straw)
res.cat=catatis(straw, nblo=114)
summary(res.cat)
plot(res.cat)

#Vertical format with sessions
data("fish")
chang=change_cata_format2(fish, nprod= 6, nattr= 27, nsub = 12, nsess= 3)
res.cat2=catatis(Data= chang$Datafinal, nblo = 12, NameBlocks =  chang$NameSub, Test_weights=TRUE)

#Vertical format without sessions
Data=fish[1:66,2:30]
chang2=change_cata_format2(Data, nprod= 6, nattr= 27, nsub = 11, nsess= 1)
res.cat3=catatis(Data= chang2$Datafinal, nblo = 11, NameBlocks =  chang2$NameSub)

Perform the CATATIS method on Just About Right data.

Description

CATATIS method adapted to JAR data.

Usage

catatis_jar(Data, nprod, nsub, levelsJAR=3, beta=0.1, Graph=TRUE, Graph_weights=TRUE,
Test_weights=FALSE, nperm=100)

Arguments

Data

data frame where the first column is the Assessors, the second is the products and all other columns the JAR attributes with numbers (1 to 3 or 1 to 5, see levelsJAR)

nprod

integer. Number of products.

nsub

integer. Number of subjects.

levelsJAR

integer. 3 or 5 levels. If 5, the data will be transformed into 3 levels.

beta

numerical. Parameter for agreement between JAR and other answers. Between 0 and 0.5.

Graph

logical. Show the graphical representation? Default: TRUE

Graph_weights

logical. Should the barplot of the weights be plotted? Default: TRUE

Test_weights

logical. Should the weights be tested? Default: FALSE

nperm

integer. Number of permutations for the weight tests. Default: 100
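
The reduction from 5 to 3 JAR levels mentioned for levelsJAR can be pictured with the sketch below. The mapping used here (1 and 2 to "not enough", 3 to "just about right", 4 and 5 to "too much") is an assumption made for illustration; the package may recode the levels differently, and collapse_jar is a hypothetical helper, not a package function.

```r
# Hypothetical helper: collapse a 5-level JAR scale to 3 levels.
# Assumed mapping: 1,2 -> 1 (not enough); 3 -> 2 (JAR); 4,5 -> 3 (too much).
collapse_jar <- function(x) {
  stopifnot(all(x %in% 1:5))
  c(1L, 1L, 2L, 3L, 3L)[x]
}

collapse_jar(c(1, 2, 3, 4, 5))
```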

Value

a list with:

  • S: the S matrix: a matrix with the similarity coefficient among the subjects

  • compromise: a matrix which is the compromise of the subjects (akin to a weighted average)

  • weights: the weights associated with the subjects to build the compromise

  • weights_tests: the weights tests results

  • lambda: the first eigenvalue of the S matrix

  • overall error: the error for the CATATIS criterion

  • error_by_sub: the error by subject (CATATIS criterion)

  • error_by_prod: the error by product (CATATIS criterion)

  • s_with_compromise: the similarity coefficient of each subject with the compromise

  • homogeneity: homogeneity of the subjects (in percentage)

  • CA: the results of correspondence analysis performed on the compromise dataset

  • eigenvalues: the eigenvalues associated with the correspondence analysis

  • inertia: the percentage of total variance explained by each axis of the CA

  • scalefactors: the scaling factors of each subject

  • nb_1: Can be ignored

  • param: parameters called

References

Llobell, F., Vigneau, E. & Qannari, E. M. (September 14, 2022). Multivariate data analysis and clustering of subjects in a Just about right task. Eurosense, Turku, Finland.

See Also

catatis, plot.catatis, summary.catatis, cluscata_jar, preprocess_JAR, cluscata_kmeans_jar

Examples

data(cheese)
res.cat=catatis_jar(Data=cheese, nprod=8, nsub=72, levelsJAR=5)
summary(res.cat)
#plot(res.cat)

Perform the CATATIS method on different blocks from a RATA experiment

Description

CATATIS method for RATA data. Additional outputs are also computed. Non-binary data are accepted and weights can be tested.

Usage

catatis_rata(Data,nblo,NameBlocks=NULL, NameVar=NULL, Graph=TRUE, Graph_weights=TRUE,
 Test_weights=FALSE, nperm=100)

Arguments

Data

data frame or matrix where the blocks of variables are merged horizontally. If you have a different format, see change_cata_format

nblo

integer. Number of blocks (subjects).

NameBlocks

string vector. Name of each block (subject). Length must be equal to the number of blocks. If NULL, the names are S1,...Sm. Default: NULL

NameVar

string vector. Name of each variable (attribute, the same names for each subject). Length must be equal to the number of attributes. If NULL, the colnames of the first block are taken. Default: NULL

Graph

logical. Show the graphical representation? Default: TRUE

Graph_weights

logical. Should the barplot of the weights be plotted? Default: TRUE

Test_weights

logical. Should the weights be tested? Default: FALSE

nperm

integer. Number of permutations for the weight tests. Default: 100

Value

a list with:

  • S: the S matrix: a matrix with the similarity coefficient among the subjects

  • compromise: a matrix which is the compromise of the subjects (akin to a weighted average)

  • weights: the weights associated with the subjects to build the compromise

  • weights_tests: the weights tests results

  • lambda: the first eigenvalue of the S matrix

  • overall error: the error for the CATATIS criterion

  • error_by_sub: the error by subject (CATATIS criterion)

  • error_by_prod: the error by product (CATATIS criterion)

  • s_with_compromise: the similarity coefficient of each subject with the compromise

  • homogeneity: homogeneity of the subjects (in percentage)

  • CA: the results of correspondence analysis performed on the compromise dataset

  • eigenvalues: the eigenvalues associated with the correspondence analysis

  • inertia: the percentage of total variance explained by each axis of the CA

  • scalefactors: the scaling factors of each subject

  • param: parameters called
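
The link between eigenvalues and inertia can be shown with a short worked example. It assumes the usual definition, where the percentage of variance carried by an axis is its eigenvalue divided by the sum of all eigenvalues; the eigenvalues below are made-up numbers, not package output.

```r
# Made-up eigenvalues of a correspondence analysis.
eigenvalues <- c(0.30, 0.15, 0.05)

# Percentage of total variance explained by each axis.
inertia <- 100 * eigenvalues / sum(eigenvalues)
round(inertia, 1)
```

Here the first axis carries 60% of the total variance, the second 30% and the third 10%.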

References

Llobell, F., Cariou, V., Vigneau, E., Labenne, A., & Qannari, E. M. (2019). A new approach for the analysis of data and the clustering of subjects in a CATA experiment. Food Quality and Preference, 72, 31-39.
Bonnet, L., Ferney, T., Riedel, T., Qannari, E.M., Llobell, F. (September 14, 2022). Using CATA for sensory profiling: assessment of the panel performance. Eurosense, Turku, Finland.
Bonnet, L., Llobell, F., Qannari, E.M. (Pangborn 2023). Assessment of the panel performance in a RATA experiment.

See Also

catatis, plot.catatis, summary.catatis, change_cata_format, change_cata_format2

Examples

#RATA data with session
data(RATAchoc)
chang2=change_cata_format2(RATAchoc, nprod= 12, nattr= 13, nsub = 9, nsess= 3)
res.cat4=catatis_rata(Data= chang2$Datafinal, nblo = 9, NameBlocks =  chang2$NameSub)
summary(res.cat4)

#RATA data without session
Data=RATAchoc[1:108,2:16]
chang2=change_cata_format2(Data, nprod= 12, nattr= 13, nsub = 9, nsess = 1)
res.cat5=catatis_rata(Data= chang2$Datafinal, nblo = 9, NameBlocks =  chang2$NameSub)
summary(res.cat5)
graphics.off()

Change format of CATA datasets to perform CATATIS or CLUSCATA function

Description

CATATIS and CLUSCATA operate on data where the blocks of variables are merged horizontally. If you have a different format, you can use this function to change the format. Format=1 is for data merged vertically with the dataset of the first subject, then the second, and so on, with the products in the same order. Format=2 is for data merged vertically with the dataset for the first product, then the second, and so on, with the subjects in the same order.

Unlike change_cata_format2, you don't need to specify products and subjects, just make sure they are in the right order.

Usage

change_cata_format(Data, nprod, nattr, nsub, format=1, NameProds=NULL, NameAttr=NULL)

Arguments

Data

data frame or matrix. Corresponds to your data

nprod

integer. Number of products

nattr

integer. Number of attributes

nsub

integer. Number of subjects.

format

integer (1 or 2). See the description

NameProds

string vector with the names of the products (length must be nprod)

NameAttr

string vector with the names of attributes (length must be nattr)

Value

The arranged data for the CATATIS and CLUSCATA functions

See Also

catatis, cluscata, change_cata_format2
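
The two vertical layouts described above, and the horizontal layout they are turned into, can be sketched in base R. This is only an illustration of the row ordering: to_horizontal and cell are hypothetical helpers, not the package's implementation, and the cell values are codes chosen so that the result is easy to check by eye.

```r
nprod <- 3; nattr <- 2; nsub <- 2

# Each cell encodes 100 * subject + 10 * product + attribute.
cell <- function(s, p) 100 * s + 10 * p + seq_len(nattr)

# Format 1: all products of subject 1, then all products of subject 2.
vert1 <- do.call(rbind, lapply(1:nsub, function(s)
  do.call(rbind, lapply(1:nprod, function(p) cell(s, p)))))

# Format 2: all subjects for product 1, then all subjects for product 2, ...
vert2 <- do.call(rbind, lapply(1:nprod, function(p)
  do.call(rbind, lapply(1:nsub, function(s) cell(s, p)))))

# Hypothetical helper: pick the rows of each subject and bind them side by side.
to_horizontal <- function(V, nprod, nsub, format = 1) {
  do.call(cbind, lapply(seq_len(nsub), function(s) {
    rows <- if (format == 1) {
      (s - 1) * nprod + seq_len(nprod)          # consecutive rows per subject
    } else {
      seq(s, by = nsub, length.out = nprod)     # every nsub-th row
    }
    V[rows, , drop = FALSE]
  }))
}

h1 <- to_horizontal(vert1, nprod, nsub, format = 1)
h2 <- to_horizontal(vert2, nprod, nsub, format = 2)
```

Both inputs lead to the same nprod x (nattr * nsub) table, which is the layout expected by catatis and cluscata.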


Change format of CATA datasets to perform the package functions

Description

CATATIS and CLUSCATA operate on data where the blocks of variables are merged horizontally. If you have a vertical format, you can use this function to change the format. The first column must contain the sessions, the second the subjects, the third the products and the others the attributes. If you don't have sessions, then the first column must contain the subjects and the second the products. Unlike change_cata_format function, you can enter data with sessions and/or mixed data in terms of products/subjects. However, you have to set columns to indicate this beforehand.

Usage

change_cata_format2(Data, nprod, nattr, nsub, nsess)

Arguments

Data

data frame or matrix. Corresponds to your data

nprod

integer. Number of products

nattr

integer. Number of attributes

nsub

integer. Number of subjects.

nsess

integer. Number of sessions

Value

The arranged data for the CATATIS and CLUSCATA functions and the subject names in the correct order.

See Also

catatis, cluscata, change_cata_format

Examples

#Vertical format with sessions
data("fish")
chang=change_cata_format2(fish, nprod= 6, nattr= 27, nsub = 12, nsess= 3)
res.cat2=catatis(Data= chang$Datafinal, nblo = 12, NameBlocks =  chang$NameSub)

#Vertical format without sessions
Data=fish[1:66,2:30]
chang2=change_cata_format2(Data, nprod= 6, nattr= 27, nsub = 11, nsess= 1)
res.cat3=catatis(Data= chang2$Datafinal, nblo = 11, NameBlocks =  chang2$NameSub)
res.clu3=cluscata(Data= chang2$Datafinal, nblo = 11, NameBlocks =  chang2$NameSub)
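
The column order expected for the vertical input can be sketched on a simulated toy table (sessions, then subjects, then products, then attributes, as stated in the description). All names and values below are made up for illustration; no package function is called.

```r
nsess <- 2; nsub <- 3; nprod <- 2; nattr <- 4

# One row per (session, subject, product) combination.
design <- expand.grid(product = paste0("P", 1:nprod),
                      subject = paste0("S", 1:nsub),
                      session = 1:nsess)

# Simulated binary answers, one column per attribute.
set.seed(42)
answers <- matrix(rbinom(nrow(design) * nattr, 1, 0.5), ncol = nattr,
                  dimnames = list(NULL, paste0("attr", 1:nattr)))

# With sessions: session, subject, product, then the attributes.
toy <- data.frame(session = design$session,
                  subject = design$subject,
                  product = design$product,
                  answers)

# Without sessions: keep one session and drop the session column,
# so that the subjects come first and the products second.
toy_one <- toy[toy$session == 1, -1]
```

A table such as toy has nsess * nsub * nprod rows and 3 + nattr columns; toy_one has nsub * nprod rows and 2 + nattr columns.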

cheese Just About Right data

Description

cheese Just About Right data

Usage

data(cheese)

Format

JAR data. A data frame with Assessors, Products and JAR attributes. 8 products, 9 attributes and 72 subjects.

References

Luc, A., Lê, S., Philippe, M., Qannari, E. M., & Vigneau, E. (2022). Free JAR experiment: Data analysis and comparison with JAR task. Food Quality and Preference, 98, 104453.

Examples

data(cheese)

chocolates data

Description

chocolates data

Usage

data(choc)

Format

Free sorting data. A data frame with 14 rows (the chocolates) and 25 columns (the subjects). The numbers indicate the groups to which the products (rows) are assigned.

References

Courcoux, P., Qannari, E. M., Taylor, Y., Buck, D., & Greenhoff, K. (2012). Taxonomic free sorting. Food Quality and Preference, 23(1), 30-35.

Examples

data(choc)
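
To make the free sorting coding concrete, the sketch below turns the group numbers given by one subject into a product-by-product co-occurrence matrix (1 if two products were put in the same group, 0 otherwise). The vector groups is made up; with the real data it would correspond to one column of choc. This is a common way to look at sorting data and is shown for illustration only.

```r
# Made-up sorting of 5 products by one subject:
# products 1 and 2 together, products 3 and 5 together, product 4 alone.
groups <- c(1, 1, 2, 3, 2)

# 1 when two products share a group, 0 otherwise.
cooc <- outer(groups, groups, "==") * 1
cooc
```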

Perform a cluster analysis of subjects from a CATA experiment

Description

Clustering of subjects (blocks) from a CATA experiment. Each cluster of blocks is associated with a compromise computed by the CATATIS method. The hierarchical clustering is followed by a partitioning algorithm (consolidation). Non-binary data are accepted.

Usage

cluscata(Data, nblo, NameBlocks=NULL, NameVar=NULL, Noise_cluster=FALSE,
        Itermax=30, Graph_dend=TRUE, Graph_bar=TRUE, printlevel=FALSE,
        gpmax=min(6, nblo-2), rhoparam=NULL, Testonlyoneclust=FALSE, alpha=0.05,
        nperm=50, Warnings=FALSE)

Arguments

Data

data frame or matrix where the blocks of binary variables are merged horizontally. If you have a different format, see change_cata_format

nblo

numerical. Number of blocks (subjects).

NameBlocks

string vector. Name of each block (subject). Length must be equal to the number of blocks. If NULL, the names are S1,...Sm. Default: NULL

NameVar

string vector. Name of each variable (attribute, the same names for each subject). Length must be equal to the number of attributes. If NULL, the colnames of the first block are taken. Default: NULL

Noise_cluster

logical. Should a noise cluster be computed? Default: FALSE

Itermax

numerical. Maximum number of iterations for the partitioning algorithm. Default: 30

Graph_dend

logical. Should the dendrogram be plotted? Default: TRUE

Graph_bar

logical. Should the barplot of the difference of the criterion and the barplot of the overall homogeneity at each merging step of the hierarchical algorithm be plotted? Default: TRUE

printlevel

logical. Print the number of remaining levels during the hierarchical clustering algorithm? Default: FALSE

gpmax

integer. What is the maximum number of clusters to consider? Default: min(6, nblo-2)

rhoparam

numerical. What is the threshold for the noise cluster? Between 0 and 1; a high value can lead to many blocks being set aside. If NULL, an automatic threshold is computed.

Testonlyoneclust

logical. Test if there is more than one cluster? Default: FALSE

alpha

numerical between 0 and 1. What is the threshold to test if there is more than one cluster? Default: 0.05

nperm

numerical. How many permutations are required to test if there is more than one cluster? Default: 50

Warnings

logical. Display warnings about the fact that none of the subjects in some clusters checked an attribute or product? Default: FALSE

Value

Each partitionK contains a list for each number of clusters of the partition, K=1 to gpmax with:

  • group: the clustering partition after consolidation. If Noise_cluster=TRUE, some subjects could be in the noise cluster ("K+1")

  • rho: the threshold for the noise cluster

  • homogeneity: homogeneity index

  • s_with_compromise: similarity coefficient of each subject with its cluster compromise

  • weights: weight associated with each subject in its cluster

  • compromise: the compromise of each cluster

  • CA: list. The correspondence analysis results on each cluster compromise (coordinates, contributions...)

  • inertia: percentage of total variance explained by each axis of the CA for each cluster

  • s_all_cluster: the similarity coefficient between each subject and each cluster compromise

  • criterion: the CLUSCATA criterion error

  • param: parameters called

  • type: parameter passed to other functions

There is also at the end of the list:

  • dend: The CLUSCATA dendrogram

  • cutree_k: the partition obtained by cutting the dendrogram in K clusters (before consolidation).

  • overall_homogeneity_ng: percentage of overall homogeneity by number of clusters before consolidation (and after if there is no noise cluster)

  • diff_crit_ng: variation of criterion when a merging is done before consolidation (and after if there is no noise cluster)

  • test_one_cluster: decision and pvalue to know if there is more than one cluster

  • param: parameters called

  • type: parameter passed to other functions
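
The role of the noise cluster threshold can be pictured with the following sketch. It assumes a simple rule that is consistent with the descriptions of rho and s_all_cluster: each subject goes to the cluster whose compromise it is most similar to, unless that best similarity is below rho, in which case the subject goes to the noise cluster K+1. The function assign_with_noise and the similarity values are hypothetical; the rule actually used by the package is described in the cited papers.

```r
# Hypothetical helper: assign subjects to clusters with an optional noise cluster.
# s_all: subjects x clusters matrix of similarities with each cluster compromise.
assign_with_noise <- function(s_all, rho) {
  K <- ncol(s_all)
  best <- max.col(s_all, ties.method = "first")          # most similar cluster
  best_sim <- s_all[cbind(seq_len(nrow(s_all)), best)]   # its similarity
  ifelse(best_sim < rho, K + 1L, best)                   # K + 1 = noise cluster
}

# Made-up similarities of 3 subjects with K = 2 cluster compromises.
s_all <- matrix(c(0.80, 0.30,
                  0.20, 0.70,
                  0.25, 0.10), ncol = 2, byrow = TRUE)

assign_with_noise(s_all, rho = 0.3)  # third subject is set aside
assign_with_noise(s_all, rho = 0)    # no noise cluster
```

Raising rho sets aside more subjects, which matches the warning given for rhoparam.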

References

Llobell, F., Cariou, V., Vigneau, E., Labenne, A., & Qannari, E. M. (2019). A new approach for the analysis of data and the clustering of subjects in a CATA experiment. Food Quality and Preference, 72, 31-39.
Llobell, F., Giacalone, D., Labenne, A., Qannari, E.M. (2019). Assessment of the agreement and cluster analysis of the respondents in a CATA experiment. Food Quality and Preference, 77, 184-190.

See Also

plot.cluscata, summary.cluscata, catatis, cluscata_kmeans, change_cata_format, change_cata_format2

Examples

data(straw)
#with 40 subjects
res=cluscata(Data=straw[,1:(16*40)], nblo=40)
#plot(res, ngroups=3, Graph_dend=FALSE)
summary(res, ngroups=3)
#With noise cluster
res2=cluscata(Data=straw[,1:(16*40)], nblo=40, Noise_cluster=TRUE,
Graph_dend=FALSE, Graph_bar=FALSE)
#With noise cluster and defined rho threshold
#(high threshold for this example; you can use a low threshold
#(e.g. 0.2 or 0.3) to avoid setting aside many respondents)
res3=cluscata(Data=straw[,1:(16*40)], nblo=40, Noise_cluster=TRUE,
Graph_dend=FALSE, Graph_bar=FALSE, rhoparam=0.6)
#with all subjects
res=cluscata(Data=straw, nblo=114, printlevel=TRUE)


#Vertical format
data("fish")
Data=fish[1:66,2:30]
chang2=change_cata_format2(Data, nprod= 6, nattr= 27, nsub = 11, nsess= 1)
res3=cluscata(Data= chang2$Datafinal, nblo = 11, NameBlocks =  chang2$NameSub)

Perform a cluster analysis of subjects in a JAR experiment.

Description

Hierarchical clustering of subjects from a JAR experiment. Each cluster of subjects is associated with a compromise computed by the CATATIS method. The hierarchical clustering is followed by a partitioning algorithm (consolidation).

Usage

cluscata_jar(Data, nprod, nsub, levelsJAR=3, beta=0.1,  Noise_cluster=FALSE,
        Itermax=30, Graph_dend=TRUE, Graph_bar=TRUE, printlevel=FALSE,
        gpmax=min(6, nsub-2), rhoparam=NULL,
        Testonlyoneclust=FALSE, alpha=0.05, nperm=50, Warnings=FALSE)

Arguments

Data

data frame where the first column is the Assessors, the second is the products and all other columns the JAR attributes with numbers (1 to 3 or 1 to 5, see levelsJAR)

nprod

integer. Number of products.

nsub

integer. Number of subjects.

levelsJAR

integer. 3 or 5 levels. If 5, the data will be transformed into 3 levels.

beta

numerical. Parameter for agreement between JAR and other answers. Between 0 and 0.5.

Noise_cluster

logical. Should a noise cluster be computed? Default: FALSE

Itermax

numerical. Maximum number of iterations for the partitioning algorithm. Default: 30

Graph_dend

logical. Should the dendrogram be plotted? Default: TRUE

Graph_bar

logical. Should the barplot of the difference of the criterion and the barplot of the overall homogeneity at each merging step of the hierarchical algorithm be plotted? Default: TRUE

printlevel

logical. Print the number of remaining levels during the hierarchical clustering algorithm? Default: FALSE

gpmax

integer. What is the maximum number of clusters to consider? Default: min(6, nsub-2)

rhoparam

numerical. What is the threshold for the noise cluster? Between 0 and 1; a high value can lead to many blocks being set aside. If NULL, an automatic threshold is computed.

Testonlyoneclust

logical. Test if there is more than one cluster? Default: FALSE

alpha

numerical between 0 and 1. What is the threshold to test if there is more than one cluster? Default: 0.05

nperm

numerical. How many permutations are required to test if there is more than one cluster? Default: 50

Warnings

logical. Display warnings about the fact that none of the subjects in some clusters checked an attribute or product? Default: FALSE

Value

Each partitionK contains a list for each number of clusters of the partition, K=1 to gpmax with:

  • group: the clustering partition after consolidation. If Noise_cluster=TRUE, some subjects could be in the noise cluster ("K+1")

  • rho: the threshold for the noise cluster

  • homogeneity: homogeneity index

  • s_with_compromise: similarity coefficient of each subject with its cluster compromise

  • weights: weight associated with each subject in its cluster

  • compromise: the compromise of each cluster

  • CA: list. The correspondence analysis results on each cluster compromise (coordinates, contributions...)

  • inertia: percentage of total variance explained by each axis of the CA for each cluster

  • s_all_cluster: the similarity coefficient between each subject and each cluster compromise

  • criterion: the CLUSCATA criterion error

  • param: parameters called

  • type: parameter passed to other functions

There is also at the end of the list:

  • dend: The CLUSCATA dendrogram

  • cutree_k: the partition obtained by cutting the dendrogram in K clusters (before consolidation).

  • overall_homogeneity_ng: percentage of overall homogeneity by number of clusters before consolidation (and after if there is no noise cluster)

  • diff_crit_ng: variation of criterion when a merging is done before consolidation (and after if there is no noise cluster)

  • test_one_cluster: decision and pvalue to know if there is more than one cluster

  • param: parameters called

  • type: parameter passed to other functions

References

Llobell, F., Vigneau, E. & Qannari, E. M. (September 14, 2022). Multivariate data analysis and clustering of subjects in a Just about right task. Eurosense, Turku, Finland.

See Also

plot.cluscata, summary.cluscata, catatis_jar, preprocess_JAR, cluscata_kmeans_jar

Examples

data(cheese)
res=cluscata_jar(Data=cheese, nprod=8, nsub=72, levelsJAR=5)
#plot(res, ngroups=4, Graph_dend=FALSE)
summary(res, ngroups=4)

Compute the CLUSCATA partitioning algorithm on different blocks from a CATA experiment. Can be performed using a multi-start strategy or initial partition provided by the user.

Description

Partitioning of binary blocks from a CATA experiment. Each cluster is associated with a compromise computed by the CATATIS method. Moreover, a noise cluster can be set up.

Usage

cluscata_kmeans(Data,nblo, clust, nstart=100, rho=0, NameBlocks=NULL, NameVar=NULL,
               Itermax=30, Graph_groups=TRUE, print_attempt=FALSE, Warnings=FALSE)

Arguments

Data

data frame or matrix where the blocks of binary variables are merged horizontally. If you have a different format, see change_cata_format

nblo

numerical. Number of blocks (subjects).

clust

numerical vector or integer. Initial partition, or number of clusters if integer (nstart starting partitions are then used). If numerical vector, the numbers must be 1,2,3,...,number of clusters

nstart

numerical. Number of starting partitions. Default: 100

rho

numerical between 0 and 1. Threshold for the noise cluster. If 0, there is no noise cluster. Default: 0

NameBlocks

string vector. Name of each block. Length must be equal to the number of blocks. If NULL, the names are S1,...Sm. Default: NULL

NameVar

string vector. Name of each variable (attribute, the same names for each subject). Length must be equal to the number of attributes. If NULL, the colnames of the first block are taken. Default: NULL

Itermax

numerical. Maximum number of iterations for the partitioning algorithm. Default: 30

Graph_groups

logical. Should each cluster compromise graphical representation be plotted? Default: TRUE

print_attempt

logical. Print the number of remaining attempts in multi-start case? Default: FALSE

Warnings

logical. Display warnings about the fact that none of the subjects in some clusters checked an attribute or product? Default: FALSE
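
When clust is given as an initial partition, the labels must be 1,2,3,...,number of clusters, with one label per block. The sketch below builds such a vector and checks it; is_valid_partition is a hypothetical helper written for this illustration, not a package function.

```r
nblo <- 40   # number of blocks (subjects)
K <- 3       # number of clusters

# A simple initial partition: labels 1, 2, 3 repeated over the 40 blocks.
init <- rep_len(seq_len(K), nblo)

# Hypothetical check: one label per block, and labels exactly 1..max(label).
is_valid_partition <- function(clust, nblo) {
  length(clust) == nblo && setequal(clust, seq_len(max(clust)))
}

is_valid_partition(init, nblo)      # valid
is_valid_partition(c(1, 3, 3), 3)   # not valid: label 2 is missing
```

A vector such as init could then be supplied as clust instead of a single integer.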

Value

a list with:

  • group: the clustering partition. If rho>0, some subjects could be in the noise cluster ("K+1")

  • rho: the threshold for the noise cluster

  • homogeneity: percentage of homogeneity of the subjects in each cluster and the overall homogeneity

  • s_with_compromise: Similarity coefficient of each subject with its cluster compromise

  • weights: weight associated with each subject in its cluster

  • compromise: The compromise of each cluster

  • CA: The correspondence analysis results on each cluster compromise (coordinates, contributions...)

  • inertia: percentage of total variance explained by each axis of the CA for each cluster

  • s_all_cluster: the similarity coefficient between each subject and each cluster compromise

  • param: parameters called

  • criterion: the CLUSCATA criterion error

  • type: parameter passed to other functions

References

Llobell, F., Cariou, V., Vigneau, E., Labenne, A., & Qannari, E. M. (2019). A new approach for the analysis of data and the clustering of subjects in a CATA experiment. Food Quality and Preference, 72, 31-39.
Llobell, F., Giacalone, D., Labenne, A., Qannari, E.M. (2019). Assessment of the agreement and cluster analysis of the respondents in a CATA experiment. Food Quality and Preference, 77, 184-190.

See Also

plot.cluscata, summary.cluscata, catatis, cluscata, change_cata_format

Examples

data(straw)
cl_km=cluscata_kmeans(Data=straw[,1:(16*40)], nblo=40, clust=3)
#plot(cl_km, Graph_groups=FALSE, Graph_weights = TRUE)
summary(cl_km)

Perform a cluster analysis of subjects in a JAR experiment.

Description

Partitioning of subjects from a JAR experiment. Each cluster is associated with a compromise computed by the CATATIS method. Moreover, a noise cluster can be set up.

Usage

cluscata_kmeans_jar(Data, nprod, nsub, levelsJAR=3, beta=0.1, clust, nstart=100, rho=0,
Itermax=30, Graph_groups=TRUE, print_attempt=FALSE, Warnings=FALSE)

Arguments

Data

data frame where the first column is the Assessors, the second is the products and all other columns the JAR attributes with numbers (1 to 3 or 1 to 5, see levelsJAR)

nprod

integer. Number of products.

nsub

integer. Number of subjects.

levelsJAR

integer. 3 or 5 levels. If 5, the data will be transformed into 3 levels.

beta

numerical. Parameter for agreement between JAR and other answers. Between 0 and 0.5.

clust

numerical vector or integer. Initial partition, or number of clusters if integer (nstart starting partitions are then used). If numerical vector, the numbers must be 1,2,3,...,number of clusters

nstart

numerical. Number of starting partitions. Default: 100

rho

numerical between 0 and 1. Threshold for the noise cluster. If 0, there is no noise cluster. Default: 0

Itermax

numerical. Maximum number of iterations for the partitioning algorithm. Default: 30

Graph_groups

logical. Should each cluster compromise graphical representation be plotted? Default: TRUE

print_attempt

logical. Print the number of remaining attempts in multi-start case? Default: FALSE

Warnings

logical. Display warnings about the fact that none of the subjects in some clusters checked an attribute or product? Default: FALSE

Value

a list with:

  • group: the clustering partition. If rho>0, some subjects could be in the noise cluster ("K+1")

  • rho: the threshold for the noise cluster

  • homogeneity: percentage of homogeneity of the subjects in each cluster and the overall homogeneity

  • s_with_compromise: Similarity coefficient of each subject with its cluster compromise

  • weights: weight associated with each subject in its cluster

  • compromise: The compromise of each cluster

  • CA: The correspondence analysis results on each cluster compromise (coordinates, contributions...)

  • inertia: percentage of total variance explained by each axis of the CA for each cluster

  • s_all_cluster: the similarity coefficient between each subject and each cluster compromise

  • param: parameters called

  • criterion: the CLUSCATA criterion error

  • type: parameter passed to other functions

References

Llobell, F., Vigneau, E. & Qannari, E. M. (September 14, 2022). Multivariate data analysis and clustering of subjects in a Just about right task. Eurosense, Turku, Finland.

See Also

plot.cluscata, summary.cluscata, catatis_jar, preprocess_JAR, cluscata_jar

Examples

data(cheese)
res=cluscata_kmeans_jar(Data=cheese, nprod=8, nsub=72, levelsJAR=5, clust=4)
#plot(res)
summary(res)

Perform a cluster analysis of subjects from a RATA experiment

Description

Hierarchical clustering of subjects (blocks) from a RATA experiment. Each cluster of blocks is associated with a compromise computed by the CATATIS method. The hierarchical clustering is followed by a partitioning algorithm (consolidation).

Usage

cluscata_rata(Data, nblo, NameBlocks=NULL, NameVar=NULL, Noise_cluster=FALSE,
        Itermax=30, Graph_dend=TRUE, Graph_bar=TRUE, printlevel=FALSE,
        gpmax=min(6, nblo-2), rhoparam=NULL, Testonlyoneclust=FALSE, alpha=0.05,
        nperm=50, Warnings=FALSE)

Arguments

Data

data frame or matrix where the blocks of binary variables are merged horizontally. If you have a different format, see change_cata_format

nblo

numerical. Number of blocks (subjects).

NameBlocks

string vector. Name of each block (subject). Length must be equal to the number of blocks. If NULL, the names are S1,...Sm. Default: NULL

NameVar

string vector. Name of each variable (attribute, the same names for each subject). Length must be equal to the number of attributes. If NULL, the colnames of the first block are taken. Default: NULL

Noise_cluster

logical. Should a noise cluster be computed? Default: FALSE

Itermax

numerical. Maximum number of iterations for the partitioning algorithm. Default: 30

Graph_dend

logical. Should the dendrogram be plotted? Default: TRUE

Graph_bar

logical. Should the barplot of the difference of the criterion and the barplot of the overall homogeneity at each merging step of the hierarchical algorithm be plotted? Default: TRUE

printlevel

logical. Print the number of remaining levels during the hierarchical clustering algorithm? Default: FALSE

gpmax

numerical. What is the maximum number of clusters to consider? Default: min(6, nblo-2)

rhoparam

numerical. What is the threshold for the noise cluster? Between 0 and 1; a high value can cause many blocks to be set aside. If NULL, an automatic threshold is computed.

Testonlyoneclust

logical. Test if there is more than one cluster? Default: FALSE

alpha

numerical between 0 and 1. What is the threshold to test if there is more than one cluster? Default: 0.05

nperm

numerical. How many permutations are required to test if there is more than one cluster? Default: 50

Warnings

logical. Display warnings about the fact that none of the subjects in some clusters checked an attribute or product? Default: FALSE

Value

Each partitionK contains a list for each number of clusters of the partition, K=1 to gpmax with:

  • group: the clustering partition after consolidation. If Noise_cluster=TRUE, some subjects could be in the noise cluster ("K+1")

  • rho: the threshold for the noise cluster

  • homogeneity: homogeneity index (percentage) of each cluster and overall homogeneity index of the partition

  • s_with_compromise: similarity coefficient of each subject with its cluster compromise

  • weights: weight associated with each subject in its cluster

  • compromise: the compromise of each cluster

  • CA: list. The correspondence analysis results on each cluster compromise (coordinates, contributions...)

  • inertia: percentage of total variance explained by each axis of the CA for each cluster

  • s_all_cluster: the similarity coefficient between each subject and each cluster compromise

  • criterion: the CLUSCATA criterion error

  • param: parameters called

  • type: parameter passed to other functions

There is also at the end of the list:

  • dend: The CLUSCATA dendrogram

  • cutree_k: the partition obtained by cutting the dendrogram in K clusters (before consolidation).

  • overall_homogeneity_ng: percentage of overall homogeneity by number of clusters before consolidation (and after if there is no noise cluster)

  • diff_crit_ng: variation of criterion when a merging is done before consolidation (and after if there is no noise cluster)

  • test_one_cluster: decision and pvalue to know if there is more than one cluster

  • param: parameters called

  • type: parameter passed to other functions

References

Llobell, F., Cariou, V., Vigneau, E., Labenne, A., & Qannari, E. M. (2019). A new approach for the analysis of data and the clustering of subjects in a CATA experiment. Food Quality and Preference, 72, 31-39.
Llobell, F., Giacalone, D., Labenne, A., Qannari, E.M. (2019). Assessment of the agreement and cluster analysis of the respondents in a CATA experiment. Food Quality and Preference, 77, 184-190.
Conference to come (Eurosense 2024)

See Also

plot.cluscata, summary.cluscata, catatis_rata, change_cata_format, change_cata_format2

Examples

#RATA data without session
data(RATAchoc)
Data=RATAchoc[1:108,2:16]
chang2=change_cata_format2(Data, nprod= 12, nattr= 13, nsub = 9, nsess = 1)
res.clus=cluscata_rata(Data= chang2$Datafinal, nblo = 9, NameBlocks =  chang2$NameSub)
summary(res.clus)
plot(res.clus)

Perform a cluster analysis of rows in a Multi-block context with the ClusMB method

Description

Clustering of rows (products in sensory analysis) in a Multi-block context. The hierarchical clustering is followed by a partitioning algorithm (consolidation).

Usage

ClusMB(Data, Blocks, NameBlocks=NULL, scale=FALSE, center=TRUE,
nclust=NULL, gpmax=6)

Arguments

Data

data frame or matrix. Corresponds to all the blocks of variables merged horizontally

Blocks

numerical vector. The number of variables of each block. The sum must be equal to the number of columns of Data.

NameBlocks

string vector. Name of each block. Length must be equal to the length of Blocks vector. If NULL, the names are B1,...Bm. Default: NULL

scale

logical. Should the data variables be scaled? Default: FALSE

center

logical. Should the data variables be centered? Default: TRUE. Please set to FALSE for a CATA experiment

nclust

numerical. Number of clusters to consider. If NULL, the Hartigan index advice is taken.

gpmax

numerical. What is the maximum number of clusters to consider? Default: 6

Value

  • group: the clustering partition after consolidation.

  • nbgH: Advised number of clusters per Hartigan index

  • nbgCH: Advised number of clusters per Calinski-Harabasz index

  • cutree_k: the partition obtained by cutting the dendrogram in K clusters (before consolidation).

  • dend: The ClusMB dendrogram

  • param: parameters called

  • type: parameter passed to other functions

References

Llobell, F., Qannari, E.M. (June 10, 2022). Cluster analysis in a multi-bloc setting. SMTDA, Athens, Greece.
Llobell, F., Giacalone, D., Qannari, E. M. (Pangborn 2021). Cluster Analysis of products in CATA experiments.
Paper submitted

See Also

indicesClusters, summary.clusRows, clustRowsOnStatisAxes

Examples

#####projective mapping####
library(ClustBlock)
data(smoo)
res1=ClusMB(smoo, rep(2,24))
summary(res1)
indicesClusters(smoo, rep(2,24), res1$group)

####CATA####
data(fish)
Data=fish[1:66,2:30]
chang2=change_cata_format2(Data, nprod= 6, nattr= 27, nsub = 11, nsess= 1)
res2=ClusMB(Data= chang2$Datafinal, Blocks= rep(27, 11), center=FALSE)
indicesClusters(Data= chang2$Datafinal, Blocks= rep(27, 11),cut = res2$group, center=FALSE)
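
The Blocks argument describes how the columns of Data are divided into consecutive blocks. The base R sketch below does not call ClustBlock; it only illustrates the documented constraint (the block sizes must sum to the number of columns) and one way to recover each block. The helper name split_blocks and the toy matrix are hypothetical.

```r
# Hypothetical helper: split a horizontally merged table into its blocks.
split_blocks <- function(Data, Blocks) {
  Data <- as.matrix(Data)
  # Documented constraint: block sizes must add up to ncol(Data).
  stopifnot(sum(Blocks) == ncol(Data))
  ends <- cumsum(Blocks)
  starts <- ends - Blocks + 1
  lapply(seq_along(Blocks), function(j) Data[, starts[j]:ends[j], drop = FALSE])
}

# Toy data: 5 rows (products), 3 blocks with 2, 3 and 1 variables.
toy <- matrix(seq_len(5 * 6), nrow = 5)
blocks <- split_blocks(toy, Blocks = c(2, 3, 1))
sapply(blocks, ncol)
```

When every block has the same number of variables, as in the examples above, Blocks can be built with rep(), for instance rep(2, 24).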

Perform a cluster analysis of blocks of quantitative variables

Description

Hierarchical clustering of quantitative Blocks followed by a partitioning algorithm (consolidation). Each cluster of blocks is associated with a compromise computed by the STATIS method. Moreover, a noise cluster can be set up.

Usage

clustatis(Data,Blocks,NameBlocks=NULL,Noise_cluster=FALSE,scale=FALSE,
  Itermax=30, Graph_dend=TRUE, Graph_bar=TRUE,
  printlevel=FALSE, gpmax=min(6, length(Blocks)-2),  rhoparam=NULL,
  Testonlyoneclust=FALSE, alpha=0.05, nperm=50)

Arguments

Data

data frame or matrix. Corresponds to all the blocks of variables merged horizontally

Blocks

numerical vector. The number of variables of each block. The sum must be equal to the number of columns of Data

NameBlocks

string vector. Name of each block. Length must be equal to the length of Blocks vector. If NULL, the names are B1,...Bm. Default: NULL

Noise_cluster

logical. Should a noise cluster be computed? Default: FALSE

scale

logical. Should the data variables be scaled? Default: FALSE

Itermax

numerical. Maximum number of iterations for the partitioning algorithm. Default: 30

Graph_dend

logical. Should the dendrogram be plotted? Default: TRUE

Graph_bar

logical. Should the barplot of the difference of the criterion and the barplot of the overall homogeneity at each merging step of the hierarchical algorithm be plotted? Default: TRUE

printlevel

logical. Print the number of remaining levels during the hierarchical clustering algorithm? Default: FALSE

gpmax

numerical. What is the maximum number of clusters to consider? Default: min(6, number of blocks -2)

rhoparam

numerical. What is the threshold for the noise cluster? Between 0 and 1; a high value can cause many blocks to be set aside. If NULL, an automatic threshold is computed.

Testonlyoneclust

logical. Test if there is more than one cluster? Default: FALSE

alpha

numerical between 0 and 1. What is the threshold to test if there is more than one cluster? Default: 0.05

nperm

numerical. How many permutations are required to test if there is more than one cluster? Default: 50

Value

Each partitionK contains a list for each number of clusters of the partition, K=1 to gpmax with:

  • group: the clustering partition of datasets after consolidation. If Noise_cluster=TRUE, some blocks could be in the noise cluster ("K+1")

  • rho: the threshold for the noise cluster (computed or input parameter)

  • homogeneity: homogeneity index (percentage) of each cluster and overall homogeneity index of the partition

  • rv_with_compromise: RV coefficient of each block with its cluster compromise

  • weights: weight associated with each block in its cluster

  • comp_RV: RV coefficient between the compromises associated with the various clusters

  • compromise: the W compromise of each cluster

  • coord: the coordinates of objects of each cluster

  • inertia: percentage of total variance explained by each axis for each cluster

  • rv_all_cluster: the RV coefficient between each block and each cluster compromise

  • criterion: the CLUSTATIS criterion error

  • param: parameters called in the consolidation

  • type: parameter passed to other functions

There is also at the end of the list:

  • dend: The CLUSTATIS dendrogram

  • cutree_k: the partition obtained by cutting the dendrogram for K clusters (before consolidation).

  • overall_homogeneity_ng: percentage of overall homogeneity by number of clusters before consolidation (and after if there is no noise cluster)

  • diff_crit_ng: variation of criterion when a merging is done before consolidation (and after if there is no noise cluster)

  • test_one_cluster: decision and pvalue to know if there is more than one cluster

  • param: parameters called

  • type: parameter passed to other functions

References

Llobell, F., Cariou, V., Vigneau, E., Labenne, A., & Qannari, E. M. (2018). Analysis and clustering of multiblock datasets by means of the STATIS and CLUSTATIS methods. Application to sensometrics. Food Quality and Preference, in Press.
Llobell, F., Vigneau, E., Qannari, E. M. (2019). Clustering datasets by means of CLUSTATIS with identification of atypical datasets. Application to sensometrics. Food Quality and Preference, 75, 97-104.

See Also

plot.clustatis, summary.clustatis, clustatis_kmeans, statis

Examples

data(smoo)
 NameBlocks=paste0("S",1:24)
 cl=clustatis(Data=smoo,Blocks=rep(2,24),NameBlocks = NameBlocks)
 #plot(cl, ngroups=3, Graph_dend=FALSE)
 summary(cl)
 #with noise cluster
 cl2=clustatis(Data=smoo,Blocks=rep(2,24),NameBlocks = NameBlocks,
 Noise_cluster=TRUE, Graph_dend=FALSE, Graph_bar=FALSE)
 #with noise cluster and defined rho threshold
 cl3=clustatis(Data=smoo,Blocks=rep(2,24),NameBlocks = NameBlocks,
 Noise_cluster=TRUE, Graph_dend=FALSE, Graph_bar=FALSE, rhoparam=0.5)
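
Several returned values (rv_with_compromise, comp_RV, rv_all_cluster) are RV coefficients. As a rough illustration only, the standard RV coefficient between two blocks measured on the same rows can be computed in base R as below. The function name rv_coef and the toy blocks are hypothetical, and the sketch assumes column-centered blocks compared through W = X X'; the package's internal preprocessing may differ.

```r
# RV coefficient between two blocks measured on the same rows.
# Assumption: blocks are column-centered and compared through W = X X'.
rv_coef <- function(X, Y) {
  X <- scale(as.matrix(X), center = TRUE, scale = FALSE)
  Y <- scale(as.matrix(Y), center = TRUE, scale = FALSE)
  Wx <- tcrossprod(X)
  Wy <- tcrossprod(Y)
  # trace(Wx %*% Wy) equals sum(Wx * Wy) because both matrices are symmetric.
  sum(Wx * Wy) / sqrt(sum(Wx * Wx) * sum(Wy * Wy))
}

set.seed(1)
A <- matrix(rnorm(20), nrow = 10)        # block 1: 10 rows, 2 variables
B <- A %*% matrix(c(0, 1, -1, 0), 2)     # block 2: block 1 rotated by 90 degrees
C <- matrix(rnorm(20), nrow = 10)        # block 3: unrelated noise

rv_coef(A, B)  # a rotation leaves the configuration of the rows unchanged
rv_coef(A, C)
```

The coefficient lies between 0 and 1, which is why the rhoparam threshold is also expressed on that scale.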

Perform a cluster analysis of free sorting data

Description

Hierarchical clustering of free sorting data followed by a partitioning algorithm (consolidation). Each cluster of blocks is associated with a compromise computed by the STATIS method. Moreover, a noise cluster can be set up.

Usage

clustatis_FreeSort(Data, NameSub=NULL, Noise_cluster=FALSE,Itermax=30,
                           Graph_dend=TRUE, Graph_bar=TRUE, printlevel=FALSE,
                           gpmax=min(6, ncol(Data)-1),rhoparam=NULL,
                           Testonlyoneclust=FALSE, alpha=0.05, nperm=50)

Arguments

Data

data frame or matrix. Corresponds to all variables that contain subjects results. Each column corresponds to a subject and gives the groups to which the products (rows) are assigned

NameSub

string vector. Name of each subject. Length must be equal to the number of columns of Data. If NULL, the names are S1,...Sm. Default: NULL

Noise_cluster

logical. Should a noise cluster be computed? Default: FALSE

Itermax

numerical. Maximum number of iterations for the partitioning algorithm. Default: 30

Graph_dend

logical. Should the dendrogram be plotted? Default: TRUE

Graph_bar

logical. Should the barplot of the difference of the criterion and the barplot of the overall homogeneity at each merging step be plotted? Default: TRUE

printlevel

logical. Print the number of remaining levels during the hierarchical clustering algorithm? Default: FALSE

gpmax

numerical. What is the maximum number of clusters to consider? Default: min(6, number of subjects -1)

rhoparam

numerical. What is the threshold for the noise cluster? Between 0 and 1; a high value can cause many subjects to be set aside. If NULL, an automatic threshold is computed.

Testonlyoneclust

logical. Test if there is more than one cluster? Default: FALSE

alpha

numerical between 0 and 1. What is the threshold to test if there is more than one cluster? Default: 0.05

nperm

numerical. How many permutations are required to test if there is more than one cluster? Default: 50

Value

Each partitionK contains a list for each number of clusters of the partition, K=1 to gpmax with:

  • group: the clustering partition of subjects after consolidation. If Noise_cluster=TRUE, some subjects could be in the noise cluster ("K+1")

  • rho: the threshold for the noise cluster

  • homogeneity: homogeneity index (percentage) of each cluster and overall homogeneity index of the partition

  • rv_with_compromise: RV coefficient of each block with its cluster compromise

  • weights: weight associated with each subject in its cluster

  • comp_RV: RV coefficient between the compromises associated with the various clusters

  • compromise: the W compromise of each cluster

  • coord: the coordinates of objects of each cluster

  • inertia: percentage of total variance explained by each axis for each cluster

  • rv_all_cluster: the RV coefficient between each subject and each cluster compromise

  • criterion: the CLUSTATIS criterion error

  • param: parameters called in the consolidation

  • type: parameter passed to other functions

There is also at the end of the list:

  • dend: The CLUSTATIS dendrogram

  • cutree_k: the partition obtained by cutting the dendrogram for K clusters (before consolidation).

  • overall_homogeneity_ng: percentage of overall homogeneity by number of clusters before consolidation (and after if there is no noise cluster)

  • diff_crit_ng: variation of criterion when a merging is done before consolidation (and after if there is no noise cluster)

  • test_one_cluster: decision and pvalue to know if there is more than one cluster

  • param: parameters called

  • type: parameter passed to other functions

References

Llobell, F., Cariou, V., Vigneau, E., Labenne, A., & Qannari, E. M. (2018). Analysis and clustering of multiblock datasets by means of the STATIS and CLUSTATIS methods. Application to sensometrics. Food Quality and Preference, in Press.
Llobell, F., Vigneau, E., Qannari, E. M. (2019). Clustering datasets by means of CLUSTATIS with identification of atypical datasets. Application to sensometrics. Food Quality and Preference, 75, 97-104.

See Also

clustatis, preprocess_FreeSort, summary.clustatis, plot.clustatis

Examples

data(choc)
res.clu=clustatis_FreeSort(choc)
plot(res.clu, Graph_dend=FALSE)
summary(res.clu)
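
In free sorting data, each column holds one subject's grouping of the products. The base R sketch below shows one common way to look at such a column, as a product-by-product co-occurrence matrix. It is an illustration on made-up data; the helper name cooccurrence is hypothetical, and the transformation actually applied by the package (see preprocess_FreeSort) may differ.

```r
# Toy free sorting data: 5 products (rows) sorted by 3 subjects (columns).
sorting <- data.frame(
  S1 = c(1, 1, 2, 2, 3),
  S2 = c(1, 2, 2, 3, 3),
  S3 = c(1, 1, 1, 2, 2),
  row.names = paste0("P", 1:5)
)

# Hypothetical helper: 1 if two products share a group for this subject.
cooccurrence <- function(groups) {
  m <- outer(groups, groups, "==") * 1
  dimnames(m) <- list(names(groups), names(groups))
  m
}

co_S1 <- cooccurrence(setNames(sorting$S1, rownames(sorting)))
co_S1

# Summing over subjects counts how often each pair was sorted together.
co_all <- Reduce(`+`, lapply(sorting, cooccurrence))
co_all
```

The group labels themselves carry no meaning: relabelling a subject's groups leaves the co-occurrence matrix unchanged.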

Compute the CLUSTATIS partitioning algorithm on free sorting data

Description

Partitioning algorithm for Free Sorting data. Each cluster is associated with a compromise computed by the STATIS method. Moreover, a noise cluster can be set up.

Usage

clustatis_FreeSort_kmeans(Data, NameSub=NULL, clust, nstart=100, rho=0,Itermax=30,
Graph_groups=TRUE, Graph_weights=FALSE,  print_attempt=FALSE)

Arguments

Data

data frame or matrix. Corresponds to all variables that contain subjects results. Each column corresponds to a subject and gives the groups to which the products (rows) are assigned

NameSub

string vector. Name of each subject. Length must be equal to the number of columns of Data. If NULL, the names are S1,...Sm. Default: NULL

clust

numerical vector or integer. Initial partition (numerical vector) or number of clusters (integer); in the integer case, the multi-start strategy with nstart starting partitions is used. If numerical vector, the numbers must be 1,2,3,...,number of clusters

nstart

integer. Number of starting partitions. Default: 100

rho

numerical between 0 and 1. Threshold for the noise cluster. Default: 0

Itermax

numerical. Maximum number of iterations for the partitioning algorithm. Default: 30

Graph_groups

logical. Should each cluster compromise be plotted? Default: TRUE

Graph_weights

logical. Should the barplot of the weights in each cluster be plotted? Default: FALSE

print_attempt

logical. Print the number of remaining attempts in the multi-start case? Default: FALSE

Value

a list with:

  • group: the clustering partition. If rho>0, some subjects could be in the noise cluster ("K+1")

  • rho: the threshold for the noise cluster

  • homogeneity: percentage of homogeneity of the subjects in each cluster and the overall homogeneity

  • rv_with_compromise: RV coefficient of each subject with its cluster compromise

  • weights: weight associated with each subject in its cluster

  • comp_RV: RV coefficient between the compromises associated with the various clusters

  • compromise: the W compromise of each cluster

  • coord: the coordinates of objects of each cluster

  • inertia: percentage of total variance explained by each axis for each cluster

  • rv_all_cluster: the RV coefficient between each subject and each cluster compromise

  • criterion: the CLUSTATIS criterion error

  • param: parameters called

  • type: parameter passed to other functions

References

Llobell, F., Cariou, V., Vigneau, E., Labenne, A., & Qannari, E. M. (2018). Analysis and clustering of multiblock datasets by means of the STATIS and CLUSTATIS methods. Application to sensometrics. Food Quality and Preference, in Press.
Llobell, F., Vigneau, E., Qannari, E. M. (2019). Clustering datasets by means of CLUSTATIS with identification of atypical datasets. Application to sensometrics. Food Quality and Preference, 75, 97-104.

See Also

clustatis_FreeSort, preprocess_FreeSort, summary.clustatis, plot.clustatis

Examples

data(choc)
res.clu=clustatis_FreeSort_kmeans(choc, clust=2)
plot(res.clu, Graph_groups=FALSE, Graph_weights=TRUE)
summary(res.clu)
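
When clust is given as an initial partition, its values must be 1, 2, ..., number of clusters. A minimal base R sketch of how labels coming from another source could be recoded to satisfy that rule is given below; the helper name as_initial_partition and the example labels are hypothetical.

```r
# Hypothetical helper: recode arbitrary cluster labels as 1, 2, ..., K,
# numbering the clusters in order of first appearance.
as_initial_partition <- function(labels) {
  as.integer(factor(labels, levels = unique(labels)))
}

raw <- c("b", "b", "a", "c", "a", "c")   # one label per subject
init <- as_initial_partition(raw)
init

# Check the documented constraint: the labels are exactly 1..K.
identical(sort(unique(init)), seq_len(max(init)))
```

The resulting vector needs one entry per subject, that is, one per column of Data.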

Compute the CLUSTATIS partitioning algorithm on different blocks of quantitative variables. Can be performed using a multi-start strategy or initial partition provided by the user.

Description

Partitioning algorithm for quantitative variables. Each cluster is associated with a compromise computed by the STATIS method. Moreover, a noise cluster can be set up.

Usage

clustatis_kmeans(Data, Blocks, clust, nstart=100, rho=0, NameBlocks=NULL,
Itermax=30,Graph_groups=TRUE, Graph_weights=FALSE,
 scale=FALSE, print_attempt=FALSE)

Arguments

Data

data frame or matrix. Corresponds to all the blocks of variables merged horizontally

Blocks

numerical vector. The number of variables of each block. The sum must be equal to the number of columns of Data

clust

numerical vector or integer. Initial partition (numerical vector) or number of clusters (integer); in the integer case, the multi-start strategy with nstart starting partitions is used. If numerical vector, the numbers must be 1,2,3,...,number of clusters

nstart

integer. Number of starting partitions. Default: 100

rho

numerical between 0 and 1. Threshold for the noise cluster. Default: 0

NameBlocks

string vector. Name of each block. Length must be equal to the length of Blocks vector. If NULL, the names are B1,...Bm. Default: NULL

Itermax

numerical. Maximum number of iterations for the partitioning algorithm. Default: 30

Graph_groups

logical. Should each cluster compromise be plotted? Default: TRUE

Graph_weights

logical. Should the barplot of the weights in each cluster be plotted? Default: FALSE

scale

logical. Should the data variables be scaled? Default: FALSE

print_attempt

logical. Print the number of remaining attempts in the multi-start case? Default: FALSE

Value

a list with:

  • group: the clustering partition. If rho>0, some blocks could be in the noise cluster ("K+1")

  • rho: the threshold for the noise cluster

  • homogeneity: percentage of homogeneity of the blocks in each cluster and the overall homogeneity

  • rv_with_compromise: RV coefficient of each block with its cluster compromise

  • weights: weight associated with each block in its cluster

  • comp_RV: RV coefficient between the compromises associated with the various clusters

  • compromise: the W compromise of each cluster

  • coord: the coordinates of objects of each cluster

  • inertia: percentage of total variance explained by each axis for each cluster

  • rv_all_cluster: the RV coefficient between each block and each cluster compromise

  • criterion: the CLUSTATIS criterion error

  • param: parameters called

  • type: parameter passed to other functions

References

Llobell, F., Cariou, V., Vigneau, E., Labenne, A., & Qannari, E. M. (2018). Analysis and clustering of multiblock datasets by means of the STATIS and CLUSTATIS methods. Application to sensometrics. Food Quality and Preference, in Press.
Llobell, F., Vigneau, E., Qannari, E. M. (2019). Clustering datasets by means of CLUSTATIS with identification of atypical datasets. Application to sensometrics. Food Quality and Preference, 75, 97-104.

See Also

plot.clustatis, clustatis, summary.clustatis, statis

Examples

data(smoo)
 NameBlocks=paste0("S",1:24)
 #with multi-start
 cl_km=clustatis_kmeans(Data=smoo,Blocks=rep(2,24),NameBlocks = NameBlocks, clust=3)
 #with an initial partition
 cl=clustatis(Data=smoo,Blocks=rep(2,24),NameBlocks = NameBlocks,
 Graph_dend=FALSE)
 partition=cl$cutree_k$partition3
 cl_km2=clustatis_kmeans(Data=smoo,Blocks=rep(2,24),NameBlocks = NameBlocks,
 clust=partition, Graph_weights=FALSE, Graph_groups=FALSE)
 graphics.off()
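
The rho argument sets the threshold for the noise cluster, labelled "K+1" in the returned group. The base R sketch below illustrates one plausible reading of that threshold: a block is set aside when its RV coefficient with every cluster compromise is below rho. This rule is an assumption made for illustration and is not taken from the package source; the function name assign_with_noise and the RV values are hypothetical.

```r
# Assumed rule (not taken from the package source): a block goes to the
# noise cluster "K+1" when its best RV coefficient is below rho.
assign_with_noise <- function(rv_all, rho = 0) {
  best <- apply(rv_all, 1, which.max)   # closest cluster for each block
  best_rv <- apply(rv_all, 1, max)      # RV with that cluster's compromise
  K <- ncol(rv_all)
  ifelse(best_rv < rho, K + 1L, best)
}

# Hypothetical RV coefficients of 4 blocks with K = 2 cluster compromises.
rv_all <- matrix(c(0.9, 0.2,
                   0.3, 0.8,
                   0.4, 0.35,
                   0.7, 0.6),
                 ncol = 2, byrow = TRUE,
                 dimnames = list(paste0("B", 1:4), c("Cluster1", "Cluster2")))

assign_with_noise(rv_all, rho = 0)     # no block can fall below 0
assign_with_noise(rv_all, rho = 0.5)   # B3 is set aside as cluster 3
```

Under this reading, rho=0 never sets a block aside, which matches the note that the noise cluster can only be populated when rho>0.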

Perform a cluster analysis of rows in a Multi-block context with clustering on STATIS axes

Description

Clustering of rows (products in sensory analysis) in a Multi-block context. The STATIS method is followed by a hierarchical algorithm.

Usage

clustRowsOnStatisAxes(Data, Blocks, NameBlocks=NULL, scale=FALSE,
nclust=NULL, gpmax=6, ncomp=5)

Arguments

Data

data frame or matrix. Corresponds to all the blocks of variables merged horizontally

Blocks

numerical vector. The number of variables of each block. The sum must be equal to the number of columns of Data.

NameBlocks

string vector. Name of each block. Length must be equal to the length of Blocks vector. If NULL, the names are B1,...Bm. Default: NULL

scale

logical. Should the data variables be scaled? Default: FALSE

nclust

numerical. Number of clusters to consider. If NULL, the Hartigan index advice is taken.

gpmax

numerical. What is the maximum number of clusters to consider? Default: 6

ncomp

numerical. Number of axes to consider. Default: 5

Value

  • group: the clustering partition.

  • nbgH: Advised number of clusters per Hartigan index

  • nbgCH: Advised number of clusters per Calinski-Harabasz index

  • cutree_k: the partition obtained by cutting the dendrogram in K clusters

  • dend: The dendrogram

  • param: parameters called

  • type: parameter passed to other functions

References

Paper submitted

See Also

indicesClusters, summary.clusRows, ClusMB

Examples

#####projective mapping####
library(ClustBlock)
data(smoo)
res1=clustRowsOnStatisAxes(smoo, rep(2,24))
summary(res1)
indicesClusters(smoo, rep(2,24), res1$group)

####CATA####
data(fish)
Data=fish[1:66,2:30]
chang2=change_cata_format2(Data, nprod= 6, nattr= 27, nsub = 11, nsess= 1)
res2=clustRowsOnStatisAxes(Data= chang2$Datafinal, Blocks= rep(27, 11))
indicesClusters(Data= chang2$Datafinal, Blocks= rep(27, 11),cut = res2$group, center=FALSE)
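
The second stage described above, a hierarchical clustering of the rows on the first ncomp axes, can be sketched in base R as follows. The coordinates are random stand-ins rather than real STATIS output, and the Euclidean distance and Ward linkage are assumptions made for the illustration; the package may use other settings.

```r
# Stand-in coordinates: 8 products on 5 axes. In practice these would come
# from the STATIS compromise; random toy values are used here.
set.seed(42)
coord <- rbind(matrix(rnorm(20, mean = 0), ncol = 5),
               matrix(rnorm(20, mean = 6), ncol = 5))
rownames(coord) <- paste0("P", 1:8)

ncomp <- 3    # number of axes kept, mirroring the ncomp argument
nclust <- 2   # number of clusters, mirroring the nclust argument

# Assumption: Euclidean distances and Ward linkage.
dend <- hclust(dist(coord[, seq_len(ncomp), drop = FALSE]), method = "ward.D2")
group <- cutree(dend, k = nclust)
group
```

Keeping fewer axes discards the later, usually noisier, dimensions before the distances between rows are computed.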

Test the consistency of each attribute in a CATA experiment

Description

Permutation test on the agreement between subjects for each attribute in a CATA experiment

Usage

consistency_cata(Data,nblo, nperm=100, alpha=0.05, printAttrTest=FALSE)

Arguments

Data

data frame or matrix. Corresponds to all the blocks of variables merged horizontally

nblo

numerical. Number of blocks (subjects).

nperm

numerical. How many permutations are required? Default: 100

alpha

numerical between 0 and 1. What is the threshold? Default: 0.05

printAttrTest

logical. Print the number of remaining attributes to be tested? Default: FALSE

Value

a list with:

  • consist: the consistent attributes

  • no_consist: the inconsistent attributes

  • pval: pvalue for each test

References

Llobell, F., Giacalone, D., Labenne, A., Qannari, E.M. (2019). Assessment of the agreement and cluster analysis of the respondents in a CATA experiment. Food Quality and Preference, 77, 184-190.

See Also

consistency_cata_panel, change_cata_format, change_cata_format2

Examples

data(straw)
#with only 40 subjects
consistency_cata(Data=straw[,1:(16*40)], nblo=40)
#with all subjects
consistency_cata(Data=straw, nblo=114, printAttrTest=TRUE)
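
The logic of a permutation test on subject agreement for a single attribute can be sketched in base R as follows. The agreement statistic used here (variance of the citation proportions across products) and the shuffling scheme are assumptions chosen for illustration; they are not claimed to be the statistic or scheme used by consistency_cata.

```r
# Toy CATA answers for ONE attribute: 6 products (rows) x 20 subjects (cols).
# Every subject checks the attribute for products 1-3 only (full agreement).
answers <- matrix(rep(c(1, 1, 1, 0, 0, 0), times = 20), nrow = 6)

# Assumed agreement statistic: variance of the citation proportions.
agreement <- function(m) var(rowMeans(m))

set.seed(123)
nperm <- 100
obs <- agreement(answers)
# Under the null, each subject's answers are shuffled across products.
perm <- replicate(nperm, agreement(apply(answers, 2, sample)))

# Permutation p-value with the usual +1 correction.
pval <- (1 + sum(perm >= obs)) / (nperm + 1)
pval
```

An attribute would be declared consistent when the p-value falls below alpha. With nperm permutations the smallest attainable p-value is 1/(nperm+1), so nperm limits the resolution of the test.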

Test the consistency of the panel in a CATA experiment

Description

Permutation test on the agreement between subjects in a CATA experiment

Usage

consistency_cata_panel(Data,nblo, nperm=100, alpha=0.05)

Arguments

Data

data frame or matrix. Corresponds to all the blocks of variables merged horizontally

nblo

numerical. Number of blocks (subjects).

nperm

numerical. How many permutations are required? Default: 100

alpha

numerical between 0 and 1. What is the threshold? Default: 0.05

Value

a list with:

  • answer: the answer of the test

  • pval: pvalue of the test

  • dis: distance between the homogeneity and the median of the permutations

References

Llobell, F., Giacalone, D., Labenne, A., Qannari, E.M. (2019). Assessment of the agreement and cluster analysis of the respondents in a CATA experiment. Food Quality and Preference, 77, 184-190.
Bonnet, L., Ferney, T., Riedel, T., Qannari, E.M., Llobell, F. (September 14, 2022). Using CATA for sensory profiling: assessment of the panel performance. Eurosense, Turku, Finland.

See Also

consistency_cata, change_cata_format, change_cata_format2

Examples

data(straw)
#with all subjects
consistency_cata_panel(Data=straw, nblo=114)

fish data

Description

fish data

Usage

data(fish)

Format

CATA data with sessions. A data frame with the sessions, the panelists, the products and CATA attributes.

References

Bonnet, L., Ferney, T., Riedel, T., Qannari, E.M., Llobell, F. (September 14, 2022). Using CATA for sensory profiling: assessment of the panel performance. Eurosense, Turku, Finland.

Examples

data(fish)

Compute the indices to evaluate the quality of the cluster partition in multi-block context

Description

Compute the Il index to evaluate the agreement between each block and the global partition (in sensory: agreement between each subject and the global partition)

Compute the Jl index to evaluate if each block has a partition (in sensory: if each subject made a partition of products)

Usage

indicesClusters(Data, Blocks, cut, NameBlocks=NULL, center=TRUE, scale=FALSE)

Arguments

Data

data frame or matrix. Corresponds to all the blocks of variables merged horizontally

Blocks

numerical vector. The number of variables of each block. The sum must be equal to the number of columns of Data.

cut

numerical vector. The partition of the cluster analysis.

NameBlocks

string vector. Name of each block. Length must be equal to the length of Blocks vector. If NULL, the names are B1,...Bm. Default: NULL

center

logical. Should the data variables be centered? Default: TRUE. Please set to FALSE for a CATA experiment

scale

logical. Should the data variables be scaled? Default: FALSE

Value

  • Il: the Il indices

  • jl: the Jl indices

References

Llobell, F., Qannari, E.M. (June 10, 2022). Cluster analysis in a multi-bloc setting. SMTDA, Athens, Greece.
Llobell, F., Giacalone, D., Qannari, E. M. (Pangborn 2021). Cluster Analysis of products in CATA experiments.
Paper submitted

See Also

clustRowsOnStatisAxes, ClusMB

Examples

#####projective mapping####
library(ClustBlock)
data(smoo)
res1=ClusMB(smoo, rep(2,24))
summary(res1)
indicesClusters(smoo, rep(2,24), res1$group)

####CATA####
data(fish)
Data=fish[1:66,2:30]
chang2=change_cata_format2(Data, nprod= 6, nattr= 27, nsub = 11, nsess= 1)
res2=ClusMB(Data= chang2$Datafinal, Blocks= rep(27, 11), center=FALSE)
indicesClusters(Data= chang2$Datafinal, Blocks= rep(27, 11),cut = res2$group, center=FALSE)

Displays the CATATIS graphs

Description

This function plots the CATATIS map and CATATIS weights

Usage

## S3 method for class 'catatis'
plot(x, Graph=TRUE, Graph_weights=TRUE, Graph_eig=TRUE,
  axes=c(1,2), tit="CATATIS", cex=1, col.obj="blue", col.attr="red", ...)

Arguments

x

object of class 'catatis'

Graph

logical. Show the graphical representation? Default: TRUE

Graph_weights

logical. Should the barplot of the weights be plotted? Default: TRUE

Graph_eig

logical. Should the barplot of the eigenvalues be plotted? Only with Graph=TRUE. Default: TRUE

axes

numerical vector (length 2). Axes to be plotted

tit

string. Title for the graphical representation. Default: 'CATATIS'

cex

numerical. Numeric character expansion factor; multiplied by par("cex") yields the final character size. NULL and NA are equivalent to 1.0.

col.obj

numerical or string. Color of the object points. Default: "blue"

col.attr

numerical or string. Color of the attribute points. Default: "red"

...

further arguments passed to or from other methods

Value

the CATATIS map

See Also

catatis

Examples

data(straw)
res.cat=catatis(straw, nblo=114)
plot(res.cat, Graph_weights=FALSE, axes=c(1,3))

Displays the CLUSCATA graphs

Description

This function plots the dendrogram, the variation of the merging criterion, the weights and the CATATIS map of each cluster

Usage

## S3 method for class 'cluscata'
plot(x, ngroups=NULL, Graph_groups=TRUE, Graph_dend=TRUE,
Graph_bar=FALSE, Graph_weights=FALSE, axes=c(1,2), cex=1,
col.obj="blue", col.attr="red", ...)

Arguments

x

object of class 'cluscata'.

ngroups

number of groups to consider. Ignored for cluscata_kmeans results. Default: recommended number of clusters

Graph_groups

logical. Should each cluster compromise graphical representation be plotted? Default: TRUE

Graph_dend

logical. Should the dendrogram be plotted? Default: TRUE

Graph_bar

logical. Should the barplot of the difference of the criterion and the barplot of the overall homogeneity at each merging step of the hierarchical algorithm be plotted? Also available after consolidation if Noise_cluster=FALSE. Default: FALSE

Graph_weights

logical. Should the barplot of the weights in each cluster be plotted? Default: FALSE

axes

numerical vector (length 2). Axes to be plotted. Default: c(1,2)

cex

numerical. Numeric character expansion factor; multiplied by par("cex") yields the final character size. NULL and NA are equivalent to 1.0.

col.obj

numerical or string. Color of the object points. Default: "blue"

col.attr

numerical or string. Color of the attribute points. Default: "red"

...

further arguments passed to or from other methods

Value

the CLUSCATA graphs

See Also

cluscata, cluscata_kmeans

Examples

data(straw)
 res=cluscata(Data=straw[,1:(16*40)], nblo=40)
 plot(res, ngroups=3, Graph_dend=FALSE)
 plot(res, ngroups=3, Graph_dend=FALSE,Graph_bar=FALSE, Graph_weights=FALSE, axes=c(1,3))

Displays the CLUSTATIS graphs

Description

This function plots the dendrogram, the variation of the merging criterion, the weights and the STATIS map of each cluster

Usage

## S3 method for class 'clustatis'
plot(x, ngroups=NULL, Graph_groups=TRUE, Graph_dend=TRUE,
Graph_bar=FALSE, Graph_weights=FALSE, axes=c(1,2), col=NULL, cex=1, font=1, ...)

Arguments

x

object of class 'clustatis'.

ngroups

number of groups to consider. Ignored for clustatis_kmeans results. Default: recommended number of clusters

Graph_groups

logical. Should each cluster compromise graphical representation be plotted? Default: TRUE

Graph_dend

logical. Should the dendrogram be plotted? Default: TRUE

Graph_bar

logical. Should the barplot of the difference of the criterion and the barplot of the overall homogeneity at each merging step of the hierarchical algorithm be plotted? Also available after consolidation if Noise_cluster=FALSE. Default: FALSE

Graph_weights

logical. Should the barplot of the weights in each cluster be plotted? Default: FALSE

axes

numerical vector (length 2). Axes to be plotted. Default: c(1,2)

col

vector. Color for each object. Default: rainbow(nrow(Data))

cex

numerical. Numeric character expansion factor; multiplied by par("cex") yields the final character size. NULL and NA are equivalent to 1.0.

font

numerical. Integer specifying font to use for text. 1=plain, 2=bold, 3=italic, 4=bold italic, 5=symbol. Default: 1

...

further arguments passed to or from other methods

Value

the CLUSTATIS graphs

See Also

clustatis, clustatis_kmeans

Examples

data(smoo)
 NameBlocks=paste0("S",1:24)
 cl=clustatis(Data=smoo,Blocks=rep(2,24),NameBlocks = NameBlocks)
 plot(cl, ngroups=3, Graph_dend=FALSE)
 plot(cl, ngroups=3,  Graph_dend=FALSE, axes=c(1,3))
 graphics.off()

Displays the STATIS graphs

Description

This function plots the STATIS map and STATIS weights

Usage

## S3 method for class 'statis'
plot(x, axes=c(1,2), Graph_obj=TRUE,
Graph_weights=TRUE, Graph_eig=TRUE, tit="STATIS", col=NULL, cex=1, font=1,
xlim=NULL, ylim=NULL, ...)

Arguments

x

object of class 'statis'

axes

numerical vector (length 2). Axes to be plotted. Default: c(1,2)

Graph_obj

logical. Should the compromise graphical representation be plotted? Default: TRUE

Graph_weights

logical. Should the barplot of the weights be plotted? Default: TRUE

Graph_eig

logical. Should the barplot of the eigenvalues be plotted? Only with Graph_obj=TRUE. Default: TRUE

tit

string. Title for the objects graphical representation. Default: 'STATIS'

col

vector. Color for each object. If NULL, col=rainbow(nrow(Data)). Default: NULL

cex

numerical. Numeric character expansion factor; multiplied by par("cex") yields the final character size. NULL and NA are equivalent to 1.0.

font

numerical. Integer specifying font to use for text. 1=plain, 2=bold, 3=italic, 4=bold italic, 5=symbol. Default: 1

xlim

numerical vector (length 2). Minimum and maximum for x coordinates.

ylim

numerical vector (length 2). Minimum and maximum for y coordinates.

...

further arguments passed to or from other methods

Value

the STATIS graphs

See Also

statis

Examples

data(smoo)
 NameBlocks=paste0("S",1:24)
 st=statis(Data=smoo,Blocks=rep(2,24),NameBlocks = NameBlocks)
 plot(st, axes=c(1,3), Graph_weights=FALSE)

Preprocessing for Free Sorting Data

Description

For Free Sorting Data, this preprocessing is needed.

Usage

preprocess_FreeSort(Data, NameSub=NULL)

Arguments

Data

data frame or matrix. Corresponds to all variables that contain subjects results. Each column corresponds to a subject and gives the groups to which the products (rows) are assigned

NameSub

string vector. Name of each subject. Length must be equal to the number of columns of Data. If NULL, the names are S1,...Sm. Default: NULL

Value

A list with:

  • new_Data: the Data transformed

  • Blocks: the number of groups for each subject

  • NameBlocks: the name of each subject

References

Llobell, F., Cariou, V., Vigneau, E., Labenne, A., & Qannari, E. M. (2018). Analysis and clustering of multiblock datasets by means of the STATIS and CLUSTATIS methods. Application to sensometrics. Food Quality and Preference, in Press.

See Also

clustatis, clustatis_FreeSort

Examples

data(choc)
prepro=preprocess_FreeSort(choc)
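
The sketch below illustrates, in base R and on made-up data, one common way to turn free sorting results into blocks of variables: each subject's group assignments become a 0/1 indicator matrix with one column per group. This is an assumption about the kind of transformation involved, not the verified implementation of preprocess_FreeSort (which may, for example, also center or scale the blocks). The names sorting, indicator_block, new_data and blocks are hypothetical.

```r
# Hypothetical free sorting data: 5 products (rows), 3 subjects (columns).
# Each value is the label of the group in which the subject put the product.
sorting <- data.frame(
  S1 = c(1, 1, 2, 2, 3),
  S2 = c(1, 2, 2, 1, 1),
  S3 = c(1, 2, 3, 4, 4)
)

# Assumed transformation: one 0/1 indicator matrix per subject,
# with one column per group formed by that subject.
indicator_block <- function(assignments) {
  groups <- sort(unique(assignments))
  block <- sapply(groups, function(g) as.integer(assignments == g))
  colnames(block) <- paste0("G", groups)
  block
}

blocks_list <- lapply(sorting, indicator_block)
new_data <- do.call(cbind, blocks_list)   # blocks merged horizontally
blocks <- sapply(blocks_list, ncol)       # number of groups per subject
```

With this layout, the number of columns of new_data equals sum(blocks), which is the shape expected by functions that take Data and Blocks arguments.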

Preprocessing for Just About Right Data

Description

For JAR data, this preprocessing is needed.

Usage

preprocess_JAR(Data,  nprod, nsub, levelsJAR=3, beta=0.1)

Arguments

Data

data frame where the first column contains the assessors, the second the products, and all other columns the JAR attributes coded as numbers (1 to 3 or 1 to 5, see levelsJAR)

nprod

integer. Number of products.

nsub

integer. Number of subjects.

levelsJAR

integer. Number of JAR levels: 3 or 5. If 5, the data are transformed into 3 levels. Default: 3

beta

numerical. Parameter for agreement between JAR and other answers. Between 0 and 0.5. Default: 0.1

Value

A list with:

  • Datafinal: the Data transformed

  • NameSub: the name of each subject in the right order

References

Llobell, F., Vigneau, E. & Qannari, E. M. (September 14, 2022). Multivariate data analysis and clustering of subjects in a Just about right task. Eurosense, Turku, Finland.

See Also

catatis_jar, cluscata_jar, cluscata_kmeans_jar

Examples

data(cheese)
prepro=preprocess_JAR(cheese, nprod=8, nsub=72, levelsJAR=5)
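
As an illustration of the expected input layout and of the reduction from 5 to 3 levels, here is a minimal base R sketch on made-up data. The recoding rule (1-2 to 1, 3 to 2, 4-5 to 3) and the complete design (one row per assessor and product) are assumptions; the manual does not state how preprocess_JAR performs the transformation. The names jar, to_three_levels and jar3 are hypothetical.

```r
# Hypothetical 5-level JAR ratings: 2 assessors x 3 products, 2 attributes.
# Column order follows the documented layout: assessors, products, attributes.
jar <- data.frame(
  Assessor = rep(c("A1", "A2"), each = 3),
  Product  = rep(c("P1", "P2", "P3"), times = 2),
  Sweet    = c(1, 3, 5, 2, 3, 4),
  Salty    = c(3, 3, 2, 4, 5, 1)
)
nsub <- 2
nprod <- 3
stopifnot(nrow(jar) == nsub * nprod)   # assumes a complete design

# Assumed recoding: 1-2 -> 1 (not enough), 3 -> 2 (JAR), 4-5 -> 3 (too much)
to_three_levels <- function(x) cut(x, breaks = c(0, 2, 3, 5), labels = FALSE)

jar3 <- jar
jar3[, -(1:2)] <- lapply(jar[, -(1:2)], to_three_levels)
```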

Print the CATATIS results

Description

Print the CATATIS results

Usage

## S3 method for class 'catatis'
print(x, ...)

Arguments

x

object of class 'catatis'

...

further arguments passed to or from other methods

See Also

catatis


Print the CLUSCATA results

Description

Print the CLUSCATA results

Usage

## S3 method for class 'cluscata'
print(x, ...)

Arguments

x

object of class 'cluscata'

...

further arguments passed to or from other methods

See Also

cluscata, cluscata_kmeans


Print the ClusMB or clustering on STATIS axes results

Description

Print the ClusMB or clustering on STATIS axes results

Usage

## S3 method for class 'clusRows'
print(x, ...)

Arguments

x

object of class 'clusRows'

...

further arguments passed to or from other methods

See Also

ClusMB, clustRowsOnStatisAxes


Print the CLUSTATIS results

Description

Print the CLUSTATIS results

Usage

## S3 method for class 'clustatis'
print(x, ...)

Arguments

x

object of class 'clustatis'

...

further arguments passed to or from other methods

See Also

clustatis, clustatis_kmeans


Print the STATIS results

Description

Print the STATIS results

Usage

## S3 method for class 'statis'
print(x, ...)

Arguments

x

object of class 'statis'

...

further arguments passed to or from other methods

See Also

statis


RATA data on chocolates

Description

RATA data on chocolates

Usage

data(RATAchoc)

Format

RATA data with sessions. A data frame with 3 sessions, 9 panelists, 12 products and 27 RATA attributes.

References

Pangborn 2023

Examples

data(RATAchoc)

Testing the difference in perception between two predetermined groups of subjects in a CATA experiment

Description

Test adapted to CATA data to determine whether two predetermined groups of subjects have a different perception or not. For example, men and women.

Usage

simil_groups_cata(Data, groups, one=1, two=2, nperm=50, Graph=TRUE,
  alpha= 0.05, printl=FALSE)

Arguments

Data

data frame or matrix. Corresponds to all the blocks of variables merged horizontally

groups

categorical vector. The group of each subject. The length must be equal to the number of subjects.

one

string. Name of group 1 in the groups vector. Default: 1

two

string. Name of group 2 in the groups vector. Default: 2

nperm

numerical. How many permutations are required? Default: 50

Graph

logical. Should the CATATIS graph of each group be plotted? Default: TRUE

alpha

numerical between 0 and 1. What is the threshold of the test? Default: 0.05

printl

logical. Print the number of remaining permutations during the algorithm? Default: FALSE

Value

a list with:

  • decision: the decision of the test

  • pval: p-value of the test

References

Llobell, F., Giacalone, D., Jaeger, S.R. & Qannari, E. M. (2021). CATA data: Are there differences in perception? JSM conference.
Llobell, F., Giacalone, D., Jaeger, S.R. & Qannari, E. M. (2021). CATA data: Are there differences in perception? AgroStat conference.

Examples

data(straw)
 groups=sample(1:2, 114, replace=TRUE)
 simil_groups_cata(straw, groups, one=1, two=2)
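
To make the role of nperm and alpha concrete, the following base R sketch runs a generic permutation test on made-up CATA data. The test statistic (squared distance between the two group mean tables) is a stand-in chosen for illustration; the statistic actually used by simil_groups_cata is not described on this page. All names (cata, group_distance, observed, permuted) are hypothetical.

```r
set.seed(1)
n_sub <- 20
n_prod <- 4
n_attr <- 3

# Hypothetical CATA array: product x attribute x subject, entries 0/1
cata <- array(rbinom(n_prod * n_attr * n_sub, 1, 0.4),
              dim = c(n_prod, n_attr, n_sub))
groups <- rep(1:2, each = n_sub / 2)

# Stand-in statistic: squared distance between the two group mean tables
group_distance <- function(x, g) {
  m1 <- apply(x[, , g == 1, drop = FALSE], c(1, 2), mean)
  m2 <- apply(x[, , g == 2, drop = FALSE], c(1, 2), mean)
  sum((m1 - m2)^2)
}

observed <- group_distance(cata, groups)

# Reference distribution: recompute the statistic with shuffled group labels
nperm <- 50
permuted <- replicate(nperm, group_distance(cata, sample(groups)))

# Permutation p-value (the observed value counts as one permutation)
pval <- (1 + sum(permuted >= observed)) / (nperm + 1)
alpha <- 0.05
decision <- if (pval < alpha) "groups differ" else "no evidence of a difference"
```

With nperm=50 the smallest attainable p-value under this formula is 1/51, so a larger nperm gives a finer resolution around the alpha threshold.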

smoothies data

Description

smoothies data

Usage

data(smoo)

Format

Projective mapping (or Napping) data. A data frame with 8 rows (the number of smoothies) and 48 columns (the number of consumers * 2). For each consumer, we have the coordinates of the products on the sheet of paper.

References

Francois Husson, Sebastien Le and Marine Cadoret (2017). SensoMineR: Sensory Data Analysis. R package version 1.23. https://CRAN.R-project.org/package=SensoMineR

Examples

data(smoo)
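
The layout described in Format (two coordinate columns per consumer) is what the Blocks=rep(2,24) argument encodes in the examples of this manual. The base R sketch below, on a made-up matrix of the same shape, shows how a Blocks vector maps to column ranges; it assumes the columns of each consumer are contiguous, and the names napping, block_id and configs are hypothetical.

```r
set.seed(3)
n_prod <- 8
n_cons <- 24

# Made-up stand-in with the documented shape: 8 products x (24 consumers * 2)
napping <- matrix(runif(n_prod * 2 * n_cons, 0, 60), nrow = n_prod)

Blocks <- rep(2, n_cons)                 # number of variables per block
NameBlocks <- paste0("S", 1:n_cons)
stopifnot(sum(Blocks) == ncol(napping))  # requirement stated for Blocks

# Block membership of every column, then one configuration per consumer
block_id <- rep(seq_along(Blocks), times = Blocks)
configs <- lapply(split(seq_len(ncol(napping)), block_id),
                  function(cols) napping[, cols, drop = FALSE])
names(configs) <- NameBlocks
```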

Performs the STATIS method on different blocks of quantitative variables

Description

STATIS method on quantitative blocks. Supplementary outputs are also computed.

Usage

statis(Data,Blocks,NameBlocks=NULL,Graph_obj=TRUE, Graph_weights=TRUE, scale=FALSE)

Arguments

Data

data frame or matrix. Corresponds to all the blocks of variables merged horizontally

Blocks

numerical vector. The number of variables of each block. The sum must be equal to the number of columns of Data

NameBlocks

string vector. Name of each block. Length must be equal to the length of Blocks vector. If NULL, the names are B1,...Bm. Default: NULL

Graph_obj

logical. Show the graphical representation of the objects? Default: TRUE

Graph_weights

logical. Should the barplot of the weights be plotted? Default: TRUE

scale

logical. Should the data variables be scaled? Default: FALSE

Value

a list with:

  • RV: the RV matrix: a matrix with the RV coefficient between blocks of variables

  • compromise: a matrix which is the compromise of the blocks (akin to a weighted average)

  • weights: the weights associated with the blocks to build the compromise

  • lambda: the first eigenvalue of the RV matrix

  • overall error: the error for the STATIS criterion

  • error_by_conf: the error by configuration (STATIS criterion)

  • rv_with_compromise: the RV coefficient of each block with the compromise

  • homogeneity: homogeneity of the blocks (in percentage)

  • coord: the coordinates of each object

  • eigenvalues: the eigenvalues of the svd decomposition

  • inertia: the percentage of total variance explained by each axis

  • error_by_obj: the error by object (STATIS criterion)

  • scalefactors: the scaling factors of each block

  • proj_config: the projection of each object of each configuration on the axes: presentation by configuration

  • proj_objects: the projection of each object of each configuration on the axes: presentation by object

References

  • Lavit, C., Escoufier, Y., Sabatier, R., Traissac, P. (1994). The ACT (STATIS method). Computational Statistics & Data Analysis, 18(1), 97-119.

  • Llobell, F., Cariou, V., Vigneau, E., Labenne, A., & Qannari, E. M. (2018). Analysis and clustering of multiblock datasets by means of the STATIS and CLUSTATIS methods. Application to sensometrics. Food Quality and Preference, in Press.

See Also

plot.statis, clustatis

Examples

data(smoo)
 NameBlocks=paste0("S",1:24)
 st=statis(Data=smoo, Blocks=rep(2,24),NameBlocks = NameBlocks)
 #plot(st, axes=c(1,3))
 summary(st)
 #with variables scaling
 st2=statis(Data=smoo, Blocks=rep(2,24),NameBlocks = NameBlocks, Graph_weights=FALSE, scale=TRUE)
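
For readers who want to see what the RV, lambda and weights outputs refer to, here is a minimal base R sketch on made-up blocks. It uses the standard definition of the RV coefficient between centered blocks and takes the first eigenvector of the RV matrix as unnormalized weights. The normalization and any block scaling applied by statis are not reproduced here, so the numbers are not expected to match the package output; rv_coef and block_list are hypothetical names.

```r
set.seed(42)
n_obj <- 8

# Hypothetical blocks: 4 configurations of 8 objects in 2 dimensions
block_list <- replicate(4, matrix(rnorm(n_obj * 2), nrow = n_obj),
                        simplify = FALSE)

# RV coefficient: normalized inner product of the centered cross-product matrices
rv_coef <- function(x, y) {
  wx <- tcrossprod(scale(x, scale = FALSE))
  wy <- tcrossprod(scale(y, scale = FALSE))
  sum(wx * wy) / sqrt(sum(wx^2) * sum(wy^2))
}

# RV matrix between all pairs of blocks (diagonal is 1 by definition)
m <- length(block_list)
rv <- matrix(1, m, m)
for (i in seq_len(m - 1)) {
  for (j in (i + 1):m) {
    rv[i, j] <- rv[j, i] <- rv_coef(block_list[[i]], block_list[[j]])
  }
}

eig <- eigen(rv, symmetric = TRUE)
lambda <- eig$values[1]            # first eigenvalue of the RV matrix
weights <- abs(eig$vectors[, 1])   # the sign of an eigenvector is arbitrary
```

A block that agrees well with the others has large RV coefficients and therefore a large entry in the first eigenvector, which is why such blocks contribute more to the compromise.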

Performs the STATIS method on Free Sorting data

Description

STATIS method on Free Sorting data. Many supplementary outputs are also computed.

Usage

statis_FreeSort(Data, NameSub=NULL, Graph_obj=TRUE, Graph_weights=TRUE)

Arguments

Data

data frame or matrix. Corresponds to all variables that contain subjects results. Each column corresponds to a subject and gives the groups to which the products (rows) are assigned

NameSub

string vector. Name of each subject. Length must be equal to the number of columns of Data. If NULL, the names are S1,...Sm. Default: NULL

Graph_obj

logical. Show the graphical representation of the objects? Default: TRUE

Graph_weights

logical. Should the barplot of the weights be plotted? Default: TRUE

Value

a list with:

  • RV: the RV matrix: a matrix with the RV coefficient between subjects

  • compromise: a matrix which is the compromise of the subjects (akin to a weighted average)

  • weights: the weights associated with the subjects to build the compromise

  • lambda: the first eigenvalue of the RV matrix

  • overall error: the error for the STATIS criterion

  • error_by_conf: the error by configuration (STATIS criterion)

  • rv_with_compromise: the RV coefficient of each subject with the compromise

  • homogeneity: homogeneity of the subjects (in percentage)

  • coord: the coordinates of each object

  • eigenvalues: the eigenvalues of the svd decomposition

  • inertia: the percentage of total variance explained by each axis

  • error_by_obj: the error by object (STATIS criterion)

  • scalefactors: the scaling factors of each subject

  • proj_config: the projection of each object of each subject on the axes: presentation by subject

  • proj_objects: the projection of each object of each subject on the axes: presentation by object

References

  • Lavit, C., Escoufier, Y., Sabatier, R., Traissac, P. (1994). The ACT (STATIS method). Computational Statistics & Data Analysis, 18(1), 97-119.

  • Llobell, F., Cariou, V., Vigneau, E., Labenne, A., & Qannari, E. M. (2018). Analysis and clustering of multiblock datasets by means of the STATIS and CLUSTATIS methods. Application to sensometrics. Food Quality and Preference, in Press.

See Also

preprocess_FreeSort, clustatis_FreeSort

Examples

data(choc)
res.sta=statis_FreeSort(choc)

strawberries data

Description

strawberries data

Usage

data(straw)

Format

CATA data. A data frame with 6 rows (the number of strawberries) and 1824 columns (the number of consumers (114) * the number of attributes (16)). For each consumer, each attribute and each product, the value is 1 if the consumer checked the attribute for the product, and 0 if not.

References

Ares, G., & Jaeger, S. R. (2013). Check-all-that-apply questions: Influence of attribute order on sensory product characterization. Food Quality and Preference, 28(1), 141-153.

Examples

data(straw)
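
The wide layout described in Format can be handled with plain column arithmetic. The base R sketch below uses a made-up 0/1 matrix with the documented dimensions and assumes that the 16 attribute columns of each consumer are stored contiguously, consumer by consumer (which is what the subsetting straw[,1:(16*40)] with nblo=40 in this manual's examples suggests). The names cata, consumer_block and counts are hypothetical.

```r
set.seed(7)
n_prod <- 6
n_attr <- 16
n_cons <- 114

# Made-up stand-in with the documented shape: 6 products x (114 * 16) columns
cata <- matrix(rbinom(n_prod * n_attr * n_cons, 1, 0.3), nrow = n_prod)

# Columns of consumer k, assuming blocks are stored consumer by consumer
consumer_block <- function(x, k, n_attr) {
  x[, ((k - 1) * n_attr + 1):(k * n_attr), drop = FALSE]
}
block_3 <- consumer_block(cata, 3, n_attr)

# Data for the first 40 consumers only
first_40 <- cata[, 1:(n_attr * 40)]

# Same data as a product x attribute x consumer array, then citation counts:
# how many consumers checked each attribute for each product
arr <- array(cata, dim = c(n_prod, n_attr, n_cons))
counts <- apply(arr, c(1, 2), sum)
```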

Show the CATATIS results

Description

This function shows the CATATIS results

Usage

## S3 method for class 'catatis'
summary(object, ...)

Arguments

object

object of class 'catatis'.

...

further arguments passed to or from other methods

Value

a list with:

  • homogeneity: homogeneity of the subjects (in percentage)

  • weights: the weights associated with the subjects to build the compromise

  • eigenvalues: the eigenvalues associated with the correspondence analysis

  • inertia: the percentage of total variance explained by each axis of the CA

See Also

catatis


Show the CLUSCATA results

Description

This function shows the cluscata results

Usage

## S3 method for class 'cluscata'
summary(object, ngroups=NULL, ...)

Arguments

object

object of class 'cluscata'.

ngroups

number of groups to consider. Ignored for cluscata_kmeans results. Default: recommended number of clusters

...

further arguments passed to or from other methods

Value

the CLUSCATA principal results

a list with:

  • group: the clustering partition

  • homogeneity: homogeneity index (

  • weights: weight associated with each subject in its cluster

  • rho: the threshold for the noise cluster

  • test_one_cluster: decision and p-value of the test of whether there is more than one cluster

See Also

cluscata, cluscata_kmeans


Show the ClusMB or clustering on STATIS axes results

Description

This function shows the ClusMB or clustering on STATIS axes results

Usage

## S3 method for class 'clusRows'
summary(object, ...)

Arguments

object

object of class 'clusRows'.

...

further arguments passed to or from other methods

Value

a list with:

  • groups: clustering partition

  • nbClustRetained: the number of clusters retained

  • nbgH: number of clusters advised by the Hartigan index

  • nbgCH: number of clusters advised by the Calinski-Harabasz index
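
As background for nbgCH, the sketch below computes the Calinski-Harabasz index in base R for several numbers of clusters and keeps the value of k that maximizes it. The data are made up, and k-means stands in for the clustering step; how ClusMB and clustRowsOnStatisAxes build the partitions and apply the index is not reproduced here.

```r
set.seed(123)

# Made-up data: 3 groups of 20 rows in 2 dimensions
x <- rbind(matrix(rnorm(40, mean = 0), ncol = 2),
           matrix(rnorm(40, mean = 6), ncol = 2),
           matrix(rnorm(40, mean = 12), ncol = 2))
n <- nrow(x)

# Candidate numbers of clusters; k-means is a stand-in clustering method
ks <- 2:5
fits <- lapply(ks, function(k) kmeans(x, centers = k, nstart = 20))

# Calinski-Harabasz index: (between SS / (k - 1)) / (within SS / (n - k))
ch <- sapply(seq_along(ks), function(i) {
  k <- ks[i]
  (fits[[i]]$betweenss / (k - 1)) / (fits[[i]]$tot.withinss / (n - k))
})

best_k <- ks[which.max(ch)]
```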

See Also

ClusMB, clustRowsOnStatisAxes


Show the CLUSTATIS results

Description

This function shows the clustatis results

Usage

## S3 method for class 'clustatis'
summary(object, ngroups=NULL, ...)

Arguments

object

object of class 'clustatis'.

ngroups

number of groups to consider. Ignored for clustatis_kmeans results. Default: recommended number of clusters

...

further arguments passed to or from other methods

Value

the CLUSTATIS principal results

a list with:

  • group: the clustering partition

  • homogeneity: homogeneity index (

  • weights: weight associated with each block in its cluster

  • rho: the threshold for the noise cluster

  • test_one_cluster: decision and p-value of the test of whether there is more than one cluster

See Also

clustatis, clustatis_kmeans


Show the STATIS results

Description

This function shows the STATIS results

Usage

## S3 method for class 'statis'
summary(object, ...)

Arguments

object

object of class 'statis'.

...

further arguments passed to or from other methods

Value

a list with:

  • homogeneity: homogeneity of the blocks (in percentage)

  • weights: the weights associated with the blocks to build the compromise

  • eigenvalues: the eigenvalues of the svd decomposition

  • inertia: the percentage of total variance explained by each axis
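
The relation between the eigenvalues and inertia outputs can be sketched as follows, assuming (as is usual for this kind of decomposition) that the inertia of an axis is its eigenvalue expressed as a percentage of the sum of all eigenvalues. The eigenvalues below are made up.

```r
# Made-up eigenvalues, in decreasing order
eigenvalues <- c(4.2, 2.1, 0.9, 0.3)

# Percentage of total variance explained by each axis
inertia <- 100 * eigenvalues / sum(eigenvalues)

# Cumulative percentage, useful when choosing which axes to plot
cumulative <- cumsum(inertia)
```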

See Also

statis