Package 'simuclustfactor'

Title: Simultaneous Clustering and Factorial Decomposition of Three-Way Datasets
Description: Implements two iterative techniques, T3Clus and 3FKMeans, that simultaneously cluster the objects and perform a factorial dimensionality reduction of the variables and occasions of three-mode datasets, as developed by Vichi et al. (2007) <doi:10.1007/s00357-007-0006-x>. It also provides CT3Clus, a convex combination of these two simultaneous procedures controlled by a hyperparameter alpha (alpha in [0,1], with 3FKMeans for alpha=0 and T3Clus for alpha=1), also developed by Vichi et al. (2007) <doi:10.1007/s00357-007-0006-x>. In addition, the package implements the traditional tandem counterparts of T3Clus (TWCFTA) and 3FKMeans (TWFCTA): sequential clustering followed by factorial decomposition (TWCFTA), and vice versa (TWFCTA), as proposed by P. Arabie and L. Hubert (1996) <doi:10.1007/978-3-642-79999-0_1>.
Authors: Prosper Ablordeppey [aut, cre] , Adelaide Freitas [ctb] , Giorgia Zaccaria [ctb]
Maintainer: Prosper Ablordeppey <[email protected]>
License: GPL-3
Version: 0.0.3
Built: 2024-12-12 06:49:15 UTC
Source: CRAN


Simultaneous results attributes

Description

Simultaneous results attributes

Slots

U_i_g0

matrix. Initial object membership function matrix

B_j_q0

matrix. Initial factor/component matrix for the variables

C_k_r0

matrix. Initial factor/component matrix for the occasions

U_i_g

matrix. Final/updated object membership function matrix

B_j_q

matrix. Final/updated factor/component matrix for the variables

C_k_r

matrix. Final/updated factor/component matrix for the occasions

Y_g_qr

matrix. Derived centroids in the reduced space (data matrix)

X_i_jk_scaled

matrix. Standardized dataset matrix

BestTimeElapsed

numeric. Execution time for the best iterate

BestLoop

numeric. Loop that obtained the best iterate

BestIteration

numeric. Iteration yielding the best results

Converged

numeric. Flag to check if algorithm converged for the K-means

nConverges

numeric. Number of loops that converged for the K-means

TSS_full

numeric. Total deviance in the full-space

BSS_full

numeric. Between deviance in the full-space

RSS_full

numeric. Residual deviance in the full-space

PF_full

numeric. PseudoF in the full-space

TSS_reduced

numeric. Total deviance in the reduced-space

BSS_reduced

numeric. Between deviance in the reduced-space

RSS_reduced

numeric. Residual deviance in the reduced-space

PF_reduced

numeric. PseudoF in the reduced-space

PF

numeric. Weighted PseudoF score

Labels

integer. Object cluster assignments

Fs

numeric. Objective function values for the KM best iterate

Enorm

numeric. Average L2 norm of the residuals.


Tandem results attributes

Description

Tandem results attributes

Slots

U_i_g0

matrix. Initial object membership function matrix.

B_j_q0

matrix. Initial factor/component matrix for the variables.

C_k_r0

matrix. Initial factor/component matrix for the occasions.

U_i_g

matrix. Final/updated object membership function matrix.

B_j_q

matrix. Final/updated factor/component matrix for the variables.

C_k_r

matrix. Final/updated factor/component matrix for the occasions.

Y_g_qr

matrix. Derived centroids in the reduced space (data matrix).

X_i_jk_scaled

matrix. Standardized dataset matrix.

BestTimeElapsed

numeric. Execution time for the best iterate.

BestLoop

numeric. Loop that obtained the best iterate.

BestKmIteration

numeric. Number of iterations until the best iterate for the K-means.

BestFaIteration

numeric. Number of iterations until the best iterate for the FA.

FaConverged

numeric. Flag to check if the algorithm converged for the factor decomposition.

KmConverged

numeric. Flag to check if the algorithm converged for the K-means.

nKmConverges

numeric. Number of loops that converged for the K-means.

nFaConverges

numeric. Number of loops that converged for the Factor decomposition.

TSS_full

numeric. Total deviance in the full-space.

BSS_full

numeric. Between deviance in the full-space.

RSS_full

numeric. Residual deviance in the full-space.

PF_full

numeric. PseudoF in the full-space.

TSS_reduced

numeric. Total deviance in the reduced-space.

BSS_reduced

numeric. Between deviance in the reduced-space.

RSS_reduced

numeric. Residual deviance in the reduced-space.

PF_reduced

numeric. PseudoF in the reduced-space.

PF

numeric. Actual PseudoF value to obtain best loop.

Labels

integer. Object cluster assignments.

FsKM

numeric. Objective function values for the KM best iterate.

FsFA

numeric. Objective function values for the FA best iterate.

Enorm

numeric. Average L2 norm of the residuals.


3FKMeans Model

Description

Implements the simultaneous version of TWFCTA.

Usage

fit.3fkmeans(model, X_i_jk, full_tensor_shape, reduced_tensor_shape)

## S4 method for signature 'simultaneous'
fit.3fkmeans(model, X_i_jk, full_tensor_shape, reduced_tensor_shape)

Arguments

model

Initialized simultaneous model.

X_i_jk

Matricized tensor along mode-1 (I objects).

full_tensor_shape

Dimensions of the tensor in full-space.

reduced_tensor_shape

Dimensions of tensor in the reduced-space.

Details

The procedure carries out the two steps of the sequential TWFCTA model (factorial reduction and clustering) simultaneously. The model finds B_j_q and C_k_r such that the within-clusters deviance of the component scores is minimized.

Value

Output attributes accessible via the '@' operator.

  • U_i_g0 - Initial object membership function matrix

  • B_j_q0 - Initial factor/component matrix for the variables

  • C_k_r0 - Initial factor/component matrix for the occasions

  • U_i_g - Final/updated object membership function matrix

  • B_j_q - Final/updated factor/component matrix for the variables

  • C_k_r - Final/updated factor/component matrix for the occasions

  • Y_g_qr - Derived centroids in the reduced space (data matrix)

  • X_i_jk_scaled - Standardized dataset matrix

  • BestTimeElapsed - Execution time for the best iterate

  • BestLoop - Loop that obtained the best iterate

  • BestIteration - Iteration yielding the best results

  • Converged - Flag to check if algorithm converged for the K-means

  • nConverges - Number of loops that converged for the K-means

  • TSS_full - Total deviance in the full-space

  • BSS_full - Between deviance in the full-space

  • RSS_full - Residual deviance in the full-space

  • PF_full - PseudoF in the full-space

  • TSS_reduced - Total deviance in the reduced-space

  • BSS_reduced - Between deviance in the reduced-space

  • RSS_reduced - Residual deviance in the reduced-space

  • PF_reduced - PseudoF in the reduced-space

  • PF - Weighted PseudoF score

  • Labels - Object cluster assignments

  • Fs - Objective function values for the KM best iterate

  • Enorm - Average L2 norm of the residuals.
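As a minimal sketch of how slots are read with the '@' operator, the following defines a hypothetical stand-in class (`toy_result`, with made-up values) instead of calling the package. Only the slot names `Labels` and `PF` are taken from the list above; everything else is an assumption for illustration.

```r
library(methods)

# Hypothetical stand-in for a fitted result; the real object carries
# all the slots listed above, not just these two.
setClass("toy_result", slots = c(Labels = "integer", PF = "numeric"))
res <- new("toy_result", Labels = c(1L, 3L, 2L, 1L), PF = 4.2)

res@Labels               # cluster assignment of each object
sizes <- table(res@Labels)  # number of objects per cluster
slotNames(res)           # names of all available slots
```

The same pattern should apply to an object returned by one of the fit functions, for example reading `Labels` or `PF` from it.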

References

Tucker L (1966). “Some mathematical notes on three-mode factor analysis.” Psychometrika, 31(3), 279-311. doi:10.1007/BF02289464, https://ideas.repec.org/a/spr/psycho/v31y1966i3p279-311.html.

Vichi M, Kiers HAL (2001). “Factorial k-means analysis for two-way data.” Computational Statistics and Data Analysis, 37(1), 49-64. https://EconPapers.repec.org/RePEc:eee:csdana:v:37:y:2001:i:1:p:49-64.

Vichi M, Rocci R, Kiers H (2007). “Simultaneous Component and Clustering Models for Three-way Data: Within and Between Approaches.” Journal of Classification, 24, 71-98. doi:10.1007/s00357-007-0006-x.

Examples

X_i_jk = generate_dataset()$X_i_jk
model = simultaneous()
tfkmeans = fit.3fkmeans(model, X_i_jk, c(8,5,4), c(3,3,2))

CT3Clus Model

Description

Implements a convex combination of the simultaneous T3Clus and 3FKMeans models, weighted by an alpha value between 0 and 1 inclusive.

Usage

fit.ct3clus(
  model,
  X_i_jk,
  full_tensor_shape,
  reduced_tensor_shape,
  alpha = 0.5
)

## S4 method for signature 'simultaneous'
fit.ct3clus(
  model,
  X_i_jk,
  full_tensor_shape,
  reduced_tensor_shape,
  alpha = 0.5
)

Arguments

model

Initialized simultaneous model.

X_i_jk

Matricized tensor along mode-1 (I objects).

full_tensor_shape

Dimensions of the tensor in full space.

reduced_tensor_shape

Dimensions of tensor in the reduced space.

alpha

Hyperparameter with 0 <= alpha <= 1. The model is T3Clus when alpha=1 and 3FKMeans when alpha=0.

Value

Output attributes accessible via the '@' operator.

  • U_i_g0 - Initial object membership function matrix

  • B_j_q0 - Initial factor/component matrix for the variables

  • C_k_r0 - Initial factor/component matrix for the occasions

  • U_i_g - Final/updated object membership function matrix

  • B_j_q - Final/updated factor/component matrix for the variables

  • C_k_r - Final/updated factor/component matrix for the occasions

  • Y_g_qr - Derived centroids in the reduced space (data matrix)

  • X_i_jk_scaled - Standardized dataset matrix

  • BestTimeElapsed - Execution time for the best iterate

  • BestLoop - Loop that obtained the best iterate

  • BestIteration - Iteration yielding the best results

  • Converged - Flag to check if algorithm converged for the K-means

  • nConverges - Number of loops that converged for the K-means

  • TSS_full - Total deviance in the full-space

  • BSS_full - Between deviance in the full-space

  • RSS_full - Residual deviance in the full-space

  • PF_full - PseudoF in the full-space

  • TSS_reduced - Total deviance in the reduced-space

  • BSS_reduced - Between deviance in the reduced-space

  • RSS_reduced - Residual deviance in the reduced-space

  • PF_reduced - PseudoF in the reduced-space

  • PF - Weighted PseudoF score

  • Labels - Object cluster assignments

  • Fs - Objective function values for the KM best iterate

  • Enorm - Average L2 norm of the residuals.

References

Tucker L (1966). “Some mathematical notes on three-mode factor analysis.” Psychometrika, 31(3), 279-311. doi:10.1007/BF02289464, https://ideas.repec.org/a/spr/psycho/v31y1966i3p279-311.html.

Rocci R, Vichi M (2005). “Three-Mode Component Analysis with Crisp or Fuzzy Partition of Units.” Psychometrika, 70, 715-736. doi:10.1007/s11336-001-0926-z.

Vichi M, Kiers HAL (2001). “Factorial k-means analysis for two-way data.” Computational Statistics and Data Analysis, 37(1), 49-64. https://EconPapers.repec.org/RePEc:eee:csdana:v:37:y:2001:i:1:p:49-64.

Vichi M, Rocci R, Kiers H (2007). “Simultaneous Component and Clustering Models for Three-way Data: Within and Between Approaches.” Journal of Classification, 24, 71-98. doi:10.1007/s00357-007-0006-x.

See Also

fit.t3clus fit.3fkmeans simultaneous

Examples

X_i_jk = generate_dataset()$X_i_jk
model = simultaneous()
ct3clus = fit.ct3clus(model, X_i_jk, c(8,5,4), c(3,3,2), alpha=0.5)
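To make the role of alpha concrete, here is a base-R sketch of a convex combination of two loss values. It assumes, as is commonly stated for these models, that the T3Clus loss is the full-space residual ||X - U Y (C ⊗ B)'||^2 and the 3FKMeans loss is the reduced-space residual ||X (C ⊗ B) - U Y||^2. The function name `ct3clus_loss` and this exact weighting are assumptions for illustration, not the package's internals.

```r
# Hypothetical helper: alpha-weighted mix of the two (assumed) loss functions.
ct3clus_loss <- function(X, U, B, C, alpha) {
  CB <- kronecker(C, B)                               # (J*K) x (Q*R) weights
  scores <- X %*% CB                                  # component scores, I x (Q*R)
  Ybar <- solve(crossprod(U), crossprod(U, scores))   # cluster centroids of the scores
  loss_t3clus <- sum((X - U %*% Ybar %*% t(CB))^2)    # full-space residual
  loss_3fkm <- sum((scores - U %*% Ybar)^2)           # reduced-space residual
  alpha * loss_t3clus + (1 - alpha) * loss_3fkm
}

set.seed(3)
I <- 8; J <- 5; K <- 4; G <- 3; Q <- 3; R <- 2
X <- matrix(rnorm(I * J * K), nrow = I)
U <- diag(G)[c(1, 2, 3, 1, 2, 3, 1, 2), ]             # fixed toy partition
B <- qr.Q(qr(matrix(rnorm(J * Q), J, Q)))             # column-orthonormal J x Q
C <- qr.Q(qr(matrix(rnorm(K * R), K, R)))             # column-orthonormal K x R

l0 <- ct3clus_loss(X, U, B, C, alpha = 0)             # pure 3FKMeans loss
l1 <- ct3clus_loss(X, U, B, C, alpha = 1)             # pure T3Clus loss
lh <- ct3clus_loss(X, U, B, C, alpha = 0.5)           # midpoint of the two
```

With column-orthonormal B and C the full-space residual is never smaller than the reduced-space one, so the mixed loss in this sketch grows with alpha.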

T3Clus Model

Description

Implements the simultaneous version of TWCFTA.

Usage

fit.t3clus(model, X_i_jk, full_tensor_shape, reduced_tensor_shape)

## S4 method for signature 'simultaneous'
fit.t3clus(model, X_i_jk, full_tensor_shape, reduced_tensor_shape)

Arguments

model

Initialized simultaneous model.

X_i_jk

Matricized tensor along mode-1 (I objects).

full_tensor_shape

Dimensions of the tensor in full-space.

reduced_tensor_shape

Dimensions of tensor in the reduced-space.

Details

The procedure carries out the two steps of the sequential TWCFTA model (clustering and factorial reduction) simultaneously. The model finds B_j_q and C_k_r such that the between-clusters deviance of the component scores is maximized.

Value

Output attributes accessible via the '@' operator.

  • U_i_g0 - Initial object membership function matrix

  • B_j_q0 - Initial factor/component matrix for the variables

  • C_k_r0 - Initial factor/component matrix for the occasions

  • U_i_g - Final/updated object membership function matrix

  • B_j_q - Final/updated factor/component matrix for the variables

  • C_k_r - Final/updated factor/component matrix for the occasions

  • Y_g_qr - Derived centroids in the reduced space (data matrix)

  • X_i_jk_scaled - Standardized dataset matrix

  • BestTimeElapsed - Execution time for the best iterate

  • BestLoop - Loop that obtained the best iterate

  • BestIteration - Iteration yielding the best results

  • Converged - Flag to check if algorithm converged for the K-means

  • nConverges - Number of loops that converged for the K-means

  • TSS_full - Total deviance in the full-space

  • BSS_full - Between deviance in the full-space

  • RSS_full - Residual deviance in the full-space

  • PF_full - PseudoF in the full-space

  • TSS_reduced - Total deviance in the reduced-space

  • BSS_reduced - Between deviance in the reduced-space

  • RSS_reduced - Residual deviance in the reduced-space

  • PF_reduced - PseudoF in the reduced-space

  • PF - Weighted PseudoF score

  • Labels - Object cluster assignments

  • Fs - Objective function values for the KM best iterate

  • Enorm - Average L2 norm of the residuals.

References

Tucker L (1966). “Some mathematical notes on three-mode factor analysis.” Psychometrika, 31(3), 279-311. doi:10.1007/BF02289464, https://ideas.repec.org/a/spr/psycho/v31y1966i3p279-311.html.

Rocci R, Vichi M (2005). “Three-Mode Component Analysis with Crisp or Fuzzy Partition of Units.” Psychometrika, 70, 715-736. doi:10.1007/s11336-001-0926-z.

Vichi M, Rocci R, Kiers H (2007). “Simultaneous Component and Clustering Models for Three-way Data: Within and Between Approaches.” Journal of Classification, 24, 71-98. doi:10.1007/s00357-007-0006-x.

Examples

X_i_jk = generate_dataset()$X_i_jk
model = simultaneous()
t3clus = fit.t3clus(model, X_i_jk, c(8,5,4), c(3,3,2))

TWCFTA model

Description

Implements K-means clustering and afterwards factorial reduction in a sequential fashion.

Usage

fit.twcfta(model, X_i_jk, full_tensor_shape, reduced_tensor_shape)

## S4 method for signature 'tandem'
fit.twcfta(model, X_i_jk, full_tensor_shape, reduced_tensor_shape)

Arguments

model

Initialized tandem model.

X_i_jk

Matricized tensor along mode-1 (I objects).

full_tensor_shape

Dimensions of the tensor in full space.

reduced_tensor_shape

Dimensions of tensor in the reduced space.

Details

The procedure requires sequential clustering and factorial decomposition.

  • The K-means clustering algorithm is initially applied to the matricized tensor X_i_jk to obtain the centroids matrix X_g_jk and the membership matrix U_i_g.

  • The Tucker2 decomposition technique is then implemented on the centroids matrix X_g_jk to yield the core centroids matrix Y_g_qr and the component weights matrices B_j_q and C_k_r.
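The two steps above can be sketched in base R. This is an illustration under stated assumptions, not the package's implementation: it uses `stats::kmeans` for the clustering step and a one-pass truncated SVD of the mode-2 and mode-3 unfoldings (an HOSVD-style approximation) in place of an iteratively fitted Tucker2 model. It also assumes the columns of X_i_jk are ordered with the variable index running fastest within each occasion.

```r
set.seed(1)
I <- 8; J <- 5; K <- 4; G <- 3; Q <- 3; R <- 2
X_i_jk <- matrix(rnorm(I * J * K), nrow = I)   # toy matricized tensor

# Step 1: K-means on the matricized tensor.
km <- kmeans(X_i_jk, centers = G, nstart = 10)
U_i_g <- diag(G)[km$cluster, ]                 # I x G binary membership matrix
X_g_jk <- km$centers                           # G x (J*K) centroids matrix

# Step 2: component weights from the centroid tensor (SVD-based approximation).
X_g_j_k <- array(X_g_jk, dim = c(G, J, K))
X_j <- matrix(aperm(X_g_j_k, c(2, 3, 1)), nrow = J)   # mode-2 unfolding
X_k <- matrix(aperm(X_g_j_k, c(3, 1, 2)), nrow = K)   # mode-3 unfolding
B_j_q <- svd(X_j, nu = Q, nv = 0)$u            # J x Q weights for the variables
C_k_r <- svd(X_k, nu = R, nv = 0)$u            # K x R weights for the occasions

# Core centroids in the reduced space, G x (Q*R).
Y_g_qr <- X_g_jk %*% kronecker(C_k_r, B_j_q)
```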

Value

Output attributes accessible via the '@' operator.

  • U_i_g0 - Initial object membership function matrix.

  • B_j_q0 - Initial factor/component matrix for the variables.

  • C_k_r0 - Initial factor/component matrix for the occasions.

  • U_i_g - Final/updated object membership function matrix.

  • B_j_q - Final/updated factor/component matrix for the variables.

  • C_k_r - Final/updated factor/component matrix for the occasions.

  • Y_g_qr - Derived centroids in the reduced space (data matrix).

  • X_i_jk_scaled - Standardized dataset matrix.

  • BestTimeElapsed - Execution time for the best iterate.

  • BestLoop - Loop that obtained the best iterate.

  • BestKmIteration - Number of iterations until the best iterate for the K-means.

  • BestFaIteration - Number of iterations until the best iterate for the FA.

  • FaConverged - Flag to check if the algorithm converged for the factor decomposition.

  • KmConverged - Flag to check if the algorithm converged for the K-means.

  • nKmConverges - Number of loops that converged for the K-means.

  • nFaConverges - Number of loops that converged for the Factor decomposition.

  • TSS_full - Total deviance in the full-space.

  • BSS_full - Between deviance in the full-space.

  • RSS_full - Residual deviance in the full-space.

  • PF_full - PseudoF in the full-space.

  • TSS_reduced - Total deviance in the reduced-space.

  • BSS_reduced - Between deviance in the reduced-space.

  • RSS_reduced - Residual deviance in the reduced-space.

  • PF_reduced - PseudoF in the reduced-space.

  • PF - Actual PseudoF value to obtain best loop.

  • Labels - Object cluster assignments.

  • FsKM - Objective function values for the KM best iterate.

  • FsFA - Objective function values for the FA best iterate.

  • Enorm - Average L2 norm of the residuals.

Note

  • This procedure is useful to further interpret the between-clusters variability of the data and to understand which variables and/or occasions contribute most to discriminating the clusters. However, applying this technique could lead to the masking of variables that are not informative of the clustering structure.

  • Since the Tucker2 model is applied after the clustering, it cannot help select the information in the dataset that is most relevant for the clustering.

References

Arabie P, Hubert L (1996). “Advances in Cluster Analysis Relevant to Marketing Research.” In Gaul W, Pfeifer D (eds.), From Data to Knowledge, 3–19.

Tucker L (1966). “Some mathematical notes on three-mode factor analysis.” Psychometrika, 31(3), 279-311. doi:10.1007/BF02289464, https://ideas.repec.org/a/spr/psycho/v31y1966i3p279-311.html.

See Also

fit.twfcta tandem

Examples

X_i_jk = generate_dataset()$X_i_jk
model = tandem()
twcfta = fit.twcfta(model, X_i_jk, c(8,5,4), c(3,3,2))

TWFCTA model

Description

Implements factorial reduction and then K-means clustering in a sequential fashion.

Usage

fit.twfcta(model, X_i_jk, full_tensor_shape, reduced_tensor_shape)

## S4 method for signature 'tandem'
fit.twfcta(model, X_i_jk, full_tensor_shape, reduced_tensor_shape)

Arguments

model

Initialized tandem model.

X_i_jk

Matricized tensor along mode-1 (I objects).

full_tensor_shape

Dimensions of the tensor in full space.

reduced_tensor_shape

Dimensions of tensor in the reduced space.

Details

The procedure implements sequential factorial decomposition and clustering.

  • The technique performs Tucker2 decomposition on the X_i_jk matrix to obtain the matrix of component scores Y_i_qr with component weights matrices B_j_q and C_k_r.

  • The K-means clustering algorithm is then applied to the component scores matrix Y_i_qr to obtain the desired core centroids matrix Y_g_qr and its associated stochastic membership function matrix U_i_g.
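The two steps above can be sketched in base R. As an assumption for illustration, the Tucker2 step is approximated by a one-pass truncated SVD of the mode-2 and mode-3 unfoldings (HOSVD-style) instead of an iteratively fitted model, and the clustering step uses `stats::kmeans`. The package's own routine may differ in both respects.

```r
set.seed(1)
I <- 8; J <- 5; K <- 4; G <- 3; Q <- 3; R <- 2
X_i_jk <- matrix(rnorm(I * J * K), nrow = I)   # toy matricized tensor

# Step 1: component weights from the data tensor (SVD-based approximation).
X_i_j_k <- array(X_i_jk, dim = c(I, J, K))
X_j <- matrix(aperm(X_i_j_k, c(2, 3, 1)), nrow = J)   # mode-2 unfolding
X_k <- matrix(aperm(X_i_j_k, c(3, 1, 2)), nrow = K)   # mode-3 unfolding
B_j_q <- svd(X_j, nu = Q, nv = 0)$u            # J x Q weights for the variables
C_k_r <- svd(X_k, nu = R, nv = 0)$u            # K x R weights for the occasions
Y_i_qr <- X_i_jk %*% kronecker(C_k_r, B_j_q)   # component scores, I x (Q*R)

# Step 2: K-means on the component scores.
km <- kmeans(Y_i_qr, centers = G, nstart = 10)
U_i_g <- diag(G)[km$cluster, ]                 # I x G binary membership matrix
Y_g_qr <- km$centers                           # G x (Q*R) core centroids
```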

Value

Output attributes accessible via the '@' operator.

  • U_i_g0 - Initial object membership function matrix.

  • B_j_q0 - Initial factor/component matrix for the variables.

  • C_k_r0 - Initial factor/component matrix for the occasions.

  • U_i_g - Final/updated object membership function matrix.

  • B_j_q - Final/updated factor/component matrix for the variables.

  • C_k_r - Final/updated factor/component matrix for the occasions.

  • Y_g_qr - Derived centroids in the reduced space (data matrix).

  • X_i_jk_scaled - Standardized dataset matrix.

  • BestTimeElapsed - Execution time for the best iterate.

  • BestLoop - Loop that obtained the best iterate.

  • BestKmIteration - Number of iterations until the best iterate for the K-means.

  • BestFaIteration - Number of iterations until the best iterate for the FA.

  • FaConverged - Flag to check if the algorithm converged for the factor decomposition.

  • KmConverged - Flag to check if the algorithm converged for the K-means.

  • nKmConverges - Number of loops that converged for the K-means.

  • nFaConverges - Number of loops that converged for the Factor decomposition.

  • TSS_full - Total deviance in the full-space.

  • BSS_full - Between deviance in the full-space.

  • RSS_full - Residual deviance in the full-space.

  • PF_full - PseudoF in the full-space.

  • TSS_reduced - Total deviance in the reduced-space.

  • BSS_reduced - Between deviance in the reduced-space.

  • RSS_reduced - Residual deviance in the reduced-space.

  • PF_reduced - PseudoF in the reduced-space.

  • PF - Actual PseudoF value to obtain best loop.

  • Labels - Object cluster assignments.

  • FsKM - Objective function values for the KM best iterate.

  • FsFA - Objective function values for the FA best iterate.

  • Enorm - Average L2 norm of the residuals.

Note

  • The technique helps interpret the within-clusters variability of the data. The Tucker2 model tends to explain most of the total variation in the dataset, so the variance of variables that do not contribute to the clustering structure is also included.

  • The Tucker2 dimensions may still mask some essential clustering structures in the dataset.

References

Arabie P, Hubert L (1996). “Advances in Cluster Analysis Relevant to Marketing Research.” In Gaul W, Pfeifer D (eds.), From Data to Knowledge, 3–19.

Tucker L (1966). “Some mathematical notes on three-mode factor analysis.” Psychometrika, 31(3), 279-311. doi:10.1007/BF02289464, https://ideas.repec.org/a/spr/psycho/v31y1966i3p279-311.html.

See Also

fit.twcfta tandem

Examples

X_i_jk = generate_dataset()$X_i_jk
model = tandem()
twfcta = fit.twfcta(model, X_i_jk, c(8,5,4), c(3,3,2))

Folding Matrix to Tensor by Mode.

Description

Folds a matricized tensor back into a three-mode tensor: X_i_jk => X_i_j_k, X_j_ki => X_i_j_k, X_k_ij => X_i_j_k.

Usage

fold(X, mode, shape)

Arguments

X

Data matrix to fold.

mode

Mode along which the matrix X was unfolded.

shape

Dimension of original tensor.

Value

X_i_j_k Three-mode tensor.

Examples

X_i_jk = generate_dataset()$X_i_jk
X_i_j_k = fold(X_i_jk, mode=1, shape=c(I=8,J=5,K=4)) # X_i_j_k
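For intuition, mode-1 folding and unfolding can be reproduced with base R arrays. This sketch assumes that the columns of X_i_jk are ordered with the variable index j running fastest within each occasion k, which matches R's column-major storage. The package's `fold` may use a different column ordering.

```r
I <- 2; J <- 3; K <- 4
X_i_j_k <- array(seq_len(I * J * K), dim = c(I, J, K))  # toy tensor

# Mode-1 unfolding: entry (i, j, k) lands in row i, column (k - 1) * J + j.
X_i_jk <- matrix(X_i_j_k, nrow = I)

# Folding is the inverse: reshape the I x (J*K) matrix back to I x J x K.
refolded <- array(X_i_jk, dim = c(I, J, K))

identical(refolded, X_i_j_k)   # TRUE: the round trip recovers the tensor
```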

Three-Mode Dataset Generator for Simulations

Description

Generates a synthetic dataset of I objects, grouped into G clusters and measured on J variables over K occasions, with additive noise.

Usage

generate_dataset(
  I = 8,
  J = 5,
  K = 4,
  G = 3,
  Q = 3,
  R = 2,
  centroids_spread = c(0, 1),
  noise_mean = 0,
  noise_stdev = 0.5,
  seed = NULL
)

Arguments

I

Number of objects.

J

Number of variables per occasion.

K

Number of occasions.

G

Number of clusters.

Q

Number of factors for the variables.

R

Number of factors for the occasions.

centroids_spread

Interval from which the centroids are drawn uniformly.

noise_mean

Mean of noise to generate.

noise_stdev

Noise effect level/spread/standard deviation.

seed

Seed for random sequence generation.

Value

Z_i_jk: Component scores in the full space.

E_i_jk: Generated noise at the given noise level.

X_i_jk: Dataset with noise level set to noise_stdev specified.

Y_g_qr: Centroids matrix in the reduced space.

U_i_g: Stochastic membership function matrix.

B_j_q: Component weights matrix for the variables.

C_k_r: Component weights matrix for the occasions.

Examples

generate_dataset(seed=0)
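The returned pieces fit together as X_i_jk = Z_i_jk + E_i_jk. A base-R sketch of one way such data can be built is shown below. It assumes the structural part is Z_i_jk = U_i_g Y_g_qr (C_k_r ⊗ B_j_q)' and that B_j_q and C_k_r are column-orthonormal matrices obtained from a QR decomposition. Both are assumptions for illustration, and the package may construct these matrices differently.

```r
set.seed(0)
I <- 8; J <- 5; K <- 4; G <- 3; Q <- 3; R <- 2

# Binary membership matrix with no empty cluster (I x G).
labels <- sample(c(seq_len(G), sample.int(G, I - G, replace = TRUE)))
U_i_g <- diag(G)[labels, ]

# Centroids drawn uniformly from the spread interval c(0, 1), G x (Q*R).
Y_g_qr <- matrix(runif(G * Q * R, min = 0, max = 1), nrow = G)

# Column-orthonormal component weights (assumed construction).
B_j_q <- qr.Q(qr(matrix(rnorm(J * Q), J, Q)))
C_k_r <- qr.Q(qr(matrix(rnorm(K * R), K, R)))

# Structural part, Gaussian noise, and the observed data, all I x (J*K).
Z_i_jk <- U_i_g %*% Y_g_qr %*% t(kronecker(C_k_r, B_j_q))
E_i_jk <- matrix(rnorm(I * J * K, mean = 0, sd = 0.5), nrow = I)
X_i_jk <- Z_i_jk + E_i_jk
```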

Random Membership Function Matrix Generator

Description

Generates a random binary stochastic membership function matrix for the I objects.

Usage

generate_rmfm(I, G, seed = NULL)

Arguments

I

Number of objects.

G

Number of groups/clusters.

seed

Seed for random number generation.

Value

U_i_g, binary stochastic membership matrix.

Examples

generate_rmfm(I=8,G=3)
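A binary stochastic membership matrix has exactly one 1 per row. The base-R sketch below builds one and additionally guarantees that no cluster is empty. The helper name `random_membership` and the no-empty-cluster guarantee are assumptions for illustration, not a description of how `generate_rmfm` works internally.

```r
# Hypothetical helper: random I x G binary membership matrix.
random_membership <- function(I, G, seed = NULL) {
  if (!is.null(seed)) set.seed(seed)
  stopifnot(I >= G)
  # Give every cluster one object, assign the rest at random, then shuffle.
  labels <- sample(c(seq_len(G), sample.int(G, I - G, replace = TRUE)))
  U <- matrix(0, nrow = I, ncol = G)
  U[cbind(seq_len(I), labels)] <- 1   # one 1 per row, at the object's cluster
  U
}

U_i_g <- random_membership(I = 8, G = 3, seed = 1)
rowSums(U_i_g)   # every object belongs to exactly one cluster
colSums(U_i_g)   # cluster sizes
```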

One-run of the K-means clustering technique

Description

Initializes centroids from a given membership function matrix, or randomly if none is given. Iterates once over the input data to update the membership function matrix, assigning each object to its closest centroid.

Usage

onekmeans(Y_i_qr, G, U_i_g = NULL, seed = NULL)

Arguments

Y_i_qr

Input data to group/cluster.

G

Number of clusters to find.

U_i_g

Initial membership matrix for the I objects.

seed

Seed for random values generation.

Value

Updated membership matrix U_i_g.

References

Oti EU, Olusola MO, Eze FC, Enogwe SU (2021). “Comprehensive Review of K-Means Clustering Algorithms.” International Journal of Advances in Scientific Research and Engineering (IJASRE), 7(8), 64–69. doi:10.31695/IJASRE.2021.34050, https://ijasre.net/index.php/ijasre/article/view/1301.

Examples

X_i_jk = generate_dataset(seed=0)$X_i_jk
onekmeans(X_i_jk, G=5)
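A single K-means pass consists of computing the centroids implied by the current membership matrix and then reassigning each object to its nearest centroid. The base-R sketch below shows that step on toy data. The helper name `one_kmeans_step` is hypothetical, and the sketch assumes no cluster is empty. It is not the package's `onekmeans` code.

```r
# Hypothetical helper: one centroid update followed by one reassignment.
one_kmeans_step <- function(Y, U) {
  n_g <- colSums(U)                       # cluster sizes (assumed all > 0)
  centroids <- crossprod(U, Y) / n_g      # G x P matrix of cluster means
  # Squared Euclidean distance from every object to every centroid (I x G).
  d2 <- outer(rowSums(Y^2), rowSums(centroids^2), "+") - 2 * Y %*% t(centroids)
  labels <- max.col(-d2, ties.method = "first")   # index of the nearest centroid
  U_new <- matrix(0, nrow = nrow(Y), ncol = ncol(U))
  U_new[cbind(seq_len(nrow(Y)), labels)] <- 1
  U_new
}

Y <- rbind(c(0, 0), c(0.1, 0), c(5, 5), c(5.1, 5))    # two obvious groups
U0 <- rbind(c(1, 0), c(0, 1), c(0, 1), c(0, 1))       # poor starting partition
U1 <- one_kmeans_step(Y, U0)   # objects 1-2 go to cluster 1, objects 3-4 to cluster 2
```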

PseudoF Score in the Full-Space

Description

Computes the PseudoF score in the full space.

Usage

pseudof.full(bss, wss, full_tensor_shape, reduced_tensor_shape)

Arguments

bss

Between-clusters sum of squared deviations.

wss

Within-clusters sum of squared deviations.

full_tensor_shape

Dimensions of the tensor in the original space.

reduced_tensor_shape

Dimension of the tensor in the reduced space.

Value

PseudoF score

References

Caliński T, Harabasz J (1974). “A dendrite method for cluster analysis.” Communications in Statistics, 3(1), 1-27. doi:10.1080/03610927408827101, https://www.tandfonline.com/doi/pdf/10.1080/03610927408827101, https://www.tandfonline.com/doi/abs/10.1080/03610927408827101.

Rocci R, Vichi M (2005). “Three-Mode Component Analysis with Crisp or Fuzzy Partition of Units.” Psychometrika, 70, 715-736. doi:10.1007/s11336-001-0926-z.

Examples

pseudof.full(12,6,c(8,5,4),c(3,3,2))
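For orientation, the classical Caliński-Harabasz pseudo-F divides the between-clusters and within-clusters sums of squares by their degrees of freedom, G - 1 and n - G. The sketch below implements only that textbook form. The helper name `pseudo_f_classic` is hypothetical, and the degrees of freedom that `pseudof.full` derives from the tensor shapes are not documented here, so its result may differ.

```r
# Textbook Calinski-Harabasz pseudo-F for n objects in g clusters.
pseudo_f_classic <- function(bss, wss, n, g) {
  (bss / (g - 1)) / (wss / (n - g))
}

pf <- pseudo_f_classic(bss = 12, wss = 6, n = 8, g = 3)
pf   # (12 / 2) / (6 / 5) = 5
```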

PseudoF Score in the Reduced-Space

Description

Computes the PseudoF score in the reduced space.

Usage

pseudof.reduced(bss, wss, full_tensor_shape, reduced_tensor_shape)

Arguments

bss

Between-clusters sum of squared deviations.

wss

Within-clusters sum of squared deviations.

full_tensor_shape

Dimensions of the tensor in the original space.

reduced_tensor_shape

Dimension of the tensor in the reduced space.

Value

PseudoF score

References

Caliński T, Harabasz J (1974). “A dendrite method for cluster analysis.” Communications in Statistics, 3(1), 1-27. doi:10.1080/03610927408827101, https://www.tandfonline.com/doi/pdf/10.1080/03610927408827101, https://www.tandfonline.com/doi/abs/10.1080/03610927408827101.

Examples

pseudof.reduced(12,6,c(8,5,4),c(3,3,2))

Simultaneous Model Constructor

Description

Initialize model object required by the simultaneous methods.

Usage

simultaneous(
  seed = NULL,
  verbose = TRUE,
  init = "svd",
  n_max_iter = 10,
  n_loops = 10,
  tol = 1e-05,
  U_i_g = NULL,
  B_j_q = NULL,
  C_k_r = NULL
)

Arguments

seed

Seed for random sequence generation.

verbose

Flag to display output result for each loop.

init

The initialization method for the model parameters. One of 'svd', 'random', 'twcfta' or 'twfcta'. Defaults to 'svd'.

n_max_iter

Maximum number of iterations to optimize objective function.

n_loops

Number of runs/loops in search of the global result.

tol

Acceptable tolerance level.

U_i_g

Membership function matrix for the objects.

B_j_q

Component matrix for the variables.

C_k_r

Component matrix for the occasions.

Details

Two simultaneous models are implemented: T3Clus and 3FKMeans.

  • T3Clus finds B_j_q and C_k_r such that the between-clusters deviance of the component scores is maximized.

  • 3FKMeans finds B_j_q and C_k_r such that the within-clusters deviance of the component scores is minimized.
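The two criteria above are linked: for a fixed partition and fixed weights, the total sum of squares of the component scores splits into a between-clusters part and a within-clusters part. The base-R sketch below checks that decomposition on toy data. It uses uncentered sums of squares and randomly generated column-orthonormal B_j_q and C_k_r, which are assumptions for illustration and not the package's estimation procedure.

```r
set.seed(2)
I <- 8; J <- 5; K <- 4; G <- 3; Q <- 3; R <- 2
X_i_jk <- matrix(rnorm(I * J * K), nrow = I)
U_i_g <- diag(G)[c(1, 2, 3, 1, 2, 3, 1, 2), ]          # fixed toy partition
B_j_q <- qr.Q(qr(matrix(rnorm(J * Q), J, Q)))          # column-orthonormal J x Q
C_k_r <- qr.Q(qr(matrix(rnorm(K * R), K, R)))          # column-orthonormal K x R

Y_i_qr <- X_i_jk %*% kronecker(C_k_r, B_j_q)           # component scores
Y_g_qr <- solve(crossprod(U_i_g), crossprod(U_i_g, Y_i_qr))  # cluster centroids
fitted <- U_i_g %*% Y_g_qr                             # each object's centroid

tss <- sum(Y_i_qr^2)               # total sum of squares of the scores
bss <- sum(fitted^2)               # between-clusters part (T3Clus maximizes this)
wss <- sum((Y_i_qr - fitted)^2)    # within-clusters part (3FKMeans minimizes this)
all.equal(tss, bss + wss)          # TRUE: the two parts add up to the total
```

Because the total depends on B_j_q and C_k_r, maximizing the between part and minimizing the within part are generally different problems, which is why the two models can give different solutions.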

Value

An object of class "simultaneous".

Note

The model finds the best partition described by the best orthogonal linear combinations of the variables and orthogonal linear combinations of the occasions.

References

Tucker L (1966). “Some mathematical notes on three-mode factor analysis.” Psychometrika, 31(3), 279-311. doi:10.1007/BF02289464, https://ideas.repec.org/a/spr/psycho/v31y1966i3p279-311.html.

Vichi M, Rocci R, Kiers H (2007). “Simultaneous Component and Clustering Models for Three-way Data: Within and Between Approaches.” Journal of Classification, 24, 71-98. doi:10.1007/s00357-007-0006-x.

See Also

fit.t3clus fit.3fkmeans fit.ct3clus tandem

Examples

simultaneous()

Simultaneous Model

Description

Simultaneous Model

Slots

seed

numeric. Seed for random sequence generation. Defaults to NULL.

verbose

logical. Whether to display execution output or not. Defaults to TRUE.

init

character. The parameter initialization method. Defaults to 'svd'.

n_max_iter

numeric. Maximum number of iterations. Defaults to 10.

n_loops

numeric. Number of initializations (loops) run in search of the global solution. Defaults to 10.

tol

numeric. Tolerance level/acceptable error. Defaults to 1e-5.

U_i_g

numeric. (I,G) initial stochastic membership function matrix.

B_j_q

numeric. (J,Q) initial component weight matrix for variables.

C_k_r

numeric. (K,R) initial component weight matrix for occasions.


Tandem Model Constructor

Description

Initializes an instance of the tandem model required by the tandem methods.

Usage

tandem(
  seed = NULL,
  verbose = TRUE,
  init = "svd",
  n_max_iter = 10,
  n_loops = 10,
  tol = 1e-05,
  U_i_g = NULL,
  B_j_q = NULL,
  C_k_r = NULL
)

Arguments

seed

Seed for random sequence generation.

verbose

Flag to display iteration outputs for each loop.

init

Parameter initialization method, 'svd' or 'random'.

n_max_iter

Maximum number of iterations to optimize the objective function.

n_loops

Maximum number of loops/runs for global results.

tol

Allowable tolerance to check convergence.

U_i_g

Initial membership function matrix for the objects.

B_j_q

Initial component weights matrix for the variables.

C_k_r

Initial component weights matrix for the occasions.

Value

An object of class "tandem".

References

Arabie P, Hubert L (1996). “Advances in Cluster Analysis Relevant to Marketing Research.” In Gaul W, Pfeifer D (eds.), From Data to Knowledge, 3–19.

Tucker L (1966). “Some mathematical notes on three-mode factor analysis.” Psychometrika, 31(3), 279-311. doi:10.1007/BF02289464, https://ideas.repec.org/a/spr/psycho/v31y1966i3p279-311.html.

See Also

fit.twcfta fit.twfcta simultaneous


Tandem Class

Description

Tandem Class

Slots

seed

numeric. Seed for random sequence generation. Defaults to NULL.

verbose

logical. Whether to display execution output or not. Defaults to TRUE.

init

character. The parameter initialization method. Defaults to 'svd'.

n_max_iter

numeric. Maximum number of iterations. Defaults to 10.

n_loops

numeric. Number of initializations (loops) run in search of the global solution. Defaults to 10.

tol

numeric. Tolerance level/acceptable error. Defaults to 1e-5.

U_i_g

matrix. (I,G) initial stochastic membership function matrix.

B_j_q

matrix. (J,Q) initial component weight matrix for variables.

C_k_r

matrix. (K,R) initial component weight matrix for occasions.


Tensor Matricization

Description

Unfolds (matricizes) a three-mode tensor into a matrix along the given mode.

Usage

unfold(tensor, mode)

Arguments

tensor

Three-mode tensor array.

mode

Mode along which to unfold the tensor.

Value

Matrix

Examples

X_i_jk = generate_dataset()$X_i_jk
X_i_j_k = fold(X_i_jk, mode=1, shape=c(I=8,J=5,K=4))
unfold(X_i_j_k, mode=1) # X_i_jk
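Unfolding along any of the three modes can be sketched in base R by permuting the array so that the chosen mode comes first and then reshaping. The helper name `unfold_sketch` and the column orderings it produces (X_i_jk with j fastest, X_j_ki with k fastest, X_k_ij with i fastest) are assumptions for illustration. The package's `unfold` may order the columns differently.

```r
# Hypothetical helper: matricize a three-mode array along mode 1, 2, or 3.
unfold_sketch <- function(tensor, mode) {
  # Cyclic permutation that brings the requested mode to the front.
  perm <- list(c(1, 2, 3), c(2, 3, 1), c(3, 1, 2))[[mode]]
  matrix(aperm(tensor, perm), nrow = dim(tensor)[mode])
}

X <- array(seq_len(24), dim = c(2, 3, 4))   # I = 2, J = 3, K = 4
M1 <- unfold_sketch(X, 1)   # 2 x 12, rows are objects
M2 <- unfold_sketch(X, 2)   # 3 x 8, rows are variables
M3 <- unfold_sketch(X, 3)   # 4 x 6, rows are occasions

# Folding a mode-2 unfolding back: reshape to J x K x I, then undo the permutation.
back <- aperm(array(M2, dim = c(3, 4, 2)), c(3, 1, 2))
identical(back, X)          # TRUE
```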