Package 'MatrixCorrelation'

Title:	Matrix Correlation Coefficients
Description:	Computation and visualization of matrix correlation coefficients. The main method is the Similarity of Matrices Index, while various related measures like r1, r2, r3, r4, Yanai's GCD, RV, RV2, adjusted RV, Rozeboom's linear correlation and Coxhead's coefficient are included for comparison and flexibility.
Authors:	Kristian Hovde Liland
Maintainer:	Kristian Hovde Liland
License:	GPL-2
Version:	0.10.0
Built:	2025-02-04 06:49:06 UTC
Source:	CRAN

Help Index

All correlations
Candy data
Test for no correlation between paired samples
Coxhead's coefficient
Similarity of Matrices Coefficients
Principal Component Analysis cross-validation error
Principal Component Analysis based imputation
Result functions for the Similarity of Matrices Index (SMI)
Procrustes Similarity Index
Correlational Measures for Matrices
Rozeboom's squared vector correlation
RV coefficients
Significance estimation for Similarity of Matrices Index (SMI)
Similarity of Matrices Index (SMI)

All correlations

Description

Compare all correlation measures in the package (or a subset)

Usage

allCorrelations(
  X1,
  X2,
  ncomp1,
  ncomp2,
  methods = c("SMI", "RV", "RV2", "RVadj", "PSI", "r1", "r2", "r3", "r4", "GCD"),
  digits = 3,
  plot = TRUE,
  xlab = "",
  ylab = "",
  ...
)

Arguments

`X1`	first `matrix` to be compared (`data.frames` are also accepted).
`X2`	second `matrix` to be compared (`data.frames` are also accepted).
`ncomp1`	maximum number of subspace components from the first `matrix`.
`ncomp2`	maximum number of subspace components from the second `matrix`.
`methods`	`character` vector containing a subset of the supported methods: "SMI", "RV", "RV2", "RVadj", "PSI", "r1", "r2", "r3", "r4", "GCD".
`digits`	number of digits for numerical output.
`plot`	logical indicating if plotting should be performed (default = TRUE).
`xlab`	optional x axis label.
`ylab`	optional y axis label.
`...`	additional arguments for `SMI` or `plot`.

Details

For each of the coefficients, a single scalar is computed to describe the similarity between the two input matrices. Note that some methods require setting one or two numbers of components.

Value

A single value measuring the similarity of two matrices.

Author(s)

Kristian Hovde Liland

References

SMI: Indahl, U.G.; Næs, T.; Liland, K.H.; 2018. A similarity index for comparing coupled matrices. Journal of Chemometrics; e3049.
RV: Robert, P.; Escoufier, Y. (1976). "A Unifying Tool for Linear Multivariate Statistical Methods: The RV-Coefficient". Applied Statistics 25 (3): 257-265.
RV2: Smilde, AK; Kiers, HA; Bijlsma, S; Rubingh, CM; van Erk, MJ (2009). "Matrix correlations for high-dimensional data: the modified RV-coefficient". Bioinformatics 25(3): 401-5.
Adjusted RV: Mayer, CD; Lorent, J; Horgan, GW. (2011). "Exploratory analysis of multiple omics datasets using the adjusted RV coefficient". Stat Appl Genet Mol Biol. 10(14).
PSI: Sibson, R; 1978. "Studies in the Robustness of Multidimensional Scaling: Procrustes Statistics". Journal of the Royal Statistical Society. Series B (Methodological), Vol. 40, No. 2, pp. 234-238.
Rozeboom: Rozeboom, WW; 1965. "Linear correlations between sets of variables". Psychometrika 30(1): 57-71.
Coxhead: Coxhead, P; 1974. "Measuring the relationship between two sets of variables". British Journal of Mathematical and Statistical Psychology 27: 205-212.

Examples

X1  <- scale( matrix( rnorm(100*300), 100,300), scale = FALSE)
usv <- svd(X1)
# Remove third principal component from X1 to produce X2
X2  <- usv$u[,-3] %*% diag(usv$d[-3]) %*% t(usv$v[,-3])

allCorrelations(X1,X2, ncomp1 = 5,ncomp2 = 5)

Candy data

Description

Measurements from sensory analysis (professional tasting) on a number of candy products obtained by sensory labs. The two labs and the associated data sets are parts of a larger study described in Tomic et al. (2010).

Usage

data(candy)

Format

Two matrices of dimension 18 x 6.

References

Tomic, O., Luciano, G., Nilsen, A., Hyldig, G., Lorensen, K., Næs, T. (2010). Analysing sensory panel performance in a proficiency test using the PanelCheck software. European Food Research and Technology 230(3), 497-511.

Test for no correlation between paired samples

Description

Permutation test for squared Pearson correlation between two vectors of samples.

Usage

cor.test_eq(x, y, B = 10000)

Arguments

`x`	first `vector` to be compared (or two column `matrix/data.frame`).
`y`	second `vector` to be compared (omit if included in `x`).
`B`	integer number of permutations, default = 10000.

Details

This is a convenience function combining `SMI` and `significant` for the special case of vector vs vector comparisons. The null hypothesis is that the correlation between the vectors is +/-1, while significance signifies a deviation toward 0.

Value

A value indicating if the two input vectors are significantly different.

Author(s)

Kristian Hovde Liland

References

Indahl, U.G.; Næs, T.; Liland, K.H.; 2018. A similarity index for comparing coupled matrices. Journal of Chemometrics; e3049.

Examples

a <- (1:5) + rnorm(5)
b <- (1:5) + rnorm(5)
cor.test_eq(a,b)

Coxhead's coefficient

Description

Coxhead's coefficient

Usage

Coxhead(X1, X2, weighting = c("sqrt", "min"))

Arguments

`X1`	first `matrix` to be compared (`data.frames` are also accepted).
`X2`	second `matrix` to be compared (`data.frames` are also accepted).
`weighting`	`string` indicating if weighting should be `sqrt(p*q)` or `min(p,q)` (default = 'sqrt').

Value

A single value measuring the similarity of two matrices. For diagnostic purposes it is accompanied by an attribute "canonical.correlation".

References

Coxhead, P; 1974. "Measuring the relationship between two sets of variables". British Journal of Mathematical and Statistical Psychology 27: 205-212.

Examples

X <- matrix(rnorm(100*13),nrow=100)
X1 <- X[, 1:5]  # Random normal
X2 <- X[, 6:12] # Random normal
X2[,1] <- X2[,1] + X[,5] # Overlap in one variable
Coxhead(X1, X2)

Similarity of Matrices Coefficients

Description

Computation and visualization of matrix correlation coefficients. The main method is the Similarity of Matrices Index, while various related measures like r1, r2, r3, r4, Yanai's GCD, RV, RV2, adjusted RV, Rozeboom's linear correlation and Coxhead's coefficient are included for comparison and flexibility.

References

SMI: Indahl, U.G.; Næs, T.; Liland, K.H.; 2018. A similarity index for comparing coupled matrices. Journal of Chemometrics; e3049.
RV: Robert, P.; Escoufier, Y. (1976). "A Unifying Tool for Linear Multivariate Statistical Methods: The RV-Coefficient". Applied Statistics 25 (3): 257-265.
RV2: Smilde, AK; Kiers, HA; Bijlsma, S; Rubingh, CM; van Erk, MJ (2009). "Matrix correlations for high-dimensional data: the modified RV-coefficient". Bioinformatics 25(3): 401-5.
Adjusted RV: Mayer, CD; Lorent, J; Horgan, GW. (2011). "Exploratory analysis of multiple omics datasets using the adjusted RV coefficient". Stat Appl Genet Mol Biol. 10(14).
PSI: Sibson, R; 1978. "Studies in the Robustness of Multidimensional Scaling: Procrustes Statistics". Journal of the Royal Statistical Society. Series B (Methodological), Vol. 40, No. 2, pp. 234-238.
Rozeboom: Rozeboom, WW; 1965. "Linear correlations between sets of variables". Psychometrika 30(1): 57-71.
Coxhead: Coxhead, P; 1974. "Measuring the relationship between two sets of variables". British Journal of Mathematical and Statistical Psychology 27: 205-212.

Principal Component Analysis cross-validation error

Description

PRESS values for PCA as implemented by Eigenvector and described by Bro et al. (2008).

Usage

PCAcv(X, ncomp)

Arguments

`X`	`matrix` object to perform PCA on.
`ncomp`	`integer` number of components.

Details

For each number of components, the predicted residual sum of squares (PRESS) is calculated by leave-one-out cross-validation. The implementation ensures no over-fitting or information bleeding.

Value

A vector of PRESS-values.

Author(s)

Kristian Hovde Liland

References

R. Bro, K. Kjeldahl, A.K. Smilde, H.A.L. Kiers, Cross-validation of component models: A critical look at current methods. Anal Bioanal Chem (2008) 390: 1241-1251.

Examples

X1  <- scale( matrix( rnorm(100*300), 100,300), scale = FALSE)
PCAcv(X1,10)

Principal Component Analysis based imputation

Description

Imputation of missing data, NA, using Principal Component Analysis with iterative refitting and mean value updates. The chosen number of components and convergence parameters (iterations and tolerance) influence the precision of the imputation.

Usage

PCAimpute(X, ncomp, center = TRUE, max_iter = 20, tol = 10^-5)

Arguments

`X`	`matrix` object to perform PCA on.
`ncomp`	`integer` number of components.
`center`	`logical` indicating if centering (default) should be performed.
`max_iter`	`integer` maximum number of PCA iterations, performed as long as the sum of squared changes in imputed values is above `tol`.
`tol`	`numeric` tolerance for the sum of squared changes in imputed values.

Value

Final singular value decomposition, imputed X matrix and convergence metrics (sequence of sum of squared change and number of iterations).
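
The iterative refitting described for this function can be illustrated with a minimal base-R sketch. It is not the package's implementation: the helper name `pca_impute_sketch`, the column-mean starting values and the exact stopping rule are assumptions made for illustration only.

```r
# Hypothetical helper (not part of MatrixCorrelation): iterative PCA imputation.
pca_impute_sketch <- function(X, ncomp, max_iter = 20, tol = 1e-5) {
  miss <- is.na(X)
  Xi   <- X
  # Assumption: start from the column means of the observed values
  start <- colMeans(X, na.rm = TRUE)
  Xi[miss] <- start[col(X)[miss]]
  change <- numeric(0)
  for (iter in seq_len(max_iter)) {
    mu  <- colMeans(Xi)                    # mean value update
    Xc  <- sweep(Xi, 2, mu)                # center with the current means
    usv <- svd(Xc, nu = ncomp, nv = ncomp)
    # Rank-ncomp reconstruction, then add the means back
    fit <- usv$u %*% (usv$d[seq_len(ncomp)] * t(usv$v))
    fit <- sweep(fit, 2, mu, "+")
    # Sum of squared change in the imputed values
    change[iter] <- sum((Xi[miss] - fit[miss])^2)
    Xi[miss] <- fit[miss]                  # only missing cells are updated
    if (change[iter] < tol) break
  }
  list(X = Xi, change = change, iterations = iter)
}

set.seed(1)
X <- matrix(rnorm(50 * 6), 50, 6)
X[c(3, 60, 140)] <- NA
res <- pca_impute_sketch(X, ncomp = 2)
```

Observed values are left untouched; only the cells that were NA are replaced, and `res$change` records the sequence of squared changes across iterations.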

Examples

X <- matrix(rnorm(12),3,4)
X[c(2,6,10)] <- NA
PCAimpute(X, 3)

Result functions for the Similarity of Matrices Index (SMI)

Description

Plotting, printing and summary functions for SMI, plus significance testing.

Usage

## S3 method for class 'SMI'
plot(
  x,
  y = NULL,
  x1lab = attr(x, "mat.names")[[1]],
  x2lab = attr(x, "mat.names")[[2]],
  main = "SMI",
  signif = 0.05,
  xlim = c(-(pq[1] + 1)/2, (pq[2] + 1)/2),
  ylim = c(0.5, (sum(pq) + 3)/2),
  B = 10000,
  cex = 1,
  cex.sym = 1,
  frame = NULL,
  frame.col = "red",
  frame.lwd = 2,
  replicates = NULL,
  ...
)

## S3 method for class 'SMI'
print(x, ...)

## S3 method for class 'SMI'
summary(object, ...)

is.signif(x, signif = 0.05, B = 10000, ...)

Arguments

`x`	object of class `SMI`.
`y`	not used.
`x1lab`	optional label for first matrix.
`x2lab`	optional label for second matrix.
`main`	optional heading (default = SMI).
`signif`	significance level for testing (default=0.05).
`xlim`	optional plotting limits.
`ylim`	optional plotting limits.
`B`	number of permutations (for significant, default=10000).
`cex`	optional text scaling (default = 1)
`cex.sym`	optional scaling for significance symbols (default = 1)
`frame`	two element integer vector indicating framed components.
`frame.col`	color for framed components.
`frame.lwd`	line width for framed components.
`replicates`	vector of replicates for significance testing.
`...`	additional arguments for `plot`.
`object`	object of class `SMI`.

Details

For plotting, a diamond plot is used. High SMI values are light and low SMI values are dark. If orthogonal projections have been used for calculating SMIs, significance symbols are included in the plot unless signif=NULL.

Value

plot silently returns NULL. print and summary return the printed matrix.

Author(s)

Kristian Hovde Liland

References

Indahl, U.G.; Næs, T.; Liland, K.H.; 2018. A similarity index for comparing coupled matrices. Journal of Chemometrics; e3049.

Examples

X1  <- scale( matrix( rnorm(100*300), 100,300), scale = FALSE)
usv <- svd(X1)
X2  <- usv$u[,-3] %*% diag(usv$d[-3]) %*% t(usv$v[,-3])

smi <- SMI(X1,X2,5,5)
plot(smi, B = 1000) # default B = 10000
print(smi)
summary(smi)
is.signif(smi, B = 1000) # default B = 10000

Procrustes Similarity Index

Description

An index based on the RV coefficient with Procrustes rotation.

Usage

PSI(X1, X2, center = TRUE)

Arguments

`X1`	first `matrix` to be compared (`data.frames` are also accepted).
`X2`	second `matrix` to be compared (`data.frames` are also accepted).
`center`	`logical` indicating if input matrices should be centered (default = TRUE).

Value

The Procrustes Similarity Index

References

Sibson, R; 1978. "Studies in the Robustness of Multidimensional Scaling: Procrustes Statistics". Journal of the Royal Statistical Society. Series B (Methodological), Vol. 40, No. 2, pp. 234-238.

Examples

X1  <- scale( matrix( rnorm(100*300), 100,300), scale = FALSE)
usv <- svd(X1)
X2  <- usv$u[,-3] %*% diag(usv$d[-3]) %*% t(usv$v[,-3])
PSI(X1,X2)

Correlational Measures for Matrices

Description

Matrix similarity as described by Ramsay et al. (1984).

Usage

r1(X1, X2, center = TRUE, impute = FALSE)

r2(
  X1,
  X2,
  center = TRUE,
  impute = FALSE,
  impute_par = list(max_iter = 20, tol = 10^-5)
)

r3(
  X1,
  X2,
  center = TRUE,
  impute = FALSE,
  impute_par = list(max_iter = 20, tol = 10^-5)
)

r4(
  X1,
  X2,
  center = TRUE,
  impute = FALSE,
  impute_par = list(max_iter = 20, tol = 10^-5)
)

GCD(
  X1,
  X2,
  ncomp1 = min(dim(X1)),
  ncomp2 = min(dim(X2)),
  center = TRUE,
  impute = FALSE,
  impute_par = list(max_iter = 20, tol = 10^-5)
)

Arguments

`X1`	first `matrix` to be compared (`data.frames` are also accepted).
`X2`	second `matrix` to be compared (`data.frames` are also accepted).
`center`	`logical` indicating if input matrices should be centered (default = TRUE).
`impute`	`logical` indicating if missing values are expected in `X1` or `X2`.
`impute_par`	named `list` of imputation parameters in case of NAs in X1/X2.
`ncomp1`	(GCD) number of subspace components from the first `matrix` (default: full subspace).
`ncomp2`	(GCD) number of subspace components from the second `matrix` (default: full subspace).

Details

Details can be found in Ramsay's paper:

r1: inner product correlation
r2: orientation-independent inner product correlation
r3: spectra-independent inner product correlations (including orientation)
r4: Spectra-Independent inner product Correlations
GCD: Yanai's Generalized Coefficient of Determination (GCD) measure. To reproduce the original GCD, use all components. When X1 and X2 are dummy variables, GCD is proportional to Pillai's criterion: tr(W^-1(B+W)).
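
To illustrate the GCD entry, here is a minimal base-R sketch. It assumes Yanai's definition tr(P1 P2) / sqrt(tr(P1) tr(P2)), where P1 and P2 are the orthogonal projectors onto the leading left singular subspaces of the centered matrices. `gcd_sketch` is a hypothetical helper; the package's `GCD()` may differ in details such as centering options and imputation.

```r
# Hypothetical helper (not part of MatrixCorrelation): Yanai's GCD on subspaces.
gcd_sketch <- function(X1, X2, ncomp1, ncomp2) {
  # Assumption: column centering before extracting the subspaces
  X1 <- scale(as.matrix(X1), center = TRUE, scale = FALSE)
  X2 <- scale(as.matrix(X2), center = TRUE, scale = FALSE)
  # Orthonormal bases for the leading subspaces
  T1 <- svd(X1)$u[, seq_len(ncomp1), drop = FALSE]
  T2 <- svd(X2)$u[, seq_len(ncomp2), drop = FALSE]
  # tr(P1 P2) equals the squared Frobenius norm of T1'T2;
  # tr(P1) = ncomp1 and tr(P2) = ncomp2
  sum(crossprod(T1, T2)^2) / sqrt(ncomp1 * ncomp2)
}

set.seed(1)
A <- matrix(rnorm(20 * 4), 20, 4)
B <- matrix(rnorm(20 * 6), 20, 6)
gcd_self  <- gcd_sketch(A, A, 3, 3)  # identical subspaces
gcd_other <- gcd_sketch(A, B, 3, 4)  # unrelated random matrices
```

Identical subspaces give a value of 1 (up to rounding), and the value cannot exceed 1 because tr(P1 P2) is bounded by the smaller subspace dimension.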

Value

A single value measuring the similarity of two matrices.

Author(s)

Kristian Hovde Liland

References

Ramsay, JO; ten Berge, J; Styan, GPH; 1984. "Matrix Correlation". Psychometrika 49(3): 403-423.

Examples

X1  <- matrix(rnorm(100*300),100,300)
usv <- svd(X1)
X2  <- usv$u[,-3] %*% diag(usv$d[-3]) %*% t(usv$v[,-3])

r1(X1,X2)
r2(X1,X2)
r3(X1,X2)
r4(X1,X2)
GCD(X1,X2)
GCD(X1,X2, 5,5)

# Missing data
X1[c(1, 50, 400, 900)] <- NA
X2[c(10, 200, 450, 1200)] <- NA
r1(X1,X2, impute = TRUE)
r2(X1,X2, impute = TRUE)
r3(X1,X2, impute = TRUE)
r4(X1,X2, impute = TRUE)
GCD(X1,X2, impute = TRUE)
GCD(X1,X2, 5,5, impute = TRUE)

Rozeboom's squared vector correlation

Description

Rozeboom's squared vector correlation

Usage

Rozeboom(X1, X2)

sqveccor(X1, X2)

Arguments

`X1`	first `matrix` to be compared (`data.frames` are also accepted).
`X2`	second `matrix` to be compared (`data.frames` are also accepted).

Value

A single value measuring the similarity of two matrices. For diagnostic purposes it is accompanied by an attribute "canonical.correlation".

Author(s)

Korbinian Strimmer and Kristian Hovde Liland

References

Rozeboom, WW; 1965. "Linear correlations between sets of variables". Psychometrika 30(1): 57-71.

Examples

X <- matrix(rnorm(100*13),nrow=100)
X1 <- X[, 1:5]  # Random normal
X2 <- X[, 6:12] # Random normal
X2[,1] <- X2[,1] + X[,5] # Overlap in one variable
Rozeboom(X1, X2)

RV coefficients

Description

Three different RV coefficients: RV, RV2 and adjusted RV.

Usage

RV(X1, X2, center = TRUE, impute = FALSE)

RV2(X1, X2, center = TRUE, impute = FALSE)

RVadjMaye(X1, X2, center = TRUE)

RVadjGhaziri(X1, X2, center = TRUE)

RVadj(X1, X2, version = c("Maye", "Ghaziri"), center = TRUE)

Arguments

`X1`	first `matrix` to be compared (`data.frames` are also accepted).
`X2`	second `matrix` to be compared (`data.frames` are also accepted).
`center`	`logical` indicating if input matrices should be centered (default = TRUE).
`impute`	`logical` indicating if missing values are expected in `X1` or `X2` (only for RV and RV2).
`version`	which version of the adjusted RV to apply when calling `RVadj`: "Maye" (default) or "Ghaziri".

Details

For each of the four coefficients (RV, RV2 and the two versions of adjusted RV), a single scalar is computed to describe the similarity between the two input matrices.
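
As an illustration of how such a scalar arises, the classical RV coefficient of Robert and Escoufier can be written as tr(S1 S2) / sqrt(tr(S1 S1) tr(S2 S2)) with S1 = X1 X1' and S2 = X2 X2'. The base-R sketch below follows that formula; `rv_sketch` is a hypothetical helper and not the package's `RV()`, which additionally supports imputation.

```r
# Hypothetical helper (not part of MatrixCorrelation): classical RV coefficient.
rv_sketch <- function(X1, X2, center = TRUE) {
  X1 <- as.matrix(X1)
  X2 <- as.matrix(X2)
  if (center) {
    X1 <- scale(X1, center = TRUE, scale = FALSE)
    X2 <- scale(X2, center = TRUE, scale = FALSE)
  }
  S1 <- tcrossprod(X1)  # n x n configuration matrix X1 X1'
  S2 <- tcrossprod(X2)  # n x n configuration matrix X2 X2'
  # For symmetric matrices, tr(S1 S2) equals the sum of elementwise products
  sum(S1 * S2) / sqrt(sum(S1 * S1) * sum(S2 * S2))
}

set.seed(1)
A <- matrix(rnorm(20 * 4), 20, 4)
B <- matrix(rnorm(20 * 6), 20, 6)
rv_self  <- rv_sketch(A, A)  # identical matrices
rv_other <- rv_sketch(A, B)  # unrelated random matrices
```

Only the number of rows has to match between the two matrices, since the comparison is made on the n x n configuration matrices.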

Value

A single value measuring the similarity of two matrices.

Author(s)

Kristian Hovde Liland, Benjamin Leutner (RV2)

References

RV: Robert, P.; Escoufier, Y. (1976). "A Unifying Tool for Linear Multivariate Statistical Methods: The RV-Coefficient". Applied Statistics 25 (3): 257-265.
RV2: Smilde, AK; Kiers, HA; Bijlsma, S; Rubingh, CM; van Erk, MJ (2009). "Matrix correlations for high-dimensional data: the modified RV-coefficient". Bioinformatics 25(3): 401-5.
Adjusted RV: Mayer, CD; Lorent, J; Horgan, GW. (2011). "Exploratory analysis of multiple omics datasets using the adjusted RV coefficient". Stat Appl Genet Mol Biol. 10(14).
Adjusted RV: El Ghaziri, A; Qannari, E.M. (2015) "Measures of association between two datasets; Application to sensory data", Food Quality and Preference 40 (A): 116-124.

Examples

X1  <- matrix(rnorm(100*300),100,300)
usv <- svd(X1)
X2  <- usv$u[,-3] %*% diag(usv$d[-3]) %*% t(usv$v[,-3])

RV(X1,X2)
RV2(X1,X2)
RVadj(X1,X2)

# Missing data
X1[c(1, 50, 400, 900)] <- NA
X2[c(10, 200, 450, 1200)] <- NA
RV(X1,X2, impute = TRUE)
RV2(X1,X2, impute = TRUE)

Significance estimation for Similarity of Matrices Index (SMI)

Description

Permutation based hypothesis testing for SMI. The null hypothesis is that a linear function of one matrix subspace is included in the subspace of another matrix.

Usage

significant(smi, B = 10000, replicates = NULL)

Arguments

`smi`	`smi` object returned by call to `SMI`.
`B`	integer number of permutations, default = 10000.
`replicates`	integer vector of replicates.

Details

For each combination of components, significance is estimated by sampling from a null distribution of no similarity, i.e. the rows of one matrix are permuted B times and the corresponding SMI values are computed. If the vector replicates is included, replicates will be kept together through permutations.
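
One way to keep replicates together during a row permutation is to shuffle whole replicate groups rather than single rows. The base-R sketch below shows that idea only. It assumes that `replicates` holds one group label per row; `permute_rows_sketch` is a hypothetical helper, and the package may organise its permutations differently.

```r
# Hypothetical helper (not part of MatrixCorrelation): row permutation that
# moves replicate groups as blocks.
permute_rows_sketch <- function(n, replicates = NULL) {
  # Without replicate information, permute all rows freely
  if (is.null(replicates)) return(sample.int(n))
  # Assumption: 'replicates' is a vector of group labels, one per row
  groups <- split(seq_len(n), replicates)
  # Shuffle the order of the groups, keeping the rows inside each group together
  unlist(groups[sample.int(length(groups))], use.names = FALSE)
}

set.seed(1)
reps <- rep(1:4, each = 3)             # four samples measured in triplicate
idx  <- permute_rows_sketch(12, reps)  # row order for one permutation
# Indexing the rows of one matrix with idx would give one permuted data set.
```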

Value

A matrix containing P-values for all combinations of components.

Author(s)

Kristian Hovde Liland

References

Indahl, U.G.; Næs, T.; Liland, K.H.; 2018. A similarity index for comparing coupled matrices. Journal of Chemometrics; e3049.

Examples

X1  <- scale( matrix( rnorm(100*300), 100,300), scale = FALSE)
usv <- svd(X1)
X2  <- usv$u[,-3] %*% diag(usv$d[-3]) %*% t(usv$v[,-3])

(smi <- SMI(X1,X2,5,5))
significant(smi, B = 1000) # default B = 10000

Similarity of Matrices Index (SMI)

Description

A similarity index for comparing coupled data matrices.

Usage

SMI(
  X1,
  X2,
  ncomp1 = Rank(X1) - 1,
  ncomp2 = Rank(X2) - 1,
  projection = "Orthogonal",
  Scores1 = NULL,
  Scores2 = NULL,
  impute = FALSE,
  impute_par = list(max_iter = 20, tol = 10^-5)
)

Arguments

`X1`	first `matrix` to be compared (`data.frames` are also accepted).
`X2`	second `matrix` to be compared (`data.frames` are also accepted).
`ncomp1`	maximum number of subspace components from the first `matrix`.
`ncomp2`	maximum number of subspace components from the second `matrix`.
`projection`	type of projection to apply, defaults to "Orthogonal", alternatively "Procrustes".
`Scores1`	user supplied score-`matrix` to replace singular value decomposition of first `matrix`.
`Scores2`	user supplied score-`matrix` to replace singular value decomposition of second `matrix`.
`impute`	`logical` for activation of PCA based imputation for X1/X2.
`impute_par`	named `list` of imputation parameters in case of NAs in X1/X2.

Details

A two-step process starts with extraction of stable subspaces using Principal Component Analysis or some other method yielding two orthonormal bases. These bases are compared using Orthogonal Projection (OP / ordinary least squares) or Procrustes Rotation (PR). The result is a similarity measure that can be adjusted to various data sets and contexts and which includes explorative plotting and permutation based testing of matrix subspace equality.
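
The two steps can be sketched in base R for the orthogonal projection case. The sketch assumes that the index for p and q components is the squared Frobenius norm of T1'T2 divided by min(p, q), where T1 and T2 hold the first p and q left singular vectors of the column-centered matrices. `smi_op_sketch` is a hypothetical helper and does not cover Procrustes rotation, user supplied scores or imputation.

```r
# Hypothetical helper (not part of MatrixCorrelation): SMI with orthogonal projection.
smi_op_sketch <- function(X1, X2, ncomp1, ncomp2) {
  # Assumption: column centering before the SVD
  X1 <- scale(as.matrix(X1), center = TRUE, scale = FALSE)
  X2 <- scale(as.matrix(X2), center = TRUE, scale = FALSE)
  # Step 1: orthonormal bases from the left singular vectors
  T1 <- svd(X1)$u[, seq_len(ncomp1), drop = FALSE]
  T2 <- svd(X2)$u[, seq_len(ncomp2), drop = FALSE]
  # Step 2: compare the bases for every combination of component numbers
  smi <- matrix(NA_real_, ncomp1, ncomp2)
  for (p in seq_len(ncomp1)) {
    for (q in seq_len(ncomp2)) {
      C <- crossprod(T1[, seq_len(p), drop = FALSE],
                     T2[, seq_len(q), drop = FALSE])
      smi[p, q] <- sum(C^2) / min(p, q)
    }
  }
  smi
}

set.seed(1)
A <- matrix(rnorm(20 * 4), 20, 4)
B <- matrix(rnorm(20 * 6), 20, 6)
smi_self  <- smi_op_sketch(A, A, 3, 3)  # nested subspaces of the same matrix
smi_other <- smi_op_sketch(A, B, 3, 3)  # unrelated random matrices
```

When both bases come from the same matrix, the smaller subspace is always contained in the larger one, so every entry of `smi_self` equals 1 up to rounding.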

Value

A matrix containing SMI values for all combinations of components. Its class is "SMI", with associated print, plot and summary methods.

Author(s)

Kristian Hovde Liland

References

Ulf Geir Indahl, Tormod Næs, Kristian Hovde Liland; 2018. A similarity index for comparing coupled matrices. Journal of Chemometrics; e3049.

Examples

# Simulation
X1  <- scale( matrix( rnorm(100*300), 100,300), scale = FALSE)
usv <- svd(X1)
X2  <- usv$u[,-3] %*% diag(usv$d[-3]) %*% t(usv$v[,-3])

(smi <- SMI(X1,X2,5,5))
plot(smi, B = 1000 ) # default B = 10000

# Sensory analysis
data(candy)
plot( SMI(candy$Panel1, candy$Panel2, 3,3, projection = "Procrustes"),
    frame = c(2,2), B = 1000, x1lab = "Panel1", x2lab = "Panel2" ) # default B = 10000

# Missing data (100 missing completely at random points each)
X1[sort(round(runif(100)*29999+1))] <- NA
X2[sort(round(runif(100)*29999+1))] <- NA
(smi <- SMI(X1,X2,5,5, impute = TRUE))
