Title: | Tools for Exploring Multivariate Data via ICS/ICA |
---|---|
Description: | Implementation of Tyler, Critchley, Duembgen and Oja's (JRSS B, 2009, <doi:10.1111/j.1467-9868.2009.00706.x>) and Oja, Sirkia and Eriksson's (AJS, 2006, <https://www.ajs.or.at/index.php/ajs/article/view/vol35,%20no2%263%20-%207>) method of two different scatter matrices to obtain an invariant coordinate system or independent components, depending on the underlying assumptions. |
Authors: | Klaus Nordhausen [aut, cre] , Andreas Alfons [aut] , Aurore Archimbaud [aut] , Hannu Oja [aut] , Anne Ruiz-Gazen [aut] , David E. Tyler [aut] |
Maintainer: | Klaus Nordhausen <[email protected]> |
License: | GPL (>= 2) |
Version: | 1.4-1 |
Built: | 2024-12-14 06:46:46 UTC |
Source: | CRAN |
Implementation of Tyler, Critchley, Duembgen and Oja's (JRSS B, 2009, <doi:10.1111/j.1467-9868.2009.00706.x>) and Oja, Sirkia and Eriksson's (AJS, 2006, <https://www.ajs.or.at/index.php/ajs/article/view/vol35,%20no2%263%20-%207>) method of two different scatter matrices to obtain an invariant coordinate system or independent components, depending on the underlying assumptions.
Package: | ICS |
Type: | Package |
Title: | Tools for Exploring Multivariate Data via ICS/ICA |
Version: | 1.4-1 |
Date: | 2023-09-17 |
Authors@R: | c(person("Klaus", "Nordhausen", email = "[email protected]", role = c("aut", "cre"), comment = c(ORCID = "0000-0002-3758-8501")), person("Andreas", "Alfons", email = "[email protected]", role = "aut", comment = c(ORCID = "0000-0002-2513-3788")), person("Aurore", "Archimbaud", email = "[email protected]", role = "aut", comment = c(ORCID = "0000-0002-6511-9091")), person("Hannu", "Oja", email = "", role = c("aut"), comment = c(ORCID = "0000-0002-4945-5976")), person("Anne", "Ruiz-Gazen", email = "[email protected]", role = "aut", comment = c(ORCID = "0000-0001-8970-8061")), person("David E.", "Tyler", email = "", role = c("aut"))) |
Author: | Klaus Nordhausen [aut, cre] (<https://orcid.org/0000-0002-3758-8501>), Andreas Alfons [aut] (<https://orcid.org/0000-0002-2513-3788>), Aurore Archimbaud [aut] (<https://orcid.org/0000-0002-6511-9091>), Hannu Oja [aut] (<https://orcid.org/0000-0002-4945-5976>), Anne Ruiz-Gazen [aut] (<https://orcid.org/0000-0001-8970-8061>), David E. Tyler [aut] |
Maintainer: | Klaus Nordhausen <[email protected]> |
Depends: | R (>= 2.5.0), methods, mvtnorm |
Imports: | survey, graphics |
Suggests: | pixmap, robustbase, MASS, ICSNP, testthat (>= 3.0.0), ICSOutlier |
Description: | Implementation of Tyler, Critchley, Duembgen and Oja's (JRSS B, 2009, <doi:10.1111/j.1467-9868.2009.00706.x>) and Oja, Sirkia and Eriksson's (AJS, 2006, <https://www.ajs.or.at/index.php/ajs/article/view/vol35,%20no2%263%20-%207>) method of two different scatter matrices to obtain an invariant coordinate system or independent components, depending on the underlying assumptions. |
License: | GPL (>= 2) |
LazyLoad: | yes |
Encoding: | UTF-8 |
NeedsCompilation: | no |
RoxygenNote: | 7.2.3 |
Config/testthat/edition: | 3 |
Packaged: | 2023-09-20 17:23:07 UTC; admin |
Repository: | CRAN |
Date/Publication: | 2023-09-21 07:10:02 UTC |
Config/pak/sysreqs: | make |
Some multivariate tests and estimates are not affine equivariant by nature. A possible remedy for the lack of that property is to transform the data points to an invariant coordinate system,
construct tests and estimates from the transformed data, and if needed, retransform the estimates back. The use of two different
scatter matrices to obtain invariant coordinates is implemeted in this package by the function ICS
. For an invariant coordinate selection no
assumptions are made about the data or the scatter matrices and it can be seen as a data transformation method. If the data come, however, from a so called independent component model
the ICS
function can recover the independent components and estimate the mixing matrix under general assumptions.
Besides, the function ICS
provides these package tools to work with objects of this class, and some
scatter matrices which can be used in the ICS
function. Furthermore, there are also two tests for multinormality.
Note that starting with version 1.4-0 the functions ics
and ics2
are not recommended anymore and everything can be done in a more efficient way using the function ICS
which combines
the functionality of the original two functions and also provides an improved algorithm for certain scatter combinations. Furthermore, does ICS
return
an S3 object and not anymore S4 objects as ics
and ics2
did. In the long run functions ics
and ics2
will be removed from the package.
Index of help topics:
ICS-S3 Two Scatter Matrices ICS Transformation ICS-package Tools for Exploring Multivariate Data via ICS/ICA ICS_scatter Location and Scatter Estimates for ICS Mean3Cov4 Location Vector Based on 3rd Moments and Scatter Matrix Based on 4th Moments MeanCov Mean Vector and Covariance Matrix coef.ICS-S3 To extract the Coefficient Matrix of the ICS Transformation coef.ics To extract the Unmixing Matrix components To extract the Component Scores of the ICS Transformation cov4 Scatter Matrix based on Fourth Moments cov4.wt Weighted Scatter Matrix based on Fourth Moments covAxis One step Tyler Shape Matrix covOrigin Covariance Matrix with Respect to the Origin covW One-step M-estimator fitted.ICS-S3 Fitted Values of the ICS Transformation fitted.ics Fitted Values of an ICS Object gen_kurtosis To extract the Generalized Kurtosis Values of the ICS Transformation ics Two Scatter Matrices ICS Transformation ics-class Class ICS ics.components Extracting ICS Components ics2 Two Scatter Matrices ICS Transformation Augmented by Two Location Estimates ics2-class Class ICS2 mean3 Location Estimate based on Third Moments mvnorm.kur.test Test of Multivariate Normality Based on Kurtosis mvnorm.skew.test Test of Multivariate Normality Based on Skewness plot.ICS-S3 Scatterplot Matrix of Component Scores from the ICS Transformation plot.ics Scatterplot for a ICS Object print.ICS-S3 Basic information of ICS Object print.ics Basic information of ICS Object print.ics2 Basic information of ICS2 Object scovq Supervised scatter matrix based on quantiles screeplot.ICS-S3 Screeplot for an 'ICS' Object screeplot.ics Screeplot for an ICS Object summary.ICS-S3 To summarize an 'ICS' object summary.ics To summarize an ICS object summary.ics2 To summarize an ICS2 object tM Joint M-estimation of Location and Scatter for a Multivariate t-distribution
Further information is available in the following vignettes:
ICS |
Tools for Exploring Multivariate Data: The Package ICS (source, pdf) |
Klaus Nordhausen [aut, cre] (<https://orcid.org/0000-0002-3758-8501>), Andreas Alfons [aut] (<https://orcid.org/0000-0002-2513-3788>), Aurore Archimbaud [aut] (<https://orcid.org/0000-0002-6511-9091>), Hannu Oja [aut] (<https://orcid.org/0000-0002-4945-5976>), Anne Ruiz-Gazen [aut] (<https://orcid.org/0000-0001-8970-8061>), David E. Tyler [aut]
Maintainer: Klaus Nordhausen <[email protected]>
Tyler, D.E., Critchley, F., Dümbgen, L. and Oja, H. (2009), Invariant co-ordinate selecetion, Journal of the Royal Statistical Society,Series B, 71, 549–592. <doi:10.1111/j.1467-9868.2009.00706.x>.
Oja, H., Sirkiä, S. and Eriksson, J. (2006), Scatter matrices and independent component analysis, Austrian Journal of Statistics, 35, 175–189.
Nordhausen, K., Oja, H. and Tyler, D.E. (2008), Tools for exploring multivariate data: The package ICS, Journal of Statistical Software, 28, 1–31. <doi:10.18637/jss.v028.i06>.
Archimbaud, A., Drmac, Z., Nordhausen, K., Radojicic, U. and Ruiz-Gazen, A. (2023), Numerical considerations and a new implementation for ICS, SIAM Journal on Mathematics of Data Science, 5, 97–121. <doi:10.1137/22M1498759>.
Extracts the unmixing matrix of a class ics
object.
## S4 method for signature 'ics' coef(object)
## S4 method for signature 'ics' coef(object)
object |
object of class |
The unmixing matrix of a class ics
object.
Klaus Nordhausen
Extracts the coefficient matrix of a linear transformation to an invariant coordinate system. Each row of the matrix contains the coefficients of the transformation to the corresponding component.
## S3 method for class 'ICS' coef(object, select = NULL, drop = FALSE, index = NULL, ...)
## S3 method for class 'ICS' coef(object, select = NULL, drop = FALSE, index = NULL, ...)
object |
an object inheriting from class |
select |
an integer, character, or logical vector specifying for which
components to extract the coefficients, or |
drop |
a logical indicating whether to return a vector rather than a
matrix in case coefficients are extracted for a single component (default
to |
index |
an integer vector specifying for which components to extract
the coefficients, or |
... |
additional arguments are ignored. |
A numeric matrix or vector containing the coefficients for the requested components.
Andreas Alfons and Aurore Archimbaud
ICS()
gen_kurtosis()
, components()
,
fitted()
, and plot()
methods
data("iris") X <- iris[,-5] out <- ICS(X) coef(out) coef(out, select = c(1,4)) coef(out, select = 1, drop = FALSE)
data("iris") X <- iris[,-5] out <- ICS(X) coef(out) coef(out, select = c(1,4)) coef(out, select = 1, drop = FALSE)
Extracts the components scores of an invariant coordinate system obtained via an ICS transformation.
components(x, ...) ## S3 method for class 'ICS' components(x, select = NULL, drop = FALSE, index = NULL, ...)
components(x, ...) ## S3 method for class 'ICS' components(x, select = NULL, drop = FALSE, index = NULL, ...)
x |
an object inheriting from class |
... |
additional arguments to be passed down. |
select |
an integer, character, or logical vector specifying which
components to extract, or |
drop |
a logical indicating whether to return a vector rather than a
matrix in case a single component is extracted (default to |
index |
an integer vector specifying which components to extract, or
|
A numeric matrix or vector containing the requested components.
Andreas Alfons and Aurore Archimbaud
ICS()
gen_kurtosis()
, coef()
,
fitted()
, and plot()
methods
data("iris") X <- iris[,-5] out <- ICS(X) components(out) components(out, select = c(1,4)) components(out, select = 1, drop = FALSE)
data("iris") X <- iris[,-5] out <- ICS(X) components(out) components(out, select = c(1,4)) components(out, select = 1, drop = FALSE)
Estimates the scatter matrix based on the 4th moments of the data.
cov4(X, location = "Mean", na.action = na.fail)
cov4(X, location = "Mean", na.action = na.fail)
X |
numeric data matrix or dataframe, missing values are not allowed. |
location |
can be either |
na.action |
a function which indicates what should happen when the data contain 'NA's. Default is to fail. |
If location is Mean
the scatter matrix of 4th moments is computed wrt to the sample mean.
For location = Origin
it is the scatter matrix of 4th moments wrt to the origin.
The scatter matrix is standardized in such a way to be consistent for the regular covariance matrix at the multinormal model.
It is given for matrix X by
where is the mean vector and
the regular covariance matrix.
A matrix containing the estimated fourth moments scatter.
Klaus Nordhausen
Cardoso, J.F. (1989), Source separation using higher order moments, in Proc. IEEE Conf. on Acoustics, Speech and Signal Processing (ICASSP'89), 2109–2112. <doi:10.1109/ICASSP.1989.266878>.
Oja, H., Sirki?, S. and Eriksson, J. (2006), Scatter matrices and independent component analysis, Austrian Journal of Statistics, 35, 175–189.
set.seed(654321) cov.matrix <- matrix(c(3,2,1,2,4,-0.5,1,-0.5,2), ncol=3) X <- rmvnorm(100, c(0,0,0), cov.matrix) cov4(X) cov4(X, location="Origin") rm(.Random.seed)
set.seed(654321) cov.matrix <- matrix(c(3,2,1,2,4,-0.5,1,-0.5,2), ncol=3) X <- rmvnorm(100, c(0,0,0), cov.matrix) cov4(X) cov4(X, location="Origin") rm(.Random.seed)
Estimates the weighted scatter matrix based on the 4th moments of the data.
cov4.wt(x, wt = rep(1/nrow(x), nrow(x)), location = TRUE, method = "ML", na.action = na.fail)
cov4.wt(x, wt = rep(1/nrow(x), nrow(x)), location = TRUE, method = "ML", na.action = na.fail)
x |
numeric data matrix or dataframe. |
wt |
numeric vector of non-negative weights. At least some weights must be larger than zero. |
location |
|
method |
Either |
na.action |
a function which indicates what should happen when the data contain 'NA's. Default is to fail. |
If location = TRUE
, then the scatter matrix is given for a data matrix X by
where are the weights standardized such that
,
is the weighted mean vector and
the weighted covariance matrix.
For details about the weighted mean vector and weighted covariance matrix see
cov.wt
.
A matrix containing the estimated weighted fourth moments scatter.
Klaus Nordhausen
cov.matrix.1 <- matrix(c(3,2,1,2,4,-0.5,1,-0.5,2), ncol=3) X.1 <- rmvnorm(100, c(0,0,0), cov.matrix.1) cov.matrix.2 <- diag(1,3) X.2 <- rmvnorm(50, c(1,1,1), cov.matrix.2) X <- rbind(X.1, X.2) cov4.wt(X, rep(c(0,1), c(100,50))) cov4.wt(X, rep(c(1,0), c(100,50)))
cov.matrix.1 <- matrix(c(3,2,1,2,4,-0.5,1,-0.5,2), ncol=3) X.1 <- rmvnorm(100, c(0,0,0), cov.matrix.1) cov.matrix.2 <- diag(1,3) X.2 <- rmvnorm(50, c(1,1,1), cov.matrix.2) X <- rbind(X.1, X.2) cov4.wt(X, rep(c(0,1), c(100,50))) cov4.wt(X, rep(c(1,0), c(100,50)))
This matrix can be used to get the principal axes from ics
,
which is then known as principal axis analysis.
covAxis(X, na.action = na.fail)
covAxis(X, na.action = na.fail)
X |
numeric data matrix or dataframe. |
na.action |
a function which indicates what should happen when the data contain 'NA's. Default is to fail. |
The covAxis
matrix is a given for a
data matrix X as
where is the mean vector and
the regular covariance matrix.
covAxis
can be used to perform a Prinzipal Axis Analysis (Critchley et al. 2006) using the function ics
.
In that case, for a centered data matrix X, covAxis
can be used as S2 in ics
, where S1 should be in that
case the regular covariance matrix.
A matrix containing the estimated one step Tyler shape matrix.
Klaus Nordhausen
Critchley , F., Pires, A. and Amado, C. (2006), Principal axis analysis, Technical Report, 06/14, The Open University Milton Keynes.
Tyler, D.E., Critchley, F., D?mbgen, L. and Oja, H. (2009), Invariant co-ordinate selecetion, Journal of the Royal Statistical Society,Series B, 71, 549–592. <doi:10.1111/j.1467-9868.2009.00706.x>.
data(iris) iris.centered <- sweep(iris[,1:4], 2, colMeans(iris[,1:4]), "-") iris.paa <- ics(iris.centered, cov, covAxis, stdKurt = FALSE) summary(iris.paa) plot(iris.paa, col=as.numeric(iris[,5])) mean(iris.paa@gKurt) emp.align <- iris.paa@gKurt emp.align screeplot(iris.paa) abline(h = 1)
data(iris) iris.centered <- sweep(iris[,1:4], 2, colMeans(iris[,1:4]), "-") iris.paa <- ics(iris.centered, cov, covAxis, stdKurt = FALSE) summary(iris.paa) plot(iris.paa, col=as.numeric(iris[,5])) mean(iris.paa@gKurt) emp.align <- iris.paa@gKurt emp.align screeplot(iris.paa) abline(h = 1)
Estimates the covariance matrix with respect to the origin.
covOrigin(X, location = NULL, na.action = na.fail)
covOrigin(X, location = NULL, na.action = na.fail)
X |
a numeric data matrix or dataframe. |
location |
optional location value which serves as the center instead of the origin. |
na.action |
a function which indicates what should happen when the data contain 'NA's. Default is to fail. |
The covariance matrix with respect to origin is given for a matrix X with n observations by
A matrix containing the estimated covariance matrix with respect to the origin.
Klaus Nordhausen
set.seed(654321) cov.matrix <- matrix(c(3,2,1,2,4,-0.5,1,-0.5,2), ncol=3) X <- rmvnorm(100,c(0,0,0),cov.matrix) covOrigin(X) rm(.Random.seed)
set.seed(654321) cov.matrix <- matrix(c(3,2,1,2,4,-0.5,1,-0.5,2), ncol=3) X <- rmvnorm(100,c(0,0,0),cov.matrix) covOrigin(X) rm(.Random.seed)
Estimates the scatter matrix based on one-step M-estimator using mean and covariance matrix as starting point.
covW(X, na.action = na.fail, alpha = 1, cf = 1)
covW(X, na.action = na.fail, alpha = 1, cf = 1)
X |
numeric |
na.action |
a function which indicates what should happen when the data contain 'NA's. Default is to fail. |
alpha |
parameter of the one-step M-estimator. By default equals to 1. |
cf |
consistency factor of the one-step M-estimator. By default equals to 1. |
It is given for matrix
by
where is the mean vector,
is the squared
Mahalanobis distance,
is a
non-negative and continuous weight function and
is a consistency factor.
Note that the consistency factor, which makes the estimator consistent at the multivariate normal distribution, is in most case unknown and therefore the default is to use simply
cf = 1
.
If , we get the covariance matrix
cov()
(up to the factor
instead of
).
If , we get the
covAxis()
.
If , we get the
cov4()
with .
A matrix containing the one-step M-scatter.
Aurore Archimbaud and Klaus Nordhausen
Archimbaud, A., Drmac, Z., Nordhausen, K., Radojicic, U. and Ruiz-Gazen, A. (2023). SIAM Journal on Mathematics of Data Science (SIMODS), Vol.5(1):97–121. doi:10.1137/22M1498759.
data(iris) X <- iris[,1:4] # Equivalence with covAxis covW(X, alpha = -1, cf = ncol(X)) covAxis(X) # Equivalence with cov4 covW(X, alpha = 1, cf = 1/(ncol(X)+2)) cov4(X) # covW with alpha = 0.5 covW(X, alpha = 0.5)
data(iris) X <- iris[,1:4] # Equivalence with covAxis covW(X, alpha = -1, cf = ncol(X)) covAxis(X) # Equivalence with cov4 covW(X, alpha = 1, cf = 1/(ncol(X)+2)) cov4(X) # covW with alpha = 0.5 covW(X, alpha = 0.5)
Computes the fitted values of an ics
object.
## S4 method for signature 'ics' fitted(object,index=NULL)
## S4 method for signature 'ics' fitted(object,index=NULL)
object |
object of class |
index |
A vector which defines which components should be used to compute the fitted values. The default NULL uses all components. |
Returns a dataframe with the fitted values.
Klaus Nordhausen
set.seed(123456) X1 <- rmvnorm(250, rep(0,8), diag(c(rep(1,6),0.04,0.04))) X2 <- rmvnorm(50, c(rep(0,6),2,0), diag(c(rep(1,6),0.04,0.04))) X3 <- rmvnorm(200, c(rep(0,7),2), diag(c(rep(1,6),0.04,0.04))) X.comps <- rbind(X1,X2,X3) A <- matrix(rnorm(64),nrow=8) X <- X.comps %*% t(A) ics.X.1 <- ics(X) fitted(ics.X.1) fitted(ics.X.1,index=c(1,2,3,6,7,8)) rm(.Random.seed)
set.seed(123456) X1 <- rmvnorm(250, rep(0,8), diag(c(rep(1,6),0.04,0.04))) X2 <- rmvnorm(50, c(rep(0,6),2,0), diag(c(rep(1,6),0.04,0.04))) X3 <- rmvnorm(200, c(rep(0,7),2), diag(c(rep(1,6),0.04,0.04))) X.comps <- rbind(X1,X2,X3) A <- matrix(rnorm(64),nrow=8) X <- X.comps %*% t(A) ics.X.1 <- ics(X) fitted(ics.X.1) fitted(ics.X.1,index=c(1,2,3,6,7,8)) rm(.Random.seed)
Computes the fitted values based on an invariant coordinate system obtained via an ICS transformation. When using all components, computing the fitted values constitutes a backtransformation to the observed data. When using fewer components, the fitted values can often be viewed as reconstructions of the observed data with noise removed.
## S3 method for class 'ICS' fitted(object, select = NULL, index = NULL, ...)
## S3 method for class 'ICS' fitted(object, select = NULL, index = NULL, ...)
object |
an object inheriting from class |
select |
an integer, character, or logical vector specifying which
components to use for computing the fitted values, or |
index |
an integer vector specifying which components to use for
computing the fitted values, or |
... |
additional arguments are ignored. |
A numeric matrix containing the fitted values.
Andreas Alfons and Aurore Archimbaud
ICS()
gen_kurtosis()
, coef()
,
components()
, and plot()
methods
data("iris") X <- iris[,-5] out <- ICS(X) fitted(out) fitted(out, select = 4)
data("iris") X <- iris[,-5] out <- ICS(X) fitted(out) fitted(out, select = 4)
Extracts the generalized kurtosis values of the components obtained via an ICS transformation.
gen_kurtosis(object, ...) ## S3 method for class 'ICS' gen_kurtosis(object, select = NULL, scale = FALSE, index = NULL, ...)
gen_kurtosis(object, ...) ## S3 method for class 'ICS' gen_kurtosis(object, select = NULL, scale = FALSE, index = NULL, ...)
object |
an object inheriting from class |
... |
additional arguments to be passed down. |
select |
an integer, character, or logical vector specifying for which
components to extract the generalized kurtosis values, or |
scale |
a logical indicating whether to scale the generalized kurtosis
values to have product 1 (default to |
index |
an integer vector specifying for which components to extract
the generalized kurtosis values, or |
The argument scale
is useful when ICS is performed with shape
matrices rather than true scatter matrices. Let and
denote the scatter or shape matrices used in ICS.
If both and
are true scatter matrices, their
order in principal does not matter. Changing their order will just reverse
the order of the components and invert the corresponding generalized
kurtosis values.
The same does not hold when at least one of them is a shape matrix rather than a true scatter matrix. In that case, changing their order will also reverse the order of the components, but the ratio of the generalized kurtosis values is no longer 1 but only a constant. This is due to the fact that when shape matrices are used, the generalized kurtosis values are only relative ones. It is then useful to scale the generalized kurtosis values such that their product is 1.
A numeric vector containing the generalized kurtosis values of the requested components.
Andreas Alfons and Aurore Archimbaud
ICS()
coef()
, components()
,
fitted()
, and plot()
methods
data("iris") X <- iris[,-5] out <- ICS(X) gen_kurtosis(out) gen_kurtosis(out, scale = TRUE) gen_kurtosis(out, select = c(1,4))
data("iris") X <- iris[,-5] out <- ICS(X) gen_kurtosis(out) gen_kurtosis(out, scale = TRUE) gen_kurtosis(out, select = c(1,4))
Implements the two scatter matrices transformation to obtain an invariant coordinate sytem or independent components, depending on the underlying assumptions.
ics(X, S1 = cov, S2 = cov4, S1args = list(), S2args = list(), stdB = "Z", stdKurt = TRUE, na.action = na.fail)
ics(X, S1 = cov, S2 = cov4, S1args = list(), S2args = list(), stdB = "Z", stdKurt = TRUE, na.action = na.fail)
X |
numeric data matrix or dataframe. |
S1 |
name of the first scatter matrix function or a scatter matrix. Default is the regular covariance matrix. |
S2 |
name of the second scatter matrix or a scatter matrix. Default is the covariance matrix based on forth order moments. Note that the type of S2 must be the same as S1. |
S1args |
list with optional additional arguments for S1. Only considered if S1 is a function. |
S2args |
list with optional additional arguments for S2. Only considered if S2 is a function. |
stdB |
either "B" or "Z". Defines the way to standardize the matrix B. Default is "Z". Details are given below. |
stdKurt |
Logical, either "TRUE" or "FALSE". Specifies weather the product of the kurtosis values is 1 or not. |
na.action |
a function which indicates what should happen when the data contain 'NA's. Default is to fail. |
Seeing this function as a tool for data transformation the result is an invariant coordinate selection which can be used for test and estimation. And if needed the results can be easily retransformed to the original scale. It is possible to use it also for dimension reduction, in order to find outliers or clusters in the data. The function can, also be used in a modelling framework. In this case it is assumed that the data were created by mixing independent components which have different kurtosis values. If the two scatter matrices used have the so-called independence property the function can recover the independent components by estimating the unmixing matrix.
By default S1 is the regular covariance matrix cov
and S2 the matrix of fourth moments cov4
. However those can be replaced with any other
scatter matrix the user prefers. The package ICS offers for example also cov4.wt
, covAxis
, covOrigin
, covW
or tM
and the ICSNP offers further scatters as duembgen.shape
, tyler.shape
, HR.Mest
or HP1.shape
.
But of course also scatters from any other package can be used.
Note that when function names are submitted, the function should return only a scatter matrix. If the function returns more, the scatter should be computed in advance or
a wrapper written that yields the required output. For example tM
returns a list with four elements where the scatter estimate is called V. A simple wrapper would then be
my.tm <- function(x, ...) tM(x, ...)$V
.
For a given choice of S1 and S2, the general idea of the ics
function is to find the unmixing matrix B and the invariant coordinates (independent coordinates)
Z in such a way, that:
The elements of Z are standardized with respect to S1 (S1(Z)=I).
The elements of Z are uncorrelated with respect to S2. (S2(Z)=D, where D is a diagonal matrix).
The elements of Z are ordered according to their generalized kurtosis.
Given those criteria, B is unique up to sign changes of its rows. The function provides two options to decide the exact form of B.
Method 'Z' standardizes B such, that all components are right skewed. The criterion used is the sign of each componentwise difference of mean vector and transformation retransformation median. This standardization is prefered in an invariant coordinate framework.
Method 'B' standardizes B independent of Z such that the maximum element per row is positive and each row has norm 1. Usual way in an independent component analysis framework.
In principal, if S1 and S2 are true scatter matrices the order does not matter. It will just reverse and invert the kurtosis value vector. This is however not true when one or both are shape matrices (and not both of them are scatter matrices). In this case the order of the kurtosis values is also reversed, the ratio however then is not 1 but only constant. This is due to the fact that when shape matrices are used, the kurtosis values are only relative ones. Therefore by the default the kurtosis values are standardized such that their product is 1. If no standardization is wanted, the 'stdKurt' argument should be used.
an object of class ics
.
Function ics()
reached the end of its lifecycle, please use ICS()
instead. In future versions, ics()
will be deprecated and eventually removed.
Klaus Nordhausen
Tyler, D.E., Critchley, F., D?mbgen, L. and Oja, H. (2009), Invariant co-ordinate selecetion, Journal of the Royal Statistical Society,Series B, 71, 549–592. <doi:10.1111/j.1467-9868.2009.00706.x>.
Oja, H., Sirki?, S. and Eriksson, J. (2006), Scatter matrices and independent component analysis, Austrian Journal of Statistics, 35, 175–189.
Nordhausen, K., Oja, H. and Tyler, D.E. (2008), Tools for exploring multivariate data: The package ICS, Journal of Statistical Software, 28, 1–31. <doi:10.18637/jss.v028.i06>.
# example using two functions set.seed(123456) X1 <- rmvnorm(250, rep(0,8), diag(c(rep(1,6),0.04,0.04))) X2 <- rmvnorm(50, c(rep(0,6),2,0), diag(c(rep(1,6),0.04,0.04))) X3 <- rmvnorm(200, c(rep(0,7),2), diag(c(rep(1,6),0.04,0.04))) X.comps <- rbind(X1,X2,X3) A <- matrix(rnorm(64),nrow=8) X <- X.comps %*% t(A) ics.X.1 <- ics(X) summary(ics.X.1) plot(ics.X.1) # compare to pairs(X) pairs(princomp(X,cor=TRUE)$scores) # slow: # library(ICSNP) # ics.X.2 <- ics(X, tyler.shape, duembgen.shape, S1args=list(location=0)) # summary(ics.X.2) # plot(ics.X.2) rm(.Random.seed) # example using two computed scatter matrices for outlier detection library(robustbase) ics.wood<-ics(wood,tM(wood)$V,tM(wood,2)$V) plot(ics.wood) # example using three pictures library(pixmap) fig1 <- read.pnm(system.file("pictures/cat.pgm", package = "ICS")[1]) fig2 <- read.pnm(system.file("pictures/road.pgm", package = "ICS")[1]) fig3 <- read.pnm(system.file("pictures/sheep.pgm", package = "ICS")[1]) p <- dim(fig1@grey)[2] fig1.v <- as.vector(fig1@grey) fig2.v <- as.vector(fig2@grey) fig3.v <- as.vector(fig3@grey) X <- cbind(fig1.v,fig2.v,fig3.v) set.seed(4321) A <- matrix(rnorm(9), ncol = 3) X.mixed <- X %*% t(A) ICA.fig <- ics(X.mixed) par.old <- par() par(mfrow=c(3,3), omi = c(0.1,0.1,0.1,0.1), mai = c(0.1,0.1,0.1,0.1)) plot(fig1) plot(fig2) plot(fig3) plot(pixmapGrey(X.mixed[,1],ncol=p)) plot(pixmapGrey(X.mixed[,2],ncol=p)) plot(pixmapGrey(X.mixed[,3],ncol=p)) plot(pixmapGrey(ics.components(ICA.fig)[,1],ncol=p)) plot(pixmapGrey(ics.components(ICA.fig)[,2],ncol=p)) plot(pixmapGrey(ics.components(ICA.fig)[,3],ncol=p)) par(par.old) rm(.Random.seed)
# example using two functions set.seed(123456) X1 <- rmvnorm(250, rep(0,8), diag(c(rep(1,6),0.04,0.04))) X2 <- rmvnorm(50, c(rep(0,6),2,0), diag(c(rep(1,6),0.04,0.04))) X3 <- rmvnorm(200, c(rep(0,7),2), diag(c(rep(1,6),0.04,0.04))) X.comps <- rbind(X1,X2,X3) A <- matrix(rnorm(64),nrow=8) X <- X.comps %*% t(A) ics.X.1 <- ics(X) summary(ics.X.1) plot(ics.X.1) # compare to pairs(X) pairs(princomp(X,cor=TRUE)$scores) # slow: # library(ICSNP) # ics.X.2 <- ics(X, tyler.shape, duembgen.shape, S1args=list(location=0)) # summary(ics.X.2) # plot(ics.X.2) rm(.Random.seed) # example using two computed scatter matrices for outlier detection library(robustbase) ics.wood<-ics(wood,tM(wood)$V,tM(wood,2)$V) plot(ics.wood) # example using three pictures library(pixmap) fig1 <- read.pnm(system.file("pictures/cat.pgm", package = "ICS")[1]) fig2 <- read.pnm(system.file("pictures/road.pgm", package = "ICS")[1]) fig3 <- read.pnm(system.file("pictures/sheep.pgm", package = "ICS")[1]) p <- dim(fig1@grey)[2] fig1.v <- as.vector(fig1@grey) fig2.v <- as.vector(fig2@grey) fig3.v <- as.vector(fig3@grey) X <- cbind(fig1.v,fig2.v,fig3.v) set.seed(4321) A <- matrix(rnorm(9), ncol = 3) X.mixed <- X %*% t(A) ICA.fig <- ics(X.mixed) par.old <- par() par(mfrow=c(3,3), omi = c(0.1,0.1,0.1,0.1), mai = c(0.1,0.1,0.1,0.1)) plot(fig1) plot(fig2) plot(fig3) plot(pixmapGrey(X.mixed[,1],ncol=p)) plot(pixmapGrey(X.mixed[,2],ncol=p)) plot(pixmapGrey(X.mixed[,3],ncol=p)) plot(pixmapGrey(ics.components(ICA.fig)[,1],ncol=p)) plot(pixmapGrey(ics.components(ICA.fig)[,2],ncol=p)) plot(pixmapGrey(ics.components(ICA.fig)[,3],ncol=p)) par(par.old) rm(.Random.seed)
Computes a scatter matrix and an optional location vector to be used in transforming the data to an invariant coordinate system or independent components.
ICS_cov(x, location = TRUE) ICS_cov4(x, location = c("mean", "mean3", "none")) ICS_covW(x, location = TRUE, alpha = 1, cf = 1) ICS_covAxis(x, location = TRUE) ICS_tM(x, location = TRUE, df = 1, ...) ICS_scovq(x, y, ...)
ICS_cov(x, location = TRUE) ICS_cov4(x, location = c("mean", "mean3", "none")) ICS_covW(x, location = TRUE, alpha = 1, cf = 1) ICS_covAxis(x, location = TRUE) ICS_tM(x, location = TRUE, df = 1, ...) ICS_scovq(x, y, ...)
x |
a numeric matrix or data frame. |
location |
for |
alpha |
parameter of the one-step M-estimator (default to 1). |
cf |
consistency factor of the one-step M-estimator (default to 1). |
df |
assumed degrees of freedom of the t-distribution (default to 1, which corresponds to the Cauchy distribution). |
... |
additional arguments to be passed down to |
y |
numerical vector specifying the dependent variable. |
ICS_cov()
is a wrapper for the sample covariance matrix as computed
by cov()
.
ICS_cov4()
is a wrapper for the scatter matrix based on fourth
moments as computed by cov4()
. Note that the scatter matrix
is always computed with respect to the sample mean, even though the returned
location component can be specified to be based on third moments as computed
by mean3()
. Setting a location component other than the
sample mean can be used to fix the signs of the invariant coordinates in
ICS()
based on generalized skewness values, for instance
when using the scatter pair ICS_cov()
and ICS_cov4()
.
ICS_covW()
is a wrapper for the one-step M-estimator of scatter as
computed by covW()
.
ICS_covAxis()
is a wrapper for the one-step Tyler shape matrix as
computed by covAxis()
, which is can be used to perform
Principal Axis Analysis.
ICS_tM()
is a wrapper for the M-estimator of location and scatter
for a multivariate t-distribution, as computed by tM()
.
ICS_scovq()
is a wrapper for the supervised scatter matrix based
on quantiles scatter, as computed by scovq()
.
An object of class "ICS_scatter"
with the following
components:
location |
if requested, a numeric vector giving the location estimate. |
scatter |
a numeric matrix giving the estimate of the scatter matrix. |
label |
a character string providing a label for the scatter matrix. |
Andreas Alfons and Aurore Archimbaud
Arslan, O., Constable, P.D.L. and Kent, J.T. (1995) Convergence behaviour of the EM algorithm for the multivariate t-distribution, Communications in Statistics, Theory and Methods, 24(12), 2981–3000. doi:10.1080/03610929508831664.
Critchley, F., Pires, A. and Amado, C. (2006) Principal Axis Analysis. Technical Report, 06/14. The Open University, Milton Keynes.
Kent, J.T., Tyler, D.E. and Vardi, Y. (1994) A curious likelihood identity for the multivariate t-distribution, Communications in Statistics, Simulation and Computation, 23(2), 441–453. doi:10.1080/03610919408813180.
Oja, H., Sirkia, S. and Eriksson, J. (2006) Scatter Matrices and Independent Component Analysis. Austrian Journal of Statistics, 35(2&3), 175-189.
Tyler, D.E., Critchley, F., Duembgen, L. and Oja, H. (2009) Invariant Co-ordinate Selection. Journal of the Royal Statistical Society, Series B, 71(3), 549–592. doi:10.1111/j.1467-9868.2009.00706.x.
ICS()
cov()
, cov4()
, covW()
,
covAxis()
, tM()
, scovq()
data("iris") X <- iris[,-5] ICS_cov(X) ICS_cov4(X) ICS_covW(X, alpha = 1, cf = 1/(ncol(X)+2)) ICS_covAxis(X) ICS_tM(X) # The number of explaining variables p <- 10 # The number of observations n <- 400 # The error variance sigma <- 0.5 # The explaining variables X <- matrix(rnorm(p*n),n,p) # The error term epsilon <- rnorm(n, sd = sigma) # The response y <- X[,1]^2 + X[,2]^2*epsilon ICS_scovq(X, y = y)
data("iris") X <- iris[,-5] ICS_cov(X) ICS_cov4(X) ICS_covW(X, alpha = 1, cf = 1/(ncol(X)+2)) ICS_covAxis(X) ICS_tM(X) # The number of explaining variables p <- 10 # The number of observations n <- 400 # The error variance sigma <- 0.5 # The explaining variables X <- matrix(rnorm(p*n),n,p) # The error term epsilon <- rnorm(n, sd = sigma) # The response y <- X[,1]^2 + X[,2]^2*epsilon ICS_scovq(X, y = y)
A S4 class to store results from an invariant coordinate system transformation or independent component computation based on two scatter matrices.
Objects can be created by calls of the form new("ics", ...)
. But usually objects are created by the function ics
.
gKurt
:Object of class "numeric"
. Gives the generalized kurtosis measures of the components
UnMix
:Object of class "matrix"
. The unmixing matrix.
S1
:Object of class "matrix"
. The first scatter matrix.
S2
:Object of class "matrix"
. The second scatter matrix.
S1name
:Object of class "character"
. Name of the first scatter matrix.
S2name
:Object of class "character"
. Name of the second scatter matrix.
Scores
:Object of class "data.frame"
. The underlying components in the invariant coordinate system.
DataNames
:Object of class "character"
. Names of the original variables.
StandardizeB
:Object of class "character"
. Names standardization method for UnMix.
StandardizegKurt
:Object of class "logical"
. States wether the generalized kurtosis is standardized or not.
For this class the following generic functions are available: print.ics
, summary.ics
, coef.ics
, fitted.ics
and plot.ics
In case no extractor function for the slots exists, the component can be extracted the usual way using '@'.
Klaus Nordhausen
Transforms the data via two scatter matrices to an invariant coordinate
system or independent components, depending on the underlying assumptions.
Function ICS()
is intended as a replacement for ics()
and ics2()
, and it combines their functionality into a single
function. Importantly, the results are returned as an
S3
object rather than an
S4
object. Furthermore, ICS()
implements recent improvements, such as a numerically stable algorithm based
on the QR algorithm for a common family of scatter pairs.
ICS( X, S1 = ICS_cov, S2 = ICS_cov4, S1_args = list(), S2_args = list(), algorithm = c("whiten", "standard", "QR"), center = FALSE, fix_signs = c("scores", "W"), na.action = na.fail )
ICS( X, S1 = ICS_cov, S2 = ICS_cov4, S1_args = list(), S2_args = list(), algorithm = c("whiten", "standard", "QR"), center = FALSE, fix_signs = c("scores", "W"), na.action = na.fail )
X |
a numeric matrix or data frame containing the data to be transformed. |
S1 |
a numeric matrix containing the first scatter matrix, an object
of class |
S2 |
a numeric matrix containing the second scatter matrix, an object
of class |
S1_args |
a list containing additional arguments for |
S2_args |
a list containing additional arguments for |
algorithm |
a character string specifying with which algorithm
the invariant coordinate system is computed. Possible values are
|
center |
a logical indicating whether the invariant coordinates should
be centered with respect to first locattion or not (default to |
fix_signs |
a character string specifying how to fix the signs of the
invariant coordinates. Possible values are |
na.action |
a function to handle missing values in the data (default
to |
For a given scatter pair and
, a matrix
(in which the columns contain the scores of the respective invariant
coordinates) and a matrix
(in which the rows contain the
coefficients of the linear transformation to the respective invariant
coordinates) are found such that:
The columns of are whitened with respect to
. That is,
, where
denotes the identity matrix.
The columns of are uncorrelated with respect to
. That is,
, where
is a diagonal matrix.
The columns of are ordered according to their generalized
kurtosis.
Given those criteria, is unique up to sign changes in its rows. The
argument
fix_signs
provides two ways to ensure uniqueness of :
If argument fix_signs
is set to "scores"
, the signs
in are fixed such that the generalized skewness values of all
components are positive. If
S1
and S2
provide location
components, which are denoted by and
,
the generalized skewness values are computed as
.
Otherwise, the skewness is computed by subtracting the column medians of
from the corresponding column means so that all components are
right-skewed. This way of fixing the signs is preferred in an invariant
coordinate selection framework.
If argument fix_signs
is set to "W"
, the signs in
are fixed independently of
such that the maximum element
in each row of
is positive and that each row has norm 1. This is
the usual way of fixing the signs in an independent component analysis
framework.
In principal, the order of and
does not
matter if both are true scatter matrices. Changing their order will just
reverse the order of the components and invert the corresponding
generalized kurtosis values.
The same does not hold when at least one of them is a shape matrix rather than a true scatter matrix. In that case, changing their order will also reverse the order of the components, but the ratio of the generalized kurtosis values is no longer 1 but only a constant. This is due to the fact that when shape matrices are used, the generalized kurtosis values are only relative ones.
Different algorithms are available to compute the invariant coordinate
system of a data frame with
observations:
"whiten": whitens the data with respect to the first
scatter matrix before computing the second scatter matrix. If
S2
is not a function, whitening is not applicable.
whiten the data with respect to the first
scatter matrix:
compute for the uncorrelated data:
perform the eigendecomposition of :
compute :
"standard": performs the spectral decomposition of the
symmetric matrix
compute
perform the eigendecomposition of :
compute :
"QR": numerically stable algorithm based on the QR algorithm for a
common family of scatter pairs: if S1
is ICS_cov()
or cov()
, and if S2
is one of
ICS_cov4()
, ICS_covW()
, ICS_covAxis()
, cov4()
,
covW()
, or covAxis()
.
For other scatter pairs, the QR algorithm is not
applicable. See Archimbaud et al. (2023)
for details.
The "whiten" algorithm is the most natural version and therefore the default. The option "standard" should be only used if the scatters provided are not functions but precomputed matrices. The option "QR" is mainly of interest when there are numerical issues when "whiten" is used and the scatter combination allows its usage.
Note that when the purpose of ICS is outlier detection the package ICSOutlier
provides additional functionalities as does the package ICSClust
in case the
goal of ICS is dimension reduction prior clustering.
An object of class "ICS"
with the following components:
gen_kurtosis |
a numeric vector containing the generalized kurtosis values of the invariant coordinates. |
W |
a numeric matrix in which each row contains the coefficients of the linear transformation to the corresponding invariant coordinate. |
scores |
a numeric matrix in which each column contains the scores of the corresponding invariant coordinate. |
gen_skewness |
a numeric vector containing the (generalized) skewness
values of the invariant coordinates (only returned if
|
S1_label |
a character string providing a label for the first scatter matrix to be used by various methods. |
S2_label |
a character string providing a label for the second scatter matrix to be used by various methods. |
S1_args |
a list containing additional arguments used to compute
|
S2_args |
a list containing additional arguments used to compute
|
algorithm |
a character string specifying how the invariant coordinate is computed. |
center |
a logical indicating whether or not the data were centered with respect to the first location vector before computing the invariant coordinates. |
fix_signs |
a character string specifying how the signs of the invariant coordinates were fixed. |
Andreas Alfons and Aurore Archimbaud, based on code for
ics()
and ics2()
by Klaus Nordhausen
Tyler, D.E., Critchley, F., Duembgen, L. and Oja, H. (2009) Invariant Co-ordinate Selection. Journal of the Royal Statistical Society, Series B, 71(3), 549–592. doi:10.1111/j.1467-9868.2009.00706.x.
Archimbaud, A., Drmac, Z., Nordhausen, K., Radojcic, U. and Ruiz-Gazen, A. (2023) Numerical Considerations and a New Implementation for Invariant Coordinate Selection. SIAM Journal on Mathematics of Data Science, 5(1), 97–121. doi:10.1137/22M1498759.
gen_kurtosis()
, coef()
,
components()
, fitted()
, and
plot()
methods
# import data data("iris") X <- iris[,-5] # run ICS out_ICS <- ICS(X) out_ICS summary(out_ICS) # extract generalized eigenvalues gen_kurtosis(out_ICS) # Plot screeplot(out_ICS) # extract the components components(out_ICS) components(out_ICS, select = 1:2) # Plot plot(out_ICS) # equivalence with previous functions out_ics <- ics(X, S1 = cov, S2 = cov4, stdKurt = FALSE) out_ics out_ics2 <- ics2(X, S1 = MeanCov, S2 = Mean3Cov4) out_ics2 out_ICS # example using two functions X1 <- rmvnorm(250, rep(0,8), diag(c(rep(1,6),0.04,0.04))) X2 <- rmvnorm(50, c(rep(0,6),2,0), diag(c(rep(1,6),0.04,0.04))) X3 <- rmvnorm(200, c(rep(0,7),2), diag(c(rep(1,6),0.04,0.04))) X.comps <- rbind(X1,X2,X3) A <- matrix(rnorm(64),nrow=8) X <- X.comps %*% t(A) ics.X.1 <- ICS(X) summary(ics.X.1) plot(ics.X.1) # compare to pairs(X) pairs(princomp(X,cor=TRUE)$scores) # slow: if (require("ICSNP")) { ics.X.2 <- ICS(X, S1 = tyler.shape, S2 = duembgen.shape, S1_args = list(location=0)) summary(ics.X.2) plot(ics.X.2) # example using three pictures library(pixmap) fig1 <- read.pnm(system.file("pictures/cat.pgm", package = "ICS")[1], cellres = 1) fig2 <- read.pnm(system.file("pictures/road.pgm", package = "ICS")[1], cellres = 1) fig3 <- read.pnm(system.file("pictures/sheep.pgm", package = "ICS")[1], cellres = 1) p <- dim(fig1@grey)[2] fig1.v <- as.vector(fig1@grey) fig2.v <- as.vector(fig2@grey) fig3.v <- as.vector(fig3@grey) X <- cbind(fig1.v, fig2.v, fig3.v) A <- matrix(rnorm(9), ncol = 3) X.mixed <- X %*% t(A) ICA.fig <- ICS(X.mixed) par.old <- par() par(mfrow=c(3,3), omi = c(0.1,0.1,0.1,0.1), mai = c(0.1,0.1,0.1,0.1)) plot(fig1) plot(fig2) plot(fig3) plot(pixmapGrey(X.mixed[,1], ncol = p, cellres = 1)) plot(pixmapGrey(X.mixed[,2], ncol = p, cellres = 1)) plot(pixmapGrey(X.mixed[,3], ncol = p, cellres = 1)) plot(pixmapGrey(components(ICA.fig)[,1], ncol = p, cellres = 1)) plot(pixmapGrey(components(ICA.fig)[,2], ncol = p, cellres = 1)) plot(pixmapGrey(components(ICA.fig)[,3], ncol = p, cellres = 1)) }
# import data data("iris") X <- iris[,-5] # run ICS out_ICS <- ICS(X) out_ICS summary(out_ICS) # extract generalized eigenvalues gen_kurtosis(out_ICS) # Plot screeplot(out_ICS) # extract the components components(out_ICS) components(out_ICS, select = 1:2) # Plot plot(out_ICS) # equivalence with previous functions out_ics <- ics(X, S1 = cov, S2 = cov4, stdKurt = FALSE) out_ics out_ics2 <- ics2(X, S1 = MeanCov, S2 = Mean3Cov4) out_ics2 out_ICS # example using two functions X1 <- rmvnorm(250, rep(0,8), diag(c(rep(1,6),0.04,0.04))) X2 <- rmvnorm(50, c(rep(0,6),2,0), diag(c(rep(1,6),0.04,0.04))) X3 <- rmvnorm(200, c(rep(0,7),2), diag(c(rep(1,6),0.04,0.04))) X.comps <- rbind(X1,X2,X3) A <- matrix(rnorm(64),nrow=8) X <- X.comps %*% t(A) ics.X.1 <- ICS(X) summary(ics.X.1) plot(ics.X.1) # compare to pairs(X) pairs(princomp(X,cor=TRUE)$scores) # slow: if (require("ICSNP")) { ics.X.2 <- ICS(X, S1 = tyler.shape, S2 = duembgen.shape, S1_args = list(location=0)) summary(ics.X.2) plot(ics.X.2) # example using three pictures library(pixmap) fig1 <- read.pnm(system.file("pictures/cat.pgm", package = "ICS")[1], cellres = 1) fig2 <- read.pnm(system.file("pictures/road.pgm", package = "ICS")[1], cellres = 1) fig3 <- read.pnm(system.file("pictures/sheep.pgm", package = "ICS")[1], cellres = 1) p <- dim(fig1@grey)[2] fig1.v <- as.vector(fig1@grey) fig2.v <- as.vector(fig2@grey) fig3.v <- as.vector(fig3@grey) X <- cbind(fig1.v, fig2.v, fig3.v) A <- matrix(rnorm(9), ncol = 3) X.mixed <- X %*% t(A) ICA.fig <- ICS(X.mixed) par.old <- par() par(mfrow=c(3,3), omi = c(0.1,0.1,0.1,0.1), mai = c(0.1,0.1,0.1,0.1)) plot(fig1) plot(fig2) plot(fig3) plot(pixmapGrey(X.mixed[,1], ncol = p, cellres = 1)) plot(pixmapGrey(X.mixed[,2], ncol = p, cellres = 1)) plot(pixmapGrey(X.mixed[,3], ncol = p, cellres = 1)) plot(pixmapGrey(components(ICA.fig)[,1], ncol = p, cellres = 1)) plot(pixmapGrey(components(ICA.fig)[,2], ncol = p, cellres = 1)) plot(pixmapGrey(components(ICA.fig)[,3], ncol = p, cellres = 1)) }
Function to extract the ICS components of a ics
object.
ics.components(object)
ics.components(object)
object |
object of class |
Dataframe that contains the components.
Klaus Nordhausen
This function implements the two scatter matrices transformation to obtain an invariant coordinate sytem or independent
components, depending on the underlying assumptions. Differently to ics
here, there are also two location functionals used
to fix the signs of the components and to get a measure of skewness.
ics2(X, S1 = MeanCov, S2 = Mean3Cov4, S1args = list(), S2args = list(), na.action = na.fail)
ics2(X, S1 = MeanCov, S2 = Mean3Cov4, S1args = list(), S2args = list(), na.action = na.fail)
X |
numeric data matrix or dataframe. |
S1 |
name of the function which returns the first location vector T1 and scatter matrix S1. Can be also
a list which has these values already computed. See details for more information. Default is |
S2 |
name of the function which returns the second location vector T2 and scatter matrix S2. Can be also
a list which has these values already computed. See details for more information. Default is |
S1args |
list with optional additional arguments when calling function S1. |
S2args |
list with optional additional arguments when calling function S2. |
na.action |
a function which indicates what should happen when the data contain 'NA's. Default is to fail. |
For a general discussion about ICS see the help for ics
. The difference to ics
is that S1
and S2
are either functions which return a list containing a multivariate location and scatter computed on X
or lists containing these measures
computed in advance. Of importance for the resulting lists is that in both cases the location vector is the first element of the list and the scatter matrix
the second element. This means most multivariate location - scatter functions can be used directly without the need to write a wrapper.
The invariant coordinates Z are then computed such that (i) T1(Z) = 0, the origin. (ii) S1(Z) = I_p, the identity matrix. (iii) T2(Z) = S, where S is a vector having positive elements which can be seen as a generalized skewness measure (gSkew). (iv) S2(Z) = D, a diagonal matrix with descending elements which can be seen as a generalized kurtosis measure (gKurt).
Hence in this function there are no options to standardize Z or the transformation matrix B as everything is
specified by S1
and S2
.
Note also that ics2
makes hardly any input checks.
an object of class ics2
inheriting from class ics
.
Function ics2()
reached the end of its lifecycle, please use ICS()
instead. In future versions, ics2()
will be deprecated and eventually removed.
Klaus Nordhausen
Tyler, D.E., Critchley, F., Dümbgen, L. and Oja, H. (2009), Invariant co-ordinate selecetion, Journal of the Royal Statistical Society,Series B, 71, 549–592. <doi:10.1111/j.1467-9868.2009.00706.x>.
Nordhausen, K., Oja, H. and Ollila, E. (2011), Multivariate Models and the First Four Moments, In Hunter, D.R., Richards, D.S.R. and Rosenberger, J.L. (editors) "Nonparametric Statistics and Mixture Models: A Festschrift in Honor of Thomas P. Hettmansperger", 267–287, World Scientific, Singapore. <doi:10.1142/9789814340564_0016>.
set.seed(123456) X1 <- rmvnorm(250, rep(0,8), diag(c(rep(1,6),0.04,0.04))) X2 <- rmvnorm(50, c(rep(0,6),2,0), diag(c(rep(1,6),0.04,0.04))) X3 <- rmvnorm(200, c(rep(0,7),2), diag(c(rep(1,6),0.04,0.04))) X.comps <- rbind(X1,X2,X3) A <- matrix(rnorm(64),nrow=8) X <- X.comps %*% t(A) # the default ics2.X.1 <- ics2(X2) summary(ics2.X.1) # using another function as S2 not with its default ics2.X.2 <- ics2(X2, S2 = tM, S2args = list(df = 2)) summary(ics2.X.2) # computing in advance S2 and using another S1 Scauchy <- tM(X) ics2.X.2 <- ics2(X2, S1 = tM, S2 = Scauchy, S1args = list(df = 5)) summary(ics2.X.2) plot(ics2.X.2)
set.seed(123456) X1 <- rmvnorm(250, rep(0,8), diag(c(rep(1,6),0.04,0.04))) X2 <- rmvnorm(50, c(rep(0,6),2,0), diag(c(rep(1,6),0.04,0.04))) X3 <- rmvnorm(200, c(rep(0,7),2), diag(c(rep(1,6),0.04,0.04))) X.comps <- rbind(X1,X2,X3) A <- matrix(rnorm(64),nrow=8) X <- X.comps %*% t(A) # the default ics2.X.1 <- ics2(X2) summary(ics2.X.1) # using another function as S2 not with its default ics2.X.2 <- ics2(X2, S2 = tM, S2args = list(df = 2)) summary(ics2.X.2) # computing in advance S2 and using another S1 Scauchy <- tM(X) ics2.X.2 <- ics2(X2, S1 = tM, S2 = Scauchy, S1args = list(df = 5)) summary(ics2.X.2) plot(ics2.X.2)
A S4 class to store results from an invariant coordinate system transformation or independent component computation based on two scatter matrices and two location vectors.
Objects can be created by calls of the form new("ics2", ...)
. But usually objects are created by the function ics2
.
The Class inherits from the ics
class.
gSkew
:Object of class "numeric"
. Gives the generalized skewness measures of the components
gKurt
:Object of class "numeric"
. Gives the generalized kurtosis measures of the components
UnMix
:Object of class "matrix"
. The unmixing matrix.
S1
:Object of class "matrix"
. The first scatter matrix.
S2
:Object of class "matrix"
. The second scatter matrix.
T1
:Object of class "numeric"
. The first location vector.
T2
:Object of class "numeric"
. The second location vector.
S1name
:Object of class "character"
. Name of the first scatter matrix.
S2name
:Object of class "character"
. Name of the second scatter matrix.
S1args
:Object of class "list"
. Additional arguments needed when calling function S1.
S2args
:Object of class "list"
. Additional arguments needed when calling function S2.
Scores
:Object of class "data.frame"
. The underlying components in the invariant coordinate system.
DataNames
:Object of class "character"
. Names of the original variables.
StandardizeB
:Object of class "character"
. Names standardization method for UnMix.
StandardizegKurt
:Object of class "logical"
. States wether the generalized kurtosis is standardized or not.
For this class the following generic functions are available: print.ics2
, summary.ics2
But naturally the other methods like plot, coef, fitted and so from class ics work via inheritance.
In case no extractor function for the slots exists, the component can be extracted the usual way using '@'.
Klaus Nordhausen
Estimates the location based on third moments.
mean3(X, na.action = na.fail)
mean3(X, na.action = na.fail)
X |
numeric data matrix or dataframe with at least two columns. |
na.action |
a function which indicates what should happen when the data contain 'NA's. Default is to fail. |
This location estimate is defined for a matrix X as
where is the mean vector and
the regular covariance matrix.
A vector.
Klaus Nordhausen
Oja, H., Sirki?, S. and Eriksson, J. (2006), Scatter matrices and independent component analysis, Austrian Journal of Statistics, 35, 175–189.
set.seed(654321) cov.matrix <- matrix(c(3,2,1,2,4,-0.5,1,-0.5,2), ncol=3) X <- rmvnorm(100, c(0,0,0), cov.matrix) mean3(X) rm(.Random.seed)
set.seed(654321) cov.matrix <- matrix(c(3,2,1,2,4,-0.5,1,-0.5,2), ncol=3) X <- rmvnorm(100, c(0,0,0), cov.matrix) mean3(X) rm(.Random.seed)
Returns, for some multivariate data, the location vector based on 3rd moments and the scatter matrix based on 4th moments.
Mean3Cov4(x)
Mean3Cov4(x)
x |
a numeric data matrix. |
Note that the scatter matrix of 4th moments is computed with respect to the mean vector and not with respect to the location vector based on 3rd moments.
A list containing:
locations |
The location vector based on 3rd moments as computed by |
scatter |
The scatter matrix based on 4th moments as computed by |
Klaus Nordhausen
X <- rmvnorm(200, 1:3, diag(2:4)) Mean3Cov4(X)
X <- rmvnorm(200, 1:3, diag(2:4)) Mean3Cov4(X)
Returns, for some multivariate data, the mean vector and covariance matrix.
MeanCov(x)
MeanCov(x)
x |
a numeric data matrix. |
A list containing:
locations |
The mean vector as computed by |
scatter |
The covariance matrix as computed by |
Klaus Nordhausen
X <- rmvnorm(200, 1:3, diag(2:4)) MeanCov(X)
X <- rmvnorm(200, 1:3, diag(2:4)) MeanCov(X)
Test for multivariate normality which uses as criterion the kurtosis measured by the ratio of regular covariance matrix and matrix of fourth moments.
mvnorm.kur.test(X, method = "integration", n.simu = 1000, na.action = na.fail)
mvnorm.kur.test(X, method = "integration", n.simu = 1000, na.action = na.fail)
X |
a numeric data frame or matrix. |
method |
defines the method used for the computation of the p-value. The possibilites are "integration" (default), "satterthwaite" or "simulation". Details below. |
n.simu |
if ' |
na.action |
a function which indicates what should happen when the data contain 'NA's. Default is to fail. |
This test implements the multivariate normality test based on kurtosis measured by two different scatter estimates
as described in Kankainen, Taskinen and Oja. The choice here is based on the regular covariance matrix and matrix of
fourth moments (cov4
).
The limiting distribution of the test statistic W is a linear combination of independent chi-square variables with different degrees of freedom.
Exact limiting p-values or approximated p-values are obtained by using the function pchisqsum
. However Kankainen et al.
mention that even for n = 200 the convergence can be poor, therefore also p-values simulated under the NULL can be obtained.
Note that the test statistic used is a symmetric version of the one in the paper to guarantee affine invariance.
A list with class 'htest' containing the following components:
statistic |
the value of the test statistic W. |
parameter |
the degrees of freedom for the test statistic W with their weights or the number of replications depending on the chosen method. |
p.value |
the p-value for the test. |
method |
a character string indicating what type of test was performed. |
data.name |
a character string giving the name of the data. |
Klaus Nordhausen
Kankainen, A., Taskinen, S. and Oja, H. (2007), Tests of multinormality based on location vectors and scatter matrices, Statistical Methods and Applications, 16, 357–379. <doi:10.1007/s10260-007-0045-9>.
X<-rmvnorm(100, c(2, 4, 5)) mvnorm.kur.test(X) mvnorm.kur.test(X, method = "satt") mvnorm.kur.test(X, method = "simu")
X<-rmvnorm(100, c(2, 4, 5)) mvnorm.kur.test(X) mvnorm.kur.test(X, method = "satt") mvnorm.kur.test(X, method = "simu")
Test for multivariate normality that uses as criterion the skewness measured as the difference between location estimates based on first respectively third moments
mvnorm.skew.test(X, na.action = na.fail)
mvnorm.skew.test(X, na.action = na.fail)
X |
a numeric data frame or matrix. |
na.action |
a function which indicates what should happen when the data contain 'NA's. Default is to fail. |
This test implements the multivariate normality test based on skewness measured by two different location estimates
as described in Kankainen, Taskinen and Oja. The choice here is based on the regular mean vector and the location estimate based on
third moments (mean3
). The scatter matrix used is the regular covariance matrix.
A list with class 'htest' containing the following components:
statistic |
the value of the test statistic U. |
parameter |
the degrees of freedom for the statistic U. |
p.value |
the p-value for the test. |
method |
a character string indicating what type of test was performed. |
data.name |
a character string giving the name of the data. |
Klaus Nordhausen
Kankainen, A., Taskinen, S. and Oja, H. (2007),Tests of multinormality based on location vectors and scatter matrices, Statistical Methods and Applications, 16, 357–379. <doi:10.1007/s10260-007-0045-9>.
X<-rmvnorm(100,c(2,4,5)) mvnorm.skew.test(X)
X<-rmvnorm(100,c(2,4,5)) mvnorm.skew.test(X)
Scatterplot matrix for an ics
object.
## S4 method for signature 'ics,missing' plot(x, index = NULL, ...)
## S4 method for signature 'ics,missing' plot(x, index = NULL, ...)
x |
object of class |
index |
index vector of which components should be plotted. See details for further information |
... |
other arguments for |
If no index vector is given the function plots the full scatterplots matrix only if there are less than seven components. Otherwise the three first and three last components will be plotted. This is because the components with extreme kurtosis are the most interesting ones.
Klaus Nordhausen
screeplot.ics
, ics-class
and ics
set.seed(123456) X1 <- rmvnorm(250, rep(0,8), diag(c(rep(1,6),0.04,0.04))) X2 <- rmvnorm(50, c(rep(0,6),2,0), diag(c(rep(1,6),0.04,0.04))) X3 <- rmvnorm(200, c(rep(0,7),2), diag(c(rep(1,6),0.04,0.04))) X.comps <- rbind(X1,X2,X3) A <- matrix(rnorm(64),nrow=8) X <- X.comps %*% t(A) ics.X.1 <- ics(X) plot(ics.X.1) plot(ics.X.1,index=1:8) rm(.Random.seed)
set.seed(123456) X1 <- rmvnorm(250, rep(0,8), diag(c(rep(1,6),0.04,0.04))) X2 <- rmvnorm(50, c(rep(0,6),2,0), diag(c(rep(1,6),0.04,0.04))) X3 <- rmvnorm(200, c(rep(0,7),2), diag(c(rep(1,6),0.04,0.04))) X.comps <- rbind(X1,X2,X3) A <- matrix(rnorm(64),nrow=8) X <- X.comps %*% t(A) ics.X.1 <- ics(X) plot(ics.X.1) plot(ics.X.1,index=1:8) rm(.Random.seed)
Produces a scatterplot matrix of the component scores of an invariant coordinate system obtained via an ICS transformation.
## S3 method for class 'ICS' plot(x, select = NULL, index = NULL, ...)
## S3 method for class 'ICS' plot(x, select = NULL, index = NULL, ...)
x |
an object inheriting from class |
select |
an integer, character, or logical vector specifying which
components to plot. If |
index |
an integer vector specifying which components to plot, or
|
... |
additional arguments to be passed down to
|
Andreas Alfons and Aurore Archimbaud
ICS()
gen_kurtosis()
, coef()
,
components()
, and fitted()
methods
data("iris") X <- iris[,-5] out <- ICS(X) plot(out) plot(out, select = c(1,4))
data("iris") X <- iris[,-5] out <- ICS(X) plot(out) plot(out, select = c(1,4))
Prints the minimal information of an ics
object.
## S4 method for signature 'ics' show(object)
## S4 method for signature 'ics' show(object)
object |
object of class |
Klaus Nordhausen
Prints information of an ICS
object.
## S3 method for class 'ICS' print(x, info = FALSE, digits = 4L, ...)
## S3 method for class 'ICS' print(x, info = FALSE, digits = 4L, ...)
x |
object of class |
info |
Logical, either TRUE or FALSE. If TRUE, print additional information on arguments used for computing scatter matrices (only named arguments that contain numeric, character, or logical scalars) and information on the parameters of the algorithm. Default is FALSE. |
digits |
number of digits for the numeric output. |
... |
additional arguments passed to |
Andreas Alfons and Aurore Archimbaud
ICS()
data("iris") X <- iris[,-5] out <- ICS(X) print(out) print(out, info = TRUE)
data("iris") X <- iris[,-5] out <- ICS(X) print(out) print(out, info = TRUE)
Prints the minimal information of an ics2
object.
## S4 method for signature 'ics2' show(object)
## S4 method for signature 'ics2' show(object)
object |
object of class |
Klaus Nordhausen
ics2-class
and ics2
Function for a supervised scatter matrix that is the weighted
covariance matrix of x
with weights 1/(q2-q1
) if y
is between the
lower (q1
) and upper (q2
) quantile and 0 otherwise (or vice versa).
scovq(x, y, q1 = 0, q2 = 0.5, pos = TRUE, type = 7, method = "unbiased", na.action = na.fail, check = TRUE)
scovq(x, y, q1 = 0, q2 = 0.5, pos = TRUE, type = 7, method = "unbiased", na.action = na.fail, check = TRUE)
x |
numeric data matrix with at least two columns. |
y |
numerical vector specifying the dependent variable. |
q1 |
percentage for lower quantile of |
q2 |
percentage for upper quantile of |
pos |
logical. If TRUE then the weights are 1/( |
type |
passed on to function |
method |
passed on to function |
na.action |
a function which indicates what should happen when the data contain 'NA's. Default is to fail. |
check |
logical. Checks if the input should be checked for consistency. If not needed setting it to FALSE might save some time. |
The weights for this supervised scatter matrix for pos=TRUE
are
. Then
scovq
is calculated as
where .
To see how this function can be used in the context of supervised invariant coordinate selection see the example below.
a matrix.
Klaus Nordhausen
Liski, E., Nordhausen, K. and Oja, H. (2014), Supervised invariant coordinate selection, Statistics: A Journal of Theoretical and Applied Statistics, 48, 711–731. <doi:10.1080/02331888.2013.800067>.
# Creating some data # The number of explaining variables p <- 10 # The number of observations n <- 400 # The error variance sigma <- 0.5 # The explaining variables X <- matrix(rnorm(p*n),n,p) # The error term epsilon <- rnorm(n, sd = sigma) # The response y <- X[,1]^2 + X[,2]^2*epsilon # SICS with ics X.centered <- sweep(X,2,colMeans(X),"-") SICS <- ics(X.centered, S1=cov, S2=scovq, S2args=list(y=y, q1=0.25, q2=0.75, pos=FALSE), stdKurt=FALSE, stdB="Z") # Assuming it is known that k=2, then the two directions # of interest are choosen as: k <- 2 KURTS <- SICS@gKurt KURTS.max <- ifelse(KURTS >= 1, KURTS, 1/KURTS) ordKM <- order(KURTS.max, decreasing = TRUE) indKM <- ordKM[1:k] # The two variables of interest Zk <- ics.components(SICS)[,indKM] # The correspondings transformation matrix Bk <- coef(SICS)[indKM,] # The corresponding projection matrix Pk <- t(Bk) %*% solve(Bk %*% t(Bk)) %*% Bk # Visualization pairs(cbind(y,Zk)) # checking the subspace difference # true projection B0 <- rbind(rep(c(1,0),c(1,p-1)),rep(c(0,1,0),c(1,1,p-2))) P0 <- t(B0) %*% solve(B0 %*% t(B0)) %*% B0 # crone and crosby subspace distance measure, should be small k - sum(diag(P0 %*% Pk))
# Creating some data # The number of explaining variables p <- 10 # The number of observations n <- 400 # The error variance sigma <- 0.5 # The explaining variables X <- matrix(rnorm(p*n),n,p) # The error term epsilon <- rnorm(n, sd = sigma) # The response y <- X[,1]^2 + X[,2]^2*epsilon # SICS with ics X.centered <- sweep(X,2,colMeans(X),"-") SICS <- ics(X.centered, S1=cov, S2=scovq, S2args=list(y=y, q1=0.25, q2=0.75, pos=FALSE), stdKurt=FALSE, stdB="Z") # Assuming it is known that k=2, then the two directions # of interest are choosen as: k <- 2 KURTS <- SICS@gKurt KURTS.max <- ifelse(KURTS >= 1, KURTS, 1/KURTS) ordKM <- order(KURTS.max, decreasing = TRUE) indKM <- ordKM[1:k] # The two variables of interest Zk <- ics.components(SICS)[,indKM] # The correspondings transformation matrix Bk <- coef(SICS)[indKM,] # The corresponding projection matrix Pk <- t(Bk) %*% solve(Bk %*% t(Bk)) %*% Bk # Visualization pairs(cbind(y,Zk)) # checking the subspace difference # true projection B0 <- rbind(rep(c(1,0),c(1,p-1)),rep(c(0,1,0),c(1,1,p-2))) P0 <- t(B0) %*% solve(B0 %*% t(B0)) %*% B0 # crone and crosby subspace distance measure, should be small k - sum(diag(P0 %*% Pk))
Plots the kurtosis measures of an ics
object against its index number. Two versions of this screeplot are available.
## S3 method for class 'ics' screeplot(x, index = NULL, type = "barplot", main = deparse(substitute(x)), ylab = "generalized kurtosis", xlab = "component", names.arg = index, labels = TRUE, ...)
## S3 method for class 'ics' screeplot(x, index = NULL, type = "barplot", main = deparse(substitute(x)), ylab = "generalized kurtosis", xlab = "component", names.arg = index, labels = TRUE, ...)
x |
object of class |
index |
index of the components to be plottes. If NULL all components are used. |
type |
|
main |
main title of the plot. |
ylab |
y-axis label. |
xlab |
x-axis label. |
names.arg |
|
labels |
|
... |
other arguments for the plotting functions. |
Klaus Nordhausen
set.seed(654321) A <- matrix(c(3,2,1,2,4,-0.5,1,-0.5,2),ncol=3) eigen.A <- eigen(A) sqrt.A <- eigen.A$vectors %*% (diag(eigen.A$values))^0.5 %*% t(eigen.A$vectors) normal.ic <- cbind(rnorm(800), rnorm(800), rnorm(800)) mix.ic <- cbind(rt(800,4), rnorm(800), runif(800,-2,2)) data.normal <- normal.ic %*% t(sqrt.A) data.mix <- mix.ic %*% t(sqrt.A) par(mfrow=c(1,2)) screeplot(ics(data.normal)) screeplot(ics(data.mix), type="lines") par(mfrow=c(1,1)) rm(.Random.seed) screeplot(ics(data.normal), names.arg=paste("IC", 1:ncol(A), sep=""), xlab="")
set.seed(654321) A <- matrix(c(3,2,1,2,4,-0.5,1,-0.5,2),ncol=3) eigen.A <- eigen(A) sqrt.A <- eigen.A$vectors %*% (diag(eigen.A$values))^0.5 %*% t(eigen.A$vectors) normal.ic <- cbind(rnorm(800), rnorm(800), rnorm(800)) mix.ic <- cbind(rt(800,4), rnorm(800), runif(800,-2,2)) data.normal <- normal.ic %*% t(sqrt.A) data.mix <- mix.ic %*% t(sqrt.A) par(mfrow=c(1,2)) screeplot(ics(data.normal)) screeplot(ics(data.mix), type="lines") par(mfrow=c(1,1)) rm(.Random.seed) screeplot(ics(data.normal), names.arg=paste("IC", 1:ncol(A), sep=""), xlab="")
ICS
ObjectPlots the kurtosis measures of an ICS
object against its index number.
Two versions of this screeplot are available.
## S3 method for class 'ICS' screeplot( x, index = NULL, type = "barplot", main = deparse(substitute(x)), ylab = "generalized kurtosis", xlab = "component", names.arg = index, labels = TRUE, ... )
## S3 method for class 'ICS' screeplot( x, index = NULL, type = "barplot", main = deparse(substitute(x)), ylab = "generalized kurtosis", xlab = "component", names.arg = index, labels = TRUE, ... )
x |
object of class |
index |
index of the components to be plottes. If NULL all components are used. |
type |
"barplot" if a barplot or "lines" if a line plot is preferred. |
main |
main title of the plot. |
ylab |
y-axis label. |
xlab |
x-axis label. |
names.arg |
names.arg argument passed on to "barplot". |
labels |
|
... |
other arguments for the plotting functions. |
Andreas Alfons and Aurore Archimbaud
ICS()
gen_kurtosis()
method
X <- iris[,-5] out <- ICS(X) screeplot(out) screeplot(out, type = "lines")
X <- iris[,-5] out <- ICS(X) screeplot(out) screeplot(out, type = "lines")
Summarizes and prints a ics
object in an informative way.
## S4 method for signature 'ics' summary(object, digits = 4)
## S4 method for signature 'ics' summary(object, digits = 4)
object |
object of class |
digits |
number of digits for the numeric output. |
Klaus Nordhausen
ICS
objectSummarizes and prints an ICS
object in an informative way.
## S3 method for class 'ICS' summary(object, ...)
## S3 method for class 'ICS' summary(object, ...)
object |
object of class |
... |
additional arguments passed to |
Andreas Alfons and Aurore Archimbaud
ICS()
data("iris") X <- iris[,-5] out <- ICS(X) summary(out)
data("iris") X <- iris[,-5] out <- ICS(X) summary(out)
Summarizes and prints a ics2
object in an informative way.
## S4 method for signature 'ics2' summary(object, digits = 4)
## S4 method for signature 'ics2' summary(object, digits = 4)
object |
object of class |
digits |
number of digits for the numeric output. |
Klaus Nordhausen
ics2-class
and ics2
Implements three EM algorithms to M-estimate the location vector and scatter matrix of a multivariate t-distribution.
tM(X, df = 1, alg = "alg3", mu.init = NULL, V.init = NULL, gamma.init = NULL, eps = 1e-06, maxiter = 100, na.action = na.fail)
tM(X, df = 1, alg = "alg3", mu.init = NULL, V.init = NULL, gamma.init = NULL, eps = 1e-06, maxiter = 100, na.action = na.fail)
X |
numeric data matrix or dataframe. |
df |
assumed degrees of freedom of the t-distribution. Default is |
alg |
specifies which algorithm to use. Options are |
mu.init |
initial value for the location vector if available. |
V.init |
initial value for the scatter matrix if available. |
gamma.init |
initial value for gamma if available. Only needed for |
eps |
convergence tolerance. |
maxiter |
maximum number of iterations. |
na.action |
a function which indicates what should happen when the data contain 'NA's. Default is to fail. |
This function implements the EM algorithms described in Kent et al. (1994). The norm used to define convergence is as in Arslan et al. (1995).
Algorithm 1 is valid for all degrees of freedom df
> 0. Algorithm 2 is well defined only for degrees of freedom df
> 1.
Algorithm 3 is the limiting case of Algorithm 2 with degrees of freedom df
= 1.
The performance of the algorithms are compared in Arslan et al. (1995).
Note that cov.trob
in the MASS package implements also a covariance estimate for a multivariate t-distribution.
That function provides for example also the possibility to fix the location. It requires however that the degrees of freedom exceeds 2.
A list containing:
mu |
vector with the estimated loaction. |
V |
matrix of the estimated scatter. |
gam |
estimated value of gamma. Only present when |
iter |
number of iterations. |
Klaus Nordhausen
Kent, J.T., Tyler, D.E. and Vardi, Y. (1994), A curious likelihood identity for the multivariate t-distribution, Communications in Statistics, Simulation and Computation, 23, 441–453. <doi:10.1080/03610919408813180>.
Arslan, O., Constable, P.D.L. and Kent, J.T. (1995), Convergence behaviour of the EM algorithm for the multivariate t-distribution, Communications in Statistics, Theory and Methods, 24, 2981–3000. <doi:10.1080/03610929508831664>.
set.seed(654321) cov.matrix <- matrix(c(3,2,1,2,4,-0.5,1,-0.5,2), ncol=3) X <- rmvt(100, cov.matrix, 1) tM(X) rm(.Random.seed)
set.seed(654321) cov.matrix <- matrix(c(3,2,1,2,4,-0.5,1,-0.5,2), ncol=3) X <- rmvt(100, cov.matrix, 1) tM(X) rm(.Random.seed)