Package 'mvhtests'

Title: Multivariate Hypothesis Tests
Description: Hypothesis tests for multivariate data. Tests for one and two mean vectors, multivariate analysis of variance, tests for one, two or more covariance matrices. References include: Mardia K.V., Kent J.T. and Bibby J.M. (1979). Multivariate Analysis. ISBN: 978-0124712522. London: Academic Press.
Authors: Michail Tsagris [aut, cre]
Maintainer: Michail Tsagris <[email protected]>
License: GPL (>= 2)
Version: 1.0
Built: 2024-10-14 06:27:00 UTC
Source: CRAN

Help Index


Multivariate Hypothesis Tests

Description

Multivariate Hypothesis Tests.

Details

Package: mvhtests
Type: Package
Version: 1.0
Date: 2023-10-19
License: GPL-2

Maintainer

Michail Tsagris [email protected].

Author(s)

Michail Tsagris [email protected].

References

Aitchison J. (1986). The statistical analysis of compositional data. Chapman & Hall.

Amaral G.J.A., Dryden I.L. and Wood A.T.A. (2007). Pivotal bootstrap methods for k-sample problems in directional statistics and shape analysis. Journal of the American Statistical Association, 102(478): 695–707.

Efron B. (1981). Nonparametric standard errors and confidence intervals. Canadian Journal of Statistics, 9(2): 139–158.

Emerson S. (2009). Small sample performance and calibration of the Empirical Likelihood method. PhD thesis, Stanford University.

Everitt B. (2005). An R and S-Plus Companion to Multivariate Analysis. Springer.

James G.S. (1954). Tests of Linear Hypotheses in Univariate and Multivariate Analysis when the Ratios of the Population Variances are Unknown. Biometrika, 41(1/2): 19–43.

Jing B.Y. and Wood A.T.A. (1996). Exponential empirical likelihood is not Bartlett correctable. Annals of Statistics, 24(1): 365–369.

Jing B.Y. and Robinson J. (1997). Two-sample nonparametric tilting method. Australian Journal of Statistics, 39(1): 25–34.

Johnson R.A. and Wichern D.W. (2007). Applied Multivariate Statistical Analysis, 6th Edition. Pearson Prentice Hall.

Krishnamoorthy K. and Yu J. (2004). Modified Nel and Van der Merwe test for the multivariate Behrens-Fisher problem. Statistics & Probability Letters, 66(2): 161–169.

Krishnamoorthy K. and Xia Y. (2006). On Selecting Tests for Equality of Two Normal Mean Vectors. Multivariate Behavioral Research, 41(4): 533–548.

Mardia K.V., Kent J.T. and Bibby J.M. (1979). Multivariate Analysis. London: Academic Press.

Owen A.B. (1988). Empirical likelihood ratio confidence intervals for a single functional. Biometrika, 75(2): 237–249.

Owen A. (1990). Empirical likelihood ratio confidence regions. Annals of Statistics, 18(1): 90–120.

Owen A.B. (2001). Empirical likelihood. Chapman and Hall/CRC Press.

Preston S.P. and Wood A.T.A. (2010). Two-Sample Bootstrap Hypothesis Tests for Three-Dimensional Labelled Landmark Data. Scandinavian Journal of Statistics 37(4): 568–587.

Todorov V. and Filzmoser P. (2010). Robust Statistic for the One-way MANOVA. Computational Statistics & Data Analysis 54(1): 37–48.


Box's M test for equality of two or more covariance matrices

Description

Box's M test for equality of two or more covariance matrices.

Usage

Mtest.cov(x, ina, a = 0.05)

Arguments

x

A matrix containing Euclidean data.

ina

A vector denoting the groups of the data.

a

The significance level, set to 0.05 by default.

Details

According to Mardia, Kent and Bibby (1979, pg. 140), it may be argued that if n_i is small, then the log-likelihood ratio test (function likel.cov) gives too much weight to the contribution of {\bf S}. This consideration led Box (1949) to propose another test statistic in place of that seen in likel.cov. Box's M is given by

M=\gamma\sum_{i=1}^k\left(n_i-1\right)\log{\left|{\bf S}_{i}^{-1}{\bf S}_p \right|},

where \gamma=1-\frac{2p^2+3p-1}{6\left(p+1\right)\left(k-1\right)}\left(\sum_{i=1}^k\frac{1}{n_i-1}-\frac{1}{n-k}\right), and {\bf S}_i and {\bf S}_p are the i-th unbiased covariance estimator and the pooled covariance matrix, respectively, with {\bf S}_p=\frac{\sum_{i=1}^k\left(n_i-1\right){\bf S}_i}{n-k}. Box's M has an asymptotic \chi^2 distribution with \frac{1}{2}p\left(p+1\right)\left(k-1\right) degrees of freedom. Box's approximation seems to be good if each n_i exceeds 20 and if k and p do not exceed 5 (Mardia, Kent and Bibby, 1979, pg. 140).
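The statistic can be computed directly from the formulas above. The following is a minimal sketch (illustrative only, not the package's implementation), using the iris data as in the Examples below:

```r
## Box's M computed from its definition (illustrative sketch).
x <- as.matrix(iris[, 1:4])
ina <- iris[, 5]                      # grouping factor
k <- nlevels(ina);  p <- ncol(x);  n <- nrow(x)
ni <- tabulate(ina)                   # group sample sizes
## per-group unbiased covariance estimators and the pooled matrix
Si <- lapply(levels(ina), function(g) cov(x[ina == g, ]))
Sp <- Reduce("+", Map(function(S, m) (m - 1) * S, Si, ni)) / (n - k)
gamma <- 1 - (2 * p^2 + 3 * p - 1) / (6 * (p + 1) * (k - 1)) *
  (sum(1 / (ni - 1)) - 1 / (n - k))
M <- gamma * sum( (ni - 1) *
       sapply(Si, function(S) log(det(solve(S) %*% Sp))) )
dof <- 0.5 * p * (p + 1) * (k - 1)
pvalue <- pchisq(M, dof, lower.tail = FALSE)
round(c(M = M, dof = dof, pvalue = pvalue), 4)
```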

Value

A vector with the test statistic, the p-value, the degrees of freedom and the critical value of the test.

Author(s)

Michail Tsagris.

R implementation and documentation: Michail Tsagris [email protected].

References

Mardia K.V., Kent J.T. and Bibby J.M. (1979). Multivariate Analysis. London: Academic Press.

See Also

equal.cov, likel.cov

Examples

x <- as.matrix( iris[, 1:4] )
ina <- iris[, 5]
Mtest.cov(x, ina)

Empirical likelihood for a one sample mean vector hypothesis testing

Description

Empirical likelihood for a one sample mean vector hypothesis testing.

Usage

el.test1(x, mu, R = 1, ncores = 1, graph = FALSE)

Arguments

x

A matrix containing Euclidean data.

mu

The hypothesized mean vector.

R

If R is 1 no bootstrap calibration is performed and the classical p-value via the \chi^2 distribution is returned. If R is greater than 1, the bootstrap p-value is returned.

ncores

The number of cores to use, set to 1 by default.

graph

A boolean variable which is taken into consideration only when bootstrap calibration is performed. If TRUE, the histogram of the bootstrap test statistic values is plotted.

Details

The H_0 is that \pmb{\mu} = \pmb{\mu}_0 and the constraint imposed by EL is

\frac{1}{n}\sum_{i=1}^{n}\left\lbrace\left[1+\pmb{\lambda}^T\left({\bf x}_i-\pmb{\mu}_0 \right)\right]^{-1}\left({\bf x}_i-\pmb{\mu}_0\right)\right\rbrace={\bf 0},

where \pmb{\lambda} is the Lagrangian parameter introduced to maximize the above expression. Note that the maximization is with respect to \pmb{\lambda}. The probabilities have the following form

p_i=\frac{1}{n}\left[1+\pmb{\lambda}^T \left({\bf x}_i-\pmb{\mu}_0 \right)\right]^{-1}.

The log-likelihood ratio test statistic can be written as

\Lambda=-2\sum_{i=1}^{n}\log{\left(np_i\right)},

where d denotes the number of variables. Under H_0, \Lambda \sim \chi^2_d asymptotically. Alternatively the bootstrap p-value may be computed.
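In one dimension (d = 1) the constraint reduces to a single equation in \lambda that uniroot can solve, which makes the machinery easy to sketch. This is illustrative only, with simulated data; the package function handles the multivariate case:

```r
## One-dimensional EL sketch: solve the constraint for lambda, then form
## the probabilities and the test statistic. Illustrative data.
set.seed(1)
x <- rnorm(50, mean = 0.2)
mu0 <- 0
n <- length(x)
## constraint: mean( (x - mu0) / (1 + lambda * (x - mu0)) ) = 0
g <- function(lam) mean( (x - mu0) / (1 + lam * (x - mu0)) )
## lambda must keep every weight positive: 1 + lambda * (x - mu0) > 0
lo <- -1 / max(x - mu0) + 1e-6
hi <- -1 / min(x - mu0) - 1e-6
lam <- uniroot(g, c(lo, hi), tol = 1e-10)$root
p <- 1 / ( n * (1 + lam * (x - mu0)) )     # EL probabilities
Lambda <- -2 * sum(log(n * p))             # -2 log EL ratio
pvalue <- pchisq(Lambda, df = 1, lower.tail = FALSE)
c(lambda = lam, statistic = Lambda, pvalue = pvalue)
```

At the solution the probabilities automatically sum to one, since the constraint forces the weighted deviations from mu0 to vanish.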

Value

A list with the outcome of the function el.test, which includes the -2 log-likelihood ratio, the observed p-value by the chi-square approximation, the final value of the Lagrange multiplier \lambda, the gradient at the maximum, the Hessian matrix, the weights on the observations (probabilities multiplied by the sample size) and the number of iterations performed. In addition, the runtime of the procedure is reported. In the case of bootstrap, the bootstrap p-value is also returned.

Author(s)

Michail Tsagris.

R implementation and documentation: Michail Tsagris [email protected].

References

Jing B.Y. and Wood A.T.A. (1996). Exponential empirical likelihood is not Bartlett correctable. Annals of Statistics, 24(1): 365–369.

Owen A. (1990). Empirical likelihood ratio confidence regions. Annals of Statistics, 18(1): 90–120.

Owen A.B. (2001). Empirical likelihood. Chapman and Hall/CRC Press.

See Also

eel.test1, hotel1T2, james, hotel2T2, maov, el.test2

Examples

x <- as.matrix(iris[, 1:4])
el.test1(x, mu = numeric(4) )
eel.test1(x, mu = numeric(4) )

Empirical likelihood hypothesis testing for two mean vectors

Description

Empirical likelihood hypothesis testing for two mean vectors.

Usage

el.test2(y1, y2, R = 0, ncores = 1, graph = FALSE)

Arguments

y1

A matrix containing the Euclidean data of the first group.

y2

A matrix containing the Euclidean data of the second group.

R

If R is 0, the classical chi-square distribution is used; if R = 1, the corrected chi-square distribution (James, 1954) is used; and if R = 2, the modified F distribution (Krishnamoorthy and Xia, 2006) is used. If R is greater than 3, bootstrap calibration is performed.

ncores

The number of cores to use.

graph

A boolean variable which is taken into consideration only when bootstrap calibration is performed. If TRUE, the histogram of the bootstrap test statistic values is plotted.

Details

The H_0 is that \pmb{\mu}_1 = \pmb{\mu}_2 and the two constraints imposed by EL are

\frac{1}{n_j}\sum_{i=1}^{n_j}\left\lbrace\left[1+\pmb{\lambda}_j^T\left({\bf x}_{ji}-\pmb{\mu} \right)\right]^{-1}\left({\bf x}_{ji}-\pmb{\mu}\right)\right\rbrace={\bf 0},

where j=1,2 and the \pmb{\lambda}_j are Lagrangian parameters introduced to maximize the above expression. Note that the maximization is with respect to the \pmb{\lambda}_j. The probabilities of the j-th sample have the following form

p_{ji}=\frac{1}{n_j} \left[1+\pmb{\lambda}_j^T \left({\bf x}_{ji}-\pmb{\mu} \right)\right]^{-1}.

The log-likelihood ratio test statistic can be written as

\Lambda=-2\sum_{j=1}^2\sum_{i=1}^{n_j}\log{\left(n_jp_{ji}\right)}.

The test is implemented by searching for the mean vector that minimizes the sum of the two one sample EL test statistics. See el.test1 for the test statistic in the one-sample case.

Value

A list including:

test

The empirical likelihood test statistic value.

modif.test

The modified test statistic, either via the chi-square or the F distribution.

dof

The degrees of freedom of the chi-square or the F distribution.

pvalue

The asymptotic or the bootstrap p-value.

mu

The estimated common mean vector.

runtime

The runtime of the bootstrap calibration.

Author(s)

Michail Tsagris.

R implementation and documentation: Michail Tsagris [email protected].

References

Amaral G.J.A., Dryden I.L. and Wood A.T.A. (2007). Pivotal bootstrap methods for k-sample problems in directional statistics and shape analysis. Journal of the American Statistical Association, 102(478): 695–707.

Owen A.B. (2001). Empirical likelihood. Chapman and Hall/CRC Press.

Owen A.B. (1988). Empirical likelihood ratio confidence intervals for a single functional. Biometrika, 75(2): 237–249.

Preston S.P. and Wood A.T.A. (2010). Two-Sample Bootstrap Hypothesis Tests for Three-Dimensional Labelled Landmark Data. Scandinavian Journal of Statistics, 37(4): 568–587.

See Also

eel.test2, maovjames, maov, hotel2T2, james

Examples

el.test2( y1 = as.matrix(iris[1:25, 1:4]), y2 = as.matrix(iris[26:50, 1:4]), R = 0 )
el.test2( y1 = as.matrix(iris[1:25, 1:4]), y2 = as.matrix(iris[26:50, 1:4]), R = 1 )
el.test2( y1 = as.matrix(iris[1:25, 1:4]), y2 = as.matrix(iris[26:50, 1:4]), R = 2 )

Exponential empirical likelihood for a one sample mean vector hypothesis testing

Description

Exponential empirical likelihood for a one sample mean vector hypothesis testing.

Usage

eel.test1(x, mu, tol = 1e-06, R = 1)

Arguments

x

A matrix containing Euclidean data.

mu

The hypothesized mean vector.

tol

The tolerance value used to stop the Newton-Raphson algorithm.

R

The number of bootstrap samples used to calculate the p-value. If R = 1 (the default), no bootstrap calibration is performed.

Details

Exponential empirical likelihood or exponential tilting was first introduced by Efron (1981) as a way to perform a "tilted" version of the bootstrap for the one sample mean hypothesis testing. Similarly to the empirical likelihood, positive weights p_i, which sum to one, are allocated to the observations, such that the weighted sample mean {\bf \bar{x}} is equal to some population mean \pmb{\mu}_0, under the H_0. Under H_1 the weights are equal to \frac{1}{n}, where n is the sample size. Following Efron (1981), the choice of the p_i will minimize the Kullback-Leibler distance from H_0 to H_1

D\left(L_0,L_1\right)=\sum_{i=1}^np_i\log\left(np_i\right),

subject to the constraint \sum_{i=1}^np_i{\bf x}_i=\pmb{\mu}_0. The probabilities take the form

p_i=\frac{e^{\pmb{\lambda}^T{\bf x}_i}}{\sum_{j=1}^ne^{\pmb{\lambda}^T{\bf x}_j}}

and the constraint becomes

\frac{\sum_{i=1}^ne^{\pmb{\lambda}^T{\bf x}_i}\left({\bf x}_i-\pmb{\mu}_0\right)}{\sum_{j=1}^ne^{\pmb{\lambda}^T{\bf x}_j}}={\bf 0} \Rightarrow \frac{\sum_{i=1}^n{\bf x}_ie^{\pmb{\lambda}^T{\bf x}_i}}{\sum_{j=1}^ne^{\pmb{\lambda}^T{\bf x}_j}}-\pmb{\mu}_0={\bf 0}.

A numerical search over \pmb{\lambda} is required. Under H_0, \Lambda \sim \chi^2_d, where d denotes the number of variables. Alternatively the bootstrap p-value may be computed.
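The tilting is easy to sketch in one dimension, where a single \lambda re-weights the sample so that the tilted mean equals the hypothesized value. This is an illustrative sketch with simulated data; the package routine solves the multivariate problem by Newton-Raphson:

```r
## One-dimensional exponential tilting sketch: find lambda so that the
## exponentially tilted weighted mean equals mu0. Illustrative data.
set.seed(2)
x <- rnorm(40, mean = 0.3)
mu0 <- 0
## constraint: sum(x * exp(lambda * x)) / sum(exp(lambda * x)) = mu0
f <- function(lam) sum(x * exp(lam * x)) / sum(exp(lam * x)) - mu0
lam <- uniroot(f, c(-20, 20), tol = 1e-10)$root
p <- exp(lam * x) / sum(exp(lam * x))      # tilted probabilities
c(lambda = lam, tilted.mean = sum(p * x))  # the tilted mean recovers mu0
```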

Value

A list including:

p

The estimated probabilities.

lambda

The value of the Lagrangian parameter \lambda.

iter

The number of iterations required by the Newton-Raphson algorithm.

info

The value of the log-likelihood ratio test statistic along with its corresponding p-value.

runtime

The runtime of the process.

Author(s)

Michail Tsagris.

R implementation and documentation: Michail Tsagris [email protected].

References

Efron B. (1981). Nonparametric standard errors and confidence intervals. Canadian Journal of Statistics, 9(2): 139–158.

Jing B.Y. and Wood A.T.A. (1996). Exponential empirical likelihood is not Bartlett correctable. Annals of Statistics, 24(1): 365–369.

Owen A.B. (2001). Empirical likelihood. Chapman and Hall/CRC Press.

See Also

el.test1, hotel1T2, james, hotel2T2, maov, el.test2

Examples

x <- as.matrix( iris[, 1:4] )
eel.test1(x, numeric(4) )
el.test1(x, numeric(4) )

Exponential empirical likelihood hypothesis testing for two mean vectors

Description

Exponential empirical likelihood hypothesis testing for two mean vectors.

Usage

eel.test2(y1, y2, tol = 1e-07, R = 0, graph = FALSE)

Arguments

y1

A matrix containing the Euclidean data of the first group.

y2

A matrix containing the Euclidean data of the second group.

tol

The tolerance level used to terminate the Newton-Raphson algorithm.

R

If R is 0, the classical chi-square distribution is used; if R = 1, the corrected chi-square distribution (James, 1954) is used; and if R = 2, the modified F distribution (Krishnamoorthy and Xia, 2006) is used. If R is greater than 3, bootstrap calibration is performed.

graph

A boolean variable which is taken into consideration only when bootstrap calibration is performed. If TRUE, the histogram of the bootstrap test statistic values is plotted.

Details

Exponential empirical likelihood or exponential tilting was first introduced by Efron (1981) as a way to perform a "tilted" version of the bootstrap for the one sample mean hypothesis testing. Similarly to the empirical likelihood, positive weights p_i, which sum to one, are allocated to the observations, such that the weighted sample mean {\bf \bar{x}} is equal to some population mean \pmb{\mu}, under the H_0. Under H_1 the weights are equal to \frac{1}{n}, where n is the sample size. Following Efron (1981), the choice of the p_i will minimize the Kullback-Leibler distance from H_0 to H_1

D\left(L_0,L_1\right)=\sum_{i=1}^np_i\log\left(np_i\right),

subject to the constraint \sum_{i=1}^np_i{\bf x}_i=\pmb{\mu}. The probabilities take the form

p_i=\frac{e^{\pmb{\lambda}^T{\bf x}_i}}{\sum_{j=1}^ne^{\pmb{\lambda}^T{\bf x}_j}}

and the constraint becomes

\frac{\sum_{i=1}^ne^{\pmb{\lambda}^T{\bf x}_i}\left({\bf x}_i-\pmb{\mu}\right)}{\sum_{j=1}^ne^{\pmb{\lambda}^T{\bf x}_j}}={\bf 0} \Rightarrow \frac{\sum_{i=1}^n{\bf x}_ie^{\pmb{\lambda}^T{\bf x}_i}}{\sum_{j=1}^ne^{\pmb{\lambda}^T{\bf x}_j}}-\pmb{\mu}={\bf 0}.

Similarly to empirical likelihood, a numerical search over \pmb{\lambda} is required.

We can derive the asymptotic form of the test statistic in the two-sample means case, but in a simpler form, generalizing the approach of Jing and Robinson (1997) to the multivariate case as follows. The three constraints are

\begin{array}{ccc} \left(\sum_{j=1}^{n_1}e^{\pmb{\lambda}_1^T{\bf x}_j}\right)^{-1}\left(\sum_{i=1}^{n_1}{\bf x}_ie^{\pmb{\lambda}_1^T {\bf x}_i}\right)-\pmb{\mu} & = & {\bf 0} \\ \left(\sum_{j=1}^{n_2}e^{\pmb{\lambda}_2^T{\bf y}_j}\right)^{-1}\left(\sum_{i=1}^{n_2}{\bf y}_ie^{\pmb{\lambda}_2^T {\bf y}_i}\right)-\pmb{\mu} & = & {\bf 0} \\ n_1\pmb{\lambda}_1+n_2\pmb{\lambda}_2 & = & {\bf 0}. \end{array}

Similarly to EL, the sum of a linear combination of the \pmb{\lambda}_j is set to zero. We can equate the first two constraints:

\left(\sum_{j=1}^{n_1}e^{\pmb{\lambda}_1^T{\bf x}_j}\right)^{-1}\left(\sum_{i=1}^{n_1}{\bf x}_ie^{\pmb{\lambda}_1^T {\bf x}_i}\right)= \left(\sum_{j=1}^{n_2}e^{\pmb{\lambda}_2^T{\bf y}_j}\right)^{-1}\left(\sum_{i=1}^{n_2}{\bf y}_ie^{\pmb{\lambda}_2^T {\bf y}_i}\right).

Also, we can write the third constraint as \pmb{\lambda}_2=-\frac{n_1}{n_2}\pmb{\lambda}_1 and thus rewrite the first two constraints as

\left(\sum_{j=1}^{n_1}e^{\pmb{\lambda}^T{\bf x}_j}\right)^{-1}\left(\sum_{i=1}^{n_1}{\bf x}_ie^{\pmb{\lambda}^T {\bf x}_i}\right) = \left(\sum_{j=1}^{n_2}e^{-\frac{n_1}{n_2}\pmb{\lambda}^T{\bf y}_j}\right)^{-1}\left(\sum_{i=1}^{n_2}{\bf y}_ie^{-\frac{n_1}{n_2}\pmb{\lambda}^T {\bf y}_i}\right).

This trick allows us to avoid the estimation of the common mean. It is not possible, though, to do this in the empirical likelihood method. Instead of minimising the sum of the one-sample test statistics from the common mean, we can define the probabilities by searching for the \pmb{\lambda} which makes the last equation hold true. The third constraint is convenient, but Jing and Robinson (1997) mention that, even though it is simple, it does not lead to second-order accurate confidence intervals unless the two sample sizes are equal. Asymptotically, the test statistic follows a \chi^2_d distribution under the null hypothesis.

Value

A list including:

test

The empirical likelihood test statistic value.

modif.test

The modified test statistic, either via the chi-square or the F distribution.

dof

The degrees of freedom of the chi-square or the F distribution.

pvalue

The asymptotic or the bootstrap p-value.

mu

The estimated common mean vector.

runtime

The runtime of the bootstrap calibration.

Author(s)

Michail Tsagris.

R implementation and documentation: Michail Tsagris [email protected].

References

Efron B. (1981). Nonparametric standard errors and confidence intervals. Canadian Journal of Statistics, 9(2): 139–158.

Jing B.Y. and Wood A.T.A. (1996). Exponential empirical likelihood is not Bartlett correctable. Annals of Statistics, 24(1): 365–369.

Jing B.Y. and Robinson J. (1997). Two-sample nonparametric tilting method. Australian Journal of Statistics, 39(1): 25–34.

Owen A.B. (2001). Empirical likelihood. Chapman and Hall/CRC Press.

Preston S.P. and Wood A.T.A. (2010). Two-Sample Bootstrap Hypothesis Tests for Three-Dimensional Labelled Landmark Data. Scandinavian Journal of Statistics 37(4): 568–587.

Tsagris M., Preston S. and Wood A.T.A. (2017). Nonparametric hypothesis testing for equality of means on the simplex. Journal of Statistical Computation and Simulation, 87(2): 406–422.

See Also

el.test2, maovjames, maov, hotel2T2, james

Examples

y1 = as.matrix(iris[1:25, 1:4])
y2 = as.matrix(iris[26:50, 1:4])
eel.test2(y1, y2)

Hotelling's multivariate version of the 1 sample t-test for Euclidean data

Description

Hotelling's test for testing one Euclidean population mean vector.

Usage

hotel1T2(x, M, a = 0.05, R = 999, graph = FALSE)

Arguments

x

A matrix containing Euclidean data.

a

The significance level, set to 0.05 by default.

M

The hypothesized mean vector.

R

If R is 1 no bootstrap calibration is performed and the classical p-value via the F distribution is returned. If R is greater than 1, the bootstrap p-value is returned.

graph

A boolean variable which is taken into consideration only when bootstrap calibration is performed. If TRUE, the histogram of the bootstrap test statistic values is plotted.

Details

The hypothesis test is that a mean vector is equal to some specified vector, H_0:\pmb{\mu}=\pmb{\mu}_0. We assume that \pmb{\Sigma} is unknown. The first approach to this hypothesis test is parametric, using Hotelling's T^2 test (Mardia, Kent and Bibby, 1979, pg. 125-126). The test statistic is given by

T^2=\frac{\left(n-p\right)n}{\left(n-1\right)p}\left(\bar{{\bf X}}-\pmb{\mu}\right)^T{\bf S}^{-1}\left(\bar{{\bf X}}-\pmb{\mu} \right).

Under the null hypothesis, the above test statistic follows the F_{p,n-p} distribution. The bootstrap version of the one-sample multivariate generalization of the simple t-test is also included in the function. An extra argument (R) indicates whether bootstrap calibration should be used or not. If R = 1, then the asymptotic theory applies; if R > 1, then the bootstrap p-value is returned and the number of re-samples is equal to R.
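The statistic and its F-based p-value can be reproduced by hand. The following is a sketch with simulated data (illustrative only, not the package's code):

```r
## Hand computation of the one-sample T^2 and its F p-value.
set.seed(3)
x <- matrix(rnorm(100 * 4), ncol = 4)
mu0 <- numeric(4)                     # hypothesized mean vector
n <- nrow(x);  p <- ncol(x)
xbar <- colMeans(x)
S <- cov(x)                           # unbiased covariance estimator
T2 <- as.numeric( (n - p) * n / ((n - 1) * p) *
        t(xbar - mu0) %*% solve(S) %*% (xbar - mu0) )
pvalue <- pf(T2, p, n - p, lower.tail = FALSE)
c(statistic = T2, pvalue = pvalue)
```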

Value

A list including:

m

The sample mean vector.

info

The test statistic, the p-value, the critical value and the degrees of freedom of the F distribution (numerator and denominator). This is given if no bootstrap calibration is employed.

pvalue

The bootstrap p-value, if bootstrap is employed.

runtime

The runtime of the bootstrap calibration.

Author(s)

Michail Tsagris.

R implementation and documentation: Michail Tsagris [email protected].

References

Mardia K.V., Kent J.T. and Bibby J.M. (1979). Multivariate Analysis. London: Academic Press.

See Also

eel.test1, el.test1, james, hotel2T2, maov, el.test2

Examples

x <- matrix( rnorm( 100 * 4), ncol = 4)
hotel1T2(x, numeric(4), R = 1)
hotel1T2(x, numeric(4), R = 999, graph = TRUE)

Hotelling's multivariate version of the 2 sample t-test for Euclidean data

Description

Hotelling's test for testing the equality of two Euclidean population mean vectors.

Usage

hotel2T2(x1, x2, a = 0.05, R = 999, graph = FALSE)

Arguments

x1

A matrix containing the Euclidean data of the first group.

x2

A matrix containing the Euclidean data of the second group.

a

The significance level, set to 0.05 by default.

R

If R is 1 no bootstrap calibration is performed and the classical p-value via the F distribution is returned. If R is greater than 1, the bootstrap p-value is returned.

graph

A boolean variable which is taken into consideration only when bootstrap calibration is performed. If TRUE, the histogram of the bootstrap test statistic values is plotted.

Details

The first case scenario is when we assume equality of the two covariance matrices. This is called the two-sample Hotelling's T^2 test (Mardia, Kent and Bibby, 1979, pg. 131-140; Everitt, 2005, pg. 139). The test statistic is defined as

T^2=\frac{n_1n_2}{n_1+n_2}\left(\bar{{\bf X}}_1- \bar{{\bf X}}_2\right)^T{\bf S}^{-1}\left(\bar{{\bf X}}_1- \bar{{\bf X}}_2\right),

where {\bf S} is the pooled covariance matrix calculated under the assumption of equal covariance matrices, {\bf S}=\frac{\left(n_1-1\right){\bf S}_1+\left(n_2-1\right){\bf S}_2}{n_1+n_2-2}. Under H_0 the statistic F given by

F=\frac{\left( n_1+n_2-p-1 \right)T^2}{\left(n_1+n_2-2 \right)p}

follows the F distribution with p and n_1+n_2-p-1 degrees of freedom. Similar to the one-sample test, an extra argument (R) indicates whether bootstrap calibration should be used or not. If R = 1, then the asymptotic theory applies; if R > 1, then the bootstrap p-value is returned and the number of re-samples is equal to R. The estimate of the common mean used in the bootstrap to transform the data under the null hypothesis is the mean vector of the combined sample, i.e. of all the observations.

The built-in command manova does exactly the same thing; try it and you will see the same asymptotic F test. In addition, that command allows mean vector hypothesis testing for more than two groups. I noticed this command after I had written my function; nevertheless, as I mention in the introduction, this document has an educational character as well.
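As a sketch, the two-sample statistic and its F version can be computed by hand on the data used in the Examples (illustrative only, not the package's code):

```r
## Hand computation of the two-sample T^2 with pooled covariance.
x1 <- as.matrix(iris[1:25, 1:4])
x2 <- as.matrix(iris[26:50, 1:4])
n1 <- nrow(x1);  n2 <- nrow(x2);  p <- ncol(x1)
d <- colMeans(x1) - colMeans(x2)
## pooled covariance matrix under the equal-covariance assumption
S <- ( (n1 - 1) * cov(x1) + (n2 - 1) * cov(x2) ) / (n1 + n2 - 2)
T2 <- as.numeric( n1 * n2 / (n1 + n2) * t(d) %*% solve(S) %*% d )
Fstat <- (n1 + n2 - p - 1) * T2 / ((n1 + n2 - 2) * p)
pvalue <- pf(Fstat, p, n1 + n2 - p - 1, lower.tail = FALSE)
c(T2 = T2, F = Fstat, pvalue = pvalue)
```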

Value

A list including:

mesoi

The two mean vectors.

info

The test statistic, the p-value, the critical value and the degrees of freedom of the F distribution (numerator and denominator). This is given if no bootstrap calibration is employed.

pvalue

The bootstrap p-value, if bootstrap is employed.

note

A message informing the user that bootstrap calibration has been employed.

runtime

The runtime of the bootstrap calibration.

Author(s)

Michail Tsagris.

R implementation and documentation: Michail Tsagris [email protected].

References

Everitt B. (2005). An R and S-Plus Companion to Multivariate Analysis. Springer.

Mardia K.V., Kent J.T. and Bibby J.M. (1979). Multivariate Analysis. London: Academic Press.

Tsagris M., Preston S. and Wood A.T.A. (2017). Nonparametric hypothesis testing for equality of means on the simplex. Journal of Statistical Computation and Simulation, 87(2): 406–422.

See Also

james, maov, el.test2, eel.test2

Examples

hotel2T2( as.matrix(iris[1:25, 1:4]), as.matrix(iris[26:50, 1:4]) )
hotel2T2( as.matrix(iris[1:25, 1:4]), as.matrix(iris[26:50, 1:4]), R = 1 )

Hypothesis test for two high-dimensional mean vectors

Description

Hypothesis test for two high-dimensional mean vectors.

Usage

sarabai(x1, x2)

Arguments

x1

A matrix containing the Euclidean data of the first group.

x2

A matrix containing the Euclidean data of the second group.

Details

High dimensional data are multivariate data which have many variables (p) and usually a small number of observations (n). It also happens that p>n, and this is the case considered here. We will see a simple test for the case of p>n. In this case, the covariance matrix is not invertible and, in addition, it can have many zero eigenvalues.

The test we will see was proposed by Bai and Saranadasa (1996). Ever since, there have been more suggestions, but I chose this one for its simplicity. There are two datasets, {\bf X}_1 and {\bf X}_2, of sample sizes n_1 and n_2, respectively. Their corresponding sample mean vectors and covariance matrices are \bar{{\bf X}}_1, \bar{{\bf X}}_2 and {\bf S}_1, {\bf S}_2, respectively. The assumption here is the same as that of the Hotelling's test we saw before.

Let us define the pooled covariance matrix at first, calculated under the assumption of equal covariance matrices, {\bf S}_n=\frac{\left(n_1-1\right){\bf S}_1+\left(n_2-1\right){\bf S}_2}{n}, where n=n_1+n_2. Then define B_n=\sqrt{ \frac{n^2}{\left(n+2\right)\left(n-1\right)}\left\lbrace\text{tr}\left({\bf S}_n^2\right)- \frac{1}{n}\left[\text{tr}\left({\bf S}_n\right)\right]^2 \right\rbrace }. The test statistic is

Z=\frac{\frac{n_1n_2}{n_1+n_2}\left(\bar{{\bf X}}_1-\bar{{\bf X}}_2\right)^T\left(\bar{{\bf X}}_1-\bar{{\bf X}}_2\right) -\text{tr}\left({\bf S}_n\right)}{\sqrt{\frac{2\left(n+1\right)}{n}}B_n}.

Under the null hypothesis (equality of the two mean vectors) the test statistic follows the standard normal distribution. Bai and Saranadasa (1996) established the asymptotic normality of the test statistic and showed that it has an attractive power property when p/n \rightarrow c < \infty and under some restriction on the maximum eigenvalue of the common population covariance matrix. However, the requirement that p and n be of the same order is too restrictive for the "large p small n" situation.
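The statistic is straightforward to compute by hand. The following is a sketch on simulated null data with p > n (illustrative only, not the package's code); the p-value shown is one-sided, in the upper tail of the normal:

```r
## Hand computation of the Bai-Saranadasa statistic (p > n case).
set.seed(4)
x1 <- matrix(rnorm(40 * 100), ncol = 100)
x2 <- matrix(rnorm(50 * 100), ncol = 100)
n1 <- nrow(x1);  n2 <- nrow(x2);  n <- n1 + n2
Sn <- ( (n1 - 1) * cov(x1) + (n2 - 1) * cov(x2) ) / n   # pooled, divided by n
trSn  <- sum(diag(Sn))
trSn2 <- sum(Sn * Sn)               # tr(Sn^2) for a symmetric matrix
Bn <- sqrt( n^2 / ((n + 2) * (n - 1)) * (trSn2 - trSn^2 / n) )
d <- colMeans(x1) - colMeans(x2)
Z <- ( n1 * n2 / (n1 + n2) * sum(d^2) - trSn ) /
     ( sqrt(2 * (n + 1) / n) * Bn )
pvalue <- pnorm(Z, lower.tail = FALSE)   # one-sided upper-tail p-value
c(statistic = Z, pvalue = pvalue)
```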

Value

A vector with the test statistic and the p-value.

Author(s)

Michail Tsagris.

R implementation and documentation: Michail Tsagris [email protected].

References

Bai Z. D. and Saranadasa H. (1996). Effect of high dimension: by an example of a two sample problem. Statistica Sinica, 6(2): 311–329.

See Also

hotel2T2, maov, el.test2, eel.test2

Examples

x1 <- matrix( rnorm(40 * 100), ncol = 100 )
x2 <- matrix( rnorm(50 * 100), ncol = 100 )
sarabai(x1, x2)

James multivariate version of the t-test

Description

James test for testing the equality of two population mean vectors without assuming equality of the covariance matrices.

Usage

james(y1, y2, a = 0.05, R = 999, graph = FALSE)

Arguments

y1

A matrix containing the Euclidean data of the first group.

y2

A matrix containing the Euclidean data of the second group.

a

The significance level, set to 0.05 by default.

R

If R is 1 no bootstrap calibration is performed and the classical p-value via the F distribution is returned. If R is greater than 1, the bootstrap p-value is returned.

graph

A boolean variable which is taken into consideration only when bootstrap calibration is performed. If TRUE the histogram of the bootstrap test statistic values is plotted.

Details

Here we show the modified version of the two-sample T^2 test (function hotel2T2) in the case where the two covariance matrices cannot be assumed to be equal.

James (1954) proposed a test for linear hypotheses of the population means when the variances (or the covariance matrices) are not known. Its form for two p-dimensional samples is:

T^2_u=\left(\bar{{\bf X}}_1-\bar{{\bf X}}_2\right)^T\tilde{{\bf S}}^{-1}\left(\bar{{\bf X}}_1-\bar{{\bf X}}_2\right),

where \tilde{{\bf S}}=\tilde{{\bf S}}_1+\tilde{{\bf S}}_2=\frac{{\bf S}_1}{n_1}+\frac{{\bf S}_2}{n_2}.

James (1954) suggested that the test statistic is compared with 2h\left(\alpha\right), a corrected \chi^2 distribution whose form is

2h\left(\alpha\right)=\chi^2\left(A+B\chi^2\right),

where A=1+\frac{1}{2p}\sum_{i=1}^2\frac{\left(\text{tr}\,\tilde{{\bf S}}^{-1}\tilde{{\bf S}}_i\right)^2}{n_i-1} and B=\frac{1}{p\left(p+2\right)}\left[\sum_{i=1}^2\frac{\text{tr}\left(\tilde{{\bf S}}^{-1}\tilde{{\bf S}}_i\right)^2}{n_i-1}+\frac{1}{2}\sum_{i=1}^2\frac{\left(\text{tr}\,\tilde{{\bf S}}^{-1}\tilde{{\bf S}}_i\right)^2}{n_i-1} \right].

If you want to use the bootstrap to get the p-value, then you must transform the data under the null hypothesis. The estimate of the common mean is given by Aitchison (1986):

\hat{\pmb{\mu}}_c = \left(n_1{\bf S}_1^{-1}+n_2{\bf S}_2^{-1}\right)^{-1}\left(n_1{\bf S}_1^{-1}\bar{{\bf X}}_1+n_2{\bf S}_2^{-1}\bar{{\bf X}}_2\right)= \left(\tilde{{\bf S}}_1^{-1}+\tilde{{\bf S}}_2^{-1}\right)^{-1}\left(\tilde{{\bf S}}_1^{-1}\bar{{\bf X}}_1+\tilde{{\bf S}}_2^{-1}\bar{{\bf X}}_2\right).

The modified Nel and van der Merwe (1986) test is based on the same quadratic form as that of James (1954), but the distribution used to compare the value of the test statistic is different. It is shown in Krishnamoorthy and Xia (2006) that T^2_u \sim \frac{\nu p}{\nu-p+1}F_{p,\nu-p+1} approximately, where \nu=\frac{p+p^2}{\frac{1}{n_1}\left\lbrace \text{tr}\left[ \left( {\bf S}_1\tilde{{\bf S}} \right)^2\right]+ \text{tr}\left[ \left( {\bf S}_1\tilde{{\bf S}} \right)\right]^2 \right\rbrace + \frac{1}{n_2}\left\lbrace \text{tr}\left[ \left( {\bf S}_2\tilde{{\bf S}}\right)^2\right]+ \text{tr}\left[ \left( {\bf S}_2\tilde{{\bf S}} \right)\right]^2 \right\rbrace }.

The algorithm is taken from Krishnamoorthy and Yu (2004).
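A sketch of the James statistic and its corrected \chi^2 critical value, computed from the formulas above on the data of the Examples (illustrative only, not the package's code):

```r
## Hand computation of James' T^2_u and the corrected critical value.
y1 <- as.matrix(iris[1:25, 1:4])
y2 <- as.matrix(iris[26:50, 1:4])
n1 <- nrow(y1);  n2 <- nrow(y2);  p <- ncol(y1);  a <- 0.05
S1t <- cov(y1) / n1                     # tilde(S)_1 = S_1 / n_1
S2t <- cov(y2) / n2                     # tilde(S)_2 = S_2 / n_2
Sti <- solve(S1t + S2t)                 # tilde(S)^{-1}
d <- colMeans(y1) - colMeans(y2)
T2u <- as.numeric( t(d) %*% Sti %*% d )
t1 <- sum(diag(Sti %*% S1t));  t2 <- sum(diag(Sti %*% S2t))
q1 <- sum(diag(Sti %*% S1t %*% Sti %*% S1t))
q2 <- sum(diag(Sti %*% S2t %*% Sti %*% S2t))
A <- 1 + ( t1^2 / (n1 - 1) + t2^2 / (n2 - 1) ) / (2 * p)
B <- ( q1 / (n1 - 1) + q2 / (n2 - 1) +
       0.5 * ( t1^2 / (n1 - 1) + t2^2 / (n2 - 1) ) ) / ( p * (p + 2) )
chi <- qchisq(1 - a, p)
crit <- chi * (A + B * chi)             # corrected critical value 2h(alpha)
c(T2u = T2u, crit = crit, reject = as.numeric(T2u > crit))
```

Since A > 1 and B > 0, the corrected critical value always exceeds the uncorrected \chi^2 quantile.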

Value

A list including:

note

A message informing the user about the test used.

mesoi

The two mean vectors.

info

The test statistic, the p-value, the correction factor and the corrected critical value of the chi-square distribution if the James test has been used or, the test statistic, the p-value, the critical value and the degrees of freedom (numerator and denominator) of the F distribution if the modified James test has been used.

pvalue

The bootstrap p-value if bootstrap is employed.

runtime

The runtime of the bootstrap calibration.

Author(s)

Michail Tsagris.

R implementation and documentation: Michail Tsagris [email protected].

References

Aitchison J. (1986). The statistical analysis of compositional data. Chapman & Hall.

James G.S. (1954). Tests of Linear Hypotheses in Univariate and Multivariate Analysis when the Ratios of the Population Variances are Unknown. Biometrika, 41(1/2): 19–43.

Krishnamoorthy K. and Yu J. (2004). Modified Nel and Van der Merwe test for the multivariate Behrens-Fisher problem. Statistics & Probability Letters, 66(2): 161–169.

Krishnamoorthy K. and Xia Y. (2006). On Selecting Tests for Equality of Two Normal Mean Vectors. Multivariate Behavioral Research, 41(4): 533–548.

Tsagris M., Preston S. and Wood A.T.A. (2017). Nonparametric hypothesis testing for equality of means on the simplex. Journal of Statistical Computation and Simulation, 87(2): 406–422.

See Also

hotel2T2, maovjames, el.test2, eel.test2

Examples

james( as.matrix(iris[1:25, 1:4]), as.matrix(iris[26:50, 1:4]), R = 1 )
james( as.matrix(iris[1:25, 1:4]), as.matrix(iris[26:50, 1:4]), R = 2 )
james( as.matrix(iris[1:25, 1:4]), as.matrix(iris[26:50, 1:4]) )

Log-likelihood ratio test for equality of one covariance matrix

Description

Log-likelihood ratio test for equality of one covariance matrix.

Usage

equal.cov(x, Sigma, a = 0.05)

Arguments

x

A matrix containing Euclidean data.

Sigma

The hypothesis covariance matrix.

a

The significance level, set to 0.05 by default.

Details

The hypothesis test is that the sample covariance matrix is equal to some specified covariance matrix: H_0:\pmb{\Sigma}=\pmb{\Sigma}_0, with \pmb{\mu} unknown. The algorithm for this test is taken from Mardia, Kent and Bibby (1979, pg. 126-127). The test is based upon the log-likelihood ratio and takes the form

-2\log{\lambda}=n\,\text{tr}\left\lbrace \pmb{\Sigma}_0^{-1}{\bf S}\right\rbrace-n\log{\left|\pmb{\Sigma}_0^{-1}{\bf S} \right|}-np,

where n is the sample size, \pmb{\Sigma}_0 is the specified covariance matrix under the null hypothesis, {\bf S} is the sample covariance matrix and p is the dimensionality of the data (the number of variables). Let \alpha and g denote the arithmetic mean and the geometric mean, respectively, of the eigenvalues of \pmb{\Sigma}_0^{-1}{\bf S}, so that \text{tr}\left\lbrace \pmb{\Sigma}_0^{-1}{\bf S}\right\rbrace=p\alpha and \left|\pmb{\Sigma}_0^{-1}{\bf S} \right|=g^p. The test statistic then becomes

-2\log{\lambda}=np\left(\alpha-\log{(g)}-1 \right).

The degrees of freedom of the \chi^2 distribution are \frac{1}{2}p\left(p+1\right).
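A minimal base-R sketch of the eigenvalue form of the statistic above; the variable names are illustrative, and cov(), the unbiased estimate, is assumed here, while the package may use the maximum likelihood version.

```r
## -2 log(lambda) for H0: Sigma = Sigma_0, via the arithmetic and
## geometric means of the eigenvalues of Sigma_0^{-1} S
x <- as.matrix(iris[, 1:4])
n <- nrow(x);  p <- ncol(x)
S <- cov(x)                 ## sample covariance matrix
Sigma0 <- 1.5 * S           ## hypothesised covariance matrix
lam <- Re( eigen( solve(Sigma0) %*% S, only.values = TRUE )$values )
a <- mean(lam)              ## arithmetic mean of the eigenvalues
g <- prod(lam)^(1 / p)      ## geometric mean of the eigenvalues
stat <- n * p * (a - log(g) - 1)
dof  <- 0.5 * p * (p + 1)
pval <- pchisq(stat, dof, lower.tail = FALSE)
```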

Value

A vector with the test statistic, the p-value, the degrees of freedom and the critical value of the test.

Author(s)

Michail Tsagris.

R implementation and documentation: Michail Tsagris [email protected].

References

Mardia K.V., Kent J.T. and Bibby J.M. (1979). Multivariate Analysis. London: Academic Press.

See Also

likel.cov, Mtest.cov

Examples

x <- as.matrix( iris[, 1:4] )
s <- cov(x) * 1.5
equal.cov(x, s)

Log-likelihood ratio test for equality of two or more covariance matrices

Description

Log-likelihood ratio test for equality of two or more covariance matrices.

Usage

likel.cov(x, ina, a = 0.05)

Arguments

x

A matrix containing Euclidean data.

ina

A vector denoting the groups of the data.

a

The significance level, set to 0.05 by default.

Details

The hypothesis test is that of equality of two or more covariance matrices: H_0:\pmb{\Sigma}_1=\ldots=\pmb{\Sigma}_k. The algorithm is taken from Mardia, Kent and Bibby (1979, pg. 140). The log-likelihood ratio test is the multivariate generalisation of Bartlett's test of homogeneity of variances. The test statistic takes the following form

-2\log{\lambda}=n\log{\left|{\bf S}\right|}-\sum_{i=1}^kn_i\log{\left|{\bf S}_i\right|}=\sum_{i=1}^kn_i\log{\left|{\bf S}_i^{-1}{\bf S}\right|},

where {\bf S}_i is the i-th sample biased covariance matrix and {\bf S}=n^{-1}\sum_{i=1}^kn_i{\bf S}_i is the maximum likelihood estimate of the common covariance matrix (under the null hypothesis), with n=\sum_{i=1}^kn_i. The degrees of freedom of the asymptotic chi-square distribution are \frac{1}{2}p\left(p+1\right)\left(k-1\right).
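A minimal base-R sketch of the statistic on the iris groups, using the biased (maximum likelihood) covariance estimates as above; the variable names are illustrative.

```r
## -2 log(lambda) for H0: Sigma_1 = ... = Sigma_k
x   <- as.matrix(iris[, 1:4])
ina <- iris[, 5]
p   <- ncol(x);  k <- nlevels(ina)
ni  <- tabulate(ina);  n <- sum(ni)
Si  <- lapply( levels(ina), function(l) {
         xi <- x[ina == l, , drop = FALSE]
         cov(xi) * (nrow(xi) - 1) / nrow(xi)   ## biased estimate
       } )
S    <- Reduce( "+", Map("*", Si, ni) ) / n    ## pooled MLE under H0
stat <- n * log( det(S) ) - sum( ni * sapply( Si, function(s) log(det(s)) ) )
dof  <- 0.5 * p * (p + 1) * (k - 1)
pval <- pchisq(stat, dof, lower.tail = FALSE)
```

For the iris data the three group covariance matrices differ markedly, so the test rejects the null hypothesis.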

Value

A vector with the test statistic, the p-value, the degrees of freedom and the critical value of the test.

Author(s)

Michail Tsagris.

R implementation and documentation: Michail Tsagris [email protected].

References

Mardia K.V., Kent J.T. and Bibby J.M. (1979). Multivariate Analysis. London: Academic Press.

See Also

equal.cov, Mtest.cov

Examples

x <- as.matrix( iris[, 1:4] )
ina <- iris[, 5]
likel.cov(x, ina)

Multivariate analysis of variance (James test)

Description

Multivariate analysis of variance without assuming equality of the covariance matrices.

Usage

maovjames(x, ina, a = 0.05)

Arguments

x

A matrix containing Euclidean data.

ina

A numerical or factor variable indicating the groups of the data.

a

The significance level, set to 0.05 by default.

Details

James (1954) also proposed an alternative to MANOVA when the covariance matrices are not assumed equal. The test statistic for kk samples is

J=\sum_{i=1}^k\left(\bar{{\bf x}}_i-\bar{{\bf X}}\right)^T{\bf W}_i\left(\bar{{\bf x}}_i-\bar{{\bf X}}\right),

where \bar{{\bf x}}_i and n_i are the sample mean vector and sample size of the i-th sample respectively, {\bf W}_i=\left(\frac{{\bf S}_i}{n_i}\right)^{-1} with {\bf S}_i the covariance matrix of the i-th sample, and \bar{{\bf X}} is the estimate of the common mean, \bar{{\bf X}}=\left(\sum_{i=1}^k{\bf W}_i\right)^{-1}\sum_{i=1}^k{\bf W}_i\bar{{\bf x}}_i.

Normally one would compare the test statistic with a \chi^2_{r,1-\alpha} quantile, where r=p\left(k-1\right) are the degrees of freedom, with k denoting the number of groups and p the dimensionality of the data. The degrees of freedom arise from the r constraints (the number of univariate means that must be equal for the null hypothesis, that all the mean vectors are equal, to hold true). James (1954) compared the test statistic with a corrected \chi^2 distribution instead. Let A and B be A= 1+\frac{1}{2r}\sum_{i=1}^k\frac{\left[\text{tr}\left({\bf I}_p-{\bf W}^{-1}{\bf W}_i\right)\right]^2}{n_i-1} and B= \frac{1}{r\left(r+2\right)}\sum_{i=1}^k\left\lbrace\frac{\text{tr}\left[\left({\bf I}_p-{\bf W}^{-1}{\bf W}_i\right)^2\right]}{n_i-1}+\frac{\left[\text{tr}\left({\bf I}_p-{\bf W}^{-1}{\bf W}_i\right)\right]^2}{2\left(n_i-1\right)}\right\rbrace.

The corrected quantile of the \chi^2 distribution is given as before by 2h\left(\alpha\right)=\chi^2\left(A+B\chi^2\right).
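A base-R sketch of the statistic J and its corrected critical value on the iris data; the variable names are illustrative, and cov(), the unbiased estimate, is assumed for {\bf S}_i.

```r
## James' MANOVA statistic and the corrected chi-square critical value
x   <- as.matrix(iris[, 1:4])
ina <- iris[, 5]
p   <- ncol(x);  k <- nlevels(ina);  a <- 0.05
ni  <- tabulate(ina)
mi  <- rowsum(x, ina) / ni                ## k x p matrix of sample means
Wi  <- lapply( 1:k, function(i)
         ni[i] * solve( cov( x[ina == levels(ina)[i], ] ) ) )
W   <- Reduce("+", Wi)
Xb  <- as.numeric( solve( W, Reduce("+",
         lapply( 1:k, function(i) Wi[[i]] %*% mi[i, ] ) ) ) )  ## common mean
J   <- sum( sapply( 1:k, function(i) {
         d <- mi[i, ] - Xb
         t(d) %*% Wi[[i]] %*% d
       } ) )
r   <- p * (k - 1)
Mi  <- lapply( 1:k, function(i) diag(p) - solve(W) %*% Wi[[i]] )
tr1 <- sapply( Mi, function(M) sum( diag(M) ) )
tr2 <- sapply( Mi, function(M) sum( diag(M %*% M) ) )
A   <- 1 + sum( tr1^2 / (ni - 1) ) / (2 * r)
B   <- ( sum( tr2 / (ni - 1) ) + sum( tr1^2 / (2 * (ni - 1)) ) ) / ( r * (r + 2) )
chi  <- qchisq(1 - a, r)
crit <- chi * (A + B * chi)               ## corrected critical value
```

For the iris groups, J far exceeds the corrected critical value, so the hypothesis of equal mean vectors is rejected.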

Value

A vector with the following 4 elements:

test

The test statistic.

correction

The value of the correction factor.

corr.critical

The corrected critical value of the chi-square distribution.

p-value

The p-value of the corrected test statistic.

Author(s)

Michail Tsagris.

R implementation and documentation: Michail Tsagris [email protected].

References

James G.S. (1954). Tests of Linear Hypotheses in Univariate and Multivariate Analysis when the Ratios of the Population Variances are Unknown. Biometrika, 41(1/2): 19–43.

See Also

maov, hotel2T2, james

Examples

maov( as.matrix(iris[,1:4]), iris[,5] )
maovjames( as.matrix(iris[,1:4]), iris[,5] )

Multivariate analysis of variance assuming equality of the covariance matrices

Description

Multivariate analysis of variance assuming equality of the covariance matrices.

Usage

maov(x, ina)

Arguments

x

A matrix containing Euclidean data.

ina

A numerical or factor variable indicating the groups of the data.

Details

Multivariate analysis of variance assuming equality of the covariance matrices.

Value

A list including:

note

A message stating whether the F or the chi-square approximation has been used.

result

The test statistic and the p-value.

Author(s)

Michail Tsagris.

R implementation and documentation: Michail Tsagris [email protected].

References

Johnson R.A. and Wichern D.W. (2007, 6th Edition). Applied Multivariate Statistical Analysis, pg. 302–303.

Todorov V. and Filzmoser P. (2010). Robust Statistic for the One-way MANOVA. Computational Statistics & Data Analysis, 54(1): 37–48.

See Also

maovjames, hotel2T2, james

Examples

maov( as.matrix(iris[,1:4]), iris[,5] )
maovjames( as.matrix(iris[,1:4]), iris[,5] )

Relationship between Hotelling's T^2 test and James' MANOVA

Description

Relationship between Hotelling's T^2 test and James' MANOVA.

Usage

maovjames.hotel(x, ina)

Arguments

x

A matrix containing the Euclidean data of the first group.

ina

A numerical or factor variable indicating the groups of the data.

Details

The relationship shown for the James two-sample test (see the function james.hotel) also holds for the case of MANOVA. The estimate of the common mean, \pmb{\mu}_c (see the function james for the expression of \pmb{\mu}_c), is in general, for g groups each of sample size n_i, written as

\hat{\pmb{\mu}}_c = \left(\sum_{i=1}^gn_i{\bf S}_i^{-1}\right)^{-1}\sum_{i=1}^gn_i{\bf S}_i^{-1}\bar{{\bf X}}_i.

The function is simply a proof of the mathematics found in Emerson (2009, pg. 76–81) and is again intended for educational purposes.

Value

A list including:

test

The value of the test statistic, i.e. the sum of the one-sample Hotelling's test statistics computed using the common mean.

mc

The common mean.

Author(s)

Michail Tsagris.

R implementation and documentation: Michail Tsagris [email protected].

References

Emerson S. (2009). Small sample performance and calibration of the Empirical Likelihood method. PhD thesis, Stanford university.

James G.S. (1954). Tests of Linear Hypotheses in Univariate and Multivariate Analysis when the Ratios of the Population Variances are Unknown. Biometrika, 41(1/2): 19–43.

See Also

hotel2T2, maovjames, el.test2, eel.test2

Examples

maovjames.hotel( as.matrix(iris[, 1:4]), iris[, 5] )
maovjames( as.matrix(iris[, 1:4]), iris[, 5] )

Relationship between Hotelling's T^2 and James' test

Description

Relationship between Hotelling's T^2 and James' test.

Usage

james.hotel(x1, x2)

Arguments

x1

A matrix containing the Euclidean data of the first group.

x2

A matrix containing the Euclidean data of the second group.

Details

Emerson (2009, pg. 76–81) mentioned a very nice result relating the one-sample Hotelling's T^2 test and the James test for two mean vectors:

J\left(\pmb{\mu}\right) = T_1^2\left(\pmb{\mu}\right) + T_2^2\left(\pmb{\mu}\right),

where J\left(\pmb{\mu}\right) is the James test statistic (James, 1954) and T_1^2\left(\pmb{\mu}\right) and T_2^2\left(\pmb{\mu}\right) are the two one-sample Hotelling's T^2 test statistics (see function hotel1T2) of each sample from their common mean vector \pmb{\mu}_c (see the help file of james). In fact, the James test statistic is found by minimising the right-hand side of the above expression with respect to \pmb{\mu}. The sum is minimised when \pmb{\mu} takes the form of the common mean vector \pmb{\mu}_c. The same is true for the t-test in the univariate case.

This function illustrates that result and is intended for educational purposes. It calculates the James test statistic, the sum of the two T^2 test statistics, the common mean vector and the mean vector found via numerical optimisation. In the univariate case, the common mean is a weighted linear combination of the two sample means, so, if we take the segment connecting the two means, the common mean lies somewhere on that segment.
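A small numerical check of this result in base R; the variable names are illustrative, and cov(), the unbiased estimate, is assumed here.

```r
## T_i^2(mu) = n_i (xbar_i - mu)' S_i^{-1} (xbar_i - mu); the sum of the
## two statistics should be minimised at the common mean vector
x1 <- as.matrix(iris[1:50, 1:4])
x2 <- as.matrix(iris[51:100, 1:4])
n1 <- nrow(x1);  n2 <- nrow(x2)
m1 <- colMeans(x1);  m2 <- colMeans(x2)
s1i <- solve( cov(x1) );  s2i <- solve( cov(x2) )
tsum <- function(mu)
  drop( n1 * (m1 - mu) %*% s1i %*% (m1 - mu) +
        n2 * (m2 - mu) %*% s2i %*% (m2 - mu) )
## closed-form common mean (see the help file of james)
mc  <- as.numeric( solve( n1 * s1i + n2 * s2i,
                          n1 * s1i %*% m1 + n2 * s2i %*% m2 ) )
## the same point recovered by numerical minimisation
opt <- optim( (m1 + m2) / 2, tsum )
```

The minimiser returned by optim should agree with the closed-form common mean, and tsum evaluated at that point equals the James test statistic.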

Value

A list including:

tests

A vector with two values: the James test statistic value and the sum of the two Hotelling's test statistics using the common mean.

mathematics.mean

The common mean computed via the closed-form expression seen in the help file of james.

optimised.mean

The common mean vector obtained from the minimisation process.

Author(s)

Michail Tsagris.

R implementation and documentation: Michail Tsagris [email protected].

References

Emerson S. (2009). Small sample performance and calibration of the Empirical Likelihood method. PhD thesis, Stanford university.

James G.S. (1954). Tests of Linear Hypotheses in Univariate and Multivariate Analysis when the Ratios of the Population Variances are Unknown. Biometrika, 41(1/2): 19–43.

See Also

hotel2T2, maovjames, el.test2, eel.test2

Examples

james.hotel( as.matrix(iris[1:50, 1:4]), as.matrix(iris[51:100, 1:4]) )
james( as.matrix(iris[1:50, 1:4]), as.matrix(iris[51:100, 1:4]), R = 1 )

Repeated measures ANOVA (univariate data) using Hotelling's T^2 test

Description

Repeated measures ANOVA (univariate data) using Hotelling's T^2 test.

Usage

rm.hotel(x, a = 0.05)

Arguments

x

A numerical matrix with the repeated measurements. Each column contains the values of the repeated measurements.

a

The level of significance, default value is equal to 0.05.

Details

We now show how one can use Hotelling's T^2 test to analyse univariate repeated measures. Univariate analysis of variance for repeated measures is the classical approach, but this multivariate test can be used as well. In the repeated measures ANOVA case, we have many repeated observations from the same n subjects, usually at different time points, and the interest is in whether the means of the samples are equal, \mu_1=\mu_2=\ldots=\mu_k, assuming k repeated measurements. We can of course change this null hypothesis and test many combinations of means. The idea in any case is to construct a matrix of contrasts. We focus here on the first case only, where the null hypothesis and the matrix of contrasts {\bf C} are

\left( {\begin{array}{c} \mu_1=\mu_2 \\ \mu_2=\mu_3 \\ \vdots \\ \mu_{k-1}=\mu_k \end{array}} \right)= \left( {\begin{array}{ccccc} 1 & -1 & 0 & \ldots & 0 \\ 1 & 0 & -1 & \ldots & 0 \\ \vdots & \vdots & \vdots & \vdots & \vdots \\ 1 & 0 & 0 & \ldots & -1 \\ \end{array}} \right)\pmb{\mu}={\bf C}\pmb{\mu}.

The contrast matrix {\bf C} has k-1 linearly independent rows and, if there is no treatment effect, {\bf C}\pmb{\mu}={\bf 0}.

The test statistic is

T_r^2=\frac{\left(n-k+1\right)}{\left(n-1\right)\left(k-1\right)}n\left({\bf C}\bar{\bf x}\right)^T \left({\bf CSC}^T\right)^{-1}\left({\bf C}\bar{\bf x}\right) \sim F_{k-1,n-k+1}.
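A base-R sketch of the contrast matrix and the statistic, treating the four iris columns as repeated measurements as in the example below; the variable names are illustrative.

```r
## Repeated measures via Hotelling's T^2: contrast matrix C and the F test
x <- as.matrix(iris[, 1:4])     ## pretend the columns are repeated measures
n <- nrow(x);  k <- ncol(x)
C <- cbind(1, -diag(k - 1))     ## rows: mu_1 - mu_2, mu_1 - mu_3, ...
m <- colMeans(x)
S <- cov(x)
Tr2 <- (n - k + 1) / ( (n - 1) * (k - 1) ) * n *
       drop( t(C %*% m) %*% solve( C %*% S %*% t(C) ) %*% (C %*% m) )
pval <- pf(Tr2, k - 1, n - k + 1, lower.tail = FALSE)
```

The iris column means differ substantially, so the hypothesis of equal means is rejected.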

Value

A list including:

m

The mean vector.

result

A vector with the test statistic value, its associated p-value, the numerator and denominator degrees of freedom and the critical value.

Author(s)

Michail Tsagris.

R implementation and documentation: Michail Tsagris [email protected].

See Also

maov, hotel2T2, james

Examples

x <- as.matrix(iris[, 1:4]) ## assume they are repeated measurements
rm.hotel(x)