Package 'mvhtests' reference manual

Title:	Multivariate Hypothesis Tests
Description:	Hypothesis tests for multivariate data. Tests for one and two mean vectors, multivariate analysis of variance, tests for one, two or more covariance matrices. References include: Mardia K.V., Kent J.T. and Bibby J.M. (1979). Multivariate Analysis. ISBN: 978-0124712522. London: Academic Press.
Authors:	Michail Tsagris [aut, cre]
Maintainer:	Michail Tsagris <mtsagris@uoc.gr>
License:	GPL (>= 2)
Version:	1.1
Built:	2025-03-10 06:41:09 UTC
Source:	CRAN

Multivariate Hypothesis Tests

Description

Multivariate Hypothesis Tests.

Details

Package:	mvhtests
Type:	Package
Version:	1.1
Date:	2025-01-08
License:	GPL-2

Maintainers

Michail Tsagris mtsagris@uoc.gr.

Author(s)

Michail Tsagris mtsagris@uoc.gr.

References

Aitchison J. (1986). The statistical analysis of compositional data. Chapman & Hall.

Amaral G.J.A., Dryden I.L. and Wood A.T.A. (2007). Pivotal bootstrap methods for k-sample problems in directional statistics and shape analysis. Journal of the American Statistical Association, 102(478): 695–707.

Efron B. (1981). Nonparametric standard errors and confidence intervals. Canadian Journal of Statistics, 9(2): 139–158.

Emerson S. (2009). Small sample performance and calibration of the Empirical Likelihood method. PhD thesis, Stanford University.

Everitt B. (2005). An R and S-Plus Companion to Multivariate Analysis. Springer.

James G.S. (1954). Tests of Linear Hypotheses in Univariate and Multivariate Analysis when the Ratios of the Population Variances are Unknown. Biometrika, 41(1/2): 19–43.

Jing B.Y. and Wood A.T.A. (1996). Exponential empirical likelihood is not Bartlett correctable. Annals of Statistics, 24(1): 365–369.

Jing B.Y. and Robinson J. (1997). Two-sample nonparametric tilting method. Australian Journal of Statistics, 39(1): 25–34.

Johnson R.A. and Wichern D.W. (2007, 6th Edition). Applied Multivariate Statistical Analysis.

Krishnamoorthy K. and Yu J. (2004). Modified Nel and Van der Merwe test for the multivariate Behrens-Fisher problem. Statistics & Probability Letters, 66(2): 161–169.

Krishnamoorthy K. and Yanping X. (2006). On Selecting Tests for Equality of Two Normal Mean Vectors. Multivariate Behavioral Research, 41(4): 533–548.

Mardia K.V., Kent J.T. and Bibby J.M. (1979). Multivariate Analysis. London: Academic Press.

Owen A.B. (1988). Empirical likelihood ratio confidence intervals for a single functional. Biometrika, 75(2): 237–249.

Owen A. (1990). Empirical likelihood ratio confidence regions. Annals of Statistics, 18(1): 90–120.

Owen A.B. (2001). Empirical likelihood. Chapman and Hall/CRC Press.

Preston S.P. and Wood A.T.A. (2010). Two-Sample Bootstrap Hypothesis Tests for Three-Dimensional Labelled Landmark Data. Scandinavian Journal of Statistics, 37(4): 568–587.

Todorov V. and Filzmoser P. (2010). Robust Statistic for the One-way MANOVA. Computational Statistics & Data Analysis, 54(1): 37–48.

Box's M test for equality of two or more covariance matrices

Description

Box's M test for equality of two or more covariance matrices.

Usage

Mtest.cov(x, ina, a = 0.05)

Arguments

`x`	A matrix containing Euclidean data.
`ina`	A vector denoting the groups of the data.
`a`	The significance level, set to 0.05 by default.

Details

According to Mardia, Kent and Bibby (1979, pg. 140), it may be argued that if $n_i$ is small, then the log-likelihood ratio test (function likel.cov) gives too much weight to the contribution of ${\bf S}$ . This consideration led Box (1949) to propose another test statistic in place of that seen in likel.cov . Box's $M$ is given by

$M=\gamma\sum_{i=1}^k\left(n_i-1\right)\log{\left|{\bf S}_{i}^{-1}{\bf S}_p \right|},$

where $\gamma=1-\frac{2p^2+3p-1}{6\left(p+1\right)\left(k-1\right)}\left(\sum_{i=1}^k\frac{1}{n_i-1}-\frac{1}{n-k}\right)$, and ${\bf S}_{i}$ and ${\bf S}_{p}$ are the $i$-th unbiased covariance estimator and the pooled covariance matrix, respectively, with ${\bf S}_p=\frac{\sum_{i=1}^k\left(n_i-1\right){\bf S}_i}{n-k}$. Box's $M$ also has an asymptotic $\chi^2$ distribution with $\frac{1}{2}p\left(p+1\right)\left(k-1\right)$ degrees of freedom. Box's approximation seems to be good if each $n_i$ exceeds 20 and if $k$ and $p$ do not exceed 5 (Mardia, Kent and Bibby, 1979, pg. 140).
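As an illustration, the statistic can be computed by hand in base R. This is only a sketch under the definitions above, not the code of Mtest.cov; the variable names (Si, Sp, gam, logdet) are chosen here for illustration, and the degrees of freedom are taken as $\frac{1}{2}p\left(p+1\right)\left(k-1\right)$, the standard value for Box's $M$.

```r
# Box's M computed directly from the formulas (base R only).
x   <- as.matrix(iris[, 1:4])
ina <- iris[, 5]

p  <- ncol(x)                    # number of variables
ni <- as.vector(table(ina))      # group sample sizes n_i
k  <- length(ni)                 # number of groups
n  <- sum(ni)                    # total sample size

# Unbiased covariance estimate S_i of each group
Si <- lapply(split(as.data.frame(x), ina), cov)

# Pooled covariance matrix S_p = sum((n_i - 1) S_i) / (n - k)
Sp <- Reduce(`+`, Map(function(S, m) (m - 1) * S, Si, ni)) / (n - k)

# log|S_i^{-1} S_p| = log|S_p| - log|S_i|
logdet <- function(S) as.numeric(determinant(S, logarithm = TRUE)$modulus)
ld <- logdet(Sp) - sapply(Si, logdet)

# Correction factor gamma and the statistic M
gam <- 1 - (2 * p^2 + 3 * p - 1) / (6 * (p + 1) * (k - 1)) *
  (sum(1 / (ni - 1)) - 1 / (n - k))
M <- gam * sum((ni - 1) * ld)

dof    <- 0.5 * p * (p + 1) * (k - 1)
pvalue <- pchisq(M, dof, lower.tail = FALSE)
c(M = M, dof = dof, pvalue = pvalue)
```

Working with log-determinants avoids forming ${\bf S}_i^{-1}{\bf S}_p$ explicitly, which is numerically safer when a group covariance matrix is close to singular.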

Value

A vector with the test statistic, the p-value, the degrees of freedom and the critical value of the test.

Author(s)

Michail Tsagris.

R implementation and documentation: Michail Tsagris mtsagris@uoc.gr.

References

Mardia K.V., Kent J.T. and Bibby J.M. (1979). Multivariate Analysis. London: Academic Press.

Examples

x <- as.matrix( iris[, 1:4] )
ina <- iris[, 5]
Mtest.cov(x, ina)

Empirical likelihood for a one sample mean vector hypothesis testing

Description

Empirical likelihood for a one sample mean vector hypothesis testing.

Usage

el.test1(x, mu, R = 1, ncores = 1, graph = FALSE)

Arguments

`x`	A matrix containing Euclidean data.
`mu`	The hypothesized mean vector.
`R`	If R is 1 no bootstrap calibration is performed and the classical p-value via the $\chi^2$ distribution is returned. If R is greater than 1, the bootstrap p-value is returned.
`ncores`	The number of cores to use, set to 1 by default.
`graph`	A boolean variable which is taken into consideration only when bootstrap calibration is performed. If TRUE, the histogram of the bootstrap test statistic values is plotted.

Details

The $H_0$ is that $\pmb{\mu} = \pmb{\mu}_0$ and the constraint imposed by EL is

$\frac{1}{n}\sum_{i=1}^{n}\left\lbrace\left[1+\pmb{\lambda}^T\left({\bf x}_i-\pmb{\mu}_0 \right)\right]^{-1}\left({\bf x}_i-\pmb{\mu}_0\right)\right\rbrace={\bf 0},$

where $\pmb{\lambda}$ is the Lagrangian parameter introduced to maximize the empirical likelihood subject to the mean constraint. Note that the maximization is with respect to $\pmb{\lambda}$. The probabilities have the following form

$p_i=\frac{1}{n}\left[1+\pmb{\lambda}^T \left({\bf x}_i-\pmb{\mu}_0 \right)\right]^{-1}.$

The log-likelihood ratio test statistic can be written as

$\Lambda=-2\sum_{i=1}^{n}\log\left(np_i\right).$

Under $H_0$, $\Lambda \sim \chi^2_d$ asymptotically, where $d$ denotes the number of variables. Alternatively, the bootstrap p-value may be computed.
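The computation can be sketched in base R as follows. This is not the implementation used by el.test1 (which relies on the emplik package); it is a minimal illustration on simulated data in which $\pmb{\lambda}$ is found by minimizing $-\sum_{i}\log\left[1+\pmb{\lambda}^T\left({\bf x}_i-\pmb{\mu}_0\right)\right]$ with optim, and the statistic is taken as minus twice the log-likelihood ratio. The names obj, grad and p_i and the choice of optimizer are assumptions made here.

```r
# One-sample empirical likelihood, minimal sketch (base R only).
set.seed(1)
x   <- matrix(rnorm(200), ncol = 2)   # simulated data, n = 100, d = 2
mu0 <- c(0.1, -0.1)                   # hypothesized mean vector
n   <- nrow(x)
z   <- sweep(x, 2, mu0)               # rows are x_i - mu_0

# Convex objective whose stationary point satisfies the EL constraint
obj <- function(lam) {
  w <- 1 + as.vector(z %*% lam)
  if (any(w <= 0)) return(Inf)        # lambda outside the feasible region
  -sum(log(w))
}
# Gradient: -sum_i z_i / (1 + lambda' z_i)
grad <- function(lam) {
  w <- 1 + as.vector(z %*% lam)
  -colSums(z / w)
}

fit <- optim(numeric(ncol(x)), obj, grad, method = "BFGS",
             control = list(reltol = 1e-12))
lam <- fit$par

# EL probabilities p_i = (1/n) [1 + lambda' z_i]^{-1}
p_i <- 1 / (n * (1 + as.vector(z %*% lam)))

stat   <- -2 * sum(log(n * p_i))      # minus twice the log-likelihood ratio
pvalue <- pchisq(stat, df = ncol(x), lower.tail = FALSE)
c(stat = stat, pvalue = pvalue)
```

At the solution the probabilities sum to one and the weighted mean of the ${\bf x}_i$ equals $\pmb{\mu}_0$, which gives a quick check on the optimizer.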

Value

A list with the outcome of the function el.test() from the package emplik, which includes the -2 log-likelihood ratio, the observed p-value by chi-square approximation, the final value of the Lagrange multiplier $\lambda$, the gradient at the maximum, the Hessian matrix, the weights on the observations (probabilities multiplied by the sample size) and the number of iterations performed. In addition, the runtime of the procedure is reported. In the case of bootstrap, the bootstrap p-value is also returned.

Author(s)

Michail Tsagris.

R implementation and documentation: Michail Tsagris mtsagris@uoc.gr.

References

Jing B.Y. and Wood A.T.A. (1996). Exponential empirical likelihood is not Bartlett correctable. Annals of Statistics, 24(1): 365–369.

Owen A. (1990). Empirical likelihood ratio confidence regions. Annals of Statistics, 18(1): 90–120.

Owen A.B. (2001). Empirical likelihood. Chapman and Hall/CRC Press.

Examples

x <- as.matrix(iris[, 1:4])
el.test1(x, mu = numeric(4) )
eel.test1(x, mu = numeric(4) )

Empirical likelihood hypothesis testing for two mean vectors

Description

Empirical likelihood hypothesis testing for two mean vectors.

Usage

el.test2(y1, y2, R = 0, ncores = 1, graph = FALSE)

Arguments

`y1`	A matrix containing the Euclidean data of the first group.
`y2`	A matrix containing the Euclidean data of the second group.
`R`	If R is 0, the classical chi-square distribution is used, if R = 1, the corrected chi-square distribution (James, 1954) is used and if R = 2, the modified F distribution (Krishnamoorthy and Yanping, 2006) is used. If R is greater than 3 bootstrap calibration is performed.
`ncores`	The number of cores to use.
`graph`	A boolean variable which is taken into consideration only when bootstrap calibration is performed. If TRUE, the histogram of the bootstrap test statistic values is plotted.

Details

The $H_0$ is that $\pmb{\mu}_1 = \pmb{\mu}_2$ and the two constraints imposed by EL are

$\frac{1}{n_j}\sum_{i=1}^{n_j}\left\lbrace\left[1+\pmb{\lambda}_j^T\left({\bf x}_{ji}-\pmb{\mu} \right)\right]^{-1}\left({\bf x}_{ji}-\pmb{\mu}\right)\right\rbrace={\bf 0},$

where $j=1,2$ and the $\pmb{\lambda}_j$s are Lagrangian parameters introduced to maximize the empirical likelihood subject to the constraints. Note that the maximization is with respect to the $\pmb{\lambda}_j$s. The probabilities of the $j$-th sample have the following form

$p_{ji}=\frac{1}{n_j} \left[1+\pmb{\lambda}_j^T \left({\bf x}_{ji}-\pmb{\mu} \right)\right]^{-1}.$

The log-likelihood ratio test statistic can be written as

$\Lambda=-2\sum_{j=1}^2\sum_{i=1}^{n_j}\log\left(n_jp_{ji}\right).$

The test is implemented by searching for the mean vector that minimizes the sum of the two one sample EL test statistics. See el.test1 for the test statistic in the one-sample case.

Value

A list including:

`test`	The empirical likelihood test statistic value.
`modif.test`	The modified test statistic, either via the chi-square or the F distribution.
`dof`	The degrees of freedom of the chi-square or the F distribution.
`pvalue`	The asymptotic or the bootstrap p-value.
`mu`	The estimated common mean vector.
`runtime`	The runtime of the bootstrap calibration.

Author(s)

Michail Tsagris.

R implementation and documentation: Michail Tsagris mtsagris@uoc.gr.

References

Owen A. B. (2001). Empirical likelihood. Chapman and Hall/CRC Press.

Owen A.B. (1988). Empirical likelihood ratio confidence intervals for a single functional. Biometrika, 75(2): 237–249.

Preston S.P. and Wood A.T.A. (2010). Two-Sample Bootstrap Hypothesis Tests for Three-Dimensional Labelled Landmark Data. Scandinavian Journal of Statistics, 37(4): 568–587.

Examples

el.test2( y1 = as.matrix(iris[1:25, 1:4]), y2 = as.matrix(iris[26:50, 1:4]), R = 0 )
el.test2( y1 = as.matrix(iris[1:25, 1:4]), y2 = as.matrix(iris[26:50, 1:4]), R = 1 )
el.test2( y1 = as.matrix(iris[1:25, 1:4]), y2 = as.matrix(iris[26:50, 1:4]), R = 2 )

Exponential empirical likelihood for a one sample mean vector hypothesis testing

Description

Exponential empirical likelihood for a one sample mean vector hypothesis testing.

Usage

eel.test1(x, mu, tol = 1e-06, R = 1)

Arguments

`x`	A matrix containing Euclidean data.
`mu`	The hypothesized mean vector.
`tol`	The tolerance value used to stop the Newton-Raphson algorithm.
`R`	The number of bootstrap samples used to calculate the p-value. If R = 1 (default value), no bootstrap calibration is performed.

Details

Exponential empirical likelihood, or exponential tilting, was first introduced by Efron (1981) as a way to perform a "tilted" version of the bootstrap for the one-sample mean hypothesis test. Similarly to the empirical likelihood, positive weights $p_i$, which sum to one, are allocated to the observations, such that the weighted sample mean ${\bf \bar{x}}$ is equal to some population mean $\pmb{\mu}_0$ under $H_0$. Under $H_1$ the weights are equal to $\frac{1}{n}$, where $n$ is the sample size. Following Efron (1981), the $p_i$s are chosen to minimize the Kullback-Leibler distance from $H_0$ to $H_1$

$D\left(L_0,L_1\right)=\sum_{i=1}^np_i\log\left(np_i\right),$

subject to the constraint $\sum_{i=1}^np_i{\bf x}_i=\pmb{\mu}_0$ . The probabilities take the form

$p_i=\frac{e^{\pmb{\lambda}^T{\bf x}_i}}{\sum_{j=1}^ne^{\pmb{\lambda}^T{\bf x}_j}}$

and the constraint becomes

$\frac{\sum_{i=1}^ne^{\pmb{\lambda}^T{\bf x}_i}\left({\bf x}_i-\pmb{\mu}_0\right)}{\sum_{j=1}^ne^{\pmb{\lambda}^T{\bf x}_j}}={\bf 0} \Rightarrow \frac{\sum_{i=1}^n{\bf x}_ie^{\pmb{\lambda}^T{\bf x}_i}}{\sum_{j=1}^ne^{\pmb{\lambda}^T{\bf x}_j}}-\pmb{\mu}_0={\bf 0}.$

A numerical search over $\pmb{\lambda}$ is required. Under $H_0$, the log-likelihood ratio test statistic $\Lambda \sim \chi^2_d$ asymptotically, where $d$ denotes the number of variables. Alternatively, the bootstrap p-value may be computed.
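The numerical search can be sketched in base R. The function eel.test1 uses a Newton-Raphson algorithm; the sketch below instead minimizes the convex function $K\left(\pmb{\lambda}\right)=\log\left[\frac{1}{n}\sum_{i=1}^ne^{\pmb{\lambda}^T\left({\bf x}_i-\pmb{\mu}_0\right)}\right]$ with optim, whose gradient is exactly the left-hand side of the constraint. The data are simulated and the names K, gradK and p_i are assumptions made for this illustration.

```r
# Exponential tilting for one sample, minimal sketch (base R only).
set.seed(1)
x   <- matrix(rnorm(200), ncol = 2)   # simulated data, n = 100, d = 2
mu0 <- c(0.1, -0.1)                   # hypothesized mean vector
n   <- nrow(x)
z   <- sweep(x, 2, mu0)               # rows are x_i - mu_0

# K(lambda) is convex; its gradient is the tilted mean of x_i - mu_0
K <- function(lam) log(mean(exp(as.vector(z %*% lam))))
gradK <- function(lam) {
  w <- exp(as.vector(z %*% lam))
  colSums(z * w) / sum(w)
}

fit <- optim(numeric(ncol(x)), K, gradK, method = "BFGS",
             control = list(reltol = 1e-12))
lam <- fit$par

# Tilted probabilities p_i proportional to exp(lambda' x_i)
w   <- exp(as.vector(x %*% lam))
p_i <- w / sum(w)

colSums(p_i * x)                      # weighted mean, should match mu0
D <- sum(p_i * log(n * p_i))          # Kullback-Leibler distance D(L_0, L_1)
D
```

Because $K$ is convex and smooth, any standard optimizer works here; Newton-Raphson, as in the package, typically needs only a handful of iterations.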

Value

A list including:

`p`	The estimated probabilities.
`lambda`	The value of the Lagrangian parameter $\lambda$ .
`iter`	The number of iterations required by the Newton-Raphson algorithm.
`info`	The value of the log-likelihood ratio test statistic along with its corresponding p-value.
`runtime`	The runtime of the process.

Author(s)

Michail Tsagris.

R implementation and documentation: Michail Tsagris mtsagris@uoc.gr.

References

Efron B. (1981) Nonparametric standard errors and confidence intervals. Canadian Journal of Statistics, 9(2): 139–158.

Jing B.Y. and Wood A.T.A. (1996). Exponential empirical likelihood is not Bartlett correctable. Annals of Statistics, 24(1): 365–369.

Owen A. B. (2001). Empirical likelihood. Chapman and Hall/CRC Press.

Examples

x <- as.matrix( iris[, 1:4] )
eel.test1(x, numeric(4) )
el.test1(x, numeric(4) )

Exponential empirical likelihood hypothesis testing for two mean vectors

Description

Exponential empirical likelihood hypothesis testing for two mean vectors.

Usage

eel.test2(y1, y2, tol = 1e-07, R = 0, graph = FALSE)

Arguments

`y1`	A matrix containing the Euclidean data of the first group.
`y2`	A matrix containing the Euclidean data of the second group.
`tol`	The tolerance level used to terminate the Newton-Raphson algorithm.
`R`	If R is 0, the classical chi-square distribution is used, if R = 1, the corrected chi-square distribution (James, 1954) is used and if R = 2, the modified F distribution (Krishnamoorthy and Yanping, 2006) is used. If R is greater than 3 bootstrap calibration is performed.
`graph`	A boolean variable which is taken into consideration only when bootstrap calibration is performed. If TRUE, the histogram of the bootstrap test statistic values is plotted.

Details

Exponential empirical likelihood, or exponential tilting, was first introduced by Efron (1981) as a way to perform a "tilted" version of the bootstrap for the one-sample mean hypothesis test. Similarly to the empirical likelihood, positive weights $p_i$, which sum to one, are allocated to the observations, such that the weighted sample mean ${\bf \bar{x}}$ is equal to some population mean $\pmb{\mu}$ under $H_0$. Under $H_1$ the weights are equal to $\frac{1}{n}$, where $n$ is the sample size. Following Efron (1981), the $p_i$s are chosen to minimize the Kullback-Leibler distance from $H_0$ to $H_1$

$D\left(L_0,L_1\right)=\sum_{i=1}^np_i\log\left(np_i\right),$

subject to the constraint $\sum_{i=1}^np_i{\bf x}_i=\pmb{\mu}$ . The probabilities take the form

$p_i=\frac{e^{\pmb{\lambda}^T{\bf x}_i}}{\sum_{j=1}^ne^{\pmb{\lambda}^T{\bf x}_j}}$

and the constraint becomes

$\frac{\sum_{i=1}^ne^{\pmb{\lambda}^T{\bf x}_i}\left({\bf x}_i-\pmb{\mu}\right)}{\sum_{j=1}^ne^{\pmb{\lambda}^T{\bf x}_j}}={\bf 0} \Rightarrow \frac{\sum_{i=1}^n{\bf x}_ie^{\pmb{\lambda}^T{\bf x}_i}}{\sum_{j=1}^ne^{\pmb{\lambda}^T{\bf x}_j}}-\pmb{\mu}={\bf 0}.$

Similarly to empirical likelihood a numerical search over $\pmb{\lambda}$ is required.

We can derive the asymptotic form of the test statistic in the two sample means case but in a simpler form, generalizing the approach of Jing and Robinson (1997) to the multivariate case as follows. The three constraints are

${\begin{array}{ccc} \left(\sum_{j=1}^{n_1}e^{\pmb {\lambda}_1^T{\bf x}_j}\right)^{-1}\left(\sum_{i=1}^{n_1}{\bf x}_ie^{\pmb{\lambda}_1^T {\bf x}_i}\right) -\pmb{\mu} & = & {\bf 0} \\ \left(\sum_{j=1}^{n_2}e^{\pmb {\lambda}_2^T{\bf y}_j}\right)^{-1}\left(\sum_{i=1}^{n_2}{\bf y}_ie^{\pmb{\lambda}_2^T {\bf y}_i}\right) -\pmb{\mu} & = & {\bf 0} \\ n_1\pmb{\lambda}_1+n_2\pmb{\lambda}_2 & = & {\bf 0}. \end{array}}$

Similarly to EL, a linear combination of the $\pmb{\lambda}$s is set to zero. We can equate the first two constraints:

$\left(\sum_{j=1}^{n_1}e^{\pmb {\lambda}_1^T{\bf x}_j}\right)^{-1}\left(\sum_{i=1}^{n_1}{\bf x}_ie^{\pmb{\lambda}_1^T {\bf x}_i}\right)= \left(\sum_{j=1}^{n_2}e^{\pmb {\lambda}_2^T{\bf y}_j}\right)^{-1}\left(\sum_{i=1}^{n_2}{\bf y}_ie^{\pmb{\lambda}_2^T {\bf y}_i}\right).$

Also, we can write the third constraint as $\pmb{\lambda}_2=-\frac{n_1}{n_2}\pmb{\lambda}_1$ and, writing $\pmb{\lambda}=\pmb{\lambda}_1$, rewrite the first two constraints as

$\left(\sum_{j=1}^{n_1}e^{\pmb{\lambda}^T{\bf x}_j}\right)^{-1}\left(\sum_{i=1}^{n_1}{\bf x}_ie^{\pmb{\lambda}^T {\bf x}_i}\right) = \left(\sum_{j=1}^{n_2}e^{-\frac{n_1}{n_2}\pmb{\lambda}^T{\bf y}_j}\right)^{-1}\left(\sum_{i=1}^{n_2}{\bf y}_ie^{-\frac{n_1}{n_2}\pmb{\lambda}^T {\bf y}_i}\right).$

This trick allows us to avoid the estimation of the common mean. It is not possible, though, to do this in the empirical likelihood method. Instead of minimizing the sum of the one-sample test statistics over the common mean, we can define the probabilities by searching for the $\pmb{\lambda}$ which makes the last equation hold true. The third constraint is a convenient one, but Jing and Robinson (1997) mention that, even though it is simple, it does not lead to second-order accurate confidence intervals unless the two sample sizes are equal. Asymptotically, the test statistic follows a $\chi^2_d$ distribution under the null hypothesis.
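The search for $\pmb{\lambda}$ in the two-sample case can be sketched in base R. The approach below is an assumption of this illustration rather than the package's Newton-Raphson code: it minimizes the convex function $f\left(\pmb{\lambda}\right)=n_1\log\sum_{i}e^{\pmb{\lambda}^T{\bf x}_i}+n_2\log\sum_{i}e^{-\frac{n_1}{n_2}\pmb{\lambda}^T{\bf y}_i}$, whose gradient is $n_1$ times the difference between the two tilted means, so that its minimizer equates them. The data are simulated and the helper tilted_mean is hypothetical.

```r
# Two-sample exponential tilting, minimal sketch (base R only).
set.seed(2)
x  <- matrix(rnorm(60 * 2), ncol = 2)               # first sample,  n1 = 60
y  <- matrix(rnorm(40 * 2, mean = 0.2), ncol = 2)   # second sample, n2 = 40
n1 <- nrow(x)
n2 <- nrow(y)

# Tilted mean: sum_i u_i exp(lam' u_i) / sum_j exp(lam' u_j)
tilted_mean <- function(u, lam) {
  w <- exp(as.vector(u %*% lam))
  colSums(u * w) / sum(w)
}

# Convex objective; gradient = n1 * (tilted mean of x - tilted mean of y)
f <- function(lam) {
  n1 * log(sum(exp(as.vector(x %*% lam)))) +
    n2 * log(sum(exp(as.vector(y %*% (-n1 / n2 * lam)))))
}
g <- function(lam) {
  n1 * (tilted_mean(x, lam) - tilted_mean(y, -n1 / n2 * lam))
}

fit  <- optim(numeric(ncol(x)), f, g, method = "BFGS",
              control = list(reltol = 1e-14))
lam1 <- fit$par                # lambda_1
lam2 <- -n1 / n2 * lam1        # lambda_2 from n1 lambda_1 + n2 lambda_2 = 0

m1 <- tilted_mean(x, lam1)     # tilted mean of the first sample
m2 <- tilted_mean(y, lam2)     # tilted mean of the second sample
rbind(m1, m2)                  # the two rows should agree
```

The common value of the two tilted means plays the role of the estimated common mean, without it ever being a separate parameter in the search.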

Value

A list including:

`test`	The empirical likelihood test statistic value.
`modif.test`	The modified test statistic, either via the chi-square or the F distribution.
`dof`	The degrees of freedom of the chi-square or the F distribution.
`pvalue`	The asymptotic or the bootstrap p-value.
`mu`	The estimated common mean vector.
`runtime`	The runtime of the bootstrap calibration.

Author(s)

Michail Tsagris.

R implementation and documentation: Michail Tsagris mtsagris@uoc.gr.

References

Efron B. (1981) Nonparametric standard errors and confidence intervals. Canadian Journal of Statistics, 9(2): 139–158.

Jing B.Y. and Wood A.T.A. (1996). Exponential empirical likelihood is not Bartlett correctable. Annals of Statistics, 24(1): 365–369.

Jing B.Y. and Robinson J. (1997). Two-sample nonparametric tilting method. Australian Journal of Statistics, 39(1): 25–34.

Owen A.B. (2001). Empirical likelihood. Chapman and Hall/CRC Press.

Preston S.P. and Wood A.T.A. (2010). Two-Sample Bootstrap Hypothesis Tests for Three-Dimensional Labelled Landmark Data. Scandinavian Journal of Statistics 37(4): 568–587.

Tsagris M., Preston S. and Wood A.T.A. (2017). Nonparametric hypothesis testing for equality of means on the simplex. Journal of Statistical Computation and Simulation, 87(2): 406–422.

Examples

y1 = as.matrix(iris[1:25, 1:4])
y2 = as.matrix(iris[26:50, 1:4])
eel.test2(y1, y2)

Hotelling's multivariate version of the 1 sample t-test for Euclidean data

Description

Hotelling's test for testing one Euclidean population mean vector.

Usage

hotel1T2(x, M, a = 0.05, R = 999, graph = FALSE)

Arguments

`x`	A matrix containing Euclidean data.
`M`	The hypothesized mean vector.
`a`	The significance level, set to 0.05 by default.
`R`	If R is 1 no bootstrap calibration is performed and the classical p-value via the F distribution is returned. If R is greater than 1, the bootstrap p-value is returned.
`graph`	A boolean variable which is taken into consideration only when bootstrap calibration is performed. If TRUE, the histogram of the bootstrap test statistic values is plotted.

Details

The null hypothesis is that the mean vector is equal to some specified vector, $H_0:\pmb{\mu}=\pmb{\mu}_0$. We assume that $\pmb{\Sigma}$ is unknown. The first approach to this hypothesis test is parametric, using Hotelling's $T^2$ test (Mardia, Kent and Bibby, 1979, pg. 125-126). The test statistic is given by

$T^2=\frac{\left(n-p\right)n}{\left(n-1\right)p}\left(\bar{{\bf X}}-\pmb{\mu}_0\right)^T{\bf S}^{-1}\left(\bar{{\bf X}}-\pmb{\mu}_0 \right).$

Under the null hypothesis, the above test statistic follows the $F_{p,n-p}$ distribution. The bootstrap version of the one-sample multivariate generalization of the simple t-test is also included in the function. An extra argument (R) indicates whether bootstrap calibration should be used or not. If R = 1, the p-value is computed from the $F$ distribution; if R > 1, the bootstrap p-value is computed and the number of re-samples is equal to R.
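For the parametric case (no bootstrap), the scaled statistic and its $F_{p,n-p}$ p-value can be reproduced by hand in base R. This is a sketch of the formula only, not the code of hotel1T2; the variable names are chosen here for illustration and ${\bf S}$ is taken to be the unbiased sample covariance matrix.

```r
# One-sample Hotelling test computed from the formula (base R only).
set.seed(1)
x   <- matrix(rnorm(100 * 4), ncol = 4)   # simulated data
mu0 <- numeric(4)                         # hypothesized mean vector
n   <- nrow(x)
p   <- ncol(x)

xbar <- colMeans(x)                       # sample mean vector
S    <- cov(x)                            # unbiased sample covariance
d    <- xbar - mu0

# Scaled quadratic form, F distributed with (p, n - p) df under H0
stat   <- (n - p) * n / ((n - 1) * p) * as.numeric(t(d) %*% solve(S, d))
pvalue <- pf(stat, p, n - p, lower.tail = FALSE)
crit   <- qf(1 - 0.05, p, n - p)          # critical value at a = 0.05
c(stat = stat, pvalue = pvalue, crit = crit, df1 = p, df2 = n - p)
```

Using solve(S, d) rather than solve(S) %*% d avoids forming the inverse explicitly.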

Value

A list including:

`m`	The sample mean vector.
`info`	The test statistic, the p-value, the critical value and the degrees of freedom of the F distribution (numerator and denominator). This is given if no bootstrap calibration is employed.
`pvalue`	The bootstrap p-value if bootstrap is employed.
`runtime`	The runtime of the bootstrap calibration.

Author(s)

Michail Tsagris.

R implementation and documentation: Michail Tsagris mtsagris@uoc.gr.

References

Mardia K.V., Kent J.T. and Bibby J.M. (1979). Multivariate Analysis. London: Academic Press.

Examples

x <- matrix( rnorm( 100 * 4), ncol = 4)
hotel1T2(x, numeric(4), R = 1)
hotel1T2(x, numeric(4), R = 999, graph = TRUE)

Hotelling's multivariate version of the 2 sample t-test for Euclidean data

Description

Hotelling's test for testing the equality of two Euclidean population mean vectors.

Usage

hotel2T2(x1, x2, a = 0.05, R = 999, graph = FALSE)

Arguments

`x1`	A matrix containing the Euclidean data of the first group.
`x2`	A matrix containing the Euclidean data of the second group.
`a`	The significance level, set to 0.05 by default.
`R`	If R is 1 no bootstrap calibration is performed and the classical p-value via the F distribution is returned. If R is greater than 1, the bootstrap p-value is returned.
`graph`	A boolean variable which is taken into consideration only when bootstrap calibration is performed. If TRUE, the histogram of the bootstrap test statistic values is plotted.

Details

The first case scenario is when we assume equality of the two covariance matrices. This is called the two-sample Hotelling's $T^2$ test (Mardia, Kent and Bibby, 1979, pg. 131-140; Everitt, 2005, pg. 139). The test statistic is defined as

$T^2=\frac{n_1n_2}{n_1+n_2}\left(\bar{{\bf X}}_1- \bar{{\bf X}}_2\right)^T{\bf S}^{-1}\left(\bar{{\bf X}}_1- \bar{{\bf X}}_2\right),$

where $\bf S$ is the pooled covariance matrix calculated under the assumption of equal covariance matrices ${\bf S}=\frac{\left(n_1-1\right){\bf S}_1+\left(n_2-1\right){\bf S}_2}{n_1+n_2-2}.$ Under $H_0$ the statistic $F$ given by

$F=\frac{\left( n_1+n_2-p-1 \right)T^2}{\left(n_1+n_2-2 \right)p}$

follows the $F$ distribution with $p$ and $n_1+n_2-p-1$ degrees of freedom. Similar to the one-sample test, an extra argument (R) indicates whether bootstrap calibration should be used or not. If R = 1, the p-value is computed from the $F$ distribution; if R > 1, the bootstrap p-value is computed and the number of re-samples is equal to R. The estimate of the common mean used in the bootstrap to transform the data under the null hypothesis is the mean vector of the combined sample, that is, of all the observations.

The built-in command manova performs exactly the same test; the $F$ test it reports is the one to compare against. In addition, manova allows testing the equality of mean vectors for more than two groups. I noticed this command after I had written my function; nevertheless, as I mention in the introduction, this document has an educational character as well.
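The agreement with manova can be checked directly. The sketch below computes $T^2$ and $F$ from the formulas in this section using base R and compares $F$ with the approximate F value reported by summary() of a manova fit; for two groups this F is the same for all the multivariate tests offered there, and "Hotelling-Lawley" is requested only to fix one of them. The variable names are illustrative and this is not the code of hotel2T2.

```r
# Two-sample Hotelling T^2 by hand, compared with stats::manova.
x1 <- as.matrix(iris[1:25, 1:4])
x2 <- as.matrix(iris[26:50, 1:4])
n1 <- nrow(x1)
n2 <- nrow(x2)
p  <- ncol(x1)

d <- colMeans(x1) - colMeans(x2)          # difference of the mean vectors

# Pooled covariance matrix under equal covariance matrices
S <- ((n1 - 1) * cov(x1) + (n2 - 1) * cov(x2)) / (n1 + n2 - 2)

T2     <- n1 * n2 / (n1 + n2) * as.numeric(t(d) %*% solve(S, d))
Fstat  <- (n1 + n2 - p - 1) * T2 / ((n1 + n2 - 2) * p)
pvalue <- pf(Fstat, p, n1 + n2 - p - 1, lower.tail = FALSE)

# The same test through manova
y   <- rbind(x1, x2)
grp <- factor(rep(1:2, c(n1, n2)))
fit <- manova(y ~ grp)
Fmanova <- summary(fit, test = "Hotelling-Lawley")$stats[1, "approx F"]

all.equal(Fstat, Fmanova)                 # the two F values should agree
```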

Value

A list including:

`mesoi`	The two mean vectors.
`info`	The test statistic, the p-value, the critical value and the degrees of freedom of the F distribution (numerator and denominator). This is given if no bootstrap calibration is employed.
`pvalue`	The bootstrap p-value if bootstrap is employed.
`note`	A message informing the user that bootstrap calibration has been employed.
`runtime`	The runtime of the bootstrap calibration.

Author(s)

Michail Tsagris.

R implementation and documentation: Michail Tsagris mtsagris@uoc.gr.

References

Everitt B. (2005). An R and S-Plus Companion to Multivariate Analysis. Springer.

Mardia K.V., Kent J.T. and Bibby J.M. (1979). Multivariate Analysis. London: Academic Press.

Tsagris M., Preston S. and Wood A.T.A. (2017). Nonparametric hypothesis testing for equality of means on the simplex. Journal of Statistical Computation and Simulation, 87(2): 406–422.

Examples

hotel2T2( as.matrix(iris[1:25, 1:4]), as.matrix(iris[26:50, 1:4]) )
hotel2T2( as.matrix(iris[1:25, 1:4]), as.matrix(iris[26:50, 1:4]), R = 1 )

Hypothesis test for two high-dimensional mean vectors

Description

Hypothesis test for two high-dimensional mean vectors.

Usage

sarabai(x1, x2)

Arguments

`x1`	A matrix containing the Euclidean data of the first group.
`x2`	A matrix containing the Euclidean data of the second group.

Details

High dimensional data are multivariate data with many variables ($p$) and usually a small number of observations ($n$). It often happens that $p>n$, and this is the case treated here, for which a simple test is described. In this case, the covariance matrix is not invertible and, in addition, it can have a lot of zero eigenvalues.

The test was proposed by Bai and Saranadasa (1996). Since then, there have been some more suggestions, but I chose this one for its simplicity. There are two datasets, ${\bf X}_1$ and ${\bf X}_2$, of sample sizes $n_1$ and $n_2$, respectively. Their corresponding sample mean vectors and covariance matrices are $\bar{{\bf X}}_1$, $\bar{{\bf X}}_2$ and ${\bf S}_1$, ${\bf S}_2$, respectively. The assumption here is the same as that of the two-sample Hotelling's test (function hotel2T2), namely equality of the two covariance matrices.

Let us define the pooled covariance matrix at first, calculated under the assumption of equal covariance matrices ${\bf S}_n=\frac{\left(n_1-1\right){\bf S}_1+\left(n_2-1\right){\bf S}_2}{n}$ , where $n=n_1+n_2$ . Then define $B_n=\sqrt{ \frac{n^2}{\left(n+2\right)\left(n-1\right)}\left\lbrace\text{tr}\left({\bf S}_n^2\right)- \frac{1}{n}\left[\text{tr}\left({\bf S}_n\right)\right]^2 \right\rbrace }$ . The test statistic is

$Z=\frac{\frac{n_1n_2}{n_1+n_2}\left(\bar{{\bf X}}_1-\bar{{\bf X}}_2\right)^T\left(\bar{{\bf X}}_1-\bar{{\bf X}}_2\right) -\text{tr}\left({\bf S}_n\right)}{\sqrt{\frac{2\left(n+1\right)}{n}}B_n}.$

Under the null hypothesis (equality of the two mean vectors), the test statistic asymptotically follows the standard normal distribution. Bai and Saranadasa (1996) established the asymptotic normality of the test statistic and showed that it has attractive power properties when $p/n \rightarrow c < \infty$ and under some restriction on the maximum eigenvalue of the common population covariance matrix. However, the requirement that $p$ and $n$ be of the same order is too restrictive for the "large $p$, small $n$" situation.
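The statistic can be written out in a few lines of base R. The sketch below follows the formulas of this section literally, including $n=n_1+n_2$ as defined above, and is not the code of sarabai; the upper-tail p-value is an assumption of this illustration (large values of $Z$ are taken as evidence against equal means), and the variable names are illustrative.

```r
# Bai and Saranadasa statistic, following the formulas as written (base R).
set.seed(1)
x1 <- matrix(rnorm(40 * 100), ncol = 100)   # n1 = 40, p = 100
x2 <- matrix(rnorm(50 * 100), ncol = 100)   # n2 = 50, p = 100
n1 <- nrow(x1)
n2 <- nrow(x2)
n  <- n1 + n2                               # as defined in the text

d  <- colMeans(x1) - colMeans(x2)           # difference of the mean vectors
Sn <- ((n1 - 1) * cov(x1) + (n2 - 1) * cov(x2)) / n

trS  <- sum(diag(Sn))                       # tr(S_n)
trS2 <- sum(Sn^2)                           # tr(S_n^2), since S_n is symmetric

Bn <- sqrt(n^2 / ((n + 2) * (n - 1)) * (trS2 - trS^2 / n))
Z  <- (n1 * n2 / (n1 + n2) * sum(d^2) - trS) / (sqrt(2 * (n + 1) / n) * Bn)

pvalue <- pnorm(Z, lower.tail = FALSE)      # assumption: upper-tail p-value
c(Z = Z, pvalue = pvalue)
```

No matrix inverse is needed anywhere, which is why the test remains usable when $p>n$.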

Value

A vector with the test statistic and the p-value.

Author(s)

Michail Tsagris.

R implementation and documentation: Michail Tsagris mtsagris@uoc.gr.

References

Bai Z. D. and Saranadasa H. (1996). Effect of high dimension: by an example of a two sample problem. Statistica Sinica, 6(2): 311–329.

Examples

x1 <- matrix( rnorm(40 * 100), ncol = 100 )
x2 <- matrix( rnorm(50 * 100), ncol = 100 )
sarabai(x1, x2)

James multivariate version of the t-test

Description

James test for testing the equality of two population mean vectors without assuming equality of the covariance matrices.

Usage

james(y1, y2, a = 0.05, R = 999, graph = FALSE)

Arguments

`y1`	A matrix containing the Euclidean data of the first group.
`y2`	A matrix containing the Euclidean data of the second group.
`a`	The significance level, set to 0.05 by default.
`R`	If R is 1 no bootstrap calibration is performed and the classical p-value via the F distribution is returned. If R is greater than 1, the bootstrap p-value is returned.
`graph`	A boolean variable which is taken into consideration only when bootstrap calibration is performed. If TRUE the histogram of the bootstrap test statistic values is plotted.

Details

Here we show the modified version of the two-sample $T^2$ test (function hotel2T2) for the case where the two covariance matrices cannot be assumed to be equal.

James (1954) proposed a test for linear hypotheses of the population means when the variances (or the covariance matrices) are not known. Its form for two $p$ -dimensional samples is:

$T^2_u=\left(\bar{{\bf X}}_1-\bar{{\bf X}}_2\right)^T\tilde{{\bf S}}^{-1}\left(\bar{{\bf X}}_1-\bar{{\bf X}}_2\right),$

where $\tilde{{\bf S}}=\tilde{{\bf S}_1}+\tilde{{\bf S}_2}=\frac{{\bf S}_1}{n_1}+\frac{{\bf S}_2}{n_2}$ .

James (1954) suggested that the test statistic is compared with $2h\left(\alpha\right)$ , a corrected $\chi^2$ distribution whose form is

$2h\left(\alpha\right)=\chi^2\left(A+B\chi^2\right),$

where $A=1+\frac{1}{2p}\sum_{i=1}^2\frac{\left[\text{tr}\left(\tilde{{\bf S}}^{-1}\tilde{{\bf S}}_i\right)\right]^2}{n_i-1}$ and $B=\frac{1}{p\left(p+2\right)}\left\lbrace\sum_{i=1}^2\frac{\text{tr}\left[\left(\tilde{{\bf S}}^{-1}\tilde{{\bf S}}_i\right)^2\right]}{n_i-1}+\frac{1}{2}\sum_{i=1}^2\frac{\left[\text{tr}\left(\tilde{{\bf S}}^{-1}\tilde{{\bf S}}_i\right)\right]^2}{n_i-1} \right\rbrace$.

If you want to do bootstrap to get the p-value, then you must transform the data under the null hypothesis. The estimate of the common mean is given by Aitchison (1986)

$\hat{\pmb{\mu}}_c = \left(n_1{\bf S}_1^{-1}+n_2{\bf S}_2^{-1}\right)^{-1}\left(n_1{\bf S}_1^{-1}\bar{{\bf X}}_1+n_2{\bf S}_2^{-1}\bar{{\bf X}}_2\right)= \left(\tilde{{\bf S}}_1^{-1}+\tilde{{\bf S}}_2^{-1}\right)^{-1}\left(\tilde{{\bf S}}_1^{-1}\bar{{\bf X}}_1+\tilde{{\bf S}}_2^{-1}\bar{{\bf X}}_2\right).$

The modified Nel and van der Merwe (1986) test is based on the same quadratic form as that of James (1954) but the distribution used to compare the value of the test statistic is different. It is shown in Krishnamoorthy and Yanping (2006) that $T^2_u \sim \frac{\nu p}{\nu-p+1}F_{p,\nu-p+1}$ approximately, where $\nu=\frac{p+p^2}{\frac{1}{n_1}\left\lbrace \text{tr}\left[ \left( {\bf S}_1\tilde{{\bf S}} \right)^2\right]+ \text{tr}\left[ \left( {\bf S}_1\tilde{{\bf S}} \right)\right]^2 \right\rbrace + \frac{1}{n_2}\left\lbrace \text{tr}\left[ \left( {\bf S}_2\tilde{{\bf S}}\right)^2\right]+ \text{tr}\left[ \left( {\bf S}_2\tilde{{\bf S}} \right)\right]^2 \right\rbrace }.$

The algorithm is taken from Krishnamoorthy and Yu (2004).
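The classical James test can be illustrated in base R as follows. The sketch computes $T^2_u$, the correction terms $A$ and $B$, the corrected critical value $2h\left(\alpha\right)$ and the common mean estimate $\hat{\pmb{\mu}}_c$ from the formulas of this section; it does not cover the modified Nel and van der Merwe variant or the bootstrap, and it is not the code of james. The helper tr and the other names are introduced here for illustration.

```r
# James test computed from the formulas (base R only).
y1 <- as.matrix(iris[1:25, 1:4])
y2 <- as.matrix(iris[26:50, 1:4])
n1 <- nrow(y1)
n2 <- nrow(y2)
p  <- ncol(y1)
a  <- 0.05                                 # significance level

m1  <- colMeans(y1)
m2  <- colMeans(y2)
St1 <- cov(y1) / n1                        # S~_1 = S_1 / n_1
St2 <- cov(y2) / n2                        # S~_2 = S_2 / n_2
St  <- St1 + St2                           # S~

d  <- m1 - m2
Tu <- as.numeric(t(d) %*% solve(St, d))    # test statistic T^2_u

tr <- function(m) sum(diag(m))             # trace of a square matrix
b1 <- solve(St, St1)                       # S~^{-1} S~_1
b2 <- solve(St, St2)                       # S~^{-1} S~_2

sq_tr <- c(tr(b1)^2, tr(b2)^2) / c(n1 - 1, n2 - 1)            # [tr(.)]^2 terms
tr_sq <- c(tr(b1 %*% b1), tr(b2 %*% b2)) / c(n1 - 1, n2 - 1)  # tr[(.)^2] terms

A <- 1 + sum(sq_tr) / (2 * p)
B <- (sum(tr_sq) + 0.5 * sum(sq_tr)) / (p * (p + 2))

chi  <- qchisq(1 - a, p)                   # chi-square quantile
crit <- chi * (A + B * chi)                # corrected critical value 2h(a)
c(Tu = Tu, A = A, B = B, crit = crit, reject = Tu > crit)

# Common mean estimate used to transform the data for the bootstrap
mu_c <- solve(solve(St1) + solve(St2),
              solve(St1, m1) + solve(St2, m2))
mu_c
```

Since $\tilde{{\bf S}}^{-1}\tilde{{\bf S}}_1+\tilde{{\bf S}}^{-1}\tilde{{\bf S}}_2$ is the identity matrix, the two traces tr(b1) and tr(b2) sum to $p$, which is a convenient sanity check.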

Value

A list including:

`note`	A message informing the user about the test used.
`mesoi`	The two mean vectors.
`info`	The test statistic, the p-value, the correction factor and the corrected critical value of the chi-square distribution if the James test has been used or, the test statistic, the p-value, the critical value and the degrees of freedom (numerator and denominator) of the F distribution if the modified James test has been used.
`pvalue`	The bootstrap p-value if bootstrap is employed.
`runtime`	The runtime of the bootstrap calibration.

Author(s)

Michail Tsagris.

R implementation and documentation: Michail Tsagris mtsagris@uoc.gr.

References

Aitchison J. (1986). The statistical analysis of compositional data. Chapman & Hall.

James G.S. (1954). Tests of Linear Hypotheses in Univariate and Multivariate Analysis when the Ratios of the Population Variances are Unknown. Biometrika, 41(1/2): 19–43.

Krishnamoorthy K. and Yu J. (2004). Modified Nel and Van der Merwe test for the multivariate Behrens-Fisher problem. Statistics & Probability Letters, 66(2): 161–169.

Krishnamoorthy K. and Xia Y. (2006). On Selecting Tests for Equality of Two Normal Mean Vectors. Multivariate Behavioral Research, 41(4): 533–548.

Tsagris M., Preston S. and Wood A.T.A. (2017). Nonparametric hypothesis testing for equality of means on the simplex. Journal of Statistical Computation and Simulation, 87(2): 406–422.

Examples

james( as.matrix(iris[1:25, 1:4]), as.matrix(iris[26:50, 1:4]), R = 1 )
james( as.matrix(iris[1:25, 1:4]), as.matrix(iris[26:50, 1:4]), R = 2 )
james( as.matrix(iris[1:25, 1:4]), as.matrix(iris[26:50, 1:4]) )

Log-likelihood ratio test for equality of one covariance matrix

Description

Log-likelihood ratio test for equality of one covariance matrix.

Usage

equal.cov(x, Sigma, a = 0.05)

Arguments

`x`	A matrix containing Euclidean data.
`Sigma`	The hypothesis covariance matrix.
`a`	The significance level, set to 0.05 by default.

Details

The null hypothesis is that the population covariance matrix is equal to some specified covariance matrix: $H_0:\pmb{\Sigma}=\pmb{\Sigma}_0$ , with $\pmb{\mu}$ unknown. The algorithm for this test is taken from Mardia, Kent and Bibby (1979, pg. 126-127). The test is based upon the log-likelihood ratio test. The form of the test is

$-2\log{\lambda}=n \text{tr}\left\lbrace \pmb{\Sigma}_0^{-1}{\bf S}\right\rbrace-n\log{\left|\pmb{\Sigma}_0^{-1}{\bf S} \right|}-np,$

where $n$ is the sample size, $\pmb{\Sigma}_0$ is the specified covariance matrix under the null hypothesis, ${\bf S}$ is the sample covariance matrix and $p$ is the dimensionality of the data (or the number of variables). Let $\alpha$ and $g$ denote the arithmetic mean and the geometric mean respectively of the eigenvalues of $\pmb{\Sigma}_0^{-1}{\bf S}$ , so that $\text{tr}\left\lbrace \pmb{\Sigma}_0^{-1}{\bf S}\right\rbrace=p\alpha$ and $\left|\pmb{\Sigma}_0^{-1}{\bf S} \right|=g^p$ . The test statistic then becomes

$-2\log{\lambda}=np\left(\alpha-\log{(g)}-1 \right).$

The degrees of freedom of the $\chi^2$ distribution are $\frac{1}{2}p\left(p+1\right)$ .
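The two equivalent forms of the statistic can be checked with a few lines of base R. This is a hedged sketch rather than the code of `equal.cov`: the variable names are illustrative, and `cov()` (divisor $n-1$) is assumed for ${\bf S}$, whereas the maximum likelihood estimate would use divisor $n$.

```r
## Sketch only (base R); not the internal code of equal.cov().
x <- as.matrix(iris[, 1:4])
Sigma0 <- cov(x) * 1.5            ## hypothesised covariance matrix
n <- nrow(x);  p <- ncol(x)
S <- cov(x)                       ## assumption: unbiased estimate is used

M <- solve(Sigma0, S)             ## Sigma0^{-1} S

## Trace / determinant form
stat1 <- n * sum(diag(M)) - n * log(det(M)) - n * p

## Eigenvalue form: arithmetic and geometric means of the eigenvalues of M
lam <- Re( eigen(M, only.values = TRUE)$values )
alpha <- mean(lam)
g <- exp( mean(log(lam)) )
stat2 <- n * p * ( alpha - log(g) - 1 )

dof <- p * (p + 1) / 2
pvalue <- pchisq(stat1, dof, lower.tail = FALSE)
crit <- qchisq(0.95, dof)         ## critical value for a = 0.05
c(stat1 = stat1, stat2 = stat2, pvalue = pvalue, dof = dof, critical = crit)
```

The eigenvalue form avoids computing a determinant directly, which can help when $p$ is large and the determinant under- or overflows.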

Value

A vector with the test statistic, the p-value, the degrees of freedom and the critical value of the test.

Author(s)

Michail Tsagris.

R implementation and documentation: Michail Tsagris mtsagris@uoc.gr.

References

Mardia K.V., Kent J.T. and Bibby J.M. (1979). Multivariate Analysis. London: Academic Press.

Examples

x <- as.matrix( iris[, 1:4] )
s <- cov(x) * 1.5
equal.cov(x, s)

Log-likelihood ratio test for equality of two or more covariance matrices

Description

Log-likelihood ratio test for equality of two or more covariance matrices.

Usage

likel.cov(x, ina, a = 0.05)

Arguments

`x`	A matrix containing Euclidean data.
`ina`	A vector denoting the groups of the data.
`a`	The significance level, set to 0.05 by default.

Details

The null hypothesis is that of the equality of two or more covariance matrices: $H_0:\pmb{\Sigma}_1=\ldots=\pmb{\Sigma}_k$ . The algorithm is taken from Mardia, Kent and Bibby (1979, pg. 140). The log-likelihood ratio test is the multivariate generalization of Bartlett's test of homogeneity of variances. The test statistic takes the following form

$-2\log{\lambda}=n\log{\left|{\bf S}\right|}-\sum_{i=1}^kn_i\log{\left|{\bf S}_i\right|}=\sum_{i=1}^kn_i\log{\left|{\bf S}_i^{-1}{\bf S}\right|},$

where ${\bf S}_i$ is the $i$ -th sample biased covariance matrix and ${\bf S}=n^{-1}\sum_{i=1}^kn_i{\bf S}_i$ is the maximum likelihood estimate of the common covariance matrix (under the null hypothesis) with $n=\sum_{i=1}^kn_i$ . The degrees of freedom of the asymptotic chi-square distribution are $\frac{1}{2}p\left(p+1\right)\left(k-1\right)$ , since each covariance matrix has $\frac{1}{2}p\left(p+1\right)$ free parameters and the null hypothesis imposes equality across $k$ of them.
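A minimal base R sketch of this statistic is given below. It is not the code of `likel.cov`; the helper `logdet` and the other names are introduced here for illustration only.

```r
## Sketch only (base R); not the internal code of likel.cov().
x <- as.matrix(iris[, 1:4])
ina <- iris[, 5]
n <- nrow(x);  p <- ncol(x)
grp <- split(as.data.frame(x), ina)
k <- length(grp)
ni <- sapply(grp, nrow)

## Biased (maximum likelihood) covariance matrix of each group
Si <- lapply(grp, function(y) cov(y) * (nrow(y) - 1) / nrow(y))
## Maximum likelihood estimate of the common covariance matrix
S <- Reduce(`+`, Map(function(s, m) m * s, Si, ni)) / n

## log-determinant helper (hypothetical name)
logdet <- function(a) as.numeric( determinant(a, logarithm = TRUE)$modulus )

stat1 <- n * logdet(S) - sum( ni * sapply(Si, logdet) )
stat2 <- sum( ni * sapply(Si, function(s) logdet( solve(s, S) )) )

## p(p + 1)/2 free parameters per covariance matrix, times (k - 1)
dof <- p * (p + 1) * (k - 1) / 2
pvalue <- pchisq(stat1, dof, lower.tail = FALSE)
c(stat1 = stat1, stat2 = stat2, pvalue = pvalue, dof = dof)
```

Working with log-determinants through `determinant(..., logarithm = TRUE)` is a design choice made here for numerical stability; `log(det(.))` would give the same value for well-conditioned matrices.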

Value

A vector with the test statistic, the p-value, the degrees of freedom and the critical value of the test.

Author(s)

Michail Tsagris.

R implementation and documentation: Michail Tsagris mtsagris@uoc.gr.

References

Mardia K.V., Kent J.T. and Bibby J.M. (1979). Multivariate Analysis. London: Academic Press.

Examples

x <- as.matrix( iris[, 1:4] )
ina <- iris[, 5]
likel.cov(x, ina)

Multivariate analysis of variance (James test)

Description

Multivariate analysis of variance without assuming equality of the covariance matrices.

Usage

maovjames(x, ina, a = 0.05)

Arguments

`x`	A matrix containing Euclidean data.
`ina`	A numerical or factor variable indicating the groups of the data.
`a`	The significance level, set to 0.05 by default.

Details

James (1954) also proposed an alternative to MANOVA when the covariance matrices are not assumed equal. The test statistic for $k$ samples is

$J=\sum_{i=1}^k\left(\bar{{\bf x}}_i-\bar{{\bf X}}\right)^T{\bf W}_i\left(\bar{{\bf x}}_i-\bar{{\bf X}}\right),$

where $\bar{{\bf x}}_i$ and $n_i$ are the sample mean vector and sample size of the $i$ -th sample respectively and ${\bf W}_i=\left(\frac{{\bf S}_i}{n_i}\right)^{-1}$ , where ${\bf S}_i$ is the sample covariance matrix of the $i$ -th sample, so that $\frac{{\bf S}_i}{n_i}$ is the estimated covariance matrix of the $i$ -th sample mean vector, and $\bar{{\bf X}}$ is the estimate of the common mean $\bar{{\bf X}}=\left(\sum_{i=1}^k{\bf W}_i\right)^{-1}\sum_{i=1}^k{\bf W}_i\bar{{\bf x}}_i$ .

Normally one would compare the test statistic with a $\chi^2_{r,1-\alpha}$ , where $r=p\left(k-1\right)$ are the degrees of freedom with $k$ denoting the number of groups and $p$ the dimensionality of the data. These degrees of freedom come from the $r$ constraints, that is, the number of univariate means that must be equal for the null hypothesis (all mean vectors are equal) to hold true. James (1954) compared the test statistic with a corrected $\chi^2$ distribution instead. Let ${\bf W}=\sum_{i=1}^k{\bf W}_i$ and let $A$ and $B$ be $A= 1+\frac{1}{2r}\sum_{i=1}^k\frac{\left[\text{tr}\left({\bf I}_p-{\bf W}^{-1}{\bf W}_i\right)\right]^2}{n_i-1}$ and $B= \frac{1}{r\left(r+2\right)}\sum_{i=1}^k\left\lbrace\frac{\text{tr}\left[\left({\bf I}_p-{\bf W}^{-1}{\bf W}_i\right)^2\right]}{n_i-1}+\frac{\left[\text{tr}\left({\bf I}_p-{\bf W}^{-1}{\bf W}_i\right)\right]^2}{2\left(n_i-1\right)}\right\rbrace$ .

The corrected quantile of the $\chi^2$ distribution is given as before by $2h\left(\alpha\right)=\chi^2\left(A+B\chi^2\right)$ .
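The quantities above can be computed step by step in base R. The sketch below is an illustration under stated assumptions and not the code of `maovjames`: it takes ${\bf W}=\sum_{i=1}^k{\bf W}_i$ , uses `cov()` for ${\bf S}_i$ , and all names are chosen here for readability.

```r
## Sketch only (base R); not the internal code of maovjames().
x <- as.matrix(iris[, 1:4])
ina <- iris[, 5]
p <- ncol(x)
grp <- split(as.data.frame(x), ina)
k <- length(grp)
ni <- sapply(grp, nrow)
xbar <- lapply(grp, colMeans)

## W_i = (S_i / n_i)^{-1} and W = sum of the W_i (assumption)
Wi <- Map(function(y, m) solve( cov(y) / m ), grp, ni)
W <- Reduce(`+`, Wi)

## Common mean: W^{-1} * sum_i W_i xbar_i
Wx <- Reduce(`+`, Map(function(w, m) w %*% m, Wi, xbar))
mc <- as.vector( solve(W, Wx) )

## Test statistic J
J <- sum( mapply(function(w, m) as.numeric( t(m - mc) %*% w %*% (m - mc) ),
                 Wi, xbar) )

## Correction terms A and B
r <- p * (k - 1)
tr <- function(a) sum(diag(a))
D <- lapply(Wi, function(w) diag(p) - solve(W, w))   ## I_p - W^{-1} W_i
t1 <- sapply(D, function(d) tr(d)^2)
t2 <- sapply(D, function(d) tr(d %*% d))
A <- 1 + sum( t1 / (ni - 1) ) / (2 * r)
B <- sum( t2 / (ni - 1) + t1 / (2 * (ni - 1)) ) / ( r * (r + 2) )

## Corrected critical value 2h(alpha) at significance level a
a <- 0.05
chi <- qchisq(1 - a, r)
crit <- chi * (A + B * chi)
c(test = J, chi.critical = chi, corr.critical = crit)
```

Because $A \geq 1$ and $B > 0$ , the corrected critical value is never smaller than the plain $\chi^2$ quantile, so the corrected test is the more conservative of the two.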

Value

A vector with the next 4 elements:

`test`	The test statistic.
`correction`	The value of the correction factor.
`corr.critical`	The corrected critical value of the chi-square distribution.
`p-value`	The p-value of the corrected test statistic.

Author(s)

Michail Tsagris.

R implementation and documentation: Michail Tsagris mtsagris@uoc.gr.

References

James G.S. (1954). Tests of Linear Hypotheses in Univariate and Multivariate Analysis when the Ratios of the Population Variances are Unknown. Biometrika, 41(1/2): 19–43.

Examples

maov( as.matrix(iris[,1:4]), iris[,5] )
maovjames( as.matrix(iris[,1:4]), iris[,5] )

Multivariate analysis of variance assuming equality of the covariance matrices

Description

Multivariate analysis of variance assuming equality of the covariance matrices.

Usage

maov(x, ina)

Arguments

`x`	A matrix containing Euclidean data.
`ina`	A numerical or factor variable indicating the groups of the data.

Details

Multivariate analysis of variance assuming equality of the covariance matrices.

Value

A list including:

`note`	A message stating whether the $F$ or the $\chi^2$ approximation has been used.
`result`	The test statistic and the p-value.

Author(s)

Michail Tsagris.

R implementation and documentation: Michail Tsagris mtsagris@uoc.gr.

References

Johnson R.A. and Wichern D.W. (2007, 6th Edition). Applied Multivariate Statistical Analysis, pg. 302–303.

Todorov V. and Filzmoser P. (2010). Robust Statistic for the One-way MANOVA. Computational Statistics & Data Analysis, 54(1): 37–48.

Examples

maov( as.matrix(iris[,1:4]), iris[,5] )
maovjames( as.matrix(iris[,1:4]), iris[,5] )

Relationship between Hotelling's $T^2$ test and James' MANOVA

Description

Relationship between Hotelling's $T^2$ test and James' MANOVA.

Usage

maovjames.hotel(x, ina)

Arguments

`x`	A matrix containing the Euclidean data.
`ina`	A numerical or factor variable indicating the groups of the data.

Details

The relationship for the James two sample test (see the function james.hotel) also holds in the MANOVA case. The estimate of the common mean, $\pmb{\mu}_c$ (see the function james for the expression of $\pmb{\mu}_c$ ), is in general, for $g$ groups, each of sample size $n_i$ , written as

$\hat{\pmb{\mu}}_c = \left(\sum_{i=1}^gn_i{\bf S}_i^{-1}\right)^{-1}\sum_{i=1}^gn_i{\bf S}_i^{-1}\bar{{\bf X}}_i.$

The function is just a numerical check of the mathematics you will find in Emerson (2009, pg. 76–81) and is again intended for educational purposes.

Value

A list including:

`test`	The value of the test statistic, the sum of the Hotelling's test statistics (one per group) using the common mean.
`mc`	The common mean.

Author(s)

Michail Tsagris.

R implementation and documentation: Michail Tsagris mtsagris@uoc.gr.

References

Emerson S. (2009). Small sample performance and calibration of the Empirical Likelihood method. PhD thesis, Stanford university.

James G.S. (1954). Tests of Linear Hypotheses in Univariate and Multivariate Analysis when the Ratios of the Population Variances are Unknown. Biometrika, 41(1/2): 19–43.

Examples

maovjames.hotel( as.matrix(iris[, 1:4]), iris[, 5] )
maovjames( as.matrix(iris[, 1:4]), iris[, 5] )

Relationship between the Hotelling's $T^2$ and James' test

Description

Relationship between the Hotelling's $T^2$ and James' test.

Usage

james.hotel(x1, x2)

Arguments

`x1`	A matrix containing the Euclidean data of the first group.
`x2`	A matrix containing the Euclidean data of the second group.

Details

Emerson (2009, pg. 76–81) mentioned a very nice relationship between Hotelling's one sample $T^2$ and the James test for two mean vectors

$J\left(\pmb{\mu}\right) = T_1^2\left(\pmb{\mu}\right) + T_2^2\left(\pmb{\mu}\right),$

where $J\left(\pmb{\mu}\right)$ is the James test statistic (James, 1954) and $T_1^2\left(\pmb{\mu}\right)$ and $T_2^2\left(\pmb{\mu}\right)$ are the two one sample Hotelling's $T^2$ test statistic values (see function hotel1T2) for each sample from their common mean vector $\pmb{\mu}_c$ (see the help file of james). In fact, the James test statistic is found by minimizing the right hand side of the above expression with respect to $\pmb{\mu}$ . The sum is minimized when $\pmb{\mu}$ takes the form of the common mean vector $\pmb{\mu}_c$ . The same is true for the t-test in the univariate case.

I have created this function to illustrate this result, so it is intended for educational purposes. It calculates the James test statistic, the sum of the two $T^2$ test statistics, the common mean vector and the one found via numerical optimization. In the univariate case, the common mean is a weighted linear combination of the two sample means. So, if we take a segment connecting the two means, the common mean is somewhere on that segment.
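The identity can be verified numerically in base R. The following is a sketch under assumptions and not the code of `james.hotel`: it takes $T_i^2\left(\pmb{\mu}\right)=n_i\left(\bar{{\bf x}}_i-\pmb{\mu}\right)^T{\bf S}_i^{-1}\left(\bar{{\bf x}}_i-\pmb{\mu}\right)$ without any scaling to an $F$ statistic, and the function name `T2sum` is made up for this example.

```r
## Sketch only (base R); not the internal code of james.hotel().
x1 <- as.matrix(iris[1:50, 1:4])
x2 <- as.matrix(iris[51:100, 1:4])
n1 <- nrow(x1);  n2 <- nrow(x2)
m1 <- colMeans(x1);  m2 <- colMeans(x2)
S1 <- cov(x1);  S2 <- cov(x2)

## James quadratic form with S-tilde = S1 / n1 + S2 / n2
d <- m1 - m2
J <- as.numeric( d %*% solve(S1 / n1 + S2 / n2, d) )

## Sum of the two one-sample T^2 statistics evaluated at mu (assumed unscaled)
T2sum <- function(mu) {
  n1 * as.numeric( (m1 - mu) %*% solve(S1, m1 - mu) ) +
  n2 * as.numeric( (m2 - mu) %*% solve(S2, m2 - mu) )
}

## Closed-form common mean
W1 <- n1 * solve(S1);  W2 <- n2 * solve(S2)
mc <- as.vector( solve(W1 + W2, W1 %*% m1 + W2 %*% m2) )

## Common mean found by numerical minimisation of the sum
opt <- optim( (m1 + m2) / 2, T2sum, method = "BFGS" )

c(james = J, sum.T2 = T2sum(mc), optimised = opt$value)
rbind(closed.form = mc, optimised = opt$par)
```

The first two printed values agree up to rounding error, and the optimised value can only approach them from above, since $\pmb{\mu}_c$ is the exact minimiser of a convex quadratic function.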

Value

A list including:

`tests`	A vector with two values, the James test statistic value and the sum of the two Hotelling's test statistic using the common mean.
`mathematics.mean`	The common mean computed via the closed form expression seen in the help file of `james`.
`optimised.mean`	The common mean vector obtained from the minimisation process.

Author(s)

Michail Tsagris.

R implementation and documentation: Michail Tsagris mtsagris@uoc.gr.

References

Emerson S. (2009). Small sample performance and calibration of the Empirical Likelihood method. PhD thesis, Stanford university.

James G.S. (1954). Tests of Linear Hypotheses in Univariate and Multivariate Analysis when the Ratios of the Population Variances are Unknown. Biometrika, 41(1/2): 19–43.

Examples

james.hotel( as.matrix(iris[1:50, 1:4]), as.matrix(iris[51:100, 1:4]) )
james( as.matrix(iris[1:50, 1:4]), as.matrix(iris[51:100, 1:4]), R = 1 )

Repeated measures ANOVA (univariate data) using Hotelling's $T^2$ test

Description

Repeated measures ANOVA (univariate data) using Hotelling's $T^2$ test.

Usage

rm.hotel(x, a = 0.05)

Arguments

`x`	A numerical matrix with the repeated measurements. Each column contains the values of the repeated measurements.
`a`	The level of significance, default value is equal to 0.05.

Details

We now show how one can use Hotelling's $T^2$ test to analyse univariate repeated measures. Univariate analysis of variance for repeated measures is the classical way, but we can use this multivariate test as well. In the repeated measures ANOVA case, we have many repeated observations from the same $n$ subjects, usually at different time points, and the interest is to see whether the means of the samples are equal or not, that is $H_0:\mu_1=\mu_2=\ldots=\mu_k$ assuming $k$ repeated measurements. We can of course change this null hypothesis and test many combinations of means. The idea in any case is to construct a matrix of contrasts. I will focus here on the first case only and in particular the null hypothesis and the matrix of contrasts $\bf C$ are

$\left( {\begin{array}{c} \mu_1=\mu_2 \\ \mu_1=\mu_3 \\ \vdots \\ \mu_1=\mu_k \end{array}} \right)= \left( {\begin{array}{ccccc} 1 & -1 & 0 & \ldots & 0 \\ 1 & 0 & -1 & \ldots & 0 \\ \vdots & \vdots & \vdots & \vdots & \vdots \\ 1 & 0 & 0 & \ldots & -1 \\ \end{array}} \right)\pmb{\mu}={\bf C}\pmb{\mu}.$

The contrast matrix $\bf C$ has $k-1$ independent rows and if there is no treatment effect, ${\bf C}\pmb{\mu}={\bf 0}$ .

The test statistic is

$T_r^2=\frac{\left(n-k+1\right)}{\left(n-1\right)\left(k-1\right)}n\left({\bf C}\bar{\bf x}\right)^T \left({\bf CSC}^T\right)^{-1}\left({\bf C}\bar{\bf x}\right) \sim F_{k-1,n-k+1}.$
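The construction of the contrast matrix and of the test statistic can be sketched in base R as follows. This is an illustration and not the code of `rm.hotel`; the helper `rm_stat` and the second contrast matrix `C2` (successive differences) are introduced here only to show that the statistic does not depend on which full-rank set of contrasts is used.

```r
## Sketch only (base R); not the internal code of rm.hotel().
x <- as.matrix(iris[, 1:4])    ## treated as repeated measurements
n <- nrow(x);  k <- ncol(x)
xbar <- colMeans(x)
S <- cov(x)

## Contrast matrix with rows (1, -1, 0, ...), (1, 0, -1, ...), ...
C <- cbind(1, -diag(k - 1))

## Alternative contrasts: successive differences mu_j - mu_{j+1}
C2 <- cbind(diag(k - 1), 0) - cbind(0, diag(k - 1))

## Scaled T^2 statistic for a given contrast matrix (hypothetical helper)
rm_stat <- function(C) {
  Cx <- C %*% xbar
  T2 <- n * as.numeric( t(Cx) %*% solve(C %*% S %*% t(C), Cx) )
  (n - k + 1) / ( (n - 1) * (k - 1) ) * T2
}

stat <- rm_stat(C)
pvalue <- pf(stat, k - 1, n - k + 1, lower.tail = FALSE)
crit <- qf(0.95, k - 1, n - k + 1)    ## critical value for a = 0.05
c(test = stat, pvalue = pvalue, numer.df = k - 1, denom.df = n - k + 1,
  critical = crit)
```

Calling `rm_stat(C2)` returns the same value as `rm_stat(C)`, because both matrices span the same space of contrasts.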

Value

A list including:

`m`	The mean vector.
`result`	A vector with the test statistic value, its associated p-value, the numerator and denominator degrees of freedom and the critical value.

Author(s)

Michail Tsagris.

R implementation and documentation: Michail Tsagris mtsagris@uoc.gr.

Examples

x <- as.matrix(iris[, 1:4]) ## assume they are repeated measurements
rm.hotel(x)
