Title: | Inference of Linear Mixed Models Through MM Algorithm |
---|---|
Description: | The main function MMEst() performs (Restricted) Maximum Likelihood in a variance component mixed models using a Min-Max (MM) algorithm (Laporte, F., Charcosset, A. & Mary-Huard, T. (2022) <doi:10.1371/journal.pcbi.1009659>). |
Authors: | Fabien Laporte [aut, cre], Tristan Mary-Huard [aut], GQMS CoreFunctions Team [aut] |
Maintainer: | Fabien Laporte <[email protected]> |
License: | GPL (>= 2) |
Version: | 3.0.3 |
Built: | 2025-02-28 07:36:27 UTC |
Source: | CRAN |
This package provides a function to perform either ML or ReML estimation in a Variance Component Mixed Model. The optimization of the (possibly Restricted) log-likelihood is perfomed using the Min-Max algorithm described in Hunter et al. (2004). Depending on the number of variance components, different computational tricks are used to speed up inference. Additionally, the AnovaTest
function provides Type I, Type III and Type III Kenward Roger approximation test series for fixed effects. The nullity of a specific linear combination can also be tested.
Variance Component Mixed Models are mixed models of the form
where Y is the response vector, X and are respectively the incidence matrix and the coefficient vector associated with the fixed effects,
is the kth vector of random effects and corresponds to its associated incidence matrix. All random effect are assumed to be Gaussian with mean 0 and covariance
, where
is a known correlation matrix and
is an unknown variance parameter. All random effects are assumed to be independent. In many applications the last random component corresponds to the error and therefore both
and
correspond to the identity matrix.
In such models the inference of both the unknown variance components ,...,
and the fixed effect
can be achieved through Maximum Likelihood (ML) or Restricted Maximum Likelihood (ReML) estimation. Neither ML nor ReML yield close form expressions of the estimates, consequently the maximization of the (restricted) log-likelihood has to be performed numerically. This package provides the user with Min-Max algorithms for the optimization. Efficient tricks such as profiling, MME trick and MM acceleration are implemented for computational efficiency (see Johnson et al. (1995), Varadhan et al. (2008) for details). The main function
MMEst
handles parallel inference of multiple models sharing the same set of correlation matrices - but possibly different fixed effects, an usual situation in GWAS analysis for instance.
Fabien Laporte and Tristan Mary-Huard
Maintainer: Fabien Laporte <[email protected]>
Laporte, F., Charcosset, A., & Mary-Huard, T. (2022). Efficient ReML inference in variance component mixed models using a Min-Max algorithm. PLOS Computational Biology, 18(1), e1009659.
Johnson, D. L., & Thompson, R. (1995). Restricted maximum likelihood estimation of variance components for univariate animal models using sparse matrix techniques and average information. Journal of dairy science, 78(2), 449-456.
Hunter, D. R., & Lange, K. (2004). A tutorial on MM algorithms. The American Statistician, 58(1), 30-37.
Kenward, M. G., & Roger, J. H. (1997). Small sample inference for fixed effects from restricted maximum likelihood. Biometrics, 983-997.
Varadhan, R., & Roland, C. (2008). Simple and globally convergent methods for accelerating the convergence of any EM algorithm. Scandinavian Journal of Statistics, 35(2), 335-353.
Zhou, H., Hu, L., Zhou, J., & Lange, K. (2015). MM algorithms for variance components models. arXiv preprint arXiv:1509.07426.
This function computes Type I and Type III tests for each fixed effect of a model, as displayed by the MMEst
function. Alternatively, a specific linear combination of the fixed parameters may be tested (at 0).
AnovaTest(ResMMEst , TestedCombination=NULL , Type = "TypeIII" , Cofactor = NULL , X = NULL , formula = NULL , VarList = NULL , NbCores=1)
AnovaTest(ResMMEst , TestedCombination=NULL , Type = "TypeIII" , Cofactor = NULL , X = NULL , formula = NULL , VarList = NULL , NbCores=1)
ResMMEst |
A list as displayed by the |
TestedCombination |
A contrast matrix or a list of contrast matrices |
Type |
"TypeI", "TypeIII" or "KR" (default is "TypeIII"). AnovaTest will compute tests of the given type for each fixed effect in the model. The option is ignored if a |
Cofactor |
The incidence matrix corresponding to fixed effects common to all models used in |
X |
The incidence matrix or a list of incidence matrices corresponding to fixed effects specific to each model used in |
formula |
The formula object specifying the fixed effect part of all models separated by + operators used in |
VarList |
The list of correlation matrices associated with random and residual effects used in |
NbCores |
The number of cores to be used. |
If no TestedCombination
is provided, the function performs either Type I or Type III tests for each fixed effect in the model (default is Type III). If TestedCombination
is provided, each linear combination in TestedCombination
is tested at 0 using a Wald test. No check is performed regarding the estimability of the linear combination to be tested. If the dimension of the combination does not match with the dimension of ResMMEst
, a NA
is returned.
The output of the function is a list with as many items as in the original input list ResMMEst
. For each item of ResMMEst
, a table is provided that contains the Wald test statistics, p-values and degrees of freedom for all tested hypotheses.
F. Laporte and T. Mary-Huard
Kenward, M. G., & Roger, J. H. (1997). Small sample inference for fixed effects from restricted maximum likelihood. Biometrics, 983-997.
require('MM4LMM') data(QTLDetectionExample) Pheno <- QTLDetectionExample$Phenotype Geno <- QTLDetectionExample$Genotype Kinship <- QTLDetectionExample$Kinship ##Build the VarList object VL <- list(Additive = Kinship , Error = diag(1,length(Pheno))) ##Perform inference Result <- MMEst(Y=Pheno , X = Geno , VarList = VL) ##Compute tests AOV <- AnovaTest(Result,Type="TypeI") ##Test specific contrast matrix TC = matrix(c(0,1),nrow=1) AOV2 <- AnovaTest(Result, TestedCombination = TC) AOV3 <- AnovaTest(Result, TestedCombination = TC , Type="KR" , X = Geno , VarList = VL)
require('MM4LMM') data(QTLDetectionExample) Pheno <- QTLDetectionExample$Phenotype Geno <- QTLDetectionExample$Genotype Kinship <- QTLDetectionExample$Kinship ##Build the VarList object VL <- list(Additive = Kinship , Error = diag(1,length(Pheno))) ##Perform inference Result <- MMEst(Y=Pheno , X = Geno , VarList = VL) ##Compute tests AOV <- AnovaTest(Result,Type="TypeI") ##Test specific contrast matrix TC = matrix(c(0,1),nrow=1) AOV2 <- AnovaTest(Result, TestedCombination = TC) AOV3 <- AnovaTest(Result, TestedCombination = TC , Type="KR" , X = Geno , VarList = VL)
This function computes the BLUP for each random vector included in the MMEst
output. Note that this function can be called only if the argument X
of MMEst
was set to NULL
MMBlup(Y,Cofactor = NULL, X = NULL, fmla = NULL,ZList=NULL,VarList,ResMM)
MMBlup(Y,Cofactor = NULL, X = NULL, fmla = NULL,ZList=NULL,VarList,ResMM)
Y |
The vector of response values used in the function |
Cofactor |
The incidence matrix corresponding to fixed effects common to all models to be adjusted used in the function |
X |
Must be |
fmla |
The formula object specifying the fixed effect part of all models separated by + operators used in the function |
ZList |
The list of incidence matrices associated with random and residual effects used in the function |
VarList |
The list of covariance matrices associated with random and residual effects used in the function |
ResMM |
A list as displayed by the |
The function returns a list where each element corresponds to the Best Linear Unbiased Prediction of a random component of the model.
GQMS CoreFunctions Team
require('MM4LMM') data(VarianceComponentExample) DataHybrid <- VarianceComponentExample$Data KinF <- VarianceComponentExample$KinshipF KinD <- VarianceComponentExample$KinshipD ##Build incidence matrix for each random effect Zf <- t(sapply(as.character(DataHybrid$CodeFlint), function(x) as.numeric(rownames(KinF)==x))) Zd <- t(sapply(as.character(DataHybrid$CodeDent), function(x) as.numeric(rownames(KinD)==x))) ##Build the VarList and ZList objects VL = list(Flint=KinF , Dent=KinD , Error = diag(1,nrow(DataHybrid))) ZL <- list(Flint=Zf , Dent=Zd , Error = diag(1,nrow(DataHybrid))) ##Perform inference #A first way to call MMEst ResultVA <- MMEst(Y=DataHybrid$Trait , Cofactor = matrix(DataHybrid$Trial) , ZList = ZL , VarList = VL) BlupVA <- MMBlup(Y=DataHybrid$Trait , Cofactor = matrix(DataHybrid$Trial) , ZList = ZL , VarList = VL , ResMM=ResultVA)
require('MM4LMM') data(VarianceComponentExample) DataHybrid <- VarianceComponentExample$Data KinF <- VarianceComponentExample$KinshipF KinD <- VarianceComponentExample$KinshipD ##Build incidence matrix for each random effect Zf <- t(sapply(as.character(DataHybrid$CodeFlint), function(x) as.numeric(rownames(KinF)==x))) Zd <- t(sapply(as.character(DataHybrid$CodeDent), function(x) as.numeric(rownames(KinD)==x))) ##Build the VarList and ZList objects VL = list(Flint=KinF , Dent=KinD , Error = diag(1,nrow(DataHybrid))) ZL <- list(Flint=Zf , Dent=Zd , Error = diag(1,nrow(DataHybrid))) ##Perform inference #A first way to call MMEst ResultVA <- MMEst(Y=DataHybrid$Trait , Cofactor = matrix(DataHybrid$Trial) , ZList = ZL , VarList = VL) BlupVA <- MMBlup(Y=DataHybrid$Trait , Cofactor = matrix(DataHybrid$Trial) , ZList = ZL , VarList = VL , ResMM=ResultVA)
This is the main function of the MM4LMM package. It performs inference in a variance component mixed model using a Min-Max algorithm. Inference in multiple models (e.g. for GWAS analysis) can also be performed.
MMEst(Y, Cofactor = NULL, X = NULL, formula=NULL, VarList, ZList = NULL, Method = "Reml", Henderson=NULL, Init = NULL, CritVar = 0.001, CritLogLik = 0.001, MaxIter = 100, NbCores = 1, Verbose = TRUE)
MMEst(Y, Cofactor = NULL, X = NULL, formula=NULL, VarList, ZList = NULL, Method = "Reml", Henderson=NULL, Init = NULL, CritVar = 0.001, CritLogLik = 0.001, MaxIter = 100, NbCores = 1, Verbose = TRUE)
Y |
A vector of response values. |
Cofactor |
An incidence matrix corresponding to fixed effects common to all models to be adjusted. If |
X |
An incidence matrix or a list of incidence matrices corresponding to fixed effects specific to each model. If |
formula |
A formula object specifying the fixed effect part of all models separated by + operators. To specify an interaction between |
VarList |
A list of covariance matrices associated with random and residual effects. |
ZList |
A list of incidence matrices associated with random and residual effects (default is |
Method |
The method used for inference. Available methods are "Reml" (Restricted Maximum Likelihood) and "ML" (Maximum Likelihood). |
Henderson |
If |
Init |
A vector of initial values for variance parameters (default is |
CritVar |
Value of the criterion for the variance components to stop iteration. (see Details) |
CritLogLik |
Value of the criterion for the log-likelihood to stop iteration. (see Details) |
MaxIter |
Maximum number of iterations per model. |
NbCores |
Number of cores to be used. |
Verbose |
A boolean describing if messages have to be printed (TRUE) or not (FALSE). Default is TRUE. |
If X
is NULL
, the following model is fitted:
with the matrix provided in
Cofactor
, the unknown fixed effects,
the incidence matrix provided for the kth component of
ZList
and the kth vector of random effects. If
ZList
is unspecified, all incidence matrices are assumed to be the Identity matrix. Random effects are assumed to follow a Gaussian distribution with mean 0 and covariance matrix , where
is the kth correlation matrix provided in
VarList
.
If X
is not NULL
, the following model is fitted for each i:
where is the incidence matrix corresponding to the ith component (i.e. column if
is a matrix, element otherwise) of
, and
is the (unknow) fixed effect associated to
.
All models are fitted using the MM algorithm. If Henderson
=TRUE
, at each step the quantities required for updating the variance components are computed using the Mixed Model Equation (MME) trick. See Johnson et al. (1995) for details.
The result is a list where each element corresponds to a fitted model. Each element displays the following:
Beta |
Estimated values of |
Sigma2 |
Estimated values of |
VarBeta |
Variance matrix of |
LogLik (Method) |
The value of the (restricted, if |
NbIt |
The number of iterations required to reach the optimum |
Method |
The method used for the inference |
attr |
An integer vector with an entry for each element of |
Factors |
Names of each term in the formula |
F. Laporte and T. Mary-Huard
Laporte, F., Charcosset, A., & Mary-Huard, T. (2022). Efficient ReML inference in variance component mixed models using a Min-Max algorithm. PLOS Computational Biology, 18(1), e1009659.
Johnson, D. L., & Thompson, R. (1995). Restricted maximum likelihood estimation of variance components for univariate animal models using sparse matrix techniques and average information. Journal of dairy science, 78(2), 449-456.
Hunter, D. R., & Lange, K. (2004). A tutorial on MM algorithms. The American Statistician, 58(1), 30-37.
Zhou, H., Hu, L., Zhou, J., & Lange, K. (2015). MM algorithms for variance components models. arXiv preprint arXiv:1509.07426.
require('MM4LMM') #### Example 1: variance component analysis, 1 model data(VarianceComponentExample) DataHybrid <- VarianceComponentExample$Data KinF <- VarianceComponentExample$KinshipF KinD <- VarianceComponentExample$KinshipD ##Build incidence matrix for each random effect Zf <- t(sapply(as.character(DataHybrid$CodeFlint), function(x) as.numeric(rownames(KinF)==x))) Zd <- t(sapply(as.character(DataHybrid$CodeDent), function(x) as.numeric(rownames(KinD)==x))) ##Build the VarList and ZList objects VL = list(Flint=KinF , Dent=KinD , Error = diag(1,nrow(DataHybrid))) ZL <- list(Flint=Zf , Dent=Zd , Error = diag(1,nrow(DataHybrid))) ##Perform inference #A first way to call MMEst ResultVA <- MMEst(Y=DataHybrid$Trait , Cofactor = matrix(DataHybrid$Trial) , ZList = ZL , VarList = VL) length(ResultVA) print(ResultVA) #A second way to call MMEst (same result) Formula <- as.formula('~ Trial') ResultVA2 <- MMEst(Y=DataHybrid$Trait , Cofactor = DataHybrid, formula = Formula , ZList = ZL , VarList = VL) length(ResultVA2) print(ResultVA2) #### Example 2: Marker Selection with interaction between Cofactor and X matrix Formula <- as.formula('~ Trial+Xeffect+Xeffect:Trial') ResultVA3 <- MMEst(Y=DataHybrid$Trait , Cofactor = DataHybrid, X = VarianceComponentExample$Markers, formula = Formula , ZList = ZL , VarList = VL) length(ResultVA3) print(ResultVA3[[1]]) #### Example 3: QTL detection with two variance components data(QTLDetectionExample) Pheno <- QTLDetectionExample$Phenotype Geno <- QTLDetectionExample$Genotype Kinship <- QTLDetectionExample$Kinship ##Build the VarList object VLgd <- list(Additive=Kinship , Error=diag(1,length(Pheno))) ##Perform inference ResultGD <- MMEst(Y=Pheno , X=Geno , VarList=VLgd , CritVar = 10e-5) length(ResultGD) print(ResultGD[[1]])
require('MM4LMM') #### Example 1: variance component analysis, 1 model data(VarianceComponentExample) DataHybrid <- VarianceComponentExample$Data KinF <- VarianceComponentExample$KinshipF KinD <- VarianceComponentExample$KinshipD ##Build incidence matrix for each random effect Zf <- t(sapply(as.character(DataHybrid$CodeFlint), function(x) as.numeric(rownames(KinF)==x))) Zd <- t(sapply(as.character(DataHybrid$CodeDent), function(x) as.numeric(rownames(KinD)==x))) ##Build the VarList and ZList objects VL = list(Flint=KinF , Dent=KinD , Error = diag(1,nrow(DataHybrid))) ZL <- list(Flint=Zf , Dent=Zd , Error = diag(1,nrow(DataHybrid))) ##Perform inference #A first way to call MMEst ResultVA <- MMEst(Y=DataHybrid$Trait , Cofactor = matrix(DataHybrid$Trial) , ZList = ZL , VarList = VL) length(ResultVA) print(ResultVA) #A second way to call MMEst (same result) Formula <- as.formula('~ Trial') ResultVA2 <- MMEst(Y=DataHybrid$Trait , Cofactor = DataHybrid, formula = Formula , ZList = ZL , VarList = VL) length(ResultVA2) print(ResultVA2) #### Example 2: Marker Selection with interaction between Cofactor and X matrix Formula <- as.formula('~ Trial+Xeffect+Xeffect:Trial') ResultVA3 <- MMEst(Y=DataHybrid$Trait , Cofactor = DataHybrid, X = VarianceComponentExample$Markers, formula = Formula , ZList = ZL , VarList = VL) length(ResultVA3) print(ResultVA3[[1]]) #### Example 3: QTL detection with two variance components data(QTLDetectionExample) Pheno <- QTLDetectionExample$Phenotype Geno <- QTLDetectionExample$Genotype Kinship <- QTLDetectionExample$Kinship ##Build the VarList object VLgd <- list(Additive=Kinship , Error=diag(1,length(Pheno))) ##Perform inference ResultGD <- MMEst(Y=Pheno , X=Geno , VarList=VLgd , CritVar = 10e-5) length(ResultGD) print(ResultGD[[1]])
This function computes the covariance matrix of variance estimators using either the inverse of the Expected Hessian Matrix or the inverse of the Average Information Matrix.
MMVcov(ResMM , Y , Cofactor = NULL , formula = NULL, ZList = NULL , VarList , information="Expected")
MMVcov(ResMM , Y , Cofactor = NULL , formula = NULL, ZList = NULL , VarList , information="Expected")
ResMM |
A list as displayed by the |
Y |
The vector of response values used in the function |
Cofactor |
The incidence matrix corresponding to fixed effects common to all models to be adjusted used in the function |
formula |
The formula object specifying the fixed effect part of all models separated by + operators used in the function |
ZList |
The list of incidence matrices associated with random and residual effects used in the function |
VarList |
The list of covariance matrices associated with random and residual effects used in the function |
information |
A string specifying the method used to approximate the covariance matrix. It can be either "Expected" (default) to use the Expected Hessian Matrix or "AI" to use the Average Information Matrix. The AI matrix is always computed using Reml estimates whereas the expected hessian matrix could also be used for ML estimates. |
If information
is not specified then the algorithm computes the covariance matrix using the Expected matrix based on the inference method (Reml or ML) used in the first item of ResMM
. If information
is equal to "AI" then it computes the AI matrix to calculate the covariance matrix. Only the first item of ResMM
is analyzed.
The output of the function is a list:
vcov |
The covariance matrix of the variance estimators |
SE |
The standard errors of the variance estimators (the square root of the covariance matrix diagonal) |
F. Laporte and T. Mary-Huard
require('MM4LMM') data(VarianceComponentExample) DataHybrid <- VarianceComponentExample$Data KinF <- VarianceComponentExample$KinshipF KinD <- VarianceComponentExample$KinshipD ##Build incidence matrix for each random effect Zf <- t(sapply(as.character(DataHybrid$CodeFlint), function(x) as.numeric(rownames(KinF)==x))) Zd <- t(sapply(as.character(DataHybrid$CodeDent), function(x) as.numeric(rownames(KinD)==x))) ##Build the VarList and ZList objects VL = list(Flint=KinF , Dent=KinD , Error = diag(1,nrow(DataHybrid))) ZL <- list(Flint=Zf , Dent=Zd , Error = diag(1,nrow(DataHybrid))) ##Perform inference #A first way to call MMEst ResultVA <- MMEst(Y=DataHybrid$Trait , Cofactor = matrix(DataHybrid$Trial) , ZList = ZL , VarList = VL) Expected_vcov <- MMVcov(ResMM=ResultVA,Y=DataHybrid$Trait, Cofactor = matrix(DataHybrid$Trial), , ZList = ZL , VarList = VL) AI_vcov <- MMVcov(ResMM=ResultVA,Y=DataHybrid$Trait, Cofactor = matrix(DataHybrid$Trial), , ZList = ZL , VarList = VL , information = "AI")
require('MM4LMM') data(VarianceComponentExample) DataHybrid <- VarianceComponentExample$Data KinF <- VarianceComponentExample$KinshipF KinD <- VarianceComponentExample$KinshipD ##Build incidence matrix for each random effect Zf <- t(sapply(as.character(DataHybrid$CodeFlint), function(x) as.numeric(rownames(KinF)==x))) Zd <- t(sapply(as.character(DataHybrid$CodeDent), function(x) as.numeric(rownames(KinD)==x))) ##Build the VarList and ZList objects VL = list(Flint=KinF , Dent=KinD , Error = diag(1,nrow(DataHybrid))) ZL <- list(Flint=Zf , Dent=Zd , Error = diag(1,nrow(DataHybrid))) ##Perform inference #A first way to call MMEst ResultVA <- MMEst(Y=DataHybrid$Trait , Cofactor = matrix(DataHybrid$Trial) , ZList = ZL , VarList = VL) Expected_vcov <- MMVcov(ResMM=ResultVA,Y=DataHybrid$Trait, Cofactor = matrix(DataHybrid$Trial), , ZList = ZL , VarList = VL) AI_vcov <- MMVcov(ResMM=ResultVA,Y=DataHybrid$Trait, Cofactor = matrix(DataHybrid$Trial), , ZList = ZL , VarList = VL , information = "AI")
This corresponds to (a sample of) the dataset presented in Rincent et al. (2014).
data("QTLDetectionExample")
data("QTLDetectionExample")
The format is: List of 3
Phenotype
Named num [1:259]
Genotype
int [1:259,1:10]
Kinship
num [1:259,1:259]
The list contains three elements:
Phenotype: a numeric vector containing phenotypes of individuals
Genotype: a matrix containing the genotypes of indviduals over 10 biallelic markers
Kinship: a matrix of simple relatedness coefficients between individuals
https://link.springer.com/article/10.1007/s00122-014-2379-7
Rincent, R., Nicolas, S., Bouchet, S., Altmann, T., Brunel, D., Revilla, P., ... & Schipprack, W. (2014). Dent and Flint maize diversity panels reveal important genetic potential for increasing biomass production. Theoretical and applied genetics, 127(11), 2313-2331.
data(QTLDetectionExample) names(QTLDetectionExample) ## maybe str(QTLDetectionExample) ; plot(QTLDetectionExample) ...
data(QTLDetectionExample) names(QTLDetectionExample) ## maybe str(QTLDetectionExample) ; plot(QTLDetectionExample) ...
This corresponds to (a sample of) the dataset presented in Giraud et al. (2017).
data("VarianceComponentExample")
data("VarianceComponentExample")
The format is: List of 3
Data
'data.frame': 432 obs. of 5 variables
Trial
a factor with 2 levels
CodeHybrid
a factor with 177 levels
CodeDent
a factor with 116 levels
CodeFlint
a factor with 122 levels
Trait
a numeric vector
KinshipD
num [1:116,1:116]
KinshipF
num [1:122,1:122]
The list contains three elements:
Data: a data frame containing the information about hybrids (trials, hybrid names, dent parental lines, flint parental lines and phenotypes)
KinshipD: a matrix of simple relatedness coefficients between dent lines
KinshipF: a matrix of simple relatedness coefficients between flint lines
doi:10.1534/genetics.117.300305
Giraud, H., Bauland, C., Falque, M., Madur, D., Combes, V., Jamin, P., ... & Blanchard, P. (2017). Reciprocal Genetics: Identifying QTL for General and Specific Combining Abilities in Hybrids Between Multiparental Populations from Two Maize (Zea mays L.) Heterotic Groups. Genetics, 207(3), 1167-1180.
data(VarianceComponentExample) names(VarianceComponentExample) ## maybe str(VarianceComponentExample) ; plot(VarianceComponentExample) ...
data(VarianceComponentExample) names(VarianceComponentExample) ## maybe str(VarianceComponentExample) ; plot(VarianceComponentExample) ...