Package 'metaMA' reference manual

Title:	Meta-Analysis for MicroArrays
Description:	Combination of either p-values or modified effect sizes from different studies to find differentially expressed genes.
Authors:	Guillemette Marot [aut,cre]
Maintainer:	Samuel Blanck <[email protected]>
License:	GPL
Version:	3.1.3
Built:	2024-11-27 06:55:09 UTC
Source:	CRAN

Meta-analysis for MicroArrays

Description

Combines either p-values or moderated effect sizes from different studies to find differentially expressed genes.

Details

Package:	metaMA
Type:	Package
Version:	3.1.2
Date:	2015-01-28
License:	GPL
LazyLoad:	yes

pvalcombination and EScombination are the most important functions to combine unpaired data.

pvalcombination combines p-values from individual studies.

EScombination combines effect sizes from individual studies.

pvalcombination.paired and EScombination.paired are to be used for paired data.

IDDIRR can help in interpreting the gain and loss of information due to meta-analysis.

Author(s)

Guillemette Marot <[email protected]>

References

Marot, G., Foulley, J.-L., Mayer, C.-D., Jaffrezic, F. (2009) Moderated effect size and p-value combinations for microarray meta-analyses. Bioinformatics. 25 (20): 2692-2699.

Examples

library(metaMA)
data(Singhdata)
EScombination(esets=Singhdata$esets,classes=Singhdata$classes)
pvalcombination(esets=Singhdata$esets,classes=Singhdata$classes)
#more details are provided in the vignette; only open it in interactive R sessions
if(interactive()){
  vignette("metaMA")
  }

Empirical Bayes statistics from limma analysis with unpaired data

Description

Computes empirical Bayes statistics from limma analysis with only one group effect.

Usage

calcfit2Diffrep(C1, C2)

Arguments

`C1`	Gene expression data of the arrays in the first condition. Each row of `C1` corresponds to one spot, each column to one replicate.
`C2`	Gene expression data of the arrays in the second condition. Each row of `C2` corresponds to one spot, each column to one replicate.

Details

Returns fit2 described in limma vignette. To be used with unpaired data.

Value

fit2

Note

See the Bioconductor limma vignette.

Direct effect size combination

Description

Combines effect sizes already calculated.

Usage

directEScombi(ES, varES, BHth = 0.05, useREM = TRUE)

Arguments

`ES`	Matrix of effect sizes. Each column of `ES` corresponds to one study and each row to one gene.
`varES`	Matrix of effect size variances. Each column of `varES` corresponds to one study and each row to one gene.
`BHth`	Benjamini Hochberg threshold. By default, the False Discovery Rate is controlled at 5%.
`useREM`	A logical value indicating whether or not to include the between-study variance into the model.

Details

Combines effect sizes with the method presented in (Choi et al., 2003).
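
As an illustration of the kind of model referred to here, the following base R sketch combines effect sizes gene by gene with a random-effects model, using the DerSimonian-Laird estimate of the between-study variance. It is a sketch under stated assumptions, not the code of directEScombi: the helper name combine_es_sketch and the toy matrices are hypothetical, and the package's own computation may differ in detail.

```r
# Hypothetical helper: per-gene random-effects combination (DerSimonian-Laird).
# ES, varES: genes x studies matrices, laid out as in the Arguments section.
combine_es_sketch <- function(ES, varES, useREM = TRUE) {
  k <- ncol(ES)
  w <- 1 / varES                                   # fixed-effect weights
  mu_fixed <- rowSums(w * ES) / rowSums(w)
  if (useREM) {
    Q <- rowSums(w * (ES - mu_fixed)^2)            # Cochran's Q, one per gene
    denom <- rowSums(w) - rowSums(w^2) / rowSums(w)
    tau2 <- pmax(0, (Q - (k - 1)) / denom)         # between-study variance
    w <- 1 / (varES + tau2)                        # tau2 is recycled along rows
  }
  mu <- rowSums(w * ES) / rowSums(w)               # combined effect size
  z <- mu * sqrt(rowSums(w))                       # mu divided by its standard error
  list(mu = mu, z = z)
}

# Toy data: gene 1 agrees across studies, gene 2 has conflicting signs
ES <- matrix(c(0.5, 0.5,
               1.0, -1.0), nrow = 2, byrow = TRUE)
varES <- matrix(0.1, nrow = 2, ncol = 2)
res <- combine_es_sketch(ES, varES)
rawpval <- 2 * pnorm(-abs(res$z))                  # two-sided p-values
which(p.adjust(rawpval, method = "BH") <= 0.05)    # indices kept at BHth = 0.05
```

Setting useREM = FALSE in the sketch drops the between-study variance and reduces it to a fixed-effect combination.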

Value

List

`DEindices`	Indices of differentially expressed genes at the chosen Benjamini Hochberg threshold.
`TestStatistic`	Vector with test statistics for differential expression in the meta-analysis.

References

Choi, J. K., Yu, U., Kim, S., and Yoo, O. J. (2003). Combining multiple microarray studies and modeling interstudy variation. Bioinformatics, 19 Suppl 1.

Direct p-value combination

Description

Combines one sided p-values with the inverse normal method.

Usage

directpvalcombi(pvalonesided, nrep, BHth = 0.05)

Arguments

`pvalonesided`	List of vectors of one sided p-values to be combined.
`nrep`	Vector of numbers of replicates used in each study to calculate the previous one-sided p-values.
`BHth`	Benjamini Hochberg threshold. By default, the False Discovery Rate is controlled at 5%.

Value

List

`DEindices`	Indices of differentially expressed genes at the chosen Benjamini Hochberg threshold.
`TestStatistic`	Vector with test statistics for differential expression in the meta-analysis.

Note

One-sided p-values are required to avoid directional conflicts. Then a two-sided test is performed to find differentially expressed genes.
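
The note above can be made concrete with a small base R sketch of the inverse normal method. It assumes that each study is weighted by the square root of its share of the total number of replicates; that weighting, the helper name inverse_normal_sketch and the toy p-values are assumptions for illustration, not a copy of the package code.

```r
# Hypothetical helper: inverse normal combination of one-sided p-values.
inverse_normal_sketch <- function(pvalonesided, nrep, BHth = 0.05) {
  weights <- sqrt(nrep / sum(nrep))                # assumed weights; squares sum to 1
  # One column of normal scores per study
  zmat <- do.call(cbind, lapply(pvalonesided, function(p) qnorm(1 - p)))
  stat <- as.vector(zmat %*% weights)              # standard normal under the null
  rawpval <- 2 * (1 - pnorm(abs(stat)))            # two-sided test on the combined score
  list(DEindices = which(p.adjust(rawpval, method = "BH") <= BHth),
       TestStatistic = stat)
}

# Toy data: gene 1 is consistently up, gene 2 is null, gene 3 is consistently down
pvals <- list(c(0.001, 0.5, 0.999), c(0.002, 0.5, 0.998))
res <- inverse_normal_sketch(pvals, nrep = c(10, 10))
res$DEindices
res$TestStatistic
```

Because the scores keep their sign, a gene that is up in one study and down in another tends to cancel out, which is the directional conflict that one-sided p-values guard against.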

Author(s)

Guillemette Marot

References

Hedges, L. and Olkin, I. (1985). Statistical Methods for Meta-Analysis. Academic Press.

Calculates effect sizes from given t or moderated t statistics

Description

Function not to be used separately.

Usage

effectsize(tstat, ntilde, m)

Arguments

`tstat`	Vector of test statistics and effect sizes.
`ntilde`	Proportion factor between a test statistic and its corresponding effect size.
`m`	Number of degrees of freedom.

Value

Matrix with one row per gene and the following columns:

`d`	Commonly used effect size (which is biased)
`vard`	Variance of the commonly used effect size
`dprime`	Unbiased effect size
`vardprime`	Variance of the unbiased effect size
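
The relation between the biased and unbiased effect sizes can be sketched as follows. The sketch assumes that d is the test statistic divided by sqrt(ntilde) and that the unbiased version applies the usual Hedges correction factor; it leaves out the variance columns, and the helper name effect_size_sketch is hypothetical. It is meant as an illustration rather than a reproduction of effectsize.

```r
# Hypothetical helper: biased and bias-corrected effect sizes from t statistics.
effect_size_sketch <- function(tstat, ntilde, m) {
  # Hedges' correction factor Gamma(m/2) / (sqrt(m/2) * Gamma((m-1)/2)),
  # computed on the log scale so that large m does not overflow gamma()
  cm <- exp(lgamma(m / 2) - 0.5 * log(m / 2) - lgamma((m - 1) / 2))
  d <- tstat / sqrt(ntilde)                        # assumed relation between t and d
  cbind(d = d, dprime = cm * d)
}

# Two groups of 10 replicates: ntilde = 10 * 10 / 20, m = 18 degrees of freedom
out <- effect_size_sketch(tstat = c(2, -1), ntilde = 5, m = 18)
out
```

The correction factor is below 1 and approaches 1 as the number of degrees of freedom grows, so dprime shrinks d slightly towards zero.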

Author(s)

Guillemette Marot with contribution from Ankur Ravinarayana Chakravarthy

References

Marot, G., Foulley, J.-L., Mayer, C.-D., Jaffrezic, F. Moderated effect size combination for microarray meta-analyses and comparison study. Submitted.

Examples

#for SMVar:
#effectsize(stati$TestStat[order(stati$GeneId)],length(classes[[i]]),stati$DegOfFreedom[order(stati$GeneId)])
#for Limma
#effectsize(fit2i$t,length(classes[[i]]),(fit2i$df.prior+fit2i$df.residual))

Effect size combination for unpaired data

Description

Calculates effect sizes from unpaired data either from classical or moderated t-tests (Limma, SMVar) for each study and combines these effect sizes.

Usage

EScombination(esets, classes, moderated = c("limma", "SMVar", "t")[1], BHth = 0.05)

Arguments

`esets`	List of matrices (or data frames), one matrix per study. Each matrix has one row per gene and one column per replicate and gives the expression data for both conditions with the order specified in the `classes` argument. All studies must have the same genes. If the data are already stored as ExpressionSets objects (cf. Bioconductor project), then `exprs(yourdata)` will give an appropriate element of the list `esets` used for this function.
`classes`	List of class memberships, one per study. Each vector or factor of the list can only contain two levels which correspond to the two conditions studied.
`moderated`	Method to calculate the test statistic inside each study from which the effect size is computed. `moderated` has to be chosen between "limma", "SMVar" and "t".
`BHth`	Benjamini Hochberg threshold. By default, the False Discovery Rate is controlled at 5%.
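
To make the expected layout of `esets` and `classes` concrete, here is a small base R sketch that builds two simulated studies and checks that they fit the description above. The data, the class labels and the object names are hypothetical; the final call is left commented out because it requires metaMA to be installed.

```r
set.seed(1)
n_genes <- 100

# Two hypothetical studies with different numbers of replicates;
# each class vector has exactly two levels (the labels are arbitrary here)
classes <- list(
  factor(rep(c("control", "treated"), each = 5)),
  factor(rep(c("control", "treated"), times = c(4, 6)))
)

# One matrix per study: genes in rows, replicates in columns,
# with columns in the same order as the matching class vector
esets <- lapply(classes, function(cl) {
  matrix(rnorm(n_genes * length(cl)), nrow = n_genes,
         dimnames = list(paste0("gene", seq_len(n_genes)), NULL))
})

# Sanity checks before running a meta-analysis
same_genes <- all(sapply(esets, function(m) identical(rownames(m), rownames(esets[[1]]))))
cols_match <- all(sapply(esets, ncol) == sapply(classes, length))
two_levels <- all(sapply(classes, nlevels) == 2)
c(same_genes = same_genes, cols_match = cols_match, two_levels = two_levels)

# res <- EScombination(esets = esets, classes = classes)  # requires metaMA
```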

Value

List

`Study1`	Vector of indices of differentially expressed genes in study 1. Similar names are given for the other individual studies.
`AllIndStudies`	Vector of indices of differentially expressed genes found by at least one of the individual studies.
`Meta`	Vector of indices of differentially expressed genes in the meta-analysis.
`TestStatistic`	Vector with test statistics for differential expression in the meta-analysis.

Note

The object returned invisibly by this function contains the values described above. In addition, the following quantities are printed: DE, IDD, Loss, IDR, IRR. They are defined in the function IDDIRR and in (Marot et al., 2009).

Author(s)

Guillemette Marot

References

Marot, G., Foulley, J.-L., Mayer, C.-D., Jaffrezic, F. (2009) Moderated effect size and p-value combinations for microarray meta-analyses. Bioinformatics. 25 (20): 2692-2699.

Examples

data(Singhdata)
#Meta-analysis
res=EScombination(esets=Singhdata$esets,classes=Singhdata$classes)
#Number of differentially expressed genes in the meta-analysis
length(res$Meta)
#To plot a histogram of raw p-values
rawpval=2*(1-pnorm(abs(res$TestStatistic)))
hist(rawpval,nclass=100)

Effect size combination for paired data

Description

Calculates effect sizes from paired data either from classical or moderated t-tests (Limma, SMVar) for each study and combines these effect sizes.

Usage

EScombination.paired(logratios, moderated = c("limma", "SMVar", "t")[1], BHth = 0.05)

Arguments

`logratios`	List of matrices (or data frames). Each matrix has one row per gene and one column per replicate and gives the logratios of one study. All studies must have the same genes.
`moderated`	Method to calculate the test statistic inside each study from which the effect size is computed. `moderated` has to be chosen between "limma", "SMVar" and "t".
`BHth`	Benjamini Hochberg threshold. By default, the False Discovery Rate is controlled at 5%.

Value

List

`Study1`	Vector of indices of differentially expressed genes in study 1. Similar names are given for the other individual studies.
`AllIndStudies`	Vector of indices of differentially expressed genes found by at least one of the individual studies.
`Meta`	Vector of indices of differentially expressed genes in the meta-analysis.
`TestStatistic`	Vector with test statistics for differential expression in the meta-analysis.

Author(s)

Guillemette Marot

References

Marot, G., Foulley, J.-L., Mayer, C.-D., Jaffrezic, F. (2009) Moderated effect size and p-value combinations for microarray meta-analyses. Bioinformatics. 25 (20): 2692-2699.

Examples

data(Singhdata)
#create artificially paired data:
artificialdata=lapply(Singhdata$esets,FUN=function(x) (x[,1:10]-x[,11:20]))
#Meta-analysis
res=EScombination.paired(artificialdata)
#Number of differentially expressed genes in the meta-analysis
length(res$Meta)
#To plot a histogram of raw p-values
rawpval=2*(1-pnorm(abs(res$TestStatistic)))
hist(rawpval,nclass=100)

Integration-driven Discovery and Integration-driven Revision Rates

Description

Calculates the gain or the loss of differentially expressed genes due to meta-analysis compared to individual studies.

Usage

IDDIRR(finalde, deindst)

Arguments

`finalde`	Vector of indices of differentially expressed genes after meta-analysis
`deindst`	Vector of indices of differentially expressed genes found at least in one study

Value

`DE`	Number of Differentially Expressed (DE) genes
`IDD`	Integration Driven Discoveries: number of genes that are declared DE in the meta-analysis that were not identified in any of the individual studies alone.
`Loss`	Number of genes that are declared DE in individual studies but not in meta-analysis.
`IDR`	Integration-driven Discovery Rate: percentage of the genes declared DE in the meta-analysis that were not identified in any of the individual studies alone (IDD/DE*100).
`IRR`	Integration-driven Revision Rate: percentage of the genes declared DE in at least one individual study that are not declared DE in the meta-analysis (Loss/length(deindst)*100).
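
A small worked example may help in reading these quantities. It applies the definitions above to two hypothetical index vectors in base R, without calling the package.

```r
finalde <- c(1, 2, 3, 4)   # hypothetical DE indices from the meta-analysis
deindst <- c(3, 4, 5)      # hypothetical DE indices found in at least one study

DE   <- length(finalde)                    # genes declared DE in the meta-analysis
IDD  <- sum(!(finalde %in% deindst))       # genes 1 and 2 are new discoveries
Loss <- sum(!(deindst %in% finalde))       # gene 5 is lost
IDR  <- IDD / DE * 100                     # percentage of meta-analysis genes that are new
IRR  <- Loss / length(deindst) * 100       # percentage of individual-study genes lost

res <- c(DE = DE, IDD = IDD, Loss = Loss, IDR = IDR, IRR = IRR)
round(res, 2)
# DE = 4, IDD = 2, Loss = 1, IDR = 50, IRR = 33.33
```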

Author(s)

Guillemette Marot

References

Marot, G., Foulley, J.-L., Mayer, C.-D., Jaffrezic, F. (2009) Moderated effect size and p-value combinations for microarray meta-analyses. Bioinformatics. 25 (20): 2692-2699.

Examples

data(Singhdata)
out=EScombination(esets=Singhdata$esets,classes=Singhdata$classes)
IDDIRR(out$Meta,out$AllIndStudies)

## The function is currently defined as
#function(finalde,deindst)
#{
#DE=length(finalde)
#gains=finalde[which(!(finalde %in% deindst))]
#IDD=length(gains)
#IDR=IDD/DE*100
#perte=which(!(deindst %in% finalde))
#Loss=length(perte)
#IRR=Loss/length(deindst)*100
#res=c(DE,IDD,Loss,IDR,IRR)
#names(res)=c("DE","IDD","Loss","IDR","IRR")
#res
#}

P-value combination for unpaired data

Description

Calculates differential expression p-values from unpaired data either from classical or moderated t-tests (Limma, SMVar) for each study and combines these p-values by the inverse normal method.

Usage

pvalcombination(esets, classes, moderated = c("limma", "SMVar", "t")[1], BHth = 0.05)

Arguments

`esets`	List of matrices (or data frames), one matrix per study. Each matrix has one row per gene and one column per replicate and gives the expression data for both conditions with the order specified in the `classes` argument. All studies must have the same genes. If the data are already stored as ExpressionSets objects (cf. Bioconductor project), then `exprs(yourdata)` will give an appropriate element of the list `esets` used for this function.
`classes`	List of class memberships, one per study. Each vector or factor of the list can only contain two levels which correspond to the two conditions studied.
`moderated`	Method to calculate the test statistic inside each study from which the p-value is computed. `moderated` has to be chosen between "limma", "SMVar" and "t".
`BHth`	Benjamini Hochberg threshold. By default, the False Discovery Rate is controlled at 5%.

Value

List

`Study1`	Vector of indices of differentially expressed genes in study 1. Similar names are given for the other individual studies.
`AllIndStudies`	Vector of indices of differentially expressed genes found by at least one of the individual studies.
`Meta`	Vector of indices of differentially expressed genes in the meta-analysis.
`TestStatistic`	Vector with test statistics for differential expression in the meta-analysis.

Author(s)

Guillemette Marot

References

Marot, G., Foulley, J.-L., Mayer, C.-D., Jaffrezic, F. (2009) Moderated effect size and p-value combinations for microarray meta-analyses. Bioinformatics. 25 (20): 2692-2699.

Examples

data(Singhdata)
#Meta-analysis
res=pvalcombination(esets=Singhdata$esets,classes=Singhdata$classes)
#Number of differentially expressed genes in the meta-analysis
length(res$Meta)
#To plot a histogram of raw p-values
rawpval=2*(1-pnorm(abs(res$TestStatistic)))
hist(rawpval,nclass=100)

P-value combination for paired data

Description

Calculates differential expression p-values from paired data either from classical or moderated t-tests (Limma, SMVar) for each study and combines these p-values by the inverse normal method.

Usage

pvalcombination.paired(logratios, moderated = c("limma", "SMVar", "t")[1], BHth = 0.05)

Arguments

`logratios`	List of matrices. Each matrix has one row per gene and one column per replicate and gives the logratios of one study. All studies must have the same genes.
`moderated`	Method to calculate the test statistic inside each study from which the p-value is computed. `moderated` has to be chosen between "limma", "SMVar" and "t".
`BHth`	Benjamini Hochberg threshold. By default, the False Discovery Rate is controlled at 5%.

Value

List

`Study1`	Vector of indices of differentially expressed genes in study 1. Similar names are given for the other individual studies.
`AllIndStudies`	Vector of indices of differentially expressed genes found by at least one of the individual studies.
`Meta`	Vector of indices of differentially expressed genes in the meta-analysis.
`TestStatistic`	Vector with test statistics for differential expression in the meta-analysis.

Author(s)

Guillemette Marot

References

Marot, G., Foulley, J.-L., Mayer, C.-D., Jaffrezic, F. (2009) Moderated effect size and p-value combinations for microarray meta-analyses. Bioinformatics. 25 (20): 2692-2699.

Examples

data(Singhdata)
#create artificially paired data:
artificialdata=lapply(Singhdata$esets,FUN=function(x) (x[,1:10]-x[,11:20]))
#Meta-analysis
res=pvalcombination.paired(artificialdata)
#Number of differentially expressed genes in the meta-analysis
length(res$Meta)
#To plot a histogram of raw p-values
rawpval=2*(1-pnorm(abs(res$TestStatistic)))
hist(rawpval,nclass=100)

Row t-tests

Description

Performs t-tests for unpaired data row by row.

Usage

row.ttest.stat(mat1, mat2)

Arguments

`mat1`	Matrix with data for the first condition
`mat2`	Matrix with data for the second condition

Details

This function is much faster than employing apply with FUN=t.test
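
The equivalence with t.test can be checked on simulated data. The sketch below re-implements the pooled-variance statistic in base R under the hypothetical name row_ttest_sketch and compares it with a row-by-row call to t.test with var.equal = TRUE. Note that the statistic uses the mean of the second condition minus the mean of the first, so the second matrix is passed first to t.test.

```r
# Hypothetical base R re-implementation of the row-wise pooled-variance t statistic
row_ttest_sketch <- function(mat1, mat2) {
  n1 <- ncol(mat1); n2 <- ncol(mat2); n <- n1 + n2
  v1 <- apply(mat1, 1, var)
  v2 <- apply(mat2, 1, var)
  vpool <- ((n1 - 1) * v1 + (n2 - 1) * v2) / (n - 2)   # pooled variance per row
  sqrt(n1 * n2 / n) * (rowMeans(mat2) - rowMeans(mat1)) / sqrt(vpool)
}

set.seed(42)
mat1 <- matrix(rnorm(50), nrow = 10)              # 10 genes, 5 replicates
mat2 <- matrix(rnorm(60, mean = 1), nrow = 10)    # 10 genes, 6 replicates

fast <- row_ttest_sketch(mat1, mat2)
slow <- sapply(seq_len(nrow(mat1)), function(i) {
  unname(t.test(mat2[i, ], mat1[i, ], var.equal = TRUE)$statistic)
})
all.equal(fast, slow)
```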

Value

Vector with t-test statistics

Examples

## The function is currently defined as
function(mat1,mat2){ 
n1<-dim(mat1)[2]
n2<-dim(mat2)[2] 
n<-n1+n2 
m1<-rowMeans(mat1,na.rm=TRUE) 
m2<-rowMeans(mat2,na.rm=TRUE) 
v1<-rowVars(mat1,na.rm=TRUE) 
v2<-rowVars(mat2,na.rm=TRUE) 
vpool<-(n1-1)/(n-2)*v1 + (n2-1)/(n-2)*v2 
tstat<-sqrt(n1*n2/n)*(m2-m1)/sqrt(vpool) 
return(tstat)}

Row paired t-tests

Description

Performs t-tests for paired data row by row.

Usage

row.ttest.statp(mat)

Arguments

`mat`	Matrix with data to be tested (for example, log-ratios in microarray experiments).

Details

This function is much faster than employing apply with FUN=t.test.
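
For paired data the same kind of check applies to the one-sample statistic. The following base R sketch, with the hypothetical helper row_ttest_paired_sketch, uses apply with sd in place of rowSds so that it needs no extra package, and compares the result with t.test applied to each row.

```r
# Hypothetical base R re-implementation of the row-wise one-sample t statistic
row_ttest_paired_sketch <- function(mat) {
  rowMeans(mat) / (apply(mat, 1, sd) / sqrt(ncol(mat)))
}

set.seed(7)
logratios <- matrix(rnorm(80, mean = 0.3), nrow = 10)   # 10 genes, 8 pairs

fast <- row_ttest_paired_sketch(logratios)
slow <- apply(logratios, 1, function(x) unname(t.test(x)$statistic))
all.equal(fast, slow)
```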

Value

Vector with t-test statistics.

Examples

## The function is currently defined as
function(mat){ 
m<-rowMeans(mat,na.rm=TRUE) 
sd<-rowSds(mat,na.rm=TRUE)  
tstat<-m/(sd*sqrt(1/dim(mat)[2])) 
return(tstat)}

Row variance of an array

Description

Calculates variances of each row of an array

Usage

rowVars(x, na.rm = TRUE)

Arguments

`x`	Array of two or more dimensions, containing numeric, complex, integer or logical values, or a numeric data frame.
`na.rm`	Logical. Should missing values (including NaN) be omitted from the calculations?

Details

This function is the same as applying apply with FUN=var but is a lot faster.
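
As a quick illustration of that claim, the sketch below defines a stand-alone copy of the row variance computation under the hypothetical name row_vars_sketch and compares it with apply and var on a small matrix that contains a missing value.

```r
# Hypothetical stand-alone version of the row variance computation
row_vars_sketch <- function(x, na.rm = TRUE) {
  n <- rowSums(!is.na(x))          # number of observed values per row
  n[n <= 1] <- NA                  # variance is undefined with fewer than 2 values
  rowSums((x - rowMeans(x, na.rm = na.rm))^2, na.rm = na.rm) / (n - 1)
}

x <- matrix(c(1, 2, 3,
              4, NA, 6,
              7, 8, 9), nrow = 3, byrow = TRUE)

fast <- row_vars_sketch(x)
slow <- apply(x, 1, var, na.rm = TRUE)
all.equal(fast, slow)
```

The vectorised version avoids one call to var per row, which is where the speed gain on large expression matrices comes from.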

Value

A numeric or complex array of suitable size, or a vector if the result is one-dimensional. The dimnames (or names for a vector result) are taken from the original array.

Examples

## The function is currently defined as
function (x,na.rm = TRUE) 
{
    sqr = function(x) x * x
    n = rowSums(!is.na(x))
    n[n <= 1] = NA
    return(rowSums(sqr(x - rowMeans(x,na.rm = na.rm)), na.rm = na.rm)/(n - 1))
  }

Singh dataset

Description

Publicly available microarray dataset artificially split into 5 studies.

Usage

data(Singhdata)

Format

List of 3 elements:

esets: List of 5 data frames corresponding to 5 artificial studies, each with 12625 genes and 20 replicates (10 normal samples and 10 tumoral samples)
classes: List of 5 numeric vectors with class memberships, one per study
geneNames: Factor with 12625 levels corresponding to gene names

Source

These data are available on the website http://www.bioinf.ucd.ie/people/ian/. We used 50 normal samples and 50 tumoral samples, leaving out the last 2 tumoral samples. The data are already normalized.

References

Singh, D., Febbo, P. G., Ross, K., Jackson, D. G., Manola, J., Ladd, C., Tamayo, P., Renshaw, A. A., D'Amico, A. V., Richie, J. P., Lander, E. S., Loda, M., Kantoff, P. W., Golub, T. R., and Sellers, W. R. (2002). Gene expression correlates of clinical prostate cancer behavior. Cancer Cell, 1(2): 203-209.

Examples

data(Singhdata)
