Title: | Combined Total and Allele Specific Reads Sequencing Study |
---|---|
Description: | Analysis of combined total and allele specific reads from the reciprocal cross study with RNA-seq data. |
Authors: | Vasyl Zhabotynsky [aut, cre], Wei Sun [aut], Fei Zou [aut] |
Maintainer: | Vasyl Zhabotynsky <[email protected]> |
License: | GPL (>= 2) |
Version: | 0.99.3 |
Built: | 2024-10-26 06:30:49 UTC |
Source: | CRAN |
This data set provides with example of experimental data for a subset of autosomal genes. The full model requires a combination of total read counts (y) - all the reads belonging for a gene, and finding out which of these reads we can specifically attribute to allele A or allele B - allele specific counts (n), separately the reads attributed specifically to allele B (n0B). Also, it includes the other data pieces to fit the model: kappas - total number of counts for each mouse, on log scale, index - specifying which cross each mouse belongs to, and geneids - Ensembl ids of genes. They, as well as the datasets simulated with simRX can be fitted using proc.trecase.A or proc.trec.A.
index |
vector defining the cross of the mouse, female - AB=1, BA=2, AA=3, BB=4, and male - AB=5, BA=6, AA=7, BB=8. If mice are of only one sex, AB=1, BA=2, AA=3, BB=4. |
y |
matrix of TReC counts. Note, the expected input assumes that inbred mice will be in the last columns of the table, after the last F1 mouse. |
n |
matrix of ASE counts for corresponding F1 mouse (classes 1, 2, 5, 6) for corresponding genes. |
n0B |
matrix of ASE counts belonging for allele B, for correponding genes and mice as in n. |
kappas |
A parameter, specifying log(overall TReC) for each mouse. |
geneids |
ids of genes, if not provided, rownames of the matrix y will be used |
Vasyl Zhabotynsky [email protected]
# see total read counts (TReC) for first 2 autosomal genes of a data example: data.A$y[1:2,]
# see total read counts (TReC) for first 2 autosomal genes of a data example: data.A$y[1:2,]
This data set provides with example of experimental data for a subset of autosomal genes. The full model requires a combination of total read counts (y) - all the reads belonging for a gene, and finding out which of these reads we can specifically attribute to allele A or allele B - allele specific counts (n), separately the reads attributed specifically to allele B (n0B). Also, tausB - is the Xce effect for each F1 mouse, which specifies the proportion of allele specific reads belonging to allele B. Also, it includes the other data pieces to fit the model: kappas - total number of counts for each mouse, on log scale, index - specifying which cross each mouse belongs to, and geneids - Ensembl ids of genes. They, as well as the datasets simulated with simRX can be fitted using proc.trecase.X or proc.trec.X.
index |
vector defining the cross of the mouse, female - AB=1, BA=2, AA=3, BB=4, and male - AB=5, BA=6, AA=7, BB=8. If mice are of only one sex, AB=1, BA=2, AA=3, BB=4. |
y |
matrix of TReC counts. Note, the expected input assumes that inbred mice will be in the last columns of the table, after the last F1 mouse. |
n |
matrix of ASE counts for corresponding F1 mouse (classes 1,2,5,6) for corresponding genes. |
n0B |
matrix of ASE counts belonging for allele B, for correponding genes and mice as in n. |
kappas |
A parameter, specifying as overall TReC for the mouse, on log scale |
tausB |
Xce effect: expression of allele B relative to the overall allele specific count for each mouse. Use some allele specific counts to estimate the effect. |
geneids |
ids of genes, if not provided, rownames of the matrix y will be used |
index |
index defining which mouse belongs to which cross |
y |
modified total read counts |
n |
modified allele specific counts |
n0B |
modified allele specific coutns, belonging to allele B |
kappas |
offset, defining a library size for each mouse |
tausB |
Xce effect for each mouse, for a given cross |
geneids |
Ensembl gene ids |
Vasyl Zhabotynsky [email protected]
# see total read counts (TReC) for first 2 X chromosome genes of a data example: data.X$y[1:2,]
# see total read counts (TReC) for first 2 X chromosome genes of a data example: data.X$y[1:2,]
Xce estimation for mice with allele specific reads.
get.tausB(n, n0B, geneids, min.cnt=50, exclude.prop=.05, Xist.ID="ENSMUSG00000086503")
get.tausB(n, n0B, geneids, min.cnt=50, exclude.prop=.05, Xist.ID="ENSMUSG00000086503")
n |
vector of allele specific counts for each mouse |
n0B |
vector of allele specific counts for allele B |
geneids |
gene IDs |
min.cnt |
minimum number of allele specific counts |
exclude.prop |
minimum proportion of allele specific counts for each allele |
Xist.ID |
and ID of Xist, to exclude it from estimating Xce, since Xce would 1-tausB |
output - matrix of 4 rows:
med.tauB |
taus estimated via median |
ave.tauB |
taus estimated via percent of allele B counts |
all.genes |
number of genes that had passed minimum count |
used.genes |
number of genes that had required percent of each allele |
each column represent respective mouse.
Vasyl Zhabotynsky [email protected]
# Estimating XCE effect for each mouse for X chromosome get.tausB(n=data.X$n, n0B=data.X$n0B, geneids=data.X$geneids)
# Estimating XCE effect for each mouse for X chromosome get.tausB(n=data.X$n, n0B=data.X$n0B, geneids=data.X$geneids)
Calculates negative log(likelihood) of an X chromosome joint TReC and ASE counts model at a given set of parameters
nLogLik(res, rc, genei, hessian=FALSE)
nLogLik(res, rc, genei, hessian=FALSE)
res |
result object from process function |
rc |
Read count data object created by readCounts function |
genei |
get results for i'th gene |
hessian |
a logical option whether to calculate a Hessian matrix, the default values is set to FALSE. |
output - list(nll=-log.likelihood,hessian=hessian matrix)
Vasyl Zhabotynsky [email protected]
process
, rcA
, readCounts
.
## Not run: # get negative-log likelihood at the given point: nLogLik(res=trecase.X.out, rc=rcX, genei=1, hessian=TRUE) ## End(Not run)
## Not run: # get negative-log likelihood at the given point: nLogLik(res=trecase.X.out, rc=rcX, genei=1, hessian=TRUE) ## End(Not run)
Performs optimization of one of four combinations: joint TReC and ASE or just TReC for autosome or X chromosome and tests with lrt test several hypotheses: additive, parent of origin, dominance, consistency of TreC and ASE additive effect, ASE only additive effect, sex, sex specific additive, dominance deviation for males.
process(rc, hessian=FALSE)
process(rc, hessian=FALSE)
rc |
an object of class readCounts. |
hessian |
a flag whether Hessian matrix for these genes should be calculated, by default set to FALSE |
a list of following matrices (if there is only one sex, only the relevant tests and matrices are outputed) :
pvals |
matrix of p-values from description for each gene corresponding row |
coef.full |
matrix of full model fit coefficients, -log(likelihood at these coefficients),phi, theta (2 overdispersion parameters used) |
coef.add |
matrix of additive restricted fit coefficients, -log(likelihood at these coefficients),phi, theta |
coef.poo |
matrix of parent of origin restricted fit coefficients, -log(likelihood at these coefficients),phi, theta |
coef.dom |
matrix of dominance restricted fit coefficients, -log(likelihood at these coefficients),phi, theta |
coef.same |
matrix of TReC=ASE additive restricted fit coefficients, -log(likelihood at these coefficients),phi, theta |
coef.ase.add |
matrix of ASE additive restricted fit coefficients, -log(likelihood at these coefficients),phi, theta |
coef.sex |
matrix of sex restricted fit coefficients, -log(likelihood at these coefficients),phi, theta |
coef.sex.add |
matrix of sex specific additive restricted fit coefficients, -log(likelihood at these coefficients),phi, theta |
coef.dev.dom |
matrix of dominance deviation for male restricted fit coefficients, -log(likelihood at these coefficients),phi, theta |
errorlist |
a list of errors |
hess.lst |
a list of heassian matrices, if parameter hessian is set to TRUE |
Vasyl Zhabotynsky [email protected]
get.tausB
,nLogLik
, data.X
, data.A
, rcA
, readCounts
.
## Not run: # fitting X chromosome data example, for a full model, i.e. assuming we have allele specific reads: trecase.A.out = process(rc=rcA) names(trecase.A.out) trecase.A.out$pval #alternatively for X chromosome: trecase.X.out = process(rc=rcX) names(trecase.X.out) trecase.X.out$pval ## End(Not run)
## Not run: # fitting X chromosome data example, for a full model, i.e. assuming we have allele specific reads: trecase.A.out = process(rc=rcA) names(trecase.A.out) trecase.A.out$pval #alternatively for X chromosome: trecase.X.out = process(rc=rcX) names(trecase.X.out) trecase.X.out$pval ## End(Not run)
This is an object of type readCounts provides with example of experimental data for a subset of autosomal genes. The full model requires a combination of total read counts (y) - all the reads belonging for a gene, and finding out which of these reads we can specifically attribute to allele A or allele B - allele specific counts (n), separately the reads attributed specifically to allele B (n0B). In autosomes Xce effect is absent, so it would be set to NULL for this dataset. Also, it includes the other data pieces to fit the model: kappas - total number of counts for each mouse, on log scale, index - specifying which cross each mouse belongs to, and geneids - Ensembl ids of genes. Such values also can be simulated with simRX can be fitted using process with appropriate options chrom="X" and field model to be either "full" or "short".
index |
vector defining the cross of the mouse, female - AB=1, BA=2, AA=3, BB=4, and male - AB=5, BA=6, AA=7, BB=8. If mice are of only one sex, AB=1, BA=2, AA=3, BB=4. |
y |
matrix of TReC counts. Note, the expected input assumes that inbred mice will be in the last columns of the table, after the last F1 mouse. |
n |
matrix of ASE counts for corresponding F1 mouse (classes 1,2,5,6) for corresponding genes. |
n0B |
matrix of ASE counts belonging for allele B, for correponding genes and mice as in n. |
kappas |
A parameter, specifying as overall TReC for the mouse, on log scale |
tausB |
Xce effect: expression of allele B relative to the overall allele specific count for each mouse. Set to NULL in autosomes. |
gene.switch |
For which genes Xce effect should be switched. Null for autosomes. |
geneids |
ids of genes, if not provided, rownames of the matrix y will be used |
chrom |
this field would be set to be "X" since this is dataset for chromosome X |
model |
set to be "full", can be modified to "short" to run a TReC oply model |
geneids |
Ensembl gene ids |
tech.ctrl |
a list of overdispersion boundaries and log(2) |
Vasyl Zhabotynsky [email protected]
# see total read counts (TReC) for first 2 X chromosome genes of a data example: rcX
# see total read counts (TReC) for first 2 X chromosome genes of a data example: rcX
This is an object of type readCounts provides with example of experimental data for a subset of X chromosome genes. The full model requires a combination of total read counts (y) - all the reads belonging for a gene, and finding out which of these reads we can specifically attribute to allele A or allele B - allele specific counts (n), separately the reads attributed specifically to allele B (n0B). Also, tausB - is the Xce effect for each F1 mouse, which specifies the proportion of allele specific reads belonging to allele B. Also, it includes the other data pieces to fit the model: kappas - total number of counts for each mouse, on log scale, index - specifying which cross each mouse belongs to, and geneids - Ensembl ids of genes. They, as well as the datasets simulated with simRX can be fitted using process with appropriate options chrom="X" and field model to be either "full" or "short".
genes.switch=genes.switch, geneids=geneids,
index |
vector defining the cross of the mouse, female - AB=1, BA=2, AA=3, BB=4, and male - AB=5, BA=6, AA=7, BB=8. If mice are of only one sex, AB=1, BA=2, AA=3, BB=4. |
y |
matrix of TReC counts. Note, the expected input assumes that inbred mice will be in the last columns of the table, after the last F1 mouse. |
n |
matrix of ASE counts for corresponding F1 mouse (classes 1,2,5,6) for corresponding genes. |
n0B |
matrix of ASE counts belonging for allele B, for correponding genes and mice as in n. |
kappas |
A parameter, specifying as overall TReC for the mouse, on log scale |
tausB |
Xce effect: expression of allele B relative to the overall allele specific count for each mouse. Use some allele specific counts to estimate the effect. |
gene.switch |
For which genes Xce effect should be switched. Xist gene set to be switched in this set. |
geneids |
ids of genes, if not provided, rownames of the matrix y will be used |
chrom |
this field would be set to be "X" since this is dataset for chromosome X |
model |
set to be "full", can be modified to "short" to run a TReC oply model |
geneids |
Ensembl gene ids |
tech.ctrl |
a list of overdispersion boundaries and log(2) |
Vasyl Zhabotynsky [email protected]
# see total read counts (TReC) for first 2 X chromosome genes of a data example: rcX
# see total read counts (TReC) for first 2 X chromosome genes of a data example: rcX
It should contain at least total read counts (TReC) and classification of crosses 1 to 8. To fit the full model should also have appropriate allele specific counts n and n0B. Also is used along with results of optimization as input to nLogLik function if one needs to calculate Hessian matrix.
index |
vector defining the cross of the mouse, female - AB=1, BA=2, AA=3, BB=4, and male - AB=5, BA=6, AA=7, BB=8. If mice are of only one sex, AB=1, BA=2, AA=3, BB=4. |
y |
matrix of TReC counts. Note, the expected input assumes that inbred mice will be in the last columns of the table, after the last F1 mouse. |
n |
matrix of ASE counts for corresponding F1 mouse (classes 1,2,5,6) for corresponding genes. |
n0B |
matrix of ASE counts belonging for allele B, for correponding genes and mice as in n. |
kappas |
A parameter, specifying as overall TReC for the mouse, on log scale |
tausB |
Xce effect: expression of allele B relative to the overall allele specific count for each mouse. Set to NULL in autosomes. |
gene.switch |
For which genes Xce effect should be switched. Null for autosomes. |
geneids |
ids of genes, if not provided, rownames of the matrix y will be used |
chrom |
this field would be set to be "X" since this is dataset for chromosome X |
model |
set to be "full", can be modified to "short" to run a TReC oply model |
geneids |
Ensembl gene ids |
tech.ctrl |
a list of overdispersion boundaries and log(2) |
Vasyl Zhabotynsky [email protected]
# see total read counts (TReC) for first 2 X chromosome genes of a data example: rcA = readCounts(index=data.A$index, y=data.A$y[1:2,], n=data.A$n[1:2,], n0B=data.A$n0B[1:2,], kappas=data.A$kappas, geneids=data.A$geneids[1:2])
# see total read counts (TReC) for first 2 X chromosome genes of a data example: rcA = readCounts(index=data.A$index, y=data.A$y[1:2,], n=data.A$n[1:2,], n0B=data.A$n0B[1:2,], kappas=data.A$kappas, geneids=data.A$geneids[1:2])
This function is producing simulated counts for the joint model with Negative-Binomial distribution for TReC and Beta-Binomial for ASE counts. The simulated dataset should be reformatted to readCounts format to be used for optimization.
simRX(b0f, b0m, b1f, b1m, beta_sex, beta_dom, beta_k=1, phi=1, theta=1, n=6, mean.base.cnt=50, range.base.cnt=60, perc.ase=.35, n.simu=1E4, is.X=FALSE, tauB=NULL, seed=NULL)
simRX(b0f, b0m, b1f, b1m, beta_sex, beta_dom, beta_k=1, phi=1, theta=1, n=6, mean.base.cnt=50, range.base.cnt=60, perc.ase=.35, n.simu=1E4, is.X=FALSE, tauB=NULL, seed=NULL)
b0f |
a female additive strain effect |
b0m |
a male additive strain effect |
b1f |
a female parent of origin effect |
b1m |
a male parent of origin effect |
beta_sex |
a sex effect |
beta_dom |
a dominance effect |
beta_k |
an effect associated with the library size kappas |
phi |
a Negative-Binomial overdispersion, default value is 1 |
theta |
a Beta-Binomial overdispersion, default value is 1 |
n |
a vector defining number of mice in each cross, default value is 6 |
mean.base.cnt |
a target expected number of counts for the base group (with no effects), default value is 50 |
range.base.cnt |
a range in which the expected number of counts for the base group will vary, default value is 60 |
perc.ase |
a percent reads that are allele-specific, default value is 35% |
n.simu |
a number of simulations, default value is 1E4 |
is.X |
a flag if the value to be simulated is X for chromosome (otherwise autosome), default value is FALSE |
tauB |
a value describing allelic imbalance - Xce effect for the cross, default value is NULL, in which case 50% will be simulated |
seed |
a random seed to be set, no set by default. |
output - 3 matrices with one row - one gene, one column - one mouse:
index |
vector defining the cross of the mouse, female - AB=1, BA=2, AA=3, BB=4, and male - AB=5, BA=6, AA=7, BB=8. If mice are of only one sex, AB=1, BA=2, AA=3, BB=4. |
y |
A matrix of total read counts |
n |
A matrix of allele specific counts |
n0B |
A matrix of allele specific counts associated with allele B |
kappas |
Offset parameter, given as overall TReC for the mouse. |
tausB |
In case of the simulating X chromosome the provided Xce effect is returned: expression of allele B relative to the overall allele specific count for each mouse. |
Vasyl Zhabotynsky [email protected]
# simulating autosomal data: dat.A = simRX(b0f=.5, b0m=.6, b1f=.3, b1m=.4, beta_sex=.1, beta_dom=.1, n.simu=1E1) names(dat.A) # simulating autosomal data: dat.X = simRX(b0f=.5, b0m=.6, b1f=.3, b1m=.4, beta_sex=.1, beta_dom=.1, n.simu=1E1, is.X=TRUE, tauB=.3) names(dat.X)
# simulating autosomal data: dat.A = simRX(b0f=.5, b0m=.6, b1f=.3, b1m=.4, beta_sex=.1, beta_dom=.1, n.simu=1E1) names(dat.A) # simulating autosomal data: dat.X = simRX(b0f=.5, b0m=.6, b1f=.3, b1m=.4, beta_sex=.1, beta_dom=.1, n.simu=1E1, is.X=TRUE, tauB=.3) names(dat.X)
A list containing test results as well as parameter estimates for joint model evaluated by process function for autosomal genes.
a list of following matrices (if there is only one sex, only the relevant tests and matrices are outputed) :
pvals |
matrix of p-values from description for each gene corresponding row |
coef.full |
matrix of full model fit coefficients, -log(likelihood at these coefficients),phi, theta (2 overdispersion parameters used) |
coef.add |
matrix of additive restricted fit coefficients, -log(likelihood at these coefficients),phi, theta |
coef.poo |
matrix of parent of origin restricted fit coefficients, -log(likelihood at these coefficients),phi, theta |
coef.dom |
matrix of dominance restricted fit coefficients, -log(likelihood at these coefficients),phi, theta |
coef.same |
matrix of TReC=ASE additive restricted fit coefficients, -log(likelihood at these coefficients),phi, theta |
coef.ase.add |
matrix of ASE additive restricted fit coefficients, -log(likelihood at these coefficients),phi, theta |
coef.sex |
matrix of sex restricted fit coefficients, -log(likelihood at these coefficients),phi, theta |
coef.sex.add |
matrix of sex specific additive restricted fit coefficients, -log(likelihood at these coefficients),phi, theta |
coef.dev.dom |
matrix of dominance deviation for male restricted fit coefficients, -log(likelihood at these coefficients),phi, theta |
errorlist |
a list of errors |
hess.lst |
a list of heassian matrices, if parameter hessian is set to TRUE |
Vasyl Zhabotynsky [email protected]
names(trecase.A.out)
names(trecase.A.out)
A list containing test results as well as parameter estimates for joint model evaluated by process function for autosomal genes.
a list of following matrices (if there is only one sex, only the relevant tests and matrices are outputed) :
pvals |
matrix of p-values from description for each gene corresponding row |
coef.full |
matrix of full model fit coefficients, -log(likelihood at these coefficients),phi, theta (2 overdispersion parameters used) |
coef.add |
matrix of additive restricted fit coefficients, -log(likelihood at these coefficients),phi, theta |
coef.poo |
matrix of parent of origin restricted fit coefficients, -log(likelihood at these coefficients),phi, theta |
coef.dom |
matrix of dominance restricted fit coefficients, -log(likelihood at these coefficients),phi, theta |
coef.same |
matrix of TReC=ASE additive restricted fit coefficients, -log(likelihood at these coefficients),phi, theta |
coef.ase.add |
matrix of ASE additive restricted fit coefficients, -log(likelihood at these coefficients),phi, theta |
coef.sex |
matrix of sex restricted fit coefficients, -log(likelihood at these coefficients),phi, theta |
coef.sex.add |
matrix of sex specific additive restricted fit coefficients, -log(likelihood at these coefficients),phi, theta |
coef.dev.dom |
matrix of dominance deviation for male restricted fit coefficients, -log(likelihood at these coefficients),phi, theta |
errorlist |
a list of errors |
hess.lst |
a list of heassian matrices, if parameter hessian is set to TRUE |
Vasyl Zhabotynsky [email protected]
names(trecase.X.out)
names(trecase.X.out)