Title: | Endogenous Perturbation Analysis of Cancer |
---|---|
Description: | Estimates sparse matrices A or G using fast lasso regression from mRNA transcript levels Y and CNA profiles U. Two models are provided, EPoC A where AY + U + R = 0 and EPoC G where Y = GU + E, the matrices R and E are so far treated as noise. For details see the manual page of 'lassoshooting' and the article Rebecka Jörnsten, Tobias Abenius, Teresia Kling, Linnéa Schmidt, Erik Johansson, Torbjörn E M Nordling, Bodil Nordlander, Chris Sander, Peter Gennemark, Keiko Funa, Björn Nilsson, Linda Lindahl, Sven Nelander (2011) <doi:10.1038/msb.2011.17>. |
Authors: | Rebecka Jornsten, Tobias Abenius, Sven Nelander |
Maintainer: | Tobias Abenius <[email protected]> |
License: | LGPL-3 |
Version: | 0.2.6-1.1 |
Built: | 2024-11-13 06:46:33 UTC |
Source: | CRAN |
EPoC (Endogenous Perturbation analysis of Cancer)
epocA(Y, U=NULL, lambdas=NULL, thr=1.0e-10, trace=0, ...)
epocG(Y, U, lambdas=NULL, thr=1.0e-10, trace=0, ...)
epoc.lambdamax(X, Y, getall=F, predictorix=NULL)
as.graph.EPOCA(model,k=1)
as.graph.EPOCG(model,k=1)
write.sif(model, k=1, file="", append=F)
## S3 method for class 'EPOCA'
print(x,...)
## S3 method for class 'EPOCG'
print(x,...)
## S3 method for class 'EPOCA'
summary(object, k=NULL, ...)
## S3 method for class 'EPOCG'
summary(object, k=NULL,...)
## S3 method for class 'EPOCA'
coef(object, k=1, ...)
## S3 method for class 'EPOCG'
coef(object, k=1, ...)
## S3 method for class 'EPOCA'
predict(object, newdata,k=1,trace=0, ...)
## S3 method for class 'EPOCG'
predict(object, newdata,k=1,trace=0, ...)
Y |
N x p matrix of mRNA transcript levels for p genes and N samples for epocA and epocG. For |
U |
N x p matrix of DNA copy number |
lambdas |
Non-negative vector of relative regularization parameters for lasso. |
thr |
Threshold for convergence. Default value is 1e-10. Iterations stop when max absolute parameter change is less than thr |
trace |
Level of detail for printing out information as iterations proceed. Default 0 – no information |
X |
In |
predictorix |
For |
getall |
Logical. For |
file |
either a character string naming a file or a connection open for writing. |
append |
logical. Only relevant if |
model |
Model set from epocA or epocG |
k |
Select a model of sparsity level k in [1,K]. In |
newdata |
List of Y and U matrices required for prediction. |
x |
Model parameter to |
object |
Model parameter to |
... |
Parameters passed down to underlying function, e.g. |
epocA and epocG estimate sparse matrices A or G using fast lasso regression from mRNA transcript levels Y and CNA profiles U. Two models are provided, EPoC A where AY + U + R = 0 and EPoC G where Y = GU + E. The matrices R and E are so far treated as noise. For details see the reference section and the manual page of lassoshooting.
If you have different sizes of U and Y you need to sort your Y such that the U-columns correspond to the first Y-columns. Example: Y.new <- cbind(Y[,haveCNA], Y[, -haveCNA])
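The column reordering described above can be sketched on a small synthetic example. The matrices, the gene names and the way haveCNA is built are assumptions made for illustration only. Note that the negative index -haveCNA only works when haveCNA holds integer positions, not a logical vector or column names.

```r
# Toy data: 4 samples, 5 genes with mRNA, CNA available for 3 of them
# (all names and values are made up for illustration).
set.seed(1)
Y <- matrix(rnorm(20), nrow = 4, dimnames = list(NULL, paste0("g", 1:5)))
cna.genes <- c("g4", "g2", "g5")
U <- matrix(rnorm(12), nrow = 4, dimnames = list(NULL, cna.genes))

# Integer positions in Y of the genes that have CNA, in the column order of U.
haveCNA <- match(colnames(U), colnames(Y))

# Genes with CNA first (matching U column by column), remaining genes after.
Y.new <- cbind(Y[, haveCNA, drop = FALSE], Y[, -haveCNA, drop = FALSE])

colnames(Y.new)
```

After this step the first ncol(U) columns of Y.new line up with the columns of U.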
CHANGES: predictorix used to be a parameter holding a vector with a subset of the variables 1:p of U corresponding to transcripts in Y. The default was to use all variables, which meant that Y and U had to have the same size.
epoc.lambdamax returns the maximal λ value in a series of lasso regression models such that all coefficients are zero.
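As a rough illustration of this quantity, the sketch below computes it in base R for a single response. It assumes the lasso objective (1/2)*||y - Xb||^2 + λ*||b||_1. The scaling used by lassoshooting may differ by a constant factor, so this is not a reimplementation of epoc.lambdamax. The helper soft and all data are hypothetical.

```r
# Assumption: lasso objective (1/2) * ||y - X b||^2 + lambda * ||b||_1.
# Other scalings of the objective change lambda.max by a constant factor.
set.seed(2)
N <- 50; p <- 6
X <- scale(matrix(rnorm(N * p), N, p))
y <- as.vector(X[, 1] * 2 + rnorm(N))
y <- y - mean(y)

# Infinity norm of t(X) %*% y: the smallest lambda with an all-zero solution.
xty <- as.vector(crossprod(X, y))
lambda.max <- max(abs(xty))

# Soft-thresholding operator used by coordinate-wise lasso updates.
soft <- function(z, lambda) sign(z) * pmax(abs(z) - lambda, 0)

# Starting from b = 0, one coordinate update per variable leaves b at zero.
b.at.max <- soft(xty, lambda.max) / colSums(X^2)

# Just below lambda.max at least one coefficient becomes non-zero.
b.below <- soft(xty, 0.99 * lambda.max) / colSums(X^2)
```

Relative regularization parameters, as in the lambdas argument, can then be read as fractions of such a maximal value.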
plot: if type='graph' (default), plots the graph of the model using the Rgraphviz package. Arrows only tell direction, not whether a link inhibits or stimulates. If type='modelsel', see modelselPlot.
epocA and epocG return an object of class ‘"epocA"’ and ‘"epocG"’ respectively.
The methods summary, print, coef and predict can be used as with other models. coef and predict take an extra optional integer parameter k (default 1) which gives the model at the given density level.
An object of these classes is a list containing at least the following components:
coefficients |
list of t(A) or t(G) matrices for the different λs |
links |
the number of links for the different λs |
lambdas |
the |
R2 |
R², coefficient of determination |
Cp |
Mallows Cp |
s2 |
Estimate of the error variance |
RSS |
Residual Sum of Squares (SSreg) |
SS.tot |
Total sum of squares of the response |
inorms |
the infinity norm of predictors transposed times response for the different responses |
d |
Direct effects of CNA to corresponding gene |
The coef function returns transposed versions of the matrices A and G.
Tobias Abenius
Rebecka Jörnsten, Tobias Abenius, Teresia Kling, Linnéa Schmidt, Erik Johansson, Torbjörn Nordling, Bodil Nordlander, Chris Sander, Peter Gennemark, Keiko Funa, Björn Nilsson, Linda Lindahl, Sven Nelander. (2011) Network modeling of the transcriptional effects of copy number aberrations in glioblastoma. Molecular Systems Biology 7 (to appear)
print, modelselPlot, epoc.validation, epoc.bootstrap, plot.EPoC.validation.pred, plot.EPoC.validation.W, coef, predict
## Not run:
modelA <- epocA(X,U)
modelG <- epocG(X,U)

# plot sparsest A and G models using the igraph package
# arrows only tell direction, not inhibit or stimulate
par(mfrow=c(1,2))
plot(modelA)
plot(modelG)

# OpenGL 3D plot on sphere using the igraph and rgl packages
plot(modelA,threed=T)

# Write the graph to a file in SIF format for import in e.g. Cytoscape
write.sif(modelA,file="modelA.sif")

# plot graph in Cytoscape using Cytoscape XMLRPC plugin and
# R packages RCytoscape, bioconductor graph, XMLRPC
require('graph')
require('RCytoscape')
g <- as.graph.EPOCA(modelA,k=5)
cw <- CytoscapeWindow("EPoC", graph = g)
displayGraph(cw)

# prediction: fit EPoC G on a random third of the samples, test on the rest
N <- dim(X)[1]
ii <- sample(1:N, N/3)
modelG <- epocG(X[ii,], U[ii,])
K <- length(modelG$lambdas) # densest model index
newdata <- list(U=U[-ii,])
e <- X[-ii,] - predict(modelG, newdata, k=K)
RSS <- sum(e^2)
cat("RMSD:", sqrt(RSS/N), "\n")
## End(Not run)
Bootstrap for the EPoC methods
epoc.bootstrap(Y, U, nboots=100, bthr=NULL, method='epocG',...)
## S3 method for class 'bootsize'
plot(x, lambda.boot=NULL, B, range=c(0,1), ...)
epoc.final(epocboot, bthr=0.2, k)
Y |
mRNA, samples x genes. |
U |
CNA, samples x genes. |
nboots |
Number of bootstrap iterations to run. |
method |
For |
x |
A sparse network matrix or a list of the same, non-zeros are links. These come from e.g. |
lambda.boot |
The |
B |
Number of bootstrap iterations ran for |
range |
Range of bootstrap thresholds to display. |
epocboot |
For |
k |
For |
bthr |
Require presence of links in 100*bthr% of the bootstrap iterations. |
... |
Parameters passed down to an underlying function. For |
epoc.bootstrap runs epocA or epocG using bootstrap.
epoc.bootstrap returns a list of arrays of values in [0,1], where 1 is presence of a link in 100% of bootstrap iterations, for the different λ values for different genes.
epoc.final returns a sparse matrix of values in [0,1], where 1 is presence of a link in 100% of bootstrap iterations, thresholded such that all remaining values are greater than or equal to bthr.
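The thresholding idea can be sketched in base R without the package. The array links, the matrix true.prob and the dense result below are hypothetical stand-ins. The real functions work on the objects returned by epoc.bootstrap and return sparse matrices, so this only illustrates the arithmetic.

```r
# Hypothetical stand-in for bootstrap output: B network estimates for
# p genes, stored as a p x p x B array of 0/1 link indicators.
set.seed(3)
p <- 4; B <- 100
true.prob <- matrix(c(0.0, 0.9,  0.1, 0.0,
                      0.3, 0.0,  0.0, 0.8,
                      0.0, 0.05, 0.0, 0.0,
                      0.6, 0.0,  0.0, 0.0), p, p, byrow = TRUE)
links <- array(rbinom(p * p * B, 1, rep(true.prob, B)), dim = c(p, p, B))

# Fraction of bootstrap iterations in which each link is present, in [0, 1].
freq <- apply(links, c(1, 2), mean)

# Keep links present in at least 100 * bthr percent of the iterations.
bthr <- 0.2
final <- ifelse(freq >= bthr, freq, 0)
```

A value of 1 in final would mean the link appeared in every bootstrap iteration, and links seen in fewer than 20% of the iterations are set to zero.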
Rebecka Jörnsten, Tobias Abenius, Teresia Kling, Linnéa Schmidt, Erik Johansson, Torbjörn Nordling, Bodil Nordlander, Chris Sander, Peter Gennemark, Keiko Funa, Björn Nilsson, Linda Lindahl, Sven Nelander. (2011) Network modeling of the transcriptional effects of copy number aberrations in glioblastoma. Molecular Systems Biology 7 (to appear)
epoc, plot.EPoC.validation, plot.EPOCA, plot.EPOCG
Survival analysis
epoc.svd(model, k=1, C=1, numload=NULL)
epoc.survival(G.svd, Y, U, surv, C=1, type=NULL)
epoc.svdplot(G.svd, C=1)
## S3 method for class 'EPoC.survival'
plot(x,...)
## S3 method for class 'EPoC.survival'
summary(object,...)
## S3 method for class 'summary.EPoC.survival'
print(x,...)
model |
An object from |
k |
In case |
C |
Default 1. For |
numload |
Number of loadings in the sparse components, a vector for each component. Default 10 for all components. |
G.svd |
The list obtained from |
Y |
mRNA, samples x genes. |
U |
CNA, samples x genes. |
surv |
Survival data for the samples. |
type |
|
x |
An object from |
object |
An object from |
... |
Parameters passed down to underlying functions, |
Applies survival analysis using the first SVD component, but other components can also be used by changing the input value of C. Survival scores are generated as described in Subsect. 2.4 in the second paper referenced. A simple non-parametric survival analysis is performed, comparing survival between patients with positive or negative scores (tumor fitness).
The epoc.survival object contains the summary information from a log-rank test comparing survival (survdiff) and survival fit objects.
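The comparison of positive and negative score groups can be sketched with the survival package. Everything below is synthetic. The scoring step (a plain SVD of a toy network matrix and a projection of the CNA profiles on one singular vector) is an assumption made for illustration and is not the scoring of the referenced paper or of epoc.svd.

```r
library(survival)  # recommended package shipped with standard R installations

set.seed(4)
N <- 80; p <- 10
U <- matrix(rnorm(N * p), N, p)                          # toy CNA, samples x genes
G <- matrix(rnorm(p * p) * rbinom(p * p, 1, 0.2), p, p)  # toy sparse network

# Assumed scoring: project each sample's CNA profile on singular vector C
# of G. The scoring in the referenced paper may differ.
C <- 1
score <- as.vector(U %*% svd(G)$u[, C])

# Toy survival times (days) and event indicator (1 = death observed).
time   <- rexp(N, rate = 1 / 400)
status <- rbinom(N, 1, 0.8)
group  <- factor(score > 0, levels = c(FALSE, TRUE),
                 labels = c("negative", "positive"))

# Log-rank test and Kaplan-Meier fits for the two score groups.
lr  <- survdiff(Surv(time, status) ~ group)
fit <- survfit(Surv(time, status) ~ group)
p.value <- pchisq(lr$chisq, df = 1, lower.tail = FALSE)
```

Since the toy survival times are generated independently of the scores, no systematic difference between the groups is expected here.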
Rebecka Jörnsten, Tobias Abenius, Teresia Kling, Linnéa Schmidt, Erik Johansson, Torbjörn Nordling, Bodil Nordlander, Chris Sander, Peter Gennemark, Keiko Funa, Björn Nilsson, Linda Lindahl, Sven Nelander. (2011) Network modeling of the transcriptional effects of copy number aberrations in glioblastoma. Molecular Systems Biology 7
Tobias Abenius, Rebecka Jörnsten, Teresia Kling, Linnéa Schmidt, José Sánchez, Sven Nelander. (2012) System-scale network modeling of cancer using EPoC. Advances in experimental medicine and biology
epoc, epoc.validation and spca
Model validation using random split or cross-validation
epoc.validation(type=c('pred','concordance'),repl,Y,U,lambdas=NULL, method='G',thr=1e-10,trace=0,...)
type |
|
repl |
The number of replicates |
Y |
mRNA, samples x genes |
U |
CNA, samples x genes |
lambdas |
series of relative |
method |
|
thr |
Threshold for convergence to the LASSO solver |
trace |
Debug information |
... |
Extra parameters passed through to the EPoC solver |
In the case of 'pred', CV prediction error is assessed using 10-fold cross-validation with repl replicates.
In the case of 'concordance', network concordance is assessed using random split and Kendall W with repl replicates.
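The replicated 10-fold scheme of the 'pred' case can be sketched as follows. Ordinary least squares on synthetic data stands in for the EPoC solver, and all variable names are assumptions. The sketch only shows how folds and replicates combine and does not reproduce epoc.validation.

```r
# Stand-in model: ordinary least squares of one response on U.
# An EPoC solver would take its place; the data are synthetic.
set.seed(5)
N <- 60; p <- 3; repl <- 4; nfolds <- 10
U <- matrix(rnorm(N * p), N, p)
y <- as.vector(U %*% c(1, -0.5, 0) + rnorm(N, sd = 0.3))

cv.error <- numeric(repl)
for (r in seq_len(repl)) {
  # Random assignment of samples to 10 folds of (nearly) equal size.
  fold <- sample(rep(seq_len(nfolds), length.out = N))
  sq.err <- numeric(N)
  for (f in seq_len(nfolds)) {
    test <- fold == f
    # Fit on the other nine folds, predict the held-out fold.
    beta <- qr.solve(U[!test, , drop = FALSE], y[!test])
    sq.err[test] <- (y[test] - U[test, , drop = FALSE] %*% beta)^2
  }
  cv.error[r] <- mean(sq.err)
}

mean(cv.error)  # CV prediction error averaged over replicates
```

Each replicate draws a fresh fold assignment, so the spread of cv.error gives a feel for how much the estimate depends on the split.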
A list of class plot.EPoC.validation.pred or plot.EPoC.validation.W respectively.
Rebecka Jörnsten, Tobias Abenius, Teresia Kling, Linnéa Schmidt, Erik Johansson, Torbjörn Nordling, Bodil Nordlander, Chris Sander, Peter Gennemark, Keiko Funa, Björn Nilsson, Linda Lindahl, Sven Nelander. (2011) Network modeling of the transcriptional effects of copy number aberrations in glioblastoma. Molecular Systems Biology 7 (to appear)
Plot BIC, Mallows' Cp and λ
modelselPlot(x, layout=NULL, k=1, showtitle=F, bthr=0, showself=F, type=c('graph','modelsel'), ...)
## S3 method for class 'EPOCA'
plot(x, layout=NULL, k=1, showtitle=F, bthr=0, showself=F, type=c('graph','modelsel'), ...)
## S3 method for class 'EPOCG'
plot(x, layout=NULL, k=1, showtitle=F, bthr=0, showself=F, type=c('graph','modelsel'), ...)
x |
An EPoC G or EPoC A object |
layout |
Not used only for |
k |
Not used for |
showtitle |
Not used for |
bthr |
Not used for |
showself |
Not used for |
type |
This page documents |
... |
Parameters passed down to underlying functions, |
Creates a plot that aids in model selection.
Scales the Bayesian information criterion (BIC) and Mallows' Cp to lie between zero and one, puts them on the y-axis, and puts relative λ values on the x-axis.
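The scaling step can be sketched in base R. The criterion values below are hypothetical numbers chosen for illustration, and scale01 is a helper introduced here, not a function of the package.

```r
# Hypothetical criterion values along a path of relative lambdas.
rel.lambda <- seq(1, 0.1, length.out = 10)
BIC <- c(520, 480, 455, 440, 436, 438, 447, 460, 481, 505)
Cp  <- c(310, 250, 190, 150, 120, 105, 100, 104, 115, 130)

# Min-max scaling to [0, 1] so both criteria share one y-axis.
scale01 <- function(v) (v - min(v)) / (max(v) - min(v))

plot(rel.lambda, scale01(BIC), type = "b", ylim = c(0, 1),
     xlab = "relative lambda", ylab = "scaled criterion")
lines(rel.lambda, scale01(Cp), type = "b", lty = 2)
legend("top", legend = c("BIC", "Cp"), lty = 1:2)
```

Min-max scaling does not move the location of each minimum, so the candidate λ for each criterion can still be read off the plot.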
Rebecka Jörnsten, Tobias Abenius, Teresia Kling, Linnéa Schmidt, Erik Johansson, Torbjörn Nordling, Bodil Nordlander, Chris Sander, Peter Gennemark, Keiko Funa, Björn Nilsson, Linda Lindahl, Sven Nelander. (2011) Network modeling of the transcriptional effects of copy number aberrations in glioblastoma. Molecular Systems Biology 7 (to appear)
Parallel list apply
plapply(X1, X2, FUN, ...)
X1 |
a vector (atomic or list). Other objects (including classed objects) will be coerced by |
X2 |
See |
FUN |
the function to be applied to each pair of |
... |
optional arguments to |
FUN is found by a call to match.fun and typically is specified as a function or a symbol (e.g. a backquoted name) or a character string specifying a function to be searched for from the environment of the call to plapply.
Function FUN must be able to accept as input any of the element pairs of X1 and X2. If any of these are atomic vectors, FUN will always be passed a length-one vector of the same type as X1, X2 respectively.
Users of S4 classes should pass a list to plapply: the internal coercion is done by the system as.list in the base namespace and not one defined by a user (e.g. by setting S4 methods on the system function).
A list.
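For comparison, base R's Map applies a function to corresponding elements of two lists. The sketch below assumes that plapply pairs the elements of X1 and X2 in the same way, which is an assumption made here and not a claim about how plapply is implemented.

```r
# Assumption: plapply(X1, X2, FUN) applies FUN to the pairs
# (X1[[1]], X2[[1]]), (X1[[2]], X2[[2]]), ... and returns a list.
# Base R's Map() has the same pairwise behaviour.
a <- list(matrix(1:4, 2, 2), matrix(5:8, 2, 2))
b <- list(matrix(9:12, 2, 2), matrix(13:16, 2, 2))

# Element-wise difference of each pair of matrices.
diffs <- Map(function(e1, e2) e2 - e1, a, b)
```

Here every entry of both result matrices equals 8, since each pair of matrices differs by a constant.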
X1 <- array(1:4,dim=c(2,2))
X2 <- array(5:8,dim=c(2,2))
X3 <- array(9:12,dim=c(2,2))
X4 <- array(13:16,dim=c(2,2))
l <- plapply(list(X1,X2),list(X3,X4), function(E1,E2) E2 - E1)
stopifnot(all(sapply(l,sum)/4 == 4*2))
Plot model validation criteria
## S3 method for class 'EPoC.validation.W'
plot(x, ...)
## S3 method for class 'EPoC.validation.pred'
plot(x, ...)
x |
An object of type |
... |
Parameters passed down to underlying functions, |
Plots Kendall W or prediction error, respectively, on the y-axis, network size on the upper x-axis and λ on the lower x-axis.
The plot fits a loess model of degree 1 to the points from the input object and finds the largest network size and corresponding λ such that W is maximized or prediction error is minimized, respectively.
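The smoothing and selection step can be sketched with stats::loess on synthetic points. The data, the evaluation grid and the variable names are assumptions. The plot methods work on the validation objects themselves and may differ in detail.

```r
# Synthetic validation points: Kendall W against network size.
set.seed(6)
size <- rep(seq(10, 200, by = 10), each = 3)
W <- 0.6 - ((size - 90) / 200)^2 + rnorm(length(size), sd = 0.02)

# Local linear (degree 1) loess smoother of W on network size.
fit <- loess(W ~ size, degree = 1)
grid <- sort(unique(size))
smooth <- predict(fit, newdata = data.frame(size = grid))

# Largest network size at which the smoothed W attains its maximum.
best.size <- max(grid[smooth == max(smooth)])
```

For prediction error the same sketch applies with min in place of the inner max.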
Rebecka Jörnsten, Tobias Abenius, Teresia Kling, Linnéa Schmidt, Erik Johansson, Torbjörn Nordling, Bodil Nordlander, Chris Sander, Peter Gennemark, Keiko Funa, Björn Nilsson, Linda Lindahl, Sven Nelander. (2011) Network modeling of the transcriptional effects of copy number aberrations in glioblastoma. Molecular Systems Biology 7 (to appear)
epoc, epoc.validation, plot.default
This dataset contains blinded mRNA, CNA and survival data of 186 cancer tumors modified for demonstration usage. Some genes are randomly selected from 10672 probes, others are chosen for their characteristics.
mRNA is standardized to sd=1 and mean=0. CNA is centered to mean=0. Survival is in days.
data(synth)
The synth data set is a list containing mRNA y, CNA u and surv survival data.
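The preprocessing described for this data set (mRNA standardized, CNA centered) can be sketched with base R's scale. The matrices below are toy stand-ins for synth$y and synth$u, not the actual data.

```r
# Toy matrices standing in for synth$y and synth$u (samples x genes).
set.seed(7)
y.raw <- matrix(rnorm(30, mean = 5, sd = 2), 10, 3)
u.raw <- matrix(rnorm(30, mean = 1, sd = 3), 10, 3)

y <- scale(y.raw)                  # mean 0 and sd 1 per gene, as for mRNA
u <- scale(u.raw, scale = FALSE)   # mean 0 per gene only, as for CNA

# Optional extra step: divide each centered CNA column by its sd.
u.std <- apply(u, 2, function(x) x / sd(x))
```

Dividing the centered CNA columns by their standard deviation puts both inputs on a comparable scale before fitting.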
## Not run:
data(synth)
y <- synth$y
# standardize u
u <- apply(synth$u, 2, function(x) x/sd(x))
G <- epocG(Y=y, U=u)
summary(G)
plot(G)
## End(Not run)