Title: | Integrative Random Forest for Gene Regulatory Network Inference |
---|---|
Description: | Provides a flexible integrative algorithm that allows information from prior data, such as protein-protein interactions and gene knock-down, to be jointly considered for gene regulatory network inference. |
Authors: | Francesca Petralia [aut, cre], Pei Wang [aut], Zhidong Tu [aut], Jialiang Yang [aut], Adele Cutler [ctb], Leo Breiman [ctb], Andy Liaw [ctb], Matthew Wiener [ctb] |
Maintainer: | Francesca Petralia <[email protected]> |
License: | GPL (>= 2) |
Version: | 1.1-1 |
Built: | 2024-12-07 06:42:27 UTC |
Source: | CRAN |
This function fits iRafNet, a flexible unified integrative algorithm that allows information from prior data, such as protein-protein interactions and gene knock-down, to be jointly considered for gene regulatory network inference. It takes as input a single set of sampling scores, computed from one source of prior data such as protein-protein interactions or gene expression from knock-out experiments. Note that some of the functions used are modified versions of functions in the R package randomForest (A. Liaw and M. Wiener, 2002).
iRafNet(X, W, ntree, mtry,genes.name)
X |
|
W |
|
ntree |
Numeric value: number of trees. |
mtry |
Numeric value: number of potential regulators to be sampled at each tree node. |
genes.name |
Vector containing gene names. The order needs to match the columns of |
Importance score for each regulatory relationship. The first column contains the gene names of regulators, the second column contains the gene names of targets, and the third column contains the corresponding importance scores.
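The three-column layout described above lends itself to simple ranking. The following is a minimal sketch, assuming the output can be handled like a data frame of regulator, target, and score; the column names and toy values are hypothetical and only mimic the described shape, they are not produced by iRafNet.

```r
# Hypothetical table mimicking the described output layout
# (regulator, target, importance score); all values are made up.
scores <- data.frame(
  regulator  = c("G1", "G2", "G3", "G1"),
  target     = c("G2", "G1", "G1", "G3"),
  importance = c(0.40, 0.90, 0.10, 0.25),
  stringsAsFactors = FALSE
)

# Rank regulatory relationships from strongest to weakest
ranked <- scores[order(scores$importance, decreasing = TRUE), ]

# Keep the k highest-scoring candidate edges
k <- 2
top_edges <- head(ranked, k)
print(top_edges)
```

Ranking by raw score is only a first look; whether an edge is kept in the final network depends on the FDR threshold applied elsewhere in the package.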
Petralia, F., Wang, P., Yang, J., Tu, Z. (2015) Integrative random forest for gene regulatory network inference, Bioinformatics, 31, i197-i205.
A. Liaw and M. Wiener (2002). Classification and Regression by randomForest. R News 2, 18–22.
library(iRafNet)

# --- Generate data sets
n <- 20                                        # sample size
p <- 5                                         # number of genes
genes.name <- paste("G", seq(1, p), sep = "")  # gene names
data <- matrix(rnorm(p * n), n, p)             # generate expression matrix
W <- abs(matrix(rnorm(p * p), p, p))           # generate weights for regulatory relationships

# --- Standardize variables to mean 0 and variance 1
data <- apply(data, 2, function(x) (x - mean(x)) / sd(x))

# --- Run iRafNet and obtain importance score of regulatory relationships
out <- iRafNet(data, W, mtry = round(sqrt(p - 1)), ntree = 1000, genes.name = genes.name)
This function computes permutation-based FDR of importance scores and returns gene-gene regulations.
iRafNet_network(out.iRafNet,out.perm,TH)
out.iRafNet |
Output object from function |
out.perm |
Output object from function |
TH |
Threshold for FDR. |
List of estimated regulations.
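To make the idea of a permutation-based FDR concrete, here is a minimal base-R sketch of one common estimator: the average number of permuted scores exceeding a threshold, divided by the number of observed scores exceeding it. This is an illustration under that assumption, not the package's internal computation, which may differ; the function name `perm_fdr` and the toy inputs are hypothetical.

```r
# Illustrative permutation-based FDR estimate (not the package internals).
# obs  : vector of I observed importance scores
# perm : I x M matrix of scores from M permuted data sets
# t    : score threshold at which edges are called
perm_fdr <- function(obs, perm, t) {
  n_called <- sum(obs >= t)             # edges called at threshold t
  if (n_called == 0) return(NA_real_)   # FDR undefined when nothing is called
  n_false <- mean(colSums(perm >= t))   # average null calls per permutation
  min(1, n_false / n_called)
}

# Toy inputs (made up for illustration)
obs  <- c(0.9, 0.8, 0.3, 0.2, 0.1)
perm <- matrix(c(0.85, 0.2, 0.1, 0.3, 0.1,
                 0.40, 0.3, 0.2, 0.1, 0.1), nrow = 5)

fdr <- perm_fdr(obs, perm, t = 0.8)
```

With more permutations the null count in the numerator becomes more stable, which is why the number of permutations matters for small FDR thresholds.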
Petralia, F., Song, W.M., Tu, Z. and Wang, P. (2016). New method for joint network analysis reveals common and different coexpression patterns among genes and proteins in breast cancer. Journal of proteome research, 15(3), pp.743-754.
A. Liaw and M. Wiener (2002). Classification and Regression by randomForest. R News 2, 18–22.
Xie, Y., Pan, W. and Khodursky, A.B., 2005. A note on using permutation-based false discovery rate estimates to compare different analysis methods for microarray data. Bioinformatics, 21(23), pp.4280-4288.
library(iRafNet)

# --- Generate data sets
n <- 20                                        # sample size
p <- 5                                         # number of genes
genes.name <- paste("G", seq(1, p), sep = "")  # gene names
M <- 5                                         # number of permutations
data <- matrix(rnorm(p * n), n, p)             # generate gene expression matrix
data[, 1] <- data[, 2]                         # var 1 and var 2 interact
W <- abs(matrix(rnorm(p * p), p, p))           # generate weights for regulatory relationships

# --- Standardize variables to mean 0 and variance 1
data <- apply(data, 2, function(x) (x - mean(x)) / sd(x))

# --- Run iRafNet and obtain importance score of regulatory relationships
out.iRafNet <- iRafNet(data, W, mtry = round(sqrt(p - 1)), ntree = 1000, genes.name = genes.name)

# --- Run iRafNet for M permuted data sets
out.perm <- Run_permutation(data, W, mtry = round(sqrt(p - 1)), ntree = 1000, genes.name = genes.name, M = M)

# --- Derive final networks
final.net <- iRafNet_network(out.iRafNet, out.perm, TH = 0.001)
This function computes importance scores for one permuted data set. Sample labels of target genes are randomly permuted and iRafNet is run on the permuted data. The resulting importance scores can be used to derive an estimate of FDR.
iRafNet_permutation(X, W, ntree, mtry,genes.name,perm)
X |
|
W |
|
ntree |
Numeric value: number of trees. |
mtry |
Numeric value: number of predictors to be sampled at each node. |
genes.name |
Vector containing gene names. The order needs to match the rows of |
perm |
Integer: seed for permutation. |
A vector containing importance scores for the permuted data.
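Permuting the sample labels of a target gene keeps its marginal distribution intact while breaking its association with the candidate regulators, which is what makes the resulting scores usable as a null reference. A minimal base-R sketch of that step follows; the helper `permute_target` and its seed handling are assumptions for illustration, not the package's code.

```r
# Sketch of permuting the sample labels of one target gene.
# Illustrative only; the helper name and seed handling are assumptions.
# X    : n x p expression matrix (samples in rows, genes in columns)
# j    : column index of the target gene
# seed : integer seed making the permutation reproducible
permute_target <- function(X, j, seed) {
  set.seed(seed)
  X_perm <- X
  X_perm[, j] <- sample(X[, j])   # shuffle the target across samples
  X_perm
}

set.seed(1)
X <- matrix(rnorm(20 * 5), 20, 5) # toy expression matrix (n = 20, p = 5)
X_perm <- permute_target(X, j = 1, seed = 42)
```

Only the target column changes; the other columns are left untouched, so any structure among the regulators is preserved.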
Petralia, F., Wang, P., Yang, J., Tu, Z. (2015) Integrative random forest for gene regulatory network inference, Bioinformatics, 31, i197-i205.
Petralia, F., Song, W.M., Tu, Z. and Wang, P. (2016). New method for joint network analysis reveals common and different coexpression patterns among genes and proteins in breast cancer. Journal of proteome research, 15(3), pp.743-754.
A. Liaw and M. Wiener (2002). Classification and Regression by randomForest. R News 2, 18–22.
library(iRafNet)

# --- Generate data sets
n <- 20                                        # sample size
p <- 5                                         # number of genes
genes.name <- paste("G", seq(1, p), sep = "")  # gene names
data <- matrix(rnorm(p * n), n, p)             # generate expression matrix
W <- abs(matrix(rnorm(p * p), p, p))           # generate weights for regulatory relationships

# --- Standardize variables to mean 0 and variance 1
data <- apply(data, 2, function(x) (x - mean(x)) / sd(x))

# --- Run iRafNet and obtain importance score of regulatory relationships
out.iRafNet <- iRafNet(data, W, mtry = round(sqrt(p - 1)), ntree = 1000, genes.name = genes.name)

# --- Run iRafNet for one permuted data set and obtain importance scores
out.perm <- iRafNet_permutation(data, W, mtry = round(sqrt(p - 1)), ntree = 1000, genes.name = genes.name, perm = 1)
This function uses the R package ROCR to plot the ROC curve for an iRafNet object.
roc_curve(out, truth)
out |
Output from iRafNet. |
truth |
Matrix of true regulations. Rows correspond to different regulations and match rows of |
Plots the ROC curve and returns the area under it.
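For readers who want to see what the area under the ROC curve measures, the sketch below computes it in base R through the rank-sum (Mann-Whitney) identity. This is a stand-in that does not call ROCR and is not the package's implementation; the function name `auc_rank` and the toy vectors are hypothetical.

```r
# AUC via the rank-sum (Mann-Whitney) identity; base R only.
# Illustrative stand-in, not the ROCR-based code used by the package.
# score : numeric importance scores
# label : 0/1 vector, 1 = true regulation
auc_rank <- function(score, label) {
  n_pos <- sum(label == 1)
  n_neg <- sum(label == 0)
  stopifnot(n_pos > 0, n_neg > 0)   # both classes are required
  r <- rank(score)                  # average ranks handle ties
  (sum(r[label == 1]) - n_pos * (n_pos + 1) / 2) / (n_pos * n_neg)
}

# Toy inputs (made up for illustration)
score <- c(0.9, 0.8, 0.3, 0.2)
label <- c(1, 0, 1, 0)
auc <- auc_rank(score, label)
```

The value equals the fraction of (true, false) regulation pairs in which the true regulation receives the higher score, so 0.5 corresponds to random ranking and 1 to perfect separation.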
Petralia, F., Wang, P., Yang, J., Tu, Z. (2015) Integrative random forest for gene regulatory network inference, Bioinformatics, 31, i197-i205.
Sing, Tobias, et al. (2005) ROCR: visualizing classifier performance in R, Bioinformatics, 21, 3940-3941.
library(iRafNet)

# --- Generate data sets
n <- 20                                        # sample size
p <- 5                                         # number of genes
genes.name <- paste("G", seq(1, p), sep = "")  # gene names
data <- matrix(rnorm(p * n), n, p)             # generate expression matrix
data[, 1] <- data[, 2]                         # var 1 and 2 interact
W <- abs(matrix(rnorm(p * p), p, p))           # generate score for regulatory relationships

# --- Standardize variables to mean 0 and variance 1
data <- apply(data, 2, function(x) (x - mean(x)) / sd(x))

# --- Run iRafNet and obtain importance score of regulatory relationships
out <- iRafNet(data, W, mtry = round(sqrt(p - 1)), ntree = 1000, genes.name = genes.name)

# --- Matrix of true regulations (third column: 1 = true regulation, 0 = otherwise)
truth <- out[, seq(1, 2)]
truth <- cbind(as.character(truth[, 1]), as.character(truth[, 2]),
               as.data.frame(rep(0, dim(out)[1])))
truth[(truth[, 1] == "G2" & truth[, 2] == "G1") |
      (truth[, 1] == "G1" & truth[, 2] == "G2"), 3] <- 1

# --- Plot ROC curve and compute AUC
auc <- roc_curve(out, truth)
This function computes importance scores for M permuted data sets. Sample labels of target genes are randomly permuted and iRafNet is run on each permuted data set. The resulting importance scores can be used to derive an estimate of FDR.
Run_permutation(X, W, ntree, mtry,genes.name,M)
X |
|
W |
|
ntree |
Numeric value: number of trees. |
mtry |
Numeric value: number of predictors to be sampled at each node. |
genes.name |
Vector containing gene names. The order needs to match the rows of |
M |
Integer: total number of permutations. |
A matrix with I rows and M columns, with I being the total number of regulations and M the number of permutations. Element (i,j) corresponds to the importance score of interaction i for permuted data j.
Petralia, F., Wang, P., Yang, J., Tu, Z. (2015) Integrative random forest for gene regulatory network inference, Bioinformatics, 31, i197-i205.
Petralia, F., Song, W.M., Tu, Z. and Wang, P. (2016). New method for joint network analysis reveals common and different coexpression patterns among genes and proteins in breast cancer. Journal of proteome research, 15(3), pp.743-754.
A. Liaw and M. Wiener (2002). Classification and Regression by randomForest. R News 2, 18–22.
library(iRafNet)

# --- Generate data sets
n <- 20                                        # sample size
p <- 5                                         # number of genes
genes.name <- paste("G", seq(1, p), sep = "")  # gene names
M <- 5                                         # number of permutations
data <- matrix(rnorm(p * n), n, p)             # generate expression matrix
W <- abs(matrix(rnorm(p * p), p, p))           # generate score for regulatory relationships

# --- Standardize variables to mean 0 and variance 1
data <- apply(data, 2, function(x) (x - mean(x)) / sd(x))

# --- Run iRafNet and obtain importance score of regulatory relationships
out.iRafNet <- iRafNet(data, W, mtry = round(sqrt(p - 1)), ntree = 1000, genes.name = genes.name)

# --- Run iRafNet for M permuted data sets
out.perm <- Run_permutation(data, W, mtry = round(sqrt(p - 1)), ntree = 1000, genes.name = genes.name, M = M)