Title: | A Novel Topology-Based Pathway Enrichment Analysis Approach |
---|---|
Description: | We describe a novel topology-based pathway enrichment analysis that integrates the global position of the nodes and the topological properties of the pathways in the Kyoto Encyclopedia of Genes and Genomes (KEGG) database. We also provide functions that obtain the latest pathway information, so that the enrichment analysis can be run with up-to-date annotation. |
Authors: | Wei Jiang |
Maintainer: | Wei Jiang <[email protected]> |
License: | GPL-2 |
Version: | 3.1.0 |
Built: | 2024-11-12 06:53:46 UTC |
Source: | CRAN |
This package describes a novel pathway enrichment analysis approach based on topological structure and updated pathway annotation. It integrates the topological properties of each pathway and the global position of nodes in pathways. It also provides update functions that obtain the latest pathway information from the KEGG database, so that users can run the pathway enrichment analysis with the latest information.
The function AUEC calculates the area under the cumulative enrichment curve. The function TPEA measures the significance of pathways. The function UPDATE downloads the latest KEGG pathway information online. The function viewpathway visualizes a pathway from the result based on the genes you input, such as differentially expressed genes. Several other functions are update-related, including ViewUpdateTime, UpdateKGML, PathNetwork, NodeGeneData, NodeGene and importUpdateData. The functions that handle the relationship between nodes and genes were provided by Chunquan Li. If you want to use the latest information from the KEGG database, run "UPDATE()" first, and then run the pathway enrichment analysis functions AUEC and TPEA.
Wei Jiang
Human protein coding genes from NCBI Database. We use this set as background gene set.
The gene set of interest may be the differentially expressed genes or any other gene set. The function calculates the AUEC based on these genes. AUEC is the area under the cumulative enrichment curve in a coordinate system: the x-axis lists the nodes ordered by score from maximum to minimum, and the y-axis shows the cumulative enrichment.
AUEC(DEGs)
DEGs |
The genes of interest, given as Entrez IDs. If your genes use another identifier, translate them into Entrez IDs first. |
The function only recognizes Entrez IDs of genes. The nodes are sorted by their AUEC in the pathway. If the genes are located upstream or on nodes with high degree in a certain pathway, the AUEC of this pathway is high.
The AUEC of 109 pathways based on the interested gene set.
Wei Jiang
    ## Randomly generated interested genes
    DEGs<-sample(100:100000,15)
    DEG<-as.matrix(DEGs);
    ## The function is used to calculate the observed statistic
    area<-AUEC(DEG);
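The cumulative enrichment curve described for AUEC can be illustrated with a small base-R sketch. This is not the package's implementation: the node scores, the node-to-gene mapping, the helper name `toy_auec`, and the choice to normalize both axes to [0, 1] and integrate with the trapezoid rule are all assumptions made for illustration.

```r
# Toy pathway (all values are made up): five nodes with topology scores.
nodes <- data.frame(
  node  = c("n1", "n2", "n3", "n4", "n5"),
  score = c(0.9, 0.7, 0.4, 0.2, 0.1),
  stringsAsFactors = FALSE
)
# Hypothetical mapping from each node to Entrez IDs.
node_genes <- list(n1 = 836, n2 = c(842, 5594), n3 = 595, n4 = 1000, n5 = 2000)

# Hypothetical helper: area under a cumulative enrichment curve.
toy_auec <- function(nodes, node_genes, genes) {
  # x-axis: nodes ordered by score from maximum to minimum
  ord <- order(nodes$score, decreasing = TRUE)
  # a node counts as a hit when it contains at least one gene of interest
  hit <- vapply(nodes$node[ord],
                function(id) any(node_genes[[id]] %in% genes),
                logical(1))
  if (!any(hit)) return(0)
  # y-axis: cumulative fraction of hit nodes (assumed normalization)
  ys <- c(0, cumsum(hit) / sum(hit))
  xs <- c(0, seq_along(hit) / length(hit))
  # trapezoid rule over the curve, starting at the origin
  sum(diff(xs) * (ys[-length(ys)] + ys[-1]) / 2)
}

upstream   <- toy_auec(nodes, node_genes, c(836, 842))    # genes on high-score nodes
downstream <- toy_auec(nodes, node_genes, c(1000, 2000))  # genes on low-score nodes
print(c(upstream = upstream, downstream = downstream))
```

In this toy setting, genes that sit on high-scoring nodes push the curve up early and give a larger area than the same number of genes on low-scoring nodes, which matches the behavior described for AUEC.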
Download the latest KGML files from the KEGG database.
Download the latest KGML files from KEGG database before pathway enrichment analysis.
The latest KGML files from KEGG database.
Wei Jiang
Obtain the graph of pathways.
getUGraph(graphList, simpleGraph = TRUE)
graphList |
Get the list. |
simpleGraph |
Convert the network. |
The graphList relationship.
Wei Jiang
Import the latest relationship information about nodes, genes and scores.
importLatesData()
Import the latest relationship information about nodes, genes and their scores based on KGML files.
Import the latest relationship information about nodes, genes and scores.
Wei Jiang
The relationship between nodes and genes in each pathway in KEGG Database
Extract the relationship between nodes and genes from KGML files.
NodeGene()
This function must be called after the function NodeGeneData.
Extract the relationship between nodes and genes in each network based on the information in the KGML files.
Wei Jiang
Integrate the list of nodes, genes and node scores based on the latest KGML files from the KEGG database.
NodeGeneData()
Integrate the list of nodes, genes and node scores based on the latest KGML files from the KEGG database.
A list containing the relationship between nodes, genes and node scores based on the latest KGML files.
Wei Jiang
The dataset includes 109 lists, and each list contains four columns (the order of the node, the node, the gene and the score).
Reconstruct pathways to networks based on KGML files from KEGG database.
PathNetwork()
Reconstruct pathways to networks based on KGML files from KEGG database.
The relationship of edges in network.
Wei Jiang
Compare the AUEC based on the gene set you input with AUEC_R, the AUEC of gene sets extracted at random from the background gene set. The last step is to calculate the significance.
TPEA(DEGs, scores, n, FDR_method)
DEGs |
The gene set of interest, such as a differentially expressed gene set. |
scores |
The "AUEC" of the 109 pathways based on the gene set of interest. |
n |
The number of random samplings, e.g. 1000 or 5000. |
FDR_method |
The method used to calculate the FDR value, e.g. "fdr", "BH", "BY" or "bonferroni". |
To calculate the significance of the result, you can set "n" to "1000" or any other number you want.
The final result of this topology-based enrichment analysis method.
Wei Jiang
    ## Randomly generated interested gene set
    ViewLatestTime()
    ## If you want to use the latest information, please run "UPDATE()".
    DEGs<-sample(100:10000,10);
    DEG<-as.matrix(DEGs);
    ## Set the times of perturbation
    number<-50;
    ## Calculate the observed statistic
    scores<-AUEC(DEG);
    ## Significant computational
    FDR_method<-"fdr";
    results<-TPEA(DEG,scores,number,FDR_method);
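The comparison between the observed statistic and the statistics of randomly drawn gene sets, followed by an FDR correction, can be sketched in base R as follows. This is a minimal illustration and not the code behind TPEA: the background, the pathways, the stand-in statistic `toy_stat`, and the "+1" in the empirical p-value are assumptions. Only `p.adjust` and its `method` values ("fdr", "BH", "BY", "bonferroni") are standard R.

```r
set.seed(42)

# Made-up background gene IDs and two made-up pathways; within each pathway
# the genes are listed from the highest-scoring node to the lowest.
background <- 1:200
pathways <- list(pathA = 1:20, pathB = 50:80)

# Hypothetical stand-in for AUEC: mean height of the cumulative hit curve.
toy_stat <- function(path_genes, genes) {
  hit <- path_genes %in% genes
  if (!any(hit)) return(0)
  mean(cumsum(hit) / sum(hit))
}

# Observed statistic for the gene set of interest (made-up IDs).
genes <- c(1, 2, 3, 4, 5, 150, 160)
observed <- vapply(pathways, toy_stat, numeric(1), genes = genes)

# n random gene sets of the same size drawn from the background.
n <- 200
random_stats <- replicate(n, {
  random_genes <- sample(background, length(genes))
  vapply(pathways, toy_stat, numeric(1), genes = random_genes)
})
# random_stats has one row per pathway and one column per random draw.

# Empirical p-value: share of random statistics at least as large as the
# observed one; the +1 (an assumption here) avoids p-values of exactly 0.
p_value <- (rowSums(random_stats >= observed) + 1) / (n + 1)

# Multiple-testing correction, analogous to the FDR_method argument.
adjusted <- p.adjust(p_value, method = "fdr")
print(data.frame(observed, p_value, adjusted))
```

A larger `n` gives a finer resolution for the empirical p-values at the cost of run time, which is why values such as 1000 or 5000 are suggested for real analyses.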
Update the latest pathway information from the KEGG database. This process takes about 1-2 minutes.
Check the latest date of the KGML files from the KEGG database.
ViewLatestTime()
The latest date of KGML files from KEGG database.
Wei Jiang
Input the ID of the pathway of interest in the KEGG database and the genes you are interested in, such as differentially expressed genes.
viewpathway(pathwayID, DEGs)
pathwayID |
The ID of the pathway of interest in the KEGG database, such as "hsa05210". |
DEGs |
The genes you are interested in, such as differentially expressed genes. |
The "DEGs" must be Entrez IDs. If not, please translate them into Entrez IDs.
An interface link to the KEGG database that visualizes the pathway you input.
Wei Jiang
    DEGs<-c(836,842,5594,595);
    DEG<-as.data.frame(DEGs);
    pathwayID<-"hsa05210";
    viewpathway(pathwayID,DEG);