---
title: "mut"
output: rmarkdown::html_vignette
vignette: >
  %\VignetteIndexEntry{mut}
  %\VignetteEngine{knitr::rmarkdown}
  %\VignetteEncoding{UTF-8}
---

```{r, include = FALSE}
knitr::opts_chunk$set(
  collapse = TRUE,
  comment = "#>"
)

vignette_file <- function(...) {
  candidates <- c(
    file.path(...),
    file.path("vignettes", ...),
    file.path("inst", "extdata", ...),
    file.path(Sys.getenv("PWD"), "inst", "extdata", ...),
    system.file("extdata", ..., package = "oncoPredict"),
    system.file("doc", ..., package = "oncoPredict")
  )
  candidates <- candidates[nzchar(candidates) & file.exists(candidates)]
  if (!length(candidates)) {
    stop("Could not find vignette file: ", file.path(...), call. = FALSE)
  }
  candidates[[1]]
}
```

```{r setup}
library(oncoPredict)

#This vignette demonstrates how to prepare predicted drug response and mutation
#data for mutation-based IDWAS with idwas(cnv=FALSE).

#Determine the parameters of the idwas() function...

#Set the drug_prediction parameter.
#Make sure rownames() are samples, and colnames() are drugs. Also make sure this data is a data frame.
drug_prediction <- as.data.frame(read.table(vignette_file("DrugPredictions.txt"), header = TRUE, row.names = 1))

#In this example, replace '.' with '-' so the TCGA sample identifiers match the
#format used in the mutation data.
colnames(drug_prediction) <- gsub(".", "-", colnames(drug_prediction), fixed = TRUE)

#Make sure the sample identifiers in the 'drug prediction' data are of similar form as the sample identifiers in the 'data' parameter.
cols <- colnames(drug_prediction)
colnames(drug_prediction) <- substring(cols, 3, nchar(cols))
drug_prediction <- as.data.frame(t(drug_prediction))
```

This vignette provides an example of how to prepare mutation data from the GDC database for GBM (glioblastoma) and how to apply `idwas()` to test predicted drug response against somatic mutations. Because GDC and TCGAbiolinks access patterns can change over time, the download code is shown as non-executed guidance.

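
The identifier clean-up performed in the setup chunk can be illustrated on a small toy table. This is a minimal sketch, not part of the package: the object names (`toy_pred`, `toy_ids`), the drug names, the values, and the two-character `X.` prefix on the column names are all assumptions made up for illustration, and the real `DrugPredictions.txt` may use a different prefix.

```{r sample-id-sketch, eval=FALSE}
# Toy stand-in for the prediction table as read by read.table():
# drugs in rows, samples in columns. All names and values are hypothetical.
toy_pred <- data.frame(
  "X.TCGA.02.0001.01A" = c(1.2, 3.4),
  "X.TCGA.02.0002.01A" = c(0.7, 2.9),
  row.names = c("DrugA", "DrugB"),
  check.names = FALSE
)

# Step 1: replace '.' with '-' (fixed = TRUE treats '.' literally, not as a regex).
toy_ids <- gsub(".", "-", colnames(toy_pred), fixed = TRUE)

# Step 2: drop the first two characters (the assumed prefix).
toy_ids <- substring(toy_ids, 3, nchar(toy_ids))
colnames(toy_pred) <- toy_ids

# Step 3: transpose so that rows are samples and columns are drugs.
toy_pred <- as.data.frame(t(toy_pred))

rownames(toy_pred)
colnames(toy_pred)
```

After these steps the row names hold the cleaned sample identifiers and the column names hold the drugs, which is the layout the comments in the setup chunk ask for.
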
Download mutation data for your cancer of interest from the GDC database. See the TCGAbiolinks mutation vignette:

https://bioconductor.org/packages/release/bioc/vignettes/TCGAbiolinks/inst/doc/mutation.html

https://rdrr.io/bioc/TCGAbiolinks/f/vignettes/mutation.Rmd

The code would look something like this:

```{r mutation-download, eval=FALSE}
library(TCGAbiolinks)

#The legacy argument targets the GDC legacy archive. Recent TCGAbiolinks
#releases may no longer accept it; if GDCquery() rejects it, drop the argument.
query_maf <- GDCquery(project = "TCGA-GBM",
                      data.category = "Simple Nucleotide Variation",
                      access = "open",
                      data.type = "Simple somatic mutation",
                      legacy = TRUE)
GDCdownload(query_maf)
maf <- GDCprepare(query_maf)
```

After downloading the mutation data, format the mutation table before running IDWAS.

```{r mutation-formatting, eval=FALSE}
#Make sure this data is a data frame with mutation annotations in columns.
#For idwas(cnv=FALSE), the data should include Variant_Classification,
#Hugo_Symbol, and Tumor_Sample_Barcode.
data <- as.data.frame(maf)

#Trim the last 12 characters of each barcode so that the sample ids are of the
#same form as the sample ids in your prediction data.
samps <- data$Tumor_Sample_Barcode
data$Tumor_Sample_Barcode <- substr(samps, 1, nchar(samps) - 12)

#Set n, the minimum number of samples a mutation must occur in. The default is 10.
n <- 10

#Indicate whether to test CNA amplification data. If TRUE, CNA amplifications
#are tested. If FALSE, mutation data are tested.
cnv <- FALSE
```
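
Before running IDWAS, it can help to confirm that the trimmed barcodes actually match the prediction sample ids and to see how many samples carry a mutation in each gene. The following is a minimal sketch on made-up data: `toy_maf`, `toy_pred_ids`, `toy_n`, and the barcodes are hypothetical, and the per-gene count only illustrates what a sample threshold such as `n` means. It is not the filtering that `idwas()` performs internally.

```{r mutation-id-check-sketch, eval=FALSE}
# Toy MAF-like table with the three columns named in the chunk above.
# All barcodes and mutation calls are hypothetical.
toy_maf <- data.frame(
  Hugo_Symbol = c("TP53", "TP53", "EGFR", "PTEN"),
  Variant_Classification = c("Missense_Mutation", "Nonsense_Mutation",
                             "Missense_Mutation", "Silent"),
  Tumor_Sample_Barcode = c("TCGA-02-0001-01A-01D-1490-08",
                           "TCGA-02-0002-01A-01D-1490-08",
                           "TCGA-02-0001-01A-01D-1490-08",
                           "TCGA-02-0003-01A-01D-1490-08"),
  stringsAsFactors = FALSE
)

# Trim the last 12 characters, as in the formatting chunk.
toy_samps <- toy_maf$Tumor_Sample_Barcode
toy_maf$Tumor_Sample_Barcode <- substr(toy_samps, 1, nchar(toy_samps) - 12)

# Hypothetical sample ids taken from the rows of a prediction table.
toy_pred_ids <- c("TCGA-02-0001-01A", "TCGA-02-0002-01A", "TCGA-02-0004-01A")

# Samples present in both data sets; an empty result signals an id mismatch.
shared <- intersect(toy_pred_ids, unique(toy_maf$Tumor_Sample_Barcode))

# Number of distinct samples with at least one mutation record per gene.
per_gene <- vapply(
  split(toy_maf$Tumor_Sample_Barcode, toy_maf$Hugo_Symbol),
  function(x) length(unique(x)),
  integer(1)
)

# Genes meeting an illustrative threshold (toy_n plays the role of n).
toy_n <- 2
genes_kept <- names(per_gene)[per_gene >= toy_n]

shared
per_gene
genes_kept
```

If `shared` is empty or much shorter than expected, the two identifier formats probably still differ, and the trimming step should be revisited before calling `idwas()`.
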