--- title: "Genomes" output: rmarkdown::html_vignette vignette: > %\VignetteIndexEntry{Genomes} %\VignetteEngine{knitr::rmarkdown} %\VignetteEncoding{UTF-8} --- ```{r, include = FALSE} knitr::opts_chunk$set( collapse = TRUE, comment = "#>" ) ``` ```{r setup} library(misha) ``` # Create a `misha` database from UCSC The easiest way to create a `misha` database is to use the `gdb.create_genome` function: ```{r, eval = FALSE} gdb.create_genome("hg19") # creates a database for the hg19 genome gdb.create_genome("hg38") # creates a database for the hg38 genome gdb.create_genome("mm10") # creates a database for the mm10 genome gdb.create_genome("mm9") # creates a database for the mm9 genome gdb.create_genome("mm39") # creates a database for the mm39 genome ``` However, if you need to create a database for a genome that is not supported by `gdb.create_genome`, or if you want to make sure that the database is created from the latest version of the genome in ucsc, you can create it manually using the commands below. ## hg19 In order to create a misha database for _hg19_ genome, run the following commands (assuming *"hg19"* is your new data base path): ```{r, eval = FALSE} ftp <- "ftp://hgdownload.soe.ucsc.edu/goldenPath/hg19" gdb.create( "hg19", paste(ftp, "chromosomes", paste0("chr", c(1:22, "X", "Y", "M"), ".fa.gz"), sep = "/"), paste(ftp, "database/knownGene.txt.gz", sep = "/"), paste(ftp, "database/kgXref.txt.gz", sep = "/"), c( "kgID", "mRNA", "spID", "spDisplayID", "geneSymbol", "refseq", "protAcc", "description", "rfamAcc", "tRnaName" ) ) gdb.init("hg19") ``` ## hg38 In order to create a misha database for _hg38_ genome, run the following commands (assuming *"hg38"* is your new data base path): ```{r, eval = FALSE} ftp <- "ftp://hgdownload.soe.ucsc.edu/goldenPath/hg38" gdb.create( "hg38", paste(ftp, "chromosomes", paste0("chr", c(1:22, "X", "Y", "M"), ".fa.gz"), sep = "/"), paste(ftp, "database/knownGene.txt.gz", sep = "/"), paste(ftp, "database/kgXref.txt.gz", sep = "/"), c( "kgID", "mRNA", "spID", "spDisplayID", "geneSymbol", "refseq", "protAcc", "description", "rfamAcc", "tRnaName" ) ) gdb.init("hg38") ``` ## mm9 In order to create a misha database for _mm9_ genome, run the following commands (assuming *"mm9"* is your new data base path): ```{r, eval = FALSE} ftp <- "ftp://hgdownload.soe.ucsc.edu/goldenPath/mm9" gdb.create( "mm9", paste(ftp, "chromosomes", paste0("chr", c(1:19, "X", "Y", "M"), ".fa.gz"), sep = "/"), paste(ftp, "database/knownGene.txt.gz", sep = "/"), paste(ftp, "database/kgXref.txt.gz", sep = "/"), c( "kgID", "mRNA", "spID", "spDisplayID", "geneSymbol", "refseq", "protAcc", "description" ) ) gdb.init("mm9") ``` ## mm10 In order to create a misha database for _mm10_ genome, run the following commands (assuming *"mm10"* is your new data base path): ```{r, eval = FALSE} ftp <- "ftp://hgdownload.soe.ucsc.edu/goldenPath/mm10" gdb.create( "mm10", paste(ftp, "chromosomes", paste0("chr", c(1:19, "X", "Y", "M"), ".fa.gz"), sep = "/"), paste(ftp, "database/knownGene.txt.gz", sep = "/"), paste(ftp, "database/kgXref.txt.gz", sep = "/"), c( "kgID", "mRNA", "spID", "spDisplayID", "geneSymbol", "refseq", "protAcc", "description", "rfamAcc", "tRnaName" ) ) gdb.init("mm10") ``` ## mm39 In order to create a misha database for _mm39_ genome, run the following commands (assuming *"mm39"* is your new data base path): ```{r, eval = FALSE} ftp <- "ftp://hgdownload.soe.ucsc.edu/goldenPath/mm39" gdb.create( "mm39", paste(ftp, "chromosomes", paste0("chr", c(1:19, "X", "Y", "M"), ".fa.gz"), sep = "/"), paste(ftp, "database/knownGene.txt.gz", sep = "/"), paste(ftp, "database/kgXref.txt.gz", sep = "/"), c( "kgID", "mRNA", "spID", "spDisplayID", "geneSymbol", "refseq", "protAcc", "description", "rfamAcc", "tRnaName" ) ) gdb.init("mm39") ```