Title: | Codon Usage Bias Analysis |
---|---|
Description: | A suite of functions for rapid and flexible analysis of codon usage bias. It provides in-depth analysis at the codon level, including relative synonymous codon usage (RSCU), tRNA weight calculations, machine learning predictions for optimal or preferred codons, and visualization of codon-anticodon pairing. Additionally, it can calculate various gene-specific codon indices such as codon adaptation index (CAI), effective number of codons (ENC), fraction of optimal codons (Fop), tRNA adaptation index (tAI), mean codon stabilization coefficients (CSCg), and GC contents (GC/GC3s/GC4d). It also supports both standard and non-standard genetic code tables found in NCBI, as well as custom genetic code tables. |
Authors: | Hong Zhang [aut, cre] , Mengyue Liu [aut], Bu Zi [aut] |
Maintainer: | Hong Zhang <[email protected]> |
License: | MIT + file LICENSE |
Version: | 1.1.0 |
Built: | 2024-12-07 10:41:41 UTC |
Source: | CRAN |
A data.frame of mapping from amino acids to codons
aa2codon
a data.frame with two columns: amino_acid and codon.
amino_acid: amino acid corresponding to the codon
codon: codon identity
It is actually the standard genetic code.
aa2codon
check_cds
performs quality control of CDS sequences by filtering out
peculiar sequences and optionally removing start or stop codons.
check_cds(
  seqs,
  codon_table = get_codon_table(),
  min_len = 6,
  check_len = TRUE,
  check_start = TRUE,
  check_stop = TRUE,
  check_istop = TRUE,
  rm_start = TRUE,
  rm_stop = TRUE,
  start_codons = c("ATG")
)
seqs |
input CDS sequences |
codon_table |
codon table matching the genetic code of |
min_len |
minimum CDS length in nt |
check_len |
check whether CDS length is divisible by 3 |
check_start |
check whether CDSs have start codons |
check_stop |
check whether CDSs have stop codons |
check_istop |
check internal stop codons |
rm_start |
whether to remove start codons |
rm_stop |
whether to remove stop codons |
start_codons |
vector of start codons |
DNAStringSet of filtered (and trimmed) CDS sequences
# CDS sequence QC for a sample of yeast genes
s <- head(yeast_cds, 10)
print(s)
check_cds(s)
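The checks named in the argument list can be illustrated without the package. The following is a minimal base-R sketch, not the package's implementation: `qc_cds` is a hypothetical helper that works on a plain character vector, assumes the standard genetic code's stop codons, and does no trimming.

```r
# Hypothetical helper (not part of the package): base-R version of the
# length, start, stop, and internal-stop checks for a character vector.
qc_cds <- function(seqs, start_codons = "ATG",
                   stop_codons = c("TAA", "TAG", "TGA"), min_len = 6) {
  keep <- vapply(seqs, function(s) {
    n <- nchar(s)
    # minimum length and divisibility by 3
    if (n < min_len || n %% 3 != 0) return(FALSE)
    codons <- substring(s, seq(1, n, by = 3), seq(3, n, by = 3))
    k <- length(codons)
    codons[1] %in% start_codons &&          # starts with a start codon
      codons[k] %in% stop_codons &&         # ends with a stop codon
      !any(codons[-k] %in% stop_codons)     # no internal stop codon
  }, logical(1))
  seqs[keep]
}

# Made-up sequences: only the first one passes every check
cds <- c(ok = "ATGGCTTAA", bad_len = "ATGGCTTA",
         no_start = "GCTGCTTAA", internal_stop = "ATGTAGGCTTAA")
qc_cds(cds)
```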
codon_diff
takes two sets of coding sequences and
performs differential codon usage analysis.
codon_diff(seqs1, seqs2, codon_table = get_codon_table())
seqs1 |
DNAStringSet, or an object that can be coerced to a DNAStringSet |
seqs2 |
DNAStringSet, or an object that can be coerced to a DNAStringSet |
codon_table |
a table of genetic code derived from |
a data.table of the differential codon usage analysis. Global tests examine whether a codon
is used differently relative to all the other codons. Family tests examine whether a codon is used
differently relative to other codons that encode the same amino acid. Subfamily tests examine whether
a codon is used differently relative to other synonymous codons that share the same first two nucleotides.
An odds ratio > 1 suggests that a codon is used at a higher frequency in seqs1 than in seqs2.
yeast_exp_sorted <- yeast_exp[order(yeast_exp$fpkm),]
seqs1 <- yeast_cds[names(yeast_cds) %in% head(yeast_exp_sorted$gene_id, 1000)]
seqs2 <- yeast_cds[names(yeast_cds) %in% tail(yeast_exp_sorted$gene_id, 1000)]
cudiff <- codon_diff(seqs1, seqs2)
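To make the odds ratio of a global test concrete, here is a small base-R sketch on a 2x2 table. The counts are made up, and the use of Fisher's exact test is an assumption for illustration; this page does not state which test the package applies.

```r
# Hypothetical counts: one codon of interest vs. all other codons
# in two gene sets (numbers invented for illustration).
tab <- matrix(c(300, 9700,    # seqs1: codon of interest, all other codons
                150, 9850),   # seqs2: codon of interest, all other codons
              nrow = 2, byrow = TRUE,
              dimnames = list(c("seqs1", "seqs2"), c("codon", "others")))

# Sample odds ratio; > 1 means the codon is relatively more frequent in seqs1
odds_ratio <- (tab["seqs1", "codon"] * tab["seqs2", "others"]) /
  (tab["seqs1", "others"] * tab["seqs2", "codon"])

# Assumption: Fisher's exact test as the significance test
p_value <- fisher.test(tab)$p.value

odds_ratio
p_value
```

For a family or subfamily test, the "others" column would be restricted to codons of the same amino acid or the same subfamily.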
codon_optimize
takes a coding sequence (without stop codon) and replaces
each codon with the corresponding synonymous optimal codon.
codon_optimize(
  seq,
  optimal_codons,
  codon_table = get_codon_table(),
  level = "subfam"
)
seq |
DNAString, or an object that can be coerced to a DNAString. |
optimal_codons |
table of optimal codons as generated by |
codon_table |
a table of genetic code derived from |
level |
"subfam" (default) or "amino_acid". Optimize codon usage at which level. |
a DNAString of the optimized coding sequence.
cf_all <- count_codons(yeast_cds)
optimal_codons <- est_optimal_codons(cf_all)
seq <- 'ATGCTACGA'
codon_optimize(seq, optimal_codons)
count_codons
tabulates the occurrences of all the 64 codons in input CDSs
count_codons(seqs, ...)
seqs |
CDS sequences, DNAStringSet. |
... |
additional arguments passed to |
matrix of codon (column) frequencies of each CDS (row).
# count codon occurrences
cf_all <- count_codons(yeast_cds)
dim(cf_all)
cf_all[1:5, 1:5]
count_codons(yeast_cds[1])
create_codon_table
creates a codon table from a data frame of amino acid to codon mappings.
create_codon_table(aa2codon)
aa2codon |
a data frame with two columns: amino_acid (Ala, Arg, etc.) and codon. |
a data.table with four columns: aa_code, amino_acid, codon, and subfam.
head(aa2codon)
create_codon_table(aa2codon = aa2codon)
est_csc
calculates codon occurrence to mRNA stability correlation coefficients (defaults to Pearson's).
est_csc(
  seqs,
  half_life,
  codon_table = get_codon_table(),
  cor_method = "pearson"
)
seqs |
CDS sequences of all protein-coding genes. One for each gene. |
half_life |
data.frame of mRNA half life (gene_id & half_life are column names). |
codon_table |
a table of genetic code derived from |
cor_method |
method name passed to 'cor.test' used for calculating correlation coefficients. |
data.table of optimal codons.
Presnyak V, Alhusaini N, Chen YH, Martin S, Morris N, Kline N, Olson S, Weinberg D, Baker KE, Graveley BR, et al. 2015. Codon optimality is a major determinant of mRNA stability. Cell 160:1111-1124.
# estimate yeast mRNA CSC
est_csc(yeast_cds, yeast_half_life)
est_optimal_codons
determines the optimal codon of each codon family with binomial regression.
Usage of optimal codons should correlate negatively with ENC.
est_optimal_codons(
  cf,
  codon_table = get_codon_table(),
  level = "subfam",
  gene_score = NULL,
  fdr = 0.001
)
cf |
matrix of codon frequencies as calculated by |
codon_table |
a table of genetic code derived from |
level |
"subfam" (default) or "amino_acid". For which level to determine optimal codons. |
gene_score |
a numeric vector of scores for genes. The order of values should match with
gene orders in the codon frequency matrix. The length of the vector should be equal to the
number of rows in the matrix. The scores could be gene expression levels (RPKM or TPM) that are
optionally log-transformed (for example, with |
fdr |
false discovery rate used to determine optimal codons. |
data.table of optimal codons.
# perform binomial regression for optimal codon estimation
cf_all <- count_codons(yeast_cds)
codons_opt <- est_optimal_codons(cf_all)
codons_opt <- codons_opt[optimal == TRUE]
codons_opt
est_rscu
returns the RSCU value of codons
est_rscu(
  cf,
  weight = 1,
  pseudo_cnt = 1,
  codon_table = get_codon_table(),
  level = "subfam"
)
cf |
matrix of codon frequencies as calculated by |
weight |
a vector of the same length as |
pseudo_cnt |
pseudo count to avoid dividing by zero. This may occur when only a few sequences are available for RSCU calculation. |
codon_table |
a table of genetic code derived from |
level |
"subfam" (default) or "amino_acid". For which level to determine RSCU. |
a data.table of codon info. RSCU values are reported in the last column.
Sharp PM, Tuohy TM, Mosurski KR. 1986. Codon usage in yeast: cluster analysis clearly differentiates highly and lowly expressed genes. Nucleic Acids Res 14:5125-5143.
# compute RSCU of all yeast genes
cf_all <- count_codons(yeast_cds)
est_rscu(cf_all)

# compute RSCU of highly expressed (top 500) yeast genes
heg <- head(yeast_exp[order(-yeast_exp$fpkm), ], n = 500)
cf_heg <- count_codons(yeast_cds[heg$gene_id])
est_rscu(cf_heg)
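For orientation, RSCU as defined by Sharp et al. (1986) is the observed count of a codon divided by the count expected if all synonymous codons were used equally. A base-R sketch for a single codon family follows; the counts are made up, and adding the pseudo count to every codon before normalization is an assumption about how `pseudo_cnt` is applied.

```r
# Hypothetical codon counts for the four alanine codons
counts <- c(GCT = 40, GCC = 20, GCA = 30, GCG = 10)
pseudo_cnt <- 1  # mirrors the pseudo_cnt argument; placement is an assumption

x <- counts + pseudo_cnt
# Observed count relative to the mean count of the family (uniform expectation)
rscu <- x / mean(x)
round(rscu, 3)
```

Values above 1 mark codons used more often than expected under equal usage; the values of a family sum to the number of codons in it.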
est_trna_weight
computes the tRNA weight per codon for TAI calculation.
This weight reflects relative tRNA availability for each codon.
est_trna_weight(
  trna_level,
  codon_table = get_codon_table(),
  s = list(WC = 0, IU = 0, IC = 0.4659, IA = 0.9075, GU = 0.7861, UG = 0.6295)
)
trna_level |
named vector of tRNA level (or gene copy numbers), one value for each anticodon. vector names are anticodons. |
codon_table |
a table of genetic code derived from |
s |
list of non-Watson-Crick pairing penalties. |
data.table of tRNA expression information.
dos Reis M, Savva R, Wernisch L. 2004. Solving the riddle of codon usage preferences: a test for translational selection. Nucleic Acids Res 32:5036-5044.
# estimate codon tRNA weight for yeasts
est_trna_weight(yeast_trna_gcn)
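In dos Reis et al. (2004), the absolute weight of a codon sums the gene copy numbers of every tRNA that can read it, each discounted by one minus its pairing penalty. The base-R sketch below applies that sum to one hypothetical codon; the copy numbers and pairing types are invented, and any later normalization across codons is not shown.

```r
# Pairing penalties copied from the default `s` argument
s <- list(WC = 0, IU = 0, IC = 0.4659, IA = 0.9075, GU = 0.7861, UG = 0.6295)

# Hypothetical codon read by two anticodons:
# gene copy number of each tRNA and the pairing type at the wobble position
gcn     <- c(10, 4)
pairing <- c("WC", "GU")

# Absolute weight: copy numbers discounted by (1 - penalty)
W <- sum((1 - unlist(s[pairing])) * gcn)
W
```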
get_cai
calculates Codon Adaptation Index (CAI) of each input CDS
get_cai(cf, rscu, level = "subfam")
cf |
matrix of codon frequencies as calculated by |
rscu |
rscu table containing CAI weight for each codon. This table could be
generated with |
level |
"subfam" (default) or "amino_acid". For which level to determine CAI. |
a named vector of CAI values
Sharp PM, Li WH. 1987. The codon Adaptation Index–a measure of directional synonymous codon usage bias, and its potential applications. Nucleic Acids Res 15:1281-1295.
# estimate CAI of yeast genes based on RSCU of highly expressed genes
heg <- head(yeast_exp[order(-yeast_exp$fpkm), ], n = 500)
cf_all <- count_codons(yeast_cds)
cf_heg <- cf_all[heg$gene_id, ]
rscu_heg <- est_rscu(cf_heg)
cai <- get_cai(cf_all, rscu_heg)
head(cai)
hist(cai)
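As a reminder of the quantity itself, CAI (Sharp and Li 1987) is the geometric mean of the relative adaptiveness of the codons in a gene, where relative adaptiveness is a codon's RSCU divided by the largest RSCU in its family. A base-R sketch with made-up values for two codon families follows; it is not the package's code.

```r
# Hypothetical reference-set RSCU values for two codon families
rscu <- c(GCT = 1.6, GCC = 0.8, GCA = 1.2, GCG = 0.4,   # Ala
          AAA = 0.5, AAG = 1.5)                          # Lys
family <- c(GCT = "Ala", GCC = "Ala", GCA = "Ala", GCG = "Ala",
            AAA = "Lys", AAG = "Lys")

# Relative adaptiveness: RSCU divided by the maximum within the family
w <- rscu / ave(rscu, family, FUN = max)

# Hypothetical codon counts of one gene
cf <- c(GCT = 5, GCC = 1, GCA = 2, GCG = 0, AAA = 1, AAG = 3)

# CAI: geometric mean of w over all codons of the gene
cai <- exp(sum(cf * log(w)) / sum(cf))
cai
```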
get_codon_table
creates a codon table based on the given id of genetic code in NCBI.
get_codon_table(gcid = "1")
gcid |
a string of genetic code id. run |
a data.table with four columns: aa_code, amino_acid, codon, and subfam.
# Standard genetic code
get_codon_table()

# Vertebrate Mitochondrial genetic code
get_codon_table(gcid = '2')
get_cscg
calculates Mean Codon Stabilization Coefficients of each CDS.
get_cscg(cf, csc)
cf |
matrix of codon frequencies as calculated by |
csc |
table of Codon Stabilization Coefficients as calculated by |
a named vector of cscg values.
Presnyak V, Alhusaini N, Chen YH, Martin S, Morris N, Kline N, Olson S, Weinberg D, Baker KE, Graveley BR, et al. 2015. Codon optimality is a major determinant of mRNA stability. Cell 160:1111-1124.
# estimate CSCg of yeast genes
yeast_csc <- est_csc(yeast_cds, yeast_half_life)
cf_all <- count_codons(yeast_cds)
cscg <- get_cscg(cf_all, csc = yeast_csc)
head(cscg)
hist(cscg)
get_dp
calculates Deviation from Proportionality of each CDS.
get_dp(
  cf,
  host_weights,
  codon_table = get_codon_table(),
  level = "subfam",
  missing_action = "ignore"
)
cf |
matrix of codon frequencies as calculated by |
host_weights |
a named vector of tRNA weights for each codon that reflects the relative availability of tRNAs in the host organism. |
codon_table |
a table of genetic code derived from |
level |
"subfam" (default) or "amino_acid". If "subfam", the deviation is calculated at the codon subfamily level. Otherwise, the deviation is calculated at the amino acid level. |
missing_action |
Action to take when no codons of a group are found in a CDS. Options are "ignore" (default) or "zero" (set codon proportions to 0). |
a named vector of dp values.
Chen F, Wu P, Deng S, Zhang H, Hou Y, Hu Z, Zhang J, Chen X, Yang JR. 2020. Dissimilation of synonymous codon usage bias in virus-host coevolution due to translational selection. Nat Ecol Evol 4:589-600.
# estimate DP of yeast genes
cf_all <- count_codons(yeast_cds)
trna_weight <- est_trna_weight(yeast_trna_gcn)
trna_weight <- setNames(trna_weight$w, trna_weight$codon)
dp <- get_dp(cf_all, host_weights = trna_weight)
head(dp)
hist(dp)
get_enc
computes ENC of each CDS
get_enc(cf, codon_table = get_codon_table(), level = "subfam")
cf |
matrix of codon frequencies as calculated by |
codon_table |
a table of genetic code derived from |
level |
"subfam" (default) or "amino_acid". For which level to determine ENC. |
vector of ENC values, sequence names are used as vector names
- Wright F. 1990. The 'effective number of codons' used in a gene. Gene 87:23-29.
- Sun X, Yang Q, Xia X. 2013. An improved implementation of effective number of codons (NC). Mol Biol Evol 30:191-196.
# estimate ENC of yeast genes
cf_all <- count_codons(yeast_cds)
enc <- get_enc(cf_all)
head(enc)
hist(enc)
get_fop
calculates the fraction of optimal codons (Fop) of each CDS.
get_fop(cf, op = NULL, codon_table = get_codon_table(), ...)
cf |
matrix of codon frequencies as calculated by |
op |
a character vector of optimal codons. Can be determined automatically by running
|
codon_table |
a table of genetic code derived from |
... |
other arguments passed to |
a named vector of fop values.
Ikemura T. 1981. Correlation between the abundance of Escherichia coli transfer RNAs and the occurrence of the respective codons in its protein genes: a proposal for a synonymous codon choice that is optimal for the E. coli translational system. J Mol Biol 151:389-409.
# estimate Fop of yeast genes
cf_all <- count_codons(yeast_cds)
fop <- get_fop(cf_all)
head(fop)
hist(fop)
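Conceptually, Fop is the share of a gene's codons that belong to the optimal set. The base-R sketch below shows that ratio for one gene; the counts and the optimal codons are made up, and which codons the package excludes from the denominator (for example stop codons or single-codon amino acids) is not covered here.

```r
# Hypothetical codon counts of one gene (synonymous families only)
cf <- c(GCT = 5, GCC = 1, GCA = 2, GCG = 0, AAA = 1, AAG = 3)

# Hypothetical optimal codons, one per family
op <- c("GCT", "AAG")

# Fraction of optimal codons
fop <- sum(cf[op]) / sum(cf)
fop
```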
Calculate GC content of the whole sequences.
get_gc(cf)
cf |
matrix of codon frequencies as calculated by 'count_codons()'. |
a named vector of GC contents.
# estimate GC content of yeast genes
cf_all <- count_codons(yeast_cds)
gc <- get_gc(cf_all)
head(gc)
hist(gc)
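Because every codon contributes three bases, overall GC content can be recovered from codon counts alone. A base-R sketch for one gene with made-up counts follows; it illustrates the arithmetic and is not taken from the package source.

```r
# Hypothetical codon counts of one gene
cf <- c(ATG = 1, GCT = 5, GCC = 1, AAA = 2, TAA = 1)

# Number of G or C bases in each codon, read off the codon names
n_gc <- nchar(gsub("[AT]", "", names(cf)))

# GC content: G/C bases over all bases (3 per codon)
gc <- sum(cf * n_gc) / (3 * sum(cf))
gc
```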
Calculate GC content at synonymous 3rd codon positions.
get_gc3s(cf, codon_table = get_codon_table(), level = "subfam")
cf |
matrix of codon frequencies as calculated by |
codon_table |
a table of genetic code derived from |
level |
"subfam" (default) or "amino_acid". For which level to determine GC content at synonymous 3rd codon positions. |
a named vector of GC3s values.
Peden JF. 2000. Analysis of codon usage.
# estimate GC3s of yeast genes
cf_all <- count_codons(yeast_cds)
gc3s <- get_gc3s(cf_all)
head(gc3s)
hist(gc3s)
Calculate GC content at synonymous position of codons (using four-fold degenerate sites only).
get_gc4d(cf, codon_table = get_codon_table(), level = "subfam")
cf |
matrix of codon frequencies as calculated by |
codon_table |
a table of genetic code derived from |
level |
"subfam" (default) or "amino_acid". For which level to determine GC contents at 4-fold degenerate sites. |
a named vector of GC4d values.
# estimate GC4d of yeast genes
cf_all <- count_codons(yeast_cds)
gc4d <- get_gc4d(cf_all)
head(gc4d)
hist(gc4d)
get_tai
calculates tRNA Adaptation Index (TAI) of each CDS
get_tai(cf, trna_w)
cf |
matrix of codon frequencies as calculated by |
trna_w |
tRNA weight for each codon, can be generated with |
a named vector of TAI values
dos Reis M, Savva R, Wernisch L. 2004. Solving the riddle of codon usage preferences: a test for translational selection. Nucleic Acids Res 32:5036-5044.
# calculate TAI of yeast genes based on genomic tRNA copy numbers
w <- est_trna_weight(yeast_trna_gcn)
cf_all <- count_codons(yeast_cds)
tai <- get_tai(cf_all, w)
head(tai)
hist(tai)
CDSs of 13 protein-coding genes in the human mitochondrial genome extracted from ENSEMBL Biomart
human_mt
a DNAStringSet of 13 sequences
<https://www.ensembl.org/index.html>
head(human_mt)
plot_ca_pairing
shows possible codon-anticodon pairings
plot_ca_pairing(codon_table = get_codon_table(), plot = TRUE)
codon_table |
a table of genetic code derived from |
plot |
whether to plot the pairing relationship |
a data.table of codon info and RSCU values
ctab <- get_codon_table(gcid = '2')
pairing <- plot_ca_pairing(ctab)
head(pairing)
rev_comp
creates the reverse-complemented version of the input sequences
rev_comp(seqs)
seqs |
input sequences, DNAStringSet or named vector of sequences |
reverse complemented input sequences as a DNAStringSet.
# reverse complement of codons
rev_comp(Biostrings::DNAStringSet(c('TAA', 'TAG')))
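For readers without Biostrings at hand, the same operation on plain character strings can be sketched in base R. `rev_comp_chr` is a hypothetical helper that handles only the unambiguous upper-case bases A, C, G, and T.

```r
# Hypothetical base-R helper: reverse complement of plain DNA strings
rev_comp_chr <- function(x) {
  vapply(strsplit(x, ""), function(bases) {
    # complement each base, then reverse the order
    paste(rev(chartr("ACGT", "TGCA", bases)), collapse = "")
  }, character(1))
}

rev_comp_chr(c("TAA", "TAG"))
```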
seq_to_codons
converts a coding sequence to a vector of codons
seq_to_codons(seq)
seq |
DNAString, or an object that can be coerced to a DNAString |
a character vector of codons
# convert a CDS sequence to a sequence of codons
seq_to_codons('ATGTGGTAG')
seq_to_codons(yeast_cds[[1]])
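The splitting step itself needs nothing beyond `substring()`. The sketch below uses a hypothetical helper, `split_codons`, on a plain character string and assumes the length is a non-zero multiple of 3.

```r
# Hypothetical base-R helper: split an in-frame sequence into codons
split_codons <- function(cds) {
  n <- nchar(cds)
  stopifnot(n > 0, n %% 3 == 0)   # require complete codons
  starts <- seq(1, n, by = 3)
  substring(cds, starts, starts + 2)
}

split_codons("ATGTGGTAG")
```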
show_codon_tables
prints a table of the genetic codes available from NCBI through
Biostrings::GENETIC_CODE_TABLE.
show_codon_tables()
No return value (NULL). Available codon tables will be printed out directly.
# print available NCBI codon table IDs and descriptions.
show_codon_tables()
slide
generates a data.table with start, center, and end columns
for a sliding window analysis.
slide(from, to, step = 1, before = 0, after = 0)
from |
integer, the start of the sequence |
to |
integer, the end of the sequence |
step |
integer, the step size |
before |
integer, the number of values before the center of a window |
after |
integer, the number of values after the center of a window |
data.table with start, center, and end columns
slide(1, 10, step = 2, before = 1, after = 1)
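One plausible way to build such windows is sketched below in base R. `make_windows` is a hypothetical helper that returns a data.frame rather than a data.table, and clipping windows at the sequence ends is an assumption; this page does not say how the package treats incomplete windows.

```r
# Hypothetical helper: centered sliding windows clipped to [from, to]
make_windows <- function(from, to, step = 1, before = 0, after = 0) {
  center <- seq(from, to, by = step)
  data.frame(
    start  = pmax(center - before, from),  # do not run past the left end
    center = center,
    end    = pmin(center + after, to)      # do not run past the right end
  )
}

win <- make_windows(1, 10, step = 2, before = 1, after = 1)
win
```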
slide_apply
applies a function to a sliding window of codons.
slide_apply(seq, .f, step = 1, before = 0, after = 0, ...)
seq |
DNAString, the sequence |
.f |
function, the codon index calculation function to apply, for
example, |
step |
integer, the step size in number of codons |
before |
integer, the number of codons before the center of a window |
after |
integer, the number of codons after the center of a window |
... |
additional arguments to pass to the function |
data.table with start, center, end, and codon usage index columns
slide_apply(yeast_cds[[1]], get_enc, step = 1, before = 10, after = 10)
slide_codon
generates a data.table with start, center, and end columns
for a sliding window analysis of codons.
slide_codon(seq, step = 1, before = 0, after = 0)
seq |
DNAString, the sequence |
step |
integer, the step size |
before |
integer, the number of codons before the center of a window |
after |
integer, the number of codons after the center of a window |
data.table with start, center, and end columns
x <- Biostrings::DNAString('ATCTACATAGCTACGTAGCTCGATGCTAGCATGCATCGTACGATCGTCGATCGTAG')
slide_codon(x, step = 3, before = 1, after = 1)
slide_plot
visualizes codon usage in sliding windows.
slide_plot(windt, index_name = "Index")
windt |
data.table, the sliding window codon usage
generated by |
index_name |
character, the name of the index to display. |
ggplot2 plot.
sw <- slide_apply(yeast_cds[[1]], get_enc, step = 1, before = 10, after = 10)
slide_plot(sw)
CDSs of all protein-coding genes in Saccharomyces cerevisiae
yeast_cds
a DNAStringSet of 6600 sequences
<https://ftp.ensembl.org/pub/release-107/fasta/saccharomyces_cerevisiae/cds/Saccharomyces_cerevisiae.R64-1-1.cds.all.fa.gz>
head(yeast_cds)
Yeast mRNA FPKM determined from rRNA-depleted (RiboZero) total RNA-Seq libraries. RUN1_0_WT and RUN2_0_WT (0 min after RNA Pol II repression) were averaged and used here.
yeast_exp
a data.frame with 6717 rows and three columns:
gene ID
gene name
mRNA expression level in Fragments per kilobase per million reads
<https://www.ncbi.nlm.nih.gov/geo/query/acc.cgi?acc=GSE57385>
Presnyak V, Alhusaini N, Chen YH, Martin S, Morris N, Kline N, Olson S, Weinberg D, Baker KE, Graveley BR, et al. 2015. Codon optimality is a major determinant of mRNA stability. Cell 160:1111-1124.
head(yeast_exp)
Half-lives of Saccharomyces cerevisiae mRNAs calculated from rRNA-depleted total RNA by Presnyak et al.
yeast_half_life
a data.frame with 3888 rows and three columns:
gene id
gene name
mRNA half life in minutes
<https://doi.org/10.1016/j.cell.2015.02.029>
Presnyak V, Alhusaini N, Chen YH, Martin S, Morris N, Kline N, Olson S, Weinberg D, Baker KE, Graveley BR, et al. 2015. Codon optimality is a major determinant of mRNA stability. Cell 160:1111-1124.
head(yeast_half_life)
Yeast tRNA sequences obtained from gtRNAdb.
yeast_trna
a RNAStringSet with a length of 275.
<http://gtrnadb.ucsc.edu/genomes/eukaryota/Scere3/sacCer3-mature-tRNAs.fa>
Chan PP, Lowe TM. 2016. GtRNAdb 2.0: an expanded database of transfer RNA genes identified in complete and draft genomes. Nucleic Acids Res 44:D184-189.
yeast_trna
Yeast tRNA gene copy numbers (GCN) by anticodon obtained from gtRNAdb.
yeast_trna_gcn
a named vector with a length of 41. Value names are anticodons.
<http://gtrnadb.ucsc.edu/genomes/eukaryota/Scere3/sacCer3-mature-tRNAs.fa>
Chan PP, Lowe TM. 2016. GtRNAdb 2.0: an expanded database of transfer RNA genes identified in complete and draft genomes. Nucleic Acids Res 44:D184-189.
yeast_trna_gcn