Title: | Analysis and Visualization of Macroevolutionary Dynamics on Phylogenetic Trees |
---|---|
Description: | Provides functions for analyzing and visualizing complex macroevolutionary dynamics on phylogenetic trees. It is a companion package to the command line program BAMM (Bayesian Analysis of Macroevolutionary Mixtures) and is entirely oriented towards the analysis, interpretation, and visualization of evolutionary rates. Functionality includes visualization of rate shifts on phylogenies, estimating evolutionary rates through time, comparing posterior distributions of evolutionary rates across clades, comparing diversification models using Bayes factors, and more. |
Authors: | Dan Rabosky [aut], Michael Grundler [aut], Pascal Title [aut, cre] , Carlos Anderson [aut], Jeff Shi [aut], Joseph Brown [aut], Huateng Huang [aut], Jon Mitchell [aut] |
Maintainer: | Pascal Title <[email protected]> |
License: | GPL (>= 2) |
Version: | 2.1.12 |
Built: | 2024-11-12 06:52:21 UTC |
Source: | CRAN |
Add a legend to a phylorate plot, with greater manual control.
addBAMMlegend( x, direction, side, location = "topleft", nTicks = 2, stretchInterval = FALSE, shortFrac = 0.02, longFrac = 0.3, axisOffset = 0.002, cex.axis = 0.8, labelDist = 0.7, ... )
addBAMMlegend( x, direction, side, location = "topleft", nTicks = 2, stretchInterval = FALSE, shortFrac = 0.02, longFrac = 0.3, axisOffset = 0.002, cex.axis = 0.8, labelDist = 0.7, ... )
x |
A |
direction |
Direction of color ramp. If omitted, then direction is automatically inferred, otherwise can be specified as horizontal or vertical. |
side |
Side for tick marks, see |
location |
Either a location name (see Details), or coordinates for the corners of the bar legend c(xmin, xmax, ymin, ymax). |
nTicks |
Number of tick marks, besides min and max. |
stretchInterval |
If color.interval was defined, should the legend be stretched to the color.interval, or should the full range of rates be presented. |
shortFrac |
Percent of the plot width range that will be used as the short dimention of the legend. Only applies to preset location options. |
longFrac |
Percent of the plot width range that will be used as the long dimension of the legend. Only applies to preset location options. |
axisOffset |
Distance from color bar for labels, as a percent of the plot width. |
cex.axis |
Size of axis labels. |
labelDist |
Distance from axis to axis labels (passed to mgp). |
... |
Additional parameters to be passed to axis. |
A number of predefined locations exist in this function to make
it easy to add a legend to a phylorate plot. Preset locations
are: topleft
, topright
, bottomleft
,
bottomright
, left
, right
, top
and
bottom
. If more fine-tuned control is desired, then a numeric
vector of length 4 can be supplied to location
, specifying the
min x, max x, min y and max y values for the legend. See
Examples
.
Invisibly returns a list with the following components:
coords: A 2-column matrix of xy coordinates for each color bin in the legend.
width: Coordinates for the short dimension of the legend.
pal: The color ramp.
tickLocs: The tick mark locations in plotting units.
labels: The rate values associated with those tick locations.
Pascal Title
Requires an object created with plot.bammdata
.
data(whales, events.whales) ephy <- getEventData(whales, events.whales, burnin = 0.25, nsamples = 300) # plot phylorate with extra margin space x <- plot(ephy, lwd = 2, mar = c(5,4,4,4)) # presets addBAMMlegend(x, location = 'topleft') addBAMMlegend(x, location = 'bottom') addBAMMlegend(x, location = 'right') # fine-tune placement x <- plot(ephy, lwd = 2, mar = c(5,4,4,4)) axis(1); axis(2) addBAMMlegend(x, location = c(-1, -0.5, 40, 80), nTicks = 4) addBAMMlegend(x, location = c(5, 20, 60, 61), nTicks = 4, side = 3, cex.axis = 0.7) # addBAMMlegend also automatically detects the use of color.interval data(primates, events.primates) ephy <- getEventData(primates, events.primates, burnin=0.25, nsamples = 300, type = 'trait') x <- plot(ephy, breaksmethod = 'linear', color.interval = c(NA, 0.12), lwd = 2) addBAMMlegend(x, location = c(0, 30, 200, 205), nTicks = 1, side = 3)
data(whales, events.whales) ephy <- getEventData(whales, events.whales, burnin = 0.25, nsamples = 300) # plot phylorate with extra margin space x <- plot(ephy, lwd = 2, mar = c(5,4,4,4)) # presets addBAMMlegend(x, location = 'topleft') addBAMMlegend(x, location = 'bottom') addBAMMlegend(x, location = 'right') # fine-tune placement x <- plot(ephy, lwd = 2, mar = c(5,4,4,4)) axis(1); axis(2) addBAMMlegend(x, location = c(-1, -0.5, 40, 80), nTicks = 4) addBAMMlegend(x, location = c(5, 20, 60, 61), nTicks = 4, side = 3, cex.axis = 0.7) # addBAMMlegend also automatically detects the use of color.interval data(primates, events.primates) ephy <- getEventData(primates, events.primates, burnin=0.25, nsamples = 300, type = 'trait') x <- plot(ephy, breaksmethod = 'linear', color.interval = c(NA, 0.12), lwd = 2) addBAMMlegend(x, location = c(0, 30, 200, 205), nTicks = 1, side = 3)
BAMM
-inferred rate shifts to a phylogeny plotAdds symbols to a plotted tree to mark the location(s) where there is a shift in the macroevolutionary dynamics of diversification or trait evolution.
addBAMMshifts( ephy, index = 1, method = "phylogram", cex = 1, pch = 21, col = 1, bg = 2, msp = NULL, shiftnodes = NULL, par.reset = TRUE, ... )
addBAMMshifts( ephy, index = 1, method = "phylogram", cex = 1, pch = 21, col = 1, bg = 2, msp = NULL, shiftnodes = NULL, par.reset = TRUE, ... )
ephy |
An object of class |
index |
An integer indicating which posterior sample to use for adding shifts to the plotted tree. |
method |
A character string indicating the method used in plotting. Must be "polar" or "phylogram". |
cex |
A numeric indicating the character expansion ("size") of the plotted points. |
pch |
An integer indicating the choice of plotting symbol. |
col |
An integer or character string indicating the border color of the plotting symbol. |
bg |
An integer or character string indicating the background color of the plotting symbol. |
msp |
If not |
shiftnodes |
An optional vector of node numbers indicating the locations of shifts to plot. |
par.reset |
A logical indicating whether to reset the graphical parameters before exiting. |
... |
additional arguments to be passed to |
Any given sample from the posterior distribution sampled using
BAMM
contains a potentially unique configuration of rate shifts
and associated parameters. There is no single "best" rate shift, but
rather a set of shift configurations (and associated parameters) -
along with their relative probabilities - sampled with MCMC. This
function enables the user to plot the locations of shifts sampled with
BAMM
for a given sample from the posterior.
If the bammdata
object contains just a single sample, these
shifts will be plotted regardless of the value of index
.
If a shiftnodes
argument is passed care should be taken to
ensure that the nodes are in the same order as in the event data for
the sample index.
Mike Grundler
getShiftNodesFromIndex
, plot.bammdata
data(whales, events.whales) ed <- getEventData(whales, events.whales, burnin=0.25, nsamples = 500) # adding shifts to tree for specific posterior samples plot(ed, method="polar") addBAMMshifts(ed, index=5, "polar") # multi-panel plotting and adding shifts par(mfrow=c(2,3),mar=c(5,1,1,1)) samples = sample(1:length(ed$eventData), 6) for (i in 1:6) { sed <- subsetEventData(ed, samples[i]) plot(sed, par.reset=FALSE) addBAMMshifts(sed,index=1,method="phylogram",par.reset=FALSE) }
data(whales, events.whales) ed <- getEventData(whales, events.whales, burnin=0.25, nsamples = 500) # adding shifts to tree for specific posterior samples plot(ed, method="polar") addBAMMshifts(ed, index=5, "polar") # multi-panel plotting and adding shifts par(mfrow=c(2,3),mar=c(5,1,1,1)) samples = sample(1:length(ed$eventData), 6) for (i in 1:6) { sed <- subsetEventData(ed, samples[i]) plot(sed, par.reset=FALSE) addBAMMshifts(sed,index=1,method="phylogram",par.reset=FALSE) }
Maps macroevolutionary rates to a set of NCOLORS
.
assignColorBreaks( rates, NCOLORS = 64, spex = "s", logcolor = FALSE, method = c("linear", "quantile", "jenks"), JenksSubset = NULL )
assignColorBreaks( rates, NCOLORS = 64, spex = "s", logcolor = FALSE, method = c("linear", "quantile", "jenks"), JenksSubset = NULL )
rates |
A numeric vector of phenotypic rates or a list of numeric vectors of speciation and extinction rates. |
NCOLORS |
An integer number of colors to use for the mapping. Larger numbers do not necessarily result in smoother looking color ramps. The default is 64 and is probably sufficient for most purposes. |
spex |
A character string. "s" means that speciation rates are used
to make the map, "e" means that extinction rates are used. "netdiv"
means that diversification rates are used. Ignored for |
logcolor |
Logical. Should the natural logarithm of rates be used for the color map. |
method |
Determines how the color breaks are created. See Details. |
JenksSubset |
Number of regularly spaced samples to subset from
|
If method = "quantile"
macroevolutionary rates are binned
into NCOLORS+1
percentiles and rates in each bin are mapped to
a color determined by the pal
argument in plot.bammdata
.
Alternatively, if method = "linear"
macroevolutionary rates are
binned into NCOLORS+1
equal length intervals between the
minimum and maximum.
If method = "jenks"
, macroevolutionary rates are binned into
NCOLORS+1
categories, according to the Jenks natural breaks
classification method. This method is borrowed from the field of
cartography, and seeks to minimize the variance within categories,
while maximizing the variance between categories.
The Jenks natural breaks method was ported to C from code found in the classInt R package.
A numeric vector of rate percentiles/intervals.
Mike Grundler, Pascal Title
data(whales, events.whales) ed <- getEventData(whales, events.whales, burnin = 0.2, nsamples = 500) ed <- dtRates(ed, 0.01) colors <- assignColorBreaks(ed$dtrates$rates, spex="s") #speciation rates #colors <- assignColorBreaks(ed$dtrates$rates[[1]]) #this also works for speciation rates plot(ed, colorbreaks = colors, spex="s") colors <- assignColorBreaks(ed$dtrates$rates, spex="netdiv") #diversification rates #colors <- assignColorBreaks(ed$dtrates$rates[[1]] - ed$dtrates$rates[[2]]) #this also works for diversification rates plot(ed, colorbreaks = colors, spex="netdiv")
data(whales, events.whales) ed <- getEventData(whales, events.whales, burnin = 0.2, nsamples = 500) ed <- dtRates(ed, 0.01) colors <- assignColorBreaks(ed$dtrates$rates, spex="s") #speciation rates #colors <- assignColorBreaks(ed$dtrates$rates[[1]]) #this also works for speciation rates plot(ed, colorbreaks = colors, spex="s") colors <- assignColorBreaks(ed$dtrates$rates, spex="netdiv") #diversification rates #colors <- assignColorBreaks(ed$dtrates$rates[[1]] - ed$dtrates$rates[[2]]) #this also works for diversification rates plot(ed, colorbreaks = colors, spex="netdiv")
BAMM
likelihoodCalculates the likelihood of a phylogeny exactly as is done
by BAMM
, given a set of events.
BAMMlikelihood( phy, eventdata, gen = "last", segLength = 0.02, sf = 1, return.intermediates = FALSE, e_prob_condition = "if_different", ... )
BAMMlikelihood( phy, eventdata, gen = "last", segLength = 0.02, sf = 1, return.intermediates = FALSE, e_prob_condition = "if_different", ... )
phy |
Either an object of class |
eventdata |
A table of event data, as returned by |
gen |
The |
segLength |
The relative segment length, exactly as defined for
|
sf |
The sampling fraction. |
return.intermediates |
Debugging option, returns augmented
|
e_prob_condition |
Approach for how extinction probabilities are handled at nodes. |
... |
Additional arguments that will be passed to an internal
function |
This function allows the user to check the likelihoods computed
by BAMM
using an independent R-based implementation. This is
designed to provide a check on potential software bugs that might be
introduced during future BAMM
development and which might
compromise the likelihood calculation. If you observe measurable
discrepancies between the likelihood computed by this function and the
corresponding likelihood returned by BAMM
, please inform the
BAMM
development team.
If return.intermediates == TRUE
, then phylo
objects
are returned with the following components:
event_times |
A list of length (number of nodes), where
event_times[[k]] is the vector of absolute times, in order, of
events that happened on a focal branch. If no event, it is
|
event_id |
A list of length equal to number of nodes, as event_times, but holding the corresponding event id. |
events |
A dataframe giving parameters and associated nodes (and unique index values) of the event data. |
node_event |
The event governing the process realized at the node. This will be the first event encountered as one moves rootwards towards the tips from the focal node. |
Dan Rabosky, Pascal Title
# a global sampling fraction of 0.98 was used in generating the whales # dataset. data(whales, events.whales, mcmc.whales) x <- BAMMlikelihood(whales, events.whales, gen = 'last', sf = 0.98) # Does the likelihood generated by BAMM match the R implementation? identical(round(x, 3), mcmc.whales[nrow(mcmc.whales), 'logLik']) # an example with a constant-rate birth-death process: pars <- c(0.5, 0.45) names(pars) <- c("lambda", "mu") BAMMlikelihood(whales, pars, sf = 0.98)
# a global sampling fraction of 0.98 was used in generating the whales # dataset. data(whales, events.whales, mcmc.whales) x <- BAMMlikelihood(whales, events.whales, gen = 'last', sf = 0.98) # Does the likelihood generated by BAMM match the R implementation? identical(round(x, 3), mcmc.whales[nrow(mcmc.whales), 'logLik']) # an example with a constant-rate birth-death process: pars <- c(0.5, 0.45) names(pars) <- c("lambda", "mu") BAMMlikelihood(whales, pars, sf = 0.98)
An R package for the analysis and visualization of complex
macroevolutionary dynamics. Functions in BAMMtools
are oriented
entirely around analysis of results obtained using the BAMM
software (http://bamm-project.org/).
Dan Rabosky, Mike Grundler, Pascal Title, Jonathan Mitchell, Carlos Anderson, Jeff Shi, Joseph Brown, Huateng Huang
Rabosky, D., M. Grundler, C. Anderson, P. Title, J. Shi, J. Brown, H. Huang and J. Larson. 2014. BAMMtools: an R package for the analysis of evolutionary dynamics on phylogenetic trees. Methods in Ecology and Evolution 5: 701-707.
Rabosky, D. L. 2014. Automatic detection of key innovations, rate shifts, and diversity-dependence on phylogenetic trees. PLoS ONE 9: e89543.
Shi, J. J., and D. L. Rabosky. 2015. Speciation dynamics during the global radiation of extant bats. Evolution 69: 1528-1545.
Rabosky, D. L., F. Santini, J. T. Eastman, S. A. Smith, B. L. Sidlauskas, J. Chang, and M. E. Alfaro. 2013. Rates of speciation and morphological evolution are correlated across the largest vertebrate radiation. Nature Communications DOI: 10.1038/ncomms2958.
Useful links:
Example datasets and sample BAMM
output for the
package BAMMtools
.
This includes both the raw data and the BAMM
output for
three example analyses. The first is an analysis of speciation and
extinction rates during the radiation of modern whales, using a
time-calibrated tree from Steeman et al. (2009). The second is a
BAMM
analysis of phenotypic evolutionary rates (body mass)
during the radiation of extant primates, taken from Vos and Mooers
(2006) and Redding et al. (2010).The third is a BAMM
analysis
of speciation and extinction rates for a 300-species subset of
ray-finned fishes, along with body size data for these species from
Rabosky et al. (2013).
Dataset whales
is the raw time-calibrated tree that was
analyzed with BAMM
, primates
is the corresponding
time-calibrated phylogeny of 233 primate species, and fishes
is the time-calibrated phylogeny of 300 fish species. Log-transformed
body masses for primates are in dataset mass.primates
, and fish
body sizes are in dataset traits.fishes
.
The MCMC output files (mcmc.whales
and mcmc.primates
)
are dataframes containing the raw MCMC output as generated by
BAMM
. Column headers in the dataframes includes the sampling
generation, the current number of shifts in the simulation
(N_shifts
), the log-prior density of the parameters
(logPrior
), the log-likelihood of the data (logLik
), the
current parameter of the Poisson process governing the number of
regime shifts (eventRate
), and the MCMC acceptance rate
(acceptRate
). This is the file that would typically be analyzed
as a first step towards assessing MCMC convergence (e.g., analyzing
effective sample sizes of logLik
and N_shifts
).
The "core" BAMM
output is included in the event data
files (events.whales
, events.primates
and
events.fishes
). These are all the parameters sampled with MCMC
that are relevant to reconstructing the nature and location of
evolutionary rate dynamics across a phylogeny. Please refer to
BAMM
documentation for a detailed overview of this output, but
a brief description is as follows:
generation
: The index value of the state in the MCMC simulation
(the "generation").
leftchild, rightchild
: This defines a unique topological
location where a rate shift was sampled. Specifically, for given
right-left pair, the shift is sampled on the branch leading to the
node from which rightchild
and leftchild
are descended
(these two taxa are part of the spanning set of taxa for the node). If
leftchild
is "NA", this simply means that the shift was sampled
on a terminal branch.
abstime
: The absolute occurrence time of the shift, assuming
that the time of the root node is 0.0.
lambdainit, lambdashift
: For speciation extinction model, the
initial speciation rate and rate change parameter for the process.
muinit
: For speciation extinction model, the extinction rate
(time-invariant).
betainit, betashift
: For phenotypic evolutionary model, the
initial (betainit
) rate of phenotypic evolution and the rate
change parameter (betashift
).
Vos R.A., A.O. Mooers. 2006. A new dated supertree of the Primates. Chapter 5. In Inferring large phylogenies: the big tree problem (R Vos, Ph.D. thesis) Simon Fraser University.
Redding D.W., C. DeWolff, A.O. Mooers. 2010. Evolutionary distinctiveness, threat status and ecological oddity in primates. Conservation Biology 24: 1052-1058. DOI: 10.1111/j.1523-1739.2010.01532.x
Steeman, M.E., M.B. Hebsgaard, R.E. Fordyce, S.W.Y. Ho, D.L. Rabosky, R. Nielsen, C. Rahbek, H. Glenner, M.V. Sorensen, E. Willerslev. 2009. Radiation of Extant Cetaceans Driven by Restructuring of the Oceans. Systematic Biology. 58: 573-585. DOI: 10.1093/sysbio/syp060
Rabosky, D. L., F. Santini, J.T. Eastman, S. . Smith, B.L. Sidlauskas, J. Chang, and M.E. Alfaro. 2013. Rates of speciation and morphological evolution are correlated across the largest vertebrate radiation. Nature Communications DOI: 10.1038/ncomms2958.
Plots the matrix of pairwise correlations in rate regimes between all tips in a phylogeny.
cohorts( x, ephy, col, pal, lwd = 1, ofs = 0, use.plot.bammdata = FALSE, useraster = FALSE, LARGE = 500, ... )
cohorts( x, ephy, col, pal, lwd = 1, ofs = 0, use.plot.bammdata = FALSE, useraster = FALSE, LARGE = 500, ... )
x |
A matrix of pairwise correlations generated by
|
ephy |
An object of class |
col |
A vector of colors passed to the function |
pal |
The palette to use if |
lwd |
A numeric indicating the width of branches in the phylogeny. |
ofs |
A numeric controlling the offset of the phylogeny from the matrix plot. Appropriate values will probably be in the interval [0,0.1]. |
use.plot.bammdata |
Logical. should a phylorate plot be generated? |
useraster |
A logical indicating whether the function |
LARGE |
An integer. If trees have more tips than |
... |
Further arguments passed to |
The plotting function creates an image of the BAMM
correlation matrix between tip lineages of the phylogeny. Each
correlation is the posterior frequency with which a pair of lineages
occurs in the same macroevolutionary rate regime. Correlations are
mapped to a set of colors, with warmer colors corresponding to higher
correlations than cooler colors. The set of colors is specified by the
col
argument and a legend is plotted to guide interpretation of
the color-correlation map. Trees are plotted on the margins of the
matrix image. The correlation between any two tips can be inferred by
finding their intersection within the matrix image.
IMPORTANT: the legend DOES NOT apply to the phylorate plots
shown in the margin if use.plot.bammdata=TRUE
.
Mike Grundler
plot.bammdata
, getCohortMatrix
,
image
data(whales, events.whales) ed <- getEventData(whales, events.whales, burnin=0.1, nsamples=500) x <- getCohortMatrix(ed) cohorts(x, ed) cohorts(x, ed, col='temperature') cohorts(x, ed, ofs=0.05, col='temperature') cohorts(x, ed, pal="temperature", col='temperature', use.plot.bammdata=TRUE) # gray scale cohorts(x, ed, col=gray(seq(0.2,0.9,length.out=128)), use.plot.bammdata=FALSE)
data(whales, events.whales) ed <- getEventData(whales, events.whales, burnin=0.1, nsamples=500) x <- getCohortMatrix(ed) cohorts(x, ed) cohorts(x, ed, col='temperature') cohorts(x, ed, ofs=0.05, col='temperature') cohorts(x, ed, pal="temperature", col='temperature', use.plot.bammdata=TRUE) # gray scale cohorts(x, ed, col=gray(seq(0.2,0.9,length.out=128)), use.plot.bammdata=FALSE)
Computes pairwise Bayes factors for a set of
macroevolutionary models sampled using BAMM
, using MCMC
simulation output.
computeBayesFactors(postdata, expectedNumberOfShifts, burnin = 0.1, ...)
computeBayesFactors(postdata, expectedNumberOfShifts, burnin = 0.1, ...)
postdata |
Filename for the MCMC output file from a |
expectedNumberOfShifts |
Expected number of shifts under the prior. |
burnin |
What fraction of samples to discard from postdata as burnin? |
... |
Additional arguments to computeBayesFactors. |
This function returns a matrix of pairwise Bayes factors, where the Bayes factor is the ratio of marginal likelihoods between two models Mi and Mj. Numerator models are given as rows, and denominator models as columns. Row names and column names give the number of shifts in the corresponding model. Suppose you have an output matrix with row and column names 0:3 (0, 1, 2, 3). Model 0 is a model with just a single process (starting at the root), and no among-lineage rate heterogeneity.
If computeBayesFactors
gives a matrix mm
, and
mm[2,1]
is 10.0, this implies Bayes factor evidence of 10 in
favor of the 2nd row model (a model with 1 process; e.g.,
rownames(mm)[2]
) over the first column model (a model with a
single process).
This function will only compute Bayes factors between models which
were actually sampled during simulation of the posterior. Hence, if
a model has such low probability that it is never visited by
BAMM
during the simulation of the posterior, it will be
impossible to estimate its posterior probability (and thus, you will
get no Bayes factors involving this particular model). This is likely
to change in the future with more robust methods for estimating
posterior probabilities in the tails of the distribution.
A matrix of pairwise Bayes factors between models.
Dan Rabosky
data(mcmc.whales) computeBayesFactors(mcmc.whales, expectedNumberOfShifts = 1, burnin = 0.1)
data(mcmc.whales) computeBayesFactors(mcmc.whales, expectedNumberOfShifts = 1, burnin = 0.1)
BAMM
resultsComputes the 95% (or any other %) credible set of
macroevolutionary rate shift configurations from a bammdata
object. These results can be analyzed further and/or plotted.
credibleShiftSet( ephy, expectedNumberOfShifts, threshold = 5, set.limit = 0.95, ... )
credibleShiftSet( ephy, expectedNumberOfShifts, threshold = 5, set.limit = 0.95, ... )
ephy |
An object of class |
expectedNumberOfShifts |
The expected number of rate shifts under the prior. |
threshold |
The marginal posterior-to-prior odds ratio for a rate shift on a specific branch, used to distinguish core and non-core shifts. |
set.limit |
The desired limit to the credible set. A value of 0.95 will return the 95% credible set of shift configurations. |
... |
Other arguments to |
Computes the 95% credible set (or XX% credible set, depending
on set.limit
) of diversification shift configurations sampled
using BAMM
. This is analogous to a credible set of phylogenetic
tree topologies from a Bayesian phylogenetic analysis.
To understand how this calculation is performed, one must first
distinguish between "core" and "non-core" rate shifts. A "core shift"
is a rate shift with a marginal probability that is substantially
elevated above the probability expected on the basis of the prior
alone. With BAMM
, every branch in a phylogenetic tree is
associated with some non-zero prior probability of a rate shift.
Typically this is a very low per-branch shift probability (this prior
is determined by the value of the "poissonRatePrior" parameter in a
BAMM
analysis).
If we compute distinct shift configurations with every sampled shift (including those shifts with very low marginal probabilities), the number of distinct shift configurations will be overwhelmingly high. However, most of these configurations include shifts with marginal probabilities that are expected even under the prior alone. Hence, using these shifts to identify distinct shift configurations simply generates noise and isn't particularly useful.
The solution adopted in BAMMtools
is, for each branch in the
phylogeny, to compute both the posterior and prior probabilities of a
rate shift occurring. The ratio of these probabilities is a
branch-specific marginal odds ratio: it is the marginal posterior
frequency of one or more rate shifts normalized by the corresponding
prior probability. Hence, any branch with a marginal odds ratio of
1.0 is one where the observed (posterior) odds of a rate shift are no
different from the prior odds. A value of 10 implies that the
posterior probability is 10 times the prior probability.
The user of credibleShiftSet
must specify a threshold
argument. This is simply a cutoff value for identifying "important"
shifts for the purposes of identifying distinct shift configurations.
This does not imply that it is identifying "significant" shifts. See
the online documentation on this topic available at
http://bamm-project.org for more information. If you specify
threshold = 5
as an argument to credibleShiftSet
, the
function will ignore all branches with marginal odds ratios less than
5 during the enumeration of topologically distinct shift
configurations. Only shifts with marginal odds ratios greater than or
equal to threshold
will be treated as core shifts for the
purposes of identifying distinct shift configurations.
For each shift configuration in the credible set, this function will
compute the average diversification parameters. For example, the most
frequent shift configuration (the maximum a posteriori shift
configuration) might have 3 shifts, and 150 samples from your
posterior (within the bammdata
object) might show this shift
configuration. However, the parameters associated with each of these
shift configurations (the actual evolutionary rate parameters) might
be different for every sample. This function returns the mean set of
rate parameters for each shift configuration, averaging over all
samples from the posterior that can be assigned to a particular shift
configuration.
A class credibleshiftset
object with many components. Most
components are an ordered list of length L, where L is the number of
distinct shift configurations in the credible set. The first list
element in each case corresponds to the shift configuration with the
maximum a posteriori probability.
frequency |
A vector of frequencies of shift configurations,
including those that account for |
shiftnodes |
A list of the "core" rate shifts (marginal
probability > threshold) that occurred in each distinct shift
configuration in the credible set. The i'th vector from this list
gives the core shift nodes for the i'th shift configuration. They
are sorted by frequency, so |
indices |
A list of vectors containing the indices of samples in
the |
cumulative |
Like |
threshold |
The marginal posterior-to-prior odds for rate shifts on branches used during enumeration of distinct shift configurations. |
number.distinct |
Number of distinct shift configurations in the credible set. |
set.limit |
which credible set is this (0.9, 0.95, etc)? |
coreshifts |
A vector of node numbers corresponding to the core
shifts. All of these nodes have a Bayes factor of at least
|
In addition, a number of components that are defined similarly in
class phylo
or class bammdata
objects:
edge |
See documentation for class |
Nnode |
See documentation for class |
tip.label |
See documentation for class |
edge.length |
See documentation for class |
begin |
The beginning time of each branch in absolute time (the root is set to time zero) |
end |
The ending time of each branch in absolute time. |
numberEvents |
An integer vector with the number of core events
contained in the |
eventData |
A list of dataframes. Each element holds the average
rate and location parameters for all samples from the posterior
that were assigned to a particular distinct shift configuration.
Each row in a dataframe holds the data for a single event. Data
associated with an event are: |
eventVectors |
A list of integer vectors. Each element is for a
single shift configuration in the posterior. For each branch in
the |
eventBranchSegs |
A list of matrices. Each element of the list is
a single distinct shift configuration. Each matrix has four
columns: |
tipStates |
A list of integer vectors. Each element is a single distinct shift configuration. For each tip the index of the event that occurs along the branch subtending the tip. Tips are ordered increasing here and elsewhere. |
tipLambda |
A list of numeric vectors. Each element is a single distinct shift configuration. For each tip is the average rate of speciation or trait evolution at the end of the terminal branch subtending that tip (averaged over all samples that are assignable to this particular distinct shift configuration). |
tipMu |
A list of numeric vectors. Each element is a single
distinct shift configuration. For each tip the rate of extinction
at the end of the terminal branch subtending that tip. Meaningless
if working with |
type |
A character string. Either "diversification" or "trait"
depending on your |
downseq |
An integer vector holding the nodes of |
lastvisit |
An integer vector giving the index of the last node
visited by the node in the corresponding position in
|
Dan Rabosky
distinctShiftConfigurations
,
plot.bammshifts
, summary.credibleshiftset
,
plot.credibleshiftset
,
getBranchShiftPriors
data(events.whales, whales) ed <- getEventData(whales, events.whales, burnin=0.1, nsamples=500) cset <- credibleShiftSet(ed, expectedNumberOfShifts = 1, threshold = 5) # Here is the total number of samples in the posterior: length(ed$eventData) # And here is the number of distinct shift configurations: cset$number.distinct # here is the summary statistics: summary(cset) # Accessing the raw frequency vector for the credible set: cset$frequency #The cumulative frequencies: cset$cumulative # The first element is the shift configuration with the maximum # a posteriori probability. We can identify all the samples from # posterior that show this shift configuration: cset$indices[[1]] # Now we can plot the credible set: plot(cset, plotmax=4)
data(events.whales, whales) ed <- getEventData(whales, events.whales, burnin=0.1, nsamples=500) cset <- credibleShiftSet(ed, expectedNumberOfShifts = 1, threshold = 5) # Here is the total number of samples in the posterior: length(ed$eventData) # And here is the number of distinct shift configurations: cset$number.distinct # here is the summary statistics: summary(cset) # Accessing the raw frequency vector for the credible set: cset$frequency #The cumulative frequencies: cset$cumulative # The first element is the shift configuration with the maximum # a posteriori probability. We can identify all the samples from # posterior that show this shift configuration: cset$indices[[1]] # Now we can plot the credible set: plot(cset, plotmax=4)
marginalShiftProbsTree
computes a version of a
phylogenetic tree where each branch length is equal to the marginal
probability that a shift occurred on a particular branch. The
cumulativeShiftProbsTree
includes the cumulative probability
that a shift occurred on a given branch. See details.
cumulativeShiftProbsTree(ephy) marginalShiftProbsTree(ephy)
cumulativeShiftProbsTree(ephy) marginalShiftProbsTree(ephy)
ephy |
An object of class |
The marginal shift probability tree is a copy of the target phylogeny, but where each branch length is equal to the branch-specific marginal probability that a rate-shift occurred on the focal branch. For example, a branch length of 0.333 implies that 1/3 of all samples from the posterior had a rate shift on the focal branch.
Note: It is highly inaccurate to use marginal shift probabilities as a measure of whether diversification rate heterogeneity occurs within a given dataset. Consider the following example. Suppose you have a tree with topology (A, (B, C)). You find a marginal shift probability of 0.5 on the branch leading to clade C, and also a marginal shift probability of 0.5 on the branch leading to clade BC. Even though the marginal shift probabilities appear low, it may be the case that the joint probability of a shift occurring on either the branch leading to C or BC is 1.0. Hence, you could be extremely confident (posterior probabilities approaching 1.0) in rate heterogeneity, yet find that no single branch has a particularly high marginal shift probability. In fact, this is exactly what we expect in most real datasets, because there is rarely enough signal to strongly support the occurrence of a shift on any particular branch.
The cumulative shift probability tree is a copy of the target phylogeny but where branch lengths are equal to the cumulative probability that a rate shift occurred somewhere on the path between the root and the focal branch. A branch length equal to 0.0 implies that the branch in question has evolutionary rate dynamics that are shared with the evolutionary process starting at the root of the tree. A branch length of 1.0 implies that, with posterior probability 1.0, the rate dynamics on a branch are decoupled from the "root process".
An object of class phylo
, but with branch lengths equal to
the marginal or cumulative shift probabilities.
Dan Rabosky
data(whales) data(events.whales) ed <- getEventData(whales, events.whales, nsamples = 500) # computing the marginal shift probs tree: mst <- marginalShiftProbsTree(ed) # The cumulative shift probs tree: cst <- cumulativeShiftProbsTree(ed) #compare the two types of shift trees side-by-side: plot.new() par(mfrow=c(1,2)) plot.phylo(mst, no.margin=TRUE, show.tip.label=FALSE) plot.phylo(cst, no.margin=TRUE, show.tip.label=FALSE)
data(whales) data(events.whales) ed <- getEventData(whales, events.whales, nsamples = 500) # computing the marginal shift probs tree: mst <- marginalShiftProbsTree(ed) # The cumulative shift probs tree: cst <- cumulativeShiftProbsTree(ed) #compare the two types of shift trees side-by-side: plot.new() par(mfrow=c(1,2)) plot.phylo(mst, no.margin=TRUE, show.tip.label=FALSE) plot.phylo(cst, no.margin=TRUE, show.tip.label=FALSE)
Identify topologically distinct rate shift configurations
that were sampled with BAMM
, and assign each sample in the
posterior to one of the distinct shift configurations.
distinctShiftConfigurations(ephy, expectedNumberOfShifts, threshold, ...)
distinctShiftConfigurations(ephy, expectedNumberOfShifts, threshold, ...)
ephy |
An object of class |
expectedNumberOfShifts |
The expected number of rate shifts under the prior. |
threshold |
Threshold value for marginal posterior-to-prior odds ratios, used to identify branches with elevated shift probabilities relative to prior (core vs non-core shifts). |
... |
Other arguments to distinctShiftConfigurations (possibly deprecated args). |
See Rabosky et al (2014) and the BAMM
project website for
details on the nature of distinct shift configurations and especially
the distinction between "core" and "non-core" rate shifts. Note that
branches with elevated marginal posterior probabilities relative to
the prior (marginal odds ratios) cannot be claimed to have
"significant" evidence for a rate shift on the basis of this evidence
alone.
An object of class bammshifts
. This is a list with the
following components:
marg.probs: A list of the marginal probability of a shift occurring at each node of the phylogeny for each distinct rate shift configuration.
marginal_odd_ratio: Marginal posterior-to-prior odds ratios for one or more rate shifts an a given branch.
shifts: A list of the set of shift nodes for each distinct rate configuration.
samplesets: A list of sample indices that reduce to each of the unique shift sets.
frequency: A vector of frequencies of each distinct shift configuration.
coreshifts: A vector of node numbers corresponding to the
core shifts. All of these nodes have a marginal odds ratio of
at least threshold
supporting a rate shift.
threshold: A single numeric value giving the marginal posterior:prior odds ratio threshold used during enumeration of distinct shift configurations.
Results are sorted by frequency:
$frequency[1] gives the most common shift configuration sampled.
$shifts[[1]] gives the corresponding node indices for that configuration.
$samplesets[[1]] gives the indices of samples with this configuration.
Dan Rabosky
plot.bammshifts
, credibleShiftSet
data(whales, events.whales) ed <- getEventData(whales, events.whales, burnin=0.25, nsamples=500) sc <- distinctShiftConfigurations(ed, expectedNumberOfShifts = 1, threshold = 5) plot(sc, ed, rank=1)
data(whales, events.whales) ed <- getEventData(whales, events.whales, burnin=0.25, nsamples=500) sc <- distinctShiftConfigurations(ed, expectedNumberOfShifts = 1, threshold = 5) plot(sc, ed, rank=1)
BAMM
outputdtRates
calculates the mean of the marginal posterior
density of the rate of speciation/extinction or trait evolution for
small segments along each branch in a phylogeny.
dtRates(ephy, tau, ism = NULL, tmat = FALSE)
dtRates(ephy, tau, ism = NULL, tmat = FALSE)
ephy |
An object of class |
tau |
A numeric that specifies the size (as a fraction of tree height) of the segments that each branch will be discretized into. |
ism |
An integer vector indexing which posterior samples to include in the calculation. |
tmat |
A logical. If |
dtRates
bins the phylogeny into windows of time and
calculates average rates of speciation/extinction or phenotypic
evolution along each segment of a branch within a window. The width of
each window is determined by tau
. tau
is a fraction of
the root to tip distance so a value of tau = 0.01
bins the
phylogeny into 100 time windows of equal width.
A bammdata
object with a new component named "dtrates", which
is a list with two or three components:
tau: The parameter value of tau
used in the
calculation.
rates: If ephy$type = "trait"
: a numeric vector with
the phenotypic rates of each segment on each branch. If
ephy$type = "diversification"
: a list with two
components. The first component is a numeric vector of
speciation rates. The second component is a numeric vector of
extinction rates.
tmat: A matrix of the starting and ending times of the
segments on each branch. Only if tmat = TRUE
.
If there are zero length branches in the input tree NA
s will
result.
Mike Grundler
data(whales, events.whales) ed <- getEventData(whales, events.whales, burnin=0.25, nsamples=500) # use all posterior samples ed <- dtRates(ed, tau=0.01) # use specified range of posterior samples ed <- dtRates(ed, tau=0.01, ism=50:150)
data(whales, events.whales) ed <- getEventData(whales, events.whales, burnin=0.25, nsamples=500) # use all posterior samples ed <- dtRates(ed, tau=0.01) # use specified range of posterior samples ed <- dtRates(ed, tau=0.01, ism=50:150)
BAMM
Generates a template diversification or trait control file
for BAMM
, while allowing the user to specify parameter values.
generateControlFile( file = "controlfile.txt", type = "diversification", params = NULL )
generateControlFile( file = "controlfile.txt", type = "diversification", params = NULL )
file |
Destination file name with or without path. |
type |
Character, either “ |
params |
List of parameters, see |
The user can supply parameters as a list, where the name of the list item is the name of the parameter as it appears in the control file, and the value of the list item is what will be placed in the contol file.
If a parameter is specified by the user, it will automatically be uncommented if it was commented in the template.
Pascal Title
## Not run: #Produce a blank template control file generateControlFile(file = 'traitcontrol.txt', type='trait') #Produce a customized control file data(whales) #get bamm priors to supply to control file priors <- setBAMMpriors(whales, outfile = NULL) generateControlFile(file = 'divcontrol.txt', params = list( treefile = 'whales.tre', globalSamplingFraction = '1', numberOfGenerations = '100000', overwrite = '1', lambdaInitPrior = as.numeric(priors['lambdaInitPrior']), lambdaShiftPrior = as.numeric(priors['lambdaShiftPrior']), muInitPrior = as.numeric(priors['muInitPrior']), expectedNumberOfShifts = '1')) ## End(Not run)
## Not run: #Produce a blank template control file generateControlFile(file = 'traitcontrol.txt', type='trait') #Produce a customized control file data(whales) #get bamm priors to supply to control file priors <- setBAMMpriors(whales, outfile = NULL) generateControlFile(file = 'divcontrol.txt', params = list( treefile = 'whales.tre', globalSamplingFraction = '1', numberOfGenerations = '100000', overwrite = '1', lambdaInitPrior = as.numeric(priors['lambdaInitPrior']), lambdaShiftPrior = as.numeric(priors['lambdaShiftPrior']), muInitPrior = as.numeric(priors['muInitPrior']), expectedNumberOfShifts = '1')) ## End(Not run)
BAMM
analysisGet the rate shift configuration with the maximum a
posteriori probability, e.g., the shift configuration that was sampled
most frequently with BAMM
.
getBestShiftConfiguration(x, expectedNumberOfShifts, threshold = 5)
getBestShiftConfiguration(x, expectedNumberOfShifts, threshold = 5)
x |
Either a |
expectedNumberOfShifts |
The expected number of shifts under the prior. |
threshold |
The marginal posterior-to-prior odds ratio used as a cutoff for distinguishing between "core" and "non-core" rate shifts. |
This function estimates the rate shift configuration with the
highest maximum a posteriori (MAP) probability. It returns a
bammdata
object with a single sample. This can be plotted with
plot.bammdata
, and individual rate shifts can then
be added with addBAMMshifts
.
The parameters of this object are averaged over all samples in the
posterior that were assignable to the MAP shift configuration. All
non-core shifts have been excluded, such that the only shift
information contained in the object is from the "significant" rate
shifts, as determined by the relevant marginal posterior-to-prior odds
ratio threshold
.
You can extract the same information from the credible set of shift
configurations. See credibleShiftSet
for more
information.
A class bammdata
object with a single sample, corresponding
to the diversification rate shift configuration with the maximum a
posteriori probability. See getEventData
for details.
Dan Rabosky
getEventData, credibleShiftSet, plot.credibleshiftset, plot.bammdata
data(whales, events.whales) ed <- getEventData(whales, events.whales, burnin=0.1, nsamples=500) # Get prior distribution on shifts-per-branch: bp <- getBranchShiftPriors(whales, expectedNumberOfShifts = 1) # Pass the event data object in to the function: best <- getBestShiftConfiguration(ed, expectedNumberOfShifts = 1, threshold = 5) plot(best, lwd=2) addBAMMshifts(best, cex=2) # Now we can also work with the credible shift set: css <- credibleShiftSet(ed, expectedNumberOfShifts = 1, threshold = 5) summary(css) # examine model-averaged shifts from MAP configuration- # This gives us parameters, times, and associated nodes # of each evolutionary rate regime (note that one of # them corresponds to the root) css$eventData[[1]]; # Get bammdata representation of MAP configuration: best <- getBestShiftConfiguration(css, expectedNumberOfShifts = 1, threshold = 5) plot(best) addBAMMshifts(best)
data(whales, events.whales) ed <- getEventData(whales, events.whales, burnin=0.1, nsamples=500) # Get prior distribution on shifts-per-branch: bp <- getBranchShiftPriors(whales, expectedNumberOfShifts = 1) # Pass the event data object in to the function: best <- getBestShiftConfiguration(ed, expectedNumberOfShifts = 1, threshold = 5) plot(best, lwd=2) addBAMMshifts(best, cex=2) # Now we can also work with the credible shift set: css <- credibleShiftSet(ed, expectedNumberOfShifts = 1, threshold = 5) summary(css) # examine model-averaged shifts from MAP configuration- # This gives us parameters, times, and associated nodes # of each evolutionary rate regime (note that one of # them corresponds to the root) css$eventData[[1]]; # Get bammdata representation of MAP configuration: best <- getBestShiftConfiguration(css, expectedNumberOfShifts = 1, threshold = 5) plot(best) addBAMMshifts(best)
Computes the prior probability of a rate shift event for each branch. These results are important for identifying topological rate shift locations on phylogenies with marginal probabilities that exceed those predicted under the prior alone.
getBranchShiftPriors(phy, expectedNumberOfShifts)
getBranchShiftPriors(phy, expectedNumberOfShifts)
phy |
An object of class |
expectedNumberOfShifts |
The expected number of shifts under the prior. |
This function computes the prior odds on the distribution of numbers of rate shift events per branch under the prior. It returns an object which is nothing more than a copy of the original phylogenetic tree but where each branch length has been replaced by the prior probability of a rate shift on each branch.
The significance of this function is that it lets us explicitly determine which branches have shift probabilities that are elevated relative to the prior expectation.
A class phylo
with all the components of the original class
phylo
object, with the following changes:
edge.length |
Branch lengths now represent the prior probability of a rate shift on each branch. |
Dan Rabosky
distinctShiftConfigurations
,
plot.bammshifts
, summary.credibleshiftset
,
plot.credibleshiftset
, credibleShiftSet
data(whales) prior_tree1 <- getBranchShiftPriors(whales, expectedNumberOfShifts = 1) prior_tree10 <- getBranchShiftPriors(whales, expectedNumberOfShifts = 10) # plot prior expectations for branches based on these two counts: plot(prior_tree1$edge.length ~ prior_tree10$edge.length, xlim=c(0,0.05), ylim=c(0,0.05), asp=1) lines(x=c(0,1), y=c(0,1))
data(whales) prior_tree1 <- getBranchShiftPriors(whales, expectedNumberOfShifts = 1) prior_tree10 <- getBranchShiftPriors(whales, expectedNumberOfShifts = 10) # plot prior expectations for branches based on these two counts: plot(prior_tree1$edge.length ~ prior_tree10$edge.length, xlim=c(0,0.05), ylim=c(0,0.05), asp=1) lines(x=c(0,1), y=c(0,1))
Computes marginal clade-specific rates of speciation,
extinction, or (if relevant) trait evolution from BAMM
output.
getCladeRates(ephy, node = NULL, nodetype = "include", verbose = FALSE)
getCladeRates(ephy, node = NULL, nodetype = "include", verbose = FALSE)
ephy |
An object of class |
node |
If computing rates for a specific portion of tree, the node
subtending the relevant subtree. If multiple nodes are supplied, then
the equivalent number of |
nodetype |
Either "include" or "exclude". If |
verbose |
Logical. If |
Computes the time-weighted mean evolutionary rate for a given
clade. Conversely, one can compute the rate for a given phylogeny
while excluding a clade; this operation will give the "background"
rate. It is important to understand several aspects of these mean
rates. First, rates in the BAMM
framework are not constant
through time. Hence, the function computes the mean time-integrated
rates across the subtree. Operationally, this is done by integrating
the speciation rate with respect to time along each branch in the
subtree. These time-integrated rates are then summed, and the sum
is divided by the total sum of branch lengths for the subtree.
The function computes a rate for each sample in the posterior, and returns a list of rate vectors. Each rate in the corresponding vector is a mean rate for a particular sample from the posterior. Hence, you can think of the return value for this function as an estimate of the marginal distribution of rates for the focal clade. You can compute means, medians, quantiles, etc from these vectors.
A list with the following components:
lambda: A vector of speciation rates (if applicable), with the i'th rate corresponding to the mean rate from the i'th sample in the posterior.
mu: A vector of extinction rates (if applicable), with the i'th rate corresponding to the mean rate from the i'th sample in the posterior.
beta: A vector of phenotypic rates (if applicable), with the i'th rate corresponding to the mean rate from the i'th sample in the posterior.
Dan Rabosky
data(events.whales, whales) ed <- getEventData(whales, events.whales, nsamples=500) all_rates <- getCladeRates(ed) mean(all_rates$lambda) mean(all_rates$mu) # joint density of mean speciation and extinction rates: plot(all_rates$mu ~ all_rates$lambda) # clade specific rates: here for Dolphin subtree: dol_rates <- getCladeRates(ed, node=140) mean(dol_rates$lambda) mean(dol_rates$mu) # defining multiple nodes mean(getCladeRates(ed, node=c(132, 140), nodetype=c('include','exclude'))$lambda)
data(events.whales, whales) ed <- getEventData(whales, events.whales, nsamples=500) all_rates <- getCladeRates(ed) mean(all_rates$lambda) mean(all_rates$mu) # joint density of mean speciation and extinction rates: plot(all_rates$mu ~ all_rates$lambda) # clade specific rates: here for Dolphin subtree: dol_rates <- getCladeRates(ed, node=140) mean(dol_rates$lambda) mean(dol_rates$mu) # defining multiple nodes mean(getCladeRates(ed, node=c(132, 140), nodetype=c('include','exclude'))$lambda)
bammdata
objectTakes a bammdata
object and computes the pairwise
correlation in evolutionary rate regimes between all tips in the
phylogeny. This can be used to identify cohorts of taxa that share
common macroevolutionary rate parameters. It can also be used to
construct a correlation matrix for GLS analyses using
BAMM
-estimated tip rates of speciation, extinction, or
phenotypic evolution.
getCohortMatrix(ephy)
getCohortMatrix(ephy)
ephy |
An object of class |
The cohort matrix is important for interpreting and visualizing macroevolutionary dynamics. Each entry [i, j] of the cohort matrix is the probability that taxon i and taxon j share a common macroevolutionary rate regime. To compute this, we simply tabulate the percentage of samples from the posterior where taxon i and taxon j were placed in the same rate regime. If there is no rate heterogeneity in the dataset (e.g., the data are best explained by a single rate regime), then all species will tend to share the same rate regime and all values of the cohort matrix will approach 1.
A value of 0 between any two taxa means that at least one rate shift occurred on the nodal path connecting them in 100% of samples from the posterior. A value of 0.50 would imply that 50% of samples from the posterior included a rate shift on the path connecting taxa i and j. See below (Examples) for an illustration of this.
A numeric matrix of dimension k x k, where k is the number of
species in the phylogeny included in the bammdata
object.
Species names are included as row names and column names. The matrix
is symmetric, such that the values for entry [i , j] will mirror those
for [j , i].
Dan Rabosky
data(whales, events.whales) ed <- getEventData(whales, events.whales, nsamples=500) cormat <- getCohortMatrix(ed) dim(cormat) hist(cormat, breaks=50)
data(whales, events.whales) ed <- getEventData(whales, events.whales, nsamples=500) cormat <- getCohortMatrix(ed) dim(cormat) hist(cormat, breaks=50)
bammdata
object from MCMC outputgetEventData
Reads shift configuration data (the
"event data" output) from a BAMM
analysis and creates a
bammdata
object. The bammdata
object is fundamental
for extracting information about macroevolutionary rate variation
through time and among lineages.
getEventData( phy, eventdata, burnin = 0, nsamples = NULL, verbose = FALSE, type = "diversification" )
getEventData( phy, eventdata, burnin = 0, nsamples = NULL, verbose = FALSE, type = "diversification" )
phy |
An object of class |
eventdata |
A character string specifying the path to a |
burnin |
A numeric indicating the fraction of posterior samples to discard as burn-in. |
nsamples |
An integer indicating the number of posterior samples to
include in the |
verbose |
A logical. If |
type |
A character string. Either "diversification" or "trait"
depending on your |
In the BAMM
framework, an "event" defines a
macroevolutionary process of diversification or trait evolution. Every
sample from the posterior includes at least one process, defined by
such an "event". If a given sample includes just a single event, then
the dynamics of diversification or trait evolution can be described
entirely by a single time-constant or time-varying process that begins
at the root of the tree. Any sample from the posterior distribution
may include a complex mixture of distinct processes. To represent
temporal heterogeneity in macroevolutionary rates, BAMM
models
a rate , e.g. speciation, as a function that changes
exponentially with time:
.
Here is the initial rate and
is a parameter
determining how quickly that rate grows or decays with time.
The eventdata
file (or data frame) is a record of events and
associated parameters that were sampled with BAMM
during
simulation of the posterior with reversible jump MCMC. This complex,
information-rich file is processed into a bammdata
object,
which serves as the core data object for numerous downstream analyses.
From a bammdata
object, you can summarize rate variation
through time, among clades, extract locations of rate shifts,
summarize clade-specific rates of speciation and extinction, and more.
In general, the user does not need to be concerned with the details of
a bammdata
object. The object is used as input by a number of
BAMMtools
functions.
The parameter nsamples
can be used to reduce the total amount
of data included in the raw eventdata output from a BAMM
run.
The final bammdata
object will consist of all data for
nsamples
from the posterior. These nsamples
are equally
spaced after discarding some burnin
fraction as "burn-in". If
nsamples
is set to NULL
, the bammdata
object will
include all samples in the posterior after discarding the
burnin
fraction.
A list with many components:
edge: See documentation for class phylo
in package ape.
Nnode: See documentation for class phylo
in package
ape.
tip.label: See documentation for class phylo
in package
ape.
edge.length: See documentation for class phylo
in
package ape.
begin: The beginning time of each branch in absolute time (the root is set to time zero)
end: The ending time of each branch in absolute time.
numberEvents: An integer vector with the number of events
contained in phy
for each posterior sample. The length of
this vector is equal to the number of posterior samples in the
bammdata
object.
eventData: A list of dataframes. Each element is a single
posterior sample. Each row in a dataframe holds the data for a
single event. Data associated with an event are: node
- a
node number. This identifies the branch where the event
originates. time
- this is the absolute time on that branch
where the event originates (with the root at time 0). lam1
- an initial rate of speciation or trait evolution. lam2
-
a decay/growth parameter. mu1
- an initial rate of
extinction. mu2
- a decay/growth parameter. index
-
a unique integer associated with the event. See 'Details'.
eventVectors: A list of integer vectors. Each element is a
single posterior sample. For each branch in phy
the index
of the event that occurs along that branch. Branches are ordered
increasing here and elsewhere.
eventBranchSegs: A list of matrices. Each element is a single
posterior sample. Each matrix has four columns: Column 1
identifies a node in phy
. Column 2
identifies the
beginning time of the branch or segment of the branch that
subtends the node in Column 1
. Column 3
identifies
the ending time of the branch or segment of the branch that
subtends the node in Column 1
. Column 4
identifies
the index of the event that occurs along the branch or segment of
the branch that subtends the node in Column 1
.
tipStates: A list of integer vectors. Each element is a single posterior sample. For each tip the index of the event that occurs along the branch subtending the tip. Tips are ordered increasing here and elsewhere.
tipLambda: A list of numeric vectors. Each element is a single posterior sample. For each tip the rate of speciation or trait evolution at the end of the terminal branch subtending that tip.
tipMu: A list of numeric vectors. Each element is a single
posterior sample. For each tip the rate of extinction at the end
of the terminal branch subtending that tip. Meaningless if working
with BAMM
trait results.
meanTipLambda: For each tip the mean of the marginal posterior density of the rate of speciation or trait evolution at the end of the terminal branch subtending that tip.
meanTipMu: For each tip the mean of the marginal posterior
density of the rate of extinction at the end of the terminal
branch subtending that tip. Meaningless if working with
BAMM
trait results.
type: A character string. Either "diversification" or "trait"
depending on your BAMM
analysis.
downseq: An integer vector holding the nodes of phy
. The
order corresponds to the order in which nodes are visited by a
pre-order tree traversal.
lastvisit: An integer vector giving the index of the last node
visited by the node in the corresponding position in
downseq
. downseq
and lastvisit
can be used to
quickly retrieve the descendants of any node. e.g. the descendants
of node 89 can be found by
downseq[which(downseq==89):which(downseq==lastvisit[89])
.
Currently the function does not check for duplicate tip labels in
phy
, which may cause the function to choke.
Dan Rabosky, Mike Grundler
summary.bammdata
, plot.bammdata
,
dtRates
.
data(primates, events.primates) xx <- getEventData(primates, events.primates, burnin=0.25, nsamples=500, type = 'trait') # compute mean phenotypic rate for primate body size evolution: brates <- getCladeRates(xx) mean(brates$beta) # Plot rates: plot(xx)
data(primates, events.primates) xx <- getEventData(primates, events.primates, burnin=0.25, nsamples=500, type = 'trait') # compute mean phenotypic rate for primate body size evolution: brates <- getCladeRates(xx) mean(brates$beta) # Plot rates: plot(xx)
Given a vector of numeric values and the number of desired breaks, calculate the optimum breakpoints using Jenks natural breaks optimization.
getJenksBreaks(var, k, subset = NULL)
getJenksBreaks(var, k, subset = NULL)
var |
Numeric vector. |
k |
Number of breaks. |
subset |
Number of regularly spaced samples to subset from
|
getJenksBreaks
is called by
assignColorBreaks
.
The values in var
are binned into k+1
categories,
according to the Jenks natural breaks classification method. This
method is borrowed from the field of cartography, and seeks to
minimize the variance within categories, while maximizing the variance
between categories. If subset = NULL
, all values of var
are used for the optimization, however this can be a slow process with
very large datasets. If subset
is set to some number, then
subset
regularly spaced values of var
will be sampled.
This is slightly less accurate than when using the entirety of
var
but is unlikely to make much of a difference. If
subset
is defined but length(var) < subset
, then
subset
has no effect.
The Jenks natural breaks method was ported to C from code found in the classInt R package.
A numeric vector of intervals.
Pascal Title
See assignColorBreaks
and
plot.bammdata
.
# load whales dataset data(whales, events.whales) ed <- getEventData(whales, events.whales, burnin=0.25, nsamples=500) # for demonstration purposes, extract the vector of speciation rates ed <- dtRates(ed, tau=0.01) vec <- ed$dtrates$rates[[1]] # Return breaks for the binning of speciation rates into 65 groups # yielding 64 breaks getJenksBreaks(vec, 64)
# load whales dataset data(whales, events.whales) ed <- getEventData(whales, events.whales, burnin=0.25, nsamples=500) # for demonstration purposes, extract the vector of speciation rates ed <- dtRates(ed, tau=0.01) vec <- ed$dtrates$rates[[1]] # Return breaks for the binning of speciation rates into 65 groups # yielding 64 breaks getJenksBreaks(vec, 64)
bammdata
objectFor each sample in the posterior, computes the mean rate for
each branch in the focal phylogeny (speciation, extinction, trait
evolution). If the bammdata
object contains nsamples
samples and the target phylogeny has nbranches branches, the
function will compute a matrix of nbranches x nsamples.
getMarginalBranchRateMatrix(ephy, verbose = FALSE)
getMarginalBranchRateMatrix(ephy, verbose = FALSE)
ephy |
An object of class |
verbose |
Print progress during processing of |
If a type = 'diversification'
bammdata
object is
passed as an argument, the function will return matrices for both
speciation and extinction. If type = 'trait'
object, the matrix
will simply be the corresponding phenotypic rates. Branch-specific
rates are the mean rates computed by integrating the relevant
rate-through-time function along each branch, then dividing by the
length of the branch.
Returns a list with the following components:
lambda_branch_matrix: A nbranches x nsamples
matrix
of mean speciation rates for each branch.
mu_branch_matrix: A nbranches x nsamples
matrix of
mean extinction rates for each branch.
beta_branch_matrix: A nbranches x nsamples
matrix of
mean phenotypic rates for each branch.
Dan Rabosky
data(whales) data(events.whales) ed <- getEventData(whales, events.whales, nsamples = 10) mbr <- getMarginalBranchRateMatrix(ed) dim(mbr$lambda_branch_matrix)
data(whales) data(events.whales) ed <- getEventData(whales, events.whales, nsamples = 10) mbr <- getMarginalBranchRateMatrix(ed) dim(mbr$lambda_branch_matrix)
Takes a bammdata
object and computes a phylogenetic
tree where branch lengths are equal to the mean of the marginal
distributions of rates on each branch. This tree can be plotted to
visualize rate variation across a phylogeny.
getMeanBranchLengthTree(ephy, rate = "speciation")
getMeanBranchLengthTree(ephy, rate = "speciation")
ephy |
An object of class |
rate |
The type of rate-tree to be computed. Options: "speciation" (default), "extinction", "ndr" (net diversification), and "trait". |
A list with the following components:
phy: A phylogenetic tree, topologically identical to the
model tree, but with branch lengths replaced by the mean
(marginal) rates on each branch as estimated from the
posterior samples in the bammdata
object.
mean: The mean rate over all branches.
median: the median rate over all branches.
Dan Rabosky
data(whales) data(events.whales) ed <- getEventData(whales, events.whales, burnin=0.1, nsamples=500) ed2 <- subsetEventData(ed, index = 1:20) ratetree <- getMeanBranchLengthTree(ed2, rate='speciation') plot(ratetree$phy, show.tip.label=FALSE)
data(whales) data(events.whales) ed <- getEventData(whales, events.whales, burnin=0.1, nsamples=500) ed2 <- subsetEventData(ed, index = 1:20) ratetree <- getMeanBranchLengthTree(ed2, rate='speciation') plot(ratetree$phy, show.tip.label=FALSE)
Calculates the most recent common ancestor for each pair of
tips. Used internally by getEventData
.
getmrca(phy, t1, t2)
getmrca(phy, t1, t2)
phy |
An object of class |
t1 |
A vector of mode integer or character corresponding to tips in
|
t2 |
A vector of mode integer or character corresponding to tips in
|
Finds the most recent common ancestor for each pair of tips where
pairs are defined as (t1
[1], t2
[1]), (t1
[2],
t2
[2]), ... , (t1
[i], t2
[i]), ... ,(t1
[n],
t2
[n]).
A vector of node numbers of the common ancestor for each pair of tips.
Mike Grundler
bammdata
objectComputes a matrix of macroevolutionary rates at specified
timepoints from a bammdata
object. These rates can be used for
plotting speciation rates (and other rates) through time.
getRateThroughTimeMatrix( ephy, start.time = NULL, end.time = NULL, nslices = 100, node = NULL, nodetype = "include" )
getRateThroughTimeMatrix( ephy, start.time = NULL, end.time = NULL, nslices = 100, node = NULL, nodetype = "include" )
ephy |
An object of class |
start.time |
The start time (in units before present) for the time.
sequence over which rates should be computed. If |
end.time |
The end time (in units before present) for the time
sequence over which rates should be computed. If |
nslices |
The number of time points at which to compute rate
estimates (between |
node |
Allows user to extract rate-through-time information for the subtree descended from a specific node. Alternatively, a specified subtree can be excluded from the rate matrix calculations. |
nodetype |
Two options: "include" and "exclude". If "include",
computes rate matrix only for the descendants of subtree defined by
|
Computes evolutionary rates for each sample in the posterior
included as part of the bammdata
object. Rates are computed by
draping an imaginary grid over the phylogeny, where the grid begins at
start.time
and ends at end.time
, with nslices
vertical lines through the phylogeny. The mean rate at each point in
time (for a given sample from the posterior) is simply the mean rate
at that time for all branches that are intersected by the grid (see
the grid plot in the examples section).
This function is used by plotRateThroughTime, but the user can
work directly with the bamm-ratematrix
object for greater
control in plotting rate-through-time trajectories for individual
clades. See examples
for an example of how this can be used to
plot confidence intervals on a rate trajectory using shaded polygons.
The node
options are particularly useful. If you have run
BAMM
on a large phylogeny, you can easily generate the
rate-through-time data for a particular subtree by specifying the node
number along with nodetype = "include"
. Likewise, if you want
to look at just the background rate - excluding some particular
lineage - just specify nodetype = "exclude"
.
An object of class bamm-ratematrix
with the following
components:
lambda |
A |
mu |
A |
beta |
A |
times |
A vector of timepoints where rates were computed. |
times |
A vector of timepoints where rates were computed (see Examples). |
type |
Either "diversification" or "trait", depending on the input data. |
Dan Rabosky
## Not run: # Plot a rate-through-time curve with # confidence intervals for the whale dataset: data(whales, events.whales) ed <- getEventData(whales, events.whales) rmat <- getRateThroughTimeMatrix(ed) plot.new() plot.window(xlim=c(0, 36), ylim=c(0, .7)) ## Speciation quantiles: plot 90% CIs qq <- apply(rmat$lambda, 2, quantile, c(0.05, 0.5, 0.95)) xv <- c(rmat$times, rev(rmat$times)) yv <- c(qq[1,], rev(qq[3,])) ## Add the confidence polygon on rate distributions: polygon(xv, yv, col='gray80', border=FALSE) ## Add the median rate line: lines(rmat$times, qq[2,], lwd=3, col='red') ## Add axes axis(1, at=seq(-5, 35, by=5)) axis(2, at=seq(-0.2, 1, by=0.2), las=1) ####### Now we will show the actual grid used for rate calculations: plot(whales, show.tip.label=FALSE) axisPhylo() mbt <- max(branching.times(whales)) tvec <- mbt - rmat$times; tvec <- rmat$times; for (i in 1:length(tvec)){ lines(x=c(tvec[i], tvec[i]), y=c(0, 90), lwd=0.7, col='gray70') } ## This shows the grid of time slices over the phylogeny ## End(Not run)
## Not run: # Plot a rate-through-time curve with # confidence intervals for the whale dataset: data(whales, events.whales) ed <- getEventData(whales, events.whales) rmat <- getRateThroughTimeMatrix(ed) plot.new() plot.window(xlim=c(0, 36), ylim=c(0, .7)) ## Speciation quantiles: plot 90% CIs qq <- apply(rmat$lambda, 2, quantile, c(0.05, 0.5, 0.95)) xv <- c(rmat$times, rev(rmat$times)) yv <- c(qq[1,], rev(qq[3,])) ## Add the confidence polygon on rate distributions: polygon(xv, yv, col='gray80', border=FALSE) ## Add the median rate line: lines(rmat$times, qq[2,], lwd=3, col='red') ## Add axes axis(1, at=seq(-5, 35, by=5)) axis(2, at=seq(-0.2, 1, by=0.2), las=1) ####### Now we will show the actual grid used for rate calculations: plot(whales, show.tip.label=FALSE) axisPhylo() mbt <- max(branching.times(whales)) tvec <- mbt - rmat$times; tvec <- rmat$times; for (i in 1:length(tvec)){ lines(x=c(tvec[i], tvec[i]), y=c(0, 90), lwd=0.7, col='gray70') } ## This shows the grid of time slices over the phylogeny ## End(Not run)
bammdata
objectFind the node numbers associated with rate shifts for a
specified sample from the posterior distribution contained in a
bammdata
object.
getShiftNodesFromIndex(ephy, index)
getShiftNodesFromIndex(ephy, index)
ephy |
A |
index |
The index value of the posterior sample from which you want
to identify shift nodes. This is not the same as the actual
generation number of the MCMC sample. If your |
A vector of nodes (excluding the root) that define branches on which shifts occurred for the specified sample from the posterior. Will return a numeric of length 0 if no non-root shifts occur in the specified sample.
Dan Rabosky
addBAMMshifts
, plot.bammdata
,
maximumShiftCredibility
data(whales, events.whales) ed <- getEventData(whales, events.whales, burnin=0.1, nsamples=500) # Get the maximum shift credibility configuration: msc <- maximumShiftCredibility(ed) # Get the nodes at which shifts occurred in the # maximum shift credibility configuration: getShiftNodesFromIndex(ed, index=msc$sampleindex)
data(whales, events.whales) ed <- getEventData(whales, events.whales, burnin=0.1, nsamples=500) # Get the maximum shift credibility configuration: msc <- maximumShiftCredibility(ed) # Get the nodes at which shifts occurred in the # maximum shift credibility configuration: getShiftNodesFromIndex(ed, index=msc$sampleindex)
bammdata
objectReturn speciation, extinction, net diversification, or
Brownian motion trait rates for all species in the phylogeny from
BAMM
output.
getTipRates(ephy, returnNetDiv = FALSE, statistic = "mean")
getTipRates(ephy, returnNetDiv = FALSE, statistic = "mean")
ephy |
An object of class |
returnNetDiv |
Logical. If |
statistic |
Determines how the average tip rates should be
calculated. Can be either |
Returns a list with the following elements:
If ephy
type is 'diversification':
lambda: A matrix of tip speciation rates with species as rows, and posterior samples as columns.
mu: A matrix of tip extinction rates with species as rows, and posterior samples as columns.
lambda.avg: A vector of average tip speciation rates,
averaged with mean or median, depending on selected option for
statistic
. The vector is named with species names.
mu.avg: A vector of average tip extinction rates, averaged
with mean or median, depending on selected option for
statistic
. The vector is named with species names.
If ephy
type is 'diversification' and
returnNetDiv = TRUE
:
netdiv: A matrix of tip net diversification rates with species as rows, and posterior samples as columns.
netdiv.avg: A vector of average tip net diversification
rates, averaged with mean or median, depending on selected
option for statistic
. The vector is named with species
names.
If ephy
type is 'trait':
beta: A matrix of tip phenotypic rates with species as rows, and posterior samples as columns.
beta.avg: A vector of average tip phenotypic rates,
averaged with mean or median, depending on selected option for
statistic
. The vector is named with species names.
Pascal Title
Requires an object of class bammdata
as obtained with
getEventData
.
data(whales, events.whales) ephy <- getEventData(whales, events.whales, burnin=0.25, nsamples = 500) # return a vector of average species-specific speciation rates. meanlam <- getTipRates(ephy, returnNetDiv = FALSE, statistic = 'mean')$lambda.avg meanlam # return a vector of median species-specific net diversification rates. ndr <- getTipRates(ephy, returnNetDiv = TRUE, statistic = 'median')$netdiv.avg # Return mean species-specific speciation rates from all posterior # samples in the \code{bamm-data} object. lam <- getTipRates(ephy, returnNetDiv = FALSE, statistic = 'mean')$lambda rowMeans(lam)
data(whales, events.whales) ephy <- getEventData(whales, events.whales, burnin=0.25, nsamples = 500) # return a vector of average species-specific speciation rates. meanlam <- getTipRates(ephy, returnNetDiv = FALSE, statistic = 'mean')$lambda.avg meanlam # return a vector of median species-specific net diversification rates. ndr <- getTipRates(ephy, returnNetDiv = TRUE, statistic = 'median')$netdiv.avg # Return mean species-specific speciation rates from all posterior # samples in the \code{bamm-data} object. lam <- getTipRates(ephy, returnNetDiv = FALSE, statistic = 'mean')$lambda rowMeans(lam)
Compute marginal posterior-to-prior odds ratio associated with observing one or more rate shift on a given branch.
marginalOddsRatioBranches(ephy, expectedNumberOfShifts)
marginalOddsRatioBranches(ephy, expectedNumberOfShifts)
ephy |
An object of class |
expectedNumberOfShifts |
Expected number of shifts under the prior alone. |
This function returns a copy of a phylogenetic tree where each branch length is equal to the marginal odds ratio in favor of a rate shift on a particular branch. These cannot be interpreted as evidence for a rate shift in an absolute sense. As explained on the website, they are a marginal odds ratio. This function is provided primarily for the purpose of distinguishing core and non-core shifts.
A object of class phylo
but where each branch length is
equal to the marginal shift odds on each branch.
Dan Rabosky
getBranchShiftPriors
,
distinctShiftConfigurations
,
credibleShiftSet
data(whales, events.whales) ed <- getEventData(whales, events.whales, burnin=0.1, nsamples=500) marginalOddsRatioBranches(ed, expectedNumberOfShifts = 1)
data(whales, events.whales) ed <- getEventData(whales, events.whales, burnin=0.1, nsamples=500) marginalOddsRatioBranches(ed, expectedNumberOfShifts = 1)
This is one estimate of the "best" rate shift configuration,
considering only those shift configurations that were actually sampled
using BAMM
's reversible jump MCMC simulator. This is analogous
to the "maximum clade credibility tree" from a Bayesian phylogenetic
analysis. It is not necessarily the same as the shift configuration
with the maximum a posteriori probability.
maximumShiftCredibility(ephy, maximize = "product")
maximumShiftCredibility(ephy, maximize = "product")
ephy |
An object of class |
maximize |
Maximize the marginal probability of the product or sum of branch-specific shifts. |
This is one point estimate of the overall "best" rate shift
configuration. Following an MCMC simulation, the marginal shift
probabilities on each individual branch are computed using
marginalShiftProbsTree
. The shift configuration that
maximizes the product (or sum, if specified) of these marginal
branch-specific shift probabilities is the maximum shift
credibility configuration.
This option is only recommended if you have no clear "winner" in your
credible set of shift configurations (see
credibleShiftSet
). If you have a number of
largely-equiprobable shift configurations in your 95% credible set,
you may wish to try this function as an alternative for identifying a
single best shift configuration. Otherwise, it is recommended that you
present the shift configuration with the maximum a posteriori
probability (see getBestShiftConfiguration
).
A list with the following components:
bestconfigs: A vector of the index values of MCMC samples with shift configurations equal to the maximum. Usually, more than one state sampled during the MCMC simulation will have an identical (maximized) marginal probability. All samples given in this vector will have an identical shift configuration.
scores: The optimality score (product or sum of marginal
shift probabilities) for all sampled shift configurations in
the BAMMdata
object.
optimalityType: Whether the product or sum of marginal shift probabilities was used to compute the maximum shift credibility configuration.
sampleindex: A representative sample that is equal to the
maximum shift credibility configuration (e.g., this can be
plotted with addBAMMshifts
).
Dan Rabosky
marginalShiftProbsTree
,
addBAMMshifts
, cumulativeShiftProbsTree
,
credibleShiftSet
,
getBestShiftConfiguration
data(whales, events.whales) ed <- getEventData(whales, events.whales, burnin=0.25, nsamples=500) best_config <- maximumShiftCredibility(ed) plot(ed) addBAMMshifts(ed, method='phylogram', index=best_config$sampleindex)
data(whales, events.whales) ed <- getEventData(whales, events.whales, burnin=0.25, nsamples=500) best_config <- maximumShiftCredibility(ed) plot(ed) addBAMMshifts(ed, method='phylogram', index=best_config$sampleindex)
BAMM
-estimated macroevolutionary rates on a phylogenyplot.bammdata
plots a phylogenetic tree from a
bammdata
object and colors each branch by the estimated rate of
speciation, extinction, or trait evolution. Rates are not assumed to
be constant in time, and the function can plot continuously-varying
rates along individual branches.
## S3 method for class 'bammdata' plot( x, tau = 0.01, method = "phylogram", xlim = NULL, ylim = NULL, vtheta = 5, rbf = 0.001, show = TRUE, labels = FALSE, legend = FALSE, spex = "s", lwd = 1, cex = 1, pal = "RdYlBu", mask = integer(0), mask.color = gray(0.5), colorbreaks = NULL, logcolor = FALSE, breaksmethod = "linear", color.interval = NULL, JenksSubset = 20000, par.reset = FALSE, direction = "rightwards", ... )
## S3 method for class 'bammdata' plot( x, tau = 0.01, method = "phylogram", xlim = NULL, ylim = NULL, vtheta = 5, rbf = 0.001, show = TRUE, labels = FALSE, legend = FALSE, spex = "s", lwd = 1, cex = 1, pal = "RdYlBu", mask = integer(0), mask.color = gray(0.5), colorbreaks = NULL, logcolor = FALSE, breaksmethod = "linear", color.interval = NULL, JenksSubset = 20000, par.reset = FALSE, direction = "rightwards", ... )
x |
An object of class |
tau |
A numeric indicating the grain size for the calculations. See
documentation for |
method |
A character string indicating the method for plotting the
phylogenetic tree. |
xlim |
A numeric vector of coordinates for the x-axis endpoints.
Defaults to |
ylim |
A numeric vector of coordinates for the y-axis endpoints.
Defaults to |
vtheta |
A numeric indicating the angular separation (in degrees) of
the first and last terminal nodes. Ignored if
|
rbf |
A numeric indicating the length of the root branch as a
fraction of total tree height. Ignored if |
show |
A logical indicating whether or not to plot the tree. Defaults
to |
labels |
A logical indicating whether or not to plot the tip labels.
Defaults to |
legend |
A logical indicating whether or not to plot a legend for
interpreting the mapping of evolutionary rates to colors. Defaults to
|
spex |
A character string indicating what type of macroevolutionary
rates should be plotted. "s" (default) indicates speciation rates, "e"
indicates extinction rates, and "netdiv" indicates net diversification
rates. Ignored if |
lwd |
A numeric specifying the line width for branches. |
cex |
A numeric specifying the size of tip labels. |
pal |
A character string or vector of mode character that describes the color palette. See Details for explanation of options. |
mask |
An optional integer vector of node numbers specifying branches
that will be masked with |
mask.color |
The color for the mask. |
colorbreaks |
A numeric vector of percentiles delimiting the bins for
mapping rates to colors. If |
logcolor |
Logical. Should colors be plotted on a log scale. |
breaksmethod |
Method used for determining color breaks. See help
file for |
color.interval |
Min and max value for the mapping of rates. If
|
JenksSubset |
If |
par.reset |
A logical indicating whether or not to reset the
graphical parameters when the function exits. Defaults to
|
direction |
A character string. Options are "rightwards", "leftwards", "upwards", and "downwards", which determine the orientation of the tips when the phylogeny plotted. |
... |
Further arguments passed to |
To calculate rates, each branch of the phylogeny is discretized
into a number of small segments, and the mean of the marginal
posterior density of the rate of speciation/extinction or trait
evolution is calculated for each such segment. Rates are mapped to
colors such that cool colors represent slow rates and warm colors
represent fast rates. When the tree is plotted each of these small
segments is plotted so that changes in rates through time and shifts
in rates are visible as gradients of color. The spex
argument
determines the type of rate that will be calculated. spex = "s"
will plot speciation rates, spex = "e"
will plot extinction
rates, and spex = "netdiv"
will plot diversification rates
(speciation - extinction). Note that if x$type = "trait"
the
spex
argument is ignored and rates of phenotypic evolution are
plotted instead. If legend = TRUE
the function will plot a
legend that contains the mapping of colors to numerical values.
A number of color palettes come built in with BAMMtools
.
Color-blind friendly options include:
BrBG
PiYG
PRGn
PuOr
RdBu
RdYlBu
BuOr
BuOrRd
DkRdBu
BuDkOr
GnPu
Some color-blind unfriendly options include:
RdYlGn
Spectral
temperature
terrain
Some grayscale options include:
grayscale
revgray
For more information about these color palettes visit
https://colorbrewer2.org/ and
https://pjbartlein.github.io/datagraphics/color_scales.html or
use the help files of the R packages RColorBrewer
and
dichromat
.
Additionally, any vector of valid named colors may also be used. The
only restriction is that the length of this vector be greater than or
equal to three (you can provide a single color, but in this case the
entire tree will be assigned the same color). The colors should be
ordered from cool to warm as the colors will be mapped from low rates
to high rates in the order supplied (e.g. pal=c("darkgreen",
"yellow2", "red")
). The option pal = "temperature"
uses the
rich.colors
function written by Arni Magnusson for the R
package gplots
.
Internally plot.bammdata
checks whether or not rates have been
calculated by looking for a component named "dtrates" in the
bammdata
object. If rates have not been calculated
plot.bammdata
calls dtRates
with tau
. Specifying
smaller values for tau
will result in smoother-looking rate
changes on the tree. Note that smaller values of tau
require
more computation. If the colorbreaks
argument
is NULL
a map of rates to colors is also made by calling
assignColorBreaks
with NCOLORS = 64
. A user supplied
colorbreaks
argument can be passed as well. This allows one to
plot parts of a tree while preserving the map of rates to colors that
was made using rates for the entire tree.
If color.interval is defined, then those min and max values override the automatic detection of min and max. This might be useful if some small number of lineages have very high or very low rates, such that the map of colors is being skewed towards these extremes, resulting in other rate variation being drowned out. If specified, the color ramp will be built between these two color.interval values, and the rates outside of the color interval range will be set to the highest and lowest color. The total number of colors will also be increased such that 64 color bins are found within the color.interval.
If plot.bammdata
is called repeatedly with the same
bammdata
object, computation can be reduced by first calling
dtRates
in the global environment.
Returns (invisibly) a list with three components.
coords: A matrix of plot coordinates. Rows correspond to
branches. Columns 1-2 are starting (x,y) coordinates of each
branch and columns 3-4 are ending (x,y) coordinates of each
branch. If method = "polar"
a fifth column gives the
angle(in radians) of each branch.
colorbreaks: A vector of percentiles used to group macroevolutionary rates into color bins.
colordens: A matrix of the kernel density estimates (column 2) of evolutionary rates (column 1) and the color (column 3) corresponding to each rate value.
Mike Grundler, Pascal Title
https://colorbrewer2.org/, https://pjbartlein.github.io/datagraphics/color_scales.html
dtRates
, addBAMMshifts
,
assignColorBreaks
, subtreeBAMM
,
colorRampPalette
data(whales, events.whales) ed <- getEventData(whales, events.whales, burnin=0.25, nsamples=500) # The first call to plot.bammdata # No calculations or assignments of rates have been made plot(ed, lwd = 3, spex = "s") # calls dtRates & assignColorBreaks # Compare the different color breaks methods par(mfrow=c(1,3)) plot(ed, lwd = 3, spex = "s", breaksmethod = "linear") title(main="linear") plot(ed, lwd = 3, spex = "s", breaksmethod = "quantile") title(main="quantile") plot(ed, lwd = 3, spex = "s", breaksmethod = "jenks") title(main="jenks") ## Not run: # now plot.bammdata no longer calls dtRates ed <- dtRates(ed, tau = 0.01) xx <- plot(ed, lwd = 3, spex = "s") # you can plot subtrees while preserving the original # rates to colors map by passing the colorbreaks object as an argument sed <- subtreeBAMM(ed, node = 103) plot(sed, lwd = 3, colorbreaks = xx$colorbreaks) sed <- subtreeBAMM(ed, node = 140) plot(sed, lwd = 3, colorbreaks = xx$colorbreaks) # note how if we do not pass colorbreaks the map is # no longer relative to the rest of the tree and the plot is quite # distinct from the original plot(sed, lwd = 3) # if you want to change the value of tau and the rates to colors map for # the entire tree ed <- dtRates(ed, tau = 0.002) xx <- plot(ed, lwd = 3, spex = "s") # now you can re-plot the subtrees using this finer tau partition sed <- subtreeBAMM(ed, node = 103) sed <- dtRates(sed, 0.002) plot(sed, lwd = 3, colorbreaks = xx$colorbreaks) sed <- subtreeBAMM(ed, node = 140) sed <- dtRates(sed, 0.002) plot(sed, lwd = 3, colorbreaks = xx$colorbreaks) # multi-panel plotting and adding shifts of specific posterior samples par(mfrow=c(2,3)) samples <- sample(1:length(ed$eventData), 6) ed <- dtRates(ed, 0.005) # individual plots will have a color map relative to the mean xx <- plot(ed, show=FALSE) for (i in 1:6) { ed <- dtRates(ed, 0.005, samples[i]) plot(ed, colorbreaks=xx$colorbreaks) addBAMMshifts(ed,index=samples[i],method="phylogram", par.reset=FALSE) } dev.off() # color options ed <- dtRates(ed,0.01) plot(ed, pal="temperature",lwd=3) plot(ed, pal="terrain",lwd=3) plot(ed, pal=c("darkgreen","yellow2","red"),lwd=3) plot(ed,method="polar",pal="Spectral", lwd=3) plot(ed,method="polar",pal="RdYlBu", lwd=3) ## End(Not run)
data(whales, events.whales) ed <- getEventData(whales, events.whales, burnin=0.25, nsamples=500) # The first call to plot.bammdata # No calculations or assignments of rates have been made plot(ed, lwd = 3, spex = "s") # calls dtRates & assignColorBreaks # Compare the different color breaks methods par(mfrow=c(1,3)) plot(ed, lwd = 3, spex = "s", breaksmethod = "linear") title(main="linear") plot(ed, lwd = 3, spex = "s", breaksmethod = "quantile") title(main="quantile") plot(ed, lwd = 3, spex = "s", breaksmethod = "jenks") title(main="jenks") ## Not run: # now plot.bammdata no longer calls dtRates ed <- dtRates(ed, tau = 0.01) xx <- plot(ed, lwd = 3, spex = "s") # you can plot subtrees while preserving the original # rates to colors map by passing the colorbreaks object as an argument sed <- subtreeBAMM(ed, node = 103) plot(sed, lwd = 3, colorbreaks = xx$colorbreaks) sed <- subtreeBAMM(ed, node = 140) plot(sed, lwd = 3, colorbreaks = xx$colorbreaks) # note how if we do not pass colorbreaks the map is # no longer relative to the rest of the tree and the plot is quite # distinct from the original plot(sed, lwd = 3) # if you want to change the value of tau and the rates to colors map for # the entire tree ed <- dtRates(ed, tau = 0.002) xx <- plot(ed, lwd = 3, spex = "s") # now you can re-plot the subtrees using this finer tau partition sed <- subtreeBAMM(ed, node = 103) sed <- dtRates(sed, 0.002) plot(sed, lwd = 3, colorbreaks = xx$colorbreaks) sed <- subtreeBAMM(ed, node = 140) sed <- dtRates(sed, 0.002) plot(sed, lwd = 3, colorbreaks = xx$colorbreaks) # multi-panel plotting and adding shifts of specific posterior samples par(mfrow=c(2,3)) samples <- sample(1:length(ed$eventData), 6) ed <- dtRates(ed, 0.005) # individual plots will have a color map relative to the mean xx <- plot(ed, show=FALSE) for (i in 1:6) { ed <- dtRates(ed, 0.005, samples[i]) plot(ed, colorbreaks=xx$colorbreaks) addBAMMshifts(ed,index=samples[i],method="phylogram", par.reset=FALSE) } dev.off() # color options ed <- dtRates(ed,0.01) plot(ed, pal="temperature",lwd=3) plot(ed, pal="terrain",lwd=3) plot(ed, pal=c("darkgreen","yellow2","red"),lwd=3) plot(ed,method="polar",pal="Spectral", lwd=3) plot(ed,method="polar",pal="RdYlBu", lwd=3) ## End(Not run)
Plots a random distinct rate shift configuration sampled by
BAMM
on a phylogeny.
## S3 method for class 'bammshifts' plot( x, ephy, method = "phylogram", pal = "RdYlBu", rank = NULL, index = NULL, spex = "s", legend = TRUE, add.freq.text = TRUE, logcolor = FALSE, breaksmethod = "linear", color.interval = NULL, JenksSubset = 20000, ... )
## S3 method for class 'bammshifts' plot( x, ephy, method = "phylogram", pal = "RdYlBu", rank = NULL, index = NULL, spex = "s", legend = TRUE, add.freq.text = TRUE, logcolor = FALSE, breaksmethod = "linear", color.interval = NULL, JenksSubset = 20000, ... )
x |
An object of class |
ephy |
An object of class |
method |
A character string for which plotting method to use. "phylogram" uses rectangular coordinates. "polar" uses polar coordinates. |
pal |
The color palette to use in |
rank |
The rank of the core shift configuration to plot. For the
default ( |
index |
The posterior sample to plot. For the default ( |
spex |
A character string indicating what type of macroevolutionary
rates should be plotted. "s" (default) indicates speciation rates, "e"
indicates extinction rates, and 'netdiv' indicates net diversification
rates. Ignored if |
legend |
Logical indicating whether to plot a legend. |
add.freq.text |
A logical indicating whether the frequency of each sampled shift configuration should be added to each plot. |
logcolor |
Logical. Should colors be plotted on a log scale. |
breaksmethod |
Method used for determining color breaks. See help
file for |
color.interval |
Min and max value for the mapping of rates. One of
the two values can be |
JenksSubset |
If |
... |
Other arguments to |
A rate shift configuration is the set of nodes of the phylogeny
where a shift occurs in the macroevolutionary rate dynamic of
diversification or trait evolution. Each posterior sample is a
potentially distinct rate shift configuration. Different
configurations may imply different macroevolutionary scenarios. This
function helps visualize the different distinct rate shift
configurations sampled by BAMM
.
A core shift configuration is defined by a set of nodes that have
shift probabilities that are substantially elevated relative to what
you expect under the prior alone. These core configurations are
inferred in distinctShiftConfigurations
. It is almost
certain that more than one core shift configuration will be sampled by
BAMM
. Moreover, each core shift configuration may contain many
subconfigurations. A subconfiguration contains the core shift node
configuration and zero or more additional shift nodes that occur with
low marginal probability.
Points are added to the branches subtending the nodes of each rate configuration. The size of the point is proportional to the marginal probability that a shift occurs on a specific branch. If the instantaneous rate at a shift's origin represents an initial increase above the ancestral instantaneous rate the point is red. If the instantaneous rate at a shift's origin represents an initial decrease below the ancestral instantaneous rate the point is blue.
Mike Grundler, Dan Rabosky
distinctShiftConfigurations
,
plot.bammdata
data(whales, events.whales) ed <- getEventData(whales, events.whales, burnin=0.25, nsamples=500) sc <- distinctShiftConfigurations(ed, expectedNumberOfShifts = 1, threshold = 5) plot(sc, ed)
data(whales, events.whales) ed <- getEventData(whales, events.whales, burnin=0.25, nsamples=500) sc <- distinctShiftConfigurations(ed, expectedNumberOfShifts = 1, threshold = 5) plot(sc, ed)
BAMM
analysisPlots the credible set of rate shift configurations from a
BAMM
analysis on a phylogeny.
## S3 method for class 'credibleshiftset' plot( x, plotmax = 9, method = "phylogram", pal = "RdYlBu", shiftColor = "black", spex = "s", add.freq.text = TRUE, use.plot.bammdata = TRUE, border = TRUE, legend = FALSE, send2pdf = FALSE, logcolor = FALSE, breaksmethod = "linear", color.interval = NULL, JenksSubset = 20000, ... )
## S3 method for class 'credibleshiftset' plot( x, plotmax = 9, method = "phylogram", pal = "RdYlBu", shiftColor = "black", spex = "s", add.freq.text = TRUE, use.plot.bammdata = TRUE, border = TRUE, legend = FALSE, send2pdf = FALSE, logcolor = FALSE, breaksmethod = "linear", color.interval = NULL, JenksSubset = 20000, ... )
x |
An object of class |
plotmax |
An integer number of plots to display. |
method |
A coordinate method to use for plotting. Options are "phylogram" or "polar". |
pal |
A color palette to use with |
shiftColor |
Color to use for shift points. |
spex |
A character string indicating what type of macroevolutionary rates should be plotted. "s" (default) indicates speciation rates, "e" indicates extinction rates, and "netdiv" indicates net diversification rates. Ignored if ephy$type = "trait". |
add.freq.text |
A logical indicating whether to add the posterior frequency of each shift configuration to the plotting region. |
use.plot.bammdata |
A logical indicating whether to use
|
border |
A logical indicating whether to frame the plotting region. |
legend |
A logical indicating whether to plot a legend. |
send2pdf |
A logical indicating whether to print the figure to a PDF file. |
logcolor |
A logical indicating whether the rates should be log-transformed. |
breaksmethod |
Method used for determining color breaks. See help
file for |
color.interval |
Min and max value for the mapping of rates. One of
the two values can be |
JenksSubset |
If |
... |
Further arguments to pass to |
This produces phylorate plots for the plotmax
most-probable shift configurations sampled with BAMM
. Shift
configurations are plotted in a single graphics window. The posterior
probability (frequency) of each rate shift configuration in the
posterior is shown (omitted with argument add.freq.text = FALSE
).
Points are added to the branches subtending the nodes of each rate configuration. The size of the point is proportional to the marginal probability that a shift occurs on a specific branch.
Mike Grundler
credibleShiftSet
,
distinctShiftConfigurations
,
plot.bammdata
, plot.bammshifts
data(events.whales) data(whales) ed <- getEventData(whales, events.whales, nsamples=500) cset <- credibleShiftSet(ed, expectedNumberOfShifts = 1, threshold = 5) plot(cset)
data(events.whales) data(whales) ed <- getEventData(whales, events.whales, nsamples=500) cset <- credibleShiftSet(ed, expectedNumberOfShifts = 1, threshold = 5) plot(cset)
Generates a barplot of the prior and posterior distributions of the number of shifts.
plotPrior( mcmc, expectedNumberOfShifts = 1, burnin = 0.15, priorCol = "light blue", postCol = "red", legendPos = "topright", ... )
plotPrior( mcmc, expectedNumberOfShifts = 1, burnin = 0.15, priorCol = "light blue", postCol = "red", legendPos = "topright", ... )
mcmc |
A dataframe of the mcmc_out file from a |
expectedNumberOfShifts |
Expected number of shifts under the prior. |
burnin |
The fraction of samples to discard as burn-in. |
priorCol |
Color for the prior distribution. |
postCol |
Color for the posterior distribution. |
legendPos |
Placement of the legend, see |
... |
Additional parameters that are passed to
|
Invisibly returns a matrix with the probability of each shift number under the prior and the posterior.
Pascal Title
data(mcmc.whales) plotPrior(mcmc.whales, expectedNumberOfShifts = 1, burnin = 0.15)
data(mcmc.whales) plotPrior(mcmc.whales, expectedNumberOfShifts = 1, burnin = 0.15)
Generates a plot of diversification or phenotypic rate through time with confidence intervals.
plotRateThroughTime( ephy, useMedian = TRUE, intervals = seq(from = 0, to = 1, by = 0.01), ratetype = "auto", nBins = 100, smooth = FALSE, smoothParam = 0.2, opacity = 0.01, intervalCol = "blue", avgCol = "red", start.time = NULL, end.time = NULL, node = NULL, nodetype = "include", plot = TRUE, cex.axis = 1, cex.lab = 1.3, lwd = 3, xline = 3.5, yline = 3.5, mar = c(6, 6, 1, 1), xticks = NULL, yticks = NULL, xlim = "auto", ylim = "auto", add = FALSE, axis.labels = TRUE )
plotRateThroughTime( ephy, useMedian = TRUE, intervals = seq(from = 0, to = 1, by = 0.01), ratetype = "auto", nBins = 100, smooth = FALSE, smoothParam = 0.2, opacity = 0.01, intervalCol = "blue", avgCol = "red", start.time = NULL, end.time = NULL, node = NULL, nodetype = "include", plot = TRUE, cex.axis = 1, cex.lab = 1.3, lwd = 3, xline = 3.5, yline = 3.5, mar = c(6, 6, 1, 1), xticks = NULL, yticks = NULL, xlim = "auto", ylim = "auto", add = FALSE, axis.labels = TRUE )
ephy |
Object of class |
useMedian |
A logical: will plot median if |
intervals |
If |
ratetype |
If 'auto', defaults to speciation (for diversification) or beta (for traits). Can alternatively specify 'netdiv' or 'extinction'. |
nBins |
Number of time slices used to generate rates through time. |
smooth |
A logical: whether or not to apply loess smoothing. |
smoothParam |
Loess smoothing parameter, ignored if
|
opacity |
Opacity of color for interval polygons. |
intervalCol |
Color for interval polygons. |
avgCol |
Color for mean/median line (line will not be plotted if
|
start.time |
Start time (in units before present). If |
end.time |
End time (in units before present). If |
node |
If supplied, the clade descended from this node will be used
or ignored, depending on |
nodetype |
If 'include', rates will be plotted only for the clade
descended from |
plot |
A logical: if |
cex.axis |
Size of axis tick labels. |
cex.lab |
Size of axis labels. |
lwd |
Line width of the average rate. |
xline |
Margin line for placing x-axis label. |
yline |
Margin line for placing y-axis label. |
mar |
Passed to |
xticks |
Number of ticks on the x-axis, automatically inferred if
|
yticks |
Number of ticks on the y-axis, automatically inferred if
|
xlim |
Vector of length 2 with min and max times for x axis. X axis
is time since present, so if plotting till the present,
|
ylim |
Vector of length 2 with min and max rates for y axis. Can also be 'auto'. |
add |
A logical: should rates be added to an existing plot. |
axis.labels |
A logical: if |
If the input ephy
object has been generated by
getEventData
and is of class bammdata
, then
start.time
, end.time
, node
, and nodetype
can be specified. If the input ephy
object has been generated
by getRateThroughTimeMatrix
and is of class
bamm-ratematrix
, then those arguments cannot be specified
because they are needed to generate the rate matrix, which in this
case has already happened.
The user has complete control of the plotting of the confidence
intervals. Confidence intervals will not be plotted at all if
intervals=NULL
. If a single confidence interval polygon is
desired, rather than overlapping polygons, then intervals
can
specify the confidence interval bounds, and opacity
should be
set to 1 (see examples).
If working with a large dataset, we recommend first creating a
bamm-ratematrix
object with
getRateThroughTimeMatrix
and then using that object as
input for plotRateThroughTime
. This way, the computation of
rates has already happened and will not slow the plotting function
down, making it easier to adjust plotting parameters.
If plot = FALSE
, then a list is returned with the following
components:
poly: A list of matrices, where each matrix contains the coordinates that define each overlapping confidence interval polygon.
avg: A vector of y-coordinates for mean or median rates used to plot the average rates line.
times: A vector of time values, used as x-coordinates in this plot function.
Pascal Title
See getEventData
and
getRateThroughTimeMatrix
to generate input data.
## Not run: data(events.whales) data(whales) ephy <- getEventData(whales,events.whales) # Simple plot of rates through time with default settings plotRateThroughTime(ephy) # Plot two processes separately with 90% CI and loess smoothing plotRateThroughTime(ephy, intervals = seq(from = 0, 0.9, by = 0.01), smooth = TRUE, node = 141, nodetype = 'exclude', ylim = c(0, 1.2)) plotRateThroughTime(ephy, intervals = seq(from = 0, 0.9, by = 0.01), smooth = TRUE, node = 141, nodetype = 'include', add = TRUE, intervalCol = 'orange') legend('topleft', legend = c('Dolphins','Whales'), col = 'red', fill = c('orange', 'blue'), border = FALSE, lty = 1, lwd = 2, merge = TRUE, seg.len=0.6) # Same plot, but from bamm-ratematrix objects rmat1 <- getRateThroughTimeMatrix(ephy, node = 141, nodetype = 'exclude') rmat2 <- getRateThroughTimeMatrix(ephy, node = 141, nodetype = 'include') plotRateThroughTime(rmat1, intervals=seq(from = 0, 0.9, by = 0.01), smooth = TRUE, ylim = c(0, 1.2)) plotRateThroughTime(rmat2, intervals = seq(from = 0, 0.9, by = 0.01), smooth = TRUE, add = TRUE, intervalCol = 'orange') # To plot the mean rate without the confidence envelope plotRateThroughTime(ephy, useMedian = FALSE, intervals = NULL) # To plot the mean rate, with a single 95% confidence envelope, grayscale plotRateThroughTime(ephy, useMedian = FALSE, intervals = c(0.05, 0.95), intervalCol = 'gray70', avgCol = 'black', opacity = 1) # To not plot, but instead return the plotting data generated in this # function, we can make plot = FALSE plotRateThroughTime(ephy, plot = FALSE) ## End(Not run)
## Not run: data(events.whales) data(whales) ephy <- getEventData(whales,events.whales) # Simple plot of rates through time with default settings plotRateThroughTime(ephy) # Plot two processes separately with 90% CI and loess smoothing plotRateThroughTime(ephy, intervals = seq(from = 0, 0.9, by = 0.01), smooth = TRUE, node = 141, nodetype = 'exclude', ylim = c(0, 1.2)) plotRateThroughTime(ephy, intervals = seq(from = 0, 0.9, by = 0.01), smooth = TRUE, node = 141, nodetype = 'include', add = TRUE, intervalCol = 'orange') legend('topleft', legend = c('Dolphins','Whales'), col = 'red', fill = c('orange', 'blue'), border = FALSE, lty = 1, lwd = 2, merge = TRUE, seg.len=0.6) # Same plot, but from bamm-ratematrix objects rmat1 <- getRateThroughTimeMatrix(ephy, node = 141, nodetype = 'exclude') rmat2 <- getRateThroughTimeMatrix(ephy, node = 141, nodetype = 'include') plotRateThroughTime(rmat1, intervals=seq(from = 0, 0.9, by = 0.01), smooth = TRUE, ylim = c(0, 1.2)) plotRateThroughTime(rmat2, intervals = seq(from = 0, 0.9, by = 0.01), smooth = TRUE, add = TRUE, intervalCol = 'orange') # To plot the mean rate without the confidence envelope plotRateThroughTime(ephy, useMedian = FALSE, intervals = NULL) # To plot the mean rate, with a single 95% confidence envelope, grayscale plotRateThroughTime(ephy, useMedian = FALSE, intervals = c(0.05, 0.95), intervalCol = 'gray70', avgCol = 'black', opacity = 1) # To not plot, but instead return the plotting data generated in this # function, we can make plot = FALSE plotRateThroughTime(ephy, plot = FALSE) ## End(Not run)
BAMM
rate frequenciesPlots a histogram of the frequency of rate values across the phylogeny.
ratesHistogram( phylorates, plotBrks = TRUE, xlab = "speciation rate", ylab = "density", lwd = 0.2, lty = 1, brksCol = "black", ... )
ratesHistogram( phylorates, plotBrks = TRUE, xlab = "speciation rate", ylab = "density", lwd = 0.2, lty = 1, brksCol = "black", ... )
phylorates |
A saved |
plotBrks |
Boolean, should breaks be plotted over the histogram. |
xlab |
x-axis label. |
ylab |
y-axis label. |
lwd |
Line width for breaks. |
lty |
Line style for breaks. |
brksCol |
Color of breaks lines. |
... |
Additional arguments passed on to |
With this function, a histogram is plotted that shows the
frequency of rates present in the dataset. The color scheme plotted
is taken from the saved plot.bammdata
object that is the main
input. Therefore, the mapping of colors to rates in the histogram
corresponds exactly to what is plotted in the phylorate plot. If
plotBrks = TRUE
, then the color breaks used for the phylorates
plot are shown.
This function can be a useful tool for exploring different
plot.bammdata
options. Please see
http://bamm-project.org/colorbreaks.html on the bamm-project
website for more information on the utility of this function.
Pascal Title
data(primates, events.primates) ed <- getEventData(primates, events.primates, burnin=0.25, nsamples=500, type = 'trait') # create phylorate plot with the jenks breaks method to generate output phylorates <- plot(ed, breaksmethod='jenks', show = FALSE) ratesHistogram(phylorates, plotBrks = TRUE, xlab = 'trait rates')
data(primates, events.primates) ed <- getEventData(primates, events.primates, burnin=0.25, nsamples=500, type = 'trait') # create phylorate plot with the jenks breaks method to generate output phylorates <- plot(ed, breaksmethod='jenks', show = FALSE) ratesHistogram(phylorates, plotBrks = TRUE, xlab = 'trait rates')
Deprecated function. Please use
rich.colors
instead.
richColors(n)
richColors(n)
n |
The number of desired colors. |
If the user would like to specify species sampling on a
clade-by-clade basis, a sampling probability table can be provided to
BAMM
.
samplingProbs( tree, cladeTable, cladeRichness = NULL, globalSampling, output, writeToDisk = TRUE )
samplingProbs( tree, cladeTable, cladeRichness = NULL, globalSampling, output, writeToDisk = TRUE )
tree |
An object of class |
cladeTable |
A dataframe with one column of species names and a second column of group assignment. |
cladeRichness |
Either |
globalSampling |
percent sampling for the backbone of the phylogeny. |
output |
Path + output file name. |
writeToDisk |
A logical, should the table be written to disk,
defaults to |
This function handles two types of input: The cladeTable can
either contain the species found in the phylogeny, along with clade
assignment of those species, or it can contain more species than found
in the phylogeny. If the table only contains those species in the
phylogeny, then a vector cladeRichness
must be provided that
contains known clade richness. If the cladeTable contains more than
the species in the phylogeny, then cladeRichness should be set to
NULL
. The globalSampling
value should represent the
overall completeness of species sampling in terms of major clades. See
http://bamm-project.org/ for additional details.
If writeToDisk = TRUE
, then no object is returned. If
writeToDisk = FALSE
, then a dataframe is returned. The
resultant table must contain one row for each species in the
phylogeny, along with clade assignment, and sampling fraction. The
first line must contain the overall sampling fraction for the
phylogeny and must be written as tab-delimited, with no headers.
Pascal Title
# Generate dummy data tree <- read.tree(text="(((t1:2,(t2:1,t3:1):1):1,((t4:1,t5:1):1,t6:2):1) :1,(t7:3,(t8:2,t9:2):1):1);") tree$tip.label <- paste(rep('Species',9),1:9,sep='') spTable <- as.data.frame(matrix(nrow=9,ncol=2)) spTable[,1] <- tree$tip.label spTable[1:3,2] <- 'cladeA' spTable[4:6,2] <- 'cladeB' spTable[7:9,2] <- 'cladeC' richnessVec <- c(cladeA=5, cladeB=4, cladeC=12) # Option 1: We have a table of clade assignment for the species in the # tree, along with a vector of known clade richness spTable richnessVec samplingProbs(tree, cladeTable = spTable, cladeRichness = richnessVec, globalSampling = 1, writeToDisk = FALSE) # Option 2: We have a table of known species, beyond the sampling in the # phylogeny spTable <- rbind(spTable, c('Species10','cladeA'),c('Species11','cladeA'), c('Species12','cladeC'), c('Species13','cladeC'), c('Species14','cladeC')) spTable samplingProbs(tree, cladeTable = spTable, cladeRichness = NULL, globalSampling = 0.9, writeToDisk = FALSE)
# Generate dummy data tree <- read.tree(text="(((t1:2,(t2:1,t3:1):1):1,((t4:1,t5:1):1,t6:2):1) :1,(t7:3,(t8:2,t9:2):1):1);") tree$tip.label <- paste(rep('Species',9),1:9,sep='') spTable <- as.data.frame(matrix(nrow=9,ncol=2)) spTable[,1] <- tree$tip.label spTable[1:3,2] <- 'cladeA' spTable[4:6,2] <- 'cladeB' spTable[7:9,2] <- 'cladeC' richnessVec <- c(cladeA=5, cladeB=4, cladeC=12) # Option 1: We have a table of clade assignment for the species in the # tree, along with a vector of known clade richness spTable richnessVec samplingProbs(tree, cladeTable = spTable, cladeRichness = richnessVec, globalSampling = 1, writeToDisk = FALSE) # Option 2: We have a table of known species, beyond the sampling in the # phylogeny spTable <- rbind(spTable, c('Species10','cladeA'),c('Species11','cladeA'), c('Species12','cladeC'), c('Species13','cladeC'), c('Species14','cladeC')) spTable samplingProbs(tree, cladeTable = spTable, cladeRichness = NULL, globalSampling = 0.9, writeToDisk = FALSE)
Set priors for BAMM
analysis.
setBAMMpriors( phy, total.taxa = NULL, traits = NULL, outfile = "myPriors.txt", Nmax = 1000, suppressWarning = FALSE )
setBAMMpriors( phy, total.taxa = NULL, traits = NULL, outfile = "myPriors.txt", Nmax = 1000, suppressWarning = FALSE )
phy |
An object of class |
total.taxa |
If doing speciation-extinction analysis, the total
number of taxa in your clade. If your tree contains all taxa in the
clade (100% sampling), then leave this as |
traits |
A filename where the trait data ( |
outfile |
Filename for outputting the sample priors block. If
|
Nmax |
If analyzing a very large tree for phenotypic evolution, uses only this many taxa to estimate priors for your dataset. Avoid matrix inversion issues with large numbers of tips. |
suppressWarning |
Logical. If |
This is a "quick and dirty" tool for identifying approximately
acceptable priors for a BAMM
analysis. We have found that
choice of prior can have a substantial impact on BAMM
analyses.
It is difficult to simply set a default prior that applies across
datasets, because users often have trees with branch lengths in very
different units (e.g., numbers of substitutions versus millions of
years). Hence, without some careful attention, you can inadvertently
specify some very bad prior distributions. This function is designed
to at least put you in the right ballpark for decent prior
distributions, but there are no guarantees that these are most
appropriate for your data.
The general rules applied here are:
For the lambdaInitPrior
, we estimate the speciation rate of the
data under a pure birth model. We then set this prior to give an
exponential distribution with a mean five times greater than this
computed pure birth speciation estimate.
The lambdaShiftPrior
is the standard deviation of the normal
prior on the exponential change parameter k. We set the prior
distribution based on the age of the root of the tree. We set the
standard deviation of this distribution such that 2 standard
deviations give a parameter that will yield a 90% decline in the
initial speciation rate between the root of the tree and the tips.
The basic model is lambda(t) = lambda_0 * exp(k * t). This is a
straightforward calculation: let x = -log(0.1) / TMAX, where TMAX is
the age of the tree. Then set the standard deviation equal to (x / 2).
We set muInitPrior
equal to lambdaInitPrior
.
For trait evolution, we first compute the maximum likelihood estimate
of the variance of a Brownian motion process of trait evolution. The
prior betaInitPrior
is then set to an exponential distribution
with a mean 5 times greater than this value (similar to what is done
for lambda and mu, above).
This function generates an output file containing a prior parameters
block that can be pasted directly into the priors section of your
BAMM
control file.
The function does not return anything. It simply performs some
calculations and writes formatted output to a file. However, if
outfile = NULL
, then a named vector is returned.
Dan Rabosky
# for diversification analyses data(whales) setBAMMpriors(phy = whales, total.taxa = 89, outfile = NULL) # for trait analyses data("primates") data("mass.primates") ## create a named vector of trait values mass <- setNames(mass.primates[,2], mass.primates[,1]) setBAMMpriors(phy = primates, traits = mass, outfile = NULL)
# for diversification analyses data(whales) setBAMMpriors(phy = whales, total.taxa = 89, outfile = NULL) # for trait analyses data("primates") data("mass.primates") ## create a named vector of trait values mass <- setNames(mass.primates[,2], mass.primates[,1]) setBAMMpriors(phy = primates, traits = mass, outfile = NULL)
Computes the mean of the marginal posterior density of speciation/extinction or phenotypic rates for equally spaced points along the root to tip path for each species.
speciesByRatesMatrix(ephy, nslices, index = NULL, spex = "s")
speciesByRatesMatrix(ephy, nslices, index = NULL, spex = "s")
ephy |
An object of class |
nslices |
An integer number of time slices. This determines the number of equally spaced points in time at which rates are computed for each species. |
index |
An integer or vector of mode integer indicating which
posterior samples to use in the calculation. If |
spex |
A character string. "s" (default) calculates speciation rates;
"e" calculates extinction rates; "netdiv" calculates diversification
rates. Ignored if |
A list with two components:
times: A vector of time points where rates were calculated.
rates: A species X times matrix of rate through time trajectories.
Mike Grundler
data(whales, events.whales) ed <- getEventData(whales, events.whales, burnin=0.25, nsamples=500) ratemat <- speciesByRatesMatrix(ed, nslices = 100) dolphins <- extract.clade(whales, 140)$tip.label plot.new() plot.window(xlim = c(0, 35), ylim = c(0, 0.8)) for (i in 1:nrow(ratemat$rates)) { if (whales$tip.label[i] %in% dolphins) { lines(ratemat$times, ratemat$rates[i,], lwd = 2, col = 4) } else { lines(ratemat$times, ratemat$rates[i,], lwd = 2, col = 8) } } axis(1, seq(-5, 35, 5)) axis(2, seq(-0.2, 0.8, 0.2), las = 1) mtext("Time since root", 1, line = 2.5) mtext("Speciation rate", 2, line = 2.5)
data(whales, events.whales) ed <- getEventData(whales, events.whales, burnin=0.25, nsamples=500) ratemat <- speciesByRatesMatrix(ed, nslices = 100) dolphins <- extract.clade(whales, 140)$tip.label plot.new() plot.window(xlim = c(0, 35), ylim = c(0, 0.8)) for (i in 1:nrow(ratemat$rates)) { if (whales$tip.label[i] %in% dolphins) { lines(ratemat$times, ratemat$rates[i,], lwd = 2, col = 4) } else { lines(ratemat$times, ratemat$rates[i,], lwd = 2, col = 8) } } axis(1, seq(-5, 35, 5)) axis(2, seq(-0.2, 0.8, 0.2), las = 1) mtext("Time since root", 1, line = 2.5) mtext("Speciation rate", 2, line = 2.5)
stepBF is a function to determine the overall best fitting number of shifts using Bayes factor evidence.
stepBF(BFmat, step.size = 20, expectedNumberOfShifts = 1, inputType = "matrix")
stepBF(BFmat, step.size = 20, expectedNumberOfShifts = 1, inputType = "matrix")
BFmat |
square Bayes factor matrix or a named vector of posterior probabilities |
step.size |
how much Bayes factor support is required to accept a more complex model, see Details |
expectedNumberOfShifts |
expected number of shifts under the prior (only needed for |
inputType |
describes the input: |
stepBF takes either a square Bayes factor matrix (such as output by computeBayesFactors
) or a named
vector of posterior probabilities. If posterior probabilities are supplied, the model prior
(expectedNumberOfShifts
) must also be provided.
If the input is a Bayes factor matrix, specify inputType = 'matrix'
, otherwise if the input is
a named vector of posterior probabilities, specify inputType = 'postProb'
.
The step.size
argument is how much Bayes factor support is needed to accept a more complex model.
By default, this value is 1, so any more complex model that has a better Bayes factor than the previous model
will be accepted. Increasing the step size greatly reduces the Type I error at the cost of inflating Type II
error. So, with increasing step.size, you will infer fewer shifts.
a list of 3 items: the number of shifts for the best model, the number of shifts for the second best model, and the Bayes factor support for the best model over the second best.
Jonathan Mitchell
data(mcmc.whales) # remove 10% burnin mcmc.whales <- mcmc.whales[floor(0.1 * nrow(mcmc.whales)):nrow(mcmc.whales), ] # from a square matrix of Bayes factor values (inputType = 'matrix') bfmat <- computeBayesFactors(mcmc.whales, expectedNumberOfShifts = 1, burnin = 0) stepBF(bfmat, step.size = 1, inputType = 'matrix') # or from a vector of posterior probabilities (inputType = 'postProb') postProb <- table(mcmc.whales$N_shifts) / nrow(mcmc.whales) stepBF(postProb, step.size = 1, inputType = 'postProb')
data(mcmc.whales) # remove 10% burnin mcmc.whales <- mcmc.whales[floor(0.1 * nrow(mcmc.whales)):nrow(mcmc.whales), ] # from a square matrix of Bayes factor values (inputType = 'matrix') bfmat <- computeBayesFactors(mcmc.whales, expectedNumberOfShifts = 1, burnin = 0) stepBF(bfmat, step.size = 1, inputType = 'matrix') # or from a vector of posterior probabilities (inputType = 'postProb') postProb <- table(mcmc.whales$N_shifts) / nrow(mcmc.whales) stepBF(postProb, step.size = 1, inputType = 'postProb')
bammdata
objectSubsets a bammdata
object. Returns a bammdata
object after extracting a specified set of samples from the posterior.
subsetEventData(ephy, index)
subsetEventData(ephy, index)
ephy |
An object of class |
index |
A vector of integers corresponding to samples to be extracted
from the posterior distribution of shift configurations included in
the |
This will result in an error if you attempt to access samples
that do not exist in the ephy
data object. For example, if your
bammdata
object includes 100 samples from a posterior
distribution sampled with BAMM
, you can only attempt to subset
with index values 1:100.
Dan Rabosky
plot.bammdata
, getCohortMatrix
,
image
data(whales, events.whales) ed <- getEventData(whales, events.whales, nsamples=500) ed2 <- subsetEventData(ed, index=1) plot(ed2) addBAMMshifts(ed2, cex=2)
data(whales, events.whales) ed <- getEventData(whales, events.whales, nsamples=500) ed2 <- subsetEventData(ed, index=1) plot(ed2) addBAMMshifts(ed2, cex=2)
bammdata
objectGiven a set of tips or a node, this function extracts the
corresponding subtree from the bammdata
object. User should
specify either a set of tips or a node, and the node will overwrite
the tips if both are given.
subtreeBAMM(ephy, tips = NULL, node = NULL)
subtreeBAMM(ephy, tips = NULL, node = NULL)
ephy |
An object of class |
tips |
An integer or character vector indicating which tips (more than one) to be included in the subtree. |
node |
An integer indicating the root of the subtree to be extracted, and it must correspond to an innernode on the tree. |
This function allows users to extract a subtree from a big
bammdata
object, and examine the subset using
plot.bammdata
An object of class bammdata
.
Huateng Huang
data(whales, events.whales) ephy <- getEventData(whales, events.whales, burnin=0.25, nsamples=500) # specify a set of tips for the subtree tips <- sample(ephy$tip.label,size=20,replace=FALSE) subphy <- subtreeBAMM(ephy,tips=tips) # specify a innernode for subsetting subphy <- subtreeBAMM(ephy,node=103) # plot the subtree plot(subphy)
data(whales, events.whales) ephy <- getEventData(whales, events.whales, burnin=0.25, nsamples=500) # specify a set of tips for the subtree tips <- sample(ephy$tip.label,size=20,replace=FALSE) subphy <- subtreeBAMM(ephy,tips=tips) # specify a innernode for subsetting subphy <- subtreeBAMM(ephy,node=103) # plot the subtree plot(subphy)
BAMM
analysisSummarizes the posterior distribution on the number of shifts.
## S3 method for class 'bammdata' summary(object, display = 10, print = T, ...)
## S3 method for class 'bammdata' summary(object, display = 10, print = T, ...)
object |
An object of class |
display |
An integer for the number of rows of the posterior to display. |
print |
Print summary of shift distribution in console window? |
... |
Additional arguments (currently unused). |
Prints to console the number of posterior samples and the posterior distribution on the number of shifts, which is just the fraction of samples in the posterior having 0, 1, 2,...n shifts.
Returns (invisibly) a dataframe with 2 components:
shifts |
The number of shifts. |
prob |
The corresponding posterior probability of a model with a given number of rate shifts. |
Mike Grundler, Dan Rabosky
data(whales, events.whales) ephy <- getEventData(whales, events.whales, nsamples=100) summary(ephy)
data(whales, events.whales) ephy <- getEventData(whales, events.whales, nsamples=100) summary(ephy)
BAMM
analysisPrints summary attributes of the BAMM
credible set of
shift configurations.
## S3 method for class 'credibleshiftset' summary(object, ...) ## S3 method for class 'credibleshiftset' print(x, ...)
## S3 method for class 'credibleshiftset' summary(object, ...) ## S3 method for class 'credibleshiftset' print(x, ...)
object , x
|
An object of class |
... |
Additional arguments (unused). |
Prints to console summary attributes of the XX% credible set of
shift configurations sampled using BAMM
. Attributes printed
include: the number of distinct configurations in the XX% credible
set and the posterior probability, cumulative probability, and number
of rate shifts in the 9 most-probable shift configurations.
summary.credibleshiftset
returns (invisibly) a dataframe
with a number of rows equal to the number of shift configurations in
the credible set and four columns:
rank |
The ranked index of each shift configuration (ranked by posterior probability). |
probability |
The posterior probability of each shift configuration. |
cumulative |
The cumulative probability of each shift configuration. |
N_shifts |
The number of rate shifts in each shift configuration (can be zero). |
Dan Rabosky
distinctShiftConfigurations
,
plot.bammshifts
, credibleShiftSet
data(whales, events.whales) ed <- getEventData(whales, events.whales, nsamples = 500) cset <- credibleShiftSet(ed, expectedNumberOfShifts = 1, threshold = 5) summary(cset)
data(whales, events.whales) ed <- getEventData(whales, events.whales, nsamples = 500) cset <- credibleShiftSet(ed, expectedNumberOfShifts = 1, threshold = 5) summary(cset)
For each branch in a phylogenetic tree, evaluates the evidence (posterior probability or Bayes factor) that macroevolutionary rates have varied through time.
testTimeVariableBranches(ephy, prior_tv = 0.5, return.type = "posterior")
testTimeVariableBranches(ephy, prior_tv = 0.5, return.type = "posterior")
ephy |
An object of class |
prior_tv |
The prior probability that rate shifts lead to a new time-varying rate process (versus a time-constant process). |
return.type |
Either |
In BAMM 2.0
, rate shifts on trees can lead to time-varying
or constant-rate diversification processes. In other words, the model
will incorporate temporal variation in rates only if there is
sufficient evidence in the data to favor it. The function
testTimeVariableBranches
enables the user to extract the
evidence in favor of time-varying rates on any branch of a
phylogenetic tree from a bammdata
object.
The function returns a copy of the original phylogenetic tree, but
where branch lengths have been replaced by either the posterior
probability (return.type = "posterior"
) or the Bayes factor
evidence (return.type = "bayesfactor"
) that the
macroevolutionary rate regime governing each branch is time-variable.
Consider a particular branch X on a phylogenetic tree. If the length
of this branch is 0.97 and return.type = "posterior"
, this
implies that branch X was governed by a time-varying rate dynamic in
97% of all samples in the posterior. Alternatively, only 3% of
samples specified a constant rate dynamic on this branch.
The function also provides an alternative measure of support if
return.type = "bayesfactor"
. In this case, the Bayes factor
evidence for temporal rate variation is computed for each branch. We
simply imagine that diversification rates on each branch can be
explained by one of two models: either rates vary through time, or
they do not. In the above example (branch X), the Bayes factor would
be computed as follows, letting Prob_timevar and
Prior_timevar be the posterior and prior probabilities that a
particular branch is governed by a time-varying rate process:
( Prob_timevar) / (1 - Prob_timevar) * (1 - prior_timevar) / (prior_timevar)
The Bayes factor is not particularly useful under uniform prior odds
(e.g., prior_tv = 0.5
), since this simply reduces to the ratio
of posterior probabilities. Note that the prior must correspond to
whatever you used to analyze your data in BAMM
. By default,
time-variable and time-constant processes are assumed to have equal
prior odds.
This function can be used several ways, but this function allows the user to quickly evaluate which portions of a phylogenetic tree have "significant" evidence for rate variation through time (see Examples below).
An object of class phylo
, but where branch lengths are
replaced with the desired evidence (posterior probability or Bayes
factor) that each branch is governed by a time-varying rate dynamic.
Dan Rabosky
# Load whale data: data(whales, events.whales) ed <- getEventData(whales, events.whales, burnin=0.1, nsamples=200) # compute the posterior probability of # time-varying rates on each branch tree.pp <- testTimeVariableBranches(ed) # Plot tree, but color all branches where the posterior # probability of time-varying rates exceeds 95\%: colvec <- rep("black", nrow(whales$edge)) colvec[tree.pp$edge.length >= 0.95] <- 'red' plot.phylo(whales, edge.color=colvec, cex=0.5) # now, compute Bayes factors for each branch: tree.bf <- testTimeVariableBranches(ed, return.type = "bayesfactor") # now, assume that our prior was heavily stacked in favor # of a time-constant process: tree.bf2 <- testTimeVariableBranches(ed, prior_tv = 0.1, return.type = "bayesfactor") # Plotting the branch-specific Bayes factors against each other: plot.new() par(mar=c(5,5,1,1)) plot.window(xlim=c(0, 260), ylim=c(0, 260)) points(tree.bf2$edge.length, tree.bf$edge.length, pch=21, bg='red', cex=1.5) axis(1) axis(2, las=1) mtext(side=1, text="Bayes factor: prior_tv = 0.1", line=3, cex=1.5) mtext(side = 2, text = "Bayes factor: uniform prior odds", line=3, cex=1.5) # and you can see that if your prior favors CONSTANT RATE dynamics # you will obtain much stronger Bayes factor support for time varying # rates. # IF the evidence is present in your data to support time variation. # To be clear, the Bayes factors in this example were computed from the # same posterior probabilities: it is only the prior odds that differed.
# Load whale data: data(whales, events.whales) ed <- getEventData(whales, events.whales, burnin=0.1, nsamples=200) # compute the posterior probability of # time-varying rates on each branch tree.pp <- testTimeVariableBranches(ed) # Plot tree, but color all branches where the posterior # probability of time-varying rates exceeds 95\%: colvec <- rep("black", nrow(whales$edge)) colvec[tree.pp$edge.length >= 0.95] <- 'red' plot.phylo(whales, edge.color=colvec, cex=0.5) # now, compute Bayes factors for each branch: tree.bf <- testTimeVariableBranches(ed, return.type = "bayesfactor") # now, assume that our prior was heavily stacked in favor # of a time-constant process: tree.bf2 <- testTimeVariableBranches(ed, prior_tv = 0.1, return.type = "bayesfactor") # Plotting the branch-specific Bayes factors against each other: plot.new() par(mar=c(5,5,1,1)) plot.window(xlim=c(0, 260), ylim=c(0, 260)) points(tree.bf2$edge.length, tree.bf$edge.length, pch=21, bg='red', cex=1.5) axis(1) axis(2, las=1) mtext(side=1, text="Bayes factor: prior_tv = 0.1", line=3, cex=1.5) mtext(side = 2, text = "Bayes factor: uniform prior odds", line=3, cex=1.5) # and you can see that if your prior favors CONSTANT RATE dynamics # you will obtain much stronger Bayes factor support for time varying # rates. # IF the evidence is present in your data to support time variation. # To be clear, the Bayes factors in this example were computed from the # same posterior probabilities: it is only the prior odds that differed.
Given a bammdata
object and a vector of (continuous)
trait data, assess whether the correlation between the trait and bamm
estimated speciation, extinction or net diversification rate is
significant using permutation. A set of posterior samples is randomly
drawn from the bammdata
object. If the trait is continuous,
this function calculates the correlation coefficients between the
trait and tip rates (observed correlation), as well as that with
permuted rates for each posterior sample. In a one-tailed test for
positive correlations, the reported p-value is the proportion of the
posterior samples in which the observed correlation is larger than the
correlations calculated with permuted rates. In a two-tailed test, the
reported p-value is the proportion of the posterior samples in which
the null correlation is as extreme as the correlation observed. If the
trait is binary, the U statistic of the Mann-Whitney test is
calculated instead of correlation coefficients to assess whether there
is a significant difference in rate between the two trait states. For
categorical traits with more than two states, the Kruskal-Wallis rank
sum statistic is used.
traitDependentBAMM( ephy, traits, reps, rate = "speciation", return.full = FALSE, method = "spearman", logrates = TRUE, two.tailed = TRUE, traitorder = NA, nthreads = 1 )
traitDependentBAMM( ephy, traits, reps, rate = "speciation", return.full = FALSE, method = "spearman", logrates = TRUE, two.tailed = TRUE, traitorder = NA, nthreads = 1 )
ephy |
An object of class |
traits |
A vector of trait data, with names corresponding to tips in
the |
reps |
An integer specifying the number of permutations (i.e., the
number of posterior samples to randomly draw with replacement from the
|
rate |
A character string specifying which estimated rate from the
|
return.full |
A logical. If |
method |
A character string, must be one of 'spearman', 'pearson', 'mann-whitney', or 'kruskal'. Defaults to 'spearman'. You can specify just the initial letter. |
logrates |
A logical. If |
two.tailed |
A logical, used for continuous trait data. If
|
traitorder |
A character string specifying the direction of correlation for the alternative hypothesis. For continuous traits, it must be either "positive" or "negative"; only the initial letter is needed. For binary traits, it must be a string indicating states with increasing rate under the alternative hypothesis, separated by comma (e.g., 'A, B'). One-tailed test for categorical data with more than two states is not supported. |
nthreads |
Number of threads to use for parallelization of the
function. The R package |
Tip rates –trait, speciation, extinction, or net diversification
rates– are permuted in a way such that pairwise covariances in rates
between species are maintained. That is, tips with the same
tipStates
still have the same rate after permutation. Posterior
samples are randomly selected with replacement from the
bammdata
object, so reps
could be smaller or larger than
the total number of samples in the object.
This function expects that the bamm-data object and the trait data
have the same taxon set. It may be necessary to subset the trait data
and/or run subtreeBAMM
on the bamm-data
object in
order to meet this requirement.
A list with the following components:
estimate: A numeric value for continous trait data: the average observed correlation between tip rates and the trait across the posterior samples. For categorical traits, it is a list showing the median species-specific rates for each trait state.
p.value: A numeric value. The probability that the observed correlation is less than or equal to a sample from the null distribution.
method: A character string, as input.
rate: A character string, as input.
two.tailed: A logical, as input.
gen: An integer vector, recording which posterior samples
were selected. Only present when return.full
is
TRUE
.
obs.corr: A numeric vector, the observed correlation
coefficents for each posterior sample. Only present when
return.full
is TRUE
. For binary traits, centered
U statistics (U - n1* n2/2; where n1 and n2 are the number of
species in each state of the binary trait) is reported.
null: A numeric vector. The null distribution of
correlation coefficients (or centered U statistics for binary
traits) from permutation. Only present when return.full
is TRUE
.
Dan Rabosky, Huateng Huang
Rabosky, D. L. and Huang, H., 2015. A Robust Semi-Parametric Test for Detecting Trait-Dependent Diversification. Systematic Biology 65: 181-193.
Rabosky, D. L. 2014. Automatic detection of key innovations, rate shifts, and diversity-dependence on phylogenetic trees. PLoS ONE 9:e89543.
Rabosky, D. L., F. Santini, J. T. Eastman, S. A. Smith, B. L. Sidlauskas, J. Chang, and M. E. Alfaro. 2013. Rates of speciation and morphological evolution are correlated across the largest vertebrate radiation. Nature Communications DOI: 10.1038/ncomms2958.
# using a small subset of the fish data set (300 taxa) in Rabosky et al. # 2013. Nat. Com. paper data(fishes, events.fishes) xx <- getEventData(phy = fishes, eventdata = events.fishes, nsamples = 500, type = "diversification") # traits.fishes is the trait -- body size data(traits.fishes) x <- traitDependentBAMM(ephy = xx, traits = traits.fishes, reps = 1000, return.full = TRUE, method = 's', logrates = TRUE, two.tailed = TRUE)
# using a small subset of the fish data set (300 taxa) in Rabosky et al. # 2013. Nat. Com. paper data(fishes, events.fishes) xx <- getEventData(phy = fishes, eventdata = events.fishes, nsamples = 500, type = "diversification") # traits.fishes is the trait -- body size data(traits.fishes) x <- traitDependentBAMM(ephy = xx, traits = traits.fishes, reps = 1000, return.full = TRUE, method = 's', logrates = TRUE, two.tailed = TRUE)
Converts a named color and opacity and returns the proper RGB code.
transparentColor(namedColor, alpha = 0.8)
transparentColor(namedColor, alpha = 0.8)
namedColor |
A color name. |
alpha |
A transparency value between 0 and 1, where 0 is fully transparent. |
This function is used internally by
plotRateThroughTime
.
Returns the transparent color in RGB format.
Pascal Title
bammdata
object to diskTakes a bammdata
object and re-writes it back into a
treefile and an event csv file.
writeEventData(ephy, outtreefile, outeventfile, ...)
writeEventData(ephy, outtreefile, outeventfile, ...)
ephy |
A |
outtreefile |
The file name for outputting the tree. |
outeventfile |
The file name for outputting the event csv file. |
... |
Additional arguments to pass to |