Title: | Fitting Hansen Models to Investigate Convergent Evolution |
---|---|
Description: | This data-driven phylogenetic comparative method fits stabilizing selection models to continuous trait data, building on the 'ouch' methodology of Butler and King (2004) <doi:10.1086/426002>. The main functions fit a series of Hansen models using stepwise AIC, then identify cases of convergent evolution where multiple lineages have shifted to the same adaptive peak. For more information see Ingram and Mahler (2013) <doi:10.1111/2041-210X.12034>. |
Authors: | Travis Ingram [aut, cre] |
Maintainer: | Travis Ingram <[email protected]> |
License: | GPL (>= 2) |
Version: | 0.6 |
Built: | 2024-11-18 06:35:56 UTC |
Source: | CRAN |
surface
provides a wrapper to the ouch
package, fitting a series of Hansen multiple-peak stabilizing selection models using stepwise AIC, and identifying cases of convergence where independent lineages discovered the same adaptive peak
Package: | surface |
Type: | Package |
Version: | 0.5 |
Date: | 2020-11-10 |
License: | GPL (>=2) |
surface
uses the Hansen model of stabilizing selection around multiple adaptive peaks to infer a macroevolutionary adaptive landscape using only trait data and a phylogenetic tree. The most important functions are surfaceForward
and surfaceBackward
, which carry out the two stepwise phases of the method, and runSurface
, a wrapper function that carries out both phases. Results can be displayed using surfaceSummary
, and visualized using surfaceTreePlot
, surfaceTraitPlot
, and surfaceAICPlot
. Hypothesis tests, such as whether the extent of convergence exceeds the expectation under a model without true convergence, can be done with the assistance of surfaceSimulate
. The vignette ‘surface_tutorial’ demonstrates the use of the various functions included in the package
Travis Ingram <[email protected]>
Ingram, T. & Mahler, D.L. (2013) SURFACE: detecting convergent evolution from comparative data by fitting Ornstein-Uhlenbeck models with stepwise AIC. Methods in Ecology and Evolution 4: 416-425.
runSurface
, surfaceForward
, surfaceBackward
, surfaceSimulate
, surfaceSummary
, surfaceTreePlot
, surfaceTraitPlot
, surfaceAICPlot
#executable R code and demonstrations of the key functions can be found in the tutorial #vignette("surface_tutorial", package = "surface")
#executable R code and demonstrations of the key functions can be found in the tutorial #vignette("surface_tutorial", package = "surface")
convertTreeData
converts a phylo
-formatted tree and a data frame into formats ready to be analyzed with the ouch
functions called by surface
. convertBack
converts an ouchtree
to a data frame including regime information, and is called internally by surfaceTreePlot
. nameNodes
adds unique node labels to a phylo
tree to ensure reliable conversion between formats
convertTreeData(tree, dat) convertBack(tree, otree, regshifts) nameNodes(tree)
convertTreeData(tree, dat) convertBack(tree, otree, regshifts) nameNodes(tree)
tree |
Phylogenetic tree in |
dat |
Data frame with row names corresponding to the taxa in |
otree |
Phylogenetic tree in |
regshifts |
Named character vector of regime shifts indicating branches containing shifts (numbered corresponding to |
convertTreeData
returns a list with components otree
(a phylogenetic tree in ouchtree
format) and odata
(a data frame containing trait data, with rownames corresponding to otree@labels
). convertBack
returns a data frame containing original phenotypic data as well as regime assignments of tip taxa. nameNodes
returns the input tree, with arbitrary node names added (zzz1
, zzz2
, etc) to ensure reliable conversion between formats
Travis Ingram
Ingram, T. & Mahler, D.L. (2013) SURFACE: detecting convergent evolution from comparative data by fitting Ornstein-Uhlenbeck models with stepwise AIC. Methods in Ecology and Evolution 4: 416-425.
surfaceBackward
, surfaceForward
, surfaceTreePlot
data(surfaceDemo) tree<-surfaceDemo$tree dat<-surfaceDemo$sim$dat olist<-convertTreeData(tree,dat)
data(surfaceDemo) tree<-surfaceDemo$tree dat<-surfaceDemo$sim$dat olist<-convertTreeData(tree,dat)
Calculates AICc for a Hansen model using combined likelihoods across multiple traits
getAIC(L, np, n, AICc = TRUE) npSurface(fit)
getAIC(L, np, n, AICc = TRUE) npSurface(fit)
fit |
Fitted Hansen model object (the list returned by one iteration of |
L |
Log-likelihood of the model |
np |
Number of parameters in the model |
n |
Sample size (total number of trait values) |
AICc |
A logical indicating whether to use small-sample size corrected AIC; defaults to |
The number of parameters is calculated as p = k + (k' + 2) m, where k is the number of regime shifts, k' is the number of distinct regimes, and m is the number of traits. Note that this differs from many applications of Hansen models, in that SURFACE counts regime shifts as "parameters", modeling the complexity of both the adaptive landscape (number of regimes) and the evolutionary history of the clade (number of regime shifts). For AICc, the sample size is taken to be the total number of trait values mn, where n is the number of taxa
npSurface
returns an integer number of parameters. getAIC
returns a numeric AIC or AICc value
Travis Ingram
Ingram, T. & Mahler, D.L. (2013) SURFACE: detecting convergent evolution from comparative data by fitting Ornstein-Uhlenbeck models with stepwise AIC. Methods in Ecology and Evolution 4: 416-425.
data(surfaceDemo) tree<-surfaceDemo$tree dat<-surfaceDemo$sim$dat olist<-convertTreeData(tree,dat) otree<-olist[[1]]; odata<-olist[[2]] startmod<-startingModel(otree, odata, shifts = c("6"="b")) np<-as.numeric(npSurface(startmod[[1]])) LnL<-sum(sapply(startmod[[1]]$fit, function(x) summary(x)$loglik)) getAIC(LnL,np,n=ncol(dat)*nrow(dat),AICc=TRUE)
data(surfaceDemo) tree<-surfaceDemo$tree dat<-surfaceDemo$sim$dat olist<-convertTreeData(tree,dat) otree<-olist[[1]]; odata<-olist[[2]] startmod<-startingModel(otree, odata, shifts = c("6"="b")) np<-as.numeric(npSurface(startmod[[1]])) LnL<-sum(sapply(startmod[[1]]$fit, function(x) summary(x)$loglik)) getAIC(LnL,np,n=ncol(dat)*nrow(dat),AICc=TRUE)
ouch
Tree
Extracts the time from root of each node in an ouchtree
or hansentree
formatted phylogenetic tree; used to compute the timing of regime shifts in a Hansen model
getBranchTimes(h)
getBranchTimes(h)
h |
Fitted |
A vector of branching times
Travis Ingram
Ingram, T. & Mahler, D.L. (2013) SURFACE: detecting convergent evolution from comparative data by fitting Ornstein-Uhlenbeck models with stepwise AIC. Methods in Ecology and Evolution 4: 416-425.
data(surfaceDemo) tree<-surfaceDemo$tree dat<-surfaceDemo$sim$dat olist<-convertTreeData(tree,dat) otree<-olist[[1]] getBranchTimes(otree)
data(surfaceDemo) tree<-surfaceDemo$tree dat<-surfaceDemo$sim$dat olist<-convertTreeData(tree,dat) otree<-olist[[1]] getBranchTimes(otree)
ouch
Tree
Identifies the nodes and tip taxa descended from a given ancestor in an ouchtree
or hansentree
object. Used to test whether two ‘convergent’ regimes are actually nested when randomly placing regime shifts in a Hansen model in the function surfaceSimulate
ouchDescendants(node, otree)
ouchDescendants(node, otree)
node |
Which node in the ouchtree object to identify the descendants of |
otree |
An |
A vector of integers corresponding to the descendents (integers match the @nodes
element of the ouchtree
)
Travis Ingram
Ingram, T. & Mahler, D.L. (2013) SURFACE: detecting convergent evolution from comparative data by fitting Ornstein-Uhlenbeck models with stepwise AIC. Methods in Ecology and Evolution 4: 416-425.
data(surfaceDemo) tree<-surfaceDemo$tree dat<-surfaceDemo$sim$dat olist<-convertTreeData(tree,dat) otree<-olist[[1]] ouchDescendants(6, otree)
data(surfaceDemo) tree<-surfaceDemo$tree dat<-surfaceDemo$sim$dat olist<-convertTreeData(tree,dat) otree<-olist[[1]] ouchDescendants(6, otree)
Calculates the pairwise matching between two alternate paintings of the same phylogenetic tree. This is done by creating a half-matrix for each hansentree
object indicating whether each pairwise comparison of tip species or branches shows they are in the same regime (coded ‘1’) or different regimes (coded ‘0’). The ‘proportion matching’ value returned is the proportion of elements of the two matrices that are equal; a measure of correspondence between two Hansen models (one of which may be the ‘true’ model if data are simulated)
propRegMatch(fit1, fit2, internal = FALSE)
propRegMatch(fit1, fit2, internal = FALSE)
fit1 |
First fitted Hansen model; can be the |
fit2 |
Second fitted Hansen model; see |
internal |
A logical indicating whether internal branches should be included in the calculation of matching in addition to tip taxa; this is only possible if the two trees have identical topology; defaults to FALSE |
A single value quantifying the proportion of pairwise regime comparisons that are the same between the two models
Travis Ingram
Ingram, T. & Mahler, D.L. (2013) SURFACE: detecting convergent evolution from comparative data by fitting Ornstein-Uhlenbeck models with stepwise AIC. Methods in Ecology and Evolution 4: 416-425.
surfaceForward
, surfaceBackward
, surfaceSimulate
data(surfaceDemo) tree<-surfaceDemo$tree dat<-surfaceDemo$sim$dat olist<-convertTreeData(tree,dat) otree<-olist[[1]]; odata<-olist[[2]] startmod<-startingModel(otree, odata, shifts = c("6"="b")) startmod2<-startingModel(otree, odata, shifts = c("6"="b","17"="c")) propRegMatch(startmod[[1]]$fit, startmod2[[1]]$fit)
data(surfaceDemo) tree<-surfaceDemo$tree dat<-surfaceDemo$sim$dat olist<-convertTreeData(tree,dat) otree<-olist[[1]]; odata<-olist[[2]] startmod<-startingModel(otree, odata, shifts = c("6"="b")) startmod2<-startingModel(otree, odata, shifts = c("6"="b","17"="c")) propRegMatch(startmod[[1]]$fit, startmod2[[1]]$fit)
A wrapper to the paint
function in ouch
to ensure that regime paintings are automatically formatted for SURFACE analysis (painting the stem branch of a clade and ensuring that the root is assigned a regime)
repaint(otree, regshifts, stem = TRUE)
repaint(otree, regshifts, stem = TRUE)
otree |
Phylogenetic tree in |
regshifts |
Named character vector of regime shifts |
stem |
A logical indicating whether the painting of a clade should include the stem branch; defaults to |
A named character vector of regime assignments for each branch, as returned by paint
Travis Ingram
Butler, M.A. & King, A.A. (2004) Phylogenetic comparative analysis: a modeling approach for adaptive evolution. American Naturalist 164: 683-695.
Ingram, T. & Mahler, D.L. (2013) SURFACE: detecting convergent evolution from comparative data by fitting Ornstein-Uhlenbeck models with stepwise AIC. Methods in Ecology and Evolution 4: 416-425.
data(surfaceDemo) tree<-surfaceDemo$tree dat<-surfaceDemo$sim$dat olist<-convertTreeData(tree,dat) otree<-olist[[1]] repaint(otree, regshifts = c(c("1"="a","6"="b","17"="c")))
data(surfaceDemo) tree<-surfaceDemo$tree dat<-surfaceDemo$sim$dat olist<-convertTreeData(tree,dat) otree<-olist[[1]] repaint(otree, regshifts = c(c("1"="a","6"="b","17"="c")))
Carries out both the forward and backward phases of SURFACE's stepwise AIC routine, with sensible default behaviors.
runSurface(tree, dat, exclude = 0, aic_threshold = 0, max_steps = NULL, verbose = FALSE, plotaic = FALSE, error_skip = FALSE, only_best = FALSE, sample_shifts=FALSE, sample_threshold = 2)
runSurface(tree, dat, exclude = 0, aic_threshold = 0, max_steps = NULL, verbose = FALSE, plotaic = FALSE, error_skip = FALSE, only_best = FALSE, sample_shifts=FALSE, sample_threshold = 2)
tree |
Phylogenetic tree in |
dat |
Data frame with taxa names as rownames matching the tip labels of |
exclude |
Optionally, the proportion of the worst models (AICc scores for each shift point) to exclude in the current round of the forward phase (defaults to zero) |
aic_threshold |
Change in AICc needed to accept a candidate model as a sufficient improvement over the previous iteration of SURFACE. Defaults to zero, meaning any improvement in the AICc will be accepted; more stringent thresholds are specified using *negative* values of |
max_steps |
Maximum number of steps to allow to allow each phase to carry out (assuming the model improvement continues to exceed |
verbose |
A logical indicating whether to print progress (defaults to |
plotaic |
A logical indicating whether to plot AICc values of all candidate models at each step (defaults to |
only_best |
A logical indicating whether to only allow one pair of regimes to be collapsed at each iteration; if |
error_skip |
A logical indicating whether to skip over any candidate model that produces an error message during likelihood optimization (this is rare, but can cause an entire analysis to abort; defaults to |
sample_shifts |
A logical indicating whether to randomly sample from among the best models at each step (those within |
sample_threshold |
Number of AICc units within which to sample among candidate models that are close to as good as the best model at each step (defaults to 2, but only used if |
Carries out all steps of SURFACE, including converting data structures and running both forward and backward phases of the analysis. The default behavior should be appropriate in most circumstances, but some functionalities require using the functions surfaceForward
and surfaceBackward
that are called by runSurface
A list with two elements, fwd
and bwd
.
fwd |
The results of the forward phase, as returned by |
bwd |
The results of the backward phase, as returned by |
Travis Ingram
Butler, M.A. & King, A.A. (2004) Phylogenetic comparative analysis: a modeling approach for adaptive evolution. American Naturalist 164: 683-695.
Ingram, T. & Mahler, D.L. (2013) SURFACE: detecting convergent evolution from comparative data by fitting Ornstein-Uhlenbeck models with stepwise AIC. Methods in Ecology and Evolution 4: 416-425.
Mahler, D.L., Ingram, T., Revell, L.J. & Losos, J.B. (2013) Exceptional convergence on the macroevolutionary landscape in island lizard radiations. Science 341: 292-295.
surfaceBackward
, surfaceForward
## Not run: data(surfaceDemo) tree<-surfaceDemo$tree dat<-surfaceDemo$sim$dat result<-runSurface(tree,dat) ## End(Not run)
## Not run: data(surfaceDemo) tree<-surfaceDemo$tree dat<-surfaceDemo$sim$dat result<-runSurface(tree,dat) ## End(Not run)
Generate a model to start a SURFACE analysis, or fit specific Hansen or Brownian motion models that can be compared to the models returned by SURFACE
startingModel(otree, odata, shifts = NULL, brownian = FALSE)
startingModel(otree, odata, shifts = NULL, brownian = FALSE)
otree |
Phylogenetic tree in |
odata |
Data frame with rownames corresponding to |
shifts |
A named character vector of regime shifts. Names should correspond to |
brownian |
A logical indicating whether to return the fitted Brownian motion model for the data set by calling the |
For most analysis, this function is not accessed by the user, but is called from within surfaceForward
to initialize the run with a single-regime OU model. However, the user can optionally supply a starting model that imposes some regime shifts (e.g. if there is strong a priori reason to include them, or to evaluate how their inclusion changes the result of SURFACE analysis). If shifts
are supplied, they are always modified so that the first element codes a basal regime 'shift' c("1"="a")
. Thus, if any other element in shifts
is specified as regime "a"
, or has name "1"
, an error will be returned. startingModel
can also be used to obtain a fit (with AICc calculated after adding log-likelihoods across traits) for any hypothesized Hansen model or for Brownian motion (if brownian=TRUE
) for comparison with models returned by SURFACE
A list of length 1 containing an object with the same structure as the lists returned by each iteration of surfaceForward
and surfaceBackward
(containing elements fit
, all_aic
, aic
, savedshifts
, and n_regimes
). This allows it to be supplied as argument starting_list
in a call to surfaceForward
.
Travis Ingram
Ingram, T. & Mahler, D.L. (2013) SURFACE: detecting convergent evolution from comparative data by fitting Ornstein-Uhlenbeck models with stepwise AIC. Methods in Ecology and Evolution 4: 416-425.
data(surfaceDemo) tree<-surfaceDemo$tree dat<-surfaceDemo$sim$dat olist<-convertTreeData(tree,dat) otree<-olist[[1]]; odata<-olist[[2]] startmod<-startingModel(otree, odata, shifts = c("6"="b"))
data(surfaceDemo) tree<-surfaceDemo$tree dat<-surfaceDemo$sim$dat olist<-convertTreeData(tree,dat) otree<-olist[[1]]; odata<-olist[[2]] startmod<-startingModel(otree, odata, shifts = c("6"="b"))
Plots a line graph showing how the AICc changed over the forward and backward phases of a SURFACE analysis. surfaceAICPlot
can optionally show the change in the deviance or 'partial AICc' for each trait separately as well as for the analysis as a whole. surfaceAICMultiPlot
plots lines from multiple runs on the same plot, allowing comparison among analyses done on alternate tree topologies or with stochasticity added using sample_shifts
surfaceAICPlot(fwd = NULL, bwd = NULL, out = NULL, summ = NULL, traitplot = "none", cols = NULL, daic = FALSE, ...) surfaceAICMultiPlot(fwd = NULL, bwd = NULL, out = NULL, summ = NULL, cols = NULL, daic = FALSE, ...)
surfaceAICPlot(fwd = NULL, bwd = NULL, out = NULL, summ = NULL, traitplot = "none", cols = NULL, daic = FALSE, ...) surfaceAICMultiPlot(fwd = NULL, bwd = NULL, out = NULL, summ = NULL, cols = NULL, daic = FALSE, ...)
fwd |
List resulting from a |
bwd |
List resulting from a |
out |
List resulting from a |
summ |
Object returned by |
traitplot |
String indicating what values to use to draw lines corresponding to individual traits: |
cols |
An optional character vector of colors for the AICc lines, used to color the different runs in |
daic |
A logical indicating whether to rescale all delta-AICc (and delta-deviance) values to the value from the starting model; defaults to |
... |
Additional arguments to be passed to the |
If values are plotted on a trait-by-trait basis, either traitplot="dev"
or traitplot="aic"
can be specified. If traitplot="dev"
, the deviance (-2*log likelihood) at each step is shown for each trait. If traitplot="aic"
, a "partial AICc" at each step is shown for each of the m
traits, consisting of the deviance and 1/m of the "penalty" part of the overall AICc, where m is the number of traits. Note that this is not a proper statistical construct, but its property of adding to give the overall AICc can be useful in visualizing the patterns among traits
Plots AIC values from a SURFACE analysis on the current graphics device
Travis Ingram
Ingram, T. & Mahler, D.L. (2013) SURFACE: detecting convergent evolution from comparative data by fitting Ornstein-Uhlenbeck models with stepwise AIC. Methods in Ecology and Evolution 4: 416-425.
Mahler, D.L., Ingram, T., Revell, L.J. & Losos, J.B. (2013) Exceptional convergence on the macroevolutionary landscape in island lizard radiations. Science 341: 292-295.
surfaceForward
, surfaceBackward
, surfaceSimulate
, surfaceSummary
, surfaceTreePlot
, surfaceTraitPlot
## Not run: data(surfaceDemo) tree<-surfaceDemo$tree dat<-surfaceDemo$sim$dat result<-runSurface(tree,dat) surfaceAICPlot(result$fwd,result$bwd) ## End(Not run)
## Not run: data(surfaceDemo) tree<-surfaceDemo$tree dat<-surfaceDemo$sim$dat result<-runSurface(tree,dat) surfaceAICPlot(result$fwd,result$bwd) ## End(Not run)
Carries out the backward phase of SURFACE's stepwise AIC routine. Beginning with a fitted Hansen model produced by surfaceForward
, tests pairwise collapses of regimes and identifies collapses that improve the fit. Continues this iterative process until the model stops improving beyond the given AIC threshold
surfaceBackward(otree, odata, starting_model, aic_threshold = 0, max_steps = NULL, save_steps = FALSE, filename = "temp_back_list.R", verbose = FALSE, only_best = FALSE, plotaic = FALSE, error_skip = FALSE, sample_shifts = FALSE, sample_threshold = 2) collapseRegimes(otree, odata, oldshifts, oldaic, oldfit, aic_threshold = 0, only_best = FALSE, verbose = TRUE, plotaic = TRUE, error_skip = FALSE, sample_shifts = FALSE, sample_threshold = 2)
surfaceBackward(otree, odata, starting_model, aic_threshold = 0, max_steps = NULL, save_steps = FALSE, filename = "temp_back_list.R", verbose = FALSE, only_best = FALSE, plotaic = FALSE, error_skip = FALSE, sample_shifts = FALSE, sample_threshold = 2) collapseRegimes(otree, odata, oldshifts, oldaic, oldfit, aic_threshold = 0, only_best = FALSE, verbose = TRUE, plotaic = TRUE, error_skip = FALSE, sample_shifts = FALSE, sample_threshold = 2)
otree |
Phylogenetic tree in |
odata |
Data frame with rownames corresponding to |
starting_model |
The Hansen model to attempt regime collapses on; typically the final element of a |
aic_threshold |
Change in AICc needed to accept a candidate model as a sufficient improvement over the previous iteration of SURFACE. Defaults to zero, meaning any improvement in the AICc will be accepted; more stringent thresholds are specified using *negative* values of |
max_steps |
Maximum number of steps in the backward phase to carry out (assuming the model improvement continues to exceed |
save_steps |
A logical indicating whether to save the current iteration of the model at each step (overwriting if necessary) to a file |
filename |
Name of the file to save progress to at each step, if |
verbose |
A logical indicating whether to print progress (defaults to |
only_best |
A logical indicating whether to only allow one pair of regimes to be collapsed at each iteration; if |
plotaic |
A logical indicating whether to plot AICc values of candidate models at each step (defaults to |
error_skip |
A logical indicating whether to skip over any candidate model that produces an error message (this is rare, but can cause an entire analysis to abort; defaults to |
sample_shifts |
A logical indicating whether to sample from among the best models at each step (those within |
sample_threshold |
Number of AICc units within which to sample among candidate models that are close to as good as the best model at each step (defaults to 2, but only used if |
oldshifts |
Shifts present in the previous iteration of the Hansen model |
oldaic |
AICc value for the Hansen model from the previous iteration |
oldfit |
Previous fitted Hansen model |
Can be time-consuming, as the number of likelihood searches at a step is k(k-1)/2
, where k
is the number of regimes in the model.
collapseRegime
returns a list corresponding to one iteration of the backward phase of the SURFACE analysis; surfaceBackward
returns a list of such lists consisting of each step of the stepwise process
fit |
The fitted Hansen model selected for improving the AICc most over the previous iteration; consists of a single |
all_aic |
The AICc for each model tested during the iteration (one per pair of regimes) |
aic |
The AICc of the current Hansen model |
savedshifts |
The shifts present in the current Hansen model; represented as a named character vector of regime assignments (lower-case letters), with names indicating branches containing shifts |
Travis Ingram
Butler, M.A. & King, A.A. (2004) Phylogenetic comparative analysis: a modeling approach for adaptive evolution. American Naturalist 164: 683-695.
Ingram, T. & Mahler, D.L. (2013) SURFACE: detecting convergent evolution from comparative data by fitting Ornstein-Uhlenbeck models with stepwise AIC. Methods in Ecology and Evolution 4: 416-425.
Mahler, D.L., Ingram, T., Revell, L.J. & Losos, J.B. (2013) Exceptional convergence on the macroevolutionary landscape in island lizard radiations. Science 341: 292-295.
surfaceForward
, surfaceSimulate
, surfaceTreePlot
, surfaceSummary
## Not run: data(surfaceDemo) tree<-surfaceDemo$tree dat<-surfaceDemo$sim$dat olist<-convertTreeData(tree,dat) otree<-olist[[1]]; odata<-olist[[2]] fwd<-surfaceForward(otree, odata, aic_threshold = 0, exclude = 0, verbose = FALSE, plotaic = FALSE) k<-length(fwd) bwd<-surfaceBackward(otree, odata, starting_model = fwd[[k]], aic_threshold = 0) ## End(Not run)
## Not run: data(surfaceDemo) tree<-surfaceDemo$tree dat<-surfaceDemo$sim$dat olist<-convertTreeData(tree,dat) otree<-olist[[1]]; odata<-olist[[2]] fwd<-surfaceForward(otree, odata, aic_threshold = 0, exclude = 0, verbose = FALSE, plotaic = FALSE) k<-length(fwd) bwd<-surfaceBackward(otree, odata, starting_model = fwd[[k]], aic_threshold = 0) ## End(Not run)
This simulated tree and data set can be used to demonstrate the functionality of SURFACE. The vignette ‘surface_tutorial’ demonstrates the use of the various functions included in the package using surfaceDemo
data(surfaceDemo)
data(surfaceDemo)
A list containing a tree in phylo
format (surfaceDemo$tree
), and a list surfaceDemo$sim
, which contains trait data (surfaceDemo$sim$data
) and the other features output by surfaceSimulate
, including the generating Hansen model (surfaceDemo$sim$fit
)
simulated data
Ingram, T. & Mahler, D.L. (2013) SURFACE: detecting convergent evolution from comparative data by fitting Ornstein-Uhlenbeck models with stepwise AIC. Methods in Ecology and Evolution 4: 416-425.
data(surfaceDemo) tree<-surfaceDemo$tree dat<-surfaceDemo$sim$dat
data(surfaceDemo) tree<-surfaceDemo$tree dat<-surfaceDemo$sim$dat
Carries out the forward phase of SURFACE's stepwise AIC routine, adding regime shifts to a Hansen model. addRegime
performs one step of this analysis, and is called repeatedly by surfaceForward
. At each step, the delta-AICc of each possible shift placement (i.e. branch) is calculated, and an updated Hansen model is returned with one shift added. This process is iterated until the model stops improving beyond a threshold delta-AICc
surfaceForward(otree, odata, starting_list=NULL, starting_shifts=NULL, exclude=0, aic_threshold=0, max_steps=NULL, save_steps=FALSE, filename="temp_out_list.R", verbose=FALSE, plotaic=FALSE, error_skip=FALSE, sample_shifts=FALSE, sample_threshold=2) addRegime(otree, odata, oldshifts, oldaic, oldfit, alloldaic=NULL, exclude=NULL, aic_threshold=0, verbose=FALSE, plotaic=FALSE, error_skip=FALSE, sample_shifts=FALSE, sample_threshold=2)
surfaceForward(otree, odata, starting_list=NULL, starting_shifts=NULL, exclude=0, aic_threshold=0, max_steps=NULL, save_steps=FALSE, filename="temp_out_list.R", verbose=FALSE, plotaic=FALSE, error_skip=FALSE, sample_shifts=FALSE, sample_threshold=2) addRegime(otree, odata, oldshifts, oldaic, oldfit, alloldaic=NULL, exclude=NULL, aic_threshold=0, verbose=FALSE, plotaic=FALSE, error_skip=FALSE, sample_shifts=FALSE, sample_threshold=2)
otree |
Phylogenetic tree in |
odata |
Data frame with rownames corresponding to |
starting_list |
An optional list which may containing either a partially completed analysis (which can be built upon instead of starting over), or a custom starting model created with |
starting_shifts |
An optional named character vector of shifts that are required to be in the Hansen model, which will be passed to |
exclude |
Optionally, the proportion of the worst models (AICc scores for each shift point) to exclude in the current round (defaults to zero; values greater than 0.5 are not recommended) |
aic_threshold |
Change in AICc needed to accept a candidate model as a sufficient improvement over the previous iteration of SURFACE. Defaults to zero, meaning any improvement in the AICc will be accepted; more stringent thresholds are specified using *negative* values of |
max_steps |
Maximum number of regimes to allow to be added (assuming the model improvement continues to exceed |
save_steps |
A logical indicating whether to save the current iteration of the model at each step (overwriting previous iterations) to a file |
filename |
Name of the file to save progress to at each step, if |
verbose |
A logical indicating whether to print progress (defaults to |
plotaic |
A logical indicating whether to plot AICc values of candidate models at each step (defaults to |
error_skip |
A logical indicating whether to skip over any candidate model that produces an error message (this is rare, but can cause an entire analysis to abort; defaults to |
sample_shifts |
A logical indicating whether to sample from among the best models at each step (those within |
sample_threshold |
Number of AICc units within which to sample among candidate models that are close to as good as the best model at each step (defaults to 2, but only used if |
oldshifts |
Any shifts present in the previous iteration of the Hansen model |
oldaic |
AICc value for the Hansen model from the previous iteration |
oldfit |
Previous fitted Hansen model |
alloldaic |
AICc values for each tested shift point in the previous iteration |
Can be time-consuming, as many likelihood searches are carried out at each iteration. Depending on the number of traits and taxa and the number of regimes that are fitted, surfaceForward
can take anywhere from minutes to many hours (only tree sizes up to 128 taxa have been tested). Options to manage computation time include adding regimes one at a time with addRegime
or using max_steps
to perform the analysis several iterations at a time
addRegime
returns a list describing one iteration of the forward phase of the SURFACE analysis; surfaceForward
returns a list of such lists consisting of each step of the stepwise process
fit |
The fitted Hansen model selected for improving the AICc most over the previous iteration; consists of a single |
all_aic |
The AICc for each model tested during the iteration (numbered by branch) |
aic |
The AICc of the current Hansen model |
savedshifts |
The shifts present in the current Hansen model; represented as a named character vector of regime shifts (lower-case letters), with names indicating branches containing shifts |
n_regimes |
A two-element vector of the number of regime shifts and the number of distinct regimes in the current model |
Travis Ingram
Butler, M.A. & King, A.A. (2004) Phylogenetic comparative analysis: a modeling approach for adaptive evolution. American Naturalist 164: 683-695.
Ingram, T. & Mahler, D.L. (2013) SURFACE: detecting convergent evolution from comparative data by fitting Ornstein-Uhlenbeck models with stepwise AIC. Methods in Ecology and Evolution 4: 416-425.
Mahler, D.L., Ingram, T., Revell, L.J. & Losos, J.B. (2013) Exceptional convergence on the macroevolutionary landscape in island lizard radiations. Science 341: 292-295.
surfaceBackward
, surfaceSimulate
, surfaceTreePlot
, surfaceSummary
, convertTreeData
, startingModel
## Not run: data(surfaceDemo) tree<-surfaceDemo$tree dat<-surfaceDemo$sim$dat olist<-convertTreeData(tree,dat) otree<-olist[[1]]; odata<-olist[[2]] fwd<-surfaceForward(otree, odata, aic_threshold = 0, exclude = 0) ## End(Not run)
## Not run: data(surfaceDemo) tree<-surfaceDemo$tree dat<-surfaceDemo$sim$dat olist<-convertTreeData(tree,dat) otree<-olist[[1]]; odata<-olist[[2]] fwd<-surfaceForward(otree, odata, aic_threshold = 0, exclude = 0) ## End(Not run)
Provides several ways to simulate data sets on phylogenetic trees in conjunction with SURFACE analyses. Can simulate under simple models without regime shifts, under a Hansen model with sampled shift locations, or under a fitted Hansen model (optionally with resampled optima)
surfaceSimulate(phy, type = "BM", param = 0, n_traits = NULL, dat = NULL, vcv = NULL, hansenfit = NULL, shifts = NULL, n_shifts = NULL, n_conv_shifts = NULL, n_regimes = NULL, n_per_regime = NULL, no_nested = TRUE, optima = NULL, sample_optima = TRUE, optima_distrib = NULL, optima_type = "rnorm", sigma_squared = NULL, alpha = NULL, pshift_timefactor = NULL)
surfaceSimulate(phy, type = "BM", param = 0, n_traits = NULL, dat = NULL, vcv = NULL, hansenfit = NULL, shifts = NULL, n_shifts = NULL, n_conv_shifts = NULL, n_regimes = NULL, n_per_regime = NULL, no_nested = TRUE, optima = NULL, sample_optima = TRUE, optima_distrib = NULL, optima_type = "rnorm", sigma_squared = NULL, alpha = NULL, pshift_timefactor = NULL)
phy |
A phylogenetic tree in |
type |
Type of simulation desired - options are |
param |
If |
n_traits |
Number of traits (if not provided will be determined from other inputs or default to 1) |
dat |
Optional data frame of original trait data (function will use this to extract features of the data set) |
vcv |
Optional evolutionary variance-covariance matrix |
hansenfit |
A fitted Hansen model (or a list of such if multiple traits) (if |
shifts |
A vector of regime shifts, named for the branches they are to be placed on in the Hansen model to be simulated under (if |
n_shifts |
Number of shifts to add to the Hansen model (if |
n_conv_shifts |
Number of convergent shifts to add to the Hansen model (if |
n_regimes |
Number of regimes to add to the Hansen model (if |
n_per_regime |
Integer vector of the number of shifts to each regime in the model (if |
no_nested |
A logical indicating whether to ensure that a pair of ‘convergent’ regimes is not in fact two nested clades (if |
optima |
Optional matrix of optima |
sample_optima |
A logical indicating whether to replace the optima in the fitted model with new values from a distribution based on the inferred optima (if |
optima_distrib |
Optional matrix of optima distribution for each trait (see |
optima_type |
How to sample optima based on |
sigma_squared |
Scalar or vector of Brownian rate parameters to use in simulations |
alpha |
Scalar or vector of OU attraction parameter values to use in simulations |
pshift_timefactor |
Factor by which to bias sampling of branches to place regimes on to be earlier (if <1) or later (if >1) in the tree. The sampling probability will be |
Type of simulation may be "BM"
, "hansen-fit"
, or "hansen-paint"
.
If type = "BM"
, simulation uses the sim.char
function in geiger
, with Brownian rate sigma_squared
. If type = "BM"
, param
values other than 0 will transform the tree based on the Early Burst (param < 0
) or single-peak Ornstein-Uhlenbeck (param > 0
) model before simulating, causing trait disparity to be concentrated earlier or later in the tree, respectively
If type = "hansen-fit"
, an existing hansentree
object is used as the basis of simulation using ouch
functions, optionally with new parameter values
If type = "hansen-paint"
, a new hansentree
object is produced for simulation using ouch
functions, with specified parameter values and numbers of regimes and/or regime shifts
A list with the following components (most are NULL if type = "BM"
):
data |
Simulated trait data in a data frame |
optima |
Matrix of optima for each regime for each trait in the generating model |
savedshifts |
Shift locations in the generating Hansen model |
regimes |
Regime assignments of tip taxa |
shifttimes |
Timing of each shift in the Hansen model (measured from the root of the tree |
fit |
Generating Hansen model used in the simulation |
Travis Ingram
Ingram, T. & Mahler, D.L. (2013) SURFACE: detecting convergent evolution from comparative data by fitting Ornstein-Uhlenbeck models with stepwise AIC. Methods in Ecology and Evolution 4: 416-425.
surfaceForward
, surfaceBackward
, surfaceTreePlot
, surfaceTraitPlot
data(surfaceDemo) tree<-surfaceDemo$tree dat<-surfaceDemo$sim$dat olist<-convertTreeData(tree,dat) otree<-olist[[1]]; odata<-olist[[2]] sim<-surfaceSimulate(otree,type="hansen-paint",dat=dat,shifts=c(c("1"="a","6"="b","17"="c")))
data(surfaceDemo) tree<-surfaceDemo$tree dat<-surfaceDemo$sim$dat olist<-convertTreeData(tree,dat) otree<-olist[[1]]; odata<-olist[[2]] sim<-surfaceSimulate(otree,type="hansen-paint",dat=dat,shifts=c(c("1"="a","6"="b","17"="c")))
Extracts the most important results from the output of the forward, backward, or both phases of a SURFACE analysis
surfaceSummary(fwd = NULL, bwd = NULL)
surfaceSummary(fwd = NULL, bwd = NULL)
fwd |
A list returned by |
bwd |
A list returned by |
If both fwd
and bwd
are provided, both phases of the analysis will be summarized together
A list with the following components:
n_steps |
number of iterations in the stepwise analysis |
lnls |
matrix of traits-by-iterations, giving the log-likelihood for each trait at each iteration of the analysis |
n_regimes_seq |
matrix of the summaries of regime structure at each iteration of the model |
aics |
vector giving the AICc value at each step |
shifts |
shifts present in the final fitted Hansen model |
n_regimes |
summary of regime structure of the final fitted Hansen model (see note below) |
alpha |
estimate of alpha for each trait in the final model |
sigma_squared |
estimate of sigma_squared for each trait in the final model |
theta |
matrix of estimated optima (one per regime per trait) in the final model |
The elements n_regimes_seq
and n_regimes
contain measures of the regime structure in a SURFACE analysis (for each iteration, and in the final model, respectively). The measures returned are: k
(the number of regime shifts, counting the basal regime as 1), kprime
, (the number of regimes, some of which may be reached by multiple shifts), deltak
(k-kprime
, a measure of convergence), c
(the number of shifts to convergent regimes, another measure of convergence), kprime_conv
(the number of convergent regimes shifted to multiple times), and kprime_nonconv
(the number of nonconvergent regimes only shifted to once)
Travis Ingram
Ingram, T. & Mahler, D.L. (2013) SURFACE: detecting convergent evolution from comparative data by fitting Ornstein-Uhlenbeck models with stepwise AIC. Methods in Ecology and Evolution 4: 416-425.
surfaceForward
, surfaceBackward
## Not run: data(surfaceDemo) tree<-surfaceDemo$tree dat<-surfaceDemo$sim$dat result<-runSurface(tree,dat) surfaceSummary(result$fwd,result$bwd) ## End(Not run)
## Not run: data(surfaceDemo) tree<-surfaceDemo$tree dat<-surfaceDemo$sim$dat result<-runSurface(tree,dat) surfaceSummary(result$fwd,result$bwd) ## End(Not run)
Plotting functions to visualize the results of a SURFACE analysis, with colors depicting regime structure: surfaceTreePlot
produces a customized plot.phylo
figure, and surfaceTraitPlot
produces a scatterplot of trait values and optima
surfaceTreePlot(tree, hansenfit, cols = NULL, convcol = TRUE, labelshifts = FALSE, ...) surfaceTraitPlot(dat, hansenfit, whattraits = c(1, 2), cols = NULL, convcol = TRUE, pchs = c(21, 21), cex.opt = 2.5, optellipses = FALSE, ellipsescale = 1, flatten1D = FALSE, add = FALSE, ypos = 0, plotoptima = TRUE, plottraits = TRUE, y.lim = NULL, x.lim = NULL, y.lab = NULL, x.lab = NULL, ...)
surfaceTreePlot(tree, hansenfit, cols = NULL, convcol = TRUE, labelshifts = FALSE, ...) surfaceTraitPlot(dat, hansenfit, whattraits = c(1, 2), cols = NULL, convcol = TRUE, pchs = c(21, 21), cex.opt = 2.5, optellipses = FALSE, ellipsescale = 1, flatten1D = FALSE, add = FALSE, ypos = 0, plotoptima = TRUE, plottraits = TRUE, y.lim = NULL, x.lim = NULL, y.lab = NULL, x.lab = NULL, ...)
tree |
Phylogenetic tree in |
dat |
Trait data formatted as a data frame with named rows and at least two columns |
hansenfit |
An object containing the fitted Hansen model to use in plotting, with elements |
whattraits |
A two-element integer (or a single integer; see Details) indicating which traits to use for the (x,y) axes of a trait plot (defaults to |
cols |
An optional character vector of colors for painting branches in |
convcol |
A logical indicating whether to select separate colors for convergent (colorful) and non-convergent (greyscale) regimes (defaults to |
labelshifts |
A logical indicating whether to add integer labels to branches in the tree to show the order in which regime shifts were added in the forward phase (defaults to |
pchs |
Vector with two integers representing the plotting characters to use for trait values and optima, respectively, in |
cex.opt |
Character expansion for symbols representing the optima in |
optellipses |
A logical indicating whether to draw ellipses based on the fitted OU model instead of denoting optimum positions with |
ellipsescale |
A scalar or vector indicating how many standard deviations to draw ellipses above and below the optima; if a vector, concentric ellipses of various sizes will be drawn; defaults to 1 |
flatten1D |
A logical indicating whether all regimes should be placed on a single line when |
add |
A logical indicating whether to add a new element to an existing |
ypos |
Position on the y axis to place the traits and optima on; only applies if a single trait is used and |
plotoptima |
A logical indicating whether the optima should be displayed in |
plottraits |
A logical indicating whether the trait values should be displayed in |
y.lim |
Lower and upper limits for the y-axis; by default will be calculated to fit all points and ellipses fit in the frame |
x.lim |
Lower and upper limits for the x-axis; by default will be calculated to fit all points and ellipses fit in the frame |
y.lab |
y-axis label; defaults to the column name in the data frame |
x.lab |
x-axis label; defaults to the column name in the data frame |
... |
Additional arguments to be passed to the |
For trait plots using the option optellipses=TRUE
, note that in some cases (e.g. if alpha is very small) the ellipses will not convey useful information. If trait data are unidimensional, or if whattraits
is provided as a single integer, data will be plotted on the x-axis and the y-axis will separate different regimes (and ellipse width in the y-dimension will not be meaningful)
Creates one tree or trait plot on the current graphics device
Travis Ingram
Ingram, T. & Mahler, D.L. (2013) SURFACE: detecting convergent evolution from comparative data by fitting Ornstein-Uhlenbeck models with stepwise AIC. Methods in Ecology and Evolution 4: 416-425.
Mahler, D.L., Ingram, T., Revell, L.J. & Losos, J.B. (2013) Exceptional convergence on the macroevolutionary landscape in island lizard radiations. Science 341: 292-295.
surfaceForward
, surfaceBackward
, surfaceSimulate
, surfaceSummary
, surfaceAICPlot
data(surfaceDemo) tree<-surfaceDemo$tree dat<-surfaceDemo$sim$dat olist<-convertTreeData(tree,dat) otree<-olist[[1]]; odata<-olist[[2]] startmod<-startingModel(otree, odata, shifts = c("6"="b")) surfaceTreePlot(tree,startmod[[1]],labelshifts=TRUE,cols=c("black","red")) surfaceTraitPlot(dat,startmod[[1]],whattraits=c(1,2),cols=c("black","red"))
data(surfaceDemo) tree<-surfaceDemo$tree dat<-surfaceDemo$sim$dat olist<-convertTreeData(tree,dat) otree<-olist[[1]]; odata<-olist[[2]] startmod<-startingModel(otree, odata, shifts = c("6"="b")) surfaceTreePlot(tree,startmod[[1]],labelshifts=TRUE,cols=c("black","red")) surfaceTraitPlot(dat,startmod[[1]],whattraits=c(1,2),cols=c("black","red"))