Title: | Spatial Phylogenetic Analysis |
---|---|
Description: | Conduct various analyses on spatial phylogenetics. Use your data on an evolutionary tree and geographic distributions of the terminal taxa to compute diversity and endemism metrics, test significance with null model randomization, analyze community turnover and biotic regionalization, and perform spatial conservation prioritizations. All functions support quantitative community data in addition to binary data. |
Authors: | Matthew Kling [aut, cre, cph]
|
Maintainer: | Matthew Kling <[email protected]> |
License: | MIT + file LICENSE |
Version: | 1.0.0 |
Built: | 2025-01-24 13:47:40 UTC |
Source: | CRAN |
Nonlinear function that converts proportion of range conserved into conservation "benefit."
benefit(x, lambda = 1)
benefit(x, lambda = 1)
x |
Fraction of taxon range protected (value between 0 and 1). |
lambda |
Shape parameter. |
Value between 0 and 1.
Get example phylospatial
data set based on a phylogeny and modeled distributions of 443 moss species
across California. This data set is a coarser version of data from Kling et al. (2024). It contains
occurrence probabilities, and is available in raster or polygon spatial formats.
moss(format = "raster")
moss(format = "raster")
format |
Either "raster" (default) or "polygon" |
a phylospatial
object
Kling, Gonzalez-Ramirez, Carter, Borokini, and Mishler (2024) bioRxiv, https://doi.org/10.1101/2024.12.16.628580.
moss()
moss()
This function creates a phylospatial
object. This is the core data type in the phylospatial library, and
is a required input to most other functions in the package. The two essential components of a spatial phylogenetic object
are a phylogenetic tree and an community data set.
phylospatial( comm, tree = NULL, spatial = NULL, data_type = c("auto", "probability", "binary", "abundance", "other"), clade_fun = NULL, build = TRUE, check = TRUE, area_tol = 0.01 )
phylospatial( comm, tree = NULL, spatial = NULL, data_type = c("auto", "probability", "binary", "abundance", "other"), clade_fun = NULL, build = TRUE, check = TRUE, area_tol = 0.01 )
comm |
Community data representing the distribution of terminal taxa across sites. Can be a matrix with a column per terminal and
a row per site, a SpatRaster with one layer per terminal, or a |
tree |
Phylogeny of class phylo. Terminals whose names do not match |
spatial |
An optional |
data_type |
Character giving the data type of |
clade_fun |
Function to calculate the local community weight for a clade based on community weights for tips found in a given location.
Must be either NULL (the default, in which case the default function for the selected |
build |
Logical indicating whether |
check |
Logical indicating whether community data should be validated. Default is TRUE. |
area_tol |
Numeric value giving tolerance for variation in the area of sites. Default is |
This function formats the input data as a phylospatial
object. Beyond validating, cleaning, and restructing the data, the main operation
it performs is to compute community occurrence data for every internal clade on the tree. For a given clade and site, community data for
all the terminals in the clade are used to calculate the clade's occurrence value in the site. As described above, this calculation can
happen in various ways, depending on what type of community data you have (e.g. binary, probability, or abundance) and how you want to
summarize them. By default, the function tries to detect your data_type
and use it to automatically select an appropriate summary
function as described above, but you can override this by providing your own function to clade_fun
.
You can also disable construction of the clade community matrix columns altogether by setting build = FALSE
). This is atypical, but you
might want to use this option if you have your own distribution data data on all clades (e.g. from modeling occurrence probabilities for
clades in addition to terminal species), or if your community data comes from a previously-constructed phylospatial
object.
A phylospatial
object, which is a list containing the following elements:
Character indicating the community data type
Phylogeny of class phylo
Community matrix, including a column for every terminal taxon and every larger clade. Column order corresponds to tree edge order.
A SpatRaster
or sf
providing spatial coordinates for the rows in comm
. May be missing if no spatial data was supplied.
A community dissimilary matrix of class dist
indicating pairwise phylogenetic dissimilarity between sites. Missing unless
ps_dissim(..., add = TRUE)
is called.
# load species distribution data and phylogeny comm <- terra::rast(system.file("extdata", "moss_comm.tif", package = "phylospatial")) tree <- ape::read.tree(system.file("extdata", "moss_tree.nex", package = "phylospatial")) # construct `phylospatial` object ps <- phylospatial(comm, tree) ps
# load species distribution data and phylogeny comm <- terra::rast(system.file("extdata", "moss_comm.tif", package = "phylospatial")) tree <- ape::read.tree(system.file("extdata", "moss_tree.nex", package = "phylospatial")) # construct `phylospatial` object ps <- phylospatial(comm, tree) ps
Show a plot illustrating alternative values for the lambda
parameter in ps_prioritize. Lambda determines the shape of
the "benefit" function that determines the conservation value of protecting a given proportion of the geographic range of a
species or clade. Positive values place a higher priority on protecting additional populations of largely unprotected taxa,
whereas negative values place a higher priority on protecting additional populations of relatively well-protected taxa. The
default value used by ps_prioritize is 1.
plot_lambda(lambda = c(-1, -0.5, 0, 0.5, 2, 1))
plot_lambda(lambda = c(-1, -0.5, 0, 0.5, 2, 1))
lambda |
A vector of lambda values to plot |
Plots a figure
plot_lambda() plot_lambda(seq(0, 3, .1))
plot_lambda() plot_lambda(seq(0, 3, .1))
phylospatial
objectPlot a phylospatial
object
## S3 method for class 'phylospatial' plot(x, y = c("tree", "comm"), max_taxa = 12, ...)
## S3 method for class 'phylospatial' plot(x, y = c("tree", "comm"), max_taxa = 12, ...)
x |
|
y |
Either |
max_taxa |
Integer giving the maximum number of taxon ranges to plot if |
... |
Additional arguments passed to plotting methods, depending on |
A plot of the tree or community data.
ps <- ps_simulate() plot(ps, "tree") plot(ps, "comm")
ps <- ps_simulate() plot(ps, "tree") plot(ps, "comm")
phylospatial
objectThis function calculates pairwise phylogenetic dissimilarity between communities and returns the phylospatial
object with the dissimilarity data added as an element called dissim
. See ps_dissim for details.
ps_add_dissim(ps, method = "sorensen", ...)
ps_add_dissim(ps, method = "sorensen", ...)
ps |
|
method |
Dissimilarity metric; see ps_dissim for details. |
... |
Additional arguments passed to ps_dissim, such as |
ps
with a new dissim
element added.
ps <- ps_simulate(data_type = "prob") ps_add_dissim(ps) ps_add_dissim(ps, fun = "vegdist", method = "jaccard", endemism = TRUE)
ps <- ps_simulate(data_type = "prob") ps_add_dissim(ps) ps_add_dissim(ps, fun = "vegdist", method = "jaccard", endemism = TRUE)
This function classifies sites into areas of significant endemism according to the scheme of Mishler et al. (2014). Categorization is based on randomization quantile values for PE, RPE, and CE (which Mishler et al. call "PE on the comparison tree").
ps_canape(rand, alpha = 0.05)
ps_canape(rand, alpha = 0.05)
rand |
An object returned by running |
alpha |
Numeric value between 0 and 1 giving the one-tailed p-value threshold to use when determining significance. |
Endemism significance categories are defined as follows:
Endemism not significant: neither PE nor CE are significantly high at alpha
.
Significant neoendemism: PE or CE are significantly high at alpha
; RPE significantly low at alpha / 2
(two-tailed test).
Significant paleoendemism: PE or CE are significantly high at alpha
; RPE significantly high at alpha / 2
(two-tailed test)..
Significant mixed-endemism: PE or CE are significantly high at alpha
; RPE not significant.
Significant super-endemism: PE or CE are significantly high at alpha / 5
; RPE not significant.
An object of the same class as rand
containing a variable called "canape"
, with values 0-4
corresponding to not-significant, mixed-, super-, neo-, and paleo-endemism, respectively.
Mishler, B. D., Knerr, N., González-Orozco, C. E., Thornhill, A. H., Laffan, S. W., & Miller, J. T. (2014). Phylogenetic measures of biodiversity and neo-and paleo-endemism in Australian Acacia. Nature Communications, 5(1), 4473.
# classic CANAPE using binary data and the curveball algorithm # (note that a real analysis would require a much higher `n_rand`) set.seed(123456) ps <- moss() rand <- ps_rand(ps, metric = c("PE", "RPE", "CE"), fun = "nullmodel", method = "curveball", n_rand = 25, burnin = 10000, progress = FALSE) canape <- ps_canape(rand) terra::plot(canape)
# classic CANAPE using binary data and the curveball algorithm # (note that a real analysis would require a much higher `n_rand`) set.seed(123456) ps <- moss() rand <- ps_rand(ps, metric = c("PE", "RPE", "CE"), fun = "nullmodel", method = "curveball", n_rand = 25, burnin = 10000, progress = FALSE) canape <- ps_canape(rand) terra::plot(canape)
This function is a wrapper around canaper::cpr_rand_test()
. It only works with binary community data.
It is largely redundant with ps_rand()
and ps_canape()
, which are more flexible in supporting data sets
with non-binary community data. However, this function runs faster, and supports custom null models via
make.commsim.
ps_canaper(ps, null_model = "curveball", spatial = TRUE, ...)
ps_canaper(ps, null_model = "curveball", spatial = TRUE, ...)
ps |
phylospatial object |
null_model |
see |
spatial |
Logical: should the function return a spatial object (TRUE, default) or a vector (FALSE). |
... |
further arguments passed to |
This function runs canaper::cpr_rand_test()
; see the help for that function for details.
It also runs canaper::cpr_classify_endem()
on the result, and includes the resulting classification as an
additional variable, 'endem_type', in the output. 'endem_type' values 0-4 correspond to not-significant, neo,
paleo, mixed, and super endemesim, respectively.
A matrix
or SpatRaster
, or sf
with a column or layer for each metric.
Mishler, B. D., Knerr, N., González-Orozco, C. E., Thornhill, A. H., Laffan, S. W., & Miller, J. T. (2014). Phylogenetic measures of biodiversity and neo-and paleo-endemism in Australian Acacia. Nature Communications, 5(1), 4473.
Nitta, J. H., Laffan, S. W., Mishler, B. D., & Iwasaki, W. (2023). canaper: categorical analysis of neo‐and paleo‐endemism in R. Ecography, 2023(9), e06638.
if(requireNamespace("canaper")){ ps <- ps_simulate(data_type = "binary") terra::plot(ps_canaper(ps)$pd_obs_p_upper) }
if(requireNamespace("canaper")){ ps <- ps_simulate(data_type = "binary") terra::plot(ps_canaper(ps)$pd_obs_p_upper) }
This function calculates pairwise phylogenetic dissimilarity between communities. It works with both binary and
quantitative community data sets. A wide range of phylogentic community dissimilarity metrics are supported,
including phylogenetic Sorensen's and Jaccard's distances, turnover and nestedness components of Sorensen's distance
(Baselga & Orme, 2012), and phylogenetic versions of all community distance indices provided through the vegan
library.
The function also includes options to scale the community matrix in order to focus the analysis on endemism and/or
on proportional differences in community composition. The results from this function can be visualized using
ps_rgb or ps_regions, or used in a variety of statistical analyses.
ps_dissim( ps, method = "sorensen", fun = c("vegdist", "designdist", "chaodist"), endemism = FALSE, normalize = FALSE, ... )
ps_dissim( ps, method = "sorensen", fun = c("vegdist", "designdist", "chaodist"), endemism = FALSE, normalize = FALSE, ... )
ps |
phylospatial object. |
method |
Character indicating the dissimilarity index to use:
|
fun |
Character indicating which general distance function from the |
endemism |
Logical indicating whether community values should be divided by column totals (taxon range sizes) to derive endemism before computing distances. |
normalize |
Logical indicating whether community values should be divided by row totals (community sums) before
computing distances. If |
... |
Additional arguments passed to |
A pairwise phylogenetic dissimilarity matrix of class dist
.
Graham, C. H., & Fine, P. V. (2008). Phylogenetic beta diversity: linking ecological and evolutionary processes across space in time. Ecology Letters, 11(12), 1265-1277.
Baselga, A., & Orme, C. D. L. (2012). betapart: an R package for the study of beta diversity. Methods in Ecology and Evolution, 3(5), 808-812.
Pavoine, S. (2016). A guide through a family of phylogenetic dissimilarity measures among sites. Oikos, 125(12), 1719-1732.
# example data set: ps <- ps_simulate(n_tips = 50) # The default arguments give Sorensen's quantitative dissimilarity index # (a.k.a. Bray-Curtis distance): d <- ps_dissim(ps) # Specifying a custom formula explicitly via `designdist`; # (this is the Bray-Curtis formula, so it's equivalent to the prior example) d <- ps_dissim(ps, method = "(b+c)/(2*a+b+c)", fun = "designdist", terms = "minimum", abcd = TRUE) # Alternative arguments can specify a wide range of dissimilarity measures; # here's endemism-weighted Jaccard's dissimilarity: d <- ps_dissim(ps, method = "jaccard", endemism = TRUE)
# example data set: ps <- ps_simulate(n_tips = 50) # The default arguments give Sorensen's quantitative dissimilarity index # (a.k.a. Bray-Curtis distance): d <- ps_dissim(ps) # Specifying a custom formula explicitly via `designdist`; # (this is the Bray-Curtis formula, so it's equivalent to the prior example) d <- ps_dissim(ps, method = "(b+c)/(2*a+b+c)", fun = "designdist", terms = "minimum", abcd = TRUE) # Alternative arguments can specify a wide range of dissimilarity measures; # here's endemism-weighted Jaccard's dissimilarity: d <- ps_dissim(ps, method = "jaccard", endemism = TRUE)
This function calculates a variety of diversity and endemism metrics including Faith's phylogenetic diversity, Shannon phylogenetic entropy, Simpson phylogentic diversity, relative phylogentic diversity, richness of clades, richness of terminals (typically species), and versions of all these metrics weighted by endemism (i.e. rarity). If continuous community data (probabilities or abundances) are provided, they are used in calculations, giving quantitative versions of the classic binary metrics.
ps_diversity(ps, metric = "all", spatial = TRUE)
ps_diversity(ps, metric = "all", spatial = TRUE)
ps |
phylospatial object (created by |
metric |
Character vector containing the abbreviation for one or more diversity metrics listed in
the details below. Can also specify |
spatial |
Logical: should the function return a spatial object (TRUE, default) or a matrix (FALSE)? |
The function calculates the following metrics:
TD—Terminal Diversity, i.e. richness of terminal taxa (in many cases these are species):
TE—Terminal Endemism, i.e. total endemism-weighted diversity of terminal taxa, a.k.a. "weighted endemism":
CD—Clade Diversity, i.e. richness of taxa at all levels (equivalent to PD on a cladogram):
CE—Clade Endemism, i.e. total endemism-weighted diversity of taxa at all levels (equivalent to PE on a cladrogram):
PD—Phylogenetic Diversity:
PE—Phylogenetic Endemism, i.e. endemism-weighted PD:
RPD—Relative Phylogenetic Diversity, i.e. branch length of mean resident (equivalent to PD / CR):
RPE—Relative Phylogenetic Endemism, i.e. mean endemism-weighted branch length (equivalent to PE / CE):
ShPD—Shannon Phylogenetic Diversity, a.k.a. "phylogenetic entropy":
ShPE–Shannon phylogenetic Endemism, an endemism-weighted version of ShPD:
SiPD—Simpson Phylogenetic Diversity:
SiPE—Simpson Phylogenetic Endemism, an endemism-weighted version of SiPD:
where indexes all taxa including terminals and larger clades;
indexes terminals only;
is the occurrence value
(binary, probability, or abundance) of clade/terminal
in a given community;
is the
length of the phylogenetic branch segment unique to clade
; and
is the sum of
across all sites.
For Shannon and Simpson indices, only nonzero elements of are used,
, and
.
A matrix, sf
data frame, or SpatRaster
with a column or layer for each requested diversity metric.
Faith, D. P. (1992). Conservation evaluation and phylogenetic diversity. Biological Conservation, 61(1), 1-10.
Laffan, S. W., & Crisp, M. D. (2003). Assessing endemism at multiple spatial scales, with an example from the Australian vascular flora. Journal of Biogeography, 30(4), 511-520.
Rosauer, D. A. N., Laffan, S. W., Crisp, M. D., Donnellan, S. C., & Cook, L. G. (2009). Phylogenetic endemism: a new approach for identifying geographical concentrations of evolutionary history. Molecular Ecology, 18(19), 4061-4072.
Allen, B., Kon, M., & Bar-Yam, Y. (2009). A new phylogenetic diversity measure generalizing the Shannon index and its application to phyllostomid bats. The American Naturalist, 174(2), 236-243.
Chao, A., Chiu, C. H., & Jost, L. (2010). Phylogenetic diversity measures based on Hill numbers. Philosophical Transactions of the Royal Society B: Biological Sciences, 365(1558), 3599-3609.
Mishler, B. D., Knerr, N., González-Orozco, C. E., Thornhill, A. H., Laffan, S. W., & Miller, J. T. (2014). Phylogenetic measures of biodiversity and neo-and paleo-endemism in Australian Acacia. Nature Communications, 5(1), 4473.
Kling, M. M., Mishler, B. D., Thornhill, A. H., Baldwin, B. G., & Ackerly, D. D. (2019). Facets of phylodiversity: evolutionary diversification, divergence and survival as conservation targets. Philosophical Transactions of the Royal Society B, 374(1763), 20170397.
ps <- ps_simulate() div <- ps_diversity(ps) terra::plot(div)
ps <- ps_simulate() div <- ps_diversity(ps) terra::plot(div)
phylospatial
community dataGet phylospatial
community data
ps_get_comm(ps, tips_only = TRUE, spatial = TRUE)
ps_get_comm(ps, tips_only = TRUE, spatial = TRUE)
ps |
|
tips_only |
Logical indicating whether only the terminal taxa (TRUE, the default) or all taxa (FALSE) should be returned. |
spatial |
Logical indicating whether a spatial ( |
Either a SpatRaster
with a layer for every taxon, or an sf
data frame with a variable for every taxon,
depending on which data type was used to create ps
.
ps <- ps_simulate() # the defaults return a spatial object of terminal taxa distributions: ps_get_comm(ps) # get distributions for all taxa, as a matrix pcomm <- ps_get_comm(ps, tips_only = FALSE, spatial = FALSE)
ps <- ps_simulate() # the defaults return a spatial object of terminal taxa distributions: ps_get_comm(ps) # get distributions for all taxa, as a matrix pcomm <- ps_get_comm(ps, tips_only = FALSE, spatial = FALSE)
Perform an ordination that reduces a spatial phylogenetic data set into k
dimensions, using one of
several alternative ordination algorithms.
ps_ordinate(ps, method = c("nmds", "cmds", "pca"), k = 3, spatial = TRUE)
ps_ordinate(ps, method = c("nmds", "cmds", "pca"), k = 3, spatial = TRUE)
ps |
A |
method |
Ordination method, either "pca" (principal component analysis implemented via |
k |
Positive integer giving the desired number of output dimensions; default is |
spatial |
Logical indicating whether a spatial object (inherited from |
A matrix or spatial object with k
variables.
ps <- ps_add_dissim(ps_simulate(50, 5, 5)) ord <- ps_ordinate(ps, method = "cmds", k = 4) terra::plot(ord)
ps <- ps_add_dissim(ps_simulate(50, 5, 5)) ord <- ps_ordinate(ps, method = "cmds", k = 4) terra::plot(ord)
Create a ranking of conservation priorities using optimal or probabilistic forward stepwise selection. Prioritization accounts for the occurrence quantities for all lineages present in the site, including terminal taxa and larger clades; the evolutionary branch lengths of these lineages on the phylogeny, which represent their unique evolutionary heritage; the impact that protecting the site would have on these lineages' range-wide protection levels; the compositional complementarity between the site, other high-priority sites, and existing protected areas; the site's initial protection level; the relative cost of protecting the site; and a free parameter "lambda" determining the shape of the conservation benefit function.
ps_prioritize( ps, init = NULL, cost = NULL, lambda = 1, protection = 1, max_iter = NULL, method = c("optimal", "probable"), trans = function(x) replace(x, which(rank(-x) > 25), 0), n_reps = 100, n_cores = 1, summarize = TRUE, spatial = TRUE, progress = interactive() )
ps_prioritize( ps, init = NULL, cost = NULL, lambda = 1, protection = 1, max_iter = NULL, method = c("optimal", "probable"), trans = function(x) replace(x, which(rank(-x) > 25), 0), n_reps = 100, n_cores = 1, summarize = TRUE, spatial = TRUE, progress = interactive() )
ps |
phylospatial object. |
init |
Optional numeric vector or spatial object giving the starting protection status of each site across the study area. Values should be between 0 and 1 and represent the existing level of conservation effectiveness in each site. If this argument is not specified, it is assumed that no existing reserves are present. |
cost |
Optional numeric vector or spatial object giving the relative cost of protecting each site. Values should be positive, with greater values indicating higher cost of conserving a site. If this argument is not specified, cost is assumed to be uniform across sites. |
lambda |
Shape parameter for taxon conservation benefit function. This can be any real number. Positive values, such as the default
value |
protection |
Degree of protection of proposed new reserves (number between 0 and 1, with same meaning as |
max_iter |
Integer giving max number of iterations to perform before stopping, i.e. max number of sites to rank. |
method |
Procedure for selecting which site to add to the reserve network at each iteration:
|
trans |
A function that transforms marginal values into relative selection probabilities; only used if |
n_reps |
Number of random repetitions to do; only used if |
n_cores |
Number of compute cores to use for parallel processing; only used if |
summarize |
Logical: should summary statistics across reps (TRUE, default) or the reps themselves (FALSE) be returned? Only relevant
if |
spatial |
Logical: should the function return a spatial object (TRUE, default) or a matrix (FALSE)? |
progress |
Logical: should a progress bar be displayed? |
This function uses the forward stepwise selection algorithm of Kling et al. (2019) to generate a ranked conservation prioritization.
Prioritization begins with the starting protected lands network identified in init
, if provided. At each iteration, the marginal
conservation value of fully protecting each site is calculated, and a site is selected to be added to the reserve network. Selection can
happen either in an "optimal" or "probable" fashion as described under the method
argument. This process is repeated until all sites
are fully protected or until max_iter
has been reached, with sites selected early in the process considered higher conservation
priorities.
The benefit of the probabilistic approach is that it relaxes the potentially unrealistic assumption that protected land will actually be added in the optimal order. Since the algorithm avoids compositional redundancy between high-priority sites, the optimal approach will never place high priority on a site that has high marginal value but is redundant with a slightly higher-value site, whereas the probabilistic approach will select them at similar frequencies (though never in the same randomized run).
Every time a new site is protected as the algorithm progresses, it changes the marginal conservation value of the other sites. Marginal
value is the increase in conservation benefit that would arise from fully protecting a given site, divided by the cost of protecting the
site. This is calculated as a function of the site's current protection level, the quantitative presence probability or abundance of all
terminal taxa and larger clades present in the site, their evolutionary branch lengths on the phylogeny, the impact that protecting the
site would have on their range-wide protection levels, and the free parameter lambda
. lambda
determines the relative importance of
protecting a small portion of every taxon's range, versus fully protecting the ranges of more valuable taxa (those with longer
evolutionary branches and smaller geographic ranges).
Matrix or spatial object containing a ranking of conservation priorities. Lower rank values represent higher
conservation priorities. All sites with a lower priority than max_iter
have a rank value equal to the number
of sites in the input data set (i.e. the lowest possible priority).
method = "optimal"
. the result contains a single variable "priority" containing the ranking.
method = "probable"
and summarize = TRUE
, the "priority" variable gives the average rank across reps, variables labeled "pctX" give the Xth percentile of the rank distribution for each site, variables labeled "topX" give the proportion of reps in which a site was in the top X highest-priority sites, and variables labeled "treX" give a ratio representing "topX" relative to the null expectation of how often "topX" should occur by chance alone.
method = "probable"
and summarize = FALSE
, the result contains the full set of n_rep
solutions,
each representing the the ranking, with low values representing higher priorities..
Kling, M. M., Mishler, B. D., Thornhill, A. H., Baldwin, B. G., & Ackerly, D. D. (2019). Facets of phylodiversity: evolutionary diversification, divergence and survival as conservation targets. Philosophical Transactions of the Royal Society B, 374(1763), 20170397.
# simulate a toy `phylospatial` data set set.seed(123) ps <- ps_simulate() # basic prioritization p <- ps_prioritize(ps) # specifying locations of initial protected areas # (can be binary, or can be continuous values between 0 and 1) # here we'll create an `init` raster with arbitrary values ranging from 0-1, # using the reference raster layer that's part of our `phylospatial` object protected <- terra::setValues(ps$spatial, seq(0, 1, length.out = 400)) cost <- terra::setValues(ps$spatial, rep(seq(100, 20, length.out = 20), 20)) p <- ps_prioritize(ps, init = protected, cost = cost) # using probabilistic prioritization p <- ps_prioritize(ps, init = protected, cost = cost, method = "prob", n_reps = 1000, max_iter = 10) terra::plot(p$top10)
# simulate a toy `phylospatial` data set set.seed(123) ps <- ps_simulate() # basic prioritization p <- ps_prioritize(ps) # specifying locations of initial protected areas # (can be binary, or can be continuous values between 0 and 1) # here we'll create an `init` raster with arbitrary values ranging from 0-1, # using the reference raster layer that's part of our `phylospatial` object protected <- terra::setValues(ps$spatial, seq(0, 1, length.out = 400)) cost <- terra::setValues(ps$spatial, rep(seq(100, 20, length.out = 20), 20)) p <- ps_prioritize(ps, init = protected, cost = cost) # using probabilistic prioritization p <- ps_prioritize(ps, init = protected, cost = cost, method = "prob", n_reps = 1000, max_iter = 10) terra::plot(p$top10)
This function compares to diversity metrics calculated in ps_diversity to their null distributions computed by randomizing the community matrix. Randomization is done using the quantize method for community matrices containing continuous quantities such as occurrence probabilities or abundances.
ps_rand( ps, metric = "all", fun = "quantize", method = "curveball", n_rand = 100, spatial = TRUE, n_cores = 1, progress = interactive(), ... )
ps_rand( ps, metric = "all", fun = "quantize", method = "curveball", n_rand = 100, spatial = TRUE, n_cores = 1, progress = interactive(), ... )
ps |
|
metric |
Character vector giving one or more diversity metrics to calculate; see ps_diversity
for options. Can also specify |
fun |
Null model function to use. Must be either "quantize", "nullmodel", or an actual function:
|
method |
One of the method options listed under commsim. If |
n_rand |
Integer giving the number of random communities to generate. |
spatial |
Logical: should the function return a spatial object (TRUE, default) or a matrix (FALSE). |
n_cores |
Integer giving the number of compute cores to use for parallel processing. |
progress |
Logical: should a progress bar be displayed? |
... |
Additional arguments passed to quantize, simulate.nullmodel, or custom function
|
A matrix with a row for every row of x
, a column for every metric specified in metric
, and
values indicating the proportion of randomizations in which the observed diversity metric was greater than
the randomized metric. Or if spatial = TRUE
, a sf
or SpatRaster
object containing these data.
# simulate a `phylospatial` data set and run randomization with default settings ps <- ps_simulate(data_type = "prob") rand <- ps_rand(ps) # using the default `quantize` function, but with alternative arguments rand <- ps_rand(ps, transform = sqrt, n_strata = 4, priority = "rows") # using binary data ps2 <- ps_simulate(data_type = "binary") rand <- ps_rand(ps2, fun = "nullmodel", method = "r2") # using abundance data, and demonstrating alternative metric choices ps3 <- ps_simulate(data_type = "abund") rand <- ps_rand(ps3, metric = c("ShPD", "SiPD"), fun = "nullmodel", method = "abuswap_c") rand
# simulate a `phylospatial` data set and run randomization with default settings ps <- ps_simulate(data_type = "prob") rand <- ps_rand(ps) # using the default `quantize` function, but with alternative arguments rand <- ps_rand(ps, transform = sqrt, n_strata = 4, priority = "rows") # using binary data ps2 <- ps_simulate(data_type = "binary") rand <- ps_rand(ps2, fun = "nullmodel", method = "r2") # using abundance data, and demonstrating alternative metric choices ps3 <- ps_simulate(data_type = "abund") rand <- ps_rand(ps3, metric = c("ShPD", "SiPD"), fun = "nullmodel", method = "abuswap_c") rand
Perform a clustering analysis that categorizes sites into biogeographic regions based on phylogenetic community compositional similarity.
ps_regions(ps, k = 5, method = "average", endemism = FALSE, normalize = TRUE)
ps_regions(ps, k = 5, method = "average", endemism = FALSE, normalize = TRUE)
ps |
A |
k |
Number of spatial clusters to divide the region into (positive integer). See ps_regions_eval to help choose a value of k by comparing the variance explained by different numbers of regions. |
method |
Clustering method. Options include all methods listed under hclust, and |
endemism |
Logical indicating whether community values should be divided by column totals (taxon range sizes)
to derive endemism. Only used if |
normalize |
Logical indicating whether community values should be divided by row totals (community sums). If |
A raster or matrix with an integer indicating which of the k
regions each site belongs to.
Daru, B. H., Elliott, T. L., Park, D. S., & Davies, T. J. (2017). Understanding the processes underpinning patterns of phylogenetic regionalization. Trends in Ecology & Evolution, 32(11), 845-860.
ps <- ps_simulate() # using kmeans clustering algorithm terra::plot(ps_regions(ps, method = "kmeans")) # to use a hierarchical clustering method, first we have to `ps_add_dissim()` terra::plot(ps_regions(ps_add_dissim(ps), k = 7, method = "average"))
ps <- ps_simulate() # using kmeans clustering algorithm terra::plot(ps_regions(ps, method = "kmeans")) # to use a hierarchical clustering method, first we have to `ps_add_dissim()` terra::plot(ps_regions(ps_add_dissim(ps), k = 7, method = "average"))
This function compares multiple potential values for k
, the number of clusters in to use in ps_regions()
,
to help you decide how well different numbers of regions fit your data set. For each value of k
, it performs
a cluster analysis and calculates the proportion of total variance explained (SSE, the sum of squared pairwise
distances explained). It also calculates second-order metrics of the relationship between k and SSE. While
many data sets have no optimal value of k
and the choice is often highly subjective, these evaluation metrics
can help you identify potential points where the variance explained stops increasing quickly as k
increases.
ps_regions_eval(ps, k = 1:20, plot = TRUE, ...)
ps_regions_eval(ps, k = 1:20, plot = TRUE, ...)
ps |
A |
k |
Vector of positive integers giving possible values for |
plot |
Logical indicating whether to print a plot of the results ( |
... |
Further arguments passed to ps_regions. |
The function generates a data frame with the following columns. If plot = FALSE
the data frame is
returned, otherwise the function prints a plot of the latter variables as a function of k
:
"k": The number of clusters.
"sse": The proportion of total variance explained, with variance defined as squared pairwise community phylogenetic dissimilarity between sites.
"curvature": The local second derivative. Lower (more negative) values indicate more attractive break-point values of k.
"dist11": The distance from the point to the 1:1 line on a plot of k
vs sse
in which k values over the
interval from 1 to the number of sites are rescaled to the unit interval. Higher values indicate more
attractive values for k.
ps <- ps_add_dissim(ps_simulate()) ps_regions_eval(ps, k = 1:15, plot = TRUE)
ps <- ps_add_dissim(ps_simulate()) ps_regions_eval(ps, k = 1:15, plot = TRUE)
Perform an ordination that reduces a spatial phylogenetic data set into three dimensions that can be
plotted as the RGB bands of color space to visualize spatial patterns of community phylogenetic composition.
This function is a wrapper around ps_ordinate()
.
ps_rgb(ps, method = c("nmds", "cmds", "pca"), trans = identity, spatial = TRUE)
ps_rgb(ps, method = c("nmds", "cmds", "pca"), trans = identity, spatial = TRUE)
ps |
A |
method |
Ordination method, either "pca" (principal component analysis implemented via |
trans |
A function giving a transformation to apply to each dimension of the ordinated data.
The default is the identity function. Specifying |
spatial |
Logical indicating whether a spatial object (inherited from |
A matrix or spatial object with three variables containing RGB color values in the range 0-1.
ps <- ps_add_dissim(moss()) RGB <- ps_rgb(ps, method = "cmds") terra::plotRGB(RGB * 255, smooth = FALSE)
ps <- ps_add_dissim(moss()) RGB <- ps_rgb(ps, method = "cmds") terra::plotRGB(RGB * 255, smooth = FALSE)
This function generates a simple phylospatial
object that can be used for testing other functions
in the package. It is not intended to be realistic.
ps_simulate( n_tips = 10, n_x = 20, n_y = 20, data_type = c("probability", "binary", "abundance"), spatial_type = c("raster", "none"), seed = NULL )
ps_simulate( n_tips = 10, n_x = 20, n_y = 20, data_type = c("probability", "binary", "abundance"), spatial_type = c("raster", "none"), seed = NULL )
n_tips |
Number of terminals on phylogeny. |
n_x |
Number of raster cells in x dimension of landscape. |
n_y |
Number of raster cells in y dimension of landscape. |
data_type |
Community data type for simulated ranges: either "probability" (default), "binary", or "abundance". |
spatial_type |
Either "raster" or "none". |
seed |
Optional integer to seed random number generator. |
phylospatial
object, comprising a random phylogeny and community matrix in which each terminal has a
circular geographic range with a random radius and location. The spatial reference data is a SpatRaster.
# using all the defaults ps_simulate() # specifying some arguments plot(ps_simulate(n_tips = 50, n_x = 30, n_y = 40, data_type = "abundance"), "comm")
# using all the defaults ps_simulate() # specifying some arguments plot(ps_simulate(n_tips = 50, n_x = 30, n_y = 40, data_type = "abundance"), "comm")
This is a community null model method for quantitative community data (e.g. abundance or occurrence probability).
It is designed to adapt binary null model algorithms for use with quantitative data, which can be useful if
there is not a quantitative-specific algorithm available that has the desired properties. For example, use with
the binary "curveball" algorithm preserves row and column totals, and also approximately preserves the marginal
distributions of rows and columns. For each randomization, the data set is split into strata representing numerical ranges
of the input quantities, a separate binary randomization is done for each stratum, and the results are
combined to produce a randomized, quantitative community matrix. See vegan::commsim()
for details about
other binary and quantitative null models.
quantize(x, method = "curveball", ...)
quantize(x, method = "curveball", ...)
x |
Community matrix with species in rows, sites in columns, and nonnegative quantities in cells. |
method |
Null model algorithm, passed to |
... |
Additional arguments, including:
|
A randomized version of x
.
# example quantitative community matrix comm <- ps_get_comm(moss("polygon"), tips_only = TRUE, spatial = FALSE)[1:50, 1:50] # examples of different quantize usage rand <- quantize(comm) rand <- quantize(comm, n_strata = 4, transform = sqrt, priority = "rows") rand <- quantize(comm, method = "swap", burnin = 10) # (note: this `burnin` value is far too small for a real analysis)
# example quantitative community matrix comm <- ps_get_comm(moss("polygon"), tips_only = TRUE, spatial = FALSE)[1:50, 1:50] # examples of different quantize usage rand <- quantize(comm) rand <- quantize(comm, n_strata = 4, transform = sqrt, priority = "rows") rand <- quantize(comm, method = "swap", burnin = 10) # (note: this `burnin` value is far too small for a real analysis)
Convert a site-by-variable matrix into a SpatRaster or sf object
to_spatial(m, template)
to_spatial(m, template)
m |
Matrix or vector. |
template |
|
SpatRaster
with a layer for every column in m
, or sf
data frame with
a variable for every column in m
, depending on the data type of template
.
ps <- moss() to_spatial(ps$comm[, 1:5], ps$spatial)
ps <- moss() to_spatial(ps$comm[, 1:5], ps$spatial)