Title: | Fast Likelihood Calculation for Phylogenetic Comparative Models |
---|---|
Description: | Provides a C++ backend for multivariate phylogenetic comparative models implemented in the R-package 'PCMBase'. Can be used in combination with 'PCMBase' to enable fast and parallel likelihood calculation. Implements the pruning likelihood calculation algorithm described in Mitov et al. (2018) <arXiv:1809.09014>. Uses the 'SPLITT' C++ library for parallel tree traversal described in Mitov and Stadler (2018) <doi:10.1111/2041-210X.13136>. |
Authors: | Venelin Mitov [aut, cre, cph] (<https://venelin.github.io>) |
Maintainer: | Venelin Mitov <[email protected]> |
License: | GPL (>= 3.0) |
Version: | 0.1.9 |
Built: | 2024-12-09 07:03:34 UTC |
Source: | CRAN |
A dataset containing three triplets of trees, trait values and models, used to evaluate the likelihood calculation times of the R and C++ implementations.
benchmarkData
A data frame with 4 rows and 8 variables:
phylogenetic tree (phylo) with set edge.regimes member
MGPM model used to simulate the data in X
trait values
log-likelihood value
a random BM model
log-likelihood value for modelBM
a random OU model
log-likelihood value for modelOU
Results from running a performance benchmark on a personal computer including the time for parameter transformation
benchmarkResults
A data.table
Results from running a performance benchmark on a personal computer excluding the time for parameter transformation
benchmarkResultsNoTransform
A data.table
A log-likelihood calculation time comparison for different numbers of traits and option-sets
BenchmarkRvsCpp(ks = c(1, 2, 4, 8), includeR = TRUE, includeTransformationTime = TRUE, optionSets = NULL, includeParallelMode = TRUE, doProf = FALSE, RprofR.out = "RprofR.out", RprofCpp.out = "RprofCpp.out", verbose = FALSE)
ks |
a vector of positive integers, denoting different numbers of traits.
Default: |
includeR |
logical (default TRUE) indicating if likelihood calculations in R should be included in the benchmark (can be slow). |
includeTransformationTime |
logical (default TRUE) indicating if the time for
|
optionSets |
a named list of lists of PCM-options. If NULL (the default)
the option set is set to |
includeParallelMode |
logical (default TRUE) indicating if the default
optionSet should include parallel execution modes, i.e. setting the option
PCMBase.Lmr.mode to 21 instead of 11. This argument is taken into account
only with the argument |
doProf |
logical indicating if profiling should be activated (see Rprof
from the utils R-package). Default: FALSE. Additional arguments to Rprof can
be specified by assigning lists of arguments to the options 'PCMBaseCpp.ArgsRprofR'
and 'PCMBaseCpp.ArgsRprofCpp'. The default value for both options is
|
RprofR.out, RprofCpp.out |
character strings indicating Rprof.out files for the R and Cpp implementations; ignored if doProf is FALSE. Default values: 'RprofR.out' and 'RprofCpp.out'. |
verbose |
logical indicating if log-messages should be printed to the console during the benchmark. Default FALSE. |
a data.table of results similar to the data.table returned from MiniBenchmarkRvsCpp, with additional columns for k, option-set and the type of model.
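As a rough illustration of the optionSets and doProf arguments described above, the sketch below builds a named list of lists of PCM-options and sets the two profiling options. The labels "serial" and "parallel" are hypothetical names chosen for this example, and passing line.profiling on to Rprof is an assumption about how the extra arguments are used; the mode values 11 and 21 are the ones mentioned in the argument descriptions.

```r
# Hypothetical option sets: the list names are illustrative labels only.
optionSets <- list(
  serial   = list(PCMBase.Lmr.mode = 11),
  parallel = list(PCMBase.Lmr.mode = 21))

# Assumption: these lists are forwarded as extra arguments to utils::Rprof().
oldOptions <- options(
  PCMBaseCpp.ArgsRprofR   = list(line.profiling = TRUE),
  PCMBaseCpp.ArgsRprofCpp = list(line.profiling = TRUE))

argsRprofR <- getOption("PCMBaseCpp.ArgsRprofR")

# A benchmark call could then look like this (not run here, since it
# requires the PCMBase and PCMBaseCpp packages):
# BenchmarkRvsCpp(ks = c(1, 2), includeR = FALSE,
#                 optionSets = optionSets, doProf = TRUE)

# Restore the previous option values.
options(oldOptions)
```

Restoring the options afterwards keeps the profiling settings from leaking into later code in the same session.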
Evaluate the likelihood calculation times for example trees and data
MiniBenchmarkRvsCpp(data = PCMBaseCpp::benchmarkData, includeR = TRUE, includeTransformationTime = TRUE, nRepsCpp = 10L, listOptions = list(PCMBase.Lmr.mode = 11, PCMBase.Threshold.EV = 0, PCMBase.Threshold.SV = 0), doProf = FALSE, RprofR.out = "RprofR.out", RprofCpp.out = "RprofCpp.out")
data |
a 'data.frame' with at least the following columns:
Defaults to 'benchmarkData', a small data.table included with the PCMBaseCpp package. |
includeR |
logical (default TRUE) indicating if likelihood calculations in R should be included in the benchmark (can be slow). |
includeTransformationTime |
logical (default TRUE) indicating if the time for
|
nRepsCpp |
number of repetitions for the C++ likelihood calculation calls; a bigger value increases the precision of the time estimate at the expense of a longer running time for the benchmark. Defaults to 10. |
listOptions |
options to set before measuring the calculation times. Defaults to 'list(PCMBase.Lmr.mode = 11, PCMBase.Threshold.EV = 0, PCMBase.Threshold.SV = 0)'. 'PCMBase.Lmr.mode' corresponds to the parallel traversal mode for the tree traversal algorithm (see this page for possible values). |
doProf |
logical indicating if profiling should be activated (see Rprof
from the utils R-package). Default: FALSE. Additional arguments to Rprof can
be specified by assigning lists of arguments to the options 'PCMBaseCpp.ArgsRprofR'
and 'PCMBaseCpp.ArgsRprofCpp'. The default value for both options is
|
RprofR.out, RprofCpp.out |
character strings indicating Rprof.out files for the R and Cpp implementations; ignored if doProf is FALSE. Default values: 'RprofR.out' and 'RprofCpp.out'. |
a data.frame.
library(PCMBase)
library(PCMBaseCpp)
library(data.table)

testData <- PCMBaseCpp::benchmarkData[1]

# original MGPM model
MiniBenchmarkRvsCpp(data = testData)

# original MGPM model and parallel mode
MiniBenchmarkRvsCpp(
  data = testData,
  listOptions = list(PCMBase.Lmr.mode = 21, PCMBase.Threshold.EV = 1e-9, PCMBase.Threshold.SV = 1e-9))

# single-trait data, original MGPM model and single mode and enabled option
# PCMBase.Use1DClasses
MiniBenchmarkRvsCpp(
  data = PCMBaseCpp::benchmarkData[1, list(
    tree,
    X = lapply(X, function(x) x[1,, drop=FALSE]),
    model = lapply(model, function(m) PCMExtractDimensions(m, dims = 1)))],
  listOptions = list(
    PCMBase.Lmr.mode = 11,
    PCMBase.Threshold.EV = 1e-9,
    PCMBase.Threshold.SV = 1e-9,
    PCMBase.Use1DClasses = FALSE))
Converts the logical matrix pc into a list of vectors denoting the (0-based) TRUE-indices in each column
PCListInt(pc)
pc |
a logical matrix. |
a list
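The conversion described above can be mimicked in base R. The sketch below is not the package's C++ implementation; pc_list_int_sketch is a hypothetical stand-in written only to show what "0-based TRUE-indices in each column" means, and its handling of columns without any TRUE entry (an empty integer vector) is an assumption.

```r
# Hypothetical base-R stand-in for PCListInt(); not the package's code.
pc_list_int_sketch <- function(pc) {
  stopifnot(is.matrix(pc), is.logical(pc))
  # For each column, take the 1-based positions of TRUE and shift to 0-based.
  lapply(seq_len(ncol(pc)), function(j) which(pc[, j]) - 1L)
}

# A 3 x 3 logical matrix, filled column by column.
pc <- matrix(c(TRUE,  FALSE, TRUE,    # column 1
               FALSE, FALSE, FALSE,   # column 2
               TRUE,  TRUE,  TRUE),   # column 3
             nrow = 3)

res <- pc_list_int_sketch(pc)
```

Here res holds one integer vector per column: the 0-based indices 0 and 2 for the first column, an empty vector for the second, and 0, 1, 2 for the third.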
This function is used during unit testing to disable unit tests that run extremely long or consistently fail on some systems.
PCMBaseCppIsADevRelease()
a logical
Replace calls to PCMInfo() with this method in order to use C++ for likelihood calculation.
PCMInfoCpp(X, tree, model, SE = matrix(0, PCMNumTraits(model), PCMTreeNumTips(tree)), metaI = PCMInfo(X = X, tree = tree, model = model, SE = SE, verbose = verbose, preorder = PCMTreePreorderCpp(tree)), verbose = FALSE, ...)
X |
a |
tree |
a phylo object with N tips. |
model |
an S3 object specifying both the model type (class, e.g. "OU") and the concrete model parameter values at which the likelihood is to be calculated (see also Details). |
SE |
a k x N matrix specifying the standard error for each measurement in
X. Alternatively, a k x k x N cube specifying an upper triangular k x k
Choleski factor of the variance covariance matrix for the measurement error
for each node i=1, ..., N.
Default: |
metaI |
a list returned from a call to |
verbose |
logical indicating if some debug-messages should be printed. Default: FALSE |
... |
passed to methods. |
a list to be passed to PCMLik as argument metaI.
metaICpp <- PCMInfoCpp(
  PCMBase::PCMBaseTestObjects$traits.a.123,
  PCMBase::PCMBaseTestObjects$tree.a,
  PCMBase::PCMBaseTestObjects$model.a.123)

PCMBase::PCMLik(
  PCMBase::PCMBaseTestObjects$traits.a.123,
  PCMBase::PCMBaseTestObjects$tree.a,
  PCMBase::PCMBaseTestObjects$model.a.123,
  metaI = metaICpp)
Get a vector with all model parameters unrolled
PCMParamGetFullVector(model, ...)
model |
a PCM model object |
... |
passed to methods |
a numerical vector
PCMParamGetFullVector(PCMBase::PCMBaseTestObjects$model.a.123)
Fast preorder of the edges in a tree
PCMTreePreorderCpp(tree)
tree |
a phylo object |
an integer vector containing indices of rows in tree$edge in their preorder order.
PCMTreePreorderCpp(PCMBase::PCMBaseTestObjects$tree.a)
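To make the notion of a preorder over edge rows concrete, here is a minimal base-R sketch of a depth-first traversal over an edge matrix shaped like tree$edge (parent node in column 1, child node in column 2). The function edge_preorder_sketch is hypothetical and is not the package's C++ code; the order in which PCMTreePreorderCpp visits sibling edges may differ from this sketch.

```r
# Hypothetical helper, not part of PCMBaseCpp: returns row indices of `edge`
# so that every edge comes after the edge leading to its parent node.
edge_preorder_sketch <- function(edge) {
  # The root is the only node that never appears as a child.
  root <- setdiff(edge[, 1], edge[, 2])
  stopifnot(length(root) == 1L)
  ord <- integer(0)
  # Stack of edge rows still to visit; the top of the stack is the last element.
  stack <- rev(which(edge[, 1] == root))
  while (length(stack) > 0L) {
    i <- stack[length(stack)]
    stack <- stack[-length(stack)]
    ord <- c(ord, i)
    # Push the edges descending from the child node of edge i.
    stack <- c(stack, rev(which(edge[, 1] == edge[i, 2])))
  }
  ord
}

# Tree ((1,2),3) with root node 4 and internal node 5; rows deliberately
# listed out of preorder.
edge <- rbind(c(5L, 1L), c(4L, 3L), c(4L, 5L), c(5L, 2L))
ord <- edge_preorder_sketch(edge)
```

In this example the edges leaving the root (rows 2 and 3) come first, followed by the two edges below node 5 (rows 1 and 4).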