Title: | Batch Experiments for 'mlr3' |
---|---|
Description: | Extends the 'mlr3' package with a connector to the package 'batchtools'. This allows running large-scale benchmark experiments on scheduled high-performance computing clusters. |
Authors: | Marc Becker [cre, aut], Michel Lang [aut], Toby Hocking [ctb] |
Maintainer: | Marc Becker <[email protected]> |
License: | LGPL-3 |
Version: | 0.2.0 |
Built: | 2024-11-25 19:32:24 UTC |
Source: | CRAN |
Extends the 'mlr3' package with a connector to the package 'batchtools'. This allows running large-scale benchmark experiments on scheduled high-performance computing clusters.
Maintainer: Marc Becker <[email protected]> (ORCID)
Authors: Michel Lang <[email protected]> (ORCID)
Other contributors: Toby Hocking (ORCID) [contributor]
Useful links: Report bugs at https://github.com/mlr-org/mlr3batchmark/issues
This function makes it possible to leave the mlr3 interface for computing benchmark experiments and to switch over to batchtools for more fine-grained control over the execution.
batchmark() populates a batchtools::ExperimentRegistry with jobs in a mlr3::benchmark() fashion.
Each combination of mlr3::Task and mlr3::Resampling defines a batchtools::Problem, and each mlr3::Learner is a batchtools::Algorithm.
After the jobs have been submitted and have terminated, the results can be collected with reduceResultsBatchmark(), which returns a mlr3::BenchmarkResult and thus returns to the mlr3 interface.
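The following is a minimal sketch of how this mapping can be inspected with batchtools; it assumes a registry reg that has already been populated by batchmark(), as in the example further down.

# Inspect how batchmark() mapped the design onto batchtools concepts.
# Assumes 'reg' is an ExperimentRegistry already populated by batchmark().
library(batchtools)

# one row per problem/algorithm combination, with the number of jobs
summarizeExperiments(by = c("problem", "algorithm"), reg = reg)

# full job table with one row per job
getJobTable(reg = reg)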
batchmark( design, store_models = FALSE, reg = batchtools::getDefaultRegistry(), renv_project = NULL )
design | (data.frame()) Design of the benchmark experiments, e.g. created with mlr3::benchmark_grid(); one row per combination of task, learner, and resampling. |
store_models | (logical(1)) Store the fitted models in the resulting objects? |
reg | (batchtools::ExperimentRegistry) Registry to populate. Defaults to the last created registry. |
renv_project | (character(1)) Path to an renv project to be activated on the workers; NULL (default) disables this. |
Returns a data.table::data.table() with the ids of the created jobs (invisibly).
tasks = list(mlr3::tsk("iris"), mlr3::tsk("sonar"))
learners = list(mlr3::lrn("classif.featureless"), mlr3::lrn("classif.rpart"))
resamplings = list(mlr3::rsmp("cv", folds = 3), mlr3::rsmp("holdout"))

design = mlr3::benchmark_grid(
  tasks = tasks,
  learners = learners,
  resamplings = resamplings
)

reg = batchtools::makeExperimentRegistry(NA)
batchmark(design, reg = reg)
batchtools::submitJobs(reg = reg)
reduceResultsBatchmark(reg = reg)
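The example above executes locally. To run the jobs on a scheduled cluster instead, the registry's cluster functions can be pointed at the scheduler before submitting. The following is a minimal sketch for Slurm; the template name ("slurm-simple") and the resource names (walltime, memory, ncpus) are assumptions that depend on the batchtools template configured at your site.

# Sketch: submit the jobs created by batchmark() to a Slurm scheduler.
library(batchtools)

# use a Slurm template instead of the default local execution
reg$cluster.functions = makeClusterFunctionsSlurm(template = "slurm-simple")

# resources are filled into the template; their names depend on the template
batchtools::submitJobs(
  resources = list(walltime = 3600, memory = 4096, ncpus = 1),
  reg = reg
)

# block until the scheduler reports all jobs as terminated
waitForJobs(reg = reg)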
Collect the results from jobs defined via batchmark()
and combine them into a mlr3::BenchmarkResult.
Note that ids defaults to the finished jobs (as reported by batchtools::findDone()).
With this default, jobs that threw an error, expired, or are still running are ignored.
Simply leaving such jobs out of an analysis is not statistically sound.
Instead, try to robustify your jobs by using a fallback learner (cf. mlr3::Learner).
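As a sketch of that advice, the snippet below encapsulates a learner with a featureless fallback so that failing models still produce predictions, and shows how failed jobs can be inspected with batchtools instead of being silently dropped. The encapsulate() method call assumes a recent mlr3 version; older versions set the learner's $encapsulate and $fallback fields instead.

# Robustify a learner with encapsulation and a featureless fallback
# (API assumed from recent mlr3 versions).
library(mlr3)
learner = lrn("classif.rpart")
learner$encapsulate("evaluate", fallback = lrn("classif.featureless"))

# After submitJobs(): list jobs that errored and read their messages.
library(batchtools)
failed = findErrors(reg = reg)
getErrorMessages(failed, reg = reg)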
reduceResultsBatchmark( ids = NULL, store_backends = TRUE, reg = batchtools::getDefaultRegistry(), fun = NULL, unmarshal = TRUE )
ids | (data.frame() or integer()) Ids of the jobs to collect; defaults to the finished jobs as reported by batchtools::findDone(). |
store_backends | (logical(1)) Keep the DataBackend of the tasks in the result? Set to FALSE to reduce the object size. |
reg | (batchtools::ExperimentRegistry) Registry to collect the results from. Defaults to the last created registry. |
fun | (function) Function applied to each result before it is collected; NULL (default) keeps the results unchanged. |
unmarshal | (logical(1)) Whether to unmarshal learners that were marshaled for the workers. |
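A short usage sketch, assuming reg is the registry from the batchmark() example above and the jobs have finished:

# Collect finished jobs into a BenchmarkResult and aggregate scores.
library(batchtools)
ids = findDone(reg = reg)
bmr = reduceResultsBatchmark(ids, store_backends = FALSE, reg = reg)
bmr$aggregate(mlr3::msr("classif.ce"))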