Title: | Unified Interface to Parallelization Back-Ends |
---|---|
Description: | Unified parallelization framework for multiple back-ends, designed for internal package and interactive usage. The main operation is parallel mapping over lists. Supports 'local', 'multicore', 'mpi' and 'BatchJobs' modes. Allows tagging of the parallel operation with a level name that can later be selected by the user to switch on parallel execution for exactly this operation. |
Authors: | Bernd Bischl [cre, aut], Michel Lang [aut], Patrick Schratz [aut] |
Maintainer: | Bernd Bischl <[email protected]> |
License: | BSD_2_clause + file LICENSE |
Version: | 1.5.1 |
Built: | 2025-01-03 07:11:04 UTC |
Source: | CRAN |
Ensures that the objects are exported to the slave processes so
that they can be used in a job function that is later run with
parallelMap().
parallelExport( ..., objnames, master = TRUE, level = NA_character_, show.info = NA )
... | |
objnames | |
master | |
level | |
show.info | |
Nothing.
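A minimal sketch of exporting an object by name. The object `lookup` and the function `get.value` are hypothetical names chosen for illustration. The sketch runs in local mode so that it works anywhere; there the object is visible to the job function anyway, and the export becomes essential once a mode with separate slave processes is used.

```r
library(parallelMap)

# Hypothetical data object that the job function relies on.
lookup <- c(a = 1, b = 2, c = 3)

# Job function that refers to 'lookup' by its exported name.
get.value <- function(key) lookup[[key]]

parallelStartLocal(show.info = FALSE)
# Export by name; with slave processes this makes 'lookup' available there.
parallelExport("lookup")
res <- parallelMap(get.value, c("a", "b", "c"), simplify = TRUE)
parallelStop()
```

Exporting once and referring to the object by name avoids passing a large object as an argument to every single job.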
Returns the current and default settings, both as lists.
The return value has the elements settings and defaults,
which are both lists of the same structure, named by option names.
A print method exists to display this object.
For details on the configuration procedure, please read
parallelStart() and https://github.com/mlr-org/parallelMap.
parallelGetOptions()
ParallelMapOptions. See above.
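As a small illustration of the returned object, the following sketch starts local mode, reads the options, and inspects the two elements described above. It only relies on the documented structure (elements settings and defaults with matching names); the individual option names printed by str() are not assumed here.

```r
library(parallelMap)

parallelStartLocal(show.info = FALSE)
opts <- parallelGetOptions()
parallelStop()

# Both elements are lists that share the same option names.
str(opts$settings, max.level = 1)
same.names <- identical(names(opts$settings), names(opts$defaults))
```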
With flatten = FALSE, a structured S3 object is returned. The S3 object
has only one element, which is called levels. This contains a named list. Each
name refers to package from the call to parallelRegisterLevels(), while
the entries are character vectors of the form “package.level”. With
flatten = TRUE, a simple character vector is returned that contains all
concatenated entries of levels from above.
parallelGetRegisteredLevels(flatten = FALSE)
flatten | |
RegisteredLevels | character. See above.
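The two return shapes can be compared with a short sketch. The package name "mypkg" and the level names "tune" and "resample" are hypothetical and only serve as an illustration.

```r
library(parallelMap)

# Hypothetical package and level names.
parallelRegisterLevels(package = "mypkg", levels = c("tune", "resample"))

# Structured S3 object: element 'levels' is a list named by package.
structured <- parallelGetRegisteredLevels()
structured$levels[["mypkg"]]

# Flattened: one character vector with all "package.level" entries.
flat <- parallelGetRegisteredLevels(flatten = TRUE)
```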
parallelLapply: A parallel lapply() version.
parallelSapply: A parallel sapply() version.
All functions are simple wrappers for parallelMap().
parallelLapply(xs, fun, ..., impute.error = NULL, level = NA_character_)
parallelSapply(xs, fun, ..., simplify = TRUE, use.names = TRUE, impute.error = NULL, level = NA_character_)
xs | |
fun | |
... | (any) |
impute.error | |
level | |
simplify | |
use.names | |
For parallelLapply, a named list; for parallelSapply, it depends
on the return value of fun and the settings of simplify and use.names.
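A minimal sketch of both wrappers in local mode. The function `power.of` and its argument `p` are hypothetical; the extra argument is passed through `...` to the function.

```r
library(parallelMap)

# Hypothetical job function with one extra argument.
power.of <- function(x, p) x^p

parallelStartLocal(show.info = FALSE)
# List result, like lapply(); 'p' is forwarded to power.of().
squares.list <- parallelLapply(1:3, power.of, p = 2)
# Simplified result, like sapply().
squares.vec <- parallelSapply(1:3, power.of, p = 2)
parallelStop()
```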
Ensures that the packages are loaded in the slave processes so that
they can be used in a job function that is later run with parallelMap().
For all modes, the packages are also (potentially) loaded on the master.
parallelLibrary( ..., packages, master = TRUE, level = NA_character_, show.info = NA )
... | character |
packages | |
master | |
level | |
show.info | |
Nothing.
Uses the parallelization mode and the other options specified in
parallelStart().
Libraries and source files can be loaded on the slaves with
parallelLibrary() and parallelSource().
Large objects can be exported separately via parallelExport();
they can then simply be used under their exported name in slave body code.
Regarding error handling, see the argument impute.error.
parallelMap( fun, ..., more.args = list(), simplify = FALSE, use.names = FALSE, impute.error = NULL, level = NA_character_, show.info = NA )
fun | function |
... | (any) |
more.args | list |
simplify | |
use.names | |
impute.error | |
level | |
show.info | |
Result.
parallelStart()
parallelMap(identity, 1:2)
parallelStop()
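The following sketch illustrates more.args and impute.error together. The job function `shift` is hypothetical and fails on purpose for one input. It assumes, as the package help describes but this text does not spell out, that a function passed as impute.error is applied to the error object of each failed job and that its return value replaces the failed result.

```r
library(parallelMap)

# Hypothetical job: fails for one input to trigger the imputation.
shift <- function(x, offset) {
  if (x == 2) stop("cannot process 2")
  x + offset
}

# suppress.local.errors hides the error message printed in local mode.
parallelStartLocal(show.info = FALSE, suppress.local.errors = TRUE)
res <- parallelMap(shift, c(1, 2, 3),
  more.args = list(offset = 10),           # constant argument for every job
  impute.error = function(err) NA_real_)   # assumed: replaces failed results
parallelStop()
```

With impute.error = NULL (the default), the error of the failing job would instead be raised on the master.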
Package developers should call this function in their package's
base::.onLoad(). This enables the user to query available levels and bind
parallelization to specific levels. This is especially helpful for nested
calls to parallelMap(), e.g. where the inner call should be parallelized
instead of the outer one.
To avoid name clashes, we encourage developers to always specify the argument
package. This prefixes the specified levels with the package name,
e.g. parallelRegisterLevels(package="foo", levels="dummy")
registers the level “foo.dummy”, and users can
start parallelization for this level with parallelStart(<backend>, level = "foo.dummy").
If you do not provide package, the level names are
associated with the category “custom” and can later be referred
to as “custom.dummy”.
parallelRegisterLevels(package = "custom", levels)
package | |
levels | |
Nothing.
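The nested case described above can be sketched as follows. The package name "mypkg", the level names, and the functions `inner` and `outer` are hypothetical. The sketch assumes that a socket cluster with two processes can be started on the local machine; only the calls tagged with the selected level are sent to it, while the outer loop runs sequentially on the master.

```r
library(parallelMap)

# Hypothetical package and level names.
parallelRegisterLevels(package = "mypkg", levels = c("outer", "inner"))

inner <- function(j) j^2
outer <- function(i) {
  # Tagged with the inner level; parallelized only if that level is selected.
  sum(unlist(parallelMap(inner, seq_len(i), level = "mypkg.inner")))
}

# Select the inner level; the outer call below does not match it.
parallelStartSocket(cpus = 2, level = "mypkg.inner", show.info = FALSE)
res <- parallelMap(outer, 1:3, level = "mypkg.outer")
parallelStop()
```

The results are the same whichever level is selected; the level only decides which of the two loops is distributed.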
Ensures that the files are sourced in the slave processes so that
they can be used in a job function that is later run with parallelMap().
For all modes, the files are also (potentially) loaded on the master.
parallelSource( ..., files, master = TRUE, level = NA_character_, show.info = NA )
... | character |
files | character |
master | |
level | |
show.info | |
Nothing.
Defines the underlying parallelization mode for parallelMap(). It also allows
setting a “level” of parallelization. Only calls to parallelMap()
with a matching level are parallelized. The defaults of all settings are
taken from your options, which you can also define in your R profile. For an
introductory tutorial and information on the options configuration, please go
to the project's GitHub page at https://github.com/mlr-org/parallelMap.
parallelStart(mode, cpus, socket.hosts, bj.resources = list(), bt.resources = list(), logging, storagedir, level, load.balancing = FALSE, show.info, suppress.local.errors = FALSE, reproducible, ...)
parallelStartLocal(show.info, suppress.local.errors = FALSE, ...)
parallelStartMulticore(cpus, logging, storagedir, level, load.balancing = FALSE, show.info, reproducible, ...)
parallelStartSocket(cpus, socket.hosts, logging, storagedir, level, load.balancing = FALSE, show.info, reproducible, ...)
parallelStartMPI(cpus, logging, storagedir, level, load.balancing = FALSE, show.info, reproducible, ...)
parallelStartBatchJobs(bj.resources = list(), logging, storagedir, level, show.info, ...)
parallelStartBatchtools(bt.resources = list(), logging, storagedir, level, show.info, ...)
mode | |
cpus | |
socket.hosts | character |
bj.resources | list |
bt.resources | list |
logging | |
storagedir | |
level | |
load.balancing | |
show.info | |
suppress.local.errors | |
reproducible | |
... | (any) |
Currently the following modes are supported, which internally dispatch the mapping operation to functions from different parallelization packages:
local: No parallelization, with mapply().
multicore: Multicore execution on a single machine with parallel::mclapply().
socket: Socket cluster on one or multiple machines with parallel::makePSOCKcluster() and parallel::clusterMap().
mpi: Snow MPI cluster on one or multiple machines with parallel::makeCluster() and parallel::clusterMap().
BatchJobs: Parallelization on batch queuing HPC clusters, e.g., Torque, SLURM, etc., with BatchJobs::batchMap().
For BatchJobs mode you need to define a storage directory through the
argument storagedir or the option parallelMap.default.storagedir.
Nothing.
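Switching modes does not require any change to the mapping code, which the following sketch illustrates by running the same job in local and in socket mode. The function `job` is a hypothetical example, and the sketch assumes that two socket worker processes can be started on the local machine.

```r
library(parallelMap)

# Hypothetical job function.
job <- function(x) x + 1

# Sequential execution; convenient for debugging.
parallelStart(mode = "local", show.info = FALSE)
res.local <- parallelMap(job, 1:4, simplify = TRUE)
parallelStop()

# The same call, now dispatched to 2 socket workers on this machine.
parallelStart(mode = "socket", cpus = 2, show.info = FALSE)
res.socket <- parallelMap(job, 1:4, simplify = TRUE)
parallelStop()
```

Only the call to parallelStart() differs between the two runs, so the mode can be chosen by the user rather than hard-coded in the mapping code.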
Sets mode to “local”, i.e., parallelization is turned off and all resources are cleaned up.
For socket and mpi mode, parallel::stopCluster() is called.
For BatchJobs mode, the subdirectory of the storagedir
containing the exported objects is removed.
After a subsequent call of parallelStart(), no exported objects
are present on the slaves and no libraries are loaded,
i.e., you have clean R sessions on the slaves.
parallelStop()
Nothing.
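One possible pattern, shown here as a sketch rather than a recommendation from the package authors, is to register parallelStop() with on.exit() so that the clean-up also happens when a job fails. The wrapper `run.job` is a hypothetical helper.

```r
library(parallelMap)

# Hypothetical wrapper that always stops parallelization on exit.
run.job <- function(xs) {
  parallelStartLocal(show.info = FALSE)
  # Runs even if parallelMap() below raises an error.
  on.exit(parallelStop(), add = TRUE)
  parallelMap(function(x) x * 2, xs, simplify = TRUE)
}

doubled <- run.job(1:3)
```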