--- title: "Multi-caRamel optimization with MPI" author: "Fabrice Zaoui" date: "March 02 2021" output: html_document vignette: > %\VignetteEngine{knitr::rmarkdown} %\VignetteIndexEntry{Pareto front of fronts with MPI} %\VignetteEncoding{UTF-8} --- ```{r setup, include=FALSE} knitr::opts_chunk$set(echo = TRUE) knitr::opts_chunk$set(python.reticulate = FALSE) ``` # Short Description **caRamel** is a multiobjective evolutionary algorithm combining the MEAS algorithm and the NGSA-II algorithm. Download the package from CRAN or [GitHub](https://github.com/fzao/caRamel) and then install and load it. ```{r caRa, eval=F, echo=T} library(caRamel) ``` This example will use the **pbdMPI** package in order to use several processes of caRamel. Download the package from CRAN and then install and load it. Make also sure that you have an MPI distribution installed on your system, see for instance [*Open MPI*](https://www.open-mpi.org) if necessary. ```{r pbdMPI, eval=F, echo=T} library(pbdMPI) ``` [*Kursawe*](https://en.wikipedia.org/wiki/File:Kursawe_function.pdf) test function has two objectives of three variables. ```{r kursawe, eval=F, echo=T} kursawe <- function(i) { k1 <- -10 * exp(-0.2 * sqrt(x[i,1]^2 + x[i,2]^2)) - 10 * exp(-0.2 * sqrt(x[i,2]^2 + x[i,3]^2)) k2 <- abs(x[i,1])^0.8 + 5 * sin(x[i,1]^3) + abs(x[i,2])^0.8 + 5 * sin(x[i,2]^3) + abs(x[i,3])^0.8 + 5 * sin(x[i,3]^3) return(c(k1, k2)) } ``` The variables lie in the range [-5, 5]: ```{r kursawe_variable, eval=F, echo=T} nvar <- 3 # number of variables bounds <- matrix(data = 1, nrow = nvar, ncol = 2) # upper and lower bounds bounds[, 1] <- -5 * bounds[, 1] bounds[, 2] <- 5 * bounds[, 2] ``` Both functions are to be minimized: ```{r kursawe_objectives, eval=F, echo=T} nobj <- 2 # number of objectives minmax <- c(FALSE, FALSE) # minimization for both functions ``` Set algorithmic parameters for **caRamel**: ```{r kursawe_param, eval=F, echo=T} popsize <- 100 # size of the genetic population archsize <- 100 # size of the archive for the Pareto front maxrun <- 1000 # maximum number of calls prec <- matrix(1.e-3, nrow = 1, ncol = nobj) # accuracy for the convergence phase ``` # Combining distributed and shared memory parallelism The **caRamel** parameters *carallel* and *numcores* help to define the shared memory parallelism used to evaluate by the user's function each member of the genetic population. The shared memory parallelism runs on a single computational node (or workstation). For instance 20 cores can be defined with the following call: ```{r call_caRa, eval=F, echo=T} results <- caRamel(nobj, nvar, minmax, bounds, kursawe, popsize, archsize, maxrun, prec, carallel = 1, numcores = 20, graph = FALSE, verbose = FALSE) ``` For calling several times **caRamel** on a distributed memory cluster of computers (or nodes) MPI is used. At the end of the script, the different results of optimizations can be gathered with: ```{r mpi_caRa, eval=F, echo=T} init() # MPI functions from the pbdMPI package size <- comm.size() rank <- comm.rank() results <- gather(optres, rank.dest = 0) # gather all results on the main process if (rank == 0) saveRDS(results, "Results.Rds") # save all the results on disk finalize() ``` Finally, the previous script can be launched in parallel on a number of nodes, for instance 4 nodes: ```{bash launch, eval=F, echo=T} mpirun -n 4 Rscript --vanilla my_kursawe_optim.R ```