--- title: "Getting Started with momst" author: "Jorge A. Parraga-Alava" date: "`r Sys.Date()`" output: rmarkdown::html_vignette: toc: true toc_depth: 3 fig_width: 7 fig_height: 5 vignette: > %\VignetteIndexEntry{Getting Started with momst} %\VignetteEngine{knitr::rmarkdown} %\VignetteEncoding{UTF-8} --- ```{r setup, include = FALSE} # Global chunk options. Each example must be reproducible, so a fixed # random seed is set at every code block that involves stochastic steps. knitr::opts_chunk$set( collapse = TRUE, comment = "#>", fig.align = "center", warning = FALSE, message = FALSE ) ``` # Overview The **momst** package solves the *Multi-Objective Minimum Spanning Tree* (MO-MST) problem on complete weighted graphs. Given a graph where every edge carries two or three independent costs (for example: distance, time, and risk), the goal is to obtain the **Pareto front** of spanning trees that no other tree dominates in all objectives simultaneously. The solver is built around the **NSGA-II** algorithm (Non-dominated Sorting Genetic Algorithm II) and uses **Prufer sequences** as the chromosome representation. By Cayley's theorem, every integer vector of length `n - 2` with values in `{1, ..., n}` decodes to a unique spanning tree of `n` nodes, so the representation is closed under random sampling, crossover, and mutation. This avoids the need for any repair operator. The package exposes four solver variants that differ in the local search operator applied after each generation: | Variant | Local search | |:----------|:----------------------------------------| | `"base"` | None (pure NSGA-II) | | `"PR"` | Path Relinking | | `"PLS"` | Pareto Local Search | | `"TS"` | Tabu Search | This vignette shows the minimal workflow: 1. Generate a random instance. 2. Run `run_momst()` with a chosen variant. 3. Inspect the population, the per-iteration Pareto fronts, and the global Pareto front. 4. Visualise the global Pareto front and the best-compromise spanning tree. A companion vignette (`vignette("momst-variants", package = "momst")`) compares the four variants side by side. ## Reference This package is the reference implementation of the method described in: > Parraga-Alava, J., Inostroza-Ponta, M., and Dorn, M. (2017). > "Using local search strategies to improve the performance of NSGA-II > for the Multi-Criteria Minimum Spanning Tree problem". > In *2017 IEEE Congress on Evolutionary Computation (CEC)*, > pages 1818 to 1825. IEEE. > [doi:10.1109/CEC.2017.7969432](https://doi.org/10.1109/CEC.2017.7969432) To get the citation entry from within R: ```{r citation, eval = FALSE} citation("momst") ``` # Installation You can install the development version of the package directly from GitHub: ```{r install, eval = FALSE} # install.packages("remotes") remotes::install_github("jorgeklz/momst", build_vignettes = TRUE) ``` ```{r load} library(momst) ``` # A first end-to-end example ## Step 1. Generate a random bi-objective instance `generate_instance()` returns an edge list of a complete graph with random uniform weights for every objective. The `seed` argument makes the instance fully reproducible without polluting the global random number generator. ```{r instance} # Number of nodes and number of objectives n_nodes <- 10 n_obj <- 2 # Generate a complete graph with two independent weights per edge inst <- generate_instance( n = n_nodes, num_obj = n_obj, range_a = c(10, 100), # weight range for objective 1 range_b = c(10, 50), # weight range for objective 2 seed = 12345 ) # The instance has n*(n-1)/2 edges nrow(inst) # First six edges of the complete graph head(inst) ``` For performance, the package converts the edge list into one `n x n` lookup matrix per objective so that the cost of any edge `(i, j)` can be retrieved in constant time: ```{r lookup} lk <- build_weight_lookup(inst, n_nodes, n_obj) # Each element is a symmetric weight matrix str(lk, max.level = 1) # Weight of edge (1, 2) in objective 1 lk[[1]][1, 2] ``` ## Step 2. Run the NSGA-II solver (base variant) `run_momst()` is the main entry point. It performs: 1. `iterations` independent runs of NSGA-II (useful to average results). 2. Inside each run, up to `max_generations` evolutionary cycles. 3. After every cycle, a local search operator is applied to the current non-dominated set (none for `"base"`). 4. A global Pareto front is assembled from the final population of each independent run. For a first contact we use the simplest variant, `"base"`. A small instance and short runs are enough to illustrate the workflow. ```{r run-base} res_base <- run_momst( instance = inst, n = n_nodes, num_obj = n_obj, variant = "base", # pure NSGA-II iterations = 3, # three independent runs pop_size = 30, # must be even tour_size = 2, # binary tournament selection cross_rate = 0.80, # crossover probability mut_rate = 0.10, # per-individual mutation probability max_generations = 40, # generations per run convergence_window = 8, # early stopping window verbose = FALSE, seed = 2026 ) ``` The function returns a list with everything needed to analyse the results: ```{r structure} names(res_base) ``` * `instance` : the edge list used. * `lookup` : the per-objective weight matrices. * `iterations` : the final populations of every independent run. * `pareto_per_iter` : the Pareto front of every independent run. * `global_pareto` : the **non-dominated union** of all those fronts. * `elapsed` : wall-clock time in seconds. ## Step 3. Inspect the global Pareto front Each row of `global_pareto` is a candidate spanning tree. The first `n - 2` columns are the Prufer sequence (the chromosome), and the columns `objective_1`, `objective_2` (and optionally `objective_3`) report the **total cost of the tree in each objective**. ```{r front-base} # Number of non-dominated trees nrow(res_base$global_pareto) # Show the chromosomes and their objective values res_base$global_pareto ``` We can sort the front by objective 1 to see the typical trade-off between objectives: ```{r front-sorted} front <- res_base$global_pareto[order(res_base$global_pareto$objective_1), ] front[, c("objective_1", "objective_2")] ``` The "extreme" solutions of the front are the trees that minimise each objective in isolation: ```{r extremes} # Best tree for objective 1 front[which.min(front$objective_1), c("objective_1", "objective_2")] # Best tree for objective 2 front[which.min(front$objective_2), c("objective_1", "objective_2")] ``` ## Step 4. Plot the Pareto front `plot_pareto_front()` produces a base-graphics scatter of the non-dominated set. The dashed line connects consecutive points after sorting by the first objective, which makes the trade-off easy to read. ```{r plot-front-base, fig.cap = "Global Pareto front returned by the base variant."} plot_pareto_front(res_base) ``` ## Step 5. Decode and plot the best-compromise tree The "best-compromise" tree is the one that minimises the sum of all objectives. The helper `plot_best_tree()` decodes its Prufer chromosome, turns it into an `igraph` object, and labels each edge with its cost in every objective. The `igraph` package is only needed for this plot; the rest of the package does not depend on it. ```{r best-tree, fig.cap = "Best-compromise spanning tree of the base variant.", fig.width = 6, fig.height = 6} if (requireNamespace("igraph", quietly = TRUE)) { plot_best_tree(res_base, n = n_nodes) } else { message("Install 'igraph' to plot the spanning tree.") } ``` # Working with three objectives The same workflow extends seamlessly to three-objective problems. Just set `num_obj = 3` and supply a third weight range `range_c`. The Pareto front then lives in a three-dimensional objective space. ```{r run-3obj} res_3obj <- run_momst( n = 8, num_obj = 3, variant = "base", iterations = 2, pop_size = 30, max_generations = 25, range_a = c(10, 100), range_b = c(10, 50), range_c = c(30, 200), verbose = FALSE, seed = 7 ) # Three-objective non-dominated set head(res_3obj$global_pareto[, c("objective_1", "objective_2", "objective_3")]) ``` A simple way to inspect the three-objective front without extra dependencies is a pairs plot: ```{r pairs-plot, fig.cap = "Pairwise projections of the three-objective Pareto front.", fig.width = 6, fig.height = 6} front3 <- res_3obj$global_pareto[, c("objective_1", "objective_2", "objective_3")] pairs(front3, pch = 19, col = "steelblue") ``` # Reproducibility Every stochastic step inside `run_momst()` is controlled by the `seed` argument. Two calls with the same arguments and the same seed return identical results: ```{r reproducibility} a <- run_momst(n = 8, num_obj = 2, variant = "base", iterations = 1, pop_size = 20, max_generations = 15, verbose = FALSE, seed = 99) b <- run_momst(n = 8, num_obj = 2, variant = "base", iterations = 1, pop_size = 20, max_generations = 15, verbose = FALSE, seed = 99) identical(a$global_pareto, b$global_pareto) ``` # Where to go next * `vignette("momst-variants", package = "momst")` runs the four solver variants on the same instance and compares their Pareto fronts both numerically and visually. * `?run_momst` documents every argument of the main solver. * `?plot_pareto_front` and `?plot_best_tree` document the plotting helpers. * `?apply_local_search` documents the dispatcher used internally to call the local search operator selected by the `variant` argument.