Title: | Plots of the Empirical Attainment Function |
---|---|
Description: | Computation and visualization of the empirical attainment function (EAF) for the analysis of random sets in multi-criterion optimization. M. López-Ibáñez, L. Paquete, and T. Stützle (2010) <doi:10.1007/978-3-642-02538-9_9>. |
Authors: | Manuel López-Ibáñez [aut, cre], Marco Chiarandini [aut], Carlos Fonseca [aut], Luís Paquete [aut], Thomas Stützle [aut], Mickaël Binois [ctb] |
Maintainer: | Manuel López-Ibáñez <[email protected]> |
License: | GPL (>= 2) |
Version: | 2.5.1 |
Built: | 2024-11-23 06:46:18 UTC |
Source: | CRAN |
Convert a list of attainment surfaces to a single data.frame.
attsurf2df(x)
x |
( |
A data.frame with as many columns as objectives and an additional column percentiles.
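As a rough illustration of this conversion, the sketch below stacks a list of two-column matrices into a single data frame using base R only. It is not the package implementation: the helper name `stack_surfaces`, the toy surfaces, and the assumption that the list names encode the percentiles are all hypothetical.

```r
# Hypothetical helper (NOT the eaf implementation): stack a named list of
# matrices into one data.frame, assuming the list names are the percentiles.
stack_surfaces <- function(x) {
  stopifnot(is.list(x), !is.null(names(x)))
  # Repeat each percentile once per row of its matrix.
  percentiles <- rep(as.numeric(names(x)), vapply(x, nrow, integer(1)))
  data.frame(do.call(rbind, x), percentiles = percentiles)
}

# Toy attainment surfaces (made-up values, two objectives each).
surfs <- list("0"  = cbind(c(1, 2), c(4, 3)),
              "50" = cbind(c(2, 3), c(5, 4)))
df <- stack_surfaces(surfs)
print(df)
```

The result has one column per objective plus the percentiles column, which matches the shape described above.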
data(SPEA2relativeRichmond)
attsurfs <- eafplot(SPEA2relativeRichmond, percentiles = c(0, 50, 100),
                    xlab = expression(C[E]), ylab = "Total switches",
                    lty = 0, pch = 21, xlim = c(90, 140), ylim = c(0, 25))
attsurfs <- attsurf2df(attsurfs)
text(attsurfs[, 1:2], labels = attsurfs[, 3], adj = c(1.5, 1.5))
Creates the same plot as eafdiffplot() but waits for the user to click in one of the sides. Then it returns the rectangles that give the differences in favour of the chosen side. These rectangles may be used for interactive decision-making as shown in Diaz and López-Ibáñez (2021). The function choose_eafdiff() may be used in a non-interactive context.
choose_eafdiffplot(
  data.left,
  data.right,
  intervals = 5,
  maximise = c(FALSE, FALSE),
  title.left = deparse(substitute(data.left)),
  title.right = deparse(substitute(data.right)),
  ...
)

choose_eafdiff(x, left = stop("'left' must be either TRUE or FALSE"))
data.left , data.right
|
Data frames corresponding to the input data of
left and right sides, respectively. Each data frame has at least three
columns, the third one being the set of each point. See also
|
intervals |
( |
maximise |
( |
title.left , title.right
|
Title for left and right panels, respectively. |
... |
Other graphical parameters are passed down to
|
x |
( |
left |
( |
A matrix where the first 4 columns give the coordinates of two corners of each rectangle and the last column gives the positive differences in favour of the chosen side.
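The selection step can be pictured with a small base R sketch. It assumes the input is a rectangle matrix whose last column holds signed differences (positive in favour of the left side, negative in favour of the right side), as described for eafdiff() with rectangles = TRUE. The helper `pick_side`, the column names, and the toy values are hypothetical and are not taken from the package source.

```r
# Hypothetical helper (NOT the eaf implementation): keep the rectangles in
# favour of one side and report their differences as positive values.
pick_side <- function(rects, left) {
  stopifnot(is.matrix(rects), is.logical(left), length(left) == 1L)
  d <- rects[, ncol(rects)]
  if (left) {
    return(rects[d > 0, , drop = FALSE])
  }
  out <- rects[d < 0, , drop = FALSE]
  out[, ncol(out)] <- -out[, ncol(out)]  # flip the sign for the right side
  out
}

# Toy rectangles: two corners per row plus a signed difference (made-up data).
rects <- rbind(c(1, 1, 2, 2,  3),
               c(2, 1, 3, 2, -2),
               c(3, 1, 4, 2,  1))
colnames(rects) <- c("xmin", "ymin", "xmax", "ymax", "diff")

left_rects  <- pick_side(rects, left = TRUE)
right_rects <- pick_side(rects, left = FALSE)
print(left_rects)
print(right_rects)
```

Either result contains only positive differences, so it can be used directly as a set of preference weights.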
Juan Esteban Diaz, Manuel López-Ibáñez (2021). “Incorporating Decision-Maker's Preferences into the Automatic Configuration of Bi-Objective Optimisation Algorithms.” European Journal of Operational Research, 289(3), 1209–1222. doi:10.1016/j.ejor.2020.07.059.
read_datasets(), eafdiffplot(), whv_rect()
extdata_dir <- system.file(package="eaf", "extdata")
A1 <- read_datasets(file.path(extdata_dir, "wrots_l100w10_dat"))
A2 <- read_datasets(file.path(extdata_dir, "wrots_l10w100_dat"))
if (interactive()) {
  rectangles <- choose_eafdiffplot(A1, A2, intervals = 5)
} else { # Choose A1
  rectangles <- eafdiff(A1, A2, intervals = 5, rectangles = TRUE)
  rectangles <- choose_eafdiff(rectangles, left = TRUE)
}
reference <- c(max(A1[, 1], A2[, 1]), max(A1[, 2], A2[, 2]))
x <- split.data.frame(A1[,1:2], A1[,3])
hv_A1 <- sapply(split.data.frame(A1[, 1:2], A1[, 3]),
                hypervolume, reference=reference)
hv_A2 <- sapply(split.data.frame(A2[, 1:2], A2[, 3]),
                hypervolume, reference=reference)
boxplot(list(A1=hv_A1, A2=hv_A2), main = "Hypervolume")

whv_A1 <- sapply(split.data.frame(A1[, 1:2], A1[, 3]),
                 whv_rect, rectangles=rectangles, reference=reference)
whv_A2 <- sapply(split.data.frame(A2[, 1:2], A2[, 3]),
                 whv_rect, rectangles=rectangles, reference=reference)
boxplot(list(A1=whv_A1, A2=whv_A2), main = "Weighted hypervolume")
This data set is provided only as an example of the use of vorobT() and vorobDev(). It was obtained by fitting two Gaussian processes on 20 observations of a bi-objective problem, then generating conditional simulations of both GPs at different locations and extracting the non-dominated values of the coupled simulations.
CPFs
A data frame with 2967 observations on the following 3 variables.
f1: first objective values.
f2: second objective values.
set: indices of corresponding conditional Pareto fronts.
M Binois, D Ginsbourger, O Roustant (2015). “Quantifying uncertainty on Pareto fronts with Gaussian process conditional simulations.” European Journal of Operational Research, 243(2), 386–394. doi:10.1016/j.ejor.2014.07.032.
data(CPFs)

res <- vorobT(CPFs, reference = c(2, 200))
eafplot(CPFs[,1:2], sets = CPFs[,3], percentiles = c(0, 20, 40, 60, 80, 100),
        col = gray(seq(0.8, 0.1, length.out = 6)^2), type = "area",
        legend.pos = "bottomleft", extra.points = res$VE, extra.col = "cyan")
Calculate the differences between the empirical attainment functions of two data sets.
eafdiff(x, y, intervals = NULL, maximise = c(FALSE, FALSE), rectangles = FALSE)
x , y
|
Data frames corresponding to the input data of
left and right sides, respectively. Each data frame has at least three
columns, the third one being the set of each point. See also
|
intervals |
( |
maximise |
( |
rectangles |
If TRUE, the output is in the form of rectangles of the same color. |
This function calculates the differences between the EAFs of two data sets. With rectangles=FALSE, it returns a data.frame containing the points where there is a transition in the value of the EAF differences. With rectangles=TRUE, it returns a matrix where the first 4 columns give the coordinates of two corners of each rectangle. In both cases, the last column gives the difference in terms of sets in x minus sets in y that attain each point (i.e., negative values are differences in favour of y).
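To make the sign convention concrete, the following base R sketch evaluates the difference at a single point: the number of sets in x that attain the point minus the number of sets in y that attain it. It assumes minimisation of both objectives and ignores any scaling by `intervals`. The helper names and the toy data are hypothetical, and this is not how the package computes the full set of transition points.

```r
# Count how many sets (third column) contain a point that weakly dominates z,
# assuming both objectives are minimised.
count_attaining <- function(data, z) {
  runs <- split.data.frame(data[, 1:2, drop = FALSE], data[, 3])
  sum(vapply(runs,
             function(p) any(p[, 1] <= z[1] & p[, 2] <= z[2]),
             logical(1)))
}

# Difference at z: sets in x attaining z minus sets in y attaining z.
eaf_diff_at <- function(x, y, z) count_attaining(x, z) - count_attaining(y, z)

# Toy data (made up): columns are objective 1, objective 2, set index.
x <- rbind(c(1, 3, 1), c(2, 1, 1), c(2, 2, 2))
y <- rbind(c(3, 3, 1), c(1, 1, 2))

d1 <- eaf_diff_at(x, y, c(2, 2))  # both sets of x attain (2, 2), one set of y
d2 <- eaf_diff_at(x, y, c(1, 1))  # no set of x attains (1, 1), one set of y
print(c(d1, d2))
```

A positive value is a difference in favour of x and a negative value is a difference in favour of y.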
read_datasets(), eafdiffplot()
A1 <- read_datasets(text='
 3 2
 2 3

 2.5 1
 1 2

 1 2
')
A2 <- read_datasets(text='
 4 2.5
 3 3
 2.5 3.5

 3 3
 2.5 3.5

 2 1
')
d <- eafdiff(A1, A2)
str(d)
print(d)

d <- eafdiff(A1, A2, rectangles = TRUE)
str(d)
print(d)
Plot the differences between the empirical attainment functions (EAFs) of two data sets as a two-panel plot, where the left side shows the values of the left EAF minus the right EAF and the right side shows the differences in the other direction.
eafdiffplot(
  data.left,
  data.right,
  col = c("#FFFFFF", "#808080", "#000000"),
  intervals = 5,
  percentiles = c(50),
  full.eaf = FALSE,
  type = "area",
  legend.pos = if (full.eaf) "bottomleft" else "topright",
  title.left = deparse(substitute(data.left)),
  title.right = deparse(substitute(data.right)),
  xlim = NULL,
  ylim = NULL,
  cex = par("cex"),
  cex.lab = par("cex.lab"),
  cex.axis = par("cex.axis"),
  maximise = c(FALSE, FALSE),
  grand.lines = TRUE,
  sci.notation = FALSE,
  left.panel.last = NULL,
  right.panel.last = NULL,
  ...
)
data.left , data.right
|
Data frames corresponding to the input data of
left and right sides, respectively. Each data frame has at least three
columns, the third one being the set of each point. See also
|
col |
A character vector of three colors for the magnitude of the
differences of 0, 0.5, and 1. Intermediate colors are computed
automatically given the value of |
intervals |
( |
percentiles |
The percentiles of the EAF of each side that will be
plotted as attainment surfaces. |
full.eaf |
Whether to plot the EAF of each side instead of the differences between the EAFs. |
type |
Whether the EAF differences are plotted as points (‘point’) or whether to color the areas that have at least a certain value (‘area’). |
legend.pos |
The position of the legend. See |
title.left , title.right
|
Title for left and right panels, respectively. |
xlim , ylim , cex , cex.lab , cex.axis
|
Graphical parameters, see
|
maximise |
( |
grand.lines |
Whether to plot the grand-best and grand-worst attainment surfaces. |
sci.notation |
Generate prettier labels |
left.panel.last , right.panel.last
|
An expression to be evaluated after
plotting has taken place on each panel (left or right). This can be useful
for adding points or text to either panel. Note that this works by lazy
evaluation: passing this argument from other |
... |
Other graphical parameters are passed down to
|
This function calculates the differences between the EAFs of two data sets, and plots on the left the differences in favour of the left data set, and on the right the differences in favour of the right data set. By default, it also plots the grand best and worst attainment surfaces, that is, the 0%- and 100%-attainment surfaces over all data. These two surfaces delimit the area where differences may exist. In addition, it also plots the 50%-attainment surface of each data set.
With type = "point", only the points where there is a change in the value of the EAF difference are plotted. This means that in areas where the EAF difference stays constant, the region will appear in white even if the value of the difference in that region is large. This explains "white holes" surrounded by black points.
With type = "area", the area where the EAF difference has a certain value is plotted. The idea for the algorithm to compute the areas was provided by Carlos M. Fonseca. The implementation uses R polygons, which some PDF viewers may have trouble rendering correctly (see https://cran.r-project.org/doc/FAQ/R-FAQ.html#Why-are-there-unwanted-borders). Plots should look correct when printed.
Large differences that appear when using type = "point" may seem to disappear when using type = "area". The explanation is that the size of the points is independent of the axes range; therefore, the plotted points may seem to cover a much larger area than they actually represent. On the other hand, the size of the areas is plotted with respect to the objective space, without any extra borders. If the extent of an area becomes smaller than one pixel, it will not be visible. As a consequence, zooming in or out of certain regions of the plots does not change the apparent size of the points, whereas it considerably affects the apparent size of the areas.
Returns a representation of the EAF differences (invisibly).
## NOTE: The plots in the website look squashed because of how pkgdown
## generates them. They should look fine when you generate them yourself.
extdata_dir <- system.file(package="eaf", "extdata")
A1 <- read_datasets(file.path(extdata_dir, "ALG_1_dat.xz"))
A2 <- read_datasets(file.path(extdata_dir, "ALG_2_dat.xz"))

# These take time
eafdiffplot(A1, A2, full.eaf = TRUE)
if (requireNamespace("viridisLite", quietly=TRUE)) {
  viridis_r <- function(n) viridisLite::viridis(n, direction=-1)
  eafdiffplot(A1, A2, type = "area", col = viridis_r)
} else {
  eafdiffplot(A1, A2, type = "area")
}
A1 <- read_datasets(file.path(extdata_dir, "wrots_l100w10_dat"))
A2 <- read_datasets(file.path(extdata_dir, "wrots_l10w100_dat"))
eafdiffplot(A1, A2, type = "point", sci.notation = TRUE, cex.axis=0.6)

# A more complex example
DIFF <- eafdiffplot(A1, A2, col = c("white", "blue", "red"), intervals = 5,
                    type = "point",
                    title.left=expression("W-RoTS," ~ lambda==100 * "," ~ omega==10),
                    title.right=expression("W-RoTS," ~ lambda==10 * "," ~ omega==100),
                    right.panel.last={
                      abline(a = 0, b = 1, col = "red", lty = "dashed")})
DIFF$right[,3] <- -DIFF$right[,3]

## Save the values to a file.
# write.table(rbind(DIFF$left,DIFF$right),
#             file = "wrots_l100w10_dat-wrots_l10w100_dat-diff.txt",
#             quote = FALSE, row.names = FALSE, col.names = FALSE)
Computes and plots the Empirical Attainment Function, either as attainment surfaces for certain percentiles or as points.
eafplot(x, ...)

## Default S3 method:
eafplot(
  x,
  sets = NULL,
  groups = NULL,
  percentiles = c(0, 50, 100),
  attsurfs = NULL,
  xlab = NULL,
  ylab = NULL,
  xlim = NULL,
  ylim = NULL,
  log = "",
  type = "point",
  col = NULL,
  lty = c("dashed", "solid", "solid", "solid", "dashed"),
  lwd = 1.75,
  pch = NA,
  cex.pch = par("cex"),
  las = par("las"),
  legend.pos = "topright",
  legend.txt = NULL,
  extra.points = NULL,
  extra.legend = NULL,
  extra.pch = 4:25,
  extra.lwd = 0.5,
  extra.lty = NA,
  extra.col = "black",
  maximise = c(FALSE, FALSE),
  xaxis.side = "below",
  yaxis.side = "left",
  axes = TRUE,
  sci.notation = FALSE,
  ...
)

## S3 method for class 'formula'
eafplot(formula, data, groups = NULL, subset = NULL, ...)

## S3 method for class 'list'
eafplot(x, ...)
x |
Either a matrix of data values, or a data frame, or a list of data frames of exactly three columns. |
... |
Other graphical parameters to |
sets |
(numeric) |
groups |
This may be used to plot profiles of different algorithms on the same plot. |
percentiles |
( |
attsurfs |
TODO |
xlab , ylab , xlim , ylim , log , col , lty , lwd , pch , cex.pch , las
|
Graphical
parameters, see |
type |
( |
legend.pos |
the position of the legend, see |
legend.txt |
a character or expression vector to appear in the
legend. If |
extra.points |
A list of matrices or data.frames with
two-columns. Each element of the list defines a set of points, or
lines if one of the columns is |
extra.legend |
A character vector providing labels for the groups of points. |
extra.pch , extra.lwd , extra.lty , extra.col
|
Control the graphical aspect
of the points. See |
maximise |
( |
xaxis.side |
On which side the x-axis is drawn. Valid values are
|
yaxis.side |
On which side the y-axis is drawn. Valid values are |
axes |
A logical value indicating whether both axes should be drawn on the plot. |
sci.notation |
Generate prettier labels |
formula |
A formula of the type: |
data |
Dataframe containing the fields mentioned in the formula and in groups. |
subset |
( |
This function can be used to plot random sets of points like those obtained by different runs of biobjective stochastic optimisation algorithms. An EAF curve represents the boundary separating points that are known to be attainable (that is, dominated in Pareto sense) in at least a fraction (quantile) of the runs from those that are not. The median EAF represents the curve where the fraction of attainable points is 50%. In single objective optimisation the function can be used to plot the profile of solution quality over time of a collection of runs of a stochastic optimizer.
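The notion of a point being attained in a fraction of the runs can be sketched in a few lines of base R. The example below assumes minimisation of both objectives, and the function name `attained_fraction` and the toy runs are hypothetical. It only evaluates the empirical attainment function at individual points, whereas eafplot() computes and draws the whole boundary.

```r
# Fraction of runs that attain z, i.e. that contain at least one point
# weakly dominating z (both objectives minimised).
attained_fraction <- function(points, sets, z) {
  runs <- split.data.frame(points, sets)
  hits <- vapply(runs,
                 function(p) any(p[, 1] <= z[1] & p[, 2] <= z[2]),
                 logical(1))
  mean(hits)
}

# Three toy runs (made-up data).
pts <- rbind(c(1, 4), c(3, 2),   # run 1
             c(2, 3),            # run 2
             c(4, 1), c(2, 5))   # run 3
sets <- c(1, 1, 2, 3, 3)

f_mid  <- attained_fraction(pts, sets, z = c(3, 3))  # attained by runs 1 and 2
f_all  <- attained_fraction(pts, sets, z = c(4, 5))  # attained by every run
f_none <- attained_fraction(pts, sets, z = c(1, 1))  # attained by no run
print(c(f_mid, f_all, f_none))
```

In this toy case the point (3, 3) is attained in at least 50% of the runs, so it lies on or behind the median attainment surface, while (1, 1) lies in front of the best (0%) surface.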
Return (invisibly) the attainment surfaces computed.
eafplot(default): Main function
eafplot(formula): Formula interface
eafplot(list): List interface for lists of data.frames or matrices
data(gcp2x2)
tabucol <- subset(gcp2x2, alg != "TSinN1")
tabucol$alg <- tabucol$alg[drop=TRUE]
eafplot(time + best ~ run, data = tabucol, subset = tabucol$inst=="DSJC500.5")

# These take time
eafplot(time + best ~ run | inst, groups=alg, data=gcp2x2)
eafplot(time + best ~ run | inst, groups=alg, data=gcp2x2,
        percentiles=c(0,50,100), cex.axis = 0.8, lty = c(2,1,2), lwd = c(2,2,2),
        col = c("black","blue","grey50"))

extdata_path <- system.file(package = "eaf", "extdata")
A1 <- read_datasets(file.path(extdata_path, "ALG_1_dat.xz"))
A2 <- read_datasets(file.path(extdata_path, "ALG_2_dat.xz"))
eafplot(A1, percentiles = 50, sci.notation = TRUE, cex.axis=0.6)
# The attainment surfaces are returned invisibly.
attsurfs <- eafplot(list(A1 = A1, A2 = A2), percentiles = 50)
str(attsurfs)

## Save as a PDF file.
# dev.copy2pdf(file = "eaf.pdf", onefile = TRUE, width = 5, height = 4)

## Using extra.points
data(HybridGA)
data(SPEA2relativeVanzyl)
eafplot(SPEA2relativeVanzyl, percentiles = c(25, 50, 75),
        xlab = expression(C[E]), ylab = "Total switches", xlim = c(320, 400),
        extra.points = HybridGA$vanzyl, extra.legend = "Hybrid GA")

data(SPEA2relativeRichmond)
eafplot(SPEA2relativeRichmond, percentiles = c(25, 50, 75),
        xlab = expression(C[E]), ylab = "Total switches",
        xlim = c(90, 140), ylim = c(0, 25),
        extra.points = HybridGA$richmond, extra.lty = "dashed",
        extra.legend = "Hybrid GA")

eafplot(SPEA2relativeRichmond, percentiles = c(25, 50, 75),
        xlab = expression(C[E]), ylab = "Total switches",
        xlim = c(90, 140), ylim = c(0, 25), type = "area",
        extra.points = HybridGA$richmond, extra.lty = "dashed",
        extra.legend = "Hybrid GA", legend.pos = "bottomright")

data(SPEA2minstoptimeRichmond)
SPEA2minstoptimeRichmond[,2] <- SPEA2minstoptimeRichmond[,2] / 60
eafplot(SPEA2minstoptimeRichmond, xlab = expression(C[E]),
        ylab = "Minimum idle time (minutes)", maximise = c(FALSE, TRUE),
        las = 1, log = "y", main = "SPEA2 (Richmond)",
        legend.pos = "bottomright")
This function computes the EAF given a set of 2D or 3D points and a vector sets that indicates to which set each point belongs.
eafs(points, sets, groups = NULL, percentiles = NULL)
points |
Either a matrix or a data frame of numerical values, where each row gives the coordinates of a point. |
sets |
A vector indicating which set each point belongs to. |
groups |
Indicates that the EAF must be computed separately for data belonging to different groups. |
percentiles |
( |
A data frame (data.frame) containing the exact representation of the EAF. The last column gives the percentile that corresponds to each point. If groups is not NULL, then an additional column indicates to which group the point belongs.
There are several examples of data sets in system.file(package="eaf","extdata"). The current implementation only supports two- and three-dimensional points.
Manuel López-Ibáñez
Viviane Grunert da Fonseca, Carlos M. Fonseca, Andreia O. Hall (2001). “Inferential Performance Assessment of Stochastic Optimisers and the Attainment Function.” In Eckart Zitzler, Kalyanmoy Deb, Lothar Thiele, Carlos A. Coello Coello, David Corne (eds.), Evolutionary Multi-criterion Optimization, EMO 2001, volume 1993 of Lecture Notes in Computer Science, 213–225. Springer, Heidelberg, Germany. doi:10.1007/3-540-44719-9_15.
Carlos M. Fonseca, Andreia P. Guerreiro, Manuel López-Ibáñez, Luís Paquete (2011). “On the Computation of the Empirical Attainment Function.” In R H C Takahashi, others (eds.), Evolutionary Multi-criterion Optimization, EMO 2011, volume 6576 of Lecture Notes in Computer Science, 106–120. Springer, Heidelberg. doi:10.1007/978-3-642-19893-9_8.
extdata_path <- system.file(package="eaf", "extdata")

x <- read_datasets(file.path(extdata_path, "example1_dat"))
# Compute full EAF
str(eafs(x[,1:2], x[,3]))

# Compute only best, median and worst
str(eafs(x[,1:2], x[,3], percentiles = c(0, 50, 100)))

x <- read_datasets(file.path(extdata_path, "spherical-250-10-3d.txt"))
y <- read_datasets(file.path(extdata_path, "uniform-250-10-3d.txt"))
x <- rbind(data.frame(x, groups = "spherical"),
           data.frame(y, groups = "uniform"))
# Compute only median separately for each group
z <- eafs(x[,1:3], sets = x[,4], groups = x[,5], percentiles = 50)
str(z)
# library(plotly)
# plot_ly(z, x = ~X1, y = ~X2, z = ~X3, color = ~groups,
#         colors = c('#BF382A', '#0C4B8E')) %>% add_markers()
Computes the epsilon metric, either additive or multiplicative.
epsilon_additive(data, reference, maximise = FALSE)

epsilon_mult(data, reference, maximise = FALSE)
data |
( |
reference |
( |
maximise |
( |
The epsilon metric of a set $A$ with respect to a reference set $R$ is defined as

$$\epsilon(A, R) = \max_{r \in R} \min_{a \in A} \max_{1 \leq i \leq n} \epsilon(a_i, r_i)$$

where $a$ and $r$ are objective vectors and, in the case of minimization of objective $i$, $\epsilon(a_i, r_i)$ is computed as $a_i / r_i$ for the multiplicative variant (respectively, $a_i - r_i$ for the additive variant), whereas in the case of maximization of objective $i$, $\epsilon(a_i, r_i) = r_i / a_i$ for the multiplicative variant (respectively, $r_i - a_i$ for the additive variant). This allows computing a single value for problems where some objectives are to be maximized while others are to be minimized. Moreover, a lower value corresponds to a better approximation set, independently of the type of problem (minimization, maximization or mixed). However, the meaning of the value is different for each objective type. For example, imagine that objective 1 is to be minimized and objective 2 is to be maximized, and the multiplicative epsilon computed here is $\epsilon(A, R) = 3$. This means that $A$ needs to be multiplied by 1/3 for all $a_1$ values and by 3 for all $a_2$ values in order to weakly dominate $R$. The multiplicative version is not meaningful for negative values.
Computation of the epsilon indicator requires $O(n \cdot |A| \cdot |R|)$, where $n$ is the number of objectives (dimension of vectors).
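The definition can be checked by hand with a direct base R transcription: for each reference point take the best (smallest) worst-case ratio or difference over the points of the set, then take the worst (largest) of those values over the reference set. The sketch below assumes minimisation of all objectives. The function name `eps_sketch` is hypothetical and this is not the package's implementation. The three matrices are the ones from Fig. 6 of Zitzler et al. (2003) used in the examples of this page.

```r
# Direct transcription of max over r, min over a, max over objectives,
# assuming every objective is minimised.
eps_sketch <- function(A, R, multiplicative = TRUE) {
  stopifnot(is.matrix(A), is.matrix(R), ncol(A) == ncol(R))
  per_ref <- vapply(seq_len(nrow(R)), function(j) {
    r <- R[j, ]
    # Worst objective-wise value of each point a with respect to r.
    worst <- apply(A, 1, function(a) {
      if (multiplicative) max(a / r) else max(a - r)
    })
    min(worst)  # the point of A that covers r best
  }, numeric(1))
  max(per_ref)  # the reference point that is covered worst
}

A1 <- matrix(c(9,2,8,4,7,5,5,6,4,7), ncol = 2, byrow = TRUE)
A2 <- matrix(c(8,4,7,5,5,6,4,7), ncol = 2, byrow = TRUE)
A3 <- matrix(c(10,4,9,5,8,6,7,7,6,8), ncol = 2, byrow = TRUE)

e13 <- eps_sketch(A1, A3)  # 9/10, as stated in the example comments
e12 <- eps_sketch(A1, A2)  # 1
e21 <- eps_sketch(A2, A1)  # 2
print(c(e13, e12, e21))
```

The two nested loops over the sets and the inner loop over objectives make the cost proportional to the number of objectives times the sizes of both sets.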
A single numerical value.
Manuel López-Ibáñez
Eckart Zitzler, Lothar Thiele, Marco Laumanns, Carlos M. Fonseca, Viviane Grunert da Fonseca (2003). “Performance Assessment of Multiobjective Optimizers: an Analysis and Review.” IEEE Transactions on Evolutionary Computation, 7(2), 117–132.
# Fig 6 from Zitzler et al. (2003).
A1 <- matrix(c(9,2,8,4,7,5,5,6,4,7), ncol=2, byrow=TRUE)
A2 <- matrix(c(8,4,7,5,5,6,4,7), ncol=2, byrow=TRUE)
A3 <- matrix(c(10,4,9,5,8,6,7,7,6,8), ncol=2, byrow=TRUE)

plot(A1, xlab=expression(f[1]), ylab=expression(f[2]),
     panel.first=grid(nx=NULL), pch=4, cex=1.5, xlim = c(0,10), ylim=c(0,8))
points(A2, pch=0, cex=1.5)
points(A3, pch=1, cex=1.5)
legend("bottomleft", legend=c("A1", "A2", "A3"), pch=c(4,0,1),
       pt.bg="gray", bg="white", bty = "n", pt.cex=1.5, cex=1.2)
epsilon_mult(A1, A3) # A1 epsilon-dominates A3 => e = 9/10 < 1
epsilon_mult(A1, A2) # A1 weakly dominates A2 => e = 1
epsilon_mult(A2, A1) # A2 is epsilon-dominated by A1 => e = 2 > 1

# A more realistic example
extdata_path <- system.file(package="eaf","extdata")
path.A1 <- file.path(extdata_path, "ALG_1_dat.xz")
path.A2 <- file.path(extdata_path, "ALG_2_dat.xz")
A1 <- read_datasets(path.A1)[,1:2]
A2 <- read_datasets(path.A2)[,1:2]
ref <- filter_dominated(rbind(A1, A2))
epsilon_additive(A1, ref)
epsilon_additive(A2, ref)
# Multiplicative version of epsilon metric
ref <- filter_dominated(rbind(A1, A2))
epsilon_mult(A1, ref)
epsilon_mult(A2, ref)
Two metaheuristic algorithms, TabuCol (Hertz and de Werra, 1987) and simulated annealing (Johnson et al., 1991), are used to find a good approximation of the chromatic number of two random graphs. This data set is provided only as an example of the use of eafplot for comparing algorithm performance with respect to both time and quality when modelled as two objectives in trade-off.
gcp2x2
A data frame with 3133 observations on the following 6 variables.
alg: a factor with levels SAKempeFI and TSinN1.
inst: a factor with levels DSJC500.5 and DSJC500.9. Instances are taken from the DIMACS repository.
run: a numeric vector indicating the run to which the observation belongs.
best: a numeric vector indicating the best solution, in number of colors, found in the corresponding run up to that time.
time: a numeric vector indicating the time since the beginning of the run for each observation. A rescaling is applied.
titer: a numeric vector indicating the iteration number corresponding to the observations.
Each algorithm was run 10 times per graph, registering the time and iteration number at which a new best solution was found. A time limit corresponding to 500*10^5 total iterations of TabuCol was imposed. The time was then normalized to a scale from 0 to 1 to make it instance-independent.
Marco Chiarandini (2005). Stochastic Local Search Methods for Highly Constrained Combinatorial Optimisation Problems. Ph.D. thesis, FB Informatik, TU Darmstadt, Germany. (page 138)
A. Hertz and D. de Werra. Using Tabu Search Techniques for Graph Coloring. Computing, 1987, 39(4), 345-351.
David S. Johnson, Cecilia R. Aragon, Lyle A. McGeoch, Catherine Schevon (1991). “Optimization by Simulated Annealing: An Experimental Evaluation: Part II, Graph Coloring and Number Partitioning.” Operations Research, 39(3), 378–406.
data(gcp2x2)
Computes the hypervolume contribution of each point given a set of points with respect to a given reference point assuming minimization of all objectives. Dominated points have zero contribution. Duplicated points have zero contribution even if not dominated, because removing one of them does not change the hypervolume dominated by the remaining set.
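The behaviour described above can be reproduced on a toy example with base R. The sketch computes the contribution of each point as the hypervolume of the whole set minus the hypervolume of the set without that point, for two objectives that are both minimised. The helpers `hv2d` and `hv_contrib_sketch` and the data are hypothetical, and this naive approach is only meant to illustrate the definition, not the algorithm used by the package.

```r
# Area dominated by `pts` (two columns, minimisation) and bounded by `ref`,
# computed with a left-to-right sweep over the points.
hv2d <- function(pts, ref) {
  pts <- pts[pts[, 1] < ref[1] & pts[, 2] < ref[2], , drop = FALSE]
  if (nrow(pts) == 0L) return(0)
  pts <- pts[order(pts[, 1], pts[, 2]), , drop = FALSE]
  area <- 0
  best_y <- ref[2]
  for (i in seq_len(nrow(pts))) {
    if (pts[i, 2] < best_y) {  # dominated and duplicated points add nothing
      area <- area + (ref[1] - pts[i, 1]) * (best_y - pts[i, 2])
      best_y <- pts[i, 2]
    }
  }
  area
}

# Contribution of point i: total hypervolume minus hypervolume without i.
hv_contrib_sketch <- function(pts, ref) {
  total <- hv2d(pts, ref)
  vapply(seq_len(nrow(pts)),
         function(i) total - hv2d(pts[-i, , drop = FALSE], ref),
         numeric(1))
}

# Rows 2 and 5 are duplicates; row 4 is dominated by row 2.
pts <- rbind(c(1, 3), c(2, 2), c(3, 1), c(3, 3), c(2, 2))
contrib <- hv_contrib_sketch(pts, ref = c(4, 4))
print(contrib)
```

Both copies of (2, 2) get zero contribution because removing either one leaves the other in place, and the dominated point (3, 3) also gets zero.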
hv_contributions(data, reference, maximise = FALSE)
data |
( |
reference |
( |
maximise |
( |
(numeric) A numerical vector
Manuel López-Ibáñez
Carlos M. Fonseca, Luís Paquete, Manuel López-Ibáñez (2006). “An improved dimension-sweep algorithm for the hypervolume indicator.” In Proceedings of the 2006 Congress on Evolutionary Computation (CEC 2006), 1157–1163. IEEE Press, Piscataway, NJ. doi:10.1109/CEC.2006.1688440.
Nicola Beume, Carlos M. Fonseca, Manuel López-Ibáñez, Luís Paquete, Jan Vahrenhold (2009). “On the complexity of computing the hypervolume indicator.” IEEE Transactions on Evolutionary Computation, 13(5), 1075–1082. doi:10.1109/TEVC.2009.2015575.
data(SPEA2minstoptimeRichmond)
# The second objective must be maximized
# We calculate the hypervolume contribution of each point of the union of all sets.
hv_contributions(SPEA2minstoptimeRichmond[, 1:2], reference = c(250, 0),
                 maximise = c(FALSE, TRUE))
# Duplicated points show zero contribution above, even if not
# dominated. However, filter_dominated removes all duplicates except
# one. Hence, there are more points below with nonzero contribution.
hv_contributions(filter_dominated(SPEA2minstoptimeRichmond[, 1:2], maximise = c(FALSE, TRUE)),
                 reference = c(250, 0), maximise = c(FALSE, TRUE))
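The definition of hypervolume contribution given above (the loss in hypervolume when one point is removed) can be illustrated with a minimal base R sketch. It assumes two objectives, minimisation of both, and a naive leave-one-out computation; the helper names hv2d and hv_contrib2d are hypothetical and this is not the algorithm used by the package.

```r
# Hypervolume of a 2-objective point set (minimisation) w.r.t. a reference point.
hv2d <- function(pts, ref) {
  # Points that do not strictly improve on the reference dominate no area.
  pts <- pts[pts[, 1] < ref[1] & pts[, 2] < ref[2], , drop = FALSE]
  if (nrow(pts) == 0) return(0)
  pts <- pts[order(pts[, 1], pts[, 2]), , drop = FALSE]
  area <- 0
  best_y <- ref[2]              # lowest second objective seen so far
  for (i in seq_len(nrow(pts))) {
    if (pts[i, 2] < best_y) {   # otherwise the point is dominated or duplicated
      area <- area + (ref[1] - pts[i, 1]) * (best_y - pts[i, 2])
      best_y <- pts[i, 2]
    }
  }
  area
}

# Contribution of each point: total hypervolume minus hypervolume without it.
hv_contrib2d <- function(pts, ref) {
  total <- hv2d(pts, ref)
  vapply(seq_len(nrow(pts)),
         function(i) total - hv2d(pts[-i, , drop = FALSE], ref),
         numeric(1))
}

# (3, 3) is dominated and (2, 2) appears twice.
pts <- rbind(c(1, 3), c(2, 2), c(3, 1), c(3, 3), c(2, 2))
ref <- c(4, 4)
contrib <- hv_contrib2d(pts, ref)
print(contrib)
```

In this sketch both copies of the duplicated point get zero contribution, because removing either copy leaves the other one in the set, which matches the behaviour described for hv_contributions().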
These data are provided solely as an example of using eafplot.
HybridGA
A list with two data frames, each of them with three columns, as produced by read_datasets().
$vanzyl
data frame of results on vanzyl network
$richmond
data frame of results on Richmond network. The second column is filled with NA
Manuel López-Ibáñez (2009). Operational Optimisation of Water Distribution Networks. Ph.D. thesis, School of Engineering and the Built Environment, Edinburgh Napier University, UK. https://lopez-ibanez.eu/publications#LopezIbanezPhD.
data(HybridGA)
print(HybridGA$vanzyl)
print(HybridGA$richmond)
Computes the hypervolume metric with respect to a given reference point assuming minimization of all objectives.
hypervolume(data, reference, maximise = FALSE)
data |
( |
reference |
( |
maximise |
( |
The algorithm has $O(n^{d-2} \log n)$ time and linear space complexity in the worst case, where $n$ is the number of points and $d$ the number of objectives, but experimental results show that the pruning techniques used may reduce the time complexity even further.
A single numerical value.
Manuel López-Ibáñez
Carlos M. Fonseca, Luís Paquete, Manuel López-Ibáñez (2006). “An improved dimension-sweep algorithm for the hypervolume indicator.” In Proceedings of the 2006 Congress on Evolutionary Computation (CEC 2006), 1157–1163. IEEE Press, Piscataway, NJ. doi:10.1109/CEC.2006.1688440.
Nicola Beume, Carlos M. Fonseca, Manuel López-Ibáñez, Luís Paquete, Jan Vahrenhold (2009). “On the complexity of computing the hypervolume indicator.” IEEE Transactions on Evolutionary Computation, 13(5), 1075–1082. doi:10.1109/TEVC.2009.2015575.
data(SPEA2minstoptimeRichmond)
# The second objective must be maximized
# We calculate the hypervolume of the union of all sets.
hypervolume(SPEA2minstoptimeRichmond[, 1:2], reference = c(250, 0),
            maximise = c(FALSE, TRUE))
Functions to compute the inverted generational distance (IGD and IGD+) and the averaged Hausdorff distance between nondominated sets of points.
igd(data, reference, maximise = FALSE)
igd_plus(data, reference, maximise = FALSE)
avg_hausdorff_dist(data, reference, maximise = FALSE, p = 1L)
data |
( |
reference |
( |
maximise |
( |
p |
( |
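The generational distance (GD) of a set $A$ is defined as the distance between each point $a \in A$ and the closest point $r$ in a reference set $R$, averaged over the size of $A$. Formally,
$$GD_p(A,R) = \left(\frac{1}{|A|}\sum_{a\in A}\min_{r\in R} d(a,r)^p\right)^{\frac{1}{p}}$$
where the distance in our implementation is the Euclidean distance over the $M$ objectives:
$$d(a,r) = \sqrt{\sum_{k=1}^M (a_k - r_k)^2}$$
The inverted generational distance (IGD) is calculated as $IGD_p(A,R) = GD_p(R,A)$.
The modified inverted generational distance (IGD+) was proposed by Ishibuchi et al. (2015) to ensure that IGD+ is weakly Pareto compliant, similarly to epsilon_additive() or epsilon_mult(). It modifies the distance measure as (assuming minimisation):
$$d^+(r,a) = \sqrt{\sum_{k=1}^M (\max\{a_k - r_k, 0\})^2}$$
The average Hausdorff distance ($\Delta_p$) was proposed by Schütze et al. (2012) and it is calculated as:
$$\Delta_p(A,R) = \max\{GD_p(A,R), IGD_p(A,R)\}$$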
IGDX (Zhou et al. 2009) is the application of IGD to decision vectors instead of objective vectors to measure closeness and diversity in decision space. One can use the functions igd() or igd_plus() (recommended) directly, just passing the decision vectors as data.
There are different formulations of the GD and IGD metrics in the literature that differ on the value of $p$, on the distance metric used and on whether the term $|A|^{-1}$ is inside (as above) or outside the exponent $1/p$. GD was first proposed by Van Veldhuizen and Lamont (1998) with $p=2$ and the term $|A|^{-1}$ outside the exponent. IGD seems to have been mentioned first by Coello Coello and Reyes-Sierra (2004); however, some people also used the name D-metric for the same concept with $p=1$ and later papers have often used IGD/GD with $p=1$. Schütze et al. (2012) proposed to place the term $|A|^{-1}$ inside the exponent, as in the formulation shown above. This has a significant effect for GD and less so for IGD given a constant reference set. IGD+ also follows this formulation. We refer to Ishibuchi et al. (2015) and Bezerra et al. (2017) for a more detailed historical perspective and a comparison of the various variants.
Following Ishibuchi et al. (2015), we always use $p=1$ in our implementation of IGD and IGD+ because (1) it is the setting most used in recent works; (2) it makes irrelevant whether the term $|A|^{-1}$ is inside or outside the exponent $1/p$; and (3) the meaning of IGD becomes the average Euclidean distance from each reference point to its nearest objective vector. It is also slightly faster to compute.
GD should never be used directly to compare the quality of approximations to a Pareto front, as it often contradicts Pareto optimality (it is not weakly Pareto-compliant). We recommend IGD+ instead of IGD, since the latter contradicts Pareto optimality in some cases (see examples below) whereas IGD+ is weakly Pareto-compliant, but we implement IGD here because it is still popular due to historical reasons.
The average Hausdorff distance ($\Delta_p$) is also not weakly Pareto-compliant, as shown in the examples below.
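As an illustration of the difference between IGD and IGD+ with p = 1, the following base R sketch averages, over the reference points, the distance to the nearest point of the approximation set. It assumes minimisation of all objectives; for IGD+ only the objectives in which the approximation point is worse than the reference point count towards the distance. The function name igd_sketch is hypothetical and the code is not the package implementation.

```r
# Average distance from each reference point (rows of `ref`) to its nearest
# point in `data`. With plus = TRUE the IGD+ distance is used instead of the
# Euclidean distance (minimisation assumed).
igd_sketch <- function(data, ref, plus = FALSE) {
  nearest <- apply(ref, 1, function(r) {
    diff <- sweep(data, 2, r)        # a_k - r_k for every point a in data
    if (plus) diff <- pmax(diff, 0)  # keep only the objectives where a is worse
    min(sqrt(rowSums(diff^2)))
  })
  mean(nearest)                      # p = 1: plain average
}

# Example 4 from Ishibuchi et al. (2015): A is better than B in Pareto terms.
ref <- matrix(c(10,0,6,1,2,2,1,6,0,10), ncol = 2, byrow = TRUE)
A <- matrix(c(4,2,3,3,2,4), ncol = 2, byrow = TRUE)
B <- matrix(c(8,2,4,4,2,8), ncol = 2, byrow = TRUE)
cat("IGD:  A =", igd_sketch(A, ref), " B =", igd_sketch(B, ref), "\n")
cat("IGD+: A =", igd_sketch(A, ref, plus = TRUE),
    " B =", igd_sketch(B, ref, plus = TRUE), "\n")
```

On these sets the sketch ranks B ahead of A under IGD but A ahead of B under IGD+, which is the contradiction with Pareto optimality that motivates recommending IGD+.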
(numeric(1)) A single numerical value.
Manuel López-Ibáñez
Leonardo C. T. Bezerra, Manuel López-Ibáñez, Thomas Stützle (2017). “An Empirical Assessment of the Properties of Inverted Generational Distance Indicators on Multi- and Many-objective Optimization.” In Heike Trautmann, Günter Rudolph, Kathrin Klamroth, Oliver Schütze, Margaret M. Wiecek, Yaochu Jin, Christian Grimme (eds.), Evolutionary Multi-criterion Optimization, EMO 2017, Lecture Notes in Computer Science, 31–45. Springer International Publishing, Cham, Switzerland. doi:10.1007/978-3-319-54157-0_3.
Carlos A. Coello Coello, Margarita Reyes-Sierra (2004). “A Study of the Parallelization of a Coevolutionary Multi-objective Evolutionary Algorithm.” In Raúl Monroy, Gustavo Arroyo-Figueroa, Luis Enrique Sucar, Humberto Sossa (eds.), Proceedings of MICAI, volume 2972 of Lecture Notes in Artificial Intelligence, 688–697. Springer, Heidelberg, Germany.
Hisao Ishibuchi, Hiroyuki Masuda, Yuki Tanigaki, Yusuke Nojima (2015). “Modified Distance Calculation in Generational Distance and Inverted Generational Distance.” In António Gaspar-Cunha, Carlos Henggeler Antunes, Carlos A. Coello Coello (eds.), Evolutionary Multi-criterion Optimization, EMO 2015 Part I, volume 9018 of Lecture Notes in Computer Science, 110–125. Springer, Heidelberg, Germany.
Oliver Schütze, X Esquivel, A Lara, Carlos A. Coello Coello (2012). “Using the Averaged Hausdorff Distance as a Performance Measure in Evolutionary Multiobjective Optimization.” IEEE Transactions on Evolutionary Computation, 16(4), 504–522.
David A. Van Veldhuizen, Gary B. Lamont (1998). “Evolutionary Computation and Convergence to a Pareto Front.” In John R. Koza (ed.), Late Breaking Papers at the Genetic Programming 1998 Conference, 221–228.
A Zhou, Qingfu Zhang, Yaochu Jin (2009). “Approximating the set of Pareto-optimal solutions in both the decision and objective spaces by an estimation of distribution algorithm.” IEEE Transactions on Evolutionary Computation, 13(5), 1167–1189. doi:10.1109/TEVC.2009.2021467.
# Example 4 from Ishibuchi et al. (2015)
ref <- matrix(c(10,0,6,1,2,2,1,6,0,10), ncol=2, byrow=TRUE)
A <- matrix(c(4,2,3,3,2,4), ncol=2, byrow=TRUE)
B <- matrix(c(8,2,4,4,2,8), ncol=2, byrow=TRUE)
plot(ref, xlab=expression(f[1]), ylab=expression(f[2]),
     panel.first=grid(nx=NULL), pch=23, bg="gray", cex=1.5)
points(A, pch=1, cex=1.5)
points(B, pch=19, cex=1.5)
legend("topright", legend=c("Reference", "A", "B"), pch=c(23,1,19),
       pt.bg="gray", bg="white", bty = "n", pt.cex=1.5, cex=1.2)
cat("A is better than B in terms of Pareto optimality,\n however, IGD(A)=",
    igd(A, ref), "> IGD(B)=", igd(B, ref),
    "and AvgHausdorff(A)=", avg_hausdorff_dist(A, ref),
    "> AvgHausdorff(B)=", avg_hausdorff_dist(B, ref),
    ", which both contradict Pareto optimality.\nBy contrast, IGD+(A)=",
    igd_plus(A, ref), "< IGD+(B)=", igd_plus(B, ref), ", which is correct.\n")
# A less trivial example.
extdata_path <- system.file(package="eaf","extdata")
path.A1 <- file.path(extdata_path, "ALG_1_dat.xz")
path.A2 <- file.path(extdata_path, "ALG_2_dat.xz")
A1 <- read_datasets(path.A1)[,1:2]
A2 <- read_datasets(path.A2)[,1:2]
ref <- filter_dominated(rbind(A1, A2))
igd(A1, ref)
igd(A2, ref)
# IGD+ (Pareto compliant)
igd_plus(A1, ref)
igd_plus(A2, ref)
# Average Hausdorff distance
avg_hausdorff_dist(A1, ref)
avg_hausdorff_dist(A2, ref)
Identify nondominated points with is_nondominated and remove dominated ones with filter_dominated.
pareto_rank() ranks points according to Pareto-optimality, which is also called nondominated sorting (Deb et al. 2002).
is_nondominated(data, maximise = FALSE, keep_weakly = FALSE)
filter_dominated(data, maximise = FALSE, keep_weakly = FALSE)
pareto_rank(data, maximise = FALSE)
data |
( |
maximise |
( |
keep_weakly |
If |
pareto_rank() is meant to be used like rank(), but it assigns ranks according to Pareto dominance. Duplicated points are kept on the same front. When ncol(data) == 2, the code uses the algorithm by Jensen (2003).
is_nondominated returns a logical vector of the same length as the number of rows of data, where TRUE means that the point is not dominated by any other point.
filter_dominated returns a matrix or data.frame with only mutually nondominated points.
pareto_rank() returns an integer vector of the same length as the number of rows of data, where each value gives the rank of each point.
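To make the notions of dominance and Pareto rank concrete, here is a minimal base R sketch that assumes minimisation of all objectives and uses a naive quadratic comparison. The function names are hypothetical; the code is neither the package implementation nor the algorithm by Jensen (2003). Duplicated points are treated here as mutually nondominated, which may differ from what the package functions return depending on keep_weakly.

```r
# TRUE for rows of x that no other row dominates (minimisation assumed).
is_nondom_sketch <- function(x) {
  n <- nrow(x)
  vapply(seq_len(n), function(i) {
    # Row j dominates row i if it is no worse everywhere and better somewhere.
    dominated_by <- vapply(seq_len(n), function(j) {
      all(x[j, ] <= x[i, ]) && any(x[j, ] < x[i, ])
    }, logical(1))
    !any(dominated_by)
  }, logical(1))
}

# Nondominated sorting by repeatedly peeling off the current first front.
pareto_rank_sketch <- function(x) {
  ranks <- integer(nrow(x))
  remaining <- seq_len(nrow(x))
  front <- 1L
  while (length(remaining) > 0) {
    nd <- is_nondom_sketch(x[remaining, , drop = FALSE])
    ranks[remaining[nd]] <- front
    remaining <- remaining[!nd]
    front <- front + 1L
  }
  ranks
}

pts <- rbind(c(1, 4), c(2, 2), c(4, 1), c(3, 3), c(4, 4), c(2, 2))
print(is_nondom_sketch(pts))
print(pareto_rank_sketch(pts))
```

The peeling loop always terminates because every nonempty finite set has at least one nondominated point, so each pass removes at least one row.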
Manuel López-Ibáñez
Kalyanmoy Deb, A Pratap, S Agarwal, T Meyarivan (2002). “A fast and elitist multi-objective genetic algorithm: NSGA-II.” IEEE Transactions on Evolutionary Computation, 6(2), 182–197. doi:10.1109/4235.996017.
M T Jensen (2003). “Reducing the run-time complexity of multiobjective EAs: The NSGA-II and other algorithms.” IEEE Transactions on Evolutionary Computation, 7(5), 503–515.
path_A1 <- file.path(system.file(package="eaf"),"extdata","ALG_1_dat.xz")
set <- read_datasets(path_A1)[,1:2]
is_nondom <- is_nondominated(set)
cat("There are ", sum(is_nondom), " nondominated points\n")
plot(set, col = "blue", type = "p", pch = 20)
ndset <- filter_dominated(set)
points(ndset[order(ndset[,1]),], col = "red", pch = 21)
ranks <- pareto_rank(set)
colors <- colorRampPalette(c("red","yellow","springgreen","royalblue"))(max(ranks))
plot(set, col = colors[ranks], type = "p", pch = 20)
Given a list of datasets, return the indexes of the pair with the largest EAF differences according to the method proposed by Diaz and López-Ibáñez (2021).
largest_eafdiff(data, maximise = FALSE, intervals = 5, reference, ideal = NULL)
data |
( |
maximise |
( |
intervals |
( |
reference |
( |
ideal |
( |
(list()) A list with two components pair and value.
Juan Esteban Diaz, Manuel López-Ibáñez (2021). “Incorporating Decision-Maker's Preferences into the Automatic Configuration of Bi-Objective Optimisation Algorithms.” European Journal of Operational Research, 289(3), 1209–1222. doi:10.1016/j.ejor.2020.07.059.
# FIXME: This example is too large, we need a smaller one.
files <- c("wrots_l100w10_dat","wrots_l10w100_dat")
data <- lapply(files, function(x)
               read_datasets(file.path(system.file(package="eaf"), "extdata", x)))
nadir <- apply(do.call(rbind, data)[,1:2], 2, max)
x <- largest_eafdiff(data, reference = nadir)
str(x)
Normalise points per coordinate to a range, e.g., c(1,2), where the minimum value will correspond to 1 and the maximum to 2. If bounds are given, they are used for the normalisation.
normalise(data, to_range = c(1, 2), lower = NA, upper = NA, maximise = FALSE)
data |
( |
to_range |
Normalise values to this range. If the objective is
maximised, it is normalised to |
lower , upper
|
Bounds on the values. If NA, the maximum and minimum values of each coordinate are used. |
maximise |
( |
A numerical matrix
Manuel López-Ibáñez
data(SPEA2minstoptimeRichmond)
# The second objective must be maximized
head(SPEA2minstoptimeRichmond[, 1:2])
head(normalise(SPEA2minstoptimeRichmond[, 1:2], maximise = c(FALSE, TRUE)))
head(normalise(SPEA2minstoptimeRichmond[, 1:2], to_range = c(0,1), maximise = c(FALSE, TRUE)))
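The per-coordinate rescaling described for normalise() can be sketched in base R as a linear map from [lower, upper] to to_range. This sketch assumes that all objectives are minimised and that every column has a nonzero range; the function name normalise_sketch is hypothetical and the handling of maximised objectives in the package is not reproduced here.

```r
# Map each column of x linearly so that `lower` goes to to_range[1] and
# `upper` goes to to_range[2]. Defaults use the observed column minima/maxima.
normalise_sketch <- function(x, to_range = c(1, 2),
                             lower = apply(x, 2, min),
                             upper = apply(x, 2, max)) {
  # First rescale every column to [0, 1] ...
  scaled <- sweep(sweep(x, 2, lower, "-"), 2, upper - lower, "/")
  # ... then stretch and shift to the requested range.
  to_range[1] + scaled * (to_range[2] - to_range[1])
}

x <- rbind(c(10, 200), c(20, 100), c(15, 150))
print(normalise_sketch(x))
print(normalise_sketch(x, to_range = c(0, 1)))
```

Passing explicit lower and upper bounds instead of the observed extremes keeps the normalisation comparable across several data sets.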
Remove whitespace margins using https://ctan.org/pkg/pdfcrop and optionally embed fonts using grDevices::embedFonts(). You may install pdfcrop using TinyTeX (https://cran.r-project.org/package=tinytex) with tinytex::tlmgr_install('pdfcrop').
pdf_crop(
  filename,
  mustWork = FALSE,
  pdfcrop = Sys.which("pdfcrop"),
  embed_fonts = FALSE
)
filename |
Filename of a PDF file to crop. The file will be overwritten. |
mustWork |
If |
pdfcrop |
Path to the |
embed_fonts |
( |
You may also wish to consider extrafont::embed_fonts() (https://cran.r-project.org/package=extrafont).
library(extrafont)
# If you need to specify the path to Ghostscript (probably not needed in Linux)
Sys.setenv(R_GSCMD = "C:/Program Files/gs/gs9.56.1/bin/gswin64c.exe")
embed_fonts("original.pdf", outfile = "new.pdf")
As an alternative, saving the PDF with grDevices::cairo_pdf() should already embed the fonts.
Nothing
grDevices::embedFonts()
extrafont::embed_fonts()
grDevices::cairo_pdf()
## Not run:
extdata_path <- system.file(package = "eaf", "extdata")
A1 <- read_datasets(file.path(extdata_path, "wrots_l100w10_dat"))
A2 <- read_datasets(file.path(extdata_path, "wrots_l10w100_dat"))
pdf(file = "eaf.pdf", onefile = TRUE, width = 5, height = 4)
eafplot(list(A1 = A1, A2 = A2), percentiles = 50, sci.notation=TRUE)
dev.off()
pdf_crop("eaf.pdf")
## End(Not run)
Reads a text file in table format and creates a matrix from it. The file may contain several sets, separated by empty lines. Lines starting with '#' are considered comments and treated as empty lines. The function adds an additional column set to indicate to which set each row belongs.
read_datasets(file, col_names, text)
read.data.sets(file, col.names)
file |
( |
col_names , col.names
|
Vector of optional names for the variables. The default is to use ‘"V"’ followed by the column number. |
text |
( |
(matrix()) containing a representation of the data in the file. An extra column set is added to indicate to which set each row belongs.
A known limitation is that the input file must use newline characters native to the host system, otherwise they will be, possibly silently, misinterpreted. In GNU/Linux the program dos2unix may be used to fix newline characters.
There are several examples of data sets in system.file(package="eaf","extdata").
read.data.sets() is a deprecated alias. It will be removed in the next major release.
Manuel López-Ibáñez
read.table, eafplot(), eafdiffplot()
extdata_path <- system.file(package="eaf","extdata")
A1 <- read_datasets(file.path(extdata_path,"ALG_1_dat.xz"))
str(A1)
read_datasets(text="1 2\n3 4\n\n5 6\n7 8\n", col_names=c("obj1", "obj2"))
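The file format described above (whitespace-separated numbers, sets separated by empty lines, comment lines starting with '#') can be illustrated with a small base R parser. This is only a sketch of the format under the assumption that every data line has the same number of columns; parse_sets_sketch is a hypothetical name and the package reader is implemented differently.

```r
# Parse a string in the multi-set text format into a numeric matrix with an
# extra column `set`. Comment lines count as empty lines, and each run of
# consecutive data lines forms one set.
parse_sets_sketch <- function(text) {
  lines <- strsplit(text, "\n", fixed = TRUE)[[1]]
  is_blank <- grepl("^\\s*(#.*)?$", lines)
  # A new set starts at a data line that is first or follows a blank line.
  starts <- !is_blank & c(TRUE, head(is_blank, -1))
  set_id <- cumsum(starts)
  kept <- trimws(lines[!is_blank])
  values <- do.call(rbind, lapply(strsplit(kept, "\\s+"), as.numeric))
  colnames(values) <- paste0("V", seq_len(ncol(values)))
  cbind(values, set = set_id[!is_blank])
}

m <- parse_sets_sketch("1 2\n3 4\n\n5 6\n7 8\n")
print(m)
# Comment lines also separate sets in this sketch.
m2 <- parse_sets_sketch("# first\n1 2\n# second\n3 4")
print(m2)
```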
These data are provided solely as an example of using eafplot.
SPEA2minstoptimeRichmond
A data frame as produced by read_datasets(). The second column measures time in seconds and corresponds to a maximisation problem.
Manuel López-Ibáñez (2009). Operational Optimisation of Water Distribution Networks. Ph.D. thesis, School of Engineering and the Built Environment, Edinburgh Napier University, UK. https://lopez-ibanez.eu/publications#LopezIbanezPhD.
data(HybridGA)
data(SPEA2minstoptimeRichmond)
SPEA2minstoptimeRichmond[,2] <- SPEA2minstoptimeRichmond[,2] / 60
eafplot (SPEA2minstoptimeRichmond, xlab = expression(C[E]),
         ylab = "Minimum idle time (minutes)", maximise = c(FALSE, TRUE),
         las = 1, log = "y", legend.pos = "bottomright")
These data are provided solely as an example of using eafplot.
SPEA2relativeRichmond
A data frame as produced by read_datasets().
Manuel López-Ibáñez (2009). Operational Optimisation of Water Distribution Networks. Ph.D. thesis, School of Engineering and the Built Environment, Edinburgh Napier University, UK. https://lopez-ibanez.eu/publications#LopezIbanezPhD.
data(HybridGA)
data(SPEA2relativeRichmond)
eafplot (SPEA2relativeRichmond, percentiles = c(25, 50, 75),
         xlab = expression(C[E]), ylab = "Total switches",
         xlim = c(90, 140), ylim = c(0, 25),
         extra.points = HybridGA$richmond, extra.lty = "dashed",
         extra.legend = "Hybrid GA")
These data are provided solely as an example of using eafplot.
SPEA2relativeVanzyl
A data frame as produced by read_datasets().
Manuel López-Ibáñez (2009). Operational Optimisation of Water Distribution Networks. Ph.D. thesis, School of Engineering and the Built Environment, Edinburgh Napier University, UK. https://lopez-ibanez.eu/publications#LopezIbanezPhD.
data(HybridGA)
data(SPEA2relativeVanzyl)
eafplot(SPEA2relativeVanzyl, percentiles = c(25, 50, 75),
        xlab = expression(C[E]), ylab = "Total switches", xlim = c(320, 400),
        extra.points = HybridGA$vanzyl, extra.legend = "Hybrid GA")
Compute the Vorob'ev threshold, expectation and deviation. The symmetric deviation function can also be displayed; it is the probability for a given target in the objective space to belong to the symmetric difference between the Vorob'ev expectation and a realization of the (random) attained set.
vorobT(x, reference)
vorobDev(x, VE, reference)
symDifPlot(
  x,
  VE,
  threshold,
  nlevels = 11,
  ve.col = "blue",
  xlim = NULL,
  ylim = NULL,
  legend.pos = "topright",
  main = "Symmetric deviation function",
  col.fun = function(n) gray(seq(0, 0.9, length.out = n)^2)
)
x |
Either a matrix of data values, or a data frame, or a list of data frames of exactly three columns. The third column gives the set (run, sample, ...) identifier. |
reference |
( |
VE , threshold
|
Vorob'ev expectation and threshold, e.g., as returned
by |
nlevels |
number of levels into which the range of the symmetric deviation is divided. |
ve.col |
plotting parameters for the Vorob'ev expectation. |
xlim , ylim , main
|
Graphical parameters, see
|
legend.pos |
the position of the legend, see
|
col.fun |
function that creates a vector of |
vorobT returns a list with elements threshold, VE, and avg_hyp (average hypervolume).
vorobDev returns the Vorob'ev deviation.
Mickael Binois
M Binois, D Ginsbourger, O Roustant (2015). “Quantifying uncertainty on Pareto fronts with Gaussian process conditional simulations.” European Journal of Operational Research, 243(2), 386–394. doi:10.1016/j.ejor.2014.07.032.
C. Chevalier (2013), Fast uncertainty reduction strategies relying on Gaussian process models, University of Bern, PhD thesis.
I. Molchanov (2005), Theory of random sets, Springer.
data(CPFs)
res <- vorobT(CPFs, reference = c(2, 200))
print(res$threshold)
## Display Vorob'ev expectation and attainment function
# First style
eafplot(CPFs[,1:2], sets = CPFs[,3], percentiles = c(0, 25, 50, 75, 100, res$threshold),
        main = substitute(paste("Empirical attainment function, ",beta,"* = ", a, "%"),
                          list(a = formatC(res$threshold, digits = 2, format = "f"))))
# Second style
eafplot(CPFs[,1:2], sets = CPFs[,3], percentiles = c(0, 20, 40, 60, 80, 100),
        col = gray(seq(0.8, 0.1, length.out = 6)^0.5), type = "area",
        legend.pos = "bottomleft", extra.points = res$VE, extra.col = "cyan",
        extra.legend = "VE", extra.lty = "solid", extra.pch = NA, extra.lwd = 2,
        main = substitute(paste("Empirical attainment function, ",beta,"* = ", a, "%"),
                          list(a = formatC(res$threshold, digits = 2, format = "f"))))
# Now print Vorob'ev deviation
VD <- vorobDev(CPFs, res$VE, reference = c(2, 200))
print(VD)
# Now display the symmetric deviation function.
symDifPlot(CPFs, res$VE, res$threshold, nlevels = 11)
# Levels are adjusted automatically if too large.
symDifPlot(CPFs, res$VE, res$threshold, nlevels = 200, legend.pos = "none")
# Use a different palette.
symDifPlot(CPFs, res$VE, res$threshold, nlevels = 11, col.fun = heat.colors)
Return an estimation of the hypervolume of the space dominated by the input data following the procedure described by Auger et al. (2009). A weight distribution describing user preferences may be specified.
whv_hype(
  data,
  reference,
  ideal,
  maximise = FALSE,
  dist = list(type = "uniform"),
  nsamples = 100000L
)
data |
( |
reference |
( |
ideal |
( |
maximise |
( |
dist |
( |
nsamples |
( |
The current implementation only supports 2 objectives.
A weight distribution (Auger et al. 2009) can be provided via the dist argument. The ones currently supported are:
type="uniform" corresponds to the default hypervolume (unweighted).
type="point" describes a goal in the objective space, where mu gives the coordinates of the goal. The resulting weight distribution is a multivariate normal distribution centred at the goal.
type="exponential" describes an exponential distribution with rate parameter 1/mu, i.e., $\lambda = \frac{1}{\mu}$.
A single numerical value.
Anne Auger, Johannes Bader, Dimo Brockhoff, Eckart Zitzler (2009). “Articulating User Preferences in Many-Objective Problems by Sampling the Weighted Hypervolume.” In Franz Rothlauf (ed.), Proceedings of the Genetic and Evolutionary Computation Conference, GECCO 2009, 555–562. ACM Press, New York, NY.
read_datasets(), eafdiff(), whv_rect()
whv_hype (matrix(2, ncol=2), reference = 4, ideal = 1)
whv_hype (matrix(c(3,1), ncol=2), reference = 4, ideal = 1)
whv_hype (matrix(2, ncol=2), reference = 4, ideal = 1,
          dist = list(type="exponential", mu=0.2))
whv_hype (matrix(c(3,1), ncol=2), reference = 4, ideal = 1,
          dist = list(type="exponential", mu=0.2))
whv_hype (matrix(2, ncol=2), reference = 4, ideal = 1,
          dist = list(type="point", mu=c(1,1)))
whv_hype (matrix(c(3,1), ncol=2), reference = 4, ideal = 1,
          dist = list(type="point", mu=c(1,1)))
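The idea of estimating the hypervolume by sampling can be sketched in base R for the unweighted (uniform) case: draw points uniformly in the box spanned by the ideal and reference points and multiply the fraction of dominated samples by the area of the box. The sketch assumes two objectives and minimisation; mc_hv_sketch is a hypothetical name, the seed is arbitrary, and the weighted variants described by Auger et al. (2009) are not covered.

```r
# Monte Carlo estimate of the 2-objective hypervolume (minimisation assumed).
mc_hv_sketch <- function(data, reference, ideal, nsamples = 100000L) {
  # Uniform samples in the box [ideal, reference].
  s1 <- runif(nsamples, ideal[1], reference[1])
  s2 <- runif(nsamples, ideal[2], reference[2])
  dominated <- logical(nsamples)
  for (i in seq_len(nrow(data))) {
    # A sample is dominated if some point is no worse in both objectives.
    dominated <- dominated | (s1 >= data[i, 1] & s2 >= data[i, 2])
  }
  # Fraction of dominated samples times the area of the sampling box.
  mean(dominated) * prod(reference - ideal)
}

set.seed(42)
est <- mc_hv_sketch(matrix(2, ncol = 2), reference = c(4, 4), ideal = c(1, 1))
print(est)  # the exact dominated area for this single point is (4 - 2) * (4 - 2) = 4
```

The estimate converges to the exact value as nsamples grows, at the usual Monte Carlo rate proportional to one over the square root of the number of samples.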
Calculates the hypervolume weighted by a set of rectangles (with zero weight outside the rectangles). The function total_whv_rect() calculates the total weighted hypervolume as hypervolume() + scalefactor * abs(prod(reference - ideal)) * whv_rect(). The details of the computation are given by Diaz and López-Ibáñez (2021).
whv_rect(data, rectangles, reference, maximise = FALSE)
total_whv_rect(
  data,
  rectangles,
  reference,
  maximise = FALSE,
  ideal = NULL,
  scalefactor = 0.1
)
data |
( |
rectangles |
( |
reference |
( |
maximise |
( |
ideal |
( |
scalefactor |
( |
TODO
A single numerical value.
Juan Esteban Diaz, Manuel López-Ibáñez (2021). “Incorporating Decision-Maker's Preferences into the Automatic Configuration of Bi-Objective Optimisation Algorithms.” European Journal of Operational Research, 289(3), 1209–1222. doi:10.1016/j.ejor.2020.07.059.
read_datasets(), eafdiff(), choose_eafdiff(), whv_hype()
rectangles <- as.matrix(read.table(header=FALSE, text='
 1.0  3.0  2.0  Inf    1
 2.0  3.5  2.5  Inf    2
 2.0  3.0  3.0  3.5    3
'))
whv_rect (matrix(2, ncol=2), rectangles, reference = 6)
whv_rect (matrix(c(2, 1), ncol=2), rectangles, reference = 6)
whv_rect (matrix(c(1, 2), ncol=2), rectangles, reference = 6)
total_whv_rect (matrix(2, ncol=2), rectangles, reference = 6, ideal = c(1,1))
total_whv_rect (matrix(c(2, 1), ncol=2), rectangles, reference = 6, ideal = c(1,1))
total_whv_rect (matrix(c(1, 2), ncol=2), rectangles, reference = 6, ideal = c(1,1))
Write data sets to a file in the same format as read_datasets().
write_datasets(x, file = "")
x |
The data set to write. The last column must be the set number. |
file |
either a character string naming a file or a connection open for
writing. |
x <- read_datasets(text="1 2\n3 4\n\n5 6\n7 8\n", col_names=c("obj1", "obj2"))
write_datasets(x)