---
title: "Benchmarking estimation speed against other packages"
output: rmarkdown::html_vignette
vignette: >
  %\VignetteIndexEntry{Benchmarking estimation speed against other packages}
  %\VignetteEngine{knitr::rmarkdown}
  \usepackage[utf8]{inputenc}
bibliography: "`r here::here('vignettes', 'library.bib')`"
---

```{r setup, include=FALSE}
knitr::opts_chunk$set(
  warning = FALSE,
  message = FALSE,
  fig.width = 7.252,
  fig.height = 4,
  comment = "#>",
  fig.retina = 3
)
```

{logitr} is faster than most other packages with similar functionality. To demonstrate this, a benchmark was conducted by estimating the same preference space mixed logit model using the following R packages:

- {logitr}
- {mixl}
- {mlogit}
- {gmnl}
- {apollo}

The benchmark can be viewed at this Google Colab notebook:

https://colab.research.google.com/drive/1vYlBdJd4xCV43UwJ33XXpO3Ys8xWkuxx?usp=sharing

Run times vary from one run of the benchmark to the next, even on the same machine, because of variation in background processes. If you run this code yourself on a different machine, your results will differ, though the ordering of the packages and the trends in their relative speeds should be similar to those from the Colab notebook.

## Comparing run times

The {logitr} package includes a `runtimes` data frame that was exported from the [Google Colab](https://colab.research.google.com/drive/1vYlBdJd4xCV43UwJ33XXpO3Ys8xWkuxx?usp=sharing) notebook used to conduct the benchmark. The tables below summarize the run time of each package and how many times slower each one is than {logitr}.

```{r}
library(logitr)
library(dplyr)
library(tidyr)
library(kableExtra) # For tables

numDraws <- unique(runtimes$numDraws)

logitr_time <- runtimes %>%
  filter(package == "logitr") %>%
  rename(time_logitr = time_sec)

time_compare <- runtimes %>%
  left_join(select(logitr_time, -package), by = "numDraws") %>%
  mutate(mult = round(time_sec / time_logitr, 1)) %>%
  select(-time_logitr)

# Compare raw times
time_compare %>%
  select(-mult) %>%
  pivot_wider(names_from = numDraws, values_from = time_sec) %>%
  kbl()

# Compare how many times slower compared to logitr
time_compare %>%
  select(-time_sec) %>%
  pivot_wider(names_from = numDraws, values_from = mult) %>%
  kbl()
```

The code below plots the relative run times from the Colab notebook.

```{r, eval=FALSE}
library(ggplot2)
library(ggrepel)

plotColors <- c("black", RColorBrewer::brewer.pal(n = 5, name = "Set1"), "gold")

benchmark <- runtimes %>%
  ggplot(aes(x = numDraws, y = time_sec, color = package)) +
  geom_line() +
  geom_point() +
  geom_text_repel(
    data = . %>% filter(numDraws == max(numDraws)),
    aes(label = package),
    hjust = 0,
    nudge_x = 40,
    direction = "y",
    size = 4.5,
    segment.size = 0
  ) +
  scale_x_continuous(
    limits = c(0, 1200),
    breaks = numDraws,
    labels = scales::comma
  ) +
  scale_y_continuous(limits = c(0, 300), breaks = seq(0, 300, 100)) +
  scale_color_manual(values = plotColors) +
  guides(
    point = guide_legend(override.aes = list(label = "")),
    color = guide_legend(override.aes = list(label = ""))
  ) +
  theme_bw(base_size = 18) +
  theme(
    panel.grid.minor = element_blank(),
    panel.grid.major.x = element_blank(),
    legend.position = "none",
    axis.line.x = element_blank(),
    axis.ticks.x = element_blank()
  ) +
  labs(
    x = "Number of random draws",
    y = "Computation time (seconds)"
  )

benchmark
```

```{r, eval=FALSE, echo=FALSE}
ggsave('benchmark.png', benchmark, width = 8.5, height = 6)
```

```{r probabilities, echo=FALSE}
knitr::include_graphics('benchmark.png')
```
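
Because run times vary from run to run, it can help to time several repetitions before comparing packages on your own machine. The sketch below is one minimal way to do that with base R's `system.time()`. It is not the code used in the Colab notebook: `fit_model()` and `time_runs()` are hypothetical helpers introduced only for illustration, and the `glm()` call is a placeholder for whichever estimation call you want to time.

```r
# Hypothetical stand-in for a model-estimation call. Replace the body with
# the estimation function you want to time (the glm() fit is a placeholder).
fit_model <- function() {
  x <- rnorm(1e4)
  y <- rbinom(1e4, 1, plogis(0.5 * x))
  glm(y ~ x, family = binomial())
}

# Hypothetical helper: call f() `reps` times and return the elapsed
# (wall-clock) seconds of each call
time_runs <- function(f, reps = 5) {
  vapply(
    seq_len(reps),
    function(i) system.time(f())[["elapsed"]],
    numeric(1)
  )
}

set.seed(123)
elapsed <- time_runs(fit_model, reps = 5)

# Summarize the spread across repetitions
c(min = min(elapsed), median = median(elapsed), max = max(elapsed))
```

Reporting the median alongside the minimum and maximum gives a rough sense of how much of any difference between packages could be run-to-run noise.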