Benchmarking estimation speed against other packages

{logitr} is faster than most other packages with similar functionality. To demonstrate this, a benchmark was conducted by estimating the same preference space mixed logit model using the following R packages:

  • {logitr}
  • {mixl}
  • {mlogit}
  • {gmnl}
  • {apollo}

The benchmark can be viewed at this Google Colab notebook:

https://colab.research.google.com/drive/1vYlBdJd4xCV43UwJ33XXpO3Ys8xWkuxx?usp=sharing

Benchmarks will always vary for every run of a benchmarking code, even when run on the same machine due to variations in background processes. Thus, if you run this code yourself on a different machine, your results may vary, though the overall order and trends in terms of each package’s relative speed should be similar to those from the Colab notebook.

Comparing run times

The {logitr} package includes a runtimes data frame that is exported from the Google Colab notebook used to conduct the benchmark. The tables below summarize the run times for each package and how many times slower they are relative to {logitr}.

library(logitr)
library(dplyr)
library(tidyr)
library(kableExtra) # For tables

numDraws <- unique(runtimes$numDraws)
logitr_time <- runtimes %>%
    filter(package == "logitr") %>%
    rename(time_logitr = time_sec)
time_compare <- runtimes %>%
    left_join(select(logitr_time, -package), by = "numDraws") %>%
    mutate(mult = round(time_sec/ time_logitr, 1)) %>%
    select(-time_logitr)
# Compare raw times
time_compare %>%
    select(-mult) %>%
    pivot_wider(names_from = numDraws, values_from = time_sec) %>% 
    kbl()
package 50 200 400 600 800 1000
logitr 2.752860 8.930408 13.72697 24.13735 33.47056 39.22012
mixl (1 core) 10.640267 49.667703 80.22922 158.23673 229.17401 271.35824
mixl (2 cores) 8.928487 41.577738 66.05436 129.84954 184.54557 230.91617
mlogit 11.926502 19.901097 87.58429 60.38106 100.63138 98.41650
gmnl 10.553787 31.379011 69.74575 121.93843 99.23701 141.36727
apollo (1 core) 17.287605 44.118732 84.29395 129.33302 164.37400 198.21812
apollo (2 cores) 21.911355 53.223971 82.69647 120.27101 163.80518 196.84896
# Compare how many times slower compared to logitr
time_compare %>%
    select(-time_sec) %>%
    pivot_wider(names_from = numDraws, values_from = mult) %>% 
    kbl()
package 50 200 400 600 800 1000
logitr 1.0 1.0 1.0 1.0 1.0 1.0
mixl (1 core) 3.9 5.6 5.8 6.6 6.8 6.9
mixl (2 cores) 3.2 4.7 4.8 5.4 5.5 5.9
mlogit 4.3 2.2 6.4 2.5 3.0 2.5
gmnl 3.8 3.5 5.1 5.1 3.0 3.6
apollo (1 core) 6.3 4.9 6.1 5.4 4.9 5.1
apollo (2 cores) 8.0 6.0 6.0 5.0 4.9 5.0

The code below plots the relative run times from the Colab notebook.

library(ggplot2)
library(ggrepel)

plotColors <- c("black", RColorBrewer::brewer.pal(n = 5, name = "Set1"), "gold")
benchmark <- runtimes %>% 
    ggplot(aes(x = numDraws, y = time_sec, color = package)) +
    geom_line() +
    geom_point() +
    geom_text_repel(
        data = . %>% filter(numDraws == max(numDraws)),
        aes(label = package),
        hjust = 0, nudge_x = 40, direction = "y",
        size = 4.5, segment.size = 0
    ) +
    scale_x_continuous(
        limits = c(0, 1200),
        breaks = numDraws,
        labels = scales::comma) +
    scale_y_continuous(limits = c(0, 300), breaks = seq(0, 300, 100)) +
    scale_color_manual(values = plotColors) +
    guides(
        point = guide_legend(override.aes = list(label = "")),
        color = guide_legend(override.aes = list(label = ""))) +
    theme_bw(base_size = 18) +
    theme(
        panel.grid.minor = element_blank(),
        panel.grid.major.x = element_blank(),
        legend.position = "none",
        axis.line.x = element_blank(),
        axis.ticks.x = element_blank()
    ) +
    labs(
        x = "Number of random draws",
        y = "Computation time (seconds)"
    )

benchmark