Influential or leverage units in ADEA

Fernando Fernandez-Palacin¹

Manuel Munoz-Marquez²

2024-11-12

Introduction

An influential, or leverage, unit is one whose removal from the data set significantly changes the results. In this context, it is a unit that has a substantial impact on the model load.

For more information about loads, see the package help, Fernandez-Palacin, Lopez-Sanchez, and Munoz-Marquez (2018), and Villanueva-Cantillo and Munoz-Marquez (2021).

Let’s load and examine the tokyo_libraries dataset using the following code:

library(adea)
data(tokyo_libraries)
head(tokyo_libraries)
#>   Area.I1 Books.I2 Staff.I3 Populations.I4 Regist.O1 Borrow.O2
#> 1   2.249  163.523       26         49.196     5.561   105.321
#> 2   4.617  338.671       30         78.599    18.106   314.682
#> 3   3.873  281.655       51        176.381    16.498   542.349
#> 4   5.541  400.993       78        189.397    30.810   847.872
#> 5  11.381  363.116       69        192.235    57.279   758.704
#> 6  10.086  541.658      114        194.091    66.137  1438.746

Searching for influential units

The adea_load_leverage function searches for units that cause substantial changes in loads. The following call demonstrates this:

input <- tokyo_libraries[, 1:4]
output <- tokyo_libraries[, 5:6]
adea_load_leverage(input, output)
#>        load  load.diff DMUs
#> 1 0.6028718 0.14740482   23
#> 2 0.4004102 0.05505682    6

The output shows that removing unit 23 or unit 6 changes the load by more than the default load.diff threshold of 0.05. Rows are sorted in decreasing order of load.diff, the change in the model load.
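The two rows above also let us recover the load of the full model. The sketch below uses base R only and assumes that load.diff is the absolute difference between the load after removal and the load of the full model. That reading is inferred from the printed values, not taken from the package documentation. The numbers are transcribed from the output above.

```r
# Values transcribed from the adea_load_leverage output above
res <- data.frame(
  load      = c(0.6028718, 0.4004102),
  load.diff = c(0.14740482, 0.05505682),
  DMUs      = c("23", "6"),
  stringsAsFactors = FALSE
)

# Assumption: load.diff = |load - baseline|, so the baseline load is
# either load - load.diff or load + load.diff for each row
lower <- res$load - res$load.diff
upper <- res$load + res$load.diff

# The only value compatible with both rows is the common one
baseline <- lower[1]
same_baseline <- isTRUE(all.equal(lower[1], upper[2], tolerance = 1e-6))

# Direction of the change caused by removing each unit
direction <- ifelse(res$load > baseline, "increase", "decrease")

print(round(baseline, 4))
print(same_baseline)
print(data.frame(DMUs = res$DMUs, direction = direction))
```

Under this assumption, both rows point to a full-model load of about 0.4555: removing unit 23 raises the load, while removing unit 6 lowers it.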

The previous call considers only the removal of one unit at a time. The ndel parameter allows testing the removal of several units at once. The following call tests the removal of up to two units at a time (the single unit 23 still appears in its output) and raises the load.diff threshold to 0.1:

adea_load_leverage(input, output, load.diff = 0.1, ndel = 2)
#>         load load.diff   DMUs
#> 1  0.8333337 0.3778667  9, 23
#> 2  0.6315800 0.1761130 12, 23
#> 3  0.6315800 0.1761130 10, 23
#> 4  0.6315800 0.1761130 15, 23
#> 5  0.6315800 0.1761130  4, 23
#> 6  0.6315800 0.1761130 11, 23
#> 7  0.6315800 0.1761130 22, 23
#> 8  0.6315800 0.1761130 16, 23
#> 9  0.6315800 0.1761130 14, 23
#> 10 0.6315800 0.1761130 18, 23
#> 11 0.6315800 0.1761130 20, 23
#> 12 0.6315800 0.1761130  3, 23
#> 13 0.6225027 0.1670357  2, 23
#> 14 0.6107273 0.1552603  7, 23
#> 15 0.6028718 0.1474048     23
#> 16 0.6020337 0.1465667 13, 23
#> 17 0.6010336 0.1455666  1, 23
#> 18 0.5980232 0.1425562  8, 23
#> 19 0.5879663 0.1324993 21, 23
#> 20 0.3334068 0.1220602   6, 9
#> 21 0.3430363 0.1124307   5, 6
#> 22 0.5599886 0.1045216 17, 23

This results in a long list. To limit the number of groups in the output, set nmax to a specific value, as in the following call:

adea_load_leverage(input, output, load.diff = 0.1, ndel = 2, nmax = 10)
#>         load load.diff   DMUs
#> 1  0.8333337 0.3778667  9, 23
#> 2  0.6315800 0.1761130 12, 23
#> 3  0.6315800 0.1761130 10, 23
#> 4  0.6315800 0.1761130 15, 23
#> 5  0.6315800 0.1761130  4, 23
#> 6  0.6315800 0.1761130 11, 23
#> 7  0.6315800 0.1761130 22, 23
#> 8  0.6315800 0.1761130 16, 23
#> 9  0.6315800 0.1761130 14, 23
#> 10 0.6315800 0.1761130 18, 23
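
The combined effect of load.diff and nmax can be mimicked in base R. The helper filter_results below is hypothetical and is not part of the adea package; it only reproduces the behaviour that the outputs above suggest (keep groups above the threshold, sort by decreasing load.diff, keep the first nmax). The strict comparison and the convention that nmax = 0 means "no limit" are assumptions. The four rows are transcribed from the outputs in this document.

```r
# Four results transcribed from the outputs above
known <- data.frame(
  load      = c(0.4004102, 0.3334068, 0.6028718, 0.8333337),
  load.diff = c(0.05505682, 0.1220602, 0.1474048, 0.3778667),
  DMUs      = c("6", "6, 9", "23", "9, 23"),
  stringsAsFactors = FALSE
)

# Hypothetical helper: threshold, sort, then truncate
filter_results <- function(res, load.diff = 0.05, nmax = 0) {
  keep <- res[res$load.diff > load.diff, , drop = FALSE]
  keep <- keep[order(keep$load.diff, decreasing = TRUE), , drop = FALSE]
  if (nmax > 0) keep <- head(keep, nmax)  # assumption: 0 means no limit
  rownames(keep) <- NULL
  keep
}

top <- filter_results(known, load.diff = 0.1, nmax = 2)
print(top)
```

With a threshold of 0.1, unit 6 alone (0.055) drops out, which matches its absence as a single unit from the ndel = 2 output.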

Note that the pair whose removal changes the load the most (units 9 and 23) is not the pair formed by the two units flagged in the one-by-one analysis (units 23 and 6). This discrepancy arises from interactions between the effects of the units.
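
This can be checked directly on the printed results. The following base R sketch parses the DMUs column of the ndel = 2 output (transcribed by hand from above) and tests whether the pair made of the two individually influential units, 6 and 23, appears in it. It also counts how often each unit occurs among the reported groups.

```r
# DMUs column transcribed from the ndel = 2, load.diff = 0.1 output
dmus <- c("9, 23", "12, 23", "10, 23", "15, 23", "4, 23", "11, 23",
          "22, 23", "16, 23", "14, 23", "18, 23", "20, 23", "3, 23",
          "2, 23", "7, 23", "23", "13, 23", "1, 23", "8, 23",
          "21, 23", "6, 9", "5, 6", "17, 23")

# Turn each entry into an integer vector of unit indices
groups <- lapply(strsplit(dmus, ", ", fixed = TRUE), as.integer)

# Units flagged in the one-by-one analysis
top_single <- c(6L, 23L)

# Is that exact pair among the reported groups?
has_top_pair <- any(vapply(groups,
                           function(g) setequal(g, top_single),
                           logical(1)))

# How often does each unit appear among the reported groups?
counts <- sort(table(unlist(groups)), decreasing = TRUE)

print(has_top_pair)
print(head(counts, 3))
```

The pair (6, 23) is not reported at the 0.1 threshold, whereas unit 23 appears in most of the groups.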

From this point forward, decision-makers or researchers must handle these units carefully to avoid biases in DEA results.

Each call to adea_load_leverage requires solving a large linear program, making it computationally demanding and potentially time-consuming. Patience is essential when working with this function.
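
To get a feeling for how the cost grows with ndel, one can count the groups of units to be tested. The sketch below assumes that every subset of up to ndel units is evaluated, which is consistent with the outputs above but is not confirmed here from the package source. It also assumes 23 units, the largest index shown in the outputs; replace n_units with nrow(tokyo_libraries) when working with the data. The helper subsets_up_to is hypothetical.

```r
# Assumption: number of units (largest DMU index seen in the outputs)
n_units <- 23

# Hypothetical helper: number of non-empty subsets with at most ndel units
subsets_up_to <- function(n, ndel) {
  sum(choose(n, seq_len(ndel)))
}

# Number of groups to evaluate for ndel = 1, 2, 3, 4
n_groups <- vapply(1:4, function(k) subsets_up_to(n_units, k), numeric(1))
names(n_groups) <- paste0("ndel=", 1:4)

print(n_groups)
```

Under these assumptions the count rises from 23 groups for ndel = 1 to 276 for ndel = 2 and 2047 for ndel = 3, so small values of ndel are advisable.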

References

Fernandez-Palacin, Fernando, María Auxiliadora Lopez-Sanchez, and Manuel Munoz-Marquez. 2018. “Stepwise selection of variables in DEA using contribution loads.” Pesquisa Operacional 38 (1): 31–52. http://dx.doi.org/10.1590/0101-7438.2018.038.01.0031.
Villanueva-Cantillo, Jeyms, and Manuel Munoz-Marquez. 2021. “Methodology for Calculating Critical Values of Relevance Measures in Variable Selection Methods in Data Envelopment Analysis.” European Journal of Operational Research 290 (2): 657–70. https://doi.org/10.1016/j.ejor.2020.08.021.

  1. Universidad de Cádiz

  2. Universidad de Cádiz