Introduction

Variable selection is a pivotal step in every Data Envelopment Analysis (DEA) application and deserves careful consideration before the results are applied in practice, because the outcomes can change significantly depending on the variables included in the model.

The variable selection process may lead to the removal of a variable that a decision-maker wants to retain for political, tactical, or other reasons. If such a variable is simply kept in the model without further action, its contribution will be negligible. The cadea function provides a means to ensure that the contribution of a variable to the model is at least a specified value.

For more information about loads, see the package help, Fernandez-Palacin, Lopez-Sanchez, and Munoz-Marquez (2018), and Villanueva-Cantillo and Munoz-Marquez (2021).

Let’s load and examine the tokyo_libraries dataset using the following code:

# Attach the adea package, which provides the dataset and the functions used below
library(adea)
data(tokyo_libraries)
head(tokyo_libraries)
#>   Area.I1 Books.I2 Staff.I3 Populations.I4 Regist.O1 Borrow.O2
#> 1   2.249  163.523       26         49.196     5.561   105.321
#> 2   4.617  338.671       30         78.599    18.106   314.682
#> 3   3.873  281.655       51        176.381    16.498   542.349
#> 4   5.541  400.993       78        189.397    30.810   847.872
#> 5  11.381  363.116       69        192.235    57.279   758.704
#> 6  10.086  541.658      114        194.091    66.137  1438.746

Constrained ADEA

First, let’s perform an ADEA analysis with the following code:

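# Columns 1 to 4 are the inputs, columns 5 and 6 the outputs
input <- tokyo_libraries[, 1:4]
output <- tokyo_libraries[, 5:6]
# Fit the unconstrained ADEA model and print its summary
m <- adea(input, output)
summary(m)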
#>                                                                   
#> Model name                                                        
#> Orientation                                                  input
#> Load orientation                                          inoutput
#> Model load                                       0.455466997833526
#> Input load.Area.I1                               0.455466997833526
#> Input load.Books.I2                               1.33716872370689
#> Input load.Staff.I3                              0.981885802948442
#> Input load.Populations.I4                         1.22547847551114
#> Output load.Regist.O1                            0.763942838453517
#> Output load.Borrow.O2                             1.23605716154648
#> Inputs                    Area.I1 Books.I2 Staff.I3 Populations.I4
#> Outputs                                        Regist.O1 Borrow.O2
#> nInputs                                                          4
#> nOutputs                                                         2
#> nVariables                                                       6
#> nEfficients                                                      6
#> Eff. Mean                                        0.775919227646031
#> Eff. sd                                          0.174702408743164
#> Eff. Min.                                        0.350010840234134
#> Eff. 1st Qu.                                     0.700942885344481
#> Eff. Median                                      0.784943740381793
#> Eff. 3rd Qu.                                     0.924285790399849
#> Eff. Max.                                                        1

This analysis reveals that Area.I1 has a load of about 0.455, which is below 0.6 and indicates that its contribution to the DEA model is negligible.
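
As a rough illustration of this screening step, the loads printed by summary(m) can be copied into a named vector and compared against 0.6. The names loads, threshold, low_load, and model_load are introduced only for this sketch and are not part of the adea package.

```r
# Loads copied by hand from the summary(m) output above
loads <- c(
  Area.I1        = 0.455466997833526,
  Books.I2       = 1.33716872370689,
  Staff.I3       = 0.981885802948442,
  Populations.I4 = 1.22547847551114,
  Regist.O1      = 0.763942838453517,
  Borrow.O2      = 1.23605716154648
)

# Value used in the text to call a contribution negligible
threshold <- 0.6

# Names of the variables whose load falls below the threshold
low_load <- names(loads)[loads < threshold]
print(low_load)

# In this output the reported model load coincides with the smallest load
model_load <- min(loads)
print(model_load)
```

Only Area.I1 is flagged, and the smallest load equals the "Model load" line of the summary.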

The following cadea call forces every load, and therefore the contribution of Area.I1, to be at least 0.6:

# Fit the constrained model: every load must lie between load.min and load.max
mc <- cadea(input, output, load.min = 0.6, load.max = 4)
summary(mc)
#>                                                                   
#> Model name                                                        
#> Orientation                                                  input
#> Load orientation                                          inoutput
#> Model load                                       0.600000000000042
#> Input load.Area.I1                               0.600000000000042
#> Input load.Books.I2                               1.16440394470301
#> Input load.Staff.I3                              0.932502044865763
#> Input load.Populations.I4                         1.30309401043119
#> Output load.Regist.O1                            0.912551322626857
#> Output load.Borrow.O2                             1.08744867737314
#> Minimum for loads1                                             0.6
#> Minimum for loads2                                             0.6
#> Minimum for loads3                                             0.6
#> Minimum for loads4                                             0.6
#> Minimum for loads5                                             0.6
#> Minimum for loads6                                             0.6
#> Maximum for loads1                                               4
#> Maximum for loads2                                               4
#> Maximum for loads3                                               4
#> Maximum for loads4                                               4
#> Maximum for loads5                                               4
#> Maximum for loads6                                               4
#> Inputs                    Area.I1 Books.I2 Staff.I3 Populations.I4
#> Outputs                                        Regist.O1 Borrow.O2
#> nInputs                                                          4
#> nOutputs                                                         2
#> nVariables                                                       6
#> nEfficients                                                      6
#> Eff. Mean                                        0.773704229966596
#> Eff. sd                                          0.174936730836523
#> Eff. Min.                                        0.349071771188186
#> Eff. 1st Qu.                                     0.700942885344227
#> Eff. Median                                      0.769117261231101
#> Eff. 3rd Qu.                                     0.924285790399358
#> Eff. Max.                                                        1

Note that a variable’s load cannot exceed the number of variables of its type (here 4 inputs and 2 outputs), so setting load.max = 4 has no effect on the results. The load.min constraint, in contrast, is binding: the model load rises from about 0.455 to the specified value of 0.6, at the cost of a slight decrease in the mean efficiency, from about 0.7759 to 0.7737.
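
The upper bound on a load can be checked against the printed summaries. In both models the input loads add up to 4 and the output loads add up to 2, that is, to the number of variables of each type, so, given nonnegative loads, no single load can exceed those counts. The sketch below redoes that arithmetic in base R with values copied from the two summaries; all object names are illustrative.

```r
# Loads copied from summary(m) and summary(mc) above
input_loads    <- c(0.455466997833526, 1.33716872370689,
                    0.981885802948442, 1.22547847551114)
output_loads   <- c(0.763942838453517, 1.23605716154648)
input_loads_c  <- c(0.600000000000042, 1.16440394470301,
                    0.932502044865763, 1.30309401043119)
output_loads_c <- c(0.912551322626857, 1.08744867737314)

# Sum of the loads per variable type, rounded to absorb solver noise
sums <- c(
  inputs              = sum(input_loads),
  outputs             = sum(output_loads),
  inputs_constrained  = sum(input_loads_c),
  outputs_constrained = sum(output_loads_c)
)
print(round(sums, 6))

# Drop in mean efficiency caused by the load.min constraint
eff_mean   <- 0.775919227646031
eff_mean_c <- 0.773704229966596
eff_drop   <- eff_mean - eff_mean_c
print(eff_drop)
```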

The two sets of efficiency scores rank the units almost identically: the Spearman correlation coefficient between them is 0.998. Plotting one set of scores against the other shows the same agreement.
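
A minimal sketch of such a comparison, using base R only. The two vectors below are made-up scores, not the Tokyo libraries results; with the fitted models one would pass their efficiency vectors instead, which this sketch assumes are stored as m$eff and mc$eff.

```r
# Made-up efficiency scores for five units (illustrative only)
eff_base        <- c(0.40, 0.55, 0.70, 0.85, 1.00)
eff_constrained <- c(0.39, 0.56, 0.69, 0.86, 0.99)

# Spearman correlation compares the rankings, not the raw values
rho <- cor(eff_base, eff_constrained, method = "spearman")
print(rho)

# Scatter plot of one set of scores against the other;
# points close to the dashed diagonal mean the two models agree
plot(eff_base, eff_constrained,
     xlab = "ADEA efficiency", ylab = "Constrained ADEA efficiency")
abline(0, 1, lty = 2)

# With the models fitted above, assuming the scores are stored in `eff`:
# cor(m$eff, mc$eff, method = "spearman")
```

Because Spearman's coefficient depends only on ranks, small shifts in the scores that leave the ordering intact do not lower it.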

All of these findings indicate that, in this particular case, the changes are minimal. More substantial changes can be expected if load.min is increased.

¹ Universidad de Cádiz
² Universidad de Cádiz

Constrained ADEA model

Fernando Fernandez-Palacin¹

Manuel Munoz-Marquez²

2024-11-12
