Variable selection in Data Envelopment Analysis (DEA) deserves careful consideration before the results of an analysis are applied in practice, because the outcomes can change significantly depending on the variables included in the model. Variable selection is therefore a pivotal step in every DEA application.
The variable selection process may lead to the removal of a variable
that a decision-maker might want to retain for political, tactical, or
other reasons. If such a variable is simply kept in the model and no
further action is taken, its contribution will be negligible. The cadea
function provides a means to ensure that the contribution of a variable
to the model is at least a specified value.
For more information about loads, see the package help or Fernandez-Palacin, Lopez-Sanchez, and Munoz-Marquez (2018) and Villanueva-Cantillo and Munoz-Marquez (2021).
Let’s load and examine the tokyo_libraries dataset using the following code:
library(adea)
data(tokyo_libraries)
head(tokyo_libraries)
#>   Area.I1 Books.I2 Staff.I3 Populations.I4 Regist.O1 Borrow.O2
#> 1   2.249  163.523       26         49.196     5.561   105.321
#> 2   4.617  338.671       30         78.599    18.106   314.682
#> 3   3.873  281.655       51        176.381    16.498   542.349
#> 4   5.541  400.993       78        189.397    30.810   847.872
#> 5  11.381  363.116       69        192.235    57.279   758.704
#> 6  10.086  541.658      114        194.091    66.137  1438.746
First, let’s perform an ADEA analysis with the following code:
input <- tokyo_libraries[, 1:4]
output <- tokyo_libraries[, 5:6]
m <- adea(input, output)
summary(m)
#>
#> Model name
#> Orientation input
#> Load orientation inoutput
#> Model load 0.455466997833526
#> Input load.Area.I1 0.455466997833526
#> Input load.Books.I2 1.33716872370689
#> Input load.Staff.I3 0.981885802948442
#> Input load.Populations.I4 1.22547847551114
#> Output load.Regist.O1 0.763942838453517
#> Output load.Borrow.O2 1.23605716154648
#> Inputs Area.I1 Books.I2 Staff.I3 Populations.I4
#> Outputs Regist.O1 Borrow.O2
#> nInputs 4
#> nOutputs 2
#> nVariables 6
#> nEfficients 6
#> Eff. Mean 0.775919227646031
#> Eff. sd 0.174702408743164
#> Eff. Min. 0.350010840234134
#> Eff. 1st Qu. 0.700942885344481
#> Eff. Median 0.784943740381793
#> Eff. 3rd Qu. 0.924285790399849
#> Eff. Max. 1
This analysis reveals that Area.I1 has a load of about 0.455, which is
below 0.6, indicating that its contribution to the DEA model is
negligible.
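The check described above can be reproduced from the printed loads alone. The following is a minimal sketch in base R: the values are copied from the summary output, and the names `loads`, `threshold`, and `low` are illustrative choices, not part of the package.

```r
# Loads copied from the summary output above (object names are illustrative).
loads <- c(
  Area.I1        = 0.455466997833526,
  Books.I2       = 1.33716872370689,
  Staff.I3       = 0.981885802948442,
  Populations.I4 = 1.22547847551114,
  Regist.O1      = 0.763942838453517,
  Borrow.O2      = 1.23605716154648
)

# Threshold used in the text to judge whether a contribution is negligible.
threshold <- 0.6

# Names of the variables whose load falls below the threshold.
low <- names(loads)[loads < threshold]
print(low)
```

Only the loads are needed for this check, so it can be run without refitting the model.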
With the following cadea call, the load of Area.I1 is forced to be at
least 0.6:
mc <- cadea(input, output, load.min = 0.6, load.max = 4)
summary(mc)
#>
#> Model name
#> Orientation input
#> Load orientation inoutput
#> Model load 0.600000000000042
#> Input load.Area.I1 0.600000000000042
#> Input load.Books.I2 1.16440394470301
#> Input load.Staff.I3 0.932502044865763
#> Input load.Populations.I4 1.30309401043119
#> Output load.Regist.O1 0.912551322626857
#> Output load.Borrow.O2 1.08744867737314
#> Minimum for loads1 0.6
#> Minimum for loads2 0.6
#> Minimum for loads3 0.6
#> Minimum for loads4 0.6
#> Minimum for loads5 0.6
#> Minimum for loads6 0.6
#> Maximum for loads1 4
#> Maximum for loads2 4
#> Maximum for loads3 4
#> Maximum for loads4 4
#> Maximum for loads5 4
#> Maximum for loads6 4
#> Inputs Area.I1 Books.I2 Staff.I3 Populations.I4
#> Outputs Regist.O1 Borrow.O2
#> nInputs 4
#> nOutputs 2
#> nVariables 6
#> nEfficients 6
#> Eff. Mean 0.773704229966596
#> Eff. sd 0.174936730836523
#> Eff. Min. 0.349071771188186
#> Eff. 1st Qu. 0.700942885344227
#> Eff. Median 0.769117261231101
#> Eff. 3rd Qu. 0.924285790399358
#> Eff. Max. 1
The load level rises to the specified minimum of 0.6, and the mean
efficiency decreases slightly, from about 0.776 to about 0.774. Note
that a variable’s load can never exceed the number of variables of its
type (here 4 inputs and 2 outputs), so setting load.max = 4 has no
effect on the results.
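The upper bound on loads can be checked against the printed summaries. The sketch below is base R only and copies the loads from both outputs; the object names are illustrative. It relies on the observation that, in these two summaries, the loads of each type add up to the number of variables of that type, so with non-negative loads no single value can reach an upper limit of 4 unless all others of its type are zero.

```r
# Loads copied from the two summaries above; object names are illustrative.
adea_inputs   <- c(0.455466997833526, 1.33716872370689,
                   0.981885802948442, 1.22547847551114)
adea_outputs  <- c(0.763942838453517, 1.23605716154648)
cadea_inputs  <- c(0.600000000000042, 1.16440394470301,
                   0.932502044865763, 1.30309401043119)
cadea_outputs <- c(0.912551322626857, 1.08744867737314)

# Totals per variable type: about 4 for the inputs and 2 for the outputs.
totals <- c(
  adea_inputs   = sum(adea_inputs),
  adea_outputs  = sum(adea_outputs),
  cadea_inputs  = sum(cadea_inputs),
  cadea_outputs = sum(cadea_outputs)
)
print(round(totals, 6))

# TRUE only if some load in the constrained model reaches the limit of 4.
bound_binds <- any(c(cadea_inputs, cadea_outputs) >= 4)
print(bound_binds)
```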
The Spearman correlation coefficient between the two sets of efficiency scores is 0.998, so the ranking of the units is almost unchanged. This can also be visualized in the following plot:
All of these findings indicate that, in this particular case, the
changes are minimal. More substantial changes can be expected if
load.min is increased.
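The rank comparison between two sets of efficiency scores can be sketched with base R’s cor function. The helper `compare_scores` and the toy vectors below are hypothetical and serve only to illustrate the call; they are not the Tokyo libraries results. The commented lines assume that the fitted objects store their scores in a component named `eff`, which should be checked against the package help.

```r
# Hypothetical helper: Spearman rank correlation between two score vectors.
compare_scores <- function(eff_a, eff_b) {
  stopifnot(is.numeric(eff_a), is.numeric(eff_b),
            length(eff_a) == length(eff_b))
  cor(eff_a, eff_b, method = "spearman")
}

# Toy vectors for illustration only (not results from the analysis above).
toy_a <- c(0.35, 0.70, 0.78, 0.92, 1.00)
toy_b <- c(0.35, 0.70, 0.77, 0.93, 1.00)

# Both toy vectors rank the units identically.
rho <- compare_scores(toy_a, toy_b)
print(rho)

# With the fitted models, assuming the scores are stored in `eff`:
# compare_scores(m$eff, mc$eff)
# plot(m$eff, mc$eff, xlab = "ADEA efficiency", ylab = "Constrained ADEA efficiency")
```

Spearman correlation compares rankings and ignores the size of the differences between scores, which suits DEA results when the question is whether the ordering of units changes.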
Universidad de Cádiz