Title: | Optimal Policy Learning |
---|---|
Description: | Provides functions for optimal policy learning in socioeconomic applications, helping users learn the most effective policies from data in order to maximize empirical welfare. Specifically, 'OPL' finds "treatment assignment rules" that maximize the overall welfare, defined as the sum of the policy effects estimated over all the policy beneficiaries. The methods behind 'OPL' are documented in Athey et al (2021, <doi:10.3982/ECTA15732>), Kitagawa et al (2018, <doi:10.3982/ECTA13288>), Cerulli (2022, <doi:10.1080/13504851.2022.2032577>), Cerulli (2021, <doi:10.1080/13504851.2020.1820939>) and the book by James et al (2013, <doi:10.1007/978-1-4614-7138-7>). |
Authors: | Federico Brogi [aut, cre], Barbara Guardabascio [aut], Giovanni Cerulli [aut] |
Maintainer: | Federico Brogi <[email protected]> |
License: | GPL-3 |
Version: | 1.0.2 |
Built: | 2025-02-27 15:25:42 UTC |
Source: | CRAN |
Predicts the conditional average treatment effect (CATE) for a new policy, based on a model trained on an old policy.
make_cate( model, train_data, test_data, w, x, y, family = gaussian(), ntree = 100, mtry = 2, verbose = TRUE )
model |
A |
train_data |
The training dataset. |
test_data |
The test dataset. |
w |
Set the treatment variable. |
x |
Set the independent variables for the model. |
y |
Set the outcome variable. |
family |
The family type for the model (e.g., 'binomial'). |
ntree |
Number of trees for the Random Forest model. |
mtry |
Number of variables to consider at each tree split in the Random Forest model. |
verbose |
Set TRUE to print the output on the console. |
An object containing the estimated causal treatment effect results.
Athey, S., and Wager, S. 2021. Policy Learning with Observational Data, Econometrica, 89, 1, 133–161.
Cerulli, G. 2021. Improving econometric prediction by machine learning, Applied Economics Letters, 28, 16, 1419–1425.
Cerulli, G. 2022. Optimal treatment assignment of a threshold-based policy: empirical protocol and related issues, Applied Economics Letters, DOI: 10.1080/13504851.2022.2032577.
James, G., Witten, D., Hastie, T., Tibshirani, R. 2013. An Introduction to Statistical Learning: with Applications in R. New York, Springer.
Kitagawa, T., and A. Tetenov. 2018. Who Should Be Treated? Empirical Welfare Maximization Methods for Treatment Choice, Econometrica, 86, 2, 591–616.
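As a rough illustration of the idea behind make_cate, the sketch below estimates the CATE by fitting separate linear outcome regressions for treated and untreated units on a training sample, then predicting both potential outcomes on a test sample and taking the difference. It uses only base R and simulated data. The helper `cate_sketch`, the column name `cate_hat`, and the simulated variables are hypothetical, and this is not claimed to be the package's implementation.

```r
set.seed(1)

# Simulated "old policy" (train) and "new policy" (test) samples.
# Column names y, w, x1, x2 are hypothetical.
simulate <- function(n) {
  x1 <- runif(n)
  x2 <- runif(n)
  w  <- rbinom(n, 1, 0.5)
  # The true treatment effect is 2 * x1, so it varies across units.
  y  <- 1 + x2 + w * (2 * x1) + rnorm(n, sd = 0.1)
  data.frame(y = y, w = w, x1 = x1, x2 = x2)
}
train_data <- simulate(500)
test_data  <- simulate(200)

# Hypothetical helper: two-model regression adjustment.
cate_sketch <- function(train, test, w, x, y) {
  f  <- reformulate(x, response = y)
  m1 <- lm(f, data = train[train[[w]] == 1, ])  # outcome model, treated
  m0 <- lm(f, data = train[train[[w]] == 0, ])  # outcome model, untreated
  predict(m1, newdata = test) - predict(m0, newdata = test)
}

test_data$cate_hat <- cate_sketch(train_data, test_data,
                                  w = "w", x = c("x1", "x2"), y = "y")
```

The same pattern applies with any learner that can predict outcomes for unseen units; a linear model is used here only to keep the sketch dependency-free.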
Implements ex-ante treatment assignment using a 2-layer fixed-depth decision tree as the policy class, at specific splitting variables and threshold values.
opl_dt_c(make_cate_result, z, w, c1 = NA, c2 = NA, c3 = NA, verbose = TRUE)
make_cate_result |
A data frame resulting from the |
z |
A character vector containing the names of the variables used for treatment assignment. |
w |
A string representing the treatment indicator variable name. |
c1 |
Threshold value c1 for the first splitting variable. Must be between 0 and 1. |
c2 |
Threshold value c2 for the second splitting variable. Must be between 0 and 1. |
c3 |
Threshold value c3 for the third splitting variable. Must be between 0 and 1. |
verbose |
Set TRUE to print the output on the console. |
A list containing:
W_opt_constr: The maximum average constrained welfare.
W_opt_unconstr: The average unconstrained welfare.
units_to_be_treated: A data frame of the units to be treated based on the optimal policy.
A plot showing the optimal policy assignment.
Athey, S., and Wager, S. 2021. Policy Learning with Observational Data, Econometrica, 89, 1, 133–161.
Cerulli, G. 2021. Improving econometric prediction by machine learning, Applied Economics Letters, 28, 16, 1419–1425.
Cerulli, G. 2022. Optimal treatment assignment of a threshold-based policy: empirical protocol and related issues, Applied Economics Letters, DOI: 10.1080/13504851.2022.2032577.
James, G., Witten, D., Hastie, T., Tibshirani, R. 2013. An Introduction to Statistical Learning: with Applications in R. New York, Springer.
Kitagawa, T., and A. Tetenov. 2018. Who Should Be Treated? Empirical Welfare Maximization Methods for Treatment Choice, Econometrica, 86, 2, 591–616.
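To make the decision-tree policy class more concrete, here is a minimal base R sketch of one possible 2-layer rule evaluated at fixed thresholds. The exact branching used by opl_dt_c is not stated above, so the rule below (split on the first variable, then on the second or third depending on the branch) is an assumption. The welfare definitions are also assumed: welfare is averaged over all units, which may differ from the package's normalisation. The names `tree_policy`, `cate_hat`, `z1`, `z2`, and `z3` are hypothetical.

```r
set.seed(2)
n <- 300

# Simulated data; all three splitting variables already lie in [0, 1].
d <- data.frame(z1 = runif(n), z2 = runif(n), z3 = runif(n))
d$cate_hat <- d$z1 + d$z2 - 1 + rnorm(n, sd = 0.05)

# Assumed rule: the first layer splits on z1 at c1; the second layer
# splits the upper branch on z2 at c2 and the lower branch on z3 at c3.
tree_policy <- function(d, c1, c2, c3) {
  as.integer(ifelse(d$z1 >= c1, d$z2 >= c2, d$z3 >= c3))
}

treat <- tree_policy(d, c1 = 0.5, c2 = 0.4, c3 = 0.9)

res <- list(
  # Average welfare when only units selected by the tree are treated.
  W_constr   = mean(d$cate_hat * treat),
  # Average welfare when every unit with a positive effect is treated.
  W_unconstr = mean(d$cate_hat * (d$cate_hat > 0)),
  units_to_be_treated = d[treat == 1, ]
)
```

Under these definitions the constrained welfare can never exceed the unconstrained one, because treating exactly the units with a positive estimated effect is the best any assignment rule can do.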
Lets the user select one row among those that attain the maximum constrained welfare. The function prints the candidate rows and prompts the user to choose one.
opl_dt_max_choice(nc, col_max, verbose = TRUE)
nc |
Number of rows with maximum welfare. |
col_max |
Row index for max constrained welfare. |
verbose |
Set TRUE to print the output on the console. |
Returns the user's selection.
Implements ex-ante treatment assignment using a linear-combination approach as the policy class, at specific values of the parameters c1, c2, and c3 for the linear combination of variables var1 and var2: c1*var1 + c2*var2 >= c3.
opl_lc_c(make_cate_result, z, w, c1 = NA, c2 = NA, c3 = NA, verbose = TRUE)
make_cate_result |
A data frame containing the input data. It must include
a column named |
z |
A character vector of length 2 specifying the column names of the two threshold variables to be standardized. |
w |
A character string specifying the column name indicating treatment assignment (binary variable). |
c1 |
Threshold for var1 given by the user or optimized by the function. This number must be chosen between 0 and 1. |
c2 |
Threshold for var2 given by the user or optimized by the function. This number must be chosen between 0 and 1. |
c3 |
Third parameter of the linear-combination. This number must be chosen between 0 and 1. |
verbose |
Set TRUE to print the output on the console. |
The function performs the following steps:
Standardizes the threshold variables using a min-max scaling technique.
Determines the optimal treatment assignment based on the linear combination of the threshold variables.
Performs a grid search to estimate the optimal policy.
Outputs a plot visualizing the optimal treatment assignments.
Prints the main results, including the percentage of treated units, the unconstrained and constrained welfare, and the policy parameters.
The function returns a data frame containing the standardized variables and treatment assignments, and prints a summary of the results and a plot showing the optimal policy assignment.
Athey, S., and Wager, S. 2021. Policy Learning with Observational Data, Econometrica, 89, 1, 133–161.
Cerulli, G. 2021. Improving econometric prediction by machine learning, Applied Economics Letters, 28, 16, 1419–1425.
Cerulli, G. 2022. Optimal treatment assignment of a threshold-based policy: empirical protocol and related issues, Applied Economics Letters, DOI: 10.1080/13504851.2022.2032577.
James, G., Witten, D., Hastie, T., Tibshirani, R. 2013. An Introduction to Statistical Learning: with Applications in R. New York, Springer.
Kitagawa, T., and A. Tetenov. 2018. Who Should Be Treated? Empirical Welfare Maximization Methods for Treatment Choice, Econometrica, 86, 2, 591–616.
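The steps listed for opl_lc_c (min-max scaling, a linear-combination rule, and a grid search) can be sketched in base R as follows. This is an illustrative reconstruction under stated assumptions, not the package's code: the grid resolution, the welfare definition (mean estimated effect over treated units, averaged across the whole sample), and the names `minmax`, `cate_hat`, `age`, and `income` are all hypothetical.

```r
set.seed(3)
n <- 300

# Simulated selection variables on their original scales.
d <- data.frame(age = runif(n, 20, 60), income = runif(n, 10, 90))

# Step 1: min-max scaling to the [0, 1] range.
minmax <- function(v) (v - min(v)) / (max(v) - min(v))
d$z1 <- minmax(d$age)
d$z2 <- minmax(d$income)

# Simulated CATE estimates (in practice these come from a fitted model).
d$cate_hat <- d$z1 + d$z2 - 0.8 + rnorm(n, sd = 0.05)

# Steps 2-3: evaluate the rule c1*z1 + c2*z2 >= c3 on a grid of parameters.
grid <- expand.grid(c1 = seq(0, 1, by = 0.1),
                    c2 = seq(0, 1, by = 0.1),
                    c3 = seq(0, 1, by = 0.1))
grid$W <- mapply(function(c1, c2, c3) {
  treat <- c1 * d$z1 + c2 * d$z2 >= c3
  mean(d$cate_hat * treat)
}, grid$c1, grid$c2, grid$c3)

# Keep the first parameter combination attaining the maximum welfare.
best <- grid[which.max(grid$W), ]
d$units_to_be_treated <-
  as.integer(best$c1 * d$z1 + best$c2 * d$z2 >= best$c3)
```

Because c3 = 0 treats every unit, the grid always contains the "treat everyone" policy, so the selected welfare is at least the sample mean of the estimated effects.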
Implements ex-ante treatment assignment using a threshold-based (or quadrant) approach as the policy class, at specific threshold values c1 and c2 for the selection variables var1 and var2, respectively.
opl_tb_c(make_cate_result, z, w, c1 = NA, c2 = NA, verbose = TRUE)
make_cate_result |
A data frame containing the input data. It must include
a column named |
z |
A character vector of length 2 specifying the column names of the two threshold variables to be standardized. |
w |
A character string specifying the column name indicating treatment assignment (binary variable). |
c1 |
Threshold for var1 given by the user or optimized by the function. This number must be chosen between 0 and 1. |
c2 |
Threshold for var2 given by the user or optimized by the function. This number must be chosen between 0 and 1. |
verbose |
Set TRUE to print the output on the console. |
The function:
Standardizes the threshold variables to a 0-1 range.
Identifies the optimal thresholds based on grid search for maximizing constrained welfare.
Computes and displays key statistics, including average welfare measures and the percentage of treated units.
The function invisibly returns the input data frame augmented with the following columns:
z[1]_std: Standardized version of the first threshold variable.
z[2]_std: Standardized version of the second threshold variable.
units_to_be_treated: Binary indicator for whether a unit should be treated based on the optimal policy.
Additionally, the function:
Prints the main results summary, including optimal threshold values, average constrained and unconstrained welfare, and treatment proportions.
Displays a scatter plot visualizing the policy assignment.
Athey, S., and Wager, S. 2021. Policy Learning with Observational Data, Econometrica, 89, 1, 133–161.
Cerulli, G. 2021. Improving econometric prediction by machine learning, Applied Economics Letters, 28, 16, 1419–1425.
Cerulli, G. 2022. Optimal treatment assignment of a threshold-based policy: empirical protocol and related issues, Applied Economics Letters, DOI: 10.1080/13504851.2022.2032577.
James, G., Witten, D., Hastie, T., Tibshirani, R. 2013. An Introduction to Statistical Learning: with Applications in R. New York, Springer.
Kitagawa, T., and A. Tetenov. 2018. Who Should Be Treated? Empirical Welfare Maximization Methods for Treatment Choice, Econometrica, 86, 2, 591–616.
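A minimal base R sketch of the threshold-based (quadrant) search described for opl_tb_c follows. It assumes that a unit is treated when both standardized variables are at or above their thresholds, and that welfare is the estimated effect of treated units averaged over the whole sample. The direction of the inequalities, the grid step, and the names `welfare`, `cate_hat`, `var1`, and `var2` are assumptions for illustration only.

```r
set.seed(5)
n <- 300

# Simulated selection variables on their original scales.
d <- data.frame(var1 = runif(n, 0, 50), var2 = runif(n, 100, 200))

# Standardize both threshold variables to the 0-1 range.
minmax <- function(v) (v - min(v)) / (max(v) - min(v))
d$var1_std <- minmax(d$var1)
d$var2_std <- minmax(d$var2)

# Simulated CATE estimates (in practice these come from a fitted model).
d$cate_hat <- (d$var1_std - 0.4) + (d$var2_std - 0.3) + rnorm(n, sd = 0.05)

# Average welfare of the quadrant rule at thresholds (c1, c2).
welfare <- function(c1, c2) {
  treat <- d$var1_std >= c1 & d$var2_std >= c2
  mean(d$cate_hat * treat)
}

# Grid search over both thresholds.
grid <- expand.grid(c1 = seq(0, 1, by = 0.05), c2 = seq(0, 1, by = 0.05))
grid$W <- mapply(welfare, grid$c1, grid$c2)
best <- grid[which.max(grid$W), ]

d$units_to_be_treated <-
  as.integer(d$var1_std >= best$c1 & d$var2_std >= best$c2)
share_treated <- mean(d$units_to_be_treated)
```

Passing fixed values instead of searching amounts to calling `welfare(c1, c2)` once, which mirrors supplying c1 and c2 directly rather than leaving them as NA.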
Performs overlap analysis between the train and test datasets. The function runs a principal component analysis (PCA) on the covariates of both sets and computes the Kolmogorov-Smirnov test for overlap.
overlapping(train_data, test_data, x)
train_data |
Training dataset representing the old-policy sample. |
test_data |
Test dataset representing the new-policy sample. |
x |
Vector of predictor variables. |
The function displays the overlap plot and prints the results of the Kolmogorov-Smirnov test.
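The overlap check can be approximated with the `stats` functions `prcomp` and `ks.test`. The sketch below fits the PCA on the training covariates, projects the test covariates onto the same components, and compares the first-component scores of the two samples. Whether the package fits one PCA or two, and which components it compares, is not stated above, so those choices are assumptions; the simulated data and variable names are hypothetical.

```r
set.seed(4)

# Simulated covariates for the old-policy (train) and new-policy (test) samples.
train_data <- data.frame(x1 = rnorm(200), x2 = rnorm(200), x3 = rnorm(200))
test_data  <- data.frame(x1 = rnorm(150, mean = 0.2),
                         x2 = rnorm(150), x3 = rnorm(150))
x <- c("x1", "x2", "x3")

# PCA fitted on the training covariates (centered and scaled).
pca <- prcomp(train_data[, x], center = TRUE, scale. = TRUE)

# First principal component scores for both samples.
pc1_train <- pca$x[, 1]
pc1_test  <- predict(pca, newdata = test_data[, x])[, 1]

# Two-sample Kolmogorov-Smirnov test: a small p-value suggests that the
# two samples differ along the first component, i.e. weak overlap.
ks <- ks.test(pc1_train, pc1_test)

# Overlay the two score densities when running interactively.
if (interactive()) {
  plot(density(pc1_train), main = "First principal component")
  lines(density(pc1_test), lty = 2)
}
```

Inspecting `ks$statistic` and `ks$p.value` alongside the density overlay gives a quick sense of whether predictions for the new-policy sample rely on extrapolation.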