Title: | Logistic Regression Equivalence |
---|---|
Description: | Tools for assessing equivalence of similar Logistic Regression models. |
Authors: | Guy Ashiri-Prossner |
Maintainer: | Guy Ashiri-Prossner <[email protected]> |
License: | MIT + file LICENSE |
Version: | 0.1.5 |
Built: | 2024-12-13 06:42:40 UTC |
Source: | CRAN |
This function takes two logistic regression models (model_a, model_b),
a sensitivity level (delta), and a significance level (alpha).
It checks whether the coefficient vectors of the two models are equivalent.
beta_equivalence(model_a, model_b, delta, alpha)
model_a | logistic regression model |
model_b | logistic regression model |
delta | equivalence sensitivity level |
alpha | significance level |
equivalence
are the coefficient vectors equivalent? (boolean)
test_statistic
Equivalence test statistic
critical value
a level-alpha critical value
ncp
non-centrality parameter
p_value
P-value
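As a rough illustration of the inputs, the sketch below fits two logistic regression models on simulated data and prints the difference between their coefficient vectors. The simulated data, the make_data helper, and the values delta = 0.1 and alpha = 0.05 are assumptions made for illustration. The call to beta_equivalence runs only if the function is available in the session, and reading its result as a list is inferred from the value names above.

```r
# Simulated data; make_data and all values here are illustrative assumptions.
set.seed(1)
make_data <- function(n) {
  x1 <- rnorm(n)
  x2 <- rnorm(n)
  y <- rbinom(n, 1, plogis(0.5 + 1.0 * x1 - 0.5 * x2))
  data.frame(y = y, x1 = x1, x2 = x2)
}
data_a <- make_data(200)
data_b <- make_data(200)

# Two logistic regression models with the same formula.
model_a <- glm(y ~ x1 + x2, family = binomial(link = "logit"), data = data_a)
model_b <- glm(y ~ x1 + x2, family = binomial(link = "logit"), data = data_b)

# Difference between the coefficient vectors being compared.
coef_diff <- coef(model_a) - coef(model_b)
print(coef_diff)

# Run the documented function only when it is available in the session.
# delta = 0.1 and alpha = 0.05 are arbitrary illustrative choices.
if (exists("beta_equivalence")) {
  res <- beta_equivalence(model_a, model_b, delta = 0.1, alpha = 0.05)
  print(res$equivalence)
}
```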
This function takes an observations vector (y) and a matching
predictions vector (pi). It returns the Brier score for the
predictions. Unless specified otherwise, input containing NAs will
result in NA.
brier_score(y, pi, na.rm = FALSE)
y | the observations vector |
pi | the predictions vector |
na.rm | ignore NA? (optional) |
The Brier score
brier_score(rbinom(10,1,seq(0.1, 1, 0.1)), seq(0.1, 1, 0.1))
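For readers who want to see the arithmetic, here is a minimal sketch of the usual Brier score definition, the mean squared difference between observations and predicted probabilities. The name brier_sketch is hypothetical, and the package's own implementation may differ in details such as input checks.

```r
# Hypothetical re-implementation of the standard Brier score definition.
brier_sketch <- function(y, pi, na.rm = FALSE) {
  stopifnot(length(y) == length(pi))
  mean((y - pi)^2, na.rm = na.rm)
}

# Squared errors are 0.01, 0.04 and 0.16, so the mean is 0.07.
print(brier_sketch(c(1, 0, 1), c(0.9, 0.2, 0.6)))

# With the default na.rm = FALSE, an NA in the input propagates to the result.
print(brier_sketch(c(1, NA), c(0.5, 0.5)))

# With na.rm = TRUE, the pair containing the NA is dropped.
print(brier_sketch(c(1, NA), c(0.5, 0.5), na.rm = TRUE))
```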
This function takes two datasets (data_a, data_b), a regression formula,
a significance level (alpha), and a sensitivity level (delta, either
vector or scalar). It builds a logistic regression model for each of the
datasets and then checks whether the obtained coefficient vectors are
equivalent, using the beta_equivalence function.
descriptive_equiv(data_a, data_b, formula, delta, alpha = 0.05)
data_a | dataset |
data_b | dataset |
formula | logistic regression formula |
delta | equivalence sensitivity level |
alpha | significance level |
equivalence
the beta_equivalence function output
model_a
logistic regression model fitted on data_a
model_b
logistic regression model fitted on data_b
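The model-building step described above can be sketched in base R: the same formula is fitted with glm on each dataset. The simulated data, the simulate helper, and the parameter values are illustrative assumptions, not the package's internals. The call to descriptive_equiv is guarded so that the block also runs without the package.

```r
# Simulated stand-ins for data_a and data_b; slopes are arbitrary choices.
set.seed(2)
simulate <- function(n, slope) {
  x <- rnorm(n)
  data.frame(x = x, y = rbinom(n, 1, plogis(-0.3 + slope * x)))
}
data_a <- simulate(300, slope = 1.0)
data_b <- simulate(300, slope = 1.1)
form <- y ~ x

# Fit one logistic regression model per dataset using the shared formula.
fits <- lapply(list(model_a = data_a, model_b = data_b), function(d) {
  glm(form, family = binomial(link = "logit"), data = d)
})

# Coefficient vectors side by side, one column per model.
coef_table <- sapply(fits, coef)
print(coef_table)

# Run the documented function only when it is available in the session.
if (exists("descriptive_equiv")) {
  out <- descriptive_equiv(data_a, data_b, formula = form, delta = 0.1, alpha = 0.05)
  print(out$equivalence)
}
```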
This function takes two logistic regression models (model_a, model_b),
test data, a significance level (alpha), and an allowed flips ratio (r).
It checks whether the models produce equivalent log-odds for
the given test set and returns various figures.
individual_predictive_equiv(model_a, model_b, test_data, r = 0.1, alpha = 0.05)
model_a | logistic regression model |
model_b | logistic regression model |
test_data | testing dataset |
r | ratio of allowed 'flips' (defaults to 0.1) |
alpha | significance level |
equivalence
Are models producing equivalent log-odds for the given test data? (boolean)
test_statistic
The test statistic
critical_value
a level-alpha critical value for the test
xi_bar
Mean value for the test
delta_theta
Calculated equivalence parameter
p_value
P-value
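To make the compared quantity concrete, the sketch below computes the log-odds that two fitted models assign to the same test observations, using predict with type = "link". The simulated data and helper names are assumptions for illustration. The test itself is left to individual_predictive_equiv, which is called with its documented defaults only if it is available in the session.

```r
# Simulated training and test data; names and values are illustrative.
set.seed(3)
simulate <- function(n) {
  x <- rnorm(n)
  data.frame(y = rbinom(n, 1, plogis(0.2 + 0.8 * x)), x = x)
}
train_a <- simulate(250)
train_b <- simulate(250)
test_data <- simulate(60)

model_a <- glm(y ~ x, family = binomial(link = "logit"), data = train_a)
model_b <- glm(y ~ x, family = binomial(link = "logit"), data = train_b)

# type = "link" returns the linear predictor, i.e. the log-odds.
log_odds_a <- predict(model_a, newdata = test_data, type = "link")
log_odds_b <- predict(model_b, newdata = test_data, type = "link")

# Per-observation differences in log-odds between the two models.
print(summary(log_odds_a - log_odds_b))

# Run the documented function only when it is available in the session.
if (exists("individual_predictive_equiv")) {
  res <- individual_predictive_equiv(model_a, model_b, test_data, r = 0.1, alpha = 0.05)
  print(res$equivalence)
}
```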
This function takes two logistic regression models (model_a, model_b),
test data, a significance level (alpha), and an acceptable score
degradation (delta_B). It checks whether the models perform
equivalently on the test set and returns various figures.
performance_equiv(model_a, model_b, test_data, dv_index, delta_B = 1.1, alpha = 0.05)
model_a | logistic regression model |
model_b | logistic regression model |
test_data | testing dataset |
dv_index | column number of the dependent variable |
delta_B | acceptable score degradation (defaults to 1.1) |
alpha | significance level |
equivalence
Are models producing equivalent Brier scores for the given test data? (boolean)
brier_score_ac
Brier score of model_a on the testing data
brier_score_bc
Brier score of model_b on the testing data
diff_sd_l
SD of the lower Brier difference
diff_sd_u
SD of the upper Brier difference
test_stat_l
equivalence boundary for the test
test_stat_u
equivalence boundary for the test
crit_val
a level-alpha critical value for the test
delta_B
Calculated equivalence parameter
p_value_l
P-value for test_stat_l
p_value_u
P-value for test_stat_u
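The two Brier scores that the function reports can be approximated by hand, as in the sketch below. It uses simulated data and the standard mean-squared-error definition of the Brier score; both are assumptions for illustration rather than the package's internals. In the simulated test set the dependent variable sits in column 1, so dv_index = 1 in the guarded call.

```r
# Simulated data with the dependent variable y in column 1.
set.seed(4)
simulate <- function(n) {
  x <- rnorm(n)
  data.frame(y = rbinom(n, 1, plogis(-0.1 + 0.9 * x)), x = x)
}
train_a <- simulate(250)
train_b <- simulate(250)
test_data <- simulate(80)

model_a <- glm(y ~ x, family = binomial(link = "logit"), data = train_a)
model_b <- glm(y ~ x, family = binomial(link = "logit"), data = train_b)

# Brier score of a model on the test set: mean squared error of the
# predicted probabilities (type = "response") against the observed outcome.
brier_on_test <- function(model, data, dv_index) {
  p <- predict(model, newdata = data, type = "response")
  mean((data[[dv_index]] - p)^2)
}
brier_a <- brier_on_test(model_a, test_data, dv_index = 1)
brier_b <- brier_on_test(model_b, test_data, dv_index = 1)
print(c(model_a = brier_a, model_b = brier_b))

# Run the documented function only when it is available in the session.
if (exists("performance_equiv")) {
  res <- performance_equiv(model_a, model_b, test_data, dv_index = 1,
                           delta_B = 1.1, alpha = 0.05)
  print(res$equivalence)
}
```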
Student achievement data from secondary education at two Portuguese schools. A full attribute description can be found on the source webpage.
ptg_stud_data
An object of class data.frame with 649 rows and 31 columns.
The data is taken from the Student Performance Data. The original data consists of 30 covariates (13 binary, 11 ordinal, 4 categorical, 2 numerical) and a numerical output variable indicating the student's final grade in the Portuguese Language course.
The data was split by gender (F/M). The target variable G3 was converted to a binary variable, final_fail, which indicates the cases where G3 < 10.
Next, each sub-population was divided into training and testing data, using a 4:1 ratio.
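The preparation steps described above can be sketched on a toy data frame. The toy data, the column name sex, the helper split_4_to_1, and the use of random sampling for the split are assumptions made for illustration; the procedure actually used to build the package datasets may differ.

```r
# Toy stand-in for the student data: a gender column and a final grade G3.
set.seed(42)
toy <- data.frame(
  sex = rep(c("F", "M"), each = 10),
  G3 = sample(0:20, 20, replace = TRUE)
)

# Binary target: 1 when the final grade is below 10, 0 otherwise.
toy$final_fail <- as.integer(toy$G3 < 10)

# Split one sub-population into training and testing parts at a 4:1 ratio.
split_4_to_1 <- function(df) {
  n_train <- round(0.8 * nrow(df))
  idx <- sample(nrow(df), n_train)
  list(train = df[idx, , drop = FALSE], test = df[-idx, , drop = FALSE])
}

# Split by gender first, then split each sub-population.
by_sex <- split(toy, toy$sex)
parts <- lapply(by_sex, split_4_to_1)

print(sapply(parts, function(p) c(train = nrow(p$train), test = nrow(p$test))))
```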
https://archive.ics.uci.edu/ml/datasets/student+performance
P. Cortez and A. Silva. Using Data Mining to Predict Secondary School Student Performance. In A. Brito and J. Teixeira Eds., Proceedings of 5th FUture BUsiness TEChnology Conference (FUBUTEC 2008) pp. 5-12, Porto, Portugal, April, 2008, EUROSIS, ISBN 978-9077381-39-7.
http://www3.dsi.uminho.pt/pcortez/student.pdf
Student Performance Data Set - female testing data
ptg_stud_f_test
An object of class data.frame with 77 rows and 30 columns.
ptg_stud_data
Student Performance Data Set - female training data
ptg_stud_f_train
An object of class data.frame with 306 rows and 30 columns.
ptg_stud_data
Student Performance Data Set - male testing data
ptg_stud_m_test
An object of class data.frame with 53 rows and 30 columns.
ptg_stud_data
Student Performance Data Set - male training data
ptg_stud_m_train
An object of class data.frame with 213 rows and 30 columns.
ptg_stud_data
This function takes a number theta and returns its
respective sigmoid probability, 1 / (1 + exp(-theta)).
This is used in logistic regression to model the outcome probability
from the linear predictor.
sigmoid(theta)
theta | the linear predictor |
the sigmoid probability
sigmoid(0)
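A minimal sketch of the standard logistic sigmoid follows, checked against base R's plogis. The name sigmoid_sketch is hypothetical and the package's own sigmoid may be written differently.

```r
# Hypothetical re-implementation of the standard logistic sigmoid.
sigmoid_sketch <- function(theta) {
  1 / (1 + exp(-theta))
}

# A linear predictor of 0 corresponds to a probability of 0.5.
print(sigmoid_sketch(0))

# The function is vectorised and agrees with stats::plogis.
theta <- c(-2, 0, 2)
print(sigmoid_sketch(theta))
print(all.equal(sigmoid_sketch(theta), plogis(theta)))
```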