Title: Tools for Developing Binary Logistic Regression Models
Description: Tools designed to make it easier for beginner and intermediate users to build and validate binary logistic regression models. Includes bivariate analysis, comprehensive regression output, model fit statistics, variable selection procedures, model validation techniques and a 'shiny' app for interactive model building.
Authors: Aravind Hebbali [aut, cre]
Maintainer: Aravind Hebbali <[email protected]>
License: MIT + file LICENSE
Version: 0.3.1
Built: 2024-11-13 14:25:24 UTC
Source: CRAN
The data is related to direct marketing campaigns of a Portuguese banking institution. The marketing campaigns were based on phone calls. Often, more than one contact with the same client was required in order to assess whether the product (a bank term deposit) would be subscribed ('yes') or not ('no').
bank_marketing
A tibble with 4521 rows and 17 variables:
age of the client
type of job
marital status
education level of the client
has credit in default?
has housing loan?
has personal loan?
contact communication type
last contact month of year
last contact day of the week
last contact duration, in seconds
number of contacts performed during this campaign and for this client
number of days that passed by after the client was last contacted from a previous campaign
number of contacts performed before this campaign and for this client
outcome of the previous marketing campaign
has the client subscribed a term deposit?
[Moro et al., 2014] S. Moro, P. Cortez and P. Rita. A Data-Driven Approach to Predict the Success of Bank Telemarketing. Decision Support Systems, Elsevier, 62:22-31, June 2014
Information value and likelihood ratio chi square test for initial variable/predictor selection. Currently available for categorical predictors only.

blr_bivariate_analysis(data, response, ...)

## Default S3 method:
blr_bivariate_analysis(data, response, ...)

data: A data.frame or tibble.
response: Response variable; column in data.
...: Predictor variables; columns in data.
A tibble with the following columns:
Variable: Variable name
Information Value: Information value
LR Chi Square: Likelihood ratio statistic
LR DF: Likelihood ratio degrees of freedom
LR p-value: Likelihood ratio p value
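For orientation, the information value reported above follows the usual weight-of-evidence definition: IV is the sum, over the levels of the predictor, of (% of events - % of non-events) * ln(% of events / % of non-events). The base-R sketch below illustrates that formula for a single categorical predictor on simulated data; it is an illustration only, not the package's implementation.

# Hypothetical illustration of the information value calculation
# x: categorical predictor, y: 0/1 response
info_value <- function(x, y) {
  tab        <- table(x, y)
  events     <- tab[, "1"] / sum(tab[, "1"])  # share of events in each level
  non_events <- tab[, "0"] / sum(tab[, "0"])  # share of non-events in each level
  woe        <- log(events / non_events)      # weight of evidence per level
  sum((events - non_events) * woe)            # information value
}

set.seed(123)
x <- sample(c("a", "b", "c"), 500, replace = TRUE)
y <- rbinom(500, 1, ifelse(x == "a", 0.6, 0.3))
info_value(x, y)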
Other bivariate analysis procedures: blr_segment(), blr_segment_dist(), blr_segment_twoway(), blr_woe_iv(), blr_woe_iv_stats()

blr_bivariate_analysis(hsb2, honcomp, female, prog, race, schtyp)
Variance inflation factor, tolerance, eigenvalues and condition indices.
blr_coll_diag(model)

blr_vif_tol(model)

blr_eigen_cindex(model)

model: An object of class glm.
Collinearity implies two variables are near perfect linear combinations of one another. Multicollinearity involves more than two variables. In the presence of multicollinearity, regression estimates are unstable and have high standard errors.
Tolerance
Percent of variance in the predictor that cannot be accounted for by other predictors.
Variance Inflation Factor
Variance inflation factors measure the inflation in the variances of the parameter estimates due to collinearities that exist among the predictors. It is a measure of how much the variance of the estimated regression coefficient is inflated by the existence of correlation among the predictor variables in the model. A VIF of 1 means that there is no correlation between the kth predictor and the remaining predictor variables, and hence the variance of its estimated coefficient is not inflated at all. The general rule of thumb is that VIFs exceeding 4 warrant further investigation, while VIFs exceeding 10 are signs of serious multicollinearity requiring correction.
Condition Index
Most multivariate statistical approaches involve decomposing a correlation matrix into linear combinations of variables. The linear combinations are chosen so that the first combination has the largest possible variance (subject to some restrictions), the second combination has the next largest variance, subject to being uncorrelated with the first, the third has the largest possible variance, subject to being uncorrelated with the first and second, and so forth. The variance of each of these linear combinations is called an eigenvalue. Collinearity is spotted by finding 2 or more variables that have large proportions of variance (.50 or more) that correspond to large condition indices. A rule of thumb is to label as large those condition indices in the range of 30 or larger.
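As a rough, package-independent illustration of these quantities: the VIF of the kth predictor is 1 / (1 - R_k^2), where R_k^2 comes from regressing that predictor on the remaining predictors; tolerance is its reciprocal; and condition indices are derived from the eigenvalues of the unit-scaled cross-product of the model matrix. The exact scaling used by blr_eigen_cindex may differ in detail.

library(blorr)  # for the hsb2 example data

model <- glm(honcomp ~ female + read + science, data = hsb2,
             family = binomial(link = 'logit'))

X <- model.matrix(model)[, -1]   # predictors without the intercept

# tolerance and VIF via auxiliary regressions: VIF_k = 1 / (1 - R_k^2)
vif <- sapply(seq_len(ncol(X)), function(k) {
  r2 <- summary(lm(X[, k] ~ X[, -k]))$r.squared
  1 / (1 - r2)
})
tolerance <- 1 / vif

# condition indices: sqrt(largest eigenvalue / each eigenvalue) of the
# unit-length-scaled cross-product of the full model matrix
M   <- model.matrix(model)
Z   <- scale(M, center = FALSE, scale = sqrt(colSums(M^2)))
eig <- eigen(crossprod(Z))$values
condition_index <- sqrt(max(eig) / eig)

vif
tolerance
condition_index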
blr_coll_diag returns an object of class "blr_coll_diag". An object of class "blr_coll_diag" is a list containing the following components:

vif_t: tolerance and variance inflation factors
eig_cindex: eigenvalues and condition index
Belsley, D. A., Kuh, E., and Welsch, R. E. (1980). Regression Diagnostics: Identifying Influential Data and Sources of Collinearity. New York: John Wiley & Sons.
# model
model <- glm(honcomp ~ female + read + science, data = hsb2, family = binomial(link = 'logit'))

# vif and tolerance
blr_vif_tol(model)

# eigenvalues and condition indices
blr_eigen_cindex(model)

# collinearity diagnostics
blr_coll_diag(model)
Confusion matrix and statistics.
blr_confusion_matrix(model, cutoff = 0.5, data = NULL, ...)

## Default S3 method:
blr_confusion_matrix(model, cutoff = 0.5, data = NULL, ...)

model: An object of class glm.
cutoff: Cutoff for classification.
data: A data.frame or tibble.
...: Other arguments.

Confusion matrix.
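Conceptually the matrix cross-tabulates observed responses against classes obtained by thresholding the fitted probabilities at cutoff. A minimal base-R sketch of that idea (not the package's implementation, which also reports sensitivity, specificity and related statistics) is:

library(blorr)  # for the hsb2 example data

model <- glm(honcomp ~ female + read + science, data = hsb2,
             family = binomial(link = 'logit'))

cutoff    <- 0.4
predicted <- as.integer(fitted(model) >= cutoff)  # 1 = predicted event
observed  <- model$y                              # observed 0/1 response

table(observed, predicted)   # confusion matrix
mean(observed == predicted)  # accuracy at this cutoff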
Other model validation techniques: blr_decile_capture_rate(), blr_decile_lift_chart(), blr_gains_table(), blr_gini_index(), blr_ks_chart(), blr_lorenz_curve(), blr_roc_curve(), blr_test_hosmer_lemeshow()

model <- glm(honcomp ~ female + read + science, data = hsb2, family = binomial(link = 'logit'))
blr_confusion_matrix(model, cutoff = 0.4)
Visualize the decile wise event rate.
blr_decile_capture_rate(
  gains_table,
  xaxis_title = "Decile",
  yaxis_title = "Capture Rate",
  title = "Capture Rate by Decile",
  bar_color = "blue",
  text_size = 3.5,
  text_vjust = -0.3,
  print_plot = TRUE
)

gains_table: An object of class blr_gains_table.
xaxis_title: X axis title.
yaxis_title: Y axis title.
title: Plot title.
bar_color: Bar color.
text_size: Size of the bar labels.
text_vjust: Vertical justification of the bar labels.
print_plot: logical; if TRUE, prints the plot else returns a plot object.

Other model validation techniques: blr_confusion_matrix(), blr_decile_lift_chart(), blr_gains_table(), blr_gini_index(), blr_ks_chart(), blr_lorenz_curve(), blr_roc_curve(), blr_test_hosmer_lemeshow()

model <- glm(honcomp ~ female + read + science, data = hsb2, family = binomial(link = 'logit'))
gt <- blr_gains_table(model)
blr_decile_capture_rate(gt)
Decile wise lift chart.
blr_decile_lift_chart(
  gains_table,
  xaxis_title = "Decile",
  yaxis_title = "Decile Mean / Global Mean",
  title = "Decile Lift Chart",
  bar_color = "blue",
  text_size = 3.5,
  text_vjust = -0.3,
  print_plot = TRUE
)

gains_table: An object of class blr_gains_table.
xaxis_title: X axis title.
yaxis_title: Y axis title.
title: Plot title.
bar_color: Color of the bars.
text_size: Size of the bar labels.
text_vjust: Vertical justification of the bar labels.
print_plot: logical; if TRUE, prints the plot else returns a plot object.

Other model validation techniques: blr_confusion_matrix(), blr_decile_capture_rate(), blr_gains_table(), blr_gini_index(), blr_ks_chart(), blr_lorenz_curve(), blr_roc_curve(), blr_test_hosmer_lemeshow()

model <- glm(honcomp ~ female + read + science, data = hsb2, family = binomial(link = 'logit'))
gt <- blr_gains_table(model)
blr_decile_lift_chart(gt)
Compute sensitivity, specificity, accuracy and KS statistics to generate the lift chart and the KS chart.
blr_gains_table(model, data = NULL)

## S3 method for class 'blr_gains_table'
plot(
  x,
  title = "Lift Chart",
  xaxis_title = "% Population",
  yaxis_title = "% Cumulative 1s",
  diag_line_col = "red",
  lift_curve_col = "blue",
  plot_title_justify = 0.5,
  print_plot = TRUE,
  ...
)

model: An object of class glm.
data: A data.frame or tibble.
x: An object of class blr_gains_table.
title: Plot title.
xaxis_title: X axis title.
yaxis_title: Y axis title.
diag_line_col: Diagonal line color.
lift_curve_col: Color of the lift curve.
plot_title_justify: Horizontal justification of the plot title.
print_plot: logical; if TRUE, prints the plot else returns a plot object.
...: Other inputs.
A tibble.
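Under the hood, a gains table of this kind is built by ranking observations on their predicted probability, splitting them into deciles and accumulating events. The base-R sketch below shows that bookkeeping on the hsb2 example; it illustrates the idea only, and the exact decile boundaries may differ from blr_gains_table.

library(blorr)  # for the hsb2 example data

model <- glm(honcomp ~ female + read + science, data = hsb2,
             family = binomial(link = 'logit'))

p      <- fitted(model)
y      <- model$y
ord    <- order(p, decreasing = TRUE)               # rank by predicted probability
decile <- ceiling(seq_along(p) / (length(p) / 10))  # ten roughly equal groups

events <- tapply(y[ord], decile, sum)               # events captured per decile
total  <- tapply(y[ord], decile, length)

data.frame(
  decile         = 1:10,
  total          = as.integer(total),
  events         = as.integer(events),
  cum_events_pct = 100 * cumsum(events) / sum(y),   # % of 1s captured so far
  cum_pop_pct    = 100 * cumsum(total) / length(y)  # % of population covered
)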
Agresti, A. (2007), An Introduction to Categorical Data Analysis, Second Edition, New York: John Wiley & Sons.
Agresti, A. (2013), Categorical Data Analysis, Third Edition, New York: John Wiley & Sons.
Thomas LC (2009): Consumer Credit Models: Pricing, Profit, and Portfolio. Oxford, Oxford University Press.
Sobehart J, Keenan S, Stein R (2000): Benchmarking Quantitative Default Risk Models: A Validation Methodology, Moody’s Investors Service.
Other model validation techniques: blr_confusion_matrix(), blr_decile_capture_rate(), blr_decile_lift_chart(), blr_gini_index(), blr_ks_chart(), blr_lorenz_curve(), blr_roc_curve(), blr_test_hosmer_lemeshow()

model <- glm(honcomp ~ female + read + science, data = hsb2, family = binomial(link = 'logit'))

# gains table
blr_gains_table(model)

# lift chart
k <- blr_gains_table(model)
plot(k)
Gini index is a measure of inequality and was developed to measure income inequality in labour market. In the predictive model, Gini Index is used for measuring discriminatory power.
blr_gini_index(model, data = NULL)

model: An object of class glm.
data: A data.frame or tibble.
Gini index.
Siddiqi N (2006): Credit Risk Scorecards: developing and implementing intelligent credit scoring. New Jersey, Wiley.
Müller M, Rönz B (2000): Credit Scoring using Semiparametric Methods. In: Franke J, Härdle W, Stahl G (Eds.): Measuring Risk in Complex Stochastic Systems. New York, Springer-Verlag.
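A common way to compute the Gini index for a binary classifier is 2 * AUC - 1, with the AUC obtained from the Mann-Whitney rank statistic; the sketch below uses that route on the hsb2 example. It is an illustration only and, depending on tie handling and the exact definition used, may differ slightly from blr_gini_index.

library(blorr)  # for the hsb2 example data

model <- glm(honcomp ~ female + read + science, data = hsb2,
             family = binomial(link = 'logit'))

p <- fitted(model)
y <- model$y

# rank-based (Mann-Whitney) estimate of the area under the ROC curve
n1  <- sum(y == 1)
n0  <- sum(y == 0)
auc <- (sum(rank(p)[y == 1]) - n1 * (n1 + 1) / 2) / (n1 * n0)

2 * auc - 1   # Gini index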
Other model validation techniques: blr_confusion_matrix(), blr_decile_capture_rate(), blr_decile_lift_chart(), blr_gains_table(), blr_ks_chart(), blr_lorenz_curve(), blr_roc_curve(), blr_test_hosmer_lemeshow()

model <- glm(honcomp ~ female + read + science, data = hsb2, family = binomial(link = 'logit'))
blr_gini_index(model)
The Kolmogorov-Smirnov (KS) statistic is used to assess the predictive power of marketing or credit risk models. It is the maximum difference between the cumulative event and non-event distributions across score/probability bands. The gains table gives these cumulative distributions across score bands and can be used to find the KS statistic for a model.

blr_ks_chart(
  gains_table,
  title = "KS Chart",
  yaxis_title = " ",
  xaxis_title = "Cumulative Population %",
  ks_line_color = "black",
  print_plot = TRUE
)

gains_table: An object of class blr_gains_table.
title: Plot title.
yaxis_title: Y axis title.
xaxis_title: X axis title.
ks_line_color: Color of the line indicating the maximum KS statistic.
print_plot: logical; if TRUE, prints the plot else returns a plot object.
https://pubmed.ncbi.nlm.nih.gov/843576/
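Put differently, the KS statistic is the largest vertical gap between the cumulative event and cumulative non-event distributions once observations are sorted by predicted probability. A base-R sketch of that calculation (an illustration, not the package's internals) is:

library(blorr)  # for the hsb2 example data

model <- glm(honcomp ~ female + read + science, data = hsb2,
             family = binomial(link = 'logit'))

p   <- fitted(model)
y   <- model$y
ord <- order(p, decreasing = TRUE)

cum_event    <- cumsum(y[ord] == 1) / sum(y == 1)  # cumulative % of events
cum_nonevent <- cumsum(y[ord] == 0) / sum(y == 0)  # cumulative % of non-events

max(abs(cum_event - cum_nonevent))                 # KS statistic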
Other model validation techniques: blr_confusion_matrix(), blr_decile_capture_rate(), blr_decile_lift_chart(), blr_gains_table(), blr_gini_index(), blr_lorenz_curve(), blr_roc_curve(), blr_test_hosmer_lemeshow()

model <- glm(honcomp ~ female + read + science, data = hsb2, family = binomial(link = 'logit'))
gt <- blr_gains_table(model)
blr_ks_chart(gt)
Launches shiny app for interactive model building.
blr_launch_app()
## Not run:
blr_launch_app()
## End(Not run)
Test for model specification error.
blr_linktest(model)

model: An object of class glm.

An object of class glm.
Pregibon, D. 1979. Data analytic methods for generalized linear models. PhD diss., University of Toronto.
Pregibon, D. 1980. Goodness of link tests for generalized linear models.
Tukey, J. W. 1949. One degree of freedom for non-additivity.
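The idea behind the test, following Pregibon, is to refit the model with the linear predictor (hat) and its square (hatsq) as the only covariates; a significant hatsq suggests the model is misspecified. A hedged sketch of that idea, using hypothetical object names rather than the package's code, is:

library(blorr)  # for the hsb2 example data

model <- glm(honcomp ~ female + read + science, data = hsb2,
             family = binomial(link = 'logit'))

hat   <- predict(model)   # linear predictor of the fitted model
hatsq <- hat ^ 2

# refit on hat and hatsq only; inspect the significance of hatsq
linktest_fit <- glm(model$y ~ hat + hatsq, family = binomial(link = 'logit'))
summary(linktest_fit)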
model <- glm(honcomp ~ female + read + science, data = hsb2, family = binomial(link = 'logit'))
blr_linktest(model)
Lorenz curve is a visual representation of inequality. It is used to measure the discriminatory power of the predictive model.
blr_lorenz_curve(
  model,
  data = NULL,
  title = "Lorenz Curve",
  xaxis_title = "Cumulative Events %",
  yaxis_title = "Cumulative Non Events %",
  diag_line_col = "red",
  lorenz_curve_col = "blue",
  print_plot = TRUE
)

model: An object of class glm.
data: A data.frame or tibble.
title: Plot title.
xaxis_title: X axis title.
yaxis_title: Y axis title.
diag_line_col: Diagonal line color.
lorenz_curve_col: Color of the Lorenz curve.
print_plot: logical; if TRUE, prints the plot else returns a plot object.

Other model validation techniques: blr_confusion_matrix(), blr_decile_capture_rate(), blr_decile_lift_chart(), blr_gains_table(), blr_gini_index(), blr_ks_chart(), blr_roc_curve(), blr_test_hosmer_lemeshow()

model <- glm(honcomp ~ female + read + science, data = hsb2, family = binomial(link = 'logit'))
blr_lorenz_curve(model)
Model fit statistics.
blr_model_fit_stats(model, ...)

model: An object of class glm.
...: Other inputs.
Menard, S. (2000). Coefficients of determination for multiple logistic regression analysis. The American Statistician, 54(1), 17-24.
Windmeijer, F. A. G. (1995). Goodness-of-fit measures in binary choice models. Econometric Reviews, 14, 101-116.
Hosmer, D.W., Jr., & Lemeshow, S. (2000), Applied Logistic Regression (2nd ed.). New York: John Wiley & Sons.
J. Scott Long & Jeremy Freese, 2000. "FITSTAT: Stata module to compute fit statistics for single equation regression models," Statistical Software Components S407201, Boston College Department of Economics, revised 22 Feb 2001.
Freese, Jeremy and J. Scott Long. Regression Models for Categorical Dependent Variables Using Stata. College Station: Stata Press, 2006.
Long, J. Scott. Regression Models for Categorical and Limited Dependent Variables. Thousand Oaks: Sage Publications, 1997.
Other model fit statistics: blr_multi_model_fit_stats(), blr_pairs(), blr_rsq_adj_count(), blr_rsq_cox_snell(), blr_rsq_effron(), blr_rsq_mcfadden_adj(), blr_rsq_mckelvey_zavoina(), blr_rsq_nagelkerke(), blr_test_lr()

model <- glm(honcomp ~ female + read + science, data = hsb2, family = binomial(link = 'logit'))
blr_model_fit_stats(model)
Measures of model fit statistics for multiple models.
blr_multi_model_fit_stats(model, ...)

## Default S3 method:
blr_multi_model_fit_stats(model, ...)

model: An object of class glm.
...: Objects of class glm.
A tibble.
Other model fit statistics: blr_model_fit_stats(), blr_pairs(), blr_rsq_adj_count(), blr_rsq_cox_snell(), blr_rsq_effron(), blr_rsq_mcfadden_adj(), blr_rsq_mckelvey_zavoina(), blr_rsq_nagelkerke(), blr_test_lr()

model <- glm(honcomp ~ female + read + science, data = hsb2, family = binomial(link = 'logit'))
model2 <- glm(honcomp ~ female + read + math, data = hsb2, family = binomial(link = 'logit'))
blr_multi_model_fit_stats(model, model2)
Association of predicted probabilities and observed responses.
blr_pairs(model)

model: An object of class glm.
A tibble.
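These association measures are built from all pairs consisting of one event and one non-event: a pair is concordant when the event has the higher predicted probability, discordant when it has the lower, and tied otherwise. The sketch below (O(n1 * n0) comparisons, fine for small data) illustrates the definitions; it is not the package's implementation.

library(blorr)  # for the hsb2 example data

model <- glm(honcomp ~ female + read + science, data = hsb2,
             family = binomial(link = 'logit'))

p1 <- fitted(model)[model$y == 1]   # predicted probabilities of events
p0 <- fitted(model)[model$y == 0]   # predicted probabilities of non-events

comp <- outer(p1, p0, "-")          # every event/non-event comparison

concordant <- mean(comp > 0)
discordant <- mean(comp < 0)
tied       <- mean(comp == 0)

c(pairs       = length(comp),
  concordance = concordant,
  discordance = discordant,
  tied        = tied,
  somers_d    = concordant - discordant)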
Other model fit statistics: blr_model_fit_stats(), blr_multi_model_fit_stats(), blr_rsq_adj_count(), blr_rsq_cox_snell(), blr_rsq_effron(), blr_rsq_mcfadden_adj(), blr_rsq_mckelvey_zavoina(), blr_rsq_nagelkerke(), blr_test_lr()

model <- glm(honcomp ~ female + read + science, data = hsb2, family = binomial(link = 'logit'))
blr_pairs(model)
Confidence interval displacement diagnostics C vs fitted values plot.
blr_plot_c_fitted(model, point_color = "blue", title = "CI Displacement C vs Fitted Values Plot", xaxis_title = "Fitted Values", yaxis_title = "CI Displacement C")

model: An object of class glm.
point_color: Color of the points.
title: Title of the plot.
xaxis_title: X axis label.
yaxis_title: Y axis label.

model <- glm(honcomp ~ female + read + science, data = hsb2, family = binomial(link = 'logit'))
blr_plot_c_fitted(model)
Confidence interval displacement diagnostics C vs leverage plot.
blr_plot_c_leverage(model, point_color = "blue", title = "CI Displacement C vs Leverage Plot", xaxis_title = "Leverage", yaxis_title = "CI Displacement C")

model: An object of class glm.
point_color: Color of the points.
title: Title of the plot.
xaxis_title: X axis label.
yaxis_title: Y axis label.

model <- glm(honcomp ~ female + read + science, data = hsb2, family = binomial(link = 'logit'))
blr_plot_c_leverage(model)
Deviance vs fitted values plot.
blr_plot_deviance_fitted(model, point_color = "blue", line_color = "red", title = "Deviance Residual vs Fitted Values", xaxis_title = "Fitted Values", yaxis_title = "Deviance Residual")

model: An object of class glm.
point_color: Color of the points.
line_color: Color of the horizontal line.
title: Title of the plot.
xaxis_title: X axis label.
yaxis_title: Y axis label.

model <- glm(honcomp ~ female + read + science, data = hsb2, family = binomial(link = 'logit'))
blr_plot_deviance_fitted(model)
Deviance residuals plot.
blr_plot_deviance_residual(model, point_color = "blue", title = "Deviance Residuals Plot", xaxis_title = "id", yaxis_title = "Deviance Residuals")

model: An object of class glm.
point_color: Color of the points.
title: Title of the plot.
xaxis_title: X axis label.
yaxis_title: Y axis label.

model <- glm(honcomp ~ female + read + science, data = hsb2, family = binomial(link = 'logit'))
blr_plot_deviance_residual(model)
Panel of plots to detect influential observations using DFBETAs.
blr_plot_dfbetas_panel(model, print_plot = TRUE)

model: An object of class glm.
print_plot: logical; if TRUE, prints the plot else returns a plot object.
DFBETA measures the difference in each parameter estimate with and without the influential point. There is a DFBETA for each data point, i.e. if there are n observations and k variables, there will be n x k DFBETAs. In general, large values of DFBETAS indicate observations that are influential in estimating a given parameter. Belsley, Kuh, and Welsch recommend 2 as a general cutoff value to indicate influential observations and 2 / sqrt(n) as a size-adjusted cutoff.
list; blr_plot_dfbetas_panel returns a list of tibbles (for the intercept and each predictor) with the observation number and DFBETA of observations that exceed the threshold for classifying an observation as an outlier/influential observation.
Belsley, David A.; Kuh, Edwin; Welsch, Roy E. (1980). Regression Diagnostics: Identifying Influential Data and Sources of Collinearity. Wiley Series in Probability and Mathematical Statistics. New York: John Wiley & Sons. ISBN 0-471-05856-4.
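With the size-adjusted threshold 2 / sqrt(n), the flagging logic can be reproduced with base R's dfbetas(); the sketch below is a rough analogue and will not necessarily match the panel's output exactly.

library(blorr)  # for the hsb2 example data

model <- glm(honcomp ~ female + read + science, data = hsb2,
             family = binomial(link = 'logit'))

db        <- dfbetas(model)       # one column per coefficient
threshold <- 2 / sqrt(nrow(db))   # size-adjusted cutoff

# observations exceeding the threshold for at least one coefficient
which(apply(abs(db) > threshold, 1, any))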
## Not run:
model <- glm(honcomp ~ female + read + science, data = hsb2, family = binomial(link = 'logit'))
blr_plot_dfbetas_panel(model)
## End(Not run)
Confidence interval displacement diagnostics C plot.
blr_plot_diag_c(model, point_color = "blue", title = "CI Displacement C Plot", xaxis_title = "id", yaxis_title = "CI Displacement C")

model: An object of class glm.
point_color: Color of the points.
title: Title of the plot.
xaxis_title: X axis label.
yaxis_title: Y axis label.

model <- glm(honcomp ~ female + read + science, data = hsb2, family = binomial(link = 'logit'))
blr_plot_diag_c(model)
Confidence interval displacement diagnostics CBAR plot.
blr_plot_diag_cbar(model, point_color = "blue", title = "CI Displacement CBAR Plot", xaxis_title = "id", yaxis_title = "CI Displacement CBAR")

model: An object of class glm.
point_color: Color of the points.
title: Title of the plot.
xaxis_title: X axis label.
yaxis_title: Y axis label.

model <- glm(honcomp ~ female + read + science, data = hsb2, family = binomial(link = 'logit'))
blr_plot_diag_cbar(model)
Diagnostics for detecting ill fitted observations.
blr_plot_diag_difchisq(model, point_color = "blue", title = "Delta Chisquare Plot", xaxis_title = "id", yaxis_title = "Delta Chisquare")

model: An object of class glm.
point_color: Color of the points.
title: Title of the plot.
xaxis_title: X axis label.
yaxis_title: Y axis label.

model <- glm(honcomp ~ female + read + science, data = hsb2, family = binomial(link = 'logit'))
blr_plot_diag_difchisq(model)
Diagnostics for detecting ill fitted observations.
blr_plot_diag_difdev(model, point_color = "blue", title = "Delta Deviance Plot", xaxis_title = "id", yaxis_title = "Delta Deviance")

model: An object of class glm.
point_color: Color of the points.
title: Title of the plot.
xaxis_title: X axis label.
yaxis_title: Y axis label.

model <- glm(honcomp ~ female + read + science, data = hsb2, family = binomial(link = 'logit'))
blr_plot_diag_difdev(model)
Diagnostic plots for fitted values.
blr_plot_diag_fit(model, print_plot = TRUE)

model: An object of class glm.
print_plot: logical; if TRUE, prints the plot else returns a plot object.
A panel of diagnostic plots for fitted values.
Fox, John (1991), Regression Diagnostics. Newbury Park, CA: Sage Publications.
Cook, R. D. and Weisberg, S. (1982), Residuals and Influence in Regression, New York: Chapman & Hall.
Other diagnostic plots: blr_plot_diag_influence(), blr_plot_diag_leverage()

model <- glm(honcomp ~ female + read + science, data = hsb2, family = binomial(link = 'logit'))
blr_plot_diag_fit(model)
Residual diagnostic plots for detecting influential observations.

blr_plot_diag_influence(model, print_plot = TRUE)

model: An object of class glm.
print_plot: logical; if TRUE, prints the plot else returns a plot object.
A panel of influence diagnostic plots.
Fox, John (1991), Regression Diagnostics. Newbury Park, CA: Sage Publications.
Cook, R. D. and Weisberg, S. (1982), Residuals and Influence in Regression, New York: Chapman & Hall.
Other diagnostic plots: blr_plot_diag_fit(), blr_plot_diag_leverage()

model <- glm(honcomp ~ female + read + science, data = hsb2, family = binomial(link = 'logit'))
blr_plot_diag_influence(model)
Diagnostic plots for leverage.
blr_plot_diag_leverage(model, print_plot = TRUE)

model: An object of class glm.
print_plot: logical; if TRUE, prints the plot else returns a plot object.
A panel of diagnostic plots for leverage.
Fox, John (1991), Regression Diagnostics. Newbury Park, CA: Sage Publications.
Cook, R. D. and Weisberg, S. (1982), Residuals and Influence in Regression, New York: Chapman & Hall.
Other diagnostic plots: blr_plot_diag_fit(), blr_plot_diag_influence()

model <- glm(honcomp ~ female + read + science, data = hsb2, family = binomial(link = 'logit'))
blr_plot_diag_leverage(model)
Delta Chi Square vs fitted values plot for detecting ill fitted observations.
blr_plot_difchisq_fitted(model, point_color = "blue", title = "Delta Chi Square vs Fitted Values Plot", xaxis_title = "Fitted Values", yaxis_title = "Delta Chi Square")

model: An object of class glm.
point_color: Color of the points.
title: Title of the plot.
xaxis_title: X axis label.
yaxis_title: Y axis label.

model <- glm(honcomp ~ female + read + science, data = hsb2, family = binomial(link = 'logit'))
blr_plot_difchisq_fitted(model)
Delta chi square vs leverage plot.
blr_plot_difchisq_leverage(model, point_color = "blue", title = "Delta Chi Square vs Leverage Plot", xaxis_title = "Leverage", yaxis_title = "Delta Chi Square")

model: An object of class glm.
point_color: Color of the points.
title: Title of the plot.
xaxis_title: X axis label.
yaxis_title: Y axis label.

model <- glm(honcomp ~ female + read + science, data = hsb2, family = binomial(link = 'logit'))
blr_plot_difchisq_leverage(model)
Delta deviance vs fitted values plot for detecting ill fitted observations.
blr_plot_difdev_fitted(model, point_color = "blue", title = "Delta Deviance vs Fitted Values Plot", xaxis_title = "Fitted Values", yaxis_title = "Delta Deviance")

model: An object of class glm.
point_color: Color of the points.
title: Title of the plot.
xaxis_title: X axis label.
yaxis_title: Y axis label.

model <- glm(honcomp ~ female + read + science, data = hsb2, family = binomial(link = 'logit'))
blr_plot_difdev_fitted(model)
Delta deviance vs leverage plot.
blr_plot_difdev_leverage(model, point_color = "blue", title = "Delta Deviance vs Leverage Plot", xaxis_title = "Leverage", yaxis_title = "Delta Deviance")

model: An object of class glm.
point_color: Color of the points.
title: Title of the plot.
xaxis_title: X axis label.
yaxis_title: Y axis label.

model <- glm(honcomp ~ female + read + science, data = hsb2, family = binomial(link = 'logit'))
blr_plot_difdev_leverage(model)
Fitted values vs leverage plot.
blr_plot_fitted_leverage(model, point_color = "blue", title = "Fitted Values vs Leverage Plot", xaxis_title = "Leverage", yaxis_title = "Fitted Values")

model: An object of class glm.
point_color: Color of the points.
title: Title of the plot.
xaxis_title: X axis label.
yaxis_title: Y axis label.

model <- glm(honcomp ~ female + read + science, data = hsb2, family = binomial(link = 'logit'))
blr_plot_fitted_leverage(model)
Leverage plot.
blr_plot_leverage(model, point_color = "blue", title = "Leverage Plot", xaxis_title = "id", yaxis_title = "Leverage")

model: An object of class glm.
point_color: Color of the points.
title: Title of the plot.
xaxis_title: X axis label.
yaxis_title: Y axis label.

model <- glm(honcomp ~ female + read + science, data = hsb2, family = binomial(link = 'logit'))
blr_plot_leverage(model)
Leverage vs fitted values plot.

blr_plot_leverage_fitted(model, point_color = "blue", title = "Leverage vs Fitted Values", xaxis_title = "Fitted Values", yaxis_title = "Leverage")

model: An object of class glm.
point_color: Color of the points.
title: Title of the plot.
xaxis_title: X axis label.
yaxis_title: Y axis label.

model <- glm(honcomp ~ female + read + science, data = hsb2, family = binomial(link = 'logit'))
blr_plot_leverage_fitted(model)
Standardised Pearson residuals plot.

blr_plot_pearson_residual(model, point_color = "blue", title = "Standardized Pearson Residuals", xaxis_title = "id", yaxis_title = "Standardized Pearson Residuals")

model: An object of class glm.
point_color: Color of the points.
title: Title of the plot.
xaxis_title: X axis label.
yaxis_title: Y axis label.

model <- glm(honcomp ~ female + read + science, data = hsb2, family = binomial(link = 'logit'))
blr_plot_pearson_residual(model)
Residual vs fitted values plot.
blr_plot_residual_fitted(model, point_color = "blue", line_color = "red", title = "Standardized Pearson Residual vs Fitted Values", xaxis_title = "Fitted Values", yaxis_title = "Standardized Pearson Residual")

model: An object of class glm.
point_color: Color of the points.
line_color: Color of the horizontal line.
title: Title of the plot.
xaxis_title: X axis label.
yaxis_title: Y axis label.

model <- glm(honcomp ~ female + read + science, data = hsb2, family = binomial(link = 'logit'))
blr_plot_residual_fitted(model)
Data for generating decile capture rate.
blr_prep_dcrate_data(gains_table)

gains_table: An object of class blr_gains_table.

model <- glm(honcomp ~ female + read + science, data = hsb2, family = binomial(link = 'logit'))
gt <- blr_gains_table(model)
blr_prep_dcrate_data(gt)
Data for generating KS chart.
blr_prep_kschart_data(gains_table)

blr_prep_kschart_line(gains_table)

blr_prep_ksannotate_y(ks_line)

blr_prep_kschart_stat(ks_line)

blr_prep_ksannotate_x(ks_line)

gains_table: An object of class blr_gains_table.
ks_line: Output of blr_prep_kschart_line().

model <- glm(honcomp ~ female + read + science, data = hsb2, family = binomial(link = 'logit'))
gt <- blr_gains_table(model)
blr_prep_kschart_data(gt)
ks_line <- blr_prep_kschart_line(gt)
blr_prep_kschart_stat(ks_line)
blr_prep_ksannotate_y(ks_line)
blr_prep_ksannotate_x(ks_line)
Data for generating lift chart.
blr_prep_lchart_gmean(gains_table)

blr_prep_lchart_data(gains_table, global_mean)

gains_table: An object of class blr_gains_table.
global_mean: Overall conversion rate.

model <- glm(honcomp ~ female + read + science, data = hsb2, family = binomial(link = 'logit'))
gt <- blr_gains_table(model)
globalmean <- blr_prep_lchart_gmean(gt)
blr_prep_lchart_data(gt, globalmean)
Data for generating Lorenz curve.
blr_prep_lorenz_data(model, data = NULL, test_data = FALSE)

model: An object of class glm.
data: A data.frame or tibble.
test_data: Logical; set to TRUE if data is test data.

model <- glm(honcomp ~ female + read + science, data = hsb2, family = binomial(link = 'logit'))
data <- model$data
blr_prep_lorenz_data(model, data, FALSE)
Data for generating ROC curve.
blr_prep_roc_data(gains_table)

gains_table: An object of class blr_gains_table.

model <- glm(honcomp ~ female + read + science, data = hsb2, family = binomial(link = 'logit'))
gt <- blr_gains_table(model)
blr_prep_roc_data(gt)
Binary logistic regression.
blr_regress(object, ...)

## S3 method for class 'glm'
blr_regress(object, odd_conf_limit = FALSE, ...)

object: An object of class "formula" (or one that can be coerced to that class): a symbolic description of the model to be fitted, or an object of class glm.
...: Other inputs.
odd_conf_limit: If TRUE, odds ratio confidence limits will be displayed.

# using formula
blr_regress(object = honcomp ~ female + read + science, data = hsb2)

# using a model built with glm
model <- glm(honcomp ~ female + read + science, data = hsb2, family = binomial(link = 'logit'))
blr_regress(model)

# odds ratio estimates
blr_regress(model, odd_conf_limit = TRUE)
Diagnostics for confidence interval displacement and detecting ill fitted observations.
blr_residual_diagnostics(model)

model: An object of class glm.

C, CBAR, DIFDEV and DIFCHISQ.

model <- glm(honcomp ~ female + read + science, data = hsb2, family = binomial(link = 'logit'))
blr_residual_diagnostics(model)
The receiver operating characteristic (ROC) curve is used for assessing the accuracy of the model classification.

blr_roc_curve(
  gains_table,
  title = "ROC Curve",
  xaxis_title = "1 - Specificity",
  yaxis_title = "Sensitivity",
  roc_curve_col = "blue",
  diag_line_col = "red",
  point_shape = 18,
  point_fill = "blue",
  point_color = "blue",
  plot_title_justify = 0.5,
  print_plot = TRUE
)

gains_table: An object of class blr_gains_table.
title: Plot title.
xaxis_title: X axis title.
yaxis_title: Y axis title.
roc_curve_col: Color of the ROC curve.
diag_line_col: Diagonal line color.
point_shape: Shape of the points on the ROC curve.
point_fill: Fill of the points on the ROC curve.
point_color: Color of the points on the ROC curve.
plot_title_justify: Horizontal justification of the plot title.
print_plot: logical; if TRUE, prints the plot else returns a plot object.
Agresti, A. (2007), An Introduction to Categorical Data Analysis, Second Edition, New York: John Wiley & Sons.
Hosmer, D. W., Jr. and Lemeshow, S. (2000), Applied Logistic Regression, 2nd Edition, New York: John Wiley & Sons.
Siddiqi N (2006): Credit Risk Scorecards: developing and implementing intelligent credit scoring. New Jersey, Wiley.
Thomas LC, Edelman DB, Crook JN (2002): Credit Scoring and Its Applications. Philadelphia, SIAM Monographs on Mathematical Modeling and Computation.
Other model validation techniques: blr_confusion_matrix(), blr_decile_capture_rate(), blr_decile_lift_chart(), blr_gains_table(), blr_gini_index(), blr_ks_chart(), blr_lorenz_curve(), blr_test_hosmer_lemeshow()

model <- glm(honcomp ~ female + read + science, data = hsb2, family = binomial(link = 'logit'))
k <- blr_gains_table(model)
blr_roc_curve(k)
Adjusted count r-squared.
blr_rsq_adj_count(model)

model: An object of class glm.
Adjusted count r-squared.
Other model fit statistics: blr_model_fit_stats(), blr_multi_model_fit_stats(), blr_pairs(), blr_rsq_cox_snell(), blr_rsq_effron(), blr_rsq_mcfadden_adj(), blr_rsq_mckelvey_zavoina(), blr_rsq_nagelkerke(), blr_test_lr()

model <- glm(honcomp ~ female + read + science, data = hsb2, family = binomial(link = 'logit'))
blr_rsq_adj_count(model)
Count r-squared.
blr_rsq_count(model)

model: An object of class glm.
Count r-squared.
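Both count-based measures come from the classification table at a 0.5 cutoff: the count r-squared is the share of correct predictions, and the adjusted version subtracts the number that could be predicted correctly by always choosing the larger outcome class. A sketch of the standard formulas (not the package source) is:

library(blorr)  # for the hsb2 example data

model <- glm(honcomp ~ female + read + science, data = hsb2,
             family = binomial(link = 'logit'))

y        <- model$y
pred     <- as.integer(fitted(model) >= 0.5)
n        <- length(y)
correct  <- sum(y == pred)
majority <- max(table(y))   # correct guesses from always predicting the modal class

c(count          = correct / n,
  adjusted_count = (correct - majority) / (n - majority))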
model <- glm(honcomp ~ female + read + science, data = hsb2, family = binomial(link = 'logit'))
blr_rsq_count(model)
Cox Snell pseudo r-squared.
blr_rsq_cox_snell(model)

model: An object of class glm.
Cox Snell pseudo r-squared.
Cox, D. R., & Snell, E. J. (1989). The analysis of binary data (2nd ed.). London: Chapman and Hall.
Maddala, G. S. (1983). Limited dependent and qualitative variables in economics. New York: Cambridge Press.
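Cox and Snell's measure compares the likelihoods of the fitted and intercept-only models, R2 = 1 - (L0 / L1)^(2/n). A sketch using logLik() on the hsb2 example (standard formula, not the package source):

library(blorr)  # for the hsb2 example data

model      <- glm(honcomp ~ female + read + science, data = hsb2,
                  family = binomial(link = 'logit'))
null_model <- glm(honcomp ~ 1, data = hsb2, family = binomial(link = 'logit'))

n   <- nobs(model)
ll1 <- as.numeric(logLik(model))       # log-likelihood of the fitted model
ll0 <- as.numeric(logLik(null_model))  # log-likelihood of the intercept-only model

1 - exp((2 / n) * (ll0 - ll1))         # Cox and Snell pseudo r-squared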
Other model fit statistics: blr_model_fit_stats(), blr_multi_model_fit_stats(), blr_pairs(), blr_rsq_adj_count(), blr_rsq_effron(), blr_rsq_mcfadden_adj(), blr_rsq_mckelvey_zavoina(), blr_rsq_nagelkerke(), blr_test_lr()

model <- glm(honcomp ~ female + read + science, data = hsb2, family = binomial(link = 'logit'))
blr_rsq_cox_snell(model)
Effron pseudo r-squared.
blr_rsq_effron(model)

model: An object of class glm.
Effron pseudo r-squared.
Efron, B. (1978). Regression and ANOVA with zero-one data: Measures of residual variation. Journal of the American Statistical Association, 73, 113-121.
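Efron's measure treats the fitted probabilities like fitted values in ordinary least squares: R2 = 1 - sum((y - p)^2) / sum((y - mean(y))^2). A sketch of that formula (not the package source):

library(blorr)  # for the hsb2 example data

model <- glm(honcomp ~ female + read + science, data = hsb2,
             family = binomial(link = 'logit'))

y <- model$y
p <- fitted(model)

1 - sum((y - p)^2) / sum((y - mean(y))^2)   # Efron pseudo r-squared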
Other model fit statistics: blr_model_fit_stats(), blr_multi_model_fit_stats(), blr_pairs(), blr_rsq_adj_count(), blr_rsq_cox_snell(), blr_rsq_mcfadden_adj(), blr_rsq_mckelvey_zavoina(), blr_rsq_nagelkerke(), blr_test_lr()

model <- glm(honcomp ~ female + read + science, data = hsb2, family = binomial(link = 'logit'))
blr_rsq_effron(model)
McFadden's pseudo r-squared for the model.
blr_rsq_mcfadden(model)

model: An object of class glm.
McFadden's r-squared.
https://eml.berkeley.edu/reprints/mcfadden/zarembka.pdf
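McFadden's measure compares the log-likelihood of the fitted model with that of the intercept-only model, R2 = 1 - lnL(model) / lnL(null); the adjusted version additionally penalises the number of estimated parameters. A sketch of the standard formulas (the exact parameter count used by the package for the adjustment may differ):

library(blorr)  # for the hsb2 example data

model      <- glm(honcomp ~ female + read + science, data = hsb2,
                  family = binomial(link = 'logit'))
null_model <- glm(honcomp ~ 1, data = hsb2, family = binomial(link = 'logit'))

ll1 <- as.numeric(logLik(model))
ll0 <- as.numeric(logLik(null_model))
k   <- length(coef(model))          # number of estimated parameters

c(mcfadden     = 1 - ll1 / ll0,
  mcfadden_adj = 1 - (ll1 - k) / ll0)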
model <- glm(honcomp ~ female + read + science, data = hsb2, family = binomial(link = 'logit'))
blr_rsq_mcfadden(model)
McFadden's adjusted pseudo r-squared for the model.
blr_rsq_mcfadden_adj(model)

model: An object of class glm.
McFadden's adjusted r-squared.
https://eml.berkeley.edu/reprints/mcfadden/zarembka.pdf
Other model fit statistics: blr_model_fit_stats(), blr_multi_model_fit_stats(), blr_pairs(), blr_rsq_adj_count(), blr_rsq_cox_snell(), blr_rsq_effron(), blr_rsq_mckelvey_zavoina(), blr_rsq_nagelkerke(), blr_test_lr()

model <- glm(honcomp ~ female + read + science, data = hsb2, family = binomial(link = 'logit'))
blr_rsq_mcfadden_adj(model)
McKelvey Zavoina pseudo r-squared.
blr_rsq_mckelvey_zavoina(model)

model: An object of class glm.

McKelvey & Zavoina's pseudo r-squared.

McKelvey, R. D., & Zavoina, W. (1975). A statistical model for the analysis of ordinal level dependent variables. Journal of Mathematical Sociology, 4, 103-120.
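McKelvey and Zavoina's measure works on the latent-variable scale: for the logit model, R2 = Var(Xb) / (Var(Xb) + pi^2 / 3), where Xb is the linear predictor. A sketch of that formula (not the package source; the variance estimator used internally may differ slightly):

library(blorr)  # for the hsb2 example data

model <- glm(honcomp ~ female + read + science, data = hsb2,
             family = binomial(link = 'logit'))

ystar <- predict(model)                 # linear predictor (latent scale)
var(ystar) / (var(ystar) + pi^2 / 3)    # McKelvey & Zavoina pseudo r-squared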
Other model fit statistics: blr_model_fit_stats(), blr_multi_model_fit_stats(), blr_pairs(), blr_rsq_adj_count(), blr_rsq_cox_snell(), blr_rsq_effron(), blr_rsq_mcfadden_adj(), blr_rsq_nagelkerke(), blr_test_lr()

model <- glm(honcomp ~ female + read + science, data = hsb2, family = binomial(link = 'logit'))
blr_rsq_mckelvey_zavoina(model)
Cragg-Uhler (Nagelkerke) R2 pseudo r-squared.
blr_rsq_nagelkerke(model)

model: An object of class glm.
Cragg-Uhler (Nagelkerke) R2 pseudo r-squared.
Cragg, S. G., & Uhler, R. (1970). The demand for automobiles. Canadian Journal of Economics, 3, 386-406.
Maddala, G. S. (1983). Limited dependent and qualitative variables in economics. New York: Cambridge Press.
Nagelkerke, N. (1991). A note on a general definition of the coefficient of determination. Biometrika, 78(3), 691-692.
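Nagelkerke's (Cragg-Uhler) measure rescales Cox and Snell's R2 by its maximum attainable value, 1 - L0^(2/n), so that the statistic can reach 1. A sketch of the standard formula (not the package source):

library(blorr)  # for the hsb2 example data

model      <- glm(honcomp ~ female + read + science, data = hsb2,
                  family = binomial(link = 'logit'))
null_model <- glm(honcomp ~ 1, data = hsb2, family = binomial(link = 'logit'))

n   <- nobs(model)
ll1 <- as.numeric(logLik(model))
ll0 <- as.numeric(logLik(null_model))

cox_snell     <- 1 - exp((2 / n) * (ll0 - ll1))
max_cox_snell <- 1 - exp((2 / n) * ll0)
cox_snell / max_cox_snell   # Nagelkerke (Cragg-Uhler) pseudo r-squared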
Other model fit statistics: blr_model_fit_stats(), blr_multi_model_fit_stats(), blr_pairs(), blr_rsq_adj_count(), blr_rsq_cox_snell(), blr_rsq_effron(), blr_rsq_mcfadden_adj(), blr_rsq_mckelvey_zavoina(), blr_test_lr()

model <- glm(honcomp ~ female + read + science, data = hsb2, family = binomial(link = 'logit'))
blr_rsq_nagelkerke(model)
Event rate by segments/levels of a qualitative variable.

blr_segment(data, response, predictor)

## Default S3 method:
blr_segment(data, response, predictor)

data: A data.frame or tibble.
response: Response variable; column in data.
predictor: Predictor variable; column in data.
A tibble.
Other bivariate analysis procedures: blr_bivariate_analysis(), blr_segment_dist(), blr_segment_twoway(), blr_woe_iv(), blr_woe_iv_stats()
blr_segment(hsb2, honcomp, prog)
Distribution of the response variable by segments/levels of a qualitative variable.

blr_segment_dist(data, response, predictor)

## S3 method for class 'blr_segment_dist'
plot(
  x,
  title = NA,
  xaxis_title = "Levels",
  yaxis_title = "Sample Distribution",
  sec_yaxis_title = "1s Distribution",
  bar_color = "blue",
  line_color = "red",
  print_plot = TRUE,
  ...
)

data: A data.frame or tibble.
response: Response variable; column in data.
predictor: Predictor variable; column in data.
x: An object of class blr_segment_dist.
title: Plot title.
xaxis_title: X axis title.
yaxis_title: Y axis title.
sec_yaxis_title: Secondary y axis title.
bar_color: Bar color.
line_color: Line color.
print_plot: logical; if TRUE, prints the plot else returns a plot object.
...: Other inputs.
A tibble.
Other bivariate analysis procedures: blr_bivariate_analysis(), blr_segment(), blr_segment_twoway(), blr_woe_iv(), blr_woe_iv_stats()

k <- blr_segment_dist(hsb2, honcomp, prog)
k

# plot
plot(k)
Event rate across two qualitative variables.
blr_segment_twoway(data, response, variable_1, variable_2)

## Default S3 method:
blr_segment_twoway(data, response, variable_1, variable_2)

data: A data.frame or tibble.
response: Response variable; column in data.
variable_1: Column in data.
variable_2: Column in data.
A tibble.
Other bivariate analysis procedures: blr_bivariate_analysis(), blr_segment(), blr_segment_dist(), blr_woe_iv(), blr_woe_iv_stats()
blr_segment_twoway(hsb2, honcomp, prog, female)
Build a regression model from a set of candidate predictor variables by removing predictors based on the Akaike information criterion, in a stepwise manner, until there is no variable left to remove.

blr_step_aic_backward(model, ...)

## Default S3 method:
blr_step_aic_backward(model, progress = FALSE, details = FALSE, ...)

## S3 method for class 'blr_step_aic_backward'
plot(x, text_size = 3, print_plot = TRUE, ...)

model: An object of class glm.
...: Other arguments.
progress: Logical; if TRUE, displays variable selection progress.
details: Logical; if TRUE, details of each step are printed on screen.
x: An object of class blr_step_aic_backward.
text_size: Size of the text in the plot.
print_plot: logical; if TRUE, prints the plot else returns a plot object.
blr_step_aic_backward returns an object of class "blr_step_aic_backward". An object of class "blr_step_aic_backward" is a list containing the following components:

model: model with the least AIC; an object of class glm
candidates: candidate predictor variables
steps: total number of steps
predictors: variables removed from the model
aics: akaike information criteria
bics: bayesian information criteria
devs: deviances
Venables, W. N. and Ripley, B. D. (2002) Modern Applied Statistics with S. Fourth edition. Springer.
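For comparison, base R's step() performs a similar AIC-driven backward search on a fitted glm; the sketch below is a rough analogue rather than the package's implementation, and the selected model can differ in edge cases.

library(blorr)  # for the hsb2 example data

model <- glm(honcomp ~ female + read + science + math + prog + socst,
             data = hsb2, family = binomial(link = 'logit'))

# backward elimination by AIC using base R
base_backward <- step(model, direction = "backward", trace = 0)
summary(base_backward)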
Other variable selection procedures: blr_step_aic_both(), blr_step_aic_forward(), blr_step_p_backward(), blr_step_p_forward()

## Not run:
model <- glm(honcomp ~ female + read + science + math + prog + socst,
             data = hsb2, family = binomial(link = 'logit'))

# elimination summary
blr_step_aic_backward(model)

# print details of each step
blr_step_aic_backward(model, details = TRUE)

# plot
plot(blr_step_aic_backward(model))

# final model
k <- blr_step_aic_backward(model)
k$model

## End(Not run)
Build a regression model from a set of candidate predictor variables by entering and removing predictors based on the Akaike information criterion, in a stepwise manner, until there is no variable left to enter or remove.

blr_step_aic_both(model, details = FALSE, ...)

## S3 method for class 'blr_step_aic_both'
plot(x, text_size = 3, ...)

model: An object of class glm.
details: Logical; if TRUE, details of each step are printed on screen.
...: Other arguments.
x: An object of class blr_step_aic_both.
text_size: Size of the text in the plot.
blr_step_aic_both returns an object of class "blr_step_aic_both". An object of class "blr_step_aic_both" is a list containing the following components:

model: model with the least AIC; an object of class glm
candidates: candidate predictor variables
predictors: variables added/removed from the model
method: addition/deletion
aics: akaike information criteria
bics: bayesian information criteria
devs: deviances
steps: total number of steps
Venables, W. N. and Ripley, B. D. (2002) Modern Applied Statistics with S. Fourth edition. Springer.
Other variable selection procedures: blr_step_aic_backward(), blr_step_aic_forward(), blr_step_p_backward(), blr_step_p_forward()
## Not run:
model <- glm(y ~ ., data = stepwise, family = binomial(link = 'logit'))

# selection summary
blr_step_aic_both(model)

# print details at each step
blr_step_aic_both(model, details = TRUE)

# plot
plot(blr_step_aic_both(model))

# final model
k <- blr_step_aic_both(model)
k$model

## End(Not run)
Build a regression model from a set of candidate predictor variables by entering predictors based on the Akaike information criterion, in a stepwise manner, until there is no variable left to enter.

blr_step_aic_forward(model, ...)

## Default S3 method:
blr_step_aic_forward(model, progress = FALSE, details = FALSE, ...)

## S3 method for class 'blr_step_aic_forward'
plot(x, text_size = 3, print_plot = TRUE, ...)

model: An object of class glm.
...: Other arguments.
progress: Logical; if TRUE, displays variable selection progress.
details: Logical; if TRUE, details of each step are printed on screen.
x: An object of class blr_step_aic_forward.
text_size: Size of the text in the plot.
print_plot: logical; if TRUE, prints the plot else returns a plot object.
blr_step_aic_forward returns an object of class "blr_step_aic_forward". An object of class "blr_step_aic_forward" is a list containing the following components:

model: model with the least AIC; an object of class glm
candidates: candidate predictor variables
steps: total number of steps
predictors: variables entered into the model
aics: akaike information criteria
bics: bayesian information criteria
devs: deviances
Venables, W. N. and Ripley, B. D. (2002) Modern Applied Statistics with S. Fourth edition. Springer.
Other variable selection procedures: blr_step_aic_backward(), blr_step_aic_both(), blr_step_p_backward(), blr_step_p_forward()

## Not run:
model <- glm(honcomp ~ female + read + science, data = hsb2, family = binomial(link = 'logit'))

# selection summary
blr_step_aic_forward(model)

# print details of each step
blr_step_aic_forward(model, details = TRUE)

# plot
plot(blr_step_aic_forward(model))

# final model
k <- blr_step_aic_forward(model)
k$model

## End(Not run)
Build a regression model from a set of candidate predictor variables by removing predictors based on p values, in a stepwise manner, until there is no variable left to remove.

blr_step_p_backward(model, ...)

## Default S3 method:
blr_step_p_backward(model, prem = 0.3, details = FALSE, ...)

## S3 method for class 'blr_step_p_backward'
plot(x, model = NA, print_plot = TRUE, ...)

model: An object of class glm.
...: Other inputs.
prem: p value; variables with p value more than prem will be removed from the model.
details: Logical; if TRUE, details of each step are printed on screen.
x: An object of class blr_step_p_backward.
print_plot: logical; if TRUE, prints the plot else returns a plot object.
blr_step_p_backward returns an object of class "blr_step_p_backward". An object of class "blr_step_p_backward" is a list containing the following components:
model |
model with the least AIC; an object of class glm |
steps |
total number of steps |
removed |
variables removed from the model |
aic |
akaike information criteria |
bic |
bayesian information criteria |
dev |
deviance |
indvar |
predictors |
Chatterjee, Samprit and Hadi, Ali. Regression Analysis by Example. 5th ed. N.p.: John Wiley & Sons, 2012. Print.
Other variable selection procedures:
blr_step_aic_backward(), blr_step_aic_both(), blr_step_aic_forward(), blr_step_p_forward()
## Not run: 
# stepwise backward regression
model <- glm(honcomp ~ female + read + science + math + prog + socst,
             data = hsb2, family = binomial(link = 'logit'))
blr_step_p_backward(model)

# stepwise backward regression plot
model <- glm(honcomp ~ female + read + science + math + prog + socst,
             data = hsb2, family = binomial(link = 'logit'))
k <- blr_step_p_backward(model)
plot(k)

# final model
k$model

## End(Not run)
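The removal threshold is controlled by the prem argument documented above; a sketch of a stricter backward elimination, reusing the model from the example:

# drop variables whose p value exceeds 0.05 instead of the default 0.3
k <- blr_step_p_backward(model, prem = 0.05, details = TRUE)

# variables removed from the model and the final fitted glm
k$removed
k$model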
Build a regression model from a set of candidate predictor variables by entering and removing predictors based on their p values, in a stepwise manner, until there is no variable left to enter or remove.
blr_step_p_both(model, ...)

## Default S3 method:
blr_step_p_both(model, pent = 0.1, prem = 0.3, details = FALSE, ...)

## S3 method for class 'blr_step_p_both'
plot(x, model = NA, print_plot = TRUE, ...)
model |
An object of class glm. |
... |
Other arguments. |
pent |
p value; variables with p value less than pent will enter into the model. |
prem |
p value; variables with p value more than prem will be removed from the model. |
details |
Logical; if TRUE, the details of each step are printed. |
x |
An object of class blr_step_p_both. |
print_plot |
logical; if TRUE, prints the plot else returns a plot object. |
blr_step_p_both returns an object of class "blr_step_p_both". An object of class "blr_step_p_both" is a list containing the following components:
model |
final model; an object of class glm |
orders |
candidate predictor variables, in the order in which they were added to or removed from the model |
method |
addition/deletion |
steps |
total number of steps |
predictors |
variables retained in the model (after addition) |
aic |
akaike information criteria |
bic |
bayesian information criteria |
dev |
deviance |
indvar |
predictors |
Chatterjee, Samprit and Hadi, Ali. Regression Analysis by Example. 5th ed. N.p.: John Wiley & Sons, 2012. Print.
## Not run: 
# stepwise regression
model <- glm(y ~ ., data = stepwise, family = binomial(link = 'logit'))
blr_step_p_both(model)

# stepwise regression plot
model <- glm(y ~ ., data = stepwise, family = binomial(link = 'logit'))
k <- blr_step_p_both(model)
plot(k)

# final model
k$model

## End(Not run)
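Both thresholds documented above can be set explicitly. A sketch with stricter criteria, reusing the model from the example (the values are illustrative; keep pent smaller than prem, as the defaults do):

k <- blr_step_p_both(model, pent = 0.05, prem = 0.1)

# order in which candidates were added or removed, and the action taken
k$orders
k$method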
Build a regression model from a set of candidate predictor variables by entering predictors based on their p values, in a stepwise manner, until there is no candidate variable left to enter.
blr_step_p_forward(model, ...)

## Default S3 method:
blr_step_p_forward(model, penter = 0.3, details = FALSE, ...)

## S3 method for class 'blr_step_p_forward'
plot(x, model = NA, print_plot = TRUE, ...)
model |
An object of class glm. |
... |
Other arguments. |
penter |
p value; variables with p value less than penter will enter into the model. |
details |
Logical; if TRUE, the details of each selection step are printed. |
x |
An object of class blr_step_p_forward. |
print_plot |
logical; if TRUE, prints the plot else returns a plot object. |
blr_step_p_forward returns an object of class "blr_step_p_forward". An object of class "blr_step_p_forward" is a list containing the following components:
model |
model with the least AIC; an object of class glm |
steps |
number of steps |
predictors |
variables added to the model |
aic |
akaike information criteria |
bic |
bayesian information criteria |
dev |
deviance |
indvar |
predictors |
Chatterjee, Samprit and Hadi, Ali. Regression Analysis by Example. 5th ed. N.p.: John Wiley & Sons, 2012. Print.
Kutner, M. H., Nachtsheim, C. J., Neter, J. and Li, W. (2004), Applied Linear Statistical Models (5th edition). Chicago, IL: McGraw-Hill/Irwin.
Other variable selection procedures:
blr_step_aic_backward(), blr_step_aic_both(), blr_step_aic_forward(), blr_step_p_backward()
## Not run: 
# stepwise forward regression
model <- glm(honcomp ~ female + read + science, data = hsb2,
             family = binomial(link = 'logit'))
blr_step_p_forward(model)

# stepwise forward regression plot
model <- glm(honcomp ~ female + read + science, data = hsb2,
             family = binomial(link = 'logit'))
k <- blr_step_p_forward(model)
plot(k)

# final model
k$model

## End(Not run)
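The entry threshold is controlled by penter; a brief sketch using a stricter cut-off than the 0.3 default, reusing the model from the example:

k <- blr_step_p_forward(model, penter = 0.05)

# variables added, in order of entry, and the AIC after each step
k$predictors
k$aic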
Hosmer-Lemeshow goodness of fit test.
blr_test_hosmer_lemeshow(model, data = NULL)
model |
An object of class glm. |
data |
A tibble or data.frame. |
Hosmer, D. W., Jr., & Lemeshow, S. (2000), Applied Logistic Regression (2nd ed.). New York: John Wiley & Sons.
Other model validation techniques:
blr_confusion_matrix(), blr_decile_capture_rate(), blr_decile_lift_chart(), blr_gains_table(), blr_gini_index(), blr_ks_chart(), blr_lorenz_curve(), blr_roc_curve()
model <- glm(honcomp ~ female + read + science, data = hsb2,
             family = binomial(link = 'logit'))

blr_test_hosmer_lemeshow(model)
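Since the function accepts an optional data argument, the test can also be computed on a hold-out sample containing the same columns. A sketch under that assumption (the split below is purely illustrative):

# illustrative split of hsb2 into training and validation sets
set.seed(123)
idx   <- sample(nrow(hsb2), 150)
train <- hsb2[idx, ]
valid <- hsb2[-idx, ]

model <- glm(honcomp ~ female + read + science, data = train,
             family = binomial(link = 'logit'))

# goodness of fit assessed on the validation sample
blr_test_hosmer_lemeshow(model, data = valid)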
Performs the likelihood ratio test comparing a full model and a reduced (nested) model.
blr_test_lr(full_model, reduced_model)

## Default S3 method:
blr_test_lr(full_model, reduced_model)
full_model |
An object of class glm. |
reduced_model |
An object of class glm. |
Two tibbles with model information and test results.
Other model fit statistics:
blr_model_fit_stats(), blr_multi_model_fit_stats(), blr_pairs(), blr_rsq_adj_count(), blr_rsq_cox_snell(), blr_rsq_effron(), blr_rsq_mcfadden_adj(), blr_rsq_mckelvey_zavoina(), blr_rsq_nagelkerke()
# compare full model with intercept only model
# full model
model_1 <- glm(honcomp ~ female + read + science, data = hsb2,
               family = binomial(link = 'logit'))

blr_test_lr(model_1)

# compare full model with nested model
# nested model
model_2 <- glm(honcomp ~ female + read, data = hsb2,
               family = binomial(link = 'logit'))

blr_test_lr(model_1, model_2)
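For intuition, the likelihood ratio chi square reported by the test can be reproduced by hand: it is the difference in deviances between the reduced and full models, with degrees of freedom equal to the difference in residual degrees of freedom. A sketch using the models fitted above:

# likelihood ratio statistic, degrees of freedom and p value by hand
lr_stat <- deviance(model_2) - deviance(model_1)
lr_df   <- df.residual(model_2) - df.residual(model_1)
pchisq(lr_stat, df = lr_df, lower.tail = FALSE)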
Weight of evidence and information value. Currently available for categorical predictors only.
blr_woe_iv(data, predictor, response, digits = 4, ...)

## S3 method for class 'blr_woe_iv'
plot(
  x,
  title = NA,
  xaxis_title = "Levels",
  yaxis_title = "WoE",
  bar_color = "blue",
  line_color = "red",
  print_plot = TRUE,
  ...
)
data |
A tibble or data.frame. |
predictor |
Predictor variable; column in data. |
response |
Response variable; column in data. |
digits |
Number of decimal digits to round off. |
... |
Other inputs. |
x |
An object of class blr_woe_iv. |
title |
Plot title. |
xaxis_title |
X axis title. |
yaxis_title |
Y axis title. |
bar_color |
Color of the bar. |
line_color |
Color of the horizontal line. |
print_plot |
logical; if TRUE, prints the plot else returns a plot object. |
A tibble.
Siddiqi, N. (2006), Credit Risk Scorecards: Developing and Implementing Intelligent Credit Scoring. New Jersey: Wiley.
Other bivariate analysis procedures:
blr_bivariate_analysis(), blr_segment(), blr_segment_dist(), blr_segment_twoway(), blr_woe_iv_stats()
# woe and iv
k <- blr_woe_iv(hsb2, female, honcomp)
k

# plot woe
plot(k)
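The plot method arguments documented above allow the chart to be customised; a short sketch continuing the example:

# customise the WoE plot produced above
plot(k,
     title      = "Weight of evidence: female vs honcomp",
     bar_color  = "steelblue",
     line_color = "black")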
Prints weight of evidence and information value for multiple variables. Currently available for categorical predictors only.
blr_woe_iv_stats(data, response, ...)
data |
A tibble or data.frame. |
response |
Response variable; column in data. |
... |
Predictor variables; columns in data. |
Other bivariate analysis procedures:
blr_bivariate_analysis(), blr_segment(), blr_segment_dist(), blr_segment_twoway(), blr_woe_iv()
blr_woe_iv_stats(hsb2, honcomp, prog, race, female, schtyp)
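Because the procedure currently handles categorical predictors only, a numeric variable has to be binned before it can be included. A sketch using base R's cut() (the three-bin split is arbitrary and purely illustrative):

# bin the numeric read score into three categories, then include it
hsb2$read_band <- cut(hsb2$read, breaks = 3)
blr_woe_iv_stats(hsb2, honcomp, prog, race, female, schtyp, read_band)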
A dataset containing demographic information and standardized test scores of high school students.
hsb2
A data frame with 200 rows and 11 variables:
id of the student
gender of the student
ethnic background of the student
socio-economic status of the student
school type
program type
scores from test of reading
scores from test of writing
scores from test of math
scores from test of science
scores from test of social studies
1 if write > 60, else 0
https://www.openintro.org/data/index.php?data=hsb
Dummy Data Set
stepwise
An object of class data.frame with 20000 rows and 7 columns.