Title: | Optimal Linear Regression |
---|---|
Description: | The optimal linear regression function, olr(), fits every possible combination of linear regression equations. olr() returns the equation with the greatest adjusted R-squared or the greatest R-squared, at the user's discretion. Essentially, olr() returns the best-fit equation out of all the possible equations. R-squared never decreases when an explanatory variable is added, whether that variable is 'significant' or not, and the package was developed to address that conundrum. Adjusted R-squared is preferred because it accounts for this, but each combination still produces a different result, and olr() returns the best one. Complementary functions are included which list all of the equations, list the equations in ascending order, give the summary of a specific model, and list the adjusted R-squared and R-squared terms. A 'Python' version is available at: <https://pypi.org/project/olr/>. |
Authors: | Mathew Fok |
Maintainer: | Mathew Fok <[email protected]> |
License: | GPL-3 |
Version: | 1.1 |
Built: | 2024-10-31 21:25:31 UTC |
Source: | CRAN |
The main function, olr(), fits every possible linear regression equation, that is, every combination of the predictor (independent) variables regressed against the response (dependent) variable. In essence, olr() returns the best-fit linear regression model. The user can ask olr() to return the statistical summary of the model with either the greatest adjusted R-squared or the greatest R-squared. R-squared never decreases when an explanatory variable is added, whether that variable is 'significant' or not, and olr() was developed to address that conundrum. Adjusted R-squared is preferred because it accounts for this, but each combination still produces a different result, and olr() returns the best one.
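The R-squared behaviour described above can be checked directly. The following is a minimal sketch in base R, not code from the olr package; it uses the built-in mtcars data, and the `noise` column is an illustrative assumption added only to stand in for an irrelevant predictor.

```r
# Base R only; mtcars ships with R. The noise column is purely illustrative.
set.seed(1)
dat <- mtcars[, c("mpg", "wt", "hp")]
dat$noise <- rnorm(nrow(dat))  # predictor unrelated to mpg by construction

small <- summary(lm(mpg ~ wt + hp, data = dat))
big   <- summary(lm(mpg ~ wt + hp + noise, data = dat))

# R-squared cannot decrease when a predictor is added to a least-squares fit
c(small = small$r.squared, big = big$r.squared)

# Adjusted R-squared penalizes the extra term, so it may go down
c(small = small$adj.r.squared, big = big$adj.r.squared)
```

Comparing the two printed pairs shows why selecting on R-squared alone tends to favour the largest model, while adjusted R-squared can prefer a smaller one.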
olr(dataset, responseName = NULL, predictorNames = NULL, adjr2 = TRUE)
olrmodels(dataset, responseName = NULL, predictorNames = NULL)
olrformulas(dataset, responseName = NULL, predictorNames = NULL)
olrformulaorder(dataset, responseName = NULL, predictorNames = NULL)
adjr2list(dataset, responseName = NULL, predictorNames = NULL)
r2list(dataset, responseName = NULL, predictorNames = NULL)
dataset |
is defined by the user and points to the name of the dataset that is being used. |
responseName |
the response variable name defined as a string. For example, it represents a header in the data table. |
predictorNames |
the predictor variable or variables, that is, the terms to be regressed against the response variable. |
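adjr2 |
TRUE (the default) returns the regression summary of the model with the greatest adjusted R-squared; FALSE returns the summary of the model with the greatest R-squared. |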
Complementary functions below follow the format: function(dataset, responseName = NULL, predictorNames = NULL)
olrmodels: returns the list of models accompanied by the coefficients. To view a single summary, index the result with [,x], where x is the desired summary number. For example, olrmodels(dataset, responseName, predictorNames)[,8]
olrformulas: returns the list of olr() formulas
olrformulaorder: returns the formulas with the predictors (independent variables) in ascending order
adjr2list: list of the adjusted R-squared terms
r2list: list of the R-squared terms
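The lists described above can be approximated in base R to see what an exhaustive search involves. This is a sketch under stated assumptions, not the package's implementation: `all_subsets` is a hypothetical helper, and the mtcars columns are chosen only for illustration.

```r
# Hypothetical helper, not part of the olr package: fits every non-empty
# subset of predictors with lm() and records both fit statistics.
all_subsets <- function(dataset, responseName, predictorNames) {
  # every non-empty combination of predictor names, sizes 1..p
  combos <- unlist(
    lapply(seq_along(predictorNames),
           function(k) combn(predictorNames, k, simplify = FALSE)),
    recursive = FALSE
  )
  # one formula string per combination, e.g. "mpg ~ wt + hp"
  formulas <- vapply(
    combos,
    function(vars) paste(responseName, "~", paste(vars, collapse = " + ")),
    character(1)
  )
  fits <- lapply(formulas,
                 function(f) summary(lm(as.formula(f), data = dataset)))
  data.frame(
    formula = formulas,
    r2      = vapply(fits, function(s) s$r.squared, numeric(1)),
    adjr2   = vapply(fits, function(s) s$adj.r.squared, numeric(1)),
    stringsAsFactors = FALSE
  )
}

res <- all_subsets(mtcars, "mpg", c("wt", "hp", "qsec", "drat"))
nrow(res)                          # 2^4 - 1 = 15 candidate models
res$formula[which.max(res$adjr2)]  # best model by adjusted R-squared
res$formula[which.max(res$r2)]     # best model by R-squared
```

The number of candidate models doubles with each additional predictor, so a search of this kind is practical only for a modest number of columns.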
When responseName and predictorNames are NULL, the first column in the dataset is set as the responseName and the remaining columns are the predictorNames.
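The default described above could be resolved along the following lines. This is an illustrative sketch only: `resolve_names` is a hypothetical helper, not a function exported by olr, and the case where only one of the two arguments is NULL is left untouched because the documentation does not describe it.

```r
# Hypothetical helper illustrating the documented default; not olr's code.
resolve_names <- function(dataset, responseName = NULL, predictorNames = NULL) {
  if (is.null(responseName) && is.null(predictorNames)) {
    responseName   <- names(dataset)[1]   # first column is the response
    predictorNames <- names(dataset)[-1]  # all remaining columns are predictors
  }
  list(responseName = responseName, predictorNames = predictorNames)
}

resolve_names(mtcars)$responseName  # "mpg", the first column of mtcars
```

In practice this means a dataset can be passed on its own, provided its columns are arranged with the response first.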
A 'Python' version is available at <https://pypi.org/project/olr>.
The regression summary of the model with the greatest adjusted R-squared or the greatest R-squared, selected by setting adjr2 to TRUE or FALSE in olr().
file <- system.file("extdata", "oildata.csv", package = "olr", mustWork = TRUE)
oildata <- read.csv(file, header = TRUE)
dataset <- oildata
responseName <- 'OilPrices'
predictorNames <- c('SP500', 'RigCount', 'API', 'Field_Production', 'OperableCapacity', 'Imports')
olr(dataset, responseName, predictorNames, adjr2 = TRUE)