Package: gausscov 1.1.5

Laurie Davies

gausscov: The Gaussian Covariate Method for Variable Selection

The standard theory of linear regression, whether frequentist or Bayesian, is based on an 'assumed (revealed?) truth' (John Tukey) attitude to models. This is reflected in the language of statistical inference, which involves a concept of truth, for example in confidence intervals, hypothesis testing and consistency. The motivation behind this package was to remove the word 'true' from the theory and practice of linear regression and to replace it by approximation. The approximations considered are the least squares approximations.

An approximation is called valid if it contains no irrelevant covariates. This is operationalized using the concept of a Gaussian P-value, which is the probability that pure Gaussian noise is better in terms of least squares than the covariate. The precise definition is given in the paper; it is intuitive and requires only four simple equations. Its overwhelming advantage compared with a standard F P-value is that it is exact and valid whatever the data. In contrast, F P-values are only valid for specially designed simulations. Given this, a valid approximation is one where all the Gaussian P-values are less than a threshold p0 specified by the statistician, in this package with the default value 0.01. This approximation approach is not only much simpler, it is overwhelmingly better than the standard model-based approach. This will be demonstrated using six real data sets, four from high dimensional regression and two from vector autoregression. The simplicity and superiority of Gaussian P-values derive from their universal exactness and validity, in complete contrast to standard F P-values.

The function f1st is the most important function. It is a greedy forward selection procedure which results in either just one or no approximations, which may however not be valid. If the size of the approximation is less than a threshold with default value 21, then an all-subset procedure is called which returns the best valid subset. A good default start is f1st(y,x,kmn=15). The best function for returning multiple approximations is f3st, which repeatedly calls f1st.

For more information see the web site below and the accompanying papers: L. Davies and L. Duembgen, "Covariate Selection Based on a Model-free Approach to Linear Regression with Exact Probabilities", <doi:10.48550/arXiv.2202.01553>; L. Davies, "An Approximation Based Theory of Linear Regression", 2024, <doi:10.48550/arXiv.2402.09858>.
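The description of a Gaussian P-value as "the probability that pure Gaussian noise is better in terms of least squares than the covariate" can be illustrated with a small base R sketch. It does not use gausscov and is not the package's implementation: the data are synthetic, the names (rss, X0, xnew, p_mc, p_beta) are made up for the illustration, and it covers only a single fixed candidate covariate, not a full stepwise selection. It estimates the probability by Monte Carlo and compares it with a Beta tail probability, which follows from the standard fact that the relative reduction in the residual sum of squares from adding one Gaussian noise covariate has a Beta(1/2, (n - k0 - 1)/2) distribution, where k0 is the number of columns already fitted.

```r
# Illustrative sketch only; not the gausscov implementation.
set.seed(1)

n    <- 50
x    <- matrix(rnorm(n * 2), n, 2)          # two covariates already included (synthetic)
xnew <- rnorm(n)                            # candidate covariate (synthetic)
y    <- 1 + x[, 1] + 0.3 * xnew + rnorm(n)  # synthetic response

# Residual sum of squares of the least squares fit of y on the columns of X
rss <- function(y, X) sum(qr.resid(qr(X), y)^2)

X0   <- cbind(1, x)               # intercept plus the covariates already included
rss0 <- rss(y, X0)                # fit without the candidate
rss1 <- rss(y, cbind(X0, xnew))   # fit with the candidate

# Monte Carlo: replace the candidate by pure Gaussian noise and count how
# often the noise gives a residual sum of squares at least as small
nsim      <- 5000
rss_noise <- replicate(nsim, rss(y, cbind(X0, rnorm(n))))
p_mc      <- mean(rss_noise <= rss1)

# Closed form: upper tail of Beta(1/2, (n - k0 - 1)/2) at 1 - rss1/rss0
k0     <- ncol(X0)
p_beta <- pbeta(1 - rss1 / rss0, 0.5, (n - k0 - 1) / 2, lower.tail = FALSE)

print(c(monte_carlo = p_mc, beta_tail = p_beta))
```

The two numbers should agree up to Monte Carlo error. Note that the noise distribution is fixed by construction, so nothing is assumed about how y was generated; the synthetic linear response above is only a convenient way to produce data.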

Authors: Laurie Davies [aut, cre]

gausscov_1.1.5.tar.gz
gausscov_1.1.5.tar.gz (r-4.5-noble), gausscov_1.1.5.tar.gz (r-4.4-noble)
gausscov_1.1.5.tgz (r-4.4-emscripten), gausscov_1.1.5.tgz (r-4.3-emscripten)
gausscov.pdf | gausscov.html
gausscov/json (API)

# Install 'gausscov' in R:

install.packages('gausscov', repos = c('https://cran.r-universe.dev', 'https://cloud.r-project.org'))

Datasets:

abcq - American Business Cycle
boston - Boston data
leukemia - Leukemia data set
mel_temp - Melbourne minimum temperature
redwine - Redwine data
snspt - Sunspot data
vardata - USA economics data

On CRAN:

This package does not link to any Github/Gitlab/R-forge repository. No issue tracker or development information is available.

fortran

1.78 score 2 stars 784 downloads 13 exports 0 dependencies

Last updated 30 days ago from:7392270e79. Checks: 3 OK. Indexed: yes.

Target	Result	Latest binary
Doc / Vignettes	OK	Mar 25 2025
R-4.5-linux-x86_64	OK	Mar 25 2025
R-4.4-linux-x86_64	OK	Mar 25 2025

Exports: decode f1st f2st f3st f3sti fasb fgeninter fgentrig fgr1st flag fpval fundr simgpval

Dependencies: none

Help page	Topics
American Business Cycle	abcq
Boston data	boston
Decodes the number of a subset selected by fasb.R to give the covariates	decode
Stepwise selection of covariates	f1st
Repeated stepwise selection of covariates	f2st
Stepwise selection of covariates	f3st
Selection of covariates with given excluded covariates	f3sti
Calculates all subsets where each included covariate is significant.	fasb
Generation of interactions	fgeninter
Generation of sine and cosine functions	fgentrig
Calculates a dependence graph using Gaussian stepwise selection	fgr1st
Calculation of lagged covariates	flag
Calculates the regression coefficients, the P-values and the standard P-values for the chosen subset ind	fpval
Converts directed into an undirected graph	fundr
Leukemia data set	leukemia
Melbourne minimum temperature	mel_temp
Redwine data	redwine
Simulates Gaussian P-values	simgpval
Sunspot data	snspt
USA economics data	vardata
