Title: | Integrated Regression Goodness of Fit |
---|---|
Description: | Performs Goodness of Fit for regression models using Integrated Regression method. Works for several different fitting techniques. |
Authors: | Jorge Luis Ojeda Cabrera <[email protected]> |
Maintainer: | Jorge Luis Ojeda Cabrera <[email protected]> |
License: | GPL (>= 2) |
Version: | 0.85-5 |
Built: | 2025-02-17 06:42:02 UTC |
Source: | CRAN |
Integrated Regression Goodness of Fit to test the adequacy of different models to represent the regression function for given data.
    anovarIntReg(objH0, ..., covars = NULL, B = 499,
                 LINMOD = FALSE, INCREMENTAL = FALSE)

    ## S3 method for class 'anovarIntReg'
    print(x, ...)
objH0 | An object of class lm, glm or nls. |
... | Further fitted model objects (objH1, ..., objHk) to test. |
covars | Names of continuous (numerical) variates used to compute the Integrated Regression. They should be variables contained in the data frame used to compute the regression fit. When NULL, it is obtained as the max. number of different covariates in all tested models. It also can be a |
B | Bootstrap resampling size. |
LINMOD | When |
INCREMENTAL | When is |
x | An object of class anovarIntReg. |
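This function implements the test
$$H_0: m \in \mathcal{M}_0 \quad \text{vs} \quad H_1: m \in \mathcal{M}_1$$
for two different models $\mathcal{M}_0$, $\mathcal{M}_1$, using the Integrated Regression Goodness of Fit as is done in intRegGOF, but instead of the accumulation of the residuals of a given model, in this case the accumulation of the difference in the fits is considered:
$$R_n(x) = n^{-1/2} \sum_{i=1}^{n} \left( \hat{m}_1(x_i) - \hat{m}_0(x_i) \right) I(x_i \le x).$$
The test statistics considered are $K_n$ and $W^2_n$.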
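If objH0 and objH1 are lm, glm or nls fits for the models in classes $\mathcal{M}_0$ and $\mathcal{M}_1$ respectively, then anovarIntReg(objH0,objH1) computes the test $H_0: m \in \mathcal{M}_0$ vs $H_1: m \in \mathcal{M}_1$. When anovarIntReg(objH0,objH1,...,objHk) is executed (notice that by default INCREMENTAL=FALSE) we obtain a table with the statistics $K_n$ and $W^2_n$ and their associated $p$-values for each of the tests $H_0: m \in \mathcal{M}_0$ vs $H_i: m \in \mathcal{M}_i$, $i$ being $1, \dots, k$. On the other hand, if the parameter INCREMENTAL is set to TRUE, the command returns the results for the tests $H_{i-1}: m \in \mathcal{M}_{i-1}$ vs $H_i: m \in \mathcal{M}_i$, $i$ being $1, \dots, k$.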
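The accumulation of the difference in the fits can be sketched in base R as follows. This is only an illustration under simplifying assumptions (a single continuous covariate, lm fits, no bootstrap, difference taken as the fit under the alternative minus the fit under the null); it does not call the package, and the variable names are chosen here for the example.

```r
# Sketch only: base R, one continuous covariate, no resampling step.
set.seed(1)
n <- 50
d <- data.frame(X1 = runif(n))
d$Y <- 1 - 2 * d$X1 + rnorm(n, sd = 0.125)

fit0 <- lm(Y ~ 1, data = d)   # model under the null hypothesis
fit1 <- lm(Y ~ X1, data = d)  # model under the alternative

# Difference between the two fits, accumulated in increasing order of X1.
ord <- order(d$X1)
diffFit <- fitted(fit1) - fitted(fit0)
Rn <- cumsum(diffFit[ord]) / sqrt(n)

# Supremum-type and Cramer-von Mises-type functionals of the process.
Kn <- max(abs(Rn))
W2n <- mean(Rn^2)
c(Kn = Kn, W2n = W2n)
```

Both statistics are unchanged if the sign of the difference is flipped, so the direction of the subtraction does not affect the result of this sketch.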
This function returns an object of class anovarIntReg, a matrix-like structure whose rows refer to models and whose columns refer to statistics and their $p$-values. It also has an attribute heading to support printing the object.
This method requires more testing, and careful study of the effect of factors (discrete random variables) when fitting the model.
Jorge Luis Ojeda Cabrera ([email protected]).
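    # Simulated data: Y depends linearly on both X1 and X2
    n <- 50
    d <- data.frame(X1 = runif(n), X2 = runif(n))
    d$Y <- 1 - 2*d$X1 - 5*d$X2 + rnorm(n, sd = .125)
    # Nested candidate models
    a0 <- lm(Y ~ 1, d)
    a1 <- lm(Y ~ X1, d)
    a2 <- lm(Y ~ X1 + X2, d)
    anovarIntReg(a0, a1, a2, B = 50)
    anovarIntReg(a0, a1, a2, B = 50, INCREMENTAL = TRUE)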
Core functions for the computation of the Integrated Regression Goodness of Fit
    compIntRegProc(y, xord, weig = rep(1, length(y)))
    compBootSamp(obj, datLT, B = 999, LINMOD = FALSE)
    plotIntRegProc(y, x, weig = rep(1, length(y)), ADD = FALSE, ...)
    getModelFrame(obj)
    getResiduals(obj, type)
y | Vector of values to add up when computing the Integrated Regression. |
xord | List of lists with the indices of the covariate points that are less than each covariate data point. This tells how to cumulate according to the covariates. |
weig | Vector of weights, specifically used to fit and compute test statistics when data are selection biased. |
obj | |
datLT | structure as |
B | Bootstrap resampling size. |
LINMOD | When |
x | Vector with covariates to plot. |
ADD | If |
type | Type of residual. |
... | Further parameters to plot. |
TODO: describe what each of them computes and in which way.
Surely they can be better implemented.
Jorge Luis Ojeda Cabrera ([email protected]).
Integrated Regression Goodness of Fit to test whether a given model is suitable to represent the regression function for given data.
    intRegGOF(obj, covars = NULL, B = 499, LINMOD = FALSE)

    ## S3 method for class 'intRegGOF'
    print(x, ...)
obj | |
covars | Names of continuous (numerical) variates used to compute the Integrated Regression. They should be variables contained in the data frame used to compute the regression fit. |
B | Bootstrap resampling size. |
LINMOD | When |
x | An object of class intRegGOF. |
... | Further parameters for the print command. |
The Integrated Regression Goodness of Fit technique was introduced in Stute (1997). The main idea is to study the process that results from the cumulation of the residuals up to a given value of the covariates. Once this process is built, different functionals of it can be considered to measure the discrepancy between the true regression function and its estimate.
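The test that this function implements is
$$H_0: m \in \mathcal{M} \quad \text{vs} \quad H_1: m \notin \mathcal{M},$$
$m$ being the regression function, and $\mathcal{M}$ a given class of functions. The statistics considered are
$$K_n = \sup_{x} \left| R_n(x) \right|, \qquad W^2_n = \int R_n(x)^2 \, dF_n(x),$$
where $F_n$ is the empirical distribution function of the covariates and $R_n$ is the cumulated residual process:
$$R_n(x) = n^{-1/2} \sum_{i=1}^{n} \left( y_i - \hat{m}(x_i) \right) I(x_i \le x).$$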
As the stochastic behaviour of this cumulated residual process is quite complex, the implementation of the technique is based on resampling techniques. In particular the chosen implementation is based on Wild Bootstrap methods.
The method also handles selection-biased data by means of compensation, using the weights employed to fit the regression function when computing the cumulated residual process.
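The resampling idea described above can be sketched in base R. The sketch assumes a single continuous covariate, an unweighted lm fit and Mammen's two-point multipliers; the helper names cumResStat and wildMult are hypothetical, and the multiplier law that the package's own rWildBoot uses is not stated here, so this is an illustration rather than the package's implementation.

```r
set.seed(42)
n <- 50
d <- data.frame(X1 = runif(n))
d$Y <- 1 + 2 * d$X1 + rnorm(n, sd = 0.125)

# Hypothetical helper: cumulated residual process for one covariate,
# summarised by a supremum-type and a Cramer-von Mises-type functional.
cumResStat <- function(res, x) {
  Rn <- cumsum(res[order(x)]) / sqrt(length(res))
  c(Kn = max(abs(Rn)), W2n = mean(Rn^2))
}

# Hypothetical helper: Mammen two-point multipliers (mean 0, variance 1).
# Assumption: the package may use a different distribution.
wildMult <- function(n) {
  a <- (1 - sqrt(5)) / 2
  b <- (1 + sqrt(5)) / 2
  p <- (sqrt(5) + 1) / (2 * sqrt(5))
  ifelse(runif(n) < p, a, b)
}

fit <- lm(Y ~ X1, data = d)
obs <- cumResStat(residuals(fit), d$X1)

# Wild bootstrap: perturb the residuals, rebuild the response under the
# fitted model, refit, and recompute both statistics.
B <- 99
boot <- replicate(B, {
  dStar <- d
  dStar$Y <- fitted(fit) + residuals(fit) * wildMult(n)
  fitStar <- lm(Y ~ X1, data = dStar)
  cumResStat(residuals(fitStar), dStar$X1)
})

# Bootstrap p-values: share of resampled statistics at least as large
# as the observed ones.
pval <- rowMeans(boot >= obs)
pval
```

Refitting the model on every resample lets the bootstrap statistics reflect the estimation step as well as the noise, at the cost of B extra fits.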
At the moment only 'response' type residuals are considered, jointly with the wild bootstrap resampling technique, and the results for discrete responses might not be proper.
This function returns an object of class intRegGOF, a list which contains the following objects:
call | The call to the function |
regObj | String with the |
regModel | |
p.value | |
datStat | value of |
covars | continuous (numerical) variates used to compute Integrated Regression. |
intErr | cumulated residual process at the values of |
xLT | structure with the order of |
bootSamp | Bootstrap samples for |
This method requires more testing, and careful study of the effect of factors (discrete random variables) when fitting the model.
Jorge Luis Ojeda Cabrera ([email protected]).
Stute, W. (1997). Nonparametric model checks for regression. Ann. Statist., 25(2), pp. 613–641.
Ojeda, J. L., González-Manteiga, W. and Cristóbal, J. A. A bootstrap based model checking for selection-biased data. Reports in Statistics and Operations Research, U. de Santiago de Compostela. Report 07-05. http://eio.usc.es/eipc1/BASE/BASEMASTER/FORMULARIOS-PHP-DPTO/REPORTS/447report07_05.pdf
Ojeda, J. L., Cristóbal, J. A., and Alcalá, J. T. (2008). A bootstrap approach to model checking for linear models under length-biased data. Ann. Inst. Statist. Math., 60(3), pp. 519–543.
lm, glm, nls and their methods summary, print, plot, etc.
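    # Simulated data: Y depends on X1 only
    n <- 50
    d <- data.frame(X1 = runif(n), X2 = runif(n))
    d$Y <- 1 + 2*d$X1 + rnorm(n, sd = .125)
    plot(d)
    intRegGOF(lm(Y ~ X1 + X2, d), B = 99)
    # Model without intercept
    intRegGOF(a <- lm(Y ~ X1 - 1, d), B = 99)
    # Covariates given by name or as a formula
    intRegGOF(a, c("X1", "X2"), B = 99)
    intRegGOF(a, ~X2 + X1, B = 99)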
Methods for model validation and visualization using the Integrated Regression Goodness of Fit technique.
    plotAsIntRegGOF(obj, covar = 1, ADD = FALSE, ...)
    pointsAsIntRegGOF(obj, covar = 1, ...)
    linesAsIntRegGOF(obj, covar = 1, ...)
obj | |
covar | Variable name, number or vector for which the Integrated Regression is computed. If it is a number, it references a covariate in the model frame, while if it is a name, it refers to data in the data frame used in the fitting process. |
ADD | If |
... | Further parameters for the plotting command. |
Currently, the implementation computes the accumulated residual process against a single covariate (covar). When the value of covar is set to 0, the response is used as the variable against which the residuals are accumulated. Notice that if covar is a vector, its length should be equal to the number of residuals.
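A minimal base R sketch of accumulating residuals against a single variable follows. The helper accumRes is hypothetical and only mimics the idea; the scaling by the square root of the sample size is an assumption made here, and the actual plotting functions may differ in scaling and appearance.

```r
set.seed(7)
n <- 50
d <- data.frame(X1 = runif(n), X2 = runif(n))
d$Y <- 1 + 2 * d$X1 + rnorm(n, sd = 0.125)
fit <- lm(Y ~ X1 - 1, data = d)  # deliberately omits the intercept

# Hypothetical helper: accumulate the residuals of `fit` in increasing
# order of `by`, which must have one value per residual.
accumRes <- function(fit, by) {
  stopifnot(length(by) == length(residuals(fit)))
  o <- order(by)
  list(x = by[o], y = cumsum(residuals(fit)[o]) / sqrt(length(by)))
}

p1 <- accumRes(fit, d$X1)  # accumulated against a covariate
p0 <- accumRes(fit, d$Y)   # accumulated against the response

plot(p1$x, p1$y, type = "s", xlab = "X1", ylab = "accumulated residuals")
abline(h = 0, lty = 2)
```

A path that drifts steadily away from zero over some range of the covariate suggests that the model is misspecified there, as with the missing intercept in this example.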
lm objects that do not have a data parameter set when the call is executed do not presently work when the covar parameter is different from 0.
Jorge Luis Ojeda Cabrera ([email protected]).
lm, glm, nls, their associated plot method and intRegGOF.
    n <- 50
    d <- data.frame(X1 = runif(n), X2 = runif(n))
    d$Y <- 1 + 2*d$X1 + rnorm(n, sd = .125)
    par(ask = TRUE)
    plot(d)
    plotAsIntRegGOF(lm(Y ~ X1 + X2, d), covar = "X1")
    plotAsIntRegGOF(a <- lm(Y ~ X1 - 1, d))
    plotAsIntRegGOF(a, c("X1"))
    # covar = 0: accumulate against the response
    plotAsIntRegGOF(a, 0)
    # covar as a vector with one value per residual
    plotAsIntRegGOF(a, fitted(a))
    par(ask = FALSE)
Functions that are basic and/or useful for the computation of the Integrated Regression Goodness of Fit.
    getLessThan(x, d)
    mvCumSum(x, ord)
    mvPartOrd(x1, x2)
    getContVar(df, vars = NULL)
    getModelCovars(obj)
    getModelWeights(obj)
    rWildBoot(n)
x, d | matrix-like structure. |
x1, x2 | vectors with the same length. |
df | a data frame. |
ord | list of lists with the ordering used to add data points according to the given covariates. |
obj | |
vars | vector with variable names in the observations data frame. |
n | integer, sample size. |
TODO: describe what each of them computes and in which way.
getLessThan can certainly be better implemented.
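To illustrate the kind of structure involved, the following base R sketch builds, for each point, the indices of the observations that are componentwise less than or equal to it, and then sums a vector over those index sets. The helpers lessEqIndex and cumSumBy are hypothetical names used only here; they are not the package's getLessThan or mvCumSum, whose exact behaviour is not documented above.

```r
# Hypothetical helper: for each row of `pts`, the indices of the rows of `x`
# that are componentwise less than or equal to it (a partial order).
lessEqIndex <- function(x, pts = x) {
  x <- as.matrix(x)
  pts <- as.matrix(pts)
  lapply(seq_len(nrow(pts)), function(j) {
    # t(x) has one column per observation; the comparison recycles pts[j, ]
    # down each column, so colSums counts the coordinates that satisfy <=.
    which(colSums(t(x) <= pts[j, ]) == ncol(x))
  })
}

# Hypothetical helper: sum `y` over each index set.
cumSumBy <- function(y, idx) vapply(idx, function(i) sum(y[i]), numeric(1))

x <- cbind(c(1, 2, 3, 2), c(1, 3, 2, 2))
y <- c(10, 20, 30, 40)
idx <- lessEqIndex(x)
cumSumBy(y, idx)
# returns 10 70 80 50
```

With a single covariate the partial order becomes the usual total order, and the result matches an ordinary cumulative sum taken in sorted order.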
Jorge Luis Ojeda Cabrera ([email protected]).