Title: | Tools and Classes for Statistical Models |
---|---|
Description: | A collection of tools to deal with statistical models. The functionality is experimental and the user interface is likely to change in the future. The documentation is rather terse, but packages `coin' and `party' have some working examples. However, if you find the implemented ideas interesting we would be very interested in a discussion of this proposal. Contributions are more than welcome! |
Authors: | Torsten Hothorn, Friedrich Leisch, Achim Zeileis |
Maintainer: | Torsten Hothorn <[email protected]> |
License: | GPL-2 |
Version: | 0.2-23 |
Built: | 2024-10-31 22:14:41 UTC |
Source: | CRAN |
A class describing the parts of a formula.
Objects can be created by calls of the form new("FormulaParts", ...).
formula: Object of class "list".
No methods defined with class "FormulaParts" in the signature.
A collection of standard generic functions for which other packages provide methods.
    ICL(object, ...)
    KLdiv(object, ...)
    Lapply(object, FUN, ...)
    clusters(object, newdata, ...)
    getModel(object, ...)
    parameters(object, ...)
    posterior(object, newdata, ...)
    prior(object, ...)
    refit(object, newdata, ...)
    relabel(object, by, ...)
    ParseFormula(formula, data = list())
object |
S4 classed object. |
formula |
A model formula. |
data |
An optional data frame. |
FUN |
The function to be applied. |
newdata |
Optional new data. |
by |
Typically a character string specifying how to relabel the object. |
... |
Some methods for these generic functions may take additional, optional arguments. |
ICL: Integrated Completed Likelihood criterion for model selection.
KLdiv: Kullback-Leibler divergence.
Lapply: S4 generic for lapply.
clusters: Get cluster membership information from a model or compute it for new data.
getModel: Get a single model from a collection of models.
parameters: Get parameters of a model (similar to but more general than coefficients).
posterior: Get posterior probabilities from a model or compute posteriors for new data.
prior: Get prior probabilities from a model.
refit: Refit a model (usually to obtain additional information that was not computed or stored during the initial fitting process).
relabel: Relabel a model (usually to obtain a new permutation of labels in mixture models or cluster objects).
Friedrich Leisch
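Because these are plain S4 generics, another package (or a script) can register methods for its own classes. The following is a minimal sketch under stated assumptions: the class ToyMixture and its slots are hypothetical and exist only for illustration; only the generics parameters() and prior() come from this package.

```r
library(modeltools)  # provides the S4 generics parameters() and prior()

# Hypothetical toy class (not part of modeltools): a two-component mixture
setClass("ToyMixture",
         representation(means = "numeric", weights = "numeric"))

# Method for the parameters() generic: return the component means
setMethod("parameters", "ToyMixture",
          function(object, ...) list(means = object@means))

# Method for the prior() generic: return the mixing weights
setMethod("prior", "ToyMixture",
          function(object, ...) object@weights)

toy <- new("ToyMixture", means = c(0, 3), weights = c(0.4, 0.6))
parameters(toy)
prior(toy)
```

Dispatch happens on the class of object, so code written against the generics does not need to know how a particular model class stores its parameters.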
Returns descriptive information about fitted objects.
    info(object, which, ...)

    ## S4 method for signature 'ANY,missing'
    info(object, which, ...)

    infoCheck(object, which, ...)
object |
fitted object. |
which |
which information to get. Use |
... |
passed to methods. |
Function info can be used to access slots of fitted objects in a portable way.
Function infoCheck returns a logical value that is TRUE if the requested information can be computed from the object.
Friedrich Leisch
Apply a single function or a collection of functions to the data objects stored in a model environment.
    ## S4 method for signature 'ModelEnv'
    MEapply(object, FUN, clone = TRUE, ...)
object |
Object of class |
FUN |
Function or list of functions. |
clone |
If |
... |
Passed on to |
    data("iris")
    me <- ModelEnvFormula(Species + Petal.Width ~ . - 1, data = iris,
                          subset = sample(1:150, 10))
    me1 <- MEapply(me, FUN = list(designMatrix = scale,
                                  response = function(x) sapply(x, as.numeric)))
    me@get("designMatrix")
    me1@get("designMatrix")
A class for model environments.
Objects of class ModelEnv basically consist of an environment for data storage as well as get and set methods.
na.fail returns FALSE when at least one missing value occurs in object@env. na.pass returns object unchanged and na.omit returns a copy of object with all missing values removed.
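The idea of pairing an environment with get and set functions can be illustrated in base R alone. This is a simplified sketch, not the modeltools implementation: the constructor make_store() and its has() helper are hypothetical names chosen for the illustration.

```r
# Hypothetical constructor: an environment for data storage plus
# closures that read from and write to it (a stand-in for a ModelEnv)
make_store <- function() {
  env <- new.env()
  list(
    env = env,
    # fetch a stored object by name, looking only in env
    get = function(which) base::get(which, envir = env, inherits = FALSE),
    # store (or overwrite) an object under the given name
    set = function(which, value) assign(which, value, envir = env),
    # check whether an object of that name is stored
    has = function(which) exists(which, envir = env, inherits = FALSE)
  )
}

store <- make_store()
store$has("response")                       # FALSE: nothing stored yet
store$set("response", data.frame(y = 1:3))
store$has("response")                       # TRUE
nrow(store$get("response"))                 # 3
```

Since environments have reference semantics, the closures always see the current contents, which is why copying such an object requires an explicit copy step rather than plain assignment.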
Objects can be created by calls of the form new("ModelEnv", ...).
env: Object of class "environment".
get: Object of class "function" for extracting objects from environment env.
set: Object of class "function" for setting objects in environment env.
hooks: A list of hook collections.
clone: signature(object = "ModelEnv"): copy an object.
dimension: signature(object = "ModelEnv", which = "character"): get the dimension of an object.
empty: signature(object = "ModelEnv"): return TRUE if the model environment contains no data.
has: signature(object = "ModelEnv", which = "character"): check if an object which is available in env.
initialize: signature(.Object = "ModelEnv"): set up new objects.
show: signature(object = "ModelEnv"): show object.
subset: signature(x = "ModelEnv"): extract subsets from an object.
na.pass: na.action method for ModelEnv objects.
na.fail: na.action method for ModelEnv objects.
na.omit: na.action method for ModelEnv objects.
    ### a new object
    me <- new("ModelEnv")

    ## the new model environment is empty
    empty(me)

    ### define a bivariate response variable
    me@set("response", data.frame(y = rnorm(10), x = runif(10)))
    me

    ## now it is no longer empty
    empty(me)

    ### check if a response is available
    has(me, "response")

    ### the dimensions
    dimension(me, "response")

    ### extract the data
    me@get("response")

    df <- data.frame(x = rnorm(10), y = rnorm(10))

    ## hook for set method:
    mf <- ModelEnvFormula(y ~ x-1, data = df, setHook=list(designMatrix=scale))
    mf@get("designMatrix")
    mf@set(data=df[1:5,])
    mf@get("designMatrix")

    ### NA handling
    df$x[1] <- NA
    mf <- ModelEnvFormula(y ~ x, data = df, na.action = na.pass)
    mf
    na.omit(mf)
A flexible implementation of the classical formula based interface.
    ModelEnvFormula(formula, data = list(), subset = NULL, na.action = NULL,
                    frame = NULL, enclos = sys.frame(sys.nframe()),
                    other = list(), designMatrix = TRUE,
                    responseMatrix = TRUE, setHook = NULL, ...)
formula |
a symbolic description of the model to be fit. |
data |
an optional data frame containing the variables in the model.
If not found in |
subset |
an optional vector specifying a subset of observations to be used in the fitting process. |
na.action |
a function which indicates what should happen when the data
contain |
frame |
an optional environment |
enclos |
specifies the enclosure passed to |
other |
an optional named list of additional formulae. |
designMatrix |
a logical indicating whether the design matrix
defined by the right hand side of |
responseMatrix |
a logical indicating whether the design matrix
defined by the left hand side of |
setHook |
a list of functions to |
... |
additional arguments to be passed to function, for example
|
This function is an attempt to provide a flexible infrastructure for the implementation of classical formula based interfaces. The arguments formula, data, subset and na.action are well known and are defined in the same way as in lm, for example.
ModelEnvFormula returns an object of class ModelEnvFormula-class - a high level object for storing data improving upon the capabilities of data.frames.
An object of class ModelEnvFormula-class.
    ### the `usual' interface
    data(iris)
    mf <- ModelEnvFormula(Species ~ ., data = iris)
    mf

    ### extract data from the ModelEnv object
    summary(mf@get("response"))
    summary(mf@get("input"))
    dim(mf@get("designMatrix"))

    ### contrasts
    mf <- ModelEnvFormula(Petal.Width ~ Species, data = iris,
                          contrasts.arg = list(Species = contr.treatment))
    attr(mf@get("designMatrix"), "contrasts")
    mf <- ModelEnvFormula(Petal.Width ~ Species, data = iris,
                          contrasts.arg = list(Species = contr.sum))
    attr(mf@get("designMatrix"), "contrasts")

    ### additional formulae
    mf <- ModelEnvFormula(Petal.Width ~ Species, data = iris,
                          other = list(pl = ~ Petal.Length))
    ls(mf@env)
    identical(mf@get("pl")[[1]], iris[["Petal.Length"]])
A class for formula-based model environments.
Objects can be created by calls of the form new("ModelEnvFormula", ...).
env: Object of class "environment".
get: Object of class "function" for extracting objects from environment env.
set: Object of class "function" for setting objects in environment env.
formula: Object of class "list".
hooks: A list of hook collections.
Class "ModelEnv", directly.
Class "FormulaParts", directly.
No methods defined with class "ModelEnvFormula" in the signature.
A simple model environment creator function working off matrices for input and response. This is much simpler and more limited than formula-based environments, but faster and easier to use, if only matrices are allowed as input.
    ModelEnvMatrix(designMatrix=NULL, responseMatrix=NULL, subset = NULL,
                   na.action = NULL, other=list(), ...)
designMatrix |
design matrix of input |
responseMatrix |
matrix of responses |
subset |
an optional vector specifying a subset of observations to be used in the fitting process. |
na.action |
a function which indicates what should happen when the data
contain |
other |
an optional named list of additional formulae. |
... |
currently not used |
ModelEnvMatrix returns an object of class ModelEnv-class - a high level object for storing data improving upon the capabilities of simple data matrices.
Funny things may happen if the input and response matrices do not have distinct column names and new data are supplied via the get and set slots.
An object of class ModelEnv-class.
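One way to stay clear of the column-name caveat is to name the columns explicitly and check for duplicates before building the environment. The sketch below assumes that new data are matched to the stored matrices by column name, as the stock example for this function suggests; the variable names x1, x2, y1 and y2 are arbitrary.

```r
library(modeltools)
set.seed(42)

# Input and response matrices with explicitly distinct column names
X <- matrix(rnorm(20), ncol = 2, dimnames = list(NULL, c("x1", "x2")))
Y <- matrix(rnorm(20), ncol = 2, dimnames = list(NULL, c("y1", "y2")))

# Guard: stop early if any column name occurs in both matrices
stopifnot(!anyDuplicated(c(colnames(X), colnames(Y))))

me <- ModelEnvMatrix(designMatrix = X, responseMatrix = Y)

# Supply new data through the set slot; columns are identified by name
newdat <- data.frame(x1 = rnorm(5), x2 = rnorm(5),
                     y1 = rnorm(5), y2 = rnorm(5))
me@set(data = newdat)

dim(me@get("designMatrix"))
all.equal(me@get("responseMatrix"), as.matrix(newdat[, c("y1", "y2")]))
```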
    ### use Sepal measurements as input and Petal as response
    data(iris)
    me <- ModelEnvMatrix(iris[,1:2], iris[,3:4])
    me

    ### extract data from the ModelEnv object
    dim(me@get("designMatrix"))
    summary(me@get("responseMatrix"))

    ### subsets and missing values
    iris[1,1] <- NA
    me <- ModelEnvMatrix(iris[,1:2], iris[,3:4], subset=1:5, na.action=na.omit)

    ## First case is not complete, so me contains only cases 2:5
    me
    me@get("designMatrix")
    me@get("responseMatrix")

    ## use different cases
    me@set(data=iris[10:20,])
    me@get("designMatrix")

    ## these two should be the same
    stopifnot(all.equal(me@get("responseMatrix"), as.matrix(iris[10:20,3:4])))
A function for predictions from the results of various model fitting functions.
    Predict(object, ...)
object |
a model object for which prediction is desired. |
... |
additional arguments affecting the predictions produced. |
A somewhat improved version of predict for models fitted with objects of class StatModel-class.
Should return a vector of the same type as the response variable specified for fitting object.
    df <- data.frame(x = runif(10), y = rnorm(10))
    mf <- dpp(linearModel, y ~ x, data = df)
    Predict(fit(linearModel, mf))
A class for unfitted statistical models.
Objects can be created by calls of the form new("StatModel", ...).
name: Object of class "character", the name of the model.
dpp: Object of class "function", a function for data preprocessing (usually formula-based).
fit: Object of class "function", a function for fitting the model to data.
predict: Object of class "function", a function for computing predictions.
capabilities: Object of class "StatModelCapabilities".
fit: signature(model = "StatModel", data = "ModelEnv"): fit model to data.
This is an attempt to provide a unified infrastructure for unfitted statistical models. Basically, an unfitted model provides a function for data pre-processing (dpp, think of generating design matrices), a function for fitting the specified model to data (fit), and a function for computing predictions (predict).
Examples of such unfitted models are provided by linearModel and glinearModel, which provide interfaces in the "StatModel" framework to lm.fit and glm.fit, respectively. The functions return objects of S3 class "linearModel" (inheriting from "lm") and "glinearModel" (inheriting from "glm"), respectively. Some methods for S3 generics such as predict, fitted, print and model.matrix are provided to make use of the "StatModel" structure. (Similarly, survReg provides an experimental interface to survreg.)
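Since linearModel is described as an interface to lm.fit, its estimates should agree with those from lm on the same data. The following sketch checks this numerically; it relies on coef() working for the fitted object through its "lm" inheritance, which is an assumption here, and strips names and dimensions before comparing.

```r
library(modeltools)
set.seed(1)

df <- data.frame(x = runif(10), y = rnorm(10))

# Step 1: data pre-processing, Step 2: fitting
mf <- dpp(linearModel, y ~ x, data = df)
mylm <- fit(linearModel, mf)

# Reference fit with the usual formula interface
ref <- lm(y ~ x, data = df)

# as.numeric() drops names and dimensions, so only the estimates are compared
all.equal(as.numeric(coef(mylm)), unname(coef(ref)))
```

Keeping dpp and fit as separate steps means the same pre-processed model environment can be reused, for instance when refitting on updated data.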
    ### linear model example
    df <- data.frame(x = runif(10), y = rnorm(10))
    mf <- dpp(linearModel, y ~ x, data = df)
    mylm <- fit(linearModel, mf)

    ### equivalent
    print(mylm)
    lm(y ~ x, data = df)

    ### predictions
    Predict(mylm, newdata = data.frame(x = runif(10)))
A class describing capabilities of a statistical model.
Objects can be created by calls of the form new("StatModelCapabilities", ...).
weights: Object of class "logical"
subset: Object of class "logical"
No methods defined with class "StatModelCapabilities" in the signature.
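The capabilities of an existing unfitted model can be read from its capabilities slot. A short sketch, assuming the linearModel object shipped with the package; the actual flag values are not asserted here, only their type.

```r
library(modeltools)

# The capabilities slot of an unfitted model holds a StatModelCapabilities object
caps <- linearModel@capabilities
is(caps, "StatModelCapabilities")

# Both slots are logical flags
caps@weights
caps@subset
```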