Title: | Smooth Effects on Response Penalty for CLM |
---|---|
Description: | Implements a regularization method for cumulative link models using the Smooth-Effect-on-Response Penalty (SERP). This method allows flexible modeling of ordinal data by enabling a smooth transition from a general cumulative link model to a simplified version of the same model. As the tuning parameter increases from zero to infinity, the subject-specific effects for each variable converge to a single global effect. The approach addresses common issues in cumulative link models, such as parameter unidentifiability and numerical instability, by maximizing a penalized log-likelihood instead of the standard non-penalized version. Fitting is performed using a modified Newton's method. Additionally, the package includes various model performance metrics and descriptive tools. For details on the implemented penalty method, see Ugba (2021) <doi:10.21105/joss.03705> and Ugba et al. (2021) <doi:10.3390/stats4030037>. |
Authors: | Ejike R. Ugba [aut, cre, cph] |
Maintainer: | Ejike R. Ugba <[email protected]> |
License: | GPL-2 |
Version: | 0.2.5 |
Built: | 2024-11-25 19:34:15 UTC |
Source: | CRAN |
Returns the akaike information criterion of a fitted object of class
serp
. For the penalized slope, the effective degrees of freedom (edf)
is obtained from the trace of the generalized hat matrix which depends on
the tuning parameter.
## S3 method for class 'serp' AIC(object, ..., k = 2)
## S3 method for class 'serp' AIC(object, ..., k = 2)
object |
An object of class |
... |
additional arguments. |
k |
fixed value equal to 2. |
A single numeric value of the model AIC.
serp
, BIC.serp
, coef.serp
,
logLik.serp
,
library(serp) m <- serp(rating ~ temp + contact, slope = "parallel", link = "probit", data = wine) AIC(m)
library(serp) m <- serp(rating ~ temp + contact, slope = "parallel", link = "probit", data = wine) AIC(m)
Provides a likelihood ratio test for comparing two or more serp
objects. This
does not currently support model(s) with penalized slope.
## S3 method for class 'serp' anova(object, ..., test = c("Chisq", "none"))
## S3 method for class 'serp' anova(object, ..., test = c("Chisq", "none"))
object |
An object of class |
... |
additional arguments. |
test |
type of test to be conducted. |
An ANOVA table with the following components on display:
model |
the respective model aliases. |
slope |
type of slope fitted, which may be any of, unparallel, parallel, or partial slope. |
no.par |
the no of parameters in the model. |
AIC |
the akaike information criterion. |
logLik |
the realized log-likelihood. |
Test |
the different pair(s) of test(s) conducted. |
LR.stat |
the computed Likelihood ratio statistic. |
df |
the degree of freedom. |
Pr(chi) |
the p-value of test statitic. |
library(serp) m1 <- serp(rating ~ temp + contact, slope = "parallel", link = "logit", data = wine) m2 <- update(m1, ~ contact) anova(m1, m2)
library(serp) m1 <- serp(rating ~ temp + contact, slope = "parallel", link = "logit", data = wine) m2 <- update(m1, ~ contact) anova(m1, m2)
Returns the bayesian information criterion of a fitted object of class
serp
. For the penalized slopes, the effective degrees of freedom (edf)
is obtained from the trace of the generalized hat matrix which depends on
the tuning parameter.
## S3 method for class 'serp' BIC(object, ...)
## S3 method for class 'serp' BIC(object, ...)
object |
An object of class |
... |
additional arguments. |
A single numeric value of the model.
serp
, AIC.serp
, coef.serp
,
logLik.serp
,
library(serp) m <- serp(rating ~ temp + contact, slope = "parallel", link = "loglog", data = wine) BIC(m)
library(serp) m <- serp(rating ~ temp + contact, slope = "parallel", link = "loglog", data = wine) BIC(m)
Returns the coefficients of a fitted object of class serp
.
## S3 method for class 'serp' coef(object, ...) ## S3 method for class 'serp' coefficients(object, ...)
## S3 method for class 'serp' coef(object, ...) ## S3 method for class 'serp' coefficients(object, ...)
object |
An object of class |
... |
additional arguments. |
A vector of model coefficients.
serp
, AIC.serp
, BIC.serp
,
logLik.serp
library(serp) m <- serp(rating ~ temp + contact, slope = "parallel", link = "loglog", data = wine) coef(m)
library(serp) m <- serp(rating ~ temp + contact, slope = "parallel", link = "loglog", data = wine) coef(m)
Provides the confidence interval of estimates for an object of class serp
.
## S3 method for class 'serp' confint(object, ..., parm, level = 0.95)
## S3 method for class 'serp' confint(object, ..., parm, level = 0.95)
object |
An object of class |
... |
additional arguments. |
parm |
unused argument. |
level |
significance level. |
A matrix of the the confidence intervals of fitted model.
library(serp) m <- serp(rating ~ temp + contact, slope = "parallel", link = "logit", data = wine) confint(m)
library(serp) m <- serp(rating ~ temp + contact, slope = "parallel", link = "logit", data = wine) confint(m)
Returns the Log-likelihood for a fitted object of class serp
.
## S3 method for class 'serp' logLik(object, ...)
## S3 method for class 'serp' logLik(object, ...)
object |
An object of class |
... |
additional arguments. |
A single numeric value of model log-likelihood
serp
, AIC.serp
, BIC.serp
,
coef.serp
library(serp) m <- serp(rating ~ temp + contact, slope = "parallel", link = "loglog", data = wine) logLik(m)
library(serp) m <- serp(rating ~ temp + contact, slope = "parallel", link = "loglog", data = wine) logLik(m)
This function takes a fitted serp
object produced by serp() and
produces predicted values. Type of predictions returned include response,
link and class. Prediction is also possible with new set of values having
the same column names as in the original values used for the model fit.
## S3 method for class 'serp' predict(object, type = c("link", "response", "class"), newdata = NULL, ...)
## S3 method for class 'serp' predict(object, type = c("link", "response", "class"), newdata = NULL, ...)
object |
An object of class |
type |
could be any of these: response, link or terms. |
newdata |
fresh dataset with all relevant variables. |
... |
additional arguments. |
A vector of predicted classes with type
equal to 'class'
or a dataframe of predicted values for type
equal to 'response'
and 'link'.
anova.serp
, summary.serp
,
confint.serp
, vcov.serp
library(serp) m <- serp(rating ~ temp + contact, slope = "penalize", reverse = TRUE, link = "logit", tuneMethod = "user", lambda = 1, data = wine) head(predict(m, type = "link")) head(predict(m, type = "response")) predict(m, type = "class") n.wine <- wine[1:20,] predict(m, newdata = n.wine, type = "class")
library(serp) m <- serp(rating ~ temp + contact, slope = "penalize", reverse = TRUE, link = "logit", tuneMethod = "user", lambda = 1, data = wine) head(predict(m, type = "link")) head(predict(m, type = "response")) predict(m, type = "class") n.wine <- wine[1:20,] predict(m, newdata = n.wine, type = "class")
Prints out a vector of coefficients of the fitted model with some additional goodness-of-fit measures.
## S3 method for class 'serp' print(x, ...)
## S3 method for class 'serp' print(x, ...)
x |
An object of class |
... |
additional arguments. |
No return value
Prints out the information supplied via summary.serp
method.
## S3 method for class 'summary.serp' print(x, ...)
## S3 method for class 'summary.serp' print(x, ...)
x |
An object of class |
... |
additional arguments. |
No return value
Fits cumulative link models (CLMs) with the
smooth-effect-on-response penalty (SERP) via a modified Newton-Raphson
algorithm. SERP enables the regularization of the parameter space between
the general and the restricted cumulative models, with a resultant shrinkage
of all subject-specific effects to global effects. The Akaike information
critrion (aic
), K-fold cross validation (cv
), among other tuning
aproaches, provide the means of arriving at an optimal tuning parameter in a
in a situation where a user-supplied tuning value is not available.
The slope
argument allows for the selection of a penalized, unparallel,
parallel, or partial slope.
serp( formula, link = c("logit", "probit","loglog", "cloglog", "cauchit"), slope = c("penalize", "parallel", "unparallel", "partial"), tuneMethod = c("aic", "cv", "finite", "user"), reverse = FALSE, lambdaGrid = NULL, cvMetric = c("brier", "logloss", "misclass"), gridType = c("discrete", "fine"), globalEff = NULL, data, subset, weights = NULL, weight.type = c("analytic", "frequency"), na.action = NULL, lambda = NULL, contrasts = NULL, control = list(), ...)
serp( formula, link = c("logit", "probit","loglog", "cloglog", "cauchit"), slope = c("penalize", "parallel", "unparallel", "partial"), tuneMethod = c("aic", "cv", "finite", "user"), reverse = FALSE, lambdaGrid = NULL, cvMetric = c("brier", "logloss", "misclass"), gridType = c("discrete", "fine"), globalEff = NULL, data, subset, weights = NULL, weight.type = c("analytic", "frequency"), na.action = NULL, lambda = NULL, contrasts = NULL, control = list(), ...)
formula |
regression formula of the form: response ~ predictors. The response should be a factor (ordered). |
link |
sets the link function for the cumulative link model including: logit, probit, complementary log-log, cloglog, cauchit. |
slope |
selects the form of coefficients used in the model, with
|
tuneMethod |
sets the method of choosing an optimal shrinkage
parameter, including: |
reverse |
false by default, when true the sign of the linear predictor is reversed. |
lambdaGrid |
optional user-supplied lambda grid for the |
cvMetric |
sets the performance metric for the cv tuning, with the brier score used by default. |
gridType |
chooses if a discrete or a continuous lambda grid should be
used to select the optimal tuning parameter. The former is used by default
and could be adjusted as desired in |
globalEff |
specifies variable(s) to be assigned global effects during
penalization or when |
data |
optional dataframe explaining the variables used in the formula. |
subset |
specifies which subset of the rows of the data should be used for fit. All observations are used by default. |
weights |
optional case weights in fitting. Negative weights are not allowed. Defaults to 1. |
weight.type |
chooses between analytic and frequency weights with the former used by default. The latter should be used when weights are mere case counts used to compress the data set. |
na.action |
a function to filter missing data. |
lambda |
a user-supplied single numeric value for the tuning parameter
when using the |
contrasts |
a list of contrasts to be used for some or all of the factors appearing as variables in the model formula. |
control |
A list of fit control parameters to replace default values
returned by |
... |
additional arguments. |
The serp
function fits the cumulative link model (CLM)
with smooth-effect-on-response penalty (SERP). The cumulative
model developed by McCullagh (1980) is probably most frequently
used ordinal model. When motivated by an underlying latent
variable, a simple form of the model is expressed as follows:
where is a vector of covariates,
a vector
of regression parameters and
a continuous distribution
function. This model assumes that the effect of
does not
depend on the category. However, with this assumption relaxed,
one obtains the following general cumulative model:
where r=1,...,k-1. This model, however, has the stochastic ordering
property, which implies that
holds for all
and all categories
. Such assumption
is often problematic, resulting in unstable likelihoods with
ill-conditioned parameter space during the iterative procedure.
SERP offers a means of arriving at stable estimates of the general model. It provides a form of regularization that is based on minimizing the penalized log-likelihood:
where , is the log-likelihood of the general cumulative
model and
the penalty
function weighted by the turning parameter
. Assuming an
ordered categorical outcome
, and considering
that the corresponding parameters
vary smoothly over the categories, the following penalty
(Tutz and Gertheiss, 2016),
enables the smoothing of response categories such that all category-specific effects associated with the response turn towards a common global effect. SERP could also be applied to a semi-parallel model with only the category-specific part of the model penalized. See, Ugba (2021), Ugba et al. (2021) for further details and application in empirical studies.
An object of class serp
with the components listed below,
depending on the type of slope modeled. Other summary methods include:
summary
, coef
, predict
, vcov
,
anova
, etc.
aic |
the akaike information criterion, with effective degrees of freedom obtained from the trace of the generalized hat matrix depending on the tuning parameter. |
bic |
the bayesian information criterion, with effective degrees of freedom obtained from the trace of the generalized hat matrix depending on the tuning parameter. |
call |
the matched call. |
coef |
a vector of coefficients of the fitted model. |
converged |
a character vector of fit convergence status. |
contrasts |
(where relevant) the contrasts used in the model. |
control |
list of control parameters from |
cvMetric |
the performance metric used for cv tuning. |
deviance |
the residual deviance. |
edf |
the (effective) number of degrees of freedom used by the model |
fitted.values |
the fitted probabilities. |
globalEff |
variable(s) in model treated as global effect(s) |
gradient |
a column vector of gradients for the coefficients at the model convergence. |
Hessian |
the hessian matrix for the coefficients at the model convergence. |
iter |
number of interactions before convergence or non-convergence. |
lambda |
a user-supplied single numeric value for the |
lambdaGrid |
a numeric vector of lambda values used to determine the optimum tuning parameter. |
logLik |
the realized log-likelihood at the model convergence. |
link |
character vector indicating the link function of the fit. |
message |
character vector stating the type of convergence obtained |
misc |
a list to hold miscellaneous fit information. |
model |
model.frame having variables from formula. |
na.action |
(where relevant) information on the treatment of NAs. |
nobs |
the number of observations. |
nrFold |
the number of k-fold cross validation for the cv tuning method. Default to k = 5. |
rdf |
the residual degrees of freedom |
reverse |
a logical vector indicating the the direction of the cumulative probabilities. Default to P(Y<=r). |
slope |
a character vector indicating the type of slope parameters
fitted. Default to |
Terms |
the terms structure describing the model. |
testError |
numeric value of the cross-validated test error at which the optimal tuning parameter emerged. |
tuneMethod |
a character vector specifying the method for choosing an optimal shrinkage parameter. |
value |
numeric value of AIC or logLik obtained at the optimal tuning
parameter when using |
ylev |
the number of the response levels. |
Ugba, E. R. (2021). serp: An R package for smoothing in ordinal regression Journal of Open Source Software, 6(66), 3705. https://doi.org/10.21105/joss.03705
Ugba, E. R., Mörlein, D. and Gertheiss, J. (2021). Smoothing in Ordinal Regression: An Application to Sensory Data. Stats, 4, 616–633. https://doi.org/10.3390/stats4030037
Tutz, G. and Gertheiss, J. (2016). Regularized Regression for Categorical Data (With Discussion and Rejoinder). Statistical Modelling, 16, pp. 161-260. https://doi.org/10.1177/1471082X16642560
McCullagh, P. (1980). Regression Models for Ordinal Data. Journal of the Royal Statistical Society. Series B (Methodological), 42, pp. 109-142. https://doi.org/10.1111/j.2517-6161.1980.tb01109.x
anova.serp
, summary.serp
,
predict.serp
, confint.serp
,
vcov.serp
require(serp) ## The unpenalized non-proportional odds model returns unbounded estimates, hence, ## not fully identifiable. f1 <- serp(rating ~ temp + contact, slope = "unparallel", reverse = TRUE, link = "logit", data = wine) coef(f1) ## The penalized non-proportional odds model with a user-supplied lambda gives ## a fully identified model with bounded estimates. A suitable tuning criterion ## could as well be used to select lambda (e.g., aic, cv) f2 <- serp(rating ~ temp + contact, slope = "penalize", link = "logit", reverse = TRUE, tuneMethod = "user", lambda = 1e1, data = wine) coef(f2) ## A penalized partial proportional odds model with some variables set to ## global effect is also possible. f3 <- serp(rating ~ temp + contact, slope = "penalize", reverse = TRUE, link = "logit", tuneMethod = "user", lambda = 2e1, globalEff = ~ temp, data = wine) coef(f3) ## The unpenalized proportional odds model having constrained estimates can ## as well be fit. Under extreme shrinkage, estimates in f2 equal those in ## this model. f4 <- serp(rating ~ temp + contact, slope = "parallel", reverse = FALSE, link = "logit", data = wine) summary(f4)
require(serp) ## The unpenalized non-proportional odds model returns unbounded estimates, hence, ## not fully identifiable. f1 <- serp(rating ~ temp + contact, slope = "unparallel", reverse = TRUE, link = "logit", data = wine) coef(f1) ## The penalized non-proportional odds model with a user-supplied lambda gives ## a fully identified model with bounded estimates. A suitable tuning criterion ## could as well be used to select lambda (e.g., aic, cv) f2 <- serp(rating ~ temp + contact, slope = "penalize", link = "logit", reverse = TRUE, tuneMethod = "user", lambda = 1e1, data = wine) coef(f2) ## A penalized partial proportional odds model with some variables set to ## global effect is also possible. f3 <- serp(rating ~ temp + contact, slope = "penalize", reverse = TRUE, link = "logit", tuneMethod = "user", lambda = 2e1, globalEff = ~ temp, data = wine) coef(f3) ## The unpenalized proportional odds model having constrained estimates can ## as well be fit. Under extreme shrinkage, estimates in f2 equal those in ## this model. f4 <- serp(rating ~ temp + contact, slope = "parallel", reverse = FALSE, link = "logit", data = wine) summary(f4)
Default control parameters for 'serp' fit. User-supplied control parameters could be specified in the main function.
serp.control( maxits = 5e01, eps = 1e-07, maxpen = 1e07, trace = 0L, maxAdjIter = 5e0, max.half.iter = 1e01, relTol = 1e-03, nrFold = 5e0, cv.seed = 1e01, grid.length = 5e01, misclass.thresh = 5e-01, minP = .Machine$double.eps, ...)
serp.control( maxits = 5e01, eps = 1e-07, maxpen = 1e07, trace = 0L, maxAdjIter = 5e0, max.half.iter = 1e01, relTol = 1e-03, nrFold = 5e0, cv.seed = 1e01, grid.length = 5e01, misclass.thresh = 5e-01, minP = .Machine$double.eps, ...)
maxits |
the maximum number of Newton's iterations. Default to 100. |
eps |
threshold value during optimization at which the iteration routine terminates. In other words, when the reported change in the log-likelihood goes below this threshold, convergence is achieved. |
maxpen |
the upper end point of the interval from zero to be searched for a tuning parameter. |
trace |
prints the Newton's fitting process at each iteration step.If 0 (default) no information is printed, if 1, 2 or 3 different shades of information are printed. |
maxAdjIter |
the maximum allowable number of Newton step adjustment to forestall an early optimization failure. Defaults to 5. |
max.half.iter |
the maximum number of iteration step-halfings. Defaults to 10. |
relTol |
relative convergence tolerance, defaults to 1e-03. checks relative changes in the parameter estimates between Newton iterations. |
nrFold |
the number of k-fold cross validation for the CV tuning method. Default to k = 5. |
cv.seed |
single numeric value to change the random seed in CV tuning. |
grid.length |
the length of the discrete lambda grid for the penalty method. |
misclass.thresh |
to reset the classification threshold in
|
minP |
A near zero minimum value the fitted probabilities are allowed to get during iteration to prevent numerical instability . |
... |
additional arguments. |
a list of control parameters.
library(serp) serp(rating ~ contact, slope = "parallel", link = "logit", control = list(maxits = 2e01, eps=1e-05, trace = 2), data = wine)
library(serp) serp(rating ~ contact, slope = "parallel", link = "logit", control = list(maxits = 2e01, eps=1e-05, trace = 2), data = wine)
This function summarizes the result of a fitted serp object in a dataframe.
## S3 method for class 'serp' summary(object, ...)
## S3 method for class 'serp' summary(object, ...)
object |
An object of class |
... |
Not used. Additional summary arguments. |
an object of class summary.serp
. A list (depending on the type of
slope
used) of all model components defined in the serp
,
function with additional components listed below.
coefficients |
the matrix of coefficients, standard errors, z-values and p-values. |
null.deviance |
the deviance for the intercept only model. |
null.logLik |
the log-likelihood for the intercept only model. |
penalty |
list of penalization information obtained with
|
expcoefs |
the exponentiated coefficients. |
anova.serp
, predict.serp
,
confint.serp
, vcov.serp
library(serp) m <- serp(rating ~ temp + contact, slope = "penalize", reverse = TRUE, link = "logit", tuneMethod = "user", lambda = 0, data = wine) summary(m)
library(serp) m <- serp(rating ~ temp + contact, slope = "penalize", reverse = TRUE, link = "logit", tuneMethod = "user", lambda = 0, data = wine) summary(m)
Provides the Variance covariance matrix of an object of class serp
.
## S3 method for class 'serp' vcov(object, ...)
## S3 method for class 'serp' vcov(object, ...)
object |
An object of class |
... |
additional arguments. |
A variance covariance matrix of a fitted model.
serp
, anova.serp
, confint.serp
library(serp) m <- serp(rating ~ temp + contact, slope = "parallel", link = "logit", data = serp::wine) vcov(m)
library(serp) m <- serp(rating ~ temp + contact, slope = "parallel", link = "logit", data = serp::wine) vcov(m)
The wine
dataset adopted from Randall(1989),
represents the outcome of a factorial experiment on factors
determining the bitterness of wine. Two treatment factors
(temperature and contact) with two levels each are provided,
with the rating of wine taken on a continuous scale in the interval
from 0 (none) to 100 (intense). These were subsequently grouped
into five ordered categories ranging from 1 = 'least bitter'
to 5 = 'most bitter'. Altogether, nine different judges assessed
wine from two bottles and out of the four treatment conditions,
making a total of 72 observations.
wine
wine
A data frame with 72 rows and 6 variables:
response |
scorings of wine bitterness on a 0—100 continuous scale. |
rating |
ordered factor with 5 levels; a grouped version of |
contact |
factor with two levels ( |
temp |
temperature: factor with two levels. |
judge |
factor with nine levels. |
bottle |
factor with eight levels. |
Taken from Randall (1989).
Randall, J (1989). The analysis of sensory data by generalized linear model. Biometrical Journal, 31, 781–793. https://doi.org/10.1002/bimj.4710310703
## Not run: str(wine) head(wine) ## End(Not run)
## Not run: str(wine) head(wine) ## End(Not run)