Title: | Generalized Linear Models with Clustering |
---|---|
Description: | Binomial and Poisson regression for clustered data, fixed and random effects with bootstrapping. |
Authors: | Göran Broström [aut, cre], Jianming Jin [ctb], Henrik Holmberg [ctb] |
Maintainer: | Göran Broström <[email protected]> |
License: | GPL (>= 3) |
Version: | 1.1.7 |
Built: | 2024-11-20 06:59:00 UTC |
Source: | CRAN |
Calculates the zeros and weights needed for Gauss-Hermite quadrature.
ghq(n.points = 1, modified = TRUE)
ghq(n.points = 1, modified = TRUE)
n.points |
Number of points. |
modified |
Multiply by exp(zeros**2)? Default is TRUE. |
Based on a Fortran 66 subroutine written by professor Jianming Jin.
A list vith components
zeros |
The zeros (abscissas). |
weights |
The weights |
The code is modified to suit the purpose of glmmML, with the permission of professor Jin.
Jianming Jin, Univ. of Illinois, Urbana-Campaign
Gauss-Hermite
ghq(15, FALSE)
ghq(15, FALSE)
Fits grouped GLMs with fixed group effects. The significance of the grouping is tested by simulation, with a bootstrap approach.
glmmboot(formula, family = binomial, data, cluster, weights, subset, na.action, offset, contrasts = NULL, start.coef = NULL, control = list(epsilon = 1e-08, maxit = 200, trace = FALSE), boot = 0)
glmmboot(formula, family = binomial, data, cluster, weights, subset, na.action, offset, contrasts = NULL, start.coef = NULL, control = list(epsilon = 1e-08, maxit = 200, trace = FALSE), boot = 0)
formula |
a symbolic description of the model to be fit. The details of model specification are given below. |
family |
Currently, the only valid values are |
data |
an optional data frame containing the variables in the model. By default the variables are taken from ‘environment(formula)’, typically the environment from which ‘glmmML’ is called. |
cluster |
Factor indicating which items are correlated. |
weights |
Case weights. |
subset |
an optional vector specifying a subset of observations to be used in the fitting process. |
na.action |
See glm. |
offset |
this can be used to specify an a priori known component to be included in the linear predictor during fitting. |
contrasts |
an optional list. See the 'contrasts.arg' of 'model.matrix.default'. |
start.coef |
starting values for the parameters in the linear predictor. Defaults to zero. |
control |
Controls the convergence criteria. See
|
boot |
number of bootstrap replicates. If equal to zero, no test of significance of the grouping factor is performed. |
The simulation is performed by
simulating new response vectors from the fitted probabilities without
clustering, and comparing the maximized log likelihoods. The
maximizations are performed by profiling out the grouping factor. It is
a very fast procedure, compared to glm
, when the grouping
factor has many levels.
The return value is a list, an object of class 'glmmboot'.
coefficients |
Estimated regression coefficients |
logLik |
the max log likelihood |
cluster.null.deviance |
Deviance without the clustering |
frail |
The estimated cluster effects |
bootLog |
The logLik values from the bootstrap samples |
bootP |
Bootstrap p value |
variance |
Variance covariance matrix |
sd |
Standard error of regression parameters |
boot_rep |
No. of bootstrap replicates |
mixed |
Logical |
deviance |
Deviance |
df.residual |
Its degrees of freedom |
aic |
AIC |
boot |
Logical |
call |
The function call |
There is no overall intercept for this model; each cluster has its
own intercept. See frail
G\"oran Brostr\"om and Henrik Holmberg
Brostr\"om, G. and Holmberg, H. (2011). Generalized linear models with clustered data: Fixed and random effects models. Computational Statistics and Data Analysis 55:3123-3134.
link{glmmML}
, optim
,
lmer
in Matrix
, and
glmmPQL
in MASS
.
## Not run: id <- factor(rep(1:20, rep(5, 20))) y <- rbinom(100, prob = rep(runif(20), rep(5, 20)), size = 1) x <- rnorm(100) dat <- data.frame(y = y, x = x, id = id) res <- glmmboot(y ~ x, cluster = id, data = dat, boot = 5000) ## End(Not run) ##system.time(res.glm <- glm(y ~ x + id, family = binomial))
## Not run: id <- factor(rep(1:20, rep(5, 20))) y <- rbinom(100, prob = rep(runif(20), rep(5, 20)), size = 1) x <- rnorm(100) dat <- data.frame(y = y, x = x, id = id) res <- glmmboot(y ~ x, cluster = id, data = dat, boot = 5000) ## End(Not run) ##system.time(res.glm <- glm(y ~ x + id, family = binomial))
'glmmbootFit' is the workhorse in the function glmmboot
. It is
suitable to call instead of 'glmmboot', e.g. in simulations.
glmmbootFit(X, Y, weights = rep(1, NROW(Y)), start.coef = NULL, cluster = rep(1, length(Y)), offset = rep(0, length(Y)), family = binomial(), control = list(epsilon = 1.e-8, maxit = 200, trace = FALSE), boot = 0)
glmmbootFit(X, Y, weights = rep(1, NROW(Y)), start.coef = NULL, cluster = rep(1, length(Y)), offset = rep(0, length(Y)), family = binomial(), control = list(epsilon = 1.e-8, maxit = 200, trace = FALSE), boot = 0)
X |
The design matrix (n * p). |
Y |
The response vector of length n. |
weights |
Case weights. |
start.coef |
start values for the parameters in the linear predictor (except the intercept). |
cluster |
Factor indicating which items are correlated. |
offset |
this can be used to specify an a priori known component to be included in the linear predictor during fitting. |
family |
Currently, the only valid values are |
control |
A list. Controls the convergence criteria. See
|
boot |
number of bootstrap replicates. If equal to zero, no test of significance of the grouping factor is performed. If non-zero, it should be large, at least, say, 2000. |
A list with components
coefficients |
Estimated regression coefficients (note: No intercept). |
logLik |
The maximised log likelihood. |
cluster.null.deviance |
deviance from a moddel without cluster. |
frail |
The estimated cluster effects. |
bootLog |
The maximised bootstrap log likelihood values. A vector
of length |
bootP |
The bootstrap p value. |
variance |
The variance-covariance matrix of the fixed effects (no intercept). |
sd |
The standard errors of the |
boot_rep |
The number of bootstrap replicates. |
A profiling approach is used to estimate the cluster effects.
Göran Broström
## Not run x <- matrix(rnorm(1000), ncol = 1) id <- rep(1:100, rep(10, 100)) y <- rbinom(1000, size = 1, prob = 0.4) fit <- glmmbootFit(x, y, cluster = id, boot = 200) summary(fit) ## End(Not run) ## Should show no effects. And boot too small.
## Not run x <- matrix(rnorm(1000), ncol = 1) id <- rep(1:100, rep(10, 100)) y <- rbinom(1000, size = 1, prob = 0.4) fit <- glmmbootFit(x, y, cluster = id, boot = 200) summary(fit) ## End(Not run) ## Should show no effects. And boot too small.
Fits GLMs with random intercept by Maximum Likelihood and numerical integration via Gauss-Hermite quadrature.
glmmML(formula, family = binomial, data, cluster, weights, cluster.weights, subset, na.action, offset, contrasts = NULL, prior = c("gaussian", "logistic", "cauchy"), start.coef = NULL, start.sigma = NULL, fix.sigma = FALSE, x = FALSE, control = list(epsilon = 1e-08, maxit = 200, trace = FALSE), method = c("Laplace", "ghq"), n.points = 8, boot = 0)
glmmML(formula, family = binomial, data, cluster, weights, cluster.weights, subset, na.action, offset, contrasts = NULL, prior = c("gaussian", "logistic", "cauchy"), start.coef = NULL, start.sigma = NULL, fix.sigma = FALSE, x = FALSE, control = list(epsilon = 1e-08, maxit = 200, trace = FALSE), method = c("Laplace", "ghq"), n.points = 8, boot = 0)
formula |
a symbolic description of the model to be fit. The details of model specification are given below. |
family |
Currently, the only valid values are |
data |
an optional data frame containing the variables in the model. By default the variables are taken from ‘environment(formula)’, typically the environment from which ‘glmmML’ is called. |
cluster |
Factor indicating which items are correlated. |
weights |
Case weights. Defaults to one. |
cluster.weights |
Cluster weights. Defaults to one. |
subset |
an optional vector specifying a subset of observations to be used in the fitting process. |
na.action |
See glm. |
start.coef |
starting values for the parameters in the linear predictor. Defaults to zero. |
start.sigma |
starting value for the mixing standard deviation. Defaults to 0.5. |
fix.sigma |
Should sigma be fixed at start.sigma? |
x |
If TRUE, the design matrix is returned (as x). |
offset |
this can be used to specify an a priori known component to be included in the linear predictor during fitting. |
contrasts |
an optional list. See the 'contrasts.arg' of 'model.matrix.default'. |
prior |
Which "prior" distribution (for the random effects)? Possible choices are "gaussian" (default), "logistic", and "cauchy". |
control |
Controls the convergence criteria. See
|
method |
There are two choices "Laplace" (default) and "ghq" (Gauss-Hermite). |
n.points |
Number of points in the Gauss-Hermite quadrature. If
n.points == 1, the Gauss-Hermite is the same as Laplace
approximation. If |
boot |
Do you want a bootstrap estimate of cluster effect? The default
is No ( |
The integrals in the log likelihood function are evaluated by the Laplace approximation (default) or Gauss-Hermite quadrature. The latter is now fully adaptive; however, only approximate estimates of variances are available for the Gauss-Hermite (n.points > 1) method.
For the binomial families, the response can be a two-column matrix, see the help page for glm for details.
The return value is a list, an object of class 'glmmML'. The components are:
boot |
No. of boot replicates |
converged |
Logical |
coefficients |
Estimated regression coefficients |
coef.sd |
Their standard errors |
sigma |
The estimated random effects' standard deviation |
sigma.sd |
Its standard error |
variance |
The estimated variance-covariance matrix. The last
column/row corresponds to the standard
deviation of the random effects ( |
aic |
AIC |
bootP |
Bootstrap p value from testing the null hypothesis of no random effect (sigma = 0) |
deviance |
Deviance |
mixed |
Logical |
df.residual |
Degrees of freedom |
cluster.null.deviance |
Deviance from a glm with no
clustering. Subtracting |
cluster.null.df |
Its degrees of freedom |
posterior.modes |
Estimated posterior modes of the random effects |
terms |
The terms object |
info |
From hessian inversion. Should be 0. If not, no variances could be estimated. You could try fixing sigma at the estimated value and rerun. |
prior |
Which prior was used? |
call |
The function call |
x |
The design matrix if asked for, otherwise not present |
The optimization may not converge with
the default value of start.sigma
. In that case, try different
start values for sigma. If still no convergence, consider the
possibility to fix the value of sigma at several values and study the
profile likelihood.
G\"oran Brostr\"om
Brostr\"om, G. and Holmberg, H. (2011). Generalized linear models with clustered data: Fixed and random effects models. Computational Statistics and Data Analysis 55:3123-3134.
glmmboot
, glm
, optim
,
lmer
in Matrix
and
glmmPQL
in MASS
.
id <- factor(rep(1:20, rep(5, 20))) y <- rbinom(100, prob = rep(runif(20), rep(5, 20)), size = 1) x <- rnorm(100) dat <- data.frame(y = y, x = x, id = id) glmmML(y ~ x, data = dat, cluster = id)
id <- factor(rep(1:20, rep(5, 20))) y <- rbinom(100, prob = rep(runif(20), rep(5, 20)), size = 1) x <- rnorm(100) dat <- data.frame(y = y, x = x, id = id) glmmML(y ~ x, data = dat, cluster = id)
This function is called by glmmML
, but it can also be called
directly by the user.
glmmML.fit(X, Y, weights = rep(1, NROW(Y)), cluster.weights = rep(1, NROW(Y)), start.coef = NULL, start.sigma = NULL, fix.sigma = FALSE, cluster = NULL, offset = rep(0, nobs), family = binomial(), method = 1, n.points = 1, control = list(epsilon = 1.e-8, maxit = 200, trace = FALSE), intercept = TRUE, boot = 0, prior = 0)
glmmML.fit(X, Y, weights = rep(1, NROW(Y)), cluster.weights = rep(1, NROW(Y)), start.coef = NULL, start.sigma = NULL, fix.sigma = FALSE, cluster = NULL, offset = rep(0, nobs), family = binomial(), method = 1, n.points = 1, control = list(epsilon = 1.e-8, maxit = 200, trace = FALSE), intercept = TRUE, boot = 0, prior = 0)
X |
Design matrix of covariates. |
Y |
Response vector. Or two-column matrix. |
weights |
Case weights. Defaults to one. |
cluster.weights |
Cluster weights. Defaults to one. |
start.coef |
Starting values for the coefficients. |
start.sigma |
Starting value for the mixing standard deviation. |
fix.sigma |
Should sigma be fixed at start.sigma? |
cluster |
The clustering variable. |
offset |
The offset in the model. |
family |
Family of distributions. Defaults to binomial with logit link. Other possibilities are binomial with cloglog link and poisson with log link. |
method |
Laplace (1) or Gauss-hermite (0)? |
n.points |
Number of points in the Gauss-Hermite
quadrature. Default is |
control |
Control of the iterations. See |
intercept |
Logical. If TRUE, an intercept is fitted. |
boot |
Integer. If > 0, bootstrapping with |
prior |
Which prior distribution? 0 for "gaussian", 1 for "logistic", 2 for "cauchy". |
In the optimisation, "vmmin" (in C code) is used.
A list. For details, see the code, and glmmML
.
Göran Broström
Broström (2003)
x <- cbind(rep(1, 14), rnorm(14)) y <- rbinom(14, prob = 0.5, size = 1) id <- rep(1:7, 2) glmmML.fit(x, y, cluster = id)
x <- cbind(rep(1, 14), rnorm(14)) y <- rbinom(14, prob = 0.5, size = 1) id <- rep(1:7, 2) glmmML.fit(x, y, cluster = id)
A glmmboot object is the output of glmmboot
.
## S3 method for class 'glmmboot' print(x, digits = max(3, getOption("digits") - 3), na.print = "", ...)
## S3 method for class 'glmmboot' print(x, digits = max(3, getOption("digits") - 3), na.print = "", ...)
x |
The glmmboot object |
digits |
Number of printed digits. |
na.print |
How to print NAs |
... |
Additional parameters, which are ignored. |
Nothing in particular.
A short summary of the object is printed.
This is the only summary method available for the moment.
Göran Broström
A glmmML object is the output of glmmML
.
## S3 method for class 'glmmML' print(x, digits = max(3, getOption("digits") - 3), na.print = "", ...)
## S3 method for class 'glmmML' print(x, digits = max(3, getOption("digits") - 3), na.print = "", ...)
x |
The glmmML object |
digits |
Number of printed digits. |
na.print |
How to print NAs |
... |
Additional parameters, which are ignored. |
Nothing in particular.
A short summary of the object is printed.
This is the only summary method available for the moment.
Göran Broström
It simply calls print.glmmboot
## S3 method for class 'glmmboot' summary(object, ...)
## S3 method for class 'glmmboot' summary(object, ...)
object |
A glmmboot object |
... |
Additional arguments |
A summary method will be written soon.
Nothing is returned.
Preliminary
Göran Broström
It simply calls print.glmmML
## S3 method for class 'glmmML' summary(object, ...)
## S3 method for class 'glmmML' summary(object, ...)
object |
A glmmML object |
... |
Additional arguments |
Nothing is returned.
Preliminary
Göran Broström