Title: | Nonparametric Maximum Likelihood Estimation for Random Effect Models |
---|---|
Description: | Nonparametric maximum likelihood estimation or Gaussian quadrature for overdispersed generalized linear models and variance component models. |
Authors: | Jochen Einbeck, Ross Darnell and John Hinde |
Maintainer: | Jochen Einbeck <[email protected]> |
License: | GPL (>= 2) |
Version: | 0.46-5 |
Built: | 2024-10-27 06:23:18 UTC |
Source: | CRAN |
Nonparametric maximum likelihood estimation or Gaussian quadrature
for overdispersed generalized linear models and variance component models.
The main functions are alldist
and allvc
.
Package: | npmlreg |
Type: | Package |
License: | GPL version 2 or newer |
This program is free software; you can redistribute it and/or modify it under the terms of the GNU General Public License as published by the Free Software Foundation; either version 2 of the License, or (at your option) any later version.
This program is distributed in the hope that it will be useful, but WITHOUT ANY WARRANTY; without even the implied warranty of MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the GNU General Public License for more details.
For details on the GNU General Public License see http://www.gnu.org/copyleft/gpl.html or write to the Free Software Foundation, Inc., 51 Franklin Street, Fifth Floor, Boston, MA 02110-1301, USA.
This R package is based on several GLIM4 macros originally written by
Murray Aitkin and Brian Francis. The authors are also grateful to
Nick Sofroniou for retrieving the suicide data and providing the function gqz
.
The work on this R package was supported by Science Foundation Ireland Basic Research Grant 04/BR/M0051.
Jochen Einbeck, Ross Darnell and John Hinde.
Maintainer: Jochen Einbeck <[email protected]>
Aitkin, M., Francis, B. and Hinde, J. (2009). Statistical Modelling in R. Oxford Statistical Science Series, Oxford, UK.
Einbeck, J., and Hinde, J.: Nonparametric maximum likelihood estimation
for random effect models in R. Vignette to R package npmlreg.
Type vignette("npmlreg-v")
to open it.
Fits a random effect model using Gaussian quadrature (Hinde, 1982) or nonparametric maximum likelihood (Aitkin, 1996a).
The function alldist
is designed to account for overdispersion, while allvc
fits variance component models.
alldist(formula, random = ~1, family = gaussian(), data, k = 4, random.distribution = "np", tol = 0.5, offset, weights, pluginz, na.action, EMmaxit = 500, EMdev.change = 0.001, lambda = 0, damp = TRUE, damp.power = 1, spike.protect = 0, sdev, shape, plot.opt = 3, verbose = TRUE, ...) allvc(formula, random = ~1, family = gaussian(), data, k = 4, random.distribution = "np", tol = 0.5, offset, weights, pluginz, na.action, EMmaxit = 500, EMdev.change = 0.001, lambda=0, damp = TRUE, damp.power = 1, spike.protect=0, sdev, shape, plot.opt = 3, verbose = TRUE, ...)
alldist(formula, random = ~1, family = gaussian(), data, k = 4, random.distribution = "np", tol = 0.5, offset, weights, pluginz, na.action, EMmaxit = 500, EMdev.change = 0.001, lambda = 0, damp = TRUE, damp.power = 1, spike.protect = 0, sdev, shape, plot.opt = 3, verbose = TRUE, ...) allvc(formula, random = ~1, family = gaussian(), data, k = 4, random.distribution = "np", tol = 0.5, offset, weights, pluginz, na.action, EMmaxit = 500, EMdev.change = 0.001, lambda=0, damp = TRUE, damp.power = 1, spike.protect=0, sdev, shape, plot.opt = 3, verbose = TRUE, ...)
formula |
a formula defining the response and the fixed effects (e.g. |
random |
a formula defining the random model. In the case of |
family |
conditional distribution of responses: "gaussian",
"poisson", "binomial", "Gamma", or "inverse.gaussian" can be set. If
"gaussian", "Gamma", or "inverse.gaussian", then equal component
dispersion parameters are assumed, except if the optional parameter
|
data |
the data frame (mandatory, even if it is attached to the workspace!). |
k |
the number of mass points/integration points (supported are up to 600 mass points). |
random.distribution |
the mixing distribution, Gaussian Quadrature ( |
tol |
the tol scalar (usually, |
offset |
an optional offset to be included in the model. |
weights |
optional prior weights for the data. |
pluginz |
optional numerical vector of length |
na.action |
a function indicating what should happen when |
EMmaxit |
maximum number of EM iterations. |
EMdev.change |
stops EM algorithm when deviance change falls below this value. |
lambda |
only applicable for Gaussian, Gamma, and Inverse Gaussian mixtures. If set, standard
deviations/ shape parameters are calculated smoothly across components via
an Aitchison-Aitken kernel ( |
damp |
switches EM damping on or off. |
damp.power |
steers degree of damping applied on dispersion parameter according
to formula |
spike.protect |
for Gaussian, Gamma, and Inverse Gaussian mixtures with unequal or
smooth component standard deviations: protects algorithm to converge into likelihood spikes
by stopping the EM algorithm if one of the component standard deviations
(shape parameters, resp.), divided by the fitted mass points, falls below
(exceeds, resp.) a certain threshold, which is |
sdev |
optional; specifies standard deviation for normally distributed response. If unspecified, it will be estimated from the data. |
shape |
optional; specifies shape parameter for Gamma-and IG-distributed response.
For Gamma, setting |
plot.opt |
if equal to zero, then no graphical output is given.
For |
verbose |
if set to |
... |
generic options for the |
The nonparametric maximum likelihood (NPML) approach was introduced in Aitkin (1996) as a tool to fit overdispersed generalized linear models. The idea is to approximate the unknown and unspecified distribution of the random effect by a discrete mixture of exponential family densities, leading to a simple expression of the marginal likelihood which can then be maximized using a standard EM algorithm.
Aitkin (1999) extended this method to generalized linear models with shared
random effects arising through variance component or repeated measures
structure. Applications are two-stage sample designs, when firstly the
primary sampling units (the upper-level units, e.g. classes) and then the
secondary sampling units (lower-level units, e.g. students) are selected, or
longitudinal data. Models of this type have also been referred to as multi-level
models (Goldstein, 2003). allvc
is restricted to 2-level models.
The number of components k
of the finite mixture has to be specified beforehand.
When option 'gq'
is set, then Gauss-Hermite masses and mass points are used,
assuming implicitly a normally distributed random effect.
When option 'np'
is chosen, the EM algorithm uses the Gauss-Hermite masses
and mass points as starting points. The position of the starting points can
be concentrated or extended by setting tol
smaller or larger than one,
respectively.
Fitting random coefficient models (Aitkin, Francis & Hinde, 2009, pp. 496, p. 514) is
possible by specifying the random term explicitly. Note that the setting
random= ~ x
gives a model with a random slope and a random
intercept, and
that only one random coefficient can be specified. The option
random.distribution
is restricted to np
in this case,
i.e. Gaussian Quadrature is not supported for random coefficient
models (see also Aitkin, Francis & Hinde (2005), page 475 bottom).
As for glm
, there are three different ways of specifying a
binomial model, namley through
a two-column matrix before the '~' symbol, specifying the counts of successes and non-successes.
a vector of proportions of successes before the '~' symbol, and the
associated number of trials provided in the weights
argument.
a two-level factor before the '~' symbol (only for Bernoulli-response).
The weights have to be understood as frequency weights, i.e. setting all weights in
alldist
equal to 2 will duplicate each data point and hence double the
disparity and deviance.
The Inverse Gaussian (IG) response distribution is parametrized as usual through the
mean and a scaling parameter. We refer to the latter, which is the
inverse of the dispersion parameter in exponential family formulation,
as shape
. The canonical "1/mu^2" link is supported, but it is
quite tenuous since the linear predictor is likely to become negative
after adding the random effect. The log
link behaves more
reliably for this distribution.
For k
, mass points with negligible mass
(i.e. < 1e-50) are omitted. The maximum number of 'effective' mass points
is then 198.
The function alldist produces an object of class glmmNPML
(if random.distributon
is set to ‘np’) or glmmGQ
(‘gq’). Both objects contain the following 29 components:
coefficients |
a named vector of coefficients (including the mass points).
In case of Gaussian quadrature, the coefficient given at |
residuals |
the difference between the true response and the empirical Bayes predictions. |
fitted.values |
the empirical Bayes predictions (Aitkin, 1996b) on the scale of the responses. |
family |
the ‘family’ object used. |
linear.predictors |
the extended linear predictors |
disparity |
the disparity ( |
deviance |
the deviance of the fitted mixture regression model. |
null.deviance |
the deviance for the null model (just containing an intercept), comparable with
|
df.residual |
the residual degrees of freedom of the fitted model (including the random part). |
df.null |
the residual degrees of freedom for the null model. |
y |
the (extended) response vector. |
call |
the matched call. |
formula |
the formula supplied. |
random |
the random term of the model formula. |
data |
the data argument. |
model |
the (extended) design matrix. |
weights |
the case weights initially supplied. |
offset |
the offset initially supplied. |
mass.points |
the fitted mass points. |
masses |
the mixture probabilities corresponding to the mass points. |
sdev |
a list of the two elements |
shape |
a list of the two elements |
rsdev |
estimated random effect standard deviation. |
post.prob |
a matrix of posteriori probabilities. |
post.int |
a vector of ‘posteriori intercepts’ (as in Sofroniou et al. (2006)). |
ebp |
the empirical Bayes Predictions on the scale of the linear predictor. For compatibility with older versions. |
EMiter |
gives the number of iterations of the EM algorithm. |
EMconverged |
logical value indicating if the EM algorithm converged. |
lastglm |
the fitted |
Misc |
contains additional information relevant for the summary and plot functions, in particular the disparity trend and the EM trajectories. |
If a binomial model is specified by giving a two-column response,
the weights returned by weights
are the total numbers of cases
(factored by the supplied case weights) and the component y
of the result is the proportion of successes.
As a by-product, alldist
produces a plot showing the disparity
in dependence of the iteration number. Further, a plot with the EM trajectories
is given. The x-axis corresponds to the iteration number, and the y-axis
to the value of the mass points at a particular iteration.
This plot is not produced for GQ.
In contrast to the GLIM 4 version, this R implementation
uses for Gaussian (as well Gamma and IG) mixtures by default a damping procedure in the
first cycles of the EM algorithm (Einbeck & Hinde, 2006), which stabilizes
the algorithm and makes it less sensitive to the optimal
choice of tol
. If tol
is very small
(i.e. less than 0.1), it can be useful to set damp.power
to values
larger than 1 in order to accelerate convergence.
Do not use damp.power=0
, as this would mean permanent damping during EM.
Using the option pluginz
, one can to some extent circumvent the
necessity to specify tol
by giving the starting points explicitly.
However, when using pluginz
for normal, Gamma- or IG- distributed response,
damping will be strictly necessary to ensure that the imposed starting points
don't get blurred immediately due to initial fluctuations, implying that
tol
still plays a role in this case.
Originally translated from the GLIM 4 functions alldist
and
allvc
(Aitkin & Francis, 1995) to R by Ross Darnell (2002). Modified,
extended, and prepared for publication by Jochen Einbeck & John Hinde (2006).
Aitkin, M. and Francis, B. (1995). Fitting overdispersed generalized linear models by nonparametric maximum likelihood. GLIM Newsletter 25, 37-45.
Aitkin, M. (1996a). A general maximum likelihood analysis of overdispersion in generalized linear models. Statistics and Computing 6, 251-262.
Aitkin, M. (1996b). Empirical Bayes shrinkage using posterior random effect means from nonparametric maximum likelihood estimation in general random effect models. Statistical Modelling: Proceedings of the 11th IWSM 1996, 87-94.
Aitkin, M. (1999). A general maximum likelihood analysis of variance components in generalized linear models. Biometrics 55, 117-128.
Aitkin, M., Francis, B. and Hinde, J. (2009). Statistical Modelling in R. Oxford Statistical Science Series, Oxford, UK.
Einbeck, J. & Hinde, J. (2006). A note on NPML estimation for exponential family regression models with unspecified dispersion parameter. Austrian Journal of Statistics 35, 233-243.
Goldstein, H. (2003). Multilevel Statistical Models (3rd edition). Arnold, London, UK.
Hinde, J. (1982). Compound Poisson regression models. Lecture Notes in Statistics 14, 109-121.
Sofroniou, N., Einbeck, J., and Hinde, J. (2006). Analyzing Irish suicide rates with mixture models. Proceedings of the 21st International Workshop on Statistical Modelling in Galway, Ireland, 2006.
glm
, summary.glmmNPML
,
predict.glmmNPML
family.glmmNPML
,
plot.glmmNPML
.
# The first three examples (galaxy data, toxoplasmosis data , fabric faults) # are based on GLIM examples in Aitkin et al. (2005), and the forth example using # the Hospital-Stay-Data (Rosner, 2000) is taken from Einbeck & Hinde (2006). # The fifth data example using the Oxford boys is again inspired by Aitkin et al. (2005). # The sixth example on Irish suicide rates is taken from Sofroniou et al. (2006). # The galaxy data data(galaxies, package="MASS") gal<-as.data.frame(galaxies) galaxy.np6 <- alldist(galaxies/1000~1, random=~1, random.distribution="np", data=gal, k=6) galaxy.np8u <- alldist(galaxies/1000~1, random=~1, random.distribution="np", data=gal, k=8, lambda=0.99) round(galaxy.np8u$sdev$sdevk, digits=3) # [1] 0.912 0.435 0.220 0.675 1.214 0.264 0.413 0.297 # The toxoplasmosis data data(rainfall) rainfall$x<-rainfall$Rain/1000 rainfall$x2<- rainfall$x^2; rainfall$x3<- rainfall$x^3 toxo.np3<- alldist(cbind(Cases,Total-Cases) ~ x+x2+x3, random=~1, random.distribution="np", family=binomial(link=logit), data=rainfall, k=3) toxo.np3x<- alldist(cbind(Cases,Total-Cases) ~ x, random=~x, random.distribution="np", family=binomial(link=logit), data=rainfall, k=3) # is the same as toxo.np3x<- alldist(Cases/Total ~ x, random = ~x, weights=Total, family=binomial(link=logit), data=rainfall, k=3) # or toxo.np3x<-update(toxo.np3, .~.-x2-x3, random = ~x) # The fabric faults data data(fabric) coefficients(alldist(y ~ x, random=~1, family=poisson(link=log), random.distribution="gq", data= fabric, k=3, verbose=FALSE)) # (Intercept) x z # -3.3088663 0.8488060 0.3574909 # The Pennsylvanian hospital stay data data(hosp) fitnp3<- alldist(duration~age+temp1, data=hosp, k=3, family=Gamma(link=log), tol=0.5) fitnp3$shape$shape # [1] 50.76636 fitnp3<- alldist(duration~age+temp1, data=hosp, k=3, family=Gamma(link=log), tol=0.5, lambda=0.9) fitnp3$shape$shapek # [1] 49.03101 42.79522 126.64077 # The Oxford boys data data(Oxboys, package="nlme") Oxboys$boy <- gl(26,9) allvc(height~age, random=~1|boy, data=Oxboys, random.distribution='gq', k=20) allvc(height~age, random=~1|boy, data=Oxboys,random.distribution='np',k=8) # with random coefficients: allvc(height~age,random=~age|boy, data=Oxboys, random.distribution='np', k=8) # Irish suicide data data(irlsuicide) # Crude rate model: crude<- allvc(death~sex* age, random=~1|ID, offset=log(pop), k=3, data=irlsuicide, family=poisson) crude$disparity # [1] 654.021 # Relative risk model: relrisk<- allvc(death~1, random=~1|ID, offset=log(expected), k=3, data=irlsuicide, family=poisson) relrisk$disparity # [1] 656.4955
# The first three examples (galaxy data, toxoplasmosis data , fabric faults) # are based on GLIM examples in Aitkin et al. (2005), and the forth example using # the Hospital-Stay-Data (Rosner, 2000) is taken from Einbeck & Hinde (2006). # The fifth data example using the Oxford boys is again inspired by Aitkin et al. (2005). # The sixth example on Irish suicide rates is taken from Sofroniou et al. (2006). # The galaxy data data(galaxies, package="MASS") gal<-as.data.frame(galaxies) galaxy.np6 <- alldist(galaxies/1000~1, random=~1, random.distribution="np", data=gal, k=6) galaxy.np8u <- alldist(galaxies/1000~1, random=~1, random.distribution="np", data=gal, k=8, lambda=0.99) round(galaxy.np8u$sdev$sdevk, digits=3) # [1] 0.912 0.435 0.220 0.675 1.214 0.264 0.413 0.297 # The toxoplasmosis data data(rainfall) rainfall$x<-rainfall$Rain/1000 rainfall$x2<- rainfall$x^2; rainfall$x3<- rainfall$x^3 toxo.np3<- alldist(cbind(Cases,Total-Cases) ~ x+x2+x3, random=~1, random.distribution="np", family=binomial(link=logit), data=rainfall, k=3) toxo.np3x<- alldist(cbind(Cases,Total-Cases) ~ x, random=~x, random.distribution="np", family=binomial(link=logit), data=rainfall, k=3) # is the same as toxo.np3x<- alldist(Cases/Total ~ x, random = ~x, weights=Total, family=binomial(link=logit), data=rainfall, k=3) # or toxo.np3x<-update(toxo.np3, .~.-x2-x3, random = ~x) # The fabric faults data data(fabric) coefficients(alldist(y ~ x, random=~1, family=poisson(link=log), random.distribution="gq", data= fabric, k=3, verbose=FALSE)) # (Intercept) x z # -3.3088663 0.8488060 0.3574909 # The Pennsylvanian hospital stay data data(hosp) fitnp3<- alldist(duration~age+temp1, data=hosp, k=3, family=Gamma(link=log), tol=0.5) fitnp3$shape$shape # [1] 50.76636 fitnp3<- alldist(duration~age+temp1, data=hosp, k=3, family=Gamma(link=log), tol=0.5, lambda=0.9) fitnp3$shape$shapek # [1] 49.03101 42.79522 126.64077 # The Oxford boys data data(Oxboys, package="nlme") Oxboys$boy <- gl(26,9) allvc(height~age, random=~1|boy, data=Oxboys, random.distribution='gq', k=20) allvc(height~age, random=~1|boy, data=Oxboys,random.distribution='np',k=8) # with random coefficients: allvc(height~age,random=~age|boy, data=Oxboys, random.distribution='np', k=8) # Irish suicide data data(irlsuicide) # Crude rate model: crude<- allvc(death~sex* age, random=~1|ID, offset=log(pop), k=3, data=irlsuicide, family=poisson) crude$disparity # [1] 654.021 # Relative risk model: relrisk<- allvc(death~1, random=~1|ID, offset=log(expected), k=3, data=irlsuicide, family=poisson) relrisk$disparity # [1] 656.4955
Discrete kernel for categorical data with k
unordered categories.
dkern(x, y, k, lambda)
dkern(x, y, k, lambda)
x |
categorical data vector |
y |
postive integer defining a fixed category |
k |
positive integer giving the number of categories |
lambda |
smoothing parameter |
This kernel was introduced in Aitchison & Aitken (1976); see also Titterington (1980).
The setting lambda =1/k
corresponds to the extreme case 'maximal smoothing',
while lambda = 1
means ‘no smoothing’. Statistically sensible settings are
only 1/k
lambda
1
.
Jochen Einbeck (2006)
Aitchison, J. and Aitken, C.G.G. (1976). Multivariate binary discrimination by kernel method. Biometrika 63, 413-420.
Titterington, D. M. (1980). A comparative study of kernel-based density estimates for categorical data. Technometrics, 22, 259-268.
k<-6; dkern(1:k,4,k,0.99) # Kernel centered at the 4th component with a very small amount of smoothing. ## The function is currently defined as function(x,y,k,lambda){ ifelse(y==x, lambda, (1-lambda)/(k-1)) }
k<-6; dkern(1:k,4,k,0.99) # Kernel centered at the 4th component with a very small amount of smoothing. ## The function is currently defined as function(x,y,k,lambda){ ifelse(y==x, lambda, (1-lambda)/(k-1)) }
The data are 32 observations on faults in rolls of fabric
data(fabric)
data(fabric)
A data frame with 32 observations on the following 3 variables.
the length of the roll : a numeric vector
the number of faults in the roll of fabric : a discrete vector
the log of the length of the roll : a numeric vector
The data are 32 observations on faults in rolls of fabric taken from Hinde (1982) who used the EM algorithm to fit a Poisson-normal model. The response variable is the number of faults in the roll of fabric and the explanatory variable is the log of the length of the roll.
This data set and help file is an identical copy
of the fabric
data in package gamlss.data.
John Hinde.
Hinde, J. (1982) Compound Poisson regression models: in GLIM 82, Proceedings of the International Conference on Generalized Linear Models, ed. Gilchrist, R., 109–121, Springer: New York.
data(fabric) attach(fabric) plot(x,y) detach(fabric)
data(fabric) attach(fabric) plot(x,y) detach(fabric)
Methods for the generic family
and model.matrix
functions
## S3 method for class 'glmmNPML' family(object, ...) ## S3 method for class 'glmmGQ' family(object, ...) ## S3 method for class 'glmmNPML' model.matrix(object, ...) ## S3 method for class 'glmmGQ' model.matrix(object, ...)
## S3 method for class 'glmmNPML' family(object, ...) ## S3 method for class 'glmmGQ' family(object, ...) ## S3 method for class 'glmmNPML' model.matrix(object, ...) ## S3 method for class 'glmmGQ' model.matrix(object, ...)
object |
object of class |
... |
further arguments, ensuring compability with generic functions. |
The generic R functions update()
, coefficients()
, coef()
,
fitted()
, fitted.values()
, and df.residual()
can also be applied straightforwardly on all objects of
class glmmNPML
or glmmGQ
. They are not listed above as they use
the generic default functions and are not separately implemented.
Explicit implementations exist for predict
, summary
,
print
, and plot
, and these functions are explained in the corresponding
help files.
Jochen Einbeck and John Hinde (2007)
summary.glmmNPML
, predict.glmmNPML
,
family
, model.matrix
, update
,
coefficients
, alldist
.
Calculate Gaussian Quadrature points for the Normal distribution using the abscissas and weights for Hermite integration.
gqz(numnodes=20, minweight=0.000001)
gqz(numnodes=20, minweight=0.000001)
numnodes |
theoretical number of quadrature points. |
minweight |
locations with weights that are less than this value will be omitted. |
The conversion of the locations and weights is given in Lindsey (1992,
page 169:3) and Skrondal & Rabe-Hesketh (2004, page 165:1).
The argument numnodes is the theoretical number of quadrature points,
locations with weights that are less than the argument minweight
will
be omitted. The default value of minweight=0.000001
returns 14 masspoints
for the default numnodes=20
as in Aitkin, Francis & Hinde (2005).
A list with two vectors:
location |
locations of mass points |
weight |
masses |
Nick Sofroniou (2005)
Aitkin, M., Francis, B. and Hinde, J. (2005). Statistical Modelling in GLIM 4. Second Edition, Oxford Statistical Science Series, Oxford, UK.
Lindsey, J. K. (1992). The Analysis of Stochastic Processes using GLIM. Berlin: Springer-Verlag.
Skrondal, A. and Rabe-Hesketh, S. (2004). Generalized latent variable modelling. Boca Raton: Chapman and Hall/CRC.
gqz(20, minweight=1e-14) # gives k=20 GH integration points. These are used in alldist # and allvc as fixed mass point locations when working with # option random.distribution='gq', and serve as EM starting points # otherwise.
gqz(20, minweight=1e-14) # gives k=20 GH integration points. These are used in alldist # and allvc as fixed mass point locations when working with # option random.distribution='gq', and serve as EM starting points # otherwise.
The data, 25 observations, are a subset from a larger data set collected on persons discharged from a selected Pennsylvania hospital as part of a retrospective chart review of antibiotic use in hospitals (Towensend et al., 1979, Rosner, 2000).
data(hosp)
data(hosp)
A data frame with 25 observations on the following 9 variables. All variables are given as numerical vectors.
id
patient ID.
duration
the total number of days patients spent in hospital.
age
age of patient in whole years.
sex
gender: 1=M, 2=F.
temp1
first temperature following admission.
wbc1
first WBC count () following admission. [WBC= white blood cells].
antib
received antibiotic: 1=yes, 2=no.
bact
received bacterial culture: 1=yes, 2=no.
serv
service: 1 =med., 2=surg.
Don't confuse with the Barcelona 'Hospital stay data' aep
in package gamlss.
B. Rosner, Harvard University.
Rosner, B. (2000). Fundamentals of Biostatistics. Thomson Learning, Duxbury, CA, USA.
Townsend, T.R., Shapiro, M., Rosner, B., & Kass, E. H. (1979). Use of antimicrobial drugs in general hospitals. I. Description of population and definition of methods. Journal of Infectious Diseases 139 , 688-697.
data(hosp) glm1<- glm(duration~age+temp1+wbc1, data=hosp, family=Gamma(link=log))
data(hosp) glm1<- glm(duration~age+temp1+wbc1, data=hosp, family=Gamma(link=log))
Suicide Rates in the Republic of Ireland 1989-1998.
data(irlsuicide)
data(irlsuicide)
A data frame with 104 observations on the following 8 variables.
Region
a factor with levels Cork
, Dublin
, EHB - Dub.
, Galway
, Lim.
, Mid HB
, MWHB-Lim.
, NEHB
, NWHB
, SEHB-Wat.
, SHB-Cork
, Waterf.
, WHB-Gal.
.
ID
a factor with levels 1
2
3
4
5
6
7
8
9
10
11
12
13
corresponding to Regions.
pop
a numeric vector giving the population sizes (estimated for 1994).
death
a numeric vector giving the total number of deaths.
sex
a factor for gender with levels 0
(female) and 1
(male).
age
a factor for age with levels 1
(0-29), 2
(30-39), 3
(40-59), 4
(60+ years).
smr
a numeric vector with standardized mortality ratios (SMRs)
expected
a numeric vector with ‘expected’ number of cases obtained from a reference population (Ahlbom, 1993).
The data set is examined in Sofroniou et al. (2006), using a variance component model with regions as upper level.
Institute of Public Health in Ireland (2005). All Ireland Mortality Database. Retrieved August 8, 2005.
Ahlbom, A., (1993). Biostatistics for Epidemiologists. Boca Raton: Lewis Publishers.
Sofroniou, N., Einbeck, J., and Hinde, J. (2006). Analyzing Irish Suicide Rates with Mixture Models. Proceedings of the 21st Workshop on Statistical Modelling in Galway, Ireland, 2006.
data(irlsuicide) library(lattice) trellis.device(color=FALSE) plot2age<-rep(gl(4,2),13) xyplot(irlsuicide$death/irlsuicide$pop~plot2age|irlsuicide$Region, pch=(1+(irlsuicide$sex==1)),xlab="age",ylab="Crude rates")
data(irlsuicide) library(lattice) trellis.device(color=FALSE) plot2age<-rep(gl(4,2),13) xyplot(irlsuicide$death/irlsuicide$pop~plot2age|irlsuicide$Region, pch=(1+(irlsuicide$sex==1)),xlab="age",ylab="Crude rates")
Lung cancer mortality in the 84 largest Missouri cities, for males aged 45-54, 1972-1981. Data presented in Tsutakawa (1985).
data(missouri)
data(missouri)
A data frame with 84 observations on the following 2 variables.
Size
population of the city.
Deaths
number of lung cancer deaths.
The data set was analyzed using a Poisson model with normal random effect in Tsutakawa (1985), and using a binomial logit model with unspecified random effect distribution in Aitkin (1996b). Aitkin fitted this model with GLIM4.
Tsutakawa, R. (1985).
Aitkin, M. (1996b). Empirical Bayes shrinkage using posterior random effect means from nonparametric maximum likelihood estimation in general random effect models. Statistical Modelling: Proceedings of the 11th IWSM 1996, 87-94.
Tsutakawa, R. (1985). Estimation of Cancer Mortalilty Rates: A Bayesian Analysis of Small Frequencies. Biometrics 41, 69-79.
data(missouri) alldist(Deaths~1, offset=log(Size), random=~1, k=2, family=poisson(link='log'), data=missouri)
data(missouri) alldist(Deaths~1, offset=log(Size), random=~1, k=2, family=poisson(link='log'), data=missouri)
The functions alldist
and allvc
produce
objects of type glmmGQ
, if Gaussian quadrature (Hinde, 1982,
random.distribution="gq"
) was applied for computation, and objects
of class glmmNPML
, if parameter estimation was carried out by nonparametric
maximum likelihood (Aitkin, 1996a, random.distribution="np"
).
The functions presented here give some useful diagnostic plotting functionalities
to analyze these objects.
## S3 method for class 'glmmNPML' plot(x, plot.opt = 15, noformat=FALSE, ...) ## S3 method for class 'glmmGQ' plot(x, plot.opt = 3, noformat=FALSE, ...)
## S3 method for class 'glmmNPML' plot(x, plot.opt = 15, noformat=FALSE, ...) ## S3 method for class 'glmmGQ' plot(x, plot.opt = 3, noformat=FALSE, ...)
x |
a fitted object of class |
plot.opt |
an integer with values |
noformat |
if |
... |
further arguments which will mostly not have any effect
(and are included only to ensure compatibility with the
generic |
See the help pages to alldist and the vignette (Einbeck & Hinde, 2007).
It is sufficient to write plot
instead of plot.glmmNPML
or plot.glmmGQ
,
since the generic plot
function provided in R automatically selects the right model class.
For class glmmNPML
: Depending on the choice of plot.opt
, a subset
of the following four plots:
1 |
Disparity trend. |
2 |
EM Trajectories. |
3 |
Empirical Bayes Predictions against observed response. |
4 |
Individual posterior probabilities. |
The number given in plot.opt
is transformed into a binary
number indicating which plots are to be selected. The first digit
(from the right!) refers to plot 1, the second one to plot 2, and so on.
For example, plot.opt=4
gives the binary number 0100 and hence selects
just plot 3.
For class glmmGQ
: Depending on the choice of plot.opt,
a subset of plots 1 and 3. Again, the number is transformed into binary coding, yielding only the
disparity trend for plot.opt=1
, only the EBP's for plot.opt=2
,
and both plots for plot.opt=3
.
Jochen Einbeck and John Hinde (2007)
Aitkin, M. (1996a). A general maximum likelihood analysis of overdispersion in generalized linear models. Statistics and Computing 6, 251-262.
Einbeck, J., and Hinde, J.: Nonparametric maximum likelihood estimation for random effect models in R. Vignette to R package npmlreg.
Type vignette("npmlreg-v")
to open it.
Hinde, J. (1982). Compound Poisson regression models. Lecture Notes in Statistics 14, 109-121.
data(galaxies, package="MASS") gal<-as.data.frame(galaxies) galaxy.np4u <- alldist(galaxies/1000~1,random=~1,k=4,tol=0.5,data=gal,lambda=1) predict(galaxy.np4u, type="response") # EBP on scale of responses plot(galaxy.np4u, plot.opt=4) # plots only EBP vs. response plot(galaxy.np4u, plot.opt=3) # gives same output as given by default when executing alldist plot(galaxy.np4u) # gives all four plots.
data(galaxies, package="MASS") gal<-as.data.frame(galaxies) galaxy.np4u <- alldist(galaxies/1000~1,random=~1,k=4,tol=0.5,data=gal,lambda=1) predict(galaxy.np4u, type="response") # EBP on scale of responses plot(galaxy.np4u, plot.opt=4) # plots only EBP vs. response plot(galaxy.np4u, plot.opt=3) # gives same output as given by default when executing alldist plot(galaxy.np4u) # gives all four plots.
Takes an object of class glmmNPML
or glmmGQ
and displays the
posterior probabilites as well as the posterior intercepts
(Sofroniou et. al, 2006). Further it classfies the observations to mass points
according to their posterior probability. The level on which the information
in all three cases is displayed can be chosen by the user via the
level
argument ("upper"
or "lower"
). The actual information in both cases is
identical, the latter is just an expanded version of the former. In case of
simple overdispersion models, the level
argument is not relevant.
post(object, level="upper")
post(object, level="upper")
object |
an object of class |
level |
|
A list of the following four items:
prob |
posterior probabilities (identical to |
int |
posterior intercepts (identical to |
classif |
a numerical vector containing the class numbers (the order of the classes corresponds to the
order of the mass points given in the output of |
level |
either |
Jochen Einbeck and John Hinde (2006)
Sofroniou, N., Einbeck, J., and Hinde, J. (2006). Analyzing Irish suicide rates with mixture models. Proceedings of the 21st International Workshop on Statistical Modelling in Galway, Ireland, 2006.
data(galaxies, package="MASS") gal <- as.data.frame(galaxies) post(alldist(galaxies/1000~1, random=~1, data=gal, k=5))$classif # classifies the 82 galaxies to one of the five mass points
data(galaxies, package="MASS") gal <- as.data.frame(galaxies) post(alldist(galaxies/1000~1, random=~1, data=gal, k=5))$classif # classifies the 82 galaxies to one of the five mass points
The functions alldist
and allvc
produce objects of type glmmGQ
,
if Gaussian quadrature (Hinde, 1982, random.distribution="gq"
)
was applied for computation, and objects of class glmmNPML
, if
parameter estimation was carried out by nonparametric maximum likelihood
(Aitkin, 1996a, random.distribution="np"
). The functions presented here
give predictions from those objects.
## S3 method for class 'glmmNPML' predict(object, newdata, type = "link", ...) ## S3 method for class 'glmmGQ' predict(object, newdata, type = "link", ...)
## S3 method for class 'glmmNPML' predict(object, newdata, type = "link", ...) ## S3 method for class 'glmmGQ' predict(object, newdata, type = "link", ...)
object |
a fitted object of class |
newdata |
a data frame with covariates from which prediction is desired. If omitted, empirical Bayes predictions for the original data will be given. |
type |
if set to |
... |
further arguments which will mostly not have any effect (and are
included only to ensure compatibility
with the generic |
The predicted values are obtained by
Empirical Bayes (Aitkin, 1996b), if newdata
has not been specified.
That is, the prediction on the linear predictor scale is given by
,
whereby
are the fitted linear predictors,
are the weights in the final iteration of the EM algorithm
(corresponding to the posterior probability for observation
to come from component
), and the sum is taken over the number
of components
for fixed
.
the marginal model, if object is of class glmmNPML
and
newdata
has been specified. That is, computation is identical as above, but
with replaced by the masses
of the fitted model.
the analytical expression for the marginal mean of the responses,
if object is of class glmmGQ
and newdata
has been specified.
See Aitkin et al. (2009), p. 481, for the formula. This method is only
supported for the logarithmic link function, as otherwise no analytical
expression for the marginal mean of the responses exists.
It is sufficient to call predict
instead of predict.glmmNPML
or
predict.glmmGQ
, since the generic predict function provided in R automatically selects the right
model class.
A vector of predicted values.
The results of the generic fitted()
method
correspond to predict(object, type="response")
. Note that, as we are
working with random effects, fitted values are never really ‘fitted’ but rather
‘predicted’.
Jochen Einbeck and John Hinde (2007).
Aitkin, M. (1996a). A general maximum likelihood analysis of overdispersion in generalized linear models. Statistics and Computing 6, 251-262.
Aitkin, M. (1996b). Empirical Bayes shrinkage using posterior random effect means from nonparametric maximum likelihood estimation in general random effect models. Statistical Modelling: Proceedings of the 11th IWSM 1996, 87-94.
Aitkin, M., Francis, B. and Hinde, J. (2009). Statistical Modelling in R. Oxford Statistical Science Series, Oxford, UK.
Hinde, J. (1982). Compound Poisson regression models. Lecture Notes in Statistics 14, 109-121.
# Toxoplasmosis data: data(rainfall) rainfall$x<-rainfall$Rain/1000 toxo.0.3x<- alldist(cbind(Cases,Total-Cases)~1, random=~x, data=rainfall, k=3, family=binomial(link=logit)) toxo.1.3x<- alldist(cbind(Cases,Total-Cases)~x, random=~x, data=rainfall, k=3, family=binomial(link=logit)) predict(toxo.0.3x, type="response", newdata=data.frame(x=2)) # [1] 0.4608 predict(toxo.1.3x, type="response", newdata=data.frame(x=2)) # [1] 0.4608 # gives the same result, as both models are equivalent and only differ # by a parameter transformation. # Fabric faults data: data(fabric) names(fabric) # [1] "leng" "y" "x" faults.g2<- alldist(y ~ x, family=poisson(link=log), random=~1, data= fabric,k=2, random.distribution="gq") predict(faults.g2, type="response",newdata=fabric[1:6,]) # [1] 8.715805 10.354556 13.341242 5.856821 11.407828 13.938013 # is not the same as predict(faults.g2, type="response")[1:6] # [1] 6.557786 7.046213 17.020242 7.288989 13.992591 9.533823 # since in the first case prediction is done using the analytical # mean of the marginal distribution, and in the second case using the # individual posterior probabilities in an empirical Bayes approach.
# Toxoplasmosis data: data(rainfall) rainfall$x<-rainfall$Rain/1000 toxo.0.3x<- alldist(cbind(Cases,Total-Cases)~1, random=~x, data=rainfall, k=3, family=binomial(link=logit)) toxo.1.3x<- alldist(cbind(Cases,Total-Cases)~x, random=~x, data=rainfall, k=3, family=binomial(link=logit)) predict(toxo.0.3x, type="response", newdata=data.frame(x=2)) # [1] 0.4608 predict(toxo.1.3x, type="response", newdata=data.frame(x=2)) # [1] 0.4608 # gives the same result, as both models are equivalent and only differ # by a parameter transformation. # Fabric faults data: data(fabric) names(fabric) # [1] "leng" "y" "x" faults.g2<- alldist(y ~ x, family=poisson(link=log), random=~1, data= fabric,k=2, random.distribution="gq") predict(faults.g2, type="response",newdata=fabric[1:6,]) # [1] 8.715805 10.354556 13.341242 5.856821 11.407828 13.938013 # is not the same as predict(faults.g2, type="response")[1:6] # [1] 6.557786 7.046213 17.020242 7.288989 13.992591 9.533823 # since in the first case prediction is done using the analytical # mean of the marginal distribution, and in the second case using the # individual posterior probabilities in an empirical Bayes approach.
Toxoplasmosis data.
The rainfall
data frame has 34 rows and 3 columns.
data(rainfall)
data(rainfall)
This data frame contains the following columns:
mm of rain
cases of toxoplasmosis
total
Luca Scrucca; originally in R package forward
Atkinson, A.C. and Riani, M. (2000), Robust Diagnostic Regression Analysis, First Edition. New York: Springer, Table A.22
These functions are the summary
and print
methods for objects of type
glmmNPML
and glmmGQ
.
## S3 method for class 'glmmNPML' summary(object, digits = max(3, getOption("digits") - 3), ...) ## S3 method for class 'glmmGQ' summary(object, digits = max(3, getOption("digits") - 3), ...) ## S3 method for class 'glmmNPML' print(x, digits=max(3,getOption('digits')-3), ...) ## S3 method for class 'glmmGQ' print(x, digits=max(3,getOption('digits')-3), ...)
## S3 method for class 'glmmNPML' summary(object, digits = max(3, getOption("digits") - 3), ...) ## S3 method for class 'glmmGQ' summary(object, digits = max(3, getOption("digits") - 3), ...) ## S3 method for class 'glmmNPML' print(x, digits=max(3,getOption('digits')-3), ...) ## S3 method for class 'glmmGQ' print(x, digits=max(3,getOption('digits')-3), ...)
object |
a fitted object of class |
x |
a fitted object of class |
digits |
number of digits; applied on various displayed quantities. |
... |
further arguments, which will mostly be ignored. |
The summary...
- and print...
-functions invoke the generic
UseMethod(...)
function and detect the right model class
automatically. In other words, it is enough to write
summary(...)
or print(...)
.
Prints regression output or summary on screen.
Objects returned by summary.glmmNPML
or summary.glmmGQ
are
essentially identical to objects of class glmmNPML
or glmmGQ
.
However, their $coef
component contains the parameter standard errors
and t values (taken from the GLM fitted in the last EM iteration), and they
have two additional components $dispersion
and $lastglmsumm
providing the estimated dispersion parameter and a summary of the glm
fitted in the last EM iteration.
Please note that the provided parameter standard errors tend to be underestimated as the uncertainty due to the EM algorithm is not incorporated into them. According to Aitkin et al (2009), Section 7.5, page 440, more accurate standard errors can be obtained by dividing the (absolute value of the) parameter estimate through the square root of the change in disparity when omitting/not omitting the variable from the model.
originally from Ross Darnell (2002), modified and prepared for publication by Jochen Einbeck and John Hinde (2007).
Aitkin, M., Francis, B. and Hinde, J. (2009). Statistical Modelling in R. Oxford Statistical Science Series, Oxford, UK.
alldist
, allvc
, summary
,
print
, family.glmmNPML
Performs a grid search to select the parameter tol
,
which is a tuning parameter for starting point selection of the EM algorithm
for NPML estimation (see e.g. Aitkin, Hinde & Francis, 2009, p. 437)
tolfind(formula, random = ~1, family = gaussian(), data, k = 4, random.distribution="np", offset, weights, na.action, EMmaxit = 500, EMdev.change = 0.001, lambda = 0, damp = TRUE, damp.power = 1, spike.protect = 1, sdev, shape, plot.opt = 1, steps = 15, find.in.range = c(0.05, 0.8), verbose = FALSE, noformat = FALSE, ...)
tolfind(formula, random = ~1, family = gaussian(), data, k = 4, random.distribution="np", offset, weights, na.action, EMmaxit = 500, EMdev.change = 0.001, lambda = 0, damp = TRUE, damp.power = 1, spike.protect = 1, sdev, shape, plot.opt = 1, steps = 15, find.in.range = c(0.05, 0.8), verbose = FALSE, noformat = FALSE, ...)
formula |
a formula defining the response and the fixed effects (e.g. |
random |
a formula defining the random model. Set |
family |
conditional distribution of responses: "gaussian", "poisson", "binomial", "Gamma", or "inverse.gaussian" can be set. |
data |
the data frame (mandatory, even if it is attached to the workspace!). |
k |
the number of mass points/integration points (supported are up to 600 mass points). |
random.distribution |
the mixing distribution, Gaussian Quadrature
( |
offset |
an optional offset to be included in the model. |
weights |
optional prior weights for the data. |
na.action |
a function indicating what should happen when |
EMmaxit |
maximum number of EM iterations. |
EMdev.change |
stops EM algorithm when deviance change falls below this value. |
lambda |
see the help file for |
damp |
switches EM damping on or off. |
damp.power |
steers degree of damping. |
spike.protect |
see the help file for |
sdev |
optional fixed standard deviation for normal mixture. |
shape |
optional fixed shape parameter for Gamma and IG mixtures. |
plot.opt |
For |
steps |
number of grid points for the search of |
find.in.range |
range for the search of |
verbose |
If set to |
noformat |
If |
... |
further arguments which will be ignored. |
The EM algorithm for NPML estimation (Aitkin, 1996) uses the Gauss-Hermite masses
and mass points as starting points. The position of the starting points can be
concentrated or extended by setting tol
smaller or larger than 1,
respectively. The tuning parameter tol
is, as in GLIM4, responsible for this scaling.
A careful selection of tol
may be necessary for some data sets.
The reason is that NPML has a tendency to get stuck in local maxima,
as the log-likelihhod function is not concave for fixed k
(Boehning, 1999).
For Gaussian, Gamma, and IG mixtures this R implementation uses by default a damping
procedure in the first cycles of the EM algorithm (Einbeck & Hinde, 2006),
which stabilizes the algorithm and makes it less sensitive to the optimal choice
of tol
. Application of tolfind
to check that the optimal solution has
not been overlooked may nevertheless be advisable.
tolfind
works for alldist
and allvc
. The tolfind
function
is mainly designed for NPML (random.distribution="np"
). It
can also be applied to Gaussian Quadrature (random.distribution="gq"
),
though tol
is of little importance for this and primarily influences
the speed of convergence.
A list of 5 items:
MinDisparity |
the minimal disparity achieved (for which EM converged). |
Mintol |
the |
AllDisparities |
a vector containing all disparities calculated on the grid. |
Alltol |
all corresponding |
AllEMconverged |
a vector of Booleans indicating
if EM converged for the particular |
Jochen Einbeck & John Hinde (2006).
Aitkin, M. (1996). A general maximum likelihood analysis of overdispersion in generalized linear models. Statistics and Computing 6 , 251-262.
Aitkin, M., Francis, B. and Hinde, J. (2009). Statistical Modelling in R. Oxford Statistical Science Series, Oxford, UK.
Boehning, D. (1999). Computer-Assisted Analysis of Mixtures and Applications. Meta-Analysis, Disease Mapping and others. Chapman & Hall / CRC, Boca Raton, FL, USA.
Einbeck, J. & Hinde, J. (2006). A note on NPML estimation for exponential family regression models with unspecified dispersion parameter. Austrian Journal of Statistics 35, 233-243.
data(galaxies, package="MASS") gal<-as.data.frame(galaxies) tolfind(galaxies/1000~1, random=~1, k=5, data=gal, lambda=1, damp=TRUE, find.in.range=c(0,1), steps=10) # Minimal Disparity: 380.1444 at tol= 0.5
data(galaxies, package="MASS") gal<-as.data.frame(galaxies) tolfind(galaxies/1000~1, random=~1, k=5, data=gal, lambda=1, damp=TRUE, find.in.range=c(0,1), steps=10) # Minimal Disparity: 380.1444 at tol= 0.5
These are not to be called by the user.
weightslogl.calc.w(p, fjk, weights) expand(x, k) expand.vc(x, ni) binomial.expand(Y, k, w)
weightslogl.calc.w(p, fjk, weights) expand(x, k) expand.vc(x, ni) binomial.expand(Y, k, w)
p |
... |
fjk |
... |
weights |
... |
x |
... |
k |
... |
ni |
... |
Y |
... |
w |
... |
Ross Darnell and Jochen Einbeck.