Title: | Aster Models |
---|---|
Description: | Aster models (Geyer, Wagenius, and Shaw, 2007, <doi:10.1093/biomet/asm030>; Shaw, Geyer, Wagenius, Hangelbroek, and Etterson, 2008, <doi:10.1086/588063>; Geyer, Ridley, Latta, Etterson, and Shaw, 2013, <doi:10.1214/13-AOAS653>) are exponential family regression models for life history analysis. They are like generalized linear models except that elements of the response vector can have different families (e. g., some Bernoulli, some Poisson, some zero-truncated Poisson, some normal) and can be dependent, the dependence indicated by a graphical structure. Discrete time survival analysis, life table analysis, zero-inflated Poisson regression, and generalized linear models that are exponential family (e. g., logistic regression and Poisson regression with log link) are special cases. Main use is for data in which there is survival over discrete time periods and there is additional data about what happens conditional on survival (e. g., number of offspring). Uses the exponential family canonical parameterization (aster transform of usual parameterization). There are also random effects versions of these models. |
Authors: | Charles J. Geyer <[email protected]> |
Maintainer: | Charles J. Geyer <[email protected]> |
License: | MIT + file LICENSE |
Version: | 1.1-3 |
Built: | 2024-12-09 06:33:27 UTC |
Source: | CRAN |
Compute an analysis of deviance table for two or more aster model fits with or without random effects.
## S3 method for class 'asterOrReaster'
anova(object, ..., tolerance = .Machine$double.eps ^ 0.75)
anovaAsterOrReasterList(objectlist, tolerance = .Machine$double.eps ^ 0.75)
object , ...
|
objects of class "asterOrReaster", results of calls to aster or reaster. |
objectlist |
list of objects of class "asterOrReaster". |
tolerance |
tolerance for comparing nesting of model matrices. |
Constructs a table having a row for the
degrees of freedom and deviance for each model.
For all but the first model, the change in degrees of freedom and
deviance is also given, as is the corresponding asymptotic P-value.
For objects of class "reaster"
, the
quantity called deviance is only approximate. See references on
help for reaster
.
When objects of class "reaster"
are among those supplied,
degrees of freedom for fixed effects and degrees of freedom for
variance components are reported separately, because tests for fixed
effects are effectively two-tailed and tests for variance components
are effectively one-tailed.
In case models being compared differ by one variance component, the reference distribution is half a chi-square with the fixed effect degrees of freedom (difference of the number of fixed effects in the two models) and half a chi-square with one more degree of freedom.
In case models being compared differ by two or more variance components, we do not know how to do the test. The reference distribution is a mixture of chi-squares, but the mixing weights are difficult to calculate. An error is given in this case.
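For a test that adds one variance component, the P-value from this fifty-fifty mixture can also be computed by hand. The following is a hedged sketch (not package code); dev.diff and df.fixed are hypothetical values used only for illustration.

dev.diff <- 5.7   # twice the difference in (approximate) log likelihoods
df.fixed <- 2     # difference in number of fixed effects between the models
pval <- 0.5 * pchisq(dev.diff, df = df.fixed, lower.tail = FALSE) +
    0.5 * pchisq(dev.diff, df = df.fixed + 1, lower.tail = FALSE)
pval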
An object of class "anova"
inheriting from class "data.frame"
.
The comparison between two or more models by anova
or
anovaAsterOrReasterList
will only be valid if they
(1) are fitted to the same dataset,
(2) are nested, and
(3) have the same
dependence graph and exponential families.
Some of this is currently checked. Some warnings are given.
### see package vignette for explanation
library(aster)
data(echinacea)
vars <- c("ld02", "ld03", "ld04", "fl02", "fl03", "fl04",
    "hdct02", "hdct03", "hdct04")
redata <- reshape(echinacea, varying = list(vars), direction = "long",
    timevar = "varb", times = as.factor(vars), v.names = "resp")
redata <- data.frame(redata, root = 1)
pred <- c(0, 1, 2, 1, 2, 3, 4, 5, 6)
fam <- c(1, 1, 1, 1, 1, 1, 3, 3, 3)
hdct <- grepl("hdct", as.character(redata$varb))
redata <- data.frame(redata, hdct = as.integer(hdct))
level <- gsub("[0-9]", "", as.character(redata$varb))
redata <- data.frame(redata, level = as.factor(level))
aout1 <- aster(resp ~ varb + hdct : (nsloc + ewloc + pop),
    pred, fam, varb, id, root, data = redata)
aout2 <- aster(resp ~ varb + level : (nsloc + ewloc) + hdct : pop,
    pred, fam, varb, id, root, data = redata)
aout3 <- aster(resp ~ varb + level : (nsloc + ewloc + pop),
    pred, fam, varb, id, root, data = redata)
anova(aout1, aout2, aout3)
# now random effects models and models without random effects mixed
## Not run: 
### CRAN policy says examples must take < 5 sec.
### This doesn't (on their computers).
data(radish)
pred <- c(0,1,2)
fam <- c(1,3,2)
rout2 <- reaster(resp ~ varb + fit : (Site * Region),
    list(block = ~ 0 + fit : Block, pop = ~ 0 + fit : Pop),
    pred, fam, varb, id, root, data = radish)
rout1 <- reaster(resp ~ varb + fit : (Site * Region),
    list(block = ~ 0 + fit : Block),
    pred, fam, varb, id, root, data = radish)
rout0 <- aster(resp ~ varb + fit : (Site * Region),
    pred, fam, varb, id, root, data = radish)
anova(rout0, rout1, rout2)
## End(Not run)
Data on life history traits for the brown ambrosia aphid Uroleucon rudbeckiae
data(aphid)
A data frame with records for 18 insects. Data are already in “long” format; no need to reshape.
Response vector.
Categorical. Gives node of graphical model corresponding
to each component of resp
. See details below.
All ones. Root variables for graphical model.
Categorical. Indicates individual insects.
The levels of varb
indicate nodes of the graphical model to which
the corresponding elements of the response vector resp
belong.
This is the typical “long” format produced by the R reshape
function. For each individual, there are several response variables.
All response variables are combined in one vector resp
.
The variable varb
indicates which “original” variable
the number was for. The variable id
indicates which individual
the number was for. The levels of varb
, which are the names
of the “original” variables are the following.
S1
through S13
are Bernoulli: one if alive, zero if dead.
B2
through B9
are conditionally Poisson: the number of
offspring in the corresponding time period. Some variables in the
original data that were zero have been deleted.
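A quick way to see the long-format structure described above (a hedged illustration using only the shipped data and base R):

library(aster)
data(aphid)
levels(aphid$varb)        # node (original variable) names, S1 ... S13 and B2 ... B9
length(unique(aphid$id))  # number of individuals (18 insects)
nrow(aphid)               # one record per individual per node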
These data were published in the following, where they were analyzed by non-aster methods.
Lenski, R. E. and Service, P. M. (1982). The statistical analysis of population growth rates calculated from schedules of survivorship and fecundity. Ecology, 63, 655-662. doi:10.2307/1936785.
These data are reanalyzed by aster methods in the following.
Shaw, R. G., Geyer, C. J., Wagenius, S., Hangelbroek, H. H., and Etterson, J. R. (2008) Unifying life history analyses for inference of fitness and population growth. American Naturalist, 172, E35-E47. doi:10.1086/588063.
data(aphid)
### wide version
aphidw <- reshape(aphid, direction = "wide", timevar = "varb",
    v.names = "resp", varying = list(levels(aphid$varb)))
Fits Aster Models.
aster(x, ...)

## Default S3 method:
aster(x, root, pred, fam, modmat, parm,
    type = c("unconditional", "conditional"), famlist = fam.default(),
    origin, origin.type = c("model.type", "unconditional", "conditional"),
    method = c("trust", "nlm", "CG", "L-BFGS-B"), fscale, maxiter = 1000,
    nowarn = TRUE, newton = TRUE, optout = FALSE, coef.names, ...)

## S3 method for class 'formula'
aster(formula, pred, fam, varvar, idvar, root, data, parm,
    type = c("unconditional", "conditional"), famlist = fam.default(),
    origin, origin.type = c("model.type", "unconditional", "conditional"),
    method = c("trust", "nlm", "CG", "L-BFGS-B"), fscale, maxiter = 1000,
    nowarn = TRUE, newton = TRUE, optout = FALSE, ...)
x |
an
|
root |
an object of the same shape as |
pred |
an integer vector of length |
fam |
an integer vector of length |
modmat |
an
|
parm |
usually missing. Otherwise a vector of length |
type |
type of model. The value of this argument can be abbreviated. |
famlist |
a list of family specifications (see |
origin |
Distinguished point in parameter space. May be missing,
in which case an unspecified default is provided. See details below
for further explanation. This is what |
origin.type |
Parameter space in which specified distinguished point
is located. If |
method |
optimization method. If |
fscale |
an estimate of the size of the log likelihood at the maximum.
Defaults to |
maxiter |
maximum number of iterations. Defaults to '1000'. |
nowarn |
if |
newton |
if |
optout |
if |
coef.names |
names of the regression coefficients. If missing,
|
... |
other arguments passed to the optimization method. |
formula |
a symbolic description of the model to be fit. See
|
varvar |
a variable of the same length as the response in
the formula that is a factor whose levels are character strings
treated as variable names. The number of variable names is |
idvar |
a variable of the same length as the response in
the formula that indexes individuals. The number
of individuals is |
data |
an optional data frame containing the variables
in the model. If not found in |
The vector pred must satisfy all(pred < seq(along = pred)), that is, each predecessor node must come earlier in the ordering than the nodes of which it is the predecessor. The vector pred defines the predecessor function p, where p(j) = pred[j] is the predecessor of node j (zero meaning the root).

The joint distribution of the data matrix x is a product of conditionals

    prod[i, j] pr( x[i, j] | x[i, p(j)] )

When p(j) == 0, the notation x[i, p(j)] means root[i, j]. Other elements of the matrix root are not used.

The conditional distribution pr( x[i, j] | x[i, p(j)] ) is the x[i, p(j)]-fold convolution of the j-th family in the vector fam, a one-parameter exponential family (i.e., the sum of x[i, p(j)] i.i.d. terms having this one-parameter exponential family distribution).
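For example (a hedged sketch with a hypothetical three-node graph, node 1 depending on the root, node 2 on node 1, and node 3 on node 2):

pred <- c(0, 1, 2)
all(pred < seq(along = pred))   # the validity condition stated above
p <- function(j) pred[j]        # the predecessor function
p(3)                            # predecessor of node 3 is node 2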
For type == "conditional" the canonical parameter vector theta is modeled in GLM fashion as

    theta = origin + modmat beta

where modmat is the model matrix and origin is the distinguished point. Since the "vector" theta is actually a matrix (individuals by nodes), the "matrix" modmat must correspondingly be a three-dimensional array (individuals by nodes by regression coefficients). So the model written out in full is

    theta[i, j] = origin[i, j] + sum over k of modmat[i, j, k] * beta[k]

This specifies the log likelihood.
For type == "unconditional" the canonical parameter vector phi for an unconditional model is modeled in GLM fashion as

    phi[i, j] = origin[i, j] + sum over k of modmat[i, j, k] * beta[k]

(where the notation is as above). The unconditional canonical parameters are then specified in terms of the conditional ones by

    phi[i, j] = theta[i, j] - sum over k in S(j) of psi[k](theta[i, k])

where S(j) denotes the set of successors of node j, the k such that p(k) == j, and psi[k] is the cumulant function for the k-th exponential family. This rather crazy looking formulation is an invertible change of parameter and makes phi the canonical parameter and x the canonical statistic of a full flat unconditional exponential family. Again, this specifies the log likelihood.
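As a small illustration of this change of parameter (a hedged sketch, not package code; predict.aster and astertransform do such conversions for fitted models), consider one individual on a two-node chain with node 1 Bernoulli and node 2 Poisson, so pred = c(0, 1), S(1) = {2}, and S(2) is empty:

psi2 <- function(theta) exp(theta)   # Poisson cumulant function
theta <- c(0.5, -1.0)                # hypothetical conditional canonical parameters
phi <- c(theta[1] - psi2(theta[2]), theta[2])
phi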
In versions of aster prior to version 0.6 there was no origin in the model specification, which is the same as specifying origin = 0 in the current specification. If the origin is in the column space of the model matrix, that is, if there exists a gamma such that

    origin[i, j] = sum over k of modmat[i, j, k] * gamma[k]

then there is no difference between the model specified with that origin and the one with origin zero. The maximum likelihood regression coefficients will be different, but the maximum likelihood estimates of all other parameters (conditional and unconditional, canonical and mean value) will be the same. This is the usual case and explains why "linear" models (with origin zero) as opposed to "affine" models (with general origin) are popular. In the unusual case where the origin is not in the column space of the design matrix, affine models are a generalization of linear models: the two are not equivalent, and their maximum likelihood estimates are not the same in any parameterization.
In order to use the R model formula mini-language we must flatten
the dimensionality, making the model matrix modmat
two-dimensional
(a true matrix). This must be done as if by
matrix(modmat, ncol = ncoef)
,
which imposes the requirements on varvar
and idvar
given in the arguments section: they must look like row(x)
and
col(x)
modulo relabeling.
Then x
and root
become one-dimensional, done as if by as.numeric(x)
and as.numeric(root)
.
The standard way to do this in R is to use the reshape
function on a data frame in which the columns of the x
matrix
are variables in the data frame. reshape
automatically puts
things in the right order and creates varvar
and idvar
.
This is shown in the examples section below and in the package vignette
titled "Aster Package Tutorial".
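A hedged sketch of the flattening just described, for a hypothetical 2-by-3 response matrix (2 individuals, 3 nodes); R flattens matrices in column-major order:

x <- matrix(1:6, nrow = 2, ncol = 3)
as.numeric(x)   # the flattened response vector
row(x)          # row (individual) index of each cell
col(x)          # column (node) index of each cell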
aster
returns an object of class inheriting from "aster"
.
aster.formula
returns an object of class "aster"
and
subclass "aster.formula"
.
The function summary
(i.e., summary.aster
) can
be used to obtain or print a summary of the results, the function
anova
(i.e., anova.aster
)
to produce an analysis of deviance table, and the function
predict
(i.e., predict.aster
)
to produce predicted values and standard errors.
An object of class "aster"
is a list containing at least the
following components:
coefficients |
a named vector of coefficients. |
rank |
the numeric rank of the fitted generalized linear model
part of the aster model (i.e., the rank of |
deviance |
up to a constant, minus twice the maximized log-likelihood. (Note the minus. This is somewhat counterintuitive, but cannot be changed for reasons of backward compatibility.) |
iter |
the number of iterations used by the optimization method. |
converged |
logical. Was the optimization algorithm judged to have converged? |
code |
integer. The convergence code returned by the optimization method. |
gradient |
The gradient vector of minus the log likelihood at the
fitted |
hessian |
The Hessian matrix of minus the log likelihood
(i.e., the observed Fisher information) at the
fitted |
fisher |
Expected Fisher information at the fitted |
optout |
The object returned by the optimization routine
( |
Calls to aster.formula
return a list also containing:
call |
the matched call. |
formula |
the formula supplied. |
terms |
the |
data |
the |
It is almost always wrong for aster model data to have NA
values.
Although theoretically possible for the R formula mini-language to do the
right thing for an aster model with NA
values in the data, usually
it does some wrong thing. Thus, since version 0.8-20, this function and
the reaster
function give errors when used with data having
NA
values. Users must remove all NA
values (or replace them
with what they should be, perhaps zero values) “by hand”.
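For example, a hedged sketch of the kind of “by hand” fix meant here (redata is a hypothetical long-format data frame like those in the examples; whether zero is the right replacement depends on the graph):

sum(is.na(redata$resp))               # how many NA response values?
redata$resp[is.na(redata$resp)] <- 0  # replace them with what they should be
                                      # (here, hypothetically, zero)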
Even if the result of this function
has component converged
equal to TRUE
, the result will be
nonsense if there are one or more directions of recession. These are
not detected by this function, but rather by the summary
function
applied to the result of this function,
for which see summary.aster
.
Geyer, C. J., Wagenius, S., and Shaw, R. G. (2007) Aster Models for Life History Analysis. Biometrika, 94, 415–426. doi:10.1093/biomet/asm030.
Shaw, R. G., Geyer, C. J., Wagenius, S., Hangelbroek, H. H., and Etterson, J. R. (2008) Unifying Life History Analysis for Inference of Fitness and population growth. American Naturalist, 172, E35–E47. doi:10.1086/588063.
anova.aster
,
summary.aster
,
and
predict.aster
### see package vignette for explanation
library(aster)
data(echinacea)
vars <- c("ld02", "ld03", "ld04", "fl02", "fl03", "fl04",
    "hdct02", "hdct03", "hdct04")
redata <- reshape(echinacea, varying = list(vars), direction = "long",
    timevar = "varb", times = as.factor(vars), v.names = "resp")
redata <- data.frame(redata, root = 1)
pred <- c(0, 1, 2, 1, 2, 3, 4, 5, 6)
fam <- c(1, 1, 1, 1, 1, 1, 3, 3, 3)
hdct <- grepl("hdct", as.character(redata$varb))
redata <- data.frame(redata, hdct = as.integer(hdct))
level <- gsub("[0-9]", "", as.character(redata$varb))
redata <- data.frame(redata, level = as.factor(level))
aout <- aster(resp ~ varb + level : (nsloc + ewloc) + hdct : pop,
    pred, fam, varb, id, root, data = redata)
summary(aout, show.graph = TRUE)
Transform between different parameterizations of the aster model.
In effect, this function is called inside predict.aster
.
Users generally do not need to call it directly.
astertransform(arg, obj, from = c("unconditional", "conditional"),
    to.cond = c("unconditional", "conditional"),
    to.mean = c("mean.value", "canonical"))
arg |
canonical parameter vector of length |
obj |
aster model object, the result of a call to |
from |
the type of canonical parameter which argument |
to.cond |
the type of parameter we want. |
to.mean |
the type of parameter we want. |
a vector of the same length as arg
, the transformed parameter vector.
Data on life history traits for the partridge pea Chamaecrista fasciculata
data(chamae)
A data frame with records for 2235 plants. Data are already in “long” format; no need to reshape.
Response vector.
Categorical. Gives node of graphical model corresponding
to each component of resp
. See details below.
All ones. Root variables for graphical model.
Categorical. Indicates individual plants.
Numerical. Reproductive stage. Integer with only 3 values in this dataset.
Numerical. Log leaf number.
Numerical. Log leaf thickness.
Categorical. Block within experiment.
The levels of varb
indicate nodes of the graphical model to which
the corresponding elements of the response vector resp
belong.
This is the typical “long” format produced by the R reshape
function. For each individual, there are several response variables.
All response variables are combined in one vector resp
.
The variable varb
indicates which “original” variable
the number was for. The variable id
indicates which individual
the number was for. The levels of varb
, which are the names
of the “original” variables are
Fecundity. Bernoulli; one if any fruit, zero if no fruit.
Integer. Number of fruits observed. Greater than or equal to 3 if nonzero.
Integer. Number of seeds observed from a random sample of 3 of the fruits for this individual.
Julie Etterson https://sites.google.com/d.umn.edu/dr-julie-r-etterson/home
These data are a subset of data previously analyzed by non-aster methods in the following.
Etterson, J. R. (2004). Evolutionary potential of Chamaecrista fasciculata in relation to climate change. I. Clinal patterns of selection along an environmental gradient in the Great Plains. Evolution, 58, 1446-1458. doi:10.1111/j.0014-3820.2004.tb01726.x.
Etterson, J. R., and Shaw, R. G. (2001). Constraint to adaptive evolution in response to global warming. Science, 294, 151-154. doi:10.1126/science.1063656.
These data are reanalyzed by aster methods in the following.
Shaw, R. G., Geyer, C. J., Wagenius, S., Hangelbroek, H. H., and Etterson, J. R. (2008) Unifying life history analyses for inference of fitness and population growth. American Naturalist, 172, E35-E47. doi:10.1086/588063.
data(chamae)
### wide version
chamaew <- reshape(chamae, direction = "wide", timevar = "varb",
    v.names = "resp", varying = list(levels(chamae$varb)))
Data on life history traits for the partridge pea Chamaecrista fasciculata
data(chamae2)
A data frame with records for 2239 plants. Data are already in “long” format; no need to reshape.
Response vector.
Categorical. Gives node of graphical model corresponding
to each component of resp
. See details below.
All ones. Root variables for graphical model.
Categorical. Indicates individual plants.
Numerical. Reproductive stage. Integer with only 3 values in this dataset.
Numerical. Log leaf number.
Numerical. Log leaf thickness.
Categorical. Block within experiment.
The levels of varb
indicate nodes of the graphical model to which
the corresponding elements of the response vector resp
belong.
This is the typical “long” format produced by the R reshape
function. For each individual, there are several response variables.
All response variables are combined in one vector resp
.
The variable varb
indicates which “original” variable
the number was for. The variable id
indicates which individual
the number was for. The levels of varb
, which are the names
of the “original” variables are
Fecundity. Bernoulli; one if any fruit, zero if no fruit.
Integer. Number of fruits observed.
Julie Etterson https://sites.google.com/d.umn.edu/dr-julie-r-etterson/home
These data are a subset of data previously analyzed by non-aster methods in the following.
Etterson, J. R. (2004). Evolutionary potential of Chamaecrista fasciculata in relation to climate change. I. Clinal patterns of selection along an environmental gradient in the Great Plains. Evolution, 58, 1446-1458. doi:10.1111/j.0014-3820.2004.tb01726.x.
Etterson, J. R., and Shaw, R. G. (2001). Constraint to adaptive evolution in response to global warming. Science, 294, 151-154. doi:10.1126/science.1063656.
These data are reanalyzed by aster methods in the following.
Shaw, R. G., Geyer, C. J., Wagenius, S., Hangelbroek, H. H., and Etterson, J. R. (2008) Unifying life history analyses for inference of fitness and population growth. American Naturalist, 172, E35-E47. doi:10.1086/588063.
data(chamae2)
### wide version
chamae2w <- reshape(chamae2, direction = "wide", timevar = "varb",
    v.names = "resp", varying = list(levels(chamae2$varb)))
Data on life history traits for the partridge pea Chamaecrista fasciculata
data(chamae3)
A data frame with records for 2239 plants. Data are already in “long” format; no need to reshape.
Response vector.
Categorical. Gives node of graphical model corresponding
to each component of resp
. See details below.
All ones. Root variables for graphical model.
Categorical. Indicates individual plants.
Zero-or-one-valued. Indicates “fitness” nodes of the graph.
Categorical. Sire.
Categorical. Dam.
Categorical. Experiment site.
Categorical. Population of sire and dam.
Numerical. Row. Position in site.
Categorical. Block within site.
The levels of varb
indicate nodes of the graphical model to which
the corresponding elements of the response vector resp
belong.
This is the typical “long” format produced by the R reshape
function. For each individual, there are several response variables.
All response variables are combined in one vector resp
.
The variable varb
indicates which “original” variable
the number was for. The variable id
indicates which individual
the number was for. The levels of varb
, which are the names
of the “original” variables are
Fecundity. Bernoulli; one if any fruit, zero if no fruit.
Integer. Number of fruits observed.
Julie Etterson https://sites.google.com/d.umn.edu/dr-julie-r-etterson/home
These data are a subset of data previously analyzed by non-aster methods in the following.
Etterson, J. R. (2004). Evolutionary potential of Chamaecrista fasciculata in relation to climate change. II. Genetic architecture of three populations reciprocally planted along an environmental gradient in the Great Plains. Evolution, 58, 1459–1471. doi:10.1111/j.0014-3820.2004.tb01727.x.
These data were reanalyzed by aster methods in the following.
Geyer, C. J., Ridley, C. E., Latta, R. G., Etterson, J. R., and Shaw, R. G. (2013) Local Adaptation and Genetic Effects on Fitness: Calculations for Exponential Family Models with Random Effects. Annals of Applied Statistics, 7, 1778–1795. doi:10.1214/13-AOAS653.
data(chamae3)
### wide version
## Not run: 
### CRAN policy says examples must take < 5 sec.  This doesn't.
foo <- chamae3
### delete fit because it makes no sense in wide version
foo$fit <- NULL
chamae3w <- reshape(foo, direction = "wide", timevar = "varb",
    v.names = "resp", varying = list(levels(chamae3$varb)))
## End(Not run)
Data on life history traits for the narrow-leaved purple coneflower Echinacea angustifolia
data(echin2)
A data frame with records for 557 plants observed over five years. Data are already in “long” format; no need to reshape.
Response vector.
Categorical. Gives node of graphical model corresponding
to each component of resp
. See details below.
All ones. Root variables for graphical model.
Categorical. Indicates individual plants.
Categorical. Position in growth chamber.
Categorical. Row in the field.
Numerical. Position within row in the field.
Categorical. See details.
Categorical. Year in which cross was done.
The levels of varb
indicate nodes of the graphical model to which
the corresponding elements of the response vector resp
belong.
This is the typical “long” format produced by the R reshape
function. For each individual, there are several response variables.
All response variables are combined in one vector resp
.
The variable varb
indicates which “original” variable
the number was for. The variable id
indicates which individual
the number was for. The levels of varb
, which are the names
of the “original” variables are
Survival for the first month in the growth chamber.
Ditto for 2nd month in the growth chamber.
Ditto for 3rd month in the growth chamber.
Survival for first year in the field.
Ditto for 2nd year in the field.
Ditto for 3rd year in the field.
Ditto for 4th year in the field.
Ditto for 5th year in the field.
Rosette count, measure of size and vigor, recorded for 3rd year in the field.
Ditto for 4th year in the field.
Ditto for 5th year in the field.
These data are complicated by the experiment being done in two parts.
Plants start their life indoors in a growth chamber. The predictor
variable flat
only makes sense during this time in which three
response variables lds1
, lds2
, and lds3
are observed.
After three months in the growth chamber, the plants (if they
survived, i. e., if lds3 == 1
) were planted in an experimental
field plot outdoors. The variables row
and posi
only make
sense during this time in which all of the rest of the response variables
are observed. Because of certain predictor variables only making sense
with respect to certain components of the response vector, the R formula
mini-language is unable to cope, and model matrices must be constructed
"by hand".
Echinacea angustifolia is native to North American tallgrass prairie,
which was once extensive but now exists only in isolated remnants.
To evaluate the effects of different mating regimes on the fitness of
resulting progeny, crosses were conducted to produce progeny of (a) mates
from different remnants, (b) mates chosen at random from the same remnant,
and (c) mates known to share maternal parent. These three categories are
the three levels of crosstype
.
Stuart Wagenius, https://www.chicagobotanic.org/research/staff/wagenius
These data are analyzed in the following.
Shaw, R. G., Geyer, C. J., Wagenius, S., Hangelbroek, H. H., and Etterson, J. R. (2008) Unifying life history analyses for inference of fitness and population growth. American Naturalist, 172, E35-E47. doi:10.1086/588063.
data(echin2)
Data on life history traits for the narrow-leaved purple coneflower Echinacea angustifolia
data(echinacea)
A data frame with records for 570 plants observed over three years.
Indicator of being alive in 2002.
Ditto for 2003.
Ditto for 2004.
Indicator of flowering 2002.
Ditto for 2003.
Ditto for 2004.
Count of number of flower heads in 2002.
Ditto for 2003.
Ditto for 2004.
the remnant population of origin of the plant
(all plants were grown together, pop
encodes ancestry).
east-west location in plot.
north-south location in plot.
Stuart Wagenius, https://www.chicagobotanic.org/research/staff/wagenius
These data are analyzed in the following.
Geyer, C. J., Wagenius, S., and Shaw, R. G. (2007) Aster Models for Life History Analysis. Biometrika, 94, 415–426. doi:10.1093/biomet/asm030.
library(aster)
data(echinacea)
vars <- c("ld02", "ld03", "ld04", "fl02", "fl03", "fl04",
    "hdct02", "hdct03", "hdct04")
redata <- reshape(echinacea, varying = list(vars), direction = "long",
    timevar = "varb", times = as.factor(vars), v.names = "resp")
names(echinacea)
dim(echinacea)
names(redata)
dim(redata)
Families (response models) known to the package.
These functions construct simple family specifications used
in specifying models for aster
and mlogl
.
They are mostly for convenience, since the specifications are easy
to construct by hand.
fam.bernoulli()
fam.poisson()
fam.truncated.poisson(truncation)
fam.negative.binomial(size)
fam.truncated.negative.binomial(size, truncation)
fam.normal.location(sd)
fam.default()
famfun(fam, deriv, theta)
truncation |
the truncation point, called |
size |
the sample size. May be non-integer. |
sd |
the standard deviation. May be non-integer. |
fam |
a family specification, which is a list of class |
deriv |
derivative wanted: 0, 1, or 2. |
theta |
value of the canonical parameter. |
Currently implemented families are
"bernoulli"
Bernoulli. The mean value parameter p is the success probability. The canonical parameter is theta = log(p) - log(1 - p), also called the logit of p. The first derivative of the cumulant function has the value p and the second derivative of the cumulant function has the value p * (1 - p).
"poisson"
Poisson. The mean value parameter mu is the mean of the Poisson distribution. The canonical parameter is theta = log(mu). The first and second derivatives of the cumulant function both have the value mu.
"truncated.poisson"
Poisson conditioned on being strictly greater than the truncation point k, specified by the argument truncation. Let mu be the mean of the corresponding untruncated Poisson distribution. Then the canonical parameters for both truncated and untruncated distributions are the same, theta = log(mu). Let Y be a Poisson random variable having the same mean parameter as this distribution. Then the mean value parameter and first derivative of the cumulant function of this distribution is E(Y | Y > k), the mean of the truncated distribution, and the second derivative of the cumulant function is var(Y | Y > k), its variance. Explicit formulas are given in the package vignette trunc.pdf.
"negative.binomial"
Negative binomial. The size parameter alpha may be noninteger, meaning the cumulant function is alpha times the cumulant function of the geometric distribution. The mean value parameter mu is the mean of the negative binomial distribution. The success probability parameter is p = alpha / (alpha + mu). The canonical parameter is theta = log(1 - p). Since 0 < p < 1, the canonical parameter space is restricted, the set of theta such that theta < 0. This is, however, a regular exponential family (the log likelihood goes to minus infinity as theta converges to the boundary of the parameter space), so the constraint theta < 0 plays no role in maximum likelihood estimation so long as the optimization software is not too stupid. There will be no problems so long as the default optimizer (trust) is used. Since zero is not in the canonical parameter space, a negative default origin is used. The first derivative of the cumulant function has the value mu = alpha * (1 - p) / p and the second derivative has the value alpha * (1 - p) / p^2.
"truncated.negative.binomial"
Negative binomial conditioned on being strictly greater than the truncation point k, specified by the argument truncation. Let p be the success probability parameter of the corresponding untruncated negative binomial distribution. Then the canonical parameters for both truncated and untruncated distributions are the same, theta = log(1 - p), and consequently the canonical parameter spaces are the same, the set of theta such that theta < 0, and both models are regular exponential families. Let Y be an untruncated negative binomial random variable having the same size and success probability parameters as this distribution. Then the mean value parameter and first derivative of the cumulant function of this distribution is E(Y | Y > k), the mean of the truncated distribution, and the second derivative is too complicated to write here (the formula can be found in the package vignette trunc.pdf).
"normal.location"
Normal, unknown mean, known variance. The sd (standard deviation) parameter sigma may be noninteger, meaning the cumulant function is sigma^2 times the cumulant function of the standard normal distribution. The mean value parameter mu is the mean of the normal distribution. The canonical parameter is theta = mu / sigma^2. The first derivative of the cumulant function has the value mu = sigma^2 * theta and the second derivative has the value sigma^2.
For all but fam.default
,
a list of class "astfam"
giving name and values of any
hyperparameters.
For fam.default
,
a list each element of which is of class "astfam"
.
The list of families which were hard coded in earlier versions of the
package.
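For instance, a small hedged check of the Bernoulli formulas stated above, along the same lines as the Poisson example below:

library(aster)
theta <- log(0.3) - log(0.7)        # logit of success probability 0.3
famfun(fam.bernoulli(), 1, theta)   # mean value parameter: should be 0.3
famfun(fam.bernoulli(), 2, theta)   # variance: should be 0.3 * 0.7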
### mean of poisson with mean 0.2
famfun(fam.poisson(), 1, log(0.2))
### variance of poisson with mean 0.2
famfun(fam.poisson(), 2, log(0.2))
### mean of poisson with mean 0.2 conditioned on being nonzero
famfun(fam.truncated.poisson(trunc = 0), 1, log(0.2))
### variance of poisson with mean 0.2 conditioned on being nonzero
famfun(fam.truncated.poisson(trunc = 0), 2, log(0.2))
Toy life history data created to exhibit the phenomenon of directions of recession. It was analyzed in a special topics course on aster models Fall 2018.
data(foobar)
data(foobar)
loads four R objects.
a vector of family indices.
a vector of predecessor indices.
a vector of variable names associated with the nodes of the aster graph.
a data frame having data on 300 individuals, each of
which has length(vars) == 4
components of fitness,
so the aster graph for one individual has four nodes. This data
frame is already in long format; no need to reshape
.
The variables in this data frame are:
the response vector.
Categorical. Gives node of graphical model corresponding
to each component of resp
. See details below.
All ones. Root variables for graphical model.
Categorical. Individual ID numbers.
Categorical. Treatment.
Categorical. Block.
Bernoulli (zero-or-one-valued). Indicator variable of the fitness nodes of the graph; in these data there is just one node for fitness.
The levels of varb
indicate nodes of the graphical model to which
the corresponding elements of the response vector resp
belong.
This is the typical “long” format produced by the R reshape
function. For each individual, there are several response variables.
All response variables are combined in one vector resp
.
The variable varb
indicates which “original” variable
the corresponding component of the response vector was for.
The variable id
indicates which individual the corresponding
component of the response vector was for.
Charles J. Geyer http://www.stat.umn.edu/geyer/8931aster/foobar.rda
These data are analyzed in deck 9 of the slides for a special topics course on aster models taught fall semester 2018 (http://www.stat.umn.edu/geyer/8931aster/slides/s9.pdf).
data(foobar)
library(aster)
aout <- aster(resp ~ varb + fit : (trt * blk),
    pred, fam, varb, id, root, data = redata)
foo <- try(summary(aout))
# gives an error about directions of recession
# get directions of recession
dor <- attr(foo, "condition")$dor
dor
# found one apparent direction of recession
# from regular pattern
# it looks like a true direction of recession
dor <- dor / max(abs(dor))
dor
# but what does it do?  For that need to map to saturated model
# parameter space
modmat <- aout$modmat
dim(modmat)
# oof!  modmat is three-dimensional.  Need an actual matrix here.
modmat <- matrix(as.vector(modmat), ncol = length(dor))
dor.phi <- drop(modmat %*% dor)
names(dor.phi) <- with(redata, paste(id, as.character(varb), sep = "."))
dor.phi[dor.phi != 0]
fam.default()[fam[vars == "seeds"]]
# since the support of the Poisson distribution is not bounded above,
# actually this must be minus the DOR (if it is a DOR at all).
# check that all components of response vector for which dor.phi == 1 are zero
# (lower bound of Poisson range)
all(redata$resp[dor.phi == 1] == 0)
# so minus dor.phi is a true direction of recession in the saturated model
# canonical parameter space, and minus dor is a true direction of recession
# in the submodel canonical parameter space
#
# try to get more info
trt.blk <- with(redata, paste(as.character(trt), as.character(blk), sep = "."))
unique(trt.blk[dor.phi == 1])
# the reason for the direction of recession is that every individual getting
# treatment a in block A had zero seeds.
#
# the reason the submodel DOR, R object dor, was so hard to interpret was
# because fit:trta:blkA is not in the model.  So let's force it in
redata <- transform(redata, trt = relevel(trt, ref = "b"),
    blk = relevel(blk, ref = "B"))
# Note: following code is copied exactly from above.  Only difference
# is releveling in the immediately preceding statement
aout <- aster(resp ~ varb + fit : (trt * blk),
    pred, fam, varb, id, root, data = redata)
foo <- try(summary(aout))
dor <- attr(foo, "condition")$dor
dor <- dor / max(abs(dor))
dor
# now it is obvious from looking at this dor that all individuals in trt a
# and blk A are at the lower end (zero) of the Poisson range.
# maybe the other dor we had previously would be "obvious" to someone
# sufficiently skilled in understanding the meaning of regression coefficients
# but not "obvious" to your humble author
#
# as for what to do about this, see the course slides cited in the reference
# section.  There is no single Right Thing to do.
Minus the Log Likelihood for an Aster model, and its first and second
derivative. This function is called inside R function aster
.
Users generally do not need to call it directly.
mlogl(parm, pred, fam, x, root, modmat, deriv = 0,
    type = c("unconditional", "conditional"), famlist = fam.default(),
    origin, origin.type = c("model.type", "unconditional", "conditional"))
parm |
parameter value (vector of regression coefficients)
where we evaluate the log likelihood, etc.
We also refer to |
pred |
integer vector determining the graph.
|
fam |
an integer vector of length |
x |
the response. If a matrix, rows are individuals, and columns are
variables (nodes of graphical model). So |
root |
A matrix or vector like |
modmat |
a three-dimensional array, |
deriv |
derivatives wanted: 0, 1, or 2. |
type |
type of model. The value of this argument can be abbreviated. |
famlist |
a list of family specifications (see |
origin |
Distinguished point in parameter space. May be missing,
in which case an unspecified default is provided. See |
origin.type |
Parameter space in which specified distinguished point
is located. If |
a list containing some of the following components:
value |
minus the log likelihood. |
gradient |
minus the first derivative vector of the log likelihood (minus the score). |
hessian |
minus the second derivative matrix of the log likelihood (observed Fisher information). |
Evaluates the objective function for approximate maximum likelihood for an aster model with random effects. Uses Laplace approximation to integrate out the random effects analytically. The “quasi” in the title is a misnomer in the context of aster models but the acronym PQL for this procedure is well-established in the generalized linear mixed models literature.
newpickle(alphaceesigma, fixed, random, obj, y, origin, zwz, deriv = 0)
alphaceesigma |
the parameter value where the function is evaluated, a numeric vector, see details. |
fixed |
the model matrix for fixed effects. The number of rows
is |
random |
the model matrix or matrices for random effects.
The number of rows is |
obj |
aster model object, the result of a call to |
y |
response vector. May be omitted, in which case |
origin |
origin of aster model. May be omitted, in which case
default origin (see |
zwz |
A possible value of |
deriv |
Number of derivatives wanted, either zero or one.
Must be zero if |
Define

    p(alpha, c, sigma) = m(a + M alpha + Z A c) + (1/2) c^T c
        + (1/2) log det[A Z^T W Z A + I]

where m is minus the log likelihood function of a saturated aster model, where W is the Hessian matrix of m, where a is a known vector (the offset vector in the terminology of glm but the origin in the terminology of aster), where M is a known matrix, the model matrix for fixed effects (the argument fixed of this function), Z is a known matrix, the model matrix for random effects (either the argument random of this function if it is a matrix or Reduce(cbind, random) if random is a list of matrices), where A is a diagonal matrix whose diagonal is the vector rep(sigma, times = nrand) where nrand is sapply(random, ncol) when random is a list of matrices and ncol(random) when random is a matrix, and where I is the identity matrix.

This function evaluates p(alpha, c, sigma) when zwz is missing. Otherwise it evaluates the same thing except that Z^T W Z is replaced by zwz.

Note that W is a function of alpha, c, and sigma although the notation does not explicitly indicate this.
a list with components value
and gradient
,
the latter missing if deriv == 0
.
Not intended for use by naive users. Use reaster
.
Actually no longer used by other functions in this package.
data(radish)
pred <- c(0,1,2)
fam <- c(1,3,2)
### need object of type aster to supply to penmlogl and pickle
aout <- aster(resp ~ varb + fit : (Site * Region + Block + Pop),
    pred, fam, varb, id, root, data = radish)
### model matrices for fixed and random effects
modmat.fix <- model.matrix(resp ~ varb + fit : (Site * Region),
    data = radish)
modmat.blk <- model.matrix(resp ~ 0 + fit:Block, data = radish)
modmat.pop <- model.matrix(resp ~ 0 + fit:Pop, data = radish)
rownames(modmat.fix) <- NULL
rownames(modmat.blk) <- NULL
rownames(modmat.pop) <- NULL
idrop <- match(aout$dropped, colnames(modmat.fix))
idrop <- idrop[! is.na(idrop)]
modmat.fix <- modmat.fix[ , - idrop]
nfix <- ncol(modmat.fix)
nblk <- ncol(modmat.blk)
npop <- ncol(modmat.pop)
alpha.start <- aout$coefficients[match(colnames(modmat.fix),
    names(aout$coefficients))]
cee.start <- rep(0, nblk + npop)
sigma.start <- rep(1, 2)
alphaceesigma.start <- c(alpha.start, cee.start, sigma.start)
foo <- newpickle(alphaceesigma.start, fixed = modmat.fix,
    random = list(modmat.blk, modmat.pop), obj = aout)
Data on life history traits for the invasive California wild oat Avena barbata
data(oats)
A data frame with records for 821 plants. Data are already in “long” format; no need to reshape.
Response vector.
Categorical. Gives node of graphical model corresponding
to each component of resp
. See details below.
All ones. Root variables for graphical model.
Categorical. Indicates individual plants.
Categorical. Another indicator of individual plants.
Categorical. Environment in which plant was grown, a combination of experimental site and year.
Categorical. Ecotype of plant: mesic (M) or xeric (X).
Categorical. Accession, nested within ecotype.
Categorical. Experiment site. Two sites in these data.
Categorical. Year in which data were collected. Four years in these data.
Indicator (zero or one). Shorthand
for as.numeric(oats$varb == "Spike")
. So-called because the
components of outcome
indicated are the best surrogate of
Darwinian fitness in these data.
The levels of varb
indicate nodes of the graphical model to which
the corresponding elements of the response vector resp
belong.
This is the typical “long” format produced by the R reshape
function. For each individual, there are several response variables.
All response variables are combined in one vector resp
.
The variable varb
indicates which “original” variable
the number was for. The variable id
indicates which individual
the number was for. The levels of varb
, which are the names
of the “original” variables are
Indicator (zero or one). Bernoulli, One if individual survived to produce flowers.
Integer. Zero-truncated Poisson, number of spikelets (compound floral structures) observed.
The graphical model for each individual is the chain: root → Surv → Spike (survival is Bernoulli, and the spikelet count is zero-truncated Poisson conditional on survival).
Robert Latta https://www.dal.ca/faculty/science/biology/faculty-staff/our-faculty/robert-latta.html
These data are a subset of data previously analyzed using non-aster methods in the following.
Latta, R. G. (2009). Testing for local adaptation in Avena barbata, a classic example of ecotypic divergence. Molecular Ecology, 18, 3781–3791. doi:10.1111/j.1365-294X.2009.04302.x.
These data are reanalyzed by aster methods in the following.
Geyer, C. J., Ridley, C. E., Latta, R. G., Etterson, J. R., and Shaw, R. G. (2013) Local Adaptation and Genetic Effects on Fitness: Calculations for Exponential Family Models with Random Effects. Annals of Applied Statistics, 7, 1778–1795. doi:10.1214/13-AOAS653.
data(oats)
Penalized minus log likelihood for an aster model, and its first and second
derivative. The penalization allows for (approximate) random effects.
These functions are called inside pickle
,
pickle1
, pickle2
, pickle3
,
and reaster
.
penmlogl(parm, sigma, fixed, random, obj, y, origin, deriv = 2)
penmlogl2(parm, alpha, sigma, fixed, random, obj, y, origin)
parm |
for |
alpha |
the vector of fixed effects. For |
sigma |
vector of square roots of variance components, one component for each group of random effects. |
fixed |
the model matrix for fixed effects. The number of rows
is |
random |
the model matrix or matrices for random effects.
Each has the same number of rows as |
obj |
aster model object, the result of a call to |
y |
response vector. May be omitted, in which case |
origin |
origin of aster model. May be omitted, in which case
default origin (see |
deriv |
number of derivatives wanted. Allowed values are 0, 1, or 2. |
Consider an aster model with random effects and canonical parameter vector of the form

    M alpha + Z[1] b[1] + ... + Z[k] b[k]

where M and each Z[j] are known matrices having the same row dimension, where alpha is a vector of unknown parameters (the fixed effects) and each b[j] is a vector of random effects that are supposed to be (marginally) independent and identically distributed mean-zero normal with variance sigma[j]^2.

These functions evaluate minus the “penalized log likelihood” for this model, which considers the random effects as parameters but adds a penalization term

    (1/2) sum over j of b[j]^T b[j] / sigma[j]^2

to minus the log likelihood.

To properly deal with random effects that are zero, random effects are rescaled by their standard deviation. The rescaled random effects are c[j] = b[j] / sigma[j]. If sigma[j] == 0, then the corresponding rescaled random effects c[j] are also zero.
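For instance, a hedged sketch (not package code) of this penalty and the rescaling, for hypothetical random-effect vectors b1 and b2 with standard deviations sigma[1] and sigma[2]:

b1 <- c(0.3, -0.1)
b2 <- c(0.2, 0.0, -0.4)
sigma <- c(0.5, 1.2)
penalty <- 0.5 * (sum(b1^2) / sigma[1]^2 + sum(b2^2) / sigma[2]^2)
cee <- c(b1 / sigma[1], b2 / sigma[2])   # the rescaled random effects
all.equal(penalty, 0.5 * sum(cee^2))     # same penalty in terms of rescaled effects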
a list containing some of the following components:
value |
minus the penalized log likelihood. |
gradient |
minus the first derivative vector of the penalized log likelihood. |
hessian |
minus the second derivative matrix of the penalized log likelihood. |
argument |
the value of the |
scale |
the vector by which |
mlogl.gradient |
gradient for evaluation of log likelihood;
|
mlogl.hessian |
hessian for evaluation of log likelihood;
|
Not intended for use by naive users. Use reaster
,
which calls them.
For an example using this function see the example
for pickle
.
Evaluates an approximation to minus the log likelihood for an aster model with random effects. Uses Laplace approximation to integrate out the random effects analytically. The “quasi” in the title is a misnomer in the context of aster models but the acronym PQL for this procedure is well-established in the generalized linear mixed models literature.
pickle(sigma, parm, fixed, random, obj, y, origin, cache, ...)
makezwz(sigma, parm, fixed, random, obj, y, origin)
pickle1(sigma, parm, fixed, random, obj, y, origin, cache, zwz, deriv = 0, ...)
pickle2(alphasigma, parm, fixed, random, obj, y, origin, cache, zwz,
    deriv = 0, ...)
pickle3(alphaceesigma, fixed, random, obj, y, origin, zwz, deriv = 0)
sigma |
vector of square roots of variance components,
one component for each group of random effects. Negative values are
allowed; the vector of variance components is |
parm |
starting value for inner optimization. Ignored if
|
alphasigma |
the concatenation of the vector of fixed effects and the vector of square roots of variance components. |
alphaceesigma |
the concatenation of the vector of fixed effects, the vector of rescaled random effects, and the vector of square roots of variance components. |
fixed |
the model matrix for fixed effects. The number of rows
is |
random |
the model matrix or matrices for random effects.
The number of rows is |
obj |
aster model object, the result of a call to |
y |
response vector. May be omitted, in which case |
origin |
origin of aster model. May be omitted, in which case
default origin (see |
cache |
If not missing, an environment in which to cache the value
of |
zwz |
A possible value of |
deriv |
Number of derivatives wanted. For |
... |
additional arguments passed to |
Define

    p(alpha, c, sigma) = m(a + M alpha + Z A c) + (1/2) c^T c
        + (1/2) log det[A Z^T W Z A + I]

where m is minus the log likelihood function of a saturated aster model, a is a known vector (the offset vector in the terminology of glm but the origin in the terminology of aster), M is a known matrix, the model matrix for fixed effects (the argument fixed of these functions), Z is a known matrix, the model matrix for random effects (either the argument random of these functions if it is a matrix or Reduce(cbind, random) if random is a list of matrices), A is a diagonal matrix whose diagonal is the vector rep(sigma, times = nrand) where nrand is sapply(random, ncol) when random is a list of matrices and ncol(random) when random is a matrix, W is any symmetric positive semidefinite matrix (more on this below), and I is the identity matrix. Note that p depends on W although the notation does not explicitly indicate this.
Let denote the minimizer of
considered as a function of
for fixed
and
,
and let
and
denote the (joint) minimizers of
considered as
a function of
and
for fixed
.
Note that
is a function of
and
although the notation does not explicitly
indicate this.
Note that
and
are functions of
(only) although the notation
does not explicitly indicate this.
Now define
and
Then pickle1
evaluates ,
pickle2
evaluates , and
pickle3
evaluates ,
where
in the formulas above is
specified by the argument
zwz
of these functions.
All of these functions supply derivative (gradient) vectors if
deriv = 1
is specified, and pickle3
supplies the
second derivative (Hessian) matrix if deriv = 2
is specified.
Let W(phi) denote the second derivative function of m, that is, W(phi) is the second derivative matrix of the function m evaluated at the point phi. The idea is that zwz should be approximately the value of

Z' W(a + M alpha.hat + Z A.hat c.hat) Z

where alpha.hat, c.hat, and sigma.hat are the (joint) minimizers of p(alpha, c, sigma) over all three arguments, and A.hat is the diagonal matrix A corresponding to sigma.hat. In aid of this, the function makezwz evaluates

Z' W(a + M alpha + Z A c) Z

for any alpha, c, and sigma.
pickle evaluates the function r(sigma) defined above with zwz recomputed (as makezwz does) at the minimizing alpha and c for each sigma; no derivatives can be computed because no derivatives of the function W are computed for aster models.
The general idea is that one uses pickle with a no-derivative optimizer, such as the "Nelder-Mead" method of the optim function, to get a crude estimate of sigma. Then one uses trust with objective function penmlogl to estimate the corresponding alpha and c (example below). Then one uses makezwz to produce the corresponding zwz (example below). These estimates can be improved using trust with objective function pickle3 using this zwz (example below), and this step may be iterated until convergence. Finally, optim is used with objective function pickle2 to estimate the Hessian matrix of q, which is approximate observed information because q is approximate minus the log likelihood.
For pickle, a scalar, minus the (PQL approximation of the) log likelihood.
For pickle1 and pickle2, a list having components value and gradient (the latter present only when deriv = 1).
For pickle3, a list having components value, gradient (present only when deriv >= 1), and hessian (present only when deriv = 2).
Not intended for use by naive users. Use reaster, which calls them.
data(radish)
library(trust)

pred <- c(0,1,2)
fam <- c(1,3,2)

### need object of type aster to supply to penmlogl and pickle

aout <- aster(resp ~ varb + fit : (Site * Region + Block + Pop),
    pred, fam, varb, id, root, data = radish)

### model matrices for fixed and random effects

modmat.fix <- model.matrix(resp ~ varb + fit : (Site * Region), data = radish)
modmat.blk <- model.matrix(resp ~ 0 + fit:Block, data = radish)
modmat.pop <- model.matrix(resp ~ 0 + fit:Pop, data = radish)

rownames(modmat.fix) <- NULL
rownames(modmat.blk) <- NULL
rownames(modmat.pop) <- NULL

idrop <- match(aout$dropped, colnames(modmat.fix))
idrop <- idrop[! is.na(idrop)]
modmat.fix <- modmat.fix[ , - idrop]

nfix <- ncol(modmat.fix)
nblk <- ncol(modmat.blk)
npop <- ncol(modmat.pop)

### try penmlogl

sigma.start <- c(1, 1)
alpha.start <- aout$coefficients[match(colnames(modmat.fix),
    names(aout$coefficients))]
parm.start <- c(alpha.start, rep(0, nblk + npop))

tout <- trust(objfun = penmlogl, parm.start, rinit = 1, rmax = 10,
    sigma = sigma.start, fixed = modmat.fix,
    random = list(modmat.blk, modmat.pop), obj = aout)
tout$converged

### crude estimate of variance components

eff.blk <- tout$argument[seq(nfix + 1, nfix + nblk)]
eff.pop <- tout$argument[seq(nfix + nblk + 1, nfix + nblk + npop)]
sigma.crude <- sqrt(c(var(eff.blk), var(eff.pop)))

### try optim and pickle

cache <- new.env(parent = emptyenv())
oout <- optim(sigma.crude, pickle, parm = tout$argument,
    fixed = modmat.fix, random = list(modmat.blk, modmat.pop),
    obj = aout, cache = cache)
oout$convergence == 0
### estimated variance components
oout$par^2

### get estimates of fixed and random effects

tout <- trust(objfun = penmlogl, tout$argument, rinit = 1, rmax = 10,
    sigma = oout$par, fixed = modmat.fix,
    random = list(modmat.blk, modmat.pop), obj = aout, fterm = 0)
tout$converged

sigma.better <- oout$par
alpha.better <- tout$argument[1:nfix]
c.better <- tout$argument[- (1:nfix)]
zwz.better <- makezwz(sigma.better, parm = c(alpha.better, c.better),
    fixed = modmat.fix, random = list(modmat.blk, modmat.pop), obj = aout)

### get better estimates

objfun <- function(alphaceesigma, zwz)
    pickle3(alphaceesigma, fixed = modmat.fix,
        random = list(modmat.blk, modmat.pop), obj = aout, zwz = zwz, deriv = 2)
tout <- trust(objfun, c(alpha.better, c.better, sigma.better),
    rinit = 1, rmax = 10, zwz = zwz.better)
tout$converged

alpha.mle <- tout$argument[1:nfix]
c.mle <- tout$argument[nfix + 1:(nblk + npop)]
sigma.mle <- tout$argument[nfix + nblk + npop + 1:2]
zwz.mle <- makezwz(sigma.mle, parm = c(alpha.mle, c.mle),
    fixed = modmat.fix, random = list(modmat.blk, modmat.pop), obj = aout)

### estimated variance components
sigma.mle^2

### preceding step can be iterated "until convergence"

### get (approximate) Fisher information

objfun <- function(alphasigma) pickle2(alphasigma, parm = c.mle,
    fixed = modmat.fix, random = list(modmat.blk, modmat.pop),
    obj = aout, zwz = zwz.mle)$value
gradfun <- function(alphasigma) pickle2(alphasigma, parm = c.mle,
    fixed = modmat.fix, random = list(modmat.blk, modmat.pop),
    obj = aout, zwz = zwz.mle, deriv = 1)$gradient
oout <- optim(c(alpha.mle, sigma.mle), objfun, gradfun, method = "BFGS",
    hessian = TRUE)
oout$convergence == 0
fish <- oout$hessian
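As a hedged follow-up to the example above: treating fish as approximate observed Fisher information for the concatenated parameter (alpha, sigma), approximate standard errors could be read off its inverse. This is a sketch, not part of the documented interface; the two sigma names below are made up for readability.

## approximate standard errors from the (approximate) observed information
se <- sqrt(diag(solve(fish)))
names(se) <- c(colnames(modmat.fix), "sigma.block", "sigma.pop")
se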
Obtains predictions (parameter estimates) and optionally estimates standard errors of those predictions (parameter estimates) from a fitted Aster model object.
## S3 method for class 'aster'
predict(object, x, root, modmat, amat,
    parm.type = c("mean.value", "canonical"),
    model.type = c("unconditional", "conditional"),
    is.always.parameter = FALSE,
    se.fit = FALSE, info = c("expected", "observed"),
    info.tol = sqrt(.Machine$double.eps), newcoef = NULL,
    gradient = se.fit, ...)

## S3 method for class 'aster.formula'
predict(object, newdata, varvar, idvar, root, amat,
    parm.type = c("mean.value", "canonical"),
    model.type = c("unconditional", "conditional"),
    is.always.parameter = FALSE,
    se.fit = FALSE, info = c("expected", "observed"),
    info.tol = sqrt(.Machine$double.eps), newcoef = NULL,
    gradient = se.fit, ...)
object |
a fitted object of class inheriting from "aster". |
modmat |
a model matrix to use instead of the one stored in object. May be missing, in which case object$modmat is used. |
x |
response. Ignored and may be missing unless conditional mean value parameters are to be predicted (see details). |
root |
root data. Ignored and may be missing unless mean value parameters are to be predicted (see details). |
amat |
if missing, predictions are for the requested parameters themselves. If supplied, a three-dimensional array (individuals by nodes by linear functions), and the predictions are the corresponding linear combinations of the requested parameter vector (see the examples, where it is used to add up head-count predictions within each individual). For predict.aster its first two dimensions must match those of the model matrix. For predict.aster.formula they must match the individuals and nodes laid out in newdata. |
parm.type |
the type of parameter to predict. The default is "mean.value", mean value parameters (the opposite of the parametrization in which the model is fitted). The alternative is "canonical", canonical parameters. The value of this argument can be abbreviated. |
model.type |
the type of model in which to predict. The default is "unconditional"; the alternative is "conditional". The value of this argument can be abbreviated. |
is.always.parameter |
logical, default FALSE. Relevant only when parm.type = "mean.value" and model.type = "conditional" are specified. If TRUE, conditional mean value parameters (which do not depend on data) are produced; if FALSE, conditional expectations (which do depend on data) are produced. See the section on conditional mean values below. |
se.fit |
logical switch indicating if standard errors are required. |
info |
the type of Fisher information to use to compute standard errors. |
info.tol |
tolerance for eigenvalues of Fisher information.
If eig is the vector of eigenvalues of the Fisher information matrix, then eigenvalues smaller than info.tol * max(eig) are considered zero, making the information matrix singular, in which case standard errors cannot be computed and an error is given. |
newdata |
optionally, a data frame in which to look for variables with
which to predict. If omitted, the predictions are for the data to which the model was fitted (see details). |
varvar |
a variable whose length is the row dimension of newdata, typically a variable in newdata, that is a factor whose levels name the nodes of the graph, in the same role as the varvar argument of aster. |
idvar |
a variable whose length is the row dimension of newdata, typically a variable in newdata, indicating which rows go with which (hypothetical) individual, in the same role as the idvar argument of aster. |
newcoef |
if not NULL, a vector of regression coefficients to use in place of object$coefficients. |
gradient |
if TRUE, return the gradient (derivative matrix) of the transformation from regression coefficients to predictions. |
... |
further arguments passed to or from other methods. |
Note that model.type
need have nothing to do with the type
of the fitted aster model, which is object$type
.
Whether the
fitted model is conditional or unconditional, one typically wants
unconditional mean value parameters, because conditional mean
value parameters for hypothetical individuals depend on the hypothetical
data x
, which usually makes no scientific sense.
If one asks for conditional mean value parameters, one gets
them only if is.always.parameter = TRUE
is specified.
Otherwise, conditional expectations that are not parameters (because
they depend on data) are produced.
See Conditional Mean Values Section for more about this.
Similarly, if object$type == "conditional"
, then the conditional
canonical parameters are a linear function of the regression coefficients
, where
is the model matrix,
but one can predict either
or the unconditional
canonical parameters
,
as selected by
model.type
.
Similarly, if object$type == "unconditional"
,
so , one can predict either
or
as selected by
model.type
.
The specification of the prediction model is confusing because there
are so many possibilities. First the “usual” case.
The fit was done using a formula, found in object$formula
.
A data frame newdata
that has the same variables as object$data
,
the data frame used in the fit, but may have different rows (representing
hypothetical individuals) is supplied.
But newdata
must specify all nodes
of the graphical model for each (hypothetical, new) individual,
just like object$data
did for real observed individuals.
Hence newdata
is typically constructed using reshape
.
See also the details section of aster
.
In this “usual” case we need varvar
and idvar
to
tell us what rows of newdata
correspond to which individuals and
nodes (the same role they played in the original fit by aster
).
If we are predicting canonical parameters, then we do not need root
or
x
.
If we are predicting unconditional mean value parameters, then
we also need root
but not x
.
If we are predicting conditional mean value parameters, then
we also need both root
and x
.
In the “usual” case, these are found in newdata and not supplied as arguments to predict. Moreover, x is not named "x" but is whatever variable is the response in object$formula.
The next case, predict(object)
with no other arguments,
is often used with linear models (predict.lm
),
but we expect will be little used for aster models. As for linear
models, this “predicts” the observed data. In this case
modmat
, x
, and root
are found in object
and nothing is supplied as an argument to predict.aster
, except
perhaps amat
if one wants a function of predictions for the observed
data.
The final case, also perhaps little used, is a fail-safe mode for problems
in which the R formula language just cannot be bludgeoned into doing what
you want. This is the same reason aster.default
exists.
Then a model matrix can be constructed “by hand”, and the function
predict.aster
is used instead of predict.aster.formula
.
Note that it is possible to use a “constructed by hand”
model matrix even if object
was produced by aster.formula
.
Simply explicitly call predict.aster
rather than predict
to override the R method dispatch (which would
call predict.aster.formula
in this case).
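A sketch of this fail-safe mode, assuming aout is a fit produced by aster.formula and my.modmat is a hypothetical hand-built model matrix with the same three-way structure (individuals by nodes by coefficients) as aout$modmat:

## start from the stored array, then modify it "by hand" as needed
my.modmat <- aout$modmat
## call predict.aster directly to bypass dispatch to predict.aster.formula
pout <- predict.aster(aout, x = aout$x, root = aout$root, modmat = my.modmat)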
If se.fit = FALSE
and gradient = FALSE
, a vector of predictions.
If se.fit = TRUE
, a list with components
fit |
Predictions |
se.fit |
Estimated standard errors |
gradient |
The gradient of the transformation from regression coefficients to predictions |
If gradient = TRUE
, a list with components
fit |
Predictions |
gradient |
The gradient of the transformation from regression coefficients to predictions |
Both the original aster paper (Geyer, et al., 2007) and this package are weird about conditional mean values. Equation (10) of that paper defines (using different notation from what we use here)

xi[j] = E(x[j] | x[p(j)])

where the x[j] are components of the response vector and p(j) denotes the predecessor of node j. That paper explicitly says that this is not a parameter because it depends on the data. In fact

E(x[j] | x[p(j)]) = x[p(j)] * E(x[j] | x[p(j)] = 1)

(this is equation (3) of that paper in different notation). Thus it is weird to use a Greek letter to denote this.
There should be a conditional mean value parameter, and Geyer (2010, equation (11b)) defines it as

xi[j] = E(x[j] | x[p(j)] = 1).

(This equation only makes sense when the conditioning event x[p(j)] = 1 is possible, which it is not for k-truncated arrows for k >= 1. Then a more complicated definition must be used. By definition x[j] is the sum of x[p(j)] independent and identically distributed random variables, and xi[j] is always the mean of one of those random variables.)
This gives us the important relationship between conditional and unconditional mean value parameters,

mu[j] = xi[j] * mu[p(j)],

which holds for all successor nodes j.
All later writings of this author use this definition of xi, as does the R package aster2 (Geyer, 2017).
This is one of six important parameterizations of an unconditional aster model
(Geyer, 2010, Sections 2.7 and 2.8). The R package aster2
uses all
of them.
This function (as of version 1.0 of this package) has an argument is.always.parameter to switch between these two definitions in case parm.type = "mean.value" and model.type = "conditional" are specified. Then is.always.parameter = TRUE specifies that the latter definition of xi is produced (which is a parameter, as are the quantities produced by all other combinations of options for parm.type and model.type). The option is.always.parameter = FALSE specifies that the former definition of xi is produced (which is a conditional expectation but not a parameter) and is what this function produced in versions of this package before 1.0.
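A sketch of the distinction, reusing the objects aout and renewdata constructed in the example below; both calls use only documented arguments, and the comparison is purely illustrative.

## conditional expectations (depend on the hypothetical response in renewdata)
xi.expect <- predict(aout, varvar = varb, idvar = id, root = root,
    newdata = renewdata, parm.type = "mean.value", model.type = "conditional")
## conditional mean value parameters (do not depend on the data)
xi.parm <- predict(aout, varvar = varb, idvar = id, root = root,
    newdata = renewdata, parm.type = "mean.value", model.type = "conditional",
    is.always.parameter = TRUE)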
Geyer, C. J. (2010) A Philosophical Look at Aster Models. Technical Report No. 676. School of Statistics, University of Minnesota. http://purl.umn.edu/57163.
Geyer, C. J. (2017). R package aster2 (Aster Models), version 0.3. https://cran.r-project.org/package=aster2.
Geyer, C. J., Wagenius, S., and Shaw, R. G. (2007) Aster models for life history analysis. Biometrika, 94, 415–426. doi:10.1093/biomet/asm030.
### see package vignette for explanation ###

library(aster)
data(echinacea)
vars <- c("ld02", "ld03", "ld04", "fl02", "fl03", "fl04",
    "hdct02", "hdct03", "hdct04")
redata <- reshape(echinacea, varying = list(vars), direction = "long",
    timevar = "varb", times = as.factor(vars), v.names = "resp")
redata <- data.frame(redata, root = 1)
pred <- c(0, 1, 2, 1, 2, 3, 4, 5, 6)
fam <- c(1, 1, 1, 1, 1, 1, 3, 3, 3)
hdct <- grepl("hdct", as.character(redata$varb))
redata <- data.frame(redata, hdct = as.integer(hdct))
level <- gsub("[0-9]", "", as.character(redata$varb))
redata <- data.frame(redata, level = as.factor(level))
aout <- aster(resp ~ varb + level : (nsloc + ewloc) + hdct : pop,
    pred, fam, varb, id, root, data = redata)
newdata <- data.frame(pop = levels(echinacea$pop))
for (v in vars)
    newdata[[v]] <- 1
newdata$root <- 1
newdata$ewloc <- 0
newdata$nsloc <- 0
renewdata <- reshape(newdata, varying = list(vars), direction = "long",
    timevar = "varb", times = as.factor(vars), v.names = "resp")
hdct <- grepl("hdct", as.character(renewdata$varb))
renewdata <- data.frame(renewdata, hdct = as.integer(hdct))
level <- gsub("[0-9]", "", as.character(renewdata$varb))
renewdata <- data.frame(renewdata, level = as.factor(level))
nind <- nrow(newdata)
nnode <- length(vars)
amat <- array(0, c(nind, nnode, nind))
for (i in 1:nind)
    amat[i , grep("hdct", vars), i] <- 1
foo <- predict(aout, varvar = varb, idvar = id, root = root,
    newdata = renewdata, se.fit = TRUE, amat = amat)
bar <- cbind(foo$fit, foo$se.fit)
dimnames(bar) <- list(as.character(newdata$pop), c("Estimate", "Std. Error"))
print(bar)
Evaluates the objective function for approximate maximum likelihood for an aster model with random effects. Uses Laplace approximation to integrate out the random effects analytically. The “quasi” in the title is a misnomer in the context of aster models but the acronym PQL for this procedure is well-established in the generalized linear mixed models literature.
quickle(alphanu, bee, fixed, random, obj, y, origin, zwz, deriv = 0)
alphanu |
the parameter vector value where the function is evaluated, a numeric vector, see details. |
bee |
the random effects vector that is used as the starting point
for the inner optimization, which maximizes the penalized log likelihood
to find the optimal random effects vector corresponding to the values of alpha and nu supplied in alphanu. |
fixed |
the model matrix for fixed effects. The number of rows is the length of the response vector; the number of columns is the number of fixed effects. |
random |
the model matrix or matrices for random effects.
The number of rows is the length of the response vector. Each model matrix specifies the random effects for one variance component. |
obj |
aster model object, the result of a call to aster. |
y |
response vector. May be omitted, in which case the response stored in obj is used. |
origin |
origin of aster model. May be omitted, in which case the
default origin (see aster) is used. |
zwz |
A possible value of Z'WZ, where Z is the model matrix for all random effects and W is the second derivative matrix of minus the log likelihood of the saturated aster model (see details). |
deriv |
Number of derivatives wanted, zero, one, or two. |
Define

p(alpha, b, nu) = m(a + M alpha + Z b) + b' D^(-1) b / 2 + log det(zwz D + I) / 2

where m is minus the log likelihood function of a saturated aster model, where a is a known vector (the offset vector in the terminology of glm but the origin in the terminology of aster), where M is a known matrix, the model matrix for fixed effects (the argument fixed of this function), where Z is a known matrix, the model matrix for random effects (either the argument random of this function if it is a matrix or Reduce(cbind, random) if random is a list of matrices), where D is a diagonal matrix whose diagonal is the vector rep(nu, times = nrand) where nrand is sapply(random, ncol) when random is a list of matrices and ncol(random) when random is a matrix, where zwz is an arbitrary symmetric positive semidefinite matrix (it is the argument zwz of this function), and where I is the identity matrix. Note that p is a function of zwz although the notation does not explicitly indicate this.

The argument alphanu of this function is the concatenation of the parameter vectors alpha and nu. The argument bee of this function is a possible value of b. The length of alpha is the column dimension of M. The length of b is the column dimension of Z. The length of nu is the length of the argument random of this function if it is a list and is one otherwise.

Let b*(alpha, nu) denote the minimizer of p(alpha, b, nu) considered as a function of b for fixed alpha and nu, so b* is a function of alpha and nu. This function evaluates

q(alpha, nu) = p(alpha, b*(alpha, nu), nu)

and its gradient vector and Hessian matrix (if requested). Note that q is a function of zwz although the notation does not explicitly indicate this.
a list with some of the following components: value, gradient, hessian, alpha, bee, nu. The first three are the requested derivatives. The second three are the corresponding parameter values: alpha and nu are the corresponding parts of the argument alphanu; the value of bee is the result of the inner optimization (b*(alpha, nu) in the notation of the details section), not the argument bee of this function.
Not intended for use by naive users. Use summary.reaster, which calls it.
data(radish)

pred <- c(0,1,2)
fam <- c(1,3,2)

rout <- reaster(resp ~ varb + fit : (Site * Region),
    list(block = ~ 0 + fit : Block, pop = ~ 0 + fit : Pop),
    pred, fam, varb, id, root, data = radish)

alpha.mle <- rout$alpha
bee.mle <- rout$b
nu.mle <- rout$sigma^2
zwz.mle <- rout$zwz
obj <- rout$obj
fixed <- rout$fixed
random <- rout$random
alphanu.mle <- c(alpha.mle, nu.mle)

qout <- quickle(alphanu.mle, bee.mle, fixed, random, obj,
    zwz = zwz.mle, deriv = 2)
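A hedged follow-up: since the Hessian returned with deriv = 2 plays the role of approximate observed Fisher information for (alpha, nu), approximate standard errors could be obtained in the usual way. This is a sketch of that calculation, not something quickle itself does.

## approximate standard errors for fixed effects and variance components
se.alphanu <- sqrt(diag(solve(qout$hessian)))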
Data on life history traits for the invasive California wild radish Raphanus sativus
data(radish)
A data frame with records for 286 plants. Data are already in “long” format; no need to reshape.
Response vector.
Categorical. Gives node of graphical model corresponding
to each component of resp
. See details below.
All ones. Root variables for graphical model.
Categorical. Indicates individual plants.
Categorical. Experimental site where plant was grown. Two sites in this dataset.
Categorical. Block nested within site.
Categorical. Region from which individuals were obtained: northern, coastal California (N) or southern, inland California (S).
Categorical. Wild population nested within region.
Indicator (zero or one). Shorthand
for as.numeric(radish$varb == "Flowering")
.
Indicator (zero or one). Shorthand
for as.numeric(radish$varb == "Flowers")
.
Indicator (zero or one). Shorthand
for as.numeric(radish$varb == "Fruits")
. So-called because the
components of outcome
indicated are the best surrogate of
Darwinian fitness in these data.
The levels of varb
indicate nodes of the graphical model to which
the corresponding elements of the response vector resp
belong.
This is the typical “long” format produced by the R reshape
function. For each individual, there are several response variables.
All response variables are combined in one vector resp
.
The variable varb
indicates which “original” variable
the number was for. The variable id
indicates which individual
the number was for. The levels of varb
, which are the names
of the “original” variables are
Flowering: Indicator (zero or one). Bernoulli. One if individual survived to produce flowers.
Flowers: Integer. Zero-truncated Poisson. Number of flowers observed.
Fruits: Integer. Poisson. Number of fruits observed.
The graphical model is the chain 1 -> Flowering -> Flowers -> Fruits.
Caroline Ridley
These data are a subset of data previously analyzed using fixed effect
aster methods (R function aster
) in the following.
Ridley, C. E. and Ellstrand, N. C. (2010). Rapid evolution of morphology and adaptive life history in the invasive California wild radish (Raphanus sativus) and the implications for management. Evolutionary Applications, 3, 64–76.
These data are a subset of data previously analyzed using random effect
aster methods (R function reaster
) in the following.
Geyer, C. J., Ridley, C. E., Latta, R. G., Etterson, J. R., and Shaw, R. G. (2013) Local Adaptation and Genetic Effects on Fitness: Calculations for Exponential Family Models with Random Effects. Annals of Applied Statistics, 7, 1778–1795. doi:10.1214/13-AOAS653.
data(radish)
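A quick way to see the long-format structure described above, using nothing beyond base R and the data set itself:

data(radish)
str(radish)                        # one row per individual-and-node combination
with(radish, table(varb))          # the three nodes of the graphical model
with(radish, length(unique(id)))   # number of individuals (286, per the description above)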
Random generation of data for Aster models.
raster(theta, pred, fam, root, famlist = fam.default())
theta |
canonical parameter of the conditional model. A matrix, rows represent individuals and columns represent nodes in the graphical model. |
pred |
integer vector of length ncol(theta) (the number of nodes in the graph) giving the predecessor of each node, as for aster. |
fam |
integer vector of length ncol(theta) giving the family index of each node (indexing famlist), as for aster. |
root |
A matrix of the same dimensions as theta. Root data for the aster model. |
famlist |
a list of family specifications (see families). |
A matrix of the same dimensions as theta
. The random data
for an aster model with the specified graph, parameters, and root
data.
### see package vignette for explanation ###

data(echinacea)
vars <- c("ld02", "ld03", "ld04", "fl02", "fl03", "fl04",
    "hdct02", "hdct03", "hdct04")
redata <- reshape(echinacea, varying = list(vars), direction = "long",
    timevar = "varb", times = as.factor(vars), v.names = "resp")
redata <- data.frame(redata, root = 1)
pred <- c(0, 1, 2, 1, 2, 3, 4, 5, 6)
fam <- c(1, 1, 1, 1, 1, 1, 3, 3, 3)
hdct <- grep("hdct", as.character(redata$varb))
hdct <- is.element(seq(along = redata$varb), hdct)
redata <- data.frame(redata, hdct = as.integer(hdct))
aout4 <- aster(resp ~ varb + nsloc + ewloc + pop * hdct - pop,
    pred, fam, varb, id, root, data = redata)
newdata <- data.frame(pop = levels(echinacea$pop))
for (v in vars)
    newdata[[v]] <- 1
newdata$root <- 1
newdata$ewloc <- 0
newdata$nsloc <- 0
renewdata <- reshape(newdata, varying = list(vars), direction = "long",
    timevar = "varb", times = as.factor(vars), v.names = "resp")
hdct <- grep("hdct", as.character(renewdata$varb))
hdct <- is.element(seq(along = renewdata$varb), hdct)
renewdata <- data.frame(renewdata, hdct = as.integer(hdct))
beta.hat <- aout4$coef
theta.hat <- predict(aout4, model.type = "cond", parm.type = "canon")
theta.hat <- matrix(theta.hat, nrow = nrow(aout4$x), ncol = ncol(aout4$x))
xstar <- raster(theta.hat, pred, fam, aout4$root)
aout4star <- aster(xstar, aout4$root, pred, fam, aout4$modmat, beta.hat)
beta.star <- aout4star$coef
print(cbind(beta.hat, beta.star))
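A smaller self-contained sketch: the graph and the parameter values below are made up, chosen only to have the right shapes, and the family codes follow fam.default as in the examples above (1 = Bernoulli, 2 = Poisson, 3 = zero-truncated Poisson).

library(aster)
set.seed(42)
## chain graph: root -> Bernoulli -> zero-truncated Poisson -> Poisson, 5 individuals
pred <- c(0, 1, 2)
fam <- c(1, 3, 2)
theta <- matrix(c(0.5, 1.0, 0.2), nrow = 5, ncol = 3, byrow = TRUE)
root <- matrix(1, nrow = 5, ncol = 3)
raster(theta, pred, fam, root)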
Fits Aster Models with Random Effects using Laplace Approximation.
reaster(fixed, random, pred, fam, varvar, idvar, root,
    famlist = fam.default(), origin, data, effects, sigma, response)
fixed |
either a model matrix or a formula specifying response and model matrix. The model matrix for fixed effects. |
random |
either a model matrix or list of model matrices or a formula or a list of formulas specifying a model matrix or matrices. The model matrix or matrices for random effects. Each model matrix specifies the random effects for one variance component. |
pred |
an integer vector of length nnode (the number of nodes in the graph) giving the predecessor of each node, as for aster. |
fam |
an integer vector of length nnode giving the exponential family of each arrow (an index into famlist), as for aster. |
varvar |
a variable whose length is the row dimension of all model
matrices that is a factor whose levels are character strings
treated as variable names. The number of variable names is nnode. |
idvar |
a variable whose length is the row dimension of all model
matrices. The number of individuals is the row dimension of the model matrices divided by nnode. |
root |
a vector whose length is the row dimension of all model matrices. For nodes whose predecessors are root nodes specifies the value of the constant at that root node. Typically the vector having all components equal to one. |
famlist |
a list of family specifications (see families). |
origin |
a vector whose length is the row dimension of all model
matrices. Distinguished point in parameter space. May be missing,
in which case an unspecified default is provided. See details of aster. |
data |
an optional data frame containing the variables
in the model. If not found in data, the variables are taken from the environment from which reaster was called. |
effects |
if not missing, a vector specifying starting values for
all effects, fixed and random. Length is the sum of the column dimensions
of all model matrices. If supplied, the random effects part should be
standardized (random effects divided by their standard deviations, like
the component c of the value returned by this function). |
sigma |
if not missing, a vector specifying starting values for
the square roots of the variance components. Length is the number
of model matrices for
random effects (the length of the list random). |
response |
if not missing, a vector specifying the response. Length
is the row dimension of all model matrices. If missing, the response
is determined by the response in the formula fixed. |
See the help page for the function aster
for specification
of aster models. This function only fits unconditional aster models
(those with default values of the aster function arguments type and origin.type).
The only difference between this function and the aster function is that some effects are treated as random. The unconditional canonical parameter vector of the aster model is treated as an affine function of fixed and random effects,

phi = a + M alpha + Z[1] b[1] + ... + Z[k] b[k],

where a is the origin (see above), M and the Z[i] are model matrices specified by the arguments fixed and random, alpha is a vector of fixed effects, and each b[i] is a vector of random effects that are assumed to be (marginally) normally distributed with mean vector zero and variance matrix sigma[i]^2 times the identity matrix.
The vectors of random effects b[i] are not parameters; rather they are latent (unobservable, hypothetical) variables. The square roots of the variance components sigma[i] are parameters, as are the components of alpha.
This function maximizes an approximation to the likelihood introduced by Breslow and Clayton (1993). See Geyer, et al. (2013) for details.
reaster
returns an object of class inheriting from "reaster"
.
An object of class "reaster"
is a list containing at least the
following components:
obj |
The aster object returned by a call to the aster function. |
fixed |
the model matrix for fixed effects. |
random |
the model matrix or matrices for random effects. |
dropped |
names of columns dropped from the fixed effects matrix. |
sigma |
approximate MLE for square roots of variance components. |
nu |
approximate MLE for variance components. |
c |
penalized likelihood estimates for the rescaled random effects (the random effects divided by the corresponding sigma). |
b |
penalized likelihood estimates for the random effects. |
alpha |
approximate MLE for fixed effects. |
zwz |
the matrix Z'WZ used in the Laplace approximation, evaluated at the estimates (see quickle). |
response |
the response vector. |
origin |
the origin (offset) vector. |
iterations |
number of iterations of trust region algorithm in
each iteration of re-estimating zwz. |
counts |
number of iterations of Nelder-Mead in initial optimization of approximate missing data log likelihood. |
deviance |
up to a constant, minus twice the maximized value of
the Breslow-Clayton approximation to the
log-likelihood. (Note the minus. This is somewhat counterintuitive,
but agrees with the convention used by the aster function.) |
Calls to reaster.formula
return a list also containing:
call |
the matched call. |
formula |
the formulas supplied. |
It was almost always wrong for aster model data to have NA
values.
Although theoretically possible for the R formula mini-language to do the
right thing for an aster model with NA
values in the data, usually
it does some wrong thing. Thus, since version 0.8-20, this function and
the aster
function give errors when used with data having
NA
values. Users must remove all NA
values (or replace them
with what they should be, perhaps zero values) “by hand”.
The negative binomial and truncated negative binomial are fundamentally
incompatible with random effects. The reason is that the canonical parameter
space for a one-parameter negative binomial or truncated negative binomial
is the negative half line. Thus the conditional canonical parameter theta for such a node must be negative valued. The aster transform is so complicated that it is unclear what the corresponding constraint on the unconditional canonical parameter phi is, but there is a constraint: its parameter space is not the whole real line. A normal random effect, in contrast, has support on the whole real line; it wants to move parameters that are constrained to only part of the real line to arbitrary real values.
The code only warns about this situation, because if the random effects do
not influence any negative binomial or truncated negative binomial nodes
of the graph, then there would be no problem.
The Breslow-Clayton approximation assumes the complete data log likelihood is approximately quadratic considered as a function of random effects only. This will be the case by the law of large numbers if the number of individuals is much larger than the number of random effects. Thus Geyer, et al. (2013) warn against trying to put a random effect for each individual in the model. If you do that, the code will try to fit the model, but it will take forever and no theory says the results will make any sense.
Breslow, N. E., and Clayton, D. G. (1993). Approximate Inference in Generalized Linear Mixed Models. Journal of the American Statistical Association, 88, 9–25. doi:10.1080/01621459.1993.10594284.
Geyer, C. J., Ridley, C. E., Latta, R. G., Etterson, J. R., and Shaw, R. G. (2012) Aster Models with Random Effects via Penalized Likelihood. Technical Report 692, School of Statistics, University of Minnesota. http://purl.umn.edu/135870.
Geyer, C. J., Ridley, C. E., Latta, R. G., Etterson, J. R., and Shaw, R. G. (2013) Local Adaptation and Genetic Effects on Fitness: Calculations for Exponential Family Models with Random Effects. Annals of Applied Statistics, 7, 1778–1795. doi:10.1214/13-AOAS653.
library(aster)
data(radish)

pred <- c(0,1,2)
fam <- c(1,3,2)

rout <- reaster(resp ~ varb + fit : (Site * Region),
    list(block = ~ 0 + fit : Block, pop = ~ 0 + fit : Pop),
    pred, fam, varb, id, root, data = radish)
summary(rout)
summary(rout, stand = FALSE, random = TRUE)
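A short follow-up using components documented in the Value section above:

## approximate MLEs of the variance components
rout$nu
rout$sigma^2   # square of the standard-deviation-scale estimates; should agree with nu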
Data on life history traits for four years and five fitness components
data(sim)
data(sim)
Loads nine objects.
The objects beta.true
, mu.true
, phi.true
, and
theta.true
are the simulation truth parameter values in
different parametrizations.
beta.true: Regression coefficient vector for model resp ~ varb + 0 + z1 + z2 + I(z1^2) + I(z1*z2) + I(z2^2).
mu.true: Unconditional mean value parameter vector for the same model.
phi.true: Unconditional canonical parameter vector for the same model.
theta.true: Conditional canonical parameter vector for the same model.
The objects fam
, pred
, and vars
specify the aster model graphical and probabilistic structure.
fam: Integer vector giving the families of the variables in the graph.
pred: Integer vector giving the predecessors of the variables in the graph.
vars: Character vector giving the names of the variables in the graph.
The objects ladata
and redata
are the simulated data
in two forms "wide"
and "long"
in the terminology
of the reshape
function.
ladata: Data frame with variables y, z1, z2 used for Lande-Arnold type estimation of the fitness landscape. y is the response, fitness, and z1 and z2 are predictor variables, phenotypes.
redata: Data frame with variables resp, z1, z2, varb, id, root used for aster type estimation of the fitness landscape. resp is the response, containing all components of fitness, and z1 and z2 are predictor variables, phenotypes. varb is a factor whose levels are elements of vars indicating which elements of resp go with which nodes of the aster model graphical structure. The variables z1 and z2 have been set equal to zero except when grep("nseed", varb) is TRUE. For the rationale see Section 3.2 of TR 669 referenced below.
Geyer, C. J and Shaw, R. G. (2008) Supporting Data Analysis for a talk to be given at Evolution 2008. Technical Report No. 669. School of Statistics, University of Minnesota. http://hdl.handle.net/11299/56204.
Geyer, C. J and Shaw, R. G. (2009) Hypothesis Tests and Confidence Intervals Involving Fitness Landscapes fit by Aster Models. Technical Report No. 671. School of Statistics, University of Minnesota. http://hdl.handle.net/11299/56219.
data(sim)

## Not run: 
### CRAN policy says examples must take < 5 sec.  This doesn't.
out6 <- aster(resp ~ varb + 0 + z1 + z2 + I(z1^2) + I(z1*z2) + I(z2^2),
    pred, fam, varb, id, root, data = redata)
summary(out6)

## End(Not run)

lout <- lm(y ~ z1 + z2 + I(z1^2) + I(z1*z2) + I(z2^2), data = ladata)
summary(lout)
These functions are all methods
for class aster
or
summary.aster
objects.
## S3 method for class 'aster'
summary(object, info = c("expected", "observed"),
    info.tol = sqrt(.Machine$double.eps), show.graph = FALSE, ...)

## S3 method for class 'summary.aster'
print(x, digits = max(3, getOption("digits") - 3),
    signif.stars = getOption("show.signif.stars"), ...)
object |
an object of class "aster", usually the result of a call to aster. |
info |
the type of Fisher information used to compute standard errors. |
info.tol |
tolerance for eigenvalues of Fisher information.
If eig is the vector of eigenvalues of the Fisher information matrix, then eigenvalues smaller than info.tol * max(eig) are considered zero, making the information matrix singular, in which case standard errors cannot be computed and an error is given (see the discussion of directions of recession below). |
show.graph |
if TRUE, the graphical model (the nodes and their predecessors) is shown in the printout. |
x |
an object of class "summary.aster", usually the result of a call to summary.aster. |
digits |
the number of significant digits to use when printing. |
signif.stars |
logical. If TRUE, "significance stars" are printed for each coefficient. |
... |
further arguments passed to or from other methods. |
summary.aster returns an object of class "summary.aster", a list with the same components as object, which is of class "aster".
This function may give an error message
"cannot compute standard errors, apparent directions of recession"
.
There are two reasons why this can happen.
There may actually be a direction of recession (DOR). Then the maximum likelihood estimate does not exist; increasing the likelihood drives (some of) the coefficients to infinity or minus infinity.
This function's guess at the DOR can be extracted
from the error object obtained by wrapping this function
in try
and then extracting the dor
component
of the condition
attribute of the error object.
An example of this is on the help page for the foobar
data set.
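A sketch of that extraction (the object names here are hypothetical; aout stands in for some aster fit for which summary gives this error):

sout <- try(summary(aout))
if (inherits(sout, "try-error"))
    dor <- attr(sout, "condition")$dor   # the guessed direction(s) of recession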
This function's guessed DOR are apparent null eigenvector(s) of the Fisher
information matrix. Due to inaccuracy of computer arithmetic, this is
only a guess. What are deemed null eigenvectors is controlled by the
info.tol
argument of this function. Reducing info.tol
to perhaps 1e-9
or 1e-10
or even a little lower
may make the putative DOR go away. In this case they were probably
bogus (see next item). Reducing info.tol
to near or below the
machine epsilon .Machine$double.eps
(.Machine
)
instructs this function to feed you garbage with no error or warning.
Putative DOR are probably true DOR if they are highly patterned with many zero or nearly zero components and other components that are nearly (small) integer multiples of each other. Putative DOR are probably bogus if they look like random noise.
DOR, if true, cannot simply be ignored. For more information,
including how to do more rigorous investigation of whether putative
DOR are true or bogus,
see the example on the help page for the foobar
data set and the reference cited on that help page.
All of the putative directions of recession may be bogus. Due to inaccuracy of computer arithmetic, ill-conditioning of predictor variables, or ill-conditioning of the aster model itself (large graphs cause more inaccurate computation), what appear to be null eigenvectors of the Fisher information matrix need not be true null eigenvectors.
In this case, the problem will go away when info.tol
is decreased
slightly. Only when one has proved that there is no DOR, should one
use info.tol = 1e-20
which says to ignore the problem altogether
(whether putative DOR are true or bogus).
These functions are all methods
for class reaster
or
summary.reaster
objects.
## S3 method for class 'reaster'
summary(object, standard.deviation = TRUE, ...)

## S3 method for class 'summary.reaster'
print(x, digits = max(3, getOption("digits") - 3),
    signif.stars = getOption("show.signif.stars"), ...)
object |
an object of class "reaster", usually the result of a call to reaster. |
standard.deviation |
if TRUE, report estimates and standard errors for the square roots of the variance components (standard deviations of the random effects); if FALSE, report them for the variance components themselves. |
x |
an object of class "summary.reaster", usually the result of a call to summary.reaster. |
digits |
the number of significant digits to use when printing. |
signif.stars |
logical. If TRUE, "significance stars" are printed for each coefficient. |
... |
further arguments passed to or from other methods. |
The reaster
function only does approximate maximum likelihood.
Even if it did actual maximum likelihood, standard errors would be only
approximate. Standard errors for variance components are derived via
the delta method from standard errors for square roots of variance
components (standard deviations). Hence P-values for variance components
and square roots of variance components do not agree exactly (although
they do asymptotically).
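The delta-method relation alluded to here, written out with hypothetical numbers (not the output of any function in this package): if nu = sigma^2, then the standard error of the estimate of nu is approximately 2 * sigma * se(sigma).

sigma.hat <- 0.3    # hypothetical estimate of a square root of a variance component
se.sigma <- 0.05    # hypothetical standard error for sigma.hat
se.nu <- 2 * sigma.hat * se.sigma
se.nu               # approximate standard error for the variance component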
summary.reaster
returns an object of class "summary.reaster"
.
Random generation for the k-truncated Poisson distribution or for the k-truncated negative binomial distribution, where “k-truncated” means conditioned on being strictly greater than k. If xpred is not one, then the random variate is the sum of xpred such random variates.
rktp(n, k, mu, xpred = 1)
rktnb(n, size, k, mu, xpred = 1)
rnzp(n, mu, xpred = 1)
n |
number of random values to return. If length(n) > 1, the length is taken to be the number required. |
size |
the size parameter for the negative binomial distribution. |
k |
truncation limit. |
xpred |
number of trials. |
mu |
vector of positive means. |
rktp simulates k-truncated Poisson random variates.
rktnb simulates k-truncated negative binomial random variates.
rnzp simulates zero-truncated Poisson random variates (maintained only for backward compatibility; it now calls rktp).
a vector of random deviates.
rktp(10, 2, 0.75)
rktnb(10, 2.222, 2, 0.75)
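A quick check of the truncation property stated above (with xpred = 1, every simulated value must be strictly greater than k):

set.seed(42)
x <- rktp(1000, 2, 0.75)
all(x > 2)   # should be TRUE: 2-truncated means conditioned on being > 2
range(x)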