Title: | Robust Geostatistical Analysis of Spatial Data |
---|---|
Description: | Provides functions for efficiently fitting linear models with spatially correlated errors by robust (Kuensch et al. (2011) <doi:10.3929/ethz-a-009900710>) and Gaussian (Harville (1977) <doi:10.1080/01621459.1977.10480998>) (Restricted) Maximum Likelihood and for computing robust and customary point and block external-drift Kriging predictions (Cressie (1993) <doi:10.1002/9781119115151>), along with utility functions for variogram modelling in ad hoc geostatistical analyses, model building, model evaluation by cross-validation, (conditional) simulation of Gaussian processes (Davies and Bryant (2013) <doi:10.18637/jss.v055.i09>), unbiased back-transformation of Kriging predictions of log-transformed data (Cressie (2006) <doi:10.1007/s11004-005-9022-8>). |
Authors: | Andreas Papritz [aut, cre] |
Maintainer: | Andreas Papritz <[email protected]> |
License: | GPL (>= 2) | LGPL (>= 2) |
Version: | 0.3-20 |
Built: | 2024-12-05 06:48:38 UTC |
Source: | CRAN |
This is a summary of the features and functionality of georob, a package in R for customary and robust geostatistical analyses.
georob is a package for customary and robust analyses of
geostatistical data.
Such data, say , are
recorded at a set of locations,
,
, in a
domain
,
, along
with covariate information
,
.
We use the following model for the data
:
where
is the so-called signal,
is the external drift,
is an unobserved stationary or
intrinsic spatial Gaussian random field with zero mean, and
is an
i.i.d error from a possibly long-tailed distribution with scale parameter
(
is usually called nugget effect).
In vector form the model is written as
where is the model matrix with the
rows
.
The (generalized) covariance matrix of the vector of
spatial Gaussian random effects
is denoted by
where
is the variance of seemingly uncorrelated micro-scale variation in
that cannot be resolved with the chosen sampling design,
is the identity matrix,
is the variance of the captured auto-correlated variation in
,
is the signal variance, and
.
To estimate both
and
(and not only their sum), one needs
replicated measurements for some of the
.
We define
to be the (generalized) correlation matrix with elements
where the constant is chosen large enough so that
is positive definite,
is a valid stationary or intrinsic variogram, and
is a matrix that is used to model geometrically anisotropic auto-correlation.
In more detail,
maps an arbitrary point on an ellipsoidal surface with constant semi-variance in
,
centred on the origin, and having lengths of semi-principal axes,
,
,
,
equal to
,
and
,
,
respectively, onto the surface of the unit ball centred on the origin.
The orientation of the ellipsoid is defined by the three angles
,
and
:
is the azimuth of
(= angle between north and the projection
of
onto the
-
-plane,
measured from north to south positive clockwise in degrees),
is 90 degrees minus the latitude of
(= angle between the zenith and
,
measured from zenith to nadir positive clockwise in degrees), and
is the angle between
and the direction of the line, say
,
defined by the intersection between the
-
-plane and the plane orthogonal to
running through the origin
(
is measured from
positive counter-clockwise in degrees).
The transformation matrix is given by
where
To model geometrically anisotropic variograms in
one has to set
and
,
and for
one obtains the model for isotropic auto-correlation
with range parameter
.
Note that for isotropic auto-correlation the software processes data for
which
may exceed 3.
Two remarks are in order:
Clearly, the (generalized) covariance matrix of the observations
is given by
Depending on the context, the term “variogram
parameters” denotes sometimes all parameters of a geometrically
anisotropic variogram model, but in places only the parameters of an
isotropic variogram model, i.e. and
are denoted by the term
“anisotropy parameters”. In the sequel
is used to denote all
variogram and anisotropy parameters except the nugget effect
.
The unobserved spatial random effects
at the data locations
and the model parameters
,
and
are unknown and are estimated in georob either by Gaussian
(Harville, 1977) or robust (Künsch et al., 2011)
restricted maximum likelihood (REML) or
Gaussian maximum likelihood (ML). Here ...
denote further parameters of the variogram such as the smoothness parameter
of the Whittle-Matérn model.
In brief, the robust REML method is based on the insight that for
given and
the
Kriging predictions (= BLUP) of
and the generalized least
squares (GLS = ML) estimates of
can be obtained
simultaneously by maximizing
with respect to
and
, e.g.
Harville (1977).
Hence, the BLUP of
,
ML estimates of
,
and
are obtained by maximizing
jointly with respect to
,
,
and
or by solving the respective estimating equations.
The estimating equations can then by robustified by
replacing the standardized errors, say
,
by a bounded or re-descending
-function,
,
of them (e.g. Maronna et al, 2006, chap. 2) and by
introducing suitable bias correction terms for Fisher consistency at the Gaussian model,
see Künsch et al. (2011) for details.
The robustified estimating equations
are solved numerically by a combination of iterated re-weighted least squares
(IRWLS) to estimate and
for given
and
and nonlinear root finding by the function
nleqslv
of the R package nleqslv
to get and
.
The robustness of the procedure is controlled by the tuning parameter
of the
-function. For
the algorithm computes
Gaussian (RE)ML estimates and customary plug-in Kriging predictions.
Instead of solving the Gaussian (RE)ML estimating equations, our software then
maximizes the Gaussian (restricted) log-likelihood using
nlminb
or
optim
.
georob uses variogram models that were provided formerly by the
now archived R package RandomFields and are now implemented in
the function gencorr
of georob.
Currently, estimation of the parameters of the following models is
implemented:
"RMaskey"
, "RMbessel"
, "RMcauchy"
,
"RMcircular"
, "RMcubic"
, "RMdagum"
, "RMdampedcos"
, "RMdewijsian"
, "RMexp"
(default),
"RMfbm"
, "RMgauss"
, "RMgencauchy"
,
"RMgenfbm"
, "RMgengneiting"
, "RMgneiting"
,
"RMlgd"
, "RMmatern"
, "RMpenta"
, "RMqexp"
,
"RMspheric"
, "RMstable"
, "RMwave"
, "RMwhittle"
.
For most variogram parameters, closed-form expressions of and
are used in the computations.
However, for the parameter
of the models
"RMbessel"
,
"RMmatern"
and "RMwhittle"
is evaluated numerically by the function
numericDeriv
, and this results in an increase in
computing time when is estimated.
Customary and robust plug-in external drift point Kriging predictions
can be computed for an non-sampled location
from the covariates
,
the estimated parameters
,
and the predicted random effects
by
where
is the estimated (generalized) covariance matrix of
and
is the vector with the estimated (generalized) covariances between
and
.
Kriging variances can be computed as well, based on approximated covariances of
and
(see Künsch et al., 2011, and appendices of
Nussbaum et al., 2014, for details).
The package georob provides in addition software for computing customary and robust external drift block Kriging predictions. The required integrals of the generalized covariance function are computed by functions of the R package constrainedKriging.
For the time being, the functionality of georob is limited to geostatistical analyses of single response variables. No software is currently available for customary and robust multivariate geostatistical analyses. georob offers functions for:
Robustly fitting a spatial linear model to data that are
possibly contaminated by independent errors from a long-tailed
distribution by robust REML (see functions georob
—
which also fits such models efficiently by Gaussian (RE)ML —
profilelogLik
and control.georob
).
Extracting estimated model components (see
residuals.georob
, rstandard.georob
,
ranef.georob
).
Robustly estimating sample variograms and for fitting variogram
model functions to them (see sample.variogram
and
fit.variogram.model
).
Model building by forward and backward selection of covariates
for the external drift (see waldtest.georob
,
step.georob
, add1.georob
,
drop1.georob
, extractAIC.georob
, logLik.georob
, deviance.georob
). For a
robust fit, the log-likelihood is not defined. The function then
computes the (restricted) log-likelihood of an equivalent Gaussian
model with heteroscedastic nugget (see deviance.georob
for details).
Assessing the goodness-of-fit and predictive power of the model
by K-fold cross-validation (see cv.georob
and
validate.predictions
).
Computing customary and robust external drift point and block
Kriging predictions (see predict.georob
,
control.predict.georob
).
Unbiased back-transformation of both point and block Kriging
predictions of log-transformed data to the original scale of the
measurements (see lgnpp
).
Computing unconditional and conditional Gaussian simulations
from a fitted spatial linear model (see condsim
).
Andreas Papritz [email protected].
Harville, D. A. (1977) Maximum likelihood approaches to variance component estimation and to related problems, Journal of the American Statistical Association, 72, 320–340, doi:10.1080/01621459.1977.10480998.
Künsch, H. R., Papritz, A., Schwierz, C. and Stahel, W. A. (in preparation) Robust Geostatistics.
Künsch, H. R., Papritz, A., Schwierz, C. and Stahel, W. A. (2011) Robust estimation of the external drift and the variogram of spatial data. Proceedings of the ISI 58th World Statistics Congress of the International Statistical Institute. doi:10.3929/ethz-a-009900710
Maronna, R. A., Martin, R. D. and Yohai, V. J. (2006) Robust Statistics Theory and Methods, Wiley, Hoboken, doi:10.1002/0470010940.
Nussbaum, M., Papritz, A., Baltensweiler, A. and Walthert, L. (2014) Estimating soil organic carbon stocks of Swiss forest soils by robust external-drift kriging. Geoscientific Model Development, 7, 1197–1210. doi:10.5194/gmd-7-1197-2014.
georob
for (robust) fitting of spatial linear models;
georobObject
for a description of the class georob
;
profilelogLik
for computing profiles of Gaussian likelihoods;
plot.georob
for display of RE(ML) variogram estimates;
control.georob
for controlling the behaviour of georob
;
georobModelBuilding
for stepwise building models of class georob
;
cv.georob
for assessing the goodness of a fit by georob
;
georobMethods
for further methods for the class georob
;
predict.georob
for computing robust Kriging predictions;
lgnpp
for unbiased back-transformation of Kriging prediction
of log-transformed data;
georobSimulation
for simulating realizations of a Gaussian process
from model fitted by georob
; and finally
sample.variogram
and fit.variogram.model
for robust estimation and modelling of sample variograms.
The utility function compress
stores symmetric or triangular
matrices compactly by retaining only the diagonal and either the
lower or upper off-diagonal elements. The function expand
restores such compressed matrices again to a square form.
compress(m) expand(object)
compress(m) expand(object)
m |
either a single symmetric, lower or upper triangular
matrix or a list of such matrices. The type of |
object |
a single compressed matrix or a list of such matrices
generated by |
If m
is a single square matrix then compress
generates a
compressed matrix, which is a list with two components:
diag |
a vector with the diagonal elements of |
tri |
a vector with off-diagonal elements. |
If m
is a list of square matrices then the result is also a list
of compressed matrices.
expand
creates a square matrix if object
is a list with
components diag
and tri
and a list of square matrices if
object
is a list of such lists. If m
or objects
are
lists that contain other components than square or compressed matrices
then these additional components are returned unchanged.
Andreas Papritz [email protected].
georob
for (robust) fitting of spatial linear models.
data(meuse) r.logzn.rob <- georob(log(zinc) ~ sqrt(dist) + ffreq, data = meuse, locations = ~ x + y, variogram.model = "RMexp", param = c(variance = 0.15, nugget = 0.05, scale = 200), tuning.psi = 1) cov2cor(expand(r.logzn.rob[["cov"]][["cov.betahat"]]))
data(meuse) r.logzn.rob <- georob(log(zinc) ~ sqrt(dist) + ffreq, data = meuse, locations = ~ x + y, variogram.model = "RMexp", param = c(variance = 0.15, nugget = 0.05, scale = 200), tuning.psi = 1) cov2cor(expand(r.logzn.rob[["cov"]][["cov.betahat"]]))
This page documents parameters used to control georob
. It
describes the arguments of the functions control.georob
,
param.transf
, fwd.transf
, dfwd.transf
,
bwd.transf
, control.rq
, control.nleqslv
,
control.nlminb
and control.optim
, which all serve to
control the behaviour of georob
.
control.georob(ml.method = c("REML", "ML"), reparam = TRUE, maximizer = c("nlminb", "optim"), initial.param = TRUE, initial.fixef = c("lmrob", "rq", "lm"), bhat = NULL, min.rweight = 0.25, param.tf = param.transf(), fwd.tf = fwd.transf(), deriv.fwd.tf = dfwd.transf(), bwd.tf = bwd.transf(), psi.func = c("logistic", "t.dist", "huber"), irwls.maxiter = 50, irwls.ftol = 1.e-5, force.gradient = FALSE, min.condnum = 1.e-12, zero.dist = sqrt(.Machine[["double.eps"]]), error.family.estimation = c("gaussian", "long.tailed"), error.family.cov.effects = c("gaussian", "long.tailed"), error.family.cov.residuals = c("gaussian", "long.tailed"), cov.bhat = TRUE, full.cov.bhat = FALSE, cov.betahat = TRUE, cov.delta.bhat = TRUE, full.cov.delta.bhat = TRUE, cov.delta.bhat.betahat = TRUE, cov.ehat = TRUE, full.cov.ehat = FALSE, cov.ehat.p.bhat = FALSE, full.cov.ehat.p.bhat = FALSE, hessian = TRUE, rq = control.rq(), lmrob = lmrob.control(), nleqslv = control.nleqslv(), optim = control.optim(), nlminb = control.nlminb(), pcmp = control.pcmp(), ...) param.transf(variance = "log", snugget = "log", nugget = "log", scale = "log", alpha = c( RMaskey = "log", RMdewijsian = "logit2", RMfbm = "logit2", RMgencauchy = "logit2", RMgenfbm = "logit2", RMlgd = "identity", RMqexp = "logit1", RMstable = "logit2" ), beta = c(RMdagum = "logit1", RMgencauchy = "log", RMlgd = "log"), delta = "logit1", gamma = c(RMcauchy = "log", RMdagum = "logit1"), kappa = "logit3", lambda = "log", mu = "log", nu = "log", f1 = "log", f2 ="log", omega = "identity", phi = "identity", zeta = "identity") fwd.transf(...) dfwd.transf(...) bwd.transf(...) control.rq(tau = 0.5, rq.method = c("br", "fnb", "pfn"), rq.alpha = 0.1, ci = FALSE, iid = TRUE, interp = TRUE, tcrit = TRUE, rq.beta = 0.99995, eps = 1e-06, Mm.factor = 0.8, max.bad.fixup = 3, ...) control.nleqslv(method = c("Broyden", "Newton"), global = c("dbldog", "pwldog", "qline", "gline", "none"), xscalm = c("fixed", "auto"), control = list(ftol = 1e-04), ...) control.optim(method = c("BFGS", "Nelder-Mead", "CG", "L-BFGS-B", "SANN", "Brent"), lower = -Inf, upper = Inf, control = list(reltol = 1e-05), ...) control.nlminb(control = list(rel.tol = 1.e-5), lower = -Inf, upper = Inf, ...)
control.georob(ml.method = c("REML", "ML"), reparam = TRUE, maximizer = c("nlminb", "optim"), initial.param = TRUE, initial.fixef = c("lmrob", "rq", "lm"), bhat = NULL, min.rweight = 0.25, param.tf = param.transf(), fwd.tf = fwd.transf(), deriv.fwd.tf = dfwd.transf(), bwd.tf = bwd.transf(), psi.func = c("logistic", "t.dist", "huber"), irwls.maxiter = 50, irwls.ftol = 1.e-5, force.gradient = FALSE, min.condnum = 1.e-12, zero.dist = sqrt(.Machine[["double.eps"]]), error.family.estimation = c("gaussian", "long.tailed"), error.family.cov.effects = c("gaussian", "long.tailed"), error.family.cov.residuals = c("gaussian", "long.tailed"), cov.bhat = TRUE, full.cov.bhat = FALSE, cov.betahat = TRUE, cov.delta.bhat = TRUE, full.cov.delta.bhat = TRUE, cov.delta.bhat.betahat = TRUE, cov.ehat = TRUE, full.cov.ehat = FALSE, cov.ehat.p.bhat = FALSE, full.cov.ehat.p.bhat = FALSE, hessian = TRUE, rq = control.rq(), lmrob = lmrob.control(), nleqslv = control.nleqslv(), optim = control.optim(), nlminb = control.nlminb(), pcmp = control.pcmp(), ...) param.transf(variance = "log", snugget = "log", nugget = "log", scale = "log", alpha = c( RMaskey = "log", RMdewijsian = "logit2", RMfbm = "logit2", RMgencauchy = "logit2", RMgenfbm = "logit2", RMlgd = "identity", RMqexp = "logit1", RMstable = "logit2" ), beta = c(RMdagum = "logit1", RMgencauchy = "log", RMlgd = "log"), delta = "logit1", gamma = c(RMcauchy = "log", RMdagum = "logit1"), kappa = "logit3", lambda = "log", mu = "log", nu = "log", f1 = "log", f2 ="log", omega = "identity", phi = "identity", zeta = "identity") fwd.transf(...) dfwd.transf(...) bwd.transf(...) control.rq(tau = 0.5, rq.method = c("br", "fnb", "pfn"), rq.alpha = 0.1, ci = FALSE, iid = TRUE, interp = TRUE, tcrit = TRUE, rq.beta = 0.99995, eps = 1e-06, Mm.factor = 0.8, max.bad.fixup = 3, ...) control.nleqslv(method = c("Broyden", "Newton"), global = c("dbldog", "pwldog", "qline", "gline", "none"), xscalm = c("fixed", "auto"), control = list(ftol = 1e-04), ...) control.optim(method = c("BFGS", "Nelder-Mead", "CG", "L-BFGS-B", "SANN", "Brent"), lower = -Inf, upper = Inf, control = list(reltol = 1e-05), ...) control.nlminb(control = list(rel.tol = 1.e-5), lower = -Inf, upper = Inf, ...)
ml.method |
a character keyword defining whether Gaussian maximum
likelihood ( |
reparam |
a logical scalar. If |
maximizer |
a character keyword defining whether the Gaussian
(restricted) log-likelihood is maximized by |
initial.param |
a logical scalar, controlling whether initial values
of variogram parameters are computed for solving the robustified
estimating equations of the variogram and anisotropy parameters. If
|
initial.fixef |
a character keyword defining whether the function
|
bhat |
a numeric vector with initial values for the spatial random
effects |
min.rweight |
a positive numeric with the “robustness
weight” of the initial |
param.tf |
a function such as |
fwd.tf |
a function such as |
deriv.fwd.tf |
a function such as |
bwd.tf |
a function such as |
psi.func |
a character keyword defining what |
irwls.maxiter |
a positive integer equal to the maximum number of
IRWLS iterations to solve the estimating equations of
|
irwls.ftol |
a positive numeric with the convergence criterion for
IRWLS. Convergence is assumed if the objective function change of a IRWLS
iteration does not exceed |
force.gradient |
a logical scalar controlling whether the estimating
equations or the gradient of the Gaussian restricted log-likelihood are
evaluated even if all variogram parameters are fixed (default
|
min.condnum |
a positive numeric with the minimum acceptable
ratio of smallest to largest singular value of the model matrix
|
zero.dist |
a positive numeric equal to the maximum distance, separating two sampling locations that are still considered as being coincident. |
error.family.estimation |
a character keyword, defining the
probability distribution for |
error.family.cov.effects |
a character keyword, defining the
probability distribution for |
error.family.cov.residuals |
a character keyword, defining the
probability distribution for |
cov.bhat |
a logical scalar controlling whether the covariances of
|
full.cov.bhat |
a logical scalar controlling whether the full
covariance matrix ( |
cov.betahat |
a logical scalar controlling whether the covariance
matrix of |
cov.delta.bhat |
a logical scalar controlling whether the covariances of
|
full.cov.delta.bhat |
a logical scalar controlling whether the full covariance
matrix ( |
cov.delta.bhat.betahat |
a logical scalar controlling whether the covariance
matrix of |
cov.ehat |
a logical scalar controlling whether the covariances of
|
full.cov.ehat |
a logical scalar controlling whether the full covariance
matrix ( |
cov.ehat.p.bhat |
a logical scalar controlling whether the covariances of
|
full.cov.ehat.p.bhat |
a logical scalar controlling whether the full
covariance matrix ( |
hessian |
a logical scalar controlling whether for Gaussian (RE)ML the Hessian should be computed at the MLEs. |
rq |
a list of arguments passed to |
lmrob |
a list of arguments passed to the |
nleqslv |
a list of arguments passed to
|
nlminb |
a list of arguments passed to |
optim |
a list of arguments passed to |
pcmp |
a list of arguments, passed e.g. to |
... |
for |
variance , snugget , nugget , scale , alpha , beta , delta , gamma , kappa , lambda , mu , nu
|
character strings with names of transformation functions of the variogram parameters. |
f1 , f2 , omega , phi , zeta
|
character strings with names of transformation functions of the anisotropy variogram parameters. |
tau , rq.method , rq.alpha , ci , iid , interp , tcrit
|
arguments passed
as |
rq.beta , eps , Mm.factor , max.bad.fixup
|
arguments passed as
|
method , global , xscalm , control , lower , upper , reltol , rel.tol
|
arguments passed to related arguments of
|
The arguments param.tf
, fwd.tf
, deriv.fwd.tf
,
bwd.tf
define the transformations of the variogram parameters for
RE(ML) estimation. Implemented are currently "log"
,
"logit1"
, "logit2"
, "logit3"
(various variants of
logit-transformation, see code of function fwd.transf
) and "identity"
(= no)
transformations. These are the possible values that the many arguments
of the function param.transf
accept (as quoted character strings)
and these are the names of the list components returned by
fwd.transf
, dfwd.transf
and bwd.transf
. Additional
transformations can be implemented by:
Extending the function definitions by arguments like
fwd.tf = fwd.transf(my.fun = function(x) your transformation)
,deriv.fwd.tf = dfwd.transf(my.fun = function(x) your derivative)
,bwd.tf = bwd.transf(my.fun = function(x) your back-transformation)
,
Assigning to a given argument of param.transf
the name of
the new function, e.g.variance = "my.fun"
.
Note that the values given for the arguments of param.transf
must match the names of the functions returned by fwd.transf
,
dfwd.transf
and bwd.transf
.
The robustified estimating equations of robust REML depend on the
covariances of .
These covariances (and the covariances of
,
,
,
) are
approximated by expressions that in turn depend on the variances of
,
and the expectation
of
. The arguments
error.family.estimation
, error.family.cov.effects
anderror.family.cov.residuals
control what parametric distribution
for is used to compute the variance of
,
and the expectation
of
when
solving the estimating equations (error.family.estimation
),
computing the covariances of
,
and
(
error.family.cov.effects
) and
computing the covariances of
and
(error.family.cov.residuals
).
Possible options are: "gaussian"
or "long.tailed"
. In
the latter case the probability density function of
is assumed to be proportional to
, where
.
control.georob
, control.rq
, control.nleqslv
,
control.optim
and control.nlminb
all create lists with
control parameters passed to georob
,
rq
, nleqslv
,
optim
or nlminb
, see arguments
above and the help pages of the respective functions for information
about the components of these lists. Note that the list returned by
control.georob
contains some components (irwls.initial
,
tuning.psi.nr
, cov.bhat.betahat
,
aux.cov.pred.target
) that cannot be changed by the user.
param.transf
generates a list with character strings that
define what transformations are used for estimating the variogram
parameters, and fwd.transf
, bwd.transf
and
dfwd.transf
return lists of functions with forward and backward
transformations and the first derivatives of the forward
transformations, see section Parameter transformations above.
Andreas Papritz [email protected].
georobPackage
for a description of the model and a brief summary of the algorithms;
georob
for (robust) fitting of spatial linear models;
georobObject
for a description of the class georob
;
profilelogLik
for computing profiles of Gaussian likelihoods;
plot.georob
for display of RE(ML) variogram estimates;
georobModelBuilding
for stepwise building models of class georob
;
cv.georob
for assessing the goodness of a fit by georob
;
georobMethods
for further methods for the class georob
;
predict.georob
for computing robust Kriging predictions;
lgnpp
for unbiased back-transformation of Kriging prediction
of log-transformed data;
georobSimulation
for simulating realizations of a Gaussian process
from model fitted by georob
; and finally
sample.variogram
and fit.variogram.model
for robust estimation and modelling of sample variograms.
data(meuse) r.logzn.rob <- georob(log(zinc) ~ sqrt(dist), data = meuse, locations = ~ x + y, variogram.model = "RMexp", param = c(variance = 0.15, nugget = 0.05, scale = 200), tuning.psi = 1, control = control.georob(cov.bhat = TRUE, cov.ehat.p.bhat = TRUE, initial.fixef = "rq"), verbose = 2) qqnorm(rstandard(r.logzn.rob, level = 0)); abline(0, 1) qqnorm(ranef(r.logzn.rob, standard = TRUE)); abline(0, 1)
data(meuse) r.logzn.rob <- georob(log(zinc) ~ sqrt(dist), data = meuse, locations = ~ x + y, variogram.model = "RMexp", param = c(variance = 0.15, nugget = 0.05, scale = 200), tuning.psi = 1, control = control.georob(cov.bhat = TRUE, cov.ehat.p.bhat = TRUE, initial.fixef = "rq"), verbose = 2) qqnorm(rstandard(r.logzn.rob, level = 0)); abline(0, 1) qqnorm(ranef(r.logzn.rob, standard = TRUE)); abline(0, 1)
Generic function for cross-validating models.
cv(object, ...)
cv(object, ...)
object |
any model object. |
... |
additional arguments as required by the methods. |
will depend on the method function used; see the respective documentation.
Andreas Papritz [email protected].
georob
for (robust) fitting of spatial linear models;
cv.georob
for assessing the goodness of a model fitted by
georob
.
georob
This function assesses the goodness-of-fit of a spatial linear model by
K-fold cross-validation. In more detail, the model is re-fitted
K times by robust (or Gaussian) (RE)ML, excluding each time
1/Kth of the data. The re-fitted models are used to compute robust
(or customary) external Kriging predictions for the omitted observations.
If the response variable is log-transformed then the Kriging predictions
can be optionally transformed back to the original scale of the
measurements. S3methods for evaluating and plotting diagnostic summaries
of the cross-validation errors are described for the function
validate.predictions
.
## S3 method for class 'georob' cv(object, formula = NULL, subset = NULL, method = c("block", "random"), nset = 10L, seed = NULL, sets = NULL, duplicates.in.same.set = TRUE, re.estimate = TRUE, param = object[["variogram.object"]][[1]][["param"]], fit.param = object[["variogram.object"]][[1]][["fit.param"]], aniso = object[["variogram.object"]][[1]][["aniso"]], fit.aniso = object[["variogram.object"]][[1]][["fit.aniso"]], variogram.object = NULL, use.fitted.param = TRUE, return.fit = FALSE, reduced.output = TRUE, lgn = FALSE, mfl.action = c("offset", "stop"), ncores = min(nset, parallel::detectCores()), verbose = 0, ...)
## S3 method for class 'georob' cv(object, formula = NULL, subset = NULL, method = c("block", "random"), nset = 10L, seed = NULL, sets = NULL, duplicates.in.same.set = TRUE, re.estimate = TRUE, param = object[["variogram.object"]][[1]][["param"]], fit.param = object[["variogram.object"]][[1]][["fit.param"]], aniso = object[["variogram.object"]][[1]][["aniso"]], fit.aniso = object[["variogram.object"]][[1]][["fit.aniso"]], variogram.object = NULL, use.fitted.param = TRUE, return.fit = FALSE, reduced.output = TRUE, lgn = FALSE, mfl.action = c("offset", "stop"), ncores = min(nset, parallel::detectCores()), verbose = 0, ...)
object |
an object of class of |
formula |
an optional formula for the regression model passed by
|
subset |
an optional vector specifying a subset of observations to be used in the fitting process. |
method |
a character keyword, controlling whether subsets are formed
by partitioning data set into contiguous spatial |
nset |
a positive integer defining the number K of subsets into
which the data set is partitioned (default: |
seed |
an optional integer seed to initialize random number generation,
see |
sets |
an optional vector of the same length as the response vector
of the fitted model and with positive integers taking values in
|
duplicates.in.same.set |
a logical scalar controlling whether
replicated observations at a given location are assigned to the same
subset when partitioning the data (default |
re.estimate |
a logical scalar controlling whether the model is
re-fitted to the reduced data sets before computing the Kriging
predictions ( |
param |
a named numeric vector or a matrix or data frame with
initial values of variogram parameters passed by
|
fit.param |
a named logical vector or a matrix or data frame
defining which variogram parameters should be adjusted by
|
aniso |
a named numeric vector or a matrix or data frame with
initial values of anisotropy parameters passed by
|
fit.aniso |
a named logical vector or a matrix or data frame
defining which anisotropy parameters should be adjusted by
|
variogram.object |
an optional list that gives initial values for fitting a possibly nested variogram model for the cross-validation sets. Each component is a list with the following components:
|
use.fitted.param |
a logical scalar controlling whether fitted values
of |
return.fit |
a logical scalar controlling whether information about the fit
should be returned when re-estimating the model with the reduced data
sets (default |
reduced.output |
a logical scalar controlling whether the complete fitted
model objects, fitted to the reduced data sets, are returned
( |
lgn |
a logical scalar controlling whether Kriging predictions of a
log-transformed response should be transformed back to the original scale
of the measurements (default |
mfl.action |
a character keyword controlling what is done when some
levels of factor(s) are not present in any of the subsets used to fit the
model. The function either stops ( |
ncores |
a positive integer controlling how many cores are used for parallelized computations, see Details. |
verbose |
a positive integer controlling logging of diagnostic
messages to the console during model fitting. Passed by
|
... |
additional arguments passed by |
Note that the data frame passed as data
argument to georob
must exist in the user workspace
when calling cv.georob
.
cv.georob
uses the packages parallel and snowfall for
parallelized computations. By default, the function uses CPUs
but not more than are physically available (as returned by
detectCores
).
cv.georob
uses the function update
to
re-estimated the model with the reduced data sets. Therefore, any
argument accepted by georob
except data
can be
changed when re-fitting the model. Some of them (e.g. formula
,
subset
, etc.) are explicit arguments of cv.georob
, but
also the remaining ones can be passed by ...
to the function.
Practitioners in geostatistics commonly cross-validate a fitted model
without re-estimating the model parameters with the reduced data sets.
This is clearly an unsound practice (see Hastie et al., 2009, sec.
7.10). Therefore, the argument re.estimate
should always be set
to TRUE
. The alternative is provided only for historic reasons.
The method cv.georob
returns an object of class cv.georob
,
which is a list with the two
components pred
and fit
.
pred
is a data frame with the coordinates and the
cross-validation prediction results with the following variables:
subset |
an integer vector defining to which of the |
data |
the values of the (possibly log-transformed) response. |
pred |
the Kriging predictions. |
se |
the Kriging standard errors. |
If lgn = TRUE
then pred
has the additional variables:
lgn.data |
the untransformed response. |
lgn.pred |
the unbiased back-transformed predictions of a log-transformed response. |
lgn.se |
the Kriging standard errors of the back-transformed
predictions of a |
The second component fit
contains either the full outputs of
georob
, fitted for the reduced data sets
(
reduced.output = FALSE
), or lists with the components
tuning.psi
, converged
, convergence.code
,
gradient
, variogram.object
, coefficients
along with
the standard errors of
, see
georobObject
.
Andreas Papritz [email protected].
Hastie, T., Tibshirani, R. and Friedman, J. (2009) The Elements of Statistical Learning; Data Mining, Inference and Prediction, Springer, New York, doi:10.1007/978-0-387-84858-7
georobPackage
for a description of the model and a brief summary of the algorithms;
georob
for (robust) fitting of spatial linear models;
georobObject
for a description of the class georob
;
profilelogLik
for computing profiles of Gaussian likelihoods;
plot.georob
for display of RE(ML) variogram estimates;
control.georob
for controlling the behaviour of georob
;
georobModelBuilding
for stepwise building models of class georob
;
georobMethods
for further methods for the class georob
;
predict.georob
for computing robust Kriging predictions;
validate.predictions
for validating Kriging predictions;
lgnpp
for unbiased back-transformation of Kriging prediction
of log-transformed data;
georobSimulation
for simulating realizations of a Gaussian process
from model fitted by georob
; and finally
sample.variogram
and fit.variogram.model
for robust estimation and modelling of sample variograms.
## define number of cores for parallel computations if(interactive()) ncpu <- 10L else ncpu <- 1L data(meuse) r.logzn <- georob(log(zinc) ~ sqrt(dist), data = meuse, locations = ~ x + y, variogram.model = "RMexp", param = c(variance = 0.15, nugget = 0.05, scale = 200), tuning.psi = 1000) if(interactive()){ ## example is run only in interactive session because cpu times exceeds 5 s r.logzn.cv.1 <- cv(r.logzn, seed = 1, lgn = TRUE, ncores = 1, verbose = 1) r.logzn.cv.2 <- cv(r.logzn, formula = .~. + ffreq, seed = 1, lgn = TRUE, ncores = ncpu) plot(r.logzn.cv.1, type = "bs") plot(r.logzn.cv.2, type = "bs", add = TRUE, col = "red") legend("topright", lty = 1, col = c("black", "red"), bty = "n", legend = c("log(Zn) ~ sqrt(dist)", "log(Zn) ~ sqrt(dist) + ffreq")) }
## define number of cores for parallel computations if(interactive()) ncpu <- 10L else ncpu <- 1L data(meuse) r.logzn <- georob(log(zinc) ~ sqrt(dist), data = meuse, locations = ~ x + y, variogram.model = "RMexp", param = c(variance = 0.15, nugget = 0.05, scale = 200), tuning.psi = 1000) if(interactive()){ ## example is run only in interactive session because cpu times exceeds 5 s r.logzn.cv.1 <- cv(r.logzn, seed = 1, lgn = TRUE, ncores = 1, verbose = 1) r.logzn.cv.2 <- cv(r.logzn, formula = .~. + ffreq, seed = 1, lgn = TRUE, ncores = ncpu) plot(r.logzn.cv.1, type = "bs") plot(r.logzn.cv.2, type = "bs", add = TRUE, col = "red") legend("topright", lty = 1, col = c("black", "red"), bty = "n", legend = c("log(Zn) ~ sqrt(dist)", "log(Zn) ~ sqrt(dist) + ffreq")) }
Auxiliary functions to set sensible default values for anisotropy parameters and for controlling what variogram and anisotropy parameters should be estimated.
default.aniso(f1 = 1., f2 = 1., omega = 90., phi = 90., zeta = 0.) default.fit.param( variance = TRUE, snugget = FALSE, nugget = TRUE, scale = TRUE, alpha = FALSE, beta = FALSE, delta = FALSE, gamma = FALSE, kappa = FALSE, lambda = FALSE, mu = FALSE, nu = FALSE) default.fit.aniso(f1 = FALSE, f2 = FALSE, omega = FALSE, phi = FALSE, zeta = FALSE)
default.aniso(f1 = 1., f2 = 1., omega = 90., phi = 90., zeta = 0.) default.fit.param( variance = TRUE, snugget = FALSE, nugget = TRUE, scale = TRUE, alpha = FALSE, beta = FALSE, delta = FALSE, gamma = FALSE, kappa = FALSE, lambda = FALSE, mu = FALSE, nu = FALSE) default.fit.aniso(f1 = FALSE, f2 = FALSE, omega = FALSE, phi = FALSE, zeta = FALSE)
variance |
variance (sill |
snugget |
variance (spatial nugget
|
nugget |
variance (nugget |
scale |
range parameter ( |
alpha , beta , delta , gamma , kappa , lambda , mu , nu
|
names of
additional variogram parameters such as the smoothness parameter
|
f1 |
positive ratio |
f2 |
positive ratio |
omega |
azimuth in degrees of the first semi-principal axis of the
semi-variance ellipsoid (default |
phi |
90 degrees minus altitude of the first semi-principal axis of
the semi-variance ellipsoid (default |
zeta |
angle in degrees between the second semi-principal axis and
the direction of the line defined by the intersection between the
|
Either a named numeric vector with initial values of anisotropy
parameters (default.aniso
) or named logical vectors, controlling
what parameters should be estimated (default.fit.param
,default.fit.aniso
).
Andreas Papritz [email protected].
georobPackage
for a description of the model and a brief summary of the algorithms;
georob
for (robust) fitting of spatial linear models.
default.aniso(f1 = 0.5, omega = 45) default.fit.param(scale=FALSE, alpha = TRUE) default.fit.aniso(f1 = TRUE, omega = TRUE)
default.aniso(f1 = 0.5, omega = 45) default.fit.param(scale=FALSE, alpha = TRUE) default.fit.aniso(f1 = TRUE, omega = TRUE)
Surface elevation data taken from Davis (1972).
data(elevation)
data(elevation)
A data frame with 52 observations on the following 3 variables:
x
a numeric vector with the easting coordinate in multiplies of 50 feet.
y
a numeric vector with the northing coordinate in multiplies of 50 feet..
height
a numeric vector with the elevation in multiples of 10 feet.
The data were imported from the package geoR.
Davis, J.C. (1973) Statistics and Data Analysis in Geology, Wiley, New York.
data(elevation) summary(elevation)
data(elevation) summary(elevation)
The function fit.variogram.model
fits a variogram model to a sample
variogram by (weighted) non-linear least squares. The function
control.fit.variogram.model
generates a list with steering parameters which
control fit.variogram.model
. There are print
, summary
and lines
methods for summarizing and displaying fitted variogram
models.
fit.variogram.model(sv, variogram.model = c("RMexp", "RMaskey", "RMbessel", "RMcauchy", "RMcircular", "RMcubic", "RMdagum", "RMdampedcos", "RMdewijsian", "RMfbm", "RMgauss", "RMgencauchy", "RMgenfbm", "RMgengneiting", "RMgneiting", "RMlgd", "RMmatern", "RMpenta", "RMqexp", "RMspheric", "RMstable", "RMwave", "RMwhittle"), param, fit.param = default.fit.param()[names(param)], aniso = default.aniso(), fit.aniso = default.fit.aniso(), variogram.object = NULL, max.lag = max(sv[["lag.dist"]]), min.npairs = 30, weighting.method = c("cressie", "equal", "npairs"), control = control.fit.variogram.model(), verbose = 0) control.fit.variogram.model(maximizer = c("nlminb", "optim"), param.tf = param.transf(), fwd.tf = fwd.transf(), deriv.fwd.tf = dfwd.transf(), bwd.tf = bwd.transf(), hessian = TRUE, optim = control.optim(), nlminb = control.nlminb()) ## S3 method for class 'fitted.variogram' print(x, digits = max(3, getOption("digits") - 3), ...) ## S3 method for class 'fitted.variogram' summary(object, correlation = FALSE, signif = 0.95, ...) ## S3 method for class 'fitted.variogram' lines(x, what = c("variogram", "covariance", "correlation"), from = 1.e-6, to, n = 501, xy.angle = 90, xz.angle = 90, col = 1:length(xy.angle), pch = 1:length(xz.angle), lty = "solid", ...)
fit.variogram.model(sv, variogram.model = c("RMexp", "RMaskey", "RMbessel", "RMcauchy", "RMcircular", "RMcubic", "RMdagum", "RMdampedcos", "RMdewijsian", "RMfbm", "RMgauss", "RMgencauchy", "RMgenfbm", "RMgengneiting", "RMgneiting", "RMlgd", "RMmatern", "RMpenta", "RMqexp", "RMspheric", "RMstable", "RMwave", "RMwhittle"), param, fit.param = default.fit.param()[names(param)], aniso = default.aniso(), fit.aniso = default.fit.aniso(), variogram.object = NULL, max.lag = max(sv[["lag.dist"]]), min.npairs = 30, weighting.method = c("cressie", "equal", "npairs"), control = control.fit.variogram.model(), verbose = 0) control.fit.variogram.model(maximizer = c("nlminb", "optim"), param.tf = param.transf(), fwd.tf = fwd.transf(), deriv.fwd.tf = dfwd.transf(), bwd.tf = bwd.transf(), hessian = TRUE, optim = control.optim(), nlminb = control.nlminb()) ## S3 method for class 'fitted.variogram' print(x, digits = max(3, getOption("digits") - 3), ...) ## S3 method for class 'fitted.variogram' summary(object, correlation = FALSE, signif = 0.95, ...) ## S3 method for class 'fitted.variogram' lines(x, what = c("variogram", "covariance", "correlation"), from = 1.e-6, to, n = 501, xy.angle = 90, xz.angle = 90, col = 1:length(xy.angle), pch = 1:length(xz.angle), lty = "solid", ...)
sv |
an object of class |
variogram.model |
a character keyword defining the variogram model
to be fitted. Currently, most basic variogram models provided formerly
by the now archived package RandomFields can be fitted (see
Details of |
param |
a named numeric vector with initial values of the variogram
parameters. The following parameter names are allowed (see
Details of
|
fit.param |
a named logical vector (or a function such as
|
aniso |
a named numeric vector with initial values (or a function such as
|
fit.aniso |
a named logical vector (or a function such as
|
variogram.object |
an optional list that defines a possibly nested variogram model. Each component is itself a list with the following components:
Note that the arguments |
max.lag |
a positive numeric defining the maximum lag distance to be used for fitting or plotting variogram models (default all lag classes). |
min.npairs |
a positive integer defining the minimum number of data
pairs required so that a lag class is used for fitting a variogram
model (default |
weighting.method |
a character keyword defining the weights for non-linear least squares. Possible values are:
|
verbose |
a positive integer controlling logging of diagnostic messages to the console during model fitting. |
control |
a list with the components |
maximizer |
a character keyword defining the optimizer for nonlinear
least squares. Possible values are |
hessian |
a logical scalar controlling whether the Hessian should be computed at the nonlinear least squares estimates. |
param.tf |
a function such as |
fwd.tf |
a function such as |
deriv.fwd.tf |
a function such as |
bwd.tf |
a function such as |
nlminb |
a list of arguments passed to |
optim |
a list of arguments passed to |
object , x
|
an object of class |
digits |
a positive integer indicating the number of decimal digits to print. |
correlation |
a logical scalar controlling whether the correlation matrix of
the fitted variogram parameters is computed (default |
signif |
a numeric with the confidence level for computing
confidence intervals for variogram parameters (default |
what |
a character keyword with the quantity that should be
displayed (default |
from |
a numeric with the minimal lag distance used in plotting variogram models. |
to |
a numeric with the maximum lag distance used in plotting variogram models (default: largest lag distance of current plot). |
n |
a positive integer specifying the number of equally spaced lag
distances for which semi-variances are evaluated in plotting variogram
models (default |
xy.angle |
a numeric vector with azimuth angles (in degrees,
clockwise positive from north) in |
xz.angle |
a numeric vector with angles in |
col |
a vector with colours of curves to distinguish curves relating
to different azimuth angles in |
pch |
a vector with the plotting symbols added to lines to
distinguish curves relating to different angles in
|
lty |
a vector with the line types for plotting variogram models. |
... |
additional arguments passed to methods. |
The parametrization of geometrically anisotropic variograms is
described in detail in georobPackage
, and the section
Details of georob
describes how the parameter
estimates are constrained to permissible ranges. The same
mechanisms are used in fit.variogram.model
.
The method summary
computes confidence intervals of the estimated
variogram and anisotropy parameters from the Hessian matrix of the residual
sums of squares, based on the asymptotic normal distribution of least
squares estimates. Note that the Hessian matrix with respect to the
transformed variogram and anisotropy parameters is used for this.
Hence the inverse Hessian matrix is the covariance matrix of the
transformed parameters, confidence intervals are first computed for the
transformed parameters and the limits of these intervals are transformed
back to the original scale of the parameters. Optionally, summary
reports the correlation matrix of the transformed parameters, also
computed from the Hessian matrix.
The function fit.variogram.model
generates an object of class
fitted.variogram
which is a list with the following components:
sse |
the value of the object function (weighted residual sum of squares) evaluated at the solution. |
variogram.object |
the estimated parameters of a possibly nested variograms model. This is a list that contains for each variogram model structure the following components:
|
param.tf |
a character vector listing the transformations of the variogram parameters used for model fitting. |
fwd.tf |
a list of functions for variogram parameter transformations. |
bwd.tf |
a list of functions for inverse variogram parameter transformations. |
converged |
a logical scalar indicating whether numerical
maximization by |
convergence.code |
a diagnostic integer issued by
|
iter |
a named integer vector of length two with the number of
function and gradient evaluations by |
call |
the matched call. |
residuals |
a numeric vector with the residuals, that is the sample semi-variance minus the fitted values. |
fitted |
a numeric vector with the modelled semi-variances. |
weights |
a numeric vector with the weights used for fitting. |
hessian.tfpa |
a symmetric matrix with the Hessian at the solution
with respect to the transformed variogram and anisotropy parameters
(missing if |
hessian.ntfpa |
a symmetric matrix with the Hessian at the solution
with respect to the non-transformed variogram and anisotropy parameters
(missing if |
The function control.fit.variogram.model
returns a list with
parameters to steer fit.variogram.model
, see arguments of
the function above for its components.
The method print.fitted.variogram
invisibly returns the fitted
variogram model unchanged.
The method summary.fitted.variogram
returns an object of class
summary.fitted.variogram
which is a list containing a subset of
the components of the fitted variogram object (call
,
residuals
, weights
, converged
,
convergence.code
, iter
, sse
,
variogram.object
), the matrix param.aniso
with the
estimated values of the variogram parameters along with the bounds of the
confidence intervals and optionally the correlation matrix
cor.tf.param
of the estimated transformed parameters. There is a
print
method for objects of class summary.fitted.variogram
which returns invisibly the object unchanged.
The method lines.fitted.variogram
is called for its side effects
and returns the value NULL
invisibly.
Andreas Papritz [email protected].
Cressie, N. A. C. (1993) Statistics for Spatial Data, Wiley, New York, doi:10.1002/9781119115151.
georobPackage
for a description of the model and a brief summary of the algorithms;
georob
for (robust) fitting of spatial linear models;
georobObject
for a description of the class georob
;
profilelogLik
for computing profiles of Gaussian likelihoods;
plot.georob
for display of RE(ML) variogram estimates;
control.georob
for controlling the behaviour of georob
;
georobModelBuilding
for stepwise building models of class georob
;
cv.georob
for assessing the goodness of a fit by georob
;
georobMethods
for further methods for the class georob
;
predict.georob
for computing robust Kriging predictions;
lgnpp
for unbiased back-transformation of Kriging prediction
of log-transformed data;
georobSimulation
for simulating realizations of a Gaussian process
from model fitted by georob
.
data(wolfcamp) ## fitting an isotropic IRF(0) model r.sv.iso <- sample.variogram(pressure~1, data = wolfcamp, locations = ~x + y, lag.dist.def = seq(0, 200, by = 15)) plot(r.sv.iso, type = "l") if(interactive()){ ## example is run only in interactive session because cpu times exceeds 5 s r.irf0.iso <- fit.variogram.model(r.sv.iso, variogram.model = "RMfbm", param = c(variance = 100, nugget = 1000, scale = 1., alpha = 1.), fit.param = default.fit.param(scale = FALSE, alpha = TRUE)) summary(r.irf0.iso, correlation = TRUE) lines(r.irf0.iso, line.col = "red") } ## fitting an anisotropic IRF(0) model r.sv.aniso <- sample.variogram(pressure~1, data = wolfcamp, locations = ~x + y, lag.dist.def = seq(0, 200, by = 15), xy.angle.def = c(0., 22.5, 67.5, 112.5, 157.5, 180.)) plot(r.sv.aniso, type = "l") if(interactive()){ ## example is run only in interactive session because cpu times exceeds 5 s r.irf0.aniso <- fit.variogram.model(r.sv.aniso, variogram.model = "RMfbm", param = c(variance = 100, nugget = 1000, scale = 1., alpha = 1.5), fit.param = default.fit.param(scale = FALSE, alpha = TRUE), aniso = default.aniso(f1 = 0.4, omega = 135.), fit.aniso = default.fit.aniso(f1 = TRUE, omega = TRUE), control = control.fit.variogram.model( maximizer = "optim", optim = control.optim( method = "BFGS", hessian = TRUE, control = list(maxit = 5000) ) )) summary(r.irf0.aniso, correlation = TRUE) lines(r.irf0.aniso, xy.angle = seq(0, 135, by = 45)) }
data(wolfcamp) ## fitting an isotropic IRF(0) model r.sv.iso <- sample.variogram(pressure~1, data = wolfcamp, locations = ~x + y, lag.dist.def = seq(0, 200, by = 15)) plot(r.sv.iso, type = "l") if(interactive()){ ## example is run only in interactive session because cpu times exceeds 5 s r.irf0.iso <- fit.variogram.model(r.sv.iso, variogram.model = "RMfbm", param = c(variance = 100, nugget = 1000, scale = 1., alpha = 1.), fit.param = default.fit.param(scale = FALSE, alpha = TRUE)) summary(r.irf0.iso, correlation = TRUE) lines(r.irf0.iso, line.col = "red") } ## fitting an anisotropic IRF(0) model r.sv.aniso <- sample.variogram(pressure~1, data = wolfcamp, locations = ~x + y, lag.dist.def = seq(0, 200, by = 15), xy.angle.def = c(0., 22.5, 67.5, 112.5, 157.5, 180.)) plot(r.sv.aniso, type = "l") if(interactive()){ ## example is run only in interactive session because cpu times exceeds 5 s r.irf0.aniso <- fit.variogram.model(r.sv.aniso, variogram.model = "RMfbm", param = c(variance = 100, nugget = 1000, scale = 1., alpha = 1.5), fit.param = default.fit.param(scale = FALSE, alpha = TRUE), aniso = default.aniso(f1 = 0.4, omega = 135.), fit.aniso = default.fit.aniso(f1 = TRUE, omega = TRUE), control = control.fit.variogram.model( maximizer = "optim", optim = control.optim( method = "BFGS", hessian = TRUE, control = list(maxit = 5000) ) )) summary(r.irf0.aniso, correlation = TRUE) lines(r.irf0.aniso, xy.angle = seq(0, 135, by = 45)) }
The function gencorr
computes intrinsic or
stationary isotropic generalized correlations (= negative semi-variances
computed with sill (variance) parameter equal to 1) for a set of basic
variogram models formerly made available by the function RFfctn
of
the now archived R package RandomFields.
gencorr(x, variogram.model, param)
gencorr(x, variogram.model, param)
x |
a numeric vector with scaled lag distances, i.e. lag distances
divided by the range parameter |
variogram.model |
a character keyword defining the variogram model.
Currently, the following models are implemented: |
param |
a named numeric vector with values of the additional
parameters of the variogram models such as the smoothness parameter of
the Whittle-Matérn model, see |
The name and parametrization of the variogram models originate from the
function RFfctn
of RandomFields. The equations and further
information are taken (with minor modifications) from the help pages of
the respective functions of the archived R package RandomFields,
version 3.3.14 (Schlather et al., 2022). Note that the variance
(sill, param["variance"]
) and the range parameters
(param["scale"]
) are assumed to be equal to 1 in the following
formulae, and is the lag distance. The variogram functions are
stationary and are valid for any number of dimensions if not mentioned
otherwise.
The following intrinsic or stationary isotropic variogram
functions are implemented in
gencorr
:
RMaskey
is the indicator function equal to 1 for
and 0 otherwise. This variogram function is
valid for dimension
if
. For
we get the well-known triangle (or tent) model, which is
only valid on the real line.
RMbessel
where ,
denotes
the gamma function and
is a Bessel function of first kind.
This models a hole effect (see Chilès and Delfiner, 1999,
p. 92).
An important case is
which gives the variogram
function
and which is only valid for (this equals
RMdampedcos
for ).
A second important case is
with variogram function
which is valid for . This coincides with
RMwave
.
RMcauchy
where . The parameter
determines the
asymptotic power law. The smaller
, the longer the
long-range dependence. The generalized Cauchy Family
(
RMgencauchy
) includes this family for the choice and
.
RMcircular
This variogram function is valid only for dimensions .
RMcubic
The model is only valid for dimensions . It is a 2 times
differentiable variogram function with compact support (see
Chilès and Delfiner, 1999, p. 84).
RMdagum
The parameters and
can be varied in the
intervals
and
, respectively. Like the
generalized Cauchy model (
RMgencauchy
) the Dagum family can be
used to model separately fractal dimension and Hurst effect
(see Berg et al., 2008).
RMdampedcos
The model is valid for any dimension . However, depending on
the dimension of the random field the following bound
has to be respected.
This variogram function models a hole effect
(see Chilès and Delfiner, 1999, p. 92).
For
we obtain
which is only valid
for and corresponds to
RMbessel
for .
RMdewijsian
where . This is an intrinsic variogram function.
Originally, the logarithmic model
was named after de Wijs and
reflects a principle of
similarity (see Chilès and Delfiner, 1999, p. 90). But
note that
is not
a valid variogram function.
RMexp
This model is a special case of the Whittle model
(RMwhittle
) if
and of the stable family (
RMstable
)
if . Moreover, it is the continuous-time analogue
of the first order auto-regressive time series covariance structure.
RMfbm
where . This is an
intrinsically stationary variogram function. For
we get a variogram function corresponding to a standard
Brownian Motion. For
the
quantity
is called Hurst index
and determines the fractal dimension
of the corresponding
Gaussian sample paths where
is the
dimension of the random field
(see Chilès and Delfiner, 1999, p. 89).
RMgauss
The Gaussian model has an infinitely differentiable variogram
function. This smoothness is artificial. Furthermore, this often
leads to singular matrices and therefore numerically instable
procedures (see Stein, 1999, p. 29). The Gaussian model is included in
the stable class (RMstable
) for the choice .
RMgencauchy
where and
. This model has a smoothness parameter
and
a parameter
which determines the asymptotic power law.
More precisely, this model admits simulating random fields where
fractal dimension D of the Gaussian sample path and Hurst
coefficient H can be chosen independently (compare also with
RMlgd
): Here, we have and
. The smaller
, the longer the
long-range dependence. The variogram function is very regular
near the origin, because its Taylor expansion only contains even terms
and reaches its sill slowly. Note that the Cauchy Family
(
RMcauchy
) is included in this family for the choice and
.
RMgenfbm
where and
. This is an intrinsic variogram function.
RMgengneiting
This is a family of stationary models whose elements are specified by
the two parameters and
with
being a
non-negative integer and
with
denoting the dimension of the random field (the models can be
used for any dimension). Let
.
For the model equals the Askey model (
RMaskey
)
and is therefore not implemented.
For the model is given by
If
and for
A special case of this family is RMgneiting
(with
there) for the choice
.
RMgneiting
if and
otherwise. Here,
. This variogram function is
valid only for dimensions less than or equal to 3. It is 6 times
differentiable and has compact support. This model is an alternative
to
RMgauss
as its graph is hardly distinguishable from the graph
of the Gaussian model, but possesses neither the mathematical nor the
numerical disadvantages of the Gaussian model. It is a special case of
RMgengneiting
for the choice .
RMlgd
where and
, with
denoting the dimension of the random field.
The model is only valid for dimension
. This model admits
simulating random fields where fractal dimension
of the
Gaussian sample and Hurst coefficient
can be chosen
independently (compare also
RMgencauchy
): Here, the random field
has fractal dimension and Hurst coefficient
for
.
RMmatern
where and
is the modified Bessel function of
second kind. This is one of 3 possible parametrizations (Whittle,
Matérn, Handcock-Wallis) of the Whittle-Matérn model. The
Whittle-Matérn model is the model of choice if the smoothness of a
random field is to be parametrized: the sample paths of a Gaussian
random field with this covariance structure are
times
differentiable if and only if
(see Gneiting and Guttorp,
2010, p. 24). Furthermore, the fractal dimension
of the
Gaussian sample paths is determined by
: We have
and
for
where
is the dimension of the random
field (see Stein, 1999, p. 32). If
the Matérn
model equals
RMexp
. For tending to
a
rescaled Gaussian model
RMgauss
appears as limit for the
Handcock-Wallis parametrization.
RMpenta
The model is only valid for dimensions . It has
a 4 times differentiable variogram function with compact support (cf.
Chilès and Delfiner, 1999, p. 84).
RMqexp
where .
RMspheric
This variogram model is valid only for dimensions less than or equal to 3 and has compact support.
RMstable
where . The parameter
determines the fractal dimension
of the Gaussian
sample paths:
where
is the dimension
of the random field. For
the Gaussian sample paths
are not differentiable (see Gelfand et al., 2010, p. 25). The stable
family includes the exponential model (
RMexp
) for and the Gaussian model (
RMgauss
) for .
RMwave
The model is only valid for dimensions . It is a special
case of
RMbessel
for . This variogram models a
hole effect (see Chilès and Delfiner, 1999, p. 92).
RMwhittle
where and
is the modified Bessel function of
second kind. This is one of 3 possible parametrizations (Whittle,
Matérn, Handcock-Wallis) of the
Whittle-Matérn model, for further details, see
information for entry
RMmatern
above.
A numeric vector with generalized correlations (= negative semi-variances
computed with variance parameter param["variance"] = 1
).
Andreas Papritz [email protected]
Berg, C., Mateau, J., Porcu, E. (2008) The dagum family of isotropic correlation functions, Bernoulli, 14, 1134–1149, doi:10.3150/08-BEJ139.
Chilès, J.-P., Delfiner, P. (1999) Geostatistics Modeling Spatial Uncertainty, Wiley, New York, doi:10.1002/9780470316993.
Gneiting, T. (2002) Compactly supported correlation functions. Journal of Multivariate Analysis, 83, 493–508, doi:10.1006/jmva.2001.2056.
Gneiting, T., Schlather, M. (2004) Stochastic models which separate fractal dimension and Hurst effect. SIAM review 46, 269–282, doi:10.1137/S0036144501394387.
Gneiting, T., Guttorp, P. (2010) Continuous Parameter Stochastic Process Theory, In Gelfand, A. E., Diggle, P. J., Fuentes, M., Guttrop, P. (Eds.) Handbook of Spatial Statistics, CRC Press, Boca Raton, p. 17–28, doi:10.1201/9781420072884.
Schlather M., Malinowski A., Oesting M., Boecker D., Strokorb K., Engelke S., Martini J., Ballani F., Moreva O., Auel J., Menck P.J., Gross S., Ober U., Ribeiro P., Ripley B.D., Singleton R., Pfaff B., R Core Team (2022). RandomFields: Simulation and Analysis of Random Fields. R package version 3.3.14, https://cran.r-project.org/src/contrib/Archive/RandomFields/.
Stein, M. L. (1999) Interpolation of Spatial Data: Some Theory for Kriging, Springer, New York, doi:10.1007/978-1-4612-1494-6.
georobPackage
for a description of the model and a brief summary of the algorithms;
georob
for (robust) fitting of spatial linear models;
georobObject
for a description of the class georob
;
profilelogLik
for computing profiles of Gaussian likelihoods;
plot.georob
for display of RE(ML) variogram estimates;
control.georob
for controlling the behaviour of georob
;
georobModelBuilding
for stepwise building models of class georob
;
cv.georob
for assessing the goodness of a fit by georob
;
georobMethods
for further methods for the class georob
;
predict.georob
for computing robust Kriging predictions;
lgnpp
for unbiased back-transformation of Kriging prediction
of log-transformed data;
georobSimulation
for simulating realizations of a Gaussian process
from model fitted by georob
; and finally
sample.variogram
and fit.variogram.model
for robust estimation and modelling of sample variograms.
## scaled lag distances x <- seq(0, 3, length.out = 100) ## generalized correlations stable model y <- gencorr(x, variogram.model = "RMstable", param = c(alpha = 1.5)) plot(x, y) ## generalized correlations circular model y <- gencorr(x, variogram.model = "RMcircular") plot(x, y)
## scaled lag distances x <- seq(0, 3, length.out = 100) ## generalized correlations stable model y <- gencorr(x, variogram.model = "RMstable", param = c(alpha = 1.5)) plot(x, y) ## generalized correlations circular model y <- gencorr(x, variogram.model = "RMcircular") plot(x, y)
The function georob
fits a linear model with spatially correlated
errors to geostatistical data that are possibly contaminated by
independent outliers. The regression coefficients and the parameters of
the variogram model are estimated by robust or Gaussian restricted
maximum likelihood (REML) or by Gaussian maximum likelihood (ML).
georob(formula, data, subset, weights, na.action, model = TRUE, x = FALSE, y = FALSE, contrasts = NULL, offset, locations, variogram.model = c("RMexp", "RMaskey", "RMbessel", "RMcauchy", "RMcircular", "RMcubic", "RMdagum", "RMdampedcos", "RMdewijsian", "RMfbm", "RMgauss", "RMgencauchy", "RMgenfbm", "RMgengneiting", "RMgneiting", "RMlgd", "RMmatern", "RMpenta", "RMqexp", "RMspheric", "RMstable", "RMwave", "RMwhittle"), param, fit.param = default.fit.param()[names(param)], aniso = default.aniso(), fit.aniso = default.fit.aniso(), variogram.object = NULL, tuning.psi = 2, control = control.georob(), verbose = 0, ...)
georob(formula, data, subset, weights, na.action, model = TRUE, x = FALSE, y = FALSE, contrasts = NULL, offset, locations, variogram.model = c("RMexp", "RMaskey", "RMbessel", "RMcauchy", "RMcircular", "RMcubic", "RMdagum", "RMdampedcos", "RMdewijsian", "RMfbm", "RMgauss", "RMgencauchy", "RMgenfbm", "RMgengneiting", "RMgneiting", "RMlgd", "RMmatern", "RMpenta", "RMqexp", "RMspheric", "RMstable", "RMwave", "RMwhittle"), param, fit.param = default.fit.param()[names(param)], aniso = default.aniso(), fit.aniso = default.fit.aniso(), variogram.object = NULL, tuning.psi = 2, control = control.georob(), verbose = 0, ...)
formula |
a symbolic description of the regression model for the
external drift to be fit (mandatory argument). See
|
data |
an optional data frame, a
|
subset |
an optional vector specifying a subset of observations to be used in the fitting process. |
weights |
an optional vector of weights to be used in the fitting process, currently ignored. |
na.action |
a function which indicates what should happen when the
data contain |
model , x , y
|
logical scalars. If |
contrasts |
an optional list. See the |
offset |
this optional argument can be used to specify an a
priori known component to be included in the linear predictor during
fitting. An |
locations |
a one-sided formula defining the variables that are used as coordinates of the locations were the data was recorded (mandatory argument). |
variogram.model |
a character keyword defining the variogram model
to be fitted. Currently, most basic variogram models provided formerly
by the now archived package RandomFields can be fitted (see
Details and |
param |
a named numeric vector with initial values of the
variogram parameters (mandatory argument). The names of
|
fit.param |
a named logical vector (or a function such as
|
aniso |
a named numeric vector with initial values (or a function such as
|
fit.aniso |
a named logical vector (or a function such as
|
variogram.object |
an optional list that defines a possibly nested variogram model. Each component is itself a list with the following components:
Note that the arguments |
tuning.psi |
positive numeric. The tuning constant |
control |
a list specifying parameters that control the behaviour of
|
verbose |
positive integer controlling logging of diagnostic
messages to the console during model fitting. |
... |
further arguments passed to function (e.g. |
georob
fits a spatial linear model by robust (Künsch et al., 2011, Künsch
et al., in preparation) or Gaussian RE(ML) (Harville, 1977).
georobPackage
describes the employed model and briefly
sketches the robust REML estimation and the robust external drift Kriging
method. Here, we describe further details of georob
.
Currently, most basic variogram models provided formerly by the now
archived package RandomFields can be fitted by georob
(see
argument variogram.model
and gencorr
for a list of
implemented models). Some of these models have in addition to
variance
, snugget
, nugget
and scale
further
parameters. Initial values of these parameters (param
) and
fitting flags (fit.param
) must be passed to georob
by the
same names as used for the models RM...
in
gencorr
. Use the function param.names
to
list additional parameters of a given variogram.model.
The arguments fit.param
and fit.aniso
are used to control
what variogram and anisotropy parameters are estimated and which are
kept at the constant initial values. The functionsdefault.fit.param
and default.fit.aniso
set
reasonable default values for these arguments. Note, as an aside, that
the function default.aniso
sets (default) values of the
anisotropy parameters for an isotropic variogram.
The intrinsic variogram model RMfbm
is over-parametrized when
both the variance
(plus possibly snugget
) and the
scale
are estimated. Therefore, to estimate the parameters of
this model, scale
must be kept fixed at an arbitrary value by
using fit.param["scale"] = FALSE
.
The subsection Model of georobPackage
describes
how such models are parametrized and gives definitions the various
elements of aniso
. Some additional remarks might be helpful:
The first semi-principal axis points into the direction with
the farthest reaching auto-correlation, which is described by the range
parameter scale
().
The ranges in the direction of the second and third
semi-principal axes are given by and
, with
.
The default values for aniso
(,
)
define an isotropic variogram model.
Valid ranges for the angles characterizing the orientation of
the semi-variance ellipsoid are (in degrees): [0, 180],
[0, 180],
[-90, 90].
Simultaneous estimation of the variance of the micro-scale variation
(snugget
, ), appears seemingly
as spatially uncorrelated with a given sampling design, and of the
variance (
nugget
, ) of the independent errors
requires that for some locations
replicated observations are
available. Locations less or equal than
zero.dist
apart are
thereby considered as being coincident (see
control.georob
).
Parameters of variogram models can vary only within certain bounds (see
param.bounds
and gencorr
for allowed ranges). georob
uses three mechanisms to constrain
parameter estimates to permissible ranges:
Parameter transformations: By default, all variance
(variance
, snugget
, nugget
), the range
scale
, the anisotropy parameters f1
and f2
and
many of the additional parameters are log-transformed before
solving the estimating equations or maximizing the restricted
log-likelihood and this warrants that the estimates are always
positive (see control.georob
for detailed explanations
how to control parameter transformations).
Checking permissible ranges: The additional parameters
of the variogram models such as the smoothness parameter
of the Whittle-Matérn model are forced to stay in the permissible
ranges by signalling an error to
nleqslv
, nlminb
or
optim
if the current trial values are invalid. These
functions then graciously update the trial values of the parameters
and carry their task on. However, it is clear that such a
procedure likely gets stuck at a point on the boundary of the
parameter space and is therefore just a workaround for avoiding
runtime errors due to invalid parameter values.
Exploiting the functionality of nlminb
and
optim
: If a spatial model is fitted non-robustly, then the
arguments lower
, upper
(and method
of
optim
) can be used to constrain the parameters
(see control.optim
how to pass them to optim
).
For optim
one has to use the arguments method =
"L-BFGS-B"
, lower = l
, upper = u
, where
l and u are numeric vectors with the lower and upper
bounds of the transformed parameters in the order as they
appear inc(variance, snugget, nugget, scale,
...)[fit.param], aniso[fit.aniso])
,
where ...
are
additional parameters of isotropic variogram models (use param.names(variogram.model)
to display the names and the
order of the additional parameters for variogram.model
).
For nlminb
one has to use the arguments lower =
l
, upper = u
, where l and u are
numeric vectors as described above.
To solve the robustified estimating equations for
and
the following initial
estimates are used:
if this turns out to be
infeasible, initial values can be passed to
georob
by the
argument bhat
of control.georob
.
is
either estimated robustly by the function
lmrob
, rq
or
non-robustly by lm
(see argument
initial.fixef
of control.georob
).
Finding the roots of the robustified estimating equations of the
variogram and anisotropy parameters is more sensitive to a good choice
of initial values than maximizing the Gaussian (restricted)
log-likelihood with respect to the same parameters. If the initial
values for param
and aniso
are not sufficiently close to
the roots of the system of nonlinear equations, then
nleqslv
may fail to find them.
Setting initial.param = TRUE
(see control.georob
)
allows one to find initial values that are
often sufficiently close to the roots so that
nleqslv
converges. This is achieved by:
Initial values of the regression parameters are computed by
lmrob
irrespective of the choice for
initial.fixef
(see control.georob
).
Observations with “robustness weights” of the
lmrob
fit, satisfying, are discarded (see
control.georob
).
The model is fit to the pruned data set by Gaussian REML using
nlminb
or optim
.
The resulting estimates of the variogram parameters
(param
, aniso
) are used as initial estimates for the
subsequent robust fit of the model by nleqslv
.
Note that for step 3 above, initial values of param
and
aniso
must be provided to georob
.
Unlike robust REML, where robustified estimating equations are solved
for the variance parameters nugget
(),
variance
(), and possibly
snugget
(), for Gaussian (RE)ML the
variances can be re-parametrized to
the signal variance
,
the inverse relative nugget
and
the relative auto-correlated signal variance
.
georob
maximizes then a (restricted) profile
log-likelihood that depends only on ,
,
, ..., and
is estimated by an explicit
expression that depends on these parameters (e.g. Diggle and
Ribeiro, 2006, p. 113). This is usually more efficient than
maximizing the (restricted) log-likelihood with respect to the original
variance parameters
,
and
.
georob
chooses the parametrization automatically, but the user
can control it by the argument reparam
of the function
control.georob
.
An object of class georob
representing a robust (or Gaussian) (RE)ML
fit of a spatial linear model. See
georobObject
for the components of the fit.
Andreas Papritz [email protected]
with contributions by Cornelia Schwierz.
Diggle, P. J. and Ribeiro, P. J. R. (2006) Model-based Geostatistics. Springer, New York, doi:10.1007/978-0-387-48536-2.
Harville, D. A. (1977) Maximum likelihood approaches to variance component estimation and to related problems, Journal of the American Statistical Association, 72, 320–340, doi:10.1080/01621459.1977.10480998.
Künsch, H. R., Papritz, A., Schwierz, C. and Stahel, W. A. (in preparation) Robust Geostatistics.
Künsch, H. R., Papritz, A., Schwierz, C. and Stahel, W. A. (2011) Robust estimation of the external drift and the variogram of spatial data. Proceedings of the ISI 58th World Statistics Congress of the International Statistical Institute. doi:10.3929/ethz-a-009900710
georobPackage
for a description of the model and a brief summary of the algorithms;
georobObject
for a description of the class georob
;
profilelogLik
for computing profiles of Gaussian likelihoods;
plot.georob
for display of RE(ML) variogram estimates;
control.georob
for controlling the behaviour of georob
;
georobModelBuilding
for stepwise building models of class georob
;
cv.georob
for assessing the goodness of a fit by georob
;
georobMethods
for further methods for the class georob
;
predict.georob
for computing robust Kriging predictions;
lgnpp
for unbiased back-transformation of Kriging prediction
of log-transformed data;
georobSimulation
for simulating realizations of a Gaussian process
from model fitted by georob
; and finally
sample.variogram
and fit.variogram.model
for robust estimation and modelling of sample variograms.
################ ## meuse data ## ################ data(meuse) ## Gaussian REML fit r.logzn.reml <- georob(log(zinc) ~ sqrt(dist), data = meuse, locations = ~ x + y, variogram.model = "RMexp", param = c(variance = 0.15, nugget = 0.05, scale = 200), tuning.psi = 1000) summary(r.logzn.reml, correlation = TRUE) plot(r.logzn.reml, lag.dist.def = seq(0, 2000, by = 100)) ## robust REML fit if(interactive()){ ## example is run only in interactive session because cpu times exceeds 5 s r.logzn.rob <- update(r.logzn.reml, tuning.psi = 1) summary(r.logzn.rob, correlation = TRUE) lines(r.logzn.rob, col = "red") } ################### ## wolfcamp data ## ################### data(wolfcamp) ## fitting isotropic IRF(0) model r.irf0.iso <- georob(pressure ~ 1, data = wolfcamp, locations = ~ x + y, variogram.model = "RMfbm", param = c(variance = 10, nugget = 1500, scale = 1, alpha = 1.5), fit.param = default.fit.param(scale = FALSE, alpha = TRUE), tuning.psi = 1000) summary(r.irf0.iso) plot(r.irf0.iso, lag.dist.def = seq(0, 200, by = 7.5)) ## fitting anisotropic IRF(0) model if(interactive()){ ## example is run only in interactive session because cpu times exceeds 5 s r.irf0.aniso <- georob(pressure ~ 1, data = wolfcamp, locations = ~ x + y, variogram.model = "RMfbm", param = c(variance = 5.9, nugget = 1450, scale = 1, alpha = 1), fit.param = default.fit.param(scale = FALSE, alpha = TRUE), aniso = default.aniso(f1 = 0.51, omega = 148.), fit.aniso = default.fit.aniso(f1 = TRUE, omega = TRUE), tuning.psi = 1000) summary(r.irf0.aniso) plot(r.irf0.aniso, lag.dist.def = seq(0, 200, by = 7.5), xy.angle.def = c(0, 22.5, 67.5, 112.5, 157.5, 180.), add = TRUE, col = 2:5) pchisq(2*(r.irf0.aniso[["loglik"]] - r.irf0.iso[["loglik"]]), 2, lower = FALSE) }
################ ## meuse data ## ################ data(meuse) ## Gaussian REML fit r.logzn.reml <- georob(log(zinc) ~ sqrt(dist), data = meuse, locations = ~ x + y, variogram.model = "RMexp", param = c(variance = 0.15, nugget = 0.05, scale = 200), tuning.psi = 1000) summary(r.logzn.reml, correlation = TRUE) plot(r.logzn.reml, lag.dist.def = seq(0, 2000, by = 100)) ## robust REML fit if(interactive()){ ## example is run only in interactive session because cpu times exceeds 5 s r.logzn.rob <- update(r.logzn.reml, tuning.psi = 1) summary(r.logzn.rob, correlation = TRUE) lines(r.logzn.rob, col = "red") } ################### ## wolfcamp data ## ################### data(wolfcamp) ## fitting isotropic IRF(0) model r.irf0.iso <- georob(pressure ~ 1, data = wolfcamp, locations = ~ x + y, variogram.model = "RMfbm", param = c(variance = 10, nugget = 1500, scale = 1, alpha = 1.5), fit.param = default.fit.param(scale = FALSE, alpha = TRUE), tuning.psi = 1000) summary(r.irf0.iso) plot(r.irf0.iso, lag.dist.def = seq(0, 200, by = 7.5)) ## fitting anisotropic IRF(0) model if(interactive()){ ## example is run only in interactive session because cpu times exceeds 5 s r.irf0.aniso <- georob(pressure ~ 1, data = wolfcamp, locations = ~ x + y, variogram.model = "RMfbm", param = c(variance = 5.9, nugget = 1450, scale = 1, alpha = 1), fit.param = default.fit.param(scale = FALSE, alpha = TRUE), aniso = default.aniso(f1 = 0.51, omega = 148.), fit.aniso = default.fit.aniso(f1 = TRUE, omega = TRUE), tuning.psi = 1000) summary(r.irf0.aniso) plot(r.irf0.aniso, lag.dist.def = seq(0, 200, by = 7.5), xy.angle.def = c(0, 22.5, 67.5, 112.5, 157.5, 180.), add = TRUE, col = 2:5) pchisq(2*(r.irf0.aniso[["loglik"]] - r.irf0.iso[["loglik"]]), 2, lower = FALSE) }
georob
This page documents the methods deviance
,
logLik
, extractAIC
, add1
, drop1
,
step
and waldtest
for the class georob
. The package
georob
provides a generic step
function and a default
method which is identical with the (non-generic) function
step
.
## S3 method for class 'georob' deviance(object, warn = TRUE, REML = FALSE, ...) ## S3 method for class 'georob' logLik(object, warn = TRUE, REML = FALSE, ...) ## S3 method for class 'georob' extractAIC(fit, scale = 0, k = 2, ...) ## S3 method for class 'georob' add1(object, scope, scale = 0, test = c("none", "Chisq"), k = 2, trace = FALSE, fixed = TRUE, use.fitted.param = TRUE, verbose = 0, ncores = 1, ...) ## S3 method for class 'georob' drop1(object, scope, scale = 0, test = c("none", "Chisq"), k = 2, trace = FALSE, fixed = TRUE, use.fitted.param = TRUE, verbose = 0, ncores = 1, ...) step(object, ...) ## Default S3 method: step(object, scope, scale = 0, direction = c("both", "backward", "forward"), trace = 1, keep = NULL, steps = 1000, k = 2, ...) ## S3 method for class 'georob' step(object, scope, scale = 0, direction = c("both", "backward", "forward"), trace = 1, keep = NULL, steps = 1000, k = 2, fixed.add1.drop1 = TRUE, fixed.step = fixed.add1.drop1, use.fitted.param = TRUE, verbose = 0, ncores = 1, ...) ## S3 method for class 'georob' waldtest(object, ..., vcov = NULL, test = c("F", "Chisq"), name = NULL)
## S3 method for class 'georob' deviance(object, warn = TRUE, REML = FALSE, ...) ## S3 method for class 'georob' logLik(object, warn = TRUE, REML = FALSE, ...) ## S3 method for class 'georob' extractAIC(fit, scale = 0, k = 2, ...) ## S3 method for class 'georob' add1(object, scope, scale = 0, test = c("none", "Chisq"), k = 2, trace = FALSE, fixed = TRUE, use.fitted.param = TRUE, verbose = 0, ncores = 1, ...) ## S3 method for class 'georob' drop1(object, scope, scale = 0, test = c("none", "Chisq"), k = 2, trace = FALSE, fixed = TRUE, use.fitted.param = TRUE, verbose = 0, ncores = 1, ...) step(object, ...) ## Default S3 method: step(object, scope, scale = 0, direction = c("both", "backward", "forward"), trace = 1, keep = NULL, steps = 1000, k = 2, ...) ## S3 method for class 'georob' step(object, scope, scale = 0, direction = c("both", "backward", "forward"), trace = 1, keep = NULL, steps = 1000, k = 2, fixed.add1.drop1 = TRUE, fixed.step = fixed.add1.drop1, use.fitted.param = TRUE, verbose = 0, ncores = 1, ...) ## S3 method for class 'georob' waldtest(object, ..., vcov = NULL, test = c("F", "Chisq"), name = NULL)
object , fit
|
an object of class |
direction |
a character keyword with the mode of stepwise search,
see |
fixed , fixed.add1.drop1
|
a logical scalar controlling whether the
variogram parameters are not adjusted when |
fixed.step |
a logical scalar controlling whether the variogram
parameters are not adjusted after having called |
k |
a numeric specifying the 'weight' of the equivalent degrees of
freedom (=: edf) part in the AIC formula, see
|
keep |
a filter function whose input is a fitted model object and the
associated |
name |
a function for extracting a suitable name/description from a
fitted model object. By default the name is queried by calling
|
ncores |
an integer specifying the number of cores used for
parallelized execution of |
REML |
a logical scalar controlling whether the restricted log-likelihood
should be extracted (default |
scale |
a numeric, currently not used, see
|
scope |
defines the range of models examined in the stepwise search.
This should be either a single formula, or a list containing
components |
steps |
a numeric with the maximum number of steps to be considered
(default is 1000), see |
test |
a character keyword specifying whether to compute the large
sample Chi-squared statistic (with asymptotic Chi-squared distribution)
or the finite sample F statistic (with approximate F distribution), see
|
trace |
a numeric. If positive, information is printed during the
running of |
use.fitted.param |
a logical scalar controlling whether fitted
values of |
vcov |
a function for estimating the covariance matrix of the
regression coefficients, see |
verbose |
a positive integer controlling logging of diagnostic
messages to the console during model fitting, see |
warn |
a logical scalar controlling whether warnings should be suppressed. |
... |
additional arguments passed to methods (see in particular
|
For a non-robust fit the function deviance
returns the residual deviance
(see georobPackage
for an explanation of the notation).
For a robust fit the deviance is not defined. The function then computes with a warning
the deviance of an equivalent Gaussian model with heteroscedastic nugget
where
are
the “robustness weights”
rweights
, see georobObject
.
logLik
returns the maximized (restricted) log-likelihood. For
a robust fit, the log-likelihood is not defined. The function then
computes the (restricted) log-likelihood of an equivalent Gaussian model with
heteroscedastic nugget (see above).
The methods extractAIC
, add1
, drop1
and step
are used for stepwise model building.
If fixed=TRUE
or
fixed.add1.drop1=TRUE
(default) then the variogram parameters are
kept fixed at the values of object
. For
fixed=FALSE
or fixed.add1.drop1=FALSE
the variogram
parameters are fitted afresh for each model tested by add1
and
drop1
. Then either the variogram parameters in
object$initial.objects
(use.fitted.param=FALSE
) or the
fitted parameters of object
(use.fitted.param=TRUE
) are
used as initial values. For fixed.step=TRUE
the variogram
parameters are not fitted afresh by step
after the calls to
drop1
and add1
have been completed, unlike for
fixed.step=FALSE
where the parameters are estimated afresh for
the new model that minimized AIC (BIC) in the previous step.
In addition, the functions of the R package multcomp can be used to test general linear hypotheses about the fixed effects of the model.
The method deviance.georob
returns the deviance of the fitted
spatial linear model with the attributes log.det.covmat
containing the logarithm of the determinant of the covariance matrix
of the observations and optionally
log.det.xticovmatx
with the logarithm of the determinant of
, when
REML = true
,
see Details above.
The method logLik.georob
returns an object of class logLik
with the maximized (restricted) log-likelihood, see Details above
and logLik
.
The method extractAIC.georob
returns a numeric vector of length 2
with the first and second elements giving the equivalent degrees of
freedom and the (generalized) Akaike Information Criterion for the fitted
model fit
.
The methods add1.georob
and drop1.georob
return objects of
class anova
which are data.frame
s summarizing the
differences in fit between models. In addition to the customary
variables Df
and AIC
the output contains a logical variable
Converged
which signals (non-)convergence when fitting the
respective sub-model.
The generic function step
returns the stepwise selected model plus
optionally some additional attributes depending on the method.
The methods step.default
and step.georob
return the
stepwise-selected model with up to two additional components
(anova
, keep
), see step
for details.
The method waldtest.georob
returns an object of class anova
which contains the residual degrees of freedom, the difference in degrees
of freedom, Wald statistic (either "Chisq" or "F") and corresponding
p-value.
Andreas Papritz [email protected].
georobPackage
for a description of the model and a brief summary of the algorithms;
georob
for (robust) fitting of spatial linear models;
georobObject
for a description of the class georob
;
profilelogLik
for computing profiles of Gaussian likelihoods;
plot.georob
for display of RE(ML) variogram estimates;
control.georob
for controlling the behaviour of georob
;
cv.georob
for assessing the goodness of a fit by georob
;
georobMethods
for further methods for the class georob
;
predict.georob
for computing robust Kriging predictions;
lgnpp
for unbiased back-transformation of Kriging prediction
of log-transformed data;
georobSimulation
for simulating realizations of a Gaussian process
from model fitted by georob
; and finally
sample.variogram
and fit.variogram.model
for robust estimation and modelling of sample variograms.
data(meuse) ## Gaussian REML fit r.logzn.reml <- georob(log(zinc) ~ sqrt(dist), data = meuse, locations = ~ x + y, variogram.model = "RMexp", param = c(variance = 0.15, nugget = 0.05, scale = 200), tuning.psi = 1000) summary(r.logzn.reml, correlation = TRUE) deviance(r.logzn.reml) logLik(r.logzn.reml) waldtest(r.logzn.reml, .~. + ffreq) step(r.logzn.reml, ~ sqrt(dist) + ffreq + soil) ## robust REML fit if(interactive()){ ## example is run only in interactive session because cpu times exceeds 5 s r.logzn.rob <- update(r.logzn.reml, tuning.psi = 1) deviance(r.logzn.rob) logLik(r.logzn.rob) logLik(r.logzn.rob, REML=TRUE) step(r.logzn.rob, ~ sqrt(dist) + ffreq + soil, fixed.step=FALSE, trace=2) }
data(meuse) ## Gaussian REML fit r.logzn.reml <- georob(log(zinc) ~ sqrt(dist), data = meuse, locations = ~ x + y, variogram.model = "RMexp", param = c(variance = 0.15, nugget = 0.05, scale = 200), tuning.psi = 1000) summary(r.logzn.reml, correlation = TRUE) deviance(r.logzn.reml) logLik(r.logzn.reml) waldtest(r.logzn.reml, .~. + ffreq) step(r.logzn.reml, ~ sqrt(dist) + ffreq + soil) ## robust REML fit if(interactive()){ ## example is run only in interactive session because cpu times exceeds 5 s r.logzn.rob <- update(r.logzn.reml, tuning.psi = 1) deviance(r.logzn.rob) logLik(r.logzn.rob) logLik(r.logzn.rob, REML=TRUE) step(r.logzn.rob, ~ sqrt(dist) + ffreq + soil, fixed.step=FALSE, trace=2) }
An object of class georob
as returned by georob
and
representing a (robustly) fitted spatial linear model. Objects of this
class have methods for model building (see
georobModelBuilding
) and cross-validation (see
cv.georob
), for computing (robust) Kriging predictions (see
predict.georob
), for plotting (see
plot.georob
) and for common generic functions (see
georobMethods
).
A georob
object is a list with following components:
loglik |
the maximized (restricted) Gaussian log-likelihood of a
non-robust (RE)ML fit or |
variogram.object |
the estimated parameters of a possibly nested variograms model. This is a list that contains for each variogram model structure the following components:
|
gradient |
a named numeric vector with the estimating equations (robust REML) or the gradient of the maximized (restricted) log-likelihood (Gaussian (RE)ML) evaluated at the solution. |
tuning.psi |
the value of the tuning constant |
coefficients |
a named vector with the estimated regression coefficients
|
fitted.values |
a named vector with the fitted values of the
external drift
|
bhat |
a named vector with the predicted spatial random effects
|
residuals |
a named vector with the residuals
|
rweights |
a named numeric vector with the “robustness weights”
|
converged |
a logical scalar indicating whether numerical maximization of
the (restricted) |
convergence.code |
a diagnostic integer issued by
|
iter |
a named integer vector of length two, indicating either |
Tmat |
the compressed design matrix for replicated observations at coincident locations (integer vector that contains for each observation the row index of the respective unique location). |
cov |
a list with covariance matrices (or diagonal variance
vectors). Covariance matrices are stored in compressed form (see
|
expectations |
a named numeric vector with the expectations of
|
Valphaxi.objects |
a list of matrices in compressed form with (among others) the following components:
|
zhat.objects |
a list of matrices in (partly) compressed form with the following components:
|
locations.object |
a list with 3 components:
|
initial.objects |
a list with 3 components: |
hessian.tfpa |
a symmetric matrix with the Hessian (observed
Fisher information) at the solution with respect to the transformed
variogram and anisotropy parameters if the model was fitted non-robustly
with the argument |
hessian.ntfpa |
a symmetric matrix with the Hessian (observed
Fisher information) at the solution with respect to the non-transformed
variogram and anisotropy parameters if the model was fitted non-robustly
with the argument |
control |
a list with control parameters generated by
|
MD |
optionally a matrix of robust distances in the space spanned by
|
model , x , y
|
if requested the model frame, the model matrix and the response, respectively. |
na.action , offset , contrasts , xlevels , rank , df.residual , call , terms
|
further
components of the fit as described for an object of class
|
Andreas Papritz [email protected].
georobPackage
for a description of the model and a brief summary of the algorithms;
georob
for (robust) fitting of spatial linear models;
profilelogLik
for computing profiles of Gaussian likelihoods;
plot.georob
for display of RE(ML) variogram estimates;
control.georob
for controlling the behaviour of georob
;
georobModelBuilding
for stepwise building models of class georob
;
cv.georob
for assessing the goodness of a fit by georob
;
georobMethods
for further methods for the class georob
;
predict.georob
for computing robust Kriging predictions;
lgnpp
for unbiased back-transformation of Kriging prediction
of log-transformed data;
georobSimulation
for simulating realizations of a Gaussian process
from model fitted by georob
; and finally
sample.variogram
and fit.variogram.model
for robust estimation and modelling of sample variograms.
georob
This page documents the methods coef
, fixef
,
fixed.effects
, model.frame
, model.matrix
,
nobs
, print
, ranef
, random.effects
,
resid
, residuals
, rstandard
,
summary
and vcov
for the class georob
which extract
the respective components or summarize a georob
object.
## S3 method for class 'georob' coef(object, what = c("trend", "variogram"), ...) ## S3 method for class 'georob' fixef(object, ...) ## S3 method for class 'georob' fixed.effects(object, ...) ## S3 method for class 'georob' model.frame(formula, ...) ## S3 method for class 'georob' model.matrix(object, ...) ## S3 method for class 'georob' nobs(object, ...) ## S3 method for class 'georob' print(x, digits = max(3, getOption("digits") - 3), ...) ## S3 method for class 'georob' ranef(object, standard = FALSE, ...) ## S3 method for class 'georob' random.effects(object, standard = FALSE, ...) ## S3 method for class 'georob' resid(object, type = c("working", "response", "deviance", "pearson", "partial"), terms = NULL, level = 1, ...) ## S3 method for class 'georob' residuals(object, type = c("working", "response", "deviance", "pearson", "partial"), terms = NULL, level = 1, ...) ## S3 method for class 'georob' rstandard(model, level = 1, ...) ## S3 method for class 'georob' summary(object, correlation = FALSE, signif = 0.95, ...) ## S3 method for class 'georob' vcov(object, ...)
## S3 method for class 'georob' coef(object, what = c("trend", "variogram"), ...) ## S3 method for class 'georob' fixef(object, ...) ## S3 method for class 'georob' fixed.effects(object, ...) ## S3 method for class 'georob' model.frame(formula, ...) ## S3 method for class 'georob' model.matrix(object, ...) ## S3 method for class 'georob' nobs(object, ...) ## S3 method for class 'georob' print(x, digits = max(3, getOption("digits") - 3), ...) ## S3 method for class 'georob' ranef(object, standard = FALSE, ...) ## S3 method for class 'georob' random.effects(object, standard = FALSE, ...) ## S3 method for class 'georob' resid(object, type = c("working", "response", "deviance", "pearson", "partial"), terms = NULL, level = 1, ...) ## S3 method for class 'georob' residuals(object, type = c("working", "response", "deviance", "pearson", "partial"), terms = NULL, level = 1, ...) ## S3 method for class 'georob' rstandard(model, level = 1, ...) ## S3 method for class 'georob' summary(object, correlation = FALSE, signif = 0.95, ...) ## S3 method for class 'georob' vcov(object, ...)
object , model , x
|
an object of class |
formula |
a model |
correlation |
a logical scalar controlling whether the correlation
matrix of the estimated regression coefficients and of the fitted
variogram parameters (only for non-robust fits) is computed (default
|
digits |
a positive integer indicating the number of decimal digits to print. |
level |
an optional integer giving the level for extracting the
residuals from |
signif |
a numeric with the confidence level for computing
confidence intervals for variogram parameters (default |
standard |
a logical scalar controlling whether the spatial random effects
|
type |
a character keyword indicating the type of residuals to
compute, see |
terms |
If |
what |
If |
... |
additional arguments passed to methods. |
For robust REML fits deviance
returns (possibly with a warning)
the deviance of the Gaussian REML fit of the equivalent Gaussian spatial
linear model with heteroscedastic nugget.
The methods model.frame
, model.matrix
and nobs
extract the model frame, model matrix and the number of observations, see
help pages of respective generic functions.
The methods residuals
(and resid
) extract either the
estimated independent errors
or the sum of the latter quantities and the spatial random effects
.
rstandard
does the same but standardizes the residuals to unit
variance. ranef
(random.effects
) extracts the spatial
random effects with the option to standardize them as well, and
fixef
(fixed.effects
) extracts the fitted fixed-effects
regression coefficients, which may of course also be obtained by
coef
.
For Gaussian REML the method summary
computes confidence intervals
of the estimated variogram and anisotropy parameters from the Hessian
matrix of the (restricted) log-likelihood (= observed Fisher
information), based on the asymptotic normal distribution of (RE)ML
estimates. Note that the Hessian matrix with respect to the
transformed variogram and anisotropy parameters is used for this.
Hence the inverse Hessian matrix is the covariance matrix of the
transformed parameters, confidence intervals are first computed for the
transformed parameters and the limits of these intervals are transformed
back to the orginal scale of the parameters. Optionally, summary
reports the correlation matrix of the transformed parameters, also
computed from the Hessian matrix.
Note that the methods coef
and summary
generate objects of
class coef.georob
and summary.georob
, respectively, for
which only print
methods are available.
Besides, the default methods of the generic functions
confint
,
df.residual
, fitted
,
formula
, termplot
and
update
can be used for objects of class
georob
.
The methods fixef.georob
and fixed.effects.georob
return
the numeric vector of estimated fixed-effects regression coefficients, and
vcov.georob
returns the covariance matrix of the estimated
regression coefficients.
The method coef.georob
returns an object of class
coef.georob
which is a numeric vector with estimated fixed-effects
regression coefficients or variogram and anisotropy parameters. There is
a print
method for objects of class coef.georob
which
returns invisibly the object unchanged.
The methods resid.georob
, residuals.georob
and
rstandard.georob
return numeric vectors of (standardized)
residuals, and ranef.georob
and random.effects.georob
the
numeric vector of (standardized) spatial random effects, see
Details.
The methods model.frame.georob
and model.matrix.georob
return a model frame and the fixed-effects model matrix, respectively,
and nobs.georob
returns the number of observations used to fit a
spatial linear model.
The method summary.georob
generates an object of class
summary.georob
which is a list with components extracted directly
from object
(call
, residuals
, bhat
,
rweights
, converged
, convergence.code
, iter
,
loglik
, variogram.object
, gradient
,
tuning.psi
, df.residual
, control
, terms
)
and complemented by the following components:
scale
the square root of the estimated nugget effect
.
coefficients
a 4-column matrix with estimated regression coefficients, their standard errors, t-statistics and corresponding (two-sided) p-values.
correlation
an optional compress
ed
lower-triagonal matrix with the Pearson correlation coefficients of the
estimated regression coefficients.
param.aniso
either a vector (robust REML) or a 3-column matrix (Gaussian REML) with estimated variogram and anisotropy parameters, complemented for Gaussian REML with confidence limits, see Details.
cor.tf.param
an optional compress
ed
lower-triagonal matrix with the Pearson correlation coefficients of
estimated transformed variogram and anisotropy parameters, see
Details.
se.residuals
a vector with the standard errors of the
estimated .
There is a print
methods for class summary.georob
which
invisibly returns the object unchanged.
The method print.georob
invisibly returns the object unchanged.
Andreas Papritz [email protected].
georobPackage
for a description of the model and a brief summary of the algorithms;
georob
for (robust) fitting of spatial linear models;
georobObject
for a description of the class georob
;
profilelogLik
for computing profiles of Gaussian likelihoods;
plot.georob
for display of RE(ML) variogram estimates;
control.georob
for controlling the behaviour of georob
;
georobModelBuilding
for stepwise building models of class georob
;
cv.georob
for assessing the goodness of a fit by georob
;
predict.georob
for computing robust Kriging predictions;
lgnpp
for unbiased back-transformation of Kriging prediction
of log-transformed data;
georobSimulation
for simulating realizations of a Gaussian process
from model fitted by georob
; and finally
sample.variogram
and fit.variogram.model
for robust estimation and modelling of sample variograms.
data(meuse) ## Gaussian REML fit r.logzn.reml <- georob(log(zinc) ~ sqrt(dist), data = meuse, locations = ~ x + y, variogram.model = "RMexp", param = c(variance = 0.15, nugget = 0.05, scale = 200), tuning.psi = 1000) summary(r.logzn.reml, correlation = TRUE) ## robust REML fit r.logzn.rob <- update(r.logzn.reml, tuning.psi = 1) summary(r.logzn.rob, correlation = TRUE) ## residual diagnostics old.par <- par(mfrow = c(2,3)) plot(fitted(r.logzn.reml), rstandard(r.logzn.reml)) abline(h = 0, lty = "dotted") qqnorm(rstandard(r.logzn.reml)) abline(0, 1) qqnorm(ranef(r.logzn.reml, standard = TRUE)) abline(0, 1) plot(fitted(r.logzn.rob), rstandard(r.logzn.rob)) abline(h = 0, lty = "dotted") qqnorm(rstandard(r.logzn.rob)) abline(0, 1) qqnorm(ranef(r.logzn.rob, standard = TRUE)) abline(0, 1) par(old.par)
data(meuse) ## Gaussian REML fit r.logzn.reml <- georob(log(zinc) ~ sqrt(dist), data = meuse, locations = ~ x + y, variogram.model = "RMexp", param = c(variance = 0.15, nugget = 0.05, scale = 200), tuning.psi = 1000) summary(r.logzn.reml, correlation = TRUE) ## robust REML fit r.logzn.rob <- update(r.logzn.reml, tuning.psi = 1) summary(r.logzn.rob, correlation = TRUE) ## residual diagnostics old.par <- par(mfrow = c(2,3)) plot(fitted(r.logzn.reml), rstandard(r.logzn.reml)) abline(h = 0, lty = "dotted") qqnorm(rstandard(r.logzn.reml)) abline(0, 1) qqnorm(ranef(r.logzn.reml, standard = TRUE)) abline(0, 1) plot(fitted(r.logzn.rob), rstandard(r.logzn.rob)) abline(h = 0, lty = "dotted") qqnorm(rstandard(r.logzn.rob)) abline(0, 1) qqnorm(ranef(r.logzn.rob, standard = TRUE)) abline(0, 1) par(old.par)
This page documents the function condsim
that
simulates (un)conditional realizations of Gaussian processes from the
parameters of a spatial linear model estimated by the function
georob
.
condsim(object, newdata, nsim, seed, type = c("response", "signal"), locations, trend.coef = NULL, variogram.model = NULL, param = NULL, aniso = NULL, variogram.object = NULL, control = control.condsim(), verbose = 0) control.condsim(use.grid = FALSE, grid.refinement = 2., condsim = TRUE, ce.method = c( "standard", "approximate" ), ce.grid.expansion = 1., include.data.sites = FALSE, means = FALSE, trend.covariates = FALSE, covariances = FALSE, ncores = 1, mmax = 10000, pcmp = control.pcmp())
condsim(object, newdata, nsim, seed, type = c("response", "signal"), locations, trend.coef = NULL, variogram.model = NULL, param = NULL, aniso = NULL, variogram.object = NULL, control = control.condsim(), verbose = 0) control.condsim(use.grid = FALSE, grid.refinement = 2., condsim = TRUE, ce.method = c( "standard", "approximate" ), ce.grid.expansion = 1., include.data.sites = FALSE, means = FALSE, trend.covariates = FALSE, covariances = FALSE, ncores = 1, mmax = 10000, pcmp = control.pcmp())
object |
an object of class |
newdata |
a mandatory data frame,
|
nsim |
a positive interger with the number of (condititional) realizations to compute (mandatory argument). |
seed |
an integer seed to initialize random number generation,
see |
type |
a character keyword defining what target quantity should be simulated. Possible values are
see |
locations |
an optional one-sided formula specifying what variables
of |
trend.coef |
an optional numeric vector with the coefficients of the trend model to be used for computing the (conditional) mean function of the random process see Details. |
variogram.model |
an optional character keyword defining the
variogram model to be used for the simulations, see |
param |
an optional named numeric vector with values of the
variogram parameters used for the simulations, see |
aniso |
an optional named numeric vector with values of anisotropy
parameters of a variogram used for the simulations, see
|
variogram.object |
an optional list that defines a possibly nested
variogram model used for the simulations, see |
control |
a list with the components |
verbose |
a positive integer controlling logging of diagnostic
messages to the console. |
use.grid |
a logical scalar (default |
grid.refinement |
a numeric that defines a factor by which the
minimum differences of the coordinates between any pair of points in
|
condsim |
a logical scalar to control whether conditional
( |
ce.method |
a character keyword to select the method to simulate realizations by the circulant embedding algorithm, see Details. |
ce.grid.expansion |
a numeric with the factor by which the
dimensions of the simulation grid is expanded in the circulant embedding
algorithm. Should be |
include.data.sites |
a logical scalar, to control whether
(conditionally) simulated values are computed also for the points of the
original data set used to estimate the model parameters and contained in
|
means |
a logical scalar, to control whether the (un)conditional means are included in the output. |
trend.covariates |
a logical scalar, to control whether the covariates required for the trend model are included in the output. |
covariances |
a logical scalar, to control whether the covariances
between the points of the original data set used to estimate the model
parameters ( |
ncores |
a positive integer controlling how many cores are used for parallelized computations, defaults to 1. |
mmax |
a positive integer equal to the maximum number (default
|
pcmp |
a list of arguments, passed e.g. to |
condsim
(conditionally) simulates from a Gaussian process that
has a linear mean function with parameters
and an auto-correlation structure
characterized by a parametric variogram model and variogram parameters
and
(see
georobPackage
for the employed parametrization of the
spatial linear model). The parameters of the mean and auto-correlation
function are either taken from the spatial linear model estimated by
georob
and passed by the argument
object
to condsim
or from the optional arguments
trend.coef
()
and
variogram.model
, param
, aniso
or variogram.object
(,
).
Simulated values are computed for the points in newdata
and
optionally also for the data points in object
if
include.data.sites = TRUE
. Both unconditional and conditional
simulations can be computed. In the latter cases, the simulated values
are always conditioned to the response data used to fit the spatial
linear model by georob
and contained in object
.
Unconditional realizations are either computed for the exact locations of
the points in newdata
(use.grid = FALSE
), irrespective of
the fact whether these are arranged on a regular grid or not.
Simulations are then generated by the Cholesky matrix decomposition
method (e.g. Chilès and Delfiner, 1999, sec.
7.2.2).
For use.grid = TRUE
the points in newdata
are matched to a
rectangular simulation grid and the simulations are generated for all
nodes of this grid by the circulant embedding method (Davis and
Bryant, 2013; Dietrich and Newsam, 1993; Wood and Chan,
1994). For large problems this approach may be substantially faster and
less memory demanding than the Cholesky matrix decomposition method.
For circulant embedding, first a rectangular simulation grid is
constructed from the coordinates of the points in newdata
and
object
. The spacings of the simulation grid is equal to the
minimum coordinate differences between any pair of points in
newdata
, divided by grid.refinement
. The spatial extent of
the simulation grid is chosen such that it covers the bounding boxes of
all points in newdata
and object
. The points in
newdata
and object
are then matched to the closest nodes of
the simulation grid. If the same node is assigned to a point in
object
and newdata
then the point in object
is kept
and the concerned point in newdata
is omitted.
The rectangular simulation grid is then expanded to the larger circulant embedding grid, and the eigenvalues of the so-called base matrix (= first row of the covariance matrix of the nodes of the circulant embedding grid with block circulant structure, see Davies and Bryant, 2013) are computed by fast discrete Fourier transform (FFT). It may happen that some of the eigenvalues of the base matrix are negative. The standard circulant embedding algorithm then fails.
Two approaches are implemented in condsim
to handle this
situation:
First, one may use the approximate circulant embedding
method by choosing ce.method = "approximate"
. This sets the
negative eigenvalues of the base matrix to zero and scales the
eigenvalues, see Chan and Wood (1994, sec. 4, choice ).
Second, one may attempt to avoid the problem of negative
eigenvalues by increasing the size of the simulation (and circulant
embedding) grids. This can be achieved by choosing a value
for the argument
ce.grid.expansion
, see respective parts in
Dietrich and Newsam (1993, sec. 4) and Wood and Chan
(1994, sec. 3).
Note that the dimension of the simulation and embedding grids are chosen such that the number of nodes is a highly composite integer. This allows efficient FFT.
For both the Cholesky matrix decomposition and the circulant embedding approach, simulations are conditioned to data by the Kriging method, see Chilès and Delfiner, 1999, sec. 7.3.
condsim
uses the packages parallel and snowfall for
parallelized computations. Three tasks can be executed in parallel:
Computation of (generalized correlations), see
control.pcmp
how to do this.
Computation of Kriging predictions required for conditional
simulations, see section Details of
predict.georob
.
Fast Fourier transform of realizations of standard normal
deviates generated for the nodes of the base matrix (see
Davies and Bryant, 2013, steps 3–5 of algorithm). If there are
nsim
realizations to simulate, the task is split into
ceiling(nsim / ncores)
sub-tasks that are then distributed to
ncores
CPUs. Evidently, ncores = 1
(default) suppresses
parallel execution.
The output generated by condsim
is an object of a “similar”
class as newdata
(data frame,SpatialPointsDataFrame
,
SpatialPixelsDataFrame
,
SpatialGridDataFrame
, SpatialPolygonsDataFrame
).
The data frame or the
data
slot of the Spatial...DataFrame
objects
have the following components:
the coordinates of the prediction points (only present if
newdata
is a data frame).
expct
: optionally the (un)conditional means.
optionally the covariates required for the trend model.
sim.1
, sim.2
, ...: the (un)conditionally
simulated realizations.
The function control.condsim
returns a list with parameters to
steer condsim
, see arguments above.
Andreas Papritz [email protected].
Chilès, J.-P., Delfiner, P. (1999) Geostatistics Modeling Spatial Uncertainty, Wiley, New York, doi:10.1002/9780470316993.
Davies, T. M., Bryant, D. (2013) On circulant embedding for gaussian random fields in R, Journal of Statistical Software, 55, 1–21, doi:10.18637/jss.v055.i09.
Dietrich, C. R., Newsam, G. N. (1993) A fast and exact method for multidimensional gaussian stochastic simulations, Water Resources Research, 9, 2861–2869, doi:10.1029/93WR01070.
Wood, A. T. A., Chan, G. (1994) Simulation of stationary gaussian
processes in , Journal of Computational and Graphcal
Statistics, 3, 409–432, doi:10.2307/1390903.
georobPackage
for a description of the model and a brief summary of the algorithms;
georob
for (robust) fitting of spatial linear models;
georobObject
for a description of the class georob
;
profilelogLik
for computing profiles of Gaussian likelihoods;
plot.georob
for display of RE(ML) variogram estimates;
control.georob
for controlling the behaviour of georob
;
georobModelBuilding
for stepwise building models of class georob
;
cv.georob
for assessing the goodness of a fit by georob
;
georobMethods
for further methods for the class georob
;
predict.georob
for computing robust Kriging predictions;
lgnpp
for unbiased back-transformation of Kriging prediction
of log-transformed data;
sample.variogram
and fit.variogram.model
for robust estimation and modelling of sample variograms.
data(meuse) data(meuse.grid) ## convert to SpatialGridDataFrame meuse.grid.sgdf <- meuse.grid coordinates(meuse.grid.sgdf) <- ~ x + y gridded(meuse.grid.sgdf) <- TRUE fullgrid(meuse.grid.sgdf) <- TRUE ## Gaussian REML fit r.logzn.reml <- georob(log(zinc) ~ sqrt(dist), data = meuse, locations = ~ x + y, variogram.model = "RMexp", param = c(variance = 0.15, nugget = 0.05, scale = 200), tuning.psi = 1000) ## Unconditional simulations using circulant embedding on rectangular ## simulation grid r.sim.1 <- condsim(r.logzn.reml, newdata = meuse.grid.sgdf, nsim = 2, seed = 1, control = control.condsim(use.grid = TRUE, condsim = FALSE)) spplot(r.sim.1, zcol = "sim.1", at = seq(3.5, 8.5, by = 0.5)) ## Conditional simulations using circulant embedding if(interactive()){ ## example is run only in interactive session because cpu times exceeds 5 s r.sim.2 <- condsim(r.logzn.reml, newdata = meuse.grid.sgdf, nsim = 2, seed = 1, control = control.condsim(use.grid = FALSE, condsim = TRUE)) spplot(r.sim.2, zcol = "sim.2", at = seq(3.5, 8.5, by = 0.5)) }
data(meuse) data(meuse.grid) ## convert to SpatialGridDataFrame meuse.grid.sgdf <- meuse.grid coordinates(meuse.grid.sgdf) <- ~ x + y gridded(meuse.grid.sgdf) <- TRUE fullgrid(meuse.grid.sgdf) <- TRUE ## Gaussian REML fit r.logzn.reml <- georob(log(zinc) ~ sqrt(dist), data = meuse, locations = ~ x + y, variogram.model = "RMexp", param = c(variance = 0.15, nugget = 0.05, scale = 200), tuning.psi = 1000) ## Unconditional simulations using circulant embedding on rectangular ## simulation grid r.sim.1 <- condsim(r.logzn.reml, newdata = meuse.grid.sgdf, nsim = 2, seed = 1, control = control.condsim(use.grid = TRUE, condsim = FALSE)) spplot(r.sim.1, zcol = "sim.1", at = seq(3.5, 8.5, by = 0.5)) ## Conditional simulations using circulant embedding if(interactive()){ ## example is run only in interactive session because cpu times exceeds 5 s r.sim.2 <- condsim(r.logzn.reml, newdata = meuse.grid.sgdf, nsim = 2, seed = 1, control = control.condsim(use.grid = FALSE, condsim = TRUE)) spplot(r.sim.2, zcol = "sim.2", at = seq(3.5, 8.5, by = 0.5)) }
The function lgnpp
back-transforms point or block Kriging
predictions of a log-transformed response variable computed by
predict.georob
. Alternatively, the function averages
log-normal point Kriging predictions for a block and approximates the mean
squared prediction error of the block mean.
lgnpp(object, newdata, locations, is.block = FALSE, all.pred = NULL, extended.output = FALSE)
lgnpp(object, newdata, locations, is.block = FALSE, all.pred = NULL, extended.output = FALSE)
object |
an object with Kriging predictions of a log-transformed
response variable as obtained by
|
newdata |
an optional object as passed as argument |
locations |
an optional one-sided formula specifying what variables
of |
is.block |
an optional logical scalar (default |
all.pred |
an optional positive integer or an object as obtained by
|
extended.output |
a logical scalar controlling whether the covariance matrix of the errors of the back-transformed point predictions is added as an attribute to the result, see Details. |
The function lgnpp
performs three tasks:
The usual, marginally unbiased back-transformation for log-normal point Kriging is used:
where and
denote the
log- and back-transformed predictions of the signal,
and
The expressions for the required covariance terms can be found in the
Appendices of Nussbaum et al. (2014). Instead of the signal
, predictions of the
log-transformed response
or the estimated trend
of the log-transformed data can be back-transformed (see
georobPackage
). The
above transformations are used if object
contains point Kriging predictions (see predict.georob
,
Value) and if is.block = FALSE
and all.pred
is
missing.
Block Kriging predictions of a log-transformed response variable are back-transformed by the approximately unbiased transformation proposed by Cressie (2006, Appendix C)
where and
are the log- and
back-transformed predictions of the block mean
, respectively,
is the spatial
covariance matrix of the covariates
within the block where
and
This back-transformation is based on the assumption that both the point data
and the block means
follow log-normal laws, which strictly cannot hold. But
for small blocks the assumption works well as the bias and the loss of
efficiency caused by this assumption are small (Cressie, 2006;
Hofer et al., 2013).
The above formulae are used by lgnpp
if object
contains
block Kriging predictions in the form of a
SpatialPolygonsDataFrame
. To approximate
, one needs the covariates
on a fine grid for the whole study domain in which the blocks lie. The
covariates are passed
lgnpp
as argument newdata
, where
newdata
can be any spatial data frame accepted by
predict.georob
. For evaluating
the geometry of the blocks
is taken from the
polygons
slot of theSpatialPolygonsDataFrame
passed as object
to lgnpp
.
lgnpp
allows as a further option to back-transform and
average point Kriging predictions passed as object
to the
function. One then assumes that the predictions in object
refer
to points that lie in a single block. Hence, one uses the
approximation
to predict the block mean , where
is the number of
points in
. The mean squared prediction error can be approximated by
In most instances, the evaluation of the above double sum is not feasible
because a large number of points is used to discretize the block .
lgnpp
then uses the following approximations to compute the mean
squared error (see also Appendix E of Nussbaum et al., 2014):
Point prediction results are passed as object
to lgnpp
only for a random sample of points in (of size
),
for which the evaluation of the above double sum is feasible.
The prediction results for the complete set of points
within the block are passed as argument all.pred
to
lgnpp
. These results are used to compute .
The mean squared error is then approximated by
The first term of the RHS (and ) can be
computed from the point Kriging results contained in
all.pred
,
and the double sum is evaluated from the full covariance matrices of
the predictions and the respective targets, passed to lgnpp
as
object
(one has to use the arguments
control=control.predict.georob(full.covmat=TRUE)
for
predict.georob
when computing the point Kriging
predictions stored in object
).
If the prediction results are not available for the complete set
of points in then
all.pred
may be equal to . The
block mean is then approximated by
and the first term of the RHS of the expression for the mean squared error by
By drawing samples repeatedly and passing the related Kriging
results as object
to lgnpp
, one can reduce the error of
the approximation of the mean squared error.
If is.block
is FALSE
and all.pred
is equal to
NULL
lgnpp
returns an updated object of the same class as
object
(see section Value of predict.georob
).
The data frame with the point or block Kriging predictions is
complemented by lgnpp
with the following new components:
lgn.pred
: the back-transformed Kriging predictions of a
log-transformed response.
lgn.se
: the standard errors of the
back-transformed predictions.
lgn.lower
, lgn.upper
: the bounds of the
back-transformed prediction intervals.
If is.block
is TRUE
or all.pred
not equal to
NULL
lgnpp
returns a named numeric vector with two
elements:
mean
: the back-transformed block Kriging estimate, see
Details.
se
: the (approximated) block Kriging standard error, see
Details.
If extended.output
is TRUE
then the vector is supplemented
with the attribute mse.lgn.pred
that contains the full covariance
matrix of the back-transformed point prediction errors.
Andreas Papritz [email protected].
Cressie, N. (2006) Block Kriging for Lognormal Spatial Processes. Mathematical Geology, 38, 413–443, doi:10.1007/s11004-005-9022-8.
Hofer, C., Borer, F., Bono, R., Kayser, A. and Papritz, A. 2013. Predicting topsoil heavy metal content of parcels of land: An empirical validation of customary and constrained lognormal block Kriging and conditional simulations. Geoderma, 193–194, 200–212, doi:10.1016/j.geoderma.2012.08.034.
Nussbaum, M., Papritz, A., Baltensweiler, A. and Walthert, L. (2014) Estimating soil organic carbon stocks of Swiss forest soils by robust external-drift kriging. Geoscientific Model Development, 7, 1197–1210. doi:10.5194/gmd-7-1197-2014.
georobPackage
for a description of the model and a brief summary of the algorithms;
georob
for (robust) fitting of spatial linear models;
predict.georob
for computing robust Kriging predictions.
data(meuse) data(meuse.grid) coordinates(meuse.grid) <- ~x+y meuse.grid.pixdf <- meuse.grid gridded(meuse.grid.pixdf) <- TRUE data(meuse.blocks, package = "constrainedKriging") r.logzn.rob <- georob(log(zinc) ~ sqrt(dist), data = meuse, locations = ~ x + y, variogram.model = "RMexp", param = c(variance = 0.15, nugget = 0.05, scale = 200), tuning.psi = 1., control = control.georob(cov.bhat = TRUE, full.cov.bhat = TRUE)) ## point predictions of log(Zn) r.pred.points.1 <- predict(r.logzn.rob, newdata = meuse.grid.pixdf, control = control.predict.georob(extended.output = TRUE)) str(r.pred.points.1, max = 3) ## back-transformation of point predictions r.backtf.pred.points <- lgnpp(r.pred.points.1) str(r.backtf.pred.points, max = 3) spplot(r.backtf.pred.points, zcol = "lgn.pred", main = "Zn content") ## predicting mean Zn content for whole area if(interactive()){ ## example is run only in interactive session because cpu times exceeds 5 s ## recompute point predictions with argument full.covmat = TRUE r.pred.points.2 <- predict(r.logzn.rob, newdata = meuse.grid.pixdf, control = control.predict.georob(extended.output = TRUE, full.covmat = TRUE)) str(r.pred.points.2, max = 3) r.block <- lgnpp(r.pred.points.2, is.block = TRUE, all.pred = r.backtf.pred.points@data) r.block } ## block predictions of log(Zn) if(interactive()){ ## example is run only in interactive session because cpu times exceeds 5 s r.pred.block <- predict(r.logzn.rob, newdata = meuse.blocks, control = control.predict.georob(extended.output = TRUE, pwidth = 75, pheight = 75, mmax = 50)) r.backtf.pred.block <- lgnpp(r.pred.block, newdata = meuse.grid) spplot(r.backtf.pred.block, zcol = "lgn.pred", main = "block means Zn content") }
data(meuse) data(meuse.grid) coordinates(meuse.grid) <- ~x+y meuse.grid.pixdf <- meuse.grid gridded(meuse.grid.pixdf) <- TRUE data(meuse.blocks, package = "constrainedKriging") r.logzn.rob <- georob(log(zinc) ~ sqrt(dist), data = meuse, locations = ~ x + y, variogram.model = "RMexp", param = c(variance = 0.15, nugget = 0.05, scale = 200), tuning.psi = 1., control = control.georob(cov.bhat = TRUE, full.cov.bhat = TRUE)) ## point predictions of log(Zn) r.pred.points.1 <- predict(r.logzn.rob, newdata = meuse.grid.pixdf, control = control.predict.georob(extended.output = TRUE)) str(r.pred.points.1, max = 3) ## back-transformation of point predictions r.backtf.pred.points <- lgnpp(r.pred.points.1) str(r.backtf.pred.points, max = 3) spplot(r.backtf.pred.points, zcol = "lgn.pred", main = "Zn content") ## predicting mean Zn content for whole area if(interactive()){ ## example is run only in interactive session because cpu times exceeds 5 s ## recompute point predictions with argument full.covmat = TRUE r.pred.points.2 <- predict(r.logzn.rob, newdata = meuse.grid.pixdf, control = control.predict.georob(extended.output = TRUE, full.covmat = TRUE)) str(r.pred.points.2, max = 3) r.block <- lgnpp(r.pred.points.2, is.block = TRUE, all.pred = r.backtf.pred.points@data) r.block } ## block predictions of log(Zn) if(interactive()){ ## example is run only in interactive session because cpu times exceeds 5 s r.pred.block <- predict(r.logzn.rob, newdata = meuse.blocks, control = control.predict.georob(extended.output = TRUE, pwidth = 75, pheight = 75, mmax = 50)) r.backtf.pred.block <- lgnpp(r.pred.block, newdata = meuse.grid) spplot(r.backtf.pred.block, zcol = "lgn.pred", main = "block means Zn content") }
Auxiliary functions to query names and permissible ranges of variogram parameters.
param.names(model) param.bounds(model, d)
param.names(model) param.bounds(model, d)
model |
a character keyword denoting a valid variogram,
see |
d |
a positive integer with the number of dimensions of the survey domain. |
Either a character vector with the names of the additional variogram
parameters such as the smoothness parameter of the
Whittle-Matérn model (param.names
) or a named list
with the lower and upper bounds of permissible parameter ranges.
Andreas Papritz [email protected].
georobPackage
for a description of the model and a brief summary of the algorithms;
georob
for (robust) fitting of spatial linear models.
param.names("RMgengneiting") param.bounds("RMgengneiting", d = 2)
param.names("RMgengneiting") param.bounds("RMgengneiting", d = 2)
georob
The plot
and lines
methods for class
georob
plot the variogram model, estimated by (robust) restricted
maximum likelihood.
plot.georob
computes and plots in addition the
sample variogram of the (robust) regression residuals and can be used to
generate residual diagnostics plots (Tukey-Anscombe plot, normal QQ plots
of residuals and random effects).
## S3 method for class 'georob' plot(x, what = c( "variogram", "covariance", "correlation", "ta", "sl", "qq.res", "qq.ranef" ), add = FALSE, lag.dist.def, xy.angle.def = c(0, 180), xz.angle.def = c(0, 180), max.lag = Inf, estimator = c("mad", "qn", "ch", "matheron"), mean.angle = TRUE, level = what != "ta", smooth = what == "ta" || what == "sl", id.n = 3, labels.id = names(residuals(x)), cex.id = 0.75, label.pos = c(4,2), col, pch, xlab, ylab, main, lty = "solid", ...) ## S3 method for class 'georob' lines(x, what = c("variogram", "covariance", "correlation"), from = 1.e-6, to, n = 501, xy.angle = 90, xz.angle = 90, col = 1:length(xy.angle), pch = 1:length(xz.angle), lty = "solid", ...)
## S3 method for class 'georob' plot(x, what = c( "variogram", "covariance", "correlation", "ta", "sl", "qq.res", "qq.ranef" ), add = FALSE, lag.dist.def, xy.angle.def = c(0, 180), xz.angle.def = c(0, 180), max.lag = Inf, estimator = c("mad", "qn", "ch", "matheron"), mean.angle = TRUE, level = what != "ta", smooth = what == "ta" || what == "sl", id.n = 3, labels.id = names(residuals(x)), cex.id = 0.75, label.pos = c(4,2), col, pch, xlab, ylab, main, lty = "solid", ...) ## S3 method for class 'georob' lines(x, what = c("variogram", "covariance", "correlation"), from = 1.e-6, to, n = 501, xy.angle = 90, xz.angle = 90, col = 1:length(xy.angle), pch = 1:length(xz.angle), lty = "solid", ...)
x |
an object of class |
what |
a character keyword for the quantity that should be displayed. Possible values are:
|
add |
a logical scalar controlling whether a new plot should be
generated ( |
lag.dist.def |
an optional numeric scalar defining a constant bin
width for grouping the lag distances or an optional numeric vector with
the upper bounds of a set of contiguous bins for computing the sample
variogram of the regression residuals, see
|
xy.angle.def |
an numeric vector defining angular classes in the
horizontal plane for computing directional variograms.
|
xz.angle.def |
an numeric vector defining angular classes in the
|
max.lag |
a positive numeric defining the largest lag distance for which semi-variances should be computed (default no restriction). |
estimator |
a character keyword defining the estimator for computing the sample variogram. Possible values are:
|
mean.angle |
a logical scalar controlling whether the mean lag
vector (per combination of lag distance and angular class) is computed
from the mean angles of all the lag vectors falling into a given class
( |
level |
an integer giving the level for extracting the residuals
from |
smooth |
a logical scalar controlling whether a
|
id.n |
a numeric with the number of points to be labelled in each
plot, starting with the most extreme (see
|
labels.id |
a vector of labels, from which the labels for extreme
points will be chosen (see |
cex.id |
a numeric with the magnification of point labels (see
|
label.pos |
a numeric for positioning of labels, for the left half
and right half of the graph respectively (see
|
from |
a numeric with the minimal lag distance for plotting variogram models. |
to |
a numeric with the maximum lag distance for plotting variogram models (default: largest lag distance of current plot). |
n |
a positive integer specifying the number of equally spaced lag
distances for which semi-variances are evaluated in plotting variogram
models (default |
xy.angle |
a numeric (vector) with azimuth angles (in degrees,
clockwise positive from north) in |
xz.angle |
a numeric (vector) with angles in |
col |
an optional vector with colours of points and curves to
distinguish items relating to different azimuth angles in
|
pch |
an optional vector with symbols for points and curves to
distinguish items relating to different azimuth angles in
|
lty |
line type for plotting variogram models. |
xlab , ylab , main
|
plot annotation, see
|
... |
additional arguments passed to
|
The method plot.georob
returns no value, it is called for its side
effects.
The method lines.georob
is called for its side effects and returns
the value NULL
invisibly.
Andreas Papritz [email protected].
georobPackage
for a description of the model and a brief summary of the algorithms;
georob
for (robust) fitting of spatial linear models;
georobObject
for a description of the class georob
;
profilelogLik
for computing profiles of Gaussian likelihoods;
control.georob
for controlling the behaviour of georob
;
georobModelBuilding
for stepwise building models of class georob
;
cv.georob
for assessing the goodness of a fit by georob
;
georobMethods
for further methods for the class georob
;
predict.georob
for computing robust Kriging predictions;
lgnpp
for unbiased back-transformation of Kriging prediction
of log-transformed data;
georobSimulation
for simulating realizations of a Gaussian process
from model fitted by georob
; and finally
sample.variogram
and fit.variogram.model
for robust estimation and modelling of sample variograms.
################ ## meuse data ## ################ data(meuse) ## Gaussian REML fit r.logzn.reml <- georob(log(zinc) ~ sqrt(dist), data = meuse, locations = ~ x + y, variogram.model = "RMexp", param = c(variance = 0.15, nugget = 0.05, scale = 200), tuning.psi = 1000) summary(r.logzn.reml, correlation = TRUE) plot(r.logzn.reml, lag.dist.def = seq(0, 2000, by = 100)) ## robust REML fit if(interactive()){ ## example is run only in interactive session because cpu times exceeds 5 s r.logzn.rob <- update(r.logzn.reml, tuning.psi = 1) summary(r.logzn.rob, correlation = TRUE) lines(r.logzn.rob, col = "red") }
################ ## meuse data ## ################ data(meuse) ## Gaussian REML fit r.logzn.reml <- georob(log(zinc) ~ sqrt(dist), data = meuse, locations = ~ x + y, variogram.model = "RMexp", param = c(variance = 0.15, nugget = 0.05, scale = 200), tuning.psi = 1000) summary(r.logzn.reml, correlation = TRUE) plot(r.logzn.reml, lag.dist.def = seq(0, 2000, by = 100)) ## robust REML fit if(interactive()){ ## example is run only in interactive session because cpu times exceeds 5 s r.logzn.rob <- update(r.logzn.reml, tuning.psi = 1) summary(r.logzn.rob, correlation = TRUE) lines(r.logzn.rob, col = "red") }
This page documents the function pmm
for parallelized matrix
multiplication and the functioncontrol.pcmp
, which controls
the behaviour of pmm
and other functions that execute tasks in
parallel.
pmm(A, B, control = control.pcmp()) control.pcmp(pmm.ncores = 1, gcr.ncores = 1, max.ncores = parallel::detectCores(), f = 1, sfstop = FALSE, allow.recursive = FALSE, fork = !identical(.Platform[["OS.type"]], "windows"), ...)
pmm(A, B, control = control.pcmp()) control.pcmp(pmm.ncores = 1, gcr.ncores = 1, max.ncores = parallel::detectCores(), f = 1, sfstop = FALSE, allow.recursive = FALSE, fork = !identical(.Platform[["OS.type"]], "windows"), ...)
A , B
|
two numeric matrices to be multiplied. |
control |
a list with the arguments |
pmm.ncores |
an integer (default 1) with the number of cores used for parallelized matrix multiplication. |
gcr.ncores |
an integer (default 1) with the number of cores used for parallelized computation of (generalized) covariances or semi-variances. |
max.ncores |
maximum number of cores (integer, default all cores of a machine) used for parallelized computations. |
f |
an integer (default 1) with the number of tasks assigned to each core in parallelized operations. |
sfstop |
a logical scalar controlling whether the SNOW socket
cluster is stopped after each parallelized matrix multiplication on
windows OS (default |
allow.recursive |
a logical scalar controlling whether parallelized
matrix multiplicaction and computation of generalized) covariances should
be allowed by child processes running already in parallel (default
|
fork |
a logical scalar controlling whether forking should be used for
parallel computations (default |
... |
further arguments, currently not used. |
Parallelized matrix multiplication shortens computing time for large data
sets (). However, spawning child processes requires itself
resources and increasing the number of cores for parallelized matrix
multiplication and parallelized computation of covariances does not always
result in reduced computing time. Furthermore, these operations may be
initiated by child processes, that were itself spawned by functions like
cv.georob
, predict.georob
,
profilelogLik
, add1.georob
,
drop1.georob
and step.georob
. By default,
parallelized matrix multiplication and computation of covariances is then
suppressed to avoid that child processes itself spawn child processes. To
allow parallelized matrix multipliation and parallelized computation of
covariances by child processes one has to use the argument
allow.recursive = TRUE
.
Note that very substantial reductions in computing time results when one
uses the OpenBLAS library instead of the reference BLAS library
that ships with R, see
https://www.openblas.net/ and R FAQ for your OS. With OpenBLAS no
gains are obtained by using more than one core for matrix
multiplication, and one should therefore use the default arguments
pmm.ncores = 1
for control.pcmp()
.
max.ncores
controls how many child processes are spawned in total.
This can be used to prevent that child processes spawn
themselves children which may result in a considerable number of child
processes.
pmm
:the matrix product A %*% B
,
control.pcmp
:a list with components
pmm.ncores
, gcr.ncores
, max.ncores
, f
,
sfstop
,allow.recursive
.
Andreas Papritz [email protected].
georobPackage
for a description of the model and a brief summary of the algorithms;
georob
for (robust) fitting of spatial linear models.
if(interactive()){ ## example is run only in interactive session because cpu times exceeds 5 s A <- as.matrix(dist(rnorm(2000))) B <- as.matrix(dist(rnorm(2000))) system.time(C <- A %*% B) system.time(C <- pmm( A, B, control = control.pcmp(pmm.ncores = 2L))) }
if(interactive()){ ## example is run only in interactive session because cpu times exceeds 5 s A <- as.matrix(dist(rnorm(2000))) B <- as.matrix(dist(rnorm(2000))) system.time(C <- A %*% B) system.time(C <- pmm( A, B, control = control.pcmp(pmm.ncores = 2L))) }
Robust and customary external drift Kriging prediction
based on a spatial linear models fitted by georob
. The
predict
method for the class georob
computes fitted values, point
and block Kriging predictions as
well as model terms for display by termplot
.
## S3 method for class 'georob' predict(object, newdata, type = c("signal", "response", "trend", "terms"), terms = NULL, se.fit = TRUE, signif = 0.95, locations, variogram.model = NULL, param = NULL, aniso = NULL, variogram.object = NULL, control = control.predict.georob(), verbose = 0, ...) control.predict.georob(full.covmat = FALSE, extended.output = FALSE, mmax = 10000, ncores = pcmp[["max.ncores"]], pwidth = NULL, pheight = NULL, napp = 1, pcmp = control.pcmp())
## S3 method for class 'georob' predict(object, newdata, type = c("signal", "response", "trend", "terms"), terms = NULL, se.fit = TRUE, signif = 0.95, locations, variogram.model = NULL, param = NULL, aniso = NULL, variogram.object = NULL, control = control.predict.georob(), verbose = 0, ...) control.predict.georob(full.covmat = FALSE, extended.output = FALSE, mmax = 10000, ncores = pcmp[["max.ncores"]], pwidth = NULL, pheight = NULL, napp = 1, pcmp = control.pcmp())
object |
an object of class |
newdata |
an optional data frame,
|
type |
a character keyword defining what target quantity should be predicted (computed). Possible values are
|
terms |
If |
se.fit |
a logical scalar, only used if |
signif |
a positive numeric scalar equal to the tolerance or confidence level
for computing respective intervals. If |
locations |
an optional one-sided formula specifying what variables
of |
variogram.model |
an optional character keyword defining the
variogram model to be used for Kriging, see |
param |
an optional named numeric vector with values of the
variogram parameters used for Kriging, see |
aniso |
an optional named numeric vector with values of anisotropy
parameters of a variogram used for Kriging, see |
variogram.object |
an optional list that defines a possibly nested
variogram model used for Kriging, see |
control |
a list with the components |
full.covmat |
a logical scalar controlling whether the full
covariance matrix of the prediction errors is returned ( |
extended.output |
a logical scalar controlling whether the covariance
matrices of the Kriging predictions and of the data should be computed, see
Details (default |
mmax |
a positive integer equal to the maximum number (default
|
ncores |
a positive integer controlling how many cores are used for parallelized computations, see Details. |
pwidth , pheight , napp
|
numeric scalars, used to tune numeric
integration of semi-variances for block Kriging, see
|
pcmp |
a list of arguments passed to |
verbose |
a positive integer controlling logging of diagnostic
messages to the console. |
... |
arguments passed to |
If newdata
is an object of class SpatialPoints
,
SpatialPixels
or SpatialGrid
then the drift model may only
use the coordinates as covariates (universal Kriging). Furthermore the
names used for the coordinates in newdata
must be the same as in
data
when creating object
(argument locations
of
predict.georob
should not be used). Note that the result returned
by predict.georob
is then an object of class
SpatialPointsDataFrame
, SpatialPixelsDataFrame
or
SpatialGridDataFrame
.
The predict
method for class georob
uses the packages
parallel and snowfall for parallelized
computation of Kriging predictions. If there are items to
predict, the task is split into
ceiling(m/mmax)
sub-tasks that are
then distributed to ncores
CPUs. Evidently, ncores = 1
suppresses parallel execution. By default, the function uses all
available CPUs as returned by detectCores
.
Note that if full.covmat
is TRUE
mmax
must exceed
(and parallel execution is not possible).
The argument extended.output = TRUE
is used to compute all
quantities required for (approximately) unbiased back-transformation of
Kriging predictions of log-transformed data to the original scale of the
measurements by lgnpp
. In more detail, the following items
are computed:
trend
: the fitted values,
,
var.pred
: the variances of the Kriging predictions,
or
,
cov.pred.target
: the covariances between the predictions and the
prediction targets, or
,
var.target
: the variances of the prediction targets
or
.
Note that the component var.pred
is also present if
type
is equal to "trend"
, irrespective of the choice for extended.output
.
This component contains then the variances of the fitted values.
The method predict.georob
returns, depending on its arguments, the
following objects:
If type
is equal to "terms"
then a vector, a matrix, or a
list with prediction results along with bounds and standard errors, see
predict.lm
. Otherwise, the structure and contents
of the output generated by predict.georob
are determined by the
class of newdata
and the logical flags full.covmat
and
extended.output
:
If full.covmat
is FALSE
then the result is an object of a "similar"
class as newdata
(data frame,
SpatialPointsDataFrame
,
SpatialPixelsDataFrame
SpatialGridDataFrame
, SpatialPolygonsDataFrame
).
The data frame or the
data
slot of the Spatial...DataFrame
objects
have the following components:
the coordinates of the prediction points (only present if
newdata
is a data frame).
pred
: the Kriging predictions (or fitted values).
se
: the root mean squared prediction errors (Kriging
standard errors).
lower
, upper
: the limits of tolerance/confidence
intervals,
trend
, var.pred
, cov.pred.target
,
var.target
: only present if extended.output
is TRUE
,
see Details.
If full.covmat
is TRUE
then predict.georob
returns a list
with the following components:
pred
: a data frame or a Spatial...DataFrame
object
as described above forfull.covmat = FALSE
.
mse.pred
: the full covariance matrix of the prediction errors,
or
see Details.
var.pred
: the full covariance matrix of the
Kriging predictions, see Details.
cov.pred.target
: the full covariance matrix of the
predictions and the prediction targets, see Details.
var.target
: the full covariance matrix of the
prediction targets, see Details.
The function control.predict.georob
returns a list with control
parameters to steer predict.georob
, see arguments of the
function above for its components.
Andreas Papritz [email protected].
Nussbaum, M., Papritz, A., Baltensweiler, A. and Walthert, L. (2014) Estimating soil organic carbon stocks of Swiss forest soils by robust external-drift kriging. Geoscientific Model Development, 7, 1197–1210. doi:10.5194/gmd-7-1197-2014.
Künsch, H. R., Papritz, A., Schwierz, C. and Stahel, W. A. (2011) Robust estimation of the external drift and the variogram of spatial data. Proceedings of the ISI 58th World Statistics Congress of the International Statistical Institute. doi:10.3929/ethz-a-009900710
georobPackage
for a description of the model and a brief summary of the algorithms;
georob
for (robust) fitting of spatial linear models;
georobObject
for a description of the class georob
;
profilelogLik
for computing profiles of Gaussian likelihoods;
plot.georob
for display of RE(ML) variogram estimates;
control.georob
for controlling the behaviour of georob
;
georobModelBuilding
for stepwise building models of class georob
;
cv.georob
for assessing the goodness of a fit by georob
;
georobMethods
for further methods for the class georob
;
lgnpp
for unbiased back-transformation of Kriging prediction
of log-transformed data;
georobSimulation
for simulating realizations of a Gaussian process
from model fitted by georob
; and finally
sample.variogram
and fit.variogram.model
for robust estimation and modelling of sample variograms.
data(meuse) data(meuse.grid) coordinates(meuse.grid) <- ~x+y meuse.grid.pixdf <- meuse.grid gridded(meuse.grid.pixdf) <- TRUE data(meuse.blocks, package = "constrainedKriging") r.logzn.rob <- georob(log(zinc) ~ sqrt(dist), data = meuse, locations = ~ x + y, variogram.model = "RMexp", param = c(variance = 0.15, nugget = 0.05, scale = 200), tuning.psi = 1., control = control.georob(cov.bhat = TRUE, full.cov.bhat = TRUE)) ## point predictions of log(Zn) r.pred.points.1 <- predict(r.logzn.rob, newdata = meuse.grid.pixdf, control = control.predict.georob(extended.output = TRUE)) str(r.pred.points.1, max = 3) ## back-transformation of point predictions r.backtf.pred.points <- lgnpp(r.pred.points.1) str(r.backtf.pred.points, max = 3) spplot(r.backtf.pred.points, zcol = "lgn.pred", main = "Zn content") ## predicting mean Zn content for whole area if(interactive()){ ## example is run only in interactive session because cpu times exceeds 5 s ## recompute point predictions with argument full.covmat = TRUE r.pred.points.2 <- predict(r.logzn.rob, newdata = meuse.grid.pixdf, control = control.predict.georob(extended.output = TRUE, full.covmat = TRUE)) str(r.pred.points.2, max = 3) r.block <- lgnpp(r.pred.points.2, is.block = TRUE, all.pred = r.backtf.pred.points@data) r.block } ## block predictions of log(Zn) if(interactive()){ ## example is run only in interactive session because cpu times exceeds 5 s r.pred.block <- predict(r.logzn.rob, newdata = meuse.blocks, control = control.predict.georob(extended.output = TRUE, pwidth = 75, pheight = 75, mmax = 50)) r.backtf.pred.block <- lgnpp(r.pred.block, newdata = meuse.grid) spplot(r.backtf.pred.block, zcol = "lgn.pred", main = "block means Zn content") }
data(meuse) data(meuse.grid) coordinates(meuse.grid) <- ~x+y meuse.grid.pixdf <- meuse.grid gridded(meuse.grid.pixdf) <- TRUE data(meuse.blocks, package = "constrainedKriging") r.logzn.rob <- georob(log(zinc) ~ sqrt(dist), data = meuse, locations = ~ x + y, variogram.model = "RMexp", param = c(variance = 0.15, nugget = 0.05, scale = 200), tuning.psi = 1., control = control.georob(cov.bhat = TRUE, full.cov.bhat = TRUE)) ## point predictions of log(Zn) r.pred.points.1 <- predict(r.logzn.rob, newdata = meuse.grid.pixdf, control = control.predict.georob(extended.output = TRUE)) str(r.pred.points.1, max = 3) ## back-transformation of point predictions r.backtf.pred.points <- lgnpp(r.pred.points.1) str(r.backtf.pred.points, max = 3) spplot(r.backtf.pred.points, zcol = "lgn.pred", main = "Zn content") ## predicting mean Zn content for whole area if(interactive()){ ## example is run only in interactive session because cpu times exceeds 5 s ## recompute point predictions with argument full.covmat = TRUE r.pred.points.2 <- predict(r.logzn.rob, newdata = meuse.grid.pixdf, control = control.predict.georob(extended.output = TRUE, full.covmat = TRUE)) str(r.pred.points.2, max = 3) r.block <- lgnpp(r.pred.points.2, is.block = TRUE, all.pred = r.backtf.pred.points@data) r.block } ## block predictions of log(Zn) if(interactive()){ ## example is run only in interactive session because cpu times exceeds 5 s r.pred.block <- predict(r.logzn.rob, newdata = meuse.blocks, control = control.predict.georob(extended.output = TRUE, pwidth = 75, pheight = 75, mmax = 50)) r.backtf.pred.block <- lgnpp(r.pred.block, newdata = meuse.grid) spplot(r.backtf.pred.block, zcol = "lgn.pred", main = "block means Zn content") }
The function profilelogLik
computes for an array of fixed
variogram parameters the profile log-likelihood by maximizing the
(restricted) log-likelihood with respect to the remaining variogram
parameters, the fixed and random effects.
profilelogLik(object, values, use.fitted = TRUE, verbose = 0, ncores = min(parallel::detectCores(), NROW(values)))
profilelogLik(object, values, use.fitted = TRUE, verbose = 0, ncores = min(parallel::detectCores(), NROW(values)))
object |
an object of class |
values |
a |
use.fitted |
a logical scalar controlling whether the fitted variogram
parameters of |
verbose |
a positive integer controlling logging of diagnostic
messages to the console during model fitting, see |
ncores |
a positive integer controlling how many cores are used for parallelized computations, see Details. |
For robust REML fits profilelogLik
returns (possibly with a
warning) the log-likelihood of the Gaussian (RE)ML fit of the equivalent
Gaussian spatial linear model with heteroscedastic nugget.
Note that the data frame passed as data
argument to georob
must exist in the user workspace
when calling profilelogLik
.
profilelogLik
uses the packages parallel and
snowfall for parallelized computation of the profile
likelihood. By default, the function uses NROW(values)
CPUs but
not more than are physically available (as returned by
detectCores
).
profilelogLik
uses the function update
to
re-estimated the model with partly fixed variogram parameters.
Therefore, any argument accepted by georob
except
data
can be changed when re-fitting the model. Some of them (e.g.
verbose
) are explicit arguments of
profilelogLik
, but also the remaining ones can be passed by
...
to the function.
A data.frame
with the columns of values
, a column
loglik
(contains the maximized [restricted] log-likelihood),
columns with the estimated variogram and fixed effect parameters, columns
with the gradient of the (restricted) log-likelihood (or the roots of the
estimating equations) and a column converged
, indicating whether
convergence has occurred when fitting the respective model.
Andreas Papritz [email protected].
georobPackage
for a description of the model and a brief summary of the algorithms;
georob
for (robust) fitting of spatial linear models;
georobObject
for a description of the class georob
;
plot.georob
for display of RE(ML) variogram estimates;
control.georob
for controlling the behaviour of georob
;
georobModelBuilding
for stepwise building models of class georob
;
cv.georob
for assessing the goodness of a fit by georob
;
georobMethods
for further methods for the class georob
;
predict.georob
for computing robust Kriging predictions;
lgnpp
for unbiased back-transformation of Kriging prediction
of log-transformed data;
georobSimulation
for simulating realizations of a Gaussian process
from model fitted by georob
; and finally
sample.variogram
and fit.variogram.model
for robust estimation and modelling of sample variograms.
## define number of cores for parallel computations if(interactive()) ncpu <- 2L else ncpu <- 1L data(meuse) r.logzn.ml <- georob(log(zinc)~sqrt(dist), data=meuse, locations=~x+y, variogram.model="RMexp", param=c(variance=0.15, nugget=0.05, scale=200), tuning.psi=1000, control=control.georob(ml.method="ML")) if(interactive()){ ## example is run only in interactive session because cpu times exceeds 5 s r.prflik <- profilelogLik(r.logzn.ml, values=expand.grid(scale=seq(75, 600, by=25)), ncores = ncpu) plot(loglik~scale, r.prflik, type="l") abline(v=r.logzn.ml$variogram.object[[1]]$param["scale"], lty="dotted") abline(h=r.logzn.ml$loglik-0.5*qchisq(0.95, 1), lty="dotted") }
## define number of cores for parallel computations if(interactive()) ncpu <- 2L else ncpu <- 1L data(meuse) r.logzn.ml <- georob(log(zinc)~sqrt(dist), data=meuse, locations=~x+y, variogram.model="RMexp", param=c(variance=0.15, nugget=0.05, scale=200), tuning.psi=1000, control=control.georob(ml.method="ML")) if(interactive()){ ## example is run only in interactive session because cpu times exceeds 5 s r.prflik <- profilelogLik(r.logzn.ml, values=expand.grid(scale=seq(75, 600, by=25)), ncores = ncpu) plot(loglik~scale, r.prflik, type="l") abline(v=r.logzn.ml$variogram.object[[1]]$param["scale"], lty="dotted") abline(h=r.logzn.ml$loglik-0.5*qchisq(0.95, 1), lty="dotted") }
The imported functions K
,
lmrob.control
,
and waldtest
are re-exported for ease of use without
attaching the respective packages.
K(dist, model) lmrob.control(setting, seed = NULL, nResample = 500, tuning.chi = NULL, bb = 0.5, tuning.psi = NULL, max.it = 50, groups = 5, n.group = 400, k.fast.s = 1, best.r.s = 2, k.max = 200, maxit.scale = 200, k.m_s = 20, refine.tol = 1e-7, rel.tol = 1e-7, scale.tol = 1e-10, solve.tol = 1e-7, zero.tol = 1e-10, trace.lev = 0, mts = 1000, subsampling = c("nonsingular", "simple"), compute.rd = FALSE, method = "MM", psi = "bisquare", numpoints = 10, cov = NULL, split.type = c("f", "fi", "fii"), fast.s.large.n = 2000, # only for outlierStats() : eps.outlier = function(nobs) 0.1 / nobs, eps.x = function(maxx) .Machine$double.eps^(.75)*maxx, compute.outlier.stats = method, warn.limit.reject = 0.5, warn.limit.meanrw = 0.5, ...)
K(dist, model) lmrob.control(setting, seed = NULL, nResample = 500, tuning.chi = NULL, bb = 0.5, tuning.psi = NULL, max.it = 50, groups = 5, n.group = 400, k.fast.s = 1, best.r.s = 2, k.max = 200, maxit.scale = 200, k.m_s = 20, refine.tol = 1e-7, rel.tol = 1e-7, scale.tol = 1e-10, solve.tol = 1e-7, zero.tol = 1e-10, trace.lev = 0, mts = 1000, subsampling = c("nonsingular", "simple"), compute.rd = FALSE, method = "MM", psi = "bisquare", numpoints = 10, cov = NULL, split.type = c("f", "fi", "fii"), fast.s.large.n = 2000, # only for outlierStats() : eps.outlier = function(nobs) 0.1 / nobs, eps.x = function(maxx) .Machine$double.eps^(.75)*maxx, compute.outlier.stats = method, warn.limit.reject = 0.5, warn.limit.meanrw = 0.5, ...)
dist |
a numeric vector with distances. |
model |
an object of class “ |
setting |
a string specifying alternative default values, see
|
seed |
|
nResample |
number of re-sampling candidates to be used to find the
initial S-estimator, see |
tuning.chi |
tuning constant vector for the S-estimator, see
|
bb |
expected value under the normal model of the “chi”, see
|
tuning.psi |
tuning constant vector for the redescending
M-estimator, see |
max.it |
integer specifying the maximum number of IRWLS iterations,
see |
groups |
(for the fast-S algorithm): Number of random subsets to use
when the data set is large, see |
n.group |
(for the fast-S algorithm): Size of each of the
|
k.fast.s |
(for the fast-S algorithm): Number of local improvement
steps (“I-steps”) for each re-sampling candidate, see
|
best.r.s |
(for the fast-S algorithm): Number of of best candidates
to be iterated further, see |
k.max |
(for the fast-S algorithm): maximal number of refinement
steps for the “fully” iterated best candidates, see
|
maxit.scale |
integer specifying the maximum number of C level
|
k.m_s |
(for the M-S algorithm): specifies after how many
unsuccessful refinement steps the algorithm stops, see
|
refine.tol |
(for the fast-S algorithm): relative convergence
tolerance for the fully iterated best candidates, see
|
rel.tol |
(for the RWLS iterations of the MM algorithm): relative
convergence tolerance for the parameter vector, see
|
scale.tol |
(for the scale estimation iterations of the S
algorithm): relative convergence tolerance for the |
solve.tol |
(for the S algorithm): relative tolerance for inversionsee
|
zero.tol |
for checking 0-residuals in the S algorithm, non-negative
number, see |
trace.lev |
integer indicating if the progress of the MM-algorithm
and the fast-S algorithms should be traced, see
|
mts |
maximum number of samples to try in subsampling algorithm, see
|
subsampling |
type of subsampling to be used, see
|
compute.rd |
a logical scalar indicating if robust distances (based on the
MCD robust covariance estimator) are to be computed for the robust
diagnostic plots, see |
method |
string specifying the estimator-chain, see
|
psi |
string specifying the type |
numpoints |
number of points used in Gauss quadrature, see
|
cov |
function or string with function name to be used to calculate
covariance matrix estimate, see |
split.type |
determines how categorical and continuous variables are
split, see |
fast.s.large.n |
minimum number of observations required to switch
from ordinary “fast S” algorithm to an efficient “large n”
strategy, see |
eps.outlier |
limit on the robustness weight below which an
observation is considered to be an outlier, see
|
eps.x |
limit on the absolute value of the elements of the design
matrix below which an element is considered zero, see
|
compute.outlier.stats |
vector of character strings, each valid to
be used as |
warn.limit.reject |
see |
warn.limit.meanrw |
limit of the mean robustness per factor level
below which ( |
... |
some methods for the generic function
|
The function K
is required for
computing block Kriging predictions by the function
f.point.block.cov
of the package
constrainedKriging.
Furthermore, the function lmrob.control
allows
to pass tuning parameters to the function lmrob
of the package robustbase, which is used for computing robust
initial values of the regression coefficients.
See help pages of K
and
lmrob.control
for the output generated by these
functions.
The function sample.variogram
computes the
sample (empirical) variogram of a spatial variable by the method-of-moment
and three robust estimators. Both omnidirectional and direction-dependent
variograms can be computed, the latter for observation locations in a
three-dimensional domain. There are summary
and plot
methods for summarizing and displaying sample variograms.
sample.variogram(object, ...) ## Default S3 method: sample.variogram(object, locations, lag.dist.def, xy.angle.def = c(0, 180), xz.angle.def = c(0, 180), max.lag = Inf, estimator = c("qn", "mad", "matheron", "ch"), mean.angle = TRUE, ...) ## S3 method for class 'formula' sample.variogram(object, data, subset, na.action, locations, lag.dist.def, xy.angle.def = c(0, 180), xz.angle.def = c(0, 180), max.lag = Inf, estimator = c("qn", "mad", "matheron", "ch"), mean.angle = TRUE, ...) ## S3 method for class 'georob' sample.variogram(object, lag.dist.def, xy.angle.def = c(0, 180), xz.angle.def = c(0, 180), max.lag = Inf, estimator = c("qn", "mad", "matheron", "ch"), mean.angle = TRUE, ...) ## S3 method for class 'sample.variogram' summary(object, ...) ## S3 method for class 'sample.variogram' plot(x, type = "p", add = FALSE, xlim = c(0, max(x[["lag.dist"]])), ylim = c(0, 1.1 * max(x[["gamma"]])), col, pch, lty, cex = 0.8, xlab = "lag distance", ylab = "semivariance", annotate.npairs = FALSE, npairs.pos = 3, npairs.cex = 0.7, legend = nlevels(x[["xy.angle"]]) > 1 || nlevels(x[["xz.angle"]]) > 1, legend.pos = "topleft", ...)
sample.variogram(object, ...) ## Default S3 method: sample.variogram(object, locations, lag.dist.def, xy.angle.def = c(0, 180), xz.angle.def = c(0, 180), max.lag = Inf, estimator = c("qn", "mad", "matheron", "ch"), mean.angle = TRUE, ...) ## S3 method for class 'formula' sample.variogram(object, data, subset, na.action, locations, lag.dist.def, xy.angle.def = c(0, 180), xz.angle.def = c(0, 180), max.lag = Inf, estimator = c("qn", "mad", "matheron", "ch"), mean.angle = TRUE, ...) ## S3 method for class 'georob' sample.variogram(object, lag.dist.def, xy.angle.def = c(0, 180), xz.angle.def = c(0, 180), max.lag = Inf, estimator = c("qn", "mad", "matheron", "ch"), mean.angle = TRUE, ...) ## S3 method for class 'sample.variogram' summary(object, ...) ## S3 method for class 'sample.variogram' plot(x, type = "p", add = FALSE, xlim = c(0, max(x[["lag.dist"]])), ylim = c(0, 1.1 * max(x[["gamma"]])), col, pch, lty, cex = 0.8, xlab = "lag distance", ylab = "semivariance", annotate.npairs = FALSE, npairs.pos = 3, npairs.cex = 0.7, legend = nlevels(x[["xy.angle"]]) > 1 || nlevels(x[["xz.angle"]]) > 1, legend.pos = "topleft", ...)
object |
a numeric vector with the values of the response for which
the sample variogram should be computed
( |
locations |
a numeric matrix with the coordinates of the locations
where the response was observed ( |
data |
an optional data frame, list or environment (or another
object coercible by |
subset |
an optional vector specifying a subset of observations to be used for estimating the variogram. |
na.action |
a function which indicates what should happen when the
data contain |
lag.dist.def |
a numeric scalar defining a constant bin
width for grouping the lag distances or a numeric vector with the bounds
of a set of contiguous bins (upper bounds of bins except for the first
element of |
xy.angle.def |
an numeric vector defining angular classes
in the horizontal plane for computing directional variograms.
|
xz.angle.def |
an numeric vector defining angular classes
in the |
max.lag |
a positive numeric defining the largest lag distance for which semi variances should be computed (default no restriction). |
estimator |
a character keyword defining the estimator for computing the sample variogram. Possible values are:
|
mean.angle |
a logical scalar controlling whether the mean lag vector (per
combination of lag distance and angular class) is computed from the mean
angles of all the lag vectors falling into a given class ( |
x |
an object of class |
type , xlim , ylim , xlab , ylab
|
see respective arguments of
|
add |
a logical scalar controlling whether a new plot should be
generated ( |
col |
a vector with the colours of plotting symbols for distinguishing semi variances
for angular classes in the |
pch |
a vector with the types of plotting symbols for distinguishing
semi variances for angular classes in the |
lty |
the line type. |
cex |
a numeric with the character expansion factor for plotting symbols. |
annotate.npairs |
a logical scalar controlling whether the plotting symbols should be annotated by the number of data pairs per lag class. |
npairs.pos |
an integer defining the position where text annotation
about number of pairs should be plotted, see
|
npairs.cex |
a numeric defining the character expansion for text annotation about number of pairs. |
legend |
a logical scalar controlling whether a
|
legend.pos |
a character keyword defining where to place the
legend, see |
... |
additional arguments passed to
|
The angular classes in the -
- and
-
-plane are
defined by vectors of ascending angles on the half circle. The
th
angular class is defined by the vector elements, say l and u,
with indices
and
. A lag vector belongs to the
th angular class if its azimuth (or angle from zenith), say
, satisfies
.
If the first and the last element of
xy.angle.def
or
xz.angle.def
are equal to 0
and 180
degrees,
respectively, then the first and the last angular class are
“joined”, i.e., if there are angles, there will be only
angular classes and the first class is defined by the interval
( xy.angle.def[K-1]-180, xy.angle.def[2] ] and the last
class by ( xy.angle.def[K-2], xy.angle.def[K-1]].
All methods of the generic function sample.variogram
return an object of class sample.variogram
, which
is a data frame with the following components:
lag.dist |
the mean lag distance of the lag class, |
xy.angle |
the angular class in the - -plane, |
xz.angle |
the angular class in the - -plane, |
gamma |
the estimated semi-variance of the lag class, |
npairs |
the number of data pairs in the lag class, |
lag.x |
the -component of the mean lag vector of the lag class, |
lag.x |
the -component of the mean lag vector of the lag class, |
lag.z |
the -component of the mean lag vector of the lag class. |
The method summary.sample.variogram
returns an object of class
summary.sample.variogram
which is list with the components
log.dist
, npairs
, xy.angle
and xz.angle
, see
description for object of class sample.variogram
above. There is a
print
method for objects of class summary.sample.variogram
which invisibly returns the object unchanged.
The method plot.sample.variogram
is called for its side effects and
invisibly returns the object sample.variogram
unchanged.
Andreas Papritz [email protected].
Cressie, N. and Hawkins, D. M. (1980) Robust Estimation of the Variogram: I. Mathematical Geology, 12, 115–125, doi:10.1007/BF01035243.
Dowd, P. A. (1984) The variogram and Kriging: Robust and resistant estimators. In Geostatistics for Natural Resources Characterization, Verly, G., David, M., Journel, A. and Marechal, A. (Eds.) Dordrecht: D. Reidel Publishing Company, Part I, 1, 91–106, doi:10.1007/978-94-009-3699-7.
Genton, M. (1998) Highly Robust Variogram Estimation. Mathematical Geology, 30, 213–220, doi:10.1023/A:1021728614555.
georobPackage
for a description of the model and a brief summary of the algorithms;
georob
for (robust) fitting of spatial linear models;
georobObject
for a description of the class georob
;
profilelogLik
for computing profiles of Gaussian likelihoods;
plot.georob
for display of RE(ML) variogram estimates;
control.georob
for controlling the behaviour of georob
;
georobModelBuilding
for stepwise building models of class georob
;
cv.georob
for assessing the goodness of a fit by georob
;
georobMethods
for further methods for the class georob
;
predict.georob
for computing robust Kriging predictions;
lgnpp
for unbiased back-transformation of Kriging prediction
of log-transformed data;
georobSimulation
for simulating realizations of a Gaussian process
from model fitted by georob
.
data(wolfcamp) ## omnidirectional sample variogram r.sv.iso <- sample.variogram(pressure~1, data = wolfcamp, locations = ~x + y, lag.dist.def = seq(0, 200, by = 15)) plot(r.sv.iso, type = "l") ## direction-dependent sample variogram r.sv.aniso <- sample.variogram(pressure~1, data = wolfcamp, locations = ~x + y, lag.dist.def = seq(0, 200, by = 15), xy.angle.def = c(0., 22.5, 67.5, 112.5, 157.5, 180.)) plot(r.sv.aniso, type = "l", add = TRUE, col = 2:5)
data(wolfcamp) ## omnidirectional sample variogram r.sv.iso <- sample.variogram(pressure~1, data = wolfcamp, locations = ~x + y, lag.dist.def = seq(0, 200, by = 15)) plot(r.sv.iso, type = "l") ## direction-dependent sample variogram r.sv.aniso <- sample.variogram(pressure~1, data = wolfcamp, locations = ~x + y, lag.dist.def = seq(0, 200, by = 15), xy.angle.def = c(0., 22.5, 67.5, 112.5, 157.5, 180.)) plot(r.sv.aniso, type = "l", add = TRUE, col = 2:5)
Functions to compute and plot summary statistics of prediction errors to (cross-)validate fitted spatial linear models by the criteria proposed by Gneiting et al. (2007) for assessing probabilistic forecasts.
validate.predictions(data, pred, se.pred, statistic = c("crps", "pit", "mc", "bs", "st"), ncutoff = NULL) ## S3 method for class 'cv.georob' plot(x, type = c("sc", "lgn.sc", "ta", "qq", "hist.pit", "ecdf.pit", "mc", "bs"), smooth = TRUE, span = 2/3, ncutoff = NULL, add = FALSE, col, pch, lty, main, xlab, ylab, ...) ## S3 method for class 'cv.georob' print(x, digits = max(3, getOption("digits") - 3), ...) ## S3 method for class 'cv.georob' summary(object, se = FALSE, ...)
validate.predictions(data, pred, se.pred, statistic = c("crps", "pit", "mc", "bs", "st"), ncutoff = NULL) ## S3 method for class 'cv.georob' plot(x, type = c("sc", "lgn.sc", "ta", "qq", "hist.pit", "ecdf.pit", "mc", "bs"), smooth = TRUE, span = 2/3, ncutoff = NULL, add = FALSE, col, pch, lty, main, xlab, ylab, ...) ## S3 method for class 'cv.georob' print(x, digits = max(3, getOption("digits") - 3), ...) ## S3 method for class 'cv.georob' summary(object, se = FALSE, ...)
data |
a numeric vector with observations about a response (mandatory argument). |
pred |
a numeric vector with predictions for the response (mandatory argument). |
se.pred |
a numeric vector with prediction standard errors (mandatory argument). |
statistic |
a character keyword defining what statistic of the prediction errors should be computed. Possible values are (see Details):
|
ncutoff |
a positive integer ( |
x , object
|
objects of class |
digits |
a positive integer indicating the number of decimal digits to print. |
type |
a character keyword defining what type of plot is created by the
|
smooth |
a logical scalar controlling whether scatter plots of data
vs. predictions should be smoothed by
|
span |
a numeric with the smoothness parameter for loess (see
|
add |
a logical scalar controlling whether the current high-level plot is
added to an existing graphics without cleaning the frame before (default:
|
main , xlab , ylab
|
title and axes labels of plot. |
col , pch , lty
|
color, symbol and line type. |
se |
a logical scalar controlling if the standard errors of the
averaged continuous ranked probability score and of the mean and
dispersion statistics of the prediction errors (see Details) are
computed from the respective values of the |
... |
additional arguments passed to the methods. |
validate.predictions
computes the items required to evaluate (and
plot) the diagnostic criteria proposed by Gneiting et al. (2007) for
assessing the calibration and the sharpness of
probabilistic predictions of (cross-)validation data. To this aim,
validate.predictions
uses the assumption that the prediction
errors
follow normal distributions with zero mean and standard deviations equal
to the Kriging standard errors. This assumption is an approximation if
the errors
come from a long-tailed
distribution. Furthermore, for the time being, the Kriging variance of
the response
is approximated by adding the estimated
nugget
to the Kriging variance of the
signal
. This approximation likely underestimates the mean
squared prediction error of the response if the errors come from a
long-tailed distribution. Hence, for robust Kriging, the standard errors of
the (cross-)validation errors are likely too small.
Notwithstanding these difficulties and imperfections, validate.predictions
computes
the probability integral transform (PIT),
where denotes the (plug-in) predictive CDF evaluated at
, the value of the
th (cross-)validation datum,
the average predictive CDF (plug-in)
where is the number of (cross-)validation observations and the
are evaluated at
quantiles equal to the set of
distinct
(or a subset of size
of them),
the Brier Score (plug-in)
where is the indicator function for the event
, and
the Brier score is again evaluated at the unique values of the (cross-)validation
observations (or a subset of size
of them),
the averaged continuous ranked probability score, CRPS, a strictly proper scoring criterion to rank predictions, which is related to the Brier score by
Gneiting et al. (2007) proposed the following plots to validate probabilistic predictions:
A histogram (or a plot of the empirical CDF) of the PIT values. For ideal predictions, with observed coverages of prediction intervals matching nominal coverages, the PIT values have an uniform distribution.
Plots of and of the empirical CDF of
the data, say
, and of their
difference,
vs
. The forecasts are said to be marginally calibrated
if
and
match.
A plot of vs.
. Probabilistic
predictions are said to be sharp if the area under this curve,
which equals CRPS, is minimized.
The plot
method for class cv.georob
allows to create
these plots, along with scatter-plots of observations and predictions,
Tukey-Anscombe and normal QQ plots of the standardized prediction
errors.
summary.cv.georob
computes the mean and dispersion statistics
of the (standardized) prediction errors (by a call to
validate.prediction
with argument statistic = "st"
, see
Value) and the averaged continuous ranked probability score
(crps
). If present in the cv.georob
object, the error
statistics are also computed for the errors of the unbiasedly
back-transformed predictions of a log-transformed response. If se
is TRUE
then these statistics are evaluated separately for the
cross-validation subsets and the standard errors of the means of
these statistics are returned as well.
The print
method for class cv.georob
returns the mean
and dispersion statistics of the (standardized) prediction errors.
Depending on the argument statistic
, the function
validate.predictions
returns
a numeric vector of PIT values if statistic
is equal to "pit"
,
a named numeric vector with summary statistics of the
(standardized) prediction errors if statistic
is equal to "st"
. The
following statistics are computed:
me |
mean prediction error |
mede |
median prediction error |
rmse |
root mean squared prediction error |
made |
median absolute prediction error |
qne |
Qn dispersion measure of prediction errors
(see Qn ) |
msse |
mean squared standardized prediction error |
medsse |
median squared standardized prediction error |
a data frame if statistic
is equal to "mc"
or
"bs"
with the components (see Details):
z |
the sorted unique (cross-)validation
observations (or a subset of size
ncutoff of them) |
ghat |
the empirical CDF of the (cross-)validation
observations
|
fbar |
the average predictive distribution
|
bs |
the Brier score
|
The method print.cv.georob
invisibly returns the object unchanged.
The method summary.cv.georob
returns an object of class
summary.cv.georob
which is a list with 3 components:
st
a numeric vector with summary statistics of the
(standardized) prediction errors of the possibly log-transformed
response, see output of function validate.predictions
for
argument statistic = "st"
above.
crps
the value of the continuous ranked probability score.
st.lgn
a numeric vector with summary statistics of the
(standardized) prediction errors of the back-transformed response if
argument lgn = TRUE
and NULL
otherwise.
There is a print
method for objects of class summary.cv.georob
which invisibly returns the object unchanged.
The method plot.georob
is called for its side effects and
invisibly returns NULL
.
Andreas Papritz [email protected].
Gneiting, T., Balabdaoui, F. and Raftery, A. E.(2007) Probabilistic
forecasts, calibration and sharpness. Journal of the Royal
Statistical Society Series B 69, 243–268,
doi:10.1111/j.1467-9868.2007.00587.x.
georob
for (robust) fitting of spatial linear models;
cv.georob
for assessing the goodness of a fit by georob
.
## define number of cores for parallel computations if(interactive()) ncpu <- 10L else ncpu <- 1L data(meuse) r.logzn <- georob(log(zinc) ~ sqrt(dist), data = meuse, locations = ~ x + y, variogram.model = "RMexp", param = c(variance = 0.15, nugget = 0.05, scale = 200), tuning.psi = 1000) if(interactive()){ ## example is run only in interactive session because cpu times exceeds 5 s r.logzn.cv.1 <- cv(r.logzn, seed = 1, lgn = TRUE, ncores = 1, verbose = 1) r.logzn.cv.2 <- cv(r.logzn, formula = .~. + ffreq, seed = 1, lgn = TRUE, ncores = ncpu) summary(r.logzn.cv.1, se = TRUE) summary(r.logzn.cv.2, se = TRUE) op <- par(mfrow = c(2,2)) plot(r.logzn.cv.1, type = "lgn.sc") plot(r.logzn.cv.2, type = "lgn.sc", add = TRUE, col = "red") abline(0, 1, lty= "dotted") plot(r.logzn.cv.1, type = "ta") plot(r.logzn.cv.2, type = "ta", add = TRUE, col = "red") abline(h=0, lty= "dotted") plot(r.logzn.cv.2, type = "mc", col = "red") plot(r.logzn.cv.1, type = "bs") plot(r.logzn.cv.2, type = "bs", add = TRUE, col = "red") legend("topright", lty = 1, col = c("black", "red"), bty = "n", legend = c("log(Zn) ~ sqrt(dist)", "log(Zn) ~ sqrt(dist) + ffreq")) par(op) }
## define number of cores for parallel computations if(interactive()) ncpu <- 10L else ncpu <- 1L data(meuse) r.logzn <- georob(log(zinc) ~ sqrt(dist), data = meuse, locations = ~ x + y, variogram.model = "RMexp", param = c(variance = 0.15, nugget = 0.05, scale = 200), tuning.psi = 1000) if(interactive()){ ## example is run only in interactive session because cpu times exceeds 5 s r.logzn.cv.1 <- cv(r.logzn, seed = 1, lgn = TRUE, ncores = 1, verbose = 1) r.logzn.cv.2 <- cv(r.logzn, formula = .~. + ffreq, seed = 1, lgn = TRUE, ncores = ncpu) summary(r.logzn.cv.1, se = TRUE) summary(r.logzn.cv.2, se = TRUE) op <- par(mfrow = c(2,2)) plot(r.logzn.cv.1, type = "lgn.sc") plot(r.logzn.cv.2, type = "lgn.sc", add = TRUE, col = "red") abline(0, 1, lty= "dotted") plot(r.logzn.cv.1, type = "ta") plot(r.logzn.cv.2, type = "ta", add = TRUE, col = "red") abline(h=0, lty= "dotted") plot(r.logzn.cv.2, type = "mc", col = "red") plot(r.logzn.cv.1, type = "bs") plot(r.logzn.cv.2, type = "bs", add = TRUE, col = "red") legend("topright", lty = 1, col = c("black", "red"), bty = "n", legend = c("log(Zn) ~ sqrt(dist)", "log(Zn) ~ sqrt(dist) + ffreq")) par(op) }
Piezometric head measurements taken at the Wolfcamp Aquifer, Texas, USA. See Cressie (1993, p. 212–214) for description of the scientific problem and the data. Original data were converted to SI units: coordinates are given in kilometers and pressure heads in meters.
data(wolfcamp)
data(wolfcamp)
A data frame with 85 observations on the following 3 variables:
x
a numeric vector with the easting coordinate in kilometers.
y
a numeric vector with the northing coordinate in kilometers.
pressure
a numeric vector with the piezometric head in meters.
The data were imported from the package geoR.
Harper, W.V. and Furr, J.M. (1986) Geostatistical analysis of potentiometric data in the Wolfcamp Aquifer of the Palo Duro Basin, Texas. Technical Report BMI/ONWI-587, Bettelle Memorial Institute, Columbus, OH.
Cressie, N. A. C. (1993) Statistics for Spatial Data, Wiley, New York, doi:10.1002/9781119115151.
Papritz, A. and Moyeed, R. (2001) Parameter uncertainty in spatial prediction: checking its importance by cross-validating the Wolfcamp and Rongelap data sets, GeoENV 2000: Geostatistical for Environmental Applications. Eds P. Monestiez, D. Allard, R. Froidevaux. Kluwer, doi:10.1007/978-94-010-0810-5.
data(wolfcamp) summary(wolfcamp)
data(wolfcamp) summary(wolfcamp)