Title: | Spatial Estimation and Prediction for Censored/Missing Responses |
---|---|
Description: | It provides functions to estimate parameters in linear spatial models with censored/missing responses via the Expectation-Maximization (EM), the Stochastic Approximation EM (SAEM), or the Monte Carlo EM (MCEM) algorithm. These algorithms are widely used to compute the maximum likelihood (ML) estimates in problems with incomplete data. The EM algorithm computes the ML estimates when a closed expression for the conditional expectation of the complete-data log-likelihood function is available. In the MCEM algorithm, the conditional expectation is substituted by a Monte Carlo approximation based on many independent simulations of the missing data. In contrast, the SAEM algorithm splits the E-step into simulation and integration steps. This package also approximates the standard error of the estimates using the Louis method. Moreover, it has a function that performs spatial prediction in new locations. |
Authors: | Katherine A. L. Valeriano [aut, cre] , Alejandro Ordonez Cuastumal [ctb] , Christian Galarza Morales [ctb] , Larissa Avila Matos [ctb] |
Maintainer: | Katherine A. L. Valeriano <[email protected]> |
License: | GPL (>= 2) |
Version: | 0.3.0 |
Built: | 2024-12-05 07:12:09 UTC |
Source: | CRAN |
It computes the spatial variance-covariance matrix considering exponential, gaussian, matérn, or power exponential correlation function.
CovMat(phi, tau2, sig2, coords, type = "exponential", kappa = NULL)
CovMat(phi, tau2, sig2, coords, type = "exponential", kappa = NULL)
phi |
spatial scaling parameter. |
tau2 |
nugget effect parameter. |
sig2 |
partial sill parameter. |
coords |
2D spatial coordinates of dimensions |
type |
type of spatial correlation function: |
kappa |
parameter for some spatial correlation functions. For exponential
and gaussian |
The spatial covariance matrix is given by
,
where is the partial sill,
is the spatial scaling
parameter,
is known as the nugget effect in the geostatistical
framework,
is the
correlation matrix computed from a
correlation function, and
is the
identity matrix.
The spatial correlation functions available are:
,
,
,
,
where is the Euclidean distance between two observations,
is the gamma function,
is the smoothness parameter,
and
is the modified Bessel function of the second kind of order
.
An spatial covariance matrix.
Katherine L. Valeriano, Alejandro Ordoñez, Christian E. Galarza, and Larissa A. Matos.
dist2Dmatrix
, EM.sclm
, MCEM.sclm
, SAEM.sclm
set.seed(1000) n = 20 coords = round(matrix(runif(2*n, 0, 10), n, 2), 5) Cov = CovMat(phi=5, tau2=0.8, sig2=2, coords=coords, type="exponential")
set.seed(1000) n = 20 coords = round(matrix(runif(2*n, 0, 10), n, 2), 5) Cov = CovMat(phi=5, tau2=0.8, sig2=2, coords=coords, type="exponential")
It computes the Euclidean distance matrix for a set of coordinates.
dist2Dmatrix(coords)
dist2Dmatrix(coords)
coords |
2D spatial coordinates of dimensions |
An distance matrix.
Katherine L. Valeriano, Alejandro Ordoñez, Christian E. Galarza, and Larissa A. Matos.
n = 100 set.seed(1000) x = round(runif(n,0,10), 5) # X coordinate y = round(runif(n,0,10), 5) # Y coordinate Mdist = dist2Dmatrix(cbind(x, y))
n = 100 set.seed(1000) x = round(runif(n,0,10), 5) # X coordinate y = round(runif(n,0,10), 5) # Y coordinate Mdist = dist2Dmatrix(cbind(x, y))
It fits the left, right, or interval spatial censored linear model using the Expectation-Maximization (EM) algorithm. It provides estimates and standard errors of the parameters and supports missing values on the dependent variable.
EM.sclm(y, x, ci, lcl = NULL, ucl = NULL, coords, phi0, nugget0, type = "exponential", kappa = NULL, lower = c(0.01, 0.01), upper = c(30, 30), MaxIter = 300, error = 1e-04, show_se = TRUE)
EM.sclm(y, x, ci, lcl = NULL, ucl = NULL, coords, phi0, nugget0, type = "exponential", kappa = NULL, lower = c(0.01, 0.01), upper = c(30, 30), MaxIter = 300, error = 1e-04, show_se = TRUE)
y |
vector of responses of length |
x |
design matrix of dimensions |
ci |
vector of censoring indicators of length |
lcl , ucl
|
vectors of length |
coords |
2D spatial coordinates of dimensions |
phi0 |
initial value for the spatial scaling parameter. |
nugget0 |
initial value for the nugget effect parameter. |
type |
type of spatial correlation function: |
kappa |
parameter for some spatial correlation functions. See |
lower , upper
|
vectors of lower and upper bounds for the optimization method.
If unspecified, the default is |
MaxIter |
maximum number of iterations for the EM algorithm. By default |
error |
maximum convergence error. By default |
show_se |
logical. It indicates if the standard errors should be estimated by default |
The spatial Gaussian model is given by
,
where is the
response vector,
is the
design matrix,
is the
vector of regression coefficients
to be estimated, and
is the error term. Which is normally distributed with
zero-mean and covariance matrix
. We assume
that
is non-singular and
has a full rank (Diggle and Ribeiro 2007).
The estimation process is performed via the EM algorithm, initially proposed by
Dempster et al. (1977). The conditional
expectations are computed using the function meanvarTMD
available in the
package MomTrunc
.
An object of class "sclm". Generic functions print
and summary
have
methods to show the results of the fit. The function plot
can extract
convergence graphs for the parameter estimates.
Specifically, the following components are returned:
Theta |
estimated parameters in all iterations, |
theta |
final estimation of |
beta |
estimated |
sigma2 |
estimated |
phi |
estimated |
tau2 |
estimated |
EY |
first conditional moment computed in the last iteration. |
EYY |
second conditional moment computed in the last iteration. |
SE |
vector of standard errors of |
InfMat |
observed information matrix. |
loglik |
log-likelihood for the EM method. |
AIC |
Akaike information criterion. |
BIC |
Bayesian information criterion. |
Iter |
number of iterations needed to converge. |
time |
processing time. |
call |
|
tab |
table of estimates. |
critFin |
selection criteria. |
range |
effective range. |
ncens |
number of censored/missing observations. |
MaxIter |
maximum number of iterations for the EM algorithm. |
The EM final estimates correspond to the estimates obtained at the last iteration of the EM algorithm.
To fit a regression model for non-censored data, just set ci
as a vector of zeros.
Katherine L. Valeriano, Alejandro Ordoñez, Christian E. Galarza, and Larissa A. Matos.
Dempster AP, Laird NM, Rubin DB (1977).
“Maximum likelihood from incomplete data via the EM algorithm.”
Journal of the Royal Statistical Society: Series B (Methodological), 39(1), 1–38.
Diggle P, Ribeiro P (2007).
Model-based Geostatistics.
Springer.
MCEM.sclm
, SAEM.sclm
, predict.sclm
# Simulated example: 10% of left-censored observations set.seed(1000) n = 50 # Test with another values for n coords = round(matrix(runif(2*n,0,15),n,2), 5) x = cbind(rnorm(n), runif(n)) data = rCensSp(c(-1,3), 2, 4, 0.5, x, coords, "left", 0.10, 0, "gaussian") fit = EM.sclm(y=data$y, x=x, ci=data$ci, lcl=data$lcl, ucl=data$ucl, coords=coords, phi0=3, nugget0=1, type="gaussian") fit
# Simulated example: 10% of left-censored observations set.seed(1000) n = 50 # Test with another values for n coords = round(matrix(runif(2*n,0,15),n,2), 5) x = cbind(rnorm(n), runif(n)) data = rCensSp(c(-1,3), 2, 4, 0.5, x, coords, "left", 0.10, 0, "gaussian") fit = EM.sclm(y=data$y, x=x, ci=data$ci, lcl=data$lcl, ucl=data$ucl, coords=coords, phi0=3, nugget0=1, type="gaussian") fit
It fits the left, right, or interval spatial censored linear model using the Monte Carlo EM (MCEM) algorithm. It provides estimates and standard errors of the parameters and supports missing values on the dependent variable.
MCEM.sclm(y, x, ci, lcl = NULL, ucl = NULL, coords, phi0, nugget0, type = "exponential", kappa = NULL, lower = c(0.01, 0.01), upper = c(30, 30), MaxIter = 500, nMin = 20, nMax = 5000, error = 1e-04, show_se = TRUE)
MCEM.sclm(y, x, ci, lcl = NULL, ucl = NULL, coords, phi0, nugget0, type = "exponential", kappa = NULL, lower = c(0.01, 0.01), upper = c(30, 30), MaxIter = 500, nMin = 20, nMax = 5000, error = 1e-04, show_se = TRUE)
y |
vector of responses of length |
x |
design matrix of dimensions |
ci |
vector of censoring indicators of length |
lcl , ucl
|
vectors of length |
coords |
2D spatial coordinates of dimensions |
phi0 |
initial value for the spatial scaling parameter. |
nugget0 |
initial value for the nugget effect parameter. |
type |
type of spatial correlation function: |
kappa |
parameter for some spatial correlation functions. See |
lower , upper
|
vectors of lower and upper bounds for the optimization method.
If unspecified, the default is |
MaxIter |
maximum number of iterations for the MCEM algorithm. By default |
nMin |
initial sample size for Monte Carlo integration. By default |
nMax |
maximum sample size for Monte Carlo integration. By default |
error |
maximum convergence error. By default |
show_se |
logical. It indicates if the standard errors
should be estimated by default |
The spatial Gaussian model is given by
,
where is the
response vector,
is the
design matrix,
is the
vector of regression coefficients
to be estimated, and
is the error term. Which is normally distributed with
zero-mean and covariance matrix
. We assume
that
is non-singular and
has a full rank (Diggle and Ribeiro 2007).
The estimation process is performed via the MCEM algorithm, initially proposed by
Wei and Tanner (1990). The Monte Carlo (MC) approximation starts
with a sample of size nMin
; at each iteration, the sample size increases (nMax-nMin
)/MaxIter
,
and at the last iteration, the sample size is nMax
. The random observations are sampled through the slice
sampling algorithm available in package relliptical
.
An object of class "sclm". Generic functions print
and summary
have methods to show the results of the fit. The function plot
can extract
convergence graphs for the parameter estimates.
Specifically, the following components are returned:
Theta |
estimated parameters in all iterations, |
theta |
final estimation of |
beta |
estimated |
sigma2 |
estimated |
phi |
estimated |
tau2 |
estimated |
EY |
MC approximation of the first conditional moment. |
EYY |
MC approximation of the second conditional moment. |
SE |
vector of standard errors of |
InfMat |
observed information matrix. |
loglik |
log-likelihood for the MCEM method. |
AIC |
Akaike information criterion. |
BIC |
Bayesian information criterion. |
Iter |
number of iterations needed to converge. |
time |
processing time. |
call |
|
tab |
table of estimates. |
critFin |
selection criteria. |
range |
effective range. |
ncens |
number of censored/missing observations. |
MaxIter |
maximum number of iterations for the MCEM algorithm. |
The MCEM final estimates correspond to the mean of the estimates obtained at each iteration after deleting the half and applying a thinning of 3.
To fit a regression model for non-censored data, just set ci
as a vector of zeros.
Katherine L. Valeriano, Alejandro Ordoñez, Christian E. Galarza, and Larissa A. Matos.
Diggle P, Ribeiro P (2007).
Model-based Geostatistics.
Springer.
Wei G, Tanner M (1990).
“A Monte Carlo implementation of the EM algorithm and the poor man's data augmentation algorithms.”
Journal of the American Statistical Association, 85(411), 699–704.
doi:10.1080/01621459.1990.10474930.
EM.sclm
, SAEM.sclm
, predict.sclm
# Example 1: left censoring data set.seed(1000) n = 50 # Test with another values for n coords = round(matrix(runif(2*n,0,15),n,2), 5) x = cbind(rnorm(n), rnorm(n)) data = rCensSp(c(2,-1), 2, 3, 0.70, x, coords, "left", 0.08, 0, "matern", 1) fit = MCEM.sclm(y=data$y, x=x, ci=data$ci, lcl=data$lcl, ucl=data$ucl, coords, phi0=2.50, nugget0=0.75, type="matern", kappa=1, MaxIter=30, nMax=1000) fit$tab # Example 2: left censoring and missing data yMiss = data$y yMiss[20] = NA ci = data$ci ci[20] = 1 ucl = data$ucl ucl[20] = Inf fit1 = MCEM.sclm(y=yMiss, x=x, ci=ci, lcl=data$lcl, ucl=ucl, coords, phi0=2.50, nugget0=0.75, type="matern", kappa=1, MaxIter=300, nMax=1000) summary(fit1) plot(fit1)
# Example 1: left censoring data set.seed(1000) n = 50 # Test with another values for n coords = round(matrix(runif(2*n,0,15),n,2), 5) x = cbind(rnorm(n), rnorm(n)) data = rCensSp(c(2,-1), 2, 3, 0.70, x, coords, "left", 0.08, 0, "matern", 1) fit = MCEM.sclm(y=data$y, x=x, ci=data$ci, lcl=data$lcl, ucl=data$ucl, coords, phi0=2.50, nugget0=0.75, type="matern", kappa=1, MaxIter=30, nMax=1000) fit$tab # Example 2: left censoring and missing data yMiss = data$y yMiss[20] = NA ci = data$ci ci[20] = 1 ucl = data$ucl ucl[20] = Inf fit1 = MCEM.sclm(y=yMiss, x=x, ci=ci, lcl=data$lcl, ucl=ucl, coords, phi0=2.50, nugget0=0.75, type="matern", kappa=1, MaxIter=300, nMax=1000) summary(fit1) plot(fit1)
The level of dioxin (2,3,7,8-tetrachlorodibenzo-p-dioxin or TCDD) data was collected
in November 1983 by the U.S. Environmental Protection Agency (EPA) in several areas
of a highway in Missouri, USA. The TCDD measurement was subject to a limit of
detection (cens
); thereby, the TCDD
data is left-censored. Only the
locations used in the geostatistical analysis by Zirschky and Harris (1986) are shown.
data("Missouri")
data("Missouri")
A data frame with 127 observations and five variables:
x coordinate of the start of each transect (ft).
y coordinate of the start of each transect (ft).
TCDD concentrations (mg/kg).
transect length (ft).
indicator of censoring (left-censored observations).
Zirschky JH, Harris DJ (1986). “Geostatistical analysis of hazardous waste site data.” Journal of Environmental Engineering, 112(4), 770–784.
data("Missouri") y = log(Missouri$TCDD) cc = Missouri$cens coord = cbind(Missouri$xcoord/100, Missouri$ycoord) x = matrix(1, length(y), 1) lcl = rep(-Inf, length(y)) ucl = y ## SAEM fit set.seed(83789) fit1 = SAEM.sclm(y, x, cc, lcl, ucl, coord, 5, 1, lower=c(1e-5,1e-5), upper=c(50,50)) fit1$tab ## MCEM fit fit2 = MCEM.sclm(y, x, cc, lcl, ucl, coord, 5, 1, lower=c(1e-5,1e-5), upper=c(50,50), MaxIter=300, nMax=1000) fit2$tab ## Imputed values cbind(fit1$EY, fit2$EY)[cc==1,]
data("Missouri") y = log(Missouri$TCDD) cc = Missouri$cens coord = cbind(Missouri$xcoord/100, Missouri$ycoord) x = matrix(1, length(y), 1) lcl = rep(-Inf, length(y)) ucl = y ## SAEM fit set.seed(83789) fit1 = SAEM.sclm(y, x, cc, lcl, ucl, coord, 5, 1, lower=c(1e-5,1e-5), upper=c(50,50)) fit1$tab ## MCEM fit fit2 = MCEM.sclm(y, x, cc, lcl, ucl, coord, 5, 1, lower=c(1e-5,1e-5), upper=c(50,50), MaxIter=300, nMax=1000) fit2$tab ## Imputed values cbind(fit1$EY, fit2$EY)[cc==1,]
It performs spatial prediction in a set of new S
spatial locations.
## S3 method for class 'sclm' predict(object, locPre, xPre, ...)
## S3 method for class 'sclm' predict(object, locPre, xPre, ...)
object |
object of class |
locPre |
matrix of coordinates for which prediction is performed. |
xPre |
matrix of covariates for which prediction is performed. |
... |
further arguments passed to or from other methods. |
This function predicts using the mean squared error (MSE) criterion, which takes the conditional expectation E(Y|X) as the best linear predictor.
The function returns a list with:
coord |
matrix of coordinates. |
predValues |
predicted values. |
sdPred |
predicted standard deviations. |
Katherine L. Valeriano, Alejandro Ordoñez, Christian E. Galarza, and Larissa A. Matos.
set.seed(1000) n = 120 coords = round(matrix(runif(2*n,0,15),n,2), 5) x = cbind(rbinom(n,1,0.50), rnorm(n), rnorm(n)) data = rCensSp(c(1,4,-1), 2, 3, 0.50, x, coords, "left", 0.10, 20) ## Estimation data1 = data$Data # Estimation: EM algorithm fit1 = EM.sclm(y=data1$y, x=data1$x, ci=data1$ci, lcl=data1$lcl, ucl=data1$ucl, coords=data1$coords, phi0=2.50, nugget0=1) # Estimation: SAEM algorithm fit2 = SAEM.sclm(y=data1$y, x=data1$x, ci=data1$ci, lcl=data1$lcl, ucl=data1$ucl, coords=data1$coords, phi0=2.50, nugget0=1) # Estimation: MCEM algorithm fit3 = MCEM.sclm(y=data1$y, x=data1$x, ci=data1$ci, lcl=data1$lcl, ucl=data1$ucl, coords=data1$coords, phi0=2.50, nugget0=1, MaxIter=300) cbind(fit1$theta, fit2$theta, fit3$theta) # Prediction data2 = data$TestData pred1 = predict(fit1, data2$coords, data2$x) pred2 = predict(fit2, data2$coords, data2$x) pred3 = predict(fit3, data2$coords, data2$x) # Cross-validation mean((data2$y - pred1$predValues)^2) mean((data2$y - pred2$predValues)^2) mean((data2$y - pred3$predValues)^2)
set.seed(1000) n = 120 coords = round(matrix(runif(2*n,0,15),n,2), 5) x = cbind(rbinom(n,1,0.50), rnorm(n), rnorm(n)) data = rCensSp(c(1,4,-1), 2, 3, 0.50, x, coords, "left", 0.10, 20) ## Estimation data1 = data$Data # Estimation: EM algorithm fit1 = EM.sclm(y=data1$y, x=data1$x, ci=data1$ci, lcl=data1$lcl, ucl=data1$ucl, coords=data1$coords, phi0=2.50, nugget0=1) # Estimation: SAEM algorithm fit2 = SAEM.sclm(y=data1$y, x=data1$x, ci=data1$ci, lcl=data1$lcl, ucl=data1$ucl, coords=data1$coords, phi0=2.50, nugget0=1) # Estimation: MCEM algorithm fit3 = MCEM.sclm(y=data1$y, x=data1$x, ci=data1$ci, lcl=data1$lcl, ucl=data1$ucl, coords=data1$coords, phi0=2.50, nugget0=1, MaxIter=300) cbind(fit1$theta, fit2$theta, fit3$theta) # Prediction data2 = data$TestData pred1 = predict(fit1, data2$coords, data2$x) pred2 = predict(fit2, data2$coords, data2$x) pred3 = predict(fit3, data2$coords, data2$x) # Cross-validation mean((data2$y - pred1$predValues)^2) mean((data2$y - pred2$predValues)^2) mean((data2$y - pred3$predValues)^2)
It simulates censored spatial data with a linear structure for an established censoring rate.
rCensSp(beta, sigma2, phi, nugget, x, coords, cens = "left", pcens = 0.1, npred = 0, cov.model = "exponential", kappa = NULL)
rCensSp(beta, sigma2, phi, nugget, x, coords, cens = "left", pcens = 0.1, npred = 0, cov.model = "exponential", kappa = NULL)
beta |
linear regression parameters. |
sigma2 |
partial sill parameter. |
phi |
spatial scaling parameter. |
nugget |
nugget effect parameter. |
x |
design matrix of dimensions |
coords |
2D spatial coordinates of dimensions |
cens |
|
pcens |
desired censoring rate. By default |
npred |
number of simulated data used for cross-validation (Prediction). By default |
cov.model |
type of spatial correlation function: |
kappa |
parameter for some spatial correlation functions. For exponential and
gaussian |
If npred > 0
, it returns two lists: Data
and
TestData
; otherwise, it returns a list with the simulated data.
Data
y |
response vector. |
ci |
censoring indicator. |
lcl |
lower censoring bound. |
ucl |
upper censoring bound. |
coords |
coordinates matrix. |
x |
design matrix. |
TestData
y |
response vector. |
coords |
coordinates matrix. |
x |
design matrix. |
Katherine L. Valeriano, Alejandro Ordoñez, Christian E. Galarza, and Larissa A. Matos.
n = 100 set.seed(1000) coords = round(matrix(runif(2*n,0,15),n,2), 5) x = cbind(1, rnorm(n)) data = rCensSp(beta=c(5,2), sigma2=2, phi=4, nugget=0.70, x=x, coords=coords, cens="left", pcens=0.10, npred=10, cov.model="gaussian") data$Data data$TestData
n = 100 set.seed(1000) coords = round(matrix(runif(2*n,0,15),n,2), 5) x = cbind(1, rnorm(n)) data = rCensSp(beta=c(5,2), sigma2=2, phi=4, nugget=0.70, x=x, coords=coords, cens="left", pcens=0.10, npred=10, cov.model="gaussian") data$Data data$TestData
It fits the left, right, or interval spatial censored linear model using the Stochastic Approximation EM (SAEM) algorithm. It provides estimates and standard errors of the parameters and supports missing values on the dependent variable.
SAEM.sclm(y, x, ci, lcl = NULL, ucl = NULL, coords, phi0, nugget0, type = "exponential", kappa = NULL, lower = c(0.01, 0.01), upper = c(30, 30), MaxIter = 300, M = 20, pc = 0.2, error = 1e-04, show_se = TRUE)
SAEM.sclm(y, x, ci, lcl = NULL, ucl = NULL, coords, phi0, nugget0, type = "exponential", kappa = NULL, lower = c(0.01, 0.01), upper = c(30, 30), MaxIter = 300, M = 20, pc = 0.2, error = 1e-04, show_se = TRUE)
y |
vector of responses of length |
x |
design matrix of dimensions |
ci |
vector of censoring indicators of length |
lcl , ucl
|
vectors of length |
coords |
2D spatial coordinates of dimensions |
phi0 |
initial value for the spatial scaling parameter. |
nugget0 |
initial value for the nugget effect parameter. |
type |
type of spatial correlation function: |
kappa |
parameter for some spatial correlation functions. See |
lower , upper
|
vectors of lower and upper bounds for the optimization method.
If unspecified, the default is |
MaxIter |
maximum number of iterations of the SAEM algorithm. By default |
M |
number of Monte Carlo samples for stochastic approximation. By default |
pc |
percentage of initial iterations of the SAEM algorithm with no memory.
It is recommended that |
error |
maximum convergence error. By default |
show_se |
logical. It indicates if the standard errors
should be estimated by default |
The spatial Gaussian model is given by
,
where is the
response vector,
is the
design matrix,
is the
vector of regression coefficients
to be estimated, and
is the error term which is normally distributed with
zero-mean and covariance matrix
. We assume
that
is non-singular and
has full rank (Diggle and Ribeiro 2007).
The estimation process is performed via the SAEM algorithm, initially proposed by
Delyon et al. (1999). The spatial censored
(SAEM) algorithm was previously proposed by Lachos et al. (2017) and
Ordoñez et al. (2018) and is available in the package CensSpatial
.
These packages differ in the random number generation and optimization procedure.
This model is also a particular case of the spatio-temporal model defined by
Valeriano et al. (2021) when the number of
temporal observations is equal to one. The computing codes of the spatio-temporal
SAEM algorithm are available in the package StempCens
.
An object of class "sclm". Generic functions print
and summary
have
methods to show the results of the fit. The function plot
can extract
convergence graphs for the parameter estimates.
Specifically, the following components are returned:
Theta |
estimated parameters in all iterations, |
theta |
final estimation of |
beta |
estimated |
sigma2 |
estimated |
phi |
estimated |
tau2 |
estimated |
EY |
stochastic approximation of the first conditional moment. |
EYY |
stochastic approximation of the second conditional moment. |
SE |
vector of standard errors of |
InfMat |
observed information matrix. |
loglik |
log-likelihood for the SAEM method. |
AIC |
Akaike information criterion. |
BIC |
Bayesian information criterion. |
Iter |
number of iterations needed to converge. |
time |
processing time. |
call |
|
tab |
table of estimates. |
critFin |
selection criteria. |
range |
effective range. |
ncens |
number of censored/missing observations. |
MaxIter |
maximum number of iterations for the SAEM algorithm. |
The SAEM final estimates correspond to the estimates obtained at the last iteration of the algorithm.
To fit a regression model for non-censored data, just set ci
as a vector of zeros.
Katherine L. Valeriano, Alejandro Ordoñez, Christian E. Galarza, and Larissa A. Matos.
Delyon B, Lavielle M, Moulines E (1999).
“Convergence of a stochastic approximation version of the EM algorithm.”
The Annals of Statistics, 27(1), 94–128.
Diggle P, Ribeiro P (2007).
Model-based Geostatistics.
Springer.
Lachos VH, Matos LA, Barbosa TS, Garay AM, Dey DK (2017).
“Influence diagnostics in spatial models with censored response.”
Environmetrics, 28(7).
Ordoñez JA, Bandyopadhyay D, Lachos VH, Cabral CRB (2018).
“Geostatistical estimation and prediction for censored responses.”
Spatial Statistics, 23, 109–123.
doi:10.1016/j.spasta.2017.12.001.
Valeriano KL, Lachos VH, Prates MO, Matos LA (2021).
“Likelihood-based inference for spatiotemporal data with censored and missing responses.”
Environmetrics, 32(3).
EM.sclm
, MCEM.sclm
, predict.sclm
# Example 1: 8% of right-censored observations set.seed(1000) n = 50 # Test with another values for n coords = round(matrix(runif(2*n,0,15),n,2), 5) x = cbind(rnorm(n), rnorm(n)) data = rCensSp(c(4,-2), 1, 3, 0.50, x, coords, "right", 0.08) fit = SAEM.sclm(y=data$y, x=x, ci=data$ci, lcl=data$lcl, ucl=data$ucl, coords, phi0=2, nugget0=1, type="exponential", M=10, pc=0.18) fit # Example 2: censored and missing observations set.seed(123) n = 200 coords = round(matrix(runif(2*n,0,20),n,2), 5) x = cbind(runif(n), rnorm(n), rexp(n)) data = rCensSp(c(1,4,-1), 2, 3, 0.50, x, coords, "left", 0.05, 0, "matern", 3) data$y[c(10,120)] = NA data$ci[c(10,120)] = 1 data$ucl[c(10,120)] = Inf fit2 = SAEM.sclm(y=data$y, x=x, ci=data$ci, lcl=data$lcl, ucl=data$ucl, coords, phi0=2, nugget0=1, type="matern", kappa=3, M=10, pc=0.18) fit2$tab plot(fit2)
# Example 1: 8% of right-censored observations set.seed(1000) n = 50 # Test with another values for n coords = round(matrix(runif(2*n,0,15),n,2), 5) x = cbind(rnorm(n), rnorm(n)) data = rCensSp(c(4,-2), 1, 3, 0.50, x, coords, "right", 0.08) fit = SAEM.sclm(y=data$y, x=x, ci=data$ci, lcl=data$lcl, ucl=data$ucl, coords, phi0=2, nugget0=1, type="exponential", M=10, pc=0.18) fit # Example 2: censored and missing observations set.seed(123) n = 200 coords = round(matrix(runif(2*n,0,20),n,2), 5) x = cbind(runif(n), rnorm(n), rexp(n)) data = rCensSp(c(1,4,-1), 2, 3, 0.50, x, coords, "left", 0.05, 0, "matern", 3) data$y[c(10,120)] = NA data$ci[c(10,120)] = 1 data$ucl[c(10,120)] = Inf fit2 = SAEM.sclm(y=data$y, x=x, ci=data$ci, lcl=data$lcl, ucl=data$ucl, coords, phi0=2, nugget0=1, type="matern", kappa=3, M=10, pc=0.18) fit2$tab plot(fit2)