| Title: | Phase-Function Based Estimation and Inference for Linear Errors-in-Variables (EIV) Models |
|---|---|
| Description: | Estimation and inference for coefficients of linear EIV models with symmetric measurement errors. The measurement errors can be homoscedastic or heteroscedastic; in the latter case, replicates must be available for at least some observations. The estimation method and asymptotic inference are based on a generalised method of moments (GMM) framework, where the estimating equations are formed from (1) minimising the distance between the empirical phase function (normalised characteristic function) of the response and that of the linear combination of all the covariates at the estimates, and (2) minimising a corrected least-squares discrepancy function. Specifically, for a linear EIV model with p error-prone and q error-free covariates, the GMM approach is based on 2(p+q) estimating equations if some replicates are available and on p+2q estimating equations if no replicates are available. The details of the method are described in Nghiem and Potgieter (2020) <doi:10.1093/biomet/asaa025> and Nghiem and Potgieter (2025) <doi:10.5705/ss.202022.0331>. |
| Authors: | Chang Liu [aut, cre], Linh Nghiem [aut] |
| Maintainer: | Chang Liu <[email protected]> |
| License: | GPL-2 |
| Version: | 0.1.1 |
| Built: | 2026-05-28 06:25:45 UTC |
| Source: | https://github.com/cran/PhaseGMM |
Extract Coefficients from an eiv_mlr Object
    ## S3 method for class 'eiv_mlr'
    coef(object, ...)
object |
An object of class "eiv_mlr". |
... |
Not used. |
A named numeric vector of estimated regression coefficients.
Computes Wald-type confidence intervals for regression coefficients.
    ## S3 method for class 'eiv_mlr'
    confint(object, parm = NULL, level = 0.95, ...)
object |
An object of class "eiv_mlr". |
parm |
Not used. |
level |
Confidence level; defaults to 0.95. |
... |
Not used. |
A matrix with columns lower and upper.
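A Wald-type interval has the form estimate ± z × standard error, where z is a standard normal quantile. The following is a minimal base-R sketch of that calculation, not the package's internal code: `wald_ci` is a hypothetical helper, and the estimates and standard errors are made-up numbers for illustration.

```r
# Hypothetical helper (not part of PhaseGMM): Wald-type intervals
# computed from estimates and their standard errors.
wald_ci <- function(est, se, level = 0.95) {
  z <- qnorm(1 - (1 - level) / 2)  # two-sided standard normal critical value
  cbind(lower = est - z * se, upper = est + z * se)
}

# Made-up estimates and standard errors, for illustration only
est <- c("(Intercept)" = 1.0, W1 = 2.0)
se  <- c(0.20, 0.10)

ci <- wald_ci(est, se, level = 0.95)
print(ci)
```

For a fitted model, the documented call `confint(fit, level = 0.90)` would return the analogous matrix with columns lower and upper.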
A processed dataset containing repeated 24-hour dietary recalls.

    dietary_white_women
A data frame with 2 rows per individual (one per recall day).
Integer unit identifier
Body Mass Index (kg/m^2)
Total energy intake (kcal)
Protein intake (g)
Fat intake (g)
Age in months
Replicate id of each observation
The variables energy, protein, and fat
are treated as error-prone covariates with two replicate
measurements per individual, and age_in_month is treated as an error-free
covariate.
Processed from NHANES dietary recall data.
Fits a linear regression model in the presence of measurement error in covariates using replicated measurements and a combination of phase-function estimation and generalized method of moments (GMM).
    eiv_mlr(
      formula,
      data,
      weight_method = c("uniform", "minimax", "quasi-likelihood"),
      B = 100,
      t_grid_length = 1000
    )
formula |
A symbolic description of the model to be fitted.
Error-prone covariates must be wrapped in W() and error-free covariates in Z(). The general form is:
y ~ W(W1 + W2 + ...) + Z(Z1 + Z2 + ...)
An intercept is included automatically unless removed explicitly. |
data |
A data frame containing the response variable and all covariates
appearing in formula. The data frame must contain:
Replicated measurements are represented by multiple rows sharing the
same unit identifier. |
weight_method |
Character string specifying the observation weighting method used in estimation. One of "uniform", "minimax", or "quasi-likelihood". |
B |
Integer specifying the number of bootstrap replications used
to estimate the GMM weighting matrix. Defaults to 100. |
t_grid_length |
Integer specifying the number of frequency grid points used in phase-function integration. Larger values improve numerical accuracy at the cost of computation time. |
The function provides an lm-like interface while internally handling
replicated error-prone covariates, measurement error correction, and
robust variance estimation.
This function implements a measurement-error-corrected linear regression estimator for models with replicated error-prone covariates. When fewer than two units contain replicated measurements, the function automatically falls back to a quadratic (identity-weight) estimator. This ensures the model remains estimable even in the absence of replication.
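The fallback rule above depends on how many units contribute more than one row. As a rough illustration, assuming a long-format layout with one row per replicate and a unit identifier column (as in the simulated examples), the count can be obtained as follows. The variable names and the data are made up, and the package's internal check may differ.

```r
# Made-up unit identifiers: units 1 and 3 have two rows each,
# units 2 and 4 have a single row.
unit <- c(1, 1, 2, 3, 3, 4)

rows_per_unit <- table(unit)             # number of rows per unit
n_replicated  <- sum(rows_per_unit >= 2) # units with replicated measurements

# Rule stated in the text: fewer than two replicated units
# triggers the quadratic (identity-weight) fallback.
use_fallback <- n_replicated < 2

print(n_replicated)
print(use_fallback)
```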
The estimation procedure:
Aggregates replicated measurements into a structured array.
Uses phase-function estimating equations to correct for unknown measurement error distributions.
Combines moment conditions via GMM when sufficient replication information is available.
Automatically switches to a quadratic (identity-weight) estimator when fewer than two statistical units contain replicated measurements.
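To make the phase-function step more concrete: the phase function is the characteristic function divided by its modulus. For a symmetric measurement error the characteristic function is real-valued, so wherever it is positive it cancels in this normalisation. The sketch below computes an empirical phase function in base R. `empirical_phase` is a hypothetical helper and the frequency grid is an arbitrary choice; the package's own grid, weighting, and estimating equations are not reproduced here.

```r
# Hypothetical helper (not part of PhaseGMM): empirical phase function,
# i.e. the empirical characteristic function scaled to unit modulus.
empirical_phase <- function(y, t_grid) {
  # Empirical characteristic function evaluated at each frequency t
  ecf <- vapply(t_grid, function(t) mean(exp(1i * t * y)), complex(1))
  ecf / Mod(ecf)
}

set.seed(1)
y      <- rnorm(200)                     # made-up response values
t_grid <- seq(0.1, 1, length.out = 20)   # arbitrary frequency grid

rho <- empirical_phase(y, t_grid)
head(rho)
```

A finer grid (compare the `t_grid_length` argument) gives a better approximation to integrals over t, at a higher computational cost.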
Variance estimation is performed using a sandwich estimator, with the GMM weighting matrix estimated via a cluster bootstrap over statistical units.
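A cluster bootstrap resamples whole statistical units rather than individual rows, so that the replicates of a unit stay together. The following base-R sketch shows one such resampling step. `cluster_resample` and its `unit_col` argument are hypothetical names, and the data are simulated; this illustrates the general technique, not the package's implementation.

```r
# Hypothetical helper (not part of PhaseGMM): draw units with replacement
# and keep every row belonging to each drawn unit.
cluster_resample <- function(data, unit_col = "unit") {
  ids   <- unique(data[[unit_col]])
  drawn <- sample(ids, length(ids), replace = TRUE)
  pieces <- lapply(seq_along(drawn), function(k) {
    rows <- data[data[[unit_col]] == drawn[k], , drop = FALSE]
    rows[[unit_col]] <- k  # relabel so repeated draws remain distinct units
    rows
  })
  do.call(rbind, pieces)
}

set.seed(1)
dat <- data.frame(unit = rep(1:5, each = 2), y = rnorm(10))  # made-up data

boot_dat <- cluster_resample(dat)
table(boot_dat$unit)
```

Repeating this step B times and re-evaluating the moment conditions on each resample is one common way to estimate a weighting matrix.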
An object of class "eiv_mlr" containing:
Estimated regression coefficients.
Estimated variance-covariance matrix.
Standard errors of the estimates.
Z-statistics for hypothesis testing.
Two-sided p-values.
Fitted values at the unit level.
Estimation method used (GMM or quadratic fallback).
Number of statistical units.
Standard methods such as summary(), coef(),
vcov(), confint(), predict(), and
residuals() are available for objects of this class.
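The standard errors, z-statistics, and two-sided p-values listed above are related to the coefficients and the variance-covariance matrix in the usual way. A minimal base-R sketch with made-up numbers (not output from the package):

```r
# Made-up coefficient estimates and variance-covariance matrix
est <- c("(Intercept)" = 1.0, W1 = 2.0)
V   <- matrix(c(0.04, 0.00,
                0.00, 0.01),
              nrow = 2, byrow = TRUE,
              dimnames = list(names(est), names(est)))

se <- sqrt(diag(V))       # standard errors
z  <- est / se            # z-statistics
p  <- 2 * pnorm(-abs(z))  # two-sided p-values under a normal reference

print(cbind(Estimate = est, Std.Error = se, z = z, p.value = p))
```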
    ## ------------------------------------------
    ## Small reproducible example
    ## (for speed, the number of bootstrap samples is deliberately too small)
    ## ------------------------------------------
    set.seed(1)
    n <- 30
    J <- 2
    unit <- rep(1:n, each = J)
    W_true <- rnorm(n)
    W_obs <- rep(W_true, each = J) + rnorm(n * J, sd = 0.5)
    Z1 <- rep(rnorm(n), each = J)
    y <- rep(1 + 2 * W_true - 0.5 * Z1[seq(1, n * J, by = J)], each = J) +
      rnorm(n * J)
    sim_data <- data.frame(unit = unit, y = y, W1 = W_obs, Z1 = Z1)

    # For speed reasons, we use a very small number of bootstrap samples
    fit <- eiv_mlr(
      y ~ W(W1) + Z(Z1),
      data = sim_data,
      B = 10,
      t_grid_length = 20
    )
    coef(fit)
    summary(fit)

    ## ------------------------------------------
    ## Additional examples (not run during checks)
    ## ------------------------------------------

    ## ------------------------------------------
    ## Example using included dataset
    ## ------------------------------------------
    fit <- eiv_mlr(
      bmi ~ W(energy + protein + fat) + Z(age_in_month),
      data = dietary_white_women,
      weight_method = "minimax",
      B = 100,
      t_grid_length = 200
    )
    summary(fit)
    confint(fit)

    ## ------------------------------------------
    ## Simulated example with replication
    ## ------------------------------------------
    set.seed(1)
    n <- 200
    J <- 2
    unit <- rep(1:n, each = J)
    W_true <- rnorm(n)
    W_obs <- rep(W_true, each = J) + rnorm(n * J, sd = 0.5)
    Z1 <- rep(rnorm(n), each = J)
    y <- rep(1 + 2 * W_true - 0.5 * Z1[seq(1, n * J, by = J)], each = J) +
      rnorm(n * J)
    sim_data <- data.frame(unit = unit, y = y, W1 = W_obs, Z1 = Z1)
    fit_rep <- eiv_mlr(y ~ W(W1) + Z(Z1), data = sim_data, B = 20)
    summary(fit_rep)

    ## ------------------------------------------
    ## Simulated example without replication
    ## ------------------------------------------
    sim_norep <- sim_data[!duplicated(sim_data$unit), ]
    fit_norep <- eiv_mlr(y ~ W(W1) + Z(Z1), data = sim_norep, B = 20)
    summary(fit_norep)
Generates fitted values or confidence intervals for new observations using plug-in estimates. Measurement error is not corrected for new data.
    ## S3 method for class 'eiv_mlr'
    predict(
      object,
      newdata = NULL,
      interval = c("none", "confidence"),
      level = 0.95,
      ...
    )
object |
An object of class "eiv_mlr". |
newdata |
Optional data frame containing covariates. If omitted, predictions are returned for the training data. |
interval |
Type of interval to compute: "none" or "confidence". |
level |
Confidence level for intervals. |
... |
Not used. |
A numeric vector of predictions, or a matrix with columns
fit, lwr, and upr if intervals are requested.
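A plug-in prediction multiplies the new covariate values by the estimated coefficients. One common way to attach a confidence interval to the fitted mean is the delta-method standard error sqrt(x' V x). The sketch below illustrates that calculation in base R with made-up coefficients and a made-up variance-covariance matrix; whether the package computes its intervals in exactly this way is an assumption.

```r
# Made-up coefficient estimates and variance-covariance matrix
beta <- c("(Intercept)" = 1.0, W1 = 2.0, Z1 = -0.5)
V    <- diag(c(0.04, 0.01, 0.01))

# New observations (covariates used as observed, with no error correction)
newdata <- data.frame(W1 = c(0, 1), Z1 = c(1, 2))
X <- cbind(1, as.matrix(newdata))         # design matrix with intercept column

fit    <- drop(X %*% beta)                # plug-in fitted values
se_fit <- sqrt(rowSums((X %*% V) * X))    # sqrt of x' V x for each row
z      <- qnorm(0.975)                    # 95% two-sided critical value

pred <- cbind(fit = fit, lwr = fit - z * se_fit, upr = fit + z * se_fit)
print(pred)
```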
Displays a concise summary of an errors-in-variables linear model fit.
    ## S3 method for class 'eiv_mlr'
    print(x, ...)
x |
An object of class "eiv_mlr". |
... |
Not used. |
Invisibly returns the input object x of class "eiv_mlr".
This function is called for its side effect of printing a summary of the model.
Computes plug-in residuals based on observed covariates and estimated coefficients. Measurement error is not corrected in residuals.
    ## S3 method for class 'eiv_mlr'
    residuals(object, ...)
object |
An object of class "eiv_mlr". |
... |
Not used. |
A numeric vector of residuals.
Produces a coefficient table including standard errors, z-statistics, p-values, and significance stars.
    ## S3 method for class 'eiv_mlr'
    summary(object, ...)
object |
An object of class "eiv_mlr". |
... |
Not used. |
An object of class "summary" containing a
coefficient table and model information.
Variance-Covariance Matrix for eiv_mlr Objects
    ## S3 method for class 'eiv_mlr'
    vcov(object, ...)
object |
An object of class "eiv_mlr". |
... |
Not used. |
A numeric variance-covariance matrix of the parameter estimates.