| Title: | Five Novel Stochastic Regression Models with Arvind-Distributed Errors and Effects |
|---|---|
| Description: | Implements the 'Arvind' distribution and five novel stochastic regression models that replace the traditional Gaussian error assumption with 'Arvind'-distributed errors. The 'Arvind' distribution is a flexible single-parameter continuous distribution on the positive real line characterised by a polynomial numerator with Gaussian-type decay. The package provides complete distribution functions (darvind(), parvind(), qarvind(), rarvind()), maximum likelihood estimation via fit_arvind_mle(), and five model-fitting routines: Random Walk on Coefficients via fit_rw1(), Time-Varying Coefficient Linear Model via fit_tvlm(), Simulation-Extrapolation via fit_simex(), Mixed-Effects Regression via fit_mixed(), and Regime-Switching Hidden Markov Model via fit_hmm(). Additionally provides Monte Carlo forecasting with prediction intervals via forecast_arvind(), comprehensive goodness-of-fit diagnostics (21 metrics and 25 plots) via diagnostics_arvind() and plot_arvind(), k-fold and rolling-window cross-validation via cv_arvind(), and unified model comparison via summary_arvind(). For more details see Pandey, Singh, Tyagi, and Tyagi (2024) "Modelling climate, COVID-19, and reliability data: A new continuous lifetime model under different methods of estimation", Statistics and Applications, 22(2), <https://ssca.org.in/journal.html>. |
| Authors: | Shikhar Tyagi [aut, cre] (ORCID: <https://orcid.org/0000-0003-1606-0844>), Arvind Pandey [aut] |
| Maintainer: | Shikhar Tyagi <[email protected]> |
| License: | MIT + file LICENSE |
| Version: | 1.0.0 |
| Built: | 2026-05-11 20:38:01 UTC |
| Source: | https://github.com/cran/ArvindSt |
Computes the theoretical mean of the Arvind distribution with parameter
theta by numerical integration.
arvind_mean_fn(theta)

| Argument | Description |
|---|---|
| theta | positive numeric scalar; the distribution parameter. |

A numeric scalar giving the theoretical mean, or NA if
integration fails.
arvind_mean_fn(1)
arvind_mean_fn(2)
Computes the theoretical variance of the Arvind distribution with parameter
theta by numerical integration.
arvind_var_fn(theta)

| Argument | Description |
|---|---|
| theta | positive numeric scalar; the distribution parameter. |

A numeric scalar giving the theoretical variance, or NA if
integration fails.
arvind_var_fn(1)
arvind_var_fn(2)
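The moment calculations described above can be sketched in base R. This is only an illustration of the numerical-integration approach, not the package's code: `dens_standin` (a half-normal density) and `raw_moment` are hypothetical names, and the half-normal is a stand-in because it has closed-form moments to check against. It is not the Arvind PDF.

```r
# Stand-in density on (0, Inf): the half-normal. This is NOT the Arvind PDF;
# any function that integrates to 1 over the positive real line could be used.
dens_standin <- function(x) 2 * dnorm(x)

# k-th raw moment by numerical integration; returns NA if integrate() fails.
raw_moment <- function(dens, k) {
  tryCatch(
    integrate(function(x) x^k * dens(x), lower = 0, upper = Inf)$value,
    error = function(e) NA_real_
  )
}

mean_num <- raw_moment(dens_standin, 1)
var_num  <- raw_moment(dens_standin, 2) - mean_num^2

# Closed forms for the half-normal, shown only as a sanity check
c(numeric = mean_num, exact = sqrt(2 / pi))
c(numeric = var_num,  exact = 1 - 2 / pi)
```

Wrapping `integrate()` in `tryCatch()` mirrors the documented behaviour of returning NA when integration fails.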
Performs k-fold cross-validation and optionally rolling-window
(expanding-window) cross-validation for an ArvindFit model.
cv_arvind(fit, k_folds = 5, rolling = TRUE, n0_frac = 0.5, seed = 42)

| Argument | Description |
|---|---|
| fit | an object of class "ArvindFit". |
| k_folds | integer; number of cross-validation folds (default: 5). |
| rolling | logical; if TRUE (the default), rolling-window cross-validation is also performed. |
| n0_frac | numeric; fraction of data used as initial training window for rolling CV (default: 0.5). |
| seed | integer; random seed for reproducibility (default: 42). |

A list with components:
numeric vector of length k_folds; per-fold RMSE.
numeric vector of length k_folds; per-fold MAE.
numeric; average k-fold RMSE.
numeric; average k-fold MAE.
numeric; rolling-window RMSE (or NA).
diagnostics_arvind(), forecast_arvind()
dat <- simulate_arvind_data(n = 50, seed = 1)
m1 <- fit_rw1(Y ~ X1 + X2 + X3, dat, seed = 42)
cv <- cv_arvind(m1, k_folds = 3, rolling = FALSE, seed = 42)
cv$mean_cv_rmse
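The two validation schemes can be illustrated with a minimal base-R sketch. It uses an ordinary `lm()` fit on simulated Gaussian data as a stand-in for an ArvindFit model. The one-step-ahead scoring in the expanding-window part is an assumption, as are all object names other than `mean_cv_rmse`.

```r
set.seed(42)
n <- 60
dat <- data.frame(X1 = rnorm(n), X2 = rnorm(n))
dat$Y <- 1 + 2 * dat$X1 - dat$X2 + rnorm(n, sd = 0.5)

rmse <- function(obs, pred) sqrt(mean((obs - pred)^2))
mae  <- function(obs, pred) mean(abs(obs - pred))

# k-fold CV: random fold labels, refit on k - 1 folds, score on the held-out fold
k_folds <- 5
fold_id <- sample(rep(seq_len(k_folds), length.out = n))
cv_rmse <- numeric(k_folds)
cv_mae  <- numeric(k_folds)
for (k in seq_len(k_folds)) {
  test_idx <- fold_id == k
  fit_k  <- lm(Y ~ X1 + X2, data = dat[!test_idx, ])
  pred_k <- predict(fit_k, newdata = dat[test_idx, ])
  cv_rmse[k] <- rmse(dat$Y[test_idx], pred_k)
  cv_mae[k]  <- mae(dat$Y[test_idx], pred_k)
}

# Expanding-window CV: train on rows 1..t, predict row t + 1 (assumed scheme)
n0 <- floor(0.5 * n)   # initial training window, analogous to n0_frac = 0.5
one_step_err <- vapply(n0:(n - 1), function(t) {
  fit_t <- lm(Y ~ X1 + X2, data = dat[seq_len(t), ])
  as.numeric(dat$Y[t + 1] - predict(fit_t, newdata = dat[t + 1, ]))
}, numeric(1))
rolling_rmse <- sqrt(mean(one_step_err^2))

c(mean_cv_rmse = mean(cv_rmse), mean_cv_mae = mean(cv_mae),
  rolling_rmse = rolling_rmse)
```

Random folds ignore time ordering, while the expanding window respects it. This is presumably why both are offered for models with time-varying structure.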
Computes the probability density function (PDF) of the Arvind distribution.
darvind(x, theta, log = FALSE)

| Argument | Description |
|---|---|
| x | numeric vector of quantiles. |
| theta | positive numeric scalar; the distribution parameter. |
| log | logical; if TRUE, the log-density is returned (default: FALSE). |

The Arvind distribution with parameter theta has PDF
A numeric vector of density values (or log-density values when
log = TRUE).
# Evaluate the PDF at several points
darvind(c(0.5, 1, 2), theta = 1)
# Log-density
darvind(1, theta = 2, log = TRUE)
# Returns 0 for x <= 0
darvind(-1, theta = 1)
Computes 21 goodness-of-fit metrics for any fitted ArvindFit
object, including MSE, RMSE, MAE, MAPE, R-squared, AIC, BIC,
Kolmogorov-Smirnov test, Anderson-Darling statistic, and more.
diagnostics_arvind(fit)

| Argument | Description |
|---|---|
| fit | an object of class "ArvindFit". |

The following metrics are computed:
character; the model type.
Mean Squared Error.
Root Mean Squared Error.
Mean Absolute Error.
Mean Absolute Percentage Error.
R-squared.
Adjusted R-squared.
Akaike Information Criterion.
Corrected AIC.
Bayesian Information Criterion.
Log-likelihood at the MLE.
Mean residual.
Mean Absolute Scaled Error.
Durbin-Watson statistic.
Ljung-Box test statistic.
Ljung-Box p-value.
Estimated Arvind parameter.
Kolmogorov-Smirnov test statistic.
Kolmogorov-Smirnov p-value.
Anderson-Darling test statistic.
Cramer-von Mises test statistic.
A data frame with one row and 21 columns of diagnostic metrics. See Details for the full list.
dat <- simulate_arvind_data(n = 50, seed = 1)
m1 <- fit_rw1(Y ~ X1 + X2 + X3, dat, seed = 42)
diagnostics_arvind(m1)
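Several of the residual-based metrics listed above follow directly from fitted values and residuals. The sketch below computes a handful of them for an ordinary `lm()` fit, used here as a stand-in for an ArvindFit object. The column names are hypothetical and need not match those returned by diagnostics_arvind().

```r
set.seed(1)
n <- 80
x <- rnorm(n)
y <- 0.5 + 1.5 * x + rnorm(n)
fit <- lm(y ~ x)
res <- residuals(fit)
p <- length(coef(fit))          # number of estimated coefficients

mse    <- mean(res^2)
rmse   <- sqrt(mse)
mae    <- mean(abs(res))
r2     <- 1 - sum(res^2) / sum((y - mean(y))^2)
adj_r2 <- 1 - (1 - r2) * (n - 1) / (n - p)
dw     <- sum(diff(res)^2) / sum(res^2)            # Durbin-Watson statistic
lb     <- Box.test(res, lag = 10, type = "Ljung-Box")

data.frame(MSE = mse, RMSE = rmse, MAE = mae, R2 = r2, Adj_R2 = adj_r2,
           DW = dw, LB_stat = unname(lb$statistic), LB_p = lb$p.value)
```

A Durbin-Watson value near 2 and a large Ljung-Box p-value both point to little residual autocorrelation.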
Fits the Arvind distribution to a vector of positive observations by maximum likelihood. Optimisation is performed on the log-scale via the Brent method.
fit_arvind_mle(e_pos)

| Argument | Description |
|---|---|
| e_pos | numeric vector of strictly positive observations. |

A list with components:
numeric; the MLE of theta.
numeric; the minimised negative log-likelihood.
set.seed(42)
x <- rarvind(200, theta = 2)
fit_arvind_mle(x)
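One-parameter maximum likelihood on the log-scale with the Brent method can be sketched as follows. The exponential density is a stand-in, chosen because its MLE is known in closed form. It is not the Arvind likelihood. The search bounds and the name `nll_log_scale` are assumptions.

```r
set.seed(42)
x <- rexp(200, rate = 2)   # stand-in positive data; NOT Arvind variates

# Negative log-likelihood as a function of eta = log(theta),
# so the optimiser works on an unconstrained scale.
nll_log_scale <- function(eta, obs) {
  -sum(dexp(obs, rate = exp(eta), log = TRUE))
}

# method = "Brent" requires a scalar parameter and finite bounds
opt <- optim(par = 0, fn = nll_log_scale, obs = x,
             method = "Brent", lower = -10, upper = 10)

theta_hat <- exp(opt$par)            # back-transform to the natural scale
list(theta = theta_hat, nll = opt$value)

# Closed-form exponential MLE, for comparison only
1 / mean(x)
```

Optimising over log(theta) keeps the estimate strictly positive without constraining the optimiser on the natural scale.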
Fits a hidden Markov model with state-dependent coefficients and Arvind-distributed errors. The EM algorithm with forward-backward recursions is used for parameter estimation, and the Viterbi algorithm decodes the most likely state sequence.
fit_hmm(formula, data, nstates = 2, seed = 42)

| Argument | Description |
|---|---|
| formula | an object of class "formula". |
| data | a data frame containing the variables in the formula. |
| nstates | integer; number of hidden states (default: 2). |
| seed | integer; random seed for reproducibility (default: 42). |

An object of class "ArvindFit", a list containing the
same standard fields as fit_rw1(), plus:
the fitted depmixS4 object.
integer; number of hidden states.
integer vector; Viterbi-decoded state sequence.
matrix; estimated transition probability matrix.
list of numeric vectors; state-specific coefficients.
numeric vector; state-specific standard deviations.
diagnostics_arvind(), forecast_arvind(), cv_arvind()
dat <- simulate_arvind_data(n = 50, seed = 1)
m5 <- fit_hmm(Y ~ X1 + X2 + X3, dat, nstates = 2, seed = 42)
m5$states
Fits a mixed-effects regression model with Arvind-distributed random effects and observation-level errors. Estimation uses a two-stage approach: REML initialisation via lme4, followed by Arvind MLE on the residuals.
fit_mixed(formula, data, group_var = "Season", re_formula = NULL, seed = 42)

| Argument | Description |
|---|---|
| formula | an object of class "formula". |
| data | a data frame containing the variables in the formula and the grouping variable. |
| group_var | character string; the name of the grouping variable in data (default: "Season"). |
| re_formula | optional random-effects formula (default: NULL). |
| seed | integer; random seed for reproducibility (default: 42). |

An object of class "ArvindFit", a list containing the
same standard fields as fit_rw1(), plus:
the fitted lme4::lmer object.
numeric; Arvind parameter estimated from random effects.
character; the grouping variable name.
diagnostics_arvind(), forecast_arvind(), cv_arvind()
dat <- simulate_arvind_data(n = 50, seed = 1)
m4 <- fit_mixed(Y ~ X1 + X2 + X3, dat, group_var = "Group", seed = 42)
m4$theta
Fits a stochastic regression model with time-varying coefficients evolving as a random walk with Arvind-distributed innovations. The observation errors also follow the Arvind distribution.
fit_rw1(formula, data, theta_innov = 2, rw_scale = 0.01, seed = 42)

| Argument | Description |
|---|---|
| formula | an object of class "formula". |
| data | a data frame containing the variables in the formula. |
| theta_innov | positive numeric; the Arvind parameter for state innovations (default: 2.0). |
| rw_scale | numeric; proportion of OLS coefficients used as innovation scale (default: 0.01). |
| seed | integer; random seed for reproducibility (default: 42). |

An object of class "ArvindFit", a list containing:
character; "RW1-approx".
numeric vector; fitted values.
numeric vector; raw residuals.
numeric; estimated Arvind parameter for residuals.
numeric; residual scale.
numeric; shift applied to residuals.
numeric vector; positive standardised residuals.
numeric; negative log-likelihood.
matrix; time-varying coefficient paths.
numeric vector; final coefficient values.
numeric vector; random walk innovation scales.
numeric; Arvind parameter used for innovations.
integer; number of observations.
integer; number of parameters.
matrix; design matrix.
numeric vector; response variable.
the model formula.
the input data frame.
diagnostics_arvind(), forecast_arvind(), cv_arvind()
dat <- simulate_arvind_data(n = 50, seed = 1)
m1 <- fit_rw1(Y ~ X1 + X2 + X3, dat, seed = 42)
m1$theta
Fits a regression model correcting for measurement error attenuation using the SIMEX algorithm with Arvind-distributed measurement noise and residuals.
fit_simex(
  formula,
  data,
  me_vars = NULL,
  me_frac = 0.05,
  lambda_grid = c(0.5, 1, 1.5, 2),
  n_sim = 100,
  theta_me = 2,
  seed = 123
)

| Argument | Description |
|---|---|
| formula | an object of class "formula". |
| data | a data frame containing the variables in the formula. |
| me_vars | character vector of covariate names measured with error (default: NULL). |
| me_frac | numeric; fraction of marginal variance used as measurement error variance (default: 0.05). |
| lambda_grid | numeric vector; SIMEX lambda grid (default: c(0.5, 1, 1.5, 2)). |
| n_sim | integer; number of SIMEX simulation replicates (default: 100). |
| theta_me | positive numeric; Arvind parameter for measurement error (default: 2.0). |
| seed | integer; random seed for reproducibility (default: 123). |

An object of class "ArvindFit", a list containing the
same standard fields as fit_rw1(), plus:
numeric vector; SIMEX-corrected coefficient estimates.
matrix; coefficient estimates at each lambda level.
numeric vector; the SIMEX lambda grid used.
character vector; covariate names with measurement error.
named numeric vector; measurement error variances.
diagnostics_arvind(), forecast_arvind(), cv_arvind()
dat <- simulate_arvind_data(n = 50, seed = 1)
m3 <- fit_simex(Y ~ X1 + X2 + X3, dat, me_vars = c("X1", "X2"), n_sim = 20, seed = 123)
m3$beta
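The simulation and extrapolation steps of SIMEX can be illustrated with a generic base-R sketch. It departs from fit_simex() in ways that should be read as assumptions: the added noise is Gaussian rather than Arvind-distributed, the measurement-error standard deviation `sigma_u` is treated as known, and a quadratic extrapolant is used. All object names are hypothetical.

```r
set.seed(123)
n <- 2000
beta_true <- 2
sigma_u <- 0.5                         # measurement-error SD, assumed known
x_true <- rnorm(n)
w <- x_true + rnorm(n, sd = sigma_u)   # error-prone version of the covariate
y <- 1 + beta_true * x_true + rnorm(n)

# Naive slope: attenuated towards zero by the measurement error
naive <- unname(coef(lm(y ~ w))[2])

# Simulation step: add extra noise with variance lambda * sigma_u^2,
# refit, and average the slope over n_sim replicates
lambda_grid <- c(0.5, 1, 1.5, 2)
n_sim <- 50
slope_at <- function(lambda) {
  mean(replicate(n_sim, {
    w_star <- w + sqrt(lambda) * rnorm(n, sd = sigma_u)
    coef(lm(y ~ w_star))[2]
  }))
}
slopes <- vapply(lambda_grid, slope_at, numeric(1))

# Extrapolation step: quadratic in lambda, evaluated at lambda = -1
extrap <- data.frame(lambda = c(0, lambda_grid), slope = c(naive, slopes))
quad <- lm(slope ~ lambda + I(lambda^2), data = extrap)
simex <- unname(predict(quad, newdata = data.frame(lambda = -1)))

c(naive = naive, simex = simex, truth = beta_true)
```

Extrapolating to lambda = -1 corresponds to the hypothetical case of no measurement error. The quadratic extrapolant typically removes most, but not all, of the attenuation.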
Fits a time-varying coefficient linear model using kernel-weighted least squares (via the tvReg package) with Arvind-distributed residuals.
fit_tvlm(formula, data, bw = NULL, seed = 42)

| Argument | Description |
|---|---|
| formula | an object of class "formula". |
| data | a data frame containing the variables in the formula. |
| bw | numeric or NULL; the kernel bandwidth (default: NULL). |
| seed | integer; random seed for reproducibility (default: 42). |

An object of class "ArvindFit", a list containing the
same standard fields as fit_rw1(), plus:
matrix; time-varying coefficient estimates.
the fitted tvReg::tvLM object.
diagnostics_arvind(), forecast_arvind(), cv_arvind()
dat <- simulate_arvind_data(n = 50, seed = 1)
m2 <- fit_tvlm(Y ~ X1 + X2 + X3, dat, bw = 0.5, seed = 42)
m2$theta
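Kernel-weighted least squares can be sketched in base R without the tvReg package. The version below refits a weighted `lm()` at every time point using a Gaussian kernel. The kernel choice, the rescaled time index `tt`, and all object names are assumptions made for illustration. tvReg's own defaults may differ.

```r
set.seed(42)
n <- 200
tt <- seq_len(n) / n                 # time index rescaled to (0, 1]
beta_true <- 1 + 2 * tt              # slowly varying true slope
x <- rnorm(n)
y <- beta_true * x + rnorm(n, sd = 0.1)

bw <- 0.1                            # kernel bandwidth on the rescaled time axis

# At each time point t0, fit weighted least squares with weights that
# decay with the distance |t - t0| / bw.
tv_coef <- t(vapply(tt, function(t0) {
  w <- dnorm((tt - t0) / bw)
  coef(lm(y ~ x, weights = w))
}, numeric(2)))
colnames(tv_coef) <- c("intercept", "slope")

head(tv_coef)
cor(tv_coef[, "slope"], beta_true)
```

A smaller `bw` lets the coefficient path follow faster changes at the cost of noisier estimates. A larger `bw` moves the fit towards ordinary least squares with constant coefficients.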
Generates Monte Carlo forecasts with 80 percent and 95 percent prediction
intervals for any fitted ArvindFit model. Covariates are forecast
using SARIMA models (via the forecast package) if not supplied.
forecast_arvind(
  fit,
  newdata_sims = NULL,
  h = 120,
  nsim = 5000,
  covariate_models = NULL,
  seed = 123
)

| Argument | Description |
|---|---|
| fit | an object of class "ArvindFit". |
| newdata_sims | optional named list of pre-computed covariate simulation matrices (default: NULL). |
| h | integer; forecast horizon in time steps (default: 120). |
| nsim | integer; number of Monte Carlo replicates (default: 5000). |
| covariate_models | optional list of fitted SARIMA models for covariates (auto-fitted if NULL). |
| seed | integer; random seed for reproducibility (default: 123). |

A list with components:
matrix (h x nsim); full simulation matrix.
numeric vector length h; mean forecast.
numeric vector length h; median forecast.
numeric vector; lower 80 percent prediction interval.
numeric vector; upper 80 percent prediction interval.
numeric vector; lower 95 percent prediction interval.
numeric vector; upper 95 percent prediction interval.
fit_rw1(), diagnostics_arvind(), cv_arvind()
dat <- simulate_arvind_data(n = 50, seed = 1)
m1 <- fit_rw1(Y ~ X1 + X2 + X3, dat, seed = 42)
fc <- forecast_arvind(m1, h = 12, nsim = 100, seed = 42)
head(fc$mean)
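The summaries in the returned list can be derived from an h x nsim simulation matrix with row-wise means and quantiles. The sketch below shows that step only. The predictive model (linear trend plus Gaussian noise) is a stand-in, and every component name except `mean` is hypothetical.

```r
set.seed(123)
h <- 12        # forecast horizon
nsim <- 1000   # Monte Carlo replicates

# Stand-in predictive model: linear trend plus Gaussian noise.
# Each column of `sims` is one simulated future path of length h.
trend <- 10 + 0.5 * seq_len(h)
sims <- matrix(trend + rnorm(h * nsim, sd = 2), nrow = h, ncol = nsim)

# Row-wise quantiles give the interval bounds at each horizon step
q <- apply(sims, 1, quantile, probs = c(0.025, 0.10, 0.50, 0.90, 0.975))

fc <- list(
  sims   = sims,
  mean   = rowMeans(sims),
  median = q[3, ],
  lo80   = q[2, ], hi80 = q[4, ],
  lo95   = q[1, ], hi95 = q[5, ]
)

head(cbind(lo95 = fc$lo95, lo80 = fc$lo80, mean = fc$mean,
           hi80 = fc$hi80, hi95 = fc$hi95))
```

Because the intervals are empirical quantiles of the simulated paths, they need not be symmetric about the mean. That matters when the error distribution is skewed.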
Computes the cumulative distribution function (CDF) of the Arvind distribution.
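parvind(q, theta, lower.tail = TRUE)

| Argument | Description |
|---|---|
| q | numeric vector of quantiles. |
| theta | positive numeric scalar; the distribution parameter. |
| lower.tail | logical; if TRUE (the default), lower-tail probabilities P(X <= q) are returned; otherwise, upper-tail probabilities P(X > q). |
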
The CDF is given by
A numeric vector of probabilities.
parvind(1, theta = 1)
parvind(c(0.5, 1, 2), theta = 2)
parvind(1, theta = 1, lower.tail = FALSE)
Generates up to 25 diagnostic plots for a fitted ArvindFit object,
including observed vs fitted, residual histogram with Arvind density overlay,
Q-Q plot, ACF, ECDF comparison, and more.
plot_arvind(fit, output_dir = tempdir(), prefix = NULL)

| Argument | Description |
|---|---|
| fit | an object of class "ArvindFit". |
| output_dir | character; directory where plots are saved. Defaults to a temporary directory. |
| prefix | character or NULL; prefix for the names of the saved plot files (default: NULL). |

The fit object is returned invisibly.
dat <- simulate_arvind_data(n = 50, seed = 1)
m1 <- fit_rw1(Y ~ X1 + X2 + X3, dat, seed = 42)
plot_arvind(m1, output_dir = tempdir())
Computes quantiles of the Arvind distribution by numerical inversion
of the CDF using uniroot.
qarvind(p, theta)

| Argument | Description |
|---|---|
| p | numeric vector of probabilities. |
| theta | positive numeric scalar; the distribution parameter. |

A numeric vector of quantiles.
qarvind(0.5, theta = 1)
qarvind(c(0.25, 0.5, 0.75), theta = 2)
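Quantiles by numerical inversion of a CDF with `uniroot` can be sketched as follows. The half-normal CDF is a stand-in with known exact quantiles, not the Arvind CDF. The names `cdf_standin` and `q_by_inversion`, the starting interval, and the tolerance are assumptions.

```r
# Stand-in CDF on (0, Inf): the half-normal. NOT the Arvind CDF.
cdf_standin <- function(q) ifelse(q > 0, 2 * pnorm(q) - 1, 0)

# Solve cdf(x) = p for each p separately. The CDF is increasing, so
# extendInt = "upX" lets uniroot widen the interval when the root is above 1.
q_by_inversion <- function(p, cdf) {
  vapply(p, function(p_i) {
    uniroot(function(x) cdf(x) - p_i, lower = 0, upper = 1,
            extendInt = "upX", tol = 1e-10)$root
  }, numeric(1))
}

p <- c(0.25, 0.5, 0.75)
q_num <- q_by_inversion(p, cdf_standin)

# Exact half-normal quantiles, for comparison only
cbind(numeric = q_num, exact = qnorm((p + 1) / 2))
```

Since `uniroot` solves one scalar equation at a time, vector input is handled by looping over the probabilities.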
Generates random variates from the Arvind distribution using a rejection sampling algorithm with a half-normal proposal distribution.
rarvind(n, theta)

| Argument | Description |
|---|---|
| n | positive integer; number of random variates to generate. |
| theta | positive numeric scalar; the distribution parameter. |

A numeric vector of length n containing positive random
variates.
set.seed(42)
x <- rarvind(100, theta = 1)
summary(x)
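Rejection sampling with a half-normal proposal can be illustrated with a generic base-R sketch. The target below, a polynomial times a Gaussian-type decay, is a hypothetical unnormalised density chosen only because it has the same broad shape. It is not the Arvind PDF, and `r_reject`, the envelope constant `M`, and its 1% safety margin are assumptions.

```r
# Unnormalised stand-in target on (0, Inf). NOT the Arvind PDF.
target_unnorm <- function(x) (1 + x) * exp(-x^2)

# Half-normal proposal density
proposal_dens <- function(x) 2 * dnorm(x)

# Envelope constant M >= sup(target / proposal), located numerically,
# with a small safety margin for the optimiser's tolerance
ratio <- function(x) target_unnorm(x) / proposal_dens(x)
M <- 1.01 * optimize(ratio, interval = c(0, 10), maximum = TRUE)$objective

r_reject <- function(n) {
  out <- numeric(0)
  while (length(out) < n) {
    y <- abs(rnorm(n))   # half-normal proposals
    u <- runif(n)
    # Accept y when u <= target(y) / (M * proposal(y))
    out <- c(out, y[u * M * proposal_dens(y) <= target_unnorm(y)])
  }
  out[seq_len(n)]
}

set.seed(42)
x <- r_reject(20000)

# Compare the sample mean with the mean obtained by numerical integration
mean_exact <- integrate(function(x) x * target_unnorm(x), 0, Inf)$value /
  integrate(target_unnorm, 0, Inf)$value
c(sample = mean(x), integrated = mean_exact)
```

The target only needs to be known up to its normalising constant, because that constant is absorbed into `M`.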
Generates centred Arvind variates with approximately zero mean, suitable for use as error terms and innovation terms in stochastic regression models.
rarvind_centred(n, theta)

| Argument | Description |
|---|---|
| n | positive integer; number of random variates to generate. |
| theta | positive numeric scalar; the distribution parameter. |

The centred variate is computed as $X - \mu$, where $X \sim \text{Arvind}(\theta)$ and $\mu$ is the mean of the
Arvind distribution.
A numeric vector of length n with approximately zero mean.
set.seed(42)
eps <- rarvind_centred(1000, theta = 2)
mean(eps) # approximately 0
Creates a small simulated dataset that mimics the structure needed for demonstrating the ArvindSt model-fitting functions. Useful for examples and testing.
simulate_arvind_data(n = 60, seed = 42)

| Argument | Description |
|---|---|
| n | integer; number of observations to generate (default: 60). |
| seed | integer; random seed for reproducibility (default: 42). |

A data frame with columns:
numeric; simulated response variable.
numeric; first covariate.
numeric; second covariate.
numeric; third covariate.
factor; grouping variable with 4 levels.
dat <- simulate_arvind_data(n = 50, seed = 1)
head(dat)
Accepts multiple ArvindFit objects, computes diagnostics for each,
produces a unified comparison table, and prints the best model by RMSE,
R-squared, and AIC.
summary_arvind(..., comparison_plots = TRUE, output_dir = tempdir())

| Argument | Description |
|---|---|
| ... | one or more objects of class "ArvindFit". |
| comparison_plots | logical; if TRUE (the default), comparison plots are generated. |
| output_dir | character; directory to save comparison plots. Defaults to a temporary directory. |

A data frame of diagnostic metrics (one row per model) is returned invisibly.
diagnostics_arvind(), plot_arvind()
dat <- simulate_arvind_data(n = 50, seed = 1)
m1 <- fit_rw1(Y ~ X1 + X2 + X3, dat, seed = 42)
summary_arvind(m1)