| Title: | Extensible Finite Mixtures of Quantile and Expectile Regressions |
|---|---|
| Description: | An extensible expectation-maximization (EM) framework for finite mixtures of quantile regressions (clusterwise / mixture-of-experts quantile regression). A single EM substrate with an engine/extension contract carries a family of capabilities: the core free-weight mixture of Wu and Yao (2016) <doi:10.1016/j.csda.2014.04.014> -- a fast asymmetric-Laplace path and the nonparametric kernel-density EM with components constrained to have their tau-quantile equal to zero (Hall and Presnell 1999 device); expectile and M-quantile component-loss families (Newey and Powell 1987; Breckling and Chambers 1988); component-specific penalized variable selection (SCAD / adaptive-LASSO, the quantile analogue of Khalili and Chen 2007); and joint multi-quantile estimation with a shared latent classification and non-crossing component curves. Provides classification-aware standard errors (sparsity and stochastic-EM multiple imputation), multi-start estimation, component-count selection, and prediction. The companion package 'mixqrgate' adds location-varying gating. |
| Authors: | Kailas Venkitasubramanian [aut, cre, cph] |
| Maintainer: | Kailas Venkitasubramanian <[email protected]> |
| License: | MIT + file LICENSE |
| Version: | 0.2.0 |
| Built: | 2026-06-25 19:59:30 UTC |
| Source: | https://github.com/cran/mixqr |
Component coefficients of a mixqr fit
## S3 method for class 'mixqr' coef(object, include_pi = FALSE, ...)## S3 method for class 'mixqr' coef(object, include_pi = FALSE, ...)
object |
A |
include_pi |
If |
... |
Unused. |
A (p+1) x m coefficient matrix (one column per component), or a list
if include_pi = TRUE.
Wald intervals from the fitted standard errors. Requires variance = "sparsity" or "stochEM"; "stochEM" is recommended (sparsity SEs are
classification-conditional and understate uncertainty).
## S3 method for class 'mixqr' confint(object, parm, level = 0.95, ...)## S3 method for class 'mixqr' confint(object, parm, level = 0.95, ...)
object |
A |
parm |
Optional subset of parameter names. |
level |
Confidence level. Default |
... |
Unused. |
A matrix of lower/upper bounds.
Measurements from a single-cylinder automobile test engine burning ethanol,
exhibiting two homogeneous regimes when the equivalence ratio is regressed on
nitrous-oxide concentration – the engine example of Wu & Yao (2016, Fig. 5).
Derived from the public-domain ethanol data redistributed in the lattice
package (Brinkman 1981).
engineengine
A data frame with 88 rows and 3 columns:
equivalence ratio: richness of the air/ethanol mix (response).
concentration of nitrous oxide (NO and NO2) in the exhaust, normalized by engine work (predictor).
compression ratio of the engine.
Brinkman, N. D. (1981). Ethanol fuel – a single-cylinder engine study
of efficiency and exhaust emissions. SAE Transactions 90, 1410–1427.
Redistributed via the lattice package ethanol dataset.
Wu, Q. and Yao, W. (2016). Mixtures of quantile regressions. CSDA 93, 162–176.
fit <- mixqr(equivalence ~ nox, data = engine, tau = 0.5, m = 2) summary(fit)fit <- mixqr(equivalence ~ nox, data = engine, tau = 0.5, m = 2) summary(fit)
fitted() returns the classified conditional tau-quantile for each
observation; residuals() the corresponding residuals (or, with
type = "component", the full n x m residual matrix); nobs() the sample
size.
## S3 method for class 'mixqr' fitted(object, ...) ## S3 method for class 'mixqr' residuals(object, type = c("hard", "component"), ...) ## S3 method for class 'mixqr' nobs(object, ...)## S3 method for class 'mixqr' fitted(object, ...) ## S3 method for class 'mixqr' residuals(object, type = c("hard", "component"), ...) ## S3 method for class 'mixqr' nobs(object, ...)
object |
A |
... |
Unused. |
type |
For |
A numeric vector or matrix.
Retrieve a registered mixqr engine constructor
get_mixqr_engine(name)get_mixqr_engine(name)
name |
Character engine name. |
The constructor function.
List registered mixqr engines
list_mixqr_engines()list_mixqr_engines()
Character vector of engine names.
Available for the parametric ALD engine (a genuine working likelihood); NA
for the semiparametric kdEM engine, which has no global likelihood.
## S3 method for class 'mixqr' logLik(object, ...) ## S3 method for class 'mixqr' AIC(object, ..., k = 2) ## S3 method for class 'mixqr' BIC(object, ...)## S3 method for class 'mixqr' logLik(object, ...) ## S3 method for class 'mixqr' AIC(object, ..., k = 2) ## S3 method for class 'mixqr' BIC(object, ...)
object |
A |
... |
Unused. |
k |
The penalty per parameter for |
A "logLik" object (logLik) or a numeric criterion (AIC, BIC).
Estimates a finite mixture of tau-quantile regressions (clusterwise quantile
regression) at a single quantile level. Two engines are available: a fast
parametric asymmetric-Laplace mixture ("ald", genuine likelihood and
AIC/BIC) and the kernel-density EM of Wu & Yao (2016) ("kdEM", nonparametric
component error densities constrained to have their tau-th quantile = 0).
mixqr( formula, data, tau = 0.5, m = 2L, family = c("quantile", "expectile", "mquantile"), engine = "ald", error_density = c("unequal", "equal"), init = c("ald", "kmeans", "random", "manual"), nstart = 20L, control = mixqr_control(), weights = NULL, manual_init = NULL, variance = c("none", "sparsity", "stochEM"), vcontrol = mixqr_vcontrol(), ... )mixqr( formula, data, tau = 0.5, m = 2L, family = c("quantile", "expectile", "mquantile"), engine = "ald", error_density = c("unequal", "equal"), init = c("ald", "kmeans", "random", "manual"), nstart = 20L, control = mixqr_control(), weights = NULL, manual_init = NULL, variance = c("none", "sparsity", "stochEM"), vcontrol = mixqr_vcontrol(), ... )
formula |
A model formula |
data |
A data frame. |
tau |
Quantile level in (0, 1). Default |
m |
Number of mixture components (>= 1). Default |
family |
Component-loss family: |
engine |
|
error_density |
For |
init |
Initialisation strategy: |
nstart |
Number of multi-start initialisations (the mixture likelihood
is multimodal). Default |
control |
A |
weights |
Optional prior observation weights. |
manual_init |
Optional |
variance |
Standard-error method: |
vcontrol |
A |
... |
Reserved for engine extensions; currently unused. |
An object of class "mixqr".
The semiparametric (kdEM) estimator solves estimating equations whose score
I(a <= 0) - tau is not orthogonal to the nuisance tangent space of the
unknown error densities, so it can be BIASED when component error densities are
asymmetric and the clusters have imbalanced overlap (Wu & Yao 2016, sec.6,
Fig.6). Watch fit$diagnostics$overlap (responsibility entropy; larger = more
overlap = more bias risk) and cross-check against the parametric ald engine.
Well-separated clusters and (near-)symmetric errors are the safe regime.
Sparsity SEs (variance = "sparsity", eq.3.3) are CONDITIONAL on the fitted
classification and understate uncertainty (they are flagged as such by
summary()). Use variance = "stochEM" (the stochastic-EM multiple-imputation
estimator, Algorithm 3.1) for inference; it propagates classification and
mixing uncertainty.
Wu, Q. and Yao, W. (2016). Mixtures of quantile regressions. CSDA 93, 162–176.
set.seed(1) d <- sim_mixqr2(n = 300) fit <- mixqr(y ~ x, data = d, tau = 0.5, m = 2, engine = "ald", nstart = 10) fit coef(fit)set.seed(1) d <- sim_mixqr2(n = 300) fit <- mixqr(y ~ x, data = d, tau = 0.5, m = 2, engine = "ald", nstart = 10) fit coef(fit)
mixqr()
Control parameters for mixqr()
mixqr_control( tol = 1e-06, maxit = 500L, reltol = TRUE, bandwidth = NULL, kde_grid = 512L, label_order = "slope", distinct_tol = 0.001, trace = FALSE, seed = NULL )mixqr_control( tol = 1e-06, maxit = 500L, reltol = TRUE, bandwidth = NULL, kde_grid = 512L, label_order = "slope", distinct_tol = 0.001, trace = FALSE, seed = NULL )
tol |
Convergence tolerance on the summed absolute parameter change
(Wu & Yao 2016, p. 164). Default |
maxit |
Maximum EM iterations. Default |
reltol |
Logical; if |
bandwidth |
Optional numeric KDE bandwidth override (kdEM engine). If
|
kde_grid |
Number of grid points for stored KDE evaluation / plotting.
Default |
label_order |
Label-switching constraint: |
distinct_tol |
RELATIVE threshold below which component slope vectors are
flagged as near-coincident (near-violation of Wu & Yao Thm 2.1), measured as
|
trace |
Logical; print iteration progress. Default |
seed |
Optional integer RNG seed for reproducible multi-start. |
A list of class "mixqr_control".
Estimates a finite mixture of quantile regressions at a vector of quantile
levels tau jointly, with one latent classification shared across all levels
(a coupled E-step that pools the per-level component likelihoods) and
guaranteed non-crossing of the within-component quantile curves. This closes
the two problems Wu & Yao (2016, sec.5) leave open: within-component crossing
and cross-tau classification ambiguity.
mixqr_nc( formula, data, m = 2L, tau = c(0.25, 0.5, 0.75), noncrossing = c("rearrange", "none"), class_coupling = c("pool", "median"), nstart = 10L, control = mixqr_control(), weights = NULL )mixqr_nc( formula, data, m = 2L, tau = c(0.25, 0.5, 0.75), noncrossing = c("rearrange", "none"), class_coupling = c("pool", "median"), nstart = 10L, control = mixqr_control(), weights = NULL )
formula, data, m, nstart, control, weights
|
As in |
tau |
Increasing vector of quantile levels (length >= 2). |
noncrossing |
|
class_coupling |
How the shared E-step pools over |
An object of class c("mixqr_multitau", "mixqr") with a coefficient
array coefficients ([p+1, m, L]), the shared posterior/classification,
mix_prop, the tau grid, and a crossing diagnostic (raw crossings, 0
after rearrangement).
Wu, Q. and Yao, W. (2016). Mixtures of quantile regressions. CSDA 93, 162–176. Chernozhukov, V., Fernandez-Val, I. and Galichon, A. (2010). Quantile and probability curves without crossing. Econometrica 78, 1093–1125.
d <- sim_mixqr_cross(n = 160, seed = 1) fit <- mixqr_nc(y ~ x, data = d, m = 2, tau = c(0.1, 0.25, 0.5, 0.75, 0.9)) fit$crossing # raw crossings -> 0 after rearrangement predict(fit)[1, , ] # non-crossing quantiles for observation 1d <- sim_mixqr_cross(n = 160, seed = 1) fit <- mixqr_nc(y ~ x, data = d, m = 2, tau = c(0.1, 0.25, 0.5, 0.75, 0.9)) fit$crossing # raw crossings -> 0 after rearrangement predict(fit)[1, , ] # non-crossing quantiles for observation 1
Component-specific penalized variable selection for the mixqr() model: each
latent component gets its own sparse slope vector, a covariate active in one
quantile-regression cluster and dropped in another. The weighted check-loss
M-step gains a SCAD / adaptive-LASSO / LASSO / MCP penalty (intercept free);
the penalty strength lambda is selected by a mixture BIC over a path, and
components whose mixing weight falls below pi_min are pruned.
mixqr_pen( formula, data, tau = 0.5, m = 2L, penalty = c("SCAD", "aLASSO", "LASSO", "MCP"), lambda = NULL, nlambda = 40L, a = 3.7, alasso_gamma = 1, penalty_factor = NULL, pi_min = 0.02, nstart = 10L, control = mixqr_control(), weights = NULL )mixqr_pen( formula, data, tau = 0.5, m = 2L, penalty = c("SCAD", "aLASSO", "LASSO", "MCP"), lambda = NULL, nlambda = 40L, a = 3.7, alasso_gamma = 1, penalty_factor = NULL, pi_min = 0.02, nstart = 10L, control = mixqr_control(), weights = NULL )
formula, data, tau, m, nstart, control, weights
|
As in |
penalty |
One of |
lambda |
Optional fixed penalty (skips selection). If |
nlambda |
Path length when |
a |
SCAD/MCP concavity. Default |
alasso_gamma |
Adaptive-LASSO weight exponent (default |
penalty_factor |
Optional length- |
pi_min |
Components with mixing weight below this are pruned. Default |
A "mixqr" object with an extra $selection slot (active covariates
per component, chosen lambda, the BIC path, and df).
Khalili, A. and Chen, J. (2007). Variable selection in finite mixture of regression models. JASA 102, 1025–1038. Sherwood, B., Li, S. and Maidman, A. (2025). rqPen. R Journal.
set.seed(1) n <- 250; p <- 8; x <- matrix(rnorm(n * p), n) colnames(x) <- paste0("x", 1:p) z <- rbinom(n, 1, 0.5) y <- ifelse(z == 0, 1 + 2 * x[, 1], -1 + 2 * x[, 2]) + rnorm(n) d <- data.frame(y = y, x) fit <- mixqr_pen(y ~ ., data = d, tau = 0.5, m = 2, penalty = "SCAD") selectedVars(fit)set.seed(1) n <- 250; p <- 8; x <- matrix(rnorm(n * p), n) colnames(x) <- paste0("x", 1:p) z <- rbinom(n, 1, 0.5) y <- ifelse(z == 0, 1 + 2 * x[, 1], -1 + 2 * x[, 2]) + rnorm(n) d <- data.frame(y = y, x) fit <- mixqr_pen(y ~ ., data = d, tau = 0.5, m = 2, penalty = "SCAD") selectedVars(fit)
Fits mixqr over a range of component counts and scores them by an
information criterion or cross-validated check loss.
mixqr_select( formula, data, tau = 0.5, m = 1:4, criterion = c("BIC", "AIC", "cv"), engine = "ald", error_density = c("unequal", "equal"), folds = 5L, weights = NULL, nstart = 20L, control = mixqr_control() )mixqr_select( formula, data, tau = 0.5, m = 1:4, criterion = c("BIC", "AIC", "cv"), engine = "ald", error_density = c("unequal", "equal"), folds = 5L, weights = NULL, nstart = 20L, control = mixqr_control() )
formula, data, tau, weights
|
Passed to |
m |
Integer vector of component counts to try. Default |
criterion |
|
engine |
Engine to use ( |
error_density |
For the kdEM engine. |
folds |
Number of CV folds when |
nstart |
Multi-starts per fit. |
control |
A |
For criterion %in% c("AIC", "BIC") the score uses the parametric ALD
working-likelihood (the semiparametric kdEM engine has no global likelihood;
Wu & Yao 2016, p. 164). Note that AIC/BIC for the number of mixture
components is heuristic: testing m vs m+1 places a component on the
parameter-space boundary (pi_j = 0), so the usual penalty asymptotics do not
hold (McLachlan & Peel 2000, ch. 6; Chen, Chen & Kalbfleisch 2001). The "cv"
criterion (cross-validated held-out check loss) avoids the likelihood entirely
and works for either engine.
A list with table (data frame of scores by m), best (chosen m),
criterion, and fit (the chosen model refit on all data).
Control parameters for stochastic-EM variance estimation (Algorithm 3.1)
mixqr_vcontrol(B = 500L, burnin = 50L, thin = 1L, seed = NULL)mixqr_vcontrol(B = 500L, burnin = 50L, thin = 1L, seed = NULL)
B |
Number of imputation draws. Default |
burnin |
Burn-in draws discarded before accumulation. Default |
thin |
Thinning interval. Default |
seed |
Optional integer RNG seed. |
A list of class "mixqr_vcontrol".
Draws the data coloured by hard classification with the component quantile-regression lines, and a panel of the estimated component error densities (mirroring Wu & Yao Figs. 1, 3–5).
## S3 method for class 'mixqr' plot(x, which = c("both", "fit", "density"), ...)## S3 method for class 'mixqr' plot(x, which = c("both", "fit", "density"), ...)
x |
A |
which |
|
... |
Passed to |
Invisibly x.
Overlays, for each component, the (rearranged, non-crossing) fitted quantile
curves at every tau against the first covariate.
## S3 method for class 'mixqr_multitau' plot(x, ...)## S3 method for class 'mixqr_multitau' plot(x, ...)
x |
A |
... |
Passed to |
x, invisibly.
Predict from a mixqr fit
## S3 method for class 'mixqr' predict( object, newdata = NULL, type = c("quantile", "quantile_byclass", "posterior", "class", "density"), ... )## S3 method for class 'mixqr' predict( object, newdata = NULL, type = c("quantile", "quantile_byclass", "posterior", "class", "density"), ... )
object |
A |
newdata |
Optional data frame. If omitted, the training data are used. |
type |
One of |
... |
Unused. |
A matrix, vector, or list depending on type.
"posterior", "class", and "quantile_byclass" require the RESPONSE in
newdata, because component membership is inferred from the residual
densities. This core therefore cannot assign a class to a covariate vector x
alone; covariate-based gating (membership as a function of x) is provided by
the location-varying-gating extension (QMM sub-project 03).
Predict non-crossing component quantiles from a multi-tau fit
## S3 method for class 'mixqr_multitau' predict(object, newdata = NULL, ...)## S3 method for class 'mixqr_multitau' predict(object, newdata = NULL, ...)
object |
A |
newdata |
Optional data frame; defaults to the training design. |
... |
Unused. |
An [n, m, L] array of fitted quantiles. With
noncrossing = "rearrange" each component's L quantiles are sorted to be
nondecreasing in tau (Chernozhukov et al. 2010), guaranteeing no crossing.
An engine is a constructor function(tau, m, control) returning a list of
closures with the FROZEN signatures documented in mixqr_engine_contract().
Sub-projects 03–06 of the QMM suite ship engines this way without forking
the driver.
register_mixqr_engine(name, constructor)register_mixqr_engine(name, constructor)
name |
Character engine name (e.g. |
constructor |
Function |
Invisibly TRUE.
Active (selected) covariates per component of a penalized mixqr fit
selectedVars(object)selectedVars(object)
object |
A |
A named list (one entry per component) of selected covariate names.
One component is well-behaved; the other has a near-flat slope and strong
heteroscedasticity (error scale growing across the design), so its
independently fitted per-tau quantile lines cross in finite samples – exactly
the within-component crossing Wu & Yao (2016, sec.5) flag as common "when the
sample size is small." Use to demonstrate that noncrossing = "none" crosses
while "rearrange" does not. (A valid location-scale model has non-crossing
true quantiles; crossing here is the finite-sample estimation artefact the
method repairs.)
sim_mixqr_cross(n = 160, seed = NULL)sim_mixqr_cross(n = 160, seed = NULL)
n |
Sample size (default 160; crossing is a small-sample phenomenon). |
seed |
Optional RNG seed. |
A data frame with columns y, x, and the true label z.
Y = 10 - 10x + e (component 1) or Y = -10 + 10x + e (component 2), with
pi = (0.5, 0.5), x ~ U(0, 1), and a bimodal error
e ~ 0.5 N(-1, 1^2) + 0.5 N(2, 2^2) (median 0; median regression, tau = 0.5).
sim_mixqr2(n = 300L, pi1 = 0.5)sim_mixqr2(n = 300L, pi1 = 0.5)
n |
Sample size. |
pi1 |
Probability of component 1. |
A data frame with columns x, y, z (true component label).
Three components with pi = (1/3, 1/3, 1/3), covariates x1, x2 ~ U(0, 1),
means -20 x1 - 20 x2, 0, 20 x1 + 20 x2, and shifted-lognormal errors
exp(N(1, sigma^2)) - exp(1) with sigma^2 in {1, 0.5, 0.25} (each median 0).
sim_mixqr3(n = 300L)sim_mixqr3(n = 300L)
n |
Sample size. |
A data frame with columns x1, x2, y, z.
Prints, per component, the coefficient estimates with standard errors, z-values and p-values (when a variance method was used), the mixing probabilities, and the separability / overlap diagnostics. Sparsity standard errors are flagged as classification-conditional.
## S3 method for class 'mixqr' summary(object, ...)## S3 method for class 'mixqr' summary(object, ...)
object |
A |
... |
Unused. |
An object of class "summary.mixqr".
The estimated parameter covariance (requires a fitted variance method).
## S3 method for class 'mixqr' vcov(object, ...)## S3 method for class 'mixqr' vcov(object, ...)
object |
A |
... |
Unused. |
The variance-covariance matrix.