---
title: "Get started with mixqr"
author: "Kailas Venkitasubramanian"
output:
  rmarkdown::html_vignette:
    toc: true
    toc_depth: 2
bibliography: mixqr.bib
vignette: >
  %\VignetteIndexEntry{Get started with mixqr}
  %\VignetteEngine{knitr::rmarkdown}
  %\VignetteEncoding{UTF-8}
---

```{r setup, include = FALSE}
knitr::opts_chunk$set(
  collapse = TRUE,
  comment = "#>",
  message = FALSE,
  warning = FALSE,
  fig.width = 7,
  fig.height = 4.2,
  dpi = 150,
  fig.align = "center"
)
set.seed(1)
```

**mixqr** is an extensible framework for finite mixtures of quantile (and expectile) regressions: at its core it finds hidden subgroups in your data and fits a separate quantile regression in each. This page is a five-minute tour of that core; the [Tutorial](mixqr-tutorial.html) is the full guide, and the [Extensions article](https://kvenkita.github.io/mixqr/articles/mixqr-extensions.html) covers the expectile/M-quantile families, penalized selection, and non-crossing multi-quantile estimation built on the same platform.

```{r}
library(mixqr)
```

## A two-regime example

The `engine` data [@brinkman1981] record the equivalence ratio (richness of the air/fuel mix) against nitrous-oxide concentration for a test engine. A single line fits badly; there are **two regimes**.

```{r fit}
fit <- mixqr(equivalence ~ nox, data = engine, tau = 0.5, m = 2,
             variance = "stochEM")
fit
```

`mixqr()` has jointly (i) split the observations into two groups and (ii) estimated a **median** regression in each. `summary()` adds standard errors:

```{r summary}
summary(fit)
```

## A first picture

A little `ggplot2` shows the two recovered regimes and their median lines.

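The median lines in the plotting chunk that follows come from multiplying a design matrix by each regime's coefficient column, `cbind(1, grid$nox) %*% fit$beta[, j]`. A minimal sketch of that arithmetic is given here, assuming (as that chunk implies) a coefficient matrix with the intercept in the first row and one column per regime. The numbers in `beta` are hypothetical and chosen only for illustration; they are not estimates from `engine`.

```r
# Hypothetical coefficients, NOT estimated from the engine data:
# rows are (intercept, slope), columns are regimes 1 and 2.
beta <- cbind(c(0.5, 0.2), c(1.4, -0.3))

# A few illustrative predictor values.
x <- c(0, 1, 2)

# Design matrix with a leading column of ones for the intercept.
X <- cbind(1, x)

# One column of fitted values per regime (a 3 x 2 matrix).
median_lines <- X %*% beta
median_lines
```

The plotting chunk does the same thing one regime at a time, taking the coefficients from `fit$beta` and stacking the results into a single data frame for `geom_line()`.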
```{r plot, fig.alt = "Engine data coloured by recovered regime with two median regression lines."}
library(ggplot2)

dat <- transform(engine, regime = factor(predict(fit, type = "class")))
grid <- data.frame(nox = seq(min(engine$nox), max(engine$nox), length.out = 100))
lines <- do.call(rbind, lapply(1:2, function(j) {
  data.frame(nox = grid$nox,
             equivalence = cbind(1, grid$nox) %*% fit$beta[, j],
             regime = factor(j))
}))

ggplot(dat, aes(nox, equivalence, colour = regime)) +
  geom_point(size = 2, alpha = 0.8) +
  geom_line(data = lines, linewidth = 1.1) +
  scale_colour_manual(values = c("#1b6ca8", "#e07b39")) +
  labs(x = "Nitrous oxide", y = "Equivalence ratio",
       title = "Two median regimes recovered by mixqr") +
  theme_minimal(base_size = 12)
```

## Where to next

* The **[Tutorial](mixqr-tutorial.html)** walks through a full applied analysis: interpreting estimates, classifying observations, reading diagnostics, fitting several quantiles, choosing the number of components, and reporting results, all with publication-ready graphics.
* The **[Validation article](https://kvenkita.github.io/mixqr/articles/mixqr-validation.html)** documents the simulation evidence behind the point estimates and the standard errors.

```{r cite, eval = FALSE}
citation("mixqr")
```

## References