---
title: "Modeling Relative Air Humidity with Simplex Regression"
output:
  rmarkdown::html_vignette:
    toc: true
    number_sections: true
vignette: >
  %\VignetteIndexEntry{relative-humidity}
  %\VignetteEngine{knitr::rmarkdown}
  %\VignetteEncoding{UTF-8}
---

```{r, include = FALSE}
knitr::opts_chunk$set(
  collapse = TRUE,
  comment = "#>",
  fig.width  = 7,
  fig.height = 4.5,
  fig.align  = "center",
  message    = FALSE,
  warning    = FALSE
)
old_opts <- options(width = 70, prompt = "R> ", continue = "+  ", digits = 5)
```

# Introduction

## The simplex distribution

Regression models for continuous responses restricted to the unit interval
$(0, 1)$ play a central role in empirical research across economics,
biostatistics, political science, and environmental studies. Typical examples
include rates, proportions, indices, and concentration measures. The
**simplex regression model** provides a flexible and theoretically appealing
framework for such data. It is based on the simplex distribution, introduced
by Barndorff-Nielsen and Jørgensen (1991), which is indexed by a mean
parameter $\mu \in (0, 1)$ and a dispersion parameter $\sigma^2 > 0$.

A random variable $Y$ follows a simplex distribution,
$Y \sim S^-(\mu, \sigma^2)$, if its density is

$$f(y; \mu, \sigma^2) = \left\{2\pi\sigma^2[y(1-y)]^3\right\}^{-1/2}
\exp\!\left(-\frac{d(y;\mu)}{2\sigma^2}\right), \quad y \in (0,1),$$

where $d(y;\mu) = (y - \mu)^2 / [y(1-y)\mu^2(1-\mu)^2]$ is the unit
deviance. The mean and variance are $\mathrm{E}(Y) = \mu$ and
$\mathrm{Var}(Y) < \mu(1-\mu)$, with variance function
$\mathrm{V}(\mu) = \mu^3(1-\mu)^3$.

The simplex distribution can assume a wide variety of shapes — symmetric,
left- or right-skewed, J-shaped, U-shaped, and even bimodal — making it
particularly flexible for modeling bounded responses. Unlike the beta
distribution, which is indexed by a precision parameter, the simplex
distribution is indexed by a dispersion parameter $\sigma^2$: smaller values
correspond to higher concentration around the mean, whereas larger values
indicate greater variability.

## The simplex regression model

The simplex regression model with variable dispersion relates the mean and
dispersion to covariates through link functions:

$$g(\mu_i) = \mathbf{x}_i^\top \boldsymbol{\beta} = \eta_{1i}
\quad \text{and} \quad
h(\sigma^2_i) = \mathbf{z}_i^\top \boldsymbol{\gamma} = \eta_{2i},$$

where $\boldsymbol{\beta}$ and $\boldsymbol{\gamma}$ are unknown parameter
vectors. For models with a **parametric mean link**, an additional shape
parameter $\lambda > 0$ is estimated jointly with $\boldsymbol{\beta}$ and
$\boldsymbol{\gamma}$:

$$g(\mu_i, \lambda) = \mathbf{x}_i^\top \boldsymbol{\beta}.$$

The `SimplexRegression` package supports five fixed link functions for the
mean submodel — `"logit"`, `"probit"`, `"loglog"`, `"cloglog"`,
`"cauchit"` — and two parametric links, `"plogit1"` and `"plogit2"`,
defined as

$$\text{plogit1:}\; g(\mu_i,\lambda) = \log\!\left[(1-\mu_i)^{-\lambda}-1\right],
\qquad
\text{plogit2:}\; g(\mu_i,\lambda) = \log\!\left(\frac{\mu_i^\lambda}{1-\mu_i^\lambda}\right).$$

Both reduce to the standard logit when $\lambda = 1$. For the dispersion
submodel, the links `"log"`, `"sqrt"`, and `"identity"` are available.
All parameters are estimated by maximum likelihood using a mixed algorithm
that combines the BFGS quasi-Newton method with Fisher scoring steps.

## Advantages over beta regression

Compared to beta regression, simplex regression has several distinctive
advantages:

- It can represent **bimodal densities**, which the beta distribution
  cannot;
- Maximum likelihood estimators for the mean and dispersion parameters are
  **asymptotically orthogonal**, simplifying inference and improving
  numerical stability;
- Likelihood-based inferences are typically **less sensitive to atypical
  observations**;
- Its higher-order variance function $\mu^3(1-\mu)^3$ allows **more
  flexible variance patterns** than the quadratic variance function of
  the beta distribution.

## This vignette

This vignette illustrates the main functionalities of the `SimplexRegression`
package through the analysis of monthly average relative humidity data
recorded in Brasília, Brazil ($n = 312$ observations). The analysis covers:

- data preparation and exploratory summary;
- fitting simplex regression models with **parametric** and **fixed** mean
  link functions;
- link function selection via the **penalized Scout Score** (`penalized.ss`)
  and **penalized information criteria** (`penalized.ic`);
- **likelihood ratio test** for nested models (`lrtest`);
- score test for $H_0\!: \lambda = 1$ (`scoretest`);
- specification testing with the **RESET test** (`resettest`);
- fitted values, residuals, and predictive performance (`press`);
- prediction and simulation;
- influence measures: hat values, Cook's distance, generalized leverage;
- diagnostic plots (`plot`, `halfnormal.plot`);
- **local influence** and **diagnostic distance** measures
  (`local.influence`, `diag.im`, `diag.distances`).

# The Data

Relative air humidity (RH) is defined as the ratio of the partial pressure
of water vapor in the air to the saturation vapor pressure at the same
temperature. In Brasília, low RH levels during dry months are associated with
adverse health outcomes (asthma, nosebleeds, dehydration), increased
forest-fire risk, and pressure on water resources. The dataset contains
monthly averages from January 2000 to December 2025, obtained from the
National Institute of Meteorology (INMET).

```{r setup}
library(SimplexRegression)
data(RelativeHumidity, package = "SimplexRegression")
head(RelativeHumidity, 5)
```

| Variable | Description |
|----------|-------------|
| `Date`   | Observation date (end of month) |
| `RH`     | Monthly average relative humidity, rescaled to $(0,1)$ |
| `Ins`    | Total insolation (hours), with two missing values |
| `Ins2`   | Total insolation with imputed missing values |
| `Pre`    | Total precipitation (mm), with two missing values |
| `Pre2`   | Total precipitation with imputed missing values |
| `Neb`    | Average cloudiness (tenths) |
| `AP`     | Average atmospheric pressure (hPa) |
| `MT`     | Average maximum temperature (°C) |
| `WS`     | Average wind speed (m/s) |
| `Dir`    | Predominant wind direction (degrees) |

## Data preparation

To capture the 12-month seasonal cycle we add harmonic regressors
$s_i = \sin(2\pi i/12)$ and $c_i = \cos(2\pi i/12)$, and a rainy-season
dummy equal to 1 for October, November, and December (the rainy season in
Brasília) and 0 otherwise.

```{r data-prep}
rh       <- RelativeHumidity
rh$hs    <- sin(2 * pi * seq_len(nrow(rh)) / 12)
rh$hc    <- cos(2 * pi * seq_len(nrow(rh)) / 12)
rh$dummy <- as.integer(as.integer(format(rh$Date, "%m")) %in% 10:12)
```

## Descriptive summary

```{r summary-rh}
summary(rh$RH)
cat(sprintf(
  "Std. dev.: %.4f  |  Skewness: %.4f\n",
  sd(rh$RH),
  mean(((rh$RH - mean(rh$RH)) / sd(rh$RH))^3)
))
```

The response ranges from 0.301 to 0.851, with mean 0.635 and moderate left
skewness ($-0.47$), consistent with a higher concentration of RH values
above the mean during the wetter months. These features support the use of a
unit-interval distribution such as the simplex.

# Model Formula

All models fitted in this vignette share the same formula structure. The
two-part formula `y ~ x | z` specifies the mean submodel (left of `|`) and
the dispersion submodel (right of `|`) simultaneously. The mean submodel
includes insolation (`Ins2`), maximum temperature (`MT`), wind speed (`WS`),
harmonic terms (`hs`, `hc`) to capture seasonality, a rainy-season dummy,
and its interaction with wind speed. The dispersion submodel contains only
precipitation (`Pre2`), allowing the variability of RH to depend on rainfall.

```{r formula}
formula <- RH ~ Ins2 + MT + WS + hs + hc + dummy + I(dummy * WS) | Pre2
```

# Parametric Mean Link Functions

## Fitting and model selection

We first fit models with the two parametric mean links available in the
package, `plogit1` and `plogit2`. These links include a shape parameter
$\lambda$ estimated jointly with the regression coefficients. When $\lambda = 1$,
both reduce to the standard logit link; values $\lambda \neq 1$ introduce
asymmetry in the link, adding flexibility to capture non-standard
relationships between the mean response and the linear predictor.

```{r models-plogit}
fit_p1 <- simplexreg(formula, data = rh, link.mu = "plogit1")
fit_p2 <- simplexreg(formula, data = rh, link.mu = "plogit2")
```

## Penalized Scout Score and information criteria

When comparing models that all use a parametric link, the **penalized Scout
Score** (`penalized.ss`) and **penalized information criteria**
(`penalized.ic`) are recommended. These functions incorporate an additional
penalty controlled by `kappa` (default `kappa = 0.1`) that accounts for the
complexity introduced by $\lambda$ relative to its deviation from the
standard logit. The model with the highest Scout Score or lowest penalized
criterion is selected. 

```{r penalized-ss-param}
penalized.ss(fit_p1, fit_p2, kappa = 0.1)
```

```{r penalized-ic}
penalized.ic(fit_p1, fit_p2, kappa = 0.1)
```

All penalized criteria unanimously select the `plogit1` link. The Scout
Score of `fit_p1` is substantially higher than that of `fit_p2`, and all
three penalized information criteria are lower for `fit_p1`.

# Fixed Mean Link Functions

## Fitting

We also fit models with all five fixed mean links supported by the package.
Fixed links do not include an additional parameter and are therefore always
comparable using standard (unpenalized) criteria.

```{r models-fixed}
fit_loglog  <- simplexreg(formula, data = rh, link.mu = "loglog")
fit_logit   <- simplexreg(formula, data = rh, link.mu = "logit")
fit_probit  <- simplexreg(formula, data = rh, link.mu = "probit")
fit_cauchit <- simplexreg(formula, data = rh, link.mu = "cauchit")
fit_cloglog <- simplexreg(formula, data = rh, link.mu = "cloglog")
```

## Comparing all candidates

When comparing models with both fixed and parametric links simultaneously,
the **unpenalized** version (`kappa = 0`) must be used, since the penalty
for $\lambda$ is only meaningful when all candidates use a parametric link.

```{r penalized-ss-all}
penalized.ss(
  fit_loglog, fit_logit, fit_probit,
  fit_cauchit, fit_cloglog, fit_p1,
  kappa = 0
)
```

The log-log link achieves the highest Scout Score and is selected as the
best-fitting specification. We proceed with `fit_loglog` for all remaining
analyses.

# Model Summary

The `summary` method provides a standard regression output: estimated
coefficients for the mean and dispersion submodels, standard errors, Wald
$z$-statistics and $p$-values, the maximized log-likelihood, information
criteria (AIC, BIC, HQIC), pseudo-$R^2$ measures, and details on the
numerical optimization procedure.

```{r summary-fit}
summary(fit_loglog)
```

All regression coefficients are statistically significant at the 5% level.
The harmonic terms (`hs`, `hc`) and the rainy-season dummy capture the
strong seasonal pattern in RH, while insolation (`Ins2`), maximum temperature
(`MT`), and wind speed (`WS`) explain the within-season variation. The
Nagelkerke and Ferrari–Cribari-Neto pseudo-$R^2$ values both exceed 0.95,
indicating an excellent fit. The model converged in 101 BFGS iterations
followed by 7 Fisher scoring steps.

# Standard Information Criteria

The standard (unpenalized) AIC, BIC, and HQIC are accessible via the usual
S3 generics and compare multiple models simultaneously. Lower values
indicate a better trade-off between fit and complexity.

```{r aic-bic}
AIC(fit_loglog, fit_logit, fit_probit, fit_cauchit, fit_cloglog, fit_p1)
BIC(fit_loglog, fit_logit, fit_probit, fit_cauchit, fit_cloglog, fit_p1)
HQIC(fit_loglog, fit_logit, fit_probit, fit_cauchit, fit_cloglog, fit_p1)
```

The log-log model consistently achieves the lowest values across all three
criteria, confirming the Scout Score selection.

# Extracting Model Components

## Coefficients and covariance matrix

The `coef` method extracts estimated coefficients for the full model or for
each submodel separately, controlled by the `model` argument
(`"full"`, `"mean"`, or `"dispersion"`). The `vcov` method returns the
corresponding variance-covariance matrix.

```{r coef-vcov}
coef(fit_loglog)                          # full coefficient vector
coef(fit_loglog, model = "mean")          # mean submodel only
coef(fit_loglog, model = "dispersion")    # dispersion submodel only
round(vcov(fit_loglog, model = "mean"), 6)  # vcov of mean submodel
```

## Log-likelihood

The `logLik` method returns the maximized log-likelihood together with the
number of estimated parameters (`df`) and the number of observations. This
is compatible with `AIC()` and `BIC()` from the `stats` package.

```{r loglik}
logLik(fit_loglog)
```

# Hypothesis Tests

## Likelihood ratio test for nested models

The `lrtest` method from the `lmtest` package performs likelihood ratio tests
between nested simplex regression models. A natural application is testing
whether the dispersion is constant against a model with covariate-dependent
dispersion. The restricted model (constant dispersion) is obtained via
`update` by replacing the dispersion submodel with an intercept only
(`| 1`).

```{r lrtest}
fit_loglog_null <- update(fit_loglog, . ~ . | 1)
lmtest::lrtest(fit_loglog, fit_loglog_null)
```

The very small $p$-value ($< 0.001$) provides strong evidence against
constant dispersion, confirming that RH variability is significantly
associated with precipitation (`Pre2`). The model with variable dispersion
is therefore preferred.

## Score test for $H_0\!: \lambda = 1$

The `scoretest` function implements Rao's score test for the hypothesis that
the standard logit link is adequate, i.e., $H_0\!: \lambda = 1$. The test
requires a model fitted under $H_0$ (with the logit link) and tests against
either `"plogit1"` or `"plogit2"` as the alternative. Rejection of $H_0$
suggests that a parametric link provides a better fit than the standard
logit.

```{r scoretest}
fit_logit_h0 <- simplexreg(formula, data = rh, link.mu = "logit")
scoretest(fit_logit_h0, link.mu = "plogit1")
scoretest(fit_logit_h0, link.mu = "plogit2")
```

## RESET test for functional form

The `resettest` function implements Ramsey's RESET test, which augments the
original model with the squared fitted linear predictor as an additional
covariate and tests its significance. By default (`dispersion = TRUE`), the
squared predictor is added to both the mean and dispersion submodels.
Setting `dispersion = FALSE` restricts the augmentation to the mean submodel
only. Failure to reject $H_0$ supports the adequacy of the assumed
functional form.

```{r resettest}
resettest(fit_loglog)                      # both submodels augmented
resettest(fit_loglog, dispersion = FALSE)  # mean submodel only
```

The large $p$-values in both cases ($0.817$ and $0.929$) provide no
evidence of functional form misspecification in either submodel.

# Fitted Values, Residuals, and Predictive Performance

## Fitted values and residuals

The `fitted` method returns fitted mean values $\hat{\mu}_i$. The
`residuals` method supports ten residual types. Quantile residuals
(default) have an approximately standard normal distribution under correct
model specification and are recommended for general diagnostics. Weighted
residuals are particularly useful for half-normal plots.

```{r fitted-resid}
head(fitted(fit_loglog))
head(residuals(fit_loglog, type = "quantile"))  # approx. N(0,1)
head(residuals(fit_loglog, type = "pearson"))
head(residuals(fit_loglog, type = "weighted"))  # for halfnormal.plot
```

## Predictive performance: PRESS and $P^2$

The `press` function computes the PRESS (Predicted Residual Error Sum of
Squares) statistic and the associated cross-validation measures $P^2$ and
adjusted $P^2_c$. The $P^2$ statistic is a cross-validation analog of the
coefficient of determination, computed from the hat matrix diagonal without
refitting the model $n$ times. Values close to 1 indicate strong predictive
accuracy. The function accepts multiple models simultaneously, enabling
direct comparison of predictive performance across competing specifications.

```{r press}
press(fit_loglog)                            # single model
press(fit_loglog, fit_logit, fit_probit)     # comparing models
```

The log-log model achieves the highest $P^2$ and $P^2_c$ values, confirming
its superior predictive performance. The PRESS statistic reflects the total
cumulative squared prediction error across all leave-one-out fits.

# Prediction and Simulation

## Prediction

The `predict` method computes predicted values for new or original data. The
`type` argument controls the output: `"response"` returns fitted means
$\hat{\mu}_i$; `"link"` returns the linear predictors $\hat{\eta}_{1i}$ and
$\hat{\eta}_{2i}$; `"dispersion"` returns fitted dispersion values
$\hat{\sigma}^2_i$.

```{r predict}
head(predict(fit_loglog, type = "response"))        # fitted means
head(predict(fit_loglog, type = "link")$mean)       # mean linear predictor
head(predict(fit_loglog, type = "link")$dispersion) # dispersion predictor
head(predict(fit_loglog, type = "dispersion"))      # fitted sigma^2

# Out-of-sample prediction
new_obs <- rh[1:3, ]
predict(fit_loglog, newdata = new_obs, type = "response")
```

## Simulation

The `simulate` method generates response vectors from the fitted simplex
distribution using the estimated $\hat{\mu}_i$ and $\hat{\sigma}^2_i$. The
`nsim` argument controls the number of replicates and `seed` ensures
reproducibility. The result is a data frame with `nsim` columns.

```{r simulate}
set.seed(2026)
sims <- simulate(fit_loglog, nsim = 3)
head(sims)
```

# Influence Measures

## Hat values and Cook's distance

The `hatvalues` method returns the diagonal elements $h_{ii}$ of the hat
matrix, measuring the leverage of each observation on the fitted values.
Observations with $h_{ii}$ much larger than the average $r/n$ (where $r$
is the number of estimated parameters) deserve closer inspection. The
`cooks.distance` method computes approximate Cook's distances, which combine
leverage and residual size to measure overall influence on the parameter
estimates. The `type` argument selects the residual type used
(`"pearson"` or `"weighted"`).

```{r influence}
hii  <- hatvalues(fit_loglog)
cook <- cooks.distance(fit_loglog, type = "pearson")
cat(sprintf("Leverages  — max: %.4f  mean: %.4f\n", max(hii),  mean(hii)))
cat(sprintf("Cook's D   — max: %.4f\n", max(cook)))
```

No observation has a disproportionately large leverage or Cook's distance,
suggesting no single point dominates the fit.

## Generalized leverage

The `gleverage` function computes the generalized leverage values, which
extend the classical hat values to account for both the mean and dispersion
submodels. High generalized leverage indicates observations that may exert
disproportionate influence on the fitted values across the entire model.

```{r gleverage}
gl <- gleverage(fit_loglog)
cat(sprintf("Generalized leverage — max: %.4f  mean: %.4f\n",
            max(gl), mean(gl)))
```

# Diagnostic Plots

## Residual and fit plots

The `plot` method produces up to eight diagnostic plots selected via `which`:
(1) residuals vs. observation index; (2) residuals vs. fitted values;
(3) residuals vs. linear predictor; (4) observed vs. fitted values;
(5) normal Q-Q plot; (6) Cook's distances; (7) generalized
leverages. The default residual type is `"quantile"`. For plots 6 and 7,
`threshold` highlights influential observations and `label.pos` controls
label placement (1 = below, 2 = left, 3 = above, 4 = right).

```{r plots-1-5, fig.height = 8, fig.cap = "Diagnostic plots (1–6) for the fitted simplex regression model with log-log link."}
oldpar <- par(mfrow = c(3, 2))
plot(fit_loglog, which = 1:5, reset.par = FALSE)
par(oldpar)
```

```{r plot-cook, fig.height = 4, fig.cap = "Cook's distances. Observations exceeding the threshold of 0.15 are labeled."}
plot(fit_loglog, which = 6, threshold = 0.15, label.pos = 4)
```

```{r plot-glev, fig.height = 4, fig.cap = "Generalized leverage values. Observations exceeding 0.08 are labeled."}
plot(fit_loglog, which = 7, threshold = 0.08, label.pos = 3)
```

The quantile residuals are approximately standard normal, with no systematic
patterns across fitted values or the linear predictor. Cook's distances and
generalized leverages are uniformly small, with no evidence of unduly
influential observations.

## Local influence

Local influence analysis quantifies how small perturbations to the model
assumptions affect the parameter estimates. Two perturbation schemes are
supported: `"case.weight"` perturbs observation weights and `"response"`
perturbs the response values. The `parameter` argument selects the
parameter block of interest — `"theta"` (all parameters), `"beta"` (mean
submodel), or `"gamma"` (dispersion submodel). The `type` argument controls
the influence measure: `"Ci"` for the total local influence index $C_i$,
or `"dmax"` for the direction of maximum curvature. Observations exceeding
`threshold` are labeled in the index plot.

```{r local-influence-cw, fig.height = 4.5, fig.cap = "Total local influence $C_i$ under case-weight perturbation for all parameters."}
local.influence(
  fit_loglog,
  scheme    = "case.weight",
  parameter = "theta",
  type      = "Ci",
  plot      = TRUE,
  threshold = 0.5,
  label.pos = c(3, 4, 3, 2, 2)
)
```

```{r local-influence-resp, fig.height = 4.5, fig.cap = "Total local influence $C_i$ under response perturbation for all parameters."}
local.influence(
  fit_loglog,
  scheme    = "response",
  parameter = "theta",
  type      = "Ci",
  plot      = TRUE,
  threshold = 0.4,
  label.pos = 2
)
```

A small set of observations is identified as locally influential under both
perturbation schemes, but none produces an extreme influence value.

## Half-normal plot with simulated envelope

The `halfnormal.plot` function produces a half-normal plot of the absolute
residuals together with a simulated envelope based on `nsim` Monte Carlo
replications of the fitted model (default `nsim = 100`). Points outside the
envelope may indicate model inadequacy. The `type` argument selects the
residual type (default `"weighted"`) and `seed` ensures reproducibility.

```{r halfnormal, fig.height = 5, fig.cap = "Half-normal plot of absolute weighted residuals with 95% simulated envelope (100 replications)."}
halfnormal.plot(fit_loglog, nsim = 19, type = "weighted", seed = 2026)
```

All diagnostic plots reveal no evidence against the adequacy of the fitted
model. Quantile residuals are approximately standard normal, no observation
has disproportionate influence, and nearly all points in the half-normal
plot fall within the simulated envelope.

# Fitted vs. Observed: Time Series

```{r timeseries, fig.height = 4, fig.cap = "Observed (solid black) and fitted (dashed red) monthly relative humidity in Brasília, January 2000 to December 2025."}
plot(rh$Date, rh$RH,
     type = "l", col = "black", lwd = 1.2,
     xlab = "Date", ylab = "Relative humidity",
     main = "Observed vs Fitted RH — Brasília (2000–2025)")
lines(rh$Date, fitted(fit_loglog), col = "red", lwd = 1.5, lty = 2)
legend("bottomleft",
       legend = c("Observed", "Fitted"),
       col    = c("black", "red"),
       lty    = c(1, 2), lwd = c(1.2, 1.5),
       bty    = "n", cex = 0.85)
```

The simplex regression model with log-log link accurately captures the
strong seasonal pattern in relative humidity, with fitted values tracking
the observed series closely throughout the entire 26-year period.

# Summary

This vignette demonstrated the main functionalities of the
`SimplexRegression` package through the analysis of monthly relative
humidity data from Brasília. The selected simplex regression model uses the
log-log mean link with variable dispersion driven by precipitation. It
achieved excellent goodness of fit (pseudo-$R^2 > 0.95$) and strong
cross-validated predictive performance ($P^2 > 0.95$), with no evidence of
misspecification from the RESET test or departures from model assumptions
in the diagnostic plots.

The package provides a comprehensive and self-consistent workflow — from
data preparation and model fitting to inference, model selection,
diagnostics, and influence analysis — following the standard conventions
of formula-based regression modeling in R.

# References

- Barndorff-Nielsen, O. E. and Jørgensen, B. (1991). Some parametric models
  on the simplex. *Journal of Multivariate Analysis*, **39**(1), 106--116.
- Cribari-Neto, F., Vasconcellos, K. L. P. and Santana e Silva, J. J.
  (2025). New strategies for detecting atypical observations based on the
  information matrix equality. *Journal of Applied Statistics*, **52**,
  2873--2893.
- Espinheira, P. L. and Silva, A. O. (2020). Residual and influence
  analysis to a general class of simplex regression. *TEST*, **29**,
  523--552.
- Justino, M. E. C. and Cribari-Neto, F. (2026). Simplex regression with a
  flexible logit link: Inference and application to cross-country impunity
  data. *Applied Mathematical Modelling*, **154**, 116713.

```{r restore-options, include = FALSE}
options(old_opts)
```

# Session Info

```{r session}
sessionInfo()
```