---
title: "Introduction to simplexgof"
output: rmarkdown::html_vignette
vignette: >
  %\VignetteIndexEntry{Introduction to simplexgof}
  %\VignetteEngine{knitr::rmarkdown}
  %\VignetteEncoding{UTF-8}
---

```{r, include = FALSE}
knitr::opts_chunk$set(
  collapse = TRUE,
  comment = "#>",
  fig.width = 6,
  fig.height = 5
)
```

```{r setup}
library(simplexgof)
```

## Overview

**simplexgof** implements a bootstrap-calibrated local-influence
goodness-of-fit (GoF) test for simplex regression models with constant or
varying dispersion. The package provides:

- `simplex_fit()`: fit a simplex regression model via maximum likelihood,
  with logit link for the mean and log link for the dispersion.
- `simplex_diag()`: compute local-influence diagnostic quantities (the
  individual influence measures $C_{I_t}$ and the $T_n$ and $U_n$
  statistics that aggregate them).
- `simplex_gof()`: run the full parametric-bootstrap GoF test.
- Plotting functions to visualise influence diagnostics, half-normal
  envelopes, and the bootstrap distribution of $U_n$.

This vignette walks through a complete analysis using the `ammonia`
dataset bundled with the package.

## The data

The `ammonia` dataset (Brownlee, 1965) has 21 observations on the
proportion of ammonia lost during an industrial oxidation process,
together with three covariates.

```{r}
data(ammonia)
head(ammonia)
```

The response `perda` (Portuguese for "loss") is a proportion in $(0, 1)$,
which makes it a natural candidate for simplex regression.
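
As background, the simplex distribution with mean $\mu$ and dispersion
$\sigma^2$ has density
$f(y; \mu, \sigma^2) = [2\pi\sigma^2\{y(1-y)\}^3]^{-1/2}
\exp\{-d(y; \mu) / (2\sigma^2)\}$ on $(0, 1)$, where
$d(y; \mu) = (y-\mu)^2 / \{y(1-y)\mu^2(1-\mu)^2\}$ is the unit deviance.
The chunk below is a minimal base-R sketch of that density. The function
names `simplex_deviance_sketch()` and `dsimplex_sketch()` are hypothetical
helpers written for this illustration; they are not part of **simplexgof**,
and the parameter values are arbitrary.

```{r}
# Unit deviance of the simplex distribution (hypothetical helper)
simplex_deviance_sketch <- function(y, mu) {
  (y - mu)^2 / (y * (1 - y) * mu^2 * (1 - mu)^2)
}

# Simplex density on (0, 1) (hypothetical helper, not a package function)
dsimplex_sketch <- function(y, mu, sigma2) {
  exp(-simplex_deviance_sketch(y, mu) / (2 * sigma2)) /
    sqrt(2 * pi * sigma2 * (y * (1 - y))^3)
}

# Numerical sanity checks with arbitrary parameter values:
# the density should integrate to one and have mean mu
total_mass <- integrate(dsimplex_sketch, 0, 1, mu = 0.3, sigma2 = 1)$value
mean_value <- integrate(function(y) y * dsimplex_sketch(y, 0.3, 1), 0, 1)$value
c(total_mass = total_mass, mean_value = mean_value)
```

The numerical integrals are only a check that the formula is coded
consistently; they say nothing about how the package evaluates the
likelihood internally.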

## Fitting a simplex regression model

We model the mean $\mu_t$ with the covariates `corr_ar` (air flow),
`temp_agua` (water temperature), and their interaction, and let the
dispersion $\sigma^2_t$ depend on `temp_agua` and the same interaction term.
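
Written out with the logit and log links, and writing $x_{t1}$ and
$x_{t2}$ for the values of `corr_ar` and `temp_agua` on observation $t$,
the two submodels are as follows. The coefficient labels $\beta_j$ and
$\gamma_j$ are chosen here for illustration and may differ from the
notation used in the package documentation.

$$
\begin{aligned}
\log\frac{\mu_t}{1 - \mu_t} &= \beta_0 + \beta_1 x_{t1} + \beta_2 x_{t2}
  + \beta_3 x_{t1} x_{t2}, \\
\log \sigma^2_t &= \gamma_0 + \gamma_1 x_{t2} + \gamma_2 x_{t1} x_{t2}.
\end{aligned}
$$

The four terms of the mean submodel and the three terms of the dispersion
submodel correspond, in order, to the columns of the design matrices `X`
and `Z` built next.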

```{r}
X <- cbind(1, ammonia$corr_ar, ammonia$temp_agua,
           ammonia$corr_ar * ammonia$temp_agua)
Z <- cbind(1, ammonia$temp_agua,
           ammonia$corr_ar * ammonia$temp_agua)

fit <- simplex_fit(ammonia$perda, X, Z)
fit
```
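
The same design matrices can be written with formulas through
`stats::model.matrix()`, which avoids assembling the interaction column by
hand. The sketch below uses a small made-up data frame, `toy`, that only
borrows the column names of `ammonia`, and checks that the formula-based
matrices hold the same numbers as the `cbind()` construction. Whether
`simplex_fit()` accepts matrices that carry the extra attributes added by
`model.matrix()` is an assumption that has not been checked here.

```{r}
# Toy stand-in with the same column names as `ammonia` (values are made up)
toy <- data.frame(
  corr_ar   = c(80, 75, 62, 58, 50),
  temp_agua = c(27, 25, 24, 22, 18)
)

# Mean submodel: intercept, both main effects, and their interaction
X_mm <- model.matrix(~ corr_ar * temp_agua, data = toy)

# Dispersion submodel: intercept, temp_agua, and the interaction only
Z_mm <- model.matrix(~ temp_agua + corr_ar:temp_agua, data = toy)

# Manual construction, as in the chunk above
X_cb <- cbind(1, toy$corr_ar, toy$temp_agua, toy$corr_ar * toy$temp_agua)
Z_cb <- cbind(1, toy$temp_agua, toy$corr_ar * toy$temp_agua)

# Compare the numeric content, ignoring dimnames and other attributes
all.equal(unname(X_mm), X_cb, check.attributes = FALSE)
all.equal(unname(Z_mm), Z_cb, check.attributes = FALSE)
```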

The fitted object has class `"simplexfit"` and comes with `print`, `coef`,
and `fitted` methods. For example, `coef()` returns the estimated
coefficients:

```{r}
coef(fit)
```

## Influence diagnostics

`simplex_diag()` computes the case-weight local-influence measures
$C_{I_t}$ and the test statistics $T_n$ and $U_n$ that aggregate them.

```{r}
dg <- simplex_diag(fit)
dg$Tn
dg$Un
```

These quantities can be visualised with `plot_influence()`, which produces
an index plot of the individual influence values $C_{I_t}$:

```{r, fig.alt = "Influence index plot for the ammonia model"}
plot_influence(dg)
```

## The bootstrap goodness-of-fit test

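Because the first-order asymptotic normal calibration of $U_n$ is known
to be liberal in small samples, `simplex_gof()` provides a parametric
bootstrap calibration. The call below uses `B = 50` replicates to keep the
vignette fast and sets the nominal level to `alpha = 0.01`. In practice,
use a larger `B`, e.g. 1000: a bootstrap p-value computed from `B`
replicates can only move in steps of roughly `1/B`, so 50 replicates are
too few to resolve a 1% level reliably.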

```{r}
set.seed(42)
gof <- simplex_gof(ammonia$perda, X, Z, B = 50, alpha = 0.01,
                    verbose = FALSE)
gof
```
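
To make the role of `B` concrete, the sketch below computes a Monte Carlo
p-value from a vector of bootstrap replicates using the common
$(1 + \#\{U^*_b \ge U_n\}) / (B + 1)$ convention. It assumes that large
values of the statistic indicate lack of fit. The helper
`boot_pvalue_sketch()` and the replicate values are hypothetical, and the
formula is not necessarily the one used inside `simplex_gof()`.

```{r}
# Hypothetical helper: Monte Carlo p-value from bootstrap replicates,
# assuming large values of the statistic indicate lack of fit
boot_pvalue_sketch <- function(stat_obs, stat_boot) {
  (1 + sum(stat_boot >= stat_obs)) / (length(stat_boot) + 1)
}

# Made-up replicate values, only to exercise the helper
stat_boot <- c(0.2, 0.5, 0.9, 1.4, 2.1, 2.8, 3.3, 4.0, 5.2)
boot_pvalue_sketch(3.0, stat_boot)

# Smallest p-value this convention can return for a given B
min_p_50   <- 1 / (50 + 1)
min_p_1000 <- 1 / (1000 + 1)
c(B50 = min_p_50, B1000 = min_p_1000)
```

Under this convention the smallest attainable p-value with 50 replicates
is above 0.01, which is one reason to raise `B` before testing at the 1%
level.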

The bootstrap distribution of $U_n$ under $H_0$ can be visualised with
`plot_gof_boot()`:

```{r, fig.alt = "Bootstrap distribution of Un for the ammonia model"}
plot_gof_boot(gof)
```

## Half-normal plot with simulated envelope

`plot_envelope()` produces a half-normal plot of the influence measures
with a simulated envelope, useful for spotting individual observations
that drive the lack of fit:

```{r, eval = FALSE}
plot_envelope(fit, B = 99)
```
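
The general recipe behind a simulated envelope can be sketched in base R.
The example below is not the implementation in `plot_envelope()`: it uses
a toy normal linear model and standardised residuals in place of a simplex
fit and its influence measures, and the names `toy_fit`, `sim`, `lower`,
`upper` and `med` are hypothetical. The steps are: simulate responses from
the fitted model, refit, sort the absolute diagnostic values, and take
pointwise bounds across simulations.

```{r, fig.alt = "Half-normal plot with simulated envelope for a toy linear model"}
# Toy data from a normal linear model (illustrative only)
set.seed(1)
n <- 30
x <- runif(n)
y <- 1 + 2 * x + rnorm(n, sd = 0.5)
toy_fit <- lm(y ~ x)

# Observed ordered absolute standardised residuals
obs <- sort(abs(rstandard(toy_fit)))

# Simulate B responses from the fitted model, refit, and keep the
# ordered absolute residuals of each refit (an n x B matrix)
B <- 99
sim <- replicate(B, {
  y_star <- fitted(toy_fit) + rnorm(n, sd = sigma(toy_fit))
  sort(abs(rstandard(lm(y_star ~ x))))
})

# Pointwise envelope and median across the simulations
lower <- apply(sim, 1, min)
upper <- apply(sim, 1, max)
med   <- apply(sim, 1, median)

# Half-normal plotting positions
q <- qnorm((seq_len(n) + n - 1 / 8) / (2 * n + 1 / 2))

plot(q, obs, ylim = range(obs, lower, upper),
     xlab = "Half-normal quantiles",
     ylab = "Ordered absolute standardised residuals")
lines(q, lower, lty = 2)
lines(q, upper, lty = 2)
lines(q, med, lty = 3)
```

Points falling outside the dashed band are the ones worth a closer look;
with a well-specified model most points should stay inside it.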

## Convenience `plot` methods

Both `"simplexfit"` and `"simplexgof"` objects have `plot()` methods that
wrap the functions above:

```{r, eval = FALSE}
plot(fit, which = "influence")
plot(gof, which = "boot")
```

## Next steps

For full reproductions of the figures and tables in the companion
methodological paper (Ospina, Espinheira, Silva and Barros, 2026), see
the *"Paper: ammonia application"* and *"Paper: PBSC application"*
articles.