Hierarchical Bayesian models

library(serosv)

Parametric Bayesian framework

Currently, serosv only provides models under the parametric Bayesian framework

Proposed approach

Prevalence has a parametric form π(a_i, α) where α is a parameter vector

One can constrain the parameter space of the prior distribution P(α) in order to achieve the desired monotonicity of the posterior distribution P(π₁, π₂, ..., π_m|y, n)

Where:

n = (n₁, n₂, ..., n_m) and n_i is the sample size at age a_i

y = (y₁, y₂, ..., y_m) and y_i is the number of infected individuals among the n_i sampled subjects

Farrington

Refer to Chapter 10.3.1

Proposed model

The model for prevalence is as follows

$$ \pi(a) = 1 - \exp\left\{ \frac{\alpha_1}{\alpha_2}ae^{-\alpha_2 a} + \frac{1}{\alpha_2}\left(\frac{\alpha_1}{\alpha_2} - \alpha_3\right)\left(e^{-\alpha_2 a} - 1\right) -\alpha_3 a \right\} $$
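As a quick numerical illustration, the prevalence curve can be written as a small base R function. This is only a sketch: `farrington_prevalence` is a hypothetical helper, not a serosv function, and the parameter values are chosen for illustration rather than estimated from data.

```r
# Hypothetical helper (not part of serosv): Farrington prevalence pi(a)
farrington_prevalence <- function(a, alpha1, alpha2, alpha3 = 0) {
  exponent <- (alpha1 / alpha2) * a * exp(-alpha2 * a) +
    (1 / alpha2) * (alpha1 / alpha2 - alpha3) * (exp(-alpha2 * a) - 1) -
    alpha3 * a
  1 - exp(exponent)
}

# Illustrative parameter values, not estimates from any data set
ages <- c(0, 1, 5, 10, 20, 40)
prev <- farrington_prevalence(ages, alpha1 = 0.14, alpha2 = 0.2, alpha3 = 0.008)
print(round(prev, 3))
```

With positive parameters such as these, the curve starts at 0 for age 0 and increases with age, which is the monotonicity the constrained priors are meant to preserve.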

For the likelihood model, independent binomial distributions are assumed for the number of infected individuals at age a_i

y_i ∼ Bin(n_i, π_i), for i = 1, 2, ..., m
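The binomial likelihood can be evaluated directly with `dbinom()`. The sketch below uses invented toy counts (the `age`, `pos` and `tot` vectors are assumptions, not one of the serosv data sets) and a hypothetical helper `binomial_loglik`.

```r
# Toy data invented for this example:
# tot[i] subjects sampled at age[i], of which pos[i] are seropositive
age <- c(2, 5, 10, 20)
pos <- c(3, 12, 30, 45)
tot <- c(40, 50, 50, 50)

# Hypothetical helper: log-likelihood of independent binomials
# for a vector of age-specific prevalences
binomial_loglik <- function(pos, tot, prevalence) {
  sum(dbinom(pos, size = tot, prob = prevalence, log = TRUE))
}

# Compare the observed proportions with a constant prevalence of 0.5
loglik_observed <- binomial_loglik(pos, tot, pos / tot)
loglik_constant <- binomial_loglik(pos, tot, rep(0.5, length(age)))
print(c(observed = loglik_observed, constant = loglik_constant))
```

The observed proportions maximise each binomial term, so their log-likelihood is an upper bound for any parametric prevalence curve evaluated on the same data.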

The constraint on the parameter space can be incorporated by assuming truncated normal distributions for the components of α = (α₁, α₂, α₃) in π_i = π(a_i, α)

α_j ∼ truncated 𝒩(μ_j, τ_j), j = 1, 2, 3
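To make the truncated normal prior concrete, the sketch below draws from a normal distribution restricted to an interval using the inverse-CDF method in base R. It assumes truncation to the positive half-line and treats the second argument as a standard deviation; both are assumptions made for illustration, and `rtruncnorm_sketch` is a hypothetical helper rather than a serosv function.

```r
# Hypothetical helper: draw from a normal distribution with mean mu and
# standard deviation sd, truncated to (lower, upper), via the inverse CDF
rtruncnorm_sketch <- function(n, mu, sd, lower = 0, upper = Inf) {
  p_lower <- pnorm(lower, mean = mu, sd = sd)
  p_upper <- pnorm(upper, mean = mu, sd = sd)
  u <- runif(n, min = p_lower, max = p_upper)
  qnorm(u, mean = mu, sd = sd)
}

set.seed(1)
draws <- rtruncnorm_sketch(1000, mu = 0.1, sd = 0.5)
print(range(draws))
```

Restricting the prior in this way keeps every draw inside the truncation interval, which is how the constraint on the parameter space carries over to the posterior.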

The joint posterior distribution for α can be derived by combining the likelihood and prior as follows

$$ P(\alpha|y) \propto \prod^m_{i=1} \text{Bin}(y_i|n_i, \pi(a_i, \alpha)) \prod^3_{j=1}\frac{1}{\tau_j}\exp\left(-\frac{1}{2\tau^2_j} (\alpha_j - \mu_j)^2\right) $$

Where the flat hyperprior distributions are defined as follows:
- μ_j ∼ 𝒩(0, 10000)
- τ_j⁻² ∼ Γ(100, 100)
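With μ_j and τ_j held fixed, the posterior of α can be evaluated up to a constant as the binomial log-likelihood plus the log of the normal prior kernel. The following base R sketch does this for the three-parameter model. It assumes truncation to α_j > 0 and uses invented toy data and hyperparameter values; both helper functions are hypothetical and not part of serosv, which fits the full hierarchical model by MCMC.

```r
# Hypothetical helpers for illustration; none of these are serosv functions
farrington_prevalence <- function(a, alpha) {
  exponent <- (alpha[1] / alpha[2]) * a * exp(-alpha[2] * a) +
    (1 / alpha[2]) * (alpha[1] / alpha[2] - alpha[3]) * (exp(-alpha[2] * a) - 1) -
    alpha[3] * a
  1 - exp(exponent)
}

log_posterior_alpha <- function(alpha, age, pos, tot, mu, tau) {
  # Assumed truncation: every component of alpha must be positive
  if (any(alpha <= 0)) return(-Inf)
  prevalence <- farrington_prevalence(age, alpha)
  # Guard against prevalences outside (0, 1)
  if (any(prevalence <= 0 | prevalence >= 1)) return(-Inf)
  log_lik <- sum(dbinom(pos, size = tot, prob = prevalence, log = TRUE))
  # Log of (1 / tau_j) * exp(-(alpha_j - mu_j)^2 / (2 * tau_j^2)), summed over j
  log_prior <- sum(-log(tau) - (alpha - mu)^2 / (2 * tau^2))
  log_lik + log_prior
}

# Toy data and hyperparameter values, invented for this example
age <- c(2, 5, 10, 20)
pos <- c(3, 12, 30, 45)
tot <- c(40, 50, 50, 50)
mu  <- c(0, 0, 0)
tau <- c(1, 1, 1)

lp_inside  <- log_posterior_alpha(c(0.14, 0.2, 0.008), age, pos, tot, mu, tau)
lp_outside <- log_posterior_alpha(c(0.14, -0.2, 0.008), age, pos, tot, mu, tau)
print(c(inside = lp_inside, outside = lp_outside))
```

A parameter vector outside the truncated region receives a log posterior of `-Inf`, so a sampler would never accept it.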

The full conditional distribution of α_j is thus

$$ P(\alpha_j|\alpha_k,\alpha_l, k, l \neq j) \propto \frac{1}{\tau_j}\exp\left(-\frac{1}{2\tau^2_j} (\alpha_j - \mu_j)^2\right) \prod^m_{i=1} \text{Bin}(y_i|n_i, \pi(a_i, \alpha)) $$

Fitting data

To fit the Farrington model, use hierarchical_bayesian_model() and set type = "far2" or type = "far3", where

- type = "far2" refers to the Farrington model with 2 parameters (α₃ = 0)
- type = "far3" refers to the Farrington model with 3 parameters (α₃ > 0)

df <- mumps_uk_1986_1987
model <- hierarchical_bayesian_model(age = df$age, pos = df$pos, tot = df$tot, type = "far3")
#> 
#> SAMPLING FOR MODEL 'fra_3' NOW (CHAIN 1).
#> Chain 1: 
#> Chain 1: Gradient evaluation took 4e-05 seconds
#> Chain 1: 1000 transitions using 10 leapfrog steps per transition would take 0.4 seconds.
#> Chain 1: Adjust your expectations accordingly!
#> Chain 1: 
#> Chain 1: 
#> Chain 1: Iteration:    1 / 5000 [  0%]  (Warmup)
#> Chain 1: Iteration:  500 / 5000 [ 10%]  (Warmup)
#> Chain 1: Iteration: 1000 / 5000 [ 20%]  (Warmup)
#> Chain 1: Iteration: 1500 / 5000 [ 30%]  (Warmup)
#> Chain 1: Iteration: 1501 / 5000 [ 30%]  (Sampling)
#> Chain 1: Iteration: 2000 / 5000 [ 40%]  (Sampling)
#> Chain 1: Iteration: 2500 / 5000 [ 50%]  (Sampling)
#> Chain 1: Iteration: 3000 / 5000 [ 60%]  (Sampling)
#> Chain 1: Iteration: 3500 / 5000 [ 70%]  (Sampling)
#> Chain 1: Iteration: 4000 / 5000 [ 80%]  (Sampling)
#> Chain 1: Iteration: 4500 / 5000 [ 90%]  (Sampling)
#> Chain 1: Iteration: 5000 / 5000 [100%]  (Sampling)
#> Chain 1: 
#> Chain 1:  Elapsed Time: 6.739 seconds (Warm-up)
#> Chain 1:                3.792 seconds (Sampling)
#> Chain 1:                10.531 seconds (Total)
#> Chain 1:
#> Warning: There were 871 divergent transitions after warmup. See
#> https://mc-stan.org/misc/warnings.html#divergent-transitions-after-warmup
#> to find out why this is a problem and how to eliminate them.
#> Warning: Examine the pairs() plot to diagnose sampling problems
#> Warning: The largest R-hat is 1.05, indicating chains have not mixed.
#> Running the chains for more iterations may help. See
#> https://mc-stan.org/misc/warnings.html#r-hat
#> Warning: Bulk Effective Samples Size (ESS) is too low, indicating posterior means and medians may be unreliable.
#> Running the chains for more iterations may help. See
#> https://mc-stan.org/misc/warnings.html#bulk-ess
#> Warning: Tail Effective Samples Size (ESS) is too low, indicating posterior variances and tail quantiles may be unreliable.
#> Running the chains for more iterations may help. See
#> https://mc-stan.org/misc/warnings.html#tail-ess

model$info
#>                       mean      se_mean           sd          2.5%
#> alpha1        1.396931e-01 8.463026e-04 6.041910e-03  1.277919e-01
#> alpha2        1.985614e-01 1.126946e-03 8.098993e-03  1.830444e-01
#> alpha3        8.465645e-03 3.819195e-04 6.183691e-03  4.160092e-04
#> tau_alpha1    2.709383e-01 5.219661e-02 4.810963e-01  7.351768e-06
#> tau_alpha2    5.749247e-01 4.099933e-01 1.545243e+00  4.451090e-06
#> tau_alpha3    1.660061e-01 6.276310e-02 4.367883e-01  4.039611e-06
#> mu_alpha1     3.308494e+00 1.696001e+00 3.129011e+01 -6.059877e+01
#> mu_alpha2     1.064375e+00 2.814756e+00 4.288312e+01 -1.092571e+02
#> mu_alpha3     2.521953e+00 1.794159e+00 3.830443e+01 -7.918874e+01
#> sigma_alpha1  5.584206e+01 1.056450e+01 2.350729e+02  7.138984e-01
#> sigma_alpha2  7.035767e+01 1.099709e+01 2.666187e+02  3.994285e-01
#> sigma_alpha3  1.260856e+02 5.903215e+01 1.962449e+03  8.035415e-01
#> lp__         -2.534862e+03 7.042354e-01 4.294862e+00 -2.543464e+03
#>                        25%           50%           75%         97.5%      n_eff
#> alpha1        1.352481e-01  1.397090e-01  1.443264e-01  1.506097e-01   50.96796
#> alpha2        1.928777e-01  1.982516e-01  2.041228e-01  2.147369e-01   51.64831
#> alpha3        3.130684e-03  7.980146e-03  1.202945e-02  2.284824e-02  262.15112
#> tau_alpha1    1.025602e-03  2.110170e-02  4.215330e-01  1.962128e+00   84.95313
#> tau_alpha2    4.377582e-04  1.342041e-02  2.342996e-01  6.267898e+00   14.20496
#> tau_alpha3    4.391953e-04  8.788893e-03  5.557513e-02  1.548758e+00   48.43210
#> mu_alpha1    -2.924724e+00  4.079600e-01  4.689481e+00  9.259387e+01  340.37848
#> mu_alpha2    -5.549039e+00  4.503123e-02  4.234916e+00  1.084755e+02  232.10867
#> mu_alpha3    -4.318451e+00  3.714941e-01  6.760525e+00  9.793322e+01  455.80174
#> sigma_alpha1  1.540225e+00  6.884007e+00  3.122560e+01  3.688112e+02  495.11582
#> sigma_alpha2  2.065925e+00  8.632222e+00  4.779504e+01  4.744050e+02  587.79456
#> sigma_alpha3  4.241894e+00  1.066677e+01  4.771694e+01  4.975570e+02 1105.14576
#> lp__         -2.537890e+03 -2.534636e+03 -2.531969e+03 -2.527364e+03   37.19312
#>                   Rhat
#> alpha1       1.0459334
#> alpha2       1.0380598
#> alpha3       1.0081062
#> tau_alpha1   1.0539045
#> tau_alpha2   1.0840459
#> tau_alpha3   1.0099767
#> mu_alpha1    0.9997791
#> mu_alpha2    1.0005006
#> mu_alpha3    1.0004760
#> sigma_alpha1 1.0002080
#> sigma_alpha2 0.9997154
#> sigma_alpha3 1.0004110
#> lp__         1.0188335
plot(model)
#> Warning: No shared levels found between `names(values)` of the manual scale and the
#> data's fill values.

Log-logistic

Proposed approach

The model for seroprevalence is as follows

$$ \pi(a) = \frac{\beta a^\alpha}{1 + \beta a^\alpha}, \text{ } \alpha, \beta > 0 $$

The likelihood is specified in the same way as for the Farrington model (y_i ∼ Bin(n_i, π_i)), with

logit(π(a)) = α₂ + α₁ log(a)

Where α₁ = α and α₂ = log(β)
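The equivalence between the two parameterisations can be checked numerically. The sketch below uses base R only; `loglogistic_prevalence` is a hypothetical helper, and the values of α and β are illustrative rather than fitted.

```r
# Hypothetical helper: log-logistic prevalence in its original form
loglogistic_prevalence <- function(a, alpha, beta) {
  beta * a^alpha / (1 + beta * a^alpha)
}

# Illustrative values, not estimates from any data set
ages  <- c(1, 5, 10, 20, 40)
alpha <- 1.5
beta  <- 0.02

# Reparameterisation used on the logit scale
alpha1 <- alpha
alpha2 <- log(beta)

direct    <- loglogistic_prevalence(ages, alpha, beta)
via_logit <- plogis(alpha2 + alpha1 * log(ages))  # inverse logit
print(all.equal(direct, via_logit))
```

Because β enters only through α₂ = log(β), any real value of α₂ corresponds to a positive β.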

The prior for α₁ is specified as α₁ ∼ truncated 𝒩(μ₁, τ₁), with flat hyperpriors as in the Farrington model

β is constrained to be positive by specifying α₂ ∼ 𝒩(μ₂, τ₂), since β = exp(α₂) > 0 for any real α₂

The full conditional distribution of α₁ is thus

$$ P(\alpha_1|\alpha_2) \propto \frac{1}{\tau_1} \exp\left(-\frac{1}{2 \tau_1^2} (\alpha_1 - \mu_1)^2\right) \prod_{i=1}^m \text{Bin}(y_i|n_i,\pi(a_i, \alpha_1, \alpha_2) ) $$

The full conditional distribution of α₂ can be derived in the same way

Fitting data

To fit the log-logistic model, use hierarchical_bayesian_model() and set type = "log_logistic"

df <- rubella_uk_1986_1987
model <- hierarchical_bayesian_model(age = df$age, pos = df$pos, tot = df$tot, type = "log_logistic")
#> 
#> SAMPLING FOR MODEL 'log_logistic' NOW (CHAIN 1).
#> Chain 1: 
#> Chain 1: Gradient evaluation took 1.1e-05 seconds
#> Chain 1: 1000 transitions using 10 leapfrog steps per transition would take 0.11 seconds.
#> Chain 1: Adjust your expectations accordingly!
#> Chain 1: 
#> Chain 1: 
#> Chain 1: Iteration:    1 / 5000 [  0%]  (Warmup)
#> Chain 1: Iteration:  500 / 5000 [ 10%]  (Warmup)
#> Chain 1: Iteration: 1000 / 5000 [ 20%]  (Warmup)
#> Chain 1: Iteration: 1500 / 5000 [ 30%]  (Warmup)
#> Chain 1: Iteration: 1501 / 5000 [ 30%]  (Sampling)
#> Chain 1: Iteration: 2000 / 5000 [ 40%]  (Sampling)
#> Chain 1: Iteration: 2500 / 5000 [ 50%]  (Sampling)
#> Chain 1: Iteration: 3000 / 5000 [ 60%]  (Sampling)
#> Chain 1: Iteration: 3500 / 5000 [ 70%]  (Sampling)
#> Chain 1: Iteration: 4000 / 5000 [ 80%]  (Sampling)
#> Chain 1: Iteration: 4500 / 5000 [ 90%]  (Sampling)
#> Chain 1: Iteration: 5000 / 5000 [100%]  (Sampling)
#> Chain 1: 
#> Chain 1:  Elapsed Time: 0.754 seconds (Warm-up)
#> Chain 1:                1.878 seconds (Sampling)
#> Chain 1:                2.632 seconds (Total)
#> Chain 1:
#> Warning: There were 482 divergent transitions after warmup. See
#> https://mc-stan.org/misc/warnings.html#divergent-transitions-after-warmup
#> to find out why this is a problem and how to eliminate them.
#> Warning: Examine the pairs() plot to diagnose sampling problems
#> Warning: Bulk Effective Samples Size (ESS) is too low, indicating posterior means and medians may be unreliable.
#> Running the chains for more iterations may help. See
#> https://mc-stan.org/misc/warnings.html#bulk-ess
#> Warning: Tail Effective Samples Size (ESS) is too low, indicating posterior variances and tail quantiles may be unreliable.
#> Running the chains for more iterations may help. See
#> https://mc-stan.org/misc/warnings.html#tail-ess

model$type
#> [1] "log_logistic"
plot(model)
#> Warning: No shared levels found between `names(values)` of the manual scale and the
#> data's fill values.
