EcoDiet explored on a realistic example

The introduction employed a simplistic expemple of food web to familiarize the user with the basic commands and options of the EcoDiet package. Here we will use a more realistic example (although still artificial!) to run the different EcoDiet configurations, compare their results and hence higlight the complementarity in the different data used.

The data corresponds to 10 trophic groups with stomach content data, and very distinct isotopic measures.

realistic_stomach_data_path <- system.file("extdata", "realistic_stomach_data.csv",
                                           package = "EcoDiet")
realistic_stomach_data <- read.csv(realistic_stomach_data_path)
knitr::kable(realistic_stomach_data)
X Cod Pout Sardine Shrimps Crabs Bivalves Worms Zooplankton Phytoplankton Detritus
Cod 0 0 0 0 0 0 0 0 0 0
Pout 1 0 0 0 0 0 0 0 0 0
Sardine 9 0 0 0 0 0 0 0 0 0
Shrimps 4 4 29 0 24 0 0 0 0 0
Crabs 1 24 0 0 0 0 0 0 0 0
Bivalves 0 3 0 0 11 0 0 0 0 0
Worms 16 30 0 1 24 0 0 0 0 0
Zooplankton 0 27 6 3 0 0 0 0 0 0
Phytoplankton 0 0 14 10 0 16 0 20 0 0
Detritus 0 0 0 12 19 18 18 0 0 0
full 21 30 29 19 29 27 18 20 0 0
realistic_biotracer_data_path <- system.file("extdata", "realistic_biotracer_data.csv",
                                           package = "EcoDiet")
realistic_biotracer_data <- read.csv(realistic_biotracer_data_path)
knitr::kable(realistic_biotracer_data[c(1:3, 31:33, 61:63), ])
group d13C d15N
1 Cod -12.94144 19.18913
2 Cod -14.96070 20.23939
3 Cod -13.77822 19.48809
31 Pout -13.47127 18.57353
32 Pout -13.16888 17.58714
33 Pout -14.23085 17.38938
61 Sardine -14.56111 16.95231
62 Sardine -15.04729 17.15197
63 Sardine -14.63688 16.90906
library(EcoDiet)

plot_data(biotracer_data = realistic_biotracer_data,
          stomach_data = realistic_stomach_data)

#> Warning: Use of `biotracer_data$group` is discouraged.
#> ℹ Use `group` instead.

Yes, we are aware that isotopic data is usually messier, but isn’t it a beautiful plot?

The configuration without literature data

We define the configuration we are in, and preprocess the data:

literature_configuration <- FALSE

data <- preprocess_data(biotracer_data = realistic_biotracer_data,
                        trophic_discrimination_factor = c(0.8, 3.4),
                        literature_configuration = literature_configuration,
                        stomach_data = realistic_stomach_data)
#> The model will investigate the following trophic links:
#>               Bivalves Cod Crabs Detritus Phytoplankton Pout Sardine Shrimps
#> Bivalves             0   0     1        0             0    1       0       0
#> Cod                  0   0     0        0             0    0       0       0
#> Crabs                0   1     0        0             0    1       0       0
#> Detritus             1   0     1        0             0    0       0       1
#> Phytoplankton        1   0     0        0             0    0       1       1
#> Pout                 0   1     0        0             0    0       0       0
#> Sardine              0   1     0        0             0    0       0       0
#> Shrimps              0   1     1        0             0    1       1       0
#> Worms                0   1     1        0             0    1       0       1
#> Zooplankton          0   0     0        0             0    1       1       1
#>               Worms Zooplankton
#> Bivalves          0           0
#> Cod               0           0
#> Crabs             0           0
#> Detritus          1           0
#> Phytoplankton     0           1
#> Pout              0           0
#> Sardine           0           0
#> Shrimps           0           0
#> Worms             0           0
#> Zooplankton       0           0

In this configuration, priors are set for each trophic link identified as plausible by the user but the priors are not informed by literature data, and are thus uninformative:

plot_prior(data, literature_configuration)

The marginal prior distributions have different shape depending on the variables:

  • it is flat or uniform for η, the probabilities that a trophic link exists (all the probabilities of existence are thus equiprobable),

  • the marginal distributions for each diet proportion Π are peaking at zero, although the joint distribution for Πs is a flat Dirichlet prior, because all the diet proportions must sum to one.

plot_prior(data, literature_configuration, pred = "Pout")

We define the model, and test if it compiles well with a few iterations and adaptation steps:

filename <- "mymodel.txt"
write_model(file.name = filename, literature_configuration = literature_configuration, print.model = F)
mcmc_output <- run_model(filename, data, run_param="test")
#> 
#> Processing function input....... 
#> 
#> Done. 
#>  
#> Compiling model graph
#>    Resolving undeclared variables
#>    Allocating nodes
#> Graph information:
#>    Observed stochastic nodes: 316
#>    Unobserved stochastic nodes: 125
#>    Total graph size: 1104
#> 
#> Initializing model
#> 
#> Adaptive phase, 500 iterations x 3 chains 
#> If no progress bar appears JAGS has decided not to adapt 
#>  
#> 
#>  Burn-in phase, 500 iterations x 3 chains 
#>  
#> 
#> Sampling from joint posterior, 500 iterations x 3 chains 
#>  
#> 
#> Calculating statistics....... 
#> 
#> Done.
#> 
#>   /!\ Convergence warning:
#> Out of the 51 variables, 14 variables have a Gelman-Rubin statistic > 1.1.
#> You may consider modifying the model run settings.
#> The variables with the poorest convergence are: PI[4,3], PI[8,2], PI[9,3], PI[8,6], PI[9,8], PI[10,7], PI[6,2], PI[10,8], PI[8,7], PI[1,6].
#> JAGS output for model 'mymodel.txt', generated by jagsUI.
#> Estimates based on 3 chains of 1000 iterations,
#> adaptation = 500 iterations (sufficient),
#> burn-in = 500 iterations and thin rate = 1,
#> yielding 1500 total samples from the joint posterior. 
#> MCMC ran for 0.194 minutes at time 2025-02-19 06:44:08.931851.
#> 
#>              mean     sd    2.5%     50%   97.5% overlap0 f  Rhat n.eff
#> eta[4,1]    0.656  0.087   0.483   0.659   0.812    FALSE 1 1.001  1500
#> eta[5,1]    0.587  0.089   0.411   0.586   0.756    FALSE 1 1.001  1500
#> eta[3,2]    0.093  0.060   0.013   0.081   0.237    FALSE 1 1.015   192
#> eta[6,2]    0.087  0.059   0.011   0.074   0.236    FALSE 1 1.001  1500
#> eta[7,2]    0.435  0.103   0.238   0.437   0.641    FALSE 1 1.013   153
#> eta[8,2]    0.213  0.084   0.078   0.203   0.406    FALSE 1 1.001  1500
#> eta[9,2]    0.731  0.088   0.546   0.738   0.884    FALSE 1 1.014   152
#> eta[1,3]    0.390  0.089   0.226   0.388   0.573    FALSE 1 1.005   346
#> eta[4,3]    0.649  0.084   0.474   0.651   0.805    FALSE 1 1.002   773
#> eta[8,3]    0.807  0.069   0.656   0.811   0.923    FALSE 1 1.000  1500
#> eta[9,3]    0.803  0.071   0.641   0.809   0.919    FALSE 1 1.019   108
#> eta[1,6]    0.127  0.059   0.038   0.117   0.267    FALSE 1 1.003  1356
#> eta[3,6]    0.784  0.073   0.621   0.790   0.904    FALSE 1 1.007   348
#> eta[8,6]    0.158  0.063   0.057   0.153   0.300    FALSE 1 1.007   261
#> eta[9,6]    0.970  0.029   0.884   0.979   0.999    FALSE 1 1.003  1500
#> eta[10,6]   0.878  0.056   0.747   0.885   0.963    FALSE 1 1.000  1500
#> eta[5,7]    0.486  0.089   0.313   0.487   0.656    FALSE 1 1.000  1500
#> eta[8,7]    0.968  0.031   0.885   0.977   0.999    FALSE 1 1.009   737
#> eta[10,7]   0.232  0.075   0.105   0.226   0.402    FALSE 1 1.006   343
#> eta[4,8]    0.617  0.105   0.399   0.619   0.807    FALSE 1 1.008   228
#> eta[5,8]    0.533  0.105   0.335   0.532   0.747    FALSE 1 1.007   271
#> eta[9,8]    0.093  0.062   0.013   0.080   0.251    FALSE 1 1.005   742
#> eta[10,8]   0.210  0.087   0.064   0.203   0.397    FALSE 1 1.002   784
#> eta[4,9]    0.949  0.051   0.815   0.964   0.998    FALSE 1 1.003  1500
#> eta[5,10]   0.957  0.041   0.848   0.970   0.999    FALSE 1 1.000  1500
#> PI[4,1]     0.449  0.336   0.000   0.387   1.000    FALSE 1 1.116    26
#> PI[5,1]     0.551  0.336   0.000   0.613   1.000    FALSE 1 1.116    26
#> PI[3,2]     0.235  0.268   0.000   0.121   0.906    FALSE 1 1.067    57
#> PI[6,2]     0.126  0.196   0.000   0.016   0.703    FALSE 1 1.234    14
#> PI[7,2]     0.372  0.306   0.000   0.321   0.995    FALSE 1 1.148    19
#> PI[8,2]     0.103  0.189   0.000   0.001   0.707    FALSE 1 1.706     7
#> PI[9,2]     0.165  0.143   0.000   0.137   0.517    FALSE 1 1.053    43
#> PI[1,3]     0.133  0.194   0.000   0.015   0.640    FALSE 1 1.085    33
#> PI[4,3]     0.303  0.219   0.000   0.318   0.692    FALSE 1 1.744     6
#> PI[8,3]     0.282  0.186   0.000   0.278   0.648    FALSE 1 1.011   194
#> PI[9,3]     0.282  0.294   0.000   0.186   0.912    FALSE 1 1.563     7
#> PI[1,6]     0.053  0.114   0.000   0.001   0.414    FALSE 1 1.154    53
#> PI[3,6]     0.306  0.191   0.000   0.296   0.687    FALSE 1 1.044    53
#> PI[8,6]     0.095  0.157   0.000   0.004   0.522    FALSE 1 1.409    10
#> PI[9,6]     0.283  0.194   0.015   0.257   0.738    FALSE 1 1.009   290
#> PI[10,6]    0.263  0.186   0.005   0.238   0.674    FALSE 1 1.039    70
#> PI[5,7]     0.197  0.178   0.000   0.160   0.572    FALSE 1 1.112    24
#> PI[8,7]     0.439  0.252   0.002   0.450   0.939    FALSE 1 1.177    16
#> PI[10,7]    0.364  0.331   0.000   0.340   0.959    FALSE 1 1.271    11
#> PI[4,8]     0.168  0.194   0.000   0.106   0.679    FALSE 1 1.080    39
#> PI[5,8]     0.275  0.184   0.002   0.261   0.671    FALSE 1 1.034   555
#> PI[9,8]     0.050  0.129   0.000   0.000   0.487    FALSE 1 1.325    15
#> PI[10,8]    0.506  0.232   0.000   0.539   0.920    FALSE 1 1.198    18
#> PI[4,9]     1.000  0.000   1.000   1.000   1.000    FALSE 1    NA     1
#> PI[5,10]    1.000  0.000   1.000   1.000   1.000    FALSE 1    NA     1
#> deviance  865.999 11.225 846.582 865.407 890.264    FALSE 1 1.015   136
#> 
#> **WARNING** Some Rhat values could not be calculated.
#> **WARNING** Rhat values indicate convergence failure. 
#> Rhat is the potential scale reduction factor (at convergence, Rhat=1). 
#> For each parameter, n.eff is a crude measure of effective sample size. 
#> 
#> overlap0 checks if 0 falls in the parameter's 95% credible interval.
#> f is the proportion of the posterior with the same sign as the mean;
#> i.e., our confidence that the parameter is positive or negative.
#> 
#> DIC info: (pD = var(deviance)/2) 
#> pD = 62.1 and DIC = 928.147 
#> DIC is an estimate of expected predictive error (lower is better).

You should now try to run the model until it converges (it should take around half an hour to run, so we won’t do it in this vignette):

mcmc_output <- run_model(filename, data, run_param="normal", parallelize = T)

Here are the figures corresponding to the results that have converged:

plot_results(mcmc_output, data)

plot_results(mcmc_output, data, pred = "Pout")

You can also plot the results for specific prey if you want a clearer figure:

plot_results(mcmc_output, data, pred = "Pout", 
             variable = "PI", prey = c("Bivalves", "Worms"))

The configuration with literature data

We now change the configuration to add literature data to the model:

literature_configuration <- TRUE
realistic_literature_diets_path <- system.file("extdata", "realistic_literature_diets.csv",
                                               package = "EcoDiet")
realistic_literature_diets <- read.csv(realistic_literature_diets_path)
knitr::kable(realistic_literature_diets)
X Cod Pout Sardine Shrimps Crabs Bivalves Worms Zooplankton Phytoplankton Detritus
Cod 0.0000000 0.0000000 0.0000000 0.0000000 0.0000000 0.0000000 0.0 0.0 0 0
Pout 0.4275065 0.0000000 0.0000000 0.0000000 0.0000000 0.0000000 0.0 0.0 0 0
Sardine 0.3603675 0.0000000 0.0000000 0.0000000 0.0000000 0.0000000 0.0 0.0 0 0
Shrimps 0.0300737 0.5295859 0.0002143 0.0000000 0.0082107 0.0000000 0.0 0.0 0 0
Crabs 0.1410430 0.3332779 0.0000000 0.0000000 0.0000000 0.0000000 0.0 0.0 0 0
Bivalves 0.0000000 0.0006130 0.0000000 0.0000000 0.3441081 0.0000000 0.0 0.0 0 0
Worms 0.0410093 0.1023676 0.0000000 0.0171336 0.4435377 0.0000000 0.0 0.0 0 0
Zooplankton 0.0000000 0.0341557 0.7381375 0.9121505 0.0000000 0.0000000 0.0 0.0 0 0
Phytoplankton 0.0000000 0.0000000 0.2616482 0.0000610 0.0000000 0.9966847 0.0 1.0 0 0
Detritus 0.0000000 0.0000000 0.0000000 0.0706550 0.2041434 0.0033153 1.0 0.0 0 0
pedigree 0.8000000 0.1000000 0.5000000 0.3000000 0.7000000 0.1000000 0.2 0.2 1 1
data <- preprocess_data(biotracer_data = realistic_biotracer_data,
                        trophic_discrimination_factor = c(0.8, 3.4),
                        literature_configuration = literature_configuration,
                        stomach_data = realistic_stomach_data,
                        literature_diets = realistic_literature_diets,
                        nb_literature = 12,
                        literature_slope = 0.5)
#> The model will investigate the following trophic links:
#>               Bivalves Cod Crabs Detritus Phytoplankton Pout Sardine Shrimps
#> Bivalves             0   0     1        0             0    1       0       0
#> Cod                  0   0     0        0             0    0       0       0
#> Crabs                0   1     0        0             0    1       0       0
#> Detritus             1   0     1        0             0    0       0       1
#> Phytoplankton        1   0     0        0             0    0       1       1
#> Pout                 0   1     0        0             0    0       0       0
#> Sardine              0   1     0        0             0    0       0       0
#> Shrimps              0   1     1        0             0    1       1       0
#> Worms                0   1     1        0             0    1       0       1
#> Zooplankton          0   0     0        0             0    1       1       1
#>               Worms Zooplankton
#> Bivalves          0           0
#> Cod               0           0
#> Crabs             0           0
#> Detritus          1           0
#> Phytoplankton     0           1
#> Pout              0           0
#> Sardine           0           0
#> Shrimps           0           0
#> Worms             0           0
#> Zooplankton       0           0

Now we see that the prior distributions are informed by the literature data:

  • when the literature diet input is > 0, the trophic link probabilities η are shifted toward one. Here this is the case for all prey but we could imagine that the user identify a species as a plausible prey whereas it has not been observed being consumed by the predator in the literature. In that case, the literature diet of 0 would drive η toward 0.

  • the average prior for the diet proportions Π is directly the literature diet input.

plot_prior(data, literature_configuration)

plot_prior(data, literature_configuration, pred = "Pout")

Again, we verify that the model compiles well:

filename <- "mymodel_literature.txt"
write_model(file.name = filename, literature_configuration = literature_configuration, print.model = F)
mcmc_output <- run_model(filename, data, run_param="test")
#> 
#> Processing function input....... 
#> 
#> Done. 
#>  
#> Compiling model graph
#>    Resolving undeclared variables
#>    Allocating nodes
#> Graph information:
#>    Observed stochastic nodes: 316
#>    Unobserved stochastic nodes: 125
#>    Total graph size: 1594
#> 
#> Initializing model
#> 
#> Adaptive phase, 500 iterations x 3 chains 
#> If no progress bar appears JAGS has decided not to adapt 
#>  
#> 
#>  Burn-in phase, 500 iterations x 3 chains 
#>  
#> 
#> Sampling from joint posterior, 500 iterations x 3 chains 
#>  
#> 
#> Calculating statistics....... 
#> 
#> Done.
#> 
#>   /!\ Convergence warning:
#> Out of the 51 variables, 19 variables have a Gelman-Rubin statistic > 1.1.
#> You may consider modifying the model run settings.
#> The variables with the poorest convergence are: PI[10,7], PI[8,7], PI[10,8], PI[9,8], PI[5,7], PI[8,6], PI[8,2], PI[8,3], PI[5,8], PI[3,2].
#> JAGS output for model 'mymodel_literature.txt', generated by jagsUI.
#> Estimates based on 3 chains of 1000 iterations,
#> adaptation = 500 iterations (sufficient),
#> burn-in = 500 iterations and thin rate = 1,
#> yielding 1500 total samples from the joint posterior. 
#> MCMC ran for 0.196 minutes at time 2025-02-19 06:44:23.426356.
#> 
#>              mean     sd    2.5%     50%   97.5% overlap0 f  Rhat n.eff
#> eta[4,1]    0.667  0.086   0.486   0.671   0.815    FALSE 1 1.003   648
#> eta[5,1]    0.604  0.088   0.418   0.607   0.781    FALSE 1 0.999  1500
#> eta[3,2]    0.350  0.082   0.196   0.346   0.522    FALSE 1 1.002  1047
#> eta[6,2]    0.359  0.083   0.205   0.354   0.523    FALSE 1 1.002   708
#> eta[7,2]    0.612  0.084   0.449   0.612   0.776    FALSE 1 1.000  1500
#> eta[8,2]    0.445  0.086   0.281   0.446   0.612    FALSE 1 1.004   410
#> eta[9,2]    0.816  0.067   0.676   0.824   0.925    FALSE 1 1.000  1500
#> eta[1,3]    0.526  0.080   0.369   0.526   0.684    FALSE 1 1.001  1130
#> eta[4,3]    0.715  0.072   0.569   0.719   0.846    FALSE 1 1.010   196
#> eta[8,3]    0.846  0.056   0.721   0.853   0.938    FALSE 1 1.000  1500
#> eta[9,3]    0.850  0.056   0.726   0.856   0.941    FALSE 1 1.001  1500
#> eta[1,6]    0.157  0.062   0.055   0.151   0.296    FALSE 1 1.001  1500
#> eta[3,6]    0.790  0.070   0.625   0.797   0.907    FALSE 1 1.001   974
#> eta[8,6]    0.190  0.066   0.080   0.185   0.334    FALSE 1 1.005   385
#> eta[9,6]    0.970  0.029   0.889   0.978   0.999    FALSE 1 0.999  1500
#> eta[10,6]   0.880  0.056   0.753   0.888   0.965    FALSE 1 1.001  1500
#> eta[5,7]    0.565  0.080   0.406   0.566   0.716    FALSE 1 1.000  1500
#> eta[8,7]    0.974  0.024   0.911   0.982   0.999    FALSE 1 1.000  1500
#> eta[10,7]   0.360  0.076   0.217   0.360   0.516    FALSE 1 1.027    77
#> eta[4,8]    0.675  0.093   0.474   0.682   0.839    FALSE 1 1.000  1500
#> eta[5,8]    0.589  0.098   0.392   0.590   0.784    FALSE 1 1.010   222
#> eta[9,8]    0.229  0.084   0.086   0.221   0.413    FALSE 1 1.001  1431
#> eta[10,8]   0.314  0.093   0.150   0.308   0.514    FALSE 1 1.013   147
#> eta[4,9]    0.954  0.043   0.842   0.967   0.999    FALSE 1 1.009   542
#> eta[5,10]   0.958  0.040   0.856   0.970   0.999    FALSE 1 1.003  1343
#> PI[4,1]     0.186  0.306   0.000   0.007   0.995    FALSE 1 1.237    15
#> PI[5,1]     0.814  0.306   0.005   0.993   1.000    FALSE 1 1.237    15
#> PI[3,2]     0.084  0.146   0.000   0.004   0.513    FALSE 1 1.266    14
#> PI[6,2]     0.277  0.269   0.000   0.243   0.901    FALSE 1 1.056    55
#> PI[7,2]     0.518  0.281   0.000   0.527   0.971    FALSE 1 1.129    21
#> PI[8,2]     0.021  0.058   0.000   0.001   0.162    FALSE 1 1.388    18
#> PI[9,2]     0.100  0.106   0.000   0.066   0.362    FALSE 1 1.011   446
#> PI[1,3]     0.387  0.238   0.000   0.376   0.877    FALSE 1 1.085    44
#> PI[4,3]     0.094  0.112   0.000   0.053   0.367    FALSE 1 1.187    15
#> PI[8,3]     0.021  0.046   0.000   0.000   0.147    FALSE 1 1.354    13
#> PI[9,3]     0.498  0.260   0.000   0.493   0.990    FALSE 1 1.225    14
#> PI[1,6]     0.077  0.170   0.000   0.002   0.613    FALSE 1 1.202    21
#> PI[3,6]     0.390  0.263   0.000   0.386   0.912    FALSE 1 1.107    23
#> PI[8,6]     0.223  0.295   0.000   0.043   0.935    FALSE 1 1.478     8
#> PI[9,6]     0.254  0.230   0.000   0.198   0.737    FALSE 1 1.095    30
#> PI[10,6]    0.056  0.108   0.000   0.005   0.410    FALSE 1 1.132    39
#> PI[5,7]     0.161  0.184   0.000   0.082   0.608    FALSE 1 1.712     6
#> PI[8,7]     0.236  0.337   0.000   0.006   1.000    FALSE 1 4.755     3
#> PI[10,7]    0.603  0.438   0.000   0.855   1.000    FALSE 1 7.586     3
#> PI[4,8]     0.184  0.238   0.000   0.073   0.874    FALSE 1 1.237    15
#> PI[5,8]     0.160  0.249   0.000   0.017   0.991    FALSE 1 1.347    13
#> PI[9,8]     0.136  0.246   0.000   0.002   0.848    FALSE 1 1.911     6
#> PI[10,8]    0.520  0.385   0.000   0.629   0.997    FALSE 1 3.153     4
#> PI[4,9]     1.000  0.000   1.000   1.000   1.000    FALSE 1    NA     1
#> PI[5,10]    1.000  0.000   1.000   1.000   1.000    FALSE 1    NA     1
#> deviance  866.331 11.469 846.293 865.818 890.720    FALSE 1 1.021    99
#> 
#> **WARNING** Some Rhat values could not be calculated.
#> **WARNING** Rhat values indicate convergence failure. 
#> Rhat is the potential scale reduction factor (at convergence, Rhat=1). 
#> For each parameter, n.eff is a crude measure of effective sample size. 
#> 
#> overlap0 checks if 0 falls in the parameter's 95% credible interval.
#> f is the proportion of the posterior with the same sign as the mean;
#> i.e., our confidence that the parameter is positive or negative.
#> 
#> DIC info: (pD = var(deviance)/2) 
#> pD = 64.5 and DIC = 930.846 
#> DIC is an estimate of expected predictive error (lower is better).

You should now try to run the model until it converges (it should take around half an hour to run, so we won’t do it in this vignette):

mcmc_output <- run_model(filename, data, run_param=list(nb_iter=100000, nb_burnin=50000, nb_thin=50, nb_adapt=50000), parallelize = T)

Here are the figures corresponding to the results that have converged:

plot_results(mcmc_output, data)

plot_results(mcmc_output, data, pred = "Pout")

You can save the figures as PNG using:

plot_results(mcmc_output, data, pred = "Pout", save = TRUE, save_path = ".")

Last, if you want to explore further in detail the a posteriori distribution of your parameters Π and η, you can run the following code line, which will store the values for all iterations into a data frame.

reshape_mcmc(mcmc_output, data)