EcoDiet explored on a realistic example

The introduction employed a simplistic expemple of food web to familiarize the user with the basic commands and options of the EcoDiet package. Here we will use a more realistic example (although still artificial!) to run the different EcoDiet configurations, compare their results and hence higlight the complementarity in the different data used.

The data corresponds to 10 trophic groups with stomach content data, and very distinct isotopic measures.

realistic_stomach_data_path <- system.file("extdata", "realistic_stomach_data.csv",
                                           package = "EcoDiet")
realistic_stomach_data <- read.csv(realistic_stomach_data_path)
knitr::kable(realistic_stomach_data)
X Cod Pout Sardine Shrimps Crabs Bivalves Worms Zooplankton Phytoplankton Detritus
Cod 0 0 0 0 0 0 0 0 0 0
Pout 1 0 0 0 0 0 0 0 0 0
Sardine 9 0 0 0 0 0 0 0 0 0
Shrimps 4 4 29 0 24 0 0 0 0 0
Crabs 1 24 0 0 0 0 0 0 0 0
Bivalves 0 3 0 0 11 0 0 0 0 0
Worms 16 30 0 1 24 0 0 0 0 0
Zooplankton 0 27 6 3 0 0 0 0 0 0
Phytoplankton 0 0 14 10 0 16 0 20 0 0
Detritus 0 0 0 12 19 18 18 0 0 0
full 21 30 29 19 29 27 18 20 0 0
realistic_biotracer_data_path <- system.file("extdata", "realistic_biotracer_data.csv",
                                           package = "EcoDiet")
realistic_biotracer_data <- read.csv(realistic_biotracer_data_path)
knitr::kable(realistic_biotracer_data[c(1:3, 31:33, 61:63), ])
group d13C d15N
1 Cod -12.94144 19.18913
2 Cod -14.96070 20.23939
3 Cod -13.77822 19.48809
31 Pout -13.47127 18.57353
32 Pout -13.16888 17.58714
33 Pout -14.23085 17.38938
61 Sardine -14.56111 16.95231
62 Sardine -15.04729 17.15197
63 Sardine -14.63688 16.90906
library(EcoDiet)

plot_data(biotracer_data = realistic_biotracer_data,
          stomach_data = realistic_stomach_data)

#> Warning: Use of `biotracer_data$group` is discouraged.
#> ℹ Use `group` instead.

Yes, we are aware that isotopic data is usually messier, but isn’t it a beautiful plot?

The configuration without literature data

We define the configuration we are in, and preprocess the data:

literature_configuration <- FALSE

data <- preprocess_data(biotracer_data = realistic_biotracer_data,
                        trophic_discrimination_factor = c(0.8, 3.4),
                        literature_configuration = literature_configuration,
                        stomach_data = realistic_stomach_data)
#> The model will investigate the following trophic links:
#>               Bivalves Cod Crabs Detritus Phytoplankton Pout Sardine Shrimps
#> Bivalves             0   0     1        0             0    1       0       0
#> Cod                  0   0     0        0             0    0       0       0
#> Crabs                0   1     0        0             0    1       0       0
#> Detritus             1   0     1        0             0    0       0       1
#> Phytoplankton        1   0     0        0             0    0       1       1
#> Pout                 0   1     0        0             0    0       0       0
#> Sardine              0   1     0        0             0    0       0       0
#> Shrimps              0   1     1        0             0    1       1       0
#> Worms                0   1     1        0             0    1       0       1
#> Zooplankton          0   0     0        0             0    1       1       1
#>               Worms Zooplankton
#> Bivalves          0           0
#> Cod               0           0
#> Crabs             0           0
#> Detritus          1           0
#> Phytoplankton     0           1
#> Pout              0           0
#> Sardine           0           0
#> Shrimps           0           0
#> Worms             0           0
#> Zooplankton       0           0

In this configuration, priors are set for each trophic link identified as plausible by the user but the priors are not informed by literature data, and are thus uninformative:

plot_prior(data, literature_configuration)

The marginal prior distributions have different shape depending on the variables:

  • it is flat or uniform for η, the probabilities that a trophic link exists (all the probabilities of existence are thus equiprobable),

  • the marginal distributions for each diet proportion Π are peaking at zero, although the joint distribution for Πs is a flat Dirichlet prior, because all the diet proportions must sum to one.

plot_prior(data, literature_configuration, pred = "Pout")

We define the model, and test if it compiles well with a few iterations and adaptation steps:

filename <- "mymodel.txt"
write_model(file.name = filename, literature_configuration = literature_configuration, print.model = F)
mcmc_output <- run_model(filename, data, run_param="test")
#> 
#> Processing function input....... 
#> 
#> Done. 
#>  
#> Compiling model graph
#>    Resolving undeclared variables
#>    Allocating nodes
#> Graph information:
#>    Observed stochastic nodes: 316
#>    Unobserved stochastic nodes: 125
#>    Total graph size: 1104
#> 
#> Initializing model
#> 
#> Adaptive phase, 500 iterations x 3 chains 
#> If no progress bar appears JAGS has decided not to adapt 
#>  
#> 
#>  Burn-in phase, 500 iterations x 3 chains 
#>  
#> 
#> Sampling from joint posterior, 500 iterations x 3 chains 
#>  
#> 
#> Calculating statistics....... 
#> 
#> Done.
#> 
#>   /!\ Convergence warning:
#> Out of the 51 variables, 10 variables have a Gelman-Rubin statistic > 1.1.
#> You may consider modifying the model run settings.
#> The variables with the poorest convergence are: PI[10,8], PI[9,8], PI[7,2], PI[8,2], PI[3,2], PI[1,6], PI[6,2], PI[5,8], PI[8,6], PI[4,8].
#> JAGS output for model 'mymodel.txt', generated by jagsUI.
#> Estimates based on 3 chains of 1000 iterations,
#> adaptation = 500 iterations (sufficient),
#> burn-in = 500 iterations and thin rate = 1,
#> yielding 1500 total samples from the joint posterior. 
#> MCMC ran for 0.194 minutes at time 2024-12-21 06:36:46.265899.
#> 
#>              mean     sd    2.5%     50%   97.5% overlap0 f  Rhat n.eff
#> eta[4,1]    0.652  0.086   0.473   0.655   0.816    FALSE 1 1.000  1500
#> eta[5,1]    0.590  0.092   0.398   0.594   0.764    FALSE 1 0.999  1500
#> eta[3,2]    0.093  0.062   0.013   0.082   0.246    FALSE 1 1.022   129
#> eta[6,2]    0.090  0.058   0.012   0.079   0.228    FALSE 1 1.046    64
#> eta[7,2]    0.438  0.098   0.253   0.437   0.631    FALSE 1 1.019   112
#> eta[8,2]    0.218  0.084   0.078   0.210   0.392    FALSE 1 1.013   151
#> eta[9,2]    0.737  0.090   0.543   0.742   0.886    FALSE 1 1.003   515
#> eta[1,3]    0.387  0.086   0.230   0.386   0.567    FALSE 1 1.004  1004
#> eta[4,3]    0.649  0.085   0.470   0.652   0.802    FALSE 1 1.001  1165
#> eta[8,3]    0.806  0.068   0.661   0.812   0.918    FALSE 1 1.000  1500
#> eta[9,3]    0.808  0.068   0.658   0.817   0.921    FALSE 1 1.005   320
#> eta[1,6]    0.123  0.055   0.039   0.116   0.251    FALSE 1 1.000  1500
#> eta[3,6]    0.788  0.069   0.639   0.793   0.904    FALSE 1 1.000  1500
#> eta[8,6]    0.155  0.065   0.055   0.149   0.307    FALSE 1 1.003  1500
#> eta[9,6]    0.969  0.029   0.895   0.977   0.999    FALSE 1 1.000  1500
#> eta[10,6]   0.874  0.057   0.748   0.880   0.964    FALSE 1 1.003   569
#> eta[5,7]    0.490  0.088   0.317   0.488   0.654    FALSE 1 1.001  1262
#> eta[8,7]    0.968  0.030   0.891   0.977   0.999    FALSE 1 1.000  1500
#> eta[10,7]   0.228  0.074   0.105   0.220   0.383    FALSE 1 1.009   216
#> eta[4,8]    0.619  0.104   0.400   0.623   0.807    FALSE 1 1.009   245
#> eta[5,8]    0.519  0.107   0.310   0.516   0.723    FALSE 1 1.005   379
#> eta[9,8]    0.099  0.065   0.014   0.086   0.261    FALSE 1 1.008   303
#> eta[10,8]   0.197  0.083   0.058   0.188   0.380    FALSE 1 1.038    57
#> eta[4,9]    0.950  0.047   0.829   0.965   0.998    FALSE 1 1.001  1500
#> eta[5,10]   0.954  0.044   0.844   0.967   0.999    FALSE 1 1.002  1500
#> PI[4,1]     0.326  0.280   0.000   0.287   0.956    FALSE 1 1.043    56
#> PI[5,1]     0.674  0.280   0.044   0.713   1.000    FALSE 1 1.043    56
#> PI[3,2]     0.263  0.376   0.000   0.002   0.999    FALSE 1 1.467     8
#> PI[6,2]     0.123  0.265   0.000   0.000   0.953    FALSE 1 1.357    13
#> PI[7,2]     0.318  0.363   0.000   0.021   0.975    FALSE 1 1.724     6
#> PI[8,2]     0.116  0.216   0.000   0.006   0.782    FALSE 1 1.607     8
#> PI[9,2]     0.180  0.179   0.000   0.140   0.598    FALSE 1 1.087    29
#> PI[1,3]     0.156  0.214   0.000   0.052   0.745    FALSE 1 1.047   227
#> PI[4,3]     0.207  0.164   0.000   0.191   0.551    FALSE 1 1.032    67
#> PI[8,3]     0.289  0.191   0.002   0.279   0.691    FALSE 1 1.044    52
#> PI[9,3]     0.347  0.245   0.000   0.314   0.893    FALSE 1 1.035    62
#> PI[1,6]     0.018  0.058   0.000   0.000   0.183    FALSE 1 1.423    16
#> PI[3,6]     0.326  0.199   0.003   0.315   0.755    FALSE 1 1.004   415
#> PI[8,6]     0.045  0.088   0.000   0.005   0.328    FALSE 1 1.277    19
#> PI[9,6]     0.345  0.214   0.017   0.328   0.787    FALSE 1 1.002  1119
#> PI[10,6]    0.266  0.207   0.000   0.244   0.730    FALSE 1 1.027    87
#> PI[5,7]     0.285  0.175   0.001   0.283   0.651    FALSE 1 1.035    65
#> PI[8,7]     0.523  0.215   0.064   0.545   0.914    FALSE 1 1.012   192
#> PI[10,7]    0.192  0.257   0.000   0.049   0.864    FALSE 1 1.055    48
#> PI[4,8]     0.253  0.224   0.000   0.199   0.807    FALSE 1 1.146    24
#> PI[5,8]     0.245  0.236   0.000   0.211   0.768    FALSE 1 1.296    11
#> PI[9,8]     0.237  0.259   0.000   0.145   0.770    FALSE 1 2.637     4
#> PI[10,8]    0.265  0.308   0.000   0.076   0.928    FALSE 1 2.717     4
#> PI[4,9]     1.000  0.000   1.000   1.000   1.000    FALSE 1    NA     1
#> PI[5,10]    1.000  0.000   1.000   1.000   1.000    FALSE 1    NA     1
#> deviance  866.107 11.245 846.223 865.634 889.223    FALSE 1 1.026    79
#> 
#> **WARNING** Some Rhat values could not be calculated.
#> **WARNING** Rhat values indicate convergence failure. 
#> Rhat is the potential scale reduction factor (at convergence, Rhat=1). 
#> For each parameter, n.eff is a crude measure of effective sample size. 
#> 
#> overlap0 checks if 0 falls in the parameter's 95% credible interval.
#> f is the proportion of the posterior with the same sign as the mean;
#> i.e., our confidence that the parameter is positive or negative.
#> 
#> DIC info: (pD = var(deviance)/2) 
#> pD = 61.7 and DIC = 927.783 
#> DIC is an estimate of expected predictive error (lower is better).

You should now try to run the model until it converges (it should take around half an hour to run, so we won’t do it in this vignette):

mcmc_output <- run_model(filename, data, run_param="normal", parallelize = T)

Here are the figures corresponding to the results that have converged:

plot_results(mcmc_output, data)

plot_results(mcmc_output, data, pred = "Pout")

You can also plot the results for specific prey if you want a clearer figure:

plot_results(mcmc_output, data, pred = "Pout", 
             variable = "PI", prey = c("Bivalves", "Worms"))

The configuration with literature data

We now change the configuration to add literature data to the model:

literature_configuration <- TRUE
realistic_literature_diets_path <- system.file("extdata", "realistic_literature_diets.csv",
                                               package = "EcoDiet")
realistic_literature_diets <- read.csv(realistic_literature_diets_path)
knitr::kable(realistic_literature_diets)
X Cod Pout Sardine Shrimps Crabs Bivalves Worms Zooplankton Phytoplankton Detritus
Cod 0.0000000 0.0000000 0.0000000 0.0000000 0.0000000 0.0000000 0.0 0.0 0 0
Pout 0.4275065 0.0000000 0.0000000 0.0000000 0.0000000 0.0000000 0.0 0.0 0 0
Sardine 0.3603675 0.0000000 0.0000000 0.0000000 0.0000000 0.0000000 0.0 0.0 0 0
Shrimps 0.0300737 0.5295859 0.0002143 0.0000000 0.0082107 0.0000000 0.0 0.0 0 0
Crabs 0.1410430 0.3332779 0.0000000 0.0000000 0.0000000 0.0000000 0.0 0.0 0 0
Bivalves 0.0000000 0.0006130 0.0000000 0.0000000 0.3441081 0.0000000 0.0 0.0 0 0
Worms 0.0410093 0.1023676 0.0000000 0.0171336 0.4435377 0.0000000 0.0 0.0 0 0
Zooplankton 0.0000000 0.0341557 0.7381375 0.9121505 0.0000000 0.0000000 0.0 0.0 0 0
Phytoplankton 0.0000000 0.0000000 0.2616482 0.0000610 0.0000000 0.9966847 0.0 1.0 0 0
Detritus 0.0000000 0.0000000 0.0000000 0.0706550 0.2041434 0.0033153 1.0 0.0 0 0
pedigree 0.8000000 0.1000000 0.5000000 0.3000000 0.7000000 0.1000000 0.2 0.2 1 1
data <- preprocess_data(biotracer_data = realistic_biotracer_data,
                        trophic_discrimination_factor = c(0.8, 3.4),
                        literature_configuration = literature_configuration,
                        stomach_data = realistic_stomach_data,
                        literature_diets = realistic_literature_diets,
                        nb_literature = 12,
                        literature_slope = 0.5)
#> The model will investigate the following trophic links:
#>               Bivalves Cod Crabs Detritus Phytoplankton Pout Sardine Shrimps
#> Bivalves             0   0     1        0             0    1       0       0
#> Cod                  0   0     0        0             0    0       0       0
#> Crabs                0   1     0        0             0    1       0       0
#> Detritus             1   0     1        0             0    0       0       1
#> Phytoplankton        1   0     0        0             0    0       1       1
#> Pout                 0   1     0        0             0    0       0       0
#> Sardine              0   1     0        0             0    0       0       0
#> Shrimps              0   1     1        0             0    1       1       0
#> Worms                0   1     1        0             0    1       0       1
#> Zooplankton          0   0     0        0             0    1       1       1
#>               Worms Zooplankton
#> Bivalves          0           0
#> Cod               0           0
#> Crabs             0           0
#> Detritus          1           0
#> Phytoplankton     0           1
#> Pout              0           0
#> Sardine           0           0
#> Shrimps           0           0
#> Worms             0           0
#> Zooplankton       0           0

Now we see that the prior distributions are informed by the literature data:

  • when the literature diet input is > 0, the trophic link probabilities η are shifted toward one. Here this is the case for all prey but we could imagine that the user identify a species as a plausible prey whereas it has not been observed being consumed by the predator in the literature. In that case, the literature diet of 0 would drive η toward 0.

  • the average prior for the diet proportions Π is directly the literature diet input.

plot_prior(data, literature_configuration)

plot_prior(data, literature_configuration, pred = "Pout")

Again, we verify that the model compiles well:

filename <- "mymodel_literature.txt"
write_model(file.name = filename, literature_configuration = literature_configuration, print.model = F)
mcmc_output <- run_model(filename, data, run_param="test")
#> 
#> Processing function input....... 
#> 
#> Done. 
#>  
#> Compiling model graph
#>    Resolving undeclared variables
#>    Allocating nodes
#> Graph information:
#>    Observed stochastic nodes: 316
#>    Unobserved stochastic nodes: 125
#>    Total graph size: 1594
#> 
#> Initializing model
#> 
#> Adaptive phase, 500 iterations x 3 chains 
#> If no progress bar appears JAGS has decided not to adapt 
#>  
#> 
#>  Burn-in phase, 500 iterations x 3 chains 
#>  
#> 
#> Sampling from joint posterior, 500 iterations x 3 chains 
#>  
#> 
#> Calculating statistics....... 
#> 
#> Done.
#> 
#>   /!\ Convergence warning:
#> Out of the 51 variables, 15 variables have a Gelman-Rubin statistic > 1.1.
#> You may consider modifying the model run settings.
#> The variables with the poorest convergence are: PI[10,7], PI[8,7], PI[8,6], PI[5,8], PI[3,2], PI[7,2], PI[1,6], PI[8,2], PI[9,6], PI[9,2].
#> JAGS output for model 'mymodel_literature.txt', generated by jagsUI.
#> Estimates based on 3 chains of 1000 iterations,
#> adaptation = 500 iterations (sufficient),
#> burn-in = 500 iterations and thin rate = 1,
#> yielding 1500 total samples from the joint posterior. 
#> MCMC ran for 0.198 minutes at time 2024-12-21 06:37:00.787027.
#> 
#>              mean     sd    2.5%     50%   97.5% overlap0 f  Rhat n.eff
#> eta[4,1]    0.665  0.083   0.489   0.670   0.818    FALSE 1 1.003   696
#> eta[5,1]    0.605  0.086   0.431   0.606   0.766    FALSE 1 1.001  1277
#> eta[3,2]    0.367  0.082   0.218   0.367   0.536    FALSE 1 1.024    89
#> eta[6,2]    0.354  0.083   0.198   0.352   0.529    FALSE 1 1.009   216
#> eta[7,2]    0.585  0.084   0.413   0.586   0.745    FALSE 1 1.003   492
#> eta[8,2]    0.454  0.084   0.287   0.451   0.618    FALSE 1 1.000  1500
#> eta[9,2]    0.820  0.068   0.671   0.828   0.931    FALSE 1 1.001  1296
#> eta[1,3]    0.526  0.077   0.371   0.528   0.671    FALSE 1 0.999  1500
#> eta[4,3]    0.717  0.072   0.570   0.719   0.850    FALSE 1 0.999  1500
#> eta[8,3]    0.849  0.055   0.715   0.855   0.939    FALSE 1 1.001  1449
#> eta[9,3]    0.847  0.058   0.719   0.852   0.943    FALSE 1 1.007   294
#> eta[1,6]    0.152  0.063   0.051   0.145   0.300    FALSE 1 1.002   978
#> eta[3,6]    0.791  0.069   0.642   0.798   0.907    FALSE 1 1.000  1500
#> eta[8,6]    0.188  0.065   0.079   0.181   0.323    FALSE 1 1.015   155
#> eta[9,6]    0.970  0.029   0.890   0.979   0.999    FALSE 1 1.000  1500
#> eta[10,6]   0.880  0.055   0.745   0.888   0.963    FALSE 1 1.002  1500
#> eta[5,7]    0.562  0.081   0.403   0.562   0.720    FALSE 1 1.000  1500
#> eta[8,7]    0.972  0.027   0.901   0.980   0.999    FALSE 1 1.007   368
#> eta[10,7]   0.356  0.077   0.213   0.355   0.505    FALSE 1 1.004   473
#> eta[4,8]    0.676  0.093   0.488   0.677   0.843    FALSE 1 1.003   838
#> eta[5,8]    0.590  0.098   0.397   0.592   0.769    FALSE 1 1.000  1500
#> eta[9,8]    0.228  0.081   0.094   0.221   0.404    FALSE 1 1.002  1389
#> eta[10,8]   0.316  0.093   0.152   0.311   0.511    FALSE 1 1.001  1500
#> eta[4,9]    0.955  0.043   0.850   0.969   0.999    FALSE 1 1.002  1500
#> eta[5,10]   0.960  0.040   0.853   0.973   0.999    FALSE 1 1.000  1500
#> PI[4,1]     0.181  0.287   0.000   0.023   0.981    FALSE 1 1.219    16
#> PI[5,1]     0.819  0.287   0.019   0.977   1.000    FALSE 1 1.219    16
#> PI[3,2]     0.441  0.379   0.000   0.463   0.983    FALSE 1 1.477     8
#> PI[6,2]     0.227  0.314   0.000   0.008   0.893    FALSE 1 1.117    25
#> PI[7,2]     0.119  0.258   0.000   0.000   0.886    FALSE 1 1.412    12
#> PI[8,2]     0.113  0.175   0.000   0.035   0.681    FALSE 1 1.280    17
#> PI[9,2]     0.100  0.122   0.000   0.057   0.440    FALSE 1 1.240    17
#> PI[1,3]     0.421  0.279   0.000   0.417   0.983    FALSE 1 1.079    34
#> PI[4,3]     0.108  0.123   0.000   0.067   0.416    FALSE 1 1.009   607
#> PI[8,3]     0.024  0.050   0.000   0.002   0.156    FALSE 1 1.118    36
#> PI[9,3]     0.447  0.295   0.000   0.465   0.968    FALSE 1 1.066    39
#> PI[1,6]     0.147  0.249   0.000   0.002   0.830    FALSE 1 1.399    10
#> PI[3,6]     0.387  0.260   0.003   0.355   0.946    FALSE 1 1.077    31
#> PI[8,6]     0.164  0.267   0.000   0.009   0.897    FALSE 1 1.876     6
#> PI[9,6]     0.209  0.239   0.000   0.107   0.796    FALSE 1 1.275    12
#> PI[10,6]    0.094  0.168   0.000   0.002   0.553    FALSE 1 1.001  1500
#> PI[5,7]     0.105  0.154   0.000   0.012   0.532    FALSE 1 1.187    18
#> PI[8,7]     0.317  0.370   0.000   0.053   1.000    FALSE 1 2.125     5
#> PI[10,7]    0.578  0.421   0.000   0.744   1.000    FALSE 1 2.303     4
#> PI[4,8]     0.145  0.191   0.000   0.049   0.619    FALSE 1 1.035    64
#> PI[5,8]     0.094  0.184   0.000   0.000   0.657    FALSE 1 1.486     9
#> PI[9,8]     0.150  0.220   0.000   0.021   0.740    FALSE 1 1.008   375
#> PI[10,8]    0.611  0.307   0.005   0.669   0.998    FALSE 1 1.071    34
#> PI[4,9]     1.000  0.000   1.000   1.000   1.000    FALSE 1    NA     1
#> PI[5,10]    1.000  0.000   1.000   1.000   1.000    FALSE 1    NA     1
#> deviance  866.428 11.653 845.577 865.820 892.057    FALSE 1 1.013   156
#> 
#> **WARNING** Some Rhat values could not be calculated.
#> **WARNING** Rhat values indicate convergence failure. 
#> Rhat is the potential scale reduction factor (at convergence, Rhat=1). 
#> For each parameter, n.eff is a crude measure of effective sample size. 
#> 
#> overlap0 checks if 0 falls in the parameter's 95% credible interval.
#> f is the proportion of the posterior with the same sign as the mean;
#> i.e., our confidence that the parameter is positive or negative.
#> 
#> DIC info: (pD = var(deviance)/2) 
#> pD = 67.1 and DIC = 933.534 
#> DIC is an estimate of expected predictive error (lower is better).

You should now try to run the model until it converges (it should take around half an hour to run, so we won’t do it in this vignette):

mcmc_output <- run_model(filename, data, run_param=list(nb_iter=100000, nb_burnin=50000, nb_thin=50, nb_adapt=50000), parallelize = T)

Here are the figures corresponding to the results that have converged:

plot_results(mcmc_output, data)

plot_results(mcmc_output, data, pred = "Pout")

You can save the figures as PNG using:

plot_results(mcmc_output, data, pred = "Pout", save = TRUE, save_path = ".")

Last, if you want to explore further in detail the a posteriori distribution of your parameters Π and η, you can run the following code line, which will store the values for all iterations into a data frame.

reshape_mcmc(mcmc_output, data)