--- title: "Statistical Modelling for Infectious Disease Management - Risk assessment group quarantine" output: rmarkdown::html_vignette: fig_width: 7 fig_height: 5 fontsize: 12pt geometry: margin=1in vignette: > %\VignetteIndexEntry{Statistical Modelling for Infectious Disease Management - Risk assessment group quarantine} %\VignetteEngine{knitr::rmarkdown} %\VignetteEncoding{UTF-8} --- ```{r, include = FALSE} knitr::opts_chunk$set( collapse = TRUE, comment = "#>", warning = FALSE ) knitr::knit_hooks$set( source = function(x, options) { hook.r <- function(x, options) { fence <- "```" language <- tolower(options$engine) if (language == 'node') language <- 'javascript' if (!options$highlight) language <- 'text' if(!is.null(options$foldcode)) { paste0('\n\n', "
Source\n", fence, language, '\n', x, fence, '\n\n', "
\n") } else { paste0('\n\n', fence, language, '\n', x, fence, '\n\n') } } x <- knitr:::hilight_source(x, 'markdown', options) hook.r( paste(c( x, '' ), collapse = '\n'), options ) } ) Sys.setlocale("LC_TIME", "C") ``` ```{r setup, echo = FALSE, message = FALSE} library(smidm) library(ggplot2) library(dplyr) ``` ## Question: A school class with 25 students has met on the 04.11.2021 and two students were infected with COVID-19. Several students were tested: - First group: 3 persons in the group, tested with PCR on the second day after the group event. - Second group: 5 persons in the group, tested with PCR on the fourth day after the group event and with Antigen on the sixth day after the group event. How likely is it that the two cases did not transmit COVID-19 and that no further cases will occur? ## Calculating the probability that nobody is infected given the negative test results using \newline `calculate_posterior_no_infections` ### Inputs The function `calculate_posterior_no_infections()` calculates the probability that nobody is infected given the negative tests. To that end, the following input values are required: The input `negative_persons` denotes the number of people without the infectious persons and `infected_persons` denotes the number of primary COVID-19 cases in the school class. Furthermore, the event type `event` describes the setting of the meeting. In the package the event types `school` (school class) and `day_care_center` (day care center group) were modeled. In addition, the information about the conducted tests `test_infos`, the conducted test types `test_types` and the number of persons which were tested at each days `subgroup_size` are needed. Each row of `test_infos`, `test_types` and each entry of `subgroup_size` describes one group of persons, which were tested in the same way on the same dates. `test_infos` describes in the first column how many tests and the following columns on which days after the event they were conducted, and `test_types` contains in each column which kind of test was conducted at the corresponding date. Beside this, `distribution` as well as `info` are optional inputs, which can be set to use an own, custom prior distribution and own custom tests. `distribution` should be a probability vector, containing the probabilities that one COVID-19 case infects 0, ..., `negative_persons` persons. For `info` a data frame with one column containing the day-specific sensitivities of the considered test after infection has to be created and the column should have the name of the considered tests. The specificity of the test is not affecting the likelihood, since no positive tests results are assumed. ### Methodology The function `calculate_posterior_no_infections()` uses `calculate_likelihood_negative_tests` and `calculate_prior_infections` to calculate the probability. The whole bayesian statistical model behind this function is described in the paper Jäckle et al [1]. ### Output ```{r calculate_posterior_no_infections_statement} negative_persons <- 23 infected_persons <- 2 event <- "school" test_infos <- matrix(nrow = 2, ncol = 3) test_infos[1, 1] <- 1 test_infos[1, 2] <- 2 test_infos[2, 1] <- 2 test_infos[2, 2] <- 4 test_infos[2, 3] <- 6 test_types <- matrix(nrow = 2, ncol = 2) test_types[1, 1] <- "PCR" test_types[2, 1] <- "PCR" test_types[2, 2] <- "Antigen" subgroup_size <- c(3, 5) prob <- calculate_posterior_no_infections(negative_persons, infected_persons, event, test_infos, test_types, subgroup_size) print(prob) ``` The resulting probability for no further infections given the negative test results is around 82.6%. ## Calculating the likelihood using `calculate_likelihood_negative_tests` The function `calculate_likelihood_negative_tests()` is used to calculate the vector of probabilities (likelihood) that zero positive tests are observed, given different numbers of persons infected by the primary cases. ### Inputs The `calculate_likelihood_negative_tests()` uses some inputs of `calculate_posterior_no_infections`, namely `negative_persons`, `test_infos`, `test_types` and `subgroup_size`. Beside this, also `info` is an optional input, which can be set for using own custom tests. ### Methodology The likelihood function as part of the bayesian statistical model is described in the paper Jäckle et al [1]. ### Output ```{r calculate_likelihood_negative_tests} test_infos <- matrix(nrow = 2, ncol = 3) test_infos[1, 1] <- 1 test_infos[1, 2] <- 2 test_infos[2, 1] <- 2 test_infos[2, 2] <- 4 test_infos[2, 3] <- 6 test_types <- matrix(nrow = 2, ncol = 2) test_types[1, 1] <- "PCR" test_types[2, 1] <- "PCR" test_types[2, 2] <- "Antigen" negative_persons <- 23 subgroup_size <- c(3, 5) likelihood <- calculate_likelihood_negative_tests(test_infos, test_types, negative_persons, subgroup_size) print(likelihood) ``` The output vector shows the likelihood probabilities for negative test results given $0, ..., 23$ persons infected by the primary cases. ## Calculating the priori probability distribution of further infections using \newline `calculate_prior_infections` The function `calculate_prior_infections()` calculates the a priori probability of how many people are infected in a specific event setting. ### Inputs The `calculate_prior_infections()` needs the same inputs as `calculate_posterior_no_infections` except the `test_infos`, `test_types` and `subgroup_sizes` variables. Currently, the event types `school` (school class) and `day_care_center` (day care center group) are modeled and can be chosen. Beside this, the optional inputs `p_one` and `infect_average` can be set in order to model an own, custom event scenario. `p_one` defines the probability that a COVID-19 case infects at least one of the persons attending the event, and `infect_average` the average number of infected persons, under the condition that COVID-19 was transmitted at the event. ### Methodology The priori probability distribution as part of the bayesian statistical model is described in the paper Jäckle et al [1]. ### Outputs ```{r calculate_prior_infections} negative_persons <- 23 infected_persons <- 2 event <- "school" prior <- calculate_prior_infections(negative_persons, infected_persons, event) print(prior) ``` The output vector shows the prior probabilities that 0, ..., 23 persons are infected by the primary cases. ## Visualization example of all date inputs on a time scale ```{r foldcode = TRUE} date <- as.Date("2021-10-05") tests_df <- data.frame(c("2021-10-06 PCR-test", "2021-10-07 antigen-test", "2021-10-07 antigen-test", "2021-10-08 antigen-test", "2021-10-10 antigen-test"), c(2, 2, 3, 5, 1), c(1, 2, 2, 3, 5)) ``` ```{r foldcode = TRUE} # Convert test data frame into arrays test_dates <- c() test_colors <- c() test_positions <- c() test_text_positions <- c() test_labels <- c("event") test_label_positions_x <- c(date) test_label_positions_y <- c(0.6) test_label_colours <- c("1") if (nrow(tests_df > 0)) { for (i in 1:nrow(tests_df)) { test_list <- strsplit(tests_df[i, 1], ", ")[[1]] dates_group <- c() for (j in 1:length(test_list)) { test <- strsplit(test_list[j], " ")[[1]] test_dates <- c(test_dates, test[1]) dates_group <- c(dates_group, as.Date(test[1])) test_colors <- c(test_colors, as.character(i + 1) ) test_positions <- c(test_positions, i * 0.9 / nrow(tests_df)) test_text_positions <- c(test_text_positions, i * 0.9 / nrow(tests_df) + 0.1) } test_label <- paste(as.character(tests_df[i, 2]), " persons") if(tests_df[i, 2] == 1) { test_label <- "1 person" } test_labels <- c(test_labels, test_label) test_label_positions_x <- c(test_label_positions_x, mean.Date(dates_group)) test_label_positions_y <- c(test_label_positions_y, i*0.9/nrow(tests_df) + 0.1) test_label_colours <- c(test_label_colours, as.character(i+1) ) } } df_label <- data.frame(Date = test_label_positions_x, Position = test_label_positions_y, Label = test_labels, col = test_label_colours) df <- data.frame(Date = c(date, test_dates), Position = c(0.5, test_positions), Text_pos = c(0.6, test_text_positions), col = c("1", test_colors)) df <- df %>% arrange(desc(Position)) timeline_plot <- ggplot(df, aes(x = Date, y = Position, color = col)) + scale_color_brewer(palette = "Set1") + theme_classic() + geom_hline(yintercept = 0, color = "black", size = 0.3) + geom_segment(aes(y = Position, yend = 0, xend = Date, color = col), size = 0.2) + geom_point(aes(y = Position), size = 3) + theme(axis.line.y = element_blank(), axis.text.y = element_blank(), axis.title.x = element_blank(), axis.title.y = element_blank(), axis.ticks.y = element_blank(), # axis.text.x = element_blank(), # axis.ticks.x = element_blank(), axis.line.x = element_blank(), legend.position = "none", legend.title = element_blank(), text = element_text(size = 16), axis.text.x = element_text(angle = 30, hjust = 1) ) + geom_label(aes(x = Date, y = Position, label = Label), df_label, size = 5, label.size = NA) + scale_x_date(date_labels = "%d %b", date_breaks = "1 day", limits = c(min(df$Date) - 1, max(df$Date) + 1)) timeline_plot ``` ## Literature [1] Jäckle S, Röger E, Dicken V, Geisler B, Schumacher J, Westphal M. A Statistical Model to Assess Risk for Supporting COVID-19 Quarantine Decisions. International Journal of Environmental Research and Public Health. 2021; 18(17): 9166.