A school class with 25 students has met on the 04.11.2021 and two students were infected with COVID-19. Several students were tested:
How likely is it that the two cases did not transmit COVID-19 and that no further cases will occur?
calculate_posterior_no_infections
The function calculate_posterior_no_infections()
calculates the probability that nobody is infected given the negative
tests. To that end, the following input values are required:
The input negative_persons
denotes the number of people
without the infectious persons and infected_persons
denotes
the number of primary COVID-19 cases in the school class. Furthermore,
the event type event
describes the setting of the meeting.
In the package the event types school
(school class) and
day_care_center
(day care center group) were modeled.
In addition, the information about the conducted tests
test_infos
, the conducted test types
test_types
and the number of persons which were tested at
each days subgroup_size
are needed. Each row of
test_infos
, test_types
and each entry of
subgroup_size
describes one group of persons, which were
tested in the same way on the same dates. test_infos
describes in the first column how many tests and the following columns
on which days after the event they were conducted, and
test_types
contains in each column which kind of test was
conducted at the corresponding date.
Beside this, distribution
as well as info
are optional inputs, which can be set to use an own, custom prior
distribution and own custom tests. distribution
should be a
probability vector, containing the probabilities that one COVID-19 case
infects 0, …, negative_persons
persons. For
info
a data frame with one column containing the
day-specific sensitivities of the considered test after infection has to
be created and the column should have the name of the considered tests.
The specificity of the test is not affecting the likelihood, since no
positive tests results are assumed.
The function calculate_posterior_no_infections()
uses
calculate_likelihood_negative_tests
and
calculate_prior_infections
to calculate the probability.
The whole bayesian statistical model behind this function is described
in the paper Jäckle et al [1].
negative_persons <- 23
infected_persons <- 2
event <- "school"
test_infos <- matrix(nrow = 2, ncol = 3)
test_infos[1, 1] <- 1
test_infos[1, 2] <- 2
test_infos[2, 1] <- 2
test_infos[2, 2] <- 4
test_infos[2, 3] <- 6
test_types <- matrix(nrow = 2, ncol = 2)
test_types[1, 1] <- "PCR"
test_types[2, 1] <- "PCR"
test_types[2, 2] <- "Antigen"
subgroup_size <- c(3, 5)
prob <- calculate_posterior_no_infections(negative_persons,
infected_persons,
event,
test_infos,
test_types,
subgroup_size)
print(prob)
#> [1] 0.825621
The resulting probability for no further infections given the negative test results is around 82.6%.
calculate_likelihood_negative_tests
The function calculate_likelihood_negative_tests()
is
used to calculate the vector of probabilities (likelihood) that zero
positive tests are observed, given different numbers of persons infected
by the primary cases.
The calculate_likelihood_negative_tests()
uses some
inputs of calculate_posterior_no_infections
, namely
negative_persons
, test_infos
,
test_types
and subgroup_size
.
Beside this, also info
is an optional input, which can
be set for using own custom tests.
The likelihood function as part of the bayesian statistical model is described in the paper Jäckle et al [1].
test_infos <- matrix(nrow = 2, ncol = 3)
test_infos[1, 1] <- 1
test_infos[1, 2] <- 2
test_infos[2, 1] <- 2
test_infos[2, 2] <- 4
test_infos[2, 3] <- 6
test_types <- matrix(nrow = 2, ncol = 2)
test_types[1, 1] <- "PCR"
test_types[2, 1] <- "PCR"
test_types[2, 2] <- "Antigen"
negative_persons <- 23
subgroup_size <- c(3, 5)
likelihood <- calculate_likelihood_negative_tests(test_infos,
test_types,
negative_persons,
subgroup_size)
print(likelihood)
#> [1] 1.0000000000 0.8338649049 0.6908141737 0.5683275826 0.4640775090
#> [6] 0.3759211840 0.3018929462 0.2401964946 0.1891971417 0.1474140667
#> [11] 0.1135125689 0.0862963205 0.0646996202 0.0477796460 0.0347087085
#> [16] 0.0247665041 0.0173323682 0.0118775284 0.0079573574 0.0052036265
#> [21] 0.0033167587 0.0020580817 0.0012420813 0.0007286544
The output vector shows the likelihood probabilities for negative test results given 0, ..., 23 persons infected by the primary cases.
calculate_prior_infections
The function calculate_prior_infections()
calculates the
a priori probability of how many people are infected in a specific event
setting.
The calculate_prior_infections()
needs the same inputs
as calculate_posterior_no_infections
except the
test_infos
, test_types
and
subgroup_sizes
variables. Currently, the event types
school
(school class) and day_care_center
(day
care center group) are modeled and can be chosen. Beside this, the
optional inputs p_one
and infect_average
can
be set in order to model an own, custom event scenario.
p_one
defines the probability that a COVID-19 case infects
at least one of the persons attending the event, and
infect_average
the average number of infected persons,
under the condition that COVID-19 was transmitted at the event.
The priori probability distribution as part of the bayesian statistical model is described in the paper Jäckle et al [1].
negative_persons <- 23
infected_persons <- 2
event <- "school"
prior <- calculate_prior_infections(negative_persons,
infected_persons,
event)
print(prior)
#> [1] 7.743990e-01 1.266298e-01 5.061046e-02 2.384428e-02 1.194404e-02
#> [6] 6.134207e-03 3.172921e-03 1.635224e-03 8.333112e-04 4.173381e-04
#> [11] 2.042826e-04 9.720703e-05 4.471282e-05 1.975681e-05 8.325528e-06
#> [16] 3.317071e-06 1.236190e-06 4.250407e-07 1.323857e-07 3.641152e-08
#> [21] 8.515680e-09 1.593207e-09 2.128086e-10 1.529138e-11
The output vector shows the prior probabilities that 0, …, 23 persons are infected by the primary cases.
# Convert test data frame into arrays
test_dates <- c()
test_colors <- c()
test_positions <- c()
test_text_positions <- c()
test_labels <- c("event")
test_label_positions_x <- c(date)
test_label_positions_y <- c(0.6)
test_label_colours <- c("1")
if (nrow(tests_df > 0)) {
for (i in 1:nrow(tests_df)) {
test_list <- strsplit(tests_df[i, 1], ", ")[[1]]
dates_group <- c()
for (j in 1:length(test_list)) {
test <- strsplit(test_list[j], " ")[[1]]
test_dates <- c(test_dates, test[1])
dates_group <- c(dates_group, as.Date(test[1]))
test_colors <- c(test_colors, as.character(i + 1) )
test_positions <- c(test_positions, i * 0.9 / nrow(tests_df))
test_text_positions <- c(test_text_positions, i * 0.9 / nrow(tests_df) + 0.1)
}
test_label <- paste(as.character(tests_df[i, 2]), " persons")
if(tests_df[i, 2] == 1)
{
test_label <- "1 person"
}
test_labels <- c(test_labels, test_label)
test_label_positions_x <- c(test_label_positions_x, mean.Date(dates_group))
test_label_positions_y <- c(test_label_positions_y, i*0.9/nrow(tests_df) + 0.1)
test_label_colours <- c(test_label_colours, as.character(i+1) )
}
}
df_label <- data.frame(Date = test_label_positions_x,
Position = test_label_positions_y,
Label = test_labels,
col = test_label_colours)
df <- data.frame(Date = c(date, test_dates),
Position = c(0.5, test_positions),
Text_pos = c(0.6, test_text_positions),
col = c("1", test_colors))
df <- df %>% arrange(desc(Position))
timeline_plot <- ggplot(df, aes(x = Date, y = Position, color = col)) +
scale_color_brewer(palette = "Set1") +
theme_classic() +
geom_hline(yintercept = 0, color = "black", size = 0.3) +
geom_segment(aes(y = Position, yend = 0, xend = Date, color = col),
size = 0.2) +
geom_point(aes(y = Position), size = 3) +
theme(axis.line.y = element_blank(),
axis.text.y = element_blank(),
axis.title.x = element_blank(),
axis.title.y = element_blank(),
axis.ticks.y = element_blank(),
# axis.text.x = element_blank(),
# axis.ticks.x = element_blank(),
axis.line.x = element_blank(),
legend.position = "none",
legend.title = element_blank(),
text = element_text(size = 16),
axis.text.x = element_text(angle = 30, hjust = 1)
) +
geom_label(aes(x = Date, y = Position, label = Label), df_label, size = 5, label.size = NA) +
scale_x_date(date_labels = "%d %b", date_breaks = "1 day", limits = c(min(df$Date) - 1, max(df$Date) + 1))
timeline_plot
[1] Jäckle S, Röger E, Dicken V, Geisler B, Schumacher J, Westphal M. A Statistical Model to Assess Risk for Supporting COVID-19 Quarantine Decisions. International Journal of Environmental Research and Public Health. 2021; 18(17): 9166.
calculate_posterior_no_infections
calculate_likelihood_negative_tests
calculate_prior_infections