---
title: "Exogenous dyadic covariates"
author: "Francisco Richter"
output: rmarkdown::html_vignette
vignette: >
  %\VignetteIndexEntry{Exogenous dyadic covariates}
  %\VignetteEngine{knitr::rmarkdown}
  %\VignetteEncoding{UTF-8}
---

```{r setup, include=FALSE}
knitr::opts_chunk$set(
  collapse = TRUE,
  comment = "#>",
  fig.width = 6,
  fig.height = 4
)
```

```{r}
library(amorem)
```

Exogenous information, such as geographic distance between actors, can drive the rate at which relational events occur. `amorem` supports this through the `contribution_logits` argument of `simulate_relational_events()`, which accepts any sender × receiver matrix of log-intensities.

## US state distance matrix

The package ships a 56 × 56 distance matrix (in metres) between US states and territories. We load it and transform it to a log scale:

```{r}
data("dist_matrix", package = "amorem")

# rescale metres to units of 100 km, then log-transform to compress the range
# (the + 1 keeps zero distances at exactly 0)
dist_log <- log(dist_matrix / 100000 + 1)
```

## Defining a non-linear effect

For this example we take the true effect of distance on the log-rate to be a smooth, non-linear function:

$$f(d) = \sin\!\bigl(-d / 1.5\bigr)$$

where $d$ is the log-transformed distance.

```{r}
# apply f element-wise to obtain a 56 x 56 matrix of log-intensities
true_effect <- sin(-dist_log / 1.5)
```

We can visualise this curve:

```{r true-effect-curve}
d_seq <- seq(0, max(dist_log), length.out = 200)
plot(d_seq, sin(-d_seq / 1.5),
  type = "l", lwd = 2, col = "red",
  xlab = "log-distance", ylab = "f(d)",
  main = "True non-linear distance effect"
)
```

## Simulating events with exogenous covariates

We pass the effect matrix directly as `contribution_logits`. The Gillespie algorithm uses these values to weight which dyad fires next.
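
To make the weighting concrete, here is a minimal base-R sketch of a single Gillespie step on a toy 3 × 3 logit matrix. It follows the textbook algorithm and is only an assumption about the general idea; the internals of `amorem` may differ. The names `toy_logits` and `gillespie_step` are hypothetical and not part of the package.

```r
# Toy log-intensities for three actors (made-up values, not package data).
# NA on the diagonal marks self-loops, which are not allowed to fire.
actors <- c("A", "B", "C")
toy_logits <- matrix(
  c(  NA, 0.5, -1.0,
     0.2,  NA,  0.0,
    -0.5, 1.0,   NA),
  nrow = 3, byrow = TRUE,
  dimnames = list(actors, actors)
)

# One Gillespie step: draw the waiting time and the dyad that fires.
gillespie_step <- function(logits) {
  rates <- exp(logits)
  rates[is.na(rates)] <- 0                 # excluded dyads get rate zero
  total <- sum(rates)
  wait <- rexp(1, rate = total)            # waiting time ~ Exp(total rate)
  # choose one cell with probability rate / total (column-major index)
  idx <- sample(length(rates), size = 1, prob = as.vector(rates))
  pos <- arrayInd(idx, dim(rates))         # back to (row, column)
  list(
    wait = wait,
    sender = rownames(logits)[pos[1]],
    receiver = colnames(logits)[pos[2]]
  )
}

set.seed(1)
step <- gillespie_step(toy_logits)
str(step)
```

Dyads with larger logits fire more often, and adding a constant to every logit changes only the waiting times, not which dyad is chosen.
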
The simulation call also requests one control per event for downstream inference:

```{r}
set.seed(42)
states <- rownames(dist_matrix)

events <- simulate_relational_events(
  n_events = 800,
  senders = states,
  receivers = states,
  contribution_logits = true_effect,
  allow_loops = FALSE,
  n_controls = 1
)
head(events)
```

## Recovering the effect with a GAM

For each event–control pair we compute the **difference** in log-distance. The response is fixed at 1 and the intercept is dropped, so the model is fitted to the case-minus-control differences only. A GAM with a smooth term `s(delta_dist)` should recover the true curve.

```{r fit-gam}
library(mgcv)

# look up the log-distance for a sender/receiver pair
get_dist <- function(s, r) {
  dist_log[cbind(match(s, states), match(r, states))]
}
events$dist_val <- mapply(get_dist, events$sender, events$receiver)

# split into observed events (cases) and sampled non-events (controls)
cases <- events[events$event == 1, ]
controls <- events[events$event == 0, ]

# sort both by stratum so that row i of each table refers to the same event
cases <- cases[order(cases$stratum), ]
controls <- controls[order(controls$stratum), ]

fit_df <- data.frame(
  y = 1,
  delta_dist = cases$dist_val - controls$dist_val
)

# no-intercept logistic GAM on the case-minus-control difference
fit <- gam(y ~ s(delta_dist) - 1, family = binomial, data = fit_df)
summary(fit)
```

## Plotting estimated vs true effect

```{r effect-plot}
# predict the smooth on the link (log-odds) scale over the observed range
x_grid <- seq(min(fit_df$delta_dist), max(fit_df$delta_dist), length.out = 300)
pred <- predict(fit, newdata = data.frame(delta_dist = x_grid), type = "link")

plot(x_grid, pred,
  type = "l", lwd = 2,
  xlab = expression(Delta ~ "log-distance"),
  ylab = "Estimated effect",
  main = "GAM-recovered smooth vs true effect"
)
abline(h = 0, lty = 2, col = "grey50") # zero-effect reference line
```

The GAM captures the non-linear relationship between distance and event intensity, which shows that `amorem` handles exogenous dyadic covariates through `contribution_logits`.
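
The logistic fit with a constant response and no intercept can be motivated by a short derivation. The sketch below assumes, which the vignette does not state explicitly, that each control is drawn uniformly from the dyads at risk and that the rate of a dyad with log-distance $d$ is $\lambda = \exp\{f(d)\}$. Conditional on an event–control pair, the probability that the observed dyad is the one that fired is

$$
P(\text{case fires} \mid \text{pair})
= \frac{e^{f(d_{\text{case}})}}{e^{f(d_{\text{case}})} + e^{f(d_{\text{control}})}}
= \frac{1}{1 + e^{-\{f(d_{\text{case}}) - f(d_{\text{control}})\}}}
$$

This is a logistic model with response fixed at 1, no intercept, and linear predictor $f(d_{\text{case}}) - f(d_{\text{control}})$. For a linear effect $f(d) = \beta d$ the predictor reduces to $\beta\,\Delta d$. The term `s(delta_dist)` instead models the predictor as a smooth function of $\Delta d$ alone, which equals $f(d_{\text{case}}) - f(d_{\text{control}})$ exactly only when $f$ is linear, so for a non-linear $f$ the fitted curve is best read as an approximation.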