---
title: "Risk set"
author: ""
package: remify
date: ""
output:
  rmarkdown::html_document:
    theme: spacelab
    highlight: pygments
    code_folding: show
    css: "remify-theme.css"
header-includes:
  - \usepackage{tikz}
  - \usepackage{pgfplots}
vignette: >
  %\VignetteEngine{knitr::rmarkdown}
  %\VignetteIndexEntry{Definition of full, active and manual risk set}
  %\VignetteEncoding{UTF-8}
---

---

This vignette defines the _full_, _active_ and _manual_ risk sets, explains how a _manual_ risk set is declared in the processing function `remify::remify()`, and shows what the processed risk set looks like in the `remify` object.

---

Consider the `remify` object for the network `randomREHsmall`.

```{r}
library(remify) # loading package
data(randomREHsmall) # data

# processing the edgelist
reh <- remify(edgelist = randomREHsmall$edgelist,
              directed = TRUE, # events are directed
              ordinal = FALSE, # model with waiting times
              model = "tie", # tie-oriented modeling
              actors = randomREHsmall$actors,
              origin = randomREHsmall$origin,
              omit_dyad = NULL)

# summary(reh)
```

---

# {.tabset .tabset-fade .tabset-pills .tabcontent}

## Definition of risk set

A relational event history consists of a time-ordered sequence of (directed or undirected) interactions. For each event, we know:

- its __time__ of occurrence, either as a timestamp/date/continuous value or just as its order in the sequence
- the __actors__ that were involved in the relational event
- the __type__ of the event (if measured)

For instance, the first five events of the `randomREHsmall` sequence are reported as follows

```{r, include = TRUE}
randomREHsmall$edgelist[1:5,]
```

where `time`, `actor1`, `actor2` describe each observed event in the sequence (note that in this example the `type` of events is not annotated).

When modeling a relational event sequence, we have to define a risk set for each time point. The risk set consists of those relational events (dyads) that were at risk of being observed at that time point, and it always contains the event that was actually observed. The definition of the risk set is an important building block of the likelihood function in both the tie-oriented and the actor-oriented modeling frameworks. In the sections of this vignette, we discuss three possible definitions of the risk set: the _full_, _active_ and _manual_ risk sets. Each of these can be processed with `remify::remify()` by passing the risk set type to the input argument `riskset`.

---

### The _full_ risk set

The most common definition of the risk set assumes that all possible dyads are at risk of occurring over the whole observation period. We refer to this definition as the _full_ risk set. If the network has _N_ actors and consists of directed events that can take one of _C_ possible event types, then the risk set contains all the possible directed dyads among the _N_ actors for each type, that is _D = N(N-1)C_ dyads, or _D = N(N-1)C/2_ in the case of undirected dyads.

For instance, in the random network (`randomREHsmall`) the dyads are directed, there are _N = 5_ actors and _C = 1_ event type, therefore we expect the dimension of the risk set to be _D = 5 * 4 * 1 = 20_. The first five dyads in the _full_ risk set are

```{r, include = TRUE}
# method getDyad(), see more in ?remify::getDyad
getDyad(x = reh, dyadID = c(1:5))
```

The ID of the dyads (`dyadID`) corresponds to the order of the dyads used by the functions in `` and it is processed by the function `remify::remify()`. The ID of the dyads is defined by a two-step approach:

1. Actors' and types' names are first sorted according to their alphanumeric order.
Alphanumeric order places the digits 0 to 9 first, followed by the letters in alphabetical order.
For instance, given the vector of names `c("user22","0usr","1user","1deer")`, its alphanumeric order will be `c("0usr","1deer","1user","user22")`.
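
The ordering in the example above can be reproduced with base R. This is only an illustrative sketch: it assumes that a C-locale sort (digits before letters), obtained here with `sort(..., method = "radix")`, matches the alphanumeric order described in this vignette. It is not taken from the internals of `remify::remify()`, and the variable names are hypothetical.

```{r}
# names taken from the example in the text
actor_names <- c("user22", "0usr", "1user", "1deer")

# method = "radix" sorts character vectors in the C locale,
# so digits (0-9) come before letters regardless of the session locale
# (assumption: this mirrors the alphanumeric order described above)
sorted_names <- sort(actor_names, method = "radix")
sorted_names
# "0usr"   "1deer"  "1user"  "user22"
```

Note that a C-locale sort also places uppercase letters before lowercase ones; the text above does not say how mixed-case names are treated, so that case is left out of this sketch.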