Example script analyzing the RFID
validity study data (Elmer et al, 2019) with the DyNAM-i model of the
goldfish
package. Data received from:
Elmer, T., Chaitanya, K., Purwar, P., & Stadtfeld, C. (2019). The validity of RFID badges measuring face-to-face interactions. Behavior research methods, 51(5), 2120-2138.
Analyses inspired from:
Hoffman, M., Block, P., Elmer, T., & Stadtfeld, C. (2020). A model for the dynamics of face-to-face interactions in social groups. Network Science, 8(S1), S4-S25.
First, we load the goldfish
package and load the data. .
You can find out more about this dataset and its format by callings its
documentation.
The participants
object contains the nodeset of actors
interacting together. The available attributes are their age, their
gender, their organizational unit (group), and their seniority
level:
## actor label present age gender group level
## 1 1 Actor1 TRUE 35 0 1 4
## 2 2 Actor2 TRUE 34 0 2 3
## 3 3 Actor3 TRUE 25 1 1 1
## 4 4 Actor4 TRUE 26 0 1 2
## 5 5 Actor5 TRUE 26 0 1 1
## 6 6 Actor6 TRUE 31 1 3 3
The rfid
object contains the list of dyadic interactions
collected via RFID badges. Each interaction is characterized by two
actors (NodeA
and NodeB
) and two time points
(Start
and End
):
## NodeA NodeB Start End
## 1 3 4 1491329893 1491329930
## 2 1 11 1491329894 1491329926
## 3 10 8 1491329894 1491329904
## 4 10 4 1491329895 1491329911
## 5 10 5 1491329895 1491329924
## 6 3 5 1491329902 1491329913
The video
object contains the list of dyadic
interactions collected via video recordings and has the same format as
the rfid
object. We know that these measures should be more
reliable, so we will work with those data in this script!
## NodeA NodeB Start End
## 13 5 8 1491329893 1491329984
## 14 7 1 1491329893 1491329984
## 18 1 11 1491329893 1491329984
## 19 1 6 1491329893 1491329984
## 42 7 6 1491329893 1491330372
## 45 11 6 1491329893 1491330372
Finally, the known.before
matrix indicates which actors
knew each other before the event.
Before specifying the model, we need to set the data in the right
format for DyNAM-i: We had a list of dyadic interactions, but DyNAM-i is
specifically meant for group interactions. More specifically, the model
is designed for events of actors joining or leavin interaction groups.
We can use this function to create these group events (please note we
need to use the right labels in the dataframe: NodeA
,
NodeB
,Start
, and End
):
#?defineGroups_interaction
prepdata <- defineGroups_interaction(video, participants,
seed.randomization = 1)
This functions to creates 5 objects: 1. groups
: a
goldfish nodeset containing the interaction groups (initially there are
as many groups as actors and they are all present, meaning
available)
## label present
## 1 Group1 TRUE
## 2 Group2 TRUE
## 3 Group3 TRUE
## 4 Group4 TRUE
## 5 Group5 TRUE
## 6 Group6 TRUE
dependent.events
: goldfish events specifying the events
that we want to model, when an actor (in the dataframe, the
sender
column) joins or leaves (increment
= 1
or -1) a group (receiver
) at a particular point in time
(time
)## time sender receiver increment
## 1 1491329893 Actor4 Group3 1
## 2 1491329893 Actor2 Group9 1
## 3 1491329893 Actor7 Group6 1
## 4 1491329893 Actor1 Group6 1
## 5 1491329893 Actor11 Group6 1
## 6 1491329893 Actor8 Group5 1
exogenous.events
: goldfish events specifying the events
that need to happen but are not modeled (for example, when an actor
leaves a group, a dependent event is created for this leaving but an
exogenous event is also created because the actor “joins” a new group,
its own isolated group)## time sender receiver increment
## 1 1491329893 Actor4 Group4 -1
## 2 1491329893 Actor2 Group2 -1
## 3 1491329893 Actor7 Group7 -1
## 4 1491329893 Actor1 Group1 -1
## 5 1491329893 Actor11 Group11 -1
## 6 1491329893 Actor8 Group8 -1
interaction.updates
: goldfish events that are used to
update the number of past interactions between participants:## time sender receiver increment
## 1 1491329893 Actor4 Actor3 1
## 2 1491329893 Actor2 Actor9 1
## 3 1491329893 Actor7 Actor6 1
## 4 1491329893 Actor1 Actor6 1
## 5 1491329893 Actor1 Actor7 1
## 6 1491329893 Actor11 Actor1 1
opportunities
: list containing the interaction groups
available at each decision time (this will vary when groups are being
joined or left)## [[1]]
## [1] 1 2 3 4 5 6 7 8 9 10 11
##
## [[2]]
## [1] 1 2 3 5 6 7 8 9 10 11
##
## [[3]]
## [1] 1 9 3 5 6 7 8 10 11
##
## [[4]]
## [1] 1 9 3 5 6 8 10 11
##
## [[5]]
## [1] 6 9 3 5 8 10 11
##
## [[6]]
## [1] 6 9 3 5 8 10
Now that we have all the data we need, we need to define the link between our objects in a way that goldfish understands what is going on.
First, we define the first mode nodeset, the actors:
# goldfish requires character names
participants$label <- as.character(participants$label)
actors <- defineNodes(participants)
Then we define the second mode nodeset, the groups:
Then we create the dynamic interaction network between the actors and
the groups, updated by the previously created events in
dependent.events
and exogenous.events
init.network <- diag(x = 1, nrow(actors), nrow(groups))
# goldfish check that row/column names agree with the nodes data frame labels
dimnames(init.network) <- list(actors$label, groups$label)
network.interactions <- defineNetwork(
matrix = init.network, nodes = actors, nodes2 = groups, directed = TRUE
)
network.interactions <- linkEvents(
x = network.interactions, changeEvent = dependent.events,
nodes = actors, nodes2 = groups
)
network.interactions <- linkEvents(
x = network.interactions, changeEvent = exogenous.events,
nodes = actors, nodes2 = groups
)
Then we create the dynamic network between the actors that records
past interactions, updated by the previously created events in
interaction.updates
network.past <- defineNetwork(nodes = actors, directed = FALSE)
network.past <- linkEvents(
x = network.past, changeEvents = interaction.updates, nodes = actors
) # don't worry about the warnings
Finally, we define the events that we want to model:
dependent.events
:
Let us define some models we could estimate (we try not to put too many effects because we have little data).
We first have a model only with effects related to individual attributes (M1).
For the rate, we have the following effects:
formula.rate.M1 <- dependent.events ~ 1 +
intercept(network.interactions, joining = 1) +
ego(actors$age, joining = 1, subType = "centered") +
ego(actors$age, joining = -1, subType = "centered") +
diff(actors$age, joining = -1, subType = "averaged_sum") +
diff(actors$level, joining = -1, subType = "averaged_sum") +
same(actors$gender, joining = -1, subType = "proportion") +
same(actors$group, joining = -1, subType = "proportion") +
tie(known.before, joining = -1, subType = "proportion")
and for the choice model:
formula.choice.M1 <- dependent.events ~
diff(actors$age, subType = "averaged_sum") +
diff(actors$level, subType = "averaged_sum") +
same(actors$gender, subType = "proportion") +
same(actors$group, subType = "proportion") +
tie(known.before, subType = "proportion")
Note that effects in the choice model can mirror the ones in the leaving model: The first model explains who people want to join (or not join), the second explains who people want to leave (or stay with).
Now let us run goldfish estimation!
##
## Call:
## estimate(x = dependent.events ~ 1 + intercept(network.interactions,
## joining = 1) + ego(actors$age, joining = 1, subType = "centered") +
## ego(actors$age, joining = -1, subType = "centered") + diff(actors$age,
## joining = -1, subType = "averaged_sum") + diff(actors$level,
## joining = -1, subType = "averaged_sum") + same(actors$gender,
## joining = -1, subType = "proportion") + same(actors$group,
## joining = -1, subType = "proportion") + tie(known.before,
## joining = -1, subType = "proportion"), model = "DyNAMi",
## subModel = "rate")
##
##
## Effects details:
## Object joining subType
## Intercept "" "" ""
## intercept "network.interactions" "1" ""
## ego "actors$age" "1" ""centered""
## ego "actors$age" "-1" ""centered""
## diff "actors$age" "-1" ""averaged_sum""
## diff "actors$level" "-1" ""averaged_sum""
## same "actors$gender" "-1" ""proportion""
## same "actors$group" "-1" ""proportion""
## tie "known.before" "-1" ""proportion""
##
## Coefficients:
## Estimate Std. Error z-value Pr(>|z|)
## Intercept -5.942916 0.322237 -18.4427 < 2.2e-16 ***
## intercept 2.509660 0.334344 7.5062 6.084e-14 ***
## ego 0.061687 0.024706 2.4968 0.01253 *
## ego 0.027958 0.035570 0.7860 0.43188
## diff -0.163643 0.069175 -2.3656 0.01800 *
## diff 0.361468 0.226675 1.5947 0.11079
## same 0.027282 0.254374 0.1073 0.91459
## same 0.508497 0.371577 1.3685 0.17116
## tie -0.156303 0.346267 -0.4514 0.65170
## ---
## Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
##
## Converged with max abs. score of 0.00012
## Log-Likelihood: -1306.3
## AIC: 2630.6
## AICc: 2631.4
## BIC: 2661.7
## model: "DyNAMi" subModel: "rate"
est.choice.M1 <- estimate(
formula.choice.M1,
model = "DyNAMi", subModel = "choice",
estimationInit = list(opportunitiesList = opportunities)
)
summary(est.choice.M1)
##
## Call:
## estimate(x = dependent.events ~ diff(actors$age, subType = "averaged_sum") +
## diff(actors$level, subType = "averaged_sum") + same(actors$gender,
## subType = "proportion") + same(actors$group, subType = "proportion") +
## tie(known.before, subType = "proportion"), model = "DyNAMi",
## subModel = "choice", estimationInit = list(opportunitiesList = opportunities))
##
##
## Effects details:
## Object subType
## diff "actors$age" ""averaged_sum""
## diff "actors$level" ""averaged_sum""
## same "actors$gender" ""proportion""
## same "actors$group" ""proportion""
## tie "known.before" ""proportion""
##
## Coefficients:
## Estimate Std. Error z-value Pr(>|z|)
## diff -0.107578 0.067179 -1.6014 0.1092943
## diff 0.262945 0.225568 1.1657 0.2437353
## same 0.292121 0.298317 0.9792 0.3274665
## same 0.024975 0.384103 0.0650 0.9481565
## tie 1.316141 0.378545 3.4768 0.0005074 ***
## ---
## Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
##
## Converged with max abs. score of 0.00026
## Log-Likelihood: -195.16
## AIC: 400.32
## AICc: 400.84
## BIC: 414.39
## model: "DyNAMi" subModel: "choice"
Now let’s add effects related to group sizes and past interactions. We add to the rate model the effects of:
formula.rate.M2 <- dependent.events ~ 1 +
intercept(network.interactions, joining = 1) +
ego(actors$age, joining = 1, subType = "centered") +
ego(actors$age, joining = -1, subType = "centered") +
diff(actors$age, joining = -1, subType = "averaged_sum") +
diff(actors$level, joining = -1, subType = "averaged_sum") +
same(actors$gender, joining = -1, subType = "proportion") +
same(actors$group, joining = -1, subType = "proportion") +
tie(known.before, joining = -1, subType = "proportion") +
size(network.interactions, joining = -1, subType = "identity") +
egopop(network.past, joining = 1, subType = "normalized") +
egopop(network.past, joining = -1, subType = "normalized")
We add to the choice model:
formula.choice.M2 <- dependent.events ~
diff(actors$age, subType = "averaged_sum") +
diff(actors$level, subType = "averaged_sum") +
same(actors$gender, subType = "proportion") +
same(actors$group, subType = "proportion") +
alter(actors$age, subType = "mean") +
tie(known.before, subType = "proportion") +
size(network.interactions, subType = "identity") +
alterpop(network.past, subType = "mean_normalized") +
inertia(network.past, window = 60, subType = "mean") +
inertia(network.past, window = 300, subType = "mean")
One can note that we have normalized (using the option
subType="mean"
) the effects related to previous
interactions, because we want to avoid time heterogeneity issues (in the
beginning no interaction has happened, at the end, a lot happened).
All effects can have different subTypes, please look at the goldfish documentation to learn more.
Now we can run again
##
## Call:
## estimate(x = dependent.events ~ 1 + intercept(network.interactions,
## joining = 1) + ego(actors$age, joining = 1, subType = "centered") +
## ego(actors$age, joining = -1, subType = "centered") + diff(actors$age,
## joining = -1, subType = "averaged_sum") + diff(actors$level,
## joining = -1, subType = "averaged_sum") + same(actors$gender,
## joining = -1, subType = "proportion") + same(actors$group,
## joining = -1, subType = "proportion") + tie(known.before,
## joining = -1, subType = "proportion") + size(network.interactions,
## joining = -1, subType = "identity") + egopop(network.past,
## joining = 1, subType = "normalized") + egopop(network.past,
## joining = -1, subType = "normalized"), model = "DyNAMi",
## subModel = "rate")
##
##
## Effects details:
## Object joining subType
## Intercept "" "" ""
## intercept "network.interactions" "1" ""
## ego "actors$age" "1" ""centered""
## ego "actors$age" "-1" ""centered""
## diff "actors$age" "-1" ""averaged_sum""
## diff "actors$level" "-1" ""averaged_sum""
## same "actors$gender" "-1" ""proportion""
## same "actors$group" "-1" ""proportion""
## tie "known.before" "-1" ""proportion""
## size "network.interactions" "-1" ""identity""
## egopop "network.past" "1" ""normalized""
## egopop "network.past" "-1" ""normalized""
##
## Coefficients:
## Estimate Std. Error z-value Pr(>|z|)
## Intercept -6.376865 0.396152 -16.0970 < 2.2e-16 ***
## intercept 2.942756 0.406082 7.2467 4.27e-13 ***
## ego 0.067331 0.026053 2.5844 0.009755 **
## ego 0.012435 0.036750 0.3384 0.735094
## diff -0.142109 0.076203 -1.8649 0.062200 .
## diff 0.248709 0.240336 1.0348 0.300744
## same 0.051020 0.263799 0.1934 0.846641
## same 0.172918 0.409717 0.4220 0.672994
## tie 0.145864 0.379845 0.3840 0.700971
## size 0.133780 0.073838 1.8118 0.070015 .
## egopop 0.068527 0.105430 0.6500 0.515708
## egopop 0.177403 0.131982 1.3441 0.178900
## ---
## Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
##
## Converged with max abs. score of 0.00053
## Log-Likelihood: -1303.5
## AIC: 2631
## AICc: 2632.4
## BIC: 2672.4
## model: "DyNAMi" subModel: "rate"
and:
est.choice.M2 <- estimate(
formula.choice.M2,
model = "DyNAMi", subModel = "choice",
estimationInit = list(opportunitiesList = opportunities)
)
summary(est.choice.M2)
##
## Call:
## estimate(x = dependent.events ~ diff(actors$age, subType = "averaged_sum") +
## diff(actors$level, subType = "averaged_sum") + same(actors$gender,
## subType = "proportion") + same(actors$group, subType = "proportion") +
## alter(actors$age, subType = "mean") + tie(known.before, subType = "proportion") +
## size(network.interactions, subType = "identity") + alterpop(network.past,
## subType = "mean_normalized") + inertia(network.past, window = 60,
## subType = "mean") + inertia(network.past, window = 300, subType = "mean"),
## model = "DyNAMi", subModel = "choice", estimationInit = list(opportunitiesList = opportunities))
##
##
## Effects details:
## Object window subType
## diff "actors$age" "" ""averaged_sum""
## diff "actors$level" "" ""averaged_sum""
## same "actors$gender" "" ""proportion""
## same "actors$group" "" ""proportion""
## alter "actors$age" "" ""mean""
## tie "known.before" "" ""proportion""
## size "network.interactions" "" ""identity""
## alterpop "network.past" "" ""mean_normalized""
## inertia "network.past" "60" ""mean""
## inertia "network.past" "300" ""mean""
##
## Coefficients:
## Estimate Std. Error z-value Pr(>|z|)
## diff -0.133077 0.073387 -1.8134 0.069776 .
## diff 0.095857 0.231054 0.4149 0.678238
## same 0.141033 0.321444 0.4387 0.660843
## same -0.284446 0.418710 -0.6793 0.496924
## alter 0.033538 0.017678 1.8971 0.057809 .
## tie 1.274992 0.393714 3.2384 0.001202 **
## size 0.097754 0.093141 1.0495 0.293934
## alterpop 0.288526 0.170532 1.6919 0.090662 .
## inertia 0.211654 0.458129 0.4620 0.644084
## inertia -0.096613 0.235330 -0.4105 0.681410
## ---
## Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
##
## Converged with max abs. score of 2e-05
## Log-Likelihood: -189.76
## AIC: 399.53
## AICc: 401.49
## BIC: 427.65
## model: "DyNAMi" subModel: "choice"
Of course these models ask a bit much from our data, so ideally we would a bit more reasonable with the number of effects we include, or do some model selection.
We now have for each rate model a general intercept and a dummy interaction between the intercept and the joining events. If we want to report the actual intercept for joining, we can calculate the following (for M2 for example):
cov.matrix <- vcov(est.rate.M2)
est.interceptjoining <- coef(est.rate.M2)[1] + coef(est.rate.M2)[2]
se.interceptjoining <- sqrt(
cov.matrix[1, 1] + cov.matrix[2, 2] + 2 * cov.matrix[1, 2]
)
t.interceptjoining <- est.interceptjoining / se.interceptjoining
sprintf(
"Intercept for joining: %.3f (SE = %.3f, t = %.3f)",
est.interceptjoining, se.interceptjoining, t.interceptjoining
)
## [1] "Intercept for joining: -3.434 (SE = 0.089, t = -38.477)"
This script can be continued by looking at better model specifications than the ones provided, or trying to model the interaction data collected from RFID badges rather than video recordings, and maybe compare both.