Title: | Decision Trees with Structural Equation Models Fit in 'Mplus' |
---|---|
Description: | Uses recursive partitioning to create homogeneous subgroups based on structural equation models fit in 'Mplus', a stand-alone program developed by Muthen and Muthen. |
Authors: | Sarfaraz Serang [aut,cre], Ross Jacobucci [aut,cre], Kevin J. Grimm [ctb], Gabriela Stegmann [ctb], Andreas M. Brandmaier [ctb] |
Maintainer: | Sarfaraz Serang <[email protected]> |
License: | GPL |
Version: | 0.2.2 |
Built: | 2024-12-15 07:39:17 UTC |
Source: | CRAN |
Uses Mplus Trees to match on structural equation model parameters in a matching subsample, then estimates conditional average treatment effects (CATEs) in a holdout estimation subsample.
    causalmpt(
      script,
      data,
      rPartFormula,
      group = ~id,
      treat,
      outcome,
      est.samp = 0.2,
      ...
    )
script | An MplusAutomation script |
---|---|
data | Dataset that is specified in the script |
rPartFormula | Formula of the form ~ variable names |
group | id variable. If not specified, an id variable is created for each row |
treat | Treatment variable |
outcome | Univariate outcome of interest (dependent variable in mean comparison tests) |
est.samp | Proportion of sample to be used as holdout sample (estimation subsample) |
... | Other arguments to MplusTrees() |
See the documentation for MplusTrees() for further information on the tree-building process.
Takes the terminal nodes from the Mplus Tree and considers them "matched". Splits the estimation subsample into groups defined by the covariate pattern in the terminal nodes of the Mplus Tree. Performs t tests in each group, with treat as the independent variable and outcome as the dependent variable, to estimate CATEs. Also performs an ANOVA to determine whether the treatment effect differs by group (interaction).
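The per-group t tests and the interaction ANOVA described above can be sketched in base R. This is only an illustration under stated assumptions, not the internals of causalmpt(): the built-in ToothGrowth data stands in for an estimation subsample, dose plays the role of terminal-node membership, supp is the treatment, and len is the outcome. Whether causalmpt() uses Welch or pooled-variance t tests is not stated here; t.test() defaults to Welch.

```r
# Illustration only: ToothGrowth stands in for an estimation subsample.
# 'dose' plays the role of terminal-node membership (an assumption),
# 'supp' is the treatment and 'len' the outcome. No Mplus call is made.
dat <- ToothGrowth
dat$node <- factor(dat$dose)

# One t test per node; the difference in group means is that node's CATE estimate
tests <- lapply(split(dat, dat$node), function(d) t.test(len ~ supp, data = d))
cates <- sapply(tests, function(tt) unname(tt$estimate[1] - tt$estimate[2]))

# Treatment-by-node interaction: does the treatment effect differ across nodes?
interaction_fit <- aov(len ~ supp * node, data = dat)
summary(interaction_fit)
```

The supp:node row of the ANOVA table is the test of whether the effect varies by group.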
An object of class 'causalmpt'. The tree structure is drawn from MplusTrees(). CATEs are estimated in the estimation (holdout) subsample. The object provides the results of the t tests that estimate CATEs in each group and of the ANOVA that examines group differences in treatment effect.
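To make the role of est.samp concrete, here is a minimal sketch of a matching/estimation split in base R. How causalmpt() actually draws the holdout (for example, whether it samples rows or ids) is not documented here, so the row-level random draw below is an assumption, and ToothGrowth is used only as a stand-in dataset.

```r
# Assumed row-level random split; causalmpt() may do this differently
set.seed(1)                       # arbitrary seed, for reproducibility only
est.samp <- 0.2                   # proportion held out for estimation
n <- nrow(ToothGrowth)

holdout_idx <- sample(n, size = round(est.samp * n))
estimation  <- ToothGrowth[holdout_idx, ]   # used to estimate CATEs
matching    <- ToothGrowth[-holdout_idx, ]  # used to grow the tree
```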
Sarfaraz Serang
Serang, S., & Sears, J. (2021). Tree-based matching on structural equation model parameters. Behavioral Data Science, 1, 31-53.
    ## Not run:
    library(MplusTrees)  # attach the package documented here
    library(lavaan)      # supplies the HolzingerSwineford1939 data

    # One-factor model written as an MplusAutomation object
    script <- mplusObject(
      TITLE = "Causal Mplus Trees Example",
      MODEL = "f1 BY x1-x3;",
      usevariables = c('x1', 'x2', 'x3'),
      rdata = HolzingerSwineford1939)

    # Match on school and grade, then estimate the effect of sex on x4
    fit.cmpt <- causalmpt(script, HolzingerSwineford1939, group = ~id,
      rPartFormula = ~school + grade,
      control = rpart.control(minsplit = 100, minbucket = 100, cp = .01),
      treat = "sex", outcome = "x4")
    fit.cmpt
    ## End(Not run)
Generates recursive partitioning trees using Mplus models. MplusTrees() takes an Mplus model written in the form of an MplusAutomation script, uses MplusAutomation to fit the model in Mplus, and performs recursive partitioning using rpart.
    MplusTrees(
      script,
      data,
      rPartFormula,
      catvars = NULL,
      group = ~id,
      control = rpart.control(),
      se = F,
      psplit = F,
      palpha = 0.05,
      cv = F,
      k = 5
    )
script | An MplusAutomation script |
---|---|
data | Dataset that is specified in the script |
rPartFormula | Formula of the form ~ variable names |
catvars | Vector of names of categorical covariates |
group | id variable. If not specified, an id variable is created for each row |
control | Control object for rpart |
se | Whether to print standard errors and p values. In general should be set to FALSE |
psplit | Whether to use likelihood ratio p values as a splitting criterion |
palpha | Type I error rate (alpha level) for rejecting with likelihood ratio test when psplit is set to TRUE |
cv | Performs k-fold cross-validation to select value of cp |
k | Number of folds for cross-validation |
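The control and rPartFormula arguments are ordinary R objects and can be built and inspected before any model is fit. A minimal sketch, assuming the rpart package is installed; the covariate names are borrowed from the HolzingerSwineford1939 example and are otherwise arbitrary.

```r
library(rpart)

# Stopping rules handed to rpart: minimum node sizes and the cp threshold
ctrl <- rpart.control(minsplit = 100, minbucket = 100, cp = 0.01)

# One-sided formula naming the covariates that may be used for splitting
split_formula <- ~ sex + school + grade
split_vars <- all.vars(split_formula)
```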
The function temporarily changes the working directory to the temporary directory. Files used and generated by Mplus are stored there and can be accessed using tempdir().
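The pattern of temporarily switching to the temporary directory and then restoring the previous one can be sketched as follows. The helper run_in_tempdir() is hypothetical and is not part of MplusTrees; the .inp and .out extensions are the usual Mplus input and output extensions, but the exact file names the package writes are an assumption.

```r
# Hypothetical helper: run a function with tempdir() as the working directory
run_in_tempdir <- function(f) {
  old <- setwd(tempdir())          # setwd() returns the previous directory
  on.exit(setwd(old), add = TRUE)  # restore it even if f() fails
  f()
}

before <- getwd()
inside <- run_in_tempdir(function() getwd())

# After a fit, Mplus input/output files could be listed like this
mplus_files <- list.files(tempdir(), pattern = "\\.(inp|out)$")
```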
By default, MplusTrees() only splits on the criteria specified in the control argument, the most important of which is the cp parameter. The user can also split on the p value generated from the likelihood ratio test comparing the parent node to a multiple group model consisting of 2 groups (the daughter nodes). This p value criterion is used in addition to the cp criterion, in that both must be met for a split to be made. The psplit argument turns this option on, and palpha sets the alpha level criterion for rejection.
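The likelihood ratio criterion can be illustrated without Mplus. The sketch below is an assumption-laden stand-in: an intercept-only normal model fit with lm() replaces the structural equation model, ToothGrowth replaces the user's data, and a split on supp replaces a candidate split. The relative-improvement check is a simplified reading of the cp rule at the root node, not the package's own computation.

```r
# -2 log-likelihood of an intercept-only normal model (stand-in for the SEM)
neg2ll <- function(d) -2 * as.numeric(logLik(lm(len ~ 1, data = d)))

parent    <- neg2ll(ToothGrowth)
# A two-group model with all parameters free equals two separate fits
daughters <- sum(sapply(split(ToothGrowth, ToothGrowth$supp), neg2ll))

lr_stat <- parent - daughters
df      <- 2   # (mean + variance) x 2 groups, minus 2 parameters in the parent
p_value <- pchisq(lr_stat, df = df, lower.tail = FALSE)

palpha <- 0.05
cp     <- 0.01
rel_improvement <- lr_stat / parent   # assumed form of the cp check at the root

# With psplit on, both criteria must hold for the split to be made
split_allowed <- (p_value < palpha) && (rel_improvement > cp)
```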
Cross-validation (CV) can also be used to choose the cp parameter. If this option is used, any user-specified cp value will be overridden by the optimal cp value chosen by CV. CV fits the model to the training set and calculates an expected minus 2 log-likelihood (-2LL) for each terminal node. In the test set, individuals are assigned to terminal nodes based on the tree structure found in the training set. Their "expected" values are the -2LL values from the respective training set terminal nodes. The "observed" values are the -2LL values from fitting a multiple group model, with each terminal node as a group. The cp value chosen is the one that produces the smallest mean squared error (MSE) between the expected and observed values.
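The final selection step, picking the cp with the smallest MSE between expected and observed -2LL values, can be sketched as below. The helper pick_cp() is hypothetical and not part of MplusTrees, and the numbers are made-up placeholders that only exercise the helper; they are not results from any model.

```r
# Hypothetical helper: return the cp value with the smallest mean squared error
pick_cp <- function(cv) {
  sq_err <- (cv$observed - cv$expected)^2
  mse <- tapply(sq_err, cv$cp, mean)       # one MSE per candidate cp
  as.numeric(names(mse)[which.min(mse)])
}

# Made-up placeholder values (one row per terminal node and candidate cp)
cv <- data.frame(
  cp       = c(0.01, 0.01, 0.05, 0.05),
  expected = c(100, 120, 230, 230),
  observed = c(104, 118, 236, 222)
)
best_cp <- pick_cp(cv)
```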
CV should only be used when (1) the Mplus model can be fit relatively quickly, (2) there are only a few covariates with a few response options, and (3) the sample size is large enough that the user is confident the model can be fit without issue in a sample of size N/k and a tree that partitions this sample further. If these conditions are not met, the process could take prohibitively long to arrive at a solution. Note that if even a single model fails to produce a valid log-likelihood value, the function will terminate with an error.
An object of class 'mplustree'. 'rpart_out' provides the tree structure, 'terminal' gives a vector of terminal nodes, 'where' shows the terminal node of each id, and 'estimates' gives the parameter estimates for each terminal node.
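For readers unfamiliar with how rpart records terminal nodes, the sketch below shows the analogous pieces on a plain rpart tree fit to the kyphosis data that ships with rpart. It does not use an 'mplustree' object, and whether the 'where' element of an 'mplustree' stores frame row indices or node numbers is not stated here, so treat the mapping as an assumption.

```r
library(rpart)

# Ordinary classification tree, used only to show rpart's bookkeeping
tree <- rpart(Kyphosis ~ Age + Number + Start, data = kyphosis)

# Node numbers of the terminal nodes (leaves) in the fitted tree
terminal <- as.integer(rownames(tree$frame)[tree$frame$var == "<leaf>"])

# tree$where holds, per observation, the row of tree$frame for its leaf
where <- tree$where
node_of_obs <- as.integer(rownames(tree$frame)[where])
table(node_of_obs)
```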
Ross Jacobucci and Sarfaraz Serang
Serang, S., Jacobucci, R., Stegmann, G., Brandmaier, A. M., Culianos, D., & Grimm, K. J. (2021). Mplus Trees: Structural equation model trees using Mplus. Structural Equation Modeling, 28, 127-137.
    ## Not run:
    library(MplusTrees)  # attach the package documented here
    library(lavaan)      # supplies the HolzingerSwineford1939 data

    # Three-factor model written as an MplusAutomation object
    script <- mplusObject(
      TITLE = "Example #1 - Factor Model;",
      MODEL = "f1 BY x1-x3; f2 BY x4-x6; f3 BY x7-x9;",
      usevariables = c('x1', 'x2', 'x3', 'x4', 'x5', 'x6', 'x7', 'x8', 'x9'),
      rdata = HolzingerSwineford1939)

    # Grow the tree, splitting on sex, school, and grade
    fit <- MplusTrees(script, HolzingerSwineford1939, group = ~id,
      rPartFormula = ~sex + school + grade,
      control = rpart.control(minsplit = 100, minbucket = 100, cp = .01))
    fit
    ## End(Not run)
Wrapper using the rpart.plot package to plot the tree structure of a fitted Mplus Tree.
    ## S3 method for class 'mplustree'
    plot(x, ...)
x | An object of class "mplustree" (a fitted Mplus Tree) |
---|---|
... | Other arguments passed to rpart.plot |
By default, each node of the plot contains the -2 log-likelihood (deviance), the number of individuals in the node, and the percentage of the total sample in the node.
Sarfaraz Serang, relying heavily on the rpart.plot package by Stephen Milborrow.
    ## Not run:
    library(MplusTrees)  # attach the package documented here
    library(lavaan)      # supplies the HolzingerSwineford1939 data

    # Three-factor model written as an MplusAutomation object
    script <- mplusObject(
      TITLE = "Example #1 - Factor Model;",
      MODEL = "f1 BY x1-x3; f2 BY x4-x6; f3 BY x7-x9;",
      usevariables = c('x1', 'x2', 'x3', 'x4', 'x5', 'x6', 'x7', 'x8', 'x9'),
      rdata = HolzingerSwineford1939)

    fit <- MplusTrees(script, HolzingerSwineford1939, group = ~id,
      rPartFormula = ~sex + school + grade,
      control = rpart.control(cp = .01))
    fit

    # Draw the fitted tree
    plot(fit)
    ## End(Not run)
summary method for class "mplustree".
    ## S3 method for class 'mplustree'
    summary(object, ...)
object | An object of class "mplustree" (a fitted Mplus Tree) |
---|---|
... | Other arguments passed to or from other methods |
Prints the tree structure given in object
    ## Not run:
    library(MplusTrees)  # attach the package documented here
    library(lavaan)      # supplies the HolzingerSwineford1939 data

    # Three-factor model written as an MplusAutomation object
    script <- mplusObject(
      TITLE = "Example #1 - Factor Model;",
      MODEL = "f1 BY x1-x3; f2 BY x4-x6; f3 BY x7-x9;",
      usevariables = c('x1', 'x2', 'x3', 'x4', 'x5', 'x6', 'x7', 'x8', 'x9'),
      rdata = HolzingerSwineford1939)

    fit <- MplusTrees(script, HolzingerSwineford1939, group = ~id,
      rPartFormula = ~sex + school + grade,
      control = rpart.control(cp = .01))

    # Print the tree structure
    summary(fit)
    ## End(Not run)