| Title: | Multivariate Generalized Linear Mixed Models for Ranking Sports Teams |
|---|---|
| Description: | Maximum likelihood estimates are obtained via an EM algorithm with either a first-order or a fully exponential Laplace approximation as documented by Broatch and Karl (2018) <doi:10.48550/arXiv.1710.05284>, Karl, Yang, and Lohr (2014) <doi:10.1016/j.csda.2013.11.019>, and by Karl (2012) <doi:10.1515/1559-0410.1471>. Karl and Zimmerman (2021) <doi:10.1016/j.jspi.2020.06.004> use this package to illustrate how the home field effect estimator from a mixed model can be biased under nonrandom scheduling. |
| Authors: | Andrew T. Karl [cre, aut] (ORCID: <https://orcid.org/0000-0002-5933-8706>), Jennifer Broatch [aut] |
| Maintainer: | Andrew T. Karl <[email protected]> |
| License: | GPL-2 |
| Version: | 1.2-5 |
| Built: | 2026-06-09 14:00:03 UTC |
| Source: | https://github.com/cran/mvglmmRank |
The package fits multivariate generalized linear mixed models for team scores, win/loss indicators, and margin-of-victory responses. Maximum likelihood estimates are obtained by an EM algorithm using either a first-order or fully exponential Laplace approximation.
See mvglmmRank for the fitting interface and
game.pred for printed game predictions from fitted models.
Broatch, J.E. and Karl, A.T. (2018). Multivariate Generalized Linear Mixed Models for Joint Estimation of Sporting Outcomes. Italian Journal of Applied Statistics, 30(2), 189-211. Also available from https://arxiv.org/abs/1710.05284.
Karl, A.T. and Zimmerman, D.L. (2021). A Diagnostic for Bias in Linear Mixed Model Estimators Induced by Dependence Between the Random Effects and the Corresponding Model Matrix. Journal of Statistical Planning and Inference, 211, 107-118. doi:10.1016/j.jspi.2020.06.004.
Karl, A.T., Yang, Y. and Lohr, S. (2014). Computation of Maximum Likelihood Estimates for Multiresponse Generalized Linear Mixed Models with Non-nested, Correlated Random Effects. Computational Statistics & Data Analysis, 73, 146-162. doi:10.1016/j.csda.2013.11.019.
Karl, A.T. (2012). The Sensitivity of College Football Rankings to Several Modeling Choices. Journal of Quantitative Analysis in Sports, 8(3). doi:10.1515/1559-0410.1471.
An internal function.
binary_cre(Z_mat = Z_mat, first.order = first.order, home.field, control = control)
Z_mat |
data frame. |
first.order |
logical |
home.field |
logical |
control |
list |
2008 FBS College Football Regular Season Data
data(f2008)
A data frame with 772 observations on the following 9 variables.
home: a factor
Game.Date: a POSIXlt date variable
away: a factor
home.response: a numeric vector
home.score: a numeric vector
away.response: a numeric vector
away.score: a numeric vector
neutral.site: a numeric vector
partition: a numeric vector
http://web1.ncaa.org/mfb/download.jsp?year=2008&div=IA
data(f2008) ## maybe str(f2008) ; plot(f2008) ...
2009 FBS College Football Regular Season Data
data(f2009)
A data frame with 772 observations on the following 9 variables.
home: a factor
Game.Date: a POSIXlt date variable
away: a factor
home.response: a numeric vector
home.score: a numeric vector
away.response: a numeric vector
away.score: a numeric vector
neutral.site: a numeric vector
partition: a numeric vector
http://web1.ncaa.org/mfb/download.jsp?year=2009&div=IA
data(f2009) ## maybe str(f2009) ; plot(f2009) ...
2010 FBS College Football Regular Season Data
data(f2010)
A data frame with 770 observations on the following 9 variables.
home: a factor
Game.Date: a POSIXlt date variable
away: a factor
home.response: a numeric vector
home.score: a numeric vector
away.response: a numeric vector
away.score: a numeric vector
neutral.site: a numeric vector
partition: a numeric vector
http://web1.ncaa.org/mfb/download.jsp?year=2010&div=IA
data(f2010) ## maybe str(f2010) ; plot(f2010) ...
2011 FBS College Football Regular Season Data
data(f2011)
A data frame with 781 observations on the following 9 variables.
home: a factor
Game.Date: a POSIXlt date variable
away: a factor
home.response: a numeric vector
home.score: a numeric vector
away.response: a numeric vector
away.score: a numeric vector
neutral.site: a numeric vector
partition: a numeric vector
http://web1.ncaa.org/mfb/download.jsp?year=2011&div=IA
data(f2011) ## maybe str(f2011) ; plot(f2011) ...
2012 FBS College Football Regular Season Data
data(f2012)
A data frame with 809 observations on the following 9 variables.
home: a factor
Game.Date: a POSIXlt date variable
away: a factor
home.response: a numeric vector
home.score: a numeric vector
away.response: a numeric vector
away.score: a numeric vector
neutral.site: a numeric vector
partition: a numeric vector
http://web1.ncaa.org/mfb/download.jsp?year=2012&div=IA
data(f2012) ## maybe str(f2012) ; plot(f2012) ...
Uses a fitted mvglmmRank object to print predicted scores,
win probability, and/or margin of victory for a specified matchup.
game.pred(res, home, away, neutral.site = FALSE)
res |
An object of class |
home |
Character string naming the home team. The name should match a team name in the fitted object. |
away |
Character string naming the away team. The name should match a team name in the fitted object. |
neutral.site |
Logical. If |
Neutral-site predictions require the training data supplied to
mvglmmRank to contain neutral.site = 1 games. If a
fitted score model has no neutral-site mean, neutral-site score predictions
may be unavailable.
Prints predictions and returns NULL invisibly.
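Because game.pred prints its predictions and returns NULL, the text has to be captured if it is needed later in a script. The following is a minimal sketch, not part of the package: capture_printed is a hypothetical helper built on utils::capture.output, and it assumes the predictions are written to standard output rather than emitted as messages.

```r
# Hypothetical helper (not part of mvglmmRank): run a function that prints
# its result and keep both the printed lines and the returned value.
capture_printed <- function(f, ...) {
  # capture.output() evaluates the assignment in this function's frame,
  # so `value` is available after the call.
  text <- utils::capture.output(value <- f(...))
  list(text = text, value = value)
}

# Toy stand-in that behaves like a print-only function.
toy_pred <- function(x) {
  cat("score:", x, "\n")
  invisible(NULL)
}
res <- capture_printed(toy_pred, 3)
res$text

# Assumed usage with a fitted model (requires the package):
# res <- capture_printed(mvglmmRank::game.pred, fit,
#                        home = "Denver Broncos", away = "Green Bay Packers")
```

The returned value element is expected to be NULL for game.pred, so only the text element carries information.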
Broatch, J.E. and Karl, A.T. (2018). Multivariate Generalized Linear Mixed Models for Joint Estimation of Sporting Outcomes. Italian Journal of Applied Statistics, 30(2), 189-211. Also available from https://arxiv.org/abs/1710.05284.
Karl, A.T., Yang, Y. and Lohr, S. (2014). Computation of Maximum Likelihood Estimates for Multiresponse Generalized Linear Mixed Models with Non-nested, Correlated Random Effects. Computational Statistics & Data Analysis, 73, 146-162. doi:10.1016/j.csda.2013.11.019.
Karl, A.T. (2012). The Sensitivity of College Football Rankings to Several Modeling Choices. Journal of Quantitative Analysis in Sports, 8(3). doi:10.1515/1559-0410.1471.
data(nfl2012)
fit <- mvglmmRank(nfl2012, method = "PB0", first.order = TRUE, max.iter.EM = 1, verbose = FALSE)
game.pred(fit, home = "Denver Broncos", away = "Green Bay Packers")
Fits one of several generalized linear mixed models for team scores, win/loss indicators, or margin of victory. The fitted random effects are used as team ratings.
mvglmmRank(
  game.data,
  method = "PB0",
  first.order = FALSE,
  home.field = TRUE,
  max.iter.EM = 1000,
  tol1 = 1e-04,
  tol2 = 1e-04,
  tolFE = 0,
  tol.n = 1e-07,
  verbose = TRUE,
  OT.flag = FALSE,
  Hessian = FALSE,
  REML.N = TRUE
)
game.data |
A data frame with columns |
method |
Character string naming the model to fit. Choices are
|
first.order |
Logical. If |
home.field |
Logical. If |
max.iter.EM |
Maximum number of EM iterations. |
tol1 |
Convergence tolerance for the first-order Laplace approximation, based on the maximum relative parameter change. |
tol2 |
Convergence tolerance for the fully exponential Laplace
approximation. Not used when |
tolFE |
Intermediate convergence tolerance for the fully exponential approximation. Corrections to the random-effects covariance matrix begin after this tolerance is reached. |
tol.n |
Convergence tolerance for the normal models. Convergence is
declared when |
verbose |
Logical. If |
OT.flag |
Logical. If |
Hessian |
Logical. If |
REML.N |
Logical. If |
The available methods are:
"B": Binary/probit model for home win/loss indicators.
"P0": Poisson score model without a game-level random effect.
"P1": Poisson score model with a game-level random effect.
"N": Normal score model with an unstructured within-game error covariance matrix.
"NB": Joint normal score and binary/probit win/loss model.
"PB0": Joint Poisson score and binary/probit win/loss model without a game-level random effect.
"PB1": Joint Poisson score and binary/probit win/loss model with a game-level random effect.
"NB.mov": Joint normal margin-of-victory and binary/probit win/loss model.
"N.mov": Normal margin-of-victory model.
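The method codes above can be summarised in a small lookup table, which may help when looping over models or validating user input. This is only a restatement of the list in base R; the object name methods_tbl and its column names are illustrative and do not exist in the package.

```r
# Illustrative lookup of the documented method codes (not a package object).
methods_tbl <- data.frame(
  method      = c("B", "P0", "P1", "N", "NB", "PB0", "PB1", "NB.mov", "N.mov"),
  score_model = c(NA, "Poisson", "Poisson", "normal", "normal",
                  "Poisson", "Poisson", "normal MOV", "normal MOV"),
  # TRUE when the model includes a binary/probit win/loss component
  binary      = c(TRUE, FALSE, FALSE, FALSE, TRUE, TRUE, TRUE, TRUE, FALSE),
  # Game-level random effect; NA where the list above does not mention one
  game_effect = c(NA, FALSE, TRUE, NA, NA, FALSE, TRUE, NA, NA),
  stringsAsFactors = FALSE
)

# Methods that jointly model a score-type response and win/loss:
joint_methods <- methods_tbl$method[methods_tbl$binary & !is.na(methods_tbl$score_model)]
joint_methods
```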
Neutral-site games are represented in game.data$neutral.site. Use
1 for neutral-site games and 0 otherwise. For neutral-site
games, the teams may be assigned to the home and away columns
arbitrarily. With home.field = TRUE, score models estimate a
neutral-site mean score when neutral-site games are present. With
home.field = FALSE, the home/away and neutral-site mean structure is
suppressed.
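As a sketch of the layout described above, the following builds a toy game.data frame in which one game is flagged as neutral-site. The team names and scores are invented, the column set mirrors the bundled data sets, and the response columns are assumed to carry the scores. The frame only illustrates the coding and is far too small to fit.

```r
# Toy schedule; teams and scores are invented for illustration only.
game.data <- data.frame(
  home         = c("A", "B", "C", "A"),
  away         = c("B", "C", "A", "C"),
  home.score   = c(24, 17, 31, 20),
  away.score   = c(21, 20, 28, 13),
  neutral.site = c(0, 0, 0, 1),  # 1 marks the neutral-site game
  stringsAsFactors = TRUE
)

# Assumption: the response columns hold the scores.
game.data$home.response <- game.data$home.score
game.data$away.response <- game.data$away.score

# The indicator should contain only 0 and 1.
stopifnot(all(game.data$neutral.site %in% c(0, 1)))
table(game.data$neutral.site)
```

In the last row, teams A and C could be swapped between the home and away columns without changing the meaning, since the game is neutral-site.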
Setting first.order = TRUE yields the first-order Laplace
approximation. A partial fully exponential Laplace approximation can be
obtained by setting tol1 > tol2 and tolFE = 0. This applies
fully exponential corrections to the vector of team ratings, but not to the
covariance matrix of this vector. Karl, Yang, and Lohr (2014) show that this
approach produces a large portion of the benefit of the fully exponential
Laplace approximation in only a fraction of the time.
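The partial fully exponential setting described above might be requested as in the sketch below. The wrapper fit_partial_fe is hypothetical, and the tolerance values are illustrative choices that merely satisfy tol1 > tol2; they are not recommendations from the package authors.

```r
# Hypothetical wrapper (not part of mvglmmRank) for the partial fully
# exponential Laplace approximation: tol1 > tol2 with tolFE = 0.
fit_partial_fe <- function(game.data, method = "PB0",
                           tol1 = 1e-3, tol2 = 1e-4) {
  stopifnot(tol1 > tol2)  # the condition stated in the text above
  mvglmmRank::mvglmmRank(
    game.data,
    method = method,
    first.order = FALSE,  # do not stop at the first-order approximation
    tol1 = tol1,
    tol2 = tol2,
    tolFE = 0,
    verbose = FALSE
  )
}

# Assumed usage (requires the package and may take some time):
# data(nfl2012, package = "mvglmmRank")
# fit <- fit_partial_fe(nfl2012)
```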
The "PB1" method is the least scalable, as its memory and
computational requirements are at least quadratic in the number of teams
plus the number of games.
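To get a rough feel for what quadratic growth means here, the sketch below computes the size of a single dense double-precision matrix whose dimension is the number of teams plus the number of games. This is only a back-of-the-envelope assumption for intuition; the package's actual storage may differ, and dense_matrix_bytes is a hypothetical helper.

```r
# Hypothetical helper: bytes needed for one dense (teams + games)-square
# matrix of doubles (8 bytes per entry).
dense_matrix_bytes <- function(n_teams, n_games) {
  d <- n_teams + n_games
  8 * d^2
}

# An NFL-sized season (32 teams, 256 games), reported in MiB.
dense_matrix_bytes(32, 256) / 2^20

# Doubling the number of games roughly quadruples the requirement
# once games dominate the dimension.
dense_matrix_bytes(32, 512) / dense_matrix_bytes(32, 256)
```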
An object of class "mvglmmRank". The object is a list whose
components depend on method and may include:
n.ratings.offense, n.ratings.defense: Normal-model offensive and defensive ratings, or NULL.
p.ratings.offense, p.ratings.defense: Poisson-model offensive and defensive ratings, or NULL.
b.ratings: Binary/probit win-propensity ratings, or NULL.
n.ratings.mov: Normal margin-of-victory ratings, or NULL.
n.mean, p.mean, b.mean: Estimated fixed-effect means or home-field effects for the fitted model components.
G, G.cor: Random-effects covariance and correlation matrices.
R, R.cor: Normal-model error covariance and correlation matrices, or NULL.
home.field: Logical indicating whether a home-field effect was modeled.
Hessian: Numerical Hessian if requested, otherwise NULL.
parameters: Vector of fitted model parameters.
actual, pred, sresid: Observed values, fitted values, and scaled residuals where available.
N.output: Additional normal-model matrices and covariance output for method = "N" and method = "N.mov".
fixed.effect.model.output: Additional fixed-effect margin-of-victory output for method = "N.mov".
method: The model method supplied by the user.
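Since several rating components can be NULL depending on method, code that consumes a fitted object may want to guard against missing components. The sketch below assumes each ratings component is a numeric vector named by team, which is not confirmed by the text above; rank_teams is a hypothetical helper.

```r
# Hypothetical helper: return the n highest-rated teams, or NULL when the
# component is absent for the fitted method.
rank_teams <- function(ratings, n = 5) {
  if (is.null(ratings)) return(NULL)
  head(sort(ratings, decreasing = TRUE), n)
}

# Toy ratings standing in for a component such as fit$b.ratings.
toy_ratings <- c(Alpha = 0.4, Bravo = -0.1, Charlie = 0.9)
rank_teams(toy_ratings, n = 2)

# Assumed usage with a fitted object:
# rank_teams(fit$b.ratings)
# rank_teams(fit$n.ratings.mov)  # NULL unless a margin-of-victory method was fitted
```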
Broatch, J.E. and Karl, A.T. (2018). Multivariate Generalized Linear Mixed Models for Joint Estimation of Sporting Outcomes. Italian Journal of Applied Statistics, 30(2), 189-211. Also available from https://arxiv.org/abs/1710.05284.
Karl, A.T. and Zimmerman, D.L. (2021). A Diagnostic for Bias in Linear Mixed Model Estimators Induced by Dependence Between the Random Effects and the Corresponding Model Matrix. Journal of Statistical Planning and Inference, 211, 107-118. doi:10.1016/j.jspi.2020.06.004.
Karl, A.T., Yang, Y. and Lohr, S. (2013). Efficient Maximum Likelihood Estimation of Multiple Membership Linear Mixed Models, with an Application to Educational Value-Added Assessments. Computational Statistics & Data Analysis, 59, 13-27.
Karl, A.T., Yang, Y. and Lohr, S. (2014). Computation of Maximum Likelihood Estimates for Multiresponse Generalized Linear Mixed Models with Non-nested, Correlated Random Effects. Computational Statistics & Data Analysis, 73, 146-162. doi:10.1016/j.csda.2013.11.019.
Karl, A.T. (2012). The Sensitivity of College Football Rankings to Several Modeling Choices. Journal of Quantitative Analysis in Sports, 8(3). doi:10.1515/1559-0410.1471.
data(nfl2012)
fit <- mvglmmRank(nfl2012, method = "PB0", first.order = TRUE, max.iter.EM = 1, verbose = FALSE)
game.pred(fit, home = "Denver Broncos", away = "Green Bay Packers")
result <- mvglmmRank(nfl2012, method = "PB0", first.order = TRUE, verbose = FALSE)
print(result)
game.pred(result, home = "Denver Broncos", away = "Green Bay Packers")
Internal Function for Normal MOV model
N_mov(Z_mat = Z_mat, first.order = TRUE, home.field = home.field, control = control)
Z_mat |
data frame |
first.order |
logical |
home.field |
logical |
control |
list |
Internal Function for Normal-Binary Model
NB_cre(Z_mat = Z_mat, first.order = first.order, home.field = home.field, control = control)
Z_mat |
data frame |
first.order |
logical |
home.field |
logical |
control |
list |
Internal Function for Normal-Binary MOV Model
NB_mov(Z_mat = Z_mat, first.order = first.order, home.field = home.field, control = control)
Z_mat |
data frame |
first.order |
logical |
home.field |
logical |
control |
list |
2013 NBA Data
data(nba2013)
A data frame with 1229 observations on the following 11 variables.
Date: a factor
away: a factor
home: a factor
OT: a factor
partition: a numeric vector
neutral.site: a numeric vector
ot.count: a numeric vector
home.response: a numeric vector
home.score: a numeric vector
away.response: a numeric vector
away.score: a numeric vector
http://masseyratings.com/data.php
data(nba2013) ## maybe str(nba2013) ; plot(nba2013) ...
2012 NCAA Division I Basketball Results
data(ncaab2012)
A data frame with 5253 observations on the following 10 variables.
date: a factor
away: a factor
home: a factor
neutral.site: a numeric vector
partition: a numeric vector
home_win: a numeric vector
home.response: a numeric vector
home.score: a numeric vector
away.response: a numeric vector
away.score: a numeric vector
http://masseyratings.com/data.php
data(ncaab2012) ## maybe str(ncaab2012) ; plot(ncaab2012) ...
2012 NFL Regular Season Data
data(nfl2012)
A data frame with 256 observations on the following 9 variables.
Date: a factor
away: a factor
home: a factor
neutral.site: a numeric vector
home.response: a numeric vector
home.score: a numeric vector
away.response: a numeric vector
away.score: a numeric vector
partition: a numeric vector
http://masseyratings.com/data.php
data(nfl2012) ## maybe str(nfl2012) ; plot(nfl2012) ...
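A quick descriptive check on a data set of this shape is the raw home win rate, which gives a rough sense of the home-field advantage before any model is fitted. The sketch below uses a hypothetical helper, home_win_rate, and a toy data frame with the same column names; it excludes neutral-site games and counts ties as non-wins.

```r
# Hypothetical helper: share of non-neutral games won by the home team.
home_win_rate <- function(games) {
  at_home <- games[games$neutral.site == 0, ]
  mean(at_home$home.score > at_home$away.score)  # ties count as non-wins
}

# Toy data with the same column names as the bundled data sets.
toy_games <- data.frame(
  home.score   = c(24, 10, 17),
  away.score   = c(21, 13, 17),
  neutral.site = c(0, 0, 1)
)
home_win_rate(toy_games)

# Assumed usage with the bundled data (requires the package):
# data(nfl2012, package = "mvglmmRank")
# home_win_rate(nfl2012)
```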
Internal Function for Normal Model
normal_cre(Z_mat = Z_mat, first.order = first.order, home.field = home.field, control = control)
Z_mat |
data frame |
first.order |
logical |
home.field |
logical |
control |
list |
Internal Function for Poisson-binary Model
PB_cre(Z_mat = Z_mat, first.order = first.order, home.field = home.field, control = control, game.effect = game.effect)
Z_mat |
data frame |
first.order |
logical |
home.field |
logical |
control |
list |
game.effect |
logical |
Internal Function for Poisson Model
poisson_cre(Z_mat = Z_mat, first.order = first.order, control = control, game.effect = game.effect, home.field = home.field)
Z_mat |
data frame |
first.order |
logical |
control |
list |
game.effect |
logical |
home.field |
logical |