This vignette demonstrates an
example of how to use the logitr()
function with the
weights
argument to estimate weighted logit models.
This example uses the cars_us data set from Helveston et al. (2015) containing 384 stated choice observations from US car buyers. Conjoint surveys were fielded in 2012 online in the US on Amazon Mechanical Turk and in person at the 2013 Pittsburgh Auto show. Participants were asked to select a vehicle from a set of three alternatives. Each participant answered 15 choice questions.
In the utility models described below, the data variables are represented as follows:
Symbol | Variable |
---|---|
p | The price in US dollars. |
xjhev | Dummy variable for HEV vehicle type |
xjphev10 | Dummy variable for PHEV10 vehicle type |
xjphev20 | Dummy variable for PHEV20 vehicle type |
xjphev40 | Dummy variable for PHEV40 vehicle type |
xjbev75 | Dummy variable for BEV75 vehicle type |
xjbev100 | Dummy variable for BEV100 vehicle type |
xjbev150 | Dummy variable for BEV150 vehicle type |
xjphevFastcharge | Dummy variable for if the PHEV has a fast charging capability |
xjbevFastcharge | Dummy variable for if the BEV has a fast charging capability |
xjopCost | The vehicle operating costs (cents / mile) |
xjaccelTime | The vehicle 0-60mph acceleration time |
xjamerican | Dummy variable for an American brand |
xjjapanese | Dummy variable for a Japanese brand |
xjchinese | Dummy variable for a Chinese brand |
xjskorean | Dummy variable for a S. Korean brand |
In this example, we’ll estimate two versions of the following utility model in the WTP space: one without weights and one with weights. Notation is taken from Helveston et al. (2015):
where all the ω parameters have units of dollars and λ is the scale parameter.
Estimate the unweighted model using the logitr()
function. In this example, I have set robust = TRUE
since
it will also be TRUE
in the weighted model:
library("logitr")
mnl_wtp_unweighted <- logitr(
data = cars_us,
outcome = 'choice',
obsID = 'obsnum',
pars = c(
'hev', 'phev10', 'phev20', 'phev40', 'bev75', 'bev100', 'bev150',
'american', 'japanese', 'chinese', 'skorean', 'phevFastcharge',
'bevFastcharge','opCost', 'accelTime'),
scalePar = 'price',
robust = TRUE,
# Since WTP space models are non-convex, run a multistart
numMultiStarts = 10
)
Print a summary of the results:
summary(mnl_wtp_unweighted)
#> =================================================
#>
#> Model estimated on: Wed Oct 23 06:49:41 AM 2024
#>
#> Using logitr version: 1.1.2
#>
#> Call:
#> logitr(data = cars_us, outcome = "choice", obsID = "obsnum",
#> pars = c("hev", "phev10", "phev20", "phev40", "bev75", "bev100",
#> "bev150", "american", "japanese", "chinese", "skorean",
#> "phevFastcharge", "bevFastcharge", "opCost", "accelTime"),
#> scalePar = "price", robust = TRUE, numMultiStarts = 10)
#>
#> Frequencies of alternatives:
#> 1 2 3
#> 0.34323 0.33507 0.32170
#>
#> Summary Of Multistart Runs:
#> Log Likelihood Iterations Exit Status
#> 1 -4616.952 26 3
#> 2 -4616.952 31 3
#> 3 -4616.952 37 3
#> 4 -4616.952 33 3
#> 5 -4616.952 37 3
#> 6 -4616.952 30 3
#> 7 -4616.952 34 3
#> 8 -4616.952 36 3
#> 9 -4616.952 32 3
#> 10 -4616.952 34 3
#>
#> Use statusCodes() to view the meaning of each status code
#>
#> Exit Status: 3, Optimization stopped because ftol_rel or ftol_abs was reached.
#>
#> Model Type: Multinomial Logit
#> Model Space: Willingness-to-Pay
#> Model Run: 1 of 10
#> Iterations: 26
#> Elapsed Time: 0h:0m:0.57s
#> Algorithm: NLOPT_LD_LBFGS
#> Weights Used?: FALSE
#> Cluster ID: obsnum
#> Robust? TRUE
#>
#> Model Coefficients:
#> Estimate Std. Error z-value Pr(>|z|)
#> scalePar 0.0738778 0.0021929 33.6898 < 2.2e-16 ***
#> hev 0.8068260 0.9990687 0.8076 0.4193335
#> phev10 1.1649982 1.0615103 1.0975 0.2724267
#> phev20 1.6473977 1.0617543 1.5516 0.1207625
#> phev40 2.5795310 1.0499377 2.4568 0.0140164 *
#> bev75 -16.0458997 1.2541305 -12.7944 < 2.2e-16 ***
#> bev100 -13.0033090 1.2388653 -10.4961 < 2.2e-16 ***
#> bev150 -9.5736990 1.1641920 -8.2235 2.220e-16 ***
#> american 2.3429451 0.7979690 2.9361 0.0033233 **
#> japanese -0.3756939 0.7998368 -0.4697 0.6385600
#> chinese -10.2706045 0.8859927 -11.5922 < 2.2e-16 ***
#> skorean -6.0320195 0.8514460 -7.0844 1.396e-12 ***
#> phevFastcharge 2.8797097 0.8028928 3.5867 0.0003349 ***
#> bevFastcharge 2.9187141 0.9181393 3.1789 0.0014781 **
#> opCost -1.6360512 0.0686316 -23.8382 < 2.2e-16 ***
#> accelTime -1.6970892 0.1638118 -10.3600 < 2.2e-16 ***
#> ---
#> Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
#>
#> Log-Likelihood: -4616.9517826
#> Null Log-Likelihood: -6328.0067827
#> AIC: 9265.9035652
#> BIC: 9372.4426000
#> McFadden R2: 0.2703940
#> Adj McFadden R2: 0.2678655
#> Number of Observations: 5760.0000000
#> Number of Clusters 5760.0000000
To estimate the weighted model, simply add the weights
argument to the call to logitr()
, referring to the column
of weights that will be used to weight each choice observation. In this
example, the weights used in the weights
column range from
0.2 to 5:
summary(cars_us$weights)
#> Min. 1st Qu. Median Mean 3rd Qu. Max.
#> 0.2000 0.2000 0.2000 0.6891 0.2000 5.0000
mnl_wtp_weighted <- logitr(
data = cars_us,
outcome = 'choice',
obsID = 'obsnum',
pars = c(
'hev', 'phev10', 'phev20', 'phev40', 'bev75', 'bev100', 'bev150',
'american', 'japanese', 'chinese', 'skorean', 'phevFastcharge',
'bevFastcharge','opCost', 'accelTime'),
scalePar = 'price',
weights = 'weights', # This enables the weights
robust = TRUE,
numMultiStarts = 10
)
Print a summary of the results:
summary(mnl_wtp_weighted)
#> =================================================
#>
#> Model estimated on: Wed Oct 23 06:49:45 AM 2024
#>
#> Using logitr version: 1.1.2
#>
#> Call:
#> logitr(data = cars_us, outcome = "choice", obsID = "obsnum",
#> pars = c("hev", "phev10", "phev20", "phev40", "bev75", "bev100",
#> "bev150", "american", "japanese", "chinese", "skorean",
#> "phevFastcharge", "bevFastcharge", "opCost", "accelTime"),
#> scalePar = "price", weights = "weights", robust = TRUE, numMultiStarts = 10)
#>
#> Frequencies of alternatives:
#> 1 2 3
#> 0.34323 0.33507 0.32170
#>
#> Summary Of Multistart Runs:
#> Log Likelihood Iterations Exit Status
#> 1 -3425.633 19 3
#> 2 -3425.630 27 3
#> 3 -3425.630 26 3
#> 4 -3425.631 35 3
#> 5 -3425.630 40 3
#> 6 -3425.630 26 3
#> 7 -3425.630 31 3
#> 8 -3425.630 41 3
#> 9 -3832.646 61 -1
#> 10 -3425.633 29 3
#>
#> Use statusCodes() to view the meaning of each status code
#>
#> Exit Status: 3, Optimization stopped because ftol_rel or ftol_abs was reached.
#>
#> Model Type: Multinomial Logit
#> Model Space: Willingness-to-Pay
#> Model Run: 6 of 10
#> Iterations: 26
#> Elapsed Time: 0h:0m:0.31s
#> Algorithm: NLOPT_LD_LBFGS
#> Weights Used?: TRUE
#> Cluster ID: obsnum
#> Robust? TRUE
#>
#> Model Coefficients:
#> Estimate Std. Error z-value Pr(>|z|)
#> scalePar 0.0522800 0.0040694 12.8472 < 2.2e-16 ***
#> hev -1.1757219 2.9133650 -0.4036 0.6865352
#> phev10 0.0278999 3.1280728 0.0089 0.9928836
#> phev20 1.6951556 3.0997720 0.5469 0.5844718
#> phev40 2.6502311 2.9852275 0.8878 0.3746580
#> bev75 -20.1360393 3.6672237 -5.4908 4.001e-08 ***
#> bev100 -19.4958931 3.6256227 -5.3773 7.563e-08 ***
#> bev150 -13.6904861 3.4927207 -3.9197 8.865e-05 ***
#> american 8.1890444 2.4053823 3.4045 0.0006629 ***
#> japanese 0.9343746 2.3603886 0.3959 0.6922111
#> chinese -19.0083809 2.8542641 -6.6596 2.745e-11 ***
#> skorean -9.5107822 2.5235187 -3.7689 0.0001640 ***
#> phevFastcharge 3.9441022 2.3624666 1.6695 0.0950213 .
#> bevFastcharge 3.3421640 2.8087143 1.1899 0.2340752
#> opCost -1.5976246 0.1948661 -8.1986 2.220e-16 ***
#> accelTime -1.1718413 0.4834796 -2.4238 0.0153605 *
#> ---
#> Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
#>
#> Log-Likelihood: -3425.6302859
#> Null Log-Likelihood: -4360.5909275
#> AIC: 6883.2605719
#> BIC: 6989.7997000
#> McFadden R2: 0.2144115
#> Adj McFadden R2: 0.2107422
#> Number of Observations: 5760.0000000
#> Number of Clusters 5760.0000000
Here is a comparison of the coefficients between the weighted and unweighted models. All of the significant coefficients have the same sign, but the magnitudes shift some based on the differential weighting of each individual choice in the weighted model:
data.frame(
Unweighted = coef(mnl_wtp_unweighted),
Weighted = coef(mnl_wtp_weighted)
)
#> Unweighted Weighted
#> scalePar 0.07387776 0.05228002
#> hev 0.80682602 -1.17572186
#> phev10 1.16499818 0.02789991
#> phev20 1.64739765 1.69515557
#> phev40 2.57953096 2.65023112
#> bev75 -16.04589966 -20.13603935
#> bev100 -13.00330902 -19.49589313
#> bev150 -9.57369897 -13.69048607
#> american 2.34294510 8.18904443
#> japanese -0.37569387 0.93437460
#> chinese -10.27060451 -19.00838091
#> skorean -6.03201950 -9.51078221
#> phevFastcharge 2.87970966 3.94410222
#> bevFastcharge 2.91871408 3.34216404
#> opCost -1.63605116 -1.59762456
#> accelTime -1.69708921 -1.17184132
Here is a comparison of the log-likelihood for the weighted and unweighted models: