BLPestimatoR
provides an efficient estimation algorithm
to perform the demand estimation described in Berry, Levinsohn, and Pakes (1995). The routine
uses analytic gradients and offers a large number of optimization
routines and implemented integration methods as discussed in Brunner et al. (2017).
This extended documentation demonstrates the steps of a typical demand estimation with the package:
BLP_data
(includes the
specification of a model and providing integration draws for observed or
unobserved heterogeneity)estimate_BLP
summary
get_elasticities
For this purpose the well-known training datasets for the cereal market (Nevo 2001) and the car market (Berry, Levinsohn, and Pakes 1995) are included in the package. Loading the package is therefore the very first step of the demand estimation:
Since version 0.1.6 the model is provided in R’s formula syntax and
consists of five parts. The variable to be explained is given by
observed market shares. Explanatory variables are grouped into four
(possibly overlapping) categories separated by |
:
The first part of this documentation starts with the cereal data example from Nevo (2001). Nevo’s model can be translated into the following formula syntax:
nevos_model <- as.formula("share ~ price + productdummy |
0+ productdummy |
price + sugar + mushy |
0+ IV1 + IV2 + IV3 + IV4 + IV5 + IV6 + IV7 + IV8 + IV9 + IV10 +
IV11 + IV12 + IV13 + IV14 + IV15 + IV16 + IV17 + IV18 + IV19 + IV20")
The model is directly related to consumer i’s indirect utility from purchasing cereal j in market t:
$$u_{ijt}=\sum_{m=1}^M x^{(m)}_{jt} \beta_{i,m}+\xi_{jt}+\epsilon_{ijt} \;\; \text{with}$$ $$\beta_{i,m}= \bar{\beta}_m + \sum_{r=1}^R \gamma_{m,r} d_{i,r} + \sigma_m \nu_{i,m}$$ and
price
, sugar
,
mushy
and an intercept)income
, incomesq
,
age
, child
)$$\theta_2 = \begin{pmatrix} \sigma_1 & \gamma_{1,1} & \cdots & \gamma_{1,R} \\ \sigma_2 & \gamma_{2,1} & \cdots & \gamma_{2,R} \\ \vdots & \vdots & \ddots & \vdots \\ \sigma_M & \gamma_{M,1} & \cdots & \gamma_{M,R} \end{pmatrix}$$
Product related variables are collected in the dataframe
productData
with the following requirements:
market_identifier
)A variable that uniquely identifies a product in a market
(product_identifier
) is optional, but enhances clarity
(interpreting elasticities, for example, is much easier).
market_identifier
and product_identifier
together uniquely identify an observation, which is used by the function
update_BLP_data
to update any variable in the data (in this
case product_identifier
is mandatory).
In the cereal example, this gives the following dataframe:
head(productData_cereal)
#> price const sugar mushy share cdid IV1 IV2
#> 1 0.07208794 1 2 1 0.012417212 market_1 -0.2159728 0.04057341
#> 2 0.11417849 1 18 1 0.007809387 market_1 -0.2452393 0.05474226
#> 3 0.13239066 1 4 1 0.012994511 market_1 -0.1764587 0.04659597
#> 4 0.13034408 1 3 0 0.005769961 market_1 -0.1214013 0.04876037
#> 5 0.15482331 1 12 0 0.017934141 market_1 -0.1326114 0.03962835
#> 6 0.13704921 1 14 0 0.026601892 market_1 -0.1534998 0.04298842
#> IV3 IV4 IV5 IV6 IV7 IV8
#> 1 -3.247948 -0.523937695 -0.23246005 0.0068326605 3.1397395 -0.57478633
#> 2 -19.832461 -0.180519694 0.01468859 0.0007988026 0.2876539 0.03293960
#> 3 -2.878531 -0.284219004 -0.21553691 -0.0318693281 2.8862741 -0.74976495
#> 4 -2.059918 -0.328412257 -0.22206995 -0.0314740402 4.4531096 0.25567529
#> 5 -6.137598 -0.138625095 -0.18936521 -0.0437471023 -3.5546508 0.13882114
#> 6 -8.417332 0.007829087 -0.13850121 -0.0210582272 -2.7594799 0.05020052
#> IV9 IV10 IV11 IV12 IV13 IV14
#> 1 0.2062202 0.1774656 2.1163580 -0.15470824 -0.0057964065 0.01453802
#> 2 0.1051208 -0.2875618 -7.3740909 -0.57641176 0.0129908544 0.07614324
#> 3 -0.4789565 0.2147389 2.1878721 -0.20734643 0.0035092777 0.09178117
#> 4 -0.4729673 0.3560980 2.7045762 0.04074801 -0.0037242656 0.09473168
#> 5 -0.6886784 0.2602726 1.2612419 0.03483558 -0.0005676374 0.10245147
#> 6 -0.2734440 0.1273060 0.3375543 0.02351037 0.0002637777 0.08627983
#> IV15 IV16 IV17 IV18 IV19 IV20 product_id
#> 1 0.12624398 0.06734464 0.06842261 0.03480046 0.12634612 0.03548368 cereal_1
#> 2 0.02973565 0.08786672 0.11050060 0.08778380 0.04987192 0.07257905 cereal_2
#> 3 0.16377308 0.11188073 0.10822551 0.08643905 0.12234707 0.10184248 cereal_3
#> 4 0.13527378 0.08809001 0.10176745 0.10177748 0.11074119 0.10433204 cereal_4
#> 5 0.13063951 0.08481820 0.10107461 0.12516923 0.13346381 0.12111110 cereal_5
#> 6 0.07233581 0.02225051 0.10564387 0.11603699 0.09965063 0.10572660 cereal_6
#> productdummy
#> 1 product1
#> 2 product2
#> 3 product3
#> 4 product4
#> 5 product5
#> 6 product6
The arguments related to the numerical integration problem are of particular importance when providing own integration draws and weights, which is most relevant for observed heterogeneity (for unobserved heterogeneity, the straightforward approach is the use of automatic integration).
In the cereal data, both, observed and unobserved heterogeneity, is used for the random coefficients. Starting with observed heterogeneity, user provided draws are collected in a list. Each list entry must be named according to the name of a demographic. Each entry contains the following variables:
market_identifier
that matches each line to
a market (same variable name as in productData
)In the cereal example, observed heterogeneity is provided as follows (list names correspond to the demographics):
demographicData_cereal$income[1:4, 1:5]
#> cdid draw_1 draw_2 draw_3 draw_4
#> 1 market_1 0.49512349 0.3787622 0.1050146 -1.48548093
#> 2 market_2 0.05389113 -1.5030833 0.6217917 0.26922901
#> 3 market_3 0.62459788 -0.2242698 -0.2846169 0.67845943
#> 4 market_4 0.94419851 0.4056359 0.5859923 0.01813051
demographicData_cereal$incomesq[1:4, 1:5]
#> cdid draw_1 draw_2 draw_3 draw_4
#> 1 market_1 8.3313042 6.121865 1.030803 -25.5836047
#> 2 market_2 0.0966346 -25.849845 10.767233 4.0668173
#> 3 market_3 10.8215619 -4.894545 -5.956954 11.8673879
#> 4 market_4 17.1121550 6.629730 10.075529 -0.5537043
demographicData_cereal$age[1:4, 1:5]
#> cdid draw_1 draw_2 draw_3 draw_4
#> 1 market_1 -0.2301090 -2.5326941 -0.006965458 -0.8279460
#> 2 market_2 0.1414545 -0.1813188 -1.279931134 -0.4532526
#> 3 market_3 -0.1813188 -0.5867840 0.208145922 -0.4532526
#> 4 market_4 0.7444506 0.1414545 0.645359728 0.8346017
demographicData_cereal$child[1:4, 1:5]
#> cdid draw_1 draw_2 draw_3 draw_4
#> 1 market_1 -0.2308511 0.7691489 -0.2308511 0.7691489
#> 2 market_2 -0.2308511 -0.2308511 0.7691489 -0.2308511
#> 3 market_3 -0.2308511 0.7691489 -0.2308511 -0.2308511
#> 4 market_4 -0.2308511 -0.2308511 -0.2308511 -0.2308511
If demographic input (demographicData
) is missing, the
estimation routine considers only coefficients for unobserved
heterogeneity. This can be done by already implemented integration
methods via integration_method
as shown in the estimation
section. In Nevo’s cereal example however, a specific set of 20 draws is
given. For this situation, draws are also provided as a list (list names
correspond to the formula’s random coefficients and each list entry has
a variable market_identifier
):
originalDraws_cereal$constant[1:4, 1:5]
#> cdid draw_1 draw_2 draw_3 draw_4
#> 1 market_1 0.434100553 -0.7266491 -0.6230607 -0.04131699
#> 2 market_2 0.001702598 0.2228245 -1.0287201 0.38312476
#> 3 market_3 -0.529569305 0.7248712 0.2910167 -1.39164253
#> 4 market_4 0.019233489 -1.4250308 -0.2678170 -2.26541236
# renaming constants:
names(originalDraws_cereal)[1] <- "(Intercept)"
originalDraws_cereal$price[1:4, 1:5]
#> cdid draw_1 draw_2 draw_3 draw_4
#> 1 market_1 -1.5008378 0.1331817 -0.1382405 1.2571357
#> 2 market_2 0.0276477 -0.8414414 -0.9056861 -2.0306179
#> 3 market_3 -0.4768085 -0.9348689 1.4300456 2.6188581
#> 4 market_4 0.5396019 0.1698178 0.7128376 -0.1216309
originalDraws_cereal$sugar[1:4, 1:5]
#> cdid draw_1 draw_2 draw_3 draw_4
#> 1 market_1 -1.1510786 -0.5007498 0.7974412 -0.6830540
#> 2 market_2 1.0451218 1.0170277 2.3012550 0.1490338
#> 3 market_3 -1.0187809 0.3107185 -0.9531532 0.8057080
#> 4 market_4 -0.5760582 -0.5072059 -0.7496653 1.1163544
originalDraws_cereal$mushy[1:4, 1:5]
#> cdid draw_1 draw_2 draw_3 draw_4
#> 1 market_1 0.16101681 0.1297316 -0.79554915 0.2590437
#> 2 market_2 0.23595798 0.1345995 0.07284586 0.9509871
#> 3 market_3 0.02922435 -0.1704289 0.36395246 0.4879702
#> 4 market_4 1.27701102 0.4284194 0.46124654 -1.0110437
As demonstrated above, list entries for draws of constants
must be named (Intercept)
. Other names of
list entries must match the random coefficients specified in the
formula.
Calling BLP_data
structures and prepares the data for
estimation and creates the data object:
productData_cereal$startingGuessesDelta <- c(log(w_guesses_cereal)) # include orig. draws in the product data
cereal_data <- BLP_data(
model = nevos_model,
market_identifier = "cdid",
par_delta = "startingGuessesDelta",
product_identifier = "product_id",
productData = productData_cereal,
demographic_draws = demographicData_cereal,
blp_inner_tol = 1e-6, blp_inner_maxit = 5000,
integration_draws = originalDraws_cereal,
integration_weights = rep(1 / 20, 20)
)
The arguments in greater detail:
model
provides the utility model as explained
above
market_identifier
gives the name of the variable in
productData
that matches each observation to a
market
product_identifier
gives the name of the variable in
productData
that matches each observation to a product
(must be unique in a market)
productData
is given as a dataframe and
demographicData
as a list as described above
par_delta
gives the name of the variable in
productData
for mean utilities
blp_inner_tol
, blp_inner_maxit
:
arguments related to be BLP algorithm include the convergence threshold
and the maximum number of iterations in the contraction mapping
if integration draws are provided manually,
integration_draws
and integration_weights
need
to be specified
for automatic integration the user specifies
integration_method
, for example
integration_method= "MLHS"
, and the accuracy of the
integration method by integration_accuracy
(for stochastic
integration methods this equals the number of draws)
If you decide to update your data later, you can use the function
update_BLP_data
.
The provided set of starting guesses par_theta2
is
matched with formula input and demographic data:
par_theta2
must match with the random
coefficients specified in the formula (note: constants
must be named (Intercept)
)par_theta2
must match with list entry names
of demographicData
and a column for unobserved
heterogeneity (must be named `unobs_sd)NA
s in par_theta2
indicate the exclusion
from estimation, i.e. the coefficient is assumed to be zero.These requirements are demonstrated with a set of exemplary starting guesses:
# before:
theta_guesses_cereal
#> [,1] [,2] [,3] [,4] [,5]
#> [1,] 0.3302 5.4819 0.0 0.2037 0.0000
#> [2,] 2.4526 15.8935 -1.2 0.0000 2.6342
#> [3,] 0.0163 -0.2506 0.0 0.0511 0.0000
#> [4,] 0.2441 1.2650 0.0 -0.8091 0.0000
theta_guesses_cereal[theta_guesses_cereal == 0] <- NA
colnames(theta_guesses_cereal) <- c("unobs_sd", "income", "incomesq", "age", "child")
rownames(theta_guesses_cereal) <- c("(Intercept)", "price", "sugar", "mushy")
# correctly named:
theta_guesses_cereal
#> unobs_sd income incomesq age child
#> (Intercept) 0.3302 5.4819 NA 0.2037 NA
#> price 2.4526 15.8935 -1.2 NA 2.6342
#> sugar 0.0163 -0.2506 NA 0.0511 NA
#> mushy 0.2441 1.2650 NA -0.8091 NA
The following code performs the demand estimation:
cereal_est <- estimateBLP(
blp_data = cereal_data,
par_theta2 = theta_guesses_cereal,
solver_method = "BFGS", solver_maxit = 1000, solver_reltol = 1e-6,
standardError = "heteroskedastic",
extremumCheck = FALSE,
printLevel = 1
)
#> blp_data were prepared with the following arguments:
#> BLP_data(model = nevos_model, market_identifier = "cdid", product_identifier = "product_id",
#> par_delta = "startingGuessesDelta", productData = productData_cereal,
#> demographic_draws = demographicData_cereal, integration_draws = originalDraws_cereal,
#> integration_weights = rep(1/20, 20), blp_inner_tol = 1e-06,
#> blp_inner_maxit = 5000)
#> Starting a BLP demand estimation with 2256 observations in 94 markets...
#> [integration::method integration::amountDraws 20 ]
#> [blp::inner_tol 1e-06 blp::inner_maxit 5000 ]
#> [solver::method BFGS solver::maxit 1000 solver::reltol 1e-06 ]
#> gmm objective: 29.3522
#> gmm objective: Inf [delta contains NaN's]
#> gmm objective: Inf [delta contains NaN's]
#> gmm objective: Inf [delta contains NaN's]
#> gmm objective: 22451.18
#> gmm objective: 775.9515
#> gmm objective: 71.6209
#> gmm objective: 26.8851
#> gmm objective: Inf [delta contains NaN's]
#> gmm objective: 330914
#> gmm objective: 12694.84
#> gmm objective: 311.8842
#> gmm objective: 38.2627
#> gmm objective: 26.9935
#> gmm objective: 26.8196
#> gmm objective: 40772.63
#> gmm objective: 1186.144
#> gmm objective: 103.4133
#> gmm objective: 27.9556
#> gmm objective: 26.1524
#> gmm objective: Inf [delta contains NaN's]
#> gmm objective: Inf [delta contains NaN's]
#> gmm objective: 28234.6
#> gmm objective: 756.8709
#> gmm objective: 37.7704
#> gmm objective: 24.7211
#> gmm objective: 5772.739
#> gmm objective: 202.5456
#> gmm objective: 27.2616
#> gmm objective: 23.9898
#> gmm objective: 838.469
#> gmm objective: 32.1783
#> gmm objective: 21.9662
#> gmm objective: 23912.56
#> gmm objective: 244.3514
#> gmm objective: 20.9664
#> gmm objective: 1417.216
#> gmm objective: 56.3625
#> gmm objective: 21.2928
#> gmm objective: 20.6459
#> gmm objective: 24.7031
#> gmm objective: 19.3915
#> gmm objective: 192.3949
#> gmm objective: 18.8625
#> gmm objective: 20.5334
#> gmm objective: 17.7234
#> gmm objective: 16.7846
#> gmm objective: 16.4664
#> gmm objective: 15.9588
#> gmm objective: 15.0231
#> gmm objective: 14.9101
#> gmm objective: 14.8987
#> gmm objective: 14.8961
#> gmm objective: 14.8902
#> gmm objective: 14.8788
#> gmm objective: 14.8479
#> gmm objective: 14.7731
#> gmm objective: 14.5807
#> gmm objective: 14.0873
#> gmm objective: 12.8012
#> gmm objective: 11.9828
#> gmm objective: 9.4672
#> gmm objective: Inf [delta contains NaN's]
#> gmm objective: Inf [delta contains NaN's]
#> gmm objective: 150504.2
#> gmm objective: 6437.658
#> gmm objective: 224.7393
#> gmm objective: 13.0643
#> gmm objective: 9.2023
#> gmm objective: Inf [delta contains NaN's]
#> gmm objective: 10140.17
#> gmm objective: 554.5767
#> gmm objective: 34.6091
#> gmm objective: 10.0322
#> gmm objective: 9.1844
#> gmm objective: 75730.14
#> gmm objective: 2145.751
#> gmm objective: 45.6103
#> gmm objective: 9.9089
#> gmm objective: 9.0889
#> gmm objective: 111102.8
#> gmm objective: 15512.82
#> gmm objective: 389.1241
#> gmm objective: 20.1367
#> gmm objective: 8.7624
#> gmm objective: 766.0638
#> gmm objective: 49.5382
#> gmm objective: 10.0242
#> gmm objective: 8.6598
#> gmm objective: 155.1232
#> gmm objective: 12.3475
#> gmm objective: 8.5964
#> gmm objective: 308.36
#> gmm objective: 10.1499
#> gmm objective: 8.3448
#> gmm objective: 1593.047
#> gmm objective: 18.5902
#> gmm objective: 7.9521
#> gmm objective: 24.6953
#> gmm objective: 7.4967
#> gmm objective: 7.8465
#> gmm objective: 7.4219
#> gmm objective: 7.3411
#> gmm objective: 7.2325
#> gmm objective: 7.0497
#> gmm objective: 6.9818
#> gmm objective: 6.9711
#> gmm objective: 6.9695
#> gmm objective: 6.9696
#> gmm objective: 6.9692
#> gmm objective: 6.9633
#> gmm objective: 6.9462
#> gmm objective: 6.8986
#> gmm objective: 6.8301
#> gmm objective: 6.6517
#> gmm objective: 6.3046
#> gmm objective: 5.7187
#> gmm objective: 5.0274
#> gmm objective: 4.6491
#> gmm objective: 4.5752
#> gmm objective: 54557.61
#> gmm objective: 2335.286
#> gmm objective: 90.1735
#> gmm objective: 7.3992
#> gmm objective: 4.667
#> gmm objective: 4.576
#> gmm objective: 4.575
#> gmm objective: 14656.41
#> gmm objective: 601.8347
#> gmm objective: 32.5354
#> gmm objective: 5.863
#> gmm objective: 4.6245
#> gmm objective: 4.5771
#> gmm objective: 4.5758
#> gmm objective: 4.5759
#> gmm objective: 4.5757
#> gmm objective: 4.5755
#> gmm objective: 4.5754
#> gmm objective: 4.5753
#> gmm objective: 4.5753
#> gmm objective: 4.5753
#> gmm objective: 4.5753
#> gmm objective: 4.5752
#> gmm objective: 4.5752
#> gmm objective: 4.5752
#> gmm objective: 4.5752
#> gmm objective: 4.5752
#> gmm objective: 4.5752
#> gmm objective: 4.5752
#> gmm objective: 4.5752
#> gmm objective: 4.5752
#> gmm objective: 16969.25
#> gmm objective: 603.3225
#> gmm objective: 23.8438
#> gmm objective: 5.2753
#> gmm objective: 4.5966
#> gmm objective: 4.575
#> gmm objective: 4.5752
#> gmm objective: 4.5753
#> gmm objective: 4.5753
#> gmm objective: 4.5753
#> gmm objective: 4.5753
#> gmm objective: 4.5753
#> gmm objective: 4.5753
#> gmm objective: 4.5753
#> gmm objective: 4.5752
#> gmm objective: 4.5752
#> gmm objective: 4.5752
#> gmm objective: 4.5752
#> gmm objective: 4.5752
#> gmm objective: 4.5752
#> gmm objective: 4.5752
#> gmm objective: 4.5752
#> gmm objective: 4.5752
#> gmm objective: 4.5752
#> gmm objective: 4.5752
#> ------------------------------------------
#> Solver message: Successful convergence
#> ------------------------------------------
#> Final GMM evaluation at optimal parameters:
#> gmm objective: 4.5744
#> theta (RC): 0.55 3.29 -0.01 0.09
#> theta (demogr.): 2.3 577.44 -0.38 0.83 0 -29.63 0 0 1.27 0 0.05 -1.35 0 11.03 0 0
#> inner iterations: 70
#> gradient: -0.0605 0.0017 -0.162 -0.0354 -0.0926 -0.0083 -2.2061 0.2945 -0.1468 0.0797 -1.2189 0.2315 2e-04
#> Using the heteroskedastic asymptotic variance-covariance matrix...
summary(cereal_est)
#>
#> Data information:
#>
#> 94 market(s) with 2256 products
#> 25 linear coefficient(s) (24 exogenous coefficients)
#> 13 non-linear parameters related to random coefficients
#> 4 demographic variable(s)
#>
#> Estimation results:
#>
#> Linear Coefficients
#> Estimate Std. Error t value Pr(>|t|)
#> (Intercept) -2.5769264 0.8531569 -3.020460 2.523909e-03
#> price -62.1403349 14.5135858 -4.281529 1.856137e-05
#> productdummyproduct10 1.9000897 0.6653478 2.855784 4.293071e-03
#> productdummyproduct11 3.6767952 0.6664428 5.517045 3.447468e-08
#> productdummyproduct12 0.2615994 0.2287542 1.143583 2.527967e-01
#> productdummyproduct13 2.3591835 0.2184213 10.801071 3.402135e-27
#> productdummyproduct14 3.8446786 0.6855662 5.608034 2.046376e-08
#> productdummyproduct15 4.2873240 0.6779154 6.324276 2.544228e-10
#> productdummyproduct16 6.0916947 0.6723137 9.060792 1.295068e-19
#> productdummyproduct17 2.6036321 0.6510521 3.999115 6.357976e-05
#> productdummyproduct18 1.4324858 0.6079004 2.356448 1.845065e-02
#> productdummyproduct19 4.3135147 0.6191372 6.966977 3.238237e-12
#> productdummyproduct2 4.8750516 0.6334282 7.696297 1.400664e-14
#> productdummyproduct20 4.1706988 0.6075700 6.864557 6.669789e-12
#> productdummyproduct21 4.9106550 0.6827987 7.191951 6.387201e-13
#> productdummyproduct22 1.7215961 0.5917418 2.909370 3.621574e-03
#> productdummyproduct23 3.2960677 0.6839398 4.819237 1.441085e-06
#> productdummyproduct24 1.8879979 0.7307294 2.583717 9.774202e-03
#> productdummyproduct3 2.3028803 0.2314068 9.951652 2.480314e-23
#> productdummyproduct4 1.3668373 0.5940217 2.300989 2.139225e-02
#>
#> ...
#>
#> 5 estimates are omitted. They are available in the LinCoefficients generated by summary.
#>
#> Random Coefficients
#> Estimate Std. Error t value Pr(>|t|)
#> unobs_sd*(Intercept) 0.551339802 0.16015070 3.4426313 0.0005760842
#> unobs_sd*price 3.285559241 1.30636137 2.5150462 0.0119016778
#> unobs_sd*sugar -0.005237547 0.01336128 -0.3919943 0.6950624579
#> unobs_sd*mushy 0.091406886 0.18480221 0.4946201 0.6208683094
#> income*(Intercept) 2.303677742 1.19131427 1.9337280 0.0531465812
#> income*price 577.439894749 264.92313024 2.1796507 0.0292833612
#> income*sugar -0.384035468 0.11965304 -3.2095755 0.0013293115
#> income*mushy 0.826450257 0.77955169 1.0601609 0.2890713880
#> incomesq*price -29.627526236 13.80993187 -2.1453782 0.0319226238
#> age*(Intercept) 1.268858975 0.62894244 2.0174485 0.0436487312
#> age*sugar 0.051714014 0.02546546 2.0307513 0.0422802275
#> age*mushy -1.350839887 0.66343728 -2.0361230 0.0417380079
#> child*price 11.025866517 4.11565257 2.6790081 0.0073840611
#>
#> Wald Test
#> 126.7789 on 13 DF, p-value: 9.13936392437539e-21
#>
#> Computational Details:
#> Solver converged with 176 iterations to a minimum at 4.5744 .
#> Local minima check: NA
#> stopping criterion outer loop: 1e-06
#> stopping criterion inner loop: 1e-06
#> Market shares are integrated with provided_by_user and 20 draws.
#> Method for standard errors: heteroskedastic
The arguments in greater detail:
par_theta2
gives initial values for non-linear
parameters to be optimized over. Correct naming of columns and rows is
important to allow correct matching.
solver_method
, solver_maxit
,
solver_reltol
: solver related arguments that specify the R
internal optimization (optim
function). Additional
arguments can be passed to optim via ...
standardError
can be specified as
homoskedastic
, heteroskedastic
or
cluster
. The latter requires the variable
group_structure
in productData
giving the
related cluster.
if extremumCheck
is TRUE
, numerical
derivatives at the solver optimum are used to check, if a local minimum
was found
printLevel
controls for the amount of information
that is provided during the estimation
Many of these arguments have default values. In the following setting you see a minimum of necessary arguments with an automatic generation of integration draws and just unobserved heterogeneity. The summary output informs you about the most important default values.
cereal_data2 <- BLP_data(
model = nevos_model,
market_identifier = "cdid",
product_identifier = "product_id",
productData = productData_cereal,
integration_method = "MLHS",
integration_accuracy = 20, integration_seed = 213
)
#> Mean utility (variable name: `delta`) is initialized with 0 because of missing or invalid par_delta argument.
cereal_est2 <- estimateBLP(blp_data = cereal_data2, printLevel = 1)
#> blp_data were prepared with the following arguments:
#> BLP_data(model = nevos_model, market_identifier = "cdid", product_identifier = "product_id",
#> productData = productData_cereal, integration_accuracy = 20,
#> integration_method = "MLHS", integration_seed = 213)
#> Starting a BLP demand estimation with 2256 observations in 94 markets...
#> [integration::method integration::amountDraws 20 ]
#> [blp::inner_tol 1e-09 blp::inner_maxit 10000 ]
#> [solver::method BFGS solver::maxit 10000 solver::reltol 1e-06 ]
#> gmm objective: 189.9432
#> gmm objective: 5632.612
#> gmm objective: 439.3216
#> gmm objective: 202.5585
#> gmm objective: 190.4644
#> gmm objective: 189.9569
#> gmm objective: 189.9422
#> gmm objective: 366.2611
#> gmm objective: 196.3801
#> gmm objective: 190.0315
#> gmm objective: 189.9122
#> gmm objective: 240.4055
#> gmm objective: 191.6947
#> gmm objective: 189.9326
#> gmm objective: 189.903
#> gmm objective: 189.8537
#> gmm objective: 189.8352
#> gmm objective: 189.8352
#> gmm objective: 212.3727
#> gmm objective: 190.7591
#> gmm objective: 189.8716
#> gmm objective: 189.8367
#> gmm objective: 189.8353
#> gmm objective: 189.8352
#> gmm objective: 189.8352
#> ------------------------------------------
#> Solver message: Successful convergence
#> ------------------------------------------
#> Final GMM evaluation at optimal parameters:
#> gmm objective: 189.8352
#> theta (RC): 0 -0.32 0 0.06
#> theta (demogr.):
#> inner iterations: 61
#> gradient: 0 0 0.0316 0
#> Using the heteroskedastic asymptotic variance-covariance matrix...
summary(cereal_est2)
#>
#> Data information:
#>
#> 94 market(s) with 2256 products
#> 25 linear coefficient(s) (24 exogenous coefficients)
#> 4 non-linear parameters related to random coefficients
#> 0 demographic variable(s)
#>
#> Estimation results:
#>
#> Linear Coefficients
#> Estimate Std. Error t value Pr(>|t|)
#> (Intercept) -1.7731873 0.1602903 -11.062350 1.910261e-28
#> price -30.1150257 1.2899147 -23.346526 1.494838e-120
#> productdummyproduct10 1.8588694 0.1680453 11.061715 1.923839e-28
#> productdummyproduct11 1.7623940 0.1495181 11.787162 4.546047e-32
#> productdummyproduct12 -0.2874741 0.1592495 -1.805180 7.104647e-02
#> productdummyproduct13 2.2500443 0.1511584 14.885338 4.104043e-50
#> productdummyproduct14 1.7795210 0.1626780 10.938918 7.509066e-28
#> productdummyproduct15 2.1598572 0.1600857 13.491879 1.745928e-41
#> productdummyproduct16 3.6903310 0.1718044 21.479845 2.403069e-102
#> productdummyproduct17 0.7890090 0.1696815 4.649940 3.320313e-06
#> productdummyproduct18 0.8752804 0.1637558 5.345035 9.039982e-08
#> productdummyproduct19 1.7678372 0.1493846 11.834129 2.600273e-32
#> productdummyproduct2 2.3665019 0.1535953 15.407387 1.459966e-53
#> productdummyproduct20 2.8086743 0.1635328 17.174992 4.087069e-66
#> productdummyproduct21 2.6809487 0.1689161 15.871478 9.986136e-57
#> productdummyproduct22 0.4722649 0.1546958 3.052862 2.266698e-03
#> productdummyproduct23 1.3755276 0.1541332 8.924278 4.486152e-19
#> productdummyproduct24 2.1728901 0.1699102 12.788464 1.901958e-37
#> productdummyproduct3 1.8016223 0.1488931 12.100102 1.054799e-33
#> productdummyproduct4 0.7639859 0.1468578 5.202216 1.969266e-07
#>
#> ...
#>
#> 5 estimates are omitted. They are available in the LinCoefficients generated by summary.
#>
#> Random Coefficients
#> Estimate Std. Error t value Pr(>|t|)
#> unobs_sd*(Intercept) -0.0028937363 1.1219334 -0.002579241 0.9979421
#> unobs_sd*price -0.3179154684 9.1643440 -0.034690478 0.9723266
#> unobs_sd*sugar -0.0003841069 0.1139372 -0.003371217 0.9973102
#> unobs_sd*mushy 0.0603775693 2.2758039 0.026530216 0.9788344
#>
#> Wald Test
#> 0.0018 on 4 DF, p-value: 0.9999995836557
#>
#> Computational Details:
#> Solver converged with 25 iterations to a minimum at 189.8352 .
#> Local minima check: NA
#> stopping criterion outer loop: 1e-06
#> stopping criterion inner loop: 1e-09
#> Market shares are integrated with MLHS and 20 draws.
#> Method for standard errors: heteroskedastic
Standard errors can be computed with three options that control for the unobserved characteristic ξ, which consists of N elements. Ω denotes the variance covariance matrix of ξ.
option homoskedastic
requires the standard deviation
σi for
each ξi ∀i ∈ 1, ⋯, N
to be identical: $$\Omega = \begin{pmatrix}
\sigma & 0 &
\dots & 0\\ 0 & \sigma & & 0\\ \vdots & &
\ddots & 0\\
0 & 0 & 0 & \sigma \\ \end{pmatrix}$$
option heteroskedastic
allows for individual
standard deviations σi for each
ξi : $$\Omega = \begin{pmatrix} \sigma_1 & 0 &
\dots & 0\\
0 & \sigma_2 & & 0\\
\vdots & & \ddots & 0\\
0 & 0 & 0 & \sigma_N \\ \end{pmatrix}$$
option cluster
allows for cluster individual
variance covariance matrices in each of M cluster groups. For this option
the argument group_structure
needs to be specified in the
function BLP_data
to determine the cluster group. This
gives the block-diagonal form with Σm as the
variance covariance matrix for all ξi in cluster
m:
$$\Omega = \begin{pmatrix} \Sigma_1 & 0 & \dots & 0\\ 0 & \Sigma_2 & & 0\\ \vdots & & \ddots & 0\\ 0 & 0 & 0 & \Sigma_M \\ \end{pmatrix}$$
The following code demonstrates the calculation of elasticities for
the estimation object cereal_est
.
# extract parameters from output
theta1_price <- cereal_est$theta_lin["price", ]
theta2 <- matrix(NA, nrow = 4, ncol = 5)
colnames(theta2) <- c("unobs_sd", "income", "incomesq", "age", "child")
rownames(theta2) <- c("(Intercept)", "price", "sugar", "mushy")
for (i in 1:13) {
theta2[cereal_est$indices[i, 1], cereal_est$indices[i, 2]] <- cereal_est$theta_rc[i]
}
delta_data <- data.frame(
"product_id" = cereal_data$parameters$product_id,
"cdid" = cereal_data$parameters$market_id_char_in,
"startingGuessesDelta" = cereal_est$delta
)
# always use update_BLP_data() to update data object to maintain consistent data
cereal_data <- update_BLP_data(
data_update = delta_data,
blp_data = cereal_data
)
#> Mean utility variable startingGuessesDelta has been updated.
shareObj <- getShareInfo(
blp_data = cereal_data,
par_theta2 = theta2,
printLevel = 1
)
#> Mean utility (delta) is used as provided in the BLP_data() function.
get_elasticities(
blp_data = cereal_data,
share_info = shareObj,
theta_lin = theta1_price,
variable = "price",
products = c("cereal_1", "cereal_4"),
market = "market_2"
)
#> cereal_1 cereal_4
#> cereal_1 -1.710300059 0.005433147
#> cereal_4 0.006965292 -2.717589276
The value of the elasticity matrix in row j and column i for a variable x, gives the effect of a change in product i’s characteristic x on the share of product j.
Further analysis like incorporating a supply side or performing a merger simulation often requires access to building blocks of the BLP algorithm. The following wrappers insure correct data inputs and access the internal functions of the algorithm.
In the following, you find an example of the contraction mapping and an evaluation of the GMM function at the starting guess:
delta_eval <- getDelta_wrap(
blp_data = cereal_data,
par_theta2 = theta_guesses_cereal,
printLevel = 4
)
#> ----------------------
#> dist: 0.230998
#> dist: 0.188133
#> dist: 0.161181
#> dist: 0.133303
#> dist: 0.107161
#> dist: 0.0844808
#> dist: 0.0657196
#> dist: 0.0506556
#> dist: 0.0387915
#> dist: 0.0295683
#> dist: 0.0224622
#> dist: 0.0170219
#> dist: 0.0128758
#> dist: 0.00972657
#> dist: 0.00734024
#> dist: 0.00553524
#> dist: 0.00417178
#> dist: 0.00314287
#> dist: 0.00236698
#> dist: 0.00178222
#> dist: 0.00134169
#> dist: 0.00100991
#> dist: 0.000760105
#> dist: 0.000572046
#> dist: 0.000430491
#> dist: 0.00032395
#> dist: 0.000243769
#> dist: 0.00019173
#> dist: 0.000156244
#> dist: 0.000127321
#> dist: 0.000103748
#> dist: 8.45372e-05
#> dist: 6.88821e-05
#> dist: 5.61251e-05
#> dist: 4.573e-05
#> dist: 3.72598e-05
#> dist: 3.03582e-05
#> dist: 2.47347e-05
#> dist: 2.01528e-05
#> dist: 1.64195e-05
#> dist: 1.33778e-05
#> dist: 1.08995e-05
#> dist: 8.88033e-06
#> dist: 7.23518e-06
#> dist: 5.8948e-06
#> dist: 4.80272e-06
#> dist: 3.91296e-06
#> dist: 3.18804e-06
#> dist: 2.59741e-06
#> dist: 2.14713e-06
#> dist: 1.78121e-06
#> dist: 1.47765e-06
#> dist: 1.22582e-06
#> dist: 1.01691e-06
#> dist: 9.93231e-07
productData_cereal$startingGuessesDelta[1:6]
#> [1] -3.944367 -2.845205 -3.958199 -4.934153 -2.425356 -4.086816
delta_eval$delta[1:6]
#> cereal_1_market_1 cereal_2_market_1 cereal_3_market_1 cereal_4_market_1
#> -7.069801 -4.357675 -6.056909 -5.887501
#> cereal_5_market_1 cereal_6_market_1
#> -3.501289 -3.079323
delta_eval$counter
#> [1] 55
gmm <- gmm_obj_wrap(
blp_data = cereal_data,
par_theta2 = theta_guesses_cereal,
printLevel = 2
)
#> gmm objective: 29.3544
#> theta (RC): 0.33 2.45 0.02 0.24
#> theta (demogr.): 5.48 15.89 -0.25 1.26 0 -1.2 0 0 0.2 0 0.05 -0.81 0 2.63 0 0
#> inner iterations: 55
#> gradient: 9.8468 0.3171 363.5319 16.3598 10.6031 0.7028 42.5236 -3.4759 13.4982 -2.0231 10.9331 1.2851 -0.5714
#> [1] 29.35439
gmm$local_min
#> [1] 29.35439
Printed distances in the contraction mapping are maximum absolute distances between the current vector of mean utilities and the previous one.
For any θ2, you can compute predicted shares:
shareObj <- getShareInfo(
blp_data = cereal_data,
par_theta2 = theta_guesses_cereal,
printLevel = 4
)
#> Mean utility (delta) is used as provided in the BLP_data() function.
shareObj$shares[1:6]
#> cereal_1_market_1 cereal_2_market_1 cereal_3_market_1 cereal_4_market_1
#> 0.052856545 0.002175260 0.006105588 0.001529422
#> cereal_5_market_1 cereal_6_market_1
#> 0.001571742 0.003654192
The object contains a list of outputs that are useful for further
economic analysis. For example, the list element sij
contains share probabilities for every individual and needs to be given
to calculate elasticities.
The gradient contains two important building blocks as explained in the appendix of Nevo (2001):
$\frac{\partial s_{ijt}}{\partial \theta_2}$ , i.e. the derivative of individual i’s share of product j in market t with respect to non-linear parameters
$\frac{\partial s_{ijt}}{\partial \delta}$ , i.e. the derivative of individual i’s share of product j in market t with respect to mean utilities
Both are used to compute the jacobian and are easy to obtain with the package as the following example demonstrates:
# market 2:
derivatives1 <- dstdtheta_wrap(
blp_data = cereal_data,
par_theta2 = theta_guesses_cereal,
market = "market_2"
)
#> Mean utility (delta) is used as provided in the BLP_data() function.
derivatives2 <- dstddelta_wrap(
blp_data = cereal_data,
par_theta2 = theta_guesses_cereal,
market = "market_2"
)
#> Mean utility (delta) is used as provided in the BLP_data() function.
jac_mkt2 <- -solve(derivatives2) %*% derivatives1
jac_mkt2[1:5, 1:4]
#> unobs_sd*(Intercept) unobs_sd*price unobs_sd*sugar
#> meanUtility_cereal_1 -0.3935523 0.010936676 -1.430109
#> meanUtility_cereal_2 -0.5385401 0.002959190 -8.608901
#> meanUtility_cereal_3 -0.4567829 0.003935221 -2.612346
#> meanUtility_cereal_4 -0.7766701 -0.022430917 -1.567758
#> meanUtility_cereal_5 -0.7821670 -0.085051524 -2.722660
#> unobs_sd*mushy
#> meanUtility_cereal_1 -0.47865751
#> meanUtility_cereal_2 -0.20678073
#> meanUtility_cereal_3 -0.43499042
#> meanUtility_cereal_4 -0.04046717
#> meanUtility_cereal_5 -0.04388127
# all markets
jacobian_nevo <- getJacobian_wrap(
blp_data = cereal_data,
par_theta2 = theta_guesses_cereal,
printLevel = 2
)
#> Mean utility (delta) is used as provided in the BLP_data() function.
jacobian_nevo[25:29, 1:4] # compare to jac_mkt2
#> unobs_sd*(Intercept) unobs_sd*price unobs_sd*sugar
#> cereal_1_market_2 -0.3935523 0.010936676 -1.430109
#> cereal_2_market_2 -0.5385401 0.002959190 -8.608901
#> cereal_3_market_2 -0.4567829 0.003935221 -2.612346
#> cereal_4_market_2 -0.7766701 -0.022430917 -1.567758
#> cereal_5_market_2 -0.7821670 -0.085051524 -2.722660
#> unobs_sd*mushy
#> cereal_1_market_2 -0.47865751
#> cereal_2_market_2 -0.20678073
#> cereal_3_market_2 -0.43499042
#> cereal_4_market_2 -0.04046717
#> cereal_5_market_2 -0.04388127
Analyzing a hypothetical merger is demonstrated by the car data of Berry, Levinsohn, and Pakes (1995). In this case, the preparation of product data comprises the computation of instruments as a function of product characteristics of competitors’ products (for details, check Berry, Levinsohn, and Pakes (1995)). This example is based on data and documentation of Knittel and Metaxoglou (2014).
# add owner matix to productData
own_pre <- dummies_cars
colnames(own_pre) <- paste0("company", 1:26)
productData_cars <- cbind(productData_cars, own_pre)
# construct instruments
nobs <- nrow(productData_cars)
X <- data.frame(
productData_cars$const, productData_cars$hpwt,
productData_cars$air, productData_cars$mpg, productData_cars$space
)
sum_other <- matrix(NA, nobs, ncol(X))
sum_rival <- matrix(NA, nobs, ncol(X))
sum_total <- matrix(NA, nobs, ncol(X))
for (i in 1:nobs) {
other_ind <- productData_cars$firmid == productData_cars$firmid[i] &
productData_cars$cdid == productData_cars$cdid[i] &
productData_cars$id != productData_cars$id[i]
rival_ind <- productData_cars$firmid != productData_cars$firmid[i] &
productData_cars$cdid == productData_cars$cdid[i]
total_ind <- productData_cars$cdid == productData_cars$cdid[i]
sum_other[i, ] <- colSums(X[other_ind == 1, ])
sum_rival[i, ] <- colSums(X[rival_ind == 1, ])
sum_total[i, ] <- colSums(X[total_ind == 1, ])
}
colnames(sum_other) <- paste0("IV", 1:5)
colnames(sum_rival) <- paste0("IV", 6:10)
productData_cars <- cbind(productData_cars, sum_other, sum_rival)
head(productData_cars)
#> firmid cdid id const price hpwt air mpg space share
#> 1 15 1 129 1 4.935802 0.5289969 0 1.697 1.1502 0.0010512928
#> 2 15 1 130 1 5.516049 0.4943244 0 1.740 1.2780 0.0006700762
#> 3 15 1 132 1 7.108642 0.4676134 0 1.543 1.4592 0.0003405273
#> 4 15 1 134 1 6.839506 0.4265403 0 1.517 1.6068 0.0005222419
#> 5 15 1 136 1 8.928395 0.4524887 0 1.352 1.6458 0.0004424231
#> 6 19 1 138 1 7.153086 0.4508706 0 1.552 1.6224 0.0027560592
#> company1 company2 company3 company4 company5 company6 company7 company8
#> 1 0 0 0 0 0 0 0 0
#> 2 0 0 0 0 0 0 0 0
#> 3 0 0 0 0 0 0 0 0
#> 4 0 0 0 0 0 0 0 0
#> 5 0 0 0 0 0 0 0 0
#> 6 0 0 0 0 0 0 0 0
#> company9 company10 company11 company12 company13 company14 company15
#> 1 0 0 0 0 0 0 1
#> 2 0 0 0 0 0 0 1
#> 3 0 0 0 0 0 0 1
#> 4 0 0 0 0 0 0 1
#> 5 0 0 0 0 0 0 1
#> 6 0 0 0 0 0 0 0
#> company16 company17 company18 company19 company20 company21 company22
#> 1 0 0 0 0 0 0 0
#> 2 0 0 0 0 0 0 0
#> 3 0 0 0 0 0 0 0
#> 4 0 0 0 0 0 0 0
#> 5 0 0 0 0 0 0 0
#> 6 0 0 0 1 0 0 0
#> company23 company24 company25 company26 IV1 IV2 IV3 IV4 IV5 IV6
#> 1 0 0 0 0 4 1.840967 0 6.152 5.9898 87
#> 2 0 0 0 0 4 1.875639 0 6.109 5.8620 87
#> 3 0 0 0 0 4 1.902350 0 6.306 5.6808 87
#> 4 0 0 0 0 4 1.943423 0 6.332 5.5332 87
#> 5 0 0 0 0 4 1.917475 0 6.497 5.4942 87
#> 6 0 0 0 0 28 16.964809 0 44.328 46.1233 63
#> IV7 IV8 IV9 IV10
#> 1 44.55554 0 150.386 125.5613
#> 2 44.55554 0 150.386 125.5613
#> 3 44.55554 0 150.386 125.5613
#> 4 44.55554 0 150.386 125.5613
#> 5 44.55554 0 150.386 125.5613
#> 6 29.50982 0 112.355 84.9556
# To show similarities between implementations of other authors,
# the variable "const" is used, although constants are considered by default.
blps_model <- as.formula("share ~ 0 + const + price + hpwt + air + mpg + space |
0 + const + hpwt + air + mpg + space |
0 + price + const + hpwt + air + mpg |
0 + IV1 + IV2 + IV3 + IV4 + IV5 + IV6 + IV7 + IV8 + IV9 + IV10")
car_data <- BLP_data(
model = blps_model,
market_identifier = "cdid",
product_identifier = "id",
additional_variables = paste0("company", 1:26), # check reordering works
productData = productData_cars,
blp_inner_tol = 1e-9,
blp_inner_maxit = 5000,
integration_method = "MLHS",
integration_accuracy = 50, integration_seed = 48
)
#> Mean utility (variable name: `delta`) is initialized with 0 because of missing or invalid par_delta argument.
In the next step, starting guesses for random coefficients are generated from a standard normal distribution. The estimation of the model works like before.
set.seed(121)
theta_guesses <- matrix(rnorm(5))
rownames(theta_guesses) <- c("price", "const", "hpwt", "air", "mpg")
colnames(theta_guesses) <- "unobs_sd"
car_est <- estimateBLP(
blp_data = car_data,
par_theta2 = theta_guesses,
solver_method = "BFGS", solver_maxit = 1000, solver_reltol = 1e-6,
extremumCheck = FALSE, printLevel = 0
)
#> blp_data were prepared with the following arguments:
#> BLP_data(model = blps_model, market_identifier = "cdid", product_identifier = "id",
#> additional_variables = paste0("company", 1:26), productData = productData_cars,
#> integration_accuracy = 50, integration_method = "MLHS", integration_seed = 48,
#> blp_inner_tol = 1e-09, blp_inner_maxit = 5000)
#> ------------------------------------------
#> Solver message: Successful convergence
#> ------------------------------------------
#> Final GMM evaluation at optimal parameters:
#> gmm objective: 131.2733
#> theta (RC): -0.42 -8.59 1.27 -0.1 1.07
#> theta (demogr.):
#> inner iterations: 67
#> gradient: -18.916 3.6499 27.2623 -8.5398 -28.9531
#> Using the heteroskedastic asymptotic variance-covariance matrix...
summary(car_est)
#>
#> Data information:
#>
#> 20 market(s) with 2217 products
#> 6 linear coefficient(s) (5 exogenous coefficients)
#> 5 non-linear parameters related to random coefficients
#> 0 demographic variable(s)
#>
#> Estimation results:
#>
#> Linear Coefficients
#> Estimate Std. Error t value Pr(>|t|)
#> const -13.0980731 1.0748672 -12.185759 3.701895e-34
#> price -0.9049313 0.1609564 -5.622213 1.885269e-08
#> hpwt 2.1752674 0.8598893 2.529706 1.141581e-02
#> air 1.4011076 0.2567444 5.457207 4.836817e-08
#> mpg -0.7994763 0.6366915 -1.255673 2.092345e-01
#> space 3.1465811 0.3542019 8.883580 6.474223e-19
#>
#> Random Coefficients
#> Estimate Std. Error t value Pr(>|t|)
#> unobs_sd*price -0.4150985 0.08593854 -4.8301784 1.364108e-06
#> unobs_sd*const -8.5907908 2.19917404 -3.9063715 9.369243e-05
#> unobs_sd*hpwt 1.2712851 0.41076258 3.0949390 1.968534e-03
#> unobs_sd*air -0.1005375 0.67001283 -0.1500531 8.807227e-01
#> unobs_sd*mpg 1.0652496 0.48637023 2.1902032 2.850950e-02
#>
#> Wald Test
#> 45.1733 on 5 DF, p-value: 1.33780393851001e-08
#>
#> Computational Details:
#> Solver converged with 187 iterations to a minimum at 131.2733 .
#> Local minima check: NA
#> stopping criterion outer loop: 1e-06
#> stopping criterion inner loop: 1e-09
#> Market shares are integrated with MLHS and 50 draws.
#> Method for standard errors: heteroskedastic
Next, all parameters that are required by the subsequent merger
analysis are extracted. Note that all extracted data is based on the
estimation object car_est
or the data object
car_data
to maintain data consistency (for example, the
order of data in product_data_cars
might differ from
car_data
). Moreover, mean utilities are updated in
car_data
by the values in the estimation object
car_est
.
## Pre-Merger data
own_pre <- as.matrix(car_data$data$additional_data[, paste0("company", 1:26)])
delta_pre <- car_est$delta
theta1_price <- car_est$theta_lin["price", ]
theta2_price <- car_est$theta_rc["unobs_sd*price"]
theta2_all <- matrix(car_est$theta_rc)
rownames(theta2_all) <- c("price", "const", "hpwt", "air", "mpg")
colnames(theta2_all) <- "unobs_sd"
## update mean utility in data ( always use update_BLP_data() to update data object to maintain consistent data )
delta_data <- data.frame(
"id" = car_data$parameters$product_id,
"cdid" = car_data$parameters$market_id,
"delta" = delta_pre
)
car_data_updated <- update_BLP_data(
data_update = delta_data,
blp_data = car_data
)
#> Mean utility variable delta has been updated.
In the next step, an estimate for marginal costs mc before the merger is computed. The following is based on the FOC of a Bertrand equilibrium with prices p before the merger:
$$ p^{pre} - \widehat{mc} = \Omega^{pre}(p^{pre})^{-1} \hat{s}(p^{pre}) $$ Ωpre(ppre)−1 is defined marketwise as the inverse of
$$ \Omega^{pre}(p^{pre}) = \pmatrix{ -\frac{\partial s_{1}}{\partial p_{1}} (p^{pre}) \cdot D_{1,1} & -\frac{\partial s_{2}}{\partial p_{1}} (p^{pre}) \cdot D_{1,2} & \cdots & -\frac{\partial s_{j}}{\partial p_{1}} (p^{pre}) \cdot D_{1,j} & \cdots & -\frac{\partial s_{J}}{\partial p_{1}} (p^{pre}) \cdot D_{1,J}\\ -\frac{\partial s_{1}}{\partial p_{2}} (p^{pre}) \cdot D_{2,1} & -\frac{\partial s_{2}}{\partial p_{2}} (p^{pre}) \cdot D_{2,2} & \cdots & -\frac{\partial s_{j}}{\partial p_{2}} (p^{pre}) \cdot D_{2,j} & \cdots & -\frac{\partial s_{J}}{\partial p_{2}} (p^{pre}) \cdot D_{2,J}\\ \vdots & \vdots & \ddots & \vdots & \ddots & \vdots\\ -\frac{\partial s_{1}}{\partial p_{k}} (p^{pre}) \cdot D_{k,1} & -\frac{\partial s_{2}}{\partial p_{k}} (p^{pre}) \cdot D_{k,2} & \cdots & -\frac{\partial s_{j}}{\partial p_{k}} (p^{pre}) \cdot D_{k,j} & \cdots & -\frac{\partial s_{J}}{\partial p_{k}} (p^{pre}) \cdot D_{k,J}\\ \vdots & \vdots & \ddots & \vdots & \ddots & \vdots\\ -\frac{\partial s_{1}}{\partial p_{J}} (p^{pre}) \cdot D_{J,1} & -\frac{\partial s_{2}}{\partial p_{J}} (p^{pre}) \cdot D_{J,2} & \cdots & -\frac{\partial s_{j}}{\partial p_{J}} (p^{pre}) \cdot D_{J,j}& \cdots & -\frac{\partial s_{J}}{\partial p_{J}} (p^{pre}) \cdot D_{J,J}\\ } $$
with $$D_{k,j} = \begin{cases} 1 & \text{if products k and j are produced by the same firm} \\ 0 & \text{otherwise} \\ \end{cases}$$
Partial derivatives $\frac{\partial s_j}{\partial p_k}$ can be calculated based on the elasticity $\eta_{jk} = \frac{\partial s_j }{\partial p_k }\frac{ p_k}{ s_j}$, so $$ \frac{\partial s_j}{\partial p_k} = \eta_{jk} \cdot \frac{ s_j}{ p_k} $$
In the following code chunk, these objects in a market i
are labeled as follows:
own_prod_pre_i
(Dk, j)elasticities_i
(ηjk)derivatives_i
($\eta_{jk}
\cdot \frac{ s_j}{ p_k}$)-solve(t(derivatives_i) * own_prod_pre_i)
(Ωpre(ppre)−1)shareObj$shares
(ŝ(ppre)).## calculate sij
shareObj <- getShareInfo(
blp_data = car_data_updated,
par_theta2 = theta2_all,
printLevel = 0
)
## computation of marginal costs
market_id <- car_data$parameters$market_id
nmkt <- length(unique(market_id))
markups <- numeric(length(market_id))
sh <- shareObj$shares
prices_pre <- car_data$data$X_rand[, "price"]
for (i in 1:nmkt) {
mkt_ind <- market_id == i
share_i <- sh[ mkt_ind ]
price_pre_i <- prices_pre[ mkt_ind ]
scalar_i <- matrix(1 / share_i) %*% matrix(price_pre_i, nrow = 1)
elasticities_i <- get_elasticities(
blp_data = car_data_updated,
share_info = shareObj,
theta_lin = theta1_price,
variable = "price",
market = i,
printLevel = 0
)
derivatives_i <- elasticities_i / scalar_i # partial derivatives of shares wrt price
own_pre_i <- own_pre[ mkt_ind, ]
own_prod_pre_i <- own_pre_i %*% t(own_pre_i) # if element (i,j) equals 1, that means that prod i and j are produced by same firm
markups[mkt_ind] <- c(-solve(t(derivatives_i) * own_prod_pre_i) %*% share_i)
}
marg_cost <- prices_pre - markups
The ownership matrix is adjusted to implement a hypothetical merger between Chrysler and GM:
# Merger between company 16 and 19 (i.e. GM and Chrysler)
prices_post <- numeric(2217)
own_post <- cbind(
own_pre[, 1:15],
own_pre[, 16] + own_pre[, 19],
own_pre[, 17:18],
own_pre[, 20:26]
)
To analyze the effect on prices the FOC of the new equilibrium must be solved: $$ p^{post} - \widehat{mc} = \Omega^{post}(p^{post})^{-1} \hat{s}(p^{post}) $$
The solution of this set of non-linear equations is obtained by the
function foc_bertrand_mkt
and the package
nleqslv
:
foc_bertrand_mkt <- function(par, own_prod, blp_data, mkt, marg_cost, theta_lin, theta_rc) {
# argument par: candidate for post merger prices
# arguments own_prod, blp_data, mkt, marg_cost, theta_lin, theta_rc: see previous code blocks
# post merger updates: update the BLP_data object for market i
tmp <- data.frame(
"id" = blp_data$parameters$product_id,
"cdid" = blp_data$parameters$market_id,
"delta" = blp_data$data$delta,
"price" = blp_data$data$X_rand[, "price"]
)
market_ind <- blp_data$parameters$market_id == mkt
delta_old <- blp_data$data$delta
prices_pre <- blp_data$data$X_rand[, "price"]
tmp$price[ market_ind ] <- par
tmp$delta[ market_ind ] <- delta_old[market_ind] - prices_pre[market_ind] * theta_lin + par * theta_lin
new_blp_data <- update_BLP_data(
blp_data = blp_data,
data_update = tmp
)
ShareObj <- getShareInfo(
blp_data = new_blp_data,
par_theta2 = theta_rc,
printLevel = 0
)
implied_shares <- as.matrix(ShareObj$shares[market_ind])
elasticities_post_mkt <- get_elasticities(
blp_data = new_blp_data,
share_info = ShareObj,
theta_lin = theta_lin,
variable = "price",
market = mkt,
printLevel = 0
)
scalar_mkt <- matrix(1 / implied_shares) %*% matrix(par, nrow = 1)
derivatives_mkt <- elasticities_post_mkt / scalar_mkt
markups_post <- c(-solve(t(derivatives_mkt) * own_prod) %*% implied_shares)
differences <- par - marg_cost[market_ind] - markups_post
return(differences)
}
Finally, the function is used to compute the new equilibrium:
library(nleqslv) # to solve non linear first order conditions
for (i in 1:nmkt) {
mkt_ind <- market_id == i
own_post_i <- own_post[ mkt_ind, ]
own_prod_post_i <- own_post_i %*% t(own_post_i)
price_pre_i <- prices_pre[ mkt_ind ]
solution <- nleqslv(
x = price_pre_i, foc_bertrand_mkt, # startingguesses: price_pre_i
own_prod = own_prod_post_i,
blp_data = car_data_updated,
mkt = i,
marg_cost = marg_cost,
theta_lin = theta1_price,
theta_rc = theta2_all
)
prices_post[ market_id == i ] <- solution$x
}