| Title: | Hedonic and Multilateral Index Methods for Real Estate Price Statistics |
|---|---|
| Description: | Compute price indices using various Hedonic and multilateral methods, including Laspeyres, Paasche, Fisher, and HMTS (Hedonic Multilateral Time series re-estimation with splicing). The central function calculate_price_index() offers a unified interface for running these methods on structured datasets. This package is designed to support index construction workflows for real estate and other domains where quality-adjusted price comparisons over time are essential. The development of this package was funded by Eurostat and Statistics Netherlands (CBS), and carried out by Statistics Netherlands. The HMTS method implemented here is described in Ishaak, Ouwehand and Remøy (2024) <doi:10.1177/0282423X241246617>. For broader methodological context, see Eurostat (2013, ISBN:978-92-79-25984-5, <doi:10.2785/34007>). |
| Authors: | Farley Ishaak [aut], Pim Ouwehand [aut], David Pietersz [aut], Liu Nuo Su [aut], Cynthia Cao [aut], Mohammed Kardal [aut], Odens van der Zwan [aut], Vivek Gajadhar [aut, cre] |
| Maintainer: | Vivek Gajadhar <[email protected]> |
| License: | GPL-2 |
| Version: | 0.1.0 |
| Built: | 2026-05-18 07:04:48 UTC |
| Source: | https://github.com/cran/cbsREPS |
The parameters 'dependent_variable', 'continuous_variables' and 'categorical_variables' define a regression model. With this model, a direct series of index figures is estimated by means of hedonic regression.
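As the comment in the package examples notes, the Fisher index is the geometric mean of the Laspeyres and Paasche indices. The following is a minimal sketch of that final combination step only; the index values are hypothetical numbers chosen for illustration, not package output.

```r
# Hypothetical Laspeyres and Paasche index series (illustrative values only).
laspeyres <- c("2015" = 100, "2016" = 104, "2017" = 110)
paasche   <- c("2015" = 100, "2016" = 102, "2017" = 107)

# Fisher index per period: geometric mean of the two series.
fisher <- sqrt(laspeyres * paasche)
print(round(fisher, 2))
```

In every period the Fisher value lies between the Laspeyres and Paasche values, and the reference period stays at 100.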
calculate_fisher(dataset, period_variable, dependent_variable, continuous_variables, categorical_variables, reference_period = NULL, number_of_observations = FALSE)
dataset |
table with data (does not need to be a selection of relevant variables) |
period_variable |
variable in the table with periods |
dependent_variable |
usually the sale price |
continuous_variables |
vector with quality determining numeric variables (no dummies) |
categorical_variables |
vector with quality determining categorical variables (also dummies) |
reference_period |
period or group of periods that will be set to 100 (numeric/string) |
number_of_observations |
show the number of observations per period? (default = FALSE) |
N.B.: the independent variables must be supplied already transformed. For example, do not pass log(floor_area); transform the variable in advance and pass log_floor_area. This does not apply to the dependent variable, which should be entered untransformed.
There is no need to filter the data on relevant variables or complete records beforehand; the function takes care of this.
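The note on pre-transformed variables can be illustrated with a small sketch. The data frame `dwellings` and its values are hypothetical; only the column naming pattern (log_floor_area) comes from the note above.

```r
# Hypothetical input data (illustrative values only).
dwellings <- data.frame(
  period     = c("2015", "2015", "2016"),
  price      = c(250000, 310000, 275000),
  floor_area = c(80, 120, 95)
)

# Transform the independent variable in advance and store it as a column.
dwellings$log_floor_area <- log(dwellings$floor_area)

# The new column name, not the expression log(floor_area), is what would be
# passed as continuous_variables. The price column stays untransformed.
continuous_variables <- "log_floor_area"
dependent_variable <- "price"
```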
table with index, imputation averages, number of observations and confidence intervals per period
Farley Ishaak
The geometric average is calculated as: exp(mean(log(values)))
calculate_geometric_average(values)
values |
series with numeric values |
geometric average
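The formula above can be written out directly in base R. This is a sketch of the stated equation under the hypothetical name `geometric_average_sketch`; it is not the package's source code.

```r
# Geometric average as exp(mean(log(values))); assumes strictly positive input.
geometric_average_sketch <- function(values) {
  exp(mean(log(values)))
}

# The geometric average of 2 and 8 is 4 (up to floating-point rounding),
# whereas the arithmetic mean would be 5.
result <- geometric_average_sketch(c(2, 8))
print(result)
```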
Farley Ishaak
The parameters 'dependent_variable', 'continuous_variables' and 'categorical_variables' define a regression model. With this model, a direct series of index figures is estimated by means of hedonic regression.
calculate_laspeyres(dataset, period_variable, dependent_variable, continuous_variables, categorical_variables, reference_period = NULL, index = TRUE, number_of_observations = FALSE, imputation = FALSE)
dataset |
table with data (does not need to be a selection of relevant variables) |
period_variable |
variable in the table with periods |
dependent_variable |
usually the sale price |
continuous_variables |
vector with quality determining numeric variables (no dummies) |
categorical_variables |
vector with quality determining categorical variables (also dummies) |
reference_period |
period or group of periods that will be set to 100 (numeric/string) |
index |
include the index column in the output? (default = TRUE) |
number_of_observations |
show the number of observations per period? (default = FALSE) |
imputation |
display the underlying average imputation values? (default = FALSE) |
N.B.: the independent variables must be supplied already transformed. For example, do not pass log(floor_area); transform the variable in advance and pass log_floor_area. This does not apply to the dependent variable, which should be entered untransformed.
There is no need to filter the data on relevant variables or complete records beforehand; the function takes care of this.
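To give a feel for what a hedonic imputation Laspeyres index involves, here is a sketch of one common textbook formulation in base R. It is an assumption-laden illustration, not the package's implementation: it assumes a log-linear model fitted separately per period, uses simulated data, and evaluates every period's model on the dwellings sold in the base period. All object names are hypothetical.

```r
set.seed(1)
n <- 60

# Simulated sales for three periods (hypothetical data).
sales <- data.frame(
  period         = rep(c("2015", "2016", "2017"), each = n),
  log_floor_area = log(runif(3 * n, 50, 200)),
  neighbourhood  = factor(sample(c("A", "B"), 3 * n, replace = TRUE))
)
growth <- c("2015" = 0, "2016" = 0.05, "2017" = 0.10)
sales$price <- exp(
  8 + 0.8 * sales$log_floor_area +
    0.2 * (sales$neighbourhood == "B") +
    growth[sales$period] + rnorm(3 * n, sd = 0.05)
)

# Dwellings sold in the base period: their characteristics are held fixed.
base <- sales[sales$period == "2015", ]
periods <- unique(sales$period)

# For each period: fit a hedonic model, then impute prices for the base sample.
imputed <- sapply(periods, function(p) {
  fit <- lm(log(price) ~ log_floor_area + neighbourhood,
            data = sales[sales$period == p, ])
  exp(mean(predict(fit, newdata = base)))  # geometric mean of imputed prices
})

# Scale so that the base period equals 100.
laspeyres <- 100 * imputed / imputed[["2015"]]
print(round(laspeyres, 1))
```

Because the same base-period dwellings are priced in every period, changes in the series reflect price movements rather than changes in the mix of dwellings sold.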
table with index, imputation averages, number of observations and confidence intervals per period
Farley Ishaak
The parameters 'dependent_variable', 'continuous_variables' and 'categorical_variables' define a regression model. With this model, a direct series of index figures is estimated by means of hedonic regression.
calculate_paasche(dataset, period_variable, dependent_variable, continuous_variables, categorical_variables, reference_period = NULL, index = TRUE, number_of_observations = FALSE, imputation = FALSE)
dataset |
table with data (does not need to be a selection of relevant variables) |
period_variable |
variable in the table with periods |
dependent_variable |
usually the sale price |
continuous_variables |
vector with quality determining numeric variables (no dummies) |
categorical_variables |
vector with quality determining categorical variables (also dummies) |
reference_period |
period or group of periods that will be set to 100 (numeric/string) |
index |
include the index column in the output? (default = TRUE) |
number_of_observations |
show the number of observations per period? (default = FALSE) |
imputation |
display the underlying average imputation values? (default = FALSE) |
N.B.: the independent variables must be supplied already transformed. For example, do not pass log(floor_area); transform the variable in advance and pass log_floor_area. This does not apply to the dependent variable, which should be entered untransformed.
There is no need to filter the data on relevant variables or complete records beforehand; the function takes care of this.
table with index, imputation averages, number of observations and confidence intervals per period
Farley Ishaak
Central hub function to calculate index figures using different methods.
calculate_price_index(method, dataset, period_variable, dependent_variable, continuous_variables, categorical_variables, reference_period = NULL, number_of_observations = TRUE, periods_in_year = 4, production_since = NULL, number_preliminary_periods = 3, resting_points = FALSE, index = TRUE, imputation = FALSE)
method |
one of: "fisher", "laspeyres", "paasche", "hmts" |
dataset |
data frame with input data |
period_variable |
name of the variable indicating time periods |
dependent_variable |
usually the price |
continuous_variables |
vector with numeric quality-determining variables |
categorical_variables |
vector with categorical variables (also dummies) |
reference_period |
period or group of periods that will be set to 100 |
number_of_observations |
show number of observations? Default = TRUE |
periods_in_year |
(HMTS only) number of periods per year (e.g. 12 for months) |
production_since |
(HMTS only) start period for production simulation |
number_preliminary_periods |
(HMTS only) number of preliminary periods |
resting_points |
(HMTS only) return detailed outputs? Default = FALSE |
index |
(Laspeyres/Paasche only) include index column? Default = TRUE |
imputation |
(Laspeyres/Paasche only) include imputation values? Default = FALSE |
A data.frame (or a list when method = "hmts" and resting_points = TRUE)
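A "central hub" of this kind is typically a thin dispatcher that validates the method name and forwards the remaining arguments. The sketch below shows that pattern with hypothetical stand-in functions (`toy_fisher()` and so on) that merely return a label; it does not reproduce how calculate_price_index() is actually written.

```r
# Hypothetical stand-ins for the method-specific functions.
toy_fisher    <- function(dataset, ...) "fisher result"
toy_laspeyres <- function(dataset, ...) "laspeyres result"
toy_paasche   <- function(dataset, ...) "paasche result"
toy_hmts      <- function(dataset, ...) "hmts result"

# Dispatcher: check the method name, then forward all other arguments.
toy_price_index <- function(method = c("fisher", "laspeyres", "paasche", "hmts"),
                            dataset, ...) {
  method <- match.arg(method)  # errors on an unknown method name
  switch(method,
    fisher    = toy_fisher(dataset, ...),
    laspeyres = toy_laspeyres(dataset, ...),
    paasche   = toy_paasche(dataset, ...),
    hmts      = toy_hmts(dataset, ...)
  )
}

print(toy_price_index("paasche", dataset = NULL))
```

With this layout, method-specific arguments only matter to the function that receives them, which matches the "(HMTS only)" and "(Laspeyres/Paasche only)" notes in the argument list.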
Vivek Gajadhar
```r
# Laspeyres index
Tbl_Laspeyres <- calculate_price_index(
  method = "laspeyres",
  dataset = data_constraxion,
  period_variable = "period",
  dependent_variable = "price",
  continuous_variables = "floor_area",
  categorical_variables = "neighbourhood_code",
  reference_period = 2015,
  number_of_observations = TRUE,
  imputation = FALSE
)
head(Tbl_Laspeyres)

# Paasche index
Tbl_Paasche <- calculate_price_index(
  method = "paasche",
  dataset = data_constraxion,
  period_variable = "period",
  dependent_variable = "price",
  continuous_variables = "floor_area",
  categorical_variables = "neighbourhood_code",
  reference_period = 2015,
  number_of_observations = TRUE,
  imputation = FALSE
)
head(Tbl_Paasche)

# Fisher index (geometric mean of Laspeyres and Paasche)
Tbl_Fisher <- calculate_price_index(
  method = "fisher",
  dataset = data_constraxion,
  period_variable = "period",
  dependent_variable = "price",
  continuous_variables = "floor_area",
  categorical_variables = "neighbourhood_code",
  reference_period = 2015,
  number_of_observations = TRUE
)
head(Tbl_Fisher)
```
This function is used by calculate_trend_line_KFAS().
custom_update_function(params, model)
params |
start values |
model |
state space model |
New model
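For readers unfamiliar with update functions for state space models: in the KFAS package, `fitSSM()` accepts an `updatefn` argument of the form `function(pars, model)` that writes the current parameter values into the model and returns it. The sketch below shows that general shape under the hypothetical name `sketch_update_function`. It uses a plain list as a stand-in for a model so that it runs without KFAS, and it assumes a model with one observation variance (H) and one state variance (Q); the package's own function may differ.

```r
# Update function of the shape expected by KFAS::fitSSM(updatefn = ...).
sketch_update_function <- function(params, model) {
  # Parameters are on the log scale so that the variances stay positive.
  model$H[1, 1, 1] <- exp(params[1])  # observation noise variance
  model$Q[1, 1, 1] <- exp(params[2])  # state noise variance
  model
}

# Plain list standing in for a state space model (assumption for illustration).
toy_model <- list(
  H = array(NA_real_, dim = c(1, 1, 1)),
  Q = array(NA_real_, dim = c(1, 1, 1))
)

updated <- sketch_update_function(c(log(0.5), log(0.1)), toy_model)
print(c(H = updated$H[1, 1, 1], Q = updated$Q[1, 1, 1]))
```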
Vivek Gajadhar
A subset of data from a fictitious real estate data frame containing transaction prices and some categorical and numerical characteristics of each dwelling.
data_constraxion
A data frame with 7,800 rows and 6 columns:
A (string) vector indicating a time period
A (string) vector indicating the transaction price of the dwelling
A real-valued vector of (the logarithm of) the floor area of the dwelling
A real-valued vector of (the logarithm of) the distance of the dwelling to the nearest train station
A categorical code/string referring to the neighbourhood the dwelling belongs to
A vector indicating whether the dwelling belongs to a large city or not
A fictitious dataset for illustration purposes
```r
data(data_constraxion)
head(data_constraxion)
```