Title: | Composite 'Indicator' Construction and Imputation Data |
---|---|
Description: | Functions for constructing composite indicators, imputing missing data, and evaluating imputation techniques, together with tools for data normalization. The methodologies implemented in the 'Indicator' package are detailed in: OECD/European Union/EC-JRC (2008), "Handbook on Constructing Composite Indicators: Methodology and User Guide", OECD Publishing, Paris, <DOI:10.1787/533411815016>, Matteo Mazziotta & Adriano Pareto, (2018) "Measuring Well-Being Over Time: The Adjusted Mazziotta–Pareto Index Versus Other Non-compensatory Indices" <DOI:10.1007/s11205-017-1577-5> and De Muro P., Mazziotta M., Pareto A. (2011), "Composite Indices of Development and Poverty: An Application to MDGs" <DOI:10.1007/s11205-010-9727-z>. |
Authors: | Gianmarco Borrata [aut, cre], Pasquale Pipiciello [aut] |
Maintainer: | Gianmarco Borrata <[email protected]> |
License: | Unlimited |
Version: | 0.1.3 |
Built: | 2024-11-28 00:06:34 UTC |
Source: | CRAN |
Provides three main categories of functions: construction of composite indicators, imputation and evaluation of missing data, and data normalization.
Key features include:
Construction of composite indicators, such as Mazziotta-Pareto Index, Adjusted Mazziotta-Pareto Index, Geometric aggregation, Linear aggregation, and other functions;
Imputation of missing data through techniques such as Linear Regression Imputation, Hot Deck Imputation, etc;
Evaluation of missing data imputation using metrics such as R^2, RMSE, and MAE;
Several functions to standardize and normalize data, such as Standardization by Adjusted Mazziotta-Pareto method, Normalization by Adjusted Mazziotta-Pareto method, and other functions.
Gianmarco Borrata <[email protected]> and Pasquale Pipiciello <[email protected]>
OECD/European Union/EC-JRC (2008), "Handbook on Constructing Composite Indicators: Methodology and User Guide", OECD Publishing, Paris, <DOI:10.1787/533411815016>
Matteo Mazziotta & Adriano Pareto (2018), "Measuring Well-Being Over Time: The Adjusted Mazziotta–Pareto Index Versus Other Non-compensatory Indices",Social Indicators Research, Springer, vol. 136(3), pages 967-976, April <DOI:10.1007/s11205-017-1577-5>
De Muro P., Mazziotta M., Pareto A. (2011), "Composite Indices of Development and Poverty: An Application to MDGs",Social Indicators Research, Volume 104, Number 1, pp. 1-18 <DOI:10.1007/s11205-010-9727-z>
Useful links:
Report bugs at https://github.com/GianmarcoBorrata/Indicator/issues
This function identifies and returns the names of the columns in a DataFrame that contain missing values (NaN). It is particularly useful for missing data imputation and preliminary analysis, allowing for quick identification of columns that need to be handled due to the presence of missing values
columns_with_nan(data)
data |
The DataFrame's rows represent observations and the columns represent variables |
It returns a vector with the names of the columns that contain missing values
data("airquality")
columns_with_nan(airquality)
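The same check can be sketched in base R. This is a minimal illustration, not the package's source code; the helper name `cols_with_na` is hypothetical.

```r
# Hypothetical base-R equivalent of the check described above.
cols_with_na <- function(df) {
  # colSums(is.na(df)) counts the missing entries in each column
  names(df)[colSums(is.na(df)) > 0]
}

data("airquality")
na_cols <- cols_with_na(airquality)
print(na_cols)
```

In `airquality`, only `Ozone` and `Solar.R` contain missing values, so these are the columns that need imputation before further analysis.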
Diagnosis of collinearity in matrix X
compute_CI(matrix)
matrix |
a matrix of data where rows = observations and columns = variables |
Collinearity can inflate the variance of the estimated regression coefficients and degrade numerical stability. The condition indices are calculated from the eigenvalues of the crossproduct matrix of the scaled but uncentered explanatory variables. Indices > 30 may indicate collinearity
It returns the condition index of the matrix
Belsley, D. , Kuh, E. and Welsch, R. E. (1979), Regression Diagnostics: Identifying Influential Data and Sources of Collinearity, John Wiley (New York)
data("Education")
compute_CI(Education)
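The calculation described above can be sketched in base R as follows. This is an illustration under stated assumptions rather than the package's implementation: the helper `condition_indices` is hypothetical, columns are scaled to unit length, and the index is taken as the square root of the largest eigenvalue divided by each eigenvalue, as in Belsley, Kuh and Welsch.

```r
# Hypothetical helper: condition indices of a data matrix.
condition_indices <- function(X) {
  X <- as.matrix(X)
  # Scale each column to unit length without centering it
  X_scaled <- sweep(X, 2, sqrt(colSums(X^2)), "/")
  # Eigenvalues of the crossproduct matrix, in decreasing order
  ev <- eigen(crossprod(X_scaled), symmetric = TRUE, only.values = TRUE)$values
  sqrt(max(ev) / ev)
}

set.seed(1)
x1 <- rnorm(50)
x2 <- x1 + rnorm(50, sd = 0.001)  # almost an exact copy of x1
ci <- condition_indices(cbind(x1, x2, x3 = rnorm(50)))
print(ci)
```

Because `x2` nearly duplicates `x1`, the largest condition index is far above the threshold of 30.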
This dataset contains educational and cultural data related to Italian regions. Each row represents a region, and there are 20 Italian regions. The eight variables included in the dataset provide information on various aspects of the educational system and cultural activities in the regions in 2020.
Education
A data frame with 20 observations on the following 8 variables:
Participation in the school system of 4-5 year olds
People with at least a high school diploma (25-64 years old)
Graduates and other tertiary degrees (30-34 years)
Participation in continuing education
People earning a STEM tertiary degree in the year
Cultural participation outside the home
Reading of books and newspapers
Library use
https://www.istat.it/en/well-being-and-sustainability/the-measurement-of-well-being/indicators
The purpose of the Geometric Aggregation function is to calculate a synthetic index based on the geometric mean for a set of variables
geometric_aggregation( data, weights = rep(1/ncol(data), ncol(data)), geo_wo = 0 )
data |
dataframe with rows = observations and columns = quantitative variables |
weights |
vector of weights (default all weights = 1/ ncol(dataframe)) |
geo_wo |
(geometric mean workaround to deal with negative values), positive number to make all variables positive |
This is simply the product of each indicator raised to the power of its weight, all raised to the power of the inverse of the sum of the weights
The geometric mean is less compensatory than the arithmetic mean – low values in one indicator only partially substitute high values in others. For this reason, the geometric mean may sometimes be preferred when indicators represent “essentials”. An example might be quality of life: a longer life expectancy perhaps should not compensate severe restrictions on personal freedoms
It returns a dataframe with rows = observations and column = composite indicator
OECD/European Union/EC-JRC (2008), Handbook on Constructing Composite Indicators: Methodology and User Guide, OECD Publishing, Paris, <https://doi.org/10.1787/9789264043466-en>
data("Education")
Indicator=geometric_aggregation(Education)
print(Indicator)
#-----When there are negative values
set.seed(123)
Data=matrix(rnorm(100),nrow = 10,ncol = 10)
Indicator=geometric_aggregation(Data,geo_wo = 100)
print(Indicator)
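The weighted geometric mean described above can be written out in a few lines of base R. This is a sketch, not the package's code: the helper `geo_aggregate` is hypothetical, and its `shift` argument assumes that the workaround value is added to every entry, which the documentation does not state explicitly.

```r
# Hypothetical helper: weighted geometric mean of each row.
geo_aggregate <- function(X, w = rep(1 / ncol(X), ncol(X)), shift = 0) {
  X <- as.matrix(X) + shift  # assumption: shift plays the role of geo_wo
  stopifnot(all(X > 0), length(w) == ncol(X))
  # exp(sum(w * log(x)) / sum(w)) equals prod(x^w)^(1 / sum(w))
  exp(as.vector(log(X) %*% w) / sum(w))
}

X <- rbind(c(1, 4), c(2, 8), c(3, 3))
gm <- geo_aggregate(X)
print(gm)
```

With equal weights and two indicators this reduces to the square root of the product of each row. Working on the log scale avoids overflow when many indicators are multiplied.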
The get_all_performance function is designed to evaluate different methods of imputing missing values in a dataset
get_all_performance(data, to_impute, regressors)
data |
dataframe with rows = observations and columns = quantitative variables |
to_impute |
string, name of the variable that contains the missing values (NA) to impute |
regressors |
vector of strings with the names of the variables used by the 1st and 4th imputation methods |
The function calculates performance metrics, such as:
- R^2,
- RMSE, and
- MAE
for each imputation method
Supported Imputation Methods:
1. Linear Regression Imputation (lm_imputation): it uses a linear regression model to predict and impute missing values
2. Median Imputation (median_imputation): it replaces missing values with the median of observed values
3. Mean Imputation (mean_imputation): it replaces missing values with the mean of observed values
4. Hot Deck Imputation (hot_deck_imputation): it replaces missing values with similar observed values
5. Expectation-Maximization Imputation (EM_imputation): it uses the Expectation-Maximization algorithm to estimate and impute missing values
It evaluates different methods of imputing missing values and calculates performance metrics for each method
It returns a performance measures dataframe with rows = methods and columns = methods' performances
For the methods Median Imputation and Mean Imputation, it is not possible to calculate the R^2 value. All imputed values are identical, so their standard deviation is zero and the denominator of the following R^2 formula vanishes:
$$R^2 = \left[ \frac{1}{N} \sum_{i=1}^{N} \frac{(P_i - \bar{P})(O_i - \bar{O})}{\sigma_P \, \sigma_O} \right]^2$$
where:
- $N$ is the number of imputations,
- $O_i$ are the observed data points,
- $P_i$ are the imputed data points,
- $\bar{O}$ is the average of the observed data,
- $\bar{P}$ is the average of the imputed data,
- $\sigma_P$ is the standard deviation of the imputed data,
- $\sigma_O$ is the standard deviation of the observed data
OECD/European Union/EC-JRC (2008), Handbook on Constructing Composite Indicators: Methodology and User Guide, OECD Publishing, Paris, <https://doi.org/10.1787/9789264043466-en>
data("airquality")
regressors<-colnames(airquality[,c(3,4)])
suppressWarnings(get_all_performance(data =airquality,"Ozone",regressors = regressors))
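The three performance measures can be sketched in base R as follows. The helper `imputation_metrics` and the toy vectors are hypothetical; R^2 is computed here as the squared correlation between observed and imputed values, which is an assumption about how the package evaluates it.

```r
# Hypothetical helper: R^2, RMSE and MAE between observed and imputed values.
imputation_metrics <- function(observed, imputed) {
  c(R2   = cor(observed, imputed)^2,
    RMSE = sqrt(mean((imputed - observed)^2)),
    MAE  = mean(abs(imputed - observed)))
}

observed <- c(10, 20, 30, 40)
imputed  <- c(12, 18, 33, 39)
m <- imputation_metrics(observed, imputed)
print(m)

# Mean imputation yields a constant vector: its standard deviation is zero,
# so the correlation (and hence R^2) is undefined and R returns NA with a warning.
constant <- rep(mean(observed), length(observed))
r2_constant <- suppressWarnings(cor(observed, constant)^2)
print(r2_constant)
```

The warning raised by the constant case is the likely reason why the package examples wrap the call in `suppressWarnings()`.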
The get_all_performance_boot function is designed to evaluate different methods of imputing missing values into a dataset. The evaluation is performed using bootstrapping to ensure robustness of the results
get_all_performance_boot(data, to_impute, regressors, nb = 1)
data |
dataframe with rows = observations and columns = quantitative variables |
to_impute |
string, name of the variable that contains the missing values (NA) to impute |
regressors |
vector of strings with the names of the variables used by the 1st and 4th imputation methods |
nb |
number of bootstrap samples |
The function calculates performance metrics, such as:
- R^2,
- RMSE, and
- MAE
for each imputation method
Supported Imputation Methods:
1. Linear Regression Imputation (lm_imputation): it uses a linear regression model to predict and impute missing values
2. Median Imputation (median_imputation): it replaces missing values with the median of observed values
3. Mean Imputation (mean_imputation): it replaces missing values with the mean of observed values
4. Hot Deck Imputation (hot_deck_imputation): it replaces missing values with similar observed values
5. Expectation-Maximization Imputation (EM_imputation): it uses the Expectation-Maximization algorithm to estimate and impute missing values
Evaluate different methods of imputing missing values using bootstrapping and calculate performance metrics for each method
It returns a performance measures dataframe with rows = methods and columns = methods' performances averaged over bootstraps.
For the methods Median Imputation and Mean Imputation, it is not possible to calculate the R^2 value. All imputed values are identical, so their standard deviation is zero and the denominator of the following R^2 formula vanishes:
$$R^2 = \left[ \frac{1}{N} \sum_{i=1}^{N} \frac{(P_i - \bar{P})(O_i - \bar{O})}{\sigma_P \, \sigma_O} \right]^2$$
where:
- $N$ is the number of imputations,
- $O_i$ are the observed data points,
- $P_i$ are the imputed data points,
- $\bar{O}$ is the average of the observed data,
- $\bar{P}$ is the average of the imputed data,
- $\sigma_P$ is the standard deviation of the imputed data,
- $\sigma_O$ is the standard deviation of the observed data.
OECD/European Union/EC-JRC (2008), Handbook on Constructing Composite Indicators: Methodology and User Guide, OECD Publishing, Paris, <https://doi.org/10.1787/9789264043466-en>
data("airquality")
regressors<-colnames(airquality[,c(3,4)])
suppressWarnings(get_all_performance_boot(data =airquality,"Ozone",regressors = regressors,nb=100))
The Jevons_aggregation function computes an index using the Jevons method of static aggregation. This method calculates the geometric mean, multiplied by 100.
Jevons_aggregation(data)
data |
dataframe with rows = observations and columns = quantitative variables |
It returns a dataframe with rows = observations and column = composite indicator
Massoli, P., Mazziotta, M., Pareto, A., Rinaldelli, C. (2013). Synthesis Methodologies and Spatial Analysis. Composite indices for BES, Days of Research in ISTAT, November 10-11, 2014
data("Education")
Indicator=Jevons_aggregation(Education)
print(Indicator)
This function computes a composite indicator by linear aggregation, i.e. as a weighted sum of the variables
linear_aggregation(data, weights = rep(1/ncol(data), ncol(data)))
data |
dataframe with rows = observations and columns = quantitative variables |
weights |
vector of weights (default all weights = 1/ ncol(dataframe)) |
It returns a dataframe with rows = observations and column = composite indicator
OECD/European Union/EC-JRC (2008), Handbook on Constructing Composite Indicators: Methodology and User Guide, OECD Publishing, Paris, <https://doi.org/10.1787/9789264043466-en>
data("Education")
Indicator=linear_aggregation(Education)
print(Indicator)
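Linear aggregation amounts to a weighted sum of the columns, which can be sketched with a matrix product. The helper `linear_aggregate` is hypothetical, and whether the package normalizes the data before aggregating is not stated here, so the sketch assumes already normalized input.

```r
# Hypothetical helper: weighted sum of the columns for each observation.
linear_aggregate <- function(X, w = rep(1 / ncol(X), ncol(X))) {
  stopifnot(length(w) == ncol(X))
  as.vector(as.matrix(X) %*% w)
}

X <- data.frame(a = c(0.2, 0.8), b = c(0.6, 0.4))
equal_w  <- linear_aggregate(X)                 # default: equal weights
custom_w <- linear_aggregate(X, c(0.75, 0.25))  # more weight on column a
print(equal_w)
print(custom_w)
```

Unlike the geometric mean, this aggregation is fully compensatory: a low value in one column can be offset by a high value in another.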
The Adjusted Mazziotta-Pareto Index (AMPI) is a composite index for summarizing a set of indicators that are assumed to be non-substitutable, i.e., all components must be balanced. It is based on a non-linear function which, starting from the arithmetic mean, introduces a penalty for the units with unbalanced values of the indicators
linear_aggregation_AMPI(data, pol = "pos")
data |
dataframe with rows = observations and columns = quantitative variables |
pol |
polarity; the default is "pos" (positive), otherwise write "neg" (see details) |
The ‘polarity’ of an indicator is the sign of the relation between the indicator and the phenomenon to be measured ( + if the indicator represents a dimension considered positive and - otherwise)
It returns a dataframe with rows = observations and column = composite indicator
Matteo Mazziotta & Adriano Pareto, 2018. "Measuring Well-Being Over Time: The Adjusted Mazziotta–Pareto Index Versus Other Non-compensatory Indices," Social Indicators Research: An International and Interdisciplinary Journal for Quality-of-Life Measurement, Springer, vol. 136(3), pages 967-976, April
data("Education")
Indicator=linear_aggregation_AMPI(Education)
print(Indicator)
#----With negative polarity
Indicator_neg=linear_aggregation_AMPI(Education,"neg")
print(Indicator_neg)
The Mazziotta–Pareto index (MPI) is a composite index for summarizing a set of individual indicators that are assumed to be not fully substitutable. It is based on a non-linear function which, starting from the arithmetic mean of the normalized indicators, introduces a penalty for the units with unbalanced values of the indicators
linear_aggregation_MPI(data, pol = "pos")
data |
dataframe with rows = observations and columns = quantitative variables |
pol |
polarity; the default is "pos" (positive), otherwise write "neg" (see details) |
The ‘polarity’ of an indicator is the sign of the relation between the indicator and the phenomenon to be measured ( + if the indicator represents a dimension considered positive and - otherwise)
It returns a dataframe with rows = observations and column = composite indicator
De Muro P., Mazziotta M., Pareto A. (2011), "Composite Indices of Development and Poverty: An Application to MDGs", Social Indicators Research, Volume 104, Number 1, pp. 1-18
data("Education")
Indicator=linear_aggregation_MPI(Education)
print(Indicator)
#----With negative polarity
Indicator_neg=linear_aggregation_MPI(Education,"neg")
print(Indicator_neg)
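The penalty mechanism of the MPI can be illustrated with a small base-R sketch based on the formula in De Muro, Mazziotta and Pareto (2011): indicators are rescaled to mean 100 and standard deviation 10, and the row mean M is corrected by S * cv, where S is the row standard deviation and cv = S / M. The helpers `pop_sd` and `mpi_sketch` are hypothetical, the penalty is subtracted (the variant for a positively oriented phenomenon), and the mapping to the package's `pol` argument is not assumed.

```r
# Population standard deviation (divides by n, not n - 1)
pop_sd <- function(x) sqrt(mean((x - mean(x))^2))

# Hypothetical helper: MPI with the penalty subtracted from the mean.
mpi_sketch <- function(X) {
  X <- as.matrix(X)
  # Rescale each column to mean 100 and standard deviation 10
  Z <- apply(X, 2, function(x) 100 + 10 * (x - mean(x)) / pop_sd(x))
  M <- rowMeans(Z)          # mean of the rescaled indicators per unit
  S <- apply(Z, 1, pop_sd)  # their standard deviation per unit
  M - S * (S / M)           # penalty = S * cv, with cv = S / M
}

X <- cbind(a = c(1, 2, 3), b = c(3, 2, 1))
mpi <- mpi_sketch(X)
print(mpi)
```

All three units have the same mean of 100, but the first and third are unbalanced (one low and one high indicator) and are penalized, while the balanced second unit keeps its mean.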
The lm_imputation function aims to replace missing values (NA) in a dataset with values estimated using a linear regression model. This technique allows the existing relationships between variables in the dataset to be used to accurately estimate missing values
lm_imputation(data, to_impute, regressors)
data |
dataframe with rows = observations and columns = quantitative variables |
to_impute |
string, name of the variable that contains the missing values (NA) to impute |
regressors |
vector of strings with the names of the variables to use in the linear regression imputation |
It returns a dataframe with imputed values
OECD/European Union/EC-JRC (2008), Handbook on Constructing Composite Indicators: Methodology and User Guide, OECD Publishing, Paris, <https://doi.org/10.1787/9789264043466-en>
data("airquality")
regressors<-colnames(airquality[,c(3,4)])
lm_imputation(data =airquality,"Ozone",regressors = regressors)
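The idea behind regression imputation can be sketched with `lm()` and `predict()` from base R. This illustrates the technique only and is not the package's implementation; it uses the same regressors as the example above (`Wind` and `Temp`, columns 3 and 4 of `airquality`).

```r
data("airquality")

# Fit the model on the rows where Ozone is observed (lm drops NA rows by default)
fit <- lm(Ozone ~ Wind + Temp, data = airquality)

# Predict Ozone for the rows where it is missing and fill them in
missing <- is.na(airquality$Ozone)
imputed <- airquality
imputed$Ozone[missing] <- predict(fit, newdata = airquality[missing, ])

print(sum(missing))           # number of imputed values
print(anyNA(imputed$Ozone))   # no missing values remain
```

This works here because `Wind` and `Temp` have no missing values; a regressor with gaps would leave the corresponding predictions as NA.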
Function to calculate the mean absolute difference of rank for different methods
MAD(matrix_data)
matrix_data |
data matrix of indicator |
The function calculates the mean absolute difference of rank between different methods. It builds the ranking matrix column by column (the highest value receives rank 1), takes the absolute differences between the ranks in the different columns, and averages them for each method
It returns a data frame of mean absolute difference of rank for different methods
Matteo Mazziotta & Adriano Pareto, 2018. "Measuring Well-Being Over Time: The Adjusted Mazziotta–Pareto Index Versus Other Non-compensatory Indices," Social Indicators Research: An International and Interdisciplinary Journal for Quality-of-Life Measurement, Springer, vol. 136(3), pages 967-976, April
data("Education")
Indicator_MPI=linear_aggregation_MPI(Education)
Indicator_AMPI=linear_aggregation_AMPI(Education)
Indicator_GA=geometric_aggregation(Education)
All_Indicator=cbind(Indicator_MPI,Indicator_AMPI,Indicator_GA)
MAD=MAD(All_Indicator)
print(MAD)
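For two composite indicators, the mean absolute difference of rank can be sketched as below. The helper `mean_abs_rank_diff` is hypothetical and compares one pair of columns only; how the package combines several columns is not specified here.

```r
# Hypothetical helper: mean absolute rank difference between two score vectors.
mean_abs_rank_diff <- function(a, b) {
  # rank(-x) gives rank 1 to the highest value
  mean(abs(rank(-a) - rank(-b)))
}

method_a <- c(90, 80, 70, 60)
method_b <- c(80, 90, 70, 60)  # the first two units swap places
mad_ab <- mean_abs_rank_diff(method_a, method_b)
print(mad_ab)
```

Two of the four units move by one position, so the average shift is 0.5 ranks; identical rankings give 0.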
Min-max normalization transforms each value by subtracting its minimum and dividing by its range (maximum-minimum). The result is a new variable with a minimum of zero and a maximum of one
min_max(data)
data |
dataframe with rows = observations and columns = quantitative variables |
If a variable has negative polarity, change its values to negative before normalizing
It returns a dataframe of normalized data
OECD/European Union/EC-JRC (2008), Handbook on Constructing Composite Indicators: Methodology and User Guide, OECD Publishing, Paris, <https://doi.org/10.1787/9789264043466-en>
data("Education")
Normalization=min_max(Education)
print(Normalization)
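The min-max transformation can be sketched column by column in base R. The helper `min_max_sketch` is hypothetical and assumes that every column has at least two distinct values, otherwise the range is zero.

```r
# Hypothetical helper: rescale every column to the interval [0, 1].
min_max_sketch <- function(df) {
  as.data.frame(lapply(df, function(x) (x - min(x)) / (max(x) - min(x))))
}

df <- data.frame(a = c(2, 4, 6), b = c(10, 30, 20))
norm_df <- min_max_sketch(df)
print(norm_df)
```

Each column now has a minimum of zero and a maximum of one, regardless of its original unit of measurement.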
This function normalizes data for the geometric mean: the normalized data are multiplied by 198 and 1 is added, with positive or negative polarity
min_max_GM(data, pol = "pos")
data |
dataframe with rows = observations and columns = quantitative variables |
pol |
polarity; the default is "pos" (positive), otherwise write "neg" |
The ‘polarity’ of an indicator is the sign of the relation between the indicator and the phenomenon to be measured ( + if the indicator represents a dimension considered positive and - otherwise)
It returns a dataframe of normalized data
Massoli, P., Mazziotta, M., Pareto, A., Rinaldelli, C. (2013). Synthesis Methodologies and Spatial Analysis. Composite indices for BES, Days of Research in ISTAT, November 10-11, 2014
data("Education")
Normalization=min_max_GM(Education)
print(Normalization)
#----With negative polarity
Normalization_neg=min_max_GM(Education,"neg")
print(Normalization_neg)
This transformation considers the indicators which are above and below an arbitrarily defined threshold, p, around the mean. The threshold p builds a neutral region around the mean, where the transformed indicator is zero. This reduces the sharp discontinuity, from -1 to +1, which exists across the mean value to two minor discontinuities, from -1 to 0 and from 0 to +1, across the thresholds
normalization_abov_below_mean(data, p = 0.01)
data |
dataframe with rows = observations and columns = quantitative variables |
p |
threshold for the window |
The function normalizes the data to the values -1, 0, and 1
It returns a dataframe of normalized data
OECD/European Union/EC-JRC (2008), Handbook on Constructing Composite Indicators: Methodology and User Guide, OECD Publishing, Paris, <https://doi.org/10.1787/9789264043466-en>
data("Education")
Indicator=normalization_abov_below_mean(Education)
print(Indicator)
#---With different threshold
Indicator=normalization_abov_below_mean(Education,p=0.1)
print(Indicator)
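The thresholding rule can be sketched for a single variable as follows. The helper `above_below_mean` is hypothetical, and it assumes that each value is compared with the mean through the ratio w = x / mean(x), with a neutral region between 1 - p and 1 + p, following the OECD handbook.

```r
# Hypothetical helper: map values to -1, 0 or 1 around the mean.
above_below_mean <- function(x, p = 0.01) {
  w <- x / mean(x)  # assumption: ratio of each value to the column mean
  ifelse(w > 1 + p, 1, ifelse(w < 1 - p, -1, 0))
}

x <- c(50, 99.5, 100, 100.5, 150)  # mean is 100
wide   <- above_below_mean(x, p = 0.01)
narrow <- above_below_mean(x, p = 0.001)
print(wide)
print(narrow)
```

With p = 0.01 the three values close to the mean fall in the neutral region; shrinking p to 0.001 leaves only the value exactly at the mean as neutral.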
The pca_weighting function is designed to perform a principal component analysis (PCA) on the input data to calculate weights that correct for overlapping information between related indicators. This process makes it possible to create a composite indicator that captures as much information as possible from individual indicators while reducing the dimensionality of the data
pca_weighting(data)
data |
dataframe with rows = observations and columns = quantitative variables |
It returns a dataframe with rows = observations and column = composite indicator
OECD/European Union/EC-JRC (2008), Handbook on Constructing Composite Indicators: Methodology and User Guide, OECD Publishing, Paris, <https://doi.org/10.1787/9789264043466-en>
data("Education")
Indicator_pca=pca_weighting(Education)
print(Indicator_pca)
This function evaluates the performance of various missing value imputation methods in a quantitative dataframe. It is designed to examine and compare five different imputation methods using standard performance measures
performance_nan_imputation(data, to_impute, regressors, method = 1)
data |
A dataframe containing the observations (rows) and quantitative variables (columns) to be analyzed. This dataframe includes variables with missing values to be imputed |
to_impute |
A string specifying the name of the variable in the dataframe that contains the missing values to be imputed |
regressors |
A vector of strings indicating the names of the variables to be used as regressors for imputation in the case of methods 1 (lm_imputation) and 4 (hot deck imputation) |
method |
An integer between 1 and 5 that specifies the imputation method to be used. The supported methods are: 1: lm_imputation (Imputation by linear model) 2: median imputation (imputation by median) 3: mean imputation (imputation by mean) 4: hot deck imputation (imputation via hot deck) 5: EM imputation (imputation via Expectation-Maximization) |
This function is useful for comparing the effectiveness of different methods of imputing missing values, allowing the most appropriate method to be chosen based on measured performance
The function returns a dataframe that contains a row for each imputation method and columns with performance measures. The performance measures included are:
R^2: Coefficient of Determination, which measures how well the imputed values fit the observed values
RMSE: Root Mean Squared Error, which provides a measure of the mean square deviation between imputed and observed values
MAE: Mean Absolute Error, which represents the mean absolute deviation between the imputed and observed values
OECD/European Union/EC-JRC (2008), Handbook on Constructing Composite Indicators: Methodology and User Guide, OECD Publishing, Paris, <https://doi.org/10.1787/9789264043466-en>
data("airquality")
regressors<-colnames(airquality[,c(3,4)])
#---Methods 1 = Imputation by linear model
performance_nan_imputation(data =airquality,"Ozone",regressors = regressors,method = 1)
#---Methods 2 = Imputation by Median
suppressWarnings(performance_nan_imputation(data =airquality,"Ozone",method = 2))
#---Methods 3 = Imputation by Mean
suppressWarnings(performance_nan_imputation(data =airquality,"Ozone",method = 3))
#---Methods 4 = Hot Deck imputation
performance_nan_imputation(data =airquality,"Ozone",regressors = regressors,method = 4)
#---Methods 5 = Expectation-Maximization imputation
performance_nan_imputation(data =airquality,"Ozone",regressors = regressors,method = 5)
This function applies rank aggregation. The highest value receives the first rank
rank_aggregation(data)
data |
dataframe with rows = observations and columns = quantitative variables |
It returns a dataframe with rows = observations and column = composite indicator
OECD/European Union/EC-JRC (2008), Handbook on Constructing Composite Indicators: Methodology and User Guide, OECD Publishing, Paris, <https://doi.org/10.1787/9789264043466-en>
data("Education")
Indicator_rank=rank_aggregation(Education)
print(Indicator_rank)
This function normalizes the data by the ranking method. It builds the ranking matrix column by column; the highest value receives rank 1
rank_normalisation(data)
data |
dataframe with rows = observations and columns = quantitative variables |
It returns a dataframe of normalized data
OECD/European Union/EC-JRC (2008), Handbook on Constructing Composite Indicators: Methodology and User Guide, OECD Publishing, Paris, <https://doi.org/10.1787/9789264043466-en>
data("Education")
Normalized_rank=rank_normalisation(Education)
print(Normalized_rank)
It performs a standardization of data, i.e., centering and scaling, so that the data is expressed in terms of standard deviation (i.e., mean = 0, SD = 1). When applied to a statistical model, this function extracts the dataset, standardizes it, and refits the model with this standardized version of the dataset
standardization(data)
data |
dataframe with rows = observations and columns = quantitative variables |
It returns a dataframe of scaled data
OECD/European Union/EC-JRC (2008), Handbook on Constructing Composite Indicators: Methodology and User Guide, OECD Publishing, Paris, <https://doi.org/10.1787/9789264043466-en>
data("Education")
Standardization=standardization(Education)
print(Standardization)
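Centering and scaling can be sketched with base R's `scale()`. This illustrates the transformation itself and is not the package's code; note that `scale()` uses the sample standard deviation (divisor n - 1), which may differ from the package's choice.

```r
df <- data.frame(a = c(1, 2, 3, 4), b = c(10, 20, 40, 80))

# scale() subtracts the column mean and divides by the column standard deviation
z <- as.data.frame(scale(df))
print(z)

print(colMeans(z))   # approximately 0 for every column
print(sapply(z, sd)) # 1 for every column
```

After the transformation both columns are expressed in standard deviation units, so they can be compared or aggregated despite their different original scales.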
This function standardizes the data with the Adjusted Mazziotta-Pareto method, with positive or negative polarity
Standardization_AMPI(data, pol = "pos")
data |
dataframe with rows = observations and columns = quantitative variables |
pol |
polarity; the default is "pos" (positive), otherwise write "neg" (see details) |
The ‘polarity’ of an indicator is the sign of the relation between the indicator and the phenomenon to be measured ( + if the indicator represents a dimension considered positive and - otherwise)
It returns a dataframe of normalized data
Matteo Mazziotta & Adriano Pareto, 2018. "Measuring Well-Being Over Time: The Adjusted Mazziotta–Pareto Index Versus Other Non-compensatory Indices," Social Indicators Research: An International and Interdisciplinary Journal for Quality-of-Life Measurement, Springer, vol. 136(3), pages 967-976, April
data("Education")
Standardization_AMPI=Standardization_AMPI(Education)
print(Standardization_AMPI)
#----With negative polarity
Standardization_AMPI_neg=Standardization_AMPI(Education,"neg")
print(Standardization_AMPI_neg)
This function standardizes the data with the Mazziotta-Pareto method
standardization_MPI(data)
data |
dataframe with rows = observations and columns = quantitative variables |
It returns a dataframe of normalized data
De Muro P., Mazziotta M., Pareto A. (2011), "Composite Indices of Development and Poverty: An Application to MDGs", Social Indicators Research, Volume 104, Number 1, pp. 1-18
data("Education")
Standardization_MPI=standardization_MPI(Education)
print(Standardization_MPI)