Title: | An Analyzer of International Large Scale Assessments in Education |
---|---|
Description: | An easy way to analyze international large-scale assessments and surveys in education or any other dataset that includes replicated weights (Balanced Repeated Replication (BRR) weights, Jackknife replicate weights,...) while also allowing for analysis with multiply imputed variables (plausible values). It supports the estimation of univariate statistics (e.g. mean, variance, standard deviation, quantiles), frequencies, correlation, linear regression and any other model already implemented in R that takes a data frame and weights as parameters. It also includes options to prepare the results for publication, following the table formatting standards of the Organization for Economic Cooperation and Development (OECD). |
Authors: | Rodolfo Ilizaliturri [aut, cre], Francesco Avvisati [aut], Francois Keslair [aut] |
Maintainer: | Rodolfo Ilizaliturri <[email protected]> |
License: | MIT + file LICENSE |
Version: | 1.5.3 |
Built: | 2025-02-17 13:32:19 UTC |
Source: | CRAN |
Group Averages
average_groups( res, data = NULL, group, by = NULL, over = NULL, est = NULL, svy = NULL, user_na = FALSE, ... )
res |
(dataframe) df of results with b. and se. to average |
data |
(dataframe) df from which to get replicated weights |
group |
(grp function) that takes arguments group.name, column, cases to create averages at the end of dataframe |
by |
(string vector) column in which we'll break down results |
over |
(vector string) columns over which to do analysis |
est |
(est function) that takes arguments what = estimate, tgt = target, rgr = regressor |
svy |
(string) Name of the survey to analyse, e.g. TALISSCH or TALISTCH |
user_na |
(bool) TRUE: show nature of user defined missing values for by.var |
... |
Additional arguments such as na_to_zero : (Bool) TRUE → will take NA as zero for the simple average calculation |
Dataframe with averages or weighted averages (totals) in the last rows for the selected cases
Transform coverage columns in Rrepest to dagger and double dagger according to the coverage (cvge) column. Default levels are 75 (dagger) and 50 (double dagger). If coverage is above both levels, no symbol is produced (empty). Used with the coverage option in Rrepest set to TRUE to produce columns with coverage percentages.
coverage_daggers(res_df, one_dagger = 75, two_dagger = 50)
res_df |
(data frame) Rrepest output with columns for coverage (cvge) |
one_dagger |
(numeric) Level at which the coverage is transformed into a dagger. 75 by default. |
two_dagger |
(numeric) Level at which the coverage is transformed into a double dagger. 50 by default |
Dataframe with daggers or double daggers in coverage column
cvge_data <- Rrepest(df_talis18, est = est("freq", "tt3g23o"), svy = "TALISTCH", by = "cntry", coverage = TRUE)
coverage_daggers(cvge_data, one_dagger = 95, two_dagger = 90)
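The threshold rule described above can be sketched in base R. This is only an illustration: `coverage_symbol` is a hypothetical helper, not a function of Rrepest, and the treatment of values exactly at a threshold (strictly below is assumed here) is an assumption.

```r
# Hypothetical helper (not part of Rrepest): map coverage percentages to symbols.
# Assumption: a value strictly below a level receives that level's symbol.
coverage_symbol <- function(cvge, one_dagger = 75, two_dagger = 50) {
  symbol <- rep("", length(cvge))                    # above both levels: empty
  symbol[which(cvge < one_dagger)] <- "\u2020"       # dagger
  symbol[which(cvge < two_dagger)] <- "\u2021"       # double dagger overrides dagger
  symbol
}

coverage_symbol(c(92, 60, 41))
coverage_symbol(c(96, 92, 80), one_dagger = 95, two_dagger = 90)
```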
Computes the coverage percentage for the column/variable of interest.
coverage_pct(df, by, x, w = NULL, limit = NULL)
df |
(data frame) Data to analyse |
by |
(string vector) Variable(s) used for tabulating the variable of interest |
x |
(string) Variable of interest for which to compute the number of valid (i.e. non-missing) observations |
w |
(string) Name of the variable containing the weights |
limit |
(numeric) Threshold at which, if lower, value will be TRUE |
Data frame containing the number of valid (i.e. non-missing) observations for the variable of interest
data(df_pisa18)
data(df_talis18)
coverage_pct(df = df_pisa18, by = "cnt", x = "wb173q03ha")
coverage_pct(df = df_talis18, by = "cntry", x = "tt3g01", w = "TCHWGT")
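A minimal base R sketch of a coverage percentage, assuming it is the (optionally weighted) share of non-missing observations within each group. `coverage_share`, the toy data and the output column name `cvge` are illustrative assumptions, not the package's implementation, and the sketch handles a single `by` variable only.

```r
# Hypothetical helper (not part of Rrepest): percentage of non-missing x per group.
coverage_share <- function(df, by, x, w = NULL) {
  weights <- if (is.null(w)) rep(1, nrow(df)) else df[[w]]
  valid <- !is.na(df[[x]])
  num <- tapply(weights * valid, df[[by]], sum)   # weight of valid observations
  den <- tapply(weights, df[[by]], sum)           # total weight
  out <- data.frame(names(den), as.numeric(100 * num / den))
  names(out) <- c(by, "cvge")
  out
}

# Toy data, invented for the illustration
toy <- data.frame(cnt = c("A", "A", "A", "A", "B", "B"),
                  score = c(1, NA, 3, 4, NA, 6),
                  wgt = c(1, 1, 1, 1, 3, 1))
coverage_share(toy, "cnt", "score")
coverage_share(toy, "cnt", "score", w = "wgt")
```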
A subset of noisy data from the Program for International Student Assessment (PISA) 2018 cycle for Mexico, Italy, and France.
This dataset is a subset of the PISA 2018 database produced by the OECD for the countries of France, Italy, and Mexico.
df_pisa18 data(df_pisa18)
df_pisa18
A data frame in tibble format with 1269 rows and 1106 columns from which only some relevant variables are listed below. For a detailed description of all variables go to https://www.oecd.org/en/data/datasets/pisa-2018-database.html#data
Country language
Country ISO 3166-1 alpha-3 codes.
Student International Grade
Gender
...
A data frame with 1269 rows/observations and 1120 columns/variables.
https://www.oecd.org/en/data/datasets/pisa-2018-database.html#data
A subset of noisy data from the Teaching and Learning International Survey (TALIS) 2018 cycle for Mexico, Italy, and France.
This dataset is a subset of the TALIS 2018 database produced by the OECD for the countries of France, Italy, and Mexico.
df_talis18 data(df_talis18)
df_talis18
A data frame in tibble format with 496 rows and 548 columns from which only some relevant variables are listed below. For a detailed description of all variables go to https://www.oecd.org/en/data/datasets/talis-2018-database.html#data
Country language
Country ISO 3166-1 alpha-3 codes.
Gender
Age
Highest level of formal education completed
...
A data frame with 548 rows/observations and 496 columns/variables.
https://www.oecd.org/en/data/datasets/talis-2018-database.html#data
Obtains a list with the arguments for the est() function used within the Rrepest() function.
est(statistic, target, regressor = NULL)
statistic |
(string vector) Statistics of interest that can include mean ("mean"), variance ("var"), standard deviation ("std"), quantile ("quant"), inter-quantile range ("iqr"), frequency count ("freq"), correlation ("corr"), linear regression ("lm"), covariance ("cov") and any other statistics that are not pre-programmed into Rrepest but take a data frame and weights as parameters ("gen") |
target |
(string vector) Variable(s) of interest of the estimation |
regressor |
(string vector) Independent variable(s) to be included in a linear regression |
List containing the arguments for the est() function used within Rrepest() function
est(c("mean","quant",.5,"corr"),c("pv1math","pv1read","pv1scie"))
Format categorical variables as factor for Rrepest
format_data_categ_vars(df, categ.vars, show_na = F)
df |
(data frame) Data to analyse. |
categ.vars |
(string vector) Categorical variables for analysis. |
show_na |
(bool) Keeps NAs as categories to get frequency from. |
Data frame with categorical variables as factors for frequencies or over variables.
format_data_categ_vars(df = mtcars, categ.vars = "cyl")
Format continuous variables as numeric for Rrepest
format_data_cont_vars(df, cont.vars)
df |
(data frame) Data to be analysed. |
cont.vars |
(string vector) continuous variables for analysis |
Data frame with continuous variables converted to numeric for a continuous analysis (means, regression, etc.)
format_data_cont_vars(mtcars,"hp")
Formatting target, by, and over variables for Rrepest.
format_data_repest(df, svy, x, by.over, user_na = F, ...)
df |
(data frame) Data to analyze. |
svy |
(string) Possible projects to analyse: PIAAC, PISA, TALISSCH, TALISTCH, etc. |
x |
(string vector) Target variables. |
by.over |
(string vector) Variables to break analysis by. |
user_na |
(bool) TRUE → show nature of user defined missing values |
... |
Optional arguments such as custom weights (cm.weights) |
Data frame with variables in numeric format for analysis.
df1 <- format_data_repest(df_pisa18, "PISA", "pv1math", "cnt")
df2 <- format_data_repest(df_pisa18, "PISA", "pv1math", c("cnt","st004d01t"))
df3 <- format_data_repest(df_pisa18, "PISA", "pv1math", c("cnt","st004d01t","iscedl"))
df4 <- format_data_repest(df_talis18, "TALISTCH", "tt3g02", "cntry", isced = 2)
Computes a data frame with frequency counts.
grouped_sum_freqs(data, small.level, big.level, w = NULL)
data |
(data frame) Data to analyze. |
small.level |
(string vector) all variables to get grouped sum. |
big.level |
(string vector) Must be fully contained in variables from small.level |
w |
(string) Numeric variable from which to get weights (if NULL then 1). |
Data frame containing the frequency counts
Data frame with frequencies from the grouped sum of small.level and big.level used for getting percentages.
grouped_sum_freqs(data = mtcars, small.level = c("cyl","am"), big.level = c("cyl"))
grouped_sum_freqs(data = mtcars, small.level = c("cyl","gear"), big.level = c("cyl"))
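The idea of summing at a fine level (small.level) and at a coarser level (big.level) to obtain percentages can be sketched with base R's `aggregate()` and `merge()`. `grouped_share` and the column names `n_small`, `n_big` and `pct` are hypothetical; the package's actual output layout may differ.

```r
# Hypothetical helper (not part of Rrepest): grouped sums at two levels and their ratio.
grouped_share <- function(data, small.level, big.level, w = NULL) {
  stopifnot(all(big.level %in% small.level))      # big.level must be nested in small.level
  data$.w <- if (is.null(w)) 1 else data[[w]]     # unit weights when w is NULL
  small <- aggregate(data[".w"], by = data[small.level], FUN = sum)
  big <- aggregate(data[".w"], by = data[big.level], FUN = sum)
  names(small)[names(small) == ".w"] <- "n_small"
  names(big)[names(big) == ".w"] <- "n_big"
  out <- merge(small, big, by = big.level)
  out$pct <- 100 * out$n_small / out$n_big        # share of each cell within its big group
  out
}

res <- grouped_share(mtcars, small.level = c("cyl", "am"), big.level = "cyl")
res
```

Within each level of `cyl`, the `pct` column sums to 100.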
Obtains a list with the arguments for the grp() function used within Rrepest() function's average() and group() options.
grp(group.name, column, cases, full_weight = FALSE)
group.name |
(string) Name of the group to be displayed |
column |
(string) Column/variable of interest to be grouped |
cases |
(string vector) Rows/values to be included in the group |
full_weight |
(bool) TRUE: average of the group will be weighted average |
List containing the arguments for the grp() function used within Rrepest() function's average() and group() options.
append(grp("OECD Average","CNTRY",c("HUN","MEX")), grp("Europe","CNTRY",c("ITA","FRA")))
Get in a dataframe all bivariate combinations of differences and standard errors from columns starting with b. and se., assuming independence. For a time series, the vector must be ordered from oldest to newest.
indep_diff(df, vec_series)
df |
(dataframe) Dataframe that contains the columns with b. and se. on their name to get the difference from |
vec_series |
(string vector) Column names to get difference from, not including "b." at the start. Each must also have a matching "se." column in the dataframe. |
Dataframe with extra columns containing the difference of the columns along with their standard error.
indep_diff(rrepest_pisa_age_isced, paste0("mean.age..ISCED level ",c("1","3A","2")))
Invert test column from Rrepest test = TRUE by name on "b." and "se." in the column name and by sign (*-1) on "b."
inv_test(data, name_index)
data |
(dataframe) df to analyze |
name_index |
(string/numeric) name or index of the estimate (b.) column containing the data for the test in Rrepest |
Dataframe containing inverted test column names for "b." and "se." according to Rrepest structure and column multiplied by (-1) for "b."
inv_test(rrepest_pisa_age_gender, name_index = 6)
To check if a number n is prime, you only need to check for factors up to the square root of n. This is because if n has a factor greater than its square root, it must also have a smaller factor (since a factor is a number that divides n without leaving a remainder). This method significantly reduces the number of checks needed to determine if a number is prime.
is_prime(n)
n |
(numeric) Number to verify if it is prime |
(bool) TRUE if the number is prime, FALSE if the number is not prime
is_prime(0)
is_prime(1)
is_prime(7)
is_prime(10)
is_prime(100011869)
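The square-root bound described above can be sketched as trial division in base R. `is_prime_sketch` is a hypothetical name used to avoid confusion with the package's own `is_prime()`, whose internals are not shown here.

```r
# Hypothetical re-implementation for illustration: trial division up to sqrt(n).
is_prime_sketch <- function(n) {
  if (n < 2 || n != floor(n)) return(FALSE)   # 0, 1, negatives and non-integers
  if (n < 4) return(TRUE)                     # 2 and 3
  if (n %% 2 == 0) return(FALSE)              # even numbers above 2
  limit <- floor(sqrt(n))
  if (limit < 3) return(TRUE)                 # 5 and 7: no odd candidate divisor to test
  all(n %% seq(3, limit, by = 2) != 0)        # odd candidate divisors only
}

sapply(c(0, 1, 2, 7, 9, 10, 97), is_prime_sketch)
```

Only odd candidates up to `floor(sqrt(n))` are tested, so a number near 10^8 needs about 5,000 divisions rather than 10^8.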
Computes the number of valid (i.e. non-missing) observations for the column/variable of interest.
n_obs_x(df, by, x, svy = NULL)
df |
(data frame) Data to analyse with lowercase column names. |
by |
(string vector) Variable(s) used for tabulating the variable of interest |
x |
(string) Variable of interest for which to compute the number of valid (i.e. non-missing) observations |
svy |
(string) Survey settings that must be equal to one of the following: ALL, IALS, ICCS, ICILS, IELS, PBTS, PIAAC, PIRLS, PISA, PISAOOS, PISA2015, SSES, SSES2023, SVY, TALISSCH, TALISTCH, TALISEC_LEADER, TALISEC_STAFF, TIMSS |
Data frame containing the number of valid (i.e. non-missing) observations for the variable of interest
n_obs_x(df = df_pisa18, by = "cnt", x = "wb173q03ha", svy = "PISA2015")
n_obs_x(df = df_talis18, by = "cntry", x = "tt3g01", svy = "TALISTCH")
Get differences and standard errors from two columns starting with b. and se. assuming independence. The second column will be subtracted from the first.
paired_indep_diff(df, col1, col2)
df |
(dataframe) Dataframe that contains the columns with b. and se. on their name to get the difference from |
col1 |
(string) Column name, not including "b.". The dataframe must also contain the matching "se." column. |
col2 |
(string) Column name, not including "b.". The dataframe must also contain the matching "se." column. |
Dataframe with extra columns containing the difference of the columns along with their standard error.
paired_indep_diff(rrepest_pisa_age_gender,"mean.age..Male","mean.age..Female")
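For two independent estimates, the standard error of their difference is the square root of the sum of the squared standard errors. A minimal sketch under that standard assumption follows; `indep_diff_sketch`, the toy values and the output column names are hypothetical, not the package's own.

```r
# Hypothetical helper (not part of Rrepest): difference of two independent estimates.
indep_diff_sketch <- function(df, col1, col2) {
  b <- df[[paste0("b.", col1)]] - df[[paste0("b.", col2)]]    # col2 subtracted from col1
  se <- sqrt(df[[paste0("se.", col1)]]^2 + df[[paste0("se.", col2)]]^2)
  data.frame(diff = b, se.diff = se)
}

# Toy results table, invented for the illustration
toy <- data.frame(b.male = c(15.90, 15.70), se.male = c(0.03, 0.04),
                  b.female = c(15.80, 15.75), se.female = c(0.04, 0.03))
indep_diff_sketch(toy, "male", "female")
```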
Estimates statistics using replicate weights (Balanced Repeated Replication (BRR) weights, Jackknife replicate weights, etc.), thus accounting for complex survey designs in the estimation of sampling variances. It is designed specifically to be used with the data sets produced by the Organization for Economic Cooperation and Development (OECD), some of which include the Programme for the International Assessment of Adult Competencies (PIAAC), Programme for International Student Assessment (PISA) and Teaching and Learning International Survey (TALIS) data sets, but works for any educational large-scale assessment and survey that uses replicated weights. It also allows for analyses with multiply imputed variables (plausible values); where plausible values are used, the average estimator across plausible values is reported and the imputation error is added to the variance estimator.
Rrepest( data, svy, est, by = NULL, over = NULL, test = FALSE, user_na = FALSE, show_na = FALSE, flag = FALSE, fast = FALSE, tabl = FALSE, average = NULL, total = NULL, coverage = FALSE, invert_tests = FALSE, save_arg = FALSE, cores = NULL, ... )
data |
(data frame) Data to analyse |
svy |
(string) Declares the survey settings. It must be equal to one of the following: ALL, IALS, ICCS, ICILS, IELS, PBTS, PIAAC, PIRLS, PISA, PISAOOS, PISA2015, SSES, SSES2023, SVY, TALISSCH, TALISTCH, TALISEC_LEADER, TALISEC_STAFF, TIMSS. |
est |
(est function) Specifies the estimates of interest. It has three arguments: statistics type, target variable and an (optional) regressor list in case of a linear regression. |
by |
(string vector) Produces separate estimates by levels of the variable(s) specified by the string vector. |
over |
(string vector) Requests estimates to be obtained separately for each level of categorical variable(s) identified by the string vector. |
test |
(bool) if TRUE: Computes the difference between estimates obtained for the lowest and highest values of the 'over' variable(s). (See 'over' option above.) It is useful to test for differences between dependent samples (e.g. female-male). |
user_na |
(bool) if TRUE: Shows the nature of user defined missing values. |
show_na |
(bool) if TRUE: Includes missing values (i.e. NAs) when estimating frequencies for the variable of interest. |
flag |
(bool) if TRUE: Replaces estimation results that are based on fewer observations than required for reporting with NaN. When used with the PIAAC survey settings, it checks if each estimation result is based on at least 30 observations. When used with the PISA, PISAOOS, PISA2015 survey settings, it checks if each estimation result is based on at least 30 observations and 5 schools. When used with the TALISSCH survey settings, it checks if each estimation result is based on at least 10 schools. When used with the TALISTCH survey settings, it checks if each estimation result is based on at least 30 observations and 10 schools. |
fast |
(bool) if TRUE: Computes estimates by using only 6 replicated weights. |
tabl |
(bool) if TRUE: Creates customisable and transferable tables using the flextable R package. |
average |
(grp function) Computes an arithmetic average (or weighted average). It has three arguments: name of the average, column/variable used for computing the average, rows/observations included in the average. |
total |
(grp function) Computes an average weighted by the estimated size of the target population covered. |
coverage |
(bool/numeric) TRUE: shows column next to se. Numeric: Shows NaN if below the set coverage. |
invert_tests |
(bool) Invert test columns from Rrepest test = TRUE by name on "b." and "se." in the column name and by sign (*-1) on "b." |
save_arg |
(bool) TRUE: returns a named list with the estimation data frame and all arguments used in Rrepest. |
cores |
(numeric) NULL: Will recruit max-1 cores when doing PVs. Else, will recruit the specified number of cores for PVs |
... |
Other optional parameters include: isced = Filters the data used for analysis by ISCED level (e.g. isced = 2), n.pvs = Customizes the number of plausible values used in the estimation (e.g. n.pvs = 5), cm.weights = Customizes the weights used in the estimation (e.g. cm.weights = c("finw",paste0("repw",1:22))), var.factor = Customizes the variance factor used in the estimation (e.g. var.factor = 1/(0.5^2)), z.score = qnorm(1-0.05/2) |
Data frame containing estimation "b." and standard error "se.".
data(df_pisa18)
Rrepest(data = df_pisa18, svy = "PISA2015", est = est("mean","AGE"), by = c("CNT"))
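The combination of replicate weights and plausible values described above can be sketched as follows. This assumes the usual approach for such surveys: the sampling variance is a scaled sum of squared deviations of the replicate estimates from the full-sample estimate, and the imputation variance is combined with it by Rubin's rule. `combine_pv_replicates`, its arguments and the toy numbers are hypothetical, and the scaling factor depends on the survey design.

```r
# Hypothetical helper (not part of Rrepest).
# full_est:   estimates from the full-sample weight, one per plausible value (length M)
# rep_est:    M x R matrix of estimates, one row per plausible value, one column per replicate weight
# var_factor: design-dependent scaling factor for the replicate deviations (assumed known)
combine_pv_replicates <- function(full_est, rep_est, var_factor) {
  rep_est <- as.matrix(rep_est)
  stopifnot(nrow(rep_est) == length(full_est))
  m <- length(full_est)
  # Sampling variance for each plausible value (row i is compared with full_est[i])
  sampling_var <- var_factor * rowSums((rep_est - full_est)^2)
  # Imputation variance: variance of the estimates across plausible values
  imputation_var <- if (m > 1) stats::var(full_est) else 0
  # Rubin's rule: average sampling variance plus inflated imputation variance
  total_var <- mean(sampling_var) + (1 + 1 / m) * imputation_var
  c(b = mean(full_est), se = sqrt(total_var))
}

# Toy example: 2 plausible values, 2 replicate weights
full_est <- c(10, 12)
rep_est <- matrix(c(11, 9,
                    13, 11), nrow = 2, byrow = TRUE)
combine_pv_replicates(full_est, rep_est, var_factor = 0.5)
```

With a single estimate per row (no plausible values), the imputation term is zero and the result reduces to the replicate-based standard error alone.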
A table of results from Rrepest of the mean age of students broken down by gender for PISA 2018.
rrepest_pisa_age_gender
rrepest_pisa_age_gender
A data frame in tibble format with 3 rows and 7 columns:
Country ISO 3166-1 alpha-3 codes.
Mean age for female students.
Standard error of the mean age for female students.
Mean age for male students.
Standard error of the mean age for male students.
Difference of the mean age from female to male students.
Standard error of the difference of the mean age from female to male students.
https://www.oecd.org/en/data/datasets/pisa-2018-database.html#data
A table of results from Rrepest of the mean age of students broken down by completed schooling level of students' mothers for PISA 2018.
rrepest_pisa_age_isced
rrepest_pisa_age_isced
A data frame in tibble format with 3 rows and 7 columns:
Country ISO 3166-1 alpha-3 codes.
Mean age of students whose mothers completed ISCED 1.
Standard error of the mean age of students whose mothers completed ISCED 1.
Mean age of students whose mothers completed ISCED 2.
Standard error of the mean age of students whose mothers completed ISCED 2.
Mean age of students whose mothers completed ISCED 3A.
Standard error of the mean age of students whose mothers completed ISCED 3A.
Mean age of students whose mothers completed ISCED 3B/3C.
Standard error of the mean age of students whose mothers completed ISCED 3B/3C.
Mean age of students whose mothers did not complete ISCED 1.
Standard error of the mean age of students whose mothers did not complete ISCED 1.
Mean age difference between ISCED 1 completed vs. non-ISCED 1 completed mothers.
Standard error of the mean age difference between ISCED 1 completed vs. non-ISCED 1 completed mothers.
...
https://www.oecd.org/en/data/datasets/pisa-2018-database.html#data
Computes the weighted Pearson correlation coefficient of two numeric vectors.
weighted.corr(x, y, w, na.rm = TRUE)
x |
(numeric vector) Variable of interest x for computing the correlation |
y |
(numeric vector) Variable of interest y for computing the correlation |
w |
(numeric vector) Vector with the weights |
na.rm |
(bool) if TRUE: Excludes missing values before computing the correlation |
Scalar containing the Pearson correlation coefficient
data(df_talis18)
weighted.corr(x = df_talis18$t3stake, y = df_talis18$t3team, w = df_talis18$tchwgt)
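A weighted Pearson correlation can be sketched in base R from weighted means and weighted cross-products. `weighted_corr_sketch` is a hypothetical re-implementation for illustration, not the package's code, and its missing-value handling (dropping incomplete triples) is an assumption.

```r
# Hypothetical re-implementation for illustration.
weighted_corr_sketch <- function(x, y, w, na.rm = TRUE) {
  if (na.rm) {
    keep <- stats::complete.cases(x, y, w)   # drop triples with any missing value
    x <- x[keep]; y <- y[keep]; w <- w[keep]
  }
  mx <- sum(w * x) / sum(w)                  # weighted means
  my <- sum(w * y) / sum(w)
  cross <- sum(w * (x - mx) * (y - my))      # weighted cross-product
  cross / sqrt(sum(w * (x - mx)^2) * sum(w * (y - my)^2))
}

# With equal weights this matches stats::cor()
weighted_corr_sketch(mtcars$mpg, mtcars$hp, w = rep(1, nrow(mtcars)))
cor(mtcars$mpg, mtcars$hp)
```

The normalising constant of the weights cancels between numerator and denominator, so rescaling `w` does not change the result.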
Computes multivariate correlation and covariance for the variables of interest.
weighted.corr.cov.n( data, x, w = rep(1, length(data[x[1]])), corr = TRUE, na.rm = TRUE )
data |
(data frame) Data to analyse |
x |
(string vector) Variables of interest for which to compute the correlation/covariance |
w |
(string) Name of the numeric variable representing the weights |
corr |
(bool) if TRUE: Computes correlation; if FALSE: Computes covariance |
na.rm |
(bool) if TRUE: Excludes missing values before computing the correlation/covariance. |
Data frame containing each pairwise bivariate correlation/covariance
data(df_talis18)
weighted.corr.cov.n(df_talis18, c("t3stake","t3team","t3stud"), "tchwgt")
Computes the weighted covariance coefficient of two numeric vectors.
weighted.cov(x, y, w, na.rm = TRUE)
x |
(numeric vector) Variable of interest x for computing the covariance |
y |
(numeric vector) Variable of interest y for computing the covariance |
w |
(numeric vector) Vector with the weights |
na.rm |
(bool) if TRUE: Excludes missing values before computing the covariance |
Scalar containing the covariance
data(df_talis18)
weighted.cov(x = df_talis18$t3stake, y = df_talis18$t3team, w = df_talis18$tchwgt)
Computes the weighted inter-quantile range of a numeric vector.
weighted.iqr(x, w = rep(1, length(x)), rang = c(0.25, 0.75))
x |
(numeric vector) Variable of interest for which to compute the inter-quantile range |
w |
(numeric vector) Vector with the weights |
rang |
(numeric vector) Two numbers between 0 and 1 indicating the desired inter-quantile range |
Scalar containing the inter-quantile range
weighted.iqr(x = mtcars$mpg, w = mtcars$wt, rang = c(.5,.9))
Calculate the arithmetic mode of a vector. If multiple elements have the same frequency, all of them will be displayed.
weighted.mode(x, w = rep(1, length(x)))
x |
(numeric vector) vector from which we'll obtain the mode |
w |
(numeric vector) vector of weights. If not provided, it defaults to non-weighted mode. |
(numeric vector) one or multiple elements that will be the arithmetic mode
weighted.mode(c(1,2,3,4,5,4,3,4,5,3))
weighted.mode(c(NA,1,3,NA))
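One way to sketch a weighted mode in base R is to sum the weights per distinct value and keep every value that reaches the maximum. `weighted_mode_sketch` is a hypothetical helper; in particular, its handling of missing values (they are dropped by `tapply()`) is an assumption and may differ from the package.

```r
# Hypothetical re-implementation for illustration.
weighted_mode_sketch <- function(x, w = rep(1, length(x))) {
  totals <- tapply(w, x, sum)                        # total weight per distinct value (NAs in x dropped)
  as.numeric(names(totals)[totals == max(totals)])   # all values tied at the maximum
}

weighted_mode_sketch(c(1, 2, 3, 4, 5, 4, 3, 4, 5, 3))   # 3 and 4 each occur three times
weighted_mode_sketch(c(1, 2, 3), w = c(5, 1, 1))        # weights make 1 the mode
```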
Computes weighted quantiles of a numeric vector.
weighted.quant(x, w = rep(1, length(x)), q = 0.5)
x |
(numeric vector) Variable of interest for which to compute the quantile |
w |
(numeric vector) Vector with the weights |
q |
(numeric vector) One or more numbers from 0 to less than 1 indicating the desired quantile(s) |
Scalar containing the quantile
weighted.quant(x = mtcars$mpg, w = mtcars$wt, q = seq(.1,.9,.1))
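Several definitions of a weighted quantile exist. The sketch below uses one simple choice, the smallest value whose cumulative weight share reaches q, which is an assumption and not necessarily the rule implemented by the package. `weighted_quant_sketch` is a hypothetical name.

```r
# Hypothetical re-implementation for illustration (inverse of the weighted empirical CDF).
weighted_quant_sketch <- function(x, w = rep(1, length(x)), q = 0.5) {
  keep <- !is.na(x) & !is.na(w)
  ord <- order(x[keep])
  x <- x[keep][ord]                       # values sorted in increasing order
  w <- w[keep][ord]                       # weights in the same order
  cum_share <- cumsum(w) / sum(w)         # cumulative share of total weight
  sapply(q, function(p) x[which(cum_share >= p)[1]])
}

weighted_quant_sketch(x = c(30, 10, 20), w = c(2, 1, 1), q = c(0.25, 0.5, 0.75))
```

Under this definition, a weighted inter-quantile range is the difference between two such quantiles.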
Computes the weighted standard deviation of a numeric vector.
weighted.std(x, w, na.rm = TRUE)
x |
(numeric vector) Variable of interest for which to compute the standard deviation. |
w |
(numeric vector) Vector with the weights. |
na.rm |
(bool) if TRUE: Excludes missing values before computing the standard deviation |
Scalar containing the standard deviation
data(df_talis18)
weighted.std(df_talis18$TT3G02, df_talis18$TRWGT1)
Computes the weighted variance of a numeric vector.
weighted.var(x, w, na.rm = TRUE)
x |
(numeric vector) Variable of interest for which to compute the variance |
w |
(numeric vector) Vector with weights |
na.rm |
(bool) if TRUE: Excludes missing values before computing the variance |
Scalar containing the variance
data(df_talis18)
weighted.var(df_talis18$TT3G02, df_talis18$TRWGT1)
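A weighted variance can be sketched as the weighted mean of squared deviations from the weighted mean. The normalisation used here (division by the sum of weights, with no small-sample correction) is an assumption and may differ from the package. `weighted_var_sketch` is a hypothetical name; the weighted standard deviation is its square root.

```r
# Hypothetical re-implementation for illustration.
weighted_var_sketch <- function(x, w, na.rm = TRUE) {
  if (na.rm) {
    keep <- !is.na(x) & !is.na(w)
    x <- x[keep]; w <- w[keep]
  }
  m <- sum(w * x) / sum(w)           # weighted mean
  sum(w * (x - m)^2) / sum(w)        # weighted mean of squared deviations
}

weighted_var_sketch(c(0, 10), w = c(3, 1))
sqrt(weighted_var_sketch(c(0, 10), w = c(3, 1)))   # weighted standard deviation
```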