Title: | R Commander Plugin for Teaching Statistical Methods |
---|---|
Description: | R Commander plugin for teaching statistical methods. It adds a new menu that makes it easier to teach the main concepts of the principal statistical methods. |
Authors: | Tomás R. Cotos Yañez [aut] , Manuel A. Mosquera Rodríguez [aut, cre] , Ana Pérez González [aut] , Benigno Reguengo Lareo [aut] |
Maintainer: | Manuel A. Mosquera Rodríguez <[email protected]> |
License: | GPL (>= 2) |
Version: | 1.1.3 |
Built: | 2024-12-09 06:58:46 UTC |
Source: | CRAN |
Package: | RcmdrPlugin.TeachStat |
Type: | Package |
Version: | 1.1.2 |
Date: | 2023-11-13 |
License: | GPL version 2 or newer |
Tomás R. Cotos Yañez <[email protected]>
Manuel A. Mosquera Rodríguez <[email protected]>
Ana Pérez González <[email protected]>
Benigno Reguengo Lareo <[email protected]>
Grouped or tabulated data set, given by lower and upper limits and frequency.
It serves as an example for the Numerical Summaries - Tabulated data window of the RcmdrPlugin.TeachStat package.
data("Agrupadas")
Data frame with 4 cases (rows) and 3 variables (columns).
Linf
Numeric value, the lower limit of the tabulated data.
Lsup
Numeric value, the upper limit of the tabulated data.
ni
Numeric value, the frequency of the tabulated data.
data(Agrupadas)
calcularResumenDatosTabulados(l_inf=Agrupadas$Linf, l_sup=Agrupadas$Lsup, ni=Agrupadas$ni,
    statistics=c("mean", "sd", "IQR", "quantiles"), quantiles=c(0,0.25,0.5,0.75,1),
    tablaFrecuencia=FALSE)
Prints the ANOVA table with random effects and computes point estimates of the variance components using either the maximum likelihood method or the restricted maximum likelihood (REML) method. It can also provide confidence intervals.
aovreml(formula, data = NULL, Lconfint = FALSE, REML = TRUE, ...)
formula | A formula specifying the model. Random-effects terms are distinguished by vertical bars (|) separating expressions for design matrices from grouping factors. Two vertical bars (||) can be used to specify multiple uncorrelated random effects for the same grouping variable. (Because of the way it is implemented, the ||-syntax works only for design matrices containing numeric (continuous) predictors.) |
data | An optional data frame containing the variables named in formula. |
Lconfint | Logical scalar: should the confidence intervals be printed? |
REML | Logical scalar: should the estimates be chosen to optimize the REML criterion (as opposed to the log-likelihood)? |
... | Arguments to be passed to other functions. |
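A list, returned invisibly, with two components: model (the model fitted with lme4::lmer) and estimation (the estimated variance components).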
## The function is currently defined as
function (formula, data = NULL, Lconfint = FALSE, REML = TRUE, ...)
{
    vars <- all.vars(formula)
    formulaaov <- as.formula(paste(vars[1], "~", vars[2]))
    ANOV <- aov(formulaaov, data, ...)
    .ANOV <- summary(ANOV)
    cat("-------------------------------")
    cat("\n", gettext("ANOVA table", domain = "R-RcmdrPlugin.TeachStat"), ":\n", sep = "")
    print(.ANOV)
    cat("\n-------------------------------\n")
    .sol <- lme4::lmer(formula, data = data, REML = REML, ...)
    .varcor <- lme4::VarCorr(.sol)
    .sighat2 <- unname(attr(.varcor, "sc"))^2
    .sighatalph2 <- unname(attr(.varcor[[vars[2]]], "stddev"))^2
    .prop <- .sighatalph2/(.sighatalph2 + .sighat2)
    estim <- c(.sighat2, .sighatalph2, .prop)
    names(estim) <- c("var (Error)", "var (Effect)", "% var (Effect)")
    cat("\n", gettext("Components of Variance", domain = "R-RcmdrPlugin.TeachStat"),
        " (", lme4::methTitle(.sol@devcomp$dims), "):\n", sep = "")
    print(estim)
    if (Lconfint) {
        cat("\n", gettext("Confidence intervals", domain = "R-RcmdrPlugin.TeachStat"), ":\n", sep = "")
        print(confint(.sol, oldNames = FALSE))
    }
    return(invisible(list(model = .sol, estimation = estim)))
}
Prints the ANOVA table with random effects and computes the classical point estimates of the variance components using the method of moments.
aovremm(formula, data = NULL, ...)
formula | A formula specifying the model. |
data | A data frame in which the variables specified in the formula are found. |
... | Arguments to be passed to other functions. |
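A list, returned invisibly, with two components: model (the fitted aov object) and estimation (the estimated variance components).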
## The function is currently defined as
function (formula, data = NULL, ...)
{
    ANOV <- aov(formula, data, ...)
    .ANOV <- summary(ANOV)
    cat("-------------------------------")
    cat("\n", gettext("ANOVA table", domain = "R-RcmdrPlugin.TeachStat"), ":\n", sep = "")
    print(.ANOV)
    cat("\n-------------------------------\n\n")
    .sighat2 <- .ANOV[[1]]$`Mean Sq`[2]
    .vars <- all.vars(formula)
    .groups <- data[[.vars[2]]][!is.na(data[[.vars[1]]])]
    .n <- length(.groups)
    .ni <- table(.groups)
    .c <- (.n^2 - sum(.ni^2))/(.n * (length(.ni) - 1))
    .sighatalph2 <- (.ANOV[[1]]$`Mean Sq`[1] - .sighat2)/.c
    if (.sighatalph2 < 0)
        warning("Estimation of any variance component is not positive. The variance component model is inadequate.")
    .prop <- .sighatalph2/(.sighatalph2 + .sighat2)
    estim <- c(.sighat2, .sighatalph2, .prop)
    names(estim) <- c("var (Error)", "var (Effect)", "% var (Effect)")
    cat("\n", gettext("Components of Variance", domain = "R-RcmdrPlugin.TeachStat"), ":\n", sep = "")
    print(estim)
    return(invisible(list(model = ANOV, estimation = estim)))
}
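The moment estimates computed in the function body can be traced by hand on a small data set. The following is a minimal base R sketch that follows the same arithmetic; the data frame toy and all object names are hypothetical and are not part of the package.

```r
# Hypothetical balanced one-way layout: 3 groups, 4 observations each
toy <- data.frame(
  y = c(1, 2, 3, 4,  3, 4, 5, 6,  5, 6, 7, 8),
  g = factor(rep(c("A", "B", "C"), each = 4))
)

# Mean squares from the classical one-way ANOVA table
tab <- summary(aov(y ~ g, data = toy))[[1]]
ms_effect <- tab$`Mean Sq`[1]   # between-group mean square
ms_error  <- tab$`Mean Sq`[2]   # within-group (residual) mean square

# Coefficient that multiplies the effect variance in E(MS effect);
# for balanced data it equals the common group size
n_i <- table(toy$g)
n <- sum(n_i)
c_coef <- (n^2 - sum(n_i^2)) / (n * (length(n_i) - 1))

# Method of moments estimates of the variance components
var_error  <- ms_error
var_effect <- (ms_effect - ms_error) / c_coef
prop_effect <- var_effect / (var_effect + var_error)

print(c("var (Error)" = var_error, "var (Effect)" = var_effect,
        "% var (Effect)" = prop_effect))
```

With balanced groups the coefficient reduces to the group size (4 here), so the effect variance is simply the difference of the two mean squares divided by 4.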
Computes the frequency distribution of qualitative (nominal and/or ordinal) variables. For ordinal variables, the requested quantile is also calculated.
calcular_frecuencia(df.nominal, ordenado.frec = FALSE, df.ordinal, cuantil.p = 0.5, iprint = TRUE, ...)
df.nominal | |
ordenado.frec | logical value indicating whether the frequency table is ordered by frequency (only used for nominal variables). |
df.ordinal | |
cuantil.p | requested quantile value (only used for ordinal variables). |
iprint | logical value indicating whether or not to display the frequency table. |
... | further arguments to be passed to or from methods. |
calcular_frecuencia returns a list of three elements:
.nominal | a matrix containing the table of frequency distribution for nominal variables ( |
.ordinal | a matrix containing the table of frequency distribution for ordinal variables ( |
df.cuantil | data frame containing the quantiles. |
data(cars93)
aa <- calcular_frecuencia(df.nominal=cars93["Type"], ordenado.frec=TRUE, df.ordinal=NULL, cuantil.p=0.5, iprint = TRUE)
calcular_frecuencia(df.nominal=NULL, ordenado.frec=TRUE, df.ordinal=cars93["Airbags"], cuantil.p=0.25, iprint = TRUE)
bb <- calcular_frecuencia(df.nominal=cars93["Type"], ordenado.frec=TRUE, df.ordinal=cars93["Airbags"], cuantil.p=0.25, iprint = FALSE)
str(bb)
bb
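For readers who want to see what a frequency distribution and an ordinal quantile look like outside the plugin, here is a minimal base R sketch. It does not call calcular_frecuencia; the variable rating is hypothetical, and the choice of quantile type 1 is an assumption made for illustration, not a statement about the package internals.

```r
# Hypothetical ordinal variable with three ordered levels
rating <- factor(c("low", "low", "mid", "mid", "mid", "high"),
                 levels = c("low", "mid", "high"), ordered = TRUE)

# Absolute, relative and cumulative frequencies
ni <- table(rating)
fi <- prop.table(ni)
freq <- data.frame(
  level = names(ni),
  ni = as.vector(ni),
  fi = as.vector(fi),
  Ni = cumsum(as.vector(ni)),
  Fi = cumsum(as.vector(fi))
)
print(freq)

# Quantile of an ordered factor: type = 1 (inverse of the empirical
# distribution function) always returns one of the observed levels
q_med <- quantile(rating, probs = 0.5, type = 1)
print(q_med)
```

Base R only accepts ordered factors in quantile() with type 1 or 3, which is why the type is set explicitly.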
calcularResumenDatosTabulados computes the main statistical summary for tabulated data (mean, standard deviation, coefficient of variation, skewness, kurtosis, quantiles and mode). It can also produce the frequency table (with class mark, amplitude and density).
calcularResumenDatosTabulados(l_inf, l_sup, ni, statistics = c("mean", "sd", "se(mean)", "IQR", "quantiles", "cv", "skewness", "kurtosis"), quantiles = c(0, 0.25, 0.5, 0.75, 1), tablaFrecuencia = FALSE)
l_inf | numeric vector with the lower limit of each interval. |
l_sup | numeric vector with the upper limit of each interval. |
ni | numeric vector with the frequency of occurrence of values in the range between the lower limit and the upper limit, [l_inf[i], l_sup[i]). |
statistics | any of |
quantiles | quantiles to report; by default is c(0, 0.25, 0.5, 0.75, 1). |
tablaFrecuencia | logical value indicating whether or not to display the frequency table; by default is FALSE. |
calcularResumenDatosTabulados performs an analysis of tabulated data (frequently used in statistics when the number of distinct values is large or when dealing with continuous quantitative variables), summarized in a table of statistics (arithmetic mean, standard deviation, interquartile range, coefficient of variation, skewness, kurtosis, and quantiles).
It can also show the frequency table of the tabulated variable by setting tablaFrecuencia=TRUE. The class mark, amplitude and density are added to the frequency table.
The LOWER LIMIT (L[i-1]) and UPPER LIMIT (L[i]) vectors represent the data of continuous quantitative variables in class intervals of the form [L[i-1], L[i]), where i = 1, ..., k.
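The quantities mentioned above (class mark, amplitude, density and the grouped mean) can be written out in a few lines of base R. This is a sketch under stated assumptions, not the package code: the data are hypothetical, density is taken to be frequency divided by amplitude, and the variance uses divisor n; the plugin may define either quantity differently.

```r
# Hypothetical grouped data: class limits and absolute frequencies
l_inf <- c(0, 10, 20, 30)
l_sup <- c(10, 20, 30, 50)
ni    <- c(5, 10, 4, 1)

class_mark <- (l_inf + l_sup) / 2   # midpoint of each interval
amplitude  <- l_sup - l_inf         # width of each interval
density    <- ni / amplitude        # assumed: frequency per unit of width

# Grouped mean and variance, treating every value as its class mark
n <- sum(ni)
grouped_mean <- sum(class_mark * ni) / n
grouped_var  <- sum(ni * (class_mark - grouped_mean)^2) / n  # divisor n (assumption)

freq_table <- data.frame(l_inf, l_sup, ni, class_mark, amplitude, density)
print(freq_table)
print(c(mean = grouped_mean, sd = sqrt(grouped_var)))
```

Dividing by the amplitude makes classes of unequal width comparable, which is why the last class (width 20) has a much lower density than its frequency alone suggests.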
calcularResumenDatosTabulados() returns a list of two elements:
.numsummary | an object of class |
.table | a matrix containing the values of the frequency table. |
data(cars93)
cortes <- seq(from=1500, to=4250, by=250)
aa <- cut( cars93$Weight, breaks=cortes, dig.lab=4)
ni <- table(aa)
l_inf <- cortes[-length(cortes)]
l_sup <- cortes[-1]
agrup <- data.frame(l_inf,l_sup,ni)
head(agrup)
calcularResumenDatosTabulados(agrup$l_inf, agrup$l_sup, agrup$Freq)
calcularResumenDatosTabulados(agrup$l_inf, agrup$l_sup, agrup$Freq, tabla=TRUE)
bb <- calcularResumenDatosTabulados(agrup$l_inf, agrup$l_sup, agrup$Freq, statistics=c("mean","mode") )
bb
str(bb)
class(bb$.summary)
class(bb$.table)
calcularResumenVariablesContinuas gives the main statistical summary for continuous variables (mean, standard deviation, coefficient of variation, skewness, kurtosis and quantiles). It also builds the frequency table (with class mark, amplitude and density).
calcularResumenVariablesContinuas(data, statistics = c("mean", "sd", "se(mean)", "IQR", "quantiles", "cv", "skewness", "kurtosis"), quantiles = c(0, 0.25, 0.5, 0.75, 1), groups = NULL, tablaFrecuencia = FALSE, cortes="Sturges", ...)
data | |
statistics | any of |
quantiles | quantiles to report; by default is c(0, 0.25, 0.5, 0.75, 1). |
groups | optional variable, typically a factor, to be used to partition the data. By default is NULL. |
tablaFrecuencia | logical value indicating whether or not to display the frequency table; by default is FALSE. |
cortes | breaks used to divide the range of the variables into intervals (see Details); by default is "Sturges". |
... | further arguments to be passed to |
calcularResumenVariablesContinuas performs a descriptive analysis of continuous variables (quantitative variables that can take infinitely many distinct values within an interval), generating a table of statistics (arithmetic mean, standard deviation, interquartile range, coefficient of variation, skewness, kurtosis, and quantiles) and optionally partitioning the data by a factor variable (groups).
It can also show the frequency table of the selected continuous variables by setting tablaFrecuencia=TRUE. Moreover, the range of the variables can be divided into intervals given by the argument cortes (breaks). See cut and hist for more information.
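Because the Details point to cut and hist, a short base R sketch may help show how a break rule such as "Sturges" turns into class intervals. It uses the built-in mtcars data rather than cars93 and does not call the plugin; how calcularResumenVariablesContinuas combines these functions internally is not shown in this manual, so treat the pipeline below as an assumption.

```r
# Continuous variable from a data set shipped with base R
x <- mtcars$mpg

# Sturges' rule suggests a number of classes: ceiling(log2(n) + 1)
k <- nclass.Sturges(x)

# hist() turns that suggestion into "pretty" break points without plotting
brks <- hist(x, breaks = "Sturges", plot = FALSE)$breaks

# cut() assigns each observation to an interval; table() counts them
classes <- cut(x, breaks = brks, include.lowest = TRUE, right = TRUE)
freq <- table(classes)

print(k)
print(brks)
print(freq)
```

Note that hist() treats the Sturges count only as a suggestion when choosing rounded break points, so the number of intervals actually produced can differ from k.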
calcularResumenVariablesContinuas returns a list of two elements:
.numsummary | an object of class |
.table | a matrix containing the values of the frequency table. |
## Not run:
data(cars93)
calcularResumenVariablesContinuas(data=cars93["FuelCapacity"],group=NULL)
calcularResumenVariablesContinuas(data=cars93["FuelCapacity"],group=cars93$Airbags)
bb <- calcularResumenVariablesContinuas(data=cars93["FuelCapacity"],group=cars93$Airbags, tablaFrecuencia=TRUE)
str(bb)
bb
bb$.summary
class(bb$.summary)
calcularResumenVariablesContinuas(data=cars93["MidPrice"], tablaFrecuencia=TRUE)
calcularResumenVariablesContinuas(data=cars93["MidPrice"], tablaFrecuencia=TRUE, cortes=5)
calcularResumenVariablesContinuas(data=cars93["MidPrice"], tablaFrecuencia=TRUE, cortes=c(7,14,21,28,63))
calcularResumenVariablesContinuas(data=cars93["MidPrice"], tablaFrecuencia=TRUE, cortes="Scott")
calcularResumenVariablesContinuas(data=cars93["MidPrice"], groups=cars93$Airbags, tablaFrecuencia=TRUE, cortes=5)
## End(Not run)
calcularResumenVariablesDiscretas gives the main statistical summary for discrete variables (mean, standard deviation, coefficient of variation, skewness, kurtosis and quantiles). It also builds the frequency table.
calcularResumenVariablesDiscretas(data, statistics = c("mean", "sd", "se(mean)", "IQR", "quantiles", "cv", "skewness", "kurtosis"), quantiles = c(0, 0.25, 0.5, 0.75, 1), groups = NULL, tablaFrecuencia = FALSE, cortes=NULL)
data | |
statistics | any of |
quantiles | quantiles to report; by default is c(0, 0.25, 0.5, 0.75, 1). |
groups | optional variable, typically a factor, to be used to partition the data. By default is NULL. |
tablaFrecuencia | logical value indicating whether or not to display the frequency table; by default is FALSE. |
cortes | breaks used to divide the range of the variables into intervals (see Details); by default is NULL. |
calcularResumenVariablesDiscretas performs a descriptive analysis of discrete variables (quantitative variables that take a finite or countably infinite number of distinct values), generating a table of statistics (arithmetic mean, standard deviation, interquartile range, coefficient of variation, skewness, kurtosis, and quantiles) and optionally partitioning the data by a factor variable (groups).
It can also show the frequency table of the selected discrete variables by setting tablaFrecuencia=TRUE. Moreover, the range of the variables can be divided into intervals given by the argument cortes (breaks). See cut and hist for more information.
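For a discrete variable with few distinct values, the frequency table needs no intervals at all. The base R sketch below illustrates this on the built-in mtcars data; it does not call the plugin, and the coefficient of variation is taken as sd divided by mean, which is an assumption about the convention rather than something this manual states.

```r
# Discrete variable from a data set shipped with base R
x <- mtcars$cyl

# One row per distinct value: absolute, relative and cumulative frequencies
ni <- table(x)
freq <- data.frame(
  value = as.numeric(names(ni)),
  ni = as.vector(ni),
  fi = as.vector(ni) / length(x)
)
freq$Ni <- cumsum(freq$ni)
freq$Fi <- cumsum(freq$fi)
print(freq)

# A few of the summary statistics named above
stats <- c(mean = mean(x), sd = sd(x), cv = sd(x) / mean(x), IQR = IQR(x))
print(stats)
```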
calcularResumenVariablesDiscretas returns a list of two elements:
.numsummary | an object of class |
.table | a matrix containing the values of the frequency table. |
## Not run:
data(cars93)
calcularResumenVariablesDiscretas(data=cars93["Cylinders"],group=NULL)
calcularResumenVariablesDiscretas(data=cars93["Cylinders"],group=cars93$Airbags)
bb <- calcularResumenVariablesDiscretas(data=cars93["Cylinders"],group=cars93$Airbags, tablaFrecuencia=TRUE)
str(bb)
bb
bb$.summary
class(bb$.summary)
calcularResumenVariablesDiscretas(data=cars93["Horsepower"], tablaFrecuencia=TRUE)
calcularResumenVariablesDiscretas(data=cars93["Horsepower"], tablaFrecuencia=TRUE, cortes=5)
calcularResumenVariablesDiscretas(data=cars93["Horsepower"], tablaFrecuencia=TRUE, cortes=c(50,100,200,250,300))
calcularResumenVariablesDiscretas(data=cars93["Horsepower"], tablaFrecuencia=TRUE, cortes="Sturges")
calcularResumenVariablesDiscretas(data=cars93["Horsepower"], groups=cars93$Airbags, tablaFrecuencia=TRUE, cortes=5)
## End(Not run)
The cars93 data frame has 93 rows and 26 columns.
cars93
This data frame contains the following columns:
Manufacturer
Manufacturer.
Model
Model.
Type
Type: a factor with levels "Small", "Sporty", "Compact", "Midsize", "Large" and "Van".
MinPrice
Minimum Price (in $1,000): price for a basic version.
MidPrice
Midrange Price (in $1,000): average of MinPrice and MaxPrice.
MaxPrice
Maximum Price (in $1,000): price for “a premium version”.
CityMPG
City MPG (miles per US gallon by EPA rating).
HighwayMPG
Highway MPG.
Airbags
Air Bags standard. Factor: none, driver only, or driver & passenger.
DriveTrain
Drive train type: rear wheel, front wheel or 4WD; (factor).
Cylinders
Number of cylinders (missing for Mazda RX-7, which has a rotary engine).
EngineSize
Engine size (litres).
Horsepower
Horsepower (maximum).
RPM
RPM (revs per minute at maximum horsepower).
EngineRevol
Engine revolutions per mile (in highest gear).
Manual
Is a manual transmission version available? (yes or no, Factor).
FuelCapacity
Fuel tank capacity (US gallons).
Passengers
Passenger capacity (persons)
Length
Length (inches).
Wheelbase
Wheelbase (inches).
Width
Width (inches).
UTurnSpace
U-turn space (feet).
RearSeatRoom
Rear seat room (inches) (missing for 2-seater vehicles).
LuggageCapacity
Luggage capacity (cubic feet) (missing for vans).
Weight
Weight (pounds).
USA
Of non-USA or USA company origins? (factor).
Cars were selected at random from among 1993 passenger car models that were listed in both the Consumer Reports issue and the PACE Buying Guide. Pickup trucks and Sport/Utility vehicles were eliminated due to incomplete information in the Consumer Reports source. Duplicate models (e.g., Dodge Shadow and Plymouth Sundance) were listed at most once.
Further description can be found in Lock (1993).
Lock, R. H. (1993) 1993 New Car Data. Journal of Statistics Education 1(1). doi:10.1080/10691898.1993.11910459.
Venables, W. N. and Ripley, B. D. (1999) Modern Applied Statistics with S-PLUS. Third Edition. Springer.
This function computes the main characteristics of a random variable (expectation, median, standard deviation, ...)
characRV(D, charact = c("expectation", "median", "sd", "IQR", "skewness", "kurtosis", "moment", "cmoment"), moment = 1, cmoment = 2)
D | An object of the class |
charact | any of " |
moment | an integer indicating the moment with respect to the origin to be calculated. |
cmoment | an integer indicating the moment with respect to the expectation to be calculated. |
'characRV' returns a table containing the selected characteristics.
E, median, sd, IQR, skewness, kurtosis
## The function is currently defined as
function (D, charact = c("expectation", "median", "sd", "IQR",
    "skewness", "kurtosis", "moment", "cmoment"), moment = 1, cmoment = 2)
{
    if (missing(charact))
        charact <- c("expectation", "sd")
    charact <- match.arg(charact, c("expectation", "median", "sd", "IQR",
        "skewness", "kurtosis", "moment", "cmoment"), several.ok = TRUE)
    moment <- if ("moment" %in% charact) moment else NULL
    cmoment <- if ("cmoment" %in% charact) cmoment else NULL
    mom <- if (!is.null(moment)) paste("alpha_", moment, sep = "") else NULL
    cmom <- if (!is.null(cmoment)) paste("mu_", cmoment, sep = "") else NULL
    chars <- c(c("expectation", "median", "sd", "IQR", "skewness",
        "kurtosis")[c("expectation", "median", "sd", "IQR", "skewness",
        "kurtosis") %in% charact], mom, cmom)
    nchars <- length(chars)
    table <- matrix(0, 1, nchars)
    rownames(table) <- gsub("[[:space:]]", "", deparse(substitute(D)))
    colnames(table) <- chars
    if ("expectation" %in% chars)
        table[, "expectation"] <- distrEx::E(D)
    if ("median" %in% chars)
        table[, "median"] <- distrEx::median(D)
    if ("sd" %in% chars)
        table[, "sd"] <- distrEx::sd(D)
    if ("IQR" %in% chars)
        table[, "IQR"] <- distrEx::IQR(D)
    if ("skewness" %in% chars)
        table[, "skewness"] <- distrEx::skewness(D)
    if ("kurtosis" %in% chars)
        table[, "kurtosis"] <- distrEx::kurtosis(D)
    if ("moment" %in% charact)
        table[, mom] <- distrEx::E(D, fun = function(x) {
            x^moment
        })
    if ("cmoment" %in% charact)
        table[, cmom] <- distrEx::E(D, fun = function(x) {
            (x - distrEx::E(D))^cmoment
        })
    print(table)
    return(invisible(table))
}
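The two kinds of moments that the function reports, alpha_k (about the origin) and mu_k (about the expectation), can be checked by hand for a simple discrete distribution. The sketch below uses only base R and a fair six-sided die as a hypothetical example; it does not use distrEx or characRV, and the helper names are illustrative.

```r
# Hypothetical discrete random variable: a fair six-sided die
support <- 1:6
prob <- rep(1 / 6, 6)

# Moment of order k about the origin: alpha_k = sum(x^k * p(x))
raw_moment <- function(k) sum(support^k * prob)
expectation <- raw_moment(1)

# Moment of order k about the expectation: mu_k = sum((x - E)^k * p(x))
central_moment <- function(k) sum((support - expectation)^k * prob)
variance <- central_moment(2)

# Skewness expressed through central moments
skew <- central_moment(3) / variance^1.5

print(c(expectation = expectation, alpha_2 = raw_moment(2),
        mu_2 = variance, skewness = skew))
```

The identity mu_2 = alpha_2 - alpha_1^2 gives a quick consistency check between the two families of moments.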
ComplexIN computes the aggregation of a set of index numbers using the arithmetic, geometric or harmonic means.
ComplexIN(data, means = c("arithmetic", "geometric", "harmonic"), zero.rm = TRUE, na.rm = TRUE, ...)
data | Data frame containing the index numbers to aggregate. |
means | Character vector with the names of the means to compute. |
zero.rm | Logical value, used for the geometric and harmonic means. |
na.rm | Logical value indicating whether NA values should be stripped before the computation proceeds. It is TRUE by default. |
... | Further arguments passed to or from other methods. |
Sindex, Deflat, priceIndexNum.
Matrix with as many rows as columns of data and as many columns as means selected.
df <- data.frame(Index=round(runif(12,80,105),2))
ComplexIN(df, means = c("arithmetic", "geometric", "harmonic"))
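The three means named in the means argument have simple closed forms, sketched here in base R on a hypothetical vector of index numbers. This illustrates the definitions only; it does not call ComplexIN and says nothing about how the function handles zeros or missing values.

```r
# Hypothetical index numbers (base = 100)
idx <- c(80, 100, 125)

arithmetic <- mean(idx)             # sum(x) / n
geometric  <- exp(mean(log(idx)))   # n-th root of the product
harmonic   <- 1 / mean(1 / idx)     # n / sum(1 / x)

print(c(arithmetic = arithmetic, geometric = geometric, harmonic = harmonic))
```

For positive values that are not all equal, the harmonic mean is always the smallest and the arithmetic mean the largest, which makes the ordering a handy sanity check on any aggregated index.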
In this graphical interface, the user can modify the variable type into nominal (factor), ordinal (ordered factor) or numeric type.
Performs the hypothesis test and computes the confidence interval for a proportion or for the difference of two proportions. The sample values required by the function are the number of successes and the number of trials.
Cprop.test(ex, nx, ey = NULL, ny = NULL, p.null = 0.5, alternative = c("two.sided", "less", "greater"), conf.level = 0.95, ...)
ex | numeric value representing the number of successes in the first sample (see Details). |
nx | numeric value representing the total number of trials in the first sample. |
ey | (optional) numeric value representing the number of successes in the second sample (see Details). |
ny | (optional) numeric value representing the total number of trials in the second sample. |
p.null | numeric value representing the value of the population proportion or of the difference between the two population proportions, depending on whether there are one or two samples (see Details). |
alternative | a character string specifying the alternative hypothesis; must be one of "two.sided", "less" or "greater". |
conf.level | confidence level of the interval. |
... | further arguments to be passed to or from methods. |
For the test to be carried out, there must be at least one success. That is, with one sample, ex must be greater than or equal to 1, and with two samples, ex or ey must be greater than or equal to 1.
Furthermore, with one sample, the value of p.null must be strictly positive.
A list with class "htest" containing the following components:
statistic | the value of the test statistic. |
parameter | number of trials and value of the population proportion or the difference in population proportions. |
p.value | the p-value for the test. |
conf.int | a confidence interval for the proportion or for the difference in proportions, appropriate to the specified alternative hypothesis. |
estimate | a value with the sample proportions. |
null.value | the value of the null hypothesis. |
alternative | a character string describing the alternative. |
method | a character string indicating the method used, and whether Yates' continuity correction was applied. |
data.name | a character string giving the names of the data. |
## Proportion for a sample
Cprop.test(1,6) # 1 success in 6 attempts

#### With a data set: proportion of cars not manufactured in US
data(cars93) #data set provided with the package
exitos<-sum(cars93$USA == "nonUS")
total<-length(cars93$USA)
Cprop.test(ex=exitos, nx=total)

## Difference of proportions
Cprop.test(1,6,3,15) # Sample 1: 1 success in 6 attempts
                     # Sample 2: 3 success in 15 attempts

#### With a data set: difference of proportions of cars not manufactured in US
#### between manual and automatic
exitosx<-sum(cars93$USA == "nonUS" & cars93$Manual == "Yes" )
totalx<-sum(cars93$Manual == "Yes")
exitosy<-sum(cars93$USA == "nonUS" & cars93$Manual == "No" )
totaly<-sum(cars93$Manual == "No")
Cprop.test(ex=exitosx, nx=totalx,ey=exitosy, ny=totaly)
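As a point of comparison for the first example, the textbook normal-approximation test for a single proportion can be written in a few lines of base R. This is a generic sketch and an assumption about the underlying idea: this manual does not state which statistic or correction Cprop.test applies, so its output need not match these numbers.

```r
# One sample: 1 success in 6 trials, tested against p = 0.5
ex <- 1
nx <- 6
p_null <- 0.5

p_hat <- ex / nx

# z statistic with the standard error evaluated under the null hypothesis
z <- (p_hat - p_null) / sqrt(p_null * (1 - p_null) / nx)

# Two-sided p-value from the standard normal distribution
p_value <- 2 * pnorm(-abs(z))

print(c(estimate = p_hat, z = z, p.value = p_value))
```

With so few trials the normal approximation is rough, which is one reason teaching tools often show it next to an exact or corrected alternative.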
Deflat deflates a current value variable into a constant value variable.
Deflat(x, pervar, cvar, defl, base)
x | Data frame containing, at least, the characteristics (time, location, ...), the current value and the deflator variables. |
pervar | Character string for the name of the factor variable with the characteristics. |
cvar | Character string for the name of the numeric variable with the current values. |
defl | Character string for the name of the numeric variable with the index number used as deflator. |
base | Character string for the name of the base characteristic. |
Deflat returns a data frame with one column:
const_base | The variable with constant values at base |
Sindex
, ComplexIN
, priceIndexNum
.
data(Depositos, package = "RcmdrPlugin.TeachStat")
Deflat(Depositos, "year", "quantity", "G_IPC_2016", "2018")
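To make the deflation step concrete, the following base R sketch shows one common way to turn current values into constant values: rebase the deflator so that it equals 1 at the base period, then divide. The data frame toy, the helper deflate_sketch, and the assumption that Deflat follows this definition are illustrative and are not taken from the package source.

```r
# Hypothetical toy data: current values and a price index (2016 = 100)
toy <- data.frame(
  year     = factor(c("2016", "2017", "2018")),
  quantity = c(1000, 1050, 1100),  # current values
  cpi      = c(100, 102, 104)      # index number used as deflator
)

# Assumed definition: constant value = current value / rebased deflator
deflate_sketch <- function(x, pervar, cvar, defl, base) {
  base_row <- which(as.character(x[[pervar]]) == base)
  stopifnot(length(base_row) == 1)
  rebased <- x[[defl]] / x[[defl]][base_row]  # deflator equals 1 at the base
  data.frame(const = x[[cvar]] / rebased,
             row.names = as.character(x[[pervar]]))
}

res <- deflate_sketch(toy, "year", "quantity", "cpi", "2018")
print(res)
```

At the base period the constant value coincides with the current value, which is a quick sanity check for any deflation routine.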
Private sector deposits (in millions of euro) with credit institutions in the province of Ourense (Spain) in 2002-2018.
data("Depositos")
A data frame with 17 observations on the following 4 variables.
year
a factor, year
quantity
a numeric vector, deposit (in millions of euro) with credit institutions
E_IPC_2016
a numeric vector, Consumer Price Index (CPI) with base 2016 in Spain
G_IPC_2016
a numeric vector, Consumer Price Index (CPI) with base 2016 in Galicia
Galician Institute of Statistics (2019):
data(Depositos)
.Sindex <- Sindex(Depositos, "year", "quantity", "2010") * 100
print(.Sindex)
Deflat(Depositos, "year", "quantity", "E_IPC_2016", "2011")
In this window the user can define any random variable by providing its parameters (for a well-known random variable), its distribution function, its density function (for a generic absolutely continuous random variable), or its probability mass function (for a generic discrete random variable).
Under the assumption that the data come from two independent Normal distributions, it performs the hypothesis test and the confidence interval for the difference of means with known population variances.
DMKV.test(x, y, difmu = 0, sdx, sdy, alternative = c("two.sided", "less", "greater"), conf.level = 0.95, ...)
x |
numerical vector (non-empty) that contains the data of the first sample. |
y |
numerical vector (non-empty) containing the data of the second sample. |
difmu |
numeric value indicating the value of the difference in population means between the two samples. |
sdx |
numerical value indicating the population standard deviation of the first sample, which is assumed to be known (mandatory). |
sdy |
numeric value indicating the population standard deviation of the second sample, which is assumed to be known (mandatory). |
alternative |
a character string specifying the alternative hypothesis, must be one of |
conf.level |
confidence level of the interval. |
... |
further arguments to be passed to or from methods. |
A list with class "htest"
containing the following components:
statistic |
the value of the test statistic. |
parameter |
sample lengths and population standard deviations. |
p.value |
the p-value for the test. |
conf.int |
confidence interval for the difference of means with known population variances associated with the specified alternative hypothesis. |
estimate |
the estimated difference in means. |
null.value |
the specified hypothesized value of the mean difference. |
alternative |
a character string describing the alternative hypothesis. |
method |
a character string indicating what type of statistical method was performed. |
data.name |
a character string giving the name(s) of the data. |
data(cars93) # Data set provided with the package
# Maximum price difference (MaxPrice) in means between cars manufactured in the
# US and those manufactured outside, assuming that the variances are known and
# equal to 64 and 169, respectively
var1 <- subset(cars93, USA == "nonUS", select = MaxPrice)
var2 <- subset(cars93, USA == "US", select = MaxPrice)
DMKV.test(var1, var2, sdx = 13, sdy = 8, difmu = 0,
          alternative = "greater", conf.level = 0.95)
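The calculation behind a test of this kind can be sketched in base R. The helper z_test_two_means below is hypothetical and only illustrates the textbook two-sample z procedure with known standard deviations; it is not the package's implementation, and the simulated samples are made up.

```r
# Textbook two-sample z test with known population standard deviations
z_test_two_means <- function(x, y, sdx, sdy, difmu = 0,
                             alternative = c("two.sided", "less", "greater"),
                             conf.level = 0.95) {
  alternative <- match.arg(alternative)
  est <- mean(x) - mean(y)
  se  <- sqrt(sdx^2 / length(x) + sdy^2 / length(y))
  z   <- (est - difmu) / se
  p <- switch(alternative,
    two.sided = 2 * pnorm(-abs(z)),
    less      = pnorm(z),
    greater   = pnorm(z, lower.tail = FALSE))
  alpha <- 1 - conf.level
  ci <- switch(alternative,
    two.sided = est + c(-1, 1) * qnorm(1 - alpha / 2) * se,
    less      = c(-Inf, est + qnorm(conf.level) * se),
    greater   = c(est - qnorm(conf.level) * se, Inf))
  list(statistic = z, p.value = p, conf.int = ci, estimate = est)
}

# Simulated data standing in for two independent Normal samples
set.seed(1)
x <- rnorm(30, mean = 21, sd = 13)
y <- rnorm(40, mean = 18, sd = 8)
out <- z_test_two_means(x, y, sdx = 13, sdy = 8, alternative = "greater")
str(out)
```

With a one-sided alternative the confidence interval is one-sided as well, which is why one of its limits is infinite.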
In this menu the user can perform some calculations related to index numbers.
This menu will call the functions for calculating simple index numbers and making base changes (Sindex
), for calculating complex index numbers (ComplexIN
), for calculating price indices (priceIndexNum
), and for deflation of economic series (Deflat
).
In this graphical interface, the user selects the data used to compute the confidence interval or perform the hypothesis test for the mean of a Normal variable.
This interface will call the statistical functions MKV.test
and t.test
, depending, respectively, on whether the population variance is known or not.
In this graphical interface, the user selects the data used to compute the confidence interval or perform the hypothesis test for the difference in means of two independent Normal variables.
This interface will call the statistical functions DMKV.test
and t.test
, depending, respectively, on whether the population variances are known or not.
In this graphical interface, the user selects the data used to compute the confidence interval or perform the hypothesis test for the variance of a Normal variable.
This interface will call the statistical functions VKM.test
and VUM.test
, depending, respectively, on whether the population mean is known or not.
listTypesVariables
returns a vector with the names and types of the variables of a data frame.
listTypesVariables(dataSet)
dataSet |
the quoted name of a data frame in memory. |
A character vector
require(datasets)
listTypesVariables("iris")
Under the assumption that the data come from a Normal distribution, it performs the hypothesis test and computes the confidence interval for the mean with known population variance.
MKV.test(x, mu = 0, sd, alternative = c("two.sided", "less", "greater"), conf.level = 0.95, ...)
x |
a (non-empty) numeric vector of data values. |
mu |
a number indicating the true value of the mean - Null hypothesis. |
sd |
numerical value indicating the population standard deviation assumed to be known (mandatory). |
alternative |
a character string specifying the alternative hypothesis, must be one of |
conf.level |
confidence level of the interval. |
... |
further arguments to be passed to or from methods. |
A list with class "htest"
containing the following components:
statistic |
the value of the test statistic. |
parameter |
sample length, population standard deviation and sample standard deviation. |
p.value |
the p-value for the test. |
conf.int |
a confidence interval for the mean appropriate to the specified alternative hypothesis. |
estimate |
the estimated mean. |
null.value |
the specified hypothesized value of the mean. |
alternative |
a character string describing the alternative hypothesis. |
method |
a character string indicating what type of statistical test was performed. |
data.name |
a character string giving the name of the data. |
data(cars93) # Dataset provided with the package
# Mean maximum price (MaxPrice) less than 20 thousand $ assuming that the
# population standard deviation is known and equal to 11
MKV.test(cars93$MaxPrice, sd = 11, alternative = "less", mu = 20, conf.level = 0.95)
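As a worked illustration of the one-sample case, the lines below compute the z statistic, the left-tailed p-value and the matching one-sided upper confidence limit by hand. The vector x is a made-up sample, and the formulas are the textbook ones, assumed rather than read from the package code. Note that sigma is a standard deviation, so the corresponding variance is 121.

```r
# Hypothetical sample standing in for a numeric variable
x <- c(15.1, 22.3, 18.7, 30.2, 12.9, 25.4, 19.8, 17.6)

sigma <- 11  # known population standard deviation (variance = 121)
mu0   <- 20  # mean under the null hypothesis
n     <- length(x)

z      <- (mean(x) - mu0) / (sigma / sqrt(n))      # test statistic
p_less <- pnorm(z)                                 # H1: mean < mu0
upper  <- mean(x) + qnorm(0.95) * sigma / sqrt(n)  # 95% upper limit

c(z = z, p.value = p_less, upper = upper)
```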
This function plots regions in probability mass or density functions.
plotRegions(D, add = FALSE, regions = NULL, col = "gray", legend = TRUE, legend.pos = "topright", to.draw.arg = 1, verticals = FALSE, ngrid = 1000, cex.points = par("cex"), mfColRow = FALSE, lwd = par("lwd"), ...)
D |
object of class " |
add |
logical; if |
regions |
a list of regions to fill with color |
col |
may be a single value or a vector indicating the colors of the regions. |
legend |
plot a legend of the regions (default |
legend.pos |
position for the |
to.draw.arg |
Either |
verticals |
logical: if TRUE, draw vertical lines at steps; as in plot.stepfun |
ngrid |
integer: number of grid points used for plots of absolutely continuous distributions |
cex.points |
numeric; character expansion factor; as in plot.stepfun |
mfColRow |
logical: should the default partition into panels be used? The default in the usage above is FALSE. |
lwd |
a vector of line widths, see par. |
... |
arguments to be passed to plot. |
invisible
##---- Should be DIRECTLY executable !! ----
##-- ==> Define data, use random,
##-- or do help(data=index) for the standard data sets.

## The function is currently defined as
function (D, add = FALSE, regions = NULL, col = "gray", legend = TRUE,
    legend.pos = "topright", to.draw.arg = 1, verticals = FALSE,
    ngrid = 1000, cex.points = par("cex"), mfColRow = FALSE,
    lwd = par("lwd"), ...)
{
    dots <- match.call(call = sys.call(0), expand.dots = FALSE)$...
    if (!is.null(dots[["panel.first"]])) {
        pF <- .panel.mingle(dots, "panel.first")
    } else if (to.draw.arg == 1) {
        pF <- quote(abline(h = 0, col = "gray"))
    } else if (to.draw.arg == 2) {
        pF <- quote(abline(h = 0:1, col = "gray"))
    } else {
        pF <- NULL
    }
    dots$panel.first <- pF
    if (!add) {
        do.call(plot, c(list(D, to.draw.arg = to.draw.arg,
            cex.points = cex.points, mfColRow = mfColRow,
            verticals = verticals), dots))
    }
    discrete <- is(D, "DiscreteDistribution")
    if (discrete) {
        x <- support(D)
        if (hasArg("xlim")) {
            if (length(xlim) != 2)
                stop("Wrong length of Argument xlim")
            x <- x[(x >= xlim[1]) & (x <= xlim[2])]
        }
        if (!is.null(regions)) {
            col <- rep(col, length = length(regions))
            for (i in 1:length(regions)) {
                region <- regions[[i]]
                which.xs <- (x > region[1] & x <= region[2])
                xs <- x[which.xs]
                ps <- d(D)(x)[which.xs]
                lines(xs, ps, type = "h", col = col[i], lwd = 3 * lwd, ...)
                points(xs, ps, pch = 16, col = col[i],
                    cex = 2 * cex.points, ...)
            }
            if (legend) {
                if (length(unique(col)) > 1) {
                    legend(legend.pos,
                        title = if (length(regions) > 1) "Regions" else "Region",
                        legend = sapply(regions, function(region) {
                            paste(round(region[1], 2), "to", round(region[2], 2))
                        }), col = col, pch = 15, pt.cex = 2.5, inset = 0.02)
                } else {
                    legend(legend.pos,
                        title = if (length(regions) > 1) "Regions" else "Region",
                        legend = sapply(regions, function(region) {
                            paste(round(region[1], 2), "to", round(region[2], 2))
                        }), inset = 0.02)
                }
            }
        }
    } else {
        lower0 <- getLow(D, eps = getdistrOption("TruncQuantile") * 2)
        upper0 <- getUp(D, eps = getdistrOption("TruncQuantile") * 2)
        me <- (distr::q.l(D))(1/2)
        s <- (distr::q.l(D))(3/4) - (distr::q.l(D))(1/4)
        lower1 <- me - 6 * s
        upper1 <- me + 6 * s
        lower <- max(lower0, lower1)
        upper <- min(upper0, upper1)
        dist <- upper - lower
        if (hasArg("xlim")) {
            if (length(xlim) != 2)
                stop("Wrong length of Argument xlim")
            x <- seq(xlim[1], xlim[2], length = ngrid)
        } else {
            x <- seq(from = lower - 0.1 * dist, to = upper + 0.1 * dist,
                length = ngrid)
        }
        if (!is.null(regions)) {
            col <- rep(col, length = length(regions))
            for (i in 1:length(regions)) {
                region <- regions[[i]]
                which.xs <- (x >= region[1] & x <= region[2])
                xs <- x[which.xs]
                ps <- d(D)(x)[which.xs]
                xs <- c(xs[1], xs, xs[length(xs)])
                ps <- c(0, ps, 0)
                polygon(xs, ps, col = col[i])
            }
            if (legend) {
                if (length(unique(col)) > 1) {
                    legend(legend.pos,
                        title = if (length(regions) > 1) "Regions" else "Region",
                        legend = sapply(regions, function(region) {
                            paste(round(region[1], 2), "to", round(region[2], 2))
                        }), col = col, pch = 15, pt.cex = 2.5, inset = 0.02)
                } else {
                    legend(legend.pos,
                        title = if (length(regions) > 1) "Regions" else "Region",
                        legend = sapply(regions, function(region) {
                            paste(round(region[1], 2), "to", round(region[2], 2))
                        }), inset = 0.02)
                }
            }
        }
    }
    return(invisible(NULL))
}
priceIndexNum
computes price indices given data on products over time (prices and quantities)
priceIndexNum(x, prodID, pervar, pvar, qvar, base, indexMethod = "laspeyres", output = "fixedBase", ...)
x |
Data frame containing, at least, the characteristics (time, location, ...), the product identifiers, the prices and the quantities. |
prodID |
Character string for the name of the product identifier. |
pervar |
Character string for the name of the factor variable with the characteristics. |
pvar |
Character string for the name of the price variable. |
qvar |
Character string for the name of the quantity variable. |
base |
Character string for the name of the base characteristic. |
indexMethod |
Character vector to select the price index method. Typical price index methods are laspeyres (default), paasche, and fisher, but it can also use those in function |
output |
A character string specifying whether chained (output="chained"), fixed base (output="fixedBase"), or period-on-period (output="pop") price index numbers should be returned. Default is fixed base. |
... |
Further arguments passed to or from other methods. |
priceIndexNum
uses the function priceIndex
from package IndexNumR
without restricting the argument pervar
to integers starting at period 1 (base) and increasing in increments of 1 period.
priceIndexNum
returns a data frame with one column for the characteristic variable plus one column for each method selected in indexMethod
:
period |
The characteristic variable. |
laspeyres |
The price index computed by the Laspeyres method. |
paasche |
The price index computed by the Paasche method. |
fisher |
The price index computed by the Fisher method. |
... |
priceIndex
, Sindex
, Deflat
, ComplexIN
.
library(IndexNumR)
data(Prices, package = "RcmdrPlugin.TeachStat")
priceIndexNum(Prices, prodID = "prodID", pervar = "year", pvar = "price",
              qvar = "quantity", base = "2003",
              indexMethod = c("laspeyres", "paasche", "fisher"))
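For readers who want to see what the three default methods compute, here is a small base R sketch of the standard Laspeyres, Paasche and Fisher formulas for one period relative to a base. The data frame toy_prices and the helper index_pair are hypothetical; the sketch does not call IndexNumR and is not the package's code.

```r
# Hypothetical data: two products observed in two years
toy_prices <- data.frame(
  year     = rep(c("2001", "2002"), each = 2),
  prodID   = rep(c("A", "B"), times = 2),
  price    = c(10, 20, 12, 22),
  quantity = c(5, 3, 4, 4)
)

# Fixed-base indices for `period` relative to `base`
index_pair <- function(d, base, period) {
  b   <- d[d$year == base, ]
  cur <- d[d$year == period, ]
  b   <- b[order(b$prodID), ]
  cur <- cur[order(cur$prodID), ]
  stopifnot(identical(as.character(b$prodID), as.character(cur$prodID)))
  lasp <- sum(cur$price * b$quantity) / sum(b$price * b$quantity)      # base quantities
  paas <- sum(cur$price * cur$quantity) / sum(b$price * cur$quantity)  # current quantities
  c(laspeyres = lasp, paasche = paas, fisher = sqrt(lasp * paas))
}

res <- index_pair(toy_prices, base = "2001", period = "2002")
print(res)
```

The Fisher index is the geometric mean of the other two, so it always lies between them.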
Data on the quantity sold and the sale price of several products over several years.
It is used as an example for the use of the Price index window of the RcmdrPlugin.TeachStat
package
data("Prices")
A data frame with 15 observations on the following 4 variables.
year
a factor representing the year
prodID
a factor with the ID of the products
price
the sale price
quantity
the sold quantity
data(Prices)
priceIndexNum(Prices, prodID = "prodID", pervar = "year", pvar = "price",
              qvar = "quantity", base = "2001",
              indexMethod = c("laspeyres", "paasche", "fisher"))
In this menu the user can perform some calculations related to One-Way ANOVA with random effects.
This menu will call the functions for calculating the ANOVA table (aov
from package stats
) and the estimations of variance components using the maximum likelihood method and the REstricted Maximum Likelihood (REML) method (lmer
from package lme4
).
In the "Nonparametric Tests" menu, two new entries are provided to perform the randomness test.
The first, "Randomness test for two level factor...", can be used to
test the randomness of a factor with two levels. This option uses the function runs.test
from
tseries
package.
runs.test
.
The second entry in the menu
"Randomness test for numeric variable...", is used to test the
randomness of a numerical variable. This option uses the function runs.test
from
randtests
package.
Here is an example of "Randomness test for a two level factor..." menu entry.
Load the data set "AMSsurvey" by selecting from the Rcmdr menu: "Data" -> "Data in packages" -> "Read data set from an attached package...", then double-click on "car", click on "AMSsurvey" and then on "OK". Rcmdr replies with the following command in the source pane (R Script):
data(AMSsurvey, package="car")
To run the randomness test on the variable "sex", select from the Rcmdr menu: "Statistics" -> "Nonparametric tests" -> "Randomness test for two level factor...", then select "sex" and click "OK". Rcmdr replies with the following command in the source pane (R Script):
with(AMSsurvey, twolevelfactor.runs.test(sex))
Here is an example of "Randomness test for a numeric variable..." menu entry.
Load the data set "sweetpotato" by selecting from the Rcmdr menu: "Data" -> "Data in packages" -> "Read data set from an attached package...", then double-click on "randtests", click on "sweetpotato" and then on "OK". Rcmdr replies with the following commands in the source pane (R Script):
data(sweetpotato, package="randtests")
sweetpotato <- as.data.frame(sweetpotato)
To run the randomness test on the variable "yield", select from the Rcmdr menu: "Statistics" -> "Nonparametric tests" -> "Randomness test for numeric variable...", then select "yield" and click "OK". Rcmdr replies with the following command in the source pane (R Script):
with(sweetpotato, numeric.runs.test(yield))
Manuel Munoz-Marquez
For more information see Rcmdr-package
.
Sindex
returns a data frame with the index numbers with a given base. An index number measures changes in a variable with respect to a characteristic (time, location, ...)
Sindex
can also be used for computing the base change of an index number.
Sindex(x, pervar, vvar, base)
x |
Data frame containing, at least, a factor and a numeric variable. |
pervar |
Character string for the name of the factor variable with the characteristics. |
vvar |
Character string for the name of the numeric variable for which you want to calculate the index number or representing the index number for which you want to compute a base change. |
base |
Character string for the name of the base characteristic. |
Sindex
returns a data frame with one column:
index_base |
The index number with base |
Deflat
, ComplexIN
, priceIndexNum
.
data(Depositos, package = "RcmdrPlugin.TeachStat")
Sindex(Depositos, "year", "quantity", "2006")
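A simple index number divides each value by the value at the base characteristic, and a base change applies the same operation to an existing index. The base R sketch below illustrates both steps; the helper simple_index and the toy vectors are hypothetical, and the scale (base equal to 1 rather than 100) is an assumption suggested by the example elsewhere in this manual that multiplies the result of Sindex by 100.

```r
# Simple index: every value relative to the value at the base characteristic
simple_index <- function(values, periods, base) {
  base_pos <- match(base, as.character(periods))
  stopifnot(!is.na(base_pos))
  setNames(values / values[base_pos], as.character(periods))
}

values  <- c(200, 220, 250)                    # hypothetical series
periods <- factor(c("2005", "2006", "2007"))

idx_2005 <- simple_index(values, periods, "2005")    # index with base 2005
idx_2006 <- simple_index(idx_2005, periods, "2006")  # base change to 2006

print(idx_2005)
print(idx_2006)
```

Changing the base of an index gives the same result as computing the index directly from the original values with the new base.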
RcmdrPlugin.TeachStat
Utility Functions
twoOrMoreLevelFactorsP()
returns TRUE if there is at least one factor in the active dataset that has two or more levels.
twoOrMoreLevelFactors()
returns the names of the factors in the active dataset that have at least two levels.
listDistrs(class, envir)
returns the object names of the distributions of class class
(see package distr
) in the envir
environment.
DiscreteDistrsP()
returns TRUE if there is at least one distribution of class DiscreteDistribution
(see DiscreteDistribution
).
AbscontDistrsP()
returns TRUE if there is at least one distribution of class AbscontDistribution
(see AbscontDistribution
).
twoOrMoreLevelFactors()
twoOrMoreLevelFactorsP()
listDistrs(class = "UnivariateDistribution", envir = .GlobalEnv, ...)
DiscreteDistrsP()
AbscontDistrsP()
class |
string with the name of the class to be listed. |
envir |
string with the name of the environment. |
... |
further arguments. |
Under the assumption that the data come from a Normal distribution, it performs the hypothesis testing and the confidence interval for the variance with known population mean.
VKM.test(x, sigma = 1, sigmasq = sigma^2, mu, alternative = c("two.sided", "less", "greater"), conf.level = 0.95, ...)
x |
a (non-empty) numeric vector of data values. |
sigma |
a number indicating the true value of the population standard deviation - Null hypothesis. |
sigmasq |
control argument. |
mu |
numerical value indicating the population mean assumed to be known (mandatory). |
alternative |
a character string specifying the alternative hypothesis, must be one of |
conf.level |
confidence level of the interval. |
... |
further arguments to be passed to or from methods. |
A list with class "htest"
containing the following components:
statistic |
the value of the test statistic. |
parameter |
the degrees of freedom for the test statistic. |
p.value |
the p-value for the test. |
conf.int |
confidence interval for variance with known population mean associated with the specified alternative hypothesis. |
estimate |
the estimated variance. |
null.value |
the specified hypothesized value of the variance. |
alternative |
a character string describing the alternative hypothesis. |
method |
a character string indicating what type of statistical test was performed. |
data.name |
a character string giving the name of the data. |
data(cars93) # Dataset provided with the package
# Variance of the maximum price (MaxPrice) assuming that the population mean
# price is known and equal to 22
VKM.test(cars93$MaxPrice, alternative = "two.sided", sigma = 11, mu = 22,
         conf.level = 0.95)
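When the population mean is known, the textbook statistic is the sum of squared deviations from that mean divided by the hypothesised variance, which follows a chi-squared distribution with n degrees of freedom under the null hypothesis. The helper var_test_known_mean below is a hypothetical base R sketch of that procedure; the two-sided p-value uses the common "twice the smaller tail" convention, which is an assumption and may differ from what the package does.

```r
# Chi-squared test for the variance of a Normal sample with known mean
var_test_known_mean <- function(x, mu, sigma0, conf.level = 0.95) {
  n    <- length(x)
  ss   <- sum((x - mu)^2)  # squared deviations from the known mean
  stat <- ss / sigma0^2    # chi-squared with n degrees of freedom under H0
  p_lower <- pchisq(stat, df = n)
  alpha   <- 1 - conf.level
  list(
    statistic = stat,
    parameter = n,
    estimate  = ss / n,
    p.value   = 2 * min(p_lower, 1 - p_lower),
    conf.int  = c(ss / qchisq(1 - alpha / 2, df = n),
                  ss / qchisq(alpha / 2, df = n))
  )
}

# Simulated data standing in for a Normal sample
set.seed(42)
x <- rnorm(50, mean = 22, sd = 11)
out <- var_test_known_mean(x, mu = 22, sigma0 = 11)
str(out)
```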
Under the assumption that the data come from a Normal distribution, it performs the hypothesis testing and the confidence interval for the variance with unknown population mean.
VUM.test(x, sigma = 1, sigmasq = sigma^2, alternative = c("two.sided", "less", "greater"), conf.level = 0.95, ...)
x |
a (non-empty) numeric vector of data values. |
sigma |
a number indicating the true value of the population standard deviation - Null hypothesis. |
sigmasq |
control argument. |
alternative |
a character string specifying the alternative hypothesis, must be one of |
conf.level |
confidence level of the interval. |
... |
further arguments to be passed to or from methods. |
A list with class "htest"
containing the following components:
statistic |
the value of the test statistic |
parameter |
the degrees of freedom for the test statistic |
p.value |
the p-value for the test. |
conf.int |
confidence interval for variance with unknown population mean associated with the specified alternative hypothesis. |
estimate |
the estimated variance. |
null.value |
the specified hypothesized value of the variance. |
alternative |
a character string describing the alternative hypothesis. |
method |
a character string indicating what type of statistical test was performed. |
data.name |
a character string giving the name of the data. |
data(cars93) # Dataset provided with the package
# Variance of the maximum price (MaxPrice) assuming that the population mean
# price is unknown
VUM.test(cars93$MaxPrice, alternative = "two.sided", sigma = 11, conf.level = 0.95)
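With an unknown mean, the sample mean replaces the population mean and one degree of freedom is lost. The lines below work through the textbook computation in base R on a made-up vector x; they are a sketch under that assumption, not the package's implementation.

```r
# Hypothetical sample
x <- c(18.2, 25.1, 30.4, 12.7, 22.9, 27.3, 16.5, 21.0)

sigma0 <- 11  # standard deviation under the null hypothesis
n      <- length(x)
df     <- n - 1
alpha  <- 0.05

stat <- df * var(x) / sigma0^2  # chi-squared with n - 1 degrees of freedom
ci   <- df * var(x) / qchisq(c(1 - alpha / 2, alpha / 2), df = df)
p_two_sided <- 2 * min(pchisq(stat, df),
                       pchisq(stat, df, lower.tail = FALSE))

list(statistic = stat, parameter = df, conf.int = ci, p.value = p_two_sided)
```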
W.numSummary
gives the main statistical summary for weighted variables (mean, standard deviation, coefficient of variation, skewness, kurtosis and quantiles). It also allows the partition of the data by a factor variable.
W.numSummary(data,
             statistics = c("mean", "sd", "se(mean)", "IQR", "quantiles",
                            "cv", "skewness", "kurtosis"),
             type = c("2", "1", "3"),
             quantiles = c(0, 0.25, 0.5, 0.75, 1),
             groups = NULL, weights)
data |
|
statistics |
any of |
type |
definition to use in computing skewness and kurtosis; see the |
quantiles |
quantiles to report; by default is |
groups |
optional variable, typically a factor, to be used to partition the data. By default is |
weights |
numeric vector of weights. Zero values are allowed. |
W.numSummary
performs a descriptive analysis of quantitative variables weighted (or not) by a numeric variable which determines the importance of each subject in the data frame. Optionally it allows the partition of the data by a factor variable (groups
).
Note that, unlike the numSummary
function, the sample standard deviation (denominator n) is calculated instead of the sample quasi-standard deviation (denominator n - 1).
An object with class "numSummary"
.
numSummary
, skewness
, kurtosis
.
data(cars93)

# unweighted
W.numSummary(data = cars93[, c("CityMPG")],
             statistics = c("mean", "sd", "IQR", "quantiles"),
             quantiles = c(0, 0.25, 0.5, 0.75, 1),
             weights = NULL, groups = NULL)

# weighted
W.numSummary(data = cars93[, c("CityMPG")],
             statistics = c("mean", "sd", "IQR", "quantiles"),
             quantiles = c(0, 0.25, 0.5, 0.75, 1),
             weights = cars93$FuelCapacity, groups = NULL)

# unweighted, by group
W.numSummary(data = cars93[, c("CityMPG")],
             statistics = c("mean", "sd", "IQR", "quantiles"),
             quantiles = c(0, 0.25, 0.5, 0.75, 1),
             weights = NULL, groups = cars93$Manual)

# weighted, by group
bb <- W.numSummary(data = cars93[, c("CityMPG")],
                   statistics = c("mean", "sd", "IQR", "quantiles"),
                   quantiles = c(0, 0.25, 0.5, 0.75, 1),
                   weights = cars93$FuelCapacity, groups = cars93$Manual)
bb
str(bb)
class(bb)
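To show how weights enter the basic summaries, the following base R sketch computes a weighted mean and a weighted standard deviation that divides by the sum of the weights. The helper weighted_summary and the toy vectors are hypothetical, and the formulas are a plausible reading of the description above rather than code taken from the package.

```r
# Weighted mean and standard deviation (denominator: sum of the weights)
weighted_summary <- function(x, w = rep(1, length(x))) {
  stopifnot(length(x) == length(w), all(w >= 0), sum(w) > 0)
  m <- sum(w * x) / sum(w)
  s <- sqrt(sum(w * (x - m)^2) / sum(w))
  c(mean = m, sd = s)
}

x <- c(20, 25, 30, 18)  # hypothetical values
w <- c(10, 0, 15, 5)    # weights; zero values are allowed

res_w  <- weighted_summary(x, w)  # weighted
res_eq <- weighted_summary(x)     # equal weights

print(res_w)
print(res_eq)
```

With equal weights this standard deviation equals sd(x) * sqrt((n - 1) / n), which is the difference between dividing by n and dividing by n - 1.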