Package 'KbMvtSkew'

Title: Khattree-Bahuguna's Univariate and Multivariate Skewness
Description: Computes Khattree-Bahuguna's univariate and multivariate skewness, as well as the principal-component-based Khattree-Bahuguna's multivariate skewness. It also provides several other measures of univariate or multivariate skewness, including Pearson's coefficient of skewness, Bowley's univariate skewness, and Mardia's multivariate skewness. See Khattree, R. and Bahuguna, M. (2019) <doi: 10.1007/s41060-018-0106-1>.
Authors: Zhixin Lun [aut, cre], Ravindra Khattree [aut]
Maintainer: Zhixin Lun <[email protected]>
License: GPL-3
Version: 1.0.2
Built: 2024-12-03 06:54:05 UTC
Source: CRAN

Help Index


Bowley's Univariate Skewness

Description

Compute Bowley's Univariate Skewness.

Usage

BowleySkew(x)

Arguments

x

a vector of original observations.

Details

Bowley's skewness is defined in terms of quantiles as

$$\hat{\gamma} = \frac{Q_3 + Q_1 - 2 Q_2}{Q_3 - Q_1}$$

where $Q_i$ is the $i$th quartile ($i = 1, 2, 3$) of the data.
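The quartile formula above can be sketched in base R as follows. This is only an illustration, not the package's implementation: the function name bowley_sketch is hypothetical, and it assumes the default quantile() estimator (type 7), which may differ from whatever quartile definition BowleySkew uses internally.

```r
# Hypothetical helper: Bowley's skewness from the three sample quartiles.
# Assumption: quartiles come from stats::quantile() with its default type.
bowley_sketch <- function(x) {
  q <- quantile(x, probs = c(0.25, 0.5, 0.75), names = FALSE)
  (q[3] + q[1] - 2 * q[2]) / (q[3] - q[1])
}

# Usage: a symmetric sample gives 0, a right-skewed one a positive value.
bowley_sketch(1:5)
bowley_sketch(c(1, 2, 3, 5, 20, 40, 50))
```

Because the measure depends only on the quartiles, it always lies between -1 and 1 and is unaffected by extreme values in the tails.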

Value

BowleySkew gives the Bowley's univariate skewness of the data.

References

Bowley, A. L. (1920). Elements of Statistics. London: P.S. King & Son, Ltd.

Examples

# Compute Bowley's univariate skewness

set.seed(2019)
x <- rnorm(1000) # Normal Distribution
BowleySkew(x)

set.seed(2019)
y <- rlnorm(1000, meanlog = 1, sdlog = 0.25) # Log-normal Distribution
BowleySkew(y)

Chatterjee, Hadi and Price Data

Description

Chatterjee, Hadi and Price Data

Usage

data(Chatterjee)

Format

The format is a dataframe of 40 observations and 7 variables.

Source

The data come from Chatterjee, Hadi and Price (2000).

References

Chatterjee, S., Hadi, A. S., and Price, B. (2000). Regression Analysis by Example. Hoboken: Wiley.

Examples

data(Chatterjee)

1984 Los Angeles Olympic records data of track events for women

Description

Data are time records from 1984 Olympic track events for women from 55 countries: 100-meter, 200-meter, 400-meter, 800-meter, 1500-meter, 3000-meter, and Marathon. The corresponding variables are named as m100, m200, m400, m800, m1500, m3000, and marathon. The time measurements are recorded in seconds.

Usage

data(OlymWomen)

Format

The format is a dataframe of 55 observations and 8 variables.

Source

The data come from Khattree and Naik (2000, pp. 511-512).

References

Khattree, R. and Naik, D. (2000). Multivariate Data Reduction and Discrimination with SAS® Software. Cary, NC: SAS Institute Inc.

Examples

data(OlymWomen)

Measurements of Heads of Swiss Soldiers

Description

Data are measurements, in millimeters, of the heads of 200 Swiss soldiers.

Usage

data(SwissHead)

Format

The format is a dataframe of 200 observations and 6 variables.

Source

The data come from Flury and Riedwyl (1988).

References

Flury, B. and Riedwyl, H. (1988). Multivariate Statistics: A Practical Approach. London: Chapman and Hall.

Examples

data(SwissHead)

Khattree-Bahuguna's Multivariate Skewness

Description

Compute Khattree-Bahuguna's Multivariate Skewness.

Usage

kbMvtSkew(x)

Arguments

x

a matrix of original observations.

Details

Let $\mathbf{X} = (X_1, \ldots, X_p)'$ be the multivariate random vector and $(X_{i_1}, X_{i_2}, \ldots, X_{i_p})'$ be one of the $p!$ permutations of $(X_1, \ldots, X_p)'$. We predict $X_{i_j}$ conditionally on subvector $(X_{i_1}, \ldots, X_{i_{j-1}})$ and compute the corresponding residual $V_{i_j}$ through a linear regression model for $j = 2, \ldots, p$. For $j = 1$, we define $V_{i_1} = X_{i_1} - \bar{X}_{i_1}$, where $\bar{X}_{i_1}$ is the mean of $X_{i_1}$. For $j \ge 2$, we have

$$\hat{X}_{i_2} = \hat{\beta}_0 + \hat{\beta}_1 X_{i_1}, \quad V_{i_2} = X_{i_2} - \hat{X}_{i_2}$$

$$\hat{X}_{i_3} = \hat{\beta}_0 + \hat{\beta}_1 X_{i_1} + \hat{\beta}_2 X_{i_2}, \quad V_{i_3} = X_{i_3} - \hat{X}_{i_3}$$

$$\vdots$$

$$\hat{X}_{i_p} = \hat{\beta}_0 + \hat{\beta}_1 X_{i_1} + \hat{\beta}_2 X_{i_2} + \cdots + \hat{\beta}_{p-1} X_{i_{p-1}}, \quad V_{i_p} = X_{i_p} - \hat{X}_{i_p}.$$

We calculate the sample skewness $\hat{\delta}_{i_j}$ of $V_{i_j}$ by the sample Khattree-Bahuguna's univariate skewness formula (see details of kbSkew that follows) respectively for $j = 1, \ldots, p$ and define $\hat{\Delta}_{i} = \sum_{j=1}^{p} \hat{\delta}_{i_j}$, $i = 1, 2, \ldots, P$, for all $P = p!$ permutations of $(X_1, \ldots, X_p)'$. The sample Khattree-Bahuguna's multivariate skewness is defined as

$$\hat{\Delta} = \frac{1}{P} \sum_{i=1}^{P} \hat{\Delta}_{i}.$$

Clearly, $0 \le \hat{\Delta} \le \frac{p}{2}$.
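The permutation-and-regression procedure described above can be sketched in base R as below. This is an illustration under stated assumptions, not the package's code: the names kb_skew_sketch, perms, and kb_mvt_skew_sketch are hypothetical, the univariate measure is summed over all $i = 1, \ldots, n$, and the regressions are fitted by ordinary least squares with an intercept via lm.fit().

```r
# Hypothetical helper: univariate Khattree-Bahuguna skewness of a vector.
kb_skew_sketch <- function(x) {
  z <- sort(x - mean(x))          # centered order statistics
  y <- (z + rev(z)) / 2           # rev(z)[i] is x_(n - i + 1)
  w <- (z - rev(z)) / 2
  sum(y^2) / (sum(y^2) + sum(w^2))
}

# Hypothetical helper: all permutations of v, one per row.
perms <- function(v) {
  if (length(v) <= 1) return(matrix(v, nrow = 1))
  do.call(rbind, lapply(seq_along(v), function(k) cbind(v[k], perms(v[-k]))))
}

# Sketch of the multivariate measure: average over all p! orderings of the
# summed skewness of sequential regression residuals.
kb_mvt_skew_sketch <- function(X) {
  X <- as.matrix(X)
  p <- ncol(X)
  delta_i <- apply(perms(seq_len(p)), 1, function(idx) {
    total <- kb_skew_sketch(X[, idx[1]])    # j = 1: mean-centered variable
    for (j in seq_len(p)[-1]) {
      design <- cbind(1, X[, idx[seq_len(j - 1)], drop = FALSE])
      total <- total + kb_skew_sketch(lm.fit(design, X[, idx[j]])$residuals)
    }
    total
  })
  mean(delta_i)
}

# Usage on simulated data (three right-skewed columns).
set.seed(2019)
X <- matrix(rexp(150), nrow = 50, ncol = 3)
kb_mvt_skew_sketch(X)
```

Since the number of orderings is $p!$, this direct enumeration is only practical for a small number of variables.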

Value

kbMvtSkew computes Khattree-Bahuguna's multivariate skewness for $p$-dimensional data.

References

Khattree, R. and Bahuguna, M. (2019). An alternative data analytic approach to measure the univariate and multivariate skewness. International Journal of Data Science and Analytics, Vol. 7, No. 1, 1-16.

See Also

kbSkew for Khattree-Bahuguna's univariate skewness.

Examples

# Compute Khattree-Bahuguna's multivariate skewness

data(OlymWomen)
kbMvtSkew(OlymWomen[, c("m800","m1500","m3000","marathon")])

Khattree-Bahuguna's Univariate Skewness

Description

Compute Khattree-Bahuguna's Univariate Skewness.

Usage

kbSkew(x)

Arguments

x

a vector of original observations.

Details

Given a univariate random sample of size $n$ consisting of observations $x_1, x_2, \ldots, x_n$, let $x_{(1)} \le x_{(2)} \le \cdots \le x_{(n)}$ be the order statistics of $x_1, x_2, \ldots, x_n$ after being centered by their mean. Define

$$y_i = \frac{x_{(i)} + x_{(n - i + 1)}}{2}$$

and

$$w_i = \frac{x_{(i)} - x_{(n - i + 1)}}{2}.$$

The sample Khattree-Bahuguna's univariate skewness is defined as

$$\hat{\delta} = \frac{\sum y_i^2}{\sum y_i^2 + \sum w_i^2}.$$

It can be shown that $0 \le \hat{\delta} \le \frac{1}{2}$. Values close to zero indicate low skewness, while those close to $\frac{1}{2}$ indicate a high degree of skewness.
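A minimal base R sketch of the definition above follows. The function name kb_skew_sketch is hypothetical, and the sums are taken over all $i = 1, \ldots, n$, which is an assumption because the text does not state the index range; it is not presented as the package's implementation.

```r
# Hypothetical helper: sample Khattree-Bahuguna univariate skewness.
kb_skew_sketch <- function(x) {
  z <- sort(x - mean(x))   # order statistics of the mean-centered data
  y <- (z + rev(z)) / 2    # pairs x_(i) with x_(n - i + 1)
  w <- (z - rev(z)) / 2
  sum(y^2) / (sum(y^2) + sum(w^2))
}

# Usage: a perfectly symmetric sample versus a right-skewed one.
kb_skew_sketch(c(-2, -1, 0, 1, 2))
kb_skew_sketch(c(1, 1, 1, 1, 10))
```

For a sample that is symmetric about its mean, every $y_i$ is zero, so the measure is zero; the more the mirrored order statistics fail to cancel, the closer the value moves to $\frac{1}{2}$.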

Value

kbSkew gives the Khattree-Bahuguna's univariate skewness of the data.

References

Khattree, R. and Bahuguna, M. (2019). An alternative data analytic approach to measure the univariate and multivariate skewness. International Journal of Data Science and Analytics, Vol. 7, No. 1, 1-16.

Examples

# Compute Khattree-Bahuguna's univariate skewness

set.seed(2019)
x <- rnorm(1000) # Normal Distribution
kbSkew(x)

set.seed(2019)
y <- rlnorm(1000, meanlog = 1, sdlog = 0.25) # Log-normal Distribution
kbSkew(y)

Mardia's Multivariate Skewness

Description

Compute Mardia's Multivariate Skewness.

Usage

MardiaMvtSkew(x)

Arguments

x

a matrix of original observations.

Details

Given a $p$-dimensional multivariate random vector $\boldsymbol{X}$ with mean vector $\boldsymbol{\mu}$ and positive definite variance-covariance matrix $\boldsymbol{\Sigma}$, Mardia's multivariate skewness is defined as

$$\beta_{1,p} = E\left\{\left[(\boldsymbol{X}_1 - \boldsymbol{\mu})' \boldsymbol{\Sigma}^{-1} (\boldsymbol{X}_2 - \boldsymbol{\mu})\right]^3\right\},$$

where $\boldsymbol{X}_1$ and $\boldsymbol{X}_2$ are independently and identically distributed copies of $\boldsymbol{X}$. For a multivariate random sample of size $n$, $\boldsymbol{x}_1, \boldsymbol{x}_2, \ldots, \boldsymbol{x}_n$, its sample version is defined as

$$\hat{\beta}_{1,p} = \frac{1}{n^2} \sum_{i=1}^{n} \sum_{j=1}^{n} \left[(\boldsymbol{x}_i - \bar{\boldsymbol{x}})' \boldsymbol{S}^{-1} (\boldsymbol{x}_j - \bar{\boldsymbol{x}})\right]^3,$$

where the sample mean $\bar{\boldsymbol{x}} = \frac{1}{n} \sum_{i=1}^{n} \boldsymbol{x}_i$ and the sample variance-covariance matrix $\boldsymbol{S} = \frac{1}{n} \sum_{i=1}^{n} (\boldsymbol{x}_i - \bar{\boldsymbol{x}}) (\boldsymbol{x}_i - \bar{\boldsymbol{x}})'$. It is assumed that $n \ge p$.
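The sample formula above translates almost directly into matrix operations. The following base R sketch is illustrative only: the name mardia_skew_sketch is hypothetical, and it assumes $\boldsymbol{S}$ is invertible and uses the divisor $n$ as stated in the text.

```r
# Hypothetical helper: sample Mardia multivariate skewness.
mardia_skew_sketch <- function(X) {
  X <- as.matrix(X)
  n <- nrow(X)
  Xc <- sweep(X, 2, colMeans(X))   # subtract the column means
  S <- crossprod(Xc) / n           # covariance matrix with divisor n
  G <- Xc %*% solve(S, t(Xc))      # G[i, j] = (x_i - xbar)' S^{-1} (x_j - xbar)
  sum(G^3) / n^2
}

# Usage on simulated data (three right-skewed columns).
set.seed(2019)
X <- matrix(rexp(150), nrow = 50, ncol = 3)
mardia_skew_sketch(X)
```

With a single variable ($p = 1$), this quantity reduces to the square of the moment-based skewness coefficient computed with divisor $n$, which gives a convenient sanity check.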

Value

MardiaMvtSkew gives the sample Mardia's multivariate skewness.

References

Mardia, K.V. (1970). Measures of multivariate skewness and kurtosis with applications. Biometrika, 57(3), 519–530.

Examples

# Compute Mardia's multivariate skewness

data(OlymWomen)
MardiaMvtSkew(OlymWomen[, c("m800","m1500","m3000","marathon")])

Principal-component-based Khattree-Bahuguna's Multivariate Skewness

Description

Compute Principal-component-based Khattree-Bahuguna's Multivariate Skewness.

Usage

pcKbSkew(x, cor = FALSE)

Arguments

x

a matrix of original scale observations.

cor

a logical value indicating whether the calculation should use the correlation matrix (cor = TRUE) or the covariance matrix (cor = FALSE). The default value is cor = FALSE.

Details

Let $\mathbf{X} = (X_1, \ldots, X_p)'$ be a $p$-dimensional multivariate random vector. We compute the sample skewness of each of the $p$ principal components of $\mathbf{X}$ by the sample Khattree-Bahuguna's univariate skewness formula (see the Details of kbSkew). Let $\eta_1, \eta_2, \ldots, \eta_p$ be the $p$ univariate skewnesses of the $p$ principal components. Principal-component-based Khattree-Bahuguna's multivariate skewness for a sample is then defined as

$$\eta = \sum_{i=1}^{p} \eta_i.$$

Clearly, $0 \le \eta \le \frac{p}{2}$.
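As a rough illustration of the definition above, the sketch below sums the univariate measure over principal component scores. The names kb_skew_sketch and pc_kb_skew_sketch are hypothetical, and the use of prcomp() (with scale. = TRUE standing in for the correlation-matrix case) is an assumption; the package may extract principal components differently.

```r
# Hypothetical helper: univariate Khattree-Bahuguna skewness of a vector.
kb_skew_sketch <- function(x) {
  z <- sort(x - mean(x))
  y <- (z + rev(z)) / 2
  w <- (z - rev(z)) / 2
  sum(y^2) / (sum(y^2) + sum(w^2))
}

# Sketch: sum the univariate skewness over all principal component scores.
# Assumption: cor = TRUE is mimicked by standardizing columns in prcomp().
pc_kb_skew_sketch <- function(X, cor = FALSE) {
  scores <- prcomp(as.matrix(X), center = TRUE, scale. = cor)$x
  sum(apply(scores, 2, kb_skew_sketch))
}

# Usage on simulated data (four right-skewed columns).
set.seed(2019)
X <- matrix(rexp(200), nrow = 50, ncol = 4)
pc_kb_skew_sketch(X)
pc_kb_skew_sketch(X, cor = TRUE)
```

The sign of a principal component is arbitrary, but the univariate measure is unchanged when a variable is negated, so the sum does not depend on the sign convention.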

Value

pcKbSkew gives the sample principal-component-based Khattree-Bahuguna's multivariate skewness.

References

Khattree, R. and Bahuguna, M. (2019). An alternative data analytic approach to measure the univariate and multivariate skewness. International Journal of Data Science and Analytics, Vol. 7, No. 1, 1-16.

See Also

kbSkew for Khattree-Bahuguna's univariate skewness.

Examples

# Compute principal-component-based Khattree-Bahuguna's multivariate skewness

data(OlymWomen)
pcKbSkew(OlymWomen[, c("m800","m1500","m3000","marathon")])

Pearson's coefficient of skewness

Description

Compute Pearson's coefficient of skewness.

Usage

PearsonSkew(x)

Arguments

x

a vector of original observations.

Details

Pearson's coefficient of skewness is defined as

$$\gamma_1 = \frac{E[(X - \mu)^3]}{\sigma^3}$$

where $\mu = E(X)$ and $\sigma^2 = E[(X - \mu)^2]$. The sample version based on a random sample $x_1, x_2, \ldots, x_n$ is defined as

$$\hat{\gamma}_1 = \frac{\sum_{i=1}^n (x_i - \bar{x})^3}{n s^3}$$

where $\bar{x}$ is the sample mean and $s$ is the sample standard deviation of the data, respectively.
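The sample version above can be sketched in one line of base R. The name pearson_skew_sketch is hypothetical, and taking $s$ to be sd(x), which uses the divisor $n - 1$, is an assumption; the text does not say which divisor PearsonSkew uses.

```r
# Hypothetical helper: sample Pearson coefficient of skewness.
# Assumption: s is stats::sd(), i.e. the n - 1 divisor.
pearson_skew_sketch <- function(x) {
  n <- length(x)
  sum((x - mean(x))^3) / (n * sd(x)^3)
}

# Usage: a symmetric sample versus a right-skewed one.
pearson_skew_sketch(c(-2, -1, 0, 1, 2))
pearson_skew_sketch(c(1, 2, 3, 10))
```

Unlike the Khattree-Bahuguna and Bowley measures, this coefficient is unbounded and signed: positive values indicate a longer right tail, negative values a longer left tail.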

Value

PearsonSkew gives the sample Pearson's univariate skewness.

References

Pearson, K. (1894). Contributions to the mathematical theory of evolution. Philos. Trans. R. Soc. Lond. A 185, 71-110.

Pearson, K. (1895). Contributions to the mathematical theory of evolution II: skew variation in homogeneous material. Philos. Trans. R. Soc. Lond. A 186, 343-414.

Examples

# Compute Pearson's univariate skewness

set.seed(2019)
x <- rnorm(1000) # Normal Distribution
PearsonSkew(x)

set.seed(2019)
y <- rlnorm(1000, meanlog = 1, sdlog = 0.25) # Log-normal Distribution
PearsonSkew(y)