Package 'Rfit' reference manual

Title:	Rank-Based Estimation for Linear Models
Description:	Rank-based (R) estimation and inference for linear models. Estimation is for general scores and a library of commonly used score functions is included.
Authors:	John Kloke, Joseph McKean
Maintainer:	John Kloke <[email protected]>
License:	GPL (>= 2)
Version:	0.27.0
Built:	2025-01-29 08:27:52 UTC
Source:	CRAN

Rank-Based Estimates and Inference for Linear Models

Description

Package provides functions for rank-based analyses of linear models. Rank-based estimation and inference offers a robust alternative to least squares.

Details

Package:	Rfit
Type:	Package
Version:	0.27.0
Date:	2024-05-25
License:	GPL (version 2 or later)
LazyLoad:	yes

Author(s)

John Kloke, Joseph McKean

Maintainer: John Kloke <[email protected]>

References

Hettmansperger, T.P. and McKean J.W. (2011), Robust Nonparametric Statistical Methods, 2nd ed., New York: Chapman-Hall.

Jaeckel, L. A. (1972). Estimating regression coefficients by minimizing the dispersion of residuals. Annals of Mathematical Statistics, 43, 1449 - 1458.

Jureckova, J. (1971). Nonparametric estimate of regression coefficients. Annals of Mathematical Statistics, 42, 1328 - 1338.

Examples

data(baseball)
data(wscores)
fit<-rfit(weight~height,data=baseball)
summary(fit)
plot(fitted(fit),rstudent(fit))

### Example of the Reduction (Drop) in dispersion test ###
y<-rnorm(47)
x1<-rnorm(47)
x2<-rnorm(47)
fitF<-rfit(y~x1+x2)
fitR<-rfit(y~x1)
drop.test(fitF,fitR)

All Scores

Description

An object of class scores which includes the score function and its derivative for rank-based regression inference.

Usage

data(wscores)

Format

The format is: Formal class 'scores' [package ".GlobalEnv"] with 2 slots ..@ phi :function (u) ..@ Dphi:function (u)

Details

Using Wilcoxon (linear) scores leads to inference with an asymptotic relative efficiency (ARE) of 0.955 relative to least squares (ML) when the data are normal. Wilcoxon scores are optimal when the underlying error distribution is logistic. Normal scores are optimal when the data are normally distributed. Log-rank scores are optimal when the data are from an exponential distribution, e.g. in a proportional hazards model. Log-Generalized F scores can also be used in the analysis of survival data (see Hettmansperger and McKean p. 233).

bentscores1 are recommended for right-skewed distributions. bentscores2 are recommended for light-tailed distributions. bentscores3 are recommended for left-skewed distributions. bentscores4 are recommended for heavy-tailed distributions.
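As a rough illustration of the score functions discussed above, the following base R sketch defines Wilcoxon and normal score functions directly instead of using the package objects. The names phi_wilcoxon and phi_normal are hypothetical, and the forms used (linear scores scaled by sqrt(12), and the standard normal quantile function) are textbook definitions assumed here; the packaged wscores and nscores objects may differ in detail. The ARE quoted above for Wilcoxon scores under normal errors corresponds to 3/pi.

```r
# Hypothetical stand-ins for the Wilcoxon and normal score functions
# (textbook forms; not extracted from the package objects).
phi_wilcoxon <- function(u) sqrt(12) * (u - 0.5)
phi_normal   <- function(u) qnorm(u)

u <- seq(0.01, 0.99, by = 0.01)

# Both score functions are strictly increasing on (0, 1).
stopifnot(all(diff(phi_wilcoxon(u)) > 0), all(diff(phi_normal(u)) > 0))

# Asymptotic relative efficiency of the Wilcoxon analysis versus least
# squares when the errors are normal.
are_wilcoxon_normal <- 3 / pi
round(are_wilcoxon_normal, 3)
```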

References

Hettmansperger, T.P. and McKean J.W. (2011), Robust Nonparametric Statistical Methods, 2nd ed., New York: Chapman-Hall.

Examples

u <- seq(0.01,0.99,by=0.01)
plot(u,getScores(wscores,u),type='l',main='Wilcoxon Scores')
plot(u,getScores(nscores,u),type='l',main='Normal Scores')

data(wscores)
x<-runif(50)
y<-rlogis(50)
rfit(y~x,scores=wscores)

x<-rnorm(50)
y<-rnorm(50)
rfit(y~x,scores=nscores)

Baseball Card Data

Description

These data come from the back-side of 59 baseball cards that Carrie had.

Usage

data(baseball)

Format

A data frame with 59 observations on the following 6 variables.

height: Height in inches
weight: Weight in pounds
bat: a factor with levels L R S
throw: a factor with levels L R
field: a factor with levels 0 1
average: ERA if the player is a pitcher and his batting average if the player is a fielder

Source

Hettmansperger, T.P. and McKean J.W. (2011), Robust Nonparametric Statistical Methods, 2nd ed., New York: Chapman-Hall.

Examples

data(baseball)
wilcox.test(height~field,data=baseball)
rfit(weight~height,data=baseball)

Baseball Salaries

Description

Salaries of 176 professional baseball players for the 1987 season.

Usage

data(bbsalaries)

Format

A data frame with 176 observations on the following 8 variables.

logYears: Log of the number of years experience
aveWins: Average wins per year
aveLosses: Average losses per year
era: Earned Run Average
aveGames: Average games pitched in per year
aveInnings: Average number of innings pitched per year
aveSaves: Average number of saves per year
logSalary: Log of the base salary in dollars

Source

http://lib.stat.cmu.edu/datasets/baseball.data

References

Hettmansperger, T.P. and McKean J.W. (2011), Robust Nonparametric Statistical Methods, 2nd ed., New York: Chapman-Hall.

Examples

data(bbsalaries)
summary(rfit(logSalary~logYears+aveWins+aveLosses+era+aveGames+aveInnings+aveSaves,data=bbsalaries))

Box and Cox (1964) data.

Description

The data are the results of a 3 * 4 two-way design, where forty-eight animals were exposed to three different poisons and four different treatments. The design is balanced with four replications per cell. The response was the log survival time of the animal.

Usage

data(BoxCox)

Format

A data frame with 48 observations on the following 3 variables.

logSurv: log Survival Time
Poison: a factor indicating poison level
Treatment: a factor indicating treatment level

Source

Box, G.E.P. and Cox, D.R. (1964), An analysis of transformations, Journal of the Royal Statistical Society, Series B, Methodological, 26, 211-252.

References

Hettmansperger, T.P. and McKean J.W. (2011), Robust Nonparametric Statistical Methods, 2nd ed., New York: Chapman-Hall.

Examples

data(BoxCox)
with(BoxCox,interaction.plot(Treatment,Poison,logSurv,median))
raov(logSurv~Poison+Treatment,data=BoxCox)

Cardiovascular risk factors

Description

Data from a study to investigate association between uric acid and various cardiovascular risk factors in developing countries (Heritier et al. 2009). There are 474 men and 524 women aged 25-64.

Usage

data(CardioRiskFactors)

Format

A data frame with 998 observations on the following 14 variables.

age: Age of subject
bmi: Body Mass Index
waisthip: waist/hip ratio(?)
smok: indicator for regular smoker
choles: total cholesterol
trig: triglycerides level in body fat
hdl: high-density lipoprotein(?)
ldl: low-density lipoprotein
sys: systolic blood pressure
dia: diastolic blood pressure(?)
Uric: serum uric
sex: indicator for male
alco: alcohol intake (mL/day)
apoa: apoprotein A

Details

Data set and description taken from Heritier et al. (2009) (cf. Conen et al. 2004).

Source

Heritier, S., Cantoni, E., Copt, S., and Victoria-Feser, M. (2009), Robust Methods in Biostatistics, New York: John Wiley and Sons.

Conen, D., Wietlisbach, V., Bovet, P., Shamlaye, C., Riesen, W., Paccaud, F., and Burnier, M. (2004), Prevalence of hyperuricemia and relation of serum uric acid with cardiovascular risk factors in a developing country. BMC Public Health.

Examples

data(CardioRiskFactors)
fitF<-rfit(Uric~bmi+sys+choles+ldl+sex+smok+alco+apoa+trig+age,data=CardioRiskFactors)
fitR<-rfit(Uric~bmi+sys+choles+ldl+sex,data=CardioRiskFactors)
drop.test(fitF,fitR)
summary(fitR)

Confidence interval adjustment methods

Description

Returns the critical value to be used in calculating adjusted confidence intervals. Currently provides the Bonferroni and Tukey adjustment methods, as well as no adjustment.

Usage

confintadjust(n, k, alpha = 0.05, method = confintadjust.methods, ...)

Arguments

`n`	sample size
`k`	number of comparisons
`alpha`	overall (experimentwise) type I error rate
`method`	one of confintadjust.methods
`...`	Additional arguments. Currently not used.

Details

Returns the critical value based on one of the adjustment methods.
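To make the idea of an adjusted critical value concrete, here is a minimal base R sketch for all pairwise comparisons among several groups. The helper cv_sketch and its argument n_groups are hypothetical, and the choices made here (t and studentized-range quantiles with n - n_groups degrees of freedom) are assumptions of the sketch; confintadjust itself may compute its values differently.

```r
# Hypothetical helper: critical values for all pairwise comparisons among
# n_groups groups with n observations in total. The degrees of freedom
# n - n_groups are an assumption of this sketch.
cv_sketch <- function(n, n_groups, alpha = 0.05,
                      method = c("bonferroni", "tukey", "none")) {
  method <- match.arg(method)
  df <- n - n_groups
  n_comp <- choose(n_groups, 2)          # number of pairwise comparisons
  switch(method,
    bonferroni = qt(1 - alpha / (2 * n_comp), df),
    tukey      = qtukey(1 - alpha, n_groups, df) / sqrt(2),
    none       = qt(1 - alpha / 2, df))
}

cvs <- sapply(c("none", "tukey", "bonferroni"),
              function(m) cv_sketch(n = 30, n_groups = 3, method = m))
cvs
```

In this setting the unadjusted value is the smallest and the Bonferroni value the largest, with Tukey in between.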

Value

`cv`	critical value
`method`	the method used

Author(s)

Joseph McKean, John Kloke

References

Hettmansperger, T.P. and McKean J.W. (2011), Robust Nonparametric Statistical Methods, 2nd ed., New York: Chapman-Hall.

Jaeckel's Dispersion Function

Description

Returns the value of Jaeckel's dispersion function for given values of the regression coefficients.

Usage

disp(beta, x, y, scores)

Arguments

`beta`	p by 1 vector of regression coefficients
`x`	n by p design matrix
`y`	n by 1 response vector
`scores`	an object of class scores

Details

Returns the value of Jaeckel's dispersion function evaluated at the value of the parameters in the function call. That is, $\sum_{i=1}^n a(R(e_i)) e_i$, where $R$ denotes rank and $a(1) \le a(2) \le \dots \le a(n)$ are the scores. The residuals $e_i$, $i = 1, \dots, n$, are calculated as $y - x\beta$.
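The sum above can be sketched in a few lines of base R. The function disp_sketch is hypothetical and hard-codes Wilcoxon scores in their textbook form; it is meant only to show the structure of the calculation, not to reproduce the scaling used by disp.

```r
# Hypothetical sketch of Jaeckel's dispersion with Wilcoxon scores.
disp_sketch <- function(beta, x, y) {
  x <- as.matrix(x)
  e <- as.vector(y - x %*% beta)            # residuals e_i
  n <- length(e)
  r <- rank(e, ties.method = "first")       # ranks R(e_i)
  a <- sqrt(12) * (r / (n + 1) - 0.5)       # scores a(R(e_i)); they sum to zero
  sum(a * e)
}

set.seed(1)
x <- rnorm(25)
y <- 1 + 2 * x + rnorm(25)
d_true  <- disp_sketch(2, x, y)    # dispersion near the generating slope
d_wrong <- disp_sketch(-2, x, y)   # dispersion at a poor slope
c(d_true, d_wrong)
```

Because the scores sum to zero, the dispersion does not change when a constant is added to y, which is why the intercept is estimated separately from the slopes.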

Author(s)

John Kloke, Joseph McKean

References

Hettmansperger, T.P. and McKean J.W. (2011), Robust Nonparametric Statistical Methods, 2nd ed., New York: Chapman-Hall.

Jaeckel, L. A. (1972). Estimating regression coefficients by minimizing the dispersion of residuals. Annals of Mathematical Statistics, 43, 1449 - 1458.

Drop (Reduction) in Dispersion Test

Description

Given a full model fit and a reduced model fit, this function performs a reduction in dispersion test.

Usage

drop.test(fitF, fitR = NULL)

Arguments

`fitF`	An object of class rfit. The full model fit.
`fitR`	An object of class rfit. The reduced model fit.

Details

Rank-based inference procedure analogous to the traditional (LS) reduced model test.

The full and reduced model dispersions are calculated. The reduction in dispersion test, or drop test for short, has an asymptotic chi-sq distribution. Simulation studies suggest using F critical values. The p-value returned is based on a F-distribution with df1 and df2 degrees of freedom where df1 is the difference in the number of parameters in the fits of fitF and fitR and df2 is the residual degrees of freedom in the fit fitF.
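The arithmetic behind the test can be sketched as follows. All input numbers are hypothetical, and the form of the statistic, (RD/df1)/(tauhat/2), is the one commonly given in the rank-based literature; it is an assumption here because the text above does not spell it out. The residual degrees of freedom are taken as n minus the number of slopes minus one for the intercept, which is also an assumption.

```r
# Hypothetical inputs, for illustration only (not output from any fit).
disp_full    <- 40.0   # dispersion of the full model
disp_reduced <- 43.0   # dispersion of the reduced model
tauhat       <- 1.2    # scale estimate from the full model
n <- 47; p_full <- 2; p_reduced <- 1   # sample size and numbers of slopes

RD  <- disp_reduced - disp_full        # reduction in dispersion
df1 <- p_full - p_reduced              # difference in number of parameters
df2 <- n - p_full - 1                  # residual df of the full fit (with intercept)
F_stat  <- (RD / df1) / (tauhat / 2)   # assumed form of the drop statistic
p_value <- pf(F_stat, df1, df2, lower.tail = FALSE)
c(F = F_stat, p.value = p_value)
```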

Both fits are based on a minimization routine. It is possible that the resulting solutions are such that fitF$disp > fitR$disp. We recommend starting the full model at the reduced model fit as a way to avoid this situation. See examples.

Checks to see if models appear to be proper subsets. The space spanned by the columns of the reduced model design matrix should be a subset of the space spanned by the columns of the full model design matrix.

Value

`F`	Value of the F test statistic
`p.value`	The observed significance level of the test (using an F quantile)
`RD`	Reduced model dispersion minus Full model dispersion
`tauhat`	Estimate of the scale parameter (using the full model residuals)
`df1`	numerator degrees of freedom
`df2`	denominator degrees of freedom

Author(s)

John Kloke, Joseph McKean

References

Hettmansperger, T.P. and McKean J.W. (2011), Robust Nonparametric Statistical Methods, 2nd ed., New York: Chapman-Hall.

Examples

y<-rnorm(47)
x1<-rnorm(47)
x2<-rnorm(47)
fitF<-rfit(y~x1+x2)
fitR<-rfit(y~x1)
drop.test(fitF,fitR)

## try starting the full model at the reduced model fit ##
fitF<-rfit(y~x1+x2,yhat0=fitR$fitted)
drop.test(fitF,fitR)

Free Fatty Acid Data

Description

The response variable is level of free fatty acid in a sample of prepubescent boys. The explanatory variables are age (in months), weight (in lbs), and skin fold thickness.

Usage

data(ffa)

Format

A data frame with 41 rows and 4 columns.

age: age in years
weight: weight in lbs
skin: skin fold thickness
ffa: free fatty acid

Source

Morrison, D.F. (1983), Applied Linear Statistical Models, Englewood Cliffs, NJ:Prentice Hall.

References

Hettmansperger, T.P. and McKean J.W. (2011), Robust Nonparametric Statistical Methods, 2nd ed., New York: Chapman-Hall.

Examples

data(ffa)
summary(rfit(ffa~age+weight+skin,data=ffa))  #using the default (Wilcoxon scores)
summary(rfit(ffa~age+weight+skin,data=ffa,scores=bentscores1))

Methods for Function getScores

Description

Calculates the centered and scaled scores as used in rank-based analysis.

Methods

signature(object = "scores")

Methods for Function getScoresDeriv

Description

This derivative is used in the estimate of the scale parameter tau.

Methods

signature(object = "scores")

Estimate of the scale parameter tau

Description

An estimate of the scale parameter tau may be used for the standard errors of the coefficients in rank-based regression.

Usage

gettau(ehat, p, scores = Rfit::wscores, delta = 0.8, hparm = 2, ...)
gettauF0(ehat, p, scores = Rfit::wscores, delta = 0.8, hparm = 2, ...)

Arguments

`ehat`	vector of length n: full model residuals
`p`	scalar: number of regression coefficients (excluding the intercept); see Details
`scores`	object of class scores, defaults to Wilcoxon scores
`delta`	confidence level; see Details
`hparm`	used in Huber's degrees of freedom correction; see Details
`...`	additional arguments. currently unused

Details

For rank-based analyses of linear models, the estimator $\hat{\tau}$ of the scale parameter $\tau$ plays a standardizing role in the standard errors (SE) of the rank-based estimators of the regression coefficients and in the denominator of Wald-type and the drop-in-dispersion test statistics of linear hypotheses. rfit currently implements the KSM (Koul, Sievers, and McKean 1987) estimator of tau.
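To give a sense of what $\tau$ measures, the sketch below evaluates it numerically for Wilcoxon scores and standard normal errors. It assumes the standard expression for Wilcoxon scores, $\tau = 1/(\sqrt{12}\int f(x)^2\,dx)$, where $f$ is the error density; this expression is not stated in the text above and is used here only for illustration. The variable names are hypothetical.

```r
# Wilcoxon-score scale parameter for an error density f (assumed form):
#   tau = 1 / (sqrt(12) * integral of f(x)^2 dx)
f_sq <- function(x) dnorm(x)^2
int_f_sq <- integrate(f_sq, lower = -Inf, upper = Inf)$value
tau_normal <- 1 / (sqrt(12) * int_f_sq)

# Compare the numerical value with the closed form sqrt(pi/3).
c(numeric = tau_normal, closed_form = sqrt(pi / 3))
```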

The functions gettau and gettauF0 are both available to compute the KSM estimate and may be called from rfit and used for inference. The default is to use the faster FORTRAN version gettauF0 via the option TAU='F0'. The R version, gettau, may be much slower, especially when sample sizes are large; this version may be called from rfit using the option TAU='R'.

The KSM estimator tauhat is a density type estimator that has the bandwidth given by $t_\delta/\sqrt{n}$, where $t_\delta$ is the $\delta$-th quantile of the cdf $H(y)$ given in expression (3.7.2) of Hettmansperger and McKean (2011), with the corresponding estimator $\hat{H}$ given in expression (3.7.7) of Hettmansperger and McKean (2011).

Based on simulation studies, in most situations where n/p >= 6, the default delta = 0.80 provides a valid rank-based analysis (McKean and Sheather, 1991). For situations with n/p < 6, caution is needed as the KSM estimate is sensitive to the choice of bandwidth. McKean and Sheather (1991) recommend using a value of 0.95 for delta in such situations.

To correct for heavy-tailed random errors, Huber (1973) proposed a degree of freedom correction for the M-estimate scale parameter. The correction is given by $K = 1 + \frac{p}{n} \cdot \frac{1-h_c}{h_c}$, where $h_c$ is the proportion of standardized residuals in absolute value less than the parameter hparm. This correction $K$ is used as a multiplicative factor to tauhat. The default value of hparm is set at 2.

The usual degrees of freedom correction, $\sqrt{n/(n-p)}$ , is also used as a multiplicative factor to tauhat.
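The two multiplicative corrections described above can be sketched in base R as follows. The helper tau_corrections_sketch is hypothetical, and standardizing the residuals by their MAD is an assumption of this sketch; the package may standardize differently.

```r
# Hypothetical helper: multiplicative corrections applied to tauhat.
# Standardizing residuals by the MAD is an assumption of this sketch.
tau_corrections_sketch <- function(ehat, p, hparm = 2) {
  n <- length(ehat)
  z <- abs(ehat / mad(ehat))                 # standardized residuals
  h_c <- mean(z < hparm)                     # proportion inside the cutoff
  K <- 1 + (p / n) * (1 - h_c) / h_c         # Huber's correction
  df_factor <- sqrt(n / (n - p))             # usual df correction
  c(h_c = h_c, K = K, df_factor = df_factor)
}

ehat <- c(-1, -0.5, 0, 0.5, 1, 10)           # one gross outlier
out <- tau_corrections_sketch(ehat, p = 1)
out
```

With one of six residuals outside the cutoff, h_c is 5/6 and K is slightly above 1; K moves further above 1 as more residuals fall outside the cutoff.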

Value

Length one numeric object.

Author(s)

Joseph McKean, John Kloke

References

Hettmansperger, T.P. and McKean J.W. (2011), Robust Nonparametric Statistical Methods, 2nd ed., New York: Chapman-Hall.

Huber, P.J. (1973), Robust regression: Asymptotics, conjectures and Monte Carlo, Annals of Statistics, 1, 799–821.

Koul, H.L., Sievers, G.L., and McKean, J.W. (1987), An estimator of the scale parameter for the rank analysis of linear models under general score functions, Scandinavian Journal of Statistics, 14, 131–141.

McKean, J. W. and Sheather, S. J. (1991), Small Sample Properties of Robust Analyses of Linear Models Based on R-Estimates: A Survey, in Directions in Robust Statistics and Diagnostics, Part II, Editors: W. Stahel and S. Weisberg, Springer-Verlag: New York, 1–19.

Examples

#  For a standard normal distribution the parameter tau has the value 1.023327 (sqrt(pi/3)).
set.seed(283643659)
n <- 12; p <- 6; y <- rnorm(n); x <- matrix(rnorm(n*p),ncol=p)
tau1 <- rfit(y~x)$tauhat; tau2 <- rfit(y~x,delta=0.95)$tauhat
c(tau1,tau2) # 0.5516708 1.0138415
n <- 120; p <- 6; y <- rnorm(n); x <- matrix(rnorm(n*p),ncol=p)
tau3 <- rfit(y~x)$tauhat; tau4 <- rfit(y~x,delta=0.95)$tauhat
c(tau3,tau4) # 1.053974 1.041783

Calculate the Gradient of Jaeckel's Dispersion Function

Description

Calculate the Gradient of Jaeckel's Dispersion Function

Usage

grad(x, y, beta, scores)

Arguments

`x`	n by p design matrix
`y`	n by 1 response vector
`beta`	p by 1 vector of regression coefficients
`scores`	an object of class scores

Value

The gradient evaluated at beta.

Author(s)

John Kloke

References

Hettmansperger, T.P. and McKean J.W. (2011), Robust Nonparametric Statistical Methods, 2nd ed., New York: Chapman-Hall.

Jaeckel, L. A. (1972). Estimating regression coefficients by minimizing the dispersion of residuals. Annals of Mathematical Statistics, 43, 1449 - 1458.

Jureckova, J. (1971). Nonparametric estimate of regression coefficients. Annals of Mathematical Statistics, 42, 1328 - 1338.

Examples

## The function is currently defined as
function (x, y, beta, scores) 
{
    x <- as.matrix(x)
    e <- y - x %*% beta
    r <- rank(e, ties.method = "first")/(length(e) + 1)
    -t(x) %*% scores@phi(r)
  }
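As a sanity check on the definition shown above, the following base R sketch compares the analytic gradient with central finite differences of the dispersion. The functions phi_w, disp_w and grad_w are hypothetical stand-ins that hard-code Wilcoxon scores in their textbook form, so that the example does not depend on a scores object.

```r
# Hypothetical Wilcoxon score function (textbook form).
phi_w <- function(u) sqrt(12) * (u - 0.5)

disp_w <- function(beta, x, y) {
  e <- as.vector(y - x %*% beta)
  r <- rank(e, ties.method = "first") / (length(e) + 1)
  sum(phi_w(r) * e)
}

grad_w <- function(beta, x, y) {
  e <- as.vector(y - x %*% beta)
  r <- rank(e, ties.method = "first") / (length(e) + 1)
  as.vector(-t(x) %*% phi_w(r))
}

set.seed(2)
x <- matrix(rnorm(60), ncol = 2)
y <- as.vector(x %*% c(1, -1) + rnorm(30))
beta <- c(0.5, 0.5)

# Central finite differences; the dispersion is piecewise linear, so these
# agree with the analytic gradient wherever the ranks do not change.
h <- 1e-6
fd <- sapply(1:2, function(j) {
  step <- replace(numeric(2), j, h)
  (disp_w(beta + step, x, y) - disp_w(beta - step, x, y)) / (2 * h)
})
cbind(analytic = grad_w(beta, x, y), finite_diff = fd)
```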

Function to Minimize Jaeckel's Dispersion Function

Description

Uses the built-in function optim to minimize Jaeckel's dispersion function with respect to beta.

Usage

jaeckel(x, y, beta0 = lm(y ~ x)$coef[2:(ncol(x) + 1)], 
  scores = Rfit::wscores, control = NULL,...)

Arguments

`x`	n by p design matrix
`y`	n by 1 response vector
`beta0`	initial estimate of beta
`scores`	object of class 'scores'
`control`	control passed to fitting routine
`...`	additional arguments to be passed to fitting routine

Details

Jaeckel's dispersion function (Jaeckel 1972) is a convex function which measures the distance between the observed responses $y$ and the fitted values $x \beta$ . The dispersion function is a sum of the products of the residuals, $y - x \beta$ , and the scored ranks of the residuals. A rank-based fit minimizes the dispersion function; see McKean and Schrader (1980) and Kloke and McKean (2012) for discussion. jaeckel uses optim with the method set to BFGS to minimize Jaeckel's dispersion function. If control is not specified at the function call, the relative tolerance (reltol) is set to .Machine$double.eps^(3/4) and maximum number of iterations is set to 200.
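The minimization described above can be sketched with base R alone. This is not the package's implementation: disp_w and grad_w are hypothetical helpers that hard-code textbook Wilcoxon scores, and only the optim settings (BFGS, reltol, maxit) are taken from the description. The least squares slope is used as the starting value.

```r
phi_w <- function(u) sqrt(12) * (u - 0.5)   # hypothetical Wilcoxon scores

disp_w <- function(beta, x, y) {
  e <- as.vector(y - x %*% beta)
  r <- rank(e, ties.method = "first") / (length(e) + 1)
  sum(phi_w(r) * e)
}
grad_w <- function(beta, x, y) {
  e <- as.vector(y - x %*% beta)
  r <- rank(e, ties.method = "first") / (length(e) + 1)
  as.vector(-t(x) %*% phi_w(r))
}

set.seed(3)
n <- 100
x <- matrix(rnorm(n), ncol = 1)
y <- as.vector(1 + 2 * x + rnorm(n))

# Least squares slope as the starting value.
beta0 <- unname(coef(lm(y ~ x))[-1])

fit <- optim(beta0, disp_w, grad_w, x = x, y = y, method = "BFGS",
             control = list(reltol = .Machine$double.eps^(3/4), maxit = 200))
c(ls = beta0, rank_based = fit$par)
```

The intercept does not appear in beta because the dispersion is unaffected by a constant shift of the residuals; it has to be estimated in a separate step.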

jaeckel is intended to be an internal function. See rfit for a general purpose function.

Value

Results of optim are returned.

Author(s)

John Kloke

References

Hettmansperger, T.P. and McKean J.W. (2011), Robust Nonparametric Statistical Methods, 2nd ed., New York: Chapman-Hall.

Jaeckel, L. A. (1972), Estimating regression coefficients by minimizing the dispersion of residuals. Annals of Mathematical Statistics, 43, 1449 - 1458.

Kapenga, J. A., McKean, J. W., and Vidmar, T. J. (1988), RGLM: Users Manual, Statist. Assoc. Short Course on Robust Statistical Procedures for the Analysis of Linear and Nonlinear Models, New Orleans.

Examples

##  This is an internal function.  See rfit for user-level examples.

Internal Functions for K-Way analysis of variance

Description

These are internal functions used to construct the robust anova table. The function raov is the main program.

Usage

kwayr(levs, data,...)
cellx(X)
khmat(levsind,permh)
pasteColsRfit(x,sep="")
redmod(xmat,amat)
subsets(k)

Arguments

`levs`	vector of levels corresponding to each of the factors
`data`	data matrix in the form y, factor 1,..., factor k
`X`	n x k matrix where the columns represent the levels of the k factors.
`levsind`	Internal parameter.
`permh`	Internal parameter.
`x`	n x k matrix where the columns represent the levels of the k factors.
`xmat`	n x p full model design matrix
`amat`	Internal parameter.
`k`	Internal parameter.
`sep`	Separator used in pasteColsRfit
`...`	additional arguments

Note

pasteColsRfit is a renamed copy of pasteCols from the plotrix package, written by Jim Lemon et al. (June 2011), under GPL 2.

Author(s)

Joseph McKean, John Kloke

References

Hettmansperger, T.P. and McKean J.W. (2011), Robust Nonparametric Statistical Methods, 2nd ed., New York: Chapman-Hall.

Hocking, R. R. (1985), The Analysis of Linear Models, Monterey, California: Brooks/Cole.

Rank-based Oneway Analysis of Variance

Description

Carries out a robust analysis of variance for a one factor design. Analysis is based on the R estimates.

Usage

oneway.rfit(y, g, scores = Rfit::wscores, p.adjust = "none",...)

Arguments

`y`	n by 1 response vector
`g`	n by 1 vector representing group membership
`scores`	an object of class 'scores'
`p.adjust`	adjustment to the p-values, argument passed to p.adjust
`...`	additional arguments

Details

Carries out a robust one-way analysis of variance based on a full model R fit.

Value

`fit`	full model fit from rfit
`est`	Estimates
`se`	Standard Errors
`I`	First Index
`J`	Second Index
`p.value`	p-values
`y`	response vector
`g`	vector denoting group membership

Author(s)

Joseph McKean, John Kloke

References

Hettmansperger, T.P. and McKean J.W. (2011), Robust Nonparametric Statistical Methods, 2nd ed., New York: Chapman-Hall.

Examples

data(quail)
oneway.rfit(quail$ldl,quail$treat)

Class "param"

Description

Internal class for use with score functions.

Objects from the Class

A virtual Class: No objects may be created from it.

Methods

No methods defined with class "param" in the signature.

Author(s)

John Kloke

Examples

showClass("param")

Rfit Internal Print Functions

Description

These functions print the output in a user-friendly manner using the internal R function print.

Usage

## S3 method for class 'rfit'
print(x, ...)
## S3 method for class 'summary.rfit'
print(x, digits = max(5, .Options$digits - 2), ...)
## S3 method for class 'drop.test'
print(x, digits = max(5, .Options$digits - 2), ...)
## S3 method for class 'oneway.rfit'
print(x, digits = max(5, .Options$digits - 2), ...)
## S3 method for class 'summary.oneway.rfit'
print(x, digits = max(5, .Options$digits - 2), ...)
## S3 method for class 'raov'
print(x, digits = max(5, .Options$digits - 2), ...)

Arguments

`x`	An object to be printed
`digits`	number of digits to display
`...`	additional arguments to be passed to `print`

Author(s)

John Kloke

Quail Data

Description

Thirty-nine quail were randomized to one of four treatments for lowering cholesterol.

Usage

data(quail)

Format

A data frame with 39 observations on the following 2 variables.

treat: a factor with levels 1 2 3 4
ldl: a numeric vector

Source

Hettmansperger, T.P. and McKean J.W. (2011), Robust Nonparametric Statistical Methods, 2nd ed., New York: Chapman-Hall.

Examples

data(quail)
boxplot(ldl~treat,data=quail)

R ANOVA

Description

Returns full model fit and robust ANOVA table for all main effects and interactions.

Usage

raov(f, data = list(), ...)

Arguments

`f`	an object of class formula
`data`	an optional data frame
`...`	additional arguments

Details

Based on reduction in dispersion tests for testing main effects and interaction. Uses an algorithm described in Hocking (1985).

Value

`table`	robust ANOVA table for the main effects and interactions
`fit`	full model fit returned from rfit
`residuals`	the residuals, i.e. y-yhat
`fitted.values`	yhat = x betahat
`call`	Call to the function

Author(s)

Joseph McKean, John Kloke

References

Hettmansperger, T.P. and McKean J.W. (2011), Robust Nonparametric Statistical Methods, 2nd ed., New York: Chapman-Hall.

Hocking, R. R. (1985), The Analysis of Linear Models, Monterey, California: Brooks/Cole.

Examples

raov(logSurv~Poison+Treatment,data=BoxCox)

Rank-based Estimates of Regression Coefficients

Description

Minimizes Jaeckel's dispersion function to obtain a rank-based solution for linear models.

Usage

rfit(formula, data = list(), ...)
## Default S3 method:
rfit(formula, data, subset, yhat0 = NULL, 
scores = Rfit::wscores, symmetric = FALSE, TAU = "F0", 
betahat0 = NULL, ...)

Arguments

`formula`	an object of class formula
`data`	an optional data frame
`subset`	an optional argument specifying the subset of observations to be used
`yhat0`	an n by 1 vector of initial fitted values, default is NULL
`scores`	an object of class 'scores'
`symmetric`	logical. If 'FALSE' uses median of residuals as estimate of intercept
`TAU`	version of estimation routine for scale parameter. F0 for Fortran, R for (slower) R, N for none
`betahat0`	a p by 1 vector of initial parameter estimates, default is NULL
`...`	additional arguments to be passed to fitting routines

Details

Rank-based estimation involves replacing the L2 norm of least squares estimation with a pseudo-norm which is a function of the residuals and the scored ranks of the residuals. That is, in rank-based estimation, the usual notion of Euclidean distance is replaced with another measure of distance which is referred to as Jaeckel's (1972) dispersion function. Jaeckel's dispersion function depends on a score function, and a library of commonly used score functions is included, e.g., linear (Wilcoxon) and normal (Gaussian) scores. If an initial fit is not supplied (i.e. yhat0 = NULL and betahat0 = NULL), then the initial fit is based on a LS fit.

Estimation of the scale parameter tau is provided, which may be used for inference.

Value

`coefficients`	estimated regression coefficients with intercept
`residuals`	the residuals, i.e. y-yhat
`fitted.values`	yhat = x betahat
`xc`	centered design matrix
`tauhat`	estimated value of the scale parameter tau
`taushat`	estimated value of the scale parameter tau_s
`betahat`	estimated regression coefficients
`call`	Call to the function

Author(s)

John Kloke, Joseph McKean

References

Hettmansperger, T.P. and McKean J.W. (2011), Robust Nonparametric Statistical Methods, 2nd ed., New York: Chapman-Hall.

Jaeckel, L. A. (1972). Estimating regression coefficients by minimizing the dispersion of residuals. Annals of Mathematical Statistics, 43, 1449 - 1458.

Jureckova, J. (1971). Nonparametric estimate of regression coefficients. Annals of Mathematical Statistics, 42, 1328 - 1338.

Examples

data(baseball)
data(wscores)
fit<-rfit(weight~height,data=baseball)
summary(fit)

### set the starting value
x1 <- runif(47); x2 <- runif(47); y <- 1 + 0.5*x1 + rnorm(47)
# based on a fit to a sub-model
rfit(y~x1+x2,yhat0=fitted.values(rfit(y~x1)))

### set value of delta used in estimation of tau ###
w <- factor(rep(1:3,each=3))
y <- rt(9,9)
rfit(y~w)$tauhat
rfit(y~w,delta=0.95)$tauhat  # recommended when n/p < 6

Studentized Residuals for Rank-Based Regression

Description

Returns the Studentized residuals based on rank-based estimation.

Usage

## S3 method for class 'rfit'
rstudent(model,...)

Arguments

`model`	an object of class rfit
`...`	additional arguments. currently not used.

Author(s)

John Kloke, Joseph McKean

References

Hettmansperger, T.P. and McKean J.W. (2011), Robust Nonparametric Statistical Methods, 2nd ed., New York: Chapman-Hall.

Examples

x<-runif(47)
y<-rcauchy(47)
qqnorm(rstudent(fit<-rfit(y~x)))
plot(x,rstudent(fit)) ; abline(h=c(-2,2))

Class "scores"

Description

A score function and its corresponding derivative are required for rank-based estimation. This object puts them together.

Objects from the Class

Objects can be created by calls of the form new("scores", ...).

Slots

phi: Object of class "function", the score function
Dphi: Object of class "function", the first derivative of the score function
param: Object of class "param"

Author(s)

John Kloke

References

Hettmansperger, T.P. and McKean J.W. (2011), Robust Nonparametric Statistical Methods, 2nd ed., New York: Chapman-Hall.

Examples

showClass("scores")

Serum Level of luteinizing hormone (LH)

Description

Hollander and Wolfe (1999) discuss a 2 by 5 factorial design for a study to determine the effect of light on the release of luteinizing hormone (LH). The factors in the design are: light regimes at two levels (constant light and 14 hours of light followed by 10 hours of darkness) and a luteinizing release factor (LRF) at 5 different dosage levels. The response is the level of luteinizing hormone (LH), nanograms per ml of serum in blood samples. Sixty rats were put on test under these 10 treatment combinations, six rats per combination.

Usage

data(serumLH)

Format

A data frame with 60 observations on the following 3 variables.

serum: a numeric vector
light.regime: a factor with levels Constant Intermittent
LRF.dose: a factor with levels 0 10 1250 250 50
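
The levels of LRF.dose are listed in character sort order rather than in increasing dose order. If dose order matters for plots or tables, one option is to rebuild the factor with explicit levels. The sketch below uses a small stand-in vector (dose, a hypothetical name) rather than the serumLH data itself.

```r
# Stand-in for the LRF.dose column: five dose labels, six replicates each.
dose <- factor(rep(c("0", "10", "50", "250", "1250"), each = 6))

# Rebuild the factor so that the levels follow increasing numeric dose.
ordered_levels <- as.character(sort(as.numeric(levels(dose))))
dose_num_order <- factor(dose, levels = ordered_levels)

levels(dose_num_order)
table(dose_num_order)
```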

Source

Hollander, M. and Wolfe, D.A. (1999), Nonparametric Statistical Methods, New York: Wiley.

References

Hollander, M. and Wolfe, D.A. (1999), Nonparametric Statistical Methods, New York: Wiley.

Examples

data(serumLH)
raov(serum~light.regime + LRF.dose + light.regime*LRF.dose, data = serumLH)

Signed-Rank Estimate of Location (Intercept)

Description

Returns the signed-rank estimate of the intercept, which is equivalent to the Hodges-Lehmann estimate of location computed from the residuals.

Usage

signedrank(x)

Arguments

`x`	numeric vector

Value

Returns the median of the Walsh averages.
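
To see why the median of the Walsh averages is a useful location estimate, the base R sketch below computes it for a small sample with one gross outlier. The helper hl_estimate is a hypothetical name used only here; it is not the package's implementation.

```r
# Hypothetical helper: median of all pairwise averages (x[i] + x[j]) / 2, i <= j.
hl_estimate <- function(x) {
  avg <- outer(x, x, "+") / 2
  median(avg[upper.tri(avg, diag = TRUE)])
}

x <- c(1, 2, 3, 4, 100)  # the last value is a gross outlier

mean(x)         # 22: pulled toward the outlier
hl_estimate(x)  # 3: stays with the bulk of the data
```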

Author(s)

John Kloke, Joseph McKean

References

Hettmansperger, T.P. and McKean J.W. (2011), Robust Nonparametric Statistical Methods, 2nd ed., New York: Chapman-Hall.

Hollander, M. and Wolfe, D.A. (1999), Nonparametric Statistical Methods, New York: Wiley.

Examples


## The function is currently defined as
function (x) 
median(walsh(x))

Provides a summary for the oneway anova based on an R fit.

Description

Provides a summary for the oneway anova based on an R fit, including a test for main effects as well as tests for pairwise comparisons.

Usage

## S3 method for class 'oneway.rfit'
summary(object, alpha=0.05,method=confintadjust.methods,...)

Arguments

`object`	an object of class 'oneway.rfit', usually, a result of a call to 'oneway.rfit'
`alpha`	Experimentwise Error Rate
`method`	method used in confidence interval adjustment
`...`	additional arguments

Author(s)

John Kloke, Joseph McKean

References

Hettmansperger, T.P. and McKean J.W. (2011), Robust Nonparametric Statistical Methods, 2nd ed., New York: Chapman-Hall.

Examples

data(quail)
oneway.rfit(quail$ldl,quail$treat)

Summarize Rank-Based Linear Model Fits

Description

Provides a summary similar to the traditional least squares fit.

Usage

## S3 method for class 'rfit'
summary(object,overall.test,...)

Arguments

`object`	an object of class 'rfit', usually, a result of a call to 'rfit'
`overall.test`	either 'wald' or 'drop'
`...`	additional arguments

Details

Provides summary statistics based on a rank-based fit. A table of estimates, standard errors, t-ratios, and p-values is provided. An overall test of the explanatory variables is also provided; the default is a Wald test. A drop in dispersion test is also available, in which case a robust R^2 is provided as well.

Author(s)

John Kloke

References

Hettmansperger, T.P. and McKean J.W. (2011), Robust Nonparametric Statistical Methods, 2nd ed., New York: Chapman-Hall.

Examples

data(baseball)
fit<-rfit(weight~height,data=baseball)
summary(fit)
summary(fit,overall.test='drop')

Internal Functions for Estimating tau

Description

These are internal functions used for calculating the scale parameter tau necessary for estimating the standard errors of coefficients for rank-regression.

Usage

hstarreadyscr(ehat,asc,ascpr)
hstar(abdord, wtord, const, n, y) 
looptau(delta, abdord, wtord, const, n)
pairup(x,type="less") 

Arguments

`ehat`	Full model residuals
`delta`	Window parameter (proportion) used in the Koul et al. estimator of tau. Default value is 0.80. If the ratio of sample size to number of regression parameters (n to p) is less than 5, larger values such as 0.90 to 0.95 are more appropriate.
`y`	Argument of function hstar
`abdord`	Ordered absolute differences of residuals
`wtord`	Standardized (by const) ordered absolute differences of residuals
`const`	Range of score function
`n`	Sample size
`x`	Argument for pairup
`type`	Argument for the function pairup
`asc`	scores
`ascpr`	derivative of the scores

Author(s)

Joseph McKean, John Kloke

References

Hettmansperger, T.P. and McKean J.W. (2011), Robust Nonparametric Statistical Methods, 2nd ed., New York: Chapman-Hall.

Koul, H.L., Sievers, G.L., and McKean, J.W. (1987), An estimator of the scale parameter for the rank analysis of linear models under general score functions, Scandinavian Journal of Statistics, 14, 131-141.

Estimate of the Scale Parameter taustar

Description

An estimate of the scale parameter taustar = 1/(2*f(0)) is needed for the standard error of the intercept in rank-based regression.

Usage

taustar(e, p, conf = 0.95)

Arguments

`e`	n x 1 vector of full model residuals
`p`	is the number of regression coefficients (without the intercept)
`conf`	confidence level of CI used

Details

Confidence interval estimate of taustar. See, for example, Hettmansperger and McKean (1998) p.7-8 and p.25-26.
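
For intuition about the target quantity, the population value 1/(2*f(0)) can be evaluated directly when the error density f is known. The sketch below does this in base R for two densities centered at zero. The helper name taustar_pop is hypothetical, and the code does not reproduce the confidence-interval estimator used by taustar.

```r
# Hypothetical helper: population value 1 / (2 * f(0)) given the density at 0.
taustar_pop <- function(f0) 1 / (2 * f0)

# Standard normal errors: f(0) = 1 / sqrt(2 * pi), so taustar = sqrt(pi / 2).
tau_norm <- taustar_pop(dnorm(0))

# Standard Cauchy errors: f(0) = 1 / pi, so taustar = pi / 2.
tau_cauchy <- taustar_pop(dcauchy(0))

c(normal = tau_norm, cauchy = tau_cauchy)
```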

Value

Length-one numeric object containing the estimated scale parameter taustar.

Author(s)

Joseph McKean, John Kloke

References

Hettmansperger, T.P. and McKean J.W. (2011), Robust Nonparametric Statistical Methods, 2nd ed., New York: Chapman-Hall.

Examples

##  This is an internal function.  See rfit for user-level examples.

Telephone Data

Description

The number of telephone calls (in tens of millions) made in Belgium from 1950 to 1973.

Usage

data(telephone)

Format

A data frame with 24 observations on the following 2 variables.

year: years since 1950 AD
calls: number of telephone calls in tens of millions

Source

Rousseeuw, P.J. and Leroy, A.M. (1987), Robust Regression and Outlier Detection, New York: Wiley.

References

Hettmansperger, T.P. and McKean J.W. (2011), Robust Nonparametric Statistical Methods, 2nd ed., New York: Chapman-Hall.

Examples

data(telephone)
plot(telephone)
abline(rfit(calls~year,data=telephone))

Variance-Covariance Matrix for Rank-Based Regression

Description

Returns the variance-covariance matrix of the regression estimates from an object of type rfit.

Usage

## S3 method for class 'rfit'
vcov(object, intercept = NULL,...)

Arguments

`object`	an object of type rfit
`intercept`	logical. If TRUE include the variance-covariance estimates corresponding to the intercept
`...`	additional arguments

Author(s)

John Kloke

References

Hettmansperger, T.P. and McKean J.W. (2011), Robust Nonparametric Statistical Methods, 2nd ed., New York: Chapman-Hall.

Overall Wald test

Description

Conducts a Wald test of the hypothesis that all regression parameters are zero.

Usage

wald.test.overall(fit)

Arguments

`fit`	result from a call to rfit
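
The general shape of an overall Wald test can be sketched in base R. The example below swaps in a least-squares fit from lm (not a rank-based fit) purely to show the quadratic form: the slope estimates, the matching block of the variance-covariance matrix, and an F reference distribution with p and n - p - 1 degrees of freedom. Whether wald.test.overall uses exactly this scaling is an assumption here.

```r
set.seed(1)
n  <- 47
x1 <- rnorm(n)
x2 <- rnorm(n)
y  <- 1 + 0.5 * x1 + rnorm(n)

# Least-squares stand-in for the fitted model (NOT a rank-based fit).
fit_ls <- lm(y ~ x1 + x2)

b <- coef(fit_ls)[-1]      # slope estimates, intercept dropped
V <- vcov(fit_ls)[-1, -1]  # their estimated covariance matrix
p <- length(b)

# Wald statistic: quadratic form in the slopes, scaled by p.
W    <- as.numeric(t(b) %*% solve(V, b)) / p
pval <- pf(W, p, n - p - 1, lower.tail = FALSE)
c(statistic = W, p.value = pval)
```

For a least-squares fit this statistic coincides with the overall F statistic reported by summary(fit_ls), which gives a quick sanity check on the algebra.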

Author(s)

John Kloke

References

Hettmansperger, T.P. and McKean J.W. (2011), Robust Nonparametric Statistical Methods, 2nd ed., New York: Chapman-Hall.

Examples

x <- rnorm(47)
y <- rnorm(47)
wald.test.overall(rfit(y~x))

Walsh Averages

Description

Given a list of n numbers, the Walsh averages are the n(n+1)/2 pairwise averages (x_i + x_j)/2 with i <= j.

Usage

walsh(x)

Arguments

`x`	A numeric vector

Value

The Walsh averages.
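
As a small worked case, a vector of length n = 3 has 3 * 4 / 2 = 6 Walsh averages. The sketch below builds them without explicit loops. The helper name walsh_pairs is hypothetical, and the index construction is one possible vectorization rather than the package's code.

```r
# Hypothetical helper: all averages (x[i] + x[j]) / 2 with i <= j,
# ordered with i varying slowest.
walsh_pairs <- function(x) {
  n <- length(x)
  i <- rep(seq_len(n), times = rev(seq_len(n)))  # 1,1,1,2,2,3 for n = 3
  j <- sequence(rev(seq_len(n))) + i - 1L        # 1,2,3,2,3,3 for n = 3
  0.5 * (x[i] + x[j])
}

w <- walsh_pairs(c(1, 2, 4))
w          # 1.0 1.5 2.5 2.0 3.0 4.0
length(w)  # 6, i.e. n * (n + 1) / 2 with n = 3
median(w)  # 2.25
```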

Author(s)

John Kloke, Joseph McKean

References

Hettmansperger, T.P. and McKean J.W. (2011), Robust Nonparametric Statistical Methods, 2nd ed., New York: Chapman-Hall.

Hollander, M. and Wolfe, D.A. (1999), Nonparametric Statistical Methods, New York: Wiley.

Examples


median(walsh(rnorm(100)))  # Hodges-Lehmann estimate of location

## The function is currently defined as
function (x) 
{
    n <- length(x)
    w <- vector(n * (n + 1)/2, mode = "numeric")
    ind <- 0
    for (i in 1:n) {
        for (j in i:n) {
            ind <- ind + 1
            w[ind] <- 0.5 * (x[i] + x[j])
        }
    }
    return(w)
  }
