Package 'dr'

Title: Methods for Dimension Reduction for Regression
Description: Functions, methods, and datasets for fitting dimension reduction regression, using slicing (methods SAVE and SIR), Principal Hessian Directions (phd, using residuals and the response), and an iterative IRE. Partial methods that condition on categorical predictors are also available. A variety of tests and stepwise deletion of predictors are also included, as is code for computing permutation tests of dimension. Adding additional methods of estimating dimension is straightforward. For documentation, see the vignette in the package. With version 3.0.4, the arguments for dr.step have been modified.
Authors: Sanford Weisberg <[email protected]>
Maintainer: Sanford Weisberg <[email protected]>
License: GPL (>= 2)
Version: 3.0.10
Built: 2024-11-07 06:23:54 UTC
Source: CRAN

Help Index


Australian institute of sport data

Description

Data on 102 male and 100 female athletes collected at the Australian Institute of Sport.

Format

This data frame contains the following columns:

Sex

(0 = male or 1 = female)

Ht

height (cm)

Wt

weight (kg)

LBM

lean body mass

RCC

red cell count

WCC

white cell count

Hc

Hematocrit

Hg

Hemoglobin

Ferr

plasma ferritin concentration

BMI

body mass index, weight/(height)**2

SSF

sum of skin folds

Bfat

Percent body fat

Label

Case Labels

Sport

Sport

Source

Ross Cunningham and Richard Telford

References

S. Weisberg (2005). Applied Linear Regression, 3rd edition. New York: Wiley, Section 6.4

Examples

data(ais)

Swiss banknote data

Description

Six measurements made on 100 genuine Swiss banknotes and 100 counterfeit ones.

Format

This data frame contains the following columns:

Length

Length of bill, mm

Left

Width of left edge, mm

Right

Width of right edge, mm

Bottom

Bottom margin width, mm

Top

Top margin width, mm

Diagonal

Length of image diagonal, mm

Y

0 = genuine, 1 = counterfeit

Source

Flury, B. and Riedwyl, H. (1988). Multivariate Statistics: A practical approach. London: Chapman & Hall.

References

Weisberg, S. (2005). Applied Linear Regression, 3rd edition. New York: Wiley, Problem 12.5.

Examples

data(banknote)

Main function for dimension reduction regression

Description

This is the main function in the dr package. It creates objects of class dr to estimate the central (mean) subspace and perform tests concerning its dimension. Several helper functions that require a dr object can then be applied to the output from this function.

Usage

dr (formula, data, subset, group=NULL, na.action = na.fail, weights, ...)
    
dr.compute (x, y, weights, group=NULL, method = "sir", chi2approx="bx",...)

Arguments

formula

a two-sided formula like y~x1+x2+x3, where the left-side variable is a vector or a matrix of the response variable(s), and the right-hand side variables represent the predictors. While any legal formula in the Wilkinson-Rogers notation can appear, dimension reduction methods generally expect the predictors to be numeric, not factors, with no nesting. Full rank models are recommended, although rank deficient models are permitted.

The left-hand side of the formula will generally be a single vector, but it can also be a matrix, such as cbind(y1,y2)~x1+x2+x3 if the method is "save" or "sir". Both of these methods are based on slicing, and for the multivariate case slices are determined by slicing on all the columns of the left-hand side variables.

data

an optional data frame containing the variables in the model. By default the variables are taken from the environment from which ‘dr’ is called.

subset

an optional vector specifying a subset of observations to be used in the fitting process.

group

If used, this argument specifies a grouping variable so that dimension reduction is done separately for each distinct level. This is implemented only when method is one of "sir", "save", or "ire". This argument must be a one-sided formula. For example, ~Location would fit separately for each level of the variable Location. The formula ~A:B would fit separately for each combination of A and B, provided that both have been declared factors.

weights

an optional vector of weights to be used where appropriate. In the context of dimension reduction methods, weights are used to obtain elliptical symmetry, not constant variance.

na.action

a function which indicates what should happen when the data contain ‘NA’s. The default is ‘na.fail,’ which will stop calculations. The option 'na.omit' is also permitted, but it may not work correctly when weights are used.

x

The design matrix. This will be computed from the formula by dr and then passed to dr.compute, or you can create it yourself.

y

The response vector or matrix

method

This character string specifies the method of fitting. The options include "sir", "save", "phdy", "phdres" and "ire". Each method may have its own additional arguments, or its own defaults; see the details below for more information.

chi2approx

Several dr methods compute significance levels using statistics that are asymptotically distributed as a linear combination of $\chi^2(1)$ random variables. This keyword chooses the approximation used to compute the p-values: either "bx", the default, for a method suggested by Bentler and Xie (2000), or "wood" for a method proposed by Wood (1989).

...

For dr, all additional arguments passed to dr.compute. For dr.compute, additional arguments may be required for particular dimension reduction method. For example, nslices is the number of slices used by "sir" and "save". numdir is the maximum number of directions to compute, with default equal to 4. Other methods may have other defaults.

Details

The general regression problem studies $F(y|x)$, the conditional distribution of a response $y$ given a set of predictors $x$. This function provides methods for estimating the dimension and central subspace of a general regression problem. That is, we want to find a $p \times d$ matrix $B$ of minimal rank $d$ such that

$$F(y|x) = F(y|B'x)$$

Both the dimension $d$ and the subspace $R(B)$ are unknown. These methods make few assumptions. Many methods are based on the inverse distribution, $F(x|y)$.

For the methods "sir", "save", "phdy" and "phdres", a kernel matrix $M$ is estimated such that the column space of $M$ should be close to the central subspace $R(B)$. The eigenvectors corresponding to the $d$ largest eigenvalues of $M$ provide an estimate of $R(B)$.
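
The kernel-matrix idea can be sketched in base R for the "sir" case. This is a simplified illustration under stated assumptions (slices formed from the ranks of y with no special handling of ties, no weights, a single response); the helper name sir_kernel_sketch is hypothetical and the code is not the package's implementation.

```r
# Hypothetical helper: a simplified SIR kernel, not the dr implementation.
sir_kernel_sketch <- function(x, y, nslices = 8) {
  n <- nrow(x)
  # Standardize predictors: z has mean 0 and identity sample covariance.
  xc <- scale(x, center = TRUE, scale = FALSE)
  R <- chol(cov(x))                 # cov(x) = t(R) %*% R
  z <- xc %*% solve(R)
  # Assign cases to slices of roughly equal size by the rank of y.
  slice <- ceiling(rank(y, ties.method = "first") * nslices / n)
  # Kernel matrix: weighted covariance of the within-slice means of z.
  M <- matrix(0, ncol(x), ncol(x))
  for (h in unique(slice)) {
    idx <- slice == h
    zbar <- colMeans(z[idx, , drop = FALSE])
    M <- M + (sum(idx) / n) * tcrossprod(zbar)
  }
  e <- eigen(M, symmetric = TRUE)
  # Back-transform eigenvectors to the original predictor scale.
  list(M = M, evalues = e$values, evectors = solve(R, e$vectors))
}

set.seed(1)
x <- matrix(rnorm(400 * 4), 400, 4)
y <- x[, 1] + 0.1 * rnorm(400)      # depends on x only through x1
fit <- sir_kernel_sketch(x, y)
# First estimated direction, rescaled to length one.
b1 <- fit$evectors[, 1] / sqrt(sum(fit$evectors[, 1]^2))
```

In this simulated example the response depends on the predictors only through the first one, so the leading eigenvector should load almost entirely on the first coordinate and the remaining eigenvalues should be small.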

For the method "ire", subspaces are estimated by minimizing an objective function.

Categorical predictors can be included using the group argument, with the methods "sir", "save" and "ire", using the ideas from Chiaromonte, Cook and Li (2002).

The primary output from this method is (1) a set of vectors whose span estimates R(B); and (2) various tests concerning the dimension d.

Weights can be used, essentially to specify the relative frequency of each case in the data. Empirical weights that make the contours of the weighted sample closer to elliptical can be computed using dr.weights. This will usually result in zero weight for some cases. The function will set zero estimated weights to missing.

Value

dr returns an object that inherits from dr (the name of the type is the value of the method argument), with attributes:

x

The design matrix

y

The response vector

weights

The weights used, normalized to add to n.

qr

QR factorization of x.

cases

Number of cases used.

call

The initial call to dr.

M

A matrix that depends on the method of computing. The column space of M should be close to the central subspace.

evalues

The eigenvalues of M (or squared singular values if M is not symmetric).

evectors

The eigenvectors of M (or of M'M if M is not square and symmetric) ordered according to the eigenvalues.

chi2approx

Value of the input argument of this name.

numdir

The maximum number of directions to be found. The output value of numdir may be smaller than the input value.

slice.info

output from 'sir.slice', used by sir and save.

method

the dimension reduction method used.

terms

same as terms attribute in lm or glm. Needed to make update work correctly.

A

If method="save", then A is a three dimensional array needed to compute test statistics.

Author(s)

Sanford Weisberg, <[email protected]>.

References

Bentler, P. M. and Xie, J. (2000), Corrections to test statistics in principal Hessian directions. Statistics and Probability Letters, 47, 381-389. Approximate p-values.

Cook, R. D. (1998). Regression Graphics. New York: Wiley. This book provides the basic results for dimension reduction methods, including detailed discussion of the methods "sir", "phdy" and "phdres".

Cook, R. D. (2004). Testing predictor contributions in sufficient dimension reduction. Annals of Statistics, 32, 1062-1092. Introduced marginal coordinate tests.

Cook, R. D. and Nachtsheim, C. (1994), Reweighting to achieve elliptically contoured predictors in regression. Journal of the American Statistical Association, 89, 592–599. Describes the weighting scheme used by dr.weights.

Cook, R. D. and Ni, L. (2005). Sufficient dimension reduction via inverse regression: A minimum discrepancy approach, Journal of the American Statistical Association, 100, 410-428. The "ire" is described in this paper.

Cook, R. D. and Weisberg, S. (1999). Applied Regression Including Computing and Graphics, New York: Wiley, http://www.stat.umn.edu/arc. The program arc described in this book also computes most of the dimension reduction methods described here.

Chiaromonte, F., Cook, R. D. and Li, B. (2002). Sufficient dimension reduction in regressions with categorical predictors. Ann. Statist. 30 475-497. Introduced grouping, or conditioning on factors.

Shao, Y., Cook, R. D. and Weisberg, S. (2007). Marginal tests with sliced average variance estimation. Biometrika. Describes the tests used for "save".

Wen, X. and Cook, R. D. (2007). Optimal Sufficient Dimension Reduction in Regressions with Categorical Predictors, Journal of Statistical Planning and Inference. This paper extends the "ire" method to grouping.

Wood, A. T. A. (1989) An F approximation to the distribution of a linear combination of chi-squared variables. Communications in Statistics: Simulation and Computation, 18, 1439-1456. Approximations for p-values.

Examples

data(ais)
# default fitting method is "sir"
s0 <- dr(LBM~log(SSF)+log(Wt)+log(Hg)+log(Ht)+log(WCC)+log(RCC)+
  log(Hc)+log(Ferr),data=ais) 
# Refit, using a different function for slicing to agree with arc.
summary(s1 <- update(s0,slice.function=dr.slices.arc))
# Refit again, using save, with 10 slices; the default is max(8,ncol+3)
summary(s2<-update(s1,nslices=10,method="save"))
# Refit, using phdres. Tests are different for phd; output is similar
# for phdy, but tests are not justifiable.
summary(s3<- update(s1,method="phdres"))
# fit using ire:
summary(s4 <- update(s1,method="ire"))
# fit using Sex as a grouping variable.  
s5 <- update(s4,group=~Sex)

Dimension reduction tests

Description

Functions to compute various tests concerning the dimension of a central subspace.

Usage

dr.test(object, numdir, ...)

dr.coordinate.test(object, hypothesis,d,chi2approx,...)

## S3 method for class 'ire'
dr.joint.test(object, hypothesis, d = NULL,...)

Arguments

object

The name of an object returned by a call to dr.

hypothesis

A specification of the null hypothesis to be tested by the coordinate test. See details below for options.

d

For conditional coordinate hypotheses, specify the dimension of the central mean subspace, typically 1, 2 or possibly 3. If left at the default, tests are unconditional.

numdir

The maximum dimension to consider. If not set defaults to 4.

chi2approx

Approximation method for p-values of a linear combination of $\chi^2(1)$ random variables. Choices are from c("bx","wood"), for the Bentler-Xie and Wood approximations, respectively. The default is either "bx" or the value set in the call that created the dr object.

...

Additional arguments. None are currently available.

Details

dr.test returns marginal dimension tests. dr.coordinate.test returns marginal coordinate tests (Cook, 2004) if d=NULL, or conditional coordinate tests if d is a positive integer giving the assumed dimension of the central subspace. The function dr.joint.test tests the coordinate hypothesis and dimension simultaneously. It is defined only for ire, and is used to compute the conditional coordinate test.

As an example, suppose we have created a dr object using the formula y ~ x1 + x2 + x3 + x4. The marginal coordinate hypothesis defined by Cook (2004) tests the hypothesis that y is independent of some of the predictors given the other predictors. For example, one could test whether x4 could be dropped from the problem by testing y independent of x4 given x1,x2,x3.

The hypothesis to be tested is determined by the argument hypothesis. The argument hypothesis = ~.-x4 would test the hypothesis of the last paragraph. Alternatively, hypothesis = ~x1+x2+x3 would test the same hypothesis.

More generally, if H is a $p \times q$ matrix of rank $q$, and $P(H)$ is the projection on the column space of H, then specifying hypothesis = H will test the hypothesis that $Y$ is independent of $(I-P(H))X$ given $P(H)X$.
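
As a small base-R illustration of the matrix form, the projection $P(H)$ can be computed directly. The helper name proj is hypothetical, and the choice of H below (the first three coordinate vectors) is an assumption meant to mirror the "drop x4" example above.

```r
# Illustrative only: the projection P(H) onto the column space of H.
proj <- function(H) {
  H <- as.matrix(H)
  H %*% solve(crossprod(H), t(H))
}

p <- 4
# Columns of H span the predictors kept under the null: x1, x2, x3.
H <- diag(p)[, 1:3]
P <- proj(H)
Q <- diag(p) - P                    # projection onto the complement

x <- c(x1 = 1, x2 = 2, x3 = 3, x4 = 4)
kept <- drop(P %*% x)               # x1, x2, x3 retained, zero for x4
dropped <- drop(Q %*% x)            # only the x4 coordinate remains
```

With this H, the hypothesis states that the response is independent of the part of the predictor vector carried by Q (here x4) given the part carried by P (here x1, x2, x3).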

Value

Returns a list giving the value of the test statistic and an asymptotic p.value computed from the test statistic. For SIR objects, the p.value is computed in two ways. The general test, indicated by p.val(Gen) in the output, assumes only that the predictors are linearly related. The restricted test, indicated by p.val(Res) in the output, assumes in addition to the linearity condition that a constant covariance condition holds; see Cook (2004) for more information on these assumptions. In either case, the asymptotic distribution is a linear combination of Chi-squared random variables. The function specified by the chi2approx argument approximates this linear combination by a single Chi-squared variable.

For SAVE objects, two p.values are also returned. p.val(Nor) assumes predictors are normally distributed, in which case the test statistic is asymptotically Chi-squared with the number of df shown. Assuming general linearly related predictors we again get an asymptotic linear combination of Chi-squares that leads to p.val(Gen).

For IRE and PIRE, the test statistics have an asymptotic $\chi^2$ distribution, so the value of chi2approx is not relevant.

Author(s)

Yongwu Shao for SIR and SAVE and Sanford Weisberg for all methods, <[email protected]>

References

Cook, R. D. (2004). Testing predictor contributions in sufficient dimension reduction. Annals of Statistics, 32, 1062-1092.

Cook, R. D. and Ni, L. (2005). Sufficient dimension reduction via inverse regression: A minimum discrepancy approach, Journal of the American Statistical Association, 100, 410-428.

Cook, R. D. and Weisberg, S. (1999). Applied Regression Including Computing and Graphics. Hoboken NJ: Wiley.

Shao, Y., Cook, R. D. and Weisberg, S. (2007, in press). Marginal tests with sliced average variance estimation. Biometrika.

See Also

drop1.dr, coord.hyp.basis, dr.step, dr.pvalue

Examples

#  This will match Table 5 in Cook (2004).  
data(ais)
# To make this identical to Arc (Cook and Weisberg, 1999), need to modify slices to match.
summary(s1 <- dr(LBM~log(SSF)+log(Wt)+log(Hg)+log(Ht)+log(WCC)+log(RCC)+log(Hc)+log(Ferr),
  data=ais,method="sir",slice.function=dr.slices.arc,nslices=8))
dr.coordinate.test(s1,~.-log(Hg))
#The following nearly reproduces Table 5 in Cook (2004)
drop1(s1,chi2approx="wood",update=FALSE)
drop1(s1,d=2,chi2approx="wood",update=FALSE)
drop1(s1,d=3,chi2approx="wood",update=FALSE)

Directions selected by dimension reduction regression

Description

Dimension reduction regression returns a set of up to $p$ orthogonal direction vectors each of length $p$, the first $d$ of which estimate a basis of a $d$-dimensional central subspace. The function returns the estimated directions in the original $n$-dimensional space for plotting.

Usage

dr.direction(object, which, x)
dr.directions(object, which, x)
## Default S3 method:
dr.direction(object, which=NULL,x=dr.x(object))

dr.basis(object,numdir)

## S3 method for class 'ire'
dr.basis(object,numdir=length(object$result))

Arguments

object

a dimension reduction regression object created by dr.

which

selects the directions wanted; the default is all directions. If method is ire, then the directions depend on the value of the dimension you select.

numdir

The number of basis vectors to return

x

select the X matrix, the default is dr.x(object)

Details

Dimension reduction regression is used to estimate a basis of the central subspace or mean central subspace of a regression. If there are $p$ predictors, the dimension of the central subspace is less than or equal to $p$. These two functions, dr.basis and dr.direction, return vectors that describe the central subspace in various ways.

Consider dr.basis first. If you set numdir=3, for example, this method will return a $p$ by 3 matrix whose columns span the estimated three dimensional central subspace. For all methods except for ire, this simply returns the first three columns of object$evectors. For the ire method, this returns the three vectors determined by a three-dimensional solution. Call this matrix $C$. The basis is determined by back-transforming from centered and scaled predictors to the scale of the original predictors, and then renormalizing the vectors to have length one. These vectors are orthogonal in the inner product determined by Var(X).

The dr.direction method returns $XC$, the same space but now a subspace of the original $n$-dimensional space. These vectors are appropriate for plotting.
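
The relationship between the two results can be illustrated with plain matrices. In this sketch the matrix C is a made-up stand-in for a basis (in practice it would come from dr.basis), and centering X before multiplying is an assumption of the sketch rather than a documented behavior.

```r
set.seed(4)
n <- 20; p <- 4
X <- matrix(rnorm(n * p), n, p)

# Stand-in for a p x 2 basis (in dr this would come from dr.basis).
C <- cbind(c(1, 1, 0, 0), c(0, 0, 1, -1))
C <- sweep(C, 2, sqrt(colSums(C^2)), "/")   # columns of length one

# Centering X first is an assumption of this sketch.
Xc <- scale(X, center = TRUE, scale = FALSE)
directions <- Xc %*% C                      # n x 2, one column per direction

dim(C)            # p rows: a basis in predictor space
dim(directions)   # n rows: the same directions evaluated at each case
```

The columns of directions are the quantities one would plot against the response or against each other.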

Value

Both functions return a matrix: for dr.direction, the matrix has n rows and numdir columns, and for dr.basis it has p rows and numdir columns.

Author(s)

Sanford Weisberg <[email protected]>

References

See R. D. Cook (1998). Regression Graphics. New York: Wiley.

See Also

dr

Examples

data(ais)
#fit dimension reduction using sir
m1 <- dr(LBM~Wt+Ht+RCC+WCC, method="sir", nslices = 8, data=ais)
summary(m1)
dr.basis(m1)
dr.directions(m1)

Permutation tests of dimension for dr

Description

Approximates marginal dimension test significance levels for sir, save, and phd by sampling from the permutation distribution.

Usage

dr.permutation.test(object, npermute=50,numdir=object$numdir)

Arguments

object

a dimension reduction regression object created by dr

npermute

number of permutations to compute, default is 50

numdir

maximum permitted value of the dimension, with the default from the object

Details

The method approximates significance levels of the marginal dimension tests based on a permutation test. The algorithm: (1) permutes the rows of the predictor but not the response; (2) computes marginal dimension tests for the permuted data; (3) obtains significance levels by comparing the observed statistics to the permutation distribution.

The method is not implemented for ire.
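
The three steps above follow a general permute, recompute, compare pattern, which can be sketched in base R. The helper perm_pvalue_sketch and the stand-in statistic nr2 are hypothetical; the package's marginal dimension statistics are different, and this sketch shows only the resampling logic.

```r
# Hypothetical helper showing the permute/recompute/compare pattern only.
perm_pvalue_sketch <- function(x, y, stat_fn, npermute = 50) {
  observed <- stat_fn(x, y)
  permuted <- replicate(npermute, {
    # (1) permute the rows of the predictors, leaving the response fixed
    xp <- x[sample(nrow(x)), , drop = FALSE]
    # (2) recompute the statistic on the permuted data
    stat_fn(xp, y)
  })
  # (3) significance level: share of permuted statistics >= observed
  mean(permuted >= observed)
}

# Stand-in statistic (an assumption): n times R^2 from a linear fit.
nr2 <- function(x, y) nrow(x) * summary(lm(y ~ x))$r.squared

set.seed(2)
x <- matrix(rnorm(100 * 3), 100, 3)
y_dep <- x[, 1] + rnorm(100)        # response truly depends on x
p_dep <- perm_pvalue_sketch(x, y_dep, nr2, npermute = 100)
```

Because the returned level is a proportion over npermute draws, its resolution is 1/npermute, which is one reason to raise npermute above the default of 50 when small significance levels matter.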

Value

Returns an object of type ‘dr.permutation.test’ that can be printed or summarized to give the summary of the test.

Author(s)

Sanford Weisberg, [email protected]

References

See www.stat.umn.edu/arc/addons.html, and then select the article on dimension reduction regression or inverse regression.

See Also

dr

Examples

data(ais)
# the Australian athletes data
# fit dimension reduction regression using sir
m1 <- dr(LBM~Wt+Ht+RCC+WCC, method="sir", nslices = 8, data=ais)
summary(m1)
dr.permutation.test(m1,npermute=100)
plot(m1)

Compute the Chi-square approximations to a weighted sum of Chi-square(1) random variables.

Description

Returns an approximate p-value for a weighted sum of independent $\chi^2(1)$ random variables.

Usage

dr.pvalue(coef,f,chi2approx=c("bx","wood"),...)

bentlerxie.pvalue(coef, f)

wood.pvalue(coef, f, tol=0.0, print=FALSE)

Arguments

coef

a vector of nonnegative weights

f

Observed value of the statistic

chi2approx

Which approximation should be used?

tol

tolerance for Wood's method.

print

Printed output for Wood's method

...

Arguments passed from dr.pvalue to wood.pvalue.

Details

For Bentler-Xie, we approximate $f$ by $c\chi^2(d)$ for values of $c$ and $d$ computed by the function. The Wood approximation is more complicated.
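
One common way to obtain such a scaled chi-square approximation is to match the first two moments of the weighted sum. The sketch below does that in base R; the helper name bx_sketch is hypothetical, and whether it reproduces bentlerxie.pvalue exactly is an assumption that has not been checked against the package source.

```r
# Two-moment matching for T = sum(coef * chisq_1); helper name is hypothetical.
bx_sketch <- function(coef, f) {
  tr1 <- sum(coef)          # E[T]
  tr2 <- sum(coef^2)        # Var[T] / 2
  d <- tr1^2 / tr2          # adjusted degrees of freedom
  scale <- tr1 / d          # T is approximated by scale * chisq(d)
  stat <- f / scale         # rescaled statistic, referred to chisq(d)
  data.frame(test = f, test.adj = stat, df.adj = d,
             pval.adj = pchisq(stat, d, lower.tail = FALSE))
}

# With equal unit weights the approximation is exact: T ~ chisq(3).
res_bx <- bx_sketch(c(1, 1, 1), 7.81)
```

With unequal weights the adjusted degrees of freedom are generally not an integer, which pchisq handles without modification.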

Value

Returns a data.frame with four named components:

test

The input argument f.

test.adj

For Bentler-Xie, returns $cf$; for Wood, returns NA.

df.adj

For Bentler-Xie, returns $d$; for Wood, returns NA.

pval.adj

Approximate p.value.

Author(s)

Sanford Weisberg <[email protected]>

References

Peter M. Bentler and Jun Xie (2000), Corrections to test statistics in principal Hessian directions. Statistics and Probability Letters, 47, 381-389.

Wood, Andrew T. A. (1989) An F approximation to the distribution of a linear combination of chi-squared variables. Communications in Statistics: Simulation and Computation, 18, 1439-1456.


Divide a vector into slices of approximately equal size

Description

Divides a vector into slices of approximately equal size.

Usage

dr.slices(y, nslices)

dr.slices.arc(y, nslices)

Arguments

y

a vector of length $n$ or an $n \times p$ matrix

nslices

the number of slices, no larger than $n$, or a vector of $p$ numbers giving the number of slices in each direction. If $y$ has $p$ columns and nslices is a number, then the number of slices in each direction is the smallest integer greater than the $p$-th root of nslices.

Details

If $y$ is an $n$-vector, order $y$. The goal for the number of observations per slice is $m$, the integer part of $n$/nslices. Allocate the first $m$ observations to slice 1. If there are duplicates in $y$, keep adding observations to the first slice until the next value of $y$ is not equal to the largest value in the first slice. Allocate the next $m$ values to the next slice, and again check for ties. Continue until all values are allocated to a slice. This does not guarantee that nslices will be obtained, nor does it guarantee an equal number of observations per slice. This method of choosing slices is invariant under rescaling, but not under multiplication by $-1$, so the slices of $y$ will not be the same as the slices of $-y$. This function was rewritten for Version 2.0.4 of this package, and will no longer give exactly the same results as the program Arc. If you want to duplicate Arc, use the function dr.slices.arc, as illustrated in the example below.

If $y$ is a matrix of $p$ columns, slice the first column as described above. Then, within each of the slices determined for the first column, slice based on the second column, so that each of the "cells" has approximately the same number of observations. Continue through all the columns. This method is not invariant under reordering of the columns, or under multiplication by $-1$.
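
The single-vector procedure can be sketched in base R as follows. The helper slice_sketch is hypothetical and is written from the description above, not from the package source; in particular, placing all remaining observations in the last requested slice is an assumption of this sketch.

```r
# Hypothetical helper following the description above; not the dr source.
slice_sketch <- function(y, nslices) {
  n <- length(y)
  ord <- order(y)
  ys <- y[ord]
  m <- max(1, floor(n / nslices))    # target observations per slice
  indicator <- integer(n)
  start <- 1
  h <- 0L
  while (start <= n) {
    h <- h + 1L
    end <- min(start + m - 1, n)
    # Assumption: the last requested slice absorbs whatever remains.
    if (h == nslices) end <- n
    # Extend the slice while the next value ties with its largest value.
    while (end < n && ys[end + 1] == ys[end]) end <- end + 1
    indicator[ord[start:end]] <- h
    start <- end + 1
  }
  list(slice.indicator = indicator, nslices = h,
       slice.sizes = as.vector(table(indicator)))
}

s <- slice_sketch(c(1, 1, 1, 2, 3, 4, 5, 6), nslices = 4)
# Heavy ties: the first slice swallows all six 1s, so fewer slices result.
s_ties <- slice_sketch(c(rep(1, 6), 2, 3), nslices = 4)
```

The second call shows why the number of slices actually produced can be smaller than the number requested: ties force the first slice to grow past its target size.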

Value

Returns a named list with three elements as follows:

slice.indicator

a vector of length $n$ giving the slice to which each observation is assigned

nslices

Gives the actual number of slices produced, which may be smaller than the number requested.

slice.sizes

The number of observations in each slice.

Author(s)

Sanford Weisberg, <[email protected]>

References

R. D. Cook and S. Weisberg (1999), Applied Regression Including Computing and Graphics, New York: Wiley.

See Also

dr

Examples

data(ais)
summary(s1 <- dr(LBM~log(SSF)+log(Wt)+log(Hg)+log(Ht)+log(WCC)+log(RCC)+
                 log(Hc)+log(Ferr), data=ais,method="sir",nslices=8))
# To make this identical to ARC, need to modify slices to match.
summary(s2 <- update(s1,slice.info=dr.slices.arc(ais$LBM,8)))

Estimate weights for elliptical symmetry

Description

This function estimates weights to apply to the rows of a data matrix to make the resulting weighted matrix as close to elliptically symmetric as possible.

Usage

dr.weights(formula, data = list(), subset, na.action = na.fail, 
    sigma=1, nsamples=NULL, ...)

Arguments

formula

A one-sided or two-sided formula. The right hand side is used to define the design matrix.

data

An optional data frame.

subset

A list of cases to be used in computing the weights.

na.action

The default is na.fail, to prohibit computations. If set to na.omit, the function will return a list of weights of the wrong length for use with dr.

nsamples

The weights are determined by random sampling from a data-determined normal distribution. This controls the number of samples. The default is 10 times the number of cases.

sigma

Scale factor, set to one by default; see the paper by Cook and Nachtsheim for more information on choosing this parameter.

...

Arguments are passed to cov.rob to compute a robust estimate of the covariance matrix.

Details

The basic outline is: (1) Estimate a mean m and covariance matrix S using a possibly robust method; (2) For each iteration, obtain a random vector from N(m,sigma*S). Add 1 to a counter for observation i if the i-th row of the data matrix is closest to the random vector; (3) return as weights the sample fraction allocated to each observation. If you set the keyword weights.only to T on the call to dr, then only the list of weights will be returned.

Value

Returns a list of $n$ weights, some of which may be zero.
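
The outline above can be sketched in base R. The helper weights_sketch is hypothetical; it uses the ordinary sample mean and covariance instead of a robust estimate from cov.rob, and it measures closeness by Euclidean distance, both of which are simplifying assumptions.

```r
# Hypothetical helper mirroring steps (1)-(3); uses cov(), not a robust estimate.
weights_sketch <- function(x, sigma = 1, nsamples = 10 * nrow(x)) {
  x <- as.matrix(x)
  n <- nrow(x); p <- ncol(x)
  m <- colMeans(x)                       # (1) location
  S <- cov(x)                            # (1) scatter (non-robust here)
  R <- chol(sigma * S)                   # sigma * S = t(R) %*% R
  # (2) draw nsamples vectors from N(m, sigma * S)
  draws <- sweep(matrix(rnorm(nsamples * p), nsamples, p) %*% R, 2, m, "+")
  votes <- integer(n)
  for (k in seq_len(nsamples)) {
    d2 <- rowSums(sweep(x, 2, draws[k, ])^2)   # squared Euclidean distances
    i <- which.min(d2)                         # closest observation
    votes[i] <- votes[i] + 1L
  }
  votes / nsamples                       # (3) fraction of draws per case
}

set.seed(3)
x <- cbind(rexp(50), rexp(50))           # skewed, far from elliptical
w <- weights_sketch(x)
```

Observations that are never the closest case to any draw receive weight zero, which matches the note that some returned weights may be zero.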

Author(s)

Sanford Weisberg, [email protected]

References

R. D. Cook and C. Nachtsheim (1994), Reweighting to achieve elliptically contoured predictors in regression. Journal of the American Statistical Association, 89, 592–599.

See Also

dr, cov.rob

Examples

data(ais)
w1 <- dr.weights(~ Ht +Wt +RCC, data = ais)
m1 <- dr(LBM~Ht+Wt+RCC,data=ais,weights=w1)

Sequential fitting of coordinate tests using a dr object

Description

This function implements backward elimination using a dr object for which a dr.coordinate.test is defined, currently for SIR, SAVE, IRE and PIRE.

Usage

dr.step(object,scope=NULL,d=NULL,minsize=2,stop=0,trace=1,...)

## S3 method for class 'dr'
drop1(object, scope = NULL,  update=TRUE,
test="general",trace=1,...)

Arguments

object

A dr object for which dr.coordinate.test is defined, for method equal to one of sir, save or ire.

scope

A one sided formula specifying predictors that will never be removed.

d

To use conditional coordinate tests, specify the dimension of the central (mean) subspace. The default is NULL, meaning no conditioning. This is currently available only for methods sir, save without categorical predictors, or for ire with or without categorical predictors.

minsize

Minimum subset size, must be greater than or equal to 2.

stop

Set stopping criterion: continue removing variables until the p-value for the next variable to be removed is less than stop. The default is stop = 0.

update

If true, the update method is used to return a dr object obtained from object by updating the formula to drop the variable with the largest p.value. This can significantly slow the computations for IRE but has little effect on SAVE and SIR.

test

Type of test to be used for selecting the next predictor to remove for method="save" only. "normal" assumes normal predictors, "general" assumes elliptically contoured predictors. For other methods, this argument is ignored.

trace

If positive, print informative output at each step, the default. If trace is 0 or false, suppress all printing.

...

Additional arguments passed to dr.coordinate.test.

Details

Suppose a dr object has $p = a + b$ predictors, with $a$ predictors specified in the scope statement. drop1 will compute either marginal coordinate tests (if d=NULL) or conditional marginal coordinate tests (if d is positive) for dropping each of the $b$ predictors not in the scope, and return p.values. The result is an object created from the original object with the predictor with the largest p.value removed.

dr.step will call drop1.dr repeatedly until $\max(a, d+1)$ predictors remain.
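
The control flow of this backward elimination can be sketched without the package. Everything in the block is hypothetical: backward_sketch is an illustrative skeleton, pvalue_fn stands in for a coordinate test, d = 0 stands in for the unconditional case, and the toy p-values are invented for the example.

```r
# Hypothetical skeleton; pvalue_fn stands in for a coordinate test.
backward_sketch <- function(predictors, keep, pvalue_fn, d = 0, stop = 0) {
  current <- predictors
  removed <- character(0)
  floor_size <- max(length(keep), d + 1)
  while (length(current) > floor_size) {
    candidates <- setdiff(current, keep)   # predictors in scope are never dropped
    if (length(candidates) == 0) break
    pvals <- vapply(candidates, function(v) pvalue_fn(v, current), numeric(1))
    worst <- candidates[which.max(pvals)]
    if (max(pvals) < stop) break           # next removal would fall below stop
    current <- setdiff(current, worst)
    removed <- c(removed, worst)
  }
  list(kept = current, removed = removed)
}

# Toy p-values (invented for illustration), fixed per predictor.
toy_p <- c(x1 = 0.001, x2 = 0.40, x3 = 0.03, x4 = 0.75)
res <- backward_sketch(names(toy_p), keep = "x1",
                       pvalue_fn = function(v, current) toy_p[[v]],
                       stop = 0.05)
```

In this toy run x4 and then x2 are removed, and the loop stops at x3 because its p-value is below the stop threshold. In a real fit the p-values would be recomputed after each removal, which is why pvalue_fn receives the current predictor set.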

Value

As a side effect, a data frame of labels, tests, df, and p.values is printed. If update=TRUE, a dr object is returned with the predictor with the largest p.value removed.

Author(s)

Sanford Weisberg, <[email protected]>, based on the drop1 generic function in the base R. The dr.step function is also similar to step in base R.

References

Cook, R. D. (2004). Testing predictor contributions in sufficient dimension reduction. Annals of Statistics, 32, 1062-1092.

Shao, Y., Cook, R. D. and Weisberg, S. (2007). Marginal tests with sliced average variance estimation. Biometrika.

See Also

dr.coordinate.test

Examples

data(ais)
# To make this identical to ARC, need to modify slices to match by
# using slice.info=dr.slices.arc() rather than nslices=8
summary(s1 <- dr(LBM~log(SSF)+log(Wt)+log(Hg)+log(Ht)+log(WCC)+log(RCC)+
                 log(Hc)+log(Ferr), data=ais,method="sir",
                 slice.function=dr.slices.arc,nslices=8)) 
# The following will almost duplicate information in Table 5 of Cook (2004).
# Slight differences occur because a different approximation for the
# sum of independent chi-square(1) random variables is used:
ans1 <- drop1(s1)
ans2 <- drop1(s1,d=2)
ans3 <- drop1(s1,d=3)
# remove predictors stepwise until we run out of variables to drop.
dr.step(s1,scope=~log(Wt)+log(Ht))

Mussels' muscles data

Description

Data were furnished by Mike Camden, Wellington Polytechnic, Wellington, New Zealand. Horse mussels (Atrina) were sampled from the Marlborough Sounds. The response is the mussels' muscle mass.

Format

This data frame contains the following columns:

H

Shell height in mm

L

Shell length in mm

M

Muscle mass in g

S

Shell mass in g

W

Shell width in mm

Source

R. D. Cook and S. Weisberg (1999). Applied Regression Including Computing and Graphics. New York: Wiley.


Basic plot of a dr object

Description

Plots selected direction vectors determined by a dimension reduction regression fit. By default, the pairs function is used for plotting, but the user can use any other graphics command that is appropriate.

Usage

## S3 method for class 'dr'
plot(x, which = 1:x$numdir, mark.by.y = FALSE, plot.method = pairs, ...)

Arguments

x

The name of an object of class dr, a dimension reduction regression object

which

selects the directions to be plotted

mark.by.y

if TRUE, color points according to the value of the response, otherwise, do not color points but include the response as a variable in the plot.

plot.method

the name of a function for the plotting. The default is pairs.

...

arguments passed to the plot.method.

Value

Returns a graph.

Author(s)

Sanford Weisberg, <[email protected]>.

Examples

data(ais)
# default fitting method is "sir"
s0 <- dr(LBM~log(SSF)+log(Wt)+log(Hg)+log(Ht)+log(WCC)+log(RCC)+
  log(Hc)+log(Ferr),data=ais)
plot(s0)
plot(s0,mark.by.y=TRUE)