Title: | Semiparametric Regression |
---|---|
Description: | Functions for semiparametric regression analysis, to complement the book: Ruppert, D., Wand, M.P. and Carroll, R.J. (2003). Semiparametric Regression. Cambridge University Press. |
Authors: | Matt Wand <[email protected]> |
Maintainer: | Billy Aung Myint <[email protected]> |
License: | GPL (>= 2) |
Version: | 1.0-4.2 |
Built: | 2024-11-09 06:12:44 UTC |
Source: | CRAN |
The age.income data frame has 205 pairs of observations on Canadian workers from a 1971 Canadian Census Public Use Tape (Ullah, 1985).
data(age.income)
This data frame contains the following columns:
age: age in years.
log.income: logarithm of income.
Ullah, A. (1985). Specification analysis of econometric models. Journal of Quantitative Economics, 2, 187-209.
Ruppert, D., Wand, M.P. and Carroll, R.J. (2003)
Semiparametric Regression Cambridge University Press.
http://stat.tamu.edu/~carroll/semiregbook/
library(SemiPar)
data(age.income)
attach(age.income)
plot(age,log.income)
The bpd data frame has data on 223 human babies.
data(bpd)
This data frame contains the following columns:
birthweight: birthweight of baby (grammes).
BPD: an indicator of presence of bronchopulmonary dysplasia (BPD): 0=absent, 1=present.
Pagano, M. and Gauvreau, K. (1993). Principles of Biostatistics. Duxbury Press.
Ruppert, D., Wand, M.P. and Carroll, R.J. (2003)
Semiparametric Regression Cambridge University Press.
http://stat.tamu.edu/~carroll/semiregbook/
library(SemiPar)
data(bpd)
attach(bpd)
plot(birthweight,BPD)
boxplot(split(birthweight,BPD),col="green")
The calif.air.poll data frame has 345 sets of observations on ozone level and meteorological variables in Upland, California, U.S.A., in 1976.
data(calif.air.poll)
This data frame contains the following columns:
ozone.level: Ozone concentration (ppm) at Sandburg Air Force Base.
daggett.pressure.gradient: Pressure gradient at Daggett, California.
inversion.base.height: Inversion base height, feet.
inversion.base.temp: Inversion base temperature, degrees Fahrenheit.
Breiman, L. and Friedman, J. (1985). Estimating optimal transformations for multiple regression and correlation (with discussion). Journal of the American Statistical Association, 80, 580–619.
library(SemiPar)
data(calif.air.poll)
pairs(calif.air.poll)
The copper data frame has 442 sets of observations from a simulation based on a stockpile of mined material in the former Soviet Union. Boreholes have been drilled into the dump. The drill core is cut every 5 metres and assayed for copper and cobalt content in percentage by weight.
data(copper)
This data frame contains the following columns:
sample number.
sample identification number.
zone code.
x co-ordinate.
y co-ordinate.
z co-ordinate.
grade measurement.
percentage of copper.
Clark, I. and Harper, W.V. (2000). Practical Geostatistics 2000. Columbus, Ohio: Ecosse North America Llc.
library(SemiPar)
data(copper)
pairs(copper[,4:7])
The elec.temp data frame has 55 observations on monthly electricity usage and average temperature for a house in Westchester County, New York, USA.
data(elec.temp)
This data frame contains the following columns:
usage: monthly electricity usage (kilowatt-hours) from a house in Westchester County, New York, USA.
temp: average temperature (degrees Fahrenheit) for the corresponding month.
Chatterjee, S., Handcock, M. and Simonoff, J.S. (1995). A Casebook for a First Course in Statistics and Data Analysis, New York: John Wiley & Sons.
Ruppert, D., Wand, M.P. and Carroll, R.J. (2003)
Semiparametric Regression Cambridge University Press.
http://stat.tamu.edu/~carroll/semiregbook/
library(SemiPar)
data(elec.temp)
attach(elec.temp)
plot(usage,temp)
The ethanol data frame contains 88 sets of measurements for variables from an experiment in which ethanol was burned in a single cylinder automobile test engine.
data(ethanol)
This data frame contains the following columns:
the concentration of nitric oxide (NO) and nitrogen dioxide (NO2) in engine exhaust, normalized by the work done by the engine.
the compression ratio of the engine
the equivalence ratio at which the engine was run – a measure of the richness of the air/ethanol mix.
Brinkman, N.D. (1981). Ethanol fuel – a single-cylinder engine study of efficiency and exhaust emissions. SAE transactions Vol. 90, No 810345, 1410–1424.
Ruppert, D., Wand, M.P. and Carroll, R.J. (2003)
Semiparametric Regression Cambridge University Press.
http://stat.tamu.edu/~carroll/semiregbook/
library(SemiPar)
data(ethanol)
pairs(ethanol)
Extracts fitted values from a semiparametric regression fit object.
## S3 method for class 'spm'
fitted(object,...)
object | a fitted spm object as produced by spm(). |
... | other possible arguments. |
Extracts fitted values from a semiparametric regression fit object. The fitted values are defined to be the set of values obtained when the predictor variable data are substituted into the fitted regression model.
The vector of fitted values.
M.P. Wand [email protected] (other contributors listed in SemiPar Users' Manual).
Ruppert, D., Wand, M.P. and Carroll, R.J. (2003)
Semiparametric Regression Cambridge University Press.
http://stat.tamu.edu/~carroll/semiregbook/
Ganguli, B. and Wand, M.P. (2005)
SemiPar 1.0 Users' Manual.
http://matt-wand.utsacademics.info/SPmanu.pdf
spm
plot.spm
lines.spm
predict.spm
summary.spm
residuals.spm
library(SemiPar)
data(fossil)
attach(fossil)
fit <- spm(strontium.ratio~f(age))
plot(fit)
points(age,fitted(fit),col="red")
The fossil data frame has 106 observations on fossil shells.
data(fossil)
This data frame contains the following columns:
age: age in millions of years.
strontium.ratio: ratios of strontium isotopes.
Bralower, T.J., Fullagar, P.D., Paull, C.K., Dwyer, G.S. and Leckie, R.M. (1997). Mid-cretaceous strontium-isotope stratigraphy of deep-sea sections. Geological Society of America Bulletin, 109, 1421-1442.
Ruppert, D., Wand, M.P. and Carroll, R.J. (2003)
Semiparametric Regression Cambridge University Press.
http://stat.tamu.edu/~carroll/semiregbook/
library(SemiPar)
data(fossil)
attach(fossil)
plot(age,strontium.ratio)
The fuel.frame data frame contains data on 5 variables (columns) for 117 cars (rows).
data(fuel.frame)
This data frame contains the following columns:
character variable giving the name (make) of the car
the weight of the car in pounds.
the engine displacement in litres.
gas mileage in miles/gallon.
a derived variable concerning fuel efficiency.
a factor giving the general type of car. The levels are: Small, Sporty, Compact, Medium, Large, Van.
Consumer Reports, April, 1990, pp. 235-288.
Chambers, J.M. and Hastie, T.J. (eds.) (1992)
Statistical Models in S.
Wadsworth and Brooks, Pacific Grove, California.
library(SemiPar)
data(fuel.frame)
pairs(fuel.frame)
par(mfrow=c(2,2))
fuel.fit <- lm(Fuel ~ Weight + Disp.,fuel.frame)
plot(fuel.fit,ask=FALSE)
par(mfrow=c(1,1))
The janka data frame has 36 observations on Australian timber samples.
data(janka)
This data frame contains the following columns:
dens: a measure of density of the timber.
hardness: the Janka hardness (structural property) of the timber.
Williams, E.J. (1959) Regression Analysis, New York: John Wiley & Sons.
Ruppert, D., Wand, M.P. and Carroll, R.J. (2003)
Semiparametric Regression Cambridge University Press.
http://stat.tamu.edu/~carroll/semiregbook/
library(SemiPar)
data(janka)
attach(janka)
plot(dens,hardness)
The lidar data frame has 221 observations from a light detection and ranging (LIDAR) experiment.
data(lidar)
This data frame contains the following columns:
range: distance travelled before the light is reflected back to its source.
logratio: logarithm of the ratio of received light from two laser sources.
Sigrist, M. (Ed.) (1994). Air Monitoring by Spectroscopic Techniques (Chemical Analysis Series, vol. 197). New York: Wiley.
Ruppert, D., Wand, M.P. and Carroll, R.J. (2003)
Semiparametric Regression Cambridge University Press.
http://stat.tamu.edu/~carroll/semiregbook/
library(SemiPar)
data(lidar)
attach(lidar)
plot(range,logratio)
Takes a fitted spm object produced by spm() and adds a curve. The function is only appropriate in the case of a single predictor.
## S3 method for class 'spm'
lines(x,...)
x | a fitted spm object as produced by spm(). |
... | other graphics parameters described in Appendix B of the SemiPar Users' Manual http://matt-wand.utsacademics.info/SPmanu.pdf |
Takes a fitted spm object produced by spm() and adds a curve. The function is only appropriate in the case of a single predictor.
The function adds a curve to a plot.
M.P. Wand [email protected] (other contributors listed in SemiPar Users' Manual).
Ruppert, D., Wand, M.P. and Carroll, R.J. (2003)
Semiparametric Regression Cambridge University Press.
http://stat.tamu.edu/~carroll/semiregbook/
Ganguli, B. and Wand, M.P. (2005)
SemiPar 1.0 Users' Manual.
http://matt-wand.utsacademics.info/SPmanu.pdf
spm
plot.spm
predict.spm
summary.spm
residuals.spm
fitted.spm
library(SemiPar)
data(fossil)
attach(fossil)
fit <- spm(strontium.ratio~f(age))
plot(fossil,type="n")
lines(fit)
points(fossil)
# Now do several customisations
op <- par(bg="white")
par(bg="honeydew")
plot(fossil,type="n")
lines(fit,col="green",lwd=5,shade.col="mediumpurple1")
points(fossil,col="orange",pch=16)
par(op)
The milan.mort data frame has data on 3652 consecutive days (10 consecutive years: 1st January, 1980 to 30th December, 1989) for the city of Milan, Italy.
data(milan.mort)
This data frame contains the following columns:
number of days since 31st December, 1979
1=Monday,2=Tuesday,3=Wednesday,4=Thursday, 5=Friday,6=Saturday,7=Sunday.
indicator of public holiday: 1=public holiday, 0=otherwise.
mean daily temperature in degrees Celsius.
relative humidity.
total number of deaths.
total number of respiratory deaths.
measure of sulphur dioxide level in ambient air.
total suspended particles in ambient air.
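The day counter and the weekday coding described above can be tied back to calendar dates. The following is a minimal sketch in base R; the vector `day.num` is generated here as an assumption (it is not read from the data frame), so the block runs without the SemiPar package.

```r
# Reconstruct calendar dates from a counter defined as
# "number of days since 31st December, 1979".
# day.num is generated here (an assumption) rather than read from milan.mort.
day.num <- 1:3652
dates <- as.Date("1979-12-31") + day.num

# First and last calendar dates covered by the counter
print(range(dates))

# ISO weekday numbers follow the same coding as the data frame:
# 1 = Monday, ..., 7 = Sunday
iso.weekday <- as.integer(format(dates, "%u"))
print(head(data.frame(day.num, dates, iso.weekday)))
```

If the day counter in milan.mort follows the stated definition, applying the same arithmetic to that column should give the date of each record.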
Vigotti, M.A., Rossi, G., Bisanti, L., Zanobetti, A. and Schwartz, J. (1996). Short term effect of urban air pollution on respiratory health in Milan, Italy, 1980-1989. Journal of Epidemiology and Community Health, 50, S71-S75.
Ruppert, D., Wand, M.P. and Carroll, R.J. (2003)
Semiparametric Regression Cambridge University Press.
http://stat.tamu.edu/~carroll/semiregbook/
library(SemiPar)
data(milan.mort)
pairs(milan.mort,pch=".")
The monitor.mercury data frame has 22 observations from sampling locations around a solid waste incinerator in Warren County, New Jersey, USA.
data(monitor.mercury)
This data frame contains the following columns:
longitude of sampling location.
latitude of sampling location.
mercury concentration in dry sphagnum moss grown at the sampling location.
Opsomer, J.D., Agras, J., Carpi, A. and Rodrigues, G. (1995), An application of locally weighted regression to airborne mercury deposition around an incinerator site, Environmetrics, 6, 205-221.
Ruppert, D., Wand, M.P. and Carroll, R.J. (2003)
Semiparametric Regression Cambridge University Press.
http://stat.tamu.edu/~carroll/semiregbook/
library(SemiPar)
data(monitor.mercury)
pairs(monitor.mercury)
The onions data frame contains 84 sets of observations from an experiment involving the production of white Spanish onions in two South Australian locations.
data(onions)
This data frame contains the following columns:
dens: areal density of plants (plants per square metre).
yield: onion yield (grammes per plant).
location: indicator of location: 0=Purnong Landing, 1=Virginia.
Ratkowsky, D. A. (1983). Nonlinear Regression Modeling: A Unified Practical Approach. New York: Marcel Dekker.
Ruppert, D., Wand, M.P. and Carroll, R.J. (2003)
Semiparametric Regression Cambridge University Press.
http://stat.tamu.edu/~carroll/semiregbook/
library(SemiPar)
data(onions)
attach(onions)
points.cols <- c("red","blue")
plot(dens,yield,col=points.cols[location+1],pch=16)
legend(100,250,c("Purnong Landing","Virginia"),col=points.cols,pch=rep(16,2))
The pig.weights data frame has 9 repeated weight measures on 48 pigs.
data(pig.weights)
This data frame contains the following columns:
id.num: identification number of pig.
num.weeks: number of weeks since measurements commenced.
weight: bodyweight of pig "id.num" after "num.weeks" weeks.
Diggle, P.J., Heagerty, P., Liang, K.-Y. and Zeger, S.L. (2002). Analysis of Longitudinal Data, Second Edition, Oxford: Oxford University Press.
Ruppert, D., Wand, M.P. and Carroll, R.J. (2003)
Semiparametric Regression Cambridge University Press.
http://stat.tamu.edu/~carroll/semiregbook/
library(SemiPar)
data(pig.weights)
library(lattice)
xyplot(weight~num.weeks,data=pig.weights,groups=id.num,type="b")
Takes a fitted spm object produced by spm() and plots the component smooth functions that make it up, on the scale of the linear predictor.
## S3 method for class 'spm'
plot(x,...)
x | a fitted spm object as produced by spm(). |
... | other graphics parameters described in Appendix B of the SemiPar Users' Manual http://matt-wand.utsacademics.info/SPmanu.pdf |
Produces plots with each panel corresponding to a component of the semiparametric regression model.
The function generates plots.
M.P. Wand [email protected] (other contributors listed in SemiPar Users' Manual).
Ruppert, D., Wand, M.P. and Carroll, R.J. (2003)
Semiparametric Regression Cambridge University Press.
http://stat.tamu.edu/~carroll/semiregbook/
Ganguli, B. and Wand, M.P. (2005)
SemiPar 1.0 Users' Manual.
http://matt-wand.utsacademics.info/SPmanu.pdf
spm
lines.spm
predict.spm
summary.spm
library(SemiPar)
data(fossil)
attach(fossil)
fit <- spm(strontium.ratio~f(age))
plot(fit)
# Now do several customisations
op <- par(bg="white")
par(bg="honeydew")
plot(fit,ylim=range(strontium.ratio),col="green",
     lwd=5,shade.col="mediumpurple1",rug.col="blue")
points(age,strontium.ratio,col="orange",pch=16)
par(op)
Takes a fitted spm object produced by spm() and obtains predictions at new data values.
## S3 method for class 'spm'
predict(object,newdata,se,...)
object | a fitted spm object as produced by spm(). |
newdata | a data frame containing the values of the predictors at which predictions are required. The columns should have the same names as the predictors. |
se | when this is TRUE, standard error estimates are returned for each prediction. The default is FALSE. |
... | other arguments. |
Takes a fitted spm object produced by spm() and obtains predictions at new data values as specified by the ‘newdata’ argument. If ‘se=TRUE’ then standard error estimates are also obtained.
If se=FALSE then a vector of predictions at ‘newdata’ is returned. If se=TRUE then a list with components named ‘fit’ and ‘se’ is returned. The ‘fit’ component contains the predictions. The ‘se’ component contains standard error estimates.
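The list returned with se=TRUE can be turned into approximate pointwise limits around each prediction. The sketch below uses base R only; the helper name `prediction.limits`, its `mult` argument and the numbers in `example.preds` are all hypothetical and merely stand in for the output of predict().

```r
# Hypothetical helper: convert a list with components 'fit' and 'se'
# (the shape returned when se=TRUE) into limits fit +/- mult * se.
prediction.limits <- function(preds, mult = 2) {
  stopifnot(is.list(preds), all(c("fit", "se") %in% names(preds)))
  data.frame(fit   = preds$fit,
             lower = preds$fit - mult * preds$se,
             upper = preds$fit + mult * preds$se)
}

# Made-up values, used only to show the shape of the result
example.preds <- list(fit = c(1, 2, 3), se = c(0.1, 0.2, 0.1))
limits <- prediction.limits(example.preds)
print(limits)
```

With a real fit, the same helper could be applied to the object returned by predict(fit,newdata=...,se=TRUE). The multiplier of 2 is a rough convention for about 95% pointwise coverage and is not something the package prescribes.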
M.P. Wand [email protected] (other contributors listed in SemiPar Users' Manual).
Ruppert, D., Wand, M.P. and Carroll, R.J. (2003)
Semiparametric Regression Cambridge University Press.
http://stat.tamu.edu/~carroll/semiregbook/
Ganguli, B. and Wand, M.P. (2005)
SemiPar 1.0 Users' Manual.
http://matt-wand.utsacademics.info/SPmanu.pdf
spm
lines.spm
plot.spm
summary.spm
library(SemiPar)
data(fossil)
attach(fossil)
fit <- spm(strontium.ratio~f(age))
newdata.age <- data.frame(age=c(90,100,110,120,130))
preds <- predict(fit,newdata=newdata.age,se=TRUE)
print(preds)
plot(fit,xlim=c(90,130))
points(unlist(newdata.age),preds$fit,col="red")
points(unlist(newdata.age),preds$fit+2*preds$se,col="blue")
points(unlist(newdata.age),preds$fit-2*preds$se,col="green")
Prints a brief description of a semiparametric regression fit object to the screen.
## S3 method for class 'spm'
print(x,...)
x | a fitted spm object as produced by spm(). |
... | other possible arguments. |
Prints a brief description of a semiparametric regression fit object to the screen.
The function prints to the screen.
M.P. Wand [email protected] (other contributors listed in SemiPar Users' Manual).
Ruppert, D., Wand, M.P. and Carroll, R.J. (2003)
Semiparametric Regression Cambridge University Press.
http://stat.tamu.edu/~carroll/semiregbook/
Ganguli, B. and Wand, M.P. (2005)
SemiPar 1.0 Users' Manual.
http://matt-wand.utsacademics.info/SPmanu.pdf
spm
plot.spm
lines.spm
predict.spm
summary.spm
residuals.spm
fitted.spm
library(SemiPar)
data(fossil)
attach(fossil)
fit <- spm(strontium.ratio~f(age))
print(fit)
The ragweed data frame has data on ragweed levels and meteorological variables for 335 days in Kalamazoo, Michigan, U.S.A.
data(ragweed)
This data frame contains the following columns:
ragweed level (grains per cubic metre).
one of 1991, 1992, 1993 or 1994.
day number in the current ragweed pollen season.
temperature of following day (degrees Fahrenheit).
indicator of significant rain the following day: 1=at least 3 hours of steady or brief but intense rain, 0=otherwise.
wind speed forecast for following day (knots).
Stark, P. C., Ryan, L. M., McDonald, J. L. and Burge, H. A. (1997). Using meteorologic data to model and predict daily ragweed pollen levels. Aerobiologia, 13, 177-184.
Ruppert, D., Wand, M.P. and Carroll, R.J. (2003)
Semiparametric Regression Cambridge University Press.
http://stat.tamu.edu/~carroll/semiregbook/
library(SemiPar)
data(ragweed)
pairs(ragweed,pch=".")
Extracts residuals from a semiparametric regression fit object.
## S3 method for class 'spm'
residuals(object,...)
object | a fitted spm object as produced by spm(). |
... | other possible arguments. |
Extracts residuals from a semiparametric regression fit object. The residuals are defined to be the difference between the response variable and the fitted values.
The vector of residuals.
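The definition above (response minus fitted values) can be checked numerically. This is a sketch that assumes the SemiPar package is installed, and it is skipped otherwise; the names `res.direct` and `res.manual` are introduced here for illustration only.

```r
# Compare residuals(fit) with the response minus fitted(fit).
# Runs only when the SemiPar package is available.
if (requireNamespace("SemiPar", quietly = TRUE)) {
  library(SemiPar)
  data(fossil)
  attach(fossil)
  fit <- spm(strontium.ratio ~ f(age))

  res.direct <- as.numeric(residuals(fit))
  res.manual <- as.numeric(strontium.ratio - fitted(fit))

  # Expected to agree up to numerical tolerance if the definition holds
  print(all.equal(res.direct, res.manual))

  detach(fossil)
}
```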
M.P. Wand [email protected] (other contributors listed in SemiPar Users' Manual).
Ruppert, D., Wand, M.P. and Carroll, R.J. (2003)
Semiparametric Regression Cambridge University Press.
http://stat.tamu.edu/~carroll/semiregbook/
Ganguli, B. and Wand, M.P. (2005)
SemiPar 1.0 Users' Manual.
http://matt-wand.utsacademics.info/SPmanu.pdf
spm
plot.spm
lines.spm
predict.spm
summary.spm
fitted.spm
library(SemiPar)
data(fossil)
attach(fossil)
fit <- spm(strontium.ratio~f(age))
plot(age,residuals(fit))
abline(0,0)
The retire.plan data frame has data on "401(k)" retirement plans for employees of 92 firms managed by a company code-named Best Retirement Inc. (BRI).
data(retire.plan)
This data frame contains the following columns:
contribution to retirement plan at end of first year
1=client has group life or group health insurance policy, 0=otherwise.
employee turnover rate.
number of employees eligible to participate in 401(k) plans.
1=plan has immediate vesting of employer contributions, 0=otherwise.
1=plan has a fail-safe provision, 0=otherwise.
percentage of contributions matched by the employer.
average annual employee salary in dollars.
underwriter's estimate of end-of-year contributions in dollars.
1=plan was sold by a sales representative who has been specifically trained to deal exclusively with 401(k) plans (code-named Susan Shepard).
Bryant, P.G. and Smith, M.A. (1995). Practical data analysis: case studies in business statistics. Chicago: Irwin.
Ruppert, D., Wand, M.P. and Carroll, R.J. (2003)
Semiparametric Regression Cambridge University Press.
http://stat.tamu.edu/~carroll/semiregbook/
library(SemiPar)
data(retire.plan)
pairs(retire.plan)
The salinity data frame has 28 observations on hydrological measurements from Pamlico Sound, North Carolina, USA.
data(salinity)
This data frame contains the following columns:
salinity in Pamlico Sound.
salinity in Pamlico Sound during the previous six weeks.
trend=1 if the data is the first six-week period of the spring, and so forth. Used to detect possible effects of the seasonal warming trend.
discharge of fresh water from rivers into the sound.
Ruppert, D, and Carroll, R.J. (1980), Trimmed least squares estimation in the linear model, Journal of the American Statistical Association, 75, 828-838.
Ruppert, D., Wand, M.P. and Carroll, R.J. (2003)
Semiparametric Regression Cambridge University Press.
http://stat.tamu.edu/~carroll/semiregbook/
library(SemiPar)
data(salinity)
pairs(salinity)
The sausage data frame has data on 54 ‘hot dog’ sausages.
data(sausage)
This data frame contains the following columns:
type: type of meat.
calories: number of calories.
sodium: measure of sodium content.
Moore, D.S. and McCabe, G.P. (2003). Introduction to the Practice of Statistics, Fourth Edition, W.H. Freeman and Company.
Ruppert, D., Wand, M.P. and Carroll, R.J. (2003)
Semiparametric Regression Cambridge University Press.
http://stat.tamu.edu/~carroll/semiregbook/
library(SemiPar)
data(sausage)
attach(sausage)
points.cols <- c("red","blue","green")
plot(sodium,calories,col=points.cols[type],pch=16)
legend(200,180,c("beef","pork","poultry"),col=points.cols,pch=rep(16,3))
The scallop data frame has 148 triplets concerning scallop abundance, based on a 1990 survey cruise in the Atlantic continental shelf off Long Island, New York, U.S.A.
data(scallop)
This data frame contains the following columns:
degrees latitude (north of the Equator).
degrees longitude (west of Greenwich).
size of scallop catch at location specified by "latitude" and "longitude".
Ecker, M.D. and Heltshe, J.F. (1994). Geostatistical estimates of scallop abundance. In Case Studies in Biometry. Lange, N., Ryan, L., Billard, L., Brillinger, D., Conquest, L. and Greenhouse, J. (eds.) New York: John Wiley & Sons, 107-124.
Ruppert, D., Wand, M.P. and Carroll, R.J. (2003)
Semiparametric Regression Cambridge University Press.
http://stat.tamu.edu/~carroll/semiregbook/
library(SemiPar)
data(scallop)
pairs(scallop)
The sitka data frame contains measurements of log-size for 79 Sitka spruce trees grown in normal or ozone-enriched environments. Within each year, the data are organised in four blocks, corresponding to four controlled environment chambers. The first two chambers, containing 27 trees each, have an ozone-enriched atmosphere; the remaining two, containing 12 and 13 trees respectively, have a normal (control) atmosphere.
data(sitka)
This data frame contains the following columns:
identification number of tree.
time order ranking within each tree.
time in days since 1st January, 1988.
tree size measured on a logarithmic scale.
indicator ozone treatment: 0=control,1=ozone.
Diggle, P.J., Heagerty, P., Liang, K.-Y. and Zeger, S.L. (2002). Analysis of Longitudinal Data, Second Edition, Oxford: Oxford University Press.
Ruppert, D., Wand, M.P. and Carroll, R.J. (2003)
Semiparametric Regression Cambridge University Press.
http://stat.tamu.edu/~carroll/semiregbook/
library(SemiPar)
data(sitka)
attach(sitka)
library(lattice)
ozone.char <- rep("control",nrow(sitka))
ozone.char[ozone==1] <- "ozone"
xyplot(log.size~days|ozone.char,data=sitka,groups=id.num,type="b")
spm is used to fit semiparametric regression models using the mixed model representation of penalized splines (per Ruppert, Wand and Carroll, 2003).
spm(form,random=NULL,group=NULL,family="gaussian", spar.method="REML",omit.missing=NULL)
form | a formula describing the model to be fit. Note that an intercept is always included, whether given in the formula or not. |
random | "random=~1" specifies inclusion of a random intercept according to the groups specified by the "group" argument. |
group | a vector of labels for specifying groups. |
family | for specification of the type of likelihood model assumed in the fitting. May be "gaussian", "binomial" or "poisson". |
spar.method | method for automatic smoothing parameter selection. May be "REML" (restricted maximum likelihood) or "ML" (maximum likelihood). |
omit.missing | a logical value indicating whether fields with missing values are to be omitted. |
See the SemiPar Users' Manual for details and examples.
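The random, group and family arguments can be illustrated with two of the data sets documented here. This is a sketch under assumptions rather than a definitive recipe: it assumes SemiPar is installed (and is skipped otherwise), that the pig.weights columns are named weight, num.weeks and id.num, and that the bpd columns are named BPD and birthweight, as in the examples for those data sets. The object names `fit.pigs` and `fit.bpd` are illustrative.

```r
# Runs only when the SemiPar package is available.
if (requireNamespace("SemiPar", quietly = TRUE)) {
  library(SemiPar)

  # Random intercept for each pig: random=~1 with group labels id.num
  data(pig.weights)
  attach(pig.weights)
  fit.pigs <- spm(weight ~ f(num.weeks), random = ~1, group = id.num)
  print(class(fit.pigs))
  detach(pig.weights)

  # Binary response: smooth effect of birthweight on the logit scale
  data(bpd)
  attach(bpd)
  fit.bpd <- spm(BPD ~ f(birthweight), family = "binomial")
  print(class(fit.bpd))
  detach(bpd)
}
```

Both fits can then be passed to summary() and plot() in the same way as the Gaussian examples.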
A list object of class "spm" containing the fitted model. The components are:
fit | mimics fit object of lme() for family="gaussian" and glmmPQL() for family="binomial" or family="poisson". |
info | information about the inputs. |
aux | auxiliary information such as variability estimates. |
M.P. Wand [email protected] (other contributors listed in SemiPar Users' Manual).
Ruppert, D., Wand, M.P. and Carroll, R.J. (2003)
Semiparametric Regression Cambridge University Press.
http://stat.tamu.edu/~carroll/semiregbook/
Ganguli, B. and Wand, M.P. (2005)
SemiPar 1.0 Users' Manual.
http://matt-wand.utsacademics.info/SPmanu.pdf
gam (in package ‘mgcv’)
lme (in package ‘nlme’)
glmmPQL (in package ‘MASS’)
plot.spm
summary.spm
library(SemiPar)
data(fossil)
attach(fossil)
fit <- spm(strontium.ratio~f(age))
plot(fit)
summary(fit)
data(calif.air.poll)
attach(calif.air.poll)
fit <- spm(ozone.level ~ f(daggett.pressure.gradient)+
           f(inversion.base.height) +
           f(inversion.base.temp))
summary(fit)
par(mfrow=c(2,2))
plot(fit)
# The SemiPar User Manual contains several other examples
# and details of plotting parameters.
#
# The current version of the manual is posted on the web-site:
#
# http://matt-wand.utsacademics.info/SPmanu.pdf
Takes a fitted spm object produced by spm() and summarises the fit.
## S3 method for class 'spm'
summary(object,...)
object | a fitted spm object as produced by spm(). |
... | other arguments. |
Produces tables for the linear (parametric) and non-linear (nonparametric) components. The linear table provides coefficient estimates, standard errors and p-values. The non-linear table provides degrees of freedom values and other information.
The function generates summary tables.
M.P. Wand [email protected] (other contributors listed in SemiPar Users' Manual).
Ruppert, D., Wand, M.P. and Carroll, R.J. (2003)
Semiparametric Regression Cambridge University Press.
http://stat.tamu.edu/~carroll/semiregbook/
Ganguli, B. and Wand, M.P. (2005)
SemiPar 1.0 Users' Manual.
http://matt-wand.utsacademics.info/SPmanu.pdf
library(SemiPar)
data(onions)
attach(onions)
log.yield <- log(yield)
fit <- spm(log.yield~location+f(dens))
summary(fit)
The term.structure data frame has 117 observations on the prices of U.S. STRIPS (Separate Trading on Registered Interest and Principal of Securities) on December 31, 1995.
data(term.structure)
This data frame contains the following columns:
time.to.maturity: time in years between 31st December, 1995, and the date on which the STRIPS matures.
price: price of the STRIPS as a percent of par.
University of Houston Fixed Income Database.
Jarrow, R., Ruppert, D., and Yu, Y. (2004). Estimating the term structure of corporate debt with a semiparametric penalized spline model, Journal of the American Statistical Association, 99, 57-66.
Ruppert, D., Wand, M.P. and Carroll, R.J. (2003)
Semiparametric Regression Cambridge University Press.
http://stat.tamu.edu/~carroll/semiregbook/
library(SemiPar)
data(term.structure)
attach(term.structure)
plot(time.to.maturity,price)
The trade.union data frame has data on 534 U.S. workers.
data(trade.union)
This data frame contains the following columns:
number of years of education.
indicator of living in southern region of U.S.A.
gender indicator: 0=male,1=female.
number of years of work experience
indicator of trade union membership: 0=non-member, 1=member.
wages in dollars per hour.
age in years.
1=black, 2=Hispanic, 3=white.
1=management, 2=sales, 3=clerical, 4=service, 5=professional, 6=other.
0=other, 1=manufacturing, 2=construction.
indicator of being married: 0=unmarried, 1=married.
Berndt, E.R. (1991) The Practice of Econometrics. New York: Addison-Wesley.
Ruppert, D., Wand, M.P. and Carroll, R.J. (2003)
Semiparametric Regression Cambridge University Press.
http://stat.tamu.edu/~carroll/semiregbook/
library(SemiPar)
data(trade.union)
pairs(trade.union,pch=".")
The ustemp data frame has 56 observations on the temperature and location of 56 U.S. cities.
data(ustemp)
This data frame contains the following columns:
city: character string giving name of city and state (two-letter abbreviation).
min.temp: average minimum January temperature.
latitude: degrees latitude (north of Equator).
longitude: degrees longitude (west of Greenwich).
Peixoto, J.L. (1990). A property of well-formulated polynomial regression models. American Statistician, 44, 26-30.
Ruppert, D., Wand, M.P. and Carroll, R.J. (2003)
Semiparametric Regression Cambridge University Press.
http://stat.tamu.edu/~carroll/semiregbook/
library(SemiPar)
data(ustemp)
attach(ustemp)
grey.levs <- min.temp+20
col.vec <- paste("grey",as.character(grey.levs),sep="")
plot(-longitude,latitude,col=col.vec,pch=16,cex=3,xlim=c(-130,-60))
text(-longitude,latitude,as.character(city))