Title: | Mixed Model ANOVA and Statistics for Education |
---|---|
Description: | The main functions perform mixed models analysis by least squares or REML by adding the function r() to formulas of lm() and glm(). A collection of text-book statistics for higher education is also included, e.g. modifications of the functions lm(), glm() and associated summaries from the package 'stats'. |
Authors: | Kristian Hovde Liland [aut, cre], Solve Sæbø [ctb], R-Core [ctb] |
Maintainer: | Kristian Hovde Liland <[email protected]> |
License: | GPL (>= 2) |
Version: | 1.4.0 |
Built: | 2024-12-07 06:27:52 UTC |
Source: | CRAN |
Analysis of variance for regression models, summarizing all effects in one regression term.
anova_reg(lm.object)
lm.object |
an object of class lm. |
Returns a data.frame containing the analysis of variance.
Kristian Hovde Liland
anova_reg(lm(y~x, data=data.frame(y=1:4,x=rnorm(4))))
Wrapper for anova.lm in package stats that halts execution if unsupported input is detected.
## S3 method for class 'lmm' anova(object, ...)
object |
object fitted by lm in package mixlm. |
... |
possible additional arguments to underlying functions. |
Returns appropriate analysis of variance or halts if unsupported input is detected.
Kristian Hovde Liland
mixlm <- lm(y~x*r(z), data = data.frame(y = rnorm(8),
                                        x = factor(c(rep(1,4),rep(0,4))),
                                        z = factor(rep(c(1,0),4))))
anova(mixlm)
Replacement for Anova.lm in package car. This version adds support for random effects when needed.
## S3 method for class 'lmm' Anova(mod, ...)
mod |
object of class lmm fitted by lm in package mixlm. |
... |
do not use. |
Returns appropriate analysis of variance or halts if unsupported input is detected.
John Fox [email protected]. Extended by Kristian Hovde Liland.
Anova, print.AnovaMix, AnovaMix, lm
dataset <- data.frame(y = rnorm(8),
                      x = factor(c(rep(1,4),rep(0,4))),
                      z = factor(rep(c(1,0),4)))
mixlm <- lm(y~x*r(z), data = dataset)
Anova(mixlm, type="III")
Uses output from lm() in the mixlm package to compute the ANOVA table, variance components and errors.
AnovaMix(object, SStype)
object |
object fitted by lm (mixlm package) containing at least one random effect. |
SStype |
type of sums-of-squares (I/II/III) for Analysis of Variance. |
AnovaMix can either be invoked directly or through the Anova() function (with type III sums of squares).
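A minimal sketch of the direct route, with made-up data; the SStype value "III" is assumed to match the I/II/III choices listed above:
dataset <- data.frame(y = rnorm(12),
                      x = factor(rep(c("a","b","c"), each = 4)),
                      z = factor(rep(c(1,0), 6)))
mixmod <- lm(y ~ x*r(z), data = dataset)   # r() marks z as random
AnovaMix(mixmod, SStype = "III")           # same table as Anova(mixmod, type = "III")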
lm |
linear model fitted by lm in package mixlm. |
anova |
ANOVA table. |
err.terms |
error terms used for testing each effect. |
denom.df |
denominator degrees of freedom for each effect. |
restricted |
logical indicating whether a restricted model was assumed. |
exp.mean.sq |
expected mean squares. |
var.comps |
variance components. |
random.effects |
names of the random effects. |
ind.randoms |
indices of the random effects. |
formula.text |
text representation of the model formula. |
Only balanced models are fully supported.
Kristian Hovde Liland
mydata <- data.frame(y = rnorm(12),
                     x = factor(c(rep(2,4),rep(1,4),rep(0,4))),
                     z = factor(rep(c(1,0),6)))
mixlm <- lm(y~x*r(z), data = mydata)
Anova(mixlm, type="III")
Adaptation of existing methods based on AIC/BIC.
best.subsets(model, nbest = 5, nvmax, digits, force.in = "NULL")
model |
an object of class lm. |
nbest |
number of best models to report for each model size. |
nvmax |
maximum number of effects to include in the models. |
digits |
number of digits in the printed output. |
force.in |
effects to be included in all models. |
F-based version of the built-in subset selection method.
No return value; results are printed.
Kristian Hovde Liland
data <- data.frame(y = rnorm(8),
                   x = factor(c('a','a','a','a','b','b','b','b')),
                   z = factor(c('a','a','b','b','a','a','b','b')))
mod <- lm(y ~ x + z, data=data)
best.subsets(mod)
This function estimates the confidence interval for the grand mean of a balanced linear (mixed) model.
CIgrandMean(object, alpha = 0.05) ## S3 method for class 'CIgm' print(x, ...)
object |
An object fitted by lm in package mixlm. |
alpha |
A scalar significance level for the confidence interval. |
x |
An object returned from CIgrandMean. |
... |
Additional arguments (not used). |
This implementation is only valid for models containing no continuous effects and only balanced data.
CIgrandMean returns a vector of interval endpoints and center. print.CIgm has no return.
Kristian Hovde Liland
Suggestions are welcome.
set.seed(42)
dataset <- data.frame(y=rnorm(8),
                      x=factor(c(rep(1,4),rep(0,4))),
                      z=factor(rep(c(1,0),4)))
mixlm <- lm(y~x*r(z), data = dataset)
CIgrandMean(mixlm)
Computes the confusion matrix of a classification result.
confusion(true, predicted)
true |
true classes. |
predicted |
predicted classes. |
This is a pure print function.
true      <- c('a','a','b','b','c','c')
predicted <- c('a','c','b','b','a','c')
confusion(true, predicted)
Weighted contrast coding for linear models.
contr.weighted(x, base)
x |
a factor vector (the full factor is required, not only its levels). |
base |
the base (reference) level of the factor. |
Different from the contrasts made through the stats package functions, this contrast requires a full factor vector as input rather than its respective levels, as weights are computed from the frequencies of the factor levels.
A matrix with n rows and n-1 columns, where n is the number of factor levels.
contr.weighted cannot be used directly as a replacement for other contrasts by name, but must be used via contrast matrix computations.
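A minimal sketch (made-up data) of supplying the computed matrix explicitly, here through the contrasts list of stats::lm:
f <- factor(c(rep("A", 3), rep("B", 3), rep("C", 2)))   # unbalanced factor
y <- rnorm(8)
# Passing the name "contr.weighted" would fail because the full factor,
# not just its levels, is needed; pass the computed matrix instead.
fit <- stats::lm(y ~ f, contrasts = list(f = contr.weighted(f)))
coef(fit)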
Kristian Hovde Liland
Nieuwenhuis, R.; Grotenhuis, M.; Pelzer, B. Weighted Effect Coding for Observational Data with wec. R. J. 2017, 9, 477-485.
balanced   <- factor(c(rep("A", 3), rep("B", 3), rep("C", 3)))
unbalanced <- factor(c(rep("A", 3), rep("B", 3), rep("C", 2)))
# Weighted coding when applied to balanced data
contr.weighted(balanced)
# Weighted coding when applied to unbalanced data (default base level)
contr.weighted(unbalanced)
# Weighted coding when applied to unbalanced data (base level = "C")
contr.weighted(unbalanced, "C")
lm
Alternative notation of effect labels including levels.
effect.labels(t, data)
t |
Terms object. |
data |
Corresponding model.matrix. |
names |
Character vector of effect labels. |
Kristian Hovde Liland
Adaptation of existing methods based on AIC/BIC.
forward(model, alpha = 0.2, full = FALSE, force.in)
backward(model, alpha = 0.2, full = FALSE, hierarchy = TRUE, force.in)
stepWise(model, alpha.enter = 0.15, alpha.remove = 0.15, full = FALSE)
stepWiseBack(model, alpha.remove = 0.15, alpha.enter = 0.15, full = FALSE)
wideForward(formula, data, alpha = 0.2, force.in = NULL)
model |
an object of class lm. |
formula |
model formula (used by wideForward). |
data |
data frame holding the variables (used by wideForward). |
alpha |
significance level for including or excluding an effect. |
full |
logical indicating whether extended output should be printed for each step. |
force.in |
effects to be retained in all models. |
alpha.enter |
significance level for entering an effect (stepwise methods). |
alpha.remove |
significance level for removing an effect (stepwise methods). |
hierarchy |
logical indicating whether effect hierarchy should be respected when removing effects. |
F-based versions of the built-in stepwise methods.
The final linear model after selection is returned.
Kristian Hovde Liland
set.seed(0)
data <- data.frame(y = rnorm(8),
                   x = factor(c('a','a','a','a','b','b','b','b')),
                   z = factor(c('a','a','b','b','a','a','b','b')))
mod <- lm(y ~ x + z, data=data)
forward(mod)
backward(mod)
stepWise(mod)
stepWiseBack(mod)
# Forward selection for wide matrices (large number of predictors)
set.seed(0)
mydata <- data.frame(y = rnorm(6), X = matrix(rnorm(60),6,10))
fs <- wideForward(y ~ ., mydata)
print(fs)
Extracts all effects from a formula, even though inside functions or interactions.
fparse(f)
f |
a formula. |
Returns a character vector containing all effects.
Bjørn-Helge Mevik
f <- formula(y~x*r(z))
fparse(f)
glm is used to fit generalized linear models, specified by giving a symbolic description of the linear predictor and a description of the error distribution. The version of glm supplied by the package mixlm parses to glmer for mixed modelling.
glm(formula, family = gaussian, data, weights, subset, na.action,
    start = NULL, etastart, mustart, offset, control = list(...),
    model = TRUE, method = "glm.fit", x = FALSE, y = TRUE,
    contrasts = NULL, REML = TRUE, ...)
formula |
an object of class formula: a symbolic description of the model to be fitted. |
family |
a description of the error distribution and link function to be used in the model. This can be a character string naming a family function, a family function or the result of a call to a family function. (See family for details of family functions.) |
data |
an optional data frame, list or environment (or object coercible by as.data.frame to a data frame) containing the variables in the model. If not found in data, the variables are taken from environment(formula). |
weights |
an optional vector of ‘prior weights’ to be used in the fitting process. Should be NULL or a numeric vector. |
subset |
an optional vector specifying a subset of observations to be used in the fitting process. |
na.action |
a function which indicates what should happen when the data contain NAs. The default is set by the na.action setting of options. |
start |
starting values for the parameters in the linear predictor. |
etastart |
starting values for the linear predictor. |
mustart |
starting values for the vector of means. |
offset |
this can be used to specify an a priori known component to be included in the linear predictor during fitting. This should be NULL or a numeric vector of length equal to the number of cases. |
control |
a list of parameters for controlling the fitting process. For glm.fit this is passed to glm.control. |
model |
a logical value indicating whether model frame should be included as a component of the returned value. |
method |
the method to be used in fitting the model. The default method "glm.fit" uses iteratively reweighted least squares (IWLS). User-supplied fitting functions can be supplied either as a function or a character string naming a function, with a function which takes the same arguments as glm.fit. |
x , y |
logical values indicating whether the response vector and model matrix used in the fitting process should be returned as components of the returned value. |
contrasts |
an optional list. See the contrasts.arg of model.matrix.default. |
REML |
is used to invoke restricted maximum likelihood (TRUE) or maximum likelihood (FALSE) estimation instead of least squares. |
... |
for glm: arguments to be used to form the default control argument if it is not supplied directly. |
A typical predictor has the form response ~ terms where response is the (numeric) response vector and terms is a series of terms which specifies a linear predictor for response. For binomial and quasibinomial families the response can also be specified as a factor (when the first level denotes failure and all others success) or as a two-column matrix with the columns giving the numbers of successes and failures. A terms specification of the form first + second indicates all the terms in first together with all the terms in second with any duplicates removed.
A specification of the form first:second indicates the set of terms obtained by taking the interactions of all terms in first with all terms in second. The specification first*second indicates the cross of first and second. This is the same as first + second + first:second.
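A small illustration of the two-column response described above, with made-up counts:
succ <- c(10, 17, 23)                         # successes per dose group
fail <- c(20, 13, 7)                          # failures per dose group
dose <- c(1, 2, 3)
fit  <- glm(cbind(succ, fail) ~ dose, family = binomial)
summary(fit)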
The terms in the formula will be re-ordered so that main effects come first, followed by the interactions, all second-order, all third-order and so on: to avoid this pass a terms object as the formula.
Non-NULL weights can be used to indicate that different observations have different dispersions (with the values in weights being inversely proportional to the dispersions); or equivalently, when the elements of weights are positive integers w_i, that each response y_i is the mean of w_i unit-weight observations. For a binomial GLM prior weights are used to give the number of trials when the response is the proportion of successes: they would rarely be used for a Poisson GLM.
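Equivalently, the same hypothetical counts can be modelled as proportions with the number of trials passed through weights:
trials <- succ + fail
prop   <- succ / trials
fit2   <- glm(prop ~ dose, family = binomial, weights = trials)
coef(fit2)   # agrees with the two-column fit above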
glm.fit is the workhorse function: it is not normally called directly but can be more efficient where the response vector and design matrix have already been calculated.
If more than one of etastart, start and mustart is specified, the first in the list will be used. It is often advisable to supply starting values for a quasi family, and also for families with unusual links such as gaussian("log").
All of weights, subset, offset, etastart and mustart are evaluated in the same way as variables in formula, that is first in data and then in the environment of formula.
For the background to warning messages about ‘fitted probabilities numerically 0 or 1 occurred’ for binomial GLMs, see Venables & Ripley (2002, pp. 197–8).
glm returns an object of class inheriting from "glm" which inherits from the class "lm". See later in this section. If a non-standard method is used, the object will also inherit from the class (if any) returned by that function.
The function summary (i.e., summary.glm) can be used to obtain or print a summary of the results and the function anova (i.e., anova.glm) to produce an analysis of variance table.
The generic accessor functions coefficients, effects, fitted.values and residuals can be used to extract various useful features of the value returned by glm. weights extracts a vector of weights, one for each case in the fit (after subsetting and na.action).
An object of class "glm" is a list containing at least the following components:
coefficients |
a named vector of coefficients |
residuals |
the working residuals, that is the residuals
in the final iteration of the IWLS fit. Since cases with zero
weights are omitted, their working residuals are |
fitted.values |
the fitted mean values, obtained by transforming the linear predictors by the inverse of the link function. |
rank |
the numeric rank of the fitted linear model. |
family |
the |
linear.predictors |
the linear fit on link scale. |
deviance |
up to a constant, minus twice the maximized log-likelihood. Where sensible, the constant is chosen so that a saturated model has deviance zero. |
aic |
A version of Akaike's An Information Criterion,
minus twice the maximized log-likelihood plus twice the number of
parameters, computed by the |
null.deviance |
The deviance for the null model, comparable with
|
iter |
the number of iterations of IWLS used. |
weights |
the working weights, that is the weights in the final iteration of the IWLS fit. |
prior.weights |
the weights initially supplied, a vector of
|
df.residual |
the residual degrees of freedom. |
df.null |
the residual degrees of freedom for the null model. |
y |
if requested (the default) the |
x |
if requested, the model matrix. |
model |
if requested (the default), the model frame. |
converged |
logical. Was the IWLS algorithm judged to have converged? |
boundary |
logical. Is the fitted value on the boundary of the attainable values? |
call |
the matched call. |
formula |
the formula supplied. |
terms |
the |
data |
the |
offset |
the offset vector used. |
control |
the value of the |
method |
the name of the fitter function used, currently always
|
contrasts |
(where relevant) the contrasts used. |
xlevels |
(where relevant) a record of the levels of the factors used in fitting. |
na.action |
(where relevant) information returned by
|
In addition, non-empty fits will have components qr, R and effects relating to the final weighted linear fit.
Objects of class "glm" are normally of class c("glm", "lm"), that is inherit from class "lm", and well-designed methods for class "lm" will be applied to the weighted linear model at the final iteration of IWLS. However, care is needed, as extractor functions for class "glm" such as residuals and weights do not just pick out the component of the fit with the same name.
If a binomial glm model was specified by giving a two-column response, the weights returned by prior.weights are the total numbers of cases (factored by the supplied case weights) and the component y of the result is the proportion of successes.
The argument method serves two purposes. One is to allow the model frame to be recreated with no fitting. The other is to allow the default fitting function glm.fit to be replaced by a function which takes the same arguments and uses a different fitting algorithm. If glm.fit is supplied as a character string it is used to search for a function of that name, starting in the stats namespace.
The class of the object returned by the fitter (if any) will be prepended to the class returned by glm.
The original R implementation of glm was written by Simon Davies working for Ross Ihaka at the University of Auckland, but has since been extensively re-written by members of the R Core team. The design was inspired by the S function of the same name described in Hastie & Pregibon (1992). Mixed model additions by Kristian Hovde Liland.
Dobson, A. J. (1990) An Introduction to Generalized Linear Models. London: Chapman and Hall.
Hastie, T. J. and Pregibon, D. (1992) Generalized linear models. Chapter 6 of Statistical Models in S eds J. M. Chambers and T. J. Hastie, Wadsworth & Brooks/Cole.
McCullagh P. and Nelder, J. A. (1989) Generalized Linear Models. London: Chapman and Hall.
Venables, W. N. and Ripley, B. D. (2002) Modern Applied Statistics with S. New York: Springer.
anova.glm, summary.glm, etc. for glm methods, and the generic functions anova, summary, effects, fitted.values, and residuals.
lm for non-generalized linear models (which SAS calls GLMs, for ‘general’ linear models).
loglin and loglm (package MASS) for fitting log-linear models (which binomial and Poisson GLMs are) to contingency tables.
bigglm in package biglm for an alternative way to fit GLMs to large datasets (especially those with many cases).
esoph, infert and predict.glm have examples of fitting binomial glms.
## Dobson (1990) Page 93: Randomized Controlled Trial :
counts    <- c(18,17,15,20,10,20,25,13,12)
outcome   <- gl(3,1,9)
treatment <- gl(3,3)
print(d.AD <- data.frame(treatment, outcome, counts))
glm.D93 <- glm(counts ~ outcome + treatment, family=poisson())
anova(glm.D93)

# A Gamma example, from McCullagh & Nelder (1989, pp. 300-2)
clotting <- data.frame(
    u    = c(5,10,15,20,30,40,60,80,100),
    lot1 = c(118,58,42,35,27,25,21,19,18),
    lot2 = c(69,35,26,21,18,16,13,12,12))
summary(glm(lot1 ~ log(u), data=clotting, family=Gamma))
summary(glm(lot2 ~ log(u), data=clotting, family=Gamma))

# Mixed model example
dataset <- data.frame(y=rnorm(8),
                      x=factor(c(rep(1,4),rep(0,4))),
                      z=factor(rep(c(1,0),4)))
if(require(lme4)){
  GLM <- glm(y ~ x+r(z), family=gaussian(identity), data=dataset)
  summary(GLM)
  logLik(GLM)
  Anova(GLM, type=3)
}

## Not run:
## for an example of the use of a terms object as a formula
demo(glm.vr)
## End(Not run)
This function extracts terms from a linear model object and creates a Hasse diagram of the terms. The function is useful for visualizing the structure of a linear model. If the model contains random effects, these are placed in parentheses in the diagram. Manually placed terms are supported to some extent, see example for usage.
hasseMod(object, manualTerms=NULL, manualParents=NULL, meanName="M", errorName="(E)")
object |
A linear model object, e.g., fitted by lm. |
manualTerms |
A character vector of model terms to be placed manually in the diagram. |
manualParents |
A list of character vectors giving the parent terms of the manually placed terms. |
meanName |
The name of the mean term (default = "M"). |
errorName |
The name of the error term (default = "(E)"). |
A list with the levels of the diagram and the adjacency matrix.
Kristian Hovde Liland
# Random data
dat <- data.frame(A = factor(rep(c(1,2),32)),
                  B = factor(rep(c(1,1,2,2),16)),
                  C = factor(rep(c(1,1,1,1,2,2,2,2),8)),
                  Ind = factor(rep(c(1,1,1,1,2,2,2,2),8)),
                  D = factor(rep(c(rep(1,8),rep(2,8)),4)),
                  E = factor(rep(c(rep(1,16),rep(2,16)),2)))
dat$y = rnorm(64)

# Linear model with interactions and nested factors
mod <- lm(y~A*B*C + D + E%in%D, data=dat)
(an <- Anova(mod, type="II"))
H <- hasseMod(mod)
## Not run:
# Requires installation of Rgraphviz
library(Rgraphviz)
hasse(H$hasse, parameters=list(cluster = FALSE, arrows = "none", edgeColor = "darkred"))
## End(Not run)

# Linear model with repeated measures where Ind is nested in A
modv <- lm(y~A*r(B) + r(Ind), data=dat)
(anv <- Anova(modv, type="II"))
Hv <- hasseMod(modv, manualTerms=c("Ind"), manualParents=list(c("A")))
## Not run:
# Requires installation of Rgraphviz
hasse(Hv$hasse, parameters=list(cluster = FALSE, arrows = "none", edgeColor = "darkred"))
## End(Not run)
Checks if models have balanced data.
is.balanced(object)
object |
fitted model that includes factors. |
Returns TRUE if balanced, FALSE if not.
Kristian Hovde Liland
mixlm <- lm(y~x*r(z), data = data.frame(y = rnorm(8),
                                        x = factor(c(rep(1,4),rep(0,4))),
                                        z = factor(rep(c(1,0),4))))
is.balanced(mixlm)
lm is used to fit linear models. It can be used to carry out regression, single stratum analysis of variance and analysis of covariance (although aov may provide a more convenient interface for these).
The version distributed through the package mixlm extends the capabilities with balanced mixed models and lmer interfacing. Random effects are indicated by wrapping their formula entries in r(). Also, effect level names are kept in printing.
lm(formula, data, subset, weights, na.action, method = "qr",
   model = TRUE, x = TRUE, y = TRUE, qr = TRUE, singular.ok = TRUE,
   contrasts = "contr.sum", offset, unrestricted = TRUE, REML = NULL, ...)
formula |
an object of class formula: a symbolic description of the model to be fitted. |
data |
an optional data frame, list or environment (or object coercible by as.data.frame to a data frame) containing the variables in the model. If not found in data, the variables are taken from environment(formula). |
subset |
an optional vector specifying a subset of observations to be used in the fitting process. |
weights |
an optional vector of weights to be used in the fitting process. Should be NULL or a numeric vector. |
na.action |
a function which indicates what should happen when the data contain NAs. The default is set by the na.action setting of options. |
method |
the method to be used; for fitting, currently only method = "qr" is supported. |
model , x , y , qr |
logicals. If TRUE the corresponding components of the fit (the model frame, the model matrix, the response, the QR decomposition) are returned. |
singular.ok |
logical. If FALSE (the default in S but not in R) a singular fit is an error. |
contrasts |
character indicating which coding should be applied to all factors in the model. |
offset |
this can be used to specify an a priori known component to be included in the linear predictor during fitting. This should be NULL or a numeric vector of length equal to the number of cases. |
unrestricted |
additional argument for switching between unrestricted and restricted models if including random variables. |
REML |
is used to invoke restricted maximum likelihood (TRUE) or maximum likelihood (FALSE) estimation instead of least squares. |
... |
additional arguments to be passed to the low level regression fitting functions (see below). |
Models for lm are specified symbolically. A typical model has the form response ~ terms where response is the (numeric) response vector and terms is a series of terms which specifies a linear predictor for response. A terms specification of the form first + second indicates all the terms in first together with all the terms in second with duplicates removed. A specification of the form first:second indicates the set of terms obtained by taking the interactions of all terms in first with all terms in second. The specification first*second indicates the cross of first and second. This is the same as first + second + first:second.
If the formula includes an offset, this is evaluated and subtracted from the response.
If response is a matrix a linear model is fitted separately by least-squares to each column of the matrix.
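A brief sketch of the matrix-response case with simulated data; each column of the response gets its own least-squares fit:
set.seed(1)
dat <- data.frame(x1 = rnorm(10), x2 = rnorm(10))
Y   <- cbind(y1 = rnorm(10), y2 = rnorm(10))
mfit <- lm(Y ~ x1 + x2, data = dat)   # class c("mlm", "lm")
coef(mfit)                            # one coefficient column per response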
See model.matrix for some further details. The terms in the formula will be re-ordered so that main effects come first, followed by the interactions, all second-order, all third-order and so on: to avoid this pass a terms object as the formula (see aov and demo(glm.vr) for an example).
A formula has an implied intercept term. To remove this use either y ~ x - 1 or y ~ 0 + x. See formula for more details of allowed formulae.
Non-NULL weights can be used to indicate that different observations have different variances (with the values in weights being inversely proportional to the variances); or equivalently, when the elements of weights are positive integers w_i, that each response y_i is the mean of w_i unit-weight observations (including the case that there are w_i observations equal to y_i and the data have been summarized).
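For instance, a weighted fit where each row is the mean of a known number of unit-weight observations could look like this (illustrative numbers):
d <- data.frame(x    = 1:6,
                ybar = c(2.1, 2.9, 4.2, 5.1, 5.8, 7.2),  # group means
                n    = c(3, 5, 2, 4, 6, 3))              # observations behind each mean
wfit <- lm(ybar ~ x, data = d, weights = n)
summary(wfit)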
lm calls the lower level functions lm.fit, etc, see below, for the actual numerical computations. For programming only, you may consider doing likewise.
All of weights, subset and offset are evaluated in the same way as variables in formula, that is first in data and then in the environment of formula.
The contrasts argument is applied when the lm model is fitted. In addition to the standard contrasts from the stats package, one can choose weighted coding: contr.weighted, which balances unbalanced data through factor weighting. Different from the original version of lm, a single contrast can be indicated to automatically be applied to all factors.
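A sketch of switching every factor in a model to weighted coding through the contrasts argument (random data):
dat <- data.frame(y = rnorm(9),
                  a = factor(rep(c("p","q","r"), each = 3)),
                  b = factor(rep(c(1,2,1), 3)))
fit.sum <- lm(y ~ a + b, data = dat)                               # default contr.sum coding
fit.wgt <- lm(y ~ a + b, data = dat, contrasts = "contr.weighted") # weighted coding for all factors
coef(fit.wgt)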
lm returns an object of class c("lmm","lm") or, for multiple responses, of class c("mlm", "lm").
The functions summary and anova are used to obtain and print a summary and analysis of variance table of the results. The generic accessor functions coefficients, effects, fitted.values and residuals extract various useful features of the value returned by lm.
An object of class "lm" is a list containing at least the following components:
coefficients |
a named vector of coefficients |
residuals |
the residuals, that is response minus fitted values. |
fitted.values |
the fitted mean values. |
rank |
the numeric rank of the fitted linear model. |
weights |
(only for weighted fits) the specified weights. |
df.residual |
the residual degrees of freedom. |
call |
the matched call. |
terms |
the |
contrasts |
(only where relevant) the contrasts used. |
xlevels |
(only where relevant) a record of the levels of the factors used in fitting. |
offset |
the offset used (missing if none were used). |
y |
if requested, the response used. |
x |
if requested, the model matrix used. |
model |
if requested (the default), the model frame used. |
na.action |
(where relevant) information returned by
|
In addition, non-null fits will have components assign, effects and (unless not requested) qr relating to the linear fit, for use by extractor functions such as summary and effects. Models containing random effects will additionally have a component random with additional information about the model.
Considerable care is needed when using lm with time series. Unless na.action = NULL, the time series attributes are stripped from the variables before the regression is done. (This is necessary as omitting NAs would invalidate the time series attributes, and if NAs are omitted in the middle of the series the result would no longer be a regular time series.)
Even if the time series attributes are retained, they are not used to line up series, so that the time shift of a lagged or differenced regressor would be ignored. It is good practice to prepare a data argument by ts.intersect(..., dframe = TRUE), then apply a suitable na.action to that data frame and call lm with na.action = NULL so that residuals and fitted values are time series.
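A sketch of that recipe, using the built-in UKDriverDeaths series purely as an arbitrary example:
y   <- UKDriverDeaths
tsd <- ts.intersect(y = y, ylag = lag(y, -12), dframe = TRUE)  # line up the lagged regressor
tsfit <- lm(y ~ ylag, data = tsd, na.action = NULL)
plot(residuals(tsfit))                                         # residuals keep their time ordering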
Offsets specified by offset
will not be included in predictions
by predict.lm
, whereas those specified by an offset term
in the formula will be.
The design was inspired by the S function of the same name described in Chambers (1992). The implementation of model formula by Ross Ihaka was based on Wilkinson & Rogers (1973). Mixed model extensions by Kristian Hovde Liland.
Chambers, J. M. (1992) Linear models. Chapter 4 of Statistical Models in S eds J. M. Chambers and T. J. Hastie, Wadsworth & Brooks/Cole.
Wilkinson, G. N. and Rogers, C. E. (1973) Symbolic descriptions of factorial models for analysis of variance. Applied Statistics, 22, 392–9.
summary.lmm for summaries and anova.lmm for the ANOVA table; aov for a different interface.
The generic functions coef, effects, residuals, fitted, vcov.
predict.lm (via predict) for prediction, including confidence and prediction intervals; confint for confidence intervals of parameters.
lm.influence for regression diagnostics, and glm for generalized linear models.
The underlying low level functions, lm.fit for plain, and lm.wfit for weighted regression fitting.
More lm() examples are available e.g., in anscombe, attitude, freeny, LifeCycleSavings, longley, stackloss, swiss.
biglm in package biglm for an alternative way to fit linear models to large datasets (especially those with many cases).
print.AnovaMix, AnovaMix, Anova
require(graphics)
## Annette Dobson (1990) "An Introduction to Generalized Linear Models".
## Page 9: Plant Weight Data.
ctl <- c(4.17,5.58,5.18,6.11,4.50,4.61,5.17,4.53,5.33,5.14)
trt <- c(4.81,4.17,4.41,3.59,5.87,3.83,6.03,4.89,4.32,4.69)
group <- gl(2,10,20, labels=c("Ctl","Trt"))
weight <- c(ctl, trt)
lm.D9 <- lm(weight ~ group)
lm.D90 <- lm(weight ~ group - 1) # omitting intercept
anova(lm.D9)
summary(lm.D90)
opar <- par(mfrow = c(2,2), oma = c(0, 0, 1.1, 0))
plot(lm.D9, las = 1)             # Residuals, Fitted, ...
par(opar)

# Linear mixed model
dataset <- data.frame(y=rnorm(8),
                      x=factor(c(rep(1,4),rep(0,4))),
                      z=factor(rep(c(1,0),4)))
mixlm <- lm(y~x*r(z), data = dataset)
Anova(mixlm, type="III")
### less simple examples in "See Also" above
Plot summary of relevant component analysis.
plotprops(Y, X, doscaleX = FALSE, docenterX = TRUE, ncomp, subset)
Y |
Response matrix. |
X |
Predictor matrix. |
doscaleX |
Standardize predictors. |
docenterX |
Center predictors. |
ncomp |
Number of singular values and eigenvalues to extract. |
subset |
Subset of predictor and response. |
Only plotting.
Solve Sæbø
Helland, I.S. & T. Almøy (1994) Comparison of prediction methods when only a few components are relevant. JASA 89, 583-591.
X <- matrix(rnorm(100),20,5)
Y <- matrix(rnorm(20),20,1)
plotprops(Y, X, doscaleX = FALSE, docenterX = TRUE, 5)
Various summaries of predictions and PRESS residuals.
R2pred(object = NULL)
RMSEP(object)
rmsep(object)
PRESS(object = NULL)
PRESS.res(object = NULL, ncomp = NULL)
PRESS.pred(object = NULL, ncomp = NULL)
object |
a fitted model, e.g., of type lm. |
ncomp |
number of components to use with models having multiple components (e.g., from the pls package). |
Predictions are extracted and summaries/residuals are computed.
Returns either an object of summaries or residuals.
Kristian Hovde Liland
data <- data.frame(y = rnorm(8),
                   x = factor(c('a','a','a','a','b','b','b','b')),
                   z = factor(c('a','a','b','b','a','a','b','b')))
mod <- lm(y ~ x + z, data=data)
RMSEP(mod)
rmsep(mod) # Alias to distinguish it from pls::RMSEP
R2pred(mod)
PRESS(mod)
PRESS.res(mod)
PRESS.pred(mod)
Prints relevant information like the ANOVA table, variance components and errors.
## S3 method for class 'AnovaMix' print(x, ...)
x |
an AnovaMix object. |
... |
Additional arguments (not supported yet). |
Only balanced models are fully supported.
Kristian Hovde Liland
mixlm <- lm(y~x*r(z), data = data.frame(y = rnorm(8),
                                        x = factor(c(rep(1,4),rep(0,4))),
                                        z = factor(rep(c(1,0),4))))
Anova(mixlm, type="III")
summary method for class "lmm".
## S3 method for class 'summary.lmm'
print(x, digits = max(3, getOption("digits") - 3),
      symbolic.cor = x$symbolic.cor,
      signif.stars = getOption("show.signif.stars"), ...)
x |
an object of class summary.lmm, usually a result of a call to summary.lmm. |
digits |
the number of significant digits to use when printing. |
symbolic.cor |
logical. If TRUE, print the correlations in a symbolic form rather than as numbers. |
signif.stars |
logical. If TRUE, ‘significance stars’ are printed for each coefficient. |
... |
further arguments passed to or from other methods. |
This adaptation of print.summary.lm from package stats slightly alters the output to better conform with text-book notation.
print.summary.lm tries to be smart about formatting the coefficients, standard errors, etc. and additionally gives ‘significance stars’ if signif.stars is TRUE.
Correlations are printed to two decimal places (or symbolically): to see the actual correlations print summary(object)$correlation directly.
The function summary.lm computes and returns a list of summary statistics of the fitted linear model given in object, using the components (list elements) "call" and "terms" from its argument, plus
residuals |
the weighted residuals, the usual residuals
rescaled by the square root of the weights specified in the call to
|
coefficients |
a |
aliased |
named logical vector showing if the original coefficients are aliased. |
sigma |
the square root of the estimated variance of the random error
where |
df |
degrees of freedom, a 3-vector |
fstatistic |
(for models including non-intercept terms) a 3-vector with the value of the F-statistic with its numerator and denominator degrees of freedom. |
r.squared |
where |
adj.r.squared |
the above |
cov.unscaled |
a |
correlation |
the correlation matrix corresponding to the above
|
symbolic.cor |
(only if |
na.action |
from |
The model fitting function lm, summary.
Function coef will extract the matrix of coefficients with standard errors, t-statistics and p-values.
ctl <- c(4.17,5.58,5.18,6.11,4.50,4.61,5.17,4.53,5.33,5.14)
trt <- c(4.81,4.17,4.41,3.59,5.87,3.83,6.03,4.89,4.32,4.69)
group <- gl(2,10,20, labels=c("Ctl","Trt"))
weight <- c(ctl, trt)
sld90 <- summary(lm.D90 <- lm(weight ~ group -1)) # omitting intercept
sld90
This adaptation of prop.test from package stats strips the test down to a text-book version.
prop.test can be used for testing the null that the proportions (probabilities of success) in several groups are the same, or that they equal certain given values.
prop.test.ordinary(x, n, p = NULL,
                   alternative = c("two.sided", "less", "greater"),
                   conf.level = 0.95, correct = TRUE, pooled = TRUE)
x |
a vector of counts of successes, a one-dimensional table with two entries, or a two-dimensional table (or matrix) with 2 columns, giving the counts of successes and failures, respectively. |
n |
a vector of counts of trials; ignored if x is a matrix or a table. |
p |
a vector of probabilities of success. The length of p must be the same as the number of groups specified by x, and its elements must be greater than 0 and less than 1. |
alternative |
a character string specifying the alternative hypothesis, must be one of "two.sided" (default), "greater" or "less". Only used for testing the null that a single proportion equals a given value, or that two proportions are equal; ignored otherwise. |
conf.level |
confidence level of the returned confidence interval. Must be a single number between 0 and 1. Only used when testing the null that a single proportion equals a given value, or that two proportions are equal; ignored otherwise. |
correct |
a logical indicating whether Yates' continuity correction should be applied where possible. |
pooled |
a logical indicating whether a pooled standard deviation should be used. |
Only groups with finite numbers of successes and failures are used. Counts of successes and failures must be nonnegative and hence not greater than the corresponding numbers of trials which must be positive. All finite counts should be integers.
If p is NULL and there is more than one group, the null tested is that the proportions in each group are the same. If there are two groups, the alternatives are that the probability of success in the first group is less than, not equal to, or greater than the probability of success in the second group, as specified by alternative. A confidence interval for the difference of proportions with confidence level as specified by conf.level and clipped to [-1, 1] is returned. Continuity correction is used only if it does not exceed the difference of the sample proportions in absolute value. Otherwise, if there are more than 2 groups, the alternative is always "two.sided", the returned confidence interval is NULL, and continuity correction is never used.
If there is only one group, then the null tested is that the underlying probability of success is p, or .5 if p is not given. The alternative is that the probability of success is less than, not equal to, or greater than p or 0.5, respectively, as specified by alternative. A confidence interval for the underlying proportion with confidence level as specified by conf.level and clipped to [0, 1] is returned. Continuity correction is used only if it does not exceed the difference between sample and null proportions in absolute value. The confidence interval is computed by inverting the score test.
Finally, if p is given and there are more than 2 groups, the null tested is that the underlying probabilities of success are those given by p. The alternative is always "two.sided", the returned confidence interval is NULL, and continuity correction is never used.
A list with class "htest" containing the following components:
statistic |
the value of Pearson's chi-squared test statistic. |
parameter |
the degrees of freedom of the approximate chi-squared distribution of the test statistic. |
p.value |
the p-value of the test. |
estimate |
a vector with the sample proportions x/n. |
conf.int |
a confidence interval for the true proportion if there is one group, or for the difference in proportions if there are 2 groups and p is not given, or NULL otherwise. |
null.value |
the value of p if specified by the null hypothesis, or NULL otherwise. |
alternative |
a character string describing the alternative. |
method |
a character string indicating the method used, and whether Yates' continuity correction was applied. |
data.name |
a character string giving the names of the data. |
Wilson, E.B. (1927) Probable inference, the law of succession, and statistical inference. J. Am. Stat. Assoc., 22, 209–212.
Newcombe R.G. (1998) Two-Sided Confidence Intervals for the Single Proportion: Comparison of Seven Methods. Statistics in Medicine 17, 857–872.
Newcombe R.G. (1998) Interval Estimation for the Difference Between Independent Proportions: Comparison of Eleven Methods. Statistics in Medicine 17, 873–890.
binom.test for an exact test of a binomial hypothesis.
heads <- rbinom(1, size=100, prob = .5)
prop.test(heads, 100)                   # continuity correction TRUE by default
prop.test(heads, 100, correct = FALSE)

## Data from Fleiss (1981), p. 139.
## H0: The null hypothesis is that the four populations from which
##     the patients were drawn have the same true proportion of smokers.
## A:  The alternative is that this proportion is different in at
##     least one of the populations.
smokers  <- c( 83, 90, 129, 70 )
patients <- c( 86, 93, 136, 82 )
prop.test.ordinary(smokers, patients)
Removes function r() from formulas. Can also convert to lmer formula.
rparse(f, REML = FALSE)
f |
formula to be stripped of r(). |
REML |
logical indicating if lmer conversion should be done. |
Formula without r(), possibly converted to lmer mixed model format.
Kristian Hovde Liland
f <- formula(y~x*r(z))
rparse(f)
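Building on the example above, the REML switch converts the random term to lmer mixed-model notation (a sketch; the exact returned formula depends on the model):
rparse(f, REML = TRUE)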
Extension of glht from the multcomp package to handle Fisher family-wise error and Bonferroni testing. Create a set of confidence intervals on the differences between the means of the levels of a factor with the specified family-wise probability of coverage. The intervals are based on the Studentized range statistic, Tukey's ‘Honest Significant Difference’ method, Fisher's family-wise error, or Bonferroni testing.
simple.glht(mod, effect, corr = c("Tukey","Bonferroni","Fisher"), level = 0.95, df = NULL, ...)
mod |
A fitted model object, e.g., returned by lm. |
effect |
A character vector giving the term of the fitted model for which the intervals should be calculated. This can also be an interaction. |
corr |
A character vector giving the multiple testing correction
method. Defaults to |
level |
A numeric value between zero and one giving the family-wise confidence level to use. |
df |
User supplied number of degrees of freedom. If not supplied or NULL, the default is to extract these from the model. |
... |
Optional additional arguments. None are used at present. |
When comparing the means for the levels of a factor in an analysis of variance, a simple comparison using t-tests will inflate the probability of declaring a significant difference when it is not in fact present. This is because the intervals are calculated with a given coverage probability for each interval but the interpretation of the coverage is usually with respect to the entire family of intervals.
John Tukey introduced intervals based on the range of the sample means rather than the individual differences. The intervals returned by this function are based on this Studentized range statistics.
The intervals constructed in this way would only apply exactly to balanced designs where there are the same number of observations made at each level of the factor. This function incorporates an adjustment for sample size that produces sensible intervals for mildly unbalanced designs.
If effect specifies non-factor terms these will be dropped with a warning; if no terms are left this is an error.
An object of classes "simple.glht", "summary.glht" and "glht" containing information to produce confidence intervals, tests and plotting.
There are print, plot and cld methods for class "simple.glht".
The plot method does not accept xlab, ylab or main arguments and creates its own values for each plot.
Douglas Bates, extended to mixed effect models by Kristian Hovde Liland.
Miller, R. G. (1981) Simultaneous Statistical Inference. Springer.
Yandell, B. S. (1997) Practical Data Analysis for Designed Experiments. Chapman & Hall.
aov, qtukey, model.tables, glht in package multcomp.
require(graphics)
summary(fm1 <- lm(breaks ~ wool + tension, data = warpbreaks))
simple.glht(fm1, "tension")
plot(simple.glht(fm1, "tension"))
cld(simple.glht(fm1, "tension"))
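Hypothetical follow-up calls sketching the other correction methods and an interaction effect; the colon notation for the interaction is an assumption based on how the term appears in the ANOVA table:
simple.glht(fm1, "tension", corr = "Bonferroni")   # Bonferroni correction
simple.glht(fm1, "tension", corr = "Fisher")       # Fisher family-wise error
fm2 <- lm(breaks ~ wool * tension, data = warpbreaks)
simple.glht(fm2, "wool:tension")                   # comparisons within an interaction (assumed naming)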
Standardized Pearson residuals.
spearson(object)
object |
fitted model. |
Takes ordinary Pearson residuals and standardizes them.
Returns the residuals.
Kristian Hovde Liland
data <- data.frame(y = rnorm(8),
                   x = factor(c('a','a','a','a','b','b','b','b')),
                   z = factor(c('a','a','b','b','a','a','b','b')))
mod <- lm(y ~ x + z, data=data)
spearson(mod)
Adaptations of base R's t.test to better conform to text-book standards. t_test_sum and z_test_sum take summarized data as input.
t_test(x, ...)
z_test(x, ...)

## Default S3 method:
t_test(x, y = NULL, alternative = c("two.sided", "less", "greater"),
       mu = 0, paired = FALSE, var.equal = FALSE, conf.level = 0.95, ...)

## Default S3 method:
z_test(x, y = NULL, alternative = c("two.sided", "less", "greater"),
       mu = 0, paired = FALSE, var.equal = FALSE, conf.level = 0.95,
       sds = NULL, ...)

## S3 method for class 'formula'
t_test(formula, data, subset, na.action, ...)

## S3 method for class 'formula'
z_test(formula, data, subset, na.action, ...)

## Functions for summarized data:
t_test_sum(means, sds, ns, alternative = c("two.sided", "less", "greater"),
           mu = 0, var.equal = FALSE, conf.level = 0.95, z.test = FALSE, ...)
z_test_sum(means, sds, ns, alternative = c("two.sided", "less", "greater"),
           mu = 0, var.equal = FALSE, conf.level = 0.95, z.test = TRUE, ...)
x |
a (non-empty) numeric vector of data values. |
y |
an optional (non-empty) numeric vector of data values. |
alternative |
a character string specifying the alternative hypothesis, must be one of "two.sided" (default), "greater" or "less". You can specify just the initial letter. |
mu |
a number indicating the true value of the mean (or difference in means if you are performing a two sample test). |
paired |
a logical indicating whether you want a paired t-test. |
var.equal |
a logical variable indicating whether to treat the two variances as being equal. If TRUE then the pooled variance is used to estimate the variance, otherwise the Welch (or Satterthwaite) approximation to the degrees of freedom is used. |
conf.level |
confidence level of the interval. |
formula |
a formula of the form lhs ~ rhs where lhs is a numeric variable giving the data values and rhs a factor with two levels giving the corresponding groups. |
data |
an optional matrix or data frame (or similar: see model.frame) containing the variables in the formula. By default the variables are taken from environment(formula). |
subset |
an optional vector specifying a subset of observations to be used. |
na.action |
a function which indicates what should happen when the data contain NAs. Defaults to getOption("na.action"). |
means |
means of groups. |
sds |
standard deviations of groups. |
ns |
number of objects in groups. |
z.test |
normal approximation. |
... |
further arguments to be passed to or from methods. |
The formula interface is only applicable for the 2-sample tests.
alternative = "greater"
is the alternative that x
has a
larger mean than y
.
If paired
is TRUE
then both x
and y
must
be specified and they must be the same length. Missing values are
silently removed (in pairs if paired
is TRUE
). If
var.equal
is TRUE
then the pooled estimate of the
variance is used. By default, if var.equal
is FALSE
then the variance is estimated separately for both groups and the
Welch modification to the degrees of freedom is used.
If the input data are effectively constant (compared to the larger of the two means) an error is generated.
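A sketch of the summarized-data interface with invented summary statistics:
# Two groups summarized by mean, standard deviation and sample size
t_test_sum(means = c(5.2, 4.6), sds = c(1.1, 0.9), ns = c(12, 15),
           alternative = "two.sided", var.equal = FALSE)
z_test_sum(means = c(5.2, 4.6), sds = c(1.1, 0.9), ns = c(12, 15))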
A list with class "htest"
containing the following components:
statistic |
the value of the t-statistic. |
parameter |
the degrees of freedom for the t-statistic. |
p.value |
the p-value for the test. |
conf.int |
a confidence interval for the mean appropriate to the specified alternative hypothesis. |
estimate |
the estimated mean or difference in means depending on whether it was a one-sample test or a two-sample test. |
null.value |
the specified hypothesized value of the mean or mean difference depending on whether it was a one-sample test or a two-sample test. |
alternative |
a character string describing the alternative hypothesis. |
method |
a character string indicating what type of t-test was performed. |
data.name |
a character string giving the name(s) of the data. |
t.test(1:10, y = c(7:20))      # P = .00001855
t.test(1:10, y = c(7:20, 200)) # P = .1245 -- NOT significant anymore

## Classical example: Student's sleep data
plot(extra ~ group, data = sleep)
## Traditional interface
with(sleep, t.test(extra[group == 1], extra[group == 2]))
## Formula interface
t_test(extra ~ group, data = sleep)
Tally of discrete numbers
tally(x)
x |
Discrete number vector. |
Returns the tally.
Kristian Hovde Liland
tally(c(1,5,1,3,2,5,6,2,2,1,4,3,6))