Title: Mark-Recapture Distance Sampling
Description: Animal abundance estimation via conventional, multiple covariate and mark-recapture distance sampling (CDS/MCDS/MRDS). Detection function fitting is performed via maximum likelihood. Also included are diagnostics and plotting for fitted detection functions. Abundance estimation is via a Horvitz-Thompson-like estimator.
Authors: Laura Marshall [cre], Jeff Laake [aut], David Miller [aut], Felix Petersma [aut], Len Thomas [ctb], David Borchers [ctb], Jon Bishop [ctb], Jonah McArthur [ctb], Eric Rexstad [rev]
Maintainer: Laura Marshall <[email protected]>
License: GPL (>= 2)
Version: 3.0.0
Built: 2024-10-24 04:26:51 UTC
Source: CRAN
This package implements mark-recapture distance sampling methods as described in Borchers, D.L., W. Zucchini and R.M. Fewster (1998), "Mark-recapture models for line transect surveys", Biometrics 54: 1207-1220; Laake, J.L. (1999), "Distance sampling with independent observers: reducing bias from heterogeneity by weakening the conditional independence assumption", in Garner, G.W., S.C. Amstrup, J.L. Laake, B.F.J. Manly, L.L. McDonald and D.G. Robertson (eds), "Marine mammal survey and assessment methods", Balkema, Rotterdam: 137-148; and Borchers, D.L., J.L. Laake, C. Southwell and C.L.G. Paxton (2006), "Accommodating unmodelled heterogeneity in double-observer distance sampling surveys", Biometrics 62: 372-378.
Examples of distance sampling analyses are available at http://examples.distancesampling.org/.
For help with distance sampling and this package, there is a Google Group https://groups.google.com/forum/#!forum/distance-sampling.
Jeff Laake <[email protected]>, David Borchers <[email protected]>, Len Thomas <[email protected]>, David L. Miller <[email protected]>, Jon Bishop <[email protected]>, Felix Petersma <[email protected]>
Add a line or lines to a plot of the detection function which correspond to a given covariate combination. These can be particularly useful when there is a small number of factor levels or if quantiles of a continuous covariate are specified.
add.df.covar.line(ddf, data, ndist = 250, pdf = FALSE, breaks = "Sturges", ...)

add_df_covar_line(ddf, data, ndist = 250, pdf = FALSE, breaks = "Sturges", ...)
ddf: a fitted detection function object.
data: a data.frame with the covariate combination(s) you want to plot.
ndist: number of distances at which to evaluate the detection function.
pdf: should the line be drawn on the probability density scale; ignored for line transects.
breaks: required to ensure that PDF lines are the right size; should match what was supplied to the original plot command.
...: extra arguments to give to lines (e.g., lty, lwd, col).
All covariates must be specified in data. Plots can become quite busy when this approach is used. It may be useful to fix some covariates at their median level and plot set values of a covariate of interest. For example, setting weather (e.g., Beaufort) to its median and plotting levels of observer, then creating a second plot for a fixed observer with levels of weather (see the sketch below).
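As an illustrative sketch only (the covariate names beaufort and observer, and the objects fit and dat, are assumptions rather than package objects), one might fix Beaufort at its median and vary observer:

# hypothetical covariates: fix beaufort at its median, vary observer
preddata <- data.frame(beaufort = median(dat$beaufort),
                       observer = levels(dat$observer))
add.df.covar.line(fit, preddata, lty = 2:3)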
Arguments to lines are supplied in ... and aesthetics like line type (lty), line width (lwd) and colour (col) are recycled. By default lty is used to distinguish between the lines. It may be useful to add a legend to the plot (lines are plotted in the order of data).
Invisibly returns the values of detectability over the truncation range.
David L Miller
## Not run:
# fit an example model
data(book.tee.data)
egdata <- book.tee.data$book.tee.dataframe
result <- ddf(dsmodel = ~mcds(key = "hn", formula = ~sex),
              data = egdata[egdata$observer==1, ],
              method = "ds", meta.data = list(width = 4))

# make a base plot, showpoints=FALSE makes the plot less busy
plot(result, showpoints=FALSE)

# add lines for sex one at a time
add.df.covar.line(result, data.frame(sex=0), lty=2)
add.df.covar.line(result, data.frame(sex=1), lty=3)

# add a legend
legend(3, 1, c("Average", "sex==0", "sex==1"), lty=1:3)

# alternatively we can add both at once
# fixing line type and varying colour
plot(result, showpoints=FALSE)
add.df.covar.line(result, data.frame(sex=c(0,1)), lty=1,
                  col=c("red", "green"))

# add a legend
legend(3, 1, c("Average", "sex==0", "sex==1"), lty=1,
       col=c("black", "red", "green"))
## End(Not run)
'adj.check.order' checks that cosine, Hermite or simple polynomial adjustment terms are of the correct order.
adj.check.order(adj.series, adj.order, key)
adj.series: adjustment series used ("cos", "herm" or "poly").
adj.order: integer order(s) to check.
key: key function to be used with this adjustment series.
Only even functions are allowed as adjustment terms, per p.47 of Buckland et al. (2001). If incorrect terms are supplied then an error is thrown via stop.

Nothing! Just calls stop if something goes wrong.
David Miller
S.T.Buckland, D.R.Anderson, K.P. Burnham, J.L. Laake. 1993. Robust Models. In: Distance Sampling, eds. S.T.Buckland, D.R.Anderson, K.P. Burnham, J.L. Laake. Chapman & Hall.
adjfct.cos, adjfct.poly, adjfct.herm, detfct, mcds, cds
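A hedged illustration of the check (the call pattern only; the exact error text is the package's own):

# even simple-polynomial orders pass silently; an odd order calls stop()
adj.check.order(adj.series = "poly", adj.order = c(2, 4), key = "hn")
try(adj.check.order(adj.series = "poly", adj.order = 3, key = "hn"))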
For internal use only – not to be called by 'mrds' or 'Distance' users directly.
adj.cos(distance, scaling, adj.order)
distance: perpendicular distance vector/scalar.
scaling: scale parameter.
adj.order: the adjustment order.

Scalar or vector containing the cosine adjustment term for every value in the distance argument.
Felix Petersma
For internal use only – not to be called by 'mrds' or 'Distance' users directly.
adj.herm(distance, scaling, adj.order)
distance: perpendicular distance vector/scalar.
scaling: scale parameter.
adj.order: the adjustment order.

Scalar or vector containing the Hermite adjustment term for every value in the distance argument.
Felix Petersma
For internal use only – not to be called by 'mrds' or 'Distance' users directly.
adj.poly(distance, scaling, adj.order)
distance: perpendicular distance vector/scalar.
scaling: scale parameter.
adj.order: the adjustment order.

Scalar or vector containing the polynomial adjustment term for every value in the distance argument.
Felix Petersma
For internal use only – not to be called by 'mrds' or 'Distance' users directly.
adj.series.grad.cos(distance, scaling = 1, adj.order, adj.parm = NULL, adj.exp = FALSE)
distance: perpendicular distance vector/scalar.
scaling: scale parameter.
adj.order: the adjustment order.
adj.parm: vector of parameters (a_j).
adj.exp: boolean, defaults to FALSE.

Scalar or vector containing the gradient of the cosine adjustment series for every value in the distance argument.
Felix Petersma
For internal use only – not to be called by 'mrds' or 'Distance' users directly.
adj.series.grad.herm(distance, scaling = 1, adj.order, adj.parm = NULL, adj.exp = FALSE)
distance: perpendicular distance vector/scalar.
scaling: scale parameter.
adj.order: the adjustment order.
adj.parm: vector of parameters (a_j).
adj.exp: boolean, defaults to FALSE.

Scalar or vector containing the gradient of the Hermite adjustment series for every value in the distance argument.
Felix Petersma
For internal use only – not to be called by 'mrds' or 'Distance' users directly.
adj.series.grad.poly(distance, scaling = 1, adj.order, adj.parm = NULL, adj.exp = FALSE)
distance: perpendicular distance vector/scalar.
scaling: scale parameter.
adj.order: the adjustment order.
adj.parm: vector of parameters (a_j).
adj.exp: boolean, defaults to FALSE.

Scalar or vector containing the gradient of the polynomial adjustment series for every value in the distance argument.
Felix Petersma
Extract the AIC from a fitted detection function.
## S3 method for class 'ddf'
AIC(object, ..., k = 2)
object: a fitted detection function object.
...: optionally more fitted model objects.
k: penalty per parameter to be used; the default k = 2 is the classical AIC.
David L Miller
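A hedged usage sketch, where result and result.hr stand for two ddf objects fitted to the same data:

AIC(result)            # AIC of a single fitted detection function
AIC(result, result.hr) # compare two candidate models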
Get the apex for a gamma detection function
apex.gamma(ddfobj)
ddfobj: ddf object.

The distance at which the gamma detection function peaks.
Jeff Laake
Assigns default values for argument in list x from argument=value pairs in ... if x$argument doesn't already exist.
assign.default.values(x, ...)
x: generic list.
...: unspecified list of argument=value pairs that are used to assign values.

x - the list with filled values.
Jeff Laake
For models with covariates the detection probability for each observation can vary. This function computes an average detection function value over a set of distances, used to plot an average line that graphically represents the fitted model in plots comparing histograms and the scatter of individual estimated detection probabilities. Averages are calculated over the observed covariate combinations.
average.line(finebr, obs, model)
finebr: set of fine breaks in distance over which detection function values are averaged and plotted.
obs: value of observer for averaging (1-2: individual observers; 3: duplicates; 4: pooled observation team).
model: ddf model object.
List with 2 elements:
xgrid: vector of gridded distance values.
values: vector of average detection function values at the xgrid values.
Internal function called from plot functions for ddf objects
Jeff Laake
For models with covariates the detection probability for each observation can vary. This function computes an average value for a set of distances to plot an average line to graphically represent the fitted model in plots that compare histograms and the scatter of individual estimated detection probabilities.
average.line.cond(finebr, obs, model)
finebr: set of fine breaks in distance over which detection function values are averaged and plotted.
obs: value of observer for averaging (1-2: individual observers).
model: ddf model object.
List with 2 elements:
xgrid: vector of gridded distance values.
values: vector of average detection function values at the xgrid values.
Internal function called from plot functions for ddf objects.
Jeff Laake
Double platform data collected in a line transect survey of golf tees by 2 observers at St. Andrews. The field sex was actually the colour of the golf tee: 0 - green; 1 - yellow. Exposure was either low (0) or high (1) depending on the height of the tee above the ground. size was the number of tees in an observed cluster.
A list of 4 dataframes, with the list elements named: book.tee.dataframe, book.tee.region, book.tee.samples and book.tee.obs.
book.tee.dataframe is the distance sampling data dataframe, used in the call to fit the detection function in ddf.
Contains the following columns:
object: numeric object id.
observer: factor representing observer 1 or 2.
detected: numeric, 1 if the animal was detected, 0 otherwise.
distance: numeric value for the distance at which the animal was detected.
size: numeric value for the group size.
sex: numeric value for the sex of the animal.
exposure: numeric value for the exposure level, 0 or 1.
book.tee.region is the region table dataframe, used to supply the strata areas to the dht function. Contains the following columns:
Region.Label: factor giving the strata labels.
Area: numeric value giving the strata areas.
book.tee.samples is the samples table dataframe used to match the transect ids to the region ids and supply the effort. Used in the dht function. Contains the following columns:
Sample.Label: numeric giving the sample/transect labels.
Region.Label: factor giving the strata labels.
Effort: numeric value giving the sample/transect lengths.
book.tee.obs is the observations table dataframe used to match the object ids in the distance data to the transect labels. Used in the dht function. Contains the following columns:
object: numeric object id.
Region.Label: factor giving the strata labels.
Sample.Label: numeric giving the sample/transect labels.
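A minimal sketch of loading and inspecting the four components:

data(book.tee.data)
str(book.tee.data$book.tee.dataframe)  # distance sampling data
head(book.tee.data$book.tee.region)    # strata labels and areas
head(book.tee.data$book.tee.samples)   # transects and effort
head(book.tee.data$book.tee.obs)       # object-to-transect links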
Find the standard error of the average p and N.
calc.se.Np(model, avgp, n, average.p)
model: a fitted ddf model object.
avgp: average p function.
n: sample size.
average.p: the average probability of detection for the model.
David L. Miller
Computes cdf values of observed distances from the fitted distribution. For a set of observed x it returns the integral of f(x) over the range (inner, x), where inner is the innermost distance which is observable (either 0 or left if left truncated). In terms of g(x) this is the integral of g(x) over that range divided by the integral of g(x) over the entire range of the data (inner, W).
cdf.ds(model, newdata = NULL)
model: fitted distance sampling model.
newdata: new data values if computed for values other than the original observations.
vector of cdf values for each observation
This is an internal function that is not intended to be invoked directly. It is called by qqplot.ddf to compute values for Kolmogorov-Smirnov and Cramer-von Mises tests and the Q-Q plot.
Jeff Laake
Creates model formula list for conventional distance sampling using values supplied in the call to ddf.
cds(key = NULL, adj.series = NULL, adj.order = NULL, adj.scale = "width",
    adj.exp = FALSE, formula = ~1, shape.formula = ~1)
key: string identifying key function; currently either "hn" (half-normal), "hr" (hazard-rate), "unif" (uniform) or "gamma" (gamma distribution).
adj.series: string identifying adjustment functions: "cos" (cosine), "herm" (Hermite polynomials), "poly" (simple polynomials) or NULL.
adj.order: vector of orders of adjustment terms to include.
adj.scale: whether to scale the adjustment terms by "width" or "scale".
adj.exp: if TRUE uses exp(adj) for adjustment to keep f(x) > 0.
formula: formula for the scale function (included for completeness; only formula = ~1 is allowed for cds).
shape.formula: formula for the shape function.
A formula list used to define the detection function model:
fct: string "cds".
key: key function string.
adj.series: adjustment function string.
adj.order: adjustment function orders.
adj.scale: adjustment function scale type.
formula: formula for the scale function.
shape.formula: formula for the shape function.
Jeff Laake; Dave Miller
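A hedged sketch of a conventional distance sampling specification, here a uniform key with cosine adjustments as in a Fourier-type analysis:

# model list for a uniform key with order-1 and order-2 cosine adjustments
cds(key = "unif", adj.series = "cos", adj.order = c(1, 2), adj.scale = "width")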
Simple internal function to check that the optimisation didn't hit bounds. Based on code that used to live in detfct.fit.opt.
check.bounds(lt, lowerbounds, upperbounds, ddfobj, showit, setlower, setupper)
lt: optimisation object.
lowerbounds: current lower bounds.
upperbounds: current upper bounds.
ddfobj: ddf object.
showit: debug level.
setlower: were lower bounds set by the user.
setupper: were upper bounds set by the user.
TRUE if parameters are close to the bound, else FALSE.
Check that a fitted detection function is monotone non-increasing.
check.mono(df, strict = TRUE, n.pts = 100, tolerance = 1e-08, plot = FALSE, max.plots = 6)
df: a fitted detection function object.
strict: if TRUE (default) the detection function must be strictly monotone non-increasing.
n.pts: number of equally-spaced points between left and right truncation at which to evaluate the detection function (default 100).
tolerance: numerical tolerance for monotonicity checks (default 1e-8).
plot: plot a diagnostic highlighting the non-monotonic areas (default FALSE).
max.plots: when plot = TRUE, the maximum number of such diagnostic plots to produce (default 6).
Evaluates a series of points over the range of the detection function (left to right truncation) then determines:

1. If the detection function is always less than or equal to its value at the left truncation (g(x) <= g(left), or usually g(x) <= g(0)).
2. (Optionally) The detection function is always monotone decreasing (g(x[i]) <= g(x[i-1])). This check is only performed when strict = TRUE (the default).
3. The detection function is never less than 0 (g(x) >= 0).
4. The detection function is never greater than 1 (g(x) <= 1).
For models with covariates in the scale parameter, the detection function is evaluated at all observed covariate combinations.
Currently covariates in the shape parameter are not supported.
TRUE if the detection function is monotone, FALSE if it's not. warnings are issued to alert the user that the function is non-monotonic.
David L. Miller, Felix Petersma
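A hedged usage sketch, with result a fitted detection function as in the ddf examples:

# strict monotonicity check on a finer grid, plotting any problem regions
check.mono(result, strict = TRUE, n.pts = 200, plot = TRUE, max.plots = 3)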
Extract coefficients and provide a summary of parameters and estimates from the output of ddf model objects.
## S3 method for class 'ds'
coef(object, ...)
## S3 method for class 'io'
coef(object, ...)
## S3 method for class 'io.fi'
coef(object, ...)
## S3 method for class 'trial'
coef(object, ...)
## S3 method for class 'trial.fi'
coef(object, ...)
## S3 method for class 'rem'
coef(object, ...)
## S3 method for class 'rem.fi'
coef(object, ...)
object: ddf model object of class ds, io, io.fi, trial, trial.fi, rem or rem.fi.
...: unspecified arguments that are unused at present.
For coef.ds: a list of data frames for coefficients (scale, and exponent if hazard):
scale: dataframe of scale coefficient estimates and standard errors.
exponent: dataframe with exponent estimate and standard error if hazard detection function.
For all others: a data frame containing each coefficient and standard error.
These functions are called by the generic function coef for any ddf model object. It can be called directly by the user, but it is typically safest to use coef, which calls the appropriate function based on the type of model.
Jeff Laake
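A hedged usage sketch, with result a fitted "ds" model as in the ddf examples:

coef(result)  # dispatches to coef.ds: scale (and shape, if any) estimates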
Compute individual components of the Horvitz-Thompson abundance estimate in the covered region for a particular subset of the data, depending on the value of group: TRUE (group abundance) or FALSE (individual abundance).
compute.Nht(pdot, group = TRUE, size = NULL)
pdot: vector of estimated detection probabilities.
group: if TRUE compute group abundance; if FALSE compute individual abundance.
size: vector of group size values for clustered populations.
vector of H-T components for abundance estimate
Internal function called by covered.region.dht
Jeff Laake
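The Horvitz-Thompson-like components are simple: each detection contributes 1/pdot (group abundance) or size/pdot (individual abundance). A worked sketch with made-up numbers:

pdot <- c(0.8, 0.5, 0.25)  # estimated detection probabilities
size <- c(2, 1, 3)         # observed cluster sizes
sum(1 / pdot)              # group components: 1.25 + 2 + 4 = 7.25
sum(size / pdot)           # individual components: 2.5 + 2 + 12 = 16.5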
Computes H-T abundance within covered region by sample.
covered.region.dht(obs, samples, group)
obs: observations table.
samples: samples table.
group: if TRUE compute abundance of groups, otherwise abundance of individuals.
Nhat.by.sample - dataframe of abundance by sample
Internal function called by dht and related functions.
Jeff Laake
This is an internal routine and shouldn't be necessary in normal analyses.
create.bins(data, cutpoints)
data: 'data.frame' with at least the column 'distance'.
cutpoints: vector of cutpoints for the bins.

Argument 'data' with two extra columns, 'distbegin' and 'distend'.
David L. Miller
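A hedged sketch of the call pattern (the routine is internal, so values are illustrative only):

df <- data.frame(distance = c(0.4, 1.2, 3.7))
df.binned <- create.bins(df, cutpoints = c(0, 1, 2, 3, 4))
# df.binned gains distbegin/distend, here (0,1), (1,2) and (3,4)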
create.command.file
create.command.file(dsmodel = call(), data, method, meta.data, control)
dsmodel: distance sampling model specification.
data: dataframe containing data to be analyzed.
method: analysis method.
meta.data: list containing settings controlling data structure.
control: list containing settings controlling model fitting.
Jonah McArthur
Creates a model.frame for distance detection function fitting. It includes some pre-specified and computed variables along with those included in the model specified by the user (formula).
create.model.frame(xmat, scale.formula, meta.data, shape.formula = NULL)
xmat: dataframe for ddf.
scale.formula: user-specified formula for the scale of the distance detection function.
meta.data: user-specified meta.data (see ddf).
shape.formula: user-specified formula for the shape parameter of the distance detection function.
The following fields are always included: detected, observer, binned, and optionally distance (unless NULL) and timesdetected (if present in the data). If the distance data were binned, the distbegin and distend end point fields are included. If the integration width varies, int.begin and int.end are also included, along with an offset field for an iterative glm, if used. Beyond these fields, only fields used in the model formula are included.
model frame for analysis
Internal function and not called by user
Jeff Laake
Creates samples and obs dataframes used to compute abundance and its variance based on a structure of geographic regions and samples within each region. The intent is to generalize this routine to work with other sampling structures.
create.varstructure(model, region, sample, obs, dht.se)
model: fitted ddf object.
region: region table.
sample: sample table.
obs: table of object numbers and links to the sample and region tables.
dht.se: is uncertainty going to be calculated later?
The function performs the following tasks: 1) tests to make sure that region labels are unique, 2) merges sample and region tables into a samples table and issues a warning if not all samples were used, 3) stops with an error if some regions have no samples or if some values of Area are not valid areas, 4) creates a unique region/sample label in samples and in obs, 5) merges observations with samples and issues a warning if not all observations were used, 6) sorts regions by their label and merges the values with the predictions from the fitted model based on the object number, limiting it to the data appropriate for the fitted detection function.
List with 2 elements:
samples: merged dataframe containing region and sample info - one record per sample.
obs: merged observation data and links to region and samples.
Internal function called by dht
Jeff Laake
Generic function for fitting detection functions for distance sampling with single and double observer configurations. Independent observer, trial and dependent observer (removal) configurations are included. This is a generic function which does little other than to validate the calling arguments and methods and then calls the appropriate method-specific function to do the analysis.
ddf(dsmodel = call(), mrmodel = call(), data, method = "ds",
    meta.data = list(), control = list(), call = NULL)
dsmodel: distance sampling model specification.
mrmodel: mark-recapture model specification.
data: dataframe containing data to be analyzed.
method: analysis method.
meta.data: list containing settings controlling data structure.
control: list containing settings controlling model fitting.
call: not implemented for the top-level ddf function; this is set by ddf as it is passed to the other ddf generics.
The fitting code has certain expectations about data. It should be a dataframe with at least the following fields, named and defined as follows:

object: object number.
observer: observer number (1 or 2) for double observer; only 1 if single observer.
detected: 1 if detected by the observer and 0 if missed; always 1 for single observer.
distance: perpendicular distance.
If the data are for clustered objects, the dataframe should also contain a field named size that gives the observed number in the cluster. If the data are for a double observer survey, then there are two records for each observation and each should have the same object number. The code assumes the observations are listed in the same order for each observer, such that if the data are subsetted by observer there will be the same number of records in each subset and each subset will be in the same object order. In addition to these predefined and pre-named fields, the dataframe can have any number and type of fields that are used as covariates in the dsmodel and mrmodel. At present, discrepancies between observations in distance, size and any user-specified covariates cannot be assimilated into the uncertainty of the estimate. The code presumes the values for those fields are the same for both records (observer=1 and observer=2) and it uses the value from observer 1. Thus it makes sense to make the values the same for both records in each pair; otherwise, when observer 1 doesn't detect the object the data would have to be taken from observer 2 and would not be consistent.
Seven different fitting methods are currently available and these in turn define whether dsmodel and mrmodel need to be defined.

Method | Single/Double | dsmodel | mrmodel
ds | Single | yes | no
io | Double | yes | yes
io.fi | Double | no | yes
trial | Double | yes | yes
trial.fi | Double | no | yes
rem | Double | yes | yes
rem.fi | Double | no | yes
Methods with the suffix ".fi" use the assumption of full independence and do not use the distance sampling portion of the likelihood, which is why a dsmodel is not needed. An mrmodel is only needed for double observer surveys and thus is not needed for method ds.
The dsmodel specifies the detection function g(y) for the distance sampling data and the models restrict g(0)=1. For single observer data g(y) is the detection function for the single observer, and if it is a double observer survey it is the relative detection function (assuming g(0)=1) of both observers as a team (the unique observations from both observers). In double observer surveys, the detection function is p(y)=p(0)g(y) such that p(0)<1. The detection function g(y) is specified by dsmodel and p(0) is estimated from the conditional detection functions (see mrmodel below). The value of dsmodel is specified using a hybrid formula/function notation. The model definition is prefixed with a ~ and the remainder is a function definition with specified arguments. At present there are two different functions, cds and mcds, for conventional distance sampling and multi-covariate distance sampling. Both functions have the same required arguments (key, formula). The first specifies the key function; this can be half-normal ("hn"), hazard-rate ("hr"), gamma ("gamma") or uniform ("unif"). The argument formula specifies the formula for the log of the scale parameter of the key function (e.g., the equivalent of the standard deviation in the half-normal). The variable distance should not be included in the formula because the scale is for distance. See Marques, F.F.C. and S.T. Buckland (2004) for more details on the representation of the scale formula. For the hazard rate and gamma functions, an additional shape.formula can be specified for the model of the shape parameter. The default will be ~1.
Adjustment terms can be specified by setting adj.series, which can have the values: "none", "cos" (cosine), "poly" (polynomials), and "herm" (Hermite polynomials). One must also specify a vector of orders for the adjustment terms (adj.order) and a scaling (adj.scale), which may be "width" or "scale" (for scaling by the scale parameter). Note that the uniform key can only be used with adjustments (usually cosine adjustments for a Fourier-type analysis); a sketch follows below.
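For instance, a hedged sketch of a Fourier-type dsmodel specification:

# uniform key with cosine adjustments of orders 1 and 2, scaled by width
dsmodel = ~cds(key = "unif", adj.series = "cos", adj.order = c(1, 2),
               adj.scale = "width")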
The mrmodel specifies the form of the conditional detection functions (i.e., the probability the object is seen by observer j given it was seen by observer 3-j) for each observer (j=1,2) in a double observer survey. The value is specified using the same mix of formula/function notation but in this case the functions are glm and gam. The arguments for the functions are formula and link. At present, only glm is allowed and it is restricted to link=logit. Thus, currently the only form for the conditional detection functions is logistic as expressed in eq 6.32 of Laake and Borchers (2004). In contrast to dsmodel, the argument formula will typically include distance and all other covariates that affect detection probability. For example, mrmodel=~glm(formula=~distance+size+sex) constructs a conditional detection function based on the logistic form with additive factors distance, size, and sex. As another example, mrmodel=~glm(formula=~distance*size+sex) constructs the same model with an added interaction between distance and size.
The argument meta.data is a list that enables various options about the data to be set. These options include:

point: if TRUE the data are from point counts; FALSE (default) implies line transect data.
width: distance specifying half-width of the transect.
left: distance specifying inner truncation value.
binned: TRUE or FALSE to specify whether distances should be binned for analysis.
breaks: if binned=TRUE, this is a required sequence of break points that are used for plotting/gof. They should match distbegin, distend values if bins are fixed.
int.range: an integration range for detection probability; either a vector of 2 or a matrix with 2 columns.
mono: constrain the detection function to be weakly monotonically decreasing (only applicable when there are no covariates in the detection function).
mono.strict: when TRUE constrain the detection function to be strictly monotonically decreasing (again, only applicable when there are no covariates in the detection function).
Using meta.data=list(int.range=c(1,10)) is the same as meta.data=list(left=1,width=10). If meta.data=list(binned=TRUE) is used, the dataframe needs to contain the fields distbegin and distend for each observation, which specify the left and right hand end points of the distance interval containing the observation. This is a general data structure that allows the intervals to change rather than being fixed as in the standard distance analysis tools. Typically, if the intervals are changing so is the integration range. For example, assume that distance bins are generated using fixed angular measurements from an aircraft in which the altitude is varying. Because all analyses are truncated (i.e., the last interval does not go to infinity), the transect width (and the left truncation point if there is a blindspot below the aircraft) can potentially change for each observation. The argument int.range can also be entered as a matrix with 2 columns (left and width) and a row for each observation.
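A hedged sketch of the per-observation integration range described above (the numbers are invented):

# one (left, width) row per observation, e.g. for varying altitude
meta.data <- list(binned = TRUE,
                  int.range = cbind(left = c(0, 0, 0.5),
                                    width = c(10, 12, 11)))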
The argument control is a list that enables various analysis options to be set. It is not necessary to set any of these for most analyses. They were provided so the user can optionally see intermediate fitting output and control fitting if the algorithm doesn't converge, which happens infrequently. The list values include:

showit: integer (0-3, default 0) controlling the (increasing) amount of information printed during fitting: 0 - none; >=1 - information about refitting and bound changes is printed; >=2 - information about adjustment term fitting is printed; ==3 - per-iteration parameter estimates and log-likelihood printed.
estimate: if FALSE fits the model but doesn't estimate predicted probabilities.
refit: if TRUE the algorithm will attempt multiple optimizations at different starting values if it doesn't converge.
nrefits: number of refitting attempts.
initial: a named list of starting values for the dsmodel parameters (e.g. $scale, $shape, $adjustment).
lowerbounds: a vector of lower bounds for the dsmodel parameters, in the order the ds parameters will appear in the par element of the ddf object, i.e. fit.ddf$par where fit.ddf is a fitted ddf model.
upperbounds: a vector of upper bounds for the dsmodel parameters, in the same order as lowerbounds.
limit: if TRUE restrict analysis to observations with detected=1.
debug: if TRUE and fitting fails, return an object with fitting information.
nofit: if TRUE don't fit a model, but use the starting values and generate an object based on those values.
optimx.method: one (or a vector of) string(s) giving the optimisation method to use. If more than one is supplied, the results from one are used as the starting values for the next. See optimx.
optimx.maxit: maximum number of iterations to use in the optimisation.
mono.random.start: by default when monotonicity constraints are enforced, a grid of starting values is tested. Instead, random starting values can be used (uniformly distributed between the upper and lower bounds). Set TRUE for random start; FALSE (default) uses the grid method.
mono.method: the optimiser method to be used when (strict) monotonicity is enforced. Can be either slsqp or solnp. Default slsqp.
mono.startvals: controls whether the monotonicity optimiser should find better starting values by first fitting a key function without adjustments, and then using those starting values for the key function parameters when fitting the key + adjustment series detection function. Defaults to FALSE.
mono.outer.iter: number of outer iterations to be used by solnp when fitting a monotonic model and solnp is selected. Default 200.
silent: silences warnings within the ds fitting method (helpful for running many times without generating many warning/error messages).
optimizer: by default this is set to 'both' for single observer analyses and 'R' for double observer analyses. For single observer analyses where optimizer = 'both', the R optimizer will be used and, if present, the MCDS optimizer will also be used; the result with the best likelihood value will be selected. To run only a specified optimizer set this value to either 'R' or 'MCDS'. The MCDS optimizer cannot currently be used for detection function fitting with double observer analyses. See mcds_dot_exe for more information.
winebin: location of the wine binary used to run MCDS.exe. See mcds_dot_exe for more information.
Examples of distance sampling analyses are available at https://examples.distancesampling.org/.
Hints and tips on fitting (particularly optimisation issues) are on the mrds_opt manual page.
model object of class c(method, "ddf")
Jeff Laake
Laake, J.L. and D.L. Borchers. 2004. Methods for incomplete detection at distance zero. In: Advanced Distance Sampling, eds. S.T. Buckland, D.R.Anderson, K.P. Burnham, J.L. Laake, D.L. Borchers, and L. Thomas. Oxford University Press.
Marques, F.F.C. and S.T. Buckland. 2004. Covariate models for the detection function. In: Advanced Distance Sampling, eds. S.T. Buckland, D.R.Anderson, K.P. Burnham, J.L. Laake, D.L. Borchers, and L. Thomas. Oxford University Press.
ddf.ds, ddf.io, ddf.io.fi, ddf.trial, ddf.trial.fi, ddf.rem, ddf.rem.fi, mrds_opt
# load data
data(book.tee.data)
region <- book.tee.data$book.tee.region
egdata <- book.tee.data$book.tee.dataframe
samples <- book.tee.data$book.tee.samples
obs <- book.tee.data$book.tee.obs

# fit a half-normal detection function
result <- ddf(dsmodel=~mcds(key="hn", formula=~1), data=egdata, method="ds",
              meta.data=list(width=4))

# fit an independent observer model with full independence
result.io.fi <- ddf(mrmodel=~glm(~distance), data=egdata, method="io.fi",
                    meta.data=list(width = 4))

# fit an independent observer model with point independence
result.io <- ddf(dsmodel=~cds(key = "hn"), mrmodel=~glm(~distance),
                 data=egdata, method="io", meta.data=list(width=4))

## Not run:
# simulated single observer point count data (see ?ptdata.single)
data(ptdata.single)
ptdata.single$distbegin <- (as.numeric(cut(ptdata.single$distance,
                                           10*(0:10)))-1)*10
ptdata.single$distend <- (as.numeric(cut(ptdata.single$distance,
                                         10*(0:10))))*10
model <- ddf(data=ptdata.single, dsmodel=~cds(key="hn"),
             meta.data=list(point=TRUE,binned=TRUE,breaks=10*(0:10)))
summary(model)
plot(model,main="Single observer binned point data - half normal")

model <- ddf(data=ptdata.single, dsmodel=~cds(key="hr"),
             meta.data=list(point=TRUE, binned=TRUE, breaks=10*(0:10)))
summary(model)
plot(model, main="Single observer binned point data - hazard rate")
dev.new()

# simulated double observer point count data (see ?ptdata.dual)
# setup data
data(ptdata.dual)
ptdata.dual$distbegin <- (as.numeric(cut(ptdata.dual$distance,
                                         10*(0:10)))-1)*10
ptdata.dual$distend <- (as.numeric(cut(ptdata.dual$distance,
                                       10*(0:10))))*10
model <- ddf(method="io", data=ptdata.dual, dsmodel=~cds(key="hn"),
             mrmodel=~glm(formula=~distance*observer),
             meta.data=list(point=TRUE, binned=TRUE, breaks=10*(0:10)))
summary(model)
plot(model, main="Dual observer binned point data", new=FALSE, pages=1)

model <- ddf(method="io", data=ptdata.dual,
             dsmodel=~cds(key="unif", adj.series="cos", adj.order=1),
             mrmodel=~glm(formula=~distance*observer),
             meta.data=list(point=TRUE, binned=TRUE, breaks=10*(0:10)))
summary(model)
par(mfrow=c(2,3))
plot(model,main="Dual observer binned point data",new=FALSE)
## End(Not run)
Fits a conventional distance sampling (CDS) (likelihood eq 6.6 in Laake and Borchers 2004) or multi-covariate distance sampling (MCDS) (likelihood eq 6.14 in Laake and Borchers 2004) model for the detection function of observed distance data. It only uses key functions and does not incorporate adjustment functions as in the CDS/MCDS analysis engines in DISTANCE (Marques and Buckland 2004). Distances can be grouped (binned), ungrouped (unbinned) or a mixture of the two. This function is not called directly by the user and is called from ddf, ddf.io, or ddf.trial.
## S3 method for class 'ds'
ddf(dsmodel, mrmodel = NULL, data, method = "ds",
    meta.data = list(), control = list(), call)
dsmodel: model list with key function and scale formula if any.
mrmodel: not used.
data: dataframe for analysis.
method: analysis method; only needed if this function is called from ddf.
meta.data: list containing settings controlling data structure.
control: list containing settings controlling model fitting.
call: original function call if this function is not called directly from ddf.
For a complete description of each of the calling arguments, see ddf. The argument model in this function is the same as dsmodel in ddf. The argument dataname is the name of the dataframe specified by the argument data in ddf. The arguments control, meta.data, and method are defined the same as in ddf.
result: a ds model object.
If there is a mixture of binned and unbinned distances, the width must be set to be >= the largest interval endpoint; this could be changed with a more complicated analysis. Likewise, if all distances are binned and bins overlap, the above must also hold; if bins don't overlap, the width must be one of the interval endpoints. The same holds for left truncation. Although the mixture analysis works in principle, it has not been tested via simulation.
Jeff Laake
Laake, J.L. and D.L. Borchers. 2004. Methods for incomplete detection at distance zero. In: Advanced Distance Sampling, eds. S.T. Buckland, D.R. Anderson, K.P. Burnham, J.L. Laake, D.L. Borchers, and L. Thomas. Oxford University Press.
Marques, F.F.C. and S.T. Buckland. 2004. Covariate models for the detection function. In: Advanced Distance Sampling, eds. S.T. Buckland, D.R. Anderson, K.P. Burnham, J.L. Laake, D.L. Borchers, and L. Thomas. Oxford University Press.
flnl, summary.ds, coef.ds, plot.ds, gof.ds
# ddf.ds is called when ddf is called with method="ds"
data(book.tee.data)
region <- book.tee.data$book.tee.region
egdata <- book.tee.data$book.tee.dataframe
samples <- book.tee.data$book.tee.samples
obs <- book.tee.data$book.tee.obs
result <- ddf(dsmodel = ~mcds(key = "hn", formula = ~1),
              data = egdata[egdata$observer==1, ], method = "ds",
              meta.data = list(width = 4))
summary(result,se=TRUE)
plot(result,main="cds - observer 1")
print(dht(result,region,samples,obs,options=list(varflag=0,group=TRUE),
          se=TRUE))
print(ddf.gof(result))
Generic function that computes a chi-square goodness of fit test for detection function models with binned data, and Cramer-von Mises and Kolmogorov-Smirnov (if ks=TRUE) tests for exact distance data. By default a Q-Q plot is generated for exact data (and can be suppressed using the qq=FALSE argument).
ddf.gof(model, breaks = NULL, nc = NULL, qq = TRUE, nboot = 100, ks = FALSE, ...)
model: model object.
breaks: cutpoints to use for binning data.
nc: number of distance classes.
qq: flag to indicate whether a quantile-quantile plot is desired.
nboot: number of replicates to use to calculate p-values for the Kolmogorov-Smirnov goodness of fit test statistics.
ks: perform the Kolmogorov-Smirnov test (this involves many bootstraps so can take a while).
...: graphics parameters to pass into the qqplot function.
Formal goodness of fit testing for detection function models using Kolmogorov-Smirnov and Cramer-von Mises tests. Both tests are based on looking at the quantile-quantile plot produced by qqplot.ddf and deviations from the line x=y.
The Kolmogorov-Smirnov test asks the question "what's the largest vertical distance between a point and the y=x line?" It uses this distance as a statistic to test the null hypothesis that the samples (EDF and CDF in our case) are from the same distribution (and hence our model fits well). If the deviation between the y=x line and the points is too large we reject the null hypothesis and say the model doesn't have a good fit.
Rather than looking at the single biggest difference between the y=x line and the points in the Q-Q plot, we might prefer to think about all the differences between the line and the points, since there may be many smaller differences that we want to take into account rather than looking for one large deviation. This is the Cramer-von Mises test: its null hypothesis is the same, but the statistic it uses is the sum of the deviations from each of the points to the line.
Note that a bootstrap procedure is required for the Kolmogorov-Smirnov test to ensure that the p-values from the procedure are correct, as we are comparing the cumulative distribution function (CDF) and empirical distribution function (EDF) and we have estimated the parameters of the detection function. The nboot parameter controls the number of bootstraps to use. Set to 0 to avoid computing bootstraps (much faster but with no Kolmogorov-Smirnov results, of course).
One can change the precision of printed values by using the print.ddf.gof method's digits argument.
List of class ddf.gof containing:
chi-square: goodness of fit test statistic.
df: degrees of freedom associated with the test statistic.
p-value: significance level of the test statistic.
Jeff Laake
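A hedged usage sketch, with result a fitted model as in the ddf.ds example above:

gof <- ddf.gof(result, ks = TRUE, nboot = 200)  # K-S p-values need bootstraps
print(gof)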
Mark-Recapture Distance Sampling (MRDS) Analysis of Independent Observer Configuration and Point Independence
## S3 method for class 'io'
ddf(dsmodel, mrmodel, data, method = NULL,
    meta.data = list(), control = list(), call = "")
dsmodel: distance sampling model specification; model list with key function and scale formula if any.
mrmodel: mark-recapture model specification; model list with formula and link.
data: analysis dataframe.
method: not used.
meta.data: list containing settings controlling data structure.
control: list containing settings controlling model fitting.
call: original function call used to call ddf.
MRDS analysis based on point independence involves two separate and
independent analyses of the mark-recapture data and the distance sampling
data. For the independent observer configuration, the mark-recapture data
are analysed with a call to ddf.io.fi
(see likelihood eq 6.8
and 6.16 in Laake and Borchers 2004) to fit conditional distance sampling
detection functions to estimate p(0), detection probability at distance zero
for the independent observer team based on independence at zero (eq 6.22 in
Laake and Borchers 2004). Independently, the distance data, the union of the
observations from the independent observers, are used to fit a conventional
distance sampling (CDS) (likelihood eq 6.6) or multi-covariate distance
sampling (MCDS) (likelihood eq 6.14) model for the detection function, g(y),
such that g(0)=1. The detection function for the observer team is then
created as p(y)=p(0)*g(y) (eq 6.28 of Laake and Borchers 2004) from which
predictions are made. ddf.io
is not called directly by the user and
is called from ddf
with method="io"
.
For a complete description of each of the calling arguments, see ddf. The argument dataname is the name of the dataframe specified by the argument data in ddf. The arguments dsmodel, mrmodel, control and meta.data are defined the same as in ddf.
result: an io model object which is composed of io.fi and ds model objects
Jeff Laake
Laake, J.L. and D.L. Borchers. 2004. Methods for incomplete detection at distance zero. In: Advanced Distance Sampling, eds. S.T. Buckland, D.R.Anderson, K.P. Burnham, J.L. Laake, D.L. Borchers, and L. Thomas. Oxford University Press.
ddf.io.fi, ddf.ds, summary.io, coef.io, plot.io, gof.io
Mark-Recapture Analysis of Independent Observer Configuration with Full Independence
## S3 method for class 'io.fi'
ddf(dsmodel = NULL, mrmodel, data, method,
    meta.data = list(), control = list(), call = "")
dsmodel: not used.
mrmodel: mark-recapture model specification.
data: analysis dataframe.
method: analysis method; only needed if this function is called from ddf.
meta.data: list containing settings controlling data structure.
control: list containing settings controlling model fitting.
call: original function call used to call ddf.
The mark-recapture data derived from an independent observer distance sampling survey can be used to derive conditional detection functions (p_j(y)) for both observers (j=1,2). They are conditional detection functions because detection probability for observer j is based on seeing or not seeing observations made by observer 3-j. Thus, p_1(y) is estimated by p_1|2(y).
If detections by the observers are independent (full independence) then p_1(y)=p_1|2(y),p_2(y)=p_2|1(y) and for the union, full independence means that p(y)=p_1(y) + p_2(y) - p_1(y)*p_2(y) for each distance y. In fitting the detection functions the likelihood given by eq 6.8 and 6.16 in Laake and Borchers (2004) is used. That analysis does not require the usual distance sampling assumption that perpendicular distances are uniformly distributed based on line placement that is random relative to animal distribution. However, that assumption is used in computing predicted detection probability which is averaged based on a uniform distribution (see eq 6.11 of Laake and Borchers 2004).
For a complete description of each of the calling arguments, see ddf. The argument model in this function is the same as mrmodel in ddf. The argument dataname is the name of the dataframe specified by the argument data in ddf. The arguments control, meta.data, and method are defined the same as in ddf.
result: an io.fi model object
Jeff Laake
Laake, J.L. and D.L. Borchers. 2004. Methods for incomplete detection at distance zero. In: Advanced Distance Sampling, eds. S.T. Buckland, D.R.Anderson, K.P. Burnham, J.L. Laake, D.L. Borchers, and L. Thomas. Oxford University Press.
ddf.io, summary.io.fi, coef.io.fi, plot.io.fi, gof.io.fi, io.glm
Mark-Recapture Distance Sampling (MRDS) Analysis of Removal Observer Configuration and Point Independence
## S3 method for class 'rem'
ddf(dsmodel, mrmodel, data, method = NULL,
    meta.data = list(), control = list(), call = "")
dsmodel: distance sampling model specification; model list with key function and scale formula if any.
mrmodel: mark-recapture model specification; model list with formula and link.
data: analysis dataframe.
method: not used.
meta.data: list containing settings controlling data structure.
control: list containing settings controlling model fitting.
call: original function call used to call ddf.
MRDS analysis based on point independence involves two separate and
independent analyses of the mark-recapture data and the distance sampling
data. For the removal observer configuration, the mark-recapture data are
analysed with a call to ddf.rem.fi
(see Laake and Borchers
2004) to fit conditional distance sampling detection functions to estimate
p(0), detection probability at distance zero for the primary observer based
on independence at zero (eq 6.22 in Laake and Borchers 2004). Independently,
the distance data, the observations from the primary observer, are used to
fit a conventional distance sampling (CDS) (likelihood eq 6.6) or
multi-covariate distance sampling (MCDS) (likelihood eq 6.14) model for the
detection function, g(y), such that g(0)=1. The detection function for the
primary observer is then created as p(y)=p(0)*g(y) (eq 6.28 of Laake and
Borchers 2004) from which predictions are made. ddf.rem
is not called
directly by the user and is called from ddf
with
method="rem"
.
For a complete description of each of the calling arguments, see ddf. The argument data is the dataframe specified by the argument data in ddf. The arguments dsmodel, mrmodel, control and meta.data are defined the same as in ddf.
result: an rem model object which is composed of rem.fi and ds model objects
Jeff Laake
Laake, J.L. and D.L. Borchers. 2004. Methods for incomplete detection at distance zero. In: Advanced Distance Sampling, eds. S.T. Buckland, D.R.Anderson, K.P. Burnham, J.L. Laake, D.L. Borchers, and L. Thomas. Oxford University Press.
Mark-Recapture Distance Sampling (MRDS) Analysis of Removal Observer Configuration with Full Independence
## S3 method for class 'rem.fi'
ddf(dsmodel = NULL, mrmodel, data, method,
    meta.data = list(), control = list(), call = "")
dsmodel: not used.
mrmodel: mark-recapture model specification.
data: analysis dataframe.
method: analysis method; only needed if this function is called from ddf.
meta.data: list containing settings controlling data structure.
control: list containing settings controlling model fitting.
call: original function call used to call ddf.
The mark-recapture data derived from a removal observer distance sampling survey can only derive a conditional detection function (p_1(y)) for observer 1 because technically it assumes that detection probability does not vary by occasion (observer in this case). It is a conditional detection function because detection probability for observer 1 is conditional on the observations seen by either of the observers. Thus, p_1(y) is estimated by p_1|2(y).
If detections by the observers are independent (full independence) then p_1(y)=p_1|2(y) and for the union, full independence means that p(y)=p_1(y) + p_2(y) - p_1(y)*p_2(y) for each distance y. In fitting the detection functions the likelihood from Laake and Borchers (2004) are used. That analysis does not require the usual distance sampling assumption that perpendicular distances are uniformly distributed based on line placement that is random relative to animal distribution. However, that assumption is used in computing predicted detection probability which is averaged based on a uniform distribution (see eq 6.11 of Laake and Borchers 2004).
For a complete description of each of the calling arguments, see ddf. The argument model in this function is the same as mrmodel in ddf. The argument dataname is the name of the dataframe specified by the argument data in ddf. The arguments control, meta.data, and method are defined the same as in ddf.
result: an rem.fi model object
Jeff Laake
Laake, J.L. and D.L. Borchers. 2004. Methods for incomplete detection at distance zero. In: Advanced Distance Sampling, eds. S.T. Buckland, D.R.Anderson, K.P. Burnham, J.L. Laake, D.L. Borchers, and L. Thomas. Oxford University Press.
Mark-Recapture Distance Sampling (MRDS) Analysis of Trial Observer Configuration and Point Independence
## S3 method for class 'trial'
ddf(dsmodel, mrmodel, data, method = NULL,
    meta.data = list(), control = list(), call = "")
dsmodel: distance sampling model specification; model list with key function and scale formula if any.
mrmodel: mark-recapture model specification; model list with formula and link.
data: analysis dataframe.
method: not used.
meta.data: list containing settings controlling data structure.
control: list containing settings controlling model fitting.
call: original function call used to call ddf.
MRDS analysis based on point independence involves two separate and independent analyses of the mark-recapture data and the distance sampling data. For the trial configuration, the mark-recapture data are analysed with a call to ddf.trial.fi (see likelihood eq 6.12 and 6.17 in Laake and Borchers 2004) to fit a conditional distance sampling detection function for observer 1 based on trials (observations) from observer 2, to estimate p_1(0), the detection probability at distance zero for observer 1. Independently, the distance data from observer 1 are used to fit a conventional distance sampling (CDS) (likelihood eq 6.6) or multiple covariate distance sampling (MCDS) (likelihood eq 6.14) model for the detection function, g(y), such that g(0)=1. The detection function for observer 1 is then created as p_1(y)=p_1(0)*g(y) (eq 6.28 of Laake and Borchers 2004), from which predictions are made. ddf.trial is not called directly by the user; it is called from ddf with method="trial".
For a complete description of each of the calling arguments, see ddf. The argument dataname is the name of the dataframe specified by the argument data in ddf. The arguments dsmodel, mrmodel, control and meta.data are defined the same as in ddf.
result: a trial model object which is composed of trial.fi
and ds
model objects
Jeff Laake
Laake, J.L. and D.L. Borchers. 2004. Methods for incomplete detection at distance zero. In: Advanced Distance Sampling, eds. S.T. Buckland, D.R.Anderson, K.P. Burnham, J.L. Laake, D.L. Borchers, and L. Thomas. Oxford University Press.
ddf.trial.fi, ddf.ds, summary.trial, coef.trial, plot.trial, gof.trial
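A minimal sketch of fitting the trial configuration via ddf (again using the book.tee.data example data; the model choices are illustrative only):

data(book.tee.data)
egdata <- book.tee.data$book.tee.dataframe
# point independence: dsmodel for observer 1, mrmodel for the trials
fit.trial <- ddf(dsmodel=~cds(key="hn"), mrmodel=~glm(formula=~distance),
                 data=egdata, method="trial", meta.data=list(width=4))
summary(fit.trial)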
Mark-Recapture Analysis of Trial Observer Configuration with Full Independence
## S3 method for class 'trial.fi' ddf( dsmodel = NULL, mrmodel, data, method, meta.data = list(), control = list(), call = "" )
dsmodel |
not used |
mrmodel |
mark-recapture model specification |
data |
analysis dataframe |
method |
analysis method; only needed if this function is called from ddf.trial |
meta.data |
list containing settings controlling data structure |
control |
list containing settings controlling model fitting |
call |
original function call used to call ddf |
The mark-recapture data derived from a trial observer distance sampling survey can be used to derive a conditional detection function (p_1(y)) for observer 1 based on trials (observations) from observer 2. It is a conditional detection function because detection probability for observer 1 is based on seeing or not seeing observations made by observer 2. Thus, p_1(y) is estimated by p_1|2(y). If detections by the observers are independent (full independence) then p_1(y)=p_1|2(y) for each distance y. In fitting the detection functions the likelihood given by eq 6.12 or 6.17 in Laake and Borchers (2004) is used. That analysis does not require the usual distance sampling assumption that perpendicular distances are uniformly distributed based on line placement that is random relative to animal distribution. However, that assumption is used in computing predicted detection probability which is averaged based on a uniform distribution (see eq 6.13 of Laake and Borchers 2004).
For a complete description of each of the calling arguments, see ddf. The argument model in this function is the same as mrmodel in ddf. The argument dataname is the name of the dataframe specified by the argument data in ddf. The arguments control, meta.data, and method are defined the same as in ddf.
result: a trial.fi model object
Jeff Laake
Laake, J.L. and D.L. Borchers. 2004. Methods for incomplete detection at distance zero. In: Advanced Distance Sampling, eds. S.T. Buckland, D.R.Anderson, K.P. Burnham, J.L. Laake, D.L. Borchers, and L. Thomas. Oxford University Press.
ddf.trial, summary.trial.fi, coef.trial.fi, plot.trial.fi, gof.trial.fi
Computes delta method variance-covariance matrix of results of any generic
function fct
that computes a vector of estimates as a function of a
set of estimated parameters par
.
DeltaMethod(par, fct, vcov, delta, ...)
par |
vector of parameter values at which estimates should be constructed |
fct |
function that constructs estimates from parameters |
vcov |
variance-covariance matrix of the parameters |
delta |
proportional change in parameters used to numerically estimate first derivative with central-difference formula (ignored) |
... |
any additional arguments needed by |
The delta method (aka propagation of errors) is based on a Taylor series approximation; see Seber's book on the Estimation of Animal Abundance. It uses the first derivative of fct with respect to par.
It also uses the variance-covariance matrix of the estimated parameters
which is derived in estimating the parameters and is an input argument.
The first argument of fct should be par, which is a vector of parameter estimates. It should return a single value (or vector) of estimate(s). The remaining arguments of fct, if any, can be passed to fct by including them at the end of the call to DeltaMethod as name=value pairs.
a list with values
variance |
estimated variance-covariance matrix of the estimates derived by fct |
partial |
matrix (or vector) of partial derivatives of fct with respect to par |
This is a generic function that can be used in any setting beyond the
mrds
package. However this is an internal function for mrds
and the user does not need to call it explicitly.
Jeff Laake and David L Miller
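A small self-contained sketch of the computation DeltaMethod performs; the estimator fct, the parameter values and the variance-covariance matrix below are hypothetical:

# var(fct(par)) is approximated by G %*% vcov %*% t(G), where G holds the
# partial derivatives of fct at par (central differences here)
fct <- function(par) exp(par[1] + par[2])          # hypothetical estimator
par <- c(0.5, -0.2)                                # parameter estimates
vcov <- matrix(c(0.01, 0.002, 0.002, 0.02), 2, 2)  # their v-c matrix
delta <- 1e-4
G <- sapply(seq_along(par), function(i) {
  dp <- numeric(length(par)); dp[i] <- delta * max(abs(par[i]), 1)
  (fct(par + dp) - fct(par - dp)) / (2 * dp[i])
})
matrix(G, nrow=1) %*% vcov %*% matrix(G, ncol=1)   # delta method variance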
Creates a series of tables for dual observer data that shows the number missed and detected for each observer within defined distance classes.
det.tables(model, nc = NULL, breaks = NULL)
model |
fitted model from ddf |
nc |
number of equal-width bins for histogram |
breaks |
user-defined breakpoints |
list object of class "det.tables"
Observer1 |
table for observer 1 |
Observer2 |
table for observer 2 |
Duplicates |
histogram counts for duplicates |
Pooled |
histogram counts for all observations by either observer |
Obs1_2 |
table for observer 1 within subset seen by observer 2 |
Obs2_1 |
table for observer 2 within subset seen by observer 1 |
Jeff Laake
data(book.tee.data) region <- book.tee.data$book.tee.region egdata <- book.tee.data$book.tee.dataframe samples <- book.tee.data$book.tee.samples obs <- book.tee.data$book.tee.obs xx <- ddf(mrmodel=~glm(formula=~distance*observer), dsmodel=~mcds(key="hn", formula=~sex), data=egdata, method="io", meta.data=list(width=4)) tabs <- det.tables(xx, breaks=c(0, 0.5, 1, 2, 3, 4)) par(mfrow=c(2, 2)) plot(tabs, new=FALSE, which=c(1, 2, 5, 6))
Fit detection function to observed distances using the key-adjustment function approach. If adjustment functions are included, the fitting alternates between the parameters of the key and adjustment functions and then all parameters, much like the approach in the CDS and MCDS Distance FORTRAN code. To do so it calls detfct.fit.opt, which uses the R optim function; optim does not allow non-linear constraints, so when adjustments are included the fitted detection function can be non-monotone.
detfct.fit(ddfobj, optim.options, bounds, misc.options)
ddfobj |
detection function object |
optim.options |
control options for optim |
bounds |
bounds for the parameters |
misc.options |
miscellaneous options |
fitted detection function model object with the following list structure
par |
final parameter vector |
value |
final negative log likelihood value |
counts |
number of function evaluations |
convergence |
see codes in optim |
message |
string about convergence |
hessian |
hessian evaluated at final parameter values |
aux |
a list with 20 elements
|
Dave Miller; Jeff Laake
Fit detection function to observed distances using the key-adjustment function approach. If adjustment functions are included, the fitting alternates between the parameters of the key and adjustment functions and then all parameters, much like the approach in the CDS and MCDS Distance FORTRAN code. This function is called by the driver function detfct.fit; it then calls the relevant optimisation routine, slsqp, solnp or optimx.
detfct.fit.opt(ddfobj, optim.options, bounds, misc.options, fitting = "all")
ddfobj |
detection function object |
optim.options |
control options for optim |
bounds |
bounds for the parameters |
misc.options |
miscellaneous options |
fitting |
character string with values "all","key","adjust" to determine which parameters are allowed to vary in the fitting |
fitted detection function model object with the following list structure
par |
final parameter vector |
value |
final negative log likelihood value |
counts |
number of function evaluations |
convergence |
see codes in optim |
message |
string about convergence |
hessian |
hessian evaluated at final parameter values |
aux |
a list with 20 elements
|
Dave Miller; Jeff Laake; Lorenzo Milazzo; Felix Petersma
Compute density and abundance estimates and variances based on Horvitz-Thompson-like estimator.
dht( model, region.table, sample.table, obs.table = NULL, subset = NULL, se = TRUE, options = list() )
model |
ddf model object |
region.table |
data.frame of region records with fields Region.Label and Area |
sample.table |
data.frame of sample records with fields Region.Label, Sample.Label and Effort |
obs.table |
data.frame of observation records with fields object, Region.Label and Sample.Label, which link each observation to a sample and region; not needed if Region.Label and Sample.Label are in the data used to fit the model and subset is given |
subset |
subset statement to create obs.table |
se |
if TRUE, computes standard errors |
options |
a list of options that can be set; see "dht options" below |
Density and abundance within the sampled region is computed based on a Horvitz-Thompson-like estimator for groups and individuals (if a clustered population) and this is extrapolated to the entire survey region based on any defined regional stratification. The variance is based on replicate samples within any regional stratification. For clustered populations, the expected cluster size E(s) and its standard error are also output.

Abundance is estimated with a Horvitz-Thompson-like estimator (Huggins 1989, 1991; Borchers et al 1998; Borchers and Burnham 2004). The abundance in the sampled region is simply N = sum_i (1/p_i), where p_i is the estimated detection probability for the ith detection of n total observations. It is not strictly a Horvitz-Thompson estimator because the p_i are estimated and not known. For animals observed in tight clusters, that estimator gives the abundance of groups (group=TRUE in options) and the abundance of individuals is estimated as N = sum_i (s_i/p_i), where s_i is the size (e.g., number of animals in the group) of each observation (group=FALSE in options).
Extrapolation and estimation of abundance to the entire survey region is based on either a random sampling design or a stratified random sampling design. Replicate samples (lines) are specified within regional strata in region.table, if any. If there is no stratification, region.table should contain only a single record with the Area for the entire survey region. The sample.table is linked to the region.table with the Region.Label. The obs.table is linked to the sample.table with the Sample.Label and Region.Label. Abundance can be restricted to a subset (e.g., for a particular species) of the population by limiting the list of observations in obs.table to those in the desired subset. Alternatively, if Sample.Label and Region.Label are in the data.frame used to fit the model, then a subset argument can be given in place of the obs.table. To use the subset argument but include all of the observations, use subset=1==1 to avoid creating an obs.table.
In extrapolating to the entire survey region it is important that the unit measurements be consistent or converted for consistency. A conversion factor can be specified with the convert.units variable in the options list. The values of Area in region.table must be made consistent with the units for Effort in sample.table and the units of distance in the data.frame that was analyzed. It is easiest to do if the units of Area are the square of the units of Effort; then it is only necessary to convert the units of distance to the units of Effort. For example, if Effort was entered in kilometres, Area in square kilometres and distance in metres, then using options=list(convert.units=0.001) would convert metres to kilometres and density would be expressed in square kilometres, which would then be consistent with the units for Area. However, they can all be in different units as long as the appropriate composite value for convert.units is chosen. Abundance for a survey region can be expressed as A*N/a, where A is the Area for the survey region, N is the abundance in the covered (sampled) region, and a is the area of the sampled region, in units of Effort * distance. The sampled region a is multiplied by convert.units, so convert.units should be chosen such that the result is in the same units as Area. For example, if Effort was entered in kilometres, Area in hectares (100m x 100m) and distance in metres, then using options=list(convert.units=0.1) will convert a to units of hectares (0.01 to convert metres to 100m units and 10 to convert km to 100m units).
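As a quick arithmetic check of the composite conversion just described (hypothetical survey values, line transects):

effort_km <- 10                       # total line length, km
width_m   <- 100                      # truncation width, metres
a_raw     <- 2 * effort_km * width_m  # covered area in raw km x m units
# the same area computed directly in 100m x 100m (hectare) units
a_ha <- 2 * (effort_km * 10) * (width_m / 100)
a_ha / a_raw                          # composite convert.units = 0.1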
The argument options
is a list of variable=value
pairs that
set options for the analysis. All but two of these have been described above.
pdelta
should not need to be changed but was included for
completeness. It controls the precision of the first derivative calculation
for the delta method variance. If the option areas.supplied
is
TRUE
then the covered area is assumed to be supplied in the
CoveredArea
column of the sample data.frame
.
list object of class dht
with elements:
clusters |
result list for object clusters |
individuals |
result list for individuals |
Expected.S |
expected cluster size with its standard error |
The list structure of clusters and individuals is the same:
bysample |
results for each sample |
summary |
summary statistics by region |
N |
abundance estimates by region |
D |
density estimates by region |
average.p |
average detection probability estimate |
cormat |
correlation matrix of regional abundance/density estimates and total (if more than one region) |
vc |
list of 3: total variance-covariance matrix, detection function component of variance and encounter rate component of variance. For detection the v-c matrix and partial vector are returned |
Nhat.by.sample |
another summary of the abundance estimates by sample |
If the argument se=TRUE, standard errors for density and abundance are computed. Coefficients of variation and log-normal confidence intervals are constructed using a Satterthwaite approximation for degrees of freedom (Buckland et al. 2001 p. 90). The function dht.se computes the variance and interval estimates.
The variance has two components:
variation due to uncertainty from estimation of the detection function parameters;
variation in abundance due to random sample selection;
The first component (model parameter uncertainty) is computed using a delta
method estimate of variance (Huggins 1989, 1991, Borchers et al. 1998) in
which the first derivatives of the abundance estimator with respect to the
parameters in the detection function are computed numerically (see
DeltaMethod
).
The second component (encounter rate variance) can be computed in one of several ways depending on the form taken for the encounter rate and the estimator used. To begin with, there are three possible values for varflag used to calculate the encounter rate variance:
0
uses a binomial variance for the number of observations
(equation 13 of Borchers et al. 1998). This estimator is only useful if the
sampled region is the survey region and the objects are not clustered; this
situation will not occur very often;
1
uses the encounter rate (objects observed per unit
transect) from Buckland et al. (2001) pg 78-79 (equation 3.78) for line
transects (see also Fewster et al, 2009 estimator R2). This variance
estimator is not appropriate if
size
or a derivative of size
is used in the detection function;
2
is the default and uses the encounter rate estimator
(estimated abundance per unit transect) suggested by Innes
et al (2002) and Marques & Buckland (2004).
In general, if any covariates are used in the models, the default varflag=2 is preferable as the estimated abundance will take into account variability due to covariate effects. If the population is clustered, the mean group size and its standard error are also reported.
For options 1
and 2
, it is then possible to choose one of the
estimator forms given in Fewster et al (2009) for line transects:
"R2"
, "R3"
, "R4"
, "S1"
, "S2"
,
"O1"
, "O2"
or "O3"
by specifying the ervar=
option (default "R2"
). For points, either the "P2"
or
"P3"
estimator can be selected (>=mrds 2.3.0 default "P2"
,
<= mrds 2.2.9 default "P3"
). See varn
and Fewster
et al (2009) for further details on these estimators.
dht options

Several options are available to control calculations and output:
ci.width
Confidence interval width, expressed as a decimal
between 0 and 1 (default 0.95
, giving a 95% CI)
pdelta
delta value for computing numerical first derivatives (Default: 0.001)
varflag
0,1,2 (see "Uncertainty") (Default: 2
)
convert.units
multiplier for width to convert to units of
length (Default: 1
)
ervar
encounter rate variance type (see "Uncertainty" and
type
argument of varn
). (Default: "R2"
for
lines and "P2"
for points)
Jeff Laake, David L Miller
Borchers, D.L., S.T. Buckland, P.W. Goedhart, E.D. Clarke, and S.L. Hedley. 1998. Horvitz-Thompson estimators for double-platform line transect surveys. Biometrics 54: 1221-1237.
Borchers, D.L. and K.P. Burnham. 2004. General formulation for distance sampling. pp 10-11. In: Advanced Distance Sampling, eds. S.T. Buckland, D.R. Anderson, K.P. Burnham, J.L. Laake, D.L. Borchers, and L. Thomas. Oxford University Press.
Buckland, S.T., D.R.Anderson, K.P. Burnham, J.L. Laake, D.L. Borchers, and L. Thomas. 2001. Introduction to Distance Sampling: Estimating Abundance of Biological Populations. Oxford University Press.
Fewster, R.M., S.T. Buckland, K.P. Burnham, D.L. Borchers, P.E. Jupp, J.L. Laake and L. Thomas. 2009. Estimating the encounter rate variance in distance sampling. Biometrics 65: 225-236.
Huggins, R.M. 1989. On the statistical analysis of capture experiments. Biometrika 76:133-140.
Huggins, R.M. 1991. Some practical aspects of a conditional likelihood approach to capture experiments. Biometrics 47: 725-732.
Innes, S., M.P. Heide-Jorgensen, J.L. Laake, K.L. Laidre, H.J. Cleator, P. Richard, and R.E.A. Stewart. 2002. Surveys of belugas and narwhals in the Canadian High Arctic in 1996. NAMMCO Scientific Publications 4: 169-190.
Marques, F.F.C. and S.T. Buckland. 2004. Covariate models for the detection function. In: Advanced Distance Sampling, eds. S.T. Buckland, D.R.Anderson, K.P. Burnham, J.L. Laake, D.L. Borchers, and L. Thomas. Oxford University Press.
print.dht, dht.se
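A minimal usage sketch with the book.tee.data tables shipped with the package (the options shown are illustrative):

data(book.tee.data)
region  <- book.tee.data$book.tee.region
egdata  <- book.tee.data$book.tee.dataframe
samples <- book.tee.data$book.tee.samples
obs     <- book.tee.data$book.tee.obs
result <- ddf(dsmodel=~mcds(key="hn", formula=~sex),
              data=egdata[egdata$observer==1, ], method="ds",
              meta.data=list(width=4))
# abundance and density with 90% intervals and the default varflag=2
dht(result, region, samples, obs, options=list(ci.width=0.9, varflag=2))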
Computes abundance at specified values of parameters for numerical computation of the first derivative with respect to parameters in the detection function. An internal function called by DeltaMethod, which is invoked by dht.se.
dht.deriv(par, model, obs, samples, options = list())
par |
detection function parameter values |
model |
ddf model object |
obs |
observations table |
samples |
samples table |
options |
list of options as specified in dht |
vector of abundance estimates at values of parameters specified in par
Internal function; not intended to be called by user
Jeff Laake
Computes standard error, cv, and log-normal confidence intervals for abundance and density within each region (if any) and for the total of all the regions. It also produces the correlation matrix for regional and total estimates.
dht.se( model, region.table, samples, obs, options, numRegions, estimate.table, Nhat.by.sample )
model |
ddf model object |
region.table |
table of region values |
samples |
table of samples(replicates) |
obs |
table of observations |
options |
list of options that can be set (see dht) |
numRegions |
number of regions |
estimate.table |
table of estimate values |
Nhat.by.sample |
estimated abundances by sample |
The variance has two components:
variation due to uncertainty from estimation of the detection function parameters;
variation in abundance due to random sample selection;
The first component (model parameter uncertainty) is computed using a delta
method estimate of variance (Huggins 1989, 1991, Borchers et al. 1998) in
which the first derivatives of the abundance estimator with respect to the
parameters in the detection function are computed numerically (see
DeltaMethod
).
The second component (encounter rate variance) can be computed in one of several ways depending on the form taken for the encounter rate and the estimator used. To begin with, there are three possible values for varflag used to calculate the encounter rate variance:
0
uses a binomial variance for the number of observations
(equation 13 of Borchers et al. 1998). This estimator is only useful if the
sampled region is the survey region and the objects are not clustered; this
situation will not occur very often;
1
uses the encounter rate (objects observed per unit
transect) from Buckland et al. (2001) pg 78-79 (equation 3.78) for line
transects (see also Fewster et al, 2009 estimator R2). This variance
estimator is not appropriate if
size
or a derivative of size
is used in the detection function;
2
is the default and uses the encounter rate estimator
(estimated abundance per unit transect) suggested by Innes
et al (2002) and Marques & Buckland (2004).
In general, if any covariates are used in the models, the default varflag=2 is preferable as the estimated abundance will take into account variability due to covariate effects. If the population is clustered, the mean group size and its standard error are also reported.
For options 1
and 2
, it is then possible to choose one of the
estimator forms given in Fewster et al (2009). For line transects:
"R2"
, "R3"
, "R4"
, "S1"
, "S2"
,
"O1"
, "O2"
or "O3"
can be used by specifying the
ervar=
option (default "R2"
). For points, either the
"P2"
or "P3"
estimator can be selected (>=mrds 2.3.0
default "P2"
, <= mrds 2.2.9 default "P3"
). See
varn
and Fewster et al (2009) for further details
on these estimators.
Exceptions to the above occur if there is only one sample in a stratum. In that case it uses a Poisson assumption (Var(x) = x) and it assumes a known variance, so z = 1.96 is used for the critical value. In all other cases the degrees of freedom for the t-distribution assumed for the log(abundance) or log(density) is based on the Satterthwaite approximation (Buckland et al. 2001 pg 90) for the degrees of freedom (df). The df are weighted by the squared cv in combining the two sources of variation because, under the assumed log-normal distribution, the components are multiplicative. For combining df for the sampling variance across regions they are weighted by the variance because it is a sum across regions.
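A small sketch of the Satterthwaite combination described above, with hypothetical cv and df components:

# combined df = (sum cv_i^2)^2 / sum(cv_i^4 / df_i)
cv <- c(0.10, 0.15)  # detection function and encounter rate cv components
df <- c(25, 9)       # their respective degrees of freedom
sum(cv^2)^2 / sum(cv^4 / df)  # Satterthwaite df for the combined cv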
A non-zero correlation between regional estimates can occur from using a common detection function across regions. This is reflected in the correlation matrix of the regional and total estimates which is given in the value list. It is only needed if subtotals of regional estimates are needed.
List with 2 elements:
estimate.table |
completed table with se, cv and confidence limits |
vc |
correlation matrix of estimates |
This function is called by dht
and it is not expected that the
user will call this function directly but it is documented here for
completeness and for anyone expanding the code or using this function in
their own code.
Jeff Laake
see dht
This function has been updated to match distpdf closely, so that it has the same flexibility. Effectively, it gives the gradient of distpdf or detfct, whichever one is specified.
distpdf.grad( distance, par.index, ddfobj, standardize = FALSE, width, point, left = 0, pdf.based = TRUE )
distance |
vector of distances |
par.index |
the index of the parameter of interest |
ddfobj |
the ddf object |
standardize |
whether the function should return the gradient of the standardized detection function g(x)/g(0) (TRUE), or simply of g(x) (FALSE). Currently only implemented for standardize = FALSE. |
width |
the truncation width |
point |
are the data from point transects (TRUE) or line transects (FALSE). |
left |
the left truncation (default 0) |
pdf.based |
is it the gradient of the non-normalised pdf (TRUE) or the detection function (FALSE)? Default is TRUE. |
Various functions used to specify key and adjustment functions for gradients of detection functions.
So far, only developed for the half-normal, hazard-rate and uniform key functions in combination with cosine, simple polynomial and Hermite polynomial adjustments. It is only called by the gradient-based solver and should not be called by the general user.
distpdf.grad
will call either a half-normal, hazard-rate or uniform
function with adjustment terms to fit the data better, returning the
gradient of detection at that distance w.r.t. the parameters. The adjustments
are either cosine, Hermite or simple polynomial.
the gradient of the non-normalised pdf or detection function w.r.t. the parameter with parameter index par.index.
Felix Petersma
Computes values of conditional and unconditional detection functions and probability density functions for line/point data for single observer or dual observer in any of the 3 configurations (io, trial, rem).
ds.function( model, newdata = NULL, obs = "All", conditional = FALSE, pdf = TRUE, finebr )
model |
model object |
newdata |
dataframe at which to compute values; if NULL uses fitting data |
obs |
1 or 2 for observer 1 or 2, 3 for duplicates, "." for combined and "All" to return all of the values |
conditional |
if FALSE, computes p(x) based on distance detection function and if TRUE based on mr detection function |
pdf |
if FALSE, returns p(x) and if TRUE, returns p(x)*pi(x)/integral p(x)*pi(x) |
finebr |
fine break values over which line is averaged |
Placeholder: not currently functional.
List containing
xgrid |
grid of distance values |
values |
average detection fct values at the xgrid values |
Jeff Laake
For a specific set of parameter values, it computes and returns the negative
log-likelihood for the distance sampling likelihood for distances that are
unbinned, binned and a mixture of both. The function flnl
is the
function minimized using optim
from within
ddf.ds
.
flnl(fpar, ddfobj, misc.options, fitting = "all")
fpar |
parameter values for detection function at which negative log-likelihood should be evaluated |
ddfobj |
distance sampling object |
misc.options |
a list object containing additional information for the fitting, such as the truncation width and integration ranges |
fitting |
character string with values "all", "key", "adjust" to determine which parameters are allowed to vary in the fitting |
Most of the computation is in flpt.lnl
in which the negative
log-likelihood is computed for each observation. flnl
is a wrapper
that optionally outputs intermediate results and sums the individual
log-likelihood values.
flnl is the main routine that manipulates the parameters using getpar to handle fitting of key, adjustment or all of the parameters. It then calls flpt.lnl to do the actual computation of the likelihood. The probability density function for point counts is fr and for line transects is fx. fx = g(x)/mu (where g(x) is the detection function), whereas f(r) = r*g(r)/mu, where mu in both cases is the normalizing constant. Both functions are in the source code file for detfct and are called from distpdf, and the integral calculations are made with integratepdf.
negative log-likelihood value at the parameter values specified in
fpar
These are internal functions used by ddf.ds
to fit
distance sampling detection functions. It is not intended for the user to
invoke these functions but they are documented here for completeness.
Jeff Laake, David L Miller
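A small sketch of the two pdf forms above for a half-normal detection function with no adjustments (parameter values are arbitrary):

g <- function(x, sigma) exp(-x^2 / (2 * sigma^2))   # half-normal g(x)
width <- 4; sigma <- 1.5
# line transects: f(x) = g(x)/mu
mu.line <- integrate(function(x) g(x, sigma), 0, width)$value
f.line  <- function(x) g(x, sigma) / mu.line
# point transects: f(r) = r*g(r)/mu
mu.point <- integrate(function(r) r * g(r, sigma), 0, width)$value
f.point  <- function(r) r * g(r, sigma) / mu.point
integrate(f.line, 0, width)$value    # both integrate to 1
integrate(f.point, 0, width)$value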
The function derives the gradients of the constraint function for all model parameters, in the following order: 1. Scale parameter (if part of key function) 2. Shape parameter (if part of key function) 3. Adjustment parameter 1 4. Adjustment parameter 2 5. Etc.
flnl.constr.grad.neg(pars, ddfobj, misc.options, fitting = "all")
pars |
vector of parameter values for the detection function at which the gradients of the negative log-likelihood should be evaluated |
ddfobj |
distance sampling object |
misc.options |
a list object containing all additional information such
as the type of optimiser or the truncation width, and is created within
|
fitting |
character string with values "all", "key", "adjust" to determine which parameters are allowed to vary in the fitting. Not actually used. Defaults to "all". |
The constraint function itself is formed of a specified number of non-linear
constraints, which defaults to 20 and is specified through
misc.options$mono.points
. The constraint function checks whether the
standardised detection function is 1) weakly/strictly monotonic at the
points and 2) non-negative at all the points. flnl.constr.grad
returns
the gradients of those constraints w.r.t. all parameters of the detection
function, i.e., 2 times mono.points
gradients for every parameter.
This function mostly follows the same structure as flnl.constr
in
detfct.fit.mono.R
.
a matrix of gradients for all constraints (rows) w.r.t to every parameters (columns)
This function is not meant to be called by users of the mrds and Distance packages directly but rather by the gradient-based solver. This solver is used when the distance sampling model is for single-observer data coming from either line or point transects and only when the detection function contains an adjustment series but no covariates. It is implemented for the following key + adjustment series combinations for the detection function: the key function can be half-normal, hazard-rate or uniform, and the adjustment series can be cosine, simple polynomial or Hermite polynomial. Data can be either binned or exact, but a combination of the two has not been implemented yet.

This function derives the gradients of the negative log-likelihood function with respect to all parameters. It is based on the theory presented in Introduction to Distance Sampling (2001) and Distance Sampling: Methods and Applications (2015). It is not meant to be called by users of the mrds and Distance packages directly but rather by the gradient-based solver, under the same restrictions as described above.
flnl.grad(pars, ddfobj, misc.options, fitting = "all")
pars |
vector of parameter values for the detection function at which the gradients of the negative log-likelihood should be evaluated |
ddfobj |
distance sampling object |
misc.options |
a list object containing all additional information such
as the type of optimiser or the truncation width, and is created by
|
fitting |
character string with values "all", "key", "adjust" to determine which parameters are allowed to vary in the fitting. Not actually used. Defaults to "all". |
The gradients of the negative log-likelihood w.r.t. the parameters
Felix Petersma
Computes the hessian to be used for the variance-covariance matrix. The hessian is the outer product of the vector of first partials (see pg 62 of Buckland et al. 2002).
flt.var(ddfobj, misc.options)
ddfobj |
distance sampling object |
misc.options |
width-transect width (W); int.range-integration range for observations; showit-0 to 3 controls level of iteration printing; integral.numeric-if TRUE integral is computed numerically rather than analytically |
variance-covariance matrix of parameters in the detection function
This is an internal function used by ddf.ds
to fit
distance sampling detection functions. It is not intended for the user to
invoke this function but it is documented here for completeness.
Jeff Laake and David L Miller
Buckland et al. 2002
Compute value of p(0) using a logit formulation
g0(beta, z)
beta |
logistic parameters |
z |
design matrix of covariate values |
vector of p(0) values
Jeff Laake
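A one-line sketch of the logit form (the design matrix and parameters below are hypothetical):

z <- cbind(1, c(0, 1))   # intercept plus a hypothetical observer covariate
beta <- c(1.2, -0.5)     # hypothetical logistic parameters
plogis(z %*% beta)       # p(0) = exp(z beta)/(1 + exp(z beta))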
Extracts parameters of a particular type (scale, shape, adjustments or g0 (p(0))) from the vector of parameters in ddfobj. All of the parameters are kept in a single vector for optimization even though they have very different uses. assign.par parses them from the vector based on a known structure and assigns them into ddfobj. getpar extracts the requested types from ddfobj.
getpar(ddfobj, fitting = "all", index = FALSE)
ddfobj |
distance sampling object (see |
fitting |
character string which is either "all","key","adjust" which determines which parameters are retrieved |
index |
logical that determines whether parameters are returned (FALSE) or starting indices in parameter vector for scale, shape, adjustment parameters |
If index==FALSE, a vector of the requested parameters; if index==TRUE, a vector of 3 starting indices for the scale, shape and adjustment parameters.
Internal functions not intended to be called by user.
Jeff Laake
assign.par
Compute chi-square goodness-of-fit test for ds models
gof.ds(model, breaks = NULL, nc = NULL)
model |
ddf model object |
breaks |
distance cut points |
nc |
number of distance classes |
list with chi-square value, df and p-value
Jeff Laake
ddf.gof
Computes the integral of distpdf
with scale=1 (stdint=TRUE
) or
specified scale (stdint=FALSE
).
gstdint( x, ddfobj, index = NULL, select = NULL, width, standardize = TRUE, point = FALSE, stdint = TRUE, doeachint = FALSE, left = left )
x |
lower, upper value for integration |
ddfobj |
distance detection function specification |
index |
specific data row index |
select |
logical vector for selection of data values |
width |
truncation width |
standardize |
if TRUE, divide through by the function evaluated at 0 |
point |
logical to determine if point (TRUE) or line transect (FALSE) data |
stdint |
if TRUE, the integral is computed with scale=1; otherwise with the specified scale |
doeachint |
if TRUE, each integral is computed numerically |
left |
left truncation width |
vector of integral values of detection function
This is an internal function that is not intended to be invoked directly.
Jeff Laake and David L Miller
Takes bar heights (height) and cutpoints (breaks), and constructs a line-only histogram from them using the function plot() (if lineonly==FALSE) or lines() (if lineonly==TRUE).
histline( height, breaks, lineonly = FALSE, outline = FALSE, ylim = range(height), xlab = "x", ylab = "y", det.plot = FALSE, add = FALSE, ... )
height |
heights of histogram bars |
breaks |
cutpoints for x |
lineonly |
if TRUE, drawn with lines() to allow addition to a current plot; otherwise drawn with plot() |
outline |
if TRUE, only outline of histogram is plotted |
ylim |
limits for y axis |
xlab |
label for x axis |
ylab |
label for y axis |
det.plot |
if TRUE, plot is of detection so yaxis limited to unit interval |
add |
should this plot add to a previous window |
... |
Additional unspecified arguments for plot |
None
Jeff Laake and David L Miller
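A minimal usage sketch with made-up bar heights:

counts <- c(5, 9, 4, 2)      # hypothetical histogram bar heights
breaks <- c(0, 1, 2, 3, 4)   # cutpoints, one more than the heights
histline(counts, breaks, xlab="distance", ylab="count")
# overlay an outline-only version on the existing plot
histline(counts/2, breaks, lineonly=TRUE, outline=TRUE)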
Integrates a logistic detection function; a separate function is used because in certain cases the integral can be solved analytically and also because the scale trick used with the half-normal and hazard rate doesn't work with the logistic.
integratedetfct.logistic(x, scalemodel, width, theta1, integral.numeric, w)
x |
logistic design matrix values |
scalemodel |
scale model for logistic |
width |
transect width |
theta1 |
parameters for logistic |
integral.numeric |
if TRUE, the integral is computed numerically rather than analytically |
w |
design covariates |
vector of integral values
Jeff Laake
Computes the integral (analytically) over x from 0 to width of a logistic detection function; for reference, see integral #526 in the CRC Standard Math Tables, 24th ed.
integratelogistic.analytic(x, models, beta, width)
x |
matrix of data |
models |
list of model formulae |
beta |
parameters of logistic detection function |
width |
transect half-width |
Jeff Laake
Computes integral of pdf of observed distances over x for each observation. The method of computation depends on argument switches set and the type of detection function.
integratepdf( ddfobj, select, width, int.range, standardize = TRUE, point = FALSE, left = 0, doeachint = FALSE )
ddfobj |
distance detection function specification |
select |
logical vector for selection of data values |
width |
truncation width |
int.range |
integration range matrix; vector is converted to matrix |
standardize |
logical used to decide whether to divide through by the function evaluated at 0 |
point |
logical to determine if point count ( |
left |
left truncation width |
doeachint |
calculate each integral numerically |
vector of integral values - one for each observation
Jeff Laake & Dave Miller
Gradient of the integral of the detection function, i.e., d beta/d theta in the documentation. This gradient of the integral is the same as the integral of the gradient, thanks to the Leibniz integral rule.
integratepdf.grad( par.index, ddfobj, int.range, width, standardize = FALSE, point = FALSE, left = 0, pdf.based = TRUE )
par.index |
the index of the parameter of interest |
ddfobj |
the ddf object |
int.range |
vector with the lower and upper bound of the integration |
width |
the truncation width |
standardize |
TRUE if the non-standardised detection function should be integrated. Only implemented for standardize = FALSE, so users should not touch this argument and it can probably be removed. |
point |
are the data from point transects (TRUE) or line transects (FALSE). |
left |
the left truncation. Defaults to zero. |
pdf.based |
evaluate the non-normalised pdf or the detection function? Default is TRUE. |
For internal use only – not to be called by mrds
or Distance
users directly.
Felix Petersma
Provides an iterative algorithm for finding the MLEs of detection (capture) probabilities for a two-occasion (double observer) mark-recapture experiment using standard algorithms GLM/GAM and an offset to compensate for conditioning on the set of observations. While the likelihood can be formulated and solved numerically, the use of GLM/GAM provides all of the available tools for fitting, predictions, plotting etc without any further development.
io.glm( datavec, fitformula, eps = 1e-05, iterlimit = 500, GAM = FALSE, gamplot = TRUE )
datavec |
dataframe |
fitformula |
logit link formula |
eps |
convergence criterion |
iterlimit |
maximum number of iterations allowed |
GAM |
uses GAM instead of GLM for fitting |
gamplot |
set to TRUE to get a gam plot object if GAM=TRUE |
Note that currently the code in this function for GAMs has been commented out until the remainder of the mrds package works with GAMs. This is an internal function that is used by ddf.io.fi to fit mark-recapture models with 2 occasions. The argument mrmodel is used for fitformula.
list of class("ioglm","glm","lm") or class("ioglm","gam")
glmobj |
GLM or GAM object |
offsetvalue |
offsetvalues from iterative fit |
plotobj |
gam plot object (if GAM & gamplot==TRUE, else NULL) |
Jeff Laake, David Borchers, Charles Paxton
Buckland, S.T., J.M. Breiwick, K.L. Cattanach, and J.L. Laake. 1993. Estimated population size of the California gray whale. Marine Mammal Science, 9:235-249.
Burnham, K.P., S.T. Buckland, J.L. Laake, D.L. Borchers, T.A. Marques, J.R.B. Bishop, and L. Thomas. 2004. Further topics in distance sampling. pp: 360-363. In: Advanced Distance Sampling, eds. S.T. Buckland, D.R.Anderson, K.P. Burnham, J.L. Laake, D.L. Borchers, and L. Thomas. Oxford University Press.
These functions are used to test whether a logistic detection function is a linear function of distance (is.linear.logistic) or is constant (varies by distance but no other covariates) (is.logistic.constant). Based on these tests, the most appropriate manner for integrating the detection function with respect to distance is chosen. The integrals are needed to estimate the average detection probability for a given set of covariates.
is.linear.logistic(xmat, g0model, zdim, width)
xmat |
data matrix |
g0model |
logit model |
zdim |
number of columns in design matrix |
width |
transect width |
If the logit is linear in distance then the integral can be computed analytically. If the logit is constant or only varies by distance then only one integral needs to be computed rather than an integral for each observation.
Logical TRUE if condition holds and FALSE otherwise
Jeff Laake
Determines whether the specified logit model is constant for all observations. If it is constant then only one integral needs to be computed.
is.logistic.constant(xmat, g0model, width)
xmat |
data |
g0model |
logit model |
width |
transect width |
logical value
Jeff Laake
The key function contains one parameter, the scale. The current implementation assumes that the scaled distance is x/scale, not x/width.
keyfct.grad.hn(distance, key.scale)
distance |
perpendicular distance vector |
key.scale |
vector of scale values |
d key / d scale = exp(-y ^ 2 / (2 scale ^ 2)) * (y ^ 2 / scale ^ 3)
vector of derivatives of the half-normal key function w.r.t. the scale parameter
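A quick finite-difference check of this derivative (arbitrary values):

hn <- function(y, s) exp(-y^2 / (2 * s^2))    # half-normal key
y <- 2; s <- 1.5; h <- 1e-6
analytic <- hn(y, s) * y^2 / s^3              # d key / d scale
numeric  <- (hn(y, s + h) - hn(y, s - h)) / (2 * h)
all.equal(analytic, numeric)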
The key function contains two parameters, the scale and the shape, so the gradient is two-dimensional. The current implementation assumes that the scaled distance is x/scale, not x/width.
keyfct.grad.hz(distance, key.scale, key.shape, shape = FALSE)
distance |
perpendicular distance vector |
key.scale |
vector of scale values |
key.shape |
vector of shape values |
shape |
is the gradient parameter the shape parameter? Defaults to FALSE |
d key / d scale = (shape * exp(-(1/(x/scale)^shape))) / (((x/scale)^shape) * scale)
d key / d shape = -((log(x/scale) * exp(-(1/(x/scale)^shape))) / ((x/scale)^shape))
When distance = 0, the gradients are also zero. However, the equations above would return NaN and (-)Inf due to operations such as log(0) or division by zero, so the implementation corrects for this case.
matrix of derivatives of the hazard rate key function w.r.t. the scale parameter and the shape parameter.
Threshold key function
keyfct.th1(distance, key.scale, key.shape)
distance |
perpendicular distance vector |
key.scale |
vector of scale values |
key.shape |
vector of shape values |
vector of probabilities
Threshold key function
keyfct.th2(distance, key.scale, key.shape)
distance |
perpendicular distance vector |
key.scale |
vector of scale values |
key.shape |
vector of shape values |
vector of probabilities
The two-part normal detection function of Becker and Christ (2015). Either side of an estimated apex in the distance histogram has a half-normal distribution, with differing scale parameters. Covariates may be included but affect both sides of the function.
keyfct.tpn(distance, ddfobj)
distance |
perpendicular distance vector |
ddfobj |
meta object containing parameters, design matrices etc |
Two-part normal models have 2 important parameters:

The apex, which estimates the peak in the detection function (where g(x)=1). The log apex is reported in summary results, so taking the exponential of this value should give the peak in the plotted function (see examples).

The parameter that controls the difference between the sides, .dummy_apex_side, which is automatically added to the formula for a two-part normal model. One can add interactions with this variable as normal, but the main effect does not need to be added as it will be included automatically.
a vector of probabilities that the observation were detected given they were at the specified distance and assuming that g(mu)=1
Earl F Becker, David L Miller
Becker, E. F., & Christ, A. M. (2015). A Unimodal Model for Double Observer Distance Sampling Surveys. PLOS ONE, 10(8), e0136403. doi:10.1371/journal.pone.0136403
These data represent avian point count surveys conducted at 453 point sample survey locations on the 24,000 (approx) live-fire region of Fort Hood in central Texas. Surveys were conducted by independent double observers (2 per survey occasion) and as such we had a maximum of 3 paired survey histories, giving a maximum of 6 sample occasions (see MacKenzie et al. 2006, MacKenzie and Royle 2005, and Laake et al. 2011 for various sample survey design details). At each point, we surveyed for 5 minutes (technically broken into 3 time intervals of 2, 2, and 1 minutes; not used here) and we noted detections by each observer and collected distance to each observation within a set of distance bins (0-25, 25-50, 50-75, 75-100m) of the target species (Black-capped Vireos in this case) for each surveyor. Our primary focus was to use mark-recapture distance sampling methods to estimate density of Black-capped Vireos, and to estimate detection rates for the mark-recapture, distance, and composite model.
The format is a data frame with the following covariate metrics.
Unique identifier for each sample location; locations are the same for both species
Visit number to the point
Species designation, either Golden-cheeked warbler (GW) or Black-capped Vireo (BV)
Distance measure, which is either NA (representing no detection), or the median of the binned detection distances
ID value indicating which observers were paired for that sampling occasion
Observer ID, either primary(1), or secondary (2)
Detection of a bird, either 1 = detected, or 0 = not detected
Date of survey since 15 March 2011
Predicted occupancy value for that survey hexagon based on Farrell et al. (2013)
Region.Label categorization, see mrds
help file for
details on data structure
Amount of survey effort at the point
Number of days since 15 March 2011
Unique ID for each paired observations
In addition to detailing the analysis used by Collier et al. (2013, In
Review), this example documents the use of mrds
for avian point count
surveys and shows how density models can be incorporated with occupancy
models to develop spatially explicit density surface maps. For those that
are interested, for the distance sampling portion of our analysis, we used
both conventional distance sampling (cds
) and multiple covariate
distance sampling (mcds
) with uniform and half-normal key functions.
For the mark-recapture portion of our analysis, we tended to use covariates
for distance (median bin width), observer, and date of survey (days since 15
March 2011).
We combined our mrds
density estimates via a Horvitz-Thompson styled
estimator with the resource selection function gradient developed in Farrell
et al. (2013) and estimated density on an ~3.14ha hexagonal grid across our
study area, which provided a density gradient for the Fort Hood military
installation. Because there was considerable data manipulation needed for
each analysis to structure the data appropriately for use in mrds
,
rather than wrap each analysis in a single function, we have provided both
the Golden-cheeked warbler and Black-capped vireo analyses in their full
detail. The primary differences you will see will be changes to model
structures and model outputs between the two species.
Bret Collier and Jeff Laake
Farrell, S.F., B.A. Collier, K.L. Skow, A.M. Long, A.J. Campomizzi, M.L. Morrison, B. Hays, and R.N. Wilkins. 2013. Using LiDAR-derived structural vegetation characteristics to develop high-resolution, small-scale, species distribution models for conservation planning. Ecosphere 43(3): 42. http://dx.doi.org/10.1890/ES12-000352.1
Laake, J.L., B.A. Collier, M.L. Morrison, and R.N. Wilkins. 2011. Point-based mark recapture distance sampling. Journal of Agricultural, Biological and Environmental Statistics 16: 389-408.
Collier, B.A., S.L. Farrell, K.L. Skow, A. M. Long, A.J. Campomizzi, K.B. Hays, J.L. Laake, M.L. Morrison, and R.N. Wilkins. 2013. Spatially explicit density of endangered avian species in a disturbed landscape. Auk, In Review.
## Not run: data(lfbcvi) xy=cut(lfbcvi$Pred, c(-0.0001, .1, .2, .3, .4, .5, .6, .7, .8, .9, 1), labels=c("1", "2", "3", "4", "5", "6", "7", "8", "9", "10")) x=data.frame(lfbcvi, New=xy) # Note that I scaled the individual covariate of day-helps with # convergence issues bird.data <- data.frame(object=x$ObjectID, observer=x$Observer, detected=x$Detected, distance=x$Distance, Region.Label=x$New, Sample.Label=x$PointID, Day=(x$Day/max(x$Day))) # make observer a factor variable bird.data$observer=factor(bird.data$observer) # Jeff Laake suggested this snippet to quickly create distance medians # which adds bin information to the bird.data dataframe bird.data$distbegin=0 bird.data$distend=100 bird.data$distend[bird.data$distance==12.5]=25 bird.data$distbegin[bird.data$distance==37.5]=25 bird.data$distend[bird.data$distance==37.5]=50 bird.data$distbegin[bird.data$distance==62.5]=50 bird.data$distend[bird.data$distance==62.5]=75 bird.data$distbegin[bird.data$distance==87.5]=75 bird.data$distend[bird.data$distance==87.5]=100 # Removed all survey points with distance=NA for a survey event; # hence no observations for use in ddf() but needed later bird.data=bird.data[complete.cases(bird.data),] # Manipulations on full dataset for various data.frame creation for # use in density estimation using dht() #Samples dataframe xx=x x=data.frame(PointID=x$PointID, Species=x$Species, Category=x$New, Effort=x$Effort) x=x[!duplicated(x$PointID),] point.num=table(x$Category) samples=data.frame(PointID=x$PointID, Region.Label=x$Category, Effort=x$Effort) final.samples=data.frame(Sample.Label=samples$PointID, Region.Label=samples$Region.Label, Effort=samples$Effort) #obs dataframe obs=data.frame(ObjectID=xx$ObjectID, PointID=xx$PointID) #used to get Region and Sample assigned to ObjectID obs=merge(obs, samples, by=c("PointID", "PointID")) obs=obs[!duplicated(obs$ObjectID),] obs=data.frame(object=obs$ObjectID, Region.Label=obs$Region.Label, Sample.Label=obs$PointID) region.data=data.frame(Region.Label=c(1, 2, 3,4,5,6,7,8,9, 10), Area=c(point.num[1]*3.14, point.num[2]*3.14, point.num[3]*3.14, point.num[4]*3.14, point.num[5]*3.14, point.num[6]*3.14, point.num[7]*3.14, point.num[8]*3.14, point.num[9]*3.14, point.num[10]*3.14)) # Candidate Models BV1=ddf( dsmodel=~mcds(key="hn",formula=~1), mrmodel=~glm(~distance), data=bird.data, method="io", meta.data=list(binned=TRUE,point=TRUE,width=100,breaks=c(0,50,100))) BV1FI=ddf( dsmodel=~mcds(key="hn",formula=~1), mrmodel=~glm(~distance), data=bird.data, method="io.fi", meta.data=list(binned=TRUE,point=TRUE,width=100,breaks=c(0,50,100))) BV2=ddf( dsmodel=~mcds(key="hr",formula=~1), mrmodel=~glm(~distance), data=bird.data, method="io", meta.data=list(binned=TRUE,point=TRUE,width=100,breaks=c(0,50,100))) BV3=ddf( dsmodel=~mcds(key="hn",formula=~1), mrmodel=~glm(~distance+observer), data=bird.data, method="io", meta.data=list(binned=TRUE, point=TRUE, width=100,breaks=c(0,50,100))) BV3FI=ddf( dsmodel=~mcds(key="hn",formula=~1), mrmodel=~glm(~distance+observer), data=bird.data, method="io.fi", meta.data=list(binned=TRUE, point=TRUE, width=100,breaks=c(0,50,100))) BV4=ddf( dsmodel=~mcds(key="hr",formula=~1), mrmodel=~glm(~distance+observer), data=bird.data, method="io", meta.data=list(binned=TRUE, point=TRUE, width=100,breaks=c(0,50,100))) BV5=ddf( dsmodel=~mcds(key="hn",formula=~1), mrmodel=~glm(~distance*observer), data=bird.data, method="io", meta.data=list(binned=TRUE, point=TRUE, width=100,breaks=c(0,50,100))) BV5FI=ddf( dsmodel=~mcds(key="hn",formula=~1), 
mrmodel=~glm(~distance*observer), data=bird.data, method="io.fi", meta.data=list(binned=TRUE, point=TRUE, width=100,breaks=c(0,50,100))) BV6=ddf( dsmodel=~mcds(key="hr",formula=~1), mrmodel=~glm(~distance*observer), data=bird.data, method="io", meta.data=list(binned=TRUE, point=TRUE, width=100,breaks=c(0,50,100))) BV7=ddf( dsmodel=~cds(key="hn",formula=~1), mrmodel=~glm(~distance*Day), data=bird.data, method="io", meta.data=list(binned=TRUE, point=TRUE, width=100,breaks=c(0,50,100))) BV7FI=ddf( dsmodel=~cds(key="hn",formula=~1), mrmodel=~glm(~distance*Day), data=bird.data, method="io.fi", meta.data=list(binned=TRUE, point=TRUE, width=100,breaks=c(0,50,100))) BV8=ddf( dsmodel=~cds(key="hr",formula=~1), mrmodel=~glm(~distance*Day), data=bird.data, method="io", meta.data=list(binned=TRUE, point=TRUE, width=100,breaks=c(0,50,100))) BV9=ddf( dsmodel=~mcds(key="hn",formula=~1), mrmodel=~glm(~distance*observer*Day), data=bird.data, method="io", meta.data=list(binned=TRUE, point=TRUE, width=100,breaks=c(0,50,100))) BV9FI=ddf( dsmodel=~mcds(key="hn",formula=~1), mrmodel=~glm(~distance*observer*Day), data=bird.data, method="io.fi", meta.data=list(binned=TRUE, point=TRUE, width=100,breaks=c(0,50,100))) BV10=ddf( dsmodel=~mcds(key="hr",formula=~1), mrmodel=~glm(~distance*observer*Day), data=bird.data, method="io", meta.data=list(binned=TRUE, point=TRUE, width=100,breaks=c(0,50,100))) #BV.DS=ddf( # dsmodel=~mcds(key="hn",formula=~1), # data=bird.data, # method="ds", # meta.data=list(binned=TRUE, point=TRUE, width=100,breaks=c(0,50,100))) #AIC table building code. AIC = c(BV1$criterion, BV1FI$criterion, BV2$criterion, BV3$criterion, BV3FI$criterion, BV4$criterion, BV5$criterion, BV5FI$criterion, BV6$criterion, BV7$criterion, BV7FI$criterion, BV8$criterion, BV9$criterion, BV9FI$criterion, BV10$criterion) #creates a set of row names for me to check my grep() call below rn = c("BV1", "BV1FI", "BV2", "BV3", "BV3FI", "BV4", "BV5", "BV5FI", "BV6", "BV7", "BV7FI", "BV8", "BV9", "BV9FI", "BV10") #Number parameters k = c(length(BV1$par), length(BV1FI$par), length(BV2$par), length(BV3$par), length(BV3FI$par), length(BV4$par), length(BV5$par),length(BV5FI$par), length(BV6$par), length(BV7$par), length(BV7FI$par), length(BV8$par), length(BV9$par), length(BV9FI$par), length(BV10$par)) #build AIC table AIC.table=data.frame(AIC = AIC, rn=rn, k=k, dAIC = abs(min(AIC)-AIC) , likg=exp(-.5*(abs(min(AIC)-AIC)))) #row.names(AIC.table)=grep("BV", ls(), value=TRUE) AIC.table=AIC.table[with(AIC.table, order(-likg, -dAIC, AIC, k)),] AIC.table=data.frame(AIC.table, wi=AIC.table$likg/sum(AIC.table$likg)) AIC.table # Model average N_hat_covered estimates # not very clean, but I wanted to show full process, need to use # collect.models and model.table here later on estimate <- c(BV1$Nhat, BV1FI$Nhat, BV2$Nhat, BV3$Nhat, BV3FI$Nhat, BV4$Nhat, BV5$Nhat, BV5FI$Nhat, BV6$Nhat, BV7$Nhat, BV7FI$Nhat, BV8$Nhat, BV9$Nhat, BV9FI$Nhat, BV10$Nhat) AIC.values=AIC # had to use str() to extract here as Nhat.se is calculated in # mrds:::summary.io, not in ddf(), so it takes a bit std.err <- c(summary(BV1)$Nhat.se, summary(BV1FI)$Nhat.se, summary(BV2)$Nhat.se, summary(BV3)$Nhat.se, summary(BV3FI)$Nhat.se, summary(BV4)$Nhat.se, summary(BV5)$Nhat.se, summary(BV5FI)$Nhat.se, summary(BV6)$Nhat.se, summary(BV7)$Nhat.se, summary(BV7FI)$Nhat.se,summary(BV8)$Nhat.se, summary(BV9)$Nhat.se, summary(BV9FI)$Nhat.se, summary(BV10)$Nhat.se) ## End(Not run) ## Not run: #Not Run #requires RMark library(RMark) #uses model.average structure to model average real abundance estimates for #covered area of the surveys
mmi.list=list(estimate=estimate, AIC=AIC.values, se=std.err) model.average(mmi.list, revised=TRUE) #Not Run #Summary for the top 2 models #summary(BV5, se=TRUE) #summary(BV5FI, se=TRUE) #Not Run #Best Model #best.model=AIC.table[1,] #Not Run #GOF for models #ddf.gof(BV5, breaks=c(0, 25, 50, 75, 100)) #Not Run #Density estimation across occupancy categories #out.BV=dht(BV5, region.data, final.samples, obs, se=TRUE, # options=list(convert.units=.01)) #Plot--Not Run #Composite Detection Function #plot(BV5, which=3, showpoints=FALSE, angle=0, density=0, col="black", lwd=3, # main="Black-capped Vireo",xlab="Distance (m)", las=1, cex.axis=1.25, # cex.lab=1.25) ## End(Not run)
These data represent avian point count surveys conducted at 453 point sample survey locations on the 24,000 (approx) live-fire region of Fort Hood in central Texas. Surveys were conducted by independent double observers (2 per survey occasion) and as such we had a maximum of 3 paired survey histories, giving a maximum of 6 sample occasions (see MacKenzie et al. 2006, MacKenzie and Royle 2005, and Laake et al. 2011 for various sample survey design details). At each point, we surveyed for 5 minutes (technically broken into 3 time intervals of 2, 2, and 1 minutes; not used here) and we noted detections by each observer and collected distance to each observation within a set of distance bins (0-50, 50-100m; Laake et al. 2011) of the target species (Golden-cheeked warblers in this case) for each surveyor. Our primary focus was to use mark-recapture distance sampling methods to estimate density of Golden-cheeked warblers, and to estimate detection rates for the mark-recapture, distance, and composite model.
The format is a data frame with the following fields.
Unique identifier for each sample location; locations are the same for both species
Visit number to the point
Species designation, either Golden-cheeked warbler (GW) or Black-capped Vireo (BV)
Distance measure, which is either NA (representing no detection), or the median of the binned detection distances
ID value indicating which observers were paired for that sampling occasion
Observer ID, either primary(1), or secondary (2)
Detection of a bird, either 1 = detected, or 0 = not detected
Date of survey since 15 March 2011, numeric value
Predicted occupancy value for that survey hexagon based on Farrell et al. (2013)
Region.Label categorization; see the R package mrds help file for details on data structure
Amount of survey effort at the point
Number of days since 15 March 2011, numeric value
Unique ID for each paired observation
In addition to detailing the analysis used by Collier et al. (2013, In Review), this example documents the use of mrds for avian point count surveys and shows how density models can be incorporated with occupancy models to develop spatially explicit density surface maps. For those who are interested, for the distance sampling portion of our analysis we used both conventional distance sampling (cds) and multiple covariate distance sampling (mcds) with uniform and half-normal key functions. For the mark-recapture portion of our analysis, we tended to use covariates for distance (median bin width), observer, and date of survey (days since 15 March 2011).
We combined our mrds density estimates via a Horvitz-Thompson styled estimator with the resource selection function gradient developed in Farrell et al. (2013) and estimated density on an ~3.14 ha hexagonal grid across our study area, which provided a density gradient for Fort Hood. Because there was considerable data manipulation needed for each analysis to structure the data appropriately for use in mrds, rather than wrap each analysis in a single function we have provided both the Golden-cheeked warbler and Black-capped vireo analyses in their full detail. The primary differences you will see will be changes to model structures and model outputs between the two species.
Bret Collier and Jeff Laake
Farrell, S.F., B.A. Collier, K.L. Skow, A.M. Long, A.J. Campomizzi, M.L. Morrison, B. Hays, and R.N. Wilkins. 2013. Using LiDAR-derived structural vegetation characteristics to develop high-resolution, small-scale, species distribution models for conservation planning. Ecosphere 4(3): 42. http://dx.doi.org/10.1890/ES12-00352.1
Laake, J.L., B.A. Collier, M.L. Morrison, and R.N. Wilkins. 2011. Point-based mark recapture distance sampling. Journal of Agricultural, Biological and Environmental Statistics 16: 389-408.
Collier, B.A., S.L. Farrell, K.L. Skow, A.M. Long, A.J. Campomizzi, K.B. Hays, J.L. Laake, M.L. Morrison, and R.N. Wilkins. 2013. Spatially explicit density of endangered avian species in a disturbed landscape. Auk, In Review.
## Not run: data(lfgcwa) xy <- cut(lfgcwa$Pred, c(-0.0001, .1, .2, .3, .4, .5, .6, .7, .8, .9, 1), labels=c("1", "2", "3", "4", "5", "6", "7", "8", "9", "10")) x <- data.frame(lfgcwa, New=xy) # Note that I scaled the individual covariate of day-helps with # convergence issues bird.data <- data.frame(object=x$ObjectID, observer=x$Observer, detected=x$Detected, distance=x$Distance, Region.Label=x$New, Sample.Label=x$PointID, Day=(x$Day/max(x$Day))) # make observer a factor variable bird.data$observer=factor(bird.data$observer) # Jeff Laake suggested this snippet to quickly create distance medians # which adds bin information to the \code{bird.data} dataframe bird.data$distbegin=0 bird.data$distend=100 bird.data$distend[bird.data$distance==12.5]=50 bird.data$distbegin[bird.data$distance==37.5]=0 bird.data$distend[bird.data$distance==37.5]=50 bird.data$distbegin[bird.data$distance==62.5]=50 bird.data$distend[bird.data$distance==62.5]=100 bird.data$distbegin[bird.data$distance==87.5]=50 bird.data$distend[bird.data$distance==87.5]=100 # Removed all survey points with distance=NA for a survey event; # hence no observations for use in \code{ddf()} but needed later bird.data=bird.data[complete.cases(bird.data),] # Manipulations on full dataset for various data.frame creation # for use in density estimation using \code{dht()} # Samples dataframe xx <- x x <- data.frame(PointID=x$PointID, Species=x$Species, Category=x$New, Effort=x$Effort) x <- x[!duplicated(x$PointID),] point.num <- table(x$Category) samples <- data.frame(PointID=x$PointID, Region.Label=x$Category, Effort=x$Effort) final.samples=data.frame(Sample.Label=samples$PointID, Region.Label=samples$Region.Label, Effort=samples$Effort) # obs dataframe obs <- data.frame(ObjectID=xx$ObjectID, PointID=xx$PointID) # used to get Region and Sample assigned to ObjectID obs <- merge(obs, samples, by=c("PointID", "PointID")) obs <- obs[!duplicated(obs$ObjectID),] obs <- data.frame(object=obs$ObjectID, Region.Label=obs$Region.Label, Sample.Label=obs$PointID) #Region.Label dataframe region.data <- data.frame(Region.Label=c(1,2,3,4,5,6,7,8,9), Area=c(point.num[1]*3.14, point.num[2]*3.14, point.num[3]*3.14, point.num[4]*3.14, point.num[5]*3.14, point.num[6]*3.14, point.num[7]*3.14, point.num[8]*3.14, point.num[9]*3.14)) # Candidate Models GW1=ddf( dsmodel=~cds(key="unif", adj.series="cos", adj.order=1,adj.scale="width"), mrmodel=~glm(~distance), data=bird.data, method="io", meta.data=list(binned=TRUE,point=TRUE,width=100,breaks=c(0,50,100))) GW2=ddf( dsmodel=~cds(key="unif", adj.series="cos", adj.order=1,adj.scale="width"), mrmodel=~glm(~distance+observer), data=bird.data, method="io", meta.data=list(binned=TRUE,point=TRUE,width=100,breaks=c(0,50,100))) GW3=ddf( dsmodel=~cds(key="unif", adj.series="cos", adj.order=1,adj.scale="width"), mrmodel=~glm(~distance*observer), data=bird.data, method="io", meta.data=list(binned=TRUE,point=TRUE,width=100,breaks=c(0,50,100))) GW4=ddf( dsmodel=~mcds(key="hn",formula=~1), mrmodel=~glm(~distance), data=bird.data, method="io", meta.data=list(binned=TRUE,point=TRUE,width=100,breaks=c(0,50,100))) GW4FI=ddf( dsmodel=~mcds(key="hn",formula=~1), mrmodel=~glm(~distance), data=bird.data, method="io.fi", meta.data=list(binned=TRUE,point=TRUE,width=100,breaks=c(0,50,100))) GW5=ddf( dsmodel=~mcds(key="hn",formula=~1), mrmodel=~glm(~distance+observer), data=bird.data, method="io", meta.data=list(binned=TRUE, point=TRUE, width=100,breaks=c(0,50,100))) GW5FI=ddf( dsmodel=~mcds(key="hn",formula=~1), mrmodel=~glm(~distance+observer), 
data=bird.data, method="io.fi", meta.data=list(binned=TRUE, point=TRUE, width=100,breaks=c(0,50,100))) GW6=ddf( dsmodel=~mcds(key="hn",formula=~1), mrmodel=~glm(~distance*observer), data=bird.data, method="io", meta.data=list(binned=TRUE, point=TRUE, width=100,breaks=c(0,50,100))) GW6FI=ddf( dsmodel=~mcds(key="hn",formula=~1), mrmodel=~glm(~distance*observer), data=bird.data, method="io.fi", meta.data=list(binned=TRUE, point=TRUE, width=100,breaks=c(0,50,100))) GW7=ddf( dsmodel=~cds(key="hn",formula=~1), mrmodel=~glm(~distance*Day), data=bird.data, method="io", meta.data=list(binned=TRUE, point=TRUE, width=100,breaks=c(0,50,100))) GW7FI=ddf( dsmodel=~cds(key="hn",formula=~1), mrmodel=~glm(~distance*Day), data=bird.data, method="io.fi", meta.data=list(binned=TRUE, point=TRUE, width=100,breaks=c(0,50,100))) GW8=ddf( dsmodel=~mcds(key="hn",formula=~1), mrmodel=~glm(~distance*observer*Day), data=bird.data, method="io", meta.data=list(binned=TRUE, point=TRUE, width=100,breaks=c(0,50,100))) GW8FI=ddf( dsmodel=~mcds(key="hn",formula=~1), mrmodel=~glm(~distance*observer*Day), data=bird.data, method="io.fi", meta.data=list(binned=TRUE, point=TRUE, width=100,breaks=c(0,50,100))) #GWDS=ddf( # dsmodel=~mcds(key="hn",formula=~1), # data=bird.data, # method="ds", # meta.data=list(binned=TRUE, point=TRUE, width=100,breaks=c(0,50,100))) #### GCWA Summary Metrics #AIC table building code, not exactly elegant, but I did not want to add more package dependencies AIC = c(GW1$criterion, GW2$criterion, GW3$criterion, GW4$criterion, GW4FI$criterion, GW5$criterion, GW5FI$criterion, GW6$criterion, GW6FI$criterion, GW7$criterion, GW7FI$criterion, GW8$criterion, GW8FI$criterion) #creates a set of row names for me to check my grep() call below rn <- c("GW1", "GW2", "GW3", "GW4", "GW4FI", "GW5", "GW5FI", "GW6", "GW6FI", "GW7","GW7FI", "GW8", "GW8FI") # number of parameters for each model k <- c(length(GW1$par), length(GW2$par), length(GW3$par), length(GW4$par), length(GW4FI$par), length(GW5$par), length(GW5FI$par), length(GW6$par), length(GW6FI$par), length(GW7$par), length(GW7FI$par), length(GW8$par), length(GW8FI$par)) # build AIC table and AIC.table <- data.frame(AIC = AIC, rn=rn, k=k, dAIC = abs(min(AIC)-AIC), likg = exp(-.5*(abs(min(AIC)-AIC)))) # row.names(AIC.table)=grep("GW", ls(), value=TRUE) AIC.table <- AIC.table[with(AIC.table, order(-likg, -dAIC, AIC, k)),] AIC.table <- data.frame(AIC.table, wi=AIC.table$likg/sum(AIC.table$likg)) AIC.table # Model average N_hat_covered estimates # not very clean, but I wanted to show full process, need to use # collect.models and model.table here estimate <- c(GW1$Nhat, GW2$Nhat, GW3$Nhat, GW4$Nhat, GW4FI$Nhat, GW5$Nhat, GW5FI$Nhat, GW6$Nhat, GW6FI$Nhat, GW7$Nhat, GW7FI$Nhat, GW8$Nhat, GW8FI$Nhat) AIC.values <- AIC # Nhat.se is calculated in mrds:::summary.io, not in ddf(), so # it takes a bit to pull out std.err <- c(summary(GW1)$Nhat.se, summary(GW2)$Nhat.se, summary(GW3)$Nhat.se,summary(GW4)$Nhat.se, summary(GW4FI)$Nhat.se, summary(GW5)$Nhat.se, summary(GW5FI)$Nhat.se, summary(GW6)$Nhat.se, summary(GW6FI)$Nhat.se, summary(GW7)$Nhat.se, summary(GW7FI)$Nhat.se,summary(GW8)$Nhat.se, summary(GW8FI)$Nhat.se) ## End(Not run) ## Not run: #Not Run #requires RMark library(RMark) #uses model.average structure to model average real abundance estimates for #covered area of the surveys mmi.list=list(estimate=estimate, AIC=AIC.values, se=std.err) model.average(mmi.list, revised=TRUE) #Not Run #Best Model FI #best.modelFI=AIC.table[1,] #best.model #Best Model PI 
#best.modelPI=AIC.table[2,] #best.modelPI #Not Run #summary(GW7FI, se=TRUE) #summary(GW7, se=TRUE) #Not Run #GOF for models #ddf.gof(GW7, breaks=c(0,50,100)) #Not Run #Density estimation across occupancy categories #out.GW=dht(GW7, region.data, final.samples, obs, se=TRUE, options=list(convert.units=.01)) #Plots--Not Run #Composite Detection Function examples #plot(GW7, which=3, showpoints=FALSE, angle=0, density=0, # col="black", lwd=3, main="Golden-cheeked Warbler", # xlab="Distance (m)", las=1, cex.axis=1.25, cex.lab=1.25) #Conditional Detection Function #dd=expand.grid(distance=0:100,Day=(4:82)/82) #dmat=model.matrix(~distance*Day,dd) #dd$p=plogis(model.matrix(~distance*Day,dd)%*%coef(GW7$mr)$estimate) #dd$Day=dd$Day*82 #with(dd[dd$Day==12,],plot(distance,p,ylim=c(0,1), las=1, # ylab="Detection probability", xlab="Distance (m)", # type="l",lty=1, lwd=3, bty="l", cex.axis=1.5, cex.lab=1.5)) #with(dd[dd$Day==65,],lines(distance,p,lty=2, lwd=3)) #ch=paste(bird.data$detected[bird.data$observer==1], # bird.data$detected[bird.data$observer==2], # sep="") #tab=table(ch,cut(82*bird.data$Day[bird.data$observer==1],c(0,45,83)), # cut(bird.data$distance[bird.data$observer==1],c(0,50,100))) #tabmat=cbind(colMeans(rbind(tab[3,,1]/colSums(tab[2:3,,1], # tab[3,,1]/colSums(tab[c(1,3),,1])))), # colMeans(rbind(tab[3,,2]/colSums(tab[2:3,,2], # tab[3,,2]/colSums(tab[c(1,3),,2]))))) #colnames(tabmat)=c("0-50","51-100") #points(c(25,75),tabmat[1,],pch=1, cex=1.5) #points(c(25,75),tabmat[2,],pch=2, cex=1.5) # Another alternative plot using barplot instead of points # (this is one in paper) #ch=paste(bird.data$detected[bird.data$observer==1], # bird.data$detected[bird.data$observer==2], #sep="") #tab=table(ch,cut(82*bird.data$Day[bird.data$observer==1],c(0,45,83)), # cut(bird.data$distance[bird.data$observer==1],c(0,50,100))) #tabmat=cbind(colMeans(rbind(tab[3,,1]/colSums(tab[2:3,,1], # tab[3,,1]/colSums(tab[c(1,3),,1])))), #colMeans(rbind(tab[3,,2]/colSums(tab[2:3,,2], # tab[3,,2]/colSums(tab[c(1,3),,2]))))) #colnames(tabmat)=c("0-50","51-100") #par(mfrow=c(2, 1), mai=c(1,1,1,1)) #with(dd[dd$Day==12,], # plot(distance,p,ylim=c(0,1), las=1, # ylab="Detection probability", xlab="", # type="l",lty=1, lwd=4, bty="l", cex.axis=1.5, cex.lab=1.5)) #segments(0, 0, .0, tabmat[1,1], lwd=3) #segments(0, tabmat[1,1], 50, tabmat[1,1], lwd=4) #segments(50, tabmat[1,1], 50, 0, lwd=4) #segments(50, tabmat[1,2], 100, tabmat[1,2], lwd=4) #segments(0, tabmat[1,1], 50, tabmat[1,1], lwd=4) #segments(100, tabmat[1,2], 100, 0, lwd=4) #mtext("a",line=-1, at=90) #with(dd[dd$Day==65,], # plot(distance,p,ylim=c(0,1), las=1, ylab="Detection probability", # xlab="Distance", type="l",lty=1, # lwd=4, bty="l", cex.axis=1.5, cex.lab=1.5)) #segments(0, 0, .0, tabmat[2,1], lwd=4) #segments(0, tabmat[2,1], 50, tabmat[2,1], lwd=4) #segments(50, tabmat[2,1], 50, 0, lwd=4) #segments(50, tabmat[2,2], 50, tabmat[2,1], lwd=4) #segments(50, tabmat[2,2], 100, tabmat[2,2], lwd=4) #segments(100, tabmat[2,2], 100, 0, lwd=4) #mtext("b",line=-1, at=90) ## End(Not run)
Treats the logistic as a function of covariates; for a given covariate combination it computes the function with those covariate values at a range of distances.
logisticbyx(distance, x, models, beta, point)
distance |
vector of distance values |
x |
covariate data |
models |
model list |
beta |
logistic parameters |
point |
|
vector of probabilities
Jeff Laake
Treats the logistic as a function of distance; for a given distance it computes the function at all covariate values in the data.
logisticbyz(x, distance, models, beta)
x |
covariate data |
distance |
single distance value |
models |
model list |
beta |
logistic parameters |
vector of probabilities
Jeff Laake
Logistic detection function
logisticdetfct(distance, theta, w, std = FALSE)
distance |
perpendicular distance vector |
theta |
scale parameters |
w |
scale covariate matrix |
std |
if TRUE uses scale=1 |

The routine returns a vector of probabilities that the observations were detected given they were at the specified distance, assuming that g(0)=1 (i.e. a standard line transect detection function).
Treats the logistic for duplicates as a function of covariate z; for a given z it computes the function with those covariate values at a range of distances.
logisticdupbyx(distance, x1, x2, models, beta, point)
distance |
vector of distance values |
x1 |
covariate data for fct 1 |
x2 |
covariate data for fct 2 |
models |
model list |
beta |
logistic parameters |
point |
|
vector of probabilities
Jeff Laake
As logisticdupbyx, but faster when distance is a covariate (and no interactions with distance occur).
logisticdupbyx_fast(distance, x1, x2, models, beta, point, beta_distance)
distance |
vector of distance values |
x1 |
linear predictor for 1, without distance |
x2 |
linear predictor for 2, without distance |
models |
model list |
beta |
logistic parameters |
point |
|
beta_distance |
parameter for distance |
David L Miller
Computes logit transformation.
logit(p)
p |
probability |
logit(p) returns log(p/(1-p))
Jeff Laake
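As a quick illustration (not part of the package examples), the transformation and its inverse can be checked in base R; plogis() is the standard inverse-logit:

## Not run: 
p <- 0.3
log(p/(1 - p))          # same quantity as logit(0.3)
plogis(log(p/(1 - p)))  # plogis() inverts the logit, recovering 0.3
## End(Not run)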
Extract the log-likelihood from a fitted detection function.
## S3 method for class 'ddf' logLik(object, ...)
object |
a fitted detection function model object |
... |
included for S3 completeness, but ignored |
a numeric value giving the log-likelihood, with two attributes: "df", the "degrees of freedom" for the model (number of parameters), and "nobs", the number of observations used to fit the model
David L Miller
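For example (a minimal sketch, assuming result is a detection function fitted with ddf as in the tee-data examples elsewhere in this document), the two attributes can be used to compute AIC by hand:

## Not run: 
ll <- logLik(result)
as.numeric(ll)                        # the log-likelihood value
attr(ll, "df")                        # number of parameters
-2*as.numeric(ll) + 2*attr(ll, "df")  # AIC computed by hand
## End(Not run)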
Creates model formula list for multiple covariate distance sampling using values supplied in the call to ddf
mcds( formula = NULL, key = NULL, adj.series = NULL, adj.order = c(NULL), adj.scale = "width", adj.exp = FALSE, shape.formula = ~1 )
formula |
formula for scale function |
key |
string identifying key function (currently either "hn" (half-normal), "hr" (hazard-rate), "unif" (uniform) or "gamma" (gamma distribution)) |
adj.series |
string identifying adjustment functions cos (Cosine), herm (Hermite polynomials), poly (simple polynomials) or NULL |
adj.order |
vector of order of adjustment terms to include |
adj.scale |
whether to scale the adjustment terms by "width" or "scale" |
adj.exp |
if TRUE uses exp(adj) for adjustment to keep f(x)>0 |
shape.formula |
formula for shape function |
A formula list used to define the detection function model
fct |
string "mcds" |
key |
key function string |
adj.series |
adjustment function string |
adj.order |
adjustment function orders |
adj.scale |
adjustment function scale type |
formula |
formula for scale function |
shape.formula |
formula for shape function |
Jeff Laake; Dave Miller
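A minimal sketch (using the tee data as in other examples in this document): a hazard-rate key with the scale varying by sex and no adjustment terms, passed as the dsmodel argument of ddf:

## Not run: 
data(book.tee.data)
egdata <- book.tee.data$book.tee.dataframe
result <- ddf(dsmodel=~mcds(key="hr", formula=~sex),
              data=egdata[egdata$observer==1, ], method="ds",
              meta.data=list(width=4))
## End(Not run)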
Rather than use the R-based detection function fitting algorithms provided in 'mrds', one can also use the algorithm used by Distance for Windows, implemented in the binary file 'MCDS.exe'. Note that with changes in the R-based optimizer introduced in 'mrds' version 3.0.0 this is unlikely to result in better estimates. The option remains available, although it may be deprecated in a future release. To make use of this facility, one must first download the 'MCDS.exe' binary, as laid out below under 'Obtaining MCDS.exe'. Once the binary is installed, calls to 'ddf' will, by default, result in the model being fitted using both 'MCDS.exe' and the R-based algorithm, with the fit that has the lower negative log-likelihood being selected. In almost all cases, both algorithms produce the same results, but we have found edge cases where one or the other fails to find the likelihood maximum, hence trying both is useful.
There may also be cases where the 'MCDS.exe' algorithm is faster than the R-based one. Under this circumstance, you can choose to run only the 'MCDS.exe' algorithm by setting the 'ddf' argument control=list(optimizer='MCDS'). For completeness, one can also choose to use only the R-based algorithm by setting control=list(optimizer='R').
For more information and examples comparing the R-based and 'MCDS.exe' algorithms, see our examples pages at https://examples.distancesampling.org/
If you are running a non-Windows operating system, you can follow the instructions below to have 'MCDS.exe' run using 'wine'.
The following code can be used to download 'MCDS.exe' from the distance sampling website:
download.file("http://distancesampling.org/R/MCDS.exe", paste0(system.file(package="mrds"),"/MCDS.exe"), mode = "wb")
The MCDS binary will be installed to the main directory of your local R mrds library. Alternatively, you can copy the 'MCDS.exe' from your local Distance for Windows installation if you prefer. The location of your local mrds library main directory can be found by running the following in R: system.file("MCDS.exe", package="mrds").
One can still use MCDS.exe even if you are running a Mac. To do this one will need to install 'wine', a Windows emulator; it is important to use a version of 'wine' which can run 32-bit programs. This has been tentatively tested on a Mac but should currently be considered largely experimental.
The package will attempt to work out which 'wine' binary to use (and detect if it is installed), but this doesn't always work. In this case, the location of the 'wine' binary can be specified in the 'control' list provided to 'ddf' using the 'winebin' element, or by supplying the 'winebin' argument to the 'ds' function. For example, if 'wine' is installed at '/usr/bin/local/wine' you can set 'control$winebin' to that location to use that binary.
On macOS, this can be achieved using the 'homebrew' package management system and installing the 'wine-crossover' package. You may need to change the control$winebin to be 'wine', 'wine64' or 'wine32on64', depending on your system's setup. This package tries to work out what to do, but likely doesn't handle all corner cases. Currently this is untested on Mac M1 systems.
Once this feature is enabled, using 'ddf' will by default run both its built-in R optimizer and 'MCDS.exe'. To disable this behaviour, specify which you wish to use via the optimizer= option described above. Alternatively, if you wish to permanently stop using MCDS.exe, remove the 'MCDS.exe' binary file. You can find which folder it is in by running the following in R: system.file("MCDS.exe", package="mrds").
David L Miller and Jonah McArthur
Occasionally when fitting an 'mrds' model one can run into optimisation issues. In general such problems can be quite complex so these "quick fixes" may not work. If you come up against problems that are not fixed by these tips, or you feel the results are dubious, please go ahead and contact the package authors.
One can obtain debug output at each stage of the optimisation using the showit option. This is set via control, so adding control=list(showit=3) gives the highest level of debug output (setting showit to 1 or 2 gives less output).
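A minimal sketch (tee data assumed loaded as in other examples in this document):

## Not run: 
result <- ddf(dsmodel=~mcds(key="hn", formula=~sex),
              data=egdata[egdata$observer==1, ], method="ds",
              meta.data=list(width=4),
              control=list(showit=3))
## End(Not run)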
Sometimes convergence issues in covariate (MCDS) models are caused by values of the covariate being very large, so a rescaling of that covariate is then necessary. Simply scaling by the standard deviation of the covariate can help (e.g. dat$size.scaled <- dat$size/sd(dat$size) for a covariate size, then including size.scaled in the model instead of size).
It is important to note that one needs to use the original covariate (size) when computing Horvitz-Thompson estimates of population size if the group size is used in that estimate, i.e. use the unscaled size in the numerator of the H-T estimator, as in the sketch below.
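A minimal sketch, assuming a data.frame dat with a group size column size:

## Not run: 
# rescale the covariate for fitting
dat$size.scaled <- dat$size/sd(dat$size)
# include size.scaled in the detection function formula when fitting, but
# keep the original unscaled size for the numerator of the H-T estimator
## End(Not run)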
By default R will set the base factor level to be the label which comes first alphabetically. Sometimes this can be an issue when that factor level corresponds to a subset of the data with very few observations. This can lead to very large uncertainty estimates (CVs) for model parameters. One way around this is to use relevel to set the base level to a level with more observations.
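A minimal sketch, assuming a data.frame dat with a factor covariate habitat whose alphabetically-first level is rarely observed:

## Not run: 
table(dat$habitat)                               # check the level frequencies
dat$habitat <- relevel(dat$habitat, ref="open")  # well-sampled baseline level
## End(Not run)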
Initial (or starting) values for the dsmodel can be set via the initial element of the control list. initial is itself a list with elements scale, shape and adjustment, corresponding to the associated parameters. If a model has covariates then the scale or shape elements will be vectors with parameter initial values in the same order as they are specified in the model formula (using showit is a good check that they are in the correct order). Adjustment starting values are in order of the order of that term (cosine order 2 terms are before cosine order 3 terms).
One way of obtaining starting values is to fit a simpler model first (say with fewer covariates or adjustments) and then use the starting values from this simpler model for the corresponding parameters, as in the sketch below.
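A minimal sketch of this approach (tee data assumed loaded as in other examples in this document; the adjustment starting value of 0 is illustrative):

## Not run: 
# fit a half-normal with no adjustments first
fit0 <- ddf(dsmodel=~mcds(key="hn", formula=~1),
            data=egdata[egdata$observer==1, ], method="ds",
            meta.data=list(width=4))
# reuse its scale parameter when adding a cosine adjustment
fit1 <- ddf(dsmodel=~mcds(key="hn", formula=~1,
                          adj.series="cos", adj.order=2),
            data=egdata[egdata$observer==1, ], method="ds",
            meta.data=list(width=4),
            control=list(initial=list(scale=fit0$par[1], adjustment=0)))
## End(Not run)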
Another alternative to obtain starting values is to fit the model (or some submodel) using Distance for Windows. Note that Distance reports the scale parameter (or intercept in a covariate model) on the exponential scale, so one must log this before supplying it to ddf.
One can change the upper and lower bounds for the dsmodel parameters. These specify the largest and smallest values individual parameters can be. By placing these constraints on the parameters, it is possible to "temper" the optimisation problem, making fitting possible.
Again, one uses the control list, this time the elements upperbounds and lowerbounds. In this case, each of upperbounds and lowerbounds is a vector, which one can think of as the vectors shape, scale and adjustment from the "Initial values" section above, concatenated in that order. If one does not occur (e.g. no shape parameter) then it is simply omitted from the vector.
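A minimal sketch with illustrative bounds: for a half-normal key with one cosine adjustment and no covariates the parameter vector is (scale intercept, adjustment coefficient), so each bounds vector has length 2 (tee data assumed loaded as in other examples):

## Not run: 
result <- ddf(dsmodel=~mcds(key="hn", formula=~1,
                            adj.series="cos", adj.order=2),
              data=egdata[egdata$observer==1, ], method="ds",
              meta.data=list(width=4),
              control=list(lowerbounds=c(-5, -1),
                           upperbounds=c(5, 1)))
## End(Not run)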
The key function plus adjustment approach of Conventional Distance Sampling (CDS) can sometimes run into issues because it is sensible to constrain the fitted detection function to be monotonic non-increasing (i.e., flat or going down) with increasing distance; finding the maximum of the constrained likelihood is more difficult than the same task without constraints.
There are several options within the 'ddf' control argument that may help if difficulties are encountered. These are documented in the ddf manual page, and a few are mentioned below.
One potential strategy (as mentioned above) is to use better starting values for the optimization. If mono.startvals is set to TRUE then the detection function is first fitted without adjustments and the resulting scale (and shape) estimates are used as starting values in the model with adjustments, as in the sketch below. For even finer control, the initial option can be used as documented above.
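A minimal sketch (tee data assumed loaded as in other examples in this document; mono and mono.strict request the monotonicity constraint via meta.data):

## Not run: 
result <- ddf(dsmodel=~mcds(key="hn", formula=~1,
                            adj.series="cos", adj.order=2),
              data=egdata[egdata$observer==1, ], method="ds",
              meta.data=list(width=4, mono=TRUE, mono.strict=TRUE),
              control=list(mono.startvals=TRUE))
## End(Not run)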
Another potential thing to change is the constraint solver used. From 'mrds' v3.0.0 a new constraint solver, 'slsqp', has been included as the default. This was found to work better than the solver previously used ('solnp'), but if needed the previous solver can be specified using the mono.method option of the control argument of 'ddf'.
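A minimal sketch of switching back to the previous solver (tee data assumed loaded as in other examples):

## Not run: 
result <- ddf(dsmodel=~mcds(key="hn", formula=~1,
                            adj.series="cos", adj.order=2),
              data=egdata[egdata$observer==1, ], method="ds",
              meta.data=list(width=4, mono=TRUE),
              control=list(mono.method="solnp"))
## End(Not run)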
It is also possible to use the optimizer implemented in Distance for Windows by downloading a separate binary; see the manual page on mcds_dot_exe. If specified, this will also be used for Multiple Covariate Distance Sampling (MCDS) analyses.
David L. Miller <[email protected]>
Generic function that computes abundance within the covered region. It calls method (class) specific functions for the computation.
NCovered(par, model = NULL, group = TRUE)
NCovered(par, model = NULL, group = TRUE)
par |
parameter values (used when computing derivatives wrt parameter uncertainty); if NULL, the parameter values in model are used |
model |
ddf model object |
group |
if TRUE computes group abundance and if FALSE individual abundance |
abundance estimate
Jeff Laake
This is a wrapper around nlminb to use scaling, as this is not available in optimx.
nlminb_wrapper( par, ll, ugr = NULL, lower = NULL, upper = NULL, mcontrol, hess = NULL, ddfobj, data, ... )
par |
starting parameters |
ll |
log likelihood function |
ugr |
gradient function |
lower |
lower bounds on parameters |
upper |
upper bounds on parameters |
mcontrol |
control options |
hess |
hessian function |
ddfobj |
detection function specification object |
data |
the data |
... |
anything else to pass to |
optimx object
David L Miller, modified from optimx.run by JC Nash, R Varadhan, G Grothendieck.
Computes detection probability for detection function computed from mark-recapture data with possibly different link functions.
p.det(dpformula, dplink, dppars, dpdata)
dpformula |
formula for detection function |
dplink |
link function ("logit","loglog","cloglog") |
dppars |
parameter vector |
dpdata |
double platform data |
vector of predicted detection probabilities
?????
Generate a table of frequencies of probability of detection from a detection function model. This is particularly useful when employing covariates, as it can indicate if there are detections with very small detection probabilities that can be unduly influential when calculating abundance estimates.
p.dist.table(object, bins = seq(0, 1, by = 0.1), proportion = FALSE) p_dist_table(object, bins = seq(0, 1, by = 0.1), proportion = FALSE)
object |
fitted detection function |
bins |
how the results should be binned |
proportion |
should proportions be returned as well as counts? |
Because dht uses a Horvitz-Thompson-like estimator, abundance estimates can be sensitive to errors in the estimated probabilities. The estimator is based on a sum of terms of the form 1/Phat_a(z_i), which means that the sensitivity is greater for smaller detection probabilities. As a rough guide, we recommend that the method not be used if more than, say, 5% of the Phat_a(z_i) are less than 0.2, or if any are less than 0.1. If these conditions are violated, the truncation distance w can be reduced. This causes some loss of precision relative to standard distance sampling without covariates.
a data.frame with probability bins, counts and (optionally) proportions. The object has an attribute p_range which contains the range of estimated detection probabilities.
David L Miller
Marques, F.F.C. and S.T. Buckland. 2004. Covariate models for the detection function. In: Advanced Distance Sampling, eds. S.T. Buckland, D.R. Anderson, K.P. Burnham, J.L. Laake, D.L. Borchers, and L. Thomas. Oxford University Press.
## Not run: # try out the tee data data(book.tee.data) egdata <- book.tee.data$book.tee.dataframe # fit model with covariates result <- ddf(dsmodel = ~mcds(key = "hn", formula = ~sex+size), data = egdata[egdata$observer==1, ], method = "ds", meta.data = list(width = 4)) # print table p.dist.table(result) # with proportions p.dist.table(result, proportion=TRUE) ## End(Not run)
Take the resulting object from a call to optimx and make it into an object that mrds wants to talk to.
parse.optimx(lt, lnl.last, par.last)
lt |
an optimx object |
lnl.last |
last value of the log likelihood |
par.last |
last value of the parameters |
lt object that can be used later on
Computes the probability that an object was detected by at least one observer (pdot or p_.) for a logistic detection function that contains distance.
pdot.dsr.integrate.logistic( right, width, beta, x, integral.numeric, BT, models, GAM = FALSE, rem = FALSE, point = FALSE )
right |
either an integration range for binned data (vector of 2) or the rightmost value for integration (from 0 to right) |
width |
transect width |
beta |
parameters of logistic detection function |
x |
data matrix |
integral.numeric |
set to |
BT |
|
models |
list of models including |
GAM |
Not used at present. The idea was to be able to use a GAM for g(0) portion of detection function; should always be F |
rem |
only |
point |
|
Jeff Laake
Plot proportion of observations detected within distance intervals (for conditional detection functions) to compare visually the fitted model and data. Internal function called by plot methods.
plot_cond( obs, xmat, gxvalues, model, nc, breaks, finebr, showpoints, showlines, maintitle, ylim, angle = -45, density = 20, col = "black", jitter = NULL, xlab = "Distance", ylab = "Detection probability", subtitle = TRUE, ... )
obs |
observer code |
xmat |
processed data |
gxvalues |
detection function values for each observation |
model |
fitted model from ddf |
nc |
number of equal-width bins for histogram |
breaks |
user defined breakpoints |
finebr |
fine break values over which line is averaged |
showpoints |
logical variable; if TRUE plots predicted value for each observation |
showlines |
logical variable; if TRUE plots average predicted value line |
maintitle |
main title line for each plot |
ylim |
range of y axis (default c(0, 1)) |
angle |
shading angle for hatching |
density |
shading density for hatching |
col |
plotting colour |
jitter |
scaling option for plotting points. Jitter is applied to points by multiplying the fitted value by a random draw from a normal distribution with mean 1 and sd jitter. |
xlab |
label for x-axis |
ylab |
label for y-axis |
subtitle |
if TRUE, shows plot type as sub-title |
... |
other graphical parameters, passed to the plotting functions (plot, hist, lines, points, etc.) |
Jeff Laake, Jon Bishop, David Borchers
This function does the paging, using devAskNewPage(). This means we can just call plots and R will make the prompt for us. Warning: this function has side effects, it modifies devAskNewPage!
plot_layout(which, pages)
which |
which plots are to be created |
pages |
number of pages to span the plots across |
Code is stolen and modified from plot.R in mgcv by Simon Wood
Plots the unconditional detection function for observer=obs observations; it overlays a histogram, the average detection function and values for individual observations. Internal function called by plot methods.
plot_uncond( model, obs, xmat, gxvalues, nc, finebr, breaks, showpoints, showlines, maintitle, ylim, return.lines = FALSE, angle = -45, density = 20, col = "black", jitter = NULL, xlab = "Distance", ylab = "Detection probability", subtitle = TRUE, ... )
model |
fitted model from ddf |
obs |
value of observer for plot |
xmat |
processed data |
gxvalues |
detection function values for each observation |
nc |
number of equal-width bins for histogram |
finebr |
fine break values over which line is averaged |
breaks |
user defined breakpoints |
showpoints |
logical variable; if TRUE plots predicted value for each observation |
showlines |
logical variable; if TRUE plots average predicted value line |
maintitle |
main title line for each plot |
ylim |
range of y axis; defaults to (0,1) |
return.lines |
if TRUE, returns values for line |
angle |
shading angle for hatching |
density |
shading density for hatching |
col |
plotting colour |
jitter |
scaling option for plotting points. Jitter is applied to points by multiplying the fitted value by a random draw from a normal distribution with mean 1 and sd jitter. |
xlab |
label for x-axis |
ylab |
label for y-axis |
subtitle |
if TRUE, shows plot type as sub-title |
... |
other graphical parameters, passed to the plotting functions (plot, hist, lines, points, etc.) |
if return.lines==TRUE returns dataframe average.line, otherwise just plots
Plot the tables created by det.tables. Produces a series of tables for dual observer data that shows the number missed and detected for each observer within defined distance classes.
## S3 method for class 'det.tables' plot( x, which = 1:6, angle = NULL, density = NULL, col1 = "white", col2 = "lightgrey", new = TRUE, ... )
x |
object returned by det.tables |
which |
items in x to plot (vector with values in 1:6) |
angle |
shading angle for hatching |
density |
shading density for hatching |
col1 |
plotting colour for total histogram bars. |
col2 |
plotting colour for subset histogram bars. |
new |
if |
... |
other graphical parameters, passed to plotting functions |
Plots that are produced are as follows (controlled by the which argument):
1. Detected by either observer/Detected by observer 1
2. Detected by either observer/Detected by observer 2
3. Seen by both observers
4. Seen by either observer
5. Detected by observer 2/Detected by observer 1 | 2
6. Detected by observer 1/Detected by observer 2 | 1
Just plots.
Jeff Laake, David L Miller
data(book.tee.data) region <- book.tee.data$book.tee.region egdata <- book.tee.data$book.tee.dataframe samples <- book.tee.data$book.tee.samples obs <- book.tee.data$book.tee.obs xx <- ddf(mrmodel=~glm(formula=~distance*observer), dsmodel = ~mcds(key = "hn", formula = ~sex), data = egdata, method = "io", meta.data = list(width = 4)) tabs <- det.tables(xx,breaks=c(0,.5,1,2,3,4)) par(mfrow=c(2,3)) plot(tabs,which=1:6,new=FALSE)
Plots the fitted detection function(s) with a histogram of the observed distances to compare visually the fitted model and data.
## S3 method for class 'ds' plot( x, which = 2, breaks = NULL, nc = NULL, jitter.v = rep(0, 3), showpoints = TRUE, subset = NULL, pl.col = "lightgrey", pl.den = NULL, pl.ang = NULL, main = NULL, pages = 0, pdf = FALSE, ylim = NULL, xlab = "Distance", ylab = NULL, ... )
x |
fitted model from ddf |
which |
index to specify which plots should be produced:
|
breaks |
user defined breakpoints |
nc |
number of equal width bins for histogram |
jitter.v |
apply jitter to points by multiplying the fitted value by a random draw from a normal distribution with mean 1 and sd given by the corresponding element of jitter.v |
showpoints |
logical variable; if TRUE plots predicted value for each observation |
subset |
subset of data to plot. |
pl.col |
colour for histogram bars. |
pl.den |
shading density for histogram bars. |
pl.ang |
shading angle for histogram bars. |
main |
plot title. |
pages |
the number of pages over which to spread the plots. For example, if pages=1 then all plots will be displayed on one page. |
pdf |
plot the histogram of distances with the PDF of the probability of detection overlaid. Ignored (with warning) for line transect models. |
ylim |
vertical axis limits. |
xlab |
horizontal axis label (defaults to "Distance"). |
ylab |
vertical axis label (default automatically set depending on plot type). |
... |
other graphical parameters, passed to the plotting functions (plot, hist, lines, points, etc.) |
The structure of the histogram can be controlled by the user-defined arguments nc or breaks. The observation specific detection probabilities are plotted along with the line representing the fitted average detection probability.
It is not intended for the user to call plot.ds but its arguments are documented here. Instead the generic plot command should be used and it will call the appropriate function based on the class of the ddf object.
Just plots.
Jeff Laake, Jon Bishop, David Borchers, David L Miller
add_df_covar_line
# fit a model to the tee data data(book.tee.data) egdata <- book.tee.data$book.tee.dataframe xx <- ddf(dsmodel=~mcds(key="hn", formula=~sex), data=egdata[egdata$observer==1, ], method="ds", meta.data=list(width=4)) # not showing predicted probabilities plot(xx, breaks=c(0, 0.5, 1, 2, 3, 4), showpoints=FALSE) # two subsets plot(xx, breaks=c(0, 0.5, 1, 2, 3, 4), subset=sex==0) plot(xx, breaks=c(0, 0.5, 1, 2, 3, 4), subset=sex==1) # put both plots on one page plot(xx, breaks=c(0, 0.5, 1, 2, 3, 4), pages=1, which=1:2)
Plots the fitted detection functions for a distance sampling independent observer (io) model and histograms of the distances (for unconditional detection functions) or proportion of observations detected within distance intervals (for conditional detection functions) to compare visually the fitted model and data.
## S3 method for class 'io' plot( x, which = 1:6, breaks = NULL, nc = NULL, maintitle = "", showlines = TRUE, showpoints = TRUE, ylim = c(0, 1), angle = NULL, density = NULL, col = "lightgrey", jitter = NULL, divisions = 25, pages = 0, xlab = "Distance", ylab = "Detection probability", subtitle = TRUE, ... )
x |
fitted model from ddf |
which |
index to specify which plots should be produced.
Note that the order of which is ignored and plots are produced in the above order. |
breaks |
user defined breakpoints |
nc |
number of equal-width bins for histogram |
maintitle |
main title line for each plot |
showlines |
logical variable; if TRUE a line representing the average detection probability is plotted |
showpoints |
logical variable; if TRUE plots predicted value for each observation |
ylim |
range of vertical axis; defaults to (0,1) |
||||||||||||
angle |
shading angle for histogram bars. |
||||||||||||
density |
shading density for histogram bars. |
||||||||||||
col |
colour for histogram bars. |
||||||||||||
jitter |
scaling option for plotting points. Jitter is applied to points by multiplying the fitted value by a random draw from a normal distribution with mean 1 and sd jitter. |
||||||||||||
divisions |
number of divisions for averaging line values; default = 25 |
||||||||||||
pages |
the number of pages over which to spread the plots. For example, if pages=1 then all plots are displayed on one page. |
||||||||||||
xlab |
label for x-axis |
||||||||||||
ylab |
label for y-axis |
||||||||||||
subtitle |
if TRUE, shows plot type as sub-title |
||||||||||||
... |
other graphical parameters, passed to the plotting functions (plot, hist, lines, points, etc.). |
The structure of the histogram can be controlled by the user-defined arguments nc or breaks. The observation-specific detection probabilities are plotted along with a line representing the fitted average detection probability.

It is not intended for the user to call plot.io directly, but its arguments are documented here. Instead the generic plot command should be used; it will call the appropriate method based on the class of the ddf object.

Just plots.
Jeff Laake, Jon Bishop, David Borchers, David L Miller
library(mrds)
data(book.tee.data)
egdata <- book.tee.data$book.tee.dataframe
result.io <- ddf(dsmodel=~cds(key = "hn"), mrmodel=~glm(~distance),
                 data=egdata, method="io", meta.data=list(width=4))

# just plot everything
plot(result.io)

# Plot primary and secondary unconditional detection functions on one page
# and primary and secondary conditional detection functions on another
plot(result.io, which=c(1,2,5,6), pages=2)
Plot fit of detection functions and histograms of mrds(method="io.fi") model.

Plots the fitted detection functions for a distance sampling model and histograms of the distances (for unconditional detection functions) or proportion of observations detected within distance intervals (for conditional detection functions) to compare visually the fitted model and data.
## S3 method for class 'io.fi'
plot(x, which = 1:6, breaks = NULL, nc = NULL, maintitle = "",
     showlines = TRUE, showpoints = TRUE, ylim = c(0, 1), angle = NULL,
     density = NULL, col = "lightgrey", jitter = NULL, divisions = 25,
     pages = 0, xlab = "Distance", ylab = "Detection probability",
     subtitle = TRUE, ...)
x |
fitted model from ddf |
||||||||||||
which |
index to specify which plots should be produced.
Note that the order of which is ignored and plots are produced in the above order. |
||||||||||||
breaks |
user-defined breakpoints |
||||||||||||
nc |
number of equal-width bins for histogram |
||||||||||||
maintitle |
main title line for each plot |
||||||||||||
showlines |
logical variable; if TRUE a line representing the average detection probability is plotted |
||||||||||||
showpoints |
logical variable; if TRUE plots predicted value for each observation |
||||||||||||
ylim |
range of vertical axis; defaults to (0,1) |
||||||||||||
angle |
shading angle for histogram bars. |
||||||||||||
density |
shading density for histogram bars. |
||||||||||||
col |
colour for histogram bars. |
||||||||||||
jitter |
scaling option for plotting points. Jitter is applied to points by multiplying the fitted value by a random draw from a normal distribution with mean 1 and sd jitter. |
||||||||||||
divisions |
number of divisions for averaging line values; default = 25 |
||||||||||||
pages |
the number of pages over which to spread the plots. For example, if pages=1 then all plots are displayed on one page. |
||||||||||||
xlab |
label for x-axis |
||||||||||||
ylab |
label for y-axis |
||||||||||||
subtitle |
if TRUE, shows plot type as sub-title |
||||||||||||
... |
other graphical parameters, passed to the plotting functions (plot, hist, lines, points, etc.). |
The structure of the histogram can be controlled by the user-defined arguments nc or breaks. The observation-specific detection probabilities are plotted along with a line representing the fitted average detection probability.

It is not intended for the user to call plot.io.fi directly, but its arguments are documented here. Instead the generic plot command should be used; it will call the appropriate method based on the class of the ddf object.

Just plots.
Jeff Laake, Jon Bishop, David Borchers, David L Miller
library(mrds)
data(book.tee.data)
egdata <- book.tee.data$book.tee.dataframe
result.io.fi <- ddf(mrmodel=~glm(~distance), data = egdata,
                    method = "io.fi", meta.data = list(width = 4))

# just plot everything
plot(result.io.fi)

# Plot primary and secondary unconditional detection functions on one page
# and primary and secondary conditional detection functions on another
plot(result.io.fi, which=c(1,2,5,6), pages=2)
Plots the fitted detection functions for a distance sampling model and histograms of the distances (for unconditional detection functions) or proportion of observations detected within distance intervals (for conditional detection functions) to compare visually the fitted model and data.
## S3 method for class 'rem'
plot(x, which = 1:3, breaks = NULL, nc = NULL, maintitle = "",
     showlines = TRUE, showpoints = TRUE, ylim = c(0, 1), angle = NULL,
     density = NULL, col = "lightgrey", jitter = NULL, divisions = 25,
     pages = 0, xlab = "Distance", ylab = "Detection probability",
     subtitle = TRUE, ...)
x |
fitted model from ddf |
||||||
which |
index to specify which plots should be produced.
|
||||||
breaks |
user-defined breakpoints |
||||||
nc |
number of equal-width bins for histogram |
||||||
maintitle |
main title line for each plot |
||||||
showlines |
logical variable; if TRUE a line representing the average detection probability is plotted |
||||||
showpoints |
logical variable; if TRUE plots predicted value for each observation |
||||||
ylim |
range of vertical axis; defaults to (0,1) |
||||||
angle |
shading angle for histogram bars. |
||||||
density |
shading density for histogram bars. |
||||||
col |
colour for histogram bars. |
||||||
jitter |
scaling option for plotting points. Jitter is applied to points by multiplying the fitted value by a random draw from a normal distribution with mean 1 and sd jitter. |
||||||
divisions |
number of divisions for averaging line values; default = 25 |
||||||
pages |
the number of pages over which to spread the plots. For example, if pages=1 then all plots are displayed on one page. |
||||||
xlab |
label for x-axis |
||||||
ylab |
label for y-axis |
||||||
subtitle |
if TRUE, shows plot type as sub-title |
||||||
... |
other graphical parameters, passed to the plotting functions (plot, hist, lines, points, etc.). |
The structure of the histogram can be controlled by the user-defined arguments nc or breaks. The observation-specific detection probabilities are plotted along with a line representing the fitted average detection probability.

It is not intended for the user to call plot.rem directly, but its arguments are documented here. Instead the generic plot command should be used; it will call the appropriate method based on the class of the ddf object.
Jeff Laake, Jon Bishop, David Borchers, David L Miller
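A minimal example sketch (not part of the original examples), fitting a removal configuration model to the golf tee data used elsewhere in this manual; the model choices are illustrative only:

library(mrds)
data(book.tee.data)
egdata <- book.tee.data$book.tee.dataframe
# fit a removal configuration model (method="rem")
result.rem <- ddf(dsmodel=~cds(key="hn"), mrmodel=~glm(~distance),
                  data=egdata, method="rem", meta.data=list(width=4))
# all three plots on a single page
plot(result.rem, which=1:3, pages=1)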
Plots the fitted detection functions for a distance sampling model and histograms of the distances (for unconditional detection functions) or proportion of observations detected within distance intervals (for conditional detection functions) to compare visually the fitted model and data.
## S3 method for class 'rem.fi'
plot(x, which = 1:3, breaks = NULL, nc = NULL, maintitle = "",
     showlines = TRUE, showpoints = TRUE, ylim = c(0, 1), angle = NULL,
     density = NULL, col = "lightgrey", jitter = NULL, divisions = 25,
     pages = 0, xlab = "Distance", ylab = "Detection probability",
     subtitle = TRUE, ...)
x |
fitted model from ddf |
||||||
which |
index to specify which plots should be produced.
|
||||||
breaks |
user-defined breakpoints |
||||||
nc |
number of equal-width bins for histogram |
||||||
maintitle |
main title line for each plot |
||||||
showlines |
logical variable; if TRUE a line representing the average detection probability is plotted |
||||||
showpoints |
logical variable; if TRUE plots predicted value for each observation |
||||||
ylim |
range of vertical axis; defaults to (0,1) |
||||||
angle |
shading angle for histogram bars. |
||||||
density |
shading density for histogram bars. |
||||||
col |
colour for histogram bars. |
||||||
jitter |
scaling option for plotting points. Jitter is applied to points by multiplying the fitted value by a random draw from a normal distribution with mean 1 and sd jitter |
||||||
divisions |
number of divisions for averaging line values; default = 25 |
||||||
pages |
the number of pages over which to spread the plots. For example, if pages=1 then all plots are displayed on one page. |
||||||
xlab |
label for x-axis |
||||||
ylab |
label for y-axis |
||||||
subtitle |
if TRUE, shows plot type as sub-title |
||||||
... |
other graphical parameters, passed to the plotting functions (plot, hist, lines, points, etc.). |
The structure of the histogram can be controlled by the user-defined arguments nc or breaks. The observation-specific detection probabilities are plotted along with a line representing the fitted average detection probability.

It is not intended for the user to call plot.rem.fi directly, but its arguments are documented here. Instead the generic plot command should be used; it will call the appropriate method based on the class of the ddf object.
Jeff Laake, Jon Bishop, David Borchers, David L Miller
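Similarly, a minimal sketch (not part of the original examples) for the full independence removal configuration on the same data:

library(mrds)
data(book.tee.data)
egdata <- book.tee.data$book.tee.dataframe
# "rem.fi" models are specified by the mark-recapture model alone
result.rem.fi <- ddf(mrmodel=~glm(~distance), data=egdata,
                     method="rem.fi", meta.data=list(width=4))
plot(result.rem.fi, which=1:3, pages=1)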
Plots the fitted detection functions for a distance sampling model and histograms of the distances (for unconditional detection functions) or proportion of observations detected within distance intervals (for conditional detection functions) to compare visually the fitted model and data.
## S3 method for class 'trial'
plot(x, which = 1:2, breaks = NULL, nc = NULL, maintitle = "",
     showlines = TRUE, showpoints = TRUE, ylim = c(0, 1), angle = NULL,
     density = NULL, col = "lightgrey", jitter = NULL, divisions = 25,
     pages = 0, xlab = "Distance", ylab = "Detection probability",
     subtitle = TRUE, ...)
x |
fitted model from ddf |
||||
which |
index to specify which plots should be produced.
|
||||
breaks |
user-defined breakpoints |
||||
nc |
number of equal-width bins for histogram |
||||
maintitle |
main title line for each plot |
||||
showlines |
logical variable; if TRUE a line representing the average detection probability is plotted |
||||
showpoints |
logical variable; if TRUE plots predicted value for each observation |
||||
ylim |
range of vertical axis; defaults to (0,1) |
||||
angle |
shading angle for histogram bars. |
||||
density |
shading density for histogram bars. |
||||
col |
colour for histogram bars. |
||||
jitter |
scaling option for plotting points. Jitter is applied to points by multiplying the fitted value by a random draw from a normal distribution with mean 1 and sd jitter. |
||||
divisions |
number of divisions for averaging line values; default = 25 |
||||
pages |
the number of pages over which to spread the plots. For example, if pages=1 then all plots are displayed on one page. |
||||
xlab |
label for x-axis |
||||
ylab |
label for y-axis |
||||
subtitle |
if TRUE, shows plot type as sub-title |
||||
... |
other graphical parameters, passed to the plotting functions (plot, hist, lines, points, etc.). |
The structure of the histogram can be controlled by the user-defined arguments nc or breaks. The observation-specific detection probabilities are plotted along with a line representing the fitted average detection probability.

It is not intended for the user to call plot.trial directly, but its arguments are documented here. Instead the generic plot command should be used; it will call the appropriate method based on the class of the ddf object.
Jeff Laake, Jon Bishop, David Borchers
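A minimal sketch (not part of the original examples), assuming the golf tee data are treated as a trial configuration:

library(mrds)
data(book.tee.data)
egdata <- book.tee.data$book.tee.dataframe
result.trial <- ddf(dsmodel=~cds(key="hn"), mrmodel=~glm(~distance),
                    data=egdata, method="trial", meta.data=list(width=4))
# both plots on one page
plot(result.trial, which=1:2, pages=1)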
Plots the fitted detection functions for a distance sampling model and histograms of the distances (for unconditional detection functions) or proportion of observations detected within distance intervals (for conditional detection functions) to compare visually the fitted model and data.
## S3 method for class 'trial.fi'
plot(x, which = 1:2, breaks = NULL, nc = NULL, maintitle = "",
     showlines = TRUE, showpoints = TRUE, ylim = c(0, 1), angle = NULL,
     density = NULL, col = "lightgrey", jitter = NULL, divisions = 25,
     pages = 0, xlab = "Distance", ylab = "Detection probability",
     subtitle = TRUE, ...)
x |
fitted model from ddf |
||||
which |
index to specify which plots should be produced.
|
||||
breaks |
user-defined breakpoints |
||||
nc |
number of equal-width bins for histogram |
||||
maintitle |
main title line for each plot |
||||
showlines |
logical variable; if TRUE a line representing the average detection probability is plotted |
||||
showpoints |
logical variable; if TRUE plots predicted value for each observation |
||||
ylim |
range of vertical axis; defaults to (0,1) |
||||
angle |
shading angle for histogram bars. |
||||
density |
shading density for histogram bars. |
||||
col |
colour for histogram bars. |
||||
jitter |
scaling option for plotting points. Jitter is applied to points by multiplying the fitted value by a random draw from a normal distribution with mean 1 and sd jitter. |
||||
divisions |
number of divisions for averaging line values; default = 25 |
||||
pages |
the number of pages over which to spread the plots. For example, if pages=1 then all plots are displayed on one page. |
||||
xlab |
label for x-axis |
||||
ylab |
label for y-axis |
||||
subtitle |
if TRUE, shows plot type as sub-title |
||||
... |
other graphical parameters, passed to the plotting functions (plot, hist, lines, points, etc.). |
The structure of the histogram can be controlled by the user-defined arguments nc or breaks. The observation-specific detection probabilities are plotted along with a line representing the fitted average detection probability.

It is not intended for the user to call plot.trial.fi directly, but its arguments are documented here. Instead the generic plot command should be used; it will call the appropriate method based on the class of the ddf object.
Jeff Laake, Jon Bishop, David Borchers
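A minimal sketch (not part of the original examples) for the full independence trial configuration:

library(mrds)
data(book.tee.data)
egdata <- book.tee.data$book.tee.dataframe
result.trial.fi <- ddf(mrmodel=~glm(~distance), data=egdata,
                       method="trial.fi", meta.data=list(width=4))
plot(result.trial.fi, which=1:2, pages=1)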
Predictions from mrds models.

Predict detection probabilities (or effective strip widths/effective areas of detection) from a fitted distance sampling model using either the original data (i.e. "fitted" values) or using new data.
## S3 method for class 'ds'
predict(object, newdata=NULL, compute=FALSE, int.range=NULL, esw=FALSE,
        se.fit=FALSE, ...)

## S3 method for class 'io.fi'
predict(object, newdata=NULL, compute=FALSE, int.range=NULL,
        integrate=FALSE, ...)

## S3 method for class 'io'
predict(object, newdata=NULL, compute=FALSE, int.range=NULL, ...)

## S3 method for class 'trial'
predict(object, newdata=NULL, compute=FALSE, int.range=NULL, ...)

## S3 method for class 'trial.fi'
predict(object, newdata=NULL, compute=FALSE, int.range=NULL,
        integrate=FALSE, ...)

## S3 method for class 'rem'
predict(object, newdata=NULL, compute=FALSE, int.range=NULL, ...)

## S3 method for class 'rem.fi'
predict(object, newdata=NULL, compute=FALSE, int.range=NULL,
        integrate=FALSE, ...)
object |
ddf model object |
newdata |
new dataframe for prediction |
compute |
if TRUE, compute the values rather than using those stored in the model object |
int.range |
integration range for variable range analysis; either vector or 2 column matrix. |
esw |
if TRUE, returns effective strip half-width (or effective area of detection for point transects) rather than detection probability |
se.fit |
for predict.ds: if TRUE, return standard errors of the predicted detection probabilities or ESW |
... |
for S3 consistency |
integrate |
for predict.io.fi, predict.trial.fi and predict.rem.fi: if TRUE, return the average detection probability obtained by integrating over distance; if FALSE, return the conditional detection probability (see Details) |
The first 4 arguments are the same in each predict function. The latter 2 are specific to certain functions. For line transects, the effective strip half-width (esw=TRUE) is the integral of the fitted detection function over either 0 to W or the specified int.range. The predicted detection probability is the average probability, which is simply the integral divided by the distance range. For point transect models, esw=TRUE calculates the effective area of detection (commonly referred to as "nu"); this is the integral of 2/width^2 * r * g(r).

Fitted detection probabilities are stored in the model object and these are returned unless compute=TRUE or newdata is specified. compute=TRUE is used to estimate numerical derivatives for use in delta method approximations to the variance.
For method="io.fi"
or method="trial.fi"
if
integrate=FALSE
, predict
returns the value of the conditional
detection probability and if integrate=TRUE
, it returns the average
conditional detection probability by integrating over x (distance) with
respect to a uniform distribution.
Note that the ordering of the returned results when no new data is supplied (the "fitted" values) will not necessarily be the same as the data supplied to ddf; the data (and hence results from predict) will be sorted by object ID (object) then observer ID (observer).
For all but the exceptions below, the value is a list with a single element: fitted, a vector of average detection probabilities or esw values for each observation in the original data or newdata.
For predict.ds
, if se.fit=TRUE
there is an additional element
$se.fit
, which contains the standard errors of the probabilities of
detection or ESW.
For predict.io.fi
,predict.trial.fi
,predict.rem.fi
with
integrate=TRUE
, the value is a list with one element: fitted
,
which is a vector of integrated (average) detection probabilities for each
observation in the original data or newdata
.
For predict.io.fi
, predict.trial.fi
, or predict.rem.fi
with integrate=FALSE
, the value is a list with the following
elements:
p1, conditional detection probability for observer 1

p2, conditional detection probability for observer 2

fitted, conditional detection probability of being seen by either observer
Each function is called by the generic function predict
for the
appropriate ddf
model object. They can be called directly by the
user, but it is typically safest to use predict
which calls the
appropriate function based on the type of model.
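To make the above concrete, a minimal sketch (not part of the original examples) using the golf tee data; the sex covariate model follows the plot.ds examples:

library(mrds)
data(book.tee.data)
egdata <- book.tee.data$book.tee.dataframe
fit <- ddf(dsmodel=~mcds(key="hn", formula=~sex),
           data=egdata[egdata$observer==1, ], method="ds",
           meta.data=list(width=4))
# fitted average detection probabilities (sorted by object ID)
predict(fit)$fitted
# effective strip half-widths and their standard errors
esw <- predict(fit, esw=TRUE, se.fit=TRUE)
esw$fitted
esw$se.fit
# prediction for a new covariate combination (a distance column is
# included in case the method requires one)
predict(fit, newdata=data.frame(sex=1, distance=0))$fitted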
Jeff Laake, David L Miller
Simply prints out summary of the model which was fitted. For more
detailed information see summary
.
## S3 method for class 'ddf'
print(x, ...)
x |
a ddf model object |
... |
not passed through, just for S3 compatibility. |
David L. Miller
Provides formatted output for results of goodness of fit tests: chi-square, Kolmogorov-Smirnov and Cramer-von Mises tests, as appropriate.
## S3 method for class 'ddf.gof'
print(x, digits = 3, ...)
x |
result of call to ddf.gof |
digits |
number of digits to round chi-squared table values to |
... |
unused unspecified arguments for generic print |
None
Jeff Laake
Provides formatted output for detection tables
## S3 method for class 'det.tables'
print(x, ...)
x |
result of call to ddf |
... |
unused unspecified arguments for generic print |
None
Jeff Laake
Outputs summary statistics, abundance and density by region (if any) and optionally a correlation matrix if more than one region.
## S3 method for class 'dht'
print(x, cor = FALSE, bysample = FALSE, vcmatrices = FALSE, ...)
x |
dht object that results from call to dht for a specific ddf object |
cor |
if TRUE outputs correlation matrix of estimates |
bysample |
if TRUE, prints results for each sample |
vcmatrices |
if TRUE, prints variance-covariance matrices |
... |
unspecified and unused arguments for S3 consistency |
None
Jeff Laake
Just a pretty printer for the table of probabilities of detection.
## S3 method for class 'p_dist_table'
print(x, digits = 2, ...)
x |
output from p.dist.table |
digits |
number of significant digits to print |
... |
other arguments to be passed to print |
just prints the table and the range of ps
David L Miller
Provides a brief summary of data and fitted detection probability model parameters, model selection criterion, and optionally abundance in the covered (sampled) region and its standard error. What is printed depends on the corresponding call to summary.
## S3 method for class 'summary.ds'
print(x, ...)
x |
a summary of a ddf model object |
... |
unspecified and unused arguments for S3 consistency |
Jeff Laake
Provides a brief summary of data and fitted detection probability model parameters, model selection criterion, and optionally abundance in the covered (sampled) region and its standard error. What is printed depends on the corresponding call to summary.
## S3 method for class 'summary.io'
print(x, ...)
x |
a summary of a ddf model object |
... |
unspecified and unused arguments for S3 consistency |
Jeff Laake
Provides a brief summary of data and fitted detection probability model parameters, model selection criterion, and optionally abundance in the covered (sampled) region and its standard error. What is printed depends on the corresponding call to summary.
## S3 method for class 'summary.io.fi'
print(x, ...)
x |
a summary of a ddf model object |
... |
unspecified and unused arguments for S3 consistency |
Jeff Laake
Provides a brief summary of data and fitted detection probability model parameters, model selection criterion, and optionally abundance in the covered (sampled) region and its standard error. What is printed depends on the corresponding call to summary.
## S3 method for class 'summary.rem'
print(x, ...)
x |
a summary of a ddf model object |
... |
unspecified and unused arguments for S3 consistency |
Jeff Laake
Provides a brief summary of data and fitted detection probability model parameters, model selection criterion, and optionally abundance in the covered (sampled) region and its standard error. What is printed depends on the corresponding call to summary.
## S3 method for class 'summary.rem.fi'
print(x, ...)
x |
a summary of a ddf model object |
... |
unspecified and unused arguments for S3 consistency |
Jeff Laake
Provides a brief summary of data and fitted detection probability model parameters, model selection criterion, and optionally abundance in the covered (sampled) region and its standard error. What is printed depends on the corresponding call to summary.
## S3 method for class 'summary.trial'
print(x, ...)
x |
a summary of a ddf model object |
... |
unspecified and unused arguments for S3 consistency |
Jeff Laake
Provides a brief summary of data and fitted detection probability model parameters, model selection criterion, and optionally abundance in the covered (sampled) region and its standard error. What is printed depends on the corresponding call to summary.
## S3 method for class 'summary.trial.fi'
print(x, ...)
x |
a summary of a ddf model object |
... |
unspecified and unused arguments for S3 consistency |
Jeff Laake
Used in call to DeltaMethod from prob.se to get first derivatives
prob.deriv(par, model, parfct, observer = NULL, fittedmodel = NULL)
par |
detection function parameter values |
model |
ddf model object |
parfct |
function of detection probabilities; currently only average (over covariates) detection probability p integrated over distance or average (over covariates) detection probability at distance 0; p(0) |
observer |
1,2,3 for primary, secondary, or duplicates for average p(0); passed to fct |
fittedmodel |
full fitted ddf model when |
Need to add equations here as I do not think they exist in any of the texts. These should probably be checked with simulation.
Vector of values from fct at specified parameter values
Jeff Laake
See also: prob.se
Computes components of variance for average p=n/N and average p(0) with weights based on empirical covariate distribution, if it contains covariates.
prob.se(model, fct, vcov, observer = NULL, fittedmodel = NULL)
model |
ddf model object |
fct |
function of detection probabilities; currently only average (over covariates) detection probability p integrated over distance or average (over covariates) detection probability at distance 0; p(0) |
vcov |
variance-covariance matrix of parameter estimates |
observer |
1,2,3 for primary, secondary, or duplicates for average p(0); passed to fct |
fittedmodel |
full fitted ddf model when |
Need to add equations here as I do not think they exist in any of the texts. These should probably be checked with simulation.
var |
variance |
partial |
partial derivatives of parameters with respect to fct |
covar |
covariance of n and average p or p(0) |
Jeff Laake
See also: prob.deriv
Sets up dataframe and does some basic error checking. Adds needed fields to
dataframe and to meta.data
.
process.data(data, meta.data = list(), check = TRUE)
data |
dataframe object |
meta.data |
meta.data options; see ddf |
check |
if TRUE, check the data for errors |
The function does a number of error checking tasks, creating fields and
adding to meta.data
including:
1) If check=TRUE, check to make sure the record structure is okay for mrds data. The number of primary records (observer=1) must equal the number of secondary records (observer=2). Also, a field timesseen is created in the dataframe, which counts the number of times an object was detected (0, 1 or 2); if timesseen=0 the record is dropped from the analysis. If there are differences in the data (distance, size, covariates) for observers 1 and 2, a warning is issued that the analysis may fail; the code assumes these values are the same for both observers.
2) Based on the presence of fields distbegin and distend, a determination is made of whether the data analysis should be based on binned distances, and a field binned is created, which is TRUE if the distance for the observation is binned. Assigning this for each observation allows an analysis of a mixture of binned and unbinned distances.
3) Data are restricted such that distances are not greater than width and not less than left if those values are specified in meta.data. If they are not specified then left defaults to 0 and width defaults to the largest distance measurement.

4) Determine whether an integration range (int.begin and int.end) has been specified for the observations. If it has, add the structure to meta.data. The integration range is typically used for aerial surveys in which the altitude varies such that the strip width (left to width) changes with a change in altitude.

5) Fields defined as factors are cleaned up such that any unused levels are eliminated.

6) If the restrictions placed on the data eliminate all of the data, the function stops with an error message.
xmat |
processed dataframe |
meta.data |
meta.data list |
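A minimal sketch of calling process.data directly (ddf calls it internally, so this is rarely needed; if the function is not exported it may need to be called as mrds:::process.data):

library(mrds)
data(book.tee.data)
egdata <- book.tee.data$book.tee.dataframe
pd <- process.data(egdata, meta.data=list(width=4), check=TRUE)
str(pd$xmat)        # processed dataframe, e.g. with timesseen added
pd$meta.data$width  # checked/augmented meta.data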
Jeff Laake
Detections of pronghorn from fixed-wing aerial surveys in Southeastern Wyoming using four angular bins defined by strut marks. Illustrates data where altitude above ground level (AGL) varies during the survey.
A data frame with 660 observations on the following 5 variables.
a numeric vector
a factor with levels N, S representing the survey direction

height above ground level

a factor with levels A, B, C, D, which represent angular bands between breaks at 35.42, 44.56, 51.52, 61.02 and 70.97 degrees. These angles were set to match the selected distance bins given the target AGL.
number of pronghorn in the observed cluster
Each record is an observed cluster of pronghorn. The data provide the stratum for the observation, the direction of travel, the AGL at the time of the observation, the angular bin which contained the center of the pronghorn cluster (group), and the number of pronghorn in the group. The angular bins were defined by a combination of two window and five wing strut marks to define bin cutpoints for perpendicular ground distances of 0-65, 65-90, 90-115, 115-165 and 165-265 meters when the plane is 300' (91.4 meters) above ground level. The inner band is considered a blind region due to obstruction of view beneath the plane; thus the line is offset 65 meters from underneath the plane.
Data provided courtesy of Rich Guenzel of Wyoming Game and Fish.
Laake, J., R. J. Guenzel, J. L. Bengtson, P. Boveng, M. Cameron, and M. B. Hanson. 2008. Coping with variation in aerial survey protocol for line-transect sampling. Wildlife Research 35:289-298.
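A brief usage sketch (not part of the original documentation; the column names band and cluster are assumptions based on the description above):

data(pronghorn)
# cross-tabulate cluster size by angular band (assumed column names)
table(pronghorn$band, pronghorn$cluster)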
Single observer point count data example from Distance
The format is 144 obs of 6 variables:

distance: numeric distance from center

observer: factor with 2 levels "1","2"

detected: numeric 0/1

object: sequential object number

Sample.Label: point label

Region.Label: single region label
data(ptdata.distance)
xx <- ddf(dsmodel = ~cds(key="hn", formula = ~1), data = ptdata.distance,
          method = "ds", meta.data = list(point=TRUE))
summary(xx)
plot(xx, main="Distance point count data")
ddf.gof(xx)
Regions <- data.frame(Region.Label=1, Area=1)
Samples <- data.frame(Sample.Label=1:30, Region.Label=rep(1,30),
                      Effort=rep(1,30))
print(dht(xx, sample.table=Samples, region.table=Regions))
Simulated dual observer point count data with detection p(0)=0.8; hn sigma=30; w=100 for both observers with dependency y>0, gamma=0.1
The format is 420 obs of 6 variables:

distance: numeric distance from center

observer: factor with 2 levels "1","2"

detected: numeric 0/1

person: factor with 2 levels "A","B"

pair: factor with 2 levels "AB","BA"

object: sequential object number
data(ptdata.dual)
xx <- ddf(mrmodel=~glm(formula=~distance),
          dsmodel = ~cds(key="hn", formula = ~1),
          data = ptdata.dual, method = "io", meta.data = list(point=TRUE))
summary(xx)
plot(xx, main="Simulated point count data")
Simulated removal observer point count data with detection p(0)=0.8; hn sigma=30; w=100 for both observers with dependency y>0, gamma=0.1
The format is 408 obs of 6 variables:

distance: numeric distance from center

observer: factor with 2 levels "1","2"

detected: numeric 0/1

person: factor with 2 levels "A","B"

pair: factor with 2 levels "AB","BA"

object: sequential object number
data(ptdata.removal)
xx <- ddf(mrmodel=~glm(formula=~distance),
          dsmodel = ~cds(key="hn", formula = ~1),
          data = ptdata.removal, method = "rem", meta.data = list(point=TRUE))
summary(xx)
plot(xx, main="Simulated point count data")
Simulated single observer point count data with detection p(0)=1; hn sigma=30; w=100
The format is 341 obs of 4 variables:

distance: numeric distance from center

observer: factor with 2 levels "1","2"

detected: numeric 0/1

object: sequential object number
data(ptdata.single)
xx <- ddf(dsmodel = ~cds(key="hn", formula = ~1), data = ptdata.single,
          method = "ds", meta.data = list(point=TRUE))
summary(xx)
plot(xx, main="Simulated point count data")
Constructs a quantile-quantile (Q-Q) plot for fitted model as a graphical
check of goodness of fit. Formal goodness of fit testing for detection
function models using Kolmogorov-Smirnov and Cramer-von Mises tests. Both
tests are based on looking at the quantile-quantile plot produced by
qqplot.ddf
and deviations from the line x=y.
qqplot.ddf(model, plot = TRUE, nboot = 100, ks = FALSE, ...)
model |
fitted distance detection function model object |
plot |
should the Q-Q plot be plotted, or just the statistics reported? |
nboot |
number of replicates to use to calculate p-values for the goodness of fit test statistics |
ks |
perform the Kolmogorov-Smirnov test (this involves many bootstraps so can take a while) |
... |
additional arguments passed to plot |
The Kolmogorov-Smirnov test asks the question "what's the largest vertical distance between a point and the y=x line?" It uses this distance as a statistic to test the null hypothesis that the samples (EDF and CDF in our case) are from the same distribution (and hence our model fits well). If the deviation between the y=x line and the points is too large we reject the null hypothesis and say the model doesn't have a good fit.
Rather than looking at the single biggest difference between the y=x line and the points in the Q-Q plot, we might prefer to think about all the differences between line and points, since there may be many smaller differences that we want to take into account rather than looking for one large deviation. Its null hypothesis is the same, but the statistic it uses is the sum of the deviations from each of the points to the line.
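A minimal sketch (not part of the original documentation) showing both tests for a simple model fitted to the golf tee data:

library(mrds)
data(book.tee.data)
egdata <- book.tee.data$book.tee.dataframe
fit <- ddf(dsmodel=~mcds(key="hn", formula=~1),
           data=egdata[egdata$observer==1, ], method="ds",
           meta.data=list(width=4))
# ks=TRUE bootstraps the Kolmogorov-Smirnov p-value, so it can be slow
gof <- qqplot.ddf(fit, plot=TRUE, nboot=100, ks=TRUE)
gof$ks   # Kolmogorov-Smirnov statistic and p-value
gof$CvM  # Cramer-von Mises statistic and p-value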
A list of goodness of fit related values:
edf |
matrix of lower and upper empirical distribution function values |
cdf |
fitted cumulative distribution function values |
ks |
list with the K-S test statistic and p-value |
CvM |
list with the CvM test statistic and p-value |
Note that a bootstrap procedure is required to ensure that the p-values from the procedure are correct, because we are comparing the cumulative distribution function (CDF) and empirical distribution function (EDF) and we have estimated the parameters of the detection function.
Jeff Laake, David L Miller
Burnham, K.P., S.T. Buckland, J.L. Laake, D.L. Borchers, T.A. Marques, J.R.B. Bishop, and L. Thomas. 2004. Further topics in distance sampling. pp: 385-389. In: Advanced Distance Sampling, eds. S.T. Buckland, D.R.Anderson, K.P. Burnham, J.L. Laake, D.L. Borchers, and L. Thomas. Oxford University Press.
Detection function fitting from mark-recapture data with a removal
configuration in which a secondary observer knows what the primary observer
detects and detects objects missed by the primary observer. The iterative
offset glm/gam uses an offset to compensate for the conditioning on the set
of objects seen by either observer (e.g. objects with capture history 00, those missed by both observers, are not included in the analysis). This function is similar to io.glm.
rem.glm(datavec, fitformula, eps = 1e-05, iterlimit = 500, GAM = FALSE,
        gamplot = TRUE, datavec2)
datavec |
dataframe containing records seen by either observer 1 or 2 |
fitformula |
logit link formula |
eps |
convergence criterion |
iterlimit |
maximum number of iterations allowed |
GAM |
uses GAM instead of GLM for fitting |
gamplot |
set to TRUE to get a gam plot object if GAM=TRUE |
datavec2 |
dataframe containing all records for observer 1 and observer 2 as in io.glm form; this is used in case there is an observer (not platform) effect |
The only difference between this function and io.glm
is the
offset and the data construction because there is only one detection
function being estimated for the primary observer. The two functions could
be merged.
list of class("remglm","glm","lm") or class("remglm","gam")
glmobj |
GLM or GAM object |
offsetvalue |
offsetvalues from iterative fit |
plotobj |
gam plot object (if GAM & gamplot==TRUE, else NULL) |
Currently the code in this function for GAMs has been commented out until the remainder of the mrds package works with GAMs.
Jeff Laake
Buckland, S.T., J.M. Breiwick, K.L. Cattanach, and J.L. Laake. 1993. Estimated population size of the California gray whale. Marine Mammal Science 9:235-249.
Burnham, K.P., S.T. Buckland, J.L. Laake, D.L. Borchers, T.A. Marques, J.R.B. Bishop, and L. Thomas. 2004. Further topics in distance sampling. pp: 360-363. In: Advanced Distance Sampling, eds. S.T. Buckland, D.R.Anderson, K.P. Burnham, J.L. Laake, D.L. Borchers, and L. Thomas. Oxford University Press.
This will calculate the rescaling needed when covariates to be included in
the scale of the detection function are "too big". Based on code from
optimx
.
rescale_pars(initialvalues, ddfobj)
initialvalues |
starting values for the optimisation |
ddfobj |
detection function object |
Derivative-free methods like nlminb are sensitive to poorly scaled parameters. This can also cause problems for quasi-Newton methods (at least, bad scaling won't _help_ the optimisation), so here we rescale the parameters if necessary (unless scaling was already supplied via control).
David L Miller
Generate data from a fitted detection function and refit the model
sample_ddf(ds.object)
ds.object |
a fitted detection function object |
This function changes the random number generator seed. To avoid any
potential side-effects, use something like: seed <-
get(".Random.seed",envir=.GlobalEnv)
before running code and
assign(".Random.seed",seed,envir=.GlobalEnv)
after.
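For example (a sketch, not part of the original documentation; sample_ddf is an internal helper and may need to be called as mrds:::sample_ddf), wrapping a call as the note suggests:

# ds.model is assumed to be a fitted "ds" object, e.g. from the stake77
# examples later in this manual
seed <- get(".Random.seed", envir=.GlobalEnv)
refit <- sample_ddf(ds.model)
assign(".Random.seed", seed, envir=.GlobalEnv)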
David L. Miller
Set values of lower and upper bounds and check lengths of any user-specified values
setbounds(lowerbounds, upperbounds, initialvalues, ddfobj, width, left)
lowerbounds |
vector of lower bounds |
upperbounds |
vector of upper bounds |
initialvalues |
vector of initial parameter estimates |
ddfobj |
distance detection function object |
width |
truncation distance |
left |
left truncation distance |
lower |
vector of lower bounds |
upper |
vector of upper bounds |
setlower |
logical indicating whether user set lower bounds |
setupper |
logical indicating whether user set upper bounds |
Jeff Laake
This function creates a design matrix for the g(0) or scale covariates using
the input model formula. It returns a list which contains 2 elements: 1)
dim: the dimension (number of columns) of the design matrix, and 2) cov: the
constructed design matrix. This function is relatively simple because it
uses the built-in function model.matrix
which does the
majority of the work. This function handles 2 exceptions: "~.", the null model with 0 columns, and "~1", the intercept-only model (a column of 1s). If a model other than these 2 exceptions is provided, it calls model.matrix to construct the columns. If any column of the design matrix is all 0s, the column is removed; this occurs when there is no data for a particular factor level.
setcov(dmat, model)
dmat |
data matrix |
model |
model formula |
a design matrix for the specified data and model
Jeff Laake
For a given detection function, it computes the initial values for the parameters including scale and shape parameters and adjustment function parameters if any. If there are user-defined initial values only the parameters not specified by the user are computed.
setinitial.ds(ddfobj, width, initial, point, left)
sethazard(ddfobj, dmat, width, left, point)
ddfobj |
distance detection function object |
width |
half-width of transect or radius of point count |
initial |
list of user-defined initial values with possible elements:
|
point |
if TRUE, point count data |
left |
left truncation |
dmat |
|
scale |
vector of initial scale parameter values |
shape |
vector of initial shape parameter values |
adjustment |
vector of initial adjustment function parameter values |
Jeff Laake, David L Miller
Simulation of distance sampling data via mixture models. Allows one to simulate line transect distance sampling data using a mixture of half-normal detection functions.
sim.mix(n, sigma, mix.prop, width, means = 0)
n |
number of samples to generate |
sigma |
vector of scale parameters |
mix.prop |
vector of mixture proportions (same length as sigma) |
width |
truncation |
means |
vector of means (used to generate wacky, non-monotonic data) |
distances a vector of distances
At the moment this is TOTALLY UNSUPPORTED! Please don't use it for anything important!
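With that warning in mind, a throwaway sketch (not part of the original documentation; sim.mix may need to be called as mrds:::sim.mix):

# 100 distances from a two-component half-normal mixture, truncated at 20
dists <- sim.mix(n=100, sigma=c(1, 5), mix.prop=c(0.3, 0.7), width=20)
hist(dists, xlab="Distance")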
David Lawrence Miller
Tries to invert a matrix by solve
. If this fails because of
singularity, an eigenvector decomposition is computed, and eigenvalues below
1/cmax
are replaced by 1/cmax
, i.e., cmax
will be the
corresponding eigenvalue of the inverted matrix.
solvecov(m, cmax = 1e+10)
m |
a numeric symmetric matrix. |
cmax |
a positive value, see above. |
A list with the following components: inv, the inverted matrix; coll, TRUE if solve failed because of singularity.
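A small sketch (not part of the original documentation; solvecov is an internal helper and may need to be called as mrds:::solvecov):

m <- matrix(1, nrow=2, ncol=2)  # singular, so solve(m) would fail
sc <- solvecov(m, cmax=1e10)
sc$inv   # inverse; the zero-eigenvalue direction gets eigenvalue cmax
sc$coll  # TRUE: solve() failed because of singularity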
solvecov code was taken from package fpc: Christian Hennig

Christian Hennig
See also: solve, eigen
Multiple surveys by different observers of a single 1km transect containing 150 wooden stakes placed randomly throughout a 40 m strip (20m on either side).
A data frame with 150 observations on the following 10 variables.
unique number for each stake 1-150
perpendicular distance at which the stake was placed from the line
0/1 whether missed/seen by observer 1
0/1 whether missed/seen by observer 2
0/1 whether missed/seen by observer 3
0/1 whether missed/seen by observer 4
0/1 whether missed/seen by observer 5
0/1 whether missed/seen by observer 6
0/1 whether missed/seen by observer 7
0/1 whether missed/seen by observer 8
Laake, J. 1978. Line transect estimators robust to animal movement. M.S. Thesis. Utah State University, Logan, Utah. 55p.
Burnham, K. P., D. R. Anderson, and J. L. Laake. 1980. Estimation of Density from Line Transect Sampling of Biological Populations. Wildlife Monographs:7-202.
data(stake77)

# Extract functions for stake data and put in the mrds format
extract.stake <- function(stake, obs){
  extract.obs <- function(obs){
    example <- subset(stake,
                      eval(parse(text=paste("Obs", obs, "==1", sep=""))),
                      select="PD")
    example$distance <- example$PD
    example$object <- 1:nrow(example)
    example$PD <- NULL
    return(example)
  }
  if(obs!="all"){
    return(extract.obs(obs=obs))
  }else{
    example <- NULL
    for(i in 1:(ncol(stake)-2)){
      df <- extract.obs(obs=i)
      df$person <- i
      example <- rbind(example, df)
    }
    example$person <- factor(example$person)
    example$object <- 1:nrow(example)
    return(example)
  }
}

extract.stake.pairs <- function(stake, obs1, obs2, removal=FALSE){
  obs1 <- paste("Obs", obs1, sep="")
  obs2 <- paste("Obs", obs2, sep="")
  example <- subset(stake,
                    eval(parse(text=paste(obs1, "==1 |", obs2, "==1 ", sep=""))),
                    select=c("PD", obs1, obs2))
  names(example) <- c("distance", "obs1", "obs2")
  detected <- c(example$obs1, example$obs2)
  example <- data.frame(object=rep(1:nrow(example), 2),
                        distance=rep(example$distance, 2),
                        detected=detected,
                        observer=c(rep(1, nrow(example)),
                                   rep(2, nrow(example))))
  if(removal) example$detected[example$observer==2] <- 1
  return(example)
}

# extract data for observer 1 and fit a single observer model
stakes <- extract.stake(stake77, 1)
ds.model <- ddf(dsmodel=~mcds(key="hn", formula=~1), data=stakes,
                method="ds", meta.data=list(width=20))
plot(ds.model, breaks=seq(0, 20, 2), showpoints=TRUE)
ddf.gof(ds.model)

# extract data from observers 1 and 3 and fit an io model
stkpairs <- extract.stake.pairs(stake77, 1, 3, removal=FALSE)
io.model <- ddf(dsmodel=~mcds(key="hn", formula=~1),
                mrmodel=~glm(formula=~distance),
                data=stkpairs, method="io")
summary(io.model)
par(mfrow=c(3,2))
plot(io.model, breaks=seq(0, 20, 2), showpoints=TRUE, new=FALSE)
dev.new()
ddf.gof(io.model)
Multiple surveys by different observers of a single 1km transect containing 150 wooden stakes placed, according to an expected uniform distribution, throughout a 40 m strip (20m on either side).
A data frame with 150 observations on the following 13 variables.
unique number for each stake 1-150
perpendicular distance at which the stake was placed from the line
0/1 whether missed/seen by observer 1
0/1 whether missed/seen by observer 2
0/1 whether missed/seen by observer 3
0/1 whether missed/seen by observer 4
0/1 whether missed/seen by observer 5
0/1 whether missed/seen by observer 6
0/1 whether missed/seen by observer 7
0/1 whether missed/seen by observer 8
0/1 whether missed/seen by observer 9
0/1 whether missed/seen by observer 10
0/1 whether missed/seen by observer 11
The 1997 survey was based on a single realization of a uniform distribution. Because it was a single transect and there was no randomization of the distances for each survey, we repeated the experiment and used distances that provided a uniform distribution but randomly sorted the positions along the line so there was no pattern obvious to the observer.
Laake, J. 1978. Line transect estimators robust to animal movement. M.S. Thesis. Utah State University, Logan, Utah. 55p.
Burnham, K. P., D. R. Anderson, and J. L. Laake. 1980. Estimation of Density from Line Transect Sampling of Biological Populations. Wildlife Monographs:7-202.
data(stake78)
data(stake77)

# compare distribution of distances for all stakes
hist(stake77$PD)
hist(stake78$PD)

# Extract stake data and put in the mrds format for model fitting.
extract.stake <- function(stake, obs){
  extract.obs <- function(obs){
    example <- subset(stake,
                      eval(parse(text=paste("Obs", obs, "==1", sep=""))),
                      select="PD")
    example$distance <- example$PD
    example$object <- 1:nrow(example)
    example$PD <- NULL
    return(example)
  }
  if(obs!="all"){
    return(extract.obs(obs=obs))
  }else{
    example <- NULL
    for(i in 1:(ncol(stake)-2)){
      df <- extract.obs(obs=i)
      df$person <- i
      example <- rbind(example, df)
    }
    example$person <- factor(example$person)
    example$object <- 1:nrow(example)
    return(example)
  }
}

extract.stake.pairs <- function(stake, obs1, obs2, removal=FALSE){
  obs1 <- paste("Obs", obs1, sep="")
  obs2 <- paste("Obs", obs2, sep="")
  example <- subset(stake,
                    eval(parse(text=paste(obs1, "==1 |", obs2, "==1 ", sep=""))),
                    select=c("PD", obs1, obs2))
  names(example) <- c("distance", "obs1", "obs2")
  detected <- c(example$obs1, example$obs2)
  example <- data.frame(object=rep(1:nrow(example), 2),
                        distance=rep(example$distance, 2),
                        detected=detected,
                        observer=c(rep(1, nrow(example)),
                                   rep(2, nrow(example))))
  if(removal) example$detected[example$observer==2] <- 1
  return(example)
}

# extract data for observer 10 and fit a single observer model
stakes <- extract.stake(stake78, 10)
ds.model <- ddf(dsmodel=~mcds(key="hn", formula=~1), data=stakes,
                method="ds", meta.data=list(width=20))
plot(ds.model, breaks=seq(0, 20, 2), showpoints=TRUE)
ddf.gof(ds.model)

# extract data from observers 5 and 7 and fit an io model
stkpairs <- extract.stake.pairs(stake78, 5, 7, removal=FALSE)
io.model <- ddf(dsmodel=~mcds(key="hn", formula=~1),
                mrmodel=~glm(formula=~distance),
                data=stkpairs, method="io")
summary(io.model)
par(mfrow=c(3,2))
plot(io.model, breaks=seq(0, 20, 2), showpoints=TRUE, new=FALSE)
ddf.gof(io.model)
Provides a brief summary of data and fitted detection probability model parameters, model selection criterion, and optionally abundance in the covered (sampled) region and its standard error.
## S3 method for class 'ds'
summary(object, se = TRUE, N = TRUE, ...)
object |
a ddf model object |
se |
if TRUE, computes standard errors |
N |
if TRUE, computes abundance in covered (sampled) region |
... |
unspecified and unused arguments for S3 consistency |
The argument N is used to suppress computation of abundance and average detection probability in calls to summarize the ds and either io.fi or trial.fi models for summaries of io and trial objects respectively, which are composed of a ds model object and a mark-recapture model object. The corresponding print function is called to print the summary results.
list of extracted and summarized objects
This function is called by the generic function summary
for any
ddf
model object. Each function can be called directly by the
user, but it is typically safest to use the generic function
summary
which calls the appropriate function based on the type of
ddf
model.
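A minimal sketch (not part of the original documentation):

library(mrds)
data(book.tee.data)
egdata <- book.tee.data$book.tee.dataframe
fit <- ddf(dsmodel=~mcds(key="hn", formula=~1),
           data=egdata[egdata$observer==1, ], method="ds",
           meta.data=list(width=4))
# suppress the covered-region abundance computation
summary(fit, N=FALSE)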
Jeff Laake
Provides a brief summary of data and fitted detection probability model parameters, model selection criterion, and optionally abundance in the covered (sampled) region and its standard error.
## S3 method for class 'io'
summary(object, se = TRUE, ...)
object |
a ddf model object |
se |
if TRUE, computes standard errors |
... |
unspecified and unused arguments for S3 consistency |
The argument N is used to suppress computation of abundance and average detection probability in calls to summarize the ds and either io.fi or trial.fi models for summaries of io and trial objects respectively, which are composed of a ds model object and a mark-recapture model object. The corresponding print function is called to print the summary results.
list of extracted and summarized objects
This function is called by the generic function summary
for any
ddf
model object. Each function can be called directly by the
user, but it is typically safest to use the generic function
summary
which calls the appropriate function based on the type of
ddf
model.
Jeff Laake
Provides a brief summary of data and fitted detection probability model parameters, model selection criterion, and optionally abundance in the covered (sampled) region and its standard error.
## S3 method for class 'io.fi'
summary(object, se = TRUE, N = TRUE, fittedmodel = NULL, ddfobj = NULL, ...)
object |
a ddf model object |
se |
if TRUE, computes standard errors |
N |
if TRUE, computes abundance in covered (sampled) region |
fittedmodel |
full fitted model when called from the summary of an io model |
ddfobj |
distance sampling object description |
... |
unspecified and unused arguments for S3 consistency |
The argument N is used to suppress computation of abundance and average detection probability in calls to summarize the ds and either io.fi or trial.fi models for summaries of io and trial objects respectively, which are composed of a ds model object and a mark-recapture model object. The corresponding print function is called to print the summary results.
list of extracted and summarized objects
This function is called by the generic function summary
for any
ddf
model object. Each function can be called directly by the
user, but it is typically safest to use the generic function
summary
which calls the appropriate function based on the type of
ddf
model.
Jeff Laake
Provides a brief summary of data and fitted detection probability model parameters, model selection criterion, and optionally abundance in the covered (sampled) region and its standard error.
## S3 method for class 'rem'
summary(object, se = TRUE, ...)
object |
a ddf model object |
se |
if TRUE, computes standard errors |
... |
unspecified and unused arguments for S3 consistency |
The argument N is used to suppress computation of abundance and average detection probability when the ds and either the io.fi or trial.fi component summaries are produced as part of summarizing io and trial objects, which are each composed of a ds model object and a mark-recapture model object. The corresponding print function is called to print the summary results.

list of extracted and summarized objects

This function is called by the generic function summary for any ddf model object. Each function can be called directly by the user, but it is typically safest to use the generic summary, which calls the appropriate method based on the type of ddf model.
Jeff Laake
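A sketch of use for a removal (rem) configuration, assuming double-observer data formatted as for the other configurations; the data set and formulae are illustrative only:

## Not run:
data(book.tee.data)
egdata <- book.tee.data$book.tee.dataframe
result.rem <- ddf(method = "rem", dsmodel = ~cds(key = "hn"),
                  mrmodel = ~glm(formula = ~distance),
                  data = egdata, meta.data = list(width = 4))
summary(result.rem)
## End(Not run)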
Provides a brief summary of data and fitted detection probability model parameters, model selection criterion, and optionally abundance in the covered (sampled) region and its standard error.
## S3 method for class 'rem.fi'
summary(object, se = TRUE, N = TRUE, fittedmodel = NULL, ...)
object |
a ddf model object |
se |
if TRUE, computes standard errors |
N |
if TRUE, computes abundance in covered (sampled) region |
fittedmodel |
full fitted model when called from summary.rem |
... |
unspecified and unused arguments for S3 consistency |
The argument N is used to suppress computation of abundance and average detection probability when the ds and either the io.fi or trial.fi component summaries are produced as part of summarizing io and trial objects, which are each composed of a ds model object and a mark-recapture model object. The corresponding print function is called to print the summary results.

list of extracted and summarized objects

This function is called by the generic function summary for any ddf model object. Each function can be called directly by the user, but it is typically safest to use the generic summary, which calls the appropriate method based on the type of ddf model.
Jeff Laake
Provides a brief summary of data and fitted detection probability model parameters, model selection criterion, and optionally abundance in the covered (sampled) region and its standard error.
## S3 method for class 'trial'
summary(object, se = TRUE, ...)
object |
a ddf model object |
se |
if TRUE, computes standard errors |
... |
unspecified and unused arguments for S3 consistency |
The argument N is used to suppress computation of abundance and average detection probability when the ds and either the io.fi or trial.fi component summaries are produced as part of summarizing io and trial objects, which are each composed of a ds model object and a mark-recapture model object. The corresponding print function is called to print the summary results.

list of extracted and summarized objects

This function is called by the generic function summary for any ddf model object. Each function can be called directly by the user, but it is typically safest to use the generic summary, which calls the appropriate method based on the type of ddf model.
Jeff Laake
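A sketch of use for a trial observer configuration; as above, the data set and model choices are illustrative:

## Not run:
data(book.tee.data)
egdata <- book.tee.data$book.tee.dataframe
result.trial <- ddf(method = "trial", dsmodel = ~cds(key = "hn"),
                    mrmodel = ~glm(formula = ~distance),
                    data = egdata, meta.data = list(width = 4))
summary(result.trial)
## End(Not run)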
Provides a brief summary of data and fitted detection probability model parameters, model selection criterion, and optionally abundance in the covered (sampled) region and its standard error.
## S3 method for class 'trial.fi'
summary(object, se = TRUE, N = TRUE, fittedmodel = NULL, ...)
object |
a ddf model object |
se |
if TRUE, computes standard errors |
N |
if TRUE, computes abundance in covered (sampled) region |
fittedmodel |
full fitted model when called from summary.trial |
... |
unspecified and unused arguments for S3 consistency |
The argument N is used to suppress computation of abundance and average detection probability when the ds and either the io.fi or trial.fi component summaries are produced as part of summarizing io and trial objects, which are each composed of a ds model object and a mark-recapture model object. The corresponding print function is called to print the summary results.

list of extracted and summarized objects

This function is called by the generic function summary for any ddf model object. Each function can be called directly by the user, but it is typically safest to use the generic summary, which calls the appropriate method based on the type of ddf model.
Jeff Laake
Extrapolate Horvitz-Thompson abundance estimates to entire surveyed region
survey.region.dht(Nhat.by.sample, samples, width, left, point, areas.supplied)
Nhat.by.sample |
dataframe of abundance by sample |
samples |
samples table |
width |
truncation width |
left |
left truncation if any |
point |
if TRUE, point transect; otherwise line transect |
areas.supplied |
if TRUE, areas have been supplied by the user |
Revised Nhat.by.sample dataframe containing estimates extrapolated to survey region
Internal function called by dht and related functions.
Jeff Laake and David L Miller
Determines whether user specified breaks for histograms are properly ordered and match the left and right truncation.
test.breaks(breaks, left, width)
breaks |
vector of cutpoints (breaks) for distance histogram |
left |
left truncation value |
width |
right truncation value; either radius of point count or half-width of transect |
vector of breaks modified to be valid if necessary
Jeff Laake
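For instance, with hypothetical cutpoints:

# breaks already spanning the truncation range (left = 0, width = 4)
# are returned unchanged; invalid breaks are corrected
test.breaks(breaks = c(0, 1, 2, 3, 4), left = 0, width = 4)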
Computes one of a series of possible variance estimates for the observed encounter rate for a set of sample measurements (e.g., line lengths) and number of observations per sample.
varn(lvec, nvec, type)
covn(lvec, groups1, groups2, type)
lvec |
vector of sample measurements (e.g., line lengths) |
nvec |
vector of number observed |
type |
choice of variance estimator to use for encounter rate |
groups1 |
vector of number of groups observed |
groups2 |
vector of number of individuals observed |
The choice of type follows the notation of Fewster et al. (2009), in that there are eight choices of encounter rate variance that can be computed for lines and two for points:

R2: random line placement with unequal line lengths (design-assisted estimator)
R3: random line placement, model-assisted estimator, based on true contagion process
R4: random line placement, model-assisted estimator, based on apparent contagion process
S1: systematic line placement, post-stratification with no strata overlap
S2: systematic line placement, post-stratification with no strata overlap, variances weighted by line length per stratum
O1: systematic line placement, post-stratification with overlapping strata (akin to S1)
O2: systematic line placement, post-stratification with overlapping strata, weighted by line length per stratum (akin to S2)
O3: systematic line placement, post-stratification with overlapping strata, model-assisted estimator with trend in encounter rate with line length
P2: random point placement, potentially unequal number of visits per point, design-based estimator
P3: random point placement, potentially unequal number of visits per point, model-based estimator
The default value is "R2", shown in Fewster et al. (2009) to have good performance for completely random designs for lines. For systematic parallel line transect designs, Fewster et al. recommend "O2". For point transects the default is "P2" (but "P3" is also available).
For the systematic estimators, pairs are assigned in the order they are given in the lengths and groups vectors.

Variance of encounter rate as defined by the arguments

This function is also used, with different calling arguments, to compute the Innes et al. variance of the estimated abundances per unit length rather than of the observation encounter rate. The function covn is probably only valid for R3 and R2; currently, the R2 form is used for all types other than R3.
Jeff Laake, David L Miller
Fewster, R.M., S.T. Buckland, K.P. Burnham, D.L. Borchers, P.E. Jupp, J.L. Laake and L. Thomas. 2009. Estimating the encounter rate variance in distance sampling. Biometrics 65: 225-236.
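A sketch with illustrative effort and count data:

# five lines: lengths and numbers of observations per line
lvec <- c(10, 12, 8, 11, 9)
nvec <- c(4, 7, 2, 5, 3)
# default estimator for completely random line placement
varn(lvec, nvec, type = "R2")
# estimator recommended for systematic parallel line designs
varn(lvec, nvec, type = "O2")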