Follow the steps in this guide to add a new model to the unmarked package. Note that the order can be adjusted based on your preferences. For instance, you can start with the likelihood function, as it forms the core of adding a model to unmarked, and then build the rest of the code around it. In this document, the steps are ordered as they would occur in an unmarked analysis workflow.

This guide uses the recently developed gdistremoval function for examples, mainly because most of the relevant code is in a single file instead of spread around. It also uses occu functions to show simpler examples that may be easier to understand.

Prerequisites and advices

Before you start coding, you should use git to version your code:
- Fork the unmarked repository on Github
- Make a new branch with your new function as the name
- Add the new code
unmarked uses S4 for objects and methods - if you aren’t familiar with S4 you may want to consult a book or tutorial such as this one.
If you are unfamiliar with building a package in R, here are two tutorials that may help you: Karl Broman’s guide to building packages and the official R-project guide. If you are using RStudio, their documentation on writing package could also be useful, especially to understand how to use the Build pane.
To avoid complex debugging in the end, I suggest you to regularly install and load the package as you add new code. You can easily do so in RStudio in the Build pane, by clicking on “Install > Clean and install”. This will also allow you to test your functions cleanly.
Write tests and documentation as you add new functions, classes, and methods. This eases the task, avoiding the need to write everything at the end.

Organise the input data: design the `unmarkedFrame` object

Most model types in unmarked have their own unmarkedFrame, a specialized kind of data frame. This is an S4 object which contains, at a minimum, the response (y). It may also include site covariates, observation covariates, primary period covariates, and other info related to study design (such as distance breaks).

In some cases you may be able to use an existing unmarkedFrame subclass. You can list all the existing unmarkedFrame subclasses by running the following code:

showClass("unmarkedFrame")

## Class "unmarkedFrame" [package "unmarked"]
## 
## Slots:
##                                                                               
## Name:                  y           obsCovs          siteCovs            obsToY
## Class:            matrix optionalDataFrame optionalDataFrame    optionalMatrix
## 
## Extends: "unmarkedFrameOrNULL"
## 
## Known Subclasses: 
## Class "unmarkedMultFrame", directly
## Class "unmarkedFrameDS", directly
## Class "unmarkedFrameOccu", directly
## Class "unmarkedFrameOccuFP", directly
## Class "unmarkedFrameOccuMulti", directly
## Class "unmarkedFramePCount", directly
## Class "unmarkedFrameMPois", directly
## Class "unmarkedFrameOccuCOP", directly
## Class "unmarkedFrameOccuComm", directly
## Class "unmarkedFrameOccuMS", by class "unmarkedMultFrame", distance 2
## Class "unmarkedFrameOccuTTD", by class "unmarkedMultFrame", distance 2
## Class "unmarkedFrameG3", by class "unmarkedMultFrame", distance 2
## Class "unmarkedFramePCO", by class "unmarkedMultFrame", distance 2
## Class "unmarkedFrameGDR", by class "unmarkedMultFrame", distance 2
## Class "unmarkedFrameGMM", by class "unmarkedMultFrame", distance 3
## Class "unmarkedFrameGDS", by class "unmarkedMultFrame", distance 3
## Class "unmarkedFrameGPC", by class "unmarkedMultFrame", distance 3
## Class "unmarkedFrameGOccu", by class "unmarkedMultFrame", distance 3
## Class "unmarkedFrameMMO", by class "unmarkedMultFrame", distance 4
## Class "unmarkedFrameDSO", by class "unmarkedMultFrame", distance 4

You can have more information about each unmarkedFrame subclass by looking at the documentation of the function that was written to create the unmarkedFrame object of this subclass, for example with ?unmarkedFrameGDR, or on the package’s website.

Define the `unmarkedFrame` subclass for this model

All unmarkedFrame subclasses are children of the umarkedFrame class, defined here.
Example with occu
Example with gdistremoval
All unmarkedFrame subclasses need to pass the validunmarkedFrame validity check. You may want to add complementary validity check, like, for example, the `unmarkedFrameDS subclass.

Write the function that creates the `unmarkedFrame` object

Write the S4 methods associated with the `unmarkedFrame` object

Note that you may not have to write all of the S4 methods below. Most of them will work without having to re-write them, but you should test it to verify it. All the methods associated with unmarkedFrame objects are listed in the unmarkedFrame class documentation accessible with help("unmarkedFrame-class").

Specific methods

Here are methods you probably will have to rewrite.

Subsetting the unmarkedFrame object: umf[i, ], umf[, j] and umf[i, j]
- Example with occu: code for unmarkedFrame mother class, as used to subset an unmarkedFrameOccu object.
- Example with gdistremoval: umf[i, ] when i is numeric, umf[i, ] when i is logical, umf[i, j]

Generic methods

Here are methods that you should test but probably will not have to rewrite. They are defined in the unmarkedFrame.R file, for the unmarkedFrame mother class.

getY
numSites
numY
obsCovs
obsCovs<-
obsNum
obsToY
obsToY<-
plot
projection
show
siteCovs
siteCovs<-
summary

Methods to access new attributes

You may also need to add specific methods to allow users to access an attribute you added to your unmarkedFrame subclass.

For example, getL for unmarkedFrameOccuCOP

Fitting the model

The fitting function can be declined into three main steps: reading the unmarkedFrame object, maximising the likelihood, and formatting the outputs.

Inputs of the fitting function

R formulas for each submodel (e.g. state, detection). We have found over time it is better to have separate arguments per formula (e.g. the way gdistremoval does it) instead of a combined formula (e.g. the way occu does it).
data for the unmarkedFrame
Parameters for optim: optimisation algorithm (method), initial parameters, and other parameters (...)
engine parameter to call one of the implemented likelihood functions
Other model-specific settings, such as key functions or parameterizations to use

Read the `unmarkedFrame` object: write the `getDesign` method

Most models have their own getDesign function, an S4 method. The purpose of this method is to convert the information in the unmarkedFrame into a format usable by the likelihood function.

It generates design matrices from formulas and components of the unmarkedFrame.
It often also has code to handle missing values, such as by dropping sites that don’t have measurements, or giving the user warnings if covariates are missing, etc.

Writing the getDesign method is frequently the most tedious and difficult part of the work adding a new function.

Example for occu, as used for occu
Example for gdistremoval

The likelihood function

Inputs: a vector of parameter values, the response variable, design matrices, and other settings/required data
Outputs: a numeric, the negative log-likelihood
Should be written so it can be used with the optim() function
Models can have three likelihood functions : coded in R, in C++ and with TMB (which is technically in C++ too). Users can specify which likelihood function to use in the engine argument of the fitting function.

The R likelihood function: easily understandable

If you are mainly used to coding in R, you should probably start here. If users want to dig deeper into the likelihood of a model, it may be useful for them to be able to read the R code to calculate likelihood, as they may not be familiar with other languages. This likelihood function can be used only for fixed-effects models.

Example for occu
gdistremoval doesn’t have an R version of the likelihood function

The C++ likelihood function: faster

The C++ likelihood function is essentially a C++ version of the R likelihood function, also designed exclusively for fixed-effects models. This function uses the RcppArmadillo R package, presented here. In the C++ code, you can use functions of the Armadillo C++ library, documented here.

Your C++ function should be in a .cpp file in the ./src/ folder of the package. You do not need to write a header file (.hpp), nor do you need to compile the code by yourself as it is all handled by the RcppArmadillo package. To test if your C++ function runs and gives you the expected result, you can compile and load the function with Rcpp::sourceCpp(./src/nll_yourmodel.cpp), and then use it like you would use a R function: nll_yourmodel(params=params, arg1=arg1).

The TMB likelihood function: for random effects

#TODO

Example for gdistremoval

Organise the output data

`unmarkedEstimate` objects per submodel

Outputs from optim should be organized unto unmarkedEstimate (S4) objects, with one unmarkedEstimate per submodel (e.g. state, detection). These objects include the parameter estimates and other information about link functions etc.

The unmarkedEstimate class is defined here in the unmarkedEstimate.R file, and the unmarkedEstimate function is defined here, and is used to create new unmarkedEstimate objects. You normally will not need to create unmarkedEstimate subclass.

Design the `unmarkedFit` object

You’ll need to create a new unmarkedFit subclass for your model. The main component of unmarkedFit objects is a list of the unmarkedEstimates described above.

After you defined your unmarkedFit subclass, you can create the object in your fitting function.

The fitting function return this unmarkedFit object.

Test the complete fitting function process

Simulate some data using your model
Construct the unmarkedFrame
Provide formulas, unmarkedFrame, other options to your draft fitting function
Process them with getDesign
Pass results from getDesign as inputs to your likelihood function
Optimize the likelihood function
Check the resulting parameter estimates for accuracy

Write the methods associated with the `unmarkedFit` object

Develop methods specific to your unmarkedFit type for operating on the output of your model. Like for the methods associated with an unmarkedFrame object above, you probably will not have to re-write all of them, but you should test them to see if they work. All the methods associated with unmarkedFit objects are listed in the unmarkedFit class documentation accessible with help("unmarkedFit-class").

Specific methods

Those are methods you will want to rewrite, adjusting them for your model.

`getP`

The getP method (defined here) “back-transforms” the detection parameter (p the detection probability or λ the detection rate, depending on the model). It returns a matrix of the estimated detection parameters. It is called by several other methods that are useful to extract information from the unmarkedFit object.

For occu, the generic method for unmarkedFit objects is called.
Example for gdistremoval

`simulate`

The generic simulate method (defined here) calls the simulate_fit method that depends on the class of the unmarkedFit object, which depends on the model.

The simulate method can be used in two ways:

to generate datasets from scratch (see the “Simulating datasets” vignette)
to generate datasets from a fitted model (with simulate(object = my_unmarkedFit_object)).

You should test both ways with your model.

`plot`

This method plots the results of your model. The generic plot method for unmarkedFit (defined here) plot the residuals of the model.

For occu, the generic method for unmarkedFit objects is called.
Example for gdistremoval

Generic methods

Here are methods that you should test but probably will not have to rewrite. They are defined in the unmarkedFit.R file, for the unmarkedFit mother class.

[
backTransform
coef
confint
fitted
getData
linearComb
names
parboot
nonparboot
predict
profile
residuals
SE
show
summary
update
vcov
logLik

Methods to access new attributes

You may also need to add specific methods to allow users to access an attribute you added to your unmarkedFit subclass.

For example, some methods are relevant for some type of models only:

getFP for occupancy models that account for false positives
getB for occupancy models that account for false positives
smoothed for colonization-extinction models
projected for colonization-extinction models

Update the `NAMESPACE` file

Add your fitting function to the functions export here
Add the new subclasses (unmarkedFrame, unmarkedFit) to the classes export here
Add the function you wrote to create your unmarkedFrame object to the functions export here
If you wrote new methods, for example to access new attributes for objects of a subclass, add them to the methods export here
If required, export other functions you created that may be called by users of the package

Write tests

Using testthat package, you need to write tests for your unmarkedFrame function, your fitting function, and methods described above. The tests should be fast, but cover all the key configurations.

Write your tests in the ./tests/testthat/ folder, creating a R file for your model. If you are using RStudio, you can run the tests of your file easily by clicking on the “Run tests” button. You can run all the tests by clicking on the “Test” button in the Build pane.

Write documentation

You need to write the documentation files for the new classes and functions you added. Documentation .Rd files are stored in the man folder. Here is a documentation on how to format your documentation.

The most important, your fitting function!
- Example for occu
- Example for gdistremoval
Your unmarkedFrame constructor function
- Example for occu
- Example for gdistremoval
Add your fitting function to the reference of all functions to _pkgdown.yml

Depending on how much you had to add, you may also need to update existing files:

If you added specific methods for your new unmarkedFrame class: add them to unmarkedFrame-class.Rd
If you added specific methods for your new unmarkedFit class: add them to unmarkedFit-class.Rd. The same goes for your new unmarkedFitList class in unmarkedFitList-class.Rd.
Add any specific function, method or class you created. For example, specific distance-sampling functions are documented in detFuns.Rd.

Add to `unmarked`

Send a pull request on Github
Probably fix a few things
Merged and done!

Contributing to unmarked: guide to adding a new model to unmarked

Prerequisites and advices

Organise the input data: design the unmarkedFrame object

Define the unmarkedFrame subclass for this model

Write the function that creates the unmarkedFrame object

Write the S4 methods associated with the unmarkedFrame object

Specific methods

Generic methods

Methods to access new attributes

Fitting the model

Inputs of the fitting function

Read the unmarkedFrame object: write the getDesign method

The likelihood function

The R likelihood function: easily understandable

The C++ likelihood function: faster

The TMB likelihood function: for random effects

Organise the output data

unmarkedEstimate objects per submodel

Design the unmarkedFit object

Test the complete fitting function process

Write the methods associated with the unmarkedFit object

Specific methods

getP

simulate

plot

Generic methods

Methods to access new attributes

Update the NAMESPACE file

Write tests

Write documentation

Add to unmarked

Contributing to unmarked: guide to adding a new model to `unmarked`

Organise the input data: design the `unmarkedFrame` object

Define the `unmarkedFrame` subclass for this model

Write the function that creates the `unmarkedFrame` object

Write the S4 methods associated with the `unmarkedFrame` object

Read the `unmarkedFrame` object: write the `getDesign` method

`unmarkedEstimate` objects per submodel

Design the `unmarkedFit` object

Write the methods associated with the `unmarkedFit` object

`getP`

`simulate`

`plot`

Update the `NAMESPACE` file

Add to `unmarked`