Package 'reslr' reference manual

Title:	Modelling Relative Sea Level Data
Description:	The Bayesian modelling of relative sea-level data using a comprehensive approach that incorporates various statistical models within a unifying framework. Details regarding each statistical models; linear regression (Ashe et al 2019) <doi:10.1016/j.quascirev.2018.10.032>, change point models (Cahill et al 2015) <doi:10.1088/1748-9326/10/8/084002>, integrated Gaussian process models (Cahill et al 2015) <doi:10.1214/15-AOAS824>, temporal splines (Upton et al 2023) <arXiv:2301.09556>, spatio-temporal splines (Upton et al 2023) <arXiv:2301.09556> and generalised additive models (Upton et al 2023) <arXiv:2301.09556>. This package facilitates data loading, model fitting and result summarisation. Notably, it accommodates the inherent measurement errors found in relative sea-level data across multiple dimensions, allowing for their inclusion in the statistical models.
Authors:	Maeve Upton [cph, aut, cre] , Andrew Parnell [aut], Niamh Cahill [aut]
Maintainer:	Maeve Upton <[email protected]>
License:	MIT + file LICENSE
Version:	0.1.1
Built:	2025-01-22 06:59:49 UTC
Source:	CRAN

Cross validation check for spline in time, spline in space time and GAM in order to select the most appropriate number of knots when creating basis functions.

Description

Cross validation check for spline in time, spline in space time and GAM in order to select the most appropriate number of knots when creating basis functions.

Usage

cross_val_check(
  data,
  prediction_grid_res = 50,
  spline_nseg = NULL,
  spline_nseg_t = 20,
  spline_nseg_st = 6,
  n_iterations = 1000,
  n_burnin = 100,
  n_thin = 5,
  n_chains = 2,
  model_type,
  n_fold = 5,
  seed = NULL,
  CI = 0.95
)
cross_val_check(
  data,
  prediction_grid_res = 50,
  spline_nseg = NULL,
  spline_nseg_t = 20,
  spline_nseg_st = 6,
  n_iterations = 1000,
  n_burnin = 100,
  n_thin = 5,
  n_chains = 2,
  model_type,
  n_fold = 5,
  seed = NULL,
  CI = 0.95
)

Arguments

`data`	Raw input data
`prediction_grid_res`	Resolution of grid. Predictions over every 50 years(default) can vary based on user preference, as larger values will reduce computational run time.
`spline_nseg`	This setting is focused on the Noisy Input Spline model. It provides the number of segments used to create basis functions.
`spline_nseg_t`	This setting is focused on the Noisy Input Generalised Additive Model. It provides the number of segments used to create basis functions.
`spline_nseg_st`	This setting is focused on the Noisy Input Generalised Additive Model. It provides the number of segments used to create basis functions.
`n_iterations`	Number of iterations. Increasing this value will increase the computational run time.
`n_burnin`	Size of burn-in. This number removes a certain number of samples at the beginning.
`n_thin`	Amount of thinning.
`n_chains`	Number of MCMC chains. The number of times the model will be run.
`model_type`	The user selects their statistical model type. The user can select a Noisy Input Spline in Time using "ni_spline_t". The user can select a Noisy Input Spline in Space Time using "ni_spline_st". The user can select a Noisy Input Generalised Additive Model using "ni_gam_decomp".
`n_fold`	Number of folds required in the cross validation. The default is 5 fold cross validation.
`seed`	If the user wants reproducible results, seed stores the output when random selection was used in the creation of the cross validation.
`CI`	Size of the credible interval required by the user. The default is 0.95 corresponding to 95%.

Value

A list containing the model comparison measures, e.g. Root Mean Square Error (RMSE), and plot of true vs predicted values

Examples


data <- NAACproxydata %>% dplyr::filter(Site == "Cedar Island")
cross_val_check(data = data, model_type = "ni_spline_t",n_fold = 2)

data <- NAACproxydata %>% dplyr::filter(Site == "Cedar Island")
cross_val_check(data = data, model_type = "ni_spline_t",n_fold = 2)

Relative Sea level example dataset

Description

An example dataset for 22 proxy sites along the Atlantic coast of North America

Usage

NAACproxydata
NAACproxydata

Format

A data frame with 1715 rows and 8 columns:

Region: All regions along Atlantic coast of North America
Site: All sites along Atlantic coast of North America
Latitude: Latitude of the data site
Longitude: Longitude of the data site
RSL: Relative Sea level in meters
RSL_err: Error in RSL
Age: Age in years CE
Age_err: Error in Age

Source

Check out the vignettes to see the sources

Plot raw data with measurement uncertainty.

Description

In this function, the raw data is plotted with grey uncertainty boxes representing the uncertainty associated with the input and the output. The function allows the user to plot the proxy record data and tide gauge data together or separately.

Usage

## S3 method for class 'reslr_input'
plot(
  x,
  title = "",
  xlab = "Year (CE)",
  ylab = "Relative Sea Level (m)",
  plot_tide_gauges = FALSE,
  plot_proxy_records = TRUE,
  plot_caption = TRUE,
  ...
)
## S3 method for class 'reslr_input'
plot(
  x,
  title = "",
  xlab = "Year (CE)",
  ylab = "Relative Sea Level (m)",
  plot_tide_gauges = FALSE,
  plot_proxy_records = TRUE,
  plot_caption = TRUE,
  ...
)

Arguments

`x`	An object created via the function `reslr_load`
`title`	Title of the plot
`xlab`	Labeling the x-axis
`ylab`	Labeling the y-axis
`plot_tide_gauges`	Plotting the tide gauge data with the proxy records
`plot_proxy_records`	Plotting the proxy records on their own and this is the default
`plot_caption`	Plotting an informed caption with the number of tide gauges and proxy sites.
`...`	Not used

Value

Plot of the raw data with the measurement uncertainty.

Examples


full_dataset <- reslr_load(NAACproxydata)
plot(full_dataset)
full_dataset <- reslr_load(NAACproxydata)
plot(full_dataset)

Plotting the results for each statistical model from the `reslr_mcmc` function.

Description

Depending on the model chosen in the reslr_mcmc function, the package produces a range of output plots. Total posterior model fit plot with the raw data and measurement uncertainty are created for each statistical model. The rate of change plots are created for the EIV IGP and the NI spline regression models. For the NI GAM decomposition, each individual component of the model is plotted. Also, the regional and the non-linear local component, an associated rate plot is produced. If tide gauges are used in the model, the user has the ability plot the output with or without this additional data source.

Usage

## S3 method for class 'reslr_output'
plot(
  x,
  plot_proxy_records = TRUE,
  plot_tide_gauges = FALSE,
  title = "",
  plot_type = c("model_fit_plot"),
  plot_caption = TRUE,
  xlab = "Year (CE)",
  ylab = "Relative Sea Level (m)",
  y_rate_lab = "Rate of change (mm per year)",
  ...
)
## S3 method for class 'reslr_output'
plot(
  x,
  plot_proxy_records = TRUE,
  plot_tide_gauges = FALSE,
  title = "",
  plot_type = c("model_fit_plot"),
  plot_caption = TRUE,
  xlab = "Year (CE)",
  ylab = "Relative Sea Level (m)",
  y_rate_lab = "Rate of change (mm per year)",
  ...
)

Arguments

`x`	An object of class `reslr_output` and `model_type` created via `reslr_mcmc`
`plot_proxy_records`	Plotting the proxy records on their own and this is the default
`plot_tide_gauges`	Plotting the tide gauge data as well as proxy data
`title`	Plotting a title on the output plots
`plot_type`	The user can select the type of output plot they require from the following: "rate_plot", "model_fit_plot", "regional_plot", "regional_rate_plot", "linear_local_plot", "non_linear_local_plot", "non_linear_local_rate_plot", "nigam_component_plot"
`plot_caption`	Plotting an informed caption with the number of tide gauges and proxy sites.
`xlab`	Labeling the x-axis
`ylab`	Labeling the y-axis
`y_rate_lab`	Labeling the y-axis for rate of change plots
`...`	Not used

Value

Plot of model fit and the rate of change depending on the statistical model in question.

Examples


data <- NAACproxydata %>% dplyr::filter(Site == "Cedar Island")
x <- reslr_load(data = data)
jags_output <- reslr_mcmc(x, model_type = "eiv_slr_t")
plot(x = jags_output)
data <- NAACproxydata %>% dplyr::filter(Site == "Cedar Island")
x <- reslr_load(data = data)
jags_output <- reslr_mcmc(x, model_type = "eiv_slr_t")
plot(x = jags_output)

Print a reslr output object which is created from the `reslr_load` function.

Description

In this function, the reslr input object is printed. This is a high-level summary which provides the number of observations and the number of sites utilised in the dataset.

Usage

## S3 method for class 'reslr_input'
print(x, ...)
## S3 method for class 'reslr_input'
print(x, ...)

Arguments

`x`	An object of class `reslr_input`
`...`	Other arguments (not supported)

Value

A neat presentation of your input reslr object

Examples

data <- NAACproxydata %>% dplyr::filter(Site == "Cedar Island")
reslr_input <- reslr_load(data = data)
print(x = reslr_input)
data <- NAACproxydata %>% dplyr::filter(Site == "Cedar Island")
reslr_input <- reslr_load(data = data)
print(x = reslr_input)

Print a reslr output object which is created by the `reslr_mcmc` function.

Description

This will be very high level printing that the user can use to obtain information about the MCMC run using JAGS. The number of iterations and chains used by the user is printed In addition, the type of statistical model is printed.

Usage

## S3 method for class 'reslr_output'
print(x, ...)
## S3 method for class 'reslr_output'
print(x, ...)

Arguments

`x`	An object of class `reslr_output`
`...`	Other arguments (not supported)

Value

Returns high level information about the reslr_output object, i.e. the number of iterations and chains used.

Examples


data <- NAACproxydata %>% dplyr::filter(Site == "Cedar Island")
input_data <- reslr_load(data = data)
jags_output <- reslr_mcmc(input_data = input_data, model_type = "eiv_slr_t")
print(x = jags_output)

data <- NAACproxydata %>% dplyr::filter(Site == "Cedar Island")
input_data <- reslr_load(data = data)
jags_output <- reslr_mcmc(input_data = input_data, model_type = "eiv_slr_t")
print(x = jags_output)

Loading in data for the `reslr` package

Description

In this function, the data provided by the user is loaded into the package. The prerequisites of the input data structure has been given in the vignettes. The user can choose to include data from tide gauges which is sourced from the Permanent Service for Mean Sea Level (PSMSL) online database.

Usage

reslr_load(
  data,
  prediction_grid_res = 50,
  include_tide_gauge = FALSE,
  include_linear_rate = FALSE,
  list_preferred_TGs = NULL,
  TG_minimum_dist_proxy = FALSE,
  all_TG_1deg = FALSE,
  input_age_type = "CE",
  sediment_average_TG = 10,
  detrend_data = FALSE,
  core_col_year = NULL,
  cross_val = FALSE,
  test_set
)
reslr_load(
  data,
  prediction_grid_res = 50,
  include_tide_gauge = FALSE,
  include_linear_rate = FALSE,
  list_preferred_TGs = NULL,
  TG_minimum_dist_proxy = FALSE,
  all_TG_1deg = FALSE,
  input_age_type = "CE",
  sediment_average_TG = 10,
  detrend_data = FALSE,
  core_col_year = NULL,
  cross_val = FALSE,
  test_set
)

Arguments

`data`	The input data as a dataframe.
`prediction_grid_res`	Resolution of grid. Predictions over every 50 years(default) can vary based on user preference, as larger values will reduce computational run time.
`include_tide_gauge`	Including tide gauge data from PSMSL website. The tide gauge data undergo a cleaning process in this function where flagged stations are removed as recommended by the online database. Next, the tide gauge data is averaged over period defined by sediment_average_TG which default is 10 years corresponding to accumulation rates of proxy records. Then, the user selects their preferred tide gauge based on three criteria: 1.nearest tide gauge to the proxy site; 2. User supplies a list of names of preferred tide gauges; 3. all tide gauges within 1 degree are chosen.
`include_linear_rate`	User decides to include linear_rate and linear_rate_err associated. This relates to linear rate which corresponds to an important physical process that impacts sea level observations which is glacial isostatic adjustment (GIA). For the linear_rate and its associated linear_rate_err, the user can provide these values as additional columns in the input dataframe. If they prefer, the package will calculate the linear_rate and the linear_rate_err using the data.
`list_preferred_TGs`	The user can supply the name or names of the preferred tide gauges from the PSMSL database.
`TG_minimum_dist_proxy`	The package finds the tide gauge closest to the proxy site
`all_TG_1deg`	The package finds all tide gauges within 1 degree of the proxy site
`input_age_type`	The inputted age in years "CE" or year "BP". The default is "CE" and is the preferred structure of the package. The package has the ability to use Before Present ("BP") observations.
`sediment_average_TG`	Average the tide gauge data to make it comparable to accumulation rates of proxy records. The default averaging period for tide gauges is 10 years and the user can alter this.
`detrend_data`	Detrend the data using the linear rate provided to remove this component.
`core_col_year`	The year the sediment core was collected in order to the data to be detrended.
`cross_val`	For the spline in time, spline in space time and the NIGAM the user can undertake cross validation to examine the
`test_set`	The test set dataframe for cross validation

Value

A list containing data frame of data and prediction grid. The output of this function is two data frames, one with the data and one with the data_grid which represent a grid with evenly spaced time points. If tide gauge data is used, an ID column is included in the two output dataframes displaying the data source, "ProxyRecord" or "TideGaugeData".

Examples


data <- NAACproxydata %>% dplyr::filter(Site == "Cedar Island")
reslr_load(data = data)
data <- NAACproxydata %>% dplyr::filter(Site == "Cedar Island")
reslr_load(data = data)

Run a `reslr_input` object through the main reslr Markov chain Monte Carlo (MCMC) function using a chosen statistical model

Description

In this function, a variety of statistical models can be run depending on the requirements of the user. All models are written within a Bayesian framework and use JAGS (Just Another Gibbs Sampler) to calculate Markov Chain Monte Carlo (MCMC) simulation to obtain estimates of unknown parameters. The user has the ability to alter the number of iterations, the number of burnin, the number of chains and the thinning. These options relate to the amount of MCMC simulations required and should be reviewed by the user to ensure model convergence is achieved without excessively long run times. The user chooses their 'model_type' and as a range of models to choose from.

Usage

reslr_mcmc(
  input_data,
  model_type,
  n_cp = 1,
  igp_smooth = 0.2,
  n_iterations = 5000,
  n_burnin = 1000,
  n_thin = 4,
  n_chains = 3,
  CI = 0.95,
  spline_nseg = NULL,
  spline_nseg_t = 20,
  spline_nseg_st = 6
)
reslr_mcmc(
  input_data,
  model_type,
  n_cp = 1,
  igp_smooth = 0.2,
  n_iterations = 5000,
  n_burnin = 1000,
  n_thin = 4,
  n_chains = 3,
  CI = 0.95,
  spline_nseg = NULL,
  spline_nseg_t = 20,
  spline_nseg_st = 6
)

Arguments

`input_data`	Input data from the `reslr_load` function
`model_type`	The user selects their statistical model type. The user can select a Errors in Variable Simple Linear Regression using "eiv_slr_t". The user can select a Errors in Variable Change Point Regression using "eiv_cp_t". The user can select a Errors in Variable Integrated Gaussian Process using "eiv_igp_t". The user can select a Noisy Input Spline in Time using "ni_spline_t". The user can select a Noisy Input Spline in Space Time using "ni_spline_st". The user can select a Noisy Input Generalised Additive Model using "ni_gam_decomp".
`n_cp`	This setting is focused on the Errors in Variables Change Point model. The user can select the number of change points 1,2 or 3.
`igp_smooth`	This setting is focused on the Errors in Variables Integrated Gaussian Process model. It informs the prior for the smoothness (correlation) parameter if model = "igp" is chosen. Choose a value between 0 and 1. Closer to 1 will increase smoothness.
`n_iterations`	Number of iterations. Increasing this value will increase the computational run time.
`n_burnin`	Size of burn-in. This number removes a certain number of samples at the beginning.
`n_thin`	Amount of thinning.
`n_chains`	Number of MCMC chains. The number of times the model will be run.
`CI`	Size of the credible interval required by the user. The default is 0.95 corresponding to 95%.
`spline_nseg`	This setting is focused on the Noisy Input Spline model. It provides the number of segments used to create basis functions.
`spline_nseg_t`	This setting is focused on the Noisy Input Generalised Additive Model. It provides the number of segments used to create basis functions.
`spline_nseg_st`	This setting is focused on the Noisy Input Generalised Additive Model. It provides the number of segments used to create basis functions.

Value

A list containing the input data, the JAGS output and output dataframes used for final plots.

Examples


data <- NAACproxydata %>% dplyr::filter(Site == "Cedar Island")
input_data <- reslr_load(data = data)
reslr_mcmc(input_data = input_data, model_type = "eiv_slr_t")
data <- NAACproxydata %>% dplyr::filter(Site == "Cedar Island")
input_data <- reslr_load(data = data)
reslr_mcmc(input_data = input_data, model_type = "eiv_slr_t")

Produces summaries and convergence diagnostics for an object created with `reslr_mcmc`.

Description

A warning message will appear if the model has not been converge. If this appears the user is recommended to re-run the model and alter the reslr_mcmc function default iteration and MCMC settings. Also, it provides high-level summaries of the estimated parameters.

Usage

## S3 method for class 'reslr_output'
summary(object, ...)
## S3 method for class 'reslr_output'
summary(object, ...)

Arguments

`object`	Output object from the `reslr_mcmc`
`...`	Not in use

Value

A list containing convergence diagnostics and parameter estimates for the output.

Examples


data <- NAACproxydata %>% dplyr::filter(Site == "Cedar Island")
input_data <- reslr_load(data = data)
jags_output <- reslr_mcmc(input_data = input_data, model_type = "eiv_slr_t")
summary(object = jags_output)
data <- NAACproxydata %>% dplyr::filter(Site == "Cedar Island")
input_data <- reslr_load(data = data)
jags_output <- reslr_mcmc(input_data = input_data, model_type = "eiv_slr_t")
summary(object = jags_output)

Package 'reslr'

Help Index

Cross validation check for spline in time, spline in space time and GAM in order to select the most appropriate number of knots when creating basis functions.

Description

Usage

Arguments

Value

Examples

Relative Sea level example dataset

Description

Usage

Format

Source

Plot raw data with measurement uncertainty.

Description

Usage

Arguments

Value

Examples

Plotting the results for each statistical model from the reslr_mcmc function.

Description

Usage

Arguments

Value

Examples

Print a reslr output object which is created from the reslr_load function.

Description

Usage

Arguments

Value

Examples

Print a reslr output object which is created by the reslr_mcmc function.

Description

Usage

Arguments

Value

Examples

Loading in data for the reslr package

Description

Usage

Arguments

Value

Examples

Run a reslr_input object through the main reslr Markov chain Monte Carlo (MCMC) function using a chosen statistical model

Description

Usage

Arguments

Value

Examples

Produces summaries and convergence diagnostics for an object created with reslr_mcmc.

Description

Usage

Arguments

Value

Examples

Plotting the results for each statistical model from the `reslr_mcmc` function.

Print a reslr output object which is created from the `reslr_load` function.

Print a reslr output object which is created by the `reslr_mcmc` function.

Loading in data for the `reslr` package

Run a `reslr_input` object through the main reslr Markov chain Monte Carlo (MCMC) function using a chosen statistical model

Produces summaries and convergence diagnostics for an object created with `reslr_mcmc`.