Title: | Calibration Assistant and Post-Processing Tool for Aquatic Ecosystem Model DYRESM-CAEDYM |
---|---|
Description: | Supports Dynamic Reservoir Simulation Model (DYRESM) and Computational Aquatic Ecosystem Dynamics Model (CAEDYM) model development, including assistance with calibrating selected model parameters and visualising model output through time series, profile, contour, and scatter plots. For more details, see Yu et al. (2023) <https://journal.r-project.org/articles/RJ-2023-008/>. |
Authors: | Songyan Yu [aut, cre] , Christopher McBride [ctb], Marieke Frassl [ctb] |
Maintainer: | Songyan Yu <[email protected]> |
License: | MIT + file LICENSE |
Version: | 0.4.4 |
Built: | 2025-01-10 07:21:07 UTC |
Source: | CRAN |
This function carries out simulations with a large number of possible combinations of parameter values that users regard as potentially suitable for their model calibration, and calculates the values of nominated objective functions (i.e., statistical measures of goodness of fit) for each combination. Based on the calculated objective function values, users can determine the optimal set(s) of parameter values or narrow the ranges of possible parameter values.
calib_assist( cal.para, combination = "random", n, model.var, phyto.group = NA, obs.data, objective.function = c("NSE", "RMSE"), start.date, end.date, dycd.wd, dycd.output, file.name, verbose = TRUE, parallel = FALSE, n.cores = NULL, write.out = TRUE )
cal.para |
a data frame, or a character string naming an external .csv file, in which the following columns are mandatory: "Parameter", giving parameter names (abbreviations are allowed); "Min", "Max", and "Increment", giving the minimum and maximum parameter values and the increment within that range; and "Input_file" and "Line_NO", giving the configuration file and the line number where the parameter is found. |
combination |
a character string specifying how combinations of parameter values are chosen. "random": the function randomly picks a given number of combinations; "all": the function tries all possible combinations of parameter values. |
n |
the number of random selections. Must be provided if combination = "random". |
model.var |
a character vector of modelled variables to calibrate. Each element should come from the 'var.name' column of 'data(output_name)'. If the calibration needs to treat the chlorophyll of multiple phytoplankton groups as a whole, model.var should use "CHLA" and the individual phytoplankton groups should be specified through the "phyto.group" argument. If phytoplankton groups are calibrated separately, simply list their names in this argument (model.var). |
phyto.group |
a vector of simulated phytoplankton groups, including CHLOR, FDIAT, NODUL, CYANO and CRYPT. |
obs.data |
a data frame, or a character string naming a .csv file, of observed lake data. The observed lake data must include the following columns: 1) 'Date' in the format "%Y-%m-%d", 2) 'Depth' (integer), and 3) water quality variables (using the model variable names as column names). See the example data 'data(obs_temp)'. |
objective.function |
a character vector naming the objective function(s) to be used for calibration, selected from the following five: "NSE": Nash-Sutcliffe efficiency coefficient, "RMSE": Root Mean Square Error, "MAE": Mean Absolute Error, "RAE": Relative Absolute Error, "Pearson": Pearson's r. |
start.date, end.date |
the beginning and end simulation dates for the intended DYRESM-CAEDYM calibration. The date format must be "%Y-%m-%d". The two dates should be consistent with model configurations. |
dycd.wd |
the directory where input files (including the bat file) to DYRESM-CAEDYM are stored. |
dycd.output |
a character string naming the output file of model simulation. |
file.name |
a character string naming the .csv file to which the results of this function are written. Needed if 'write.out' = TRUE. |
verbose |
if TRUE, model calibration information is printed. |
parallel |
if TRUE, the calibration process is run on multiple cores. |
n.cores |
When 'parallel' is TRUE, n.cores is the number of cores on which the calibration function is run. If not provided, it defaults to one less than the number of available cores on the computer. |
write.out |
if TRUE, model calibration results are saved in a file with a file name set by the "file.name" argument. |
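As a rough illustration of the cal.para and combination arguments, the base R sketch below builds a parameter table with the mandatory columns and enumerates candidate parameter values. The parameter names, value ranges, file names, and line numbers are hypothetical placeholders, and the enumeration only mimics what "all" and "random" are described as doing; it is not the package's internal code.

```r
# Hypothetical calibration table; every value below is a placeholder.
cal.para <- data.frame(
  Parameter  = c("param_a", "param_b"),
  Min        = c(0.1, 10),
  Max        = c(0.3, 30),
  Increment  = c(0.1, 10),
  Input_file = c("example.par", "example.cfg"),
  Line_NO    = c(12, 7),
  stringsAsFactors = FALSE
)

# Candidate values for each parameter, from Min to Max in steps of Increment.
candidates <- Map(seq, cal.para$Min, cal.para$Max, cal.para$Increment)
names(candidates) <- cal.para$Parameter

# combination = "all": every possible combination of candidate values.
all.comb <- expand.grid(candidates)

# combination = "random": n combinations drawn at random from the full set
# (sampling without replacement is an assumption of this sketch).
set.seed(1)
n <- 4
random.comb <- all.comb[sample(nrow(all.comb), n), ]
```

With three candidate values per parameter, the full grid has nine combinations, which shows how quickly "all" grows as parameters are added and why "random" with a chosen n can be the more practical option.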
a data frame of all tested values of parameters and corresponding values of the objective function(s).
No executable examples are provided to illustrate the use of this function, as this function relies on the DYRESM-CAEDYM executables to work.
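Although the function itself cannot be run without the DYRESM-CAEDYM executables, the way its result might be used can be sketched on mock data. The data frame below is invented for illustration; its column names and values are assumptions, not actual output of calib_assist.

```r
# Mock calibration results; the column names and values are illustrative only.
results <- data.frame(
  param_a = c(0.1, 0.2, 0.3, 0.1),
  param_b = c(10, 10, 20, 30),
  NSE     = c(0.62, 0.81, 0.47, 0.75),
  RMSE    = c(1.9, 1.2, 2.4, 1.4)
)

# Rank combinations: higher NSE is better, lower RMSE breaks ties.
ranked <- results[order(-results$NSE, results$RMSE), ]
best <- ranked[1, ]

# Narrow the parameter ranges to those spanned by the top three combinations.
top <- head(ranked, 3)
narrowed <- sapply(top[c("param_a", "param_b")], range)
```

Here best holds the single top-ranked combination, while narrowed gives a minimum and maximum per parameter that could seed a finer second calibration round.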
Change a parameter value in an input file of the DYRESM-CAEDYM model.
change_input_file(input_file, row_no, new_value)
input_file |
vector of input format, such as "par","cfg". |
row_no |
the row number at which the variable of interest is found in the input file. |
new_value |
the new value that will be assigned to the variable of interest. |
the updated input_file, with the new value assigned to the parameter.
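The idea of replacing a value at a known row can be sketched in base R as follows. This is not the package's implementation: the helper name replace_value is hypothetical, and the sketch assumes each line of the file holds a value followed by whitespace and a trailing description.

```r
# Replace the leading value on one line of a plain-text file.
# Assumption: each line reads "<value><whitespace><description>".
replace_value <- function(file, row_no, new_value) {
  lines <- readLines(file)
  lines[row_no] <- sub("^\\S+", as.character(new_value), lines[row_no])
  writeLines(lines, file)
  invisible(lines)
}

# Try it on a temporary file with made-up content.
tmp <- tempfile(fileext = ".par")
writeLines(c("0.0013   # first hypothetical parameter",
             "0.08     # second hypothetical parameter"), tmp)
replace_value(tmp, row_no = 2, new_value = 0.12)
updated <- readLines(tmp)
```

Only the first run of non-whitespace characters on the chosen line is replaced, so the trailing description and all other lines are left untouched.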
Delete all whitespace until a non-whitespace character.
delete_space(extract_val)
extract_val |
a vector. |
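Reading the description as "strip leading whitespace", the same effect can be sketched in base R as below. The helper name strip_leading is hypothetical and the sketch is not the package's own code.

```r
# Strip leading whitespace from each element of a character vector.
strip_leading <- function(x) sub("^\\s+", "", x)

trimmed <- strip_leading(c("   0.25", "\t12 # comment", "none"))
```

Base R's trimws(x, which = "left") gives the same result for these inputs.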
Extract simulation outputs from a DYRESM-CAEDYM model run.
ext_output(dycd.output, var.extract, verbose = FALSE)
dycd.output |
a character string giving the file path to the NetCDF output file of a DYRESM-CAEDYM model run. |
var.extract |
a vector of variables to be extracted from the output. Please refer to the var.name column of data(output_name) for accepted variable names. Apart from the user-nominated variables, simulation period and layer height data are also extracted. |
verbose |
if TRUE, the information about the extraction process is printed. |
a list of values of the variables of interest, as well as two compulsory variables (i.e., simulation period and layer height).
# extract simulated temperature values from DYRESM-CAEDYM simulation file
var.values <- ext_output(dycd.output = system.file("extdata", "dysim.nc",
                                                   package = "dycdtools"),
                         var.extract = c("TEMP"))
# store each extracted variable as a data frame named after it
for (i in seq_along(var.values)) {
  assign(names(var.values)[i], data.frame(var.values[[i]]))
}
Convert from height to depth.
hgt_to_dpt(height)
height |
a vector of height profile |
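A minimal sketch of the height-to-depth idea, assuming heights are measured upward from the lake bottom and the largest height marks the water surface. The package may handle layer thickness or mid-points differently, so treat this as an illustration of the concept rather than a reproduction of hgt_to_dpt.

```r
# Heights measured upward from the lake bottom (m), one per layer.
height <- c(2, 5, 8, 10)

# Assumption: the largest height marks the water surface (depth 0).
depth <- max(height) - height
```

Under this assumption the top layer sits at depth 0 and the deepest layer at 8 m.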
The default simulation results of a water quality variable from DYRESM-CAEDYM are usually at irregular layer heights. This function converts them to a data frame with regular layer heights through interpolation.
interpol(layerHeights, var, min.depth, max.depth, by.value)
layerHeights |
layer heights output by a DYRESM-CAEDYM model run; they can be generated with the 'ext_output' function. |
var |
simulation results of a water quality variable; they can also be generated with the 'ext_output' function. |
min.depth, max.depth, by.value |
minimum and maximum layer depths within which interpolation will be conducted. by.value sets the depth increment between two adjacent layers. |
a matrix of interpolated values of the water quality variable(s).
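The regridding step for a single time step can be sketched in base R with approx(). Linear interpolation and the height-to-depth conversion used here are assumptions of this sketch, and the layer values are made up; the package's own method may differ.

```r
# Irregular layer heights (m above bottom) and temperatures for one time step.
layer.height <- c(1.2, 3.5, 6.1, 9.4, 13.0)
layer.temp   <- c(11.0, 11.5, 13.2, 17.8, 21.0)

# Convert to depth below the surface, assuming the top layer is the surface.
layer.depth <- max(layer.height) - layer.height

# Linear interpolation onto a regular grid from min.depth to max.depth.
min.depth <- 0
max.depth <- 10
by.value  <- 0.5
grid <- seq(min.depth, max.depth, by = by.value)
temp.regular <- approx(x = layer.depth, y = layer.temp,
                       xout = grid, rule = 2)$y
```

Repeating this for every time step and binding the results column by column would give a matrix with one row per regular depth and one column per date.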
# extract simulated temperature values from DYRESM-CAEDYM simulation file
var.values <- ext_output(dycd.output = system.file("extdata", "dysim.nc",
                                                   package = "dycdtools"),
                         var.extract = c("TEMP"))
# store each extracted variable as a data frame named after it
for (i in seq_along(var.values)) {
  assign(names(var.values)[i], data.frame(var.values[[i]]))
}
# interpolate temperature for depths from 0 to 13 m at increment of 0.5 m
temp.interpolated <- interpol(layerHeights = dyresmLAYER_HTS_Var,
                              var = dyresmTEMPTURE_Var,
                              min.depth = 0, max.depth = 13, by.value = 0.5)
Calculate the following five objective functions, which are commonly used to measure goodness of fit: 1) Nash-Sutcliffe Efficiency coefficient (NSE), 2) Root Mean Square Error (RMSE), 3) Mean Absolute Error (MAE), 4) Relative Absolute Error (RAE), and 5) Pearson's r (Pearson).
objective_fun( sim, obs, fun = c("NSE", "RMSE"), start.date, end.date, min.depth, max.depth, by.value )
sim |
a matrix of simulated values of a water quality variable, with columns representing time and rows representing depth. This matrix can be generated by running the "interpol" function. |
obs |
a data frame with three columns describing observed values of a water quality variable. The three columns are 'Date' (as '%Y-%m-%d'), 'Depth', and the designated variable name, which can be found in the var.name column of 'data(output_name)'. An example of such a data frame can be found with 'data(obs_temp)'. |
fun |
objective function(s) to be calculated. Select any from 'NSE', 'RMSE', 'MAE', 'RAE', and 'Pearson'. Multiple selections are allowed. |
start.date, end.date |
the start and end simulation dates for the DYRESM-CAEDYM model run. The date format must be "%Y-%m-%d". |
min.depth, max.depth |
the minimum and maximum depths of the simulation matrix. |
by.value |
the increment at which layer depth increases from min.depth to max.depth in the simulation matrix. |
a list of objective function values.
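For reference, the five measures can be written out on plain paired vectors as below. These are the common textbook definitions, stated here as an assumption; the package may differ in details such as how simulated and observed values are paired or how missing values are handled. The helper names and sample numbers are illustrative.

```r
# Textbook definitions on paired vectors of simulated and observed values.
nse     <- function(sim, obs) 1 - sum((obs - sim)^2) / sum((obs - mean(obs))^2)
rmse    <- function(sim, obs) sqrt(mean((obs - sim)^2))
mae     <- function(sim, obs) mean(abs(obs - sim))
rae     <- function(sim, obs) sum(abs(obs - sim)) / sum(abs(obs - mean(obs)))
pearson <- function(sim, obs) cor(sim, obs, method = "pearson")

# Made-up paired values.
obs <- c(1, 2, 3, 4)
sim <- c(1.1, 1.9, 3.2, 3.8)

scores <- c(NSE = nse(sim, obs), RMSE = rmse(sim, obs), MAE = mae(sim, obs),
            RAE = rae(sim, obs), Pearson = pearson(sim, obs))
```

NSE and Pearson's r approach 1 for a good fit, whereas RMSE, MAE, and RAE approach 0, which matters when ranking parameter combinations on more than one measure.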
A table with three columns. The first column, Date, gives the monitoring date in the form of dd-mm-YY. The second column, Depth, gives the depth at which temperature was monitored. The third column gives the monitored temperature value.
data(obs_temp)
A data frame with 77 rows and 3 variables:
date when the monitoring happened
depth of monitoring
temperature value
self-made.
A table with two columns. The first column, var.name, gives the variable names that are used in the ext_output function. The second column gives the default DYCD simulation variable names, such as "dyresmLAYER_HTS_Var".
data(output_name)
A data frame with 65 rows and 2 variables:
variable name
default DYCD simulation variable name
self-made.
Contour plot of a matrix of values of a water quality variable.
plot_cont( sim, sim.start, sim.end, legend.title, min.depth, max.depth, by.value, nlevels )
sim |
a matrix of simulated variables. This matrix can be generated by running the "interpol" function. |
sim.start, sim.end |
the start and end dates of the simulation period for the DYRESM-CAEDYM model run of interest. The date format must be "%Y-%m-%d". |
legend.title |
the legend title of the contour figure. |
min.depth, max.depth, by.value |
minimum and maximum depths defining the y axis of the contour plot, at increments of by.value. |
nlevels |
the number of levels used to partition the range of the simulated variable. |
This function is NOT based on ggplot2. To save the produced figure, users can use functions like png, bmp, jpeg, etc.
This function returns a filled.contour object.
sim <- matrix(c(28,28,28,27,25,24,12,13,14,15,16,17), nrow = 6, ncol = 2)
# contour plot of the sim matrix
p <- plot_cont(sim = sim,
               sim.start = "2020-01-01", sim.end = "2020-01-02",
               legend.title = "T \u00B0C",
               min.depth = 0, max.depth = 5, by.value = 1, nlevels = 20)
p
Contour plot of a matrix of values of a water quality variable, with observed data shown as colour-coded dots.
plot_cont_comp( sim, obs, sim.start, sim.end, plot.start, plot.end, legend.title, min.depth, max.depth, by.value, nlevels = 20 )
sim |
a matrix of simulated variables. This matrix can be generated by running the "interpol" function. |
obs |
a data frame with three columns describing observed values of a water quality variable. The three columns are 'Date' (as '%Y-%m-%d'), 'Depth', and the designated variable name, which can be found in the var.name column of 'data(output_name)'. An example of such a data frame can be found with 'data(obs_temp)'. |
sim.start, sim.end |
the start and end dates of the simulation period for the DYRESM-CAEDYM model run of interest. The date format must be "%Y-%m-%d". |
plot.start, plot.end |
the start and end dates of the period to be plotted, in the format of "%Y-%m-%d". |
legend.title |
the legend title of the contour figure. |
min.depth, max.depth, by.value |
minimum and maximum depths defining the y axis of the contour plot, at increments of by.value. |
nlevels |
the number of levels used to partition the range of the simulated variable. |
This function is NOT based on ggplot2. To save the produced figure, users can use functions like png, bmp, jpeg, etc.
This function returns a filled.contour object.
obs <- data.frame(Date = c(rep('2020-01-01', 6), rep('2020-01-02', 6)),
                  Depth = rep(0:5, 2),
                  TEMP = rep(29:24, 2))
sim <- matrix(c(28,28,28,27,25,24,12,13,14,15,16,17), nrow = 6, ncol = 2)
# contour plot of temperature simulations
# with observed data shown as colour-coded dots
p <- plot_cont_comp(sim = sim, obs = obs,
                    sim.start = "2020-01-01", sim.end = "2020-01-02",
                    plot.start = "2020-01-01", plot.end = "2020-01-02",
                    legend.title = "T \u00B0C",
                    min.depth = 0, max.depth = 5, by.value = 1, nlevels = 20)
p
Profile plot shows vertical profiles of simulation outputs and corresponding observations for all dates where observations are available.
plot_prof( sim, obs, sim.start, sim.end, plot.start, plot.end, xlabel, min.depth, max.depth, by.value )
sim |
a matrix of simulated variables. This matrix can be generated by running the "interpol" function. |
obs |
a data frame with three columns describing observed values of a water quality variable. The three columns are 'Date' (as '%Y-%m-%d'), 'Depth', and the designated variable name, which can be found in the var.name column of 'data(output_name)'. An example of such a data frame can be found with 'data(obs_temp)'. This function is based on ggplot2, and users can treat the object of this function in the same way as a ggplot2 object. |
sim.start, sim.end |
the start and end dates of the simulation period of the DYRESM-CAEDYM model run of interest. The date format must be "%Y-%m-%d". |
plot.start, plot.end |
the start and end dates of the period to be plotted in the format of "%Y-%m-%d". |
xlabel |
the x axis label of the profile figure. |
min.depth, max.depth, by.value |
minimum and maximum depths in the profile plot at an increment of by.value. |
This function returns a ggplot object that can be modified with ggplot package functions.
# extract simulated temperature values from DYRESM-CAEDYM simulation file
var.values <- ext_output(dycd.output = system.file("extdata", "dysim.nc",
                                                   package = "dycdtools"),
                         var.extract = c("TEMP"))
# store each extracted variable as a data frame named after it
for (i in seq_along(var.values)) {
  assign(names(var.values)[i], data.frame(var.values[[i]]))
}
# interpolate temperature for depths from 0 to 13 m at increment of 0.5 m
temp.interpolated <- interpol(layerHeights = dyresmLAYER_HTS_Var,
                              var = dyresmTEMPTURE_Var,
                              min.depth = 0, max.depth = 13, by.value = 0.5)
data(obs_temp)
# profile plot of temperature sim and obs
p <- plot_prof(sim = temp.interpolated, obs = obs_temp,
               sim.start = "2017-06-06", sim.end = "2017-06-15",
               plot.start = "2017-06-06", plot.end = "2017-06-15",
               xlabel = "Temperature \u00B0C",
               min.depth = 0, max.depth = 13, by.value = 0.5)
p
Scatter plot of the simulation and observation of a water quality variable. This function is based on ggplot2, and users can treat the object of this function in the same way as a ggplot2 object.
plot_scatter( sim, obs, sim.start, sim.end, plot.start, plot.end, min.depth, max.depth, by.value )
sim |
a matrix of simulated variables. This matrix can be generated by running the "interpol" function. |
obs |
a data frame with three columns describing observed values of a water quality variable. The three columns are 'Date' (as '%Y-%m-%d'), 'Depth', and the designated variable name, which can be found in the var.name column of 'data(output_name)'. An example of such a data frame can be found with 'data(obs_temp)'. |
sim.start, sim.end |
the start and end dates of the simulation period of the DYRESM-CAEDYM model run of interest. The date format must be "%Y-%m-%d". |
plot.start, plot.end |
the start and end dates of the period to be plotted in the format of "%Y-%m-%d". |
min.depth, max.depth, by.value |
minimum and maximum depths in the profile plot at an increment of by.value. |
This function returns a ggplot object that can be modified with ggplot package functions.
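Before simulated and observed values can be compared in a scatter plot, each observation has to be paired with the matching cell of the simulation matrix. The base R sketch below shows one way to do that pairing; it assumes rows are depths starting at min.depth in steps of by.value and columns are consecutive days starting at sim.start. The data are made up and this is not the package's internal code.

```r
# Simulated matrix: rows are depths 0-5 m (by 1 m), columns are daily dates.
sim <- matrix(c(28, 28, 28, 27, 25, 24,
                12, 13, 14, 15, 16, 17), nrow = 6, ncol = 2)
sim.start <- "2020-01-01"
min.depth <- 0
by.value  <- 1

# Made-up observations in the Date / Depth / variable layout.
obs <- data.frame(Date  = c("2020-01-01", "2020-01-02", "2020-01-02"),
                  Depth = c(0, 2, 5),
                  TEMP  = c(29, 15, 16))

# Locate the matrix cell matching each observation (daily step assumed).
row.id <- round((obs$Depth - min.depth) / by.value) + 1
col.id <- as.integer(as.Date(obs$Date) - as.Date(sim.start)) + 1

# Paired values, one row per observation.
paired <- data.frame(obs = obs$TEMP, sim = sim[cbind(row.id, col.id)])
```

The resulting paired data frame has one observed and one simulated value per row, which is the shape needed for a scatter plot or for computing goodness-of-fit measures.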
# extract simulated temperature values from DYRESM-CAEDYM simulation file
var.values <- ext_output(dycd.output = system.file("extdata", "dysim.nc",
                                                   package = "dycdtools"),
                         var.extract = c("TEMP"))
# store each extracted variable as a data frame named after it
for (i in seq_along(var.values)) {
  assign(names(var.values)[i], data.frame(var.values[[i]]))
}
# interpolate temperature for depths from 0 to 13 m at increment of 0.5 m
temp.interpolated <- interpol(layerHeights = dyresmLAYER_HTS_Var,
                              var = dyresmTEMPTURE_Var,
                              min.depth = 0, max.depth = 13, by.value = 0.5)
data(obs_temp)
# scatter plot of sim and obs temperature
p <- plot_scatter(sim = temp.interpolated, obs = obs_temp,
                  sim.start = "2017-06-06", sim.end = "2017-06-15",
                  plot.start = "2017-06-06", plot.end = "2017-06-15",
                  min.depth = 0, max.depth = 13, by.value = 0.5)
p
Time series plot of simulated and observed values at target depths. This function is based on ggplot2, and users can treat the object of this function in the same way as a ggplot2 object.
plot_ts( sim, obs, target.depth, sim.start, sim.end, plot.start, plot.end, min.depth, max.depth, by.value, ylabel )
sim |
a matrix of simulated variables. This matrix can be generated by running the "interpol" function. |
obs |
a data frame with three columns describing observed values of a water quality variable. The three columns are 'Date' (as '%Y-%m-%d'), 'Depth', and the designated variable name, which can be found in the var.name column of 'data(output_name)'. An example of such a data frame can be found with 'data(obs_temp)'. |
target.depth |
a vector of depths (unit: m) for which time series simulation results will be plotted. |
sim.start, sim.end |
the start and end dates of the simulation period of the DYRESM-CAEDYM model run of interest. The date format must be "%Y-%m-%d". |
plot.start, plot.end |
the start and end dates of the period to be plotted in the format of "%Y-%m-%d". |
min.depth, max.depth, by.value |
minimum and maximum depths in the profile plot at an increment of by.value. |
ylabel |
the y axis title of the time series plot. |
This function returns a ggplot object that can be modified with ggplot package functions.
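To show how target.depth relates to the simulation matrix, the base R sketch below pulls out the time series at selected depths. It assumes rows follow the regular depth grid from min.depth to max.depth and columns are consecutive days from sim.start to sim.end; the values are made up and the code is not taken from the package.

```r
# Simulated matrix: rows are depths 0-5 m (by 1 m), columns are daily dates.
sim <- matrix(c(28, 28, 28, 27, 25, 24,
                12, 13, 14, 15, 16, 17), nrow = 6, ncol = 2)
sim.start <- "2020-01-01"
sim.end   <- "2020-01-02"
min.depth <- 0
max.depth <- 5
by.value  <- 1
target.depth <- c(1, 4)

# Date and depth labels implied by the matrix layout (daily step assumed).
dates  <- seq(as.Date(sim.start), as.Date(sim.end), by = "day")
depths <- seq(min.depth, max.depth, by = by.value)

# One row per date and target depth, ready for a time series plot.
ts.data <- do.call(rbind, lapply(target.depth, function(d) {
  data.frame(Date = dates, Depth = d, Value = sim[match(d, depths), ])
}))
```

Each target depth must fall on the regular depth grid for match() to find its row, which is one reason the plotting arguments min.depth, max.depth, and by.value need to agree with those used for interpolation.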
# extract simulated temperature values from DYRESM-CAEDYM simulation file
var.values <- ext_output(dycd.output = system.file("extdata", "dysim.nc",
                                                   package = "dycdtools"),
                         var.extract = c("TEMP"))
# store each extracted variable as a data frame named after it
for (i in seq_along(var.values)) {
  assign(names(var.values)[i], data.frame(var.values[[i]]))
}
# interpolate temperature for depths from 0 to 13 m at increment of 0.5 m
temp.interpolated <- interpol(layerHeights = dyresmLAYER_HTS_Var,
                              var = dyresmTEMPTURE_Var,
                              min.depth = 0, max.depth = 13, by.value = 0.5)
data(obs_temp)
# time series plot of temperature sim and obs
p <- plot_ts(sim = temp.interpolated, obs = obs_temp,
             target.depth = c(1, 6),
             sim.start = "2017-06-06", sim.end = "2017-06-15",
             plot.start = "2017-06-06", plot.end = "2017-06-15",
             ylabel = "Temperature \u00B0C",
             min.depth = 0, max.depth = 13, by.value = 0.5)
p
Internal function to provide parallel processing support to the calibration assistant function.
run_iteration(this.sim, dycd.wd)
this.sim |
a number denoting which parameter combination is to be tried. |
dycd.wd |
working directory where input files (including the bat file) to DYRESM-CAEDYM are stored. |