| Title: | Linear and Nonlinear Regression for Agricultural Data |
|---|---|
| Description: | Fit, compare, and visualise linear and nonlinear regression models tailored to field-trial and dose-response agricultural data. Provides S3 classes for mixed-effects models (via 'lme4'), nonlinear growth curves (logistic, 'Gompertz', asymptotic, linear-plateau, quadratic), and four/five-parameter log-logistic dose-response models (via 'drc'). Includes automated starting-value heuristics, goodness-of-fit statistics, residual diagnostics, and 'ggplot2'-based visualisation. Methods are based on Bates and Watts (1988, ISBN:9780471816430), Ritz and others (2015) <doi:10.1371/journal.pone.0146021>, and Bates and others (2015) <doi:10.18637/jss.v067.i01>. |
| Authors: | Sadikul Islam [aut, cre] (ORCID: <https://orcid.org/0000-0003-2924-7122>) |
| Maintainer: | Sadikul Islam <[email protected]> |
| License: | MIT + file LICENSE |
| Version: | 0.1.0 |
| Built: | 2026-06-01 07:16:13 UTC |
| Source: | https://github.com/cran/agriReg |
agriReg provides a unified interface for fitting, comparing, and
visualising regression models commonly used in agricultural research:
Linear & mixed-effects models via fit_linear() (wraps lm / lme4::lmer)
Nonlinear growth curves via fit_nonlinear() — logistic, Gompertz,
asymptotic, linear-plateau, quadratic
Dose-response models via fit_dose_response() — 4/5-parameter
log-logistic and Weibull (wraps drc)
Model comparison via compare_models() — AIC, BIC, RMSE side by side
Diagnostics via gof_stats(), residual_check(), and S3 plot() methods
library(agriReg)
# 1. Clean field trial data
trial <- clean_agri_data(my_data, yield_col = "yield_tha")
# 2. Linear mixed model (treatment + block structure)
lm_fit <- fit_linear(trial,
formula = "yield_tha ~ nitrogen + phosphorus",
random = "(1|block)")
summary(lm_fit)
plot(lm_fit, type = "residuals")
# 3. Nonlinear growth curve
nl_fit <- fit_nonlinear(trial, x_col = "days", y_col = "biomass",
model = "logistic")
plot(nl_fit)
# 4. Dose-response
dr_fit <- fit_dose_response(herb_data,
dose_col = "dose_g_ha",
resp_col = "weed_control_pct")
ed_estimates(dr_fit, respLev = c(50, 90))
# 5. Compare models
compare_models(linear = lm_fit, nonlinear = nl_fit)
Maintainer: Sadikul Islam [email protected] (ORCID)
Removes rows with missing yield values, coerces the yield column to
numeric, optionally flags IQR-based outliers, and returns an agriData
object with a concise summary.
clean_agri_data( df, yield_col = "yield", flag_outliers = TRUE, na_action = c("remove", "warn") )clean_agri_data( df, yield_col = "yield", flag_outliers = TRUE, na_action = c("remove", "warn") )
df |
A |
yield_col |
Character. Name of the column that holds the primary
response variable (e.g. |
flag_outliers |
Logical. When |
na_action |
One of |
A data.frame with additional class "agriData". Attributes
n_removed (rows dropped) and n_flagged (outliers flagged) are
attached.
df <- data.frame( yield = c(3.1, 4.2, NA, 8.9, 3.8, 100), block = c("A", "A", "B", "B", "C", "C") ) clean_agri_data(df, yield_col = "yield")df <- data.frame( yield = c(3.1, 4.2, NA, 8.9, 3.8, 100), block = c("A", "A", "B", "B", "C", "C") ) clean_agri_data(df, yield_col = "yield")
Computes AIC, BIC, RMSE, MAE, and R² for a collection of agriLM,
agriNLS, or agriDRC objects and returns them in a tidy data frame
ranked by AIC.
compare_models(...)compare_models(...)
... |
Named model objects. Names are used as row identifiers. If
unnamed, models are labelled |
A data.frame with columns model, engine, n_par, AIC,
BIC, RMSE, MAE, R2, and delta_AIC (difference from best).
dat <- data.frame( nitrogen = rep(c(0, 40, 80, 120, 160), each = 8), yield = c(2.1, 2.8, 3.6, 4.1, 4.3, 2.0, 2.9, 3.5, 4.2, 4.4, 2.2, 2.7, 3.7, 4.0, 4.5, 2.1, 3.0, 3.4, 4.3, 4.2, 2.3, 2.8, 3.6, 4.1, 4.3, 2.0, 2.9, 3.5, 4.2, 4.4, 2.2, 2.7, 3.7, 4.0, 4.5, 2.1, 3.0, 3.4, 4.3, 4.2) ) m1 <- fit_linear(dat, "yield ~ nitrogen") m2 <- fit_nonlinear(dat, "nitrogen", "yield", "quadratic") compare_models(linear = m1, quadratic = m2)dat <- data.frame( nitrogen = rep(c(0, 40, 80, 120, 160), each = 8), yield = c(2.1, 2.8, 3.6, 4.1, 4.3, 2.0, 2.9, 3.5, 4.2, 4.4, 2.2, 2.7, 3.7, 4.0, 4.5, 2.1, 3.0, 3.4, 4.3, 4.2, 2.3, 2.8, 3.6, 4.1, 4.3, 2.0, 2.9, 3.5, 4.2, 4.4, 2.2, 2.7, 3.7, 4.0, 4.5, 2.1, 3.0, 3.4, 4.3, 4.2) ) m1 <- fit_linear(dat, "yield ~ nitrogen") m2 <- fit_nonlinear(dat, "nitrogen", "yield", "quadratic") compare_models(linear = m1, quadratic = m2)
Returns the dose required to achieve a specified percentage of the maximum response, with confidence intervals via the delta method.
ed_estimates( drc_fit, respLev = c(10, 50, 90), interval = "delta", level = 0.95 )ed_estimates( drc_fit, respLev = c(10, 50, 90), interval = "delta", level = 0.95 )
drc_fit |
An |
respLev |
Numeric vector of response levels (default |
interval |
Type of confidence interval: |
level |
Confidence level (default |
A matrix of ED estimates and confidence bounds (printed and returned invisibly).
A wrapper around drc::drm() providing the four model families most
commonly used in herbicide/pesticide and nutrient-response studies.
Returns an agriDRC object with print, summary, and plot methods.
fit_dose_response( data, dose_col, resp_col, fct = c("LL.4", "LL.5", "W1.4", "W2.4"), curveid = NULL, ... )fit_dose_response( data, dose_col, resp_col, fct = c("LL.4", "LL.5", "W1.4", "W2.4"), curveid = NULL, ... )
data |
A data frame containing dose and response columns. |
dose_col |
Character. Name of the dose/concentration column. |
resp_col |
Character. Name of the response column (e.g. percentage inhibition, weed biomass reduction, germination rate). |
fct |
Character. Model function:
|
curveid |
Optional column name for grouping curves (e.g. |
... |
Additional arguments forwarded to |
An object of class "agriDRC" containing:
The underlying drc object.
The model function name.
Column names used.
The data used.
The matched call.
herbicide_trial <- load_example_data("herbicide_trial") dr <- fit_dose_response(herbicide_trial, dose_col = "dose_g_ha", resp_col = "weed_control_pct") summary(dr) ed_estimates(dr, respLev = c(10, 50, 90)) plot(dr)herbicide_trial <- load_example_data("herbicide_trial") dr <- fit_dose_response(herbicide_trial, dose_col = "dose_g_ha", resp_col = "weed_control_pct") summary(dr) ed_estimates(dr, respLev = c(10, 50, 90)) plot(dr)
A wrapper around stats::lm() and lme4::lmer() that returns an agriLM
object with consistent print, summary, plot, coef, fitted,
residuals, and predict S3 methods.
fit_linear(data, formula, random = NULL, weights = NULL, ...)fit_linear(data, formula, random = NULL, weights = NULL, ...)
data |
A data frame or |
formula |
A model formula as a character string or formula object.
Example: |
random |
Optional character string specifying the random-effects term
in |
weights |
Optional numeric vector of observation weights passed to
|
... |
Additional arguments forwarded to |
An object of class "agriLM" (a named list) containing:
The underlying lm or lmerMod object.
Character: "lm" or "lmer".
The fixed-effects formula.
The random-effects term (or NULL).
The data used for fitting.
The matched call.
wheat_trial <- load_example_data("wheat_trial") # Ordinary least squares m1 <- fit_linear(wheat_trial, "yield ~ nitrogen + phosphorus") # Mixed model with block as random effect m2 <- fit_linear(wheat_trial, "yield ~ nitrogen + phosphorus", random = "(1|block)") summary(m2) plot(m2, type = "residuals")wheat_trial <- load_example_data("wheat_trial") # Ordinary least squares m1 <- fit_linear(wheat_trial, "yield ~ nitrogen + phosphorus") # Mixed model with block as random effect m2 <- fit_linear(wheat_trial, "yield ~ nitrogen + phosphorus", random = "(1|block)") summary(m2) plot(m2, type = "residuals")
Provides a unified interface for five model families commonly used in
agronomic research. Automatic starting-value heuristics are applied when
start is not supplied, making the function usable without prior
knowledge of parameter ranges.
fit_nonlinear( data, x_col, y_col, model = c("logistic", "gompertz", "asymptotic", "linear_plateau", "quadratic"), start = NULL, control = nls.control(maxiter = 500, tol = 1e-06) )fit_nonlinear( data, x_col, y_col, model = c("logistic", "gompertz", "asymptotic", "linear_plateau", "quadratic"), start = NULL, control = nls.control(maxiter = 500, tol = 1e-06) )
data |
A data frame (or |
x_col |
Character. Name of the independent variable column (e.g.
|
y_col |
Character. Name of the response column (e.g. |
model |
One of:
|
start |
Optional named list of starting parameter values. When
|
control |
A list passed to |
An object of class "agriNLS", a named list containing:
The nls object.
Character: the model name.
Column names used.
The data used for fitting.
The matched call.
maize_growth <- load_example_data("maize_growth") nl <- fit_nonlinear(maize_growth, x_col = "days", y_col = "biomass_g", model = "logistic") summary(nl) plot(nl)maize_growth <- load_example_data("maize_growth") nl <- fit_nonlinear(maize_growth, x_col = "days", y_col = "biomass_g", model = "logistic") summary(nl) plot(nl)
Convenience function for testing linear, quadratic, and cubic fits of a single continuous predictor and selecting the best by AIC.
fit_polynomial(data, x_col, y_col, max_degree = 3)fit_polynomial(data, x_col, y_col, max_degree = 3)
data |
A data frame. |
x_col |
Predictor column name. |
y_col |
Response column name. |
max_degree |
Maximum polynomial degree to test (default |
An agriLM object for the best-fitting model. A comparison table
is printed as a side-effect.
wheat_trial <- load_example_data("wheat_trial") best <- fit_polynomial(wheat_trial, x_col = "nitrogen", y_col = "yield") plot(best)wheat_trial <- load_example_data("wheat_trial") best <- fit_polynomial(wheat_trial, x_col = "nitrogen", y_col = "yield") plot(best)
Extracts R², adjusted-R², RMSE, MAE, AIC, and BIC from an agriLM,
agriNLS, or agriDRC object (or any object with fitted(),
residuals(), and AIC() methods).
gof_stats(model)gof_stats(model)
model |
An |
A named list with elements R2, adj_R2, RMSE, MAE,
AIC, and BIC.
Weed control percentage measured across seven herbicide doses for two weed species.
A data frame with 84 rows and 4 variables:
Weed species (Amaranth, Ryegrass).
Herbicide dose (g active ingredient per hectare).
Weed control percentage (0–100).
Replicate number (1–6).
Simulated for package demonstration.
Reads one of the three example datasets shipped with the package from
inst/extdata. Under normal installation the datasets are available
directly via data(wheat_trial) etc; this function is a fallback that
works even when .rda files have not been compiled.
load_example_data(name = c("wheat_trial", "maize_growth", "herbicide_trial"))load_example_data(name = c("wheat_trial", "maize_growth", "herbicide_trial"))
name |
One of |
A data.frame.
wheat <- load_example_data("wheat_trial") head(wheat)wheat <- load_example_data("wheat_trial") head(wheat)
Daily above-ground biomass measurements from 10 maize plants tracked from emergence to physiological maturity under well-watered conditions.
A data frame with 200 rows and 4 variables:
Plant identifier (1–10).
Days after emergence.
Dry biomass (g/plant).
Water treatment (WW = well-watered, DS = drought-stressed).
Simulated for package demonstration.
Prints a formatted block with coefficient table, goodness-of-fit
statistics, and (for agriLM) an ANOVA table.
model_summary(model, anova = TRUE)model_summary(model, anova = TRUE)
model |
An |
anova |
Logical. Include ANOVA table for |
Invisibly returns the model object (called for its side effect of
printing a formatted summary block containing coefficients,
goodness-of-fit statistics, and, for agriLM models fitted with
engine = "lm", an ANOVA table).
For quadratic and linear-plateau models, computes the x value that maximises the predicted response.
optimum_dose(nls_fit)optimum_dose(nls_fit)
nls_fit |
An |
A named numeric vector with x_opt (optimum x) and y_max
(predicted maximum y).
A standalone outlier-detection function supporting IQR, Z-score, and modified Z-score (Iglewicz-Hoaglin) methods.
outlier_flag(df, col, method = c("iqr", "zscore", "modz"), threshold = NULL)outlier_flag(df, col, method = c("iqr", "zscore", "modz"), threshold = NULL)
df |
A data frame. |
col |
Column name to test. |
method |
One of |
threshold |
Numeric threshold: for |
df with a logical column <col>_outlier.
Dispatches to plot.agriLM, plot.agriNLS, or plot.agriDRC depending
on model class. Useful in pipeline contexts.
plot_fit(model, ...)plot_fit(model, ...)
model |
An |
... |
Arguments passed to the underlying plot method. |
A ggplot object, returned invisibly. Called primarily for the
side effect of printing the plot. The exact structure of the returned
object depends on the class of model: plot.agriLM, plot.agriNLS,
or plot.agriDRC is dispatched accordingly.
Plot a fitted dose-response curve
## S3 method for class 'agriDRC' plot(x, log_dose = TRUE, n_points = 200, ...)## S3 method for class 'agriDRC' plot(x, log_dose = TRUE, n_points = 200, ...)
x |
An |
log_dose |
Logical. Plot dose on a log10 scale (default |
n_points |
Integer. Points on the smooth curve (default 200). |
... |
Ignored. |
A ggplot object (printed invisibly).
Plot diagnostics for an agriLM model
## S3 method for class 'agriLM' plot(x, type = c("fit", "residuals", "qq", "scale"), ...)## S3 method for class 'agriLM' plot(x, type = c("fit", "residuals", "qq", "scale"), ...)
x |
An |
type |
One of |
... |
Ignored. |
A ggplot object (printed invisibly).
Plot a fitted nonlinear curve over observed data
## S3 method for class 'agriNLS' plot(x, n_points = 300, show_residuals = FALSE, ...)## S3 method for class 'agriNLS' plot(x, n_points = 300, show_residuals = FALSE, ...)
x |
An |
n_points |
Integer. Number of points on the fitted curve (default 300). |
show_residuals |
Logical. When |
... |
Ignored. |
A ggplot object (printed invisibly).
Generates a 2×2 panel of residual diagnostics: residuals vs fitted, normal Q-Q, scale-location, and a histogram of residuals.
residual_check(model, ncol = 2)residual_check(model, ncol = 2)
model |
An |
ncol |
Integer. Number of plot columns (default |
Invisibly returns the list of four ggplot objects.
A simulated dataset from a randomised complete block design (RCBD) testing four nitrogen rates across three blocks and two phosphorus levels.
A data frame with 72 rows and 6 variables:
Block identifier (A, B, C).
Nitrogen application rate (kg/ha).
Phosphorus level (low, high).
Crop variety (V1, V2).
Grain yield (t/ha).
Above-ground dry biomass (t/ha).
Simulated for package demonstration.
Z-score or min-max normalises a numeric column, optionally within groups (e.g. per trial site or year).
yield_normalize( df, yield_col = "yield", method = c("zscore", "minmax"), group_by = NULL )yield_normalize( df, yield_col = "yield", method = c("zscore", "minmax"), group_by = NULL )
df |
A data frame. |
yield_col |
Character. Column to normalise. |
method |
|
group_by |
Character vector of grouping column names, or |
df with a new column <yield_col>_norm.
df <- data.frame(yield = c(3, 4, 5, 6), site = c("A","A","B","B")) yield_normalize(df, group_by = "site")df <- data.frame(yield = c(3, 4, 5, 6), site = c("A","A","B","B")) yield_normalize(df, group_by = "site")