Welcome to anabel: ANAlysis of Binding Events + l

Summary

anabel aims to simplify binding-curve fitting for scientists of different backgrounds while minimizing user influence (Stefan D. 2019; Norval L. 2019). Its main function, run_anabel, supports three different modes and makes estimating kinetic constants straightforward. Select the mode that best matches your experimental setup. Please note that this vignette assumes a basic understanding of real-time label-free biomolecular interactions. For more information and an introduction to the theoretical background, please refer to the online version.

Getting started

anabel is installed like any other R package, using either install.packages or devtools::install. Whichever you choose, make sure to set dependencies = TRUE. anabel builds on packages commonly used for everyday data analysis, such as ggplot2, dplyr, purrr, and reshape2.
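Before installing, it can help to see which of the packages named above are already available. The sketch below uses only base R. The package list is copied from the paragraph above, and the install call is left commented out because it needs network access and assumes that a repository providing anabel is configured.

```r
# Packages named in the text above
deps <- c("ggplot2", "dplyr", "purrr", "reshape2")

# requireNamespace() returns TRUE if a package can be loaded, FALSE otherwise
is_installed <- vapply(deps, requireNamespace, logical(1), quietly = TRUE)
missing_deps <- deps[!is_installed]

if (length(missing_deps) > 0) {
  message("Not yet installed: ", paste(missing_deps, collapse = ", "))
} else {
  message("All listed packages are installed.")
}

# dependencies = TRUE pulls in anything that is still missing
# (assumes a repository that provides anabel is configured):
# install.packages("anabel", dependencies = TRUE)
```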

Once the installation is successful, you can start using anabel as follows:

library(anabel)
packageVersion("anabel")
#> [1] '3.0.1'

Input

anabel accepts sensogram input as an Excel or CSV file, or as a data frame. When providing a file, specify its full path; otherwise anabel attempts to read the file from the working directory.

The input data must be in numeric table format with a column dedicated to time. This column can have any name and use any R-approved symbols, as long as it contains the keyword ‘time’ (see exemplary datasets).

To specify the spots/sample names for the final results (tables + plots), you can provide an additional table with an ‘ID’ column containing the exact column names from the sensogram tables (except for the time-column), and a ‘Name’ column for mapping. Please note that ‘ID’ and ‘Name’ are reserved column names, and anabel will ignore the file if they are not present.
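The input rules above can be sketched with a toy example. All column names and values below are made up for illustration; only the keyword ‘time’ and the reserved column names ‘ID’ and ‘Name’ come from the text. The checks are plain base R and are not anabel's own validation.

```r
# Toy sensogram table: one time column plus one column per spot
sensogram <- data.frame(
  Time_sec = c(0, 1, 2, 3),
  spot_1 = c(0.0, 0.4, 0.7, 0.9),
  spot_2 = c(0.0, 0.2, 0.5, 0.6)
)

# The time column is recognised by the keyword "time" in its name
time_col <- grep("time", names(sensogram), ignore.case = TRUE, value = TRUE)

# Optional mapping table: 'ID' holds the sensogram column names,
# 'Name' holds the labels to use in the result tables and plots
sample_names <- data.frame(
  ID = c("spot_1", "spot_2"),
  Name = c("Antibody A", "Antibody B")
)

# The IDs should match the non-time column names exactly
response_cols <- setdiff(names(sensogram), time_col)
ids_match <- setequal(response_cols, sample_names$ID)

# The sensogram table must be numeric throughout
all_numeric <- all(vapply(sensogram, is.numeric, logical(1)))
```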

Exemplary datasets - I

To run this tutorial, we will use simulated data that mimics typical 1:1 kinetics interactions. This data is available through anabel:

data("SCA_dataset")
data("MCK_dataset")
data("SCK_dataset")

To view the help pages for anabel and the datasets, use the following commands:

help(package = "anabel")
?SCA_dataset
?MCK_dataset
?SCK_dataset

All datasets used in this tutorial were generated using the Biacore™ Simul8 – SPR sensorgram simulation tool (Simul8) (Simul8 2023).

Functions

anabel currently offers two main functions, each with a help page that includes code examples:

?convert_toMolar() # show help page
?run_anabel() # show help page

The main function of anabel is run_anabel, which analyzes sensograms of 1:1 biomolecular interactions using one of three modes: Single-curve analysis (SCA), Multi-cycle kinetics (MCK), and Single-cycle kinetics (SCK). Additionally, the convert_toMolar function converts the analyte concentration into molar. It supports millimolar (mM), micromolar (µM), nanomolar (nM), and picomolar (pM). Unit names are case-insensitive, so variations such as nM, NM, nanomolar, and Nanomolar are all accepted. The following section (Analyte concentration) explains how to use this function.

Analyte concentration

The first step is to convert the value of analyte-concentration into molar:

# one value in case of SCA method
ac <- convert_toMolar(val = 50, unit = "nM")
# vector in case of SCK and MCK methods
ac_mck <- convert_toMolar(val = c(50, 16.7, 5.56, 1.85, 6.17e-1), unit = "nM")
ac_sck <- convert_toMolar(val = c(6.17e-1, 1.85, 5.56, 16.7, 50), unit = "nM")
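To make the conversion itself concrete, here is a minimal base R sketch of a case-insensitive unit lookup. The helper to_molar_sketch is hypothetical and is not how convert_toMolar is implemented; it only mirrors the behaviour described above for the four listed units.

```r
# Hypothetical helper, NOT anabel's implementation:
# converts a concentration to molar using a case-insensitive unit lookup.
to_molar_sketch <- function(val, unit) {
  factors <- c(millimolar = 1e-3, micromolar = 1e-6,
               nanomolar = 1e-9, picomolar = 1e-12)
  abbrev <- c(mm = "millimolar", um = "micromolar",
              nm = "nanomolar", pm = "picomolar")

  key <- tolower(unit)
  key <- sub("\u00b5", "u", key, fixed = TRUE)  # treat the micro sign as "u"
  if (key %in% names(abbrev)) key <- abbrev[[key]]
  if (!key %in% names(factors)) stop("Unsupported unit: ", unit)

  val * factors[[key]]
}

# A single value (SCA) and a vector (MCK/SCK)
ac_sketch <- to_molar_sketch(50, "nM")
ac_mck_sketch <- to_molar_sketch(c(50, 16.7, 5.56, 1.85, 6.17e-1), "Nanomolar")
```

Keeping the long unit names as the single source of factors means each abbreviation only has to be mapped once.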

Supported models

Single-curve analysis (SCA)

The parameters of SCA_dataset are as follows:

Curve Ka Kd Conc tass tdiss Expected_KD
Sample.A 1e+06 0.010 50nM 50 200 1e-08
Sample.B 1e+06 0.050 50nM 50 200 5e-08
Sample.C 1e+06 0.001 50nM 50 200 1e-09

For example, Sample.A looks as follows:
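The shape of such a curve can be reproduced from the parameters in the table with the standard 1:1 (Langmuir) binding model. The sketch below is an illustration in base R, not the simulation tool or anabel's fitting code. The function simulate_1to1, the maximal response rmax = 10, and the time grid are assumptions made for this example.

```r
# 1:1 binding model: zero before tass, exponential approach to the
# steady-state response during association, exponential decay afterwards.
simulate_1to1 <- function(time, ka, kd, conc, tass, tdiss, rmax = 10) {
  kobs <- ka * conc + kd                            # observed association rate
  req <- rmax * ka * conc / kobs                    # steady-state response
  r_end <- req * (1 - exp(-kobs * (tdiss - tass)))  # response at end of injection
  ifelse(
    time < tass, 0,
    ifelse(
      time < tdiss,
      req * (1 - exp(-kobs * (time - tass))),       # association phase
      r_end * exp(-kd * (time - tdiss))             # dissociation phase
    )
  )
}

time <- seq(0, 400, by = 1)  # assumed time grid

# Sample.A parameters from the table; 50 nM = 50e-9 M
resp_a <- simulate_1to1(time, ka = 1e6, kd = 0.01, conc = 50e-9,
                        tass = 50, tdiss = 200)

# Expected equilibrium constants follow from KD = Kd / Ka
expected_kd <- c(Sample.A = 0.010, Sample.B = 0.050, Sample.C = 0.001) / 1e6
```

Calling plot(time, resp_a, type = "l") draws the simulated curve.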

By default, anabel runs in SCA mode. Before using the function, make sure that the input data meet the following requirements:

  • The data must contain a column with time values. The name of the column can be anything, as long as it contains the word “time” (case insensitive).
  • The association and dissociation time points must have single values (tass and tdiss, respectively).
  • The time points should be logically valid, i.e., tstart < tass < tdiss < tend.
  • The analyte concentration should have a single value.

The start and end times of the experiment (tstart and tend) are always single values. In contrast, the number of values expected for the analyte concentration and the association/dissociation times depends on the model.

The start and/or end time may be omitted, in which case the values are taken from the provided data.
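The time-point requirements can be checked before calling run_anabel. The helper below is hypothetical and not part of anabel. It assumes that a missing tstart or tend defaults to the first or last value of the time column, which is one plausible reading of "taken from the provided data".

```r
# Hypothetical helper (not part of anabel): validates SCA time points.
check_time_points <- function(time, tass, tdiss, tstart = NA, tend = NA) {
  # Assumed defaults: first and last recorded time value
  if (is.na(tstart)) tstart <- min(time)
  if (is.na(tend)) tend <- max(time)

  # SCA expects exactly one association and one dissociation time point
  stopifnot(length(tass) == 1, length(tdiss) == 1)

  # Logical order: tstart < tass < tdiss < tend
  tstart < tass && tass < tdiss && tdiss < tend
}

time <- seq(0, 400, by = 1)  # assumed time grid
ok <- check_time_points(time, tass = 50, tdiss = 200)   # valid order
bad <- check_time_points(time, tass = 200, tdiss = 50)  # swapped time points
```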

Check ?run_anabel for a full description of each parameter.

sca_rslt <- run_anabel(SCA_dataset, tass = 50, tdiss = 200, conc = ac)

By default, the command creates a list of two data frames:

  • kinetics: contains the estimated kinetics constants for each binding curve
  • fit_data: contains the response data (original) with the fitted value

The kinetics table for this method contains the following information:

ID Decrease_1 KD Rmax delta kass kdiss std_Decrease_1 std_KD std_Rmax std_delta std_kass std_kdiss std_tass_1 std_tdiss_1 std_y_offset tass_1 tdiss_1 y_offset ParamsQualitySummary FittingQ
1_Sample.A 8.318465 0e+00 10.03916 8.354079 1005579.5 0.0101355 0.0375280 0 0.0446262 2.860259 2218.560 -0.0000614 0.1122489 0.3652735 0.0158706 51.88397 204.0112 -0.0303094
2_Sample.B 5.026936 1e-07 10.17412 5.010514 990988.3 0.0510633 0.0139427 0 0.3053503 6.473449 8216.625 -0.0006439 0.1592203 0.2126367 0.0158841 52.18423 203.3305 0.0116652
3_Sample.C 8.207838 0e+00 10.03172 9.785564 992297.2 0.0012263 2.1091399 0 0.0773133 3.295603 2023.200 -0.0001257 0.1181290 2.5548958 0.0160021 52.02474 204.5333 0.0136138
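The KD column prints as 0e+00 or 1e-07, presumably because of rounding in the displayed table. The equilibrium constant can be recomputed from the estimated rate constants as KD = kdiss / kass. The sketch below copies the kass and kdiss values from the table above and compares them with the expected values derived from the simulation parameters.

```r
# Estimated rate constants, copied from the kinetics table above
kass <- c(Sample.A = 1005579.5, Sample.B = 990988.3, Sample.C = 992297.2)
kdiss <- c(Sample.A = 0.0101355, Sample.B = 0.0510633, Sample.C = 0.0012263)

# Equilibrium dissociation constant
kd_est <- kdiss / kass

# Expected values from the simulation parameters (Kd / Ka)
kd_expected <- c(Sample.A = 0.010, Sample.B = 0.050, Sample.C = 0.001) / 1e6

# Relative deviation of the estimate from the expected value
rel_error <- abs(kd_est - kd_expected) / kd_expected

print(signif(kd_est, 3))
print(round(rel_error, 3))
```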

One way to visualize the results:

library(ggplot2) # provides ggplot(), aes(), and the geoms used in this tutorial

ggplot(sca_rslt$fit_data, aes(x = Time)) +
  geom_point(aes(y = Response), col = "#A2C510") +
  geom_path(aes(y = fit)) +
  facet_wrap(~Name, ncol = 2, scales = "free") +
  theme_light()

Multi-cycle kinetics (MCK)

The MCK method is the most common method used for analyzing biomolecular interactions, and it involves injecting different analyte concentrations in independent cycles. We can use the simulated data provided in the MCK_dataset to demonstrate how to analyze similar data with anabel. The data was created using the following parameters:

tass tdiss Kass Kdiss KD Conc_nM
45 145 1e+7 1e-2 1e-09 50, 16.7, 5.56, 1.85, 6.17e-1

The MCK method assumes that each column in the input table represents one cycle with a different analyte concentration. Ideally, the values of the concentration should be different, but anabel will not throw an error if the same value is given to multiple cycles. However, it is the user’s responsibility to check the validity of the input at this point.
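The expected layout, one time column plus one response column per cycle, can be sketched by simulating the five cycles with the standard 1:1 binding model. This is an illustration in base R, not how MCK_dataset was produced. The maximal response rmax = 10, the time grid, and the column names are assumptions.

```r
# Parameters from the table above; concentrations converted from nM to M
conc_m <- c(50, 16.7, 5.56, 1.85, 6.17e-1) * 1e-9
ka <- 1e7
kd <- 1e-2
tass <- 45
tdiss <- 145
rmax <- 10                   # assumed maximal response
time <- seq(0, 300, by = 1)  # assumed time grid

# Response of one cycle at a given analyte concentration
one_cycle <- function(conc) {
  kobs <- ka * conc + kd
  req <- rmax * ka * conc / kobs
  # Association runs until tdiss, then the signal decays with rate kd
  assoc <- req * (1 - exp(-kobs * (pmin(time, tdiss) - tass)))
  resp <- assoc * exp(-kd * pmax(time - tdiss, 0))
  resp[time < tass] <- 0     # flat baseline before the injection
  resp
}

# One column per cycle, in the same order as conc_m
mck_sketch <- data.frame(Time = time, sapply(conc_m, one_cycle))
names(mck_sketch)[-1] <- paste0("Cycle_", seq_along(conc_m))
```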

As with SCA, make sure that the following conditions hold:

  • The table contains data for one sample.
  • The input data must have a column containing the time value.
  • There is a single time value for each of association and dissociation (tass & tdiss, respectively).
  • The time points are logically valid, i.e. tstart < tass < tdiss < tend.
  • There are multiple values for the analyte concentration.
  • The number of given analyte concentrations should equal the number of columns - 1 in the given table (e.g. MCK_dataset has five response columns plus the time column, so five concentrations are required).
  • The order of the analyte concentrations must match the data.

mck_rslt <- run_anabel(MCK_dataset, tass = 45, tdiss = 145, conc = ac_mck, method = "MCK")

The order of the given analyte concentrations should match the order of the columns in the sensogram table. In MCK_dataset, the analyte concentration decreases from column to column, so the input runs from 50 nM down to 0.617 nM.
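Because anabel does not reject repeated or mismatched concentrations, a quick check before the call can catch common mistakes. The sketch below uses a made-up three-cycle table; the column names and values are hypothetical, and the checks are plain base R rather than anabel functionality.

```r
# Toy MCK table: a time column plus three cycles (values are made up)
toy_mck <- data.frame(
  Time = 0:3,
  cycle_50nM = c(0, 4, 7, 9),
  cycle_16.7nM = c(0, 2, 4, 5),
  cycle_5.56nM = c(0, 1, 2, 2.5)
)

# Concentrations in the same order as the response columns
conc_nM <- c(50, 16.7, 5.56)

# Number of response columns = all columns except the time column
n_cycles <- sum(!grepl("time", names(toy_mck), ignore.case = TRUE))

count_ok <- length(conc_nM) == n_cycles       # one concentration per cycle
has_duplicates <- anyDuplicated(conc_nM) > 0  # repeated values deserve a second look

# If the values were recorded in ascending order but the columns run
# from high to low, reverse them before passing them on
conc_ascending <- c(5.56, 16.7, 50)
conc_reordered <- rev(conc_ascending)
```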

The estimated kinetics constants in the kinetics table are named according to the parameter that was used in the fitting plus the cycle number (e.g. tass_1).

The fitting was successful, as no boundaries were violated (see the columns ParamsQualitySummary and FittingQ).

Decrease_1 Decrease_2 Decrease_3 Decrease_4 Decrease_5 KD Rmax delta_1 delta_2 delta_3 delta_4 delta_5 kass kdiss std_Decrease_1 std_Decrease_2 std_Decrease_3 std_Decrease_4 std_Decrease_5 std_KD std_Rmax std_delta_1 std_delta_2 std_delta_3 std_delta_4 std_delta_5 std_kass std_kdiss std_tass_1 std_tass_2 std_tass_3 std_tass_4 std_tass_5 std_tdiss_1 std_tdiss_2 std_tdiss_3 std_tdiss_4 std_tdiss_5 std_y_offset tass_1 tass_2 tass_3 tass_4 tass_5 tdiss_1 tdiss_2 tdiss_3 tdiss_4 tdiss_5 y_offset ParamsQualitySummary FittingQ
9.780219 9.399691 8.430784 6.109544 3.088036 0 9.987342 9.791528 9.42313 8.453839 6.115083 3.069447 10016265 0.0100154 0.0235417 0.0235708 0.0238967 0.0216496 0.0187299 0 0.011947 0.7156734 0.6860806 0.615358 0.4432102 0.2237067 7027.858 -2.83e-05 0.0314538 0.0471055 0.0857205 0.1788075 0.3967028 0.2449252 0.2564818 0.2881887 0.3393697 0.515144 0.0071815 52.0154 52.06985 52.20825 52.17429 51.7997 153.3805 153.0312 153.1443 152.665 152.8661 0.0006918

You can visualize the fitting results using the fit_data table.

ggplot(mck_rslt$fit_data, aes(x = Time, group = Name)) +
  geom_point(aes(y = Response), col = "#A2C510") +
  geom_path(aes(y = fit)) +
  theme_light()

Compared to the SCA method, the MCK method generates a slightly different output: it does not generate a report.

Single-cycle kinetics (SCK)

SCK is a fitting mode for experiments in which the analyte is titrated at increasing concentrations, with only a short regeneration step, or none at all, between injections. The simulated data SCK_dataset was generated with the following parameters:

Param Step1 Step2 Step3 Step4 Step5
Conc 0.617 1.85 5.56 16.7 50
tass 35.000 205.00 375.00 545.0 715
tdiss 145.000 315.00 485.00 655.0 825

Overall, Kass = 1e+6 and Kdiss = 1e-2; the expected KD is therefore Kdiss / Kass = 1e-08.

To analyze a dataset with the SCK method, the input should include the following:

  • A vector of the different analyte concentrations used in the titration
  • A vector of different time points for each injection step. Specifically, two time points should be included for each step: one for association and one for dissociation.
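These two requirements can be checked with a few lines of base R. The sketch below uses the step parameters from the table above; the checks themselves are an illustration and not part of anabel.

```r
# Step parameters from the table above
conc_nM <- c(0.617, 1.85, 5.56, 16.7, 50)
tass <- c(35, 205, 375, 545, 715)
tdiss <- c(145, 315, 485, 655, 825)

# One association and one dissociation time point per concentration
same_length <- length(conc_nM) == length(tass) &&
  length(tass) == length(tdiss)

# Interleave the time points (tass_1, tdiss_1, tass_2, ...) and make sure
# they are strictly increasing, i.e. the steps do not overlap
timeline <- as.vector(rbind(tass, tdiss))
time_ordered <- all(diff(timeline) > 0)

# Compact overview of the titration steps
step_table <- data.frame(
  Step = seq_along(conc_nM),
  Conc_nM = conc_nM,
  tass = tass,
  tdiss = tdiss,
  Injection_length = tdiss - tass
)
print(step_table)
```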

To analyse this dataset with anabel use the following:

sck_rslt <- run_anabel(SCK_dataset,
  tass = c(35, 205, 375, 545, 715),
  tdiss = c(145, 315, 485, 655, 825), conc = ac_sck, method = "SCK"
)

The resulting kinetics table:

ID Decrease_1 Decrease_2 Decrease_3 Decrease_4 Decrease_5 KD Rmax_1 Rmax_2 Rmax_3 Rmax_4 Rmax_5 delta_1 delta_2 delta_3 delta_4 delta_5 kass kdiss std_Decrease_1 std_Decrease_2 std_Decrease_3 std_Decrease_4 std_Decrease_5 std_KD std_Rmax_1 std_Rmax_2 std_Rmax_3 std_Rmax_4 std_Rmax_5 std_delta_1 std_delta_2 std_delta_3 std_delta_4 std_delta_5 std_kass std_kdiss std_tass_1 std_tass_2 std_tass_3 std_tass_4 std_tass_5 std_tdiss_1 std_tdiss_2 std_tdiss_3 std_tdiss_4 std_tdiss_5 std_y_offset tass_1 tass_2 tass_3 tass_4 tass_5 tdiss_1 tdiss_2 tdiss_3 tdiss_4 tdiss_5 y_offset ParamsQualitySummary FittingQ
1_Sample.A 0.4293042 1.149556 3.009932 5.985547 8.336179 0 0.5691969 1.416621 3.048719 4.867233 5.47726 0.0220201 0.1478742 0.8468547 2.806756 4.533705 982111.9 0.0100701 0.0783707 0.0734193 0.0784437 0.0879147 0.0383598 0 0.038628 0.0461079 0.0396253 0.0360659 0.0287571 0.0100291 0.0560624 0.3150915 1.031712 1.660261 3471.492 -5.93e-05 4.894192 1.928644 0.6919048 0.289586 0.1664179 5.678101 2.076823 0.8714535 0.5026553 0.3779652 0.0161939 49.93995 226.3697 395.9063 567.8928 739.7848 157.1341 323 496.4046 668.7441 840.8677 0.0055423

To visualize the outcome:

ggplot(sck_rslt$fit_data, aes(x = Time)) +
  geom_point(aes(y = Response), col = "#A2C510") +
  geom_path(aes(y = fit)) +
  facet_wrap(~Name, ncol = 2) +
  theme_light()

Model correction

Baseline drift and surface decay are common experimental issues that can affect the estimation of kinetics from sensograms. anabel includes features to correct for these problems. In the following sections, we will demonstrate how to handle these cases using three datasets that suffer from either surface decay or drift. The datasets are named according to the type of problem and the method used for correction.

Exemplary datasets - II

data("MCK_dataset_drift") # multi cycle kinetics experiment with baseline drift
data("SCA_dataset_drift") # single curve analysis with baseline drift
data("SCK_dataset_decay") # single cycle kinetics with exponential decay

Linear drift

SCA

First, let's look at the data:

To analyze this data, set drift = TRUE when calling run_anabel. If you did not let anabel generate the output files, you can visualize the results yourself:

sca_rslt_drift <- run_anabel(SCA_dataset_drift, tass = 50, tdiss = 200, conc = ac, drift = TRUE)

ggplot(sca_rslt_drift$fit_data, aes(x = Time)) +
  geom_point(aes(y = Response), col = "#A2C510") +
  geom_path(aes(y = fit)) +
  facet_wrap(~Name, ncol = 2) +
  theme_light()

MCK

To analyze the MCK data with linear drift, apply the drift correction when calling run_anabel:

mck_rslt_drift <- run_anabel(MCK_dataset_drift, tass = 45, tdiss = 145, conc = ac_mck, drift = TRUE, method = "MCK")

ggplot(mck_rslt_drift$fit_data, aes(x = Time, group = Name)) +
  geom_point(aes(y = Response), col = "#A2C510") +
  geom_path(aes(y = fit)) +
  theme_light() +
  ggtitle("MCK five sensogram with linear drift = -0.01")
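To illustrate what a linear drift does to a sensogram, the sketch below adds a drift of -0.01 per time unit to a simulated 1:1 curve and removes it again. The slope is estimated from the pre-injection baseline with lm(). This is only an illustration of the effect; it is not a description of how anabel's drift = TRUE option works internally. The maximal response rmax = 10 and the time grid are assumptions.

```r
time <- seq(0, 300, by = 1)  # assumed time grid
tass <- 45
tdiss <- 145
ka <- 1e7
kd <- 1e-2
conc <- 50e-9
rmax <- 10                   # assumed maximal response

# Drift-free 1:1 binding curve
kobs <- ka * conc + kd
req <- rmax * ka * conc / kobs
clean <- req * (1 - exp(-kobs * (pmin(time, tdiss) - tass))) *
  exp(-kd * pmax(time - tdiss, 0))
clean[time < tass] <- 0

# Add a linear baseline drift
drift_slope <- -0.01
drifted <- clean + drift_slope * time

# Estimate the slope from the baseline before the injection starts
baseline <- time < tass
baseline_fit <- lm(drifted[baseline] ~ time[baseline])
slope_hat <- unname(coef(baseline_fit)[2])

# Subtract the estimated drift from the whole curve
corrected <- drifted - slope_hat * time
```

Estimating the slope from the baseline alone works here because the simulated baseline is noise-free; with real data, a joint fit of drift and kinetics is generally more robust.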

Exponential decay

The simulated SCK_dataset_decay includes an exponential decay component. To analyze it, set decay = TRUE when calling run_anabel:

sck_rslt_decay <- run_anabel(SCK_dataset_decay,
  tass = c(35, 205, 375, 545, 715),
  tdiss = c(145, 315, 485, 655, 825),
  conc = ac_sck, method = "SCK", decay = TRUE
)

ggplot(sck_rslt_decay$fit_data, aes(x = Time)) +
  geom_point(aes(y = Response), col = "#A2C510") +
  geom_path(aes(y = fit)) +
  facet_wrap(~Name, ncol = 2) +
  theme_light()

Debug mode

This mode is useful for users with a background in model optimization who want to understand the fitting model used by anabel. To enable it, set debug_mode = TRUE when calling run_anabel(). anabel then generates an additional data frame that provides more information on the fitting process:

  • init_df: contains the initial values of the fitting parameters for each binding curve.

library(dplyr) # provides the %>% pipe used below

# call anabel in debug mode with the SCA dataset
my_data <- run_anabel(SCA_dataset, tass = 50, tdiss = 200, conc = ac, debug_mode = TRUE)
init_df <- my_data$init_df

# extract information of the first curve (Sample.A)
response <- init_df$Response[1] %>%
  strsplit(",") %>%
  unlist() %>%
  as.numeric()

# create a temp data frame containing both original value 'Value' and the estimated one 'Response'
sampleA_df <- data.frame(
  Time = SCA_dataset$Time, Value = SCA_dataset$Sample.A,
  Response = response
)

# Generate the plot associated with this curve
ggplot(sampleA_df, aes(x = Time)) +
  geom_point(aes(y = Value), col = "#A2C510", size = 0.5) +
  geom_line(aes(y = Response)) +
  theme_light()
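The parsing step above treats each entry of init_df$Response as a single comma-separated string. The sketch below shows the same conversion in base R on a made-up string, so it can be tried without running the fit. The string and the time grid are hypothetical.

```r
# Made-up stand-in for one entry of init_df$Response
response_string <- "0,0.42,0.81,1.15"

# Split on commas, flatten the list returned by strsplit(), convert to numeric
response <- as.numeric(unlist(strsplit(response_string, ",")))

# Hypothetical time grid with one time point per response value
time <- seq_along(response) - 1

# Both vectors must have the same length before they go into a data frame
stopifnot(length(response) == length(time))
parsed_df <- data.frame(Time = time, Response = response)
```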

Output options

You can save anabel’s fitting results by setting the option generate_output = "all" and specifying the output directory outdir. The following outputs are saved in the specified directory:

  • tables of kinetics and fit results (xlsx by default; other formats are available, see ?run_anabel)
  • a PDF file containing the fitting plots
  • for the SCA and SCK methods: a report file (HTML format) that summarizes the results

If you only want specific outputs, set any of the associated options generate_Plots, generate_Tables, or generate_Report to TRUE. In that case, you must also set generate_output to "customized".

generate_output overrides all other flags. Its default value is "none", i.e. nothing is generated, so changing the other options without also changing generate_output has no effect.
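The precedence of these options can be summarized in a small sketch. The function resolve_output is hypothetical and only mirrors the rules described above; it is not anabel's code, and it ignores the fact that reports are only produced for the SCA and SCK methods.

```r
# Hypothetical helper illustrating the precedence described above
resolve_output <- function(generate_output = "none",
                           generate_Plots = FALSE,
                           generate_Tables = FALSE,
                           generate_Report = FALSE) {
  switch(generate_output,
    # "all": everything is written, individual flags are irrelevant
    all = c(plots = TRUE, tables = TRUE, report = TRUE),
    # "none" (default): nothing is written, individual flags are ignored
    none = c(plots = FALSE, tables = FALSE, report = FALSE),
    # "customized": the individual flags decide
    customized = c(plots = generate_Plots,
                   tables = generate_Tables,
                   report = generate_Report),
    stop("generate_output must be 'all', 'none' or 'customized'")
  )
}

# Setting a flag alone has no effect ...
ignored <- resolve_output(generate_Plots = TRUE)
# ... it only counts together with generate_output = "customized"
plots_only <- resolve_output("customized", generate_Plots = TRUE)
```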

Design principles & support

The main goal of anabel is to support the scientific community for free and establish unified standards for kinetics analysis. It is continuously updated to ensure its usefulness for a variety of instruments. You can stay updated on the latest news on the anabel website at https://www.biocopy.com/. If you have any questions, suggestions, or bug reports, you can contact the anabel team at .

To help the anabel team process your request more efficiently, please make sure to include specific keywords in the subject line of your email. If you encountered an error, use the keyword Error. If the run was successful but the results are incorrect, use the keyword Bug. If you need help with something specific about your data, use the keyword Help. If you are requesting a new feature or plan to use anabel in a commercial workflow, use the keyword Request. Additionally, please include a reproducible example of the problem in your email.

Acknowledgments and licensing

Both the anabel package and the online tool are supported by BioCopy GmbH. The package can be redistributed and/or modified under the terms of the GNU General Public License (GPL) as published by the Free Software Foundation (any version). For commercial use, please contact the anabel team.

References

Norval L., et al. 2019. “KOFFI and Anabel 2.0 - a New Binding Kinetics Database and Its Integration in an Open-Source Binding Analysis Software.” Database 2019. https://academic.oup.com/database/article/doi/10.1093/database/baz101/5585575.
Simul8, Biacore™. 2023. “Biacore™ Simul8.” https://apps.cytivalifesciences.com/spr/.
Stefan D., et al. 2019. “Anabel: An Online Tool for the Real-Time Kinetic Analysis of Binding Events.” Bioinformatics and Biology Insights 13.