Title: | QSAR Modeling with Multiple Algorithms: MLR, PLS, and Random Forest |
---|---|
Description: | Quantitative Structure-Activity Relationship (QSAR) modeling is widely used in computational chemistry and drug design to predict the activity or properties of chemical compounds from their molecular structure. This vignette presents the 'rQSAR' package, which provides functions for variable selection and QSAR modeling using Multiple Linear Regression (MLR), Partial Least Squares (PLS), and Random Forest algorithms. |
Authors: | Oche Ambrose George [aut, cre] |
Maintainer: | Oche Ambrose George <[email protected]> |
License: | MIT + file LICENSE |
Version: | 1.0.0 |
Built: | 2024-12-29 08:35:15 UTC |
Source: | CRAN |
This function builds QSAR (Quantitative Structure-Activity Relationship) models using three algorithms, Multiple Linear Regression (MLR), Partial Least Squares (PLS), and Random Forest, and evaluates them with k-fold cross-validation.
build_qsar_models(data_file, k = 5)
data_file |
The file path of the dataset. |
k |
The number of folds for cross-validation (default is 5). |
A list containing MLR, PLS, and Random Forest models with their predictions, actuals, and formulas.
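The k-fold cross-validation idea can be sketched in base R for the MLR case alone. This is a minimal illustration on synthetic data, not the package's implementation; the column names `desc1`, `desc2`, and `activity` are hypothetical.

```r
# Sketch of k-fold cross-validation for a multiple linear regression (MLR).
# Synthetic data; all column names are hypothetical.
set.seed(42)
n <- 100
dat <- data.frame(desc1 = rnorm(n), desc2 = rnorm(n))
dat$activity <- 1.5 * dat$desc1 - 0.8 * dat$desc2 + rnorm(n, sd = 0.3)

k <- 5
# Randomly assign each row to one of k folds of equal size.
folds <- sample(rep(seq_len(k), length.out = n))

predicted <- numeric(n)
for (i in seq_len(k)) {
  test_idx <- which(folds == i)
  # Fit on the other k - 1 folds, then predict the held-out fold.
  fit <- lm(activity ~ desc1 + desc2, data = dat[-test_idx, ])
  predicted[test_idx] <- predict(fit, newdata = dat[test_idx, ])
}

# Cross-validated error and squared correlation between actual and predicted.
cv_rmse <- sqrt(mean((dat$activity - predicted)^2))
cv_r2 <- cor(dat$activity, predicted)^2
cat("CV RMSE:", round(cv_rmse, 3), " CV R^2:", round(cv_r2, 3), "\n")
```

Every observation is predicted exactly once by a model that never saw it, so the pooled predictions give an out-of-sample estimate of model quality.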
This function creates correlation plots for QSAR models, showing predicted against actual values together with the correlation coefficient.
correlation_plots(model_results)
model_results |
A list containing QSAR model results. |
A list of correlation plots for each QSAR model.
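As a rough illustration of what such a plot conveys, the following base R sketch computes the Pearson correlation between actual and predicted values and draws a scatter plot with the identity line. The data are synthetic, and the use of base graphics and a temporary PDF file is an assumption made for this example, not a description of how the package draws its plots.

```r
# Sketch: predicted-vs-actual scatter plot with the correlation in the title.
# Synthetic values stand in for real model output.
set.seed(1)
actual <- rnorm(50, mean = 6, sd = 1)
predicted <- actual + rnorm(50, sd = 0.4)

# Pearson correlation coefficient between actual and predicted values.
r <- cor(actual, predicted)

# Write the plot to a temporary PDF so the script also works non-interactively.
out_file <- file.path(tempdir(), "pred_vs_actual.pdf")
pdf(out_file)
plot(actual, predicted,
     xlab = "Actual", ylab = "Predicted",
     main = sprintf("Predicted vs. actual (r = %.2f)", r))
abline(a = 0, b = 1, lty = 2)  # identity line: perfect predictions fall here
invisible(dev.off())
```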
This function reads an SDF (Structure Data File) containing molecular structures and calculates molecular descriptors for each molecule.
generate_descriptors_from_sdf(sdf_file)
sdf_file |
Path to the SDF file. |
A matrix containing molecular descriptors for each molecule in the SDF file.
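To make the shape of the result concrete, here is a toy base R sketch that parses two V2000 records and builds a descriptor matrix with one row per molecule. The counts used here (atoms, bonds, carbons, oxygens) are stand-ins chosen for illustration; which descriptors the package actually computes, and with which toolkit, is not stated in this documentation.

```r
# Toy sketch: split an SDF into records and derive simple count descriptors.
# Real descriptor calculation would rely on a cheminformatics toolkit.
sdf_lines <- c(
  "formaldehyde", "  hypothetical", "",
  "  2  1  0  0  0  0  0  0  0  0999 V2000",
  "    0.0000    0.0000    0.0000 C   0  0  0  0  0  0  0  0  0  0  0  0",
  "    1.2000    0.0000    0.0000 O   0  0  0  0  0  0  0  0  0  0  0  0",
  "  1  2  2  0",
  "M  END", "$$$$",
  "ethanol", "  hypothetical", "",
  "  3  2  0  0  0  0  0  0  0  0999 V2000",
  "    0.0000    0.0000    0.0000 C   0  0  0  0  0  0  0  0  0  0  0  0",
  "    1.5000    0.0000    0.0000 C   0  0  0  0  0  0  0  0  0  0  0  0",
  "    2.2000    1.2000    0.0000 O   0  0  0  0  0  0  0  0  0  0  0  0",
  "  1  2  1  0",
  "  2  3  1  0",
  "M  END", "$$$$"
)

# Each record ends with a "$$$$" line.
end_idx <- which(sdf_lines == "$$$$")
start_idx <- c(1, head(end_idx, -1) + 1)
records <- Map(function(s, e) sdf_lines[s:(e - 1)], start_idx, end_idx)

describe <- function(rec) {
  counts <- rec[4]  # V2000 counts line: atoms in columns 1-3, bonds in 4-6
  n_atoms <- as.integer(substr(counts, 1, 3))
  n_bonds <- as.integer(substr(counts, 4, 6))
  atom_lines <- rec[5:(4 + n_atoms)]
  # The element symbol is the fourth whitespace-separated field of an atom line.
  symbols <- vapply(strsplit(trimws(atom_lines), "\\s+"), `[`, character(1), 4)
  c(n_atoms = n_atoms, n_bonds = n_bonds,
    n_C = sum(symbols == "C"), n_O = sum(symbols == "O"))
}

desc <- do.call(rbind, lapply(records, describe))
rownames(desc) <- vapply(records, `[`, character(1), 1)  # molecule names
print(desc)
```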
This function performs variable selection using the regression subsets method.
perform_variable_selection(file_path, outcome_col, des_sel_meth = "exhaustive")
file_path |
The file path of the dataset. |
outcome_col |
The name of the outcome column. |
des_sel_meth |
The method for variable selection (default is "exhaustive"). |
A data frame containing the selected variables and the outcome.
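An exhaustive search over regression subsets can be sketched in base R as follows. This is an illustration of the general technique on synthetic data, with BIC as an assumed selection criterion; the package's own criterion and implementation may differ, and the descriptor names `d1` to `d4` are hypothetical.

```r
# Sketch of exhaustive best-subset selection for a linear model.
# Synthetic data: only d1 and d3 truly influence the outcome.
set.seed(7)
n <- 80
dat <- data.frame(d1 = rnorm(n), d2 = rnorm(n), d3 = rnorm(n), d4 = rnorm(n))
dat$activity <- 2 * dat$d1 - 1 * dat$d3 + rnorm(n, sd = 0.5)

outcome_col <- "activity"
predictors <- setdiff(names(dat), outcome_col)

# Enumerate every non-empty subset of the candidate descriptors.
subsets <- unlist(
  lapply(seq_along(predictors),
         function(m) combn(predictors, m, simplify = FALSE)),
  recursive = FALSE
)

# Fit one linear model per subset and score it with BIC (lower is better).
bic <- vapply(subsets, function(vars) {
  BIC(lm(reformulate(vars, response = outcome_col), data = dat))
}, numeric(1))

best_vars <- subsets[[which.min(bic)]]
# Return the selected descriptors together with the outcome column.
selected <- dat[, c(best_vars, outcome_col)]
cat("Selected descriptors:", paste(best_vars, collapse = ", "), "\n")
```

The number of candidate models grows as 2^p - 1 for p descriptors, so an exhaustive search is only practical for a modest number of candidates.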
This function creates residual plots labeled by model type.
residual_plots(model_results)
model_results |
A list containing model results. |
A list of ggplot objects representing residual plots.
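The quantity behind a residual plot is simply actual minus predicted, plotted against the predicted value for each model. The sketch below uses base graphics rather than ggplot objects, and the structure of `model_results` shown here (a named list with `actual` and `predicted` vectors) is a hypothetical stand-in for illustration only.

```r
# Sketch: residuals per model, one labeled panel each.
# The layout of model_results below is an assumption, not the package's format.
set.seed(3)
actual <- rnorm(40, mean = 5)
model_results <- list(
  MLR = list(actual = actual, predicted = actual + rnorm(40, sd = 0.5)),
  RF  = list(actual = actual, predicted = actual + rnorm(40, sd = 0.8))
)

# Stack residuals from all models into one data frame with a model label.
resid_df <- do.call(rbind, lapply(names(model_results), function(m) {
  res <- model_results[[m]]
  data.frame(model = m,
             predicted = res$predicted,
             residual = res$actual - res$predicted)
}))

# Draw one panel per model into a temporary PDF file.
out_file <- file.path(tempdir(), "residuals.pdf")
pdf(out_file, width = 8, height = 4)
par(mfrow = c(1, length(model_results)))
for (m in names(model_results)) {
  sub <- resid_df[resid_df$model == m, ]
  plot(sub$predicted, sub$residual, main = m,
       xlab = "Predicted", ylab = "Residual (actual - predicted)")
  abline(h = 0, lty = 2)  # residuals should scatter evenly around zero
}
invisible(dev.off())
```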