Package 'rQSAR'

Title: QSAR Modeling with Multiple Algorithms: MLR, PLS, and Random Forest
Description: Quantitative Structure-Activity Relationship (QSAR) modeling is a valuable tool in computational chemistry and drug design that aims to predict the activity or property of chemical compounds from their molecular structure. The 'rQSAR' package provides functions for variable selection and QSAR modeling using Multiple Linear Regression (MLR), Partial Least Squares (PLS), and Random Forest algorithms.
Authors: Oche Ambrose George [aut, cre]
Maintainer: Oche Ambrose George <[email protected]>
License: MIT + file LICENSE
Version: 1.0.0
Built: 2024-10-30 06:52:19 UTC
Source: CRAN

Help Index


Build QSAR models with k-fold cross-validation

Description

This function builds QSAR (Quantitative Structure-Activity Relationship) models with three algorithms, Multiple Linear Regression (MLR), Partial Least Squares (PLS), and Random Forest, using k-fold cross-validation.

Usage

build_qsar_models(data_file, k = 5)

Arguments

data_file

The file path of the dataset.

k

The number of folds for cross-validation (default is 5).

Value

A list containing MLR, PLS, and Random Forest models with their predictions, actuals, and formulas.
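
To make the role of k concrete, the following is a minimal base-R sketch of k-fold cross-validation for a single linear (MLR-style) model. It is not the implementation of build_qsar_models(): the simulated data, the column names (desc1, desc2, activity), and the shape of cv_result are assumptions made for illustration only.

```r
# Illustrative only: k-fold cross-validation of a linear model in base R.
# The data, column names, and result layout are assumptions, not the
# internals of build_qsar_models().
set.seed(42)
n <- 50
dat <- data.frame(desc1 = rnorm(n), desc2 = rnorm(n))
dat$activity <- 1.5 * dat$desc1 - 0.8 * dat$desc2 + rnorm(n, sd = 0.3)

k <- 5
folds <- sample(rep(seq_len(k), length.out = n))  # random fold label per row

predictions <- numeric(n)
for (i in seq_len(k)) {
  test_idx <- which(folds == i)
  # Fit on the other k - 1 folds, then predict the held-out fold.
  fit <- lm(activity ~ desc1 + desc2, data = dat[-test_idx, ])
  predictions[test_idx] <- predict(fit, newdata = dat[test_idx, ])
}

# Out-of-fold predictions paired with the observed values.
cv_result <- list(predictions = predictions, actuals = dat$activity)
cv_rmse <- sqrt(mean((cv_result$predictions - cv_result$actuals)^2))
```

Each row is predicted exactly once, by a model that never saw it during fitting, so cv_rmse estimates error on unseen compounds rather than on the training data.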


Create correlation plots for QSAR models

Description

This function creates correlation plots for QSAR models. Each plot shows predicted against actual values and reports the correlation coefficient between them.

Usage

correlation_plots(model_results)

Arguments

model_results

A list containing QSAR model results.

Value

A list of correlation plots for each QSAR model.
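
As a rough picture of what such a plot contains, the sketch below computes a Pearson correlation coefficient for one model and draws predicted against actual values with base graphics. The values are invented, and base graphics is used only to keep the sketch dependency-free; it does not reproduce the output of correlation_plots().

```r
# Hypothetical predicted/actual values standing in for one model's results.
actuals     <- c(5.1, 6.3, 4.8, 7.2, 5.9, 6.7)
predictions <- c(5.0, 6.0, 5.1, 7.0, 6.2, 6.5)

r <- cor(predictions, actuals)  # Pearson correlation coefficient

plot(actuals, predictions,
     xlab = "Actual", ylab = "Predicted",
     main = sprintf("Predicted vs. actual (r = %.3f)", r))
abline(a = 0, b = 1, lty = 2)  # identity line: where perfect predictions fall
```

Points close to the dashed identity line indicate good agreement; a high r with points far from that line would indicate a systematic bias.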


Generate Molecular Descriptors from SDF File

Description

This function reads an SDF (Structure Data File) containing molecular structures and calculates molecular descriptors for each molecule.

Usage

generate_descriptors_from_sdf(sdf_file)

Arguments

sdf_file

Path to the SDF file.

Value

A matrix containing molecular descriptors for each molecule in the SDF file.


Perform variable selection using regression subsets

Description

This function performs variable selection using the regression subsets method.

Usage

perform_variable_selection(file_path, outcome_col, des_sel_meth = "exhaustive")

Arguments

file_path

The file path of the dataset.

outcome_col

The name of the outcome column.

des_sel_meth

The method for variable selection (default is "exhaustive").

Value

A data frame containing the selected variables and the outcome.
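
The idea behind an exhaustive subset search can be sketched in base R as follows. This is an illustration of the general technique, not the code behind perform_variable_selection(): the simulated descriptors (d1 to d4), the use of adjusted R-squared as the selection criterion, and the variable names are all assumptions.

```r
# Illustrative exhaustive search over descriptor subsets using base R.
# Data, criterion (adjusted R^2), and names are assumptions for this sketch.
set.seed(1)
n <- 60
dat <- data.frame(d1 = rnorm(n), d2 = rnorm(n), d3 = rnorm(n), d4 = rnorm(n))
dat$activity <- 2 * dat$d1 - 1 * dat$d3 + rnorm(n, sd = 0.2)

outcome_col <- "activity"
candidates <- setdiff(names(dat), outcome_col)

best <- list(adj_r2 = -Inf, vars = character(0))
for (size in seq_along(candidates)) {
  # Try every subset of the given size.
  for (vars in combn(candidates, size, simplify = FALSE)) {
    fit <- lm(reformulate(vars, response = outcome_col), data = dat)
    adj_r2 <- summary(fit)$adj.r.squared
    if (adj_r2 > best$adj_r2) best <- list(adj_r2 = adj_r2, vars = vars)
  }
}

# Data frame holding only the selected descriptors plus the outcome.
selected <- dat[, c(best$vars, outcome_col), drop = FALSE]
```

The number of subsets doubles with every added descriptor, so an exhaustive search of this kind is only practical for a modest number of candidate variables.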


Create residual plots with model type labels

Description

This function creates residual plots for QSAR models, labeling each plot with its model type.

Usage

residual_plots(model_results)

Arguments

model_results

A list containing QSAR model results.

Value

A list of ggplot objects representing residual plots.
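
For orientation, the sketch below shows how residuals (actual minus predicted) might be computed per model and plotted with the model type in the title. The structure of model_results, including the element names predictions and actuals, is an assumption, and base graphics stands in for ggplot2 to avoid extra dependencies; this is not the residual_plots() implementation.

```r
# Hypothetical model results; element names and values are assumptions.
model_results <- list(
  MLR = list(predictions = c(5.0, 6.0, 5.1, 7.0),
             actuals     = c(5.1, 6.3, 4.8, 7.2)),
  PLS = list(predictions = c(5.2, 6.1, 4.9, 7.1),
             actuals     = c(5.1, 6.3, 4.8, 7.2))
)

# Residual = actual - predicted, computed separately for each model.
residuals_by_model <- lapply(model_results,
                             function(m) m$actuals - m$predictions)

for (model_type in names(model_results)) {
  plot(model_results[[model_type]]$predictions,
       residuals_by_model[[model_type]],
       xlab = "Predicted", ylab = "Residual",
       main = paste("Residuals:", model_type))  # label with the model type
  abline(h = 0, lty = 2)  # zero line: no prediction error
}
```

Residuals scattered evenly around the zero line suggest no obvious systematic error; a visible trend or funnel shape suggests the model is missing structure in the data.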