Oversampling balances the class distribution of a dataset by increasing the representation of the minority class. It wraps the smotefamily library.
bal_oversampling(attribute)
attribute | The class attribute to balance using oversampling.
A bal_oversampling object.
data(iris)
mod_iris <- iris[c(1:50, 51:71, 101:111), ]
bal <- bal_oversampling('Species')
bal <- daltoolbox::fit(bal, mod_iris)
adjust_iris <- daltoolbox::transform(bal, mod_iris)
table(adjust_iris$Species)
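To make the idea of oversampling concrete, here is a minimal base-R sketch that balances classes by duplicating randomly chosen minority-class rows. This is plain random oversampling, not the synthetic-sample generation that smotefamily provides, and it is not the implementation behind bal_oversampling. The helper name `oversample_random` is hypothetical.

```r
# Hypothetical helper (not part of the package): random oversampling
# by duplicating rows of the smaller classes until all classes match
# the size of the largest one.
oversample_random <- function(data, attribute) {
  counts <- table(data[[attribute]])
  counts <- counts[counts > 0]           # ignore unused factor levels
  target <- max(counts)                  # size of the largest class
  parts <- lapply(names(counts), function(cl) {
    rows <- which(data[[attribute]] == cl)
    extra <- target - length(rows)
    if (extra > 0) {
      # draw additional row indices with replacement to fill the gap
      rows <- c(rows, rows[sample.int(length(rows), extra, replace = TRUE)])
    }
    data[rows, , drop = FALSE]
  })
  do.call(rbind, parts)
}

set.seed(1)
data(iris)
mod_iris <- iris[c(1:50, 51:71, 101:111), ]   # 50 / 21 / 11 rows per class
balanced <- oversample_random(mod_iris, "Species")
table(balanced$Species)                        # every class now has 50 rows
```

Duplicating rows keeps the sketch short, but it adds no new information; synthetic approaches interpolate between neighbors instead.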
Subsampling balances the class distribution of a dataset by reducing the representation of the majority class.
bal_subsampling(attribute)
attribute | The class attribute to balance using subsampling.
A bal_subsampling object.
data(iris)
mod_iris <- iris[c(1:50, 51:71, 101:111), ]
bal <- bal_subsampling('Species')
bal <- daltoolbox::fit(bal, mod_iris)
adjust_iris <- daltoolbox::transform(bal, mod_iris)
table(adjust_iris$Species)
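The subsampling idea can be sketched in base R as follows: every class is reduced, by random selection without replacement, to the size of the smallest class. This is an illustration under that assumption and not necessarily how bal_subsampling chooses which rows to keep. The helper name `subsample_random` is hypothetical.

```r
# Hypothetical helper (not part of the package): random subsampling
# that shrinks every class to the size of the smallest class.
subsample_random <- function(data, attribute) {
  counts <- table(data[[attribute]])
  counts <- counts[counts > 0]           # ignore unused factor levels
  target <- min(counts)                  # size of the smallest class
  keep <- unlist(lapply(names(counts), function(cl) {
    rows <- which(data[[attribute]] == cl)
    rows[sample.int(length(rows), target)]   # sample without replacement
  }))
  data[sort(keep), , drop = FALSE]       # preserve the original row order
}

set.seed(1)
data(iris)
mod_iris <- iris[c(1:50, 51:71, 101:111), ]   # 50 / 21 / 11 rows per class
reduced <- subsample_random(mod_iris, "Species")
table(reduced$Species)                         # every class now has 11 rows
```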
Feature selection is the process of choosing a subset of relevant features from the full set of features in a dataset for use in model training. The FeatureSelection class provides a framework for performing feature selection in R.
fs(attribute)
attribute | The target variable.
An instance of the FeatureSelection class.
#See ?fs_fss for an example of feature selection
Forward stepwise selection is a feature selection technique in which attributes are added to a model one at a time based on their ability to improve the model's performance. It stops once the best remaining candidate no longer significantly improves the model fit. It wraps the leaps library.
fs_fss(attribute)
attribute | The target variable.
A fs_fss object.
data(iris)
myfeature <- daltoolbox::fit(fs_fss("Species"), iris)
data <- daltoolbox::transform(myfeature, iris)
head(data)
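The greedy loop behind forward stepwise selection can be sketched with base R's `lm`. The sketch below is not the leaps-based implementation used by fs_fss: the helper `forward_select`, the use of adjusted R-squared as the score, the `min_gain` stopping threshold, and the numeric coding of Species are all assumptions made for illustration.

```r
# Hypothetical helper (not part of the package): greedy forward selection.
# At each step, add the candidate that yields the highest adjusted R^2;
# stop when the improvement falls below min_gain (an assumed criterion).
forward_select <- function(data, target, min_gain = 0.001) {
  candidates <- setdiff(names(data), target)
  selected <- character(0)
  best_score <- -Inf
  while (length(candidates) > 0) {
    # score every candidate when added to the already selected features
    scores <- sapply(candidates, function(v) {
      f <- reformulate(c(selected, v), response = target)
      summary(lm(f, data = data))$adj.r.squared
    })
    best <- names(scores)[which.max(scores)]
    if (scores[[best]] - best_score < min_gain) break
    selected <- c(selected, best)
    best_score <- scores[[best]]
    candidates <- setdiff(candidates, best)
  }
  selected
}

data(iris)
num_iris <- iris
num_iris$Species <- as.numeric(num_iris$Species)  # assumption: numeric target
chosen <- forward_select(num_iris, "Species")
print(chosen)
```

Adjusted R-squared penalizes model size, so the loop tends to stop before all features are included; other criteria such as BIC would fit the same structure.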
Information gain is a feature selection technique based on information theory. It measures how much knowing the value of a feature reduces the uncertainty (entropy) about the target variable. It wraps the FSelector library.
fs_ig(attribute)
attribute | The target variable.
A fs_ig object.
data(iris)
myfeature <- daltoolbox::fit(fs_ig("Species"), iris)
data <- daltoolbox::transform(myfeature, iris)
head(data)
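As an illustration of the quantity involved, the sketch below computes information gain as H(target) minus H(target | feature) in base R. Numeric features are discretized into three equal-width bins, which is an assumption of this sketch; FSelector may discretize differently, so the values need not match what fs_ig uses internally. The helpers `entropy` and `info_gain` are hypothetical.

```r
# Shannon entropy (in bits) of a discrete variable
entropy <- function(x) {
  p <- table(x) / length(x)
  p <- p[p > 0]                          # drop empty levels, avoiding log2(0)
  -sum(p * log2(p))
}

# Hypothetical helper: information gain of a numeric feature with respect
# to a class label, after equal-width binning (an assumed discretization).
info_gain <- function(feature, target, bins = 3) {
  binned <- cut(feature, breaks = bins)
  # conditional entropy: entropy within each bin, weighted by bin size
  cond <- sum(sapply(split(target, binned, drop = TRUE), function(t) {
    length(t) / length(target) * entropy(t)
  }))
  entropy(target) - cond
}

data(iris)
gains <- sapply(iris[, 1:4], info_gain, target = iris$Species)
sort(gains, decreasing = TRUE)           # higher gain = more informative
```

Features would then be kept by ranking the gains and applying a cutoff; the choice of cutoff is left open here.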
Lasso regression performs feature selection by applying an L1 penalty that shrinks the coefficients of less relevant features to exactly zero, leaving a subset of relevant features. It wraps the glmnet library.
fs_lasso(attribute)
attribute | The target variable.
A fs_lasso object.
data(iris)
myfeature <- daltoolbox::fit(fs_lasso("Species"), iris)
data <- daltoolbox::transform(myfeature, iris)
head(data)
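To show how a Lasso penalty can drive coefficients to zero, here is a small coordinate-descent sketch in base R. It does not use glmnet and is not the implementation behind fs_lasso: the helpers `soft_threshold` and `lasso_cd`, the fixed `lambda` value, and the numeric coding of Species are assumptions for illustration. In practice the penalty strength is usually chosen by cross-validation rather than fixed by hand.

```r
# Soft-thresholding operator: shrinks z toward zero by g
soft_threshold <- function(z, g) sign(z) * pmax(abs(z) - g, 0)

# Hypothetical helper: Lasso by cyclic coordinate descent, minimizing
# (1 / (2n)) * ||y - X b||^2 + lambda * ||b||_1 on standardized features.
lasso_cd <- function(x, y, lambda, iters = 200) {
  x <- scale(x)                          # standardize features
  y <- y - mean(y)                       # center the response
  n <- nrow(x)
  beta <- rep(0, ncol(x))
  for (it in seq_len(iters)) {
    for (j in seq_along(beta)) {
      # partial residual that leaves out the contribution of feature j
      r <- y - x[, -j, drop = FALSE] %*% beta[-j]
      z <- sum(x[, j] * r) / n
      beta[j] <- soft_threshold(z, lambda) / (sum(x[, j]^2) / n)
    }
  }
  setNames(beta, colnames(x))
}

data(iris)
x <- as.matrix(iris[, 1:4])
y <- as.numeric(iris$Species)            # assumption: numeric target
coefs <- lasso_cd(x, y, lambda = 0.1)    # lambda is an illustrative value
selected <- names(coefs)[coefs != 0]     # keep features with nonzero weight
print(coefs)
print(selected)
```

Raising `lambda` removes more features; with a sufficiently large value every coefficient becomes zero.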
Relief is a feature selection technique that scores the relevance of each feature by comparing its values between nearest neighbors of the same class and nearest neighbors of a different class. Features whose values differ more across classes than within a class receive higher scores. It wraps the FSelector library.
fs_relief(attribute)
attribute | The target variable.
A fs_relief object.
data(iris)
myfeature <- daltoolbox::fit(fs_relief("Species"), iris)
data <- daltoolbox::transform(myfeature, iris)
head(data)
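The nearest-neighbor scoring described for Relief can be sketched in base R as below. This is a simplified variant written for illustration, not FSelector's implementation: it uses one nearest hit and one nearest miss per sampled instance, takes the miss from any other class, and measures distance with the Manhattan metric on features rescaled to [0, 1]. The helper `relief_weights` and the sample size `m` are hypothetical.

```r
# Hypothetical helper: simplified Relief weights for numeric features.
relief_weights <- function(x, y, m = 50) {
  x <- as.matrix(x)
  # rescale each feature to [0, 1] so that differences are comparable
  rng <- apply(x, 2, range)
  x <- sweep(sweep(x, 2, rng[1, ]), 2, rng[2, ] - rng[1, ], "/")
  w <- setNames(numeric(ncol(x)), colnames(x))
  for (i in sample.int(nrow(x), m)) {
    d <- rowSums(abs(sweep(x, 2, x[i, ])))   # Manhattan distance to each row
    d[i] <- Inf                               # never pick the instance itself
    same <- y == y[i]
    hit <- which.min(ifelse(same, d, Inf))    # nearest neighbor, same class
    miss <- which.min(ifelse(same, Inf, d))   # nearest neighbor, other class
    # penalize differences to the hit, reward differences to the miss
    w <- w - abs(x[i, ] - x[hit, ]) / m + abs(x[i, ] - x[miss, ]) / m
  }
  w
}

set.seed(1)
data(iris)
w <- relief_weights(iris[, 1:4], iris$Species)
sort(w, decreasing = TRUE)                    # higher weight = more relevant
```

Each weight lies between -1 and 1; features with weights near or below zero separate the classes poorly under this scoring.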