--- title: "Classification modeling" output: rmarkdown::html_vignette vignette: > %\VignetteIndexEntry{classification} %\VignetteEngine{knitr::rmarkdown} %\VignetteEncoding{UTF-8} --- For showing classification `SSLR` models, we will use *Wine* dataset with 20% labeled data: ```{r eval=FALSE} library(SSLR) library(tidymodels) library(caret) ``` ```{r include=FALSE} knitr::opts_chunk$set( digits = 3, collapse = TRUE, comment = "#>" ) options(digits = 3) library(SSLR) library(tidymodels) library(caret) ``` ```{r wine, results="hide"} data(wine) set.seed(1) #Train and test data train.index <- createDataPartition(wine$Wine, p = .7, list = FALSE) train <- wine[ train.index,] test <- wine[-train.index,] cls <- which(colnames(wine) == "Wine") # 20 % LABELED labeled.index <- createDataPartition(wine$Wine, p = .2, list = FALSE) train[-labeled.index,cls] <- NA ``` We have multiple models for solving semi-supervised learning problems of classification. You can read [`Model List`](../../docs/articles/models.html) section For example, we train with Decision Tree: ```{r fit, results="hide"} m <- SSLRDecisionTree(min_samples_split = round(length(labeled.index) * 0.25), w = 0.3) %>% fit(Wine ~ ., data = train) ``` Now we predict with class (tibble) and prob (tibble:) ```{r testing} test_results <- test %>% select(Wine) %>% as_tibble() %>% mutate( dt_class = predict(m, test) %>% pull(.pred_class) ) test_results ``` Now we can use metrics from `yardstick` package: ```{r metrics} test_results %>% accuracy(truth = Wine, dt_class) test_results %>% conf_mat(truth = Wine, dt_class) #Using multiple metrics multi_metric <- metric_set(accuracy, kap, sens, spec, f_meas ) test_results %>% multi_metric(truth = Wine, estimate = dt_class) ``` In classification models we can use *raw* type of predict for getting labels in factor: ```{r metrics_raw} predict(m,test,"raw") ``` We can even use probability predictions in the Decision Tree model: ```{r metrics_prob} predict(m,test,"prob") ```