Title: | Precision-Recall and ROC Curves for Weighted and Unweighted Data |
---|---|
Description: | Computes the areas under the precision-recall (PR) and ROC curve for weighted (e.g., soft-labeled) and unweighted data. In contrast to other implementations, the interpolation between points of the PR curve is done by a non-linear piecewise function. In addition to the areas under the curves, the curves themselves can also be computed and plotted by a specific S3-method. References: Davis and Goadrich (2006) <doi:10.1145/1143844.1143874>; Keilwagen et al. (2014) <doi:10.1371/journal.pone.0092209>; Grau et al. (2015) <doi:10.1093/bioinformatics/btv153>. |
Authors: | Jan Grau and Jens Keilwagen |
Maintainer: | Jan Grau <[email protected]> |
License: | GPL-3 |
Version: | 1.3.1 |
Built: | 2024-11-13 06:30:38 UTC |
Source: | CRAN |
This package computes the areas under the precision-recall (PR) and receiver operating characteristic (ROC) curves for weighted (e.g., soft-labeled) and unweighted data. In contrast to other implementations, the interpolation between points of the PR curve is done by a non-linear piecewise function. In addition to the areas under the curves, the curves themselves can also be computed and plotted by a specific S3-method.
Package: | PRROC |
Type: | Package |
Version: | 1.3 |
Date: | 2017-04-21 |
License: | GPL-3 |
Jan Grau and Jens Keilwagen
Maintainer: Jan Grau <[email protected]>
J. Davis and M. Goadrich. The relationship between precision-recall and ROC curves. In Proceedings of the 23rd International Conference on Machine Learning, pages 233–240, New York, NY, USA, 2006. ACM.
T. Fawcett, An introduction to ROC analysis, Pattern Recognition Letters (27) 8, 861-874, 2006.
J. Keilwagen, I. Grosse, and J. Grau. Area under precision-recall curves for weighted and unweighted data, PLOS ONE (9) 3, 2014.
J. Grau, I. Grosse, and J. Keilwagen. PRROC: computing and visualizing precision-recall and receiver operating characteristic curves in R. Bioinformatics, 31(15):2595-2597, 2015.
# create artificial scores as random numbers
x <- rnorm( 1000 )
y <- rnorm( 1000, -1 )
# compute area under PR curve
pr <- pr.curve( x, y )
print( pr )
# compute area under ROC curve
roc <- roc.curve( x, y )
print( roc )
# compute PR curve and area under curve
pr <- pr.curve( x, y, curve = TRUE )
# plot curve
plot(pr)
# compute ROC curve and area under curve
roc <- roc.curve( x, y, curve = TRUE )
# plot curve
plot(roc)
# create artificial weights
x.weights <- runif( 1000 )
y.weights <- runif( 1000 )
# compute PR curve and area under curve
pr <- pr.curve( x, y, x.weights, y.weights, curve = TRUE )
# plot curve
plot(pr)
# compute ROC curve and area under curve
roc <- roc.curve( x, y, x.weights, y.weights, curve = TRUE )
# plot curve
plot(roc)
Plots the PR or ROC curves of a PRROC object. To obtain such curves, pr.curve or roc.curve must be called with argument curve=TRUE.
## S3 method for class 'PRROC'
plot(x, xlim=c(0,1), ylim=c(0,1), auc.main=TRUE,
    auc.type=c("integral","davis.goadrich"),
    legend=ifelse(is.logical(color) & color==TRUE,4,NA),
    xlab=NULL, ylab=NULL, main=NULL, color=TRUE, lwd=3,
    add=FALSE, scale.color=hsv(h=seq(0,1,length=100)*0.8, s=1, v=1),
    max.plot = FALSE, min.plot = FALSE, rand.plot = FALSE,
    fill.area = (max.plot & min.plot), maxminrand.col = grey(0.5),
    fill.color = grey(0.95), ...)
x |
|
xlim |
as in |
ylim |
as in |
auc.main |
|
auc.type |
the area under the curve shown in the title (see also |
legend |
if |
xlab |
the label of the x-axis. If |
ylab |
the label of the y-axis. If |
main |
the title of the plot. If |
color |
if |
lwd |
the line width of the curve |
add |
if |
scale.color |
vector of colors that are used to reflect score thresholds, compare |
max.plot |
if |
min.plot |
if |
rand.plot |
if |
fill.area |
fill the area between maximum and minimum curve (given both have been computed for |
maxminrand.col |
the plot color for the maximum, minimum, and random curves |
fill.color |
the fill color for the area between minimum and maximum curve |
... |
see |
The plot method for PRROC objects can be used in different ways.
The first is to plot a visualization of a single ROC or PR curve that also represents the classification thresholds of individual points on the curve by a color scale. In this case, a PRROC object must be provided as x, add must be FALSE, and color must be TRUE. If, in addition, legend is set to TRUE, a legend translating colors to numerical threshold values is included to the right of the curve plot itself. The layout of curve plot and legend is accomplished using layout(), which means that this type of ROC/PR plot cannot be combined with other/complex layouts.
The second application of the plot method is to compare the performance of different classifiers (typically on the same data set). To do so, plot must be called with add=FALSE and color set to one specific color (e.g., 2, "red", ...) for the first PRROC object provided as x. Subsequent calls of plot with add=TRUE can be used to add further curves to the first plot, where different colors may be specified by the color parameter.
In both cases, the first (or only) call to plot also allows for including plots of the maximum and minimum curve, highlighting the area between minimum and maximum, and the curve of a random classifier. For this purpose, the PRROC object needs to be created (using pr.curve or roc.curve) with the corresponding parameters (e.g., max.compute) set to TRUE.
Additional examples for the different use cases and corresponding plot commands are given in the documentation of pr.curve and roc.curve.
Jan Grau and Jens Keilwagen
# create artificial scores as random numbers
x <- rnorm( 1000 )
y <- rnorm( 1000, -1 )
# compute PR curve
pr <- pr.curve( x, y, curve = TRUE )
# standard plot of PR curve
plot( pr )
# compute ROC curve
roc <- roc.curve( x, y, curve = TRUE )
# standard plot of ROC curve
plot( roc )
# create another set of scores
x.2 <- rnorm( 1000 )
y.2 <- rnorm( 1000, -2 )
# compute PR curve
pr.2 <- pr.curve( x.2, y.2, curve = TRUE )
# and ROC curve
roc.2 <- roc.curve( x.2, y.2, curve = TRUE )
# plot PR curve in red, without legend
plot( pr, color = "red", auc.main = FALSE )
# add second PR curve in green
plot( pr.2, color = 3, add = TRUE )
# plot ROC curve in red, without legend
plot( roc, color = "red", auc.main = FALSE )
# add second ROC curve in green
plot( roc.2, color = 3, add = TRUE )
# plot PR curve with legend below the main plot
plot( pr, legend = 1 )
# compute PR curve with minimum and maximum curve, and random classifier
pr <- pr.curve( x, y, curve = TRUE, max.compute = TRUE,
    min.compute = TRUE, rand.compute = TRUE )
# plot PR curve with area between minimum and
# maximum curve in green and random classifier in blue
plot( pr, rand.plot = TRUE, fill.area = TRUE, fill.color = rgb(0.8,1,0.8),
    maxminrand.col = "blue" )
Computes the area under the precision-recall (PR) curve for weighted and unweighted data. In contrast to other implementations, the interpolation between points of the PR curve is done by a non-linear piecewise function. In addition to the area under the curve, the curve itself can be obtained by setting argument curve to TRUE.
pr.curve( scores.class0, scores.class1=scores.class0, weights.class0=NULL,
    weights.class1 = {if(is.null(weights.class0)){NULL}else{1-weights.class0}},
    sorted = FALSE, curve = FALSE,
    minStepSize=min(1,ifelse(is.null(weights.class0),1,sum(weights.class0)/100)),
    max.compute=F, min.compute=F, rand.compute=F, dg.compute=T)
scores.class0 |
the classification scores of i) all data points or ii) only the data points belonging to the positive class. In the first case, scores.class1 should not be assigned an explicit value, but left at the default (scores.class1=scores.class0). In addition, weights.class0 needs to contain the class labels of the data points (1 for positive class, 0 for negative class) or the soft-labels for the positive class, i.e., the probability for each data point to belong to the positive class. Accordingly, weights.class1 should be left at the default value (1-weights.class0). In the second case, the scores for the negative data points need to be provided in scores.class1. In this case, weights.class0 and weights.class1 need to be provided only for soft-labelling and should be of the same length as scores.class0 and scores.class1, respectively. |
scores.class1 |
the scores of the negative class if provided separately (see scores.class0) |
weights.class0 |
the weights for the data points of the positive class in same ordering as |
weights.class1 |
the weights for the data points of the negative class in same ordering as |
sorted |
|
curve |
|
minStepSize |
the minimum step size between intermediate points of the curve, does not affect the computation of AUC-PR |
max.compute |
|
min.compute |
|
rand.compute |
|
dg.compute |
|
This function computes the area under a precision-recall curve and, optionally, the curve itself, and returns it as a PRROC object (see below). It can be used under different scenarios:
1. Standard, hard-labeled classification problems:
Each data point is uniquely assigned to one out of two possible classes. In this case, the classification scores may either be provided separately for the data points of each of the classes, i.e., as scores.class0 for the data points from the positive/foreground class and as scores.class1 for the data points of the negative/background class; or the classification scores for all data points are provided as scores.class0 and the labels are provided as numerical values (1 for the positive class, 0 for the negative class) as weights.class0.
2. Weighted, hard-labeled classification problems:
Each data point is uniquely assigned to one out of two possible classes, where each data point additionally has a weight assigned, for instance multiplicities in the original data set. In this case, the classification scores need to be provided separately for the data points of each of the classes, i.e., as scores.class0 for the data points from the positive/foreground class and as scores.class1 for the data points of the negative/background class. In addition, the weights for the data points must be provided as weights.class0 and weights.class1, respectively.
3. Soft-labeled classification problems:
Each data point belongs to both of the two classes with a certain probability, where for each data point, these two probabilities add up to 1. In this case, the classification scores for all data points need to be provided only once as scores.class0 and only the positive/foreground weights for each data point need to be provided in weights.class0, while the converse probability for the negative class is automatically set to weights.class1=1.0-weights.class0.
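The two input styles described for the hard-labeled case can be sketched as follows. This is a minimal illustration, assuming the PRROC package is installed; the variable names and the seed are made up for the example. Both calls describe the same data, so the two printed areas should agree up to numerical error.

```r
library(PRROC)

set.seed(1)                        # arbitrary seed, only for reproducibility
pos <- rnorm(200)                  # scores of the positive/foreground class
neg <- rnorm(200, -1)              # scores of the negative/background class

# Variant 1: scores provided separately for each class
pr_sep <- pr.curve(scores.class0 = pos, scores.class1 = neg)

# Variant 2: all scores in one vector, hard labels (1/0) as weights.class0;
# scores.class1 and weights.class1 are left at their defaults
scores <- c(pos, neg)
labels <- c(rep(1, length(pos)), rep(0, length(neg)))
pr_lab <- pr.curve(scores.class0 = scores, weights.class0 = labels)

print(pr_sep$auc.integral)
print(pr_lab$auc.integral)
```

Variant 2 is also the call pattern for soft-labeled data: replacing the 0/1 labels by probabilities in [0, 1] turns the same call into scenario 3.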
type |
always |
auc.integral |
area under the curve computed by integration of the piecewise function |
auc.davis.goadrich |
area under the curve computed using the interpolation of Davis & Goadrich (2006). Is |
curve |
the PR curve as a matrix, where the first column contains recall, the second contains precision, and the third contains the corresponding threshold on the scores. |
max |
the maximum PR curve (if |
min |
the minimum PR curve (if |
rand |
the PR curve of a random classifier (if |
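As a brief sketch of how the components listed above might be accessed (assuming the PRROC package is installed; variable names and seed are illustrative only):

```r
library(PRROC)

set.seed(2)                        # arbitrary seed, only for reproducibility
pos <- rnorm(100)                  # scores of the positive class
neg <- rnorm(100, -1)              # scores of the negative class

# curve = TRUE is required for the curve component to be filled
pr <- pr.curve(scores.class0 = pos, scores.class1 = neg, curve = TRUE)

pr$auc.integral                    # area from integrating the piecewise function
pr$auc.davis.goadrich              # area from the Davis & Goadrich interpolation

# curve matrix: column 1 = recall, column 2 = precision, column 3 = threshold
recall    <- pr$curve[, 1]
precision <- pr$curve[, 2]
threshold <- pr$curve[, 3]
head(pr$curve)
```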
Jan Grau and Jens Keilwagen
J. Davis and M. Goadrich. The relationship between precision-recall and ROC curves. In Proceedings of the 23rd International Conference on Machine Learning, pages 233–240, New York, NY, USA, 2006. ACM.
J. Keilwagen, I. Grosse, and J. Grau. Area under precision-recall curves for weighted and unweighted data, PLOS ONE (9) 3, 2014.
J. Grau, I. Grosse, and J. Keilwagen. PRROC: computing and visualizing precision-recall and receiver operating characteristic curves in R. Bioinformatics, 31(15):2595-2597, 2015.
# create artificial scores as random numbers
x <- rnorm( 1000 )
y <- rnorm( 1000, -1 )
# compute area under PR curve for the hard-labeled case
pr <- pr.curve( x, y )
print( pr )
# compute PR curve and area under curve
pr <- pr.curve( x, y, curve = TRUE )
# plot curve
plot(pr)
# create artificial weights
x.weights <- runif( 1000 )
y.weights <- runif( 1000 )
# compute PR curve and area under curve for weighted, hard-labeled data
pr <- pr.curve( x, y, x.weights, y.weights, curve = TRUE )
# and plot the curve
plot(pr)
# compute PR curve and area under curve,
# and maximum, minimum, and random PR curve for weighted, hard-labeled data
pr <- pr.curve(x, y, x.weights, y.weights, curve = TRUE,
    max.compute = TRUE, min.compute = TRUE, rand.compute = TRUE)
# plot all three curves
plot(pr, max.plot = TRUE, min.plot = TRUE, rand.plot = TRUE, fill.area = TRUE)
# concatenate the drawn scores
scores <- c(x,y)
# and create artificial soft-labels
weights <- c(runif(1000, min = 0.5, max = 1), runif(1000, min = 0, max = 0.5))
# compute PR curve and area under curve,
# and maximum, minimum, and random PR curve for soft-labeled data
pr <- pr.curve(scores.class0 = scores, weights.class0 = weights, curve = TRUE,
    max.compute = TRUE, min.compute = TRUE, rand.compute = TRUE)
# plot all three curves
plot(pr, max.plot = TRUE, min.plot = TRUE, rand.plot = TRUE, fill.area = TRUE)
# print the areas under the curves
print(pr)
# generate classification scores of a second classifier
scores.2 <- c(rnorm( 1000 ), rnorm( 1000, -2 ))
# and compute the PR curve
pr.2 <- pr.curve(scores.class0 = scores.2, weights.class0 = weights, curve = TRUE)
# plot all three curves for the first classifier in red
plot(pr, max.plot = TRUE, min.plot = TRUE, rand.plot = TRUE, fill.area = TRUE,
    color = "red", auc.main = FALSE)
# and add the curve for the second classifier
plot(pr.2, add = TRUE, color = "green")
Prints a PRROC object.
## S3 method for class 'PRROC'
print(x, ...)
x |
|
... |
see |
The print method for PRROC objects prints the area under the (PR or ROC) curve, and (if curve=TRUE in pr.curve or roc.curve) the range of classification scores. If, in addition, max.compute=TRUE, min.compute=TRUE, and/or rand.compute=TRUE were set when the PRROC object was computed using pr.curve or roc.curve, a relative area under the curve is reported, i.e., the minimal AUC subtracted from the original AUC and the result divided by the difference of maximum and minimum AUC.
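The relative area under the curve described above amounts to a simple rescaling. The following base-R sketch spells out the arithmetic; relative_auc is a hypothetical helper written for illustration and is not a function of the PRROC package, and the input values are made up.

```r
# Hypothetical helper (not part of PRROC): rescales an AUC so that the
# minimum achievable AUC maps to 0 and the maximum achievable AUC maps to 1.
relative_auc <- function(auc, auc_min, auc_max) {
  (auc - auc_min) / (auc_max - auc_min)
}

# Made-up values: (0.8 - 0.3) / (1.0 - 0.3) = 0.5 / 0.7
rel <- relative_auc(auc = 0.8, auc_min = 0.3, auc_max = 1.0)
print(rel)
```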
Jan Grau and Jens Keilwagen
# create artificial scores as random numbers
x <- rnorm( 1000 )
y <- rnorm( 1000, -1 )
# compute area under PR curve
pr <- pr.curve( x, y )
print( pr )
# compute area under ROC curve
roc <- roc.curve( x, y )
print( roc )
Computes the area under the receiver operating characteristic (ROC) curve for weighted and unweighted data. In addition to the area under the curve, the curve itself can be obtained by setting argument curve to TRUE.
roc.curve( scores.class0, scores.class1=scores.class0, weights.class0=NULL,
    weights.class1 = {if(is.null(weights.class0)){NULL}else{1-weights.class0}},
    sorted = FALSE, curve = FALSE,
    max.compute=F, min.compute=F, rand.compute=F)
scores.class0 |
the classification scores of i) all data points or ii) only the data points belonging to the positive class. In the first case, scores.class1 should not be assigned an explicit value, but left at the default (scores.class1=scores.class0). In addition, weights.class0 needs to contain the class labels of the data points (1 for positive class, 0 for negative class) or the soft-labels for the positive class, i.e., the probability for each data point to belong to the positive class. Accordingly, weights.class1 should be left at the default value (1-weights.class0). In the second case, the scores for the negative data points need to be provided in scores.class1. In this case, weights.class0 and weights.class1 need to be provided only for soft-labelling and should be of the same length as scores.class0 and scores.class1, respectively. |
scores.class1 |
the scores of the negative class if provided separately (see scores.class0) |
weights.class0 |
the weights for the data points of the positive class in same ordering as |
weights.class1 |
the weights for the data points of the negative class in same ordering as |
sorted |
|
curve |
|
max.compute |
|
min.compute |
|
rand.compute |
|
This function computes the area under a receiver operating characteristic (ROC) curve and, optionally, the curve itself, and returns it as a PRROC object (see below). It can be used under different scenarios:
1. Standard, hard-labeled classification problems:
Each data point is uniquely assigned to one out of two possible classes. In this case, the classification scores may either be provided separately for the data points of each of the classes, i.e., as scores.class0 for the data points from the positive/foreground class and as scores.class1 for the data points of the negative/background class; or the classification scores for all data points are provided as scores.class0 and the labels are provided as numerical values (1 for the positive class, 0 for the negative class) as weights.class0.
2. Weighted, hard-labeled classification problems:
Each data point is uniquely assigned to one out of two possible classes, where each data point additionally has a weight assigned, for instance multiplicities in the original data set. In this case, the classification scores need to be provided separately for the data points of each of the classes, i.e., as scores.class0 for the data points from the positive/foreground class and as scores.class1 for the data points of the negative/background class. In addition, the weights for the data points must be provided as weights.class0 and weights.class1, respectively.
3. Soft-labeled classification problems:
Each data point belongs to both of the two classes with a certain probability, where for each data point, these two probabilities add up to 1. In this case, the classification scores for all data points need to be provided only once as scores.class0 and only the positive/foreground weights for each data point need to be provided in weights.class0, while the converse probability for the negative class is automatically set to weights.class1=1.0-weights.class0.
type |
always |
auc |
area under the curve |
curve |
the ROC curve as a matrix, where the first column contains the false-positive rate, the second contains recall (sensitivity), and the third contains the corresponding threshold on the scores. |
max |
the maximum ROC curve (if |
min |
the minimum ROC curve (if |
rand |
the ROC curve of a random classifier (if |
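A short sketch of reading the returned components, assuming the PRROC package is installed (variable names and seed are illustrative only):

```r
library(PRROC)

set.seed(3)                        # arbitrary seed, only for reproducibility
pos <- rnorm(100)                  # scores of the positive class
neg <- rnorm(100, -1)              # scores of the negative class

# curve = TRUE is required for the curve component to be filled
roc <- roc.curve(scores.class0 = pos, scores.class1 = neg, curve = TRUE)

roc$auc                            # area under the ROC curve

# curve matrix: column 1 = false-positive rate, column 2 = sensitivity,
# column 3 = corresponding score threshold
fpr         <- roc$curve[, 1]
sensitivity <- roc$curve[, 2]
threshold   <- roc$curve[, 3]
head(roc$curve)
```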
Jan Grau and Jens Keilwagen
J. Keilwagen, I. Grosse, and J. Grau. Area under precision-recall curves for weighted and unweighted data, PLOS ONE (9) 3, 2014.
J. Grau, I. Grosse, and J. Keilwagen. PRROC: computing and visualizing precision-recall and receiver operating characteristic curves in R. Bioinformatics, 31(15):2595-2597, 2015.
# create artificial scores as random numbers
x <- rnorm( 1000 )
y <- rnorm( 1000, -1 )
# compute area under ROC curve for the hard-labeled case
roc <- roc.curve( x, y )
print( roc )
# compute ROC curve and area under curve
roc <- roc.curve( x, y, curve = TRUE )
# plot curve
plot(roc)
# create artificial weights
x.weights <- runif( 1000 )
y.weights <- runif( 1000 )
# compute ROC curve and area under curve for weighted, hard-labeled data
roc <- roc.curve( x, y, x.weights, y.weights, curve = TRUE )
# and plot the curve
plot(roc)
# compute ROC curve and area under curve,
# and maximum, minimum, and random ROC curve for weighted, hard-labeled data
roc <- roc.curve(x, y, x.weights, y.weights, curve = TRUE,
    max.compute = TRUE, min.compute = TRUE, rand.compute = TRUE)
# plot all three curves
plot(roc, max.plot = TRUE, min.plot = TRUE, rand.plot = TRUE, fill.area = TRUE)
# concatenate the drawn scores
scores <- c(x,y)
# and create artificial soft-labels
weights <- c(runif(1000, min = 0.5, max = 1), runif(1000, min = 0, max = 0.5))
# compute ROC curve and area under curve,
# and maximum, minimum, and random ROC curve for soft-labeled data
roc <- roc.curve(scores.class0 = scores, weights.class0 = weights, curve = TRUE,
    max.compute = TRUE, min.compute = TRUE, rand.compute = TRUE)
# plot all three curves
plot(roc, max.plot = TRUE, min.plot = TRUE, rand.plot = TRUE, fill.area = TRUE)
# print the areas under the curves
print(roc)
# generate classification scores of a second classifier
scores.2 <- c(rnorm( 1000 ), rnorm( 1000, -2 ))
# and compute the ROC curve
roc.2 <- roc.curve(scores.class0 = scores.2, weights.class0 = weights, curve = TRUE)
# plot all three curves for the first classifier in red
plot(roc, max.plot = TRUE, min.plot = TRUE, rand.plot = TRUE, fill.area = TRUE,
    color = "red", auc.main = FALSE)
# and add the curve for the second classifier
plot(roc.2, add = TRUE, color = "green")