Package 'givitiR'

Title: The GiViTI Calibration Test and Belt
Description: Functions to assess the calibration of logistic regression models with the GiViTI (Gruppo Italiano per la Valutazione degli interventi in Terapia Intensiva, Italian Group for the Evaluation of the Interventions in Intensive Care Units - see <http://www.giviti.marionegri.it/>) approach. The approach consists of a graphical tool, namely the GiViTI calibration belt, and the associated statistical test. These tools can be used both to evaluate the internal calibration (i.e. the goodness of fit) and to assess the validity of an externally developed model.
Authors: Giovanni Nattino [cre, aut], Stefano Finazzi [aut], Guido Bertolini [aut], Carlotta Rossi [aut], Greta Carrara [aut]
Maintainer: Giovanni Nattino <[email protected]>
License: GPL-3
Version: 1.3
Built: 2024-10-27 06:28:05 UTC
Source: CRAN

Help Index


Calibration Belt Significant Deviations

Description

calibrationBeltIntersections returns the intervals where the calibration belt significantly deviates from the bisector.

Usage

calibrationBeltIntersections(cbBound, seqP, minMax)

Arguments

cbBound

A data.frame object with the numeric variables "U" and "L", representing the upper and lower boundary of the calibration belt.

seqP

The vector of the probabilities at which the points of the calibration belt have been evaluated.

minMax

A list with two elements, named min and max, representing the minimum and maximum probabilities in the model under evaluation.

Value

A list with two components, overBisector and underBisector. Each component is a list containing all the intervals where the calibration belt is significantly over/under the bisector.

See Also

givitiCalibrationBelt and plot.givitiCalibrationBelt to compute and plot the calibration belt, and givitiCalibrationTest to perform the associated calibration test.

Examples

e <- runif(1000)
logite <- logit(e)
eMod <- logistic(logite + logite^2)
o <- rbinom(1000, size = 1, prob = eMod)
data <- data.frame(e = e, o = o, logite = logite)

seqP <- seq(from = .01, to = .99, by = .01)
seqG <- logit(seqP)

minMax <- list(min = min(e), max = max(e))

fwLR <- polynomialLogRegrFw(data, .95, 4, 1)
cbBound <- calibrationBeltPoints(data, seqG, fwLR$m, fwLR$fit, .95, .90, "external")
calibrationBeltIntersections(cbBound, seqP, minMax)
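
As a rough illustration of what a significant deviation means here, the following base R sketch scans a grid of probabilities and reports the intervals where the lower boundary lies above the bisector (belt over the bisector) or the upper boundary lies below it (belt under the bisector). The helper deviationIntervals and the toy boundaries are hypothetical and not part of the package; the package's own function also uses minMax and may locate the crossing points differently.

```r
# Hypothetical helper: group consecutive TRUE values of a logical vector
# into intervals over the grid seqP. Not part of givitiR.
deviationIntervals <- function(flag, seqP) {
  if (!any(flag)) return(list())
  runs <- rle(flag)
  ends <- cumsum(runs$lengths)
  starts <- ends - runs$lengths + 1
  keep <- which(runs$values)
  lapply(keep, function(i) c(seqP[starts[i]], seqP[ends[i]]))
}

seqP <- seq(from = 0.01, to = 0.99, by = 0.01)

# Toy belt (assumption): a band of half-width 0.05 around a curve that lies
# above the bisector at low probabilities and below it at high ones.
centre <- seqP + 0.15 * sin(2 * pi * seqP)
cbBound <- data.frame(L = centre - 0.05, U = centre + 0.05)

res <- list(
  overBisector  = deviationIntervals(cbBound$L > seqP, seqP),
  underBisector = deviationIntervals(cbBound$U < seqP, seqP)
)
res
```

The result has the same shape as the documented return value: a list with the components overBisector and underBisector, each holding one two-element vector per interval.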

Calibration Belt Confidence Region

Description

calibrationBeltPoints computes the points defining the boundary of the confidence region.

Usage

calibrationBeltPoints(data, seqG, m, fit, thres, cLevel, devel)

Arguments

data

A data.frame object with the numeric variables "o", "e" and "logite", representing the binary outcomes, the probabilities of the model under evaluation and the logit of the probabilities, respectively. The variable "e" must contain values between 0 and 1. The variable "o" must assume only the values 0 and 1.

seqG

A vector containing the logit of the probabilities where the points of the calibration belt will be evaluated.

m

A scalar integer representing the degree of the polynomial at the end of the forward selection.

fit

An object of class glm containing the output of the fit of the logistic regression model at the end of the iterative forward selection.

thres

A numeric scalar between 0 and 1 representing 1 - the significance level adopted in the forward selection.

cLevel

A numeric scalar between 0 and 1 representing the confidence level that will be used for the confidence region.

devel

A character string specifying if the model has been fit on the same dataset under evaluation (internal) or if the model has been developed on an external sample (external).

Value

A data.frame object with two columns, "U" and "L", containing the points of the upper and lower boundary of the cLevel*100%-level calibration belt evaluated at values seqG.

See Also

givitiCalibrationBelt and plot.givitiCalibrationBelt to compute and plot the calibration belt, and givitiCalibrationTest to perform the associated calibration test.

Examples

e <- runif(100)
logite <- logit(e)
o <- rbinom(100, size = 1, prob = e)
data <- data.frame(e = e, o = o, logite = logite)

seqG <- logit(seq(from = .01, to = .99, by = .01))

fwLR <- polynomialLogRegrFw(data, .95, 4, 1)

calibrationBeltPoints(data, seqG, fwLR$m, fwLR$fit, .95, .90, "external")

Calibration Belt

Description

givitiCalibrationBelt implements the computations necessary to plot the calibration belt.

Usage

givitiCalibrationBelt(o, e, devel, subset = NULL, confLevels = c(0.8, 0.95),
  thres = 0.95, maxDeg = 4, nPoints = 200)

Arguments

o

A numeric vector representing the binary outcomes. The elements must assume only the values 0 or 1. The predictions in e must represent the probability of the event coded as 1.

e

A numeric vector containing the predictions of the model under evaluation. The elements must be numeric and between 0 and 1. The length of the vector must be equal to the length of the vector o.

devel

A character string specifying if the model has been fit on the same dataset under evaluation (internal) or if the model has been developed on an external sample (external). See also the 'Details' section.

subset

An optional boolean vector specifying the subset of observations to be considered.

confLevels

A numeric vector containing the confidence levels of the calibration belt. The default values are set to .80 and .95.

thres

A numeric scalar between 0 and 1 representing 1 - the significance level adopted in the forward selection. The default value is 0.95.

maxDeg

The maximum degree considered in the forward selection. The default value is 4.

nPoints

A numeric scalar indicating the number of points to be considered to plot the calibration belt. The default value is 200.

Details

The calibration belt and the associated test can be used to evaluate the calibration of a model either in external samples or in the development dataset. However, the two cases have different requirements. When a model is evaluated on independent samples, the calibration belt and the related test can be applied regardless of the method used to fit the model. Conversely, they can be used on the development set only if the model was fitted with logistic regression.
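
The distinction between the two settings can be sketched as follows. The simulated data, the single predictor x and the helper simulate are assumptions made for illustration only. The calls to the package are guarded so that the sketch also runs where givitiR is not installed.

```r
set.seed(1)

# Simulated development and validation samples (assumption: one predictor x).
simulate <- function(n) {
  x <- rnorm(n)
  data.frame(x = x, y = rbinom(n, size = 1, prob = plogis(-0.5 + x)))
}
dev <- simulate(500)
val <- simulate(500)

# The model under evaluation is fitted with logistic regression, so it
# qualifies for the internal assessment as well as the external one.
fit <- glm(y ~ x, family = binomial, data = dev)

# Predictions on the development data and on the independent sample.
eInternal <- predict(fit, type = "response")
eExternal <- predict(fit, newdata = val, type = "response")

if (requireNamespace("givitiR", quietly = TRUE)) {
  # devel = "internal": predictions evaluated on the development set.
  print(givitiR::givitiCalibrationTest(o = dev$y, e = eInternal,
                                       devel = "internal"))
  # devel = "external": predictions evaluated on an independent sample.
  print(givitiR::givitiCalibrationTest(o = val$y, e = eExternal,
                                       devel = "external"))
}
```

Had the model been fitted with a method other than logistic regression, only the second call, on the independent sample, would be appropriate according to the requirements above.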

Value

An object of class givitiCalibrationBelt. After computing the calibration belt with the present function, the plot method can be used to plot the calibration belt. The object returned is a list that contains the following components:

n

The size of the sample evaluated in the analysis, after discarding missing values from the vectors o and e.

resultCheck

Result of the check on the data. If the data are compatible with the construction of the calibration belt, the value is the boolean TRUE. Otherwise, the element contains a character string describing the problem found.

m

The degree of the polynomial at the end of the forward selection.

statistic

The value of the test's statistic.

p.value

The p-value of the test.

seqP

The vector of the probabilities at which the points of the calibration belt have been evaluated.

minMax

A list with two elements, named min and max, representing the minimum and maximum probabilities in the model under evaluation.

confLevels

The vector containing the confidence levels of the calibration belt.

intersByConfLevel

A list whose elements report the intervals where the calibration belt is significantly over/under the bisector for each confidence level in confLevels.

See Also

plot.givitiCalibrationBelt to plot the calibration belt and givitiCalibrationTest to perform the associated calibration test.

Examples

# Randomly generated model, well calibrated by construction
e <- runif(100)
o <- rbinom(100, size = 1, prob = e)
cb <- givitiCalibrationBelt(o, e, "external")
plot(cb)

# Randomly generated model, poorly calibrated by construction
e <- runif(100)
o <- rbinom(100, size = 1, prob = logistic(logit(e)+2))
cb <- givitiCalibrationBelt(o, e, "external")
plot(cb)

Table of the Calibration Belt Significant Deviations

Description

givitiCalibrationBeltTable prints, on the graphical area of the calibration belt plot, the table that summarizes the significant deviations from the line of perfect calibration (i.e. the bisector of the first quadrant).

Usage

givitiCalibrationBeltTable(cb, tableStrings, grayLevels, xlim, ylim)

Arguments

cb

A givitiCalibrationBelt object, to be generated with the function givitiCalibrationBelt.

tableStrings

Optional. A list with four character elements named overBisString, underBisString, confLevelString and neverString. The four strings of the list are printed instead of the texts "Over the bisector"/"Under the bisector"/"Confidence level"/"NEVER" in the table reporting the intersections of the calibration belt with the bisector.

grayLevels

A vector containing the code of the gray levels used in the plot of the calibration belt.

xlim, ylim

Numeric vectors of length 2, giving the x and y coordinates ranges. Default values are c(0,1).

Value

The function prints the table on the graphical area.


Calibration Test

Description

givitiCalibrationTest performs the calibration test associated with the calibration belt.

Usage

givitiCalibrationTest(o, e, devel, subset = NULL, thres = 0.95,
  maxDeg = 4)

Arguments

o

A numeric vector representing the binary outcomes. The elements must assume only the values 0 or 1. The predictions in e must represent the probability of the event coded as 1.

e

A numeric vector containing the probabilities of the model under evaluation. The elements must be numeric and between 0 and 1. The length of the vector must be equal to the length of the vector o.

devel

A character string specifying if the model has been fit on the same dataset under evaluation (internal) or if the model has been developed on an external sample (external). See also the 'Details' section.

subset

An optional boolean vector specifying the subset of observations to be considered.

thres

A numeric scalar between 0 and 1 representing 1 - the significance level adopted in the forward selection. The default value is 0.95.

maxDeg

The maximum degree considered in the forward selection. The default value is 4.

Details

The calibration belt and the associated test can be used to evaluate the calibration of a model either in external samples or in the development dataset. However, the two cases have different requirements. When a model is evaluated on independent samples, the calibration belt and the related test can be applied regardless of the method used to fit the model. Conversely, they can be used on the development set only if the model was fitted with logistic regression.

Value

A list of class htest containing the following components:

statistic

The value of the test's statistic.

p.value

The p-value of the test.

null.value

The vector of coefficients hypothesized under the null hypothesis, that is, the parameters corresponding to the bisector.

alternative

A character string describing the alternative hypothesis.

method

A character string indicating what type of calibration test (internal or external) was performed.

estimate

The estimate of the coefficients of the polynomial logistic regression.

data.name

A character string giving the name(s) of the data.

See Also

givitiCalibrationBelt and plot.givitiCalibrationBelt to compute and plot the calibration belt.

Examples

# Randomly generated model, well calibrated by construction
e <- runif(100)
o <- rbinom(100, size = 1, prob = e)
givitiCalibrationTest(o, e, "external")

# Randomly generated model, poorly calibrated by construction
e <- runif(100)
o <- rbinom(100, size = 1, prob = logistic(logit(e)+2))
givitiCalibrationTest(o, e, "external")

Computation of the Calibration Test

Description

givitiCalibrationTestComp implements the computations necessary to perform the calibration test associated with the calibration belt.

Usage

givitiCalibrationTestComp(o, e, devel, thres, maxDeg)

Arguments

o

A numeric vector representing the binary outcomes. The elements must assume only the values 0 or 1. The predictions in e must represent the probability of the event coded as 1.

e

A numeric vector containing the probabilities of the model under evaluation. The elements must be numeric and between 0 and 1. The length of the vector must be equal to the length of the vector o.

devel

A character string specifying if the model has been fit on the same dataset under evaluation (internal) or if the model has been developed on an external sample (external). See also the 'Details' section.

thres

A numeric scalar between 0 and 1 representing 1 - the significance level adopted in the forward selection.

maxDeg

The maximum degree considered in the forward selection.

Details

The calibration belt and the associated test can be used to evaluate the calibration of a model either in external samples or in the development dataset. However, the two cases have different requirements. When a model is evaluated on independent samples, the calibration belt and the related test can be applied regardless of the method used to fit the model. Conversely, they can be used on the development set only if the model was fitted with logistic regression.

Value

A list containing the following components:

data

A data.frame object with the numeric variables "o", "e" provided in the input and the variable "logite", the logit of the probabilities.

nrowOrigData

The size of the original sample, i.e. the length of the vectors e and o.

calibrationStat

The value of the test's statistic.

calibrationP

The p-value of the test.

m

The degree of the polynomial at the end of the forward selection.

fit

An object of class glm containing the output of the fit of the logistic regression model at the end of the iterative forward selection.

See Also

givitiCalibrationBelt and plot.givitiCalibrationBelt to compute and plot the calibration belt, and givitiCalibrationTest to perform the associated calibration test.

Examples

e <- runif(100)
o <- rbinom(100, size = 1, prob = e)
givitiCalibrationTestComp(o, e, "external", .95, 4)

Check of the arguments' values

Description

Check of the coherence of the values passed to the functions givitiCalibrationTest and givitiCalibrationBelt.

Usage

givitiCheckArgs(o, e, devel, thres, maxDeg)

Arguments

o

A numeric vector representing the binary outcomes. The elements must assume only the values 0 or 1. The predictions in e must represent the probability of the event coded as 1.

e

A numeric vector containing the probabilities of the model under evaluation. The elements must be numeric and between 0 and 1. The length of the vector must be equal to the length of the vector o.

devel

A character string specifying if the model has been fit on the same dataset under evaluation (internal) or if the model has been developed on an external sample (external).

thres

A numeric scalar between 0 and 1 representing 1 - the significance level adopted in the forward selection.

maxDeg

The maximum degree considered in the forward selection.

Value

The function produces an error if the elements provided through the arguments do not meet the constraints reported.
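
The kind of validation described above might look like the following base R sketch. checkArgsSketch is a hypothetical stand-in, not the package's code: the error messages, the treatment of missing values, the inclusive bounds and the requirement that maxDeg be at least 1 are all assumptions.

```r
# Hypothetical validator mirroring the documented constraints; not givitiR code.
checkArgsSketch <- function(o, e, devel, thres, maxDeg) {
  if (!is.numeric(o) || !is.numeric(e)) stop("'o' and 'e' must be numeric")
  if (length(o) != length(e)) stop("'o' and 'e' must have the same length")
  # Missing values are ignored here (assumption).
  if (!all(o[!is.na(o)] %in% c(0, 1))) stop("'o' must contain only 0 and 1")
  if (any(e < 0 | e > 1, na.rm = TRUE)) stop("'e' must lie between 0 and 1")
  if (!is.character(devel) || length(devel) != 1 ||
      !devel %in% c("internal", "external")) {
    stop("'devel' must be \"internal\" or \"external\"")
  }
  if (!is.numeric(thres) || length(thres) != 1 || thres < 0 || thres > 1) {
    stop("'thres' must be a number between 0 and 1")
  }
  if (!is.numeric(maxDeg) || length(maxDeg) != 1 || maxDeg < 1) {
    stop("'maxDeg' must be a number greater than or equal to 1")
  }
  invisible(TRUE)
}

# Valid arguments pass silently.
checkArgsSketch(o = c(0, 1, 1), e = c(0.2, 0.7, 0.9),
                devel = "external", thres = 0.95, maxDeg = 4)

# An outcome other than 0 or 1 raises an error (caught here by try).
try(checkArgsSketch(o = c(0, 2), e = c(0.2, 0.7),
                    devel = "external", thres = 0.95, maxDeg = 4))
```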


Check of data

Description

The function verifies that the data are compatible with the construction of the calibration belt. In particular, the function checks that the predictions provided do not completely separate the outcomes and that at least two events and non-events are present in the data.

Usage

givitiCheckData(o, e)

Arguments

o

A numeric vector representing the binary outcomes. The elements must assume only the values 0 or 1. The predictions in e must represent the probability of the event coded as 1.

e

A numeric vector containing the probabilities of the model under evaluation. The elements must be numeric and between 0 and 1. The length of the vector must be equal to the length of the vector o.

Value

The output is TRUE if the data do not show any of the reported problems. Otherwise, the function returns a string describing the problem found.
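
A minimal base R sketch of the two checks is given below. checkDataSketch and its messages are hypothetical. The sketch reads "at least two events and non-events" as two of each and treats complete separation as every event having a higher (or every event a lower) prediction than every non-event; both readings are assumptions.

```r
# Hypothetical re-implementation of the documented checks; not givitiR code.
checkDataSketch <- function(o, e) {
  nEvents <- sum(o == 1)
  nNonEvents <- sum(o == 0)
  if (nEvents < 2 || nNonEvents < 2) {
    return("At least two events and two non-events are required")
  }
  # Complete separation: the predictions of events and non-events do not overlap.
  separated <- min(e[o == 1]) > max(e[o == 0]) ||
    max(e[o == 1]) < min(e[o == 0])
  if (separated) {
    return("The predictions completely separate the outcomes")
  }
  TRUE
}

# Events all have higher predictions than non-events: a string is returned.
checkDataSketch(o = c(0, 0, 1, 1), e = c(0.1, 0.2, 0.8, 0.9))

# Overlapping predictions and two outcomes of each kind: TRUE is returned.
checkDataSketch(o = c(0, 1, 0, 1), e = c(0.1, 0.2, 0.8, 0.9))
```

Returning TRUE or a character string, rather than raising an error, follows the return value documented above.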


givitiR: assessing the calibration of binary outcome models with the GiViTI calibration belt.

Description

The package 'givitiR' provides the functions to plot the GiViTI calibration belt and to compute the associated statistical test.

Details

The name of the approach derives from the GiViTI (Gruppo Italiano per la valutazione degli interventi in Terapia Intensiva, Italian Group for the Evaluation of the Interventions in Intensive Care Units), an international network of intensive care units (ICU) established in Italy in 1992. The group counts more than 400 ICUs from 7 countries, with about half of the participating centers continuously collecting data on the admitted patients through the PROSAFE project (PROmoting patient SAFEty and quality improvement in critical care). For further information, see the package vignette and the references therein.

The GiViTI calibration belt has been developed within the methodological research promoted by the GiViTI network, with the purposes of a) enhancing the quality of the logistic regression models built in the group's projects and b) providing the participating ICUs with detailed feedback about their quality of care. A description of the approach and examples of applications are reported in the package vignette.

The main functions of the package are listed below.

Fitting the calibration belt

givitiCalibrationBelt implements the computations necessary to plot the calibration belt.

Plotting the calibration belt

plot.givitiCalibrationBelt plots the calibration belt.

Computing the calibration test

givitiCalibrationTest performs the calibration test associated with the calibration belt.


CDF of the Calibration Statistic Under the Null Hypothesis

Description

givitiStatCdf returns the cumulative distribution function (CDF) of the calibration statistic under the null hypothesis.

Usage

givitiStatCdf(t, m, devel, thres)

Arguments

t

The argument of the CDF. Must be a scalar value.

m

The scalar integer representing the degree of the polynomial at the end of the forward selection.

devel

A character string specifying if the model has been fit on the same dataset under evaluation (internal) or if the model has been developed on an external sample (external).

thres

A numeric scalar between 0 and 1 representing 1 - the significance level adopted in the forward selection.

Value

A number representing the value of the CDF evaluated in t.

See Also

givitiCalibrationBelt and plot.givitiCalibrationBelt to compute and plot the calibration belt, and givitiCalibrationTest to perform the associated calibration test.

Examples

givitiStatCdf(3, 1, "external", .95)
givitiStatCdf(3, 2, "internal", .95)

Information of SAPS II score and outcome of 1,000 ICU patients.

Description

A dataset containing clinical information on 1,000 patients admitted to Italian Intensive Care Units joining the GiViTI network (Gruppo Italiano per la valutazione degli interventi in Terapia Intensiva, Italian Group for the Evaluation of the Interventions in Intensive Care Units). The data were collected within the ProSAFE project, an Italian observational study based on the continuous collection of clinical data in more than 200 Italian ICUs. The purpose of the project is the continuous surveillance of the quality of care provided in the participating centres. The actual values of the variables have been modified to protect subject confidentiality.

Usage

icuData

Format

A data frame with 1000 rows and 33 variables. The dataset contains, for each predictor of the SAPSII score, both the clinical information and the weight of that variable in the score (the variable with the suffix '_NUM').

outcome

hospital outcome, numeric binary variable with values 1 (deceased) and 0 (alive).

probSaps

probability estimated by the SAPSII prognostic model.

sapsScore

SAPSII score.

age,age_NUM

age, factor variable with levels (in years): '<40', '40-59', '60-69', '70-74', '75-80', '>=80'.

adm,adm_NUM

type of admission, factor variable with 3 levels: 'unschSurg' (unscheduled surgery), 'med' (medical), 'schSurg' (scheduled surgery).

chronic,chronic_NUM

chronic diseases, factor variable with 4 levels: 'noChronDis' (no chronic disease), 'metCarc' (metastatic carcinoma), 'hemMalig' (hematologic malignancy), 'aids' (AIDS).

gcs,gcs_NUM

Glasgow Coma Scale, factor variable with 5 levels: '3-5', '6-8', '9-10', '11-13', '14-15'.

BP,BP_NUM

systolic blood pressure, factor variable with 4 levels (in mmHg): '<70', '70-99', '100-199', '>=200'.

HR,HR_NUM

heart rate, factor variable with 5 levels: '<40', '40-69', '70-119', '120-159', '>=160'.

temp,temp_NUM

temperature, factor variable with 2 levels (in degrees Celsius): '<39', '>=39'.

urine,urine_NUM

urine output, factor variable with 3 levels (in L/24h): '<0.5', '0.5-0.99', '>=1'.

urea,urea_NUM

serum urea, factor variable with 3 levels (in g/L): '<0.60', '0.60-1.79', '>=1.80'.

WBC,WBC_NUM

white blood cell count (WBC), factor variable with 3 levels (in 1/mm3): '<1', '1-19', '>=20'.

potassium,potassium_NUM

potassium, factor variable with 3 levels (in mEq/L): '<3', '3-4.9', '>=5'.

sodium,sodium_NUM

sodium, factor variable with 3 levels (in mEq/L): '<125', '125-144', '>=145'.

HCO3,HCO3_NUM

HCO3, factor variable with 3 levels (in mEq/L): '<15', '15-19', '>=20'.

bili,bili_NUM

bilirubin, factor variable with 3 levels (in mg/dL): '<4', '4-5.9', '>=6'.

paFiIfVent,paFiIfVent_NUM

mechanical ventilation and CPAP PaO2/FIO2, factor variable with 4 levels (PaO2/FIO2 in mmHg): 'noVent' (not ventilated), 'vent_<100' (ventilated and PaO2/FIO2 <100), 'vent_100-199' (ventilated and PaO2/FIO2 in 100-199), 'vent_>=200' (ventilated and PaO2/FIO2 >= 200).

Details

The data contain the information needed to apply the SAPSII model, a prognostic model developed to predict hospital mortality (Le Gall et al., 1993). Both the computed SAPSII score and the associated probability of death are variables of the dataset. The score is an integer ranging from 0 to 163 that describes the severity of the patient's condition (the higher the score, the more severe the condition). The probability is computed from the score through the formula reported in the original paper. The dataset also contains the hospital survival of the patients.
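
For reference, the score-to-probability conversion published by Le Gall et al. (1993) can be written in a few lines of base R. The helper name sapsProbSketch is hypothetical and not part of the package. Whether the probSaps column of icuData reproduces these values exactly has not been verified here, since the values in the dataset have been modified.

```r
# Published SAPS II equation (Le Gall et al., 1993):
#   logit = -7.7631 + 0.0737 * score + 0.9971 * ln(score + 1)
# sapsProbSketch is a hypothetical helper name, not a function of givitiR.
sapsProbSketch <- function(score) {
  lp <- -7.7631 + 0.0737 * score + 0.9971 * log(score + 1)
  plogis(lp)  # logistic transformation, exp(lp) / (1 + exp(lp))
}

# Predicted probability of hospital death for a few scores.
sapsProbSketch(c(0, 20, 40, 60, 80))
```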

Source

http://www.giviti.marionegri.it/Default.asp (in Italian only)

References

Le Gall, Jean-Roger, Stanley Lemeshow, and Fabienne Saulnier. "A new simplified acute physiology score (SAPS II) based on a European/North American multicenter study." JAMA 270, no. 24 (1993): 2957-2963.

The GiViTI Network, Prosafe Project - 2014 report. Sestante Edizioni: Bergamo, 2015. http://www.giviti.marionegri.it/Download/ReportPROSAFE_2014_EN_Polivalenti_ITALIA.pdf.


Logit and logistic functions

Description

logit and logistic implement the logit and logistic transformations, respectively.

Usage

logit(p)

logistic(x)

Arguments

p

A numeric vector whose components are numbers between 0 and 1.

x

A numeric vector.

Value

The functions apply the logit and logistic transformation to each element of the vector passed as argument. In particular, logit(p)=ln(p/(1-p)) and logistic(x)=exp(x)/(1+exp(x)).

Examples

logit(0.1)
logit(0.5)
logistic(0)
logistic(logit(0.25))
logit(logistic(2))
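
The two formulas can be checked against base R, where qlogis and plogis compute the same transformations. The names logitSketch and logisticSketch are hypothetical stand-alone definitions written from the formulas above, not the package's source.

```r
# Stand-alone definitions following the formulas above (assumption: these
# mirror, but are not copied from, the package's implementation).
logitSketch <- function(p) log(p / (1 - p))
logisticSketch <- function(x) exp(x) / (1 + exp(x))

p <- c(0.1, 0.25, 0.5, 0.9)
x <- c(-2, 0, 2)

all.equal(logitSketch(p), qlogis(p))          # base R equivalent of the logit
all.equal(logisticSketch(x), plogis(x))       # base R equivalent of the logistic
all.equal(logisticSketch(logitSketch(p)), p)  # the two functions are inverses
```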

Calibration Belt Plot

Description

The plot method for calibration belt objects.

Usage

## S3 method for class 'givitiCalibrationBelt'
plot(x, xlim = c(0, 1), ylim = c(0, 1),
  colBis = "red", xlab = "e", ylab = "o",
  main = "GiViTI Calibration Belt", polynomialString = T,
  pvalueString = T, nString = T, table = T, tableStrings = NULL,
  unableToFitString = NULL, ...)

Arguments

x

A givitiCalibrationBelt object, to be generated with the function givitiCalibrationBelt.

xlim, ylim

Numeric vectors of length 2, giving the x and y coordinates ranges. Default values are c(0,1).

colBis

The color to be used for the bisector. The default value is red.

xlab, ylab

Titles for the x and y axes. Default values are "e" and "o", respectively.

main

The main title of the plot. The default value is "GiViTI Calibration Belt".

polynomialString

If the value is FALSE, the degree of the polynomial is not printed on the graphical area. If the value is TRUE, the degree m is reported. If a string is passed to this argument, the string is reported instead of the text "Polynomial degree". The default value is TRUE.

pvalueString

If the value is FALSE, the p-value of the test is not printed on the graphical area. If the value is TRUE, the p-value is reported. If a string is passed to this argument, the string is reported instead of the text "p-value". The default value is TRUE.

nString

If the value is FALSE, the sample size is not printed on the graphical area. If the value is TRUE, the sample size is reported. If a string is passed to this argument, the string is reported instead of the text "n". The default value is TRUE.

table

A boolean value indicating whether the table reporting the intersections of the calibration belt with the bisector should be printed on the plot.

tableStrings

Optional. A list with four character elements named overBisString, underBisString, confLevelString and neverString. The four strings of the list are printed instead of the texts "Over the bisector"/"Under the bisector"/"Confidence level"/"NEVER" in the table reporting the intersections of the calibration belt with the bisector.

unableToFitString

Optional. If a string is passed to this argument, this string is reported in the plot area when the dataset is not compatible with the fit of the calibration belt (e.g. data separation or no positive events). By default, in such cases the text "Unable to fit the Calibration Belt" is reported.

...

Other graphical parameters passed to the generic plot method.

Value

The function generates the calibration belt plot. In addition, a list containing the following components is returned:

p.value

The p-value of the test.

m

The degree of the polynomial at the end of the forward selection.

See Also

givitiCalibrationBelt to compute the calibration belt and givitiCalibrationTest to perform the associated calibration test.

Examples

# Randomly generated model, well calibrated by construction
e <- runif(100)
o <- rbinom(100, size = 1, prob = e)
cb <- givitiCalibrationBelt(o, e, "external")
plot(cb)

# Randomly generated model, poorly calibrated by construction
e <- runif(100)
o <- rbinom(100, size = 1, prob = logistic(logit(e)+2))
cb <- givitiCalibrationBelt(o, e, "external")
plot(cb)

Forward Selection in Polynomial Logistic Regression

Description

polynomialLogRegrFw implements a forward selection in a polynomial logistic regression model.

Usage

polynomialLogRegrFw(data, thres, maxDeg, startDeg)

Arguments

data

A data.frame object with the numeric variables "o", "e" and "logite", representing the binary outcomes, the probabilities of the model under evaluation and the logit of the probabilities, respectively. The variable "e" must contain values between 0 and 1. The variable "o" must assume only the values 0 and 1.

thres

A numeric scalar between 0 and 1 representing 1 - the significance level adopted in the forward selection.

maxDeg

The maximum degree considered in the forward selection.

startDeg

The starting degree in the forward selection.

Value

A list containing the following components:

fit

An object of class glm containing the output of the fit of the logistic regression model at the end of the iterative forward selection.

m

The degree of the polynomial at the end of the forward selection.

Examples

e <- runif(100)
logite <- logit(e)
o <- rbinom(100, size = 1, prob = e)
data <- data.frame(e = e, o = o, logite = logite)
polynomialLogRegrFw(data, .95, 4, 1)
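
One way to picture the forward selection is the base R sketch below: starting from degree startDeg, a term of the next degree in logite is added as long as it improves the fit significantly, up to maxDeg. forwardSelectSketch is hypothetical, and the use of a likelihood-ratio test with one degree of freedom at each step is an assumption about the procedure, not a statement of the package's implementation.

```r
set.seed(42)

# Simulated data with a quadratic miscalibration on the logit scale.
e <- runif(500, min = 0.05, max = 0.95)
logite <- qlogis(e)
o <- rbinom(500, size = 1, prob = plogis(logite + 0.5 * logite^2))
data <- data.frame(o = o, e = e, logite = logite)

# Hypothetical forward selection; not givitiR code.
forwardSelectSketch <- function(data, thres, maxDeg, startDeg) {
  fitDegree <- function(d) {
    glm(o ~ poly(logite, degree = d, raw = TRUE),
        family = binomial, data = data)
  }
  m <- startDeg
  fit <- fitDegree(m)
  while (m < maxDeg) {
    fitNext <- fitDegree(m + 1)
    # Likelihood-ratio statistic for the extra term (1 degree of freedom).
    lrStat <- deviance(fit) - deviance(fitNext)
    # Stop when the extra term is not significant at level 1 - thres.
    if (lrStat < qchisq(thres, df = 1)) break
    m <- m + 1
    fit <- fitNext
  }
  list(fit = fit, m = m)
}

res <- forwardSelectSketch(data, thres = 0.95, maxDeg = 4, startDeg = 1)
res$m
```

The returned list has the same two components as documented above: the final glm fit and the selected degree m.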