
Performs internal validation of a prediction model development procedure via bootstrapping or cross-validation. Many model types are supported via the insight and marginaleffects packages, or users can supply user-defined functions that implement the model development procedure and produce predictions. Bias-corrected scores and estimates of optimism (where applicable) are provided.

Usage

validate(
  fit,
  method = c("boot_optimism", "boot_simple", ".632", "cv_optimism", "cv_average"),
  data,
  outcome,
  model_fun,
  pred_fun,
  score_fun,
  B,
  ...
)

Arguments

fit

a model object. If fit is given, the insight package is used to extract the data, outcome, and original model call. It is therefore important that fit be supported by insight and that it implement the entire model development process (see Harrell 2015). A fit supplied after variable selection by some other method will not give accurate bias correction. Model predictions are obtained via marginaleffects::get_predict with type = "response", so fit should be compatible with this function. If fit is provided, the arguments data, outcome, model_fun, and pred_fun are all ignored.

method

bias-correction method. Valid options are "boot_optimism", "boot_simple", ".632", "cv_optimism", or "cv_average". See details.

data

a data.frame containing the data used to fit the development model

outcome

character denoting the column name of the outcome in data

model_fun

for models that cannot be supplied via fit, this should be a function that takes one named argument, 'data' (the function should include ... among its arguments). This function should implement the entire model development procedure (hyperparameter tuning, variable selection, imputation, etc.) and return an object that can be used by pred_fun. Additional arguments can be supplied via ... (see the sketch following these argument descriptions).

pred_fun

for models that cannot be supplied via fit, this should be a function that takes two named arguments, 'model' and 'data' (the function should include ... among its arguments). 'model' is an object returned by model_fun. The function should return a vector of predicted risk probabilities with the same length as the number of rows in data. Additional arguments can be supplied via ...

score_fun

function used to produce performance measures from predicted risks and the observed binary outcome. Should take two named arguments, 'y' and 'p' (the function should include ... among its arguments), and return a named vector of scores. If unspecified, score_binary is used, which should be suitable for most purposes.

B

number of bootstrap replicates or cross-validation folds. If unspecified, B is set to 200 for method = "boot_*" or ".632", and to 10 for method = "cv_*".

...

additional arguments for user-defined functions. Arguments for producing calibration curves can be set via 'calib_args', which should be a named list (see cal_plot and score_binary). For method = "boot_optimism", "boot_simple", or ".632", users can specify a cores argument (e.g., cores = 4) to run bootstrap samples in parallel.
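
As a rough illustration of how model_fun, pred_fun, and score_fun fit together, below is a minimal sketch for a logistic regression. The data frame dat, its columns (binary outcome "y", predictors x1 and x2), the Brier score, and the cores value are assumptions for illustration only, not part of the package API beyond the argument names documented above.

# sketch: user-defined functions passed to validate()
lr_fit <- function(data, ...) {
  glm(y ~ x1 + x2, data = data, family = "binomial")
}

lr_pred <- function(model, data, ...) {
  # return predicted risks, one per row of data
  predict(model, newdata = data, type = "response")
}

lr_score <- function(y, p, ...) {
  # return a named vector of performance measures
  c(Brier = mean((y - p)^2))
}

val <- validate(data = dat, outcome = "y",
                model_fun = lr_fit, pred_fun = lr_pred,
                score_fun = lr_score, method = "boot_optimism",
                B = 200, cores = 2)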

Value

an object of class internal_validate containing apparent and bias-corrected estimates of performance scores. If method = "boot_*" it also contains results pertaining to stability of predictions across bootstrapped models (see Riley and Collins, 2023).

Details

Internal validation can provide bias-corrected estimates of performance (e.g., C-statistic/AUC) for a model development procedure (i.e., expected performance if the same procedure were applied to another sample of the same size from the same population; see references). There are several approaches to producing bias-corrected estimates (see below). It is important that the fit or model_fun provided implement the entire model development procedure, including any hyperparameter tuning and/or variable selection.

Note that validate does very little to check for missing values. If fit is supplied, insight::get_data will extract the data used to fit the model, and usually this will result in complete cases being used. User-defined model and predict functions can be specified to handle missing values among predictor variables. Currently, any user-supplied data will have rows with missing outcome values removed.
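
For example, a minimal sketch of a model_fun that imputes missing predictor values before fitting; median imputation and the outcome name "y" are illustrative only, and a matching pred_fun would need to apply the same imputation to new data.

# sketch: model_fun handling missing predictor values via median imputation
impute_fit <- function(data, ...) {
  for (v in setdiff(names(data), "y")) {
    data[[v]][is.na(data[[v]])] <- median(data[[v]], na.rm = TRUE)
  }
  glm(y ~ ., data = data, family = "binomial")
}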

method

boot_optimism

(default) estimates optimism for each score and subtracts it from the apparent score (the score calculated with the original/development model evaluated on the original sample). A new model is fit via the same procedure on each bootstrap resample. Scores are calculated when applying the boot model to the boot sample (\(S_{boot}\)) and to the original sample (\(S_{orig}\)), and the difference gives an estimate of optimism for a given resample (\(S_{boot} - S_{orig}\)). The average optimism across the B resamples is subtracted from the apparent score to produce the bias-corrected score.
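
The calculation can be sketched in a few lines of base R, assuming a data.frame dat with binary outcome y and predictors x1 and x2, and using the Brier score as the only performance measure:

# sketch of the optimism bootstrap for a single score (Brier)
brier <- function(y, p) mean((y - p)^2)

fit_app <- glm(y ~ x1 + x2, data = dat, family = "binomial")
S_app <- brier(dat$y, predict(fit_app, type = "response"))   # apparent score

opt <- replicate(200, {
  boot <- dat[sample(nrow(dat), replace = TRUE), ]
  fit_b <- glm(y ~ x1 + x2, data = boot, family = "binomial")
  S_boot <- brier(boot$y, predict(fit_b, type = "response"))
  S_orig <- brier(dat$y, predict(fit_b, newdata = dat, type = "response"))
  S_boot - S_orig   # optimism for this resample
})

S_app - mean(opt)   # bias-corrected Brier score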

boot_simple

implements the simple bootstrap. B bootstrap models are fit and evaluated on the original data. The average score across the B replicates is the bias-corrected score.

.632

implements Harrell's adaptation of Efron's .632 estimator for binary outcomes (see rms::predab.resample and rms::validate). In this case the estimate of optimism is \(0.632 \times (S_{app} - \mathrm{mean}(S_{omit} \times w))\), where \(S_{app}\) is the apparent performance score, \(S_{omit}\) is the score estimated using the bootstrap model evaluated on the out-of-sample observations, and \(w\) is a weight based on the proportion of observations omitted (see Harrell 2015, p. 115).

cv_optimism

estimates optimism via B-fold cross-validation. Optimism is the average difference in the performance measure between predictions made on the training data and on the test (held-out fold) data. This is the approach implemented in rms::validate with method="crossvalidation".
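
A minimal sketch of the same idea, reusing brier(), S_app, and the data assumptions from the boot_optimism sketch above:

# sketch of B-fold cross-validation optimism (B = 10 folds)
folds <- sample(rep(1:10, length.out = nrow(dat)))
opt_cv <- sapply(1:10, function(k) {
  train <- dat[folds != k, ]
  test  <- dat[folds == k, ]
  fit_k <- glm(y ~ x1 + x2, data = train, family = "binomial")
  S_train <- brier(train$y, predict(fit_k, type = "response"))
  S_test  <- brier(test$y, predict(fit_k, newdata = test, type = "response"))
  S_train - S_test   # optimism for this fold
})
S_app - mean(opt_cv)   # bias-corrected Brier score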

cv_average

bias-corrected scores are the average of the scores obtained by evaluating the model developed on each training fold on the corresponding test/held-out data. This approach is described and compared with "boot_optimism" and ".632" in Steyerberg et al. (2001).

References

Steyerberg, E. W., Harrell Jr, F. E., Borsboom, G. J., Eijkemans, M. J. C., Vergouwe, Y., & Habbema, J. D. F. (2001). Internal validation of predictive models: efficiency of some procedures for logistic regression analysis. Journal of clinical epidemiology, 54(8), 774-781.

Harrell Jr F. E. (2015). Regression Modeling Strategies: with applications to linear models, logistic and ordinal regression, and survival analysis. New York: Springer Science, LLC.

Efron, B. (1983). Estimating the error rate of a prediction rule: improvement on cross-validation. Journal of the American Statistical Association, 78(382), 316-331.

Van Calster, B., Steyerberg, E. W., Wynants, L., and van Smeden, M. (2023). There is no such thing as a validated prediction model. BMC medicine, 21(1), 70.

Riley RD, Collins GS. (2023). Stability of clinical prediction models developed using statistical or machine learning methods. Biom J. doi:10.1002/bimj.202200302. Epub ahead of print.

Examples

library(pminternal)
set.seed(456)
# simulate data with two predictors that interact
dat <- pmcalibration::sim_dat(N = 2000, a1 = -2, a3 = -.3)
mean(dat$y)
#> [1] 0.1985
dat$LP <- NULL # remove linear predictor

# fit a (misspecified) logistic regression model
m1 <- glm(y ~ ., data=dat, family="binomial")

# internal validation of m1 via bootstrap optimism with 10 resamples
# B = 10 for example but should be >= 200 in practice
m1_iv <- validate(m1, method="boot_optimism", B=10)
#> It is recommended that B >= 200 for bootstrap validation
m1_iv
#>                C   Brier Intercept   Slope   Eavg    E50    E90   Emax    ECI
#> Apparent  0.7779  0.1335     0.000 1.00000 0.0076 0.0064 0.0115  0.058  0.011
#> Optimism  0.0016 -0.0011    -0.019 0.00083 0.0052 0.0038 0.0088  0.078  0.037
#> Corrected 0.7764  0.1346     0.019 0.99917 0.0024 0.0026 0.0027 -0.020 -0.026