Simulate a binary outcome with either a quadratic relationship or interaction
Source: R/utils.R
sim_dat.Rd
Simulates data with a binary outcome from one of two models: a single predictor variable that has a quadratic relationship with logit(p), or two predictors that interact (see References for examples).
Arguments
- N
number of observations to simulate
- a1
value of the intercept term (in logits). This must be provided along with either a2 or a3.
- a2
value of the quadratic coefficient. If specified the linear predictor is simulated as follows:
LP <- a1 + x1 + a2*x1^2
where x1 is sampled from a standard normal distribution.
- a3
value of the interaction coefficient. If specified the linear predictor is simulated as follows:
LP <- a1 + x1 + x2 + x1*x2*a3
where x1 and x2 are sampled from independent standard normal distributions.
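The two linear predictors described above can be sketched in a few lines of base R. This is a minimal illustration, not the package source: the name sim_sketch is hypothetical, the check that exactly one of a2 or a3 is supplied is a choice made for this sketch, and drawing y as a Bernoulli variable with probability plogis(LP) is an assumption based on the description of LP as logit(p).

```r
# Hypothetical re-implementation for illustration only; not the package code.
sim_sketch <- function(N, a1, a2 = NULL, a3 = NULL) {
  # Sketch-level choice: require exactly one of a2 (quadratic) or a3 (interaction)
  if (is.null(a2) == is.null(a3)) {
    stop("provide exactly one of a2 or a3")
  }
  x1 <- rnorm(N)
  if (!is.null(a2)) {
    # quadratic relationship between x1 and logit(p)
    LP <- a1 + x1 + a2 * x1^2
    d <- data.frame(x1 = x1)
  } else {
    # two independent standard normal predictors that interact
    x2 <- rnorm(N)
    LP <- a1 + x1 + x2 + x1 * x2 * a3
    d <- data.frame(x1 = x1, x2 = x2)
  }
  # Assumption: the outcome is Bernoulli with probability plogis(LP)
  d$y <- rbinom(N, size = 1, prob = plogis(LP))
  d$LP <- LP
  d
}

set.seed(1)
d_int  <- sim_sketch(N = 500, a1 = 0.5, a3 = 0.2)  # interaction model
d_quad <- sim_sketch(N = 500, a1 = 0.5, a2 = 0.2)  # quadratic model
```

Under these assumptions, the interaction call returns columns x1, x2, y and LP, the same layout as the sim_dat output shown in the Examples, while the quadratic call of the sketch returns x1, y and LP.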
References
Austin, P. C., & Steyerberg, E. W. (2019). The Integrated Calibration Index (ICI) and related metrics for quantifying the calibration of logistic regression models. Statistics in Medicine, 38(21), 4051-4065.
Rhodes, S. (2022, November 4). Using restricted cubic splines to assess the calibration of clinical prediction models: Logit transform predicted probabilities first. https://doi.org/10.31219/osf.io/4n86q
Examples
library(pmcalibration)
# simulate some data with a binary outcome
n <- 500
dat <- sim_dat(N = n, a1 = .5, a3 = .2)
head(dat) # LP = linear predictor
#> x1 x2 y LP
#> 1 -0.5381156 1.4852597 1 1.2872958
#> 2 0.9606686 0.6594172 1 2.2467820
#> 3 0.9887032 -1.7521091 0 -0.6098691
#> 4 0.2356989 -0.1821829 1 0.5449279
#> 5 -1.4456681 0.3269426 0 -0.7132556
#> 6 -0.3154586 0.3461028 1 0.5088080