Estimators of the Expected Prognostic Observed Cross-Entropy (EPOCE) for
evaluating predictive accuracy of joint latent class models estimated using
Jointlcmm
Source: R/epoce.R
epoce.Rd
This function computes estimators of the Expected Prognostic Observed
Cross-Entropy (EPOCE) for evaluating the predictive accuracy of joint latent
class models estimated using Jointlcmm
. On the same data as used for
estimation of the Jointlcmm
object, this function computes both the
Mean Prognostic Observed Log-Likelihood (MPOL) and the Cross-Validated
Observed Log-Likelihood (CVPOL), two estimators of EPOCE. The latter
corrects the MPOL estimate for over-optimism by approximated
cross-validation. On external data, this function only computes the Mean
Prognostic Observed Log-Likelihood (MPOL).
Usage
epoce(
model,
pred.times,
var.time,
fun.time = identity,
newdata = NULL,
subset = NULL,
na.action = 1
)
Arguments
- model
an object inheriting from class
Jointlcmm
- pred.times
Vector of times of prediction, from which predictive accuracy is evaluated (only subjects still at risk at the time of prediction are included in the computation, and only information before the time of prediction is considered.
- var.time
Name of the variable indicating time in the dataset
- fun.time
an optional function. This is only required if the time scales in the longitudinal part of the model and the survival part are different. In that case,
fun.time
is the function that translates the times from the longitudinal part into the time scale of the survival part. The default is the identity function which means that the two time scales are the same.- newdata
optional. When missing, the data used for estimating the
Jointlcmm
object are used, and CVPOL and MPOL are computed (internal validation). When newdata is specified, only MPOL is computed on this newdataset (external validation).- subset
a specification of the rows to be used: defaults to all rows. This can be any valid indexing vector for the rows of data or if that is not supplied, a data frame made up of the variable used in formula.
- na.action
Integer indicating how NAs are managed. The default is 1 for 'na.omit'. The alternative is 2 for 'na.fail'. Other options such as 'na.pass' or 'na.exclude' are not implemented in the current version.
Value
- call.Jointlcmm
the
Jointlcmm
call- call.epoce
the matched call
- EPOCE
Dataframe containing, for each prediction time s, the number of subjects still at risk at s (and with at least one measure before s), the number of events after time s, the MPOL, and the CVPOL when computation is done on the dataset used for
Jointlcmm
estimation- IndivContrib
Individual contributions to the prognostic observed log-likelihood at each time of prediction. Used for computing tracking intervals of EPOCE differences between models.
- new.data
a boolean for internal use only, which is FALSE if computation is done on the same data as for
Jointlcmm
estimation, and TRUE otherwise.
Details
This function does not apply for the moment with multiple causes of event (competing risks).
EPOCE assesses the prognostic information of a joint latent class model. It relies on information theory.
MPOL computed at time s equals minus the mean individual contribution to the conditional log-likelihood of the time to event given the longitudinal data up to the time of prediction s and given the subject is still at risk of event in s.
CVPOL computed at time s equals MPOL at time s plus a penalty term that corrects for over-optimism when computing predictive accuracy measures on the same dataset as used for estimation. This penalty term is computed from the inverse of the Hessian of the joint log-likelihood and the product of the gradients of the contributions to respectively the joint log-likelihood and the conditional log-likelihood.
The theory of EPOCE and its estimators MPOL and CVPOL is given in Commenges et al. (2012), and further detailed and illustrated for joint models in Proust-Lima et al. (2013).
References
Commenges, Liquet and Proust-Lima (2012). Choice of prognostic estimators in joint models by estimating differences of expected conditional Kullback-Leibler risks. Biometrics 68(2), 380-7.
Proust-Lima, Sene, Taylor and Jacqmin-Gadda (2014). Joint latent class models of longitudinal and time-to-event data: a review. Statistical Methods in Medical Research 23, 74-90.
Examples
# \dontrun{
## estimation of a joint latent class model with 2 latent classes (ng=2)
# (see the example section of Jointlcmm for details about
# the model specification)
m <- Jointlcmm(fixed= Ydep1~Time*X1,random=~Time,mixture=~Time,subject='ID'
,survival = Surv(Tevent,Event)~ X1+X2 ,hazard="Weibull"
,hazardtype="PH",ng=2,data=data_lcmm,logscale=TRUE,
B=c(0.7608, -9.4974 , 1.0242, 1.4331 , 0.1063 , 0.6714, 10.4679, 11.3178,
-2.5671, -0.5386, 1.4616, -0.0605, 0.9489, 0.1020 , 0.2079, 1.5045))
summary(m)
#> Joint latent class model for quantitative outcome and competing risks
#> fitted by maximum likelihood method
#>
#> Jointlcmm(fixed = Ydep1 ~ Time * X1, mixture = ~Time, random = ~Time,
#> subject = "ID", ng = 2, survival = Surv(Tevent, Event) ~
#> X1 + X2, hazard = "Weibull", hazardtype = "PH", data = data_lcmm,
#> logscale = TRUE)
#>
#> Statistical Model:
#> Dataset: data_lcmm
#> Number of subjects: 300
#> Number of observations: 1678
#> Number of latent classes: 2
#> Number of parameters: 16
#> Event 1:
#> Number of events: 150
#> Proportional hazards over latent classes and
#> Weibull baseline risk function
#>
#> Iteration process:
#> Convergence criteria satisfied
#> Number of iterations: 1
#> Convergence criteria: parameters= 1.3e-08
#> : likelihood= 3.7e-06
#> : second derivatives= 2.2e-09
#>
#> Goodness-of-fit statistics:
#> maximum log-likelihood: -3929.65
#> AIC: 7891.31
#> BIC: 7950.57
#> Score test statistic for CI assumption: 14.357 (p-value=8e-04)
#>
#> Maximum Likelihood Estimates:
#>
#> Fixed effects in the class-membership model:
#> (the class of reference is the last class)
#>
#> coef Se Wald p-value
#> intercept class1 0.76081 0.26309 2.892 0.00383
#>
#> Parameters in the proportional hazard model:
#>
#> coef Se Wald p-value
#> event1 log(Weibull1) -9.49740 0.84000 -11.306 0.00000
#> event1 log(Weibull2) 1.02423 0.07062 14.504 0.00000
#> event1 SurvPH class1 1.43309 0.45106 3.177 0.00149
#> X1 0.10626 0.16968 0.626 0.53114
#> X2 0.67140 0.18118 3.706 0.00021
#>
#> Fixed effects in the longitudinal model:
#>
#> coef Se Wald p-value
#> intercept class1 10.46787 0.20070 52.157 0.00000
#> intercept class2 11.31778 0.23454 48.255 0.00000
#> Time class1 -2.56706 0.19549 -13.132 0.00000
#> Time class2 -0.53855 0.17704 -3.042 0.00235
#> X1 1.46157 0.22160 6.595 0.00000
#> Time:X1 -0.06049 0.20112 -0.301 0.76358
#>
#>
#> Variance-covariance matrix of the random-effects:
#> intercept Time
#> intercept 0.94894
#> Time 0.10204 0.20786
#>
#> coef Se
#> Residual standard error 1.50448 0.03003
#>
## Computation of the EPOCE on the same dataset as used for
# estimation of m with times at predictions from 1 to 15
VecTime <- c(1,3,5,7,9,11,13,15)
cvpl <- epoce(m,var.time="Time",pred.times=VecTime)
#> Be patient, epoce function is running ...
#> The program took 0.46 seconds
summary(cvpl)
#> Expected Prognostic Observed Cross-Entropy (EPOCE) of the joint latent class model:
#>
#> Jointlcmm(fixed = Ydep1 ~ Time * X1, mixture = ~Time, random = ~Time,
#> subject = "ID", ng = 2, survival = Surv(Tevent, Event) ~
#> X1 + X2, hazard = "Weibull", hazardtype = "PH", data = data_lcmm,
#> logscale = TRUE)
#>
#> EPOCE estimators on data used for estimation:
#> Mean Prognostic Observed Log-likelihood (MPOL)
#> and Cross-validated Prognostic Observed Log-likelihood (CVPOL)
#> (CVPOL is the bias-corrected MPOL obtained by approximated cross-validation)
#>
#> pred. times N at risk N events MPOL CVPOL
#> 1 300 150 1.869272 1.886070
#> 3 299 150 1.840027 1.860890
#> 5 291 149 1.853149 1.874388
#> 7 258 127 1.735359 1.756333
#> 9 205 107 1.773111 1.795867
#> 11 158 81 1.672144 1.695271
#> 13 129 68 1.628349 1.653144
#> 15 99 49 1.463446 1.487694
#>
plot(cvpl,bty="l",ylim=c(0,2))
# }