Skip to contents

This function computes estimators of the Expected Prognostic Observed Cross-Entropy (EPOCE) for evaluating the predictive accuracy of joint latent class models estimated using Jointlcmm. On the same data as used for estimation of the Jointlcmm object, this function computes both the Mean Prognostic Observed Log-Likelihood (MPOL) and the Cross-Validated Observed Log-Likelihood (CVPOL), two estimators of EPOCE. The latter corrects the MPOL estimate for over-optimism by approximated cross-validation. On external data, this function only computes the Mean Prognostic Observed Log-Likelihood (MPOL).

Usage

epoce(
  model,
  pred.times,
  var.time,
  fun.time = identity,
  newdata = NULL,
  subset = NULL,
  na.action = 1
)

Arguments

model

an object inheriting from class Jointlcmm

pred.times

Vector of times of prediction, from which predictive accuracy is evaluated (only subjects still at risk at the time of prediction are included in the computation, and only information before the time of prediction is considered.

var.time

Name of the variable indicating time in the dataset

fun.time

an optional function. This is only required if the time scales in the longitudinal part of the model and the survival part are different. In that case, fun.time is the function that translates the times from the longitudinal part into the time scale of the survival part. The default is the identity function which means that the two time scales are the same.

newdata

optional. When missing, the data used for estimating the Jointlcmm object are used, and CVPOL and MPOL are computed (internal validation). When newdata is specified, only MPOL is computed on this newdataset (external validation).

subset

a specification of the rows to be used: defaults to all rows. This can be any valid indexing vector for the rows of data or if that is not supplied, a data frame made up of the variable used in formula.

na.action

Integer indicating how NAs are managed. The default is 1 for 'na.omit'. The alternative is 2 for 'na.fail'. Other options such as 'na.pass' or 'na.exclude' are not implemented in the current version.

Value

call.Jointlcmm

the Jointlcmm call

call.epoce

the matched call

EPOCE

Dataframe containing, for each prediction time s, the number of subjects still at risk at s (and with at least one measure before s), the number of events after time s, the MPOL, and the CVPOL when computation is done on the dataset used for Jointlcmm estimation

IndivContrib

Individual contributions to the prognostic observed log-likelihood at each time of prediction. Used for computing tracking intervals of EPOCE differences between models.

new.data

a boolean for internal use only, which is FALSE if computation is done on the same data as for Jointlcmm estimation, and TRUE otherwise.

Details

This function does not apply for the moment with multiple causes of event (competing risks).

EPOCE assesses the prognostic information of a joint latent class model. It relies on information theory.

MPOL computed at time s equals minus the mean individual contribution to the conditional log-likelihood of the time to event given the longitudinal data up to the time of prediction s and given the subject is still at risk of event in s.

CVPOL computed at time s equals MPOL at time s plus a penalty term that corrects for over-optimism when computing predictive accuracy measures on the same dataset as used for estimation. This penalty term is computed from the inverse of the Hessian of the joint log-likelihood and the product of the gradients of the contributions to respectively the joint log-likelihood and the conditional log-likelihood.

The theory of EPOCE and its estimators MPOL and CVPOL is given in Commenges et al. (2012), and further detailed and illustrated for joint models in Proust-Lima et al. (2013).

References

Commenges, Liquet and Proust-Lima (2012). Choice of prognostic estimators in joint models by estimating differences of expected conditional Kullback-Leibler risks. Biometrics 68(2), 380-7.

Proust-Lima, Sene, Taylor and Jacqmin-Gadda (2014). Joint latent class models of longitudinal and time-to-event data: a review. Statistical Methods in Medical Research 23, 74-90.

Author

Cecile Proust-Lima and Amadou Diakite

Examples


# \dontrun{
## estimation of a joint latent class model with 2 latent classes (ng=2)
# (see the example section of Jointlcmm for details about
#  the model specification)

m <- Jointlcmm(fixed= Ydep1~Time*X1,random=~Time,mixture=~Time,subject='ID'
,survival = Surv(Tevent,Event)~ X1+X2 ,hazard="Weibull"
,hazardtype="PH",ng=2,data=data_lcmm,logscale=TRUE,
B=c(0.7608, -9.4974 , 1.0242,  1.4331 , 0.1063 , 0.6714, 10.4679, 11.3178,
 -2.5671, -0.5386,  1.4616, -0.0605,  0.9489,  0.1020 , 0.2079,  1.5045))
summary(m)
#> Joint latent class model for quantitative outcome and competing risks 
#>      fitted by maximum likelihood method 
#>  
#> Jointlcmm(fixed = Ydep1 ~ Time * X1, mixture = ~Time, random = ~Time, 
#>     subject = "ID", ng = 2, survival = Surv(Tevent, Event) ~ 
#>         X1 + X2, hazard = "Weibull", hazardtype = "PH", data = data_lcmm, 
#>     logscale = TRUE)
#>  
#> Statistical Model: 
#>      Dataset: data_lcmm 
#>      Number of subjects: 300 
#>      Number of observations: 1678 
#>      Number of latent classes: 2 
#>      Number of parameters: 16  
#>      Event 1: 
#>         Number of events: 150
#>         Proportional hazards over latent classes and 
#>         Weibull baseline risk function 
#>  
#> Iteration process: 
#>      Convergence criteria satisfied 
#>      Number of iterations:  1 
#>      Convergence criteria: parameters= 1.3e-08 
#>                          : likelihood= 3.7e-06 
#>                          : second derivatives= 2.2e-09 
#>  
#> Goodness-of-fit statistics: 
#>      maximum log-likelihood: -3929.65  
#>      AIC: 7891.31  
#>      BIC: 7950.57  
#>      Score test statistic for CI assumption: 14.357 (p-value=8e-04) 
#>  
#> Maximum Likelihood Estimates: 
#>  
#> Fixed effects in the class-membership model:
#> (the class of reference is the last class) 
#> 
#>                      coef      Se    Wald p-value
#> intercept class1  0.76081 0.26309   2.892 0.00383
#> 
#> Parameters in the proportional hazard model:
#> 
#>                          coef      Se    Wald p-value
#> event1 log(Weibull1) -9.49740 0.84000 -11.306 0.00000
#> event1 log(Weibull2)  1.02423 0.07062  14.504 0.00000
#> event1 SurvPH class1  1.43309 0.45106   3.177 0.00149
#> X1                    0.10626 0.16968   0.626 0.53114
#> X2                    0.67140 0.18118   3.706 0.00021
#> 
#> Fixed effects in the longitudinal model:
#> 
#>                      coef      Se    Wald p-value
#> intercept class1 10.46787 0.20070  52.157 0.00000
#> intercept class2 11.31778 0.23454  48.255 0.00000
#> Time class1      -2.56706 0.19549 -13.132 0.00000
#> Time class2      -0.53855 0.17704  -3.042 0.00235
#> X1                1.46157 0.22160   6.595 0.00000
#> Time:X1          -0.06049 0.20112  -0.301 0.76358
#> 
#> 
#> Variance-covariance matrix of the random-effects:
#>           intercept    Time
#> intercept   0.94894        
#> Time        0.10204 0.20786
#> 
#>                             coef      Se
#> Residual standard error  1.50448 0.03003
#> 

## Computation of the EPOCE on the same dataset as used for
# estimation of m with times at predictions from 1 to 15 
VecTime <- c(1,3,5,7,9,11,13,15)
cvpl <- epoce(m,var.time="Time",pred.times=VecTime)
#> Be patient, epoce function is running ... 
#> The program took 0.46 seconds 
summary(cvpl)
#> Expected Prognostic Observed Cross-Entropy (EPOCE) of the joint latent class model: 
#>  
#> Jointlcmm(fixed = Ydep1 ~ Time * X1, mixture = ~Time, random = ~Time, 
#>     subject = "ID", ng = 2, survival = Surv(Tevent, Event) ~ 
#>         X1 + X2, hazard = "Weibull", hazardtype = "PH", data = data_lcmm, 
#>     logscale = TRUE)
#>  
#> EPOCE estimators on data used for estimation: 
#>     Mean Prognostic Observed Log-likelihood (MPOL) 
#>     and Cross-validated Prognostic Observed Log-likelihood (CVPOL) 
#>     (CVPOL is the bias-corrected MPOL obtained by approximated cross-validation) 
#>  
#>   pred. times  N at risk  N events     MPOL    CVPOL
#>             1        300       150 1.869272 1.886070
#>             3        299       150 1.840027 1.860890
#>             5        291       149 1.853149 1.874388
#>             7        258       127 1.735359 1.756333
#>             9        205       107 1.773111 1.795867
#>            11        158        81 1.672144 1.695271
#>            13        129        68 1.628349 1.653144
#>            15         99        49 1.463446 1.487694
#>  
plot(cvpl,bty="l",ylim=c(0,2))

# }