Survival Analysis
Survival analysis, also called event history analysis in social science, or reliability analysis in engineering, deals with the time until an event of interest occurs. However, this failure time may not be observed within the relevant time period, producing so-called censored observations.
This task view aims to present the R packages that are useful for the analysis of time-to-event data.
Please let the maintainers know if something is inaccurate or missing.
Estimation of the Survival Distribution
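As a minimal sketch of the estimator this section revolves around, the following R code computes a Kaplan-Meier curve with survfit from the survival package. The six observations are made up for illustration and are not taken from any package.

```r
library(survival)

# Toy data (hypothetical): follow-up times and event indicators
time   <- c(2, 3, 3, 5, 7, 8)
status <- c(1, 1, 0, 1, 0, 1)   # 1 = event observed, 0 = right-censored

# Surv() bundles times and censoring indicators into one response object
y <- Surv(time, status)

# Kaplan-Meier estimate for a single group
km <- survfit(y ~ 1)

# Survival estimates at each distinct time, with confidence limits
summary(km)
```

With these values the estimated survival probabilities are 5/6, 2/3 and 4/9 after the events at times 2, 3 and 5, and the curve drops to 0 at the last event time.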
The survfit function from the survival package computes the Kaplan-Meier estimator for truncated and/or censored data. rms (replacement of the Design package) proposes a modified version of the survfit function. The prodlim package implements a fast algorithm and some features not included in survival. Various confidence intervals and confidence bands for the Kaplan-Meier estimator are implemented in the km.ci package. plot.Surv of package eha plots the Kaplan-Meier estimator. The NADA package includes a function to compute the Kaplan-Meier estimator for left-censored data. svykm in survey provides a weighted Kaplan-Meier estimator. nested.km in NestedCohort estimates the survival curve for each level of categorical variables with missing data. The kaplan.meier function in spatstat computes the Kaplan-Meier estimator from histogram data. The MAMSE package computes a weighted Kaplan-Meier estimate. The KM function in package rhosp plots the survival function using a variant of the Kaplan-Meier estimator in a hospitalisation risk context.
The survPresmooth package computes presmoothed estimates of the main quantities used for right-censored data, i.e., survival, hazard and density functions. The asbio package computes the Kaplan-Meier estimator following Pollock et al. (1998). The bpcp package provides several functions for computing confidence intervals of the survival distribution (e.g., the beta product confidence procedure). The lbiassurv package offers various length-bias corrections to survival curve estimation. Non-parametric confidence bands for the Kaplan-Meier estimator can be computed using the kmconfband package. The kmc package implements the Kaplan-Meier estimator with constraints. The landest package allows landmark estimation and testing of survival probabilities. The jackknifeKME package computes the original and modified jackknife estimates of Kaplan-Meier estimators. The tranSurv package estimates a survival distribution in the presence of dependent left-truncation and right-censoring. The condSURV package provides methods for estimating the conditional survival function for ordered multivariate failure time data.
The gte package implements the generalised Turnbull estimator proposed by Dehghan and Duchesne for estimating the conditional survival function with interval-censored data. The icfit function in package interval computes the NPMLE for interval-censored data. The DTDA package implements several algorithms for analysing possibly doubly truncated survival data. npsurv computes the NPMLE of a survival function for general interval-censored data.
Hazard Estimation
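The function listed in this section derives a hazard from the Kaplan-Meier estimator. As a related but different illustration (a swapped-in technique, not the epiR method), the sketch below computes the Nelson-Aalen cumulative hazard by hand from a survfit object and compares it with the object's cumhaz component, which is assumed here to be available as in recent versions of survival. The data are made up.

```r
library(survival)

# Toy data (hypothetical): 1 = event observed, 0 = right-censored
time   <- c(2, 3, 3, 5, 7, 8)
status <- c(1, 1, 0, 1, 0, 1)

fit_na <- survfit(Surv(time, status) ~ 1)

# Nelson-Aalen estimate: cumulative sum of (events / number at risk)
na_manual <- cumsum(fit_na$n.event / fit_na$n.risk)

# Side-by-side view of the manual value and the stored component
data.frame(time = fit_na$time,
           na_manual = na_manual,
           cumhaz = fit_na$cumhaz)
```

The increments of the cumulative hazard (here 1/6, 1/5, 1/3, 0 and 1) are the discrete hazard contributions at each observed time.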
The epi.insthaz function from epiR computes the instantaneous hazard from the Kaplan-Meier estimator.
Testing
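As a minimal sketch of the kind of test covered in this section, the code below runs survdiff from survival on a made-up two-group data set. The data frame and its column names are hypothetical. In the G-rho family, rho = 0 gives the log-rank test and rho = 1 the Peto and Peto modification of the Gehan-Wilcoxon test.

```r
library(survival)

# Toy two-group data (hypothetical): 1 = event observed, 0 = right-censored
d <- data.frame(
  time   = c(1, 2, 3, 4, 5, 6, 7, 8),
  status = c(1, 1, 1, 0, 1, 1, 0, 1),
  group  = rep(c("A", "B"), each = 4)
)

# rho = 0 is the default and corresponds to the log-rank test
lr <- survdiff(Surv(time, status) ~ group, data = d, rho = 0)
print(lr)

# Observed and expected numbers of events per group
cbind(observed = lr$obs, expected = lr$exp)
```

For the log-rank test the observed and expected event counts sum to the same total across groups, which is a quick sanity check on the output.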
The survdiff function in survival compares survival curves using the Fleming-Harrington G-rho family of tests. NADA implements this class of tests for left-censored data. SurvTest in the coin package implements the logrank test reformulated as a linear rank test.
Regression Modelling
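As a minimal sketch of the basic workflow covered in this section, the code below fits a Cox model with coxph and checks the proportionality assumption with cox.zph. It assumes the lung data set that ships with survival; the choice of covariates (age and sex) is only for illustration.

```r
library(survival)

# `lung` ships with survival; its status column is coded 1 = censored, 2 = dead
fit <- coxph(Surv(time, status) ~ age + sex, data = lung)

# Coefficients are log hazard ratios; exp(coef) gives hazard ratios
summary(fit)

# Test of the proportional hazards assumption (per covariate and global),
# based on scaled Schoenfeld residuals
zph <- cox.zph(fit)
print(zph)
```

A small p-value in the cox.zph table suggests that the effect of the corresponding covariate changes over time, in which case one of the time-varying alternatives listed in this section may be more appropriate.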
The coxph function in the survival package fits the Cox model. cph in the rms package and the eha package propose some extensions to the coxph function. The package coxphf implements Firth's penalised maximum likelihood bias reduction method for the Cox model. An implementation of weighted estimation in Cox regression can be found in coxphw. The coxrobust package proposes a robust implementation of the Cox model. timecox in package timereg fits Cox models with possibly time-varying effects. The mfp package fits Cox models with multiple fractional polynomials. The NestedCohort package fits Cox models for covariates with missing data. A Cox model can be fitted to data from complex survey designs using the svycoxph function in survey. The multipleNCC package fits Cox models using a weighted partial likelihood for nested case-control studies. The MIICD package implements Pan's (2000) multiple imputation approach to Cox models for interval-censored data. The ICsurv package fits Cox models for interval-censored data through an EM algorithm. The dynsurv package fits time-varying coefficient models for interval-censored and right-censored survival data using a Bayesian Cox model, a spline-based Cox model or a transformation model. The CPHshape package computes the Cox proportional hazards model with shape-constrained hazard functions. The OrdFacReg package implements the Cox model using an active set algorithm for dummy variables of ordered factors. The survivalMPL package fits Cox models using maximum penalised likelihood and provides a non-parametric smooth estimate of the baseline hazard function. A Cox model with piecewise constant hazards can be fitted using the pch package. The isoph package allows nonparametric estimation of an isotonic covariate effect for proportional hazards models. The icenReg package implements several models for interval-censored data, e.g., Cox, proportional odds, and accelerated failure time models. A Cox-type self-exciting intensity model can be fitted to right-censored data using the coxsei package. The SurvLong package contains methods for estimation of proportional hazards models with intermittently observed longitudinal covariates. The plac package provides routines to fit the Cox model with left-truncated data using augmented information from the marginal of the truncation times.
The cumres function in gof computes goodness-of-fit methods for the Cox proportional hazards model. The proportionality assumption can be checked using the cox.zph function in survival. The CPE package calculates the concordance probability estimate for the Cox model, as does the coxphCPE function in clinfun. The coxphQuantile function in the latter package draws a quantile curve of the survival distribution as a function of covariates. The multcomp package computes simultaneous tests and confidence intervals for the Cox model and other parametric survival models. The lsmeans package can be used to obtain least-squares means (and contrasts thereof) from linear models. In particular, it provides support for the coxph, survreg and coxme functions. The multtest package on Bioconductor proposes resampling-based multiple hypothesis testing that can be applied to the Cox model. Testing coefficients of Cox regression models using a Wald test with a sandwich estimator of variance can be done using the saws package. The rankhazard package plots visualisations of the relative importance of covariates in a proportional hazards model. The smoothHR package provides hazard ratio curves that allow for a nonlinear relationship between predictor and survival. The paf package computes the unadjusted/adjusted attributable fraction function from a Cox proportional hazards model. The PHeval package proposes tools to check the proportional hazards assumption using a standardised score process. The ELYP package implements empirical likelihood analysis for the Cox model and the Yang-Prentice (2005) model.
survreg (from survival) fits a parametric proportional hazards model. The eha and mixPHM packages implement a proportional hazards model with a parametric baseline hazard. The pphsm function in rms translates an AFT model to a proportional hazards form. The polspline package includes the hare function that fits a hazard regression model, using splines to model the baseline hazard. Hazards may be, but need not be, proportional. The flexsurv package implements the model of Royston and Parmar (2002). The model uses natural cubic splines for the baseline survival function, and proportional hazards, proportional odds or probit functions for regression. The SurvRegCensCov package allows estimation of a Weibull regression for a right-censored endpoint, one interval-censored covariate, and an arbitrary number of non-censored covariates.
The survreg function in package survival can fit an accelerated failure time (AFT) model. A modified version of survreg is implemented in the rms package (psm function). It gives access to some of the rms functionalities. The eha package also proposes an implementation of the AFT model (function aftreg). An AFT model with an error distribution assumed to be a mixture of G-splines is implemented in the smoothSurv package. The NADA package provides a front end to the survreg function for left-censored data. A least-squares-based implementation of the AFT model can be found in the lss package. The simexaft package implements the Simulation-Extrapolation algorithm for the AFT model, which can be used when covariates are subject to measurement error. A robust version of the accelerated failure time model can be found in RobustAFT. The coarseDataTools package fits AFT models for interval-censored data. The aftgee package implements both rank-based estimates and least-squares estimates (via generalised estimating equations) for the AFT model. An alternative weighting scheme for parameter estimation in the AFT model is proposed in the imputeYn package. The AdapEnetClass package implements elastic net regularisation for the AFT model.
survival and timereg fit the additive hazards model of Aalen in functions aareg and aalen, respectively. timereg also proposes an implementation of the Cox-Aalen model (which can also be used to perform the Lin, Wei and Ying (1994) goodness-of-fit for Cox regression models) and the partly parametric additive risk model of McKeague and Sasieni. A version of the Cox-Aalen model for interval-censored data is available in the coxinterval package. The uniah package fits shape-restricted additive hazards models. The addhazard package contains tools to fit additive hazards models to random sampling, two-phase sampling and two-phase sampling with auxiliary information.
The bj function in rms and BJnoint in emplik compute the Buckley-James model, though the latter does it without an intercept term. The bujar package fits the Buckley-James model with high-dimensional covariates (L2 boosting, regression trees and boosted MARS, elastic net).
survreg can fit other types of models depending on the chosen distribution, e.g., a tobit model. The AER package provides the tobit function, which is a wrapper of survreg to fit the tobit model. An implementation of the tobit model for cross-sectional data and panel data can be found in the censReg package. The timereg package provides implementations of the proportional odds model and of the proportional excess hazards model. The invGauss package fits the inverse Gaussian distribution to survival data. The model is based on describing time to event as the barrier hitting time of a Wiener process, where drift towards the barrier has been randomized with a Gaussian distribution. The pseudo package computes the pseudo-observations for modelling the survival function based on the Kaplan-Meier estimator and the restricted mean. The fastpseudo package does the same for the restricted mean survival time. flexsurv fits parametric time-to-event models, in which any parametric distribution can be used to model the survival probability, and where one of the parameters is a linear function of covariates. The Icens function in package Epi provides a multiplicative relative risk and an additive excess risk model for interval-censored data. The VGAM package can fit vector generalised linear and additive models for censored data. The gamlss.cens package implements the generalised additive model for location, scale and shape that can be fitted to censored data. The locfit.censor function in locfit produces local regression estimates. The crq function included in the quantreg package implements a conditional quantile regression model for censored data. The JM package fits shared parameter models for the joint modelling of a longitudinal response and event times. The temporal process regression model is implemented in the tpr package. Aster models, which combine aspects of generalized linear models and Cox models, are implemented in the aster and aster2 packages. The concreg package implements conditional logistic regression for survival data as an alternative to the Cox model when hazards are non-proportional. lava.tobit, an extension of the lava package, fits latent variable models for censored outcomes via a probit link formulation. The BGPhazard package implements Markov beta and gamma processes for modelling the hazard ratio for discrete failure time data. The surv2sampleComp package proposes some model-free contrast comparison measures such as the difference/ratio of cumulative hazards, quantiles and restricted mean. The rstpm2 package provides link-based survival models that extend the Royston-Parmar models, a family of flexible parametric models. The TransModel package implements a unified estimation procedure for the analysis of censored data using linear transformation models. The flexPM package fits a flexible parametric regression model to possibly right-censored, left-truncated data. The ICGOR package fits the generalized odds rate hazards model to interval-censored data, while GORCure fits the generalized odds rate mixture cure model to interval-censored data. The thregI package fits a threshold regression model for interval-censored data based on the first hitting time of a boundary by the sample path of a Wiener diffusion process. The miCoPTCM package fits semiparametric promotion time cure models with possibly mis-measured covariates. The intercure package implements semiparametric cure rate estimators for interval-censored data.
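Several of the parametric models listed in this section are fitted through survreg. The sketch below, which again assumes the lung data set from survival and an arbitrary choice of covariates, fits a Weibull accelerated failure time model and re-expresses its coefficients on the proportional hazards scale. It relies on the fact that, for the Weibull distribution only, the log hazard ratios equal minus the AFT coefficients divided by the scale parameter.

```r
library(survival)

# Weibull AFT model; `lung` status is coded 1 = censored, 2 = dead
aft <- survreg(Surv(time, status) ~ age + sex, data = lung, dist = "weibull")
summary(aft)

# AFT coefficients act on log(time): positive values mean longer survival
coef(aft)

# Weibull only: log hazard ratio = -(AFT coefficient) / scale
ph_coef <- -coef(aft)[c("age", "sex")] / aft$scale

# Hazard ratios implied by the Weibull fit
print(exp(ph_coef))
```

The implied hazard ratios can be compared with those from a Cox model on the same covariates as an informal check of the Weibull assumption.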
The smcure package fits semiparametric proportional hazards and accelerated failure time mixture cure models.
The coxph function from package survival can be used to fit a model for any transition of a multistate model. It can also be used for comparing two transition hazards, using the correspondence between multistate models and time-dependent covariates. Besides, all the regression methods presented above can be used for multistate models as long as they allow for left-truncation.
The survival package (via survfit) and prodlim can also be used to estimate the cumulative incidence function. The compeir package estimates event-specific incidence rates, rate ratios, event-specific incidence proportions and cumulative incidence functions. The NPMLEcmprsk package implements the semi-parametric mixture model for competing risks data. The MIICD package implements Pan's (2000) multiple imputation approach to the Fine and Gray model for interval-censored data. The crskdiag package provides graphical and analytical approaches for checking the assumptions of the Fine and Gray model. The CFC package performs Bayesian and non-Bayesian cause-specific competing risks analysis for parametric and non-parametric survival functions. The gcerisk package provides some methods for competing risks data. Estimation, testing and regression modelling of subdistribution functions in the competing risks setting using quantile regressions can be carried out with cmprskQR.
coxph from the survival package can be used to analyse recurrent event data. The cph function of the rms package fits the Andersen-Gill model for recurrent events, a model that can also be fitted with the frailtypack package. The latter also fits joint frailty models for joint modelling of recurrent events and a terminal event. The survrec package proposes implementations of several models for recurrent events data, such as the Peña-Strawderman-Hollander and Wang-Chang estimators, and MLE estimation under a Gamma frailty model. The condGEE package implements the conditional GEE for recurrent event gap times. The TestSurvRec package implements weighted logrank-type tests for recurrent events. The reda package provides functions to fit a gamma frailty model with either a piecewise constant or a spline baseline rate function for recurrent event data, as well as some miscellaneous functions for recurrent event data. Several regression models for recurrent event data are implemented in the reReg package.
In the relsurv package, rs.surv computes a relative survival curve. rsadd fits an additive model and rsmul fits the Cox model of Andersen et al. for relative survival, while rstrans fits a Cox model in transformed time.
Frailty terms can be added in the coxph and survreg functions in package survival. A mixed-effects Cox model is implemented in the coxme package. The two.stage function in the timereg package fits the Clayton-Oakes-Glidden model. The parfm package fits fully parametric frailty models via maximisation of the marginal likelihood. The frailtypack package fits proportional hazards models with a shared Gamma frailty to right-censored and/or left-truncated data using a penalised likelihood on the hazard function. The package also fits additive and nested frailty models that can be used for, e.g., meta-analysis and for hierarchically clustered data (with 2 levels of clustering), respectively. A proportional hazards model with mixed effects can be fitted using the phmm package. The lmec package fits a linear mixed-effects model for left-censored data. The Cox model using h-likelihood estimation for the frailty terms can be fitted using the frailtyHL package. The tlmec package implements a linear mixed-effects model for censored data with Student-t or normal distributions. The frailtySurv package simulates and fits semiparametric shared frailty models under a wide range of frailty distributions. The PenCoxFrail package provides a regularisation approach for Cox frailty models through penalisation. The mexhaz package enables modelling of the excess hazard regression model with time-dependent and/or non-linear effect(s) and a random effect defined at the cluster level.
Multivariate survival refers to the analysis of units with several related survival times, e.g., the survival of twins or of the members of a family. To analyse such data, we can estimate the joint distribution of the survival times.
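For clustered survival times of the kind discussed above, a minimal sketch with the survival package might look as follows. It assumes the kidney data set shipped with survival (two infection times per patient, identified by id); the covariates are chosen only for illustration. The first fit adds a shared frailty term to coxph, the second is a marginal model with a cluster-robust variance.

```r
library(survival)

# Shared frailty per patient (gamma distributed by default in frailty())
fit_frailty <- coxph(Surv(time, status) ~ age + sex + frailty(id),
                     data = kidney)
print(fit_frailty)

# Marginal alternative: same mean model, robust (sandwich) variance by cluster
fit_marginal <- coxph(Surv(time, status) ~ age + sex + cluster(id),
                      data = kidney)
print(fit_marginal)
```

The frailty fit models the within-patient dependence explicitly, while the marginal fit leaves it unspecified and only corrects the standard errors; which is preferable depends on whether the dependence itself is of interest.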
The DPsurvint function in DPpackage fits a Bayesian semi-parametric AFT model. LDDPsurvival in the same package fits a Linear Dependent Dirichlet Process Mixture of survival models. NMixMCMC in mixAK performs an MCMC estimation of normal mixtures for censored data. MCMCtobit in MCMCpack fits, by MCMC, a Gaussian linear regression with a censored dependent variable (a tobit model). The weibullregpost function in LearnBayes computes the log posterior density for a Weibull proportional-odds regression model.
This section tries to list some specialised plot functions that might be useful in the context of event history analysis. The plot.Hist function in prodlim draws the states and transitions that characterize a multistate model. The ggsurvplot function (from the survminer package) draws survival curves with the 'number at risk' table. Other functions are also available for visual examination of Cox model assumptions.
The censboot function (from the boot package) implements several types of bootstrap techniques for right-censored data.
Arthur Allignol and Aurelien Latouche