\name{etm}
\alias{etm}
\alias{etm.data.frame}

\title{Computation of the empirical transition matrix}
\description{
  This function computes the empirical transition matrix, also called the
  Aalen-Johansen estimator, of the transition probability matrix of any
  multistate model. The covariance matrix is also computed.
}
\usage{
\S3method{etm}{data.frame}(data, state.names, tra, cens.name, s, t = "last",
    covariance = TRUE, delta.na = TRUE, modif = FALSE,
    c = 1, alpha = NULL, strata, ...)
}
\arguments{
  \item{data}{data.frame of the form data.frame(id,from,to,time)
    or (id,from,to,entry,exit)
    \describe{
    \item{id:}{patient id}
    \item{from:}{the state from where the transition occurs}
    \item{to:}{the state to which a transition occurs}
    \item{time:}{time when a transition occurs}
    \item{entry:}{entry time in a state}
    \item{exit:}{exit time from a state}
    }
    This data.frame is transition-oriented, \emph{i.e.} it contains one
    row per transition, and possibly several rows per patient. Specifying
    an entry and an exit time makes it possible to account for
    left-truncation.}
  \item{state.names}{A character vector giving the state names.}
  \item{tra}{A square matrix of logical values describing the possible
    transitions within the multistate model.}
  \item{cens.name}{A character string giving the code for censored
    observations in the column 'to' of \code{data}. If there are no
    censored observations in your data, use \code{NULL}.}
  \item{s}{Starting value for computing the transition probabilities.}
  \item{t}{Ending value. Default is "last", meaning that the transition
    probabilities are computed over \eqn{(s, t]}{(s, t]}, \eqn{t}{t}
    being the last time in the data set.}
  \item{covariance}{Logical. Whether to compute the covariance matrix.
    Setting it to \code{FALSE} may be useful for, say, simulations, as
    the variance computation is slow. Default is \code{TRUE}.}
  \item{delta.na}{Logical.
    Whether to export the array containing the
    increments of the Nelson-Aalen estimator. Default is \code{TRUE}.}
  \item{modif}{Logical. Whether to apply the modification of Lai and
    Ying for small risk sets. Default is \code{FALSE}.}
  \item{c}{Constant for the Lai and Ying modification. Either \code{c}
    contains a single value that is used for all the states, or
    \code{c} should have the same length as \code{state.names}.}
  \item{alpha}{Constant for the Lai and Ying modification. If NULL (the
    default) then only \code{c} is used and the Lai and Ying
    modification discards the event times for which \eqn{Y(t) \geq
    t}{Y(t) >= t}. Otherwise \eqn{cn^\alpha}{cn^alpha} is used. It is
    recommended to leave \code{alpha} equal to NULL for multistate models.}
  \item{strata}{Character vector giving variables on which to stratify
    the analysis.}
  \item{...}{Not used}
}
\details{
  Data are considered to arise from a time-inhomogeneous Markovian
  multistate model with finite state space, possibly subject to
  independent right-censoring and left-truncation.

  The matrix of the transition probabilities is estimated by the
  Aalen-Johansen estimator / empirical transition matrix (Andersen et
  al., 1993), which is the product integral, over the time period
  \eqn{(s, t]}{(s, t]}, of the identity matrix I plus the matrix of the
  increments of the Nelson-Aalen estimates of the cumulative transition
  hazards. The \eqn{(i, j)-th}{(i, j)-th} entry of the empirical
  transition matrix estimates the transition probability of being in
  state \eqn{j}{j} at time \eqn{t}{t} given that one has been in state
  \eqn{i}{i} at time \eqn{s}{s}.

  The covariance matrix is computed using the recursion formula (4.4.19)
  in Andersen et al. (1993, p. 295). This estimator of the covariance
  matrix is an estimator of the Greenwood type.
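
  As a sketch of the product integral described above (the notation here
  is introduced for illustration only and is not taken from the package
  source), let \eqn{t_k}{t_k} denote the observed event times and let
  \eqn{\Delta \hat{A}(t_k)}{dA(t_k)} be the matrix of Nelson-Aalen
  increments at \eqn{t_k}{t_k}. Its off-diagonal entries are
  \eqn{\Delta N_{ij}(t_k) / Y_i(t_k)}{dN_ij(t_k) / Y_i(t_k)}, the number
  of observed transitions from state i to state j at \eqn{t_k}{t_k}
  divided by the number of individuals at risk in state i just before
  \eqn{t_k}{t_k}, and its diagonal entries are such that each row sums
  to zero. With finitely many event times, the product integral reduces
  to a finite matrix product:
  \deqn{\hat{P}(s, t) = \prod_{s < t_k \le t} (I + \Delta \hat{A}(t_k))}{P(s, t) = prod_{s < t_k <= t} (I + dA(t_k))}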

  If the multistate model is not Markov, but censorship is entirely
  random, the Aalen-Johansen estimator still consistently estimates the
  state occupation probabilities of being in state \eqn{i}{i} at time
  \eqn{t}{t} (Datta & Satten, 2001; Glidden, 2002).

  Recent versions of R (4.0.0 and later) have changed the default of the
  \code{stringsAsFactors} argument of the \code{data.frame} function
  from \code{TRUE} to \code{FALSE}. \code{etm} currently
  depends on the states being factors, so the user should use
  \code{data.frame(..., stringsAsFactors=TRUE)}.

}
\value{
  \item{est}{Transition probability estimates. This is a 3-dimensional
    array with the first dimension being the state from where transitions
    occur, the second the state to which transitions occur, and the
    last one being the event times.}
  \item{cov}{Estimated covariance matrix. Each cell of the matrix gives
    the covariance between the transition probabilities given by the
    rownames and the colnames, respectively.}
  \item{time}{Event times at which the transition probabilities are
    computed. That is, all the observed times in \eqn{(s, t]}{(s, t]}.}
  \item{s}{Start of the time interval.}
  \item{t}{End of the time interval.}
  \item{trans}{A \code{data.frame} giving the possible transitions.}
  \item{state.names}{A character vector giving the state names.}
  \item{cens.name}{How the censored observations are coded in the data
    set.}
  \item{n.risk}{Matrix indicating the number of individuals at risk just
    before an event.}
  \item{n.event}{Array containing the number of transitions at each
    time.}
  \item{delta.na}{A 3d array containing the increments of the
    Nelson-Aalen estimator.}
  \item{ind.n.risk}{When \code{modif} is true, risk set size for which
    the indicator function is 1.}

  If the analysis is stratified, a list of \code{etm} objects is
  returned.
}
\references{
  Beyersmann, J., Allignol, A. and Schumacher, M. (2012).
  \emph{Competing Risks and Multistate Models with R}. Use R!
  Springer Verlag.

  Allignol, A., Schumacher, M. and Beyersmann, J. (2011).
  Empirical Transition Matrix of Multi-State Models: The etm Package.
  \emph{Journal of Statistical Software}, 38.

  Andersen, P.K., Borgan, O., Gill, R.D. and Keiding,
  N. (1993). \emph{Statistical models based on counting
  processes}. Springer Series in Statistics. New York, NY: Springer.

  Aalen, O. and Johansen, S. (1978). An empirical transition matrix for
  non-homogeneous Markov chains based on censored
  observations. \emph{Scandinavian Journal of Statistics}, 5: 141-150.

  Gill, R.D. and Johansen, S. (1990). A survey of product-integration
  with a view towards application in survival analysis. \emph{Annals of
  Statistics}, 18(4): 1501-1555.

  Datta, S. and Satten, G.A. (2001). Validity of the Aalen-Johansen
  estimators of stage occupation probabilities and Nelson-Aalen
  estimators of integrated transition hazards for non-Markov
  models. \emph{Statistics and Probability Letters}, 55(4): 403-411.

  Glidden, D. (2002). Robust inference for event probabilities with
  non-Markov data. \emph{Biometrics}, 58: 361-368.
}
\author{Arthur Allignol, \email{arthur.allignol@gmail.com}}

\note{Transitions into the same state, which are mathematically
  superfluous, are not allowed. If transitions into the same state are
  detected in the data, the function will stop.
  Equally, \code{diag(tra)} must be set to
  FALSE; see the examples below.}

\seealso{\code{\link{print.etm}}, \code{\link{summary.etm}}, \code{\link{sir.cont}},
  \code{\link{xyplot.etm}}}

\examples{
data(sir.cont)

# Modification for patients entering and leaving a state
# at the same date
# A change in ventilation status is considered
# to happen before the end of the hospital stay
sir.cont <- sir.cont[order(sir.cont$id, sir.cont$time), ]
for (i in 2:nrow(sir.cont)) {
  if (sir.cont$id[i] == sir.cont$id[i-1]) {
    if (sir.cont$time[i] == sir.cont$time[i-1]) {
      sir.cont$time[i-1] <- sir.cont$time[i-1] - 0.5
    }
  }
}

### Computation of the transition probabilities
# Possible transitions (the diagonal stays FALSE).
tra <- matrix(ncol = 3, nrow = 3, FALSE)
tra[1, 2:3] <- TRUE
tra[2, c(1, 3)] <- TRUE

# etm
tr.prob <- etm(sir.cont, c("0", "1", "2"), tra, "cens", s = 1)

tr.prob
summary(tr.prob)

# plotting
if (require("lattice")) {
xyplot(tr.prob, tr.choice=c("0 0", "1 1", "0 1", "0 2", "1 0", "1 2"),
       layout=c(2, 3), strip=strip.custom(bg="white",
         factor.levels=
     c("0 to 0", "1 to 1", "0 to 1", "0 to 2", "1 to 0", "1 to 2")))
}

### example with left-truncation

data(abortion)

# Data set modification in order to be used by etm
names(abortion) <- c("id", "entry", "exit", "from", "to")
abortion$to <- abortion$to + 1

## computation of the matrix giving the possible transitions
tra <- matrix(FALSE, nrow = 5, ncol = 5)
tra[1:2, 3:5] <- TRUE

## etm
fit <- etm(abortion, as.character(0:4), tra, NULL, s = 0)

## plot
if (require("lattice")) {
xyplot(fit, tr.choice = c("0 0", "1 1", "0 4", "1 4"),
       ci.fun = c("log-log", "log-log", "cloglog", "cloglog"),
       strip = strip.custom(factor.levels = c("P(T > t) -- control",
                                "P(T > t) -- exposed",
                                "CIF spontaneous abortion -- control",
                                "CIF spontaneous abortion -- exposed")))
}
}

\keyword{survival}