\name{etm}
\alias{etm}
\alias{etm.data.frame}

\title{Computation of the empirical transition matrix}
\description{
  This function computes the empirical transition matrix, also called
  the Aalen-Johansen estimator, of the transition probability matrix of
  any multistate model. The covariance matrix is also computed.
}
\usage{
\S3method{etm}{data.frame}(data, state.names, tra, cens.name, s, t = "last",
    covariance = TRUE, delta.na = TRUE, modif = FALSE,
    c = 1, alpha = NULL, strata, ...)
}
\arguments{
  \item{data}{A data.frame of the form \code{data.frame(id, from, to, time)}
    or \code{data.frame(id, from, to, entry, exit)}, where
    \describe{
    \item{id:}{patient id}
    \item{from:}{the state from where the transition occurs}
    \item{to:}{the state to which a transition occurs}
    \item{time:}{time when a transition occurs}
    \item{entry:}{entry time in a state}
    \item{exit:}{exit time from a state}
    }
  This data.frame is transition-oriented, \emph{i.e.} it contains one
  row per transition, and possibly several rows per patient. Specifying
  an entry and an exit time makes it possible to account for
  left-truncation.}
  \item{state.names}{A character vector giving the state names.}
  \item{tra}{A square matrix of logical values describing the possible
    transitions within the multistate model.}
  \item{cens.name}{A character string giving the code for censored
    observations in the column 'to' of \code{data}. If there are no
    censored observations in the data, use \code{NULL}.}
  \item{s}{Starting value for computing the transition probabilities.}
  \item{t}{Ending value. Default is "last", meaning that the transition
    probabilities are computed over \eqn{(s, t]}{(s, t]}, \eqn{t}{t}
    being the last time in the data set.}
  \item{covariance}{Logical. Whether to compute the covariance matrix.
    Setting it to \code{FALSE} may be useful for, say, simulations, as
    the variance computation is time-consuming. Default is \code{TRUE}.}
  \item{delta.na}{Logical. Whether to export the array containing the
    increments of the Nelson-Aalen estimator. Default is \code{TRUE}.}
  \item{modif}{Logical. Whether to apply the modification of Lai and
    Ying for small risk sets. Default is \code{FALSE}.}
  \item{c}{Constant for the Lai and Ying modification. Either a single
    value that will be used for all the states, or a vector of the same
    length as \code{state.names}.}
  \item{alpha}{Constant for the Lai and Ying modification. If \code{NULL}
    (the default), only \code{c} is used and the Lai and Ying
    modification discards the event times for which
    \eqn{Y(t) < c}{Y(t) < c}, where \eqn{Y(t)}{Y(t)} is the size of the
    risk set. Otherwise the threshold \eqn{cn^\alpha}{cn^alpha} is used.
    It is recommended to leave \code{alpha} as \code{NULL} for
    multistate models.}
  \item{strata}{Character vector giving variables on which to stratify
    the analysis.}
  \item{...}{Not used}
}
\details{
  Data are considered to arise from a time-inhomogeneous Markovian
  multistate model with finite state space, and possibly subject to
  independent right-censoring and left-truncation.

  The matrix of the transition probabilities is estimated by the
  Aalen-Johansen estimator / empirical transition matrix (Andersen et
  al., 1993), which is the product integral over the time period
  \eqn{(s, t]}{(s, t]} of I + the matrix of the increments of the
  Nelson-Aalen estimates of the cumulative transition hazards. The
  \eqn{(i, j)}{(i, j)}-th entry of the empirical transition matrix
  estimates the transition probability of being in state \eqn{j}{j} at
  time \eqn{t}{t} given that one has been in state \eqn{i}{i} at time
  \eqn{s}{s}.
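
  As a sketch of the product integral described above (the notation is
  assumed here for illustration and is not taken from the package
  source), write \eqn{t_k}{t_k} for the observed event times and
  \eqn{\Delta\hat{A}(t_k)}{Delta A(t_k)} for the matrix of Nelson-Aalen
  increments at \eqn{t_k}{t_k}. The estimator can then be written as
  \deqn{\hat{P}(s, t) = \prod_{s < t_k \le t} \left(I + \Delta\hat{A}(t_k)\right)}{P(s, t) = prod_(s < t_k <= t) (I + Delta A(t_k))}
  where I is the identity matrix. In the usual formulation, the
  off-diagonal entry \eqn{(i, j)}{(i, j)} of
  \eqn{\Delta\hat{A}(t_k)}{Delta A(t_k)} is the number of observed
  \eqn{i \to j}{i -> j} transitions at \eqn{t_k}{t_k} divided by the
  number of individuals at risk in state \eqn{i}{i} just before
  \eqn{t_k}{t_k}, and each diagonal entry is minus the sum of the
  off-diagonal entries in its row, so that every row sums to zero.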

  The covariance matrix is computed using the recursion formula (4.4.19)
  in Andersen et al. (1993, p. 295). This estimator of the covariance
  matrix is an estimator of the Greenwood type.

  If the multistate model is not Markov, but censoring is entirely
  random, the Aalen-Johansen estimator still consistently estimates the
  state occupation probabilities of being in state \eqn{i}{i} at time
  \eqn{t}{t} (Datta & Satten, 2001; Glidden, 2002).

  Since R 4.0.0, the default of the \code{stringsAsFactors} argument of
  the \code{data.frame} function has changed from \code{TRUE} to
  \code{FALSE}. \code{etm} currently depends on the states being
  factors, so the user should use
  \code{data.frame(..., stringsAsFactors=TRUE)}.
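
  A minimal sketch of such a call, using a hypothetical toy data set
  (the object name \code{d} and its values are illustrative assumptions,
  not data shipped with the package):
  \preformatted{
# Toy transition-oriented data: one row per transition (illustrative)
d <- data.frame(id   = c(1, 1, 2),
                from = c("0", "1", "0"),
                to   = c("1", "2", "cens"),
                time = c(2, 5, 4),
                stringsAsFactors = TRUE)
# The state columns are now factors rather than character vectors
stopifnot(is.factor(d$from), is.factor(d$to))
  }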
}
\value{
  \item{est}{Transition probability estimates. This is a 3-dimensional
    array with the first dimension being the state from where transitions
    occur, the second the state to which transitions occur, and the
    last one being the event times.}
  \item{cov}{Estimated covariance matrix. Each cell of the matrix gives
    the covariance between the transition probabilities given by the
    rownames and the colnames, respectively.}
  \item{time}{Event times at which the transition probabilities are
    computed, that is, all the observed times in \eqn{(s, t]}{(s, t]}.}
  \item{s}{Start of the time interval.}
  \item{t}{End of the time interval.}
  \item{trans}{A \code{data.frame} giving the possible transitions.}
  \item{state.names}{A character vector giving the state names.}
  \item{cens.name}{How the censored observations are coded in the data
    set.}
  \item{n.risk}{Matrix indicating the number of individuals at risk just
    before an event.}
  \item{n.event}{Array containing the number of transitions at each
    time.}
  \item{delta.na}{A 3-dimensional array containing the increments of the
    Nelson-Aalen estimator.}
  \item{ind.n.risk}{When \code{modif} is true, risk set size for which
    the indicator function is 1.}

  If the analysis is stratified, a list of \code{etm} objects is
  returned.
}
\references{
  Beyersmann, J., Allignol, A. and Schumacher, M. (2012).
  \emph{Competing Risks and Multistate Models with R}. Use R! Springer
  Verlag.

  Allignol, A., Schumacher, M. and Beyersmann, J. (2011).
  Empirical Transition Matrix of Multi-State Models: The etm Package.
  \emph{Journal of Statistical Software}, 38.

  Andersen, P.K., Borgan, O., Gill, R.D. and Keiding,
  N. (1993). \emph{Statistical models based on counting
    processes}. Springer Series in Statistics. New York, NY: Springer.

  Aalen, O. and Johansen, S. (1978). An empirical transition matrix for
  non-homogeneous Markov chains based on censored
  observations. \emph{Scandinavian Journal of Statistics}, 5: 141-150.

  Gill, R.D. and Johansen, S. (1990). A survey of product-integration
  with a view towards application in survival analysis. \emph{Annals of
    Statistics}, 18(4): 1501-1555.

  Datta, S. and Satten, G.A. (2001). Validity of the Aalen-Johansen
  estimators of stage occupation probabilities and Nelson-Aalen
  estimators of integrated transition hazards for non-Markov
  models. \emph{Statistics and Probability Letters}, 55(4): 403-411.

  Glidden, D. (2002). Robust inference for event probabilities with
  non-Markov data. \emph{Biometrics}, 58: 361-368.
}
\author{Arthur Allignol, \email{arthur.allignol@gmail.com}}

\note{Transitions into the same state, which are mathematically
  superfluous, are not allowed. If transitions into the same state are
  detected in the data, the function will stop. Likewise,
  \code{diag(tra)} must be set to \code{FALSE}, see the example below.}

\seealso{\code{\link{print.etm}}, \code{\link{summary.etm}}, \code{\link{sir.cont}},
  \code{\link{xyplot.etm}}}

\examples{
data(sir.cont)

# Modification for patients entering and leaving a state
# at the same date
# Change on ventilation status is considered
# to happen before end of hospital stay
sir.cont <- sir.cont[order(sir.cont$id, sir.cont$time), ]
for (i in 2:nrow(sir.cont)) {
  if (sir.cont$id[i] == sir.cont$id[i-1]) {
    if (sir.cont$time[i] == sir.cont$time[i-1]) {
      sir.cont$time[i-1] <- sir.cont$time[i-1] - 0.5
    }
  }
}

### Computation of the transition probabilities
# Possible transitions (the diagonal stays FALSE).
tra <- matrix(FALSE, nrow = 3, ncol = 3)
tra[1, 2:3] <- TRUE
tra[2, c(1, 3)] <- TRUE

# etm
tr.prob <- etm(sir.cont, c("0", "1", "2"), tra, "cens", 1)

tr.prob
summary(tr.prob)
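
# A hedged sketch, not part of the original example: the components
# described in the Value section can also be inspected directly.
# The object names p01 and row.sums are illustrative only.
p01 <- tr.prob$est[1, 2, ]  # estimates for state "0" to state "1"
head(cbind(time = tr.prob$time, p01))
# Each slice est[, , k] is an estimated transition probability matrix,
# so its rows are expected to sum to one up to floating point error.
row.sums <- apply(tr.prob$est, 3, rowSums)
range(row.sums)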

# plotting
if (require("lattice")) {
xyplot(tr.prob, tr.choice=c("0 0", "1 1", "0 1", "0 2", "1 0", "1 2"),
       layout=c(2, 3), strip=strip.custom(bg="white",
         factor.levels=
     c("0 to 0", "1 to 1", "0 to 1", "0 to 2", "1 to 0", "1 to 2")))
}

### example with left-truncation

data(abortion)

# Data set modification in order to be used by etm
names(abortion) <- c("id", "entry", "exit", "from", "to")
abortion$to <- abortion$to + 1

## computation of the matrix giving the possible transitions
tra <- matrix(FALSE, nrow = 5, ncol = 5)
tra[1:2, 3:5] <- TRUE

## etm
fit <- etm(abortion, as.character(0:4), tra, NULL, s = 0)

## plot (guarded like the first plot, since xyplot needs lattice)
if (require("lattice")) {
xyplot(fit, tr.choice = c("0 0", "1 1", "0 4", "1 4"),
       ci.fun = c("log-log", "log-log", "cloglog", "cloglog"),
       strip = strip.custom(factor.levels = c("P(T > t) -- control",
                                              "P(T > t) -- exposed",
                                 "CIF spontaneous abortion -- control",
                                 "CIF spontaneous abortion -- exposed")))
}
}

\keyword{survival}
