\name{reShape}
\alias{reShape}
\title{Reshape Matrices and Serial Data}
\description{
  If the first argument is a matrix, \code{reShape} strings out its values
  and creates row and column vectors specifying the row and column each
  element came from.  This is useful for sending matrices to Trellis
  functions, for analyzing or plotting results of \code{table} or
  \code{crosstabs}, or for reformatting serial data stored in a matrix (with
  rows representing multiple time points) into vectors.  The number of
  observations in the new variables will be the product of the number of
  rows and number of columns in the input matrix.  If the first
  argument is a vector, the \code{id} and \code{colvar} variables are used to
  restructure it into a matrix, with \code{NA}s for combinations of
  \code{id} and \code{colvar} values that do not occur in the
  data.  When more than one vector is given, multiple matrices are
  created.  This is useful for restructuring irregular serial data into
  regular matrices.  It is also useful for converting data produced by
  \code{expand.grid} into a matrix (see the \code{expand.grid} example).  The number of
  rows of the new matrices equals the number of unique values of \code{id},
  and the number of columns equals the number of unique values of
  \code{colvar}.

  When the first argument is a vector and the \code{id} is a data frame
  (even with only one variable),
  \code{reShape} will produce a data frame, and the unique groups are
  identified by combinations of the values of all variables in \code{id}.
  If a data frame \code{constant} is specified, the variables in this data
  frame are assumed to be constant within combinations of \code{id}
  variables (if not, an arbitrary observation in \code{constant} will be
  selected for each group).  A row of \code{constant} corresponding to the
  target \code{id} combination is then carried along when creating the
  data frame result.

  \code{reShape} behaves differently when \code{base} and \code{reps}
  are specified.  In that case \code{x} must be a list or data frame, and
  those data are assumed to contain one or more non-repeating
  measurements (e.g., baseline measurements) and one or more repeated
  measurements represented by variables named by pasting together the
  character strings in the vector \code{base} with the integers 1, 2, \dots,
  \code{reps}.  The input data are rearranged by repeating each value of the
  baseline variables \code{reps} times and by transposing each observation's
  values of each set of repeated measurements into \code{reps}
  observations under the variable whose name does not have an integer
  pasted to the end.  If \code{x} has a \code{row.names} attribute, those
  observation identifiers are each repeated \code{reps} times in the output
  object.  See the last example.
}
\usage{
reShape(x, \dots, id, colvar, base, reps, times=1:reps,
        timevar='seqno', constant=NULL)
}
\arguments{
  \item{x}{
    a matrix or vector, or, when \code{base} is specified, a list or data frame
  }
  \item{\dots}{
    other optional vectors, if \code{x} is a vector
  }
  \item{id}{
    A numeric, character, category, or factor variable containing subject
    identifiers, or a data frame of such variables that in combination form
    groups of interest.  Required if \code{x} is a vector, ignored otherwise.
  }
  \item{colvar}{
    A numeric, character, category, or factor variable containing column
    identifiers.  \code{colvar} is usually a "time of data collection" variable.
    Required if \code{x} is a vector, ignored otherwise.
  }
  \item{base}{
    vector of character strings containing base names of repeated
    measurements
  }
  \item{reps}{
    number of times variables named in \code{base} are repeated.  This must be
    a constant.
  }
  \item{times}{
    when \code{base} is given, the vector of time values to create,
    if you do not want to use consecutive integers beginning with 1.
  }
  \item{timevar}{
    the name of the time variable to create when \code{times} is
    given, if you do not want to use the default, \code{seqno}
  }
  \item{constant}{
    a data frame with the same number of rows as \code{id} and \code{x},
    containing auxiliary information to be merged into the resulting data
    frame.  Logically, the rows of \code{constant} within each group
    should have the same values for all of its variables.
  }
}
\value{
  If \code{x} is a matrix, returns a list containing the row variable, the
  column variable, and the \code{as.vector(x)} vector, the last named after
  the calling argument for \code{x}.  If \code{x} is a vector and no other
  vectors were specified as \code{\dots}, the result is a matrix.  If at least
  one vector was given to \code{\dots}, the result is a list containing \code{k}
  matrices, where \code{k} is one plus the number of vectors in \code{\dots}.  If \code{x}
  is a list or data frame, the same type of object is returned.  If
  \code{x} is a vector and \code{id} is a data frame, a data frame will be
  the result.
}
\details{
  In converting \code{dimnames} to vectors, the resulting variables are
  numeric if all elements of the matrix dimnames can be converted to
  numeric, otherwise the corresponding row or column variable remains
  character.  When the \code{dimnames} of \code{x} have a \code{names} attribute, those
  two names become the new variable names.  If \code{x} is a vector and
  another vector is also given (in \code{\dots}), the matrices in the resulting
  list are named the same as the input vector calling arguments.  You
  can specify customized names for these on-the-fly by using
  e.g. \code{reShape(X=x, Y=y, id= , colvar= )}.  The new names will then be
  \code{X} and \code{Y} instead of \code{x} and \code{y}.  When \code{base}
  and \code{reps} are specified, a new variable named \code{seqno} is
  also added to the resulting object.  \code{seqno} indicates the sequential
  repeated measurement number.  When \code{base} and \code{times} are
  specified, this new variable is named the character value of \code{timevar} and the values
  are given by a table lookup into the vector \code{times}.
}
\author{
Frank Harrell\cr
Department of Biostatistics\cr
Vanderbilt University School of Medicine\cr
\email{f.harrell@vanderbilt.edu}
}
\seealso{
  \code{\link[stats]{reshape}}, \code{\link[base:vector]{as.vector}},
  \code{\link[base]{matrix}}, \code{\link[base]{dimnames}},
  \code{\link[base]{outer}}, \code{\link[base]{table}}
}
\examples{
set.seed(1)
Solder  <- factor(sample(c('Thin','Thick'),200,TRUE),c('Thin','Thick'))
Opening <- factor(sample(c('S','M','L'),  200,TRUE),c('S','M','L'))

tab <- table(Opening, Solder)
tab
reShape(tab)
# attach(tab)  # do further processing

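# A minimal sketch (not part of the original examples): the same
# matrix-to-vectors conversion on a small hand-made matrix.  The object
# name m and the dimnames labels 'subject' and 'month' are illustrative
# assumptions.  Because dimnames(m) has names, the row and column
# variables should be named subject and month; the month labels can all
# be converted to numeric, so that variable is expected to be numeric.
m <- matrix(1:6, nrow=2,
            dimnames=list(subject=c('a','b'), month=c('1','2','3')))
r <- reShape(m)
r
length(r[['m']]) == nrow(m) * ncol(m)   # one value per matrix cell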

#if(!.R.) {
# g <- crosstabs( ~ Solder + Opening, data = solder, subset = skips > 10)
# rowpct <- 100*attr(g,'marginals')$"N/RowTotal"   # compute row pcts
# rowpct
#
#
# r <- reShape(rowpct)
# # note names "Solder" and "Opening" came originally from formula
# # given to crosstabs
# r
# dotplot(Solder ~ rowpct, groups=Opening, panel=panel.superpose, data=r)
#}


# An example where a matrix is created from irregular vectors
follow <- data.frame(id=c('a','a','b','b','b','d'),
                     month=c(1, 2,  1,  2,  3,  2),
                     cholesterol=c(225,226, 320,319,318, 270))
follow
# with() avoids leaving follow attached to the search path
with(follow, reShape(cholesterol, id=id, colvar=month))
# Could have done (given a second vector trig):
# reShape(cholesterol, triglyceride=trig, id=id, colvar=month)

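# A runnable sketch of the commented call above.  The trig values are
# hypothetical, made up only to illustrate reshaping two vectors at
# once; the result should be a list of two matrices named cholesterol
# and triglyceride.
trig <- c(150, 155, 210, 205, 200, 180)   # hypothetical triglycerides
two <- with(follow,
            reShape(cholesterol, triglyceride=trig, id=id, colvar=month))
names(two)
two$triglyceride
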
# Create a data frame, reshaping a long dataset in which groups are
# formed not just by subject id but by combinations of subject id and
# visit number.  Also carry forward a variable that is supposed to be
# constant within subject-visit number combinations.  In this example,
# it is not constant, so an arbitrary observation of k will be selected
# for each group.
w <- data.frame(id=c('a','a','a','a','b','b','b','d','d','d'),
             visit=c(  1,  1,  2,  2,  1,  1,  2,  2,  2,  2),
                 k=c('A','A','B','B','C','C','D','E','F','G'),
               var=c('x','y','x','y','x','y','y','x','y','z'),
               val=1:10)
with(w,
     reShape(val, id=data.frame(id,visit),
             constant=data.frame(k), colvar=var))

# Get predictions from a regression model for 2 systematically
# varying predictors.  Convert the predictions into a matrix, with
# rows corresponding to the predictor having the most values, and
# columns corresponding to the other predictor
# d <- expand.grid(x2=0:1, x1=1:100)
# pred <- predict(fit, d)
# reShape(pred, id=d$x1, colvar=d$x2)  # makes 100 x 2 matrix

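# A runnable sketch of the commented steps above.  The simulated data
# frame sim and the linear model fit are assumptions made up for
# illustration; any model with a predict method would do.
set.seed(2)
sim <- data.frame(x1=runif(50, 1, 100), x2=rbinom(50, 1, 0.5))
sim$y <- 1 + 0.1*sim$x1 + 2*sim$x2 + rnorm(50)
fit <- lm(y ~ x1 + x2, data=sim)
d <- expand.grid(x2=0:1, x1=1:100)
pred <- predict(fit, d)
p <- reShape(pred, id=d$x1, colvar=d$x2)
dim(p)   # 100 rows (x1 values) by 2 columns (x2 values)
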

# Reshape a wide data frame containing multiple variables representing
# repeated measurements (3 repeats on 2 variables; 4 subjects)
set.seed(33)
n <- 4
w <- data.frame(age=rnorm(n, 40, 10),
                sex=sample(c('female','male'), n,TRUE),
                sbp1=rnorm(n, 120, 15),
                sbp2=rnorm(n, 120, 15),
                sbp3=rnorm(n, 120, 15),
                dbp1=rnorm(n,  80, 15),
                dbp2=rnorm(n,  80, 15),
                dbp3=rnorm(n,  80, 15), row.names=letters[1:n])
options(digits=3)
w


u <- reShape(w, base=c('sbp','dbp'), reps=3)
u
reShape(w, base=c('sbp','dbp'), reps=3, timevar='week', times=c(0,3,12))
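
# For comparison, a rough sketch of a similar wide-to-long conversion
# with stats::reshape (listed under See Also).  The idvar name 'subject'
# and the object name long are illustrative assumptions.  Row order and
# row names may differ from the reShape result, because stats::reshape
# stacks the data by time; the last line sorts by subject to make the
# two easier to compare.
long <- reshape(w, direction='long',
                varying=list(c('sbp1','sbp2','sbp3'),
                             c('dbp1','dbp2','dbp3')),
                v.names=c('sbp','dbp'), timevar='seqno', times=1:3,
                idvar='subject', ids=row.names(w))
long[order(long$subject, long$seqno), ]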
}
\keyword{manip}
\keyword{array}
\concept{trellis}
\concept{lattice}
\concept{repeated measures}
\concept{longitudinal data}
