\name{reShape}
\alias{reShape}
\title{Reshape Matrices and Serial Data}
\description{
  If the first argument is a matrix, \code{reShape} strings out its values
  and creates row and column vectors specifying the row and column each
  element came from.  This is useful for sending matrices to Trellis
  functions, for analyzing or plotting results of \code{table} or
  \code{crosstabs}, or for reformatting serial data stored in a matrix (with
  rows representing multiple time points) into vectors.  The number of
  observations in the new variables will be the product of the number of
  rows and number of columns in the input matrix.  If the first
  argument is a vector, the \code{id} and \code{colvar} variables are used to
  restructure it into a matrix, with \code{NA}s for elements that correspond
  to combinations of \code{id} and \code{colvar} values that do not exist in the
  data.  When more than one vector is given, multiple matrices are
  created.  This is useful for restructuring irregular serial data into
  regular matrices.  It is also useful for converting data produced by
  \code{expand.grid} into a matrix (see the \code{expand.grid} example).
  The number of rows of the new matrices equals the number of unique
  values of \code{id}, and the number of columns equals the number of
  unique values of \code{colvar}.

  When the first argument is a vector and \code{id} is a data frame
  (even with only one variable),
  \code{reShape} will produce a data frame, and the unique groups are
  identified by combinations of the values of all variables in \code{id}.
  If a data frame \code{constant} is specified, the variables in this data
  frame are assumed to be constant within combinations of \code{id}
  variables (if not, an arbitrary observation in \code{constant} will be
  selected for each group).  A row of \code{constant} corresponding to the
  target \code{id} combination is then carried along when creating the
  data frame result.

  A different behavior of \code{reShape} is achieved when \code{base} and \code{reps}
  are specified.  In that case \code{x} must be a list or data frame, and
  those data are assumed to contain one or more non-repeating
  measurements (e.g., baseline measurements) and one or more repeated
  measurements represented by variables named by pasting together the
  character strings in the vector \code{base} with the integers 1, 2, \dots,
  \code{reps}.  The input data are rearranged by repeating each value of the
  baseline variables \code{reps} times and by transposing each observation's
  values of one of the sets of repeated measurements as \code{reps}
  observations under the variable whose name does not have an integer
  pasted to the end.  If \code{x} has a \code{row.names} attribute, those
  observation identifiers are each repeated \code{reps} times in the output
  object.  See the last example.
}
\usage{
reShape(x, \dots, id, colvar, base, reps, times=1:reps,
        timevar='seqno', constant=NULL)
}
\arguments{
  \item{x}{
    a matrix or vector, or, when \code{base} is specified, a list or data frame
  }
  \item{\dots}{
    other optional vectors, if \code{x} is a vector
  }
  \item{id}{
    A numeric, character, category, or factor variable containing subject
    identifiers, or a data frame of such variables that in combination form
    groups of interest.  Required if \code{x} is a vector, ignored otherwise.
  }
  \item{colvar}{
    A numeric, character, category, or factor variable containing column
    identifiers.  \code{colvar} is usually a "time of data collection" variable.
    Required if \code{x} is a vector, ignored otherwise.
  }
  \item{base}{
    vector of character strings containing base names of repeated
    measurements
  }
  \item{reps}{
    number of times variables named in \code{base} are repeated.  This must be
    a constant.
  }
  \item{times}{
    when \code{base} is given, \code{times} is the vector of times to create
    if you do not want to use consecutive integers beginning with 1.
  }
  \item{timevar}{
    specifies the name of the time variable to create if \code{times} is
    given, if you do not want to use \code{seqno}
  }
  \item{constant}{
    a data frame with the same number of rows as \code{id} and \code{x},
    containing auxiliary information to be merged into the resulting data
    frame.  Logically, the rows of \code{constant} within each group
    should have the same value of all of its variables.
  }
}
\value{
  If \code{x} is a matrix, returns a list containing the row variable, the
  column variable, and the \code{as.vector(x)} vector, the last named after
  the calling argument given for \code{x}.  If \code{x} is a vector and no other
  vectors were specified as \code{\dots}, the result is a matrix.  If at least
  one vector was given to \code{\dots}, the result is a list containing \code{k}
  matrices, where \code{k} is one plus the number of vectors in \code{\dots}.
  If \code{x} is a list or data frame, the same type of object is returned.  If
  \code{x} is a vector and \code{id} is a data frame, a data frame will be
  the result.
}
\details{
  In converting \code{dimnames} to vectors, the resulting variables are
  numeric if all elements of the matrix dimnames can be converted to
  numeric, otherwise the corresponding row or column variable remains
  character.  When the \code{dimnames} of \code{x} have a \code{names} attribute, those
  two names become the new variable names.  If \code{x} is a vector and
  another vector is also given (in \code{\dots}), the matrices in the resulting
  list are named the same as the input vector calling arguments.  You
  can specify customized names for these on-the-fly by using
  e.g. \code{reShape(X=x, Y=y, id= , colvar= )}.
  The new names will then be
  \code{X} and \code{Y} instead of \code{x} and \code{y}.  A new variable named \code{seqno} is
  also added to the resulting object.  \code{seqno} indicates the sequential
  repeated measurement number.  When \code{base} and \code{times} are
  specified, this new variable is named by the character value of
  \code{timevar}, and its values are given by a table lookup into the
  vector \code{times}.
}
\author{
Frank Harrell\cr
Department of Biostatistics\cr
Vanderbilt University School of Medicine\cr
\email{f.harrell@vanderbilt.edu}
}
\seealso{
  \code{\link[stats]{reshape}}, \code{\link[base:vector]{as.vector}},
  \code{\link[base]{matrix}}, \code{\link[base]{dimnames}},
  \code{\link[base]{outer}}, \code{\link[base]{table}}
}
\examples{
set.seed(1)
Solder  <- factor(sample(c('Thin','Thick'),200,TRUE),c('Thin','Thick'))
Opening <- factor(sample(c('S','M','L'),  200,TRUE),c('S','M','L'))

tab <- table(Opening, Solder)
tab
reShape(tab)
# attach(tab)  # do further processing


#if(!.R.)
#{
#  g <- crosstabs( ~ Solder + Opening, data = solder, subset = skips > 10)
#  rowpct <- 100*attr(g,'marginals')$"N/RowTotal"  # compute row pcts
#  rowpct
#
#
#  r <- reShape(rowpct)
#  # note names "Solder" and "Opening" came originally from formula
#  # given to crosstabs
#  r
#  dotplot(Solder ~ rowpct, groups=Opening, panel=panel.superpose, data=r)
#}


# An example where a matrix is created from irregular vectors
follow <- data.frame(id=c('a','a','b','b','b','d'),
                     month=c(1, 2, 1, 2, 3, 2),
                     cholesterol=c(225,226, 320,319,318, 270))
follow
# with() evaluates the call inside the data frame, so nothing needs
# to be attached to (or detached from) the search path
with(follow, reShape(cholesterol, id=id, colvar=month))
# Could have done :
# reShape(cholesterol, triglyceride=trig, id=id, colvar=month)

# Create a data frame, reshaping a long dataset in which groups are
# formed not just by subject id but by combinations of subject id and
# visit number.  Also carry forward a variable that is supposed to be
# constant within subject-visit number combinations.  In this example,
# it is not constant, so the value from an arbitrary observation in
# each group will be selected.
w <- data.frame(id=c('a','a','a','a','b','b','b','d','d','d'),
             visit=c(  1,  1,  2,  2,  1,  1,  2,  2,  2,  2),
                 k=c('A','A','B','B','C','C','D','E','F','G'),
               var=c('x','y','x','y','x','y','y','x','y','z'),
               val=1:10)
with(w,
     reShape(val, id=data.frame(id,visit),
             constant=data.frame(k), colvar=var))

# Get predictions from a regression model for 2 systematically
# varying predictors.
# Convert the predictions into a matrix, with
# rows corresponding to the predictor having the most values, and
# columns corresponding to the other predictor
# d <- expand.grid(x2=0:1, x1=1:100)
# pred <- predict(fit, d)
# reShape(pred, id=d$x1, colvar=d$x2)  # makes 100 x 2 matrix


# Reshape a wide data frame containing multiple variables representing
# repeated measurements (3 repeats on 2 variables; 4 subjects)
set.seed(33)
n <- 4
w <- data.frame(age=rnorm(n, 40, 10),
                sex=sample(c('female','male'), n,TRUE),
                sbp1=rnorm(n, 120, 15),
                sbp2=rnorm(n, 120, 15),
                sbp3=rnorm(n, 120, 15),
                dbp1=rnorm(n,  80, 15),
                dbp2=rnorm(n,  80, 15),
                dbp3=rnorm(n,  80, 15), row.names=letters[1:n])
options(digits=3)
w


u <- reShape(w, base=c('sbp','dbp'), reps=3)
u
reShape(w, base=c('sbp','dbp'), reps=3, timevar='week', times=c(0,3,12))
}
\keyword{manip}
\keyword{array}
\concept{trellis}
\concept{lattice}
\concept{repeated measures}
\concept{longitudinal data}
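\section{Illustrative sketch}{
  The matrix and vector cases covered in the description can be
  approximated in base R.  The code below is a minimal sketch under stated
  assumptions, not the implementation used by \code{reShape}: the object
  names \code{m}, \code{long}, and \code{wide} are hypothetical, and the
  conversion of \code{dimnames} to numeric values and the handling of
  additional vectors in \code{\dots} are left out.
  \preformatted{
# Hypothetical 2 x 3 matrix whose dimnames carry the names id and month
m <- matrix(1:6, nrow = 2,
            dimnames = list(id = c("a", "b"), month = c("1", "2", "3")))

# Matrix case: string out the values (column-major order) together
# with the row and column label of each cell
long <- list(id    = rownames(m)[as.vector(row(m))],
             month = colnames(m)[as.vector(col(m))],
             m     = as.vector(m))

# Vector case: rebuild a matrix from the strung-out values; any
# id/month combination absent from the data would remain NA
wide <- matrix(NA_integer_, nrow = nrow(m), ncol = ncol(m),
               dimnames = dimnames(m))
wide[cbind(long$id, long$month)] <- long$m
  }
  The length of each element of \code{long} is the product of the number
  of rows and columns of \code{m}, matching the observation count stated
  in the description.
}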