\name{melt.data.table}
\alias{melt.data.table}
\alias{melt}
\title{Fast melt for data.table}
\description{
\code{melt} is \code{data.table}'s wide-to-long reshaping tool.
We provide an S3 method for melting \code{data.table}s. It is written in C for speed and memory
efficiency. Since \code{v1.9.6}, \code{melt.data.table} allows melting into
multiple columns simultaneously.
}
\usage{
## fast melt a data.table
\method{melt}{data.table}(data, id.vars, measure.vars,
    variable.name = "variable", value.name = "value",
    \dots, na.rm = FALSE, variable.factor = TRUE,
    value.factor = FALSE,
    verbose = getOption("datatable.verbose"))
}
\arguments{
\item{data}{ A \code{data.table} object to melt.}
\item{id.vars}{Vector of id variables. Can be an integer (id
column numbers) or character (id column names) vector. If missing, all
non-measure columns will be assigned to it. If integer, must be positive; see Details. }
\item{measure.vars}{Measure variables for \code{melt}ing. Can be missing, vector, list, or pattern-based.

 \itemize{
  \item{ When missing, \code{measure.vars} will become all columns outside \code{id.vars}. }
  \item{ Vector can be \code{integer} (implying column numbers) or \code{character} (column names). }
  \item{ \code{list} is a generalization of the vector version -- each element of the list (which should be \code{integer} or \code{character} as above) will become a \code{melt}ed column. }
  \item{ Pattern-based column matching can be achieved with the regular expression-based \code{\link{patterns}} syntax; multiple patterns will produce multiple columns. }
 }

 For convenience/clarity in the case of multiple \code{melt}ed columns, resulting column names can be supplied as names of the elements of \code{measure.vars} (in the \code{list} and \code{patterns} usages). See also \code{Examples}. }
\item{variable.name}{name for the measured variable names column.
The default name is \code{'variable'}.}
\item{value.name}{name for the molten data values column(s). The default name is \code{'value'}. Multiple names can be provided here when \code{measure.vars} is a \code{list}, though note that any names provided in \code{measure.vars} take precedence. }
\item{na.rm}{If \code{TRUE}, \code{NA} values will be removed from the molten
data.}
\item{variable.factor}{If \code{TRUE}, the \code{variable} column will be
converted to \code{factor}, else it will be a \code{character} column.}
\item{value.factor}{If \code{TRUE}, the \code{value} column will be converted
to \code{factor}, else the molten value type is left unchanged.}
\item{verbose}{\code{TRUE} turns on status and information messages to the
console. Turn this on by default using \code{options(datatable.verbose=TRUE)}.
The quantity and types of verbosity may be expanded in future.}
\item{\dots}{any other arguments to be passed to/from other methods.}
}
\details{
If \code{id.vars} and \code{measure.vars} are both missing, all
non-\code{numeric/integer/logical} columns are assigned as id variables and
the rest as measure variables. If only one of \code{id.vars} or
\code{measure.vars} is supplied, the rest of the columns will be assigned to
the other. Both \code{id.vars} and \code{measure.vars} can contain the same column
more than once, and the same column can be used as both an id and a measure variable.

\code{melt.data.table} also accepts \code{list} columns for both id and measure
variables.

When the \code{measure.vars} are not all of the same type, they'll be coerced
according to the hierarchy \code{list} > \code{character} > \code{numeric} >
\code{integer} > \code{logical}. For example, if any of the measure variables is a
\code{list}, then the entire value column will be coerced to a list. Note that
if the type of the \code{value} column is a list, \code{na.rm = TRUE} will have no
effect.
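
As a minimal sketch of the coercion rule above (the table \code{X} and its column names are made up purely for illustration): melting an \code{integer} and a \code{character} measure column together should give a \code{character} value column. The warning about mixed types is suppressed here only to keep the sketch quiet.

\preformatted{
library(data.table)
# hypothetical toy table: one id column, two measure columns of different types
X <- data.table(id = 1:2, n = c(10L, 20L), s = c("a", "b"))
# integer 'n' and character 's' are stacked into a single value column;
# melt is expected to warn that the measure variables differ in type
long <- suppressWarnings(melt(X, id.vars = "id", measure.vars = c("n", "s")))
class(long$value)   # "character", since character ranks above integer
nrow(long)          # 4: two measure columns times two rows
}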

Since version \code{1.9.6}, \code{measure.vars} also accepts a list of
\code{character} or \code{integer} vectors, which melts into multiple columns
efficiently in a single function call. The function
\code{\link{patterns}} can be used to provide regular expression patterns. When
used along with \code{melt}, if the \code{cols} argument is not provided, the
patterns will be matched against \code{names(data)}, for convenience.

Attributes are preserved if all \code{value} columns are of the same type. By
default, if any of the columns to be melted are of type \code{factor}, they'll
be coerced to \code{character} type. To get a \code{factor} column, set
\code{value.factor = TRUE}. \code{melt.data.table} also preserves
\code{ordered} factors.

Historical note: \code{melt.data.table} was originally designed as an enhancement to \code{reshape2::melt} in terms of computing and memory efficiency. \code{reshape2} has since been deprecated, and \code{melt} has had a generic defined within \code{data.table} since \code{v1.9.6} in 2015, at which point the dependency between the packages became more etymological than programmatic. We thank the \code{reshape2} authors for the inspiration.

}

\value{
An unkeyed \code{data.table} containing the molten data.
}

\examples{
set.seed(45)
require(data.table)
DT <- data.table(
  i_1 = c(1:5, NA),
  i_2 = c(NA,6,7,8,9,10),
  f_1 = factor(sample(c(letters[1:3], NA), 6, TRUE)),
  f_2 = factor(c("z", "a", "x", "c", "x", "x"), ordered=TRUE),
  c_1 = sample(c(letters[1:3], NA), 6, TRUE),
  d_1 = as.Date(c(1:3,NA,4:5), origin="2013-09-01"),
  d_2 = as.Date(6:1, origin="2012-01-01"))
# add a couple of list cols
DT[, l_1 := DT[, list(c=list(rep(i_1, sample(5,1)))), by = i_1]$c]
DT[, l_2 := DT[, list(c=list(rep(c_1, sample(5,1)))), by = i_1]$c]

# id, measure as character/integer/numeric vectors
melt(DT, id=1:2, measure="f_1")
melt(DT, id=c("i_1", "i_2"), measure=3) # same as above
melt(DT, id=1:2, measure=3L, value.factor=TRUE) # same, but 'value' is factor
melt(DT, id=1:2, measure=3:4, value.factor=TRUE) # 'value' is *ordered* factor

# preserves attribute when types are identical, e.g. Date
melt(DT, id=3:4, measure=c("d_1", "d_2"))
melt(DT, id=3:4, measure=c("i_1", "d_1")) # attribute not preserved

# on list
melt(DT, id=1, measure=c("l_1", "l_2")) # value is a list
melt(DT, id=1, measure=c("c_1", "l_1")) # c_1 coerced to list

# on character
melt(DT, id=1, measure=c("c_1", "f_1")) # value is char
melt(DT, id=1, measure=c("c_1", "i_2")) # i_2 coerced to char

# on na.rm=TRUE.
# NAs are removed efficiently, from within C
melt(DT, id=1, measure=c("c_1", "i_2"), na.rm=TRUE) # remove NA

# measure.vars can also be a list
# melt "f_1,f_2" and "d_1,d_2" simultaneously, retain 'factor' attribute
# convenient way using internal function patterns()
melt(DT, id=1:2, measure=patterns("^f_", "^d_"), value.factor=TRUE)
# same as above, but provide list of columns directly by column names or indices
melt(DT, id=1:2, measure=list(3:4, c("d_1", "d_2")), value.factor=TRUE)
# same as above, but provide names directly:
melt(DT, id=1:2, measure=patterns(f="^f_", d="^d_"), value.factor=TRUE)

# na.rm=TRUE removes rows with NAs in any 'value' columns
melt(DT, id=1:2, measure=patterns("f_", "d_"), value.factor=TRUE, na.rm=TRUE)

# return 'NA' for missing columns, 'na.rm=TRUE' ignored due to list column
melt(DT, id=1:2, measure=patterns("l_", "c_"), na.rm=TRUE)

}
\seealso{
  \code{\link{dcast}}, \url{https://cran.r-project.org/package=reshape}
}
\keyword{ data }