\name{melt.data.table}
\alias{melt.data.table}
\alias{melt}
\title{Fast melt for data.table}
\description{
\code{melt} is \code{data.table}'s wide-to-long reshaping tool.
We provide an S3 method for melting \code{data.table}s. It is written in C for speed and memory
efficiency. Since \code{v1.9.6}, \code{melt.data.table} allows melting into
multiple columns simultaneously.
}
\usage{
## fast melt a data.table
\method{melt}{data.table}(data, id.vars, measure.vars,
    variable.name = "variable", value.name = "value",
    \dots, na.rm = FALSE, variable.factor = TRUE,
    value.factor = FALSE,
    verbose = getOption("datatable.verbose"))
}
\arguments{
\item{data}{ A \code{data.table} object to melt.}
\item{id.vars}{vector of id variables. Can be integer (corresponding id
column numbers) or character (id column names) vector. If missing, all
non-measure columns will be assigned to it. If integer, must be positive; see Details. }
\item{measure.vars}{Measure variables for \code{melt}ing. Can be missing, vector, list, or pattern-based.

  \itemize{
    \item{ When missing, \code{measure.vars} will become all columns outside \code{id.vars}. }
    \item{ Vector can be \code{integer} (implying column numbers) or \code{character} (column names). }
    \item{ \code{list} is a generalization of the vector version -- each element of the list (which should be \code{integer} or \code{character} as above) will become a \code{melt}ed column. }
    \item{ Pattern-based column matching can be achieved with the regular expression-based \code{\link{patterns}} syntax; multiple patterns will produce multiple columns. }
  }

    For convenience/clarity in the case of multiple \code{melt}ed columns, resulting column names can be supplied as names to the elements of \code{measure.vars} (in the \code{list} and \code{patterns} usages). See also \code{Examples}. }
\item{variable.name}{name for the measured variable names column. The default name is \code{'variable'}.}
\item{value.name}{name for the molten data values column(s). The default name is \code{'value'}. Multiple names can be provided here for the case when \code{measure.vars} is a \code{list}, though note well that the names provided in \code{measure.vars} take precedence. }
\item{na.rm}{If \code{TRUE}, \code{NA} values will be removed from the molten
data.}
\item{variable.factor}{If \code{TRUE}, the \code{variable} column will be
converted to \code{factor}, else it will be a \code{character} column.}
\item{value.factor}{If \code{TRUE}, the \code{value} column will be converted
to \code{factor}, else the molten value type is left unchanged.}
\item{verbose}{\code{TRUE} turns on status and information messages to the
console. Turn this on by default using \code{options(datatable.verbose=TRUE)}.
The quantity and types of verbosity may be expanded in future.}
\item{\dots}{any other arguments to be passed to/from other methods.}
}
\details{
If \code{id.vars} and \code{measure.vars} are both missing, all
non-\code{numeric/integer/logical} columns are assigned as id variables and
the rest as measure variables. If only one of \code{id.vars} or
\code{measure.vars} is supplied, the rest of the columns will be assigned to
the other. Both \code{id.vars} and \code{measure.vars} can contain the same column
more than once, and the same column can be used as both an id and a measure variable.

\code{melt.data.table} also accepts \code{list} columns for both id and measure
variables.

When the \code{measure.vars} are not all of the same type, they'll be coerced
according to the hierarchy \code{list} > \code{character} > \code{numeric} >
\code{integer} > \code{logical}. For example, if any of the measure variables is a
\code{list}, then the entire value column will be coerced to a list. Note that
if the type of the \code{value} column is a list, \code{na.rm = TRUE} will have no
effect.

Since version \code{1.9.6}, \code{measure.vars} also accepts a list of
\code{character} or \code{integer} vectors, so that a single call to \code{melt}
can efficiently melt into multiple columns. The function
\code{\link{patterns}} can be used to provide regular expression patterns. When
used along with \code{melt}, if the \code{cols} argument is not provided, the
patterns will be matched against \code{names(data)}, for convenience.

Attributes are preserved if all \code{value} columns are of the same type. By
default, if any of the columns to be melted are of type \code{factor}, they'll
be coerced to \code{character} type. To get a \code{factor} column, set
\code{value.factor = TRUE}. \code{melt.data.table} also preserves
\code{ordered} factors.

Historical note: \code{melt.data.table} was originally designed as an enhancement to \code{reshape2::melt} in terms of computing and memory efficiency. \code{reshape2} has since been deprecated, and \code{melt} has had a generic defined within \code{data.table} since \code{v1.12.4} in 2019, at which point the dependency between the packages became more etymological than programmatic. We thank the \code{reshape2} authors for the inspiration.

}

\value{
An unkeyed \code{data.table} containing the molten data.
}

\examples{
set.seed(45)
require(data.table)
DT <- data.table(
      i_1 = c(1:5, NA),
      i_2 = c(NA,6,7,8,9,10),
      f_1 = factor(sample(c(letters[1:3], NA), 6, TRUE)),
      f_2 = factor(c("z", "a", "x", "c", "x", "x"), ordered=TRUE),
      c_1 = sample(c(letters[1:3], NA), 6, TRUE),
      d_1 = as.Date(c(1:3,NA,4:5), origin="2013-09-01"),
      d_2 = as.Date(6:1, origin="2012-01-01"))
# add a couple of list cols
DT[, l_1 := DT[, list(c=list(rep(i_1, sample(5,1)))), by = i_1]$c]
DT[, l_2 := DT[, list(c=list(rep(c_1, sample(5,1)))), by = i_1]$c]

# id, measure as character/integer/numeric vectors
melt(DT, id=1:2, measure="f_1")
melt(DT, id=c("i_1", "i_2"), measure=3) # same as above
melt(DT, id=1:2, measure=3L, value.factor=TRUE) # same, but 'value' is factor
melt(DT, id=1:2, measure=3:4, value.factor=TRUE) # 'value' is *ordered* factor

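# A minimal sketch of the defaulting rule described in Details, using a small
# hypothetical table 'wide' that is not part of the examples above: when only
# id.vars is supplied, every remaining column becomes a measure variable.
library(data.table)
wide <- data.table(name = c("a", "b"), x = 1:2, y = 3:4)
long <- melt(wide, id.vars = "name") # x and y are stacked into 'value'
long
# variable.factor = FALSE keeps the 'variable' column as character
long_chr <- melt(wide, id.vars = "name", variable.factor = FALSE)
class(long_chr$variable)
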
# preserves attribute when types are identical, ex: Date
melt(DT, id=3:4, measure=c("d_1", "d_2"))
melt(DT, id=3:4, measure=c("i_1", "d_1")) # attribute not preserved

# on list
melt(DT, id=1, measure=c("l_1", "l_2")) # value is a list
melt(DT, id=1, measure=c("c_1", "l_1")) # c_1 coerced to list

# on character
melt(DT, id=1, measure=c("c_1", "f_1")) # value is char
melt(DT, id=1, measure=c("c_1", "i_2")) # i_2 coerced to char

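# A small sketch of the coercion hierarchy from Details, on a hypothetical
# table 'mixed': integer and double measure columns give a double 'value'
# column, and a character measure column promotes 'value' to character.
# melt() may warn that the measure columns are not all of the same type.
library(data.table)
mixed <- data.table(id = 1:2, n_int = 1:2, n_dbl = c(1.5, 2.5), s = c("u", "v"))
num <- melt(mixed, id.vars = "id", measure.vars = c("n_int", "n_dbl"))
chr <- melt(mixed, id.vars = "id", measure.vars = c("n_int", "s"))
typeof(num$value) # storage type after integer/double coercion
typeof(chr$value) # storage type after integer/character coercion
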
# on na.rm=TRUE. NAs are removed efficiently, from within C
melt(DT, id=1, measure=c("c_1", "i_2"), na.rm=TRUE) # remove NA

# measure.vars can be also a list
# melt "f_1,f_2" and "d_1,d_2" simultaneously, retain 'factor' attribute
# convenient way using the helper function patterns()
melt(DT, id=1:2, measure=patterns("^f_", "^d_"), value.factor=TRUE)
# same as above, but provide list of columns directly by column names or indices
melt(DT, id=1:2, measure=list(3:4, c("d_1", "d_2")), value.factor=TRUE)
# same as above, but provide names directly:
melt(DT, id=1:2, measure=patterns(f="^f_", d="^d_"), value.factor=TRUE)

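# A sketch of naming several 'value' columns through value.name, using a
# hypothetical table 'pairs_dt'. measure.vars is an unnamed list here, so the
# output column names are taken from value.name.
library(data.table)
pairs_dt <- data.table(id = 1:2, a_1 = 1:2, a_2 = 3:4,
                       b_1 = c("p", "q"), b_2 = c("r", "s"))
both <- melt(pairs_dt, id.vars = "id",
             measure.vars = list(c("a_1", "a_2"), c("b_1", "b_2")),
             value.name = c("a", "b"))
names(both)
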
# na.rm=TRUE removes rows with NAs in any 'value' columns
melt(DT, id=1:2, measure=patterns("f_", "d_"), value.factor=TRUE, na.rm=TRUE)

# return 'NA' for missing columns, 'na.rm=TRUE' ignored due to list column
melt(DT, id=1:2, measure=patterns("l_", "c_"), na.rm=TRUE)

}
\seealso{
  \code{\link{dcast}}, \url{https://cran.r-project.org/package=reshape}
}
\keyword{ data }