| Name | Date | Size | #Lines | LOC |
|---|---|---|---|---|
| .. | 03-May-2022 | - | | |
| R/ | 06-Feb-2021 | - | 7,678 | 5,953 |
| build/ | 03-May-2022 | - | | |
| inst/ | 23-Sep-2021 | - | 4,780 | 2,997 |
| man/ | 23-Sep-2021 | - | 5,180 | 4,422 |
| po/ | 03-Nov-2020 | - | 13,646 | 10,793 |
| src/ | 23-Sep-2021 | - | 21,485 | 17,796 |
| tests/ | 03-Nov-2020 | - | 333 | 279 |
| vignettes/ | 03-May-2022 | - | 256,767 | 255,569 |
| DESCRIPTION | 27-Sep-2021 | 5.2 KiB | 142 | 141 |
| LICENSE | 22-May-2020 | 16.3 KiB | 374 | 293 |
| MD5 | 27-Sep-2021 | 16.4 KiB | 298 | 297 |
| NAMESPACE | 03-Nov-2020 | 5.9 KiB | 189 | 178 |
| NEWS.md | 23-Sep-2021 | 183.6 KiB | 1,472 | 902 |
| README.md | 23-Sep-2021 | 6.2 KiB | 97 | 69 |
| cleanup | 23-Sep-2021 | 29 | 3 | 1 |
| configure | 23-Sep-2021 | 5.3 KiB | 125 | 87 |

README.md

# data.table <a href="https://r-datatable.com"><img src="https://raw.githubusercontent.com/Rdatatable/data.table/master/.graphics/logo.png" align="right" height="140" /></a>

<!-- badges: start -->
[![CRAN status](https://cranchecks.info/badges/flavor/release/data.table)](https://cran.r-project.org/web/checks/check_results_data.table.html)
[![Travis build status](https://travis-ci.org/Rdatatable/data.table.svg?branch=master)](https://travis-ci.org/Rdatatable/data.table)
[![AppVeyor build status](https://ci.appveyor.com/api/projects/status/kayjdh5qtgymhoxr/branch/master?svg=true)](https://ci.appveyor.com/project/Rdatatable/data-table)
[![Codecov test coverage](https://codecov.io/github/Rdatatable/data.table/coverage.svg?branch=master)](https://codecov.io/github/Rdatatable/data.table?branch=master)
[![GitLab CI build status](https://gitlab.com/Rdatatable/data.table/badges/master/pipeline.svg)](https://gitlab.com/Rdatatable/data.table/-/pipelines)
[![downloads](https://cranlogs.r-pkg.org/badges/data.table)](https://www.rdocumentation.org/trends)
[![CRAN usage](https://jangorecki.gitlab.io/rdeps/data.table/CRAN_usage.svg?sanitize=true)](https://gitlab.com/jangorecki/rdeps)
[![BioC usage](https://jangorecki.gitlab.io/rdeps/data.table/BioC_usage.svg?sanitize=true)](https://gitlab.com/jangorecki/rdeps)
[![indirect usage](https://jangorecki.gitlab.io/rdeps/data.table/indirect_usage.svg?sanitize=true)](https://gitlab.com/jangorecki/rdeps)
<!-- badges: end -->

`data.table` provides a high-performance version of [base R](https://www.r-project.org/about.html)'s `data.frame` with syntax and feature enhancements for ease of use, convenience and programming speed.

## Why `data.table`?

* concise syntax: fast to type, fast to read
* fast
* memory efficient
* careful API lifecycle management
* community
* feature rich

## Features

* fast and friendly delimited **file reader**: **[`?fread`](https://rdatatable.gitlab.io/data.table/reference/fread.html)**, see also [convenience features for _small_ data](https://github.com/Rdatatable/data.table/wiki/Convenience-features-of-fread)
* fast and feature rich delimited **file writer**: **[`?fwrite`](https://rdatatable.gitlab.io/data.table/reference/fwrite.html)**
* low-level **parallelism**: many common operations are internally parallelized to use multiple CPU threads
* fast and scalable aggregations; e.g. 100GB in RAM (see [benchmarks](https://h2oai.github.io/db-benchmark/) on up to **two billion rows**)
* fast and feature rich joins: **ordered joins** (e.g. rolling forwards, backwards, nearest and limited staleness), **[overlapping range joins](https://github.com/Rdatatable/data.table/wiki/talks/EARL2014_OverlapRangeJoin_Arun.pdf)** (similar to `IRanges::findOverlaps`), **[non-equi joins](https://github.com/Rdatatable/data.table/wiki/talks/ArunSrinivasanUseR2016.pdf)** (i.e. joins using operators `>, >=, <, <=`), **aggregate on join** (`by=.EACHI`), **update on join**
* fast add/update/delete of columns **by reference**, by group, using no copies at all
* fast and feature rich **reshaping** of data: **[`?dcast`](https://rdatatable.gitlab.io/data.table/reference/dcast.data.table.html)** (_pivot/wider/spread_) and **[`?melt`](https://rdatatable.gitlab.io/data.table/reference/melt.data.table.html)** (_unpivot/longer/gather_)
* **any R function from any R package** can be used in queries, not just the subset of functions made available by a database backend; columns of type `list` are also supported
* has **[no dependencies](https://en.wikipedia.org/wiki/Dependency_hell)** at all other than base R itself, for simpler production/maintenance
* the R dependency is **as old as possible for as long as possible**, dated April 2014, and we continuously test against that version; e.g. v1.11.0 released on 5 May 2018 bumped the dependency up from 5-year-old R 3.0.0 to 4-year-old R 3.1.0

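A minimal sketch of several of the features listed above, assuming `data.table` is installed. The `orders` and `customers` tables and their columns are hypothetical toy data invented for illustration; they are not part of the package.

```r
library(data.table)

# Hypothetical toy data, invented for this illustration only
orders = data.table(
  id       = 1:6,
  customer = c("a", "b", "a", "c", "b", "a"),
  amount   = c(10, 20, 30, 40, 50, 60)
)
customers = data.table(
  customer = c("a", "b", "c"),
  region   = c("EU", "US", "EU")
)

# file writer / file reader: round trip through a temporary CSV file
tmp = tempfile(fileext = ".csv")
fwrite(orders, tmp)
orders2 = fread(tmp)
unlink(tmp)

# add a column by reference, computed per group
orders2[, customer_total := sum(amount), by = customer]

# update on join: copy `region` from `customers` into `orders2` by reference
orders2[customers, on = "customer", region := i.region]

# aggregate on join: one result row per row of `customers`
totals = orders2[customers, on = "customer", .(total = sum(amount)), by = .EACHI]

# reshaping: long to wide with dcast, and back to long with melt
wide = dcast(orders2, customer ~ region, value.var = "amount", fun.aggregate = sum)
long = melt(wide, id.vars = "customer", variable.name = "region", value.name = "amount")
```

Note that `:=` modifies `orders2` in place, so no assignment with `=` or `<-` is needed on those lines.
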
## Installation

```r
install.packages("data.table")

# latest development version:
data.table::update.dev.pkg()
```

See [the Installation wiki](https://github.com/Rdatatable/data.table/wiki/Installation) for more details.

## Usage

Use the `data.table` subset operator `[` the same way you would use the `data.frame` one, but...

* no need to prefix each column with `DT$` (like `subset()` and `with()` but built-in)
* any R expression using any package is allowed in the `j` argument, not just a list of columns
* an extra argument `by` computes the `j` expression by group

```r
library(data.table)
DT = as.data.table(iris)

# FROM[WHERE, SELECT, GROUP BY]
# DT  [i,     j,      by]

DT[Petal.Width > 1.0, mean(Petal.Length), by = Species]
#      Species       V1
#1: versicolor 4.362791
#2:  virginica 5.552000
```

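The three points above can be sketched a little further on the same `iris` data. This is an illustrative sketch rather than an excerpt from the package documentation; the result names (`big`, `summary_by_species`, `cor_by_species`, `by_flag`) are made up here.

```r
library(data.table)
DT = as.data.table(iris)

# i: filter rows, referring to columns directly (no DT$ prefix)
big = DT[Sepal.Length > 7]

# j: several named summaries at once; .N is the number of rows in each group
summary_by_species = DT[, .(n = .N,
                            mean_len = mean(Petal.Length),
                            max_wid  = max(Petal.Width)),
                        by = Species]

# j can be any R expression, here a function from the `stats` package
cor_by_species = DT[, .(cor_len_wid = stats::cor(Petal.Length, Petal.Width)),
                    by = Species]

# by can also be an expression, evaluated to form the groups
by_flag = DT[, .N, by = .(wide = Petal.Width > 1.0)]
```

Wrapping `j` in `.()` (an alias for `list()`) returns a `data.table` with one named column per summary, instead of the default `V1`.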
### Getting started

* [Introduction to data.table](https://cran.r-project.org/package=data.table/vignettes/datatable-intro.html) vignette
* [Getting started](https://github.com/Rdatatable/data.table/wiki/Getting-started) wiki page
* [Examples](https://rdatatable.gitlab.io/data.table/reference/data.table.html#examples) produced by `example(data.table)`

### Cheatsheets

<a href="https://raw.githubusercontent.com/rstudio/cheatsheets/master/datatable.pdf"><img src="https://raw.githubusercontent.com/rstudio/cheatsheets/master/pngs/datatable.png" width="615" height="242"/></a>

## Community

`data.table` is widely used by the R community. It is used directly by hundreds of CRAN and Bioconductor packages, and indirectly by thousands. It is one of the [most starred](https://www.r-pkg.org/starred) R packages on GitHub, and was highly rated by the [Depsy project](http://depsy.org/package/r/data.table). If you need help, the `data.table` community is active on [StackOverflow](https://stackoverflow.com/questions/tagged/data.table).

### Stay up-to-date

- click the **Watch** button at the top right of the GitHub project page
- read the [NEWS file](https://github.com/Rdatatable/data.table/blob/master/NEWS.md)
- follow [#rdatatable](https://twitter.com/hashtag/rdatatable) on Twitter
- watch recent [Presentations](https://github.com/Rdatatable/data.table/wiki/Presentations)
- read recent [Articles](https://github.com/Rdatatable/data.table/wiki/Articles)

### Contributing

Guidelines for filing issues / pull requests: [Contribution Guidelines](https://github.com/Rdatatable/data.table/wiki/Contributing).