| Name | Date | Size | #Lines | LOC |
| --- | --- | --- | --- | --- |
| .. | 03-May-2022 | - | | |
| .circleci/ | 05-Jun-2021 | - | 317 | 279 |
| .github/ | 05-Jun-2021 | - | 1,163 | 1,022 |
| benchmark/ | 03-May-2022 | - | 6,368 | 5,057 |
| cmake/ | 05-Jun-2021 | - | 376 | 330 |
| dependencies/ | 03-May-2022 | - | 8,188 | 5,706 |
| doc/ | 03-May-2022 | - | 2,899 | 2,266 |
| examples/quickstart/ | 03-May-2022 | - | 160 | 136 |
| extra/ | 05-Jun-2021 | - | 20 | 16 |
| fuzz/ | 03-May-2022 | - | 1,484 | 1,020 |
| images/ | 03-May-2022 | - | | |
| include/ | 03-May-2022 | - | 18,975 | 10,895 |
| jsonchecker/ | 03-May-2022 | - | 73,751 | 73,051 |
| jsonexamples/ | 03-May-2022 | - | 269,278 | 269,204 |
| scripts/ | 05-Jun-2021 | - | 531 | 389 |
| singleheader/ | 03-May-2022 | - | 36,117 | 20,957 |
| src/ | 03-May-2022 | - | 5,649 | 3,813 |
| style/ | 05-Jun-2021 | - | 379 | 290 |
| tests/ | 03-May-2022 | - | 11,316 | 9,887 |
| tools/ | 03-May-2022 | - | 700 | 600 |
| windows/ | 03-May-2022 | - | 1,747 | 1,004 |
| .appveyor.yml | 05-Jun-2021 | 1.7 KiB | 57 | 48 |
| .cirrus.yml | 05-Jun-2021 | 612 | 27 | 25 |
| .clang-format | 05-Jun-2021 | 19 | 2 | 1 |
| .dockerignore | 05-Jun-2021 | 143 | 16 | 15 |
| .drone.yml | 05-Jun-2021 | 14.3 KiB | 463 | 462 |
| .travis.yml | 05-Jun-2021 | 4.7 KiB | 189 | 169 |
| AUTHORS | 05-Jun-2021 | 106 | 5 | 4 |
| CONTRIBUTING.md | 05-Jun-2021 | 7.5 KiB | 103 | 74 |
| CONTRIBUTORS | 05-Jun-2021 | 664 | 41 | 40 |
| Dockerfile | 05-Jun-2021 | 4.1 KiB | 89 | 86 |
| Doxyfile | 05-Jun-2021 | 109.9 KiB | 2,579 | 2,007 |
| HACKING.md | 05-Jun-2021 | 16.2 KiB | 294 | 215 |
| LICENSE | 05-Jun-2021 | 11.1 KiB | 202 | 169 |
| README.md | 05-Jun-2021 | 9.6 KiB | 186 | 136 |
| RELEASES.md | 05-Jun-2021 | 4.8 KiB | 70 | 57 |

README.md

[![Fuzzing Status](https://oss-fuzz-build-logs.storage.googleapis.com/badges/simdjson.svg)](https://bugs.chromium.org/p/oss-fuzz/issues/list?sort=-opened&q=proj%3Asimdjson&can=2)
![Ubuntu 18.04 CI](https://github.com/simdjson/simdjson/workflows/Ubuntu%2018.04%20CI%20(GCC%207)/badge.svg)
[![Ubuntu 20.04 CI](https://github.com/simdjson/simdjson/workflows/Ubuntu%2020.04%20CI%20(GCC%209)/badge.svg)](https://simdjson.org/plots.html)
![VS16-CI](https://github.com/simdjson/simdjson/workflows/VS16-CI/badge.svg)
![MinGW64-CI](https://github.com/simdjson/simdjson/workflows/MinGW64-CI/badge.svg)
[![][license img]][license] [![Doxygen Documentation](https://img.shields.io/badge/docs-doxygen-green.svg)](https://simdjson.org/api/0.9.0/index.html)

simdjson : Parsing gigabytes of JSON per second
===============================================

<img src="images/logo.png" width="10%" style="float: right">
JSON is everywhere on the Internet. Servers spend a *lot* of time parsing it. We need a fresh
approach. The simdjson library uses commonly available SIMD instructions and microparallel algorithms
to parse JSON 4x faster than RapidJSON and 25x faster than JSON for Modern C++.

* **Fast:** Over 4x faster than commonly used production-grade JSON parsers.
* **Record Breaking Features:** Minify JSON at 6 GB/s, validate UTF-8 at 13 GB/s, parse NDJSON at 3.5 GB/s.
* **Easy:** First-class, easy-to-use, and carefully documented APIs.
* **Strict:** Full JSON and UTF-8 validation, lossless parsing. Performance with no compromises.
* **Automatic:** Selects a CPU-tailored parser at runtime. No configuration needed.
* **Reliable:** From memory allocation to error handling, simdjson's design avoids surprises.
* **Peer Reviewed:** Our research appears in venues like VLDB Journal and Software: Practice and Experience.

This library is part of the [Awesome Modern C++](https://awesomecpp.com) list.

Table of Contents
-----------------

* [Quick Start](#quick-start)
  * [On Demand](#on-demand)
* [Documentation](#documentation)
* [Performance results](#performance-results)
* [Real-world usage](#real-world-usage)
* [Bindings and Ports of simdjson](#bindings-and-ports-of-simdjson)
* [About simdjson](#about-simdjson)
* [Funding](#funding)
* [Contributing to simdjson](#contributing-to-simdjson)
* [License](#license)

Quick Start
-----------

The simdjson library is easily consumable with a single .h and .cpp file.

0. Prerequisites: `g++` (version 7 or better) or `clang++` (version 6 or better), and a 64-bit
   system with a command-line shell (e.g., Linux, macOS, FreeBSD). We also support programming
   environments like Visual Studio and Xcode, but different steps are needed.
1. Pull [simdjson.h](singleheader/simdjson.h) and [simdjson.cpp](singleheader/simdjson.cpp) into a
   directory, along with the sample file [twitter.json](jsonexamples/twitter.json).

   ```
   wget https://raw.githubusercontent.com/simdjson/simdjson/master/singleheader/simdjson.h https://raw.githubusercontent.com/simdjson/simdjson/master/singleheader/simdjson.cpp https://raw.githubusercontent.com/simdjson/simdjson/master/jsonexamples/twitter.json
   ```
2. Create `quickstart.cpp`:

   ```c++
   #include <iostream>
   #include "simdjson.h"
   using namespace simdjson;
   int main(void) {
       ondemand::parser parser;
       // Load the file into a padded buffer, as the parser requires.
       padded_string json = padded_string::load("twitter.json");
       // Start iterating over the document; values are parsed as they are accessed.
       ondemand::document tweets = parser.iterate(json);
       std::cout << uint64_t(tweets["search_metadata"]["count"]) << " results." << std::endl;
   }
   ```
3. `c++ -o quickstart quickstart.cpp simdjson.cpp`
4. `./quickstart`
   ```
   100 results.
   ```

Documentation
-------------

Usage documentation is available:

* [Basics](doc/basics.md) is an overview of how to use simdjson and its APIs.
* [Performance](doc/performance.md) shows some more advanced scenarios and how to tune for them.
* [Implementation Selection](doc/implementation-selection.md) describes runtime CPU detection and
  how you can work with it.
* [API](https://simdjson.org/api/0.9.0/annotated.html) contains the automatically generated API documentation.

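Runtime CPU detection of the kind mentioned above can be implemented with a function-pointer dispatch table. The sketch below illustrates that general pattern only; it is not simdjson's code. Every name in it (`count_fn`, `implementation`, `chosen_implementation`, and both kernels) is hypothetical, and the "avx2" kernel is a scalar stand-in so that the example builds on any machine. It assumes GCC or Clang on x86 for the `__builtin_cpu_supports` check and uses the portable kernel everywhere else. The CPU is queried once, on the first call, and later calls reuse the cached choice.

```c++
#include <cstddef>

// All names in this sketch are hypothetical; none belong to the simdjson API.
using count_fn = std::size_t (*)(const char *data, std::size_t len);

// Portable kernel: count '{' bytes one at a time.
static std::size_t count_braces_fallback(const char *data, std::size_t len) {
  std::size_t n = 0;
  for (std::size_t i = 0; i < len; i++) { n += (data[i] == '{'); }
  return n;
}

// Stand-in for a kernel built for a wider instruction set. A real library
// would use SIMD intrinsics here; this sketch reuses the scalar loop.
static std::size_t count_braces_avx2(const char *data, std::size_t len) {
  return count_braces_fallback(data, len);
}

struct implementation {
  const char *name;
  count_fn count_braces;
};

// Inspect the CPU and pick the best kernel available for it.
static implementation detect_implementation() {
#if defined(__GNUC__) && (defined(__x86_64__) || defined(__i386__))
  if (__builtin_cpu_supports("avx2")) { return {"avx2", count_braces_avx2}; }
#endif
  return {"fallback", count_braces_fallback};
}

// Detection runs once; every later call returns the cached choice.
const implementation &chosen_implementation() {
  static const implementation chosen = detect_implementation();
  return chosen;
}
```
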
Performance results
-------------------

The simdjson library uses three-quarters fewer instructions than the state-of-the-art parser [RapidJSON](https://rapidjson.org). To our knowledge, simdjson is the first fully-validating JSON parser
to run at [gigabytes per second](https://en.wikipedia.org/wiki/Gigabyte) (GB/s) on commodity processors. It can parse millions of JSON documents per second on a single core.

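To make the "millions of documents per second" figure concrete, the following back-of-the-envelope helper converts a throughput into a document rate. The 1 GB/s input is an assumed round number chosen for illustration, not a measurement; the 300-byte document size is the tiny-file example mentioned in this README. Under those assumptions the rate works out to roughly 3.3 million documents per second. The function name is hypothetical.

```c++
// Documents per second = bytes per second / bytes per document.
// One gigabyte is taken as 10^9 bytes.
constexpr double documents_per_second(double gigabytes_per_second,
                                      double bytes_per_document) {
  return gigabytes_per_second * 1e9 / bytes_per_document;
}

// Assumed inputs: 1 GB/s sustained throughput and 300-byte documents.
static_assert(documents_per_second(1.0, 300.0) > 3.0e6,
              "1 GB/s over 300-byte documents exceeds 3 million documents/s");
```
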
The following figure shows parsing speed in GB/s for various files
on an Intel Skylake processor (3.4 GHz) using the GNU GCC 10 compiler (with the -O3 flag).
We compare against the best and fastest C++ libraries on benchmarks that load and process the data.
The simdjson library offers full Unicode ([UTF-8](https://en.wikipedia.org/wiki/UTF-8)) validation and exact
number parsing.

<img src="doc/rome.png" width="60%">

The simdjson library offers high speed whether it processes tiny files (e.g., 300 bytes)
or larger files (e.g., 3 MB). The following plot presents parsing
speed for [synthetic files over various sizes generated with a script](https://github.com/simdjson/simdjson_experiments_vldb2019/blob/master/experiments/growing/gen.py) on a 3.4 GHz Skylake processor (GNU GCC 9, -O3).

<img src="doc/growing.png" width="60%">

[All our experiments are reproducible](https://github.com/simdjson/simdjson_experiments_vldb2019).

For NDJSON files, we can exceed 3 GB/s with [our multithreaded parsing functions](https://github.com/simdjson/simdjson/blob/master/doc/parse_many.md).

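NDJSON stores one JSON document per line, so each document can be located without parsing its neighbors, which is what makes batch and multithreaded processing practical. The sketch below only illustrates that format property; it is not how simdjson's multithreaded functions work, and `split_ndjson` is a hypothetical helper name. It relies on the fact that a raw newline cannot appear inside a valid JSON string, skips blank lines, and drops a trailing carriage return.

```c++
#include <cstddef>
#include <string>
#include <vector>

// Split an NDJSON buffer into one string per document (hypothetical helper).
std::vector<std::string> split_ndjson(const std::string &buffer) {
  std::vector<std::string> documents;
  std::size_t start = 0;
  while (start < buffer.size()) {
    // A document ends at the next newline or at the end of the buffer.
    std::size_t end = buffer.find('\n', start);
    if (end == std::string::npos) { end = buffer.size(); }
    std::size_t len = end - start;
    // Tolerate Windows line endings by dropping a trailing '\r'.
    if (len > 0 && buffer[start + len - 1] == '\r') { len--; }
    // Skip blank lines instead of emitting empty documents.
    if (len > 0) { documents.push_back(buffer.substr(start, len)); }
    start = end + 1;
  }
  return documents;
}
```
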
Real-world usage
----------------

- [Microsoft FishStore](https://github.com/microsoft/FishStore)
- [Yandex ClickHouse](https://github.com/yandex/ClickHouse)
- [Clang Build Analyzer](https://github.com/aras-p/ClangBuildAnalyzer)
- [Shopify HeapProfiler](https://github.com/Shopify/heap-profiler)

If you are planning to use simdjson in a product, please work from one of our releases.

Bindings and Ports of simdjson
------------------------------

We distinguish between "bindings" (which just wrap the C++ code) and a port to another programming language (which reimplements everything).

- [ZippyJSON](https://github.com/michaeleisel/zippyjson): Swift bindings for the simdjson project.
- [libpy_simdjson](https://github.com/gerrymanoim/libpy_simdjson/): high-speed Python bindings for simdjson using [libpy](https://github.com/quantopian/libpy).
- [pysimdjson](https://github.com/TkTech/pysimdjson): Python bindings for the simdjson project.
- [cysimdjson](https://github.com/TeskaLabs/cysimdjson): high-speed Python bindings for the simdjson project.
- [simdjson-rs](https://github.com/simd-lite): Rust port.
- [simdjson-rust](https://github.com/SunDoge/simdjson-rust): Rust wrapper (bindings).
- [SimdJsonSharp](https://github.com/EgorBo/SimdJsonSharp): C# version for .NET Core (bindings and full port).
- [simdjson_nodejs](https://github.com/luizperes/simdjson_nodejs): Node.js bindings for the simdjson project.
- [simdjson_php](https://github.com/crazyxman/simdjson_php): PHP bindings for the simdjson project.
- [simdjson_ruby](https://github.com/saka1/simdjson_ruby): Ruby bindings for the simdjson project.
- [fast_jsonparser](https://github.com/anilmaurya/fast_jsonparser): Ruby bindings for the simdjson project.
- [simdjson-go](https://github.com/minio/simdjson-go): Go port using Golang assembly.
- [rcppsimdjson](https://github.com/eddelbuettel/rcppsimdjson): R bindings.
- [simdjson_erlang](https://github.com/ChomperT/simdjson_erlang): Erlang bindings.

About simdjson
--------------

The simdjson library takes advantage of modern microarchitectures, parallelizing with SIMD vector
instructions, reducing branch misprediction, and reducing data dependency to take advantage of each
CPU's multiple execution cores.

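As a rough illustration of what "parallelizing" and "reducing branch misprediction" can mean in practice, the sketch below counts occurrences of one byte by processing eight bytes per step with plain 64-bit integer arithmetic, with no per-byte branch. This is a swapped-in technique (SWAR, "SIMD within a register") chosen because it is portable; simdjson itself uses real SIMD vector instructions, and `count_byte` is a hypothetical name, not a library function.

```c++
#include <cstddef>
#include <cstdint>
#include <cstring>

// Count how many bytes in [data, data + len) equal `target` (hypothetical helper).
std::size_t count_byte(const char *data, std::size_t len, char target) {
  const std::uint64_t ones = 0x0101010101010101ULL;
  const std::uint64_t low7 = 0x7f7f7f7f7f7f7f7fULL;
  // Broadcast the target byte into all eight lanes of a 64-bit word.
  const std::uint64_t pattern = ones * static_cast<unsigned char>(target);
  std::size_t count = 0;
  std::size_t i = 0;
  for (; i + 8 <= len; i += 8) {
    std::uint64_t word;
    std::memcpy(&word, data + i, 8);   // unaligned-safe load of 8 bytes
    std::uint64_t x = word ^ pattern;  // matching bytes become 0x00
    // Exact zero-byte test: 0x80 in every lane where x is zero, 0x00 elsewhere.
    std::uint64_t hits = ~(((x & low7) + low7) | x | low7);
    // Sum the eight 0/1 lane flags into the top byte, then extract it.
    count += static_cast<std::size_t>(((hits >> 7) * ones) >> 56);
  }
  // Handle the last few bytes that do not fill a whole word.
  for (; i < len; i++) { count += (data[i] == target); }
  return count;
}
```
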
Some people [enjoy reading our paper](https://arxiv.org/abs/1902.08318). A description of the design
and implementation of simdjson is in our research article:

- Geoff Langdale, Daniel Lemire, [Parsing Gigabytes of JSON per Second](https://arxiv.org/abs/1902.08318), VLDB Journal 28 (6), 2019.

We have an in-depth paper focused on the UTF-8 validation:

- John Keiser, Daniel Lemire, [Validating UTF-8 In Less Than One Instruction Per Byte](https://arxiv.org/abs/2010.03090), Software: Practice & Experience (to appear)

We also have an informal [blog post providing some background and context](https://branchfree.org/2019/02/25/paper-parsing-gigabytes-of-json-per-second/).

For the video inclined, <br />
[![simdjson at QCon San Francisco 2019](http://img.youtube.com/vi/wlvKAT7SZIQ/0.jpg)](http://www.youtube.com/watch?v=wlvKAT7SZIQ)<br />
(it was the best voted talk, we're kinda proud of it).

Funding
-------

The work is supported by the Natural Sciences and Engineering Research Council of Canada under grant
number RGPIN-2017-03910.

[license]: LICENSE
[license img]: https://img.shields.io/badge/License-Apache%202-blue.svg

Contributing to simdjson
------------------------

Head over to [CONTRIBUTING.md](CONTRIBUTING.md) for information on contributing to simdjson, and
[HACKING.md](HACKING.md) for information on source, building, and architecture/design.

License
-------

This code is made available under the [Apache License 2.0](https://www.apache.org/licenses/LICENSE-2.0.html).

Under Windows, we build some tools using the windows/dirent_portable.h file (which is outside our library code): it is under the liberal (business-friendly) MIT license.

For compilers that do not support [C++17](https://en.wikipedia.org/wiki/C%2B%2B17), we bundle the string-view library, which is published under the Boost license (http://www.boost.org/LICENSE_1_0.txt). Like the Apache license, the Boost license is a permissive license allowing commercial redistribution.