Programs and scripts for automated testing of Zstandard
========================================================

This directory contains the following programs and scripts:
- `datagen` : Synthetic, parameterizable data generator for tests
- `fullbench` : Precisely measures the speed of each zstd inner function
- `fuzzer` : Test tool to check zstd integrity on the target platform
- `paramgrill` : Parameter tester for zstd
- `test-zstd-speed.py` : Script for testing zstd speed differences between commits
- `test-zstd-versions.py` : Compatibility test between zstd versions stored on GitHub (v0.1+)
- `zbufftest` : Test tool to check the integrity of ZBUFF (a buffered streaming API)
- `zstreamtest` : Fuzzer test tool for the zstd streaming API
- `legacy` : Test tool to check decoding of legacy zstd frames
- `decodecorpus` : Tool to generate valid Zstandard frames, for verifying decoder implementations
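
The C tools above are built with the `Makefile` in this directory. As a minimal sketch, assuming the Makefile provides targets named after the individual tools:
```
# from the tests/ directory; target names are assumed to match the tool names
make datagen fuzzer zstreamtest decodecorpus
```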

#### `test-zstd-versions.py` - script for testing zstd interoperability between versions

This script creates a `versionsTest` directory, into which the zstd repository is cloned.
Then all tagged (released) versions of zstd are compiled.
In the following step, interoperability between these zstd versions is checked.
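
A typical invocation is simply the following (a sketch; it assumes Python 3, `git`, network access to GitHub, and a working build toolchain, and that the script is run from this directory):
```
python3 test-zstd-versions.py
```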

#### `automated_benchmarking.py` - script for benchmarking zstd PRs against dev

This script benchmarks the changes from pull requests made to zstd against
facebook:dev to detect regressions. It currently runs on a dedicated
desktop machine for every pull request made to the zstd repo, but it can also
be run on any machine via the command line interface.

There are three modes of usage for this script: `fastmode` runs a minimal single-build
comparison (between facebook:dev and facebook:release); `onetime` pulls all the current
pull requests from the zstd repo and compares facebook:dev to each of them once; `continuous`
continuously fetches pull requests from the zstd repo and runs benchmarks against facebook:dev.

```
Example usage: python automated_benchmarking.py
```

```
usage: automated_benchmarking.py [-h] [--directory DIRECTORY]
                                 [--levels LEVELS] [--iterations ITERATIONS]
                                 [--emails EMAILS] [--frequency FREQUENCY]
                                 [--mode MODE] [--dict DICT]

optional arguments:
  -h, --help            show this help message and exit
  --directory DIRECTORY
                        directory with files to benchmark
  --levels LEVELS       levels to test, e.g. ('1,2,3')
  --iterations ITERATIONS
                        number of benchmark iterations to run
  --emails EMAILS       email addresses of people who will be alerted upon
                        regression. Only for continuous mode
  --frequency FREQUENCY
                        specifies the number of seconds to wait before each
                        successive check for new PRs in continuous mode
  --mode MODE           'fastmode', 'onetime', 'current', or 'continuous' (see
                        README.md for details)
  --dict DICT           filename of dictionary to use (when set, this
                        dictionary will be used to compress the files provided
                        inside --directory)
```
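
For example, a one-off local comparison might look like the following (a sketch; the directory path and level list are placeholders, and the flags are those listed in the help above):
```
python automated_benchmarking.py --mode onetime --directory /path/to/test/files --levels 1,3,5 --iterations 3
```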

#### `test-zstd-speed.py` - script for testing zstd speed differences between commits

DEPRECATED

This script creates a `speedTest` directory, into which the zstd repository is cloned.
Then it compiles all branches of zstd and performs a speed benchmark for a given list of files (the `testFileNames` parameter).
After `sleepTime` seconds (an optional parameter, default 300 seconds) the script checks the repository for new commits.
If a new commit is found, it is compiled and a speed benchmark for this commit is performed.
The results of the speed benchmark are compared to the previous results.
If the compression or decompression speed for one of the zstd levels is lower than `lowerLimit` (an optional parameter, default 0.98), the speed benchmark is restarted.
If the second results are also lower than `lowerLimit`, a warning e-mail is sent to the recipients on the list (the `emails` parameter).

Additional remarks:
- To be sure that speed results are accurate, the script should be run on a "stable" target system with no other jobs running in parallel
- Using the script with virtual machines can lead to large variations in speed results
- The speed benchmark is not performed until the computer's load average is lower than `maxLoadAvg` (an optional parameter, default 0.75)
- The script sends e-mails using `mutt`; if `mutt` is not available, it sends e-mails without attachments using `mail`; if neither is available, it only prints a warning

Example usage with two test files, one e-mail address, and an additional message:
```
./test-zstd-speed.py "silesia.tar calgary.tar" "email@gmail.com" --message "tested on my laptop" --sleepTime 60
```

To run the script in the background, use:
```
nohup ./test-zstd-speed.py testFileNames emails &
```

The full list of parameters:
```
positional arguments:
  testFileNames         file names list for speed benchmark
  emails                list of e-mail addresses to send warnings

optional arguments:
  -h, --help            show this help message and exit
  --message MESSAGE     attach an additional message to e-mail
  --lowerLimit LOWERLIMIT
                        send email if speed is lower than given limit
  --maxLoadAvg MAXLOADAVG
                        maximum load average to start testing
  --lastCLevel LASTCLEVEL
                        last compression level for testing
  --sleepTime SLEEPTIME
                        frequency of repository checking in seconds
```
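
As a sketch of how the optional parameters combine (the file name, e-mail address, and limit values below are placeholders):
```
nohup ./test-zstd-speed.py "silesia.tar" "dev@example.com" --lowerLimit 0.97 --maxLoadAvg 0.5 --lastCLevel 19 --sleepTime 600 &
```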

#### `decodecorpus` - tool to generate Zstandard frames for decoder testing
Command line tool to generate test .zst files.

This tool will generate .zst files with checksums,
as well as optionally output the corresponding correct uncompressed data for
extra verification.

Example:
```
./decodecorpus -ptestfiles -otestfiles -n10000 -s5
```
will generate 10,000 sample .zst files using a seed of 5 in the `testfiles` directory,
with the zstd checksum field set,
as well as the 10,000 original files for more detailed comparison of decompression results.

```
./decodecorpus -t -T1mn
```
will choose a random seed, and for 1 minute,
generate random test frames and ensure that the
zstd library correctly decompresses them in both simple and streaming modes.
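
The generated frames can also be checked with the regular `zstd` CLI; a quick sanity check, assuming `zstd` is available on the PATH and the samples were written to `testfiles/` as in the first example, might be:
```
zstd -t testfiles/*.zst
```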

#### `paramgrill` - tool for generating compression table parameters and optimizing parameters on a file given constraints

Full list of arguments:
```
 -T# : set level 1 speed objective
 -B# : cut input into blocks of size # (default : single block)
 -S : benchmarks a single run (example command: -Sl3w10h12)
    w# - windowLog
    h# - hashLog
    c# - chainLog
    s# - searchLog
    l# - minMatch
    t# - targetLength
    S# - strategy
    L# - level
 --zstd= : Single run, parameter selection syntax same as zstdcli with more parameters
            (Added forceAttachDictionary / fadt)
            When invoked with --optimize, this represents the sample to exceed.
 --optimize= : find parameters to maximize compression ratio given parameters
            Can use all --zstd= commands to constrain the type of solution found in addition to the following constraints
            cSpeed= : Minimum compression speed
            dSpeed= : Minimum decompression speed
            cMem= : Maximum compression memory
            lvl= : Searches for solutions which are strictly better than that compression lvl in ratio and cSpeed,
            stc= : When invoked with lvl=, represents percentage slack in ratio/cSpeed allowed for a solution to be considered (Default 100%)
                 : In normal operation, represents percentage slack in choosing viable starting strategy selection in choosing the default parameters
                   (Lower value will begin with stronger strategies) (Default 90%)
            speedRatio= (accepts decimals)
                : determines value of gains in speed vs gains in ratio
                  when determining overall winner (default 5 (1% ratio = 5% speed)).
            tries= : Maximum number of random restarts on a single strategy before switching (Default 5)
                     Higher values will make optimizer run longer, more chances to find better solution.
            memLog : Limits the log of the size of each memotable (1 per strategy). Will use hash tables when state space is larger than max size.
                     Setting memLog = 0 turns off memoization
 --display= : specify which parameters are included in the output
            can use all --zstd parameter names and 'cParams' as a shorthand for all parameters used in ZSTD_compressionParameters
            (Default: display all params available)
 -P# : generated sample compressibility (when no file is provided)
 -t# : Caps runtime of operation in seconds (default : 99999 seconds (about 27 hours))
 -v : Prints Benchmarking output
 -D : Next argument dictionary file
 -s : Benchmark all files separately
 -q : Quiet, repeat for more quiet
      -q Prints parameters + results whenever a new best is found
      -qq Only prints parameters whenever a new best is found, prints final parameters + results
      -qqq Only print final parameters + results
      -qqqq Only prints final parameter set in the form --zstd=
 -v : Verbose, cancels quiet, repeat for more volume
      -v Prints all candidate parameters and results

```
Any inputs afterwards are treated as files to benchmark.
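
As an illustration, a hypothetical optimization run over a single file might look like the following (the file name and constraint value are placeholders; the constraint syntax is described in the list above):
```
./paramgrill --optimize=cSpeed=50MB silesia.tar
```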