1<center><H1>Dieharder: A Random Number Test Suite</H1></center>
2<center><H2>Version 3.31.1</H2></center>
3
4<center><H3>Robert G. Brown (rgb)</H3></center>
5<center><H3>Dirk Eddelbuettel</H3></center>
6<center><H3>David Bauer</H3></center>
7
8<p>Welcome to the dieharder distribution website.</p>
9
10<p>Version 3.29.4beta is the current snapshot.  Some of the documentation
11below may not quite be caught up to it, but it should be close.</p>
12
13<p>Dieharder is a <i>random number generator (rng) testing suite</i>.
14It is intended to test <i>generators</i>, not <i>files of possibly
15random numbers</i> as the latter is a fallacious view of what it means
16to be random.  Is the number 7 random?  If it is generated by a random
17process, it might be.  If it is made up to serve the purpose of some
18argument (like this one) it is not.  Perfect random number generators
19produce "unlikely" sequences of random numbers -- at exactly the right
20average rate.  Testing a rng is therefore quite subtle.</p>
21
<p>Dieharder is a tool designed to permit one to push a weak generator
to unambiguous failure (e.g. at the 0.0001% level), not leave one in the
24"limbo" of 1% or 5% maybe-failure.  It also contains many tests and is
25extensible so that eventually it will contain many more tests than it
26already does.</p>
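<p>To see why sample size matters here, consider a minimal sketch (not
dieharder code) of a simple bit-frequency test applied to a hypothetical
generator whose bits are 1 with probability 0.501.  The function name and
the bias value are assumptions made up for illustration.  It computes the
p-value one would expect at the mean z-score for a given number of bits:</p>

```python
import math


def expected_p_value(bias, n):
    """Two-sided p-value at the expected z-score for n bits with P(1) = 0.5 + bias."""
    # Under the null hypothesis the count of ones has mean n/2 and standard
    # deviation sqrt(n)/2, so a bias shifts the expected z-score to
    # 2 * bias * sqrt(n).
    z = 2.0 * abs(bias) * math.sqrt(n)
    return math.erfc(z / math.sqrt(2.0))


# The same small bias moves from "no evidence" through the 5% limbo to
# unambiguous failure as the number of bits tested grows.
for n in (10**4, 10**6, 10**8):
    print(n, expected_p_value(0.001, n))
```

<p>With ten thousand bits the flaw is invisible, with a million bits it
sits in the ambiguous few-percent range, and with a hundred million bits
the expected p-value is far below any reasonable threshold.</p>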
27
28<p>If you are using dieharder for testing rngs either in one of its
29prebuilt versions (rpm or apt) or built from source (which gives you the
30ability to e.g. add more tests or integrate your rng directly with
31dieharder for ease of use) you may want to join either or both of the
32<a
33href="https://lists.phy.duke.edu/mailman/listinfo/dieharder-announce">dieharder-announce</a>
34or the
35<a
36href="https://lists.phy.duke.edu/mailman/listinfo/dieharder-devel">dieharder-devel</a>
37mailing lists here.  The former should be very low traffic -- basically
38announcing when a snapshot makes it through development to where I'm
39proud of it.  The latter will be a bit more active, and is a good place
40to post bug reports, patches, suggestions, fixes, complaints and
41generally participate in the development process.</p>
42
43<h2>About Dieharder</h2>
44
45<p>At the suggestion of Linas Vepstas on the Gnu Scientific Library
46(GSL) list this GPL'd suite of random number tests will be named
47"Dieharder".  Using a movie sequel pun for the name is a double tribute
48to George Marsaglia, whose <a
49href="http://stat.fsu.edu/~geo/diehard.html">"Diehard battery of
50tests"</a> of random number generators has enjoyed years of enduring
51usefulness as a test suite.</p>
52
53<p>The dieharder suite is more than just the diehard tests cleaned up
54and given a pretty GPL'd source face in native C.  Tests from the <a
55href="http://csrc.nist.gov/rng/">Statistical Test Suite (STS)</a>
56developed by the National Institute for Standards and Technology (NIST)
57are being incorporated, as are new tests developed by rgb.  Where
58possible or appropriate, <i>all</i> tests that can be parameterized
59("cranked up") to where failure, at least, is unambiguous are so
60parameterized and controllable from the command line.</p>
61
62<p>A further design goal is to provide some indication of <i>why</i> a
63generator fails a test, where such information can be extracted during
64the test process and placed in usable form.  For example, the
65bit-distribution tests should (eventually) be able to display the actual
66histogram for the different bit ntuplets.</p>
67
68<p>Dieharder is by design extensible.  It is intended to be the "Swiss
69army knife of random number test suites", or if you prefer, "the last
70suite you'll ever ware" for testing random numbers.</p>
71
72<hr>
73
74<center><h2><a href="./dieharder">Dieharder Related Talks or Papers</a></h2></center>
75
76<ul>
77
78 <li> <a href="../dieharder_techexpo_2011.odp">TechExpo 2011 Talk
(Duke).</a> A short talk given at Duke's Tech Expo in 2011 as an
80overview of random number generator testing.  Good for beginners.
81
82 <li> <a
83href="http://www.cs.ucl.ac.uk/staff/d.jones/GoodPracticeRNG.pdf">Good
84Practice in (Pseudo) Random Number Generation for
85Bioinformatics Applications</a> by David Jones, UCL Bioinformatics Group
86(E-mail: d dot jones@cs dot ucl dot ac dot uk).  A really excellent
87"must read" guideline for anyone thinking of using random number
88generators in an actual application.  My own advice differs only in that
89I endorse using (well tested) Gnu Scientific Library random number
90generators as they are generally portable and open source, hence well
tested.  Several of Jones' implementations of Marsaglia's KISS-family rngs
92have been added to dieharder and will shortly be added to the GSL under
93the GPL for general use.
94
95</ul>
96
97
98<hr>
99
100<center><h2><a href="./dieharder">Dieharder Download
101Area</a></h2></center>
102
103<p>Dieharder can be freely downloaded from <a
104href="http://www.phy.duke.edu/~rgb/General/dieharder.php">the Dieharder
105download site</a>.  On this page there should be a long list of previous
106versions of dieharder, and it should tell you what is the current
107snapshot.  The version numbers have the following <i>specific
108meaning</i> which is a bit different than usual:</p>
109
110<ul>
111
112<li> First number (major).  Bumped only when major goals in the design
113roadmap are reached (for example, finishing all the diehard tests).
114Version 1.x.x, for example, means that ALL of diehard (and more) is now
115incorporated in the program.  Version 2.x.x means that the tests
116themselves have been split off into the libdieharder library, so that
117they can be linked into scripting languages such as R, new UIs, or user
118code.  3.x.x would be expected to indicate that the entire STS suite is
119incorporated, and so on.
120
121<li> Second number (first minor).  This number indicates the number of
122tests currently supported.  When it bumps, it means new tests have been
123added from e.g. STS, Knuth, Marsaglia and Tsang, rgb, or elsewhere.
124
125<li> Third number (second minor).  This number is bumped when
126significant features are added or altered.  Bug fixes bump this number,
127usually after a few bumps of the release number for testing snapshots.
128This number and the release are reset to 0 when the major is bumped or a
129new test is added to maintain the strictly increasing numerical value on
130which e.g. yum upgrades rely.
131
132</ul>
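<p>As a small illustration of the numbering scheme above, the following
sketch splits a version string into its three fields and compares
versions numerically.  The function name is hypothetical, and the handling
of a trailing tag such as "beta" (it is simply ignored) is an assumption,
not a description of how the packaging tools treat it:</p>

```python
import re


def parse_version(version):
    """Split 'major.tests.features[tag]' into a tuple of three integers."""
    match = re.match(r"(\d+)\.(\d+)\.(\d+)", version)
    if match is None:
        raise ValueError("not a dieharder-style version: " + repr(version))
    # Any trailing text (e.g. "beta") is ignored in this sketch.
    return tuple(int(part) for part in match.groups())


major, tests, features = parse_version("3.29.4beta")
print(major, tests, features)

# Tuples compare field by field, which gives the strictly increasing
# ordering that package upgrades rely on.
print(parse_version("2.27.11") < parse_version("3.29.4beta"))
```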
133
<p> The single-tree dieharder source files (.tgz and .src.rpm) can be
downloaded from this directory.  In addition, binary rpm's built on top
of Fedora Core whatever (for either or both of i386 and x86_64) may be
present. Be warned: the GSL is a build <i>requirement</i>.  The current
138packaging builds both the library and the dieharder UI from a single
139source rpm, or from running "make" in the toplevel directory of the
140source tarball.  With a bit of effort (making a private rpm building
141tree), "make rpm" should work for you as well in this toplevel
142directory.</p>
143
144<p> This project is under very active development.  Considerable effort
145is being expended so that the suite will "run out of the box" to produce
146a reasonably understandable report for any given random number generator
147it supports via the "-a" flag, in addition to the ability to
148considerably vary most specific tests as applied to the generator.  A
149brief synopsis of command options to get you started is presented below.
150In general, though, documentation (including this page, the man page,
151and built-in documentation) may lag the bleeding edge snapshot by a few
152days or more.</p>
153
154<p>An rpm installation note from Court Shrock:</p>
155<pre>
156I was reading about your work on dieharder.  First, some info
157about getting dieharder working in Gentoo:
158
159cd ~
160emerge rpm gsl
wget http://www.phy.duke.edu/~rgb/General/dieharder/dieharder-0.6.11-1.i386.rpm
163rpm -i --nodeps dieharder-0.6.11-1.i386.rpm
164</pre>
165
<p>Rebuilding from tarball source should always work as well, and if you
are planning to play a lot with the tool it may be a desirable way to
proceed, as there are some documentation goodies in the ./doc
subdirectory and the ./manual subdirectory of the source tarball (such
as the original diehard test descriptions and the STS white paper).</p>
171
172<p>George Marsaglia retired from FSU in 1996.  For a brief time diehard
173appeared to have finally disappeared from FSU webspace, but what had
174really happened is google's favorite path to it had disappeared when his
175personal home directory was removed.  Diehard is still there, at the URL
176<a
177href="http://www.stat.fsu.edu/pub/diehard">http://www.stat.fsu.edu/pub/diehard</a>
178as well as at a Hong Kong website.  The source code of diehard itself is
179(of course) Copyright George Marsaglia but Marsaglia did not incorporate
180an explicit <i>license</i> into his code which muddles the issue of how
181and when it can be distributed, freely or otherwise.  Existing diehard
182sources are <i>not directly incorporated into dieharder in source
183form</i> for that reason, to keep authorship and GPL licensing issues
184clear.</p>
185
186<p>Note that the same is not true about data.  Several of the diehard
187tests require that one use precomputed numbers as e.g. target mean,
188sigma for some test statistic.  Obviously in these cases we use the same
189numbers as diehard so we get the same, or comparable, results.  These
190numbers were all developed with support from Federal grants and have all
191been published in the literature, though, and should therefore be in the
192public domain as far as reuse in a program is concerned.</p>
193
194<p>Note also that most of the diehard tests are <i>modified</i> in
195dieharder, usually in a way that should improve them.  There are three
196improvements that were basically always made if possible.
197<ul>
 <li> The number of test sample p-values that contribute to the final
199Kolmogorov-Smirnov test for the uniformity of the distribution of
200p-values of the test statistic is a variable with default 100, which is
201<i>much</i> larger than most diehard default values.  This change alone
202causes many generators that are asserted to "pass diehard" to in fact
203fail -- any given test run generates a p-value that is acceptable, but
204the <i>distribution</i> of p-values is not uniform.
205 <li> The number of actual samples <i>within</i> a test that contribute
206to the single-run test statistic was made a variable when possible.
207This was generally possible when the target was an easily computable
208function of the number of samples, but a number of the tests have
209pre-computed targets for specific numbers of samples and that number
210cannot be varied because no general function is known relating the
211target value to the number of samples.
212 <li> Many of diehard's tests investigated overlapping bit sequences.
213Overlapping sequences are not independent and one has to account for
214covariance between the samples (or a gradually vanishing degree of
215autocorrelation between sequential samples with gradually decreasing
overlap). Diehard generally used overlapping samples, at least in part,
because it relied on file-based input of random numbers, and the files
that could reasonably be generated and tested in the mid-90's contained
on the order of a million random deviates.
220
221<p>Unfortunately, some of the diehard tests that rely on weak inverses
222of the covariance matrices associated with overlapping samples seem to
223have errors in their implementation, whether in the original diehard
224(covariance) data or in dieharder-specific code it is difficult to say.
225Fortunately, it is no longer necessary to limit the number of random
226numbers drawn from a generator when running an integrated test, and
227non-overlapping versions of these same tests do not require any
228treatment of covariance.  For that reason non-overlapping versions of
229the questionable tests have been provided where possible (in particular
230testing permutations and sums) and the overlapping versions of those
231tests are deprecated pending a resolution of the apparent errors.</p>
232
233</ul>
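<p>The first improvement above can be sketched as follows.  This is not
dieharder's implementation: it is a minimal one-sample Kolmogorov-Smirnov
test against the uniform distribution, using the asymptotic Kolmogorov
series with a commonly used small-sample correction, and the two lists of
p-values are invented for illustration.  Every p-value in the second list
would be judged "acceptable" on its own, yet the set as a whole is very
far from uniform:</p>

```python
import math


def ks_uniform(pvalues):
    """Return (D, p) for a one-sample KS test of pvalues against Uniform(0, 1)."""
    xs = sorted(pvalues)
    n = len(xs)
    d = 0.0
    for i, x in enumerate(xs):
        # Largest gap between the empirical CDF (a step function) and F(x) = x,
        # checked just after and just before each step.
        d = max(d, (i + 1) / n - x, x - i / n)
    # Asymptotic Kolmogorov distribution with a small-sample correction.
    lam = (math.sqrt(n) + 0.12 + 0.11 / math.sqrt(n)) * d
    if lam < 0.2:
        # The series converges slowly here and the p-value is essentially 1.
        return d, 1.0
    total = 0.0
    for k in range(1, 101):
        total += (-1) ** (k - 1) * math.exp(-2.0 * k * k * lam * lam)
    return d, max(0.0, min(1.0, 2.0 * total))


# 100 evenly spread p-values: as uniform as a sample can be.
flat = [(i + 0.5) / 100 for i in range(100)]
# 100 p-values that each look fine alone but all cluster in [0.4, 0.6].
clustered = [0.4 + 0.2 * (i + 0.5) / 100 for i in range(100)]

print(ks_uniform(flat))
print(ks_uniform(clustered))
```

<p>A single run that reports a p-value of, say, 0.5 tells you very little;
one hundred runs that all report values near 0.5 are strong evidence of a
problem, which is why the number of p-value samples is worth raising.</p>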
234
235<p>In a few cases other variations are possible for specific tests.
236This should be noted in the built-in test documentation for that test
237where appropriate.</p>
238
239<p>Aside from these major differences, note that the algorithms were
240independently written more or less from the test descriptions alone
241(sometimes illuminated by a look at the code implementations, but only
242to clear up just what was meant by the description).  They may well do
243things in a different (but equally valid) order or using different (but
244ultimately equivalent) algorithms altogether and hence produce slightly
245different (but equally valid) results even when run on the <i>same data
246with the same basic parameters</i>.  Then, there may be bugs in the
247code, which might have the same general effect.  Finally, it is always
248possible that <i>diehard</i> implementations have bugs and can be in
249error.  Your Mileage May Vary.  Be Warned.</p>
250
251<hr>
252
253<center><h2>About Dieharder</h2></center>
254
255<p>The primary point of dieharder (like diehard before it) is to make it
256easy to time and test (pseudo)random number generators, both software
257and hardware, for a variety of purposes in research and cryptography.
258The tool is built entirely on top of the GSL's random number generator
259interface and uses a variety of other GSL tools (e.g.  sort, erfc,
260incomplete gamma, distribution generators) in its operation.</p>
261
262<p>Dieharder differs significantly from diehard in many ways.  For
263example, diehard uses file based sources of random numbers exclusively
264and by default works with only roughly ten million random numbers in
265such a file.  However, modern random number generators in a typical
266simulation application can easily need to generate 10^18 or more random
267numbers, generated from hundreds, thousands, millions of different seeds
268in independent (parallelized) simulation threads, as the application
269runs over a period of months to years.  Those applications can easily be
270sensitive to rng weaknesses that might not be revealed by sequences as
271short as 10^7 uints in length even with excellent and sensitive
272tests.  One of dieharder's primary design goals was to permit tests to
273be run on very long sequences.</p>
274
275<p>To facilitate this, dieharder <i>prefers</i> to test generators that
276have been wrapped up in a GSL-compatible interface so that they can
277return an <i>unbounded</i> stream of random numbers -- as many as any
278single test or the entire suite of tests might require.  Numerous
279examples are provided of how one can wrap one's own random number
generator so that it can be called via the GSL interface.</p>
281
282<p>Dieharder also supports file-based input three distinct ways.  The
283simplest is to use the (raw binary) stdin interface to pipe a bit stream
284from <i>any</i> rng, hardware or software, through dieharder for
285testing.  In addition, one can use "direct" file input of either raw
286binary or ascii formatted (usually uint) random numbers.  The man page
287contains examples of how to do all three of these things, and dieharder
288itself can generate sample files to use as templates for the appropriate
289formatting.</p>
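<p>As a rough sketch of the first (stdin) method, the following script
packs the output of a generator into raw native-endian 32-bit unsigned
integers suitable for writing to a pipe.  The xorshift32 generator is only
a stand-in for whatever generator you actually want to test, and the
function names are hypothetical; check the man page for the exact input
format your dieharder version expects:</p>

```python
import io
import struct


def xorshift32(seed):
    """Yield 32-bit outputs of Marsaglia's xorshift32 (a stand-in generator)."""
    x = seed & 0xFFFFFFFF
    if x == 0:
        raise ValueError("seed must be non-zero")
    while True:
        x ^= (x << 13) & 0xFFFFFFFF
        x ^= x >> 17
        x ^= (x << 5) & 0xFFFFFFFF
        yield x


def raw_chunk(gen, count):
    """Pack the next `count` values as native-endian 32-bit unsigned ints."""
    return struct.pack("={}I".format(count), *(next(gen) for _ in range(count)))


def write_raw(out, gen, count, chunk=4096):
    """Write `count` values from `gen` to the binary stream `out` in chunks."""
    remaining = count
    while remaining > 0:
        step = min(chunk, remaining)
        out.write(raw_chunk(gen, step))
        remaining -= step


# Demonstration with an in-memory buffer; a real run would pass
# sys.stdout.buffer instead and loop for as long as the reader wants data.
buffer = io.BytesIO()
write_raw(buffer, xorshift32(1), 10000)
print(len(buffer.getvalue()), "bytes written")
```

<p>Saved under a hypothetical name such as rawstream.py and changed to
write to sys.stdout.buffer, the script could be piped into
<tt>dieharder -g 200 -a</tt> to run all tests on the stream.</p>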
290
291<p><b>Note Well!</b> Dieharder can consume a <i>lot</i> of random
292numbers in the course of running all the tests!  To facilitate this,
293dieharder should (as of 2.27.11 and beyond) support large file (> 2GB)
294input, although this is still experimental.  Large files are clunky and
relatively slow, and LFS (large file support) in linux/gcc is still
296relatively new and may have portability issues if dieharder is built
297with a non-gcc compiler.  It is therefore <i>strongly recommended</i>
298that both hardware and software generators be tested by being wrapped
299within the GSL interface by emulating the source code examples or that
300the pipe/stdin interface be used so that they can return an essentially
301unbounded rng stream.</p>
302
303<p>Dieharder also goes beyond diehard in that it is deliberately
304extensible.  In addition to implementing all of the diehard tests it is
305expected that dieharder will eventually contain all of the NIST STS and
306a variety of tests contributed by users, invented by the dieharder
307authors, or implemented from descriptions in the literature.  As a true
308open source project, dieharder can eventually contain <i>all</i> rng
309tests that prove useful in one place with a consistent interface that
310permits one to apply those tests to many generators for purposes of
311comparison and validation of the <i>tests themselves</i> as much as the
312generators.  In other words, it is intended to be a vehicle for the
313computer science of random number generation testing as well as a
314practical test harness for random number generators.</p>
315
316<p>To expand on this, the development of dieharder was motivated by the
following, in rough order of importance:</p>
318
319<ul>
320
321<li> To provide a readily available, rpm- or apt- installable
322<b>toolset</b> so that "consumers" of random numbers (who typically use
323<i>large</i> numbers of random numbers in e.g. simulation or other
324research) can test the generator(s) they are using to verify their
325quality or lack thereof.
326
327<li> To provide a very <b>simple user interface</b> for that toolset for
328random number consumers.  At the moment, this means a command line
329interface (CLI) that can easily be embedded in scripts or run repeatedly
330with different parameters.  A graphical user interface (GUI) is on the
331list of things to do, although it adds little to the practical utility
332of the tool.
333
334<li> To provide <b>lots of knobs and dials</b> and low level control for
335statistical researchers that want to study particular generators with
336particular tests in more detail.  This includes full access to test
337sources -- no parameter or aspect of the test algorithms is "hidden" and
338needs to be taken on faith.
339
340<li> To have the entire test code and documentation be fully <b>Gnu
341Public Licensed</b> and hence openly available for adaptation, testing,
342comment, and modification so that the testing suite itself becomes (over
343time) reliable.
344
345<li> To be <b>extensible</b>.  Dieharder provides a fairly <b>simple
346API</b> for adding new tests with a common set of low-level testing
347tools and a <b>common test structure</b> that leads (one hopes) to an
348<i>unambiguous</i> decision to accept or reject any given random number
349generator on the basis of any given test for a suitable choice of
350controllable test parameters.
351
352<li> To allow all researchers to be able to directly test, in
353particular, the <b>random number generators interfaced with the GSL</b>.
354This is a deliberate design decision justified by the extremely large
355and growing number of random number generators prebuilt into the GSL and
356the ease of adding new ones (either contributing them to the project or
357for the sole purpose of local testing).
358
359<li> To allow researchers that use e.g. <i>distributions</i> directly
360generated by GSL routines (which can in principle fail two ways, due to
361the failure of the underlying random number generator or due to a
362failure of the generating algorithm) to be able to directly validate
363their particular generator/distribution combination at the cost of
364implementing a suitable test in dieharder (using the code of existing
365tests as a template).
366
367<li> To allow dieharder to be directly interfaced with <b>other tools
368and interfaces</b>.  For example, dieharder can be directly called
369within the R interface, permitting its rngs to be tested and R-based
370graphics and tools to be used to analyze test results.  Note well,
371however, that because it uses the GSL (which is GPL viral) dieharder
372itself is GPL viral and cannot be embedded directly into a non-GPL tool
373such as matlab.  It can, of course, be used to generate <i>p-value
374data</i> that is passed on to matlab (or any other graphing or analysis
tool).
376
377</ul>
378
379<p>Although this tool is being developed on Linux/GCC-based platforms,
380it should port with no particular difficulty to other Unix-like
381environments (at least ones that also support the GSL), with the further
382warning that certain features (in particular large file support) may
383require tweaking and that the dieharder authors may not be able to help
384you perform that tweaking.</p>
385
386<center><h2>Essential Usage Synopsis</h2></center>
387
388<p>If you compile the test or install the provided binary rpm's and run
389it as:</p>
390
391<tt>dieharder -a</tt>
392
393<p>it should run -a(ll) tests on the default GSL generator.</p>
394
<p>Choose alternative generators with -g number where:</p>
396
397<tt>dieharder -g -1</tt>
398
<p>will list all possible generator numbers known to the current snapshot
of dieharder.</p>
401
402<tt>dieharder -l</tt>
403
<p>should list all the tests implemented in the current snapshot of
dieharder.  Finally, the venerable and time tested:</p>
406
407<tt>dieharder -h</tt>
408
<p> provides a Usage synopsis (which can be quite long) and</p>
410
411<tt>man dieharder</tt>
412
<p>is the (installed) man page, which may or may not be completely up
to date as the suite is under active development.  For developers,
additional documentation is available in the toplevel directory or doc
subdirectory of the source tree.  Eventually, a complete dieharder manual
in printable PDF form will be available both on this website and in
/usr/share/doc/dieharder-*/.</p>
419
420<center><h2>List of Random Number Generators and Tests
421Available</h2></center>
422
423<p>List of GSL and user-defined random number generators that can be
424tested by dieharder:</p>
425<pre>
426#=============================================================================#
427#          dieharder version 3.29.4beta Copyright 2003 Robert G. Brown        #
428#=============================================================================#
429#    Id Test Name           | Id Test Name           | Id Test Name           #
430#=============================================================================#
431|   000 borosh13            |001 cmrg                |002 coveyou             |
432|   003 fishman18           |004 fishman20           |005 fishman2x           |
433|   006 gfsr4               |007 knuthran            |008 knuthran2           |
434|   009 knuthran2002        |010 lecuyer21           |011 minstd              |
435|   012 mrg                 |013 mt19937             |014 mt19937_1999        |
436|   015 mt19937_1998        |016 r250                |017 ran0                |
437|   018 ran1                |019 ran2                |020 ran3                |
438|   021 rand                |022 rand48              |023 random128-bsd       |
439|   024 random128-glibc2    |025 random128-libc5     |026 random256-bsd       |
440|   027 random256-glibc2    |028 random256-libc5     |029 random32-bsd        |
441|   030 random32-glibc2     |031 random32-libc5      |032 random64-bsd        |
442|   033 random64-glibc2     |034 random64-libc5      |035 random8-bsd         |
443|   036 random8-glibc2      |037 random8-libc5       |038 random-bsd          |
444|   039 random-glibc2       |040 random-libc5        |041 randu               |
445|   042 ranf                |043 ranlux              |044 ranlux389           |
446|   045 ranlxd1             |046 ranlxd2             |047 ranlxs0             |
447|   048 ranlxs1             |049 ranlxs2             |050 ranmar              |
448|   051 slatec              |052 taus                |053 taus2               |
449|   054 taus113             |055 transputer          |056 tt800               |
450|   057 uni                 |058 uni32               |059 vax                 |
451|   060 waterman14          |061 zuf                 |                        |
452#=============================================================================#
453|   200 stdin_input_raw     |201 file_input_raw      |202 file_input          |
454|   203 ca                  |204 uvag                |205 AES_OFB             |
455|   206 Threefish_OFB       |                        |                        |
456#=============================================================================#
457|   400 R_wichmann_hill     |401 R_marsaglia_multic. |402 R_super_duper       |
458|   403 R_mersenne_twister  |404 R_knuth_taocp       |405 R_knuth_taocp2      |
459#=============================================================================#
460|   500 /dev/random         |501 /dev/urandom        |                        |
461#=============================================================================#
462|   600 empty               |                        |                        |
463#=============================================================================#
464</pre>
465
466<p>Two "gold standard" generators in particular are provided to "test
467the test" -- AES_OFB and Threefish_OFB are both cryptographic generators
468and should be quite random.  gfsr4, mt19937, and taus (and several
469others) are very good generators in the GSL, as well.  If you are
470developing a new rng, it should compare decently with these generators
on dieharder test runs.</p>
472
473<p>Note that the stdin_input_raw interface (-g 200) is a "universal"
474interface.  Any generator that can produce a (continuous) stream of
475presumably random bits can be tested with dieharder.  The easiest way to
476demonstrate this is by running:</p>
477
478<pre>
dieharder -S 1 -B -o -t 1000000000 | dieharder -g 200 -r 3 -n 2
480</pre>
481
482<p>where the first invocation of dieharder generates a stream of binary
483bits drawn from the default generator with seed 1 and the second reads
484those bits from stdin and tests them with the rgb bitdist test on two
485bit sequences.  Compare the output to:</p>
486
487<pre>
488dieharder -S 1 -r 3 -n 2
489</pre>
490
491<p>which runs the same test on the same generator with the same seed
492internally.  They should be the same.</p>
493
494<p>Similarly the file_input generator requires a file of "cooked" (ascii
495readable) random numbers, one per line, with a header that describes the
496format to dieharder.  Note Well!  File or stream input rands (with any
497of the three methods for input) are delivered to the tests on demand,
498but if the test needs more than are available dieharder either fails (in
499the case of a stdin stream) or rewinds the file and cycles through it
500again, and again, and again as needed.  Obviously this significantly
501reduces the sample space and can lead to completely incorrect results
502for the p-value histograms unless there are enough rands to run EACH
503test without repetition (it is harmless to reuse the sequence for
504different tests).  <b>Let the user beware!</b></p>
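<p>The rewinding behaviour for file input can be pictured with a short
sketch.  The function names are hypothetical and this is not dieharder's
file reader; it only shows that once a test asks for more values than the
file holds, no new information arrives:</p>

```python
import itertools


def rands_on_demand(file_values, needed):
    """Deliver `needed` values, starting over whenever the file runs out."""
    return list(itertools.islice(itertools.cycle(file_values), needed))


def rewinds(file_count, needed):
    """How often a file of `file_count` values is rewound to serve `needed`."""
    return max(0, (needed - 1) // file_count)


file_values = list(range(10))          # a "file" holding only 10 rands
delivered = rands_on_demand(file_values, 25)

print(delivered)
print("rewinds:", rewinds(len(file_values), 25))
print("distinct values seen by the test:", len(set(delivered)))
```

<p>The test believes it has 25 samples but has really seen only 10, so any
statistic that assumes 25 independent samples is computed from the wrong
sample space.</p>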
505
506<p>List of the CURRENT fully implemented tests (as of the 08/18/08
507snapshot):</p>
508<pre>
509#=============================================================================#
510#          dieharder version 3.29.4beta Copyright 2003 Robert G. Brown        #
511#=============================================================================#
512Installed dieharder tests:
513 Test Number                         Test Name                Test Reliability
514===============================================================================
515  -d 0                            Diehard Birthdays Test              Good
516  -d 1                               Diehard OPERM5 Test           Suspect
517  -d 2                    Diehard 32x32 Binary Rank Test              Good
518  -d 3                      Diehard 6x8 Binary Rank Test              Good
519  -d 4                            Diehard Bitstream Test              Good
520  -d 5                                      Diehard OPSO              Good
521  -d 6                                 Diehard OQSO Test              Good
522  -d 7                                  Diehard DNA Test              Good
523  -d 8                Diehard Count the 1s (stream) Test              Good
524  -d 9                  Diehard Count the 1s Test (byte)              Good
525  -d 10                         Diehard Parking Lot Test              Good
526  -d 11         Diehard Minimum Distance (2d Circle) Test             Good
527  -d 12         Diehard 3d Sphere (Minimum Distance) Test             Good
528  -d 13                             Diehard Squeeze Test              Good
529  -d 14                                Diehard Sums Test        Do Not Use
530  -d 15                                Diehard Runs Test              Good
531  -d 16                               Diehard Craps Test              Good
532  -d 17                     Marsaglia and Tsang GCD Test              Good
533  -d 100                                STS Monobit Test              Good
534  -d 101                                   STS Runs Test              Good
535  -d 102                   STS Serial Test (Generalized)              Good
536  -d 200                       RGB Bit Distribution Test              Good
537  -d 201           RGB Generalized Minimum Distance Test              Good
538  -d 202                           RGB Permutations Test              Good
539  -d 203                             RGB Lagged Sum Test              Good
540  -d 204                RGB Kolmogorov-Smirnov Test Test              Good
541</pre>
542
543<p>Full descriptions of the tests are available from within the tool.
544For example, enter:
545<pre>
546rgb@lilith|B:1003>./dieharder -d 203 -h
547OK, what is dtest_num = 203
548#==================================================================
549#                     RGB Lagged Sums Test
550# This package contains many very lovely tests.  Very few of them,
551# however, test for lagged correlations -- the possibility that
552# the random number generator has a bitlevel correlation after
553# some fixed number of intervening bits.
554#
555# The lagged sums test is therefore very simple.   One simply adds up
556# uniform deviates sampled from the rng, skipping lag samples in between
557# each rand used.  The mean of tsamples samples thus summed should be
558# 0.5*tsamples.  The standard deviation should be sqrt(tsamples/12).
559# The experimental values of the sum are thus converted into a
560# p-value (using the erf()) and a ks-test applied to psamples of them.
561#==================================================================
562</pre>
563</p>
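<p>The description above translates fairly directly into code.  The
following is a sketch written from that description only, not the
dieharder source: the function name is hypothetical, Python's built-in
generator stands in for the rng under test, and whether the skipped
samples come before or after each used sample is an arbitrary choice
here:</p>

```python
import math
import random


def lagged_sum_pvalue(next_uniform, tsamples, lag):
    """p-value for the sum of tsamples uniform deviates, skipping lag between each."""
    total = 0.0
    for _ in range(tsamples):
        for _ in range(lag):
            next_uniform()          # discard `lag` samples
        total += next_uniform()     # use the next one
    mean = 0.5 * tsamples
    sigma = math.sqrt(tsamples / 12.0)
    z = (total - mean) / sigma
    # Normal CDF of z, written with the complementary error function.
    return 0.5 * math.erfc(-z / math.sqrt(2.0))


# psamples independent p-values from the stand-in generator.
rng = random.Random(1)
pvalues = [lagged_sum_pvalue(rng.random, 1000, 3) for _ in range(100)]
print(min(pvalues), max(pvalues))
```

<p>The resulting list of p-values is what would then be handed to a
Kolmogorov-Smirnov test for uniformity, as the description says.</p>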
564
565<p>Note that all tests have been independently rewritten from their
566description, and may be functionally modified or extended relative to
567the original source code published in the originating suite(s).  This
568has proven to be absolutely necessary; dieharder stresses random number
569generator tests as much as it stresses random number generators, and
570tests with imprecise target statistics can return "failure" when the
571fault is with the test, not the generator.</p>
572
573<p>The author (rgb) bears complete responsibility for these changes,
574subject to the standard GPL code disclaimer that the code <i>has no
575warranty</i>.  In essence, yes it may be my fault if they don't work but
576using the tool is <i>at your own risk</i> and you can <i>fix it</i> if
577it bothers you and/or I don't fix it first.</p>
578
579<center><h2>Development Notes</h2></center>
580
581<p>All tests are encapsulated to be as standard as possible in the way
582they compute p-values from single statistics or from vectors of
583statistics, and in the way they implement the underlying KS and chisq
584tests.  Diehard is now complete in dieharder (although two tests are
585badly broken and should not be used), and attention will turn towards
586implementing more selected tests from the STS and many other sources.  A
587road map of sorts (with full supporting documentation) is available on
588request if volunteers wish to work on adding more GPL tests.</p>
589
590<p>Note that a few tests appear to have stubborn bugs.  In particular,
591the diehard operm5 test seems to fail all generators in dieharder.
592Several users have attempted to help debug this problem, and it
593tentatively appears that the problem is in the original diehard code and
594not just dieharder.  There is extensive literature on overlapping tests,
595which are highly non-trivial to implement and involve things like
596forming the weak inverse of covariance matrices in order to correct for
597overlapping (non-independent) statistics.</p>
598
599<p>A revised version of overlapping permutations is underway (as an rgb
600test), but is still buggy.  A non-overlapping (rgb) permutations test is
601provided now that should test much the same thing at the expense of
602requiring more samples to do it.</p>
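
<p>The idea behind a non-overlapping permutations test can be sketched
in a few lines of Python.  This is only an illustration under stated
assumptions, not the rgb test itself: the function name
<tt>permutation_chisq</tt>, the tuple length of 5, and the use of
Python's <tt>random.Random</tt> as the generator are all chosen for the
example.  Each tuple is built from fresh deviates, so the counts are
independent and an ordinary chi-square statistic applies with no
covariance correction.</p>

```python
import itertools
import math
import random


def permutation_chisq(rng, k, ntuples):
    """Chi-square statistic for ordering counts of non-overlapping k-tuples.

    Hypothetical helper: returns (chisq, degrees_of_freedom).
    """
    counts = {perm: 0 for perm in itertools.permutations(range(k))}
    for _ in range(ntuples):
        # Fresh deviates for every tuple, so tuples never share samples.
        values = [rng.random() for _ in range(k)]
        # The ordering pattern: indices of the tuple sorted by value.
        order = tuple(sorted(range(k), key=values.__getitem__))
        counts[order] += 1
    expected = ntuples / math.factorial(k)   # every ordering equally likely
    chisq = sum((c - expected) ** 2 / expected for c in counts.values())
    return chisq, math.factorial(k) - 1


chisq, dof = permutation_chisq(random.Random(2024), k=5, ntuples=120000)
```

<p>For a good generator the statistic should be comparable to its number
of degrees of freedom (119 for k = 5).  Turning it into a p-value needs
a chi-square distribution function, which is left out of this sketch.
The cost noted above is visible here: every tuple consumes k new
deviates, where an overlapping test would consume only one.</p>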
603
604<p>Similarly, the diehard sums test appears to produce a systematically
605non-flat distribution of p-values for all rngs tested, in particular for
606the "gold standard" cryptographic generators aes and threefish, as well
607as for the "good" generators in the GSL (mt19937, taus, gfsr4).  It
608seems very unlikely that all of these generators would be flawed in the
same way, so this test also should not be used to test your rng.</p>
610
611<center><h2>Thoughts for the Future/Wish List/To Do</h2></center>
612
613<ul>
614
615<li> Tests of GSL random distribution (as opposed to number) generators,
616as indirect tests of the generators that feed them.
617
<li> New tests, and consolidation of existing ones that are "different"
but really the same.  Hyperplane tests.  Spectral tests.  Especially the
bit distribution test with user-definable lag or lag pattern (to look for
subtle, long-period correlations in the bit patterns produced).
622
623<li> Collaborators.  Co-developers welcome, as are contributions or
624suggestions from users.  Note well that users have already provided
625critical help debugging the early code!  Part of the point of a GPL
626project is that you are NOT at the mercy of a black box piece of code.
627If you are using dieharder and are moderately expert at statistics and
628random numbers and observe something odd, please help out!
629
630</ul>
631
632<center><h2>Conclusions</h2></center>
633
634<p>I hope that even during its development, you find dieharder useful.
635Remember, it is fully open source, so you can freely modify and
redistribute the code according to the rules laid out in the GNU Public
License (version 2b), which might cost you as much as a beer one day.
638In particular, you can easily add random number generators using the
639provided examples as templates, or you can add tests of your own by
640copying the general layout of the existing tests (working toward a
641p-value per run, cumulating (say) 100 runs, and turning the resulting KS
642test into an overall p-value).  Best of all, you can look inside the
643code and see how the tests work, which may inspire you to create a new
644test -- or a new generator that can <i>pass</i> a test.</p>
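
<p>The general layout described above (a p-value per run, about 100
runs, then a KS test for an overall p-value) might look like the
following Python sketch.  None of this is dieharder code: the per-run
statistic is a simple bit count picked for brevity, the names
<tt>bitcount_pvalue</tt> and <tt>ks_uniform_pvalue</tt> are
hypothetical, and the KS p-value uses the standard asymptotic
Kolmogorov series with a small-sample correction rather than whatever
dieharder does internally.</p>

```python
import math
import random


def bitcount_pvalue(rng, nbits=4096):
    """One run: p-value from the number of 1 bits in nbits random bits."""
    ones = bin(rng.getrandbits(nbits)).count("1")
    z = (ones - 0.5 * nbits) / math.sqrt(0.25 * nbits)
    return 0.5 * math.erfc(-z / math.sqrt(2.0))   # normal CDF at z


def ks_uniform_pvalue(pvalues):
    """Asymptotic Kolmogorov-Smirnov p-value for uniformity on [0, 1]."""
    n = len(pvalues)
    xs = sorted(pvalues)
    # Largest gap between the empirical CDF and the uniform CDF.
    d = max(max((i + 1) / n - x, x - i / n) for i, x in enumerate(xs))
    lam = (math.sqrt(n) + 0.12 + 0.11 / math.sqrt(n)) * d
    if lam < 0.2:
        return 1.0    # the series converges slowly here; the value is ~1
    series = sum((-1) ** (j - 1) * math.exp(-2.0 * j * j * lam * lam)
                 for j in range(1, 101))
    return min(1.0, max(0.0, 2.0 * series))


rng = random.Random(7)                       # assumed generator under test
run_pvalues = [bitcount_pvalue(rng) for _ in range(100)]
overall = ks_uniform_pvalue(run_pvalues)
```

<p>A very small overall p-value means the per-run p-values are not
uniformly distributed.  That signals a bad generator, or, as with the
diehard sums test discussed earlier, a per-run statistic whose target
distribution is wrong.</p>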
645
646<p>To conclude, if you have any interest in participating in the
647development of dieharder, be sure to let me know, especially if you have
648decent C coding skills (including familiarity with Subversion and the
649GSL) and a basic knowledge of statistics.  I even have documents to help
650with the latter, if you have the programming skills and want to LEARN
651statistics.  Bug reports or suggestions are also welcome.</p>
652
653<p>Submit bug reports, etc. to</p>
654<address>
655  rgb at phy dot duke dot edu
656</address>
657
658