<center><H1>Dieharder: A Random Number Test Suite</H1></center>
<center><H2>Version @VERSION@</H2></center>

<center><H3>Robert G. Brown (rgb)</H3></center>
<center><H3>Dirk Eddelbuettel</H3></center>
<center><H3>David Bauer</H3></center>

<p>Welcome to the dieharder distribution website.</p>

<p>Version @VERSION@ is the current snapshot. Some of the documentation
below may not quite be caught up to it, but it should be close.</p>

<p>Dieharder is a <i>random number generator (rng) testing suite</i>.
It is intended to test <i>generators</i>, not <i>files of possibly
random numbers</i>, as the latter is a fallacious view of what it means
to be random. Is the number 7 random? If it is generated by a random
process, it might be. If it is made up to serve the purpose of some
argument (like this one) it is not. Perfect random number generators
produce "unlikely" sequences of random numbers -- at exactly the right
average rate. Testing a rng is therefore quite subtle.</p>

<p>Dieharder is a tool designed to permit one to push a weak generator
to unambiguous failure (at, e.g., the 0.0001% level), not leave one in the
"limbo" of 1% or 5% maybe-failure. It also contains many tests and is
extensible so that eventually it will contain many more tests than it
already does.</p>

<p>If you are using dieharder for testing rngs either in one of its
prebuilt versions (rpm or apt) or built from source (which gives you the
ability to e.g. add more tests or integrate your rng directly with
dieharder for ease of use) you may want to join either or both of the
<a
href="https://lists.phy.duke.edu/mailman/listinfo/dieharder-announce">dieharder-announce</a>
or the
<a
href="https://lists.phy.duke.edu/mailman/listinfo/dieharder-devel">dieharder-devel</a>
mailing lists here. The former should be very low traffic -- basically
announcing when a snapshot makes it through development to where I'm
proud of it.
The latter will be a bit more active, and is a good place
to post bug reports, patches, suggestions, fixes, complaints and
generally participate in the development process.</p>

<h2>About Dieharder</h2>

<p>At the suggestion of Linas Vepstas on the Gnu Scientific Library
(GSL) list this GPL'd suite of random number tests will be named
"Dieharder". Using a movie sequel pun for the name is a double tribute
to George Marsaglia, whose <a
href="http://stat.fsu.edu/~geo/diehard.html">"Diehard battery of
tests"</a> of random number generators has enjoyed years of enduring
usefulness as a test suite.</p>

<p>The dieharder suite is more than just the diehard tests cleaned up
and given a pretty GPL'd source face in native C. Tests from the <a
href="http://csrc.nist.gov/rng/">Statistical Test Suite (STS)</a>
developed by the National Institute of Standards and Technology (NIST)
are being incorporated, as are new tests developed by rgb. Where
possible or appropriate, <i>all</i> tests that can be parameterized
("cranked up") to where failure, at least, is unambiguous are so
parameterized and controllable from the command line.</p>

<p>A further design goal is to provide some indication of <i>why</i> a
generator fails a test, where such information can be extracted during
the test process and placed in usable form. For example, the
bit-distribution tests should (eventually) be able to display the actual
histogram for the different bit ntuplets.</p>

<p>Dieharder is by design extensible. It is intended to be the "Swiss
army knife of random number test suites", or if you prefer, "the last
suite you'll ever ware" for testing random numbers.</p>

<hr>

<center><h2><a href="./dieharder">Dieharder Download
Area</a></h2></center>

<p>Dieharder can be freely downloaded from <a
href="http://www.phy.duke.edu/~rgb/General/dieharder.php">the Dieharder
download site</a>.
On this page there should be a long list of previous
versions of dieharder, and it should tell you which is the current
snapshot. The version numbers have the following <i>specific
meaning</i>, which is a bit different than usual:</p>

<ul>

<li> First number (major). Bumped only when major goals in the design
roadmap are reached (for example, finishing all the diehard tests).
Version 1.x.x, for example, means that ALL of diehard (and more) is now
incorporated in the program. Version 2.x.x means that the tests
themselves have been split off into the libdieharder library, so that
they can be linked into scripting languages such as R, new UIs, or user
code. 3.x.x would be expected to indicate that the entire STS suite is
incorporated, and so on.

<li> Second number (first minor). This number indicates the number of
tests currently supported. When it bumps, it means new tests have been
added from e.g. STS, Knuth, Marsaglia and Tsang, rgb, or elsewhere.

<li> Third number (second minor). This number is bumped when
significant features are added or altered. Bug fixes bump this number,
usually after a few bumps of the release number for testing snapshots.
This number and the release are reset to 0 when the major is bumped or a
new test is added to maintain the strictly increasing numerical value on
which e.g. yum upgrades rely.

</ul>

<p> The single-tree dieharder source files (.tgz and .src.rpm) can be
downloaded from this directory. In addition, binary rpm's built on top
of some release of Fedora Core (for either or both of i386 and x86_64)
may be present. Be warned: the GSL is a build <i>requirement</i>. The current
packaging builds both the library and the dieharder UI from a single
source rpm, or from running "make" in the toplevel directory of the
source tarball.
With a bit of effort (making a private rpm building
tree), "make rpm" should work for you as well in this toplevel
directory.</p>

<p> This project is under very active development. Considerable effort
is being expended so that the suite will "run out of the box" to produce
a reasonably understandable report for any given random number generator
it supports via the "-a" flag, in addition to the ability to
considerably vary most specific tests as applied to the generator. A
brief synopsis of command options to get you started is presented below.
In general, though, documentation (including this page, the man page,
and built-in documentation) may lag the bleeding edge snapshot by a few
days or more.</p>

<p>An rpm installation note from Court Shrock:</p>
<pre>
I was reading about your work on dieharder. First, some info
about getting dieharder working in Gentoo:

cd ~
emerge rpm gsl
wget http://www.phy.duke.edu/~rgb/General/dieharder/dieharder-0.6.11-1.i386.rpm
rpm -i --nodeps dieharder-0.6.11-1.i386.rpm
</pre>

<p>Rebuilding from tarball source should always work as well, and if you
are planning to play a lot with the tool it may be a desirable way to
proceed, as there are some documentation goodies in the ./doc
subdirectory and the ./manual subdirectory of the source tarball (such
as the original diehard test descriptions and the STS white paper).</p>

<p>George Marsaglia retired from FSU in 1996. For a brief time diehard
appeared to have finally disappeared from FSU webspace, but what had
really happened is that Google's favorite path to it had disappeared when his
personal home directory was removed. Diehard is still there, at the URL
<a
href="http://www.stat.fsu.edu/pub/diehard">http://www.stat.fsu.edu/pub/diehard</a>
as well as at a Hong Kong website.
The source code of diehard itself is
(of course) Copyright George Marsaglia but Marsaglia did not incorporate
an explicit <i>license</i> into his code, which muddles the issue of how
and when it can be distributed, freely or otherwise. Existing diehard
sources are <i>not directly incorporated into dieharder in source
form</i> for that reason, to keep authorship and GPL licensing issues
clear.</p>

<p>Note that the same is not true about data. Several of the diehard
tests require that one use precomputed numbers as e.g. target mean,
sigma for some test statistic. Obviously in these cases we use the same
numbers as diehard so we get the same, or comparable, results. These
numbers were all developed with support from Federal grants and have all
been published in the literature, though, and should therefore be in the
public domain as far as reuse in a program is concerned.</p>

<p>Note also that most of the diehard tests are <i>modified</i> in
dieharder, usually in a way that should improve them. There are three
improvements that were basically always made if possible.</p>
<ul>
 <li> The number of test sample p-values that contribute to the final
Kolmogorov-Smirnov test for the uniformity of the distribution of
p-values of the test statistic is a variable with default 100, which is
<i>much</i> larger than most diehard default values. This change alone
causes many generators that are asserted to "pass diehard" to in fact
fail -- any given test run generates a p-value that is acceptable, but
the <i>distribution</i> of p-values is not uniform.
 <li> The number of actual samples <i>within</i> a test that contribute
to the single-run test statistic was made a variable when possible.
This was generally possible when the target was an easily computable
function of the number of samples, but a number of the tests have
pre-computed targets for specific numbers of samples and that number
cannot be varied because no general function is known relating the
target value to the number of samples.
 <li> Many of diehard's tests investigated overlapping bit sequences.
Overlapping sequences are not independent and one has to account for
covariance between the samples (or a gradually vanishing degree of
autocorrelation between sequential samples with gradually decreasing
overlap). Diehard generally did this at least in part because it used
file-based input of random numbers, and the files that could
reasonably be generated and tested in the mid-90's contained on the
order of a million random deviates.

<p>Unfortunately, some of the diehard tests that rely on weak inverses
of the covariance matrices associated with overlapping samples seem to
have errors in their implementation, whether in the original diehard
(covariance) data or in dieharder-specific code it is difficult to say.
Fortunately, it is no longer necessary to limit the number of random
numbers drawn from a generator when running an integrated test, and
non-overlapping versions of these same tests do not require any
treatment of covariance. For that reason non-overlapping versions of
the questionable tests have been provided where possible (in particular
testing permutations and sums) and the overlapping versions of those
tests are deprecated pending a resolution of the apparent errors.</p>

</ul>

<p>In a few cases other variations are possible for specific tests.
This should be noted in the built-in test documentation for that test
where appropriate.</p>

<p>Aside from these major differences, note that the algorithms were
independently written more or less from the test descriptions alone
(sometimes illuminated by a look at the code implementations, but only
to clear up just what was meant by the description). They may well do
things in a different (but equally valid) order or using different (but
ultimately equivalent) algorithms altogether and hence produce slightly
different (but equally valid) results even when run on the <i>same data
with the same basic parameters</i>. Then, there may be bugs in the
code, which might have the same general effect. Finally, it is always
possible that <i>diehard</i> implementations have bugs and can be in
error. Your Mileage May Vary. Be Warned.</p>

<hr>

<center><h2>About Dieharder</h2></center>

<p>The primary point of dieharder (like diehard before it) is to make it
easy to time and test (pseudo)random number generators, both software
and hardware, for a variety of purposes in research and cryptography.
The tool is built entirely on top of the GSL's random number generator
interface and uses a variety of other GSL tools (e.g. sort, erfc,
incomplete gamma, distribution generators) in its operation.</p>

<p>Dieharder differs significantly from diehard in many ways. For
example, diehard uses file based sources of random numbers exclusively
and by default works with only roughly ten million random numbers in
such a file. However, modern random number generators in a typical
simulation application can easily need to generate 10^18 or more random
numbers, generated from hundreds, thousands, or millions of different seeds
in independent (parallelized) simulation threads, as the application
runs over a period of months to years.
Those applications can easily be
sensitive to rng weaknesses that might not be revealed by sequences as
short as 10^7 uints in length even with excellent and sensitive
tests. One of dieharder's primary design goals was to permit tests to
be run on very long sequences.</p>

<p>To facilitate this, dieharder <i>prefers</i> to test generators that
have been wrapped up in a GSL-compatible interface so that they can
return an <i>unbounded</i> stream of random numbers -- as many as any
single test or the entire suite of tests might require. Numerous
examples are provided of how one can wrap one's own random number
generator so that it can be called via the GSL interface.</p>

<p>Dieharder also supports file-based input three distinct ways. The
simplest is to use the (raw binary) stdin interface to pipe a bit stream
from <i>any</i> rng, hardware or software, through dieharder for
testing. In addition, one can use "direct" file input of either raw
binary or ascii formatted (usually uint) random numbers. The man page
contains examples of how to do all three of these things, and dieharder
itself can generate sample files to use as templates for the appropriate
formatting.</p>

<p><b>Note Well!</b> Dieharder can consume a <i>lot</i> of random
numbers in the course of running all the tests! To facilitate this,
dieharder should (as of 2.27.11 and beyond) support large file (> 2GB)
input, although this is still experimental. Large files are clunky and
relatively slow, and LFS (large file support) in linux/gcc is still
relatively new and may have portability issues if dieharder is built
with a non-gcc compiler.
It is therefore <i>strongly recommended</i>
that both hardware and software generators be tested by being wrapped
within the GSL interface by emulating the source code examples or that
the pipe/stdin interface be used so that they can return an essentially
unbounded rng stream.</p>

<p>Dieharder also goes beyond diehard in that it is deliberately
extensible. In addition to implementing all of the diehard tests it is
expected that dieharder will eventually contain all of the NIST STS and
a variety of tests contributed by users, invented by the dieharder
authors, or implemented from descriptions in the literature. As a true
open source project, dieharder can eventually contain <i>all</i> rng
tests that prove useful in one place with a consistent interface that
permits one to apply those tests to many generators for purposes of
comparison and validation of the <i>tests themselves</i> as much as the
generators. In other words, it is intended to be a vehicle for the
computer science of random number generation testing as well as a
practical test harness for random number generators.</p>

<p>To expand on this, the development of dieharder was motivated by the
following, in rough order of importance:</p>

<ul>

<li> To provide a readily available, rpm- or apt- installable
<b>toolset</b> so that "consumers" of random numbers (who typically use
<i>large</i> numbers of random numbers in e.g. simulation or other
research) can test the generator(s) they are using to verify their
quality or lack thereof.

<li> To provide a very <b>simple user interface</b> for that toolset for
random number consumers. At the moment, this means a command line
interface (CLI) that can easily be embedded in scripts or run repeatedly
with different parameters. A graphical user interface (GUI) is on the
list of things to do, although it adds little to the practical utility
of the tool.

<li> To provide <b>lots of knobs and dials</b> and low level control for
statistical researchers who want to study particular generators with
particular tests in more detail. This includes full access to test
sources -- no parameter or aspect of the test algorithms is "hidden" or
needs to be taken on faith.

<li> To have the entire test code and documentation be fully <b>Gnu
Public Licensed</b> and hence openly available for adaptation, testing,
comment, and modification so that the testing suite itself becomes (over
time) reliable.

<li> To be <b>extensible</b>. Dieharder provides a fairly <b>simple
API</b> for adding new tests with a common set of low-level testing
tools and a <b>common test structure</b> that leads (one hopes) to an
<i>unambiguous</i> decision to accept or reject any given random number
generator on the basis of any given test for a suitable choice of
controllable test parameters.

<li> To allow all researchers to be able to directly test, in
particular, the <b>random number generators interfaced with the GSL</b>.
This is a deliberate design decision justified by the extremely large
and growing number of random number generators prebuilt into the GSL and
the ease of adding new ones (either contributing them to the project or
for the sole purpose of local testing).

<li> To allow researchers who use e.g. <i>distributions</i> directly
generated by GSL routines (which can in principle fail two ways, due to
the failure of the underlying random number generator or due to a
failure of the generating algorithm) to be able to directly validate
their particular generator/distribution combination at the cost of
implementing a suitable test in dieharder (using the code of existing
tests as a template).

<li> To allow dieharder to be directly interfaced with <b>other tools
and interfaces</b>.
For example, dieharder can be directly called
within the R interface, permitting its rngs to be tested and R-based
graphics and tools to be used to analyze test results. Note well,
however, that because it uses the GSL (which is GPL viral) dieharder
itself is GPL viral and cannot be embedded directly into a non-GPL tool
such as matlab. It can, of course, be used to generate <i>p-value
data</i> that is passed on to matlab (or any other graphing or analysis
tool).

</ul>

<p>Although this tool is being developed on Linux/GCC-based platforms,
it should port with no particular difficulty to other Unix-like
environments (at least ones that also support the GSL), with the further
warning that certain features (in particular large file support) may
require tweaking and that the dieharder authors may not be able to help
you perform that tweaking.</p>

<center><h2>Essential Usage Synopsis</h2></center>

<p>If you compile the tool or install the provided binary rpm's and run
it as:</p>

<tt>dieharder -a</tt>

<p>it should run -a(ll) tests on the default GSL generator.</p>

<p>Choose alternative generators with -g number where:</p>

<tt>dieharder -g -1</tt>

<p>will list all generator numbers known to the current snapshot of
dieharder.</p>

<tt>dieharder -l</tt>

<p>should list all the tests implemented in the current snapshot of
dieharder. Finally, the venerable and time tested:</p>

<tt>dieharder -h</tt>

<p> provides a Usage synopsis (which can be quite long) and</p>

<tt>man dieharder</tt>

<p>is the (installed) man page, which may or may not be completely up
to date as the suite is under active development. For developers,
additional documentation is available in the toplevel directory or doc
subdirectory of the source tree.
Eventually, a complete dieharder manual
in printable PDF form will be available both on this website and in
/usr/share/doc/dieharder-*/.</p>

<center><h2>List of Random Number Generators and Tests
Available</h2></center>

<p>List of GSL and user-defined random number generators that can be
tested by dieharder:</p>
<pre>
#=============================================================================#
#         dieharder version 3.29.4beta Copyright 2003 Robert G. Brown         #
#=============================================================================#
#  Id Test Name           | Id Test Name            | Id Test Name            #
#=============================================================================#
| 000 borosh13            |001 cmrg                 |002 coveyou              |
| 003 fishman18           |004 fishman20            |005 fishman2x            |
| 006 gfsr4               |007 knuthran             |008 knuthran2            |
| 009 knuthran2002        |010 lecuyer21            |011 minstd               |
| 012 mrg                 |013 mt19937              |014 mt19937_1999         |
| 015 mt19937_1998        |016 r250                 |017 ran0                 |
| 018 ran1                |019 ran2                 |020 ran3                 |
| 021 rand                |022 rand48               |023 random128-bsd        |
| 024 random128-glibc2    |025 random128-libc5      |026 random256-bsd        |
| 027 random256-glibc2    |028 random256-libc5      |029 random32-bsd         |
| 030 random32-glibc2     |031 random32-libc5       |032 random64-bsd         |
| 033 random64-glibc2     |034 random64-libc5       |035 random8-bsd          |
| 036 random8-glibc2      |037 random8-libc5        |038 random-bsd           |
| 039 random-glibc2       |040 random-libc5         |041 randu                |
| 042 ranf                |043 ranlux               |044 ranlux389            |
| 045 ranlxd1             |046 ranlxd2              |047 ranlxs0              |
| 048 ranlxs1             |049 ranlxs2              |050 ranmar               |
| 051 slatec              |052 taus                 |053 taus2                |
| 054 taus113             |055 transputer           |056 tt800                |
| 057 uni                 |058 uni32                |059 vax                  |
| 060 waterman14          |061 zuf                  |                         |
#=============================================================================#
| 200 stdin_input_raw     |201 file_input_raw       |202 file_input           |
| 203 ca                  |204 uvag                 |205 AES_OFB              |
| 206 Threefish_OFB       |                         |                         |
#=============================================================================#
| 400 R_wichmann_hill     |401 R_marsaglia_multic.  |402 R_super_duper        |
| 403 R_mersenne_twister  |404 R_knuth_taocp        |405 R_knuth_taocp2       |
#=============================================================================#
| 500 /dev/random         |501 /dev/urandom         |                         |
#=============================================================================#
| 600 empty               |                         |                         |
#=============================================================================#
</pre>

<p>Two "gold standard" generators in particular are provided to "test
the test" -- AES_OFB and Threefish_OFB are both cryptographic generators
and should be quite random. gfsr4, mt19937, and taus (and several
others) are very good generators in the GSL, as well. If you are
developing a new rng, it should compare decently with these generators
on dieharder test runs.</p>

<p>Note that the stdin_input_raw interface (-g 200) is a "universal"
interface. Any generator that can produce a (continuous) stream of
presumably random bits can be tested with dieharder. The easiest way to
demonstrate this is by running:</p>

<pre>
dieharder -S 1 -o -B -t 100000000 | dieharder -g 200 -d 0
</pre>

<p>where the first invocation of dieharder generates a stream of binary
bits drawn from the default generator with seed 1 and the second reads
those bits from stdin and tests them with the diehard birthdays test on
two bit sequences. Compare the output to:</p>

<pre>
dieharder -S 1 -d 0
</pre>

<p>which runs the same test on the same generator with the same seed
internally. They should be the same.</p>

<p>Similarly the file_input generator requires a file of "cooked" (ascii
readable) random numbers, one per line, with a header that describes the
format to dieharder. Note Well!
File or stream input rands (with any
of the three methods for input) are delivered to the tests on demand,
but if the test needs more than are available dieharder either fails (in
the case of a stdin stream) or rewinds the file and cycles through it
again, and again, and again as needed. Obviously this significantly
reduces the sample space and can lead to completely incorrect results
for the p-value histograms unless there are enough rands to run EACH
test without repetition (it is harmless to reuse the sequence for
different tests). <b>Let the user beware!</b></p>

<p>List of the CURRENT fully implemented tests (as of the 08/18/08
snapshot):</p>
<pre>
#=============================================================================#
#         dieharder version 3.29.4beta Copyright 2003 Robert G. Brown         #
#=============================================================================#
Installed dieharder tests:
  Test Number Test Name                                    Test Reliability
===============================================================================
  -d 0        Diehard Birthdays Test                       Good
  -d 1        Diehard OPERM5 Test                          Suspect
  -d 2        Diehard 32x32 Binary Rank Test               Good
  -d 3        Diehard 6x8 Binary Rank Test                 Good
  -d 4        Diehard Bitstream Test                       Good
  -d 5        Diehard OPSO                                 Good
  -d 6        Diehard OQSO Test                            Good
  -d 7        Diehard DNA Test                             Good
  -d 8        Diehard Count the 1s (stream) Test           Good
  -d 9        Diehard Count the 1s Test (byte)             Good
  -d 10       Diehard Parking Lot Test                     Good
  -d 11       Diehard Minimum Distance (2d Circle) Test    Good
  -d 12       Diehard 3d Sphere (Minimum Distance) Test    Good
  -d 13       Diehard Squeeze Test                         Good
  -d 14       Diehard Sums Test                            Do Not Use
  -d 15       Diehard Runs Test                            Good
  -d 16       Diehard Craps Test                           Good
  -d 17       Marsaglia and Tsang GCD Test                 Good
  -d 100      STS Monobit Test                             Good
  -d 101      STS Runs Test                                Good
  -d 102      STS Serial Test (Generalized)                Good
  -d 200      RGB Bit Distribution Test                    Good
  -d 201      RGB Generalized Minimum Distance Test        Good
  -d 202      RGB Permutations Test                        Good
  -d 203      RGB Lagged Sum Test                          Good
  -d 204      RGB Kolmogorov-Smirnov Test Test             Good
</pre>

<p>Full descriptions of the tests are available from within the tool.
For example, enter:
<pre>
rgb@lilith|B:1003>./dieharder -d 203 -h
OK, what is dtest_num = 203
#==================================================================
#                     RGB Lagged Sums Test
# This package contains many very lovely tests. Very few of them,
# however, test for lagged correlations -- the possibility that
# the random number generator has a bitlevel correlation after
# some fixed number of intervening bits.
#
# The lagged sums test is therefore very simple. One simply adds up
# uniform deviates sampled from the rng, skipping lag samples in between
# each rand used. The mean of tsamples samples thus summed should be
# 0.5*tsamples. The standard deviation should be sqrt(tsamples/12).
# The experimental values of the sum are thus converted into a
# p-value (using the erf()) and a ks-test applied to psamples of them.
#==================================================================
</pre>
</p>

<p>Note that all tests have been independently rewritten from their
description, and may be functionally modified or extended relative to
the original source code published in the originating suite(s). This
has proven to be absolutely necessary; dieharder stresses random number
generator tests as much as it stresses random number generators, and
tests with imprecise target statistics can return "failure" when the
fault is with the test, not the generator.</p>

<p>The author (rgb) bears complete responsibility for these changes,
subject to the standard GPL code disclaimer that the code <i>has no
warranty</i>.
In essence, yes it may be my fault if they don't work but
using the tool is <i>at your own risk</i> and you can <i>fix it</i> if
it bothers you and/or I don't fix it first.</p>

<center><h2>Development Notes</h2></center>

<p>All tests are encapsulated to be as standard as possible in the way
they compute p-values from single statistics or from vectors of
statistics, and in the way they implement the underlying KS and chisq
tests. Diehard is now complete in dieharder (although two tests are
badly broken and should not be used), and attention will turn towards
implementing more selected tests from the STS and many other sources. A
road map of sorts (with full supporting documentation) is available on
request if volunteers wish to work on adding more GPL tests.</p>

<p>Note that a few tests appear to have stubborn bugs. In particular,
the diehard operm5 test seems to fail all generators in dieharder.
Several users have attempted to help debug this problem, and it
tentatively appears that the problem is in the original diehard code and
not just dieharder. There is extensive literature on overlapping tests,
which are highly non-trivial to implement and involve things like
forming the weak inverse of covariance matrices in order to correct for
overlapping (non-independent) statistics.</p>

<p>A revised version of overlapping permutations is underway (as an rgb
test), but is still buggy. A non-overlapping (rgb) permutations test is
provided now that should test much the same thing at the expense of
requiring more samples to do it.</p>

<p>Similarly, the diehard sums test appears to produce a systematically
non-flat distribution of p-values for all rngs tested, in particular for
the "gold standard" cryptographic generators aes and threefish, as well
as for the "good" generators in the GSL (mt19937, taus, gfsr4).
It
seems very unlikely that all of these generators would be flawed in the
same way, so this test also should not be used to test your rng.</p>

<center><h2>Thoughts for the Future/Wish List/To Do</h2></center>

<ul>

<li> Tests of GSL random distribution (as opposed to number) generators,
as indirect tests of the generators that feed them.

<li> New tests, compressions of existing ones that are "different" but
really the same. Hyperplane tests. Spectral tests. Especially the bit
distribution test with user definable lag or lag pattern (to look for
subtle, long period correlations in the bit patterns produced).

<li> Collaborators. Co-developers welcome, as are contributions or
suggestions from users. Note well that users have already provided
critical help debugging the early code! Part of the point of a GPL
project is that you are NOT at the mercy of a black box piece of code.
If you are using dieharder and are moderately expert at statistics and
random numbers and observe something odd, please help out!

</ul>

<center><h2>Conclusions</h2></center>

<p>I hope that even during its development, you find dieharder useful.
Remember, it is fully open source, so you can freely modify and
redistribute the code according to the rules laid out in the Gnu Public
License (version 2b), which might cost you as much as a beer one day.
In particular, you can easily add random number generators using the
provided examples as templates, or you can add tests of your own by
copying the general layout of the existing tests (working toward a
p-value per run, cumulating (say) 100 runs, and turning the resulting KS
test into an overall p-value).
Best of all, you can look inside the
code and see how the tests work, which may inspire you to create a new
test -- or a new generator that can <i>pass</i> a test.</p>

<p>To conclude, if you have any interest in participating in the
development of dieharder, be sure to let me know, especially if you have
decent C coding skills (including familiarity with Subversion and the
GSL) and a basic knowledge of statistics. I even have documents to help
with the latter, if you have the programming skills and want to LEARN
statistics. Bug reports or suggestions are also welcome.</p>

<p>Submit bug reports, etc. to</p>
<address>
 rgb at phy dot duke dot edu
</address>
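<p>For readers who want to see the general test layout described above
(a p-value per run, cumulating (say) 100 runs, and turning the resulting
KS test into an overall p-value), here is a rough Python sketch modeled
on the built-in description of the RGB lagged sums test. It is <i>not</i>
dieharder's C code. The function names (lagged_sum_pvalue,
ks_uniform_pvalue, lagged_sum_test) are made up for illustration,
Python's own random.Random stands in for a GSL generator, and the KS
p-value uses the standard asymptotic Kolmogorov series with a common
small-sample correction, which may differ from what dieharder does
internally.</p>

<pre>
```python
import math
import random


def lagged_sum_pvalue(rng, tsamples, lag):
    """One run: sum tsamples uniform deviates, skipping lag draws before each."""
    total = 0.0
    for _ in range(tsamples):
        for _ in range(lag):
            rng.random()  # discarded (lagged) deviate
        total += rng.random()
    mean = 0.5 * tsamples                # expected sum, per the test description
    sigma = math.sqrt(tsamples / 12.0)   # expected standard deviation of the sum
    z = (total - mean) / sigma
    # Upper-tail probability of a standard normal, computed with erfc.
    return 0.5 * math.erfc(z / math.sqrt(2.0))


def ks_uniform_pvalue(pvalues):
    """Asymptotic Kolmogorov-Smirnov p-value for uniformity on [0, 1]."""
    n = len(pvalues)
    ordered = sorted(pvalues)
    # Largest gap between the empirical CDF and the uniform CDF.
    d = max(max((i + 1) / n - p, p - i / n) for i, p in enumerate(ordered))
    x = (math.sqrt(n) + 0.12 + 0.11 / math.sqrt(n)) * d
    # Kolmogorov series: Q(x) = 2 * sum over k = 1, 2, ... of
    # (-1)^(k-1) * exp(-2 k^2 x^2), truncated at 100 terms.
    q = 2.0 * sum((-1) ** (k - 1) * math.exp(-2.0 * k * k * x * x)
                  for k in range(1, 101))
    return min(1.0, max(0.0, q))


def lagged_sum_test(seed=1, psamples=100, tsamples=1000, lag=2):
    """Collect psamples single-run p-values and reduce them with a KS test."""
    rng = random.Random(seed)
    pvalues = [lagged_sum_pvalue(rng, tsamples, lag) for _ in range(psamples)]
    return ks_uniform_pvalue(pvalues)


if __name__ == "__main__":
    print("overall p-value:", lagged_sum_test())
```
</pre>

<p>A good generator should give an overall p-value that is itself
unremarkable. Raising psamples or tsamples in this sketch plays the same
role as "cranking up" a test in dieharder: a weak generator is pushed
toward an unambiguously tiny overall p-value instead of a marginal
one.</p>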