Performance
-----------

.. currentmodule:: numpy.random

Recommendation
**************
The recommended generator for general use is `PCG64`. It is
statistically high quality, full-featured, and fast on most platforms, but
somewhat slow when compiled for 32-bit processes.

`Philox` is fairly slow, but its statistical properties are of
very high quality, and it is easy to get assuredly independent streams by using
unique keys. If that is the style you wish to use for parallel streams, or you
are porting from another system that uses that style, then
`Philox` is your choice.

`SFC64` is statistically high quality and very fast. However, it
lacks jumpability. If you are not using that capability and want lots of speed,
even on 32-bit processes, this is your choice.

`MT19937` `fails some statistical tests`_ and is not especially
fast compared to modern PRNGs. For these reasons, we mostly do not recommend
using it on its own, only through the legacy `~.RandomState` for
reproducing old results. That said, it has a very long history as a default in
many systems.

.. _`fails some statistical tests`: https://www.iro.umontreal.ca/~lecuyer/myftp/papers/testu01.pdf
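
As a minimal sketch of the recommendations above, the snippet below builds one
`Generator` per use case. The seed, the base key, and the number of workers are
arbitrary values chosen for illustration; they do not come from the benchmarks
on this page.

.. code-block:: python

    from numpy.random import Generator, MT19937, PCG64, Philox, RandomState, SFC64

    seed = 12345  # arbitrary illustrative seed

    # General-purpose default.
    rng = Generator(PCG64(seed))

    # Philox: one stream per worker, each created with its own unique key.
    # `base_key` and `n_workers` are hypothetical values for this sketch.
    base_key = 2**96
    n_workers = 4
    streams = [Generator(Philox(key=base_key + i)) for i in range(n_workers)]
    draws = [stream.random(3) for stream in streams]

    # SFC64: very fast, but without jumpability.
    fast_rng = Generator(SFC64(seed))

    # MT19937 through the legacy interface, for reproducing old results only.
    legacy = RandomState(MT19937(seed))

A key passed to `Philox` is set directly rather than being processed like a
seed, so it is up to the caller to keep the keys unique across workers.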

Timings
*******

The timings below are the time in ns to produce 1 random value from a
specific distribution.  The original `MT19937` generator is
much slower since it requires two 32-bit values to match the 64-bit output of
the faster generators.

Integer performance has a similar ordering.

The pattern is similar for other, more complex distributions. The normal
performance of the legacy `RandomState` generator is much
lower than the others since it uses the Box-Muller transformation rather
than the Ziggurat method. The performance gap for Exponentials is also
large due to the cost of computing the log function to invert the CDF.
The column labeled MT19937 uses the same 32-bit generator as
`RandomState` but produces random values using
`Generator`.

.. csv-table::
    :header: ,MT19937,PCG64,Philox,SFC64,RandomState
    :widths: 14,14,14,14,14,14

    32-bit Unsigned Ints,3.2,2.7,4.9,2.7,3.2
    64-bit Unsigned Ints,5.6,3.7,6.3,2.9,5.7
    Uniforms,7.3,4.1,8.1,3.1,7.3
    Normals,13.1,10.2,13.5,7.8,34.6
    Exponentials,7.9,5.4,8.5,4.1,40.3
    Gammas,34.8,28.0,34.7,25.1,58.1
    Binomials,25.0,21.4,26.1,19.5,25.2
    Laplaces,45.1,40.7,45.5,38.1,45.6
    Poissons,67.6,52.4,69.2,46.4,78.1
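
Per-value timings of this kind can be approximated with the standard library's
``timeit`` module. The sketch below is not the benchmark script behind the
table; the helper ``ns_per_value``, the sample size, and the repeat count are
assumptions chosen to keep the example short, and absolute numbers will differ
from machine to machine.

.. code-block:: python

    import timeit

    from numpy.random import Generator, MT19937, PCG64, Philox, SFC64


    def ns_per_value(bit_generator_cls, draw, size=100_000, repeat=5):
        """Return the best observed time, in ns, to produce one value."""
        rng = Generator(bit_generator_cls(0))
        # Time one vectorized call per repetition and keep the fastest run.
        best = min(timeit.repeat(lambda: draw(rng, size), number=1, repeat=repeat))
        return best / size * 1e9


    # A few of the distributions from the table, drawn through `Generator`.
    draws = {
        "Uniforms": lambda rng, n: rng.random(n),
        "Normals": lambda rng, n: rng.standard_normal(n),
        "Exponentials": lambda rng, n: rng.standard_exponential(n),
    }

    results = {}
    for cls in (MT19937, PCG64, Philox, SFC64):
        for name, draw in draws.items():
            results[(cls.__name__, name)] = ns_per_value(cls, draw)

    for (generator_name, distribution), ns in results.items():
        print(f"{generator_name:8s} {distribution:13s} {ns:6.1f} ns")

Taking the minimum over several repetitions reduces the influence of
unrelated system activity on the measurement.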

The next table presents the performance as a percentage relative to values
generated by the legacy generator, ``RandomState(MT19937())``, so values above
100 are faster than the legacy generator. The overall
performance was computed using a geometric mean.

.. csv-table::
    :header: ,MT19937,PCG64,Philox,SFC64
    :widths: 14,14,14,14,14

    32-bit Unsigned Ints,101,121,67,121
    64-bit Unsigned Ints,102,156,91,199
    Uniforms,100,179,90,235
    Normals,263,338,257,443
    Exponentials,507,752,474,985
    Gammas,167,207,167,231
    Binomials,101,118,96,129
    Laplaces,101,112,100,120
    Poissons,116,149,113,168
    Overall,144,192,132,225
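
The relationship between the two tables can be sketched as follows, using the
MT19937 and RandomState columns of the timing table. The formula
``100 * legacy_time / generator_time`` is inferred from the published numbers
rather than taken from the benchmark code, and because the published timings
are rounded, the recomputed percentages only approximately match the table.

.. code-block:: python

    import numpy as np

    # Time per value in ns, copied from the timing table (rows in table order).
    mt19937_ns = np.array([3.2, 5.6, 7.3, 13.1, 7.9, 34.8, 25.0, 45.1, 67.6])
    randomstate_ns = np.array([3.2, 5.7, 7.3, 34.6, 40.3, 58.1, 25.2, 45.6, 78.1])

    # Relative performance in percent: above 100 means faster than RandomState.
    relative = 100.0 * randomstate_ns / mt19937_ns

    # Geometric mean, computed in log space.
    overall = np.exp(np.mean(np.log(relative)))

    print(np.round(relative).astype(int))
    print(round(float(overall), 1))

A geometric mean is a natural fit for ratios like these because it weights a
doubling and a halving of speed symmetrically.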

.. note::

   All timings were taken using Linux on an i5-3570 processor.

Performance on different Operating Systems
******************************************
Performance differs across platforms because of differences in compilers and
hardware (e.g., register width). The default bit generator has been chosen
to perform well on 64-bit platforms.  Performance on 32-bit operating systems
is very different.
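
Code that has to run in both 32-bit and 64-bit processes could check the
pointer width of the running interpreter before picking a bit generator. The
selection policy below is only an illustration based on the recommendations on
this page, not something NumPy does automatically, and the seed is arbitrary.

.. code-block:: python

    import struct

    from numpy.random import Generator, PCG64, SFC64

    # Size of a C pointer in the running interpreter: 32 or 64 bits.
    pointer_bits = struct.calcsize("P") * 8

    # Illustrative policy: PCG64 on 64-bit processes, SFC64 otherwise.
    # SFC64 lacks jumpability, so this only suits code that does not need it.
    bit_generator_cls = PCG64 if pointer_bits >= 64 else SFC64
    rng = Generator(bit_generator_cls(2024))  # arbitrary illustrative seed

    print(pointer_bits, bit_generator_cls.__name__)

Switching the bit generator changes the random stream, so results produced in
32-bit and 64-bit processes under this policy are not comparable value by value.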

The values reported are normalized relative to the speed of MT19937 in
each table. A value of 100 indicates that the performance matches that of
MT19937. Higher values indicate improved performance. These values cannot be
compared across tables.

64-bit Linux
~~~~~~~~~~~~

===================   =========  =======  ========  =======
Distribution            MT19937    PCG64    Philox    SFC64
===================   =========  =======  ========  =======
32-bit Unsigned Int         100    119.8      67.7    120.2
64-bit Unsigned Int         100    152.9      90.8    213.3
Uniforms                    100    179.0      87.0    232.0
Normals                     100    128.5      99.2    167.8
Exponentials                100    148.3      93.0    189.3
**Overall**                 100    144.3      86.8    180.0
===================   =========  =======  ========  =======

64-bit Windows
~~~~~~~~~~~~~~
The relative performance on 64-bit Linux and 64-bit Windows is broadly similar,
with the notable exception of Philox, which is much slower relative to MT19937
on Windows.

===================   =========  =======  ========  =======
Distribution            MT19937    PCG64    Philox    SFC64
===================   =========  =======  ========  =======
32-bit Unsigned Int         100    129.1      35.0    135.0
64-bit Unsigned Int         100    146.9      35.7    176.5
Uniforms                    100    165.0      37.0    192.0
Normals                     100    128.5      48.5    158.0
Exponentials                100    151.6      39.0    172.8
**Overall**                 100    143.6      38.7    165.7
===================   =========  =======  ========  =======

32-bit Windows
~~~~~~~~~~~~~~

The performance of 64-bit generators on 32-bit Windows is much lower than on
64-bit operating systems due to register width. MT19937, the generator that has
been in NumPy since 2005, operates on 32-bit integers, so it does not pay this
penalty.

===================   =========  =======  ========  =======
Distribution            MT19937    PCG64    Philox    SFC64
===================   =========  =======  ========  =======
32-bit Unsigned Int         100     30.5      21.1     77.9
64-bit Unsigned Int         100     26.3      19.2     97.0
Uniforms                    100     28.0      23.0    106.0
Normals                     100     40.1      31.3    112.6
Exponentials                100     33.7      26.3    109.8
**Overall**                 100     31.4      23.8     99.8
===================   =========  =======  ========  =======

.. note::

   Linux timings used Ubuntu 18.04 and GCC 7.4.  Windows timings were made on
   Windows 10 using Microsoft C/C++ Optimizing Compiler Version 19 (Visual
   Studio 2015). All timings were produced on an i5-3570 processor.