1:mod:`random` --- Generate pseudo-random numbers 2================================================ 3 4.. module:: random 5 :synopsis: Generate pseudo-random numbers with various common distributions. 6 7**Source code:** :source:`Lib/random.py` 8 9-------------- 10 11This module implements pseudo-random number generators for various 12distributions. 13 14For integers, there is uniform selection from a range. For sequences, there is 15uniform selection of a random element, a function to generate a random 16permutation of a list in-place, and a function for random sampling without 17replacement. 18 19On the real line, there are functions to compute uniform, normal (Gaussian), 20lognormal, negative exponential, gamma, and beta distributions. For generating 21distributions of angles, the von Mises distribution is available. 22 23Almost all module functions depend on the basic function :func:`.random`, which 24generates a random float uniformly in the semi-open range [0.0, 1.0). Python 25uses the Mersenne Twister as the core generator. It produces 53-bit precision 26floats and has a period of 2\*\*19937-1. The underlying implementation in C is 27both fast and threadsafe. The Mersenne Twister is one of the most extensively 28tested random number generators in existence. However, being completely 29deterministic, it is not suitable for all purposes, and is completely unsuitable 30for cryptographic purposes. 31 32The functions supplied by this module are actually bound methods of a hidden 33instance of the :class:`random.Random` class. You can instantiate your own 34instances of :class:`Random` to get generators that don't share state. 35 36Class :class:`Random` can also be subclassed if you want to use a different 37basic generator of your own devising: in that case, override the :meth:`~Random.random`, 38:meth:`~Random.seed`, :meth:`~Random.getstate`, and :meth:`~Random.setstate` methods. 39Optionally, a new generator can supply a :meth:`~Random.getrandbits` method --- this 40allows :meth:`randrange` to produce selections over an arbitrarily large range. 41 42The :mod:`random` module also provides the :class:`SystemRandom` class which 43uses the system function :func:`os.urandom` to generate random numbers 44from sources provided by the operating system. 45 46.. warning:: 47 48 The pseudo-random generators of this module should not be used for 49 security purposes. For security or cryptographic uses, see the 50 :mod:`secrets` module. 51 52.. seealso:: 53 54 M. Matsumoto and T. Nishimura, "Mersenne Twister: A 623-dimensionally 55 equidistributed uniform pseudorandom number generator", ACM Transactions on 56 Modeling and Computer Simulation Vol. 8, No. 1, January pp.3--30 1998. 57 58 59 `Complementary-Multiply-with-Carry recipe 60 <https://code.activestate.com/recipes/576707/>`_ for a compatible alternative 61 random number generator with a long period and comparatively simple update 62 operations. 63 64 65Bookkeeping functions 66--------------------- 67 68.. function:: seed(a=None, version=2) 69 70 Initialize the random number generator. 71 72 If *a* is omitted or ``None``, the current system time is used. If 73 randomness sources are provided by the operating system, they are used 74 instead of the system time (see the :func:`os.urandom` function for details 75 on availability). 76 77 If *a* is an int, it is used directly. 78 79 With version 2 (the default), a :class:`str`, :class:`bytes`, or :class:`bytearray` 80 object gets converted to an :class:`int` and all of its bits are used. 81 82 With version 1 (provided for reproducing random sequences from older versions 83 of Python), the algorithm for :class:`str` and :class:`bytes` generates a 84 narrower range of seeds. 85 86 .. versionchanged:: 3.2 87 Moved to the version 2 scheme which uses all of the bits in a string seed. 88 89.. function:: getstate() 90 91 Return an object capturing the current internal state of the generator. This 92 object can be passed to :func:`setstate` to restore the state. 93 94 95.. function:: setstate(state) 96 97 *state* should have been obtained from a previous call to :func:`getstate`, and 98 :func:`setstate` restores the internal state of the generator to what it was at 99 the time :func:`getstate` was called. 100 101 102.. function:: getrandbits(k) 103 104 Returns a Python integer with *k* random bits. This method is supplied with 105 the Mersenne Twister generator and some other generators may also provide it 106 as an optional part of the API. When available, :meth:`getrandbits` enables 107 :meth:`randrange` to handle arbitrarily large ranges. 108 109 110Functions for integers 111---------------------- 112 113.. function:: randrange(stop) 114 randrange(start, stop[, step]) 115 116 Return a randomly selected element from ``range(start, stop, step)``. This is 117 equivalent to ``choice(range(start, stop, step))``, but doesn't actually build a 118 range object. 119 120 The positional argument pattern matches that of :func:`range`. Keyword arguments 121 should not be used because the function may use them in unexpected ways. 122 123 .. versionchanged:: 3.2 124 :meth:`randrange` is more sophisticated about producing equally distributed 125 values. Formerly it used a style like ``int(random()*n)`` which could produce 126 slightly uneven distributions. 127 128.. function:: randint(a, b) 129 130 Return a random integer *N* such that ``a <= N <= b``. Alias for 131 ``randrange(a, b+1)``. 132 133 134Functions for sequences 135----------------------- 136 137.. function:: choice(seq) 138 139 Return a random element from the non-empty sequence *seq*. If *seq* is empty, 140 raises :exc:`IndexError`. 141 142.. function:: choices(population, weights=None, *, cum_weights=None, k=1) 143 144 Return a *k* sized list of elements chosen from the *population* with replacement. 145 If the *population* is empty, raises :exc:`IndexError`. 146 147 If a *weights* sequence is specified, selections are made according to the 148 relative weights. Alternatively, if a *cum_weights* sequence is given, the 149 selections are made according to the cumulative weights (perhaps computed 150 using :func:`itertools.accumulate`). For example, the relative weights 151 ``[10, 5, 30, 5]`` are equivalent to the cumulative weights 152 ``[10, 15, 45, 50]``. Internally, the relative weights are converted to 153 cumulative weights before making selections, so supplying the cumulative 154 weights saves work. 155 156 If neither *weights* nor *cum_weights* are specified, selections are made 157 with equal probability. If a weights sequence is supplied, it must be 158 the same length as the *population* sequence. It is a :exc:`TypeError` 159 to specify both *weights* and *cum_weights*. 160 161 The *weights* or *cum_weights* can use any numeric type that interoperates 162 with the :class:`float` values returned by :func:`random` (that includes 163 integers, floats, and fractions but excludes decimals). Weights are 164 assumed to be non-negative. 165 166 For a given seed, the :func:`choices` function with equal weighting 167 typically produces a different sequence than repeated calls to 168 :func:`choice`. The algorithm used by :func:`choices` uses floating 169 point arithmetic for internal consistency and speed. The algorithm used 170 by :func:`choice` defaults to integer arithmetic with repeated selections 171 to avoid small biases from round-off error. 172 173 .. versionadded:: 3.6 174 175 176.. function:: shuffle(x[, random]) 177 178 Shuffle the sequence *x* in place. 179 180 The optional argument *random* is a 0-argument function returning a random 181 float in [0.0, 1.0); by default, this is the function :func:`.random`. 182 183 To shuffle an immutable sequence and return a new shuffled list, use 184 ``sample(x, k=len(x))`` instead. 185 186 Note that even for small ``len(x)``, the total number of permutations of *x* 187 can quickly grow larger than the period of most random number generators. 188 This implies that most permutations of a long sequence can never be 189 generated. For example, a sequence of length 2080 is the largest that 190 can fit within the period of the Mersenne Twister random number generator. 191 192 193.. function:: sample(population, k) 194 195 Return a *k* length list of unique elements chosen from the population sequence 196 or set. Used for random sampling without replacement. 197 198 Returns a new list containing elements from the population while leaving the 199 original population unchanged. The resulting list is in selection order so that 200 all sub-slices will also be valid random samples. This allows raffle winners 201 (the sample) to be partitioned into grand prize and second place winners (the 202 subslices). 203 204 Members of the population need not be :term:`hashable` or unique. If the population 205 contains repeats, then each occurrence is a possible selection in the sample. 206 207 To choose a sample from a range of integers, use a :func:`range` object as an 208 argument. This is especially fast and space efficient for sampling from a large 209 population: ``sample(range(10000000), k=60)``. 210 211 If the sample size is larger than the population size, a :exc:`ValueError` 212 is raised. 213 214Real-valued distributions 215------------------------- 216 217The following functions generate specific real-valued distributions. Function 218parameters are named after the corresponding variables in the distribution's 219equation, as used in common mathematical practice; most of these equations can 220be found in any statistics text. 221 222 223.. function:: random() 224 225 Return the next random floating point number in the range [0.0, 1.0). 226 227 228.. function:: uniform(a, b) 229 230 Return a random floating point number *N* such that ``a <= N <= b`` for 231 ``a <= b`` and ``b <= N <= a`` for ``b < a``. 232 233 The end-point value ``b`` may or may not be included in the range 234 depending on floating-point rounding in the equation ``a + (b-a) * random()``. 235 236 237.. function:: triangular(low, high, mode) 238 239 Return a random floating point number *N* such that ``low <= N <= high`` and 240 with the specified *mode* between those bounds. The *low* and *high* bounds 241 default to zero and one. The *mode* argument defaults to the midpoint 242 between the bounds, giving a symmetric distribution. 243 244 245.. function:: betavariate(alpha, beta) 246 247 Beta distribution. Conditions on the parameters are ``alpha > 0`` and 248 ``beta > 0``. Returned values range between 0 and 1. 249 250 251.. function:: expovariate(lambd) 252 253 Exponential distribution. *lambd* is 1.0 divided by the desired 254 mean. It should be nonzero. (The parameter would be called 255 "lambda", but that is a reserved word in Python.) Returned values 256 range from 0 to positive infinity if *lambd* is positive, and from 257 negative infinity to 0 if *lambd* is negative. 258 259 260.. function:: gammavariate(alpha, beta) 261 262 Gamma distribution. (*Not* the gamma function!) Conditions on the 263 parameters are ``alpha > 0`` and ``beta > 0``. 264 265 The probability distribution function is:: 266 267 x ** (alpha - 1) * math.exp(-x / beta) 268 pdf(x) = -------------------------------------- 269 math.gamma(alpha) * beta ** alpha 270 271 272.. function:: gauss(mu, sigma) 273 274 Gaussian distribution. *mu* is the mean, and *sigma* is the standard 275 deviation. This is slightly faster than the :func:`normalvariate` function 276 defined below. 277 278 279.. function:: lognormvariate(mu, sigma) 280 281 Log normal distribution. If you take the natural logarithm of this 282 distribution, you'll get a normal distribution with mean *mu* and standard 283 deviation *sigma*. *mu* can have any value, and *sigma* must be greater than 284 zero. 285 286 287.. function:: normalvariate(mu, sigma) 288 289 Normal distribution. *mu* is the mean, and *sigma* is the standard deviation. 290 291 292.. function:: vonmisesvariate(mu, kappa) 293 294 *mu* is the mean angle, expressed in radians between 0 and 2\*\ *pi*, and *kappa* 295 is the concentration parameter, which must be greater than or equal to zero. If 296 *kappa* is equal to zero, this distribution reduces to a uniform random angle 297 over the range 0 to 2\*\ *pi*. 298 299 300.. function:: paretovariate(alpha) 301 302 Pareto distribution. *alpha* is the shape parameter. 303 304 305.. function:: weibullvariate(alpha, beta) 306 307 Weibull distribution. *alpha* is the scale parameter and *beta* is the shape 308 parameter. 309 310 311Alternative Generator 312--------------------- 313 314.. class:: Random([seed]) 315 316 Class that implements the default pseudo-random number generator used by the 317 :mod:`random` module. 318 319.. class:: SystemRandom([seed]) 320 321 Class that uses the :func:`os.urandom` function for generating random numbers 322 from sources provided by the operating system. Not available on all systems. 323 Does not rely on software state, and sequences are not reproducible. Accordingly, 324 the :meth:`seed` method has no effect and is ignored. 325 The :meth:`getstate` and :meth:`setstate` methods raise 326 :exc:`NotImplementedError` if called. 327 328 329Notes on Reproducibility 330------------------------ 331 332Sometimes it is useful to be able to reproduce the sequences given by a pseudo 333random number generator. By re-using a seed value, the same sequence should be 334reproducible from run to run as long as multiple threads are not running. 335 336Most of the random module's algorithms and seeding functions are subject to 337change across Python versions, but two aspects are guaranteed not to change: 338 339* If a new seeding method is added, then a backward compatible seeder will be 340 offered. 341 342* The generator's :meth:`~Random.random` method will continue to produce the same 343 sequence when the compatible seeder is given the same seed. 344 345.. _random-examples: 346 347Examples and Recipes 348-------------------- 349 350Basic examples:: 351 352 >>> random() # Random float: 0.0 <= x < 1.0 353 0.37444887175646646 354 355 >>> uniform(2.5, 10.0) # Random float: 2.5 <= x < 10.0 356 3.1800146073117523 357 358 >>> expovariate(1 / 5) # Interval between arrivals averaging 5 seconds 359 5.148957571865031 360 361 >>> randrange(10) # Integer from 0 to 9 inclusive 362 7 363 364 >>> randrange(0, 101, 2) # Even integer from 0 to 100 inclusive 365 26 366 367 >>> choice(['win', 'lose', 'draw']) # Single random element from a sequence 368 'draw' 369 370 >>> deck = 'ace two three four'.split() 371 >>> shuffle(deck) # Shuffle a list 372 >>> deck 373 ['four', 'two', 'ace', 'three'] 374 375 >>> sample([10, 20, 30, 40, 50], k=4) # Four samples without replacement 376 [40, 10, 50, 30] 377 378Simulations:: 379 380 >>> # Six roulette wheel spins (weighted sampling with replacement) 381 >>> choices(['red', 'black', 'green'], [18, 18, 2], k=6) 382 ['red', 'green', 'black', 'black', 'red', 'black'] 383 384 >>> # Deal 20 cards without replacement from a deck of 52 playing cards 385 >>> # and determine the proportion of cards with a ten-value 386 >>> # (a ten, jack, queen, or king). 387 >>> deck = collections.Counter(tens=16, low_cards=36) 388 >>> seen = sample(list(deck.elements()), k=20) 389 >>> seen.count('tens') / 20 390 0.15 391 392 >>> # Estimate the probability of getting 5 or more heads from 7 spins 393 >>> # of a biased coin that settles on heads 60% of the time. 394 >>> def trial(): 395 ... return choices('HT', cum_weights=(0.60, 1.00), k=7).count('H') >= 5 396 ... 397 >>> sum(trial() for i in range(10_000)) / 10_000 398 0.4169 399 400 >>> # Probability of the median of 5 samples being in middle two quartiles 401 >>> def trial(): 402 ... return 2_500 <= sorted(choices(range(10_000), k=5))[2] < 7_500 403 ... 404 >>> sum(trial() for i in range(10_000)) / 10_000 405 0.7958 406 407Example of `statistical bootstrapping 408<https://en.wikipedia.org/wiki/Bootstrapping_(statistics)>`_ using resampling 409with replacement to estimate a confidence interval for the mean of a sample:: 410 411 # http://statistics.about.com/od/Applications/a/Example-Of-Bootstrapping.htm 412 from statistics import fmean as mean 413 from random import choices 414 415 data = [41, 50, 29, 37, 81, 30, 73, 63, 20, 35, 68, 22, 60, 31, 95] 416 means = sorted(mean(choices(data, k=len(data))) for i in range(100)) 417 print(f'The sample mean of {mean(data):.1f} has a 90% confidence ' 418 f'interval from {means[5]:.1f} to {means[94]:.1f}') 419 420Example of a `resampling permutation test 421<https://en.wikipedia.org/wiki/Resampling_(statistics)#Permutation_tests>`_ 422to determine the statistical significance or `p-value 423<https://en.wikipedia.org/wiki/P-value>`_ of an observed difference 424between the effects of a drug versus a placebo:: 425 426 # Example from "Statistics is Easy" by Dennis Shasha and Manda Wilson 427 from statistics import fmean as mean 428 from random import shuffle 429 430 drug = [54, 73, 53, 70, 73, 68, 52, 65, 65] 431 placebo = [54, 51, 58, 44, 55, 52, 42, 47, 58, 46] 432 observed_diff = mean(drug) - mean(placebo) 433 434 n = 10_000 435 count = 0 436 combined = drug + placebo 437 for i in range(n): 438 shuffle(combined) 439 new_diff = mean(combined[:len(drug)]) - mean(combined[len(drug):]) 440 count += (new_diff >= observed_diff) 441 442 print(f'{n} label reshufflings produced only {count} instances with a difference') 443 print(f'at least as extreme as the observed difference of {observed_diff:.1f}.') 444 print(f'The one-sided p-value of {count / n:.4f} leads us to reject the null') 445 print(f'hypothesis that there is no difference between the drug and the placebo.') 446 447Simulation of arrival times and service deliveries for a multiserver queue:: 448 449 from heapq import heappush, heappop 450 from random import expovariate, gauss 451 from statistics import mean, median, stdev 452 453 average_arrival_interval = 5.6 454 average_service_time = 15.0 455 stdev_service_time = 3.5 456 num_servers = 3 457 458 waits = [] 459 arrival_time = 0.0 460 servers = [0.0] * num_servers # time when each server becomes available 461 for i in range(100_000): 462 arrival_time += expovariate(1.0 / average_arrival_interval) 463 next_server_available = heappop(servers) 464 wait = max(0.0, next_server_available - arrival_time) 465 waits.append(wait) 466 service_duration = gauss(average_service_time, stdev_service_time) 467 service_completed = arrival_time + wait + service_duration 468 heappush(servers, service_completed) 469 470 print(f'Mean wait: {mean(waits):.1f}. Stdev wait: {stdev(waits):.1f}.') 471 print(f'Median wait: {median(waits):.1f}. Max wait: {max(waits):.1f}.') 472 473.. seealso:: 474 475 `Statistics for Hackers <https://www.youtube.com/watch?v=Iq9DzN6mvYA>`_ 476 a video tutorial by 477 `Jake Vanderplas <https://us.pycon.org/2016/speaker/profile/295/>`_ 478 on statistical analysis using just a few fundamental concepts 479 including simulation, sampling, shuffling, and cross-validation. 480 481 `Economics Simulation 482 <http://nbviewer.jupyter.org/url/norvig.com/ipython/Economics.ipynb>`_ 483 a simulation of a marketplace by 484 `Peter Norvig <http://norvig.com/bio.html>`_ that shows effective 485 use of many of the tools and distributions provided by this module 486 (gauss, uniform, sample, betavariate, choice, triangular, and randrange). 487 488 `A Concrete Introduction to Probability (using Python) 489 <http://nbviewer.jupyter.org/url/norvig.com/ipython/Probability.ipynb>`_ 490 a tutorial by `Peter Norvig <http://norvig.com/bio.html>`_ covering 491 the basics of probability theory, how to write simulations, and 492 how to perform data analysis using Python. 493