1// $Id: poissonTables.src,v 1.2 2003/08/13 20:00:12 garren Exp $ 2// -*- C++ -*- 3// 4// poissonTables.src 5// 6// This program creates the data tables that RandPoisson will use to 7// interpolate from the flat random R to the poisson deviate N. 8// The tables are produced in files poissonTables.cdat. The data is 9// in the form of a row for each integer mu value, counting by 5's to 10// 100. Each row of the table presents 51 values of r, corresponding to 11// the lowest r such that the distribution ought to return N, for 51 12// consecutive integers N. 13// This is exp(-mu) times sum(mu**k/k!, k=0 to N-1). 14// 15// For mu of 30 and lower, these start at N=1; 16// for mu of 35 and higher these start at mu-30. 17// 18// At the end of this file is a lengthy comment describing just what is going 19// on mathematically. 20// 21// This file need be compiled only once at one location; the poissonTables.cdat 22// generated is completely portable. Therefore, one time only, 23// poissonTables.src has been renamed poissonTables.cc, and compiled and run 24// to create poissonTables.cdat. The source poissonTables.src is provided 25// with CLHEP as a mode of concrete documentation of how those numers were 26// computed. 27// 28// This file is stylistically not suitable for distribution as a reusable 29// module. 30// 31// 11/17/99 mf Created. 32// 11/18/99 mf Having been used (from test area) to create poissonTables.cdat, 33// poissonTables.cc was renamed .src and moved to Random area. 34// -------------------------------------------------------------- 35 36#include "CLHEP/Random/defs.h" 37#include <iostream> 38#include <fstream> 39#include <iomanip> 40 41#define IOS_OK 42#ifdef IOS_OK 43#include <ios.h> 44#endif 45 46using std::cout; 47 48 49// 50// Constants dictating the nature of the table are picked up from RandPoisson.h 51// 52 53#include "CLHEP/Random/RandPoisson.h" 54 55int main() { 56 57 // Open the output file 58 59 std::ofstream output ( "poissonTables.cdat", std::ios::out ); 60 61 // Write the table, one mu-value at a time 62 63 double mu; 64 for ( mu = FIRST_MU; mu <= LAST_MU; mu += S ) { 65 66 output << "\n//\n// mu = " << mu << "\n//\n"; 67 68 double exp_mu = exp(-mu); 69 int Nmin = mu - BELOW; 70 if ( Nmin < 1 ) {Nmin = 1;} 71 int Nmax = Nmin + ENTRIES - 1; 72 73 double cdf = 0; 74 double term = exp_mu; 75 int N; 76 77 // Compute exp_mu sum (mu**m/m!, m = 0 to Nmin-1. 78 // Leave term as the next term, mu**Nmin/Nmin! 79 80 N = 1; 81 while ( N < Nmin ) { 82 cdf += term; 83 term *= mu/N; 84 N++; 85 } 86 87 // Write out the values of cdf, in groups of 3 88 N = Nmin; 89 while (N <= Nmax) { 90 int j; 91 for ( j = 0; j < 3; j++ ) { 92 double cdf0 = cdf; 93 cdf += term; 94 // The following lines handle roundoff issues: 95 if ( (cdf == cdf0) && (cdf > (1 - 1.0E-14)) ) cdf = 1; 96 if (cdf > 1) cdf = 1; 97 // 98 term *= mu/N; 99 output.setf(ios::fixed,ios::floatfield); 100 output.precision(17); 101 output << std::setw(20) << cdf << ", "; 102 N++; 103 } 104 output.setf(0,ios::floatfield); 105 output.precision(4); 106 output << " // " << N-3 << " - " << N-1 << "\n"; 107 } 108 109 } 110 111 cout << " Tables completed \n"; 112 113 return 0; 114} 115 116 117#ifdef STRATEGY 118 119The task is, given a value r and a mean mu, to find the smallest value of 120N such that cdf (mu, N) >= r. 121 122The crucial mathematical enabler is that the sum of two Poisson deviates 123with means u and v is distributed as a Possion with mean (u+v). The idea 124is to generate r, LOOK UP the Poisson N for a value of mu slightly less than 125the actual value, and then generate a small-mean Poisson to add to that N to 126reflect the differnce between the actual mu and the tabulated mu. 127 128The algorithm, then, is: 129 1300) Generate one random, r. 131 1321) If mu > boundary, use the Gaussian approximation. We can share the normal 133 transform that RandGauss uses, to double the speed over a Box-Mueller. 134 This boundary can be pushed to quite a high value, since the only cost of 135 increasing it is table space. THe original CENRLIB used 88; we can use 100. 136 1372) If mu is very small, simply do the direct series summation. 138 1393) Consider row [mu/S] of the table; that is, the table contains values of 140 r_min vs N for mu = 0, S, 2S, 3S, ... 141 1424) In that table, find the largest Rentry below r. Note the index N1. 143 144 r - Rentry[n1] 1455) Form a new random, r' = ----------------------- . This is uniform on (0,1). 146 Rentry[n1+1]-Rentry[n1] 147 148 1496) Use the direct method to form Poisson deviate N2 from r'. 150 151 The new random r' has fewer bits of randomness than did r, but 152 only by about 1-7 bits (about 7 bits for selecting among 100 possible 153 N values), Since the second Poisson is for a small mu, this 154 distortion will harm the overall accuracy of the distribution 155 by somewhat less than 7 bits of loss. 156 1577) The result is N1 + N2. 158 159This method is precise to at least 46-bit accuracy, and can be made accurate 160to the full precision of your random engine at the cost of one extra random 161fired per deviate. 162 163Some wrinkles: 164 165-- When the table runs out -- 166 167The table should extend to one number more than the maximal N anticipated 168for most r. In rare cases, r will put you above that last r. In such 169cases, (r-last R) / (1-last R) will again be uniformly distributed, but it 170would be incorrect to simply this as the second Random because we don't know 171the correct value of N1. But we do know the sum of the N terms, and (from 172subtracting the next-to-last entry) the N-th term in the series. So we can 173start from that last Rentry, and continue to extend the sum out as far as is 174needed to get the deviate in one step. 175 176Does the table need to start from 1? 177 178There is in each mu-row an implied Rentry of 0 at N=0. For a given (assumedly 179fairly large) mu, it is not essential that the first Rentry in the table be 1. 180If r is less than that first Rentry, and the Rentry corresponds to N > 1, then 181the direct method can be used to compute the proper value. 182 183-- How big should the table be -- 184 185The penalty for the table not going far enough out in N for a given mu 186is that with some frequency r will exceed the last Rentry. Even in those 187cases, as long as the correct N is not very far from the end of the table, 188extending by the direct series is not a very serious matter. 189 190If a row of the table is to start from M > 1, the penalty is proportial to 191the miss frequency, times roughly M iterations of the direct series. Thus 192a miss by a little in this case is as bad as a miss by a lot. 193 194Thus, the table (for any given mu) ought to extend at least 2 sigma on either 195side of mu. So for mu = 100, you would want to go at least from 80-120. For 196large mu, it is more imprtant to provide coverage on the lower side, since 197the cost of a miss will be much bigger. For instance, at 2 sigma with mu=100, 198we can expect about 80 iterations of the series about 2.5% of the time, which 199adds .6 r-units just for the *low-miss* cases. On the high-end, the equivalent 200cost is about 15 times less. So 70-120 would make sense for mu = 100. We 201make the table 51 numbers wide. 202 203The penalty for making the spacing between mu values, S, too large is that 204mu is about S/2 when computing N2. My feeling is that a mu gap of 5 would 205be sensible, and that we could push that to 10 without serious regret. 206 207-- The size and shape of the table -- 208 209The tradeoffs can be expressed in terms of expected series iteration count: 210 211* Increasing S by a unit costs .5 iterations 212* Decreasing the first Rentry in one of the large mu rows 213 2 sigma to 2.5 sigma --> 1.3 iterations 214 2.5 sigma to 3 sigma --> .4 iterations 215 3.5 sigma to 4 sigma --> .1 iterations 216* Increasing the last Rentry in one of the large mu rows 217 2 sigma to 2.5 sigma --> .1 iterations 218 219There is some advantage in keeping the table rectangular. Since sigma goes 220as sqrt(mu), splitting off the lower mu values for less R entries would not 221gain much (perhaps 15% for a single split into two rectangular tables). We 222thus keep the table ractangular. 223 224Varying the starting N for each row lets us avoid a factor of Nmax/(Nmax-Nmin) 225which will turn out to be a factor of 2 or so; this is worthwhile since it 226costs almost no extra time. 227 228So the table is thus 20 by 51 doubles. 229 230-- Precomputing for default mu -- 231 232For the default mu, fire() can use precomputed N values. The trick is to 233prepare a table of N vs 100 r, along with =the r at which you go to the next 234N= and the series term at that point. This saves, fundamentally, the time 235needed to search for N, plus the time needed to do the N2 calculation (which 236includes an exp()). 237 238 239#endif 240