1// $Id: poissonTables.src,v 1.2 2003/08/13 20:00:12 garren Exp $
2// -*- C++ -*-
3//
4// poissonTables.src
5//
6// This program creates the data tables that RandPoisson will use to
7// interpolate from the flat random R to the poisson deviate N.
8// The tables are produced in files poissonTables.cdat.  The data is
9// in the form of a row for each integer mu value, counting by 5's to
10// 100.  Each row of the table presents 51 values of r, corresponding to
11// the lowest r such that the distribution ought to return N, for 51
12// consecutive integers N.
13// This is exp(-mu) times sum(mu**k/k!, k=0 to N-1).
14//
15// For mu of 30 and lower, these start at N=1;
16// for mu of 35 and higher these start at mu-30.
17//
18// At the end of this file is a lengthy comment describing just what is going
19// on mathematically.
20//
21// This file need be compiled only once at one location; the poissonTables.cdat
22// generated is completely portable.  Therefore, one time only,
23// poissonTables.src has been renamed poissonTables.cc, and compiled and run
24// to create poissonTables.cdat.  The source poissonTables.src is provided
25// with CLHEP as a mode of concrete documentation of how those numers were
26// computed.
27//
28// This file is stylistically not suitable for distribution as a reusable
29// module.
30//
31// 11/17/99  mf	Created.
32// 11/18/99  mf	Having been used (from test area) to create poissonTables.cdat,
33//		poissonTables.cc was renamed .src and moved to Random area.
34// --------------------------------------------------------------
35
36#include "CLHEP/Random/defs.h"
37#include <iostream>
38#include <fstream>
39#include <iomanip>
40
41#define IOS_OK
42#ifdef  IOS_OK
43#include <ios.h>
44#endif
45
46using std::cout;
47
48
49//
50// Constants dictating the nature of the table are picked up from RandPoisson.h
51//
52
53#include "CLHEP/Random/RandPoisson.h"
54
55int main() {
56
57  // Open the output file
58
59  std::ofstream output ( "poissonTables.cdat", std::ios::out );
60
61  // Write the table, one mu-value at a time
62
63  double mu;
64  for (  mu = FIRST_MU; mu <= LAST_MU; mu += S ) {
65
66    output << "\n//\n// mu = " << mu << "\n//\n";
67
68    double exp_mu = exp(-mu);
69    int Nmin = mu - BELOW;
70    if ( Nmin < 1 ) {Nmin = 1;}
71    int Nmax = Nmin + ENTRIES - 1;
72
73    double cdf  = 0;
74    double term = exp_mu;
75    int N;
76
77    // Compute exp_mu sum (mu**m/m!, m = 0 to Nmin-1.
78    // Leave term as the next term, mu**Nmin/Nmin!
79
80    N = 1;
81    while ( N < Nmin )  {
82      cdf += term;
83      term *= mu/N;
84      N++;
85    }
86
87    // Write out the values of cdf, in groups of 3
88    N = Nmin;
89    while (N <= Nmax) {
90      int   j;
91      for ( j = 0; j < 3; j++ ) {
92	double cdf0 = cdf;
93	cdf += term;
94	// The following lines handle roundoff issues:
95	  if ( (cdf == cdf0) && (cdf > (1 - 1.0E-14)) ) cdf = 1;
96	  if (cdf > 1) cdf = 1;
97	//
98        term *= mu/N;
99        output.setf(ios::fixed,ios::floatfield);
100        output.precision(17);
101        output << std::setw(20) << cdf << ", ";
102	N++;
103      }
104      output.setf(0,ios::floatfield);
105      output.precision(4);
106      output << " // " << N-3 << " - " << N-1 << "\n";
107    }
108
109  }
110
111  cout << " Tables completed \n";
112
113  return 0;
114}
115
116
117#ifdef STRATEGY
118
119The task is, given a value r and a mean mu, to find the smallest value of
120N such that cdf (mu, N) >= r.
121
122The crucial mathematical enabler is that the sum of two Poisson deviates
123with means u and v is distributed as a Possion with mean (u+v).  The idea
124is to generate r, LOOK UP the Poisson N for a value of mu slightly less than
125the actual value, and then generate a small-mean Poisson to add to that N to
126reflect the differnce between the actual mu and the tabulated mu.
127
128The algorithm, then, is:
129
1300) Generate one random, r.
131
1321) If mu > boundary, use the Gaussian approximation.  We can share the normal
133   transform that RandGauss uses, to double the speed over a Box-Mueller.
134   This boundary can be pushed to quite a high value, since the only cost of
135   increasing it is table space.  THe original CENRLIB used 88; we can use 100.
136
1372) If mu is very small, simply do the direct series summation.
138
1393) Consider row [mu/S] of the table; that is, the table contains values of
140   r_min vs N for mu = 0, S, 2S, 3S, ...
141
1424) In that table, find the largest Rentry below r.  Note the index N1.
143
144                               r - Rentry[n1]
1455) Form a new random, r' = ----------------------- . This is uniform on (0,1).
146                           Rentry[n1+1]-Rentry[n1]
147
148
1496) Use the direct method to form Poisson deviate N2 from r'.
150
151	The new random r' has fewer bits of randomness than did r, but
152	only by about 1-7 bits (about 7 bits for selecting among 100 possible
153	N values),  Since the second Poisson is for a small mu, this
154	distortion will harm the overall accuracy of the distribution
155	by somewhat less than 7 bits of loss.
156
1577) The result is N1 + N2.
158
159This method is precise to at least 46-bit accuracy, and can be made accurate
160to the full precision of your random engine at the cost of one extra random
161fired per deviate.
162
163Some wrinkles:
164
165-- When the table runs out --
166
167The table should extend to one number more than the maximal N anticipated
168for most r.  In rare cases, r will put you above that last r.  In such
169cases, (r-last R) / (1-last R) will again be uniformly distributed, but it
170would be incorrect to simply this as the second Random because we don't know
171the correct value of N1.  But we do know the sum of the N terms, and (from
172subtracting the next-to-last entry) the N-th term in the series.  So we can
173start from that last Rentry, and continue to extend the sum out as far as is
174needed to get the deviate in one step.
175
176Does the table need to start from 1?
177
178There is in each mu-row an implied Rentry of 0 at N=0.  For a given (assumedly
179fairly large) mu, it is not essential that the first Rentry in the table be 1.
180If r is less than that first Rentry, and the Rentry corresponds to N > 1, then
181the direct method can be used to compute the proper value.
182
183-- How big should the table be --
184
185The penalty for the table not going far enough out in N for a given mu
186is that with some frequency r will exceed the last Rentry.  Even in those
187cases, as long as the correct N is not very far from the end of the table,
188extending by the direct series is not a very serious matter.
189
190If a row of the table is to start from M > 1, the penalty is proportial to
191the miss frequency, times roughly M iterations of the direct series.  Thus
192a miss by a little in this case is as bad as a miss by a lot.
193
194Thus, the table (for any given mu) ought to extend at least 2 sigma on either
195side of mu.  So for mu = 100, you would want to go at least from 80-120.  For
196large mu, it is more imprtant to provide coverage on the lower side, since
197the cost of a miss will be much bigger.  For instance, at 2 sigma with mu=100,
198we can expect about 80 iterations of the series about 2.5% of the time, which
199adds .6 r-units just for the *low-miss* cases.  On the high-end, the equivalent
200cost is about 15 times less.  So 70-120 would make sense for mu = 100.  We
201make the table 51 numbers wide.
202
203The penalty for making the spacing between mu values, S, too large is that
204mu is about S/2 when computing N2.  My feeling is that a mu gap of 5 would
205be sensible, and that we could push that to 10 without serious regret.
206
207-- The size and shape of the table --
208
209The tradeoffs can be expressed in terms of expected series iteration count:
210
211* Increasing S by a unit costs .5 iterations
212* Decreasing the first Rentry in one of the large mu rows
213	2 sigma to 2.5 sigma --> 1.3 iterations
214	2.5 sigma to 3 sigma -->  .4 iterations
215	3.5 sigma to 4 sigma -->  .1 iterations
216* Increasing the last Rentry in one of the large mu rows
217	2 sigma to 2.5 sigma -->  .1 iterations
218
219There is some advantage in keeping the table rectangular.  Since sigma goes
220as sqrt(mu), splitting off the lower mu values for less R entries would not
221gain much (perhaps 15% for a single split into two rectangular tables).  We
222thus keep the table ractangular.
223
224Varying the starting N for each row lets us avoid a factor of Nmax/(Nmax-Nmin)
225which will turn out to be a factor of 2 or so; this is worthwhile since it
226costs almost no extra time.
227
228So the table is thus 20 by 51 doubles.
229
230-- Precomputing for default mu --
231
232For the default mu, fire() can use precomputed N values.  The trick is to
233prepare a table of N vs 100 r, along with =the r at which you go to the next
234N= and the series term at that point.  This saves, fundamentally, the time
235needed to search for N, plus the time needed to do the N2 calculation (which
236includes an exp()).
237
238
239#endif
240