.. _methods_tallies:

=======
Tallies
=======

Note that the methods discussed in this section are written specifically for
continuous-energy mode, but equivalent methods apply to multi-group mode if
the particle's energy is replaced with the particle's group.

------------------
Filters and Scores
------------------

The tally capability in OpenMC follows a philosophy similar to that employed in
the MC21_ Monte Carlo code to give maximum flexibility in specifying tallies
while still maintaining scalability. Any tally in a Monte Carlo simulation can
be written in the following form:

.. math::
    :label: tally-integral

    X = \underbrace{\int d\mathbf{r} \int d\mathbf{\Omega} \int
    dE}_{\text{filters}} \underbrace{f(\mathbf{r}, \mathbf{\Omega},
    E)}_{\text{scores}} \psi (\mathbf{r}, \mathbf{\Omega}, E)

A user can specify one or more filters which identify which regions of phase
space should score to a given tally (the limits of integration as shown in
equation :eq:`tally-integral`) as well as the scoring function (:math:`f` in
equation :eq:`tally-integral`). For example, if the desired tally were the
:math:`(n,\gamma)` reaction rate in a fuel pin, the filter would specify the
cell which contains the fuel pin and the scoring function would be the radiative
capture macroscopic cross section. The following quantities can be scored in
OpenMC: flux, total reaction rate, scattering reaction rate, neutron production
from scattering, higher scattering moments, :math:`(n,xn)` reaction rates,
absorption reaction rate, fission reaction rate, neutron production rate from
fission, and surface currents. The following variables can be used as filters:
universe, material, cell, birth cell, surface, mesh, pre-collision energy,
post-collision energy, polar angle, azimuthal angle, and the cosine of the
change-in-angle due to a scattering event.

With filters for pre- and post-collision energy and scoring functions for
scattering and fission production, it is possible to use OpenMC to generate
cross sections with user-defined group structures. These multigroup cross
sections can subsequently be used in deterministic solvers such as coarse mesh
finite difference (CMFD) diffusion.
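
As an illustrative sketch of this workflow using the OpenMC Python API (the
``fuel_cell`` object and the two-group structure here are hypothetical), a
pair of tallies from which a group-wise capture cross section can be formed
might look like:

.. code-block:: python

    import openmc

    # Hypothetical two-group structure (energies in eV)
    group_edges = [0.0, 0.625, 20.0e6]

    # Flux in the fuel cell, binned by incident energy group
    flux_tally = openmc.Tally(name='flux')
    flux_tally.filters = [openmc.CellFilter(fuel_cell),
                          openmc.EnergyFilter(group_edges)]
    flux_tally.scores = ['flux']

    # (n,gamma) reaction rate with the same filters
    capture_tally = openmc.Tally(name='capture')
    capture_tally.filters = [openmc.CellFilter(fuel_cell),
                             openmc.EnergyFilter(group_edges)]
    capture_tally.scores = ['(n,gamma)']

Dividing the reaction-rate tally by the flux tally in each group then yields
the multigroup capture cross section.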

------------------------------
Using Maps for Filter-Matching
------------------------------

Some Monte Carlo codes suffer severe performance penalties when tallying a large
number of quantities. Care must be taken to ensure that a tally system scales
well with the total number of tally bins. In OpenMC, a mapping technique is used
that allows for a fast determination of which tally/bin combinations a given
particle's phase space coordinates should score to. For each discrete filter
variable, a list is stored that contains the tally/bin combinations that could
be scored to for each value of the filter variable. If a particle is in cell
:math:`n`, the mapping would identify what tally/bin combinations specify cell
:math:`n` for the cell filter variable. In this manner, it is not necessary to
check the phase space variables against each tally. Note that this technique
only applies to discrete filter variables and cannot be applied to energy,
angle, or change-in-angle bins. For these filters, it is necessary to perform
a binary search on the specified energy grid.
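
The following sketch (plain Python for illustration; OpenMC's actual
implementation differs) shows the idea behind the map:

.. code-block:: python

    # Map each value of a discrete filter variable (here, a cell ID) to the
    # tally/bin combinations it can score to. The map is built once, before
    # transport begins. The IDs and combinations below are made up.
    cell_map = {
        10: [(0, 2), (3, 0)],  # cell 10 scores to tally 0, bin 2 and tally 3, bin 0
        11: [(0, 3)],          # cell 11 scores only to tally 0, bin 3
    }

    def combinations_to_score(cell_id):
        """Look up tally/bin combinations for a particle in the given cell,
        avoiding a check of the particle against every tally."""
        return cell_map.get(cell_id, [])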

-----------------------------------------
Volume-Integrated Flux and Reaction Rates
-----------------------------------------

One quantity we may wish to compute during the course of a Monte Carlo
simulation is the flux or a reaction rate integrated over a finite volume. The
volume may be a particular cell, a collection of cells, or the entire
geometry. There are various methods by which we can estimate these quantities.

Analog Estimator
----------------

The analog estimator is the simplest type of estimator for reaction rates. The
basic idea is that we simply count the number of actual reactions that take
place and use that as our estimate for the reaction rate. This can be written
mathematically as

.. math::
    :label: analog-estimator

    R_x = \frac{1}{W} \sum_{i \in A} w_i

where :math:`R_x` is the reaction rate for reaction :math:`x`, :math:`i` denotes
an index for each event, :math:`A` is the set of all events resulting in
reaction :math:`x`, :math:`W` is the total starting weight of the particles, and
:math:`w_i` is the pre-collision weight of the particle as it enters event
:math:`i`. One should note that equation :eq:`analog-estimator` is
volume-integrated, so if we want a volume-averaged quantity, we need to divide
by the volume of the region of integration. If survival biasing is employed, the
analog estimator cannot be used for any reactions with zero neutrons in the exit
channel.
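
As a minimal numerical sketch of equation :eq:`analog-estimator` (plain NumPy,
not OpenMC's implementation), the analog estimate is just a weighted event
count:

.. code-block:: python

    import numpy as np

    def analog_estimate(weights, total_starting_weight):
        """Analog estimate of a reaction rate: sum the pre-collision weights
        of all events that resulted in reaction x and divide by the total
        starting weight W."""
        return np.sum(weights) / total_starting_weight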

Collision Estimator
-------------------

While the analog estimator is conceptually very simple and easy to implement, it
can suffer from higher variance due to the fact that low-probability events will
not occur often enough to give good statistics if they are being tallied. Thus,
it is desirable to use a different estimator that allows us to score to the
tally more often. One such estimator is the collision estimator. Instead of
tallying a reaction only when it happens, the idea is to make a contribution to
the tally at every collision.

We can start by writing a formula for the collision estimate of the flux. Since
:math:`R = \Sigma_t \phi` where :math:`R` is the total reaction rate,
:math:`\Sigma_t` is the total macroscopic cross section, and :math:`\phi` is the
scalar flux, it stands to reason that we can estimate the flux by taking an
estimate of the total reaction rate and dividing it by the total macroscopic
cross section. This gives us the following formula:

.. math::
    :label: collision-estimator-flux

    \phi = \frac{1}{W} \sum_{i \in C} \frac{w_i}{\Sigma_t (E_i)}

where :math:`W` is again the total starting weight of the particles, :math:`C`
is the set of all events resulting in a collision with a nucleus, and
:math:`\Sigma_t (E_i)` is the total macroscopic cross section of the target
material at the incoming energy of the particle :math:`E_i`.

If we multiply both sides of equation :eq:`collision-estimator-flux` by the
macroscopic cross section for some reaction :math:`x`, then we get the collision
estimate for the reaction rate for that reaction:

.. math::
    :label: collision-estimator

    R_x = \frac{1}{W} \sum_{i \in C} \frac{w_i \Sigma_x (E_i)}{\Sigma_t (E_i)}

where :math:`\Sigma_x (E_i)` is the macroscopic cross section for reaction
:math:`x` at the incoming energy of the particle :math:`E_i`. In comparison to
equation :eq:`analog-estimator`, we see that the collision estimate will result
in a tally with a larger number of events that score to it with smaller
contributions (since we have multiplied it by :math:`\Sigma_x / \Sigma_t`).
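
A corresponding sketch of the collision estimate of equation
:eq:`collision-estimator` (again plain NumPy for illustration):

.. code-block:: python

    import numpy as np

    def collision_estimate(weights, sigma_x, sigma_t, total_starting_weight):
        """Collision estimate of a reaction rate: at every collision, score
        the pre-collision weight times Sigma_x(E_i) / Sigma_t(E_i), with the
        cross sections evaluated at each event's incoming energy E_i."""
        return np.sum(weights * sigma_x / sigma_t) / total_starting_weight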

Track-length Estimator
----------------------

One other method we can use to increase the number of events that score to
tallies is to use an estimator that scores contributions to a tally at every
track traversed by the particle rather than at every collision. This is known as
a track-length estimator, sometimes also called a path-length estimator. We
first start with an expression for the volume-integrated flux, which can be
written as

.. math::
    :label: flux-integrated

    V \phi = \int d\mathbf{r} \int dE \int d\mathbf{\Omega} \int dt \,
    \psi(\mathbf{r}, \mathbf{\hat{\Omega}}, E, t)

where :math:`V` is the volume, :math:`\psi` is the angular flux,
:math:`\mathbf{r}` is the position of the particle, :math:`\mathbf{\hat{\Omega}}`
is the direction of the particle, :math:`E` is the energy of the particle, and
:math:`t` is the time. By noting that :math:`\psi(\mathbf{r},
\mathbf{\hat{\Omega}}, E, t) = v n(\mathbf{r}, \mathbf{\hat{\Omega}}, E, t)`
where :math:`n` is the angular neutron density, we can rewrite equation
:eq:`flux-integrated` as

.. math::
    :label: flux-integrated-2

    V \phi = \int d\mathbf{r} \int dE \int dt \, v \int d\mathbf{\Omega} \,
    n(\mathbf{r}, \mathbf{\hat{\Omega}}, E, t).

Using the relations :math:`N(\mathbf{r}, E, t) = \int d\mathbf{\Omega} \,
n(\mathbf{r}, \mathbf{\hat{\Omega}}, E, t)` and :math:`d\ell = v \, dt` where
:math:`d\ell` is the differential unit of track length, we then obtain

.. math::
    :label: track-length-integral

    V \phi = \int d\mathbf{r} \int dE \int d\ell \, N(\mathbf{r}, E, t).

Equation :eq:`track-length-integral` indicates that we can use the length of a
particle's trajectory as an estimate for the flux, i.e. the track-length
estimator of the flux would be

.. math::
    :label: track-length-flux

    \phi = \frac{1}{W} \sum_{i \in T} w_i \ell_i

where :math:`T` is the set of all the particle's trajectories within the desired
volume and :math:`\ell_i` is the length of the :math:`i`-th trajectory. In the
same vein as equation :eq:`collision-estimator`, the track-length estimate of a
reaction rate is found by multiplying equation :eq:`track-length-flux` by a
macroscopic reaction cross section:

.. math::
    :label: track-length-estimator

    R_x = \frac{1}{W} \sum_{i \in T} w_i \ell_i \Sigma_x (E_i).

One important fact to take into consideration is that the use of a track-length
estimator precludes us from using any filter that requires knowledge of the
particle's state following a collision because by definition, it will not have
had a collision at every event. Thus, for tallies with outgoing-energy filters
(which require the post-collision energy), scattering change-in-angle filters,
or for tallies of scattering moments (which require the scattering cosine of
the change-in-angle), we must use an analog estimator.
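
A sketch of the track-length estimate of equation
:eq:`track-length-estimator` (plain NumPy for illustration):

.. code-block:: python

    import numpy as np

    def track_length_estimate(weights, lengths, sigma_x, total_starting_weight):
        """Track-length estimate of a reaction rate: for every track within
        the tally region, score the particle weight times the track length
        times Sigma_x evaluated at the particle's energy. Passing
        sigma_x = 1.0 gives the track-length estimate of the flux."""
        return np.sum(weights * lengths * sigma_x) / total_starting_weight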

.. TODO: Add description of surface current tallies

----------
Statistics
----------

As was discussed briefly in :ref:`methods_introduction`, any given result from a
Monte Carlo calculation, colloquially known as a "tally", represents an estimate
of the mean of some `random variable`_ of interest. This random variable
typically corresponds to some physical quantity like a reaction rate, a net
current across some surface, or the neutron flux in a region. Given that all
tallies are produced by a `stochastic process`_, there is an associated
uncertainty with each value reported. It is important to understand how the
uncertainty is calculated and what it tells us about our results. To that end,
we will introduce a number of theorems and results from statistics that should
shed some light on the interpretation of uncertainties.

Law of Large Numbers
--------------------

The `law of large numbers`_ is an important statistical result that tells us
that the average result of a large number of repeated experiments should be
close to the `expected value`_. Let :math:`X_1, X_2, \dots, X_n` be an infinite
sequence of `independent, identically-distributed random variables`_ with
expected values :math:`E(X_1) = E(X_2) = \dots = \mu`. One form of the law of
large numbers states that the sample mean :math:`\bar{X}_n = \frac{X_1 + \dots +
X_n}{n}` `converges in probability`_ to the true mean, i.e. for all
:math:`\epsilon > 0`,

.. math::

    \lim\limits_{n\rightarrow\infty} P \left ( \left | \bar{X}_n - \mu \right |
    \ge \epsilon \right ) = 0.

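A quick numerical illustration (NumPy; the choice of distribution and seed is
arbitrary):

.. code-block:: python

    import numpy as np

    # Sample means from an exponential distribution with true mean 2.0
    # approach the expected value as the number of samples grows.
    rng = np.random.default_rng(seed=1)
    for n in [10, 1_000, 100_000]:
        print(n, rng.exponential(scale=2.0, size=n).mean())
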
.. _central-limit-theorem:

Central Limit Theorem
---------------------

The `central limit theorem`_ (CLT) is perhaps the most well-known and ubiquitous
statistical theorem, with far-reaching implications across many disciplines. The
CLT is similar to the law of large numbers in that it tells us the limiting
behavior of the sample mean. Whereas the law of large numbers tells us only that
the value of the sample mean will converge to the expected value of the
distribution, the CLT says that the distribution of the sample mean will
converge to a `normal distribution`_. As we defined before, let :math:`X_1, X_2,
\dots, X_n` be an infinite sequence of independent, identically-distributed
random variables with expected values :math:`E(X_i) = \mu` and variances
:math:`\text{Var} (X_i) = \sigma^2 < \infty`. Note that we don't require that
these random variables take on any particular distribution -- they can be
normal, log-normal, Weibull, etc. The central limit theorem states that as
:math:`n \rightarrow \infty`, the random variable :math:`\sqrt{n} (\bar{X}_n -
\mu)` `converges in distribution`_ to a normal distribution with zero mean and
variance :math:`\sigma^2`:

.. math::
    :label: central-limit-theorem

    \sqrt{n} \left ( \frac{1}{n} \sum_{i=1}^n X_i - \mu \right ) \xrightarrow{d}
    \mathcal{N} (0, \sigma^2)
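
A numerical illustration of the CLT (NumPy; distribution and seed arbitrary):

.. code-block:: python

    import numpy as np

    # Even for a strongly non-normal underlying distribution (exponential,
    # mu = sigma = 2.0), the sample mean of n observations is approximately
    # normal with mean mu and standard deviation sigma / sqrt(n).
    rng = np.random.default_rng(seed=1)
    n, trials = 1_000, 10_000
    means = rng.exponential(scale=2.0, size=(trials, n)).mean(axis=1)
    print(means.mean())              # close to mu = 2.0
    print(means.std() * np.sqrt(n))  # close to sigma = 2.0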

Estimating Statistics of a Random Variable
------------------------------------------

Mean
++++

Given independent samples drawn from a random variable, the sample mean is
simply an estimate of the average value of the random variable. In a Monte Carlo
simulation, the random variable represents physical quantities that we want
tallied. If :math:`X` is the random variable with :math:`N` observations
:math:`x_1, x_2, \dots, x_N`, then an unbiased estimator for the population mean
is the sample mean, defined as

.. math::
    :label: sample-mean

    \bar{x} = \frac{1}{N} \sum_{i=1}^N x_i.

Variance
++++++++

The variance of a population indicates how spread out different members of the
population are. For a Monte Carlo simulation, the variance of a tally is a
measure of how precisely we know the tally value, with a lower variance
indicating a higher precision. There are a few different estimators for the
population variance. One of these is the second central moment of the
distribution, also known as the biased sample variance:

.. math::
    :label: biased-variance

    s_N^2 = \frac{1}{N} \sum_{i=1}^N \left ( x_i - \bar{x} \right )^2 = \left (
    \frac{1}{N} \sum_{i=1}^N x_i^2 \right ) - \bar{x}^2.

This estimator is biased because its expected value is actually not equal to the
population variance:

.. math::
    :label: biased-variance-expectation

    E[s_N^2] = \frac{N - 1}{N} \sigma^2

where :math:`\sigma^2` is the actual population variance. As a result, this
estimator should not be used in practice. Instead, one can use `Bessel's
correction`_ to come up with an unbiased sample variance estimator:

.. math::
    :label: unbiased-variance

    s^2 = \frac{1}{N - 1} \sum_{i=1}^N \left ( x_i - \bar{x} \right )^2 =
    \frac{1}{N - 1} \left ( \sum_{i=1}^N x_i^2 - N\bar{x}^2 \right ).

This is the estimator normally used to calculate sample variance. The final form
in equation :eq:`unbiased-variance` is especially suitable for computation since
we do not need to store the values at every realization of the random variable
as the simulation proceeds. Instead, we can simply keep a running sum and sum of
squares of the values at each realization of the random variable and use that to
calculate the variance.
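
A minimal sketch of such an accumulator (plain Python for illustration):

.. code-block:: python

    class RunningTally:
        """Accumulate a running sum and sum of squares so that the sample
        mean and the unbiased sample variance can be computed without
        storing every realization."""

        def __init__(self):
            self.n = 0
            self.sum = 0.0
            self.sum_sq = 0.0

        def score(self, x):
            self.n += 1
            self.sum += x
            self.sum_sq += x*x

        def mean(self):
            return self.sum / self.n

        def variance(self):
            # Final form of equation (unbiased-variance)
            return (self.sum_sq - self.n * self.mean()**2) / (self.n - 1)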

Variance of the Mean
++++++++++++++++++++

The previous sections discussed how to estimate the mean and variance of a
random variable using statistics on a finite sample. However, we are generally
not interested in the *variance of the random variable* itself; we are more
interested in the *variance of the estimated mean*. The sample mean is the
result of our simulation, and the variance of the sample mean will tell us how
confident we should be in our answers.

Fortunately, it is quite easy to estimate the variance of the mean if we are
able to estimate the variance of the random variable. We start with the
observation that if we have a series of uncorrelated random variables, we can
write the variance of their sum as the sum of their variances:

.. math::
    :label: bienayme-formula

    \text{Var} \left ( \sum_{i=1}^N X_i \right ) = \sum_{i=1}^N \text{Var} \left
    ( X_i \right )

This result is known as the Bienaymé formula. We can use this result to
determine a formula for the variance of the sample mean. Assuming that the
realizations of our random variable are again independent,
identically-distributed samples, we have that

.. math::
    :label: sample-variance-mean

    \text{Var} \left ( \bar{X} \right ) = \text{Var} \left ( \frac{1}{N}
    \sum_{i=1}^N X_i \right ) = \frac{1}{N^2} \sum_{i=1}^N \text{Var} \left (
    X_i \right ) = \frac{1}{N^2} \left ( N\sigma^2 \right ) =
    \frac{\sigma^2}{N}.

We can combine this result with equation :eq:`unbiased-variance` to come up with
an unbiased estimator for the variance of the sample mean:

.. math::
    :label: sample-variance-mean-formula

    s_{\bar{X}}^2 = \frac{1}{N - 1} \left ( \frac{1}{N} \sum_{i=1}^N x_i^2 -
    \bar{x}^2 \right ).

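Continuing the running-sum sketch above, the variance of the mean follows
directly from the same accumulated quantities:

.. code-block:: python

    def variance_of_mean(n, total, total_sq):
        """Unbiased estimate of the variance of the sample mean, per equation
        (sample-variance-mean-formula), from a running sum (total) and sum of
        squares (total_sq) of n realizations."""
        mean = total / n
        return (total_sq / n - mean**2) / (n - 1)
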
At this point, an important distinction should be made between the estimator for
the variance of the population and the estimator for the variance of the
mean. As the number of realizations increases, the estimated variance of the
population based on equation :eq:`unbiased-variance` will tend to the true
population variance. On the other hand, the estimated variance of the mean will
tend to zero as the number of realizations increases. A practical interpretation
of this is that the longer you run a simulation, the better you know your
results. Therefore, by running a simulation long enough, it is possible to
reduce the stochastic uncertainty to arbitrarily low levels.

Confidence Intervals
++++++++++++++++++++

While the sample variance and standard deviation give us some idea about the
variability of the estimate of the mean of whatever quantities we've tallied,
they do not help us interpret how confident we should be in the results. To
quantify the reliability of our estimates, we can use `confidence intervals`_
based on the calculated sample variance.

A :math:`1-\alpha` confidence interval for a population parameter is defined as
such: if we repeat the same experiment many times and calculate the confidence
interval for each experiment, then a fraction :math:`1 - \alpha` of the
calculated intervals would encompass the true population parameter. Let
:math:`x_1, x_2, \dots, x_N` be samples from a set of independent,
identically-distributed random variables each with population mean :math:`\mu`
and variance :math:`\sigma^2`. The t-statistic is defined as

.. math::
    :label: t-statistic

    t = \frac{\bar{x} - \mu}{s/\sqrt{N}}

where :math:`\bar{x}` is the sample mean from equation :eq:`sample-mean` and
:math:`s` is the standard deviation based on equation
:eq:`unbiased-variance`. If the random variables :math:`X_i` are
normally-distributed, then the t-statistic has a `Student's t-distribution`_
with :math:`N-1` degrees of freedom. This implies that

.. math::
    :label: t-probability

    Pr \left ( -t_{1 - \alpha/2, N - 1} \le \frac{\bar{x} - \mu}{s/\sqrt{N}} \le
    t_{1 - \alpha/2, N - 1} \right ) = 1 - \alpha

where :math:`t_{1-\alpha/2, N-1}` is the :math:`1 - \alpha/2` percentile of a
t-distribution with :math:`N-1` degrees of freedom. Thus, the :math:`1 - \alpha`
two-sided confidence interval for the sample mean is

.. math::
    :label: two-sided-ci

    \bar{x} \pm t_{1 - \alpha/2, N-1} \frac{s}{\sqrt{N}}.

One should be cautioned that equation :eq:`two-sided-ci` only applies if the
*underlying random variables* are normally-distributed. In general, this may not
be true for a tally random variable --- the central limit theorem guarantees
only that the sample mean is normally distributed, not the underlying random
variable. If batching is used, then the underlying random variable, which would
then be the averages from each batch, will be normally distributed as long as
the conditions of the central limit theorem are met.
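
As a small worked sketch of equation :eq:`two-sided-ci` (using SciPy's
t-distribution percentile here rather than the approximation OpenMC itself
uses, described below):

.. code-block:: python

    import math
    from scipy.stats import t

    def confidence_interval(mean, std_dev, n, alpha=0.05):
        """Two-sided 1-alpha confidence interval for the sample mean."""
        half_width = t.ppf(1 - alpha/2, n - 1) * std_dev / math.sqrt(n)
        return mean - half_width, mean + half_width

    # e.g., a 95% confidence interval for a tally with made-up statistics
    print(confidence_interval(1.0, 0.05, 100))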

Let us now outline the method used to calculate the percentile of the Student's
t-distribution. For one or two degrees of freedom, the percentile can be written
analytically. For one degree of freedom, the t-distribution becomes a standard
`Cauchy distribution`_ whose cumulative distribution function is

.. math::
    :label: cauchy-cdf

    c(x) = \frac{1}{\pi} \arctan x + \frac{1}{2}.

Thus, inverting the cumulative distribution function, we find the :math:`x`
percentile of the standard Cauchy distribution to be

.. math::
    :label: percentile-1

    t_{x,1} = \tan \left ( \pi \left ( x - \frac{1}{2} \right ) \right ).

For two degrees of freedom, the cumulative distribution function is

.. math::
    :label: t-2-polynomial

    c(x) = \frac{1}{2} + \frac{x}{2\sqrt{x^2 + 2}}.

Solving for :math:`x`, we find the :math:`x` percentile to be

.. math::
    :label: percentile-2

    t_{x,2} = \frac{2\sqrt{2} (x - 1/2)}{\sqrt{1 - 4 (x - 1/2)^2}}.
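
Both analytic percentiles are straightforward to implement; for example (a
plain-Python sketch of equations :eq:`percentile-1` and :eq:`percentile-2`):

.. code-block:: python

    import math

    def t_percentile_1(x):
        """x percentile of the t-distribution with one degree of freedom
        (a standard Cauchy distribution)."""
        return math.tan(math.pi * (x - 0.5))

    def t_percentile_2(x):
        """x percentile of the t-distribution with two degrees of freedom."""
        u = x - 0.5
        return 2.0 * math.sqrt(2.0) * u / math.sqrt(1.0 - 4.0*u*u)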

For degrees of freedom greater than two, it is not possible to obtain an
analytical formula for the inverse of the cumulative distribution function. We
must resort to either numerically solving for the inverse or to an
approximation. Approximations for percentiles of the t-distribution have been
found with high levels of accuracy. OpenMC uses the `following approximation`_:

.. math::
    :label: percentile-n

    t_{x,n} = \sqrt{\frac{n}{n-2}} \left ( z_x + \frac{1}{4} \frac{z_x^3 -
    3z_x}{n-2} + \frac{1}{96} \frac{5z_x^5 - 56z_x^3 + 75z_x}{(n-2)^2} +
    \frac{1}{384} \frac{3z_x^7 - 81z_x^5 + 417z_x^3 - 315z_x}{(n-2)^3} \right )

where :math:`z_x` is the :math:`x` percentile of the standard normal
distribution. In order to determine an arbitrary percentile of the standard
normal distribution, we use an `unpublished rational approximation`_. After
using the rational approximation, one iteration of Newton's method is applied to
improve the estimate of the percentile.
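
A sketch of equation :eq:`percentile-n` in Python (taking :math:`z_x` from
SciPy rather than the rational approximation OpenMC uses internally):

.. code-block:: python

    import math
    from scipy.stats import norm

    def t_percentile(x, n):
        """Approximate x percentile of the t-distribution with n > 2 degrees
        of freedom, per the series above."""
        z = norm.ppf(x)  # x percentile of the standard normal distribution
        m = n - 2
        return math.sqrt(n / m) * (
            z
            + (z**3 - 3*z) / (4*m)
            + (5*z**5 - 56*z**3 + 75*z) / (96*m**2)
            + (3*z**7 - 81*z**5 + 417*z**3 - 315*z) / (384*m**3))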

.. only:: html

   .. rubric:: References

.. _following approximation: https://doi.org/10.1080/03610918708812641

.. _Bessel's correction: https://en.wikipedia.org/wiki/Bessel's_correction

.. _random variable: https://en.wikipedia.org/wiki/Random_variable

.. _stochastic process: https://en.wikipedia.org/wiki/Stochastic_process

.. _independent, identically-distributed random variables: https://en.wikipedia.org/wiki/Independent_and_identically_distributed_random_variables

.. _law of large numbers: https://en.wikipedia.org/wiki/Law_of_large_numbers

.. _expected value: https://en.wikipedia.org/wiki/Expected_value

.. _converges in probability: https://en.wikipedia.org/wiki/Convergence_of_random_variables#Convergence_in_probability

.. _normal distribution: https://en.wikipedia.org/wiki/Normal_distribution

.. _converges in distribution: https://en.wikipedia.org/wiki/Convergence_of_random_variables#Convergence_in_distribution

.. _confidence intervals: https://en.wikipedia.org/wiki/Confidence_interval

.. _Student's t-distribution: https://en.wikipedia.org/wiki/Student%27s_t-distribution

.. _Cauchy distribution: https://en.wikipedia.org/wiki/Cauchy_distribution

.. _unpublished rational approximation: https://web.archive.org/web/20150926021742/http://home.online.no/~pjacklam/notes/invnorm/

.. _MC21: http://www.osti.gov/bridge/servlets/purl/903083-HT5p1o/903083.pdf