.. ############################################################################
.. # Copyright (c) Lawrence Livermore National Security, LLC and other Ascent
.. # Project developers. See top-level LICENSE AND COPYRIGHT files for dates and
.. # other details. No copyright assignment is required to contribute to Ascent.
.. ############################################################################

.. _Binning:

Data Binning
============
Ascent's Data Binning was modeled after VisIt's Data Binning / Derived Data Field capability.
The capability defies a good name; it has also been called Equivalence Class Functions.
The approach is very similar to a multi-dimensional histogram.
You define a multi-dimensional binning, based on either spatial coordinates or field values, and then Ascent loops over your mesh elements and aggregates them into these bins.
During the binning process, you can employ a menu of reduction functions
(sum, average, min, max, variance, etc.) depending on the type of analysis desired.
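
To make the histogram analogy concrete, here is a minimal pure-Python sketch of binning along a single axis with a pluggable reduction. It is not Ascent's implementation; the helper `bin_values` and the sample data are hypothetical and only illustrate the idea of grouping element values by coordinate and then reducing each group.

```python
def bin_values(coords, values, lo, hi, num_bins, reduce_fn):
    """Group values into uniform bins along one axis and reduce each bin."""
    width = (hi - lo) / num_bins
    groups = [[] for _ in range(num_bins)]
    for c, v in zip(coords, values):
        if c < lo or c > hi:
            continue  # ignore samples outside the binning range
        # a sample exactly on the upper bound is placed in the last bin
        idx = min(int((c - lo) / width), num_bins - 1)
        groups[idx].append(v)
    # empty bins are reported as 0.0 in this sketch
    return [reduce_fn(g) if g else 0.0 for g in groups]


# hypothetical element coordinates and one field value per element
x = [0.1, 0.2, 0.6, 0.7, 0.9]
f = [1.0, 3.0, 2.0, 6.0, 5.0]

sums = bin_values(x, f, 0.0, 1.0, 2, sum)   # [4.0, 13.0]
maxes = bin_values(x, f, 0.0, 1.0, 2, max)  # [3.0, 6.0]
```

Swapping `sum` for `max` changes the analysis without touching the binning itself, which is the same separation the reduction functions provide.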

You can bin spatially to calculate distributions, find extreme values, etc.
With the right approach, you can implement mesh-agnostic analysis that can be used across simulation codes.
You can also map the binned result back onto the original mesh topology
to enable further analysis, such as deviations from an average.
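
The idea of mapping a binned result back onto the mesh can be sketched in a few lines. This is an illustration only, assuming a one-dimensional uniform binning; `bin_index` and the data are hypothetical, and the sketch does not use any Ascent API.

```python
def bin_index(coord, lo, hi, num_bins):
    """Return the uniform bin that a coordinate falls into."""
    width = (hi - lo) / num_bins
    return min(int((coord - lo) / width), num_bins - 1)


z = [0.1, 0.3, 0.6, 0.8]       # hypothetical element centroids
t = [10.0, 14.0, 20.0, 30.0]   # hypothetical field value per element
num_bins = 2

# step 1: average the field inside each bin
totals = [0.0] * num_bins
counts = [0] * num_bins
for c, v in zip(z, t):
    i = bin_index(c, 0.0, 1.0, num_bins)
    totals[i] += v
    counts[i] += 1
averages = [s / n if n else 0.0 for s, n in zip(totals, counts)]

# step 2: map the bin average back to each element and take the difference
deviation = [v - averages[bin_index(c, 0.0, 1.0, num_bins)] for c, v in zip(z, t)]
# averages  -> [12.0, 25.0]
# deviation -> [-2.0, 2.0, -5.0, 5.0]
```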

Benefits
--------
Simulation users often need to analyze quantities of interest within fields on
a mesh, but they might not know the exact data structures used by the underlying
simulation.
For example, the mesh data might be represented as uniform grids or as high-order finite
element meshes.
If users do not know the underlying data structures, it can be very difficult to write
the analysis, and that analysis code will not work for another simulation.
Spatial binning essentially creates a uniform representation that can be used across
simulation codes, regardless of the underlying mesh representation.


Sampling and Aggregation
------------------------
When specifying the number of bins on an axis, there will always be some oversampling or undersampling.
During spatial binning, each zone is placed into a bin based on its centroid, so a bin may
receive several zones (undersampling) or none at all (oversampling).
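
The effect of the bin count can be seen with a small sketch. It assumes a hypothetical one-dimensional mesh of four equal cells and counts how many centroids land in each bin; the helpers `centroid_1d` and `count_per_bin` are made up for illustration.

```python
def centroid_1d(cell):
    """Centroid of a 1D cell given as (start, end)."""
    return 0.5 * (cell[0] + cell[1])


def count_per_bin(centroids, lo, hi, num_bins):
    """Count how many centroids fall into each uniform bin."""
    width = (hi - lo) / num_bins
    counts = [0] * num_bins
    for c in centroids:
        idx = min(int((c - lo) / width), num_bins - 1)
        counts[idx] += 1
    return counts


cells = [(0.0, 0.25), (0.25, 0.5), (0.5, 0.75), (0.75, 1.0)]
centroids = [centroid_1d(c) for c in cells]

coarse = count_per_bin(centroids, 0.0, 1.0, 2)  # [2, 2]: several zones per bin
fine = count_per_bin(centroids, 0.0, 1.0, 8)    # [0, 1, 0, 1, 0, 1, 0, 1]: empty bins
```

With two bins every bin holds more than one zone, so a reduction is needed; with eight bins half of the bins stay empty.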


.. figure:: ../images/undersampling.png

  An example of spatial undersampling.

When multiple values fall into a single bin (i.e., undersampling), we aggregate values using one of the following options:

*  min: minimum value in a bin
*  max: maximum value in a bin
*  sum: sum of values in a bin
*  avg: average of values in a bin
*  pdf: probability distribution function
*  std: standard deviation of values in a bin
*  var: variance of values in a bin
*  rms: root mean square of values in a bin

The aggregation function is the second argument to the binning function and is demonstrated in the line out
example.
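
For reference, most of the reductions listed above can be written in a few lines of plain Python. This is a sketch of the textbook definitions, not Ascent's code: it assumes the population form of the variance (dividing by the number of values), and it leaves out `pdf` because its exact normalization is not described here. The helper `reduce_bin` is hypothetical.

```python
import math


def reduce_bin(values, op):
    """Reduce the values that landed in one bin with the named operation."""
    n = len(values)
    mean = sum(values) / n
    # population variance (divide by n): an assumption made for this sketch
    var = sum((v - mean) ** 2 for v in values) / n
    ops = {
        "min": min(values),
        "max": max(values),
        "sum": sum(values),
        "avg": mean,
        "var": var,
        "std": math.sqrt(var),
        "rms": math.sqrt(sum(v * v for v in values) / n),
    }
    return ops[op]


bin_contents = [1.0, 2.0, 3.0, 4.0]  # hypothetical values in a single bin
names = ("min", "max", "sum", "avg", "var", "std", "rms")
results = {op: reduce_bin(bin_contents, op) for op in names}
```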

.. figure:: ../images/oversampling.png

  An example of spatial oversampling.

When oversampling data, the default value of an empty bin is 0. That said, the default empty
value can be overridden with an optional named parameter, e.g., `empty_bin_val=100`.
This is often useful when the default value is part of the data range: setting
the empty bin value to a known value outside that range allows the user to filter out
empty bins from the results.
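
A short sketch of that filtering step, done in post-processing, might look like the following. The sentinel value and the binned array are hypothetical; the point is only that a sentinel outside the data range lets legitimate zeros survive the filter.

```python
# hypothetical sentinel, as if the query had been given empty_bin_val=-1
EMPTY = -1.0

# hypothetical binned result in which 0.0 is a legitimate data value
binned = [0.0, -1.0, 2.5, -1.0, 0.0]

# keep (bin index, value) pairs for bins that actually received data
valid = [(i, v) for i, v in enumerate(binned) if v != EMPTY]
# valid -> [(0, 0.0), (2, 2.5), (4, 0.0)]
```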


Example Line Out
----------------
We will use data binning to provide a capability similar to a line out.
To accomplish this, we will define a spatial binning that is like a pencil
down the center of the data set in the z direction,
and we will use the Lulesh proxy application to demonstrate.

In the Lulesh proxy application, the mesh is defined with the spatial bounds
(0,0,0)-(1.2,1.2,1.2).
We will define a three-dimensional binning on the ranges `x=(0,0.1)` with 1 bin,
`y=(0,0.1)` with 1 bin, and `z=(0,1.2)` with 20 bins.
This is technically a 3D binning, but it will result in a 1D array of values.

Lulesh implements the Sedov blast problem, which deposits a region of high energy in
one corner of the data set, and as time progresses, a shockwave propagates outward.
The idea behind this example is to create a simple query to help us track the shock
front as it moves through the problem.
To that end, we will create a query that bins pressure (i.e., the variable `p`).

.. figure:: ../images/lulesh_binning_fig.png

  An example of Lulesh where the binning region is highlighted in red.

Actions File
^^^^^^^^^^^^
An example Ascent actions file that creates this query:

.. code-block:: yaml

  -
    action: "add_queries"
    queries:
      bin_density:
        params:
          expression: "binning('p','max', [axis('x',[0.0,0.1]), axis('y', [0.0,0.1]), axis('z', num_bins=20)])"
          name: my_binning_name

Note that for the `x` and `y` axes we explicitly specify the bounds of the bins.
Ascent deduces the number of bins based on the explicit coordinates inside the array `[0.0,0.1]`.
With the `z` axis, the binning automatically defines a uniform binning based on the spatial
extents of the mesh.
Additionally, we are using `max` as the aggregation function.
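
The relationship between the two ways of describing an axis can be sketched as follows. This only illustrates the arithmetic, under the assumption that explicit coordinates are bin edges and that a `num_bins` axis is split uniformly; the helper names are hypothetical and are not part of Ascent.

```python
def bins_from_edges(edges):
    """An explicit list of N edges describes N - 1 bins."""
    return len(edges) - 1


def uniform_edges(lo, hi, num_bins):
    """Edges of a uniform binning over the extent [lo, hi]."""
    width = (hi - lo) / num_bins
    return [lo + i * width for i in range(num_bins + 1)]


x_bins = bins_from_edges([0.0, 0.1])   # 1 bin between 0.0 and 0.1
demo_edges = uniform_edges(0.0, 1.0, 4)  # [0.0, 0.25, 0.5, 0.75, 1.0]
```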

Session File
^^^^^^^^^^^^
The binning is called every cycle that Ascent is executed, and the results are stored within
the expressions cache.
When the run is complete, the results of the binning, as well as all other expressions,
are output inside the `ascent_session.yaml` file, which is convenient for post-processing.

Here is an excerpt from the session file (note: the large array is truncated):

.. code-block:: yaml

  my_binning_name:
    1:
      type: "binning"
      attrs:
        value:
          value: [0.0, ...]
          type: "array"
        reduction_var:
          value: "p"
          type: "string"
        reduction_op:
          value: "max"
          type: "string"
        bin_axes:
          value:
            x:
              bins: [0.0, 0.1]
              clamp: 0
            y:
              bins: [0.0, 0.1]
              clamp: 0
            z:
              num_bins: 20
              clamp: 0
              min_val: 0.0
              max_val: 1.12500001125
        association:
          value: "element"
          type: "string"
      time: 1.06812409221472e-05

Inside the session file is all the information Ascent used to create the binning,
including the automatically defined spatial ranges for the `z` axis,
fields used, the aggregate operation, cycle, and simulation time.
The session file will include an entry like the one above for each cycle,
and the cycle is located directly below the name of the query
(i.e., `my_binning_name`).
Once the simulation is complete, we can create a Python script to process
and plot the data.
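
Before plotting, it can help to see how the per-cycle entries are traversed. The sketch below uses a hypothetical in-memory dictionary shaped like the excerpt above, as it would look after loading the YAML; the cycle numbers, times, and values are made up.

```python
# hypothetical session contents, shaped like the excerpt above
session = {
    "my_binning_name": {
        100: {
            "attrs": {"value": {"value": [0.0, 2.0, 0.5], "type": "array"}},
            "time": 0.001,
        },
        200: {
            "attrs": {"value": {"value": [0.0, 1.0, 3.0], "type": "array"}},
            "time": 0.002,
        },
    }
}

# collect (cycle, simulation time, largest binned value) for every cycle
history = []
for cycle, entry in sorted(session["my_binning_name"].items()):
    values = entry["attrs"]["value"]["value"]
    history.append((cycle, entry["time"], max(values)))
# history -> [(100, 0.001, 2.0), (200, 0.002, 3.0)]
```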

Plotting
^^^^^^^^
Plotting the resulting data is straightforward in Python.

.. code-block:: python

  import yaml  # pip install --user pyyaml
  import matplotlib.pyplot as plt

  # load the session file written by Ascent
  with open('ascent_session.yaml') as file:
      session = yaml.safe_load(file)

  # entries are keyed by the name given in the actions file
  binning = session['my_binning_name']
  cycles = list(binning.keys())

  # loop through each cycle and grab the bins
  bins = [entry['attrs']['value']['value'] for entry in binning.values()]

  # create the coordinate axis using bin centers
  z_axis = binning[cycles[0]]['attrs']['bin_axes']['value']['z']
  z_min = z_axis['min_val']
  z_max = z_axis['max_val']
  z_bins = z_axis['num_bins']

  z_delta = (z_max - z_min) / float(z_bins)
  z_start = z_min + 0.5 * z_delta
  z_vals = [b * z_delta + z_start for b in range(z_bins)]

  # plot the curve from the last cycle
  plt.plot(z_vals, bins[-1])
  plt.xlabel('z position')
  plt.ylabel('pressure')
  plt.savefig("binning.png")


.. figure:: ../images/lulesh_binning.png

  The resulting plot of pressure from the last cycle.

From the resulting plot, we can clearly see how far the shock front has traveled
through the problem.
Plotting the curve through time, we can see the shock front move along the z-axis.

.. image:: ../images/lulesh_binning.gif
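
The same data can also give a rough numeric estimate of the front position per cycle. The sketch below assumes, as a simplification, that the front sits at the center of the bin with the highest pressure; the helpers and the per-cycle arrays are hypothetical.

```python
def bin_centers(lo, hi, num_bins):
    """Centers of a uniform binning over [lo, hi]."""
    width = (hi - lo) / num_bins
    return [lo + (i + 0.5) * width for i in range(num_bins)]


def front_position(values, centers):
    """Center of the bin holding the largest value (a proxy for the front)."""
    idx = max(range(len(values)), key=values.__getitem__)
    return centers[idx]


centers = bin_centers(0.0, 1.0, 4)

# hypothetical max-pressure bins for three successive cycles
per_cycle = [
    [9.0, 1.0, 0.0, 0.0],
    [2.0, 8.0, 1.0, 0.0],
    [1.0, 2.0, 7.0, 0.0],
]
positions = [front_position(v, centers) for v in per_cycle]
# positions -> [0.125, 0.375, 0.625]
```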