1VTK-m 1.6 Release Notes
2=======================
3
4# Table of Contents
5
61. [Core](#Core)
7  - Add Kokkos backend
8  - Deprecate `DataSetFieldAdd`
9  - Move VTK file readers and writers into vtkm_io
10  - Remove VTKDataSetWriter::WriteDataSet just_points parameter
11  - Added VecFlat class
12  - Add a vtkm::Tuple class
13  - DataSet now only allows unique field names
14  - Result DataSet of coordinate transform has its CoordinateSystem changed
15  - `vtkm::cont::internal::Buffer` now can have ownership transferred
16  - Configurable default types
172. [ArrayHandle](#ArrayHandle)
18  - Shorter fancy array handle classnames
19  - `ReadPortal().Get(idx)`
20  - Precompiled `ArrayCopy` for `UnknownArrayHandle`
21  - Create `ArrayHandleOffsetsToNumComponents`
22  - Recombine extracted component arrays from unknown arrays
23  - UnknownArrayHandle and UncertainArrayHandle for runtime-determined types
24  - Support `ArrayHandleSOA` as a "default" array
25  - Removed old `ArrayHandle` transfer mechanism
26  - Order asynchronous `ArrayHandle` access
27  - Improvements to moving data into ArrayHandle
28  - Deprecate ArrayHandleVirtualCoordinates
29  - ArrayHandleDecorator Allocate and Shrink Support
30  - Portals may advertise custom iterators
31  - Redesign of ArrayHandle to access data using typeless buffers
32  - `ArrayRangeCompute` works on any array type without compiling device code
33  - Implemented ArrayHandleRandomUniformBits and ArrayHandleRandomUniformReal
34  - Extract component arrays from unknown arrays
35  - `ArrayHandleGroupVecVariable` holds now one more offset.
363. [Control Environment](#Control-Environment)
37  - Algorithms for Control and Execution Environments
384. [Execution Environment](#Execution-Environment)
39  - Scope ExecObjects with Tokens
40  - Masks and Scatters Supported for 3D Scheduling
41  - Virtual methods in execution environment deprecated
42  - Deprecate Execute with policy
435. [Worklets and Filters](#Worklets-and-Filters)
44  - Enable setting invalid value in probe filter
45  - Avoid raising errors when operating on cells
46  - Add atomic free functions
47  - Flying Edges
48  - Filters specify their own field types
496. [Build](#Build)
50  - Disable asserts for CUDA architecture builds
51  - Disable asserts for HIP architecture builds
52  - Add VTKM_DEPRECATED macro
537. [Other](#Other)
54  - Porting layer for future std features
55  - Removed OpenGL Rendering Classes
56  - Reorganization of `io` directory
57  - Implemented PNG/PPM image Readers/Writers
58  - Updated Benchmark Framework
59  - Provide scripts to build Gitlab-ci workers locally
60  - Replaced `vtkm::ListTag` with `vtkm::List`
61  - Add `ListTagRemoveIf`
62  - Write uniform and rectilinear grids to legacy VTK files
638. [References](#References)
64
65# Core
66
67## Add Kokkos backend
68
  Adds a new device backend, `Kokkos`, which uses the Kokkos library for parallelism.
  Users must provide their own Kokkos build, and VTK-m will use its default configured
  execution space.
72
73## Deprecate `DataSetFieldAdd`
74
75The class `vtkm::cont::DataSetFieldAdd` is now deprecated.
Its methods, `AddPointField` and `AddCellField`, have been moved to member functions
of `vtkm::cont::DataSet`, which simplifies many calls.
78
79For example, the following code
80
81```cpp
82vtkm::cont::DataSetFieldAdd fieldAdder;
83fieldAdder.AddCellField(dataSet, "cellvar", values);
84```
85
86would now be
87
88```cpp
89dataSet.AddCellField("cellvar", values);
90```
91
92## Move VTK file readers and writers into vtkm_io
93
94The legacy VTK file reader and writer were created back when VTK-m was a
95header-only library. Things have changed and we now compile quite a bit of
96code into libraries. At this point, there is no reason why the VTK file
97reader/writer should be any different.
98
Thus, `VTKDataSetReader`, `VTKDataSetWriter`, and several supporting
classes are now compiled into the `vtkm_io` library. `BOVDataSetReader`
was updated in the same way.

As a side effect, code using VTK-m must now link to `vtkm_io` if it
uses any readers or writers.
105
106
107## Remove VTKDataSetWriter::WriteDataSet just_points parameter
108
In the method `VTKDataSetWriter::WriteDataSet`, the `just_points` parameter has been
removed due to lack of usage.
111
112The purpose of `just_points` was to allow exporting only the points of a
113DataSet without its cell data.
114
115
116## Added VecFlat class
117
118`vtkm::VecFlat` is a wrapper around a `Vec`-like class that may be a nested
119series of vectors. For example, if you run a gradient operation on a vector
120field, you are probably going to get a `Vec` of `Vec`s that looks something
121like `vtkm::Vec<vtkm::Vec<vtkm::Float32, 3>, 3>`. That is fine, but what if
122you want to treat the result simply as a `Vec` of size 9?
123
124The `VecFlat` wrapper class allows you to do this. Simply place the nested
125`Vec` as an argument to `VecFlat` and it will behave as a flat `Vec` class.
126(In fact, `VecFlat` is a subclass of `Vec`.) The `VecFlat` class can be
127copied to and from the nested `Vec` it is wrapping.
128
129There is a `vtkm::make_VecFlat` convenience function that takes an object
130and returns a `vtkm::VecFlat` wrapped around it.
131
132`VecFlat` works with any `Vec`-like object as well as scalar values.
133However, any type used with `VecFlat` must have `VecTraits` defined and the
134number of components must be static (i.e. known at compile time).
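
The flattening idea can be sketched in plain C++ without VTK-m. The names `Nested3x3`, `FlatGet`, and `FlatSet` below are hypothetical stand-ins for illustration only; they are not part of VTK-m and do not reflect how `VecFlat` is implemented. They only show how a flat index in the range 0 to 8 maps onto a nested 3x3 vector.

```cpp
#include <array>
#include <cassert>
#include <cstddef>

// Hypothetical stand-in for a nested Vec such as Vec<Vec<Float32, 3>, 3>.
using Nested3x3 = std::array<std::array<float, 3>, 3>;

// Total number of scalar components, known at compile time.
constexpr std::size_t FlatSize = 3 * 3;

// Read component `flatIndex` as if the nested vector were a flat one.
inline float FlatGet(const Nested3x3& nested, std::size_t flatIndex)
{
  return nested[flatIndex / 3][flatIndex % 3];
}

// Write component `flatIndex` of the nested vector.
inline void FlatSet(Nested3x3& nested, std::size_t flatIndex, float value)
{
  nested[flatIndex / 3][flatIndex % 3] = value;
}
```

With this mapping, flat index 5 refers to row 1, column 2 of the nested vector, which is the kind of bookkeeping a flat wrapper saves you from writing by hand.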
135
136
## Add a vtkm::Tuple class
138
139This change added a `vtkm::Tuple` class that is very similar in nature to
140`std::tuple`. This should replace our use of tao tuple.
141
142The motivation for this change was some recent attempts at removing objects
143like `Invocation` and `FunctionInterface`. I expected these changes to
144speed up the build, but in fact they ended up slowing down the build. I
145believe the problem was that these required packing variable parameters
146into a tuple. I was using the tao `tuple` class, but it seemed to slow down
147the compile. (That is, compiling tao's `tuple` seemed much slower than
148compiling the equivalent `FunctionInterface` class.)
149
The implementation of `vtkm::Tuple` uses `pyexpander` to build lots of
simple template cases for the object (with a backup implementation for even
longer argument lists). I believe the compiler is better at parsing
through thousands of lines of simple templates than at employing clever MPL to
build general templates.
155
156### Usage
157
The `vtkm::Tuple` class is defined in the `vtkm/Tuple.h` header file. A
159`Tuple` is designed to behave much like a `std::tuple` with some minor
160syntax differences to fit VTK-m coding standards.
161
162A tuple is declared with a list of template argument types.
163
164``` cpp
165vtkm::Tuple<vtkm::Id, vtkm::Vec3f, vtkm::cont::ArrayHandle<vtkm::Float32>> myTuple;
166```
167
168If given no arguments, a `vtkm::Tuple` will default-construct its contained
169objects. A `vtkm::Tuple` can also be constructed with the initial values of
170all contained objects.
171
172``` cpp
173vtkm::Tuple<vtkm::Id, vtkm::Vec3f, vtkm::cont::ArrayHandle<vtkm::Float32>>
174  myTuple(0, vtkm::Vec3f(0, 1, 2), array);
175```
176
177For convenience there is a `vtkm::MakeTuple` function that takes arguments
178and packs them into a `Tuple` of the appropriate type. (There is also a
179`vtkm::make_tuple` alias to the function to match the `std` version.)
180
181``` cpp
182auto myTuple = vtkm::MakeTuple(0, vtkm::Vec3f(0, 1, 2), array);
183```
184
185Data is retrieved from a `Tuple` by using the `vtkm::Get` method. The `Get`
186method is templated on the index to get the value from. The index is of
187type `vtkm::IdComponent`. (There is also a `vtkm::get` that uses a
188`std::size_t` as the index type as an alias to the function to match the
189`std` version.)
190
191``` cpp
192vtkm::Id a = vtkm::Get<0>(myTuple);
193vtkm::Vec3f b = vtkm::Get<1>(myTuple);
194vtkm::cont::ArrayHandle<vtkm::Float32> c = vtkm::Get<2>(myTuple);
195```
196
Likewise `vtkm::TupleSize` and `vtkm::TupleElement` (and their aliases
`vtkm::tuple_size`, `vtkm::tuple_element`, and `vtkm::tuple_element_t`) are
provided.
200
201### Extended Functionality
202
203The `vtkm::Tuple` class contains some functionality beyond that of
204`std::tuple` to cover some common use cases in VTK-m that are tricky to
205implement. In particular, these methods allow you to use a `Tuple` as you
206would commonly use parameter packs. This allows you to stash parameter
207packs in a `Tuple` and then get them back out again.
208
209#### For Each
210
211`vtkm::Tuple::ForEach()` is a method that takes a function or functor and
212calls it for each of the items in the tuple. Nothing is returned from
213`ForEach` and any return value from the function is ignored.
214
215`ForEach` can be used to check the validity of each item.
216
217``` cpp
218void CheckPositive(vtkm::Float64 x)
219{
220  if (x < 0)
221  {
222    throw vtkm::cont::ErrorBadValue("Values need to be positive.");
223  }
224}
225
226// ...
227
228  vtkm::Tuple<vtkm::Float64, vtkm::Float64, vtkm::Float64> tuple(
229    CreateValue1(), CreateValue2(), CreateValue3());
230
231  // Will throw an error if any of the values are negative.
232  tuple.ForEach(CheckPositive);
233```
234
235`ForEach` can also be used to aggregate values.
236
237``` cpp
238struct SumFunctor
239{
240  vtkm::Float64 Sum = 0;
241
242  template <typename T>
243  void operator()(const T& x)
244  {
245    this->Sum = this->Sum + static_cast<vtkm::Float64>(x);
246  }
247};
248
249// ...
250
251  vtkm::Tuple<vtkm::Float32, vtkm::Float64, vtkm::Id> tuple(
252    CreateValue1(), CreateValue2(), CreateValue3());
253
254  SumFunctor sum;
255  tuple.ForEach(sum);
256  vtkm::Float64 average = sum.Sum / 3;
257```
258
259#### Transform
260
261`vtkm::Tuple::Transform` is a method that builds a new `Tuple` by calling a
262function or functor on each of the items. The return value is placed in the
263corresponding part of the resulting `Tuple`, and the type is automatically
264created from the return type of the function.
265
266``` cpp
267struct GetReadPortalFunctor
268{
269  template <typename Array>
270  typename Array::ReadPortal operator()(const Array& array) const
271  {
272    VTKM_IS_ARRAY_HANDLE(Array);
    return array.ReadPortal();
274  }
275};
276
277// ...
278
279  auto arrayTuple = vtkm::MakeTuple(array1, array2, array3);
280
281  auto portalTuple = arrayTuple.Transform(GetReadPortalFunctor{});
282```
283
284#### Apply
285
286`vtkm::Tuple::Apply` is a method that calls a function or functor using the
287objects in the `Tuple` as the arguments. If the function returns a value,
288that value is returned from `Apply`.
289
290``` cpp
291struct AddArraysFunctor
292{
293  template <typename Array1, typename Array2, typename Array3>
294  vtkm::Id operator()(Array1 inArray1, Array2 inArray2, Array3 outArray) const
295  {
296    VTKM_IS_ARRAY_HANDLE(Array1);
297    VTKM_IS_ARRAY_HANDLE(Array2);
298    VTKM_IS_ARRAY_HANDLE(Array3);
299
300    vtkm::Id length = inArray1.GetNumberOfValues();
    VTKM_ASSERT(inArray2.GetNumberOfValues() == length);
    outArray.Allocate(length);

    auto inPortal1 = inArray1.ReadPortal();
    auto inPortal2 = inArray2.ReadPortal();
    auto outPortal = outArray.WritePortal();
    for (vtkm::Id index = 0; index < length; ++index)
    {
      outPortal.Set(index, inPortal1.Get(index) + inPortal2.Get(index));
    }

    return length;
313  }
314};
315
316// ...
317
318  auto arrayTuple = vtkm::MakeTuple(array1, array2, array3);
319
320  vtkm::Id arrayLength = arrayTuple.Apply(AddArraysFunctor{});
321```
322
323If additional arguments are given to `Apply`, they are also passed to the
324function (before the objects in the `Tuple`). This is helpful for passing
325state to the function. (This feature is not available in either `ForEach`
326or `Transform` for technical implementation reasons.)
327
328``` cpp
329struct ScanArrayLengthFunctor
330{
331  template <std::size_t N, typename Array, typename... Remaining>
332  std::array<vtkm::Id, N + 1 + sizeof...(Remaining)>
333  operator()(const std::array<vtkm::Id, N>& partialResult,
             const Array& nextArray,
             const Remaining&... remainingArrays) const
  {
    std::array<vtkm::Id, N + 1> nextResult;
    std::copy(partialResult.begin(), partialResult.end(), nextResult.begin());
    nextResult[N] = nextResult[N - 1] + nextArray.GetNumberOfValues();
    return (*this)(nextResult, remainingArrays...);
341  }
342
343  template <std::size_t N>
344  std::array<vtkm::Id, N> operator()(const std::array<vtkm::Id, N>& result) const
345  {
346    return result;
347  }
348};
349
350// ...
351
352  auto arrayTuple = vtkm::MakeTuple(array1, array2, array3);
353
  std::array<vtkm::Id, 4> offsets =
    arrayTuple.Apply(ScanArrayLengthFunctor{}, std::array<vtkm::Id, 1>{ 0 });
356```
357
358## DataSet now only allows unique field names
359
360When you add a `vtkm::cont::Field` to a `vtkm::cont::DataSet`, it now
361requires every `Field` to have a unique name. When you attempt to add a
362`Field` to a `DataSet` that already has a `Field` of the same name and
363association, the old `Field` is removed and replaced with the new `Field`.
364
365You are allowed, however, to have two `Field`s with the same name but
366different associations. For example, you could have a point `Field` named
367"normals" and also have a cell `Field` named "normals" in the same
368`DataSet`.
369
370This new behavior matches how VTK's data sets manage fields.
371
372The old behavior allowed you to add multiple `Field`s with the same name,
373but it would be unclear which one you would get if you asked for a `Field`
374by name.
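
The replacement rule can be modeled with a map keyed on the (name, association) pair. The sketch below is a toy in plain C++; `ToyField`, `ToyDataSet`, and `Association` are hypothetical names and do not describe how `vtkm::cont::DataSet` stores its fields internally.

```cpp
#include <cassert>
#include <cstddef>
#include <map>
#include <string>
#include <utility>
#include <vector>

// Hypothetical stand-ins; these are not the VTK-m classes.
enum class Association
{
  Points,
  Cells
};

struct ToyField
{
  std::string Name;
  Association Assoc;
  std::vector<double> Values;
};

class ToyDataSet
{
public:
  // Adding a field whose (name, association) pair already exists
  // replaces the old field instead of storing a duplicate.
  void AddField(const ToyField& field)
  {
    this->Fields[std::make_pair(field.Name, field.Assoc)] = field;
  }

  std::size_t GetNumberOfFields() const { return this->Fields.size(); }

  const ToyField& GetField(const std::string& name, Association assoc) const
  {
    return this->Fields.at(std::make_pair(name, assoc));
  }

private:
  std::map<std::pair<std::string, Association>, ToyField> Fields;
};
```

Adding a point field and a cell field that are both named "normals" leaves two fields in the toy data set, while adding a second point field named "normals" overwrites the first.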
375
376
377## Result DataSet of coordinate transform has its CoordinateSystem changed
378
When you run one of the coordinate transform filters,
`CylindricalCoordinateTransform` or `SphericalCoordinateTransform`, the
transformed coordinates are placed as the first `CoordinateSystem` in the
returned `DataSet`. This means that after running this filter, the data
will be moved to this new coordinate space.
384
385Previously, the result of these filters was just placed in a named `Field`
386of the output. This caused some confusion because the filter did not seem
387to have any effect (unless you knew to modify the output data). Not using
388the result as the coordinate system seems like a dubious use case (and not
389hard to work around), so this is much better behavior.
390
391
392## `vtkm::cont::internal::Buffer` now can have ownership transferred
393
Previously, memory transferred to a `Buffer` always had to be managed by VTK-m. This was
problematic for applications that need VTK-m to allocate memory but need that memory to
outlive VTK-m's use of it.

`Buffer::TakeHostBufferOwnership` allows easy transfer of memory ownership out of VTK-m.
When taking ownership of a VTK-m buffer, you are provided the following information:
400
401- Memory: A `void*` pointer to the array
402- Container: A `void*` pointer used to free the memory. This is necessary to support cases such as allocations transferred into VTK-m from a `std::vector`.
403- Delete: The function to call to actually delete the transferred memory
- Reallocate: The function to call to re-allocate the transferred memory. This will throw an exception if users try to reallocate a buffer that was 'view' only.
406- Size: The size in number of elements of the array
407
408
409To properly steal memory from VTK-m you do the following:
410```cpp
411  vtkm::cont::ArrayHandle<T> arrayHandle;
412
413  ...
414
415  auto stolen = arrayHandle.GetBuffers()->TakeHostBufferOwnership();
416
417  ...
418
419  stolen.Delete(stolen.Container);
420```
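
Why `Container` is separate from `Memory` may be easier to see in a stand-alone sketch. The struct and helper functions below are hypothetical and written in plain C++; they only mirror the Memory/Container/Delete/Size fields listed above and are not the actual VTK-m types.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical stand-in for the information returned when ownership is taken.
struct ToyTransferredBuffer
{
  void* Memory;          // pointer to the first element of the array
  void* Container;       // object that must be handed to Delete
  void (*Delete)(void*); // releases Container (and therefore Memory)
  std::size_t Size;      // number of elements in the array
};

// Deleter for memory that lives inside a heap-allocated std::vector<T>.
template <typename T>
void DeleteVector(void* container)
{
  delete static_cast<std::vector<T>*>(container);
}

// Wrap a heap-allocated vector. Memory points at the data, but the data can
// only be freed through the vector object, so Container differs from Memory.
template <typename T>
ToyTransferredBuffer WrapVector(std::vector<T>* vec)
{
  return ToyTransferredBuffer{ vec->data(), vec, &DeleteVector<T>, vec->size() };
}
```

In this sketch, calling `Delete` on `Memory` instead of `Container` would be an error, which is why the deleter is always invoked with the container pointer.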
421
422
423## Configurable default types
424
425Because VTK-m compiles efficient code for accelerator architectures, it
426often has to compile for static types. This means that dynamic types often
427have to be determined at runtime and converted to static types. This is the
428reason for the `CastAndCall` architecture in VTK-m.
429
430For this `CastAndCall` to work, there has to be a finite set of static
431types to try at runtime. If you don't compile in the types you need, you
432will get runtime errors. However, the more types you compile in, the longer
433the compile time and executable size. Thus, getting the types right is
434important.
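
The trade-off can be illustrated with a toy version of the pattern in plain C++. Everything below (`ToyUnknownArray`, `TypeList`, this `CastAndCall`, and `SumValues`) is a hypothetical sketch and not VTK-m's implementation: a type-erased array is tested against each type in a compile-time list, and a type missing from the list produces a runtime error.

```cpp
#include <cassert>
#include <stdexcept>
#include <typeinfo>
#include <vector>

// Hypothetical type-erased array that remembers the runtime type it holds.
struct ToyUnknownArray
{
  const void* Data;
  const std::type_info* Type;
};

template <typename T>
ToyUnknownArray MakeUnknown(const std::vector<T>& array)
{
  return ToyUnknownArray{ &array, &typeid(std::vector<T>) };
}

// Compile-time list of the static types to try at runtime.
template <typename... Ts>
struct TypeList
{
};

// No candidates left: the needed type was not compiled in.
template <typename Functor>
void CastAndCall(const ToyUnknownArray&, Functor&&, TypeList<>)
{
  throw std::runtime_error("Array type not in the compiled type list.");
}

// Try the first candidate type; on a mismatch, recurse on the rest.
template <typename Functor, typename T, typename... Rest>
void CastAndCall(const ToyUnknownArray& array, Functor&& functor, TypeList<T, Rest...>)
{
  if (*array.Type == typeid(std::vector<T>))
  {
    functor(*static_cast<const std::vector<T>*>(array.Data));
  }
  else
  {
    CastAndCall(array, functor, TypeList<Rest...>{});
  }
}

// Example functor: sums the values whatever the static type turns out to be.
struct SumValues
{
  double Sum = 0.0;

  template <typename T>
  void operator()(const std::vector<T>& values)
  {
    for (const T& v : values)
    {
      this->Sum += static_cast<double>(v);
    }
  }
};
```

Each type added to the list instantiates the functor once more, which is where the extra compile time and code size come from.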
435
436The "right" types to use can change depending on the application using
437VTK-m. For example, when VTK links in VTK-m, it needs to support lots of
types and can sacrifice the compile times to do so. However, if using VTK-m
in situ with a Fortran simulation, space and time are critical and you
might only need to worry about double SoA arrays.
441
442Thus, it is important to customize what types VTK-m uses based on the
443application. This leads to the oxymoronic phrase of configuring the default
444types used by VTK-m.
445
446This is being implemented by providing VTK-m with a header file that
447defines the default types. The header file provided to VTK-m should define
448one or more of the following preprocessor macros:
449
450  * `VTKM_DEFAULT_TYPE_LIST` - a `vtkm::List` of value types for fields that
451     filters should directly operate on (where applicable).
452  * `VTKM_DEFAULT_STORAGE_LIST` - a `vtkm::List` of storage tags for fields
453     that filters should directly operate on.
  * `VTKM_DEFAULT_CELL_SET_LIST_STRUCTURED` - a `vtkm::List` of
     `vtkm::cont::CellSet` types that filters should operate on as a
     structured cell set.
  * `VTKM_DEFAULT_CELL_SET_LIST_UNSTRUCTURED` - a `vtkm::List` of
     `vtkm::cont::CellSet` types that filters should operate on as an
     unstructured cell set.
  * `VTKM_DEFAULT_CELL_SET_LIST` - a `vtkm::List` of `vtkm::cont::CellSet`
     types that filters should operate on (where applicable). The default of
     `vtkm::ListAppend<VTKM_DEFAULT_CELL_SET_LIST_STRUCTURED, VTKM_DEFAULT_CELL_SET_LIST_UNSTRUCTURED>`
     is usually correct.
464
465If any of these macros are not defined, a default version will be defined.
466(This is the same default used if no header file is provided.)
467
468This header file is provided to the build by setting the
469`VTKm_DEFAULT_TYPES_HEADER` CMake variable. `VTKm_DEFAULT_TYPES_HEADER`
470points to the file, which will be configured and copied to VTK-m's build
471directory.
472
For convenience, header files can be added to the VTK-m source directory
(conventionally under `vtkm/cont/internal`). If this is the case, an advanced
CMake option should be added to select the provided header file.
476
477
478
479# ArrayHandle
480
481## Shorter fancy array handle classnames
482
483Many of the fancy `ArrayHandle`s use the generic builders like
484`ArrayHandleTransform` and `ArrayHandleImplicit` for their implementation.
That is fine, but because they use functors and other such generic items to
486template their `Storage`, you can end up with very verbose classnames. This
487is an issue for humans trying to discern classnames. It can also be an
488issue for compilers that end up with very long resolved classnames that
489might get truncated if they extend past what was expected.
490
The fix was for these classes to declare their own `Storage` tag and then
implement their `Storage` and `ArrayTransport` classes as trivial
subclasses of the generic `ArrayHandleImplicit` or `ArrayHandleTransform`.

As an added bonus, much of this shortening also means that storage that
relies on other array handles is now typed only by the storage of the
decorated array, not by the array type itself. This should make the types
a little more robust.
499
500Here is a list of classes that were updated.
501
502### `ArrayHandleCast<TargetT, vtkm::cont::ArrayHandle<SourceT, SourceStorage>>`
503
504Old storage:
505``` cpp
506vtkm::cont::internal::StorageTagTransform<
507  vtkm::cont::ArrayHandle<SourceT, SourceStorage>,
508  vtkm::cont::internal::Cast<TargetT, SourceT>,
509  vtkm::cont::internal::Cast<SourceT, TargetT>>
510```
511
New storage:
513``` cpp
514vtkm::cont::StorageTagCast<SourceT, SourceStorage>
515```
516
517(Developer's note: Implementing this change to `ArrayHandleCast` was a much bigger PITA than expected.)
518
519### `ArrayHandleCartesianProduct<AH1, AH2, AH3>`
520
521Old storage:
522``` cpp
vtkm::cont::internal::StorageTagCartesianProduct<
  vtkm::cont::ArrayHandle<ValueType, StorageTag1>,
  vtkm::cont::ArrayHandle<ValueType, StorageTag2>,
  vtkm::cont::ArrayHandle<ValueType, StorageTag3>>
527```
528
529New storage:
530``` cpp
531vtkm::cont::StorageTagCartesianProduct<StorageTag1, StorageTag2, StorageTag3>
532```
533
534### `ArrayHandleCompositeVector<AH1, AH2, ...>`
535
536Old storage:
537``` cpp
538vtkm::cont::internal::StorageTagCompositeVector<
539  tao::tuple<
    vtkm::cont::ArrayHandle<ValueType, StorageType1>,
    vtkm::cont::ArrayHandle<ValueType, StorageType2>,
    ...
543  >
544>
545```
546
547New storage:
548``` cpp
549vtkm::cont::StorageTagCompositeVec<StorageType1, StorageType2>
550```
551
### `ArrayHandleConcatenate`
553
554First an example with two simple types.
555
556Old storage:
557``` cpp
558vtkm::cont::StorageTagConcatenate<
559  vtkm::cont::ArrayHandle<ValueType, StorageTag1>,
560  vtkm::cont::ArrayHandle<ValueType, StorageTag2>>
561```
562
563New storage:
564``` cpp
565vtkm::cont::StorageTagConcatenate<StorageTag1, StorageTag2>
566```
567
Now a more specific example taken from the unit test of a concatenation of a concatenation.
569
570Old storage:
571``` cpp
572vtkm::cont::StorageTagConcatenate<
573  vtkm::cont::ArrayHandleConcatenate<
    vtkm::cont::ArrayHandle<ValueType, StorageTag1>,
    vtkm::cont::ArrayHandle<ValueType, StorageTag2>>,
576  vtkm::cont::ArrayHandle<ValueType, StorageTag3>>
577```
578
579New storage:
580``` cpp
581vtkm::cont::StorageTagConcatenate<
582  vtkm::cont::StorageTagConcatenate<StorageTag1, StorageTag2>, StorageTag3>
583```
584
585### `ArrayHandleConstant`
586
587Old storage:
588``` cpp
589vtkm::cont::StorageTagImplicit<
590  vtkm::cont::detail::ArrayPortalImplicit<
591    vtkm::cont::detail::ConstantFunctor<ValueType>>>
592```
593
594New storage:
595``` cpp
596vtkm::cont::StorageTagConstant
597```
598
599### `ArrayHandleCounting`
600
601Old storage:
602``` cpp
603vtkm::cont::StorageTagImplicit<vtkm::cont::internal::ArrayPortalCounting<ValueType>>
604```
605
606New storage:
607``` cpp
608vtkm::cont::StorageTagCounting
609```
610
611### `ArrayHandleGroupVec`
612
613Old storage:
614``` cpp
615vtkm::cont::internal::StorageTagGroupVec<
616  vtkm::cont::ArrayHandle<ValueType, StorageTag>, N>
617```
618
619New storage:
620``` cpp
621vtkm::cont::StorageTagGroupVec<StorageTag, N>
622```
623
624### `ArrayHandleGroupVecVariable`
625
626Old storage:
627``` cpp
628vtkm::cont::internal::StorageTagGroupVecVariable<
629  vtkm::cont::ArrayHandle<ValueType, StorageTag1>,
630  vtkm::cont::ArrayHandle<vtkm::Id, StorageTag2>>
631```
632
633New storage:
634``` cpp
635vtkm::cont::StorageTagGroupVecVariable<StorageTag1, StorageTag2>
636```
637
638### `ArrayHandleIndex`
639
640Old storage:
641``` cpp
642vtkm::cont::StorageTagImplicit<
643  vtkm::cont::detail::ArrayPortalImplicit<vtkm::cont::detail::IndexFunctor>>
644```
645
646New storage:
647``` cpp
648vtkm::cont::StorageTagIndex
649```
650
651### `ArrayHandlePermutation`
652
653Old storage:
654``` cpp
655vtkm::cont::internal::StorageTagPermutation<
656  vtkm::cont::ArrayHandle<vtkm::Id, StorageTag1>,
657  vtkm::cont::ArrayHandle<ValueType, StorageTag2>>
658```
659
660New storage:
661``` cpp
662vtkm::cont::StorageTagPermutation<StorageTag1, StorageTag2>
663```
664
665### `ArrayHandleReverse`
666
667Old storage:
668``` cpp
vtkm::cont::StorageTagReverse<vtkm::cont::ArrayHandle<ValueType, StorageTag>>
670```
671
672New storage:
673``` cpp
674vtkm::cont::StorageTagReverse<StorageTag>
675```
676
677### `ArrayHandleUniformPointCoordinates`
678
679Old storage:
680``` cpp
681vtkm::cont::StorageTagImplicit<vtkm::internal::ArrayPortalUniformPointCoordinates>
682```
683
New storage:
685``` cpp
686vtkm::cont::StorageTagUniformPoints
687```
688
689### `ArrayHandleView`
690
691Old storage:
692``` cpp
693vtkm::cont::StorageTagView<vtkm::cont::ArrayHandle<ValueType, StorageTag>>
694```
695
696New storage:
697``` cpp
vtkm::cont::StorageTagView<StorageTag>
699```
700
701
### `ArrayHandleZip`
703
704Old storage:
705``` cpp
706vtkm::cont::internal::StorageTagZip<
707  vtkm::cont::ArrayHandle<ValueType1, StorageTag1>,
708  vtkm::cont::ArrayHandle<ValueType2, StorageTag2>>
709```
710
711New storage:
712``` cpp
713vtkm::cont::StorageTagZip<StorageTag1, StorageTag2>
714```
715
716
717## `ReadPortal().Get(idx)`
718
719Calling `ReadPortal()` in a tight loop is an antipattern.
720A call to `ReadPortal()` causes the array to be copied back to the control environment,
721and hence code like
722
723```cpp
724for (vtkm::Id i = 0; i < array.GetNumberOfValues(); ++i) {
725    vtkm::FloatDefault x = array.ReadPortal().Get(i);
726}
727```
728
729is a quadratic-scaling loop.
730
We have removed *almost* all internal uses of the `ReadPortal().Get` antipattern,
with the exception of 4 API calls in which the pattern is baked in:
`CellSetExplicit::GetCellShape`, `CellSetPermutation::GetNumberOfPointsInCell`, `CellSetPermutation::GetCellShape`, and `CellSetPermutation::GetCellPointIds`.
We expect these will need to be deprecated in the future.
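
The cost described above can be modeled with a toy class in plain C++. `ToyDeviceArray` is a hypothetical stand-in, not a VTK-m class: it assumes, as the text states, that every `ReadPortal()` call copies the whole array, and it counts those copies so the two loop styles can be compared.

```cpp
#include <cassert>
#include <cstddef>
#include <utility>
#include <vector>

// Toy stand-in for an array whose data lives on a device. Every ReadPortal()
// call simulates a full device-to-host copy and counts it.
class ToyDeviceArray
{
public:
  explicit ToyDeviceArray(std::vector<float> values)
    : DeviceData(std::move(values))
  {
  }

  std::size_t GetNumberOfValues() const { return this->DeviceData.size(); }

  std::vector<float> ReadPortal() const
  {
    ++this->Copies;
    return this->DeviceData;
  }

  mutable std::size_t Copies = 0;

private:
  std::vector<float> DeviceData;
};

// Antipattern: one full copy per loop iteration.
float SumWithPortalInLoop(const ToyDeviceArray& array)
{
  float sum = 0;
  for (std::size_t i = 0; i < array.GetNumberOfValues(); ++i)
  {
    sum += array.ReadPortal()[i];
  }
  return sum;
}

// Preferred: get the portal once, outside the loop.
float SumWithHoistedPortal(const ToyDeviceArray& array)
{
  const std::vector<float> portal = array.ReadPortal();
  float sum = 0;
  for (std::size_t i = 0; i < portal.size(); ++i)
  {
    sum += portal[i];
  }
  return sum;
}
```

In real VTK-m code the corresponding fix is to call `ReadPortal()` once before the loop and call `Get(i)` on the returned portal inside it.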
735
736## Precompiled `ArrayCopy` for `UnknownArrayHandle`
737
738Previously, in order to copy an `UnknownArrayHandle`, you had to specify
739some subset of types and then specially compile a copy for each potential
740type. With the new ability to extract a component from an
741`UnknownArrayHandle`, it is now feasible to precompile copying an
742`UnknownArrayHandle` to another array. This greatly reduces the overhead of
743using `ArrayCopy` to copy `UnknownArrayHandle`s while simultaneously
744increasing the likelihood that the copy will be successful.
745
746## Create `ArrayHandleOffsetsToNumComponents`
747
748`ArrayHandleOffsetsToNumComponents` is a fancy array that takes an array of
749offsets and converts it to an array of the number of components for each
750packed entry.
751
It is common in VTK-m to pack small vectors of variable sizes into a single
contiguous array. For example, cells in an explicit cell set can each have
a different number of vertices (triangles = 3, quads = 4, tetra = 4, hexa =
8, etc.). Generally, to access items in this list, you need an array of the
number of components in each entry and the offset for each entry. However,
if you have just the array of offsets in sorted order, you can easily derive
the number of components for each entry by subtracting adjacent entries. This
works best if the offsets array has a size that is one more than the number
of packed vectors with the first entry set to 0 and the last entry set to
the total size of the packed array (the offset to the end).
762
763When packing data of this nature, it is common to start with an array that
764is the number of components. You can convert that to an offsets array using
765the `vtkm::cont::ConvertNumComponentsToOffsets` function. This will create
766an offsets array with one extra entry as previously described. You can then
767throw out the original number of components array and use the offsets with
768`ArrayHandleOffsetsToNumComponents` to represent both the offsets and num
769components while storing only one array.
770
This replaces the use of `ArrayHandleDecorator` in `CellSetExplicit`.
The two implementations should do the same thing, but the new
`ArrayHandleOffsetsToNumComponents` should be less complex for
compilers.
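
The arithmetic behind the two representations is small enough to sketch in plain C++. The functions `NumComponentsToOffsets` and `NumComponentsAt` below are hypothetical helpers operating on `std::vector`; they are not VTK-m functions and only illustrate the offsets layout with one extra entry.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Build an offsets array with one extra entry from per-entry component counts.
// The first offset is 0 and the last is the total size of the packed array.
std::vector<std::size_t> NumComponentsToOffsets(const std::vector<std::size_t>& numComponents)
{
  std::vector<std::size_t> offsets(numComponents.size() + 1, 0);
  for (std::size_t i = 0; i < numComponents.size(); ++i)
  {
    offsets[i + 1] = offsets[i] + numComponents[i];
  }
  return offsets;
}

// Derive the component count of entry `index` by subtracting adjacent offsets.
std::size_t NumComponentsAt(const std::vector<std::size_t>& offsets, std::size_t index)
{
  return offsets[index + 1] - offsets[index];
}
```

For a triangle, a quad, a tetrahedron, and a hexahedron the counts are 3, 4, 4, 8 and the offsets become 0, 3, 7, 11, 19, so the counts never need to be stored separately.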
775
776## Recombine extracted component arrays from unknown arrays
777
778Building on the recent capability to [extract component arrays from unknown
779arrays](array-extract-component.md), there is now also the ability to
780recombine these extracted arrays to a single `ArrayHandle`. It might seem
781counterintuitive to break an `ArrayHandle` into component arrays and then
782combine the component arrays back into a single `ArrayHandle`, but this is
783a very handy way to run algorithms without knowing the exact `ArrayHandle`
784type.
785
Recall that when extracting a component array from an `UnknownArrayHandle`
you only need to know the base component of the value type of the contained
`ArrayHandle`. That makes extracting a component array independent of
both the size of any `Vec` value type and the storage type.
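
One way to picture this independence is a strided view over interleaved storage. The sketch below is plain C++ with hypothetical names (`ToyComponentView`, `ToyExtractComponent`) and assumes an array-of-structures layout; it is not the VTK-m implementation. Note that the view is templated only on the base component type, while the `Vec` size is an ordinary runtime value.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical strided view of one component inside interleaved (AOS) storage.
template <typename T>
struct ToyComponentView
{
  const T* Base;
  std::size_t Offset;    // index of the component within each Vec
  std::size_t Stride;    // number of components per Vec
  std::size_t NumValues; // number of Vec values in the array

  T Get(std::size_t index) const { return this->Base[this->Offset + index * this->Stride]; }
};

// Extract component `component` from a flat array holding Vecs of size `vecSize`.
// Only the base component type T must be known at compile time.
template <typename T>
ToyComponentView<T> ToyExtractComponent(const std::vector<T>& flat,
                                        std::size_t vecSize,
                                        std::size_t component)
{
  return ToyComponentView<T>{ flat.data(), component, vecSize, flat.size() / vecSize };
}
```

The same `ToyComponentView<float>` type serves arrays of 2-, 3-, or 9-component vectors, so code written against it is compiled once per base component type.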
790
The added `UnknownArrayHandle::ExtractArrayFromComponents` method allows
you to use this functionality to transform the unknown array handle to a
form of `ArrayHandle` that depends only on this base component type. This
method internally uses a new `ArrayHandleRecombineVec` class, but that
class is mostly intended for internal use by `UnknownArrayHandle`.
796
797As an added convenience, `UnknownArrayHandle` now also provides the
798`CastAndCallWithExtractedArray` method. This method works like other
799`CastAndCall`s except that it uses the `ExtractArrayFromComponents` feature
800to allow you to handle most `ArrayHandle` types with few template
801instances.
802
803
804## UnknownArrayHandle and UncertainArrayHandle for runtime-determined types
805
806Two new classes have been added to VTK-m: `UnknownArrayHandle` and
807`UncertainArrayHandle`. These classes serve the same purpose as the set of
808`VariantArrayHandle` classes and will replace them.
809
810Motivated mostly by the desire to move away from `ArrayHandleVirtual`, we
811have multiple reasons to completely refactor the `VariantArrayHandle`
812class. These include changing the implementation, some behavior, and even
813the name.
814
815### Motivation
816
817We have several reasons that have accumulated to revisit the implementation
818of `VariantArrayHandle`.
819
820#### Move away from `ArrayHandleVirtual`
821
822The current implementation of `VariantArrayHandle` internally stores the
823array wrapped in an `ArrayHandleVirtual`. That makes sense since you might
824as well consolidate the hierarchy of virtual objects into one.
825
826Except `ArrayHandleVirtual` is being deprecated, so it no longer makes
827sense to use that internally.
828
829So we will transition the class back to managing the data as typeless on
830its own. We will consider using function pointers rather than actual
831virtual functions because compilers can be slow in creating lots of virtual
832subclasses.
833
834#### Reintroduce storage tag lists
835
The original implementation of `VariantArrayHandle` (which at the time was
called `DynamicArrayHandle`) actually had two type lists: one for the array
value type and one for the storage type. The storage type list was removed
soon after `ArrayHandleVirtual` was introduced because, whatever the type of
array, it could be accessed as an `ArrayHandleVirtual`.

However, with `ArrayHandleVirtual` being deprecated, this feature is no
longer possible. We again need the list of storage types to try.
Thus, we need to reintroduce this template argument to
`VariantArrayHandle`.
846
#### Clearer name

The name of this class has always been unsatisfactory. The first name,
`DynamicArrayHandle`, makes it sound like the data is always changing. The
second name, `VariantArrayHandle`, makes it sound like an array that holds
a value type that can vary (like a `std::variant`).

We can use a clearer name that better expresses that it is holding an
`ArrayHandle` of an _unknown_ type.
856
857#### Take advantage of default types for less templating
858
Once upon a time everything in VTK-m was a templated header library. Things
have changed quite a bit since then. The most recent development is the
ability to select the "default types" with CMake configuration, which allows
you to select a global set of types you care about during compilation. This
is so units like filters can be compiled into a library with all types we
care about, and we don't have to constantly recompile units.
865
866This means that we are becoming less concerned about maintaining type lists
867everywhere. Often we can drop the type list and pass data across libraries.
868
869With that in mind, it makes less sense for `VariantArrayHandle` to actually
870be a `using` alias for `VariantArrayHandleBase<VTKM_DEFAULT_TYPE_LIST>`.
871
In response, we can reverse the is-a relationship between the two. Have a
completely typeless version as the base class and a second, templated
version to express when the type of the array has been partially narrowed
down to given type lists.
876
877### New Name and Structure
878
879The ultimate purpose of this class is to store an `ArrayHandle` where the
880value and storage types are unknown. Thus, an appropriate name for the
881class is `UnknownArrayHandle`.
882
883`UnknownArrayHandle` is _not_ templated. It simply stores an `ArrayHandle`
884in a typeless (`void *`) buffer. It does, however, contain many templated
885methods that allow you to query whether the contained array matches given
886types, to cast to given types, and to cast and call to a given functor
887(from either given type lists or default lists).
888
889Rather than have a virtual class structure to manage the typeless array,
the new management will use function pointers. This has been shown to sometimes
891improve compile times and generate less code.
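
The following is a minimal sketch of the function-pointer approach described above, not the actual VTK-m implementation. The class name `UnknownArraySketch` and its methods are hypothetical, and a `std::vector` stands in for an `ArrayHandle`. The templated constructor records a plain function pointer that knows how to query the erased type, so no virtual subclass has to be generated per value type.

```cpp
#include <cassert>
#include <cstddef>
#include <memory>
#include <typeinfo>
#include <utility>
#include <vector>

// Hypothetical stand-in for a typeless array container; not the VTK-m class.
class UnknownArraySketch
{
public:
  template <typename T>
  explicit UnknownArraySketch(std::vector<T> values)
    : Data(std::make_shared<std::vector<T>>(std::move(values)))
    , Type(&typeid(T))
    , SizeFunction(&UnknownArraySketch::Size<T>)
  {
  }

  // Dispatches through the stored function pointer instead of a virtual method.
  std::size_t GetNumberOfValues() const { return this->SizeFunction(this->Data.get()); }

  // Query whether the contained array holds values of type T.
  template <typename T>
  bool IsValueType() const
  {
    return *this->Type == typeid(T);
  }

  // Cast back to the concrete type. The caller must have checked the type.
  template <typename T>
  const std::vector<T>& AsVector() const
  {
    assert(this->IsValueType<T>());
    return *static_cast<const std::vector<T>*>(this->Data.get());
  }

private:
  // One instance of this function is created per value type actually stored.
  template <typename T>
  static std::size_t Size(const void* pointer)
  {
    return static_cast<const std::vector<T>*>(pointer)->size();
  }

  std::shared_ptr<void> Data;              // typeless storage
  const std::type_info* Type;              // remembers what was stored
  std::size_t (*SizeFunction)(const void*); // typed behavior behind a typeless call
};
```

Because `UnknownArraySketch` itself is not a template, code that only calls `GetNumberOfValues` can live in a precompiled library; only the templated queries and casts need to be instantiated by the caller.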
892
893Sometimes it is the case that the set of potential types can be narrowed. In
894this case, the array ceases to be unknown and becomes _uncertain_. Thus,
895the companion class to `UnknownArrayHandle` is `UncertainArrayHandle`.
896
897`UncertainArrayHandle` has two template parameters: a list of potential
898value types and a list of potential storage types. The behavior of
899`UncertainArrayHandle` matches that of `UnknownArrayHandle` (and might
900inherit from it). However, for `CastAndCall` operations, it will use the
901type lists defined in its template parameters.
902
903### Serializing UnknownArrayHandle
904
905Because `UnknownArrayHandle` is not templated, it contains some
906opportunities to compile things into the `vtkm_cont` library. Templated
907methods like `CastAndCall` cannot be, but the specializations of DIY's
908serialize can be.
909
910And since it only has to be compiled once into a library, we can spend some
911extra time compiling for more types. We don't have to restrict ourselves to
`VTKM_DEFAULT_TYPE_LIST`. We can compile for `vtkm::TypeListTagAll`.
913
914
## Support `ArrayHandleSOA` as a "default" array
916
917Many programs, particularly simulations, store fields of vectors in
918separate arrays for each component. This maps to the storage of
919`ArrayHandleSOA`. The VTK-m code tends to prefer the AOS storage (which is
920what is implemented in `ArrayHandleBasic`, and the behavior of which is
921inherited from VTK). VTK-m should better support adding `ArrayHandleSOA` as
922one of the types.
923
924We now have a set of default types for Ascent that uses SOA as one of the
925basic types.
926
927Part of this change includes an intentional feature regression of
928`ArrayHandleSOA` to only support value types of `Vec`. Previously, scalar
929types were supported. However, the behavior of `ArrayHandleSOA` is exactly
930the same as `ArrayHandleBasic`, except a lot more template code has to be
931generated. That itself is not a huge deal, but because you have 2 types
932that essentially do the same thing, a lot of template code in VTK-m would
933unwind to create two separate code paths that do the same thing with the
934same data. To avoid creating those code paths, we simply make any use of
935`ArrayHandleSOA` without a `Vec` value invalid. This will prevent VTK-m
936from creating those code paths.
937
938
939## Removed old `ArrayHandle` transfer mechanism
940
941Deleted the default implementation of `ArrayTransfer`. `ArrayTransfer` is
942used with the old `ArrayHandle` style to move data between host and device.
943The new version of `ArrayHandle` does not use `ArrayTransfer` at all
944because this functionality is wrapped in `Buffer` (where it can exist in a
945precompiled library).
946
947Once all the old `ArrayHandle` classes are gone, this class will be removed
948completely. Although all the remaining `ArrayHandle` classes provide their
949own versions of `ArrayTransfer`, they still need the prototype to be
950defined to specialize. Thus, the guts of the default `ArrayTransfer` are
951removed and replaced with a compile error if you try to compile it.
952
953Also removed `ArrayManagerExecution`. This class was used indirectly by the
954old `ArrayHandle`, through `ArrayHandleTransfer`, to move data to and from
955a device. This functionality has been replaced in the new `ArrayHandle`s
956through the `Buffer` class (which can be compiled into libraries rather
957than make every translation unit compile their own template).
958
959
960## Order asynchronous `ArrayHandle` access
961
962The recent feature of [tokens that scope access to
963`ArrayHandle`s](scoping-tokens.md) allows multiple threads to use the same
964`ArrayHandle`s without read/write hazards. The intent is twofold. First, it
965allows two separate threads in the control environment to independently
966schedule tasks. Second, it allows us to move toward scheduling worklets and
967other algorithms asynchronously.
968
969However, there was a flaw with the original implementation. Once requests
970to an `ArrayHandle` get queued up, they are resolved in arbitrary order.
971This might mean that things run in surprising and incorrect order.
972
973### Problematic use case
974
975To demonstrate the flaw in the original implementation, let us consider a
976future scenario where when you invoke a worklet (on OpenMP or TBB), the
977call to invoke returns immediately and the actual work is scheduled
978asynchronously. Now let us say we have a sequence of 3 worklets we wish to
979run: `Worklet1`, `Worklet2`, and `Worklet3`. One of `Worklet1`'s parameters
980is a `FieldOut` that creates an intermediate `ArrayHandle` that we will
981simply call `array`. `Worklet2` is given `array` as a `FieldInOut` to
982modify its values. Finally, `Worklet3` is given `array` as a `FieldIn`. It
983is clear that for the computation to be correct, the three worklets _must_
984execute in the correct order of `Worklet1`, `Worklet2`, and `Worklet3`.
985
986The problem is that if `Worklet2` and `Worklet3` are both scheduled before
987`Worklet1` finishes, the order they are executed could be arbitrary. Let us
988say that `Worklet1` is invoked, and the invoke call returns before the
989execution of `Worklet1` finishes.
990
991The calling code immediately invokes `Worklet2`. Because `array` is already
992locked by `Worklet1`, `Worklet2` does not execute right away. Instead, it
993waits on a condition variable of `array` until it is free. But even though
994the scheduling of `Worklet2` is blocked, the invoke returns because we are
995scheduling asynchronously.
996
997Likewise, the calling code then immediately calls invoke for `Worklet3`.
998`Worklet3` similarly waits on the condition variable of `array` until it is
999free.
1000
1001Let us assume the likely event that both `Worklet2` and `Worklet3` get
1002scheduled before `Worklet1` finishes. When `Worklet1` then later does
finish, its token relinquishes the lock on `array`, which wakes up the
threads waiting for access to `array`. However, no order is imposed on
which waiting thread will acquire the lock and run first. (At least,
1006I'm not aware of anything imposing an order.) Thus, it is quite possible
1007that `Worklet3` will wake up first. It will see that `array` is no longer
1008locked (because `Worklet1` has released it and `Worklet2` has not had a
1009chance to claim it).
1010
1011Oops. Now `Worklet3` is operating on `array` before `Worklet2` has had a
1012chance to put the correct values in it. The results will be wrong.
1013
1014### Queuing requests
1015
1016What we want is to impose the restriction that locks to an `ArrayHandle`
1017get resolved in the order that they are requested. In the previous example,
1018we have 3 requests on an array that happen in a known order. We want
1019control given to them in the same order.
1020
1021To implement this, we need to impose another restriction on the
1022`condition_variable` when waiting to read or write. We want the lock to go
1023to the thread that first started waiting. To do this, we added an
1024internal queue of `Token`s to the `ArrayHandle`.
1025
1026In `ArrayHandle::WaitToRead` and `ArrayHandle::WaitToWrite`, it first adds
1027its `Token` to the back of the queue before waiting on the condition
1028variable. In the `CanRead` and `CanWrite` methods, it checks this queue to
1029see if the provided `Token` is at the front. If not, then the lock is
1030denied and the thread must continue to wait.
1031
1032### Early enqueuing
1033
1034Another issue that can happen in the previous example is that as threads
1035are spawned for the 3 different worklets, they may actually start running
1036in an unexpected order. So the thread running `Worklet3` might actually
1037start before the other 2 and place itself in the queue first.
1038
1039The solution is to add a method to `ArrayHandle` called `Enqueue`. This
1040method takes a `Token` object and adds that `Token` to the queue. However,
1041regardless of where the `Token` ends up on the queue, the method
1042immediately returns. It does not attempt to lock the `ArrayHandle`.
1043
1044So now we can ensure that `Worklet1` properly locks `array` with this
1045sequence of events. First, the main thread calls `array.Enqueue`. Then a
1046thread is spawned to call `PrepareForOutput`.
1047
1048Even if control returns to the calling code and it calls invoke for
1049`Worklet2` before this spawned thread starts, `Worklet2` cannot start
1050first. When `PrepareForInput` is called on `array`, it is queued after the
1051`Token` for `Worklet1`, even if `Worklet1` has not started waiting on the
1052`array`.
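
The queuing rules above can be sketched with a small, single-threaded model. This is an illustration under stated assumptions rather than VTK-m's code: the class `OrderedAccessSketch`, the integer `TokenId`, and the `TryAcquire` method are hypothetical, and the real implementation waits on a condition variable where this sketch simply reports failure.

```cpp
#include <cassert>
#include <deque>

// Hypothetical model of first-come, first-served access to one array.
class OrderedAccessSketch
{
public:
  using TokenId = int;

  // Reserve a place in line. Never blocks and never takes the lock.
  void Enqueue(TokenId token) { this->Queue.push_back(token); }

  // Mirrors the CanRead/CanWrite check: only the token at the front of the
  // queue may take the lock, and only while nobody else holds it.
  bool CanAcquire(TokenId token) const
  {
    return !this->Locked && !this->Queue.empty() && (this->Queue.front() == token);
  }

  // A real implementation would wait on a condition variable until
  // CanAcquire(token) becomes true. This sketch just returns false.
  bool TryAcquire(TokenId token)
  {
    if (!this->CanAcquire(token))
    {
      return false;
    }
    this->Queue.pop_front();
    this->Locked = true;
    return true;
  }

  // Give up the lock so the next token in the queue can proceed.
  void Release()
  {
    assert(this->Locked);
    this->Locked = false;
  }

private:
  std::deque<TokenId> Queue;
  bool Locked = false;
};
```

If tokens 1, 2, and 3 are enqueued in that order by the calling thread, a request made with token 3 is refused until tokens 1 and 2 have each acquired and released the lock, no matter which worker happens to ask first.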
1053
1054
1055## Improvements to moving data into ArrayHandle
1056
1057We have made several improvements to adding data into an `ArrayHandle`.
1058
1059### Moving data from an `std::vector`
1060
1061For numerous reasons, it is convenient to define data in a `std::vector`
1062and then wrap that into an `ArrayHandle`. There are two obvious ways to do
1063this. First, you could deep copy the data into an `ArrayHandle`, which has
1064obvious drawbacks. Second, you could take the pointer for the data in the
1065`std::vector` and use that as user-allocated memory in the `ArrayHandle`
1066without deep copying it. The problem with this shallow copy is that it is
1067unsafe. If the `std::vector` goes out of scope (or gets resized), then the
1068data the `ArrayHandle` is pointing to becomes unallocated, which will lead
1069to unpredictable behavior.
1070
1071However, there is a third option. It is often the case that an
1072`std::vector` is filled and then becomes unused once it is converted to an
1073`ArrayHandle`. In this case, what we really want is to pass the data off to
1074the `ArrayHandle` so that the `ArrayHandle` is now managing the data and
1075not the `std::vector`.
1076
1077C++11 has a mechanism to do this: move semantics. You can now pass
1078variables to functions as an "rvalue" (right-hand value). When something is
1079passed as an rvalue, it can pull state out of that variable and move it
1080somewhere else. `std::vector` implements this movement so that an rvalue
1081can be moved to another `std::vector` without actually copying the data.
1082`make_ArrayHandle` now also takes advantage of this feature to move rvalue
1083`std::vector`s.
1084
1085There is a special form of `make_ArrayHandle` named `make_ArrayHandleMove`
1086that takes an rvalue. There is also a special overload of
1087`make_ArrayHandle` itself that handles an rvalue `vector`. (However, using
1088the explicit move version is better if you want to make sure the data is
1089actually moved.)
1090
1091So if you create the `std::vector` in the call to `make_ArrayHandle`, then
1092the data only gets created once.
1093
1094``` cpp
1095auto array = vtkm::cont::make_ArrayHandleMove(std::vector<vtkm::Id>{ 2, 6, 1, 7, 4, 3, 9 });
1096```
1097
1098Note that there is now a better way to express an initializer list to
`ArrayHandle` documented below. But this form of `make_ArrayHandleMove` can be
1100particularly useful for initializing an array to all of a particular value.
1101For example, an easy way to initialize an array of 1000 elements all to 1
1102is
1103
1104``` cpp
1105auto array = vtkm::cont::make_ArrayHandleMove(std::vector<vtkm::Id>(1000, 1));
1106```
1107
1108You can also move the data from an already created `std::vector` by using
the `std::move` function to convert it to an rvalue. When you do this, the
`std::vector` is left in a valid but unspecified state after the call, so do
not rely on its contents.
1111
1112``` cpp
1113std::vector<vtkm::Id> vector;
1114// fill vector
1115
1116auto array = vtkm::cont::make_ArrayHandleMove(std::move(vector));
1117```
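
To illustrate why moving avoids a deep copy, here is a minimal standard-library sketch. The names `OwningArraySketch` and `MakeArrayMoveSketch` are hypothetical stand-ins and do not reflect how `make_ArrayHandleMove` is implemented. The only point shown is that the heap allocation of the `std::vector` is handed over to the new owner rather than duplicated.

```cpp
#include <cassert>
#include <cstddef>
#include <utility>
#include <vector>

// Hypothetical owner type standing in for an ArrayHandle; not VTK-m code.
template <typename T>
class OwningArraySketch
{
public:
  // Taking an rvalue reference forces callers to pass a temporary or to
  // write std::move explicitly, which makes the ownership transfer visible.
  explicit OwningArraySketch(std::vector<T>&& values)
    : Values(std::move(values))
  {
  }

  const T* GetPointer() const { return this->Values.data(); }
  std::size_t GetNumberOfValues() const { return this->Values.size(); }

  T Get(std::size_t index) const
  {
    assert(index < this->Values.size());
    return this->Values[index];
  }

private:
  std::vector<T> Values; // now the sole owner of the moved allocation
};

template <typename T>
OwningArraySketch<T> MakeArrayMoveSketch(std::vector<T>&& values)
{
  return OwningArraySketch<T>(std::move(values));
}
```

Comparing `source.data()` before the call with `GetPointer()` afterwards shows the same address, because the move constructor of `std::vector` transfers the existing buffer instead of allocating a new one.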
1118
### Make `ArrayHandle` from initializer list
1120
1121A common use case for using `std::vector` (particularly in our unit tests)
is to quickly add an initializer list into an `ArrayHandle`. Repeating the
1123example from above:
1124
1125``` cpp
1126auto array = vtkm::cont::make_ArrayHandleMove(std::vector<vtkm::Id>{ 2, 6, 1, 7, 4, 3, 9 });
1127```
1128
1129However, creating the `std::vector` should be unnecessary. Why not be able
1130to create the `ArrayHandle` directly from an initializer list? Now you can
1131by simply passing an initializer list to `make_ArrayHandle`.
1132
1133``` cpp
1134auto array = vtkm::cont::make_ArrayHandle({ 2, 6, 1, 7, 4, 3, 9 });
1135```
1136
1137There is an issue here. The type here can be a little ambiguous (for
1138humans). In this case, `array` will be of type
1139`vtkm::cont::ArrayHandleBasic<int>`, since that is what an integer literal
1140defaults to. This could be a problem if, for example, you want to use
1141`array` as an array of `vtkm::Id`, which could be of type `vtkm::Int64`.
1142This is easily remedied by specifying the desired value type as a template
1143argument to `make_ArrayHandle`.
1144
1145``` cpp
1146auto array = vtkm::cont::make_ArrayHandle<vtkm::Id>({ 2, 6, 1, 7, 4, 3, 9 });
1147```
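
The type deduction at play here is ordinary C++ and can be reproduced without VTK-m. In the sketch below, `MakeArraySketch` is a hypothetical helper with the same shape as an initializer-list overload, and it returns a `std::vector` purely for illustration.

```cpp
#include <cassert>
#include <cstdint>
#include <initializer_list>
#include <type_traits>
#include <vector>

// Hypothetical helper: T is deduced from the initializer list unless the
// caller names it explicitly.
template <typename T>
std::vector<T> MakeArraySketch(std::initializer_list<T> values)
{
  return std::vector<T>(values);
}

// Checks the deduced and the explicitly requested value types.
inline void DemonstrateDeduction()
{
  auto deduced = MakeArraySketch({ 2, 6, 1, 7, 4, 3, 9 });
  auto widened = MakeArraySketch<std::int64_t>({ 2, 6, 1, 7, 4, 3, 9 });

  static_assert(std::is_same<decltype(deduced)::value_type, int>::value,
                "integer literals deduce to int");
  static_assert(std::is_same<decltype(widened)::value_type, std::int64_t>::value,
                "an explicit template argument overrides the deduction");

  assert(deduced.size() == widened.size());
}
```

The values are identical in both calls. Only the element type differs, which is why naming the value type matters when the array is later expected to hold 64-bit ids.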
1148
1149### Deprecated `make_ArrayHandle` with default shallow copy
1150
1151For historical reasons, passing an `std::vector` or a pointer to
1152`make_ArrayHandle` does a shallow copy (i.e. `CopyFlag` defaults to `Off`).
Although more efficient, this mode is inherently unsafe, and making it the
1154default is asking for trouble.
1155
1156To combat this, calling `make_ArrayHandle` without a copy flag is
1157deprecated. In this way, if you wish to do the faster but more unsafe
1158creation of an `ArrayHandle` you should explicitly express that.
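
The difference between the two modes can be shown with a small standard-library model. Everything named here (`CopyFlagSketch`, `ArraySketch`) is hypothetical and only mimics the idea of a required copy flag. It is not the `vtkm::CopyFlag` API.

```cpp
#include <cassert>
#include <cstddef>
#include <memory>
#include <vector>

enum class CopyFlagSketch
{
  Off, // shallow: keep pointing at the caller's memory
  On   // deep: make a private copy
};

template <typename T>
class ArraySketch
{
public:
  // `copy` has no default, so the caller must state which behavior is wanted.
  ArraySketch(const std::vector<T>& source, CopyFlagSketch copy)
    : Data(source.data())
    , NumberOfValues(source.size())
  {
    if (copy == CopyFlagSketch::On)
    {
      this->Owned = std::make_shared<std::vector<T>>(source);
      this->Data = this->Owned->data();
    }
  }

  const T* GetPointer() const { return this->Data; }

  T Get(std::size_t index) const
  {
    assert(index < this->NumberOfValues);
    return this->Data[index];
  }

private:
  std::shared_ptr<std::vector<T>> Owned; // empty for shallow copies
  const T* Data;
  std::size_t NumberOfValues;
};
```

With `CopyFlagSketch::Off`, later changes to the source vector show through the array, and destroying or resizing the source leaves the array dangling. With `CopyFlagSketch::On`, the array is insulated from both.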
1159
This required quite a few changes through the VTK-m source (particularly in
1161the tests).
1162
1163### Similar changes to `Field`
1164
1165`vtkm::cont::Field` has a `make_Field` helper function that is similar to
1166`make_ArrayHandle`. It also features the ability to create fields from
1167`std::vector`s and C arrays. It also likewise had the same unsafe behavior
1168by default of not copying from the source of the arrays.
1169
That behavior has similarly been deprecated. You now have to specify a
1171copy flag.
1172
1173The ability to construct a `Field` from an initializer list of values has
1174also been added.
1175
1176
1177## Deprecate ArrayHandleVirtualCoordinates
1178
1179As we port VTK-m to more types of accelerator architectures, supporting
1180virtual methods is becoming more problematic. Thus, we are working to back
1181out of using virtual methods in the execution environment.
1182
1183One of the most widespread users of virtual methods in the execution
1184environment is `ArrayHandleVirtual`. As a first step of deprecating this
1185class, we first deprecate the `ArrayHandleVirtualCoordinates` subclass.
1186
1187Not surprisingly, `ArrayHandleVirtualCoordinates` is used directly by
1188`CoordinateSystem`. The biggest change necessary was that the `GetData`
1189method returned an `ArrayHandleVirtualCoordinates`, which obviously would
1190not work if that class is deprecated.
1191
1192An oddness about this return type is that it is quite different from the
1193superclass's method of the same name. Rather, `Field` returns a
1194`VariantArrayHandle`. Since this had to be corrected anyway, it was decided
1195to change `CoordinateSystem`'s `GetData` to also return a
1196`VariantArrayHandle`, although its typelist is set to just `vtkm::Vec3f`.
1197
1198To try to still support old code that expects the deprecated behavior of
1199returning an `ArrayHandleVirtualCoordinates`, `CoordinateSystem::GetData`
1200actually returns a "hidden" subclass of `VariantArrayHandle` that
1201automatically converts itself to an `ArrayHandleVirtualCoordinates`. (A
1202deprecation warning is given if this is done.)
1203
1204This approach to support deprecated code is not perfect. The returned value
1205for `CoordinateSystem::GetData` can only be used as an `ArrayHandle` if a
1206method is directly called on it or if it is cast specifically to
1207`ArrayHandleVirtualCoordinates` or its superclass. For example, if passing
1208it to a method argument typed as `vtkm::cont::ArrayHandle<T, S>` where `T`
1209and `S` are template parameters, then the conversion will fail.
1210
1211To continue to support ease of use, `CoordinateSystem` now has a method
1212named `GetDataAsMultiplexer` that returns the data as an
1213`ArrayHandleMultiplexer`. This can be employed to quickly use the
1214`CoordinateSystem` as an array without the overhead of a `CastAndCall`.
1215
1216## ArrayHandleDecorator Allocate and Shrink Support
1217
1218`ArrayHandleDecorator` can now be resized when given an appropriate
1219decorator implementation.
1220
1221Since the mapping between the size of an `ArrayHandleDecorator` and its source
1222`ArrayHandle`s is not well defined, resize operations (such as `Shrink` and
1223`Allocate`) are not defined by default, and will throw an exception if called.
1224
1225However, by implementing the methods `AllocateSourceArrays` and/or
1226`ShrinkSourceArrays` on the implementation class, resizing the decorator is
1227allowed. These methods are passed in a new size along with each of the
1228`ArrayHandleDecorator`'s source arrays, allowing developers to control how
1229the resize operation should affect the source arrays.
1230
1231For example, the following decorator implementation can be used to create a
1232resizable `ArrayHandleDecorator` that is implemented using two arrays, which
1233are combined to produce values via the expression:
1234
1235```
1236[decorator value i] = [source1 value i] * 10 + [source2 value i]
1237```
1238
1239Implementation:
1240
1241```c++
1242  template <typename ValueType>
1243  struct DecompositionDecorImpl
1244  {
1245    template <typename Portal1T, typename Portal2T>
1246    struct Functor
1247    {
1248      Portal1T Portal1;
1249      Portal2T Portal2;
1250
1251      VTKM_EXEC_CONT
1252      ValueType operator()(vtkm::Id idx) const
1253      {
1254        return static_cast<ValueType>(this->Portal1.Get(idx) * 10 + this->Portal2.Get(idx));
1255      }
1256    };
1257
1258    template <typename Portal1T, typename Portal2T>
1259    struct InverseFunctor
1260    {
1261      Portal1T Portal1;
1262      Portal2T Portal2;
1263
1264      VTKM_EXEC_CONT
1265      void operator()(vtkm::Id idx, const ValueType& val) const
1266      {
1267        this->Portal1.Set(idx, static_cast<ValueType>(std::floor(val / 10)));
1268        this->Portal2.Set(idx, static_cast<ValueType>(std::fmod(val, 10)));
1269      }
1270    };
1271
1272    template <typename Portal1T, typename Portal2T>
1273    VTKM_CONT Functor<typename std::decay<Portal1T>::type, typename std::decay<Portal2T>::type>
1274    CreateFunctor(Portal1T&& p1, Portal2T&& p2) const
1275    {
1276      return { std::forward<Portal1T>(p1), std::forward<Portal2T>(p2) };
1277    }
1278
1279    template <typename Portal1T, typename Portal2T>
1280    VTKM_CONT InverseFunctor<typename std::decay<Portal1T>::type, typename std::decay<Portal2T>::type>
1281    CreateInverseFunctor(Portal1T&& p1, Portal2T&& p2) const
1282    {
1283      return { std::forward<Portal1T>(p1), std::forward<Portal2T>(p2) };
1284    }
1285
1286    // Resize methods:
1287    template <typename Array1T, typename Array2T>
1288    VTKM_CONT
1289    void AllocateSourceArrays(vtkm::Id numVals, Array1T&& array1, Array2T&& array2) const
1290    {
1291      array1.Allocate(numVals);
1292      array2.Allocate(numVals);
1293    }
1294
1295    template <typename Array1T, typename Array2T>
1296    VTKM_CONT
1297    void ShrinkSourceArrays(vtkm::Id numVals, Array1T&& array1, Array2T&& array2) const
1298    {
1299      array1.Shrink(numVals);
1300      array2.Shrink(numVals);
1301    }
1302  };
1303
1304  // Usage:
1305  vtkm::cont::ArrayHandle<ValueType> a1;
1306  vtkm::cont::ArrayHandle<ValueType> a2;
1307  auto decor = vtkm::cont::make_ArrayHandleDecorator(0, DecompositionDecorImpl<ValueType>{}, a1, a2);
1308
1309  decor.Allocate(5);
1310  {
1311    auto decorPortal = decor.GetPortalControl();
1312    decorPortal.Set(0, 13);
1313    decorPortal.Set(1, 8);
1314    decorPortal.Set(2, 43);
1315    decorPortal.Set(3, 92);
1316    decorPortal.Set(4, 117);
1317  }
1318
1319  // a1:    {   1,   0,   4,   9,   11 }
1320  // a2:    {   3,   8,   3,   2,    7 }
1321  // decor: {  13,   8,  43,  92,  117 }
1322
1323  decor.Shrink(3);
1324
1325  // a1:    {   1,   0,   4 }
1326  // a2:    {   3,   8,   3 }
1327  // decor: {  13,   8,  43 }
1328
1329```
1330
1331
1332## Portals may advertise custom iterators
1333
1334The `ArrayPortalToIterator` utilities are used to produce STL-style iterators
from VTK-m's `ArrayHandle` portals. By default, a facade class is constructed
1336around the portal API, adapting it to an iterator interface.
1337
1338However, some portals use iterators internally, or may be able to construct a
1339lightweight iterator easily. For these, it is preferable to directly use the
1340specialized iterators instead of going through the generic facade. A portal may
1341now declare the following optional API to advertise that it has custom
1342iterators:
1343
``` cpp
1345struct MyPortal
1346{
1347  using IteratorType = ...; // alias to the portal's specialized iterator type
1348  IteratorType GetIteratorBegin(); // Return the begin iterator
1349  IteratorType GetIteratorEnd(); // Return the end iterator
1350
1351  // ...rest of ArrayPortal API...
1352};
1353```
1354
1355If these members are present, `ArrayPortalToIterators` will forward the portal's
1356specialized iterators instead of constructing a facade. This works when using
1357the `ArrayPortalToIterators` class directly, and also with the
1358`ArrayPortalToIteratorBegin` and `ArrayPortalToIteratorEnd` convenience
1359functions.
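
One way such an optional API can be detected is with a compile-time trait and tag dispatch. The sketch below is an assumption about the general technique, not a copy of the `ArrayPortalToIterators` implementation. The names `HasCustomIterators`, `FacadeIterator`, `IteratorBegin`, and both portal types are hypothetical.

```cpp
#include <cassert>
#include <cstddef>
#include <type_traits>

// Hand-rolled void_t so the sketch also builds as C++11.
template <typename...>
struct MakeVoid { using type = void; };
template <typename... Ts>
using VoidT = typename MakeVoid<Ts...>::type;

// True only when Portal declares a nested IteratorType.
template <typename Portal, typename = void>
struct HasCustomIterators : std::false_type {};
template <typename Portal>
struct HasCustomIterators<Portal, VoidT<typename Portal::IteratorType>> : std::true_type {};

// Hypothetical portal that advertises raw pointers as its iterators.
struct PointerPortal
{
  using ValueType = int;
  using IteratorType = const int*;
  const int* Data;
  std::size_t Size;
  ValueType Get(std::size_t index) const { assert(index < this->Size); return this->Data[index]; }
  IteratorType GetIteratorBegin() const { return this->Data; }
  IteratorType GetIteratorEnd() const { return this->Data + this->Size; }
};

// Hypothetical implicit portal with no iterators of its own.
struct IndexPortal
{
  using ValueType = int;
  std::size_t Size;
  ValueType Get(std::size_t index) const { return static_cast<int>(index); }
};

// Generic facade that adapts Get(index) to a minimal iterator interface.
template <typename Portal>
struct FacadeIterator
{
  const Portal* Source;
  std::size_t Index;
  typename Portal::ValueType operator*() const { return this->Source->Get(this->Index); }
  FacadeIterator& operator++() { ++this->Index; return *this; }
  bool operator!=(const FacadeIterator& other) const { return this->Index != other.Index; }
};

// Chosen when the portal advertises iterators: forward them unchanged.
template <typename Portal>
typename Portal::IteratorType IteratorBeginImpl(const Portal& portal, std::true_type)
{
  return portal.GetIteratorBegin();
}

// Chosen otherwise: wrap the portal in the generic facade.
template <typename Portal>
FacadeIterator<Portal> IteratorBeginImpl(const Portal& portal, std::false_type)
{
  return FacadeIterator<Portal>{ &portal, 0 };
}

template <typename Portal>
auto IteratorBegin(const Portal& portal)
  -> decltype(IteratorBeginImpl(portal, HasCustomIterators<Portal>{}))
{
  return IteratorBeginImpl(portal, HasCustomIterators<Portal>{});
}
```

Calling `IteratorBegin` on a `PointerPortal` yields the raw `const int*`, while calling it on an `IndexPortal` yields a `FacadeIterator<IndexPortal>`. The selection happens entirely at compile time.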
1360
1361
1362## Redesign of ArrayHandle to access data using typeless buffers
1363
1364The original implementation of `ArrayHandle` is meant to be very generic.
1365To define an `ArrayHandle`, you actually create a `Storage` class that
1366maintains the data and provides portals to access it (on the host). Because
1367the `Storage` can provide any type of data structure it wants, you also
1368need to define an `ArrayTransfer` that describes how to move the
1369`ArrayHandle` to and from a device. It also has to be repeated for every
1370translation unit that uses them.
1371
1372This is a very powerful mechanism. However, one of the major problems with
1373this approach is that every `ArrayHandle` type needs to have a separate
1374compile path for every value type crossed with every device. Because of
1375this limitation, the `ArrayHandle` for the basic storage has a special
1376implementation that manages the actual data allocation and movement as
1377`void *` arrays. In this way all the data management can be compiled once
1378and put into the `vtkm_cont` library. This has dramatically improved the
1379VTK-m compile time.
1380
1381This new design replicates the basic `ArrayHandle`'s success to all other
1382storage types. The basic idea is to make the implementation of
1383`ArrayHandle` storage slightly less generic. Instead of requiring it to
1384manage the data it stores, it instead just builds `ArrayPortal`s from
1385`void` pointers that it is given. The management of `void` pointers can be
1386done in non-templated classes that are compiled into a library.
1387
1388This initial implementation does not convert all `ArrayHandle`s to avoid
1389making non-backward compatible changes before the next minor revision of
1390VTK-m. In particular, it would be particularly difficult to convert
1391`ArrayHandleVirtual`. It could be done, but it would be a lot of work for a
1392class that will likely be removed.
1393
1394### Buffer
1395
1396Key to these changes is the introduction of a
1397`vtkm::cont::internal::Buffer` object. As the name implies, the `Buffer`
1398object manages a single block of bytes. `Buffer` is agnostic to the type of
1399data being stored. It only knows the length of the buffer in bytes. It is
1400responsible for allocating space on the host and any devices as necessary
1401and for transferring data among them. (Since `Buffer` knows nothing about
1402the type of data, a precondition of VTK-m would be that the host and all
devices have to have the same endianness.)
1404
1405The idea of the `Buffer` object is similar in nature to the existing
1406`vtkm::cont::internal::ExecutionArrayInterfaceBasicBase` except that it
1407will manage a buffer of data among the control and all devices rather than
1408in one device through a templated subclass.
1409
1410As explained below, `ArrayHandle` holds some fixed number of `Buffer`
1411objects. (The number can be zero for implicit `ArrayHandle`s.) Because all
1412the interaction with the devices happen through `Buffer`, it will no longer
1413be necessary to compile any reference to `ArrayHandle` for devices (e.g.
you won't have to use nvcc just because the code includes `ArrayHandle.h`).
1415
1416### Storage
1417
1418The `vtkm::cont::internal::Storage` class changes dramatically. Although an
1419instance will be kept, the intention is for `Storage` itself to be a
1420stateless object. It will manage its data through `Buffer` objects provided
1421from the `ArrayHandle`.
1422
1423That said, it is possible for `Storage` to have some state. For example,
1424the `Storage` for `ArrayHandleImplicit` must hold on to the instance of the
1425portal used to manage the state.
1426
1427
### ArrayTransfer

The `vtkm::cont::internal::ArrayTransfer` class will be removed completely.
All data transfers will be handled internally with the `Buffer` object.
1432
1433### Portals
1434
1435A big change for this design is that the type of a portal for an
1436`ArrayHandle` will be the same for all devices and the host. Thus, we no
1437longer need specialized versions of portals for each device. We only have
1438one portal type. And since they are constructed from `void *` pointers, one
1439method can create them all.
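
The division of labor among buffer, storage, and portal can be sketched in a few lines of standard C++. This is a simplified model under stated assumptions: `ByteBufferSketch`, `BasicPortalSketch`, and `BasicStorageSketch` are hypothetical names, the buffer lives only on the host, and no device transfer is modeled.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdlib>
#include <memory>

// Hypothetical typeless buffer: it knows its length in bytes and nothing else.
// Being non-templated, a class like this can be compiled once into a library.
class ByteBufferSketch
{
public:
  explicit ByteBufferSketch(std::size_t numBytes)
    : Data(std::malloc(numBytes), &ByteBufferSketch::Free)
    , NumberOfBytes(numBytes)
  {
  }

  void* GetPointer() const { return this->Data.get(); }
  std::size_t GetNumberOfBytes() const { return this->NumberOfBytes; }

private:
  static void Free(void* pointer) { std::free(pointer); }

  std::unique_ptr<void, void (*)(void*)> Data;
  std::size_t NumberOfBytes;
};

// A single portal type built from a void pointer. The same type would serve
// the host and every device because it only needs an address and a length.
template <typename T>
class BasicPortalSketch
{
public:
  BasicPortalSketch(void* data, std::size_t numValues)
    : Data(static_cast<T*>(data))
    , NumberOfValues(numValues)
  {
  }

  std::size_t GetNumberOfValues() const { return this->NumberOfValues; }

  T Get(std::size_t index) const
  {
    assert(index < this->NumberOfValues);
    return this->Data[index];
  }

  void Set(std::size_t index, const T& value) const
  {
    assert(index < this->NumberOfValues);
    this->Data[index] = value;
  }

private:
  T* Data;
  std::size_t NumberOfValues;
};

// Stateless storage: it owns no data and only interprets a buffer it is given.
template <typename T>
struct BasicStorageSketch
{
  static BasicPortalSketch<T> CreatePortal(const ByteBufferSketch& buffer)
  {
    return BasicPortalSketch<T>(buffer.GetPointer(), buffer.GetNumberOfBytes() / sizeof(T));
  }
};
```

Only `BasicPortalSketch` and `BasicStorageSketch` are templates. Allocation, deallocation, and (in a fuller version) transfers stay in the non-templated buffer.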
1440
1441
1442### Advantages
1443
1444The `ArrayHandle` interface should not change significantly for external
1445uses, but this redesign offers several advantages.
1446
1447#### Faster Compiles
1448
1449Because the memory management is contained in a non-templated `Buffer`
1450class, it can be compiled once in a library and used by all template
1451instances of `ArrayHandle`. It should have similar compile advantages to
1452our current specialization of the basic `ArrayHandle`, but applied to all
1453types of `ArrayHandle`s.
1454
1455#### Fewer Templates
1456
1457Hand-in-hand with faster compiles, the new design should require fewer
1458templates and template instances. We have immediately gotten rid of
`ArrayTransfer`. `Storage` is also much shorter. Because all
1460`ArrayPortal`s are the same for every device and the host, we need many
1461fewer versions of those classes. In the device adapter, we can probably
1462collapse the three `ArrayManagerExecution` classes into a single, much
1463simpler class that does simple memory allocation and copy.
1464
1465#### Fewer files need to be compiled for CUDA
1466
1467Including `ArrayHandle.h` no longer adds code that compiles for a device.
1468Thus, we should no longer need to compile for a specific device adapter
1469just because we access an `ArrayHandle`. This should make it much easier to
1470achieve our goal of a "firewall". That is, code that just calls VTK-m
1471filters does not need to support all its compilers and flags.
1472
1473#### Simpler ArrayHandle specialization
1474
1475The newer code should simplify the implementation of special `ArrayHandle`s
1476a bit. You need only implement an `ArrayPortal` that operates on one or
1477more `void *` arrays and a simple `Storage` class.
1478
1479#### Out of band memory sharing
1480
1481With the current version of `ArrayHandle`, if you want to take data from
1482one `ArrayHandle` you pretty much have to create a special template to wrap
1483another `ArrayHandle` around that. With this new design, it is possible to
1484take data from one `ArrayHandle` and give it to another `ArrayHandle` of a
1485completely different type. You can't do this willy-nilly since different
1486`ArrayHandle` types will interpret buffers differently. But there can be
1487some special important use cases.
1488
1489One such case could be an `ArrayHandle` that provides strided access to a
1490buffer. (Let's call it `ArrayHandleStride`.) The idea is that it interprets
1491the buffer as an array for a particular type (like a basic `ArrayHandle`)
but also defines a stride, skip, and repeat so that given an index it looks
up the value at buffer position `((index / skip) % repeat) * stride`. The point is that it can
1494take an AoS array of tuples and represent an array of one of the
1495components.
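
As a worked illustration of that index formula, the sketch below applies it to a plain pointer. `StrideViewSketch` is a hypothetical name, and selecting a component by offsetting the base pointer is an assumption made for this example. It is not a description of how an eventual `ArrayHandleStride` would be built.

```cpp
#include <cassert>
#include <cstddef>

// Hypothetical read-only strided view over a flat buffer.
template <typename T>
class StrideViewSketch
{
public:
  StrideViewSketch(const T* base,
                   std::size_t numValues,
                   std::size_t stride,
                   std::size_t skip,
                   std::size_t repeat)
    : Base(base)
    , NumberOfValues(numValues)
    , Stride(stride)
    , Skip(skip)
    , Repeat(repeat)
  {
    assert(skip > 0 && repeat > 0);
  }

  std::size_t GetNumberOfValues() const { return this->NumberOfValues; }

  // Applies the formula from the text: ((index / skip) % repeat) * stride.
  T Get(std::size_t index) const
  {
    assert(index < this->NumberOfValues);
    return this->Base[((index / this->Skip) % this->Repeat) * this->Stride];
  }

private:
  const T* Base;
  std::size_t NumberOfValues;
  std::size_t Stride;
  std::size_t Skip;
  std::size_t Repeat;
};
```

For an AoS buffer `{x0, y0, z0, x1, y1, z1, ...}` with three tuples, a view starting at the first `y` with stride 3, skip 1, and repeat 3 reads `y0, y1, y2`. For a 3-by-3 grid described by one small array of axis values, stride 1 with skip 1 and repeat 3 walks the fast-varying axis, while skip 3 and repeat 3 walks the slow-varying one.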
1496
1497The point would be that if you had a `VariantArrayHandle` or `Field`, you
1498could pull out an array of one of the components as an `ArrayHandleStride`.
1499An `ArrayHandleStride<vtkm::Float32>` could be used to represent that data
1500that comes from any basic `ArrayHandle` with `vtkm::Float32` or a
1501`vtkm::Vec` of that type. It could also represent data from an
`ArrayHandleCartesianProduct` and `ArrayHandleSOA`. We could even represent
1503an `ArrayHandleUniformPointCoordinates` by just making a small array. This
1504allows us to statically access a whole bunch of potential array storage
1505classes with a single type.
1506
1507#### Potentially faster device transfers
1508
1509There is currently a fast-path for basic `ArrayHandle`s that does a block
CUDA memcpy between host and device. But for other `ArrayHandle`s that do
1511not defer their `ArrayTransfer` to a sub-array, the transfer first has to
1512copy the data into a known buffer.
1513
1514Because this new design stores all data in `Buffer` objects, any of these
1515can be easily and efficiently copied between devices.
1516
1517### Disadvantages
1518
1519This new design gives up some features of the original `ArrayHandle` design.
1520
1521#### Can only interface data that can be represented in a fixed number of buffers
1522
1523Because the original `ArrayHandle` design required the `Storage` to
1524completely manage the data, it could represent it in any way possible. In
1525this redesign, the data need to be stored in some fixed number of memory
1526buffers.
1527
1528This is a pretty open requirement. I suspect most data formats will be
1529storable in this. The user's guide has an example of data stored in a
1530`std::deque` that will not be representable. But that is probably not a
1531particularly practical example.
1532
#### VTK-m would only be able to support hosts and devices with the same endianness

Because data are transferred as `void *` blocks of memory, there is no way
to correct words if the endianness of the two devices does not agree. As far
as I know, there should be no issues with the proposed ECP machines.

If endianness becomes an issue, it might be possible to specify a word length
in the `Buffer`. That would assume that all numbers stored in the `Buffer`
have the same word length.
1542
1543#### ArrayPortals must be completely recompiled in each translation unit
1544
1545We can declare that an `ArrayHandle` does not need to include the device
1546adapter header files in part because it no longer needs specialized
1547`ArrayPortal`s for each device. However, that means that a translation unit
1548compiled with the host compiler (say gcc) will produce different code for
1549the `ArrayPortal`s than those with the device compiler (say nvcc). This
1550could lead to numerous linking problems.
1551
1552To get around these issues, we will probably have to enforce no exporting
of any of the `ArrayPortal` symbols and force them all to be recompiled for
1554each translation unit. This will serve to increase the compile times a bit.
1555We will probably also still encounter linking errors as there would be no
1556way to enforce this requirement.
1557
1558#### Cannot have specialized portals for the control environment
1559
1560Because the new design unifies `ArrayPortal` types across control and
1561execution environments, it is no longer possible to have a special version
1562for the control environment to manage resources. This will require removing
1563some recent behavior of control portals such as with MR !1988.
1564
1565
1566## `ArrayRangeCompute` works on any array type without compiling device code
1567
1568Originally, `ArrayRangeCompute` required you to know specifically the
1569`ArrayHandle` type (value type and storage type) and to compile using any
1570device compiler. The method is changed to include only overloads that have
1571precompiled versions of `ArrayRangeCompute`.
1572
1573Additionally, an `ArrayRangeCompute` overload that takes an
1574`UnknownArrayHandle` has been added. In addition to allowing you to compute
1575the range of arrays of unknown types, this implementation of
1576`ArrayRangeCompute` serves as a fallback for `ArrayHandle` types that are
1577not otherwise explicitly supported.
1578
1579If you really want to make sure that you compute the range directly on an
1580`ArrayHandle` of a particular type, you can include
1581`ArrayRangeComputeTemplate.h`, which contains a templated overload of
1582`ArrayRangeCompute` that directly computes the range of an `ArrayHandle`.
1583Including this header requires compiling for device code.
1584
1585
1586## Implemented ArrayHandleRandomUniformBits and ArrayHandleRandomUniformReal
1587
`ArrayHandleRandomUniformBits` and `ArrayHandleRandomUniformReal` were added to
provide an efficient way to generate pseudorandom numbers in parallel. They are
based on the Philox parallel pseudorandom number generator.
`ArrayHandleRandomUniformBits` provides 64 random bits per entry, covering the
whole range of `UInt64`, while `ArrayHandleRandomUniformReal` provides random
`Float64` values in the range [0, 1). Users can either provide a seed in the
form of a `vtkm::Vec<vtkm::UInt32, 1>` or use the default random source provided
by the C++ standard library. Like other fancy `ArrayHandle`s, both are lazily
evaluated, so they have only O(1) memory overhead. They are stateless and
functional and do not change once constructed. To generate a new set of random
numbers, for example as part of an iterative algorithm, a new `ArrayHandle`
needs to be constructed in each iteration. See the user's guide for more detail
and examples.
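
To illustrate what "stateless" and "O(1) memory" mean here, the following is a minimal standalone sketch, not VTK-m code. The names `RandomBitsSketch` and `BitsToUnitReal` are hypothetical, and the mixing function is a SplitMix64-style hash used as a simple stand-in for Philox. The point is only that each value is a pure function of the seed and the index, so nothing needs to be stored.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical stand-in for a lazily evaluated random array: no values are
// stored, so memory use is O(1) regardless of the logical size.
struct RandomBitsSketch
{
  std::uint64_t Seed;
  std::uint64_t Size;

  // SplitMix64-style mixer (NOT Philox): a pure function of (Seed, index).
  std::uint64_t Get(std::uint64_t index) const
  {
    assert(index < Size);
    std::uint64_t z = Seed + (index + 1) * 0x9E3779B97F4A7C15ull;
    z = (z ^ (z >> 30)) * 0xBF58476D1CE4E5B9ull;
    z = (z ^ (z >> 27)) * 0x94D049BB133111EBull;
    return z ^ (z >> 31);
  }
};

// Map 64 random bits to a double in [0, 1) by keeping the top 53 bits.
inline double BitsToUnitReal(std::uint64_t bits)
{
  return static_cast<double>(bits >> 11) * (1.0 / 9007199254740992.0); // 2^-53
}
```

Because `Get` never mutates anything, two objects built with the same seed return identical sequences, which is why a new array (with a new seed) is needed to obtain new numbers.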
1600
1601
1602## Extract component arrays from unknown arrays
1603
1604One of the problems with the data structures of VTK-m is that non-templated
classes like `DataSet`, `Field`, and `UnknownArrayHandle` (formerly
`VariantArrayHandle`) internally hold an `ArrayHandle` of a particular type
that has to be cast to the correct type before it can be reasonably used.
1608That in turn is problematic because the list of possible `ArrayHandle`
1609types is very long.
1610
1611At one time we were trying to compensate for this by using
1612`ArrayHandleVirtual`. However, for technical reasons this class is
1613infeasible for every use case of VTK-m and has been deprecated. Also, this
1614was only a partial solution since using it still required different code
1615paths for, say, handling values of `vtkm::Float32` and `vtkm::Vec3f_32`
1616even though both are essentially arrays of 32-bit floats.
1617
1618The extract component feature compensates for this problem by allowing you
1619to extract the components from an `ArrayHandle`. This feature allows you to
1620create a single code path to handle `ArrayHandle`s containing scalars or
1621vectors of any size. Furthermore, when you extract a component from an
1622array, the storage gets normalized so that one code path covers all storage
1623types.
1624
1625### `ArrayExtractComponent`
1626
1627The basic enabling feature is a new function named `ArrayExtractComponent`.
This function takes an `ArrayHandle` and an index to a component. It
1629then returns an `ArrayHandleStride` holding the selected component of each
1630entry in the original array.
1631
1632We will get to the structure of `ArrayHandleStride` later. But the
1633important part is that `ArrayHandleStride` does _not_ depend on the storage
1634type of the original `ArrayHandle`. That means whether you extract a
1635component from `ArrayHandleBasic`, `ArrayHandleSOA`,
1636`ArrayHandleCartesianProduct`, or any other type, you get back the same
1637`ArrayHandleStride`. Likewise, regardless of whether the input
1638`ArrayHandle` has a `ValueType` of `FloatDefault`, `Vec2f`, `Vec3f`, or any
1639other `Vec` of a default float, you get the same `ArrayHandleStride`. Thus,
1640you can see how this feature can dramatically reduce code paths if used
1641correctly.
1642
1643It should be noted that `ArrayExtractComponent` will (logically) flatten
1644the `ValueType` before extracting the component. Thus, nested `Vec`s such
1645as `Vec<Vec3f, 3>` will be treated as a `Vec<FloatDefault, 9>`. The
1646intention is so that the extracted component will always be a basic C type.
1647For the purposes of this document when we refer to the "component type", we
1648really mean the base component type.
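
As a rough illustration of the logical flattening, here is a standalone sketch that uses `std::array` as a stand-in for nested `Vec` types. The names `Vec3x3` and `FlatComponent` are hypothetical, and the sketch assumes the outer index varies slowest, which is not confirmed by the text above.

```cpp
#include <array>
#include <cassert>
#include <cstddef>

// Hypothetical stand-in for a nested Vec such as Vec<Vec3f, 3>.
using Vec3 = std::array<float, 3>;
using Vec3x3 = std::array<Vec3, 3>;

// Return flat component `c` (0..8) of a nested value, treating it as if it
// were a single 9-component vector with the outer index varying slowest.
inline float FlatComponent(const Vec3x3& value, std::size_t c)
{
  assert(c < 9);
  return value[c / 3][c % 3];
}
```

With this view, flat component 4 of a `Vec3x3` is the middle entry of the middle inner vector, and every flat component has the basic type `float`.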
1649
1650Different `ArrayHandle` implementations provide their own implementations
1651for `ArrayExtractComponent` so that the component can be extracted without
1652deep copying all the data. We will visit how `ArrayHandleStride` can
1653represent different data layouts later, but first let's go into the main
1654use case.
1655
1656### Extract components from `UnknownArrayHandle`
1657
The principal use case for `ArrayExtractComponent` is to get an
1659`ArrayHandle` from an unknown array handle without iterating over _every_
1660possible type. (Rather, we iterate over a smaller set of types.) To
1661facilitate this, an `ExtractComponent` method has been added to
1662`UnknownArrayHandle`.
1663
1664To use `UnknownArrayHandle::ExtractComponent`, you must give it the
1665component type. You can check for the correct component type by using the
1666`IsBaseComponentType` method. The method will then return an
1667`ArrayHandleStride` for the component type specified.
1668
1669#### Example
1670
1671As an example, let's say you have a worklet, `FooWorklet`, that does some
1672per component operation on an array. Furthermore, let's say that you want
1673to implement a function that, to the best of your ability, can apply
1674`FooWorklet` on an array of any type. This function should be pre-compiled
1675into a library so it doesn't have to be compiled over and over again.
1676(`MapFieldPermutation` and `MapFieldMergeAverage` are real and important
1677examples that have this behavior.)
1678
1679Without the extract component feature, the implementation might look
1680something like this (many practical details left out):
1681
``` cpp
struct ApplyFooFunctor
{
  template <typename ArrayType>
  void operator()(const ArrayType& input, vtkm::cont::UnknownArrayHandle& output) const
  {
    ArrayType outputArray;
    vtkm::cont::Invoker invoke;
    invoke(FooWorklet{}, input, outputArray);
    output = outputArray;
  }
};

vtkm::cont::UnknownArrayHandle ApplyFoo(const vtkm::cont::UnknownArrayHandle& input)
{
  vtkm::cont::UnknownArrayHandle output;
  input.CastAndCallForTypes<vtkm::TypeListAll, VTKM_DEFAULT_STORAGE_LIST_TAG>(
    ApplyFooFunctor{}, output);
  return output;
}
```
1703
1704Take a look specifically at the `CastAndCallForTypes` call near the bottom
1705of this example. It calls for all types in `vtkm::TypeListAll`, which is
1706about 40 instances. Then, it needs to be called for any type in the desired
1707storage list. This could include basic arrays, SOA arrays, and lots of
1708other specialized types. It would be expected for this code to generate
1709over 100 paths for `ApplyFooFunctor`. This in turn contains a worklet
1710invoke, which is not a small amount of code.
1711
1712Now consider how we can use the `ExtractComponent` feature to reduce the
1713code paths:
1714
``` cpp
struct ApplyFooFunctor
{
  template <typename T>
  void operator()(T,
                  const vtkm::cont::UnknownArrayHandle& input,
                  const vtkm::cont::UnknownArrayHandle& output) const
  {
    if (!input.IsBaseComponentType<T>()) { return; }
    VTKM_ASSERT(output.IsBaseComponentType<T>());

    vtkm::cont::Invoker invoke;
    invoke(FooWorklet{}, input.ExtractComponent<T>(), output.ExtractComponent<T>());
  }
};

vtkm::cont::UnknownArrayHandle ApplyFoo(const vtkm::cont::UnknownArrayHandle& input)
{
  vtkm::cont::UnknownArrayHandle output = input.NewInstanceBasic();
  output.Allocate(input.GetNumberOfValues());
  vtkm::ListForEach(ApplyFooFunctor{}, vtkm::TypeListScalarAll{}, input, output);
  return output;
}
```
1739
1740The number of lines of code is about the same, but take a look at the
1741`ListForEach` (which replaces the `CastAndCallForTypes`). This calling code
1742takes `TypeListScalarAll` instead of `TypeListAll`, which reduces the
1743instances created from around 40 to 13 (every basic C type). It is also no
1744longer dependent on the storage, so these 13 instances are it. As an
1745example of potential compile savings, changing the implementation of the
`MapFieldPermutation` and `MapFieldMergeAverage` functions in this way
1747reduced the filters_common library (on Mac, Debug build) by 24 MB (over a
1748third of the total size).
1749
1750Another great advantage of this approach is that even though it takes less
1751time to compile and generates less code, it actually covers more cases.
Have an array containing values of `Vec<short, 13>`? No problem. The values
1753were actually stored in an `ArrayHandleReverse`? It will still work.
1754
1755### `ArrayHandleStride`
1756
1757This functionality is made possible with the new `ArrayHandleStride`. This
1758array behaves much like `ArrayHandleBasic`, except that it contains an
1759_offset_ parameter to specify where in the buffer array to start reading
1760and a _stride_ parameter to specify how many entries to skip for each
1761successive entry. `ArrayHandleStride` also has optional parameters
1762`divisor` and `modulo` that allow indices to be repeated at regular
1763intervals.
1764
Here is how `ArrayHandleStride` extracts components from several common
arrays. For each of these examples, we assume that the `ValueType` of the
array is `Vec<T, N>`. Each example extracts the component at index _component_.
1768
1769#### Extracting from `ArrayHandleBasic`
1770
1771When extracting from an `ArrayHandleBasic`, we just need to start at the
1772proper component and skip the length of the `Vec`.
1773
1774* _offset_: _component_
1775* _stride_: `N`
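
The offset and stride values above can be sketched in standalone C++ as follows. This is not the VTK-m implementation: `StrideParams`, `BufferIndex`, and `ExtractComponent` are hypothetical names, and the index formula (divide, then modulo, then multiply by the stride and add the offset) is an assumption based on the parameter descriptions in this section.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical parameters mirroring the description of ArrayHandleStride.
struct StrideParams
{
  std::size_t Offset = 0;
  std::size_t Stride = 1;
  std::size_t Divisor = 1; // 1 (or 0) means "skip the divide"
  std::size_t Modulo = 0;  // 0 means "skip the modulo"
};

// Assumed mapping from a logical array index to a position in the buffer.
inline std::size_t BufferIndex(const StrideParams& p, std::size_t index)
{
  if (p.Divisor > 1) { index /= p.Divisor; }
  if (p.Modulo > 0) { index %= p.Modulo; }
  return p.Offset + index * p.Stride;
}

// Gather one component from an interleaved (AoS) buffer of N-component tuples.
inline std::vector<float> ExtractComponent(const std::vector<float>& aos,
                                           std::size_t numComponents,
                                           std::size_t component)
{
  assert(numComponents > 0);
  assert(component < numComponents);
  assert(aos.size() % numComponents == 0);
  StrideParams p;
  p.Offset = component;     // start at the requested component
  p.Stride = numComponents; // skip the length of one tuple each step
  std::vector<float> result(aos.size() / numComponents);
  for (std::size_t i = 0; i < result.size(); ++i)
  {
    result[i] = aos[BufferIndex(p, i)];
  }
  return result;
}
```

The sketch copies values into a new vector only to make the result easy to inspect. The point of the real class is that the same index arithmetic is applied on access, so no copy is needed.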
1776
1777#### Extracting from `ArrayHandleSOA`
1778
Since each component is held in a separate array, they are densely packed.
1780Each component could be represented by `ArrayHandleBasic`, but of course we
1781use `ArrayHandleStride` to keep the type consistent.
1782
1783* _offset_: 0
1784* _stride_: 1
1785
1786#### Extracting from `ArrayHandleCartesianProduct`
1787
1788This array is the basic reason for implementing the _divisor_ and _modulo_
parameters. Each of the 3 components has different parameters, which are
1790the following (given that _dims_[3] captures the size of the 3 arrays for
1791each dimension).
1792
1793* _offset_: 0
1794* _stride_: 1
1795* case _component_ == 0
1796  * _divisor_: _ignored_
1797  * _modulo_: _dims_[0]
1798* case _component_ == 1
1799  * _divisor_: _dims_[0]
1800  * _modulo_: _dims_[1]
1801* case _component_ == 2
  * _divisor_: _dims_[0] * _dims_[1]
1803  * _modulo_: _ignored_
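
To check the divisor and modulo choices, here is a small standalone sketch. It assumes the first array varies fastest in the Cartesian product, so a flat point index decomposes as `i + dims[0] * (j + dims[1] * k)`; under that assumption component 2 must divide by the product of the first two dimensions. The function name `AxisIndex` is hypothetical.

```cpp
#include <array>
#include <cassert>
#include <cstddef>

// Index into the per-axis array for `component`, given a flat point index.
// `dims` holds the lengths of the three per-axis arrays.
inline std::size_t AxisIndex(const std::array<std::size_t, 3>& dims,
                             std::size_t component,
                             std::size_t flatIndex)
{
  assert(component < 3);
  assert(flatIndex < dims[0] * dims[1] * dims[2]);
  switch (component)
  {
    case 0:
      return flatIndex % dims[0]; // divisor ignored
    case 1:
      return (flatIndex / dims[0]) % dims[1];
    default:
      return flatIndex / (dims[0] * dims[1]); // modulo ignored
  }
}
```

For `dims = {2, 3, 4}`, flat index 11 corresponds to `(i, j, k) = (1, 2, 1)`, and the three cases return 1, 2, and 1 respectively.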
1804
1805#### Extracting from `ArrayHandleUniformPointCoordinates`
1806
1807This array cannot be represented directly because it is fully implicit.
1808However, it can be trivially converted to `ArrayHandleCartesianProduct` in
1809typically very little memory. (In fact, EAVL always represented uniform
1810point coordinates by explicitly storing a Cartesian product.) Thus, for
1811very little overhead the `ArrayHandleStride` can be created.
1812
1813### Runtime overhead of extracting components
1814
1815These benefits come at a cost, but not a large one. The "biggest" cost is
1816the small cost of computing index arithmetic for each access into
1817`ArrayHandleStride`. To make this as efficient as possible, there are
1818conditions that skip over the modulo and divide steps if they are not
1819necessary. (Integer modulo and divide tend to take much longer than
1820addition and multiplication.) It is for this reason that we probably do not
1821want to use this method all the time.
1822
1823Another cost is the fact that not every `ArrayHandle` can be represented by
1824`ArrayHandleStride` directly without copying. If you ask to extract a
1825component that cannot be directly represented, it will be copied into a
1826basic array, which is not great. To make matters worse, for technical
1827reasons this copy happens on the host rather than the device.
1828
1829
1830## `ArrayHandleGroupVecVariable` holds now one more offset.
1831
1832This change affects the usage of both `ConvertNumComponentsToOffsets` and
1833 `make_ArrayHandleGroupVecVariable`.
1834
The reason for this change is to remove a branch in
`ArrayHandleGroupVecVariable::Get` that was used to avoid an array overflow.
In theory this increases performance, since at the CPU level it removes the
penalties caused by mispredicted branches.
1839
1840The change affects `ConvertNumComponentsToOffsets` by both:
1841
 1. Increasing the number of elements in `offsetsArray` (its second parameter)
    by one.

 2. Setting `sourceArraySize` as the sum of all the elements plus the new one
    in `offsetsArray`.

Note that not every specialization of `ConvertNumComponentsToOffsets`
returns `offsetsArray`. Thus, some of them are not affected.
1850
1851Similarly, this change affects `make_ArrayHandleGroupVecVariable` since it
1852expects its second parameter (offsetsArray) to be one element bigger than
1853before.
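
The effect of the extra offset can be sketched in standalone C++. The function names below are hypothetical and this is not the VTK-m implementation. The sketch only shows why an offsets array with one extra entry lets the length of every group, including the last, be computed without a special case.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Build an offsets array with one more entry than numComponents.
// The final entry equals the total number of source values.
inline std::vector<std::size_t> NumComponentsToOffsets(
  const std::vector<std::size_t>& numComponents)
{
  std::vector<std::size_t> offsets(numComponents.size() + 1, 0);
  for (std::size_t i = 0; i < numComponents.size(); ++i)
  {
    offsets[i + 1] = offsets[i] + numComponents[i];
  }
  return offsets;
}

// Length of group `i`; no branch is needed for the last group.
inline std::size_t GroupLength(const std::vector<std::size_t>& offsets, std::size_t i)
{
  assert(i + 1 < offsets.size());
  return offsets[i + 1] - offsets[i];
}
```

With the older layout (one offset per group), the last group's length had to be derived from the source array size, which is where the removed branch came from.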
1854
1855# Control Environment
1856
1857## Algorithms for Control and Execution Environments
1858
1859The `<vtkm/Algorithms.h>` header has been added to provide common STL-style
1860generic algorithms that are suitable for use in both the control and execution
1861environments. This is necessary as the STL algorithms in the `<algorithm>`
1862header are not marked up for use in execution environments such as CUDA.
1863
1864In addition to the markup, these algorithms have convenience overloads to
1865support ArrayPortals directly, simplifying their usage with VTK-m data
1866structures.
1867
Currently, three related algorithms are provided: `LowerBound`, `UpperBound`,
and `BinarySearch`. `BinarySearch` differs from the STL `std::binary_search`
algorithm in that it returns an iterator (or index) to a matching element,
rather than just a boolean indicating whether or not a key is present.
1872
1873The new algorithm signatures are:
1874
1875```c++
1876namespace vtkm
1877{
1878
1879template <typename IterT, typename T, typename Comp>
1880VTKM_EXEC_CONT
1881IterT BinarySearch(IterT first, IterT last, const T& val, Comp comp);
1882
1883template <typename IterT, typename T>
1884VTKM_EXEC_CONT
1885IterT BinarySearch(IterT first, IterT last, const T& val);
1886
1887template <typename PortalT, typename T, typename Comp>
1888VTKM_EXEC_CONT
1889vtkm::Id BinarySearch(const PortalT& portal, const T& val, Comp comp);
1890
1891template <typename PortalT, typename T>
1892VTKM_EXEC_CONT
1893vtkm::Id BinarySearch(const PortalT& portal, const T& val);
1894
1895template <typename IterT, typename T, typename Comp>
1896VTKM_EXEC_CONT
1897IterT LowerBound(IterT first, IterT last, const T& val, Comp comp);
1898
1899template <typename IterT, typename T>
1900VTKM_EXEC_CONT
1901IterT LowerBound(IterT first, IterT last, const T& val);
1902
1903template <typename PortalT, typename T, typename Comp>
1904VTKM_EXEC_CONT
1905vtkm::Id LowerBound(const PortalT& portal, const T& val, Comp comp);
1906
1907template <typename PortalT, typename T>
1908VTKM_EXEC_CONT
1909vtkm::Id LowerBound(const PortalT& portal, const T& val);
1910
1911template <typename IterT, typename T, typename Comp>
1912VTKM_EXEC_CONT
1913IterT UpperBound(IterT first, IterT last, const T& val, Comp comp);
1914
1915template <typename IterT, typename T>
1916VTKM_EXEC_CONT
1917IterT UpperBound(IterT first, IterT last, const T& val);
1918
1919template <typename PortalT, typename T, typename Comp>
1920VTKM_EXEC_CONT
1921vtkm::Id UpperBound(const PortalT& portal, const T& val, Comp comp);
1922
1923template <typename PortalT, typename T>
1924VTKM_EXEC_CONT
1925vtkm::Id UpperBound(const PortalT& portal, const T& val);
1926
1927}
1928```
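
To make the difference from `std::binary_search` concrete, here is a host-only sketch built on `std::lower_bound`. It is not the VTK-m implementation and is not marked up for execution environments. The name `BinarySearchIndex` is hypothetical, and returning -1 when the value is absent is an assumption of this sketch.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <vector>

// Returns the index of an element equal to `val` in a sorted vector, or -1
// if there is none. (The -1 sentinel is an assumption of this sketch.)
template <typename T>
std::ptrdiff_t BinarySearchIndex(const std::vector<T>& sorted, const T& val)
{
  assert(std::is_sorted(sorted.begin(), sorted.end()));
  auto it = std::lower_bound(sorted.begin(), sorted.end(), val);
  if (it == sorted.end() || val < *it)
  {
    return -1;
  }
  return it - sorted.begin();
}
```

Where `std::binary_search` would only report that 5 is present in `{1, 3, 5, 7}`, this function reports that it sits at index 2.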
1929
1930# Execution Environment
1931
1932## Scope ExecObjects with Tokens
1933
1934When VTK-m's `ArrayHandle` was originally designed, it was assumed that the
1935control environment would run on a single thread. However, multiple users
1936have expressed realistic use cases in which they would like to control
1937VTK-m from multiple threads (for example, to control multiple devices).
1938Consequently, it is important that VTK-m's control classes work correctly
1939when used simultaneously from multiple threads.
1940
1941The original `PrepareFor*` methods of `ArrayHandle` returned an object to
1942be used in the execution environment on a particular device that pointed to
1943data in the array. The pointer to the data was contingent on the state of
1944the `ArrayHandle` not changing. The assumption was that the calling code
1945would immediately use the returned execution environment object and would
1946not further change the `ArrayHandle` until done with the execution
1947environment object.
1948
1949This assumption is broken if multiple threads are running in the control
1950environment. For example, if one thread has called `PrepareForInput` to get
1951an execution array portal, the portal or its data could become invalid if
1952another thread calls `PrepareForOutput` on the same array. Initially one
1953would think that a well designed program should not share `ArrayHandle`s in
1954this way, but there are good reasons to need to do so. For example, when
1955using `vtkm::cont::PartitionedDataSet` where multiple partitions share a
1956coordinate system (very common), it becomes unsafe to work on multiple
1957blocks in parallel on different devices.
1958
1959What we really want is the code to be able to specify more explicitly when
1960the execution object is in use. Ideally, the execution object itself would
1961maintain the resources it is using. However, that will not work in this
1962case since the object has to pass from control to execution environment and
1963back. The resource allocation will break when the object is passed to an
1964offloaded device and back.
1965
1966Because we cannot use the object itself to manage its own resources, we use
1967a proxy object we are calling a `Token`. The `Token` object manages the
1968scope of the return execution object. As long as the `Token` is still in
1969scope, the execution object will remain valid. When the `Token` is
1970destroyed (or `DetachFromAll` is called on it), then the execution object
1971is no longer protected.
1972
1973When a `Token` is attached to an `ArrayHandle` to protect an execution
object, its read or write mode is recorded. Multiple `Token`s can be
1975attached to read the `ArrayHandle` at the same time. However, only one
1976`Token` can be used to write to the `ArrayHandle`.
1977
1978### Basic `ArrayHandle` use
1979
The basic use of the `PrepareFor*` methods of `ArrayHandle` remains the
1981same. The only difference is the addition of a `Token` parameter.
1982
1983``` cpp
1984template <typename Device>
1985void LowLevelArray(vtkm::cont::ArrayHandle<vtkm::Float32> array, Device)
1986{
1987  vtkm::cont::Token token;
1988  auto portal = array.PrepareForOutput(ARRAY_SIZE, Device{}, token);
1989  // At this point, array is locked from anyone else from reading or modifying
1990  vtkm::cont::DeviceAdapterAlgorithm<Device>::Schedule(MyKernel(portal), ARRAY_SIZE);
1991
1992  // When the function finishes, token goes out of scope and array opens up
1993  // for other uses.
1994}
1995```
1996
1997### Execution objects
1998
1999To make sure that execution objects are scoped correctly, many changes
2000needed to be made to propagate a `Token` reference from the top of the
2001scope to where the execution object is actually made. The most noticeable
2002place for this was for implementations of
2003`vtkm::cont::ExecutionObjectBase`. Most implementations of
2004`ExecutionObjectBase` create an object that requires data from an
2005`ArrayHandle`.
2006
2007Previously, a subclass of `ExecutionObjectBase` was expected to have a
2008method named `PrepareForExecution` that had a single argument: the device
2009tag (or id) to make an object for. Now, subclasses of `ExecutionObjectBase`
2010should have a `PrepareForExecution` that takes two arguments: the device
2011and a `Token` to use for scoping the execution object.
2012
``` cpp
struct MyExecObject : vtkm::cont::ExecutionObjectBase
{
  vtkm::cont::ArrayHandle<vtkm::Float32> Array;

  template <typename Device>
  VTKM_CONT
  MyExec<Device> PrepareForExecution(Device device, vtkm::cont::Token& token)
  {
    MyExec<Device> object;
    object.Portal = this->Array.PrepareForInput(device, token);
    return object;
  }
};
```
2028
2029It actually still works to use the old style of `PrepareForExecution`.
2030However, you will get a deprecation warning (on supported compilers) when
2031you try to use it.
2032
2033### Invoke and Dispatcher
2034
2035The `Dispatcher` classes now internally define a `Token` object during the
2036call to `Invoke`. (Likewise, `Invoker` will have a `Token` defined during
2037its invoke.) This internal `Token` is used when preparing `ArrayHandle`s
2038and `ExecutionObject`s for the execution environment. (Details in the next
2039section on how that works.)
2040
2041Because the invoke uses a `Token` to protect its arguments, it will block
2042the execution of other worklets attempting to access arrays in a way that
2043could cause read-write hazards. In the following example, the second
2044worklet will not be able to execute until the first worklet finishes.
2045
2046``` cpp
2047vtkm::cont::Invoker invoke;
2048invoke(Worklet1{}, input, intermediate);
2049invoke(Worklet2{}, intermediate, output); // Will not execute until Worklet1 finishes.
2050```
2051
2052That said, invocations _can_ share arrays if their use will not cause
2053read-write hazards. In particular, two invocations can both use the same
2054array if they are both strictly reading from it. In the following example,
2055both worklets can potentially execute at the same time.
2056
2057``` cpp
2058vtkm::cont::Invoker invoke;
2059invoke(Worklet1{}, input, output1);
2060invoke(Worklet2{}, input, output2); // Will not block
2061```
2062
The same `Token` is used for all arguments to the `Worklet`. This detail is
2064important to prevent deadlocks if the same object is used in more than one
2065`Worklet` parameter. As a simple example, if a `Worklet` has a control
2066signature like
2067
2068``` cpp
2069  using ControlSignature = void(FieldIn, FieldOut);
2070```
2071
2072it should continue to work to use the same array as both fields.
2073
2074``` cpp
2075vtkm::cont::Invoker invoke;
2076invoke(Worklet1{}, array, array);
2077```
2078
2079### Transport
2080
2081The dispatch mechanism of worklets internally uses
2082`vtkm::cont::arg::Transport` objects to automatically move data from the
control environment to the execution environment. These `Transport` objects
now take a `Token` when doing the transportation. This all happens under
2085the covers for most users.
2086
2087### Control Portals
2088
2089The `GetPortalConstControl` and `GetPortalControl` methods have been
2090deprecated. Instead, the methods `ReadPortal` and `WritePortal` should be
2091used. The calling signature is the same as their predecessors, but the
2092returned portal contains a reference back to the original `ArrayHandle`.
2093The reference keeps track of whether the memory allocation has changed.
2094
2095If the `ArrayHandle` is changed while the `ArrayPortal` still exists,
2096nothing will happen immediately. However, if the portal is subsequently
2097accessed (i.e. `Set` or `Get` is called on it), then a fatal error will be
2098reported to the log.
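
One way such a check could work is sketched below in standalone C++. This is not how VTK-m implements it. The class names are hypothetical, the "reference back to the array" is modeled as a shared version counter, and the sketch throws an exception where the text above describes a fatal error being logged.

```cpp
#include <cassert>
#include <cstddef>
#include <memory>
#include <stdexcept>
#include <utility>
#include <vector>

// Shared between an array and the portals it hands out.
struct ArrayState
{
  std::vector<float> Data;
  std::size_t Version = 0; // bumped whenever the allocation changes
};

// Hypothetical read portal that remembers the version it was created from.
class ReadPortalSketch
{
  std::shared_ptr<const ArrayState> State;
  std::size_t ExpectedVersion;

public:
  explicit ReadPortalSketch(std::shared_ptr<const ArrayState> state)
    : State(std::move(state))
    , ExpectedVersion(State->Version)
  {
  }

  float Get(std::size_t index) const
  {
    assert(State != nullptr);
    if (State->Version != ExpectedVersion)
    {
      // The real library reports to the log; this sketch throws instead.
      throw std::runtime_error("portal used after array was reallocated");
    }
    return State->Data.at(index);
  }
};

class ArraySketch
{
  std::shared_ptr<ArrayState> State = std::make_shared<ArrayState>();

public:
  void Allocate(std::size_t n)
  {
    State->Data.assign(n, 0.0f);
    ++State->Version; // invalidates every portal created earlier
  }

  ReadPortalSketch ReadPortal() const { return ReadPortalSketch(State); }
};
```

Creating the portal and changing the array are both cheap. The cost of the check is paid only when a stale portal is actually accessed, which matches the "nothing will happen immediately" behavior described above.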
2099
2100### Deadlocks
2101
Now that portal objects from `ArrayHandle`s have finite scope (as opposed
to being able to be invalidated at any moment), the scopes have the ability
to cause operations to block. This can cause issues if multiple `Token`s
attempt to use the same `ArrayHandle` at once.
2106
2107The following is a contrived example of causing a deadlock.
2108
2109``` cpp
2110vtkm::cont::Token token1;
2111auto portal1 = array.PrepareForInPlace(Device{}, token1);
2112
2113vtkm::cont::Token token2;
2114auto portal2 = array.PrepareForInput(Device{}, token2);
2115```
2116
2117The last line will deadlock as `PrepareForInput` waits for `token1` to
2118detach, which will never happen. To prevent this from happening, if you use
2119the same `Token` on the array, it will always allow the action. Thus, the
2120following will work fine.
2121
2122``` cpp
2123vtkm::cont::Token token;
2124
2125auto portal1 = array.PrepareForInPlace(Device{}, token);
2126auto portal2 = array.PrepareForInput(Device{}, token);
2127```
2128
2129This prevents deadlock during the invocation of a worklet (so long as no
2130intermediate object tries to create its own `Token`, which would be bad
2131practice).
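
The access rules described in this section can be modeled with a small standalone sketch. It is a toy, single-threaded model and not VTK-m code. `TokenSketch` and `ArrayLockSketch` are hypothetical names, and where the real `PrepareFor*` calls would block, the `Try*` methods here simply return `false`.

```cpp
#include <cassert>
#include <set>

// Hypothetical token: only its identity (address) matters in this model.
struct TokenSketch
{
};

// Tracks which tokens currently hold an array for reading or writing.
class ArrayLockSketch
{
  std::set<const TokenSketch*> Readers;
  const TokenSketch* Writer = nullptr;

public:
  // Reading is allowed unless a *different* token is writing.
  bool TryRead(const TokenSketch& token)
  {
    if (Writer != nullptr && Writer != &token) { return false; }
    Readers.insert(&token);
    return true;
  }

  // Writing is allowed unless a different token is reading or writing.
  bool TryWrite(const TokenSketch& token)
  {
    if (Writer != nullptr && Writer != &token) { return false; }
    for (const TokenSketch* reader : Readers)
    {
      if (reader != &token) { return false; }
    }
    Writer = &token;
    return true;
  }

  // Models the token going out of scope (or DetachFromAll being called).
  void Detach(const TokenSketch& token)
  {
    Readers.erase(&token);
    if (Writer == &token) { Writer = nullptr; }
  }
};

// Walk through the two scenarios described above.
inline void DemonstrateTokens()
{
  ArrayLockSketch array;
  TokenSketch token1, token2;
  assert(array.TryWrite(token1));
  assert(!array.TryRead(token2)); // the real call would block here
  assert(array.TryRead(token1));  // same token: always allowed
}
```

The same-token rule is what lets a single worklet invocation pass one array as both an input and an output without blocking on itself.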
2132
2133Deadlocks are more likely when actually running multiple threads in the
2134control environment, but still pretty unlikely. One way it can occur is if
2135you have one (or more) worklet that has two output fields. You then try to
2136run the worklet(s) simultaneously on multiple threads. It could be that one
2137thread locks the first output array and the other thread locks the second
2138output array.
2139
However, having multiple threads trying to write to the same output arrays
at the same time without their own coordination is probably a bad idea in itself.
2142
2143
2144## Masks and Scatters Supported for 3D Scheduling
2145
Previously, worklets that used a non-default `vtkm::worklet::Mask` or
`vtkm::worklet::Scatter` did not work when scheduled to run across
`vtkm::cont::CellSetStructured` or other input domains that support 3D
scheduling.
2150
2151This restriction was an inadvertent limitation of the VTK-m worklet scheduling
2152algorithm. Lifting the restriction and providing sufficient information has
2153been achieved in a manner that shouldn't degrade performance of any existing
2154worklets.
2155
2156
2157## Virtual methods in execution environment deprecated
2158
2159The use of classes with any virtual methods in the execution environment is
2160deprecated. Although we had code to correctly build virtual methods on some
2161devices such as CUDA, this feature was not universally supported on all
2162programming models we wish to support. Plus, the implementation of virtual
2163methods is not hugely convenient on CUDA because the virtual methods could
2164not be embedded in a library. To get around virtual methods declared in
2165different libraries, all builds had to be static, and a special linking
2166step to pull in possible virtual method implementations was required.
2167
2168For these reasons, VTK-m is no longer relying on virtual methods. (Other
2169approaches like multiplexers are used instead.) The code will be officially
2170removed in version 2.0. It is still supported in a deprecated sense (you
2171should get a warning). However, if you want to build without virtual
2172methods, you can set the `VTKm_NO_DEPRECATED_VIRTUAL` CMake flag, and they
2173will not be compiled.
2174
2175
2176## Deprecate Execute with policy
2177
2178The version of `Filter::Execute` that takes a policy as an argument is now
2179deprecated. Filters are now able to specify their own fields and types,
2180which is often why you want to customize the policy for an execution. The
other reason is that you are compiling VTK-m into some other source that
uses particular types of storage. However, there is now a mechanism in
2183the CMake configuration to allow you to provide a header that customizes
2184the "default" types used in filters. This is a much more convenient way to
2185compile filters for specific types.
2186
One thing that filters were not able to do was to customize which cell sets
they allow. This change allows filters to self-select what types of cell
sets they support (beyond simply just structured or unstructured). To
2190support this, the lists `SupportedCellSets`, `SupportedStructuredCellSets`,
2191and `SupportedUnstructuredCellSets` have been added to `Filter`. When you
2192apply a policy to a cell set, you now have to also provide the filter.
2193
2194# Worklets and Filters
2195
2196## Enable setting invalid value in probe filter
2197
Initially, the probe filter would simply not set a value if a sample was
outside the input `DataSet`. This is not great as the memory could be
left uninitialized and lead to unpredictable results. The testing
compared these invalid results to 0, which seemed to work but is
probably unstable.

This was partially fixed by a previous change that consolidated the
mapping of cell data with a general routine that permuted data. However,
the fix did not extend to point data in the input, and it was not
possible to specify a particular invalid value.
2208
2209This change specifically updates the probe filter so that invalid values
2210are set to a user-specified value.
2211
2212
2213## Avoid raising errors when operating on cells
2214
Cell operations like interpolate and finding parametric coordinates can
fail under certain conditions. The previous behavior was to call
`RaiseError` on the worklet. By design, this would cause the worklet
execution to fail. However, that makes the worklet unstable for a condition
that might be relatively common in data. For example, you wouldn't want a
large streamline worklet to fail just because one cell was not found
correctly.
2222
2223To work around this, many of the cell operations in the execution
2224environment have been changed to return an error code rather than raise an
2225error in the worklet.
2226
2227### Error Codes
2228
2229To support cell operations efficiently returning errors, a new enum named
2230`vtkm::ErrorCode` is available. This is the current implementation of
2231`ErrorCode`.
2232
2233``` cpp
2234enum class ErrorCode
2235{
2236  Success,
2237  InvalidShapeId,
2238  InvalidNumberOfPoints,
2239  WrongShapeIdForTagType,
2240  InvalidPointId,
2241  InvalidEdgeId,
2242  InvalidFaceId,
2243  SolutionDidNotConverge,
2244  MatrixFactorizationFailed,
2245  DegenerateCellDetected,
2246  MalformedCellDetected,
2247  OperationOnEmptyCell,
2248  CellNotFound,
2249
2250  UnknownError
2251};
2252```
2253
2254A convenience function named `ErrorString` is provided to make it easy to
2255convert the `ErrorCode` to a descriptive string that can be placed in an
2256error.
2257
2258### New Calling Specification
2259
2260Previously, most execution environment functions took as an argument the
2261worklet calling the function. This made it possible to call `RaiseError` on
2262the worklet. The result of the operation was typically returned. For
2263example, here is how the _old_ version of interpolate was called.
2264
2265``` cpp
2266FieldType interpolatedValue =
2267  vtkm::exec::CellInterpolate(fieldValues, pcoord, shape, worklet);
2268```
2269
The worklet is no longer passed to the function. It is not needed
because an error is never directly raised. Instead, an `ErrorCode` is
returned from the function. Because the `ErrorCode` is returned, the
computed result of the function is returned by passing in a reference to a
variable. This is usually placed as the last argument (where the worklet
used to be). Here is the _new_ version of how interpolate is called.
2276
2277``` cpp
2278FieldType interpolatedValue;
2279vtkm::ErrorCode result =
2280  vtkm::exec::CellInterpolate(fieldValues, pcoord, shape, interpolatedValue);
2281```
2282
2283The success of the operation can be determined by checking that the
2284returned `ErrorCode` is equal to `vtkm::ErrorCode::Success`.
2285
2286
2287## Add atomic free functions
2288
2289Previously, all atomic functions were stored in classes named
2290`AtomicInterfaceControl` and `AtomicInterfaceExecution`, which required
2291you to know at compile time which device was using the methods. That in
2292turn means that anything using an atomic needed to be templated on the
2293device it is running on.
2294
2295That can be a big hassle (and is problematic for some code structure).
2296Instead, these methods are moved to free functions in the `vtkm`
2297namespace. These functions operate like those in `Math.h`. Using
2298compiler directives, an appropriate version of the function is compiled
2299for the current device the compiler is using.
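
The idea of selecting an implementation with compiler directives can be sketched as follows. This is not VTK-m's implementation: `AtomicAddSketch` and `SKETCH_EXEC_CONT` are hypothetical names, and the sketch assumes either CUDA device code (using the CUDA `atomicAdd` intrinsic) or a GCC/Clang host compiler (using the `__atomic_fetch_add` builtin). The point is that the caller uses one free function and does not need to be templated on a device.

```cpp
// Hypothetical markup macro: mark the function for host and device when
// compiling with a CUDA compiler, and expand to nothing otherwise.
#if defined(__CUDACC__)
#define SKETCH_EXEC_CONT __host__ __device__
#else
#define SKETCH_EXEC_CONT
#endif

// Atomically adds `value` to `*address` and returns the previous value.
// The preprocessor picks the implementation for whatever target the
// compiler is currently generating code for.
SKETCH_EXEC_CONT inline unsigned int AtomicAddSketch(unsigned int* address,
                                                     unsigned int value)
{
#if defined(__CUDA_ARCH__)
  // Compiling device code: use the CUDA intrinsic.
  return atomicAdd(address, value);
#elif defined(__GNUC__) || defined(__clang__)
  // Compiling host code with GCC or Clang: use the compiler builtin.
  return __atomic_fetch_add(address, value, __ATOMIC_SEQ_CST);
#else
#error "This sketch only covers CUDA device code and GCC/Clang host code."
#endif
}
```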
2300
2301## Flying Edges
2302
2303Added the flying edges contouring algorithm to VTK-m. This algorithm only
2304works on structured grids, but operates much faster than the traditional
2305Marching Cubes algorithm.
2306
The speed of VTK-m's flying edges is comparable to VTK's running on the same
CPUs. VTK-m's implementation also works well on CUDA hardware.
2309
2310The Flying Edges algorithm was introduced in this paper:
2311
2312Schroeder, W.; Maynard, R. & Geveci, B.
2313"Flying edges: A high-performance scalable isocontouring algorithm."
2314Large Data Analysis and Visualization (LDAV), 2015.
2315DOI 10.1109/LDAV.2015.7348069
2316
2317
2318## Filters specify their own field types
2319
2320Previously, the policy specified which field types the filter should
2321operate on. The filter could remove some types, but it was not able to
2322add any types.
2323
This is backward. Instead, the filter should specify what types it
supports and the policy may cull out some of those.
2326
2327# Build
2328
2329## Disable asserts for CUDA architecture builds
2330
2331`assert` is supported on recent CUDA cards, but compiling it appears to be
2332very slow. By default, the `VTKM_ASSERT` macro has been disabled whenever
2333compiling for a CUDA device (i.e. when `__CUDA_ARCH__` is defined).
2334
Asserts for CUDA devices can be turned back on by turning the
`VTKm_NO_ASSERT_CUDA` CMake variable off. Turning this CMake variable off
will enable assertions in CUDA kernels unless there is another reason for
turning off all asserts (such as a release build).
2339
2340## Disable asserts for HIP architecture builds
2341
`assert` is supported on recent HIP cards, but compiling it is very slow
because it triggers the usage of `printf`. Currently (ROCm 3.7), `printf`
has a severe performance penalty and should be avoided when possible.
By default, the `VTKM_ASSERT` macro has been disabled whenever compiling
for a HIP device via Kokkos.

Asserts for HIP devices can be turned back on by turning the
`VTKm_NO_ASSERT_HIP` CMake variable off. Turning this CMake variable off
will enable assertions in HIP kernels unless there is another reason for
turning off all asserts (such as a release build).
2352
2353## Add VTKM_DEPRECATED macro
2354
The `VTKM_DEPRECATED` macro allows us to remove (and usually replace)
features from VTK-m in minor releases while still following the conventions
of semantic versioning. The idea is that when we want to remove or replace
a feature, we first mark the old feature as deprecated. The old feature
will continue to work, but compilers that support it will start to issue a
warning that the feature is deprecated and should no longer be used. The
deprecated features should remain viable until at least the next major
version. At the next major version, deprecated features from the previous
version may be removed.
2364
2365### Declaring things deprecated
2366
2367Classes and methods are marked deprecated using the `VTKM_DEPRECATED`
2368macro. The first argument of `VTKM_DEPRECATED` should be set to the first
2369version in which the feature is deprecated. For example, if the last
2370released version of VTK-m was 1.5, and on the master branch a developer
2371wants to deprecate a class foo, then the `VTKM_DEPRECATED` release version
2372should be given as 1.6, which will be the next minor release of VTK-m. The
2373second argument of `VTKM_DEPRECATED`, which is optional but highly
2374encouraged, is a short message that should clue developers on how to update
2375their code to the new changes. For example, it could point to the
2376replacement class or method for the changed feature.
2377
2378`VTKM_DEPRECATED` can be used to deprecate a class by adding it between the
2379`struct` or `class` keyword and the class name.
2380
2381``` cpp
2382struct VTKM_DEPRECATED(1.6, "OldClass replaced with NewClass.") OldClass
2383{
2384};
2385```
2386
Aliases can similarly be deprecated, except the `VTKM_DEPRECATED` macro
goes after the name in this case.
2389
2390``` cpp
2391using OldAlias VTKM_DEPRECATED(1.6, "Use NewClass instead.") = NewClass;
2392```
2393
Functions and methods are marked as deprecated by adding `VTKM_DEPRECATED`
as a modifier before the return value and any markup (`VTKM_CONT`,
`VTKM_EXEC`, or `VTKM_EXEC_CONT`).
2396
2397``` cpp
VTKM_DEPRECATED(1.6, "You must now specify a tolerance.")
VTKM_EXEC_CONT
void ImportantMethod(double x)
{
  this->ImportantMethod(x, 1e-6);
}
2403```
2404
2405`enum`s can be deprecated like classes using similar syntax.
2406
2407``` cpp
2408enum struct VTKM_DEPRECATED(1.7, "Use NewEnum instead.") OldEnum
2409{
2410  OLD_VALUE
2411};
2412```
2413
2414Individual items in an `enum` can also be marked as deprecated and
2415intermixed with regular items.
2416
2417``` cpp
2418enum struct NewEnum
2419{
2420  OLD_VALUE1 VTKM_DEPRECATED(1.7, "Use NEW_VALUE instead."),
2421  NEW_VALUE,
2422  OLD_VALUE2 VTKM_DEPRECATED(1.7) = 42
2423};
2424```
2425
2426### Using deprecated items
2427
Using deprecated items should work, but the compiler will give a warning.
That is the point. However, sometimes you need to legitimately use a
deprecated item without a warning. This is usually because you are
implementing another deprecated item or because you have a test for a
deprecated item (which can easily be removed along with the deprecated
item). To support this, a pair of macros, `VTKM_DEPRECATED_SUPPRESS_BEGIN`
and `VTKM_DEPRECATED_SUPPRESS_END`, are provided. Code that legitimately
uses deprecated items should be wrapped in these macros.
2436
2437``` cpp
2438VTKM_DEPRECATED(1.6, "You must now specify both a value and tolerance.")
2439VTKM_EXEC_CONT
2440void ImportantMethod()
2441{
2442  // It can be the case that to implement a deprecated method you need to
2443  // use other deprecated features. To do that, just temporarily suppress
2444  // those warnings.
2445  VTKM_DEPRECATED_SUPPRESS_BEGIN
2446  this->ImportantMethod(0.0);
2447  VTKM_DEPRECATED_SUPPRESS_END
2448}
2449```
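
For readers curious how macros of this kind can be put together, here is a minimal sketch built on the standard `[[deprecated]]` attribute and on GCC/Clang diagnostic pragmas. It is an illustration under those assumptions, not VTK-m's actual definition: the `SKETCH_*` macros and the `ScaleValue` and `LegacyCaller` functions are hypothetical, the message argument is required here rather than optional, and on other compilers the suppression macros expand to nothing.

```cpp
// Hypothetical deprecation macro: forwards the message to the standard
// attribute and appends the version in which the item was deprecated.
#define SKETCH_DEPRECATED(version, message) \
  [[deprecated(message " Deprecated in version " #version ".")]]

// Hypothetical suppression macros. On GCC and Clang they temporarily
// disable the deprecation warning; elsewhere they do nothing.
#if defined(__GNUC__) || defined(__clang__)
#define SKETCH_DEPRECATED_SUPPRESS_BEGIN \
  _Pragma("GCC diagnostic push")         \
  _Pragma("GCC diagnostic ignored \"-Wdeprecated-declarations\"")
#define SKETCH_DEPRECATED_SUPPRESS_END _Pragma("GCC diagnostic pop")
#else
#define SKETCH_DEPRECATED_SUPPRESS_BEGIN
#define SKETCH_DEPRECATED_SUPPRESS_END
#endif

// The replacement function.
inline double ScaleValue(double x, double factor)
{
  return x * factor;
}

// The old overload, kept working but marked deprecated.
SKETCH_DEPRECATED(1.6, "You must now specify a factor.")
inline double ScaleValue(double x)
{
  return ScaleValue(x, 2.0);
}

// Code that legitimately needs the old overload wraps the call so that
// no warning is issued for it.
inline double LegacyCaller(double x)
{
  SKETCH_DEPRECATED_SUPPRESS_BEGIN
  const double result = ScaleValue(x);
  SKETCH_DEPRECATED_SUPPRESS_END
  return result;
}
```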
2450
2451# Other
2452
2453## Porting layer for future std features
2454
2455Currently, VTK-m is using C++11. However, it is often useful to use
2456features in the `std` namespace that are defined for C++14 or later. We can
2457provide our own versions (sometimes), but it is preferable to use the
2458version provided by the compiler if available.
2459
2460There were already some examples of defining portable versions of C++14 and
2461C++17 classes in a `vtkmstd` namespace, but these were sprinkled around the
2462source code.
2463
2464There is now a top level `vtkmstd` directory and in it are header files
2465that provide portable versions of these future C++ classes. In each case,
2466preprocessor macros are used to select which version of the class to use.
2467
2468
2469## Removed OpenGL Rendering Classes
2470
2471When the rendering library was first built, OpenGL was used to implement
2472the components (windows, mappers, annotation, etc.). However, as the native
2473ray casting became viable, the majority of the work has focused on using
2474that. Since then, the original OpenGL classes have been largely ignored.
2475
For many months, it has been clear that it is not worth attempting to
maintain two different versions of the rendering libraries as features are
added and changed. Thus, the OpenGL classes fell out of date and did
not actually work.
2480
2481These classes have finally been officially removed.
2482
2483
2484## Reorganization of `io` directory
2485
2486The `vtkm/io` directory has been flattened.
2487Namely, the files in `vtkm/io/reader` and `vtkm/io/writer` have been moved up into `vtkm/io`,
2488with the associated changes in namespaces.
2489
2490In addition, `vtkm/cont/EncodePNG.h` and `vtkm/cont/DecodePNG.h` have been moved to a more natural home in `vtkm/io`.
2491
2492
2493## Implemented PNG/PPM image Readers/Writers
2494
Originally, image data could only be written by proxy through the Canvas
rendering class. In order to implement true support for image-based
regression testing, this interface needed to be expanded to support
reading/writing arbitrary image data and storing it in a `vtkm::DataSet`.
Using the new `vtkm::io::PNGReader` and `vtkm::io::PPMReader`, it is possible
to read data from files and Canvases directly and store them as a point field
in a 2D uniform `vtkm::DataSet`.
2502
2503```cpp
2504auto reader = vtkm::io::PNGReader();
2505auto imageDataSet = reader.ReadFromFile("read_image.png");
2506```
2507
2508Similarly, the new `vtkm::io::PNGWriter` and `vtkm::io::PPMWriter` make it possible
2509to write out a 2D uniform `vtkm::DataSet` directly to a file.
2510
2511```cpp
2512auto writer = vtkm::io::PNGWriter();
2513writer.WriteToFile("write_image.png", imageDataSet);
2514```
2515
2516If canvas data is to be written out, the reader provides a method for converting
2517a canvas's data to a `vtkm::DataSet`.
2518
2519```cpp
2520auto reader = vtkm::io::PNGReader();
2521auto dataSet = reader.CreateImageDataSet(canvas);
2522auto writer = vtkm::io::PNGWriter();
2523writer.WriteToFile("output.png", dataSet);
2524```
2525
2526
2527## Updated Benchmark Framework
2528
2529The benchmarking framework has been updated to use Google Benchmark.
2530
2531A benchmark is now a single function, which is passed to a macro:
2532
2533```
void MyBenchmark(::benchmark::State& state)
{
  MyClass someClass;

  // Optional: Add a descriptive label with additional benchmark details:
  state.SetLabel("Blah blah blah.");

  // Must use a vtkm timer to properly capture e.g. CUDA execution times.
  vtkm::cont::Timer timer;
  for (auto _ : state)
  {
    someClass.Reset();

    timer.Start();
    someClass.DoWork();
    timer.Stop();

    state.SetIterationTime(timer.GetElapsedTime());
  }

  // Optional: Report items and/or bytes processed per iteration in output:
  state.SetItemsProcessed(state.iterations() * someClass.GetNumberOfItems());
  state.SetBytesProcessed(state.iterations() * someClass.GetNumberOfBytes());
}
VTKM_BENCHMARK(MyBenchmark);
2560```
2561
2562Google benchmark also makes it easy to implement parameter sweep benchmarks:
2563
2564```
void MyParameterSweep(::benchmark::State& state)
{
  // The current value in the sweep:
  const vtkm::Id currentValue = state.range(0);

  MyClass someClass;
  someClass.SetSomeParameter(currentValue);

  vtkm::cont::Timer timer;
  for (auto _ : state)
  {
    someClass.Reset();

    timer.Start();
    someClass.DoWork();
    timer.Stop();

    state.SetIterationTime(timer.GetElapsedTime());
  }
}
VTKM_BENCHMARK_OPTS(MyParameterSweep, ->ArgName("Param")->Range(32, 1024 * 1024));
2586```
2587
This will generate and launch several benchmarks, exploring the parameter
space of `SetSomeParameter` between the values of 32 and (1024*1024). The
chain of function calls in the second argument is applied to an instance of
`::benchmark::internal::Benchmark`. See Google Benchmark's documentation for
more details.

For more complex benchmark configurations, the `VTKM_BENCHMARK_APPLY` macro
accepts a function with the signature
`void Func(::benchmark::internal::Benchmark*)` that may be used to generate
more complex configurations.

To instantiate a templated benchmark across a list of types, the
`VTKM_BENCHMARK_TEMPLATE*` macros take a `vtkm::List` of types as an
additional parameter. The templated benchmark function will be instantiated
and called for each type in the list:
2603
2604```
template <typename T>
void MyBenchmark(::benchmark::State& state)
{
  MyClass<T> someClass;

  // Must use a vtkm timer to properly capture e.g. CUDA execution times.
  vtkm::cont::Timer timer;
  for (auto _ : state)
  {
    someClass.Reset();

    timer.Start();
    someClass.DoWork();
    timer.Stop();

    state.SetIterationTime(timer.GetElapsedTime());
  }
}
VTKM_BENCHMARK_TEMPLATE(MyBenchmark, vtkm::List<vtkm::Float32, vtkm::Vec3f_32>);
2625```
2626
2627The benchmarks are executed by calling the `VTKM_EXECUTE_BENCHMARKS(argc, argv)`
2628macro from `main`. There is also a `VTKM_EXECUTE_BENCHMARKS_PREAMBLE(argc, argv, some_string)`
2629macro that appends the contents of `some_string` to the Google Benchmark preamble.
2630
If a benchmark is not compatible with some configuration, it may call
`state.SkipWithError("Error message");` on the `::benchmark::State` object and return. This is
useful, for instance, in the filter tests when the input is not compatible with the filter.
2634
2635When launching a benchmark executable, the following options are supported by Google Benchmark:
2636
2637- `--benchmark_list_tests`: List all available tests.
- `--benchmark_filter="[regex]"`: Only run benchmarks with names that match `[regex]`.
- `--benchmark_filter="-[regex]"`: Only run benchmarks with names that DON'T match `[regex]`.
2640- `--benchmark_min_time=[float]`: Make sure each benchmark repetition gathers `[float]` seconds
2641  of data.
2642- `--benchmark_repetitions=[int]`: Run each benchmark `[int]` times and report aggregate statistics
2643  (mean, stdev, etc). A "repetition" refers to a single execution of the benchmark function, not
2644  an "iteration", which is a loop of the `for(auto _:state){...}` section.
2645- `--benchmark_report_aggregates_only="true|false"`: If true, only the aggregate statistics are
2646  reported (affects both console and file output). Requires `--benchmark_repetitions` to be useful.
2647- `--benchmark_display_aggregates_only="true|false"`: If true, only the aggregate statistics are
2648  printed to the terminal. Any file output will still contain all repetition info.
2649- `--benchmark_format="console|json|csv"`: Specify terminal output format: human readable
2650  (`console`) or `csv`/`json` formats.
2651- `--benchmark_out_format="console|json|csv"`: Specify file output format: human readable
2652  (`console`) or `csv`/`json` formats.
2653- `--benchmark_out=[filename]`: Specify output file.
2654- `--benchmark_color="true|false"`: Toggle color output in terminal when using `console` output.
2655- `--benchmark_counters_tabular="true|false"`: Print counter information (e.g. bytes/sec, items/sec)
2656  in the table, rather than appending them as a label.
2657
2658For more information and examples of practical usage, take a look at the existing benchmarks in
2659vtk-m/benchmarking/.
2660
2661
2662## Provide scripts to build Gitlab-ci workers locally
2663
To simplify reproducing docker-based CI workers locally, VTK-m has a Python program that handles all the
work automatically for you.

The program is located in `Utilities/CI/reproduce_ci_env.py` and requires python3 and pyyaml.

Using the program is easy. The following two commands create the `build:rhel8` gitlab-ci
worker as a docker image and set up a container just as gitlab-ci would before the actual
compilation of VTK-m. Instead of doing the compilation, you will be given an interactive shell.
2672
2673```
2674./reproduce_ci_env.py create rhel8
2675./reproduce_ci_env.py run rhel8
2676```
2677
To compile VTK-m from the interactive shell, you would do the following:
2679```
2680> src]## cd build/
2681> build]## cmake --build .
2682```
2683
2684
2685## Replaced `vtkm::ListTag` with `vtkm::List`
2686
2687The original `vtkm::ListTag` was designed when we had to support compilers
2688that did not provide C++11's variadic templates. Thus, the design hides
2689type lists, which were complicated to support.
2690
2691Now that we support C++11, variadic templates are trivial and we can easily
2692create templated type aliases with `using`. Thus, it is now simpler to deal
2693with a template that lists types directly.
2694
2695Hence, `vtkm::ListTag` is deprecated and `vtkm::List` is now supported. The
2696main difference between the two is that whereas `vtkm::ListTag` allowed you
2697to create a list by subclassing another list, `vtkm::List` cannot be
2698subclassed. (Well, it can be subclassed, but the subclass ceases to be
2699considered a list.) Thus, where before you would declare a list like
2700
2701``` cpp
2702struct MyList : vtkm::ListTagBase<Type1, Type2, Type3>
2703{
2704};
2705```
2706
2707you now make an alias
2708
2709``` cpp
2710using MyList = vtkm::List<Type1, Type2, Type3>;
2711```
2712
2713If the compiler reports the `MyList` type in an error or warning, it
2714actually uses the fully qualified `vtkm::List<Type1, Type2, Type3>`.
2715Although this makes errors more verbose, it makes it easier to diagnose
2716problems because the types are explicitly listed.
2717
2718The new `vtkm::List` comes with a list of utility templates to manipulate
2719lists that mostly mirrors those in `vtkm::ListTag`: `VTKM_IS_LIST`,
2720`ListApply`, `ListSize`, `ListAt`, `ListIndexOf`, `ListHas`, `ListAppend`,
2721`ListIntersect`, `ListTransform`, `ListRemoveIf`, and `ListCross`. All of
2722these utilities become `vtkm::List<>` types (where applicable), which makes
2723them more consistent than the old `vtkm::ListTag` versions.
2724
2725Thus, if you have a declaration like
2726
2727``` cpp
vtkm::ListAppend<vtkm::List<Type1a, Type2a>, vtkm::List<Type1b, Type2b>>
2729```
2730
2731this gets changed automatically to
2732
2733``` cpp
2734vtkm::List<Type1a, Type2a, Type1b, Type2b>
2735```
2736
2737This is in contrast to the equivalent old version, which would create a new
2738type for `vtkm::ListTagAppend` in addition to the ultimate actual list it
2739constructs.
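
To see why an alias-based utility resolves directly to a plain list type, consider the following stand-alone sketch. It does not use VTK-m: `SketchList`, `SketchListAppend`, and `SketchListSize` are hypothetical names, and this append takes exactly two lists. The alias template exposes only the `type` member of a helper, so the result is the list itself rather than a new wrapper type.

```cpp
#include <cstddef>
#include <type_traits>

// A type list is just a variadic template with no members.
template <typename... Ts>
struct SketchList
{
};

namespace sketch_list_detail
{
// Helper that concatenates the parameter packs of two lists.
template <typename L1, typename L2>
struct AppendImpl;

template <typename... As, typename... Bs>
struct AppendImpl<SketchList<As...>, SketchList<Bs...>>
{
  using type = SketchList<As..., Bs...>;
};
} // namespace sketch_list_detail

// The public utility is an alias, so it *is* the resulting list type.
template <typename L1, typename L2>
using SketchListAppend = typename sketch_list_detail::AppendImpl<L1, L2>::type;

// Number of types in a list.
template <typename L>
struct SketchListSize;

template <typename... Ts>
struct SketchListSize<SketchList<Ts...>>
  : std::integral_constant<std::size_t, sizeof...(Ts)>
{
};

using Appended = SketchListAppend<SketchList<int, float>, SketchList<char, double>>;

static_assert(std::is_same<Appended, SketchList<int, float, char, double>>::value,
              "The alias resolves to a plain list type.");
static_assert(SketchListSize<Appended>::value == 4, "Four types in the combined list.");
```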
2740
2741
2742## Add `ListTagRemoveIf`
2743
2744It is sometimes useful to remove types from `ListTag`s. This is especially
2745the case when combining lists of types together where some of the type
2746combinations may be invalid and should be removed. To handle this
2747situation, a new `ListTag` type is added: `ListTagRemoveIf`.
2748
2749`ListTagRemoveIf` is a template structure that takes two arguments. The
2750first argument is another `ListTag` type to operate on. The second argument
2751is a template that acts as a predicate. The predicate takes a type and
2752declares a Boolean `value` that should be `true` if the type should be
2753removed and `false` if the type should remain.
2754
2755Here is an example of using `ListTagRemoveIf` to get basic types that hold
2756only integral values.
2757
2758``` cpp
2759template <typename T>
2760using IsRealValue =
2761  std::is_same<
2762    typename vtkm::TypeTraits<typename vtkm::VecTraits<T>::BaseComponentType>::NumericTag,
2763    vtkm::TypeTraitsRealTag>;
2764
2765using MixedTypes =
2766  vtkm::ListTagBase<vtkm::Id, vtkm::FloatDefault, vtkm::Id3, vtkm::Vec3f>;
2767
2768using IntegralTypes = vtkm::ListTagRemoveIf<MixedTypes, IsRealValue>;
2769// IntegralTypes now equivalent to vtkm::ListTagBase<vtkm::Id, vtkm::Id3>
2770```
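
The predicate-based removal can be sketched in plain C++ to show the mechanics. This is an illustration only, not how VTK-m implements `ListTagRemoveIf`: `SketchTypes` and `SketchRemoveIf` are hypothetical names, and the standard trait `std::is_floating_point` stands in for a predicate such as `IsRealValue`.

```cpp
#include <type_traits>

// A simple variadic type list.
template <typename... Ts>
struct SketchTypes
{
};

namespace sketch_remove_detail
{
// Puts a type at the front of a list.
template <typename T, typename L>
struct Prepend;

template <typename T, typename... Ts>
struct Prepend<T, SketchTypes<Ts...>>
{
  using type = SketchTypes<T, Ts...>;
};

// Recursively walks the list and keeps only the types for which the
// predicate's `value` is false.
template <typename L, template <typename> class Predicate>
struct RemoveIfImpl;

template <template <typename> class Predicate>
struct RemoveIfImpl<SketchTypes<>, Predicate>
{
  using type = SketchTypes<>;
};

template <typename Head, typename... Tail, template <typename> class Predicate>
struct RemoveIfImpl<SketchTypes<Head, Tail...>, Predicate>
{
  using Rest = typename RemoveIfImpl<SketchTypes<Tail...>, Predicate>::type;
  using type = typename std::conditional<Predicate<Head>::value,
                                         Rest,
                                         typename Prepend<Head, Rest>::type>::type;
};
} // namespace sketch_remove_detail

template <typename L, template <typename> class Predicate>
using SketchRemoveIf = typename sketch_remove_detail::RemoveIfImpl<L, Predicate>::type;

using MixedScalars = SketchTypes<int, float, long, double>;
using IntegralScalars = SketchRemoveIf<MixedScalars, std::is_floating_point>;

static_assert(std::is_same<IntegralScalars, SketchTypes<int, long>>::value,
              "Floating-point types are removed and the order is preserved.");
```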
2771
2772
2773## Write uniform and rectilinear grids to legacy VTK files
2774
Previously, as a programming convenience, every `vtkm::cont::DataSet`
written by `vtkm::io::VTKDataSetWriter` was written as a structured grid.
Although technically correct, this changed the structure of the data. This
meant that if you wanted to capture data to run elsewhere, it would run as a
different data type. This was particularly frustrating if the data of that
structure was causing problems and you wanted to debug it.
2781
2782Now, `VTKDataSetWriter` checks the type of the `CoordinateSystem` to
2783determine whether the data should be written out as `STRUCTURED_POINTS`
2784(i.e. a uniform grid), `RECTILINEAR_GRID`, or `STRUCTURED_GRID`
2785(curvilinear).
2786
2787# References
2788
2789| Feature                                                                   | Merge Request            |
2790| --------------------------------------------------------------------------| ------------------------ |
2791| Add Kokkos backend                                                        | Merge-request: !2164     |
2792| Extract component arrays from unknown arrays                              | Merge-request: !2354     |
2793| `ArrayHandleGroupVecVariable` holds now one more offset.                  | Merge-request: !1964     |
2794| Create `ArrayHandleOffsetsToNumComponents`                                | Merge-request: !2299     |
2795| Implemented ArrayHandleRandomUniformBits and ArrayHandleRandomUniformReal | Merge-request: !2116     |
2796| `ArrayRangeCompute` works on any array type without compiling device code | Merge-request: !2409     |
2797| Algorithms for Control and Execution Environments                         | Merge-request: !1920     |
2798| Redesign of ArrayHandle to access data using typeless buffers             | Merge-request: !2347     |
2799| `vtkm::cont::internal::Buffer` now can have ownership transferred         | Merge-request: !2200     |
2800| Provide scripts to build Gitlab-ci workers locally                        | Merge-request: !2030     |
2801| Configurable default types                                                | Merge-request: !1997     |
2802| Result DataSet of coordinate transform has its CoordinateSystem changed   | Merge-request: !2099     |
2803| Precompiled `ArrayCopy` for `UnknownArrayHandle`                          | Merge-request: !2396     |
2804| Disable asserts for CUDA architecture builds                              | Merge-request: !2157     |
2805| Portals may advertise custom iterators                                    | Merge-request: !1929     |
2806| DataSet now only allows unique field names                                | Merge-request: !2099     |
2807| ArrayHandleDecorator Allocate and Shrink Support                          | Merge-request: !1933     |
2808| Deprecate ArrayHandleVirtualCoordinates                                   | Merge-request: !2177     |
2809| Deprecate `DataSetFieldAdd`                                               | Merge-request: !2106     |
2810| Deprecate Execute with policy                                             | Merge-request: !2093     |
2811| Virtual methods in execution environment deprecated                       | Merge-request: !2256     |
2812| Add VTKM_DEPRECATED macro                                                 | Merge-request: !2266     |
2813| Filters specify their own field types                                     | Merge-request: !2099     |
2814| Flying Edges                                                              | Merge-request: !2099     |
2815| Updated Benchmark Framework                                               | Merge-request: !1936     |
2816| Disable asserts for HIP architecture builds                               | Merge-request: !2270     |
2817| Implemented PNG/PPM image Readers/Writers                                 | Merge-request: !1967     |
2818| Reorganization of `io` directory                                          | Merge-request: !2067     |
2819| Add `ListTagRemoveIf`                                                     | Merge-request: !1901     |
2820| Masks and Scatters Supported for 3D Scheduling                            | Merge-request: !1975     |
2821| Improvements to moving data into ArrayHandle                              | Merge-request: !2184     |
2822| Avoid raising errors when operating on cells                              | Merge-request: !2099     |
2823| Order asynchronous `ArrayHandle` access                                   | Merge-request: !2130     |
2824| Enable setting invalid value in probe filter                              | Merge-request: !2122     |
2825| `ReadPortal().Get(idx)`                                                   | Merge-request: !2078     |
2826| Recombine extracted component arrays from unknown arrays                  | Merge-request: !2381     |
2827| Removed old `ArrayHandle` transfer mechanism                              | Merge-request: !2347     |
2828| Removed OpenGL Rendering Classes                                          | Merge-request: !2099     |
2829| Scope ExecObjects with Tokens                                             | Merge-request: !1988     |
2830| Shorter fancy array handle classnames                                     | Merge-request: !1937     |
2831| Support `ArrayHandleSOA` as a "default" array                             | Merge-request: !2349     |
2832| Porting layer for future std features                                     | Merge-request: !1977     |
2833| Add a vtkm::Tuple class                                                   | Merge-request: !1977     |
2834| UnknownArrayHandle and UncertainArrayHandle for runtime-determined types  | Merge-request: !2202     |
2835| Added VecFlat class                                                       | Merge-request: !2354     |
2836| Remove VTKDataSetWriter::WriteDataSet just_points parameter               | Merge-request: !2185     |
2837| Move VTK file readers and writers into vtkm_io                            | Merge-request: !2100     |
2838| Write uniform and rectilinear grids to legacy VTK files                   | Merge-request: !2173     |
2839| Add atomic free functions                                                 | Merge-request: !2223     |
2840| Replaced `vtkm::ListTag` with `vtkm::List`                                | Merge-request: !1918     |
2841