# Eigen Tensors {#eigen_tensors}

Tensors are multidimensional arrays of elements. Elements are typically scalars,
but more complex types such as strings are also supported.

[TOC]

## Tensor Classes

You can manipulate a tensor with one of the following classes.  They are all in
the namespace `::Eigen`.


### Class `Tensor<data_type, rank>`

This is the class to use to create a tensor and allocate memory for it.  The
class is templatized with the tensor datatype, such as float or int, and the
tensor rank.  The rank is the number of dimensions; for example, rank 2 is a
matrix.

Tensors of this class are resizable.  For example, if you assign a tensor of a
different size to a Tensor, that tensor is resized to match its new value.

#### Constructor `Tensor<data_type, rank>(size0, size1, ...)`

Constructor for a Tensor.  The constructor must be passed `rank` integers
indicating the sizes of the instance along each of the `rank`
dimensions.

    // Create a tensor of rank 3 of sizes 2, 3, 4.  This tensor owns
    // memory to hold 24 floating point values (24 = 2 x 3 x 4).
    Tensor<float, 3> t_3d(2, 3, 4);

    // Resize t_3d by assigning a tensor of different sizes, but same rank.
    t_3d = Tensor<float, 3>(3, 4, 3);

#### Constructor `Tensor<data_type, rank>(size_array)`

Constructor where the sizes for the constructor are specified as an array of
values instead of an explicit list of parameters.  The array type to use is
`Eigen::array<Eigen::Index>`.  The array can be constructed automatically
from an initializer list.

    // Create a tensor of strings of rank 2 with sizes 5, 7.
    Tensor<string, 2> t_2d({5, 7});


### Class `TensorFixedSize<data_type, Sizes<size0, size1, ...>>`

Class to use for tensors of fixed size, where the size is known at compile
time.  Fixed sized tensors can provide very fast computations because all their
dimensions are known by the compiler.  FixedSize tensors are not resizable.

If the total number of elements in a fixed size tensor is small enough, the
tensor data is held on the stack and does not cause heap allocation and free.

    // Create a 4 x 3 tensor of floats.
    TensorFixedSize<float, Sizes<4, 3>> t_4x3;

### Class `TensorMap<Tensor<data_type, rank>>`

This is the class to use to create a tensor on top of memory allocated and
owned by another part of your code.  It allows you to view any piece of
allocated memory as a Tensor.  Instances of this class do not own the memory
where the data are stored.

A TensorMap is not resizable because it does not own the memory where its data
are stored.

#### Constructor `TensorMap<Tensor<data_type, rank>>(data, size0, size1, ...)`

Constructor for a TensorMap.  The constructor must be passed a pointer to the
storage for the data, and `rank` size attributes.  The storage has to be
large enough to hold all the data.

    // Map a tensor of ints on top of stack-allocated storage.
    int storage[128];  // 2 x 4 x 2 x 8 = 128
    TensorMap<Tensor<int, 4>> t_4d(storage, 2, 4, 2, 8);

    // The same storage can be viewed as a different tensor.
    // You can also pass the sizes as an array.
    TensorMap<Tensor<int, 2>> t_2d(storage, 16, 8);

    // You can also map fixed-size tensors.  Here we get a 1d view of
    // the 2d fixed-size tensor.
    TensorFixedSize<float, Sizes<4, 3>> t_4x3;
    TensorMap<Tensor<float, 1>> t_12(t_4x3.data(), 12);


#### Class `TensorRef`

See Assigning to a TensorRef below.

## Accessing Tensor Elements

#### `<data_type> tensor(index0, index1...)`

Return the element at position `(index0, index1...)` in tensor
`tensor`.  You must pass as many parameters as the rank of `tensor`.
The expression can be used as an l-value to set the value of the element at the
specified position.  The value returned is of the datatype of the tensor.

    // Set the value of the element at position (0, 1, 0).
    Tensor<float, 3> t_3d(2, 3, 4);
    t_3d(0, 1, 0) = 12.0f;

    // Initialize all elements to random values.
    for (int i = 0; i < 2; ++i) {
      for (int j = 0; j < 3; ++j) {
        for (int k = 0; k < 4; ++k) {
          t_3d(i, j, k) = ...some random value...;
        }
      }
    }

    // Print elements of a tensor.
    for (int i = 0; i < 2; ++i) {
      LOG(INFO) << t_3d(i, 0, 0);
    }

## TensorLayout

The tensor library supports 2 layouts: `ColMajor` (the default) and
`RowMajor`.  Only the default column major layout is currently fully
supported, and it is therefore not recommended to attempt to use the row major
layout at the moment.

The layout of a tensor is optionally specified as part of its type. If not
specified explicitly, column major is assumed.

    Tensor<float, 3, ColMajor> col_major;  // equivalent to Tensor<float, 3>
    TensorMap<Tensor<float, 3, RowMajor> > row_major(data, ...);

All the arguments to an expression must use the same layout. Attempting to mix
different layouts will result in a compilation error.

It is possible to change the layout of a tensor or an expression using the
`swap_layout()` method.  Note that this will also reverse the order of the
dimensions.

    Tensor<float, 2, ColMajor> col_major(2, 4);
    Tensor<float, 2, RowMajor> row_major(2, 4);

    Tensor<float, 2> col_major_result = col_major;  // ok, layouts match
    Tensor<float, 2> col_major_result = row_major;  // will not compile

    // Simple layout swap
    col_major_result = row_major.swap_layout();
    eigen_assert(col_major_result.dimension(0) == 4);
    eigen_assert(col_major_result.dimension(1) == 2);

    // Swap the layout and preserve the order of the dimensions
    array<int, 2> shuffle(1, 0);
    col_major_result = row_major.swap_layout().shuffle(shuffle);
    eigen_assert(col_major_result.dimension(0) == 2);
    eigen_assert(col_major_result.dimension(1) == 4);


## Tensor Operations

The Eigen Tensor library provides a vast library of operations on Tensors:
numerical operations such as addition and multiplication, geometry operations
such as slicing and shuffling, etc.  These operations are available as methods
of the Tensor classes, and in some cases as operator overloads.  For example
the following code computes the elementwise addition of two tensors:

    Tensor<float, 3> t1(2, 3, 4);
    ...set some values in t1...
    Tensor<float, 3> t2(2, 3, 4);
    ...set some values in t2...
    // Set t3 to the element wise sum of t1 and t2
    Tensor<float, 3> t3 = t1 + t2;

While the code above looks easy enough, it is important to understand that the
expression `t1 + t2` is not actually adding the values of the tensors.  The
expression instead constructs a "tensor operator" object of the class
`TensorCwiseBinaryOp<scalar_sum>`, which has references to the tensors
`t1` and `t2`.  This is a small C++ object that knows how to add
`t1` and `t2`.  It is only when the value of the expression is assigned
to the tensor `t3` that the addition is actually performed.  Technically,
this happens through the overloading of `operator=()` in the Tensor class.

This mechanism for computing tensor expressions allows for lazy evaluation and
optimizations which are what make the tensor library very fast.

Of course, the tensor operators do nest, and the expression `t1 + t2 * 0.3f`
is actually represented with the (approximate) tree of operators:

    TensorCwiseBinaryOp<scalar_sum>(t1, TensorCwiseUnaryOp<scalar_mul>(t2, 0.3f))


### Tensor Operations and C++ "auto"

Because Tensor operations create tensor operators, the C++ `auto` keyword
does not have its intuitive meaning.  Consider these 2 lines of code:

    Tensor<float, 3> t3 = t1 + t2;
    auto t4 = t1 + t2;

In the first line we allocate the tensor `t3` and it will contain the
result of the addition of `t1` and `t2`.  In the second line, `t4`
is actually the tree of tensor operators that will compute the addition of
`t1` and `t2`.  In fact, `t4` is *not* a tensor and you cannot get
the values of its elements:

    Tensor<float, 3> t3 = t1 + t2;
    cout << t3(0, 0, 0);  // OK prints the value of t1(0, 0, 0) + t2(0, 0, 0)

    auto t4 = t1 + t2;
    cout << t4(0, 0, 0);  // Compilation error!

When you use `auto` you do not get a Tensor as a result but instead a
non-evaluated expression.  So only use `auto` to delay evaluation.

Unfortunately, there is no single underlying concrete type for holding
non-evaluated expressions, hence you have to use auto in the case when you do
want to hold non-evaluated expressions.

When you need the results of a set of tensor computations you have to assign
the result to a Tensor that will be capable of holding onto them.  This can be
either a normal Tensor, a fixed size Tensor, or a TensorMap on an existing
piece of memory.  All the following will work:

    auto t4 = t1 + t2;

    Tensor<float, 3> result = t4;  // Could also be: result(t4);
    cout << result(0, 0, 0);

    TensorMap<Tensor<float, 3>> result(<a float* with enough space>, <size0>, ...);
    result = t4;
    cout << result(0, 0, 0);

    TensorFixedSize<float, Sizes<size0, ...>> result = t4;
    cout << result(0, 0, 0);

Until you need the results, you can keep the operation around, and even reuse
it for additional operations.  As long as you keep the expression as an
operation, no computation is performed.

    // One way to compute exp((t1 + t2) * 0.2f);
    auto t3 = t1 + t2;
    auto t4 = t3 * 0.2f;
    auto t5 = t4.exp();
    Tensor<float, 3> result = t5;

    // Another way, exactly as efficient as the previous one:
    Tensor<float, 3> result = ((t1 + t2) * 0.2f).exp();

### Controlling When Expressions Are Evaluated

There are several ways to control when expressions are evaluated:

*   Assignment to a Tensor, TensorFixedSize, or TensorMap.
*   Use of the eval() method.
*   Assignment to a TensorRef.

#### Assigning to a Tensor, TensorFixedSize, or TensorMap.

The most common way to evaluate an expression is to assign it to a Tensor.  In
the example below, the `auto` declarations make the intermediate values
"Operations", not Tensors, and do not cause the expressions to be evaluated.
The assignment to the Tensor `result` causes the evaluation of all the
operations.

    auto t3 = t1 + t2;             // t3 is an Operation.
    auto t4 = t3 * 0.2f;           // t4 is an Operation.
    auto t5 = t4.exp();            // t5 is an Operation.
    Tensor<float, 3> result = t5;  // The operations are evaluated.

If you know the ranks and sizes of the Operation value you can assign the
Operation to a TensorFixedSize instead of a Tensor, which is a bit more
efficient.

    // We know that the result is a 4x4x2 tensor!
    TensorFixedSize<float, Sizes<4, 4, 2>> result = t5;

Similarly, assigning an expression to a TensorMap causes its evaluation.  Like
tensors of type TensorFixedSize, TensorMaps cannot be resized, so they have to
have the rank and sizes of the expression that is assigned to them.

#### Calling `eval()`.

When you compute large composite expressions, you sometimes want to tell Eigen
that an intermediate value in the expression tree is worth evaluating ahead of
time.  This is done by inserting a call to the `eval()` method of the
expression Operation.

    // The previous example could have been written:
    Tensor<float, 3> result = ((t1 + t2) * 0.2f).exp();

    // If you want to compute (t1 + t2) once ahead of time you can write:
    Tensor<float, 3> result = ((t1 + t2).eval() * 0.2f).exp();

Semantically, calling `eval()` is equivalent to materializing the value of
the expression in a temporary Tensor of the right size.  The code above in
effect does:

    // .eval() knows the size!
    TensorFixedSize<float, Sizes<4, 4, 2>> tmp = t1 + t2;
    Tensor<float, 3> result = (tmp * 0.2f).exp();

Note that the return value of `eval()` is itself an Operation, so the
following code does not do what you may think:

    // Here t3 is an evaluation Operation.  t3 has not been evaluated yet.
    auto t3 = (t1 + t2).eval();

    // You can use t3 in another expression.  Still no evaluation.
    auto t4 = (t3 * 0.2f).exp();

    // The value is evaluated when you assign the Operation to a Tensor, using
    // an intermediate tensor to represent t3.
    Tensor<float, 3> result = t4;

While in the examples above calling `eval()` does not make a difference in
performance, in other cases it can make a huge difference.  In the expression
below the `broadcast()` expression causes the `X.maximum()` expression
to be evaluated many times:

    Tensor<...> X ...;
    Tensor<...> Y = ((X - X.maximum(depth_dim).reshape(dims2d).broadcast(bcast))
                     * beta).exp();

Inserting a call to `eval()` between the `maximum()` and
`reshape()` calls guarantees that maximum() is only computed once and
greatly speeds up execution:

    Tensor<...> Y =
      ((X - X.maximum(depth_dim).eval().reshape(dims2d).broadcast(bcast))
        * beta).exp();

In the next example, the tensor `Y` is used both in the expression and in
its assignment.  This is an aliasing problem: if the evaluation is not done in
the right order, Y will be updated incrementally during the evaluation,
resulting in bogus results:

     Tensor<...> Y ...;
     Y = Y / (Y.sum(depth_dim).reshape(dims2d).broadcast(bcast));

Inserting a call to `eval()` between the `sum()` and `reshape()`
expressions ensures that the sum is computed before any updates to `Y` are
done.

     Y = Y / (Y.sum(depth_dim).eval().reshape(dims2d).broadcast(bcast));

Note that an eval around the full right hand side expression is not needed
because the generated code has to compute the i-th value of the right hand side
before assigning it to the left hand side.

However, if you were assigning the expression value to a shuffle of `Y`
then you would need to force an eval for correctness by adding an `eval()`
call for the right hand side:

     Y.shuffle(...) =
        (Y / (Y.sum(depth_dim).eval().reshape(dims2d).broadcast(bcast))).eval();


#### Assigning to a `TensorRef`.

If you need to access only a few elements from the value of an expression you
can avoid materializing the value in a full tensor by using a TensorRef.

A TensorRef is a small wrapper class for any Eigen Operation.  It provides
overloads for the `()` operator that let you access individual values in
the expression.  TensorRef is convenient, because the Operations themselves do
not provide a way to access individual elements.

    // Create a TensorRef for the expression.  The expression is not
    // evaluated yet.
    TensorRef<Tensor<float, 3> > ref = ((t1 + t2) * 0.2f).exp();

    // Use "ref" to access individual elements.  The expression is evaluated
    // on the fly.
    float at_0 = ref(0, 0, 0);
    cout << ref(0, 1, 0);

Only use TensorRef when you need a subset of the values of the expression.
TensorRef only computes the values you access.  However, note that if you are
going to access all the values it will be much faster to materialize the
results in a Tensor first.

In some cases, if the full Tensor result would be very large, you may save
memory by accessing it as a TensorRef.  But not always.  So don't count on it.


### Controlling How Expressions Are Evaluated

The tensor library provides several implementations of the various operations
such as contractions and convolutions.  The implementations are optimized for
different environments: single threaded on CPU, multi threaded on CPU, or on a
GPU using CUDA.  Additional implementations may be added later.

You can choose which implementation to use with the `device()` call.  If
you do not choose an implementation explicitly, the default implementation that
uses a single thread on the CPU is used.

The default implementation has been optimized for recent Intel CPUs, taking
advantage of SSE, AVX, and FMA instructions.  Work is ongoing to tune the
library on ARM CPUs.  Note that you need to pass compiler-dependent flags
to enable the use of SSE, AVX, and other instructions.

For example, the following code adds two tensors using the default
single-threaded CPU implementation:

    Tensor<float, 2> a(30, 40);
    Tensor<float, 2> b(30, 40);
    Tensor<float, 2> c = a + b;

To choose a different implementation you have to insert a `device()` call
before the assignment of the result.  For technical C++ reasons this requires
that the Tensor for the result be declared on its own.  This means that you
have to know the size of the result.

    Eigen::Tensor<float, 2> c(30, 40);
    c.device(...) = a + b;

The call to `device()` must be the last call on the left of the `operator=`.

You must pass to the `device()` call an Eigen device object.  There are
presently three devices you can use: DefaultDevice, ThreadPoolDevice, and
GpuDevice.


#### Evaluating With the DefaultDevice

This is exactly the same as not inserting a `device()` call.

    DefaultDevice my_device;
    c.device(my_device) = a + b;

#### Evaluating with a Thread Pool

    // Create the Eigen ThreadPoolDevice.
    Eigen::ThreadPoolDevice my_device(4 /* number of threads to use */);

    // Now just use the device when evaluating expressions.
    Eigen::Tensor<float, 2> c(30, 50);
    c.device(my_device) = a.contract(b, dot_product_dims);


#### Evaluating On GPU

This is presently a bit more complicated than just using a thread pool device.
You need to create a GPU device, but you also need to explicitly allocate the
memory for tensors with CUDA.


## API Reference

### Datatypes

In the documentation of the tensor methods and Operations we mention datatypes
that are tensor-type specific:

#### `<Tensor-Type>::Dimensions`

Acts like an array of ints.  Has an `int size` attribute, and can be
indexed like an array to access individual values.  Used to represent the
dimensions of a tensor.  See `dimensions()`.

#### `<Tensor-Type>::Index`

Acts like an `int`.  Used for indexing tensors along their dimensions.  See
`operator()`, `dimension()`, and `size()`.

#### `<Tensor-Type>::Scalar`

Represents the datatype of individual tensor elements.  For example, for a
`Tensor<float>`, `Scalar` is the type `float`.  See
`setConstant()`.

#### `<Operation>`

We use this pseudo type to indicate that a tensor Operation is returned by a
method.  We indicate in the text the type and dimensions of the tensor that the
Operation returns after evaluation.

The Operation will have to be evaluated, for example by assigning it to a
tensor, before you can access the values of the resulting tensor.  You can also
access the values through a TensorRef.


## Built-in Tensor Methods

These are usual C++ methods that act on tensors immediately.  They are not
Operations which provide delayed evaluation of their results.  Unless specified
otherwise, all the methods listed below are available on all tensor classes:
Tensor, TensorFixedSize, and TensorMap.

## Metadata

### `int NumDimensions`

Constant value indicating the number of dimensions of a Tensor.  This is also
known as the tensor "rank".

    Eigen::Tensor<float, 2> a(3, 4);
    cout << "Dims " << a.NumDimensions;
    => Dims 2

### `Dimensions dimensions()`

Returns an array-like object representing the dimensions of the tensor.
The actual type of the `dimensions()` result is `<Tensor-Type>::Dimensions`.

    Eigen::Tensor<float, 2> a(3, 4);
    const Eigen::Tensor<float, 2>::Dimensions& d = a.dimensions();
    cout << "Dim size: " << d.size << ", dim 0: " << d[0]
         << ", dim 1: " << d[1];
    => Dim size: 2, dim 0: 3, dim 1: 4

If you use a C++11 compiler, you can use `auto` to simplify the code:

    const auto& d = a.dimensions();
    cout << "Dim size: " << d.size << ", dim 0: " << d[0]
         << ", dim 1: " << d[1];
    => Dim size: 2, dim 0: 3, dim 1: 4

### `Index dimension(Index n)`

Returns the n-th dimension of the tensor.  The actual type of the
`dimension()` result is `<Tensor-Type>::Index`, but you can
always use it like an int.

    Eigen::Tensor<float, 2> a(3, 4);
    int dim1 = a.dimension(1);
    cout << "Dim 1: " << dim1;
    => Dim 1: 4

### `Index size()`

Returns the total number of elements in the tensor.  This is the product of all
the tensor dimensions.  The actual type of the `size()` result is
`<Tensor-Type>::Index`, but you can always use it like an int.

    Eigen::Tensor<float, 2> a(3, 4);
    cout << "Size: " << a.size();
    => Size: 12

### Getting Dimensions From An Operation

A few operations provide `dimensions()` directly,
e.g. `TensorReslicingOp`.  Most operations defer calculating dimensions
until the operation is being evaluated.  If you need access to the dimensions
of a deferred operation, you can wrap it in a TensorRef (see Assigning to a
TensorRef above), which provides `dimensions()` and `dimension()` as
above.

TensorRef can also wrap the plain Tensor types, so this is a useful idiom in
templated contexts where the underlying object could be either a raw Tensor
or some deferred operation (e.g. a slice of a Tensor).  In this case, the
template code can wrap the object in a TensorRef and reason about its
dimensionality while remaining agnostic to the underlying type.


## Constructors

### Tensor

Creates a tensor of the specified size. The number of arguments must be equal
to the rank of the tensor. The content of the tensor is not initialized.

    Eigen::Tensor<float, 2> a(3, 4);
    cout << "NumRows: " << a.dimension(0) << " NumCols: " << a.dimension(1) << endl;
    => NumRows: 3 NumCols: 4

### TensorFixedSize

Creates a tensor of the specified size. The number of arguments in the `Sizes<>`
template parameter determines the rank of the tensor. The content of the tensor
is not initialized.

    Eigen::TensorFixedSize<float, Sizes<3, 4>> a;
    cout << "Rank: " << a.rank() << endl;
    => Rank: 2
    cout << "NumRows: " << a.dimension(0) << " NumCols: " << a.dimension(1) << endl;
    => NumRows: 3 NumCols: 4

### TensorMap

Creates a tensor mapping an existing array of data. The data must not be freed
until the TensorMap is discarded, and the size of the data must be large enough
to accommodate the coefficients of the tensor.

    float data[] = {0, 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11};
    Eigen::TensorMap<Tensor<float, 2>> a(data, 3, 4);
    cout << "NumRows: " << a.dimension(0) << " NumCols: " << a.dimension(1) << endl;
    => NumRows: 3 NumCols: 4
    cout << "a(1, 2): " << a(1, 2) << endl;
    => a(1, 2): 7


## Contents Initialization

When a new Tensor or a new TensorFixedSize are created, memory is allocated to
hold all the tensor elements, but the memory is not initialized.  Similarly,
when a new TensorMap is created on top of non-initialized memory, its contents
are not initialized.

You can use one of the methods below to initialize the tensor memory.  These
have an immediate effect on the tensor and return the tensor itself as a
result.  These are not tensor Operations which delay evaluation.

### `<Tensor-Type> setConstant(const Scalar& val)`

Sets all elements of the tensor to the constant value `val`.  `Scalar`
is the type of data stored in the tensor.  You can pass any value that is
convertible to that type.

Returns the tensor itself in case you want to chain another call.

    a.setConstant(12.3f);
    cout << "Constant: " << endl << a << endl << endl;
    =>
    Constant:
    12.3 12.3 12.3 12.3
    12.3 12.3 12.3 12.3
    12.3 12.3 12.3 12.3

Note that `setConstant()` can be used on any tensor where the element type
has a copy constructor and an `operator=()`:

    Eigen::Tensor<string, 2> a(2, 3);
    a.setConstant("yolo");
    cout << "String tensor: " << endl << a << endl << endl;
    =>
    String tensor:
    yolo yolo yolo
    yolo yolo yolo


### `<Tensor-Type> setZero()`

Fills the tensor with zeros.  Equivalent to `setConstant(Scalar(0))`.
Returns the tensor itself in case you want to chain another call.

    a.setZero();
    cout << "Zeros: " << endl << a << endl << endl;
    =>
    Zeros:
    0 0 0 0
    0 0 0 0
    0 0 0 0

### `<Tensor-Type> setValues({..initializer_list})`

Fills the tensor with explicit values specified in a `std::initializer_list`.
The type of the initializer list depends on the type and rank of the tensor.

If the tensor has rank N, the initializer list must be nested N times.  The
most deeply nested lists must contain P scalars of the Tensor type where P is
the size of the last dimension of the Tensor.

For example, for a `TensorFixedSize<float, Sizes<2, 3>>` the initializer list
must contain 2 lists of 3 floats each.

`setValues()` returns the tensor itself in case you want to chain another
call.

    Eigen::Tensor<float, 2> a(2, 3);
    a.setValues({{0.0f, 1.0f, 2.0f}, {3.0f, 4.0f, 5.0f}});
    cout << "a" << endl << a << endl << endl;
    =>
    a
    0 1 2
    3 4 5

If a list is too short, the corresponding elements of the tensor will not be
changed.  This is valid at each level of nesting.  For example the following
code only sets the values of the first row of the tensor.

    Eigen::Tensor<int, 2> a(2, 3);
    a.setConstant(1000);
    a.setValues({{10, 20, 30}});
    cout << "a" << endl << a << endl << endl;
    =>
    a
    10   20   30
    1000 1000 1000

### `<Tensor-Type> setRandom()`

Fills the tensor with random values.  Returns the tensor itself in case you
want to chain another call.

    a.setRandom();
    cout << "Random: " << endl << a << endl << endl;
    =>
    Random:
      0.680375    0.59688  -0.329554    0.10794
     -0.211234   0.823295   0.536459 -0.0452059
      0.566198  -0.604897  -0.444451   0.257742

You can customize `setRandom()` by providing your own random number
generator as a template argument:

    a.setRandom<MyRandomGenerator>();

Here, `MyRandomGenerator` must be a struct with the following member
functions, where `Scalar` and `Index` are the same as `<Tensor-Type>::Scalar`
and `<Tensor-Type>::Index`.

See `struct UniformRandomGenerator` in TensorFunctors.h for an example.

    // Custom number generator for use with setRandom().
    struct MyRandomGenerator {
      // Default and copy constructors.  Both are needed.
      MyRandomGenerator() { }
      MyRandomGenerator(const MyRandomGenerator& ) { }

      // Return a random value to be used.  "element_location" is the
      // location of the entry to set in the tensor, it can typically
      // be ignored.
      Scalar operator()(Eigen::DenseIndex element_location,
                        Eigen::DenseIndex /*unused*/ = 0) const {
        return <randomly generated value of type Scalar>;
      }

      // Same as above but generates several numbers at a time.
      typename internal::packet_traits<Scalar>::type packetOp(
          Eigen::DenseIndex packet_location, Eigen::DenseIndex /*unused*/ = 0) const {
        return <a packet of randomly generated values>;
      }
    };

You can also use one of the 2 random number generators that are part of the
tensor library:
*   UniformRandomGenerator
*   NormalRandomGenerator
732
733
## Data Access

The Tensor, TensorFixedSize, and TensorRef classes provide the following
accessors to access the tensor coefficients:

    const Scalar& operator()(const array<Index, NumIndices>& indices)
    const Scalar& operator()(Index firstIndex, IndexTypes... otherIndices)
    Scalar& operator()(const array<Index, NumIndices>& indices)
    Scalar& operator()(Index firstIndex, IndexTypes... otherIndices)

The number of indices must be equal to the rank of the tensor. Moreover, these
accessors are not available on tensor expressions. In order to access the
values of a tensor expression, the expression must either be evaluated or
wrapped in a TensorRef.


### `Scalar* data()` and `const Scalar* data() const`

Returns a pointer to the storage for the tensor.  The pointer is const if the
tensor was const.  This allows direct access to the data.  The layout of the
data depends on the tensor layout: RowMajor or ColMajor.

This access is usually only needed for special cases, for example when mixing
Eigen Tensor code with other libraries.

Scalar is the type of data stored in the tensor.

    Eigen::Tensor<float, 2> a(3, 4);
    float* a_data = a.data();
    a_data[0] = 123.45f;
    cout << "a(0, 0): " << a(0, 0);
    => a(0, 0): 123.45


## Tensor Operations

All the methods documented below return non-evaluated tensor `Operations`.
These can be chained: you can apply another Tensor Operation to the value
returned by the method.

The chain of Operations is evaluated lazily, typically when it is assigned to a
tensor.  See "Controlling When Expressions are Evaluated" for more details about
their evaluation.

### `<Operation> constant(const Scalar& val)`

Returns a tensor of the same type and dimensions as the original tensor but
where all elements have the value `val`.

This is useful, for example, when you want to add or subtract a constant from a
tensor, or multiply every element of a tensor by a scalar.

    Eigen::Tensor<float, 2> a(2, 3);
    a.setConstant(1.0f);
    Eigen::Tensor<float, 2> b = a + a.constant(2.0f);
    Eigen::Tensor<float, 2> c = b * b.constant(0.2f);
    cout << "a" << endl << a << endl << endl;
    cout << "b" << endl << b << endl << endl;
    cout << "c" << endl << c << endl << endl;
    =>
    a
    1 1 1
    1 1 1

    b
    3 3 3
    3 3 3

    c
    0.6 0.6 0.6
    0.6 0.6 0.6

### `<Operation> random()`

Returns a tensor of the same type and dimensions as the current tensor
but where all elements have random values.

This is useful, for example, to add random values to an existing tensor.
The generation of random values can be customized in the same manner
as for `setRandom()`.

    Eigen::Tensor<float, 2> a(2, 3);
    a.setConstant(1.0f);
    Eigen::Tensor<float, 2> b = a + a.random();
    cout << "a" << endl << a << endl << endl;
    cout << "b" << endl << b << endl << endl;
    =>
    a
    1 1 1
    1 1 1

    b
    1.68038   1.5662  1.82329
    0.788766  1.59688 0.395103


## Unary Element Wise Operations

All these operations take a single input tensor as argument and return a tensor
of the same type and dimensions as the tensor to which they are applied.  The
requested operations are applied to each element independently.

### `<Operation> operator-()`

Returns a tensor of the same type and dimensions as the original tensor
containing the opposite values of the original tensor.

    Eigen::Tensor<float, 2> a(2, 3);
    a.setConstant(1.0f);
    Eigen::Tensor<float, 2> b = -a;
    cout << "a" << endl << a << endl << endl;
    cout << "b" << endl << b << endl << endl;
    =>
    a
    1 1 1
    1 1 1

    b
    -1 -1 -1
    -1 -1 -1

### `<Operation> sqrt()`

Returns a tensor of the same type and dimensions as the original tensor
containing the square roots of the original tensor.

### `<Operation> rsqrt()`

Returns a tensor of the same type and dimensions as the original tensor
containing the inverse square roots of the original tensor.

### `<Operation> square()`

Returns a tensor of the same type and dimensions as the original tensor
containing the squares of the original tensor values.

### `<Operation> inverse()`

Returns a tensor of the same type and dimensions as the original tensor
containing the inverse of the original tensor values.

### `<Operation> exp()`

Returns a tensor of the same type and dimensions as the original tensor
containing the exponential of the original tensor.

### `<Operation> log()`

Returns a tensor of the same type and dimensions as the original tensor
containing the natural logarithms of the original tensor.

### `<Operation> abs()`

Returns a tensor of the same type and dimensions as the original tensor
containing the absolute values of the original tensor.

### `<Operation> pow(Scalar exponent)`

Returns a tensor of the same type and dimensions as the original tensor
containing the coefficients of the original tensor to the power of the
exponent.

The type of the exponent, Scalar, is always the same as the type of the
tensor coefficients.  For example, only integer exponents can be used in
conjunction with tensors of integer values.

You can use cast() to lift this restriction.  For example, this computes
cube roots of an int Tensor:

    Eigen::Tensor<int, 2> a(2, 3);
    a.setValues({{0, 1, 8}, {27, 64, 125}});
    Eigen::Tensor<double, 2> b = a.cast<double>().pow(1.0 / 3.0);
    cout << "a" << endl << a << endl << endl;
    cout << "b" << endl << b << endl << endl;
    =>
    a
    0   1   8
    27  64 125

    b
    0 1 2
    3 4 5

### `<Operation> operator*(Scalar scale)`

Multiplies all the coefficients of the input tensor by the provided scale.

### `<Operation> cwiseMax(Scalar threshold)`
TODO

### `<Operation> cwiseMin(Scalar threshold)`
TODO

### `<Operation> unaryExpr(const CustomUnaryOp& func)`
TODO


## Binary Element Wise Operations

These operations take two input tensors as arguments. The two input tensors
should be of the same type and dimensions. The result is a tensor of the same
dimensions as the tensors to which they are applied, and unless otherwise
specified it is also of the same type. The requested operations are applied to
each pair of elements independently.

### `<Operation> operator+(const OtherDerived& other)`

Returns a tensor of the same type and dimensions as the input tensors
containing the coefficient wise sums of the inputs.

### `<Operation> operator-(const OtherDerived& other)`

Returns a tensor of the same type and dimensions as the input tensors
containing the coefficient wise differences of the inputs.

### `<Operation> operator*(const OtherDerived& other)`

Returns a tensor of the same type and dimensions as the input tensors
containing the coefficient wise products of the inputs.

### `<Operation> operator/(const OtherDerived& other)`

Returns a tensor of the same type and dimensions as the input tensors
containing the coefficient wise quotients of the inputs.

This operator is not supported for integer types.

### `<Operation> cwiseMax(const OtherDerived& other)`

Returns a tensor of the same type and dimensions as the input tensors
containing the coefficient wise maximums of the inputs.

### `<Operation> cwiseMin(const OtherDerived& other)`

Returns a tensor of the same type and dimensions as the input tensors
containing the coefficient wise minimums of the inputs.

### `<Operation> Logical operators`

The following logical operators are supported as well:

*   operator&&(const OtherDerived& other)
*   operator||(const OtherDerived& other)
*   operator<(const OtherDerived& other)
*   operator<=(const OtherDerived& other)
*   operator>(const OtherDerived& other)
*   operator>=(const OtherDerived& other)
*   operator==(const OtherDerived& other)
*   operator!=(const OtherDerived& other)

They all return a tensor of boolean values.


## Selection (select(const ThenDerived& thenTensor, const ElseDerived& elseTensor))

Selection is a coefficient-wise ternary operator that is the tensor equivalent
to the if-then-else operation.

    Tensor<bool, 3> if_tensor = ...;
    Tensor<float, 3> then_tensor = ...;
    Tensor<float, 3> else_tensor = ...;
    Tensor<float, 3> result = if_tensor.select(then_tensor, else_tensor);

The 3 arguments must be of the same dimensions, which will also be the
dimensions of the result.  The 'if' tensor must be of type boolean, and the
'then' and 'else' tensors must be of the same type, which will also be the type
of the result.

Each coefficient in the result is equal to the corresponding coefficient in the
'then' tensor if the corresponding value in the 'if' tensor is true. If not, the
resulting coefficient will come from the 'else' tensor.

## Contraction

Tensor *contractions* are a generalization of the matrix product to the
multidimensional case.

    // Create 2 matrices using tensors of rank 2
    Eigen::Tensor<int, 2> a(2, 3);
    a.setValues({{1, 2, 3}, {6, 5, 4}});
    Eigen::Tensor<int, 2> b(3, 2);
    b.setValues({{1, 2}, {4, 5}, {5, 6}});

    // Compute the traditional matrix product
    Eigen::array<Eigen::IndexPair<int>, 1> product_dims = { Eigen::IndexPair<int>(1, 0) };
    Eigen::Tensor<int, 2> AB = a.contract(b, product_dims);

    // Compute the product of the transposes of the matrices
    Eigen::array<Eigen::IndexPair<int>, 1> transposed_product_dims = { Eigen::IndexPair<int>(0, 1) };
    Eigen::Tensor<int, 2> AtBt = a.contract(b, transposed_product_dims);

    // Contraction to a scalar value using a double contraction.
    // The first coordinates of both tensors are contracted as well as both
    // second coordinates, i.e., this computes the sum of the squares of the
    // elements.
    Eigen::array<Eigen::IndexPair<int>, 2> double_contraction_product_dims = { Eigen::IndexPair<int>(0, 0), Eigen::IndexPair<int>(1, 1) };
    Eigen::Tensor<int, 0> AdoubleContractedA = a.contract(a, double_contraction_product_dims);

    // Extract the scalar value of the tensor contraction for further usage
    int value = AdoubleContractedA(0);

## Reduction Operations

A *Reduction* operation returns a tensor with fewer dimensions than the
original tensor.  The values in the returned tensor are computed by applying a
*reduction operator* to slices of values from the original tensor.  You specify
the dimensions along which the slices are made.

The Eigen Tensor library provides a set of predefined reduction operators such
as `maximum()` and `sum()` and lets you define additional operators by
implementing a few methods from a reductor template.

### Reduction Dimensions

All reduction operations take a single parameter of type
`<TensorType>::Dimensions` which can always be specified as an array of
ints.  These are called the "reduction dimensions."  The values are the indices
of the dimensions of the input tensor over which the reduction is done.  The
parameter can have at most as many elements as the rank of the input tensor;
each element must be less than the tensor rank, as it indicates one of the
dimensions to reduce.

Each dimension of the input tensor should occur at most once in the reduction
dimensions as the implementation does not remove duplicates.

The order of the values in the reduction dimensions does not affect the
results, but the code may execute faster if you list the dimensions in
increasing order.

Example: Reduction along one dimension.

    // Create a tensor of 2 dimensions
    Eigen::Tensor<int, 2> a(2, 3);
    a.setValues({{1, 2, 3}, {6, 5, 4}});
    // Reduce it along the second dimension (1)...
    Eigen::array<int, 1> dims({1 /* dimension to reduce */});
    // ...using the "maximum" operator.
    // The result is a tensor with one dimension.  The size of
    // that dimension is the same as the first (non-reduced) dimension of a.
    Eigen::Tensor<int, 1> b = a.maximum(dims);
    cout << "a" << endl << a << endl << endl;
    cout << "b" << endl << b << endl << endl;
    =>
    a
    1 2 3
    6 5 4

    b
    3
    6

Example: Reduction along two dimensions.

    Eigen::Tensor<float, 3, Eigen::ColMajor> a(2, 3, 4);
    a.setValues({{{0.0f, 1.0f, 2.0f, 3.0f},
                  {7.0f, 6.0f, 5.0f, 4.0f},
                  {8.0f, 9.0f, 10.0f, 11.0f}},
                 {{12.0f, 13.0f, 14.0f, 15.0f},
                  {19.0f, 18.0f, 17.0f, 16.0f},
                  {20.0f, 21.0f, 22.0f, 23.0f}}});
    // The tensor a has 3 dimensions.  We reduce along the
    // first 2, resulting in a tensor with a single dimension
    // of size 4 (the last dimension of a.)
    // Note that we pass the array of reduction dimensions
    // directly to the maximum() call.
    Eigen::Tensor<float, 1, Eigen::ColMajor> b =
        a.maximum(Eigen::array<int, 2>({0, 1}));
    cout << "b" << endl << b << endl << endl;
    =>
    b
    20
    21
    22
    23

#### Reduction along all dimensions

As a special case, if you pass no parameter to a reduction operation the
original tensor is reduced along *all* its dimensions.  The result is a
scalar, represented as a zero-dimension tensor.

    Eigen::Tensor<float, 3> a(2, 3, 4);
    a.setValues({{{0.0f, 1.0f, 2.0f, 3.0f},
                  {7.0f, 6.0f, 5.0f, 4.0f},
                  {8.0f, 9.0f, 10.0f, 11.0f}},
                 {{12.0f, 13.0f, 14.0f, 15.0f},
                  {19.0f, 18.0f, 17.0f, 16.0f},
                  {20.0f, 21.0f, 22.0f, 23.0f}}});
    // Reduce along all dimensions using the sum() operator.
    Eigen::Tensor<float, 0> b = a.sum();
    cout << "b" << endl << b << endl << endl;
    =>
    b
    276


### `<Operation> sum(const Dimensions& new_dims)`
### `<Operation> sum()`

Reduce a tensor using the sum() operator.  The resulting values
are the sum of the reduced values.

### `<Operation> mean(const Dimensions& new_dims)`
### `<Operation> mean()`

Reduce a tensor using the mean() operator.  The resulting values
are the mean of the reduced values.

### `<Operation> maximum(const Dimensions& new_dims)`
### `<Operation> maximum()`

Reduce a tensor using the maximum() operator.  The resulting values are the
largest of the reduced values.

### `<Operation> minimum(const Dimensions& new_dims)`
### `<Operation> minimum()`

Reduce a tensor using the minimum() operator.  The resulting values
are the smallest of the reduced values.

### `<Operation> prod(const Dimensions& new_dims)`
### `<Operation> prod()`

Reduce a tensor using the prod() operator.  The resulting values
are the product of the reduced values.

### `<Operation> all(const Dimensions& new_dims)`
### `<Operation> all()`

Reduce a tensor using the all() operator.  Casts the tensor to bool and then
checks whether all elements are true.  Runs through all elements rather than
short-circuiting, so it may be significantly less efficient.

### `<Operation> any(const Dimensions& new_dims)`
### `<Operation> any()`

Reduce a tensor using the any() operator.  Casts the tensor to bool and then
checks whether any element is true.  Runs through all elements rather than
short-circuiting, so it may be significantly less efficient.


### `<Operation> reduce(const Dimensions& new_dims, const Reducer& reducer)`

Reduce a tensor using a user-defined reduction operator.  See `SumReducer`
in TensorFunctors.h for information on how to implement a reduction operator.


## Scan Operations

A *Scan* operation returns a tensor with the same dimensions as the original
tensor. The operation performs an inclusive scan along the specified
axis, which means it computes a running total along the axis for a given
reduction operation.
If the reduction operation corresponds to summation, then this computes the
prefix sum of the tensor along the given axis.

Example:

    // Create a tensor of 2 dimensions
    Eigen::Tensor<int, 2> a(2, 3);
    a.setValues({{1, 2, 3}, {4, 5, 6}});
    // Scan it along the second dimension (1) using summation
    Eigen::Tensor<int, 2> b = a.cumsum(1);
    // The result is a tensor with the same size as the input
    cout << "a" << endl << a << endl << endl;
    cout << "b" << endl << b << endl << endl;
    =>
    a
    1 2 3
    4 5 6

    b
    1  3  6
    4  9 15

### `<Operation> cumsum(const Index& axis)`

Perform a scan by summing consecutive entries.

### `<Operation> cumprod(const Index& axis)`

Perform a scan by multiplying consecutive entries.


## Convolutions

### `<Operation> convolve(const Kernel& kernel, const Dimensions& dims)`

Returns a tensor that is the output of the convolution of the input tensor
with the kernel, along the specified dimensions of the input tensor. The
dimension size for dimensions of the output tensor which were part of the
convolution will be reduced by the formula:
output_dim_size = input_dim_size - kernel_dim_size + 1 (requires:
input_dim_size >= kernel_dim_size). The dimension sizes for dimensions that
were not part of the convolution will remain the same. Performance of the
convolution can depend on the length of the stride(s) of the input tensor
dimension(s) along which the convolution is computed (the first dimension has
the shortest stride for ColMajor, whereas RowMajor's shortest stride is for
the last dimension).

    // Compute convolution along the second and third dimension.
    Tensor<float, 4, DataLayout> input(3, 3, 7, 11);
    Tensor<float, 2, DataLayout> kernel(2, 2);
    Tensor<float, 4, DataLayout> output(3, 2, 6, 11);
    input.setRandom();
    kernel.setRandom();

    Eigen::array<ptrdiff_t, 2> dims({1, 2});  // Specify second and third dimension for convolution.
    output = input.convolve(kernel, dims);

    for (int i = 0; i < 3; ++i) {
      for (int j = 0; j < 2; ++j) {
        for (int k = 0; k < 6; ++k) {
          for (int l = 0; l < 11; ++l) {
            const float result = output(i,j,k,l);
            const float expected = input(i,j+0,k+0,l) * kernel(0,0) +
                                   input(i,j+1,k+0,l) * kernel(1,0) +
                                   input(i,j+0,k+1,l) * kernel(0,1) +
                                   input(i,j+1,k+1,l) * kernel(1,1);
            VERIFY_IS_APPROX(result, expected);
          }
        }
      }
    }


## Geometrical Operations

These operations return a Tensor with different dimensions than the original
Tensor.  They can be used to access slices of tensors, view them with different
dimensions, or pad tensors with additional data.

### `<Operation> reshape(const Dimensions& new_dims)`

Returns a view of the input tensor that has been reshaped to the specified
new dimensions.  The argument new_dims is an array of Index values.  The
rank of the resulting tensor is equal to the number of elements in new_dims.

The product of all the sizes in the new dimension array must be equal to
the number of elements in the input tensor.

    // Increase the rank of the input tensor by introducing a new dimension
    // of size 1.
    Tensor<float, 2> input(7, 11);
    array<int, 3> three_dims{{7, 11, 1}};
    Tensor<float, 3> result = input.reshape(three_dims);

    // Decrease the rank of the input tensor by merging 2 dimensions.
    array<int, 1> one_dim{{7 * 11}};
    Tensor<float, 1> result = input.reshape(one_dim);

This operation does not move any data in the input tensor, so the resulting
contents of a reshaped Tensor depend on the data layout of the original Tensor.

For example, this is what happens when you `reshape()` a 2D ColMajor tensor
to one dimension:

    Eigen::Tensor<float, 2, Eigen::ColMajor> a(2, 3);
    a.setValues({{0.0f, 100.0f, 200.0f}, {300.0f, 400.0f, 500.0f}});
    Eigen::array<Eigen::DenseIndex, 1> one_dim({3 * 2});
    Eigen::Tensor<float, 1, Eigen::ColMajor> b = a.reshape(one_dim);
    cout << "b" << endl << b << endl;
    =>
    b
      0
    300
    100
    400
    200
    500

This is what happens when the 2D Tensor is RowMajor:

    Eigen::Tensor<float, 2, Eigen::RowMajor> a(2, 3);
    a.setValues({{0.0f, 100.0f, 200.0f}, {300.0f, 400.0f, 500.0f}});
    Eigen::array<Eigen::DenseIndex, 1> one_dim({3 * 2});
    Eigen::Tensor<float, 1, Eigen::RowMajor> b = a.reshape(one_dim);
    cout << "b" << endl << b << endl;
    =>
    b
      0
    100
    200
    300
    400
    500

The reshape operation is an lvalue. In other words, it can be used on the left
side of the assignment operator.

The previous example can be rewritten as follows:

    Eigen::Tensor<float, 2, Eigen::ColMajor> a(2, 3);
    a.setValues({{0.0f, 100.0f, 200.0f}, {300.0f, 400.0f, 500.0f}});
    Eigen::array<Eigen::DenseIndex, 2> two_dim({2, 3});
    Eigen::Tensor<float, 1, Eigen::ColMajor> b(6);
    b.reshape(two_dim) = a;
    cout << "b" << endl << b << endl;
    =>
    b
      0
    300
    100
    400
    200
    500

Note that "b" itself was not reshaped but that instead the assignment is done to
the reshape view of b.


### `<Operation> shuffle(const Shuffle& shuffle)`

Returns a copy of the input tensor whose dimensions have been
reordered according to the specified permutation. The argument shuffle
is an array of Index values. Its size is the rank of the input
tensor. It must contain a permutation of 0, 1, ..., rank - 1. The i-th
dimension of the output tensor equals the size of the shuffle[i]-th
dimension of the input tensor. For example:

    // Shuffle all dimensions to the left by 1.
    Tensor<float, 3> input(20, 30, 50);
    // ... set some values in input.
    Tensor<float, 3> output = input.shuffle({1, 2, 0});

    eigen_assert(output.dimension(0) == 30);
    eigen_assert(output.dimension(1) == 50);
    eigen_assert(output.dimension(2) == 20);

Indices into the output tensor are shuffled accordingly to formulate
indices into the input tensor. For example, one can assert in the above
code snippet that:

    eigen_assert(output(3, 7, 11) == input(11, 3, 7));

In general, one can assert that

    eigen_assert(output(..., indices[shuffle[i]], ...) ==
                 input(..., indices[i], ...))

The shuffle operation results in an lvalue, which means that it can be assigned
to. In other words, it can be used on the left side of the assignment operator.

Let's rewrite the previous example to take advantage of this feature:

    // Shuffle all dimensions to the left by 1.
    Tensor<float, 3> input(20, 30, 50);
    // ... set some values in input.
    Tensor<float, 3> output(30, 50, 20);
    output.shuffle({2, 0, 1}) = input;


### `<Operation> stride(const Strides& strides)`

Returns a view of the input tensor that strides (skips stride-1
elements) along each of the dimensions.  The argument strides is an
array of Index values.  The dimensions of the resulting tensor are
ceil(input_dimensions[i] / strides[i]).

For example, this is what happens when you `stride()` a 2D tensor:

    Eigen::Tensor<int, 2> a(4, 3);
    a.setValues({{0, 100, 200}, {300, 400, 500}, {600, 700, 800}, {900, 1000, 1100}});
    Eigen::array<Eigen::DenseIndex, 2> strides({3, 2});
    Eigen::Tensor<int, 2> b = a.stride(strides);
    cout << "b" << endl << b << endl;
    =>
    b
       0   200
     900  1100

It is possible to assign a tensor to a stride:

    Tensor<float, 3> input(20, 30, 50);
    // ... set some values in input.
    Tensor<float, 3> output(40, 90, 200);
    output.stride({2, 3, 4}) = input;


### `<Operation> slice(const StartIndices& offsets, const Sizes& extents)`

Returns a sub-tensor of the given tensor. For each dimension i, the slice is
made of the coefficients stored between offsets[i] and offsets[i] + extents[i]
in the input tensor.

    Eigen::Tensor<int, 2> a(4, 3);
    a.setValues({{0, 100, 200}, {300, 400, 500},
                 {600, 700, 800}, {900, 1000, 1100}});
    Eigen::array<int, 2> offsets = {1, 0};
    Eigen::array<int, 2> extents = {2, 2};
    Eigen::Tensor<int, 2> slice = a.slice(offsets, extents);
    cout << "a" << endl << a << endl;
    =>
    a
       0   100   200
     300   400   500
     600   700   800
     900  1000  1100
    cout << "slice" << endl << slice << endl;
    =>
    slice
     300   400
     600   700


### `<Operation> chip(const Index offset, const Index dim)`

A chip is a special kind of slice. It is the subtensor at the given offset in
the dimension dim. The returned tensor has one fewer dimension than the input
tensor: the dimension dim is removed.

For example, a matrix chip would be either a row or a column of the input
matrix.

    Eigen::Tensor<int, 2> a(4, 3);
    a.setValues({{0, 100, 200}, {300, 400, 500},
                 {600, 700, 800}, {900, 1000, 1100}});
    Eigen::Tensor<int, 1> row_3 = a.chip(2, 0);
    Eigen::Tensor<int, 1> col_2 = a.chip(1, 1);
    cout << "a" << endl << a << endl;
    =>
    a
       0   100   200
     300   400   500
     600   700   800
     900  1000  1100
    cout << "row_3" << endl << row_3 << endl;
    =>
    row_3
       600   700   800
    cout << "col_2" << endl << col_2 << endl;
    =>
    col_2
       100   400   700    1000

It is possible to assign values to a tensor chip since the chip operation is an
lvalue. For example:

    Eigen::Tensor<int, 1> a(3);
    a.setValues({100, 200, 300});
    Eigen::Tensor<int, 2> b(2, 3);
    b.setZero();
    b.chip(0, 0) = a;
    cout << "a" << endl << a << endl;
    =>
    a
     100
     200
     300
    cout << "b" << endl << b << endl;
    =>
    b
       100   200   300
         0     0     0


### `<Operation> reverse(const ReverseDimensions& reverse)`

Returns a view of the input tensor that reverses the order of the coefficients
along a subset of the dimensions.  The argument reverse is an array of boolean
values that indicates whether or not the order of the coefficients should be
reversed along each of the dimensions.  This operation preserves the dimensions
of the input tensor.

For example, this is what happens when you `reverse()` the first dimension
of a 2D tensor:

    Eigen::Tensor<int, 2> a(4, 3);
    a.setValues({{0, 100, 200}, {300, 400, 500},
                 {600, 700, 800}, {900, 1000, 1100}});
    Eigen::array<bool, 2> reverse({true, false});
    Eigen::Tensor<int, 2> b = a.reverse(reverse);
    cout << "a" << endl << a << endl << "b" << endl << b << endl;
    =>
    a
       0   100   200
     300   400   500
     600   700   800
     900  1000  1100
    b
     900  1000  1100
     600   700   800
     300   400   500
       0   100   200


### `<Operation> broadcast(const Broadcast& broadcast)`

Returns a view of the input tensor in which the input is replicated one to many
times.  The broadcast argument specifies how many copies of the input tensor
need to be made in each of the dimensions.

    Eigen::Tensor<int, 2> a(2, 3);
    a.setValues({{0, 100, 200}, {300, 400, 500}});
    Eigen::array<int, 2> bcast({3, 2});
    Eigen::Tensor<int, 2> b = a.broadcast(bcast);
    cout << "a" << endl << a << endl << "b" << endl << b << endl;
    =>
    a
       0   100   200
     300   400   500
    b
       0   100   200    0   100   200
     300   400   500  300   400   500
       0   100   200    0   100   200
     300   400   500  300   400   500
       0   100   200    0   100   200
     300   400   500  300   400   500

### `<Operation> concatenate(const OtherDerived& other, Axis axis)`

TODO


### `<Operation> pad(const PaddingDimensions& padding)`

Returns a view of the input tensor in which the input is padded with zeros.

    Eigen::Tensor<int, 2> a(2, 3);
    a.setValues({{0, 100, 200}, {300, 400, 500}});
    Eigen::array<pair<int, int>, 2> paddings;
    paddings[0] = make_pair(0, 1);
    paddings[1] = make_pair(2, 3);
    Eigen::Tensor<int, 2> b = a.pad(paddings);
    cout << "a" << endl << a << endl << "b" << endl << b << endl;
    =>
    a
       0   100   200
     300   400   500
    b
       0     0     0    0
       0     0     0    0
       0   100   200    0
     300   400   500    0
       0     0     0    0
       0     0     0    0
       0     0     0    0


1567### `<Operation>  extract_patches(const PatchDims& patch_dims)`
1568
1569Returns a tensor of coefficient patches extracted from the input tensor, where
1570each patch is of dimension specified by 'patch_dims'. The returned tensor has
1571one greater dimension than the input tensor, which is used to index each patch.
1572The patch index in the output tensor depends on the data layout of the input
1573tensor: the patch index is the last dimension ColMajor layout, and the first
1574dimension in RowMajor layout.
1575
For example, given the following input tensor:

    Eigen::Tensor<float, 2, DataLayout> tensor(3,4);
    tensor.setValues({{0.0f, 1.0f, 2.0f, 3.0f},
                      {4.0f, 5.0f, 6.0f, 7.0f},
                      {8.0f, 9.0f, 10.0f, 11.0f}});

    cout << "tensor: " << endl << tensor << endl;
    =>
    tensor:
     0   1   2   3
     4   5   6   7
     8   9  10  11

Six 2x2 patches can be extracted and indexed using the following code:

    Eigen::Tensor<float, 3, DataLayout> patch;
    Eigen::array<ptrdiff_t, 2> patch_dims;
    patch_dims[0] = 2;
    patch_dims[1] = 2;
    patch = tensor.extract_patches(patch_dims);
    for (int k = 0; k < 6; ++k) {
      cout << "patch index: " << k << endl;
      for (int i = 0; i < 2; ++i) {
        for (int j = 0; j < 2; ++j) {
          if (DataLayout == ColMajor) {
            cout << patch(i, j, k) << " ";
          } else {
            cout << patch(k, i, j) << " ";
          }
        }
        cout << endl;
      }
    }

This code results in the following output when the data layout is ColMajor:

    patch index: 0
    0 1
    4 5
    patch index: 1
    4 5
    8 9
    patch index: 2
    1 2
    5 6
    patch index: 3
    5 6
    9 10
    patch index: 4
    2 3
    6 7
    patch index: 5
    6 7
    10 11

This code results in the following output when the data layout is RowMajor
(NOTE: the set of patches is the same as in ColMajor, but they are indexed
differently):

    patch index: 0
    0 1
    4 5
    patch index: 1
    1 2
    5 6
    patch index: 2
    2 3
    6 7
    patch index: 3
    4 5
    8 9
    patch index: 4
    5 6
    9 10
    patch index: 5
    6 7
    10 11

### `<Operation> extract_image_patches(const Index patch_rows, const Index patch_cols, const Index row_stride, const Index col_stride, const PaddingType padding_type)`

Returns a tensor of coefficient image patches extracted from the input tensor,
which is expected to have dimensions ordered as follows (depending on the data
layout of the input tensor, and the number of additional dimensions 'N'):

*) ColMajor
1st dimension: channels (of size d)
2nd dimension: rows (of size r)
3rd dimension: columns (of size c)
4th-Nth dimension: time (for video) or batch (for bulk processing).

*) RowMajor (reverse order of ColMajor)
1st-Nth dimension: time (for video) or batch (for bulk processing).
N+1'th dimension: columns (of size c)
N+2'th dimension: rows (of size r)
N+3'th dimension: channels (of size d)

The returned tensor has one greater dimension than the input tensor, which is
used to index each patch. The patch index in the output tensor depends on the
data layout of the input tensor: the patch index is the 4'th dimension in
ColMajor layout, and the 4'th from the last dimension in RowMajor layout.

For example, given the following input tensor with the following dimension
sizes:
 *) depth:   2
 *) rows:    3
 *) columns: 5
 *) batch:   7

    Tensor<float, 4> tensor(2,3,5,7);
    Tensor<float, 4, RowMajor> tensor_row_major = tensor.swap_layout();

2x2 image patches can be extracted and indexed using the following code:

*) 2D patch: ColMajor (patch indexed by second-to-last dimension)

    Tensor<float, 5> twod_patch;
    twod_patch = tensor.extract_image_patches<2, 2>();
    // twod_patch.dimension(0) == 2
    // twod_patch.dimension(1) == 2
    // twod_patch.dimension(2) == 2
    // twod_patch.dimension(3) == 3*5
    // twod_patch.dimension(4) == 7

*) 2D patch: RowMajor (patch indexed by the second dimension)

    Tensor<float, 5, RowMajor> twod_patch_row_major;
    twod_patch_row_major = tensor_row_major.extract_image_patches<2, 2>();
    // twod_patch_row_major.dimension(0) == 7
    // twod_patch_row_major.dimension(1) == 3*5
    // twod_patch_row_major.dimension(2) == 2
    // twod_patch_row_major.dimension(3) == 2
    // twod_patch_row_major.dimension(4) == 2

## Special Operations

### `<Operation> cast<T>()`

Returns a tensor of type T with the same dimensions as the original tensor.
The returned tensor contains the values of the original tensor converted to
type T.

    Eigen::Tensor<float, 2> a(2, 3);
    Eigen::Tensor<int, 2> b = a.cast<int>();

This can be useful for example if you need to do element-wise division of
Tensors of integers.  This is not currently supported by the Tensor library
but you can easily cast the tensors to floats to do the division:

    Eigen::Tensor<int, 2> a(2, 3);
    a.setValues({{0, 1, 2}, {3, 4, 5}});
    Eigen::Tensor<int, 2> b =
        (a.cast<float>() / a.constant(2).cast<float>()).cast<int>();
    cout << "a" << endl << a << endl << endl;
    cout << "b" << endl << b << endl << endl;
    =>
    a
    0 1 2
    3 4 5

    b
    0 0 1
    1 2 2

### `<Operation> eval()`

TODO

## Representation of scalar values

Scalar values are often represented by tensors of size 1 and rank 0. For
example, Tensor<T, N>::maximum() currently returns a Tensor<T, 0>. Similarly,
the inner product of two 1d tensors (through contractions) returns a 0d tensor.

## Limitations

*   The number of tensor dimensions is currently limited to 250 when using a
    compiler that supports cxx11. It is limited to only 5 for older compilers.
*   The IndexList class requires a cxx11 compliant compiler. You can use an
    array of indices instead if you don't have access to a modern compiler.
*   On GPUs only floating point values are properly tested and optimized for.
*   Complex and integer values are known to be broken on GPUs. If you try to use
    them you'll most likely end up triggering a static assertion failure such as
    EIGEN_STATIC_ASSERT(packetSize > 1, YOU_MADE_A_PROGRAMMING_MISTAKE)