Shuffle {#dev_guide_shuffle}
============================

>
> [API Reference](@ref dnnl_api_shuffle)
>

## General

The shuffle primitive shuffles data along the shuffle axis (here designated
\f$C\f$) with the group parameter \f$G\f$. Namely, the shuffle axis is viewed
as a 2D tensor of size \f$(\frac{C}{G} \times G)\f$, which is transposed to
\f$(G \times \frac{C}{G})\f$. Variable names follow the standard
@ref dev_guide_conventions.

The formal definition is shown below:

### Forward

\f[
    \dst(\overline{ou}, c, \overline{in}) =
    \src(\overline{ou}, c', \overline{in})
\f]

where

- the \f$c\f$ dimension is called the shuffle axis,
- \f$G\f$ is the `group_size`,
- \f$\overline{ou}\f$ denotes the outermost indices (to the left of the shuffle axis),
- \f$\overline{in}\f$ denotes the innermost indices (to the right of the shuffle axis), and
- \f$c'\f$ and \f$c\f$ relate to each other as defined by the system:

\f[
    \begin{cases}
        c &= u + v\frac{C}{G}, \\
        c' &= uG + v
    \end{cases}
\f]

Here, \f$0 \leq u < \frac{C}{G}\f$ and \f$0 \leq v < G\f$. For example, with
\f$C = 6\f$ and \f$G = 2\f$, destination channels \f$(0, 1, 2, 3, 4, 5)\f$
take their values from source channels \f$(0, 2, 4, 1, 3, 5)\f$.

#### Difference Between Forward Training and Forward Inference

There is no difference between the #dnnl_forward_training
and #dnnl_forward_inference propagation kinds.

### Backward

The backward propagation computes
\f$\diffsrc(\overline{ou}, c, \overline{in})\f$
based on
\f$\diffdst(\overline{ou}, c, \overline{in})\f$.

Essentially, backward propagation is the same as forward propagation with
\f$G\f$ replaced by \f$\frac{C}{G}\f$.

## Execution Arguments

When executed, the inputs and outputs should be mapped to an execution
argument index as specified by the following table.

| Primitive input/output | Execution argument index |
| --- | --- |
| \src | DNNL_ARG_SRC |
| \dst | DNNL_ARG_DST |
| \diffsrc | DNNL_ARG_DIFF_SRC |
| \diffdst | DNNL_ARG_DIFF_DST |

## Implementation Details

### General Notes

1. The memory format and data type for `src` and `dst` are assumed to be the
   same, and in the API are typically referred to as `data` (e.g., see
   `data_desc` in dnnl::shuffle_forward::desc::desc()). The same holds for
   `diff_src` and `diff_dst`. The corresponding memory descriptors are
   referred to as `diff_data_desc`.

## Data Types

The shuffle primitive supports the following combinations of data types:

| Propagation | Source / Destination |
| :-- | :-- |
| forward / backward | f32, bf16 |
| forward | s32, s8, u8 |

@warning
    There might be hardware- and/or implementation-specific restrictions.
    Check the [Implementation Limitations](@ref dg_shuffle_impl_limits) section
    below.

## Data Layouts

The shuffle primitive works with arbitrary data tensors. There is no special
meaning associated with any logical dimensions. However, the shuffle axis is
typically referred to as channels (hence the \f$c\f$ in the formulas above).

Shuffle operations typically appear in CNN topologies. Hence, in the library
the shuffle primitive is optimized for the corresponding memory formats:

| Spatial | Logical tensor | Shuffle Axis | Implementations optimized for memory formats |
| :-- | :-- | :-- | :-- |
| 2D | NCHW | 1 (C) | #dnnl_nchw (#dnnl_abcd), #dnnl_nhwc (#dnnl_acdb), *optimized^* |
| 3D | NCDHW | 1 (C) | #dnnl_ncdhw (#dnnl_abcde), #dnnl_ndhwc (#dnnl_acdeb), *optimized^* |

Here *optimized^* means the format that
[comes out](@ref memory_format_propagation_cpp)
of any preceding compute-intensive primitive.
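As an illustration of the forward definition and the execution argument
mapping above, the sketch below runs a forward shuffle over the channel axis
of an NCHW tensor with \f$C = 6\f$ and \f$G = 2\f$ and checks the result
against the \f$c\f$/\f$c'\f$ system from the General section. This is a
minimal sketch, not a normative recipe: it assumes a CPU engine and the
`dnnl::shuffle_forward::desc`-based C++ API mentioned in the General Notes,
the sizes and variable names (`eng`, `strm`, and so on) are illustrative
only, and error handling is omitted.

```cpp
#include <cassert>
#include <numeric>
#include <vector>

#include "dnnl.hpp"

using namespace dnnl;

int main() {
    engine eng(engine::kind::cpu, 0);
    stream strm(eng);

    // 1x6x4x4 f32 tensor, shuffled along C with group_size = 2
    // (the C = 6, G = 2 example from the General section).
    const memory::dim N = 1, C = 6, H = 4, W = 4;
    const int axis = 1;       // shuffle axis: channels
    const int group_size = 2; // G

    auto data_md = memory::desc(
            {N, C, H, W}, memory::data_type::f32, memory::format_tag::nchw);

    std::vector<float> src_data(N * C * H * W), dst_data(src_data.size());
    std::iota(src_data.begin(), src_data.end(), 0.f);

    // On a CPU engine the memory objects can wrap the user buffers directly.
    auto src_mem = memory(data_md, eng, src_data.data());
    auto dst_mem = memory(data_md, eng, dst_data.data());

    // Create and execute the forward shuffle primitive; src and dst are
    // passed via DNNL_ARG_SRC and DNNL_ARG_DST per the table above.
    auto shuffle_d = shuffle_forward::desc(
            prop_kind::forward_inference, data_md, axis, group_size);
    auto shuffle_pd = shuffle_forward::primitive_desc(shuffle_d, eng);

    shuffle_forward(shuffle_pd).execute(
            strm, {{DNNL_ARG_SRC, src_mem}, {DNNL_ARG_DST, dst_mem}});
    strm.wait();

    // Check dst(c) == src(c') with c = u + v*C/G and c' = u*G + v, i.e.,
    // destination channels read source channels (0, 2, 4, 1, 3, 5).
    const memory::dim CG = C / group_size;
    for (memory::dim v = 0; v < group_size; ++v)
        for (memory::dim u = 0; u < CG; ++u) {
            const memory::dim c = u + v * CG, cp = u * group_size + v;
            for (memory::dim s = 0; s < H * W; ++s)
                assert(dst_data[c * H * W + s] == src_data[cp * H * W + s]);
        }
    return 0;
}
```

A complete, buildable version of this flow is available in the
[Shuffle Primitive Example](@ref shuffle_example_cpp) linked at the end of
this page.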
## Post-Ops and Attributes

The shuffle primitive does not support any post-ops or attributes.

@anchor dg_shuffle_impl_limits
## Implementation Limitations

1. Refer to @ref dev_guide_data_types for limitations related to data type
   support.

## Performance Tips

N/A

## Example

[Shuffle Primitive Example](@ref shuffle_example_cpp)

@copydetails shuffle_example_cpp_short