Local Response Normalization (LRN) {#dev_guide_lrn}
====================================================

>
> [API Reference](@ref dnnl_api_lrn)
>

## General

The LRN primitive performs a forward or backward local response normalization.

### Forward

The LRN operation is defined by the following formulas (the variable names
follow the standard @ref dev_guide_conventions):

LRN [across channels](#dnnl_lrn_across_channels):

\f[
    \dst(n, c, h, w) =
        \left\{k + \frac{\alpha}{n_{l}}
            \sum\limits_{i=-(n_{l}-1)/2}^{(n_{l}+1)/2-1}
                (\src(n, c+i, h, w))^2
        \right\}^{-\beta}
        \cdot
        \src(n, c, h, w),
\f]

LRN [within channel](#dnnl_lrn_within_channel):

\f[
    \dst(n, c, h, w) =
        \left\{k + \frac{\alpha}{n_{l}}
            \sum\limits_{i=-(n_{l}-1)/2}^{(n_{l}+1)/2-1}
            \sum\limits_{j=-(n_{l}-1)/2}^{(n_{l}+1)/2-1}
                (\src(n, c, h+i, w+j))^2
        \right\}^{-\beta}
        \cdot
        \src(n, c, h, w),
\f]

where \f$n_{l}\f$ is the @p local_size. The formulas are given for the 2D
spatial data case.

### Backward

The backward propagation computes \f$\diffsrc(n, c, h, w)\f$, based on
\f$\diffdst(n, c, h, w)\f$ and \f$\src(n, c, h, w)\f$.

## Execution Arguments

When executed, the inputs and outputs should be mapped to an execution
argument index as specified by the following table.

| Primitive input/output | Execution argument index |
| ---                    | ---                      |
| \src                   | DNNL_ARG_SRC             |
| \dst                   | DNNL_ARG_DST             |
| workspace              | DNNL_ARG_WORKSPACE       |
| \diffsrc               | DNNL_ARG_DIFF_SRC        |
| \diffdst               | DNNL_ARG_DIFF_DST        |


## Implementation Details

### General Notes

1. During training, LRN might or might not require a workspace on forward and
   backward passes. The behavior is implementation-specific. Optimized
   implementations typically require a workspace and use it to save some
   intermediate results from the forward pass that accelerate computations on
   the backward pass. To check whether a workspace is required, query the LRN
   primitive descriptor for the workspace. If the query succeeds, the
   workspace is required and its description is returned.

2. The memory format and data type for `src` and `dst` are assumed to be the
   same, and in the API are typically referred to as `data` (e.g., see
   `data_desc` in dnnl::lrn_forward::desc::desc()). The same holds for
   `diff_src` and `diff_dst`. The corresponding memory descriptors are referred
   to as `diff_data_desc`.

### Data Type Support

The LRN primitive supports the following combinations of data types:

| Propagation        | Source / Destination |
| :--                | :--                  |
| forward / backward | f32, bf16            |
| forward            | f16                  |

@warning
    There might be hardware- and/or implementation-specific restrictions.
    Check the [Implementation Limitations](@ref dg_lrn_impl_limits) section
    below.

### Data Representation

#### Source, Destination, and Their Gradients

Like most other primitives, the LRN primitive expects the following
tensors:

| Spatial | Source / Destination
| :--     | :--
| 0D      | \f$N \times C\f$
| 1D      | \f$N \times C \times W\f$
| 2D      | \f$N \times C \times H \times W\f$
| 3D      | \f$N \times C \times D \times H \times W\f$

The LRN primitive is optimized for the following memory formats:

| Spatial | Logical tensor | Implementations optimized for memory formats
| :--     | :--            | :--
| 2D      | NCHW           | #dnnl_nchw (#dnnl_abcd), #dnnl_nhwc (#dnnl_acdb), *optimized^*

Here *optimized^* means the format that
[comes out](@ref memory_format_propagation_cpp)
of any preceding compute-intensive primitive.

### Post-ops and Attributes

The LRN primitive does not support any post-ops or attributes.


@anchor dg_lrn_impl_limits
## Implementation Limitations

1. Refer to @ref dev_guide_data_types for limitations related to data type
   support.

2. **GPU**
    - Supports only the 2D spatial case.


## Performance Tips

1. For backward propagation, use the same memory format for `src`, `diff_dst`,
   and `diff_src` (the formats of `diff_dst` and `diff_src` are always the
   same because of the API). Different formats are functionally supported but
   lead to highly suboptimal performance.

## Examples

| Engine  | Name                 | Comments
| :--     | :--                  | :--
| CPU/GPU | @ref lrn_example_cpp | @copydetails lrn_example_cpp_short
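
As a supplement to the across-channels formula in the Forward section, the
following is a minimal reference sketch in plain C++ that evaluates the formula
directly. It is not the library's implementation and does not use the oneDNN
API. The function name `lrn_across_channels_ref`, the dense NCHW `f32` layout,
and the choice to treat channels outside \f$[0, C)\f$ as zero are assumptions
made for illustration only.

```cpp
#include <cmath>
#include <cstddef>
#include <vector>

// Hypothetical helper (not part of the oneDNN API): naive forward LRN across
// channels for a dense NCHW f32 tensor, evaluating the documented formula
// term by term.
// Assumption: channels outside [0, C) contribute zero to the sum.
std::vector<float> lrn_across_channels_ref(const std::vector<float> &src,
        int N, int C, int H, int W, int local_size, float alpha, float beta,
        float k) {
    std::vector<float> dst(src.size());

    // Summation bounds from the formula:
    //   i = -(n_l - 1)/2, ..., (n_l + 1)/2 - 1   (integer division)
    const int lo = -(local_size - 1) / 2;
    const int hi = (local_size + 1) / 2 - 1;

    // Linear offset of element (n, c, h, w) in a dense NCHW buffer.
    auto off = [=](int n, int c, int h, int w) {
        return ((static_cast<std::size_t>(n) * C + c) * H + h) * W + w;
    };

    for (int n = 0; n < N; ++n)
    for (int c = 0; c < C; ++c)
    for (int h = 0; h < H; ++h)
    for (int w = 0; w < W; ++w) {
        // Sum of squares over the local window along the channel axis.
        float sum = 0.f;
        for (int i = lo; i <= hi; ++i) {
            const int ci = c + i;
            if (ci < 0 || ci >= C) continue; // assumed zero padding
            const float s = src[off(n, ci, h, w)];
            sum += s * s;
        }
        // dst = (k + alpha / n_l * sum)^(-beta) * src
        const float base = k + alpha / local_size * sum;
        dst[off(n, c, h, w)] = std::pow(base, -beta) * src[off(n, c, h, w)];
    }
    return dst;
}
```

For example, for a \f$1 \times 3 \times 1 \times 1\f$ tensor holding
`{1, 2, 3}` with `local_size = 3`, `alpha = 3`, `beta = 1`, and `k = 1`, the
sketch produces approximately `{1/6, 2/15, 3/14}`. A loop of this kind can
serve as a correctness baseline when comparing against the primitive's output,
subject to the boundary assumption stated above.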