Pooling {#dev_guide_pooling}
============================

>
> [API Reference](@ref dnnl_api_pooling)
>
## General

The pooling primitive performs forward or backward max or average pooling
on 1D, 2D, or 3D spatial data.
### Forward

The pooling operation is defined by the following formulas.
We show formulas only for 2D spatial data; they generalize straightforwardly
to higher and lower dimensions. Variable names follow the standard
@ref dev_guide_conventions.
Max pooling:

\f[
    \dst(n, c, oh, ow) =
        \max\limits_{kh, kw}
        \left(
            \src(n, c, oh \cdot SH + kh \cdot (DH + 1) - PH_L, ow \cdot SW + kw \cdot (DW + 1) - PW_L)
        \right)
\f]
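For illustration, the formula above can be sketched as a naive reference
implementation for a single \f$(n, c)\f$ plane. This is a sketch only: the
flat row-major layout and the `max_pool_2d` name are assumptions for this
example, not library API.

```cpp
#include <cassert>
#include <limits>
#include <vector>

// Naive 2D max pooling for one (n, c) plane, following the formula above.
// Parameter names match the document's conventions: strides SH/SW, kernel
// KH/KW, dilations DH/DW (0 means dense), left paddings PH_L/PW_L.
std::vector<float> max_pool_2d(const std::vector<float> &src, int IH, int IW,
                               int OH, int OW, int KH, int KW, int SH, int SW,
                               int DH, int DW, int PH_L, int PW_L) {
    std::vector<float> dst(OH * OW, -std::numeric_limits<float>::infinity());
    for (int oh = 0; oh < OH; ++oh)
        for (int ow = 0; ow < OW; ++ow)
            for (int kh = 0; kh < KH; ++kh)
                for (int kw = 0; kw < KW; ++kw) {
                    int ih = oh * SH + kh * (DH + 1) - PH_L;
                    int iw = ow * SW + kw * (DW + 1) - PW_L;
                    // Taps that fall into the padding area are skipped and
                    // never contribute to the maximum.
                    if (ih < 0 || ih >= IH || iw < 0 || iw >= IW) continue;
                    float v = src[ih * IW + iw];
                    if (v > dst[oh * OW + ow]) dst[oh * OW + ow] = v;
                }
    return dst;
}
```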

Average pooling:

\f[
    \dst(n, c, oh, ow) =
        \frac{1}{DENOM}
        \sum\limits_{kh, kw}
            \src(n, c, oh \cdot SH + kh \cdot (DH + 1) - PH_L, ow \cdot SW + kw \cdot (DW + 1) - PW_L)
\f]
Here the output spatial dimensions are calculated similarly to how they are
calculated in @ref dev_guide_convolution.
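Concretely, the per-dimension output size follows the same rule as in
convolution. A sketch, assuming oneDNN's convention that dilation
\f$D = 0\f$ means a dense kernel, so the effective kernel extent is
\f$(K - 1) \cdot (D + 1) + 1\f$ (the `out_spatial` helper is hypothetical):

```cpp
#include <cassert>

// Output size for one spatial dimension: input size I, kernel K, stride S,
// dilation D (0 == dense), left/right paddings P_L/P_R.
int out_spatial(int I, int K, int S, int D, int P_L, int P_R) {
    int K_eff = (K - 1) * (D + 1) + 1; // effective (dilated) kernel extent
    return (I - K_eff + P_L + P_R) / S + 1;
}
```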
Average pooling supports two algorithms:
- #dnnl_pooling_avg_include_padding, in which case \f$DENOM = KH \cdot KW\f$,
- #dnnl_pooling_avg_exclude_padding, in which case \f$DENOM\f$ equals the
  size of the overlap between the averaging window and the image.
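The difference between the two \f$DENOM\f$ choices can be sketched for a
single output point (parameter names follow the document's conventions;
`avg_pool_denom` is a hypothetical helper, not library API):

```cpp
#include <cassert>

// DENOM for output point (oh, ow) under the two average-pooling algorithms.
// include_padding: always KH * KW. exclude_padding: the number of kernel
// taps that land inside the IH x IW image.
int avg_pool_denom(bool exclude_padding, int oh, int ow, int IH, int IW,
                   int KH, int KW, int SH, int SW, int DH, int DW,
                   int PH_L, int PW_L) {
    if (!exclude_padding) return KH * KW;
    int count = 0;
    for (int kh = 0; kh < KH; ++kh)
        for (int kw = 0; kw < KW; ++kw) {
            int ih = oh * SH + kh * (DH + 1) - PH_L;
            int iw = ow * SW + kw * (DW + 1) - PW_L;
            if (ih >= 0 && ih < IH && iw >= 0 && iw < IW) ++count;
        }
    return count;
}
```

The two algorithms differ only at output points whose window overlaps the
padding; in the interior they produce identical results.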
> TODO: a picture would be nice here.

#### Difference Between Forward Training and Forward Inference

- Max pooling requires a `workspace` for the #dnnl_forward_training propagation
  kind, and does not require it for #dnnl_forward_inference (see details below).
### Backward

The backward propagation computes \f$\diffsrc(n, c, h, w)\f$ based on
\f$\diffdst(n, c, h, w)\f$ and, in the case of max pooling, the `workspace`.
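A sketch of the scatter that backward max pooling performs, with the
workspace modeled as the flat source indices of the forward maxima. This is
an illustrative assumption: the real workspace format is opaque, and
`max_pool_backward` is a hypothetical name.

```cpp
#include <cassert>
#include <vector>

// Backward max pooling for one (n, c) plane: each diff_dst value is added
// to the src location where the forward pass found its maximum. `ws` holds
// one flat src index per output point (saved on forward).
std::vector<float> max_pool_backward(const std::vector<float> &diff_dst,
                                     const std::vector<int> &ws,
                                     int src_size) {
    std::vector<float> diff_src(src_size, 0.f);
    for (std::size_t i = 0; i < diff_dst.size(); ++i)
        diff_src[ws[i]] += diff_dst[i];
    return diff_src;
}
```

This scatter is also what makes backward pooling usable for up-sampling, as
noted in the implementation details below.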

## Execution Arguments

When executed, the inputs and outputs should be mapped to an execution
argument index as specified by the following table.

| Primitive input/output      | Execution argument index                                                  |
| ---                         | ---                                                                       |
| \src                        | DNNL_ARG_SRC                                                              |
| \dst                        | DNNL_ARG_DST                                                              |
| workspace                   | DNNL_ARG_WORKSPACE                                                        |
| \diffsrc                    | DNNL_ARG_DIFF_SRC                                                         |
| \diffdst                    | DNNL_ARG_DIFF_DST                                                         |
| \f$\text{binary post-op}\f$ | DNNL_ARG_ATTR_MULTIPLE_POST_OP(binary_post_op_position) \| DNNL_ARG_SRC_1 |
## Implementation Details

### General Notes

1. During training, max pooling requires a workspace on the forward
   (#dnnl_forward_training) and backward passes to save the indices where a
   maximum was found. The workspace format is opaque, and the indices cannot
   be restored from it. However, one can use backward pooling to perform
   up-sampling (used in some detection topologies). The workspace can be
   created via `workspace_desc()` from the pooling primitive descriptor.

2. A user can use the memory format tag #dnnl_format_tag_any for the `dst`
   memory descriptor when creating pooling forward propagation. The library
   derives the appropriate format from the `src` memory descriptor. However,
   the `src` memory descriptor itself must be defined. Similarly, a user can
   use the memory format tag #dnnl_format_tag_any for the `diff_src` memory
   descriptor when creating pooling backward propagation.
### Data Type Support

The pooling primitive supports the following combinations of data types:

| Propagation        | Source | Destination | Accumulation data type (used for average pooling only)
| :--                | :--    | :--         | :--
| forward / backward | f32    | f32         | f32
| forward / backward | bf16   | bf16        | bf16
| forward            | f16    | f16         | f16
| forward            | s8     | s8          | s32
| forward            | u8     | u8          | s32
| forward            | s32    | s32         | s32
| forward inference  | s8     | u8          | s32
| forward inference  | u8     | s8          | s32
| forward inference  | s8     | f16         | f16
| forward inference  | u8     | f16         | f16
| forward inference  | f16    | s8          | f16
| forward inference  | f16    | u8          | f16
| forward inference  | s8     | f32         | f32
| forward inference  | u8     | f32         | f32
| forward inference  | f32    | s8          | f32
| forward inference  | f32    | u8          | f32
@warning
    There might be hardware- and/or implementation-specific restrictions.
    Check the [Implementation Limitations](@ref dg_pool_impl_limits) section
    below.
### Data Representation

#### Source, Destination, and Their Gradients

Like other CNN primitives, the pooling primitive expects data to be
an \f$N \times C \times W\f$ tensor for the 1D spatial case,
an \f$N \times C \times H \times W\f$ tensor for the 2D spatial case, and
an \f$N \times C \times D \times H \times W\f$ tensor for the 3D spatial case.

The pooling primitive is optimized for the following memory formats:

| Spatial | Logical tensor | Data type   | Implementations optimized for memory formats                       |
| :--     | :--            | :--         | :--                                                                |
| 1D      | NCW            | f32         | #dnnl_ncw (#dnnl_abc), #dnnl_nwc (#dnnl_acb), *optimized^*         |
| 1D      | NCW            | s32, s8, u8 | #dnnl_nwc (#dnnl_acb), *optimized^*                                |
| 2D      | NCHW           | f32         | #dnnl_nchw (#dnnl_abcd), #dnnl_nhwc (#dnnl_acdb), *optimized^*     |
| 2D      | NCHW           | s32, s8, u8 | #dnnl_nhwc (#dnnl_acdb), *optimized^*                              |
| 3D      | NCDHW          | f32         | #dnnl_ncdhw (#dnnl_abcde), #dnnl_ndhwc (#dnnl_acdeb), *optimized^* |
| 3D      | NCDHW          | s32, s8, u8 | #dnnl_ndhwc (#dnnl_acdeb), *optimized^*                            |

Here *optimized^* means the format that
[comes out](@ref memory_format_propagation_cpp)
of any preceding compute-intensive primitive.
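The plain layouts in the table differ only in how a logical index maps to a
flat offset. A sketch for the 2D case (the helper names are hypothetical,
for illustration only):

```cpp
#include <cassert>

// Flat offset of logical element (n, c, h, w) in the plain NCHW layout
// (#dnnl_nchw / #dnnl_abcd): w is the innermost (fastest-varying) dimension.
int off_nchw(int n, int c, int h, int w, int C, int H, int W) {
    return ((n * C + c) * H + h) * W + w;
}

// Same element in the channels-last NHWC layout (#dnnl_nhwc / #dnnl_acdb):
// c is the innermost dimension.
int off_nhwc(int n, int c, int h, int w, int C, int H, int W) {
    return ((n * H + h) * W + w) * C + c;
}
```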

### Post-Ops and Attributes

| Propagation | Type    | Operation                                    | Description                                            | Restrictions                        |
| :--         | :--     | :--                                          | :--                                                    | :--                                 |
| Forward     | Post-op | [Binary](@ref dnnl::post_ops::append_binary) | Applies a @ref dnnl_api_binary operation to the result | General binary post-op restrictions |
@anchor dg_pool_impl_limits
## Implementation Limitations

1. Refer to @ref dev_guide_data_types for limitations related to data type
   support.

2. **CPU**
    - Different data types of source and destination in forward inference
      are not supported.

## Performance Tips

N/A

## Example

[Pooling Primitive Example](@ref pooling_example_cpp)

@copydetails pooling_example_cpp_short
167