* [AbsVal](#absval)
* [ArgMax](#argmax)
* [BatchNorm](#batchnorm)
* [Bias](#bias)
* [BinaryOp](#binaryop)
* [BNLL](#bnll)
* [Cast](#cast)
* [Clip](#clip)
* [Concat](#concat)
* [Convolution](#convolution)
* [Convolution1D](#convolution1d)
* [Convolution3D](#convolution3d)
* [ConvolutionDepthWise](#convolutiondepthwise)
* [ConvolutionDepthWise1D](#convolutiondepthwise1d)
* [ConvolutionDepthWise3D](#convolutiondepthwise3d)
* [Crop](#crop)
* [Deconvolution](#deconvolution)
* [DeconvolutionDepthWise](#deconvolutiondepthwise)
* [Dequantize](#dequantize)
* [Dropout](#dropout)
* [Eltwise](#eltwise)
* [ELU](#elu)
* [Exp](#exp)
* [Flatten](#flatten)
* [GELU](#gelu)
* [Gemm](#gemm)
* [GroupNorm](#groupnorm)
* [GRU](#gru)
* [HardSigmoid](#hardsigmoid)
* [HardSwish](#hardswish)
* [InnerProduct](#innerproduct)
* [Input](#input)
* [InstanceNorm](#instancenorm)
* [Interp](#interp)
* [LayerNorm](#layernorm)
* [Log](#log)
* [LRN](#lrn)
* [LSTM](#lstm)
* [MemoryData](#memorydata)
* [Mish](#mish)
* [MultiHeadAttention](#multiheadattention)
* [MVN](#mvn)
* [Noop](#noop)
* [Normalize](#normalize)
* [Packing](#packing)
* [Padding](#padding)
* [Permute](#permute)
* [PixelShuffle](#pixelshuffle)
* [Pooling](#pooling)
* [Pooling1D](#pooling1d)
* [Pooling3D](#pooling3d)
* [Power](#power)
* [PReLU](#prelu)
* [Quantize](#quantize)
* [Reduction](#reduction)
* [ReLU](#relu)
* [Reorg](#reorg)
* [Requantize](#requantize)
* [Reshape](#reshape)
* [RNN](#rnn)
* [Scale](#scale)
* [SELU](#selu)
* [ShuffleChannel](#shufflechannel)
* [Sigmoid](#sigmoid)
* [Slice](#slice)
* [Softmax](#softmax)
* [Softplus](#softplus)
* [Split](#split)
* [Swish](#swish)
* [TanH](#tanh)
* [Threshold](#threshold)
* [UnaryOp](#unaryop)

# AbsVal
```
y = abs(x)
```

* one_blob_only
* support_inplace

# ArgMax
```
y = argmax(x, out_max_val, topk)
```

* one_blob_only

| param id | name | type | default | description |
| -------- | ---- | ---- | ------- | ----------- |
| 0 | out_max_val | int | 0 | |
| 1 | topk | int | 1 | |

# BatchNorm
```
y = (x - mean) / sqrt(var + eps) * slope + bias
```

* one_blob_only
* support_inplace

| param id | name | type | default | description |
| -------- | ---- | ---- | ------- | ----------- |
| 0 | channels | int | 0 | |
| 1 | eps | float | 0.f | |

| weight | type | shape |
| ------ | ---- | ----- |
| slope_data | float | [channels] |
| mean_data | float | [channels] |
| var_data | float | [channels] |
| bias_data | float | [channels] |
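For reference, a minimal plain-C++ sketch of the per-channel formula above (illustrative only, not the ncnn kernel; the function name is made up):

```cpp
#include <cmath>
#include <cstddef>

// y = (x - mean) / sqrt(var + eps) * slope + bias,
// folded per channel into y = x * a + b
void batchnorm_ref(float* data, int channels, size_t spatial,
                   const float* mean, const float* var,
                   const float* slope, const float* bias, float eps)
{
    for (int c = 0; c < channels; c++)
    {
        const float a = slope[c] / std::sqrt(var[c] + eps);
        const float b = bias[c] - mean[c] * a;
        float* ptr = data + c * spatial;
        for (size_t i = 0; i < spatial; i++)
            ptr[i] = ptr[i] * a + b;
    }
}
```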
# Bias
```
y = x + bias
```

* one_blob_only
* support_inplace

| param id | name | type | default | description |
| -------- | ---- | ---- | ------- | ----------- |
| 0 | bias_data_size | int | 0 | |

| weight | type | shape |
| ------ | ---- | ----- |
| bias_data | float | [bias_data_size] |

# BinaryOp
This operation performs element-wise binary computation; the calculation rule follows the [broadcasting rule](https://github.com/Tencent/ncnn/wiki/binaryop-broadcasting).

```
C = binaryop(A, B)
```

* one_blob_only if with_scalar = 1
* support_inplace if with_scalar = 1

| param id | name | type | default | description |
| -------- | ---- | ---- | ------- | ----------- |
| 0 | op_type | int | 0 | operation type as follows |
| 1 | with_scalar | int | 0 | with_scalar = 0: B is a matrix; with_scalar = 1: B is a scalar |
| 2 | b | float | 0.f | when B is a scalar, B = b |

Operation type:
- 0 = ADD
- 1 = SUB
- 2 = MUL
- 3 = DIV
- 4 = MAX
- 5 = MIN
- 6 = POW
- 7 = RSUB
- 8 = RDIV
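A plain-C++ sketch of the scalar case for each op_type (illustrative only; note that RSUB and RDIV reverse the operand order):

```cpp
#include <algorithm>
#include <cmath>

float binaryop_scalar_ref(float a, float b, int op_type)
{
    switch (op_type)
    {
    case 0: return a + b;          // ADD
    case 1: return a - b;          // SUB
    case 2: return a * b;          // MUL
    case 3: return a / b;          // DIV
    case 4: return std::max(a, b); // MAX
    case 5: return std::min(a, b); // MIN
    case 6: return std::pow(a, b); // POW
    case 7: return b - a;          // RSUB
    case 8: return b / a;          // RDIV
    }
    return a;
}
```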
# BNLL
```
y = x + log(1 + e^(-x)), x > 0
y = log(1 + e^x), x <= 0
```

* one_blob_only
* support_inplace

# Cast
```
y = cast(x)
```

* one_blob_only
* support_packing

| param id | name | type | default | description |
| -------- | ---- | ---- | ------- | ----------- |
| 0 | type_from | int | 0 | |
| 1 | type_to | int | 0 | |

Element type:
- 0 = auto
- 1 = float32
- 2 = float16
- 3 = int8
- 4 = bfloat16

# Clip
```
y = clamp(x, min, max)
```

* one_blob_only
* support_inplace

| param id | name | type | default | description |
| -------- | ---- | ---- | ------- | ----------- |
| 0 | min | float | -FLT_MAX | |
| 1 | max | float | FLT_MAX | |

# Concat
```
y = concat(x0, x1, x2, ...) by axis
```

| param id | name | type | default | description |
| -------- | ---- | ---- | ------- | ----------- |
| 0 | axis | int | 0 | |

# Convolution
```
x2 = pad(x, pads, pad_value)
x3 = conv(x2, weight, kernel, stride, dilation) + bias
y = activation(x3, act_type, act_params)
```

* one_blob_only

| param id | name | type | default | description |
| -------- | ---- | ---- | ------- | ----------- |
| 0 | num_output | int | 0 | |
| 1 | kernel_w | int | 0 | |
| 2 | dilation_w | int | 1 | |
| 3 | stride_w | int | 1 | |
| 4 | pad_left | int | 0 | |
| 5 | bias_term | int | 0 | |
| 6 | weight_data_size | int | 0 | |
| 8 | int8_scale_term | int | 0 | |
| 9 | activation_type | int | 0 | |
| 10 | activation_params | array | [ ] | |
| 11 | kernel_h | int | kernel_w | |
| 12 | dilation_h | int | dilation_w | |
| 13 | stride_h | int | stride_w | |
| 14 | pad_top | int | pad_left | |
| 15 | pad_right | int | pad_left | |
| 16 | pad_bottom | int | pad_top | |
| 18 | pad_value | float | 0.f | |

| weight | type | shape |
| ------ | ---- | ----- |
| weight_data | float/fp16/int8 | [kernel_w, kernel_h, num_input, num_output] |
| bias_data | float | [num_output] |
| weight_data_int8_scales | float | [num_output] |
| bottom_blob_int8_scales | float | [1] |
| top_blob_int8_scales | float | [1] |
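The pad + conv steps above imply the usual output-size arithmetic; a sketch of one spatial dimension (illustrative, the same formula most frameworks use):

```cpp
// output size after padding with pad0/pad1 and convolving
int conv_out_size(int in, int kernel, int dilation, int stride, int pad0, int pad1)
{
    const int kernel_extent = dilation * (kernel - 1) + 1;
    return (in + pad0 + pad1 - kernel_extent) / stride + 1;
}
// e.g. conv_out_size(224, 3, 1, 2, 1, 1) == 112
```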
# Convolution1D
```
x2 = pad(x, pads, pad_value)
x3 = conv1d(x2, weight, kernel, stride, dilation) + bias
y = activation(x3, act_type, act_params)
```

* one_blob_only

| param id | name | type | default | description |
| -------- | ---- | ---- | ------- | ----------- |
| 0 | num_output | int | 0 | |
| 1 | kernel_w | int | 0 | |
| 2 | dilation_w | int | 1 | |
| 3 | stride_w | int | 1 | |
| 4 | pad_left | int | 0 | |
| 5 | bias_term | int | 0 | |
| 6 | weight_data_size | int | 0 | |
| 9 | activation_type | int | 0 | |
| 10 | activation_params | array | [ ] | |
| 15 | pad_right | int | pad_left | |
| 18 | pad_value | float | 0.f | |

| weight | type | shape |
| ------ | ---- | ----- |
| weight_data | float/fp16/int8 | [kernel_w, num_input, num_output] |
| bias_data | float | [num_output] |

# Convolution3D
```
x2 = pad(x, pads, pad_value)
x3 = conv3d(x2, weight, kernel, stride, dilation) + bias
y = activation(x3, act_type, act_params)
```

* one_blob_only

| param id | name | type | default | description |
| -------- | ---- | ---- | ------- | ----------- |
| 0 | num_output | int | 0 | |
| 1 | kernel_w | int | 0 | |
| 2 | dilation_w | int | 1 | |
| 3 | stride_w | int | 1 | |
| 4 | pad_left | int | 0 | |
| 5 | bias_term | int | 0 | |
| 6 | weight_data_size | int | 0 | |
| 9 | activation_type | int | 0 | |
| 10 | activation_params | array | [ ] | |
| 11 | kernel_h | int | kernel_w | |
| 12 | dilation_h | int | dilation_w | |
| 13 | stride_h | int | stride_w | |
| 14 | pad_top | int | pad_left | |
| 15 | pad_right | int | pad_left | |
| 16 | pad_bottom | int | pad_top | |
| 17 | pad_behind | int | pad_front | |
| 18 | pad_value | float | 0.f | |
| 21 | kernel_d | int | kernel_w | |
| 22 | dilation_d | int | dilation_w | |
| 23 | stride_d | int | stride_w | |
| 24 | pad_front | int | pad_left | |

| weight | type | shape |
| ------ | ---- | ----- |
| weight_data | float/fp16/int8 | [kernel_w, kernel_h, kernel_d, num_input, num_output] |
| bias_data | float | [num_output] |

# ConvolutionDepthWise
```
x2 = pad(x, pads, pad_value)
x3 = conv(x2, weight, kernel, stride, dilation, group) + bias
y = activation(x3, act_type, act_params)
```

* one_blob_only

| param id | name | type | default | description |
| -------- | ---- | ---- | ------- | ----------- |
| 0 | num_output | int | 0 | |
| 1 | kernel_w | int | 0 | |
| 2 | dilation_w | int | 1 | |
| 3 | stride_w | int | 1 | |
| 4 | pad_left | int | 0 | |
| 5 | bias_term | int | 0 | |
| 6 | weight_data_size | int | 0 | |
| 7 | group | int | 1 | |
| 8 | int8_scale_term | int | 0 | |
| 9 | activation_type | int | 0 | |
| 10 | activation_params | array | [ ] | |
| 11 | kernel_h | int | kernel_w | |
| 12 | dilation_h | int | dilation_w | |
| 13 | stride_h | int | stride_w | |
| 14 | pad_top | int | pad_left | |
| 15 | pad_right | int | pad_left | |
| 16 | pad_bottom | int | pad_top | |
| 18 | pad_value | float | 0.f | |

| weight | type | shape |
| ------ | ---- | ----- |
| weight_data | float/fp16/int8 | [kernel_w, kernel_h, num_input / group, num_output / group, group] |
| bias_data | float | [num_output] |
| weight_data_int8_scales | float | [group] |
| bottom_blob_int8_scales | float | [1] |
| top_blob_int8_scales | float | [1] |
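With group > 1 each output channel only sees the input channels of its own group; group == num_input == num_output is the classic depthwise case. A sketch of the mapping (illustrative only; names are made up):

```cpp
// first input channel and channel count feeding output channel oc
void grouped_input_range(int oc, int num_input, int num_output, int group,
                         int* first, int* count)
{
    const int channels_per_group = num_input / group;
    const int g = oc / (num_output / group); // which group oc belongs to
    *first = g * channels_per_group;
    *count = channels_per_group;
}
// group == num_input == num_output gives first == oc, count == 1 (depthwise)
```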
# ConvolutionDepthWise1D
```
x2 = pad(x, pads, pad_value)
x3 = conv1d(x2, weight, kernel, stride, dilation, group) + bias
y = activation(x3, act_type, act_params)
```

* one_blob_only

| param id | name | type | default | description |
| -------- | ---- | ---- | ------- | ----------- |
| 0 | num_output | int | 0 | |
| 1 | kernel_w | int | 0 | |
| 2 | dilation_w | int | 1 | |
| 3 | stride_w | int | 1 | |
| 4 | pad_left | int | 0 | |
| 5 | bias_term | int | 0 | |
| 6 | weight_data_size | int | 0 | |
| 7 | group | int | 1 | |
| 9 | activation_type | int | 0 | |
| 10 | activation_params | array | [ ] | |
| 15 | pad_right | int | pad_left | |
| 18 | pad_value | float | 0.f | |

| weight | type | shape |
| ------ | ---- | ----- |
| weight_data | float/fp16/int8 | [kernel_w, num_input / group, num_output / group, group] |
| bias_data | float | [num_output] |

# ConvolutionDepthWise3D
```
x2 = pad(x, pads, pad_value)
x3 = conv3d(x2, weight, kernel, stride, dilation, group) + bias
y = activation(x3, act_type, act_params)
```

* one_blob_only

| param id | name | type | default | description |
| -------- | ---- | ---- | ------- | ----------- |
| 0 | num_output | int | 0 | |
| 1 | kernel_w | int | 0 | |
| 2 | dilation_w | int | 1 | |
| 3 | stride_w | int | 1 | |
| 4 | pad_left | int | 0 | |
| 5 | bias_term | int | 0 | |
| 6 | weight_data_size | int | 0 | |
| 7 | group | int | 1 | |
| 9 | activation_type | int | 0 | |
| 10 | activation_params | array | [ ] | |
| 11 | kernel_h | int | kernel_w | |
| 12 | dilation_h | int | dilation_w | |
| 13 | stride_h | int | stride_w | |
| 14 | pad_top | int | pad_left | |
| 15 | pad_right | int | pad_left | |
| 16 | pad_bottom | int | pad_top | |
| 17 | pad_behind | int | pad_front | |
| 18 | pad_value | float | 0.f | |
| 21 | kernel_d | int | kernel_w | |
| 22 | dilation_d | int | dilation_w | |
| 23 | stride_d | int | stride_w | |
| 24 | pad_front | int | pad_left | |

| weight | type | shape |
| ------ | ---- | ----- |
| weight_data | float/fp16/int8 | [kernel_w, kernel_h, kernel_d, num_input / group, num_output / group, group] |
| bias_data | float | [num_output] |

# Crop
```
y = crop(x)
```

* one_blob_only

| param id | name | type | default | description |
| -------- | ---- | ---- | ------- | ----------- |
| 0 | woffset | int | 0 | |
| 1 | hoffset | int | 0 | |
| 2 | coffset | int | 0 | |
| 3 | outw | int | 0 | |
| 4 | outh | int | 0 | |
| 5 | outc | int | 0 | |
| 6 | woffset2 | int | 0 | |
| 7 | hoffset2 | int | 0 | |
| 8 | coffset2 | int | 0 | |
| 9 | starts | array | [ ] | |
| 10 | ends | array | [ ] | |
| 11 | axes | array | [ ] | |

# Deconvolution
```
x2 = deconv(x, weight, kernel, stride, dilation) + bias
x3 = depad(x2, pads, pad_value)
y = activation(x3, act_type, act_params)
```

* one_blob_only

| param id | name | type | default | description |
| -------- | ---- | ---- | ------- | ----------- |
| 0 | num_output | int | 0 | |
| 1 | kernel_w | int | 0 | |
| 2 | dilation_w | int | 1 | |
| 3 | stride_w | int | 1 | |
| 4 | pad_left | int | 0 | |
| 5 | bias_term | int | 0 | |
| 6 | weight_data_size | int | 0 | |
| 8 | int8_scale_term | int | 0 | |
| 9 | activation_type | int | 0 | |
| 10 | activation_params | array | [ ] | |
| 11 | kernel_h | int | kernel_w | |
| 12 | dilation_h | int | dilation_w | |
| 13 | stride_h | int | stride_w | |
| 14 | pad_top | int | pad_left | |
| 15 | pad_right | int | pad_left | |
| 16 | pad_bottom | int | pad_top | |
| 18 | output_pad_right | int | 0 | |
| 19 | output_pad_bottom | int | output_pad_right | |
| 20 | output_w | int | 0 | |
| 21 | output_h | int | output_w | |

| weight | type | shape |
| ------ | ---- | ----- |
| weight_data | float/fp16/int8 | [kernel_w, kernel_h, num_input, num_output] |
| bias_data | float | [num_output] |
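Deconvolution enlarges the input before the crop implied by `depad`; a sketch of the resulting size for one spatial dimension (illustrative only, before any output_w/output_h override):

```cpp
// output size after deconv + depad
int deconv_out_size(int in, int kernel, int dilation, int stride,
                    int pad0, int pad1, int output_pad)
{
    const int kernel_extent = dilation * (kernel - 1) + 1;
    return (in - 1) * stride + kernel_extent + output_pad - pad0 - pad1;
}
// e.g. deconv_out_size(112, 4, 1, 2, 1, 1, 0) == 224
```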
# DeconvolutionDepthWise
```
x2 = deconv(x, weight, kernel, stride, dilation, group) + bias
x3 = depad(x2, pads, pad_value)
y = activation(x3, act_type, act_params)
```

* one_blob_only

| param id | name | type | default | description |
| -------- | ---- | ---- | ------- | ----------- |
| 0 | num_output | int | 0 | |
| 1 | kernel_w | int | 0 | |
| 2 | dilation_w | int | 1 | |
| 3 | stride_w | int | 1 | |
| 4 | pad_left | int | 0 | |
| 5 | bias_term | int | 0 | |
| 6 | weight_data_size | int | 0 | |
| 7 | group | int | 1 | |
| 8 | int8_scale_term | int | 0 | |
| 9 | activation_type | int | 0 | |
| 10 | activation_params | array | [ ] | |
| 11 | kernel_h | int | kernel_w | |
| 12 | dilation_h | int | dilation_w | |
| 13 | stride_h | int | stride_w | |
| 14 | pad_top | int | pad_left | |
| 15 | pad_right | int | pad_left | |
| 16 | pad_bottom | int | pad_top | |
| 18 | output_pad_right | int | 0 | |
| 19 | output_pad_bottom | int | output_pad_right | |
| 20 | output_w | int | 0 | |
| 21 | output_h | int | output_w | |

| weight | type | shape |
| ------ | ---- | ----- |
| weight_data | float/fp16/int8 | [kernel_w, kernel_h, num_input / group, num_output / group, group] |
| bias_data | float | [num_output] |

# Dequantize
```
y = x * scale + bias
```

* one_blob_only
* support_inplace

| param id | name | type | default | description |
| -------- | ---- | ---- | ------- | ----------- |
| 0 | scale | float | 1.f | |
| 1 | bias_term | int | 0 | |
| 2 | bias_data_size | int | 0 | |

# Dropout
```
y = x * scale
```

* one_blob_only

| param id | name | type | default | description |
| -------- | ---- | ---- | ------- | ----------- |
| 0 | scale | float | 1.f | |

# Eltwise
```
y = elementwise_op(x0, x1, ...)
```

| param id | name | type | default | description |
| -------- | ---- | ---- | ------- | ----------- |
| 0 | op_type | int | 0 | |
| 1 | coeffs | array | [ ] | |

Operation type:
- 0 = PROD
- 1 = SUM
- 2 = MAX

# ELU
```
if x < 0    y = (exp(x) - 1) * alpha
else        y = x
```

* one_blob_only
* support_inplace

| param id | name | type | default | description |
| -------- | ---- | ---- | ------- | ----------- |
| 0 | alpha | float | 0.1f | |

# Exp
```
if base == -1   y = exp(shift + x * scale)
else            y = pow(base, (shift + x * scale))
```

* one_blob_only
* support_inplace

| param id | name | type | default | description |
| -------- | ---- | ---- | ------- | ----------- |
| 0 | base | float | -1.f | |
| 1 | scale | float | 1.f | |
| 2 | shift | float | 0.f | |

# Flatten
Reshape the blob to one dimension.

* one_blob_only

# GELU
```
if fast_gelu == 1   y = 0.5 * x * (1 + tanh(0.79788452 * (x + 0.044715 * x * x * x)))
else                y = 0.5 * x * erfc(-0.70710678 * x)
```

* one_blob_only
* support_inplace

| param id | name | type | default | description |
| -------- | ---- | ---- | ------- | ----------- |
| 0 | fast_gelu | int | 0 | use tanh approximation |
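Both GELU branches in plain C++ (illustrative only; `erfc` gives the exact form, the tanh path is the common fast approximation):

```cpp
#include <cmath>

float gelu_ref(float x, int fast_gelu)
{
    if (fast_gelu)
        return 0.5f * x * (1.f + std::tanh(0.79788452f * (x + 0.044715f * x * x * x)));
    return 0.5f * x * std::erfc(-0.70710678f * x);
}
```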
# Gemm
```
a = transA ? transpose(x0) : x0
b = transB ? transpose(x1) : x1
c = x2
y = gemm(a, b) * alpha + c * beta
```

| param id | name | type | default | description |
| -------- | ---- | ---- | ------- | ----------- |
| 0 | alpha | float | 1.f | |
| 1 | beta | float | 1.f | |
| 2 | transA | int | 0 | |
| 3 | transB | int | 0 | |

# GroupNorm
```
split x along the channel axis into group parts x0, x1 ...
mean-variance normalize each group x0, x1 ...
y = x * gamma + beta
```

* one_blob_only
* support_inplace

| param id | name | type | default | description |
| -------- | ---- | ---- | ------- | ----------- |
| 0 | group | int | 1 | |
| 1 | channels | int | 0 | |
| 2 | eps | float | 0.001f | x = x / sqrt(var + eps) |
| 3 | affine | int | 1 | |

| weight | type | shape |
| ------ | ---- | ----- |
| gamma_data | float | [channels] |
| beta_data | float | [channels] |

# GRU
Apply a single-layer GRU to a feature sequence of `T` timesteps. The input blob shape is `[w=input_size, h=T]` and the output blob shape is `[w=num_output, h=T]`.

```
y = gru(x)
y0, hidden y1 = gru(x0, hidden x1)
```

* one_blob_only if bidirectional

| param id | name | type | default | description |
| -------- | ---- | ---- | ------- | ----------- |
| 0 | num_output | int | 0 | hidden size of output |
| 1 | weight_data_size | int | 0 | total size of the weight matrix |
| 2 | direction | int | 0 | 0=forward, 1=reverse, 2=bidirectional |

| weight | type | shape |
| ------ | ---- | ----- |
| weight_xc_data | float/fp16/int8 | [input_size, num_output * 3, num_directions] |
| bias_c_data | float/fp16/int8 | [num_output, 4, num_directions] |
| weight_hc_data | float/fp16/int8 | [num_output, num_output * 3, num_directions] |

Direction flag:
- 0 = forward only
- 1 = reverse only
- 2 = bidirectional

# HardSigmoid
```
y = clamp(x * alpha + beta, 0, 1)
```

* one_blob_only
* support_inplace

| param id | name | type | default | description |
| -------- | ---- | ---- | ------- | ----------- |
| 0 | alpha | float | 0.2f | |
| 1 | beta | float | 0.5f | |

# HardSwish
```
y = x * clamp(x * alpha + beta, 0, 1)
```

* one_blob_only
* support_inplace

| param id | name | type | default | description |
| -------- | ---- | ---- | ------- | ----------- |
| 0 | alpha | float | 0.2f | |
| 1 | beta | float | 0.5f | |

# InnerProduct
```
x2 = innerproduct(x, weight) + bias
y = activation(x2, act_type, act_params)
```

* one_blob_only

| param id | name | type | default | description |
| -------- | ---- | ---- | ------- | ----------- |
| 0 | num_output | int | 0 | |
| 1 | bias_term | int | 0 | |
| 2 | weight_data_size | int | 0 | |
| 8 | int8_scale_term | int | 0 | |
| 9 | activation_type | int | 0 | |
| 10 | activation_params | array | [ ] | |

| weight | type | shape |
| ------ | ---- | ----- |
| weight_data | float/fp16/int8 | [num_input, num_output] |
| bias_data | float | [num_output] |
| weight_data_int8_scales | float | [num_output] |
| bottom_blob_int8_scales | float | [1] |
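A reference sketch of the fully connected step (illustrative only; weight_data is read as num_output rows of num_input values, matching the shape above):

```cpp
void innerproduct_ref(const float* x, const float* weight, const float* bias,
                      float* y, int num_input, int num_output, int bias_term)
{
    for (int o = 0; o < num_output; o++)
    {
        float sum = bias_term ? bias[o] : 0.f;
        const float* w = weight + o * num_input;
        for (int i = 0; i < num_input; i++)
            sum += w[i] * x[i];
        y[o] = sum; // activation(act_type, act_params) would apply here
    }
}
```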
# Input
```
y = input
```

* support_inplace

| param id | name | type | default | description |
| -------- | ---- | ---- | ------- | ----------- |
| 0 | w | int | 0 | |
| 1 | h | int | 0 | |
| 2 | c | int | 0 | |

# InstanceNorm
```
split x along the channel axis into instance parts x0, x1 ...
mean-variance normalize each channel instance x0, x1 ...
y = x * gamma + beta
```

* one_blob_only
* support_inplace

| param id | name | type | default | description |
| -------- | ---- | ---- | ------- | ----------- |
| 0 | channels | int | 0 | |
| 1 | eps | float | 0.001f | x = x / sqrt(var + eps) |
| 2 | affine | int | 1 | |

| weight | type | shape |
| ------ | ---- | ----- |
| gamma_data | float | [channels] |
| beta_data | float | [channels] |

# Interp
```
if dynamic_target_size == 0   y = resize(x) by fixed size or scale
else                          y = resize(x0, size(x1))
```

* one_blob_only if dynamic_target_size == 0

| param id | name | type | default | description |
| -------- | ---- | ---- | ------- | ----------- |
| 0 | resize_type | int | 0 | |
| 1 | height_scale | float | 1.f | |
| 2 | width_scale | float | 1.f | |
| 3 | output_height | int | 0 | |
| 4 | output_width | int | 0 | |
| 5 | dynamic_target_size | int | 0 | |
| 6 | align_corner | int | 0 | |

Resize type:
- 1 = Nearest
- 2 = Bilinear
- 3 = Bicubic

# LayerNorm
```
split x along the outermost axis into parts x0, x1 ...
mean-variance normalize each part x0, x1 ...
y = x * gamma + beta by element
```

* one_blob_only
* support_inplace

| param id | name | type | default | description |
| -------- | ---- | ---- | ------- | ----------- |
| 0 | affine_size | int | 0 | |
| 1 | eps | float | 0.001f | x = x / sqrt(var + eps) |
| 2 | affine | int | 1 | |

| weight | type | shape |
| ------ | ---- | ----- |
| gamma_data | float | [affine_size] |
| beta_data | float | [affine_size] |
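A sketch of normalizing one part of `affine_size` values (illustrative only; population variance, as the eps description above implies):

```cpp
#include <cmath>

void layernorm_ref(float* ptr, const float* gamma, const float* beta,
                   int affine_size, float eps)
{
    float mean = 0.f;
    for (int i = 0; i < affine_size; i++)
        mean += ptr[i];
    mean /= affine_size;

    float var = 0.f;
    for (int i = 0; i < affine_size; i++)
        var += (ptr[i] - mean) * (ptr[i] - mean);
    var /= affine_size;

    const float inv = 1.f / std::sqrt(var + eps);
    for (int i = 0; i < affine_size; i++)
        ptr[i] = (ptr[i] - mean) * inv * gamma[i] + beta[i];
}
```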
831 832``` 833y = lstm(x) 834y0, hidden y1, cell y2 = lstm(x0, hidden x1, cell x2) 835``` 836 837* one_blob_only if bidirectional 838 839| param id | name | type | default | description | 840| --------- | ------------- | ----- | --------- | ----------------- | 841| 0 | num_output | int | 0 | hidden size of output | 842| 1 | weight_data_size| int | 0 | total size of IFOG weight matrix | 843| 2 | direction | int | 0 | 0=forward, 1=reverse, 2=bidirectional | 844 845| weight | type | shape | 846| ------------- | ----- | --------------------- | 847| weight_xc_data| float/fp16/int8 | [input_size, num_output * 4, num_directions] | 848| bias_c_data | float/fp16/int8 | [num_output, 4, num_directions] | 849| weight_hc_data| float/fp16/int8 | [num_output, num_output * 4, num_directions] | 850 851Direction flag: 852- 0 = forward only 853- 1 = reverse only 854- 2 = bidirectional 855 856# MemoryData 857``` 858y = data 859``` 860 861| param id | name | type | default | description | 862| --------- | ------------- | ----- | --------- | ----------------- | 863| 0 | w | int | 0 | | 864| 1 | h | int | 0 | | 865| 2 | c | int | 0 | | 866 867| weight | type | shape | 868| ------------- | ----- | --------------------- | 869| data | float | [w, h, c] | 870 871# Mish 872``` 873y = x * tanh(log(exp(x) + 1)) 874``` 875 876* one_blob_only 877* support_inplace 878 879# MultiHeadAttention 880``` 881split q k v into num_head part q0, k0, v0, q1, k1, v1 ... 882for each num_head part 883 xq = affine(q) / (embed_dim / num_head) 884 xk = affine(k) 885 xv = affine(v) 886 xqk = xq * xk 887 softmax_inplace(xqk) 888 xqkv = xqk * xv 889 merge xqkv to out 890y = affine(out) 891``` 892 893| param id | name | type | default | description | 894| --------- | ------------- | ----- | --------- | ----------------- | 895| 0 | embed_dim | int | 0 | | 896| 1 | num_head | int | 1 | | 897| 2 | weight_data_size| int | 0 | | 898 899| weight | type | shape | 900| ------------- | ----- | --------------------- | 901| q_weight_data | float/fp16/int8 | [weight_data_size] | 902| q_bias_data | float | [embed_dim] | 903| k_weight_data | float/fp16/int8 | [weight_data_size] | 904| k_bias_data | float | [embed_dim] | 905| v_weight_data | float/fp16/int8 | [weight_data_size] | 906| v_bias_data | float | [embed_dim] | 907| out_weight_data| float/fp16/int8 | [weight_data_size] | 908| out_bias_data | float | [embed_dim] | 909 910# MVN 911``` 912if normalize_variance == 1 && across_channels == 1 y = (x - mean) / (sqrt(var) + eps) of whole blob 913if normalize_variance == 1 && across_channels == 0 y = (x - mean) / (sqrt(var) + eps) of each channel 914if normalize_variance == 0 && across_channels == 1 y = x - mean of whole blob 915if normalize_variance == 0 && across_channels == 0 y = x - mean of each channel 916``` 917 918* one_blob_only 919 920| param id | name | type | default | description | 921| --------- | ------------- | ----- | --------- | ----------------- | 922| 0 | normalize_variance| int | 0 | | 923| 1 | across_channels| int | 0 | | 924| 2 | eps | float | 0.0001f | x = x / (sqrt(var) + eps) | 925 926# Noop 927``` 928y = x 929``` 930 931# Normalize 932``` 933if across_spatial == 1 && across_channel == 1 x2 = normalize(x) of whole blob 934if across_spatial == 1 && across_channel == 0 x2 = normalize(x) of each channel 935if across_spatial == 0 && across_channel == 1 x2 = normalize(x) of each position 936y = x2 * scale 937``` 938 939* one_blob_only 940* support_inplace 941 942| param id | name | type | default | description | 943| --------- | 
# MVN
```
if normalize_variance == 1 && across_channels == 1   y = (x - mean) / (sqrt(var) + eps) of whole blob
if normalize_variance == 1 && across_channels == 0   y = (x - mean) / (sqrt(var) + eps) of each channel
if normalize_variance == 0 && across_channels == 1   y = x - mean of whole blob
if normalize_variance == 0 && across_channels == 0   y = x - mean of each channel
```

* one_blob_only

| param id | name | type | default | description |
| -------- | ---- | ---- | ------- | ----------- |
| 0 | normalize_variance | int | 0 | |
| 1 | across_channels | int | 0 | |
| 2 | eps | float | 0.0001f | x = x / (sqrt(var) + eps) |

# Noop
```
y = x
```

# Normalize
```
if across_spatial == 1 && across_channel == 1   x2 = normalize(x) of whole blob
if across_spatial == 1 && across_channel == 0   x2 = normalize(x) of each channel
if across_spatial == 0 && across_channel == 1   x2 = normalize(x) of each position
y = x2 * scale
```

* one_blob_only
* support_inplace

| param id | name | type | default | description |
| -------- | ---- | ---- | ------- | ----------- |
| 0 | across_spatial | int | 0 | |
| 1 | channel_shared | int | 0 | |
| 2 | eps | float | 0.0001f | see eps mode |
| 3 | scale_data_size | int | 0 | |
| 4 | across_channel | int | 0 | |
| 9 | eps_mode | int | 0 | |

| weight | type | shape |
| ------ | ---- | ----- |
| scale_data | float | [scale_data_size] |

Eps mode:
- 0 = caffe/mxnet    x = x / sqrt(var + eps)
- 1 = pytorch        x = x / max(sqrt(var), eps)
- 2 = tensorflow     x = x / sqrt(max(var, eps))

# Packing
```
y = wrap_packing(x)
```

* one_blob_only

| param id | name | type | default | description |
| -------- | ---- | ---- | ------- | ----------- |
| 0 | out_elempack | int | 1 | |
| 1 | use_padding | int | 0 | |
| 2 | cast_type_from | int | 0 | |
| 3 | cast_type_to | int | 0 | |
| 4 | storage_type_from | int | 0 | |
| 5 | storage_type_to | int | 0 | |

# Padding
```
y = pad(x, pads)
```

| param id | name | type | default | description |
| -------- | ---- | ---- | ------- | ----------- |
| 0 | top | int | 0 | |
| 1 | bottom | int | 0 | |
| 2 | left | int | 0 | |
| 3 | right | int | 0 | |
| 4 | type | int | 0 | |
| 5 | value | float | 0.f | |
| 6 | per_channel_pad_data_size | int | 0 | |
| 7 | front | int | 0 | |
| 8 | behind | int | front | |

| weight | type | shape |
| ------ | ---- | ----- |
| per_channel_pad_data | float | [per_channel_pad_data_size] |

Padding type:
- 0 = CONSTANT
- 1 = REPLICATE
- 2 = REFLECT
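How the three padding types pick a source element, shown for one dimension (illustrative sketch; the same rule applies per axis):

```cpp
// source index read for output position x, input width w
int pad_src_index(int x, int w, int type)
{
    if (type == 1) // REPLICATE: clamp to the nearest edge
        return x < 0 ? 0 : (x >= w ? w - 1 : x);
    if (type == 2) // REFLECT: mirror around the edge without repeating it
        return x < 0 ? -x : (x >= w ? 2 * w - 2 - x : x);
    return x;      // CONSTANT: out-of-range positions take `value` instead
}
```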
# Permute
```
y = reorder(x)
```

| param id | name | type | default | description |
| -------- | ---- | ---- | ------- | ----------- |
| 0 | order_type | int | 0 | |

Order type (2D / 3D blobs):
- 0 = WH / WHC
- 1 = HW / HWC
- 2 = WCH
- 3 = CWH
- 4 = HCW
- 5 = CHW

# PixelShuffle
```
if mode == 0   y = depth_to_space(x) where x channel order is sw-sh-outc
if mode == 1   y = depth_to_space(x) where x channel order is outc-sw-sh
```

* one_blob_only

| param id | name | type | default | description |
| -------- | ---- | ---- | ------- | ----------- |
| 0 | upscale_factor | int | 1 | |
| 1 | mode | int | 0 | |
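A depth_to_space sketch for mode = 0 (illustrative only; it assumes the sw-sh-outc channel order above means sw varies fastest):

```cpp
// in is [outc * s * s, h, w], out is [outc, h * s, w * s], s = upscale_factor
void pixelshuffle_ref(const float* in, float* out, int outc, int h, int w, int s)
{
    for (int oc = 0; oc < outc; oc++)
        for (int sh = 0; sh < s; sh++)
            for (int sw = 0; sw < s; sw++)
            {
                const float* src = in + ((oc * s + sh) * s + sw) * h * w;
                for (int y = 0; y < h; y++)
                    for (int x = 0; x < w; x++)
                        out[(oc * h * s + y * s + sh) * w * s + x * s + sw] = src[y * w + x];
            }
}
```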
# Pooling
```
x2 = pad(x, pads)
x3 = pooling(x2, kernel, stride)
```

| param id | name | type | default | description |
| -------- | ---- | ---- | ------- | ----------- |
| 0 | pooling_type | int | 0 | |
| 1 | kernel_w | int | 0 | |
| 2 | stride_w | int | 1 | |
| 3 | pad_left | int | 0 | |
| 4 | global_pooling | int | 0 | |
| 5 | pad_mode | int | 0 | |
| 6 | avgpool_count_include_pad | int | 0 | |
| 7 | adaptive_pooling | int | 0 | |
| 8 | out_w | int | 0 | |
| 11 | kernel_h | int | kernel_w | |
| 12 | stride_h | int | stride_w | |
| 13 | pad_top | int | pad_left | |
| 14 | pad_right | int | pad_left | |
| 15 | pad_bottom | int | pad_top | |
| 18 | out_h | int | out_w | |

Pooling type:
- 0 = MAX
- 1 = AVG

Pad mode:
- 0 = full padding
- 1 = valid padding
- 2 = tensorflow padding=SAME or onnx padding=SAME_UPPER
- 3 = onnx padding=SAME_LOWER

# Pooling1D
```
x2 = pad(x, pads)
x3 = pooling1d(x2, kernel, stride)
```

| param id | name | type | default | description |
| -------- | ---- | ---- | ------- | ----------- |
| 0 | pooling_type | int | 0 | |
| 1 | kernel_w | int | 0 | |
| 2 | stride_w | int | 1 | |
| 3 | pad_left | int | 0 | |
| 4 | global_pooling | int | 0 | |
| 5 | pad_mode | int | 0 | |
| 6 | avgpool_count_include_pad | int | 0 | |
| 7 | adaptive_pooling | int | 0 | |
| 8 | out_w | int | 0 | |
| 14 | pad_right | int | pad_left | |

Pooling type:
- 0 = MAX
- 1 = AVG

Pad mode:
- 0 = full padding
- 1 = valid padding
- 2 = tensorflow padding=SAME or onnx padding=SAME_UPPER
- 3 = onnx padding=SAME_LOWER

# Pooling3D
```
x2 = pad(x, pads)
x3 = pooling3d(x2, kernel, stride)
```

| param id | name | type | default | description |
| -------- | ---- | ---- | ------- | ----------- |
| 0 | pooling_type | int | 0 | |
| 1 | kernel_w | int | 0 | |
| 2 | stride_w | int | 1 | |
| 3 | pad_left | int | 0 | |
| 4 | global_pooling | int | 0 | |
| 5 | pad_mode | int | 0 | |
| 6 | avgpool_count_include_pad | int | 0 | |
| 7 | adaptive_pooling | int | 0 | |
| 8 | out_w | int | 0 | |
| 11 | kernel_h | int | kernel_w | |
| 12 | stride_h | int | stride_w | |
| 13 | pad_top | int | pad_left | |
| 14 | pad_right | int | pad_left | |
| 15 | pad_bottom | int | pad_top | |
| 16 | pad_behind | int | pad_front | |
| 18 | out_h | int | out_w | |
| 21 | kernel_d | int | kernel_w | |
| 22 | stride_d | int | stride_w | |
| 23 | pad_front | int | pad_top | |
| 28 | out_d | int | out_w | |

Pooling type:
- 0 = MAX
- 1 = AVG

Pad mode:
- 0 = full padding
- 1 = valid padding
- 2 = tensorflow padding=SAME or onnx padding=SAME_UPPER
- 3 = onnx padding=SAME_LOWER

# Power
```
y = pow((shift + x * scale), power)
```

* one_blob_only
* support_inplace

| param id | name | type | default | description |
| -------- | ---- | ---- | ------- | ----------- |
| 0 | power | float | 1.f | |
| 1 | scale | float | 1.f | |
| 2 | shift | float | 0.f | |

# PReLU
```
if x < 0    y = x * slope
else        y = x
```

* one_blob_only
* support_inplace

| param id | name | type | default | description |
| -------- | ---- | ---- | ------- | ----------- |
| 0 | num_slope | int | 0 | |

| weight | type | shape |
| ------ | ---- | ----- |
| slope_data | float | [num_slope] |

# Quantize
```
y = float2int8(x * scale)
```

* one_blob_only

| param id | name | type | default | description |
| -------- | ---- | ---- | ------- | ----------- |
| 0 | scale_data_size | int | 0 | |

| weight | type | shape |
| ------ | ---- | ----- |
| scale_data | float | [scale_data_size] |
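A sketch of the float-to-int8 step (illustrative only; rounds to nearest and saturates to a symmetric [-127, 127] range, a common convention for int8 inference):

```cpp
#include <cmath>

signed char float2int8_ref(float x, float scale)
{
    const int i = (int)std::lround(x * scale);
    if (i > 127) return 127;
    if (i < -127) return -127;
    return (signed char)i;
}
```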
# Reduction
```
y = reduce_op(x * coeff)
```

* one_blob_only

| param id | name | type | default | description |
| -------- | ---- | ---- | ------- | ----------- |
| 0 | operation | int | 0 | |
| 1 | reduce_all | int | 1 | |
| 2 | coeff | float | 1.f | |
| 3 | axes | array | [ ] | |
| 4 | keepdims | int | 0 | |

Operation type:
- 0 = SUM
- 1 = ASUM
- 2 = SUMSQ
- 3 = MEAN
- 4 = MAX
- 5 = MIN
- 6 = PROD
- 7 = L1
- 8 = L2
- 9 = LogSum
- 10 = LogSumExp

# ReLU
```
if x < 0    y = x * slope
else        y = x
```

* one_blob_only
* support_inplace

| param id | name | type | default | description |
| -------- | ---- | ---- | ------- | ----------- |
| 0 | slope | float | 0.f | |

# Reorg
```
if mode == 0   y = space_to_depth(x) where x channel order is sw-sh-outc
if mode == 1   y = space_to_depth(x) where x channel order is outc-sw-sh
```

* one_blob_only

| param id | name | type | default | description |
| -------- | ---- | ---- | ------- | ----------- |
| 0 | stride | int | 1 | |
| 1 | mode | int | 0 | |

# Requantize
```
x2 = x * scale_in + bias
x3 = activation(x2)
y = float2int8(x3 * scale_out)
```

* one_blob_only

| param id | name | type | default | description |
| -------- | ---- | ---- | ------- | ----------- |
| 0 | scale_in_data_size | int | 1 | |
| 1 | scale_out_data_size | int | 1 | |
| 2 | bias_data_size | int | 0 | |
| 3 | activation_type | int | 0 | |
| 4 | activation_params | array | [ ] | |

| weight | type | shape |
| ------ | ---- | ----- |
| scale_in_data | float | [scale_in_data_size] |
| scale_out_data | float | [scale_out_data_size] |
| bias_data | float | [bias_data_size] |

# Reshape
```
if permute == 1   y = hwc2chw(reshape(chw2hwc(x)))
else              y = reshape(x)
```

* one_blob_only

| param id | name | type | default | description |
| -------- | ---- | ---- | ------- | ----------- |
| 0 | w | int | -233 | |
| 1 | h | int | -233 | |
| 2 | c | int | -233 | |
| 3 | permute | int | 0 | |

Reshape flag:
- 0 = copy from bottom
- -1 = remaining
- -233 = drop this dim (default)

# RNN
Apply a single-layer RNN to a feature sequence of `T` timesteps. The input blob shape is `[w=input_size, h=T]` and the output blob shape is `[w=num_output, h=T]`.

```
y = rnn(x)
y0, hidden y1 = rnn(x0, hidden x1)
```

* one_blob_only if bidirectional

| param id | name | type | default | description |
| -------- | ---- | ---- | ------- | ----------- |
| 0 | num_output | int | 0 | hidden size of output |
| 1 | weight_data_size | int | 0 | total size of the weight matrix |
| 2 | direction | int | 0 | 0=forward, 1=reverse, 2=bidirectional |

| weight | type | shape |
| ------ | ---- | ----- |
| weight_xc_data | float/fp16/int8 | [input_size, num_output, num_directions] |
| bias_c_data | float/fp16/int8 | [num_output, 1, num_directions] |
| weight_hc_data | float/fp16/int8 | [num_output, num_output, num_directions] |

Direction flag:
- 0 = forward only
- 1 = reverse only
- 2 = bidirectional

# Scale
```
if scale_data_size == -233   y = x0 * x1
else                         y = x * scale + bias
```

* one_blob_only if scale_data_size != -233
* support_inplace

| param id | name | type | default | description |
| -------- | ---- | ---- | ------- | ----------- |
| 0 | scale_data_size | int | 0 | |
| 1 | bias_term | int | 0 | |

| weight | type | shape |
| ------ | ---- | ----- |
| scale_data | float | [scale_data_size] |
| bias_data | float | [scale_data_size] |

# SELU
```
if x < 0    y = (exp(x) - 1.f) * alpha * lambda
else        y = x * lambda
```

* one_blob_only
* support_inplace

| param id | name | type | default | description |
| -------- | ---- | ---- | ------- | ----------- |
| 0 | alpha | float | 1.67326324f | |
| 1 | lambda | float | 1.050700987f | |

# ShuffleChannel
```
if reverse == 0   y = shufflechannel(x) by group
if reverse == 1   y = shufflechannel(x) by channel / group
```

* one_blob_only

| param id | name | type | default | description |
| -------- | ---- | ---- | ------- | ----------- |
| 0 | group | int | 1 | |
| 1 | reverse | int | 0 | |
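The channel permutation in index form (illustrative sketch; reverse == 1 applies the inverse mapping):

```cpp
// input channel that output channel dst reads from
int shuffle_src_channel(int dst, int channels, int group, int reverse)
{
    const int channels_per_group = channels / group;
    if (!reverse)
        return (dst % group) * channels_per_group + dst / group;
    return (dst % channels_per_group) * group + dst / channels_per_group;
}
```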
# Sigmoid
```
y = 1 / (1 + exp(-x))
```

* one_blob_only
* support_inplace

# Slice
```
split x along axis into slices, where each slice size is given by the slices array
```

| param id | name | type | default | description |
| -------- | ---- | ---- | ------- | ----------- |
| 0 | slices | array | [ ] | |
| 1 | axis | int | 0 | |

# Softmax
```
softmax(x, axis)
```

* one_blob_only
* support_inplace

| param id | name | type | default | description |
| -------- | ---- | ---- | ------- | ----------- |
| 0 | axis | int | 0 | |
| 1 | fixbug0 | int | 0 | hack for bug fix, should be 1 |

# Softplus
```
y = log(exp(x) + 1)
```

* one_blob_only
* support_inplace

# Split
```
y0, y1 ... = x
```

# Swish
```
y = x / (1 + exp(-x))
```

* one_blob_only
* support_inplace

# TanH
```
y = tanh(x)
```

* one_blob_only
* support_inplace

# Threshold
```
if x > threshold   y = 1
else               y = 0
```

* one_blob_only
* support_inplace

| param id | name | type | default | description |
| -------- | ---- | ---- | ------- | ----------- |
| 0 | threshold | float | 0.f | |

# UnaryOp
```
y = unaryop(x)
```

* one_blob_only
* support_inplace

| param id | name | type | default | description |
| -------- | ---- | ---- | ------- | ----------- |
| 0 | op_type | int | 0 | operation type as follows |

Operation type:
- 0 = ABS
- 1 = NEG
- 2 = FLOOR
- 3 = CEIL
- 4 = SQUARE
- 5 = SQRT
- 6 = RSQ
- 7 = EXP
- 8 = LOG
- 9 = SIN
- 10 = COS
- 11 = TAN
- 12 = ASIN
- 13 = ACOS
- 14 = ATAN
- 15 = RECIPROCAL
- 16 = TANH
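Most op_type values map directly onto `<cmath>`; a sketch of the less obvious ones (illustrative only):

```cpp
#include <cmath>

float unaryop_ref(float x, int op_type)
{
    switch (op_type)
    {
    case 4:  return x * x;              // SQUARE
    case 6:  return 1.f / std::sqrt(x); // RSQ, reciprocal square root
    case 15: return 1.f / x;            // RECIPROCAL
    default: return x;                  // ABS, FLOOR, EXP, SIN ... use the std:: equivalents
    }
}
```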