======================================
Syntax of AMDGPU Instruction Modifiers
======================================

.. contents::
   :local:

Conventions
===========

The following notation is used throughout this document:

    =================== =============================================================
    Notation            Description
    =================== =============================================================
    {0..N}              Any integer value in the range from 0 to N (inclusive).
    <x>                 Syntax and meaning of *x* is explained elsewhere.
    =================== =============================================================

.. _amdgpu_syn_modifiers:

Modifiers
=========

DS Modifiers
------------

.. _amdgpu_synid_ds_offset8:

offset8
~~~~~~~

Specifies an immediate unsigned 8-bit offset, in bytes. The default value is 0.

Used with DS instructions which have 2 addresses.

    =================== ====================================================================
    Syntax              Description
    =================== ====================================================================
    offset:{0..0xFF}    Specifies an unsigned 8-bit offset as a positive
                        :ref:`integer number <amdgpu_synid_integer_number>`
                        or an :ref:`absolute expression<amdgpu_synid_absolute_expression>`.
    =================== ====================================================================

Examples:

.. parsed-literal::

    offset:0xff
    offset:2-x
    offset:-x-y

.. _amdgpu_synid_ds_offset16:

offset16
~~~~~~~~

Specifies an immediate unsigned 16-bit offset, in bytes. The default value is 0.

Used with DS instructions which have 1 address.

    ==================== ====================================================================
    Syntax               Description
    ==================== ====================================================================
    offset:{0..0xFFFF}   Specifies an unsigned 16-bit offset as a positive
                         :ref:`integer number <amdgpu_synid_integer_number>`
                         or an :ref:`absolute expression<amdgpu_synid_absolute_expression>`.
    ==================== ====================================================================

Examples:

.. parsed-literal::

    offset:65535
    offset:0xffff
    offset:-x-y

.. _amdgpu_synid_sw_offset16:

swizzle pattern
~~~~~~~~~~~~~~~

This is a special modifier which may be used with the *ds_swizzle_b32* instruction only.
It specifies a swizzle pattern in numeric or symbolic form. The default value is 0.

See AMD documentation for more information.

    ======================================================= ===========================================================
    Syntax                                                  Description
    ======================================================= ===========================================================
    offset:{0..0xFFFF}                                      Specifies a 16-bit swizzle pattern.
    offset:swizzle(QUAD_PERM,{0..3},{0..3},{0..3},{0..3})   Specifies a quad permute mode pattern.

                                                            Each number is a lane *id*.
    offset:swizzle(BITMASK_PERM, "<mask>")                  Specifies a bitmask permute mode pattern.

                                                            The pattern converts a 5-bit lane *id* to another
                                                            lane *id* with which the lane interacts.

                                                            *mask* is a 5 character sequence which
                                                            specifies how to transform the bits of the
                                                            lane *id*.

                                                            The following characters are allowed:

                                                            * "0" - set bit to 0.

                                                            * "1" - set bit to 1.

                                                            * "p" - preserve bit.

                                                            * "i" - invert bit.

    offset:swizzle(BROADCAST,{2..32},{0..N})                Specifies a broadcast mode.

                                                            Broadcasts the value of any particular lane to
                                                            all lanes in its group.

                                                            The first numeric parameter is a group
                                                            size and must be equal to 2, 4, 8, 16 or 32.

                                                            The second numeric parameter is an index of the
                                                            lane being broadcast.

                                                            The index must be less than the group size.
    offset:swizzle(SWAP,{1..16})                            Specifies a swap mode.

                                                            Swaps the neighboring groups of
                                                            1, 2, 4, 8 or 16 lanes.
    offset:swizzle(REVERSE,{2..32})                         Specifies a reverse mode.

                                                            Reverses the lanes for groups of 2, 4, 8, 16 or 32 lanes.
    ======================================================= ===========================================================

Note: numeric values may be specified as either :ref:`integer numbers<amdgpu_synid_integer_number>` or
:ref:`absolute expressions<amdgpu_synid_absolute_expression>`.

Examples:

.. parsed-literal::

    offset:255
    offset:0xffff
    offset:swizzle(QUAD_PERM, 0, 1, 2, 3)
    offset:swizzle(BITMASK_PERM, "01pi0")
    offset:swizzle(BROADCAST, 2, 0)
    offset:swizzle(SWAP, 8)
    offset:swizzle(REVERSE, 30 + 2)

.. _amdgpu_synid_gds:

gds
~~~

Specifies whether to use GDS or LDS memory (LDS is the default).

    ======================================== ================================================
    Syntax                                   Description
    ======================================== ================================================
    gds                                      Use GDS memory.
    ======================================== ================================================


EXP Modifiers
-------------

.. _amdgpu_synid_done:

done
~~~~

Specifies if this is the last export from the shader to the target. By default,
the *exp* instruction does not finish an export sequence.

    ======================================== ================================================
    Syntax                                   Description
    ======================================== ================================================
    done                                     Indicates the last export operation.
    ======================================== ================================================

.. _amdgpu_synid_compr:

compr
~~~~~

Indicates if the data are compressed (data are not compressed by default).

    ======================================== ================================================
    Syntax                                   Description
    ======================================== ================================================
    compr                                    Data are compressed.
    ======================================== ================================================

.. _amdgpu_synid_vm:

vm
~~

Specifies valid mask flag state (off by default).

    ======================================== ================================================
    Syntax                                   Description
    ======================================== ================================================
    vm                                       Set valid mask flag.
    ======================================== ================================================

FLAT Modifiers
--------------

.. _amdgpu_synid_flat_offset12:

offset12
~~~~~~~~

Specifies an immediate unsigned 12-bit offset, in bytes. The default value is 0.

Cannot be used with *global/scratch* opcodes. GFX9 only.

    ================= ====================================================================
    Syntax            Description
    ================= ====================================================================
    offset:{0..4095}  Specifies a 12-bit unsigned offset as a positive
                      :ref:`integer number <amdgpu_synid_integer_number>`
                      or an :ref:`absolute expression<amdgpu_synid_absolute_expression>`.
    ================= ====================================================================

Examples:

.. parsed-literal::

    offset:4095
    offset:x-0xff

.. _amdgpu_synid_flat_offset13s:

offset13s
~~~~~~~~~

Specifies an immediate signed 13-bit offset, in bytes.
The default value is 0.

Can be used with *global/scratch* opcodes only. GFX9 only.

    ===================== ====================================================================
    Syntax                Description
    ===================== ====================================================================
    offset:{-4096..4095}  Specifies a 13-bit signed offset as an
                          :ref:`integer number <amdgpu_synid_integer_number>`
                          or an :ref:`absolute expression<amdgpu_synid_absolute_expression>`.
    ===================== ====================================================================

Examples:

.. parsed-literal::

    offset:-4000
    offset:0x10
    offset:-x

.. _amdgpu_synid_flat_offset12s:

offset12s
~~~~~~~~~

Specifies an immediate signed 12-bit offset, in bytes. The default value is 0.

Can be used with *global/scratch* opcodes only.

GFX10 only.

    ===================== ====================================================================
    Syntax                Description
    ===================== ====================================================================
    offset:{-2048..2047}  Specifies a 12-bit signed offset as an
                          :ref:`integer number <amdgpu_synid_integer_number>`
                          or an :ref:`absolute expression<amdgpu_synid_absolute_expression>`.
    ===================== ====================================================================

Examples:

.. parsed-literal::

    offset:-2000
    offset:0x10
    offset:-x+y

.. _amdgpu_synid_flat_offset11:

offset11
~~~~~~~~

Specifies an immediate unsigned 11-bit offset, in bytes. The default value is 0.

Cannot be used with *global/scratch* opcodes.

GFX10 only.

    ================= ====================================================================
    Syntax            Description
    ================= ====================================================================
    offset:{0..2047}  Specifies an 11-bit unsigned offset as a positive
                      :ref:`integer number <amdgpu_synid_integer_number>`
                      or an :ref:`absolute expression<amdgpu_synid_absolute_expression>`.
    ================= ====================================================================

Examples:

.. parsed-literal::

    offset:2047
    offset:x+0xff

dlc
~~~

See a description :ref:`here<amdgpu_synid_dlc>`. GFX10 only.

glc
~~~

See a description :ref:`here<amdgpu_synid_glc>`.

lds
~~~

See a description :ref:`here<amdgpu_synid_lds>`. GFX10 only.

slc
~~~

See a description :ref:`here<amdgpu_synid_slc>`.

tfe
~~~

See a description :ref:`here<amdgpu_synid_tfe>`.

nv
~~

See a description :ref:`here<amdgpu_synid_nv>`.

MIMG Modifiers
--------------

.. _amdgpu_synid_dmask:

dmask
~~~~~

Specifies which channels (image components) are used by the operation. By default, no channels
are used.

    =============== ====================================================================
    Syntax          Description
    =============== ====================================================================
    dmask:{0..15}   Specifies image channels as a positive
                    :ref:`integer number <amdgpu_synid_integer_number>`
                    or an :ref:`absolute expression<amdgpu_synid_absolute_expression>`.

                    Each bit corresponds to one of 4 image components (RGBA).

                    If the specified bit value is 0, the component is not used,
                    value 1 means that the component is used.
    =============== ====================================================================

This modifier has some limitations depending on instruction kind:

    =================================================== ========================
    Instruction Kind                                    Valid dmask Values
    =================================================== ========================
    32-bit atomic *cmpswap*                             0x3
    32-bit atomic instructions except for *cmpswap*     0x1
    64-bit atomic *cmpswap*                             0xF
    64-bit atomic instructions except for *cmpswap*     0x3
    *gather4*                                           0x1, 0x2, 0x4, 0x8
    Other instructions                                  any value
    =================================================== ========================

Examples:

.. parsed-literal::

    dmask:0xf
    dmask:0b1111
    dmask:x|y|z

.. _amdgpu_synid_unorm:

unorm
~~~~~

Specifies whether the address is normalized or not (the address is normalized by default).

    ======================== ========================================
    Syntax                   Description
    ======================== ========================================
    unorm                    Force the address to be unnormalized.
    ======================== ========================================

glc
~~~

See a description :ref:`here<amdgpu_synid_glc>`.

slc
~~~

See a description :ref:`here<amdgpu_synid_slc>`.

.. _amdgpu_synid_r128:

r128
~~~~

Specifies texture resource size. The default size is 256 bits.

GFX7, GFX8 and GFX10 only.

    =================== ================================================
    Syntax              Description
    =================== ================================================
    r128                Specifies a 128-bit texture resource size.
    =================== ================================================

.. WARNING:: Using this modifier should decrease *rsrc* operand size from 8 to 4 dwords, but the assembler does not currently support this feature.
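
As an illustration of the :ref:`dmask<amdgpu_synid_dmask>` rules described earlier in this
section, the following Python sketch decodes a *dmask* value and checks it against the table of
valid values. It is not part of the assembler: the function names and the instruction-kind
labels are hypothetical, and the bit order (bit 0 = R through bit 3 = A) is an assumption that
this document does not state.

.. code-block:: python

    CHANNELS = "RGBA"  # assumed order: bit 0 = R, bit 1 = G, bit 2 = B, bit 3 = A

    # Valid dmask values per instruction kind, transcribed from the dmask
    # limitations table. The keys are illustrative labels, not assembler syntax.
    VALID_DMASK = {
        "atomic32_cmpswap": {0x3},
        "atomic32": {0x1},
        "atomic64_cmpswap": {0xF},
        "atomic64": {0x3},
        "gather4": {0x1, 0x2, 0x4, 0x8},
    }

    def dmask_channels(dmask):
        """Return the list of channels enabled by a dmask value."""
        if not 0 <= dmask <= 15:
            raise ValueError("dmask must be in the range 0..15")
        return [ch for bit, ch in enumerate(CHANNELS) if dmask & (1 << bit)]

    def is_valid_dmask(kind, dmask):
        """Check a dmask value; kinds not listed above accept any value."""
        if not 0 <= dmask <= 15:
            return False
        allowed = VALID_DMASK.get(kind)
        return True if allowed is None else dmask in allowed

Under these assumptions, ``dmask_channels(0b0101)`` selects the R and B channels, and
``is_valid_dmask("gather4", 0x3)`` is false because *gather4* accepts exactly one channel.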

tfe
~~~

See a description :ref:`here<amdgpu_synid_tfe>`.

.. _amdgpu_synid_lwe:

lwe
~~~

Specifies LOD warning status (LOD warning is disabled by default).

    ======================================== ================================================
    Syntax                                   Description
    ======================================== ================================================
    lwe                                      Enables LOD warning.
    ======================================== ================================================

.. _amdgpu_synid_da:

da
~~

Specifies if an array index must be sent to TA. By default, array index is not sent.

    ======================================== ================================================
    Syntax                                   Description
    ======================================== ================================================
    da                                       Send an array-index to TA.
    ======================================== ================================================

.. _amdgpu_synid_d16:

d16
~~~

Specifies data size: 16 or 32 bits (32 bits by default). Not supported by GFX7.

    ======================================== ================================================
    Syntax                                   Description
    ======================================== ================================================
    d16                                      Enables 16-bit data mode.

                                             On loads, convert data in memory to 16-bit
                                             format before storing it in VGPRs.

                                             For stores, convert 16-bit data in VGPRs to
                                             32 bits before going to memory.

                                             Note that GFX8.0 does not support data packing.
                                             Each 16-bit data element occupies 1 VGPR.

                                             GFX8.1, GFX9 and GFX10 support data packing.
                                             Each pair of 16-bit data elements
                                             occupies 1 VGPR.
    ======================================== ================================================

.. _amdgpu_synid_a16:

a16
~~~

Specifies size of image address components: 16 or 32 bits (32 bits by default).
GFX9 and GFX10 only.

    ======================================== ================================================
    Syntax                                   Description
    ======================================== ================================================
    a16                                      Enables 16-bit image address components.
    ======================================== ================================================

.. _amdgpu_synid_dim:

dim
~~~

Specifies surface dimension. This is a mandatory modifier. There is no default value.

GFX10 only.

    =============================== =========================================================
    Syntax                          Description
    =============================== =========================================================
    dim:1D                          One-dimensional image.
    dim:2D                          Two-dimensional image.
    dim:3D                          Three-dimensional image.
    dim:CUBE                        Cubemap array.
    dim:1D_ARRAY                    One-dimensional image array.
    dim:2D_ARRAY                    Two-dimensional image array.
    dim:2D_MSAA                     Two-dimensional multi-sample anti-aliasing image.
    dim:2D_MSAA_ARRAY               Two-dimensional multi-sample anti-aliasing image array.
    =============================== =========================================================

The following table defines an alternative syntax which is supported
for compatibility with SP3 assembler:

    =============================== =========================================================
    Syntax                          Description
    =============================== =========================================================
    dim:SQ_RSRC_IMG_1D              One-dimensional image.
    dim:SQ_RSRC_IMG_2D              Two-dimensional image.
    dim:SQ_RSRC_IMG_3D              Three-dimensional image.
    dim:SQ_RSRC_IMG_CUBE            Cubemap array.
    dim:SQ_RSRC_IMG_1D_ARRAY        One-dimensional image array.
    dim:SQ_RSRC_IMG_2D_ARRAY        Two-dimensional image array.
    dim:SQ_RSRC_IMG_2D_MSAA         Two-dimensional multi-sample anti-aliasing image.
    dim:SQ_RSRC_IMG_2D_MSAA_ARRAY   Two-dimensional multi-sample anti-aliasing image array.
    =============================== =========================================================

dlc
~~~

See a description :ref:`here<amdgpu_synid_dlc>`. GFX10 only.

Miscellaneous Modifiers
-----------------------

.. _amdgpu_synid_dlc:

dlc
~~~

Controls device level cache policy for memory operations. Used for synchronization.
When specified, forces operation to bypass device level cache, making the operation device
level coherent. By default, instructions use device level cache.

GFX10 only.

    ======================================== ================================================
    Syntax                                   Description
    ======================================== ================================================
    dlc                                      Bypass device level cache.
    ======================================== ================================================

.. _amdgpu_synid_glc:

glc
~~~

This modifier has different meaning for loads, stores, and atomic operations.
The default value is off (0).

See AMD documentation for details.

    ======================================== ================================================
    Syntax                                   Description
    ======================================== ================================================
    glc                                      Set glc bit to 1.
    ======================================== ================================================

.. _amdgpu_synid_lds:

lds
~~~

Specifies where to store the result: VGPRs or LDS (VGPRs by default).

    ======================================== ===========================
    Syntax                                   Description
    ======================================== ===========================
    lds                                      Store result in LDS.
    ======================================== ===========================

.. _amdgpu_synid_nv:

nv
~~

Specifies if instruction is operating on non-volatile memory. By default, memory is volatile.

GFX9 only.

    ======================================== ================================================
    Syntax                                   Description
    ======================================== ================================================
    nv                                       Indicates that instruction operates on
                                             non-volatile memory.
    ======================================== ================================================

.. _amdgpu_synid_slc:

slc
~~~

Specifies cache policy. The default value is off (0).

See AMD documentation for details.

    ======================================== ================================================
    Syntax                                   Description
    ======================================== ================================================
    slc                                      Set slc bit to 1.
    ======================================== ================================================

.. _amdgpu_synid_tfe:

tfe
~~~

Controls access to partially resident textures. The default value is off (0).

See AMD documentation for details.

    ======================================== ================================================
    Syntax                                   Description
    ======================================== ================================================
    tfe                                      Set tfe bit to 1.
    ======================================== ================================================

MUBUF/MTBUF Modifiers
---------------------

.. _amdgpu_synid_idxen:

idxen
~~~~~

Specifies whether address components include an index. By default, no components are used.

Can be used together with :ref:`offen<amdgpu_synid_offen>`.

Cannot be used with :ref:`addr64<amdgpu_synid_addr64>`.

    ======================================== ================================================
    Syntax                                   Description
    ======================================== ================================================
    idxen                                    Address components include an index.
    ======================================== ================================================

.. _amdgpu_synid_offen:

offen
~~~~~

Specifies whether address components include an offset. By default, no components are used.

Can be used together with :ref:`idxen<amdgpu_synid_idxen>`.

Cannot be used with :ref:`addr64<amdgpu_synid_addr64>`.

    ======================================== ================================================
    Syntax                                   Description
    ======================================== ================================================
    offen                                    Address components include an offset.
    ======================================== ================================================

.. _amdgpu_synid_addr64:

addr64
~~~~~~

Specifies whether a 64-bit address is used. By default, no address is used.

GFX7 only. Cannot be used with :ref:`offen<amdgpu_synid_offen>` and
:ref:`idxen<amdgpu_synid_idxen>` modifiers.

    ======================================== ================================================
    Syntax                                   Description
    ======================================== ================================================
    addr64                                   A 64-bit address is used.
    ======================================== ================================================

.. _amdgpu_synid_buf_offset12:

offset12
~~~~~~~~

Specifies an immediate unsigned 12-bit offset, in bytes. The default value is 0.

    ================== ====================================================================
    Syntax             Description
    ================== ====================================================================
    offset:{0..0xFFF}  Specifies a 12-bit unsigned offset as a positive
                       :ref:`integer number <amdgpu_synid_integer_number>`
                       or an :ref:`absolute expression<amdgpu_synid_absolute_expression>`.
    ================== ====================================================================

Examples:

.. parsed-literal::

    offset:x+y
    offset:0x10

glc
~~~

See a description :ref:`here<amdgpu_synid_glc>`.

slc
~~~

See a description :ref:`here<amdgpu_synid_slc>`.

lds
~~~

See a description :ref:`here<amdgpu_synid_lds>`.

dlc
~~~

See a description :ref:`here<amdgpu_synid_dlc>`. GFX10 only.

tfe
~~~

See a description :ref:`here<amdgpu_synid_tfe>`.

.. _amdgpu_synid_fmt:

fmt
~~~

Specifies data and numeric formats used by the operation.
The default numeric format is BUF_NUM_FORMAT_UNORM.
The default data format is BUF_DATA_FORMAT_8.

    ========================================= ===============================================================
    Syntax                                    Description
    ========================================= ===============================================================
    format:{0..127}                           Use format specified as either an
                                              :ref:`integer number<amdgpu_synid_integer_number>` or an
                                              :ref:`absolute expression<amdgpu_synid_absolute_expression>`.
    format:[<data format>]                    Use the specified data format and
                                              default numeric format.
    format:[<numeric format>]                 Use the specified numeric format and
                                              default data format.
    format:[<data format>, <numeric format>]  Use the specified data and numeric formats.
    format:[<numeric format>, <data format>]  Use the specified data and numeric formats.
    ========================================= ===============================================================

.. _amdgpu_synid_format_data:

Supported data formats are defined in the following table:

    ========================================= ===============================
    Syntax                                    Note
    ========================================= ===============================
    BUF_DATA_FORMAT_INVALID
    BUF_DATA_FORMAT_8                         Default value.
    BUF_DATA_FORMAT_16
    BUF_DATA_FORMAT_8_8
    BUF_DATA_FORMAT_32
    BUF_DATA_FORMAT_16_16
    BUF_DATA_FORMAT_10_11_11
    BUF_DATA_FORMAT_11_11_10
    BUF_DATA_FORMAT_10_10_10_2
    BUF_DATA_FORMAT_2_10_10_10
    BUF_DATA_FORMAT_8_8_8_8
    BUF_DATA_FORMAT_32_32
    BUF_DATA_FORMAT_16_16_16_16
    BUF_DATA_FORMAT_32_32_32
    BUF_DATA_FORMAT_32_32_32_32
    BUF_DATA_FORMAT_RESERVED_15
    ========================================= ===============================

.. _amdgpu_synid_format_num:

Supported numeric formats are defined below:

    ========================================= ===============================
    Syntax                                    Note
    ========================================= ===============================
    BUF_NUM_FORMAT_UNORM                      Default value.
    BUF_NUM_FORMAT_SNORM
    BUF_NUM_FORMAT_USCALED
    BUF_NUM_FORMAT_SSCALED
    BUF_NUM_FORMAT_UINT
    BUF_NUM_FORMAT_SINT
    BUF_NUM_FORMAT_SNORM_OGL                  GFX7 only.
    BUF_NUM_FORMAT_RESERVED_6                 GFX8 and GFX9 only.
    BUF_NUM_FORMAT_FLOAT
    ========================================= ===============================

Examples:

.. parsed-literal::

    format:0
    format:127
    format:[BUF_DATA_FORMAT_16]
    format:[BUF_DATA_FORMAT_16,BUF_NUM_FORMAT_SSCALED]
    format:[BUF_NUM_FORMAT_FLOAT]

.. _amdgpu_synid_ufmt:

ufmt
~~~~

Specifies a unified format used by the operation.
The default format is BUF_FMT_8_UNORM.
GFX10 only.

    ========================================= ===============================================================
    Syntax                                    Description
    ========================================= ===============================================================
    format:{0..127}                           Use unified format specified as either an
                                              :ref:`integer number<amdgpu_synid_integer_number>` or an
                                              :ref:`absolute expression<amdgpu_synid_absolute_expression>`.
                                              Note that unified format numbers are not compatible with
                                              format numbers used for pre-GFX10 ISA.
    format:[<unified format>]                 Use the specified unified format.
    ========================================= ===============================================================

Unified format is a replacement for :ref:`data<amdgpu_synid_format_data>`
and :ref:`numeric<amdgpu_synid_format_num>` formats. For compatibility with older ISA,
:ref:`syntax with data and numeric formats<amdgpu_synid_fmt>` is still accepted
provided that the combination of formats can be mapped to a unified format.
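
The mapping rule can be sketched in Python as follows. This is only an illustration of how a
data/numeric format pair corresponds to a unified format name, not the assembler's
implementation: the function name ``unified_format`` and the helper tables are hypothetical,
and the set of supported combinations is transcribed from the table of unified formats in this
section.

.. code-block:: python

    _NUM_ALL = ("UNORM", "SNORM", "USCALED", "SSCALED", "UINT", "SINT", "FLOAT")
    _NUM_NO_FLOAT = _NUM_ALL[:-1]
    _NUM_INT_FLOAT = ("UINT", "SINT", "FLOAT")

    # Data format suffix -> numeric format suffixes that have a unified format.
    _SUPPORTED = {
        "8": _NUM_NO_FLOAT,
        "16": _NUM_ALL,
        "8_8": _NUM_NO_FLOAT,
        "32": _NUM_INT_FLOAT,
        "16_16": _NUM_ALL,
        "10_11_11": _NUM_ALL,
        "11_11_10": _NUM_ALL,
        "10_10_10_2": _NUM_NO_FLOAT,
        "2_10_10_10": _NUM_NO_FLOAT,
        "8_8_8_8": _NUM_NO_FLOAT,
        "32_32": _NUM_INT_FLOAT,
        "16_16_16_16": _NUM_ALL,
        "32_32_32": _NUM_INT_FLOAT,
        "32_32_32_32": _NUM_INT_FLOAT,
    }

    _DATA_PREFIX = "BUF_DATA_FORMAT_"
    _NUM_PREFIX = "BUF_NUM_FORMAT_"

    def unified_format(data_fmt="BUF_DATA_FORMAT_8", num_fmt="BUF_NUM_FORMAT_UNORM"):
        """Return the equivalent BUF_FMT_* name, or None if the pair has none."""
        if not data_fmt.startswith(_DATA_PREFIX) or not num_fmt.startswith(_NUM_PREFIX):
            return None
        data = data_fmt[len(_DATA_PREFIX):]
        num = num_fmt[len(_NUM_PREFIX):]
        if data == "INVALID":
            # The table pairs BUF_DATA_FORMAT_INVALID with BUF_NUM_FORMAT_UNORM only.
            return "BUF_FMT_INVALID" if num == "UNORM" else None
        if num in _SUPPORTED.get(data, ()):
            return "BUF_FMT_%s_%s" % (data, num)
        return None

With the default arguments the sketch returns ``BUF_FMT_8_UNORM``, which matches the default
unified format stated above. A pair such as BUF_DATA_FORMAT_32 with BUF_NUM_FORMAT_UNORM
returns ``None`` because the table lists no such unified format.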

Supported unified formats and equivalent combinations of data and numeric formats
are defined below:

    ============================== ============================== =============================
    Syntax                         Equivalent Data Format         Equivalent Numeric Format
    ============================== ============================== =============================
    BUF_FMT_INVALID                BUF_DATA_FORMAT_INVALID        BUF_NUM_FORMAT_UNORM

    BUF_FMT_8_UNORM                BUF_DATA_FORMAT_8              BUF_NUM_FORMAT_UNORM
    BUF_FMT_8_SNORM                BUF_DATA_FORMAT_8              BUF_NUM_FORMAT_SNORM
    BUF_FMT_8_USCALED              BUF_DATA_FORMAT_8              BUF_NUM_FORMAT_USCALED
    BUF_FMT_8_SSCALED              BUF_DATA_FORMAT_8              BUF_NUM_FORMAT_SSCALED
    BUF_FMT_8_UINT                 BUF_DATA_FORMAT_8              BUF_NUM_FORMAT_UINT
    BUF_FMT_8_SINT                 BUF_DATA_FORMAT_8              BUF_NUM_FORMAT_SINT

    BUF_FMT_16_UNORM               BUF_DATA_FORMAT_16             BUF_NUM_FORMAT_UNORM
    BUF_FMT_16_SNORM               BUF_DATA_FORMAT_16             BUF_NUM_FORMAT_SNORM
    BUF_FMT_16_USCALED             BUF_DATA_FORMAT_16             BUF_NUM_FORMAT_USCALED
    BUF_FMT_16_SSCALED             BUF_DATA_FORMAT_16             BUF_NUM_FORMAT_SSCALED
    BUF_FMT_16_UINT                BUF_DATA_FORMAT_16             BUF_NUM_FORMAT_UINT
    BUF_FMT_16_SINT                BUF_DATA_FORMAT_16             BUF_NUM_FORMAT_SINT
    BUF_FMT_16_FLOAT               BUF_DATA_FORMAT_16             BUF_NUM_FORMAT_FLOAT

    BUF_FMT_8_8_UNORM              BUF_DATA_FORMAT_8_8            BUF_NUM_FORMAT_UNORM
    BUF_FMT_8_8_SNORM              BUF_DATA_FORMAT_8_8            BUF_NUM_FORMAT_SNORM
    BUF_FMT_8_8_USCALED            BUF_DATA_FORMAT_8_8            BUF_NUM_FORMAT_USCALED
    BUF_FMT_8_8_SSCALED            BUF_DATA_FORMAT_8_8            BUF_NUM_FORMAT_SSCALED
    BUF_FMT_8_8_UINT               BUF_DATA_FORMAT_8_8            BUF_NUM_FORMAT_UINT
    BUF_FMT_8_8_SINT               BUF_DATA_FORMAT_8_8            BUF_NUM_FORMAT_SINT

    BUF_FMT_32_UINT                BUF_DATA_FORMAT_32             BUF_NUM_FORMAT_UINT
    BUF_FMT_32_SINT                BUF_DATA_FORMAT_32             BUF_NUM_FORMAT_SINT
    BUF_FMT_32_FLOAT               BUF_DATA_FORMAT_32             BUF_NUM_FORMAT_FLOAT

    BUF_FMT_16_16_UNORM            BUF_DATA_FORMAT_16_16          BUF_NUM_FORMAT_UNORM
    BUF_FMT_16_16_SNORM            BUF_DATA_FORMAT_16_16          BUF_NUM_FORMAT_SNORM
    BUF_FMT_16_16_USCALED          BUF_DATA_FORMAT_16_16          BUF_NUM_FORMAT_USCALED
    BUF_FMT_16_16_SSCALED          BUF_DATA_FORMAT_16_16          BUF_NUM_FORMAT_SSCALED
    BUF_FMT_16_16_UINT             BUF_DATA_FORMAT_16_16          BUF_NUM_FORMAT_UINT
    BUF_FMT_16_16_SINT             BUF_DATA_FORMAT_16_16          BUF_NUM_FORMAT_SINT
    BUF_FMT_16_16_FLOAT            BUF_DATA_FORMAT_16_16          BUF_NUM_FORMAT_FLOAT

    BUF_FMT_10_11_11_UNORM         BUF_DATA_FORMAT_10_11_11       BUF_NUM_FORMAT_UNORM
    BUF_FMT_10_11_11_SNORM         BUF_DATA_FORMAT_10_11_11       BUF_NUM_FORMAT_SNORM
    BUF_FMT_10_11_11_USCALED       BUF_DATA_FORMAT_10_11_11       BUF_NUM_FORMAT_USCALED
    BUF_FMT_10_11_11_SSCALED       BUF_DATA_FORMAT_10_11_11       BUF_NUM_FORMAT_SSCALED
    BUF_FMT_10_11_11_UINT          BUF_DATA_FORMAT_10_11_11       BUF_NUM_FORMAT_UINT
    BUF_FMT_10_11_11_SINT          BUF_DATA_FORMAT_10_11_11       BUF_NUM_FORMAT_SINT
    BUF_FMT_10_11_11_FLOAT         BUF_DATA_FORMAT_10_11_11       BUF_NUM_FORMAT_FLOAT

    BUF_FMT_11_11_10_UNORM         BUF_DATA_FORMAT_11_11_10       BUF_NUM_FORMAT_UNORM
    BUF_FMT_11_11_10_SNORM         BUF_DATA_FORMAT_11_11_10       BUF_NUM_FORMAT_SNORM
    BUF_FMT_11_11_10_USCALED       BUF_DATA_FORMAT_11_11_10       BUF_NUM_FORMAT_USCALED
    BUF_FMT_11_11_10_SSCALED       BUF_DATA_FORMAT_11_11_10       BUF_NUM_FORMAT_SSCALED
    BUF_FMT_11_11_10_UINT          BUF_DATA_FORMAT_11_11_10       BUF_NUM_FORMAT_UINT
    BUF_FMT_11_11_10_SINT          BUF_DATA_FORMAT_11_11_10       BUF_NUM_FORMAT_SINT
    BUF_FMT_11_11_10_FLOAT         BUF_DATA_FORMAT_11_11_10       BUF_NUM_FORMAT_FLOAT

    BUF_FMT_10_10_10_2_UNORM       BUF_DATA_FORMAT_10_10_10_2     BUF_NUM_FORMAT_UNORM
    BUF_FMT_10_10_10_2_SNORM       BUF_DATA_FORMAT_10_10_10_2     BUF_NUM_FORMAT_SNORM
    BUF_FMT_10_10_10_2_USCALED     BUF_DATA_FORMAT_10_10_10_2     BUF_NUM_FORMAT_USCALED
    BUF_FMT_10_10_10_2_SSCALED     BUF_DATA_FORMAT_10_10_10_2     BUF_NUM_FORMAT_SSCALED
    BUF_FMT_10_10_10_2_UINT        BUF_DATA_FORMAT_10_10_10_2     BUF_NUM_FORMAT_UINT
    BUF_FMT_10_10_10_2_SINT        BUF_DATA_FORMAT_10_10_10_2     BUF_NUM_FORMAT_SINT

    BUF_FMT_2_10_10_10_UNORM       BUF_DATA_FORMAT_2_10_10_10     BUF_NUM_FORMAT_UNORM
    BUF_FMT_2_10_10_10_SNORM       BUF_DATA_FORMAT_2_10_10_10     BUF_NUM_FORMAT_SNORM
    BUF_FMT_2_10_10_10_USCALED     BUF_DATA_FORMAT_2_10_10_10     BUF_NUM_FORMAT_USCALED
    BUF_FMT_2_10_10_10_SSCALED     BUF_DATA_FORMAT_2_10_10_10     BUF_NUM_FORMAT_SSCALED
    BUF_FMT_2_10_10_10_UINT        BUF_DATA_FORMAT_2_10_10_10     BUF_NUM_FORMAT_UINT
    BUF_FMT_2_10_10_10_SINT        BUF_DATA_FORMAT_2_10_10_10     BUF_NUM_FORMAT_SINT

    BUF_FMT_8_8_8_8_UNORM          BUF_DATA_FORMAT_8_8_8_8        BUF_NUM_FORMAT_UNORM
    BUF_FMT_8_8_8_8_SNORM          BUF_DATA_FORMAT_8_8_8_8        BUF_NUM_FORMAT_SNORM
    BUF_FMT_8_8_8_8_USCALED        BUF_DATA_FORMAT_8_8_8_8        BUF_NUM_FORMAT_USCALED
    BUF_FMT_8_8_8_8_SSCALED        BUF_DATA_FORMAT_8_8_8_8        BUF_NUM_FORMAT_SSCALED
    BUF_FMT_8_8_8_8_UINT           BUF_DATA_FORMAT_8_8_8_8        BUF_NUM_FORMAT_UINT
    BUF_FMT_8_8_8_8_SINT           BUF_DATA_FORMAT_8_8_8_8        BUF_NUM_FORMAT_SINT

    BUF_FMT_32_32_UINT             BUF_DATA_FORMAT_32_32          BUF_NUM_FORMAT_UINT
    BUF_FMT_32_32_SINT             BUF_DATA_FORMAT_32_32          BUF_NUM_FORMAT_SINT
    BUF_FMT_32_32_FLOAT            BUF_DATA_FORMAT_32_32          BUF_NUM_FORMAT_FLOAT

    BUF_FMT_16_16_16_16_UNORM      BUF_DATA_FORMAT_16_16_16_16    BUF_NUM_FORMAT_UNORM
    BUF_FMT_16_16_16_16_SNORM      BUF_DATA_FORMAT_16_16_16_16    BUF_NUM_FORMAT_SNORM
    BUF_FMT_16_16_16_16_USCALED    BUF_DATA_FORMAT_16_16_16_16    BUF_NUM_FORMAT_USCALED
    BUF_FMT_16_16_16_16_SSCALED    BUF_DATA_FORMAT_16_16_16_16    BUF_NUM_FORMAT_SSCALED
    BUF_FMT_16_16_16_16_UINT       BUF_DATA_FORMAT_16_16_16_16    BUF_NUM_FORMAT_UINT
    BUF_FMT_16_16_16_16_SINT       BUF_DATA_FORMAT_16_16_16_16    BUF_NUM_FORMAT_SINT
    BUF_FMT_16_16_16_16_FLOAT      BUF_DATA_FORMAT_16_16_16_16    BUF_NUM_FORMAT_FLOAT

    BUF_FMT_32_32_32_UINT          BUF_DATA_FORMAT_32_32_32       BUF_NUM_FORMAT_UINT
    BUF_FMT_32_32_32_SINT          BUF_DATA_FORMAT_32_32_32       BUF_NUM_FORMAT_SINT
    BUF_FMT_32_32_32_FLOAT         BUF_DATA_FORMAT_32_32_32       BUF_NUM_FORMAT_FLOAT
    BUF_FMT_32_32_32_32_UINT       BUF_DATA_FORMAT_32_32_32_32    BUF_NUM_FORMAT_UINT
    BUF_FMT_32_32_32_32_SINT       BUF_DATA_FORMAT_32_32_32_32    BUF_NUM_FORMAT_SINT
    BUF_FMT_32_32_32_32_FLOAT      BUF_DATA_FORMAT_32_32_32_32    BUF_NUM_FORMAT_FLOAT
    ============================== ============================== =============================

Examples:

.. parsed-literal::

    format:0
    format:[BUF_FMT_32_UINT]

SMRD/SMEM Modifiers
-------------------

glc
~~~

See a description :ref:`here<amdgpu_synid_glc>`.

nv
~~

See a description :ref:`here<amdgpu_synid_nv>`. GFX9 only.

dlc
~~~

See a description :ref:`here<amdgpu_synid_dlc>`. GFX10 only.

VINTRP Modifiers
----------------

.. _amdgpu_synid_high:

high
~~~~

Specifies which half of the LDS word to use. Low half of LDS word is used by default.
GFX9 and GFX10 only.

    ======================================== ================================
    Syntax                                   Description
    ======================================== ================================
    high                                     Use high half of LDS word.
    ======================================== ================================

DPP8 Modifiers
--------------

GFX10 only.

.. _amdgpu_synid_dpp8_sel:

dpp8_sel
~~~~~~~~

Selects which lanes to pull data from, within a group of 8 lanes. This is a mandatory modifier.
There is no default value.

GFX10 only.

The *dpp8_sel* modifier must specify exactly 8 values.
The first value selects which lane to read from to supply data into lane 0.
The second value controls lane 1 and so on.

Each value may be specified as either
an :ref:`integer number<amdgpu_synid_integer_number>` or
an :ref:`absolute expression<amdgpu_synid_absolute_expression>`.

    =============================================================== ===========================
    Syntax                                                          Description
    =============================================================== ===========================
    dpp8:[{0..7},{0..7},{0..7},{0..7},{0..7},{0..7},{0..7},{0..7}]  Select lanes to read from.
    =============================================================== ===========================

Examples:

.. parsed-literal::

    dpp8:[7,6,5,4,3,2,1,0]
    dpp8:[0,1,0,1,0,1,0,1]

.. _amdgpu_synid_fi8:

fi
~~

Controls interaction with inactive lanes for *dpp8* instructions. The default value is zero.

Note: *inactive* lanes are those whose :ref:`exec<amdgpu_synid_exec>` mask bit is zero.

GFX10 only.

    ==================================== =====================================================
    Syntax                               Description
    ==================================== =====================================================
    fi:0                                 Fetch zero when accessing data from inactive lanes.
    fi:1                                 Fetch pre-existing values from inactive lanes.
    ==================================== =====================================================

Note: numeric values may be specified as either :ref:`integer numbers<amdgpu_synid_integer_number>` or
:ref:`absolute expressions<amdgpu_synid_absolute_expression>`.

DPP/DPP16 Modifiers
-------------------

GFX8, GFX9 and GFX10 only.

.. _amdgpu_synid_dpp_ctrl:

dpp_ctrl
~~~~~~~~

Specifies how data are shared between threads. This is a mandatory modifier.
There is no default value.

GFX8 and GFX9 only. Use :ref:`dpp16_ctrl<amdgpu_synid_dpp16_ctrl>` for GFX10.

Note: the lanes of a wavefront are organized in four *rows* and four *banks*.

    ======================================== ================================================
    Syntax                                   Description
    ======================================== ================================================
    quad_perm:[{0..3},{0..3},{0..3},{0..3}]  Full permute of 4 threads.
    row_mirror                               Mirror threads within row.
    row_half_mirror                          Mirror threads within 1/2 row (8 threads).
    row_bcast:15                             Broadcast 15th thread of each row to next row.
    row_bcast:31                             Broadcast thread 31 to rows 2 and 3.
    wave_shl:1                               Wavefront left shift by 1 thread.
1061 wave_rol:1 Wavefront left rotate by 1 thread. 1062 wave_shr:1 Wavefront right shift by 1 thread. 1063 wave_ror:1 Wavefront right rotate by 1 thread. 1064 row_shl:{1..15} Row shift left by 1-15 threads. 1065 row_shr:{1..15} Row shift right by 1-15 threads. 1066 row_ror:{1..15} Row rotate right by 1-15 threads. 1067 ======================================== ================================================ 1068 1069Note: numeric values may be specified as either 1070:ref:`integer numbers<amdgpu_synid_integer_number>` or 1071:ref:`absolute expressions<amdgpu_synid_absolute_expression>`. 1072 1073Examples: 1074 1075.. parsed-literal:: 1076 1077 quad_perm:[0, 1, 2, 3] 1078 row_shl:3 1079 1080.. _amdgpu_synid_dpp16_ctrl: 1081 1082dpp16_ctrl 1083~~~~~~~~~~ 1084 1085Specifies how data are shared between threads. This is a mandatory modifier. 1086There is no default value. 1087 1088GFX10 only. Use :ref:`dpp_ctrl<amdgpu_synid_dpp_ctrl>` for GFX8 and GFX9. 1089 1090Note: the lanes of a wavefront are organized in four *rows* and four *banks*. 1091(There are only two rows in *wave32* mode.) 1092 1093 ======================================== ==================================================== 1094 Syntax Description 1095 ======================================== ==================================================== 1096 quad_perm:[{0..3},{0..3},{0..3},{0..3}] Full permute of 4 threads. 1097 row_mirror Mirror threads within row. 1098 row_half_mirror Mirror threads within 1/2 row (8 threads). 1099 row_share:{0..15} Share the value from the specified lane with other 1100 lanes in the row. 1101 row_xmask:{0..15} Fetch from XOR(current lane id, specified lane id). 1102 row_shl:{1..15} Row shift left by 1-15 threads. 1103 row_shr:{1..15} Row shift right by 1-15 threads. 1104 row_ror:{1..15} Row rotate right by 1-15 threads. 
1105 ======================================== ==================================================== 1106 1107Note: numeric values may be specified as either 1108:ref:`integer numbers<amdgpu_synid_integer_number>` or 1109:ref:`absolute expressions<amdgpu_synid_absolute_expression>`. 1110 1111Examples: 1112 1113.. parsed-literal:: 1114 1115 quad_perm:[0, 1, 2, 3] 1116 row_shl:3 1117 1118.. _amdgpu_synid_row_mask: 1119 1120row_mask 1121~~~~~~~~ 1122 1123Controls which rows are enabled for data sharing. By default, all rows are enabled. 1124 1125Note: the lanes of a wavefront are organized in four *rows* and four *banks*. 1126(There are only two rows in *wave32* mode.) 1127 1128 ================= ==================================================================== 1129 Syntax Description 1130 ================= ==================================================================== 1131 row_mask:{0..15} Specifies a *row mask* as a positive 1132 :ref:`integer number <amdgpu_synid_integer_number>` 1133 or an :ref:`absolute expression<amdgpu_synid_absolute_expression>`. 1134 1135 Each of 4 bits in the mask controls one row 1136 (0 - disabled, 1 - enabled). 1137 1138 In *wave32* mode the values should be limited to 0..7. 1139 ================= ==================================================================== 1140 1141Examples: 1142 1143.. parsed-literal:: 1144 1145 row_mask:0xf 1146 row_mask:0b1010 1147 row_mask:x|y 1148 1149.. _amdgpu_synid_bank_mask: 1150 1151bank_mask 1152~~~~~~~~~ 1153 1154Controls which banks are enabled for data sharing. By default, all banks are enabled. 1155 1156Note: the lanes of a wavefront are organized in four *rows* and four *banks*. 1157(There are only two rows in *wave32* mode.) 
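
As an illustration only, the following Python sketch lists the lanes enabled by
a given pair of row and bank masks. It assumes a lane layout that this document
does not spell out: a row is taken to be 16 consecutive lanes and a bank
4 consecutive lanes within a row. The function name ``enabled_lanes`` is
hypothetical and is not part of any assembler API.

.. code-block:: python

  def enabled_lanes(row_mask=0xF, bank_mask=0xF, wave_size=64):
      """Return the lane ids left enabled by a row mask and a bank mask.

      Assumption (not taken from this document): lane L belongs to
      row L // 16 and to bank (L % 16) // 4 within that row.
      """
      lanes = []
      for lane in range(wave_size):
          row = lane // 16
          bank = (lane % 16) // 4
          # A lane stays enabled only if both its row bit and its bank bit are 1.
          if (row_mask >> row) & 1 and (bank_mask >> bank) & 1:
              lanes.append(lane)
      return lanes

Under these assumptions, ``row_mask:0b1010`` with the default bank mask would
enable lanes 16..31 and 48..63.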

    ================== ====================================================================
    Syntax             Description
    ================== ====================================================================
    bank_mask:{0..15}  Specifies a *bank mask* as a positive
                       :ref:`integer number <amdgpu_synid_integer_number>`
                       or an :ref:`absolute expression<amdgpu_synid_absolute_expression>`.

                       Each of 4 bits in the mask controls one bank
                       (0 - disabled, 1 - enabled).
    ================== ====================================================================

Examples:

.. parsed-literal::

  bank_mask:0x3
  bank_mask:0b0011
  bank_mask:x&y

.. _amdgpu_synid_bound_ctrl:

bound_ctrl
~~~~~~~~~~

Controls data sharing when accessing an invalid lane. By default, data sharing with
invalid lanes is disabled.

    ======================================== ================================================
    Syntax                                   Description
    ======================================== ================================================
    bound_ctrl:0                             Enables data sharing with invalid lanes.

                                             Accessing data from an invalid lane will
                                             return zero.
    ======================================== ================================================

.. _amdgpu_synid_fi16:

fi
~~

Controls interaction with *inactive* lanes for *dpp16* instructions. The default value is zero.

Note: *inactive* lanes are those whose :ref:`exec<amdgpu_synid_exec>` mask bit is zero.

GFX10 only.

    ======================================== ==================================================
    Syntax                                   Description
    ======================================== ==================================================
    fi:0                                     Interaction with inactive lanes is controlled by
                                             :ref:`bound_ctrl<amdgpu_synid_bound_ctrl>`.

    fi:1                                     Fetch pre-existing values from inactive lanes.
    ======================================== ==================================================

Note: numeric values may be specified as either :ref:`integer numbers<amdgpu_synid_integer_number>` or
:ref:`absolute expressions<amdgpu_synid_absolute_expression>`.

SDWA Modifiers
--------------

GFX8, GFX9 and GFX10 only.

clamp
~~~~~

See a description :ref:`here<amdgpu_synid_clamp>`.

omod
~~~~

See a description :ref:`here<amdgpu_synid_omod>`.

GFX9 and GFX10 only.

.. _amdgpu_synid_dst_sel:

dst_sel
~~~~~~~

Selects which bits in the destination are affected. By default, all bits are affected.

    ======================================== ================================================
    Syntax                                   Description
    ======================================== ================================================
    dst_sel:DWORD                            Use bits 31:0.
    dst_sel:BYTE_0                           Use bits 7:0.
    dst_sel:BYTE_1                           Use bits 15:8.
    dst_sel:BYTE_2                           Use bits 23:16.
    dst_sel:BYTE_3                           Use bits 31:24.
    dst_sel:WORD_0                           Use bits 15:0.
    dst_sel:WORD_1                           Use bits 31:16.
    ======================================== ================================================

.. _amdgpu_synid_dst_unused:

dst_unused
~~~~~~~~~~

Controls what to do with the bits in the destination which are not selected
by :ref:`dst_sel<amdgpu_synid_dst_sel>`.
By default, unused bits are preserved.

    ======================================== ================================================
    Syntax                                   Description
    ======================================== ================================================
    dst_unused:UNUSED_PAD                    Pad with zeros.
    dst_unused:UNUSED_SEXT                   Sign-extend upper bits, zero lower bits.
    dst_unused:UNUSED_PRESERVE               Preserve bits.
    ======================================== ================================================

.. _amdgpu_synid_src0_sel:

src0_sel
~~~~~~~~

Controls which bits in the src0 are used. By default, all bits are used.

    ======================================== ================================================
    Syntax                                   Description
    ======================================== ================================================
    src0_sel:DWORD                           Use bits 31:0.
    src0_sel:BYTE_0                          Use bits 7:0.
    src0_sel:BYTE_1                          Use bits 15:8.
    src0_sel:BYTE_2                          Use bits 23:16.
    src0_sel:BYTE_3                          Use bits 31:24.
    src0_sel:WORD_0                          Use bits 15:0.
    src0_sel:WORD_1                          Use bits 31:16.
    ======================================== ================================================

.. _amdgpu_synid_src1_sel:

src1_sel
~~~~~~~~

Controls which bits in the src1 are used. By default, all bits are used.

    ======================================== ================================================
    Syntax                                   Description
    ======================================== ================================================
    src1_sel:DWORD                           Use bits 31:0.
    src1_sel:BYTE_0                          Use bits 7:0.
    src1_sel:BYTE_1                          Use bits 15:8.
    src1_sel:BYTE_2                          Use bits 23:16.
    src1_sel:BYTE_3                          Use bits 31:24.
    src1_sel:WORD_0                          Use bits 15:0.
    src1_sel:WORD_1                          Use bits 31:16.
    ======================================== ================================================

.. _amdgpu_synid_sdwa_operand_modifiers:

SDWA Operand Modifiers
----------------------

Operand modifiers are not used separately. They are applied to source operands.

GFX8, GFX9 and GFX10 only.

abs
~~~

See a description :ref:`here<amdgpu_synid_abs>`.

neg
~~~

See a description :ref:`here<amdgpu_synid_neg>`.

.. _amdgpu_synid_sext:

sext
~~~~

Sign-extends the value of a (sub-dword) operand to fill all 32 bits.
Has no effect for 32-bit operands.

Valid for integer operands only.

    ======================================== ================================================
    Syntax                                   Description
    ======================================== ================================================
    sext(<operand>)                          Sign-extend operand value.
    ======================================== ================================================

Examples:

.. parsed-literal::

  sext(v4)
  sext(v255)

VOP3 Modifiers
--------------

.. _amdgpu_synid_vop3_op_sel:

op_sel
~~~~~~

Selects the low [15:0] or high [31:16] operand bits for source and destination operands.
By default, low bits are used for all operands.

The number of values specified with the op_sel modifier must match the number of instruction
operands (both source and destination). The first value controls src0, the second value controls src1
and so on, except that the last value controls the destination.
The value 0 selects the low bits, while 1 selects the high bits.

Note: the op_sel modifier affects 16-bit operands only. For 32-bit operands the value specified
by op_sel must be 0.

GFX9 and GFX10 only.

    ======================================== ============================================================
    Syntax                                   Description
    ======================================== ============================================================
    op_sel:[{0..1},{0..1}]                   Select operand bits for instructions with 1 source operand.
    op_sel:[{0..1},{0..1},{0..1}]            Select operand bits for instructions with 2 source operands.
    op_sel:[{0..1},{0..1},{0..1},{0..1}]     Select operand bits for instructions with 3 source operands.
    ======================================== ============================================================

Note: numeric values may be specified as either
:ref:`integer numbers<amdgpu_synid_integer_number>` or
:ref:`absolute expressions<amdgpu_synid_absolute_expression>`.

Examples:

.. parsed-literal::

  op_sel:[0,0]
  op_sel:[0,1]

.. _amdgpu_synid_clamp:

clamp
~~~~~

The meaning of clamp depends on the instruction.

For *v_cmp* instructions, the clamp modifier indicates that the compare signals
if a floating-point exception occurs. By default, signaling is disabled.
Not supported by GFX7.

For integer operations, the clamp modifier indicates that the result must be clamped
to the largest and smallest representable value. By default, there is no clamping.
Integer clamping is not supported by GFX7.

For floating-point operations, the clamp modifier indicates that the result must be clamped
to the range [0.0, 1.0]. By default, there is no clamping.

Note: the clamp modifier is applied after :ref:`output modifiers<amdgpu_synid_omod>` (if any).

    ======================================== ================================================
    Syntax                                   Description
    ======================================== ================================================
    clamp                                    Enables clamping (or signaling).
    ======================================== ================================================

.. _amdgpu_synid_omod:

omod
~~~~

Specifies if an output modifier must be applied to the result.
By default, no output modifiers are applied.

Note: output modifiers are applied before :ref:`clamping<amdgpu_synid_clamp>` (if any).

Output modifiers are valid for f32 and f64 floating-point results only.
They must not be used with f16.

Note: *v_cvt_f16_f32* is an exception. This instruction produces an f16 result
but accepts output modifiers.

    ======================================== ================================================
    Syntax                                   Description
    ======================================== ================================================
    mul:2                                    Multiply the result by 2.
    mul:4                                    Multiply the result by 4.
    div:2                                    Multiply the result by 0.5.
    ======================================== ================================================

Note: numeric values may be specified as either :ref:`integer numbers<amdgpu_synid_integer_number>` or
:ref:`absolute expressions<amdgpu_synid_absolute_expression>`.

Examples:

.. parsed-literal::

  mul:2
  mul:x      // x must be equal to 2 or 4

.. _amdgpu_synid_vop3_operand_modifiers:

VOP3 Operand Modifiers
----------------------

Operand modifiers are not used separately. They are applied to source operands.

.. _amdgpu_synid_abs:

abs
~~~

Computes the absolute value of its operand. Must be applied before :ref:`neg<amdgpu_synid_neg>`
(if any). Valid for floating-point operands only.

    ======================================== ====================================================
    Syntax                                   Description
    ======================================== ====================================================
    abs(<operand>)                           Get the absolute value of a floating-point operand.
    \|<operand>|                             The same as above (an SP3 syntax).
    ======================================== ====================================================

Note: avoid using SP3 syntax with operands specified as expressions because the trailing '|'
may be misinterpreted. Such operands should be enclosed in additional parentheses as shown
in the examples below.

Examples:

.. parsed-literal::

  abs(v36)
  \|v36|
  abs(x|y)     // ok
  \|(x|y)|     // additional parentheses are required

.. _amdgpu_synid_neg:

neg
~~~

Computes the negative value of its operand. Must be applied after :ref:`abs<amdgpu_synid_abs>`
(if any). Valid for floating-point operands only.

    ================== ====================================================
    Syntax             Description
    ================== ====================================================
    neg(<operand>)     Get the negative value of a floating-point operand.
                       The operand may include an optional
                       :ref:`abs<amdgpu_synid_abs>` modifier.
    -<operand>         The same as above (an SP3 syntax).
    ================== ====================================================

Note: SP3 syntax is supported with limitations because of a potential ambiguity.
Currently it is allowed in the following cases:

* Before a register.
* Before an :ref:`abs<amdgpu_synid_abs>` modifier.
* Before an SP3 :ref:`abs<amdgpu_synid_abs>` modifier.

In all other cases "-" is handled as a part of an expression that follows the sign.

Examples:

.. parsed-literal::

  // Operands with negate modifiers
  neg(v[0])
  neg(1.0)
  neg(abs(v0))
  -v5
  -abs(v5)
  -\|v5|

  // Operands without negate modifiers
  -1
  -x+y

VOP3P Modifiers
---------------

This section describes modifiers of *regular* VOP3P instructions.

*v_mad_mix\** and *v_fma_mix\**
instructions use these modifiers :ref:`in a special manner<amdgpu_synid_mad_mix>`.

GFX9 and GFX10 only.

.. _amdgpu_synid_op_sel:

op_sel
~~~~~~

Selects the low [15:0] or high [31:16] operand bits as input to the operation
which results in the lower-half of the destination.
By default, low bits are used for all operands.
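
A minimal Python sketch of this selection, for illustration only. Source operands
are modeled as 32-bit integers and the helper names are hypothetical. The inputs
for the upper half of the destination follow
:ref:`op_sel_hi<amdgpu_synid_op_sel_hi>`, whose default is 1 for every operand.

.. code-block:: python

  def select_half(value, sel):
      """Return bits [15:0] of a 32-bit value if sel is 0, bits [31:16] if sel is 1."""
      return (value >> 16) & 0xFFFF if sel else value & 0xFFFF


  def packed_inputs(srcs, op_sel=None, op_sel_hi=None):
      """Return (lower_half_inputs, upper_half_inputs) for a packed operation.

      op_sel picks the inputs that produce the lower half of the destination;
      op_sel_hi picks the inputs that produce the upper half.
      """
      n = len(srcs)
      op_sel = [0] * n if op_sel is None else op_sel           # default: low bits
      op_sel_hi = [1] * n if op_sel_hi is None else op_sel_hi  # default: high bits
      if len(op_sel) != n or len(op_sel_hi) != n:
          raise ValueError("one value is required per source operand")
      lower = [select_half(v, s) for v, s in zip(srcs, op_sel)]
      upper = [select_half(v, s) for v, s in zip(srcs, op_sel_hi)]
      return lower, upper

With the defaults, each half of the destination is computed from the matching
half of every source operand.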

The number of values specified by the *op_sel* modifier must match the number of source
operands. The first value controls src0, the second value controls src1 and so on.

The value 0 selects the low bits, while 1 selects the high bits.

    ================================= =============================================================
    Syntax                            Description
    ================================= =============================================================
    op_sel:[{0..1}]                   Select operand bits for instructions with 1 source operand.
    op_sel:[{0..1},{0..1}]            Select operand bits for instructions with 2 source operands.
    op_sel:[{0..1},{0..1},{0..1}]     Select operand bits for instructions with 3 source operands.
    ================================= =============================================================

Note: numeric values may be specified as either
:ref:`integer numbers<amdgpu_synid_integer_number>` or
:ref:`absolute expressions<amdgpu_synid_absolute_expression>`.

Examples:

.. parsed-literal::

  op_sel:[0,0]
  op_sel:[0,1,0]

.. _amdgpu_synid_op_sel_hi:

op_sel_hi
~~~~~~~~~

Selects the low [15:0] or high [31:16] operand bits as input to the operation
which results in the upper-half of the destination.
By default, high bits are used for all operands.

The number of values specified by the *op_sel_hi* modifier must match the number of source
operands. The first value controls src0, the second value controls src1 and so on.

The value 0 selects the low bits, while 1 selects the high bits.

    =================================== =============================================================
    Syntax                              Description
    =================================== =============================================================
    op_sel_hi:[{0..1}]                  Select operand bits for instructions with 1 source operand.
    op_sel_hi:[{0..1},{0..1}]           Select operand bits for instructions with 2 source operands.
    op_sel_hi:[{0..1},{0..1},{0..1}]    Select operand bits for instructions with 3 source operands.
    =================================== =============================================================

Note: numeric values may be specified as either
:ref:`integer numbers<amdgpu_synid_integer_number>` or
:ref:`absolute expressions<amdgpu_synid_absolute_expression>`.

Examples:

.. parsed-literal::

  op_sel_hi:[0,0]
  op_sel_hi:[0,0,1]

.. _amdgpu_synid_neg_lo:

neg_lo
~~~~~~

Specifies whether to change sign of operand values selected by
:ref:`op_sel<amdgpu_synid_op_sel>`. These values are then used
as input to the operation which results in the lower-half of the destination.

The number of values specified by this modifier must match the number of source
operands. The first value controls src0, the second value controls src1 and so on.

The value 0 indicates that the corresponding operand value is used unmodified,
the value 1 indicates that the negative value of the operand must be used.

By default, operand values are used unmodified.

This modifier is valid for floating-point operands only.

    ================================ ==================================================================
    Syntax                           Description
    ================================ ==================================================================
    neg_lo:[{0..1}]                  Select affected operands for instructions with 1 source operand.
    neg_lo:[{0..1},{0..1}]           Select affected operands for instructions with 2 source operands.
    neg_lo:[{0..1},{0..1},{0..1}]    Select affected operands for instructions with 3 source operands.
    ================================ ==================================================================

Note: numeric values may be specified as either
:ref:`integer numbers<amdgpu_synid_integer_number>` or
:ref:`absolute expressions<amdgpu_synid_absolute_expression>`.

Examples:

.. parsed-literal::

  neg_lo:[0]
  neg_lo:[0,1]

.. _amdgpu_synid_neg_hi:

neg_hi
~~~~~~

Specifies whether to change sign of operand values selected by
:ref:`op_sel_hi<amdgpu_synid_op_sel_hi>`. These values are then used
as input to the operation which results in the upper-half of the destination.

The number of values specified by this modifier must match the number of source
operands. The first value controls src0, the second value controls src1 and so on.

The value 0 indicates that the corresponding operand value is used unmodified,
the value 1 indicates that the negative value of the operand must be used.

By default, operand values are used unmodified.

This modifier is valid for floating-point operands only.

    =============================== ==================================================================
    Syntax                          Description
    =============================== ==================================================================
    neg_hi:[{0..1}]                 Select affected operands for instructions with 1 source operand.
    neg_hi:[{0..1},{0..1}]          Select affected operands for instructions with 2 source operands.
    neg_hi:[{0..1},{0..1},{0..1}]   Select affected operands for instructions with 3 source operands.
    =============================== ==================================================================

Note: numeric values may be specified as either
:ref:`integer numbers<amdgpu_synid_integer_number>` or
:ref:`absolute expressions<amdgpu_synid_absolute_expression>`.

Examples:

.. parsed-literal::

  neg_hi:[1,0]
  neg_hi:[0,1,1]

clamp
~~~~~

See a description :ref:`here<amdgpu_synid_clamp>`.

.. _amdgpu_synid_mad_mix:

VOP3P MAD_MIX/FMA_MIX Modifiers
-------------------------------

*v_mad_mix\** and *v_fma_mix\**
instructions use *op_sel* and *op_sel_hi* modifiers
in a manner different from *regular* VOP3P instructions.

See a description below.

GFX9 and GFX10 only.

.. _amdgpu_synid_mad_mix_op_sel:

m_op_sel
~~~~~~~~

This modifier has meaning only for 16-bit source operands as indicated by
:ref:`m_op_sel_hi<amdgpu_synid_mad_mix_op_sel_hi>`.
It selects either the low [15:0] or high [31:16] operand bits
as input to the operation.

The number of values specified by the *op_sel* modifier must match the number of source
operands. The first value controls src0, the second value controls src1 and so on.

The value 0 indicates the low bits, the value 1 indicates the high 16 bits.

By default, low bits are used for all operands.

    =============================== ================================================
    Syntax                          Description
    =============================== ================================================
    op_sel:[{0..1},{0..1},{0..1}]   Select location of each 16-bit source operand.
    =============================== ================================================

Note: numeric values may be specified as either
:ref:`integer numbers<amdgpu_synid_integer_number>` or
:ref:`absolute expressions<amdgpu_synid_absolute_expression>`.

Examples:

.. parsed-literal::

  op_sel:[0,1]

.. _amdgpu_synid_mad_mix_op_sel_hi:

m_op_sel_hi
~~~~~~~~~~~

Selects the size of source operands: either 32 bits or 16 bits.
By default, 32 bits are used for all source operands.
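
The following Python sketch illustrates how *op_sel_hi* and *op_sel* may combine
for one source operand of these instructions. It models bit selection only, not
any conversion the hardware performs on the selected bits, and the function name
``mix_source`` is hypothetical.

.. code-block:: python

  def mix_source(value, op_sel, op_sel_hi):
      """Return (width_in_bits, raw_bits) for one 32-bit source register value.

      op_sel_hi == 0: the operand is the full 32-bit value and op_sel is ignored.
      op_sel_hi == 1: the operand is 16 bits wide; op_sel chooses the low (0)
      or the high (1) half of the register.
      """
      if op_sel_hi == 0:
          return 32, value & 0xFFFFFFFF
      if op_sel == 0:
          return 16, value & 0xFFFF
      return 16, (value >> 16) & 0xFFFF

For a register holding ``0x3C004000``, the three possible selections are the
whole register, its low half ``0x4000``, or its high half ``0x3C00``.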

The number of values specified by the *op_sel_hi* modifier must match the number of source
operands. First value controls src0, second value controls src1 and so on.

The value 0 indicates 32 bits, the value 1 indicates 16 bits.

The location of 16 bits in the operand may be specified by
:ref:`m_op_sel<amdgpu_synid_mad_mix_op_sel>`.

    ======================================== ====================================
    Syntax                                   Description
    ======================================== ====================================
    op_sel_hi:[{0..1},{0..1},{0..1}]         Select size of each source operand.
    ======================================== ====================================

Note: numeric values may be specified as either
:ref:`integer numbers<amdgpu_synid_integer_number>` or
:ref:`absolute expressions<amdgpu_synid_absolute_expression>`.

Examples:

.. parsed-literal::

  op_sel_hi:[1,1,1]

abs
~~~

See a description :ref:`here<amdgpu_synid_abs>`.

neg
~~~

See a description :ref:`here<amdgpu_synid_neg>`.

clamp
~~~~~

See a description :ref:`here<amdgpu_synid_clamp>`.

VOP3P MFMA Modifiers
--------------------

.. _amdgpu_synid_cbsz:

cbsz
~~~~

    =============================== ==================================================================
    Syntax                          Description
    =============================== ==================================================================
    cbsz:[{0..7}]                   TBD
    =============================== ==================================================================

Note: numeric value may be specified as either
an :ref:`integer number<amdgpu_synid_integer_number>` or
an :ref:`absolute expression<amdgpu_synid_absolute_expression>`.

.. _amdgpu_synid_abid:

abid
~~~~

    =============================== ==================================================================
    Syntax                          Description
    =============================== ==================================================================
    abid:[{0..15}]                  TBD
    =============================== ==================================================================

Note: numeric value may be specified as either
an :ref:`integer number<amdgpu_synid_integer_number>` or
an :ref:`absolute expression<amdgpu_synid_absolute_expression>`.

.. _amdgpu_synid_blgp:

blgp
~~~~

    =============================== ==================================================================
    Syntax                          Description
    =============================== ==================================================================
    blgp:[{0..7}]                   TBD
    =============================== ==================================================================

Note: numeric value may be specified as either
an :ref:`integer number<amdgpu_synid_integer_number>` or
an :ref:`absolute expression<amdgpu_synid_absolute_expression>`.