
# Xbyak 5.78 ; JIT assembler for x86(IA32), x64(AMD64, x86-64) by C++

## Abstract

Xbyak is a header-only library that dynamically assembles x86(IA32) and x64(AMD64, x86-64) mnemonics.

## Features
* header file only
* Intel/MASM like syntax
* full support for AVX-512

**Note**: Xbyak uses and(), or(), xor(), not() functions, so the `-fno-operator-names` option is necessary for gcc/clang.

Or define `XBYAK_NO_OP_NAMES` before including `xbyak.h` and use and_(), or_(), xor_(), not_() instead of them.

and_(), or_(), xor_(), not_() are always available.

`XBYAK_NO_OP_NAMES` will be defined in a future version.

### Supported OS

* Windows XP, Vista, Windows 7, Windows 10(32bit, 64bit)
* Linux(32bit, 64bit)
* Intel macOS

### Supported Compilers

Almost all C++03 or later compilers for x86/x64 such as Visual Studio, g++, clang++, Intel C++ compiler and g++ on mingw/cygwin.

## Install

The following files are necessary. Add their directory to your compiler's include path.

* xbyak.h
* xbyak_mnemonic.h
* xbyak_util.h

Linux:
```
make install
```

These files are copied into `/usr/local/include/xbyak`.

## How to use it

Derive a class from `Xbyak::CodeGenerator` and generate code in its member functions (here, the constructor).
```
#define XBYAK_NO_OP_NAMES
#include <stdio.h> // for printf in the caller below
#include <xbyak/xbyak.h>

struct Code : Xbyak::CodeGenerator {
    Code(int x)
    {
        mov(eax, x);
        ret();
    }
};
```
Make an instance of the class, get the function
pointer by calling `getCode()`, and call it.
```
Code c(5);
int (*f)() = c.getCode<int (*)()>();
printf("ret=%d\n", f()); // ret = 5
```

## Syntax
Similar to MASM/NASM syntax with parentheses.

```
NASM              Xbyak
mov eax, ebx  --> mov(eax, ebx);
inc ecx       --> inc(ecx);
ret           --> ret();
```

## Addressing
Use `qword`, `dword`, `word` and `byte` if it is necessary to specify the size of memory,
otherwise use `ptr`.

```
(ptr|qword|dword|word|byte) [base + index * (1|2|4|8) + displacement]
                            [rip + 32bit disp] ; x64 only

NASM                   Xbyak
mov eax, [ebx+ecx] --> mov(eax, ptr [ebx+ecx]);
mov al, [ebx+ecx]  --> mov(al, ptr [ebx + ecx]);
test byte [esp], 4 --> test(byte [esp], 4);
inc qword [rax]    --> inc(qword [rax]);
```
**Note**: `qword`, ... are member variables, so don't use `dword` as the name of an unsigned int type.

### How to use Selector (Segment Register)
```
mov eax, [fs:eax] --> putSeg(fs);
                      mov(eax, ptr [eax]);
mov ax, cs        --> mov(ax, cs);
```
**Note**: The Segment class is not derived from `Operand`.

## AVX

```
vaddps(xmm1, xmm2, xmm3); // xmm1 <- xmm2 + xmm3
vaddps(xmm2, xmm3, ptr [rax]); // use ptr to access memory
vgatherdpd(xmm1, ptr [ebp + 256 + xmm2*4], xmm3);
```

**Note**:
If `XBYAK_ENABLE_OMITTED_OPERAND` is defined, then you can use the two-operand version for backward compatibility.
Newer versions will not support it.
```
vaddps(xmm2, xmm3); // xmm2 <- xmm2 + xmm3
```

## AVX-512

```
vaddpd zmm2, zmm5, zmm30                --> vaddpd(zmm2, zmm5, zmm30);
vaddpd xmm30, xmm20, [rax]              --> vaddpd(xmm30, xmm20, ptr [rax]);
vaddps xmm30, xmm20, [rax]              --> vaddps(xmm30, xmm20, ptr [rax]);
vaddpd zmm2{k5}, zmm4, zmm2             --> vaddpd(zmm2 | k5, zmm4, zmm2);
vaddpd zmm2{k5}{z}, zmm4, zmm2          --> vaddpd(zmm2 | k5 | T_z, zmm4, zmm2);
vaddpd zmm2{k5}{z}, zmm4, zmm2,{rd-sae} --> vaddpd(zmm2 | k5 | T_z, zmm4, zmm2 | T_rd_sae);
                                            vaddpd(zmm2 | k5 | T_z | T_rd_sae, zmm4, zmm2); // the position of `|` is arbitrary.
vcmppd k4{k3}, zmm1, zmm2, {sae}, 5     --> vcmppd(k4 | k3, zmm1, zmm2 | T_sae, 5);

vaddpd xmm1, xmm2, [rax+256]            --> vaddpd(xmm1, xmm2, ptr [rax+256]);
vaddpd xmm1, xmm2, [rax+256]{1to2}      --> vaddpd(xmm1, xmm2, ptr_b [rax+256]);
vaddpd ymm1, ymm2, [rax+256]{1to4}      --> vaddpd(ymm1, ymm2, ptr_b [rax+256]);
vaddpd zmm1, zmm2, [rax+256]{1to8}      --> vaddpd(zmm1, zmm2, ptr_b [rax+256]);
vaddps zmm1, zmm2, [rax+rcx*8+8]{1to16} --> vaddps(zmm1, zmm2, ptr_b [rax+rcx*8+8]);
vmovsd [rax]{k1}, xmm4                  --> vmovsd(ptr [rax] | k1, xmm4);

vcvtpd2dq xmm16, oword [eax+33]         --> vcvtpd2dq(xmm16, xword [eax+33]); // use xword for m128 instead of oword
                                            vcvtpd2dq(xmm16, ptr [eax+33]); // default xword
vcvtpd2dq xmm21, [eax+32]{1to2}         --> vcvtpd2dq(xmm21, ptr_b [eax+32]);
vcvtpd2dq xmm0, yword [eax+33]          --> vcvtpd2dq(xmm0, yword [eax+33]); // use yword for m256
vcvtpd2dq xmm19, [eax+32]{1to4}         --> vcvtpd2dq(xmm19, yword_b [eax+32]); // use yword_b to broadcast

vfpclassps k5{k3}, zword [rax+64], 5    --> vfpclassps(k5|k3, zword [rax+64], 5); // specify m512
vfpclasspd k5{k3}, [rax+64]{1to2}, 5    --> vfpclasspd(k5|k3, xword_b [rax+64], 5); // broadcast 64-bit to 128-bit
vfpclassps k5{k3}, [rax+64]{1to4}, 5    --> vfpclassps(k5|k3, yword_b [rax+64], 5); // broadcast 64-bit to 256-bit
```
### Remark
* `k1`, ..., `k7` are opmask registers.
* use `| T_z`, `| T_sae`, `| T_rn_sae`, `| T_rd_sae`, `| T_ru_sae`, `| T_rz_sae` instead of `,{z}`, `,{sae}`, `,{rn-sae}`, `,{rd-sae}`, `,{ru-sae}`, `,{rz-sae}` respectively.
* `k4 | k3` is different from `k3 | k4`.
* use `ptr_b` for broadcast `{1toX}`. X is automatically determined.
* specify `xword`/`yword`/`zword(_b)` for m128/m256/m512 if necessary.

## Label
Two kinds of labels are supported: string literals and the Label class.

### String literal
```
L("L1");
  jmp("L1");

  jmp("L2");
  ...
  a few mnemonics (8-bit displacement jmp)
  ...
L("L2");

  jmp("L3", T_NEAR);
  ...
  a lot of mnemonics (32-bit displacement jmp)
  ...
L("L3");
```

* Call `hasUndefinedLabel()` to verify your code has no undefined label.
* You can use a label as the immediate value of mov, as in `mov(eax, "L2")`.

### Support `@@`, `@f`, `@b` like MASM

```
L("@@"); // <A>
  jmp("@b"); // jmp to <A>
  jmp("@f"); // jmp to <B>
L("@@"); // <B>
  jmp("@b"); // jmp to <B>
  mov(eax, "@b");
  jmp(eax); // jmp to <B>
```

### Local label

Label symbols beginning with a period between `inLocalLabel()` and `outLocalLabel()`
are treated as local labels.
`inLocalLabel()` and `outLocalLabel()` can be nested.

```
void func1()
{
    inLocalLabel();
  L(".lp"); // <A> ; local label
    ...
    jmp(".lp"); // jmp to <A>
  L("aaa"); // global label <C>
    outLocalLabel();

    inLocalLabel();
  L(".lp"); // <B> ; local label
    func1();
    jmp(".lp"); // jmp to <B>
    outLocalLabel(); // close the second local scope
    jmp("aaa"); // jmp to <C>
}
```

### Label class

`L()` and `jxx()` support the Label class.

```
  Xbyak::Label label1, label2;
L(label1);
  ...
  jmp(label1);
  ...
  jmp(label2);
  ...
L(label2);
```

Use `putL` to build a jump table.
```
    Label labelTbl, L0, L1, L2;
    mov(rax, labelTbl);
    // rdx is an index of jump table
    jmp(ptr [rax + rdx * sizeof(void*)]);
L(labelTbl);
    putL(L0);
    putL(L1);
    putL(L2);
L(L0);
    ....
L(L1);
    ....
```

`assignL(dstLabel, srcLabel)` binds dstLabel to srcLabel.

```
  Label label2;
  Label label1 = L(); // make label1 ; same as Label label1; L(label1);
  ...
  jmp(label2); // label2 is not determined here
  ...
  assignL(label2, label1); // label2 <- label1
```
The `jmp` in the above code jumps to label1, which `assignL` assigned to label2.

**Note**:
* srcLabel must be used in `L()`.
* dstLabel must not be used in `L()`.

`Label::getAddress()` returns the address specified by the label instance, or 0 if it is not specified yet.
```
// not AutoGrow mode
Label label;
assert(label.getAddress() == 0);
L(label);
assert(label.getAddress() == getCurr());
```

### Rip-relative addressing
```
Label label;
mov(eax, ptr [rip + label]); // eax = 4
...

L(label);
dd(4);
```
```
int x;
...
  mov(eax, ptr[rip + &x]); // throw exception if the difference between &x and current position is larger than 2GiB
```

## Code size
The default max code size is 4096 bytes.
Specify the size in the constructor of `CodeGenerator()` if necessary.

```
class Quantize : public Xbyak::CodeGenerator {
public:
  Quantize()
    : CodeGenerator(8192)
  {
  }
  ...
};
```

## User allocated memory

You can generate JIT code in memory you have prepared yourself.

Call `setProtectModeRE` yourself to change the memory protection when using prepared memory.

```
alignas(4096) uint8_t buf[8192]; // C++11 or later

struct Code : Xbyak::CodeGenerator {
    Code() : Xbyak::CodeGenerator(sizeof(buf), buf)
    {
        mov(rax, 123);
        ret();
    }
};

int main()
{
    Code c;
    c.setProtectModeRE(); // set memory to Read/Exec
    printf("%d\n", c.getCode<int(*)()>()());
}
```

**Note**: See [sample/test0.cpp](sample/test0.cpp).

### AutoGrow

The memory region for JIT code is automatically extended if necessary when `AutoGrow` is specified in the constructor of `CodeGenerator`.

Call `ready()` or `readyRE()` before calling `getCode()` to fix up jump addresses.
```
struct Code : Xbyak::CodeGenerator {
  Code()
    : Xbyak::CodeGenerator(<default memory size>, Xbyak::AutoGrow)
  {
     ...
  }
};
Code c;
// generate code for jit
c.ready(); // mode = Read/Write/Exec
```

**Note**:
* Don't use the address returned by `getCurr()` before calling `ready()` because it may be invalid.

### Read/Exec mode
Xbyak sets the memory to Read/Write/Exec mode to run JIT code.
If you want to use Read/Exec mode for security, then specify `DontSetProtectRWE` for `CodeGenerator` and
call `setProtectModeRE()` after generating the JIT code.

```
struct Code : Xbyak::CodeGenerator {
    Code()
        : Xbyak::CodeGenerator(4096, Xbyak::DontSetProtectRWE)
    {
        mov(eax, 123);
        ret();
    }
};

Code c;
c.setProtectModeRE();
...

```
Call `readyRE()` instead of `ready()` when using `AutoGrow` mode.
See [protect-re.cpp](sample/protect-re.cpp).

## Macro

* **XBYAK32** is defined on 32bit.
* **XBYAK64** is defined on 64bit.
* **XBYAK64_WIN** is defined on 64bit Windows(VC)
* **XBYAK64_GCC** is defined on 64bit gcc, cygwin
* define **XBYAK_NO_OP_NAMES** on gcc without `-fno-operator-names`
* define **XBYAK_ENABLE_OMITTED_OPERAND** if you use an omitted destination such as `vaddps(xmm2, xmm3);` (to be deprecated)
* define **XBYAK_UNDEF_JNL** if Bessel function jnl is defined as macro

## Sample

* [test0.cpp](sample/test0.cpp) ; tiny sample (x86, x64)
* [quantize.cpp](sample/quantize.cpp) ; JIT optimized quantization by fast division (x86 only)
* [calc.cpp](sample/calc.cpp) ; assemble and evaluate a given polynomial (x86, x64)
* [bf.cpp](sample/bf.cpp) ; JIT brainfuck (x86, x64)

## License

modified new BSD License
http://opensource.org/licenses/BSD-3-Clause

## History
* 2019/Apr/15 ver 5.78 rewrite Reg::changeBit() (thanks to MerryMage)
* 2019/Mar/06 ver 5.77 fix number of cores that share LLC cache by densamoilov
* 2019/Jan/17 ver 5.76 add Cpu::getNumCores() by shelleygoel
* 2018/Oct/31 ver 5.751 recover Xbyak::CastTo for compatibility
* 2018/Oct/29 ver 5.75 unlink LabelManager from Label when msg is destroyed
* 2018/Oct/21 ver 5.74 support RegRip +/- int. Xbyak::CastTo is removed
* 2018/Oct/15 util::AddressFrame uses push/pop instead of mov
* 2018/Sep/19 ver 5.73 fix evex encoding of vpslld, vpslldq, vpsllw, etc for (reg, mem, imm8)
* 2018/Sep/19 ver 5.72 fix the encoding of vinsertps for disp8N(Thanks to petercaday)
* 2018/Sep/04 ver 5.71 L() returns a new label instance
* 2018/Aug/27 ver 5.70 support setProtectMode() and DontUseProtect for read/exec setting
* 2018/Aug/24 ver 5.68 fix wrong VSIB encoding with vector index >= 16(thanks to petercaday)
* 2018/Aug/14 ver 5.67 remove mutable in Address ; fix setCacheHierarchy for cloud vm
* 2018/Jul/26 ver 5.661 support mingw64
* 2018/Jul/24 ver 5.66 add CodeArray::PROTECT_RE to mode of protect()
* 2018/Jun/26 ver 5.65 fix push(qword [mem])
* 2018/Mar/07 ver 5.64 fix zero division in Cpu() on some cpu
* 2018/Feb/14 ver 5.63 fix Cpu::setCacheHierarchy() and fix EvexModifierZero for clang<3.9(thanks to mgouicem)
* 2018/Feb/13 ver 5.62 Cpu::setCacheHierarchy() by mgouicem and rsdubtso
* 2018/Feb/07 ver 5.61 vmov* supports mem{k}{z}(I forgot it)
* 2018/Jan/24 ver 5.601 add xword, yword, etc. into Xbyak::util namespace
* 2018/Jan/05 ver 5.60 support AVX-512 for Ice lake(319433-030.pdf)
* 2017/Aug/22 ver 5.53 fix mpx encoding, add bnd() prefix
* 2017/Aug/18 ver 5.52 fix align (thanks to MerryMage)
* 2017/Aug/17 ver 5.51 add multi-byte nop and align() uses it(thanks to inolen)
* 2017/Aug/08 ver 5.50 add mpx(thanks to magurosan)
* 2017/Aug/08 ver 5.45 add sha(thanks to magurosan)
* 2017/Aug/08 ver 5.44 add prefetchw(thanks to rsdubtso)
* 2017/Jul/12 ver 5.432 reduce warnings of PVS studio
* 2017/Jul/09 ver 5.431 fix hasRex() (no effect) (thanks to drillsar)
* 2017/May/14 ver 5.43 fix CodeGenerator::resetSize() (thanks to gibbed)
* 2017/May/13 ver 5.42 add movs{b,w,d,q}
* 2017/Jan/26 ver 5.41 add prefetchwt1 and support for scale == 0(thanks to rsdubtso)
* 2016/Dec/14 ver 5.40 add Label::getAddress() method to get the pointer specified by the label
* 2016/Dec/09 ver 5.34 fix handling of negative offsets when encoding disp8N(thanks to rsdubtso)
* 2016/Dec/08 ver 5.33 fix encoding of vpbroadcast{b,w,d,q}, vpinsr{b,w}, vpextr{b,w} for disp8N
* 2016/Dec/01 ver 5.32 rename __xgetbv() to _xgetbv() to support clang for Visual Studio(thanks to freiro)
* 2016/Nov/27 ver 5.31 rename AVX512_4VNNI to AVX512_4VNNIW
* 2016/Nov/27 ver 5.30 add AVX512_4VNNI, AVX512_4FMAPS instructions(thanks to rsdubtso)
* 2016/Nov/26 ver 5.20 add detection of AVX512_4VNNI and AVX512_4FMAPS(thanks to rsdubtso)
* 2016/Nov/20 ver 5.11 lost vptest for ymm(thanks to gregory38)
* 2016/Nov/20 ver 5.10 add addressing [rip+&var]
* 2016/Sep/29 ver 5.03 fix detection ERR_INVALID_OPMASK_WITH_MEMORY(thanks to PVS-Studio)
* 2016/Aug/15 ver 5.02 xbyak does not include xbyak_bin2hex.h
* 2016/Aug/15 ver 5.011 fix detection of version of gcc 5.4
* 2016/Aug/03 ver 5.01 disable omitted operand
* 2016/Jun/24 ver 5.00 support avx-512 instruction set
* 2016/Jun/13 avx-512 add mask instructions
* 2016/May/05 ver 4.91 add detection of AVX-512 to Xbyak::util::Cpu
* 2016/Mar/14 ver 4.901 comment to ready() function(thanks to skmp)
* 2016/Feb/04 ver 4.90 add jcc(const void *addr);
* 2016/Jan/30 ver 4.89 vpblendvb supports ymm reg(thanks to John Funnell)
* 2016/Jan/24 ver 4.88 lea, cmov supports 16-bit register(thanks to whyisthisfieldhere)
* 2015/Oct/05 ver 4.87 support segment selectors
* 2015/Aug/18 ver 4.86 fix [rip + label] addressing with immediate value(thanks to whyisthisfieldhere)
* 2015/Aug/10 ver 4.85 Address::operator==() is not correct(thanks to inolen)
* 2015/Jun/22 ver 4.84 call() support variadic template if available(thanks to randomstuff)
* 2015/Jun/16 ver 4.83 support movbe(thanks to benvanik)
* 2015/May/24 ver 4.82 support detection of F16C
* 2015/Apr/25 ver 4.81 fix the condition to throw exception for setSize(thanks to whyisthisfieldhere)
* 2015/Apr/22 ver 4.80 rip supports label(thanks to whyisthisfieldhere)
* 2015/Jan/28 ver 4.71 support adcx, adox, cmpxchg, rdseed, stac
* 2014/Oct/14 ver 4.70 support MmapAllocator
* 2014/Jun/13 ver 4.62 disable warning of VC2014
* 2014/May/30 ver 4.61 support bt, bts, btr, btc
* 2014/May/28 ver 4.60 support vcvtph2ps, vcvtps2ph
* 2014/Apr/11 ver 4.52 add detection of rdrand
* 2014/Mar/25 ver 4.51 remove state information of unreferenced labels
* 2014/Mar/16 ver 4.50 support new Label
* 2014/Mar/05 ver 4.40 fix wrong detection of BMI/enhanced rep on VirtualBox
* 2013/Dec/03 ver 4.30 support Reg::cvt8(), cvt16(), cvt32(), cvt64()
* 2013/Oct/16 ver 4.21 label support std::string
* 2013/Jul/30 ver 4.20 [break backward compatibility] split Reg32e class into RegExp(base+index*scale+disp) and Reg32e(means Reg32 or Reg64)
* 2013/Jul/04 ver 4.10 [break backward compatibility] change the type of Xbyak::Error from enum to a class
* 2013/Jun/21 ver 4.02 add putL(LABEL) function to put the address of the label
* 2013/Jun/21 ver 4.01 vpsllw, vpslld, vpsllq, vpsraw, vpsrad, vpsrlw, vpsrld, vpsrlq support (ymm, ymm, xmm). support vpbroadcastb, vpbroadcastw, vpbroadcastd, vpbroadcastq(thanks to Gabest).
* 2013/May/30 ver 4.00 support AVX2, VEX-encoded GPR-instructions
* 2013/Mar/27 ver 3.80 support mov(reg, "label");
* 2013/Mar/13 ver 3.76 add cqo(), jcxz(), jecxz(), jrcxz()
* 2013/Jan/15 ver 3.75 add setSize() to modify generated code
* 2013/Jan/12 ver 3.74 add CodeGenerator::reset() ; add Allocator::useProtect()
* 2013/Jan/06 ver 3.73 use unordered_map if possible
* 2012/Dec/04 ver 3.72 eax, ebx, ... are member variables of CodeGenerator(revert), Xbyak::util::eax, ... are static const.
* 2012/Nov/17 ver 3.71 and_(), or_(), xor_(), not_() are available if XBYAK_NO_OP_NAMES is not defined.
* 2012/Nov/17 change eax, ebx, ptr and so on in CodeGenerator as static member and alias of them are defined in Xbyak::util.
* 2012/Nov/09 ver 3.70 XBYAK_NO_OP_NAMES macro is added to use and_() instead of and() (thanks to Mattias)
* 2012/Nov/01 ver 3.62 add fwait/fnwait/finit/fninit
* 2012/Nov/01 ver 3.61 add fldcw/fstcw
* 2012/May/03 ver 3.60 change interface of Allocator
* 2012/Mar/23 ver 3.51 fix userPtr mode
* 2012/Mar/19 ver 3.50 support AutoGrow mode
* 2011/Nov/09 ver 3.05 fix bit property of rip addressing / support movsxd
* 2011/Aug/15 ver 3.04 fix dealing with imm8 such as add(dword [ebp-8], 0xda); (thanks to lolcat)
* 2011/Jun/16 ver 3.03 fix __GNUC_PREREQ macro for Mac gcc(thanks to t_teruya)
* 2011/Apr/28 ver 3.02 do not use xgetbv on Mac gcc
* 2011/May/24 ver 3.01 fix typo of OSXSAVE
* 2011/May/23 ver 3.00 add vcmpeqps and so on
* 2011/Feb/16 ver 2.994 beta add vmovq for 32-bit mode(I forgot it)
* 2011/Feb/16 ver 2.993 beta remove cvtReg to avoid thread unsafe
* 2011/Feb/10 ver 2.992 beta support one argument syntax for fadd like nasm
* 2011/Feb/07 ver 2.991 beta fix pextrw reg, xmm, imm(Thanks to Gabest)
* 2011/Feb/04 ver 2.99 beta support AVX
* 2010/Dec/08 ver 2.31 fix ptr [rip + 32bit offset], support rdtscp
* 2010/Oct/19 ver 2.30 support pclmulqdq, aesdec, aesdeclast, aesenc, aesenclast, aesimc, aeskeygenassist
* 2010/Jun/07 ver 2.29 fix call(<label>)
* 2010/Jun/17 ver 2.28 move some member functions to public
* 2010/Jun/01 ver 2.27 support encoding of mov(reg64, imm) like yasm(not nasm)
* 2010/May/24 ver 2.26 fix sub(rsp, 1000)
* 2010/Apr/26 ver 2.25 add jc/jnc(I forgot to implement them...)
* 2010/Apr/16 ver 2.24 change the prototype of rewrite() method
* 2010/Apr/15 ver 2.23 fix align() and xbyak_util.h for Mac
* 2010/Feb/16 ver 2.22 fix inLocalLabel()/outLocalLabel()
* 2009/Dec/09 ver 2.21 support cygwin(gcc 4.3.2)
* 2009/Nov/28 support a part of FPU
* 2009/Jun/25 fix mov(qword[rax], imm); (thanks to Martin)
* 2009/Mar/10 fix redundant REX.W prefix on jmp/call reg64
* 2009/Feb/24 add movq reg64, mmx/xmm; movq mmx/xmm, reg64
* 2009/Feb/13 movd(xmm7, dword[eax]) drops 0x66 prefix (thanks to Gabest)
* 2008/Dec/30 fix call in short relative address(thanks to kato san)
* 2008/Sep/18 support @@, @f, @b and localization of label(thanks to nobu-q san)
* 2008/Sep/18 support (ptr[rip + 32bit offset]) (thanks to Dango-Chu san)
* 2008/Jun/03 fix align(). mov(ptr[eax],1) throws ERR_MEM_SIZE_IS_NOT_SPECIFIED.
* 2008/Jun/02 support memory interface allocated by user
* 2008/May/26 fix protect() to avoid invalid setting(thanks to shinichiro_h san)
* 2008/Apr/30 add cmpxchg16b, cdqe
* 2008/Apr/29 support x64
* 2008/Apr/14 code refactoring
* 2008/Mar/12 add bsr/bsf
* 2008/Feb/14 fix output of sub eax, 1234 (thanks to Robert)
* 2007/Nov/5 support lock, xadd, xchg
* 2007/Nov/2 support SSSE3/SSE4 (thanks to Dango-Chu san)
* 2007/Feb/4 fix the bug that no exception occurs when the offset of a jmp mnemonic without T_NEAR is over 127.
* 2007/Jan/21 fix the bug to create address like [disp]; select smaller representation for mov (eax|ax|al, [disp])
* 2007/Jan/4 first version

## Author
MITSUNARI Shigeo(herumi@nifty.com)
