[![Build Status](https://travis-ci.org/herumi/xbyak.png)](https://travis-ci.org/herumi/xbyak)

# Xbyak 5.96 ; JIT assembler for x86(IA32), x64(AMD64, x86-64) by C++

## Abstract

Xbyak is a header-only C++ library that dynamically assembles x86(IA32) and x64(AMD64, x86-64) mnemonics at run time.

## Feature
* header file only
* Intel/MASM like syntax
* fully support AVX-512

**Note**:
Use `and_()`, `or_()`, ... instead of `and()`, `or()`.
If you want to use `and()`, `or()`, ..., then specify the `-fno-operator-names` option to gcc/clang.

### News
- (breaks backward compatibility) `push(byte, imm)` (resp. `push(word, imm)`) forces `imm` to be cast to 8 (resp. 16) bits.
- (Windows) `#include <winsock2.h>` has been removed from xbyak.h, so add it explicitly if you need it.
- support exception-less mode; see [Exception-less mode](#exception-less-mode).
- `XBYAK_USE_MMAP_ALLOCATOR` is defined on Linux/macOS unless `XBYAK_DONT_USE_MMAP_ALLOCATOR` is defined.

### Supported OS

* Windows XP, Vista, Windows 7, Windows 10 (32bit, 64bit)
* Linux (32bit, 64bit)
* Intel macOS

### Supported Compilers

Almost all C++03 or later compilers for x86/x64, such as Visual Studio, g++, clang++, Intel C++ compiler, and g++ on mingw/cygwin.

## Install

The following files are necessary. Add the directory that contains them to your compiler's include path.

* xbyak.h
* xbyak_mnemonic.h
* xbyak_util.h

Linux:
```
make install
```

This copies the files into `/usr/local/include/xbyak`.

## How to use it

Inherit from the `Xbyak::CodeGenerator` class and emit the instructions in a member function (the constructor in this example).
```
#include <xbyak/xbyak.h>

struct Code : Xbyak::CodeGenerator {
    Code(int x)
    {
        mov(eax, x); // eax <- x (the return value)
        ret();
    }
};
```
Or you can pass an instance of `CodeGenerator` to a function without inheriting from it.
```
void genCode(Xbyak::CodeGenerator& code, int x) {
    using namespace Xbyak::util; // for eax
    code.mov(eax, x);
    code.ret();
}
```

Make an instance of the class, get the function pointer by calling `getCode()`, and call it.
```
Code c(5);
int (*f)() = c.getCode<int (*)()>();
printf("ret=%d\n", f()); // ret=5
```

## Syntax
Similar to MASM/NASM syntax, but written as function calls with parentheses.

```
NASM              Xbyak
mov eax, ebx  --> mov(eax, ebx);
inc ecx       --> inc(ecx);
ret           --> ret();
```

## Addressing
Use `qword`, `dword`, `word` or `byte` when the size of the memory operand must be specified;
otherwise use `ptr`.

```
(ptr|qword|dword|word|byte) [base + index * (1|2|4|8) + displacement]
                            [rip + 32bit disp] ; x64 only

NASM                   Xbyak
mov eax, [ebx+ecx] --> mov(eax, ptr [ebx+ecx]);
mov al, [ebx+ecx]  --> mov(al, ptr [ebx + ecx]);
test byte [esp], 4 --> test(byte [esp], 4);
inc qword [rax]    --> inc(qword [rax]);
```
**Note**: `qword`, `dword`, ... are member variables, so do not use `dword` as the name of an unsigned int type.
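
As a rough illustration of the address form above, the following plain C++ sketch computes the address that `[base + index * scale + displacement]` refers to and checks that the scale is one of the listed values. It is not Xbyak code; `isValidScale` and `effectiveAddress` are hypothetical helper names introduced only for this example.

```cpp
#include <cstdint>

// Hypothetical helpers for illustration only; they are not part of Xbyak.

// The addressing form above lists 1, 2, 4 and 8 as the possible scale factors.
constexpr bool isValidScale(int scale)
{
    return scale == 1 || scale == 2 || scale == 4 || scale == 8;
}

// Address referred to by [base + index * scale + disp].
// The displacement is signed, so it is sign-extended before the addition.
constexpr std::uint64_t effectiveAddress(std::uint64_t base, std::uint64_t index,
                                         int scale, std::int32_t disp)
{
    return base
        + index * static_cast<std::uint64_t>(scale)
        + static_cast<std::uint64_t>(static_cast<std::int64_t>(disp));
}

// Compile-time checks of the arithmetic.
static_assert(isValidScale(4), "4 is a listed scale");
static_assert(!isValidScale(3), "3 is not a listed scale");
static_assert(effectiveAddress(0x1000, 2, 8, 16) == 0x1020, "0x1000 + 2*8 + 16");
static_assert(effectiveAddress(0x1000, 0, 1, -8) == 0xFF8, "negative displacement");
```

For instance, `ptr [rax + rcx * 8 + 16]` with `rax = 0x1000` and `rcx = 2` would refer to address `0x1020`.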

### How to use Selector (Segment Register)
```
mov eax, [fs:eax] --> putSeg(fs);
                      mov(eax, ptr [eax]);
mov ax, cs        --> mov(ax, cs);
```
**Note**: The `Segment` class is not derived from `Operand`.

## AVX

```
vaddps(xmm1, xmm2, xmm3); // xmm1 <- xmm2 + xmm3
vaddps(xmm2, xmm3, ptr [rax]); // use ptr to access memory
vgatherdpd(xmm1, ptr [ebp + 256 + xmm2*4], xmm3);
```

**Note**:
If `XBYAK_ENABLE_OMITTED_OPERAND` is defined, then you can use the two-operand version for backward compatibility.
This form is deprecated and will not be supported in newer versions.
```
vaddps(xmm2, xmm3); // xmm2 <- xmm2 + xmm3
```

## AVX-512

```
vaddpd zmm2, zmm5, zmm30                --> vaddpd(zmm2, zmm5, zmm30);
vaddpd xmm30, xmm20, [rax]              --> vaddpd(xmm30, xmm20, ptr [rax]);
vaddps xmm30, xmm20, [rax]              --> vaddps(xmm30, xmm20, ptr [rax]);
vaddpd zmm2{k5}, zmm4, zmm2             --> vaddpd(zmm2 | k5, zmm4, zmm2);
vaddpd zmm2{k5}{z}, zmm4, zmm2          --> vaddpd(zmm2 | k5 | T_z, zmm4, zmm2);
vaddpd zmm2{k5}{z}, zmm4, zmm2,{rd-sae} --> vaddpd(zmm2 | k5 | T_z, zmm4, zmm2 | T_rd_sae);
                                            vaddpd(zmm2 | k5 | T_z | T_rd_sae, zmm4, zmm2); // the position of `|` is arbitrary.
vcmppd k4{k3}, zmm1, zmm2, {sae}, 5     --> vcmppd(k4 | k3, zmm1, zmm2 | T_sae, 5);

vaddpd xmm1, xmm2, [rax+256]            --> vaddpd(xmm1, xmm2, ptr [rax+256]);
vaddpd xmm1, xmm2, [rax+256]{1to2}      --> vaddpd(xmm1, xmm2, ptr_b [rax+256]);
vaddpd ymm1, ymm2, [rax+256]{1to4}      --> vaddpd(ymm1, ymm2, ptr_b [rax+256]);
vaddpd zmm1, zmm2, [rax+256]{1to8}      --> vaddpd(zmm1, zmm2, ptr_b [rax+256]);
vaddps zmm1, zmm2, [rax+rcx*8+8]{1to16} --> vaddps(zmm1, zmm2, ptr_b [rax+rcx*8+8]);
vmovsd [rax]{k1}, xmm4                  --> vmovsd(ptr [rax] | k1, xmm4);

vcvtpd2dq xmm16, oword [eax+33]         --> vcvtpd2dq(xmm16, xword [eax+33]); // use xword for m128 instead of oword
                                            vcvtpd2dq(xmm16, ptr [eax+33]); // default xword
vcvtpd2dq xmm21, [eax+32]{1to2}         --> vcvtpd2dq(xmm21, ptr_b [eax+32]);
vcvtpd2dq xmm0, yword [eax+33]          --> vcvtpd2dq(xmm0, yword [eax+33]); // use yword for m256
vcvtpd2dq xmm19, [eax+32]{1to4}         --> vcvtpd2dq(xmm19, yword_b [eax+32]); // use yword_b to broadcast

vfpclassps k5{k3}, zword [rax+64], 5    --> vfpclassps(k5|k3, zword [rax+64], 5); // specify m512
vfpclasspd k5{k3}, [rax+64]{1to2}, 5    --> vfpclasspd(k5|k3, xword_b [rax+64], 5); // broadcast 64-bit to 128-bit
vfpclassps k5{k3}, [rax+64]{1to4}, 5    --> vfpclassps(k5|k3, yword_b [rax+64], 5); // broadcast 64-bit to 256-bit
```
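
In the table above, `zmm2 | k5` corresponds to `zmm2{k5}` (merge-masking) and `zmm2 | k5 | T_z` to `zmm2{k5}{z}` (zero-masking). The following plain C++ sketch models what happens to a single lane of the destination in each case. It is only an illustration of the AVX-512 masking rule, not Xbyak code, and `maskedLane` is a hypothetical name.

```cpp
// Hypothetical per-lane model of AVX-512 write masking; not part of Xbyak.
//   maskBit  : the bit of the opmask register (k1..k7) that belongs to this lane
//   zeroing  : true when {z} is given (T_z in Xbyak)
//   oldDst   : value of the destination lane before the instruction
//   computed : value the instruction would write without masking
constexpr double maskedLane(bool maskBit, bool zeroing, double oldDst, double computed)
{
    return maskBit ? computed          // lane selected: take the new value
         : zeroing ? 0.0               // lane masked off with {z}: cleared
                   : oldDst;           // lane masked off without {z}: unchanged
}

static_assert(maskedLane(true,  false, 1.0, 7.0) == 7.0, "selected lane takes the result");
static_assert(maskedLane(false, false, 1.0, 7.0) == 1.0, "merge-masking keeps the old value");
static_assert(maskedLane(false, true,  1.0, 7.0) == 0.0, "zero-masking clears the lane");
```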
### Remark
* `k1`, ..., `k7` are opmask registers.
  - `k0` is treated as no mask.
  - e.g. `vmovaps(zmm0|k0, ptr[rax]);` and `vmovaps(zmm0|T_z, ptr[rax]);` are the same as `vmovaps(zmm0, ptr[rax]);`.
* Use `| T_z`, `| T_sae`, `| T_rn_sae`, `| T_rd_sae`, `| T_ru_sae`, `| T_rz_sae` instead of `,{z}`, `,{sae}`, `,{rn-sae}`, `,{rd-sae}`, `,{ru-sae}`, `,{rz-sae}` respectively.
* `k4 | k3` is different from `k3 | k4`.
* Use `ptr_b` for broadcast `{1toX}`. X is determined automatically.
* Specify `xword`/`yword`/`zword(_b)` for m128/m256/m512 if necessary.
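
The X in `{1toX}` is the number of elements needed to fill the vector width of the memory operand. The plain C++ sketch below reproduces the values in the `vaddpd`/`vaddps`/`vcvtpd2dq` rows above; `broadcastCount` is a hypothetical helper for illustration, not an Xbyak function.

```cpp
// Hypothetical helper, not part of Xbyak: the X of {1toX}.
//   vectorBits  : width of the vector the memory operand is broadcast to (128/256/512)
//   elementBits : width of one element (64 for *pd, 32 for *ps)
constexpr int broadcastCount(int vectorBits, int elementBits)
{
    return vectorBits / elementBits;
}

static_assert(broadcastCount(128, 64) == 2,  "vaddpd xmm, xmm, m64{1to2}");
static_assert(broadcastCount(256, 64) == 4,  "vaddpd ymm, ymm, m64{1to4}");
static_assert(broadcastCount(512, 64) == 8,  "vaddpd zmm, zmm, m64{1to8}");
static_assert(broadcastCount(512, 32) == 16, "vaddps zmm, zmm, m32{1to16}");
```

For `vcvtpd2dq` the destination is an `xmm` register for both the 128-bit and the 256-bit source, so the source width cannot be inferred from the registers alone; this appears to be why the table uses `yword_b` to select `{1to4}`.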

## Label
Two kinds of labels are supported: string literals and the `Label` class.

### String literal
```
L("L1");
  jmp("L1");

  jmp("L2");
  ...
  a few mnemonics (8-bit displacement jmp)
  ...
L("L2");

  jmp("L3", T_NEAR);
  ...
  a lot of mnemonics (32-bit displacement jmp)
  ...
L("L3");
```

* Call `hasUndefinedLabel()` to verify that your code has no undefined labels.
* You can use a label as the immediate value of `mov`, e.g. `mov(eax, "L2")`.

### Support `@@`, `@f`, `@b` like MASM

```
L("@@"); // <A>
  jmp("@b"); // jmp to <A>
  jmp("@f"); // jmp to <B>
L("@@"); // <B>
  jmp("@b"); // jmp to <B>
  mov(eax, "@b");
  jmp(eax); // jmp to <B>
```

### Local label

Labels whose names begin with a period are treated as local labels when they appear between `inLocalLabel()` and `outLocalLabel()`.
Pairs of `inLocalLabel()` and `outLocalLabel()` can be nested.

```
void func1()
{
    inLocalLabel();
  L(".lp"); // <A> ; local label
    ...
    jmp(".lp"); // jmp to <A>
  L("aaa"); // global label <C>
    outLocalLabel();

    inLocalLabel();
  L(".lp"); // <B> ; local label
    func1();
    jmp(".lp"); // jmp to <B>
    outLocalLabel();
    jmp("aaa"); // jmp to <C>
}
```

### short and long jump
Xbyak treats a jump to an undefined label as a short jump if no type is specified.
So if the distance between the jmp and the label is larger than 127 bytes, Xbyak raises an error.

```
jmp("short-jmp"); // short jmp
// small code
L("short-jmp");

jmp("long-jmp");
// long code
L("long-jmp"); // throw exception
```
In that case, specify `T_NEAR` for the jmp.
```
jmp("long-jmp", T_NEAR); // long jmp
// long code
L("long-jmp");
```
Or call `setDefaultJmpNEAR(true);` once; after that, the default type is `T_NEAR`.
```
jmp("long-jmp"); // long jmp
// long code
L("long-jmp");
```
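
The 127-byte limit comes from the encoding of a short jump, which stores the distance as a signed 8-bit value measured from the end of the jump instruction. A minimal plain C++ sketch of that range check follows; `fitsShortJump` is a hypothetical name used only here and is not an Xbyak function.

```cpp
#include <cstdint>

// Hypothetical helper, not part of Xbyak.
// disp is the distance in bytes from the end of the jmp instruction to the label
// (negative for a backward jump). A short jump can encode it only if it fits
// in a signed 8-bit value.
constexpr bool fitsShortJump(std::int64_t disp)
{
    return -128 <= disp && disp <= 127;
}

static_assert(fitsShortJump(127),   "largest forward short jump");
static_assert(!fitsShortJump(128),  "needs T_NEAR");
static_assert(fitsShortJump(-128),  "largest backward short jump");
static_assert(!fitsShortJump(-129), "needs T_NEAR");
```

For a label that is already defined when the jump is emitted, the distance is known at that point; the default matters for forward references such as `jmp("long-jmp")` above, where the size has to be chosen before the label is seen.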

### Label class

`L()` and `jxx()` support the `Label` class.

```
  Xbyak::Label label1, label2;
L(label1);
  ...
  jmp(label1);
  ...
  jmp(label2);
  ...
L(label2);
```

Use `putL` for a jump table.
```
    Label labelTbl, L0, L1, L2;
    mov(rax, labelTbl);
    // rdx is an index of jump table
    jmp(ptr [rax + rdx * sizeof(void*)]);
L(labelTbl);
    putL(L0);
    putL(L1);
    putL(L2);
L(L0);
    ....
L(L1);
    ....
L(L2);
    ....
```

`assignL(dstLabel, srcLabel)` binds `dstLabel` to `srcLabel`.

```
  Label label2;
  Label label1 = L(); // make label1 ; same as Label label1; L(label1);
  ...
  jmp(label2); // label2 is not determined here
  ...
  assignL(label2, label1); // label2 <- label1
```
The `jmp` in the above code jumps to `label1`, which is assigned to `label2` by `assignL`.

**Note**:
* srcLabel must be used in `L()`.
* dstLabel must not be used in `L()`.

`Label::getAddress()` returns the address bound to the label instance, or 0 if the label is not bound yet.
```
// not AutoGrow mode
Label  label;
assert(label.getAddress() == 0);
L(label);
assert(label.getAddress() == getCurr());
```

### Rip-relative addressing
```
Label label;
mov(eax, ptr [rip + label]); // eax = 4
...

L(label);
dd(4);
```
```
int x;
...
  mov(eax, ptr[rip + &x]); // throw exception if the difference between &x and current position is larger than 2GiB
```
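
The 2GiB limit follows from the `[rip + 32bit disp]` form: the distance to the target has to fit in a signed 32-bit displacement. A small plain C++ sketch of that check is shown below; `fitsRipDisp32` is a hypothetical name for illustration and is not part of Xbyak.

```cpp
#include <cstdint>
#include <limits>

// Hypothetical helper, not part of Xbyak.
// disp is the signed distance in bytes between the target (e.g. &x) and the
// position that rip refers to. RIP-relative addressing can encode it only if
// it fits in a signed 32-bit value.
constexpr bool fitsRipDisp32(std::int64_t disp)
{
    return std::numeric_limits<std::int32_t>::min() <= disp
        && disp <= std::numeric_limits<std::int32_t>::max();
}

static_assert(fitsRipDisp32(0x7fffffffLL),   "just below +2GiB");
static_assert(!fitsRipDisp32(0x80000000LL),  "+2GiB is out of range");
static_assert(fitsRipDisp32(-0x80000000LL),  "-2GiB is still encodable");
static_assert(!fitsRipDisp32(-0x80000001LL), "beyond -2GiB is out of range");
```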

## Code size
The default max code size is 4096 bytes.
Specify the size in the constructor of `CodeGenerator` if necessary.

```
class Quantize : public Xbyak::CodeGenerator {
public:
  Quantize()
    : CodeGenerator(8192)
  {
  }
  ...
};
```

## User allocated memory

You can generate JIT code into memory that you have prepared yourself.

When using your own memory, call `setProtectModeRE()` yourself to change its protection mode.

```
alignas(4096) uint8_t buf[8192]; // C++11 or later

struct Code : Xbyak::CodeGenerator {
    Code() : Xbyak::CodeGenerator(sizeof(buf), buf)
    {
        mov(rax, 123);
        ret();
    }
};

int main()
{
    Code c;
    c.setProtectModeRE(); // set memory to Read/Exec
    printf("%d\n", c.getCode<int(*)()>()());
}
```

**Note**: See [sample/test0.cpp](sample/test0.cpp).
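
The buffer in the example above is aligned to 4096 bytes, presumably because operating systems change memory protection one page at a time, so an unaligned buffer would share its pages with unrelated data. The plain C++ sketch below shows the alignment arithmetic under the assumption of a 4096-byte page (the real page size is platform-dependent). `kAssumedPageSize`, `isPageAligned` and `roundUpToPage` are hypothetical names, not Xbyak API.

```cpp
#include <cstddef>
#include <cstdint>

// Assumption for this sketch: pages are 4096 bytes, as in the alignas(4096) above.
constexpr std::size_t kAssumedPageSize = 4096;

// True if addr is the first byte of a page.
constexpr bool isPageAligned(std::uintptr_t addr)
{
    return addr % kAssumedPageSize == 0;
}

// Smallest multiple of the page size that is >= size.
constexpr std::size_t roundUpToPage(std::size_t size)
{
    return (size + kAssumedPageSize - 1) / kAssumedPageSize * kAssumedPageSize;
}

static_assert(isPageAligned(0x2000),        "0x2000 is a multiple of 4096");
static_assert(!isPageAligned(0x2010),       "0x2010 is inside a page");
static_assert(roundUpToPage(1) == 4096,     "one byte still occupies a page");
static_assert(roundUpToPage(4097) == 8192,  "4097 bytes need two pages");
static_assert(roundUpToPage(8192) == 8192,  "already a multiple of the page size");
```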

### AutoGrow

The memory region for JIT code is automatically extended if necessary when `AutoGrow` is specified in the constructor of `CodeGenerator`.

Call `ready()` or `readyRE()` before calling `getCode()` to resolve the jump addresses.
```
struct Code : Xbyak::CodeGenerator {
  Code()
    : Xbyak::CodeGenerator(<default memory size>, Xbyak::AutoGrow)
  {
     ...
  }
};
Code c;
// generate code for jit
c.ready(); // mode = Read/Write/Exec
```

**Note**:
* Don't use the address returned by `getCurr()` before calling `ready()` because it may be invalid.

### Read/Exec mode
By default, Xbyak sets the memory to Read/Write/Exec mode to run JIT code.
If you want to use Read/Exec mode for security, then specify `DontSetProtectRWE` for `CodeGenerator` and
call `setProtectModeRE()` after generating the JIT code.

```
struct Code : Xbyak::CodeGenerator {
    Code()
        : Xbyak::CodeGenerator(4096, Xbyak::DontSetProtectRWE)
    {
        mov(eax, 123);
        ret();
    }
};

Code c;
c.setProtectModeRE();
...
```
Call `readyRE()` instead of `ready()` when using `AutoGrow` mode.
See [protect-re.cpp](sample/protect-re.cpp).

## Exception-less mode
If `XBYAK_NO_EXCEPTION` is defined, then gcc/clang can compile xbyak with `-fno-exceptions`.
Instead of throwing an exception, `Xbyak::GetError()` returns a non-zero value (e.g. `ERR_BAD_ADDRESSING`) if something went wrong.
The status is not reset automatically, so you should reset it with `Xbyak::ClearError()`.
`CodeGenerator::reset()` calls `ClearError()`.

## Macro

* **XBYAK32** is defined on 32bit.
* **XBYAK64** is defined on 64bit.
* **XBYAK64_WIN** is defined on 64bit Windows(VC).
* **XBYAK64_GCC** is defined on 64bit gcc, cygwin.
* define **XBYAK_USE_OP_NAMES** on gcc with `-fno-operator-names` if you want to use `and()`, ....
* define **XBYAK_ENABLE_OMITTED_OPERAND** if you use an omitted destination such as `vaddps(xmm2, xmm3);` (to be deprecated in the future).
* define **XBYAK_UNDEF_JNL** if the Bessel function `jnl` is defined as a macro.
* define **XBYAK_NO_EXCEPTION** when compiling with the option `-fno-exceptions`.

## Sample

* [test0.cpp](sample/test0.cpp) ; tiny sample (x86, x64)
* [quantize.cpp](sample/quantize.cpp) ; JIT optimized quantization by fast division (x86 only)
* [calc.cpp](sample/calc.cpp) ; assemble and evaluate a given polynomial (x86, x64)
* [bf.cpp](sample/bf.cpp) ; JIT brainfuck (x86, x64)

## License

modified new BSD License
http://opensource.org/licenses/BSD-3-Clause

## History
* 2020/Aug/28 ver 5.95 some constructors of register classes support constexpr if C++14 or later
* 2020/Aug/04 ver 5.941 `CodeGenerator::reset()` calls `ClearError()`.
* 2020/Jul/28 ver 5.94 remove `#include <winsock2.h>` (only windows)
* 2020/Jul/21 ver 5.93 support exception-less mode
* 2020/Jun/30 ver 5.92 support Intel AMX instruction set (Thanks to nshustrov)
* 2020/Jun/22 ver 5.913 fix mov(r64, imm64) on 32-bit env with XBYAK64
* 2020/Jun/19 ver 5.912 define MAP_JIT on macOS regardless of Xcode version (Thanks to rsdubtso)
* 2020/May/10 ver 5.911 XBYAK_USE_MMAP_ALLOCATOR is defined unless XBYAK_DONT_USE_MMAP_ALLOCATOR is defined.
* 2020/Apr/20 ver 5.91 accept mask register k0 (it means no mask)
* 2020/Apr/09 ver 5.90 kmov{b,d,w,q} throws exception for an unsupported register
* 2020/Feb/26 ver 5.891 fix typo of type
* 2020/Jan/03 ver 5.89 fix error of vfpclasspd
* 2019/Dec/20 ver 5.88 fix compile error on Windows
* 2019/Dec/19 ver 5.87 add setDefaultJmpNEAR(), which deals with `jmp` of an undefined label as T_NEAR if no type is specified.
* 2019/Dec/13 ver 5.86 [changed] revert to the behavior before v5.84 if -fno-operator-names is defined (and() is available)
* 2019/Dec/07 ver 5.85 append MAP_JIT flag to mmap for macOS mojave or later
* 2019/Nov/29 ver 5.84 [changed] XBYAK_NO_OP_NAMES is defined unless XBYAK_USE_OP_NAMES is defined
* 2019/Oct/12 ver 5.83 exit(1) was removed
* 2019/Sep/23 ver 5.82 support monitorx, mwaitx, clzero (thanks to @MagurosanTeam)
* 2019/Sep/14 ver 5.81 support some generic mnemonics.
* 2019/Aug/01 ver 5.802 fix detection of AVX512_BF16 (thanks to vpirogov)
* 2019/May/27 support vp2intersectd, vp2intersectq (not tested)
* 2019/May/26 ver 5.80 support vcvtne2ps2bf16, vcvtneps2bf16, vdpbf16ps
* 2019/Apr/27 ver 5.79 vcmppd/vcmpps supports ptr_b(thanks to jkopinsky)
* 2019/Apr/15 ver 5.78 rewrite Reg::changeBit() (thanks to MerryMage)
* 2019/Mar/06 ver 5.77 fix number of cores that share LLC cache by densamoilov
* 2019/Jan/17 ver 5.76 add Cpu::getNumCores() by shelleygoel
* 2018/Oct/31 ver 5.751 recover Xbyak::CastTo for compatibility
* 2018/Oct/29 ver 5.75 unlink LabelManager from Label when msg is destroyed
* 2018/Oct/21 ver 5.74 support RegRip +/- int. Xbyak::CastTo is removed
* 2018/Oct/15 util::AddressFrame uses push/pop instead of mov
* 2018/Sep/19 ver 5.73 fix evex encoding of vpslld, vpslldq, vpsllw, etc for (reg, mem, imm8)
* 2018/Sep/19 ver 5.72 fix the encoding of vinsertps for disp8N(Thanks to petercaday)
* 2018/Sep/04 ver 5.71 L() returns a new label instance
* 2018/Aug/27 ver 5.70 support setProtectMode() and DontUseProtect for read/exec setting
* 2018/Aug/24 ver 5.68 fix wrong VSIB encoding with vector index >= 16(thanks to petercaday)
* 2018/Aug/14 ver 5.67 remove mutable in Address ; fix setCacheHierarchy for cloud vm
* 2018/Jul/26 ver 5.661 support mingw64
* 2018/Jul/24 ver 5.66 add CodeArray::PROTECT_RE to mode of protect()
* 2018/Jun/26 ver 5.65 fix push(qword [mem])
* 2018/Mar/07 ver 5.64 fix zero division in Cpu() on some cpu
* 2018/Feb/14 ver 5.63 fix Cpu::setCacheHierarchy() and fix EvexModifierZero for clang<3.9(thanks to mgouicem)
* 2018/Feb/13 ver 5.62 Cpu::setCacheHierarchy() by mgouicem and rsdubtso
* 2018/Feb/07 ver 5.61 vmov* supports mem{k}{z}(I forgot it)
* 2018/Jan/24 ver 5.601 add xword, yword, etc. into Xbyak::util namespace
* 2018/Jan/05 ver 5.60 support AVX-512 for Ice lake(319433-030.pdf)
* 2017/Aug/22 ver 5.53 fix mpx encoding, add bnd() prefix
* 2017/Aug/18 ver 5.52 fix align (thanks to MerryMage)
* 2017/Aug/17 ver 5.51 add multi-byte nop and align() uses it(thanks to inolen)
* 2017/Aug/08 ver 5.50 add mpx(thanks to magurosan)
* 2017/Aug/08 ver 5.45 add sha(thanks to magurosan)
* 2017/Aug/08 ver 5.44 add prefetchw(thanks to rsdubtso)
* 2017/Jul/12 ver 5.432 reduce warnings of PVS studio
* 2017/Jul/09 ver 5.431 fix hasRex() (no effect) (thanks to drillsar)
* 2017/May/14 ver 5.43 fix CodeGenerator::resetSize() (thanks to gibbed)
* 2017/May/13 ver 5.42 add movs{b,w,d,q}
* 2017/Jan/26 ver 5.41 add prefetchwt1 and support for scale == 0(thanks to rsdubtso)
* 2016/Dec/14 ver 5.40 add Label::getAddress() method to get the pointer specified by the label
* 2016/Dec/09 ver 5.34 fix handling of negative offsets when encoding disp8N(thanks to rsdubtso)
* 2016/Dec/08 ver 5.33 fix encoding of vpbroadcast{b,w,d,q}, vpinsr{b,w}, vpextr{b,w} for disp8N
* 2016/Dec/01 ver 5.32 rename __xgetbv() to _xgetbv() to support clang for Visual Studio(thanks to freiro)
* 2016/Nov/27 ver 5.31 rename AVX512_4VNNI to AVX512_4VNNIW
* 2016/Nov/27 ver 5.30 add AVX512_4VNNI, AVX512_4FMAPS instructions(thanks to rsdubtso)
* 2016/Nov/26 ver 5.20 add detection of AVX512_4VNNI and AVX512_4FMAPS(thanks to rsdubtso)
* 2016/Nov/20 ver 5.11 lost vptest for ymm(thanks to gregory38)
* 2016/Nov/20 ver 5.10 add addressing [rip+&var]
* 2016/Sep/29 ver 5.03 fix detection ERR_INVALID_OPMASK_WITH_MEMORY(thanks to PVS-Studio)
* 2016/Aug/15 ver 5.02 xbyak does not include xbyak_bin2hex.h
* 2016/Aug/15 ver 5.011 fix detection of version of gcc 5.4
* 2016/Aug/03 ver 5.01 disable omitted operand
* 2016/Jun/24 ver 5.00 support avx-512 instruction set
* 2016/Jun/13 avx-512 add mask instructions
* 2016/May/05 ver 4.91 add detection of AVX-512 to Xbyak::util::Cpu
* 2016/Mar/14 ver 4.901 comment to ready() function(thanks to skmp)
* 2016/Feb/04 ver 4.90 add jcc(const void *addr);
* 2016/Jan/30 ver 4.89 vpblendvb supports ymm reg(thanks to John Funnell)
* 2016/Jan/24 ver 4.88 lea, cmov supports 16-bit register(thanks to whyisthisfieldhere)
* 2015/Oct/05 ver 4.87 support segment selectors
* 2015/Aug/18 ver 4.86 fix [rip + label] addressing with immediate value(thanks to whyisthisfieldhere)
* 2015/Aug/10 ver 4.85 Address::operator==() is not correct(thanks to inolen)
* 2015/Jun/22 ver 4.84 call() support variadic template if available(thanks to randomstuff)
* 2015/Jun/16 ver 4.83 support movbe(thanks to benvanik)
* 2015/May/24 ver 4.82 support detection of F16C
* 2015/Apr/25 ver 4.81 fix the condition to throw exception for setSize(thanks to whyisthisfieldhere)
* 2015/Apr/22 ver 4.80 rip supports label(thanks to whyisthisfieldhere)
* 2015/Jan/28 ver 4.71 support adcx, adox, cmpxchg, rdseed, stac
* 2014/Oct/14 ver 4.70 support MmapAllocator
* 2014/Jun/13 ver 4.62 disable warning of VC2014
* 2014/May/30 ver 4.61 support bt, bts, btr, btc
* 2014/May/28 ver 4.60 support vcvtph2ps, vcvtps2ph
* 2014/Apr/11 ver 4.52 add detection of rdrand
* 2014/Mar/25 ver 4.51 remove state information of unreferenced labels
* 2014/Mar/16 ver 4.50 support new Label
* 2014/Mar/05 ver 4.40 fix wrong detection of BMI/enhanced rep on VirtualBox
* 2013/Dec/03 ver 4.30 support Reg::cvt8(), cvt16(), cvt32(), cvt64()
* 2013/Oct/16 ver 4.21 label support std::string
* 2013/Jul/30 ver 4.20 [break backward compatibility] split Reg32e class into RegExp(base+index*scale+disp) and Reg32e(means Reg32 or Reg64)
* 2013/Jul/04 ver 4.10 [break backward compatibility] change the type of Xbyak::Error from enum to a class
* 2013/Jun/21 ver 4.02 add putL(LABEL) function to put the address of the label
* 2013/Jun/21 ver 4.01 vpsllw, vpslld, vpsllq, vpsraw, vpsrad, vpsrlw, vpsrld, vpsrlq support (ymm, ymm, xmm). support vpbroadcastb, vpbroadcastw, vpbroadcastd, vpbroadcastq(thanks to Gabest).
* 2013/May/30 ver 4.00 support AVX2, VEX-encoded GPR-instructions
* 2013/Mar/27 ver 3.80 support mov(reg, "label");
* 2013/Mar/13 ver 3.76 add cqo(), jcxz(), jecxz(), jrcxz()
* 2013/Jan/15 ver 3.75 add setSize() to modify generated code
* 2013/Jan/12 ver 3.74 add CodeGenerator::reset() ; add Allocator::useProtect()
* 2013/Jan/06 ver 3.73 use unordered_map if possible
* 2012/Dec/04 ver 3.72 eax, ebx, ... are member variables of CodeGenerator(revert), Xbyak::util::eax, ... are static const.
* 2012/Nov/17 ver 3.71 and_(), or_(), xor_(), not_() are available if XBYAK_NO_OP_NAMES is not defined.
* 2012/Nov/17 change eax, ebx, ptr and so on in CodeGenerator as static member and alias of them are defined in Xbyak::util.
* 2012/Nov/09 ver 3.70 XBYAK_NO_OP_NAMES macro is added to use and_() instead of and() (thanks to Mattias)
* 2012/Nov/01 ver 3.62 add fwait/fnwait/finit/fninit
* 2012/Nov/01 ver 3.61 add fldcw/fstcw
* 2012/May/03 ver 3.60 change interface of Allocator
* 2012/Mar/23 ver 3.51 fix userPtr mode
* 2012/Mar/19 ver 3.50 support AutoGrow mode
* 2011/Nov/09 ver 3.05 fix bit property of rip addressing / support movsxd
* 2011/Aug/15 ver 3.04 fix dealing with imm8 such as add(dword [ebp-8], 0xda); (thanks to lolcat)
* 2011/Jun/16 ver 3.03 fix __GNUC_PREREQ macro for Mac gcc(thanks to t_teruya)
* 2011/Apr/28 ver 3.02 do not use xgetbv on Mac gcc
* 2011/May/24 ver 3.01 fix typo of OSXSAVE
* 2011/May/23 ver 3.00 add vcmpeqps and so on
* 2011/Feb/16 ver 2.994 beta add vmovq for 32-bit mode(I forgot it)
* 2011/Feb/16 ver 2.993 beta remove cvtReg to avoid thread unsafe
* 2011/Feb/10 ver 2.992 beta support one argument syntax for fadd like nasm
* 2011/Feb/07 ver 2.991 beta fix pextrw reg, xmm, imm(Thanks to Gabest)
* 2011/Feb/04 ver 2.99 beta support AVX
* 2010/Dec/08 ver 2.31 fix ptr [rip + 32bit offset], support rdtscp
* 2010/Oct/19 ver 2.30 support pclmulqdq, aesdec, aesdeclast, aesenc, aesenclast, aesimc, aeskeygenassist
* 2010/Jun/07 ver 2.29 fix call(<label>)
* 2010/Jun/17 ver 2.28 move some member functions to public
* 2010/Jun/01 ver 2.27 support encoding of mov(reg64, imm) like yasm(not nasm)
* 2010/May/24 ver 2.26 fix sub(rsp, 1000)
* 2010/Apr/26 ver 2.25 add jc/jnc(I forgot to implement them...)
* 2010/Apr/16 ver 2.24 change the prototype of rewrite() method
* 2010/Apr/15 ver 2.23 fix align() and xbyak_util.h for Mac
* 2010/Feb/16 ver 2.22 fix inLocalLabel()/outLocalLabel()
* 2009/Dec/09 ver 2.21 support cygwin(gcc 4.3.2)
* 2009/Nov/28 support a part of FPU
* 2009/Jun/25 fix mov(qword[rax], imm); (thanks to Martin)
* 2009/Mar/10 fix redundant REX.W prefix on jmp/call reg64
* 2009/Feb/24 add movq reg64, mmx/xmm; movq mmx/xmm, reg64
* 2009/Feb/13 movd(xmm7, dword[eax]) drops 0x66 prefix (thanks to Gabest)
* 2008/Dec/30 fix call in short relative address(thanks to kato san)
* 2008/Sep/18 support @@, @f, @b and localization of label(thanks to nobu-q san)
* 2008/Sep/18 support (ptr[rip + 32bit offset]) (thanks to Dango-Chu san)
* 2008/Jun/03 fix align(). mov(ptr[eax],1) throws ERR_MEM_SIZE_IS_NOT_SPECIFIED.
* 2008/Jun/02 support memory interface allocated by user
* 2008/May/26 fix protect() to avoid invalid setting(thanks to shinichiro_h san)
* 2008/Apr/30 add cmpxchg16b, cdqe
* 2008/Apr/29 support x64
* 2008/Apr/14 code refactoring
* 2008/Mar/12 add bsr/bsf
* 2008/Feb/14 fix output of sub eax, 1234 (thanks to Robert)
* 2007/Nov/5  support lock, xadd, xchg
* 2007/Nov/2  support SSSE3/SSE4 (thanks to Dango-Chu san)
* 2007/Feb/4  fix the bug that exception doesn't occur under the condition which the offset of jmp mnemonic without T_NEAR is over 127.
* 2007/Jan/21 fix the bug to create address like [disp] select smaller representation for mov (eax|ax|al, [disp])
* 2007/Jan/4  first version

## Author
MITSUNARI Shigeo (herumi@nifty.com)

## Sponsors welcome
[GitHub Sponsor](https://github.com/sponsors/herumi)