int_helper.c - OpenGrok history log for /qemu/target/ppc/int

Revision	Date	Author	Comments
Revision tags: v8.2.4, v8.2.3, v7.2.11, v9.0.0
# 948e257c	23-Apr-2024	Chinmay Rath <rathc@linux.ibm.com>	target/ppc: Move logical fixed-point instructions to decodetree. Moving the below instructions to decodetree specification: andi[s]., {ori, xori}[s] (D-form); {and, andc, nand, or, orc, nor, xor, eqv}[.], exts{b, h, w}[.], cnt{l, t}z{w, d}[.], popcnt{b, w, d}, prty{w, d}, cmp, bpermd (X-form). With this patch, all the fixed-point logical instructions have been moved to decodetree. The changes were verified by validating that the tcg ops generated by those instructions remain the same, which were captured with the '-d in_asm,op' flag. Reviewed-by: Richard Henderson <richard.henderson@linaro.org> Signed-off-by: Chinmay Rath <rathc@linux.ibm.com> [np: 32-bit compile fix] Signed-off-by: Nicholas Piggin <npiggin@gmail.com>
# ae556c6a	23-Apr-2024	Chinmay Rath <rathc@linux.ibm.com>	target/ppc: Move cmp{rb, eqb}, tw[i], td[i], isel instructions to decodetree. Moving the following instructions to decodetree specification: cmp{rb, eqb}, t{w, d} (X-form); t{w, d}i (D-form); isel (A-form). The changes were verified by validating that the tcg ops generated by those instructions remain the same, which were captured using the '-d in_asm,op' flag. Also for CMPRB, following review comments: replaced repetition of arithmetic right shifting (tcg_gen_shri_i32) followed by extraction of last 8 bits (tcg_gen_ext8u_i32) with extraction of the required bits using offsets (tcg_gen_extract_i32). Reviewed-by: Richard Henderson <richard.henderson@linaro.org> Signed-off-by: Chinmay Rath <rathc@linux.ibm.com> [np: 32-bit compile fix] Signed-off-by: Nicholas Piggin <npiggin@gmail.com>
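Note on ae556c6a: the CMPRB change swaps a "shift right, then keep the low 8 bits" pair for a single bit-field extract. The plain-C sketch below only illustrates why the two forms give the same value. The function names are hypothetical and are not QEMU's TCG API.

```c
#include <stdint.h>

/* Hypothetical stand-in for the two-step form: shift, then keep 8 bits. */
static inline uint32_t shift_then_ext8u(uint32_t x, unsigned ofs)
{
    return (x >> ofs) & 0xffu;
}

/* Hypothetical stand-in for a one-step extract of `len` bits at `ofs`.
 * Assumes 1 <= len <= 32 and ofs + len <= 32. */
static inline uint32_t extract_field(uint32_t x, unsigned ofs, unsigned len)
{
    return (x >> ofs) & (UINT32_MAX >> (32 - len));
}
```

With len fixed at 8, both functions return the same byte for any offset that keeps the field inside the 32-bit value.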
# f424bc10	23-Apr-2024	Chinmay Rath <rathc@linux.ibm.com>	target/ppc: Move div/mod fixed-point insns (64 bits operands) to decodetree. Moving the below instructions to decodetree specification: divd[u, e, eu][o][.] (XO-form); mod{sd, ud} (X-form). With this patch, all the fixed-point arithmetic instructions have been moved to decodetree. The changes were verified by validating that the tcg ops generated by those instructions remain the same, which were captured using the '-d in_asm,op' flag. Also, renamed the do_divwe method in fixedpoint-impl.c.inc to do_dive because it is now used to divide doubleword operands as well, and not just words. Reviewed-by: Richard Henderson <richard.henderson@linaro.org> Signed-off-by: Chinmay Rath <rathc@linux.ibm.com> [np: 32-bit compile fix] Signed-off-by: Nicholas Piggin <npiggin@gmail.com>
# a81b5c18	23-Apr-2024	Chinmay Rath <rathc@linux.ibm.com>	target/ppc: Move neg, darn, mod{sw, uw} to decodetree. Moving the below instructions to decodetree specification: neg[o][.] (XO-form); mod{sw, uw}, darn (X-form). The changes were verified by validating that the tcg ops generated by those instructions remain the same, which were captured with the '-d in_asm,op' flag. Reviewed-by: Richard Henderson <richard.henderson@linaro.org> Signed-off-by: Chinmay Rath <rathc@linux.ibm.com> [np: 32-bit compile fix] Signed-off-by: Nicholas Piggin <npiggin@gmail.com>
# 2871921d	23-Apr-2024	Chinmay Rath <rathc@linux.ibm.com>	target/ppc: Move divw[u, e, eu] instructions to decodetree. Moving the following instructions to decodetree specification: divw[u, e, eu][o][.] (XO-form). The changes were verified by validating that the tcg ops generated by those instructions remain the same, which were captured with the '-d in_asm,op' flag. Reviewed-by: Richard Henderson <richard.henderson@linaro.org> Signed-off-by: Chinmay Rath <rathc@linux.ibm.com> Signed-off-by: Nicholas Piggin <npiggin@gmail.com>
Revision tags: v9.0.0-rc4, v9.0.0-rc3, v9.0.0-rc2, v9.0.0-rc1, v9.0.0-rc0, v8.2.2, v7.2.10, v8.2.1, v8.1.5, v7.2.9, v8.1.4, v7.2.8, v8.2.0, v8.2.0-rc4, v8.2.0-rc3, v8.2.0-rc2, v8.2.0-rc1, v7.2.7, v8.1.3, v8.2.0-rc0, v8.1.2
# 668a6314	29-Sep-2023	Cédric Le Goater <clg@kaod.org>	target/ppc: Rename variables to avoid local variable shadowing in VUPKPX
    and fix such warnings:
    ../target/ppc/int_helper.c: In function ‘helper_vupklpx’:
    ../target/ppc/int_helper.c:2025:21: warning: declaration of ‘r’ shadows a parameter [-Wshadow=local]
     2025 | uint8_t r = (e >> 10) & 0x1f; \
          | ^
    ../target/ppc/int_helper.c:2033:1: note: in expansion of macro ‘VUPKPX’
     2033 | VUPKPX(lpx, UPKLO)
          | ^~~~~~
    ../target/ppc/int_helper.c:2017:41: note: shadowed declaration is here
     2017 | void helper_vupk##suffix(ppc_avr_t r, ppc_avr_t b) \
          | ~~~~~~~~~~~^
    ../target/ppc/int_helper.c:2033:1: note: in expansion of macro ‘VUPKPX’
     2033 | VUPKPX(lpx, UPKLO)
          | ^~~~~~
    Signed-off-by: Cédric Le Goater <clg@kaod.org>
    Message-ID: <20230929083143.234553-1-clg@kaod.org>
    Reviewed-by: Michael Tokarev <mjt@tls.msk.ru>
    Signed-off-by: Markus Armbruster <armbru@redhat.com>
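Note on 668a6314: the warning arises when a macro-generated function declares a local with the same name as one of its parameters. The sketch below shows the general shape of the fix (giving the local a distinct name). The macro and function names are hypothetical and do not reproduce the actual VUPKPX macro.

```c
#include <stdint.h>

/* Hypothetical macro: generates a function whose parameter is named `r`. */
#define DEFINE_UNPACK_RED(name)                                        \
    static void name(uint8_t *r, uint16_t e)                           \
    {                                                                  \
        /* Calling this local `r` would shadow the parameter and    */ \
        /* trigger -Wshadow=local; a distinct name avoids that.     */ \
        uint8_t red = (e >> 10) & 0x1f;                                \
        *r = red;                                                      \
    }

DEFINE_UNPACK_RED(unpack_red)
```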
Revision tags: v8.1.1, v7.2.6, v8.0.5, v8.1.0, v8.1.0-rc4, v8.1.0-rc3, v7.2.5, v8.0.4, v8.1.0-rc2, v8.1.0-rc1, v8.1.0-rc0
# 7bdbf233	11-Jul-2023	Richard Henderson <richard.henderson@linaro.org>	target/ppc: Use clmul_64 Use generic routine for 64-bit carry-less multiply. Reviewed-by: Philippe Mathieu-Daudé <philmd@linaro.org> Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
# f56d3c1a	11-Jul-2023	Richard Henderson <richard.henderson@linaro.org>	target/ppc: Use clmul_32* routines Use generic routines for 32-bit carry-less multiply. Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
# a2c67342	11-Jul-2023	Richard Henderson <richard.henderson@linaro.org>	target/ppc: Use clmul_16* routines Use generic routines for 16-bit carry-less multiply. Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
# cec4090d	11-Jul-2023	Richard Henderson <richard.henderson@linaro.org>	target/ppc: Use clmul_8* routines Use generic routines for 8-bit carry-less multiply. Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
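Note on the clmul commits above: a carry-less multiply combines shifted copies of one operand with XOR instead of addition. The bit-by-bit 8-bit reference below is only a sketch of that operation. The name clmul8_ref is hypothetical and is not one of the generic routines these commits switch to.

```c
#include <stdint.h>

/* Bitwise reference: XOR together a copy of `a` shifted left by i
 * for every bit i that is set in `b`. The result fits in 15 bits. */
static uint16_t clmul8_ref(uint8_t a, uint8_t b)
{
    uint16_t acc = 0;
    for (int i = 0; i < 8; i++) {
        if (b & (1u << i)) {
            acc ^= (uint16_t)((uint16_t)a << i);
        }
    }
    return acc;
}
```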
# 73c19706	28-Aug-2023	Philippe Mathieu-Daudé <philmd@linaro.org>	target/helpers: Remove unnecessary 'qemu/main-loop.h' header "qemu/main-loop.h" declares functions related to QEMU's main loop mutex, which these files don't access. Remove the unused "qemu/main-loop.h" header. Signed-off-by: Philippe Mathieu-Daudé <philmd@linaro.org> Reviewed-by: Richard Henderson <richard.henderson@linaro.org> Message-Id: <20230828221314.18435-8-philmd@linaro.org>
Revision tags: v8.0.3, v7.2.4
# af4cb945	02-Jun-2023	Richard Henderson <richard.henderson@linaro.org>	target/ppc: Use aesdec_ISB_ISR_AK_IMC This implements the VNCIPHER instruction. Reviewed-by: Philippe Mathieu-Daudé <philmd@linaro.org> Reviewed-by: Daniel Henrique Barboza <danielhb413@gmail.com> Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
# ce9f5b37	02-Jun-2023	Richard Henderson <richard.henderson@linaro.org>	target/ppc: Use aesenc_SB_SR_MC_AK This implements the VCIPHER instruction. Reviewed-by: Philippe Mathieu-Daudé <philmd@linaro.org> Reviewed-by: Daniel Henrique Barboza <danielhb413@gmail.com> Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
# 2cf44f3b	02-Jun-2023	Richard Henderson <richard.henderson@linaro.org>	target/ppc: Use aesdec_ISB_ISR_AK This implements the VNCIPHERLAST instruction. Reviewed-by: Philippe Mathieu-Daudé <philmd@linaro.org> Reviewed-by: Daniel Henrique Barboza <danielhb413@gmail.com> Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
# 7df34e48	02-Jun-2023	Richard Henderson <richard.henderson@linaro.org>	target/ppc: Use aesenc_SB_SR_AK This implements the VCIPHERLAST instruction. Reviewed-by: Philippe Mathieu-Daudé <philmd@linaro.org> Reviewed-by: Daniel Henrique Barboza <danielhb413@gmail.com> Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
Revision tags: v8.0.2, v8.0.1, v7.2.3, v7.2.2, v8.0.0, v8.0.0-rc4, v8.0.0-rc3, v7.2.1, v8.0.0-rc2, v8.0.0-rc1, v8.0.0-rc0, v7.2.0, v7.2.0-rc4, v7.2.0-rc3, v7.2.0-rc2, v7.2.0-rc1, v7.2.0-rc0
# 26c964f8	19-Oct-2022	Lucas Mateus Castro (alqotel) <lucas.araujo@eldorado.org.br>	target/ppc: Move VABSDU[BHW] to decodetree and use gvec
    Moved VABSDUB, VABSDUH and VABSDUW to decodetree and use gvec to translate them.

    vabsdub:
    rept   loop    master       patch
       8   12500   0,03601600   0,00688500 (-80.9%)
      25    4000   0,03651000   0,00532100 (-85.4%)
     100    1000   0,03666900   0,00595300 (-83.8%)
     500     200   0,04305800   0,01244600 (-71.1%)
    2500      40   0,06893300   0,04273700 (-38.0%)
    8000      12   0,14633200   0,12660300 (-13.5%)

    vabsduh:
    rept   loop    master       patch
       8   12500   0,02172400   0,00687500 (-68.4%)
      25    4000   0,02154100   0,00531500 (-75.3%)
     100    1000   0,02235400   0,00596300 (-73.3%)
     500     200   0,02827500   0,01245100 (-56.0%)
    2500      40   0,05638400   0,04285500 (-24.0%)
    8000      12   0,13166000   0,12641400 (-4.0%)

    vabsduw:
    rept   loop    master       patch
       8   12500   0,01646400   0,00688300 (-58.2%)
      25    4000   0,01454500   0,00475500 (-67.3%)
     100    1000   0,01545800   0,00511800 (-66.9%)
     500     200   0,02168200   0,01114300 (-48.6%)
    2500      40   0,04571300   0,04138800 (-9.5%)
    8000      12   0,12209500   0,12178500 (-0.3%)

    Same as VADDCUW and VSUBCUW, overall performance gain but it uses more TCGop (4 before the patch, 6 after).
    Signed-off-by: Lucas Mateus Castro (alqotel) <lucas.araujo@eldorado.org.br>
    Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
    Message-Id: <20221019125040.48028-8-lucas.araujo@eldorado.org.br>
    Signed-off-by: Daniel Henrique Barboza <danielhb413@gmail.com>
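Note on 26c964f8: an unsigned absolute difference can be computed without widening by subtracting the smaller operand from the larger. The scalar sketch below shows the per-element operation for bytes. It is an illustration under that assumption, not the gvec expansion used in the commit, and absdu8 is a hypothetical name.

```c
#include <stdint.h>

/* Per-element unsigned absolute difference: max(a, b) - min(a, b).
 * The subtraction can never wrap because hi >= lo. */
static inline uint8_t absdu8(uint8_t a, uint8_t b)
{
    uint8_t hi = a > b ? a : b;
    uint8_t lo = a > b ? b : a;
    return (uint8_t)(hi - lo);
}
```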
# c85929b2	19-Oct-2022	Lucas Mateus Castro (alqotel) <lucas.araujo@eldorado.org.br>	target/ppc: Move VAVG[SU][BHW] to decodetree and use gvec
    Moved the instructions VAVGUB, VAVGUH, VAVGUW, VAVGSB, VAVGSH, VAVGSW to decodetree and use gvec with them. For these, the right shift had to be made before the sum to avoid an overflow, so 1 is added at the end if either of the entries had 1 in its LSB, to replicate the "+ 1" before the shift described by the ISA.

    vavgub:
    rept   loop    master       patch
       8   12500   0,02616600   0,00754200 (-71.2%)
      25    4000   0,02530000   0,00637700 (-74.8%)
     100    1000   0,02604600   0,00790100 (-69.7%)
     500     200   0,03189300   0,01838400 (-42.4%)
    2500      40   0,06006900   0,06851000 (+14.1%)
    8000      12   0,13941000   0,20548500 (+47.4%)

    vavguh:
    rept   loop    master       patch
       8   12500   0,01818200   0,00780600 (-57.1%)
      25    4000   0,01789300   0,00641600 (-64.1%)
     100    1000   0,01899100   0,00787200 (-58.5%)
     500     200   0,02527200   0,01828400 (-27.7%)
    2500      40   0,05361800   0,06773000 (+26.3%)
    8000      12   0,12886600   0,20291400 (+57.5%)

    vavguw:
    rept   loop    master       patch
       8   12500   0,01423100   0,00776600 (-45.4%)
      25    4000   0,01780800   0,00638600 (-64.1%)
     100    1000   0,02085500   0,00787000 (-62.3%)
     500     200   0,02737100   0,01828800 (-33.2%)
    2500      40   0,05572600   0,06774200 (+21.6%)
    8000      12   0,13101700   0,20311600 (+55.0%)

    vavgsb:
    rept   loop    master       patch
       8   12500   0,03006000   0,00788600 (-73.8%)
      25    4000   0,02882200   0,00637800 (-77.9%)
     100    1000   0,02958000   0,00791400 (-73.2%)
     500     200   0,03548800   0,01860400 (-47.6%)
    2500      40   0,06360000   0,06850800 (+7.7%)
    8000      12   0,13816500   0,20550300 (+48.7%)

    vavgsh:
    rept   loop    master       patch
       8   12500   0,01965900   0,00776600 (-60.5%)
      25    4000   0,01875400   0,00638700 (-65.9%)
     100    1000   0,01952200   0,00786900 (-59.7%)
     500     200   0,02562000   0,01760300 (-31.3%)
    2500      40   0,05384300   0,06742800 (+25.2%)
    8000      12   0,13240800   0,20330000 (+53.5%)

    vavgsw:
    rept   loop    master       patch
       8   12500   0,01407700   0,00775600 (-44.9%)
      25    4000   0,01762300   0,00640000 (-63.7%)
     100    1000   0,02046500   0,00788500 (-61.5%)
     500     200   0,02745600   0,01843000 (-32.9%)
    2500      40   0,05375500   0,06820500 (+26.9%)
    8000      12   0,13068300   0,20304900 (+55.4%)

    These results seem to me to indicate that with gvec the translation is slower but the execution is faster.
    Signed-off-by: Lucas Mateus Castro (alqotel) <lucas.araujo@eldorado.org.br>
    Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
    Message-Id: <20221019125040.48028-7-lucas.araujo@eldorado.org.br>
    Signed-off-by: Daniel Henrique Barboza <danielhb413@gmail.com>
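Note on c85929b2: the message describes shifting each operand before summing and then adding 1 if either operand has its LSB set, so that the result matches the ISA's (a + b + 1) >> 1 without needing a wider intermediate. The scalar sketch below compares that form with a widened reference for unsigned bytes. Both function names are hypothetical, and the code only illustrates the arithmetic, not the gvec implementation.

```c
#include <stdint.h>

/* Shift-first form: every partial term fits in 8 bits
 * (at most 127 + 127 + 1 = 255), so no wider type is required. */
static inline uint8_t avg_round_u8(uint8_t a, uint8_t b)
{
    return (uint8_t)((a >> 1) + (b >> 1) + ((a | b) & 1));
}

/* Widened reference: the rounding average as written with "+ 1"
 * before the shift, computed in a type wide enough to hold the sum. */
static inline uint8_t avg_round_u8_wide(uint8_t a, uint8_t b)
{
    return (uint8_t)(((unsigned)a + (unsigned)b + 1u) >> 1);
}
```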
# d57fbd8f	19-Oct-2022	Lucas Mateus Castro (alqotel) <lucas.araujo@eldorado.org.br>	target/ppc: Move VPRTYB[WDQ] to decodetree and use gvec
    Moved VPRTYBW and VPRTYBD to use gvec and both of them and VPRTYBQ to decodetree. VPRTYBW and VPRTYBD now also use .fni4 and .fni8, respectively.

    vprtybw:
    rept   loop    master       patch
       8   12500   0,01198900   0,00703100 (-41.4%)
      25    4000   0,01070100   0,00571400 (-46.6%)
     100    1000   0,01123300   0,00678200 (-39.6%)
     500     200   0,01601500   0,01535600 (-4.1%)
    2500      40   0,03872900   0,05562100 (43.6%)
    8000      12   0,10047000   0,16643000 (65.7%)

    vprtybd:
    rept   loop    master       patch
       8   12500   0,00757700   0,00788100 (4.0%)
      25    4000   0,00652500   0,00669600 (2.6%)
     100    1000   0,00714400   0,00825400 (15.5%)
     500     200   0,01211000   0,01903700 (57.2%)
    2500      40   0,03483800   0,07021200 (101.5%)
    8000      12   0,09591800   0,21036200 (119.3%)

    vprtybq:
    rept   loop    master       patch
       8   12500   0,00675600   0,00667200 (-1.2%)
      25    4000   0,00619400   0,00643200 (3.8%)
     100    1000   0,00707100   0,00751100 (6.2%)
     500     200   0,01199300   0,01342000 (11.9%)
    2500      40   0,03490900   0,04092900 (17.2%)
    8000      12   0,09588200   0,11465100 (19.6%)

    I wasn't expecting such a performance loss in both VPRTYBD and VPRTYBQ, and I'm not sure if it's worth moving those instructions. Comparing the assembly of the helper with the TCGop, they are pretty similar, so I'm not sure why vprtybd took so much more time.
    Signed-off-by: Lucas Mateus Castro (alqotel) <lucas.araujo@eldorado.org.br>
    Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
    Message-Id: <20221019125040.48028-6-lucas.araujo@eldorado.org.br>
    Signed-off-by: Daniel Henrique Barboza <danielhb413@gmail.com>
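Note on d57fbd8f: the byte-parity result for a word is the XOR of the least significant bit of each of its four bytes. One common way to compute it is to fold the word onto itself with shifts and XORs. The sketch below assumes that folding approach and uses a hypothetical function name. It does not claim to match the exact TCG ops emitted by the commit.

```c
#include <stdint.h>

/* Fold the four bytes together so that bit 0 ends up holding the
 * XOR of the LSBs of all four bytes, then keep only that bit. */
static inline uint32_t prtyb_w(uint32_t x)
{
    x ^= x >> 16;   /* combine upper and lower halfwords */
    x ^= x >> 8;    /* combine the two remaining bytes */
    return x & 1u;
}
```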
# 90b5aadb	19-Oct-2022	Lucas Mateus Castro (alqotel) <lucas.araujo@eldorado.org.br>	target/ppc: Move VNEG[WD] to decodtree and use gvec
    Moved the instructions VNEGW and VNEGD to decodetree and used gvec to decode it.

    vnegw:
    rept   loop    master       patch
       8   12500   0,01053200   0,00548400 (-47.9%)
      25    4000   0,01030500   0,00390000 (-62.2%)
     100    1000   0,01096300   0,00395400 (-63.9%)
     500     200   0,01472000   0,00712300 (-51.6%)
    2500      40   0,03809000   0,02147700 (-43.6%)
    8000      12   0,09957100   0,06202100 (-37.7%)

    vnegd:
    rept   loop    master       patch
       8   12500   0,00594600   0,00543800 (-8.5%)
      25    4000   0,00575200   0,00396400 (-31.1%)
     100    1000   0,00676100   0,00394800 (-41.6%)
     500     200   0,01149300   0,00709400 (-38.3%)
    2500      40   0,03441500   0,02169600 (-37.0%)
    8000      12   0,09516900   0,06337000 (-33.4%)

    Signed-off-by: Lucas Mateus Castro (alqotel) <lucas.araujo@eldorado.org.br>
    Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
    Message-Id: <20221019125040.48028-5-lucas.araujo@eldorado.org.br>
    Signed-off-by: Daniel Henrique Barboza <danielhb413@gmail.com>
# 611bc69b	19-Oct-2022	Lucas Mateus Castro (alqotel) <lucas.araujo@eldorado.org.br>	target/ppc: Move V(ADD|SUB)CUW to decodetree and use gvec
    This patch moves VADDCUW and VSUBCUW to decodetree with gvec using an implementation based on the helper, with the main difference being changing the -1 (aka all bits set to 1) result returned by cmp when true to +1. It also implemented a .fni4 version of those instructions and dropped the helper.

    vaddcuw:
    rept   loop    master       patch
       8   12500   0,01008200   0,00612400 (-39.3%)
      25    4000   0,01091500   0,00471600 (-56.8%)
     100    1000   0,01332500   0,00593700 (-55.4%)
     500     200   0,01998500   0,01275700 (-36.2%)
    2500      40   0,04704300   0,04364300 (-7.2%)
    8000      12   0,10748200   0,11241000 (+4.6%)

    vsubcuw:
    rept   loop    master       patch
       8   12500   0,01226200   0,00571600 (-53.4%)
      25    4000   0,01493500   0,00462100 (-69.1%)
     100    1000   0,01522700   0,00455100 (-70.1%)
     500     200   0,02384600   0,01133500 (-52.5%)
    2500      40   0,04935200   0,03178100 (-35.6%)
    8000      12   0,09039900   0,09440600 (+4.4%)

    Overall there was a gain in performance, but the TCGop code was still slightly bigger in the new version (it went from 4 to 5).
    Signed-off-by: Lucas Mateus Castro (alqotel) <lucas.araujo@eldorado.org.br>
    Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
    Message-Id: <20221019125040.48028-4-lucas.araujo@eldorado.org.br>
    Signed-off-by: Daniel Henrique Barboza <danielhb413@gmail.com>
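Note on 611bc69b: the carry-out of a 32-bit unsigned add or subtract can be derived from a comparison, and a vector-style compare that yields all ones for "true" has to be reduced to the value 1. The scalar sketch below models that step with an explicit all-ones mask. The function names are hypothetical, and this is an illustration of the idea rather than the commit's gvec code.

```c
#include <stdint.h>

/* Carry-out of a + b: the 32-bit sum wrapped iff it is smaller than a.
 * The mask models a compare returning all ones (-1) when true;
 * masking with 1 turns that into the +1 the instruction writes. */
static inline uint32_t addcuw(uint32_t a, uint32_t b)
{
    uint32_t sum = a + b;
    uint32_t mask = (sum < a) ? UINT32_MAX : 0u;
    return mask & 1u;
}

/* Carry-out of a - b, computed as a + ~b + 1: set iff a >= b. */
static inline uint32_t subcuw(uint32_t a, uint32_t b)
{
    uint32_t mask = (a >= b) ? UINT32_MAX : 0u;
    return mask & 1u;
}
```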
# 306e4753	19-Oct-2022	Lucas Mateus Castro (alqotel) <lucas.araujo@eldorado.org.br>	target/ppc: Move VMH[R]ADDSHS instruction to decodetree
    This patch moves VMHADDSHS and VMHRADDSHS to decodetree. I couldn't find a satisfactory implementation with TCG inline.

    vmhaddshs:
    rept   loop    master       patch
       8   12500   0,02983400   0,02648500 (-11.2%)
      25    4000   0,02946000   0,02518000 (-14.5%)
     100    1000   0,03104300   0,02638000 (-15.0%)
     500     200   0,04002000   0,03502500 (-12.5%)
    2500      40   0,08090100   0,07562200 (-6.5%)
    8000      12   0,19242600   0,18626800 (-3.2%)

    vmhraddshs:
    rept   loop    master       patch
       8   12500   0,03078600   0,02851000 (-7.4%)
      25    4000   0,02793200   0,02746900 (-1.7%)
     100    1000   0,02886000   0,02839900 (-1.6%)
     500     200   0,03714700   0,03799200 (+2.3%)
    2500      40   0,07948000   0,07852200 (-1.2%)
    8000      12   0,19049800   0,18813900 (-1.2%)

    Signed-off-by: Lucas Mateus Castro (alqotel) <lucas.araujo@eldorado.org.br>
    Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
    Message-Id: <20221019125040.48028-3-lucas.araujo@eldorado.org.br>
    Signed-off-by: Daniel Henrique Barboza <danielhb413@gmail.com>