History log of /qemu/target/ppc/int_helper.c (Results 1 – 25 of 172)
Revision (<<< Hide revision tags) (Show revision tags >>>) Date Author Comments
Revision tags: v8.2.4, v8.2.3, v7.2.11, v9.0.0
# 948e257c 23-Apr-2024 Chinmay Rath <rathc@linux.ibm.com>

target/ppc: Move logical fixed-point instructions to decodetree.

Moving the below instructions to decodetree specification :

andi[s]., {ori, xori}[s] : D-form

{and, andc, nand, or, orc, nor, x

target/ppc: Move logical fixed-point instructions to decodetree.

Moving the below instructions to decodetree specification :

andi[s]., {ori, xori}[s] : D-form

{and, andc, nand, or, orc, nor, xor, eqv}[.],
exts{b, h, w}[.], cnt{l, t}z{w, d}[.],
popcnt{b, w, d}, prty{w, d}, cmp, bpermd : X-form

With this patch, all the fixed-point logical instructions have been
moved to decodetree.
The changes were verified by validating that the tcg ops generated by those
instructions remain the same, which were captured with the '-d in_asm,op' flag.

Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
Signed-off-by: Chinmay Rath <rathc@linux.ibm.com>
[np: 32-bit compile fix]
Signed-off-by: Nicholas Piggin <npiggin@gmail.com>

show more ...


# ae556c6a 23-Apr-2024 Chinmay Rath <rathc@linux.ibm.com>

target/ppc: Move cmp{rb, eqb}, tw[i], td[i], isel instructions to decodetree.

Moving the following instructions to decodetree specification :

cmp{rb, eqb}, t{w, d} : X-form
t{w, d}i : D-form
is

target/ppc: Move cmp{rb, eqb}, tw[i], td[i], isel instructions to decodetree.

Moving the following instructions to decodetree specification :

cmp{rb, eqb}, t{w, d} : X-form
t{w, d}i : D-form
isel : A-form

The changes were verified by validating that the tcg ops generated by those
instructions remain the same, which were captured using the '-d in_asm,op' flag.
Also for CMPRB, following review comments :
Replaced repetition of arithmetic right shifting (tcg_gen_shri_i32) followed
by extraction of last 8 bits (tcg_gen_ext8u_i32) with extraction of the required
bits using offsets (tcg_gen_extract_i32).

Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
Signed-off-by: Chinmay Rath <rathc@linux.ibm.com>
[np: 32-bit compile fix]
Signed-off-by: Nicholas Piggin <npiggin@gmail.com>

show more ...


# f424bc10 23-Apr-2024 Chinmay Rath <rathc@linux.ibm.com>

target/ppc: Move div/mod fixed-point insns (64 bits operands) to decodetree.

Moving the below instructions to decodetree specification :

divd[u, e, eu][o][.] : XO-form
mod{sd, ud} : X-form

With

target/ppc: Move div/mod fixed-point insns (64 bits operands) to decodetree.

Moving the below instructions to decodetree specification :

divd[u, e, eu][o][.] : XO-form
mod{sd, ud} : X-form

With this patch, all the fixed-point arithmetic instructions have been
moved to decodetree.
The changes were verified by validating that the tcg ops generated by those
instructions remain the same, which were captured using the '-d in_asm,op' flag.
Also, remaned do_divwe method in fixedpoint-impl.c.inc to do_dive because it is
now used to divide doubleword operands as well, and not just words.

Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
Signed-off-by: Chinmay Rath <rathc@linux.ibm.com>
[np: 32-bit compile fix]
Signed-off-by: Nicholas Piggin <npiggin@gmail.com>

show more ...


# a81b5c18 23-Apr-2024 Chinmay Rath <rathc@linux.ibm.com>

target/ppc: Move neg, darn, mod{sw, uw} to decodetree.

Moving the below instructions to decodetree specification :

neg[o][.] : XO-form
mod{sw, uw}, darn : X-form

The changes were verified

target/ppc: Move neg, darn, mod{sw, uw} to decodetree.

Moving the below instructions to decodetree specification :

neg[o][.] : XO-form
mod{sw, uw}, darn : X-form

The changes were verified by validating that the tcg ops generated by those
instructions remain the same, which were captured with the '-d in_asm,op' flag.

Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
Signed-off-by: Chinmay Rath <rathc@linux.ibm.com>
[np: 32-bit compile fix]
Signed-off-by: Nicholas Piggin <npiggin@gmail.com>

show more ...


# 2871921d 23-Apr-2024 Chinmay Rath <rathc@linux.ibm.com>

target/ppc: Move divw[u, e, eu] instructions to decodetree.

Moving the following instructions to decodetree specification :
divw[u, e, eu][o][.] : XO-form

The changes were verified by validating

target/ppc: Move divw[u, e, eu] instructions to decodetree.

Moving the following instructions to decodetree specification :
divw[u, e, eu][o][.] : XO-form

The changes were verified by validating that the tcg ops generated by those
instructions remain the same, which were captured with the '-d in_asm,op' flag.

Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
Signed-off-by: Chinmay Rath <rathc@linux.ibm.com>
Signed-off-by: Nicholas Piggin <npiggin@gmail.com>

show more ...


Revision tags: v9.0.0-rc4, v9.0.0-rc3, v9.0.0-rc2, v9.0.0-rc1, v9.0.0-rc0, v8.2.2, v7.2.10, v8.2.1, v8.1.5, v7.2.9, v8.1.4, v7.2.8, v8.2.0, v8.2.0-rc4, v8.2.0-rc3, v8.2.0-rc2, v8.2.0-rc1, v7.2.7, v8.1.3, v8.2.0-rc0, v8.1.2
# 668a6314 29-Sep-2023 Cédric Le Goater <clg@kaod.org>

target/ppc: Rename variables to avoid local variable shadowing in VUPKPX

and fix such warnings :

../target/ppc/int_helper.c: In function ‘helper_vupklpx’:
../target/ppc/int_helper.c:2025:21: wa

target/ppc: Rename variables to avoid local variable shadowing in VUPKPX

and fix such warnings :

../target/ppc/int_helper.c: In function ‘helper_vupklpx’:
../target/ppc/int_helper.c:2025:21: warning: declaration of ‘r’ shadows a parameter [-Wshadow=local]
2025 | uint8_t r = (e >> 10) & 0x1f; \
| ^
../target/ppc/int_helper.c:2033:1: note: in expansion of macro ‘VUPKPX’
2033 | VUPKPX(lpx, UPKLO)
| ^~~~~~
../target/ppc/int_helper.c:2017:41: note: shadowed declaration is here
2017 | void helper_vupk##suffix(ppc_avr_t *r, ppc_avr_t *b) \
| ~~~~~~~~~~~^
../target/ppc/int_helper.c:2033:1: note: in expansion of macro ‘VUPKPX’
2033 | VUPKPX(lpx, UPKLO)
| ^~~~~~

Signed-off-by: Cédric Le Goater <clg@kaod.org>
Message-ID: <20230929083143.234553-1-clg@kaod.org>
Reviewed-by: Michael Tokarev <mjt@tls.msk.ru>
Signed-off-by: Markus Armbruster <armbru@redhat.com>

show more ...


Revision tags: v9.0.0-rc4, v9.0.0-rc3, v9.0.0-rc2, v9.0.0-rc1, v9.0.0-rc0, v8.2.2, v7.2.10, v8.2.1, v8.1.5, v7.2.9, v8.1.4, v7.2.8, v8.2.0, v8.2.0-rc4, v8.2.0-rc3, v8.2.0-rc2, v8.2.0-rc1, v7.2.7, v8.1.3, v8.2.0-rc0, v8.1.2
# 668a6314 29-Sep-2023 Cédric Le Goater <clg@kaod.org>

target/ppc: Rename variables to avoid local variable shadowing in VUPKPX

and fix such warnings :

../target/ppc/int_helper.c: In function ‘helper_vupklpx’:
../target/ppc/int_helper.c:2025:21: wa

target/ppc: Rename variables to avoid local variable shadowing in VUPKPX

and fix such warnings :

../target/ppc/int_helper.c: In function ‘helper_vupklpx’:
../target/ppc/int_helper.c:2025:21: warning: declaration of ‘r’ shadows a parameter [-Wshadow=local]
2025 | uint8_t r = (e >> 10) & 0x1f; \
| ^
../target/ppc/int_helper.c:2033:1: note: in expansion of macro ‘VUPKPX’
2033 | VUPKPX(lpx, UPKLO)
| ^~~~~~
../target/ppc/int_helper.c:2017:41: note: shadowed declaration is here
2017 | void helper_vupk##suffix(ppc_avr_t *r, ppc_avr_t *b) \
| ~~~~~~~~~~~^
../target/ppc/int_helper.c:2033:1: note: in expansion of macro ‘VUPKPX’
2033 | VUPKPX(lpx, UPKLO)
| ^~~~~~

Signed-off-by: Cédric Le Goater <clg@kaod.org>
Message-ID: <20230929083143.234553-1-clg@kaod.org>
Reviewed-by: Michael Tokarev <mjt@tls.msk.ru>
Signed-off-by: Markus Armbruster <armbru@redhat.com>

show more ...


Revision tags: v9.0.0-rc4, v9.0.0-rc3, v9.0.0-rc2, v9.0.0-rc1, v9.0.0-rc0, v8.2.2, v7.2.10, v8.2.1, v8.1.5, v7.2.9, v8.1.4, v7.2.8, v8.2.0, v8.2.0-rc4, v8.2.0-rc3, v8.2.0-rc2, v8.2.0-rc1, v7.2.7, v8.1.3, v8.2.0-rc0, v8.1.2
# 668a6314 29-Sep-2023 Cédric Le Goater <clg@kaod.org>

target/ppc: Rename variables to avoid local variable shadowing in VUPKPX

and fix such warnings :

../target/ppc/int_helper.c: In function ‘helper_vupklpx’:
../target/ppc/int_helper.c:2025:21: wa

target/ppc: Rename variables to avoid local variable shadowing in VUPKPX

and fix such warnings :

../target/ppc/int_helper.c: In function ‘helper_vupklpx’:
../target/ppc/int_helper.c:2025:21: warning: declaration of ‘r’ shadows a parameter [-Wshadow=local]
2025 | uint8_t r = (e >> 10) & 0x1f; \
| ^
../target/ppc/int_helper.c:2033:1: note: in expansion of macro ‘VUPKPX’
2033 | VUPKPX(lpx, UPKLO)
| ^~~~~~
../target/ppc/int_helper.c:2017:41: note: shadowed declaration is here
2017 | void helper_vupk##suffix(ppc_avr_t *r, ppc_avr_t *b) \
| ~~~~~~~~~~~^
../target/ppc/int_helper.c:2033:1: note: in expansion of macro ‘VUPKPX’
2033 | VUPKPX(lpx, UPKLO)
| ^~~~~~

Signed-off-by: Cédric Le Goater <clg@kaod.org>
Message-ID: <20230929083143.234553-1-clg@kaod.org>
Reviewed-by: Michael Tokarev <mjt@tls.msk.ru>
Signed-off-by: Markus Armbruster <armbru@redhat.com>

show more ...


Revision tags: v8.1.1, v7.2.6, v8.0.5, v8.1.0, v8.1.0-rc4, v8.1.0-rc3, v7.2.5, v8.0.4, v8.1.0-rc2, v8.1.0-rc1, v8.1.0-rc0
# 7bdbf233 11-Jul-2023 Richard Henderson <richard.henderson@linaro.org>

target/ppc: Use clmul_64

Use generic routine for 64-bit carry-less multiply.

Reviewed-by: Philippe Mathieu-Daudé <philmd@linaro.org>
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>


# f56d3c1a 11-Jul-2023 Richard Henderson <richard.henderson@linaro.org>

target/ppc: Use clmul_32* routines

Use generic routines for 32-bit carry-less multiply.

Signed-off-by: Richard Henderson <richard.henderson@linaro.org>


# a2c67342 11-Jul-2023 Richard Henderson <richard.henderson@linaro.org>

target/ppc: Use clmul_16* routines

Use generic routines for 16-bit carry-less multiply.

Signed-off-by: Richard Henderson <richard.henderson@linaro.org>


# cec4090d 11-Jul-2023 Richard Henderson <richard.henderson@linaro.org>

target/ppc: Use clmul_8* routines

Use generic routines for 8-bit carry-less multiply.

Signed-off-by: Richard Henderson <richard.henderson@linaro.org>


# 73c19706 28-Aug-2023 Philippe Mathieu-Daudé <philmd@linaro.org>

target/helpers: Remove unnecessary 'qemu/main-loop.h' header

"qemu/main-loop.h" declares functions related to QEMU's
main loop mutex, which these files don't access. Remove
the unused "qemu/main-loo

target/helpers: Remove unnecessary 'qemu/main-loop.h' header

"qemu/main-loop.h" declares functions related to QEMU's
main loop mutex, which these files don't access. Remove
the unused "qemu/main-loop.h" header.

Signed-off-by: Philippe Mathieu-Daudé <philmd@linaro.org>
Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
Message-Id: <20230828221314.18435-8-philmd@linaro.org>

show more ...


# 73c19706 28-Aug-2023 Philippe Mathieu-Daudé <philmd@linaro.org>

target/helpers: Remove unnecessary 'qemu/main-loop.h' header

"qemu/main-loop.h" declares functions related to QEMU's
main loop mutex, which these files don't access. Remove
the unused "qemu/main-loo

target/helpers: Remove unnecessary 'qemu/main-loop.h' header

"qemu/main-loop.h" declares functions related to QEMU's
main loop mutex, which these files don't access. Remove
the unused "qemu/main-loop.h" header.

Signed-off-by: Philippe Mathieu-Daudé <philmd@linaro.org>
Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
Message-Id: <20230828221314.18435-8-philmd@linaro.org>

show more ...


# 73c19706 28-Aug-2023 Philippe Mathieu-Daudé <philmd@linaro.org>

target/helpers: Remove unnecessary 'qemu/main-loop.h' header

"qemu/main-loop.h" declares functions related to QEMU's
main loop mutex, which these files don't access. Remove
the unused "qemu/main-loo

target/helpers: Remove unnecessary 'qemu/main-loop.h' header

"qemu/main-loop.h" declares functions related to QEMU's
main loop mutex, which these files don't access. Remove
the unused "qemu/main-loop.h" header.

Signed-off-by: Philippe Mathieu-Daudé <philmd@linaro.org>
Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
Message-Id: <20230828221314.18435-8-philmd@linaro.org>

show more ...


Revision tags: v8.0.3, v7.2.4
# af4cb945 02-Jun-2023 Richard Henderson <richard.henderson@linaro.org>

target/ppc: Use aesdec_ISB_ISR_AK_IMC

This implements the VNCIPHER instruction.

Reviewed-by: Philippe Mathieu-Daudé <philmd@linaro.org>
Reviewed-by: Daniel Henrique Barboza <danielhb413@gmail.com>

target/ppc: Use aesdec_ISB_ISR_AK_IMC

This implements the VNCIPHER instruction.

Reviewed-by: Philippe Mathieu-Daudé <philmd@linaro.org>
Reviewed-by: Daniel Henrique Barboza <danielhb413@gmail.com>
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>

show more ...


# ce9f5b37 02-Jun-2023 Richard Henderson <richard.henderson@linaro.org>

target/ppc: Use aesenc_SB_SR_MC_AK

This implements the VCIPHER instruction.

Reviewed-by: Philippe Mathieu-Daudé <philmd@linaro.org>
Reviewed-by: Daniel Henrique Barboza <danielhb413@gmail.com>
Sign

target/ppc: Use aesenc_SB_SR_MC_AK

This implements the VCIPHER instruction.

Reviewed-by: Philippe Mathieu-Daudé <philmd@linaro.org>
Reviewed-by: Daniel Henrique Barboza <danielhb413@gmail.com>
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>

show more ...


# 2cf44f3b 02-Jun-2023 Richard Henderson <richard.henderson@linaro.org>

target/ppc: Use aesdec_ISB_ISR_AK

This implements the VNCIPHERLAST instruction.

Reviewed-by: Philippe Mathieu-Daudé <philmd@linaro.org>
Reviewed-by: Daniel Henrique Barboza <danielhb413@gmail.com>

target/ppc: Use aesdec_ISB_ISR_AK

This implements the VNCIPHERLAST instruction.

Reviewed-by: Philippe Mathieu-Daudé <philmd@linaro.org>
Reviewed-by: Daniel Henrique Barboza <danielhb413@gmail.com>
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>

show more ...


# 7df34e48 02-Jun-2023 Richard Henderson <richard.henderson@linaro.org>

target/ppc: Use aesenc_SB_SR_AK

This implements the VCIPHERLAST instruction.

Reviewed-by: Philippe Mathieu-Daudé <philmd@linaro.org>
Reviewed-by: Daniel Henrique Barboza <danielhb413@gmail.com>
Sig

target/ppc: Use aesenc_SB_SR_AK

This implements the VCIPHERLAST instruction.

Reviewed-by: Philippe Mathieu-Daudé <philmd@linaro.org>
Reviewed-by: Daniel Henrique Barboza <danielhb413@gmail.com>
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>

show more ...


Revision tags: v8.0.2, v8.0.1, v7.2.3, v7.2.2, v8.0.0, v8.0.0-rc4, v8.0.0-rc3, v7.2.1, v8.0.0-rc2, v8.0.0-rc1, v8.0.0-rc0, v7.2.0, v7.2.0-rc4, v7.2.0-rc3, v7.2.0-rc2, v7.2.0-rc1, v7.2.0-rc0
# 26c964f8 19-Oct-2022 Lucas Mateus Castro (alqotel) <lucas.araujo@eldorado.org.br>

target/ppc: Move VABSDU[BHW] to decodetree and use gvec

Moved VABSDUB, VABSDUH and VABSDUW to decodetree and use gvec to
translate them.

vabsdub:
rept loop master patch
8 12

target/ppc: Move VABSDU[BHW] to decodetree and use gvec

Moved VABSDUB, VABSDUH and VABSDUW to decodetree and use gvec to
translate them.

vabsdub:
rept loop master patch
8 12500 0,03601600 0,00688500 (-80.9%)
25 4000 0,03651000 0,00532100 (-85.4%)
100 1000 0,03666900 0,00595300 (-83.8%)
500 200 0,04305800 0,01244600 (-71.1%)
2500 40 0,06893300 0,04273700 (-38.0%)
8000 12 0,14633200 0,12660300 (-13.5%)

vabsduh:
rept loop master patch
8 12500 0,02172400 0,00687500 (-68.4%)
25 4000 0,02154100 0,00531500 (-75.3%)
100 1000 0,02235400 0,00596300 (-73.3%)
500 200 0,02827500 0,01245100 (-56.0%)
2500 40 0,05638400 0,04285500 (-24.0%)
8000 12 0,13166000 0,12641400 (-4.0%)

vabsduw:
rept loop master patch
8 12500 0,01646400 0,00688300 (-58.2%)
25 4000 0,01454500 0,00475500 (-67.3%)
100 1000 0,01545800 0,00511800 (-66.9%)
500 200 0,02168200 0,01114300 (-48.6%)
2500 40 0,04571300 0,04138800 (-9.5%)
8000 12 0,12209500 0,12178500 (-0.3%)

Same as VADDCUW and VSUBCUW, overall performance gain but it uses more
TCGop (4 before the patch, 6 after).

Signed-off-by: Lucas Mateus Castro (alqotel) <lucas.araujo@eldorado.org.br>
Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
Message-Id: <20221019125040.48028-8-lucas.araujo@eldorado.org.br>
Signed-off-by: Daniel Henrique Barboza <danielhb413@gmail.com>

show more ...


# c85929b2 19-Oct-2022 Lucas Mateus Castro (alqotel) <lucas.araujo@eldorado.org.br>

target/ppc: Move VAVG[SU][BHW] to decodetree and use gvec

Moved the instructions VAVGUB, VAVGUH, VAVGUW, VAVGSB, VAVGSH, VAVGSW,
to decodetree and use gvec with them. For these one the right shift
h

target/ppc: Move VAVG[SU][BHW] to decodetree and use gvec

Moved the instructions VAVGUB, VAVGUH, VAVGUW, VAVGSB, VAVGSH, VAVGSW,
to decodetree and use gvec with them. For these one the right shift
had to be made before the sum as to avoid an overflow, so add 1 at the
end if any of the entries had 1 in its LSB as to replicate the "+ 1"
before the shift described by the ISA.

vavgub:
rept loop master patch
8 12500 0,02616600 0,00754200 (-71.2%)
25 4000 0,02530000 0,00637700 (-74.8%)
100 1000 0,02604600 0,00790100 (-69.7%)
500 200 0,03189300 0,01838400 (-42.4%)
2500 40 0,06006900 0,06851000 (+14.1%)
8000 12 0,13941000 0,20548500 (+47.4%)

vavguh:
rept loop master patch
8 12500 0,01818200 0,00780600 (-57.1%)
25 4000 0,01789300 0,00641600 (-64.1%)
100 1000 0,01899100 0,00787200 (-58.5%)
500 200 0,02527200 0,01828400 (-27.7%)
2500 40 0,05361800 0,06773000 (+26.3%)
8000 12 0,12886600 0,20291400 (+57.5%)

vavguw:
rept loop master patch
8 12500 0,01423100 0,00776600 (-45.4%)
25 4000 0,01780800 0,00638600 (-64.1%)
100 1000 0,02085500 0,00787000 (-62.3%)
500 200 0,02737100 0,01828800 (-33.2%)
2500 40 0,05572600 0,06774200 (+21.6%)
8000 12 0,13101700 0,20311600 (+55.0%)

vavgsb:
rept loop master patch
8 12500 0,03006000 0,00788600 (-73.8%)
25 4000 0,02882200 0,00637800 (-77.9%)
100 1000 0,02958000 0,00791400 (-73.2%)
500 200 0,03548800 0,01860400 (-47.6%)
2500 40 0,06360000 0,06850800 (+7.7%)
8000 12 0,13816500 0,20550300 (+48.7%)

vavgsh:
rept loop master patch
8 12500 0,01965900 0,00776600 (-60.5%)
25 4000 0,01875400 0,00638700 (-65.9%)
100 1000 0,01952200 0,00786900 (-59.7%)
500 200 0,02562000 0,01760300 (-31.3%)
2500 40 0,05384300 0,06742800 (+25.2%)
8000 12 0,13240800 0,20330000 (+53.5%)

vavgsw:
rept loop master patch
8 12500 0,01407700 0,00775600 (-44.9%)
25 4000 0,01762300 0,00640000 (-63.7%)
100 1000 0,02046500 0,00788500 (-61.5%)
500 200 0,02745600 0,01843000 (-32.9%)
2500 40 0,05375500 0,06820500 (+26.9%)
8000 12 0,13068300 0,20304900 (+55.4%)

These results to me seems to indicate that with gvec the results have a
slower translation but faster execution.

Signed-off-by: Lucas Mateus Castro (alqotel) <lucas.araujo@eldorado.org.br>
Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
Message-Id: <20221019125040.48028-7-lucas.araujo@eldorado.org.br>
Signed-off-by: Daniel Henrique Barboza <danielhb413@gmail.com>

show more ...


# d57fbd8f 19-Oct-2022 Lucas Mateus Castro (alqotel) <lucas.araujo@eldorado.org.br>

target/ppc: Move VPRTYB[WDQ] to decodetree and use gvec

Moved VPRTYBW and VPRTYBD to use gvec and both of them and VPRTYBQ to
decodetree. VPRTYBW and VPRTYBD now also use .fni4 and .fni8,
respective

target/ppc: Move VPRTYB[WDQ] to decodetree and use gvec

Moved VPRTYBW and VPRTYBD to use gvec and both of them and VPRTYBQ to
decodetree. VPRTYBW and VPRTYBD now also use .fni4 and .fni8,
respectively.

vprtybw:
rept loop master patch
8 12500 0,01198900 0,00703100 (-41.4%)
25 4000 0,01070100 0,00571400 (-46.6%)
100 1000 0,01123300 0,00678200 (-39.6%)
500 200 0,01601500 0,01535600 (-4.1%)
2500 40 0,03872900 0,05562100 (43.6%)
8000 12 0,10047000 0,16643000 (65.7%)

vprtybd:
rept loop master patch
8 12500 0,00757700 0,00788100 (4.0%)
25 4000 0,00652500 0,00669600 (2.6%)
100 1000 0,00714400 0,00825400 (15.5%)
500 200 0,01211000 0,01903700 (57.2%)
2500 40 0,03483800 0,07021200 (101.5%)
8000 12 0,09591800 0,21036200 (119.3%)

vprtybq:
rept loop master patch
8 12500 0,00675600 0,00667200 (-1.2%)
25 4000 0,00619400 0,00643200 (3.8%)
100 1000 0,00707100 0,00751100 (6.2%)
500 200 0,01199300 0,01342000 (11.9%)
2500 40 0,03490900 0,04092900 (17.2%)
8000 12 0,09588200 0,11465100 (19.6%)

I wasn't expecting such a performance lost in both VPRTYBD and VPRTYBQ,
I'm not sure if it's worth to move those instructions. Comparing the
assembly of the helper with the TCGop they are pretty similar, so
I'm not sure why vprtybd took so much more time.

Signed-off-by: Lucas Mateus Castro (alqotel) <lucas.araujo@eldorado.org.br>
Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
Message-Id: <20221019125040.48028-6-lucas.araujo@eldorado.org.br>
Signed-off-by: Daniel Henrique Barboza <danielhb413@gmail.com>

show more ...


# 90b5aadb 19-Oct-2022 Lucas Mateus Castro (alqotel) <lucas.araujo@eldorado.org.br>

target/ppc: Move VNEG[WD] to decodtree and use gvec

Moved the instructions VNEGW and VNEGD to decodetree and used gvec to
decode it.

vnegw:
rept loop master patch
8 12500

target/ppc: Move VNEG[WD] to decodtree and use gvec

Moved the instructions VNEGW and VNEGD to decodetree and used gvec to
decode it.

vnegw:
rept loop master patch
8 12500 0,01053200 0,00548400 (-47.9%)
25 4000 0,01030500 0,00390000 (-62.2%)
100 1000 0,01096300 0,00395400 (-63.9%)
500 200 0,01472000 0,00712300 (-51.6%)
2500 40 0,03809000 0,02147700 (-43.6%)
8000 12 0,09957100 0,06202100 (-37.7%)

vnegd:
rept loop master patch
8 12500 0,00594600 0,00543800 (-8.5%)
25 4000 0,00575200 0,00396400 (-31.1%)
100 1000 0,00676100 0,00394800 (-41.6%)
500 200 0,01149300 0,00709400 (-38.3%)
2500 40 0,03441500 0,02169600 (-37.0%)
8000 12 0,09516900 0,06337000 (-33.4%)

Signed-off-by: Lucas Mateus Castro (alqotel) <lucas.araujo@eldorado.org.br>
Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
Message-Id: <20221019125040.48028-5-lucas.araujo@eldorado.org.br>
Signed-off-by: Daniel Henrique Barboza <danielhb413@gmail.com>

show more ...


# 611bc69b 19-Oct-2022 Lucas Mateus Castro (alqotel) <lucas.araujo@eldorado.org.br>

target/ppc: Move V(ADD|SUB)CUW to decodetree and use gvec

This patch moves VADDCUW and VSUBCUW to decodtree with gvec using an
implementation based on the helper, with the main difference being
chan

target/ppc: Move V(ADD|SUB)CUW to decodetree and use gvec

This patch moves VADDCUW and VSUBCUW to decodtree with gvec using an
implementation based on the helper, with the main difference being
changing the -1 (aka all bits set to 1) result returned by cmp when
true to +1. It also implemented a .fni4 version of those instructions
and dropped the helper.

vaddcuw:
rept loop master patch
8 12500 0,01008200 0,00612400 (-39.3%)
25 4000 0,01091500 0,00471600 (-56.8%)
100 1000 0,01332500 0,00593700 (-55.4%)
500 200 0,01998500 0,01275700 (-36.2%)
2500 40 0,04704300 0,04364300 (-7.2%)
8000 12 0,10748200 0,11241000 (+4.6%)

vsubcuw:
rept loop master patch
8 12500 0,01226200 0,00571600 (-53.4%)
25 4000 0,01493500 0,00462100 (-69.1%)
100 1000 0,01522700 0,00455100 (-70.1%)
500 200 0,02384600 0,01133500 (-52.5%)
2500 40 0,04935200 0,03178100 (-35.6%)
8000 12 0,09039900 0,09440600 (+4.4%)

Overall there was a gain in performance, but the TCGop code was still
slightly bigger in the new version (it went from 4 to 5).

Signed-off-by: Lucas Mateus Castro (alqotel) <lucas.araujo@eldorado.org.br>
Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
Message-Id: <20221019125040.48028-4-lucas.araujo@eldorado.org.br>
Signed-off-by: Daniel Henrique Barboza <danielhb413@gmail.com>

show more ...


# 306e4753 19-Oct-2022 Lucas Mateus Castro (alqotel) <lucas.araujo@eldorado.org.br>

target/ppc: Move VMH[R]ADDSHS instruction to decodetree

This patch moves VMHADDSHS and VMHRADDSHS to decodetree I couldn't find
a satisfactory implementation with TCG inline.

vmhaddshs:
rept loo

target/ppc: Move VMH[R]ADDSHS instruction to decodetree

This patch moves VMHADDSHS and VMHRADDSHS to decodetree I couldn't find
a satisfactory implementation with TCG inline.

vmhaddshs:
rept loop master patch
8 12500 0,02983400 0,02648500 (-11.2%)
25 4000 0,02946000 0,02518000 (-14.5%)
100 1000 0,03104300 0,02638000 (-15.0%)
500 200 0,04002000 0,03502500 (-12.5%)
2500 40 0,08090100 0,07562200 (-6.5%)
8000 12 0,19242600 0,18626800 (-3.2%)

vmhraddshs:
rept loop master patch
8 12500 0,03078600 0,02851000 (-7.4%)
25 4000 0,02793200 0,02746900 (-1.7%)
100 1000 0,02886000 0,02839900 (-1.6%)
500 200 0,03714700 0,03799200 (+2.3%)
2500 40 0,07948000 0,07852200 (-1.2%)
8000 12 0,19049800 0,18813900 (-1.2%)

Signed-off-by: Lucas Mateus Castro (alqotel) <lucas.araujo@eldorado.org.br>
Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
Message-Id: <20221019125040.48028-3-lucas.araujo@eldorado.org.br>
Signed-off-by: Daniel Henrique Barboza <danielhb413@gmail.com>

show more ...


1234567