History log of /linux/arch/x86/crypto/aesni-intel_glue.c (Results 1 – 25 of 121)
Revision Date Author Comments
# 6d85a058 20-May-2024 Tony Luck <tony.luck@intel.com>

crypto: x86/aes-xts - switch to new Intel CPU model defines

New CPU #defines encode vendor and family as well as model.

Signed-off-by: Tony Luck <tony.luck@intel.com>
Signed-off-by: Borislav Petkov (AMD) <bp@alien8.de>
Reviewed-by: Eric Biggers <ebiggers@google.com>
Acked-by: Herbert Xu <herbert@gondor.apana.org.au>
Link: https://lore.kernel.org/r/20240520224620.9480-2-tony.luck@intel.com
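
For illustration, a minimal before/after sketch of the kind of table entry this series converts (the glue code's zmm exclusion list is one such user); the macro arguments and the SKYLAKE_X model are examples, not a verbatim excerpt from aesni-intel_glue.c:

    /* old style: model-only define, Intel and family 6 implied by the macro */
    X86_MATCH_INTEL_FAM6_MODEL(SKYLAKE_X, 0),

    /* new style: a single define encodes vendor, family, and model */
    X86_MATCH_VFM(INTEL_SKYLAKE_X, 0),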


# ed265f7f 20-Apr-2024 Eric Biggers <ebiggers@google.com>

crypto: x86/aes-gcm - simplify GCM hash subkey derivation

Remove a redundant expansion of the AES key, and use rodata for zeroes.
Also rename rfc4106_set_hash_subkey() to aes_gcm_derive_hash_subkey()
because it's used for both versions of AES-GCM, not just RFC4106.

Signed-off-by: Eric Biggers <ebiggers@google.com>
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
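
As a rough sketch of what the renamed helper boils down to (the signature and the GHASH_BLOCK_SIZE name are assumptions, not copied from the patch): the hash subkey is just the encryption of an all-zero block under the already-expanded AES key, so no second key expansion is needed and the zero block can live in rodata:

    static void aes_gcm_derive_hash_subkey(const struct crypto_aes_ctx *aes_key,
                                           u8 hash_subkey[GHASH_BLOCK_SIZE])
    {
            static const u8 zeroes[GHASH_BLOCK_SIZE];   /* rodata, not a stack buffer */

            aes_encrypt(aes_key, hash_subkey, zeroes);  /* H = AES-Enc(K, 0^128) */
    }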


# 6a805864 20-Apr-2024 Eric Biggers <ebiggers@google.com>

crypto: x86/aes-xts - simplify loop in xts_crypt_slowpath()

Since the total length processed by the loop in xts_crypt_slowpath() is
a multiple of AES_BLOCK_SIZE, just round the length down to
AES_BLOCK_SIZE even on the last step. This doesn't change behavior, as
the last step will process a multiple of AES_BLOCK_SIZE regardless.

Signed-off-by: Eric Biggers <ebiggers@google.com>
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
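
A sketch of the effect on the loop body (names simplified rather than quoted from the file): every step, including the last, now passes a block-aligned length to the assembly routine, so the last-step special case disappears:

    /* walk.total is a multiple of AES_BLOCK_SIZE, so rounding down on the
     * final step drops nothing and behavior is unchanged. */
    unsigned int nbytes = round_down(walk.nbytes, AES_BLOCK_SIZE);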


# 543ea178 13-Apr-2024 Eric Biggers <ebiggers@google.com>

crypto: x86/aes-xts - optimize size of instructions operating on lengths

x86_64 has the "interesting" property that the instruction size is
generally a bit shorter for instructions that operate on the 32-bit (or
less) part of registers, or registers that are in the original set of 8.

This patch adjusts the AES-XTS code to take advantage of that property
by changing the LEN parameter from size_t to unsigned int (which is all
that's needed and is what the non-AVX implementation uses) and using the
%eax register for KEYLEN.

This decreases the size of aes-xts-avx-x86_64.o by 1.2%.

Note that changing the kmovq to kmovd was going to be needed anyway to
make the AVX10/256 code really work on CPUs that don't support 512-bit
vectors (since the AVX10 spec says that 64-bit opmask instructions will
only be supported on processors that support 512-bit vectors).

Signed-off-by: Eric Biggers <ebiggers@google.com>
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
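
As a generic illustration of the encoding effect described above (not taken from the patch itself): dropping the REX.W prefix by working on the 32-bit halves of registers saves a byte per instruction, and 32-bit writes still zero-extend into the full 64-bit register:

    movq    %rsi, %rdx      # 48 89 f2 - 3 bytes (REX.W prefix required)
    movl    %esi, %edx      # 89 f2    - 2 bytes; upper 32 bits of %rdx are cleared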


# 751fb252 07-Apr-2024 Eric Biggers <ebiggers@google.com>

crypto: x86/aes-xts - make non-AVX implementation use new glue code

Make the non-AVX implementation of AES-XTS (xts-aes-aesni) use the new
glue code that was introduced for the AVX implementations of AES-XTS.
This reduces code size, and it improves the performance of xts-aes-aesni
due to the optimization for messages that don't span page boundaries.

This required moving the new glue functions higher up in the file and
allowing the IV encryption function to be specified by the caller.

Signed-off-by: Eric Biggers <ebiggers@google.com>
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>


# aa2197f5 29-Mar-2024 Eric Biggers <ebiggers@google.com>

crypto: x86/aes-xts - wire up VAES + AVX10/512 implementation

Add an AES-XTS implementation "xts-aes-vaes-avx10_512" for x86_64 CPUs
with the VAES, VPCLMULQDQ, and either AVX10/512 or AVX512BW + AVX512VL
extensions. This implementation uses zmm registers to operate on four
AES blocks at a time. The assembly code is instantiated using a macro
so that most of the source code is shared with other implementations.

To avoid downclocking on older Intel CPU models, an exclusion list is
used to prevent this 512-bit implementation from being used by default
on some CPU models. They will use xts-aes-vaes-avx10_256 instead. For
now, this exclusion list is simply coded into aesni-intel_glue.c. It
may make sense to eventually move it into a more central location.

xts-aes-vaes-avx10_512 is slightly faster than xts-aes-vaes-avx10_256 on
some current CPUs. E.g., on AMD Zen 4, AES-256-XTS decryption
throughput increases by 13% with 4096-byte inputs, or 14% with 512-byte
inputs. On Intel Sapphire Rapids, AES-256-XTS decryption throughput
increases by 2% with 4096-byte inputs, or 3% with 512-byte inputs.

Future CPUs may provide stronger 512-bit support, in which case a larger
benefit should be seen.

Signed-off-by: Eric Biggers <ebiggers@google.com>
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
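
A hedged sketch of the gating described above; the list name, the example model, and the flag are illustrative rather than quotes from aesni-intel_glue.c:

    static const struct x86_cpu_id zmm_exclusion_list[] = {
            X86_MATCH_INTEL_FAM6_MODEL(SKYLAKE_X, 0),   /* example entry */
            /* ... other models known to downclock on 512-bit operations ... */
            {}
    };

    /* at registration time: on listed CPUs, prefer the 256-bit variant */
    if (x86_match_cpu(zmm_exclusion_list))
            prefer_256bit_variant = true;   /* hypothetical flag name */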


# ee63fea0 29-Mar-2024 Eric Biggers <ebiggers@google.com>

crypto: x86/aes-xts - wire up VAES + AVX10/256 implementation

Add an AES-XTS implementation "xts-aes-vaes-avx10_256" for x86_64 CPUs
with the VAES, VPCLMULQDQ, and either AVX10/256 or AVX512BW + AVX512VL
extensions. This implementation avoids using zmm registers, instead
using ymm registers to operate on two AES blocks at a time. The
assembly code is instantiated using a macro so that most of the source
code is shared with other implementations.

This is the optimal implementation on CPUs that support VAES and AVX512
but where the zmm registers should not be used due to downclocking
effects, for example Intel's Ice Lake. It should also be the optimal
implementation on future CPUs that support AVX10/256 but not AVX10/512.

The performance is slightly better than that of xts-aes-vaes-avx2, which
uses the same 256-bit vector length, due to factors such as being able
to use ymm16-ymm31 to cache the AES round keys, and being able to use
the vpternlogd instruction to do XORs more efficiently. For example, on
Ice Lake, the throughput of decrypting 4096-byte messages with
AES-256-XTS is 6.6% higher with xts-aes-vaes-avx10_256 than with
xts-aes-vaes-avx2. While this is a small improvement, it is
straightforward to provide this implementation (xts-aes-vaes-avx10_256)
as long as we are providing xts-aes-vaes-avx2 and xts-aes-vaes-avx10_512
anyway, due to the way the _aes_xts_crypt macro is structured.

Signed-off-by: Eric Biggers <ebiggers@google.com>
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
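
For example, where two VEX XOR instructions would be needed to combine three operands, a single ternary-logic instruction does the job (imm8 0x96 is the three-input XOR truth table); the line below is illustrative, not quoted from the assembly:

    vpternlogd  $0x96, %ymm2, %ymm1, %ymm0      # ymm0 ^= ymm1 ^ ymm2 in one instruction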


# e787060b 29-Mar-2024 Eric Biggers <ebiggers@google.com>

crypto: x86/aes-xts - wire up VAES + AVX2 implementation

Add an AES-XTS implementation "xts-aes-vaes-avx2" for x86_64 CPUs with
the VAES, VPCLMULQDQ, and AVX2 extensions, but not AVX512 or AVX10.
This implementation uses ymm registers to operate on two AES blocks at a
time. The assembly code is instantiated using a macro so that most of
the source code is shared with other implementations.

This is the optimal implementation on AMD Zen 3. It should also be the
optimal implementation on Intel Alder Lake, which similarly supports
VAES but not AVX512. Comparing to xts-aes-aesni-avx on Zen 3,
xts-aes-vaes-avx2 provides 70% higher AES-256-XTS decryption throughput
with 4096-byte messages, or 23% higher with 512-byte messages.

A large improvement is also seen with CPUs that do support AVX512 (e.g.,
98% higher AES-256-XTS decryption throughput on Ice Lake with 4096-byte
messages), though the following patches add AVX512 optimized
implementations to get a bit more performance on those CPUs.

Signed-off-by: Eric Biggers <ebiggers@google.com>
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>


# 996f4dcb 29-Mar-2024 Eric Biggers <ebiggers@google.com>

crypto: x86/aes-xts - wire up AESNI + AVX implementation

Add an AES-XTS implementation "xts-aes-aesni-avx" for x86_64 CPUs that
have the AES-NI and AVX extensions but not VAES. It's similar to the
existing xts-aes-aesni in that it uses xmm registers to operate on one AES
block at a time. It differs from xts-aes-aesni in the following ways:

- It uses the VEX-coded (non-destructive) instructions from AVX.
  This improves performance slightly.
- It incorporates some additional optimizations such as interleaving the
  tweak computation with AES en/decryption, handling single-page
  messages more efficiently, and caching the first round key.
- It supports only 64-bit (x86_64).
- It's generated by an assembly macro that will also be used to generate
  VAES-based implementations.

The performance improvement over xts-aes-aesni varies from small to
large, depending on the CPU and other factors such as the size of the
messages en/decrypted. For example, the following increases in
AES-256-XTS decryption throughput are seen on the following CPUs:

                      | 4096-byte messages | 512-byte messages |
----------------------+--------------------+-------------------+
Intel Skylake         |         6%         |        31%        |
Intel Cascade Lake    |         4%         |        26%        |
AMD Zen 1             |        61%         |        73%        |
AMD Zen 2             |        36%         |        59%        |

(The above CPUs don't support VAES, so they can't use VAES instead.)

While this isn't as large an improvement as what VAES provides, this
still seems worthwhile. This implementation is fairly easy to provide
based on the assembly macro that's needed for VAES anyway, and it will
be the best implementation on a large number of CPUs (very roughly, the
CPUs launched by Intel and AMD from 2011 to 2018).

This makes the existing xts-aes-aesni *mostly* obsolete. For now, leave
it in place to support 32-bit kernels and also CPUs like Intel Westmere
that support AES-NI but not AVX. (We could potentially remove it anyway
and just rely on the indirect acceleration via ecb-aes-aesni in those
cases, but that change will need to be considered separately.)

Signed-off-by: Eric Biggers <ebiggers@google.com>
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
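
To make the first bullet point concrete (illustrative instructions, not lines from the patch): the legacy SSE encodings overwrite one of their sources, which often forces an extra register copy, while the VEX forms take a separate destination:

    # legacy AES-NI/SSE: destination doubles as a source
    movdqa  %xmm0, %xmm1
    pxor    %xmm2, %xmm1            # xmm1 = xmm0 ^ xmm2, needed the copy above

    # VEX-coded (AVX): non-destructive three-operand form
    vpxor   %xmm2, %xmm0, %xmm1     # xmm1 = xmm0 ^ xmm2, no copy required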


# e3299a4c 22-Mar-2024 Chang S. Bae <chang.seok.bae@intel.com>

crypto: x86/aesni - Update aesni_set_key() to return void

The aesni_set_key() implementation has no error case, yet its prototype
specifies to return an error code.

Modify the function prototype to return void and adjust the related code.

Signed-off-by: Chang S. Bae <chang.seok.bae@intel.com>
Reviewed-by: Eric Biggers <ebiggers@google.com>
Cc: Eric Biggers <ebiggers@kernel.org>
Cc: Ard Biesheuvel <ardb@kernel.org>
Cc: linux-crypto@vger.kernel.org
Cc: x86@kernel.org
Cc: linux-kernel@vger.kernel.org
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
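
The resulting prototype change amounts to the following (parameter names reconstructed from context rather than copied verbatim):

    /* before */
    asmlinkage int  aesni_set_key(struct crypto_aes_ctx *ctx,
                                  const u8 *in_key, unsigned int key_len);

    /* after: the asm routine cannot fail, so it no longer pretends to */
    asmlinkage void aesni_set_key(struct crypto_aes_ctx *ctx,
                                  const u8 *in_key, unsigned int key_len);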


# d50b35f0 22-Mar-2024 Chang S. Bae <chang.seok.bae@intel.com>

crypto: x86/aesni - Rearrange AES key size check

aes_expandkey() already includes an AES key size check. If AES-NI is
unusable, invoke the function without the size check.

Also, use aes_check_keylen() instead of open code.

Signed-off-by: Chang S. Bae <chang.seok.bae@intel.com>
Cc: Eric Biggers <ebiggers@kernel.org>
Cc: Ard Biesheuvel <ardb@kernel.org>
Cc: linux-crypto@vger.kernel.org
Cc: x86@kernel.org
Cc: linux-kernel@vger.kernel.org
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
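
Put together, the setkey path described above looks roughly like this (a sketch; helper names are approximations of the real ones):

    static int aes_set_key_common(struct crypto_aes_ctx *ctx,
                                  const u8 *in_key, unsigned int key_len)
    {
            int err;

            if (!crypto_simd_usable())
                    return aes_expandkey(ctx, in_key, key_len); /* checks key_len itself */

            err = aes_check_keylen(key_len);    /* replaces the open-coded switch */
            if (err)
                    return err;

            kernel_fpu_begin();
            aesni_set_key(ctx, in_key, key_len);
            kernel_fpu_end();
            return 0;
    }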


# e12a68b3 28-Sep-2023 Chang S. Bae <chang.seok.bae@intel.com>

crypto: x86/aesni - Perform address alignment early for XTS mode

Currently, the alignment of each field in struct aesni_xts_ctx occurs
right before every access. However, it's possible to perform this
alignment ahead of time.

Introduce a helper function that converts struct crypto_skcipher *tfm
to struct aesni_xts_ctx *ctx and returns an aligned address. Utilize
this helper function at the beginning of each XTS function and then
eliminate redundant alignment code.

Suggested-by: Eric Biggers <ebiggers@kernel.org>
Link: https://lore.kernel.org/all/ZFWQ4sZEVu%2FLHq+Q@gmail.com/
Signed-off-by: Chang S. Bae <chang.seok.bae@intel.com>
Cc: linux-crypto@vger.kernel.org
Cc: x86@kernel.org
Cc: linux-kernel@vger.kernel.org
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
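
The helper in question reduces to something like this (names are close to, but not guaranteed to match, those in the file):

    static inline struct aesni_xts_ctx *aes_xts_ctx(struct crypto_skcipher *tfm)
    {
            /* align once, up front, instead of before every field access */
            return aes_align_addr(crypto_skcipher_ctx(tfm));
    }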


# d148736f 28-Sep-2023 Chang S. Bae <chang.seok.bae@intel.com>

crypto: x86/aesni - Correct the data type in struct aesni_xts_ctx

Currently, every field in struct aesni_xts_ctx is defined as a byte
array of the same size as struct crypto_aes_ctx. This data type
is obscure and the choice lacks justification.

To rectify this, update the field type in struct aesni_xts_ctx to
match its actual structure.

Suggested-by: Eric Biggers <ebiggers@kernel.org>
Link: https://lore.kernel.org/all/ZFWQ4sZEVu%2FLHq+Q@gmail.com/
Signed-off-by: Chang S. Bae <chang.seok.bae@intel.com>
Cc: linux-crypto@vger.kernel.org
Cc: x86@kernel.org
Cc: linux-kernel@vger.kernel.org
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
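
Roughly, the structure goes from opaque byte arrays to its real layout (field and macro names are approximate):

    /* before */
    struct aesni_xts_ctx {
            u8 raw_tweak_ctx[sizeof(struct crypto_aes_ctx)] AESNI_ALIGN_ATTR;
            u8 raw_crypt_ctx[sizeof(struct crypto_aes_ctx)] AESNI_ALIGN_ATTR;
    };

    /* after */
    struct aesni_xts_ctx {
            struct crypto_aes_ctx tweak_ctx AESNI_ALIGN_ATTR;
            struct crypto_aes_ctx crypt_ctx AESNI_ALIGN_ATTR;
    };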


# 62496a2d 28-Sep-2023 Chang S. Bae <chang.seok.bae@intel.com>

crypto: x86/aesni - Refactor the common address alignment code

The address alignment code has been duplicated for each mode. Instead
of duplicating the same code, refactor the alignment code and simplify
the alignment helpers.

Suggested-by: Eric Biggers <ebiggers@kernel.org>
Link: https://lore.kernel.org/all/20230526065414.GB875@sol.localdomain/
Signed-off-by: Chang S. Bae <chang.seok.bae@intel.com>
Cc: linux-crypto@vger.kernel.org
Cc: x86@kernel.org
Cc: linux-kernel@vger.kernel.org
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
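
The shared helper that replaces the per-mode copies is essentially the following sketch (AESNI_ALIGN is the 16-byte alignment the assembly expects):

    static inline void *aes_align_addr(void *addr)
    {
            if (crypto_tfm_ctx_alignment() >= AESNI_ALIGN)
                    return addr;    /* context is already sufficiently aligned */
            return PTR_ALIGN(addr, AESNI_ALIGN);
    }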


# 28b77609 15-Jul-2023 Eric Biggers <ebiggers@google.com>

crypto: x86/aesni - remove unused parameter to aes_set_key_common()

The 'tfm' parameter to aes_set_key_common() is never used, so remove it.

Signed-off-by: Eric Biggers <ebiggers@google.com>
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>


# 74c6df41 21-Jun-2023 Chang S. Bae <chang.seok.bae@intel.com>

crypto: x86/aesni - Align the address before aes_set_key_common()

aes_set_key_common() performs runtime alignment to the void *raw_ctx
pointer. This facilitates consistent access to the 16-byte-aligned
address during key extension.

However, the alignment is already handled in the GCM-related setkey
functions before invoking the common function. Consequently, the
alignment in the common function is unnecessary for those functions.

To establish a consistent approach throughout the glue code, remove
the aes_ctx() call from its current location. Instead, place it at
each call site where the runtime alignment is currently absent.

Link: https://lore.kernel.org/lkml/20230605024623.GA4653@quark.localdomain/
Suggested-by: Eric Biggers <ebiggers@kernel.org>
Signed-off-by: Chang S. Bae <chang.seok.bae@intel.com>
Cc: linux-crypto@vger.kernel.org
Cc: x86@kernel.org
Cc: linux-kernel@vger.kernel.org
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>


# fd94fcf0 20-May-2022 Nathan Huckleberry <nhuck@google.com>

crypto: x86/aesni-xctr - Add accelerated implementation of XCTR

Add hardware accelerated version of XCTR for x86-64 CPUs with AESNI
support.

More information on XCTR can be found in the HCTR2 paper:
"Length-preserving encryption with HCTR2":
https://eprint.iacr.org/2021/1441.pdf

Signed-off-by: Nathan Huckleberry <nhuck@google.com>
Reviewed-by: Ard Biesheuvel <ardb@kernel.org>
Reviewed-by: Eric Biggers <ebiggers@google.com>
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
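
For reference, XCTR as defined in the HCTR2 paper differs from ordinary CTR only in how the per-block counter is combined with the IV (sketched below; this describes the construction, not code from the patch):

    /* keystream block i (1-based), with i encoded as a little-endian 128-bit value: */
    /*     S_i = AES-Encrypt(K, IV ^ le128(i))                                       */
    /* ciphertext = plaintext ^ (S_1 || S_2 || ...), truncated to the message length */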


# d480a26b 21-Dec-2021 Jakub Kicinski <kuba@kernel.org>

crypto: x86/aesni - don't require alignment of data

x86 AES-NI routines can deal with unaligned data. Crypto context
(key, iv etc.) have to be aligned but we take care of that separately
by copying it onto the stack. We were feeding unaligned data into
crypto routines up until commit 83c83e658863 ("crypto: aesni -
refactor scatterlist processing") switched to use the full
skcipher API which uses cra_alignmask to decide data alignment.

This fixes 21% performance regression in kTLS.

Tested by booting with CONFIG_CRYPTO_MANAGER_EXTRA_TESTS=y
(and running thru various kTLS packets).

CC: stable@vger.kernel.org # 5.15+
Fixes: 83c83e658863 ("crypto: aesni - refactor scatterlist processing")
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
Acked-by: Ard Biesheuvel <ardb@kernel.org>
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
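
Concretely, the fix boils down to advertising no alignment requirement in the algorithm definitions (field shown for illustration):

    .base.cra_alignmask = 0,    /* AES-NI loads/stores cope with unaligned data */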


# a2d3cbc8 11-Sep-2021 Shreyansh Chouhan <chouhan.shreyansh630@gmail.com>

crypto: aesni - check walk.nbytes instead of err

In the code for xts_crypt(), we check for the err value returned by
skcipher_walk_virt() and return from the function if it is non zero.
However, skcipher_walk_virt() can set walk.nbytes to 0, which would cause
us to call kernel_fpu_begin(), and then skip the kernel_fpu_end() call.

This patch checks for the walk.nbytes value instead, and returns if
walk.nbytes is 0. This prevents us from calling kernel_fpu_begin() in
the first place and also covers the case of having a non zero err value
returned from skcipher_walk_virt().

Reported-by: Dan Carpenter <dan.carpenter@oracle.com>
Signed-off-by: Shreyansh Chouhan <chouhan.shreyansh630@gmail.com>
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
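
The entry check described above looks roughly like this (a simplified sketch of xts_crypt(), not the exact code):

    err = skcipher_walk_virt(&walk, req, false);
    if (!walk.nbytes)
            return err;             /* covers both errors and zero-length walks,
                                     * so kernel_fpu_begin() is never entered */

    kernel_fpu_begin();
    /* ... process the data, always paired with kernel_fpu_end() ... */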


# 72ff2bf0 22-Aug-2021 Shreyansh Chouhan <chouhan.shreyansh630@gmail.com>

crypto: aesni - xts_crypt() return if walk.nbytes is 0

xts_crypt() code doesn't call kernel_fpu_end() after calling
kernel_fpu_begin() if walk.nbytes is 0. The correct behavior should be
not calling kernel_fpu_begin() if walk.nbytes is 0.

Reported-by: syzbot+20191dc583eff8602d2d@syzkaller.appspotmail.com
Signed-off-by: Shreyansh Chouhan <chouhan.shreyansh630@gmail.com>
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>


# 821720b9 16-Jul-2021 Ard Biesheuvel <ardb@kernel.org>

crypto: x86/aes-ni - add missing error checks in XTS code

The updated XTS code fails to check the return code of skcipher_walk_virt,
which may lead to skcipher_walk_abort() or skcipher_walk_done() being called
while the walk argument is in an inconsistent state.

So check the return value after each such call, and bail on errors.

Fixes: 2481104fe98d ("crypto: x86/aes-ni-xts - rewrite and drop indirections via glue helper")
Reported-by: Dave Hansen <dave.hansen@intel.com>
Reported-by: syzbot <syzbot+5d1bad8042a8f0e8117a@syzkaller.appspotmail.com>
Signed-off-by: Ard Biesheuvel <ardb@kernel.org>
Reviewed-by: Eric Biggers <ebiggers@google.com>
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>


# 65d1e3c4 16-Jan-2021 Ard Biesheuvel <ardb@kernel.org>

crypto: aesni - release FPU during skcipher walk API calls

Taking ownership of the FPU in kernel mode disables preemption, and
this may result in excessive scheduling blackouts if the size of the
data being processed on the FPU is unbounded.

Given that taking and releasing the FPU is cheap these days on x86, we
can limit the impact of this issue easily for skcipher implementations,
by moving the FPU begin/end calls inside the skcipher walk processing
loop. Considering that skcipher walks operate on at most one page at a
time, doing so fully mitigates this issue.

This also permits the skcipher walk logic to use non-atomic kmalloc()
calls etc so we can change the 'atomic' bool argument in the calls to
skcipher_walk_virt() to false as well.

Signed-off-by: Ard Biesheuvel <ardb@kernel.org>
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
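
The resulting shape of the walk loop is roughly as follows (a sketch; the real code also deals with tail blocks and errors):

    err = skcipher_walk_virt(&walk, req, false);    /* atomic=false: may now sleep */

    while (walk.nbytes) {
            unsigned int nbytes = walk.nbytes;      /* at most one page per step */

            kernel_fpu_begin();
            /* ... en/decrypt nbytes bytes from walk.src to walk.dst ... */
            kernel_fpu_end();                       /* preemption re-enabled here */

            err = skcipher_walk_done(&walk, walk.nbytes - nbytes);
    }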


# 64a49b85 16-Jan-2021 Ard Biesheuvel <ardb@kernel.org>

crypto: aesni - replace CTR function pointer with static call

Indirect calls are very expensive on x86, so use a static call to set
the system-wide AES-NI/CTR asm helper.

Signed-off-by: Ard Biesheuvel <ardb@kernel.org>
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
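
A minimal sketch of the static_call pattern being adopted (symbol names approximate those in the glue code):

    DEFINE_STATIC_CALL(aesni_ctr_enc_tfm, aesni_ctr_enc);  /* default: plain AES-NI */

    /* at module init, retarget once based on CPU features */
    if (boot_cpu_has(X86_FEATURE_AVX))
            static_call_update(aesni_ctr_enc_tfm, aesni_ctr_enc_avx_tfm);

    /* call site: a direct call is patched in, no retpoline needed */
    static_call(aesni_ctr_enc_tfm)(ctx, dst, src, len, iv);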


# d6cbf4ea 04-Jan-2021 Ard Biesheuvel <ardb@kernel.org>

crypto: aesni - replace function pointers with static branches

Replace the function pointers in the GCM implementation with static branches,
which are based on code patching, which occurs only at module load time.
This avoids the severe performance penalty caused by the use of retpolines.

In order to retain the ability to switch between different versions of the
implementation based on the input size on cores that support AVX and AVX2,
use static branches instead of static calls.

Signed-off-by: Ard Biesheuvel <ardb@kernel.org>
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
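
The analogous static-branch sketch for the GCM paths (the key and the AVX helper names are illustrative):

    static DEFINE_STATIC_KEY_FALSE(gcm_use_avx);

    /* at module load time */
    if (boot_cpu_has(X86_FEATURE_AVX))
            static_branch_enable(&gcm_use_avx);

    /* hot path: resolved by code patching rather than an indirect call */
    if (static_branch_likely(&gcm_use_avx))
            gcmaes_encrypt_avx(...);        /* hypothetical AVX variant */
    else
            gcmaes_encrypt(...);            /* generic AES-NI variant */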


# 83c83e65 04-Jan-2021 Ard Biesheuvel <ardb@kernel.org>

crypto: aesni - refactor scatterlist processing

Currently, the gcm(aes-ni) driver open codes the scatterlist handling
that is encapsulated by the skcipher walk API. So let's switch to that
instead.

Also, move the handling at the end of gcmaes_crypt_by_sg() that is
dependent on whether we are encrypting or decrypting into the callers,
which always do one or the other.

Signed-off-by: Ard Biesheuvel <ardb@kernel.org>
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
