History log of /linux/arch/arm64/include/asm/fpsimdmacros.h (Results 1 – 23 of 23)
Revision Date Author Comments
# 86d1921c 05-Dec-2023 Zenghui Yu <yuzenghui@huawei.com>

arm64: Delete the zero_za macro

zero_za was introduced in commit ca8a4ebcff44 ("arm64/sme: Manually encode
SME instructions") but doesn't appear to have any in kernel user. Drop it.

Signed-off-by:

arm64: Delete the zero_za macro

zero_za was introduced in commit ca8a4ebcff44 ("arm64/sme: Manually encode
SME instructions") but doesn't appear to have any in kernel user. Drop it.

Signed-off-by: Zenghui Yu <yuzenghui@huawei.com>
Acked-by: Mark Brown <broonie@kernel.org>
Link: https://lore.kernel.org/r/20231205160140.1438-1-yuzenghui@huawei.com
Signed-off-by: Will Deacon <will@kernel.org>

show more ...


# 893b2418 28-Jun-2023 Will Deacon <will@kernel.org>

arm64: sme: Use STR P to clear FFR context field in streaming SVE mode

The FFR is a predicate register which can vary between 16 and 256 bits
in size depending upon the configured vector length. Whe

arm64: sme: Use STR P to clear FFR context field in streaming SVE mode

The FFR is a predicate register which can vary between 16 and 256 bits
in size depending upon the configured vector length. When saving the
SVE state in streaming SVE mode, the FFR register is inaccessible and
so commit 9f5848665788 ("arm64/sve: Make access to FFR optional") simply
clears the FFR field of the in-memory context structure. Unfortunately,
it achieves this using an unconditional 8-byte store and so if the SME
vector length is anything other than 64 bytes in size we will either
fail to clear the entire field or, worse, we will corrupt memory
immediately following the structure. This has led to intermittent kfence
splats in CI [1] and can trigger kmalloc Redzone corruption messages
when running the 'fp-stress' kselftest:

| =============================================================================
| BUG kmalloc-1k (Not tainted): kmalloc Redzone overwritten
| -----------------------------------------------------------------------------
|
| 0xffff000809bf1e22-0xffff000809bf1e27 @offset=7714. First byte 0x0 instead of 0xcc
| Allocated in do_sme_acc+0x9c/0x220 age=2613 cpu=1 pid=531
| __kmalloc+0x8c/0xcc
| do_sme_acc+0x9c/0x220
| ...

Replace the 8-byte store with a store of a predicate register which has
been zero-initialised with PFALSE, ensuring that the entire field is
cleared in memory.

[1] https://lore.kernel.org/r/CA+G9fYtU7HsV0R0dp4XEH5xXHSJFw8KyDf5VQrLLfMxWfxQkag@mail.gmail.com

Cc: Mark Brown <broonie@kernel.org>
Cc: Mark Rutland <mark.rutland@arm.com>
Cc: Naresh Kamboju <naresh.kamboju@linaro.org>
Fixes: 9f5848665788 ("arm64/sve: Make access to FFR optional")
Reported-by: Linux Kernel Functional Testing <lkft@linaro.org>
Signed-off-by: Will Deacon <will@kernel.org>
Reviewed-by: Mark Brown <broonie@kernel.org>
Tested-by: Anders Roxell <anders.roxell@linaro.org>
Link: https://lore.kernel.org/r/20230628155605.22296-1-will@kernel.org
Signed-off-by: Catalin Marinas <catalin.marinas@arm.com>

show more ...


# 2cdeecdb 16-Jan-2023 Mark Brown <broonie@kernel.org>

arm64/sme: Manually encode ZT0 load and store instructions

In order to avoid unrealistic toolchain requirements we manually encode the
instructions for loading and storing ZT0.

Signed-off-by: Mark

arm64/sme: Manually encode ZT0 load and store instructions

In order to avoid unrealistic toolchain requirements we manually encode the
instructions for loading and storing ZT0.

Signed-off-by: Mark Brown <broonie@kernel.org>
Link: https://lore.kernel.org/r/20221208-arm64-sme2-v4-6-f2fa0aef982f@kernel.org
Signed-off-by: Catalin Marinas <catalin.marinas@arm.com>

show more ...


# 0033cd93 19-Apr-2022 Mark Brown <broonie@kernel.org>

arm64/sme: Implement ZA context switching

Allocate space for storing ZA on first access to SME and use that to save
and restore ZA state when context switching. We do this by using the vector
form o

arm64/sme: Implement ZA context switching

Allocate space for storing ZA on first access to SME and use that to save
and restore ZA state when context switching. We do this by using the vector
form of the LDR and STR ZA instructions, these do not require streaming
mode and have implementation recommendations that they avoid contention
issues in shared SMCU implementations.

Since ZA is architecturally guaranteed to be zeroed when enabled we do not
need to explicitly zero ZA, either we will be restoring from a saved copy
or trapping on first use of SME so we know that ZA must be disabled.

Signed-off-by: Mark Brown <broonie@kernel.org>
Reviewed-by: Catalin Marinas <catalin.marinas@arm.com>
Link: https://lore.kernel.org/r/20220419112247.711548-16-broonie@kernel.org
Signed-off-by: Catalin Marinas <catalin.marinas@arm.com>

show more ...


# af7167d6 19-Apr-2022 Mark Brown <broonie@kernel.org>

arm64/sme: Implement streaming SVE context switching

When in streaming mode we need to save and restore the streaming mode
SVE register state rather than the regular SVE register state. This uses
th

arm64/sme: Implement streaming SVE context switching

When in streaming mode we need to save and restore the streaming mode
SVE register state rather than the regular SVE register state. This uses
the streaming mode vector length and omits FFR but is otherwise identical,
if TIF_SVE is enabled when we are in streaming mode then streaming mode
takes precedence.

This does not handle use of streaming SVE state with KVM, ptrace or
signals. This will be updated in further patches.

Signed-off-by: Mark Brown <broonie@kernel.org>
Reviewed-by: Catalin Marinas <catalin.marinas@arm.com>
Link: https://lore.kernel.org/r/20220419112247.711548-15-broonie@kernel.org
Signed-off-by: Catalin Marinas <catalin.marinas@arm.com>

show more ...


# ca8a4ebc 19-Apr-2022 Mark Brown <broonie@kernel.org>

arm64/sme: Manually encode SME instructions

As with SVE rather than impose ambitious toolchain requirements for SME
we manually encode the few instructions which we require in order to
perform the w

arm64/sme: Manually encode SME instructions

As with SVE rather than impose ambitious toolchain requirements for SME
we manually encode the few instructions which we require in order to
perform the work the kernel needs to do. The instructions used to save
and restore context are provided as assembler macros while those for
entering and leaving streaming mode are done in asm volatile blocks
since they are expected to be used from C.

We could do the SMSTART and SMSTOP operations with read/modify/write
cycles on SVCR but using the aliases provided for individual field
accesses should be slightly faster. These instructions are aliases for
MSR but since our minimum toolchain requirements are old enough to mean
that we can't use the sX_X_cX_cX_X form and they always use xzr rather
than taking a value like write_sysreg_s() wants we just use .inst.

Signed-off-by: Mark Brown <broonie@kernel.org>
Reviewed-by: Catalin Marinas <catalin.marinas@arm.com>
Link: https://lore.kernel.org/r/20220419112247.711548-7-broonie@kernel.org
Signed-off-by: Catalin Marinas <catalin.marinas@arm.com>

show more ...


# ddc806b5 19-Oct-2021 Mark Brown <broonie@kernel.org>

arm64/sve: Explicitly load vector length when restoring SVE state

Currently when restoring the SVE state we supply the SVE vector length
as an argument to sve_load_state() and the underlying macros.

arm64/sve: Explicitly load vector length when restoring SVE state

Currently when restoring the SVE state we supply the SVE vector length
as an argument to sve_load_state() and the underlying macros. This becomes
inconvenient with the addition of SME since we may need to restore any
combination of SVE and SME vector lengths, and we already separately
restore the vector length in the KVM code. We don't need to know the vector
length during the actual register load since the SME load instructions can
index into the data array for us.

Refactor the interface so we explicitly set the vector length separately
to restoring the SVE registers in preparation for adding SME support, no
functional change should be involved.

Signed-off-by: Mark Brown <broonie@kernel.org>
Link: https://lore.kernel.org/r/20211019172247.3045838-9-broonie@kernel.org
Signed-off-by: Will Deacon <will@kernel.org>

show more ...


# 9f584866 19-Oct-2021 Mark Brown <broonie@kernel.org>

arm64/sve: Make access to FFR optional

SME introduces streaming SVE mode in which FFR is not present and the
instructions for accessing it UNDEF. In preparation for handling this
update the low leve

arm64/sve: Make access to FFR optional

SME introduces streaming SVE mode in which FFR is not present and the
instructions for accessing it UNDEF. In preparation for handling this
update the low level SVE state access functions to take a flag specifying
if FFR should be handled. When saving the register state we store a zero
for FFR to guard against uninitialized data being read. No behaviour change
should be introduced by this patch.

Signed-off-by: Mark Brown <broonie@kernel.org>
Link: https://lore.kernel.org/r/20211019172247.3045838-5-broonie@kernel.org
Signed-off-by: Will Deacon <will@kernel.org>

show more ...


# 04fa17d1 16-Aug-2021 Mark Brown <broonie@kernel.org>

arm64/sve: Add a comment documenting the binutils needed for SVE asm

At some point it would be nice to avoid the need to manually encode SVE
instructions, add a note of the binutils version required

arm64/sve: Add a comment documenting the binutils needed for SVE asm

At some point it would be nice to avoid the need to manually encode SVE
instructions, add a note of the binutils version required to save looking
it up.

Signed-off-by: Mark Brown <broonie@kernel.org>
Link: https://lore.kernel.org/r/20210816125024.8112-1-broonie@kernel.org
Signed-off-by: Catalin Marinas <catalin.marinas@arm.com>

show more ...


# 483dbf6a 12-May-2021 Mark Brown <broonie@kernel.org>

arm64/sve: Split _sve_flush macro into separate Z and predicate flushes

Trivial refactoring to support further work, no change to generated code.

Signed-off-by: Mark Brown <broonie@kernel.org>
Revi

arm64/sve: Split _sve_flush macro into separate Z and predicate flushes

Trivial refactoring to support further work, no change to generated code.

Signed-off-by: Mark Brown <broonie@kernel.org>
Reviewed-by: Dave Martin <Dave.Martin@arm.com>
Acked-by: Catalin Marinas <catalin.marinas@arm.com>
Link: https://lore.kernel.org/r/20210512151131.27877-2-broonie@kernel.org
Signed-off-by: Will Deacon <will@kernel.org>

show more ...


# 52029198 16-Mar-2021 Marc Zyngier <maz@kernel.org>

KVM: arm64: Rework SVE host-save/guest-restore

In order to keep the code readable, move the host-save/guest-restore
sequences in their own functions, with the following changes:
- the hypervisor ZCR

KVM: arm64: Rework SVE host-save/guest-restore

In order to keep the code readable, move the host-save/guest-restore
sequences in their own functions, with the following changes:
- the hypervisor ZCR is now set from C code
- ZCR_EL2 is always used as the EL2 accessor

This results in some minor assembler macro rework.
No functional change intended.

Acked-by: Will Deacon <will@kernel.org>
Signed-off-by: Marc Zyngier <maz@kernel.org>

show more ...


# 297b8603 11-Mar-2021 Marc Zyngier <maz@kernel.org>

KVM: arm64: Provide KVM's own save/restore SVE primitives

as we are about to change the way KVM deals with SVE, provide
KVM with its own save/restore SVE primitives.

No functional change intended.

KVM: arm64: Provide KVM's own save/restore SVE primitives

as we are about to change the way KVM deals with SVE, provide
KVM with its own save/restore SVE primitives.

No functional change intended.

Acked-by: Will Deacon <will@kernel.org>
Signed-off-by: Marc Zyngier <maz@kernel.org>

show more ...


# 1e530f13 28-Aug-2020 Julien Grall <julien.grall@arm.com>

arm64/sve: Implement a helper to flush SVE registers

Introduce a new helper that will zero all SVE registers but the first
128-bits of each vector. This will be used by subsequent patches to
avoid c

arm64/sve: Implement a helper to flush SVE registers

Introduce a new helper that will zero all SVE registers but the first
128-bits of each vector. This will be used by subsequent patches to
avoid costly store/maipulate/reload sequences in places like do_sve_acc().

Signed-off-by: Julien Grall <julien.grall@arm.com>
Signed-off-by: Mark Brown <broonie@kernel.org>
Reviewed-by: Dave Martin <Dave.Martin@arm.com>
Link: https://lore.kernel.org/r/20200828181155.17745-6-broonie@kernel.org
Signed-off-by: Will Deacon <will@kernel.org>

show more ...


# 6d40f05f 28-Aug-2020 Julien Grall <julien.grall@arm.com>

arm64/fpsimdmacros: Allow the macro "for" to be used in more cases

The current version of the macro "for" is not able to work when the
counter is used to generate registers using mnemonics. This is

arm64/fpsimdmacros: Allow the macro "for" to be used in more cases

The current version of the macro "for" is not able to work when the
counter is used to generate registers using mnemonics. This is because
gas is not able to evaluate the expression generated if used in
register's name (i.e x\n).

Gas offers a way to evaluate macro arguments by using % in front of
them under the alternate macro mode.

The implementation of "for" is updated to use the alternate macro mode
and %, so we can use the macro in more cases. As the alternate macro
mode may have side-effects, this is disabled when expanding the body.

While it is enough to prefix the argument of the macro "__for_body"
with %, the arguments of "__for" are also prefixed to get a more
bearable value in case of compilation error.

Suggested-by: Dave Martin <dave.martin@arm.com>
Signed-off-by: Julien Grall <julien.grall@arm.com>
Signed-off-by: Mark Brown <broonie@kernel.org>
Reviewed-by: Dave Martin <Dave.Martin@arm.com>
Link: https://lore.kernel.org/r/20200828181155.17745-4-broonie@kernel.org
Signed-off-by: Will Deacon <will@kernel.org>

show more ...


# 315cf047 28-Aug-2020 Julien Grall <julien.grall@arm.com>

arm64/fpsimdmacros: Introduce a macro to update ZCR_EL1.LEN

A follow-up patch will need to update ZCR_EL1.LEN.

Add a macro that could be re-used in the current and new places to
avoid code duplicat

arm64/fpsimdmacros: Introduce a macro to update ZCR_EL1.LEN

A follow-up patch will need to update ZCR_EL1.LEN.

Add a macro that could be re-used in the current and new places to
avoid code duplication.

Signed-off-by: Julien Grall <julien.grall@arm.com>
Signed-off-by: Mark Brown <broonie@kernel.org>
Reviewed-by: Dave Martin <Dave.Martin@arm.com>
Link: https://lore.kernel.org/r/20200828181155.17745-5-broonie@kernel.org
Signed-off-by: Will Deacon <will@kernel.org>

show more ...


# caab277b 03-Jun-2019 Thomas Gleixner <tglx@linutronix.de>

treewide: Replace GPLv2 boilerplate/reference with SPDX - rule 234

Based on 1 normalized pattern(s):

this program is free software you can redistribute it and or modify
it under the terms of th

treewide: Replace GPLv2 boilerplate/reference with SPDX - rule 234

Based on 1 normalized pattern(s):

this program is free software you can redistribute it and or modify
it under the terms of the gnu general public license version 2 as
published by the free software foundation this program is
distributed in the hope that it will be useful but without any
warranty without even the implied warranty of merchantability or
fitness for a particular purpose see the gnu general public license
for more details you should have received a copy of the gnu general
public license along with this program if not see http www gnu org
licenses

extracted by the scancode license scanner the SPDX license identifier

GPL-2.0-only

has been chosen to replace the boilerplate/reference in 503 file(s).

Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Reviewed-by: Alexios Zavras <alexios.zavras@intel.com>
Reviewed-by: Allison Randal <allison@lohutok.net>
Reviewed-by: Enrico Weigelt <info@metux.net>
Cc: linux-spdx@vger.kernel.org
Link: https://lkml.kernel.org/r/20190602204653.811534538@linutronix.de
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>

show more ...


# 159fd7b8 14-May-2018 Dave Martin <Dave.Martin@arm.com>

arm64/sve: Write ZCR_EL1 on context switch only if changed

Writes to ZCR_EL1 are self-synchronising, and so may be expensive
in typical implementations.

This patch adopts the approach used for cost

arm64/sve: Write ZCR_EL1 on context switch only if changed

Writes to ZCR_EL1 are self-synchronising, and so may be expensive
in typical implementations.

This patch adopts the approach used for costly system register
writes elsewhere in the kernel: the system register write is
suppressed if it would not change the stored value.

Since the common case will be that of switching between tasks that
use the same vector length as one another, prediction hit rates on
the conditional branch should be reasonably good, with lower
expected amortised cost than the unconditional execution of a
heavyweight self-synchronising instruction.

Signed-off-by: Dave Martin <Dave.Martin@arm.com>
Signed-off-by: Catalin Marinas <catalin.marinas@arm.com>

show more ...


# 1fc5dce7 31-Oct-2017 Dave Martin <Dave.Martin@arm.com>

arm64/sve: Low-level SVE architectural state manipulation functions

Manipulating the SVE architectural state, including the vector and
predicate registers, first-fault register and the vector length

arm64/sve: Low-level SVE architectural state manipulation functions

Manipulating the SVE architectural state, including the vector and
predicate registers, first-fault register and the vector length,
requires the use of dedicated instructions added by SVE.

This patch adds suitable assembly functions for saving and
restoring the SVE registers and querying the vector length.
Setting of the vector length is done as part of register restore.

Since people building kernels may not all get an SVE-enabled
toolchain for a while, this patch uses macros that generate
explicit opcodes in place of assembler mnemonics.

Signed-off-by: Dave Martin <Dave.Martin@arm.com>
Reviewed-by: Alex Bennée <alex.bennee@linaro.org>
Acked-by: Catalin Marinas <catalin.marinas@arm.com>
Signed-off-by: Will Deacon <will.deacon@arm.com>

show more ...


# cb84d11e 03-Aug-2017 Dave Martin <Dave.Martin@arm.com>

arm64: neon: Remove support for nested or hardirq kernel-mode NEON

Support for kernel-mode NEON to be nested and/or used in hardirq
context adds significant complexity, and the benefits may be
margi

arm64: neon: Remove support for nested or hardirq kernel-mode NEON

Support for kernel-mode NEON to be nested and/or used in hardirq
context adds significant complexity, and the benefits may be
marginal. In practice, kernel-mode NEON is not used in hardirq
context, and is rarely used in softirq context (by certain mac80211
drivers).

This patch implements an arm64 may_use_simd() function to allow
clients to check whether kernel-mode NEON is usable in the current
context, and simplifies kernel_neon_{begin,end}() to handle only
saving of the task FPSIMD state (if any). Without nesting, there
is no other state to save.

The partial fpsimd save/restore functions become redundant as a
result of these changes, so they are removed too.

The save/restore model is changed to operate directly on
task_struct without additional percpu storage. This simplifies the
code and saves a bit of memory, but means that softirqs must now be
disabled when manipulating the task fpsimd state from task context:
correspondingly, preempt_{en,dis}sable() calls are upgraded to
local_bh_{en,dis}able() as appropriate. fpsimd_thread_switch()
already runs with hardirqs disabled and so is already protected
from softirqs.

These changes should make it easier to support kernel-mode NEON in
the presence of the Scalable Vector extension in the future.

Signed-off-by: Dave Martin <Dave.Martin@arm.com>
Reviewed-by: Ard Biesheuvel <ard.biesheuvel@linaro.org>
Signed-off-by: Catalin Marinas <catalin.marinas@arm.com>

show more ...


# 6917c857 29-Jan-2015 Dave P Martin <Dave.Martin@arm.com>

arm64: Avoid breakage caused by .altmacro in fpsimd save/restore macros

Alternate macro mode is not a property of a macro definition, but a
gas runtime state that alters the way macros are expanded

arm64: Avoid breakage caused by .altmacro in fpsimd save/restore macros

Alternate macro mode is not a property of a macro definition, but a
gas runtime state that alters the way macros are expanded for ever
after (until .noaltmacro is seen).

This means that subsequent assembly code that calls other macros can
break if fpsimdmacros.h is included.

Since these instruction sequences are simple (if dull -- but in a
good way), this patch solves the problem by simply expanding the
.irp loops. The pre-existing fpsimd_{save,restore} macros weren't
rolled with .irp anyway and the sequences affected are short, so
this change restores consistency at little cost.

Signed-off-by: Dave Martin <Dave.Martin@arm.com>
Acked-by: Marc Zyngier <marc.zyngier@arm.com>
Acked-by: Ard Biesheuvel <ard.biesheuvel@linaro.org>
Signed-off-by: Catalin Marinas <catalin.marinas@arm.com>

show more ...


# 5959e257 10-Jul-2014 Will Deacon <will.deacon@arm.com>

arm64: fpsimd: avoid restoring fpcr if the contents haven't changed

Writing to the FPCR is commonly implemented as a self-synchronising
operation in the CPU, so avoid writing to the register when th

arm64: fpsimd: avoid restoring fpcr if the contents haven't changed

Writing to the FPCR is commonly implemented as a self-synchronising
operation in the CPU, so avoid writing to the register when the saved
value matches that in the hardware already.

Cc: Ard Biesheuvel <ard.biesheuvel@linaro.org>
Reviewed-by: Lorenzo Pieralisi <lorenzo.pieralisi@arm.com>
Signed-off-by: Will Deacon <will.deacon@arm.com>
Signed-off-by: Catalin Marinas <catalin.marinas@arm.com>

show more ...


# 190f1ca8 24-Feb-2014 Ard Biesheuvel <ard.biesheuvel@linaro.org>

arm64: add support for kernel mode NEON in interrupt context

This patch modifies kernel_neon_begin() and kernel_neon_end(), so
they may be called from any context. To address the case where only
a c

arm64: add support for kernel mode NEON in interrupt context

This patch modifies kernel_neon_begin() and kernel_neon_end(), so
they may be called from any context. To address the case where only
a couple of registers are needed, kernel_neon_begin_partial(u32) is
introduced which takes as a parameter the number of bottom 'n' NEON
q-registers required. To mark the end of such a partial section, the
regular kernel_neon_end() should be used.

Signed-off-by: Ard Biesheuvel <ard.biesheuvel@linaro.org>

show more ...


# cfc5180e 12-Nov-2012 Marc Zyngier <marc.zyngier@arm.com>

arm64: move FP-SIMD save/restore code to a macro

In order to be able to reuse the save-restore code in KVM, move
it to a pair of macros, similar to what the 32bit code does.

Signed-off-by: Marc Zyn

arm64: move FP-SIMD save/restore code to a macro

In order to be able to reuse the save-restore code in KVM, move
it to a pair of macros, similar to what the 32bit code does.

Signed-off-by: Marc Zyngier <marc.zyngier@arm.com>
Signed-off-by: Catalin Marinas <catalin.marinas@arm.com>

show more ...