History log of /linux/kernel/padata.c (Results 1 – 25 of 103)
Revision Date Author Comments
# eb522866 06-Mar-2024 Gang Li Subject: padata: dispatch works on <gang.li@linux.dev>

Author: Gang Li padata: dispatch works on

different nodes Date: Thu, 22 Feb 2024 22:04:17 +0800

When a group of tasks that access different nodes are scheduled on the
same node, they may encounter

Author: Gang Li padata: dispatch works on

different nodes Date: Thu, 22 Feb 2024 22:04:17 +0800

When a group of tasks that access different nodes are scheduled on the
same node, they may encounter bandwidth bottlenecks and access latency.

Thus, numa_aware flag is introduced here, allowing tasks to be distributed
across different nodes to fully utilize the advantage of multi-node
systems.

Link: https://lkml.kernel.org/r/20240222140422.393911-5-gang.li@linux.dev
Signed-off-by: Gang Li <ligang.bdlg@bytedance.com>
Tested-by: David Rientjes <rientjes@google.com>
Reviewed-by: Muchun Song <muchun.song@linux.dev>
Reviewed-by: Tim Chen <tim.c.chen@linux.intel.com>
Cc: Alexey Dobriyan <adobriyan@gmail.com>
Cc: Daniel Jordan <daniel.m.jordan@oracle.com>
Cc: David Hildenbrand <david@redhat.com>
Cc: Jane Chu <jane.chu@oracle.com>
Cc: Mike Kravetz <mike.kravetz@oracle.com>
Cc: Paul E. McKenney <paulmck@kernel.org>
Cc: Randy Dunlap <rdunlap@infradead.org>
Cc: Steffen Klassert <steffen.klassert@secunet.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>

show more ...


# 7ddc21e3 16-Oct-2023 WangJinchao <wangjinchao@xfusion.com>

padata: Fix refcnt handling in padata_free_shell()

In a high-load arm64 environment, the pcrypt_aead01 test in LTP can lead
to system UAF (Use-After-Free) issues. Due to the lengthy analysis of
the

padata: Fix refcnt handling in padata_free_shell()

In a high-load arm64 environment, the pcrypt_aead01 test in LTP can lead
to system UAF (Use-After-Free) issues. Due to the lengthy analysis of
the pcrypt_aead01 function call, I'll describe the problem scenario
using a simplified model:

Suppose there's a user of padata named `user_function` that adheres to
the padata requirement of calling `padata_free_shell` after `serial()`
has been invoked, as demonstrated in the following code:

```c
struct request {
struct padata_priv padata;
struct completion *done;
};

void parallel(struct padata_priv *padata) {
do_something();
}

void serial(struct padata_priv *padata) {
struct request *request = container_of(padata,
struct request,
padata);
complete(request->done);
}

void user_function() {
DECLARE_COMPLETION(done)
padata->parallel = parallel;
padata->serial = serial;
padata_do_parallel();
wait_for_completion(&done);
padata_free_shell();
}
```

In the corresponding padata.c file, there's the following code:

```c
static void padata_serial_worker(struct work_struct *serial_work) {
...
cnt = 0;

while (!list_empty(&local_list)) {
...
padata->serial(padata);
cnt++;
}

local_bh_enable();

if (refcount_sub_and_test(cnt, &pd->refcnt))
padata_free_pd(pd);
}
```

Because of the high system load and the accumulation of unexecuted
softirq at this moment, `local_bh_enable()` in padata takes longer
to execute than usual. Subsequently, when accessing `pd->refcnt`,
`pd` has already been released by `padata_free_shell()`, resulting
in a UAF issue with `pd->refcnt`.

The fix is straightforward: add `refcount_dec_and_test` before calling
`padata_free_pd` in `padata_free_shell`.

Fixes: 07928d9bfc81 ("padata: Remove broken queue flushing")

Signed-off-by: WangJinchao <wangjinchao@xfusion.com>
Acked-by: Daniel Jordan <daniel.m.jordan@oracle.com>
Acked-by: Daniel Jordan <daniel.m.jordan@oracle.com>
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>

show more ...


# 8f4f68e7 04-Sep-2023 Lu Jialin <lujialin4@huawei.com>

crypto: pcrypt - Fix hungtask for PADATA_RESET

We found a hungtask bug in test_aead_vec_cfg as follows:

INFO: task cryptomgr_test:391009 blocked for more than 120 seconds.
"echo 0 > /proc/sys/kerne

crypto: pcrypt - Fix hungtask for PADATA_RESET

We found a hungtask bug in test_aead_vec_cfg as follows:

INFO: task cryptomgr_test:391009 blocked for more than 120 seconds.
"echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
Call trace:
__switch_to+0x98/0xe0
__schedule+0x6c4/0xf40
schedule+0xd8/0x1b4
schedule_timeout+0x474/0x560
wait_for_common+0x368/0x4e0
wait_for_completion+0x20/0x30
wait_for_completion+0x20/0x30
test_aead_vec_cfg+0xab4/0xd50
test_aead+0x144/0x1f0
alg_test_aead+0xd8/0x1e0
alg_test+0x634/0x890
cryptomgr_test+0x40/0x70
kthread+0x1e0/0x220
ret_from_fork+0x10/0x18
Kernel panic - not syncing: hung_task: blocked tasks

For padata_do_parallel, when the return err is 0 or -EBUSY, it will call
wait_for_completion(&wait->completion) in test_aead_vec_cfg. In normal
case, aead_request_complete() will be called in pcrypt_aead_serial and the
return err is 0 for padata_do_parallel. But, when pinst->flags is
PADATA_RESET, the return err is -EBUSY for padata_do_parallel, and it
won't call aead_request_complete(). Therefore, test_aead_vec_cfg will
hung at wait_for_completion(&wait->completion), which will cause
hungtask.

The problem comes as following:
(padata_do_parallel) |
rcu_read_lock_bh(); |
err = -EINVAL; | (padata_replace)
| pinst->flags |= PADATA_RESET;
err = -EBUSY |
if (pinst->flags & PADATA_RESET) |
rcu_read_unlock_bh() |
return err

In order to resolve the problem, we replace the return err -EBUSY with
-EAGAIN, which means parallel_data is changing, and the caller should call
it again.

v3:
remove retry and just change the return err.
v2:
introduce padata_try_do_parallel() in pcrypt_aead_encrypt and
pcrypt_aead_decrypt to solve the hungtask.

Signed-off-by: Lu Jialin <lujialin4@huawei.com>
Signed-off-by: Guo Zihua <guozihua@huawei.com>
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>

show more ...


# f84155ca 23-Feb-2023 Anthony Yznaga <anthony.yznaga@oracle.com>

padata: use alignment when calculating the number of worker threads

For multithreaded jobs the computed chunk size is rounded up by the
caller-specified alignment. However, the number of worker thre

padata: use alignment when calculating the number of worker threads

For multithreaded jobs the computed chunk size is rounded up by the
caller-specified alignment. However, the number of worker threads to
use is computed using the minimum chunk size without taking alignment
into account. A sufficiently large alignment value can result in too
many worker threads being allocated for the job.

Signed-off-by: Anthony Yznaga <anthony.yznaga@oracle.com>
Acked-by: Daniel Jordan <daniel.m.jordan@oracle.com>
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>

show more ...


# 0bedc992 17-Feb-2023 Thomas Weißschuh <linux@weissschuh.net>

padata: Make kobj_type structure constant

Since commit ee6d3dd4ed48 ("driver core: make kobj_type constant.")
the driver core allows the usage of const struct kobj_type.

Take advantage of this to c

padata: Make kobj_type structure constant

Since commit ee6d3dd4ed48 ("driver core: make kobj_type constant.")
the driver core allows the usage of const struct kobj_type.

Take advantage of this to constify the structure definition to prevent
modification at runtime.

Signed-off-by: Thomas Weißschuh <linux@weissschuh.net>
Acked-by: Daniel Jordan <daniel.m.jordan@oracle.com>
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>

show more ...


# 0d24f1b7 13-Dec-2022 Nathan Chancellor <nathan@kernel.org>

padata: Mark padata_work_init() as __ref

When building arm64 allmodconfig + ThinLTO with clang and a proposed
modpost update to account for -ffuncton-sections, the following warning
appears:

WARN

padata: Mark padata_work_init() as __ref

When building arm64 allmodconfig + ThinLTO with clang and a proposed
modpost update to account for -ffuncton-sections, the following warning
appears:

WARNING: modpost: vmlinux.o: section mismatch in reference: padata_work_init (section: .text.padata_work_init) -> padata_mt_helper (section: .init.text)
WARNING: modpost: vmlinux.o: section mismatch in reference: padata_work_init (section: .text.padata_work_init) -> padata_mt_helper (section: .init.text)

LLVM has optimized padata_work_init() to include the address of
padata_mt_helper() directly because it inlined the other call to
padata_work_init() with padata_parallel_worker(), meaning the remaining
uses of padata_work_init() use padata_mt_helper() as the work_fn
argument. This optimization causes modpost to complain since
padata_work_init() is not __init, whereas padata_mt_helper() is.

Since padata_work_init() is only called from __init code when
padata_mt_helper() is passed as the work_fn argument, mark
padata_work_init() as __ref, which makes it clear to modpost that this
scenario is okay.

Suggested-by: Daniel Jordan <daniel.m.jordan@oracle.com>
Signed-off-by: Nathan Chancellor <nathan@kernel.org>
Acked-by: Daniel Jordan <daniel.m.jordan@oracle.com>
Signed-off-by: Masahiro Yamada <masahiroy@kernel.org>

show more ...


# 57ddfecc 17-Nov-2022 Daniel Jordan <daniel.m.jordan@oracle.com>

padata: Fix list iterator in padata_do_serial()

list_for_each_entry_reverse() assumes that the iterated list is nonempty
and that every list_head is embedded in the same type, but its use in
padata_

padata: Fix list iterator in padata_do_serial()

list_for_each_entry_reverse() assumes that the iterated list is nonempty
and that every list_head is embedded in the same type, but its use in
padata_do_serial() breaks both rules.

This doesn't cause any issues now because padata_priv and padata_list
happen to have their list fields at the same offset, but we really
shouldn't be relying on that.

Fixes: bfde23ce200e ("padata: unbind parallel jobs from specific CPUs")
Signed-off-by: Daniel Jordan <daniel.m.jordan@oracle.com>
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>

show more ...


# 34c3a47d 17-Nov-2022 Daniel Jordan <daniel.m.jordan@oracle.com>

padata: Always leave BHs disabled when running ->parallel()

A deadlock can happen when an overloaded system runs ->parallel() in the
context of the current task:

padata_do_parallel
->para

padata: Always leave BHs disabled when running ->parallel()

A deadlock can happen when an overloaded system runs ->parallel() in the
context of the current task:

padata_do_parallel
->parallel()
pcrypt_aead_enc/dec
padata_do_serial
spin_lock(&reorder->lock) // BHs still enabled
<interrupt>
...
__do_softirq
...
padata_do_serial
spin_lock(&reorder->lock)

It's a bug for BHs to be on in _do_serial as Steffen points out, so
ensure they're off in the "current task" case like they are in
padata_parallel_worker to avoid this situation.

Reported-by: syzbot+bc05445bc14148d51915@syzkaller.appspotmail.com
Fixes: 4611ce224688 ("padata: allocate work structures for parallel jobs from a pool")
Signed-off-by: Daniel Jordan <daniel.m.jordan@oracle.com>
Acked-by: Steffen Klassert <steffen.klassert@secunet.com>
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>

show more ...


# 1c4cafd1 23-Jan-2022 Yury Norov <yury.norov@gmail.com>

padata: replace cpumask_weight with cpumask_empty in padata.c

padata_do_parallel() calls cpumask_weight() to check if any bit of a
given cpumask is set. We can do it more efficiently with cpumask_em

padata: replace cpumask_weight with cpumask_empty in padata.c

padata_do_parallel() calls cpumask_weight() to check if any bit of a
given cpumask is set. We can do it more efficiently with cpumask_empty()
because cpumask_empty() stops traversing the cpumask as soon as it finds
first set bit, while cpumask_weight() counts all bits unconditionally.

Signed-off-by: Yury Norov <yury.norov@gmail.com>
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>

show more ...


# cedcf527 22-Aug-2021 Cai Huoqing <caihuoqing@baidu.com>

padata: Remove repeated verbose license text

remove it because SPDX-License-Identifier is already used

Signed-off-by: Cai Huoqing <caihuoqing@baidu.com>
Acked-by: Daniel Jordan <daniel.m.jordan@ora

padata: Remove repeated verbose license text

remove it because SPDX-License-Identifier is already used

Signed-off-by: Cai Huoqing <caihuoqing@baidu.com>
Acked-by: Daniel Jordan <daniel.m.jordan@oracle.com>
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>

show more ...


# 80771c82 03-Aug-2021 Sebastian Andrzej Siewior <bigeasy@linutronix.de>

padata: Replace deprecated CPU-hotplug functions.

The functions get_online_cpus() and put_online_cpus() have been
deprecated during the CPU hotplug rework. They map directly to
cpus_read_lock() and

padata: Replace deprecated CPU-hotplug functions.

The functions get_online_cpus() and put_online_cpus() have been
deprecated during the CPU hotplug rework. They map directly to
cpus_read_lock() and cpus_read_unlock().

Replace deprecated CPU-hotplug functions with the official version.
The behavior remains unchanged.

Cc: Steffen Klassert <steffen.klassert@secunet.com>
Cc: Daniel Jordan <daniel.m.jordan@oracle.com>
Cc: linux-crypto@vger.kernel.org
Signed-off-by: Sebastian Andrzej Siewior <bigeasy@linutronix.de>
Acked-by: Daniel Jordan <daniel.m.jordan@oracle.com>
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>

show more ...


# d5ee8e75 20-Jul-2021 Xiyu Yang <xiyuyang19@fudan.edu.cn>

padata: Convert from atomic_t to refcount_t on parallel_data->refcnt

refcount_t type and corresponding API can protect refcounters from
accidental underflow and overflow and further use-after-free s

padata: Convert from atomic_t to refcount_t on parallel_data->refcnt

refcount_t type and corresponding API can protect refcounters from
accidental underflow and overflow and further use-after-free situations.

Signed-off-by: Xiyu Yang <xiyuyang19@fudan.edu.cn>
Signed-off-by: Xin Tan <tanxin.ctf@gmail.com>
Acked-by: Daniel Jordan <daniel.m.jordan@oracle.com>
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>

show more ...


# 1b0df11f 02-Sep-2020 Daniel Jordan <daniel.m.jordan@oracle.com>

padata: fix possible padata_works_lock deadlock

syzbot reports,

WARNING: inconsistent lock state
5.9.0-rc2-syzkaller #0 Not tainted
--------------------------------
inconsistent {IN-SOFTIRQ

padata: fix possible padata_works_lock deadlock

syzbot reports,

WARNING: inconsistent lock state
5.9.0-rc2-syzkaller #0 Not tainted
--------------------------------
inconsistent {IN-SOFTIRQ-W} -> {SOFTIRQ-ON-W} usage.
syz-executor.0/26715 takes:
(padata_works_lock){+.?.}-{2:2}, at: padata_do_parallel kernel/padata.c:220
{IN-SOFTIRQ-W} state was registered at:
spin_lock include/linux/spinlock.h:354 [inline]
padata_do_parallel kernel/padata.c:220
...
__do_softirq kernel/softirq.c:298
...
sysvec_apic_timer_interrupt arch/x86/kernel/apic/apic.c:1091
asm_sysvec_apic_timer_interrupt arch/x86/include/asm/idtentry.h:581

Possible unsafe locking scenario:

CPU0
----
lock(padata_works_lock);
<Interrupt>
lock(padata_works_lock);

padata_do_parallel() takes padata_works_lock with softirqs enabled, so a
deadlock is possible if, on the same CPU, the lock is acquired in
process context and then softirq handling done in an interrupt leads to
the same path.

Fix by leaving softirqs disabled while do_parallel holds
padata_works_lock.

Reported-by: syzbot+f4b9f49e38e25eb4ef52@syzkaller.appspotmail.com
Fixes: 4611ce2246889 ("padata: allocate work structures for parallel jobs from a pool")
Signed-off-by: Daniel Jordan <daniel.m.jordan@oracle.com>
Cc: Herbert Xu <herbert@gondor.apana.org.au>
Cc: Steffen Klassert <steffen.klassert@secunet.com>
Cc: linux-crypto@vger.kernel.org
Cc: linux-kernel@vger.kernel.org
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>

show more ...


# f601c725 14-Jul-2020 Daniel Jordan <daniel.m.jordan@oracle.com>

padata: remove padata_parallel_queue

Only its reorder field is actually used now, so remove the struct and
embed @reorder directly in parallel_data.

No functional change, just a cleanup.

Signed-of

padata: remove padata_parallel_queue

Only its reorder field is actually used now, so remove the struct and
embed @reorder directly in parallel_data.

No functional change, just a cleanup.

Signed-off-by: Daniel Jordan <daniel.m.jordan@oracle.com>
Cc: Herbert Xu <herbert@gondor.apana.org.au>
Cc: Steffen Klassert <steffen.klassert@secunet.com>
Cc: linux-crypto@vger.kernel.org
Cc: linux-kernel@vger.kernel.org
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>

show more ...


# 3f257191 14-Jul-2020 Daniel Jordan <daniel.m.jordan@oracle.com>

padata: fold padata_alloc_possible() into padata_alloc()

There's no reason to have two interfaces when there's only one caller.
Removing _possible saves text and simplifies future changes.

Signed-o

padata: fold padata_alloc_possible() into padata_alloc()

There's no reason to have two interfaces when there's only one caller.
Removing _possible saves text and simplifies future changes.

Signed-off-by: Daniel Jordan <daniel.m.jordan@oracle.com>
Cc: Herbert Xu <herbert@gondor.apana.org.au>
Cc: Steffen Klassert <steffen.klassert@secunet.com>
Cc: linux-crypto@vger.kernel.org
Cc: linux-kernel@vger.kernel.org
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>

show more ...


# d69e037b 14-Jul-2020 Daniel Jordan <daniel.m.jordan@oracle.com>

padata: remove effective cpumasks from the instance

A padata instance has effective cpumasks that store the user-supplied
masks ANDed with the online mask, but this middleman is unnecessary.
paralle

padata: remove effective cpumasks from the instance

A padata instance has effective cpumasks that store the user-supplied
masks ANDed with the online mask, but this middleman is unnecessary.
parallel_data keeps the same information around. Removing this saves
text and code churn in future changes.

Signed-off-by: Daniel Jordan <daniel.m.jordan@oracle.com>
Cc: Herbert Xu <herbert@gondor.apana.org.au>
Cc: Steffen Klassert <steffen.klassert@secunet.com>
Cc: linux-crypto@vger.kernel.org
Cc: linux-kernel@vger.kernel.org
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>

show more ...


# cec00e6e 14-Jul-2020 Daniel Jordan <daniel.m.jordan@oracle.com>

padata: inline single call of pd_setup_cpumasks()

pd_setup_cpumasks() has only one caller. Move its contents inline to
prepare for the next cleanup.

Signed-off-by: Daniel Jordan <daniel.m.jordan@o

padata: inline single call of pd_setup_cpumasks()

pd_setup_cpumasks() has only one caller. Move its contents inline to
prepare for the next cleanup.

Signed-off-by: Daniel Jordan <daniel.m.jordan@oracle.com>
Cc: Herbert Xu <herbert@gondor.apana.org.au>
Cc: Steffen Klassert <steffen.klassert@secunet.com>
Cc: linux-crypto@vger.kernel.org
Cc: linux-kernel@vger.kernel.org
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>

show more ...


# 350ef051 14-Jul-2020 Daniel Jordan <daniel.m.jordan@oracle.com>

padata: remove stop function

padata_stop() has two callers and is unnecessary in both cases. When
pcrypt calls it before padata_free(), it's being unloaded so there are
no outstanding padata jobs[0

padata: remove stop function

padata_stop() has two callers and is unnecessary in both cases. When
pcrypt calls it before padata_free(), it's being unloaded so there are
no outstanding padata jobs[0]. When __padata_free() calls it, it's
either along the same path or else pcrypt initialization failed, which
of course means there are also no outstanding jobs.

Removing it simplifies padata and saves text.

[0] https://lore.kernel.org/linux-crypto/20191119225017.mjrak2fwa5vccazl@gondor.apana.org.au/

Signed-off-by: Daniel Jordan <daniel.m.jordan@oracle.com>
Cc: Herbert Xu <herbert@gondor.apana.org.au>
Cc: Steffen Klassert <steffen.klassert@secunet.com>
Cc: linux-crypto@vger.kernel.org
Cc: linux-kernel@vger.kernel.org
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>

show more ...


# bd25b488 14-Jul-2020 Daniel Jordan <daniel.m.jordan@oracle.com>

padata: remove start function

padata_start() is only used right after pcrypt allocates an instance
with all possible CPUs, when PADATA_INVALID can't happen, so there's no
need for a separate "start"

padata: remove start function

padata_start() is only used right after pcrypt allocates an instance
with all possible CPUs, when PADATA_INVALID can't happen, so there's no
need for a separate "start" step. It can be done during allocation to
save text, make using padata easier, and avoid unneeded calls in the
future.

Signed-off-by: Daniel Jordan <daniel.m.jordan@oracle.com>
Cc: Herbert Xu <herbert@gondor.apana.org.au>
Cc: Steffen Klassert <steffen.klassert@secunet.com>
Cc: linux-crypto@vger.kernel.org
Cc: linux-kernel@vger.kernel.org
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>

show more ...


# e04ec0de 08-Jun-2020 Daniel Jordan <daniel.m.jordan@oracle.com>

padata: upgrade smp_mb__after_atomic to smp_mb in padata_do_serial

A 5.7 kernel hangs during a tcrypt test of padata that waits for an AEAD
request to finish. This is only seen on large machines ru

padata: upgrade smp_mb__after_atomic to smp_mb in padata_do_serial

A 5.7 kernel hangs during a tcrypt test of padata that waits for an AEAD
request to finish. This is only seen on large machines running many
concurrent requests.

The issue is that padata never serializes the request. The removal of
the reorder_objects atomic missed that the memory barrier in
padata_do_serial() depends on it.

Upgrade the barrier from smp_mb__after_atomic to smp_mb to get correct
ordering again.

Fixes: 3facced7aeed1 ("padata: remove reorder_objects")
Signed-off-by: Daniel Jordan <daniel.m.jordan@oracle.com>
Cc: Steffen Klassert <steffen.klassert@secunet.com>
Cc: linux-kernel@vger.kernel.org
Cc: <stable@vger.kernel.org>
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>

show more ...


# 004ed426 03-Jun-2020 Daniel Jordan <daniel.m.jordan@oracle.com>

padata: add basic support for multithreaded jobs

Sometimes the kernel doesn't take full advantage of system memory
bandwidth, leading to a single CPU spending excessive time in
initialization paths

padata: add basic support for multithreaded jobs

Sometimes the kernel doesn't take full advantage of system memory
bandwidth, leading to a single CPU spending excessive time in
initialization paths where the data scales with memory size.

Multithreading naturally addresses this problem.

Extend padata, a framework that handles many parallel yet singlethreaded
jobs, to also handle multithreaded jobs by adding support for splitting up
the work evenly, specifying a minimum amount of work that's appropriate
for one helper thread to do, load balancing between helpers, and
coordinating them.

This is inspired by work from Pavel Tatashin and Steve Sistare.

Signed-off-by: Daniel Jordan <daniel.m.jordan@oracle.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Tested-by: Josh Triplett <josh@joshtriplett.org>
Cc: Alexander Duyck <alexander.h.duyck@linux.intel.com>
Cc: Alex Williamson <alex.williamson@redhat.com>
Cc: Dan Williams <dan.j.williams@intel.com>
Cc: Dave Hansen <dave.hansen@linux.intel.com>
Cc: David Hildenbrand <david@redhat.com>
Cc: Herbert Xu <herbert@gondor.apana.org.au>
Cc: Jason Gunthorpe <jgg@ziepe.ca>
Cc: Jonathan Corbet <corbet@lwn.net>
Cc: Kirill Tkhai <ktkhai@virtuozzo.com>
Cc: Michal Hocko <mhocko@kernel.org>
Cc: Pavel Machek <pavel@ucw.cz>
Cc: Pavel Tatashin <pasha.tatashin@soleen.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Randy Dunlap <rdunlap@infradead.org>
Cc: Robert Elliott <elliott@hpe.com>
Cc: Shile Zhang <shile.zhang@linux.alibaba.com>
Cc: Steffen Klassert <steffen.klassert@secunet.com>
Cc: Steven Sistare <steven.sistare@oracle.com>
Cc: Tejun Heo <tj@kernel.org>
Cc: Zi Yan <ziy@nvidia.com>
Link: http://lkml.kernel.org/r/20200527173608.2885243-5-daniel.m.jordan@oracle.com
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>

show more ...


# 4611ce22 03-Jun-2020 Daniel Jordan <daniel.m.jordan@oracle.com>

padata: allocate work structures for parallel jobs from a pool

padata allocates per-CPU, per-instance work structs for parallel jobs. A
do_parallel call assigns a job to a sequence number and hashe

padata: allocate work structures for parallel jobs from a pool

padata allocates per-CPU, per-instance work structs for parallel jobs. A
do_parallel call assigns a job to a sequence number and hashes the number
to a CPU, where the job will eventually run using the corresponding work.

This approach fit with how padata used to bind a job to each CPU
round-robin, makes less sense after commit bfde23ce200e6 ("padata: unbind
parallel jobs from specific CPUs") because a work isn't bound to a
particular CPU anymore, and isn't needed at all for multithreaded jobs
because they don't have sequence numbers.

Replace the per-CPU works with a preallocated pool, which allows sharing
them between existing padata users and the upcoming multithreaded user.
The pool will also facilitate setting NUMA-aware concurrency limits with
later users.

The pool is sized according to the number of possible CPUs. With this
limit, MAX_OBJ_NUM no longer makes sense, so remove it.

If the global pool is exhausted, a parallel job is run in the current task
instead to throttle a system trying to do too much in parallel.

Signed-off-by: Daniel Jordan <daniel.m.jordan@oracle.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Tested-by: Josh Triplett <josh@joshtriplett.org>
Cc: Alexander Duyck <alexander.h.duyck@linux.intel.com>
Cc: Alex Williamson <alex.williamson@redhat.com>
Cc: Dan Williams <dan.j.williams@intel.com>
Cc: Dave Hansen <dave.hansen@linux.intel.com>
Cc: David Hildenbrand <david@redhat.com>
Cc: Herbert Xu <herbert@gondor.apana.org.au>
Cc: Jason Gunthorpe <jgg@ziepe.ca>
Cc: Jonathan Corbet <corbet@lwn.net>
Cc: Kirill Tkhai <ktkhai@virtuozzo.com>
Cc: Michal Hocko <mhocko@kernel.org>
Cc: Pavel Machek <pavel@ucw.cz>
Cc: Pavel Tatashin <pasha.tatashin@soleen.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Randy Dunlap <rdunlap@infradead.org>
Cc: Robert Elliott <elliott@hpe.com>
Cc: Shile Zhang <shile.zhang@linux.alibaba.com>
Cc: Steffen Klassert <steffen.klassert@secunet.com>
Cc: Steven Sistare <steven.sistare@oracle.com>
Cc: Tejun Heo <tj@kernel.org>
Cc: Zi Yan <ziy@nvidia.com>
Link: http://lkml.kernel.org/r/20200527173608.2885243-4-daniel.m.jordan@oracle.com
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>

show more ...


# f1b192b1 03-Jun-2020 Daniel Jordan <daniel.m.jordan@oracle.com>

padata: initialize earlier

padata will soon initialize the system's struct pages in parallel, so it
needs to be ready by page_alloc_init_late().

The error return from padata_driver_init() triggers

padata: initialize earlier

padata will soon initialize the system's struct pages in parallel, so it
needs to be ready by page_alloc_init_late().

The error return from padata_driver_init() triggers an initcall warning,
so add a warning to padata_init() to avoid silent failure.

Signed-off-by: Daniel Jordan <daniel.m.jordan@oracle.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Tested-by: Josh Triplett <josh@joshtriplett.org>
Cc: Alexander Duyck <alexander.h.duyck@linux.intel.com>
Cc: Alex Williamson <alex.williamson@redhat.com>
Cc: Dan Williams <dan.j.williams@intel.com>
Cc: Dave Hansen <dave.hansen@linux.intel.com>
Cc: David Hildenbrand <david@redhat.com>
Cc: Herbert Xu <herbert@gondor.apana.org.au>
Cc: Jason Gunthorpe <jgg@ziepe.ca>
Cc: Jonathan Corbet <corbet@lwn.net>
Cc: Kirill Tkhai <ktkhai@virtuozzo.com>
Cc: Michal Hocko <mhocko@kernel.org>
Cc: Pavel Machek <pavel@ucw.cz>
Cc: Pavel Tatashin <pasha.tatashin@soleen.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Randy Dunlap <rdunlap@infradead.org>
Cc: Robert Elliott <elliott@hpe.com>
Cc: Shile Zhang <shile.zhang@linux.alibaba.com>
Cc: Steffen Klassert <steffen.klassert@secunet.com>
Cc: Steven Sistare <steven.sistare@oracle.com>
Cc: Tejun Heo <tj@kernel.org>
Cc: Zi Yan <ziy@nvidia.com>
Link: http://lkml.kernel.org/r/20200527173608.2885243-3-daniel.m.jordan@oracle.com
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>

show more ...


# 305dacf7 03-Jun-2020 Daniel Jordan <daniel.m.jordan@oracle.com>

padata: remove exit routine

Patch series "padata: parallelize deferred page init", v3.

Deferred struct page init is a bottleneck in kernel boot--the biggest for
us and probably others. Optimizing

padata: remove exit routine

Patch series "padata: parallelize deferred page init", v3.

Deferred struct page init is a bottleneck in kernel boot--the biggest for
us and probably others. Optimizing it maximizes availability for
large-memory systems and allows spinning up short-lived VMs as needed
without having to leave them running. It also benefits bare metal
machines hosting VMs that are sensitive to downtime. In projects such as
VMM Fast Restart[1], where guest state is preserved across kexec reboot,
it helps prevent application and network timeouts in the guests.

So, multithread deferred init to take full advantage of system memory
bandwidth.

Extend padata, a framework that handles many parallel singlethreaded jobs,
to handle multithreaded jobs as well by adding support for splitting up
the work evenly, specifying a minimum amount of work that's appropriate
for one helper thread to do, load balancing between helpers, and
coordinating them. More documentation in patches 4 and 8.

This series is the first step in a project to address other memory
proportional bottlenecks in the kernel such as pmem struct page init, vfio
page pinning, hugetlb fallocate, and munmap. Deferred page init doesn't
require concurrency limits, resource control, or priority adjustments like
these other users will because it happens during boot when the system is
otherwise idle and waiting for page init to finish.

This has been run on a variety of x86 systems and speeds up kernel boot by
4% to 49%, saving up to 1.6 out of 4 seconds. Patch 6 has more numbers.

This patch (of 8):

padata_driver_exit() is unnecessary because padata isn't built as a module
and doesn't exit.

padata's init routine will soon allocate memory, so getting rid of the
exit function now avoids pointless code to free it.

Signed-off-by: Daniel Jordan <daniel.m.jordan@oracle.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Tested-by: Josh Triplett <josh@joshtriplett.org>
Cc: Alexander Duyck <alexander.h.duyck@linux.intel.com>
Cc: Alex Williamson <alex.williamson@redhat.com>
Cc: Dan Williams <dan.j.williams@intel.com>
Cc: Dave Hansen <dave.hansen@linux.intel.com>
Cc: David Hildenbrand <david@redhat.com>
Cc: Herbert Xu <herbert@gondor.apana.org.au>
Cc: Jason Gunthorpe <jgg@ziepe.ca>
Cc: Jonathan Corbet <corbet@lwn.net>
Cc: Kirill Tkhai <ktkhai@virtuozzo.com>
Cc: Michal Hocko <mhocko@kernel.org>
Cc: Pavel Machek <pavel@ucw.cz>
Cc: Pavel Tatashin <pasha.tatashin@soleen.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Randy Dunlap <rdunlap@infradead.org>
Cc: Robert Elliott <elliott@hpe.com>
Cc: Shile Zhang <shile.zhang@linux.alibaba.com>
Cc: Steffen Klassert <steffen.klassert@secunet.com>
Cc: Steven Sistare <steven.sistare@oracle.com>
Cc: Tejun Heo <tj@kernel.org>
Cc: Zi Yan <ziy@nvidia.com>
Link: http://lkml.kernel.org/r/20200527173608.2885243-1-daniel.m.jordan@oracle.com
Link: http://lkml.kernel.org/r/20200527173608.2885243-2-daniel.m.jordan@oracle.com
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>

show more ...


# 3c2214b6 21-Apr-2020 Daniel Jordan <daniel.m.jordan@oracle.com>

padata: add separate cpuhp node for CPUHP_PADATA_DEAD

Removing the pcrypt module triggers this:

general protection fault, probably for non-canonical
address 0xdead000000000122
CPU: 5 PID: 2

padata: add separate cpuhp node for CPUHP_PADATA_DEAD

Removing the pcrypt module triggers this:

general protection fault, probably for non-canonical
address 0xdead000000000122
CPU: 5 PID: 264 Comm: modprobe Not tainted 5.6.0+ #2
Hardware name: QEMU Standard PC
RIP: 0010:__cpuhp_state_remove_instance+0xcc/0x120
Call Trace:
padata_sysfs_release+0x74/0xce
kobject_put+0x81/0xd0
padata_free+0x12/0x20
pcrypt_exit+0x43/0x8ee [pcrypt]

padata instances wrongly use the same hlist node for the online and dead
states, so __padata_free()'s second cpuhp remove call chokes on the node
that the first poisoned.

cpuhp multi-instance callbacks only walk forward in cpuhp_step->list and
the same node is linked in both the online and dead lists, so the list
corruption that results from padata_alloc() adding the node to a second
list without removing it from the first doesn't cause problems as long
as no instances are freed.

Avoid the issue by giving each state its own node.

Fixes: 894c9ef9780c ("padata: validate cpumask without removed CPU during offline")
Signed-off-by: Daniel Jordan <daniel.m.jordan@oracle.com>
Cc: Herbert Xu <herbert@gondor.apana.org.au>
Cc: Steffen Klassert <steffen.klassert@secunet.com>
Cc: linux-crypto@vger.kernel.org
Cc: linux-kernel@vger.kernel.org
Cc: stable@vger.kernel.org # v5.4+
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>

show more ...


12345