History log of /linux/kernel/signal.c (Results 1 – 25 of 695)
Revision Date Author Comments
# a436184e 26-Feb-2024 Oleg Nesterov <oleg@redhat.com>

get_signal: don't initialize ksig->info if SIGNAL_GROUP_EXIT/group_exec_task

This initialization is incomplete and unnecessary, neither do_group_exit()
nor PF_USER_WORKER need ksig->info.

Link: htt

get_signal: don't initialize ksig->info if SIGNAL_GROUP_EXIT/group_exec_task

This initialization is incomplete and unnecessary, neither do_group_exit()
nor PF_USER_WORKER need ksig->info.

Link: https://lkml.kernel.org/r/20240226165653.GA20834@redhat.com
Signed-off-by: Oleg Nesterov <oleg@redhat.com>
Cc: Christian Brauner <brauner@kernel.org>
Cc: Eric W. Biederman <ebiederm@xmission.com>
Cc: Peter Collingbourne <pcc@google.com>
Cc: Wen Yang <wenyang.linux@foxmail.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>

show more ...


# dd69edd6 26-Feb-2024 Oleg Nesterov <oleg@redhat.com>

get_signal: hide_si_addr_tag_bits: fix the usage of uninitialized ksig

ksig->ka and ksig->info are not initialized if get_signal() returns 0 or
if the caller is PF_USER_WORKER.

Check signr != 0 bef

get_signal: hide_si_addr_tag_bits: fix the usage of uninitialized ksig

ksig->ka and ksig->info are not initialized if get_signal() returns 0 or
if the caller is PF_USER_WORKER.

Check signr != 0 before SA_EXPOSE_TAGBITS and move the "out" label down.

The latter means that ksig->sig won't be initialized if a PF_USER_WORKER
thread gets a fatal signal but this is fine, PF_USER_WORKER's don't use
ksig. And there is nothing new, in this case ksig->ka and ksig-info are
not initialized anyway. Add a comment.

Link: https://lkml.kernel.org/r/20240226165650.GA20829@redhat.com
Signed-off-by: Oleg Nesterov <oleg@redhat.com>
Cc: Christian Brauner <brauner@kernel.org>
Cc: Eric W. Biederman <ebiederm@xmission.com>
Cc: Peter Collingbourne <pcc@google.com>
Cc: Wen Yang <wenyang.linux@foxmail.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>

show more ...


# 49fd5f5a 26-Feb-2024 Oleg Nesterov <oleg@redhat.com>

get_signal: don't abuse ksig->info.si_signo and ksig->sig

Patch series "get_signal: minor cleanups and fix".

Lets remove this clear_siginfo() right now. It is incomplete (and thus
looks confusing)

get_signal: don't abuse ksig->info.si_signo and ksig->sig

Patch series "get_signal: minor cleanups and fix".

Lets remove this clear_siginfo() right now. It is incomplete (and thus
looks confusing) and unnecessary. Also, PF_USER_WORKER's already don't
get a fully initialized ksig anyway.


This patch (of 3):

Cleanup and preparation for the next changes.

get_signal() uses signr or ksig->info.si_signo or ksig->sig in a chaotic
way, this looks confusing. Change it to always use signr.

Link: https://lkml.kernel.org/r/20240226165612.GA20787@redhat.com
Link: https://lkml.kernel.org/r/20240226165647.GA20826@redhat.com
Signed-off-by: Oleg Nesterov <oleg@redhat.com>
Cc: Christian Brauner <brauner@kernel.org>
Cc: Eric W. Biederman <ebiederm@xmission.com>
Cc: Peter Collingbourne <pcc@google.com>
Cc: Wen Yang <wenyang.linux@foxmail.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>

show more ...


# e1fb1dc0 09-Feb-2024 Christian Brauner <brauner@kernel.org>

pidfd: allow to override signal scope in pidfd_send_signal()

Right now we determine the scope of the signal based on the type of
pidfd. There are use-cases where it's useful to override the scope of

pidfd: allow to override signal scope in pidfd_send_signal()

Right now we determine the scope of the signal based on the type of
pidfd. There are use-cases where it's useful to override the scope of
the signal. For example in [1]. Add flags to determine the scope of the
signal:

(1) PIDFD_SIGNAL_THREAD: send signal to specific thread reference by @pidfd
(2) PIDFD_SIGNAL_THREAD_GROUP: send signal to thread-group of @pidfd
(2) PIDFD_SIGNAL_PROCESS_GROUP: send signal to process-group of @pidfd

Since we now allow specifying PIDFD_SEND_PROCESS_GROUP for
pidfd_send_signal() to send signals to process groups we need to adjust
the check restricting si_code emulation by userspace to account for
PIDTYPE_PGID.

Reviewed-by: Oleg Nesterov <oleg@redhat.com>
Link: https://github.com/systemd/systemd/issues/31093 [1]
Link: https://lore.kernel.org/r/20240210-chihuahua-hinzog-3945b6abd44a@brauner
Link: https://lore.kernel.org/r/20240214123655.GB16265@redhat.com
Signed-off-by: Christian Brauner <brauner@kernel.org>

show more ...


# 81b9d8ac 09-Feb-2024 Oleg Nesterov <oleg@redhat.com>

pidfd: change pidfd_send_signal() to respect PIDFD_THREAD

Turn kill_pid_info() into kill_pid_info_type(), this allows to pass any
pid_type to group_send_sig_info(), despite its name it should work f

pidfd: change pidfd_send_signal() to respect PIDFD_THREAD

Turn kill_pid_info() into kill_pid_info_type(), this allows to pass any
pid_type to group_send_sig_info(), despite its name it should work fine
even if type = PIDTYPE_PID.

Change pidfd_send_signal() to use PIDTYPE_PID or PIDTYPE_TGID depending
on PIDFD_THREAD.

While at it kill another TODO comment in pidfd_show_fdinfo(). As Christian
expains fdinfo reports f_flags, userspace can already detect PIDFD_THREAD.

Reviewed-by: Tycho Andersen <tandersen@netflix.com>
Signed-off-by: Oleg Nesterov <oleg@redhat.com>
Link: https://lore.kernel.org/r/20240209130650.GA8048@redhat.com
Signed-off-by: Christian Brauner <brauner@kernel.org>

show more ...


# c044a950 09-Feb-2024 Oleg Nesterov <oleg@redhat.com>

signal: fill in si_code in prepare_kill_siginfo()

So that do_tkill() can use this helper too. This also simplifies
the next patch.

TODO: perhaps we can kill prepare_kill_siginfo() and change the
ca

signal: fill in si_code in prepare_kill_siginfo()

So that do_tkill() can use this helper too. This also simplifies
the next patch.

TODO: perhaps we can kill prepare_kill_siginfo() and change the
callers to use SEND_SIG_NOINFO, but this needs some changes in
__send_signal_locked() and TP_STORE_SIGINFO().

Reviewed-by: Tycho Andersen <tandersen@netflix.com>
Signed-off-by: Oleg Nesterov <oleg@redhat.com>
Link: https://lore.kernel.org/r/20240209130620.GA8039@redhat.com
Signed-off-by: Christian Brauner <brauner@kernel.org>

show more ...


# 9ed52108 05-Feb-2024 Oleg Nesterov <oleg@redhat.com>

pidfd: change do_notify_pidfd() to use __wake_up(poll_to_key(EPOLLIN))

rather than wake_up_all(). This way do_notify_pidfd() won't wakeup the
POLLHUP-only waiters which wait for pid_task() == NULL.

pidfd: change do_notify_pidfd() to use __wake_up(poll_to_key(EPOLLIN))

rather than wake_up_all(). This way do_notify_pidfd() won't wakeup the
POLLHUP-only waiters which wait for pid_task() == NULL.

TODO:
- as Christian pointed out, this asks for the new wake_up_all_poll()
helper, it can already have other users.

- we can probably discriminate the PIDFD_THREAD and non-PIDFD_THREAD
waiters, but this needs more work. See
https://lore.kernel.org/all/20240205140848.GA15853@redhat.com/

Signed-off-by: Oleg Nesterov <oleg@redhat.com>
Link: https://lore.kernel.org/r/20240205141348.GA16539@redhat.com
Reviewed-by: Tycho Andersen <tandersen@netflix.com>
Signed-off-by: Christian Brauner <brauner@kernel.org>

show more ...


# 64bef697 31-Jan-2024 Oleg Nesterov <oleg@redhat.com>

pidfd: implement PIDFD_THREAD flag for pidfd_open()

With this flag:

- pidfd_open() doesn't require that the target task must be
a thread-group leader

- pidfd_poll() succeeds when the task exi

pidfd: implement PIDFD_THREAD flag for pidfd_open()

With this flag:

- pidfd_open() doesn't require that the target task must be
a thread-group leader

- pidfd_poll() succeeds when the task exits and becomes a
zombie (iow, passes exit_notify()), even if it is a leader
and thread-group is not empty.

This means that the behaviour of pidfd_poll(PIDFD_THREAD,
pid-of-group-leader) is not well defined if it races with
exec() from its sub-thread; pidfd_poll() can succeed or not
depending on whether pidfd_task_exited() is called before
or after exchange_tids().

Perhaps we can improve this behaviour later, pidfd_poll()
can probably take sig->group_exec_task into account. But
this doesn't really differ from the case when the leader
exits before other threads (so pidfd_poll() succeeds) and
then another thread execs and pidfd_poll() will block again.

thread_group_exited() is no longer used, perhaps it can die.

Co-developed-by: Tycho Andersen <tycho@tycho.pizza>
Signed-off-by: Oleg Nesterov <oleg@redhat.com>
Link: https://lore.kernel.org/r/20240131132602.GA23641@redhat.com
Tested-by: Tycho Andersen <tandersen@netflix.com>
Reviewed-by: Tycho Andersen <tandersen@netflix.com>
Signed-off-by: Christian Brauner <brauner@kernel.org>

show more ...


# 21e25205 27-Jan-2024 Oleg Nesterov <oleg@redhat.com>

pidfd: don't do_notify_pidfd() if !thread_group_empty()

do_notify_pidfd() makes no sense until the whole thread group exits, change
do_notify_parent() to check thread_group_empty().

This avoids the

pidfd: don't do_notify_pidfd() if !thread_group_empty()

do_notify_pidfd() makes no sense until the whole thread group exits, change
do_notify_parent() to check thread_group_empty().

This avoids the unnecessary do_notify_pidfd() when tsk is not a leader, or
it exits before other threads, or it has a ptraced EXIT_ZOMBIE sub-thread.

Signed-off-by: Oleg Nesterov <oleg@redhat.com>
Link: https://lore.kernel.org/r/20240127132407.GA29136@redhat.com
Reviewed-by: Tycho Andersen <tandersen@netflix.com>
Signed-off-by: Christian Brauner <brauner@kernel.org>

show more ...


# b454ec29 20-Nov-2023 Oleg Nesterov <oleg@redhat.com>

kernel/signal.c: simplify force_sig_info_to_task(), kill recalc_sigpending_and_wake()

The purpose of recalc_sigpending_and_wake() is not clear, it looks
"obviously unneeded" because we are going to

kernel/signal.c: simplify force_sig_info_to_task(), kill recalc_sigpending_and_wake()

The purpose of recalc_sigpending_and_wake() is not clear, it looks
"obviously unneeded" because we are going to send the signal which can't
be blocked or ignored.

Add the comment to explain why we can't rely on send_signal_locked() and
make this logic more simple/explicit. recalc_sigpending_and_wake() has no
other users, it can die.

In fact I think we don't even need signal_wake_up(), the target task must
be either current or a TASK_TRACED child, otherwise the usage of siglock
is not safe. But this needs another change.

Link: https://lkml.kernel.org/r/20231120151649.GA15995@redhat.com
Signed-off-by: Oleg Nesterov <oleg@redhat.com>
Cc: Eric Biederman <ebiederm@xmission.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>

show more ...


# 61a7a5e2 30-Oct-2023 Oleg Nesterov <oleg@redhat.com>

introduce for_other_threads(p, t)

Cosmetic, but imho it makes the usage look more clear and simple, the new
helper doesn't require to initialize "t".

After this change while_each_thread() has only

introduce for_other_threads(p, t)

Cosmetic, but imho it makes the usage look more clear and simple, the new
helper doesn't require to initialize "t".

After this change while_each_thread() has only 3 users, and it is only
used in the do/while loops.

Link: https://lkml.kernel.org/r/20231030155710.GA9095@redhat.com
Signed-off-by: Oleg Nesterov <oleg@redhat.com>
Reviewed-by: Christian Brauner <brauner@kernel.org>
Cc: "Eric W. Biederman" <ebiederm@xmission.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>

show more ...


# a287116a 26-Sep-2023 Li kunyu <kunyu@nfschina.com>

kernel/signal: remove unnecessary NULL values from ucounts

ucounts is assigned first, so it does not need to initialize the
assignment.

Link: https://lkml.kernel.org/r/20230926022410.4280-1-kunyu@n

kernel/signal: remove unnecessary NULL values from ucounts

ucounts is assigned first, so it does not need to initialize the
assignment.

Link: https://lkml.kernel.org/r/20230926022410.4280-1-kunyu@nfschina.com
Signed-off-by: Li kunyu <kunyu@nfschina.com>
Acked-by: Oleg Nesterov <oleg@redhat.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>

show more ...


# e5ecf29c 09-Sep-2023 Oleg Nesterov <oleg@redhat.com>

signal: complete_signal: use __for_each_thread()

do/while_each_thread should be avoided when possible.

Link: https://lkml.kernel.org/r/20230909164537.GA11633@redhat.com
Signed-off-by: Oleg Nesterov

signal: complete_signal: use __for_each_thread()

do/while_each_thread should be avoided when possible.

Link: https://lkml.kernel.org/r/20230909164537.GA11633@redhat.com
Signed-off-by: Oleg Nesterov <oleg@redhat.com>
Cc: Eric W. Biederman <ebiederm@xmission.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>

show more ...


# 39835204 23-Aug-2023 Oleg Nesterov <oleg@redhat.com>

__kill_pgrp_info: simplify the calculation of return value

No need to calculate/check the "success" variable, we can kill it and update
retval in the main loop unless it is zero.

Link: https://lkml

__kill_pgrp_info: simplify the calculation of return value

No need to calculate/check the "success" variable, we can kill it and update
retval in the main loop unless it is zero.

Link: https://lkml.kernel.org/r/20230823171455.GA12188@redhat.com
Signed-off-by: Oleg Nesterov <oleg@redhat.com>
Suggested-by: David Laight <David.Laight@ACULAB.COM>
Cc: Eric W. Biederman <ebiederm@xmission.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>

show more ...


# 1aabbc53 03-Aug-2023 Sebastian Andrzej Siewior <bigeasy@linutronix.de>

signal: Don't disable preemption in ptrace_stop() on PREEMPT_RT

On PREEMPT_RT keeping preemption disabled during the invocation of
cgroup_enter_frozen() is a problem because the function acquires
cs

signal: Don't disable preemption in ptrace_stop() on PREEMPT_RT

On PREEMPT_RT keeping preemption disabled during the invocation of
cgroup_enter_frozen() is a problem because the function acquires
css_set_lock which is a sleeping lock on PREEMPT_RT and must not be
acquired with disabled preemption.

The preempt-disabled section is only for performance optimisation reasons
and can be avoided.

Extend the comment and don't disable preemption before scheduling on
PREEMPT_RT.

Signed-off-by: Sebastian Andrzej Siewior <bigeasy@linutronix.de>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Acked-by: Oleg Nesterov <oleg@redhat.com>
Link: https://lore.kernel.org/r/20230803100932.325870-3-bigeasy@linutronix.de

show more ...


# a20d6f63 03-Aug-2023 Sebastian Andrzej Siewior <bigeasy@linutronix.de>

signal: Add a proper comment about preempt_disable() in ptrace_stop()

Commit 53da1d9456fe7 ("fix ptrace slowness") added a preempt-disable section
between read_unlock() and the following schedule()

signal: Add a proper comment about preempt_disable() in ptrace_stop()

Commit 53da1d9456fe7 ("fix ptrace slowness") added a preempt-disable section
between read_unlock() and the following schedule() invocation without
explaining why it is needed.

Replace the existing contentless comment with a proper explanation to
clarify that it is not needed for correctness but for performance reasons.

Signed-off-by: Sebastian Andrzej Siewior <bigeasy@linutronix.de>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Acked-by: Oleg Nesterov <oleg@redhat.com>
Link: https://lore.kernel.org/r/20230803100932.325870-2-bigeasy@linutronix.de

show more ...


# f5e83688 13-Jan-2023 Ard Biesheuvel <ardb@kernel.org>

kernel: Drop IA64 support from sig_fault handlers

Signed-off-by: Ard Biesheuvel <ardb@kernel.org>


# b0b88e02 07-Jul-2023 Vincent Whitchurch <vincent.whitchurch@axis.com>

signal: print comm and exe name on fatal signals

Make the print-fatal-signals message more useful by printing the comm
and the exe name for the process which received the fatal signal:

Before:

po

signal: print comm and exe name on fatal signals

Make the print-fatal-signals message more useful by printing the comm
and the exe name for the process which received the fatal signal:

Before:

potentially unexpected fatal signal 4
potentially unexpected fatal signal 11

After:

buggy-program: pool: potentially unexpected fatal signal 4
some-daemon: gdbus: potentially unexpected fatal signal 11

comm used to be present but was removed in commit 681a90ffe829b8ee25d
("arc, print-fatal-signals: reduce duplicated information") because it's
also included as part of the later stack trace. Having the comm as part
of the main "unexpected fatal..." print is rather useful though when
analysing logs, and the exe name is also valuable as shown in the
examples above where the comm ends up having some generic name like
"pool".

[akpm@linux-foundation.org: don't include linux/file.h twice]
Link: https://lkml.kernel.org/r/20230707-fatal-comm-v1-1-400363905d5e@axis.com
Signed-off-by: Vincent Whitchurch <vincent.whitchurch@axis.com>
Cc: Ingo Molnar <mingo@kernel.org>
Cc: Tejun Heo <tj@kernel.org>
Cc: Vineet Gupta <vgupta@kernel.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>

show more ...


# 5f0bc0b0 25-Jul-2023 Linus Torvalds <torvalds@linux-foundation.org>

mm: suppress mm fault logging if fatal signal already pending

Commit eda0047296a1 ("mm: make the page fault mmap locking killable")
intentionally made it much easier to trigger the "page fault fails

mm: suppress mm fault logging if fatal signal already pending

Commit eda0047296a1 ("mm: make the page fault mmap locking killable")
intentionally made it much easier to trigger the "page fault fails
because a fatal signal is pending" situation, by having the mmap locking
fail early in that case.

We have long aborted page faults in other fatal cases when the actual IO
for a page is interrupted by SIGKILL - which is particularly useful for
the traditional case of NFS hanging due to network issues, but local
filesystems could cause it too if you happened to get the SIGKILL while
waiting for a page to be faulted in (eg lock_folio_maybe_drop_mmap()).

So aborting the page fault wasn't a new condition - but it now triggers
earlier, before we even get to 'handle_mm_fault()'. And as a result the
error doesn't go through our 'fault_signal_pending()' logic, and doesn't
get filtered away there.

Normally you'd never even notice, because if a fatal signal is pending,
the new SIGSEGV we send ends up being ignored anyway.

But it turns out that there is one very noticeable exception: if you
enable 'show_unhandled_signals', the aborted page fault will be logged
in the kernel messages, and you'll get a scary line looking something
like this in your logs:

pverados[2183248]: segfault at 55e5a00f9ae0 ip 000055e5a00f9ae0 sp 00007ffc0720bea8 error 14 in perl[55e5a00d4000+195000] likely on CPU 10 (core 4, socket 0)

which is rather misleading. It's not really a segfault at all, it's
just "the thread was killed before the page fault completed, so we
aborted the page fault".

Fix this by just making it clear that a pending fatal signal means that
any new signal coming in after that is implicitly handled. This will
avoid the misleading logging, since now the signal isn't 'unhandled' any
more.

Reported-and-tested-by: Fiona Ebner <f.ebner@proxmox.com>
Tested-by: Thomas Lamprecht <t.lamprecht@proxmox.com>
Link: https://lore.kernel.org/lkml/8d063a26-43f5-0bb7-3203-c6a04dc159f8@proxmox.com/
Acked-by: Oleg Nesterov <oleg@redhat.com>
Fixes: eda0047296a1 ("mm: make the page fault mmap locking killable")
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>

show more ...


# f9010dbd 01-Jun-2023 Mike Christie <michael.christie@oracle.com>

fork, vhost: Use CLONE_THREAD to fix freezer/ps regression

When switching from kthreads to vhost_tasks two bugs were added:
1. The vhost worker tasks's now show up as processes so scripts doing
ps o

fork, vhost: Use CLONE_THREAD to fix freezer/ps regression

When switching from kthreads to vhost_tasks two bugs were added:
1. The vhost worker tasks's now show up as processes so scripts doing
ps or ps a would not incorrectly detect the vhost task as another
process. 2. kthreads disabled freeze by setting PF_NOFREEZE, but
vhost tasks's didn't disable or add support for them.

To fix both bugs, this switches the vhost task to be thread in the
process that does the VHOST_SET_OWNER ioctl, and has vhost_worker call
get_signal to support SIGKILL/SIGSTOP and freeze signals. Note that
SIGKILL/STOP support is required because CLONE_THREAD requires
CLONE_SIGHAND which requires those 2 signals to be supported.

This is a modified version of the patch written by Mike Christie
<michael.christie@oracle.com> which was a modified version of patch
originally written by Linus.

Much of what depended upon PF_IO_WORKER now depends on PF_USER_WORKER.
Including ignoring signals, setting up the register state, and having
get_signal return instead of calling do_group_exit.

Tidied up the vhost_task abstraction so that the definition of
vhost_task only needs to be visible inside of vhost_task.c. Making
it easier to review the code and tell what needs to be done where.
As part of this the main loop has been moved from vhost_worker into
vhost_task_fn. vhost_worker now returns true if work was done.

The main loop has been updated to call get_signal which handles
SIGSTOP, freezing, and collects the message that tells the thread to
exit as part of process exit. This collection clears
__fatal_signal_pending. This collection is not guaranteed to
clear signal_pending() so clear that explicitly so the schedule()
sleeps.

For now the vhost thread continues to exist and run work until the
last file descriptor is closed and the release function is called as
part of freeing struct file. To avoid hangs in the coredump
rendezvous and when killing threads in a multi-threaded exec. The
coredump code and de_thread have been modified to ignore vhost threads.

Remvoing the special case for exec appears to require teaching
vhost_dev_flush how to directly complete transactions in case
the vhost thread is no longer running.

Removing the special case for coredump rendezvous requires either the
above fix needed for exec or moving the coredump rendezvous into
get_signal.

Fixes: 6e890c5d5021 ("vhost: use vhost_tasks for worker threads")
Signed-off-by: Eric W. Biederman <ebiederm@xmission.com>
Co-developed-by: Mike Christie <michael.christie@oracle.com>
Signed-off-by: Mike Christie <michael.christie@oracle.com>
Acked-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>

show more ...


# 01e6aac7 18-May-2023 Luis Chamberlain <mcgrof@kernel.org>

signal: move show_unhandled_signals sysctl to its own file

The show_unhandled_signals sysctl is the only sysctl for debug
left on kernel/sysctl.c. We've been moving the syctls out from
kernel/sysctl

signal: move show_unhandled_signals sysctl to its own file

The show_unhandled_signals sysctl is the only sysctl for debug
left on kernel/sysctl.c. We've been moving the syctls out from
kernel/sysctl.c so to help avoid merge conflicts as the shared
array gets out of hand.

This change incurs simplifies sysctl registration by localizing
it where it should go for a penalty in size of increasing the
kernel by 23 bytes, we accept this given recent cleanups have
actually already saved us 1465 bytes in the prior commits.

./scripts/bloat-o-meter vmlinux.3-remove-dev-table vmlinux.4-remove-debug-table
add/remove: 3/1 grow/shrink: 0/1 up/down: 177/-154 (23)
Function old new delta
signal_debug_table - 128 +128
init_signal_sysctls - 33 +33
__pfx_init_signal_sysctls - 16 +16
sysctl_init_bases 85 59 -26
debug_table 128 - -128
Total: Before=21256967, After=21256990, chg +0.00%

Reviewed-by: Joel Granados <j.granados@samsung.com>
Signed-off-by: Luis Chamberlain <mcgrof@kernel.org>

show more ...


# bcb7ee79 16-Mar-2023 Dmitry Vyukov <dvyukov@google.com>

posix-timers: Prefer delivery of signals to the current thread

POSIX timers using the CLOCK_PROCESS_CPUTIME_ID clock prefer the main
thread of a thread group for signal delivery. However, this has a

posix-timers: Prefer delivery of signals to the current thread

POSIX timers using the CLOCK_PROCESS_CPUTIME_ID clock prefer the main
thread of a thread group for signal delivery. However, this has a
significant downside: it requires waking up a potentially idle thread.

Instead, prefer to deliver signals to the current thread (in the same
thread group) if SIGEV_THREAD_ID is not set by the user. This does not
change guaranteed semantics, since POSIX process CPU time timers have
never guaranteed that signal delivery is to a specific thread (without
SIGEV_THREAD_ID set).

The effect is that queueing the signal no longer wakes up potentially idle
threads, and the kernel is no longer biased towards delivering the timer
signal to any particular thread (which better distributes the timer signals
esp. when multiple timers fire concurrently).

Suggested-by: Oleg Nesterov <oleg@redhat.com>
Signed-off-by: Dmitry Vyukov <dvyukov@google.com>
Signed-off-by: Marco Elver <elver@google.com>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Reviewed-by: Oleg Nesterov <oleg@redhat.com>
Acked-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Link: https://lore.kernel.org/r/20230316123028.2890338-1-elver@google.com

show more ...


# af7f588d 22-Nov-2022 Mathieu Desnoyers <mathieu.desnoyers@efficios.com>

sched: Introduce per-memory-map concurrency ID

This feature allows the scheduler to expose a per-memory map concurrency
ID to user-space. This concurrency ID is within the possible cpus range,
and i

sched: Introduce per-memory-map concurrency ID

This feature allows the scheduler to expose a per-memory map concurrency
ID to user-space. This concurrency ID is within the possible cpus range,
and is temporarily (and uniquely) assigned while threads are actively
running within a memory map. If a memory map has fewer threads than
cores, or is limited to run on few cores concurrently through sched
affinity or cgroup cpusets, the concurrency IDs will be values close
to 0, thus allowing efficient use of user-space memory for per-cpu
data structures.

This feature is meant to be exposed by a new rseq thread area field.

The primary purpose of this feature is to do the heavy-lifting needed
by memory allocators to allow them to use per-cpu data structures
efficiently in the following situations:

- Single-threaded applications,
- Multi-threaded applications on large systems (many cores) with limited
cpu affinity mask,
- Multi-threaded applications on large systems (many cores) with
restricted cgroup cpuset per container.

One of the key concern from scheduler maintainers is the overhead
associated with additional spin locks or atomic operations in the
scheduler fast-path. This is why the following optimization is
implemented.

On context switch between threads belonging to the same memory map,
transfer the mm_cid from prev to next without any atomic ops. This
takes care of use-cases involving frequent context switch between
threads belonging to the same memory map.

Additional optimizations can be done if the spin locks added when
context switching between threads belonging to different memory maps end
up being a performance bottleneck. Those are left out of this patch
though. A performance impact would have to be clearly demonstrated to
justify the added complexity.

The credit goes to Paul Turner (Google) for the original virtual cpu id
idea. This feature is implemented based on the discussions with Paul
Turner and Peter Oskolkov (Google), but I took the liberty to implement
scheduler fast-path optimizations and my own NUMA-awareness scheme. The
rumor has it that Google have been running a rseq vcpu_id extension
internally in production for a year. The tcmalloc source code indeed has
comments hinting at a vcpu_id prototype extension to the rseq system
call [1].

The following benchmarks do not show any significant overhead added to
the scheduler context switch by this feature:

* perf bench sched messaging (process)

Baseline: 86.5±0.3 ms
With mm_cid: 86.7±2.6 ms

* perf bench sched messaging (threaded)

Baseline: 84.3±3.0 ms
With mm_cid: 84.7±2.6 ms

* hackbench (process)

Baseline: 82.9±2.7 ms
With mm_cid: 82.9±2.9 ms

* hackbench (threaded)

Baseline: 85.2±2.6 ms
With mm_cid: 84.4±2.9 ms

[1] https://github.com/google/tcmalloc/blob/master/tcmalloc/internal/linux_syscall_support.h#L26

Signed-off-by: Mathieu Desnoyers <mathieu.desnoyers@efficios.com>
Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Link: https://lore.kernel.org/r/20221122203932.231377-8-mathieu.desnoyers@efficios.com

show more ...


# 3a017d63 28-Nov-2022 haifeng.xu <haifeng.xu@shopee.com>

signal: Initialize the info in ksignal

When handing the SIGNAL_GROUP_EXIT flag, the info in ksignal isn't cleared.
However, the info acquired by dequeue_synchronous_signal/dequeue_signal is
initiali

signal: Initialize the info in ksignal

When handing the SIGNAL_GROUP_EXIT flag, the info in ksignal isn't cleared.
However, the info acquired by dequeue_synchronous_signal/dequeue_signal is
initialized and can be safely used. Fortunately, the fatal signal process
just uses the si_signo and doesn't use any other member. Even so, the
initialization before use is more safer.

Signed-off-by: haifeng.xu <haifeng.xu@shopee.com>
Signed-off-by: Kees Cook <keescook@chromium.org>
Link: https://lore.kernel.org/r/20221128065606.19570-1-haifeng.xu@shopee.com

show more ...


# 6a542d1d 08-Jun-2020 Al Viro <viro@zeniv.linux.org.uk>

kill signal_pt_regs()

Once upon at it was used on hot paths, but that had not been
true since 2013. IOW, there's no point for arch-optimized
equivalent of task_pt_regs(current) - remaining two user

kill signal_pt_regs()

Once upon at it was used on hot paths, but that had not been
true since 2013. IOW, there's no point for arch-optimized
equivalent of task_pt_regs(current) - remaining two users are
not worth bothering with.

Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>

show more ...


12345678910>>...28