History log of /linux/kernel/profile.c (Results 1 – 25 of 82)
Revision Date Author Comments
# 2e5449f4 27-Apr-2024 Tetsuo Handa <penguin-kernel@I-love.SAKURA.ne.jp>

profiling: Remove create_prof_cpu_mask().

create_prof_cpu_mask() is no longer used after commit 1f44a225777e ("s390:
convert interrupt handling to use generic hardirq").

Signed-off-by: Tetsuo Handa

profiling: Remove create_prof_cpu_mask().

create_prof_cpu_mask() is no longer used after commit 1f44a225777e ("s390:
convert interrupt handling to use generic hardirq").

Signed-off-by: Tetsuo Handa <penguin-kernel@I-love.SAKURA.ne.jp>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>

show more ...


# 35783ccb 01-Sep-2022 wuchi <wuchi.zero@gmail.com>

kernel/profile.c: simplify duplicated code in profile_setup()

The code to parse option string "schedule/sleep/kvm" of cmdline in
function profile_setup is redundant, so simplify that.

Link: https:/

kernel/profile.c: simplify duplicated code in profile_setup()

The code to parse option string "schedule/sleep/kvm" of cmdline in
function profile_setup is redundant, so simplify that.

Link: https://lkml.kernel.org/r/20220901003121.53597-1-wuchi.zero@gmail.com
Signed-off-by: wuchi <wuchi.zero@gmail.com>
Reviewed-by: Andrew Morton <akpm@linux-foudation.org>
Cc: Christoph Hellwig <hch@lst.de>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>

show more ...


# 787dbea1 21-Jul-2022 Ben Dooks <ben-linux@fluff.org>

profile: setup_profiling_timer() is moslty not implemented

The setup_profiling_timer() is mostly un-implemented by many
architectures. In many places it isn't guarded by CONFIG_PROFILE which is
nee

profile: setup_profiling_timer() is moslty not implemented

The setup_profiling_timer() is mostly un-implemented by many
architectures. In many places it isn't guarded by CONFIG_PROFILE which is
needed for it to be used. Make it a weak symbol in kernel/profile.c and
remove the 'return -EINVAL' implementations from the kenrel.

There are a couple of architectures which do return 0 from the
setup_profiling_timer() function but they don't seem to do anything else
with it. To keep the /proc compatibility for now, leave these for a
future update or removal.

On ARM, this fixes the following sparse warning:
arch/arm/kernel/smp.c:793:5: warning: symbol 'setup_profiling_timer' was not declared. Should it be static?

Link: https://lkml.kernel.org/r/20220721195509.418205-1-ben-linux@fluff.org
Signed-off-by: Ben Dooks <ben-linux@fluff.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>

show more ...


# 0fe6ee8f 31-May-2022 Chen Zhongjin <chenzhongjin@huawei.com>

profiling: fix shift too large makes kernel panic

2d186afd04d6 ("profiling: fix shift-out-of-bounds bugs") limits shift
value by [0, BITS_PER_LONG -1], which means [0, 63].

However, syzbot found th

profiling: fix shift too large makes kernel panic

2d186afd04d6 ("profiling: fix shift-out-of-bounds bugs") limits shift
value by [0, BITS_PER_LONG -1], which means [0, 63].

However, syzbot found that the max shift value should be the bit number of
(_etext - _stext). If shift is outside of this, the "buffer_bytes" will
be zero and will cause kzalloc(0). Then the kernel panics due to
dereferencing the returned pointer 16.

This can be easily reproduced by passing a large number like 60 to enable
profiling and then run readprofile.

LOGS:
BUG: kernel NULL pointer dereference, address: 0000000000000010
#PF: supervisor write access in kernel mode
#PF: error_code(0x0002) - not-present page
PGD 6148067 P4D 6148067 PUD 6142067 PMD 0
PREEMPT SMP
CPU: 4 PID: 184 Comm: readprofile Not tainted 5.18.0+ #162
Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS rel-1.15.0-0-g2dd4b9b3f840-prebuilt.qemu.org 04/01/2014
RIP: 0010:read_profile+0x104/0x220
RSP: 0018:ffffc900006fbe80 EFLAGS: 00000202
RAX: 0000000000000000 RBX: 0000000000000000 RCX: 0000000000000000
RDX: ffff888006150000 RSI: 0000000000000001 RDI: ffffffff82aba4a0
RBP: 000000000188bb60 R08: 0000000000000010 R09: ffff888006151000
R10: 0000000000000000 R11: 0000000000000000 R12: ffffffff82aba4a0
R13: 0000000000000000 R14: ffffc900006fbf08 R15: 0000000000020c30
FS: 000000000188a8c0(0000) GS:ffff88803ed00000(0000) knlGS:0000000000000000
CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033
CR2: 0000000000000010 CR3: 0000000006144000 CR4: 00000000000006e0
Call Trace:
<TASK>
proc_reg_read+0x56/0x70
vfs_read+0x9a/0x1b0
ksys_read+0xa1/0xe0
? fpregs_assert_state_consistent+0x1e/0x40
do_syscall_64+0x3a/0x80
entry_SYSCALL_64_after_hwframe+0x46/0xb0
RIP: 0033:0x4d4b4e
RSP: 002b:00007ffebb668d58 EFLAGS: 00000246 ORIG_RAX: 0000000000000000
RAX: ffffffffffffffda RBX: 000000000188a8a0 RCX: 00000000004d4b4e
RDX: 0000000000000400 RSI: 000000000188bb60 RDI: 0000000000000003
RBP: 0000000000000003 R08: 000000000000006e R09: 0000000000000000
R10: 0000000000000041 R11: 0000000000000246 R12: 000000000188bb60
R13: 0000000000000400 R14: 0000000000000000 R15: 000000000188bb60
</TASK>
Modules linked in:
CR2: 0000000000000010
Killed
---[ end trace 0000000000000000 ]---

Check prof_len in profile_init() to prevent it be zero.

Link: https://lkml.kernel.org/r/20220531012854.229439-1-chenzhongjin@huawei.com
Fixes: 1da177e4c3f4 ("Linux-2.6.12-rc2")
Signed-off-by: Chen Zhongjin <chenzhongjin@huawei.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>

show more ...


# 2873cd31 08-Jan-2022 Eric W. Biederman <ebiederm@xmission.com>

exit: Remove profile_handoff_task

All profile_handoff_task does is notify the task_free_notifier chain.
The helpers task_handoff_register and task_handoff_unregister are used
to add and delete entri

exit: Remove profile_handoff_task

All profile_handoff_task does is notify the task_free_notifier chain.
The helpers task_handoff_register and task_handoff_unregister are used
to add and delete entries from that chain and are never called.

So remove the dead code and make it much easier to read and reason
about __put_task_struct.

Suggested-by: Al Viro <viro@zeniv.linux.org.uk>
Link: https://lkml.kernel.org/r/87fspyw6m0.fsf@email.froward.int.ebiederm.org
Signed-off-by: "Eric W. Biederman" <ebiederm@xmission.com>

show more ...


# 2d4bcf88 08-Jan-2022 Eric W. Biederman <ebiederm@xmission.com>

exit: Remove profile_task_exit & profile_munmap

When I say remove I mean remove. All profile_task_exit and
profile_munmap do is call a blocking notifier chain. The helpers
profile_task_register an

exit: Remove profile_task_exit & profile_munmap

When I say remove I mean remove. All profile_task_exit and
profile_munmap do is call a blocking notifier chain. The helpers
profile_task_register and profile_task_unregister are not called
anywhere in the tree. Which means this is all dead code.

So remove the dead code and make it easier to read do_exit.

Reviewed-by: Christoph Hellwig <hch@lst.de>
Link: https://lkml.kernel.org/r/20220103213312.9144-1-ebiederm@xmission.com
Signed-off-by: "Eric W. Biederman" <ebiederm@xmission.com>

show more ...


# 2d186afd 08-Sep-2021 Pavel Skripkin <paskripkin@gmail.com>

profiling: fix shift-out-of-bounds bugs

Syzbot reported shift-out-of-bounds bug in profile_init().
The problem was in incorrect prof_shift. Since prof_shift value comes from
userspace we need to cla

profiling: fix shift-out-of-bounds bugs

Syzbot reported shift-out-of-bounds bug in profile_init().
The problem was in incorrect prof_shift. Since prof_shift value comes from
userspace we need to clamp this value into [0, BITS_PER_LONG -1]
boundaries.

Second possible shiht-out-of-bounds was found by Tetsuo:
sample_step local variable in read_profile() had "unsigned int" type,
but prof_shift allows to make a BITS_PER_LONG shift. So, to prevent
possible shiht-out-of-bounds sample_step type was changed to
"unsigned long".

Also, "unsigned short int" will be sufficient for storing
[0, BITS_PER_LONG] value, that's why there is no need for
"unsigned long" prof_shift.

Link: https://lkml.kernel.org/r/20210813140022.5011-1-paskripkin@gmail.com
Fixes: 1da177e4c3f4 ("Linux-2.6.12-rc2")
Reported-and-tested-by: syzbot+e68c89a9510c159d9684@syzkaller.appspotmail.com
Suggested-by: Tetsuo Handa <penguin-kernel@i-love.sakura.ne.jp>
Signed-off-by: Pavel Skripkin <paskripkin@gmail.com>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: Steven Rostedt <rostedt@goodmis.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>

show more ...


# c5e3a411 01-Apr-2021 Tetsuo Handa <penguin-kernel@I-love.SAKURA.ne.jp>

kernel: Initialize cpumask before parsing

KMSAN complains that new_value at cpumask_parse_user() from
write_irq_affinity() from irq_affinity_proc_write() is uninitialized.

[ 148.133411][ T5509]

kernel: Initialize cpumask before parsing

KMSAN complains that new_value at cpumask_parse_user() from
write_irq_affinity() from irq_affinity_proc_write() is uninitialized.

[ 148.133411][ T5509] =====================================================
[ 148.135383][ T5509] BUG: KMSAN: uninit-value in find_next_bit+0x325/0x340
[ 148.137819][ T5509]
[ 148.138448][ T5509] Local variable ----new_value.i@irq_affinity_proc_write created at:
[ 148.140768][ T5509] irq_affinity_proc_write+0xc3/0x3d0
[ 148.142298][ T5509] irq_affinity_proc_write+0xc3/0x3d0
[ 148.143823][ T5509] =====================================================

Since bitmap_parse() from cpumask_parse_user() calls find_next_bit(),
any alloc_cpumask_var() + cpumask_parse_user() sequence has possibility
that find_next_bit() accesses uninitialized cpu mask variable. Fix this
problem by replacing alloc_cpumask_var() with zalloc_cpumask_var().

Signed-off-by: Tetsuo Handa <penguin-kernel@I-love.SAKURA.ne.jp>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Acked-by: Steven Rostedt (VMware) <rostedt@goodmis.org>
Link: https://lore.kernel.org/r/20210401055823.3929-1-penguin-kernel@I-love.SAKURA.ne.jp

show more ...


# 97a32539 04-Feb-2020 Alexey Dobriyan <adobriyan@gmail.com>

proc: convert everything to "struct proc_ops"

The most notable change is DEFINE_SHOW_ATTRIBUTE macro split in
seq_file.h.

Conversion rule is:

llseek => proc_lseek
unlocked_ioctl => proc_ioctl

proc: convert everything to "struct proc_ops"

The most notable change is DEFINE_SHOW_ATTRIBUTE macro split in
seq_file.h.

Conversion rule is:

llseek => proc_lseek
unlocked_ioctl => proc_ioctl

xxx => proc_xxx

delete ".owner = THIS_MODULE" line

[akpm@linux-foundation.org: fix drivers/isdn/capi/kcapi_proc.c]
[sfr@canb.auug.org.au: fix kernel/sched/psi.c]
Link: http://lkml.kernel.org/r/20200122180545.36222f50@canb.auug.org.au
Link: http://lkml.kernel.org/r/20191225172546.GB13378@avx2
Signed-off-by: Alexey Dobriyan <adobriyan@gmail.com>
Signed-off-by: Stephen Rothwell <sfr@canb.auug.org.au>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>

show more ...


# ef70eff9 05-Dec-2019 Nathan Chancellor <natechancellor@gmail.com>

kernel/profile.c: use cpumask_available to check for NULL cpumask

When building with clang + -Wtautological-pointer-compare, these
instances pop up:

kernel/profile.c:339:6: warning: comparison of

kernel/profile.c: use cpumask_available to check for NULL cpumask

When building with clang + -Wtautological-pointer-compare, these
instances pop up:

kernel/profile.c:339:6: warning: comparison of array 'prof_cpu_mask' not equal to a null pointer is always true [-Wtautological-pointer-compare]
if (prof_cpu_mask != NULL)
^~~~~~~~~~~~~ ~~~~
kernel/profile.c:376:6: warning: comparison of array 'prof_cpu_mask' not equal to a null pointer is always true [-Wtautological-pointer-compare]
if (prof_cpu_mask != NULL)
^~~~~~~~~~~~~ ~~~~
kernel/profile.c:406:26: warning: comparison of array 'prof_cpu_mask' not equal to a null pointer is always true [-Wtautological-pointer-compare]
if (!user_mode(regs) && prof_cpu_mask != NULL &&
^~~~~~~~~~~~~ ~~~~
3 warnings generated.

This can be addressed with the cpumask_available helper, introduced in
commit f7e30f01a9e2 ("cpumask: Add helper cpumask_available()") to fix
warnings like this while keeping the code the same.

Link: https://github.com/ClangBuiltLinux/linux/issues/747
Link: http://lkml.kernel.org/r/20191022191957.9554-1-natechancellor@gmail.com
Signed-off-by: Nathan Chancellor <natechancellor@gmail.com>
Reviewed-by: Andrew Morton <akpm@linux-foundation.org>
Cc: Thomas Gleixner <tglx@linutronix.de>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>

show more ...


# 457c8996 19-May-2019 Thomas Gleixner <tglx@linutronix.de>

treewide: Add SPDX license identifier for missed files

Add SPDX license identifiers to all files which:

- Have no license information of any form

- Have EXPORT_.*_SYMBOL_GPL inside which was use

treewide: Add SPDX license identifier for missed files

Add SPDX license identifiers to all files which:

- Have no license information of any form

- Have EXPORT_.*_SYMBOL_GPL inside which was used in the
initial scan/conversion to ignore the file

These files fall under the project license, GPL v2 only. The resulting SPDX
license identifier is:

GPL-2.0-only

Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>

show more ...


# 57c8a661 30-Oct-2018 Mike Rapoport <rppt@linux.vnet.ibm.com>

mm: remove include/linux/bootmem.h

Move remaining definitions and declarations from include/linux/bootmem.h
into include/linux/memblock.h and remove the redundant header.

The includes were replaced

mm: remove include/linux/bootmem.h

Move remaining definitions and declarations from include/linux/bootmem.h
into include/linux/memblock.h and remove the redundant header.

The includes were replaced with the semantic patch below and then
semi-automated removal of duplicated '#include <linux/memblock.h>

@@
@@
- #include <linux/bootmem.h>
+ #include <linux/memblock.h>

[sfr@canb.auug.org.au: dma-direct: fix up for the removal of linux/bootmem.h]
Link: http://lkml.kernel.org/r/20181002185342.133d1680@canb.auug.org.au
[sfr@canb.auug.org.au: powerpc: fix up for removal of linux/bootmem.h]
Link: http://lkml.kernel.org/r/20181005161406.73ef8727@canb.auug.org.au
[sfr@canb.auug.org.au: x86/kaslr, ACPI/NUMA: fix for linux/bootmem.h removal]
Link: http://lkml.kernel.org/r/20181008190341.5e396491@canb.auug.org.au
Link: http://lkml.kernel.org/r/1536927045-23536-30-git-send-email-rppt@linux.vnet.ibm.com
Signed-off-by: Mike Rapoport <rppt@linux.vnet.ibm.com>
Signed-off-by: Stephen Rothwell <sfr@canb.auug.org.au>
Acked-by: Michal Hocko <mhocko@suse.com>
Cc: Catalin Marinas <catalin.marinas@arm.com>
Cc: Chris Zankel <chris@zankel.net>
Cc: "David S. Miller" <davem@davemloft.net>
Cc: Geert Uytterhoeven <geert@linux-m68k.org>
Cc: Greentime Hu <green.hu@gmail.com>
Cc: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Cc: Guan Xuetao <gxt@pku.edu.cn>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: "James E.J. Bottomley" <jejb@parisc-linux.org>
Cc: Jonas Bonn <jonas@southpole.se>
Cc: Jonathan Corbet <corbet@lwn.net>
Cc: Ley Foon Tan <lftan@altera.com>
Cc: Mark Salter <msalter@redhat.com>
Cc: Martin Schwidefsky <schwidefsky@de.ibm.com>
Cc: Matt Turner <mattst88@gmail.com>
Cc: Michael Ellerman <mpe@ellerman.id.au>
Cc: Michal Simek <monstr@monstr.eu>
Cc: Palmer Dabbelt <palmer@sifive.com>
Cc: Paul Burton <paul.burton@mips.com>
Cc: Richard Kuo <rkuo@codeaurora.org>
Cc: Richard Weinberger <richard@nod.at>
Cc: Rich Felker <dalias@libc.org>
Cc: Russell King <linux@armlinux.org.uk>
Cc: Serge Semin <fancer.lancer@gmail.com>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: Tony Luck <tony.luck@intel.com>
Cc: Vineet Gupta <vgupta@synopsys.com>
Cc: Yoshinori Sato <ysato@users.sourceforge.jp>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>

show more ...


# 3905f9ad 05-Feb-2017 Ingo Molnar <mingo@kernel.org>

sched/headers: Prepare to move sched_info_on() and force_schedstat_enabled() from <linux/sched.h> to <linux/sched/stat.h>

But first update usage sites with the new header dependency.

Acked-by: Linu

sched/headers: Prepare to move sched_info_on() and force_schedstat_enabled() from <linux/sched.h> to <linux/sched/stat.h>

But first update usage sites with the new header dependency.

Acked-by: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Mike Galbraith <efault@gmx.de>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: linux-kernel@vger.kernel.org
Signed-off-by: Ingo Molnar <mingo@kernel.org>

show more ...


# 7c0f6ba6 24-Dec-2016 Linus Torvalds <torvalds@linux-foundation.org>

Replace <asm/uaccess.h> with <linux/uaccess.h> globally

This was entirely automated, using the script by Al:

PATT='^[[:blank:]]*#[[:blank:]]*include[[:blank:]]*<asm/uaccess.h>'
sed -i -e "s!$PA

Replace <asm/uaccess.h> with <linux/uaccess.h> globally

This was entirely automated, using the script by Al:

PATT='^[[:blank:]]*#[[:blank:]]*include[[:blank:]]*<asm/uaccess.h>'
sed -i -e "s!$PATT!#include <linux/uaccess.h>!" \
$(git grep -l "$PATT"|grep -v ^include/linux/uaccess.h)

to do the replacement at the end of the merge window.

Requested-by: Al Viro <viro@zeniv.linux.org.uk>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>

show more ...


# e722d8da 13-Jul-2016 Sebastian Andrzej Siewior <bigeasy@linutronix.de>

profile: Convert to hotplug state machine

Install the callbacks via the state machine and let the core invoke
the callbacks on the already online CPUs. A lot of code is removed because
the for-loop

profile: Convert to hotplug state machine

Install the callbacks via the state machine and let the core invoke
the callbacks on the already online CPUs. A lot of code is removed because
the for-loop is used and create_hash_tables() is removed since its purpose
is covered by the startup / teardown hooks.

Signed-off-by: Sebastian Andrzej Siewior <bigeasy@linutronix.de>
Signed-off-by: Anna-Maria Gleixner <anna-maria@linutronix.de>
Cc: Andrew Morton <akpm@linux-foundation.org>
Cc: Arnd Bergmann <arnd@arndb.de>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Mel Gorman <mgorman@techsingularity.net>
Cc: Michal Hocko <mhocko@suse.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: Vlastimil Babka <vbabka@suse.cz>
Cc: rt@linutronix.de
Link: http://lkml.kernel.org/r/20160713153337.649867675@linutronix.de
Signed-off-by: Ingo Molnar <mingo@kernel.org>

show more ...


# ade356b9 22-Mar-2016 Arnd Bergmann <arnd@arndb.de>

profile: hide unused functions when !CONFIG_PROC_FS

A couple of functions and variables in the profile implementation are
used only on SMP systems by the procfs code, but are unused if either
procfs

profile: hide unused functions when !CONFIG_PROC_FS

A couple of functions and variables in the profile implementation are
used only on SMP systems by the procfs code, but are unused if either
procfs is disabled or in uniprocessor kernels. gcc prints a harmless
warning about the unused symbols:

kernel/profile.c:243:13: error: 'profile_flip_buffers' defined but not used [-Werror=unused-function]
static void profile_flip_buffers(void)
^
kernel/profile.c:266:13: error: 'profile_discard_flip_buffers' defined but not used [-Werror=unused-function]
static void profile_discard_flip_buffers(void)
^
kernel/profile.c:330:12: error: 'profile_cpu_callback' defined but not used [-Werror=unused-function]
static int profile_cpu_callback(struct notifier_block *info,
^

This adds further #ifdef to the file, to annotate exactly in which cases
they are used. I have done several thousand ARM randconfig kernels with
this patch applied and no longer get any warnings in this file.

Signed-off-by: Arnd Bergmann <arnd@arndb.de>
Cc: Vlastimil Babka <vbabka@suse.cz>
Cc: Robin Holt <robinmholt@gmail.com>
Cc: Johannes Weiner <hannes@cmpxchg.org>
Cc: Christoph Lameter <cl@linux.com>
Cc: Tejun Heo <tj@kernel.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>

show more ...


# cb251765 05-Feb-2016 Mel Gorman <mgorman@techsingularity.net>

sched/debug: Make schedstats a runtime tunable that is disabled by default

schedstats is very useful during debugging and performance tuning but it
incurs overhead to calculate the stats. As such, e

sched/debug: Make schedstats a runtime tunable that is disabled by default

schedstats is very useful during debugging and performance tuning but it
incurs overhead to calculate the stats. As such, even though it can be
disabled at build time, it is often enabled as the information is useful.

This patch adds a kernel command-line and sysctl tunable to enable or
disable schedstats on demand (when it's built in). It is disabled
by default as someone who knows they need it can also learn to enable
it when necessary.

The benefits are dependent on how scheduler-intensive the workload is.
If it is then the patch reduces the number of cycles spent calculating
the stats with a small benefit from reducing the cache footprint of the
scheduler.

These measurements were taken from a 48-core 2-socket
machine with Xeon(R) E5-2670 v3 cpus although they were also tested on a
single socket machine 8-core machine with Intel i7-3770 processors.

netperf-tcp
4.5.0-rc1 4.5.0-rc1
vanilla nostats-v3r1
Hmean 64 560.45 ( 0.00%) 575.98 ( 2.77%)
Hmean 128 766.66 ( 0.00%) 795.79 ( 3.80%)
Hmean 256 950.51 ( 0.00%) 981.50 ( 3.26%)
Hmean 1024 1433.25 ( 0.00%) 1466.51 ( 2.32%)
Hmean 2048 2810.54 ( 0.00%) 2879.75 ( 2.46%)
Hmean 3312 4618.18 ( 0.00%) 4682.09 ( 1.38%)
Hmean 4096 5306.42 ( 0.00%) 5346.39 ( 0.75%)
Hmean 8192 10581.44 ( 0.00%) 10698.15 ( 1.10%)
Hmean 16384 18857.70 ( 0.00%) 18937.61 ( 0.42%)

Small gains here, UDP_STREAM showed nothing intresting and neither did
the TCP_RR tests. The gains on the 8-core machine were very similar.

tbench4
4.5.0-rc1 4.5.0-rc1
vanilla nostats-v3r1
Hmean mb/sec-1 500.85 ( 0.00%) 522.43 ( 4.31%)
Hmean mb/sec-2 984.66 ( 0.00%) 1018.19 ( 3.41%)
Hmean mb/sec-4 1827.91 ( 0.00%) 1847.78 ( 1.09%)
Hmean mb/sec-8 3561.36 ( 0.00%) 3611.28 ( 1.40%)
Hmean mb/sec-16 5824.52 ( 0.00%) 5929.03 ( 1.79%)
Hmean mb/sec-32 10943.10 ( 0.00%) 10802.83 ( -1.28%)
Hmean mb/sec-64 15950.81 ( 0.00%) 16211.31 ( 1.63%)
Hmean mb/sec-128 15302.17 ( 0.00%) 15445.11 ( 0.93%)
Hmean mb/sec-256 14866.18 ( 0.00%) 15088.73 ( 1.50%)
Hmean mb/sec-512 15223.31 ( 0.00%) 15373.69 ( 0.99%)
Hmean mb/sec-1024 14574.25 ( 0.00%) 14598.02 ( 0.16%)
Hmean mb/sec-2048 13569.02 ( 0.00%) 13733.86 ( 1.21%)
Hmean mb/sec-3072 12865.98 ( 0.00%) 13209.23 ( 2.67%)

Small gains of 2-4% at low thread counts and otherwise flat. The
gains on the 8-core machine were slightly different

tbench4 on 8-core i7-3770 single socket machine
Hmean mb/sec-1 442.59 ( 0.00%) 448.73 ( 1.39%)
Hmean mb/sec-2 796.68 ( 0.00%) 794.39 ( -0.29%)
Hmean mb/sec-4 1322.52 ( 0.00%) 1343.66 ( 1.60%)
Hmean mb/sec-8 2611.65 ( 0.00%) 2694.86 ( 3.19%)
Hmean mb/sec-16 2537.07 ( 0.00%) 2609.34 ( 2.85%)
Hmean mb/sec-32 2506.02 ( 0.00%) 2578.18 ( 2.88%)
Hmean mb/sec-64 2511.06 ( 0.00%) 2569.16 ( 2.31%)
Hmean mb/sec-128 2313.38 ( 0.00%) 2395.50 ( 3.55%)
Hmean mb/sec-256 2110.04 ( 0.00%) 2177.45 ( 3.19%)
Hmean mb/sec-512 2072.51 ( 0.00%) 2053.97 ( -0.89%)

In constract, this shows a relatively steady 2-3% gain at higher thread
counts. Due to the nature of the patch and the type of workload, it's
not a surprise that the result will depend on the CPU used.

hackbench-pipes
4.5.0-rc1 4.5.0-rc1
vanilla nostats-v3r1
Amean 1 0.0637 ( 0.00%) 0.0660 ( -3.59%)
Amean 4 0.1229 ( 0.00%) 0.1181 ( 3.84%)
Amean 7 0.1921 ( 0.00%) 0.1911 ( 0.52%)
Amean 12 0.3117 ( 0.00%) 0.2923 ( 6.23%)
Amean 21 0.4050 ( 0.00%) 0.3899 ( 3.74%)
Amean 30 0.4586 ( 0.00%) 0.4433 ( 3.33%)
Amean 48 0.5910 ( 0.00%) 0.5694 ( 3.65%)
Amean 79 0.8663 ( 0.00%) 0.8626 ( 0.43%)
Amean 110 1.1543 ( 0.00%) 1.1517 ( 0.22%)
Amean 141 1.4457 ( 0.00%) 1.4290 ( 1.16%)
Amean 172 1.7090 ( 0.00%) 1.6924 ( 0.97%)
Amean 192 1.9126 ( 0.00%) 1.9089 ( 0.19%)

Some small gains and losses and while the variance data is not included,
it's close to the noise. The UMA machine did not show anything particularly
different

pipetest
4.5.0-rc1 4.5.0-rc1
vanilla nostats-v2r2
Min Time 4.13 ( 0.00%) 3.99 ( 3.39%)
1st-qrtle Time 4.38 ( 0.00%) 4.27 ( 2.51%)
2nd-qrtle Time 4.46 ( 0.00%) 4.39 ( 1.57%)
3rd-qrtle Time 4.56 ( 0.00%) 4.51 ( 1.10%)
Max-90% Time 4.67 ( 0.00%) 4.60 ( 1.50%)
Max-93% Time 4.71 ( 0.00%) 4.65 ( 1.27%)
Max-95% Time 4.74 ( 0.00%) 4.71 ( 0.63%)
Max-99% Time 4.88 ( 0.00%) 4.79 ( 1.84%)
Max Time 4.93 ( 0.00%) 4.83 ( 2.03%)
Mean Time 4.48 ( 0.00%) 4.39 ( 1.91%)
Best99%Mean Time 4.47 ( 0.00%) 4.39 ( 1.91%)
Best95%Mean Time 4.46 ( 0.00%) 4.38 ( 1.93%)
Best90%Mean Time 4.45 ( 0.00%) 4.36 ( 1.98%)
Best50%Mean Time 4.36 ( 0.00%) 4.25 ( 2.49%)
Best10%Mean Time 4.23 ( 0.00%) 4.10 ( 3.13%)
Best5%Mean Time 4.19 ( 0.00%) 4.06 ( 3.20%)
Best1%Mean Time 4.13 ( 0.00%) 4.00 ( 3.39%)

Small improvement and similar gains were seen on the UMA machine.

The gain is small but it stands to reason that doing less work in the
scheduler is a good thing. The downside is that the lack of schedstats and
tracepoints may be surprising to experts doing performance analysis until
they find the existence of the schedstats= parameter or schedstats sysctl.
It will be automatically activated for latencytop and sleep profiling to
alleviate the problem. For tracepoints, there is a simple warning as it's
not safe to activate schedstats in the context when it's known the tracepoint
may be wanted but is unavailable.

Signed-off-by: Mel Gorman <mgorman@techsingularity.net>
Reviewed-by: Matt Fleming <matt@codeblueprint.co.uk>
Reviewed-by: Srikar Dronamraju <srikar@linux.vnet.ibm.com>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Mike Galbraith <mgalbraith@suse.de>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Thomas Gleixner <tglx@linutronix.de>
Link: http://lkml.kernel.org/r/1454663316-22048-1-git-send-email-mgorman@techsingularity.net
Signed-off-by: Ingo Molnar <mingo@kernel.org>

show more ...


# 96db800f 08-Sep-2015 Vlastimil Babka <vbabka@suse.cz>

mm: rename alloc_pages_exact_node() to __alloc_pages_node()

alloc_pages_exact_node() was introduced in commit 6484eb3e2a81 ("page
allocator: do not check NUMA node ID when the caller knows the node

mm: rename alloc_pages_exact_node() to __alloc_pages_node()

alloc_pages_exact_node() was introduced in commit 6484eb3e2a81 ("page
allocator: do not check NUMA node ID when the caller knows the node is
valid") as an optimized variant of alloc_pages_node(), that doesn't
fallback to current node for nid == NUMA_NO_NODE. Unfortunately the
name of the function can easily suggest that the allocation is
restricted to the given node and fails otherwise. In truth, the node is
only preferred, unless __GFP_THISNODE is passed among the gfp flags.

The misleading name has lead to mistakes in the past, see for example
commits 5265047ac301 ("mm, thp: really limit transparent hugepage
allocation to local node") and b360edb43f8e ("mm, mempolicy:
migrate_to_node should only migrate to node").

Another issue with the name is that there's a family of
alloc_pages_exact*() functions where 'exact' means exact size (instead
of page order), which leads to more confusion.

To prevent further mistakes, this patch effectively renames
alloc_pages_exact_node() to __alloc_pages_node() to better convey that
it's an optimized variant of alloc_pages_node() not intended for general
usage. Both functions get described in comments.

It has been also considered to really provide a convenience function for
allocations restricted to a node, but the major opinion seems to be that
__GFP_THISNODE already provides that functionality and we shouldn't
duplicate the API needlessly. The number of users would be small
anyway.

Existing callers of alloc_pages_exact_node() are simply converted to
call __alloc_pages_node(), with the exception of sba_alloc_coherent()
which open-codes the check for NUMA_NO_NODE, so it is converted to use
alloc_pages_node() instead. This means it no longer performs some
VM_BUG_ON checks, and since the current check for nid in
alloc_pages_node() uses a 'nid < 0' comparison (which includes
NUMA_NO_NODE), it may hide wrong values which would be previously
exposed.

Both differences will be rectified by the next patch.

To sum up, this patch makes no functional changes, except temporarily
hiding potentially buggy callers. Restricting the checks in
alloc_pages_node() is left for the next patch which can in turn expose
more existing buggy callers.

Signed-off-by: Vlastimil Babka <vbabka@suse.cz>
Acked-by: Johannes Weiner <hannes@cmpxchg.org>
Acked-by: Robin Holt <robinmholt@gmail.com>
Acked-by: Michal Hocko <mhocko@suse.com>
Acked-by: Christoph Lameter <cl@linux.com>
Acked-by: Michael Ellerman <mpe@ellerman.id.au>
Cc: Mel Gorman <mgorman@suse.de>
Cc: David Rientjes <rientjes@google.com>
Cc: Greg Thelen <gthelen@google.com>
Cc: Aneesh Kumar K.V <aneesh.kumar@linux.vnet.ibm.com>
Cc: Pekka Enberg <penberg@kernel.org>
Cc: Joonsoo Kim <iamjoonsoo.kim@lge.com>
Cc: Naoya Horiguchi <n-horiguchi@ah.jp.nec.com>
Cc: Tony Luck <tony.luck@intel.com>
Cc: Fenghua Yu <fenghua.yu@intel.com>
Cc: Arnd Bergmann <arnd@arndb.de>
Cc: Benjamin Herrenschmidt <benh@kernel.crashing.org>
Cc: Paul Mackerras <paulus@samba.org>
Cc: Gleb Natapov <gleb@kernel.org>
Cc: Paolo Bonzini <pbonzini@redhat.com>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: "H. Peter Anvin" <hpa@zytor.com>
Cc: Cliff Whickman <cpw@sgi.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>

show more ...


# ccbd59c1 13-Feb-2015 Tejun Heo <tj@kernel.org>

profile: use %*pb[l] to print bitmaps including cpumasks and nodemasks

printk and friends can now format bitmaps using '%*pb[l]'. cpumask
and nodemask also provide cpumask_pr_args() and nodemask_pr

profile: use %*pb[l] to print bitmaps including cpumasks and nodemasks

printk and friends can now format bitmaps using '%*pb[l]'. cpumask
and nodemask also provide cpumask_pr_args() and nodemask_pr_args()
respectively which can be used to generate the two printf arguments
necessary to format the specified cpu/nodemask.

Signed-off-by: Tejun Heo <tj@kernel.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>

show more ...


# f3da64d1 06-Jun-2014 Fabian Frederick <fabf@skynet.be>

kernel/profile.c: use static const char instead of static char

schedstr, sleepstr and kvmstr are only used in strcmp & strlen

Signed-off-by: Fabian Frederick <fabf@skynet.be>
Cc: Paul Gortmaker <pa

kernel/profile.c: use static const char instead of static char

schedstr, sleepstr and kvmstr are only used in strcmp & strlen

Signed-off-by: Fabian Frederick <fabf@skynet.be>
Cc: Paul Gortmaker <paul.gortmaker@windriver.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>

show more ...


# aba871f1 06-Jun-2014 Fabian Frederick <fabf@skynet.be>

kernel/profile.c: convert printk to pr_foo()

Signed-off-by: Fabian Frederick <fabf@skynet.be>
Cc: Paul Gortmaker <paul.gortmaker@windriver.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.or

kernel/profile.c: convert printk to pr_foo()

Signed-off-by: Fabian Frederick <fabf@skynet.be>
Cc: Paul Gortmaker <paul.gortmaker@windriver.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>

show more ...


# c96d6660 03-Apr-2014 Paul Gortmaker <paul.gortmaker@windriver.com>

kernel: audit/fix non-modular users of module_init in core code

Code that is obj-y (always built-in) or dependent on a bool Kconfig
(built-in or absent) can never be modular. So using module_init a

kernel: audit/fix non-modular users of module_init in core code

Code that is obj-y (always built-in) or dependent on a bool Kconfig
(built-in or absent) can never be modular. So using module_init as an
alias for __initcall can be somewhat misleading.

Fix these up now, so that we can relocate module_init from init.h into
module.h in the future. If we don't do this, we'd have to add module.h
to obviously non-modular code, and that would be a worse thing.

The audit targets the following module_init users for change:
kernel/user.c obj-y
kernel/kexec.c bool KEXEC (one instance per arch)
kernel/profile.c bool PROFILING
kernel/hung_task.c bool DETECT_HUNG_TASK
kernel/sched/stats.c bool SCHEDSTATS
kernel/user_namespace.c bool USER_NS

Note that direct use of __initcall is discouraged, vs. one of the
priority categorized subgroups. As __initcall gets mapped onto
device_initcall, our use of subsys_initcall (which makes sense for these
files) will thus change this registration from level 6-device to level
4-subsys (i.e. slightly earlier). However no observable impact of that
difference has been observed during testing.

Also, two instances of missing ";" at EOL are fixed in kexec.

Signed-off-by: Paul Gortmaker <paul.gortmaker@windriver.com>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Eric Biederman <ebiederm@xmission.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>

show more ...


# c270a817 10-Mar-2014 Srivatsa S. Bhat <srivatsa.bhat@linux.vnet.ibm.com>

profile: Fix CPU hotplug callback registration

Subsystems that want to register CPU hotplug callbacks, as well as perform
initialization for the CPUs that are already online, often do it as shown
be

profile: Fix CPU hotplug callback registration

Subsystems that want to register CPU hotplug callbacks, as well as perform
initialization for the CPUs that are already online, often do it as shown
below:

get_online_cpus();

for_each_online_cpu(cpu)
init_cpu(cpu);

register_cpu_notifier(&foobar_cpu_notifier);

put_online_cpus();

This is wrong, since it is prone to ABBA deadlocks involving the
cpu_add_remove_lock and the cpu_hotplug.lock (when running concurrently
with CPU hotplug operations).

Instead, the correct and race-free way of performing the callback
registration is:

cpu_notifier_register_begin();

for_each_online_cpu(cpu)
init_cpu(cpu);

/* Note the use of the double underscored version of the API */
__register_cpu_notifier(&foobar_cpu_notifier);

cpu_notifier_register_done();

Fix the profile code by using this latter form of callback registration.

Cc: Al Viro <viro@zeniv.linux.org.uk>
Cc: Mauro Carvalho Chehab <mchehab@redhat.com>
Cc: Ingo Molnar <mingo@kernel.org>
Signed-off-by: Srivatsa S. Bhat <srivatsa.bhat@linux.vnet.ibm.com>
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>

show more ...


# e97ca8e5 10-Mar-2014 Johannes Weiner <hannes@cmpxchg.org>

mm: fix GFP_THISNODE callers and clarify

GFP_THISNODE is for callers that implement their own clever fallback to
remote nodes. It restricts the allocation to the specified node and
does not invoke

mm: fix GFP_THISNODE callers and clarify

GFP_THISNODE is for callers that implement their own clever fallback to
remote nodes. It restricts the allocation to the specified node and
does not invoke reclaim, assuming that the caller will take care of it
when the fallback fails, e.g. through a subsequent allocation request
without GFP_THISNODE set.

However, many current GFP_THISNODE users only want the node exclusive
aspect of the flag, without actually implementing their own fallback or
triggering reclaim if necessary. This results in things like page
migration failing prematurely even when there is easily reclaimable
memory available, unless kswapd happens to be running already or a
concurrent allocation attempt triggers the necessary reclaim.

Convert all callsites that don't implement their own fallback strategy
to __GFP_THISNODE. This restricts the allocation a single node too, but
at the same time allows the allocator to enter the slowpath, wake
kswapd, and invoke direct reclaim if necessary, to make the allocation
happen when memory is full.

Signed-off-by: Johannes Weiner <hannes@cmpxchg.org>
Acked-by: Rik van Riel <riel@redhat.com>
Cc: Jan Stancek <jstancek@redhat.com>
Cc: Mel Gorman <mgorman@suse.de>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>

show more ...


# 0db0628d 19-Jun-2013 Paul Gortmaker <paul.gortmaker@windriver.com>

kernel: delete __cpuinit usage from all core kernel files

The __cpuinit type of throwaway sections might have made sense
some time ago when RAM was more constrained, but now the savings
do not offse

kernel: delete __cpuinit usage from all core kernel files

The __cpuinit type of throwaway sections might have made sense
some time ago when RAM was more constrained, but now the savings
do not offset the cost and complications. For example, the fix in
commit 5e427ec2d0 ("x86: Fix bit corruption at CPU resume time")
is a good example of the nasty type of bugs that can be created
with improper use of the various __init prefixes.

After a discussion on LKML[1] it was decided that cpuinit should go
the way of devinit and be phased out. Once all the users are gone,
we can then finally remove the macros themselves from linux/init.h.

This removes all the uses of the __cpuinit macros from C files in
the core kernel directories (kernel, init, lib, mm, and include)
that don't really have a specific maintainer.

[1] https://lkml.org/lkml/2013/5/20/589

Signed-off-by: Paul Gortmaker <paul.gortmaker@windriver.com>

show more ...


1234