#
455f9075 |
| 19-Apr-2024 |
Daniel J Blueman <daniel@quora.org> |
x86/tsc: Trust initial offset in architectural TSC-adjust MSRs
When the BIOS configures the architectural TSC-adjust MSRs on secondary sockets to correct a constant inter-chassis offset, after Linux
x86/tsc: Trust initial offset in architectural TSC-adjust MSRs
When the BIOS configures the architectural TSC-adjust MSRs on secondary sockets to correct a constant inter-chassis offset, after Linux brings the cores online, the TSC sync check later resets the core-local MSR to 0, triggering HPET fallback and leading to performance loss.
Fix this by unconditionally using the initial adjust values read from the MSRs. Trusting the initial offsets in this architectural mechanism is a better approach than special-casing workarounds for specific platforms.
Signed-off-by: Daniel J Blueman <daniel@quora.org> Signed-off-by: Thomas Gleixner <tglx@linutronix.de> Reviewed-by: Steffen Persvold <sp@numascale.com> Reviewed-by: James Cleverdon <james.cleverdon.external@eviden.com> Reviewed-by: Dimitri Sivanich <sivanich@hpe.com> Reviewed-by: Prarit Bhargava <prarit@redhat.com> Link: https://lore.kernel.org/r/20240419085146.175665-1-daniel@quora.org
show more ...
|
#
bd94d86f |
| 25-Oct-2023 |
Thomas Gleixner <tglx@linutronix.de> |
x86/tsc: Defer marking TSC unstable to a worker
Tetsuo reported the following lockdep splat when the TSC synchronization fails during CPU hotplug:
tsc: Marking TSC unstable due to check_tsc_sync
x86/tsc: Defer marking TSC unstable to a worker
Tetsuo reported the following lockdep splat when the TSC synchronization fails during CPU hotplug:
tsc: Marking TSC unstable due to check_tsc_sync_source failed WARNING: inconsistent lock state inconsistent {IN-HARDIRQ-W} -> {HARDIRQ-ON-W} usage. ffffffff8cfa1c78 (watchdog_lock){?.-.}-{2:2}, at: clocksource_watchdog+0x23/0x5a0 {IN-HARDIRQ-W} state was registered at: _raw_spin_lock_irqsave+0x3f/0x60 clocksource_mark_unstable+0x1b/0x90 mark_tsc_unstable+0x41/0x50 check_tsc_sync_source+0x14f/0x180 sysvec_call_function_single+0x69/0x90
Possible unsafe locking scenario: lock(watchdog_lock); <Interrupt> lock(watchdog_lock);
stack backtrace: _raw_spin_lock+0x30/0x40 clocksource_watchdog+0x23/0x5a0 run_timer_softirq+0x2a/0x50 sysvec_apic_timer_interrupt+0x6e/0x90
The reason is the recent conversion of the TSC synchronization function during CPU hotplug on the control CPU to a SMP function call. In case that the synchronization with the upcoming CPU fails, the TSC has to be marked unstable via clocksource_mark_unstable().
clocksource_mark_unstable() acquires 'watchdog_lock', but that lock is taken with interrupts enabled in the watchdog timer callback to minimize interrupt disabled time. That's obviously a possible deadlock scenario,
Before that change the synchronization function was invoked in thread context so this could not happen.
As it is not crucical whether the unstable marking happens slightly delayed, defer the call to a worker thread which avoids the lock context problem.
Fixes: 9d349d47f0e3 ("x86/smpboot: Make TSC synchronization function call based") Reported-by: Tetsuo Handa <penguin-kernel@i-love.sakura.ne.jp> Signed-off-by: Thomas Gleixner <tglx@linutronix.de> Tested-by: Tetsuo Handa <penguin-kernel@i-love.sakura.ne.jp> Cc: stable@vger.kernel.org Link: https://lore.kernel.org/r/87zg064ceg.ffs@tglx
show more ...
|
#
9d349d47 |
| 12-May-2023 |
Thomas Gleixner <tglx@linutronix.de> |
x86/smpboot: Make TSC synchronization function call based
Spin-waiting on the control CPU until the AP reaches the TSC synchronization is just a waste especially in the case that there is no synchro
x86/smpboot: Make TSC synchronization function call based
Spin-waiting on the control CPU until the AP reaches the TSC synchronization is just a waste especially in the case that there is no synchronization required.
As the synchronization has to run with interrupts disabled the control CPU part can just be done from a SMP function call. The upcoming AP issues that call async only in the case that synchronization is required.
Signed-off-by: Thomas Gleixner <tglx@linutronix.de> Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org> Tested-by: Michael Kelley <mikelley@microsoft.com> Tested-by: Oleksandr Natalenko <oleksandr@natalenko.name> Tested-by: Helge Deller <deller@gmx.de> # parisc Tested-by: Guilherme G. Piccoli <gpiccoli@igalia.com> # Steam Deck Link: https://lore.kernel.org/r/20230512205256.148255496@linutronix.de
show more ...
|
#
c7719e79 |
| 17-Nov-2021 |
Feng Tang <feng.tang@intel.com> |
x86/tsc: Add a timer to make sure TSC_adjust is always checked
The TSC_ADJUST register is checked every time a CPU enters idle state, but Thomas Gleixner mentioned there is still a caveat that a sys
x86/tsc: Add a timer to make sure TSC_adjust is always checked
The TSC_ADJUST register is checked every time a CPU enters idle state, but Thomas Gleixner mentioned there is still a caveat that a system won't enter idle [1], either because it's too busy or configured purposely to not enter idle.
Setup a periodic timer (every 10 minutes) to make sure the check is happening on a regular base.
[1] https://lore.kernel.org/lkml/875z286xtk.fsf@nanos.tec.linutronix.de/
Fixes: 6e3cd95234dc ("x86/hpet: Use another crystalball to evaluate HPET usability") Requested-by: Thomas Gleixner <tglx@linutronix.de> Signed-off-by: Feng Tang <feng.tang@intel.com> Signed-off-by: Thomas Gleixner <tglx@linutronix.de> Cc: "Paul E. McKenney" <paulmck@kernel.org> Cc: stable@vger.kernel.org Link: https://lore.kernel.org/r/20211117023751.24190-1-feng.tang@intel.com
show more ...
|
#
163b0991 |
| 21-Mar-2021 |
Ingo Molnar <mingo@kernel.org> |
x86: Fix various typos in comments, take #2
Fix another ~42 single-word typos in arch/x86/ code comments, missed a few in the first pass, in particular in .S files.
Signed-off-by: Ingo Molnar <ming
x86: Fix various typos in comments, take #2
Fix another ~42 single-word typos in arch/x86/ code comments, missed a few in the first pass, in particular in .S files.
Signed-off-by: Ingo Molnar <mingo@kernel.org> Cc: Borislav Petkov <bp@alien8.de> Cc: Thomas Gleixner <tglx@linutronix.de> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Paolo Bonzini <pbonzini@redhat.com> Cc: Bjorn Helgaas <bhelgaas@google.com> Cc: linux-kernel@vger.kernel.org
show more ...
|
#
4d1d0977 |
| 16-Feb-2020 |
Martin Molnar <martin.molnar.programming@gmail.com> |
x86: Fix a handful of typos
Fix a couple of typos in code comments.
[ bp: While at it: s/IRQ's/IRQs/. ]
Signed-off-by: Martin Molnar <martin.molnar.programming@gmail.com> Signed-off-by: Borislav
x86: Fix a handful of typos
Fix a couple of typos in code comments.
[ bp: While at it: s/IRQ's/IRQs/. ]
Signed-off-by: Martin Molnar <martin.molnar.programming@gmail.com> Signed-off-by: Borislav Petkov <bp@suse.de> Reviewed-by: Randy Dunlap <rdunlap@infradead.org> Link: https://lkml.kernel.org/r/0819a044-c360-44a4-f0b6-3f5bafe2d35c@gmail.com
show more ...
|
#
4144fddb |
| 18-Jan-2020 |
Mateusz Nosek <mateusznosek0@gmail.com> |
x86/tsc: Remove redundant assignment
Previously, the assignment to the local variable 'now' took place before the for loop. The loop is unconditional so it will be entered at least once. The variabl
x86/tsc: Remove redundant assignment
Previously, the assignment to the local variable 'now' took place before the for loop. The loop is unconditional so it will be entered at least once. The variable 'now' is reassigned in the loop and is not used before reassigning. Therefore, the assignment before the loop is unnecessary and can be removed.
No code changed:
# arch/x86/kernel/tsc_sync.o:
text data bss dec hex filename 3569 198 44 3811 ee3 tsc_sync.o.before 3569 198 44 3811 ee3 tsc_sync.o.after
md5: 36216de29b208edbcd34fed9fe7f7b69 tsc_sync.o.before.asm 36216de29b208edbcd34fed9fe7f7b69 tsc_sync.o.after.asm
[ bp: Massage commit message. ]
Signed-off-by: Mateusz Nosek <mateusznosek0@gmail.com> Signed-off-by: Borislav Petkov <bp@suse.de> Link: https://lkml.kernel.org/r/20200118171143.25178-1-mateusznosek0@gmail.com
show more ...
|
#
8d3bcc44 |
| 18-Oct-2019 |
Kefeng Wang <wangkefeng.wang@huawei.com> |
x86: Use pr_warn instead of pr_warning
As said in commit f2c2cbcc35d4 ("powerpc: Use pr_warn instead of pr_warning"), removing pr_warning so all logging messages use a consistent <prefix>_warn style
x86: Use pr_warn instead of pr_warning
As said in commit f2c2cbcc35d4 ("powerpc: Use pr_warn instead of pr_warning"), removing pr_warning so all logging messages use a consistent <prefix>_warn style. Let's do it.
Link: http://lkml.kernel.org/r/20191018031850.48498-7-wangkefeng.wang@huawei.com To: linux-kernel@vger.kernel.org Cc: Ingo Molnar <mingo@redhat.com> Cc: Borislav Petkov <bp@alien8.de> Cc: "H. Peter Anvin" <hpa@zytor.com> Cc: Robert Richter <rric@kernel.org> Cc: Darren Hart <dvhart@infradead.org> Cc: Andy Shevchenko <andy@infradead.org> Signed-off-by: Kefeng Wang <wangkefeng.wang@huawei.com> Reviewed-by: Sergey Senozhatsky <sergey.senozhatsky@gmail.com> Signed-off-by: Petr Mladek <pmladek@suse.com>
show more ...
|
#
b2441318 |
| 01-Nov-2017 |
Greg Kroah-Hartman <gregkh@linuxfoundation.org> |
License cleanup: add SPDX GPL-2.0 license identifier to files with no license
Many source files in the tree are missing licensing information, which makes it harder for compliance tools to determine
License cleanup: add SPDX GPL-2.0 license identifier to files with no license
Many source files in the tree are missing licensing information, which makes it harder for compliance tools to determine the correct license.
By default all files without license information are under the default license of the kernel, which is GPL version 2.
Update the files which contain no license information with the 'GPL-2.0' SPDX license identifier. The SPDX identifier is a legally binding shorthand, which can be used instead of the full boiler plate text.
This patch is based on work done by Thomas Gleixner and Kate Stewart and Philippe Ombredanne.
How this work was done:
Patches were generated and checked against linux-4.14-rc6 for a subset of the use cases: - file had no licensing information it it. - file was a */uapi/* one with no licensing information in it, - file was a */uapi/* one with existing licensing information,
Further patches will be generated in subsequent months to fix up cases where non-standard license headers were used, and references to license had to be inferred by heuristics based on keywords.
The analysis to determine which SPDX License Identifier to be applied to a file was done in a spreadsheet of side by side results from of the output of two independent scanners (ScanCode & Windriver) producing SPDX tag:value files created by Philippe Ombredanne. Philippe prepared the base worksheet, and did an initial spot review of a few 1000 files.
The 4.13 kernel was the starting point of the analysis with 60,537 files assessed. Kate Stewart did a file by file comparison of the scanner results in the spreadsheet to determine which SPDX license identifier(s) to be applied to the file. She confirmed any determination that was not immediately clear with lawyers working with the Linux Foundation.
Criteria used to select files for SPDX license identifier tagging was: - Files considered eligible had to be source code files. - Make and config files were included as candidates if they contained >5 lines of source - File already had some variant of a license header in it (even if <5 lines).
All documentation files were explicitly excluded.
The following heuristics were used to determine which SPDX license identifiers to apply.
- when both scanners couldn't find any license traces, file was considered to have no license information in it, and the top level COPYING file license applied.
For non */uapi/* files that summary was:
SPDX license identifier # files ---------------------------------------------------|------- GPL-2.0 11139
and resulted in the first patch in this series.
If that file was a */uapi/* path one, it was "GPL-2.0 WITH Linux-syscall-note" otherwise it was "GPL-2.0". Results of that was:
SPDX license identifier # files ---------------------------------------------------|------- GPL-2.0 WITH Linux-syscall-note 930
and resulted in the second patch in this series.
- if a file had some form of licensing information in it, and was one of the */uapi/* ones, it was denoted with the Linux-syscall-note if any GPL family license was found in the file or had no licensing in it (per prior point). Results summary:
SPDX license identifier # files ---------------------------------------------------|------ GPL-2.0 WITH Linux-syscall-note 270 GPL-2.0+ WITH Linux-syscall-note 169 ((GPL-2.0 WITH Linux-syscall-note) OR BSD-2-Clause) 21 ((GPL-2.0 WITH Linux-syscall-note) OR BSD-3-Clause) 17 LGPL-2.1+ WITH Linux-syscall-note 15 GPL-1.0+ WITH Linux-syscall-note 14 ((GPL-2.0+ WITH Linux-syscall-note) OR BSD-3-Clause) 5 LGPL-2.0+ WITH Linux-syscall-note 4 LGPL-2.1 WITH Linux-syscall-note 3 ((GPL-2.0 WITH Linux-syscall-note) OR MIT) 3 ((GPL-2.0 WITH Linux-syscall-note) AND MIT) 1
and that resulted in the third patch in this series.
- when the two scanners agreed on the detected license(s), that became the concluded license(s).
- when there was disagreement between the two scanners (one detected a license but the other didn't, or they both detected different licenses) a manual inspection of the file occurred.
- In most cases a manual inspection of the information in the file resulted in a clear resolution of the license that should apply (and which scanner probably needed to revisit its heuristics).
- When it was not immediately clear, the license identifier was confirmed with lawyers working with the Linux Foundation.
- If there was any question as to the appropriate license identifier, the file was flagged for further research and to be revisited later in time.
In total, over 70 hours of logged manual review was done on the spreadsheet to determine the SPDX license identifiers to apply to the source files by Kate, Philippe, Thomas and, in some cases, confirmation by lawyers working with the Linux Foundation.
Kate also obtained a third independent scan of the 4.13 code base from FOSSology, and compared selected files where the other two scanners disagreed against that SPDX file, to see if there was new insights. The Windriver scanner is based on an older version of FOSSology in part, so they are related.
Thomas did random spot checks in about 500 files from the spreadsheets for the uapi headers and agreed with SPDX license identifier in the files he inspected. For the non-uapi files Thomas did random spot checks in about 15000 files.
In initial set of patches against 4.14-rc6, 3 files were found to have copy/paste license identifier errors, and have been fixed to reflect the correct identifier.
Additionally Philippe spent 10 hours this week doing a detailed manual inspection and review of the 12,461 patched files from the initial patch version early this week with: - a full scancode scan run, collecting the matched texts, detected license ids and scores - reviewing anything where there was a license detected (about 500+ files) to ensure that the applied SPDX license was correct - reviewing anything where there was no detection but the patch license was not GPL-2.0 WITH Linux-syscall-note to ensure that the applied SPDX license was correct
This produced a worksheet with 20 files needing minor correction. This worksheet was then exported into 3 different .csv files for the different types of files to be modified.
These .csv files were then reviewed by Greg. Thomas wrote a script to parse the csv files and add the proper SPDX tag to the file, in the format that the file expected. This script was further refined by Greg based on the output to detect more types of files automatically and to distinguish between header and source .c files (which need different comment types.) Finally Greg ran the script using the .csv files to generate the patches.
Reviewed-by: Kate Stewart <kstewart@linuxfoundation.org> Reviewed-by: Philippe Ombredanne <pombredanne@nexb.com> Reviewed-by: Thomas Gleixner <tglx@linutronix.de> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
show more ...
|
#
41e7864a |
| 12-Oct-2017 |
mike.travis@hpe.com <mike.travis@hpe.com> |
x86/tsc: Drastically reduce the number of firmware bug warnings
Prior to the TSC ADJUST MSR being available, the method to set TSC's in sync with each other naturally caused a small skew between cpu
x86/tsc: Drastically reduce the number of firmware bug warnings
Prior to the TSC ADJUST MSR being available, the method to set TSC's in sync with each other naturally caused a small skew between cpu threads. This was NOT a firmware bug at the time so introducing a whole avalanche of alarming warning messages might cause unnecessary concern and customer complaints. (Example: >3000 msgs in a 32 socket Skylake system.)
Simply report the warning condition, if possible do the necessary fixes, and move on.
Signed-off-by: Mike Travis <mike.travis@hpe.com> Signed-off-by: Thomas Gleixner <tglx@linutronix.de> Reviewed-by: Dimitri Sivanich <dimitri.sivanich@hpe.com> Reviewed-by: Russ Anderson <russ.anderson@hpe.com> Reviewed-by: Peter Zijlstra <peterz@infradead.org> Cc: Prarit Bhargava <prarit@redhat.com> Cc: Andrew Banman <andrew.banman@hpe.com> Cc: Bin Gao <bin.gao@linux.intel.com> Link: https://lkml.kernel.org/r/20171012163202.175062400@stormcage.americas.sgi.com
show more ...
|
#
9514ecec |
| 12-Oct-2017 |
mike.travis@hpe.com <mike.travis@hpe.com> |
x86/tsc: Skip TSC test and error messages if already unstable
If the TSC has already been determined to be unstable, then checking TSC ADJUST values is a waste of time and generates unnecessary erro
x86/tsc: Skip TSC test and error messages if already unstable
If the TSC has already been determined to be unstable, then checking TSC ADJUST values is a waste of time and generates unnecessary error messages.
Signed-off-by: Mike Travis <mike.travis@hpe.com> Signed-off-by: Thomas Gleixner <tglx@linutronix.de> Reviewed-by: Dimitri Sivanich <dimitri.sivanich@hpe.com> Reviewed-by: Russ Anderson <russ.anderson@hpe.com> Reviewed-by: Peter Zijlstra <peterz@infradead.org> Cc: Prarit Bhargava <prarit@redhat.com> Cc: Andrew Banman <andrew.banman@hpe.com> Cc: Bin Gao <bin.gao@linux.intel.com> Link: https://lkml.kernel.org/r/20171012163202.060777495@stormcage.americas.sgi.com
show more ...
|
#
341102c3 |
| 12-Oct-2017 |
mike.travis@hpe.com <mike.travis@hpe.com> |
x86/tsc: Add option that TSC on Socket 0 being non-zero is valid
Add a flag to indicate and process that TSC counters are on chassis that reset at different times during system startup. Therefore w
x86/tsc: Add option that TSC on Socket 0 being non-zero is valid
Add a flag to indicate and process that TSC counters are on chassis that reset at different times during system startup. Therefore which TSC ADJUST values should be zero is not predictable.
Signed-off-by: Mike Travis <mike.travis@hpe.com> Signed-off-by: Thomas Gleixner <tglx@linutronix.de> Reviewed-by: Dimitri Sivanich <dimitri.sivanich@hpe.com> Reviewed-by: Russ Anderson <russ.anderson@hpe.com> Reviewed-by: Andrew Banman <andrew.abanman@hpe.com> Reviewed-by: Peter Zijlstra <peterz@infradead.org> Cc: Prarit Bhargava <prarit@redhat.com> Cc: Andrew Banman <andrew.banman@hpe.com> Cc: Bin Gao <bin.gao@linux.intel.com> Link: https://lkml.kernel.org/r/20171012163201.944370012@stormcage.americas.sgi.com
show more ...
|
#
855615ee |
| 31-May-2017 |
Peter Zijlstra <peterz@infradead.org> |
x86/tsc: Remove the TSC_ADJUST clamp
Now that all affected platforms have a microcode update; and we check this and disable TSC_DEADLINE and print a microcode revision update error if its too old, w
x86/tsc: Remove the TSC_ADJUST clamp
Now that all affected platforms have a microcode update; and we check this and disable TSC_DEADLINE and print a microcode revision update error if its too old, we can remove the TSC_ADJUST clamp.
This should help with systems where the second socket runs ahead of the first socket and needs a negative adjustment. In this case we'd hit the 0 clamp and give up for not achieving synchronization.
Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org> Cc: kevin.b.stanton@intel.com Link: http://lkml.kernel.org/r/20170531155306.100950003@infradead.org Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
show more ...
|
#
5f2e71e7 |
| 09-Feb-2017 |
Thomas Gleixner <tglx@linutronix.de> |
x86/tsc: Make the TSC ADJUST sanitizing work for tsc_reliable
When the TSC is marked reliable then the synchronization check is skipped, but that also skips the TSC ADJUST sanitizing code. So on a m
x86/tsc: Make the TSC ADJUST sanitizing work for tsc_reliable
When the TSC is marked reliable then the synchronization check is skipped, but that also skips the TSC ADJUST sanitizing code. So on a machine with a wreckaged BIOS the TSC deviation between CPUs might go unnoticed.
Let the TSC adjust sanitizing code run unconditionally and just skip the expensive synchronization checks when TSC is marked reliable.
Signed-off-by: Thomas Gleixner <tglx@linutronix.de> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Olof Johansson <olof@lixom.net> Link: http://lkml.kernel.org/r/20170209151231.491189912@linutronix.de Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
show more ...
|
#
8c9b9d87 |
| 18-Dec-2016 |
Thomas Gleixner <tglx@linutronix.de> |
x86/tsc: Limit the adjust value further
Adjust value 0x80000000 and other values larger than that render the TSC deadline timer disfunctional.
We have not yet any information about this from Intel,
x86/tsc: Limit the adjust value further
Adjust value 0x80000000 and other values larger than that render the TSC deadline timer disfunctional.
We have not yet any information about this from Intel, but experimentation clearly proves that this is a 32/64 bit and sign extension issue.
If adjust values larger than that are actually required, which might be the case for physical CPU hotplug, then we need to disable the deadline timer on the affected package/CPUs and use the local APIC timer instead.
That requires some surgery in the APIC setup code, so we just limit the ADJUST register value into the known to work range for now and revisit this when Intel comes forth with proper information.
Signed-off-by: Thomas Gleixner <tglx@linutronix.de> Cc: Roland Scheidegger <rscheidegger_lists@hispeed.ch> Cc: Bruce Schlobohm <bruce.schlobohm@intel.com> Cc: Kevin Stanton <kevin.b.stanton@intel.com> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Borislav Petkov <bp@alien8.de>
show more ...
|
#
16588f65 |
| 18-Dec-2016 |
Thomas Gleixner <tglx@linutronix.de> |
x86/tsc: Annotate printouts as firmware bug
Make it more obvious that the BIOS is screwed up.
Signed-off-by: Thomas Gleixner <tglx@linutronix.de> Cc: Roland Scheidegger <rscheidegger_lists@hispeed.
x86/tsc: Annotate printouts as firmware bug
Make it more obvious that the BIOS is screwed up.
Signed-off-by: Thomas Gleixner <tglx@linutronix.de> Cc: Roland Scheidegger <rscheidegger_lists@hispeed.ch> Cc: Bruce Schlobohm <bruce.schlobohm@intel.com> Cc: Kevin Stanton <kevin.b.stanton@intel.com> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Borislav Petkov <bp@alien8.de>
show more ...
|
#
5bae1562 |
| 13-Dec-2016 |
Thomas Gleixner <tglx@linutronix.de> |
x86/tsc: Force TSC_ADJUST register to value >= zero
Roland reported that his DELL T5810 sports a value add BIOS which completely wreckages the TSC. The squirmware [(TM) Ingo Molnar] boots with rando
x86/tsc: Force TSC_ADJUST register to value >= zero
Roland reported that his DELL T5810 sports a value add BIOS which completely wreckages the TSC. The squirmware [(TM) Ingo Molnar] boots with random negative TSC_ADJUST values, different on all CPUs. That renders the TSC useless because the sycnchronization check fails.
Roland tested the new TSC_ADJUST mechanism. While it manages to readjust the TSCs he needs to disable the TSC deadline timer, otherwise the machine just stops booting.
Deeper investigation unearthed that the TSC deadline timer is sensitive to the TSC_ADJUST value. Writing TSC_ADJUST to a negative value results in an interrupt storm caused by the TSC deadline timer.
This does not make any sense and it's hard to imagine what kind of hardware wreckage is behind that misfeature, but it's reliably reproducible on other systems which have TSC_ADJUST and TSC deadline timer.
While it would be understandable that a big enough negative value which moves the resulting TSC readout into the negative space could have the described effect, this happens even with a adjust value of -1, which keeps the TSC readout definitely in the positive space. The compare register for the TSC deadline timer is set to a positive value larger than the TSC, but despite not having reached the deadline the interrupt is raised immediately. If this happens on the boot CPU, then the machine dies silently because this setup happens before the NMI watchdog is armed.
Further experiments showed that any other adjustment of TSC_ADJUST works as expected as long as it stays in the positive range. The direction of the adjustment has no influence either. See the lkml link for further analysis.
Yet another proof for the theory that timers are designed by janitors and the underlying (obviously undocumented) mechanisms which allow BIOSes to wreckage them are considered a feature. Well done Intel - NOT!
To address this wreckage add the following sanity measures:
- If the TSC_ADJUST value on the boot cpu is not 0, set it to 0
- If the TSC_ADJUST value on any cpu is negative, set it to 0
- Prevent the cross package synchronization mechanism from setting negative TSC_ADJUST values.
Reported-and-tested-by: Roland Scheidegger <rscheidegger_lists@hispeed.ch> Signed-off-by: Thomas Gleixner <tglx@linutronix.de> Cc: Bruce Schlobohm <bruce.schlobohm@intel.com> Cc: Kevin Stanton <kevin.b.stanton@intel.com> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Allen Hung <allen_hung@dell.com> Cc: Borislav Petkov <bp@alien8.de> Link: http://lkml.kernel.org/r/20161213131211.397588033@linutronix.de Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
show more ...
|
#
6a369583 |
| 13-Dec-2016 |
Thomas Gleixner <tglx@linutronix.de> |
x86/tsc: Validate TSC_ADJUST after resume
Some 'feature' BIOSes fiddle with the TSC_ADJUST register during suspend/resume which renders the TSC unusable.
Add sanity checks into the resume path and
x86/tsc: Validate TSC_ADJUST after resume
Some 'feature' BIOSes fiddle with the TSC_ADJUST register during suspend/resume which renders the TSC unusable.
Add sanity checks into the resume path and restore the original value if it was adjusted.
Reported-and-tested-by: Roland Scheidegger <rscheidegger_lists@hispeed.ch> Signed-off-by: Thomas Gleixner <tglx@linutronix.de> Cc: Bruce Schlobohm <bruce.schlobohm@intel.com> Cc: Kevin Stanton <kevin.b.stanton@intel.com> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Allen Hung <allen_hung@dell.com> Cc: Borislav Petkov <bp@alien8.de> Link: http://lkml.kernel.org/r/20161213131211.317654500@linutronix.de Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
show more ...
|
#
31f8a651 |
| 01-Dec-2016 |
Thomas Gleixner <tglx@linutronix.de> |
x86/tsc: Validate cpumask pointer before accessing it
0-day testing encountered a NULL pointer dereference in a cpumask access from tsc_store_and_check_tsc_adjust().
This happens when the function
x86/tsc: Validate cpumask pointer before accessing it
0-day testing encountered a NULL pointer dereference in a cpumask access from tsc_store_and_check_tsc_adjust().
This happens when the function is called on the boot CPU and the topology masks are not yet available due to CPUMASK_OFFSTACK=y.
Add a NULL pointer check for the mask pointer. If NULL it's safe to assume that the CPU is the boot CPU and the first one in the package.
Fixes: 8b223bc7abe0 ("x86/tsc: Store and check TSC ADJUST MSR") Reported-by: kernel test robot <xiaolong.ye@intel.com> Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
show more ...
|
#
b8365543 |
| 29-Nov-2016 |
Thomas Gleixner <tglx@linutronix.de> |
x86/tsc: Fix broken CONFIG_X86_TSC=n build
Add the missing return statement to the inline stub tsc_store_and_check_tsc_adjust() and add the other stubs to make a SMP=y,TSC=n build happy.
While at i
x86/tsc: Fix broken CONFIG_X86_TSC=n build
Add the missing return statement to the inline stub tsc_store_and_check_tsc_adjust() and add the other stubs to make a SMP=y,TSC=n build happy.
While at it, remove the unused variable from the UP variant of tsc_store_and_check_tsc_adjust().
Fixes: commit ba75fb646931 ("x86/tsc: Sync test only for the first cpu in a package") Reported-by: kbuild test robot <fengguang.wu@intel.com> Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
show more ...
|
#
cc4db268 |
| 19-Nov-2016 |
Thomas Gleixner <tglx@linutronix.de> |
x86/tsc: Try to adjust TSC if sync test fails
If the first CPU of a package comes online, it is necessary to test whether the TSC is in sync with a CPU on some other package. When a deviation is obs
x86/tsc: Try to adjust TSC if sync test fails
If the first CPU of a package comes online, it is necessary to test whether the TSC is in sync with a CPU on some other package. When a deviation is observed (time going backwards between the two CPUs) the TSC is marked unstable, which is a problem on large machines as they have to fall back to the HPET clocksource, which is insanely slow.
It has been attempted to compensate the TSC by adding the offset to the TSC and writing it back some time ago, but this never was merged because it did not turn out to be stable, especially not on older systems.
Modern systems have become more stable in that regard and the TSC_ADJUST MSR allows us to compensate for the time deviation in a sane way. If it's available allow up to three synchronization runs and if a time warp is detected the starting CPU can compensate the time warp via the TSC_ADJUST MSR and retry. If the third run still shows a deviation or when random time warps are detected the test terminally fails.
Signed-off-by: Thomas Gleixner <tglx@linutronix.de> Reviewed-by: Ingo Molnar <mingo@kernel.org> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Yinghai Lu <yinghai@kernel.org> Cc: Borislav Petkov <bp@alien8.de> Link: http://lkml.kernel.org/r/20161119134018.048237517@linutronix.de Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
show more ...
|
#
76d3b851 |
| 19-Nov-2016 |
Thomas Gleixner <tglx@linutronix.de> |
x86/tsc: Prepare warp test for TSC adjustment
To allow TSC compensation cross nodes its necessary to know in which direction the TSC warp was observed. Return the maximum observed value on the calli
x86/tsc: Prepare warp test for TSC adjustment
To allow TSC compensation cross nodes its necessary to know in which direction the TSC warp was observed. Return the maximum observed value on the calling CPU so the caller can determine the direction later.
Signed-off-by: Thomas Gleixner <tglx@linutronix.de> Reviewed-by: Ingo Molnar <mingo@kernel.org> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Yinghai Lu <yinghai@kernel.org> Cc: Borislav Petkov <bp@alien8.de> Link: http://lkml.kernel.org/r/20161119134017.970859287@linutronix.de Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
show more ...
|
#
4c5e3c63 |
| 19-Nov-2016 |
Thomas Gleixner <tglx@linutronix.de> |
x86/tsc: Move sync cleanup to a safe place
Cleaning up the stop marker on the control CPU is wrong when we want to add retry support. Move the cleanup to the starting CPU.
Signed-off-by: Thomas Gle
x86/tsc: Move sync cleanup to a safe place
Cleaning up the stop marker on the control CPU is wrong when we want to add retry support. Move the cleanup to the starting CPU.
Signed-off-by: Thomas Gleixner <tglx@linutronix.de> Reviewed-by: Ingo Molnar <mingo@kernel.org> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Yinghai Lu <yinghai@kernel.org> Cc: Borislav Petkov <bp@alien8.de> Link: http://lkml.kernel.org/r/20161119134017.892095627@linutronix.de Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
show more ...
|
#
a36f5136 |
| 19-Nov-2016 |
Thomas Gleixner <tglx@linutronix.de> |
x86/tsc: Sync test only for the first cpu in a package
If the TSC_ADJUST MSR is available all CPUs in a package are forced to the same value. So TSCs cannot be out of sync when the first CPU in the
x86/tsc: Sync test only for the first cpu in a package
If the TSC_ADJUST MSR is available all CPUs in a package are forced to the same value. So TSCs cannot be out of sync when the first CPU in the package was in sync.
That allows to skip the sync test for all CPUs except the first starting CPU in a package.
Signed-off-by: Thomas Gleixner <tglx@linutronix.de> Reviewed-by: Ingo Molnar <mingo@kernel.org> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Yinghai Lu <yinghai@kernel.org> Cc: Borislav Petkov <bp@alien8.de> Link: http://lkml.kernel.org/r/20161119134017.809901363@linutronix.de Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
show more ...
|
#
1d0095fe |
| 19-Nov-2016 |
Thomas Gleixner <tglx@linutronix.de> |
x86/tsc: Verify TSC_ADJUST from idle
When entering idle, it's a good oportunity to verify that the TSC_ADJUST MSR has not been tampered with (BIOS hiding SMM cycles). If tampering is detected, emit
x86/tsc: Verify TSC_ADJUST from idle
When entering idle, it's a good oportunity to verify that the TSC_ADJUST MSR has not been tampered with (BIOS hiding SMM cycles). If tampering is detected, emit a warning and restore it to the previous value.
This is especially important for machines, which mark the TSC reliable because there is no watchdog clocksource available (SoCs).
This is not sufficient for HPC (NOHZ_FULL) situations where a CPU never goes idle, but adding a timer to do the check periodically is not an option either. On a machine, which has this issue, the check triggeres right during boot, so there is a decent chance that the sysadmin will notice.
Rate limit the check to once per second and warn only once per cpu.
Signed-off-by: Thomas Gleixner <tglx@linutronix.de> Reviewed-by: Ingo Molnar <mingo@kernel.org> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Yinghai Lu <yinghai@kernel.org> Cc: Borislav Petkov <bp@alien8.de> Link: http://lkml.kernel.org/r/20161119134017.732180441@linutronix.de Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
show more ...
|