
Commit 15be042

hbathini authored and gregkh committed
powerpc/fadump: Fix inaccurate CPU state info in vmcore generated with panic
[ Upstream commit 06e629c ]

In panic path, fadump is triggered via a panic notifier function. Before calling panic notifier functions, smp_send_stop() gets called, which stops all CPUs except the panic'ing CPU. Commit 8389b37 ("powerpc: stop_this_cpu: remove the cpu from the online map.") and again commit bab2623 ("powerpc: Offline CPU in stop_this_cpu()") started marking CPUs as offline while stopping them. So, if a kernel has either of the above commits, register data in a vmcore captured with fadump via the panic path is processed only for the panic'ing CPU and not for any other CPU. Sample output of crash-utility with such a vmcore:

  # crash vmlinux vmcore
  ...
        KERNEL: vmlinux
      DUMPFILE: vmcore  [PARTIAL DUMP]
          CPUS: 1
          DATE: Wed Nov 10 09:56:34 EST 2021
        UPTIME: 00:00:42
  LOAD AVERAGE: 2.27, 0.69, 0.24
         TASKS: 183
      NODENAME: XXXXXXXXX
       RELEASE: 5.15.0+
       VERSION: #974 SMP Wed Nov 10 04:18:19 CST 2021
       MACHINE: ppc64le  (2500 Mhz)
        MEMORY: 8 GB
         PANIC: "Kernel panic - not syncing: sysrq triggered crash"
           PID: 3394
       COMMAND: "bash"
          TASK: c0000000150a5f80  [THREAD_INFO: c0000000150a5f80]
           CPU: 1
         STATE: TASK_RUNNING (PANIC)

  crash> p -x __cpu_online_mask
  __cpu_online_mask = $1 = {
    bits = {0x2, 0x0, 0x0, 0x0, 0x0, 0x0, 0x0, 0x0, 0x0}
  }
  crash>
  crash>
  crash> p -x __cpu_active_mask
  __cpu_active_mask = $2 = {
    bits = {0xff, 0x0, 0x0, 0x0, 0x0, 0x0, 0x0, 0x0, 0x0}
  }
  crash>

While this has been the case since fadump was introduced, the issue was not identified for two probable reasons:

  - In general, the bulk of the vmcores analyzed were from crashes due to an exception.

  - The above did change since commit 8341f2f ("sysrq: Use panic() to force a crash") started using panic() instead of dereferencing a NULL pointer to force a kernel crash. But then commit de6e5d3 ("powerpc: smp_send_stop do not offline stopped CPUs") stopped marking CPUs as offline till kernel commit bab2623 ("powerpc: Offline CPU in stop_this_cpu()") reverted that change.

To ensure post processing of register data for all other CPUs happens as intended, let the panic() function take the crash friendly path (read crash_smp_send_stop()) with the help of the crash_kexec_post_notifiers option. Also, as register data for all CPUs is captured by f/w, skip IPI callbacks here for fadump, to avoid any complications in finding the right backtraces.

Signed-off-by: Hari Bathini <[email protected]>
Signed-off-by: Michael Ellerman <[email protected]>
Link: https://lore.kernel.org/r/[email protected]
Signed-off-by: Sasha Levin <[email protected]>
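
The two masks in the sample crash output can be compared by counting set bits: an online mask of 0x2 has one bit set (only CPU 1), while an active mask of 0xff has eight. The following is a minimal sketch of that arithmetic; `count_cpus()` is a hypothetical helper written for illustration, not a kernel or crash-utility API.

```c
#include <assert.h>
#include <stddef.h>

/*
 * Hypothetical helper (not a kernel API): count the set bits across
 * the words of a CPU mask, as printed by "p -x" in crash.
 */
static unsigned int count_cpus(const unsigned long *bits, size_t nwords)
{
	unsigned int n = 0;

	assert(bits != NULL);
	for (size_t i = 0; i < nwords; i++) {
		unsigned long w = bits[i];

		while (w) {
			w &= w - 1;	/* clear the lowest set bit */
			n++;
		}
	}
	return n;
}
```

Applied to the nine-word arrays shown above, this gives 1 for `__cpu_online_mask` and 8 for `__cpu_active_mask`, which matches the "CPUS: 1" line that crash reports for the affected vmcore.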
1 parent f2e658d commit 15be042


2 files changed: +18 -0 lines changed


arch/powerpc/kernel/fadump.c

Lines changed: 8 additions & 0 deletions
@@ -1641,6 +1641,14 @@ int __init setup_fadump(void)
 	else if (fw_dump.reserve_dump_area_size)
 		fw_dump.ops->fadump_init_mem_struct(&fw_dump);

+	/*
+	 * In case of panic, fadump is triggered via ppc_panic_event()
+	 * panic notifier. Setting crash_kexec_post_notifiers to 'true'
+	 * lets panic() function take crash friendly path before panic
+	 * notifiers are invoked.
+	 */
+	crash_kexec_post_notifiers = true;
+
 	return 1;
 }
 subsys_initcall(setup_fadump);
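
The effect of setting crash_kexec_post_notifiers can be pictured with a small user-space model of the ordering inside panic(). This is a sketch only: the enum, struct, and `model_panic()` are hypothetical stand-ins named after the kernel routines they represent, and the ordering reflects the panic() logic as understood for kernels of this era rather than anything shown in this diff.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Stand-ins for the kernel routines panic() may call (model only). */
enum panic_step {
	STEP_CRASH_KEXEC,		/* __crash_kexec() */
	STEP_SMP_SEND_STOP,		/* smp_send_stop(): regular stop path */
	STEP_CRASH_SMP_SEND_STOP,	/* crash_smp_send_stop(): crash friendly path */
	STEP_PANIC_NOTIFIERS,		/* panic notifier chain (fadump hooks in here) */
};

#define MAX_STEPS 4

struct panic_trace {
	enum panic_step steps[MAX_STEPS];
	size_t n;
};

static void record(struct panic_trace *t, enum panic_step s)
{
	if (t->n < MAX_STEPS)
		t->steps[t->n++] = s;
}

/* Assumed ordering of the CPU-stop step relative to the notifiers. */
static void model_panic(struct panic_trace *t, bool post_notifiers)
{
	assert(t != NULL);
	t->n = 0;
	if (!post_notifiers) {
		record(t, STEP_CRASH_KEXEC);
		record(t, STEP_SMP_SEND_STOP);
	} else {
		record(t, STEP_CRASH_SMP_SEND_STOP);
	}
	record(t, STEP_PANIC_NOTIFIERS);
}
```

With `post_notifiers` false the model runs the regular stop path before the notifiers, which is the case the commit message describes as marking CPUs offline. With it true, only the crash friendly path precedes the notifiers.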

arch/powerpc/kernel/smp.c

Lines changed: 10 additions & 0 deletions
@@ -60,6 +60,7 @@
 #include <asm/cpu_has_feature.h>
 #include <asm/ftrace.h>
 #include <asm/kup.h>
+#include <asm/fadump.h>

 #ifdef DEBUG
 #include <asm/udbg.h>
@@ -612,6 +613,15 @@ void crash_smp_send_stop(void)
 {
 	static bool stopped = false;

+	/*
+	 * In case of fadump, register data for all CPUs is captured by f/w
+	 * on ibm,os-term rtas call. Skip IPI callbacks to other CPUs before
+	 * this rtas call to avoid tricky post processing of those CPUs'
+	 * backtraces.
+	 */
+	if (should_fadump_crash())
+		return;
+
 	if (stopped)
 		return;

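
The order of the two early returns in crash_smp_send_stop() can be illustrated with a toy model. The hunk ends right after the `stopped` check, so what follows it is an assumption here: the sketch supposes the function then sets the flag and signals the other CPUs once. `struct stop_model` and `model_crash_smp_send_stop()` are hypothetical names, and `fadump_will_crash` stands in for the result of should_fadump_crash().

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Toy state for illustration; not a kernel structure. */
struct stop_model {
	bool fadump_will_crash;	/* stand-in for should_fadump_crash() */
	bool stopped;		/* mirrors the static 'stopped' flag */
	unsigned int ipis_sent;	/* rounds of stop IPIs (assumed behaviour) */
};

static void model_crash_smp_send_stop(struct stop_model *m)
{
	assert(m != NULL);

	/* New check: firmware captures register data, so send no IPIs. */
	if (m->fadump_will_crash)
		return;

	/* Existing check: only the first caller proceeds. */
	if (m->stopped)
		return;

	/* Assumed remainder of the function, not shown in the hunk. */
	m->stopped = true;
	m->ipis_sent++;
}
```

In this model, a call with `fadump_will_crash` set leaves both `stopped` and `ipis_sent` untouched, while repeated calls without it send one round of IPIs at most.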
