RPI-kernel branch 3.6.y not rebooting #142

msperl · 2012-10-24T20:04:06Z

Hi!

After switching from git 3.2.27 to git 3.6.y (commit 3bc2651)
the RPI stays stuck when doing a "typical" /sbin/reboot.

Only thing visible on the serial Console is:
[ 7.068853] ftdi_sio: v1.6.0:USB FTDI Serial Converters Driver
[ 9.179254] EXT4-fs (mmcblk0p2): re-mounted. Opts: (null)
Debian GNU/Linux wheezy/sid raspberrypi ttyAMA0

raspberrypi login: [ 50.542724] Restarting system.

This makes the system essentially unsafe to use - especially when not sitting in front of the RPI, but working remotely...

No jump to kernel debugger or anything that would typically indicate a crash on the kernel level.
Seems to happen in the videocore part, where something is not initialized as expected.
After power off/on it the machine boots correctly - no filesystem check or similar, so the filesystem is shut down correctly.

Ciao,
Martin

p.s: the .config was copied from the running (and rebooting) git-version of 3.2.7 and the defaults applied for all new config-values

The text was updated successfully, but these errors were encountered:

popcornmix · 2012-10-24T20:06:03Z

Do the network LEDs stop flashing?

msperl · 2012-10-24T20:25:43Z

Link flashes all others are "on" - except ACT.

popcornmix · 2012-10-24T20:28:19Z

That does mean linux didn't get to the shutdown function (bcm2708_power_off) which resets the chip (and so the LAN chip). So it's not a firmware problem.

I imagine it means a linux task is hung when trying to shut down.

msperl · 2012-10-24T20:29:14Z

Maybe a hint:
booting with /boot/arm240_start.elf (md5sum b77e7028caa3d733985b0baf6510e144)
taken from git version: 17ff0bb5bb46210bad235d32589d35f951f269cf

popcornmix · 2012-10-24T20:29:14Z

Does "sudo reboot -f" reboot? (save your work and sync first)

msperl · 2012-10-24T20:32:56Z

no - same effect...

Running a RPI with HW revision 4 (256MB)

msperl · 2012-10-24T20:34:31Z

power down works correctly - leds all off after shutdown -h now

Ferroin · 2012-10-24T20:36:20Z

Do you have your system setup to use kexec to reboot? I've seen intermittent problems with kexec() operation not only on the rpi, but on other ARM systems as well.

msperl · 2012-10-24T20:41:11Z

no - just the default config that came with the original kernel from raspbian.
used that successfully with 3.2.27 for some time and now when moving/testing to 3.6.y I am experience the issue

I also tried to remove everything attached (USB, GPIO,...) - no changes with only power+network connected.

popcornmix · 2012-10-24T20:50:02Z

I see the problem. 3.2.27 does: (arch/arm/kernel/process.c)
void (*arm_pm_restart)(char str, const char *cmd) = arm_machine_restart;

and 3.6.1 does:
void (*arm_pm_restart)(char str, const char *cmd) = null_restart;

Seems there's a new restart method of the machine type. I'll look into it.

msperl · 2012-10-24T20:59:44Z

I tried to build with the default config from arch/arm/configs/bcmrpi_defconfig
and wanted to test the reboot tomorrow when RPI has finished compiling...

Steps:
make clean
cp arch/arm/configs/bcmrpi_defconfig .config
make silentoldconfig (just pressing enter until finished - taking the defaults for unknown configs)
make

but make stops almost immediately:
root@raspberrypi:/usr/src/linux.3.6.y# make
/bin/sh: 1: gcc: not found
CHK include/linux/version.h
CHK include/generated/utsrelease.h
UPD include/generated/utsrelease.h
make[1]: `include/generated/mach-types.h' is up to date.
CC kernel/bounds.s
/bin/sh: 1: gcc: not found
make[1]: *** [kernel/bounds.s] Error 127
make: *** [prepare0] Error 2

moving back to the config used before
cp .config.backup .config; make silentoldconfig;make
works again...

So maybe the "bcmrpi_defconfig" are no longer valid when compiling the kernel (at least when compiling on the RPI) -
this may also need a review/replacement...

popcornmix · 2012-10-24T21:01:38Z

Fix pushed for reboot.

Ferroin · 2012-10-24T21:10:23Z

The compilation error probably isn't something in the config, but either something in the Makefile or how your system is configured.

popcornmix · 2012-10-24T21:42:04Z

/bin/sh: 1: gcc: not found
sound like you don't have gcc installed.

licaon-kter · 2012-10-24T21:53:48Z

@msperl: it easier to export the running config of a working kernel: zcat /proc/config.gz > ~/config.default and work your way from that one

msperl · 2012-10-25T05:56:22Z

well, as said: if I do the following:

zcat /proc/config.gzip .config; make

it compiles, but if I:

cp arch/arm/configs/bcmrpi_defconfig .config;make

it does NOT work and complains about gcc.

and that in the SAME shell without adding/removing anything else (especially not removing gcc from the system...)

msperl · 2012-10-25T06:08:10Z

I have just seen the new patch - compiling the kernel right now to see if it works now.
I will let you know if it works...

msperl · 2012-10-25T06:42:31Z

@licaon-kter: I know it is better (especially since /proc/config.gz), but I was assuming that the reboot issue may be related to a kernel configuration issue. So I wanted to test with a "stock" config, that should be working - hence the use of arch/arm/configs/bcmrpi_defconfig (which I was hoping to be working). But then I stumbled across the issue of "missing compiler" (even though the other config compiles). But I believe we should branch this to a different "issue"...

msperl · 2012-10-25T16:35:47Z

Reboots now correctly - thanks!

Cc: David Rientjes <[email protected]> WARNING: line over 80 characters #99: FILE: mm/page_alloc.c:2965: + * zone list (with a backoff mechanism which is a function of no_progress_loops). WARNING: line over 80 characters #129: FILE: mm/page_alloc.c:2995: + * Keep reclaiming pages while there is a chance this will lead somewhere. WARNING: line over 80 characters #134: FILE: mm/page_alloc.c:3000: + for_each_zone_zonelist_nodemask(zone, z, ac->zonelist, ac->high_zoneidx, ac->nodemask) { WARNING: line over 80 characters #138: FILE: mm/page_alloc.c:3004: + available -= DIV_ROUND_UP(no_progress_loops * available, MAX_RECLAIM_RETRIES); WARNING: line over 80 characters #142: FILE: mm/page_alloc.c:3008: + * Would the allocation succeed if we reclaimed the whole available? WARNING: line over 80 characters #146: FILE: mm/page_alloc.c:3012: + /* Wait for some write requests to complete then retry */ total: 0 errors, 6 warnings, 202 lines checked ./patches/mm-oom-rework-oom-detection.patch has style problems, please review. NOTE: If any of the errors are false positives, please report them to the maintainer, see CHECKPATCH in MAINTAINERS. Please run checkpatch prior to sending patches Cc: David Rientjes <[email protected]> Cc: Hillf Danton <[email protected]> Cc: Johannes Weiner <[email protected]> Cc: KAMEZAWA Hiroyuki <[email protected]> Cc: Mel Gorman <[email protected]> Cc: Michal Hocko <[email protected]> Cc: Tetsuo Handa <[email protected]> Signed-off-by: Andrew Morton <[email protected]>

Tetsuo Handa has noticed that MMF_UNSTABLE SIGBUS path in handle_mm_fault causes a lockdep splat Out of memory: Kill process 1056 (a.out) score 603 or sacrifice child Killed process 1056 (a.out) total-vm:4268108kB, anon-rss:2246048kB, file-rss:0kB, shmem-rss:0kB a.out (1169) used greatest stack depth: 11664 bytes left DEBUG_LOCKS_WARN_ON(depth <= 0) ------------[ cut here ]------------ WARNING: CPU: 6 PID: 1339 at kernel/locking/lockdep.c:3617 lock_release+0x172/0x1e0 CPU: 6 PID: 1339 Comm: a.out Not tainted 4.13.0-rc3-next-20170803+ #142 Hardware name: VMware, Inc. VMware Virtual Platform/440BX Desktop Reference Platform, BIOS 6.00 07/02/2015 RIP: 0010:lock_release+0x172/0x1e0 Call Trace: up_read+0x1a/0x40 __do_page_fault+0x28e/0x4c0 do_page_fault+0x30/0x80 page_fault+0x28/0x30 The reason is that the page fault path might have dropped the mmap_sem and returned with VM_FAULT_RETRY. MMF_UNSTABLE check however rewrites the error path to VM_FAULT_SIGBUS and we always expect mmap_sem taken in that path. Fix this by taking mmap_sem when VM_FAULT_RETRY is held in the MMF_UNSTABLE path. We cannot simply add VM_FAULT_SIGBUS to the existing error code because all arch specific page fault handlers and g-u-p would have to learn a new error code combination. Link: http://lkml.kernel.org/r/[email protected] Fixes: 3f70dc3 ("mm: make sure that kthreads will not refault oom reaped memory") Reported-by: Tetsuo Handa <[email protected]> Signed-off-by: Michal Hocko <[email protected]> Acked-by: David Rientjes <[email protected]> Cc: Andrea Argangeli <[email protected]> Cc: "Kirill A. Shutemov" <[email protected]> Cc: Oleg Nesterov <[email protected]> Cc: Wenwei Tao <[email protected]> Cc: <[email protected]> [4.9+] Signed-off-by: Andrew Morton <[email protected]> Signed-off-by: Linus Torvalds <[email protected]>

commit 5b53a6e upstream. Tetsuo Handa has noticed that MMF_UNSTABLE SIGBUS path in handle_mm_fault causes a lockdep splat Out of memory: Kill process 1056 (a.out) score 603 or sacrifice child Killed process 1056 (a.out) total-vm:4268108kB, anon-rss:2246048kB, file-rss:0kB, shmem-rss:0kB a.out (1169) used greatest stack depth: 11664 bytes left DEBUG_LOCKS_WARN_ON(depth <= 0) ------------[ cut here ]------------ WARNING: CPU: 6 PID: 1339 at kernel/locking/lockdep.c:3617 lock_release+0x172/0x1e0 CPU: 6 PID: 1339 Comm: a.out Not tainted 4.13.0-rc3-next-20170803+ #142 Hardware name: VMware, Inc. VMware Virtual Platform/440BX Desktop Reference Platform, BIOS 6.00 07/02/2015 RIP: 0010:lock_release+0x172/0x1e0 Call Trace: up_read+0x1a/0x40 __do_page_fault+0x28e/0x4c0 do_page_fault+0x30/0x80 page_fault+0x28/0x30 The reason is that the page fault path might have dropped the mmap_sem and returned with VM_FAULT_RETRY. MMF_UNSTABLE check however rewrites the error path to VM_FAULT_SIGBUS and we always expect mmap_sem taken in that path. Fix this by taking mmap_sem when VM_FAULT_RETRY is held in the MMF_UNSTABLE path. We cannot simply add VM_FAULT_SIGBUS to the existing error code because all arch specific page fault handlers and g-u-p would have to learn a new error code combination. Link: http://lkml.kernel.org/r/[email protected] Fixes: 3f70dc3 ("mm: make sure that kthreads will not refault oom reaped memory") Reported-by: Tetsuo Handa <[email protected]> Signed-off-by: Michal Hocko <[email protected]> Acked-by: David Rientjes <[email protected]> Cc: Andrea Argangeli <[email protected]> Cc: "Kirill A. Shutemov" <[email protected]> Cc: Oleg Nesterov <[email protected]> Cc: Wenwei Tao <[email protected]> Signed-off-by: Andrew Morton <[email protected]> Signed-off-by: Linus Torvalds <[email protected]> Signed-off-by: Greg Kroah-Hartman <[email protected]>

[ Upstream commit 70b639d ] Because the error handling is sequential, the application of resources should be carried out in the order of error handling, so the operation of registering the interrupt handler should be put in front, so as not to free the unregistered interrupt handler during error handling. This log reveals it: [ 3.438724] Trying to free already-free IRQ 23 [ 3.439060] WARNING: CPU: 5 PID: 1 at kernel/irq/manage.c:1825 free_irq+0xfb/0x480 [ 3.440039] Modules linked in: [ 3.440257] CPU: 5 PID: 1 Comm: swapper/0 Not tainted 5.12.4-g70e7f0549188-dirty #142 [ 3.440793] Hardware name: QEMU Standard PC (Q35 + ICH9, 2009), BIOS rel-1.12.0-59-gc9ba5276e321-prebuilt.qemu.org 04/01/2014 [ 3.441561] RIP: 0010:free_irq+0xfb/0x480 [ 3.441845] Code: 6e 08 74 6f 4d 89 f4 e8 c3 78 09 00 4d 8b 74 24 18 4d 85 f6 75 e3 e8 b4 78 09 00 8b 75 c8 48 c7 c7 a0 ac d5 85 e8 95 d7 f5 ff <0f> 0b 48 8b 75 c0 4c 89 ff e8 87 c5 90 03 48 8b 43 40 4c 8b a0 80 [ 3.443121] RSP: 0000:ffffc90000017b50 EFLAGS: 00010086 [ 3.443483] RAX: 0000000000000000 RBX: ffff888107c6f000 RCX: 0000000000000000 [ 3.443972] RDX: 0000000000000000 RSI: ffffffff8123f301 RDI: 00000000ffffffff [ 3.444462] RBP: ffffc90000017b90 R08: 0000000000000001 R09: 0000000000000003 [ 3.444950] R10: 0000000000000000 R11: 0000000000000001 R12: 0000000000000000 [ 3.444994] R13: ffff888107dc0000 R14: ffff888104f6bf00 R15: ffff888107c6f0a8 [ 3.444994] FS: 0000000000000000(0000) GS:ffff88817bd40000(0000) knlGS:0000000000000000 [ 3.444994] CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033 [ 3.444994] CR2: 0000000000000000 CR3: 000000000642e000 CR4: 00000000000006e0 [ 3.444994] DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000 [ 3.444994] DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400 [ 3.444994] Call Trace: [ 3.444994] ns_init_card_error+0x18e/0x250 [ 3.444994] nicstar_init_one+0x10d2/0x1130 [ 3.444994] local_pci_probe+0x4a/0xb0 [ 3.444994] pci_device_probe+0x126/0x1d0 [ 3.444994] ? pci_device_remove+0x100/0x100 [ 3.444994] really_probe+0x27e/0x650 [ 3.444994] driver_probe_device+0x84/0x1d0 [ 3.444994] ? mutex_lock_nested+0x16/0x20 [ 3.444994] device_driver_attach+0x63/0x70 [ 3.444994] __driver_attach+0x117/0x1a0 [ 3.444994] ? device_driver_attach+0x70/0x70 [ 3.444994] bus_for_each_dev+0xb6/0x110 [ 3.444994] ? rdinit_setup+0x40/0x40 [ 3.444994] driver_attach+0x22/0x30 [ 3.444994] bus_add_driver+0x1e6/0x2a0 [ 3.444994] driver_register+0xa4/0x180 [ 3.444994] __pci_register_driver+0x77/0x80 [ 3.444994] ? uPD98402_module_init+0xd/0xd [ 3.444994] nicstar_init+0x1f/0x75 [ 3.444994] do_one_initcall+0x7a/0x3d0 [ 3.444994] ? rdinit_setup+0x40/0x40 [ 3.444994] ? rcu_read_lock_sched_held+0x4a/0x70 [ 3.444994] kernel_init_freeable+0x2a7/0x2f9 [ 3.444994] ? rest_init+0x2c0/0x2c0 [ 3.444994] kernel_init+0x13/0x180 [ 3.444994] ? rest_init+0x2c0/0x2c0 [ 3.444994] ? rest_init+0x2c0/0x2c0 [ 3.444994] ret_from_fork+0x1f/0x30 [ 3.444994] Kernel panic - not syncing: panic_on_warn set ... [ 3.444994] CPU: 5 PID: 1 Comm: swapper/0 Not tainted 5.12.4-g70e7f0549188-dirty #142 [ 3.444994] Hardware name: QEMU Standard PC (Q35 + ICH9, 2009), BIOS rel-1.12.0-59-gc9ba5276e321-prebuilt.qemu.org 04/01/2014 [ 3.444994] Call Trace: [ 3.444994] dump_stack+0xba/0xf5 [ 3.444994] ? free_irq+0xfb/0x480 [ 3.444994] panic+0x155/0x3ed [ 3.444994] ? __warn+0xed/0x150 [ 3.444994] ? free_irq+0xfb/0x480 [ 3.444994] __warn+0x103/0x150 [ 3.444994] ? free_irq+0xfb/0x480 [ 3.444994] report_bug+0x119/0x1c0 [ 3.444994] handle_bug+0x3b/0x80 [ 3.444994] exc_invalid_op+0x18/0x70 [ 3.444994] asm_exc_invalid_op+0x12/0x20 [ 3.444994] RIP: 0010:free_irq+0xfb/0x480 [ 3.444994] Code: 6e 08 74 6f 4d 89 f4 e8 c3 78 09 00 4d 8b 74 24 18 4d 85 f6 75 e3 e8 b4 78 09 00 8b 75 c8 48 c7 c7 a0 ac d5 85 e8 95 d7 f5 ff <0f> 0b 48 8b 75 c0 4c 89 ff e8 87 c5 90 03 48 8b 43 40 4c 8b a0 80 [ 3.444994] RSP: 0000:ffffc90000017b50 EFLAGS: 00010086 [ 3.444994] RAX: 0000000000000000 RBX: ffff888107c6f000 RCX: 0000000000000000 [ 3.444994] RDX: 0000000000000000 RSI: ffffffff8123f301 RDI: 00000000ffffffff [ 3.444994] RBP: ffffc90000017b90 R08: 0000000000000001 R09: 0000000000000003 [ 3.444994] R10: 0000000000000000 R11: 0000000000000001 R12: 0000000000000000 [ 3.444994] R13: ffff888107dc0000 R14: ffff888104f6bf00 R15: ffff888107c6f0a8 [ 3.444994] ? vprintk_func+0x71/0x110 [ 3.444994] ns_init_card_error+0x18e/0x250 [ 3.444994] nicstar_init_one+0x10d2/0x1130 [ 3.444994] local_pci_probe+0x4a/0xb0 [ 3.444994] pci_device_probe+0x126/0x1d0 [ 3.444994] ? pci_device_remove+0x100/0x100 [ 3.444994] really_probe+0x27e/0x650 [ 3.444994] driver_probe_device+0x84/0x1d0 [ 3.444994] ? mutex_lock_nested+0x16/0x20 [ 3.444994] device_driver_attach+0x63/0x70 [ 3.444994] __driver_attach+0x117/0x1a0 [ 3.444994] ? device_driver_attach+0x70/0x70 [ 3.444994] bus_for_each_dev+0xb6/0x110 [ 3.444994] ? rdinit_setup+0x40/0x40 [ 3.444994] driver_attach+0x22/0x30 [ 3.444994] bus_add_driver+0x1e6/0x2a0 [ 3.444994] driver_register+0xa4/0x180 [ 3.444994] __pci_register_driver+0x77/0x80 [ 3.444994] ? uPD98402_module_init+0xd/0xd [ 3.444994] nicstar_init+0x1f/0x75 [ 3.444994] do_one_initcall+0x7a/0x3d0 [ 3.444994] ? rdinit_setup+0x40/0x40 [ 3.444994] ? rcu_read_lock_sched_held+0x4a/0x70 [ 3.444994] kernel_init_freeable+0x2a7/0x2f9 [ 3.444994] ? rest_init+0x2c0/0x2c0 [ 3.444994] kernel_init+0x13/0x180 [ 3.444994] ? rest_init+0x2c0/0x2c0 [ 3.444994] ? rest_init+0x2c0/0x2c0 [ 3.444994] ret_from_fork+0x1f/0x30 [ 3.444994] Dumping ftrace buffer: [ 3.444994] (ftrace buffer empty) [ 3.444994] Kernel Offset: disabled [ 3.444994] Rebooting in 1 seconds.. Signed-off-by: Zheyu Ma <[email protected]> Signed-off-by: David S. Miller <[email protected]> Signed-off-by: Sasha Levin <[email protected]>

[ Upstream commit ebf7f6f ] In the current code, the actual max tail call count is 33 which is greater than MAX_TAIL_CALL_CNT (defined as 32). The actual limit is not consistent with the meaning of MAX_TAIL_CALL_CNT and thus confusing at first glance. We can see the historical evolution from commit 04fd61a ("bpf: allow bpf programs to tail-call other bpf programs") and commit f9dabe0 ("bpf: Undo off-by-one in interpreter tail call count limit"). In order to avoid changing existing behavior, the actual limit is 33 now, this is reasonable. After commit 874be05 ("bpf, tests: Add tail call test suite"), we can see there exists failed testcase. On all archs when CONFIG_BPF_JIT_ALWAYS_ON is not set: # echo 0 > /proc/sys/net/core/bpf_jit_enable # modprobe test_bpf # dmesg | grep -w FAIL Tail call error path, max count reached jited:0 ret 34 != 33 FAIL On some archs: # echo 1 > /proc/sys/net/core/bpf_jit_enable # modprobe test_bpf # dmesg | grep -w FAIL Tail call error path, max count reached jited:1 ret 34 != 33 FAIL Although the above failed testcase has been fixed in commit 18935a7 ("bpf/tests: Fix error in tail call limit tests"), it would still be good to change the value of MAX_TAIL_CALL_CNT from 32 to 33 to make the code more readable. The 32-bit x86 JIT was using a limit of 32, just fix the wrong comments and limit to 33 tail calls as the constant MAX_TAIL_CALL_CNT updated. For the mips64 JIT, use "ori" instead of "addiu" as suggested by Johan Almbladh. For the riscv JIT, use RV_REG_TCC directly to save one register move as suggested by Björn Töpel. For the other implementations, no function changes, it does not change the current limit 33, the new value of MAX_TAIL_CALL_CNT can reflect the actual max tail call count, the related tail call testcases in test_bpf module and selftests can work well for the interpreter and the JIT. Here are the test results on x86_64: # uname -m x86_64 # echo 0 > /proc/sys/net/core/bpf_jit_enable # modprobe test_bpf test_suite=test_tail_calls # dmesg | tail -1 test_bpf: test_tail_calls: Summary: 8 PASSED, 0 FAILED, [0/8 JIT'ed] # rmmod test_bpf # echo 1 > /proc/sys/net/core/bpf_jit_enable # modprobe test_bpf test_suite=test_tail_calls # dmesg | tail -1 test_bpf: test_tail_calls: Summary: 8 PASSED, 0 FAILED, [8/8 JIT'ed] # rmmod test_bpf # ./test_progs -t tailcalls #142 tailcalls:OK Summary: 1/11 PASSED, 0 SKIPPED, 0 FAILED Signed-off-by: Tiezhu Yang <[email protected]> Signed-off-by: Daniel Borkmann <[email protected]> Tested-by: Johan Almbladh <[email protected]> Tested-by: Ilya Leoshkevich <[email protected]> Acked-by: Björn Töpel <[email protected]> Acked-by: Johan Almbladh <[email protected]> Acked-by: Ilya Leoshkevich <[email protected]> Link: https://lore.kernel.org/bpf/[email protected] Signed-off-by: Sasha Levin <[email protected]>

commit cb3543c upstream. When updating the operating mode as part of regulator enable, the caller has already locked the regulator tree and drms_uA_update() must not try to do the same in order not to trigger a deadlock. The lock inversion is reported by lockdep as: ====================================================== WARNING: possible circular locking dependency detected 6.1.0-next-20221215 raspberrypi#142 Not tainted ------------------------------------------------------ udevd/154 is trying to acquire lock: ffffc11f123d7e50 (regulator_list_mutex){+.+.}-{3:3}, at: regulator_lock_dependent+0x54/0x280 but task is already holding lock: ffff80000e4c36e8 (regulator_ww_class_acquire){+.+.}-{0:0}, at: regulator_enable+0x34/0x80 which lock already depends on the new lock. ... Possible unsafe locking scenario: CPU0 CPU1 ---- ---- lock(regulator_ww_class_acquire); lock(regulator_list_mutex); lock(regulator_ww_class_acquire); lock(regulator_list_mutex); *** DEADLOCK *** just before probe of a Qualcomm UFS controller (occasionally) deadlocks when enabling one of its regulators. Fixes: 9243a19 ("regulator: core: Change voltage setting path") Fixes: f8702f9 ("regulator: core: Use ww_mutex for regulators locking") Cc: [email protected] # 5.0 Signed-off-by: Johan Hovold <[email protected]> Link: https://lore.kernel.org/r/[email protected] Signed-off-by: Mark Brown <[email protected]> Signed-off-by: Greg Kroah-Hartman <[email protected]>

commit cb3543c upstream. When updating the operating mode as part of regulator enable, the caller has already locked the regulator tree and drms_uA_update() must not try to do the same in order not to trigger a deadlock. The lock inversion is reported by lockdep as: ====================================================== WARNING: possible circular locking dependency detected 6.1.0-next-20221215 #142 Not tainted ------------------------------------------------------ udevd/154 is trying to acquire lock: ffffc11f123d7e50 (regulator_list_mutex){+.+.}-{3:3}, at: regulator_lock_dependent+0x54/0x280 but task is already holding lock: ffff80000e4c36e8 (regulator_ww_class_acquire){+.+.}-{0:0}, at: regulator_enable+0x34/0x80 which lock already depends on the new lock. ... Possible unsafe locking scenario: CPU0 CPU1 ---- ---- lock(regulator_ww_class_acquire); lock(regulator_list_mutex); lock(regulator_ww_class_acquire); lock(regulator_list_mutex); *** DEADLOCK *** just before probe of a Qualcomm UFS controller (occasionally) deadlocks when enabling one of its regulators. Fixes: 9243a19 ("regulator: core: Change voltage setting path") Fixes: f8702f9 ("regulator: core: Use ww_mutex for regulators locking") Cc: [email protected] # 5.0 Signed-off-by: Johan Hovold <[email protected]> Link: https://lore.kernel.org/r/[email protected] Signed-off-by: Mark Brown <[email protected]> Signed-off-by: Greg Kroah-Hartman <[email protected]>

[ Upstream commit 9e14606 ] Since limited tracking device per condition, this feature is to support tracking multiple devices concurrently. When a pattern monitor detects the device, this feature issues an address monitor for tracking that device. Let pattern monitor can keep monitor new devices. This feature adds an address filter when receiving a LE monitor device event which monitor handle is for a pattern, and the controller started monitoring the device. And this feature also has cancelled the monitor advertisement from address filters when receiving a LE monitor device event when the controller stopped monitoring the device specified by an address and monitor handle. Below is an example to know the feature adds the address filter. //Add MSFT pattern monitor < HCI Command: Vendor (0x3f|0x00f0) plen 14 #142 [hci0] 55.552420 03 b8 a4 03 ff 01 01 06 09 05 5f 52 45 46 .........._REF > HCI Event: Command Complete (0x0e) plen 6 #143 [hci0] 55.653960 Vendor (0x3f|0x00f0) ncmd 2 Status: Success (0x00) 03 00 //Got event from the pattern monitor > HCI Event: Vendor (0xff) plen 18 #148 [hci0] 58.384953 23 79 54 33 77 88 97 68 02 00 fb c1 29 eb 27 b8 #yT3w..h....).'. 00 01 .. //Add MSFT address monitor (Sample address: B8:27:EB:29:C1:FB) < HCI Command: Vendor (0x3f|0x00f0) plen 13 #149 [hci0] 58.385067 03 b8 a4 03 ff 04 00 fb c1 29 eb 27 b8 .........).'. //Report to userspace about found device (ADV Monitor Device Found) @ MGMT Event: Unknown (0x002f) plen 38 {0x0003} [hci0] 58.680042 01 00 fb c1 29 eb 27 b8 01 ce 00 00 00 00 16 00 ....).'......... 0a 09 4b 45 59 42 44 5f 52 45 46 02 01 06 03 19 ..KEYBD_REF..... c1 03 03 03 12 18 ...... //Got event from address monitor > HCI Event: Vendor (0xff) plen 18 #152 [hci0] 58.672956 23 79 54 33 77 88 97 68 02 00 fb c1 29 eb 27 b8 #yT3w..h....).'. 01 01 Signed-off-by: Alex Lu <[email protected]> Signed-off-by: Hilda Wu <[email protected]> Reviewed-by: Simon Horman <[email protected]> Signed-off-by: Luiz Augusto von Dentz <[email protected]> Stable-dep-of: 253f339 ("Bluetooth: HCI: Introduce HCI_QUIRK_BROKEN_LE_CODED") Signed-off-by: Sasha Levin <[email protected]>

[ Upstream commit 7d42e09 ] During the PCI AER system's error recovery process, the kernel driver may encounter a race condition with freeing the reset_data structure's memory. If the device restart will take more than 10 seconds the function scheduling that restart will exit due to a timeout, and the reset_data structure will be freed. However, this data structure is used for completion notification after the restart is completed, which leads to a UAF bug. This results in a KFENCE bug notice. BUG: KFENCE: use-after-free read in adf_device_reset_worker+0x38/0xa0 [intel_qat] Use-after-free read at 0x00000000bc56fddf (in kfence-#142): adf_device_reset_worker+0x38/0xa0 [intel_qat] process_one_work+0x173/0x340 To resolve this race condition, the memory associated to the container of the work_struct is freed on the worker if the timeout expired, otherwise on the function that schedules the worker. The timeout detection can be done by checking if the caller is still waiting for completion or not by using completion_done() function. Fixes: d8cba25 ("crypto: qat - Intel(R) QAT driver framework") Cc: <[email protected]> Signed-off-by: Damian Muszynski <[email protected]> Reviewed-by: Giovanni Cabiddu <[email protected]> Signed-off-by: Herbert Xu <[email protected]> Signed-off-by: Sasha Levin <[email protected]>

[ Upstream commit e7f3187 ] Recently, when running './test_progs -j', I occasionally hit the following errors: test_lwt_redirect:PASS:pthread_create 0 nsec test_lwt_redirect_run:FAIL:netns_create unexpected error: 256 (errno 0) #142/2 lwt_redirect/lwt_redirect_normal_nomac:FAIL #142 lwt_redirect:FAIL test_lwt_reroute:PASS:pthread_create 0 nsec test_lwt_reroute_run:FAIL:netns_create unexpected error: 256 (errno 0) test_lwt_reroute:PASS:pthread_join 0 nsec #143/2 lwt_reroute/lwt_reroute_qdisc_dropped:FAIL #143 lwt_reroute:FAIL The netns_create() definition looks like below: #define NETNS "ns_lwt" static inline int netns_create(void) { return system("ip netns add " NETNS); } One possibility is that both lwt_redirect and lwt_reroute create netns with the same name "ns_lwt" which may cause conflict. I tried the following example: $ sudo ip netns add abc $ echo $? 0 $ sudo ip netns add abc Cannot create namespace file "/var/run/netns/abc": File exists $ echo $? 1 $ The return code for above netns_create() is 256. The internet search suggests that the return value for 'ip netns add ns_lwt' is 1, which matches the above 'sudo ip netns add abc' example. This patch tried to use different netns names for two tests to avoid 'ip netns add <name>' failure. I ran './test_progs -j' 10 times and all succeeded with lwt_redirect/lwt_reroute tests. Signed-off-by: Yonghong Song <[email protected]> Signed-off-by: Andrii Nakryiko <[email protected]> Tested-by: Eduard Zingerman <[email protected]> Link: https://lore.kernel.org/bpf/[email protected] Stable-dep-of: 16b795c ("selftests/bpf: Fix redefinition errors compiling lwt_reroute.c") Signed-off-by: Sasha Levin <[email protected]>

msperl closed this as completed Oct 25, 2012

imgaolong mentioned this issue Sep 19, 2019

rt: dwcotg: USB Host corrupted the kernel in 32 bit mode #3243

Open

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

RPI-kernel branch 3.6.y not rebooting #142

RPI-kernel branch 3.6.y not rebooting #142

msperl commented Oct 24, 2012

popcornmix commented Oct 24, 2012

msperl commented Oct 24, 2012

popcornmix commented Oct 24, 2012

msperl commented Oct 24, 2012

popcornmix commented Oct 24, 2012

msperl commented Oct 24, 2012

msperl commented Oct 24, 2012

Ferroin commented Oct 24, 2012

msperl commented Oct 24, 2012

popcornmix commented Oct 24, 2012

msperl commented Oct 24, 2012

popcornmix commented Oct 24, 2012

Ferroin commented Oct 24, 2012

popcornmix commented Oct 24, 2012

licaon-kter commented Oct 24, 2012

msperl commented Oct 25, 2012

msperl commented Oct 25, 2012

msperl commented Oct 25, 2012

msperl commented Oct 25, 2012

RPI-kernel branch 3.6.y not rebooting #142

RPI-kernel branch 3.6.y not rebooting #142

Comments

msperl commented Oct 24, 2012

popcornmix commented Oct 24, 2012

msperl commented Oct 24, 2012

popcornmix commented Oct 24, 2012

msperl commented Oct 24, 2012

popcornmix commented Oct 24, 2012

msperl commented Oct 24, 2012

msperl commented Oct 24, 2012

Ferroin commented Oct 24, 2012

msperl commented Oct 24, 2012

popcornmix commented Oct 24, 2012

msperl commented Oct 24, 2012

popcornmix commented Oct 24, 2012

Ferroin commented Oct 24, 2012

popcornmix commented Oct 24, 2012

licaon-kter commented Oct 24, 2012

msperl commented Oct 25, 2012

msperl commented Oct 25, 2012

msperl commented Oct 25, 2012

msperl commented Oct 25, 2012