-
Notifications
You must be signed in to change notification settings - Fork 5.2k
Oops with new kernel when using NFS #88
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Comments
Please be more specific. Is the the default prebuilt kernel (or one you've built yourself)? |
I am using today's pre-build kernel 3.2.27+ #6 Did another update for good measure. |
Was that the first kernel that failed? |
I was using the previous kernels from the 3.1.9+ range without this NFS issue. I'll try the version you refer to if the update to 3.2.27+ #12 exhibits the NFS problem. |
So far 3.2.27+ #12 is holding on very well. Have not seen the nfs related oops again. |
Closing as issue appears fixed. |
The TIF_MSA_CTX_LIVE flag (indicating that a task has MSA context which needs to be preserved) was being cleared in start_thread, but the TIF_USEDMSA flag (indicating that a task has used MSA in this timeslice) was not. In copy_thread neither flag was cleared, but both need to be. Without clearing these flags the kernel will proceed to attempt to save MSA context when the task is context switched out, and if the task had not used MSA in the meantime then it will fail because MSA or the FPU are disabled. The end result is typically: do_cpu invoked from kernel context![#1]: CPU: 0 PID: 99 Comm: sh Not tainted 3.16.0-rc4-00025-g6dc9476-dirty #88 task: 8f23dc60 ti: 8f1d8000 task.ti: 8f1d8000 ... Call Trace: [<8010edbc>] resume+0x5c/0x280 [<80481e0c>] __schedule+0x370/0x800 [<80104838>] work_resched+0x8/0x2c Fix by consistently clearing both flags in both functions. Signed-off-by: Paul Burton <[email protected]> Cc: [email protected] Patchwork: https://patchwork.linux-mips.org/patch/7309/ Signed-off-by: Ralf Baechle <[email protected]>
fdo#70354 - comment #88. Signed-off-by: Ben Skeggs <[email protected]>
commit 1c7de2b upstream. There is at least one Chelsio 10Gb card which uses VPD area to store some non-standard blocks (example below). However pci_vpd_size() returns the length of the first block only assuming that there can be only one VPD "End Tag". Since 4e1a635 ("vfio/pci: Use kernel VPD access functions"), VFIO blocks access beyond that offset, which prevents the guest "cxgb3" driver from probing the device. The host system does not have this problem as its driver accesses the config space directly without pci_read_vpd(). Add a quirk to override the VPD size to a bigger value. The maximum size is taken from EEPROMSIZE in drivers/net/ethernet/chelsio/cxgb3/common.h. We do not read the tag as the cxgb3 driver does as the driver supports writing to EEPROM/VPD and when it writes, it only checks for 8192 bytes boundary. The quirk is registered for all devices supported by the cxgb3 driver. This adds a quirk to the PCI layer (not to the cxgb3 driver) as the cxgb3 driver itself accesses VPD directly and the problem only exists with the vfio-pci driver (when cxgb3 is not running on the host and may not be even loaded) which blocks accesses beyond the first block of VPD data. However vfio-pci itself does not have quirks mechanism so we add it to PCI. This is the controller: Ethernet controller [0200]: Chelsio Communications Inc T310 10GbE Single Port Adapter [1425:0030] This is what I parsed from its VPD: === b'\x82*\x0010 Gigabit Ethernet-SR PCI Express Adapter\x90J\x00EC\x07D76809 FN\x0746K' 0000 Large item 42 bytes; name 0x2 Identifier String b'10 Gigabit Ethernet-SR PCI Express Adapter' 002d Large item 74 bytes; name 0x10 #00 [EC] len=7: b'D76809 ' #0a [FN] len=7: b'46K7897' #14 [PN] len=7: b'46K7897' #1e [MN] len=4: b'1037' #25 [FC] len=4: b'5769' #2c [SN] len=12: b'YL102035603V' #3b [NA] len=12: b'00145E992ED1' 007a Small item 1 bytes; name 0xf End Tag 0c00 Large item 16 bytes; name 0x2 Identifier String b'S310E-SR-X ' 0c13 Large item 234 bytes; name 0x10 #00 [PN] len=16: b'TBD ' #13 [EC] len=16: b'110107730D2 ' #26 [SN] len=16: b'97YL102035603V ' #39 [NA] len=12: b'00145E992ED1' #48 [V0] len=6: b'175000' #51 [V1] len=6: b'266666' #5a [V2] len=6: b'266666' #63 [V3] len=6: b'2000 ' #6c [V4] len=2: b'1 ' #71 [V5] len=6: b'c2 ' #7a [V6] len=6: b'0 ' #83 [V7] len=2: b'1 ' #88 [V8] len=2: b'0 ' #8d [V9] len=2: b'0 ' #92 [VA] len=2: b'0 ' #97 [RV] len=80: b's\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00'... 0d00 Large item 252 bytes; name 0x11 #00 [VC] len=16: b'122310_1222 dp ' #13 [VD] len=16: b'610-0001-00 H1\x00\x00' #26 [VE] len=16: b'122310_1353 fp ' #39 [VF] len=16: b'610-0001-00 H1\x00\x00' #4c [RW] len=173: b'\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00'... 0dff Small item 0 bytes; name 0xf End Tag 10f3 Large item 13315 bytes; name 0x62 !!! unknown item name 98: b'\xd0\x03\x00@`\x0c\x08\x00\x00\x00\x00\x00\x00\x00\x00\x00' === Signed-off-by: Alexey Kardashevskiy <[email protected]> Signed-off-by: Bjorn Helgaas <[email protected]> Signed-off-by: Greg Kroah-Hartman <[email protected]>
[ Upstream commit 09c8b97 ] sysbot/KMSAN reported an uninit-value in recvmsg() that I tracked down to tipc_sk_set_orig_addr(), missing srcaddr->member.scope initialization. This patches moves srcaddr->sock.scope init to follow fields order and ease future verifications. BUG: KMSAN: uninit-value in copy_to_user include/linux/uaccess.h:184 [inline] BUG: KMSAN: uninit-value in move_addr_to_user+0x32e/0x530 net/socket.c:226 CPU: 0 PID: 4549 Comm: syz-executor287 Not tainted 4.17.0-rc3+ #88 Hardware name: Google Google Compute Engine/Google Compute Engine, BIOS Google 01/01/2011 Call Trace: __dump_stack lib/dump_stack.c:77 [inline] dump_stack+0x185/0x1d0 lib/dump_stack.c:113 kmsan_report+0x142/0x240 mm/kmsan/kmsan.c:1067 kmsan_internal_check_memory+0x135/0x1e0 mm/kmsan/kmsan.c:1157 kmsan_copy_to_user+0x69/0x160 mm/kmsan/kmsan.c:1199 copy_to_user include/linux/uaccess.h:184 [inline] move_addr_to_user+0x32e/0x530 net/socket.c:226 ___sys_recvmsg+0x4e2/0x810 net/socket.c:2285 __sys_recvmsg net/socket.c:2328 [inline] __do_sys_recvmsg net/socket.c:2338 [inline] __se_sys_recvmsg net/socket.c:2335 [inline] __x64_sys_recvmsg+0x325/0x460 net/socket.c:2335 do_syscall_64+0x154/0x220 arch/x86/entry/common.c:287 entry_SYSCALL_64_after_hwframe+0x44/0xa9 RIP: 0033:0x4455e9 RSP: 002b:00007fe3bd36ddb8 EFLAGS: 00000246 ORIG_RAX: 000000000000002f RAX: ffffffffffffffda RBX: 00000000006dac24 RCX: 00000000004455e9 RDX: 0000000000002002 RSI: 0000000020000400 RDI: 0000000000000003 RBP: 00000000006dac20 R08: 0000000000000000 R09: 0000000000000000 R10: 0000000000000000 R11: 0000000000000246 R12: 0000000000000000 R13: 00007fff98ce4b6f R14: 00007fe3bd36e9c0 R15: 0000000000000003 Local variable description: ----addr@___sys_recvmsg Variable was created at: ___sys_recvmsg+0xd5/0x810 net/socket.c:2246 __sys_recvmsg net/socket.c:2328 [inline] __do_sys_recvmsg net/socket.c:2338 [inline] __se_sys_recvmsg net/socket.c:2335 [inline] __x64_sys_recvmsg+0x325/0x460 net/socket.c:2335 Byte 19 of 32 is uninitialized Fixes: 31c82a2 ("tipc: add second source address to recvmsg()/recvfrom()") Signed-off-by: Eric Dumazet <[email protected]> Reported-by: syzbot <[email protected]> Cc: Jon Maloy <[email protected]> Cc: Ying Xue <[email protected]> Acked-by: Jon Maloy <[email protected]> Signed-off-by: David S. Miller <[email protected]> Signed-off-by: Greg Kroah-Hartman <[email protected]>
[ Upstream commit 153fcd5 ] brd_free() may be called in failure path on one brd instance which disk isn't added yet, so release handler of gendisk may free the associated request_queue early and causes the following use-after-free[1]. This patch fixes this issue by associating gendisk with request_queue just before adding disk. [1] KASAN: use-after-free Read in del_timer_syncNon-volatile memory driver v1.3 Linux agpgart interface v0.103 [drm] Initialized vgem 1.0.0 20120112 for virtual device on minor 0 usbcore: registered new interface driver udl ================================================================== BUG: KASAN: use-after-free in __lock_acquire+0x36d9/0x4c20 kernel/locking/lockdep.c:3218 Read of size 8 at addr ffff8801d1b6b540 by task swapper/0/1 CPU: 0 PID: 1 Comm: swapper/0 Not tainted 4.19.0+ #88 Hardware name: Google Google Compute Engine/Google Compute Engine, BIOS Google 01/01/2011 Call Trace: __dump_stack lib/dump_stack.c:77 [inline] dump_stack+0x244/0x39d lib/dump_stack.c:113 print_address_description.cold.7+0x9/0x1ff mm/kasan/report.c:256 kasan_report_error mm/kasan/report.c:354 [inline] kasan_report.cold.8+0x242/0x309 mm/kasan/report.c:412 __asan_report_load8_noabort+0x14/0x20 mm/kasan/report.c:433 __lock_acquire+0x36d9/0x4c20 kernel/locking/lockdep.c:3218 lock_acquire+0x1ed/0x520 kernel/locking/lockdep.c:3844 del_timer_sync+0xb7/0x270 kernel/time/timer.c:1283 blk_cleanup_queue+0x413/0x710 block/blk-core.c:809 brd_free+0x5d/0x71 drivers/block/brd.c:422 brd_init+0x2eb/0x393 drivers/block/brd.c:518 do_one_initcall+0x145/0x957 init/main.c:890 do_initcall_level init/main.c:958 [inline] do_initcalls init/main.c:966 [inline] do_basic_setup init/main.c:984 [inline] kernel_init_freeable+0x5c6/0x6b9 init/main.c:1148 kernel_init+0x11/0x1ae init/main.c:1068 ret_from_fork+0x3a/0x50 arch/x86/entry/entry_64.S:350 Reported-by: [email protected] Signed-off-by: Ming Lei <[email protected]> Signed-off-by: Jens Axboe <[email protected]> Signed-off-by: Sasha Levin <[email protected]>
When vb2_buffer_done is called the buffer is unbound from the request and put. The media_request_object_put also 'put's the request reference. If the application has already closed the request fd, then that means that the request reference at that point goes to 0 and the whole request is released. This means that the control handler associated with the request is also freed and that causes this kernel oops: [174705.995401] BUG: sleeping function called from invalid context at kernel/locking/mutex.c:908 [174705.995411] in_atomic(): 1, irqs_disabled(): 1, pid: 28071, name: vivid-000-vid-o [174705.995416] 2 locks held by vivid-000-vid-o/28071: [174705.995420] #0: 000000001ea3a232 (&dev->mutex#3){....}, at: vivid_thread_vid_out+0x3f5/0x550 [vivid] [174705.995447] #1: 00000000e30a0d1e (&(&q->done_lock)->rlock){....}, at: vb2_buffer_done+0x92/0x1d0 [videobuf2_common] [174705.995460] Preemption disabled at: [174705.995461] [<0000000000000000>] (null) [174705.995472] CPU: 11 PID: 28071 Comm: vivid-000-vid-o Tainted: G W 4.20.0-rc1-test-no #88 [174705.995476] Hardware name: VMware, Inc. VMware Virtual Platform/440BX Desktop Reference Platform, BIOS 6.00 05/19/2017 [174705.995481] Call Trace: [174705.995500] dump_stack+0x46/0x60 [174705.995512] ___might_sleep.cold.79+0xe1/0xf1 [174705.995523] __mutex_lock+0x50/0x8f0 [174705.995531] ? find_held_lock+0x2d/0x90 [174705.995536] ? find_held_lock+0x2d/0x90 [174705.995542] ? find_held_lock+0x2d/0x90 [174705.995564] ? v4l2_ctrl_handler_free.part.13+0x44/0x1d0 [videodev] [174705.995576] v4l2_ctrl_handler_free.part.13+0x44/0x1d0 [videodev] [174705.995590] v4l2_ctrl_request_release+0x1c/0x30 [videodev] [174705.995600] media_request_clean+0x64/0xe0 [media] [174705.995609] media_request_release+0x19/0x40 [media] [174705.995617] vb2_buffer_done+0xef/0x1d0 [videobuf2_common] [174705.995630] vivid_thread_vid_out+0x2c1/0x550 [vivid] [174705.995645] ? vivid_stop_generating_vid_cap+0x1c0/0x1c0 [vivid] [174705.995653] kthread+0x113/0x130 [174705.995659] ? kthread_park+0x80/0x80 [174705.995667] ret_from_fork+0x35/0x40 The vb2_buffer_done function can be called from interrupt context, so anything that sleeps is not allowed. The solution is to increment the request refcount when the buffer is queued and decrement it when the buffer is dequeued. Releasing the request is fine if that happens from VIDIOC_DQBUF. Signed-off-by: Hans Verkuil <[email protected]> Acked-by: Sakari Ailus <[email protected]> Signed-off-by: Hans Verkuil <[email protected]> Signed-off-by: Mauro Carvalho Chehab <[email protected]>
syzbot reported: BUG: KMSAN: uninit-value in tipc_conn_rcv_sub+0x184/0x950 net/tipc/topsrv.c:373 CPU: 0 PID: 66 Comm: kworker/u4:4 Not tainted 4.17.0-rc3+ #88 Hardware name: Google Google Compute Engine/Google Compute Engine, BIOS Google 01/01/2011 Workqueue: tipc_rcv tipc_conn_recv_work Call Trace: __dump_stack lib/dump_stack.c:77 [inline] dump_stack+0x185/0x1d0 lib/dump_stack.c:113 kmsan_report+0x142/0x240 mm/kmsan/kmsan.c:1067 __msan_warning_32+0x6c/0xb0 mm/kmsan/kmsan_instr.c:683 tipc_conn_rcv_sub+0x184/0x950 net/tipc/topsrv.c:373 tipc_conn_rcv_from_sock net/tipc/topsrv.c:409 [inline] tipc_conn_recv_work+0x3cd/0x560 net/tipc/topsrv.c:424 process_one_work+0x12c6/0x1f60 kernel/workqueue.c:2145 worker_thread+0x113c/0x24f0 kernel/workqueue.c:2279 kthread+0x539/0x720 kernel/kthread.c:239 ret_from_fork+0x35/0x40 arch/x86/entry/entry_64.S:412 Local variable description: ----s.i@tipc_conn_recv_work Variable was created at: tipc_conn_recv_work+0x65/0x560 net/tipc/topsrv.c:419 process_one_work+0x12c6/0x1f60 kernel/workqueue.c:2145 In tipc_conn_rcv_from_sock(), it always supposes the length of message received from sock_recvmsg() is not smaller than the size of struct tipc_subscr. However, this assumption is false. Especially when the length of received message is shorter than struct tipc_subscr size, we will end up touching uninitialized fields in tipc_conn_rcv_sub(). Reported-by: [email protected] Reported-by: [email protected] Signed-off-by: Ying Xue <[email protected]> Signed-off-by: David S. Miller <[email protected]>
commit a88289f upstream. syzbot reported: BUG: KMSAN: uninit-value in tipc_conn_rcv_sub+0x184/0x950 net/tipc/topsrv.c:373 CPU: 0 PID: 66 Comm: kworker/u4:4 Not tainted 4.17.0-rc3+ #88 Hardware name: Google Google Compute Engine/Google Compute Engine, BIOS Google 01/01/2011 Workqueue: tipc_rcv tipc_conn_recv_work Call Trace: __dump_stack lib/dump_stack.c:77 [inline] dump_stack+0x185/0x1d0 lib/dump_stack.c:113 kmsan_report+0x142/0x240 mm/kmsan/kmsan.c:1067 __msan_warning_32+0x6c/0xb0 mm/kmsan/kmsan_instr.c:683 tipc_conn_rcv_sub+0x184/0x950 net/tipc/topsrv.c:373 tipc_conn_rcv_from_sock net/tipc/topsrv.c:409 [inline] tipc_conn_recv_work+0x3cd/0x560 net/tipc/topsrv.c:424 process_one_work+0x12c6/0x1f60 kernel/workqueue.c:2145 worker_thread+0x113c/0x24f0 kernel/workqueue.c:2279 kthread+0x539/0x720 kernel/kthread.c:239 ret_from_fork+0x35/0x40 arch/x86/entry/entry_64.S:412 Local variable description: ----s.i@tipc_conn_recv_work Variable was created at: tipc_conn_recv_work+0x65/0x560 net/tipc/topsrv.c:419 process_one_work+0x12c6/0x1f60 kernel/workqueue.c:2145 In tipc_conn_rcv_from_sock(), it always supposes the length of message received from sock_recvmsg() is not smaller than the size of struct tipc_subscr. However, this assumption is false. Especially when the length of received message is shorter than struct tipc_subscr size, we will end up touching uninitialized fields in tipc_conn_rcv_sub(). Reported-by: [email protected] Reported-by: [email protected] Signed-off-by: Ying Xue <[email protected]> Signed-off-by: David S. Miller <[email protected]> Signed-off-by: Greg Kroah-Hartman <[email protected]>
commit a88289f upstream. syzbot reported: BUG: KMSAN: uninit-value in tipc_conn_rcv_sub+0x184/0x950 net/tipc/topsrv.c:373 CPU: 0 PID: 66 Comm: kworker/u4:4 Not tainted 4.17.0-rc3+ #88 Hardware name: Google Google Compute Engine/Google Compute Engine, BIOS Google 01/01/2011 Workqueue: tipc_rcv tipc_conn_recv_work Call Trace: __dump_stack lib/dump_stack.c:77 [inline] dump_stack+0x185/0x1d0 lib/dump_stack.c:113 kmsan_report+0x142/0x240 mm/kmsan/kmsan.c:1067 __msan_warning_32+0x6c/0xb0 mm/kmsan/kmsan_instr.c:683 tipc_conn_rcv_sub+0x184/0x950 net/tipc/topsrv.c:373 tipc_conn_rcv_from_sock net/tipc/topsrv.c:409 [inline] tipc_conn_recv_work+0x3cd/0x560 net/tipc/topsrv.c:424 process_one_work+0x12c6/0x1f60 kernel/workqueue.c:2145 worker_thread+0x113c/0x24f0 kernel/workqueue.c:2279 kthread+0x539/0x720 kernel/kthread.c:239 ret_from_fork+0x35/0x40 arch/x86/entry/entry_64.S:412 Local variable description: ----s.i@tipc_conn_recv_work Variable was created at: tipc_conn_recv_work+0x65/0x560 net/tipc/topsrv.c:419 process_one_work+0x12c6/0x1f60 kernel/workqueue.c:2145 In tipc_conn_rcv_from_sock(), it always supposes the length of message received from sock_recvmsg() is not smaller than the size of struct tipc_subscr. However, this assumption is false. Especially when the length of received message is shorter than struct tipc_subscr size, we will end up touching uninitialized fields in tipc_conn_rcv_sub(). Reported-by: [email protected] Reported-by: [email protected] Signed-off-by: Ying Xue <[email protected]> Signed-off-by: David S. Miller <[email protected]> Signed-off-by: Greg Kroah-Hartman <[email protected]>
syzbot was able to trigger another soft lockup [1] I first thought it was the O(N^2) issue I mentioned in my prior fix (f657d22ee1f "net/x25: do not hold the cpu too long in x25_new_lci()"), but I eventually found that x25_bind() was not checking SOCK_ZAPPED state under socket lock protection. This means that multiple threads can end up calling x25_insert_socket() for the same socket, and corrupt x25_list [1] watchdog: BUG: soft lockup - CPU#0 stuck for 123s! [syz-executor.2:10492] Modules linked in: irq event stamp: 27515 hardirqs last enabled at (27514): [<ffffffff81006673>] trace_hardirqs_on_thunk+0x1a/0x1c hardirqs last disabled at (27515): [<ffffffff8100668f>] trace_hardirqs_off_thunk+0x1a/0x1c softirqs last enabled at (32): [<ffffffff8632ee73>] x25_get_neigh+0xa3/0xd0 net/x25/x25_link.c:336 softirqs last disabled at (34): [<ffffffff86324bc3>] x25_find_socket+0x23/0x140 net/x25/af_x25.c:341 CPU: 0 PID: 10492 Comm: syz-executor.2 Not tainted 5.0.0-rc7+ #88 Hardware name: Google Google Compute Engine/Google Compute Engine, BIOS Google 01/01/2011 RIP: 0010:__sanitizer_cov_trace_pc+0x4/0x50 kernel/kcov.c:97 Code: f4 ff ff ff e8 11 9f ea ff 48 c7 05 12 fb e5 08 00 00 00 00 e9 c8 e9 ff ff 90 90 90 90 90 90 90 90 90 90 90 90 90 55 48 89 e5 <48> 8b 75 08 65 48 8b 04 25 40 ee 01 00 65 8b 15 38 0c 92 7e 81 e2 RSP: 0018:ffff88806e94fc48 EFLAGS: 00000286 ORIG_RAX: ffffffffffffff13 RAX: 1ffff1100d84dac5 RBX: 0000000000000001 RCX: ffffc90006197000 RDX: 0000000000040000 RSI: ffffffff86324bf3 RDI: ffff88806c26d628 RBP: ffff88806e94fc48 R08: ffff88806c1c6500 R09: fffffbfff1282561 R10: fffffbfff1282560 R11: ffffffff89412b03 R12: ffff88806c26d628 R13: ffff888090455200 R14: dffffc0000000000 R15: 0000000000000000 FS: 00007f3a107e4700(0000) GS:ffff8880ae800000(0000) knlGS:0000000000000000 CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033 CR2: 00007f3a107e3db8 CR3: 00000000a5544000 CR4: 00000000001406f0 DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000 DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400 Call Trace: __x25_find_socket net/x25/af_x25.c:327 [inline] x25_find_socket+0x7d/0x140 net/x25/af_x25.c:342 x25_new_lci net/x25/af_x25.c:355 [inline] x25_connect+0x380/0xde0 net/x25/af_x25.c:784 __sys_connect+0x266/0x330 net/socket.c:1662 __do_sys_connect net/socket.c:1673 [inline] __se_sys_connect net/socket.c:1670 [inline] __x64_sys_connect+0x73/0xb0 net/socket.c:1670 do_syscall_64+0x103/0x610 arch/x86/entry/common.c:290 entry_SYSCALL_64_after_hwframe+0x49/0xbe RIP: 0033:0x457e29 Code: ad b8 fb ff c3 66 2e 0f 1f 84 00 00 00 00 00 66 90 48 89 f8 48 89 f7 48 89 d6 48 89 ca 4d 89 c2 4d 89 c8 4c 8b 4c 24 08 0f 05 <48> 3d 01 f0 ff ff 0f 83 7b b8 fb ff c3 66 2e 0f 1f 84 00 00 00 00 RSP: 002b:00007f3a107e3c78 EFLAGS: 00000246 ORIG_RAX: 000000000000002a RAX: ffffffffffffffda RBX: 0000000000000003 RCX: 0000000000457e29 RDX: 0000000000000012 RSI: 0000000020000200 RDI: 0000000000000005 RBP: 000000000073c040 R08: 0000000000000000 R09: 0000000000000000 R10: 0000000000000000 R11: 0000000000000246 R12: 00007f3a107e46d4 R13: 00000000004be362 R14: 00000000004ceb98 R15: 00000000ffffffff Sending NMI from CPU 0 to CPUs 1: NMI backtrace for cpu 1 CPU: 1 PID: 10493 Comm: syz-executor.3 Not tainted 5.0.0-rc7+ #88 Hardware name: Google Google Compute Engine/Google Compute Engine, BIOS Google 01/01/2011 RIP: 0010:__read_once_size include/linux/compiler.h:193 [inline] RIP: 0010:queued_write_lock_slowpath+0x143/0x290 kernel/locking/qrwlock.c:86 Code: 4c 8d 2c 01 41 83 c7 03 41 0f b6 45 00 41 38 c7 7c 08 84 c0 0f 85 0c 01 00 00 8b 03 3d 00 01 00 00 74 1a f3 90 41 0f b6 55 00 <41> 38 d7 7c eb 84 d2 74 e7 48 89 df e8 cc aa 4e 00 eb dd be 04 00 RSP: 0018:ffff888085c47bd8 EFLAGS: 00000206 RAX: 0000000000000300 RBX: ffffffff89412b00 RCX: 1ffffffff1282560 RDX: 0000000000000000 RSI: 0000000000000004 RDI: ffffffff89412b00 RBP: ffff888085c47c70 R08: 1ffffffff1282560 R09: fffffbfff1282561 R10: fffffbfff1282560 R11: ffffffff89412b03 R12: 00000000000000ff R13: fffffbfff1282560 R14: 1ffff11010b88f7d R15: 0000000000000003 FS: 00007fdd04086700(0000) GS:ffff8880ae900000(0000) knlGS:0000000000000000 CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033 CR2: 00007fdd04064db8 CR3: 0000000090be0000 CR4: 00000000001406e0 DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000 DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400 Call Trace: queued_write_lock include/asm-generic/qrwlock.h:104 [inline] do_raw_write_lock+0x1d6/0x290 kernel/locking/spinlock_debug.c:203 __raw_write_lock_bh include/linux/rwlock_api_smp.h:204 [inline] _raw_write_lock_bh+0x3b/0x50 kernel/locking/spinlock.c:312 x25_insert_socket+0x21/0xe0 net/x25/af_x25.c:267 x25_bind+0x273/0x340 net/x25/af_x25.c:703 __sys_bind+0x23f/0x290 net/socket.c:1481 __do_sys_bind net/socket.c:1492 [inline] __se_sys_bind net/socket.c:1490 [inline] __x64_sys_bind+0x73/0xb0 net/socket.c:1490 do_syscall_64+0x103/0x610 arch/x86/entry/common.c:290 entry_SYSCALL_64_after_hwframe+0x49/0xbe RIP: 0033:0x457e29 Fixes: 90c2729 ("X.25 remove bkl in bind") Signed-off-by: Eric Dumazet <[email protected]> Cc: andrew hendry <[email protected]> Signed-off-by: David S. Miller <[email protected]>
[ Upstream commit 797a22b ] syzbot was able to trigger another soft lockup [1] I first thought it was the O(N^2) issue I mentioned in my prior fix (f657d22ee1f "net/x25: do not hold the cpu too long in x25_new_lci()"), but I eventually found that x25_bind() was not checking SOCK_ZAPPED state under socket lock protection. This means that multiple threads can end up calling x25_insert_socket() for the same socket, and corrupt x25_list [1] watchdog: BUG: soft lockup - CPU#0 stuck for 123s! [syz-executor.2:10492] Modules linked in: irq event stamp: 27515 hardirqs last enabled at (27514): [<ffffffff81006673>] trace_hardirqs_on_thunk+0x1a/0x1c hardirqs last disabled at (27515): [<ffffffff8100668f>] trace_hardirqs_off_thunk+0x1a/0x1c softirqs last enabled at (32): [<ffffffff8632ee73>] x25_get_neigh+0xa3/0xd0 net/x25/x25_link.c:336 softirqs last disabled at (34): [<ffffffff86324bc3>] x25_find_socket+0x23/0x140 net/x25/af_x25.c:341 CPU: 0 PID: 10492 Comm: syz-executor.2 Not tainted 5.0.0-rc7+ #88 Hardware name: Google Google Compute Engine/Google Compute Engine, BIOS Google 01/01/2011 RIP: 0010:__sanitizer_cov_trace_pc+0x4/0x50 kernel/kcov.c:97 Code: f4 ff ff ff e8 11 9f ea ff 48 c7 05 12 fb e5 08 00 00 00 00 e9 c8 e9 ff ff 90 90 90 90 90 90 90 90 90 90 90 90 90 55 48 89 e5 <48> 8b 75 08 65 48 8b 04 25 40 ee 01 00 65 8b 15 38 0c 92 7e 81 e2 RSP: 0018:ffff88806e94fc48 EFLAGS: 00000286 ORIG_RAX: ffffffffffffff13 RAX: 1ffff1100d84dac5 RBX: 0000000000000001 RCX: ffffc90006197000 RDX: 0000000000040000 RSI: ffffffff86324bf3 RDI: ffff88806c26d628 RBP: ffff88806e94fc48 R08: ffff88806c1c6500 R09: fffffbfff1282561 R10: fffffbfff1282560 R11: ffffffff89412b03 R12: ffff88806c26d628 R13: ffff888090455200 R14: dffffc0000000000 R15: 0000000000000000 FS: 00007f3a107e4700(0000) GS:ffff8880ae800000(0000) knlGS:0000000000000000 CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033 CR2: 00007f3a107e3db8 CR3: 00000000a5544000 CR4: 00000000001406f0 DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000 DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400 Call Trace: __x25_find_socket net/x25/af_x25.c:327 [inline] x25_find_socket+0x7d/0x140 net/x25/af_x25.c:342 x25_new_lci net/x25/af_x25.c:355 [inline] x25_connect+0x380/0xde0 net/x25/af_x25.c:784 __sys_connect+0x266/0x330 net/socket.c:1662 __do_sys_connect net/socket.c:1673 [inline] __se_sys_connect net/socket.c:1670 [inline] __x64_sys_connect+0x73/0xb0 net/socket.c:1670 do_syscall_64+0x103/0x610 arch/x86/entry/common.c:290 entry_SYSCALL_64_after_hwframe+0x49/0xbe RIP: 0033:0x457e29 Code: ad b8 fb ff c3 66 2e 0f 1f 84 00 00 00 00 00 66 90 48 89 f8 48 89 f7 48 89 d6 48 89 ca 4d 89 c2 4d 89 c8 4c 8b 4c 24 08 0f 05 <48> 3d 01 f0 ff ff 0f 83 7b b8 fb ff c3 66 2e 0f 1f 84 00 00 00 00 RSP: 002b:00007f3a107e3c78 EFLAGS: 00000246 ORIG_RAX: 000000000000002a RAX: ffffffffffffffda RBX: 0000000000000003 RCX: 0000000000457e29 RDX: 0000000000000012 RSI: 0000000020000200 RDI: 0000000000000005 RBP: 000000000073c040 R08: 0000000000000000 R09: 0000000000000000 R10: 0000000000000000 R11: 0000000000000246 R12: 00007f3a107e46d4 R13: 00000000004be362 R14: 00000000004ceb98 R15: 00000000ffffffff Sending NMI from CPU 0 to CPUs 1: NMI backtrace for cpu 1 CPU: 1 PID: 10493 Comm: syz-executor.3 Not tainted 5.0.0-rc7+ #88 Hardware name: Google Google Compute Engine/Google Compute Engine, BIOS Google 01/01/2011 RIP: 0010:__read_once_size include/linux/compiler.h:193 [inline] RIP: 0010:queued_write_lock_slowpath+0x143/0x290 kernel/locking/qrwlock.c:86 Code: 4c 8d 2c 01 41 83 c7 03 41 0f b6 45 00 41 38 c7 7c 08 84 c0 0f 85 0c 01 00 00 8b 03 3d 00 01 00 00 74 1a f3 90 41 0f b6 55 00 <41> 38 d7 7c eb 84 d2 74 e7 48 89 df e8 cc aa 4e 00 eb dd be 04 00 RSP: 0018:ffff888085c47bd8 EFLAGS: 00000206 RAX: 0000000000000300 RBX: ffffffff89412b00 RCX: 1ffffffff1282560 RDX: 0000000000000000 RSI: 0000000000000004 RDI: ffffffff89412b00 RBP: ffff888085c47c70 R08: 1ffffffff1282560 R09: fffffbfff1282561 R10: fffffbfff1282560 R11: ffffffff89412b03 R12: 00000000000000ff R13: fffffbfff1282560 R14: 1ffff11010b88f7d R15: 0000000000000003 FS: 00007fdd04086700(0000) GS:ffff8880ae900000(0000) knlGS:0000000000000000 CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033 CR2: 00007fdd04064db8 CR3: 0000000090be0000 CR4: 00000000001406e0 DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000 DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400 Call Trace: queued_write_lock include/asm-generic/qrwlock.h:104 [inline] do_raw_write_lock+0x1d6/0x290 kernel/locking/spinlock_debug.c:203 __raw_write_lock_bh include/linux/rwlock_api_smp.h:204 [inline] _raw_write_lock_bh+0x3b/0x50 kernel/locking/spinlock.c:312 x25_insert_socket+0x21/0xe0 net/x25/af_x25.c:267 x25_bind+0x273/0x340 net/x25/af_x25.c:703 __sys_bind+0x23f/0x290 net/socket.c:1481 __do_sys_bind net/socket.c:1492 [inline] __se_sys_bind net/socket.c:1490 [inline] __x64_sys_bind+0x73/0xb0 net/socket.c:1490 do_syscall_64+0x103/0x610 arch/x86/entry/common.c:290 entry_SYSCALL_64_after_hwframe+0x49/0xbe RIP: 0033:0x457e29 Fixes: 90c2729 ("X.25 remove bkl in bind") Signed-off-by: Eric Dumazet <[email protected]> Cc: andrew hendry <[email protected]> Signed-off-by: David S. Miller <[email protected]> Signed-off-by: Greg Kroah-Hartman <[email protected]>
[ Upstream commit 797a22b ] syzbot was able to trigger another soft lockup [1] I first thought it was the O(N^2) issue I mentioned in my prior fix (f657d22ee1f "net/x25: do not hold the cpu too long in x25_new_lci()"), but I eventually found that x25_bind() was not checking SOCK_ZAPPED state under socket lock protection. This means that multiple threads can end up calling x25_insert_socket() for the same socket, and corrupt x25_list [1] watchdog: BUG: soft lockup - CPU#0 stuck for 123s! [syz-executor.2:10492] Modules linked in: irq event stamp: 27515 hardirqs last enabled at (27514): [<ffffffff81006673>] trace_hardirqs_on_thunk+0x1a/0x1c hardirqs last disabled at (27515): [<ffffffff8100668f>] trace_hardirqs_off_thunk+0x1a/0x1c softirqs last enabled at (32): [<ffffffff8632ee73>] x25_get_neigh+0xa3/0xd0 net/x25/x25_link.c:336 softirqs last disabled at (34): [<ffffffff86324bc3>] x25_find_socket+0x23/0x140 net/x25/af_x25.c:341 CPU: 0 PID: 10492 Comm: syz-executor.2 Not tainted 5.0.0-rc7+ #88 Hardware name: Google Google Compute Engine/Google Compute Engine, BIOS Google 01/01/2011 RIP: 0010:__sanitizer_cov_trace_pc+0x4/0x50 kernel/kcov.c:97 Code: f4 ff ff ff e8 11 9f ea ff 48 c7 05 12 fb e5 08 00 00 00 00 e9 c8 e9 ff ff 90 90 90 90 90 90 90 90 90 90 90 90 90 55 48 89 e5 <48> 8b 75 08 65 48 8b 04 25 40 ee 01 00 65 8b 15 38 0c 92 7e 81 e2 RSP: 0018:ffff88806e94fc48 EFLAGS: 00000286 ORIG_RAX: ffffffffffffff13 RAX: 1ffff1100d84dac5 RBX: 0000000000000001 RCX: ffffc90006197000 RDX: 0000000000040000 RSI: ffffffff86324bf3 RDI: ffff88806c26d628 RBP: ffff88806e94fc48 R08: ffff88806c1c6500 R09: fffffbfff1282561 R10: fffffbfff1282560 R11: ffffffff89412b03 R12: ffff88806c26d628 R13: ffff888090455200 R14: dffffc0000000000 R15: 0000000000000000 FS: 00007f3a107e4700(0000) GS:ffff8880ae800000(0000) knlGS:0000000000000000 CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033 CR2: 00007f3a107e3db8 CR3: 00000000a5544000 CR4: 00000000001406f0 DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000 DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400 Call Trace: __x25_find_socket net/x25/af_x25.c:327 [inline] x25_find_socket+0x7d/0x140 net/x25/af_x25.c:342 x25_new_lci net/x25/af_x25.c:355 [inline] x25_connect+0x380/0xde0 net/x25/af_x25.c:784 __sys_connect+0x266/0x330 net/socket.c:1662 __do_sys_connect net/socket.c:1673 [inline] __se_sys_connect net/socket.c:1670 [inline] __x64_sys_connect+0x73/0xb0 net/socket.c:1670 do_syscall_64+0x103/0x610 arch/x86/entry/common.c:290 entry_SYSCALL_64_after_hwframe+0x49/0xbe RIP: 0033:0x457e29 Code: ad b8 fb ff c3 66 2e 0f 1f 84 00 00 00 00 00 66 90 48 89 f8 48 89 f7 48 89 d6 48 89 ca 4d 89 c2 4d 89 c8 4c 8b 4c 24 08 0f 05 <48> 3d 01 f0 ff ff 0f 83 7b b8 fb ff c3 66 2e 0f 1f 84 00 00 00 00 RSP: 002b:00007f3a107e3c78 EFLAGS: 00000246 ORIG_RAX: 000000000000002a RAX: ffffffffffffffda RBX: 0000000000000003 RCX: 0000000000457e29 RDX: 0000000000000012 RSI: 0000000020000200 RDI: 0000000000000005 RBP: 000000000073c040 R08: 0000000000000000 R09: 0000000000000000 R10: 0000000000000000 R11: 0000000000000246 R12: 00007f3a107e46d4 R13: 00000000004be362 R14: 00000000004ceb98 R15: 00000000ffffffff Sending NMI from CPU 0 to CPUs 1: NMI backtrace for cpu 1 CPU: 1 PID: 10493 Comm: syz-executor.3 Not tainted 5.0.0-rc7+ #88 Hardware name: Google Google Compute Engine/Google Compute Engine, BIOS Google 01/01/2011 RIP: 0010:__read_once_size include/linux/compiler.h:193 [inline] RIP: 0010:queued_write_lock_slowpath+0x143/0x290 kernel/locking/qrwlock.c:86 Code: 4c 8d 2c 01 41 83 c7 03 41 0f b6 45 00 41 38 c7 7c 08 84 c0 0f 85 0c 01 00 00 8b 03 3d 00 01 00 00 74 1a f3 90 41 0f b6 55 00 <41> 38 d7 7c eb 84 d2 74 e7 48 89 df e8 cc aa 4e 00 eb dd be 04 00 RSP: 0018:ffff888085c47bd8 EFLAGS: 00000206 RAX: 0000000000000300 RBX: ffffffff89412b00 RCX: 1ffffffff1282560 RDX: 0000000000000000 RSI: 0000000000000004 RDI: ffffffff89412b00 RBP: ffff888085c47c70 R08: 1ffffffff1282560 R09: fffffbfff1282561 R10: fffffbfff1282560 R11: ffffffff89412b03 R12: 00000000000000ff R13: fffffbfff1282560 R14: 1ffff11010b88f7d R15: 0000000000000003 FS: 00007fdd04086700(0000) GS:ffff8880ae900000(0000) knlGS:0000000000000000 CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033 CR2: 00007fdd04064db8 CR3: 0000000090be0000 CR4: 00000000001406e0 DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000 DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400 Call Trace: queued_write_lock include/asm-generic/qrwlock.h:104 [inline] do_raw_write_lock+0x1d6/0x290 kernel/locking/spinlock_debug.c:203 __raw_write_lock_bh include/linux/rwlock_api_smp.h:204 [inline] _raw_write_lock_bh+0x3b/0x50 kernel/locking/spinlock.c:312 x25_insert_socket+0x21/0xe0 net/x25/af_x25.c:267 x25_bind+0x273/0x340 net/x25/af_x25.c:703 __sys_bind+0x23f/0x290 net/socket.c:1481 __do_sys_bind net/socket.c:1492 [inline] __se_sys_bind net/socket.c:1490 [inline] __x64_sys_bind+0x73/0xb0 net/socket.c:1490 do_syscall_64+0x103/0x610 arch/x86/entry/common.c:290 entry_SYSCALL_64_after_hwframe+0x49/0xbe RIP: 0033:0x457e29 Fixes: 90c2729 ("X.25 remove bkl in bind") Signed-off-by: Eric Dumazet <[email protected]> Cc: andrew hendry <[email protected]> Signed-off-by: David S. Miller <[email protected]> Signed-off-by: Greg Kroah-Hartman <[email protected]>
[ Upstream commit 73e6ff7 ] Super-IO accesses may fail on a system with no or unmapped LPC bus. Unable to handle kernel paging request at virtual address ffffffbffee0002e pgd = ffffffc1d68d4000 [ffffffbffee0002e] *pgd=0000000000000000, *pud=0000000000000000 Internal error: Oops: 94000046 [#1] PREEMPT SMP Modules linked in: f71805f(+) hwmon CPU: 3 PID: 1659 Comm: insmod Not tainted 4.5.0+ #88 Hardware name: linux,dummy-virt (DT) task: ffffffc1f6665400 ti: ffffffc1d6418000 task.ti: ffffffc1d6418000 PC is at f71805f_find+0x6c/0x358 [f71805f] Also, other drivers may attempt to access the LPC bus at the same time, resulting in undefined behavior. Use request_muxed_region() to ensure that IO access on the requested address space is supported, and to ensure that access by multiple drivers is synchronized. Fixes: e53004e ("hwmon: New f71805f driver") Reported-by: Kefeng Wang <[email protected]> Reported-by: John Garry <[email protected]> Cc: John Garry <[email protected]> Acked-by: John Garry <[email protected]> Signed-off-by: Guenter Roeck <[email protected]> Signed-off-by: Sasha Levin <[email protected]>
[ Upstream commit 73e6ff7 ] Super-IO accesses may fail on a system with no or unmapped LPC bus. Unable to handle kernel paging request at virtual address ffffffbffee0002e pgd = ffffffc1d68d4000 [ffffffbffee0002e] *pgd=0000000000000000, *pud=0000000000000000 Internal error: Oops: 94000046 [#1] PREEMPT SMP Modules linked in: f71805f(+) hwmon CPU: 3 PID: 1659 Comm: insmod Not tainted 4.5.0+ #88 Hardware name: linux,dummy-virt (DT) task: ffffffc1f6665400 ti: ffffffc1d6418000 task.ti: ffffffc1d6418000 PC is at f71805f_find+0x6c/0x358 [f71805f] Also, other drivers may attempt to access the LPC bus at the same time, resulting in undefined behavior. Use request_muxed_region() to ensure that IO access on the requested address space is supported, and to ensure that access by multiple drivers is synchronized. Fixes: e53004e ("hwmon: New f71805f driver") Reported-by: Kefeng Wang <[email protected]> Reported-by: John Garry <[email protected]> Cc: John Garry <[email protected]> Acked-by: John Garry <[email protected]> Signed-off-by: Guenter Roeck <[email protected]> Signed-off-by: Sasha Levin <[email protected]>
[ Upstream commit 73e6ff7 ] Super-IO accesses may fail on a system with no or unmapped LPC bus. Unable to handle kernel paging request at virtual address ffffffbffee0002e pgd = ffffffc1d68d4000 [ffffffbffee0002e] *pgd=0000000000000000, *pud=0000000000000000 Internal error: Oops: 94000046 [#1] PREEMPT SMP Modules linked in: f71805f(+) hwmon CPU: 3 PID: 1659 Comm: insmod Not tainted 4.5.0+ #88 Hardware name: linux,dummy-virt (DT) task: ffffffc1f6665400 ti: ffffffc1d6418000 task.ti: ffffffc1d6418000 PC is at f71805f_find+0x6c/0x358 [f71805f] Also, other drivers may attempt to access the LPC bus at the same time, resulting in undefined behavior. Use request_muxed_region() to ensure that IO access on the requested address space is supported, and to ensure that access by multiple drivers is synchronized. Fixes: e53004e ("hwmon: New f71805f driver") Reported-by: Kefeng Wang <[email protected]> Reported-by: John Garry <[email protected]> Cc: John Garry <[email protected]> Acked-by: John Garry <[email protected]> Signed-off-by: Guenter Roeck <[email protected]> Signed-off-by: Sasha Levin <[email protected]>
[ Upstream commit 73e6ff7 ] Super-IO accesses may fail on a system with no or unmapped LPC bus. Unable to handle kernel paging request at virtual address ffffffbffee0002e pgd = ffffffc1d68d4000 [ffffffbffee0002e] *pgd=0000000000000000, *pud=0000000000000000 Internal error: Oops: 94000046 [raspberrypi#1] PREEMPT SMP Modules linked in: f71805f(+) hwmon CPU: 3 PID: 1659 Comm: insmod Not tainted 4.5.0+ raspberrypi#88 Hardware name: linux,dummy-virt (DT) task: ffffffc1f6665400 ti: ffffffc1d6418000 task.ti: ffffffc1d6418000 PC is at f71805f_find+0x6c/0x358 [f71805f] Also, other drivers may attempt to access the LPC bus at the same time, resulting in undefined behavior. Use request_muxed_region() to ensure that IO access on the requested address space is supported, and to ensure that access by multiple drivers is synchronized. Fixes: e53004e ("hwmon: New f71805f driver") Reported-by: Kefeng Wang <[email protected]> Reported-by: John Garry <[email protected]> Cc: John Garry <[email protected]> Acked-by: John Garry <[email protected]> Signed-off-by: Guenter Roeck <[email protected]> Signed-off-by: Sasha Levin <[email protected]>
I got issue as follows: [ 567.094140] __io_remove_buffers: [1]start ctx=0xffff8881067bf000 bgid=65533 buf=0xffff8881fefe1680 [ 594.360799] watchdog: BUG: soft lockup - CPU#2 stuck for 26s! [kworker/u32:5:108] [ 594.364987] Modules linked in: [ 594.365405] irq event stamp: 604180238 [ 594.365906] hardirqs last enabled at (604180237): [<ffffffff93fec9bd>] _raw_spin_unlock_irqrestore+0x2d/0x50 [ 594.367181] hardirqs last disabled at (604180238): [<ffffffff93fbbadb>] sysvec_apic_timer_interrupt+0xb/0xc0 [ 594.368420] softirqs last enabled at (569080666): [<ffffffff94200654>] __do_softirq+0x654/0xa9e [ 594.369551] softirqs last disabled at (569080575): [<ffffffff913e1d6a>] irq_exit_rcu+0x1ca/0x250 [ 594.370692] CPU: 2 PID: 108 Comm: kworker/u32:5 Tainted: G L 5.15.0-next-20211112+ #88 [ 594.371891] Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS ?-20190727_073836-buildvm-ppc64le-16.ppc.fedoraproject.org-3.fc31 04/01/2014 [ 594.373604] Workqueue: events_unbound io_ring_exit_work [ 594.374303] RIP: 0010:_raw_spin_unlock_irqrestore+0x33/0x50 [ 594.375037] Code: 48 83 c7 18 53 48 89 f3 48 8b 74 24 10 e8 55 f5 55 fd 48 89 ef e8 ed a7 56 fd 80 e7 02 74 06 e8 43 13 7b fd fb bf 01 00 00 00 <e8> f8 78 474 [ 594.377433] RSP: 0018:ffff888101587a70 EFLAGS: 00000202 [ 594.378120] RAX: 0000000024030f0d RBX: 0000000000000246 RCX: 1ffffffff2f09106 [ 594.379053] RDX: 0000000000000000 RSI: ffffffff9449f0e0 RDI: 0000000000000001 [ 594.379991] RBP: ffffffff9586cdc0 R08: 0000000000000001 R09: fffffbfff2effcab [ 594.380923] R10: ffffffff977fe557 R11: fffffbfff2effcaa R12: ffff8881b8f3def0 [ 594.381858] R13: 0000000000000246 R14: ffff888153a8b070 R15: 0000000000000000 [ 594.382787] FS: 0000000000000000(0000) GS:ffff888399c00000(0000) knlGS:0000000000000000 [ 594.383851] CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033 [ 594.384602] CR2: 00007fcbe71d2000 CR3: 00000000b4216000 CR4: 00000000000006e0 [ 594.385540] DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000 [ 594.386474] DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400 [ 594.387403] Call Trace: [ 594.387738] <TASK> [ 594.388042] find_and_remove_object+0x118/0x160 [ 594.389321] delete_object_full+0xc/0x20 [ 594.389852] kfree+0x193/0x470 [ 594.390275] __io_remove_buffers.part.0+0xed/0x147 [ 594.390931] io_ring_ctx_free+0x342/0x6a2 [ 594.392159] io_ring_exit_work+0x41e/0x486 [ 594.396419] process_one_work+0x906/0x15a0 [ 594.399185] worker_thread+0x8b/0xd80 [ 594.400259] kthread+0x3bf/0x4a0 [ 594.401847] ret_from_fork+0x22/0x30 [ 594.402343] </TASK> Message from syslogd@localhost at Nov 13 09:09:54 ... kernel:watchdog: BUG: soft lockup - CPU#2 stuck for 26s! [kworker/u32:5:108] [ 596.793660] __io_remove_buffers: [2099199]start ctx=0xffff8881067bf000 bgid=65533 buf=0xffff8881fefe1680 We can reproduce this issue by follow syzkaller log: r0 = syz_io_uring_setup(0x401, &(0x7f0000000300), &(0x7f0000003000/0x2000)=nil, &(0x7f0000ff8000/0x4000)=nil, &(0x7f0000000280)=<r1=>0x0, &(0x7f0000000380)=<r2=>0x0) sendmsg$ETHTOOL_MSG_FEATURES_SET(0xffffffffffffffff, &(0x7f0000003080)={0x0, 0x0, &(0x7f0000003040)={&(0x7f0000000040)=ANY=[], 0x18}}, 0x0) syz_io_uring_submit(r1, r2, &(0x7f0000000240)=@IORING_OP_PROVIDE_BUFFERS={0x1f, 0x5, 0x0, 0x401, 0x1, 0x0, 0x100, 0x0, 0x1, {0xfffd}}, 0x0) io_uring_enter(r0, 0x3a2d, 0x0, 0x0, 0x0, 0x0) The reason above issue is 'buf->list' has 2,100,000 nodes, occupied cpu lead to soft lockup. To solve this issue, we need add schedule point when do while loop in '__io_remove_buffers'. After add schedule point we do regression, get follow data. [ 240.141864] __io_remove_buffers: [1]start ctx=0xffff888170603000 bgid=65533 buf=0xffff8881116fcb00 [ 268.408260] __io_remove_buffers: [1]start ctx=0xffff8881b92d2000 bgid=65533 buf=0xffff888130c83180 [ 275.899234] __io_remove_buffers: [2099199]start ctx=0xffff888170603000 bgid=65533 buf=0xffff8881116fcb00 [ 296.741404] __io_remove_buffers: [1]start ctx=0xffff8881b659c000 bgid=65533 buf=0xffff8881010fe380 [ 305.090059] __io_remove_buffers: [2099199]start ctx=0xffff8881b92d2000 bgid=65533 buf=0xffff888130c83180 [ 325.415746] __io_remove_buffers: [1]start ctx=0xffff8881b92d1000 bgid=65533 buf=0xffff8881a17d8f00 [ 333.160318] __io_remove_buffers: [2099199]start ctx=0xffff8881b659c000 bgid=65533 buf=0xffff8881010fe380 ... Fixes:8bab4c09f24e("io_uring: allow conditional reschedule for intensive iterators") Signed-off-by: Ye Bin <[email protected]> Link: https://lore.kernel.org/r/[email protected] Signed-off-by: Jens Axboe <[email protected]>
commit 1d0254e upstream. I got issue as follows: [ 567.094140] __io_remove_buffers: [1]start ctx=0xffff8881067bf000 bgid=65533 buf=0xffff8881fefe1680 [ 594.360799] watchdog: BUG: soft lockup - CPU#2 stuck for 26s! [kworker/u32:5:108] [ 594.364987] Modules linked in: [ 594.365405] irq event stamp: 604180238 [ 594.365906] hardirqs last enabled at (604180237): [<ffffffff93fec9bd>] _raw_spin_unlock_irqrestore+0x2d/0x50 [ 594.367181] hardirqs last disabled at (604180238): [<ffffffff93fbbadb>] sysvec_apic_timer_interrupt+0xb/0xc0 [ 594.368420] softirqs last enabled at (569080666): [<ffffffff94200654>] __do_softirq+0x654/0xa9e [ 594.369551] softirqs last disabled at (569080575): [<ffffffff913e1d6a>] irq_exit_rcu+0x1ca/0x250 [ 594.370692] CPU: 2 PID: 108 Comm: kworker/u32:5 Tainted: G L 5.15.0-next-20211112+ #88 [ 594.371891] Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS ?-20190727_073836-buildvm-ppc64le-16.ppc.fedoraproject.org-3.fc31 04/01/2014 [ 594.373604] Workqueue: events_unbound io_ring_exit_work [ 594.374303] RIP: 0010:_raw_spin_unlock_irqrestore+0x33/0x50 [ 594.375037] Code: 48 83 c7 18 53 48 89 f3 48 8b 74 24 10 e8 55 f5 55 fd 48 89 ef e8 ed a7 56 fd 80 e7 02 74 06 e8 43 13 7b fd fb bf 01 00 00 00 <e8> f8 78 474 [ 594.377433] RSP: 0018:ffff888101587a70 EFLAGS: 00000202 [ 594.378120] RAX: 0000000024030f0d RBX: 0000000000000246 RCX: 1ffffffff2f09106 [ 594.379053] RDX: 0000000000000000 RSI: ffffffff9449f0e0 RDI: 0000000000000001 [ 594.379991] RBP: ffffffff9586cdc0 R08: 0000000000000001 R09: fffffbfff2effcab [ 594.380923] R10: ffffffff977fe557 R11: fffffbfff2effcaa R12: ffff8881b8f3def0 [ 594.381858] R13: 0000000000000246 R14: ffff888153a8b070 R15: 0000000000000000 [ 594.382787] FS: 0000000000000000(0000) GS:ffff888399c00000(0000) knlGS:0000000000000000 [ 594.383851] CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033 [ 594.384602] CR2: 00007fcbe71d2000 CR3: 00000000b4216000 CR4: 00000000000006e0 [ 594.385540] DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000 [ 594.386474] DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400 [ 594.387403] Call Trace: [ 594.387738] <TASK> [ 594.388042] find_and_remove_object+0x118/0x160 [ 594.389321] delete_object_full+0xc/0x20 [ 594.389852] kfree+0x193/0x470 [ 594.390275] __io_remove_buffers.part.0+0xed/0x147 [ 594.390931] io_ring_ctx_free+0x342/0x6a2 [ 594.392159] io_ring_exit_work+0x41e/0x486 [ 594.396419] process_one_work+0x906/0x15a0 [ 594.399185] worker_thread+0x8b/0xd80 [ 594.400259] kthread+0x3bf/0x4a0 [ 594.401847] ret_from_fork+0x22/0x30 [ 594.402343] </TASK> Message from syslogd@localhost at Nov 13 09:09:54 ... kernel:watchdog: BUG: soft lockup - CPU#2 stuck for 26s! [kworker/u32:5:108] [ 596.793660] __io_remove_buffers: [2099199]start ctx=0xffff8881067bf000 bgid=65533 buf=0xffff8881fefe1680 We can reproduce this issue by follow syzkaller log: r0 = syz_io_uring_setup(0x401, &(0x7f0000000300), &(0x7f0000003000/0x2000)=nil, &(0x7f0000ff8000/0x4000)=nil, &(0x7f0000000280)=<r1=>0x0, &(0x7f0000000380)=<r2=>0x0) sendmsg$ETHTOOL_MSG_FEATURES_SET(0xffffffffffffffff, &(0x7f0000003080)={0x0, 0x0, &(0x7f0000003040)={&(0x7f0000000040)=ANY=[], 0x18}}, 0x0) syz_io_uring_submit(r1, r2, &(0x7f0000000240)=@IORING_OP_PROVIDE_BUFFERS={0x1f, 0x5, 0x0, 0x401, 0x1, 0x0, 0x100, 0x0, 0x1, {0xfffd}}, 0x0) io_uring_enter(r0, 0x3a2d, 0x0, 0x0, 0x0, 0x0) The reason above issue is 'buf->list' has 2,100,000 nodes, occupied cpu lead to soft lockup. To solve this issue, we need add schedule point when do while loop in '__io_remove_buffers'. After add schedule point we do regression, get follow data. [ 240.141864] __io_remove_buffers: [1]start ctx=0xffff888170603000 bgid=65533 buf=0xffff8881116fcb00 [ 268.408260] __io_remove_buffers: [1]start ctx=0xffff8881b92d2000 bgid=65533 buf=0xffff888130c83180 [ 275.899234] __io_remove_buffers: [2099199]start ctx=0xffff888170603000 bgid=65533 buf=0xffff8881116fcb00 [ 296.741404] __io_remove_buffers: [1]start ctx=0xffff8881b659c000 bgid=65533 buf=0xffff8881010fe380 [ 305.090059] __io_remove_buffers: [2099199]start ctx=0xffff8881b92d2000 bgid=65533 buf=0xffff888130c83180 [ 325.415746] __io_remove_buffers: [1]start ctx=0xffff8881b92d1000 bgid=65533 buf=0xffff8881a17d8f00 [ 333.160318] __io_remove_buffers: [2099199]start ctx=0xffff8881b659c000 bgid=65533 buf=0xffff8881010fe380 ... Fixes:8bab4c09f24e("io_uring: allow conditional reschedule for intensive iterators") Signed-off-by: Ye Bin <[email protected]> Link: https://lore.kernel.org/r/[email protected] Signed-off-by: Jens Axboe <[email protected]> Signed-off-by: Greg Kroah-Hartman <[email protected]>
[ Upstream commit db52f57 ] Branch data available to BPF programs can be very useful to get stack traces out of userspace application. Commit fff7b64 ("bpf: Add bpf_read_branch_records() helper") added BPF support to capture branch records in x86. Enable this feature also for other architectures as well by removing checks specific to x86. If an architecture doesn't support branch records, bpf_read_branch_records() still has appropriate checks and it will return an -EINVAL in that scenario. Based on UAPI helper doc in include/uapi/linux/bpf.h, unsupported architectures should return -ENOENT in such case. Hence, update the appropriate check to return -ENOENT instead. Selftest 'perf_branches' result on power9 machine which has the branch stacks support: - Before this patch: [command]# ./test_progs -t perf_branches #88/1 perf_branches/perf_branches_hw:FAIL #88/2 perf_branches/perf_branches_no_hw:OK #88 perf_branches:FAIL Summary: 0/1 PASSED, 0 SKIPPED, 1 FAILED - After this patch: [command]# ./test_progs -t perf_branches #88/1 perf_branches/perf_branches_hw:OK #88/2 perf_branches/perf_branches_no_hw:OK #88 perf_branches:OK Summary: 1/2 PASSED, 0 SKIPPED, 0 FAILED Selftest 'perf_branches' result on power9 machine which doesn't have branch stack report: - After this patch: [command]# ./test_progs -t perf_branches #88/1 perf_branches/perf_branches_hw:SKIP #88/2 perf_branches/perf_branches_no_hw:OK #88 perf_branches:OK Summary: 1/1 PASSED, 1 SKIPPED, 0 FAILED Fixes: fff7b64 ("bpf: Add bpf_read_branch_records() helper") Suggested-by: Peter Zijlstra <[email protected]> Signed-off-by: Kajol Jain <[email protected]> Signed-off-by: Daniel Borkmann <[email protected]> Link: https://lore.kernel.org/bpf/[email protected] Signed-off-by: Sasha Levin <[email protected]>
[ Upstream commit db52f57 ] Branch data available to BPF programs can be very useful to get stack traces out of userspace application. Commit fff7b64 ("bpf: Add bpf_read_branch_records() helper") added BPF support to capture branch records in x86. Enable this feature also for other architectures as well by removing checks specific to x86. If an architecture doesn't support branch records, bpf_read_branch_records() still has appropriate checks and it will return an -EINVAL in that scenario. Based on UAPI helper doc in include/uapi/linux/bpf.h, unsupported architectures should return -ENOENT in such case. Hence, update the appropriate check to return -ENOENT instead. Selftest 'perf_branches' result on power9 machine which has the branch stacks support: - Before this patch: [command]# ./test_progs -t perf_branches #88/1 perf_branches/perf_branches_hw:FAIL #88/2 perf_branches/perf_branches_no_hw:OK #88 perf_branches:FAIL Summary: 0/1 PASSED, 0 SKIPPED, 1 FAILED - After this patch: [command]# ./test_progs -t perf_branches #88/1 perf_branches/perf_branches_hw:OK #88/2 perf_branches/perf_branches_no_hw:OK #88 perf_branches:OK Summary: 1/2 PASSED, 0 SKIPPED, 0 FAILED Selftest 'perf_branches' result on power9 machine which doesn't have branch stack report: - After this patch: [command]# ./test_progs -t perf_branches #88/1 perf_branches/perf_branches_hw:SKIP #88/2 perf_branches/perf_branches_no_hw:OK #88 perf_branches:OK Summary: 1/1 PASSED, 1 SKIPPED, 0 FAILED Fixes: fff7b64 ("bpf: Add bpf_read_branch_records() helper") Suggested-by: Peter Zijlstra <[email protected]> Signed-off-by: Kajol Jain <[email protected]> Signed-off-by: Daniel Borkmann <[email protected]> Link: https://lore.kernel.org/bpf/[email protected] Signed-off-by: Sasha Levin <[email protected]>
[ Upstream commit db52f57 ] Branch data available to BPF programs can be very useful to get stack traces out of userspace application. Commit fff7b64 ("bpf: Add bpf_read_branch_records() helper") added BPF support to capture branch records in x86. Enable this feature also for other architectures as well by removing checks specific to x86. If an architecture doesn't support branch records, bpf_read_branch_records() still has appropriate checks and it will return an -EINVAL in that scenario. Based on UAPI helper doc in include/uapi/linux/bpf.h, unsupported architectures should return -ENOENT in such case. Hence, update the appropriate check to return -ENOENT instead. Selftest 'perf_branches' result on power9 machine which has the branch stacks support: - Before this patch: [command]# ./test_progs -t perf_branches #88/1 perf_branches/perf_branches_hw:FAIL #88/2 perf_branches/perf_branches_no_hw:OK #88 perf_branches:FAIL Summary: 0/1 PASSED, 0 SKIPPED, 1 FAILED - After this patch: [command]# ./test_progs -t perf_branches #88/1 perf_branches/perf_branches_hw:OK #88/2 perf_branches/perf_branches_no_hw:OK #88 perf_branches:OK Summary: 1/2 PASSED, 0 SKIPPED, 0 FAILED Selftest 'perf_branches' result on power9 machine which doesn't have branch stack report: - After this patch: [command]# ./test_progs -t perf_branches #88/1 perf_branches/perf_branches_hw:SKIP #88/2 perf_branches/perf_branches_no_hw:OK #88 perf_branches:OK Summary: 1/1 PASSED, 1 SKIPPED, 0 FAILED Fixes: fff7b64 ("bpf: Add bpf_read_branch_records() helper") Suggested-by: Peter Zijlstra <[email protected]> Signed-off-by: Kajol Jain <[email protected]> Signed-off-by: Daniel Borkmann <[email protected]> Link: https://lore.kernel.org/bpf/[email protected] Signed-off-by: Sasha Levin <[email protected]>
[ Upstream commit db52f57 ] Branch data available to BPF programs can be very useful to get stack traces out of userspace application. Commit fff7b64 ("bpf: Add bpf_read_branch_records() helper") added BPF support to capture branch records in x86. Enable this feature also for other architectures as well by removing checks specific to x86. If an architecture doesn't support branch records, bpf_read_branch_records() still has appropriate checks and it will return an -EINVAL in that scenario. Based on UAPI helper doc in include/uapi/linux/bpf.h, unsupported architectures should return -ENOENT in such case. Hence, update the appropriate check to return -ENOENT instead. Selftest 'perf_branches' result on power9 machine which has the branch stacks support: - Before this patch: [command]# ./test_progs -t perf_branches raspberrypi#88/1 perf_branches/perf_branches_hw:FAIL raspberrypi#88/2 perf_branches/perf_branches_no_hw:OK raspberrypi#88 perf_branches:FAIL Summary: 0/1 PASSED, 0 SKIPPED, 1 FAILED - After this patch: [command]# ./test_progs -t perf_branches raspberrypi#88/1 perf_branches/perf_branches_hw:OK raspberrypi#88/2 perf_branches/perf_branches_no_hw:OK raspberrypi#88 perf_branches:OK Summary: 1/2 PASSED, 0 SKIPPED, 0 FAILED Selftest 'perf_branches' result on power9 machine which doesn't have branch stack report: - After this patch: [command]# ./test_progs -t perf_branches raspberrypi#88/1 perf_branches/perf_branches_hw:SKIP raspberrypi#88/2 perf_branches/perf_branches_no_hw:OK raspberrypi#88 perf_branches:OK Summary: 1/1 PASSED, 1 SKIPPED, 0 FAILED Fixes: fff7b64 ("bpf: Add bpf_read_branch_records() helper") Suggested-by: Peter Zijlstra <[email protected]> Signed-off-by: Kajol Jain <[email protected]> Signed-off-by: Daniel Borkmann <[email protected]> Link: https://lore.kernel.org/bpf/[email protected] Signed-off-by: Sasha Levin <[email protected]>
The BPF STX/LDX instruction uses offset relative to the FP to address stack space. Since the BPF_FP locates at the top of the frame, the offset is usually a negative number. However, arm64 str/ldr immediate instruction requires that offset be a positive number. Therefore, this patch tries to convert the offsets. The method is to find the negative offset furthest from the FP firstly. Then add it to the FP, calculate a bottom position, called FPB, and then adjust the offsets in other STR/LDX instructions relative to FPB. FPB is saved using the callee-saved register x27 of arm64 which is not used yet. Before adjusting the offset, the patch checks every instruction to ensure that the FP does not change in run-time. If the FP may change, no offset is adjusted. For example, for the following bpftrace command: bpftrace -e 'kprobe:do_sys_open { printf("opening: %s\n", str(arg1)); }' Without this patch, jited code(fragment): 0: bti c 4: stp x29, x30, [sp, #-16]! 8: mov x29, sp c: stp x19, x20, [sp, #-16]! 10: stp x21, x22, [sp, #-16]! 14: stp x25, x26, [sp, #-16]! 18: mov x25, sp 1c: mov x26, #0x0 // #0 20: bti j 24: sub sp, sp, #0x90 28: add x19, x0, #0x0 2c: mov x0, #0x0 // #0 30: mov x10, #0xffffffffffffff78 // #-136 34: str x0, [x25, x10] 38: mov x10, #0xffffffffffffff80 // #-128 3c: str x0, [x25, x10] 40: mov x10, #0xffffffffffffff88 // #-120 44: str x0, [x25, x10] 48: mov x10, #0xffffffffffffff90 // #-112 4c: str x0, [x25, x10] 50: mov x10, #0xffffffffffffff98 // #-104 54: str x0, [x25, x10] 58: mov x10, #0xffffffffffffffa0 // #-96 5c: str x0, [x25, x10] 60: mov x10, #0xffffffffffffffa8 // #-88 64: str x0, [x25, x10] 68: mov x10, #0xffffffffffffffb0 // #-80 6c: str x0, [x25, x10] 70: mov x10, #0xffffffffffffffb8 // #-72 74: str x0, [x25, x10] 78: mov x10, #0xffffffffffffffc0 // #-64 7c: str x0, [x25, x10] 80: mov x10, #0xffffffffffffffc8 // #-56 84: str x0, [x25, x10] 88: mov x10, #0xffffffffffffffd0 // #-48 8c: str x0, [x25, x10] 90: mov x10, #0xffffffffffffffd8 // #-40 94: str x0, [x25, x10] 98: mov x10, #0xffffffffffffffe0 // #-32 9c: str x0, [x25, x10] a0: mov x10, #0xffffffffffffffe8 // #-24 a4: str x0, [x25, x10] a8: mov x10, #0xfffffffffffffff0 // #-16 ac: str x0, [x25, x10] b0: mov x10, #0xfffffffffffffff8 // #-8 b4: str x0, [x25, x10] b8: mov x10, #0x8 // raspberrypi#8 bc: ldr x2, [x19, x10] [...] With this patch, jited code(fragment): 0: bti c 4: stp x29, x30, [sp, #-16]! 8: mov x29, sp c: stp x19, x20, [sp, #-16]! 10: stp x21, x22, [sp, #-16]! 14: stp x25, x26, [sp, #-16]! 18: stp x27, x28, [sp, #-16]! 1c: mov x25, sp 20: sub x27, x25, #0x88 24: mov x26, #0x0 // #0 28: bti j 2c: sub sp, sp, #0x90 30: add x19, x0, #0x0 34: mov x0, #0x0 // #0 38: str x0, [x27] 3c: str x0, [x27, raspberrypi#8] 40: str x0, [x27, raspberrypi#16] 44: str x0, [x27, raspberrypi#24] 48: str x0, [x27, raspberrypi#32] 4c: str x0, [x27, raspberrypi#40] 50: str x0, [x27, raspberrypi#48] 54: str x0, [x27, raspberrypi#56] 58: str x0, [x27, raspberrypi#64] 5c: str x0, [x27, raspberrypi#72] 60: str x0, [x27, raspberrypi#80] 64: str x0, [x27, raspberrypi#88] 68: str x0, [x27, raspberrypi#96] 6c: str x0, [x27, raspberrypi#104] 70: str x0, [x27, raspberrypi#112] 74: str x0, [x27, raspberrypi#120] 78: str x0, [x27, raspberrypi#128] 7c: ldr x2, [x19, raspberrypi#8] [...] Signed-off-by: Xu Kuohai <[email protected]> Signed-off-by: Daniel Borkmann <[email protected]> Link: https://lore.kernel.org/bpf/[email protected]
commit fd72761 upstream. powerpc sets up PF_KTHREAD and PF_IO_WORKER with a NULL pt_regs, which from my (arguably very short) checking is not commonly done for other archs. This is fine, except when PF_IO_WORKER's have been created and the task does something that causes a coredump to be generated. Then we get this crash: Kernel attempted to read user page (160) - exploit attempt? (uid: 1000) BUG: Kernel NULL pointer dereference on read at 0x00000160 Faulting instruction address: 0xc0000000000c3a60 Oops: Kernel access of bad area, sig: 11 [#1] LE PAGE_SIZE=64K MMU=Radix SMP NR_CPUS=32 NUMA pSeries Modules linked in: bochs drm_vram_helper drm_kms_helper xts binfmt_misc ecb ctr syscopyarea sysfillrect cbc sysimgblt drm_ttm_helper aes_generic ttm sg libaes evdev joydev virtio_balloon vmx_crypto gf128mul drm dm_mod fuse loop configfs drm_panel_orientation_quirks ip_tables x_tables autofs4 hid_generic usbhid hid xhci_pci xhci_hcd usbcore usb_common sd_mod CPU: 1 PID: 1982 Comm: ppc-crash Not tainted 6.3.0-rc2+ #88 Hardware name: IBM pSeries (emulated by qemu) POWER9 (raw) 0x4e1202 0xf000005 of:SLOF,HEAD hv:linux,kvm pSeries NIP: c0000000000c3a60 LR: c000000000039944 CTR: c0000000000398e0 REGS: c0000000041833b0 TRAP: 0300 Not tainted (6.3.0-rc2+) MSR: 800000000280b033 <SF,VEC,VSX,EE,FP,ME,IR,DR,RI,LE> CR: 88082828 XER: 200400f8 ... NIP memcpy_power7+0x200/0x7d0 LR ppr_get+0x64/0xb0 Call Trace: ppr_get+0x40/0xb0 (unreliable) __regset_get+0x180/0x1f0 regset_get_alloc+0x64/0x90 elf_core_dump+0xb98/0x1b60 do_coredump+0x1c34/0x24a0 get_signal+0x71c/0x1410 do_notify_resume+0x140/0x6f0 interrupt_exit_user_prepare_main+0x29c/0x320 interrupt_exit_user_prepare+0x6c/0xa0 interrupt_return_srr_user+0x8/0x138 Because ppr_get() is trying to copy from a PF_IO_WORKER with a NULL pt_regs. Check for a valid pt_regs in both ppc_get/ppr_set, and return an error if not set. The actual error value doesn't seem to be important here, so just pick -EINVAL. Fixes: fa43981 ("powerpc/ptrace: Enable support for NT_PPPC_TAR, NT_PPC_PPR, NT_PPC_DSCR") Cc: [email protected] # v4.8+ Signed-off-by: Jens Axboe <[email protected]> [mpe: Trim oops in change log, add Fixes & Cc stable] Signed-off-by: Michael Ellerman <[email protected]> Link: https://msgid.link/[email protected] Signed-off-by: Greg Kroah-Hartman <[email protected]>
commit fd72761 upstream. powerpc sets up PF_KTHREAD and PF_IO_WORKER with a NULL pt_regs, which from my (arguably very short) checking is not commonly done for other archs. This is fine, except when PF_IO_WORKER's have been created and the task does something that causes a coredump to be generated. Then we get this crash: Kernel attempted to read user page (160) - exploit attempt? (uid: 1000) BUG: Kernel NULL pointer dereference on read at 0x00000160 Faulting instruction address: 0xc0000000000c3a60 Oops: Kernel access of bad area, sig: 11 [#1] LE PAGE_SIZE=64K MMU=Radix SMP NR_CPUS=32 NUMA pSeries Modules linked in: bochs drm_vram_helper drm_kms_helper xts binfmt_misc ecb ctr syscopyarea sysfillrect cbc sysimgblt drm_ttm_helper aes_generic ttm sg libaes evdev joydev virtio_balloon vmx_crypto gf128mul drm dm_mod fuse loop configfs drm_panel_orientation_quirks ip_tables x_tables autofs4 hid_generic usbhid hid xhci_pci xhci_hcd usbcore usb_common sd_mod CPU: 1 PID: 1982 Comm: ppc-crash Not tainted 6.3.0-rc2+ #88 Hardware name: IBM pSeries (emulated by qemu) POWER9 (raw) 0x4e1202 0xf000005 of:SLOF,HEAD hv:linux,kvm pSeries NIP: c0000000000c3a60 LR: c000000000039944 CTR: c0000000000398e0 REGS: c0000000041833b0 TRAP: 0300 Not tainted (6.3.0-rc2+) MSR: 800000000280b033 <SF,VEC,VSX,EE,FP,ME,IR,DR,RI,LE> CR: 88082828 XER: 200400f8 ... NIP memcpy_power7+0x200/0x7d0 LR ppr_get+0x64/0xb0 Call Trace: ppr_get+0x40/0xb0 (unreliable) __regset_get+0x180/0x1f0 regset_get_alloc+0x64/0x90 elf_core_dump+0xb98/0x1b60 do_coredump+0x1c34/0x24a0 get_signal+0x71c/0x1410 do_notify_resume+0x140/0x6f0 interrupt_exit_user_prepare_main+0x29c/0x320 interrupt_exit_user_prepare+0x6c/0xa0 interrupt_return_srr_user+0x8/0x138 Because ppr_get() is trying to copy from a PF_IO_WORKER with a NULL pt_regs. Check for a valid pt_regs in both ppc_get/ppr_set, and return an error if not set. The actual error value doesn't seem to be important here, so just pick -EINVAL. Fixes: fa43981 ("powerpc/ptrace: Enable support for NT_PPPC_TAR, NT_PPC_PPR, NT_PPC_DSCR") Cc: [email protected] # v4.8+ Signed-off-by: Jens Axboe <[email protected]> [mpe: Trim oops in change log, add Fixes & Cc stable] Signed-off-by: Michael Ellerman <[email protected]> Link: https://msgid.link/[email protected] Signed-off-by: Greg Kroah-Hartman <[email protected]>
Misc updates for example schedulers to make life easier for user sched repo
When using the new kernel I get a reproducable oops when using NFS (client). Sometimes it happens fairly quick sometimes it takes a while.
I've taken a screenshot of the oops.
http://slimnet.home.xs4all.nl/raspi-oops-nfs.JPG
The text was updated successfully, but these errors were encountered: