-
Notifications
You must be signed in to change notification settings - Fork 5
Avoid size mismatches in skeletons #16
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Closed
Closed
Conversation
This file contains hidden or bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
Master branch: 8cbf062 Pull request is NOT updated. Failed to apply https://patchwork.kernel.org/project/netdevbpf/list/?series=614738
conflict:
|
At least one diff in series https://patchwork.kernel.org/project/netdevbpf/list/?series=614738 expired. Closing PR. |
kernel-patches-bot
pushed a commit
that referenced
this pull request
Feb 18, 2022
When bringing down the netdevice or system shutdown, a panic can be triggered while accessing the sysfs path because the device is already removed. [ 755.549084] mlx5_core 0000:12:00.1: Shutdown was called [ 756.404455] mlx5_core 0000:12:00.0: Shutdown was called ... [ 757.937260] BUG: unable to handle kernel NULL pointer dereference at (null) [ 758.031397] IP: [<ffffffff8ee11acb>] dma_pool_alloc+0x1ab/0x280 crash> bt ... PID: 12649 TASK: ffff8924108f2100 CPU: 1 COMMAND: "amsd" ... #9 [ffff89240e1a38b0] page_fault at ffffffff8f38c778 [exception RIP: dma_pool_alloc+0x1ab] RIP: ffffffff8ee11acb RSP: ffff89240e1a3968 RFLAGS: 00010046 RAX: 0000000000000246 RBX: ffff89243d874100 RCX: 0000000000001000 RDX: 0000000000000000 RSI: 0000000000000246 RDI: ffff89243d874090 RBP: ffff89240e1a39c0 R8: 000000000001f080 R9: ffff8905ffc03c00 R10: ffffffffc04680d4 R11: ffffffff8edde9fd R12: 00000000000080d0 R13: ffff89243d874090 R14: ffff89243d874080 R15: 0000000000000000 ORIG_RAX: ffffffffffffffff CS: 0010 SS: 0018 #10 [ffff89240e1a39c8] mlx5_alloc_cmd_msg at ffffffffc04680f3 [mlx5_core] #11 [ffff89240e1a3a18] cmd_exec at ffffffffc046ad62 [mlx5_core] #12 [ffff89240e1a3ab8] mlx5_cmd_exec at ffffffffc046b4fb [mlx5_core] #13 [ffff89240e1a3ae8] mlx5_core_access_reg at ffffffffc0475434 [mlx5_core] #14 [ffff89240e1a3b40] mlx5e_get_fec_caps at ffffffffc04a7348 [mlx5_core] #15 [ffff89240e1a3bb0] get_fec_supported_advertised at ffffffffc04992bf [mlx5_core] #16 [ffff89240e1a3c08] mlx5e_get_link_ksettings at ffffffffc049ab36 [mlx5_core] #17 [ffff89240e1a3ce8] __ethtool_get_link_ksettings at ffffffff8f25db46 #18 [ffff89240e1a3d48] speed_show at ffffffff8f277208 #19 [ffff89240e1a3dd8] dev_attr_show at ffffffff8f0b70e3 #20 [ffff89240e1a3df8] sysfs_kf_seq_show at ffffffff8eedbedf #21 [ffff89240e1a3e18] kernfs_seq_show at ffffffff8eeda596 #22 [ffff89240e1a3e28] seq_read at ffffffff8ee76d10 #23 [ffff89240e1a3e98] kernfs_fop_read at ffffffff8eedaef5 #24 [ffff89240e1a3ed8] vfs_read at ffffffff8ee4e3ff #25 [ffff89240e1a3f08] sys_read at ffffffff8ee4f27f #26 [ffff89240e1a3f50] system_call_fastpath at ffffffff8f395f92 crash> net_device.state ffff89443b0c0000 state = 0x5 (__LINK_STATE_START| __LINK_STATE_NOCARRIER) To prevent this scenario, we also make sure that the netdevice is present. Signed-off-by: suresh kumar <[email protected]> Signed-off-by: David S. Miller <[email protected]>
kernel-patches-bot
pushed a commit
that referenced
this pull request
Mar 5, 2022
In current async pagefault logic, when a page is ready, KVM relies on kvm_arch_can_dequeue_async_page_present() to determine whether to deliver a READY event to the Guest. This function test token value of struct kvm_vcpu_pv_apf_data, which must be reset to zero by Guest kernel when a READY event is finished by Guest. If value is zero meaning that a READY event is done, so the KVM can deliver another. But the kvm_arch_setup_async_pf() may produce a valid token with zero value, which is confused with previous mention and may lead the loss of this READY event. This bug may cause task blocked forever in Guest: INFO: task stress:7532 blocked for more than 1254 seconds. Not tainted 5.10.0 #16 "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message. task:stress state:D stack: 0 pid: 7532 ppid: 1409 flags:0x00000080 Call Trace: __schedule+0x1e7/0x650 schedule+0x46/0xb0 kvm_async_pf_task_wait_schedule+0xad/0xe0 ? exit_to_user_mode_prepare+0x60/0x70 __kvm_handle_async_pf+0x4f/0xb0 ? asm_exc_page_fault+0x8/0x30 exc_page_fault+0x6f/0x110 ? asm_exc_page_fault+0x8/0x30 asm_exc_page_fault+0x1e/0x30 RIP: 0033:0x402d00 RSP: 002b:00007ffd31912500 EFLAGS: 00010206 RAX: 0000000000071000 RBX: ffffffffffffffff RCX: 00000000021a32b0 RDX: 000000000007d011 RSI: 000000000007d000 RDI: 00000000021262b0 RBP: 00000000021262b0 R08: 0000000000000003 R09: 0000000000000086 R10: 00000000000000eb R11: 00007fefbdf2baa0 R12: 0000000000000000 R13: 0000000000000002 R14: 000000000007d000 R15: 0000000000001000 Signed-off-by: Liang Zhang <[email protected]> Message-Id: <[email protected]> Cc: [email protected] Signed-off-by: Paolo Bonzini <[email protected]>
kernel-patches-bot
pushed a commit
that referenced
this pull request
Mar 16, 2022
The BPF STX/LDX instruction uses offset relative to the FP to address stack space. Since the BPF_FP locates at the top of the frame, the offset is usually a negative number. However, arm64 str/ldr immediate instruction requires that offset be a positive number. Therefore, this patch tries to convert the offsets. The method is to find the negative offset furthest from the FP firstly. Then add it to the FP, calculate a bottom position, called FPB, and then adjust the offsets in other STR/LDX instructions relative to FPB. FPB is saved using the callee-saved register x27 of arm64 which is not used yet. Before adjusting the offset, the patch checks every instruction to ensure that the FP does not change in run-time. If the FP may change, no offset is adjusted. For example, for the following bpftrace command: bpftrace -e 'kprobe:do_sys_open { printf("opening: %s\n", str(arg1)); }' Without this patch, jited code(fragment): 0: bti c 4: stp x29, x30, [sp, #-16]! 8: mov x29, sp c: stp x19, x20, [sp, #-16]! 10: stp x21, x22, [sp, #-16]! 14: stp x25, x26, [sp, #-16]! 18: mov x25, sp 1c: mov x26, #0x0 // #0 20: bti j 24: sub sp, sp, #0x90 28: add x19, x0, #0x0 2c: mov x0, #0x0 // #0 30: mov x10, #0xffffffffffffff78 // #-136 34: str x0, [x25, x10] 38: mov x10, #0xffffffffffffff80 // #-128 3c: str x0, [x25, x10] 40: mov x10, #0xffffffffffffff88 // #-120 44: str x0, [x25, x10] 48: mov x10, #0xffffffffffffff90 // #-112 4c: str x0, [x25, x10] 50: mov x10, #0xffffffffffffff98 // #-104 54: str x0, [x25, x10] 58: mov x10, #0xffffffffffffffa0 // #-96 5c: str x0, [x25, x10] 60: mov x10, #0xffffffffffffffa8 // #-88 64: str x0, [x25, x10] 68: mov x10, #0xffffffffffffffb0 // #-80 6c: str x0, [x25, x10] 70: mov x10, #0xffffffffffffffb8 // #-72 74: str x0, [x25, x10] 78: mov x10, #0xffffffffffffffc0 // #-64 7c: str x0, [x25, x10] 80: mov x10, #0xffffffffffffffc8 // #-56 84: str x0, [x25, x10] 88: mov x10, #0xffffffffffffffd0 // #-48 8c: str x0, [x25, x10] 90: mov x10, #0xffffffffffffffd8 // #-40 94: str x0, [x25, x10] 98: mov x10, #0xffffffffffffffe0 // #-32 9c: str x0, [x25, x10] a0: mov x10, #0xffffffffffffffe8 // #-24 a4: str x0, [x25, x10] a8: mov x10, #0xfffffffffffffff0 // #-16 ac: str x0, [x25, x10] b0: mov x10, #0xfffffffffffffff8 // #-8 b4: str x0, [x25, x10] b8: mov x10, #0x8 // #8 bc: ldr x2, [x19, x10] [...] With this patch, jited code(fragment): 0: bti c 4: stp x29, x30, [sp, #-16]! 8: mov x29, sp c: stp x19, x20, [sp, #-16]! 10: stp x21, x22, [sp, #-16]! 14: stp x25, x26, [sp, #-16]! 18: stp x27, x28, [sp, #-16]! 1c: mov x25, sp 20: sub x27, x25, #0x88 24: mov x26, #0x0 // #0 28: bti j 2c: sub sp, sp, #0x90 30: add x19, x0, #0x0 34: mov x0, #0x0 // #0 38: str x0, [x27] 3c: str x0, [x27, #8] 40: str x0, [x27, #16] 44: str x0, [x27, #24] 48: str x0, [x27, #32] 4c: str x0, [x27, #40] 50: str x0, [x27, #48] 54: str x0, [x27, #56] 58: str x0, [x27, #64] 5c: str x0, [x27, #72] 60: str x0, [x27, #80] 64: str x0, [x27, #88] 68: str x0, [x27, #96] 6c: str x0, [x27, #104] 70: str x0, [x27, #112] 74: str x0, [x27, #120] 78: str x0, [x27, #128] 7c: ldr x2, [x19, #8] [...] Signed-off-by: Xu Kuohai <[email protected]>
kernel-patches-bot
pushed a commit
that referenced
this pull request
Mar 16, 2022
The BPF STX/LDX instruction uses offset relative to the FP to address stack space. Since the BPF_FP locates at the top of the frame, the offset is usually a negative number. However, arm64 str/ldr immediate instruction requires that offset be a positive number. Therefore, this patch tries to convert the offsets. The method is to find the negative offset furthest from the FP firstly. Then add it to the FP, calculate a bottom position, called FPB, and then adjust the offsets in other STR/LDX instructions relative to FPB. FPB is saved using the callee-saved register x27 of arm64 which is not used yet. Before adjusting the offset, the patch checks every instruction to ensure that the FP does not change in run-time. If the FP may change, no offset is adjusted. For example, for the following bpftrace command: bpftrace -e 'kprobe:do_sys_open { printf("opening: %s\n", str(arg1)); }' Without this patch, jited code(fragment): 0: bti c 4: stp x29, x30, [sp, #-16]! 8: mov x29, sp c: stp x19, x20, [sp, #-16]! 10: stp x21, x22, [sp, #-16]! 14: stp x25, x26, [sp, #-16]! 18: mov x25, sp 1c: mov x26, #0x0 // #0 20: bti j 24: sub sp, sp, #0x90 28: add x19, x0, #0x0 2c: mov x0, #0x0 // #0 30: mov x10, #0xffffffffffffff78 // #-136 34: str x0, [x25, x10] 38: mov x10, #0xffffffffffffff80 // #-128 3c: str x0, [x25, x10] 40: mov x10, #0xffffffffffffff88 // #-120 44: str x0, [x25, x10] 48: mov x10, #0xffffffffffffff90 // #-112 4c: str x0, [x25, x10] 50: mov x10, #0xffffffffffffff98 // #-104 54: str x0, [x25, x10] 58: mov x10, #0xffffffffffffffa0 // #-96 5c: str x0, [x25, x10] 60: mov x10, #0xffffffffffffffa8 // #-88 64: str x0, [x25, x10] 68: mov x10, #0xffffffffffffffb0 // #-80 6c: str x0, [x25, x10] 70: mov x10, #0xffffffffffffffb8 // #-72 74: str x0, [x25, x10] 78: mov x10, #0xffffffffffffffc0 // #-64 7c: str x0, [x25, x10] 80: mov x10, #0xffffffffffffffc8 // #-56 84: str x0, [x25, x10] 88: mov x10, #0xffffffffffffffd0 // #-48 8c: str x0, [x25, x10] 90: mov x10, #0xffffffffffffffd8 // #-40 94: str x0, [x25, x10] 98: mov x10, #0xffffffffffffffe0 // #-32 9c: str x0, [x25, x10] a0: mov x10, #0xffffffffffffffe8 // #-24 a4: str x0, [x25, x10] a8: mov x10, #0xfffffffffffffff0 // #-16 ac: str x0, [x25, x10] b0: mov x10, #0xfffffffffffffff8 // #-8 b4: str x0, [x25, x10] b8: mov x10, #0x8 // #8 bc: ldr x2, [x19, x10] [...] With this patch, jited code(fragment): 0: bti c 4: stp x29, x30, [sp, #-16]! 8: mov x29, sp c: stp x19, x20, [sp, #-16]! 10: stp x21, x22, [sp, #-16]! 14: stp x25, x26, [sp, #-16]! 18: stp x27, x28, [sp, #-16]! 1c: mov x25, sp 20: sub x27, x25, #0x88 24: mov x26, #0x0 // #0 28: bti j 2c: sub sp, sp, #0x90 30: add x19, x0, #0x0 34: mov x0, #0x0 // #0 38: str x0, [x27] 3c: str x0, [x27, #8] 40: str x0, [x27, #16] 44: str x0, [x27, #24] 48: str x0, [x27, #32] 4c: str x0, [x27, #40] 50: str x0, [x27, #48] 54: str x0, [x27, #56] 58: str x0, [x27, #64] 5c: str x0, [x27, #72] 60: str x0, [x27, #80] 64: str x0, [x27, #88] 68: str x0, [x27, #96] 6c: str x0, [x27, #104] 70: str x0, [x27, #112] 74: str x0, [x27, #120] 78: str x0, [x27, #128] 7c: ldr x2, [x19, #8] [...] Signed-off-by: Xu Kuohai <[email protected]>
kernel-patches-bot
pushed a commit
that referenced
this pull request
Mar 16, 2022
The BPF STX/LDX instruction uses offset relative to the FP to address stack space. Since the BPF_FP locates at the top of the frame, the offset is usually a negative number. However, arm64 str/ldr immediate instruction requires that offset be a positive number. Therefore, this patch tries to convert the offsets. The method is to find the negative offset furthest from the FP firstly. Then add it to the FP, calculate a bottom position, called FPB, and then adjust the offsets in other STR/LDX instructions relative to FPB. FPB is saved using the callee-saved register x27 of arm64 which is not used yet. Before adjusting the offset, the patch checks every instruction to ensure that the FP does not change in run-time. If the FP may change, no offset is adjusted. For example, for the following bpftrace command: bpftrace -e 'kprobe:do_sys_open { printf("opening: %s\n", str(arg1)); }' Without this patch, jited code(fragment): 0: bti c 4: stp x29, x30, [sp, #-16]! 8: mov x29, sp c: stp x19, x20, [sp, #-16]! 10: stp x21, x22, [sp, #-16]! 14: stp x25, x26, [sp, #-16]! 18: mov x25, sp 1c: mov x26, #0x0 // #0 20: bti j 24: sub sp, sp, #0x90 28: add x19, x0, #0x0 2c: mov x0, #0x0 // #0 30: mov x10, #0xffffffffffffff78 // #-136 34: str x0, [x25, x10] 38: mov x10, #0xffffffffffffff80 // #-128 3c: str x0, [x25, x10] 40: mov x10, #0xffffffffffffff88 // #-120 44: str x0, [x25, x10] 48: mov x10, #0xffffffffffffff90 // #-112 4c: str x0, [x25, x10] 50: mov x10, #0xffffffffffffff98 // #-104 54: str x0, [x25, x10] 58: mov x10, #0xffffffffffffffa0 // #-96 5c: str x0, [x25, x10] 60: mov x10, #0xffffffffffffffa8 // #-88 64: str x0, [x25, x10] 68: mov x10, #0xffffffffffffffb0 // #-80 6c: str x0, [x25, x10] 70: mov x10, #0xffffffffffffffb8 // #-72 74: str x0, [x25, x10] 78: mov x10, #0xffffffffffffffc0 // #-64 7c: str x0, [x25, x10] 80: mov x10, #0xffffffffffffffc8 // #-56 84: str x0, [x25, x10] 88: mov x10, #0xffffffffffffffd0 // #-48 8c: str x0, [x25, x10] 90: mov x10, #0xffffffffffffffd8 // #-40 94: str x0, [x25, x10] 98: mov x10, #0xffffffffffffffe0 // #-32 9c: str x0, [x25, x10] a0: mov x10, #0xffffffffffffffe8 // #-24 a4: str x0, [x25, x10] a8: mov x10, #0xfffffffffffffff0 // #-16 ac: str x0, [x25, x10] b0: mov x10, #0xfffffffffffffff8 // #-8 b4: str x0, [x25, x10] b8: mov x10, #0x8 // #8 bc: ldr x2, [x19, x10] [...] With this patch, jited code(fragment): 0: bti c 4: stp x29, x30, [sp, #-16]! 8: mov x29, sp c: stp x19, x20, [sp, #-16]! 10: stp x21, x22, [sp, #-16]! 14: stp x25, x26, [sp, #-16]! 18: stp x27, x28, [sp, #-16]! 1c: mov x25, sp 20: sub x27, x25, #0x88 24: mov x26, #0x0 // #0 28: bti j 2c: sub sp, sp, #0x90 30: add x19, x0, #0x0 34: mov x0, #0x0 // #0 38: str x0, [x27] 3c: str x0, [x27, #8] 40: str x0, [x27, #16] 44: str x0, [x27, #24] 48: str x0, [x27, #32] 4c: str x0, [x27, #40] 50: str x0, [x27, #48] 54: str x0, [x27, #56] 58: str x0, [x27, #64] 5c: str x0, [x27, #72] 60: str x0, [x27, #80] 64: str x0, [x27, #88] 68: str x0, [x27, #96] 6c: str x0, [x27, #104] 70: str x0, [x27, #112] 74: str x0, [x27, #120] 78: str x0, [x27, #128] 7c: ldr x2, [x19, #8] [...] Signed-off-by: Xu Kuohai <[email protected]>
kernel-patches-bot
pushed a commit
that referenced
this pull request
Mar 17, 2022
The BPF STX/LDX instruction uses offset relative to the FP to address stack space. Since the BPF_FP locates at the top of the frame, the offset is usually a negative number. However, arm64 str/ldr immediate instruction requires that offset be a positive number. Therefore, this patch tries to convert the offsets. The method is to find the negative offset furthest from the FP firstly. Then add it to the FP, calculate a bottom position, called FPB, and then adjust the offsets in other STR/LDX instructions relative to FPB. FPB is saved using the callee-saved register x27 of arm64 which is not used yet. Before adjusting the offset, the patch checks every instruction to ensure that the FP does not change in run-time. If the FP may change, no offset is adjusted. For example, for the following bpftrace command: bpftrace -e 'kprobe:do_sys_open { printf("opening: %s\n", str(arg1)); }' Without this patch, jited code(fragment): 0: bti c 4: stp x29, x30, [sp, #-16]! 8: mov x29, sp c: stp x19, x20, [sp, #-16]! 10: stp x21, x22, [sp, #-16]! 14: stp x25, x26, [sp, #-16]! 18: mov x25, sp 1c: mov x26, #0x0 // #0 20: bti j 24: sub sp, sp, #0x90 28: add x19, x0, #0x0 2c: mov x0, #0x0 // #0 30: mov x10, #0xffffffffffffff78 // #-136 34: str x0, [x25, x10] 38: mov x10, #0xffffffffffffff80 // #-128 3c: str x0, [x25, x10] 40: mov x10, #0xffffffffffffff88 // #-120 44: str x0, [x25, x10] 48: mov x10, #0xffffffffffffff90 // #-112 4c: str x0, [x25, x10] 50: mov x10, #0xffffffffffffff98 // #-104 54: str x0, [x25, x10] 58: mov x10, #0xffffffffffffffa0 // #-96 5c: str x0, [x25, x10] 60: mov x10, #0xffffffffffffffa8 // #-88 64: str x0, [x25, x10] 68: mov x10, #0xffffffffffffffb0 // #-80 6c: str x0, [x25, x10] 70: mov x10, #0xffffffffffffffb8 // #-72 74: str x0, [x25, x10] 78: mov x10, #0xffffffffffffffc0 // #-64 7c: str x0, [x25, x10] 80: mov x10, #0xffffffffffffffc8 // #-56 84: str x0, [x25, x10] 88: mov x10, #0xffffffffffffffd0 // #-48 8c: str x0, [x25, x10] 90: mov x10, #0xffffffffffffffd8 // #-40 94: str x0, [x25, x10] 98: mov x10, #0xffffffffffffffe0 // #-32 9c: str x0, [x25, x10] a0: mov x10, #0xffffffffffffffe8 // #-24 a4: str x0, [x25, x10] a8: mov x10, #0xfffffffffffffff0 // #-16 ac: str x0, [x25, x10] b0: mov x10, #0xfffffffffffffff8 // #-8 b4: str x0, [x25, x10] b8: mov x10, #0x8 // #8 bc: ldr x2, [x19, x10] [...] With this patch, jited code(fragment): 0: bti c 4: stp x29, x30, [sp, #-16]! 8: mov x29, sp c: stp x19, x20, [sp, #-16]! 10: stp x21, x22, [sp, #-16]! 14: stp x25, x26, [sp, #-16]! 18: stp x27, x28, [sp, #-16]! 1c: mov x25, sp 20: sub x27, x25, #0x88 24: mov x26, #0x0 // #0 28: bti j 2c: sub sp, sp, #0x90 30: add x19, x0, #0x0 34: mov x0, #0x0 // #0 38: str x0, [x27] 3c: str x0, [x27, #8] 40: str x0, [x27, #16] 44: str x0, [x27, #24] 48: str x0, [x27, #32] 4c: str x0, [x27, #40] 50: str x0, [x27, #48] 54: str x0, [x27, #56] 58: str x0, [x27, #64] 5c: str x0, [x27, #72] 60: str x0, [x27, #80] 64: str x0, [x27, #88] 68: str x0, [x27, #96] 6c: str x0, [x27, #104] 70: str x0, [x27, #112] 74: str x0, [x27, #120] 78: str x0, [x27, #128] 7c: ldr x2, [x19, #8] [...] Signed-off-by: Xu Kuohai <[email protected]>
kernel-patches-bot
pushed a commit
that referenced
this pull request
Mar 17, 2022
The BPF STX/LDX instruction uses offset relative to the FP to address stack space. Since the BPF_FP locates at the top of the frame, the offset is usually a negative number. However, arm64 str/ldr immediate instruction requires that offset be a positive number. Therefore, this patch tries to convert the offsets. The method is to find the negative offset furthest from the FP firstly. Then add it to the FP, calculate a bottom position, called FPB, and then adjust the offsets in other STR/LDX instructions relative to FPB. FPB is saved using the callee-saved register x27 of arm64 which is not used yet. Before adjusting the offset, the patch checks every instruction to ensure that the FP does not change in run-time. If the FP may change, no offset is adjusted. For example, for the following bpftrace command: bpftrace -e 'kprobe:do_sys_open { printf("opening: %s\n", str(arg1)); }' Without this patch, jited code(fragment): 0: bti c 4: stp x29, x30, [sp, #-16]! 8: mov x29, sp c: stp x19, x20, [sp, #-16]! 10: stp x21, x22, [sp, #-16]! 14: stp x25, x26, [sp, #-16]! 18: mov x25, sp 1c: mov x26, #0x0 // #0 20: bti j 24: sub sp, sp, #0x90 28: add x19, x0, #0x0 2c: mov x0, #0x0 // #0 30: mov x10, #0xffffffffffffff78 // #-136 34: str x0, [x25, x10] 38: mov x10, #0xffffffffffffff80 // #-128 3c: str x0, [x25, x10] 40: mov x10, #0xffffffffffffff88 // #-120 44: str x0, [x25, x10] 48: mov x10, #0xffffffffffffff90 // #-112 4c: str x0, [x25, x10] 50: mov x10, #0xffffffffffffff98 // #-104 54: str x0, [x25, x10] 58: mov x10, #0xffffffffffffffa0 // #-96 5c: str x0, [x25, x10] 60: mov x10, #0xffffffffffffffa8 // #-88 64: str x0, [x25, x10] 68: mov x10, #0xffffffffffffffb0 // #-80 6c: str x0, [x25, x10] 70: mov x10, #0xffffffffffffffb8 // #-72 74: str x0, [x25, x10] 78: mov x10, #0xffffffffffffffc0 // #-64 7c: str x0, [x25, x10] 80: mov x10, #0xffffffffffffffc8 // #-56 84: str x0, [x25, x10] 88: mov x10, #0xffffffffffffffd0 // #-48 8c: str x0, [x25, x10] 90: mov x10, #0xffffffffffffffd8 // #-40 94: str x0, [x25, x10] 98: mov x10, #0xffffffffffffffe0 // #-32 9c: str x0, [x25, x10] a0: mov x10, #0xffffffffffffffe8 // #-24 a4: str x0, [x25, x10] a8: mov x10, #0xfffffffffffffff0 // #-16 ac: str x0, [x25, x10] b0: mov x10, #0xfffffffffffffff8 // #-8 b4: str x0, [x25, x10] b8: mov x10, #0x8 // #8 bc: ldr x2, [x19, x10] [...] With this patch, jited code(fragment): 0: bti c 4: stp x29, x30, [sp, #-16]! 8: mov x29, sp c: stp x19, x20, [sp, #-16]! 10: stp x21, x22, [sp, #-16]! 14: stp x25, x26, [sp, #-16]! 18: stp x27, x28, [sp, #-16]! 1c: mov x25, sp 20: sub x27, x25, #0x88 24: mov x26, #0x0 // #0 28: bti j 2c: sub sp, sp, #0x90 30: add x19, x0, #0x0 34: mov x0, #0x0 // #0 38: str x0, [x27] 3c: str x0, [x27, #8] 40: str x0, [x27, #16] 44: str x0, [x27, #24] 48: str x0, [x27, #32] 4c: str x0, [x27, #40] 50: str x0, [x27, #48] 54: str x0, [x27, #56] 58: str x0, [x27, #64] 5c: str x0, [x27, #72] 60: str x0, [x27, #80] 64: str x0, [x27, #88] 68: str x0, [x27, #96] 6c: str x0, [x27, #104] 70: str x0, [x27, #112] 74: str x0, [x27, #120] 78: str x0, [x27, #128] 7c: ldr x2, [x19, #8] [...] Signed-off-by: Xu Kuohai <[email protected]>
kernel-patches-bot
pushed a commit
that referenced
this pull request
Mar 17, 2022
The BPF STX/LDX instruction uses offset relative to the FP to address stack space. Since the BPF_FP locates at the top of the frame, the offset is usually a negative number. However, arm64 str/ldr immediate instruction requires that offset be a positive number. Therefore, this patch tries to convert the offsets. The method is to find the negative offset furthest from the FP firstly. Then add it to the FP, calculate a bottom position, called FPB, and then adjust the offsets in other STR/LDX instructions relative to FPB. FPB is saved using the callee-saved register x27 of arm64 which is not used yet. Before adjusting the offset, the patch checks every instruction to ensure that the FP does not change in run-time. If the FP may change, no offset is adjusted. For example, for the following bpftrace command: bpftrace -e 'kprobe:do_sys_open { printf("opening: %s\n", str(arg1)); }' Without this patch, jited code(fragment): 0: bti c 4: stp x29, x30, [sp, #-16]! 8: mov x29, sp c: stp x19, x20, [sp, #-16]! 10: stp x21, x22, [sp, #-16]! 14: stp x25, x26, [sp, #-16]! 18: mov x25, sp 1c: mov x26, #0x0 // #0 20: bti j 24: sub sp, sp, #0x90 28: add x19, x0, #0x0 2c: mov x0, #0x0 // #0 30: mov x10, #0xffffffffffffff78 // #-136 34: str x0, [x25, x10] 38: mov x10, #0xffffffffffffff80 // #-128 3c: str x0, [x25, x10] 40: mov x10, #0xffffffffffffff88 // #-120 44: str x0, [x25, x10] 48: mov x10, #0xffffffffffffff90 // #-112 4c: str x0, [x25, x10] 50: mov x10, #0xffffffffffffff98 // #-104 54: str x0, [x25, x10] 58: mov x10, #0xffffffffffffffa0 // #-96 5c: str x0, [x25, x10] 60: mov x10, #0xffffffffffffffa8 // #-88 64: str x0, [x25, x10] 68: mov x10, #0xffffffffffffffb0 // #-80 6c: str x0, [x25, x10] 70: mov x10, #0xffffffffffffffb8 // #-72 74: str x0, [x25, x10] 78: mov x10, #0xffffffffffffffc0 // #-64 7c: str x0, [x25, x10] 80: mov x10, #0xffffffffffffffc8 // #-56 84: str x0, [x25, x10] 88: mov x10, #0xffffffffffffffd0 // #-48 8c: str x0, [x25, x10] 90: mov x10, #0xffffffffffffffd8 // #-40 94: str x0, [x25, x10] 98: mov x10, #0xffffffffffffffe0 // #-32 9c: str x0, [x25, x10] a0: mov x10, #0xffffffffffffffe8 // #-24 a4: str x0, [x25, x10] a8: mov x10, #0xfffffffffffffff0 // #-16 ac: str x0, [x25, x10] b0: mov x10, #0xfffffffffffffff8 // #-8 b4: str x0, [x25, x10] b8: mov x10, #0x8 // #8 bc: ldr x2, [x19, x10] [...] With this patch, jited code(fragment): 0: bti c 4: stp x29, x30, [sp, #-16]! 8: mov x29, sp c: stp x19, x20, [sp, #-16]! 10: stp x21, x22, [sp, #-16]! 14: stp x25, x26, [sp, #-16]! 18: stp x27, x28, [sp, #-16]! 1c: mov x25, sp 20: sub x27, x25, #0x88 24: mov x26, #0x0 // #0 28: bti j 2c: sub sp, sp, #0x90 30: add x19, x0, #0x0 34: mov x0, #0x0 // #0 38: str x0, [x27] 3c: str x0, [x27, #8] 40: str x0, [x27, #16] 44: str x0, [x27, #24] 48: str x0, [x27, #32] 4c: str x0, [x27, #40] 50: str x0, [x27, #48] 54: str x0, [x27, #56] 58: str x0, [x27, #64] 5c: str x0, [x27, #72] 60: str x0, [x27, #80] 64: str x0, [x27, #88] 68: str x0, [x27, #96] 6c: str x0, [x27, #104] 70: str x0, [x27, #112] 74: str x0, [x27, #120] 78: str x0, [x27, #128] 7c: ldr x2, [x19, #8] [...] Signed-off-by: Xu Kuohai <[email protected]>
kernel-patches-bot
pushed a commit
that referenced
this pull request
Mar 17, 2022
The BPF STX/LDX instruction uses offset relative to the FP to address stack space. Since the BPF_FP locates at the top of the frame, the offset is usually a negative number. However, arm64 str/ldr immediate instruction requires that offset be a positive number. Therefore, this patch tries to convert the offsets. The method is to find the negative offset furthest from the FP firstly. Then add it to the FP, calculate a bottom position, called FPB, and then adjust the offsets in other STR/LDX instructions relative to FPB. FPB is saved using the callee-saved register x27 of arm64 which is not used yet. Before adjusting the offset, the patch checks every instruction to ensure that the FP does not change in run-time. If the FP may change, no offset is adjusted. For example, for the following bpftrace command: bpftrace -e 'kprobe:do_sys_open { printf("opening: %s\n", str(arg1)); }' Without this patch, jited code(fragment): 0: bti c 4: stp x29, x30, [sp, #-16]! 8: mov x29, sp c: stp x19, x20, [sp, #-16]! 10: stp x21, x22, [sp, #-16]! 14: stp x25, x26, [sp, #-16]! 18: mov x25, sp 1c: mov x26, #0x0 // #0 20: bti j 24: sub sp, sp, #0x90 28: add x19, x0, #0x0 2c: mov x0, #0x0 // #0 30: mov x10, #0xffffffffffffff78 // #-136 34: str x0, [x25, x10] 38: mov x10, #0xffffffffffffff80 // #-128 3c: str x0, [x25, x10] 40: mov x10, #0xffffffffffffff88 // #-120 44: str x0, [x25, x10] 48: mov x10, #0xffffffffffffff90 // #-112 4c: str x0, [x25, x10] 50: mov x10, #0xffffffffffffff98 // #-104 54: str x0, [x25, x10] 58: mov x10, #0xffffffffffffffa0 // #-96 5c: str x0, [x25, x10] 60: mov x10, #0xffffffffffffffa8 // #-88 64: str x0, [x25, x10] 68: mov x10, #0xffffffffffffffb0 // #-80 6c: str x0, [x25, x10] 70: mov x10, #0xffffffffffffffb8 // #-72 74: str x0, [x25, x10] 78: mov x10, #0xffffffffffffffc0 // #-64 7c: str x0, [x25, x10] 80: mov x10, #0xffffffffffffffc8 // #-56 84: str x0, [x25, x10] 88: mov x10, #0xffffffffffffffd0 // #-48 8c: str x0, [x25, x10] 90: mov x10, #0xffffffffffffffd8 // #-40 94: str x0, [x25, x10] 98: mov x10, #0xffffffffffffffe0 // #-32 9c: str x0, [x25, x10] a0: mov x10, #0xffffffffffffffe8 // #-24 a4: str x0, [x25, x10] a8: mov x10, #0xfffffffffffffff0 // #-16 ac: str x0, [x25, x10] b0: mov x10, #0xfffffffffffffff8 // #-8 b4: str x0, [x25, x10] b8: mov x10, #0x8 // #8 bc: ldr x2, [x19, x10] [...] With this patch, jited code(fragment): 0: bti c 4: stp x29, x30, [sp, #-16]! 8: mov x29, sp c: stp x19, x20, [sp, #-16]! 10: stp x21, x22, [sp, #-16]! 14: stp x25, x26, [sp, #-16]! 18: stp x27, x28, [sp, #-16]! 1c: mov x25, sp 20: sub x27, x25, #0x88 24: mov x26, #0x0 // #0 28: bti j 2c: sub sp, sp, #0x90 30: add x19, x0, #0x0 34: mov x0, #0x0 // #0 38: str x0, [x27] 3c: str x0, [x27, #8] 40: str x0, [x27, #16] 44: str x0, [x27, #24] 48: str x0, [x27, #32] 4c: str x0, [x27, #40] 50: str x0, [x27, #48] 54: str x0, [x27, #56] 58: str x0, [x27, #64] 5c: str x0, [x27, #72] 60: str x0, [x27, #80] 64: str x0, [x27, #88] 68: str x0, [x27, #96] 6c: str x0, [x27, #104] 70: str x0, [x27, #112] 74: str x0, [x27, #120] 78: str x0, [x27, #128] 7c: ldr x2, [x19, #8] [...] Signed-off-by: Xu Kuohai <[email protected]>
kernel-patches-bot
pushed a commit
that referenced
this pull request
Mar 18, 2022
The BPF STX/LDX instruction uses offset relative to the FP to address stack space. Since the BPF_FP locates at the top of the frame, the offset is usually a negative number. However, arm64 str/ldr immediate instruction requires that offset be a positive number. Therefore, this patch tries to convert the offsets. The method is to find the negative offset furthest from the FP firstly. Then add it to the FP, calculate a bottom position, called FPB, and then adjust the offsets in other STR/LDX instructions relative to FPB. FPB is saved using the callee-saved register x27 of arm64 which is not used yet. Before adjusting the offset, the patch checks every instruction to ensure that the FP does not change in run-time. If the FP may change, no offset is adjusted. For example, for the following bpftrace command: bpftrace -e 'kprobe:do_sys_open { printf("opening: %s\n", str(arg1)); }' Without this patch, jited code(fragment): 0: bti c 4: stp x29, x30, [sp, #-16]! 8: mov x29, sp c: stp x19, x20, [sp, #-16]! 10: stp x21, x22, [sp, #-16]! 14: stp x25, x26, [sp, #-16]! 18: mov x25, sp 1c: mov x26, #0x0 // #0 20: bti j 24: sub sp, sp, #0x90 28: add x19, x0, #0x0 2c: mov x0, #0x0 // #0 30: mov x10, #0xffffffffffffff78 // #-136 34: str x0, [x25, x10] 38: mov x10, #0xffffffffffffff80 // #-128 3c: str x0, [x25, x10] 40: mov x10, #0xffffffffffffff88 // #-120 44: str x0, [x25, x10] 48: mov x10, #0xffffffffffffff90 // #-112 4c: str x0, [x25, x10] 50: mov x10, #0xffffffffffffff98 // #-104 54: str x0, [x25, x10] 58: mov x10, #0xffffffffffffffa0 // #-96 5c: str x0, [x25, x10] 60: mov x10, #0xffffffffffffffa8 // #-88 64: str x0, [x25, x10] 68: mov x10, #0xffffffffffffffb0 // #-80 6c: str x0, [x25, x10] 70: mov x10, #0xffffffffffffffb8 // #-72 74: str x0, [x25, x10] 78: mov x10, #0xffffffffffffffc0 // #-64 7c: str x0, [x25, x10] 80: mov x10, #0xffffffffffffffc8 // #-56 84: str x0, [x25, x10] 88: mov x10, #0xffffffffffffffd0 // #-48 8c: str x0, [x25, x10] 90: mov x10, #0xffffffffffffffd8 // #-40 94: str x0, [x25, x10] 98: mov x10, #0xffffffffffffffe0 // #-32 9c: str x0, [x25, x10] a0: mov x10, #0xffffffffffffffe8 // #-24 a4: str x0, [x25, x10] a8: mov x10, #0xfffffffffffffff0 // #-16 ac: str x0, [x25, x10] b0: mov x10, #0xfffffffffffffff8 // #-8 b4: str x0, [x25, x10] b8: mov x10, #0x8 // #8 bc: ldr x2, [x19, x10] [...] With this patch, jited code(fragment): 0: bti c 4: stp x29, x30, [sp, #-16]! 8: mov x29, sp c: stp x19, x20, [sp, #-16]! 10: stp x21, x22, [sp, #-16]! 14: stp x25, x26, [sp, #-16]! 18: stp x27, x28, [sp, #-16]! 1c: mov x25, sp 20: sub x27, x25, #0x88 24: mov x26, #0x0 // #0 28: bti j 2c: sub sp, sp, #0x90 30: add x19, x0, #0x0 34: mov x0, #0x0 // #0 38: str x0, [x27] 3c: str x0, [x27, #8] 40: str x0, [x27, #16] 44: str x0, [x27, #24] 48: str x0, [x27, #32] 4c: str x0, [x27, #40] 50: str x0, [x27, #48] 54: str x0, [x27, #56] 58: str x0, [x27, #64] 5c: str x0, [x27, #72] 60: str x0, [x27, #80] 64: str x0, [x27, #88] 68: str x0, [x27, #96] 6c: str x0, [x27, #104] 70: str x0, [x27, #112] 74: str x0, [x27, #120] 78: str x0, [x27, #128] 7c: ldr x2, [x19, #8] [...] Signed-off-by: Xu Kuohai <[email protected]>
kernel-patches-bot
pushed a commit
that referenced
this pull request
Mar 18, 2022
The BPF STX/LDX instruction uses offset relative to the FP to address stack space. Since the BPF_FP locates at the top of the frame, the offset is usually a negative number. However, arm64 str/ldr immediate instruction requires that offset be a positive number. Therefore, this patch tries to convert the offsets. The method is to find the negative offset furthest from the FP firstly. Then add it to the FP, calculate a bottom position, called FPB, and then adjust the offsets in other STR/LDX instructions relative to FPB. FPB is saved using the callee-saved register x27 of arm64 which is not used yet. Before adjusting the offset, the patch checks every instruction to ensure that the FP does not change in run-time. If the FP may change, no offset is adjusted. For example, for the following bpftrace command: bpftrace -e 'kprobe:do_sys_open { printf("opening: %s\n", str(arg1)); }' Without this patch, jited code(fragment): 0: bti c 4: stp x29, x30, [sp, #-16]! 8: mov x29, sp c: stp x19, x20, [sp, #-16]! 10: stp x21, x22, [sp, #-16]! 14: stp x25, x26, [sp, #-16]! 18: mov x25, sp 1c: mov x26, #0x0 // #0 20: bti j 24: sub sp, sp, #0x90 28: add x19, x0, #0x0 2c: mov x0, #0x0 // #0 30: mov x10, #0xffffffffffffff78 // #-136 34: str x0, [x25, x10] 38: mov x10, #0xffffffffffffff80 // #-128 3c: str x0, [x25, x10] 40: mov x10, #0xffffffffffffff88 // #-120 44: str x0, [x25, x10] 48: mov x10, #0xffffffffffffff90 // #-112 4c: str x0, [x25, x10] 50: mov x10, #0xffffffffffffff98 // #-104 54: str x0, [x25, x10] 58: mov x10, #0xffffffffffffffa0 // #-96 5c: str x0, [x25, x10] 60: mov x10, #0xffffffffffffffa8 // #-88 64: str x0, [x25, x10] 68: mov x10, #0xffffffffffffffb0 // #-80 6c: str x0, [x25, x10] 70: mov x10, #0xffffffffffffffb8 // #-72 74: str x0, [x25, x10] 78: mov x10, #0xffffffffffffffc0 // #-64 7c: str x0, [x25, x10] 80: mov x10, #0xffffffffffffffc8 // #-56 84: str x0, [x25, x10] 88: mov x10, #0xffffffffffffffd0 // #-48 8c: str x0, [x25, x10] 90: mov x10, #0xffffffffffffffd8 // #-40 94: str x0, [x25, x10] 98: mov x10, #0xffffffffffffffe0 // #-32 9c: str x0, [x25, x10] a0: mov x10, #0xffffffffffffffe8 // #-24 a4: str x0, [x25, x10] a8: mov x10, #0xfffffffffffffff0 // #-16 ac: str x0, [x25, x10] b0: mov x10, #0xfffffffffffffff8 // #-8 b4: str x0, [x25, x10] b8: mov x10, #0x8 // #8 bc: ldr x2, [x19, x10] [...] With this patch, jited code(fragment): 0: bti c 4: stp x29, x30, [sp, #-16]! 8: mov x29, sp c: stp x19, x20, [sp, #-16]! 10: stp x21, x22, [sp, #-16]! 14: stp x25, x26, [sp, #-16]! 18: stp x27, x28, [sp, #-16]! 1c: mov x25, sp 20: sub x27, x25, #0x88 24: mov x26, #0x0 // #0 28: bti j 2c: sub sp, sp, #0x90 30: add x19, x0, #0x0 34: mov x0, #0x0 // #0 38: str x0, [x27] 3c: str x0, [x27, #8] 40: str x0, [x27, #16] 44: str x0, [x27, #24] 48: str x0, [x27, #32] 4c: str x0, [x27, #40] 50: str x0, [x27, #48] 54: str x0, [x27, #56] 58: str x0, [x27, #64] 5c: str x0, [x27, #72] 60: str x0, [x27, #80] 64: str x0, [x27, #88] 68: str x0, [x27, #96] 6c: str x0, [x27, #104] 70: str x0, [x27, #112] 74: str x0, [x27, #120] 78: str x0, [x27, #128] 7c: ldr x2, [x19, #8] [...] Signed-off-by: Xu Kuohai <[email protected]>
kernel-patches-bot
pushed a commit
that referenced
this pull request
Mar 18, 2022
The BPF STX/LDX instruction uses offset relative to the FP to address stack space. Since the BPF_FP locates at the top of the frame, the offset is usually a negative number. However, arm64 str/ldr immediate instruction requires that offset be a positive number. Therefore, this patch tries to convert the offsets. The method is to find the negative offset furthest from the FP firstly. Then add it to the FP, calculate a bottom position, called FPB, and then adjust the offsets in other STR/LDX instructions relative to FPB. FPB is saved using the callee-saved register x27 of arm64 which is not used yet. Before adjusting the offset, the patch checks every instruction to ensure that the FP does not change in run-time. If the FP may change, no offset is adjusted. For example, for the following bpftrace command: bpftrace -e 'kprobe:do_sys_open { printf("opening: %s\n", str(arg1)); }' Without this patch, jited code(fragment): 0: bti c 4: stp x29, x30, [sp, #-16]! 8: mov x29, sp c: stp x19, x20, [sp, #-16]! 10: stp x21, x22, [sp, #-16]! 14: stp x25, x26, [sp, #-16]! 18: mov x25, sp 1c: mov x26, #0x0 // #0 20: bti j 24: sub sp, sp, #0x90 28: add x19, x0, #0x0 2c: mov x0, #0x0 // #0 30: mov x10, #0xffffffffffffff78 // #-136 34: str x0, [x25, x10] 38: mov x10, #0xffffffffffffff80 // #-128 3c: str x0, [x25, x10] 40: mov x10, #0xffffffffffffff88 // #-120 44: str x0, [x25, x10] 48: mov x10, #0xffffffffffffff90 // #-112 4c: str x0, [x25, x10] 50: mov x10, #0xffffffffffffff98 // #-104 54: str x0, [x25, x10] 58: mov x10, #0xffffffffffffffa0 // #-96 5c: str x0, [x25, x10] 60: mov x10, #0xffffffffffffffa8 // #-88 64: str x0, [x25, x10] 68: mov x10, #0xffffffffffffffb0 // #-80 6c: str x0, [x25, x10] 70: mov x10, #0xffffffffffffffb8 // #-72 74: str x0, [x25, x10] 78: mov x10, #0xffffffffffffffc0 // #-64 7c: str x0, [x25, x10] 80: mov x10, #0xffffffffffffffc8 // #-56 84: str x0, [x25, x10] 88: mov x10, #0xffffffffffffffd0 // #-48 8c: str x0, [x25, x10] 90: mov x10, #0xffffffffffffffd8 // #-40 94: str x0, [x25, x10] 98: mov x10, #0xffffffffffffffe0 // #-32 9c: str x0, [x25, x10] a0: mov x10, #0xffffffffffffffe8 // #-24 a4: str x0, [x25, x10] a8: mov x10, #0xfffffffffffffff0 // #-16 ac: str x0, [x25, x10] b0: mov x10, #0xfffffffffffffff8 // #-8 b4: str x0, [x25, x10] b8: mov x10, #0x8 // #8 bc: ldr x2, [x19, x10] [...] With this patch, jited code(fragment): 0: bti c 4: stp x29, x30, [sp, #-16]! 8: mov x29, sp c: stp x19, x20, [sp, #-16]! 10: stp x21, x22, [sp, #-16]! 14: stp x25, x26, [sp, #-16]! 18: stp x27, x28, [sp, #-16]! 1c: mov x25, sp 20: sub x27, x25, #0x88 24: mov x26, #0x0 // #0 28: bti j 2c: sub sp, sp, #0x90 30: add x19, x0, #0x0 34: mov x0, #0x0 // #0 38: str x0, [x27] 3c: str x0, [x27, #8] 40: str x0, [x27, #16] 44: str x0, [x27, #24] 48: str x0, [x27, #32] 4c: str x0, [x27, #40] 50: str x0, [x27, #48] 54: str x0, [x27, #56] 58: str x0, [x27, #64] 5c: str x0, [x27, #72] 60: str x0, [x27, #80] 64: str x0, [x27, #88] 68: str x0, [x27, #96] 6c: str x0, [x27, #104] 70: str x0, [x27, #112] 74: str x0, [x27, #120] 78: str x0, [x27, #128] 7c: ldr x2, [x19, #8] [...] Signed-off-by: Xu Kuohai <[email protected]>
kernel-patches-bot
pushed a commit
that referenced
this pull request
Mar 18, 2022
The BPF STX/LDX instruction uses offset relative to the FP to address stack space. Since the BPF_FP locates at the top of the frame, the offset is usually a negative number. However, arm64 str/ldr immediate instruction requires that offset be a positive number. Therefore, this patch tries to convert the offsets. The method is to find the negative offset furthest from the FP firstly. Then add it to the FP, calculate a bottom position, called FPB, and then adjust the offsets in other STR/LDX instructions relative to FPB. FPB is saved using the callee-saved register x27 of arm64 which is not used yet. Before adjusting the offset, the patch checks every instruction to ensure that the FP does not change in run-time. If the FP may change, no offset is adjusted. For example, for the following bpftrace command: bpftrace -e 'kprobe:do_sys_open { printf("opening: %s\n", str(arg1)); }' Without this patch, jited code(fragment): 0: bti c 4: stp x29, x30, [sp, #-16]! 8: mov x29, sp c: stp x19, x20, [sp, #-16]! 10: stp x21, x22, [sp, #-16]! 14: stp x25, x26, [sp, #-16]! 18: mov x25, sp 1c: mov x26, #0x0 // #0 20: bti j 24: sub sp, sp, #0x90 28: add x19, x0, #0x0 2c: mov x0, #0x0 // #0 30: mov x10, #0xffffffffffffff78 // #-136 34: str x0, [x25, x10] 38: mov x10, #0xffffffffffffff80 // #-128 3c: str x0, [x25, x10] 40: mov x10, #0xffffffffffffff88 // #-120 44: str x0, [x25, x10] 48: mov x10, #0xffffffffffffff90 // #-112 4c: str x0, [x25, x10] 50: mov x10, #0xffffffffffffff98 // #-104 54: str x0, [x25, x10] 58: mov x10, #0xffffffffffffffa0 // #-96 5c: str x0, [x25, x10] 60: mov x10, #0xffffffffffffffa8 // #-88 64: str x0, [x25, x10] 68: mov x10, #0xffffffffffffffb0 // #-80 6c: str x0, [x25, x10] 70: mov x10, #0xffffffffffffffb8 // #-72 74: str x0, [x25, x10] 78: mov x10, #0xffffffffffffffc0 // #-64 7c: str x0, [x25, x10] 80: mov x10, #0xffffffffffffffc8 // #-56 84: str x0, [x25, x10] 88: mov x10, #0xffffffffffffffd0 // #-48 8c: str x0, [x25, x10] 90: mov x10, #0xffffffffffffffd8 // #-40 94: str x0, [x25, x10] 98: mov x10, #0xffffffffffffffe0 // #-32 9c: str x0, [x25, x10] a0: mov x10, #0xffffffffffffffe8 // #-24 a4: str x0, [x25, x10] a8: mov x10, #0xfffffffffffffff0 // #-16 ac: str x0, [x25, x10] b0: mov x10, #0xfffffffffffffff8 // #-8 b4: str x0, [x25, x10] b8: mov x10, #0x8 // #8 bc: ldr x2, [x19, x10] [...] With this patch, jited code(fragment): 0: bti c 4: stp x29, x30, [sp, #-16]! 8: mov x29, sp c: stp x19, x20, [sp, #-16]! 10: stp x21, x22, [sp, #-16]! 14: stp x25, x26, [sp, #-16]! 18: stp x27, x28, [sp, #-16]! 1c: mov x25, sp 20: sub x27, x25, #0x88 24: mov x26, #0x0 // #0 28: bti j 2c: sub sp, sp, #0x90 30: add x19, x0, #0x0 34: mov x0, #0x0 // #0 38: str x0, [x27] 3c: str x0, [x27, #8] 40: str x0, [x27, #16] 44: str x0, [x27, #24] 48: str x0, [x27, #32] 4c: str x0, [x27, #40] 50: str x0, [x27, #48] 54: str x0, [x27, #56] 58: str x0, [x27, #64] 5c: str x0, [x27, #72] 60: str x0, [x27, #80] 64: str x0, [x27, #88] 68: str x0, [x27, #96] 6c: str x0, [x27, #104] 70: str x0, [x27, #112] 74: str x0, [x27, #120] 78: str x0, [x27, #128] 7c: ldr x2, [x19, #8] [...] Signed-off-by: Xu Kuohai <[email protected]>
kernel-patches-bot
pushed a commit
that referenced
this pull request
Mar 18, 2022
The BPF STX/LDX instruction uses offset relative to the FP to address stack space. Since the BPF_FP locates at the top of the frame, the offset is usually a negative number. However, arm64 str/ldr immediate instruction requires that offset be a positive number. Therefore, this patch tries to convert the offsets. The method is to find the negative offset furthest from the FP firstly. Then add it to the FP, calculate a bottom position, called FPB, and then adjust the offsets in other STR/LDX instructions relative to FPB. FPB is saved using the callee-saved register x27 of arm64 which is not used yet. Before adjusting the offset, the patch checks every instruction to ensure that the FP does not change in run-time. If the FP may change, no offset is adjusted. For example, for the following bpftrace command: bpftrace -e 'kprobe:do_sys_open { printf("opening: %s\n", str(arg1)); }' Without this patch, jited code(fragment): 0: bti c 4: stp x29, x30, [sp, #-16]! 8: mov x29, sp c: stp x19, x20, [sp, #-16]! 10: stp x21, x22, [sp, #-16]! 14: stp x25, x26, [sp, #-16]! 18: mov x25, sp 1c: mov x26, #0x0 // #0 20: bti j 24: sub sp, sp, #0x90 28: add x19, x0, #0x0 2c: mov x0, #0x0 // #0 30: mov x10, #0xffffffffffffff78 // #-136 34: str x0, [x25, x10] 38: mov x10, #0xffffffffffffff80 // #-128 3c: str x0, [x25, x10] 40: mov x10, #0xffffffffffffff88 // #-120 44: str x0, [x25, x10] 48: mov x10, #0xffffffffffffff90 // #-112 4c: str x0, [x25, x10] 50: mov x10, #0xffffffffffffff98 // #-104 54: str x0, [x25, x10] 58: mov x10, #0xffffffffffffffa0 // #-96 5c: str x0, [x25, x10] 60: mov x10, #0xffffffffffffffa8 // #-88 64: str x0, [x25, x10] 68: mov x10, #0xffffffffffffffb0 // #-80 6c: str x0, [x25, x10] 70: mov x10, #0xffffffffffffffb8 // #-72 74: str x0, [x25, x10] 78: mov x10, #0xffffffffffffffc0 // #-64 7c: str x0, [x25, x10] 80: mov x10, #0xffffffffffffffc8 // #-56 84: str x0, [x25, x10] 88: mov x10, #0xffffffffffffffd0 // #-48 8c: str x0, [x25, x10] 90: mov x10, #0xffffffffffffffd8 // #-40 94: str x0, [x25, x10] 98: mov x10, #0xffffffffffffffe0 // #-32 9c: str x0, [x25, x10] a0: mov x10, #0xffffffffffffffe8 // #-24 a4: str x0, [x25, x10] a8: mov x10, #0xfffffffffffffff0 // #-16 ac: str x0, [x25, x10] b0: mov x10, #0xfffffffffffffff8 // #-8 b4: str x0, [x25, x10] b8: mov x10, #0x8 // #8 bc: ldr x2, [x19, x10] [...] With this patch, jited code(fragment): 0: bti c 4: stp x29, x30, [sp, #-16]! 8: mov x29, sp c: stp x19, x20, [sp, #-16]! 10: stp x21, x22, [sp, #-16]! 14: stp x25, x26, [sp, #-16]! 18: stp x27, x28, [sp, #-16]! 1c: mov x25, sp 20: sub x27, x25, #0x88 24: mov x26, #0x0 // #0 28: bti j 2c: sub sp, sp, #0x90 30: add x19, x0, #0x0 34: mov x0, #0x0 // #0 38: str x0, [x27] 3c: str x0, [x27, #8] 40: str x0, [x27, #16] 44: str x0, [x27, #24] 48: str x0, [x27, #32] 4c: str x0, [x27, #40] 50: str x0, [x27, #48] 54: str x0, [x27, #56] 58: str x0, [x27, #64] 5c: str x0, [x27, #72] 60: str x0, [x27, #80] 64: str x0, [x27, #88] 68: str x0, [x27, #96] 6c: str x0, [x27, #104] 70: str x0, [x27, #112] 74: str x0, [x27, #120] 78: str x0, [x27, #128] 7c: ldr x2, [x19, #8] [...] Signed-off-by: Xu Kuohai <[email protected]>
kernel-patches-bot
pushed a commit
that referenced
this pull request
Mar 18, 2022
The BPF STX/LDX instruction uses offset relative to the FP to address stack space. Since the BPF_FP locates at the top of the frame, the offset is usually a negative number. However, arm64 str/ldr immediate instruction requires that offset be a positive number. Therefore, this patch tries to convert the offsets. The method is to find the negative offset furthest from the FP firstly. Then add it to the FP, calculate a bottom position, called FPB, and then adjust the offsets in other STR/LDX instructions relative to FPB. FPB is saved using the callee-saved register x27 of arm64 which is not used yet. Before adjusting the offset, the patch checks every instruction to ensure that the FP does not change in run-time. If the FP may change, no offset is adjusted. For example, for the following bpftrace command: bpftrace -e 'kprobe:do_sys_open { printf("opening: %s\n", str(arg1)); }' Without this patch, jited code(fragment): 0: bti c 4: stp x29, x30, [sp, #-16]! 8: mov x29, sp c: stp x19, x20, [sp, #-16]! 10: stp x21, x22, [sp, #-16]! 14: stp x25, x26, [sp, #-16]! 18: mov x25, sp 1c: mov x26, #0x0 // #0 20: bti j 24: sub sp, sp, #0x90 28: add x19, x0, #0x0 2c: mov x0, #0x0 // #0 30: mov x10, #0xffffffffffffff78 // #-136 34: str x0, [x25, x10] 38: mov x10, #0xffffffffffffff80 // #-128 3c: str x0, [x25, x10] 40: mov x10, #0xffffffffffffff88 // #-120 44: str x0, [x25, x10] 48: mov x10, #0xffffffffffffff90 // #-112 4c: str x0, [x25, x10] 50: mov x10, #0xffffffffffffff98 // #-104 54: str x0, [x25, x10] 58: mov x10, #0xffffffffffffffa0 // #-96 5c: str x0, [x25, x10] 60: mov x10, #0xffffffffffffffa8 // #-88 64: str x0, [x25, x10] 68: mov x10, #0xffffffffffffffb0 // #-80 6c: str x0, [x25, x10] 70: mov x10, #0xffffffffffffffb8 // #-72 74: str x0, [x25, x10] 78: mov x10, #0xffffffffffffffc0 // #-64 7c: str x0, [x25, x10] 80: mov x10, #0xffffffffffffffc8 // #-56 84: str x0, [x25, x10] 88: mov x10, #0xffffffffffffffd0 // #-48 8c: str x0, [x25, x10] 90: mov x10, #0xffffffffffffffd8 // #-40 94: str x0, [x25, x10] 98: mov x10, #0xffffffffffffffe0 // #-32 9c: str x0, [x25, x10] a0: mov x10, #0xffffffffffffffe8 // #-24 a4: str x0, [x25, x10] a8: mov x10, #0xfffffffffffffff0 // #-16 ac: str x0, [x25, x10] b0: mov x10, #0xfffffffffffffff8 // #-8 b4: str x0, [x25, x10] b8: mov x10, #0x8 // #8 bc: ldr x2, [x19, x10] [...] With this patch, jited code(fragment): 0: bti c 4: stp x29, x30, [sp, #-16]! 8: mov x29, sp c: stp x19, x20, [sp, #-16]! 10: stp x21, x22, [sp, #-16]! 14: stp x25, x26, [sp, #-16]! 18: stp x27, x28, [sp, #-16]! 1c: mov x25, sp 20: sub x27, x25, #0x88 24: mov x26, #0x0 // #0 28: bti j 2c: sub sp, sp, #0x90 30: add x19, x0, #0x0 34: mov x0, #0x0 // #0 38: str x0, [x27] 3c: str x0, [x27, #8] 40: str x0, [x27, #16] 44: str x0, [x27, #24] 48: str x0, [x27, #32] 4c: str x0, [x27, #40] 50: str x0, [x27, #48] 54: str x0, [x27, #56] 58: str x0, [x27, #64] 5c: str x0, [x27, #72] 60: str x0, [x27, #80] 64: str x0, [x27, #88] 68: str x0, [x27, #96] 6c: str x0, [x27, #104] 70: str x0, [x27, #112] 74: str x0, [x27, #120] 78: str x0, [x27, #128] 7c: ldr x2, [x19, #8] [...] Signed-off-by: Xu Kuohai <[email protected]>
kernel-patches-bot
pushed a commit
that referenced
this pull request
Mar 18, 2022
The BPF STX/LDX instruction uses offset relative to the FP to address stack space. Since the BPF_FP locates at the top of the frame, the offset is usually a negative number. However, arm64 str/ldr immediate instruction requires that offset be a positive number. Therefore, this patch tries to convert the offsets. The method is to find the negative offset furthest from the FP firstly. Then add it to the FP, calculate a bottom position, called FPB, and then adjust the offsets in other STR/LDX instructions relative to FPB. FPB is saved using the callee-saved register x27 of arm64 which is not used yet. Before adjusting the offset, the patch checks every instruction to ensure that the FP does not change in run-time. If the FP may change, no offset is adjusted. For example, for the following bpftrace command: bpftrace -e 'kprobe:do_sys_open { printf("opening: %s\n", str(arg1)); }' Without this patch, jited code(fragment): 0: bti c 4: stp x29, x30, [sp, #-16]! 8: mov x29, sp c: stp x19, x20, [sp, #-16]! 10: stp x21, x22, [sp, #-16]! 14: stp x25, x26, [sp, #-16]! 18: mov x25, sp 1c: mov x26, #0x0 // #0 20: bti j 24: sub sp, sp, #0x90 28: add x19, x0, #0x0 2c: mov x0, #0x0 // #0 30: mov x10, #0xffffffffffffff78 // #-136 34: str x0, [x25, x10] 38: mov x10, #0xffffffffffffff80 // #-128 3c: str x0, [x25, x10] 40: mov x10, #0xffffffffffffff88 // #-120 44: str x0, [x25, x10] 48: mov x10, #0xffffffffffffff90 // #-112 4c: str x0, [x25, x10] 50: mov x10, #0xffffffffffffff98 // #-104 54: str x0, [x25, x10] 58: mov x10, #0xffffffffffffffa0 // #-96 5c: str x0, [x25, x10] 60: mov x10, #0xffffffffffffffa8 // #-88 64: str x0, [x25, x10] 68: mov x10, #0xffffffffffffffb0 // #-80 6c: str x0, [x25, x10] 70: mov x10, #0xffffffffffffffb8 // #-72 74: str x0, [x25, x10] 78: mov x10, #0xffffffffffffffc0 // #-64 7c: str x0, [x25, x10] 80: mov x10, #0xffffffffffffffc8 // #-56 84: str x0, [x25, x10] 88: mov x10, #0xffffffffffffffd0 // #-48 8c: str x0, [x25, x10] 90: mov x10, #0xffffffffffffffd8 // #-40 94: str x0, [x25, x10] 98: mov x10, #0xffffffffffffffe0 // #-32 9c: str x0, [x25, x10] a0: mov x10, #0xffffffffffffffe8 // #-24 a4: str x0, [x25, x10] a8: mov x10, #0xfffffffffffffff0 // #-16 ac: str x0, [x25, x10] b0: mov x10, #0xfffffffffffffff8 // #-8 b4: str x0, [x25, x10] b8: mov x10, #0x8 // #8 bc: ldr x2, [x19, x10] [...] With this patch, jited code(fragment): 0: bti c 4: stp x29, x30, [sp, #-16]! 8: mov x29, sp c: stp x19, x20, [sp, #-16]! 10: stp x21, x22, [sp, #-16]! 14: stp x25, x26, [sp, #-16]! 18: stp x27, x28, [sp, #-16]! 1c: mov x25, sp 20: sub x27, x25, #0x88 24: mov x26, #0x0 // #0 28: bti j 2c: sub sp, sp, #0x90 30: add x19, x0, #0x0 34: mov x0, #0x0 // #0 38: str x0, [x27] 3c: str x0, [x27, #8] 40: str x0, [x27, #16] 44: str x0, [x27, #24] 48: str x0, [x27, #32] 4c: str x0, [x27, #40] 50: str x0, [x27, #48] 54: str x0, [x27, #56] 58: str x0, [x27, #64] 5c: str x0, [x27, #72] 60: str x0, [x27, #80] 64: str x0, [x27, #88] 68: str x0, [x27, #96] 6c: str x0, [x27, #104] 70: str x0, [x27, #112] 74: str x0, [x27, #120] 78: str x0, [x27, #128] 7c: ldr x2, [x19, #8] [...] Signed-off-by: Xu Kuohai <[email protected]>
kernel-patches-bot
pushed a commit
that referenced
this pull request
Mar 19, 2022
The BPF STX/LDX instruction uses offset relative to the FP to address stack space. Since the BPF_FP locates at the top of the frame, the offset is usually a negative number. However, arm64 str/ldr immediate instruction requires that offset be a positive number. Therefore, this patch tries to convert the offsets. The method is to find the negative offset furthest from the FP firstly. Then add it to the FP, calculate a bottom position, called FPB, and then adjust the offsets in other STR/LDX instructions relative to FPB. FPB is saved using the callee-saved register x27 of arm64 which is not used yet. Before adjusting the offset, the patch checks every instruction to ensure that the FP does not change in run-time. If the FP may change, no offset is adjusted. For example, for the following bpftrace command: bpftrace -e 'kprobe:do_sys_open { printf("opening: %s\n", str(arg1)); }' Without this patch, jited code(fragment): 0: bti c 4: stp x29, x30, [sp, #-16]! 8: mov x29, sp c: stp x19, x20, [sp, #-16]! 10: stp x21, x22, [sp, #-16]! 14: stp x25, x26, [sp, #-16]! 18: mov x25, sp 1c: mov x26, #0x0 // #0 20: bti j 24: sub sp, sp, #0x90 28: add x19, x0, #0x0 2c: mov x0, #0x0 // #0 30: mov x10, #0xffffffffffffff78 // #-136 34: str x0, [x25, x10] 38: mov x10, #0xffffffffffffff80 // #-128 3c: str x0, [x25, x10] 40: mov x10, #0xffffffffffffff88 // #-120 44: str x0, [x25, x10] 48: mov x10, #0xffffffffffffff90 // #-112 4c: str x0, [x25, x10] 50: mov x10, #0xffffffffffffff98 // #-104 54: str x0, [x25, x10] 58: mov x10, #0xffffffffffffffa0 // #-96 5c: str x0, [x25, x10] 60: mov x10, #0xffffffffffffffa8 // #-88 64: str x0, [x25, x10] 68: mov x10, #0xffffffffffffffb0 // #-80 6c: str x0, [x25, x10] 70: mov x10, #0xffffffffffffffb8 // #-72 74: str x0, [x25, x10] 78: mov x10, #0xffffffffffffffc0 // #-64 7c: str x0, [x25, x10] 80: mov x10, #0xffffffffffffffc8 // #-56 84: str x0, [x25, x10] 88: mov x10, #0xffffffffffffffd0 // #-48 8c: str x0, [x25, x10] 90: mov x10, #0xffffffffffffffd8 // #-40 94: str x0, [x25, x10] 98: mov x10, #0xffffffffffffffe0 // #-32 9c: str x0, [x25, x10] a0: mov x10, #0xffffffffffffffe8 // #-24 a4: str x0, [x25, x10] a8: mov x10, #0xfffffffffffffff0 // #-16 ac: str x0, [x25, x10] b0: mov x10, #0xfffffffffffffff8 // #-8 b4: str x0, [x25, x10] b8: mov x10, #0x8 // #8 bc: ldr x2, [x19, x10] [...] With this patch, jited code(fragment): 0: bti c 4: stp x29, x30, [sp, #-16]! 8: mov x29, sp c: stp x19, x20, [sp, #-16]! 10: stp x21, x22, [sp, #-16]! 14: stp x25, x26, [sp, #-16]! 18: stp x27, x28, [sp, #-16]! 1c: mov x25, sp 20: sub x27, x25, #0x88 24: mov x26, #0x0 // #0 28: bti j 2c: sub sp, sp, #0x90 30: add x19, x0, #0x0 34: mov x0, #0x0 // #0 38: str x0, [x27] 3c: str x0, [x27, #8] 40: str x0, [x27, #16] 44: str x0, [x27, #24] 48: str x0, [x27, #32] 4c: str x0, [x27, #40] 50: str x0, [x27, #48] 54: str x0, [x27, #56] 58: str x0, [x27, #64] 5c: str x0, [x27, #72] 60: str x0, [x27, #80] 64: str x0, [x27, #88] 68: str x0, [x27, #96] 6c: str x0, [x27, #104] 70: str x0, [x27, #112] 74: str x0, [x27, #120] 78: str x0, [x27, #128] 7c: ldr x2, [x19, #8] [...] Signed-off-by: Xu Kuohai <[email protected]>
kernel-patches-bot
pushed a commit
that referenced
this pull request
Mar 21, 2022
The BPF STX/LDX instruction uses offset relative to the FP to address stack space. Since the BPF_FP locates at the top of the frame, the offset is usually a negative number. However, arm64 str/ldr immediate instruction requires that offset be a positive number. Therefore, this patch tries to convert the offsets. The method is to find the negative offset furthest from the FP firstly. Then add it to the FP, calculate a bottom position, called FPB, and then adjust the offsets in other STR/LDX instructions relative to FPB. FPB is saved using the callee-saved register x27 of arm64 which is not used yet. Before adjusting the offset, the patch checks every instruction to ensure that the FP does not change in run-time. If the FP may change, no offset is adjusted. For example, for the following bpftrace command: bpftrace -e 'kprobe:do_sys_open { printf("opening: %s\n", str(arg1)); }' Without this patch, jited code(fragment): 0: bti c 4: stp x29, x30, [sp, #-16]! 8: mov x29, sp c: stp x19, x20, [sp, #-16]! 10: stp x21, x22, [sp, #-16]! 14: stp x25, x26, [sp, #-16]! 18: mov x25, sp 1c: mov x26, #0x0 // #0 20: bti j 24: sub sp, sp, #0x90 28: add x19, x0, #0x0 2c: mov x0, #0x0 // #0 30: mov x10, #0xffffffffffffff78 // #-136 34: str x0, [x25, x10] 38: mov x10, #0xffffffffffffff80 // #-128 3c: str x0, [x25, x10] 40: mov x10, #0xffffffffffffff88 // #-120 44: str x0, [x25, x10] 48: mov x10, #0xffffffffffffff90 // #-112 4c: str x0, [x25, x10] 50: mov x10, #0xffffffffffffff98 // #-104 54: str x0, [x25, x10] 58: mov x10, #0xffffffffffffffa0 // #-96 5c: str x0, [x25, x10] 60: mov x10, #0xffffffffffffffa8 // #-88 64: str x0, [x25, x10] 68: mov x10, #0xffffffffffffffb0 // #-80 6c: str x0, [x25, x10] 70: mov x10, #0xffffffffffffffb8 // #-72 74: str x0, [x25, x10] 78: mov x10, #0xffffffffffffffc0 // #-64 7c: str x0, [x25, x10] 80: mov x10, #0xffffffffffffffc8 // #-56 84: str x0, [x25, x10] 88: mov x10, #0xffffffffffffffd0 // #-48 8c: str x0, [x25, x10] 90: mov x10, #0xffffffffffffffd8 // #-40 94: str x0, [x25, x10] 98: mov x10, #0xffffffffffffffe0 // #-32 9c: str x0, [x25, x10] a0: mov x10, #0xffffffffffffffe8 // #-24 a4: str x0, [x25, x10] a8: mov x10, #0xfffffffffffffff0 // #-16 ac: str x0, [x25, x10] b0: mov x10, #0xfffffffffffffff8 // #-8 b4: str x0, [x25, x10] b8: mov x10, #0x8 // #8 bc: ldr x2, [x19, x10] [...] With this patch, jited code(fragment): 0: bti c 4: stp x29, x30, [sp, #-16]! 8: mov x29, sp c: stp x19, x20, [sp, #-16]! 10: stp x21, x22, [sp, #-16]! 14: stp x25, x26, [sp, #-16]! 18: stp x27, x28, [sp, #-16]! 1c: mov x25, sp 20: sub x27, x25, #0x88 24: mov x26, #0x0 // #0 28: bti j 2c: sub sp, sp, #0x90 30: add x19, x0, #0x0 34: mov x0, #0x0 // #0 38: str x0, [x27] 3c: str x0, [x27, #8] 40: str x0, [x27, #16] 44: str x0, [x27, #24] 48: str x0, [x27, #32] 4c: str x0, [x27, #40] 50: str x0, [x27, #48] 54: str x0, [x27, #56] 58: str x0, [x27, #64] 5c: str x0, [x27, #72] 60: str x0, [x27, #80] 64: str x0, [x27, #88] 68: str x0, [x27, #96] 6c: str x0, [x27, #104] 70: str x0, [x27, #112] 74: str x0, [x27, #120] 78: str x0, [x27, #128] 7c: ldr x2, [x19, #8] [...] Signed-off-by: Xu Kuohai <[email protected]>
kernel-patches-daemon-bpf-rc bot
pushed a commit
that referenced
this pull request
Jun 15, 2024
When queues are started, netif_napi_add() and napi_enable() are called. If there are 4 queues and only 3 queues are used for the current configuration, only 3 queues' napi should be registered and enabled. The ionic_qcq_enable() checks whether the .poll pointer is not NULL for enabling only the using queue' napi. Unused queues' napi will not be registered by netif_napi_add(), so the .poll pointer indicates NULL. But it couldn't distinguish whether the napi was unregistered or not because netif_napi_del() doesn't reset the .poll pointer to NULL. So, ionic_qcq_enable() calls napi_enable() for the queue, which was unregistered by netif_napi_del(). Reproducer: ethtool -L <interface name> rx 1 tx 1 combined 0 ethtool -L <interface name> rx 0 tx 0 combined 1 ethtool -L <interface name> rx 0 tx 0 combined 4 Splat looks like: kernel BUG at net/core/dev.c:6666! Oops: invalid opcode: 0000 [#1] PREEMPT SMP NOPTI CPU: 3 PID: 1057 Comm: kworker/3:3 Not tainted 6.10.0-rc2+ #16 Workqueue: events ionic_lif_deferred_work [ionic] RIP: 0010:napi_enable+0x3b/0x40 Code: 48 89 c2 48 83 e2 f6 80 b9 61 09 00 00 00 74 0d 48 83 bf 60 01 00 00 00 74 03 80 ce 01 f0 4f RSP: 0018:ffffb6ed83227d48 EFLAGS: 00010246 RAX: 0000000000000000 RBX: ffff97560cda0828 RCX: 0000000000000029 RDX: 0000000000000001 RSI: 0000000000000000 RDI: ffff97560cda0a28 RBP: ffffb6ed83227d50 R08: 0000000000000400 R09: 0000000000000001 R10: 0000000000000001 R11: 0000000000000001 R12: 0000000000000000 R13: ffff97560ce3c1a0 R14: 0000000000000000 R15: ffff975613ba0a20 FS: 0000000000000000(0000) GS:ffff975d5f780000(0000) knlGS:0000000000000000 CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033 CR2: 00007f8f734ee200 CR3: 0000000103e50000 CR4: 00000000007506f0 PKRU: 55555554 Call Trace: <TASK> ? die+0x33/0x90 ? do_trap+0xd9/0x100 ? napi_enable+0x3b/0x40 ? do_error_trap+0x83/0xb0 ? napi_enable+0x3b/0x40 ? napi_enable+0x3b/0x40 ? exc_invalid_op+0x4e/0x70 ? napi_enable+0x3b/0x40 ? asm_exc_invalid_op+0x16/0x20 ? napi_enable+0x3b/0x40 ionic_qcq_enable+0xb7/0x180 [ionic 59bdfc8a035436e1c4224ff7d10789e3f14643f8] ionic_start_queues+0xc4/0x290 [ionic 59bdfc8a035436e1c4224ff7d10789e3f14643f8] ionic_link_status_check+0x11c/0x170 [ionic 59bdfc8a035436e1c4224ff7d10789e3f14643f8] ionic_lif_deferred_work+0x129/0x280 [ionic 59bdfc8a035436e1c4224ff7d10789e3f14643f8] process_one_work+0x145/0x360 worker_thread+0x2bb/0x3d0 ? __pfx_worker_thread+0x10/0x10 kthread+0xcc/0x100 ? __pfx_kthread+0x10/0x10 ret_from_fork+0x2d/0x50 ? __pfx_kthread+0x10/0x10 ret_from_fork_asm+0x1a/0x30 Fixes: 0f3154e ("ionic: Add Tx and Rx handling") Signed-off-by: Taehee Yoo <[email protected]> Reviewed-by: Brett Creeley <[email protected]> Reviewed-by: Shannon Nelson <[email protected]> Link: https://lore.kernel.org/r/[email protected] Signed-off-by: Jakub Kicinski <[email protected]>
kernel-patches-daemon-bpf-rc bot
pushed a commit
that referenced
this pull request
Jun 19, 2024
On ARM64, the pointer to task_struct is always available in the sp_el0 register and therefore the calls to bpf_get_current_task() and bpf_get_current_task_btf() can be inlined into a single MRS instruction. Here is the difference before and after this change: Before: ; struct task_struct *task = bpf_get_current_task_btf(); 54: mov x10, #0xffffffffffff7978 // #-34440 58: movk x10, #0x802b, lsl #16 5c: movk x10, #0x8000, lsl #32 60: blr x10 --------------> 0xffff8000802b7978 <+0>: mrs x0, sp_el0 64: add x7, x0, #0x0 <-------------- 0xffff8000802b797c <+4>: ret After: ; struct task_struct *task = bpf_get_current_task_btf(); 54: mrs x7, sp_el0 This shows around 1% performance improvement in artificial microbenchmark. Signed-off-by: Puranjay Mohan <[email protected]>
kernel-patches-daemon-bpf-rc bot
pushed a commit
that referenced
this pull request
Jun 21, 2024
On ARM64, the pointer to task_struct is always available in the sp_el0 register and therefore the calls to bpf_get_current_task() and bpf_get_current_task_btf() can be inlined into a single MRS instruction. Here is the difference before and after this change: Before: ; struct task_struct *task = bpf_get_current_task_btf(); 54: mov x10, #0xffffffffffff7978 // #-34440 58: movk x10, #0x802b, lsl #16 5c: movk x10, #0x8000, lsl #32 60: blr x10 --------------> 0xffff8000802b7978 <+0>: mrs x0, sp_el0 64: add x7, x0, #0x0 <-------------- 0xffff8000802b797c <+4>: ret After: ; struct task_struct *task = bpf_get_current_task_btf(); 54: mrs x7, sp_el0 This shows around 1% performance improvement in artificial microbenchmark. Signed-off-by: Puranjay Mohan <[email protected]> Acked-by: Xu Kuohai <[email protected]> Acked-by: Andrii Nakryiko <[email protected]>
kernel-patches-daemon-bpf-rc bot
pushed a commit
that referenced
this pull request
Jun 21, 2024
On ARM64, the pointer to task_struct is always available in the sp_el0 register and therefore the calls to bpf_get_current_task() and bpf_get_current_task_btf() can be inlined into a single MRS instruction. Here is the difference before and after this change: Before: ; struct task_struct *task = bpf_get_current_task_btf(); 54: mov x10, #0xffffffffffff7978 // #-34440 58: movk x10, #0x802b, lsl #16 5c: movk x10, #0x8000, lsl #32 60: blr x10 --------------> 0xffff8000802b7978 <+0>: mrs x0, sp_el0 64: add x7, x0, #0x0 <-------------- 0xffff8000802b797c <+4>: ret After: ; struct task_struct *task = bpf_get_current_task_btf(); 54: mrs x7, sp_el0 This shows around 1% performance improvement in artificial microbenchmark. Signed-off-by: Puranjay Mohan <[email protected]> Acked-by: Xu Kuohai <[email protected]> Acked-by: Andrii Nakryiko <[email protected]>
kernel-patches-daemon-bpf-rc bot
pushed a commit
that referenced
this pull request
Jun 21, 2024
On ARM64, the pointer to task_struct is always available in the sp_el0 register and therefore the calls to bpf_get_current_task() and bpf_get_current_task_btf() can be inlined into a single MRS instruction. Here is the difference before and after this change: Before: ; struct task_struct *task = bpf_get_current_task_btf(); 54: mov x10, #0xffffffffffff7978 // #-34440 58: movk x10, #0x802b, lsl #16 5c: movk x10, #0x8000, lsl #32 60: blr x10 --------------> 0xffff8000802b7978 <+0>: mrs x0, sp_el0 64: add x7, x0, #0x0 <-------------- 0xffff8000802b797c <+4>: ret After: ; struct task_struct *task = bpf_get_current_task_btf(); 54: mrs x7, sp_el0 This shows around 1% performance improvement in artificial microbenchmark. Signed-off-by: Puranjay Mohan <[email protected]> Acked-by: Xu Kuohai <[email protected]> Acked-by: Andrii Nakryiko <[email protected]>
kernel-patches-daemon-bpf-rc bot
pushed a commit
that referenced
this pull request
Jun 21, 2024
On ARM64, the pointer to task_struct is always available in the sp_el0 register and therefore the calls to bpf_get_current_task() and bpf_get_current_task_btf() can be inlined into a single MRS instruction. Here is the difference before and after this change: Before: ; struct task_struct *task = bpf_get_current_task_btf(); 54: mov x10, #0xffffffffffff7978 // #-34440 58: movk x10, #0x802b, lsl #16 5c: movk x10, #0x8000, lsl #32 60: blr x10 --------------> 0xffff8000802b7978 <+0>: mrs x0, sp_el0 64: add x7, x0, #0x0 <-------------- 0xffff8000802b797c <+4>: ret After: ; struct task_struct *task = bpf_get_current_task_btf(); 54: mrs x7, sp_el0 This shows around 1% performance improvement in artificial microbenchmark. Signed-off-by: Puranjay Mohan <[email protected]> Acked-by: Xu Kuohai <[email protected]> Acked-by: Andrii Nakryiko <[email protected]>
kernel-patches-daemon-bpf-rc bot
pushed a commit
that referenced
this pull request
Jun 21, 2024
On ARM64, the pointer to task_struct is always available in the sp_el0 register and therefore the calls to bpf_get_current_task() and bpf_get_current_task_btf() can be inlined into a single MRS instruction. Here is the difference before and after this change: Before: ; struct task_struct *task = bpf_get_current_task_btf(); 54: mov x10, #0xffffffffffff7978 // #-34440 58: movk x10, #0x802b, lsl #16 5c: movk x10, #0x8000, lsl #32 60: blr x10 --------------> 0xffff8000802b7978 <+0>: mrs x0, sp_el0 64: add x7, x0, #0x0 <-------------- 0xffff8000802b797c <+4>: ret After: ; struct task_struct *task = bpf_get_current_task_btf(); 54: mrs x7, sp_el0 This shows around 1% performance improvement in artificial microbenchmark. Signed-off-by: Puranjay Mohan <[email protected]> Acked-by: Xu Kuohai <[email protected]> Acked-by: Andrii Nakryiko <[email protected]>
kernel-patches-daemon-bpf-rc bot
pushed a commit
that referenced
this pull request
Jun 21, 2024
On ARM64, the pointer to task_struct is always available in the sp_el0 register and therefore the calls to bpf_get_current_task() and bpf_get_current_task_btf() can be inlined into a single MRS instruction. Here is the difference before and after this change: Before: ; struct task_struct *task = bpf_get_current_task_btf(); 54: mov x10, #0xffffffffffff7978 // #-34440 58: movk x10, #0x802b, lsl #16 5c: movk x10, #0x8000, lsl #32 60: blr x10 --------------> 0xffff8000802b7978 <+0>: mrs x0, sp_el0 64: add x7, x0, #0x0 <-------------- 0xffff8000802b797c <+4>: ret After: ; struct task_struct *task = bpf_get_current_task_btf(); 54: mrs x7, sp_el0 This shows around 1% performance improvement in artificial microbenchmark. Signed-off-by: Puranjay Mohan <[email protected]> Acked-by: Xu Kuohai <[email protected]> Acked-by: Andrii Nakryiko <[email protected]>
kernel-patches-daemon-bpf-rc bot
pushed a commit
that referenced
this pull request
Jun 21, 2024
On ARM64, the pointer to task_struct is always available in the sp_el0 register and therefore the calls to bpf_get_current_task() and bpf_get_current_task_btf() can be inlined into a single MRS instruction. Here is the difference before and after this change: Before: ; struct task_struct *task = bpf_get_current_task_btf(); 54: mov x10, #0xffffffffffff7978 // #-34440 58: movk x10, #0x802b, lsl #16 5c: movk x10, #0x8000, lsl #32 60: blr x10 --------------> 0xffff8000802b7978 <+0>: mrs x0, sp_el0 64: add x7, x0, #0x0 <-------------- 0xffff8000802b797c <+4>: ret After: ; struct task_struct *task = bpf_get_current_task_btf(); 54: mrs x7, sp_el0 This shows around 1% performance improvement in artificial microbenchmark. Signed-off-by: Puranjay Mohan <[email protected]> Acked-by: Xu Kuohai <[email protected]> Acked-by: Andrii Nakryiko <[email protected]>
kernel-patches-daemon-bpf-rc bot
pushed a commit
that referenced
this pull request
Jun 21, 2024
On ARM64, the pointer to task_struct is always available in the sp_el0 register and therefore the calls to bpf_get_current_task() and bpf_get_current_task_btf() can be inlined into a single MRS instruction. Here is the difference before and after this change: Before: ; struct task_struct *task = bpf_get_current_task_btf(); 54: mov x10, #0xffffffffffff7978 // #-34440 58: movk x10, #0x802b, lsl #16 5c: movk x10, #0x8000, lsl #32 60: blr x10 --------------> 0xffff8000802b7978 <+0>: mrs x0, sp_el0 64: add x7, x0, #0x0 <-------------- 0xffff8000802b797c <+4>: ret After: ; struct task_struct *task = bpf_get_current_task_btf(); 54: mrs x7, sp_el0 This shows around 1% performance improvement in artificial microbenchmark. Signed-off-by: Puranjay Mohan <[email protected]> Signed-off-by: Andrii Nakryiko <[email protected]> Acked-by: Xu Kuohai <[email protected]> Acked-by: Andrii Nakryiko <[email protected]> Link: https://lore.kernel.org/bpf/[email protected]
kernel-patches-daemon-bpf-rc bot
pushed a commit
that referenced
this pull request
Jun 28, 2024
The code in ocfs2_dio_end_io_write() estimates number of necessary transaction credits using ocfs2_calc_extend_credits(). This however does not take into account that the IO could be arbitrarily large and can contain arbitrary number of extents. Extent tree manipulations do often extend the current transaction but not in all of the cases. For example if we have only single block extents in the tree, ocfs2_mark_extent_written() will end up calling ocfs2_replace_extent_rec() all the time and we will never extend the current transaction and eventually exhaust all the transaction credits if the IO contains many single block extents. Once that happens a WARN_ON(jbd2_handle_buffer_credits(handle) <= 0) is triggered in jbd2_journal_dirty_metadata() and subsequently OCFS2 aborts in response to this error. This was actually triggered by one of our customers on a heavily fragmented OCFS2 filesystem. To fix the issue make sure the transaction always has enough credits for one extent insert before each call of ocfs2_mark_extent_written(). Heming Zhao said: ------ PANIC: "Kernel panic - not syncing: OCFS2: (device dm-1): panic forced after error" PID: xxx TASK: xxxx CPU: 5 COMMAND: "SubmitThread-CA" #0 machine_kexec at ffffffff8c069932 #1 __crash_kexec at ffffffff8c1338fa #2 panic at ffffffff8c1d69b9 #3 ocfs2_handle_error at ffffffffc0c86c0c [ocfs2] #4 __ocfs2_abort at ffffffffc0c88387 [ocfs2] #5 ocfs2_journal_dirty at ffffffffc0c51e98 [ocfs2] #6 ocfs2_split_extent at ffffffffc0c27ea3 [ocfs2] #7 ocfs2_change_extent_flag at ffffffffc0c28053 [ocfs2] #8 ocfs2_mark_extent_written at ffffffffc0c28347 [ocfs2] #9 ocfs2_dio_end_io_write at ffffffffc0c2bef9 [ocfs2] #10 ocfs2_dio_end_io at ffffffffc0c2c0f5 [ocfs2] #11 dio_complete at ffffffff8c2b9fa7 #12 do_blockdev_direct_IO at ffffffff8c2bc09f #13 ocfs2_direct_IO at ffffffffc0c2b653 [ocfs2] #14 generic_file_direct_write at ffffffff8c1dcf14 #15 __generic_file_write_iter at ffffffff8c1dd07b #16 ocfs2_file_write_iter at ffffffffc0c49f1f [ocfs2] #17 aio_write at ffffffff8c2cc72e #18 kmem_cache_alloc at ffffffff8c248dde #19 do_io_submit at ffffffff8c2ccada #20 do_syscall_64 at ffffffff8c004984 #21 entry_SYSCALL_64_after_hwframe at ffffffff8c8000ba Link: https://lkml.kernel.org/r/[email protected] Link: https://lkml.kernel.org/r/[email protected] Fixes: c15471f ("ocfs2: fix sparse file & data ordering issue in direct io") Signed-off-by: Jan Kara <[email protected]> Reviewed-by: Joseph Qi <[email protected]> Reviewed-by: Heming Zhao <[email protected]> Cc: Mark Fasheh <[email protected]> Cc: Joel Becker <[email protected]> Cc: Junxiao Bi <[email protected]> Cc: Changwei Ge <[email protected]> Cc: Gang He <[email protected]> Cc: Jun Piao <[email protected]> Cc: <[email protected]> Signed-off-by: Andrew Morton <[email protected]>
kernel-patches-daemon-bpf-rc bot
pushed a commit
that referenced
this pull request
Jul 9, 2024
…rnel/git/netfilter/nf-next into main Pablo Neira Ayuso says: ==================== Netfilter/IPVS updates for net-next The following patchset contains Netfilter/IPVS updates for net-next: Patch #1 to #11 to shrink memory consumption for transaction objects: struct nft_trans_chain { /* size: 120 (-32), cachelines: 2, members: 10 */ struct nft_trans_elem { /* size: 72 (-40), cachelines: 2, members: 4 */ struct nft_trans_flowtable { /* size: 80 (-48), cachelines: 2, members: 5 */ struct nft_trans_obj { /* size: 72 (-40), cachelines: 2, members: 4 */ struct nft_trans_rule { /* size: 80 (-32), cachelines: 2, members: 6 */ struct nft_trans_set { /* size: 96 (-24), cachelines: 2, members: 8 */ struct nft_trans_table { /* size: 56 (-40), cachelines: 1, members: 2 */ struct nft_trans_elem can now be allocated from kmalloc-96 instead of kmalloc-128 slab. Series from Florian Westphal. For the record, I have mangled patch #1 to add nft_trans_container_*() and use if for every transaction object. I have also added BUILD_BUG_ON to ensure struct nft_trans always comes at the beginning of the container transaction object. And few minor cleanups, any new bugs are of my own. Patch #12 simplify check for SCTP GSO in IPVS, from Ismael Luceno. Patch #13 nf_conncount key length remains in the u32 bound, from Yunjian Wang. Patch #14 removes unnecessary check for CTA_TIMEOUT_L3PROTO when setting default conntrack timeouts via nfnetlink_cttimeout API, from Lin Ma. Patch #15 updates NFT_SECMARK_CTX_MAXLEN to 4096, SELinux could use larger secctx names than the existing 256 bytes length. Patch #16 adds a selftest to exercise nfnetlink_queue listeners leaving nfnetlink_queue, from Florian Westphal. Patch #17 increases hitcount from 255 to 65535 in xt_recent, from Phil Sutter. ==================== Signed-off-by: David S. Miller <[email protected]>
kernel-patches-daemon-bpf-rc bot
pushed a commit
that referenced
this pull request
Jul 12, 2024
Run bloom_filter_map selftests (./test_progs -t bloom_filter_map) on a Loongarch platform, an error message "JIT doesn't support bpf-to-bpf calls" is got in user space, together with an unexpected errno EINVAL (22), not ENOTSUPP (524): libbpf: prog 'inner_map': BPF program load failed: Invalid argument libbpf: prog 'inner_map': -- BEGIN PROG LOAD LOG -- JIT doesn't support bpf-to-bpf calls callbacks are not allowed in non-JITed programs processed 37 insns (limit 1000000) max_states_per_insn 1 total_states -- END PROG LOAD LOG -- libbpf: prog 'inner_map': failed to load: -22 libbpf: failed to load object 'bloom_filter_map' libbpf: failed to load BPF skeleton 'bloom_filter_map': -22 setup_progs:FAIL:bloom_filter_map__open_and_load unexpected error: -22 #16 bloom_filter_map:FAIL Although in jit_subprogs(), the error number does be set as "ENOTSUPP": verbose(env, "JIT doesn't support bpf-to-bpf calls\n"); err = -ENOTSUPP; goto out_free; But afterwards in fixup_call_args(), such error number is ignored, and overwritten as "-EINVAL": verbose(env, "callbacks are not allowed in non-JITed programs\n"); return -EINVAL; This patch fixes this by changing return values of fixup_call_args() from "-EINVAL" to "err ?: -EINVAL". With this change, errno 524 is got in user space now: libbpf: prog 'inner_map': BPF program load failed: unknown error (-524) libbpf: prog 'inner_map': -- BEGIN PROG LOAD LOG -- JIT doesn't support bpf-to-bpf calls processed 37 insns (limit 1000000) max_states_per_insn 1 total_states -- END PROG LOAD LOG -- libbpf: prog 'inner_map': failed to load: -524 libbpf: failed to load object 'bloom_filter_map' libbpf: failed to load BPF skeleton 'bloom_filter_map': -524 setup_progs:FAIL:bloom_filter_map__open_and_load unexpected error: -524 Signed-off-by: Geliang Tang <[email protected]>
kernel-patches-daemon-bpf-rc bot
pushed a commit
that referenced
this pull request
Jul 12, 2024
Run bloom_filter_map selftests (./test_progs -t bloom_filter_map) on a Loongarch platform, an error message "JIT doesn't support bpf-to-bpf calls" is got in user space, together with an unexpected errno EINVAL (22), not ENOTSUPP (524): libbpf: prog 'inner_map': BPF program load failed: Invalid argument libbpf: prog 'inner_map': -- BEGIN PROG LOAD LOG -- JIT doesn't support bpf-to-bpf calls callbacks are not allowed in non-JITed programs processed 37 insns (limit 1000000) max_states_per_insn 1 total_states -- END PROG LOAD LOG -- libbpf: prog 'inner_map': failed to load: -22 libbpf: failed to load object 'bloom_filter_map' libbpf: failed to load BPF skeleton 'bloom_filter_map': -22 setup_progs:FAIL:bloom_filter_map__open_and_load unexpected error: -22 #16 bloom_filter_map:FAIL Although the return value of jit_subprogs() does be set as "ENOTSUPP": verbose(env, "JIT doesn't support bpf-to-bpf calls\n"); err = -ENOTSUPP; goto out_free; But afterwards in fixup_call_args(), the return value of jit_subprogs() is ignored, and overwritten as "-EINVAL": verbose(env, "callbacks are not allowed in non-JITed programs\n"); return -EINVAL; This patch fixes this by changing return values of fixup_call_args() from "-EINVAL" to "err ?: -EINVAL". With this change, errno 524 is got in user space now: libbpf: prog 'inner_map': BPF program load failed: unknown error (-524) libbpf: prog 'inner_map': -- BEGIN PROG LOAD LOG -- JIT doesn't support bpf-to-bpf calls processed 37 insns (limit 1000000) max_states_per_insn 1 total_states -- END PROG LOAD LOG -- libbpf: prog 'inner_map': failed to load: -524 libbpf: failed to load object 'bloom_filter_map' libbpf: failed to load BPF skeleton 'bloom_filter_map': -524 setup_progs:FAIL:bloom_filter_map__open_and_load unexpected error: -524 Signed-off-by: Geliang Tang <[email protected]>
kernel-patches-daemon-bpf-rc bot
pushed a commit
that referenced
this pull request
Jul 12, 2024
Run bloom_filter_map selftests (./test_progs -t bloom_filter_map) on a Loongarch platform, an error message "JIT doesn't support bpf-to-bpf calls" is got in user space, together with an unexpected errno EINVAL (22), not ENOTSUPP (524): libbpf: prog 'inner_map': BPF program load failed: Invalid argument libbpf: prog 'inner_map': -- BEGIN PROG LOAD LOG -- JIT doesn't support bpf-to-bpf calls callbacks are not allowed in non-JITed programs processed 37 insns (limit 1000000) max_states_per_insn 1 total_states -- END PROG LOAD LOG -- libbpf: prog 'inner_map': failed to load: -22 libbpf: failed to load object 'bloom_filter_map' libbpf: failed to load BPF skeleton 'bloom_filter_map': -22 setup_progs:FAIL:bloom_filter_map__open_and_load unexpected error: -22 #16 bloom_filter_map:FAIL Although the return value of jit_subprogs() does be set as "ENOTSUPP": verbose(env, "JIT doesn't support bpf-to-bpf calls\n"); err = -ENOTSUPP; goto out_free; But afterwards in fixup_call_args(), the return value of jit_subprogs() is ignored, and overwritten as "-EINVAL": verbose(env, "callbacks are not allowed in non-JITed programs\n"); return -EINVAL; This patch fixes this by changing return values of fixup_call_args() from "-EINVAL" to "err ?: -EINVAL". With this change, errno 524 is got in user space now: libbpf: prog 'inner_map': BPF program load failed: unknown error (-524) libbpf: prog 'inner_map': -- BEGIN PROG LOAD LOG -- JIT doesn't support bpf-to-bpf calls processed 37 insns (limit 1000000) max_states_per_insn 1 total_states -- END PROG LOAD LOG -- libbpf: prog 'inner_map': failed to load: -524 libbpf: failed to load object 'bloom_filter_map' libbpf: failed to load BPF skeleton 'bloom_filter_map': -524 setup_progs:FAIL:bloom_filter_map__open_and_load unexpected error: -524 Signed-off-by: Geliang Tang <[email protected]>
kernel-patches-daemon-bpf-rc bot
pushed a commit
that referenced
this pull request
Aug 26, 2024
The arm64 jit blindly saves/restores all callee-saved registers, making the jited result looks a bit too compliated. For example, for an empty prog, the jited result is: 0: bti jc 4: mov x9, lr 8: nop c: paciasp 10: stp fp, lr, [sp, #-16]! 14: mov fp, sp 18: stp x19, x20, [sp, #-16]! 1c: stp x21, x22, [sp, #-16]! 20: stp x26, x25, [sp, #-16]! 24: mov x26, #0 28: stp x26, x25, [sp, #-16]! 2c: mov x26, sp 30: stp x27, x28, [sp, #-16]! 34: mov x25, sp 38: bti j // tailcall target 3c: sub sp, sp, #0 40: mov x7, #0 44: add sp, sp, #0 48: ldp x27, x28, [sp], #16 4c: ldp x26, x25, [sp], #16 50: ldp x26, x25, [sp], #16 54: ldp x21, x22, [sp], #16 58: ldp x19, x20, [sp], #16 5c: ldp fp, lr, [sp], #16 60: mov x0, x7 64: autiasp 68: ret Clearly, there is no need to save/restore unused callee-saved registers. This patch does this change, making the jited image to only save/restore the callee-saved registers it uses. Now the jited result of empty prog is: 0: bti jc 4: mov x9, lr 8: nop c: paciasp 10: stp fp, lr, [sp, #-16]! 14: mov fp, sp 18: stp xzr, x26, [sp, #-16]! 1c: mov x26, sp 20: bti j // tailcall target 24: mov x7, #0 28: ldp xzr, x26, [sp], #16 2c: ldp fp, lr, [sp], #16 30: mov x0, x7 34: autiasp 38: ret Since bpf prog saves/restores its own callee-saved registers as needed, to make tailcall work correctly, the caller needs to restore its saved registers before tailcall, and the callee needs to save its callee-saved registers after tailcall. This extra restoring/saving instructions increases preformance overhead. [1] provides 2 benchmarks for tailcall scenarios. Below is the perf number measured in an arm64 KVM guest. The result indicates that the performance difference before and after the patch in typical tailcall scenarios is negligible. - Before: Performance counter stats for './test_progs -t tailcalls' (5 runs): 4313.43 msec task-clock # 0.874 CPUs utilized ( +- 0.16% ) 574 context-switches # 133.073 /sec ( +- 1.14% ) 0 cpu-migrations # 0.000 /sec 538 page-faults # 124.727 /sec ( +- 0.57% ) 10697772784 cycles # 2.480 GHz ( +- 0.22% ) (61.19%) 25511241955 instructions # 2.38 insn per cycle ( +- 0.08% ) (66.70%) 5108910557 branches # 1.184 G/sec ( +- 0.08% ) (72.38%) 2800459 branch-misses # 0.05% of all branches ( +- 0.51% ) (72.36%) TopDownL1 # 0.60 retiring ( +- 0.09% ) (66.84%) # 0.21 frontend_bound ( +- 0.15% ) (61.31%) # 0.12 bad_speculation ( +- 0.08% ) (50.11%) # 0.07 backend_bound ( +- 0.16% ) (33.30%) 8274201819 L1-dcache-loads # 1.918 G/sec ( +- 0.18% ) (33.15%) 468268 L1-dcache-load-misses # 0.01% of all L1-dcache accesses ( +- 4.69% ) (33.16%) 385383 LLC-loads # 89.345 K/sec ( +- 5.22% ) (33.16%) 38296 LLC-load-misses # 9.94% of all LL-cache accesses ( +- 42.52% ) (38.69%) 6886576501 L1-icache-loads # 1.597 G/sec ( +- 0.35% ) (38.69%) 1848585 L1-icache-load-misses # 0.03% of all L1-icache accesses ( +- 4.52% ) (44.23%) 9043645883 dTLB-loads # 2.097 G/sec ( +- 0.10% ) (44.33%) 416672 dTLB-load-misses # 0.00% of all dTLB cache accesses ( +- 5.15% ) (49.89%) 6925626111 iTLB-loads # 1.606 G/sec ( +- 0.35% ) (55.46%) 66220 iTLB-load-misses # 0.00% of all iTLB cache accesses ( +- 1.88% ) (55.50%) <not supported> L1-dcache-prefetches <not supported> L1-dcache-prefetch-misses 4.9372 +- 0.0526 seconds time elapsed ( +- 1.07% ) Performance counter stats for './test_progs -t flow_dissector' (5 runs): 10924.50 msec task-clock # 0.945 CPUs utilized ( +- 0.08% ) 603 context-switches # 55.197 /sec ( +- 1.13% ) 0 cpu-migrations # 0.000 /sec 566 page-faults # 51.810 /sec ( +- 0.42% ) 27381270695 cycles # 2.506 GHz ( +- 0.18% ) (60.46%) 56996583922 instructions # 2.08 insn per cycle ( +- 0.21% ) (66.11%) 10321647567 branches # 944.816 M/sec ( +- 0.17% ) (71.79%) 3347735 branch-misses # 0.03% of all branches ( +- 3.72% ) (72.15%) TopDownL1 # 0.52 retiring ( +- 0.13% ) (66.74%) # 0.27 frontend_bound ( +- 0.14% ) (61.27%) # 0.14 bad_speculation ( +- 0.19% ) (50.36%) # 0.07 backend_bound ( +- 0.42% ) (33.89%) 18740797617 L1-dcache-loads # 1.715 G/sec ( +- 0.43% ) (33.71%) 13715669 L1-dcache-load-misses # 0.07% of all L1-dcache accesses ( +- 32.85% ) (33.34%) 4087551 LLC-loads # 374.164 K/sec ( +- 29.53% ) (33.26%) 267906 LLC-load-misses # 6.55% of all LL-cache accesses ( +- 23.90% ) (38.76%) 15811864229 L1-icache-loads # 1.447 G/sec ( +- 0.12% ) (38.73%) 2976833 L1-icache-load-misses # 0.02% of all L1-icache accesses ( +- 9.73% ) (44.22%) 20138907471 dTLB-loads # 1.843 G/sec ( +- 0.18% ) (44.15%) 732850 dTLB-load-misses # 0.00% of all dTLB cache accesses ( +- 11.18% ) (49.64%) 15895726702 iTLB-loads # 1.455 G/sec ( +- 0.15% ) (55.13%) 152075 iTLB-load-misses # 0.00% of all iTLB cache accesses ( +- 4.71% ) (54.98%) <not supported> L1-dcache-prefetches <not supported> L1-dcache-prefetch-misses 11.5613 +- 0.0317 seconds time elapsed ( +- 0.27% ) - After: Performance counter stats for './test_progs -t tailcalls' (5 runs): 4278.78 msec task-clock # 0.871 CPUs utilized ( +- 0.15% ) 569 context-switches # 132.982 /sec ( +- 0.58% ) 0 cpu-migrations # 0.000 /sec 539 page-faults # 125.970 /sec ( +- 0.43% ) 10588986432 cycles # 2.475 GHz ( +- 0.20% ) (60.91%) 25303825043 instructions # 2.39 insn per cycle ( +- 0.08% ) (66.48%) 5110756256 branches # 1.194 G/sec ( +- 0.07% ) (72.03%) 2719569 branch-misses # 0.05% of all branches ( +- 2.42% ) (72.03%) TopDownL1 # 0.60 retiring ( +- 0.22% ) (66.31%) # 0.22 frontend_bound ( +- 0.21% ) (60.83%) # 0.12 bad_speculation ( +- 0.26% ) (50.25%) # 0.06 backend_bound ( +- 0.17% ) (33.52%) 8163648527 L1-dcache-loads # 1.908 G/sec ( +- 0.33% ) (33.52%) 694979 L1-dcache-load-misses # 0.01% of all L1-dcache accesses ( +- 30.53% ) (33.52%) 1902347 LLC-loads # 444.600 K/sec ( +- 48.84% ) (33.69%) 96677 LLC-load-misses # 5.08% of all LL-cache accesses ( +- 43.48% ) (39.30%) 6863517589 L1-icache-loads # 1.604 G/sec ( +- 0.37% ) (39.17%) 1871519 L1-icache-load-misses # 0.03% of all L1-icache accesses ( +- 6.78% ) (44.56%) 8927782813 dTLB-loads # 2.087 G/sec ( +- 0.14% ) (44.37%) 438237 dTLB-load-misses # 0.00% of all dTLB cache accesses ( +- 6.00% ) (49.75%) 6886906831 iTLB-loads # 1.610 G/sec ( +- 0.36% ) (55.08%) 67568 iTLB-load-misses # 0.00% of all iTLB cache accesses ( +- 3.27% ) (54.86%) <not supported> L1-dcache-prefetches <not supported> L1-dcache-prefetch-misses 4.9114 +- 0.0309 seconds time elapsed ( +- 0.63% ) Performance counter stats for './test_progs -t flow_dissector' (5 runs): 10948.40 msec task-clock # 0.942 CPUs utilized ( +- 0.05% ) 615 context-switches # 56.173 /sec ( +- 1.65% ) 1 cpu-migrations # 0.091 /sec ( +- 31.62% ) 567 page-faults # 51.788 /sec ( +- 0.44% ) 27334194328 cycles # 2.497 GHz ( +- 0.08% ) (61.05%) 56656528828 instructions # 2.07 insn per cycle ( +- 0.08% ) (66.67%) 10270389422 branches # 938.072 M/sec ( +- 0.10% ) (72.21%) 3453837 branch-misses # 0.03% of all branches ( +- 3.75% ) (72.27%) TopDownL1 # 0.52 retiring ( +- 0.16% ) (66.55%) # 0.27 frontend_bound ( +- 0.09% ) (60.91%) # 0.14 bad_speculation ( +- 0.08% ) (49.85%) # 0.07 backend_bound ( +- 0.16% ) (33.33%) 18982866028 L1-dcache-loads # 1.734 G/sec ( +- 0.24% ) (33.34%) 8802454 L1-dcache-load-misses # 0.05% of all L1-dcache accesses ( +- 52.30% ) (33.31%) 2612962 LLC-loads # 238.661 K/sec ( +- 29.78% ) (33.45%) 264107 LLC-load-misses # 10.11% of all LL-cache accesses ( +- 18.34% ) (39.07%) 15793205997 L1-icache-loads # 1.443 G/sec ( +- 0.15% ) (39.09%) 3930802 L1-icache-load-misses # 0.02% of all L1-icache accesses ( +- 3.72% ) (44.66%) 20097828496 dTLB-loads # 1.836 G/sec ( +- 0.09% ) (44.68%) 961757 dTLB-load-misses # 0.00% of all dTLB cache accesses ( +- 3.32% ) (50.15%) 15838728506 iTLB-loads # 1.447 G/sec ( +- 0.09% ) (55.62%) 167652 iTLB-load-misses # 0.00% of all iTLB cache accesses ( +- 1.28% ) (55.52%) <not supported> L1-dcache-prefetches <not supported> L1-dcache-prefetch-misses 11.6173 +- 0.0268 seconds time elapsed ( +- 0.23% ) [1] https://lore.kernel.org/bpf/[email protected]/ Signed-off-by: Xu Kuohai <[email protected]>
kernel-patches-daemon-bpf-rc bot
pushed a commit
that referenced
this pull request
Aug 27, 2024
The arm64 jit blindly saves/restores all callee-saved registers, making the jited result looks a bit too compliated. For example, for an empty prog, the jited result is: 0: bti jc 4: mov x9, lr 8: nop c: paciasp 10: stp fp, lr, [sp, #-16]! 14: mov fp, sp 18: stp x19, x20, [sp, #-16]! 1c: stp x21, x22, [sp, #-16]! 20: stp x26, x25, [sp, #-16]! 24: mov x26, #0 28: stp x26, x25, [sp, #-16]! 2c: mov x26, sp 30: stp x27, x28, [sp, #-16]! 34: mov x25, sp 38: bti j // tailcall target 3c: sub sp, sp, #0 40: mov x7, #0 44: add sp, sp, #0 48: ldp x27, x28, [sp], #16 4c: ldp x26, x25, [sp], #16 50: ldp x26, x25, [sp], #16 54: ldp x21, x22, [sp], #16 58: ldp x19, x20, [sp], #16 5c: ldp fp, lr, [sp], #16 60: mov x0, x7 64: autiasp 68: ret Clearly, there is no need to save/restore unused callee-saved registers. This patch does this change, making the jited image to only save/restore the callee-saved registers it uses. Now the jited result of empty prog is: 0: bti jc 4: mov x9, lr 8: nop c: paciasp 10: stp fp, lr, [sp, #-16]! 14: mov fp, sp 18: stp xzr, x26, [sp, #-16]! 1c: mov x26, sp 20: bti j // tailcall target 24: mov x7, #0 28: ldp xzr, x26, [sp], #16 2c: ldp fp, lr, [sp], #16 30: mov x0, x7 34: autiasp 38: ret Since bpf prog saves/restores its own callee-saved registers as needed, to make tailcall work correctly, the caller needs to restore its saved registers before tailcall, and the callee needs to save its callee-saved registers after tailcall. This extra restoring/saving instructions increases preformance overhead. [1] provides 2 benchmarks for tailcall scenarios. Below is the perf number measured in an arm64 KVM guest. The result indicates that the performance difference before and after the patch in typical tailcall scenarios is negligible. - Before: Performance counter stats for './test_progs -t tailcalls' (5 runs): 4313.43 msec task-clock # 0.874 CPUs utilized ( +- 0.16% ) 574 context-switches # 133.073 /sec ( +- 1.14% ) 0 cpu-migrations # 0.000 /sec 538 page-faults # 124.727 /sec ( +- 0.57% ) 10697772784 cycles # 2.480 GHz ( +- 0.22% ) (61.19%) 25511241955 instructions # 2.38 insn per cycle ( +- 0.08% ) (66.70%) 5108910557 branches # 1.184 G/sec ( +- 0.08% ) (72.38%) 2800459 branch-misses # 0.05% of all branches ( +- 0.51% ) (72.36%) TopDownL1 # 0.60 retiring ( +- 0.09% ) (66.84%) # 0.21 frontend_bound ( +- 0.15% ) (61.31%) # 0.12 bad_speculation ( +- 0.08% ) (50.11%) # 0.07 backend_bound ( +- 0.16% ) (33.30%) 8274201819 L1-dcache-loads # 1.918 G/sec ( +- 0.18% ) (33.15%) 468268 L1-dcache-load-misses # 0.01% of all L1-dcache accesses ( +- 4.69% ) (33.16%) 385383 LLC-loads # 89.345 K/sec ( +- 5.22% ) (33.16%) 38296 LLC-load-misses # 9.94% of all LL-cache accesses ( +- 42.52% ) (38.69%) 6886576501 L1-icache-loads # 1.597 G/sec ( +- 0.35% ) (38.69%) 1848585 L1-icache-load-misses # 0.03% of all L1-icache accesses ( +- 4.52% ) (44.23%) 9043645883 dTLB-loads # 2.097 G/sec ( +- 0.10% ) (44.33%) 416672 dTLB-load-misses # 0.00% of all dTLB cache accesses ( +- 5.15% ) (49.89%) 6925626111 iTLB-loads # 1.606 G/sec ( +- 0.35% ) (55.46%) 66220 iTLB-load-misses # 0.00% of all iTLB cache accesses ( +- 1.88% ) (55.50%) <not supported> L1-dcache-prefetches <not supported> L1-dcache-prefetch-misses 4.9372 +- 0.0526 seconds time elapsed ( +- 1.07% ) Performance counter stats for './test_progs -t flow_dissector' (5 runs): 10924.50 msec task-clock # 0.945 CPUs utilized ( +- 0.08% ) 603 context-switches # 55.197 /sec ( +- 1.13% ) 0 cpu-migrations # 0.000 /sec 566 page-faults # 51.810 /sec ( +- 0.42% ) 27381270695 cycles # 2.506 GHz ( +- 0.18% ) (60.46%) 56996583922 instructions # 2.08 insn per cycle ( +- 0.21% ) (66.11%) 10321647567 branches # 944.816 M/sec ( +- 0.17% ) (71.79%) 3347735 branch-misses # 0.03% of all branches ( +- 3.72% ) (72.15%) TopDownL1 # 0.52 retiring ( +- 0.13% ) (66.74%) # 0.27 frontend_bound ( +- 0.14% ) (61.27%) # 0.14 bad_speculation ( +- 0.19% ) (50.36%) # 0.07 backend_bound ( +- 0.42% ) (33.89%) 18740797617 L1-dcache-loads # 1.715 G/sec ( +- 0.43% ) (33.71%) 13715669 L1-dcache-load-misses # 0.07% of all L1-dcache accesses ( +- 32.85% ) (33.34%) 4087551 LLC-loads # 374.164 K/sec ( +- 29.53% ) (33.26%) 267906 LLC-load-misses # 6.55% of all LL-cache accesses ( +- 23.90% ) (38.76%) 15811864229 L1-icache-loads # 1.447 G/sec ( +- 0.12% ) (38.73%) 2976833 L1-icache-load-misses # 0.02% of all L1-icache accesses ( +- 9.73% ) (44.22%) 20138907471 dTLB-loads # 1.843 G/sec ( +- 0.18% ) (44.15%) 732850 dTLB-load-misses # 0.00% of all dTLB cache accesses ( +- 11.18% ) (49.64%) 15895726702 iTLB-loads # 1.455 G/sec ( +- 0.15% ) (55.13%) 152075 iTLB-load-misses # 0.00% of all iTLB cache accesses ( +- 4.71% ) (54.98%) <not supported> L1-dcache-prefetches <not supported> L1-dcache-prefetch-misses 11.5613 +- 0.0317 seconds time elapsed ( +- 0.27% ) - After: Performance counter stats for './test_progs -t tailcalls' (5 runs): 4278.78 msec task-clock # 0.871 CPUs utilized ( +- 0.15% ) 569 context-switches # 132.982 /sec ( +- 0.58% ) 0 cpu-migrations # 0.000 /sec 539 page-faults # 125.970 /sec ( +- 0.43% ) 10588986432 cycles # 2.475 GHz ( +- 0.20% ) (60.91%) 25303825043 instructions # 2.39 insn per cycle ( +- 0.08% ) (66.48%) 5110756256 branches # 1.194 G/sec ( +- 0.07% ) (72.03%) 2719569 branch-misses # 0.05% of all branches ( +- 2.42% ) (72.03%) TopDownL1 # 0.60 retiring ( +- 0.22% ) (66.31%) # 0.22 frontend_bound ( +- 0.21% ) (60.83%) # 0.12 bad_speculation ( +- 0.26% ) (50.25%) # 0.06 backend_bound ( +- 0.17% ) (33.52%) 8163648527 L1-dcache-loads # 1.908 G/sec ( +- 0.33% ) (33.52%) 694979 L1-dcache-load-misses # 0.01% of all L1-dcache accesses ( +- 30.53% ) (33.52%) 1902347 LLC-loads # 444.600 K/sec ( +- 48.84% ) (33.69%) 96677 LLC-load-misses # 5.08% of all LL-cache accesses ( +- 43.48% ) (39.30%) 6863517589 L1-icache-loads # 1.604 G/sec ( +- 0.37% ) (39.17%) 1871519 L1-icache-load-misses # 0.03% of all L1-icache accesses ( +- 6.78% ) (44.56%) 8927782813 dTLB-loads # 2.087 G/sec ( +- 0.14% ) (44.37%) 438237 dTLB-load-misses # 0.00% of all dTLB cache accesses ( +- 6.00% ) (49.75%) 6886906831 iTLB-loads # 1.610 G/sec ( +- 0.36% ) (55.08%) 67568 iTLB-load-misses # 0.00% of all iTLB cache accesses ( +- 3.27% ) (54.86%) <not supported> L1-dcache-prefetches <not supported> L1-dcache-prefetch-misses 4.9114 +- 0.0309 seconds time elapsed ( +- 0.63% ) Performance counter stats for './test_progs -t flow_dissector' (5 runs): 10948.40 msec task-clock # 0.942 CPUs utilized ( +- 0.05% ) 615 context-switches # 56.173 /sec ( +- 1.65% ) 1 cpu-migrations # 0.091 /sec ( +- 31.62% ) 567 page-faults # 51.788 /sec ( +- 0.44% ) 27334194328 cycles # 2.497 GHz ( +- 0.08% ) (61.05%) 56656528828 instructions # 2.07 insn per cycle ( +- 0.08% ) (66.67%) 10270389422 branches # 938.072 M/sec ( +- 0.10% ) (72.21%) 3453837 branch-misses # 0.03% of all branches ( +- 3.75% ) (72.27%) TopDownL1 # 0.52 retiring ( +- 0.16% ) (66.55%) # 0.27 frontend_bound ( +- 0.09% ) (60.91%) # 0.14 bad_speculation ( +- 0.08% ) (49.85%) # 0.07 backend_bound ( +- 0.16% ) (33.33%) 18982866028 L1-dcache-loads # 1.734 G/sec ( +- 0.24% ) (33.34%) 8802454 L1-dcache-load-misses # 0.05% of all L1-dcache accesses ( +- 52.30% ) (33.31%) 2612962 LLC-loads # 238.661 K/sec ( +- 29.78% ) (33.45%) 264107 LLC-load-misses # 10.11% of all LL-cache accesses ( +- 18.34% ) (39.07%) 15793205997 L1-icache-loads # 1.443 G/sec ( +- 0.15% ) (39.09%) 3930802 L1-icache-load-misses # 0.02% of all L1-icache accesses ( +- 3.72% ) (44.66%) 20097828496 dTLB-loads # 1.836 G/sec ( +- 0.09% ) (44.68%) 961757 dTLB-load-misses # 0.00% of all dTLB cache accesses ( +- 3.32% ) (50.15%) 15838728506 iTLB-loads # 1.447 G/sec ( +- 0.09% ) (55.62%) 167652 iTLB-load-misses # 0.00% of all iTLB cache accesses ( +- 1.28% ) (55.52%) <not supported> L1-dcache-prefetches <not supported> L1-dcache-prefetch-misses 11.6173 +- 0.0268 seconds time elapsed ( +- 0.23% ) [1] https://lore.kernel.org/bpf/[email protected]/ Signed-off-by: Xu Kuohai <[email protected]>
kernel-patches-daemon-bpf-rc bot
pushed a commit
that referenced
this pull request
Aug 28, 2024
The arm64 jit blindly saves/restores all callee-saved registers, making the jited result looks a bit too compliated. For example, for an empty prog, the jited result is: 0: bti jc 4: mov x9, lr 8: nop c: paciasp 10: stp fp, lr, [sp, #-16]! 14: mov fp, sp 18: stp x19, x20, [sp, #-16]! 1c: stp x21, x22, [sp, #-16]! 20: stp x26, x25, [sp, #-16]! 24: mov x26, #0 28: stp x26, x25, [sp, #-16]! 2c: mov x26, sp 30: stp x27, x28, [sp, #-16]! 34: mov x25, sp 38: bti j // tailcall target 3c: sub sp, sp, #0 40: mov x7, #0 44: add sp, sp, #0 48: ldp x27, x28, [sp], #16 4c: ldp x26, x25, [sp], #16 50: ldp x26, x25, [sp], #16 54: ldp x21, x22, [sp], #16 58: ldp x19, x20, [sp], #16 5c: ldp fp, lr, [sp], #16 60: mov x0, x7 64: autiasp 68: ret Clearly, there is no need to save/restore unused callee-saved registers. This patch does this change, making the jited image to only save/restore the callee-saved registers it uses. Now the jited result of empty prog is: 0: bti jc 4: mov x9, lr 8: nop c: paciasp 10: stp fp, lr, [sp, #-16]! 14: mov fp, sp 18: stp xzr, x26, [sp, #-16]! 1c: mov x26, sp 20: bti j // tailcall target 24: mov x7, #0 28: ldp xzr, x26, [sp], #16 2c: ldp fp, lr, [sp], #16 30: mov x0, x7 34: autiasp 38: ret Since bpf prog saves/restores its own callee-saved registers as needed, to make tailcall work correctly, the caller needs to restore its saved registers before tailcall, and the callee needs to save its callee-saved registers after tailcall. This extra restoring/saving instructions increases preformance overhead. [1] provides 2 benchmarks for tailcall scenarios. Below is the perf number measured in an arm64 KVM guest. The result indicates that the performance difference before and after the patch in typical tailcall scenarios is negligible. - Before: Performance counter stats for './test_progs -t tailcalls' (5 runs): 4313.43 msec task-clock # 0.874 CPUs utilized ( +- 0.16% ) 574 context-switches # 133.073 /sec ( +- 1.14% ) 0 cpu-migrations # 0.000 /sec 538 page-faults # 124.727 /sec ( +- 0.57% ) 10697772784 cycles # 2.480 GHz ( +- 0.22% ) (61.19%) 25511241955 instructions # 2.38 insn per cycle ( +- 0.08% ) (66.70%) 5108910557 branches # 1.184 G/sec ( +- 0.08% ) (72.38%) 2800459 branch-misses # 0.05% of all branches ( +- 0.51% ) (72.36%) TopDownL1 # 0.60 retiring ( +- 0.09% ) (66.84%) # 0.21 frontend_bound ( +- 0.15% ) (61.31%) # 0.12 bad_speculation ( +- 0.08% ) (50.11%) # 0.07 backend_bound ( +- 0.16% ) (33.30%) 8274201819 L1-dcache-loads # 1.918 G/sec ( +- 0.18% ) (33.15%) 468268 L1-dcache-load-misses # 0.01% of all L1-dcache accesses ( +- 4.69% ) (33.16%) 385383 LLC-loads # 89.345 K/sec ( +- 5.22% ) (33.16%) 38296 LLC-load-misses # 9.94% of all LL-cache accesses ( +- 42.52% ) (38.69%) 6886576501 L1-icache-loads # 1.597 G/sec ( +- 0.35% ) (38.69%) 1848585 L1-icache-load-misses # 0.03% of all L1-icache accesses ( +- 4.52% ) (44.23%) 9043645883 dTLB-loads # 2.097 G/sec ( +- 0.10% ) (44.33%) 416672 dTLB-load-misses # 0.00% of all dTLB cache accesses ( +- 5.15% ) (49.89%) 6925626111 iTLB-loads # 1.606 G/sec ( +- 0.35% ) (55.46%) 66220 iTLB-load-misses # 0.00% of all iTLB cache accesses ( +- 1.88% ) (55.50%) <not supported> L1-dcache-prefetches <not supported> L1-dcache-prefetch-misses 4.9372 +- 0.0526 seconds time elapsed ( +- 1.07% ) Performance counter stats for './test_progs -t flow_dissector' (5 runs): 10924.50 msec task-clock # 0.945 CPUs utilized ( +- 0.08% ) 603 context-switches # 55.197 /sec ( +- 1.13% ) 0 cpu-migrations # 0.000 /sec 566 page-faults # 51.810 /sec ( +- 0.42% ) 27381270695 cycles # 2.506 GHz ( +- 0.18% ) (60.46%) 56996583922 instructions # 2.08 insn per cycle ( +- 0.21% ) (66.11%) 10321647567 branches # 944.816 M/sec ( +- 0.17% ) (71.79%) 3347735 branch-misses # 0.03% of all branches ( +- 3.72% ) (72.15%) TopDownL1 # 0.52 retiring ( +- 0.13% ) (66.74%) # 0.27 frontend_bound ( +- 0.14% ) (61.27%) # 0.14 bad_speculation ( +- 0.19% ) (50.36%) # 0.07 backend_bound ( +- 0.42% ) (33.89%) 18740797617 L1-dcache-loads # 1.715 G/sec ( +- 0.43% ) (33.71%) 13715669 L1-dcache-load-misses # 0.07% of all L1-dcache accesses ( +- 32.85% ) (33.34%) 4087551 LLC-loads # 374.164 K/sec ( +- 29.53% ) (33.26%) 267906 LLC-load-misses # 6.55% of all LL-cache accesses ( +- 23.90% ) (38.76%) 15811864229 L1-icache-loads # 1.447 G/sec ( +- 0.12% ) (38.73%) 2976833 L1-icache-load-misses # 0.02% of all L1-icache accesses ( +- 9.73% ) (44.22%) 20138907471 dTLB-loads # 1.843 G/sec ( +- 0.18% ) (44.15%) 732850 dTLB-load-misses # 0.00% of all dTLB cache accesses ( +- 11.18% ) (49.64%) 15895726702 iTLB-loads # 1.455 G/sec ( +- 0.15% ) (55.13%) 152075 iTLB-load-misses # 0.00% of all iTLB cache accesses ( +- 4.71% ) (54.98%) <not supported> L1-dcache-prefetches <not supported> L1-dcache-prefetch-misses 11.5613 +- 0.0317 seconds time elapsed ( +- 0.27% ) - After: Performance counter stats for './test_progs -t tailcalls' (5 runs): 4278.78 msec task-clock # 0.871 CPUs utilized ( +- 0.15% ) 569 context-switches # 132.982 /sec ( +- 0.58% ) 0 cpu-migrations # 0.000 /sec 539 page-faults # 125.970 /sec ( +- 0.43% ) 10588986432 cycles # 2.475 GHz ( +- 0.20% ) (60.91%) 25303825043 instructions # 2.39 insn per cycle ( +- 0.08% ) (66.48%) 5110756256 branches # 1.194 G/sec ( +- 0.07% ) (72.03%) 2719569 branch-misses # 0.05% of all branches ( +- 2.42% ) (72.03%) TopDownL1 # 0.60 retiring ( +- 0.22% ) (66.31%) # 0.22 frontend_bound ( +- 0.21% ) (60.83%) # 0.12 bad_speculation ( +- 0.26% ) (50.25%) # 0.06 backend_bound ( +- 0.17% ) (33.52%) 8163648527 L1-dcache-loads # 1.908 G/sec ( +- 0.33% ) (33.52%) 694979 L1-dcache-load-misses # 0.01% of all L1-dcache accesses ( +- 30.53% ) (33.52%) 1902347 LLC-loads # 444.600 K/sec ( +- 48.84% ) (33.69%) 96677 LLC-load-misses # 5.08% of all LL-cache accesses ( +- 43.48% ) (39.30%) 6863517589 L1-icache-loads # 1.604 G/sec ( +- 0.37% ) (39.17%) 1871519 L1-icache-load-misses # 0.03% of all L1-icache accesses ( +- 6.78% ) (44.56%) 8927782813 dTLB-loads # 2.087 G/sec ( +- 0.14% ) (44.37%) 438237 dTLB-load-misses # 0.00% of all dTLB cache accesses ( +- 6.00% ) (49.75%) 6886906831 iTLB-loads # 1.610 G/sec ( +- 0.36% ) (55.08%) 67568 iTLB-load-misses # 0.00% of all iTLB cache accesses ( +- 3.27% ) (54.86%) <not supported> L1-dcache-prefetches <not supported> L1-dcache-prefetch-misses 4.9114 +- 0.0309 seconds time elapsed ( +- 0.63% ) Performance counter stats for './test_progs -t flow_dissector' (5 runs): 10948.40 msec task-clock # 0.942 CPUs utilized ( +- 0.05% ) 615 context-switches # 56.173 /sec ( +- 1.65% ) 1 cpu-migrations # 0.091 /sec ( +- 31.62% ) 567 page-faults # 51.788 /sec ( +- 0.44% ) 27334194328 cycles # 2.497 GHz ( +- 0.08% ) (61.05%) 56656528828 instructions # 2.07 insn per cycle ( +- 0.08% ) (66.67%) 10270389422 branches # 938.072 M/sec ( +- 0.10% ) (72.21%) 3453837 branch-misses # 0.03% of all branches ( +- 3.75% ) (72.27%) TopDownL1 # 0.52 retiring ( +- 0.16% ) (66.55%) # 0.27 frontend_bound ( +- 0.09% ) (60.91%) # 0.14 bad_speculation ( +- 0.08% ) (49.85%) # 0.07 backend_bound ( +- 0.16% ) (33.33%) 18982866028 L1-dcache-loads # 1.734 G/sec ( +- 0.24% ) (33.34%) 8802454 L1-dcache-load-misses # 0.05% of all L1-dcache accesses ( +- 52.30% ) (33.31%) 2612962 LLC-loads # 238.661 K/sec ( +- 29.78% ) (33.45%) 264107 LLC-load-misses # 10.11% of all LL-cache accesses ( +- 18.34% ) (39.07%) 15793205997 L1-icache-loads # 1.443 G/sec ( +- 0.15% ) (39.09%) 3930802 L1-icache-load-misses # 0.02% of all L1-icache accesses ( +- 3.72% ) (44.66%) 20097828496 dTLB-loads # 1.836 G/sec ( +- 0.09% ) (44.68%) 961757 dTLB-load-misses # 0.00% of all dTLB cache accesses ( +- 3.32% ) (50.15%) 15838728506 iTLB-loads # 1.447 G/sec ( +- 0.09% ) (55.62%) 167652 iTLB-load-misses # 0.00% of all iTLB cache accesses ( +- 1.28% ) (55.52%) <not supported> L1-dcache-prefetches <not supported> L1-dcache-prefetch-misses 11.6173 +- 0.0268 seconds time elapsed ( +- 0.23% ) [1] https://lore.kernel.org/bpf/[email protected]/ Signed-off-by: Xu Kuohai <[email protected]> Link: https://lore.kernel.org/r/[email protected] Signed-off-by: Alexei Starovoitov <[email protected]>
kernel-patches-daemon-bpf-rc bot
pushed a commit
that referenced
this pull request
Aug 28, 2024
Xu Kuohai says: ==================== bpf, arm64: Simplify jited prologue/epilogue From: Xu Kuohai <[email protected]> The arm64 jit blindly saves/restores all callee-saved registers, making the jited result looks a bit too compliated. For example, for an empty prog, the jited result is: 0: bti jc 4: mov x9, lr 8: nop c: paciasp 10: stp fp, lr, [sp, #-16]! 14: mov fp, sp 18: stp x19, x20, [sp, #-16]! 1c: stp x21, x22, [sp, #-16]! 20: stp x26, x25, [sp, #-16]! 24: mov x26, #0 28: stp x26, x25, [sp, #-16]! 2c: mov x26, sp 30: stp x27, x28, [sp, #-16]! 34: mov x25, sp 38: bti j // tailcall target 3c: sub sp, sp, #0 40: mov x7, #0 44: add sp, sp, #0 48: ldp x27, x28, [sp], #16 4c: ldp x26, x25, [sp], #16 50: ldp x26, x25, [sp], #16 54: ldp x21, x22, [sp], #16 58: ldp x19, x20, [sp], #16 5c: ldp fp, lr, [sp], #16 60: mov x0, x7 64: autiasp 68: ret Clearly, there is no need to save/restore unused callee-saved registers. This patch does this change, making the jited image to only save/restore the callee-saved registers it uses. Now the jited result of empty prog is: 0: bti jc 4: mov x9, lr 8: nop c: paciasp 10: stp fp, lr, [sp, #-16]! 14: mov fp, sp 18: stp xzr, x26, [sp, #-16]! 1c: mov x26, sp 20: bti j // tailcall target 24: mov x7, #0 28: ldp xzr, x26, [sp], #16 2c: ldp fp, lr, [sp], #16 30: mov x0, x7 34: autiasp 38: ret ==================== Acked-by: Puranjay Mohan <[email protected]> Link: https://lore.kernel.org/r/[email protected] Signed-off-by: Alexei Starovoitov <[email protected]>
kernel-patches-daemon-bpf-rc bot
pushed a commit
that referenced
this pull request
Sep 3, 2024
Currently, BPF_CALL is always jited to indirect call. When target is within the range of direct call, BPF_CALL can be jited to direct call. For example, the following BPF_CALL call __htab_map_lookup_elem is always jited to indirect call: mov x10, #0xffffffffffff18f4 movk x10, #0x821, lsl #16 movk x10, #0x8000, lsl #32 blr x10 When the address of target __htab_map_lookup_elem is within the range of direct call, the BPF_CALL can be jited to: bl 0xfffffffffd33bc98 This patch does such jit optimization by emitting arm64 direct calls for BPF_CALL when possible, indirect calls otherwise. Without this patch, the jit works as follows. 1. First pass A. Determine jited position and size for each bpf instruction. B. Computed the jited image size. 2. Allocate jited image with size computed in step 1. 3. Second pass A. Adjust jump offset for jump instructions B. Write the final image. This works because, for a given bpf prog, regardless of where the jited image is allocated, the jited result for each instruction is fixed. The second pass differs from the first only in adjusting the jump offsets, like changing "jmp imm1" to "jmp imm2", while the position and size of the "jmp" instruction remain unchanged. Now considering whether to jit BPF_CALL to arm64 direct or indirect call instruction. The choice depends solely on the jump offset: direct call if the jump offset is within 128MB, indirect call otherwise. For a given BPF_CALL, the target address is known, so the jump offset is decided by the jited address of the BPF_CALL instruction. In other words, for a given bpf prog, the jited result for each BPF_CALL is determined by its jited address. The jited address for a BPF_CALL is the jited image address plus the total jited size of all preceding instructions. For a given bpf prog, there are clearly no BPF_CALL instructions before the first BPF_CALL instruction. Since the jited result for all other instructions other than BPF_CALL are fixed, the total jited size preceding the first BPF_CALL is also fixed. Therefore, once the jited image is allocated, the jited address for the first BPF_CALL is fixed. Now that the jited result for the first BPF_CALL is fixed, the jited results for all instructions preceding the second BPF_CALL are fixed. So the jited address and result for the second BPF_CALL are also fixed. Similarly, we can conclude that the jited addresses and results for all subsequent BPF_CALL instructions are fixed. This means that, for a given bpf prog, once the jited image is allocated, the jited address and result for all instructions, including all BPF_CALL instructions, are fixed. Based on the observation, with this patch, the jit works as follows. 1. First pass Estimate the maximum jited image size. In this pass, all BPF_CALLs are jited to arm64 indirect calls since the jump offsets are unknown because the jited image is not allocated. 2. Allocate jited image with size estimated in step 1. 3. Second pass A. Determine the jited result for each BPF_CALL. B. Determine jited address and size for each bpf instruction. 4. Third pass A. Adjust jump offset for jump instructions. B. Write the final image. Signed-off-by: Xu Kuohai <[email protected]>
kernel-patches-daemon-bpf-rc bot
pushed a commit
that referenced
this pull request
Sep 4, 2024
Currently, BPF_CALL is always jited to indirect call. When target is within the range of direct call, BPF_CALL can be jited to direct call. For example, the following BPF_CALL call __htab_map_lookup_elem is always jited to indirect call: mov x10, #0xffffffffffff18f4 movk x10, #0x821, lsl #16 movk x10, #0x8000, lsl #32 blr x10 When the address of target __htab_map_lookup_elem is within the range of direct call, the BPF_CALL can be jited to: bl 0xfffffffffd33bc98 This patch does such jit optimization by emitting arm64 direct calls for BPF_CALL when possible, indirect calls otherwise. Without this patch, the jit works as follows. 1. First pass A. Determine jited position and size for each bpf instruction. B. Computed the jited image size. 2. Allocate jited image with size computed in step 1. 3. Second pass A. Adjust jump offset for jump instructions B. Write the final image. This works because, for a given bpf prog, regardless of where the jited image is allocated, the jited result for each instruction is fixed. The second pass differs from the first only in adjusting the jump offsets, like changing "jmp imm1" to "jmp imm2", while the position and size of the "jmp" instruction remain unchanged. Now considering whether to jit BPF_CALL to arm64 direct or indirect call instruction. The choice depends solely on the jump offset: direct call if the jump offset is within 128MB, indirect call otherwise. For a given BPF_CALL, the target address is known, so the jump offset is decided by the jited address of the BPF_CALL instruction. In other words, for a given bpf prog, the jited result for each BPF_CALL is determined by its jited address. The jited address for a BPF_CALL is the jited image address plus the total jited size of all preceding instructions. For a given bpf prog, there are clearly no BPF_CALL instructions before the first BPF_CALL instruction. Since the jited result for all other instructions other than BPF_CALL are fixed, the total jited size preceding the first BPF_CALL is also fixed. Therefore, once the jited image is allocated, the jited address for the first BPF_CALL is fixed. Now that the jited result for the first BPF_CALL is fixed, the jited results for all instructions preceding the second BPF_CALL are fixed. So the jited address and result for the second BPF_CALL are also fixed. Similarly, we can conclude that the jited addresses and results for all subsequent BPF_CALL instructions are fixed. This means that, for a given bpf prog, once the jited image is allocated, the jited address and result for all instructions, including all BPF_CALL instructions, are fixed. Based on the observation, with this patch, the jit works as follows. 1. First pass Estimate the maximum jited image size. In this pass, all BPF_CALLs are jited to arm64 indirect calls since the jump offsets are unknown because the jited image is not allocated. 2. Allocate jited image with size estimated in step 1. 3. Second pass A. Determine the jited result for each BPF_CALL. B. Determine jited address and size for each bpf instruction. 4. Third pass A. Adjust jump offset for jump instructions. B. Write the final image. Signed-off-by: Xu Kuohai <[email protected]> Reviewed-by: Puranjay Mohan <[email protected]> Link: https://lore.kernel.org/r/[email protected] Signed-off-by: Alexei Starovoitov <[email protected]>
kernel-patches-daemon-bpf-rc bot
pushed a commit
that referenced
this pull request
Sep 6, 2024
A sysfs reader can race with a device reset or removal, attempting to read device state when the device is not actually present. eg: [exception RIP: qed_get_current_link+17] #8 [ffffb9e4f2907c48] qede_get_link_ksettings at ffffffffc07a994a [qede] #9 [ffffb9e4f2907cd8] __rh_call_get_link_ksettings at ffffffff992b01a3 #10 [ffffb9e4f2907d38] __ethtool_get_link_ksettings at ffffffff992b04e4 #11 [ffffb9e4f2907d90] duplex_show at ffffffff99260300 #12 [ffffb9e4f2907e38] dev_attr_show at ffffffff9905a01c #13 [ffffb9e4f2907e50] sysfs_kf_seq_show at ffffffff98e0145b #14 [ffffb9e4f2907e68] seq_read at ffffffff98d902e3 #15 [ffffb9e4f2907ec8] vfs_read at ffffffff98d657d1 #16 [ffffb9e4f2907f00] ksys_read at ffffffff98d65c3f #17 [ffffb9e4f2907f38] do_syscall_64 at ffffffff98a052fb crash> struct net_device.state ffff9a9d21336000 state = 5, state 5 is __LINK_STATE_START (0b1) and __LINK_STATE_NOCARRIER (0b100). The device is not present, note lack of __LINK_STATE_PRESENT (0b10). This is the same sort of panic as observed in commit 4224cfd ("net-sysfs: add check for netdevice being present to speed_show"). There are many other callers of __ethtool_get_link_ksettings() which don't have a device presence check. Move this check into ethtool to protect all callers. Fixes: d519e17 ("net: export device speed and duplex via sysfs") Fixes: 4224cfd ("net-sysfs: add check for netdevice being present to speed_show") Signed-off-by: Jamie Bainbridge <[email protected]> Link: https://patch.msgid.link/8bae218864beaa44ed01628140475b9bf641c5b0.1724393671.git.jamie.bainbridge@gmail.com Signed-off-by: Jakub Kicinski <[email protected]>
kernel-patches-daemon-bpf-rc bot
pushed a commit
that referenced
this pull request
Sep 13, 2024
…rnel/git/netfilter/nf-next Pablo Neira Ayuso says: ==================== Netfilter updates for net-next The following patchset contains Netfilter updates for net-next: Patch #1 adds ctnetlink support for kernel side filtering for deletions, from Changliang Wu. Patch #2 updates nft_counter support to Use u64_stats_t, from Sebastian Andrzej Siewior. Patch #3 uses kmemdup_array() in all xtables frontends, from Yan Zhen. Patch #4 is a oneliner to use ERR_CAST() in nf_conntrack instead opencoded casting, from Shen Lichuan. Patch #5 removes unused argument in nftables .validate interface, from Florian Westphal. Patch #6 is a oneliner to correct a typo in nftables kdoc, from Simon Horman. Patch #7 fixes missing kdoc in nftables, also from Simon. Patch #8 updates nftables to handle timeout less than CONFIG_HZ. Patch #9 rejects element expiration if timeout is zero, otherwise it is silently ignored. Patch #10 disallows element expiration larger than timeout. Patch #11 removes unnecessary READ_ONCE annotation while mutex is held. Patch #12 adds missing READ_ONCE/WRITE_ONCE annotation in dynset. Patch #13 annotates data-races around element expiration. Patch #14 allocates timeout and expiration in one single set element extension, they are tighly couple, no reason to keep them separated anymore. Patch #15 updates nftables to interpret zero timeout element as never times out. Note that it is already possible to declare sets with elements that never time out but this generalizes to all kind of set with timeouts. Patch #16 supports for element timeout and expiration updates. * tag 'nf-next-24-09-06' of git://git.kernel.org/pub/scm/linux/kernel/git/netfilter/nf-next: netfilter: nf_tables: set element timeout update support netfilter: nf_tables: zero timeout means element never times out netfilter: nf_tables: consolidate timeout extension for elements netfilter: nf_tables: annotate data-races around element expiration netfilter: nft_dynset: annotate data-races around set timeout netfilter: nf_tables: remove annotation to access set timeout while holding lock netfilter: nf_tables: reject expiration higher than timeout netfilter: nf_tables: reject element expiration with no timeout netfilter: nf_tables: elements with timeout below CONFIG_HZ never expire netfilter: nf_tables: Add missing Kernel doc netfilter: nf_tables: Correct spelling in nf_tables.h netfilter: nf_tables: drop unused 3rd argument from validate callback ops netfilter: conntrack: Convert to use ERR_CAST() netfilter: Use kmemdup_array instead of kmemdup for multiple allocation netfilter: nft_counter: Use u64_stats_t for statistic. netfilter: ctnetlink: support CTA_FILTER for flush ==================== Link: https://patch.msgid.link/[email protected] Signed-off-by: Jakub Kicinski <[email protected]>
kernel-patches-daemon-bpf-rc bot
pushed a commit
that referenced
this pull request
Sep 23, 2024
iter_finish_branch_entry() doesn't put the branch_info from/to map elements creating memory leaks. This can be seen with: ``` $ perf record -e cycles -b perf test -w noploop $ perf report -D ... Direct leak of 984344 byte(s) in 123043 object(s) allocated from: #0 0x7fb2654f3bd7 in malloc libsanitizer/asan/asan_malloc_linux.cpp:69 #1 0x564d3400d10b in map__get util/map.h:186 #2 0x564d3400d10b in ip__resolve_ams util/machine.c:1981 #3 0x564d34014d81 in sample__resolve_bstack util/machine.c:2151 #4 0x564d34094790 in iter_prepare_branch_entry util/hist.c:898 #5 0x564d34098fa4 in hist_entry_iter__add util/hist.c:1238 #6 0x564d33d1f0c7 in process_sample_event tools/perf/builtin-report.c:334 #7 0x564d34031eb7 in perf_session__deliver_event util/session.c:1655 #8 0x564d3403ba52 in do_flush util/ordered-events.c:245 #9 0x564d3403ba52 in __ordered_events__flush util/ordered-events.c:324 #10 0x564d3402d32e in perf_session__process_user_event util/session.c:1708 #11 0x564d34032480 in perf_session__process_event util/session.c:1877 #12 0x564d340336ad in reader__read_event util/session.c:2399 #13 0x564d34033fdc in reader__process_events util/session.c:2448 #14 0x564d34033fdc in __perf_session__process_events util/session.c:2495 #15 0x564d34033fdc in perf_session__process_events util/session.c:2661 #16 0x564d33d27113 in __cmd_report tools/perf/builtin-report.c:1065 #17 0x564d33d27113 in cmd_report tools/perf/builtin-report.c:1805 #18 0x564d33e0ccb7 in run_builtin tools/perf/perf.c:350 #19 0x564d33e0d45e in handle_internal_command tools/perf/perf.c:403 #20 0x564d33cdd827 in run_argv tools/perf/perf.c:447 #21 0x564d33cdd827 in main tools/perf/perf.c:561 ... ``` Clearing up the map_symbols properly creates maps reference count issues so resolve those. Resolving this issue doesn't improve peak heap consumption for the test above. Committer testing: $ sudo dnf install libasan $ make -k CORESIGHT=1 EXTRA_CFLAGS="-fsanitize=address" CC=clang O=/tmp/build/$(basename $PWD)/ -C tools/perf install-bin Reviewed-by: Kan Liang <[email protected]> Signed-off-by: Ian Rogers <[email protected]> Tested-by: Arnaldo Carvalho de Melo <[email protected]> Cc: Adrian Hunter <[email protected]> Cc: Alexander Shishkin <[email protected]> Cc: Ingo Molnar <[email protected]> Cc: Jiri Olsa <[email protected]> Cc: Mark Rutland <[email protected]> Cc: Namhyung Kim <[email protected]> Cc: Peter Zijlstra <[email protected]> Cc: Sun Haiyong <[email protected]> Cc: Yanteng Si <[email protected]> Link: https://lore.kernel.org/r/[email protected] Signed-off-by: Arnaldo Carvalho de Melo <[email protected]>
kernel-patches-daemon-bpf-rc bot
pushed a commit
that referenced
this pull request
Sep 23, 2024
The fields in the hist_entry are filled on-demand which means they only have meaningful values when relevant sort keys are used. So if neither of 'dso' nor 'sym' sort keys are used, the map/symbols in the hist entry can be garbage. So it shouldn't access it unconditionally. I got a segfault, when I wanted to see cgroup profiles. $ sudo perf record -a --all-cgroups --synth=cgroup true $ sudo perf report -s cgroup Program received signal SIGSEGV, Segmentation fault. 0x00005555557a8d90 in map__dso (map=0x0) at util/map.h:48 48 return RC_CHK_ACCESS(map)->dso; (gdb) bt #0 0x00005555557a8d90 in map__dso (map=0x0) at util/map.h:48 #1 0x00005555557aa39b in map__load (map=0x0) at util/map.c:344 #2 0x00005555557aa592 in map__find_symbol (map=0x0, addr=140736115941088) at util/map.c:385 #3 0x00005555557ef000 in hists__findnew_entry (hists=0x555556039d60, entry=0x7fffffffa4c0, al=0x7fffffffa8c0, sample_self=true) at util/hist.c:644 #4 0x00005555557ef61c in __hists__add_entry (hists=0x555556039d60, al=0x7fffffffa8c0, sym_parent=0x0, bi=0x0, mi=0x0, ki=0x0, block_info=0x0, sample=0x7fffffffaa90, sample_self=true, ops=0x0) at util/hist.c:761 #5 0x00005555557ef71f in hists__add_entry (hists=0x555556039d60, al=0x7fffffffa8c0, sym_parent=0x0, bi=0x0, mi=0x0, ki=0x0, sample=0x7fffffffaa90, sample_self=true) at util/hist.c:779 #6 0x00005555557f00fb in iter_add_single_normal_entry (iter=0x7fffffffa900, al=0x7fffffffa8c0) at util/hist.c:1015 #7 0x00005555557f09a7 in hist_entry_iter__add (iter=0x7fffffffa900, al=0x7fffffffa8c0, max_stack_depth=127, arg=0x7fffffffbce0) at util/hist.c:1260 #8 0x00005555555ba7ce in process_sample_event (tool=0x7fffffffbce0, event=0x7ffff7c14128, sample=0x7fffffffaa90, evsel=0x555556039ad0, machine=0x5555560388e8) at builtin-report.c:334 #9 0x00005555557b30c8 in evlist__deliver_sample (evlist=0x555556039010, tool=0x7fffffffbce0, event=0x7ffff7c14128, sample=0x7fffffffaa90, evsel=0x555556039ad0, machine=0x5555560388e8) at util/session.c:1232 #10 0x00005555557b32bc in machines__deliver_event (machines=0x5555560388e8, evlist=0x555556039010, event=0x7ffff7c14128, sample=0x7fffffffaa90, tool=0x7fffffffbce0, file_offset=110888, file_path=0x555556038ff0 "perf.data") at util/session.c:1271 #11 0x00005555557b3848 in perf_session__deliver_event (session=0x5555560386d0, event=0x7ffff7c14128, tool=0x7fffffffbce0, file_offset=110888, file_path=0x555556038ff0 "perf.data") at util/session.c:1354 #12 0x00005555557affaf in ordered_events__deliver_event (oe=0x555556038e60, event=0x555556135aa0) at util/session.c:132 #13 0x00005555557bb605 in do_flush (oe=0x555556038e60, show_progress=false) at util/ordered-events.c:245 #14 0x00005555557bb95c in __ordered_events__flush (oe=0x555556038e60, how=OE_FLUSH__ROUND, timestamp=0) at util/ordered-events.c:324 #15 0x00005555557bba46 in ordered_events__flush (oe=0x555556038e60, how=OE_FLUSH__ROUND) at util/ordered-events.c:342 #16 0x00005555557b1b3b in perf_event__process_finished_round (tool=0x7fffffffbce0, event=0x7ffff7c15bb8, oe=0x555556038e60) at util/session.c:780 #17 0x00005555557b3b27 in perf_session__process_user_event (session=0x5555560386d0, event=0x7ffff7c15bb8, file_offset=117688, file_path=0x555556038ff0 "perf.data") at util/session.c:1406 As you can see the entry->ms.map was NULL even if he->ms.map has a value. This is because 'sym' sort key is not given, so it cannot assume whether he->ms.sym and entry->ms.sym is the same. I only checked the 'sym' sort key here as it implies 'dso' behavior (so maps are the same). Fixes: ac01c8c ("perf hist: Update hist symbol when updating maps") Signed-off-by: Namhyung Kim <[email protected]> Cc: Adrian Hunter <[email protected]> Cc: Ian Rogers <[email protected]> Cc: Ingo Molnar <[email protected]> Cc: Jiri Olsa <[email protected]> Cc: Kan Liang <[email protected]> Cc: Matt Fleming <[email protected]> Cc: Peter Zijlstra <[email protected]> Cc: Stephane Eranian <[email protected]> Link: https://lore.kernel.org/r/[email protected] Signed-off-by: Arnaldo Carvalho de Melo <[email protected]>
kernel-patches-daemon-bpf-rc bot
pushed a commit
that referenced
this pull request
Nov 1, 2024
During the migration of Soundwire runtime stream allocation from the Qualcomm Soundwire controller to SoC's soundcard drivers the sdm845 soundcard was forgotten. At this point any playback attempt or audio daemon startup, for instance on sdm845-db845c (Qualcomm RB3 board), will result in stream pointer NULL dereference: Unable to handle kernel NULL pointer dereference at virtual address 0000000000000020 Mem abort info: ESR = 0x0000000096000004 EC = 0x25: DABT (current EL), IL = 32 bits SET = 0, FnV = 0 EA = 0, S1PTW = 0 FSC = 0x04: level 0 translation fault Data abort info: ISV = 0, ISS = 0x00000004, ISS2 = 0x00000000 CM = 0, WnR = 0, TnD = 0, TagAccess = 0 GCS = 0, Overlay = 0, DirtyBit = 0, Xs = 0 user pgtable: 4k pages, 48-bit VAs, pgdp=0000000101ecf000 [0000000000000020] pgd=0000000000000000, p4d=0000000000000000 Internal error: Oops: 0000000096000004 [#1] PREEMPT SMP Modules linked in: ... CPU: 5 UID: 0 PID: 1198 Comm: aplay Not tainted 6.12.0-rc2-qcomlt-arm64-00059-g9d78f315a362-dirty #18 Hardware name: Thundercomm Dragonboard 845c (DT) pstate: 60400005 (nZCv daif +PAN -UAO -TCO -DIT -SSBS BTYPE=--) pc : sdw_stream_add_slave+0x44/0x380 [soundwire_bus] lr : sdw_stream_add_slave+0x44/0x380 [soundwire_bus] sp : ffff80008a2035c0 x29: ffff80008a2035c0 x28: ffff80008a203978 x27: 0000000000000000 x26: 00000000000000c0 x25: 0000000000000000 x24: ffff1676025f4800 x23: ffff167600ff1cb8 x22: ffff167600ff1c98 x21: 0000000000000003 x20: ffff167607316000 x19: ffff167604e64e80 x18: 0000000000000000 x17: 0000000000000000 x16: ffffcec265074160 x15: 0000000000000000 x14: 0000000000000000 x13: 0000000000000000 x12: 0000000000000000 x11: 0000000000000000 x10: 0000000000000000 x9 : 0000000000000000 x8 : 0000000000000000 x7 : 0000000000000000 x6 : ffff167600ff1cec x5 : ffffcec22cfa2010 x4 : 0000000000000000 x3 : 0000000000000003 x2 : ffff167613f836c0 x1 : 0000000000000000 x0 : ffff16761feb60b8 Call trace: sdw_stream_add_slave+0x44/0x380 [soundwire_bus] wsa881x_hw_params+0x68/0x80 [snd_soc_wsa881x] snd_soc_dai_hw_params+0x3c/0xa4 __soc_pcm_hw_params+0x230/0x660 dpcm_be_dai_hw_params+0x1d0/0x3f8 dpcm_fe_dai_hw_params+0x98/0x268 snd_pcm_hw_params+0x124/0x460 snd_pcm_common_ioctl+0x998/0x16e8 snd_pcm_ioctl+0x34/0x58 __arm64_sys_ioctl+0xac/0xf8 invoke_syscall+0x48/0x104 el0_svc_common.constprop.0+0x40/0xe0 do_el0_svc+0x1c/0x28 el0_svc+0x34/0xe0 el0t_64_sync_handler+0x120/0x12c el0t_64_sync+0x190/0x194 Code: aa0403fb f9418400 9100e000 9400102f (f8420f22) ---[ end trace 0000000000000000 ]--- 0000000000006108 <sdw_stream_add_slave>: 6108: d503233f paciasp 610c: a9b97bfd stp x29, x30, [sp, #-112]! 6110: 910003fd mov x29, sp 6114: a90153f3 stp x19, x20, [sp, #16] 6118: a9025bf5 stp x21, x22, [sp, #32] 611c: aa0103f6 mov x22, x1 6120: 2a0303f5 mov w21, w3 6124: a90363f7 stp x23, x24, [sp, #48] 6128: aa0003f8 mov x24, x0 612c: aa0203f7 mov x23, x2 6130: a9046bf9 stp x25, x26, [sp, #64] 6134: aa0403f9 mov x25, x4 <-- x4 copied to x25 6138: a90573fb stp x27, x28, [sp, #80] 613c: aa0403fb mov x27, x4 6140: f9418400 ldr x0, [x0, #776] 6144: 9100e000 add x0, x0, #0x38 6148: 94000000 bl 0 <mutex_lock> 614c: f8420f22 ldr x2, [x25, #32]! <-- offset 0x44 ^^^ This is 0x6108 + offset 0x44 from the beginning of sdw_stream_add_slave() where data abort happens. wsa881x_hw_params() is called with stream = NULL and passes it further in register x4 (5th argument) to sdw_stream_add_slave() without any checks. Value from x4 is copied to x25 and finally it aborts on trying to load a value from address in x25 plus offset 32 (in dec) which corresponds to master_list member in struct sdw_stream_runtime: struct sdw_stream_runtime { const char * name; /* 0 8 */ struct sdw_stream_params params; /* 8 12 */ enum sdw_stream_state state; /* 20 4 */ enum sdw_stream_type type; /* 24 4 */ /* XXX 4 bytes hole, try to pack */ here-> struct list_head master_list; /* 32 16 */ int m_rt_count; /* 48 4 */ /* size: 56, cachelines: 1, members: 6 */ /* sum members: 48, holes: 1, sum holes: 4 */ /* padding: 4 */ /* last cacheline: 56 bytes */ Fix this by adding required calls to qcom_snd_sdw_startup() and sdw_release_stream() to startup and shutdown routines which restores the previous correct behaviour when ->set_stream() method is called to set a valid stream runtime pointer on playback startup. Reproduced and then fix was tested on db845c RB3 board. Reported-by: Dmitry Baryshkov <[email protected]> Cc: [email protected] Fixes: 15c7fab ("ASoC: qcom: Move Soundwire runtime stream alloc to soundcards") Cc: Srinivas Kandagatla <[email protected]> Cc: Dmitry Baryshkov <[email protected]> Cc: Krzysztof Kozlowski <[email protected]> Cc: Pierre-Louis Bossart <[email protected]> Signed-off-by: Alexey Klimov <[email protected]> Tested-by: Steev Klimaszewski <[email protected]> # Lenovo Yoga C630 Reviewed-by: Krzysztof Kozlowski <[email protected]> Reviewed-by: Srinivas Kandagatla <[email protected]> Link: https://patch.msgid.link/[email protected] Signed-off-by: Mark Brown <[email protected]>
kernel-patches-daemon-bpf-rc bot
pushed a commit
that referenced
this pull request
Jan 9, 2025
Hou Tao says: ==================== The use of migrate_{disable|enable} pair in BPF is mainly due to the introduction of bpf memory allocator and the use of per-CPU data struct in its internal implementation. The caller needs to disable migration before invoking the alloc or free APIs of bpf memory allocator, and enable migration after the invocation. The main users of bpf memory allocator are various kind of bpf maps in which the map values or the special fields in the map values are allocated by using bpf memory allocator. At present, the running context for bpf program has already disabled migration explictly or implictly, therefore, when these maps are manipulated in bpf program, it is OK to not invoke migrate_disable() and migrate_enable() pair. Howevers, it is not always the case when these maps are manipulated through bpf syscall, therefore many migrate_{disable|enable} pairs are added when the map can either be manipulated by BPF program or BPF syscall. The initial idea of reducing the use of migrate_{disable|enable} comes from Alexei [1]. I turned it into a patch set that archives the goals through the following three methods: 1. remove unnecessary migrate_{disable|enable} pair when the BPF syscall path also disables migration, it is OK to remove the pair. Patch #1~#3 fall into this category, while patch #4~#5 are partially included. 2. move the migrate_{disable|enable} pair from inner callee to outer caller Instead of invoking migrate_disable() in the inner callee, invoking migrate_disable() in the outer caller to simplify reasoning about when migrate_disable() is needed. Patch #4~#5 and patch #6~#19 belongs to this category. 3. add cant_migrate() check in the inner callee Add cant_migrate() check in the inner callee to ensure the guarantee that migration is disabled is not broken. Patch #1~#5, #13, #16~#19 also belong to this category. Please check the individual patches for more details. Comments are always welcome. Change Log: v2: * sqaush the ->map_free related patches (#10~#12, #15) into one patch * remove unnecessary cant_migrate() checks. v1: https://lore.kernel.org/bpf/[email protected] ==================== Link: https://patch.msgid.link/[email protected] Signed-off-by: Alexei Starovoitov <[email protected]>
kernel-patches-daemon-bpf-rc bot
pushed a commit
that referenced
this pull request
Jan 30, 2025
Function xen_pin_page calls xen_pte_lock, which in turn grab page table lock (ptlock). When locking, xen_pte_lock expect mm->page_table_lock to be held before grabbing ptlock, but this does not happen when pinning is caused by xen_mm_pin_all. This commit addresses lockdep warning below, which shows up when suspending a Xen VM. [ 3680.658422] Freezing user space processes [ 3680.660156] Freezing user space processes completed (elapsed 0.001 seconds) [ 3680.660182] OOM killer disabled. [ 3680.660192] Freezing remaining freezable tasks [ 3680.661485] Freezing remaining freezable tasks completed (elapsed 0.001 seconds) [ 3680.685254] [ 3680.685265] ================================== [ 3680.685269] WARNING: Nested lock was not taken [ 3680.685274] 6.12.0+ #16 Tainted: G W [ 3680.685279] ---------------------------------- [ 3680.685283] migration/0/19 is trying to lock: [ 3680.685288] ffff88800bac33c0 (ptlock_ptr(ptdesc)#2){+.+.}-{3:3}, at: xen_pin_page+0x175/0x1d0 [ 3680.685303] [ 3680.685303] but this task is not holding: [ 3680.685308] init_mm.page_table_lock [ 3680.685311] [ 3680.685311] stack backtrace: [ 3680.685316] CPU: 0 UID: 0 PID: 19 Comm: migration/0 Tainted: G W 6.12.0+ #16 [ 3680.685324] Tainted: [W]=WARN [ 3680.685328] Stopper: multi_cpu_stop+0x0/0x120 <- __stop_cpus.constprop.0+0x8c/0xd0 [ 3680.685339] Call Trace: [ 3680.685344] <TASK> [ 3680.685347] dump_stack_lvl+0x77/0xb0 [ 3680.685356] __lock_acquire+0x917/0x2310 [ 3680.685364] lock_acquire+0xce/0x2c0 [ 3680.685369] ? xen_pin_page+0x175/0x1d0 [ 3680.685373] _raw_spin_lock_nest_lock+0x2f/0x70 [ 3680.685381] ? xen_pin_page+0x175/0x1d0 [ 3680.685386] xen_pin_page+0x175/0x1d0 [ 3680.685390] ? __pfx_xen_pin_page+0x10/0x10 [ 3680.685394] __xen_pgd_walk+0x233/0x2c0 [ 3680.685401] ? stop_one_cpu+0x91/0x100 [ 3680.685405] __xen_pgd_pin+0x5d/0x250 [ 3680.685410] xen_mm_pin_all+0x70/0xa0 [ 3680.685415] xen_pv_pre_suspend+0xf/0x280 [ 3680.685420] xen_suspend+0x57/0x1a0 [ 3680.685428] multi_cpu_stop+0x6b/0x120 [ 3680.685432] ? update_cpumasks_hier+0x7c/0xa60 [ 3680.685439] ? __pfx_multi_cpu_stop+0x10/0x10 [ 3680.685443] cpu_stopper_thread+0x8c/0x140 [ 3680.685448] ? smpboot_thread_fn+0x20/0x1f0 [ 3680.685454] ? __pfx_smpboot_thread_fn+0x10/0x10 [ 3680.685458] smpboot_thread_fn+0xed/0x1f0 [ 3680.685462] kthread+0xde/0x110 [ 3680.685467] ? __pfx_kthread+0x10/0x10 [ 3680.685471] ret_from_fork+0x2f/0x50 [ 3680.685478] ? __pfx_kthread+0x10/0x10 [ 3680.685482] ret_from_fork_asm+0x1a/0x30 [ 3680.685489] </TASK> [ 3680.685491] [ 3680.685491] other info that might help us debug this: [ 3680.685497] 1 lock held by migration/0/19: [ 3680.685500] #0: ffffffff8284df38 (pgd_lock){+.+.}-{3:3}, at: xen_mm_pin_all+0x14/0xa0 [ 3680.685512] [ 3680.685512] stack backtrace: [ 3680.685518] CPU: 0 UID: 0 PID: 19 Comm: migration/0 Tainted: G W 6.12.0+ #16 [ 3680.685528] Tainted: [W]=WARN [ 3680.685531] Stopper: multi_cpu_stop+0x0/0x120 <- __stop_cpus.constprop.0+0x8c/0xd0 [ 3680.685538] Call Trace: [ 3680.685541] <TASK> [ 3680.685544] dump_stack_lvl+0x77/0xb0 [ 3680.685549] __lock_acquire+0x93c/0x2310 [ 3680.685554] lock_acquire+0xce/0x2c0 [ 3680.685558] ? xen_pin_page+0x175/0x1d0 [ 3680.685562] _raw_spin_lock_nest_lock+0x2f/0x70 [ 3680.685568] ? xen_pin_page+0x175/0x1d0 [ 3680.685572] xen_pin_page+0x175/0x1d0 [ 3680.685578] ? __pfx_xen_pin_page+0x10/0x10 [ 3680.685582] __xen_pgd_walk+0x233/0x2c0 [ 3680.685588] ? stop_one_cpu+0x91/0x100 [ 3680.685592] __xen_pgd_pin+0x5d/0x250 [ 3680.685596] xen_mm_pin_all+0x70/0xa0 [ 3680.685600] xen_pv_pre_suspend+0xf/0x280 [ 3680.685607] xen_suspend+0x57/0x1a0 [ 3680.685611] multi_cpu_stop+0x6b/0x120 [ 3680.685615] ? update_cpumasks_hier+0x7c/0xa60 [ 3680.685620] ? __pfx_multi_cpu_stop+0x10/0x10 [ 3680.685625] cpu_stopper_thread+0x8c/0x140 [ 3680.685629] ? smpboot_thread_fn+0x20/0x1f0 [ 3680.685634] ? __pfx_smpboot_thread_fn+0x10/0x10 [ 3680.685638] smpboot_thread_fn+0xed/0x1f0 [ 3680.685642] kthread+0xde/0x110 [ 3680.685645] ? __pfx_kthread+0x10/0x10 [ 3680.685649] ret_from_fork+0x2f/0x50 [ 3680.685654] ? __pfx_kthread+0x10/0x10 [ 3680.685657] ret_from_fork_asm+0x1a/0x30 [ 3680.685662] </TASK> [ 3680.685267] xen:grant_table: Grant tables using version 1 layout [ 3680.685921] OOM killer enabled. [ 3680.685934] Restarting tasks ... done. Signed-off-by: Maksym Planeta <[email protected]> Reviewed-by: Juergen Gross <[email protected]> Message-ID: <[email protected]> Signed-off-by: Juergen Gross <[email protected]>
kernel-patches-daemon-bpf-rc bot
pushed a commit
that referenced
this pull request
Apr 3, 2025
When a bio with REQ_PREFLUSH is submitted to dm, __send_empty_flush() generates a flush_bio with REQ_OP_WRITE | REQ_PREFLUSH | REQ_SYNC, which causes the flush_bio to be throttled by wbt_wait(). An example from v5.4, similar problem also exists in upstream: crash> bt 2091206 PID: 2091206 TASK: ffff2050df92a300 CPU: 109 COMMAND: "kworker/u260:0" #0 [ffff800084a2f7f0] __switch_to at ffff80004008aeb8 #1 [ffff800084a2f820] __schedule at ffff800040bfa0c4 #2 [ffff800084a2f880] schedule at ffff800040bfa4b4 #3 [ffff800084a2f8a0] io_schedule at ffff800040bfa9c4 #4 [ffff800084a2f8c0] rq_qos_wait at ffff8000405925bc #5 [ffff800084a2f940] wbt_wait at ffff8000405bb3a0 #6 [ffff800084a2f9a0] __rq_qos_throttle at ffff800040592254 #7 [ffff800084a2f9c0] blk_mq_make_request at ffff80004057cf38 #8 [ffff800084a2fa60] generic_make_request at ffff800040570138 #9 [ffff800084a2fae0] submit_bio at ffff8000405703b4 #10 [ffff800084a2fb50] xlog_write_iclog at ffff800001280834 [xfs] #11 [ffff800084a2fbb0] xlog_sync at ffff800001280c3c [xfs] #12 [ffff800084a2fbf0] xlog_state_release_iclog at ffff800001280df4 [xfs] #13 [ffff800084a2fc10] xlog_write at ffff80000128203c [xfs] #14 [ffff800084a2fcd0] xlog_cil_push at ffff8000012846dc [xfs] #15 [ffff800084a2fda0] xlog_cil_push_work at ffff800001284a2c [xfs] #16 [ffff800084a2fdb0] process_one_work at ffff800040111d08 #17 [ffff800084a2fe00] worker_thread at ffff8000401121cc #18 [ffff800084a2fe70] kthread at ffff800040118de4 After commit 2def284 ("xfs: don't allow log IO to be throttled"), the metadata submitted by xlog_write_iclog() should not be throttled. But due to the existence of the dm layer, throttling flush_bio indirectly causes the metadata bio to be throttled. Fix this by conditionally adding REQ_IDLE to flush_bio.bi_opf, which makes wbt_should_throttle() return false to avoid wbt_wait(). Signed-off-by: Jinliang Zheng <[email protected]> Reviewed-by: Tianxiang Peng <[email protected]> Reviewed-by: Hao Peng <[email protected]> Signed-off-by: Mikulas Patocka <[email protected]>
kernel-patches-daemon-bpf-rc bot
pushed a commit
that referenced
this pull request
Apr 23, 2025
There is a potential deadlock if we do report zones in an IO context, detailed in below lockdep report. When one process do a report zones and another process freezes the block device, the report zones side cannot allocate a tag because the freeze is already started. This can thus result in new block group creation to hang forever, blocking the write path. Thankfully, a new block group should be created on empty zones. So, reporting the zones is not necessary and we can set the write pointer = 0 and load the zone capacity from the block layer using bdev_zone_capacity() helper. ====================================================== WARNING: possible circular locking dependency detected 6.14.0-rc1 #252 Not tainted ------------------------------------------------------ modprobe/1110 is trying to acquire lock: ffff888100ac83e0 ((work_completion)(&(&wb->dwork)->work)){+.+.}-{0:0}, at: __flush_work+0x38f/0xb60 but task is already holding lock: ffff8881205b6f20 (&q->q_usage_counter(queue)#16){++++}-{0:0}, at: sd_remove+0x85/0x130 which lock already depends on the new lock. the existing dependency chain (in reverse order) is: -> #3 (&q->q_usage_counter(queue)#16){++++}-{0:0}: blk_queue_enter+0x3d9/0x500 blk_mq_alloc_request+0x47d/0x8e0 scsi_execute_cmd+0x14f/0xb80 sd_zbc_do_report_zones+0x1c1/0x470 sd_zbc_report_zones+0x362/0xd60 blkdev_report_zones+0x1b1/0x2e0 btrfs_get_dev_zones+0x215/0x7e0 [btrfs] btrfs_load_block_group_zone_info+0x6d2/0x2c10 [btrfs] btrfs_make_block_group+0x36b/0x870 [btrfs] btrfs_create_chunk+0x147d/0x2320 [btrfs] btrfs_chunk_alloc+0x2ce/0xcf0 [btrfs] start_transaction+0xce6/0x1620 [btrfs] btrfs_uuid_scan_kthread+0x4ee/0x5b0 [btrfs] kthread+0x39d/0x750 ret_from_fork+0x30/0x70 ret_from_fork_asm+0x1a/0x30 -> #2 (&fs_info->dev_replace.rwsem){++++}-{4:4}: down_read+0x9b/0x470 btrfs_map_block+0x2ce/0x2ce0 [btrfs] btrfs_submit_chunk+0x2d4/0x16c0 [btrfs] btrfs_submit_bbio+0x16/0x30 [btrfs] btree_write_cache_pages+0xb5a/0xf90 [btrfs] do_writepages+0x17f/0x7b0 __writeback_single_inode+0x114/0xb00 writeback_sb_inodes+0x52b/0xe00 wb_writeback+0x1a7/0x800 wb_workfn+0x12a/0xbd0 process_one_work+0x85a/0x1460 worker_thread+0x5e2/0xfc0 kthread+0x39d/0x750 ret_from_fork+0x30/0x70 ret_from_fork_asm+0x1a/0x30 -> #1 (&fs_info->zoned_meta_io_lock){+.+.}-{4:4}: __mutex_lock+0x1aa/0x1360 btree_write_cache_pages+0x252/0xf90 [btrfs] do_writepages+0x17f/0x7b0 __writeback_single_inode+0x114/0xb00 writeback_sb_inodes+0x52b/0xe00 wb_writeback+0x1a7/0x800 wb_workfn+0x12a/0xbd0 process_one_work+0x85a/0x1460 worker_thread+0x5e2/0xfc0 kthread+0x39d/0x750 ret_from_fork+0x30/0x70 ret_from_fork_asm+0x1a/0x30 -> #0 ((work_completion)(&(&wb->dwork)->work)){+.+.}-{0:0}: __lock_acquire+0x2f52/0x5ea0 lock_acquire+0x1b1/0x540 __flush_work+0x3ac/0xb60 wb_shutdown+0x15b/0x1f0 bdi_unregister+0x172/0x5b0 del_gendisk+0x841/0xa20 sd_remove+0x85/0x130 device_release_driver_internal+0x368/0x520 bus_remove_device+0x1f1/0x3f0 device_del+0x3bd/0x9c0 __scsi_remove_device+0x272/0x340 scsi_forget_host+0xf7/0x170 scsi_remove_host+0xd2/0x2a0 sdebug_driver_remove+0x52/0x2f0 [scsi_debug] device_release_driver_internal+0x368/0x520 bus_remove_device+0x1f1/0x3f0 device_del+0x3bd/0x9c0 device_unregister+0x13/0xa0 sdebug_do_remove_host+0x1fb/0x290 [scsi_debug] scsi_debug_exit+0x17/0x70 [scsi_debug] __do_sys_delete_module.isra.0+0x321/0x520 do_syscall_64+0x93/0x180 entry_SYSCALL_64_after_hwframe+0x76/0x7e other info that might help us debug this: Chain exists of: (work_completion)(&(&wb->dwork)->work) --> &fs_info->dev_replace.rwsem --> &q->q_usage_counter(queue)#16 Possible unsafe locking scenario: CPU0 CPU1 ---- ---- lock(&q->q_usage_counter(queue)#16); lock(&fs_info->dev_replace.rwsem); lock(&q->q_usage_counter(queue)#16); lock((work_completion)(&(&wb->dwork)->work)); *** DEADLOCK *** 5 locks held by modprobe/1110: #0: ffff88811f7bc108 (&dev->mutex){....}-{4:4}, at: device_release_driver_internal+0x8f/0x520 #1: ffff8881022ee0e0 (&shost->scan_mutex){+.+.}-{4:4}, at: scsi_remove_host+0x20/0x2a0 #2: ffff88811b4c4378 (&dev->mutex){....}-{4:4}, at: device_release_driver_internal+0x8f/0x520 #3: ffff8881205b6f20 (&q->q_usage_counter(queue)#16){++++}-{0:0}, at: sd_remove+0x85/0x130 #4: ffffffffa3284360 (rcu_read_lock){....}-{1:3}, at: __flush_work+0xda/0xb60 stack backtrace: CPU: 0 UID: 0 PID: 1110 Comm: modprobe Not tainted 6.14.0-rc1 #252 Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS 1.16.3-3.fc41 04/01/2014 Call Trace: <TASK> dump_stack_lvl+0x6a/0x90 print_circular_bug.cold+0x1e0/0x274 check_noncircular+0x306/0x3f0 ? __pfx_check_noncircular+0x10/0x10 ? mark_lock+0xf5/0x1650 ? __pfx_check_irq_usage+0x10/0x10 ? lockdep_lock+0xca/0x1c0 ? __pfx_lockdep_lock+0x10/0x10 __lock_acquire+0x2f52/0x5ea0 ? __pfx___lock_acquire+0x10/0x10 ? __pfx_mark_lock+0x10/0x10 lock_acquire+0x1b1/0x540 ? __flush_work+0x38f/0xb60 ? __pfx_lock_acquire+0x10/0x10 ? __pfx_lock_release+0x10/0x10 ? mark_held_locks+0x94/0xe0 ? __flush_work+0x38f/0xb60 __flush_work+0x3ac/0xb60 ? __flush_work+0x38f/0xb60 ? __pfx_mark_lock+0x10/0x10 ? __pfx___flush_work+0x10/0x10 ? __pfx_wq_barrier_func+0x10/0x10 ? __pfx___might_resched+0x10/0x10 ? mark_held_locks+0x94/0xe0 wb_shutdown+0x15b/0x1f0 bdi_unregister+0x172/0x5b0 ? __pfx_bdi_unregister+0x10/0x10 ? up_write+0x1ba/0x510 del_gendisk+0x841/0xa20 ? __pfx_del_gendisk+0x10/0x10 ? _raw_spin_unlock_irqrestore+0x35/0x60 ? __pm_runtime_resume+0x79/0x110 sd_remove+0x85/0x130 device_release_driver_internal+0x368/0x520 ? kobject_put+0x5d/0x4a0 bus_remove_device+0x1f1/0x3f0 device_del+0x3bd/0x9c0 ? __pfx_device_del+0x10/0x10 __scsi_remove_device+0x272/0x340 scsi_forget_host+0xf7/0x170 scsi_remove_host+0xd2/0x2a0 sdebug_driver_remove+0x52/0x2f0 [scsi_debug] ? kernfs_remove_by_name_ns+0xc0/0xf0 device_release_driver_internal+0x368/0x520 ? kobject_put+0x5d/0x4a0 bus_remove_device+0x1f1/0x3f0 device_del+0x3bd/0x9c0 ? __pfx_device_del+0x10/0x10 ? __pfx___mutex_unlock_slowpath+0x10/0x10 device_unregister+0x13/0xa0 sdebug_do_remove_host+0x1fb/0x290 [scsi_debug] scsi_debug_exit+0x17/0x70 [scsi_debug] __do_sys_delete_module.isra.0+0x321/0x520 ? __pfx___do_sys_delete_module.isra.0+0x10/0x10 ? __pfx_slab_free_after_rcu_debug+0x10/0x10 ? kasan_save_stack+0x2c/0x50 ? kasan_record_aux_stack+0xa3/0xb0 ? __call_rcu_common.constprop.0+0xc4/0xfb0 ? kmem_cache_free+0x3a0/0x590 ? __x64_sys_close+0x78/0xd0 do_syscall_64+0x93/0x180 ? lock_is_held_type+0xd5/0x130 ? __call_rcu_common.constprop.0+0x3c0/0xfb0 ? lockdep_hardirqs_on+0x78/0x100 ? __call_rcu_common.constprop.0+0x3c0/0xfb0 ? __pfx___call_rcu_common.constprop.0+0x10/0x10 ? kmem_cache_free+0x3a0/0x590 ? lockdep_hardirqs_on_prepare+0x16d/0x400 ? do_syscall_64+0x9f/0x180 ? lockdep_hardirqs_on+0x78/0x100 ? do_syscall_64+0x9f/0x180 ? __pfx___x64_sys_openat+0x10/0x10 ? lockdep_hardirqs_on_prepare+0x16d/0x400 ? do_syscall_64+0x9f/0x180 ? lockdep_hardirqs_on+0x78/0x100 ? do_syscall_64+0x9f/0x180 entry_SYSCALL_64_after_hwframe+0x76/0x7e RIP: 0033:0x7f436712b68b RSP: 002b:00007ffe9f1a8658 EFLAGS: 00000206 ORIG_RAX: 00000000000000b0 RAX: ffffffffffffffda RBX: 00005559b367fd80 RCX: 00007f436712b68b RDX: 0000000000000000 RSI: 0000000000000800 RDI: 00005559b367fde8 RBP: 00007ffe9f1a8680 R08: 1999999999999999 R09: 0000000000000000 R10: 00007f43671a5fe0 R11: 0000000000000206 R12: 0000000000000000 R13: 00007ffe9f1a86b0 R14: 0000000000000000 R15: 0000000000000000 </TASK> Reported-by: Shin'ichiro Kawasaki <[email protected]> CC: <[email protected]> # 6.13+ Tested-by: Shin'ichiro Kawasaki <[email protected]> Reviewed-by: Damien Le Moal <[email protected]> Reviewed-by: Johannes Thumshirn <[email protected]> Signed-off-by: Naohiro Aota <[email protected]> Signed-off-by: David Sterba <[email protected]>
Sign up for free
to join this conversation on GitHub.
Already have an account?
Sign in to comment
Add this suggestion to a batch that can be applied as a single commit.
This suggestion is invalid because no changes were made to the code.
Suggestions cannot be applied while the pull request is closed.
Suggestions cannot be applied while viewing a subset of changes.
Only one suggestion per line can be applied in a batch.
Add this suggestion to a batch that can be applied as a single commit.
Applying suggestions on deleted lines is not supported.
You must change the existing code in this line in order to create a valid suggestion.
Outdated suggestions cannot be applied.
This suggestion has been applied or marked resolved.
Suggestions cannot be applied from pending reviews.
Suggestions cannot be applied on multi-line comments.
Suggestions cannot be applied while the pull request is queued to merge.
Suggestion cannot be applied right now. Please check back later.
Pull request for series with
subject: Avoid size mismatches in skeletons
version: 3
url: https://patchwork.kernel.org/project/netdevbpf/list/?series=614738