-
Notifications
You must be signed in to change notification settings - Fork 130
bpf_fib_lookup: optionally skip neighbour lookup #224
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Conversation
Master branch: 1e9259e |
Master branch: c3f01fd |
cc31f11
to
65c77c2
Compare
Master branch: eca43ee |
65c77c2
to
32db893
Compare
The bpf_fib_lookup() helper performs a neighbour lookup for the destination IP and returns BPF_FIB_LKUP_NO_NEIGH if this fails, with the expectation that the BPF program will pass the packet up the stack in this case. However, with the addition of bpf_redirect_neigh() that can be used instead to perform the neighbour lookup, at the cost of a bit of duplicated work. For that we still need the target ifindex, and since bpf_fib_lookup() already has that at the time it performs the neighbour lookup, there is really no reason why it can't just return it in any case. So let's just always return the ifindex, and also add a flag that lets the caller turn off the neighbour lookup entirely in bpf_fib_lookup(). v2: - Add flag (Daniel) - Remove misleading code example from commit message (David) Cc: David Ahern <[email protected]> Signed-off-by: Toke Høiland-Jørgensen <[email protected]>
Master branch: ac53a0d |
32db893
to
67f3ea8
Compare
At least one diff in series https://patchwork.kernel.org/project/netdevbpf/list/?series=362021 irrelevant now. Closing PR. |
A static key warning splat appears during early boot on arm64 systems that credit randomness from devicetrees that contain an "rng-seed" property. This is because setup_machine_fdt() is called before jump_label_init() during setup_arch(). Let's swap the order of these two calls so that jump labels are initialized before the devicetree is unflattened and the rng seed is credited. static_key_enable_cpuslocked(): static key '0xffffffe51c6fcfc0' used before call to jump_label_init() WARNING: CPU: 0 PID: 0 at kernel/jump_label.c:166 static_key_enable_cpuslocked+0xb0/0xb8 Modules linked in: CPU: 0 PID: 0 Comm: swapper Not tainted 5.18.0+ #224 44b43e377bfc84bc99bb5ab885ff694984ee09ff pstate: 600001c9 (nZCv dAIF -PAN -UAO -TCO -DIT -SSBS BTYPE=--) pc : static_key_enable_cpuslocked+0xb0/0xb8 lr : static_key_enable_cpuslocked+0xb0/0xb8 sp : ffffffe51c393cf0 x29: ffffffe51c393cf0 x28: 000000008185054c x27: 00000000f1042f10 x26: 0000000000000000 x25: 00000000f10302b2 x24: 0000002513200000 x23: 0000002513200000 x22: ffffffe51c1c9000 x21: fffffffdfdc00000 x20: ffffffe51c2f0831 x19: ffffffe51c6fcfc0 x18: 00000000ffff1020 x17: 00000000e1e2ac90 x16: 00000000000000e0 x15: ffffffe51b710708 x14: 0000000000000066 x13: 0000000000000018 x12: 0000000000000000 x11: 0000000000000000 x10: 00000000ffffffff x9 : 0000000000000000 x8 : 0000000000000000 x7 : 61632065726f6665 x6 : 6220646573752027 x5 : ffffffe51c641d25 x4 : ffffffe51c13142c x3 : ffff0a00ffffff05 x2 : 40000000ffffe003 x1 : 00000000000001c0 x0 : 0000000000000065 Call trace: static_key_enable_cpuslocked+0xb0/0xb8 static_key_enable+0x2c/0x40 crng_set_ready+0x24/0x30 execute_in_process_context+0x80/0x90 _credit_init_bits+0x100/0x154 add_bootloader_randomness+0x64/0x78 early_init_dt_scan_chosen+0x140/0x184 early_init_dt_scan_nodes+0x28/0x4c early_init_dt_scan+0x40/0x44 setup_machine_fdt+0x7c/0x120 setup_arch+0x74/0x1d8 start_kernel+0x84/0x44c __primary_switched+0xc0/0xc8 ---[ end trace 0000000000000000 ]--- random: crng init done Machine model: Google Lazor (rev1 - 2) with LTE Cc: Hsin-Yi Wang <[email protected]> Cc: Douglas Anderson <[email protected]> Cc: Ard Biesheuvel <[email protected]> Cc: Steven Rostedt <[email protected]> Cc: Jason A. Donenfeld <[email protected]> Cc: Dominik Brodowski <[email protected]> Fixes: f5bda35 ("random: use static branch for crng_ready()") Signed-off-by: Stephen Boyd <[email protected]> Reviewed-by: Jason A. Donenfeld <[email protected]> Link: https://lore.kernel.org/r/[email protected] Signed-off-by: Catalin Marinas <[email protected]>
Stephen reported that a static key warning splat appears during early boot on systems that credit randomness from device trees that contain an "rng-seed" property, because because setup_machine_fdt() is called before jump_label_init() during setup_arch(): static_key_enable_cpuslocked(): static key '0xffffffe51c6fcfc0' used before call to jump_label_init() WARNING: CPU: 0 PID: 0 at kernel/jump_label.c:166 static_key_enable_cpuslocked+0xb0/0xb8 Modules linked in: CPU: 0 PID: 0 Comm: swapper Not tainted 5.18.0+ #224 44b43e377bfc84bc99bb5ab885ff694984ee09ff pstate: 600001c9 (nZCv dAIF -PAN -UAO -TCO -DIT -SSBS BTYPE=--) pc : static_key_enable_cpuslocked+0xb0/0xb8 lr : static_key_enable_cpuslocked+0xb0/0xb8 sp : ffffffe51c393cf0 x29: ffffffe51c393cf0 x28: 000000008185054c x27: 00000000f1042f10 x26: 0000000000000000 x25: 00000000f10302b2 x24: 0000002513200000 x23: 0000002513200000 x22: ffffffe51c1c9000 x21: fffffffdfdc00000 x20: ffffffe51c2f0831 x19: ffffffe51c6fcfc0 x18: 00000000ffff1020 x17: 00000000e1e2ac90 x16: 00000000000000e0 x15: ffffffe51b710708 x14: 0000000000000066 x13: 0000000000000018 x12: 0000000000000000 x11: 0000000000000000 x10: 00000000ffffffff x9 : 0000000000000000 x8 : 0000000000000000 x7 : 61632065726f6665 x6 : 6220646573752027 x5 : ffffffe51c641d25 x4 : ffffffe51c13142c x3 : ffff0a00ffffff05 x2 : 40000000ffffe003 x1 : 00000000000001c0 x0 : 0000000000000065 Call trace: static_key_enable_cpuslocked+0xb0/0xb8 static_key_enable+0x2c/0x40 crng_set_ready+0x24/0x30 execute_in_process_context+0x80/0x90 _credit_init_bits+0x100/0x154 add_bootloader_randomness+0x64/0x78 early_init_dt_scan_chosen+0x140/0x184 early_init_dt_scan_nodes+0x28/0x4c early_init_dt_scan+0x40/0x44 setup_machine_fdt+0x7c/0x120 setup_arch+0x74/0x1d8 start_kernel+0x84/0x44c __primary_switched+0xc0/0xc8 ---[ end trace 0000000000000000 ]--- random: crng init done Machine model: Google Lazor (rev1 - 2) with LTE A trivial fix went in to address this on arm64, 73e2d82 ("arm64: Initialize jump labels before setup_machine_fdt()"). I wrote patches as well for arm32 and risc-v. But still patches are needed on xtensa, powerpc, arc, and mips. So that's 7 platforms where things aren't quite right. This sort of points to larger issues that might need a larger solution. Instead, this commit just defers setting the static branch until later in the boot process. random_init() is called after jump_label_init() has been called, and so is always a safe place from which to adjust the static branch. Fixes: f5bda35 ("random: use static branch for crng_ready()") Reported-by: Stephen Boyd <[email protected]> Reported-by: Phil Elwell <[email protected]> Tested-by: Phil Elwell <[email protected]> Reviewed-by: Ard Biesheuvel <[email protected]> Cc: Catalin Marinas <[email protected]> Cc: Russell King <[email protected]> Cc: Arnd Bergmann <[email protected]> Signed-off-by: Jason A. Donenfeld <[email protected]>
Use after free is detected by kfence when disabling sriov. What was read after being freed was vf->pci_dev: it was freed from pci_disable_sriov and later read in efx_ef10_sriov_free_vf_vports, called from efx_ef10_sriov_free_vf_vswitching. Set the pointer to NULL at release time to not trying to read it later. Reproducer and dmesg log (note that kfence doesn't detect it every time): $ echo 1 > /sys/class/net/enp65s0f0np0/device/sriov_numvfs $ echo 0 > /sys/class/net/enp65s0f0np0/device/sriov_numvfs BUG: KFENCE: use-after-free read in efx_ef10_sriov_free_vf_vswitching+0x82/0x170 [sfc] Use-after-free read at 0x00000000ff3c1ba5 (in kfence-#224): efx_ef10_sriov_free_vf_vswitching+0x82/0x170 [sfc] efx_ef10_pci_sriov_disable+0x38/0x70 [sfc] efx_pci_sriov_configure+0x24/0x40 [sfc] sriov_numvfs_store+0xfe/0x140 kernfs_fop_write_iter+0x11c/0x1b0 new_sync_write+0x11f/0x1b0 vfs_write+0x1eb/0x280 ksys_write+0x5f/0xe0 do_syscall_64+0x5c/0x80 entry_SYSCALL_64_after_hwframe+0x44/0xae kfence-#224: 0x00000000edb8ef95-0x00000000671f5ce1, size=2792, cache=kmalloc-4k allocated by task 6771 on cpu 10 at 3137.860196s: pci_alloc_dev+0x21/0x60 pci_iov_add_virtfn+0x2a2/0x320 sriov_enable+0x212/0x3e0 efx_ef10_sriov_configure+0x67/0x80 [sfc] efx_pci_sriov_configure+0x24/0x40 [sfc] sriov_numvfs_store+0xba/0x140 kernfs_fop_write_iter+0x11c/0x1b0 new_sync_write+0x11f/0x1b0 vfs_write+0x1eb/0x280 ksys_write+0x5f/0xe0 do_syscall_64+0x5c/0x80 entry_SYSCALL_64_after_hwframe+0x44/0xae freed by task 6771 on cpu 12 at 3170.991309s: device_release+0x34/0x90 kobject_cleanup+0x3a/0x130 pci_iov_remove_virtfn+0xd9/0x120 sriov_disable+0x30/0xe0 efx_ef10_pci_sriov_disable+0x57/0x70 [sfc] efx_pci_sriov_configure+0x24/0x40 [sfc] sriov_numvfs_store+0xfe/0x140 kernfs_fop_write_iter+0x11c/0x1b0 new_sync_write+0x11f/0x1b0 vfs_write+0x1eb/0x280 ksys_write+0x5f/0xe0 do_syscall_64+0x5c/0x80 entry_SYSCALL_64_after_hwframe+0x44/0xae Fixes: 3c5eb87 ("sfc: create vports for VFs and assign random MAC addresses") Reported-by: Yanghang Liu <[email protected]> Signed-off-by: Íñigo Huguet <[email protected]> Acked-by: Martin Habets <[email protected]> Link: https://lore.kernel.org/r/[email protected] Signed-off-by: Jakub Kicinski <[email protected]>
Add a big batch of test coverage to assert all aspects of the tcx link API: # ./vmtest.sh -- ./test_progs -t tc_links [...] kernel-patches#224 tc_links_after:OK kernel-patches#225 tc_links_append:OK kernel-patches#226 tc_links_basic:OK kernel-patches#227 tc_links_before:OK kernel-patches#228 tc_links_both:OK kernel-patches#229 tc_links_chain_classic:OK kernel-patches#230 tc_links_dev_cleanup:OK kernel-patches#231 tc_links_first:OK kernel-patches#232 tc_links_invalid:OK kernel-patches#233 tc_links_last:OK kernel-patches#234 tc_links_prepend:OK kernel-patches#235 tc_links_replace:OK kernel-patches#236 tc_links_revision:OK Summary: 13/0 PASSED, 0 SKIPPED, 0 FAILED Signed-off-by: Daniel Borkmann <[email protected]>
Add a big batch of test coverage to assert all aspects of the tcx link API: # ./vmtest.sh -- ./test_progs -t tc_links [...] #224 tc_links_after:OK #225 tc_links_append:OK #226 tc_links_basic:OK #227 tc_links_before:OK #228 tc_links_both:OK #229 tc_links_chain_classic:OK #230 tc_links_dev_cleanup:OK #231 tc_links_first:OK #232 tc_links_invalid:OK #233 tc_links_last:OK #234 tc_links_prepend:OK #235 tc_links_replace:OK #236 tc_links_revision:OK Summary: 13/0 PASSED, 0 SKIPPED, 0 FAILED Signed-off-by: Daniel Borkmann <[email protected]>
Add a big batch of test coverage to assert all aspects of the tcx link API: # ./vmtest.sh -- ./test_progs -t tc_links [...] kernel-patches#224 tc_links_after:OK kernel-patches#225 tc_links_append:OK kernel-patches#226 tc_links_basic:OK kernel-patches#227 tc_links_before:OK kernel-patches#228 tc_links_both:OK kernel-patches#229 tc_links_chain_classic:OK kernel-patches#230 tc_links_dev_cleanup:OK kernel-patches#231 tc_links_first:OK kernel-patches#232 tc_links_invalid:OK kernel-patches#233 tc_links_last:OK kernel-patches#234 tc_links_prepend:OK kernel-patches#235 tc_links_replace:OK kernel-patches#236 tc_links_revision:OK Summary: 13/0 PASSED, 0 SKIPPED, 0 FAILED Signed-off-by: Daniel Borkmann <[email protected]>
Add a big batch of test coverage to assert all aspects of the tcx link API: # ./vmtest.sh -- ./test_progs -t tc_links [...] #224 tc_links_after:OK #225 tc_links_append:OK #226 tc_links_basic:OK #227 tc_links_before:OK #228 tc_links_both:OK #229 tc_links_chain_classic:OK #230 tc_links_dev_cleanup:OK #231 tc_links_first:OK #232 tc_links_invalid:OK #233 tc_links_last:OK #234 tc_links_prepend:OK #235 tc_links_replace:OK #236 tc_links_revision:OK Summary: 13/0 PASSED, 0 SKIPPED, 0 FAILED Signed-off-by: Daniel Borkmann <[email protected]>
Add a big batch of test coverage to assert all aspects of the tcx link API: # ./vmtest.sh -- ./test_progs -t tc_links [...] kernel-patches#224 tc_links_after:OK kernel-patches#225 tc_links_append:OK kernel-patches#226 tc_links_basic:OK kernel-patches#227 tc_links_before:OK kernel-patches#228 tc_links_both:OK kernel-patches#229 tc_links_chain_classic:OK kernel-patches#230 tc_links_dev_cleanup:OK kernel-patches#231 tc_links_first:OK kernel-patches#232 tc_links_invalid:OK kernel-patches#233 tc_links_last:OK kernel-patches#234 tc_links_prepend:OK kernel-patches#235 tc_links_replace:OK kernel-patches#236 tc_links_revision:OK Summary: 13/0 PASSED, 0 SKIPPED, 0 FAILED Signed-off-by: Daniel Borkmann <[email protected]>
Pull request for series with
subject: bpf_fib_lookup: optionally skip neighbour lookup
version: 2
url: https://patchwork.kernel.org/project/netdevbpf/list/?series=362021