x86/bpf: handle bpf-program-triggered exceptions properly #705

kernel-patches-bot · 2021-01-26T05:49:09Z

Pull request for series with
subject: x86/bpf: handle bpf-program-triggered exceptions properly
version: 1
url: https://patchwork.kernel.org/project/netdevbpf/list/?series=421607

kernel-patches-bot · 2021-01-26T05:49:11Z

Master branch: b9557ca
series: https://patchwork.kernel.org/project/netdevbpf/list/?series=421607
version: 1

When reviewing patch [1], which adds a script to run the bpf selftests through qemu at the /sbin/init stage, I found the following kernel bug warning:

  [ 112.118892] BUG: sleeping function called from invalid context at arch/x86/mm/fault.c:1351
  [ 112.119805] in_atomic(): 0, irqs_disabled(): 0, non_block: 0, pid: 354, name: new_name
  [ 112.120512] 3 locks held by new_name/354:
  [ 112.120868] #0: ffff88800476e0a0 (&p->lock){+.+.}-{3:3}, at: bpf_seq_read+0x3a/0x3d0
  [ 112.121573] #1: ffffffff82d69800 (rcu_read_lock){....}-{1:2}, at: bpf_iter_run_prog+0x5/0x160
  [ 112.122348] #2: ffff8880061c2088 (&mm->mmap_lock#2){++++}-{3:3}, at: exc_page_fault+0x1a1/0x640
  [ 112.123128] Preemption disabled at:
  [ 112.123130] [<ffffffff8108f913>] migrate_disable+0x33/0x80
  [ 112.123942] CPU: 0 PID: 354 Comm: new_name Tainted: G O 5.11.0-rc4-00524-g6e66fbb10597-dirty #1249
  [ 112.124822] Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS 1.9.3-1.el7.centos 04/01/2014
  [ 112.125614] Call Trace:
  [ 112.125835] dump_stack+0x77/0x97
  [ 112.126137] ___might_sleep.cold.119+0xf2/0x106
  [ 112.126537] exc_page_fault+0x1c1/0x640
  [ 112.126888] asm_exc_page_fault+0x1e/0x30
  [ 112.127241] RIP: 0010:bpf_prog_0a182df2d34af188_dump_bpf_prog+0xf5/0xb3c
  [ 112.127825] Code: 00 00 8b 7d f4 41 8b 76 44 48 39 f7 73 06 48 01 fb 49 89 df 4c 89 7d d8 49 8b bd 20 01 00 00 48 89 7d e0 49 8b bd e0 00 00 00 <48> 8b 7f 20 48 01 d7 48 89 7d e8 48 89 e9 48 83 c1 d0 48 8b 7d c8
  [ 112.129433] RSP: 0018:ffffc9000035fdc8 EFLAGS: 00010282
  [ 112.129895] RAX: 0000000000000000 RBX: ffff888005a49458 RCX: 0000000000000024
  [ 112.130509] RDX: 00000000000002f0 RSI: 0000000000000509 RDI: 0000000000000000
  [ 112.131126] RBP: ffffc9000035fe20 R08: 0000000000000001 R09: 0000000000000000
  [ 112.131737] R10: 0000000000000002 R11: 0000000000000000 R12: 0000000000000400
  [ 112.132355] R13: ffff888006085800 R14: ffff888004718540 R15: ffff888005a49458
  [ 112.132990] ? bpf_prog_0a182df2d34af188_dump_bpf_prog+0xc8/0xb3c
  [ 112.133526] bpf_iter_run_prog+0x75/0x160
  [ 112.133880] __bpf_prog_seq_show+0x39/0x40
  [ 112.134258] bpf_seq_read+0xf6/0x3d0
  [ 112.134582] vfs_read+0xa3/0x1b0
  [ 112.134873] ksys_read+0x4f/0xc0
  [ 112.135166] do_syscall_64+0x2d/0x40
  [ 112.135482] entry_SYSCALL_64_after_hwframe+0x44/0xa9

To reproduce the issue, apply patch [1] and run the following script:

  tools/testing/selftests/bpf/run_in_vm.sh -- cat /sys/fs/bpf/progs.debug

The warning occurs because the bpf program tries to dereference address 0 and the resulting fault is not caught by the bpf exception handling logic.

  ...
  SEC("iter/bpf_prog")
  int dump_bpf_prog(struct bpf_iter__bpf_prog *ctx)
  {
          struct bpf_prog *prog = ctx->prog;
          struct bpf_prog_aux *aux;
          ...
          if (!prog)
                  return 0;
          aux = prog->aux;
          ...
          ... aux->dst_prog->aux->name ...
          return 0;
  }

If aux->dst_prog is a NULL pointer, a fault happens when the program tries to access aux->dst_prog->aux. In arch/x86/mm/fault.c, the function do_user_addr_fault() contains the following code:

  if (unlikely(cpu_feature_enabled(X86_FEATURE_SMAP) &&
               !(hw_error_code & X86_PF_USER) &&
               !(regs->flags & X86_EFLAGS_AC))) {
          bad_area_nosemaphore(regs, hw_error_code, address);
          return;
  }

When the test is run normally after the login prompt, cpu_feature_enabled(X86_FEATURE_SMAP) is true, so bad_area_nosemaphore() is called and then fixup_exception() is called, where the bpf-specific handler is able to fix up the exception.

But when the test is run at /sbin/init time, cpu_feature_enabled(X86_FEATURE_SMAP) is false, and control reaches

  if (unlikely(!mmap_read_trylock(mm))) {
          if (!user_mode(regs) && !search_exception_tables(regs->ip)) {
                  /*
                   * Fault from code in kernel from
                   * which we do not expect faults.
                   */
                  bad_area_nosemaphore(regs, hw_error_code, address);
                  return;
          }
  retry:
          mmap_read_lock(mm);
  } else {
          /*
           * The above down_read_trylock() might have succeeded in
           * which case we'll have missed the might_sleep() from
           * down_read():
           */
          might_sleep();
  }

so might_sleep() is triggered and the above kernel warning is printed.

To fix the issue, before the above mmap_read_trylock() we check whether the faulting ip can be served by the bpf exception handler. If it can, the exception is fixed up and the function returns.

[1] https://lore.kernel.org/bpf/[email protected]/

Cc: Alexei Starovoitov <[email protected]>
Cc: KP Singh <[email protected]>
Signed-off-by: Yonghong Song <[email protected]>
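The control flow described above can be modeled in user space to make the three cases easier to compare. The following is only a sketch under stated assumptions, not the kernel code and not the patch itself: the struct, the enum, and the function names are hypothetical, the trylock is assumed to succeed, and the early_bpf_check flag stands in for the proposed check.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical outcomes of the modeled kernel-mode fault path. */
enum fault_outcome {
    FAULT_FIXED_UP,            /* exception-table fixup applied */
    FAULT_OOPS,                /* no fixup entry for the faulting ip */
    FAULT_MIGHT_SLEEP_WARNING, /* might_sleep() hit in a non-sleepable context */
    FAULT_HANDLED,             /* regular page-fault handling continues */
};

/* Hypothetical summary of the state the description talks about. */
struct fault_ctx {
    bool smap_enabled;      /* cpu_feature_enabled(X86_FEATURE_SMAP) */
    bool pf_user;           /* hw_error_code & X86_PF_USER */
    bool eflags_ac;         /* regs->flags & X86_EFLAGS_AC */
    bool ip_in_bpf_extable; /* faulting ip has a bpf fixup entry */
    bool sleepable_context; /* false while a bpf iterator program runs */
    bool early_bpf_check;   /* assumption: the proposed check is applied */
};

/* Models bad_area_nosemaphore() followed by fixup_exception(). */
static enum fault_outcome model_bad_area(const struct fault_ctx *c)
{
    return c->ip_in_bpf_extable ? FAULT_FIXED_UP : FAULT_OOPS;
}

/* Simplified model of the decision order in the description. */
enum fault_outcome model_user_addr_fault(const struct fault_ctx *c)
{
    /* SMAP branch: kernel access to a user address without AC set. */
    if (c->smap_enabled && !c->pf_user && !c->eflags_ac)
        return model_bad_area(c);

    /* Proposed check: fix up before touching mmap_lock. */
    if (c->early_bpf_check && !c->pf_user && c->ip_in_bpf_extable)
        return FAULT_FIXED_UP;

    /* The trylock is assumed to succeed, so might_sleep() runs here. */
    if (!c->sleepable_context)
        return FAULT_MIGHT_SLEEP_WARNING;
    return FAULT_HANDLED;
}

/* NULL dereference inside a bpf program, as in the report. */
enum fault_outcome bpf_null_deref(bool smap_enabled, bool early_bpf_check)
{
    struct fault_ctx c = {
        .smap_enabled = smap_enabled,
        .pf_user = false,
        .eflags_ac = false,
        .ip_in_bpf_extable = true,
        .sleepable_context = false,
        .early_bpf_check = early_bpf_check,
    };
    return model_user_addr_fault(&c);
}

/* Walks through the three cases from the description. */
void check_scenarios(void)
{
    /* After login: SMAP enabled, fixup via the bad-area path. */
    assert(bpf_null_deref(true, false) == FAULT_FIXED_UP);
    /* At /sbin/init: SMAP reported off, warning is triggered. */
    assert(bpf_null_deref(false, false) == FAULT_MIGHT_SLEEP_WARNING);
    /* With the proposed early check: fixup regardless of SMAP. */
    assert(bpf_null_deref(false, true) == FAULT_FIXED_UP);
}
```

Calling check_scenarios() from a test harness exercises all three cases. The model deliberately ignores the trylock-failure branch and user-mode faults, which the description does not rely on.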

kernel-patches-bot · 2021-01-26T16:23:54Z

Master branch: 7803138
series: https://patchwork.kernel.org/project/netdevbpf/list/?series=421607
version: 1

kernel-patches-bot · 2021-01-26T19:23:39Z

At least one diff in series https://patchwork.kernel.org/project/netdevbpf/list/?series=421607 expired. Closing PR.

Wiphy should be locked before calling rdev_get_station() (see lockdep assert in ieee80211_get_station()).

This fixes the following kernel NULL dereference:

  Unable to handle kernel NULL pointer dereference at virtual address 0000000000000050
  Mem abort info:
    ESR = 0x0000000096000006
    EC = 0x25: DABT (current EL), IL = 32 bits
    SET = 0, FnV = 0
    EA = 0, S1PTW = 0
    FSC = 0x06: level 2 translation fault
  Data abort info:
    ISV = 0, ISS = 0x00000006
    CM = 0, WnR = 0
  user pgtable: 4k pages, 48-bit VAs, pgdp=0000000003001000
  [0000000000000050] pgd=0800000002dca003, p4d=0800000002dca003, pud=08000000028e9003, pmd=0000000000000000
  Internal error: Oops: 0000000096000006 [#1] SMP
  Modules linked in: netconsole dwc3_meson_g12a dwc3_of_simple dwc3 ip_gre gre ath10k_pci ath10k_core ath9k ath9k_common ath9k_hw ath
  CPU: 0 PID: 1091 Comm: kworker/u8:0 Not tainted 6.4.0-02144-g565f9a3a7911-dirty #705
  Hardware name: RPT (r1) (DT)
  Workqueue: bat_events batadv_v_elp_throughput_metric_update
  pstate: 60000005 (nZCv daif -PAN -UAO -TCO -DIT -SSBS BTYPE=--)
  pc : ath10k_sta_statistics+0x10/0x2dc [ath10k_core]
  lr : sta_set_sinfo+0xcc/0xbd4
  sp : ffff000007b43ad0
  x29: ffff000007b43ad0 x28: ffff0000071fa900 x27: ffff00000294ca98
  x26: ffff000006830880 x25: ffff000006830880 x24: ffff00000294c000
  x23: 0000000000000001 x22: ffff000007b43c90 x21: ffff800008898acc
  x20: ffff00000294c6e8 x19: ffff000007b43c90 x18: 0000000000000000
  x17: 445946354d552d78 x16: 62661f7200000000 x15: 57464f445946354d
  x14: 0000000000000000 x13: 00000000000000e3 x12: d5f0acbcebea978e
  x11: 00000000000000e3 x10: 000000010048fe41 x9 : 0000000000000000
  x8 : ffff000007b43d90 x7 : 000000007a1e2125 x6 : 0000000000000000
  x5 : ffff0000024e0900 x4 : ffff800000a0250c x3 : ffff000007b43c90
  x2 : ffff00000294ca98 x1 : ffff000006831920 x0 : 0000000000000000
  Call trace:
   ath10k_sta_statistics+0x10/0x2dc [ath10k_core]
   sta_set_sinfo+0xcc/0xbd4
   ieee80211_get_station+0x2c/0x44
   cfg80211_get_station+0x80/0x154
   batadv_v_elp_get_throughput+0x138/0x1fc
   batadv_v_elp_throughput_metric_update+0x1c/0xa4
   process_one_work+0x1ec/0x414
   worker_thread+0x70/0x46c
   kthread+0xdc/0xe0
   ret_from_fork+0x10/0x20
  Code: a9bb7bfd 910003fd a90153f3 f9411c40 (f9402814)

This happens because the STA has time to disconnect and reconnect before the batadv_v_elp_throughput_metric_update() delayed work gets scheduled. In this situation, ath10k_sta_state() can be in the middle of resetting arsta data when the work queue gets a chance to be scheduled and ends up accessing it. Locking wiphy prevents that.

Fixes: 7406353 ("cfg80211: implement cfg80211_get_station cfg80211 API")
Signed-off-by: Remi Pommarel <[email protected]>
Reviewed-by: Nicolas Escande <[email protected]>
Acked-by: Antonio Quartulli <[email protected]>
Link: https://msgid.link/983b24a6a176e0800c01aedcd74480d9b551cb13.1716046653.git.repk@triplefau.lt
Signed-off-by: Johannes Berg <[email protected]>

kernel-patches-bot added bpf new V1 labels Jan 26, 2021

kernel-patches-bot and others added 2 commits January 26, 2021 08:23

adding ci files

4acbaaa

kernel-patches-bot force-pushed the series/421607=>bpf branch from c70551b to eb3b86a Compare January 26, 2021 16:23

kernel-patches-bot added changes-requested and removed new labels Jan 26, 2021

kernel-patches-bot closed this Jan 26, 2021

kernel-patches-bot deleted the series/421607=>bpf branch January 29, 2021 03:30
