x86/bpf: handle bpf-program-triggered exceptions properly #705

Closed
wants to merge 2 commits into from

kernel-patches-bot

Pull request for series with
subject: x86/bpf: handle bpf-program-triggered exceptions properly
version: 1
url: https://patchwork.kernel.org/project/netdevbpf/list/?series=421607

@kernel-patches-bot

Master branch: b9557ca
series: https://patchwork.kernel.org/project/netdevbpf/list/?series=421607
version: 1

kernel-patches-bot and others added 2 commits January 26, 2021 08:23
When reviewing patch ([1]), which adds a script to run bpf selftest
through qemu at /sbin/init stage, I found the following kernel bug
warning:

[  112.118892] BUG: sleeping function called from invalid context at arch/x86/mm/fault.c:1351
[  112.119805] in_atomic(): 0, irqs_disabled(): 0, non_block: 0, pid: 354, name: new_name
[  112.120512] 3 locks held by new_name/354:
[  112.120868]  #0: ffff88800476e0a0 (&p->lock){+.+.}-{3:3}, at: bpf_seq_read+0x3a/0x3d0
[  112.121573]  #1: ffffffff82d69800 (rcu_read_lock){....}-{1:2}, at: bpf_iter_run_prog+0x5/0x160
[  112.122348]  #2: ffff8880061c2088 (&mm->mmap_lock#2){++++}-{3:3}, at: exc_page_fault+0x1a1/0x640
[  112.123128] Preemption disabled at:
[  112.123130] [<ffffffff8108f913>] migrate_disable+0x33/0x80
[  112.123942] CPU: 0 PID: 354 Comm: new_name Tainted: G           O      5.11.0-rc4-00524-g6e66fbb10597-dirty #1249
[  112.124822] Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS 1.9.3-1.el7.centos 04/01/2014
[  112.125614] Call Trace:
[  112.125835]  dump_stack+0x77/0x97
[  112.126137]  ___might_sleep.cold.119+0xf2/0x106
[  112.126537]  exc_page_fault+0x1c1/0x640
[  112.126888]  asm_exc_page_fault+0x1e/0x30
[  112.127241] RIP: 0010:bpf_prog_0a182df2d34af188_dump_bpf_prog+0xf5/0xb3c
[  112.127825] Code: 00 00 8b 7d f4 41 8b 76 44 48 39 f7 73 06 48 01 fb 49 89 df 4c 89 7d d8 49 8b bd 20 01 00 00 48 89 7d e0 49 8b bd e0 00 00 00 <48> 8b 7f 20 48 01 d7 48 89 7d e8 48 89 e9 48 83 c1 d0 48 8b 7d c8
[  112.129433] RSP: 0018:ffffc9000035fdc8 EFLAGS: 00010282
[  112.129895] RAX: 0000000000000000 RBX: ffff888005a49458 RCX: 0000000000000024
[  112.130509] RDX: 00000000000002f0 RSI: 0000000000000509 RDI: 0000000000000000
[  112.131126] RBP: ffffc9000035fe20 R08: 0000000000000001 R09: 0000000000000000
[  112.131737] R10: 0000000000000002 R11: 0000000000000000 R12: 0000000000000400
[  112.132355] R13: ffff888006085800 R14: ffff888004718540 R15: ffff888005a49458
[  112.132990]  ? bpf_prog_0a182df2d34af188_dump_bpf_prog+0xc8/0xb3c
[  112.133526]  bpf_iter_run_prog+0x75/0x160
[  112.133880]  __bpf_prog_seq_show+0x39/0x40
[  112.134258]  bpf_seq_read+0xf6/0x3d0
[  112.134582]  vfs_read+0xa3/0x1b0
[  112.134873]  ksys_read+0x4f/0xc0
[  112.135166]  do_syscall_64+0x2d/0x40
[  112.135482]  entry_SYSCALL_64_after_hwframe+0x44/0xa9

To reproduce the issue, apply patch [1] and run the following script:
  tools/testing/selftests/bpf/run_in_vm.sh -- cat /sys/fs/bpf/progs.debug

The warning occurs because the bpf program tries to dereference
address 0 and the fault is not caught by the bpf exception
handling logic.

...
SEC("iter/bpf_prog")
int dump_bpf_prog(struct bpf_iter__bpf_prog *ctx)
{
	struct bpf_prog *prog = ctx->prog;
	struct bpf_prog_aux *aux;
	...
	if (!prog)
		return 0;
	aux = prog->aux;
	...
	... aux->dst_prog->aux->name ...
	return 0;
}

If aux->dst_prog is a NULL pointer, a fault will happen when trying
to access aux->dst_prog->aux.
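
For illustration only, the faulting step can be made explicit in plain user-space C. This is a minimal sketch with hypothetical toy_* structs standing in for the kernel types; it is not how the bpf program or the kernel handles the access (as described here, the bpf exception handling logic is expected to catch the fault). It only shows which link in the aux->dst_prog->aux->name chain can be NULL.

```c
#include <stddef.h>

/* Hypothetical stand-ins for the kernel structs; only the fields used here. */
struct toy_prog;

struct toy_prog_aux {
	struct toy_prog *dst_prog;	/* may be NULL, as in the report */
	char name[16];
};

struct toy_prog {
	struct toy_prog_aux *aux;
};

/*
 * Walk prog->aux->dst_prog->aux->name, returning "" as soon as any
 * link in the chain is NULL instead of dereferencing it.
 */
static const char *toy_dst_prog_name(const struct toy_prog *prog)
{
	const struct toy_prog_aux *aux;

	if (!prog || !prog->aux)
		return "";
	aux = prog->aux;
	/* This is the link that was NULL in the warning above. */
	if (!aux->dst_prog || !aux->dst_prog->aux)
		return "";
	return aux->dst_prog->aux->name;
}
```

Without the second check, a NULL dst_prog makes the next load read from a small offset off address 0, which matches the RDI value of 0 in the register dump.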

In arch/x86/mm/fault.c, function do_user_addr_fault() has the following code:
         if (unlikely(cpu_feature_enabled(X86_FEATURE_SMAP) &&
                      !(hw_error_code & X86_PF_USER) &&
                      !(regs->flags & X86_EFLAGS_AC)))
         {
                 bad_area_nosemaphore(regs, hw_error_code, address);
                 return;
         }

When the test is run normally after the login prompt, cpu_feature_enabled(X86_FEATURE_SMAP)
is true, so bad_area_nosemaphore() is called and then fixup_exception() is called,
where the bpf-specific handler is able to fix up the exception.

But when the test is run at /sbin/init time, cpu_feature_enabled(X86_FEATURE_SMAP) is
false, and control reaches
         if (unlikely(!mmap_read_trylock(mm))) {
                 if (!user_mode(regs) && !search_exception_tables(regs->ip)) {
                         /*
                          * Fault from code in kernel from
                          * which we do not expect faults.
                          */
                         bad_area_nosemaphore(regs, hw_error_code, address);
                         return;
                 }
retry:
                 mmap_read_lock(mm);
         } else {
                 /*
                  * The above down_read_trylock() might have succeeded in
                  * which case we'll have missed the might_sleep() from
                  * down_read():
                  */
                 might_sleep();
         }
and might_sleep() is triggered and the above kernel warning is printed.

To fix the issue, before the above mmap_read_trylock(), check
whether the fault ip can be served by the bpf exception handler. If
so, the exception is fixed up and the handler returns.
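
The ordering described above can be sketched as a small user-space model. Everything here (the toy_* names, the table layout, the check_table_first flag) is hypothetical and is not the actual patch or the kernel's exception table API. It only shows why consulting the table before the locking path changes the outcome.

```c
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical toy model; none of these names exist in the kernel. */
struct toy_extable_entry {
	unsigned long insn;	/* address of an instruction that may fault */
	unsigned long fixup;	/* address to resume at after the fault */
};

enum toy_fault_result {
	TOY_FIXED_UP,		/* fault resolved via the table */
	TOY_MIGHT_SLEEP,	/* reached the path that may sleep */
};

/* Linear lookup of the faulting ip in the toy table. */
static const struct toy_extable_entry *
toy_search_extable(const struct toy_extable_entry *tbl, size_t n,
		   unsigned long ip)
{
	for (size_t i = 0; i < n; i++)
		if (tbl[i].insn == ip)
			return &tbl[i];
	return NULL;
}

/*
 * With check_table_first set, a faulting ip that has a table entry is
 * redirected to its fixup address and the sleeping path is never reached.
 */
static enum toy_fault_result
toy_handle_fault(const struct toy_extable_entry *tbl, size_t n,
		 bool check_table_first, unsigned long *ip)
{
	if (check_table_first) {
		const struct toy_extable_entry *e =
			toy_search_extable(tbl, n, *ip);

		if (e) {
			*ip = e->fixup;
			return TOY_FIXED_UP;
		}
	}
	/* Stand-in for the mmap_read_trylock()/might_sleep() path. */
	return TOY_MIGHT_SLEEP;
}
```

In this model, calling toy_handle_fault() with check_table_first set to false corresponds to the behaviour that produced the warning, and setting it to true corresponds to the proposed ordering.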

[1] https://lore.kernel.org/bpf/[email protected]/

Cc: Alexei Starovoitov <[email protected]>
Cc: KP Singh <[email protected]>
Signed-off-by: Yonghong Song <[email protected]>
@kernel-patches-bot

Master branch: 7803138
series: https://patchwork.kernel.org/project/netdevbpf/list/?series=421607
version: 1

@kernel-patches-bot

At least one diff in series https://patchwork.kernel.org/project/netdevbpf/list/?series=421607 expired. Closing PR.

@kernel-patches-bot kernel-patches-bot deleted the series/421607=>bpf branch January 29, 2021 03:30
kuba-moo pushed a commit to linux-netdev/testing-bpf-ci that referenced this pull request Jun 3, 2024
Wiphy should be locked before calling rdev_get_station() (see lockdep
assert in ieee80211_get_station()).

This fixes the following kernel NULL dereference:

 Unable to handle kernel NULL pointer dereference at virtual address 0000000000000050
 Mem abort info:
   ESR = 0x0000000096000006
   EC = 0x25: DABT (current EL), IL = 32 bits
   SET = 0, FnV = 0
   EA = 0, S1PTW = 0
   FSC = 0x06: level 2 translation fault
 Data abort info:
   ISV = 0, ISS = 0x00000006
   CM = 0, WnR = 0
 user pgtable: 4k pages, 48-bit VAs, pgdp=0000000003001000
 [0000000000000050] pgd=0800000002dca003, p4d=0800000002dca003, pud=08000000028e9003, pmd=0000000000000000
 Internal error: Oops: 0000000096000006 [#1] SMP
 Modules linked in: netconsole dwc3_meson_g12a dwc3_of_simple dwc3 ip_gre gre ath10k_pci ath10k_core ath9k ath9k_common ath9k_hw ath
 CPU: 0 PID: 1091 Comm: kworker/u8:0 Not tainted 6.4.0-02144-g565f9a3a7911-dirty #705
 Hardware name: RPT (r1) (DT)
 Workqueue: bat_events batadv_v_elp_throughput_metric_update
 pstate: 60000005 (nZCv daif -PAN -UAO -TCO -DIT -SSBS BTYPE=--)
 pc : ath10k_sta_statistics+0x10/0x2dc [ath10k_core]
 lr : sta_set_sinfo+0xcc/0xbd4
 sp : ffff000007b43ad0
 x29: ffff000007b43ad0 x28: ffff0000071fa900 x27: ffff00000294ca98
 x26: ffff000006830880 x25: ffff000006830880 x24: ffff00000294c000
 x23: 0000000000000001 x22: ffff000007b43c90 x21: ffff800008898acc
 x20: ffff00000294c6e8 x19: ffff000007b43c90 x18: 0000000000000000
 x17: 445946354d552d78 x16: 62661f7200000000 x15: 57464f445946354d
 x14: 0000000000000000 x13: 00000000000000e3 x12: d5f0acbcebea978e
 x11: 00000000000000e3 x10: 000000010048fe41 x9 : 0000000000000000
 x8 : ffff000007b43d90 x7 : 000000007a1e2125 x6 : 0000000000000000
 x5 : ffff0000024e0900 x4 : ffff800000a0250c x3 : ffff000007b43c90
 x2 : ffff00000294ca98 x1 : ffff000006831920 x0 : 0000000000000000
 Call trace:
  ath10k_sta_statistics+0x10/0x2dc [ath10k_core]
  sta_set_sinfo+0xcc/0xbd4
  ieee80211_get_station+0x2c/0x44
  cfg80211_get_station+0x80/0x154
  batadv_v_elp_get_throughput+0x138/0x1fc
  batadv_v_elp_throughput_metric_update+0x1c/0xa4
  process_one_work+0x1ec/0x414
  worker_thread+0x70/0x46c
  kthread+0xdc/0xe0
  ret_from_fork+0x10/0x20
 Code: a9bb7bfd 910003fd a90153f3 f9411c40 (f9402814)

This happens because the STA has time to disconnect and reconnect before
the batadv_v_elp_throughput_metric_update() delayed work gets scheduled. In
this situation, ath10k_sta_state() can be in the middle of resetting
arsta data when the work queue gets a chance to be scheduled and ends up
accessing it. Locking wiphy prevents that.
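
The lock-before-call pattern can be sketched in user-space C as follows. This is a rough illustration under stated assumptions, not the cfg80211 code: the toy_* names are hypothetical, a pthread mutex stands in for the wiphy lock, and a plain held flag stands in for the lockdep assertion.

```c
#include <assert.h>
#include <pthread.h>
#include <stdbool.h>

/* Hypothetical stand-in for a wiphy with its lock and some station data. */
struct toy_wiphy {
	pthread_mutex_t mtx;
	bool held;		/* crude lockdep substitute: true while mtx is held */
	int sta_rate;		/* data that a concurrent reset could touch */
};

static void toy_wiphy_lock(struct toy_wiphy *w)
{
	pthread_mutex_lock(&w->mtx);
	w->held = true;
}

static void toy_wiphy_unlock(struct toy_wiphy *w)
{
	w->held = false;
	pthread_mutex_unlock(&w->mtx);
}

/* Callee: only valid with the lock held, like the asserted callee above. */
static int toy_rdev_get_station(struct toy_wiphy *w)
{
	assert(w->held);
	return w->sta_rate;
}

/* Caller: takes the lock around the call, which is the shape of the fix. */
static int toy_get_station(struct toy_wiphy *w)
{
	int rate;

	toy_wiphy_lock(w);
	rate = toy_rdev_get_station(w);
	toy_wiphy_unlock(w);
	return rate;
}
```

Any code that resets the station data would take the same lock, so the reader can no longer observe a half-reset state.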

Fixes: 7406353 ("cfg80211: implement cfg80211_get_station cfg80211 API")
Signed-off-by: Remi Pommarel <[email protected]>
Reviewed-by: Nicolas Escande <[email protected]>
Acked-by: Antonio Quartulli <[email protected]>
Link: https://msgid.link/983b24a6a176e0800c01aedcd74480d9b551cb13.1716046653.git.repk@triplefau.lt
Signed-off-by: Johannes Berg <[email protected]>