Skip to content

Commit b2fc4b1

Browse files
kkdwivedigregkh
authored andcommitted
bpf: Augment raw_tp arguments with PTR_MAYBE_NULL
commit 838a10b upstream. Arguments to a raw tracepoint are tagged as trusted, which carries the semantics that the pointer will be non-NULL. However, in certain cases, a raw tracepoint argument may end up being NULL. More context about this issue is available in [0]. Thus, there is a discrepancy between the reality, that raw_tp arguments can actually be NULL, and the verifier's knowledge, that they are never NULL, causing explicit NULL check branch to be dead code eliminated. A previous attempt [1], i.e. the second fixed commit, was made to simulate symbolic execution as if in most accesses, the argument is a non-NULL raw_tp, except for conditional jumps. This tried to suppress branch prediction while preserving compatibility, but surfaced issues with production programs that were difficult to solve without increasing verifier complexity. A more complete discussion of issues and fixes is available at [2]. Fix this by maintaining an explicit list of tracepoints where the arguments are known to be NULL, and mark the positional arguments as PTR_MAYBE_NULL. Additionally, capture the tracepoints where arguments are known to be ERR_PTR, and mark these arguments as scalar values to prevent potential dereference. Each hex digit is used to encode NULL-ness (0x1) or ERR_PTR-ness (0x2), shifted by the zero-indexed argument number x 4. This can be represented as follows: 1st arg: 0x1 2nd arg: 0x10 3rd arg: 0x100 ... and so on (likewise for ERR_PTR case). In the future, an automated pass will be used to produce such a list, or insert __nullable annotations automatically for tracepoints. Each compilation unit will be analyzed and results will be collated to find whether a tracepoint pointer is definitely not null, maybe null, or an unknown state where verifier conservatively marks it PTR_MAYBE_NULL. A proof of concept of this tool from Eduard is available at [3]. Note that in case we don't find a specification in the raw_tp_null_args array and the tracepoint belongs to a kernel module, we will conservatively mark the arguments as PTR_MAYBE_NULL. This is because unlike for in-tree modules, out-of-tree module tracepoints may pass NULL freely to the tracepoint. We don't protect against such tracepoints passing ERR_PTR (which is uncommon anyway), lest we mark all such arguments as SCALAR_VALUE. While we are it, let's adjust the test raw_tp_null to not perform dereference of the skb->mark, as that won't be allowed anymore, and make it more robust by using inline assembly to test the dead code elimination behavior, which should still stay the same. [0]: https://lore.kernel.org/bpf/[email protected] [1]: https://lore.kernel.org/all/[email protected] [2]: https://lore.kernel.org/bpf/[email protected] [3]: https://github.com/eddyz87/llvm-project/tree/nullness-for-tracepoint-params Reported-by: Juri Lelli <[email protected]> # original bug Reported-by: Manu Bretelle <[email protected]> # bugs in masking fix Fixes: 3f00c52 ("bpf: Allow trusted pointers to be passed to KF_TRUSTED_ARGS kfuncs") Fixes: cb4158c ("bpf: Mark raw_tp arguments with PTR_MAYBE_NULL") Reviewed-by: Eduard Zingerman <[email protected]> Co-developed-by: Jiri Olsa <[email protected]> Signed-off-by: Jiri Olsa <[email protected]> Signed-off-by: Kumar Kartikeya Dwivedi <[email protected]> Link: https://lore.kernel.org/r/[email protected] Signed-off-by: Alexei Starovoitov <[email protected]> Signed-off-by: Greg Kroah-Hartman <[email protected]>
1 parent 2bd517b commit b2fc4b1

File tree

1 file changed

+138
-0
lines changed

1 file changed

+138
-0
lines changed

kernel/bpf/btf.c

Lines changed: 138 additions & 0 deletions
Original file line numberDiff line numberDiff line change
@@ -6415,6 +6415,101 @@ int btf_ctx_arg_offset(const struct btf *btf, const struct btf_type *func_proto,
64156415
return off;
64166416
}
64176417

6418+
struct bpf_raw_tp_null_args {
6419+
const char *func;
6420+
u64 mask;
6421+
};
6422+
6423+
static const struct bpf_raw_tp_null_args raw_tp_null_args[] = {
6424+
/* sched */
6425+
{ "sched_pi_setprio", 0x10 },
6426+
/* ... from sched_numa_pair_template event class */
6427+
{ "sched_stick_numa", 0x100 },
6428+
{ "sched_swap_numa", 0x100 },
6429+
/* afs */
6430+
{ "afs_make_fs_call", 0x10 },
6431+
{ "afs_make_fs_calli", 0x10 },
6432+
{ "afs_make_fs_call1", 0x10 },
6433+
{ "afs_make_fs_call2", 0x10 },
6434+
{ "afs_protocol_error", 0x1 },
6435+
{ "afs_flock_ev", 0x10 },
6436+
/* cachefiles */
6437+
{ "cachefiles_lookup", 0x1 | 0x200 },
6438+
{ "cachefiles_unlink", 0x1 },
6439+
{ "cachefiles_rename", 0x1 },
6440+
{ "cachefiles_prep_read", 0x1 },
6441+
{ "cachefiles_mark_active", 0x1 },
6442+
{ "cachefiles_mark_failed", 0x1 },
6443+
{ "cachefiles_mark_inactive", 0x1 },
6444+
{ "cachefiles_vfs_error", 0x1 },
6445+
{ "cachefiles_io_error", 0x1 },
6446+
{ "cachefiles_ondemand_open", 0x1 },
6447+
{ "cachefiles_ondemand_copen", 0x1 },
6448+
{ "cachefiles_ondemand_close", 0x1 },
6449+
{ "cachefiles_ondemand_read", 0x1 },
6450+
{ "cachefiles_ondemand_cread", 0x1 },
6451+
{ "cachefiles_ondemand_fd_write", 0x1 },
6452+
{ "cachefiles_ondemand_fd_release", 0x1 },
6453+
/* ext4, from ext4__mballoc event class */
6454+
{ "ext4_mballoc_discard", 0x10 },
6455+
{ "ext4_mballoc_free", 0x10 },
6456+
/* fib */
6457+
{ "fib_table_lookup", 0x100 },
6458+
/* filelock */
6459+
/* ... from filelock_lock event class */
6460+
{ "posix_lock_inode", 0x10 },
6461+
{ "fcntl_setlk", 0x10 },
6462+
{ "locks_remove_posix", 0x10 },
6463+
{ "flock_lock_inode", 0x10 },
6464+
/* ... from filelock_lease event class */
6465+
{ "break_lease_noblock", 0x10 },
6466+
{ "break_lease_block", 0x10 },
6467+
{ "break_lease_unblock", 0x10 },
6468+
{ "generic_delete_lease", 0x10 },
6469+
{ "time_out_leases", 0x10 },
6470+
/* host1x */
6471+
{ "host1x_cdma_push_gather", 0x10000 },
6472+
/* huge_memory */
6473+
{ "mm_khugepaged_scan_pmd", 0x10 },
6474+
{ "mm_collapse_huge_page_isolate", 0x1 },
6475+
{ "mm_khugepaged_scan_file", 0x10 },
6476+
{ "mm_khugepaged_collapse_file", 0x10 },
6477+
/* kmem */
6478+
{ "mm_page_alloc", 0x1 },
6479+
{ "mm_page_pcpu_drain", 0x1 },
6480+
/* .. from mm_page event class */
6481+
{ "mm_page_alloc_zone_locked", 0x1 },
6482+
/* netfs */
6483+
{ "netfs_failure", 0x10 },
6484+
/* power */
6485+
{ "device_pm_callback_start", 0x10 },
6486+
/* qdisc */
6487+
{ "qdisc_dequeue", 0x1000 },
6488+
/* rxrpc */
6489+
{ "rxrpc_recvdata", 0x1 },
6490+
{ "rxrpc_resend", 0x10 },
6491+
/* sunrpc */
6492+
{ "xs_stream_read_data", 0x1 },
6493+
/* ... from xprt_cong_event event class */
6494+
{ "xprt_reserve_cong", 0x10 },
6495+
{ "xprt_release_cong", 0x10 },
6496+
{ "xprt_get_cong", 0x10 },
6497+
{ "xprt_put_cong", 0x10 },
6498+
/* tcp */
6499+
{ "tcp_send_reset", 0x11 },
6500+
/* tegra_apb_dma */
6501+
{ "tegra_dma_tx_status", 0x100 },
6502+
/* timer_migration */
6503+
{ "tmigr_update_events", 0x1 },
6504+
/* writeback, from writeback_folio_template event class */
6505+
{ "writeback_dirty_folio", 0x10 },
6506+
{ "folio_wait_writeback", 0x10 },
6507+
/* rdma */
6508+
{ "mr_integ_alloc", 0x2000 },
6509+
/* bpf_testmod */
6510+
{ "bpf_testmod_test_read", 0x0 },
6511+
};
6512+
64186513
bool btf_ctx_access(int off, int size, enum bpf_access_type type,
64196514
const struct bpf_prog *prog,
64206515
struct bpf_insn_access_aux *info)
@@ -6425,6 +6520,7 @@ bool btf_ctx_access(int off, int size, enum bpf_access_type type,
64256520
const char *tname = prog->aux->attach_func_name;
64266521
struct bpf_verifier_log *log = info->log;
64276522
const struct btf_param *args;
6523+
bool ptr_err_raw_tp = false;
64286524
const char *tag_value;
64296525
u32 nr_args, arg;
64306526
int i, ret;
@@ -6573,6 +6669,39 @@ bool btf_ctx_access(int off, int size, enum bpf_access_type type,
65736669
if (btf_param_match_suffix(btf, &args[arg], "__nullable"))
65746670
info->reg_type |= PTR_MAYBE_NULL;
65756671

6672+
if (prog->expected_attach_type == BPF_TRACE_RAW_TP) {
6673+
struct btf *btf = prog->aux->attach_btf;
6674+
const struct btf_type *t;
6675+
const char *tname;
6676+
6677+
/* BTF lookups cannot fail, return false on error */
6678+
t = btf_type_by_id(btf, prog->aux->attach_btf_id);
6679+
if (!t)
6680+
return false;
6681+
tname = btf_name_by_offset(btf, t->name_off);
6682+
if (!tname)
6683+
return false;
6684+
/* Checked by bpf_check_attach_target */
6685+
tname += sizeof("btf_trace_") - 1;
6686+
for (i = 0; i < ARRAY_SIZE(raw_tp_null_args); i++) {
6687+
/* Is this a func with potential NULL args? */
6688+
if (strcmp(tname, raw_tp_null_args[i].func))
6689+
continue;
6690+
if (raw_tp_null_args[i].mask & (0x1 << (arg * 4)))
6691+
info->reg_type |= PTR_MAYBE_NULL;
6692+
/* Is the current arg IS_ERR? */
6693+
if (raw_tp_null_args[i].mask & (0x2 << (arg * 4)))
6694+
ptr_err_raw_tp = true;
6695+
break;
6696+
}
6697+
/* If we don't know NULL-ness specification and the tracepoint
6698+
* is coming from a loadable module, be conservative and mark
6699+
* argument as PTR_MAYBE_NULL.
6700+
*/
6701+
if (i == ARRAY_SIZE(raw_tp_null_args) && btf_is_module(btf))
6702+
info->reg_type |= PTR_MAYBE_NULL;
6703+
}
6704+
65766705
if (tgt_prog) {
65776706
enum bpf_prog_type tgt_type;
65786707

@@ -6617,6 +6746,15 @@ bool btf_ctx_access(int off, int size, enum bpf_access_type type,
66176746
bpf_log(log, "func '%s' arg%d has btf_id %d type %s '%s'\n",
66186747
tname, arg, info->btf_id, btf_type_str(t),
66196748
__btf_name_by_offset(btf, t->name_off));
6749+
6750+
/* Perform all checks on the validity of type for this argument, but if
6751+
* we know it can be IS_ERR at runtime, scrub pointer type and mark as
6752+
* scalar.
6753+
*/
6754+
if (ptr_err_raw_tp) {
6755+
bpf_log(log, "marking pointer arg%d as scalar as it may encode error", arg);
6756+
info->reg_type = SCALAR_VALUE;
6757+
}
66206758
return true;
66216759
}
66226760
EXPORT_SYMBOL_GPL(btf_ctx_access);

0 commit comments

Comments
 (0)