Skip to content

Commit 1cb6f0b

Browse files
borkmannMartin KaFai Lau
authored and
Martin KaFai Lau
committed
bpf: Fix too early release of tcx_entry
Pedro Pinto and later independently also Hyunwoo Kim and Wongi Lee reported an issue that the tcx_entry can be released too early leading to a use after free (UAF) when an active old-style ingress or clsact qdisc with a shared tc block is later replaced by another ingress or clsact instance. Essentially, the sequence to trigger the UAF (one example) can be as follows: 1. A network namespace is created 2. An ingress qdisc is created. This allocates a tcx_entry, and &tcx_entry->miniq is stored in the qdisc's miniqp->p_miniq. At the same time, a tcf block with index 1 is created. 3. chain0 is attached to the tcf block. chain0 must be connected to the block linked to the ingress qdisc to later reach the function tcf_chain0_head_change_cb_del() which triggers the UAF. 4. Create and graft a clsact qdisc. This causes the ingress qdisc created in step 1 to be removed, thus freeing the previously linked tcx_entry: rtnetlink_rcv_msg() => tc_modify_qdisc() => qdisc_create() => clsact_init() [a] => qdisc_graft() => qdisc_destroy() => __qdisc_destroy() => ingress_destroy() [b] => tcx_entry_free() => kfree_rcu() // tcx_entry freed 5. Finally, the network namespace is closed. This registers the cleanup_net worker, and during the process of releasing the remaining clsact qdisc, it accesses the tcx_entry that was already freed in step 4, causing the UAF to occur: cleanup_net() => ops_exit_list() => default_device_exit_batch() => unregister_netdevice_many() => unregister_netdevice_many_notify() => dev_shutdown() => qdisc_put() => clsact_destroy() [c] => tcf_block_put_ext() => tcf_chain0_head_change_cb_del() => tcf_chain_head_change_item() => clsact_chain_head_change() => mini_qdisc_pair_swap() // UAF There are also other variants, the gist is to add an ingress (or clsact) qdisc with a specific shared block, then to replace that qdisc, waiting for the tcx_entry kfree_rcu() to be executed and subsequently accessing the current active qdisc's miniq one way or another. The correct fix is to turn the miniq_active boolean into a counter. What can be observed, at step 2 above, the counter transitions from 0->1, at step [a] from 1->2 (in order for the miniq object to remain active during the replacement), then in [b] from 2->1 and finally [c] 1->0 with the eventual release. The reference counter in general ranges from [0,2] and it does not need to be atomic since all access to the counter is protected by the rtnl mutex. With this in place, there is no longer a UAF happening and the tcx_entry is freed at the correct time. Fixes: e420bed ("bpf: Add fd-based tcx multi-prog infra with link support") Reported-by: Pedro Pinto <[email protected]> Co-developed-by: Pedro Pinto <[email protected]> Signed-off-by: Pedro Pinto <[email protected]> Signed-off-by: Daniel Borkmann <[email protected]> Cc: Hyunwoo Kim <[email protected]> Cc: Wongi Lee <[email protected]> Cc: Martin KaFai Lau <[email protected]> Link: https://lore.kernel.org/r/[email protected] Signed-off-by: Martin KaFai Lau <[email protected]>
1 parent 83c36e7 commit 1cb6f0b

File tree

2 files changed

+15
-10
lines changed

2 files changed

+15
-10
lines changed

include/net/tcx.h

Lines changed: 9 additions & 4 deletions
Original file line numberDiff line numberDiff line change
@@ -13,7 +13,7 @@ struct mini_Qdisc;
1313
struct tcx_entry {
1414
struct mini_Qdisc __rcu *miniq;
1515
struct bpf_mprog_bundle bundle;
16-
bool miniq_active;
16+
u32 miniq_active;
1717
struct rcu_head rcu;
1818
};
1919

@@ -125,11 +125,16 @@ static inline void tcx_skeys_dec(bool ingress)
125125
tcx_dec();
126126
}
127127

128-
static inline void tcx_miniq_set_active(struct bpf_mprog_entry *entry,
129-
const bool active)
128+
static inline void tcx_miniq_inc(struct bpf_mprog_entry *entry)
130129
{
131130
ASSERT_RTNL();
132-
tcx_entry(entry)->miniq_active = active;
131+
tcx_entry(entry)->miniq_active++;
132+
}
133+
134+
static inline void tcx_miniq_dec(struct bpf_mprog_entry *entry)
135+
{
136+
ASSERT_RTNL();
137+
tcx_entry(entry)->miniq_active--;
133138
}
134139

135140
static inline bool tcx_entry_is_active(struct bpf_mprog_entry *entry)

net/sched/sch_ingress.c

Lines changed: 6 additions & 6 deletions
Original file line numberDiff line numberDiff line change
@@ -91,7 +91,7 @@ static int ingress_init(struct Qdisc *sch, struct nlattr *opt,
9191
entry = tcx_entry_fetch_or_create(dev, true, &created);
9292
if (!entry)
9393
return -ENOMEM;
94-
tcx_miniq_set_active(entry, true);
94+
tcx_miniq_inc(entry);
9595
mini_qdisc_pair_init(&q->miniqp, sch, &tcx_entry(entry)->miniq);
9696
if (created)
9797
tcx_entry_update(dev, entry, true);
@@ -121,7 +121,7 @@ static void ingress_destroy(struct Qdisc *sch)
121121
tcf_block_put_ext(q->block, sch, &q->block_info);
122122

123123
if (entry) {
124-
tcx_miniq_set_active(entry, false);
124+
tcx_miniq_dec(entry);
125125
if (!tcx_entry_is_active(entry)) {
126126
tcx_entry_update(dev, NULL, true);
127127
tcx_entry_free(entry);
@@ -257,7 +257,7 @@ static int clsact_init(struct Qdisc *sch, struct nlattr *opt,
257257
entry = tcx_entry_fetch_or_create(dev, true, &created);
258258
if (!entry)
259259
return -ENOMEM;
260-
tcx_miniq_set_active(entry, true);
260+
tcx_miniq_inc(entry);
261261
mini_qdisc_pair_init(&q->miniqp_ingress, sch, &tcx_entry(entry)->miniq);
262262
if (created)
263263
tcx_entry_update(dev, entry, true);
@@ -276,7 +276,7 @@ static int clsact_init(struct Qdisc *sch, struct nlattr *opt,
276276
entry = tcx_entry_fetch_or_create(dev, false, &created);
277277
if (!entry)
278278
return -ENOMEM;
279-
tcx_miniq_set_active(entry, true);
279+
tcx_miniq_inc(entry);
280280
mini_qdisc_pair_init(&q->miniqp_egress, sch, &tcx_entry(entry)->miniq);
281281
if (created)
282282
tcx_entry_update(dev, entry, false);
@@ -302,15 +302,15 @@ static void clsact_destroy(struct Qdisc *sch)
302302
tcf_block_put_ext(q->egress_block, sch, &q->egress_block_info);
303303

304304
if (ingress_entry) {
305-
tcx_miniq_set_active(ingress_entry, false);
305+
tcx_miniq_dec(ingress_entry);
306306
if (!tcx_entry_is_active(ingress_entry)) {
307307
tcx_entry_update(dev, NULL, true);
308308
tcx_entry_free(ingress_entry);
309309
}
310310
}
311311

312312
if (egress_entry) {
313-
tcx_miniq_set_active(egress_entry, false);
313+
tcx_miniq_dec(egress_entry);
314314
if (!tcx_entry_is_active(egress_entry)) {
315315
tcx_entry_update(dev, NULL, false);
316316
tcx_entry_free(egress_entry);

0 commit comments

Comments
 (0)