Skip to content

Commit 13e6065

Browse files
Amit Cohengregkh
Amit Cohen
authored andcommitted
mlxsw: spectrum_acl_erp: Fix error flow of pool allocation failure
[ Upstream commit 6d6eeab ] Lately, a bug was found when many TC filters are added - at some point, several bugs are printed to dmesg [1] and the switch is crashed with segmentation fault. The issue starts when gen_pool_free() fails because of unexpected behavior - a try to free memory which is already freed, this leads to BUG() call which crashes the switch and makes many other bugs. Trying to track down the unexpected behavior led to a bug in eRP code. The function mlxsw_sp_acl_erp_table_alloc() gets a pointer to the allocated index, sets the value and returns an error code. When gen_pool_alloc() fails it returns address 0, we track it and return -ENOBUFS outside, BUT the call for gen_pool_alloc() already override the index in erp_table structure. This is a problem when such allocation is done as part of table expansion. This is not a new table, which will not be used in case of allocation failure. We try to expand eRP table and override the current index (non-zero) with zero. Then, it leads to an unexpected behavior when address 0 is freed twice. Note that address 0 is valid in erp_table->base_index and indeed other tables use it. gen_pool_alloc() fails in case that there is no space left in the pre-allocated pool, in our case, the pool is limited to ACL_MAX_ERPT_BANK_SIZE, which is read from hardware. When more than max erp entries are required, we exceed the limit and return an error, this error leads to "Failed to migrate vregion" print. Fix this by changing erp_table->base_index only in case of a successful allocation. Add a test case for such a scenario. Without this fix it causes segmentation fault: $ TESTS="max_erp_entries_test" ./tc_flower.sh ./tc_flower.sh: line 988: 1560 Segmentation fault tc filter del dev $h2 ingress chain $i protocol ip pref $i handle $j flower &>/dev/null [1]: kernel BUG at lib/genalloc.c:508! invalid opcode: 0000 [#1] PREEMPT SMP CPU: 6 PID: 3531 Comm: tc Not tainted 6.7.0-rc5-custom-ga6893f479f5e #1 Hardware name: Mellanox Technologies Ltd. MSN4700/VMOD0010, BIOS 5.11 07/12/2021 RIP: 0010:gen_pool_free_owner+0xc9/0xe0 ... Call Trace: <TASK> __mlxsw_sp_acl_erp_table_other_dec+0x70/0xa0 [mlxsw_spectrum] mlxsw_sp_acl_erp_mask_destroy+0xf5/0x110 [mlxsw_spectrum] objagg_obj_root_destroy+0x18/0x80 [objagg] objagg_obj_destroy+0x12c/0x130 [objagg] mlxsw_sp_acl_erp_mask_put+0x37/0x50 [mlxsw_spectrum] mlxsw_sp_acl_ctcam_region_entry_remove+0x74/0xa0 [mlxsw_spectrum] mlxsw_sp_acl_ctcam_entry_del+0x1e/0x40 [mlxsw_spectrum] mlxsw_sp_acl_tcam_ventry_del+0x78/0xd0 [mlxsw_spectrum] mlxsw_sp_flower_destroy+0x4d/0x70 [mlxsw_spectrum] mlxsw_sp_flow_block_cb+0x73/0xb0 [mlxsw_spectrum] tc_setup_cb_destroy+0xc1/0x180 fl_hw_destroy_filter+0x94/0xc0 [cls_flower] __fl_delete+0x1ac/0x1c0 [cls_flower] fl_destroy+0xc2/0x150 [cls_flower] tcf_proto_destroy+0x1a/0xa0 ... mlxsw_spectrum3 0000:07:00.0: Failed to migrate vregion mlxsw_spectrum3 0000:07:00.0: Failed to migrate vregion Fixes: f465261 ("mlxsw: spectrum_acl: Implement common eRP core") Signed-off-by: Amit Cohen <[email protected]> Signed-off-by: Ido Schimmel <[email protected]> Signed-off-by: Petr Machata <[email protected]> Acked-by: Paolo Abeni <[email protected]> Link: https://lore.kernel.org/r/4cfca254dfc0e5d283974801a24371c7b6db5989.1705502064.git.petrm@nvidia.com Signed-off-by: Jakub Kicinski <[email protected]> Signed-off-by: Sasha Levin <[email protected]>
1 parent 31944f4 commit 13e6065

File tree

2 files changed

+56
-4
lines changed

2 files changed

+56
-4
lines changed

drivers/net/ethernet/mellanox/mlxsw/spectrum_acl_erp.c

Lines changed: 5 additions & 3 deletions
Original file line numberDiff line numberDiff line change
@@ -301,6 +301,7 @@ mlxsw_sp_acl_erp_table_alloc(struct mlxsw_sp_acl_erp_core *erp_core,
301301
unsigned long *p_index)
302302
{
303303
unsigned int num_rows, entry_size;
304+
unsigned long index;
304305

305306
/* We only allow allocations of entire rows */
306307
if (num_erps % erp_core->num_erp_banks != 0)
@@ -309,10 +310,11 @@ mlxsw_sp_acl_erp_table_alloc(struct mlxsw_sp_acl_erp_core *erp_core,
309310
entry_size = erp_core->erpt_entries_size[region_type];
310311
num_rows = num_erps / erp_core->num_erp_banks;
311312

312-
*p_index = gen_pool_alloc(erp_core->erp_tables, num_rows * entry_size);
313-
if (*p_index == 0)
313+
index = gen_pool_alloc(erp_core->erp_tables, num_rows * entry_size);
314+
if (!index)
314315
return -ENOBUFS;
315-
*p_index -= MLXSW_SP_ACL_ERP_GENALLOC_OFFSET;
316+
317+
*p_index = index - MLXSW_SP_ACL_ERP_GENALLOC_OFFSET;
316318

317319
return 0;
318320
}

tools/testing/selftests/drivers/net/mlxsw/spectrum-2/tc_flower.sh

Lines changed: 51 additions & 1 deletion
Original file line numberDiff line numberDiff line change
@@ -10,7 +10,8 @@ lib_dir=$(dirname $0)/../../../../net/forwarding
1010
ALL_TESTS="single_mask_test identical_filters_test two_masks_test \
1111
multiple_masks_test ctcam_edge_cases_test delta_simple_test \
1212
delta_two_masks_one_key_test delta_simple_rehash_test \
13-
bloom_simple_test bloom_complex_test bloom_delta_test"
13+
bloom_simple_test bloom_complex_test bloom_delta_test \
14+
max_erp_entries_test"
1415
NUM_NETIFS=2
1516
source $lib_dir/lib.sh
1617
source $lib_dir/tc_common.sh
@@ -983,6 +984,55 @@ bloom_delta_test()
983984
log_test "bloom delta test ($tcflags)"
984985
}
985986

987+
max_erp_entries_test()
988+
{
989+
# The number of eRP entries is limited. Once the maximum number of eRPs
990+
# has been reached, filters cannot be added. This test verifies that
991+
# when this limit is reached, inserstion fails without crashing.
992+
993+
RET=0
994+
995+
local num_masks=32
996+
local num_regions=15
997+
local chain_failed
998+
local mask_failed
999+
local ret
1000+
1001+
if [[ "$tcflags" != "skip_sw" ]]; then
1002+
return 0;
1003+
fi
1004+
1005+
for ((i=1; i < $num_regions; i++)); do
1006+
for ((j=$num_masks; j >= 0; j--)); do
1007+
tc filter add dev $h2 ingress chain $i protocol ip \
1008+
pref $i handle $j flower $tcflags \
1009+
dst_ip 192.1.0.0/$j &> /dev/null
1010+
ret=$?
1011+
1012+
if [ $ret -ne 0 ]; then
1013+
chain_failed=$i
1014+
mask_failed=$j
1015+
break 2
1016+
fi
1017+
done
1018+
done
1019+
1020+
# We expect to exceed the maximum number of eRP entries, so that
1021+
# insertion eventually fails. Otherwise, the test should be adjusted to
1022+
# add more filters.
1023+
check_fail $ret "expected to exceed number of eRP entries"
1024+
1025+
for ((; i >= 1; i--)); do
1026+
for ((j=0; j <= $num_masks; j++)); do
1027+
tc filter del dev $h2 ingress chain $i protocol ip \
1028+
pref $i handle $j flower &> /dev/null
1029+
done
1030+
done
1031+
1032+
log_test "max eRP entries test ($tcflags). " \
1033+
"max chain $chain_failed, mask $mask_failed"
1034+
}
1035+
9861036
setup_prepare()
9871037
{
9881038
h1=${NETIFS[p1]}

0 commit comments

Comments
 (0)