DLPX-86263 Remove merge commit from linux-kernel-aws #41

palash-gandhi
Contributor

@palash-gandhi palash-gandhi commented May 30, 2023

Problem

Similar to delphix/linux-kernel-generic#20, a merge commit
needs to be removed from the git history.

Solution

I removed it and added an empty commit to create this PR.

Testing Done

ab-pre-push as a PR check.

@palash-gandhi palash-gandhi marked this pull request as ready for review May 30, 2023 21:54
@palash-gandhi palash-gandhi force-pushed the dlpx/pr/pgandhi-delphix/80025084-36e6-4ac3-abdc-0c86d6dc1648 branch from 1b09927 to 24afab6 Compare May 30, 2023 22:10
@palash-gandhi palash-gandhi changed the title from "Dlpx/pr/pgandhi delphix/80025084 36e6 4ac3 abdc 0c86d6dc1648" to "DLPX-86263 Remove merge commit from linux-kernel-aws" May 30, 2023
@palash-gandhi palash-gandhi merged commit 24afab6 into develop May 31, 2023
@palash-gandhi palash-gandhi deleted the dlpx/pr/pgandhi-delphix/80025084-36e6-4ac3-abdc-0c86d6dc1648 branch May 31, 2023 16:22
delphix-devops-bot pushed a commit that referenced this pull request Mar 21, 2024
…>cur_tx

BugLink: https://bugs.launchpad.net/bugs/2049417

[ Upstream commit c1c0ce3 ]

KCSAN reported the following data-race:

==================================================================
BUG: KCSAN: data-race in rtl8169_poll [r8169] / rtl8169_start_xmit [r8169]

write (marked) to 0xffff888102474b74 of 4 bytes by task 5358 on cpu 29:
rtl8169_start_xmit (drivers/net/ethernet/realtek/r8169_main.c:4254) r8169
dev_hard_start_xmit (./include/linux/netdevice.h:4889 ./include/linux/netdevice.h:4903 net/core/dev.c:3544 net/core/dev.c:3560)
sch_direct_xmit (net/sched/sch_generic.c:342)
__dev_queue_xmit (net/core/dev.c:3817 net/core/dev.c:4306)
ip_finish_output2 (./include/linux/netdevice.h:3082 ./include/net/neighbour.h:526 ./include/net/neighbour.h:540 net/ipv4/ip_output.c:233)
__ip_finish_output (net/ipv4/ip_output.c:311 net/ipv4/ip_output.c:293)
ip_finish_output (net/ipv4/ip_output.c:328)
ip_output (net/ipv4/ip_output.c:435)
ip_send_skb (./include/net/dst.h:458 net/ipv4/ip_output.c:127 net/ipv4/ip_output.c:1486)
udp_send_skb (net/ipv4/udp.c:963)
udp_sendmsg (net/ipv4/udp.c:1246)
inet_sendmsg (net/ipv4/af_inet.c:840 (discriminator 4))
sock_sendmsg (net/socket.c:730 net/socket.c:753)
__sys_sendto (net/socket.c:2177)
__x64_sys_sendto (net/socket.c:2185)
do_syscall_64 (arch/x86/entry/common.c:50 arch/x86/entry/common.c:80)
entry_SYSCALL_64_after_hwframe (arch/x86/entry/entry_64.S:120)

read to 0xffff888102474b74 of 4 bytes by interrupt on cpu 21:
rtl8169_poll (drivers/net/ethernet/realtek/r8169_main.c:4397 drivers/net/ethernet/realtek/r8169_main.c:4581) r8169
__napi_poll (net/core/dev.c:6527)
net_rx_action (net/core/dev.c:6596 net/core/dev.c:6727)
__do_softirq (kernel/softirq.c:553)
__irq_exit_rcu (kernel/softirq.c:427 kernel/softirq.c:632)
irq_exit_rcu (kernel/softirq.c:647)
common_interrupt (arch/x86/kernel/irq.c:247 (discriminator 14))
asm_common_interrupt (./arch/x86/include/asm/idtentry.h:636)
cpuidle_enter_state (drivers/cpuidle/cpuidle.c:291)
cpuidle_enter (drivers/cpuidle/cpuidle.c:390)
call_cpuidle (kernel/sched/idle.c:135)
do_idle (kernel/sched/idle.c:219 kernel/sched/idle.c:282)
cpu_startup_entry (kernel/sched/idle.c:378 (discriminator 1))
start_secondary (arch/x86/kernel/smpboot.c:210 arch/x86/kernel/smpboot.c:294)
secondary_startup_64_no_verify (arch/x86/kernel/head_64.S:433)

value changed: 0x002f4815 -> 0x002f4816

Reported by Kernel Concurrency Sanitizer on:
CPU: 21 PID: 0 Comm: swapper/21 Tainted: G             L     6.6.0-rc2-kcsan-00143-gb5cbe7c00aa0 #41
Hardware name: ASRock X670E PG Lightning/X670E PG Lightning, BIOS 1.21 04/26/2023
==================================================================

The write side of drivers/net/ethernet/realtek/r8169_main.c is:
==================
   4251         /* rtl_tx needs to see descriptor changes before updated tp->cur_tx */
   4252         smp_wmb();
   4253
 → 4254         WRITE_ONCE(tp->cur_tx, tp->cur_tx + frags + 1);
   4255
   4256         stop_queue = !netif_subqueue_maybe_stop(dev, 0, rtl_tx_slots_avail(tp),
   4257                                                 R8169_TX_STOP_THRS,
   4258                                                 R8169_TX_START_THRS);

The read side is the function rtl_tx():

   4355 static void rtl_tx(struct net_device *dev, struct rtl8169_private *tp,
   4356                    int budget)
   4357 {
   4358         unsigned int dirty_tx, bytes_compl = 0, pkts_compl = 0;
   4359         struct sk_buff *skb;
   4360
   4361         dirty_tx = tp->dirty_tx;
   4362
   4363         while (READ_ONCE(tp->cur_tx) != dirty_tx) {
   4364                 unsigned int entry = dirty_tx % NUM_TX_DESC;
   4365                 u32 status;
   4366
   4367                 status = le32_to_cpu(tp->TxDescArray[entry].opts1);
   4368                 if (status & DescOwn)
   4369                         break;
   4370
   4371                 skb = tp->tx_skb[entry].skb;
   4372                 rtl8169_unmap_tx_skb(tp, entry);
   4373
   4374                 if (skb) {
   4375                         pkts_compl++;
   4376                         bytes_compl += skb->len;
   4377                         napi_consume_skb(skb, budget);
   4378                 }
   4379                 dirty_tx++;
   4380         }
   4381
   4382         if (tp->dirty_tx != dirty_tx) {
   4383                 dev_sw_netstats_tx_add(dev, pkts_compl, bytes_compl);
   4384                 WRITE_ONCE(tp->dirty_tx, dirty_tx);
   4385
   4386                 netif_subqueue_completed_wake(dev, 0, pkts_compl, bytes_compl,
   4387                                               rtl_tx_slots_avail(tp),
   4388                                               R8169_TX_START_THRS);
   4389                 /*
   4390                  * 8168 hack: TxPoll requests are lost when the Tx packets are
   4391                  * too close. Let's kick an extra TxPoll request when a burst
   4392                  * of start_xmit activity is detected (if it is not detected,
   4393                  * it is slow enough). -- FR
   4394                  * If skb is NULL then we come here again once a tx irq is
   4395                  * triggered after the last fragment is marked transmitted.
   4396                  */
 → 4397                 if (tp->cur_tx != dirty_tx && skb)
   4398                         rtl8169_doorbell(tp);
   4399         }
   4400 }

As the code shows, an earlier data race on tp->cur_tx was already fixed at
line 4363:

   4363         while (READ_ONCE(tp->cur_tx) != dirty_tx) {

but the same solution is required for protecting the other access to tp->cur_tx:

 → 4397                 if (READ_ONCE(tp->cur_tx) != dirty_tx && skb)
   4398                         rtl8169_doorbell(tp);

The write at line 4254 is marked with WRITE_ONCE(), but the plain read at line 4397
could suffer read tearing under some compiler optimisations.

The fix eliminated the KCSAN data-race report for this bug.

It is yet to be evaluated what happens if tp->cur_tx changes between the test at line 4363
and the test at line 4397. The value should certainly not be cached by the compiler in a
register for that long, since asynchronous writes to tp->cur_tx at line 4254 might have
occurred in the meantime.
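
The effect of marking both sides of the access can be sketched in userspace C. This is only an illustration, not the driver code: the READ_ONCE()/WRITE_ONCE() macros below are simplified stand-ins for the kernel ones (they rely on the GCC/Clang `__typeof__` extension), and `struct ring_state`, `ring_publish()` and `ring_needs_doorbell()` are hypothetical names invented for the example.

```c
#include <assert.h>

/* Simplified stand-ins for the kernel macros (assumption: the real kernel
 * definitions are more elaborate). The volatile cast asks the compiler to
 * perform exactly one full-width access and not to reuse a stale copy. */
#define READ_ONCE(x)       (*(const volatile __typeof__(x) *)&(x))
#define WRITE_ONCE(x, val) (*(volatile __typeof__(x) *)&(x) = (val))

/* Hypothetical, cut-down ring state; not the real rtl8169_private. */
struct ring_state {
	unsigned int cur_tx;   /* advanced by the transmit path */
	unsigned int dirty_tx; /* advanced by the completion path */
};

/* Transmit side: publish nr_desc new descriptors with a marked write. */
static void ring_publish(struct ring_state *r, unsigned int nr_desc)
{
	WRITE_ONCE(r->cur_tx, r->cur_tx + nr_desc);
}

/* Completion side: take one marked snapshot of cur_tx and decide whether
 * an extra doorbell would be needed, mirroring the shape of the fixed test. */
static int ring_needs_doorbell(struct ring_state *r, unsigned int dirty_tx,
			       int have_skb)
{
	unsigned int cur_tx = READ_ONCE(r->cur_tx);

	return cur_tx != dirty_tx && have_skb;
}
```

Calling `ring_publish(&r, 3)` on a zeroed `struct ring_state` and then `ring_needs_doorbell(&r, 0, 1)` reports that work is outstanding. The sketch is single-threaded, so it shows only the access pattern and not the race itself.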

Fixes: 94d8a98 ("r8169: reduce number of workaround doorbell rings")
Cc: Heiner Kallweit <[email protected]>
Cc: [email protected]
Cc: "David S. Miller" <[email protected]>
Cc: Eric Dumazet <[email protected]>
Cc: Jakub Kicinski <[email protected]>
Cc: Paolo Abeni <[email protected]>
Cc: Marco Elver <[email protected]>
Cc: [email protected]
Link: https://lore.kernel.org/lkml/[email protected]/
Signed-off-by: Mirsad Goran Todorovac <[email protected]>
Acked-by: Marco Elver <[email protected]>
Signed-off-by: David S. Miller <[email protected]>
Signed-off-by: Sasha Levin <[email protected]>
Signed-off-by: Roxana Nicolescu <[email protected]>
Signed-off-by: Stefan Bader <[email protected]>
delphix-devops-bot pushed a commit that referenced this pull request Mar 21, 2024
…cArray[entry].opts1

BugLink: https://bugs.launchpad.net/bugs/2049417

[ Upstream commit dcf75a0 ]

KCSAN reported the following data-race:

==================================================================
BUG: KCSAN: data-race in rtl8169_poll (drivers/net/ethernet/realtek/r8169_main.c:4368 drivers/net/ethernet/realtek/r8169_main.c:4581) r8169

race at unknown origin, with read to 0xffff888140d37570 of 4 bytes by interrupt on cpu 21:
rtl8169_poll (drivers/net/ethernet/realtek/r8169_main.c:4368 drivers/net/ethernet/realtek/r8169_main.c:4581) r8169
__napi_poll (net/core/dev.c:6527)
net_rx_action (net/core/dev.c:6596 net/core/dev.c:6727)
__do_softirq (kernel/softirq.c:553)
__irq_exit_rcu (kernel/softirq.c:427 kernel/softirq.c:632)
irq_exit_rcu (kernel/softirq.c:647)
sysvec_apic_timer_interrupt (arch/x86/kernel/apic/apic.c:1074 (discriminator 14))
asm_sysvec_apic_timer_interrupt (./arch/x86/include/asm/idtentry.h:645)
cpuidle_enter_state (drivers/cpuidle/cpuidle.c:291)
cpuidle_enter (drivers/cpuidle/cpuidle.c:390)
call_cpuidle (kernel/sched/idle.c:135)
do_idle (kernel/sched/idle.c:219 kernel/sched/idle.c:282)
cpu_startup_entry (kernel/sched/idle.c:378 (discriminator 1))
start_secondary (arch/x86/kernel/smpboot.c:210 arch/x86/kernel/smpboot.c:294)
secondary_startup_64_no_verify (arch/x86/kernel/head_64.S:433)

value changed: 0xb0000042 -> 0x00000000

Reported by Kernel Concurrency Sanitizer on:
CPU: 21 PID: 0 Comm: swapper/21 Tainted: G             L     6.6.0-rc2-kcsan-00143-gb5cbe7c00aa0 #41
Hardware name: ASRock X670E PG Lightning/X670E PG Lightning, BIOS 1.21 04/26/2023
==================================================================

The read side is in

drivers/net/ethernet/realtek/r8169_main.c
=========================================
   4355 static void rtl_tx(struct net_device *dev, struct rtl8169_private *tp,
   4356                    int budget)
   4357 {
   4358         unsigned int dirty_tx, bytes_compl = 0, pkts_compl = 0;
   4359         struct sk_buff *skb;
   4360
   4361         dirty_tx = tp->dirty_tx;
   4362
   4363         while (READ_ONCE(tp->cur_tx) != dirty_tx) {
   4364                 unsigned int entry = dirty_tx % NUM_TX_DESC;
   4365                 u32 status;
   4366
 → 4367                 status = le32_to_cpu(tp->TxDescArray[entry].opts1);
   4368                 if (status & DescOwn)
   4369                         break;
   4370
   4371                 skb = tp->tx_skb[entry].skb;
   4372                 rtl8169_unmap_tx_skb(tp, entry);
   4373
   4374                 if (skb) {
   4375                         pkts_compl++;
   4376                         bytes_compl += skb->len;
   4377                         napi_consume_skb(skb, budget);
   4378                 }
   4379                 dirty_tx++;
   4380         }
   4381
   4382         if (tp->dirty_tx != dirty_tx) {
   4383                 dev_sw_netstats_tx_add(dev, pkts_compl, bytes_compl);
   4384                 WRITE_ONCE(tp->dirty_tx, dirty_tx);
   4385
   4386                 netif_subqueue_completed_wake(dev, 0, pkts_compl, bytes_compl,
   4387                                               rtl_tx_slots_avail(tp),
   4388                                               R8169_TX_START_THRS);
   4389                 /*
   4390                  * 8168 hack: TxPoll requests are lost when the Tx packets are
   4391                  * too close. Let's kick an extra TxPoll request when a burst
   4392                  * of start_xmit activity is detected (if it is not detected,
   4393                  * it is slow enough). -- FR
   4394                  * If skb is NULL then we come here again once a tx irq is
   4395                  * triggered after the last fragment is marked transmitted.
   4396                  */
   4397                 if (READ_ONCE(tp->cur_tx) != dirty_tx && skb)
   4398                         rtl8169_doorbell(tp);
   4399         }
   4400 }

tp->TxDescArray[entry].opts1 is reported to have a data-race and READ_ONCE() fixes
this KCSAN warning.

   4366
 → 4367                 status = le32_to_cpu(READ_ONCE(tp->TxDescArray[entry].opts1));
   4368                 if (status & DescOwn)
   4369                         break;
   4370
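
A minimal userspace sketch of the same pattern follows: load the little-endian status word exactly once, convert it, and only then test the ownership bit. The names `struct tx_desc`, `le32_encode()`, `le32_decode()` and `desc_owned_by_device()` are hypothetical, READ_ONCE() is a simplified stand-in for the kernel macro, and the position of the ownership flag (bit 31) is an assumption made for the example.

```c
#include <assert.h>
#include <stdint.h>

/* Simplified stand-in for the kernel macro (GCC/Clang __typeof__ extension). */
#define READ_ONCE(x) (*(const volatile __typeof__(x) *)&(x))

/* Assumption for this sketch: the ownership flag is the top bit of opts1. */
#define DESC_OWN (1u << 31)

/* Hypothetical descriptor layout; fields are stored little-endian. */
struct tx_desc {
	uint32_t opts1;
	uint32_t opts2;
	uint64_t addr;
};

/* Portable helpers standing in for cpu_to_le32() / le32_to_cpu(). */
static uint32_t le32_encode(uint32_t v)
{
	uint32_t raw;
	unsigned char *b = (unsigned char *)&raw;

	b[0] = v & 0xff;
	b[1] = (v >> 8) & 0xff;
	b[2] = (v >> 16) & 0xff;
	b[3] = (v >> 24) & 0xff;
	return raw;
}

static uint32_t le32_decode(uint32_t raw)
{
	const unsigned char *b = (const unsigned char *)&raw;

	return (uint32_t)b[0] | (uint32_t)b[1] << 8 |
	       (uint32_t)b[2] << 16 | (uint32_t)b[3] << 24;
}

/* One marked load of opts1; the decision is made on that single snapshot. */
static int desc_owned_by_device(struct tx_desc *d)
{
	uint32_t status = le32_decode(READ_ONCE(d->opts1));

	return (status & DESC_OWN) != 0;
}
```

With the two values from the KCSAN report, a descriptor holding 0xb0000042 is still owned by the device, and one holding 0x00000000 has been handed back to the CPU.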

Cc: Heiner Kallweit <[email protected]>
Cc: [email protected]
Cc: "David S. Miller" <[email protected]>
Cc: Eric Dumazet <[email protected]>
Cc: Jakub Kicinski <[email protected]>
Cc: Paolo Abeni <[email protected]>
Cc: Marco Elver <[email protected]>
Cc: [email protected]
Link: https://lore.kernel.org/lkml/[email protected]/
Signed-off-by: Mirsad Goran Todorovac <[email protected]>
Acked-by: Marco Elver <[email protected]>
Fixes: 1da177e ("Linux-2.6.12-rc2")
Signed-off-by: David S. Miller <[email protected]>
Signed-off-by: Sasha Levin <[email protected]>
Signed-off-by: Roxana Nicolescu <[email protected]>
Signed-off-by: Stefan Bader <[email protected]>
delphix-devops-bot pushed a commit that referenced this pull request Mar 21, 2024
…>opts1

BugLink: https://bugs.launchpad.net/bugs/2049417

[ Upstream commit f97eee4 ]

KCSAN reported the following data-race bug:

==================================================================
BUG: KCSAN: data-race in rtl8169_poll (drivers/net/ethernet/realtek/r8169_main.c:4430 drivers/net/ethernet/realtek/r8169_main.c:4583) r8169

race at unknown origin, with read to 0xffff888117e43510 of 4 bytes by interrupt on cpu 21:
rtl8169_poll (drivers/net/ethernet/realtek/r8169_main.c:4430 drivers/net/ethernet/realtek/r8169_main.c:4583) r8169
__napi_poll (net/core/dev.c:6527)
net_rx_action (net/core/dev.c:6596 net/core/dev.c:6727)
__do_softirq (kernel/softirq.c:553)
__irq_exit_rcu (kernel/softirq.c:427 kernel/softirq.c:632)
irq_exit_rcu (kernel/softirq.c:647)
sysvec_apic_timer_interrupt (arch/x86/kernel/apic/apic.c:1074 (discriminator 14))
asm_sysvec_apic_timer_interrupt (./arch/x86/include/asm/idtentry.h:645)
cpuidle_enter_state (drivers/cpuidle/cpuidle.c:291)
cpuidle_enter (drivers/cpuidle/cpuidle.c:390)
call_cpuidle (kernel/sched/idle.c:135)
do_idle (kernel/sched/idle.c:219 kernel/sched/idle.c:282)
cpu_startup_entry (kernel/sched/idle.c:378 (discriminator 1))
start_secondary (arch/x86/kernel/smpboot.c:210 arch/x86/kernel/smpboot.c:294)
secondary_startup_64_no_verify (arch/x86/kernel/head_64.S:433)

value changed: 0x80003fff -> 0x3402805f

Reported by Kernel Concurrency Sanitizer on:
CPU: 21 PID: 0 Comm: swapper/21 Tainted: G             L     6.6.0-rc2-kcsan-00143-gb5cbe7c00aa0 #41
Hardware name: ASRock X670E PG Lightning/X670E PG Lightning, BIOS 1.21 04/26/2023
==================================================================

drivers/net/ethernet/realtek/r8169_main.c:
==========================================
   4429
 → 4430                 status = le32_to_cpu(desc->opts1);
   4431                 if (status & DescOwn)
   4432                         break;
   4433
   4434                 /* This barrier is needed to keep us from reading
   4435                  * any other fields out of the Rx descriptor until
   4436                  * we know the status of DescOwn
   4437                  */
   4438                 dma_rmb();
   4439
   4440                 if (unlikely(status & RxRES)) {
   4441                         if (net_ratelimit())
   4442                                 netdev_warn(dev, "Rx ERROR. status = %08x\n",

Marco Elver explained that dma_rmb() does not prevent the compiler from tearing the access
to desc->opts1, which can be written to concurrently. READ_ONCE() should prevent that from
happening:

   4429
 → 4430                 status = le32_to_cpu(READ_ONCE(desc->opts1));
   4431                 if (status & DescOwn)
   4432                         break;
   4433

As a consequence of this fix, this KCSAN warning was eliminated.
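
The division of labour between the marked load and the barrier can be sketched in userspace C. This is a sketch under stated assumptions, not the driver's receive path: a C11 acquire fence stands in for dma_rmb(), READ_ONCE() is a simplified stand-in for the kernel macro, the descriptor is treated as host-endian for brevity, and `struct rx_desc`, `rx_desc_poll()`, `DESC_OWN` and `LEN_MASK` are hypothetical names and values chosen for illustration.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdint.h>

/* Simplified stand-in for the kernel macro (GCC/Clang __typeof__ extension). */
#define READ_ONCE(x) (*(const volatile __typeof__(x) *)&(x))

#define DESC_OWN 0x80000000u /* assumed ownership flag for this sketch */
#define LEN_MASK 0x00003fffu /* illustrative length field, low 14 bits */

/* Hypothetical receive descriptor, host-endian for brevity. */
struct rx_desc {
	uint32_t opts1;
	uint32_t opts2;
};

/* Returns -1 while the device still owns the descriptor. Otherwise copies
 * opts2 out and returns the length encoded in the status snapshot. */
static int rx_desc_poll(struct rx_desc *d, uint32_t *opts2_out)
{
	/* The marked load keeps the compiler from tearing or repeating
	 * the read of the status word. */
	uint32_t status = READ_ONCE(d->opts1);

	if (status & DESC_OWN)
		return -1;

	/* Stand-in for dma_rmb(): orders the status load before the loads
	 * of the remaining fields. It does nothing for the load above. */
	atomic_thread_fence(memory_order_acquire);

	*opts2_out = d->opts2;
	return (int)(status & LEN_MASK);
}
```

Fed the two values from the KCSAN report, the sketch returns -1 for 0x80003fff and a length of 95 for 0x3402805f. The length is meaningful only under the illustrative mask above.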

Fixes: 6202806 ("r8169: drop member opts1_mask from struct rtl8169_private")
Suggested-by: Marco Elver <[email protected]>
Cc: Heiner Kallweit <[email protected]>
Cc: [email protected]
Cc: "David S. Miller" <[email protected]>
Cc: Eric Dumazet <[email protected]>
Cc: Jakub Kicinski <[email protected]>
Cc: Paolo Abeni <[email protected]>
Cc: [email protected]
Link: https://lore.kernel.org/lkml/[email protected]/
Signed-off-by: Mirsad Goran Todorovac <[email protected]>
Acked-by: Marco Elver <[email protected]>
Signed-off-by: David S. Miller <[email protected]>
Signed-off-by: Sasha Levin <[email protected]>
Signed-off-by: Roxana Nicolescu <[email protected]>
Signed-off-by: Stefan Bader <[email protected]>
delphix-devops-bot pushed a commit that referenced this pull request Dec 18, 2024
BugLink: https://bugs.launchpad.net/bugs/2064176

A user can trigger (see the steps in [1] and the LP bug) the following RCU warning, which makes
the whole system unresponsive and effectively forces the system administrator to reboot.

Aug 30 21:51:57 v1 kernel: ------------[ cut here ]------------
Aug 30 21:51:57 v1 kernel: Voluntary context switch within RCU read-side critical section!
Aug 30 21:51:57 v1 kernel: WARNING: CPU: 1 PID: 2669 at kernel/rcu/tree_plugin.h:320 rcu_note_context_switch+0x2ce/0x2f0
Aug 30 21:51:57 v1 kernel: Modules linked in: veth vxlan ip6_udp_tunnel udp_tunnel dummy nft_masq nft_chain_nat bridge stp llc zfs(PO) spl(O) nvme_fabrics nvme_core nvme_auth ebtable_filter ebtables ip6table_raw ip6table_mangle ip6table_nat ip6table_filter ip6_tables iptable_raw iptable_mangle iptable_nat nf_nat nf_conntrack nf_defrag_ipv6 nf_defrag_ipv4 iptable_filter nf_tables libcrc32c vhost_vsock vhost vhost_iotlb binfmt_misc kvm_amd ccp kvm irqbypass crct10dif_pclmul crc32_pclmul polyval_clmulni polyval_generic ghash_clmulni_intel sha256_ssse3 sha1_ssse3 nls_iso8859_1 joydev aesni_intel crypto_simd cryptd virtio_gpu 9pnet_virtio virtio_dma_buf xhci_pci psmouse ahci 9pnet virtiofs libahci vmw_vsock_virtio_transport xhci_pci_renesas vmw_vsock_virtio_transport_common vsock virtio_input input_leds serio_raw efi_pstore nfnetlink dmi_sysfs virtio_rng ip_tables x_tables autofs4
Aug 30 21:51:57 v1 kernel: CPU: 1 PID: 2669 Comm: systemd-resolve Tainted: P           O       6.8.0-41-generic #41-Ubuntu
Aug 30 21:51:57 v1 kernel: Hardware name: QEMU Standard PC (Q35 + ICH9, 2009)/LXD, BIOS unknown 2/2/2022
Aug 30 21:51:57 v1 kernel: RIP: 0010:rcu_note_context_switch+0x2ce/0x2f0
Aug 30 21:51:57 v1 kernel: Code: fe ff ff ba 02 00 00 00 be 01 00 00 00 e8 fa d0 fe ff e9 6b fe ff ff 48 c7 c7 60 7d a6 a8 c6 05 ab 99 61 02 01 e8 d2 0d f2 ff <0f> 0b e9 96 fd ff ff 0f 0b e9 36 ff ff ff 0f 0b e9 18 ff ff ff 66
Aug 30 21:51:57 v1 kernel: RSP: 0018:ffffb611812bbd80 EFLAGS: 00010046
Aug 30 21:51:57 v1 kernel: RAX: 0000000000000000 RBX: ffff9613faeb5a00 RCX: 0000000000000000
Aug 30 21:51:57 v1 kernel: RDX: 0000000000000000 RSI: 0000000000000000 RDI: 0000000000000000
Aug 30 21:51:57 v1 kernel: RBP: ffffb611812bbda0 R08: 0000000000000000 R09: 0000000000000000
Aug 30 21:51:57 v1 kernel: R10: 0000000000000000 R11: 0000000000000000 R12: 0000000000000000
Aug 30 21:51:57 v1 kernel: R13: ffff9613b89dd200 R14: 0000000000000000 R15: 0000000000000000
Aug 30 21:51:57 v1 kernel: FS:  00007ec3a402c5c0(0000) GS:ffff9613fae80000(0000) knlGS:0000000000000000
Aug 30 21:51:57 v1 kernel: CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
Aug 30 21:51:57 v1 kernel: CR2: 000062592dc892b8 CR3: 000000013890a000 CR4: 00000000007506f0
Aug 30 21:51:57 v1 kernel: PKRU: 55555554
Aug 30 21:51:57 v1 kernel: Call Trace:
Aug 30 21:51:57 v1 kernel:  <TASK>
Aug 30 21:51:57 v1 kernel:  ? show_regs+0x6d/0x80
Aug 30 21:51:57 v1 kernel:  ? __warn+0x89/0x160
Aug 30 21:51:57 v1 kernel:  ? rcu_note_context_switch+0x2ce/0x2f0
Aug 30 21:51:57 v1 kernel:  ? report_bug+0x17e/0x1b0
Aug 30 21:51:57 v1 kernel:  ? handle_bug+0x51/0xa0
Aug 30 21:51:57 v1 kernel:  ? exc_invalid_op+0x18/0x80
Aug 30 21:51:57 v1 kernel:  ? asm_exc_invalid_op+0x1b/0x20
Aug 30 21:51:57 v1 kernel:  ? rcu_note_context_switch+0x2ce/0x2f0
Aug 30 21:51:57 v1 kernel:  __schedule+0x81/0x6b0
Aug 30 21:51:57 v1 kernel:  schedule+0x33/0x110
Aug 30 21:51:57 v1 kernel:  syscall_exit_to_user_mode+0x22d/0x260
Aug 30 21:51:57 v1 kernel:  do_syscall_64+0x8c/0x180
Aug 30 21:51:57 v1 kernel:  ? srso_alias_return_thunk+0x5/0xfbef5
Aug 30 21:51:57 v1 kernel:  ? syscall_exit_to_user_mode+0x89/0x260
Aug 30 21:51:57 v1 kernel:  ? srso_alias_return_thunk+0x5/0xfbef5
Aug 30 21:51:57 v1 kernel:  ? do_syscall_64+0x8c/0x180
Aug 30 21:51:57 v1 kernel:  ? srso_alias_return_thunk+0x5/0xfbef5
Aug 30 21:51:57 v1 kernel:  ? irqentry_exit_to_user_mode+0x7e/0x260
Aug 30 21:51:57 v1 kernel:  ? srso_alias_return_thunk+0x5/0xfbef5
Aug 30 21:51:57 v1 kernel:  ? irqentry_exit+0x43/0x50
Aug 30 21:51:57 v1 kernel:  ? srso_alias_return_thunk+0x5/0xfbef5
Aug 30 21:51:57 v1 kernel:  ? exc_page_fault+0x94/0x1b0
Aug 30 21:51:57 v1 kernel:  entry_SYSCALL_64_after_hwframe+0x78/0x80
Aug 30 21:51:57 v1 kernel: RIP: 0033:0x7ec3a3f14887
Aug 30 21:51:57 v1 kernel: Code: 10 00 f7 d8 64 89 02 48 c7 c0 ff ff ff ff eb b7 0f 1f 00 f3 0f 1e fa 64 8b 04 25 18 00 00 00 85 c0 75 10 b8 01 00 00 00 0f 05 <48> 3d 00 f0 ff ff 77 51 c3 48 83 ec 28 48 89 54 24 18 48 89 74 24
Aug 30 21:51:57 v1 kernel: RSP: 002b:00007ffcbb32de08 EFLAGS: 00000246 ORIG_RAX: 0000000000000001
Aug 30 21:51:57 v1 kernel: RAX: 000000000000002d RBX: 000062592dc882b0 RCX: 00007ec3a3f14887
Aug 30 21:51:57 v1 kernel: RDX: 000000000000002d RSI: 000062592dc88360 RDI: 0000000000000011
Aug 30 21:51:57 v1 kernel: RBP: 000062592dc7e690 R08: 00007ffcbb32dde4 R09: 0000000000000000
Aug 30 21:51:57 v1 kernel: R10: 00000000000005aa R11: 0000000000000246 R12: 0000000000000011
Aug 30 21:51:57 v1 kernel: R13: 0000000000000002 R14: 000000000000002d R15: 000062592dc88360
Aug 30 21:51:57 v1 kernel:  </TASK>
Aug 30 21:51:57 v1 kernel: ---[ end trace 0000000000000000 ]---

This warning is a result of an RCU misuse (an RCU read lock is taken and not released).

Let's fix it by releasing the RCU read lock before "goto tx_free" on the skb discard codepath.
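
The shape of the fix can be illustrated with a userspace sketch in which a nesting counter stands in for the RCU read-side state. Everything here is hypothetical (`sketch_rcu_read_lock()`, `sketch_rcu_read_unlock()`, `struct packet`, `xmit_sketch()`); only the label name tx_free and the idea of unlocking before the jump come from the description above.

```c
#include <assert.h>

/* Stand-in for the RCU read-side nesting depth (assumption: a plain
 * counter is enough to show a lock/unlock imbalance). */
static int rcu_nesting;

static void sketch_rcu_read_lock(void)   { rcu_nesting++; }
static void sketch_rcu_read_unlock(void) { rcu_nesting--; }

/* Hypothetical packet: valid selects the normal or the discard path. */
struct packet {
	int valid;
	int freed;
};

/* Hypothetical transmit function with a discard path that jumps to
 * tx_free. Every exit must leave rcu_nesting balanced. */
static int xmit_sketch(struct packet *pkt)
{
	int ret = 0;

	sketch_rcu_read_lock();

	if (!pkt->valid) {
		ret = -1;
		/* The fix: drop the read lock before leaving the section.
		 * Without this line the function would return with
		 * rcu_nesting still at 1. */
		sketch_rcu_read_unlock();
		goto tx_free;
	}

	/* ... normal transmit work under the read lock ... */
	sketch_rcu_read_unlock();
	return ret;

tx_free:
	pkt->freed = 1; /* discard the packet outside the read-side section */
	return ret;
}
```

After calling `xmit_sketch()` with either a valid or an invalid packet, `rcu_nesting` is back at zero. In the kernel, an imbalance of this kind is what surfaces later as the "Voluntary context switch within RCU read-side critical section!" warning once the task schedules.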

Link: canonical/lxd#14025 [1]
Reported-by: Max Asnaashari <[email protected]>
Signed-off-by: Alexander Mikhalitsyn <[email protected]>
Acked-by: Guoqing Jiang <[email protected]>
Acked-by: Mehmet Basaran <[email protected]>
Signed-off-by: Roxana Nicolescu <[email protected]>