Skip to content

NETDEV_WATCHDOG_eth0_smsc95xx_transmit_queue_0_timed_out #2025

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Closed
andersthomson opened this issue May 19, 2017 · 5 comments
Closed

NETDEV_WATCHDOG_eth0_smsc95xx_transmit_queue_0_timed_out #2025

andersthomson opened this issue May 19, 2017 · 5 comments
Labels
Waiting for external input Waiting for a comment from the originator of the issue, or a collaborator.

Comments

@andersthomson
Copy link

While compiling latest 4.9.y, I got this one on the console.

Didn't get the first half, but it seems a mmc error (bad card?) got something stuck and ev the networking stuff choked on it. Uninformed guess suggests that their sharing on USB is part of this faith sharing.

NETDEV_WATCHDOG_eth0_smsc95xx_transmit_queue_0_timed_out.txt

@JamesH65
Copy link
Contributor

Again, without the setup of the device and what you are doing, its going to be difficult to fathom this out.

@JamesH65 JamesH65 added the Waiting for external input Waiting for a comment from the originator of the issue, or a collaborator. label May 22, 2017
@andersthomson
Copy link
Author

What do you need in terms of setup? The box has two powered usb hubs, two tv cards, two HDs, two serial readers. It's running 4 containers (with tvheadend, nginx, openhab, deamons, and a spare work one). No idea what triggers it. It can happen in the middle of a compile (off the HD), with no TV traffic, so net stuff should be low.

@JamesH65
Copy link
Contributor

I suspect this is related to #2027. Something has stopped the ethernet driver from emptying its queue in a timely manner.

Note that the SD interface is not USB, so it going wrong as well sort of implies a somewhat wider failure than just the USB.

If this was a one off I suspect it it going to be very difficult to reproduce and hence figure out.

@andersthomson
Copy link
Author

It probably is. As it happens, I keep /usr/src/linux on a USB attached HD. I've never seen this one happening again.

@JamesH65
Copy link
Contributor

I'm inclined to close this since its was a one off. If it does turn out to happen again, please reopen and report.

popcornmix pushed a commit that referenced this issue Sep 25, 2018
GRO expects skbs not to be owned by sockets, but when XDP is enabled veth
passed skbs owned by sockets. It caused corrupted sk_wmem_alloc.

Paolo Abeni reported the following splat:

[  362.098904] refcount_t overflow at skb_set_owner_w+0x5e/0xa0 in iperf3[1644], uid/euid: 0/0
[  362.108239] WARNING: CPU: 0 PID: 1644 at kernel/panic.c:648 refcount_error_report+0xa0/0xa4
[  362.117547] Modules linked in: tcp_diag inet_diag veth intel_rapl sb_edac x86_pkg_temp_thermal intel_powerclamp coretemp kvm_intel kvm irqbypass crct10dif_pclmul crc32_pclmul ghash_clmulni_intel intel_cstate intel_uncore intel_rapl_perf ipmi_ssif iTCO_wdt sg ipmi_si iTCO_vendor_support ipmi_devintf mxm_wmi ipmi_msghandler pcspkr dcdbas mei_me wmi mei lpc_ich acpi_power_meter pcc_cpufreq xfs libcrc32c sd_mod mgag200 drm_kms_helper syscopyarea sysfillrect sysimgblt fb_sys_fops ixgbe igb ttm ahci mdio libahci ptp crc32c_intel drm pps_core libata i2c_algo_bit dca dm_mirror dm_region_hash dm_log dm_mod
[  362.176622] CPU: 0 PID: 1644 Comm: iperf3 Not tainted 4.19.0-rc2.vanilla+ #2025
[  362.184777] Hardware name: Dell Inc. PowerEdge R730/072T6D, BIOS 2.1.7 06/16/2016
[  362.193124] RIP: 0010:refcount_error_report+0xa0/0xa4
[  362.198758] Code: 08 00 00 48 8b 95 80 00 00 00 49 8d 8c 24 80 0a 00 00 41 89 c1 44 89 2c 24 48 89 de 48 c7 c7 18 4d e7 9d 31 c0 e8 30 fa ff ff <0f> 0b eb 88 0f 1f 44 00 00 55 48 89 e5 41 56 41 55 41 54 49 89 fc
[  362.219711] RSP: 0018:ffff9ee6ff603c20 EFLAGS: 00010282
[  362.225538] RAX: 0000000000000000 RBX: ffffffff9de83e10 RCX: 0000000000000000
[  362.233497] RDX: 0000000000000001 RSI: ffff9ee6ff6167d8 RDI: ffff9ee6ff6167d8
[  362.241457] RBP: ffff9ee6ff603d78 R08: 0000000000000490 R09: 0000000000000004
[  362.249416] R10: 0000000000000000 R11: ffff9ee6ff603990 R12: ffff9ee664b94500
[  362.257377] R13: 0000000000000000 R14: 0000000000000004 R15: ffffffff9de615f9
[  362.265337] FS:  00007f1d22d28740(0000) GS:ffff9ee6ff600000(0000) knlGS:0000000000000000
[  362.274363] CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
[  362.280773] CR2: 00007f1d222f35d0 CR3: 0000001fddfec003 CR4: 00000000001606f0
[  362.288733] Call Trace:
[  362.291459]  <IRQ>
[  362.293702]  ex_handler_refcount+0x4e/0x80
[  362.298269]  fixup_exception+0x35/0x40
[  362.302451]  do_trap+0x109/0x150
[  362.306048]  do_error_trap+0xd5/0x130
[  362.315766]  invalid_op+0x14/0x20
[  362.319460] RIP: 0010:skb_set_owner_w+0x5e/0xa0
[  362.324512] Code: ef ff ff 74 49 48 c7 43 60 20 7b 4a 9d 8b 85 f4 01 00 00 85 c0 75 16 8b 83 e0 00 00 00 f0 01 85 44 01 00 00 0f 88 d8 23 16 00 <5b> 5d c3 80 8b 91 00 00 00 01 8b 85 f4 01 00 00 89 83 a4 00 00 00
[  362.345465] RSP: 0018:ffff9ee6ff603e20 EFLAGS: 00010a8
[  362.351291] RAX: 0000000000001100 RBX: ffff9ee65deec700 RCX: ffff9ee65e829244
[  362.359250] RDX: 0000000000000100 RSI: ffff9ee65e829100 RDI: ffff9ee65deec700
[  362.367210] RBP: ffff9ee65e829100 R08: 000000000002a380 R09: 0000000000000000
[  362.375169] R10: 0000000000000002 R11: fffff1a4bf77bb00 R12: ffffc0754661d000
[  362.383130] R13: ffff9ee65deec200 R14: ffff9ee65f597000 R15: 00000000000000aa
[  362.391092]  veth_xdp_rcv+0x4e4/0x890 [veth]
[  362.399357]  veth_poll+0x4d/0x17a [veth]
[  362.403731]  net_rx_action+0x2af/0x3f0
[  362.407912]  __do_softirq+0xdd/0x29e
[  362.411897]  do_softirq_own_stack+0x2a/0x40
[  362.416561]  </IRQ>
[  362.418899]  do_softirq+0x4b/0x70
[  362.422594]  __local_bh_enable_ip+0x50/0x60
[  362.427258]  ip_finish_output2+0x16a/0x390
[  362.431824]  ip_output+0x71/0xe0
[  362.440670]  __tcp_transmit_skb+0x583/0xab0
[  362.445333]  tcp_write_xmit+0x247/0xfb0
[  362.449609]  __tcp_push_pending_frames+0x2d/0xd0
[  362.454760]  tcp_sendmsg_locked+0x857/0xd30
[  362.459424]  tcp_sendmsg+0x27/0x40
[  362.463216]  sock_sendmsg+0x36/0x50
[  362.467104]  sock_write_iter+0x87/0x100
[  362.471382]  __vfs_write+0x112/0x1a0
[  362.475369]  vfs_write+0xad/0x1a0
[  362.479062]  ksys_write+0x52/0xc0
[  362.482759]  do_syscall_64+0x5b/0x180
[  362.486841]  entry_SYSCALL_64_after_hwframe+0x44/0xa9
[  362.492473] RIP: 0033:0x7f1d22293238
[  362.496458] Code: 89 02 48 c7 c0 ff ff ff ff eb b3 0f 1f 80 00 00 00 00 f3 0f 1e fa 48 8d 05 c5 54 2d 00 8b 00 85 c0 75 17 b8 01 00 00 00 0f 05 <48> 3d 00 f0 ff ff 77 58 c3 0f 1f 80 00 00 00 00 41 54 49 89 d4 55
[  362.517409] RSP: 002b:00007ffebaef8008 EFLAGS: 00000246 ORIG_RAX: 0000000000000001
[  362.525855] RAX: ffffffffffffffda RBX: 0000000000002800 RCX: 00007f1d22293238
[  362.533816] RDX: 0000000000002800 RSI: 00007f1d22d36000 RDI: 0000000000000005
[  362.541775] RBP: 00007f1d22d36000 R08: 00000002db777a30 R09: 0000562b70712b20
[  362.549734] R10: 0000000000000000 R11: 0000000000000246 R12: 0000000000000005
[  362.557693] R13: 0000000000002800 R14: 00007ffebaef8060 R15: 0000562b70712260

In order to avoid this, orphan the skb before entering GRO.

Fixes: 948d4f2 ("veth: Add driver XDP")
Reported-by: Paolo Abeni <[email protected]>
Signed-off-by: Toshiaki Makita <[email protected]>
Tested-by: Paolo Abeni <[email protected]>
Signed-off-by: David S. Miller <[email protected]>
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
Waiting for external input Waiting for a comment from the originator of the issue, or a collaborator.
Projects
None yet
Development

No branches or pull requests

2 participants