Rpi goes into kernel panic once LTE USB Dongle is disconnected #5116

Open
IrynaSemenovych opened this issue Aug 5, 2022 · 18 comments

Comments

@IrynaSemenovych

IrynaSemenovych commented Aug 5, 2022

Describe the bug

Once the LTE USB dongle is disconnected, "nonzero urb status received: -71, wdm_int_callback - 0 bytes" errors are logged indefinitely.
The Raspberry Pi freezes and cannot be reached until the system is power cycled.

Steps to reproduce the behaviour

  1. Connect the USB dongle to any USB port on the Raspberry Pi 3 Model B+
  2. Wait until it's connected to the Internet (connection is established by using ModemManager)
  3. Disconnect the USB Dongle

Device(s)

Raspberry Pi 3 Mod. B+

System

OS and version:
Generated using pi-gen, https://github.com/RPi-Distro/pi-gen, fa45ccf5a4b183ee566b36d74fb4b65bf9358bed, stage2

Firmware version:
Dec 1 2021 15:07:06
Copyright (c) 2012 Broadcom
version 71bd3109023a0c8575585ba87cbb374d2eeb038f (clean) (release) (start)

Kernel version:
Linux aqueduct 5.10.103-v7+ #1529 SMP Tue Mar 8 12:21:37 GMT 2022 armv7l GNU/Linux

LTE USB Dongle:
Alcatel IK41VE1

Logs

kernel: [ 70.955000] cdc_mbim 1-1.2:1.3: nonzero urb status received: -71
kernel: [ 70.957710] cdc_mbim 1-1.2:1.3: wdm_int_callback - 0 bytes
kernel: [ 71.058078] cdc_mbim 1-1.2:1.3: nonzero urb status received: -71
kernel: [ 71.060809] cdc_mbim 1-1.2:1.3: wdm_int_callback - 0 bytes
kernel: [ 71.602750] ERROR::dwc_otg_hcd_urb_enqueue:501: Not connected
kernel: [ 71.609973] cdc_mbim 1-1.2:1.3: Tx URB error: -19
kernel: [ 71.689118] ERROR::dwc_otg_hcd_urb_enqueue:501: Not connected
kernel: [ 71.696328] option1 ttyUSB0: usb_wwan_write: submit urb 0 failed: -19
kernel: [ 71.707848] ERROR::dwc_otg_hcd_urb_enqueue:501: Not connected
kernel: [ 71.715463] smsc95xx 1-1.1:1.0 eth0: Failed to read reg index 0x00000114: -19
kernel: [ 71.718163] smsc95xx 1-1.1:1.0 eth0: Error reading MII_ACCESS
kernel: [ 71.720804] smsc95xx 1-1.1:1.0 eth0: __smsc95xx_mdio_read: MII is busy
kernel: [ 72.462173] option1 ttyUSB0: usb_wwan_write: submit urb 0 failed: -19
kernel: [ 72.579529]
kernel: [ 72.579553] ERROR::dwc_otg_hcd_urb_enqueue:501: Not connected
kernel: [ 72.579553]
kernel: [ 72.591639] smsc95xx 1-1.1:1.0 eth0: Failed to read reg index 0x00000114: -19
kernel: [ 72.595558] smsc95xx 1-1.1:1.0 eth0: Error reading MII_ACCESS
kernel: [ 72.599452] smsc95xx 1-1.1:1.0 eth0: __smsc95xx_mdio_read: MII is busy
Aug 2 09:23:32 aqueduct kernel: [ 71.802033] ------------[ cut here ]------------
Aug 2 09:23:32 aqueduct kernel: [ 71.804794] WARNING: CPU: 0 PID: 48 at drivers/net/phy/phy.c:958 phy_error+0x30/0x70
Aug 2 09:23:32 aqueduct kernel: [ 71.807587] Modules linked in: hci_uart btbcm bluetooth ecdh_generic ecc libaes vc4 cec 8021q garp stp llc drm_kms_helper drm drm_panel_orientation_quirks brcmfmac brcmutil snd_soc_core snd_compress snd_pcm_dmaengine syscopyarea sysfillrect sysimgblt fb_sys_fops sha256_generic libsha256 raspberrypi_hwmon backlight cfg80211 rfkill bcm2835_codec(C) bcm2835_v4l2(C) bcm2835_isp(C) v4l2_mem2mem snd_bcm2835(C) bcm2835_mmal_vchiq(C) videobuf2_vmalloc videobuf2_dma_contig videobuf2_memops videobuf2_v4l2 videobuf2_common snd_pcm snd_timer videodev snd vc_sm_cma(C) mc cdc_mbim cdc_wdm option cdc_ncm usb_wwan usbserial cdc_ether fixed uio_pdrv_genirq uio ip_tables x_tables ipv6
kernel: [ 71.826806] CPU: 0 PID: 48 Comm: kworker/0:3 Tainted: G C 5.10.103+ #1529
kernel: [ 71.830315] Hardware name: BCM2835
kernel: [ 71.833764] Workqueue: events_power_efficient phy_state_machine
kernel: [ 71.837233] Backtrace:
kernel: [ 71.840735] [] (dump_backtrace) from [] (show_stack+0x20/0x24)
kernel: [ 71.844326] r7:000003be r6:00000009 r5:c06096d8 r4:c0a73108
kernel: [ 71.847894] [] (show_stack) from [] (dump_stack+0x28/0x30)
kernel: [ 71.851562] [] (dump_stack) from [] (__warn+0xe8/0x110)
kernel: [ 71.855232] r5:c06096d8 r4:c0ac236c
kernel: [ 71.858863] [] (__warn) from [] (warn_slowpath_fmt+0x68/0xd8)
kernel: [ 71.862610] r7:00000009 r6:c06096d8 r5:000003be r4:c0ac236c
kernel: [ 71.866384] [] (warn_slowpath_fmt) from [] (phy_error+0x30/0x70)
kernel: [ 71.870243] r9:ffffffed r8:c190f000 r7:00000005 r6:c0bb7028 r5:c190f318 r4:c190f000
kernel: [ 71.874099] [] (phy_error) from [] (phy_state_machine+0xec/0x224)
kernel: [ 71.878004] r5:c190f318 r4:c190f2ec
kernel: [ 71.881956] [] (phy_state_machine) from [] (process_one_work+0x208/0x4dc)
kernel: [ 71.885968] r9:00000000 r8:00000000 r7:dbf57700 r6:00000000 r5:c18ed6c0 r4:c190f2ec
kernel: [ 71.890088] [] (process_one_work) from [] (worker_thread+0x34/0x594)
kernel: [ 71.894200] r10:c0bc0304 r9:00000008 r8:c0c2bea0 r7:c0bc0318 r6:c18ed6d4 r5:c0bc0304
kernel: [ 71.898307] r4:c18ed6c0
kernel: [ 71.902499] [] (worker_thread) from [] (kthread+0x148/0x15c)
kernel: [ 71.906749] r10:c1117e88 r9:c18ed6c0 r8:c003e058 r7:c190a000 r6:00000000 r5:c18e1cc0
kernel: [ 71.911052] r4:c18e2e80 r3:00000000
kernel: [ 71.915310] [] (kthread) from [] (ret_from_fork+0x14/0x28)
kernel: [ 71.919674] Exception stack(0xc190bfb0 to 0xc190bff8)
kernel: [ 71.923990] bfa0: 00000000 00000000 00000000 00000000
kernel: [ 71.928365] bfc0: 00000000 00000000 00000000 00000000 00000000 00000000 00000000 00000000
kernel: [ 71.932764] bfe0: 00000000 00000000 00000000 00000000 00000013 00000000
kernel: [ 71.937089] r10:00000000 r9:00000000 r8:00000000 r7:00000000 r6:00000000 r5:c0043880
kernel: [ 71.941438] r4:c18e1cc0
kernel: [ 71.945724] ---[ end trace b63e2e5b1bb43148 ]---
Complete logs are in attachments.
bug_report.txt
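
For readers decoding the log above: the trailing negative numbers in the driver messages are Linux errno values with the sign flipped, so -71 is EPROTO and -19 is ENODEV. The following is a minimal sketch, not part of the original report, that annotates such lines. The function name `annotate_status` and the small lookup table are assumptions made for illustration; the table is hard-coded with Linux values because Python's own `errno` module reflects whatever platform the script runs on.

```python
import re

# Linux errno values that commonly show up in USB driver messages.
# Hard-coded (assumption: Linux numbering) so the script gives the same
# answer when the log is inspected on a non-Linux machine.
LINUX_ERRNO = {
    19: ("ENODEV", "No such device"),
    32: ("EPIPE", "Broken pipe"),
    71: ("EPROTO", "Protocol error"),
    110: ("ETIMEDOUT", "Connection timed out"),
}

# Matches a trailing negative status such as "...received: -71" or "...error -71".
_STATUS_RE = re.compile(r"[:\s](-\d+)\s*$")


def annotate_status(line):
    """Return (code, name, description) for a trailing errno in a log line, or None."""
    match = _STATUS_RE.search(line)
    if match is None:
        return None
    code = int(match.group(1))
    name, description = LINUX_ERRNO.get(-code, ("UNKNOWN", "not in table"))
    return code, name, description


if __name__ == "__main__":
    samples = [
        "kernel: [ 70.955000] cdc_mbim 1-1.2:1.3: nonzero urb status received: -71",
        "kernel: [ 71.609973] cdc_mbim 1-1.2:1.3: Tx URB error: -19",
        "kernel: [ 70.957710] cdc_mbim 1-1.2:1.3: wdm_int_callback - 0 bytes",
    ]
    for sample in samples:
        print(annotate_status(sample))
```

Read this way, the log suggests the modem first stops answering (EPROTO) and shortly afterwards every device behind the root port, including the onboard smsc95xx Ethernet, reports ENODEV.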

Additional context

Expected behaviour:
Raspberry Pi continues to function normally.

Note: The issue is also sometimes reproducible during system reboot (after calling sudo reboot now).

@P33M
Contributor

P33M commented Aug 5, 2022

LTE dongles frequently consume large amounts of power when data connections are active. What happens if you use a self-powered USB hub between the dongle and the Pi?

What is the output of vcgencmd get_throttled when the dongle is active?

@IrynaSemenovych
Author

@P33M thank you for the quick response!

If I use a self-powered USB hub the behavior is the same.
The result of vcgencmd get_throttled is the following:
throttled=0x0
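
For reference, the value printed by vcgencmd get_throttled is a bitmask: the low bits describe the current state and bits 16 and up record what has happened since boot. A minimal sketch of decoding it follows; the function name `decode_throttled` is made up for illustration, and the bit meanings are taken from the Raspberry Pi documentation for this command.

```python
# Bit meanings as documented for `vcgencmd get_throttled`.
THROTTLED_BITS = {
    0: "under-voltage detected",
    1: "ARM frequency capped",
    2: "currently throttled",
    3: "soft temperature limit active",
    16: "under-voltage has occurred",
    17: "ARM frequency capping has occurred",
    18: "throttling has occurred",
    19: "soft temperature limit has occurred",
}


def decode_throttled(output):
    """Decode a string such as 'throttled=0x50005' into a list of flag names."""
    # Take the part after '=' and parse it as hexadecimal.
    value = int(output.strip().split("=", 1)[1], 16)
    return [name for bit, name in sorted(THROTTLED_BITS.items()) if value & (1 << bit)]


if __name__ == "__main__":
    print(decode_throttled("throttled=0x0"))      # no flags set
    print(decode_throttled("throttled=0x50005"))  # hypothetical value with four flags set
```

An output of 0x0 therefore means no under-voltage or throttling was recorded between boot and the moment the command was run.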

@P33M
Contributor

P33M commented Aug 11, 2022

Please post the full output of sudo lsusb -v with the device plugged in (does not have to be active).

@IrynaSemenovych
Author

Please see the output in the file.
usbs_info.txt

@IrynaSemenovych
Author

An interesting finding: the issue is not reproducible with the 64-bit Raspberry Pi OS Lite (release date: April 4th 2022), but it is still reproducible with the 32-bit Raspberry Pi OS Lite (release date: April 4th 2022).

@JamesH65
Contributor

JamesH65 commented Sep 1, 2022

Hi @IrynaSemenovych,

Can you try the DWC driver for armv7 and see if it fixes the issue? IIRC, this is enabled by adding the following to config.txt:

dtoverlay=dwc2

This may, however, cause other issues elsewhere, but it is worth trying.

@IrynaSemenovych
Author

Hi @JamesH65

Unfortunately enabling the DWC driver for armv7 hasn't fixed anything.

@JamesH65
Contributor

JamesH65 commented Sep 2, 2022

That is interesting. AIUI, it's a completely different driver which doesn't use the FIQ, so that implies that it's not the low-level USB driver causing the issue, nor the custom FIQ code we use to improve on the standard USB driver.

I presume the error you see is exactly the same?

@P33M
Contributor

P33M commented Sep 2, 2022

Please post a full dmesg log in the dwc2 case.

@IrynaSemenovych
Author

IrynaSemenovych commented Sep 5, 2022

@JamesH65 @P33M
Sorry that I did not describe the full behavior. I have now tested it with different setups, and here are the results and logs:

1. dtoverlay=dwc2 + LTE Modem
The LTE USB modem doesn't connect to the network; there is no kernel panic when the USB modem is disconnected.
dmesg_without_eth.txt

2. dtoverlay=dwc2 + LTE Modem + LAN cable
The LTE USB modem connects to the network and the connection seems stable. When the USB modem is disconnected there is no kernel panic and the Pi functions well.
dmesg_with_eth.txt

3. dtoverlay=dwc2 + Raspberry Pi Zero + LAN9514 chip
The USB ports don't work at all; it seems they are not powered (the LED on the modem is off, and there is no reaction to a keyboard).
Getting this working on the Raspberry Pi Zero is essential for our use case.

So the results are different depending on the setup, not sure what influences it.

@P33M
Contributor

P33M commented Sep 6, 2022

1 & 2 -

[    8.314512] Under-voltage detected! (0x00050005)
...
[   14.554409] Voltage normalised (0x00000000)

You have an undervoltage event during boot in both cases, which means your power supply is marginal. USB symptoms include spontaneous disconnects as well as unexplained unreliability when power is flaky. Change the power supply (or micro-USB cable, if using a PSU without a captive cable) for one that doesn't produce an undervoltage message on boot.
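
As an illustration of how such events can be spotted in a saved log, here is a minimal sketch that pairs each "Under-voltage detected!" line with the next "Voltage normalised" line and reports how long the supply was low. The function name `undervoltage_windows` is hypothetical, and the sketch assumes the bracketed dmesg timestamp format shown in the two lines quoted above.

```python
import re

# Captures the dmesg timestamp and which of the two voltage messages was logged.
_LINE_RE = re.compile(
    r"\[\s*(\d+\.\d+)\]\s+(Under-voltage detected!|Voltage normalised)"
)


def undervoltage_windows(dmesg_text):
    """Return (start, end, duration) tuples in seconds for each under-voltage window."""
    windows = []
    start = None
    for line in dmesg_text.splitlines():
        match = _LINE_RE.search(line)
        if match is None:
            continue
        timestamp = float(match.group(1))
        if match.group(2).startswith("Under-voltage"):
            if start is None:  # keep the first detection of a window
                start = timestamp
        elif start is not None:
            windows.append((start, timestamp, round(timestamp - start, 6)))
            start = None
    return windows


if __name__ == "__main__":
    log = (
        "[    8.314512] Under-voltage detected! (0x00050005)\n"
        "[   14.554409] Voltage normalised (0x00000000)\n"
    )
    for window in undervoltage_windows(log):
        print(window)
```

A window that is still open at the end of the log (no "Voltage normalised" line) is not reported by this sketch.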

Using dwc2 vs dwc_otg and not getting a crash with dwc2 could indicate a bug during disconnect processing that somehow causes a root port disconnect, but in your use-case you have a single high-speed device and there won't be much benefit to using dwc_otg which has specific optimisations for full- and low-speed devices. I recommend using dwc2.

  3. On Pi Zero boards dwc2 defaults to OTG mode. If you don't use a cable that shorts OTGID to ground, the Zero's USB port will be in device mode. Add the line dtoverlay=dwc2,dr_mode=host to /boot/config.txt to force host mode.
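
To double-check which mode a given config.txt requests, a small parser can help. This is only a sketch: the function name `dwc2_overlay_mode` is made up, it looks at `dtoverlay=dwc2` lines alone, and it deliberately ignores conditional section filters such as `[pi0]` or `[all]`, which a real config.txt may use.

```python
def dwc2_overlay_mode(config_text):
    """Return the dr_mode of the last dtoverlay=dwc2 line.

    Returns 'default' if the overlay is present without dr_mode,
    or None if no dtoverlay=dwc2 line exists.
    Simplification: section filters like [pi0] are not evaluated.
    """
    mode = None
    for raw in config_text.splitlines():
        line = raw.strip()
        if not line.startswith("dtoverlay="):
            continue  # also skips commented-out lines starting with '#'
        name, *params = line[len("dtoverlay="):].split(",")
        if name.strip() != "dwc2":
            continue
        mode = "default"
        for param in params:
            key, _, value = param.partition("=")
            if key.strip() == "dr_mode":
                mode = value.strip()
    return mode


if __name__ == "__main__":
    print(dwc2_overlay_mode("dtparam=audio=on\ndtoverlay=dwc2,dr_mode=host\n"))
```

On a Pi this could be fed the contents of /boot/config.txt; a result of 'default' on a Pi Zero would mean the port mode depends on the OTG ID pin, as described above.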

@IrynaSemenovych
Author

IrynaSemenovych commented Sep 8, 2022

@P33M Thank you for the recommendations, please check out the results of my testing.

  1. Proper power supply, dtoverlay=dwc2 + LTE Modem

I changed the power supply for the RPi 3 Model B, and the LTE modem still doesn't connect to the network. It continuously blinks blue, which indicates that it's trying to connect to the 4G network.
dmesg_rpi_3modelB_dwc2.txt

  2. dtoverlay=dwc2,dr_mode=host + raspberry pi zero + LAN9514 chip

The LTE USB dongle doesn't connect to the network and blinks continuously.
dmesg_rpi_zero_dwc2_host.txt

In both cases I noticed this error:

[  502.472278] dwc2 20980000.usb: dwc2_hc_chhltd_intr_dma: Channel 4 - ChHltd set, but reason is unknown
[  502.472303] dwc2 20980000.usb: hcint 0x00000002, intsts 0x04600001

Maybe this will help in resolving the issue.

Are there any other tips & tricks that I should test?

@P33M
Contributor

P33M commented Sep 9, 2022

The lines either side of the error are telling.

[  501.781538] usb 1-1.2.3: new high-speed USB device number 10 using dwc2
[  502.020277] dwc2 20980000.usb: dwc2_hc_chhltd_intr_dma: Channel 3 - ChHltd set, but reason is unknown
[  502.020306] dwc2 20980000.usb: hcint 0x00000002, intsts 0x04600001
[  502.020319] dwc2 20980000.usb: dwc2_update_urb_state_abn(): trimming xfer length
[  502.021267] dwc2 20980000.usb: dwc2_hc_chhltd_intr_dma: Channel 0 - ChHltd set, but reason is unknown
[  502.021293] dwc2 20980000.usb: hcint 0x00000002, intsts 0x04600001
[  502.021306] dwc2 20980000.usb: dwc2_update_urb_state_abn(): trimming xfer length
[  502.022279] dwc2 20980000.usb: dwc2_hc_chhltd_intr_dma: Channel 5 - ChHltd set, but reason is unknown
[  502.022304] dwc2 20980000.usb: hcint 0x00000002, intsts 0x04600001
[  502.022316] dwc2 20980000.usb: dwc2_update_urb_state_abn(): trimming xfer length
[  502.031706] usb 1-1.2.3: unable to read config index 0 descriptor/all
[  502.031752] usb 1-1.2.3: can't read configurations, error -71

In both cases, you have repeated device disconnects. In the Pi Zero case, you are getting a disconnect before the kernel even attempts to load the driver for the device. I suggest verifying with an oscilloscope that Vbus remains within tolerance (5 V ±5%) at the device's USB connector.
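
To make the tolerance concrete: 5 V ±5% works out to a window of 4.75 V to 5.25 V. The sketch below computes the limits and flags readings outside them. The function names and the sample voltages are hypothetical; real values would come from the oscilloscope capture.

```python
def vbus_limits(nominal=5.0, tolerance=0.05):
    """Return the (low, high) Vbus limits in volts for a fractional tolerance."""
    margin = nominal * tolerance
    return nominal - margin, nominal + margin


def out_of_tolerance(samples, nominal=5.0, tolerance=0.05):
    """Return the voltage samples that fall outside the allowed window."""
    low, high = vbus_limits(nominal, tolerance)
    return [volts for volts in samples if not (low <= volts <= high)]


if __name__ == "__main__":
    print(vbus_limits())
    # Hypothetical readings, for illustration only.
    print(out_of_tolerance([5.02, 4.74, 4.90, 5.30]))
```

Brief dips below the lower limit while the modem transmits are the kind of event that would be worth triggering the scope on.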

@IrynaSemenovych
Author

The reason why it gets disconnected is in our custom script that resets the LTE Dongle periodically in case it doesn't have an Internet connection.

@IrynaSemenovych
Author

@P33M I tested it one more time without our custom script, and the device doesn't disconnect. Should we still try to verify Vbus at the device's USB connector, or are there any other suggestions?
Thank you in advance!

@vliaskov
Contributor

I am hitting the same issue (albeit on a different kernel/distro).
Was there a definitive fix for your issue?

I am assuming the dwc2 errors:

[ 502.022279] dwc2 20980000.usb: dwc2_hc_chhltd_intr_dma: Channel 5 - ChHltd set, but reason is unknown
[ 502.022304] dwc2 20980000.usb: hcint 0x00000002, intsts 0x04600001

did not stop appearing, even after the custom script resetting the dongle periodically was removed.
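
One way to check whether these errors keep appearing is to count them per host channel in a saved dmesg log and compare runs with and without the reset script. A minimal sketch, assuming the message format quoted above; the function name `chhltd_counts` is made up for illustration.

```python
import re
from collections import Counter

# Matches the dwc2 "channel halted for unknown reason" message and captures the channel.
_CHHLTD_RE = re.compile(
    r"dwc2_hc_chhltd_intr_dma: Channel (\d+) - ChHltd set, but reason is unknown"
)


def chhltd_counts(dmesg_text):
    """Count 'ChHltd set, but reason is unknown' messages per host channel."""
    return Counter(int(match.group(1)) for match in _CHHLTD_RE.finditer(dmesg_text))


if __name__ == "__main__":
    log = (
        "[  502.020277] dwc2 20980000.usb: dwc2_hc_chhltd_intr_dma: Channel 3 - ChHltd set, but reason is unknown\n"
        "[  502.021267] dwc2 20980000.usb: dwc2_hc_chhltd_intr_dma: Channel 0 - ChHltd set, but reason is unknown\n"
        "[  502.022279] dwc2 20980000.usb: dwc2_hc_chhltd_intr_dma: Channel 5 - ChHltd set, but reason is unknown\n"
    )
    print(chhltd_counts(log))
```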

@timokast

I am facing problems similar to Iryna's.
I am using a Raspberry Pi CM3 with a blank PiOS. A Quectel modem is connected via ModemManager and libqmi.

When resetting the modem, the OS continuously prints error messages and becomes unresponsive. The only way out is a power cycle. The error messages are:
image

My observation is that this only occurs with the 32-bit OS. On 64-bit PiOS the error messages also appear, but after a few prints they stop and the system works fine again, so there is no system halt.
I also observed that with an active Ethernet connection on 32-bit the same error messages appear, but there is a timeout in place similar to the 64-bit OS.

Are there any other ideas on how to fix this behavior other than using DWC2? Does anyone have experience with potential side effects of using DWC2?

@sfczsolti

Hi everyone,

Has anyone managed to find a solution to this problem? I'm experiencing the same issue.

The USB modem connects, and the internet works perfectly, but when there is no cellular signal and I issue a reset command, the Raspberry Pi completely freezes.

I'm using ModemManager and NetworkManager along with the latest kernel available for the Raspberry Pi.

I'm almost certain that this isn't a power supply issue.

Interestingly, when a LAN cable is connected, this issue does not occur.

In my opinion, it appears that the kernel/driver is having conflicts when the Pi attempts to communicate with the qmi_wwan modem; the modem disconnects and fails to respond.

Any insights or suggestions would be greatly appreciated!

6 participants