RPI 3B+ state of /sys/class/net/eth0/carrier stuck at 1 #1100
Comments
@JamesH65 Very possibly related: |
This is also affecting dhcpcd, which fails to switch over to wifi if the cable is unplugged. In my case, I can reproduce this by booting with the ethernet cable plugged in and running If I boot with the cable unplugged, it works as expected. The issue is present in 4.14 and 4.19. I tried a patch that adds |
One remarkable thing is that ETHTOOL_GLINK (SIOCETHTOOL ioctl) reports the correct link status, so it must be that the carrier status flag is not extracted from what the hardware reports in real time. e.g. |
When you run ethtool, it reads the appropriate register, but only reports the value without tracking any changes. The driver has a different code path to handle carrier state changes, which is where the issue is. I haven't looked into how it works, but it's either an interrupt or polling mechanism that doesn't get triggered. |
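To compare the two code paths described above, here is a minimal Python sketch that reads the sysfs carrier flag and also queries ETHTOOL_GLINK through the SIOCETHTOOL ioctl. The function names and the `sysfs_root` parameter are made up for illustration, and the constants are the values from `<linux/sockios.h>` and `<linux/ethtool.h>`. It has not been run against the lan78xx driver.

```python
import array
import fcntl
import socket
import struct
from pathlib import Path

SIOCETHTOOL = 0x8946        # from <linux/sockios.h>
ETHTOOL_GLINK = 0x0000000A  # from <linux/ethtool.h>


def sysfs_carrier(ifname, sysfs_root="/sys/class/net"):
    """Return the carrier flag the kernel tracks, or None if it is unreadable."""
    try:
        text = (Path(sysfs_root) / ifname / "carrier").read_text()
    except OSError:
        # The file is missing, or the interface is down (read fails with EINVAL).
        return None
    return text.strip() == "1"


def ethtool_link(ifname):
    """Ask the driver for the link state via ETHTOOL_GLINK."""
    # struct ethtool_value { __u32 cmd; __u32 data; }
    value = array.array("B", struct.pack("II", ETHTOOL_GLINK, 0))
    address, _length = value.buffer_info()
    # struct ifreq: 16-byte interface name, then a pointer to the command
    # buffer, padded so the kernel can copy a full struct ifreq.
    ifreq = struct.pack("16sP16x", ifname.encode(), address)
    with socket.socket(socket.AF_INET, socket.SOCK_DGRAM) as sock:
        fcntl.ioctl(sock.fileno(), SIOCETHTOOL, ifreq)
    return struct.unpack("II", value.tobytes())[1] != 0


def report(ifname="eth0"):
    """Print both views of the link so a mismatch is easy to spot."""
    print("sysfs carrier:", sysfs_carrier(ifname))
    print("ETHTOOL_GLINK:", ethtool_link(ifname))
```

Calling `report("eth0")` after unplugging the cable should show whether the two values disagree, as reported in this thread.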
Looking at https://github.com/raspberrypi/linux/blob/rpi-4.14.y/drivers/net/usb/lan78xx.c#L1179, this is the section of code that handles the link being dropped. I wonder if there needs to be some sort of flushing of the various SKBs etc. that might be in flight. For example, there was a recent fix to the end of this function (https://github.com/raspberrypi/linux/blob/rpi-4.14.y/drivers/net/usb/lan78xx.c#L1237) that makes sure required processing is done when the link comes up. Complete guess though. What I don't understand is why the polling seems to kill it; that's just a read as far as I can see, so I would not expect it to cause this. Unless there's a mutex issue somewhere, perhaps? |
I don't seem to be able to replicate this on 4.14.90 using any of the mechanisms above. Don't think there is anything unusual in my kernel setup, non-tainted, so a pure 4.14.90. |
Bug is still there. Just tested with 4.14.90-v7+ (upgraded from 4.14.79-v7+). When cable is unplugged:
|
Odd. Not sure why I am not seeing it then. Tried the |
Strange. Since the rpi-update (and downgrade to original kernel), carrier is always stuck at 1, unless I boot without cable. Will check a second RPI in a moment. |
Second RPI (rpi-update to 4.14.94-v7+), cable is unplugged.
|
More info (back to a fresh SD card). I have the impression that things became worse after the rpi-update. Boot with cable: carrier stuck (only unbinding and rebinding the driver solves it, and it stays solved)
|
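The unbind/bind workaround mentioned above could be scripted roughly as follows. This is only a sketch: it assumes the driver directory is `/sys/bus/usb/drivers/lan78xx`, discovers the bound interface IDs from the symlinks in that directory instead of hard-coding one, and must be run as root. The function names are made up for illustration.

```python
from pathlib import Path


def bound_interfaces(driver_dir):
    """List the USB interface IDs currently bound to the driver.

    Bound interfaces appear as symlinks in the driver directory; the
    'module' symlink points at the kernel module and is skipped.
    """
    return sorted(
        entry.name
        for entry in Path(driver_dir).iterdir()
        if entry.is_symlink() and entry.name != "module"
    )


def rebind(driver_dir="/sys/bus/usb/drivers/lan78xx"):
    """Unbind and rebind every interface bound to the driver (needs root).

    The default path is an assumption; check it on the target system.
    """
    driver = Path(driver_dir)
    for interface_id in bound_interfaces(driver):
        (driver / "unbind").write_text(interface_id)
        (driver / "bind").write_text(interface_id)
```

Note that this drops the ethernet connection while the driver is unbound, so it should not be run over an SSH session that uses eth0.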
I doubt it makes a difference, but as a data point - I have wifi connected at the same time. I could also give you my sd card tomorrow if it's not reproducible on your setup. |
Just tested: I have the impression that 'something' is happening too fast after boot in the lite version, which can explain why an unbind/bind cycle solves the issue. |
I was using the desktop version. |
I suppose it isn't possible to recompile/insmod the driver without recompiling the kernel? |
It is, but it's a bit of a hassle. Install raspberrypi-kernel headers and compile the module like it's out of tree. https://www.kernel.org/doc/Documentation/kbuild/modules.txt |
Confirmed: happens also on 3B+, "Raspbian Stretch desktop 2018-11-13", 4.14.98-v7+. |
I have a theory that this is caused by the following:
By the time it gets to 2 the phy has already detected carrier, and we missed the interrupt about it. Seems to work better with the following hack:
But will leave a proper fix to someone else. |
Is this being worked on at the moment? I built a kernel with the 'hack' mentioned above for the lan78xx module, but unfortunately that does not change anything either. |
Has anyone found a solution? I am seeing this on 5.4.51-v7+. |
A possible fix for this is in the current rpi-update kernel, which has just switched to rpi-5.10.y. |
Hi, I am still seeing exactly the same issue, and rpi-update does not fix it. Any suggestions? Procedure:
Environment: rpi-update shows the following version: |
TL;DR: The eth0 carrier state flag gets stuck at '1' if /sys/class/net/eth0/carrier is polled less than 1 second after unplugging the network cable. Systems that depend on it then fail (e.g. route).
Hardware (3 devices were tested, on 2 different networks):
Setup
Result
The /sys/class/net/eth0/carrier flag gets stuck at '1'.
Also, the 'route' command hangs, probably because the OS doesn't know the interface is down ('route -n' works).
Expected result
The /sys/class/net/eth0/carrier flag becomes '0'
Situation where this issue is relevant
A script may be checking the carrier flag, for example every 10 seconds, in order to fail over to WiFi. Every once in a while, however, a race condition can occur in which this bug is triggered when the cable is disconnected.
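A failover script of this kind could be sketched as a small polling loop that takes the link-state reader as a parameter, so the source of truth (the sysfs carrier flag, or ETHTOOL_GLINK, which was reported earlier in this thread to stay correct) can be swapped without touching the loop. All names here (`watch_link`, `read_link`, `on_change`, `polls`, `sleep`) are hypothetical. Whether a different reader actually avoids the race is untested.

```python
import time


def watch_link(read_link, on_change, interval=10.0, polls=None, sleep=time.sleep):
    """Poll read_link() and call on_change(state) whenever the result changes.

    read_link -- callable returning the current link state (hypothetical hook)
    on_change -- callable invoked with the new state, e.g. to switch to WiFi
    interval  -- seconds between polls (10 s, as in the scenario above)
    polls     -- stop after this many polls; None means run forever
    sleep     -- injectable sleep function, so the loop can be tested
    """
    last = None
    count = 0
    while polls is None or count < polls:
        state = read_link()
        if state != last:
            # The first poll always reports, since 'last' starts as None.
            on_change(state)
            last = state
        count += 1
        if polls is None or count < polls:
            sleep(interval)
```

For example, `watch_link(lambda: open("/sys/class/net/eth0/carrier").read().strip() == "1", print)` would print the carrier state each time it changes.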