-
Notifications
You must be signed in to change notification settings - Fork 5.2k
smsc95xx: dynamically fix up TX buffer alignment with padding bytes #2950
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Conversation
From dwc_otg stats, previously every tx packet was misaligned. Adding the pad bytes drops the misalign count to 0. @popcornmix I'd prefer two separate rpi-update revisions - one with the coherent pool increase, and one with this commit added so we can spot regressions. |
drivers/net/usb/smsc95xx.c
Outdated
@@ -2082,7 +2082,9 @@ static struct sk_buff *smsc95xx_tx_fixup(struct usbnet *dev, | |||
struct sk_buff *skb, gfp_t flags) | |||
{ | |||
bool csum = skb->ip_summed == CHECKSUM_PARTIAL; | |||
int overhead = csum ? SMSC95XX_TX_OVERHEAD_CSUM : SMSC95XX_TX_OVERHEAD; | |||
unsigned int align_bytes = (4 - ((uintptr_t)skb->data & 0x3)) & 0x3; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The first & 0x3
is unnecessary.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Pretty sure the 4 is unnecessary too. (-x)&3
should be enough...
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Indeed.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
GCC agrees, disassembly is the same
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
That comment needs an emoji to disambiguate the snark level.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Or a push
dwc_otg requires a 32-bit aligned buffer start address, otherwise expensive bounce buffers are used. The LAN951x hardware can skip up to 3 bytes between the TX header and the start of frame data, which can be used to force alignment of the URB passed to dwc_otg. As found in raspberrypi#2924
… bytes See: raspberrypi/linux#2950 kernel: lan78xx: use default alignment for rx buffers See: raspberrypi/linux#2953 kernel: configs: Enable netdev LED trigger See: raspberrypi/linux#2951
… bytes See: raspberrypi/linux#2950 kernel: lan78xx: use default alignment for rx buffers See: raspberrypi/linux#2953 kernel: configs: Enable netdev LED trigger See: raspberrypi/linux#2951
We use Cloudflare CDN for our site, and since this commit, all of the rpi boards with the smsc95xx ethernet are experiencing stalled downloads when receiving larger files (using wget). The stalls typically occur after receiving less than 1MB of data. But every attempt does stall. I've only experience stalls coming from CloudFlare servers, but RPI3B+ boards ethernet and RPI Wifi We also use a custom kernel/OS, but I validated the testing with a fresh Raspbian Lite image, with only the kernel updated as below. RPI 4.19.40 (6cb1da772a8c56f44591bec904046b15f1126949) < Doesn't work sudo rpi-update 18e0a0f9a31e7a3a47d9c4301c7705b980ab0516 < Doesn't work sudo rpi-update 6cf2788e74b497c109865d052cd6fa68d094cf38 < downloads work fine here. |
Please open a separate issue detailing your findings. In particular, we need to be able to replicate a problem in order to debug it - can you provide an example script that can be run or Wireshark logs around the stall? |
dwc_otg requires a 32-bit aligned buffer start address, otherwise
expensive bounce buffers are used. The LAN951x hardware can skip up to
3 bytes between the TX header and the start of frame data, which can
be used to force alignment of the URB passed to dwc_otg.
As found in #2924