Kernels 5.15.26/5.16.12 64bit usb ssd issues #4930

Closed
Dark-Sky opened this issue Mar 6, 2022 · 50 comments
Comments

@Dark-Sky

Dark-Sky commented Mar 6, 2022

Describe the bug

After booting with these new kernels, as far as I can tell, writes to my USB SSD cease while installing another kernel/headers package or doing a large package upgrade from the repo. After installing a different kernel, a long time goes by before returning to the prompt. After installing new kernel packages, the system will not boot. If I extract the kernel package files and copy them over to the SSD from my desktop, it boots.

There seems to be no issue when booting from an SD card.

Steps to reproduce the behaviour

Install the 5.15.26 or 5.16.12 64-bit kernel packages on a USB SSD and boot into it, then install another kernel package and try to boot into the newly installed kernel.

Device(s)

Raspberry Pi 4 Mod. B

System

[ray@pi4 ~]$ vcgencmd version
Mar 1 2022 14:21:38
Copyright (c) 2012 Broadcom
version 969fb9b1521fc7ac2b88b15a3a9e942da7678c4d (clean) (release) (start)

[ray@pi4 ~]$ sudo rpi-eeprom-update
BOOTLOADER: up to date
CURRENT: Tue Feb 8 05:24:46 PM UTC 2022 (1644341086)
LATEST: Tue Feb 8 05:24:46 PM UTC 2022 (1644341086)
RELEASE: stable (/lib/firmware/raspberrypi/bootloader/stable)
Use raspi-config to change the release.

VL805_FW: Using bootloader EEPROM
VL805: up to date
CURRENT: 000138a1
LATEST: 000138a1

Logs

journalctl (thousands of these lines):

Mar 04 12:19:13 pi4 kernel: sd 0:0:0:0: [sda] tag#8 CDB: opcode=0x2a 2a 00 00 02 55 ac 00 03 55 00
Mar 04 12:19:13 pi4 kernel: sd 0:0:0:0: [sda] tag#8 data out submit err -11 uas-tag 1 inflight: s-out a-cmd s-cmd work

Mar 04 12:28:42 pi4 kernel: swiotlb_tbl_map_single: 8631 callbacks suppressed
Mar 04 12:28:42 pi4 kernel: xhci_hcd 0000:01:00.0: swiotlb buffer is full (sz: 414208 bytes), total 32768 (slots), used 2 (slots)
Mar 04 12:28:42 pi4 kernel: sd 0:0:0:0: [sda] tag#25 data out submit err -11 uas-tag 1 inflight: s-out a-cmd s-cmd work
Mar 04 12:28:42 pi4 kernel: sd 0:0:0:0: [sda] tag#25 CDB: opcode=0x2a 2a 00 00 03 53 64 00 03 29 00
Mar 04 12:28:42 pi4 kernel: xhci_hcd 0000:01:00.0: swiotlb buffer is full (sz: 414208 bytes), total 32768 (slots), used 2 (slots)
Mar 04 12:28:42 pi4 kernel: sd 0:0:0:0: [sda] tag#25 data out submit err -11 uas-tag 1 inflight: s-out a-cmd s-cmd work
Mar 04 12:28:42 pi4 kernel: sd 0:0:0:0: [sda] tag#25 CDB: opcode=0x2a 2a 00 00 03 53 64 00 03 29 00
Mar 04 12:28:42 pi4 kernel: xhci_hcd 0000:01:00.0: swiotlb buffer is full (sz: 414208 bytes), total 32768 (slots), used 2 (slots)
Mar 04 12:28:42 pi4 kernel: sd 0:0:0:0: [sda] tag#25 data out submit err -11 uas-tag 1 inflight: s-out a-cmd s-cmd work
Mar 04 12:28:42 pi4 kernel: sd 0:0:0:0: [sda] tag#25 CDB: opcode=0x2a 2a 00 00 03 53 64 00 03 29 00
Mar 04 12:28:42 pi4 kernel: xhci_hcd 0000:01:00.0: swiotlb buffer is full (sz: 414208 bytes), total 32768 (slots), used 2 (slots)
Mar 04 12:28:42 pi4 kernel: sd 0:0:0:0: [sda] tag#25 data out submit err -11 uas-tag 1 inflight: s-out a-cmd s-cmd work
Mar 04 12:28:42 pi4 kernel: sd 0:0:0:0: [sda] tag#25 CDB: opcode=0x2a 2a 00 00 03 53 64 00 03 29 00
Mar 04 12:28:42 pi4 kernel: xhci_hcd 0000:01:00.0: swiotlb buffer is full (sz: 414208 bytes), total 32768 (slots), used 2 (slots)
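
The CDB bytes in the log can be decoded to see how large each failing write is. The following is a minimal sketch, assuming the standard 10-byte SCSI WRITE(10) layout (opcode 0x2a) and a 512-byte logical block size; the block size is an assumption, not something the log states, and `decode_write10` is a hypothetical helper name.

```python
import struct


def decode_write10(cdb_hex: str, block_size: int = 512) -> dict:
    """Decode a SCSI WRITE(10) CDB given as space-separated hex bytes.

    block_size is an assumption (512-byte logical blocks).
    """
    cdb = bytes.fromhex(cdb_hex)
    if len(cdb) != 10 or cdb[0] != 0x2A:
        raise ValueError("not a WRITE(10) CDB")
    # Big-endian layout: opcode, flags, 4-byte LBA, group number,
    # 2-byte transfer length (in blocks), control byte.
    _opcode, _flags, lba, _group, blocks, _control = struct.unpack(">BBIBHB", cdb)
    return {"lba": lba, "blocks": blocks, "bytes": blocks * block_size}


# CDB taken from the tag#25 lines in the log above.
info = decode_write10("2a 00 00 03 53 64 00 03 29 00")
print(info)
```

Under that assumption the transfer is 0x0329 = 809 blocks, or 414208 bytes, which matches the `sz: 414208 bytes` reported in the accompanying swiotlb messages.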

Additional context

No response

@Dark-Sky
Author

Dark-Sky commented Mar 8, 2022

I was able to get around this issue today with these 2 kernels by reverting a couple of files with this patch. If I remember right the 5.17 tree has these new commits also since I compiled last.

https://gitlab.manjaro.org/manjaro-arm/packages/core/linux-rpi4/-/blob/master/xhci-revert.diff

@P33M
Contributor

P33M commented Mar 8, 2022

Hmm. Am I leaking temporary buffers? I borrowed some code from upstream that fixed a theoretical bug with a dwc3 controller, so maybe that's not bullet-proof.

@Dark-Sky
Author

Dark-Sky commented Mar 8, 2022

I tested the upstream 5.17-rc7 yesterday and had no issues. I only test each upstream -rc on the Pi 4 as it comes out and never go back and check the other upstream trees after that. I do know that with this issue, writes to my SSD cease when installing another kernel package and I have to recover manually. The same thing happened when I switched our repo branches and did a complete upgrade of its packages. As it is now, it can theoretically turn someone's complete install on a USB SSD into fruit salad.

@P33M
Contributor

P33M commented Mar 9, 2022

Ah it's a SG op that's getting linearised into a buffer bigger than a single SWIOTLB entry. usb-storage limits the max consecutive number of written sectors to 256K (exactly the size of an entry) but UAS has no such limit. Why is the swiotlb getting used at all?

Please post the contents of /proc/cpuinfo and the output of mount.
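
The size limit described above can be checked with a little arithmetic. This is a rough sketch, assuming the values `IO_TLB_SHIFT = 11` and `IO_TLB_SEGSIZE = 128` as found in `include/linux/swiotlb.h` in 5.15-era kernels (they are not quoted in this thread); the slot count and request size come from the log in the original report.

```python
# Assumed kernel constants (include/linux/swiotlb.h, 5.15-era kernels).
IO_TLB_SHIFT = 11      # each swiotlb slot is 1 << 11 = 2 KiB
IO_TLB_SEGSIZE = 128   # most contiguous slots a single mapping may use

slot_bytes = 1 << IO_TLB_SHIFT
max_mapping = IO_TLB_SEGSIZE * slot_bytes   # largest single bounce mapping

# Values reported in the log: "total 32768 (slots)", "sz: 414208 bytes".
pool_bytes = 32768 * slot_bytes
request = 414208


def fits_in_one_mapping(nbytes: int) -> bool:
    """True if a linearised transfer fits in one swiotlb segment."""
    return nbytes <= max_mapping


print(f"max single mapping: {max_mapping // 1024} KiB")
print(f"pool size:          {pool_bytes // 2**20} MiB")
print(f"request fits:       {fits_in_one_mapping(request)}")
```

With these assumptions the pool is 64 MiB and almost empty ("used 2 (slots)"), yet the 414208-byte request exceeds the 256 KiB per-mapping ceiling, so the "buffer is full" message would be expected no matter how many slots are free.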

@Dark-Sky
Author

Dark-Sky commented Mar 9, 2022

[ray@pi4 ~]$ cat /proc/cpuinfo 
processor	: 0
BogoMIPS	: 108.00
Features	: fp asimd evtstrm crc32 cpuid
CPU implementer	: 0x41
CPU architecture: 8
CPU variant	: 0x0
CPU part	: 0xd08
CPU revision	: 3

processor	: 1
BogoMIPS	: 108.00
Features	: fp asimd evtstrm crc32 cpuid
CPU implementer	: 0x41
CPU architecture: 8
CPU variant	: 0x0
CPU part	: 0xd08
CPU revision	: 3

processor	: 2
BogoMIPS	: 108.00
Features	: fp asimd evtstrm crc32 cpuid
CPU implementer	: 0x41
CPU architecture: 8
CPU variant	: 0x0
CPU part	: 0xd08
CPU revision	: 3

processor	: 3
BogoMIPS	: 108.00
Features	: fp asimd evtstrm crc32 cpuid
CPU implementer	: 0x41
CPU architecture: 8
CPU variant	: 0x0
CPU part	: 0xd08
CPU revision	: 3

Hardware	: BCM2835
Revision	: d03114
Serial		: 0000000000000000
Model		: Raspberry Pi 4 Model B Rev 1.4
[ray@pi4 ~]$ mount
/dev/sda2 on / type ext4 (rw,relatime)
devtmpfs on /dev type devtmpfs (rw,relatime,size=3734348k,nr_inodes=933587,mode=755)
proc on /proc type proc (rw,nosuid,nodev,noexec,relatime)
sysfs on /sys type sysfs (rw,nosuid,nodev,noexec,relatime)
securityfs on /sys/kernel/security type securityfs (rw,nosuid,nodev,noexec,relatime)
tmpfs on /dev/shm type tmpfs (rw,nosuid,nodev)
devpts on /dev/pts type devpts (rw,nosuid,noexec,relatime,gid=5,mode=620,ptmxmode=000)
tmpfs on /run type tmpfs (rw,nosuid,nodev,size=1599392k,nr_inodes=819200,mode=755)
cgroup2 on /sys/fs/cgroup type cgroup2 (rw,nosuid,nodev,noexec,relatime,nsdelegate,memory_recursiveprot)
bpf on /sys/fs/bpf type bpf (rw,nosuid,nodev,noexec,relatime,mode=700)
systemd-1 on /proc/sys/fs/binfmt_misc type autofs (rw,relatime,fd=30,pgrp=1,timeout=0,minproto=5,maxproto=5,direct)
mqueue on /dev/mqueue type mqueue (rw,nosuid,nodev,noexec,relatime)
debugfs on /sys/kernel/debug type debugfs (rw,nosuid,nodev,noexec,relatime)
tracefs on /sys/kernel/tracing type tracefs (rw,nosuid,nodev,noexec,relatime)
tmpfs on /tmp type tmpfs (rw,nosuid,nodev,size=3998480k,nr_inodes=1048576)
fusectl on /sys/fs/fuse/connections type fusectl (rw,nosuid,nodev,noexec,relatime)
configfs on /sys/kernel/config type configfs (rw,nosuid,nodev,noexec,relatime)
none on /run/credentials/systemd-sysusers.service type ramfs (ro,nosuid,nodev,noexec,relatime,mode=700)
/dev/sda1 on /boot type vfat (rw,relatime,fmask=0022,dmask=0022,codepage=437,iocharset=ascii,shortname=mixed,errors=remount-ro)
tmpfs on /run/user/1000 type tmpfs (rw,nosuid,nodev,relatime,size=799692k,nr_inodes=199923,mode=700,uid=1000,gid=1000)
gvfsd-fuse on /run/user/1000/gvfs type fuse.gvfsd-fuse (rw,nosuid,nodev,relatime,user_id=1000,group_id=1000)

@Dark-Sky
Author

Dark-Sky commented Mar 9, 2022

@P33M "Why is the swiotlb getting used at all?"

I have no clue but I do know when you do a "make bcm2711_defconfig" it gets built into the kernel:

swiotbl

@pelwell
Contributor

pelwell commented Mar 9, 2022

Can you report the output of dmesg | grep -E "(brcm-pcie|CMA)" ?

@Dark-Sky
Author

Dark-Sky commented Mar 9, 2022

This is using a good kernel. If you want the output without the reverted patch, it will take me about 20 minutes to compile it.

[ray@pi4 ~]$ dmesg | grep -E "(brcm-pcie|CMA)"
[    0.000000] Reserved memory: created CMA memory pool at 0x000000000ec00000, size 512 MiB
[    1.353365] brcm-pcie fd500000.pcie: host bridge /scb/pcie@7d500000 ranges:
[    1.355429] brcm-pcie fd500000.pcie:   No bus range found for /scb/pcie@7d500000, using [bus 00-ff]
[    1.357551] brcm-pcie fd500000.pcie:      MEM 0x0600000000..0x063fffffff -> 0x00c0000000
[    1.359610] brcm-pcie fd500000.pcie:   IB MEM 0x0000000000..0x00bfffffff -> 0x0400000000
[    1.427578] brcm-pcie fd500000.pcie: link up, 5.0 GT/s PCIe x1 (SSC)
[    1.429883] brcm-pcie fd500000.pcie: PCI host bridge to bus 0000:00
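
The `IB MEM` line in this output describes the inbound PCIe window, that is, the range of system memory the USB controller can reach by DMA. A small sketch of reading its size from the line follows; the regular expression is my own and only assumes the line format shown above.

```python
import re

# Line copied from the dmesg output above (timestamp removed).
line = "brcm-pcie fd500000.pcie:   IB MEM 0x0000000000..0x00bfffffff -> 0x0400000000"

match = re.search(r"IB MEM (0x[0-9a-f]+)\.\.(0x[0-9a-f]+) -> (0x[0-9a-f]+)", line)
if match is None:
    raise ValueError("no inbound window found in line")

cpu_start, cpu_end, pci_base = (int(group, 16) for group in match.groups())

# The range is inclusive, so add one to get its size.
window_gib = (cpu_end - cpu_start + 1) / 2**30
print(f"inbound window: {window_gib} GiB, mapped at PCI address {pci_base:#x}")
```

For this board the window covers the first 3 GiB of memory, so any buffer above that has to be bounced.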

@pelwell
Contributor

pelwell commented Mar 9, 2022

"This is using a good kernel."

That's fine. And one more (sorry) - dmesg | grep -i dma.

@Dark-Sky
Author

Dark-Sky commented Mar 9, 2022

[ray@pi4 ~]$ dmesg | grep -i dma
[    0.000000] OF: reserved mem: initialized node linux,cma, compatible id shared-dma-pool
[    0.000000]   DMA      [mem 0x0000000000000000-0x000000003fffffff]
[    0.000000]   DMA32    [mem 0x0000000040000000-0x00000000ffffffff]
[    0.000000] On node 0, zone DMA32: 19712 pages in unavailable ranges
[    0.078487] DMA: preallocated 1024 KiB GFP_KERNEL pool for atomic allocations
[    0.078814] DMA: preallocated 1024 KiB GFP_KERNEL|GFP_DMA pool for atomic allocations
[    0.079724] DMA: preallocated 1024 KiB GFP_KERNEL|GFP_DMA32 pool for atomic allocations
[    0.161859] bcm2835-dma fe007000.dma: DMA legacy API manager, dmachans=0x1
[    1.738839] mmc-bcm2835 fe300000.mmcnr: DMA channel allocated
[    1.805256] mmc0: SDHCI controller on fe340000.mmc [fe340000.mmc] using ADMA
[    5.189726] uart-pl011 fe201000.serial: no DMA platform data

@pelwell
Contributor

pelwell commented Mar 9, 2022

From your output I think you have an 8GB Pi 4 rev 1.4. It appears to have a BCM2711B0 (you can confirm that by checking that the second line of the etched writing on the processor package ends in B0T). The B0 parts are limited to a 3GB inbound PCIe window - this means in your case that 5GB of the 8GB is not addressable by the VLI USB controller, forcing it to use the SWIOTLB (bounce buffer) mechanism.

If you were to put total_mem=3072 in your config.txt it will artificially limit the used RAM to 3GB, avoiding all bounce buffer usage. I'm curious as to whether that has any other effects.
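
For readers wondering how the board details follow from the earlier output, the `Revision` field in /proc/cpuinfo (d03114 here) packs them into bit fields. Below is a minimal sketch of decoding it, assuming the new-style revision code layout from the Raspberry Pi documentation; the lookup tables are abbreviated and `decode_revision` is a hypothetical helper. It does not identify the B0 stepping, which is inferred separately.

```python
# Abbreviated lookup tables (assumed from the Raspberry Pi revision-code docs).
MEMORY = {0: "256MB", 1: "512MB", 2: "1GB", 3: "2GB", 4: "4GB", 5: "8GB"}
PROCESSOR = {0: "BCM2835", 1: "BCM2836", 2: "BCM2837", 3: "BCM2711"}
BOARD_TYPE = {0x11: "4B"}  # only the entry needed for this example


def decode_revision(code: int) -> dict:
    """Split a new-style Raspberry Pi revision code into its fields."""
    if not (code >> 23) & 1:
        raise ValueError("old-style revision code")
    return {
        "memory": MEMORY.get((code >> 20) & 0x7, "unknown"),
        "processor": PROCESSOR.get((code >> 12) & 0xF, "unknown"),
        "type": BOARD_TYPE.get((code >> 4) & 0xFF, "unknown"),
        "board_rev": code & 0xF,
    }


board = decode_revision(0xD03114)
print(board)

# With a 3 GiB inbound window, the rest of the RAM needs bounce buffers.
unreachable_gib = 8 - 3
print(f"memory outside the inbound window: {unreachable_gib} GiB")
```

Decoded this way, d03114 reads as a Pi 4B with 8GB of RAM, a BCM2711 and board revision 4, which agrees with the "Rev 1.4" model string.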

@Dark-Sky
Author

Dark-Sky commented Mar 9, 2022

I cannot readily confirm the B0 as I have an ice tower mounted on my processor. I will recompile the kernel, add total_mem=3072 to config.txt and see what happens.

I am a little concerned, though, about having to add total_mem=3072 to config.txt and limit the RAM if it works and the code stands as is. It is not going to be pretty with people complaining in our forums.

@pelwell
Contributor

pelwell commented Mar 9, 2022

The 3GB RAM limit is only for diagnosis - it's not an acceptable workaround.

@Dark-Sky
Author

Dark-Sky commented Mar 9, 2022

It is compiling on a 16 core processor; it won't take long....

@timg236
Contributor

timg236 commented Mar 9, 2022

The chip revision should be visible with "sudo busybox devmem 0xfc404000"

@Dark-Sky
Author

Dark-Sky commented Mar 9, 2022

Revised: re-compiled busybox with devmem added:

[ray@pi4 ~]$ sudo busybox devmem 0xfc404000
0x27110010

@Dark-Sky
Author

Dark-Sky commented Mar 9, 2022

@pelwell It works with the reduced mem. After booting into the test kernel I was able to install another kernel and reboot into it. I used the 5.15.27 kernel released today.

[ray@pi4 ~]$ uname -a
Linux pi4 5.15.27-1-MANJARO-ARM-RPI #1 SMP PREEMPT Wed Mar 9 17:32:00 UTC 2022 aarch64 GNU/Linux GNU/Linux
[ray@pi4 ~]$ 
[ray@pi4 ~]$ cat /boot/config.txt | grep total_mem
total_mem=3072
[ray@pi4 ~]$ 
[ray@pi4 ~]$ free
               total        used        free      shared  buff/cache   available
Mem:         2917488      540060     1429960       74448      947468     2262944

@justinkb

I am getting this on a Pi with 2GB ram, what's up with that then?

@Dark-Sky
Author

I have no clue but the issue is growing as people do upgrades as time goes by.

https://archlinuxarm.org/forum/viewtopic.php?f=65&t=15953&p=69197#p69197

@justinkb

I'm confused why only fat fs writes can trigger this

@Dark-Sky
Author

I'm not sure that it is only FAT writes. I did a huge upgrade by switching branches and it was triggered then too.

@justinkb

that's weird, my usb ssd as well as an usb hdd see a lot of write activity, but it never happened except when writing to the boot partition with bootloader or kernel updates

@graysky2

graysky2 commented Mar 18, 2022

@Dark-Sky - This patch you are shipping for manjaro arm, is it reverting a single commit in rpi-5.15.y or multiples? Can you call the relevant ones out if applicable?

@Dark-Sky
Author

Dark-Sky commented Mar 18, 2022

@graysky2 I use it on RPi's latest kernels, 5.15 and above. I spent a couple of days narrowing down the issue when it first hit and I believe there was more than one commit. I can say I never released the kernels as they were, without reverting, and no one has complained of an issue yet.

"Can you you them if applicable?"

Not sure what you are asking.

@justinkb

justinkb commented Mar 18, 2022

I think it's the last commit and the one two before that in the pi kernel that touch drivers/usb/host/xhci.c

@Dark-Sky
Author

And drivers/usb/host/xhci-ring.c

@justinkb

It's kinda annoying 5.15.y is being rebased still, rewriting commit history. Now it seems like these changes are from 3 days ago, when this bug report is already almost two weeks old

@graysky2

@Dark-Sky - Sorry, massive typo there. "Can you call the relevant ones commit/commits if applicable?"

@graysky2

@pelwell - What is your take on the manjaro arm fix? Safe to use while this is sorted out upstream? https://gitlab.manjaro.org/manjaro-arm/packages/core/linux-rpi4/-/blob/master/xhci-revert.diff

@pelwell
Contributor

pelwell commented Mar 19, 2022

I'm currently doing weekendy things, but will take a look.

limeng-linux pushed a commit to limeng-linux/linux-yocto-5.15 that referenced this issue Apr 12, 2022
commit  a086698adad709fef9a0b73a5153a6f1f95d70b7 from
https://github.com/raspberrypi/linux.git rpi-5.15.y

This reverts commit 40686d87f87a46b3abf48a8dcaee5e0a031deafb.

See: raspberrypi/linux#4930

Signed-off-by: Phil Elwell <[email protected]>
Signed-off-by: Meng Li <[email protected]>
herrnst pushed a commit to herrnst/linux-raspberrypi that referenced this issue Apr 14, 2022
popcornmix pushed a commit that referenced this issue Apr 19, 2022
popcornmix pushed a commit that referenced this issue Apr 19, 2022
herrnst pushed a commit to herrnst/linux-raspberrypi that referenced this issue Apr 20, 2022
popcornmix pushed a commit that referenced this issue Apr 25, 2022
herrnst pushed a commit to herrnst/linux-raspberrypi that referenced this issue Apr 28, 2022
popcornmix pushed a commit that referenced this issue May 4, 2022
herrnst pushed a commit to herrnst/linux-raspberrypi that referenced this issue May 9, 2022
popcornmix pushed a commit that referenced this issue May 9, 2022
herrnst pushed a commit to herrnst/linux-raspberrypi that referenced this issue May 14, 2022
herrnst pushed a commit to herrnst/linux-raspberrypi that referenced this issue May 16, 2022
popcornmix pushed a commit that referenced this issue May 16, 2022
Noltari pushed a commit to Noltari/rpi-linux that referenced this issue May 17, 2022
herrnst pushed a commit to herrnst/linux-raspberrypi that referenced this issue May 21, 2022
popcornmix pushed a commit that referenced this issue May 23, 2022
herrnst pushed a commit to herrnst/linux-raspberrypi that referenced this issue May 25, 2022
herrnst pushed a commit to herrnst/linux-raspberrypi that referenced this issue May 25, 2022
popcornmix pushed a commit that referenced this issue May 26, 2022
popcornmix pushed a commit that referenced this issue Jun 1, 2022
popcornmix pushed a commit that referenced this issue Jun 6, 2022
popcornmix pushed a commit that referenced this issue Jun 14, 2022
herrnst pushed a commit to herrnst/linux-raspberrypi that referenced this issue Jun 21, 2022
popcornmix pushed a commit that referenced this issue Jun 23, 2022
papamoose pushed a commit to papamoose/ubuntu-kernel-raspi-jammy that referenced this issue Sep 3, 2022
BugLink: https://bugs.launchpad.net/bugs/1967733

This reverts commit 40686d87f87a46b3abf48a8dcaee5e0a031deafb.

See: raspberrypi/linux#4930

Signed-off-by: Phil Elwell <[email protected]>

(cherry picked from commit a086698adad709fef9a0b73a5153a6f1f95d70b7 rpi-5.15.y)
Signed-off-by: Juerg Haefliger <[email protected]>
@Dark-Sky
Author

Dark-Sky commented Oct 11, 2022 via email

jai-raptee pushed a commit to jai-raptee/iliteck1 that referenced this issue Apr 30, 2024
6 participants