USB audio capture shows spurious samples in raspios 64-bit distributions. #5544

Open
gregory-whaley opened this issue Jul 18, 2023 · 11 comments

Comments

@gregory-whaley

Describe the bug

Using a variety of low-cost USB audio microphone capture adapters, I have found that captured audio sporadically shows data corruption in the stream, roughly every few minutes. This appears to happen only with stereo capture devices (whether capturing in mono or stereo) and not with mono (single-channel) capture hardware. It happens only with raspios 64-bit distributions, not with 32-bit distributions. It does not seem to depend on the specific kernel version, only on whether the distribution was built for 32-bit or 64-bit.

There are no error messages reported in any log or by the application software. There are no overrun or underrun (XRUN) reports. The data corruption appears as a small group of roughly 1 to 25 consecutive samples with random values. Acoustically, this sounds like a "click" or "pop" upon replay. No unusual USB behavior correlated with the spurious samples was seen when monitoring with Wireshark.

An example WAV file is attached and a screenshot showing a typical example spurious transient. This capture had no input signal so is mostly expected noise floor along with the transient errors.

Attachment: Buster OS 64bit - Digital Life card at 48kHz stereo.wav.zip

Screenshot: Buster OS 64bit - Digital Life card at 48kHz stereo
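A minimal sketch of how such transients could be located automatically in a noise-floor capture, assuming a 16-bit PCM WAV file recorded with no input signal. The function names and the `factor` and `floor` thresholds are illustrative assumptions, not something used in this report.

```python
import array
import sys
import wave


def read_pcm16(path):
    """Load a 16-bit PCM WAV file as interleaved signed samples."""
    with wave.open(path, "rb") as wf:
        if wf.getsampwidth() != 2:
            raise ValueError("expected 16-bit PCM")
        channels = wf.getnchannels()
        samples = array.array("h")
        samples.frombytes(wf.readframes(wf.getnframes()))
    if sys.byteorder == "big":
        samples.byteswap()  # WAV data is little-endian
    return samples, channels


def find_transients(samples, channels, factor=20, floor=8):
    """Return (first_frame, last_frame) runs that stand far above the noise floor.

    The threshold is `factor` times the median absolute sample value
    (an assumed heuristic), with `floor` guarding against digital silence.
    """
    magnitudes = sorted(abs(s) for s in samples)
    median = magnitudes[len(magnitudes) // 2] if magnitudes else 0
    threshold = max(median, floor) * factor
    runs = []
    for i, s in enumerate(samples):
        if abs(s) <= threshold:
            continue
        frame = i // channels
        if runs and frame - runs[-1][1] <= 1:
            runs[-1][1] = frame  # same or adjacent frame: extend the run
        else:
            runs.append([frame, frame])
    return [tuple(r) for r in runs]


if __name__ == "__main__" and len(sys.argv) > 1:
    pcm, nch = read_pcm16(sys.argv[1])
    for first, last in find_transients(pcm, nch):
        print(f"transient: frames {first}..{last} ({last - first + 1} frames)")
```

Run against a capture, each reported run should correspond to one click or pop in the waveform display. A real signal on the input would need a different detector, since this one only assumes a quiet noise floor.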

It doesn't correlate with the user application: it happens with audacity, arecord, and gnuradio. It doesn't seem to be related to pulseaudio either, since it happens regardless of whether pulseaudio is running. I always capture from the ALSA device hw:2,0, which is supposed to access the stream directly from ALSA rather than through pulseaudio. All the devices use the snd-usb-audio driver.

Steps to reproduce the behaviour

From a fresh install of raspios taken from the official distribution archives (https://downloads.raspberrypi.org/), I installed audacity:

sudo apt update
sudo apt install audacity

No system updates were installed. Plug in a USB audio capture device. Use no microphone or input signal, i.e. record just the noise floor. Using audacity, select the ALSA input hw:2,0, which accesses the data stream unaffected by pulseaudio. Record up to 10 minutes of the stream. Transient errors show clearly in the waveform display.
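Since the report says the problem also shows up with arecord, the capture step could be scripted instead of using the audacity GUI. This is only a sketch: the device hw:2,0, the 48 kHz stereo settings and the 10-minute duration come from this report, while the S16_LE sample format, the output file name and the helper names are assumptions.

```python
import subprocess


def build_arecord_cmd(device="hw:2,0", rate=48000, channels=2,
                      seconds=600, out_path="noise_floor.wav"):
    """Build an arecord command line for a fixed-length raw ALSA capture."""
    return [
        "arecord",
        "-D", device,         # capture straight from the ALSA hardware device
        "-f", "S16_LE",       # assumed sample format: 16-bit little-endian
        "-r", str(rate),
        "-c", str(channels),
        "-d", str(seconds),   # stop after this many seconds
        "-t", "wav",
        out_path,
    ]


def record(**kwargs):
    """Run the capture and return the path of the written WAV file."""
    cmd = build_arecord_cmd(**kwargs)
    subprocess.run(cmd, check=True)
    return cmd[-1]
```

Calling `record()` with the defaults would write a 10-minute noise-floor capture; the card number in hw:2,0 may differ on other setups.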

Note there is a known problem with some USB audio devices where they do not perform well in USB-high-speed mode (480Mbps) and the workaround is to force the device into USB-full-speed mode (12Mbps). I have verified that all these devices use full-speed mode (12Mbps) natively.

Device (s)

Raspberry Pi 3 Mod. B

System

Four different OS images were tested:
2023-05-03-raspios-bullseye-armhf (32 bit)
2023-05-03-raspios-bullseye-arm64 (64 bit)
2021-03-04-raspios-buster-arm64 (64 bit)
2021-01-11-raspios-buster-armhf (32 bit)

pi@raspberrypi:~ $ vcgencmd version
Mar 17 2023 10:52:42
Copyright (c) 2012 Broadcom
version 82f3750a65fadae9a38077e3c2e217ad158c8d54 (clean) (release) (start)
pi@raspberrypi:~ $

Here is the version for the 2023-05-03 bullseye 64 bit OS:

pi@raspberrypi:~ $ uname -a
Linux raspberrypi 6.1.21-v8+ #1642 SMP PREEMPT Mon Apr 3 17:24:16 BST 2023 aarch64 GNU/Linux
pi@raspberrypi:~ $

Here is raspinfo:
raspinfo.txt

Logs

No response

Additional context

The problem seems to happen more frequently when the CPU is busy with other tasks such as window management, or as in my case digital signal processing in gnuradio. So clicking on various other windows in the desktop environment seems to trigger the faults.

My guess is that either the snd-usb-audio driver or something in the ALSA code base is not handling pointers properly in the 64-bit environment, so that when the OS performs some background memory management, the pointers to the data stream are sometimes corrupted. Again, this does not happen in any 32-bit version of the OS.

@P33M
Contributor

P33M commented Jul 27, 2023

The transients are small, a handful of samples each. They appear to be audio data, but with a byte shift for the duration of a glitch - easily visible in a hexdump.

00216e40: b1ff 5500 b0ff 6100 b3ff 5500 abff 5d00  ..U...a...U...].
00216e50: b4ff 5400 acff 5e00 afff 5600 aeff 5a00  ..T...^...V...Z.
00216e60: adff 5700 b5ff 5300 b3ff 5d00 b5ff 5c00  ..W...S...]...\.
00216e70: b4ff 6000 b1ff 5a00 b0ff 5a00 01b1 015f  ..`...Z...Z...._
00216e80: ffa6 fe51 01b1 0055 00ab fe58 ffa6 fe58  ...Q...U...X...X
00216e90: afff 5a00 adff 5700 abff 5600 a8ff 4e00  ..Z...W...V...N.
00216ea0: aaff 5700 a8ff 4e00 a9ff 5600 b1ff 5700  ..W...N...V...W.
00216eb0: a4ff 4f00 aeff 5000 abff 5500 a9ff 5300  ..O...P...U...S.

Odd that the corruption is a) smaller than a cacheline and b) "fixes" itself. In the dwc_otg FIQ, FS Isochronous traffic uses coherent DMA bounce buffers to do transaction reassembly. Is the coherent behaviour different in AArch64 vs 32-bit environments?
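The byte shift in the hexdump above can be checked with a short script. This sketch assumes the data is interleaved 16-bit little-endian PCM and uses an arbitrary outlier threshold of 1000; the function names are illustrative. It decodes four of the dumped lines, flags the out-of-range samples, then decodes the same bytes again starting one byte later.

```python
import struct

# Four lines copied from the hexdump above (ASCII column dropped).
XXD_LINES = """\
00216e60: adff 5700 b5ff 5300 b3ff 5d00 b5ff 5c00
00216e70: b4ff 6000 b1ff 5a00 b0ff 5a00 01b1 015f
00216e80: ffa6 fe51 01b1 0055 00ab fe58 ffa6 fe58
00216e90: afff 5a00 adff 5700 abff 5600 a8ff 4e00
"""


def parse_xxd(text):
    """Return (base_offset, raw_bytes) from xxd-style lines."""
    base = None
    data = bytearray()
    for line in text.splitlines():
        addr, _, rest = line.partition(":")
        if base is None:
            base = int(addr, 16)
        data += bytes.fromhex("".join(rest.split()))
    return base, bytes(data)


def decode_le16(data):
    """Decode bytes as signed 16-bit little-endian samples."""
    return struct.unpack("<%dh" % (len(data) // 2), data[: len(data) // 2 * 2])


def find_outliers(data, base, threshold=1000):
    """Return (file_offset, value) for samples beyond the assumed threshold."""
    return [(base + 2 * i, s)
            for i, s in enumerate(decode_le16(data))
            if abs(s) > threshold]


base, data = parse_xxd(XXD_LINES)
bad = find_outliers(data, base)
start = bad[0][0] - base
end = bad[-1][0] - base + 2
print("glitch: %d bytes starting at 0x%x" % (end - start, bad[0][0]))
print("as recorded:     ", [s for _, s in bad])
print("shifted one byte:", list(decode_le16(data[start + 1:end - 1])))
```

With the one-byte offset the values drop back to a few hundred counts instead of around ±20000, which is consistent with the byte-shift reading, although they do not exactly match the surrounding noise floor.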

@P33M
Contributor

P33M commented Jul 28, 2023

After you have recorded a WAV file containing glitches, what's the output of dmesg? Clear the buffer beforehand with sudo dmesg -C.

@P33M
Contributor

P33M commented Jul 28, 2023

I think I have a reproduction here - if I check that the number of bytes transferred in an isochronous packet is an integer multiple of sample size, I occasionally get glitches coincident with an odd number of bytes in a transfer completion. Isochronous endpoints used for audio data shouldn't do this.
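The check described here amounts to a modulo test on each completed transfer length. The following is a user-space illustration only, not the kernel code: the function name and the list of lengths are hypothetical, and it assumes 16-bit samples. At 48 kHz stereo with one packet per 1 ms full-speed frame, a well-formed packet would carry 48 frames of 4 bytes, i.e. 192 bytes.

```python
def misaligned_transfers(lengths, bytes_per_sample=2):
    """Return (index, length) for transfers that split a sample.

    A length that is not an integer multiple of the sample size means
    every following byte in that buffer is read with the wrong alignment.
    """
    return [(i, n) for i, n in enumerate(lengths)
            if n % bytes_per_sample != 0]


# Hypothetical completion lengths: mostly 192 bytes, with two odd-sized ones.
observed = [192, 192, 191, 193, 192]
for index, length in misaligned_transfers(observed):
    print(f"transfer {index}: {length} bytes is not a whole number of samples")
```

An odd byte count on an audio isochronous endpoint is the symptom reported in this comment; the sketch only shows how such completions would be singled out.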

@gregory-whaley
Author

gregory-whaley commented Jul 28, 2023 via email

@gregory-whaley
Author

gregory-whaley commented Jul 28, 2023 via email

@P33M
Contributor

P33M commented Jul 31, 2023

Ah, 64-bit kernels are incompatible with ARM's FIQ handlers, so the FIQ code is demoted to a regular IRQ handler. The only workaround is to specify arm_64bit=0 in /boot/config.txt and use a 32-bit distribution.
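To tell whether a given system is running the affected kind of kernel, the machine string that `uname` reports (aarch64 in the output quoted earlier in this issue) can be inspected. A small sketch, assuming Linux, where Python's `platform.machine()` returns that string; the helper name is hypothetical.

```python
import platform


def kernel_is_64bit(machine=None):
    """Return True if the running kernel reports a 64-bit ARM machine type."""
    if machine is None:
        machine = platform.machine()  # same string as `uname -m` on Linux
    return machine in ("aarch64", "arm64")


if kernel_is_64bit():
    print("64-bit ARM kernel: the FIQ code runs as a regular IRQ handler")
else:
    print("not a 64-bit ARM kernel")
```

This looks at the kernel, not the userland, which matters because the behaviour described here depends on how the kernel handles the FIQ code.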

@pelwell
Contributor

pelwell commented Jul 31, 2023

I'll just leave this (3889ba7) here...

@gregory-whaley
Author

gregory-whaley commented Aug 5, 2023 via email

@P33M
Contributor

P33M commented Aug 5, 2023

The linked commit adds arm64 architectural support for FIQ handlers but there's quite a bit of plumbing required to get dwc_otg to use it. It's low priority, so won't get done quickly, but no need to close the issue.

@micdini

micdini commented May 22, 2024

Hello, in the forum I opened this topic. I know it probably isn't the same issue, but there are several elements here that I think relate to mine (self-healing of the noise, not linked to a specific application, randomness that is nevertheless linked to CPU load/IRQ handling in some way).

On a CM4 / 32-bit arch I have similar glitches on playback, but I'm using USB on the xHCI controller.
After many tests, I found that playing back two H.264 videos (1920x1080 @ 25 Hz, level 5, about 30 Mb/s) triggers the issue often (several times in 3-4 minutes).

My guess at the moment is some misconfiguration of DMA between the video pipeline (to/from the H.264 block) and xHCI isochronous transfers.

Any hint or configuration to try in order to isolate the issue?

@gregory-whaley
Author

gregory-whaley commented May 28, 2024 via email
