USB DAC dropouts/glitches #2215
After further reading I think I may be affected by the problem described in this issue: I will attempt to build a custom version of Volumio with the very latest kernel (rpi-update cannot be used) to see if it fixes the problem as even the latest dev build of volumio is only 4.9.41 - shown to be bad in issue 2134. |
I've now tested a 4.9.51-v7+ kernel hoping that the timer fix for isochronous transfer in 43 described above would be the solution but unfortunately it doesn't fix this issue. The severity of the problem seems to be reduced by about half - with a 36 or 41 kernel I get about 6 dropouts per minute, each audibly lasting probably 1/10th of a second; with the 51 kernel I am seeing maybe 3 dropouts per minute and they don't "sound" as bad, but they are definitely still there. The audio is not bit perfect or glitch free, and the dropouts can easily be heard on a sinewave test signal played using aplay. Is there any further testing or debugging I can do to get to the bottom of this issue, or at least prove whether it is a hardware limitation or still potentially fixable in software? |
More testing and some interesting conclusions. First of all, this problem is not fixed by newer kernels and unfortunately doesn't seem to be related to the timer fix in issue #2134. Contrary to what I said above, I can reproduce the same problem with 4.9.36, 4.9.41 and 4.9.51 with very little difference.
I found that when I tested on a fresh install of Raspbian I was not experiencing audio glitches using aplay at either 44.1kHz 16 bit or 96kHz 16 bit. I then tried to narrow down what was different between Raspbian and Volumio. It turns out that the main Volumio service - which is a node js server for the web interface - leads to glitches/dropouts in the audio when it is running. When I stop that service the glitches almost entirely stop, although I can still reproduce a very occasional glitch with it stopped, maybe once every 2-3 minutes, instead of every few seconds when it is running. Nodejs is using very little CPU time (under 10%) when the web interface is idle yet still causes pretty bad sound glitches.
If I renice nodejs to a low priority or aplay to a high priority it makes no difference at all, which suggests it's not aplay being starved for CPU time. (aplay never reports any underruns anyway.) I experimented with the snd-usb-audio nrpacks module option - no change at all. I also experimented with every buffer related alsa configuration setting - no improvement.
Finally I tried dwc_otg.speed=1 to disable USB2.0 support and the symptoms are gone - absolutely no glitches or dropouts in about an hour of testing on test signals and music, despite nodejs running as before. So something the nodejs service is doing is causing issues with USB2.0 transmission - perhaps certain syscalls that it is calling are disabling interrupts for a long time, or the syscalls are just taking too long to return, leading to the kernel missing a time critical USB transmission. Perhaps a realtime kernel would help here.
I would add that this sound interface is USB2.0 (but also supports fallback to USB1.1) so I don't think this relates to the fiq fixes from the last few years? Is there anything I can do to improve the USB 2.0 performance to avoid dropouts in the audio, or am I stuck using USB1.1 as a workaround? |
Hello. From an examination of your lsusb listing, the audio device is a high-speed device. The descriptors for the isochronous OUT endpoints (i.e. audio streams) also specify a bInterval of 1 which means a packet transfer every microframe - this is a particularly strenuous requirement for the Pi as it means the maximum hardware interrupt latency tolerable is <125uS. The FIQ code attempts to improve matters by performing batches of transfers, but it can only perform as many transfers as the submitted URB requests - at the boundary where URBs are returned/the next one queued, control is passed to the IRQ-driven driver so there's a window of vulnerability where a latency spike could cause frame slippage. dwc_otg.speed=1 will increase the hardIRQ latency tolerance to 1ms as that's the frame interval at full-speed. It's no surprise that the glitches disappear when clobbering the bus speed. The issue described above should cause at most momentary audio dropouts (typically less than 1ms) but this is a function of how the USB device behaves when presented with an underrun condition (i.e. missing data). Can you capture a typical output stream via line-out -> recording device (e.g. line-in on a PC or similar) and upload the wav somewhere? What's puzzling is the apparent difference between running a "heavy" userspace application (which should have negligible effect on hardIRQ latency) and plain aplay. |
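The latency budget described above follows directly from the endpoint descriptor. As a rough sketch (the function and constant names are made up for illustration; the formula is the USB 2.0 rule that an isochronous endpoint is serviced every 2^(bInterval-1) microframes at high speed, or frames at full speed), the two cases discussed in this thread work out like this:

```python
# Sketch only: helper names are hypothetical, not part of any driver.
MICROFRAME_US = 125   # high-speed microframe length, in microseconds
FRAME_US = 1000       # full-speed frame length, in microseconds


def iso_service_interval_us(b_interval: int, high_speed: bool) -> int:
    """Return how often an isochronous endpoint must be serviced, in microseconds."""
    if not 1 <= b_interval <= 16:
        raise ValueError("bInterval for isochronous endpoints must be in 1..16")
    unit = MICROFRAME_US if high_speed else FRAME_US
    # The period is 2^(bInterval-1) units (microframes or frames).
    return unit * 2 ** (b_interval - 1)


if __name__ == "__main__":
    for name, high_speed in (("high-speed", True), ("full-speed", False)):
        print(name, "bInterval=1:", iso_service_interval_us(1, high_speed), "us")
```

With bInterval=1 this gives 125 µs at high speed and 1 ms at full speed, which is the difference dwc_otg.speed=1 makes.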
@P33M one thing @DBMandrake mentioned to me was strace showed node js doing a number of cacheflush operations. It does make use of JIT and so is likely triggering instruction/data cache flushes in the kernel. |
That makes sense. This interface is designed for home recording studio use and as such minimising round trip latency from record to playback is a key factor in its design, possibly at the expense of not being as "easy" to drive by the host OS. The problem felt like a latency issue to me as well - the glitches don't seem particularly worse if I play 96kHz 24bit (which I believe is sent as 32bit to the card) than if I send 44.1kHz 16 bit, suggesting it wasn't a throughput issue.
Yes I can make some recordings for you, I should be able to do this tomorrow. The severity varies from a small "click" that you can only hear on a simple and uncluttered test signal like a sine wave (but not easily in music) all the way to a loud "pop" that is easily heard during music. Even if the click was subtle though, any glitch or dropout does defeat the purpose of trying to set up bit perfect playback from a high quality DAC.
Yes it is very puzzling - when I identified nodejs as the application that was placing the problematic "load" on the system (stopping the service eliminated 99% of the symptoms) I spent a few hours trying to figure out why. I found it didn't seem to be just raw CPU use that was causing the issue, as adjusting the relative priority of the nodejs service (which is a parent thread and 3-4 worker threads) and aplay made absolutely no difference, and I was still getting glitches even when nodejs was mostly "idle" (less than 5% CPU use). Also other high cpu loads that I tried putting on the system were not causing trouble. Memory use also wasn't an issue as the service is using about 200MB of ram and the system is not using swap and has plenty of free ram. I concluded that if it was causing this problem with such low cpu use even at low scheduling priority it must be making system call(s) that were somehow harming interrupt response time, so I decided to strace the parent nodejs process and, as popcornmix says, I noticed there were a lot of cacheflush calls being made, especially soon after the service was started (when the audio glitches were even worse). The cacheflush calls might be a red herring, but it was the only thing that jumped out at me and seemed like something worth following up. It's possible the problem isn't even in the USB driver at all but somewhere else in the kernel where some user space accessible system call(s) are doing something that is hurting interrupt response times or maybe even causing driver interrupts to be missed altogether? |
As promised here are some recordings of the problem made using audacity via line input, they are in 192Kbit 32 bit float wav format, uncompressed. This is a 20Hz sinewave - low enough in frequency that it won't be heard on most speakers/headphones but the resulting clicks will be. The first one was made after volumio had been booted for a while and was largely "idle". cpu use of nodejs at this point is under 10%: https://drive.google.com/open?id=0B5qpN5dK9MKjdWc4Y2F2SmlDNTA Clicking is more subtle and I count roughly 6 glitches over 30 seconds. I measured a couple in Audacity and they were about 0.4ms long, but because the amplitude jumps back to near zero it makes it quite audible. This second recording was started while the nodejs service was stopped, and then the service was started at about 5 seconds into the recording, with symptoms much more severe as a lot more work is being done by the service including setting up child threads: https://drive.google.com/open?id=0B5qpN5dK9MKjOURST3VfWUdEVHM It takes a few seconds but it starts crackling badly between 16 and 22 seconds. I measured one dropout at about 17.7 seconds that lasts a full 8ms, which suggests something is very wrong. Here is the original test file I'm playing with aplay: https://drive.google.com/open?id=0B5qpN5dK9MKjbkhhdTFfdHNzbDg Most of the dropouts appear to be the sample simply dropping to near zero for the duration then returning to where it should be, but I did see at least one dropout in the second file where you can see from the sinewave that time sync has been lost altogether and the sample has been "delayed", at about 17.79 seconds. 
Just to try to rule out cpu hogging I repeated the test again with the nodejs service niced to 19 and it was no better, if anything it was worse: (although there is a lot of random variation from test to test) https://drive.google.com/open?id=0B5qpN5dK9MKjTXh4Nk5YWGxCMkE In this last one I see a 3ms dropout where the time sync seems to have slipped at 16.38s although most of the glitches are about 0.5ms in length. Here is a screen grab from that area: Hopefully these files are useful to help at least classify the nature of the problem - if you need any more testing done please let me know. |
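For anyone wanting to count glitches in a recording like these without measuring by hand, here is a minimal sketch. It assumes the recording has already been loaded as a list of floats scaled so the sine peaks at about 1.0 (loading is not shown, and the function names are invented for this example). The idea is that a clean 20Hz sine can only change by a tiny amount between consecutive samples, so any much larger jump marks the start or end of a dropout.

```python
import math


def find_discontinuities(samples, sample_rate, tone_hz=20.0, amplitude=1.0, margin=4.0):
    """Return the times (seconds) of sample-to-sample jumps too large for a clean sine."""
    # Largest step a clean sine of this frequency can take between two samples,
    # multiplied by a safety margin to tolerate some recording noise.
    limit = margin * amplitude * 2 * math.pi * tone_hz / sample_rate
    return [
        i / sample_rate
        for i in range(1, len(samples))
        if abs(samples[i] - samples[i - 1]) > limit
    ]


def sine_with_gap(sample_rate=48000, seconds=1.0, tone_hz=20.0, gap_start=None, gap_len=0):
    """Synthesize a test sine, optionally zeroing gap_len samples from gap_start."""
    samples = [
        math.sin(2 * math.pi * tone_hz * n / sample_rate)
        for n in range(int(sample_rate * seconds))
    ]
    if gap_start is not None:
        for n in range(gap_start, gap_start + gap_len):
            samples[n] = 0.0
    return samples


if __name__ == "__main__":
    # Simulate a roughly 0.4ms dropout (19 samples at 48kHz) near a peak of the sine.
    glitchy = sine_with_gap(gap_start=24600, gap_len=19)
    print(find_discontinuities(glitchy, 48000))
```

Each dropout shows up as a pair of times (where the signal falls to zero and where it returns), so the gap between the two is the dropout length. The margin is a guess and would need tuning for a noisy line-in recording.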
Hi there, we are happy to find this thread. In short: We have the same problem with the ARM architecture. We tried:
All the hints we found in lots of forums didn't help, but
So it's clear that there must be a timing / time slice issue with the isochronous data streams on ARM devices. |
@musicwonder Can you confirm which USB controller the DAC is connected to on the tinker board? A |
It's not actually all arm devices. While this problem is being looked into I've been running Volumio on a dual core Cubox-i (imx6 arm architecture) and I have absolutely zero problems with glitches or dropouts with this USB interface even when the box is under heavy load. Unfortunately the Cubox-i build of Volumio is older than the Pi build and not kept as up to date by the developers (new versions are always released for the Pi first) so I would prefer to run the Pi version if the problem can be solved satisfactorily. I've also tried this Behringer interface on a Vero4k (Amlogic S905X arm chipset) and there are no glitches there either, although to be fair this is much faster hardware than a Pi so is not a fair comparison, and there are no Volumio builds available for it. (I tested with both Kodi and aplay) |
@P33M: It happens at each of the 4 USB Ports. |
I've been toying with the idea of adding low-key profiling to the dwc_otg driver - it's possible to get reliable indication that you're missing interrupts (FIQ->IRQ latency being too big, for example). Unfortunately time is tight at the moment so it will be a while before I can look at this. |
Is there a chance that the issue is gone with an rt-kernel? If so, I would give it a try. |
No problem - meanwhile I am using it on my Cubox, when you get a chance to look at it I'll be around to do any testing required. |
Now I have made a test with an Orange Pi PC2 with Armbian 4.13. Same issue but less often: one dropout every 1-4 minutes. Why only ARM devices? |
rt-kernel doesn't help. Tested. |
Update: With the Orange Pi PC2 and a new install of Armbian 4.13 the sound is clear for about one minute. After that the first dropout occurs. Then I get many glitches, dropouts and buffer underruns while using brutefir (2 in/4 out with a USB C-Media soundcard). A reboot doesn't help. You cannot get back the clear sound you have right after a new installation. Could there be an issue with SD card read/write? |
Has there been any further progress on this issue? |
There's nothing further I can do at the moment but I'm able to test any proposed fixes. |
I've been banging my head quite a lot on this lately. In my case disabling nodejs made things a bit better, but dropouts were still there.
|
Did you try dwc_otg.speed=1 ? For me this almost eliminates the issue. Of course this is only a workaround, and restricts the USB controller to USB 1.1 mode and therefore limits all USB devices (including the onboard network adaptor) to 12Mbps. However if your only USB devices are the sound card and the onboard Ethernet controller and you are not trying to use it for anything other than Volumio then this seems to be fast enough - after all audio doesn't require very high bitrates. As P33M said earlier this reduces the required interrupt response time from 125uS to 1mS, making the interrupt load much easier to deal with. Might be a usable workaround for you in the meantime. I've sidestepped the issue by running Volumio on a Cubox-i for now but that introduces other issues like the fact that the Volumio builds for that device are very out of date so I'd like to come back to the Pi when I can.
I see the same glitches testing with aplay from the command line and also airplay (shairport) when mpd isn't even the active player process, so the issue is a lot more deep seated than that and seems to be down at the interrupt handler/scheduler level in the kernel. |
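To put rough numbers on "audio doesn't require very high bitrates", here is a small sketch. The helper names are hypothetical. It uses the USB 2.0 limit of 1023 bytes per 1ms frame for a single full-speed isochronous endpoint, and ignores any other traffic sharing the bus, so treat the result as an upper bound rather than a guarantee.

```python
# Sketch only: names are made up for illustration.
FULL_SPEED_ISO_MAX_BYTES_PER_FRAME = 1023  # USB 2.0 cap for one full-speed iso endpoint
FRAMES_PER_SECOND = 1000                   # one frame every 1ms at full speed


def pcm_bytes_per_second(sample_rate_hz: int, container_bits: int, channels: int) -> int:
    """Raw PCM payload rate, counting the container size actually sent per sample."""
    return sample_rate_hz * (container_bits // 8) * channels


def fits_full_speed(sample_rate_hz: int, container_bits: int, channels: int = 2) -> bool:
    """True if the stream fits within one full-speed isochronous endpoint's budget."""
    budget = FULL_SPEED_ISO_MAX_BYTES_PER_FRAME * FRAMES_PER_SECOND
    return pcm_bytes_per_second(sample_rate_hz, container_bits, channels) <= budget


if __name__ == "__main__":
    for rate, bits in ((44100, 16), (96000, 32), (192000, 24)):
        print(rate, bits, fits_full_speed(rate, bits))
```

Stereo 44.1kHz/16 bit and 96kHz (even in 32 bit containers) fit, while stereo 192kHz/24 bit does not, so full-speed mode trades away the highest sample rates.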
I would like to avoid this, as it's not really a solution, since it will mean max 24/96 and it will make db indexing very slow. |
The ARM SVC handler explicitly enables interrupts before calling whatever ARM-specific handler is needed: http://elixir.free-electrons.com/linux/latest/source/arch/arm/kernel/entry-common.S#L172 So the cache maintenance operation should be done without hardirqs disabled. The coherent_user_range() function invalidates Icache and cleans Dcache out to the inner shareable boundary which I imagine takes many cycles, but there shouldn't be anything like holding off interrupts for dozens of microseconds in that operation... |
Are there, in your opinion, any kernel configs or userspace tweaks that can be applied in this particular scenario (hi-res USB music playback) that you think could mitigate or solve this issue? |
Hi all, I would like to report that I have exactly the same problem with different setups connected to an asynchronous USB DAC (XMOS WAVEIO), whatever the music format
I have posted some logs here: http://logs.volumio.org/volumio/mD6JT8I.html Good luck with debugging, it really is annoying. BR |
Do you have any links for documentation of the OTG connection on the USB-C? I didn't spot it in anything I found earlier today. Having both the OTG and PCI-e connected USB-2 may be a helpful diagnostic. |
Indeed, but given that the new USB controller is much more suited to the job of interfacing with multiple devices simultaneously, I would not expect there to be any major issues. Certainly they should be easier to fix. Just need to avoid bugs in the hardware 😮 |
In any case, I am still very keen to find a solution for previous generation's PI. My best conf is the following (in cmdline.txt): However, I still have glitches (with some DACs) in DSD256 DoP. Any hint? |
Even with the mask=0xf I have occasional glitches. However, I can get completely 'glitch free' by forcing the CPU to a constant speed. It seems that switching between two frequencies (which normally happens when I play audio) has an influence on this. |
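The comment above doesn't say how the CPU speed was fixed. One common way on Linux is the cpufreq governor, so a quick check of whether a box is currently allowed to switch frequency could look like the sketch below. It reads the standard sysfs cpufreq files; the function names are invented, and treating "performance" and "powersave" as the fixed-frequency governors is an assumption about the setup, not something reported in this thread.

```python
from pathlib import Path

CPU_ROOT = Path("/sys/devices/system/cpu")
# Governors that hold the clock at the maximum or minimum allowed frequency.
PINNING_GOVERNORS = {"performance", "powersave"}


def read_governors(cpu_root: Path = CPU_ROOT) -> dict:
    """Map each CPU name (e.g. 'cpu0') to its current cpufreq governor."""
    governors = {}
    for gov_file in sorted(cpu_root.glob("cpu[0-9]*/cpufreq/scaling_governor")):
        cpu_name = gov_file.parent.parent.name
        governors[cpu_name] = gov_file.read_text().strip()
    return governors


def frequency_is_pinned(governors: dict) -> bool:
    """True only if at least one CPU was found and every CPU uses a fixed-frequency governor."""
    return bool(governors) and all(g in PINNING_GOVERNORS for g in governors.values())


if __name__ == "__main__":
    current = read_governors()
    print(current, "pinned" if frequency_is_pinned(current) else "may switch frequency")
```

If this reports a governor such as ondemand, the clock can still change during playback, which is the condition the comment above associates with glitches.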
Might be interesting to compare the behavior of Downstream and Mainline cpufreq driver according to this issue. |
Despite valiant attempts at fixes in the kernel my feeling is that this is a hardware architectural limitation in the original Pi models (1 to 3) that cannot be fixed in software, and two years after reporting this issue it remains unresolved. Has anyone done any thorough testing of this issue with an affected USB sound interface on the Raspberry Pi 4? If this problem does indeed not exist on the Pi 4 with its different USB controller architecture then I'm willing to give in and simply buy a Pi 4 for use with Volumio. |
I tried to reproduce the problem on the Pi 4 using both the type A sockets that use the new controller, and the type C socket that uses the old one. I couldn't reproduce it in either case using the same configuration I used to test on the Pi 3 - squeezelite -> BruteFIR -> Focusrite Forte streaming high res files over wired Ethernet. I had intended to repeat the tests on a Pi 3 to check that it wasn't a software change that had sorted it, since there's also been a major Raspbian version change, but I haven't done so yet. I should also try it using the Pi Zero W I suppose, in case the problem is related to the USB hub rather than the controller. |
That sounds promising... 👍 |
Pi 4 has a full XHCI-compliant USB3 controller with far more throughput than the DWC-OTG controller on previous Pis, and it puts very little load on the ARMs. I'm confident that it would solve your audio dropouts. |
Agreed @pelwell, but that still does not explain why I've got dropouts, pops and glitches with a 4.19 kernel on a Pi 3, and if I revert to 4.14 they are gone. |
We have designed an audio streamer based on the CM 3+ and we see the same issue. We have lots of reports of glitches with 4.19 and not with 4.14. Is anything being done about this issue? |
Something clearly changed in scheduling or interrupt handling between 4.14 and 4.19. I'm sure the change would be common to all platforms, but Pis 0 to 3 are probably more sensitive to it because of the lack of latency tolerance in the OTG USB controller. |
@AlloKatana and others, Let's try not to muddy the waters too much - the problem I reported in this git issue is not a 4.19 regression and occurs on all earlier Raspberry Pi kernel versions I've tested including 4.14. Not all USB audio glitches and dropouts are the same issue I reported, and while there may well be a regression in 4.19 which has made things much worse it should probably be discussed in its own separate Git issue specific to 4.19 regressions. |
We have contacted the Pi Foundation to find a proper solution to this. The proposed solution (still to be implemented) is to add another bit in the fiq_fsm_mask parameter that pipelines isochronous transfers. Unfortunately, we've not heard back from the Foundation yet about this. When there is any progress I'll post it here. |
@volumio Thanks for the pointer to that thread, I'll keep an eye on it. Just yesterday I've ordered a Pi 4 to be my new Volumio box, so for me at least the problem should be solved soon, (and I will no longer need to rely on a Cubox-i running very old Volumio builds) however it would still be great of course if a workable solution can be found for earlier models of Pi. My only concern is that the symptoms can be subtle and very hard to reproduce consistently - I've tried many different cmdline.txt and config.txt options (on a Pi 2) people have said work for them including all the ones suggested here and sometimes at first it seems to work but after using it for longer and trying different music and/or sample rates it becomes apparent that the problem is still there but perhaps just a bit less frequent and more difficult to trigger. So I suspect the people reporting success with various settings just aren't testing long enough or thoroughly enough on a wide enough variety of media files. Hopefully any fiq_fsm_mask fix doesn't end up just making it occur very infrequently instead of actually fixing it fully. |
I received my Pi 4 a few days ago and I can confirm that the issues I reported here are 100% resolved with the Pi 4. 😄 Perfect glitch free playback in Volumio up to 192kHz 24bit using the same Behringer UMC204HD which was having untold problems with dropouts on the Pi 2.
An unexpected side bonus is that the 5 volt output of the Pi 4 seems to be infinitely cleaner than that of the Pi 2. My Pi 2 was using the official (at the time) Pi 2 power adaptor and yet the USB 5 volt output was so noisy that "CPU chirping" and other digital noise was readily apparent on the analogue output of the DAC with music paused even at normal amplifier volume settings. Connecting a TV to the HDMI output (normally disconnected with Volumio) made the noise even worse, almost intolerable.
On the Pi 4 (also with the official USB-C power adaptor) there is no CPU chirping or digital noise audible at all. I can turn the amplifier way up to 2 o'clock (well beyond even loud playback) while paused and only hear a small amount of smooth analogue hiss from the amplifier itself that is present with the DAC unplugged. Basically no power noise of any kind is passing from the Pi through the DAC at all. I'm stunned.
I don't know whether the improvement is coming from the power adaptor or from filtering of the 5v supply to the USB ports on the Pi 4 PCB - any insights into the analogue/power design of the Pi 4 which might explain this dramatic reduction in USB 5v noise @pelwell or @P33M? |
Thanks for the feedback - if we ever need some poster quotes, we'll know where to come. I'm not going to embarrass myself with theories about where the improvements come from, but @jimbojr goes to a lot of effort over details like this. |
Thing is, on the Raspberry Pi 3B+, Raspberry Pi 4 4GB, Pi 400 and 8GB I have the same problems. Has anyone had a similar problem? |
Any updates on this issue? I am having the same problems with a Raspberry Pi Zero W and a USB XMOS DAC. |
Any news? I really hoped the RPi4 had solved the problem... but it just got better, not fine. |
I have different problems with the RPi4 USB-A ports: my USB DAC can only output 2 channel audio, because if I try to use it in 5.1 mode I only get silence and an error message about not enough bandwidth on the bus for the device. AFAIK it's a problem with the integrated USB 3.0 hub. However, I found that if I set the Pi4's USB-C port to host mode and connect the DAC to it using a USB-C OTG cable it works perfectly fine. |
Three years and no fix is forthcoming - I don't think this is fixable in software, it is a hardware limitation in the USB controller, so waiting for a software fix at this point I think is wishful thinking. I gave up long ago and switched to a Pi 4 which has a completely different USB controller which doesn't have the same limitations.
I've been using a 1GB Pi 4 with Volumio for over a year now with the same USB DAC from my initial post and it is 100% glitch free for me at all sample/bitrates right up to 192kHz / 24bit. Doesn't matter whether I am playing back local content or over the network. Absolutely zero problems.
I'm not sure what might be causing your issue - are you using Ethernet or WiFi? (I use Ethernet.) Have you tried plugging it into both USB 2 and USB 3 ports? (I use the USB 2 ports although I haven't noticed any problems on the USB 3 ports - worth trying both though.) Does it only happen at higher sample / bitrates or even down at 44kHz / 16 bit? Is it any worse or better at different sample/bitrates? Perhaps it is something specific to your sound card...
A good test is to generate test files at each sample / bitrate with a 20Hz sinewave - during playback, any dropouts/glitches will be audible as a slight click/pop because the 20Hz sinewave itself will not be audible on most speakers (or if it is, turn the bass down during the test). If there are no pops or clicks and you just hear silence after playback starts there is no issue. |
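If you don't have a tool to hand for making the 20Hz test files suggested above, the sketch below generates them with only the Python standard library. The function names, the half-scale amplitude and the 30 second default are arbitrary choices for this example, and it only handles 16 bit and 24 bit signed PCM (sample widths of 2 and 3 bytes).

```python
import math
import wave


def sine_frames(sample_rate, sample_width, seconds, tone_hz=20.0, amplitude=0.5, channels=2):
    """Return interleaved little-endian signed PCM bytes for a sine tone.

    sample_width is in bytes: 2 for 16 bit, 3 for 24 bit.
    """
    full_scale = 2 ** (8 * sample_width - 1) - 1
    frames = bytearray()
    for n in range(int(sample_rate * seconds)):
        value = int(round(amplitude * full_scale * math.sin(2 * math.pi * tone_hz * n / sample_rate)))
        # Write the same sample to every channel.
        frames += value.to_bytes(sample_width, "little", signed=True) * channels
    return bytes(frames)


def write_sine_wav(path, sample_rate, sample_width, seconds=30):
    """Write a stereo 20Hz sine test file to path."""
    with wave.open(path, "wb") as wav:
        wav.setnchannels(2)
        wav.setsampwidth(sample_width)
        wav.setframerate(sample_rate)
        wav.writeframes(sine_frames(sample_rate, sample_width, seconds))
```

For example, write_sine_wav("sine20_44100_16.wav", 44100, 2) and write_sine_wav("sine20_192000_24.wav", 192000, 3) produce files for the lowest and highest rates mentioned in this thread, which can then be played with aplay or through Volumio.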
Wontfix - the dwc_otg codebase is effectively in maintenance mode. High-speed isochronous devices with a bInterval of 1 are supported on a best-effort basis and to increase compatibility with these means meddling with FIQ code in a non-trivial manner. dwc2 in host mode (as used on Pi Zero / Pi Zero W) will suffer from the same root problem as dwc_otg - the recommendation is to use full-speed mode in this case. |
Thanks for the final report on this problem - I had guessed this issue would not be solvable due to hardware limitations and I've long since moved on to using a Pi 4 in this application, which works perfectly. I suggest anyone else with the same sort of USB DAC glitches/dropout described here does the same. |
Unfortunately, it appears moving to Raspberry Pi4 doesn't solve all the USB-related audio problems. For this reason, I have given up on using Raspberry Pi4 for this purpose. For people still experiencing USB-related audio problems, I can recommend Odroid boards for ARM-based development or simply going with x86-based Linux. |
FWIW, I would like to report very similar symptoms: audio dropouts every few seconds on a UMC204HD playback on a much more powerful machine: Ryzen 9 with 32GB of RAM and 16 threads. So this is not a problem of Raspberry Pi not handling the load, it's probably an issue of the Audio interface. |
If anyone is interested, I found a workaround: if I supply power to an RPi4 through the GPIO header and use the USB-C port as host, through a USB-A -> USB-C converter, to connect the DAC, it works fine. No dropouts and 5.1 audio works fine too. |
Hi,
I'm testing a Behringer U-Phoria UMC204HD USB sound interface with my Pi 2B running the latest version of Volumio and I'm experiencing random glitches/dropouts in the sound which after some investigation I suspect may be related to USB packet loss or timing - issues that have been looked at in the past for the Pi.
The interface supports up to 192kHz 24bit with 4 output channels, however I'm seeing the same symptoms down at 44kHz 16 bit - a pop or momentary dropout in the audio at random, roughly every 3 to 10 seconds, whether playing audio via Volumio or directly using aplay from the command line. The sample rate makes very little difference to the frequency of occurrence of the pops and clicks.
I've already experimented with the various dwc options discussed elsewhere and nothing seems to make much difference - the only one that made some difference was dwc_otg.speed=1 to disable hi-speed mode, which did significantly reduce the frequency of the glitches but did not eliminate them, and of course that is not a viable solution since it affects Ethernet speed etc...
These are the current cmdline options and kernel version used by Volumio that I am testing with:
splash quiet plymouth.ignore-serial-consoles dwc_otg.lpm_enable=0 dwc_otg.fiq_enable=1 dwc_otg.fiq_fsm_enable=1 dwc_otg.fiq_fsm_mask=0x3 console=serial0,115200 kgdboc=serial0,115200 console=tty1 imgpart=/dev/mmcblk0p2 imgfile=/volumio_current.sqsh elevator=noop rootwait smsc95xx.turbo_mode=N bootdelay=5 logo.nologo vt.global_cursor_default=0 loglevel=0
Linux version 4.9.36-v7+ (dc4@dc4-XPS13-9333) (gcc version 4.9.3 (crosstool-NG crosstool-ng-1.22.0-88-g8460611) ) #1015 SMP Thu Jul 6 16:14:20 BST 2017
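When comparing setups between reports it helps to pull just the dwc_otg settings out of a long command line like the one above. A small sketch (the function name is made up; on a running system the string could be read from /proc/cmdline):

```python
def dwc_otg_options(cmdline: str) -> dict:
    """Extract dwc_otg.<name>=<value> parameters from a kernel command line."""
    options = {}
    for token in cmdline.split():
        if token.startswith("dwc_otg.") and "=" in token:
            name, value = token[len("dwc_otg."):].split("=", 1)
            options[name] = value
    return options


if __name__ == "__main__":
    example = (
        "splash quiet dwc_otg.lpm_enable=0 dwc_otg.fiq_enable=1 "
        "dwc_otg.fiq_fsm_enable=1 dwc_otg.fiq_fsm_mask=0x3 rootwait"
    )
    opts = dwc_otg_options(example)
    print(opts)
    # The mask is written in hex; int(value, 0) accepts the 0x prefix.
    print(int(opts["fiq_fsm_mask"], 0))
```

This makes it easy to see at a glance whether a given box is running with, say, fiq_fsm_mask=0x3 or speed=1 before comparing results.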
For a comparison I installed the Cubox version of Volumio on a dual core Cubox-i that I have which is a comparable speed to a Pi 3, (so a bit faster than my Pi 2) and the interface works perfectly with both Volumio and aplay with no dropouts whatsoever, so I think this rules out a problem with the interface being supported properly in Linux on arm devices.
Here is the output from lsusb:
lsusb.txt
If there is anything else I can do to test or help debug this please let me know.