upstreaming drivers/dma/bcm2835-dma.c #1231
Comments
Are they changes that affect other modules, or strictly internal implementation details? In general, I wouldn't worry about it. We frequently have to merge upstream changes with downstream patches, and the fact that you have contributed both (and thank you for that, Martin) doesn't really change that. Having said that, if you think the merge may not be trivial then you could send us a patch that gets us from the top of our tree to what you think the final code should look like, and then we can check post-merge that we end up in the right place. |
I would take care to upstream patches the same way you did - with the exception that I may move the dma pool in earlier (as I need to merge some of the cyclic and slave code into a single shared piece of code) - anyways: I would do that slightly differently... My biggest problem is: I can not test the cyclic dma code - no experience how that works and what HW I would need |
An external audio HAT is probably the best way to test cyclic DMA. Several purveyors of fine audio cards frequent these parts - @hifiberry and @iqaudio come to mind. |
@msperl Contact me at [email protected] if you need a board for testing purposes. |
@HiassofT may also be able to help with testing. |
The biggest thing is the learning-curve of i2s (which obviously requires some of my time)... |
I will send you the DAC+ Light, this one needs only 3 data lines + 5V + GND, no additional I2C connection is needed for this one. |
thanks |
@msperl You can also ping me with patches if you want any testing done in this specific area. I have a test group comprising a Zero, CM, A+, B+ and 2B, which each have I2S DAC attached, outputting audio 24/7..... Currently running downstream 4.3 plus backport of the dma pool patch from downstream 4.4rc, but can be changed to whatever you want tested. |
Who needs a board? Happy to ship worldwide :-) Seasons greetings,
|
@msperl just drop me a line if you need some help with testing or if you have questions about the I2S code. Note that there's an unfixed bug in the cyclic DMA code; it would be good to resolve this before upstreaming the code. See #1193. As for testing with upstream: the hifiberry DAC+ light seems to be a good candidate as it has very few external code dependencies. I guess you should get it working if you add the hw_params function plus the .ops definition from the downstream driver to the test code I wrote for testing the dmapool patch |
FYI: here's the bcm2835-dma code I'm currently testing with kernel 4.3: It includes the dmapool patch and I've removed the incorrect period-splitting/rearrangement code from bcm2835_dma_prep_dma_cyclic. If a period larger than the maximum supported length is requested it errors out. Likewise if the buffer length is not a multiple of period length. This matches the implementation of most other DMA drivers, eg dma-axi-dmac, imx-sdma, mxs-dma - and simplifies the code quite a bit. So far I haven't run into any issues with these changes, the only requirement is that period_bytes_max in bcm2835-i2s is set up correctly (to 64k-4 bytes). |
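A minimal sketch of the parameter checks described in the comment above, written as plain userspace C so it can be compiled on its own. The helper name `cyclic_check_params` and the macro `MAX_PERIOD_LEN` are hypothetical and are not taken from bcm2835-dma.c; only the 64k-4 limit and the two error conditions come from the comment.

```c
#include <stddef.h>
#include <errno.h>

/* Assumed limit, taken from the comment above: 64k - 4 bytes per period. */
#define MAX_PERIOD_LEN (65536u - 4u)

/*
 * Hypothetical helper (not a function from bcm2835-dma.c).
 * Returns the number of periods in the buffer, or -EINVAL when the
 * request cannot be honored exactly as given.
 */
static int cyclic_check_params(size_t buf_len, size_t period_len)
{
    if (buf_len == 0 || period_len == 0)
        return -EINVAL;
    /* A period larger than the maximum supported length is rejected. */
    if (period_len > MAX_PERIOD_LEN)
        return -EINVAL;
    /* The buffer must consist of a whole number of periods. */
    if (buf_len % period_len != 0)
        return -EINVAL;
    return (int)(buf_len / period_len);
}
```

Rejecting the request, rather than silently rearranging periods, leaves the choice of sizes with the caller, which is the behavior the comment argues for.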
@HiassofT Now running your latest bcm2835-dma code on 3 of my I2S test group. All looks good so far. I'll switch the others to it tomorrow. |
@HiassofT: Reviewing existing patches so that I can argue them (playing a bit of devil's advocate):
|
That's a good idea, let's take the code apart and see if we can find more skeletons in the closet :)
Since kernel 4.3 dma_free_coherent complains loudly when it's being called from interrupt context - which happens when an audio device is closed. Switching to dmapool is one solution to that problem. See discussion here: #1178 (comment)
That was a temporary quick-fix to get rid of clicks during audio playback starting with kernel 4.2. See here: #1153 (comment) This quick-fix just masked the underlying bugs in the code, but it took a while until I found out what exactly was going wrong. The cyclic DMA code really shouldn't mess around with the parameters. The userspace application chooses buffer and period sizes within the limits reported by the PCM driver (bcm2835-i2s) so that it can meet its latency etc constraints - and it relies on the PCM/DMA driver to operate as requested. If it fails to do so you may get buffer over/underruns, manifesting in clicks or dropouts during recording and other nasty stuff - IOW it may break audio. I've just PRed my proposed fixes for the 4.4 branch, the bcm2835-dma code is identical to the 4.3 one I uploaded to gist (modulo some whitespace/comment diffs). See here: #1233 |
Other comments on commits in foundation 4.4.y:
So my approach would be:
|
I think f02f5b8 can be dropped. We were using it for debugging sdcard errors as previously increased dma wait states had worked around an sdcard driver bug. However it is not known to help with current driver. |
In the meantime I found that i2s no longer works upstream because of the new clock manager by @anholt - I wonder when this will filter down into the foundation kernels... |
Does rpi-4.4.y have the i2s problem? It builds clk-bcm2835 for downstream kernel. Yes, we are likely to get any upstream patches eventually, so identifying any upstream commits that break things we care about and reporting them would be good. |
i2s cards currently work fine in rpi-4.4.y. I ran into the same problem when testing the dmapool patch with |
I know, but as this effort is to upstream the dma code we need also to fix that portion - otherwise we can not test it... as @popcornmix said: we need to fix those anyway to avoid the roadblocks... |
b59b3cb is reverted in my proposed fix #1233 (I didn't add an explicit revert commit). Instead of looking at the single diffs it's easier to compare the resulting file. I did this for the cyclic setup code and compared upstream+dmapool patch to downstream plus my proposed fix. The end result is basically just added sanity checks, most of the other downstream changes are reverted. I think we may drop b12e830, that was only needed because the dma channels hadn't been defined in DT. I've created #1235 to clarify this. 2bc2902 changes the slave_sg code to use dmapool as well. Regarding the calltrace: IRQs are disabled in the alsa code, snd_pcm_period_elapsed (in sound/core/pcm_lib.c) does a snd_pcm_stream_lock_irqsave |
While patching the i2s driver to use the clock framework I found that the clock-framework is also "unhappy" about disabled interrupts and complains with WARN_ON. At least I found that there is a patch for the new clock-framework that also supports PWM - I need to check that when I get back and find a solution for the clock_framework clk_disable - the easiest solution is to keep the clock running when the driver is loaded or we need to defer the disable... Alternatively we need to clk_disable in a tasklet and at that time we can also abort the dma transfers. Actually I would guess this also would show up with other SOC implementations as well... |
I'm wondering if moving clock management to DAPM might be a solution. dapm_clock_event could be what we are looking for, and it'll be called from a different context. I'm totally unsure though if this is correct at all (the only user of SND_SOC_DAPM_CLOCK_SUPPLY in the kernel is a codec, not a CPU DAI) or if it would open a can of worms - DAPM can be tricky. |
Let me see how far I get next week when I am back - right now I get an exception when I use clk_set_rate - it may also be the clock driver patch I created or something else... |
Ok - after some reading and some coding, one insight into why there are audible "gaps" during transfers (b59b3cb)... The reason is that we produce 2 control-blocks with:
|
note that I got an initial version that (hopefully) is more acceptable to upstream by sharing code between cyclic and slave_sg... the essential portion (leaving out some definitions) of slave_sg now looks like this:
and the corresponding cyclic code:
In addition the dma_memcpy operations come for almost free:
Similarly memset/memset_sg would also be easy to implement and probably interleaved as well. In the hope that this may be acceptable for upstream... I still got I2S issues with the I2S clock so I can not verify if the cyclic code is working as expected (yet)... |
bcm2835_dma_prep_dma_cyclic doesn't look correct, it doesn't seem
to honor the period_len parameter.
See include/linux/dmaengine.h:
```
@device_prep_dma_cyclic: prepare a cyclic dma operation suitable for audio.
The function takes a buffer of size buf_len. The callback function will
be called after period_len bytes have been transferred.
```
Having a common function for CB allocation and basic setup makes sense,
but we'll need an address/len setup function for cyclic DMA as well.
|
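To make the period_len requirement concrete, here is a rough userspace sketch of how a cyclic buffer could be laid out as a chain of control blocks. It assumes each period may have to be split into chunks no larger than a hardware limit, and that only the last chunk of a period raises an interrupt. `struct cb_sketch` and `cyclic_fill_cbs` are hypothetical names; the struct is a simplified stand-in and not the real bcm2835 control block layout.

```c
#include <stddef.h>
#include <stdbool.h>

/* Simplified stand-in for a DMA control block (hypothetical layout). */
struct cb_sketch {
    size_t offset; /* byte offset into the cyclic buffer */
    size_t len;    /* bytes transferred by this block */
    bool   irq;    /* raise an interrupt when this block completes */
    size_t next;   /* index of the next block; the last one points to 0 */
};

/*
 * Fill 'cbs' (capacity 'max_cbs') for a cyclic transfer.
 * Returns the number of blocks used, or 0 if the parameters are
 * invalid or the array is too small.
 */
static size_t cyclic_fill_cbs(struct cb_sketch *cbs, size_t max_cbs,
                              size_t buf_len, size_t period_len,
                              size_t max_len)
{
    size_t n = 0;

    if (!buf_len || !period_len || !max_len || buf_len % period_len)
        return 0;

    for (size_t start = 0; start < buf_len; start += period_len) {
        size_t done = 0;

        /* Split one period into chunks of at most max_len bytes. */
        while (done < period_len) {
            size_t chunk = period_len - done;

            if (chunk > max_len)
                chunk = max_len;
            if (n == max_cbs)
                return 0;

            cbs[n].offset = start + done;
            cbs[n].len = chunk;
            done += chunk;
            /* Only the block that completes a period signals it. */
            cbs[n].irq = (done == period_len);
            cbs[n].next = n + 1;
            n++;
        }
    }

    /* Close the ring so the transfer repeats indefinitely. */
    cbs[n - 1].next = 0;
    return n;
}
```

With this layout the callback fires once per period_len bytes regardless of how many chunks a period needs. Note that a 65536-byte period with a 65532-byte limit would produce a trailing chunk of only 4 bytes in every period.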
well, that is something I have missed... Thanks for pointing it out - I found it strange that I could do away with period_len... |
still, that "split" I was talking about - especially about interrupts when you have one transfer that is only 4 bytes in size - seems relevant... |
Ok - I will create a pull request for the basic stuff that is upstream. |
Created: |
Note that I got the following feedback by Lars-Peter Clausen on the pcm registration patch:
So maybe it really is not needed - can someone confirm? |
@msperl without the "pcm registration" patch we lose the S16_LE format. Another reason we have this patch here is so that we could limit the maximum period size to 32k or 64k-4. The latter should no longer be a problem with your DMA patches / segment splitting code. It looks like the autodiscover code in dmaengine_pcm_set_runtime_hwparams determines that the DMA controller can't do 16bit transfers and so doesn't enable S16_LE. Not sure if there's a way to work around this other than passing a pcm config struct to snd_dmaengine_pcm_register and setting all values. But I think we should compare the values with the ones set up by the auto-discover code, so that we don't miss some important flag/setting. I also think we can drop SND_DMAENGINE_PCM_FLAG_COMPAT, IIRC that's only needed in setups without devicetree. |
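The effect described above can be modeled with a small userspace sketch. It assumes the auto-discovery keeps a sample format only when the DMA controller advertises a bus width equal to the format's in-memory size, with bit N of the mask meaning "N-byte transfers supported". That encoding, the `fmt_sketch` table and `usable_formats` are assumptions for illustration; the real check in dmaengine_pcm_set_runtime_hwparams may differ in detail.

```c
#include <stdint.h>

/* Subset of sample formats with their in-memory width in bits. */
struct fmt_sketch {
    const char *name;
    unsigned physical_bits;
};

static const struct fmt_sketch fmts[] = {
    { "S16_LE", 16 },
    { "S24_LE", 32 }, /* 24 valid bits stored in a 32-bit word */
    { "S32_LE", 32 },
};

/*
 * addr_widths: assumed encoding where bit N set means the controller
 * reports support for N-byte bus transfers.
 * Returns a bitmask indexed like 'fmts' (bit 0 = S16_LE, ...).
 */
static unsigned usable_formats(uint32_t addr_widths)
{
    unsigned mask = 0;

    for (unsigned i = 0; i < sizeof(fmts) / sizeof(fmts[0]); i++) {
        unsigned bytes = fmts[i].physical_bits / 8;

        if (addr_widths & (1u << bytes))
            mask |= 1u << i;
    }
    return mask;
}
```

Under these assumptions a controller that only reports 4-byte transfers ends up with S24_LE and S32_LE but not S16_LE, which matches the symptom reported in the comment.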
@HiassofT : well - as you are on the response from Lars-Peter - can you please send that answer to him/the lists? |
2 of those patches have been applied by mark (Master and 24 bit) |
I have pushed them into the pull request... |
For the last one I can modify the patch and post it again with better descriptions. |
As a note with regards to PCM registration: But I assume this has not been observed in the wild... |
That's not an issue. S24_LE is 24bits stored in the lower 3 bytes snd_pcm_format_width(SNDRV_PCM_FORMAT_S24_LE) = 24 The packed variant S24_3LE (width=physical_width=24) would result Have a look at the pcm_formats table for all the gory details: |
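The difference between sample width and physical width can be sketched as a small lookup table. The width values for the four formats below are the standard ALSA ones; the table and the helper `bytes_per_frame` are hypothetical illustrations, not the kernel's pcm_formats table.

```c
#include <string.h>

/* width = significant bits, physical_width = bits occupied in memory. */
struct pcm_fmt_sketch {
    const char *name;
    int width;
    int physical_width;
};

static const struct pcm_fmt_sketch pcm_fmt_table[] = {
    { "S16_LE",  16, 16 },
    { "S24_LE",  24, 32 }, /* 24 bits in the lower 3 bytes of a 32-bit word */
    { "S24_3LE", 24, 24 }, /* packed variant, 3 bytes per sample */
    { "S32_LE",  32, 32 },
};

/* Bytes occupied by one frame in memory, or -1 for an unknown format. */
static int bytes_per_frame(const char *name, int channels)
{
    size_t count = sizeof(pcm_fmt_table) / sizeof(pcm_fmt_table[0]);

    for (size_t i = 0; i < count; i++) {
        if (strcmp(pcm_fmt_table[i].name, name) == 0)
            return pcm_fmt_table[i].physical_width / 8 * channels;
    }
    return -1;
}
```

A stereo S24_LE frame therefore occupies 8 bytes, the same as S32_LE, while the packed S24_3LE variant occupies 6.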
It was just my interpretation of this being a possible issue. As for the 16 bit response by Lars (fixing the core) - maybe you want to take up that portion and fix that core piece? |
@msperl could you give the V2 patch that I just mailed out to you and Lars It should apply cleanly on downstream but in addition to that you have ret = devm_snd_dmaengine_pcm_register(&pdev->dev, NULL, 0); You can verify that S16_LE is enabled and working with aplay: If everything's fine you should see FORMAT: S16_LE S24_LE S32_LE Playing 32-bit WAVs should still work fine, too, but to play 24-bit Playing 192kHz 32-bit WAVs would be a good test for your DMA code Need to clean up my HD to make room for some more trees now.... |
@msperl @clivem here are upstream+downstream trees with my current dmaengine patches: https://github.com/HiassofT/rpi-linux/tree/rpi-4.4-dma-packed-s16le I've tested the latter with 44.1/16, 192/24 and 192/32 files and so far everything went fine. |
OK, so I am merging your patches on top of my CLK_ENABLE_HAND_OFF. As for testing: I tried But the clock is different : divider 3.125 without and 6.25 with your patch. So I can guess that it works... |
tested tags sent |
Thanks for testing!
The different clock is to be expected, you are using the default
ALSA plughw device and that'll automatically convert S16_LE to
S24_LE (or S32_LE) if native S16_LE support isn't available.
If you add the "-D hw:0" option to speaker-test you should get
an error message if S16_LE isn't available.
|
well - I also did the math and came to the conclusion it works based on the divider settings... |
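The divider argument can be written down as a one-line calculation. This is a sketch under stated assumptions: the divider is taken to be the source clock divided by the bit clock, and the bit clock is the sample rate times the number of bit clocks per frame. The function name `i2s_divider` is hypothetical, and the thread states neither the source clock nor the sample rate used in the test.

```c
/*
 * divider = source clock / (sample rate * bit clocks per frame)
 * Hypothetical helper; all inputs are supplied by the caller.
 */
static double i2s_divider(double clk_hz, unsigned rate_hz,
                          unsigned bclk_per_frame)
{
    return clk_hz / ((double)rate_hz * bclk_per_frame);
}
```

Halving the bit clocks per frame, as happens when 16-bit samples are sent natively instead of being converted to a 32-bit format, doubles the divider whatever the source clock is. As an example with assumed values, a 19.2 MHz source at 96 kHz gives 3.125 for 64 bit clocks per frame and 6.25 for 32. Whether those inputs match the actual test setup is not stated in the thread.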
@msperl has your issue been resolved? If so, please close this issue. Thanks. |
I guess it has been fixed |
There is an issue with using the Hifiberry DAC together with Eric Anholt's VC4 driver. On current stable systems with Linux kernel 4.4 (e.g. raspbian, archlinuxarm) it's not possible to enable "dtoverlay=vc4-kms-v3d" and "dtoverlay=hifiberry-dacplus" at the same time. |
The things "limiting" i2s from working with KMS have been fixed in kernel version 4.7 - see #1629. |
Great to hear it's resolved in 4.7 |
vc4-kms-v3d/vc4-fkms-v3d are working on 4.4 and 4.9, but not on the kernels in between. |
Address following errors: ``` [ 28.042135] mt7601u 1-1.2:1.0: Error: MCU resp urb failed:-71 [ 28.042183] mt7601u 1-1.2:1.0: Error: MCU resp evt:0 seq:1-f! ... [ 28.044756] mt7601u 1-1.2:1.0: Error: mt7601u_mcu_wait_resp timed out ``` TODO: Fix dma WARNING first ("WARNING: CPU: 0 PID: 10 at drivers/net/wireless/mediatek/mt7601u/dma.c:200 mt7601u_complete_rx+0x134/0x148 [mt7601u]") then test this mcu patch if mcu errors still present. ## System info Arch info: ``` pi@raspberrypi:~ $ uname -a Linux raspberrypi 4.14.34+ torvalds#1110 Mon Apr 16 14:51:42 BST 2018 armv6l GNU/Linux pi@raspberrypi:~ $ dpkg-query -W raspberrypi-kernel firmware-misc-nonfree firmware-misc-nonfree 20170823-1 raspberrypi-kernel 1.20180417-1 ``` Hardware info: ``` pi@raspberrypi:~ $ lsusb | grep 7601 Bus 001 Device 096: ID 148f:7601 Ralink Technology, Corp. MT7601U Wireless Adapter ``` I blacklisted a module in order not to taint the kernel: ``` pi@raspberrypi:~ $ cat /etc/modprobe.d/blacklist-snd_bcm2835.conf blacklist snd_bcm2835 ``` ## Diagnosis The user-visible symptom is that WiFi does not work. From `dmesg` I notice that association with access point fails. Excerpts of initial portion of `dmesg` that I consider relevant: ``` ... [ 3.423933] usb 1-1.2: New USB device found, idVendor=148f, idProduct=7601 [ 3.432813] usb 1-1.2: New USB device strings: Mfr=1, Product=2, SerialNumber=3 [ 3.442117] usb 1-1.2: Product: 802.11 n WLAN [ 3.448341] usb 1-1.2: SerialNumber: 1.0 ... [ 16.363820] usb 1-1.2: reset high-speed USB device number 4 using dwc_otg [ 16.543871] mt7601u 1-1.2:1.0: ASIC revision: 76010001 MAC revision: 76010500 ... [ 16.618099] mt7601u 1-1.2:1.0: Firmware Version: 0.1.00 Build: 7640 Build time: 201302052146____ ... [ 17.393938] mt7601u 1-1.2:1.0: EEPROM ver:0c fae:00 [ 18.133052] ieee80211 phy0: Selected rate control algorithm 'minstrel_ht' [ 18.135614] usbcore: registered new interface driver mt7601u ... 
[ 19.167626] IPv6: ADDRCONF(NETDEV_UP): wlan0: link is not ready [ 22.902745] wlan0: authenticate with ac:ce:ss:po:in:tX [ 22.969817] wlan0: send auth to ac:ce:ss:po:in:tX (try 1/3) [ 22.971735] wlan0: authenticated [ 22.982994] wlan0: associate with ac:ce:ss:po:in:tX (try 1/3) [ 22.986711] wlan0: RX AssocResp from ac:ce:ss:po:in:tX (capab=0x1411 status=0 aid=3) [ 23.003069] NYET/NAK/ACK/other in non-error case, 0x00000002 [ 23.003122] NYET/NAK/ACK/other in non-error case, 0x00000002 [ 23.003162] NYET/NAK/ACK/other in non-error case, 0x00000002 [ 23.033058] NYET/NAK/ACK/other in non-error case, 0x00000002 [ 23.033117] NYET/NAK/ACK/other in non-error case, 0x00000002 [ 23.033158] NYET/NAK/ACK/other in non-error case, 0x00000002 [ 23.053141] usb usb1-port1: disabled by hub (EMI?), re-enabling... [ 23.053181] usb 1-1: USB disconnect, device number 2 ... [ 23.084610] usb 1-1.2: USB disconnect, device number 4 [ 23.085396] ------------[ cut here ]------------ [ 23.085594] WARNING: CPU: 0 PID: 10 at drivers/net/wireless/mediatek/mt7601u/dma.c:200 mt7601u_complete_rx+0x134/0x148 [mt7601u] [ 23.085600] RX urb mismatch [ 23.085607] Modules linked in: arc4 mt7601u tun mac80211 cfg80211 rfkill uio_pdrv_genirq uio fixed ip6t_REJECT nf_reject_ipv6 nf_log_ipv6 xt_hl ip6t_rt nf_conntrack_ipv6 nf_defrag_ipv6 ipt_REJECT nf_reject_ipv4 nf_log_ipv4 nf_log_common xt_LOG xt_recent xt_limit xt_tcpudp xt_addrtype ipt_MASQUERADE nf_nat_masquerade_ipv4 iptable_nat nf_nat_ipv4 nf_conntrack_ipv4 nf_defrag_ipv4 xt_conntrack ip6table_filter ip6_tables nf_conntrack_netbios_ns nf_conntrack_broadcast nf_nat_ftp nf_nat nf_conntrack_ftp nf_conntrack iptable_filter ip_tables x_tables ipv6 [ 23.085789] CPU: 0 PID: 10 Comm: kworker/0:1 Not tainted 4.14.34+ torvalds#1110 [ 23.085794] Hardware name: BCM2835 [ 23.085828] Workqueue: usb_hub_wq hub_event [ 23.085891] [<c0016514>] (unwind_backtrace) from [<c0013e4c>] (show_stack+0x20/0x24) [ 23.085921] [<c0013e4c>] (show_stack) from [<c062ffc4>] 
(dump_stack+0x20/0x28) [ 23.085950] [<c062ffc4>] (dump_stack) from [<c0021f94>] (__warn+0xe4/0x10c) [ 23.085969] [<c0021f94>] (__warn) from [<c0022004>] (warn_slowpath_fmt+0x48/0x50) [ 23.086087] [<c0022004>] (warn_slowpath_fmt) from [<bf3264f8>] (mt7601u_complete_rx+0x134/0x148 [mt7601u]) [ 23.086236] [<bf3264f8>] (mt7601u_complete_rx [mt7601u]) from [<c04610e0>] (__usb_hcd_giveback_urb+0x80/0x160) [ 23.086262] [<c04610e0>] (__usb_hcd_giveback_urb) from [<c0461208>] (usb_hcd_giveback_urb+0x48/0x10c) [ 23.086297] [<c0461208>] (usb_hcd_giveback_urb) from [<c0490318>] (dwc_otg_urb_dequeue+0x98/0xbc) [ 23.086323] [<c0490318>] (dwc_otg_urb_dequeue) from [<c0461cac>] (unlink1+0x40/0x178) [ 23.086345] [<c0461cac>] (unlink1) from [<c0463118>] (usb_hcd_flush_endpoint+0xcc/0xec) [ 23.086366] [<c0463118>] (usb_hcd_flush_endpoint) from [<c046605c>] (usb_disable_endpoint+0x58/0xa0) [ 23.086383] [<c046605c>] (usb_disable_endpoint) from [<c04660f0>] (usb_disable_interface+0x4c/0x64) [ 23.086402] [<c04660f0>] (usb_disable_interface) from [<c0468b90>] (usb_unbind_interface+0x1d0/0x288) [ 23.086440] [<c0468b90>] (usb_unbind_interface) from [<c0400618>] (device_release_driver_internal+0x14c/0x1ec) [ 23.086465] [<c0400618>] (device_release_driver_internal) from [<c04006d8>] (device_release_driver+0x20/0x24) [ 23.086487] [<c04006d8>] (device_release_driver) from [<c03ff4a0>] (bus_remove_device+0xd8/0x108) [ 23.086507] [<c03ff4a0>] (bus_remove_device) from [<c03fc124>] (device_del+0x1ec/0x30c) [ 23.086526] [<c03fc124>] (device_del) from [<c04661b8>] (usb_disable_device+0xb0/0x1f4) [ 23.086546] [<c04661b8>] (usb_disable_device) from [<c045cd34>] (usb_disconnect+0x7c/0x1fc) [ 23.086567] [<c045cd34>] (usb_disconnect) from [<c045ce6c>] (usb_disconnect+0x1b4/0x1fc) [ 23.086586] [<c045ce6c>] (usb_disconnect) from [<c045e718>] (hub_event+0x594/0x11c0) [ 23.086610] [<c045e718>] (hub_event) from [<c0039208>] (process_one_work+0x11c/0x398) [ 23.086629] [<c0039208>] (process_one_work) from 
[<c00394c0>] (worker_thread+0x3c/0x544) [ 23.086656] [<c00394c0>] (worker_thread) from [<c003f3bc>] (kthread+0x120/0x15c) [ 23.086681] [<c003f3bc>] (kthread) from [<c000fe6c>] (ret_from_fork+0x14/0x28) [ 23.086690] ---[ end trace ee6907230b405e54 ]--- [ 23.096896] mt7601u 1-1.2:1.0: Error: submit URB dir:128 ep:1 failed:-19 [ 23.108805] mt7601u 1-1.2:1.0: mt7601u_rxdc_cal timed out [ 23.108914] wlan0: associated [ 23.109635] wlan0: deauthenticating from ac:ce:ss:po:in:tX by local choice (Reason: 3=DEAUTH_LEAVING) [ 23.121758] mt7601u 1-1.2:1.0: mt7601u_rxdc_cal timed out ... [ 28.040752] mt7601u 1-1.2:1.0: Error: RX urb failed:-71 ... [ 28.042135] mt7601u 1-1.2:1.0: Error: MCU resp urb failed:-71 [ 28.042183] mt7601u 1-1.2:1.0: Error: MCU resp evt:0 seq:1-f! ... [ 28.044756] mt7601u 1-1.2:1.0: Error: mt7601u_mcu_wait_resp timed out ... ``` ## Analysis ### Identification of similar issues This may be the same issue as reported in [this comment with the same driver - `mt7601u_rxdc_cal timed out`](raspberrypi/linux#1231 (comment)). This seems the same symptom as reported in [this comment with a similar driver - `rx urb mismatch` and `mt76_usb_complete_rx [mt76]`](openwrt/mt76#139 (comment)) and [fixed](openwrt/mt76@ad0a3e9). For this case resolution was usage of `GFP_ATOMIC` in call to `mt76_usb_submit_buf` that passes it to `usb_submit_urb`. ### Identification of affected source code ``` raspberrypi-kernel 1.20180417-1 ``` ``` [ 23.085594] WARNING: CPU: 0 PID: 10 at drivers/net/wireless/mediatek/mt7601u/dma.c:200 mt7601u_complete_rx+0x134/0x148 [mt7601u] ``` Versioned link to affected source code: https://github.com/raspberrypi/linux/blob/raspberrypi-kernel_1.20180417-1/drivers/net/wireless/mediatek/mt7601u/dma.c#L200
Addresses following warning: ``` [ 23.085594] WARNING: CPU: 0 PID: 10 at drivers/net/wireless/mediatek/mt7601u/dma.c:200 mt7601u_complete_rx+0x134/0x148 [mt7601u] [ 23.085600] RX urb mismatch ``` This WARN seemed the same symptom reported in [this comment for the similar `mt76x2u` driver - `rx urb mismatch` and `mt76_usb_complete_rx [mt76]`](openwrt/mt76#139 (comment)) and [fixed by using `GFP_ATOMIC` in call to `mt76_usb_submit_buf` that passes it to `usb_submit_urb`](openwrt/mt76@ad0a3e9), so the fix is similar. ---- ## System info Arch info: ``` pi@raspberrypi:~ $ uname -a Linux raspberrypi 4.14.34+ torvalds#1110 Mon Apr 16 14:51:42 BST 2018 armv6l GNU/Linux pi@raspberrypi:~ $ dpkg-query -W raspberrypi-kernel firmware-misc-nonfree firmware-misc-nonfree 20170823-1 raspberrypi-kernel 1.20180417-1 ``` Hardware info: ``` pi@raspberrypi:~ $ lsusb | grep 7601 Bus 001 Device 096: ID 148f:7601 Ralink Technology, Corp. MT7601U Wireless Adapter ``` I blacklisted a module in order not to taint the kernel: ``` pi@raspberrypi:~ $ cat /etc/modprobe.d/blacklist-snd_bcm2835.conf blacklist snd_bcm2835 ``` ## Diagnosis The user-visible symptom is that WiFi does not work. From `dmesg` I notice that association with access point fails. Excerpts of initial portion of `dmesg` that I consider relevant: ``` ... [ 3.423933] usb 1-1.2: New USB device found, idVendor=148f, idProduct=7601 [ 3.432813] usb 1-1.2: New USB device strings: Mfr=1, Product=2, SerialNumber=3 [ 3.442117] usb 1-1.2: Product: 802.11 n WLAN [ 3.448341] usb 1-1.2: SerialNumber: 1.0 ... [ 16.363820] usb 1-1.2: reset high-speed USB device number 4 using dwc_otg [ 16.543871] mt7601u 1-1.2:1.0: ASIC revision: 76010001 MAC revision: 76010500 ... [ 16.618099] mt7601u 1-1.2:1.0: Firmware Version: 0.1.00 Build: 7640 Build time: 201302052146____ ... 
[ 17.393938] mt7601u 1-1.2:1.0: EEPROM ver:0c fae:00 [ 18.133052] ieee80211 phy0: Selected rate control algorithm 'minstrel_ht' [ 18.135614] usbcore: registered new interface driver mt7601u ... [ 19.167626] IPv6: ADDRCONF(NETDEV_UP): wlan0: link is not ready [ 22.902745] wlan0: authenticate with ac:ce:ss:po:in:tX [ 22.969817] wlan0: send auth to ac:ce:ss:po:in:tX (try 1/3) [ 22.971735] wlan0: authenticated [ 22.982994] wlan0: associate with ac:ce:ss:po:in:tX (try 1/3) [ 22.986711] wlan0: RX AssocResp from ac:ce:ss:po:in:tX (capab=0x1411 status=0 aid=3) [ 23.003069] NYET/NAK/ACK/other in non-error case, 0x00000002 [ 23.003122] NYET/NAK/ACK/other in non-error case, 0x00000002 [ 23.003162] NYET/NAK/ACK/other in non-error case, 0x00000002 [ 23.033058] NYET/NAK/ACK/other in non-error case, 0x00000002 [ 23.033117] NYET/NAK/ACK/other in non-error case, 0x00000002 [ 23.033158] NYET/NAK/ACK/other in non-error case, 0x00000002 [ 23.053141] usb usb1-port1: disabled by hub (EMI?), re-enabling... [ 23.053181] usb 1-1: USB disconnect, device number 2 ... 
[ 23.084610] usb 1-1.2: USB disconnect, device number 4 [ 23.085396] ------------[ cut here ]------------ [ 23.085594] WARNING: CPU: 0 PID: 10 at drivers/net/wireless/mediatek/mt7601u/dma.c:200 mt7601u_complete_rx+0x134/0x148 [mt7601u] [ 23.085600] RX urb mismatch [ 23.085607] Modules linked in: arc4 mt7601u tun mac80211 cfg80211 rfkill uio_pdrv_genirq uio fixed ip6t_REJECT nf_reject_ipv6 nf_log_ipv6 xt_hl ip6t_rt nf_conntrack_ipv6 nf_defrag_ipv6 ipt_REJECT nf_reject_ipv4 nf_log_ipv4 nf_log_common xt_LOG xt_recent xt_limit xt_tcpudp xt_addrtype ipt_MASQUERADE nf_nat_masquerade_ipv4 iptable_nat nf_nat_ipv4 nf_conntrack_ipv4 nf_defrag_ipv4 xt_conntrack ip6table_filter ip6_tables nf_conntrack_netbios_ns nf_conntrack_broadcast nf_nat_ftp nf_nat nf_conntrack_ftp nf_conntrack iptable_filter ip_tables x_tables ipv6 [ 23.085789] CPU: 0 PID: 10 Comm: kworker/0:1 Not tainted 4.14.34+ torvalds#1110 [ 23.085794] Hardware name: BCM2835 [ 23.085828] Workqueue: usb_hub_wq hub_event [ 23.085891] [<c0016514>] (unwind_backtrace) from [<c0013e4c>] (show_stack+0x20/0x24) [ 23.085921] [<c0013e4c>] (show_stack) from [<c062ffc4>] (dump_stack+0x20/0x28) [ 23.085950] [<c062ffc4>] (dump_stack) from [<c0021f94>] (__warn+0xe4/0x10c) [ 23.085969] [<c0021f94>] (__warn) from [<c0022004>] (warn_slowpath_fmt+0x48/0x50) [ 23.086087] [<c0022004>] (warn_slowpath_fmt) from [<bf3264f8>] (mt7601u_complete_rx+0x134/0x148 [mt7601u]) [ 23.086236] [<bf3264f8>] (mt7601u_complete_rx [mt7601u]) from [<c04610e0>] (__usb_hcd_giveback_urb+0x80/0x160) [ 23.086262] [<c04610e0>] (__usb_hcd_giveback_urb) from [<c0461208>] (usb_hcd_giveback_urb+0x48/0x10c) [ 23.086297] [<c0461208>] (usb_hcd_giveback_urb) from [<c0490318>] (dwc_otg_urb_dequeue+0x98/0xbc) [ 23.086323] [<c0490318>] (dwc_otg_urb_dequeue) from [<c0461cac>] (unlink1+0x40/0x178) [ 23.086345] [<c0461cac>] (unlink1) from [<c0463118>] (usb_hcd_flush_endpoint+0xcc/0xec) [ 23.086366] [<c0463118>] (usb_hcd_flush_endpoint) from [<c046605c>] 
(usb_disable_endpoint+0x58/0xa0) [ 23.086383] [<c046605c>] (usb_disable_endpoint) from [<c04660f0>] (usb_disable_interface+0x4c/0x64) [ 23.086402] [<c04660f0>] (usb_disable_interface) from [<c0468b90>] (usb_unbind_interface+0x1d0/0x288) [ 23.086440] [<c0468b90>] (usb_unbind_interface) from [<c0400618>] (device_release_driver_internal+0x14c/0x1ec) [ 23.086465] [<c0400618>] (device_release_driver_internal) from [<c04006d8>] (device_release_driver+0x20/0x24) [ 23.086487] [<c04006d8>] (device_release_driver) from [<c03ff4a0>] (bus_remove_device+0xd8/0x108) [ 23.086507] [<c03ff4a0>] (bus_remove_device) from [<c03fc124>] (device_del+0x1ec/0x30c) [ 23.086526] [<c03fc124>] (device_del) from [<c04661b8>] (usb_disable_device+0xb0/0x1f4) [ 23.086546] [<c04661b8>] (usb_disable_device) from [<c045cd34>] (usb_disconnect+0x7c/0x1fc) [ 23.086567] [<c045cd34>] (usb_disconnect) from [<c045ce6c>] (usb_disconnect+0x1b4/0x1fc) [ 23.086586] [<c045ce6c>] (usb_disconnect) from [<c045e718>] (hub_event+0x594/0x11c0) [ 23.086610] [<c045e718>] (hub_event) from [<c0039208>] (process_one_work+0x11c/0x398) [ 23.086629] [<c0039208>] (process_one_work) from [<c00394c0>] (worker_thread+0x3c/0x544) [ 23.086656] [<c00394c0>] (worker_thread) from [<c003f3bc>] (kthread+0x120/0x15c) [ 23.086681] [<c003f3bc>] (kthread) from [<c000fe6c>] (ret_from_fork+0x14/0x28) [ 23.086690] ---[ end trace ee6907230b405e54 ]--- [ 23.096896] mt7601u 1-1.2:1.0: Error: submit URB dir:128 ep:1 failed:-19 [ 23.108805] mt7601u 1-1.2:1.0: mt7601u_rxdc_cal timed out [ 23.108914] wlan0: associated [ 23.109635] wlan0: deauthenticating from ac:ce:ss:po:in:tX by local choice (Reason: 3=DEAUTH_LEAVING) [ 23.121758] mt7601u 1-1.2:1.0: mt7601u_rxdc_cal timed out ... [ 28.040752] mt7601u 1-1.2:1.0: Error: RX urb failed:-71 ... [ 28.042135] mt7601u 1-1.2:1.0: Error: MCU resp urb failed:-71 [ 28.042183] mt7601u 1-1.2:1.0: Error: MCU resp evt:0 seq:1-f! ... [ 28.044756] mt7601u 1-1.2:1.0: Error: mt7601u_mcu_wait_resp timed out ... 
``` ## Analysis ### Identification of similar issues This may be the same issue as reported in [this comment with the same driver - `mt7601u_rxdc_cal timed out`](raspberrypi/linux#1231 (comment)). This seems the same symptom as reported in [this comment with a similar driver - `rx urb mismatch` and `mt76_usb_complete_rx [mt76]`](openwrt/mt76#139 (comment)) and [fixed](openwrt/mt76@ad0a3e9). For this case resolution was usage of `GFP_ATOMIC` in call to `mt76_usb_submit_buf` that passes it to `usb_submit_urb`. ### Identification of affected source code ``` raspberrypi-kernel 1.20180417-1 ``` ``` [ 23.085594] WARNING: CPU: 0 PID: 10 at drivers/net/wireless/mediatek/mt7601u/dma.c:200 mt7601u_complete_rx+0x134/0x148 [mt7601u] ``` Versioned link to affected source code: https://github.com/raspberrypi/linux/blob/raspberrypi-kernel_1.20180417-1/drivers/net/wireless/mediatek/mt7601u/dma.c#L200
Address following errors:

```
[ 28.042135] mt7601u 1-1.2:1.0: Error: MCU resp urb failed:-71
[ 28.042183] mt7601u 1-1.2:1.0: Error: MCU resp evt:0 seq:1-f!
...
[ 28.044756] mt7601u 1-1.2:1.0: Error: mt7601u_mcu_wait_resp timed out
```

TODO: Fix dma WARNING first ("WARNING: CPU: 0 PID: 10 at drivers/net/wireless/mediatek/mt7601u/dma.c:200 mt7601u_complete_rx+0x134/0x148 [mt7601u]") then test this mcu patch if mcu errors still present.

## System info

Arch info:

```
pi@raspberrypi:~ $ uname -a
Linux raspberrypi 4.14.34+ #1110 Mon Apr 16 14:51:42 BST 2018 armv6l GNU/Linux
pi@raspberrypi:~ $ dpkg-query -W raspberrypi-kernel firmware-misc-nonfree
firmware-misc-nonfree 20170823-1
raspberrypi-kernel 1.20180417-1
```

Hardware info:

```
pi@raspberrypi:~ $ lsusb | grep 7601
Bus 001 Device 096: ID 148f:7601 Ralink Technology, Corp. MT7601U Wireless Adapter
```

I blacklisted a module in order not to taint the kernel:

```
pi@raspberrypi:~ $ cat /etc/modprobe.d/blacklist-snd_bcm2835.conf
blacklist snd_bcm2835
```

## Diagnosis

The user-visible symptom is that WiFi does not work. From `dmesg` I notice that association with access point fails.

Excerpts of initial portion of `dmesg` that I consider relevant:

```
...
[ 3.423933] usb 1-1.2: New USB device found, idVendor=148f, idProduct=7601
[ 3.432813] usb 1-1.2: New USB device strings: Mfr=1, Product=2, SerialNumber=3
[ 3.442117] usb 1-1.2: Product: 802.11 n WLAN
[ 3.448341] usb 1-1.2: SerialNumber: 1.0
...
[ 16.363820] usb 1-1.2: reset high-speed USB device number 4 using dwc_otg
[ 16.543871] mt7601u 1-1.2:1.0: ASIC revision: 76010001 MAC revision: 76010500
...
[ 16.618099] mt7601u 1-1.2:1.0: Firmware Version: 0.1.00 Build: 7640 Build time: 201302052146____
...
[ 17.393938] mt7601u 1-1.2:1.0: EEPROM ver:0c fae:00
[ 18.133052] ieee80211 phy0: Selected rate control algorithm 'minstrel_ht'
[ 18.135614] usbcore: registered new interface driver mt7601u
...
[ 19.167626] IPv6: ADDRCONF(NETDEV_UP): wlan0: link is not ready
[ 22.902745] wlan0: authenticate with ac:ce:ss:po:in:tX
[ 22.969817] wlan0: send auth to ac:ce:ss:po:in:tX (try 1/3)
[ 22.971735] wlan0: authenticated
[ 22.982994] wlan0: associate with ac:ce:ss:po:in:tX (try 1/3)
[ 22.986711] wlan0: RX AssocResp from ac:ce:ss:po:in:tX (capab=0x1411 status=0 aid=3)
[ 23.003069] NYET/NAK/ACK/other in non-error case, 0x00000002
[ 23.003122] NYET/NAK/ACK/other in non-error case, 0x00000002
[ 23.003162] NYET/NAK/ACK/other in non-error case, 0x00000002
[ 23.033058] NYET/NAK/ACK/other in non-error case, 0x00000002
[ 23.033117] NYET/NAK/ACK/other in non-error case, 0x00000002
[ 23.033158] NYET/NAK/ACK/other in non-error case, 0x00000002
[ 23.053141] usb usb1-port1: disabled by hub (EMI?), re-enabling...
[ 23.053181] usb 1-1: USB disconnect, device number 2
...
[ 23.084610] usb 1-1.2: USB disconnect, device number 4
[ 23.085396] ------------[ cut here ]------------
[ 23.085594] WARNING: CPU: 0 PID: 10 at drivers/net/wireless/mediatek/mt7601u/dma.c:200 mt7601u_complete_rx+0x134/0x148 [mt7601u]
[ 23.085600] RX urb mismatch
[ 23.085607] Modules linked in: arc4 mt7601u tun mac80211 cfg80211 rfkill uio_pdrv_genirq uio fixed ip6t_REJECT nf_reject_ipv6 nf_log_ipv6 xt_hl ip6t_rt nf_conntrack_ipv6 nf_defrag_ipv6 ipt_REJECT nf_reject_ipv4 nf_log_ipv4 nf_log_common xt_LOG xt_recent xt_limit xt_tcpudp xt_addrtype ipt_MASQUERADE nf_nat_masquerade_ipv4 iptable_nat nf_nat_ipv4 nf_conntrack_ipv4 nf_defrag_ipv4 xt_conntrack ip6table_filter ip6_tables nf_conntrack_netbios_ns nf_conntrack_broadcast nf_nat_ftp nf_nat nf_conntrack_ftp nf_conntrack iptable_filter ip_tables x_tables ipv6
[ 23.085789] CPU: 0 PID: 10 Comm: kworker/0:1 Not tainted 4.14.34+ #1110
[ 23.085794] Hardware name: BCM2835
[ 23.085828] Workqueue: usb_hub_wq hub_event
[ 23.085891] [<c0016514>] (unwind_backtrace) from [<c0013e4c>] (show_stack+0x20/0x24)
[ 23.085921] [<c0013e4c>] (show_stack) from [<c062ffc4>] (dump_stack+0x20/0x28)
[ 23.085950] [<c062ffc4>] (dump_stack) from [<c0021f94>] (__warn+0xe4/0x10c)
[ 23.085969] [<c0021f94>] (__warn) from [<c0022004>] (warn_slowpath_fmt+0x48/0x50)
[ 23.086087] [<c0022004>] (warn_slowpath_fmt) from [<bf3264f8>] (mt7601u_complete_rx+0x134/0x148 [mt7601u])
[ 23.086236] [<bf3264f8>] (mt7601u_complete_rx [mt7601u]) from [<c04610e0>] (__usb_hcd_giveback_urb+0x80/0x160)
[ 23.086262] [<c04610e0>] (__usb_hcd_giveback_urb) from [<c0461208>] (usb_hcd_giveback_urb+0x48/0x10c)
[ 23.086297] [<c0461208>] (usb_hcd_giveback_urb) from [<c0490318>] (dwc_otg_urb_dequeue+0x98/0xbc)
[ 23.086323] [<c0490318>] (dwc_otg_urb_dequeue) from [<c0461cac>] (unlink1+0x40/0x178)
[ 23.086345] [<c0461cac>] (unlink1) from [<c0463118>] (usb_hcd_flush_endpoint+0xcc/0xec)
[ 23.086366] [<c0463118>] (usb_hcd_flush_endpoint) from [<c046605c>] (usb_disable_endpoint+0x58/0xa0)
[ 23.086383] [<c046605c>] (usb_disable_endpoint) from [<c04660f0>] (usb_disable_interface+0x4c/0x64)
[ 23.086402] [<c04660f0>] (usb_disable_interface) from [<c0468b90>] (usb_unbind_interface+0x1d0/0x288)
[ 23.086440] [<c0468b90>] (usb_unbind_interface) from [<c0400618>] (device_release_driver_internal+0x14c/0x1ec)
[ 23.086465] [<c0400618>] (device_release_driver_internal) from [<c04006d8>] (device_release_driver+0x20/0x24)
[ 23.086487] [<c04006d8>] (device_release_driver) from [<c03ff4a0>] (bus_remove_device+0xd8/0x108)
[ 23.086507] [<c03ff4a0>] (bus_remove_device) from [<c03fc124>] (device_del+0x1ec/0x30c)
[ 23.086526] [<c03fc124>] (device_del) from [<c04661b8>] (usb_disable_device+0xb0/0x1f4)
[ 23.086546] [<c04661b8>] (usb_disable_device) from [<c045cd34>] (usb_disconnect+0x7c/0x1fc)
[ 23.086567] [<c045cd34>] (usb_disconnect) from [<c045ce6c>] (usb_disconnect+0x1b4/0x1fc)
[ 23.086586] [<c045ce6c>] (usb_disconnect) from [<c045e718>] (hub_event+0x594/0x11c0)
[ 23.086610] [<c045e718>] (hub_event) from [<c0039208>] (process_one_work+0x11c/0x398)
[ 23.086629] [<c0039208>] (process_one_work) from [<c00394c0>] (worker_thread+0x3c/0x544)
[ 23.086656] [<c00394c0>] (worker_thread) from [<c003f3bc>] (kthread+0x120/0x15c)
[ 23.086681] [<c003f3bc>] (kthread) from [<c000fe6c>] (ret_from_fork+0x14/0x28)
[ 23.086690] ---[ end trace ee6907230b405e54 ]---
[ 23.096896] mt7601u 1-1.2:1.0: Error: submit URB dir:128 ep:1 failed:-19
[ 23.108805] mt7601u 1-1.2:1.0: mt7601u_rxdc_cal timed out
[ 23.108914] wlan0: associated
[ 23.109635] wlan0: deauthenticating from ac:ce:ss:po:in:tX by local choice (Reason: 3=DEAUTH_LEAVING)
[ 23.121758] mt7601u 1-1.2:1.0: mt7601u_rxdc_cal timed out
...
[ 28.040752] mt7601u 1-1.2:1.0: Error: RX urb failed:-71
...
[ 28.042135] mt7601u 1-1.2:1.0: Error: MCU resp urb failed:-71
[ 28.042183] mt7601u 1-1.2:1.0: Error: MCU resp evt:0 seq:1-f!
...
[ 28.044756] mt7601u 1-1.2:1.0: Error: mt7601u_mcu_wait_resp timed out
...
```

## Analysis

### Identification of similar issues

This may be the same issue as reported in [this comment with the same driver - `mt7601u_rxdc_cal timed out`](raspberrypi/linux#1231 (comment)).

This seems the same symptom as reported in [this comment with a similar driver - `rx urb mismatch` and `mt76_usb_complete_rx [mt76]`](openwrt/mt76#139 (comment)) and [fixed](openwrt/mt76@ad0a3e9). For this case resolution was usage of `GFP_ATOMIC` in call to `mt76_usb_submit_buf` that passes it to `usb_submit_urb`.

### Identification of affected source code

```
raspberrypi-kernel 1.20180417-1
```

```
[ 23.085594] WARNING: CPU: 0 PID: 10 at drivers/net/wireless/mediatek/mt7601u/dma.c:200 mt7601u_complete_rx+0x134/0x148 [mt7601u]
```

Versioned link to affected source code: https://github.com/raspberrypi/linux/blob/raspberrypi-kernel_1.20180417-1/drivers/net/wireless/mediatek/mt7601u/dma.c#L200
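The `failed:-19` and `failed:-71` values in the `dmesg` excerpt above are negated Linux errno codes. As a rough aid for reading such logs, the sketch below maps them to their symbolic names (`ENODEV` and `EPROTO`). The helper name `decode_failures` and the `SAMPLE` text are illustrative assumptions, not part of the driver or of any existing tool; the errno table is hardcoded to the two Linux values seen in the log so the result does not depend on the machine running the script.

```python
import re

# Linux errno values seen in the log above (hardcoded on purpose:
# errno numbering differs between operating systems).
LINUX_ERRNO = {
    19: "ENODEV",  # no such device; consistent with the USB disconnect
    71: "EPROTO",  # protocol error on the bus
}

# Matches the driver's "failed:-N" suffix and captures N.
FAILED_RE = re.compile(r"failed:-(\d+)")


def decode_failures(log_text):
    """Return (line, errno_name) for every 'failed:-N' found in log_text."""
    results = []
    for line in log_text.splitlines():
        match = FAILED_RE.search(line)
        if match:
            code = int(match.group(1))
            results.append((line.strip(), LINUX_ERRNO.get(code, "UNKNOWN")))
    return results


# Hypothetical input: three lines copied from the excerpt above.
SAMPLE = """\
[ 23.096896] mt7601u 1-1.2:1.0: Error: submit URB dir:128 ep:1 failed:-19
[ 28.040752] mt7601u 1-1.2:1.0: Error: RX urb failed:-71
[ 28.042135] mt7601u 1-1.2:1.0: Error: MCU resp urb failed:-71
"""

if __name__ == "__main__":
    for line, name in decode_failures(SAMPLE):
        print(f"{name:8} {line}")
```

Read this way, the `-19` on URB submission comes right after the hub reports the disconnect, while the later `-71` errors point at bus-level failures rather than at the MCU protocol itself; this is only an interpretation aid and does not replace fixing the `dma.c:200` warning first.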
I am trying to upstream `drivers/dma/bcm2835-dma.c` - especially the slave portion.
One of the things that has come up is that upstream wants changes to the code, which I can make.
The question is: how would those changes filter back into this downstream branch, and how can I help make that work smoothly?