upstreaming drivers/dma/bcm2835-dma.c #1231

Closed
msperl opened this issue Dec 18, 2015 · 341 comments

Comments

msperl commented Dec 18, 2015

I am trying to upstream drivers/dma/bcm2835-dma.c - especially the slave-portion.

One of the things that has come up is that upstream wants changes to the code, which I can make.

The question is: how would those changes filter back into this downstream branch, and how can I help make that go smoothly?

pelwell commented Dec 18, 2015

Are they changes that affect other modules, or strictly internal implementation details?

In general, I wouldn't worry about it. We frequently have to merge upstream changes with downstream patches, and the fact that you have contributed both (and thank you for that, Martin) doesn't really change that.

Having said that, if you think the merge may not be trivial then you could send us a patch that gets us from the top of our tree to what you think the final code should look like, and then we can check post-merge that we end up in the right place.

msperl commented Dec 18, 2015

I would take care to upstream the patches the same way you did, with the exception that I may move the dma pool in earlier (as I need to merge some of the cyclic and slave code into a single shared piece of code). In any case, I would do that slightly differently...

My biggest problem is that I cannot test the cyclic dma code: I have no experience with how it works or what HW I would need.

pelwell commented Dec 18, 2015

An external audio HAT is probably the best way to test cyclic DMA. Several purveyors of fine audio cards frequent these parts - @hifiberry and @iqaudio come to mind.

@hifiberry

@msperl Contact me at [email protected] if you need a board for testing purposes.

pelwell commented Dec 18, 2015

@HiassofT may also be able to help with testing.

msperl commented Dec 18, 2015

The biggest thing is the learning-curve of i2s (which obviously requires some of my time)...
@hifiberry: contacting you - I hope it does not require that many jumper-cables, as my dev board is a CM...
@HiassofT may do some independent testing to see that it works...

@hifiberry

I will send you the DAC+ Light, this one needs only 3 data lines + 5V + GND, no additional I2C connection is needed for this one.

msperl commented Dec 18, 2015

thanks

clivem commented Dec 18, 2015

@msperl You can also ping me with patches if you want any testing done in this specific area. I have a test group comprising a Zero, CM, A+, B+ and 2B, each with an I2S DAC attached, outputting audio 24/7. They are currently running downstream 4.3 plus a backport of the dma pool patch from downstream 4.4rc, but this can be changed to whatever you want tested.

iqaudio commented Dec 18, 2015

Who needs a board? Happy to ship worldwide :-)

Seasons greetings,

[email protected]


@HiassofT

@msperl just drop me a line if you need some help with testing or if you have questions about the I2S code

Note that there's an unfixed bug in the cyclic DMA code; it would be good to resolve this before upstreaming the code. See #1193

As for testing with upstream: The hifiberry DAC+ light seems to be a good candidate as it has very few external code dependencies. I guess you should get it working if you add the hw_params function plus the .ops definition from the downstream driver to the test code I wrote for testing the dmapool patch
http://lists.infradead.org/pipermail/linux-arm-kernel/2015-November/387442.html

@HiassofT

FYI: here's the bcm2835-dma code I'm currently testing with kernel 4.3:
https://gist.github.com/HiassofT/95f073242f19299b4da0

It includes the dmapool patch and I've removed the incorrect period-splitting/rearrangement code from bcm2835_dma_prep_dma_cyclic.

If a period larger than the maximum supported length is requested it errors out. Likewise if the buffer length is not a multiple of period length. This matches the implementation of most other DMA drivers, eg dma-axi-dmac, imx-sdma, mxs-dma - and simplifies the code quite a bit.

So far I haven't run into any issues with these changes, the only requirement is that period_bytes_max in bcm2835-i2s is set up correctly (to 64k-4 bytes).
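
The two checks described above (reject a period that exceeds the maximum transfer length, and reject a buffer that is not a whole number of periods) can be sketched in standalone C. This is only an illustration, not the driver code: `MAX_PERIOD_BYTES` and `cyclic_params_ok` are hypothetical names, and the 64k-4 limit is simply the value quoted in this comment.

```c
#include <stdbool.h>
#include <stddef.h>

/* Assumed limit, taken from the comment above: 64k - 4 bytes per period. */
#define MAX_PERIOD_BYTES ((size_t)65536 - 4)

/*
 * Hypothetical helper: accept a cyclic request only if one period fits
 * into a single transfer and the buffer holds a whole number of periods.
 */
bool cyclic_params_ok(size_t buf_len, size_t period_len)
{
        if (buf_len == 0 || period_len == 0)
                return false;
        if (period_len > MAX_PERIOD_BYTES)
                return false;   /* period too large: error out */
        if (buf_len % period_len != 0)
                return false;   /* buffer is not a multiple of the period */
        return true;
}
```

With these assumptions, a 16 KiB buffer with 4 KiB periods is accepted, while a 128 KiB buffer with 64 KiB periods is rejected because each period is 4 bytes over the limit.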

clivem commented Dec 19, 2015

@HiassofT Now running your latest bcm2835-dma code on 3 of my I2S test group. All looks good so far. I'll switch the others to it tomorrow.

msperl commented Dec 20, 2015

@HiassofT: Reviewing existing patches so that I can argue for them (playing a bit of devil's advocate):

  • why do we switch to dmapool implementation? What is the advantage/arguments/measurements supporting those arguments?
  • why is there a different cyclic limit for cyclic DMA channels - I see nothing obvious? Would not: "#define MAX_LITE_TRANSFER (SZ_64K - SZ_K)" work more efficiently? (I can guess about the - 4 byte case - 2 transfers - one of 65532 bytes and one 4 bytes long with the corresponding CB overhead of 64 bytes to load for the control block.)

@HiassofT

> @HiassofT: Reviewing existing patches so that I can argue them (playing a bit of devils advocate):

That's a good idea, let's take the code apart and see if we can find more skeletons in the closet :)

> why do we switch to dmapool implementation? What is the advantage/arguments/measurements supporting those arguments?

Since kernel 4.3 dma_free_coherent complains loudly when it's being called from interrupt context, which happens when an audio device is closed. Switching to dmapool is one solution to that problem. See discussion here: #1178 (comment)

> why is there a different cyclic limit for cyclic DMA channels - I see nothing obvious? Would not: "#define MAX_LITE_TRANSFER (SZ_64K - SZ_K)" work more efficiently? (I can guess about the - 4 byte case - 2 transfers - one of 65532 bytes and one 4 bytes long with the corresponding CB overhead of 64 bytes to load for the control block.)

That was a temporary quick-fix to get rid of clicks during audio playback starting with kernel 4.2. See here: #1153 (comment)

This quick-fix just masked the underlying bugs in the code, but it took a while until I found out what exactly was going wrong.

The cyclic DMA code really shouldn't mess around with the parameters. The userspace application chooses buffer and period sizes within the limits reported by the PCM driver (bcm2835-i2s) so that it can meet its latency etc. constraints, and it relies on the PCM/DMA driver to operate as requested. If it fails to do so you may get buffer over/underruns, manifesting in clicks or dropouts during recording and other nasty stuff. IOW it may break audio.

I've just PRed my proposed fixes for the 4.4 branch, the bcm2835-dma code is identical to the 4.3 one I uploaded to gist (modulo some whitespace/comment diffs). See here: #1233

msperl commented Dec 20, 2015

Other comments on commits in foundation 4.4.y:

  • 7835fbd Can not get up-streamed as it relates only to "legacy"/foundation kernels
  • 0e9c1fc Can get up-streamed (and is easily arguable)
  • f02f5b8 What is the reason for having this? How can this help with debugging? @notro may answer that.
  • b59b3cb This seems like a workaround so should get fixed in the correct place - if we still make any change here I would recommend limiting it to a multiple of a page (SZ_64K-SZ_4K) Maybe we should revert that patch?
  • c776a0f (and b12e830) needs some updating as per upstream (merging of common code,...)
  • 2bc2902 seems related to dma_free_coherent with interrupts disabled. Stacktrace in ARCH_BCM270X: Drop ATAGS support #1178 shows that it runs in a softirq thread, so with interrupts enabled. This seems to mask an issue with code (possibly in snd_pcm_*) which disables interrupts (possibly holding a spinlock).

So my approach would be:

  • upstream the capabilities 0e9c1fc
  • Split out the "common" CB generation into a separate patch
  • Add the async code using the above

@popcornmix

I think f02f5b8 can be dropped. We were using it for debugging sdcard errors as previously increased dma wait states had worked around an sdcard driver bug. However it is not known to help with current driver.

msperl commented Dec 21, 2015

In the meantime I found that i2s no longer works upstream because of the new clock manager by @anholt. I wonder when this will filter down into the foundation kernels...

@popcornmix

Does rpi-4.4.y have the i2s problem? It builds clk-bcm2835 for downstream kernel.
I believe i2s audio is working in Milhouse OpenELEC builds which use rpi-4.4.y.

Yes, we are likely to get any upstream patches eventually, so identifying any upstream commits that break things we care about and reporting them would be good.

@HiassofT

> Does rpi-4.4.y have the i2s problem? It builds clk-bcm2835 for downstream kernel.

i2s cards currently work fine in rpi-4.4.y.

I ran into the same problem when testing the dmapool patch with
upstream 4.4-rc2 and had to disable the clk driver in DT. See
http://lists.infradead.org/pipermail/linux-arm-kernel/2015-November/387442.html

msperl commented Dec 21, 2015

I know, but as this effort is to upstream the dma code we need also to fix that portion - otherwise we can not test it...

as @popcornmix said: we need to fix those anyway to avoid the roadblocks...

@HiassofT

b59b3cb is reverted in my proposed fix #1233 (I didn't add an explicit revert commit).

Instead of looking at the single diffs it's easier to compare the resulting file. I did this for the cyclic setup code and compared upstream+dmapool patch to downstream plus my proposed fix. The end result is basically just added sanity checks, most of the other downstream changes are reverted.
https://gist.github.com/HiassofT/639a2069d56b2874fce9

I think we may drop b12e830, that was only needed because the dma channels hadn't been defined in DT. I've created #1235 to clarify this.

2bc2902 changes the slave_sg code to use dmapool as well.

Regarding the calltrace: IRQs are disabled in the alsa code, snd_pcm_period_elapsed (in sound/core/pcm_lib.c) does a snd_pcm_stream_lock_irqsave

msperl commented Dec 22, 2015

While patching the i2s driver to use the clock framework I found that the clock-framework is also "unhappy" about disabled interrupts and complains with WARN_ON.

At least I found that there is a patch for the new clock-framework that also supports PWM - I need to check that when I get back and find a solution for the clock_framework clk_disable - the easiest solution is to keep the clock running when the driver is loaded or we need to defer the disable...

Alternatively we need to clk_disable in a tasklet and at that time we can also abort the dma transfers.
That would remove the need for the dmapool.

Actually I would guess this would show up with other SoC implementations as well...

@HiassofT

I'm wondering if moving clock management to DAPM might be a solution. dapm_clock_event could be what we are looking for, and it'll be called from a different context.
http://lxr.free-electrons.com/source/sound/soc/soc-dapm.c#L1211

I'm totally unsure though if this is correct at all (the only user of SND_SOC_DAPM_CLOCK_SUPPLY in the kernel is a codec, not a CPU DAI) or if it would open a can of worms - DAPM can be tricky.

msperl commented Dec 22, 2015

Let me see how far I get next week when I am back - right now I get an exception when I use clk_set_rate - it may also be the clock driver patch I created or something else...

msperl commented Dec 29, 2015

Ok - after some reading and some coding, here is one insight into why there are audible "gaps" during transfers (b59b3cb)...

The reason is that we produce 2 control-blocks with:

  • the first being of size 65532 bytes
  • the second being of size 4 bytes

Both have the end interrupt set, which means each will trigger an interrupt. But with all the latencies (i.e. other interrupts already running) it is possible that, by the time the handler for the first interrupt checks which dma triggered it, the second control block has already finished and the first is being processed again. In my experience this means that the interrupt is NOT really detected (as the interrupt flag seems to be reset when the next transfer finishes).
This is my best interpretation of the situation...
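
To make the 65532 + 4 split concrete, here is a small standalone sketch of the arithmetic. It is not taken from the driver: `MAX_CB_BYTES`, `cb_count` and `last_cb_bytes` are hypothetical names, and the per-control-block limit of 64k-4 bytes is the value discussed in this thread.

```c
#include <stddef.h>

/* Assumed per-control-block limit discussed in this thread: 64k - 4 bytes. */
#define MAX_CB_BYTES ((size_t)65536 - 4)

/* Number of control blocks needed to cover len bytes (rounding up). */
size_t cb_count(size_t len)
{
        return (len + MAX_CB_BYTES - 1) / MAX_CB_BYTES;
}

/* Size of the last control block when len is split greedily; 0 if len is 0. */
size_t last_cb_bytes(size_t len)
{
        size_t rem = len % MAX_CB_BYTES;

        if (len == 0)
                return 0;
        return rem ? rem : MAX_CB_BYTES;
}
```

Under these assumptions a 64 KiB period needs two control blocks, and the second one carries only 4 bytes, so it completes almost immediately after the first. That is the short window in which the two end interrupts can run into each other.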

msperl commented Dec 29, 2015

note that I got an initial version that (hopefully) is more acceptable to upstream by sharing code between cyclic and slave_sg...

the essential portion (leaving out some definitions) of slave_sg now looks like this:

static struct dma_async_tx_descriptor *bcm2835_dma_prep_slave_sg(
        struct dma_chan *chan,
        struct scatterlist *sgl, unsigned int sg_len,
        enum dma_transfer_direction direction,
        unsigned long flags, void *context)
{
        struct bcm2835_chan *c = to_bcm2835_dma_chan(chan);
        struct bcm2835_desc *d;
        dma_addr_t src = 0, dst = 0;
        u32 info = BCM2835_DMA_WAIT_RESP;
        u32 extra = BCM2835_DMA_INT_EN;
        struct scatterlist *sgent;
        size_t frames = 0;      /* accumulated below, so it must start at 0 */
        unsigned int i;

        if (!is_slave_direction(direction)) {
                dev_err(chan->device->dev,
                        "%s: bad direction?\n", __func__);
                return NULL;
        }

        if (c->dreq != 0)
                info |= BCM2835_DMA_PER_MAP(c->dreq);

        if (direction == DMA_DEV_TO_MEM) {
                if (c->cfg.src_addr_width != DMA_SLAVE_BUSWIDTH_4_BYTES)
                        return NULL;
                src = c->cfg.src_addr;
                info |= BCM2835_DMA_S_DREQ | BCM2835_DMA_D_INC;
        } else {
                if (c->cfg.dst_addr_width != DMA_SLAVE_BUSWIDTH_4_BYTES)
                        return NULL;
                dst = c->cfg.dst_addr;
                info |= BCM2835_DMA_D_DREQ | BCM2835_DMA_S_INC;
        }

        /* count frames in sg list */
        for_each_sg(sgl, sgent, sg_len, i)
                frames += bcm2835_dma_frames_for_length(c,
                                                        sg_dma_len(sgent));

        /* allocate the CB chain */
        d = bcm2835_dma_create_cb_chain(chan, direction, false,
                                        info, extra,
                                        frames, src, dst, 0, GFP_KERNEL);
        if (!d)
                return NULL;

        /* fill in frames with scatterlist pointers */
        bcm2835_dma_fill_cb_chain(chan, direction, d->cb_list,
                                  sgl, sg_len);

        return vchan_tx_prep(&c->vc, &d->vd, flags);
}

and the corresponding cyclic code:

static struct dma_async_tx_descriptor *bcm2835_dma_prep_dma_cyclic(
        struct dma_chan *chan, dma_addr_t buf_addr, size_t buf_len,
        size_t period_len, enum dma_transfer_direction direction,
        unsigned long flags)
{
        struct bcm2835_chan *c = to_bcm2835_dma_chan(chan);
        struct bcm2835_desc *d;
        dma_addr_t src, dst;
        u32 info = BCM2835_DMA_WAIT_RESP;
        u32 extra = BCM2835_DMA_INT_EN;

        if (!is_slave_direction(direction)) {
                dev_err(chan->device->dev,
                        "%s: bad direction?\n", __func__);
                return NULL;
        }

        if (!buf_len) {
                dev_err(chan->device->dev,
                        "%s: bad buffer length (= 0)\n", __func__);
                return NULL;
        }

        /* Setup DREQ channel */
        if (c->dreq != 0)
                info |= BCM2835_DMA_PER_MAP(c->dreq);

        if (direction == DMA_DEV_TO_MEM) {
                if (c->cfg.src_addr_width != DMA_SLAVE_BUSWIDTH_4_BYTES)
                        return NULL;
                src = c->cfg.src_addr;
                dst = buf_addr;
                info |= BCM2835_DMA_S_DREQ | BCM2835_DMA_D_INC;
        } else {
                if (c->cfg.dst_addr_width != DMA_SLAVE_BUSWIDTH_4_BYTES)
                        return NULL;
                dst = c->cfg.dst_addr;
                src = buf_addr;
                info |= BCM2835_DMA_D_DREQ | BCM2835_DMA_S_INC;
        }

        /* allocate the CB chain */
        d = bcm2835_dma_create_cb_chain(chan, direction, true,
                                        info, extra,
                                        0, src, dst, buf_len, GFP_KERNEL);
        if (!d)
                return NULL;

        /* wrap around into a loop */
        d->cb_list[d->frames - 1].cb->next = d->cb_list[0].paddr;

        return vchan_tx_prep(&c->vc, &d->vd, flags);
}

In addition, the dma_memcpy operations come almost for free:

struct dma_async_tx_descriptor *bcm2835_dma_prep_dma_memcpy(
        struct dma_chan *chan, dma_addr_t dst, dma_addr_t src,
        size_t len, unsigned long flags)
{
        struct bcm2835_chan *c = to_bcm2835_dma_chan(chan);
        struct bcm2835_desc *d;
        u32 info = BCM2835_DMA_D_INC | BCM2835_DMA_S_INC;
        u32 extra = BCM2835_DMA_INT_EN | BCM2835_DMA_WAIT_RESP;

        /* if src, dst or len is not given return with an error */
        if (!src || !dst || !len)
                return NULL;

        /* allocate the CB chain */
        d = bcm2835_dma_create_cb_chain(chan, DMA_MEM_TO_MEM, false,
                                        info, extra, 0,
                                        src, dst, len, GFP_KERNEL);
        if (!d)
                return NULL;

        return vchan_tx_prep(&c->vc, &d->vd, flags);
}

Similarly, memset/memset_sg would also be easy to implement, and probably interleaved as well.
The question is whether there are any "consumers" for those...

In the hope that this may be acceptable for upstream...

I still got I2S issues with the I2S clock so I can not verify if the cyclic code is working as expected (yet)...

HiassofT commented Dec 29, 2015 via email

msperl commented Dec 29, 2015

well, that is something I have missed... Thanks for pointing it out - I found it strange that I could do away with period_len...

msperl commented Dec 29, 2015

Still, that "split" I was talking about - especially about interrupts when you have one transfer that is only 4 bytes in size - seems relevant...

msperl commented Apr 25, 2016

Ok - I will create a pull request for the basic stuff that is upstream.
Those other patches are in a separate branch until they get merged...

msperl commented Apr 25, 2016

Created:
#1439
But note that I have not converted all clocks (yet) to use the clock manager!
Also note that smi seems to run its own clock manager, which we need to fix as well.

msperl commented Apr 25, 2016

Note that I got the following feedback from Lars-Peter Clausen on the pcm registration patch:

+static const struct snd_dmaengine_pcm_config bcm2835_dmaengine_pcm_config = {
+   .prepare_slave_config = snd_dmaengine_pcm_prepare_slave_config,
+   .pcm_hardware = &bcm2835_pcm_hardware,
+   .prealloc_buffer_size = 256 * PAGE_SIZE,
+};

The generic dmaengine PCM driver auto-discovers these things, no need to
provide them. The code is OK as it is.

So maybe it really is not needed - can someone confirm?

@HiassofT

@msperl without the "pcm registration" patch we lose the S16_LE format. Another reason we have this patch here is so that we could limit the maximum period size to 32k or 64k-4. The latter should no longer be a problem with your DMA patches / segment splitting code.

It looks like the autodiscover code in dmaengine_pcm_set_runtime_hwparams determines that the DMA controller can't do 16bit transfers and so doesn't enable S16_LE.

Not sure if there's a way to work around this other than passing a pcm config struct to snd_dmaengine_pcm_register and setting all values. But I think we should compare the values with the ones set up by the auto-discover code, so that we don't miss some important flag/setting. I also think we can drop SND_DMAENGINE_PCM_FLAG_COMPAT; IIRC that's only needed in setups without devicetree.
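
As a rough model of why S16_LE disappears, the following standalone sketch mimics the behaviour described above: a format is only offered when the DMA controller reports support for accesses of that format's physical width. This is a simplification written for illustration, not the kernel code; `format_enabled` is a hypothetical name, and the convention that bit n of the mask stands for n-byte accesses is an assumption of this sketch.

```c
#include <stdbool.h>

/*
 * Simplified model: bit n of addr_widths is set when the DMA controller
 * supports n-byte accesses (for example bit 4 for 32-bit transfers).
 * A format is enabled only if its physical width is supported.
 */
bool format_enabled(unsigned int addr_widths, unsigned int physical_width_bits)
{
        return (addr_widths & (1u << (physical_width_bits / 8))) != 0;
}
```

In this model a controller that only advertises 4-byte accesses enables the 32-bit formats but not a 16-bit one, which matches the observation that S16_LE is lost when the explicit pcm config is removed.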

msperl commented Apr 25, 2016

@HiassofT : well - as you are on the response from Lars-Peter - can you please send that answer to him/the lists?

msperl commented Apr 25, 2016

2 of those patches have been applied by Mark (Master and 24 bit)

msperl commented Apr 25, 2016

I have pushed them into the pull request...
I am still trying to find a way around the clock issue, but I guess that will happen tomorrow - maybe Eric has an idea in the meantime.

msperl commented Apr 26, 2016

For the last one I can modify the patch and post it again with better descriptions.

msperl commented Apr 26, 2016

As a note with regards to PCM registration:
Thinking about this a bit more (now that I understand the context) in principle we could create some issues with 24 bit transfers, as 2*24 bit = 48 bit = 1.5*32 bit and hence with bad "buffer sizes" we could end up with channel swapping and clicks...

But I assume this has not been observed in the wild...

@HiassofT

> Thinking about this a bit more (now that I understand the context) in principle we could create some issues with 24 bit transfers, as 2*24 bit = 48 bit = 1.5*32 bit and hence with bad "buffer sizes" we could end up with channel swapping and clicks...

That's not an issue. S24_LE is 24bits stored in the lower 3 bytes
of a 32bit word.

snd_pcm_format_width(SNDRV_PCM_FORMAT_S24_LE) = 24
snd_pcm_format_physical_width(SNDRV_PCM_FORMAT_S24_LE) = 32

The packed variant S24_3LE (width=physical_width=24) would result
in such a layout, but we are not using that, the I2S chip doesn't
support this format.

Have a look at the pcm_formats table for all the gory details:
http://lxr.free-electrons.com/source/sound/core/pcm_misc.c?v=4.4#L42
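
The difference between the padded and the packed 24-bit layout can be checked with a few lines of arithmetic. The sketch below is standalone and illustrative only: `frame_bytes` and `frame_is_word_aligned` are hypothetical helpers, and the physical widths (32 for S24_LE, 24 for S24_3LE) are the values quoted above.

```c
#include <stdbool.h>
#include <stddef.h>

/* Bytes per frame = channels * physical sample width in bytes. */
size_t frame_bytes(unsigned int channels, unsigned int physical_width_bits)
{
        return (size_t)channels * (physical_width_bits / 8);
}

/* True if every frame occupies a whole number of 32-bit words. */
bool frame_is_word_aligned(unsigned int channels,
                           unsigned int physical_width_bits)
{
        return frame_bytes(channels, physical_width_bits) % 4 == 0;
}
```

For stereo S24_LE (physical width 32) a frame is 8 bytes, so frames always line up with 32-bit DMA transfers. Only the packed S24_3LE layout (physical width 24) would give 6-byte frames, which is the 1.5-word case raised earlier.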

msperl commented Apr 26, 2016

It was just my interpretation of this being a possible issue.

As for the 16 bit response by Lars (fixing the core) - maybe you want to take up that portion and fix that core piece?
I am looking into the clock stuff instead.

HiassofT commented Apr 26, 2016

@msperl could you give the V2 patch that I just mailed out to you and Lars
a try and see if that works with downstream+upstream trees?

It should apply cleanly on downstream but in addition to that you have
to remove the pcm_hardware config from the register call so that the
"auto-discover" code is used. I've used this for testing bcm2835-i2s
(same as in upstream code):

ret = devm_snd_dmaengine_pcm_register(&pdev->dev, NULL, 0);

You can verify that S16_LE is enabled and working with aplay:
aplay -v -D hw:0 --dump-hw-params some-16bit-wav-file.wav

If everything's fine you should see

FORMAT: S16_LE S24_LE S32_LE

Playing 32-bit WAVs should still work fine, too, but to play 24-bit
WAVs you have to use -D plughw:0 (or remove that, that should be the
default anyway) so that ALSA will convert the packed 24bit file
to the 32-bit padded S24_LE format.

Playing 192kHz 32-bit WAVs would be a good test for your DMA code
as well, that'll request 128k periods/buffers. I haven't tested
that yet, just assumed it would work.

Need to clean up my HD to make room for some more trees now....

HiassofT commented Apr 27, 2016

@msperl @clivem here are upstream+downstream trees with my current dmaengine patches:
https://github.com/HiassofT/rpi-linux/tree/upstream-packed-s16le
This is based on 4.6-rc5. @msperl do you have a working upstream environment and could test these?

https://github.com/HiassofT/rpi-linux/tree/rpi-4.4-dma-packed-s16le
This one is based on https://github.com/msperl/linux-rpi/tree/dma-engine-backport-4.4 and also contains a commit to remove the downstream pcm_hardware config and actually use the upstream code.

I've tested the latter with 44.1/16, 192/24 and 192/32 files and so far everything went fine.

msperl commented Apr 27, 2016

OK, so I am merging your patches on top of my CLK_ENABLE_HAND_OFF.

As for testing: I tried
speaker-test -c 2 -r 96000 -F S16_LE -f 440 -t sine -l 1
and that works with and without your patch...

But the clock is different : divider 3.125 without and 6.25 with your patch.

So I can guess that it works...
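
One way to reproduce those two divider values is the sketch below. It rests on two assumptions that are not stated in this comment: that the bit clock is derived from a 19.2 MHz source clock, and that the bit clock equals sample rate × channels × bits per sample slot. `SRC_CLK_HZ` and `bclk_divider` are hypothetical names.

```c
/* Assumed source clock: 19.2 MHz (an assumption, not stated in the thread). */
#define SRC_CLK_HZ 19200000.0

/* Divider needed to derive the I2S bit clock from the source clock. */
double bclk_divider(unsigned int rate_hz, unsigned int channels,
                    unsigned int bits_per_slot)
{
        double bclk_hz = (double)rate_hz * channels * bits_per_slot;

        return SRC_CLK_HZ / bclk_hz;
}
```

Under these assumptions, 96 kHz stereo with 16-bit slots gives a divider of 6.25 and with 32-bit slots gives 3.125. That would be consistent with S16_LE being sent in packed 16-bit slots once the patch is applied, instead of 32-bit slots without it.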

msperl commented Apr 27, 2016

tested tags sent

HiassofT commented Apr 27, 2016 via email

msperl commented Apr 27, 2016

Well - I also did the math and came to the conclusion that it works, based on the divider settings...
Maybe this is now sufficient to get merged into for-next.

Ruffio commented Aug 17, 2016

@msperl has your issue been resolved? If so, please close this issue. Thanks.

msperl commented Aug 30, 2016

I guess it has been fixed

@msperl msperl closed this as completed Aug 30, 2016
@joaodriessen

There is an issue regarding using the Hifiberry DAC with Eric Anholt's VC4 driver. On current stable systems with Linux kernel 4.4 (i.e. Raspbian, archlinuxarm) it's not possible to enable "dtoverlay=vc4-kms-v3d" and "dtoverlay=hifiberry-dacplus" at the same time.
Is this something that could be resolved in the near future? @msperl @hifiberry

msperl commented Nov 3, 2016

The things "limiting" i2s from working with KMS have been fixed in kernel version 4.7 - see #1629.
What is missing in that (or later) kernel versions is an updated version of KMS - see the question asked by popcornmix at the end of that thread with regards to KMS working on 4.7 (or later).

@joaodriessen

Great to hear it's resolved in 4.7
Any idea if this will be backported for users running stock distros/systems such as Raspbian or even ArchlinuxARM, which run kernel 4.4?

@popcornmix

vc4-kms-v3d/vc4-fkms-v3d are working on 4.4 and 4.9, but not on kernels in between.

lucafavatella added a commit to lucafavatella/linux that referenced this issue May 20, 2018
Address following errors:
```
[   28.042135] mt7601u 1-1.2:1.0: Error: MCU resp urb failed:-71
[   28.042183] mt7601u 1-1.2:1.0: Error: MCU resp evt:0 seq:1-f!
...
[   28.044756] mt7601u 1-1.2:1.0: Error: mt7601u_mcu_wait_resp timed out
```

TODO: Fix dma WARNING first ("WARNING: CPU: 0 PID: 10 at drivers/net/wireless/mediatek/mt7601u/dma.c:200 mt7601u_complete_rx+0x134/0x148 [mt7601u]") then test this mcu patch if mcu errors still present.

## System info

Arch info:
```
pi@raspberrypi:~ $ uname -a
Linux raspberrypi 4.14.34+ #1110 Mon Apr 16 14:51:42 BST 2018 armv6l GNU/Linux

pi@raspberrypi:~ $ dpkg-query -W raspberrypi-kernel firmware-misc-nonfree
firmware-misc-nonfree   20170823-1
raspberrypi-kernel      1.20180417-1
```

Hardware info:
```
pi@raspberrypi:~ $ lsusb | grep 7601
Bus 001 Device 096: ID 148f:7601 Ralink Technology, Corp. MT7601U Wireless Adapter
```

I blacklisted a module in order not to taint the kernel:
```
pi@raspberrypi:~ $ cat /etc/modprobe.d/blacklist-snd_bcm2835.conf
blacklist snd_bcm2835
```

## Diagnosis

The user-visible symptom is that WiFi does not work.

From `dmesg` I notice that association with access point fails.
Excerpts of initial portion of `dmesg` that I consider relevant:
```
...
[    3.423933] usb 1-1.2: New USB device found, idVendor=148f, idProduct=7601
[    3.432813] usb 1-1.2: New USB device strings: Mfr=1, Product=2, SerialNumber=3
[    3.442117] usb 1-1.2: Product: 802.11 n WLAN
[    3.448341] usb 1-1.2: SerialNumber: 1.0
...
[   16.363820] usb 1-1.2: reset high-speed USB device number 4 using dwc_otg
[   16.543871] mt7601u 1-1.2:1.0: ASIC revision: 76010001 MAC revision: 76010500
...
[   16.618099] mt7601u 1-1.2:1.0: Firmware Version: 0.1.00 Build: 7640 Build time: 201302052146____
...
[   17.393938] mt7601u 1-1.2:1.0: EEPROM ver:0c fae:00
[   18.133052] ieee80211 phy0: Selected rate control algorithm 'minstrel_ht'
[   18.135614] usbcore: registered new interface driver mt7601u
...
[   19.167626] IPv6: ADDRCONF(NETDEV_UP): wlan0: link is not ready
[   22.902745] wlan0: authenticate with ac:ce:ss:po:in:tX
[   22.969817] wlan0: send auth to ac:ce:ss:po:in:tX (try 1/3)
[   22.971735] wlan0: authenticated
[   22.982994] wlan0: associate with ac:ce:ss:po:in:tX (try 1/3)
[   22.986711] wlan0: RX AssocResp from ac:ce:ss:po:in:tX (capab=0x1411 status=0 aid=3)
[   23.003069] NYET/NAK/ACK/other in non-error case, 0x00000002
[   23.003122] NYET/NAK/ACK/other in non-error case, 0x00000002
[   23.003162] NYET/NAK/ACK/other in non-error case, 0x00000002
[   23.033058] NYET/NAK/ACK/other in non-error case, 0x00000002
[   23.033117] NYET/NAK/ACK/other in non-error case, 0x00000002
[   23.033158] NYET/NAK/ACK/other in non-error case, 0x00000002
[   23.053141] usb usb1-port1: disabled by hub (EMI?), re-enabling...
[   23.053181] usb 1-1: USB disconnect, device number 2
...
[   23.084610] usb 1-1.2: USB disconnect, device number 4
[   23.085396] ------------[ cut here ]------------
[   23.085594] WARNING: CPU: 0 PID: 10 at drivers/net/wireless/mediatek/mt7601u/dma.c:200 mt7601u_complete_rx+0x134/0x148 [mt7601u]
[   23.085600] RX urb mismatch
[   23.085607] Modules linked in: arc4 mt7601u tun mac80211 cfg80211 rfkill uio_pdrv_genirq uio fixed ip6t_REJECT nf_reject_ipv6 nf_log_ipv6 xt_hl ip6t_rt nf_conntrack_ipv6 nf_defrag_ipv6 ipt_REJECT nf_reject_ipv4 nf_log_ipv4 nf_log_common xt_LOG xt_recent xt_limit xt_tcpudp xt_addrtype ipt_MASQUERADE nf_nat_masquerade_ipv4 iptable_nat nf_nat_ipv4 nf_conntrack_ipv4 nf_defrag_ipv4 xt_conntrack ip6table_filter ip6_tables nf_conntrack_netbios_ns nf_conntrack_broadcast nf_nat_ftp nf_nat nf_conntrack_ftp nf_conntrack iptable_filter ip_tables x_tables ipv6
[   23.085789] CPU: 0 PID: 10 Comm: kworker/0:1 Not tainted 4.14.34+ #1110
[   23.085794] Hardware name: BCM2835
[   23.085828] Workqueue: usb_hub_wq hub_event
[   23.085891] [<c0016514>] (unwind_backtrace) from [<c0013e4c>] (show_stack+0x20/0x24)
[   23.085921] [<c0013e4c>] (show_stack) from [<c062ffc4>] (dump_stack+0x20/0x28)
[   23.085950] [<c062ffc4>] (dump_stack) from [<c0021f94>] (__warn+0xe4/0x10c)
[   23.085969] [<c0021f94>] (__warn) from [<c0022004>] (warn_slowpath_fmt+0x48/0x50)
[   23.086087] [<c0022004>] (warn_slowpath_fmt) from [<bf3264f8>] (mt7601u_complete_rx+0x134/0x148 [mt7601u])
[   23.086236] [<bf3264f8>] (mt7601u_complete_rx [mt7601u]) from [<c04610e0>] (__usb_hcd_giveback_urb+0x80/0x160)
[   23.086262] [<c04610e0>] (__usb_hcd_giveback_urb) from [<c0461208>] (usb_hcd_giveback_urb+0x48/0x10c)
[   23.086297] [<c0461208>] (usb_hcd_giveback_urb) from [<c0490318>] (dwc_otg_urb_dequeue+0x98/0xbc)
[   23.086323] [<c0490318>] (dwc_otg_urb_dequeue) from [<c0461cac>] (unlink1+0x40/0x178)
[   23.086345] [<c0461cac>] (unlink1) from [<c0463118>] (usb_hcd_flush_endpoint+0xcc/0xec)
[   23.086366] [<c0463118>] (usb_hcd_flush_endpoint) from [<c046605c>] (usb_disable_endpoint+0x58/0xa0)
[   23.086383] [<c046605c>] (usb_disable_endpoint) from [<c04660f0>] (usb_disable_interface+0x4c/0x64)
[   23.086402] [<c04660f0>] (usb_disable_interface) from [<c0468b90>] (usb_unbind_interface+0x1d0/0x288)
[   23.086440] [<c0468b90>] (usb_unbind_interface) from [<c0400618>] (device_release_driver_internal+0x14c/0x1ec)
[   23.086465] [<c0400618>] (device_release_driver_internal) from [<c04006d8>] (device_release_driver+0x20/0x24)
[   23.086487] [<c04006d8>] (device_release_driver) from [<c03ff4a0>] (bus_remove_device+0xd8/0x108)
[   23.086507] [<c03ff4a0>] (bus_remove_device) from [<c03fc124>] (device_del+0x1ec/0x30c)
[   23.086526] [<c03fc124>] (device_del) from [<c04661b8>] (usb_disable_device+0xb0/0x1f4)
[   23.086546] [<c04661b8>] (usb_disable_device) from [<c045cd34>] (usb_disconnect+0x7c/0x1fc)
[   23.086567] [<c045cd34>] (usb_disconnect) from [<c045ce6c>] (usb_disconnect+0x1b4/0x1fc)
[   23.086586] [<c045ce6c>] (usb_disconnect) from [<c045e718>] (hub_event+0x594/0x11c0)
[   23.086610] [<c045e718>] (hub_event) from [<c0039208>] (process_one_work+0x11c/0x398)
[   23.086629] [<c0039208>] (process_one_work) from [<c00394c0>] (worker_thread+0x3c/0x544)
[   23.086656] [<c00394c0>] (worker_thread) from [<c003f3bc>] (kthread+0x120/0x15c)
[   23.086681] [<c003f3bc>] (kthread) from [<c000fe6c>] (ret_from_fork+0x14/0x28)
[   23.086690] ---[ end trace ee6907230b405e54 ]---
[   23.096896] mt7601u 1-1.2:1.0: Error: submit URB dir:128 ep:1 failed:-19
[   23.108805] mt7601u 1-1.2:1.0: mt7601u_rxdc_cal timed out
[   23.108914] wlan0: associated
[   23.109635] wlan0: deauthenticating from ac:ce:ss:po:in:tX by local choice (Reason: 3=DEAUTH_LEAVING)
[   23.121758] mt7601u 1-1.2:1.0: mt7601u_rxdc_cal timed out
...
[   28.040752] mt7601u 1-1.2:1.0: Error: RX urb failed:-71
...
[   28.042135] mt7601u 1-1.2:1.0: Error: MCU resp urb failed:-71
[   28.042183] mt7601u 1-1.2:1.0: Error: MCU resp evt:0 seq:1-f!
...
[   28.044756] mt7601u 1-1.2:1.0: Error: mt7601u_mcu_wait_resp timed out
...
```
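
The negative numbers in the `failed:-19` and `failed:-71` messages above are kernel errno values. A minimal sketch for decoding them from a `dmesg` excerpt, assuming the generic Linux errno numbering; `ERRNO_NAMES`, `decode_failures` and `SAMPLE` are names made up for this illustration and are not part of the driver:

```python
import re

# Generic Linux errno numbers; only the two codes seen in the log are listed.
ERRNO_NAMES = {
    19: ("ENODEV", "No such device"),
    71: ("EPROTO", "Protocol error"),
}

# Matches driver messages that end in "failed:-<number>".
FAILED_RE = re.compile(r"failed:-(\d+)")


def decode_failures(log_text):
    """Return (errno name, description, log line) for each 'failed:-N' line."""
    results = []
    for line in log_text.splitlines():
        match = FAILED_RE.search(line)
        if not match:
            continue
        code = int(match.group(1))
        name, text = ERRNO_NAMES.get(code, ("UNKNOWN", "not in table"))
        results.append((name, text, line.strip()))
    return results


# Three lines copied from the dmesg excerpt in this report.
SAMPLE = """\
[   23.096896] mt7601u 1-1.2:1.0: Error: submit URB dir:128 ep:1 failed:-19
[   28.040752] mt7601u 1-1.2:1.0: Error: RX urb failed:-71
[   28.042135] mt7601u 1-1.2:1.0: Error: MCU resp urb failed:-71
"""

if __name__ == "__main__":
    for name, text, line in decode_failures(SAMPLE):
        print(f"{name} ({text}): {line}")
```

Read this way, the `-19` (`ENODEV`) on URB submission comes right after the hub reports the disconnect, and the later `-71` (`EPROTO`) values point at low-level USB transfer errors. This is only my reading of the codes, not a confirmed root cause.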

## Analysis

### Identification of similar issues

This may be the same issue as reported in [this comment with the same driver - `mt7601u_rxdc_cal timed out`](raspberrypi/linux#1231 (comment)).

This seems to be the same symptom as reported in [this comment with a similar driver - `rx urb mismatch` and `mt76_usb_complete_rx [mt76]`](openwrt/mt76#139 (comment)) and [fixed](openwrt/mt76@ad0a3e9).
In that case the resolution was to use `GFP_ATOMIC` in the call to `mt76_usb_submit_buf`, which passes it on to `usb_submit_urb`.

### Identification of affected source code

```
raspberrypi-kernel      1.20180417-1
```
```
[   23.085594] WARNING: CPU: 0 PID: 10 at drivers/net/wireless/mediatek/mt7601u/dma.c:200 mt7601u_complete_rx+0x134/0x148 [mt7601u]
```

Versioned link to affected source code: https://github.com/raspberrypi/linux/blob/raspberrypi-kernel_1.20180417-1/drivers/net/wireless/mediatek/mt7601u/dma.c#L200
lucafavatella added a commit to lucafavatella/linux that referenced this issue May 20, 2018
Addresses the following warning:
```
[   23.085594] WARNING: CPU: 0 PID: 10 at drivers/net/wireless/mediatek/mt7601u/dma.c:200 mt7601u_complete_rx+0x134/0x148 [mt7601u]
[   23.085600] RX urb mismatch
```

This WARN seems to be the same symptom as reported in [this comment for the
similar `mt76x2u` driver - `rx urb mismatch` and `mt76_usb_complete_rx
[mt76]`](openwrt/mt76#139 (comment))
and [fixed by using `GFP_ATOMIC` in the call to `mt76_usb_submit_buf`, which
passes it on to
`usb_submit_urb`](openwrt/mt76@ad0a3e9),
so the fix here is similar.

lucafavatella added a commit to lucafavatella/linux that referenced this issue May 26, 2018
Addresses the following errors:
```
[   28.042135] mt7601u 1-1.2:1.0: Error: MCU resp urb failed:-71
[   28.042183] mt7601u 1-1.2:1.0: Error: MCU resp evt:0 seq:1-f!
...
[   28.044756] mt7601u 1-1.2:1.0: Error: mt7601u_mcu_wait_resp timed out
```

TODO: Fix the DMA WARNING first ("WARNING: CPU: 0 PID: 10 at drivers/net/wireless/mediatek/mt7601u/dma.c:200 mt7601u_complete_rx+0x134/0x148 [mt7601u]"), then test this MCU patch if the MCU errors are still present.
