Skip to content

dmaengine: dw-axi-dmac: Allow client-chosen width #6377

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Merged
merged 4 commits into from
Nov 8, 2024

Conversation

pelwell
Copy link
Contributor

@pelwell pelwell commented Sep 19, 2024

For devices where transfer lengths are not known upfront, there is a
danger when the destination is wider than the source that partial words
can be lost at the end of a transfer. Ideally the controller would be
able to flush the residue, but it can't - it's not even possible to tell
that there is any.

Instead, allow the client driver to avoid the problem by setting a
smaller width.

For devices where transfer lengths are not known upfront, there is a
danger when the destination is wider than the source that partial words
can be lost at the end of a transfer. Ideally the controller would be
able to flush the residue, but it can't - it's not even possible to tell
that there is any.

Instead, allow the client driver to avoid the problem by setting a
smaller width.

Signed-off-by: Phil Elwell <[email protected]>
SPI transfers are of defined length, unlike some UART traffic, so it is
safe to let the DMA controller choose a suitable memory width.

Signed-off-by: Phil Elwell <[email protected]>
In order to avoid losing residue bytes when a receive is terminated
early, set the destination width to single bytes.

Link: raspberrypi#6365

Signed-off-by: Phil Elwell <[email protected]>
@pelwell
Copy link
Contributor Author

pelwell commented Sep 19, 2024

@P33M The cause of the recent UART woes was the loss of this code fragment in the DMAC driver:

               /* Prevent partial access units getting lost */
               if (mem_width > reg_width)
                       mem_width = reg_width;

The partial access units in question are the bytes that have been received from the UART but not yet grouped into a word. If the UART driver decides that sufficient time has passed that any residue should be pushed to the tty framework, this partial word is inaccessible. And unlike the transmit case, there is no way of knowing what those bytes were.

Work around the problem by honouring a non-zero memory destination width (unless the DMAC driver knows it must be more restrictive).

@kraln
Copy link

kraln commented Oct 10, 2024

FYI, we've noticed a regression with recent kernels (after 6.6.47+rpt-rpi-2712) whereby a simple poll() on a uart returns immediately, resulting in read() returning EAGAIN, such that a trivial reader consumes 100% of one core. Scratching our heads quite a bit--the default reason is not to blame the kernel...

This pull request appears to prevent that behavior, but without a test case we're concerned it will re-appear in the future. It would be great if this could be considered.

@pelwell
Copy link
Contributor Author

pelwell commented Oct 10, 2024

We've disabled UART DMA in the most recent kernels as a sure, safe workaround while we consider this some more. Are you still seeing poll() problems with 6.6.51 etc.?

@kraln
Copy link

kraln commented Oct 10, 2024

Are you still seeing poll() problems with 6.6.51 etc.?

OK:
Linux xxx 6.6.47+rpt-rpi-2712 #1 SMP PREEMPT Debian 1:6.6.47-1+rpt1 (2024-09-02) aarch64 GNU/Linux

Broken:
Linux xxx 6.6.51+rpt-rpi-2712 #1 SMP PREEMPT Debian 1:6.6.51-1+rpt2 (2024-10-01) aarch64 GNU/Linux
Linux xxx 6.6.51+rpt-rpi-2712 #1 SMP PREEMPT Debian 1:6.6.51-1+rpt3 (2024-10-08) aarch64 GNU/Linux

OK:
Linux xxx 6.6.51-v8-16k+ #1 SMP PREEMPT Thu Sep 19 18:05:24 UTC 2024 aarch64 GNU/Linux

I'll try and get a small testcase together

@pelwell
Copy link
Contributor Author

pelwell commented Oct 10, 2024

Interesting - on 6.6.51+rpt-rpi-2712, what does cat /sys/kernel/debug/dmaengine/summary report?

@honx
Copy link

honx commented Oct 10, 2024

cat /sys/kernel/debug/dmaengine/summary

I'm a colleague of @kraln

On the non-working 6.6.51+rpt-rpi-2712 we get:

dma0 (1000010000.dma): number of channels: 5

dma1 (1000010600.dma): number of channels: 5
dma1chan0 | 107d004000.spi:tx
dma1chan1 | 107d004000.spi:rx
dma1chan2 | 107c701400.hdmi:audio-rx
dma1chan3 | 107c706400.hdmi:audio-rx

dma2 (1f00188000.dma): number of channels: 8
dma2chan0 | 1f00050000.spi:rx
dma2chan1 | 1f00050000.spi:tx

On the working kernel 6.6.51-v8-16k+, we get:

dma0 (1000010000.dma): number of channels: 5

dma1 (1000010600.dma): number of channels: 5
dma1chan0 | 107d004000.spi:tx
dma1chan1 | 107d004000.spi:rx
dma1chan2 | 107c701400.hdmi:audio-rx
dma1chan3 | 107c706400.hdmi:audio-rx

dma2 (1f00188000.dma): number of channels: 8
dma2chan0 | 1f00050000.spi:rx
dma2chan1 | 1f00050000.spi:tx
dma2chan2 | 1f00030000.serial:tx
dma2chan3 | 1f00030000.serial:rx

@pelwell
Copy link
Contributor Author

pelwell commented Oct 10, 2024

Are you absolutely certain that you got those two sets of results the right way round? I expected the working system not to have any 1f00030000.serial:rx entries because UART DMA has been disabled.

There is a second question over why the two are different, which looks like an out-of-date dtb file. Note that you could try copying /boot/firmware/bcm2712-rpi-5-b.dtb from working to non-working, and rebooting.

@honx
Copy link

honx commented Oct 10, 2024

quick update: yes, the above dmaengine/summary output is the correct way around.

however, it seems our issue was only triggered by disabling DMA but looking at the traces, it seems poll/read work fine in both cases and its userland code thats buggy. subtle bug that did not show with DMA, as large data chungs were returned by read. but now, with disabled DMA, it just returns 16byte chunks, and userlands faulty assumptions break..

we'll debug some more but i think our problems are unrelated to the original issue...

@honx
Copy link

honx commented Oct 10, 2024

quick update: yes, the above dmaengine/summary output is the correct way around.

however, it seems our issue was only triggered by disabling DMA but looking at the traces, it seems poll/read work fine in both cases and its userland code thats buggy. subtle bug that did not show with DMA, as large data chungs were returned by read. but now, with disabled DMA, it just returns 16byte chunks, and userlands faulty assumptions break..

we'll debug some more but i think our problems are unrelated to the original issue...

ok, after some testing, we're sure that while kernel behaviour has changed, it is still correct. and the userland code must be fixed.

@pelwell
Copy link
Contributor Author

pelwell commented Oct 10, 2024

OK, thanks for the confirmation.

@P33M
Copy link
Contributor

P33M commented Nov 8, 2024

I2S (via convoluted setup in sound/soc/soc-generic-dmaengine-pcm.c) sets only the peripheral's buswidth. I assume that because tx/rx is effectively continuous, the trapped residue for rx doesn't matter?

@pelwell
Copy link
Contributor Author

pelwell commented Nov 8, 2024

That was my thinking. And the fact that I2S data rate is potentially so much higher than the UARTs that we don't want to pay the penalty of the narrow transfers.

@P33M P33M marked this pull request as ready for review November 8, 2024 10:53
@pelwell pelwell merged commit d47754e into raspberrypi:rpi-6.6.y Nov 8, 2024
11 of 12 checks passed
popcornmix added a commit to raspberrypi/firmware that referenced this pull request Nov 8, 2024
See: raspberrypi/linux#6458

kernel: dmaengine: dw-axi-dmac: Allow client-chosen width
See: raspberrypi/linux#6377

kernel: drm/vc4: Allow option to transpose the output on the writeback connector
See: raspberrypi/linux#6312

kernel: raspberrypi/linux#6454
See: raspberrypi/linux#6454

kernel: More dwc3 quirks
See: raspberrypi/linux#6457

kernel: mmc: quirks: add more broken Kingston Canvas Go! SD card date ranges
See: raspberrypi/linux#6447
popcornmix added a commit to raspberrypi/rpi-firmware that referenced this pull request Nov 8, 2024
See: raspberrypi/linux#6458

kernel: dmaengine: dw-axi-dmac: Allow client-chosen width
See: raspberrypi/linux#6377

kernel: drm/vc4: Allow option to transpose the output on the writeback connector
See: raspberrypi/linux#6312

kernel: raspberrypi/linux#6454
See: raspberrypi/linux#6454

kernel: More dwc3 quirks
See: raspberrypi/linux#6457

kernel: mmc: quirks: add more broken Kingston Canvas Go! SD card date ranges
See: raspberrypi/linux#6447
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

4 participants