Skip to content

SPI: merge upstream patches for spi-bcm2835 that went into mainline (for 4.1) #930

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Closed
msperl opened this issue Apr 15, 2015 · 13 comments
Closed

Comments

@msperl
Copy link
Contributor

msperl commented Apr 15, 2015

Hi!

Can you merge the following patches from upstream into your kernel to improve the spi-bcm2835 driver ahead of 4.1?

torvalds/linux@342f948 - spi: bcm2835: fix all checkpath --strict messages
torvalds/linux@4adf312 - spi: bcm2835: fill/drain SPI-fifo as much as possible during interrupt
torvalds/linux@210b492 - spi: bcm2835: clock divider can be a multiple of 2
torvalds/linux@6935224 - spi: bcm2835: enable support of 3-wire mode
torvalds/linux@e34ff01 - spi: bcm2835: move to the transfer_one driver model
torvalds/linux@1e4df62 - spi: bcm2835: fix code formatting issue
torvalds/linux@e3a2be3 - spi: bcm2835: fill FIFO before enabling interrupts to reduce interrupt latencies
torvalds/linux@a30a555 - spi: bcm2835: transform native-cs to gpio-cs on first spi_setup
torvalds/linux@704f32d - spi: bcm2835: enabling polling mode for transfers shorter than 30us

The driver also makes use of a new interface in 4.1, but we can disable that by commenting this line out:
master->handle_err = bcm2835_spi_handle_err;

Note that there is still one issue that can result in -ETIMEDOUT during HEAVY load on the system with short transfers (>70%system, 0%idle).
A solution for that is in the pipeline - I will amend that to the ticket as soon as it has gone in.

Also note that in 4.1 the spi-framework contains some patches that greatly improve latencies for kernel modules that use spi_sync et.al, so maybe it is better to target 4.1 as a whole as soon as it comes out. But that is at least 2 month away...

@popcornmix
Copy link
Collaborator

Are you more interested in 3.18 or 4.0 tree?

@msperl
Copy link
Contributor Author

msperl commented Apr 15, 2015

Well - I do not know your timelines for your next kernel releases.

If the move to 4.0 is further off, then 3.18 if 4.0 is coming in shortly, then that.

4.0 already comes with the improvement for spi_sync (in case of an idle queue), so this would be another thing that is beneficial.

@popcornmix
Copy link
Collaborator

4.0 seems to work fine (it's in latest OpenELEC builds).
If no issues are discovered I suspect that will be the next stable bump. Might be a month or two off.

We can pull it into 3.18 as well. Do you believe existing SPI users will have any issues, or need to change any setup after the merge?

@msperl
Copy link
Contributor Author

msperl commented Apr 15, 2015

No change is needed anywhere.

It will even alter the GPIO 7 and 8 (chip-selects) from ALT0 to OUT on the fly.

There are immediate advantages (unlimited - that means < 56) number of CS, less interrupts and context switches for big transfers, latency improvements), but not as much as with 4.0.

Also on to spi-bcm2835 is still not used as the default, but spi-bcm2708 is - you still have to enable it via the device-tree-overlay...

What is still missing is DMA mode, but that shall arrive as well...

Please note again this one issue that is left open that can have side-effects until fixed.
My fixed version works without any issues on an upstream kernel with a ModelB as well as with the foundation kernel on a Model2.

This is the submitted fix (no feedback yet if it gets merged) for the issue:

diff --git a/drivers/spi/spi-bcm2835.c b/drivers/spi/spi-bcm2835.c
index f63864a..af29467 100644
--- a/drivers/spi/spi-bcm2835.c
+++ b/drivers/spi/spi-bcm2835.c
@@ -164,28 +164,16 @@ static int bcm2835_spi_transfer_one_poll(struct spi_master *master,
                     unsigned long xfer_time_us)
 {
    struct bcm2835_spi *bs = spi_master_get_devdata(master);
-   unsigned long timeout = jiffies +
-       max(4 * xfer_time_us * HZ / 1000000, 2uL);

    /* enable HW block without interrupts */
    bcm2835_wr(bs, BCM2835_SPI_CS, cs | BCM2835_SPI_CS_TA);

-   /* set timeout to 4x the expected time, or 2 jiffies */
    /* loop until finished the transfer */
    while (bs->rx_len) {
        /* read from fifo as much as possible */
        bcm2835_rd_fifo(bs);
        /* fill in tx fifo as much as possible */
        bcm2835_wr_fifo(bs);
-       /* if we still expect some data after the read,
-        * check for a possible timeout
-        */
-       if (bs->rx_len && time_after(jiffies, timeout)) {
-           /* Transfer complete - reset SPI HW */
-           bcm2835_spi_reset_hw(master);
-           /* and return timeout */
-           return -ETIMEDOUT;
-       }
    }

    /* Transfer complete - reset SPI HW */

@msperl
Copy link
Contributor Author

msperl commented Apr 16, 2015

There is the following patch scheduled to go upstream by the spi-maintainer:

145367b - spi: bcm2835: change timeout of polling driver to 1s

That completes the first set of improvements for the spi driver.
The next requires dma, so might only come with 4.2 or later - depending on when the dma changes get incorporated into the bcm2835 branch... Notro started pushing some changes, but I fear there is still some way to go...

@popcornmix
Copy link
Collaborator

I've added your patches to rpi-4.0.y tree. It builds okay.
If you confirm the patches look fine I'll add to 3.18 tree.

@msperl
Copy link
Contributor Author

msperl commented Apr 17, 2015

I will build 4.0.y on a rpi2 today and then run my torture test and will let you know...

@msperl
Copy link
Contributor Author

msperl commented Apr 17, 2015

I tried to build 4.0.y, but it is stuck during boot - see: #934
3.18.y works fine, but does not include that patch...

@msperl
Copy link
Contributor Author

msperl commented Apr 17, 2015

But the patch is OK in principle, so it should also apply to 3.18.y, which I am just rebuilding now...

@msperl
Copy link
Contributor Author

msperl commented Apr 18, 2015

OK - now that the Devicetree was also changed 4.0.y works and the driver does what it is expected to do - even with the torture test!
But there is something different when running on a rpi2 with the rpi-4.0.y kernel. It does not show the same results on the reduction of the number of interrupts and context switches that the upstream kernel shows on a RPI-B+:

Lots of those interrupts actually come from:

IPI2:    4868269    2392205    6575101     891221  Rescheduling interrupts

when the torture-load is running.

I need to investigate the missmatch with a logic analyzer to maybe come to an understanding why it is also not using polling-mode...

@msperl
Copy link
Contributor Author

msperl commented Apr 18, 2015

Ah - just after send the explanation comes to my mind: the spi_sync optimization only came with 4.1.
That is why we have still more context switches and spi0 is running with CPU load.
Still I find those high number of "rescheduling" interrupts a bit strange...

@msperl
Copy link
Contributor Author

msperl commented Apr 18, 2015

but I can now confirm it works as expected for rpi-3.18.y (copying) and rpi-4.0.y (in tree)...

@msperl
Copy link
Contributor Author

msperl commented Apr 18, 2015

To get the latest improvements early it might also be beneficial to apply those to 3.18.y, so cherry-pick:
a875441673bf5fcc94b63137a7d2dc239e1d337a
929d458af89e7a99a876d654fd2b212244f44fcd
64799c6eea7c32f053d8d34d4230451ab691a67f
in that sequence!

But note that there is something strange on the rpi-kernels - at least when using 3.18.y on a rpi2:

  • when running with all 4 CPUs, then - as mentioned - there are LOTS of those rescheduling interrupts.
  • when running with maxcpus=1 then I get "almost" normal behaviour, that I see with upstream 4.1, but there is still this unexplained issue that "sometimes" the messages are still handled via the worker-queue, even though there is only one spi device running (mcp251x) and that exclusively uses spi_sync. To me it seems as if there is a second interrupt handler running that is as well scheduling spi transfers in parallel while the first one is not finished yet, so maybe some masking problem?

I will try the same with rpi-4.0.y (which contains the above patches already) to see if we see the same issue there as well...

popcornmix added a commit to raspberrypi/firmware that referenced this issue Apr 20, 2015
kernel: smsc95xx: Disable turbo mode by default
See: raspberrypi/linux#837

kernel: SPI: merge upstream patches for spi-bcm2835
See: raspberrypi/linux#930

firmware: jpegdec: Fix memory corruption issue caused by repeated exif tags
See: OpenELEC/OpenELEC.tv#4089

firmware: lcd: Updated pi lcd support

firmware: arm_display: Avoid overscan for lcd displays
popcornmix added a commit to Hexxeh/rpi-firmware that referenced this issue Apr 20, 2015
kernel: smsc95xx: Disable turbo mode by default
See: raspberrypi/linux#837

kernel: SPI: merge upstream patches for spi-bcm2835
See: raspberrypi/linux#930

firmware: jpegdec: Fix memory corruption issue caused by repeated exif tags
See: OpenELEC/OpenELEC.tv#4089

firmware: lcd: Updated pi lcd support

firmware: arm_display: Avoid overscan for lcd displays
@msperl msperl closed this as completed Jun 25, 2015
neuschaefer pushed a commit to neuschaefer/raspi-binary-firmware that referenced this issue Feb 27, 2017
kernel: smsc95xx: Disable turbo mode by default
See: raspberrypi/linux#837

kernel: SPI: merge upstream patches for spi-bcm2835
See: raspberrypi/linux#930

firmware: jpegdec: Fix memory corruption issue caused by repeated exif tags
See: OpenELEC/OpenELEC.tv#4089

firmware: lcd: Updated pi lcd support

firmware: arm_display: Avoid overscan for lcd displays
pfpacket pushed a commit to pfpacket/linux-rpi-rust that referenced this issue Apr 7, 2023
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

No branches or pull requests

2 participants