Skip to content

RPI4 SPI controller CS deasserts too early at 1MHz #5655

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Open
AndreasFuchsTPM opened this issue Oct 16, 2023 · 12 comments
Open

RPI4 SPI controller CS deasserts too early at 1MHz #5655

AndreasFuchsTPM opened this issue Oct 16, 2023 · 12 comments

Comments

@AndreasFuchsTPM
Copy link

AndreasFuchsTPM commented Oct 16, 2023

Describe the bug

When using the SPI controller via the spidev driver from userspace, the timing between the last falling edge of the CLK and the deassertion of CS is somehow off.
It even happens that CS is deasserted before the last CLK fall.

This is worst at around 1MHz. It get's much better below 500kHz and above 2MHz.

Blue is CLK, Green is CS
SDS00080

Steps to reproduce the behaviour

I need a payload that uses 2 ioctls and leaves CS asserted between those two via the cs_change parameter = 1.

Attached you find a simple python script that reproduces this issue.
I had to insert the "waiting with asserted CS" between byte 4 and 5:
spidev-test.py.txt

Device (s)

Raspberry Pi 4 Mod. B

System

cat /etc/rpi-issue
Raspberry Pi reference 2023-05-03
Generated using pi-gen, https://github.com/RPi-Distro/pi-gen, 47eee1f0ddcf8811559d51eea1c1bb48335e3e88, stage2

vcgencmd version
Mar 17 2023 10:50:39
Copyright (c) 2012 Broadcom
version 82f3750a65fadae9a38077e3c2e217ad158c8d54 (clean) (release) (start)

uname -a
Linux raspberrypi-arael 6.1.21-v8+ #1642 SMP PREEMPT Mon Apr 3 17:24:16 BST 2023 aarch64 GNU/Linux

Logs

Blue is CLK, Green is CS
SDS00081
SDS00082
SDS00083

Additional context

No response

@pelwell
Copy link
Contributor

pelwell commented Oct 16, 2023

That's strange, because we use GPIO (software-controlled) CS lines on Pi 4 (and other Pis). Can you confirm which custom config.txt entries you have applied?

@pelwell
Copy link
Contributor

pelwell commented Oct 16, 2023

And which SPI interface you are using.

@AndreasFuchsTPM
Copy link
Author

config.txt is default.

$ dtoverlay spi0-2cs Using this standard dtbo

We use /dev/spidev0.1

@pelwell
Copy link
Contributor

pelwell commented Oct 18, 2023

Is this actually causing problems for you? The smallest gap I've seen is 48ns, which isn't large, but it also isn't negative.

What kind of gap does your application require?

@AndreasFuchsTPM
Copy link
Author

AndreasFuchsTPM commented Oct 19, 2023

The application requires 5ns. The problem is that we actually go below this or even into the negative, which is of course not ok.
Funny enough, we do not see a problem >2MHz or <500kHz. I have not tested all frequencies in the middle.

Also note: I could not reproduce this with calls to spidev_test or other setups. Also on 2 different RPI4, the error occurs on different parts of the program.
Thus, there seems to be a very specific (small) timing window, that leads to negatives. Maybe you need to vary the SPI-frequency or add a usleep between the ioctls. But once you hit it, it seems kind of stable...

@pelwell
Copy link
Contributor

pelwell commented Oct 19, 2023

I'm wondering if the problem is that TX FIFO empty is not the same as idle. @l1k You've done a lot of work on the SPI driver over the years, and I wouldn't be surprised if you've intentionally used TX FIFO empty to allow for a bit of overlap to reduce the gap between transfers - is that correct? Is there a neat way of forcing it to wait until the transfer is complete before de-asserting CS?

@l1k
Copy link
Contributor

l1k commented Oct 19, 2023

Are there any error messages or warnings in dmesg?

How large are the messages being transferred? If they're just a few bytes long, the driver uses polling, i.e. it alternatingly writes, then reads bytes from the FIFO until the message is complete. If the message is larger, DMA is used and then it gets more complicated.

I don't think "TX FIFO empty" semantics play a role here. Since SPI is full duplex, the message is not complete until the same amount of bytes has been received which has been sent.

I could imagine that the clock calculations in bcm2835_spi_transfer_one() might be slightly off for certain frequencies. That would seem to be the first thing that needs to be looked at more closely. I think the core clock (from which the SPI clock is derived) can be controlled through config.txt, so it might be worth checking if changing it results in a different behavior. Another idea would be to try on a 3B or 3B+, which has a different core clock than the 4B, hence might behave differently with a 1 MHz SPI clock.

@l1k
Copy link
Contributor

l1k commented Oct 19, 2023

Oh also, if the CPU is not under heavy load, it is clocked down. E.g. on a CM3 the core clock is reduced from 1200 MHz to 600 MHz. That will halve the bandwidth of the SPI bus. The workaround is to switch to "performance" cpufreq governor in sysfs. That will pin the clock at 1200 MHz and thus lead to predictable SPI clocks. Just something to keep in mind.

@AndreasFuchsTPM
Copy link
Author

I think the two different approaches may be a lead...

For 4-byte chunks, we see it running nicely:
image

For a 1-byte transfer, we see it not-so-nicely:
image

For a bit of background:
We first send 4 bytes but instruct the ioctl to keep CS asserted.
Then in a second ioctl (that releases CS) we send either 1 or 4 bytes usually.

@AndreasFuchsTPM
Copy link
Author

Another side comment: On the tpm_tis (in-kernel) driver, the same problem seems to have occurd.
Thus the following line was implemented there: https://git.kernel.org/pub/scm/linux/kernel/git/stable/linux.git/tree/drivers/char/tpm/tpm_tis_spi_main.c?h=v6.5.8#n113

@AndreasFuchsTPM
Copy link
Author

Oh also, if the CPU is not under heavy load, it is clocked down. E.g. on a CM3 the core clock is reduced from 1200 MHz to 600 MHz. That will halve the bandwidth of the SPI bus. The workaround is to switch to "performance" cpufreq governor in sysfs. That will pin the clock at 1200 MHz and thus lead to predictable SPI clocks. Just something to keep in mind.

I did set the governor to performance, but I still have the same results.

I have a new LA with 1GHz sampling rate and it is very visible here:

image

Only occurs on the 1-byte-writes, never on the 4-byte-writes.

(Zoomed out version:)
image

@AndreasFuchsTPM
Copy link
Author

AndreasFuchsTPM commented Nov 27, 2023

Are there any error messages or warnings in dmesg?

dmesg is empty besides the warnings from the sudo dtoverlay spi0-2cs call:

[   40.011000] OF: overlay: WARNING: memory leak will occur if overlay removed, property: /soc/gpio@7e200000/spi0_cs_pins/brcm,pins
[   40.011047] OF: overlay: WARNING: memory leak will occur if overlay removed, property: /soc/spi@7e204000/cs-gpios
[   40.011067] OF: overlay: WARNING: memory leak will occur if overlay removed, property: /soc/spi@7e204000/status

How large are the messages being transferred?

I call the ioctl from python.
First 4 bytes with cs_change=1 then 1 byte with cs_change=0

In the other messages that are 4 bytes + 4 bytes, I do not see any issues, there is actually a looot of time between CLK and CS deassert.

image

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

No branches or pull requests

3 participants