bcm2835-sdhost: wait at least 150us between read retries #1486
With the Compute Module I get timeouts while initializing the eMMC:
mmc0: command never completed.
And the system won't boot properly.
With this patch it boots successfully every single time and never shows the above error.
Also, I don't see any performance regressions whatsoever.
150us seems to be a magic number, as shorter wait times don't always let the eMMC initialize successfully on the first retry.
So use 150us to 200us as the usleep range between read retries.
Signed-off-by: Ahmet Inan <[email protected]>
Please explain why increasing the polling interval improves booting on eMMC. I'm reluctant to apply a patch that seems to work without an understanding of how.
If we don't wait at least 150 microseconds beforehand, every one of the many bcm2835_sdhost_read retries we issue within the 100 millisecond timeout window fails.
I think you've misunderstood that code slightly. That isn't a retry path, it's a slower poll path. The read you see isn't a read of the card; it is a read of the register that indicates the status of the command. I'm concerned that the reason this patch works is that it can delay the time between the command being reported as complete and the next operation by up to 150-200us. The reason I say "up to" is that if the command completes at the end of that 150-200us window, then the added delay could be zero, which might lead to a failure. In addition, this added delay may only be required for one command. As an experiment, can you put the polling interval back to what it was, but add an explicit delay after the loop (or at the point of the break), and see whether that also fixes the problem and what the minimum delay is?
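The "up to" concern above can be illustrated with a small model. This is only a sketch, not driver code: `detection_slack_us` is a hypothetical helper, and it assumes the status register is sampled at exact multiples of a fixed interval, which the real usleep range does not guarantee. It returns how long a completed command sits unnoticed before the next sample, which is the only delay the longer polling interval adds after completion.

```c
#include <assert.h>

/*
 * Hypothetical model, not taken from the bcm2835-sdhost driver.
 * Assumption: the status register is sampled at t = 0, interval_us,
 * 2 * interval_us, ... and the command completes at t = done_us.
 * Returns the time between completion and the sample that notices it.
 */
static unsigned int detection_slack_us(unsigned int done_us,
                                       unsigned int interval_us)
{
    assert(interval_us > 0);

    unsigned int rem = done_us % interval_us;

    /* Completion exactly on a sample point is noticed with no extra delay. */
    return rem == 0 ? 0 : interval_us - rem;
}
```

With a 150us interval, a command finishing just after a sample gets almost the full 150us of slack, while one finishing right on a sample gets none, so the delay the patch adds is not a guaranteed minimum.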
Thanks, will investigate and come back to you :)
150us seems to be the absolute minimum, otherwise you will see an occasional "mmc0: command never completed."
Tomorrow would be better for me.
Good morning, Before my patch: After my patch: Also discovered (maybe a side effect of pr_err) that 150us is not enough, as I see one "mmc0: command never completed": Increasing delay to 200-300us: I still think that the extra delay before the read is the correct place, as we have already exhausted the cmd_quick_poll_retries in the quick poll loop and are now doing things at a slower pace in the slow poll loop path, so there is no need to rush into things :) I don't have the programming manual (just the datasheet) for this KLM4G1FEAC, but my guess is that we are "disturbing" the command from finishing if we poll too fast - as putting the timeout even higher (2 seconds!) doesn't help before my patch.
The problem with this theory is that the read is only checking the status of the SDHOST interface to see if the command has completed. This doesn't trigger any activity on the SD bus - once the command is sent, all the interface can do is wait for the card.
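For readers following the quick poll / slow poll discussion, here is a minimal user-space sketch of that two-phase structure. It is an illustration under stated assumptions, not the driver's implementation: `struct fake_host`, `fake_status_done` and `wait_for_command` are hypothetical names, time is simulated rather than slept, and the status read only inspects local state, mirroring the point that polling the register puts nothing on the SD bus.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical stand-in for the controller; not the real driver state. */
struct fake_host {
    unsigned int now_us;     /* simulated clock */
    unsigned int done_at_us; /* when the status flips to "complete" */
    unsigned int reads;      /* number of status register reads issued */
};

/* Reading the status only looks at local state; the "card" is untouched. */
static bool fake_status_done(struct fake_host *h)
{
    h->reads++;
    return h->now_us >= h->done_at_us;
}

/*
 * Two-phase wait: quick_retries back-to-back reads, each costing
 * quick_cost_us of simulated time, then one read every slow_us until
 * timeout_us has elapsed. Returns true if completion was observed.
 */
static bool wait_for_command(struct fake_host *h,
                             unsigned int quick_retries,
                             unsigned int quick_cost_us,
                             unsigned int slow_us,
                             unsigned int timeout_us)
{
    assert(slow_us > 0);

    for (unsigned int i = 0; i < quick_retries; i++) {
        if (fake_status_done(h))
            return true;
        h->now_us += quick_cost_us;
    }

    unsigned int deadline = h->now_us + timeout_us;
    while (h->now_us < deadline) {
        if (fake_status_done(h))
            return true;
        h->now_us += slow_us; /* stands in for the sleep between polls */
    }

    /* One last check after the deadline, then give up. */
    return fake_status_done(h);
}
```

In this model the poll interval changes only when completion is noticed, never when it happens; `done_at_us` is fixed before the loop starts.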
Thanks, that was the info I was missing, and I now understand your concern. Anything below 150 gives "mmc0: command never completed"
FYI: That extra delay really only helps inside or anywhere above the loop, but at no point below it.
I made another patch and a new pull request: This way it won't slow down the other command polls.