
dwc_otg has poor performance with multiple high-speed bulk IN endpoints #2050


Closed

P33M opened this issue Jun 1, 2017 · 1 comment

P33M (Contributor) commented Jun 1, 2017

This is a catch-all thread consolidating analysis of the issue that arises when:

  • Multiple devices with multiple high-speed bulk IN endpoints are attached to the Pi
  • Many of the endpoints are actively being polled (have URBs outstanding)
  • Many of the endpoints return no data for extended periods of time (tens of milliseconds or more)

In this set of conditions, the host channels in the OTG core will poll endpoints indefinitely, never interrupting the host, until one of three conditions is met:

  • The maximum data transfer length has been received
  • A short packet is received
  • A bus error occurs

For endpoints that have no data available, this implies that the host channel stays in a stable, busy state for a long time.

As an interrupt on transfer completion or an error state is required to release a host channel to service other transfers, the maximum throughput on USB drops and the endpoint servicing interval for all non-periodic endpoints increases dramatically. In extreme cases, this ends up overflowing internal device data FIFOs.

cf.
raspberrypi/firmware#582
#1692
#2023

If we can somehow break this behaviour intentionally, we can then implement (yet another) software workaround that at least results in somewhat timely servicing of multiple endpoints.

We need to get two things to happen:
a) The host channel must raise an interrupt the first time it goes around the polling loop
b) We must not throw data on the floor if the first response the host channel gets is a data packet

The only way we can stop a channel in DMA mode is to set the channel disable bit and hope for the best (see the sketch below). If we receive the interrupt that says condition a) is fulfilled (i.e. a NAK) and then disable the channel, the core could already be in the process of sending another IN token to the same endpoint. If the endpoint then returns data, we could potentially truncate the packet and/or erroneously signal ACK to the endpoint.
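
A minimal sketch of that disable primitive, assuming the standard dwc2 host-channel register layout (HCCHARn at 0x500 + 0x20 * channel); the function name is hypothetical, not actual driver code:

```c
#include <linux/io.h>
#include <linux/types.h>

#define HCCHAR(ch)	(0x500 + (ch) * 0x20)
#define HCCHAR_CHENA	(1u << 31)	/* channel enable */
#define HCCHAR_CHDIS	(1u << 30)	/* channel disable request */

/* Request that a running host channel halt. The halt completes
 * asynchronously and is signalled via CHHLTD in HCINTn - hence
 * "hope for the best" while the channel may still be mid-transaction.
 */
static void channel_request_halt(void __iomem *regs, int ch)
{
	u32 hcchar = readl(regs + HCCHAR(ch));

	/* Both CHENA and CHDIS must be set to halt an enabled channel. */
	hcchar |= HCCHAR_CHENA | HCCHAR_CHDIS;
	writel(hcchar, regs + HCCHAR(ch));
}
```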

I think we can satisfy the conditions (in the FIQ) if we do some deliberately bad things (sketched after this list):

  • Unmask the NAK interrupt in addition to channel halted interrupt
  • Set the data toggle PID to be incorrect on purpose, so an ACK will not be issued
  • For the first interrupt we get, if it was a NAK then disable the channel
  • If we got a valid packet on the first interrupt, it will have been thrown away, so restart the channel with the correct toggle
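
A minimal sketch of that setup, assuming the standard dwc2 HCINTMSK/HCTSIZ register layouts; prime_channel_bad_toggle() and its arguments are hypothetical:

```c
#include <linux/io.h>
#include <linux/types.h>

#define HCINTMSK(ch)		(0x50C + (ch) * 0x20)
#define HCTSIZ(ch)		(0x510 + (ch) * 0x20)
#define HCINTMSK_CHHLTD		(1u << 1)
#define HCINTMSK_NAK		(1u << 4)
#define HCTSIZ_PID_MASK		(3u << 29)
#define HCTSIZ_PID_DATA0	(0u << 29)
#define HCTSIZ_PID_DATA1	(2u << 29)

/* Prime a bulk IN channel so that (a) the first NAK raises an
 * interrupt instead of silently re-polling, and (b) any data packet
 * is refused (no ACK issued) because the programmed toggle is wrong,
 * so the device retransmits once we restart with the correct PID.
 */
static void prime_channel_bad_toggle(void __iomem *regs, int ch,
				     bool expect_data1)
{
	u32 hctsiz = readl(regs + HCTSIZ(ch));

	/* Unmask NAK in addition to the usual channel-halted interrupt. */
	writel(HCINTMSK_CHHLTD | HCINTMSK_NAK, regs + HCINTMSK(ch));

	/* Deliberately program the opposite data toggle. */
	hctsiz &= ~HCTSIZ_PID_MASK;
	hctsiz |= expect_data1 ? HCTSIZ_PID_DATA0 : HCTSIZ_PID_DATA1;
	writel(hctsiz, regs + HCTSIZ(ch));
}
```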

Then we collapse into two state flows (sketched in code after the list):

  • Waiting for channel disable:
      • The next interrupt could occur after the channel has performed another transfer
      • Because the data toggle is still set incorrectly, we will not issue an ACK for a valid data packet
      • If it was a valid packet, restart the channel with the correct toggle
      • If not, poke the IRQ to release the channel for other transfers
  • Waiting for normal completion:
      • The usual flow resumes - pass the resulting interrupt state to the IRQ
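
As a rough FIQ-side state machine - a sketch only, with hypothetical state names and helpers (fiq_restart_channel(), fiq_defer_to_irq()), not the driver's actual fiq_fsm code:

```c
#include <linux/io.h>
#include <linux/types.h>

#define HCINT_CHHLTD		(1u << 1)
#define HCINT_NAK		(1u << 4)
#define HCINT_DATATGLERR	(1u << 10)

enum bulk_nak_state {
	FIQ_NAK_WAIT_FIRST,	/* first interrupt after priming */
	FIQ_NAK_WAIT_DISABLE,	/* halt requested, waiting for CHHLTD */
	FIQ_NAK_NORMAL,		/* handed back to the normal flow */
};

struct fiq_channel {
	void __iomem *regs;
	int ch;
	enum bulk_nak_state state;
};

/* Hypothetical helpers: restart the channel with the correct data
 * toggle, and hand the channel's interrupt state up to the IRQ layer. */
void fiq_restart_channel(struct fiq_channel *st, bool correct_toggle);
void fiq_defer_to_irq(struct fiq_channel *st);
void channel_request_halt(void __iomem *regs, int ch); /* earlier sketch */

static void fiq_bulk_nak_fsm(struct fiq_channel *st, u32 hcint)
{
	switch (st->state) {
	case FIQ_NAK_WAIT_FIRST:
		if (hcint & HCINT_NAK) {
			/* No data available: ask the core to halt the
			 * channel so it can service another transfer. */
			channel_request_halt(st->regs, st->ch);
			st->state = FIQ_NAK_WAIT_DISABLE;
		} else if (hcint & HCINT_DATATGLERR) {
			/* Data arrived but the bad toggle meant no ACK,
			 * so it is safe to restart and take the retry. */
			fiq_restart_channel(st, true);
			st->state = FIQ_NAK_NORMAL;
		}
		break;
	case FIQ_NAK_WAIT_DISABLE:
		if (hcint & HCINT_DATATGLERR) {
			/* The channel squeezed in another transfer before
			 * halting; again un-ACKed, so restart is safe. */
			fiq_restart_channel(st, true);
			st->state = FIQ_NAK_NORMAL;
		} else if (hcint & HCINT_CHHLTD) {
			/* Channel released: poke the IRQ so the channel
			 * can be reused for other transfers. */
			fiq_defer_to_irq(st);
		}
		break;
	case FIQ_NAK_NORMAL:
		/* Usual completion flow: pass interrupt state to the IRQ. */
		fiq_defer_to_irq(st);
		break;
	}
}
```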

This is a pretty sledgehammer (and untested) approach to fixing a rather obscure edge case, but there's no guarantee that some future wonder device won't use the maximum number of bulk endpoints.

The alternative would be to get slave mode working, but the resulting storm of interrupts would likely make this a very suboptimal option.

P33M (Contributor, Author) commented May 25, 2021

Wontfix. There is no sensible way to nobble the bulk IN state machine of a dwc2 channel while no data is being returned from the device.

If multiple devices are connected where they have bulk IN endpoints that only infrequently return data, then the user should ensure that fewer than 6 endpoints are polled at any one time.

P33M closed this as completed May 25, 2021