Hedging retry on non-fatal-status code does not shortcut hedging delay (as stated in A6-client-retries) #10145

@doffid

What version of gRPC-Java are you using?

1.54.1

What is your environment?

openjdk 1.8

What did you expect to see?

The gRPC hedging implementation should match the spec A6-client-retries, which states:

If a non-fatal status code is received from a hedged request, then the next hedged request in line is sent immediately, shortcutting its hedging delay.
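To make the expected timing concrete, here is a minimal timeline sketch in plain Java. It does not use gRPC at all; the class `HedgingTimeline`, the method `sendTimes`, and its parameters are hypothetical names made up for illustration. It also assumes that the hedging timer restarts from the most recent send, which is my reading of the spec rather than something confirmed here.

```java
import java.util.Arrays;

// Toy model of hedged-attempt send times; NOT gRPC code.
public class HedgingTimeline {
    // Marks an attempt that never fails with a non-fatal status.
    static final long NEVER = Long.MAX_VALUE;

    /**
     * Returns the time (ms) at which each attempt is sent.
     * failureLatencyMs[i] is how long after being sent attempt i fails
     * with a non-fatal status (NEVER if it does not fail).
     * shortcut=true models the A6 wording; shortcut=false models the
     * behavior described in this issue.
     */
    static long[] sendTimes(int maxAttempts, long hedgingDelayMs,
                            long[] failureLatencyMs, boolean shortcut) {
        long[] send = new long[maxAttempts];
        boolean[] consumed = new boolean[maxAttempts];
        for (int i = 1; i < maxAttempts; i++) {
            // Assumption: the timer restarts from the previous send.
            long timerFire = send[i - 1] + hedgingDelayMs;
            int earliest = -1;
            long earliestTime = NEVER;
            if (shortcut) {
                // Find the earliest failure that has not yet triggered a hedge.
                for (int j = 0; j < i; j++) {
                    if (consumed[j] || failureLatencyMs[j] == NEVER) continue;
                    long failTime = send[j] + failureLatencyMs[j];
                    if (failTime < earliestTime) {
                        earliestTime = failTime;
                        earliest = j;
                    }
                }
            }
            if (earliest >= 0 && earliestTime <= timerFire) {
                consumed[earliest] = true;   // this failure shortcuts the delay
                send[i] = earliestTime;
            } else {
                send[i] = timerFire;         // wait for the full hedging delay
            }
        }
        return send;
    }

    public static void main(String[] args) {
        long[] latencies = {100, 100, 100}; // every attempt fails after 100 ms
        System.out.println(Arrays.toString(sendTimes(3, 1000, latencies, true)));  // [0, 100, 200]
        System.out.println(Arrays.toString(sendTimes(3, 1000, latencies, false))); // [0, 1000, 2000]
    }
}
```

With a 1 s hedging delay and attempts that fail after 100 ms, the first line is the timeline I would expect from the spec, and the second line is the shape of what I actually observe.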

What did you see instead?

On a non-fatal status code, the request is "retried", but the next hedged request still waits for the hedging delay to expire. It is not sent immediately.
I can observe this in my application and also in fairly isolated JUnit tests.

Looking at the source code, I could not find where this part of the spec is implemented. HedgingPolicy.nonFatalStatusCodes is only used in RetriableStream.Sublistener#makeHedgingDecision, but the information that the status code was non-fatal is not passed on to the HedgingPlan. I have not dug deeper than that, so I may be missing something.

Steps to reproduce the bug

The bug can be reproduced with an embedded gRPC client (with a hedging service config enabled) and a server that returns "non-fatal" status codes. I do not have a sample ready that is not tied to my application, though.
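As a starting point for a repro, a hedging service config can be built as a plain map and handed to the channel builder. This is only a sketch: the service name `example.EchoService`, the 1 s delay, the attempt count, and the choice of `UNAVAILABLE` are placeholder assumptions, not the values from my application. The class `HedgingServiceConfig` is likewise a hypothetical helper.

```java
import java.util.Arrays;
import java.util.Collections;
import java.util.HashMap;
import java.util.Map;

// Builds a hedging service config as a plain map (Java 8 compatible).
public class HedgingServiceConfig {
    static Map<String, Object> hedgingConfig(String serviceName) {
        Map<String, Object> name = new HashMap<>();
        name.put("service", serviceName); // placeholder service name

        Map<String, Object> hedgingPolicy = new HashMap<>();
        // gRPC-Java expects numeric values in the config map as Double.
        hedgingPolicy.put("maxAttempts", 3.0);
        hedgingPolicy.put("hedgingDelay", "1s");
        hedgingPolicy.put("nonFatalStatusCodes", Arrays.asList("UNAVAILABLE"));

        Map<String, Object> methodConfig = new HashMap<>();
        methodConfig.put("name", Collections.singletonList(name));
        methodConfig.put("hedgingPolicy", hedgingPolicy);

        Map<String, Object> serviceConfig = new HashMap<>();
        serviceConfig.put("methodConfig", Collections.singletonList(methodConfig));
        return serviceConfig;
    }

    public static void main(String[] args) {
        Map<String, Object> config = hedgingConfig("example.EchoService");
        // In a real repro this map would be passed to the channel, e.g.
        // ManagedChannelBuilder#defaultServiceConfig(config) together with
        // enableRetry(); that wiring is omitted to keep the sketch
        // free of gRPC dependencies.
        System.out.println(config.containsKey("methodConfig"));
    }
}
```

A server that answers every call with `UNAVAILABLE` after a short delay, combined with timestamps logged per incoming call, should then show whether the second attempt arrives right after the first failure or only after the full hedging delay.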
