Description
Issue
There is a subtle bug in the interaction between the settings empty_pool_response_code_503 and empty_pool_timeout.
If both are enabled, gorouter returns a 503 Service Unavailable for an empty pool for the empty_pool_timeout duration and reaps the empty pool with the next pruning cycle thereafter.
The Bug
The logic in registry/container/trie.go.MatchUri doesn't take empty pools into account when trying to find a match for a given URI. This leads to situations where a user unmaps a route with a specific path, such as a.b.com/foo, expecting gorouter to fall back to the less specific route a.b.com, but requests receive a 503 instead.
This is because the MatchUri function treats empty pools as valid matches during the path traversal and returns them during registry.Lookup.
The problem didn't exist before the two flags were introduced: empty pools were reaped immediately, so the algorithm could never run into one during traversal.
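The failure mode can be illustrated with a minimal sketch. This is not gorouter's code: the pool type, the naiveMatch function and the map-based route table are hypothetical stand-ins, used only to show how a most-specific-first lookup behaves when it accepts the first pool it finds, even if that pool has no endpoints.

```go
package main

import (
	"fmt"
	"strings"
)

// pool is a hypothetical stand-in for an endpoint pool.
type pool struct {
	endpoints []string
}

// naiveMatch walks from the most specific path to the least specific one
// and returns the first pool it finds, whether or not it has endpoints.
func naiveMatch(routes map[string]*pool, uri string) *pool {
	for {
		if p, ok := routes[uri]; ok {
			return p
		}
		i := strings.LastIndex(uri, "/")
		if i < 0 {
			return nil
		}
		uri = uri[:i] // drop the last path segment and retry
	}
}

func main() {
	routes := map[string]*pool{
		"a.b.com":     {endpoints: []string{"10.0.0.1:8080"}},
		"a.b.com/foo": {}, // unmapped, but not yet pruned
	}

	p := naiveMatch(routes, "a.b.com/foo")
	// The empty pool for /foo shadows the healthy a.b.com pool,
	// which is the situation that surfaces as a 503.
	fmt.Println("endpoints found:", len(p.endpoints))
}
```

In this sketch the lookup never reaches a.b.com, because the empty entry for a.b.com/foo is treated as a valid match.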
Affected Versions
https://github.com/cloudfoundry/routing-release/releases/tag/0.232.0 and up
https://github.com/cloudfoundry/cf-deployment/releases/tag/v20.2.0 and up
Context
A customer uses a blue/green (B/G) deployment scenario with a route service for rate limiting. They bind the route service to blue.cf-app.com under a specific path, e.g. blue.cf-app.com/foo. When they switch from blue to green, they first unmap the route service from blue and then remap it to green. They noticed that before routing-release 0.232.0, any request to blue.cf-app.com/foo would still be handled even though the route to /foo was no longer mapped: the request would fall back to blue.cf-app.com.
However, after we rolled out 0.232.0, they complained about a time window in which their customers would receive a 503 instead of the fallback kicking in.
Steps to Reproduce
cf map-route blue.cf-app.com --hostname blue --path foo
cf unmap-route blue.cf-app.com --hostname blue --path foo
curl https://blue.cf-app.com/foo
Expected result
200 OK (from blue.cf-app.com fallback)
Current result
503 Service Unavailable: Requested route ('blue.cf-app.com') has no available endpoints.
Possible Fix
During path traversal, MatchUri should prefer pools with endpoints over empty pools if such pools exist (avoiding 503).
If only empty pools exist, an empty pool may be returned (producing 503).
If no pools exist, nil may be returned (producing 404).
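The three rules above could be sketched roughly as follows. This is a simplified illustration, not the code from the fix PR: the Pool and Trie types, their methods, and the per-segment trie layout are assumptions made for the example and do not mirror gorouter's actual registry/container package.

```go
package main

import (
	"fmt"
	"strings"
)

// Pool is a hypothetical stand-in for an endpoint pool.
type Pool struct {
	Endpoints []string
}

func (p *Pool) IsEmpty() bool { return len(p.Endpoints) == 0 }

// Trie is a simplified trie keyed by "/"-separated URI segments.
type Trie struct {
	pool     *Pool
	children map[string]*Trie
}

func NewTrie() *Trie { return &Trie{children: map[string]*Trie{}} }

// Insert registers a pool under a URI such as "a.b.com/foo".
func (t *Trie) Insert(uri string, p *Pool) {
	node := t
	for _, seg := range strings.Split(uri, "/") {
		child, ok := node.children[seg]
		if !ok {
			child = NewTrie()
			node.children[seg] = child
		}
		node = child
	}
	node.pool = p
}

// Match returns the deepest pool with endpoints on the path to uri.
// If only empty pools lie on the path, it returns the deepest empty one;
// if no pools lie on the path, it returns nil.
func (t *Trie) Match(uri string) *Pool {
	var lastNonEmpty, lastEmpty *Pool
	node := t
	for _, seg := range strings.Split(uri, "/") {
		child, ok := node.children[seg]
		if !ok {
			break
		}
		node = child
		if node.pool == nil {
			continue
		}
		if node.pool.IsEmpty() {
			lastEmpty = node.pool
		} else {
			lastNonEmpty = node.pool
		}
	}
	if lastNonEmpty != nil {
		return lastNonEmpty
	}
	return lastEmpty
}

func main() {
	routes := NewTrie()
	routes.Insert("a.b.com", &Pool{Endpoints: []string{"10.0.0.1:8080"}})
	routes.Insert("a.b.com/foo", &Pool{}) // unmapped, awaiting pruning

	// The empty /foo pool is skipped in favour of the a.b.com pool.
	fmt.Println(routes.Match("a.b.com/foo").Endpoints)
}
```

In this sketch a caller would map a non-empty result to a normal response, an empty pool to a 503, and nil to a 404, matching the three cases listed above.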
I've provided a fix PR