proposal: x/sync: once functions that can be retried on error #22098
Comments
If retry has to happen, why can't it happen inside the function that once.Do calls? |
It could, if infinite retry were desirable. But if the caller's context has been cancelled, there is no reason to keep trying the initialization. Not to mention that, since this initialization is implied to be expensive, you don't want to be doing it again until you need it next. Essentially, I want this behavior to be easy to build.
Re: singleflight - I suppose if there were a singleflight cache or something, so that subsequent (non-concurrent) calls didn't redo the initialization, that would work. It would feel odd, though, to use a map with a single key. I think at that point I'd just write this. |
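For illustration, here is a minimal sketch of the "singleflight cache" idea being discussed: golang.org/x/sync/singleflight collapses concurrent calls onto one attempt, and a small wrapper caches only successful results so later calls do not redo the work. The onceValue type and its fields are made up for this sketch, not part of any existing package.

```go
package lazyinit

import (
    "sync"

    "golang.org/x/sync/singleflight"
)

// onceValue caches the first successful result of init and retries failed
// attempts on the next call; concurrent callers share a single attempt.
type onceValue struct {
    group singleflight.Group
    init  func() (interface{}, error)

    mu   sync.Mutex
    val  interface{}
    done bool
}

func (o *onceValue) Get() (interface{}, error) {
    o.mu.Lock()
    if o.done {
        v := o.val
        o.mu.Unlock()
        return v, nil
    }
    o.mu.Unlock()

    // Concurrent callers collapse onto one in-flight call for the single key.
    v, err, _ := o.group.Do("init", func() (interface{}, error) {
        v, err := o.init()
        if err != nil {
            return nil, err // not cached; the next Get tries again
        }
        o.mu.Lock()
        o.val, o.done = v, true
        o.mu.Unlock()
        return v, nil
    })
    return v, err
}
```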
Oh, there is no |
The semantics of sync.Once are very easy to understand, to use, and to trust. Adding retries is a great complication for all three. While not dismissing the problem you have, I believe the problem should be solved elsewhere, most likely in an external package, and not in the core library. |
Fair enough, thank you, that is the sort of decision I was looking for. @bradfitz is this possibly something x/sync would be interested in? It feels pretty generally useful to me, and I'd hate to release it under my name to get 1 star and 0 downloads. |
If initialization is expensive and/or puts load on external systems and is currently failing, you probably don't want to retry it at the maximum possible frequency. This means that the primitive also needs a lastInitTime and lastError, and needs to be configurable with a backoff policy. The search index initialization probably needs to be done in a separate goroutine and started ahead of time, because if it does not fit into a request slot it will never be finished. |
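A rough sketch of the kind of primitive being suggested here (all names are hypothetical, and a backoff policy function must be supplied by the caller): remember the last attempt time and error, and refuse to re-run the initializer until the policy says enough time has passed.

```go
package backoffonce

import (
    "sync"
    "time"
)

// backoffOnce retries a failed initialization, but no more often than the
// backoff policy allows; callers arriving sooner just get the last error.
type backoffOnce struct {
    mu           sync.Mutex
    done         bool
    lastInitTime time.Time
    lastErr      error
    attempt      int
    backoff      func(attempt int) time.Duration // policy, e.g. exponential
}

func (o *backoffOnce) Do(f func() error) error {
    o.mu.Lock()
    defer o.mu.Unlock()
    if o.done {
        return nil
    }
    // Too soon to retry: report the previous failure instead of hammering
    // the failing dependency again.
    if o.attempt > 0 && time.Since(o.lastInitTime) < o.backoff(o.attempt) {
        return o.lastErr
    }
    o.lastInitTime = time.Now()
    o.attempt++
    if err := f(); err != nil {
        o.lastErr = err
        return err
    }
    o.done = true
    o.lastErr = nil
    return nil
}
```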
I agree; I was trying to provide a more specific example, to take something as abstract as "once.Do needs to be able to respect context deadlines" and bring it back into reality a bit. I want to make clear that I am not asking about a particular problem that I have; rather, I simply noticed this limitation. HTTP context deadlines are the most familiar form of cancellation to gophers, but you also have queue lease expiration, user-initiated cancellation, best-effort time-sensitive calculations (do this before time Y, otherwise I don't care), and so on. I was just trying to motivate the more general problem. |
I mean that if we consider this problem in sufficient generality, it does not boil down to |
Yes, I believe it does. You would simply compose the two. I might be missing something, though; do you have an example it might not cover? |
@skabbes fair enough |
That's not obvious to me anyway. If the thing in question really is one-time initialization, then restarting it for each new call could be very expensive compared to just doing it once and getting it over with. For example, consider what happens if you are connecting to a backend server, establishing the connection (for whatever reason) takes 300ms, actually using the connection takes 10ms, and each incoming call that wants to use the connection has an aggressive 30ms deadline. With the behavior you're proposing, every operation fails in perpetuity. With naive use of
Don't confuse latency with throughput. Often, the action of a |
What does it mean to “exhaust” retries, or when would such exhaustion be desirable? With appropriate backoff parameters, you generally do not need to impose any artificial limit on the number of attempts: you can just keep retrying until the system comes back up. |
To use your earlier examples, I would expect to see something like:

```go
var (
    once  sync.Once
    ready = make(chan struct{})
)
var tmpl []byte

handler := func(w http.ResponseWriter, r *http.Request) {
    once.Do(func() {
        // Retry in the background so a slow or failed attempt does not
        // wedge the handler; ready is closed once tmpl has been loaded.
        // readTemplateFromFS and backoff are assumed helpers.
        go func() {
            defer close(ready)
            for attempt := int64(0); ; attempt++ {
                var err error
                tmpl, err = readTemplateFromFS()
                if err == nil {
                    return
                }
                backoff(attempt)
            }
        }()
    })
    select {
    case <-r.Context().Done():
        return
    case <-ready:
    }
    // do something with tmpl
}
```

and, for the advanced use case:
```go
// finishIndexing begins indexing the Book where the last finishIndexing call stopped.
func (b *Book) finishIndexing(ctx context.Context) error {
    if b.indexed() {
        return nil
    }
    // Acquire the indexing semaphore, or give up if the caller's context is done.
    select {
    case <-ctx.Done():
        return ctx.Err()
    case b.indexSem <- token{}:
    }
    defer func() { <-b.indexSem }()
    cursor := b.loadProgress()
    for !b.indexed() {
        …
        // Save progress and stop early if the caller's context is cancelled;
        // the next call resumes from the stored cursor.
        select {
        case <-ctx.Done():
            b.storeProgress(cursor)
            return ctx.Err()
        default:
        }
    }
    return nil
}

func (b *Book) Search(ctx context.Context, query string) ([]SearchResult, error) {
    if err := b.finishIndexing(ctx); err != nil {
        return nil, err
    }
    …
}
```
|
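The backoff(attempt) helper in the first snippet above is left undefined. A minimal sketch of exponential backoff with jitter (the base delay and cap here are arbitrary choices, not anything specified in the thread) might look like:

```go
package backoffdemo

import (
    "math/rand"
    "time"
)

// backoff sleeps for an exponentially growing interval with jitter:
// roughly 100ms, 200ms, 400ms, ... capped at 30s, each multiplied by a
// random factor in [0.5, 1.5) to avoid synchronized retries.
func backoff(attempt int64) {
    const (
        base = 100 * time.Millisecond
        max  = 30 * time.Second
    )
    d := base << uint(attempt)
    if d > max || d <= 0 { // d <= 0 guards against shift overflow
        d = max
    }
    jitter := 0.5 + rand.Float64()
    time.Sleep(time.Duration(float64(d) * jitter))
}
```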
Yes, it could be. As you have already pointed out, sync.Once already covers those cases very well. Also, in your 300ms init example, yes, that would certainly be a case where you would not use this. I think we can both agree on this: if infinite retry is desirable, then sync.Once's current behavior is sufficient. So... when is infinite retry not desirable?

```go
// MakeKeyPair generates a new pub/priv keypair and stores it in the database.
func MakeKeyPair(userID string) (*KeyPair, error) {
    pair := expensiveCPUOp(2048)
    err := storeKeyPairTransactionally(userID, pair)
    return pair, err
}
```

You would like to 'once' this per user (assume all requests for the same user are sharded to the same box), but your keypair database is down due to a configuration error. So do you:
Option 1 will continue to rob CPU resources from the other features; option 2 will rob memory resources from other features. Since it is using additional resources, this infinite retry could lead to cascading failures in other feature areas.
Yes, you certainly can - but at the very least, that once.Do will be consuming a goroutine, some memory, and possibly spinning CPU cycles - which, depending on the context, might be OK or might not be. The current design does not allow for graceful failure or backpressure to upstream systems. Given that gophers want to 'once-initialize' things that can error, is our answer "infinitely retry"? I think no, because we don't give them the tools to do exponential backoff and retry with jitter (pretty much required for a correct implementation). I would rather the answer be something simple like "use TryDo" or "don't do that at all, it is a bad pattern". |
In the situation you're describing, you're already consuming O(N) memory to store the N |
I see a lot of exponential backoff and retry libraries available. In contrast to the “retry on the first call after failure” policy that you propose, “use exponential backoff” seems like the simple answer. |
Yes, and many of them seem to return
No, I disagree. Let me rephrase my statement: given that gophers want to 'once-initialize' things that can error, correct usage is not as simple as it should be. I suppose we disagree on this, but if I wrote some library code that in some scenarios caused a reference cycle, and caused CPU or memory "ghosting" (due to infinite retry) even after the original request was served, my team would not accept it into our code base. Many of your arguments seem to assume there is no such thing as a best-effort request, that every single software system has an infinite supply of load, or that in software development it is always wise to solve those problems first. Regardless, I will close this and continue to use my own implementation.
|
Original proposal:

sync.Once provides a great tool for initializing singletons, or performing initialization lazily. However, it is completely unusable for initializing a resource whose initialization can fail (and that failure is temporary). I am proposing a type and/or method to be added to the sync package that can handle initialization functions that can fail, by simply retrying them later - the next time they are needed.

Simple Use cases
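The original example code for this section is missing here; as an illustrative stand-in (the file name and variable names are made up), the kind of problem being described looks like this: with plain sync.Once there is nowhere to put the error, so a failed initialization is cached forever.

```go
package tmplcache

import (
    "html/template"
    "log"
    "sync"
)

var (
    once sync.Once
    tmpl *template.Template
)

// getTemplate lazily parses the template exactly once. If parsing fails,
// sync.Once still marks the work as done: the error is swallowed, tmpl
// stays nil forever, and the parse is never retried.
func getTemplate() *template.Template {
    once.Do(func() {
        t, err := template.ParseFiles("index.html.tmpl")
        if err != nil {
            log.Printf("parsing template: %v", err)
            return
        }
        tmpl = t
    })
    return tmpl
}
```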
Workaround

- Retry (some bounded number of times) inside the function passed to sync.Once. If retries are exhausted, this is not a suitable workaround.
- Use sync.Once.Do, but on failure swap the sync.Once with a new sync.Once so it can retry (a sketch of this appears just after this list).
- Hand-roll TryDo, requiring a lock, an atomic variable (to make it low overhead), and a clever defer function to deal with panics. A hand-rolled version will likely not handle panics, or the atomic-variable optimization.
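For reference, a minimal sketch of the second workaround, swapping in a fresh sync.Once after a failed initialization. The swapOnce type is made up for this sketch and is not part of any existing package.

```go
package retryonce

import "sync"

// swapOnce behaves like sync.Once, except that when the initialization
// function fails it installs a fresh sync.Once so that a later call can
// try again. Concurrent callers coalesce onto one attempt and all observe
// that attempt's error.
type swapOnce struct {
    mu   sync.Mutex
    once *sync.Once
    err  error
}

func (s *swapOnce) Do(f func() error) error {
    s.mu.Lock()
    if s.once == nil {
        s.once = new(sync.Once)
    }
    once := s.once
    s.mu.Unlock()

    once.Do(func() {
        err := f()
        s.mu.Lock()
        s.err = err
        if err != nil {
            // Failed: swap in a fresh Once so the next call retries.
            s.once = new(sync.Once)
        }
        s.mu.Unlock()
    })

    s.mu.Lock()
    defer s.mu.Unlock()
    return s.err
}
```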
Advanced use case

You are writing a library API which requires lazy initialization of some expensive reusable resource. The API looks like this:
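The code block showing the API is missing here; judging from the Search signature used earlier in the thread, it was presumably along these lines (a sketch, not the author's exact code):

```go
package book

import "context"

// Book is the expensive reusable resource; its search index is built lazily.
type Book struct {
    // fields elided
}

// SearchResult describes one match for a query.
type SearchResult struct {
    // fields elided
}

// Search lazily builds the Book's index on first use, then runs query
// against it.
func (b *Book) Search(ctx context.Context, query string) ([]SearchResult, error) {
    // …
    return nil, nil
}
```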
However, to perform a search the first time, we would like to lazily build a search index of that book, an expensive operation, which can fail. You would like this index-build operation to back off and retry up to the limits of the context passed in.
As a final comment, TryDo is more generic than Do: you can trivially build Do with TryDo, but not vice versa. I would like to solicit feedback on this API addition, and if agreed I can build a design doc or submit the proposed code to implement this.
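For concreteness, here is one possible shape for the proposed primitive, together with Do built on top of TryDo to illustrate that final point. The issue never pins down a signature, so the type name, the method names, and the choice to treat a panic as a failed attempt are all assumptions of this sketch.

```go
package syncx

import (
    "sync"
    "sync/atomic"
)

// Once is a sync.Once variant whose Do accepts a function that can fail.
// This is a hypothetical shape for the proposal, not an actual x/sync API.
type Once struct {
    done uint32
    mu   sync.Mutex
}

// TryDo calls f only if no previous call has succeeded (returned nil and
// did not panic). It returns the error from this caller's attempt, or nil
// if initialization has already succeeded.
func (o *Once) TryDo(f func() error) error {
    if atomic.LoadUint32(&o.done) == 1 { // fast path, like sync.Once
        return nil
    }
    o.mu.Lock()
    defer o.mu.Unlock()
    if o.done == 1 {
        return nil
    }
    err := f() // if f panics, done stays 0, so a later call retries
    if err == nil {
        atomic.StoreUint32(&o.done, 1)
    }
    return err
}

// Do shows that sync.Once's behavior is trivially recovered from TryDo.
func (o *Once) Do(f func()) {
    o.TryDo(func() error { f(); return nil })
}
```

Whether a panic should count as a completed attempt, and what error waiting callers should observe, are exactly the kinds of semantic questions raised in the discussion above.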