Skip to content

context: context.Cause can return nil while Err return non-nil for custom context implmenentations #73258

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Closed
aktau opened this issue Apr 8, 2025 · 4 comments
Labels
BugReport Issues describing a possible bug in the Go implementation. NeedsFix The path to resolution is known, but the work has not been done.
Milestone

Comments

@aktau
Copy link
Contributor

aktau commented Apr 8, 2025

Go version

go1.25-dev (HEAD)

Output of go env in your module/workspace:

N/A

What did you do?

When looking at what constitutes cancellation:

	// Done returns a channel that's closed when work done on behalf of this
	// context should be canceled. Done may return nil if this context can
	// never be canceled. ...
	Done() <-chan struct{}

	// If Done is not yet closed, Err returns nil.
	// If Done is closed, Err returns a non-nil error explaining why:
	// DeadlineExceeded if the context's deadline passed,
	// or Canceled if the context was canceled for some other reason.
	// After Err returns a non-nil error, successive calls to Err return the same error.
	Err() error

One could read that Done() being closed or Err() returning non-nil constitutes a cancelled context.

This interpretation doesn't jive with the current implementation of context.Cause:

go/src/context/context.go

Lines 282 to 300 in af278bf

// Cause returns a non-nil error explaining why c was canceled.
// The first cancellation of c or one of its parents sets the cause.
// If that cancellation happened via a call to CancelCauseFunc(err),
// then [Cause] returns err.
// Otherwise Cause(c) returns the same value as c.Err().
// Cause returns nil if c has not been canceled yet.
func Cause(c Context) error {
if cc, ok := c.Value(&cancelCtxKey).(*cancelCtx); ok {
cc.mu.Lock()
defer cc.mu.Unlock()
return cc.cause
}
// There is no cancelCtxKey value, so we know that c is
// not a descendant of some Context created by WithCancelCause.
// Therefore, there is no specific cause to return.
// If this is not one of the standard Context types,
// it might still have an error even though it won't have a cause.
return c.Err()
}

Using a custom context.Context implementation, cancellation can be implemented as returning a non-nil Err

https://go.dev/play/p/bmnq9qOZuac?v=gotip

This causes context.Cause to return nil, because it looks up the cause in the parent (through Value()).

A (partial) solution may be to change context.Cause to not return cause if it is nil, which I think complies with the contract:

 func Cause(c Context) error { 
 	if cc, ok := c.Value(&cancelCtxKey).(*cancelCtx); ok { 
 		cc.mu.Lock() 
                cause := cc.cause
 		cc.mu.Unlock() 
 		if cause != nil {
                  return cause
                }
 	} 
 	return c.Err() 
 } 

Though this has other subtler issues, like: which of the errors should be preferred, if the child was "cancelled" earlier than the parent. In the proposal, the parents one wins. I'm not sure it is resolveable.

Side note #1:

The context package internally resolves this for withoutCancelCtx (used for context.WithoutCancel) with a special-case in value:

go/src/context/context.go

Lines 780 to 785 in af278bf

case withoutCancelCtx:
if key == &cancelCtxKey {
// This implements Cause(ctx) == nil
// when ctx is created using WithoutCancel.
return nil
}

It's possible this special case may not be necessary anymore if context.Cause is fixed.

Side note #2:

I see this in the docs:

// The [WithCancelCause], [WithDeadlineCause], and [WithTimeoutCause] functions
// return a [CancelCauseFunc], which takes an error and records it as
// the cancellation cause. Calling [Cause] on the canceled context
// or any of its children retrieves the cause. If no cause is specified,
// Cause(ctx) returns the same value as ctx.Err().

But WithCancelCause is:

go/src/context/context.go

Lines 699 to 704 in af278bf

// WithTimeoutCause behaves like [WithTimeout] but also sets the cause of the
// returned Context when the timeout expires. The returned [CancelFunc] does
// not set the cause.
func WithTimeoutCause(parent Context, timeout time.Duration, cause error) (Context, CancelFunc) {
return WithDeadlineCause(parent, time.Now().Add(timeout), cause)
}

Which is an inconsistency in the docs.

What did you see happen?

https://go.dev/play/p/bmnq9qOZuac?v=gotip

Prints:

Custom Context (self cancelled).Err(): self cancellation
context.Cause(Custom Context (self cancelled)): <nil>

What did you expect to see?

If ctx.Err() != nil, I would have expected context.Cause(ctx) != nil.

@gabyhelp gabyhelp added the BugReport Issues describing a possible bug in the Go implementation. label Apr 8, 2025
@dmitshur
Copy link
Member

dmitshur commented Apr 8, 2025

CC @neild, @Sajmani.

@dmitshur dmitshur added the NeedsInvestigation Someone must examine and confirm this is a valid issue and not a duplicate of an existing one. label Apr 8, 2025
@neild
Copy link
Contributor

neild commented Apr 8, 2025

Nice analysis.

The core problem is that Cause assumes that if a context is canceled, the first ancestor of that context created by the context package contains the cause. This is wrong: The first ancestor might not be canceled at all, and if it is canceled it might not contain the cause.

For example, if we create:

ctx1, cancel1 := context.WithCancelCause(Background())
ctx2, cancel2 := myCustomContextWithParent(ctx1)

If we cancel1(someCause), then the existing algorithm works.

If we cancel2(), then Cause(ctx2) returns nil, which is obviously wrong.

If we cancel2(); cancel1(someCause), then Cause(ctx2) returns someCause. This is actually wrong: ctx2 was canceled before ctx1 and should not inherit ctx1's cause.

I don't see any way to fix this last case. Cause has no way to ask a third-party context for its cause. We could define a new optional method for returning causes, but that doesn't do anything for existing third-party contexts.

I think the best we might be able to do is say that if Cause is called on a canceled context where no cause can be identified, it defaults to returning Canceled.

@gopherbot
Copy link
Contributor

Change https://go.dev/cl/663936 mentions this issue: context: don't return a nil Cause for a canceled custom context

@dmitshur dmitshur added NeedsFix The path to resolution is known, but the work has not been done. and removed NeedsInvestigation Someone must examine and confirm this is a valid issue and not a duplicate of an existing one. labels Apr 8, 2025
@dmitshur dmitshur added this to the Go1.25 milestone Apr 8, 2025
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
BugReport Issues describing a possible bug in the Go implementation. NeedsFix The path to resolution is known, but the work has not been done.
Projects
None yet
Development

No branches or pull requests

5 participants