[oidc] encode and validate state params #16317

Merged
merged 1 commit into from
Feb 15, 2023

Conversation

Member

@AlexTugarev AlexTugarev commented Feb 9, 2023

After this PR is merged, the state param of the OIDC/OAuth2 flow is signed and verified. Validating its integrity is crucial, because the state carries the ID of the OIDC client to continue with when Gitpod receives the callback from a 3rd party. Tests should show that the expiration time is checked and that signature validation is effective.
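
To illustrate the idea of a signed state with an expiry, here is a minimal sketch that uses only the Go standard library (HMAC-SHA256 over a three-segment, JWT-shaped string). It is a stand-in for what a JWT library such as golang-jwt/jwt would do, not the code from this PR. The names StateClaims, StateSigner, the claim fields, and the error values are all assumptions made for the example.

```go
package main

import (
	"crypto/hmac"
	"crypto/sha256"
	"encoding/base64"
	"encoding/json"
	"errors"
	"strings"
	"time"
)

// StateClaims is a hypothetical payload; the claim set used in the PR may differ.
type StateClaims struct {
	ClientConfigID string `json:"clientConfigId"`
	ExpiresAt      int64  `json:"exp"` // Unix seconds
}

var (
	ErrMalformedState = errors.New("malformed state")
	ErrBadSignature   = errors.New("invalid state signature")
	ErrExpiredState   = errors.New("state expired")
)

// StateSigner signs and verifies state values with HMAC-SHA256.
type StateSigner struct {
	key []byte
	now func() int64 // injectable clock (Unix seconds), handy in tests
}

func NewStateSigner(key []byte) *StateSigner {
	return &StateSigner{key: key, now: func() int64 { return time.Now().Unix() }}
}

var b64 = base64.RawURLEncoding

const header = `{"alg":"HS256","typ":"JWT"}`

// sign returns the base64url-encoded MAC of the signing input.
func (s *StateSigner) sign(signingInput string) string {
	mac := hmac.New(sha256.New, s.key)
	mac.Write([]byte(signingInput))
	return b64.EncodeToString(mac.Sum(nil))
}

// Encode returns header.payload.signature, each segment base64url-encoded.
func (s *StateSigner) Encode(clientConfigID string, ttlSeconds int64) (string, error) {
	payload, err := json.Marshal(StateClaims{
		ClientConfigID: clientConfigID,
		ExpiresAt:      s.now() + ttlSeconds,
	})
	if err != nil {
		return "", err
	}
	signingInput := b64.EncodeToString([]byte(header)) + "." + b64.EncodeToString(payload)
	return signingInput + "." + s.sign(signingInput), nil
}

// Decode checks the signature first, then the expiry, and only then
// returns the claims to the caller.
func (s *StateSigner) Decode(state string) (*StateClaims, error) {
	parts := strings.Split(state, ".")
	if len(parts) != 3 {
		return nil, ErrMalformedState
	}
	expected := s.sign(parts[0] + "." + parts[1])
	// hmac.Equal compares in constant time.
	if !hmac.Equal([]byte(expected), []byte(parts[2])) {
		return nil, ErrBadSignature
	}
	raw, err := b64.DecodeString(parts[1])
	if err != nil {
		return nil, ErrMalformedState
	}
	var claims StateClaims
	if err := json.Unmarshal(raw, &claims); err != nil {
		return nil, ErrMalformedState
	}
	if s.now() >= claims.ExpiresAt {
		return nil, ErrExpiredState
	}
	return &claims, nil
}
```

Because the signature covers both the header and the payload, altering any segment makes Decode fail before the client config ID is ever read.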

Related Issue(s)

Part of #15956

How to test

See internal post for test credentials.

  1. Create new Org
  2. Create new SSO client under settings
  3. Select Login from actions menu (three dot button)
  4. Copy the full URL (starting with https://accounts.google.com/o/oauth2/v2/auth/oauthchooseaccount?) from browser's location bar.
  5. First, proceed with the login and confirm that signing in to your @gitpod.io account works.
  6. Take the URL from step 4, find the state query param, alter its last segment (segments are separated by .), then paste the URL into a new tab. You should see client config not found. The logs will contain bad state param as the reason.
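
The tampering in step 6 can also be scripted. The helper below is a hypothetical convenience for manual testing, not part of the PR: it parses a URL, changes the first character of the last dot-separated segment of the state query param, and returns the modified URL.

```go
package main

import (
	"errors"
	"net/url"
	"strings"
)

// TamperState returns rawURL with the last dot-separated segment of its
// "state" query parameter altered, so that signature validation should fail.
func TamperState(rawURL string) (string, error) {
	u, err := url.Parse(rawURL)
	if err != nil {
		return "", err
	}
	q := u.Query()
	state := q.Get("state")
	i := strings.LastIndex(state, ".")
	if i < 0 || i == len(state)-1 {
		return "", errors.New("state has no dot-separated last segment")
	}
	seg := []byte(state[i+1:])
	// Swap the first character for a different base64url character.
	if seg[0] == 'A' {
		seg[0] = 'B'
	} else {
		seg[0] = 'A'
	}
	q.Set("state", state[:i+1]+string(seg))
	// Encode re-serializes all query parameters, sorted by key.
	u.RawQuery = q.Encode()
	return u.String(), nil
}
```

Pasting the returned URL into a new tab should lead to the same client config not found result as editing the URL by hand.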

Release Notes

NONE

Documentation

Build Options:

  • /werft with-github-actions
    Experimental feature to run the build with GitHub Actions (and not in Werft).
  • leeway-no-cache
    leeway-target=components:all
  • /werft no-test
    Run Leeway with --dont-test
Publish Options
  • /werft publish-to-npm
  • /werft publish-to-jb-marketplace
Installer Options
  • with-ee-license
  • with-slow-database
  • with-dedicated-emulation
  • with-ws-manager-mk2
  • workspace-feature-flags
    Add desired feature flags to the end of the line above, space separated

Preview Environment Options:

  • /werft with-local-preview
    If enabled this will build install/preview
  • /werft with-preview
  • /werft with-large-vm
  • /werft with-gce-vm
    If enabled this will create the environment on GCE infra
  • /werft with-integration-tests=all
    Valid options are all, workspace, webapp, ide, jetbrains, vscode, ssh

@werft-gitpod-dev-com

started the job as gitpod-build-at-state-jwt.1 because the annotations in the pull request description changed
(with .werft/ from main)

@AlexTugarev AlexTugarev marked this pull request as ready for review February 9, 2023 14:55
@AlexTugarev AlexTugarev requested a review from a team February 9, 2023 14:55
@github-actions github-actions bot added the team: webapp label Feb 9, 2023
Member

@easyCZ easyCZ left a comment

LGTM in general. Let's find a better way to avoid the skipVerification, as that pushes test code into our main implementation.

cipher db.Cipher
dbConn *gorm.DB
cipher db.Cipher
stateJWT *StateJWT
Member

In some way, the fact that it's a JWT is an implementation detail. As such, we could depend on an interface here rather than on the concrete JWT type.

You could express this as

type State interface {
  Encode(..)
  Decode(..)
}

While this creates indirection, it does help future readers to not have to know the details of how it's encoded, just that it is.
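
A fuller version of that interface might look like the sketch below. The payload type StateParams, the method signatures, and the unsigned plainCodec implementation are assumptions for illustration; a production implementation would sign the value, as this PR does with a JWT.

```go
package main

import (
	"encoding/base64"
	"encoding/json"
)

// StateParams is a hypothetical payload carried through the OIDC flow.
type StateParams struct {
	ClientConfigID string `json:"clientConfigId"`
}

// State hides how the state param is serialized and protected.
type State interface {
	Encode(StateParams) (string, error)
	Decode(string) (StateParams, error)
}

// plainCodec is an UNSIGNED stand-in, suitable only for tests.
type plainCodec struct{}

func (plainCodec) Encode(p StateParams) (string, error) {
	raw, err := json.Marshal(p)
	if err != nil {
		return "", err
	}
	return base64.RawURLEncoding.EncodeToString(raw), nil
}

func (plainCodec) Decode(s string) (StateParams, error) {
	var p StateParams
	raw, err := base64.RawURLEncoding.DecodeString(s)
	if err != nil {
		return p, err
	}
	err = json.Unmarshal(raw, &p)
	return p, err
}

// Service depends only on the interface, not on JWT specifics.
type Service struct {
	state State
}

// clientConfigIDFromCallback decodes the state and returns the client config ID.
func (s *Service) clientConfigIDFromCallback(stateParam string) (string, error) {
	p, err := s.state.Decode(stateParam)
	if err != nil {
		return "", err
	}
	return p.ClientConfigID, nil
}
```

With this shape, swapping the JWT-backed implementation for another encoding would not touch Service.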

Member

Up to you.

Comment on lines 72 to 70
func newTestService(sessionServiceAddress string, dbConn *gorm.DB, cipher db.Cipher, stateJWT *StateJWT) *Service {
	service := NewService(sessionServiceAddress, dbConn, cipher, stateJWT)
	service.skipVerifyIdToken = true
	return service
}
Member

Either move this to a test package, or make it a first-class constructor which doesn't mention Test. It's an anti-pattern in Go to have test code alongside the non-test implementation.

But I think we shouldn't have the skipVerifyIdToken as a thing on this service at all. We should either abstract the verification behind an interface, such that we can mock/stub it in tests, or actually run the verification check.
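
One way to abstract the verification behind an interface is sketched below. IDTokenVerifier, its Verify signature, and the stub are hypothetical names chosen for the example; the real service wires verification through a library constructor, which is not shown here.

```go
package main

import "errors"

// ErrVerification is a hypothetical error a verifier may return.
var ErrVerification = errors.New("id token verification failed")

// IDTokenVerifier is a hypothetical seam around ID token verification.
type IDTokenVerifier interface {
	// Verify validates the raw ID token and returns its subject.
	Verify(rawIDToken string) (subject string, err error)
}

// Service always calls the verifier; there is no skip flag.
type Service struct {
	verifier IDTokenVerifier
}

func NewService(v IDTokenVerifier) *Service {
	return &Service{verifier: v}
}

// Authenticate rejects empty tokens and delegates the rest to the verifier.
func (s *Service) Authenticate(rawIDToken string) (string, error) {
	if rawIDToken == "" {
		return "", errors.New("missing id_token")
	}
	return s.verifier.Verify(rawIDToken)
}

// stubVerifier would live in a _test.go file and returns canned results.
type stubVerifier struct {
	subject string
	err     error
}

func (v stubVerifier) Verify(string) (string, error) {
	return v.subject, v.err
}
```

Tests construct the service with stubVerifier, while production code passes a real verifier, so no test-only switch remains on the service itself.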

Member

Alternatively, if you want to provide this as an option for the caller, encode it into the request such that the caller can choose to have the verification skipped.

Member

Let me know if you'd like to pair on this.

Member Author

Removed newTestService.

skipVerifyIdToken is passed to a constructor from a library. The recommendation was to use this shortcut for now and remove skipVerifyIdToken in a separate PR.

With the added golang-jwt/jwt, it's now easier to extend the mocked service to actually do the checks instead of skipping verification.

Member Author

Let's find a better way to avoid the skipVerification, as that pushes test code into our main implementation.

skipVerification is slated for removal in a follow-up PR.

Using JWT tokens for encoding/decoding/validation of state params carried throughout the OIDC/OAuth2 flow.

Validating its integrity is crucial, because the state carries the ID of the OIDC client to continue with when Gitpod receives the callback from a 3rd party. Tests should show that the expiration time is checked and that signature validation is effective.
Member

@easyCZ easyCZ left a comment

/hold for final Qs

@@ -152,7 +150,7 @@ func (s *Service) GetClientConfigFromCallbackRequest(r *http.Request) (*ClientCo
 		return nil, fmt.Errorf("missing state parameter")
 	}

-	state, err := decodeStateParam(stateParam)
+	state, err := s.decodeStateParam(stateParam)
 	if err != nil {
 		return nil, fmt.Errorf("bad state param")
Member

How do we communicate that this is a User Error, and not a system error?

Member Author

same as #16331 (comment)

That needs improvements.
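
One possible way to make the distinction, sketched under assumptions: wrap user-caused failures in a sentinel error and map it to a 4xx status with errors.Is, while everything else becomes a 5xx. ErrBadRequest, validateStateParam, errDatabaseDown, and statusFor are hypothetical names, and this is not how the PR or #16331 resolves the question.

```go
package main

import (
	"errors"
	"fmt"
	"net/http"
	"strings"
)

// ErrBadRequest marks failures caused by the caller's input.
var ErrBadRequest = errors.New("bad request")

// errDatabaseDown stands in for any system-side failure.
var errDatabaseDown = errors.New("database unavailable")

// validateStateParam is a hypothetical check that wraps user errors in
// ErrBadRequest so callers can classify them.
func validateStateParam(stateParam string) error {
	if stateParam == "" {
		return fmt.Errorf("%w: missing state parameter", ErrBadRequest)
	}
	if strings.Count(stateParam, ".") != 2 {
		return fmt.Errorf("%w: bad state param", ErrBadRequest)
	}
	return nil
}

// statusFor maps an error to an HTTP status code.
func statusFor(err error) int {
	switch {
	case err == nil:
		return http.StatusOK
	case errors.Is(err, ErrBadRequest):
		return http.StatusBadRequest
	default:
		return http.StatusInternalServerError
	}
}
```

A handler could then log 5xx cases at error level and 4xx cases at a lower level, which keeps alerting focused on system faults.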

@@ -75,11 +75,13 @@ func Start(logger *logrus.Entry, version string, cfg *config.Configuration) erro
}
}

var stateJWT *oidc.StateJWT
Member

If there's no secret, the stateJWT can now be nil. We're passing it into

oidc.NewService(cfg.SessionServiceAddress, dbConn, cipherSet, stateJWT)

Where it will panic when we try to access it on the calls. Should we hard fail if we don't have the secret? How do we handle this case?

Member Author

Was about to raise this question as well.

First off, I'm adding config for staging and prod in ops.

Should we hard fail if we don't have the secret? How do we handle this case?

Failing the whole component is really bad, but it doesn't seem wrong, as these are crucial bits from a security perspective, i.e. falling back to a default is a "no go".

OTOH, a second option might be to check for stateJWT == nil in handlers and fail late. That comes at the cost of monitoring, as we'd need to cover that independently.

How was that discussed with PATs and Stripe integration?

Member Author

@easyCZ, I'm opting for not failing hard for now. We can change that later on. Reason: for now this is all experimental and behind the scenes, so let's keep risk to a minimum.

Member Author

Done. Now it should not panic, but any attempt to call the endpoints while the signing key is not provided will be cancelled with an error.
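
The fail-late behaviour described here could look roughly like the sketch below. The types are simplified stand-ins, ErrSigningKeyMissing is a hypothetical name, and the Encode body is a placeholder rather than real signing.

```go
package main

import "errors"

// ErrSigningKeyMissing is returned when no signing key was configured.
var ErrSigningKeyMissing = errors.New("state signing key not configured")

// StateJWT is a simplified stand-in for the PR's oidc.StateJWT.
type StateJWT struct {
	key []byte
}

// Encode is a placeholder; real code would produce a signed JWT.
func (j *StateJWT) Encode(clientConfigID string) (string, error) {
	return "placeholder." + clientConfigID, nil
}

// Service may be constructed with a nil stateJWT when the secret is absent.
type Service struct {
	stateJWT *StateJWT
}

// encodeStateParam guards against the nil case instead of panicking.
func (s *Service) encodeStateParam(clientConfigID string) (string, error) {
	if s.stateJWT == nil {
		return "", ErrSigningKeyMissing
	}
	return s.stateJWT.Encode(clientConfigID)
}
```

Every entry point that touches stateJWT needs the same guard, which is the monitoring and maintenance cost mentioned earlier in this thread.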

Member

IMO, it's better to hard-fail in this case; I'm happy to do a follow-up change. With Stripe, we didn't hard-fail because we had the use case that we couldn't control the Stripe configuration but needed the rest of the server to work. But here, we get a better rollout signal when we roll out and miss something.
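
For comparison, the hard-fail alternative might be sketched as follows: construction returns an error when the secret is missing, and startup propagates it so the component never comes up half-configured. Config, StateSigningKey, and newStateJWT are hypothetical names, not the project's actual configuration fields.

```go
package main

import (
	"errors"
	"fmt"
)

// Config is a hypothetical slice of the server configuration.
type Config struct {
	StateSigningKey string
}

// StateJWT is a simplified stand-in for the PR's oidc.StateJWT.
type StateJWT struct {
	key []byte
}

// newStateJWT refuses to build a signer without a key.
func newStateJWT(cfg Config) (*StateJWT, error) {
	if cfg.StateSigningKey == "" {
		return nil, errors.New("state signing key is required")
	}
	return &StateJWT{key: []byte(cfg.StateSigningKey)}, nil
}

// Start fails fast, so a missing secret shows up during rollout.
func Start(cfg Config) error {
	stateJWT, err := newStateJWT(cfg)
	if err != nil {
		return fmt.Errorf("cannot start OIDC service: %w", err)
	}
	_ = stateJWT // would be handed to the OIDC service constructor
	return nil
}
```

The trade-off is the one discussed in this thread: a misconfigured deployment fails visibly at startup instead of returning errors per request.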

Member Author

@easyCZ, let's do that in follow-ups, please!

What else is required to release the brakes on this?

Member

PR is approved, feel free to land

@AlexTugarev
Member Author

/hold cancel

@roboquat roboquat merged commit 80dc959 into main Feb 15, 2023
@roboquat roboquat deleted the at/state-jwt branch February 15, 2023 17:55
@roboquat roboquat added the deployed: webapp and deployed labels Feb 16, 2023
Labels
deployed: webapp, deployed, release-note-none, size/L, team: webapp
3 participants