Investigate inconsistent licence loading in self-hosted instances #13357


Closed
mrsimonemms opened this issue Sep 27, 2022 · 3 comments
Labels
meta: never-stale This issue can never become stale

Comments

@mrsimonemms
Contributor

Problem

When a new Gitpod instance is started and the first user tries to create their account, the error Error: Maximum number of users permitted by the license exceeded is returned.

Ultimately, no users are able to log in, leaving the instance unusable.

Root cause

These logs are from a customer support bundle

The problem appears to be that the server call to kotsadm fails and the fallback licence is not loaded properly. A log entry similar to the following will appear at the start of the server log.

{
   "@type":"type.googleapis.com/google.devtools.clouderrorreporting.v1beta1.ReportedErrorEvent",
   "serviceContext":{
      "service":"server",
      "version":"release-2022.08.0.10"
   },
   "component":"server",
   "severity":"ERROR",
   "time":"2022-09-23T18:39:19.614Z",
   "message":"invalid license: falling back to default",
   "payload":{
      "domain":"xxxxx",
      "msg":"cannot query kots admin, \"Get \\\"http://kotsadm:3000/license/v1/license\\\": dial tcp ***HIDDEN***:3000: connect: connection refused\""
   },
   "loggedViaConsole":true
}

In this example, the kotsadm server appears to be unreachable (connection refused). This should not happen, because the kotsadm pod is already running, so it is clearly something to investigate.

Workaround

Restart the server pod and ensure that the above error does not recur:

kubectl rollout restart -n <namespace> deployment/server

Potential avenues/solutions

  • ensure that the server component can differentiate between a licence error and a fatal response. If the resolved licence permits fewer than one user, crash the server instead of starting in an unusable state (see the sketch after this list)
  • ensure that the invalid license: falling back to default path actually loads the default licence correctly. This may be a problem with the Replicated vs Gitpod licensor, as the Replicated licence requires an HTTP call whereas the Gitpod licence is read from a static file
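A minimal sketch of the first idea, in TypeScript, assuming an illustrative fetchKotsLicense helper and a Licence shape with a seat count (these names are hypothetical, not the actual server API): a failed fetch falls back to the default licence, while a licence that admits no users is treated as fatal.

// Hypothetical sketch: distinguish a failed licence fetch from a licence that
// is genuinely over its seat limit, and refuse to start if the effective
// licence would allow no users at all.

interface Licence {
    valid: boolean;
    seats: number;
}

// Illustrative default; the real default licence is defined by the licensor.
const DEFAULT_LICENCE: Licence = { valid: true, seats: 10 };

async function fetchKotsLicense(): Promise<Licence> {
    const resp = await fetch("http://kotsadm:3000/license/v1/license");
    if (!resp.ok) {
        throw new Error(`kotsadm returned ${resp.status}`);
    }
    return (await resp.json()) as Licence;
}

export async function resolveLicence(): Promise<Licence> {
    let licence: Licence;
    try {
        licence = await fetchKotsLicense();
    } catch (err) {
        // Transient failure (e.g. kotsadm not yet reachable): fall back to the
        // default licence rather than treating it as a licensing violation.
        console.error("invalid license: falling back to default", err);
        licence = DEFAULT_LICENCE;
    }

    // Fatal configuration: a licence that permits no users makes the instance
    // unusable, so crash loudly instead of starting.
    if (!licence.valid || licence.seats < 1) {
        throw new Error("licence permits no users; refusing to start");
    }
    return licence;
}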
@mrsimonemms mrsimonemms added the meta: never-stale This issue can never become stale label Sep 27, 2022
@AlexTugarev
Member

In this example, the kotsadm server appears to be unreachable (connection refused). This should not happen, because the kotsadm pod is already running, so it is clearly something to investigate.

@mrsimonemms, just to clarify: the issue seems to be that the server fetches the license only once on start, and the license service may not yet be available at that point; if you restart server (which effectively repeats the request), the request succeeds.

Looking at the code, licenseEvaluator.validate is never called after initialization, so the validation is never repeated. It looks like changes were made in this area which introduced this gap. A solution would be to call licenseEvaluator.validate whenever the state of the license is requested and the previously returned result was invalid, as in the sketch below.
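A rough sketch of that idea, assuming a licenseEvaluator whose validate() returns an object with a valid flag (the real signature may differ): keep the last validation result in memory and re-run validate() whenever the licence state is requested but the cached result is invalid.

// Hypothetical sketch of lazy re-validation; types and names are illustrative.
interface ValidationResult {
    valid: boolean;
    msg?: string;
}

interface LicenseEvaluator {
    validate(): ValidationResult;
}

class LicenseStateProvider {
    private lastResult?: ValidationResult;

    constructor(private readonly licenseEvaluator: LicenseEvaluator) {}

    getState(): ValidationResult {
        // Re-validate if we have never validated, or the previous attempt
        // failed (e.g. kotsadm was not reachable during server start-up).
        if (!this.lastResult || !this.lastResult.valid) {
            this.lastResult = this.licenseEvaluator.validate();
        }
        return this.lastResult;
    }
}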

@mrsimonemms
Contributor Author

@AlexTugarev yes. I created the Replicated licensor and wanted to keep the changes to server to a minimum. If we did make a call each time then it SHOULD work, although I guess there's the potential for that to introduce new bugs if the kotsadm service isn't available.

We almost want a nuanced approach: if the licence has loaded "correctly", the server should use the in-memory licence; otherwise, it should attempt to revalidate/reload the licence.

@gtsiolis
Contributor

Closing as the licensor component has been removed in #16983. Cc @aledbf @AlexTugarev @geropl
