How do we stop Kubernetes updates breaking installations? #11136


Closed
incident-io bot opened this issue Jul 5, 2022 · 7 comments · Fixed by #11177
Labels
team: delivery (Issue belongs to the self-hosted team)



incident-io bot commented Jul 5, 2022

Problem

An update to the K8s dependencies broke the installation process. In k8s 1.24, NetworkPolicy has a new status object added to it. In the Golang definition, this does not use omitempty, so the YAML generation adds status: {} to the output.

As Kubernetes <= 1.23 doesn't know how to handle this field, kubectl apply returns the following error:

error: error validating "STDIN": error validating data: ValidationError(NetworkPolicy): unknown field "status" in io.k8s.api.networking.v1.NetworkPolicy; if you choose to ignore these errors, turn validation off with --validate=false
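
For illustration, here is a minimal Go sketch (not the Installer's actual code) of how marshalling a NetworkPolicy built against the 1.24 client libraries produces the offending key:

```go
package main

import (
	"fmt"

	networkingv1 "k8s.io/api/networking/v1"
	metav1 "k8s.io/apimachinery/pkg/apis/meta/v1"
	"sigs.k8s.io/yaml"
)

func main() {
	np := networkingv1.NetworkPolicy{
		TypeMeta:   metav1.TypeMeta{APIVersion: "networking.k8s.io/v1", Kind: "NetworkPolicy"},
		ObjectMeta: metav1.ObjectMeta{Name: "example"},
	}

	out, err := yaml.Marshal(np)
	if err != nil {
		panic(err)
	}
	// With k8s.io/api v0.24.x the output contains "status: {}", which a
	// <= 1.23 API server rejects during validation.
	fmt.Println(string(out))
}
```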

This is not the first time we've had this problem - we've already hit it with StatefulSets.

This is likely to be a recurring issue and could hit us any time we update the Kubernetes dependencies.

Potential Solutions

These are just the solutions I came up with at the time of writing. If you have another, please add it to the list with its pros and cons.

1. Post-process with yq

Every time we hit one of these, we document the problem and the yq workaround, as was done with StatefulSets.

Pros

  • Simple

Cons

  • Has to be done everywhere (KOTS, production, preview-environments, etc.)
  • Could become a very long list
  • Not clear when these should be removed from docs

2. Lock the versions to the Kubernetes versions we support

We officially support the latest version of Kubernetes and the two previous versions (currently 1.22-1.24). We could lock the Kubernetes client library versions accordingly; a go.mod sketch follows the pros and cons below.

Pros

  • No need to be done everywhere

Cons

  • Not exactly clear how this could be achieved
  • Workspace team may need to bump things that they need to support
  • May still get this problem when bumping to a new version (in this case, from 1.23 to 1.24)
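
For context, option 2 would roughly amount to pinning the client modules in the Installer's go.mod to the oldest supported minor. A sketch, with illustrative patch versions (pick the latest v0.22.x patch release in practice):

```
// go.mod fragment (illustrative patch versions)
require (
	k8s.io/api v0.22.12
	k8s.io/apimachinery v0.22.12
	k8s.io/client-go v0.22.12
)
```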

3. Process inside the Installer

Either customise the Golang structs (preferred) or add post-processing inside the Installer (if the first is not possible) so that the bad parameters are never generated. A sketch follows the pros and cons below.

Pros

  • No need to be done everywhere

Cons

  • Bit hacky?
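
To make option 3 concrete, here is a rough sketch of the post-processing variant (a hypothetical helper, not the Installer's real code); the struct-customisation variant would instead wrap the upstream types so the field is never rendered:

```go
package postprocess

import "sigs.k8s.io/yaml"

// StripStatus removes the top-level "status" key from a rendered manifest so
// that older API servers (e.g. Kubernetes <= 1.23) do not fail validation on
// fields introduced by newer client libraries.
func StripStatus(manifest []byte) ([]byte, error) {
	var obj map[string]interface{}
	if err := yaml.Unmarshal(manifest, &obj); err != nil {
		return nil, err
	}
	delete(obj, "status")
	return yaml.Marshal(obj)
}
```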

My personal preference is 3. However, this is something that affects all teams, so it should be debated by all, and a final decision should be made by @corneliusludmann and @csweichel

cc @gitpod-io/engineering-self-hosted @gitpod-io/engineering-ide @gitpod-io/product-engineering @gitpod-io/platform @gitpod-io/engineering-workspace @gitpod-io/engineering-webapp


This action was created from Incident 172 by Mads Hartmann, using incident.io 🔥

@mrsimonemms changed the title from "Add post-processing capabilities to the Installer" to "How do we stop Kubernetes updates breaking installations?" on Jul 5, 2022
@mrsimonemms added the "team: delivery" label on Jul 5, 2022
@corneliusludmann (Contributor)

In my opinion, the only sensible way is to use the Kubernetes client version that corresponds to the Kubernetes version we need to support (option 2 in the list above).

According to the README of kubernetes/client-go, that is how the Kubernetes client is designed: Kubernetes is backwards compatible with clients, and bugfixes are backported. Honestly, I personally don't really see the benefit of fiddling with manifests by hand to achieve backwards compatibility instead of simply using the proper client version.

The only remaining question in my opinion is: Do we (Team Workspace?) rely on any new features in the client library that justify using a newer version?

In case (2) is not an option for some reason, I think (3) would be the proper alternative.


> Cons
> Not exactly clear how this could be achieved

That would mean we do not update the client library beyond v0.22.x as long as we support Kubernetes 1.22.

> Workspace team may need to bump things that they need to support

Not sure if I get this. 🤔 😬

> May still get this problem when bumping to a new version (in this case, from 1.23 to 1.24)

That's a general issue when we stop supporting a specific version. I don't think we have a proper answer to that yet. However, once we don't support an older version anymore, we don't need to support this case either. 🥁

@mrsimonemms (Contributor)

> Workspace team may need to bump things that they need to support

It always seems to be @gitpod-io/engineering-workspace that bumps the versions - if this is done to get access to a new feature in the updated library, then this approach could cause problems.

@lucasvaltl (Contributor)

@csweichel & @kylos101 what are your thoughts on this? Are there any good reasons not to stick with the k8s client version that we need to support?

@sagor999 (Contributor) commented Jul 6, 2022

I am personally not quite sure why we need to be on the bleeding edge of k8s updates. 🤔 Seems like it is just asking for trouble, unless there is some very, very specific need (like an important security fix or something like that).

@kylos101 (Contributor)

Hi @csweichel @corneliusludmann @mrsimonemms @sagor999 👋 ,

May we ask you to peek at this internal Notion doc, which describes how and why we currently take on Kubernetes dependencies at Gitpod, and share feedback? Full disclosure: I was unaware of how we do this today, and authoring the doc helped me better understand the reasoning behind our current norms. Thank you @aledbf for sharing your experience, perspective, and thoughts.

@lucasvaltl and I spoke and agreed it would be good to do a couple things as next steps:

  1. Share this document, so that we can all have a common understanding as to how and why we do things now
  2. Plan to run self-hosted tests on the last 3 versions of Kubernetes; this way we can assert that Gitpod works well on each of them

Is there anything else you feel strongly about that you think we should do that we might be missing? 🤔

@mrsimonemms can you act as reviewer for self-hosted while @corneliusludmann is out?

cc: @gitpod-io/engineering-workspace for awareness

@mrsimonemms (Contributor) commented Jul 13, 2022

I've read the document that @kylos101 and @lucasvaltl have done and am broadly happy with it.

I think that we now have to decide between options 1 and 3 - my preference would be 3. They are broadly the same thing, but in 3 the changes happen inside the Installer, so we can control them.

@corneliusludmann (Contributor)

With #11391 we implemented option 3 for now. However, the process for detecting incompatible changes is still slightly unclear.
