
Update preview VM image #11053

Merged
merged 1 commit into main from as/updae-vm on Jul 25, 2022

Conversation

@ArthurSens (Contributor) commented Jun 30, 2022

Signed-off-by: ArthurSens [email protected]

Description

Updates the VM image to gitpod-k3s-202206291903

Related Issue(s)

Fixes #10832

How to test

Release Notes

NONE

Documentation

Werft options:

  • /werft with-preview

@ArthurSens requested review from a team June 30, 2022 19:16
@github-actions bot added the team: devx and team: workspace labels Jun 30, 2022
@ArthurSens (Contributor, Author) commented Jun 30, 2022

/hold

Putting a hold on this until both teams have properly tested that it works 😬

@kylos101 (Contributor)

@ArthurSens Weird, I am not seeing a Results tab, with a link to the preview environment here. 🤔

@werft-gitpod-dev-com

started the job as gitpod-build-as-updae-vm.1 because the annotations in the pull request description changed
(with .werft/ from main)

@ArthurSens (Contributor, Author) commented Jun 30, 2022

> @ArthurSens Weird, I am not seeing a Results tab, with a link to the preview environment here. 🤔

@kylos101 you should get a new one by ticking/unticking the with-preview checkbox in the PR description :)

@ArthurSens (Contributor, Author)

@kylos101 @meysholdt are we confident about merging this one?

@Furisto (Member) commented Jul 1, 2022

@kylos101 @ArthurSens The cgroup filesystem in the preview environment is still v1.

@kylos101 (Contributor) left a comment

@ArthurSens given the feedback from @Furisto, something is not right.

@Furisto how are you checking that the file system is still on cgroup v1?

@utam0k (Contributor) commented Jul 3, 2022

> @Furisto how are you checking that the file system is still on cgroup v1?

@kylos101
If you find the /sys/fs/cgroup/cpu path, it means you are using cgroup v1.
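
For a quick check, the cgroup version can also be read from the filesystem type mounted at /sys/fs/cgroup (a minimal sketch using standard commands, not taken from this thread):

# cgroup v2 mounts a single unified hierarchy; v1 mounts one directory per controller
stat -fc %T /sys/fs/cgroup    # prints "cgroup2fs" on v2, "tmpfs" on a v1 host
ls /sys/fs/cgroup/cpu 2>/dev/null && echo "cgroup v1 hierarchy present"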

@Furisto (Member) commented Jul 4, 2022

> @Furisto how are you checking that the file system is still on cgroup v1?

Opened a workspace and then did as @utam0k described.

@ArthurSens (Contributor, Author)

@kylos101 @Furisto, I'm not sure how to proceed here.

Do you have a way to certify that a VM image belongs to a certain GitHub release? How do we make sure we have the correct image in place?

And regarding cgroup v2, do you configure something to make it available in production? Am I missing something here?

@Furisto (Member) commented Jul 4, 2022

This is how we activate cgroup v2 in prod: https://github.com/gitpod-io/gitpod-packer-gcp-image/blob/a76de6e8bd479e4f441d5e86affc642abf15c246/setup.sh#L115
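
For readers without access to that file: the usual way to enable cgroup v2 on a systemd-based distro is a kernel boot parameter plus a reboot. A minimal sketch of that approach, assuming the standard systemd mechanism (the exact flags in the linked setup.sh may differ):

# append the systemd switch to the kernel command line
sudo sed -i 's/^GRUB_CMDLINE_LINUX="/&systemd.unified_cgroup_hierarchy=1 /' /etc/default/grub
# regenerate the grub config and reboot so the parameter takes effect
sudo update-grub && sudo reboot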

@kylos101 (Contributor) commented Jul 6, 2022

@ArthurSens do you need anything else to test this?

@jenting @utam0k do you have any other recommendations for @ArthurSens? I ask because I know you wrote this internal doc last night.

@ArthurSens (Contributor, Author) commented Jul 6, 2022

> @ArthurSens do you need anything else to test this?
>
> @jenting @utam0k do you have any other recommendations for @ArthurSens? I ask because I know you wrote this internal doc last night.

The internal doc written by Toru does help a little bit, but the need for a reboot makes things very complicated to add to our CI at the moment... I'll try to take another look at this tomorrow 🤔

@jenting (Contributor) commented Jul 7, 2022

> > @ArthurSens do you need anything else to test this?
> > @jenting @utam0k do you have any other recommendations for @ArthurSens? I ask because I know you wrote this internal doc last night.
>
> The internal doc written by Toru does help a little bit, but the need for a reboot makes things very complicated to add to our CI at the moment... I'll try to take another look at this tomorrow 🤔

Thank you, Arthur. It would be good to support reboot, because we are thinking of adding a werft annotation that lets us switch between cgroup v1 and cgroup v2.

@utam0k (Contributor) commented Jul 7, 2022

@ArthurSens Can we at least decide the cgroup version when we first create the preview-env?

@meysholdt (Member) commented Jul 7, 2022

> Thank you, Arthur. It would be good to support reboot, because we are thinking of adding a werft annotation that lets us switch between cgroup v1 and cgroup v2.

Does "having a werft annotation that lets us switch between cgroup v1 and cgroup v2" really require support for reboot? Assumption: what we want here is the ability to pass Linux kernel boot parameters from Werft to the VM as part of the VM creation process.

@kylos101 (Contributor) commented Jul 7, 2022

👋 let's try to keep this like a 🛹

My recommendation is:

  1. The default should be cgroup v2
  2. If someone wants to test on cgroup v1, they can follow the Notion doc that JenTing and Toru mention here

If following the document to test on cgroup v1 happens often, and it'd save people time in the future, then we can consider the werft annotation in a future PR and issue.

In other words, right now it's too early to say whether we truly need a werft annotation to switch between cgroup types for a preview environment instance. It'd be better (best?) to save that energy for the Platform Team, so they can work on other things.

@ArthurSens (Contributor, Author) commented Jul 7, 2022

> 👋 let's try to keep this like a 🛹
>
> My recommendation is:
>
>   1. The default should be cgroup v2
>   2. If someone wants to test on cgroup v1, they can follow the Notion doc that JenTing and Toru mention here
>
> If following the document to test on cgroup v1 happens often, and it'd save people time in the future, then we can consider the werft annotation in a future PR and issue.
>
> In other words, right now it's too early to say whether we truly need a werft annotation to switch between cgroup types for a preview environment instance. It'd be better (best?) to save that energy for the Platform Team, so they can work on other things.

The problem here is that it looks like a reboot is needed mid-CI to enable cgroup v2, and that is quite complex to implement.

We depend on the VM being up for several steps of the CI. I'm not saying it is impossible, but it will require a fair amount of refactoring 🙃

Is there any other way of enabling cgroup v2 that won't require a reboot?

@utam0k (Contributor) commented Jul 7, 2022

> Is there any other way of enabling cgroup v2 that won't require a reboot?

AFAIK, there is no way without a reboot, because cgroup is a core feature of the Linux kernel. All processes belong to some cgroup, so it is probably impossible to switch without a restart.

@kylos101 (Contributor)

@ArthurSens 👋 I'm not sure why the reboot is necessary if you're using our updated image. Is Harvester mutating this grub entry? My understanding based on this comment is that on boot-up, we'd be using cgroup v2.

Can you share the output of cat /etc/default/grub?
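
A quick way to see whether Harvester (or anything else) changed the effective boot parameters is to compare the grub default with what the running kernel was actually booted with (a small sketch, assuming standard paths; not from this thread):

grep GRUB_CMDLINE_LINUX /etc/default/grub      # what grub is configured to pass on next boot
tr ' ' '\n' < /proc/cmdline | grep -i cgroup   # what this boot actually received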

@meysholdt (Member) commented Jul 14, 2022

/werft with-preview=true

👎 unknown command: with-preview=true
Use /werft help to list the available commands

@meysholdt (Member) commented Jul 14, 2022

/werft with-preview

👎 unknown command: with-preview
Use /werft help to list the available commands

@meysholdt (Member) commented Jul 14, 2022

/werft run with-preview=true

👍 started the job as gitpod-build-as-updae-vm.8
(with .werft/ from main)

@meysholdt (Member)

I got a bit further: when starting the preview env via werft run github (instead of from this PR), the preview env gets the new image from this PR.

This can be confirmed by looking at the VM in Harvester and seeing the image version:
(screenshot: the VM in Harvester showing the new image version)

In the new VM, the test passes:

ls /sys/fs/cgroup/cgroup.controllers
/sys/fs/cgroup/cgroup.controllers

However, no pods in k3s start.

Some errors from journalctl:

I0714 15:39:44.950125  259700 codec.go:117] "Using lenient decoding as strict decoding failed" err="strict decoding error: unknown field \"conntrack\""
time="2022-07-14T15:39:44Z" level=error msg="Failed to record snapshots for cluster: No snapshot configmap found"
time="2022-07-14T15:39:48Z" level=info msg="error in remotedialer server [400]: websocket: close 1006 (abnormal closure): unexpected EOF"
E0714 15:39:50.318989  259700 kubelet.go:2391] "Container runtime network not ready" networkReady="NetworkReady=false reason:NetworkPluginNotReady message:Network plugin returns error: cni plugin not initialized"

For some reason, Calico does not get installed.

If I install it manually by running kubectl apply -f /var/lib/gitpod/manifests/calico.yaml, it does not launch, and kubectl -n kube-system describe pod calico-node-f8gxr shows this error:

failed to create pod sandbox: rpc error: code = Unknown desc = failed to create containerd task: failed to create shim task: OCI runtime create failed: runc create failed: expected cgroupsPath to be of format "slice:prefix:name" for systemd cgroups, got "/kubepods/burstable/pod95dd2279-3853-4cdb-b25a-718c82d4f907/e8f0110ff17cebbe19c298f684bbb2781a308c22b25f3a015a1b9aa9b551c6ee" instead: unknown

Any hints why k3s does not start as expected are very much appreciated :)

@kylos101 (Contributor)

@meysholdt do you have the same trouble when using the latest image? That would be this one.

@aledbf (Member) commented Jul 14, 2022

@meysholdt besides @kylos101's comment, please make sure k3s is started with the flag --kubelet-arg cgroup-driver=systemd
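
For reference, the same kubelet argument can also be set persistently through the standard k3s config file instead of the command line (a sketch assuming a default k3s install; this uses the documented k3s config mechanism, not something taken from this PR):

# /etc/rancher/k3s/config.yaml
kubelet-arg:
  - "cgroup-driver=systemd"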

@roboquat added size/S and removed size/XS labels Jul 21, 2022
Comment on lines +342 to +343
kubectl apply -f /var/lib/gitpod/manifests/csi-driver.yaml
kubectl apply -f /var/lib/gitpod/manifests/csi-config.yaml

@vulkoingim (Contributor)

With the latest image and a fresh VM, the preview env now succeeds:

https://werft.gitpod-dev.com/job/gitpod-custom-as-updae-vm.11

(screenshot: successful werft job run)

grep cgroup /proc/filesystems

nodev   cgroup
nodev   cgroup2
...
ls /sys/fs/cgroup/cgroup.controllers
/sys/fs/cgroup/cgroup.controllers
time="2022-07-21T16:02:03Z" level=info msg="Running kubelet --address=0.0.0.0 --anonymous-auth=false --authentication-token-webhook=true --authorization-mode=Webhook --cgroup-driver=systemd ....

@vulkoingim requested review from kylos101 and a team July 21, 2022 16:28
@meysholdt (Member) left a comment

Thank you, @vulkoingim!

I can confirm that the preview env on this PR runs on image default/gitpod-k3s-202207120820.qcow2 and that I could successfully start a workspace inside that preview env.

@jenting (Contributor) commented Jul 25, 2022

/werft run with-preview=true

👍 started the job as gitpod-build-as-updae-vm.19
(with .werft/ from main)

@vulkoingim (Contributor)

@jenting this job won't run with the changes; it has to be started manually from the CLI. I'll do it in a second and send a link after I clean up the current VM.

@jenting (Contributor) commented Jul 25, 2022

> @jenting this job won't run with the changes; it has to be started manually from the CLI. I'll do it in a second and send a link after I clean up the current VM.

Thank you!

@jenting (Contributor) commented Jul 25, 2022

I could open the workspace, and the underlying VM is running cgroup v2 with the systemd cgroup driver.
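
For anyone repeating that verification, both properties can be checked from inside the VM (a minimal sketch; it assumes the k3s systemd unit is named k3s):

stat -fc %T /sys/fs/cgroup                                        # "cgroup2fs" confirms cgroup v2
journalctl -u k3s | grep -o -- '--cgroup-driver=[a-z]*' | tail -n1   # expect "systemd"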

@jenting (Contributor) commented Jul 25, 2022

/unhold

@kylos101 (Contributor) left a comment

👍 to unblock this, thanks @Furisto and @jenting!

@roboquat merged commit 70d0392 into main Jul 25, 2022
@roboquat deleted the as/updae-vm branch July 25, 2022 14:08
@roboquat added the deployed: workspace label Aug 3, 2022
Labels
deployed: workspace, release-note-none, size/S, team: devx, team: workspace
Projects
None yet
Development

Successfully merging this pull request may close these issues.

[preview environments] run with cgroup v2
9 participants