Skip to content

Fixing the clean-slate-deployment of new VM base preview-environments #8748

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Merged
merged 1 commit into from
Mar 14, 2022

Conversation

fullmetalrooster
Copy link
Contributor

@fullmetalrooster fullmetalrooster commented Mar 11, 2022

Description

This fixes a problem where VMs are not removed during a clean-slate-deployment.

UPDATE: Thanks for the support. I dug deeper and refactored a bit. The prepareVM function is now only used when we really want a VM. Then it checks in prepareVM if the the VM is present. If it is not present createVM otherwise the VM exists and we need to check if a clean-slate-deployment is triggered. Then the VM which existence has been verified in the previous step will be deleted and recreated.

Related Issue(s)

Fixes #8747

How to test

Start a workspace and run werft run github -a with-vm=true. After the deployment of the VM run werft run github -a with-vm=true -a with-clean-slate-deployment. The VM will be removed during the prepare phase and watch out for the string "Cleaning previously created VM" in the logs. An additional test would be to run werft run github -a with-vm=true and make sure that the VM will not be removed.

Release Notes

None

Documentation

@codecov
Copy link

codecov bot commented Mar 11, 2022

Codecov Report

Merging #8748 (8acb739) into main (5e3897d) will decrease coverage by 1.13%.
The diff coverage is n/a.

@@            Coverage Diff             @@
##             main    #8748      +/-   ##
==========================================
- Coverage   12.31%   11.17%   -1.14%     
==========================================
  Files          20       18       -2     
  Lines        1161      993     -168     
==========================================
- Hits          143      111      -32     
+ Misses       1014      880     -134     
+ Partials        4        2       -2     
Flag Coverage Δ
components-gitpod-cli-app 11.17% <ø> (ø)
components-local-app-app-darwin-amd64 ?
components-local-app-app-darwin-arm64 ?
components-local-app-app-linux-amd64 ?
components-local-app-app-linux-arm64 ?
components-local-app-app-windows-386 ?
components-local-app-app-windows-amd64 ?
components-local-app-app-windows-arm64 ?

Flags with carried forward coverage won't be shown. Click here to find out more.

Impacted Files Coverage Δ
components/local-app/pkg/auth/pkce.go
components/local-app/pkg/auth/auth.go

📣 Codecov can now indicate which changes are the most critical in Pull Requests. Learn more

Copy link
Contributor

@mads-hartmann mads-hartmann left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The changes don't look right to me. You're passing in the namespace and then also appending preview in the -n argument to kubectl.

The code looks right before though, so I suspect whatever issue you're looking for is hiding somewhere else ☺️

@mads-hartmann
Copy link
Contributor

mads-hartmann commented Mar 11, 2022

@wulfthimm I suspect the real problem is that we only invoke prepareVM if the VM doesn't exist. And the logic to delete the VM is inside prepareVM ☺️

@fullmetalrooster
Copy link
Contributor Author

fullmetalrooster commented Mar 11, 2022

The changes don't look right to me. You're passing in the namespace and then also appending preview in the -n argument to kubectl.

The code looks right before though, so I suspect whatever issue you're looking for is hiding somewhere else relaxed

Thanks, that is what I also suspect. I started with this approach and the job did what I expected to do. Therefore this premature PR to unblock the beta release and take a deeper look afterwards. But I will take a deeper look to ensure that the solution is sustainable now.

@fullmetalrooster
Copy link
Contributor Author

@wulfthimm I suspect the real problem is that we only invoke prepareVM if the VM doesn't exist. And the logic to delete the VM is inside prepareVM relaxed

Having the deletion of the previous VM inside prepareVM is valid as we check for a cleanSlateDeployment. The problem is the function existsVM that return a false when it should return a true.

@mads-hartmann
Copy link
Contributor

The problem is the function existsVM that return a false when it should return a true.

Maybe we're talking past each other. Your changes has introduced a bug which means VM.vmExists will always return false, but was that also the case on main?

If we're looking at main there's a bug even if VM.vmExists does the right thing. If VM.vmExists return true we will never invoke prepareVM because of this conditional. Because we never invoke prepareVM, we will never try to delete the VM.

I think the best way forward is to

  1. Revert your changes
  2. Ensure VM.vmExists does the right thing
  3. Fix the issue I described above

.werft/vm/vm.ts Outdated
const namespace = `preview-${options.name}`
const status = exec(`kubectl --kubeconfig ${KUBECONFIG_PATH} -n ${namespace} get vmi ${options.name}`, { dontCheckRc: true, silent: true })
const status = exec(`kubectl --kubeconfig ${KUBECONFIG_PATH} -n preview-${options.name} get vmi ${options.name}`, { dontCheckRc: true, silent: true })
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I believe you want to revert this :)

@fullmetalrooster fullmetalrooster marked this pull request as draft March 11, 2022 14:25
@fullmetalrooster fullmetalrooster force-pushed the wth/fix-clean-vm branch 3 times, most recently from 1381c3c to 1ae598f Compare March 11, 2022 14:44
@fullmetalrooster
Copy link
Contributor Author

fullmetalrooster commented Mar 11, 2022

Thanks for the support. I dug deeper and refactored a bit. The prepareVM function is now only used when we really want a VM. Then it checks in prepareVM if the the VM is present. If it is not present createVM otherwise the VM exists and we need to check if a clean-slate-deployment is triggered. Then the VM which existence has been verified in the previous step will be deleted and recreated.

@fullmetalrooster fullmetalrooster marked this pull request as ready for review March 11, 2022 14:53
Copy link
Contributor

@ArthurSens ArthurSens left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks for tackling this bug introduced by me wulf! 🤗

I feel like your approach is overcomplicating things a bit though... Could we stick with the original implementation, while adding your check following the Single Responsibility Principle?

Does this approach looks better to read?

function decideHarvesterVMCreation(werft: Werft, config: JobConfig) {
    if (shouldCreateVM(config)) {
        prepareVM(werft, config)
    } else {
        werft.currentPhaseSpan.setAttribute("werft.harvester.created_vm", false)
    }
    werft.done(prepareSlices.BOOT_VM)
}

function shouldCreateVM(config: JobConfig) {
    return config.withVM && (
          !VM.vmExists({ name: config.previewEnvironment.destname }) || 
          config.cleanSlateDeployment
    )
}

@fullmetalrooster
Copy link
Contributor Author

@ArthurSens I find my approach much easier to read because it does not use as much functions and is still relatively compact. But I am not a TS developer and do not have any preferences.

@ArthurSens
Copy link
Contributor

ArthurSens commented Mar 11, 2022

Using conditionals is nice because it's easy to implement things when in a hurry, but they add complexity as we need to keep track of how all those conditionals correlate between different parts of the code. When reading the code, with this approach, we need to scroll up and down, we need to look at different levels of nested conditionals and it does add cognitive noise 😵

Functions, on the other hand, are a powerful thing that can reduce complexity 🙂. We can read code only once in a top-to-bottom approach, while not entering into nested conditionals.

Copy link
Contributor

@ArthurSens ArthurSens left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@wulfthimm and I just did a pair-programming session to align coding style 🙂

Copy link
Contributor

@mads-hartmann mads-hartmann left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Looks good to me, thanks!

@ArthurSens
Copy link
Contributor

ArthurSens commented Mar 14, 2022

/hold

Looks like the job failed due to core-dev flakyness?

/werft run

👍 started the job as gitpod-build-wth-fix-clean-vm.44

@ArthurSens
Copy link
Contributor

ArthurSens commented Mar 14, 2022

/werft run

👍 started the job as gitpod-build-wth-fix-clean-vm.45

@ArthurSens
Copy link
Contributor

/unhold

@roboquat roboquat merged commit 3e35224 into main Mar 14, 2022
@roboquat roboquat deleted the wth/fix-clean-vm branch March 14, 2022 08:53
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

Successfully merging this pull request may close these issues.

VMs are not removed during a clean-slate-deployment
4 participants