Skip to content

[ws-daemon] Fix cgroups v2 I/O limiting and add support for Systemd cgroup #10669

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Merged
merged 3 commits into from
Jun 20, 2022

Conversation

aledbf
Copy link
Member

@aledbf aledbf commented Jun 14, 2022

Release Notes

Fix I/O limiting when cgroups v2 is enabled

@aledbf
Copy link
Member Author

aledbf commented Jun 15, 2022

/werft run

👍 started the job as gitpod-build-aledbf-fuse.7
(with .werft/ from main)

@roboquat roboquat added size/XL and removed size/L labels Jun 15, 2022
@aledbf aledbf force-pushed the aledbf/fuse branch 4 times, most recently from 8082c15 to cb89c5b Compare June 15, 2022 05:00
@aledbf aledbf changed the title [ws-daemon] Include cgroups v2 error instead of a static message [ws-daemon] Fix cgroups v2 I/O limiting Jun 15, 2022
@aledbf aledbf force-pushed the aledbf/fuse branch 3 times, most recently from ca63271 to 9062a35 Compare June 15, 2022 14:01
@aledbf aledbf marked this pull request as ready for review June 15, 2022 16:27
@aledbf aledbf requested review from a team June 15, 2022 16:27
@github-actions github-actions bot added team: webapp Issue belongs to the WebApp team team: workspace Issue belongs to the Workspace team labels Jun 15, 2022
@aledbf aledbf changed the title [ws-daemon] Fix cgroups v2 I/O limiting [ws-daemon] Fix cgroups v2 I/O limiting and add support for Systemd cgroup Jun 15, 2022
@werft-gitpod-dev-com
Copy link

started the job as gitpod-build-aledbf-fuse.26 because the annotations in the pull request description changed
(with .werft/ from main)

@werft-gitpod-dev-com
Copy link

started the job as gitpod-build-aledbf-fuse.27 because the annotations in the pull request description changed
(with .werft/ from main)

@werft-gitpod-dev-com
Copy link

started the job as gitpod-build-aledbf-fuse.28 because the annotations in the pull request description changed
(with .werft/ from main)

@werft-gitpod-dev-com
Copy link

started the job as gitpod-build-aledbf-fuse.29 because the annotations in the pull request description changed
(with .werft/ from main)

Copy link
Contributor

@csweichel csweichel left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Code looks great - how I can test?

@aledbf
Copy link
Member Author

aledbf commented Jun 16, 2022

Code looks great - how I can test?

You can use https://github.com/gitpod-io/workspace-preview/pull/44 to start a workspace preview (cgroups v2 is already enabled)
(you need to set limits, we don't set any by default)

}

return setup == Legacy, nil
return cgroups.Mode() == cgroups.Unified, nil
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We have discussed using containerd or runc cgroup libs, and we decided not to depend on any libs. WDYT?
#8471 (comment)

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We already have that library as a dependency. Not sure that argument is valid?

@utam0k
Copy link
Contributor

utam0k commented Jun 17, 2022

I tried it in a cgroup v2 environment. However, only one eBPF code was found. The contents were probably runc defaults, and not even the fuse plugin was working. Am I looking in the wrong path?

root@g024e48b9450d35ffc790c7:/# bpftool cgroup tree /sys/fs/cgroup/kubepods.slice/kubepods-burstable.slice/kubepods-burstable-pod87551354_d6d6_458c_8c02_23ec6701dea7.slice/cri-containerd-d9cf7e0f5eee155572588b05ffa9a1a13d2a0ee1c53660f35cc2ee5dfb57a6c7.scope
CgroupPath
ID       AttachType      AttachFlags     Name           
/sys/fs/cgroup/kubepods.slice/kubepods-burstable.slice/kubepods-burstable-pod87551354_d6d6_458c_8c02_23ec6701dea7.slice/cri-containerd-d9cf7e0f5eee155572588b05ffa9a1a13d2a0ee1c53660f35cc2ee5dfb57a6c7.scope
612      device          multi  
root@g024e48b9450d35ffc790c7:/# bpftool prog dump xlat id 612
   0: (61) r2 = *(u32 *)(r1 +0)
   1: (54) w2 &= 65535
   2: (61) r3 = *(u32 *)(r1 +0)
   3: (74) w3 >>= 16
   4: (61) r4 = *(u32 *)(r1 +4)
   5: (61) r5 = *(u32 *)(r1 +8)
   6: (55) if r2 != 0x1 goto pc+5
   7: (bc) w1 = w3
   8: (54) w1 &= 1
   9: (5d) if r1 != r3 goto pc+2
  10: (b4) w0 = 1
  11: (95) exit
  12: (55) if r2 != 0x2 goto pc+5
  13: (bc) w1 = w3
  14: (54) w1 &= 1
  15: (5d) if r1 != r3 goto pc+2
  16: (b4) w0 = 1
  17: (95) exit
  18: (55) if r2 != 0x2 goto pc+4
  19: (55) if r4 != 0x1 goto pc+3
  20: (55) if r5 != 0x3 goto pc+2
  21: (b4) w0 = 1
  22: (95) exit
  23: (55) if r2 != 0x2 goto pc+4
  24: (55) if r4 != 0x1 goto pc+3
  25: (55) if r5 != 0x5 goto pc+2
  26: (b4) w0 = 1
  27: (95) exit
  28: (55) if r2 != 0x2 goto pc+4
  29: (55) if r4 != 0x1 goto pc+3
  30: (55) if r5 != 0x7 goto pc+2
  31: (b4) w0 = 1
  32: (95) exit
  33: (55) if r2 != 0x2 goto pc+4
  34: (55) if r4 != 0x1 goto pc+3
  35: (55) if r5 != 0x8 goto pc+2
  36: (b4) w0 = 1
  37: (95) exit
  38: (55) if r2 != 0x2 goto pc+4
  39: (55) if r4 != 0x1 goto pc+3
  40: (55) if r5 != 0x9 goto pc+2
  41: (b4) w0 = 1
  42: (95) exit
  43: (55) if r2 != 0x2 goto pc+4
  44: (55) if r4 != 0x5 goto pc+3
  45: (55) if r5 != 0x0 goto pc+2
  46: (b4) w0 = 1
  47: (95) exit
  48: (55) if r2 != 0x2 goto pc+4
  49: (55) if r4 != 0x5 goto pc+3
  50: (55) if r5 != 0x2 goto pc+2
  51: (b4) w0 = 1
  52: (95) exit
  53: (55) if r2 != 0x2 goto pc+4
  54: (55) if r4 != 0xa goto pc+3
  55: (55) if r5 != 0xc8 goto pc+2
  56: (b4) w0 = 1
  57: (95) exit
  58: (55) if r2 != 0x2 goto pc+3
  59: (55) if r4 != 0x88 goto pc+2
  60: (b4) w0 = 1
  61: (95) exit
  62: (b4) w0 = 0
  63: (95) exit

This code, for example, the following section can be interpreted this way. And /dev/fuse(10:229) was not found. This means that fuse_plugin is not working well either.

  23: (55) if r2 != 0x2 goto pc+4
  24: (55) if r4 != 0x1 goto pc+3
  25: (55) if r5 != 0x5 goto pc+2
  26: (b4) w0 = 1
  27: (95) exit
  Allow /dev/zero (1:5)

The test for this fuse device fails. However, fuse may have nothing to do with this PR.

#define _GNU_SOURCE
include <unistd.h>.

#include <sys/syscall.h> (syscall)
#include <linux/fs.h>.
#include <sys/types.h>.
#include <sys/stat.h>
#include <fcntl.h> (English)
#include <stdio.h> (English)


int main() {
  const char* src_path = "/dev/fuse";
  unsigned int flags = O_RDWR;
  printf("RET: %ld", syscall(SYS_openat, AT_FDCWD, src_path, flags))
}

In summary, I have two questions.

  • Which paths does the eBPF apply to?
  • Can the fuse plugin and the io_limit plugin load eBPF code in the same cgroup path?

Here is ws-manager cm I edited

        "cpulimit": {
          "enabled": true,
          "totalBandwidth": "12",
          "limit": "2",
          "burstLimit": "4",
          "controlPeriod": "15s",
          "cgroupBasePath": "/mnt/node-cgroups"
        },
        "ioLimit": {
          "writeBandwidthPerSecond": "300Mi",
          "readBandwidthPerSecond": "350Mi",
          "writeIOPS": 0,
          "readI

@aledbf aledbf requested a review from utam0k June 17, 2022 23:48
@aledbf
Copy link
Member Author

aledbf commented Jun 17, 2022

@utam0k fuse is working correctly now. Please review again.

@utam0k
Copy link
Contributor

utam0k commented Jun 20, 2022

I checked. It works fine!!!

  • memory high and limit
root@ga8a2ae65be0d10d33ff12f:/sys/fs/cgroup/kubepods.slice/kubepods-burstable.slice/kubepods-burstable-pod34261dc3_3529_427f_ad87_538c766e5335.slice/cri-containerd-d6ddb03d6d6091f03b5b2a6bc3b0f21be28da9a7b1f6b23fea04eb249100bca0.scope# cat memory.high 
3435970560
root@ga8a2ae65be0d10d33ff12f:/sys/fs/cgroup/kubepods.slice/kubepods-burstable.slice/kubepods-burstable-pod34261dc3_3529_427f_ad87_538c766e5335.slice/cri-containerd-d6ddb03d6d6091f03b5b2a6bc3b0f21be28da9a7b1f6b23fea04eb249100bca0.scope# cat memory.max 
4294967296
  • CPU max with stress-ng
  • fuse
  58: (55) if r2 != 0x2 goto pc+4
  59: (55) if r4 != 0xa goto pc+3
  60: (55) if r5 != 0xe5 goto pc+2
  61: (b4) w0 = 1
  62: (95) exit
// Allow /dev/fuse(10:229) 
  • io limit "ioLimit": { "writeBandwidthPerSecond": "300Mi", "readBandwidthPerSecond": "350Mi" },
$ dd if=/dev/zero of=zero.dat bs=2G count=5
dd: warning: partial read (2147479552 bytes); suggest iflag=fullblock
0+5 records in
0+5 records out
10737397760 bytes (11 GB, 10 GiB) copied, 34.8157 s, 308 MB/s
# cat io.max 
8:0 rbps=367001600 wbps=314572800 riops=max wiops=max


ioMaxFile := filepath.Join(basePath, cgroupPath)
_, err := v2.NewManager(basePath, filepath.Join("/", cgroupPath), c.limits)
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We only want to write the value to io.max , right? Why do you use contained Manager?

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It was a little difficult to know what this code means because I had to assume some kinds of things.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I actually mistakenly thought this was updating the device filter and changing the eBPF. However, I admit I was a little dumb.

@utam0k
Copy link
Contributor

utam0k commented Jun 20, 2022

The behavior itself is the charm. Thank you @aledbf

@Furisto
Copy link
Member

Furisto commented Jun 20, 2022

Can confirm @utam0k's findings.Looks good to me.

@roboquat roboquat merged commit 27dfa4c into main Jun 20, 2022
@roboquat roboquat deleted the aledbf/fuse branch June 20, 2022 15:13
@roboquat roboquat added deployed: webapp Meta team change is running in production deployed: workspace Workspace team change is running in production deployed: IDE IDE change is running in production labels Jun 21, 2022
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
deployed: IDE IDE change is running in production deployed: webapp Meta team change is running in production deployed: workspace Workspace team change is running in production release-note size/XL team: delivery Issue belongs to the self-hosted team team: IDE team: webapp Issue belongs to the WebApp team team: workspace Issue belongs to the Workspace team
Projects
None yet
Development

Successfully merging this pull request may close these issues.

7 participants