Enable user namespaces and seccomp #1172
Comments
It is also worth noting that both of these are required for Chromium and Google Chrome to properly implement sandboxing of their various sub-components. |
@CTassisF has your issue been resolved? If so, please close this issue. Thanks. |
User namespaces are disabled in the current kernel:
|
With all kernel options that aren't simply loadable modules we are concerned about increased kernel size and reduced performance. If you were to present "before and after" comparisons of free memory and performance (using some standard benchmarks) it may strengthen your case. |
@pelwell what kind of benchmark are you after? Disk block I/O, file system I/O, CPU cycles, etc.? I want seccomp for Docker on the Raspberry Pi; it is an important protection. I want user namespaces as well: they allow a user ID inside a container to be mapped to a different user ID on the host, which matters most for user ID 0 (aka root). This is not limited to Docker: LXC and LXD also require user namespaces to run unprivileged containers. So I'm ready to measure memory before and after, but for the benchmark, what do you expect? |
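To illustrate the user ID mapping mentioned in the comment above, here is a minimal Python sketch. It assumes the three-column format of `/proc/<pid>/uid_map` (first ID inside the namespace, first ID outside, range length). The helper names `parse_uid_map` and `host_uid` and the 100000 offset are hypothetical, chosen only for illustration.

```python
from typing import List, Optional, Tuple

# Each entry mirrors one line of /proc/<pid>/uid_map:
# (first ID inside the namespace, first ID outside, length of the range).
UidMap = List[Tuple[int, int, int]]


def parse_uid_map(text: str) -> UidMap:
    """Parse uid_map contents into (inside, outside, length) tuples."""
    entries = []
    for line in text.splitlines():
        fields = line.split()
        if len(fields) == 3:
            inside, outside, length = (int(f) for f in fields)
            entries.append((inside, outside, length))
    return entries


def host_uid(container_uid: int, uid_map: UidMap) -> Optional[int]:
    """Return the host UID for a container UID, or None if it is unmapped."""
    for inside, outside, length in uid_map:
        if inside <= container_uid < inside + length:
            return outside + (container_uid - inside)
    return None


# Hypothetical mapping: container IDs 0..65535 map to host IDs 100000..165535.
example = parse_uid_map("0 100000 65536\n")
print(host_uid(0, example))     # root inside the container
print(host_uid(1000, example))  # an ordinary user inside the container
```

With a mapping like this, root inside the container is an unprivileged user on the host, which is the protection being asked for.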
How about running dd from /dev/zero to /dev/null with three block sizes - 1, 4k and 1024k - and a count set to make each one take at least 10 seconds? I'm concerned about performance for non-sandboxed processes, but I would be curious to compare with sandboxed as well, so a 3x3 grid of results (MB/s for each of three sizes in a kernel without seccomp, with seccomp, and sandboxed) would be great. |
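The benchmark proposed above could be scripted roughly as follows. This is only a sketch: the function names are hypothetical, the calibration approach (time a short run, then scale the count) is an assumption about how to reach the 10-second target, and throughput is reported with 1 MB taken as 10^6 bytes.

```python
import math
import subprocess
import time
from typing import List

BLOCK_SIZES = ["1", "4k", "1024k"]  # the three sizes suggested above


def dd_command(block_size: str, count: int) -> List[str]:
    """Build the dd invocation that copies /dev/zero to /dev/null."""
    return ["dd", "if=/dev/zero", "of=/dev/null",
            f"bs={block_size}", f"count={count}"]


def block_size_bytes(block_size: str) -> int:
    """Convert a dd block size such as '4k' to bytes (k = 1024)."""
    if block_size.endswith("k"):
        return int(block_size[:-1]) * 1024
    return int(block_size)


def count_for_target(calibration_count: int, calibration_seconds: float,
                     target_seconds: float = 10.0) -> int:
    """Scale a short calibration run up to the desired run time."""
    return math.ceil(calibration_count * target_seconds / calibration_seconds)


def throughput_mb_s(block_size: str, count: int) -> float:
    """Run dd once and return the measured throughput in MB/s."""
    start = time.perf_counter()
    subprocess.run(dd_command(block_size, count), check=True,
                   stderr=subprocess.DEVNULL)
    elapsed = time.perf_counter() - start
    return block_size_bytes(block_size) * count / elapsed / 1e6
```

Calling `throughput_mb_s` for each entry in `BLOCK_SIZES` on each kernel would fill one row of the 3x3 grid; nothing is executed until that function is called.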
Ok @pelwell I can do this. Give me a week in order to build the kernel and do the benchmarks. I'm a young dad so I need time! ;-) |
Hi @pelwell I haven't finished testing, but I can already give a status update on the following:
PS: I've added a small benchmark which serves a static web page (a simple HTML page) with nginx. The benchmark is run from my desktop using ApacheBench; my desktop is powerful enough to saturate the rpi if necessary ;-) I will also try to test with a generated website, maybe something like Ghost or WordPress, whichever is easier to install on the rpi and in a container. I will give more details when I'm done with testing. |
That's looking good so far. The dd test is almost a worst case, with very large numbers of userspace-to-kernel round-trips. As such, a 10-20% performance drop doesn't sound too bad, but others may disagree - that isn't a final decision. |
It seems I cannot test further. I successfully compiled and booted a new kernel with AppArmor support. It works well, but not with Docker (moby/moby#27351); I'm investigating with some support from Docker. In any case, enabling AppArmor has decreased performance a bit further. With AppArmor, seccomp filtering and user namespaces all enabled, I am getting the following results:
My changes to the default config are in this branch on a fork I made: https://github.com/jcberthon/linux/tree/rpi-sec-apparmor-seccomp-userns I will write later to describe the other tests I have conducted, and to report any update on Docker with AppArmor on ARM. |
Just a quick update. The problem with Docker when AppArmor is active on ARM has been solved and the fix merged. It will be available in Docker 1.12.3. I've installed the patch and can now run Docker successfully on the Raspberry Pi with my improved kernel. In the coming days I'll provide a pull request with the changes. So if it is decided that this issue should be fixed, it will simply be a matter of reviewing and possibly merging my changes. I now have to find the time to do some benchmarking inside the container with both the official kernel and mine. Although it is trivial, I'm short of time, so do not expect much feedback from me in the next 10-20 days. |
Thanks for the update - take your time, we'll still be here. |
Just for information: the See these commits on 4.6 and 4.7 by @popcornmix: 4.6 39f02dd#diff-d578de903015b334ab3f9f22d7055058 and 4.7 c2b66ab#diff-d578de903015b334ab3f9f22d7055058 And it is in the baseline config from 4.8 on. My other proposed changes are not included in newer branches (4.5 to 4.9) |
Hi, I have concluded the benchmarking using dd as suggested by @pelwell. I tested dd in 3 settings, with 1 byte (test1), 4 kB (test2) and 1 MB (test3) blocks, and configured it so that each test runs for 20-30 s. Each test was run 3 times and I computed the average. My platform was a headless Raspberry Pi 2 (using SSH, no monitor, keyboard or X11).
The tests were run on 3 different kernels: the vanilla Raspbian kernel (4.4.27-v7+), the Raspbian kernel config with User Namespace and SECCOMP filters enabled, and the same with AppArmor added. On the vanilla kernel and on the kernel with UserNS+SECCOMP-filters+AppArmor, I also ran the benchmark in a Docker container. So in total that's 5 environments.
For Docker, I used 1.12.3-rc1, which contains a patch allowing it to run on ARM with AppArmor, and I used the Debian:jessie image from armhf (https://hub.docker.com/r/armhf/debian/). Note that the final 1.12.3 was published yesterday, but I did not retest with it.
TL;DR: The performance impact is ±7% when using UserNS+SECCOMP-filters compared to the vanilla kernel, but it is up to -23% when using UserNS+SECCOMP-filters+AppArmor compared to the vanilla kernel. Within Docker, the benchmark is always about 2% faster than on the host itself. Detailed results:
The columns named "Impact" give the change, in percent, between the previous column and the baseline, which is the vanilla Raspbian kernel except when specified otherwise. Conclusion: including UserNS and SECCOMP filters does not seem to have much impact. Activating AppArmor can produce up to a 22% performance impact in worst-case scenarios, but in normal use the impact should not be felt. Using these flags has not affected kernel stability: my Raspberry Pi has been up and running for the last few weeks with the self-built kernels, and I have not had a single application or system crash or unexpected stop. |
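The "Impact" figure described above is a relative change against the baseline, computed on the average of the runs. A minimal Python sketch of that arithmetic follows; the function name is hypothetical and the throughput numbers are made up for illustration, they are not the measurements from this thread.

```python
from statistics import mean
from typing import Sequence


def impact_percent(baseline_runs: Sequence[float],
                   candidate_runs: Sequence[float]) -> float:
    """Percent change of the candidate average relative to the baseline average."""
    baseline = mean(baseline_runs)
    candidate = mean(candidate_runs)
    return (candidate - baseline) / baseline * 100.0


# Hypothetical throughput in MB/s, three runs each. NOT the real results.
vanilla = [100.0, 102.0, 98.0]
patched = [93.0, 92.0, 94.0]
print(f"{impact_percent(vanilla, patched):+.1f}%")
```

A negative value means the candidate kernel is slower than the baseline; a positive value means it is faster.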
@popcornmix @pelwell Not a huge impact to performance, do we want to include this? |
Yes, please? Why the heck not? |
It seems this issue can be closed since the requested changes are present in the latest kernel:
$ cat /etc/issue; uname -r; apt-cache policy raspberrypi-kernel; zgrep 'SECCOMP|_NS=' /proc/config.gz
4.9.35-v7+ |
Hi @iam-TJ Very strange, as it is not visible in the config file: https://github.com/raspberrypi/linux/blob/rpi-4.9.y/arch/arm/configs/bcm2835_defconfig Perhaps it is now automatically enabled by other flags. I have been maintaining and running my own kernel build for almost a year now, so I can't really check. |
I believe that this stuff is now included in our standard kernel. Closing. |
Hi @JamesH65 I just checked again: I now have another Raspberry Pi, on which I installed a clean Raspbian. Not all the flags from my pull request (PR) are set, but it is true that the SECCOMP ones are now enabled. Here is the output:
$ cat /etc/issue; uname -r; sudo modprobe configs; zegrep "SECCOMP|_NS=|CG_|CGROUP|APPARMOR" /proc/config.gz
Raspbian GNU/Linux 9 \n \l
4.14.34-v7+
CONFIG_CGROUPS=y
# CONFIG_MEMCG_SWAP is not set
CONFIG_BLK_CGROUP=y
# CONFIG_DEBUG_BLK_CGROUP is not set
CONFIG_CGROUP_WRITEBACK=y
CONFIG_CGROUP_SCHED=y
# CONFIG_CGROUP_PIDS is not set
# CONFIG_CGROUP_RDMA is not set
CONFIG_CGROUP_FREEZER=y
CONFIG_CGROUP_DEVICE=y
CONFIG_CGROUP_CPUACCT=y
# CONFIG_CGROUP_PERF is not set
# CONFIG_CGROUP_DEBUG is not set
CONFIG_SOCK_CGROUP_DATA=y
CONFIG_UTS_NS=y
CONFIG_IPC_NS=y
CONFIG_USER_NS=y
CONFIG_PID_NS=y
CONFIG_NET_NS=y
# CONFIG_SLUB_MEMCG_SYSFS_ON is not set
CONFIG_HAVE_ARCH_SECCOMP_FILTER=y
CONFIG_SECCOMP_FILTER=y
CONFIG_SECCOMP=y
CONFIG_NF_CONNTRACK_NETBIOS_NS=m
# CONFIG_NETFILTER_XT_MATCH_CGROUP is not set
CONFIG_NET_CLS_CGROUP=m
# CONFIG_CGROUP_NET_PRIO is not set
CONFIG_CGROUP_NET_CLASSID=y
# CONFIG_TCG_TPM is not set
Compared to my PR, we can see that the SECCOMP and NS (namespace) options are now set. However, most control group options (CGROUP or MEMCG) are still not set. For instance, when running Docker, many resource controls cannot be supported. When doing
With my PR, Docker is happy. But this is not limited to Docker: other container technologies (rkt, cri-o, Kubernetes, etc.) make use of them, and even systemd can use them (and potentially, but I could be mistaken, snap and flatpak too). I could use issue #1605 to update my PR and get it merged, or you could re-open this issue and I would update my PR. But before I put effort into this PR, will it be considered? (I'm asking because I have 4 very young kids and my free time is often very limited.) |
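For anyone repeating this check, the kernel config output above can be compared against a list of wanted options with a short script. This is a sketch in Python: the function names and the `wanted` list are assumptions for illustration, not the exact set of flags in the PR, and the sample text is a small excerpt of the output quoted above.

```python
import re
from typing import Dict, List

_SET = re.compile(r"^(CONFIG_\w+)=(.*)$")
_UNSET = re.compile(r"^# (CONFIG_\w+) is not set$")


def parse_kernel_config(text: str) -> Dict[str, str]:
    """Map each option to its value; options marked 'is not set' map to 'n'."""
    options: Dict[str, str] = {}
    for line in text.splitlines():
        line = line.strip()
        match = _SET.match(line)
        if match:
            options[match.group(1)] = match.group(2)
            continue
        match = _UNSET.match(line)
        if match:
            options[match.group(1)] = "n"
    return options


def missing_options(options: Dict[str, str], wanted: List[str]) -> List[str]:
    """Return the wanted options that are unset or absent from the config."""
    return [name for name in wanted if options.get(name, "n") == "n"]


# Excerpt of the output shown above.
sample = """CONFIG_CGROUPS=y
# CONFIG_MEMCG_SWAP is not set
# CONFIG_CGROUP_PIDS is not set
CONFIG_USER_NS=y
CONFIG_SECCOMP_FILTER=y
CONFIG_SECCOMP=y
"""
# Illustrative list only; adjust to the options you actually need.
wanted = ["CONFIG_USER_NS", "CONFIG_SECCOMP", "CONFIG_SECCOMP_FILTER",
          "CONFIG_CGROUP_PIDS"]
print(missing_options(parse_kernel_config(sample), wanted))
```

On a running system the text could come from `gzip.open("/proc/config.gz", "rt").read()`, once the `configs` module is loaded as in the command above.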
Probably best on another PR/Issue, this specific issue (NS and SECCOMP) on Firejail appears to be solved. |
Alright, and #1605 is also solved w.r.t. systemd 231. So I will create a new issue and PR. Thank you for the feedback and advice. |
Hello.
I am trying to use Firejail[0] on my Raspberry Pi running Raspbian, but it shows two warnings:
Warning: user namespaces not available in the current kernel.
Warning: seccomp disabled, it requires a Linux kernel version 3.5 or newer.
Can you enable these features in the next kernel release?
My current uname -a is:
Linux RaspberryPi 4.1.7+ #817 PREEMPT Sat Sep 19 15:25:36 BST 2015 armv6l GNU/Linux
Thanks in advance,
César
[0] https://l3net.wordpress.com/projects/firejail/
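The two warnings above can also be checked without Firejail. The following Python sketch relies on two kernel interfaces: the `Seccomp:` field of `/proc/<pid>/status` (0 = disabled, 1 = strict, 2 = filter; absent when the kernel lacks seccomp support) and the presence of `/proc/self/ns/user` when user namespaces are compiled in. The function names are hypothetical, and the check does not cover distributions that build in user namespaces but restrict them at runtime.

```python
import os
from typing import Optional

SECCOMP_MODES = {0: "disabled", 1: "strict", 2: "filter"}


def seccomp_mode(status_text: str) -> Optional[str]:
    """Return the seccomp mode from /proc/<pid>/status text.

    Returns None when the field is absent, which suggests a kernel
    without seccomp support.
    """
    for line in status_text.splitlines():
        if line.startswith("Seccomp:"):
            value = int(line.split(":", 1)[1].strip())
            return SECCOMP_MODES.get(value)
    return None


def user_namespaces_available() -> bool:
    """True if the kernel exposes a user namespace entry for this process."""
    return os.path.exists("/proc/self/ns/user")
```

A result of `"disabled"` from `seccomp_mode` means the kernel supports seccomp but the process is not sandboxed, whereas `None` corresponds to the situation reported in this issue.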