Commit fd7af8f

WIP: bootstrap: pivot into node image before bootstrapping
As per openshift/enhancements#1637, we're trying to get rid of all OpenShift-versioned components from the bootimages.

This means that there will no longer be `oc`, `kubelet`, or `crio` binaries for example, which bootstrapping obviously relies on.

Instead, now we change things up so that early on when booting the bootstrap node, we pull down the node image, unencapsulate it (this just means convert it back to an OSTree commit), then mount over its `/usr`, and import new `/etc` content.

This is done by isolating to a different systemd target to only bring up the minimum number of services to do the pivot and then carry on with bootstrapping.

This does not incur additional reboots and should be compatible with AI/ABI/SNO. But it is, of course, a huge conceptual shift in how bootstrapping works. With this, we would now always be sure that we're using the same binaries as the target version as part of bootstrapping, which should alleviate some issues such as AI late-binding (see e.g. https://issues.redhat.com/browse/MGMT-16705).

The big exception of course being the kernel. Relatedly, note we do persist `/usr/lib/modules` from the booted system so that loading kernel modules still works.

To be conservative, the new logic only kicks in when using bootimages which do not have `oc`. This will allow us to ratchet this in more easily.

Down the line, we should be able to replace some of this with `bootc apply-live` once that's available (and also works in a live environment). (See bootc-dev/bootc#76.)

For full context, see the linked enhancement and discussions there.
1 parent 5130297 commit fd7af8f

8 files changed, +149 -1 lines changed
Lines changed: 9 additions & 0 deletions
@@ -0,0 +1,9 @@
+#!/bin/bash
+set -euo pipefail
+
+UNIT_DIR="${1:-/tmp}"
+
+if ! rpm -q openshift-clients &>/dev/null; then
+    ln -sf "/etc/systemd/system/node-image-overlay.target" \
+        "${UNIT_DIR}/default.target"
+fi
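
One way to exercise this generator logic without touching a real system is to factor the decision into a function whose package check can be swapped out. The sketch below is hypothetical: `link_overlay_target` and the probe command passed to it (standing in for `rpm -q openshift-clients`) are not part of the commit.

```shell
#!/bin/bash
# Hypothetical helper, not part of the commit: it makes the same decision as
# the generator, but the "are the OpenShift clients installed?" probe is
# passed in as a command so it can be faked.
link_overlay_target() {
    local unit_dir="$1"; shift
    # "$@" is the probe; it stands in for `rpm -q openshift-clients`.
    if ! "$@" &>/dev/null; then
        ln -sf "/etc/systemd/system/node-image-overlay.target" \
            "${unit_dir}/default.target"
    fi
}
```

Calling `link_overlay_target "$(mktemp -d)" false` simulates a bootimage without the clients package and creates the `default.target` symlink; passing `true` as the probe leaves the directory untouched.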
Lines changed: 12 additions & 0 deletions
@@ -0,0 +1,12 @@
+# This is a separate unit because in the assisted-installer flow, we only want
+# `node-image-overlay.service`, not the isolating back to `multi-user.target`.
+
+[Unit]
+Description=Node Image Finish
+Requires=node-image-overlay.service
+After=node-image-overlay.service
+
+[Service]
+Type=oneshot
+# and now, back to our regularly scheduled programming...
+ExecStart=/usr/bin/systemctl --no-block isolate multi-user.target
Lines changed: 9 additions & 0 deletions
@@ -0,0 +1,9 @@
+[Unit]
+Description=Node Image Overlay
+Requires=node-image-pull.service
+After=node-image-pull.service
+
+[Service]
+Type=oneshot
+ExecStart=/usr/local/bin/node-image-overlay.sh
+RemainAfterExit=yes
Lines changed: 9 additions & 0 deletions
@@ -0,0 +1,9 @@
+[Unit]
+Description=Node Image Overlay Target
+Requires=basic.target
+
+# for easier debugging
+Requires=sshd.service getty.target systemd-user-sessions.service
+
+Requires=node-image-overlay.service
+Requires=node-image-finish.service
Lines changed: 16 additions & 0 deletions
@@ -0,0 +1,16 @@
+[Unit]
+Description=Node Image Pull
+Requires=network.target NetworkManager.service
+After=network.target
+
+[Service]
+Type=oneshot
+# we need to call ostree container (i.e. rpm-ostree), which has install_exec_t,
+# but by default, we'll run as unconfined_service_t, which is not allowed that
+# transition. Relabel the script itself.
+ExecStartPre=chcon --reference=/usr/bin/ostree /usr/local/bin/node-image-pull.sh
+ExecStart=/usr/local/bin/node-image-pull.sh
+# see related XXX in node-image-pull.sh
+TimeoutStartSec=infinity
+MountFlags=slave
+RemainAfterExit=yes
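
This unit disables its start timeout because the script it launches, `node-image-pull.sh`, retries its registry operations until they succeed. For comparison only, a bounded retry could look like the sketch below. The function name `retry_with_limit`, its argument order, and the attempt cap are assumptions for illustration; the commit itself loops indefinitely.

```shell
#!/bin/bash
# Hypothetical helper: run a command up to max_attempts times, sleeping
# `delay` seconds between failures. Returns 1 once the cap is reached.
retry_with_limit() {
    local max_attempts="$1" delay="$2"; shift 2
    local attempt=1
    until "$@"; do
        if [ "${attempt}" -ge "${max_attempts}" ]; then
            echo "Giving up after ${attempt} attempts: $*" >&2
            return 1
        fi
        echo "Attempt ${attempt} failed; retrying in ${delay}s..." >&2
        attempt=$((attempt + 1))
        sleep "${delay}"
    done
}
```

A call such as `retry_with_limit 30 10 some_command` would try for roughly five minutes before failing the unit, at which point a finite `TimeoutStartSec` might become workable again; whether that trade-off is desirable for bootstrapping is not something the commit takes a position on.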
Lines changed: 17 additions & 0 deletions
@@ -0,0 +1,17 @@
+#!/bin/bash
+set -euo pipefail
+
+ostree_repo=/var/ostree-container/repo
+if [ ! -d "${ostree_repo}" ]; then
+    ostree_repo=/ostree/repo
+fi
+
+checkout="${ostree_repo}/tmp/node-image"
+
+# keep /usr/lib/modules from the booted deployment for kernel modules
+mount -o bind,ro "/usr/lib/modules" "${checkout}/usr/lib/modules"
+mount -o rbind,ro "${checkout}/usr" /usr
+rsync -av "${checkout}/usr/etc/" /etc
+
+# reload the new policy
+semodule -R
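
The repo fallback at the top of this script can be read as a small pure function, which makes it easy to check in isolation. In the sketch below, `pick_ostree_repo`, `node_image_checkout`, and their parameters are hypothetical names; only the default paths come from the script.

```shell
#!/bin/bash
# Hypothetical helper: prefer the live-environment repo when it exists,
# otherwise fall back to the system repo. Defaults mirror the script.
pick_ostree_repo() {
    local preferred="${1:-/var/ostree-container/repo}"
    local fallback="${2:-/ostree/repo}"
    if [ -d "${preferred}" ]; then
        echo "${preferred}"
    else
        echo "${fallback}"
    fi
}

# The checkout location is derived from whichever repo was chosen.
node_image_checkout() {
    echo "$(pick_ostree_repo "$@")/tmp/node-image"
}
```

With no arguments, `node_image_checkout` prints the checkout path under `/var/ostree-container/repo` if that directory exists and under `/ostree/repo` otherwise, which is the directory the overlay script bind-mounts `/usr` from.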
Lines changed: 76 additions & 0 deletions
@@ -0,0 +1,76 @@
+#!/bin/bash
+set -euo pipefail
+
+# shellcheck source=release-image.sh.template
+. /usr/local/bin/release-image.sh
+
+# yuck... this is a good argument for renaming the node image to just `node` in both OCP and OKD
+coreos_img=rhel-coreos
+{{ if .IsOKD }}
+coreos_img=stream-coreos
+{{ end }}
+# XXX: Unset NOTIFY_SOCKET for podman to workaround an outstanding bug in
+# RHEL. When it sees the socket, it wants to keep extending the service start
+# timeout. It writes to stderr, but we use `--quiet` which leaves it null,
+# so it hits SIGSEGV. To work around not having timeout extensions; we use
+# TimeoutStartSec=infinity.
+# This is fixed upstream by https://github.com/containers/common/pull/1758.
+# Should request backport...
+while ! COREOS_IMAGE=$(unset NOTIFY_SOCKET; image_for ${coreos_img}); do
+    echo 'Failed to query release image; retrying...'
+    sleep 10
+done
+
+# try to do this in the system repo so we get hardlinks and the checkout is
+# read-only, but fallback to using /var if we're in the live environment since
+# that's truly read-only
+ostree_repo=/ostree/repo
+hardlink='-H'
+if grep -q coreos.liveiso= /proc/cmdline; then
+    ostree_repo=/var/ostree-container/repo
+    mkdir -p "${ostree_repo}"
+    ostree init --mode=bare --repo="${ostree_repo}"
+    # use the system repo as parent to still get layer deduping
+    ostree config --repo="${ostree_repo}" set core.parent /ostree/repo
+    # but we won't be able to force hardlinks cross-device
+    hardlink=''
+else
+    # (remember, we're MountFlags=slave)
+    mount -o rw,remount /sysroot
+fi
+
+# https://docs.fedoraproject.org/en-US/bootc/container-pull-secrets/
+cp /root/.docker/config.json /etc/ostree/auth.json
+
+# Use ostree stack to pull the container here. This gives us efficient
+# downloading with layers we already have, and also handles SELinux.
+while ! ostree container image pull "${ostree_repo}" \
+    ostree-unverified-image:docker://"${COREOS_IMAGE}"; do
+    echo 'Failed to fetch release image; retrying...'
+    sleep 10
+done
+
+# ideally, `ostree container image pull` would support `--write-ref` or a
+# command to escape a pullspec, but for now it's pretty easy to tell which ref
+# it is since it's the only docker one
+ref=$(ostree refs | grep ^ostree/container/image/docker)
+if [ $(echo "$ref" | wc -l) != 1 ]; then
+    echo "Expected single docker ref, found:"
+    echo "$ref"
+    exit 1
+fi
+ostree refs "$ref" --create coreos/node-image
+
+# massive hack to make ostree admin config-diff work in live ISO where /etc
+# is actually on a separate mount and not the deployment root proper... should
+# enhance libostree for this (remember, we're MountFlags=slave)
+if grep -q coreos.liveiso= /proc/cmdline; then
+    mount -o bind,ro /etc /ostree/deploy/*/deploy/*/etc
+fi
+
+# get all state files in /etc; this is a cheap way to get "3-way /etc merge" semantics
+etc_keep=$(ostree admin config-diff | cut -f5 -d' ' | sed -e 's,^,/usr/etc/,')
+
+# check out the commit
+checkout="${ostree_repo}/tmp/node-image"
+ostree checkout --repo "${ostree_repo}" ${hardlink} coreos/node-image "${checkout}" --skip-list=<(cat <<< "$etc_keep")
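
Two of the text-processing steps in this script can be tried in isolation. The sketch below assumes `ostree admin config-diff` prints one entry per line as a status letter, four spaces, and a path relative to `/etc`, which is what the `cut -f5 -d' '` in the script implies. The function names `etc_skip_list` and `require_single_ref` and the sample input are made up for illustration.

```shell
#!/bin/bash
# Turn config-diff style lines on stdin into skip-list paths under /usr/etc/.
# Same pipeline as the script: field 5 (space-delimited) is the path.
etc_skip_list() {
    cut -f5 -d' ' | sed -e 's,^,/usr/etc/,'
}

# Hypothetical variant of the single-ref check: print the ref and succeed
# only if exactly one non-empty line was passed in.
require_single_ref() {
    local refs="$1" count
    # grep -c exits non-zero when it counts zero lines, hence `|| true`.
    count=$(printf '%s\n' "${refs}" | grep -c . || true)
    if [ "${count}" -ne 1 ]; then
        echo "Expected single docker ref, found:" >&2
        printf '%s\n' "${refs}" >&2
        return 1
    fi
    printf '%s\n' "${refs}"
}
```

For example, `printf 'M    fstab\nA    hostname\n' | etc_skip_list` prints `/usr/etc/fstab` and `/usr/etc/hostname`, which are the entries the checkout then skips so that locally modified files win over the image's defaults. Because the path is taken as a single space-delimited field, a file name containing spaces would be truncated by this pipeline.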

pkg/asset/ignition/bootstrap/common.go

Lines changed: 1 addition & 1 deletion
@@ -404,7 +404,7 @@ func AddStorageFiles(config *igntypes.Config, base string, uri string, templateD
 
 	var mode int
 	appendToFile := false
-	if parentDir == "bin" || parentDir == "dispatcher.d" {
+	if parentDir == "bin" || parentDir == "dispatcher.d" || parentDir == "system-generators" {
 		mode = 0555
 	} else if filename == "motd" || filename == "containers.conf" {
 		mode = 0644
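
The effect of this one-line change is that files whose parent directory is `system-generators` are written with an executable mode; systemd only runs generators that are executable, and the generator script added in this commit depends on that. As a rough shell paraphrase of the two branches visible in the hunk (the function name `storage_file_mode` is hypothetical, and the branches outside the hunk are not reproduced, so they are reported as `unknown`):

```shell
#!/bin/bash
# Hypothetical paraphrase of the visible Go branches, for illustration only.
storage_file_mode() {
    local parent_dir="$1" filename="$2"
    case "${parent_dir}" in
        bin|dispatcher.d|system-generators)
            echo 0555
            return
            ;;
    esac
    case "${filename}" in
        motd|containers.conf) echo 0644 ;;
        *) echo unknown ;;  # remaining branches are not shown in the hunk
    esac
}
```

Here `storage_file_mode system-generators some-generator` prints `0555`, whereas before the change that directory name would have fallen through to the later branches.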
