-
Notifications
You must be signed in to change notification settings - Fork 633
[k8s_cloud] Add sshjump host support. #2201
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Conversation
- it also applies to kubernetes where we use ssh jump host to access internal ray head pod
9063150
to
e66cda1
Compare
sky/backends/backend_utils.py
Outdated
@@ -958,7 +957,6 @@ def write_cluster_config( | |||
|
|||
# Kubernetes only: | |||
'skypilot_k8s_image': k8s_image, | |||
'ssh_key_secret_name': ssh_key_secret_name, |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
ssh_key_secret_name
doesn't seem to be used.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Ah thanks for catching - this is removed in the current latest k8s_cloud branch.
@@ -17,7 +17,7 @@ provider: | |||
module: sky.skylet.providers.kubernetes.KubernetesNodeProvider | |||
|
|||
# Use False if running from outside of k8s cluster | |||
use_internal_ips: false | |||
use_internal_ips: true |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
It causes ssh itself (not its proxy command) to target internal pod ipaddress.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I am not sure how to use ssh_proxy_command_config
handled here. Our ssh_proxy seems to be dynamic; port is dynamic NodePort.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Can we assume port assigned to nodeport doesn't change (since the service is created only once)?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yes. The port is the same as long as the sshjump service is running.
e66cda1
to
832818c
Compare
832818c
to
73e35d6
Compare
I confirm that launching the following task against kind, works well. SSH interactions with ray head during provision phase, is done through sshjump host as well as user can access ray head with ssh where head is being reached via sshjump pod.
create a local kind k8s cluster:
ensure cluster is up:
launch the task
|
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thanks @aviweit! This is great. I skimmed through the code and left some comments. Haven't tried running yet. Will take another pass soon.
Can you also merge the latest changes from k8s_cloud
branch? It has a few fixes, and some changes in how ssh auth works (especially after #2119), so would be good to verify this still works.
Dockerfile_k8s_sshjump
Outdated
FROM continuumio/miniconda3:22.11.1 | ||
|
||
RUN apt update -y && \ | ||
apt install sudo vim openssh-server -y && \ | ||
rm -rf /var/lib/apt/lists/* | ||
|
||
# Setup SSH and generate hostkeys | ||
RUN mkdir -p /var/run/sshd && \ | ||
sed -i 's/PermitRootLogin prohibit-password/PermitRootLogin yes/' /etc/ssh/sshd_config && \ | ||
sed 's@session\s*required\s*pam_loginuid.so@session optional pam_loginuid.so@g' -i /etc/pam.d/sshd && \ | ||
cd /etc/ssh/ && \ | ||
ssh-keygen -A | ||
|
||
RUN useradd -m -s /bin/bash sky && \ | ||
echo "sky ALL=(ALL) NOPASSWD:ALL" >> /etc/sudoers && \ | ||
echo 'Defaults secure_path="/usr/local/sbin:/usr/local/bin:/usr/sbin:/usr/bin:/sbin:/bin"' > /etc/sudoers.d/sky | ||
|
||
# Switch to sky user | ||
USER sky | ||
|
||
# Add /home/sky/.local/bin/ to PATH | ||
RUN echo 'export PATH="$PATH:$HOME/.local/bin"' >> ~/.bashrc | ||
|
||
WORKDIR /home/sky |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Curious - what do you think about reusing the image generated from Dockerfile_k8s (us-central1-docker.pkg.dev/skypilot-375900/skypilotk8s/skypilot:latest
) instead of creating a new one?
Pros: Reduces image maintenance burden for us, may be faster to pull since cluster would likely already have the main skypilot:latest
image for running tasks
Cons: Dockerfile_k8s is quite bloated because it installs SkyPilot and its dependencies.
Might be worth looking at image sizes for both before taking a call.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Hi @romilbhardwaj ,
I built both images using build_image.sh and noticed the below size differences. I agree with your Pros and Cons.
$docker images
REPOSITORY TAG IMAGE ID CREATED SIZE
...
sshjump latest 9c4a7ca3d121 4 hours ago 479MB
skypilot latest 448c7c3b7553 4 hours ago 1.32GB
...
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thanks @aviweit. Since this image will be pulled only once, lets remove this and reuse Dockerfile_k8s (us-central1-docker.pkg.dev/skypilot-375900/skypilotk8s/skypilot:latest
).
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@romilbhardwaj , done.
sky/authentication.py
Outdated
@@ -10,6 +10,8 @@ | |||
from typing import Any, Dict, Tuple | |||
import uuid | |||
|
|||
from urllib.parse import urlparse |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
nit: since urllib is standard library, perhaps should be added at L11 before import uuid
.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Hi @romilbhardwaj , I removed import urlparse
, as it is now being used by clouds.Kubernetes.get_external_ip()
at setup_kubernetes_authentication
.
sky/authentication.py
Outdated
sshjump_name = clouds.Kubernetes.SKY_SSH_JUMP_NAME | ||
sshjump_image = clouds.Kubernetes.SSH_JUMP_IMAGE | ||
|
||
def _to_sshjump_pod_spec(name, image, secret): |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Can we define this pod spec as a jinja template (e.g., templates/k8s_ssh_jump.j2
) that is filled in with name
, image
and secret
through a function call? Same for the service spec below. Might be able to put them in the same file too.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yes, I pushed the changes.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Can we assume port assigned to nodeport doesn't change (since the service is created only once)?
sky/backends/backend_utils.py
Outdated
@@ -958,7 +957,6 @@ def write_cluster_config( | |||
|
|||
# Kubernetes only: | |||
'skypilot_k8s_image': k8s_image, | |||
'ssh_key_secret_name': ssh_key_secret_name, |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Ah thanks for catching - this is removed in the current latest k8s_cloud branch.
k8s_cloud-sshjumphost2
- it seems not to be used anymore
- template file to store sshjump pod and service specs - update setup_kubernetes_authentication to use it
Hi @romilbhardwaj , I have merged the PR with It worked well with skypilot installed on Ubuntu 20.04 host - against an external kubernetes cluster (deployed via kubeadm), and a one deployed locally (kind). |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thanks @aviweit! We probably also need to handle the lifecycle of the jump pod - left some comments. Also, let's use us-central1-docker.pkg.dev/skypilot-375900/skypilotk8s/skypilot:latest
image for the ssh jump pod.
Dockerfile_k8s_sshjump
Outdated
FROM continuumio/miniconda3:22.11.1 | ||
|
||
RUN apt update -y && \ | ||
apt install sudo vim openssh-server -y && \ | ||
rm -rf /var/lib/apt/lists/* | ||
|
||
# Setup SSH and generate hostkeys | ||
RUN mkdir -p /var/run/sshd && \ | ||
sed -i 's/PermitRootLogin prohibit-password/PermitRootLogin yes/' /etc/ssh/sshd_config && \ | ||
sed 's@session\s*required\s*pam_loginuid.so@session optional pam_loginuid.so@g' -i /etc/pam.d/sshd && \ | ||
cd /etc/ssh/ && \ | ||
ssh-keygen -A | ||
|
||
RUN useradd -m -s /bin/bash sky && \ | ||
echo "sky ALL=(ALL) NOPASSWD:ALL" >> /etc/sudoers && \ | ||
echo 'Defaults secure_path="/usr/local/sbin:/usr/local/bin:/usr/sbin:/usr/bin:/sbin:/bin"' > /etc/sudoers.d/sky | ||
|
||
# Switch to sky user | ||
USER sky | ||
|
||
# Add /home/sky/.local/bin/ to PATH | ||
RUN echo 'export PATH="$PATH:$HOME/.local/bin"' >> ~/.bashrc | ||
|
||
WORKDIR /home/sky |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thanks @aviweit. Since this image will be pulled only once, lets remove this and reuse Dockerfile_k8s (us-central1-docker.pkg.dev/skypilot-375900/skypilotk8s/skypilot:latest
).
sky/authentication.py
Outdated
content = json.loads(_content) | ||
|
||
try: | ||
kubernetes.core_api().create_namespaced_pod('default', content['pod_spec']) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
We need to think a bit about the lifecycle of this jump pod.
- Creation - Creation of this jump ssh pod happens when the user creates their first SkyPilot cluster in the Kubernetes cluster.
- Run - The jump SSH pod is kept alive as long as there is a SkyPilot cluster running (i.e., there exists a Kubernetes pod with label
parent: skypilot
in the same namespace) - Termination - If there hasn't been a pod with label
parent: skypilot
in the same namespace since the past n minutes (say, 10 minutes), this jump pod should terminate itself.
We can do this by running a script like this in the jump ssh pod as the command
for the pod:
import os
import datetime
import pytz
import time
from kubernetes import client, config
# Load kube config
config.load_incluster_config()
v1 = client.CoreV1Api()
# Get the current namespace from the pod service account
with open("/var/run/secrets/kubernetes.io/serviceaccount/namespace", "r") as f:
current_namespace = f.read()
# Set the time delta for checking last active pods
time_delta = datetime.timedelta(minutes=10)
# Set delay for each retry
retry_delay = 60 # In seconds
while True:
# Get the current time
now = datetime.datetime.now(pytz.UTC)
found = False
# List the pods in the current namespace
ret = v1.list_namespaced_pod(current_namespace)
for i in ret.items:
if i.metadata.labels and 'parent' in i.metadata.labels and i.metadata.labels['parent'] == 'skypilot':
# Calculate the elapsed time since the pod was last active
elapsed_time = now - i.metadata.creation_timestamp
# If the pod was active in the last 10 minutes, set found to True
if elapsed_time < time_delta:
found = True
break
# If no active pods were found with the specified label, exit the script
if not found:
print("No active pods found with label 'parent: skypilot' in the past 10 minutes. Exiting...")
exit(1)
# If pods were found, sleep for the specified delay and then retry
print(f"Active pods found with label 'parent: skypilot'. Retrying in {retry_delay} seconds...")
time.sleep(retry_delay)
What do you think?
sky/templates/kubernetes-sshjump.j2
Outdated
@@ -0,0 +1,74 @@ | |||
{ |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Can we have this file as a YAML instead of a json? Trying to have something that can also be easily used with kubectl apply -f
for debugging/testing (of course, after replacing template variables).
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@romilbhardwaj , yes. I changed. Thanks.
- use Dockerfile_k8s for sshjump host as well
Thanks a lot for the comments. I updated the code to reuse Dockerfile_k8s and updated the template format to yaml. I re-tested the recent changes in our environment and it seems to work well. Your suggestion on sshjump lifecycle seems to be fine for me. There seems to be additional work such as defining RBAC for the pod to list other pods. I tested it and got that failure. I would like to ask whether it would be ok to merge this PR and to open another PR to handle sshjump lifecycle? Other PRs can be probably opened to handle additional enhancements as needed. Thanks again. |
@aviweit - this is great! Tried it out - works nicely.
Is there some way to upload kubeconfig credentials so that jump pod can use those to authenticate with the control plane and list pods? Asking because I think lifecycle management of the jump host is important to get right before we ship this. We do not want to leave an orphan ssh jump pod running on users' clusters after they are done using SkyPilot. If we can figure out a quick solution to the above problem, it'll be great to have in this PR. If you think it will be an involved process, then it'll be good to have a draft solution verifying this kind of lifecycle management is possible. Once we can verify it can be done, then we can merge this PR and create another one to address it. What do you think? Thanks again for adding this! |
@aviweit - I've made some updates to |
- remove skypilot label from sshjump related objects so that they are not currently being queried by sshjump lcm script
… k8s_cloud-sshjumphost2
Hi @romilbhardwaj , I merged Regarding reusing I would like to ask whether this PR can be merged and if needed, additional PRs will be opened? This will save us time on keeping this PR up to date in respect to upstream changes. @romilbhardwaj , I think that you will need to re-build and push Dockerfile_k8s because of adding Thanks again. |
- add comments in sshjump lcm script
- update from greater to greater-equal
Hi @romilbhardwaj , I have updated the code to create sshjump related resources in the correct namespace (and a small update to Thanks a lot. |
@romilbhardwaj , I added sshjump lcm to monitor only ray pods it is responsible for. If it is ok, it can be merged to this PR. Thanks. |
We should probably re-open this PR by changing the target branch. The original k8s_cloud branch has been deleted as it was merged to the master. |
Hey @aviweit ! This PR got closed when we merged #2096 into master. For our next sprint, we'll be switching to k8s_cloud_beta1 branch. Can you re-open this PR on that branch? |
Add an initial support for ssh jump host.
By this, only single port (nodeport) is opened on a target k8s cluster where ssh connections established from skypilot user targeted to ray cluster - are tunneled through ssh jump host pod.