Commit 47969a1

Updated e2e tests to support S3 compatible storage bucket from which to download MNIST datasets for disconnected automation
1 parent e7a45ba commit 47969a1

13 files changed, +503 -20 lines changed

.pre-commit-config.yaml

+1
@@ -7,6 +7,7 @@ repos:
       - id: trailing-whitespace
       - id: end-of-file-fixer
       - id: check-yaml
+        args: [--allow-multiple-documents]
       - id: check-added-large-files
   - repo: https://github.com/psf/black
     rev: 23.3.0

docs/e2e.md

+22 -5
@@ -108,8 +108,25 @@ Currently the SDK doesn't support tolerations, so e2e tests can't be executed on
   ```
   poetry run pytest -v -s ./tests/e2e -m openshift --timeout=1200
   ```
-- If the cluster doesn't have NVidia GPU support or GPU nodes have taint then we need to disable NVidia GPU tests by providing proper marker:
-  ```
-  poetry install --with test,docs
-  poetry run pytest -v -s ./tests/e2e/mnist_raycluster_sdk_kind_test.py -m 'not nvidia_gpu'
-  ```
+
+## On OpenShift Disconnected clusters
+
+- In addition to the setup phase described above for OpenShift clusters, a disconnected environment requires the following prerequisites:
+  - Mirror image registry:
+    - A mirror image registry hosts the required container images locally, so applications and services can pull them without an external network connection. This keeps operation and deployment working in a network-isolated environment.
+  - PyPI mirror index:
+    - When installing Python packages in a disconnected environment, the pip command might fail because it cannot reach external URLs. This can be resolved by setting up a PyPI mirror index on a separate endpoint in the same environment.
+  - S3-compatible storage:
+    - Some of our distributed training examples require an external storage solution so that all nodes can access the same data in a disconnected environment (for example, common datasets and model files).
+    - A MinIO S3-compatible storage instance can be deployed in a disconnected environment using `/tests/e2e/minio_deployment.yaml` or the support methods in the e2e test suite.
+- The following environment variables configure the pip index URL for the required Python packages and the S3 or MinIO storage for your Ray Train script or interactive session:
+  ```
+  export RAY_IMAGE=quay.io/project-codeflare/ray@sha256:<image-digest>  # prefer an image digest over an image tag in a disconnected environment
+  export PIP_INDEX_URL=https://<bastion-node-endpoint-url>/root/pypi/+simple/
+  export PIP_TRUSTED_HOST=<bastion-node-endpoint-url>
+  export AWS_DEFAULT_ENDPOINT=<s3-compatible-storage-endpoint-url>
+  export AWS_ACCESS_KEY_ID=<s3-compatible-storage-access-key>
+  export AWS_SECRET_ACCESS_KEY=<s3-compatible-storage-secret-key>
+  export AWS_STORAGE_BUCKET=<storage-bucket-name>
+  export AWS_STORAGE_BUCKET_MNIST_DIR=<storage-bucket-MNIST-datasets-directory>
+  ```
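
As a rough illustration of how a test or training script might consume the variables listed above, the sketch below reads them into a small settings object. `S3Settings` and `load_s3_settings` are hypothetical names, not part of the SDK or of this commit. Stripping the URL scheme and treating a scheme-less endpoint as TLS are assumptions about how an endpoint could be normalized for an S3 client.

```python
import os
from dataclasses import dataclass
from typing import Mapping, Optional


@dataclass(frozen=True)
class S3Settings:
    """Hypothetical container for the storage variables from docs/e2e.md."""

    endpoint: str  # host[:port], scheme removed
    secure: bool  # True unless the endpoint URL starts with http://
    access_key: str
    secret_key: str
    bucket: str
    mnist_dir: str


def load_s3_settings(env: Optional[Mapping[str, str]] = None) -> Optional[S3Settings]:
    """Return the storage settings, or None if any required variable is unset."""
    env = os.environ if env is None else env
    required = (
        "AWS_DEFAULT_ENDPOINT",
        "AWS_ACCESS_KEY_ID",
        "AWS_SECRET_ACCESS_KEY",
        "AWS_STORAGE_BUCKET",
    )
    if any(not env.get(name) for name in required):
        return None

    raw = env["AWS_DEFAULT_ENDPOINT"]
    # Assumption: an endpoint without a scheme is treated as TLS.
    secure = not raw.startswith("http://")
    endpoint = raw.split("://", 1)[-1].rstrip("/")

    return S3Settings(
        endpoint=endpoint,
        secure=secure,
        access_key=env["AWS_ACCESS_KEY_ID"],
        secret_key=env["AWS_SECRET_ACCESS_KEY"],
        bucket=env["AWS_STORAGE_BUCKET"],
        # The MNIST directory is treated as optional in this sketch.
        mnist_dir=env.get("AWS_STORAGE_BUCKET_MNIST_DIR", ""),
    )
```

Returning `None` when a variable is missing lets a caller fall back to a connected-environment download path instead of failing outright; whether the real tests behave that way is not shown in this diff.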

poetry.lock

+117 -1
Some generated files are not rendered by default. Learn more about customizing how changed files appear on GitHub.

pyproject.toml

+1
@@ -44,6 +44,7 @@ pytest = "7.4.0"
 coverage = "7.2.7"
 pytest-mock = "3.11.1"
 pytest-timeout = "2.2.0"
+minio = "^7.2.7"
 
 [tool.pytest.ini_options]
 filterwarnings = [
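
The `minio` dependency added here is presumably what lets the e2e tests fetch the MNIST files from the storage bucket. A minimal sketch of such a download step follows, assuming a client object that exposes the `list_objects` and `fget_object` methods of the `minio` Python client. The `download_prefix` helper and the flat local layout are hypothetical; the commit's own helper may be structured differently.

```python
import os
from typing import Any, List


def download_prefix(client: Any, bucket: str, prefix: str, dest_dir: str) -> List[str]:
    """Download every object under `prefix` in `bucket` into `dest_dir`.

    `client` is expected to behave like `minio.Minio`: `list_objects` yields
    items with an `object_name` attribute, and `fget_object` writes one
    object to a local file path.
    """
    os.makedirs(dest_dir, exist_ok=True)
    downloaded: List[str] = []
    for obj in client.list_objects(bucket, prefix=prefix, recursive=True):
        name = obj.object_name
        if name.endswith("/"):
            # Skip directory placeholder entries.
            continue
        # Assumption: files are flattened into dest_dir by base name.
        local_path = os.path.join(dest_dir, os.path.basename(name))
        client.fget_object(bucket, name, local_path)
        downloaded.append(local_path)
    return downloaded
```

With the real library, `client` would be built from the endpoint and credentials in the environment variables described in docs/e2e.md, for example `Minio(endpoint, access_key=..., secret_key=..., secure=True)`.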

tests/e2e/local_interactive_sdk_oauth_test.py

+3
@@ -28,6 +28,8 @@ def test_local_interactives(self):
         self.run_local_interactives()
 
     def run_local_interactives(self):
+        ray_image = get_ray_image()
+
         auth = TokenAuthentication(
             token=run_oc_command(["whoami", "--show-token=true"]),
             server=run_oc_command(["whoami", "--show-server=true"]),
@@ -46,6 +48,7 @@ def run_local_interactives(self):
                 worker_cpu_limits=1,
                 worker_memory_requests=1,
                 worker_memory_limits=4,
+                image=ray_image,
                 verify_tls=False,
             )
         )
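
The body of `get_ray_image()` is not part of this hunk. One plausible shape, assuming it simply reads the `RAY_IMAGE` variable documented in docs/e2e.md and falls back to a default, is sketched below. The default value is a placeholder, not the project's real default image.

```python
import os

# Placeholder only: the default image used by the test suite is not shown in this diff.
DEFAULT_RAY_IMAGE = "quay.io/project-codeflare/ray:<default-tag>"


def get_ray_image() -> str:
    """Return the Ray image from RAY_IMAGE, or the placeholder default if unset."""
    return os.getenv("RAY_IMAGE", DEFAULT_RAY_IMAGE)
```

Reading the image from the environment lets a disconnected run point at a digest in the mirror registry without editing the test code.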

tests/e2e/minio_deployment.yaml

+163
@@ -0,0 +1,163 @@
+---
+kind: PersistentVolumeClaim
+apiVersion: v1
+metadata:
+  name: minio-pvc
+spec:
+  accessModes:
+    - ReadWriteOnce
+  resources:
+    requests:
+      storage: 20Gi
+  volumeMode: Filesystem
+---
+kind: Secret
+apiVersion: v1
+metadata:
+  name: minio-secret
+stringData:
+  # change the username and password to your own values.
+  # ensure that the user is at least 3 characters long and the password at least 8
+  minio_root_user: minio
+  minio_root_password: minio123
+---
+kind: Deployment
+apiVersion: apps/v1
+metadata:
+  name: minio
+spec:
+  replicas: 1
+  selector:
+    matchLabels:
+      app: minio
+  template:
+    metadata:
+      creationTimestamp: null
+      labels:
+        app: minio
+    spec:
+      volumes:
+        - name: data
+          persistentVolumeClaim:
+            claimName: minio-pvc
+      containers:
+        - resources:
+            limits:
+              cpu: 250m
+              memory: 1Gi
+            requests:
+              cpu: 20m
+              memory: 100Mi
+          readinessProbe:
+            tcpSocket:
+              port: 9000
+            initialDelaySeconds: 5
+            timeoutSeconds: 1
+            periodSeconds: 5
+            successThreshold: 1
+            failureThreshold: 3
+          terminationMessagePath: /dev/termination-log
+          name: minio
+          livenessProbe:
+            tcpSocket:
+              port: 9000
+            initialDelaySeconds: 30
+            timeoutSeconds: 1
+            periodSeconds: 5
+            successThreshold: 1
+            failureThreshold: 3
+          env:
+            - name: MINIO_ROOT_USER
+              valueFrom:
+                secretKeyRef:
+                  name: minio-secret
+                  key: minio_root_user
+            - name: MINIO_ROOT_PASSWORD
+              valueFrom:
+                secretKeyRef:
+                  name: minio-secret
+                  key: minio_root_password
+          ports:
+            - containerPort: 9000
+              protocol: TCP
+            - containerPort: 9090
+              protocol: TCP
+          imagePullPolicy: IfNotPresent
+          volumeMounts:
+            - name: data
+              mountPath: /data
+              subPath: minio
+          terminationMessagePolicy: File
+          image: >-
+            quay.io/minio/minio:RELEASE.2024-06-22T05-26-45Z
+          # In case of disconnected environment, use image digest instead of tag
+          # For example : <mirror_registry_endpoint>/minio/minio@sha256:6b3abf2f59286b985bfde2b23e37230b466081eda5dccbf971524d54c8e406b5
+          args:
+            - server
+            - /data
+            - --console-address
+            - :9090
+      restartPolicy: Always
+      terminationGracePeriodSeconds: 30
+      dnsPolicy: ClusterFirst
+      securityContext: {}
+      schedulerName: default-scheduler
+  strategy:
+    type: Recreate
+  revisionHistoryLimit: 10
+  progressDeadlineSeconds: 600
+---
+kind: Service
+apiVersion: v1
+metadata:
+  name: minio-service
+spec:
+  ipFamilies:
+    - IPv4
+  ports:
+    - name: api
+      protocol: TCP
+      port: 9000
+      targetPort: 9000
+    - name: ui
+      protocol: TCP
+      port: 9090
+      targetPort: 9090
+  internalTrafficPolicy: Cluster
+  type: ClusterIP
+  ipFamilyPolicy: SingleStack
+  sessionAffinity: None
+  selector:
+    app: minio
+---
+kind: Route
+apiVersion: route.openshift.io/v1
+metadata:
+  name: minio-api
+spec:
+  to:
+    kind: Service
+    name: minio-service
+    weight: 100
+  port:
+    targetPort: api
+  wildcardPolicy: None
+  tls:
+    termination: edge
+    insecureEdgeTerminationPolicy: Redirect
+---
+kind: Route
+apiVersion: route.openshift.io/v1
+metadata:
+  name: minio-ui
+spec:
+  to:
+    kind: Service
+    name: minio-service
+    weight: 100
+  port:
+    targetPort: ui
+  wildcardPolicy: None
+  tls:
+    termination: edge
+    insecureEdgeTerminationPolicy: Redirect
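
The comments in the Secret of this manifest give minimum lengths for the root credentials, and the Service exposes the API on port 9000. A small pre-flight check along those lines could look like the sketch below. Both helpers are hypothetical, the length limits are taken from the manifest comment rather than from MinIO's documentation, and the endpoint helper assumes the cluster's default DNS naming for Services.

```python
from typing import List


def validate_minio_credentials(user: str, password: str) -> List[str]:
    """Return a list of problems; an empty list means the values look acceptable."""
    problems: List[str] = []
    if len(user) < 3:
        problems.append("minio_root_user must be at least 3 characters long")
    if len(password) < 8:
        problems.append("minio_root_password must be at least 8 characters long")
    return problems


def in_cluster_api_endpoint(namespace: str) -> str:
    """Build host:port for the `minio-service` API port inside the cluster.

    Assumes the standard `<service>.<namespace>.svc` DNS name.
    """
    return f"minio-service.{namespace}.svc:9000"
```

For example, the sample values in the manifest (`minio` / `minio123`) pass the length check, though they should be replaced with your own values as the manifest comment advises.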
