-Docker doesn’t provide long term storage or image distribution capabilities, so developers need something more. [Docker Registry](https://docs.docker.com/registry/) performs these tasks, and using it guarantees the same application runtime environment through virtualization. However, building an image can involve a significant time investment, which is where [Quay](https://www.redhat.com/en/resources/quay-datasheet) (pronounced *kway*) comes in. A registry like Quay can both build and store containers. You can then deploy these containers in a shorter time and with less effort than using Docker Registry. This guide explains how Quay can be an essential part of the development process and details how to deploy a Quay registry.
+Docker doesn’t provide long term storage or image distribution capabilities, so developers need something more. [Docker Registry](https://docs.docker.com/registry/) performs these tasks, and using it guarantees the same application runtime environment through virtualization. However, building an image can involve a significant time investment, which is where [Quay](https://www.redhat.com/en/resources/quay-datasheet) comes in. A registry like Quay can both build and store containers. You can then deploy these containers in a shorter time and with less effort than using Docker Registry. This guide explains how Quay can be an essential part of the development process and details how to deploy a Quay registry.
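For a rough feel of the build-then-store workflow the paragraph describes, pushing a locally built image to a Quay repository typically looks like the commands below. The organization and repository names are placeholders, not values from this guide:

```command
docker build -t example-app:latest .
docker tag example-app:latest quay.io/example-org/example-app:latest
docker login quay.io
docker push quay.io/example-org/example-app:latest
```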
@@ -37,15 +37,15 @@ Follow this tutorial to deploy a RAG pipeline on Akamai’s LKE service using ou
 - **Kubeflow Pipeline:** Used to deploy pipelines, reusable machine learning workflows built using the Kubeflow Pipelines SDK. In this tutorial, a pipeline is used to run LlamaIndex to process the dataset and store embeddings.
 - **Meta’s Llama 3 LLM:** The [meta-llama/Meta-Llama-3-8B](https://huggingface.co/meta-llama/Meta-Llama-3-8B) model is used as the LLM. You should review and agree to the licensing agreement before deploying.
 - **Milvus:** Milvus is an open-source vector database and is used for generative AI workloads. This tutorial uses Milvus to store embeddings generated by LlamaIndex and make them available to queries sent to the Llama 3 LLM.
-- **Open WebUI:** This is an self-hosted AI chatbot application that’s compatible with LLMs like Llama 3 and includes a built-in inference engine for RAG solutions. Users interact with this interface to query the LLM. This can be configured to send queries straight to Llama 3 or to first load data from Milvus and send that context along with the query.
+- **Open WebUI:** This is a self-hosted AI chatbot application that’s compatible with LLMs like Llama 3 and includes a built-in inference engine for RAG solutions. Users interact with this interface to query the LLM. This can be configured to send queries straight to Llama 3 or to first load data from Milvus and send that context along with the query.
 
 ## Prerequisites
 
 This tutorial requires you to have access to a few different services and local software tools. You should also have a custom dataset available to use for the pipeline.
 
 - A [Cloud Manager](https://cloud.linode.com/) account is required to use many of Akamai’s cloud computing services, including LKE.
 - A [Hugging Face](https://huggingface.co/) account is used for deploying the Llama 3 LLM to KServe.
-- You should have both [kubectl](https://kubernetes.io/docs/reference/kubectl/) and [Helm](https://helm.sh/) installed on your local machine. These apps are used for managing your LKE cluster and installing applications to your cluster.
+- You should have [kubectl](https://kubernetes.io/docs/reference/kubectl/), [Kustomize](https://kustomize.io/), and [Helm](https://helm.sh/) installed on your local machine. These apps are used for managing your LKE cluster and installing applications to your cluster.
 - A **custom dataset** is needed, preferably in Markdown format, though you can use other types of data if you modify the LlamaIndex configuration provided in this tutorial. This dataset should contain all of the information you want used by the Llama 3 LLM. This tutorial uses a Markdown dataset containing all of the Linode Docs.
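Before starting, it is worth confirming that the prerequisite tooling listed above is installed and on your PATH. The exact version numbers in the output will vary:

```command
kubectl version --client
kustomize version
helm version
```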
@@ -61,7 +61,7 @@ It’s not part of the scope of this document to cover the setup required to sec
 
 The first step is to provision the infrastructure needed for this tutorial and configure it with kubectl, so that you can manage it locally and install software through helm. As part of this process, we’ll also need to install the NVIDIA GPU operator at this step so that the NVIDIA cards within the GPU worker nodes can be used on Kubernetes.
 
-1. **Provision an LKE cluster.** We recommend using at least 3 **RTX4000 Ada x1 Medium** GPU plans (plan ID: `g2-gpu-rtx4000a1-m`), though you can adjust this as needed. For reference, Kubeflow recommends 32 GB of RAM and 16 CPU cores for just their own application. This tutorial has been tested using Kubernetes v1.31, though other versions should also work. To learn more about provisioning a cluster, see the [Create a cluster](https://techdocs.akamai.com/cloud-computing/docs/create-a-cluster) guide.
+1. **Provision an LKE cluster.** We recommend using at least 2 **RTX4000 Ada x1 Medium** GPU plans (plan ID: `g2-gpu-rtx4000a1-m`), though you can adjust this as needed. For reference, Kubeflow recommends 32 GB of RAM and 16 CPU cores for just their own application. This tutorial has been tested using Kubernetes v1.31, though other versions should also work. To learn more about provisioning a cluster, see the [Create a cluster](https://techdocs.akamai.com/cloud-computing/docs/create-a-cluster) guide.
 
 {{< note noTitle=true >}}
 GPU plans are available in a limited number of data centers. Review the [GPU product documentation](https://techdocs.akamai.com/cloud-computing/docs/gpu-compute-instances#availability) to learn more about availability.
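Once the cluster’s kubeconfig is configured locally, the NVIDIA GPU operator mentioned above is typically installed through its Helm chart. The commands below are a representative invocation rather than the guide’s exact steps; the release name and namespace are examples, and you should match the operator version to your cluster:

```command
helm repo add nvidia https://helm.ngc.nvidia.com/nvidia
helm repo update
helm install gpu-operator nvidia/gpu-operator --namespace gpu-operator --create-namespace
```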
@@ -97,10 +97,10 @@ Next, let’s deploy Kubeflow on the LKE cluster. These instructions deploy all
 openssl rand -base64 18
 ```
 
-1. Create a hash of this password, replacing PASSWORD with the password generated in the previous step. This outputs a string starting with `$2y$12$`, which is password hash.
+1. Create a hash of this password, replacing PASSWORD with the password generated in the previous step. This outputs the password hash, which starts with `$2y$12$`.
 
 ```command
-htpasswd -bnBC 12 "" <PASSWORD> | tr -d ':\n'
+htpasswd -bnBC 12 "" PASSWORD | tr -d ':\n'
 ```
 
 1. Edit the `common/dex/base/dex-passwords.yaml` file, replacing the value for `DEX_USER_PASSWORD` with the password hash generated in the previous step.
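The exact contents of `common/dex/base/dex-passwords.yaml` depend on the kubeflow/manifests release you checked out, but it is a small Kubernetes Secret roughly along these lines. The hash value shown is only a placeholder for the output of the `htpasswd` command above:

```file {title="common/dex/base/dex-passwords.yaml (approximate shape)"}
apiVersion: v1
kind: Secret
metadata:
  name: dex-passwords
type: Opaque
stringData:
  # Replace with the bcrypt hash generated in the previous step.
  DEX_USER_PASSWORD: $2y$12$REPLACE_WITH_YOUR_GENERATED_HASH
```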
@@ -111,7 +111,7 @@ Next, let’s deploy Kubeflow on the LKE cluster. These instructions deploy all
 while ! kustomize build example | kubectl apply -f -; do echo "Retrying to apply resources"; sleep 20; done
 ```
 
-1. This may take some time to finish. Once it’s complete, verify that all pods are in the ready state.
+1. This may take some time to finish. Once it’s complete, verify that all pods are in the running state.
 
 ```command
 kubectl get pods -A
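Rather than repeatedly polling the command above, you can also let kubectl block until every pod reports Ready. The timeout value here is arbitrary; some Kubeflow components take several minutes on first start:

```command
kubectl wait --for=condition=Ready pods --all --all-namespaces --timeout=900s
```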
@@ -152,6 +152,7 @@ After Kubeflow has been installed, we can now deploy the Llama 3 LLM to KServe.
   name: huggingface-llama3
 spec:
   predictor:
+    minReplicas: 1
     model:
       modelFormat:
         name: huggingface
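For orientation, the fragment above sits inside a KServe `InferenceService` manifest. A minimal sketch of the whole object is shown below; the arguments and resource values are illustrative assumptions rather than the guide’s actual manifest, so defer to the guide’s own file when deploying:

```file {title="llama3-inferenceservice.yaml (illustrative sketch)"}
apiVersion: serving.kserve.io/v1beta1
kind: InferenceService
metadata:
  name: huggingface-llama3
spec:
  predictor:
    minReplicas: 1
    model:
      modelFormat:
        name: huggingface
      # Placeholder arguments; the guide's manifest defines the actual model
      # settings (model ID, dtype, context length, etc.).
      args:
        - --model_name=llama3
        - --model_id=meta-llama/Meta-Llama-3-8B
      resources:
        requests:
          nvidia.com/gpu: "1"
        limits:
          nvidia.com/gpu: "1"
```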
@@ -202,6 +203,11 @@ Milvus, the vector database designed for AI inference workloads, will be used as
       nvidia.com/gpu: "1"
     limits:
       nvidia.com/gpu: "1"
+  persistentVolumeClaim:
+    size: 5Gi
+minio:
+  persistence:
+    size: 50Gi
 ```
 
 1. Add Milvus to Helm.
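The chart itself comes from the Milvus Helm repository. A typical sequence for this step looks like the following; the release name, namespace, and values filename are examples rather than values prescribed by the guide:

```command
helm repo add milvus https://zilliztech.github.io/milvus-helm/
helm repo update
helm install milvus milvus/milvus --namespace milvus --create-namespace -f milvus-values.yaml
```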
@@ -214,7 +220,7 @@ Milvus, the vector database designed for AI inference workloads, will be used as
@@ -335,19 +341,19 @@ This tutorial employs a Python script to create the YAML file used within Kubefl
 
 
 
-1. Next, navigate to Pipelines > Pipelines and click the **Upload Pipeline** link. Select **Upload a file** and use the **Choose file** dialog box to select the pipeline YAML file that was created in a previous step.
+1. Next, navigate to Pipelines > Pipelines and click the **Upload Pipeline** link. Select **Upload a file** and use the **Choose file** dialog box to select the pipeline YAML file that was created in a previous step. Click the **Create** button to create the pipeline.
 
 
 
-1. Navigate to the Pipelines > Runs page and click **Create Run**. Within the Run details section, select the pipeline and experiment that you just created. Choose *One-off* as the **Run Type** and provide the collection name and URL of the dataset (the zip file with the documents you wish to process) in the **Run parameters** section. For this tutorial, we are using `linode_docs` as the name and `https://github.com/linode/docs/archive/refs/tags/v1.360.0.zip` and the dataset URL.
+1. Navigate to the Pipelines > Runs page and click **Create Run**. Within the Run details section, select the pipeline and experiment that you just created. Choose *One-off* as the **Run Type** and provide the collection name and URL of the dataset (the zip file with the documents you wish to process) in the **Run parameters** section. For this tutorial, we are using `linode_docs` as the name and `https://github.com/linode/docs/archive/refs/tags/v1.366.0.zip` as the dataset URL.
 
 
 
-1. Click **Start** to run the pipeline. This process takes some time. For reference, it took ~10 minutes for the run to complete successfully on the linode.com/docs dataset.
+1. Click **Start** to run the pipeline. This process takes some time. For reference, it takes about ~10 minutes for the run to complete on the linode.com/docs dataset.
 
 ## Deploy the chatbot
 
-To finish up this tutorial, we will install the Open-WebUI chatbot and configure it to connect the data generated in the Kubernetes Pipeline with the LLM deployed in KServe. Once this is up and running, you can open up a browser interface to the chatbot and ask it questions. Chatbot UI will use the Milvus database to load context related to the search and send it, along with your query, to the Llama 3 instance within KServe. The LLM will send back a response to the chatbot and your browser will display an answer that is informed by your own custom data.
+To finish up this tutorial, install the Open-WebUI chatbot and configure it to connect the data generated in the Kubernetes Pipeline with the LLM deployed in KServe. Once this is up and running, you can open up a browser interface to the chatbot and ask it questions. Chatbot UI uses the Milvus database to load context related to the search and sends it, along with your query, to the Llama 3 instance within KServe. The LLM then sends back a response to the chatbot and your browser displays an answer that is informed by your own custom data.
 
 ### Create the RAG pipeline files
 
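Once a run completes, it can be reassuring to confirm that the pipeline actually wrote embeddings into Milvus before moving on to the chatbot. One way to check is with `pymilvus` against a port-forwarded Milvus service; the service name `milvus` and the in-cluster port are assumptions, so substitute whatever names your Helm release created, and reuse the collection name from the run parameters:

```command
kubectl port-forward svc/milvus 19530:19530
```

```file {title="check_milvus.py (illustrative)"}
from pymilvus import connections, utility, Collection

# Connect to the port-forwarded Milvus instance.
connections.connect(alias="default", host="127.0.0.1", port="19530")

# Confirm the collection created by the pipeline run exists and print a
# rough count of stored embeddings.
if utility.has_collection("linode_docs"):
    collection = Collection("linode_docs")
    print("linode_docs entities:", collection.num_entities)
else:
    print("Collection 'linode_docs' was not found.")
```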
@@ -360,6 +366,7 @@ Despite the naming, these RAG pipeline files are not related to the Kubeflow pip
 ```file {title="pipeline-requirements.txt"}
 requests
 pymilvus
+opencv-python-headless
 llama-index
 llama-index-vector-stores-milvus
 llama-index-embeddings-huggingface
@@ -368,7 +375,7 @@ Despite the naming, these RAG pipeline files are not related to the Kubeflow pip
 
 1. Create a file called `rag_pipeline.py` with the following contents. The filenames of both the `pipeline-requirements.txt` and `rag_pipeline.py` files should not be changed as they are referenced within the Open WebUI Pipeline configuration file.
 
-```file {title="rag-pipeline.py"}
+```file {title="rag_pipeline.py"}
 """
 title: RAG Pipeline
 version: 1.0
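The full `rag_pipeline.py` used by this guide is much longer than the header shown in the diff. To give a feel for what the retrieval side of such a pipeline does, a stripped-down sketch is shown below. It is not the guide’s actual script: the Milvus URI, collection name, embedding model, and vector dimension are assumptions to replace with the values used earlier in the tutorial.

```file {title="rag_query_sketch.py (illustrative)"}
from llama_index.core import Settings, VectorStoreIndex
from llama_index.embeddings.huggingface import HuggingFaceEmbedding
from llama_index.vector_stores.milvus import MilvusVectorStore

# Use the same embedding model that generated the stored vectors,
# otherwise similarity search results will be meaningless.
Settings.embed_model = HuggingFaceEmbedding(model_name="BAAI/bge-small-en-v1.5")

# Point at the Milvus collection populated by the Kubeflow pipeline run.
vector_store = MilvusVectorStore(
    uri="http://milvus.milvus.svc.cluster.local:19530",  # assumed in-cluster address
    collection_name="linode_docs",
    dim=384,  # must match the embedding model's output dimension
    overwrite=False,
)

# Build an index view over the existing collection and retrieve context that
# would be sent to the Llama 3 endpoint along with the user query.
index = VectorStoreIndex.from_vector_store(vector_store)
retriever = index.as_retriever(similarity_top_k=3)
for result in retriever.retrieve("How do I create an LKE cluster?"):
    print(result.score, result.node.get_content()[:120])
```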
@@ -594,4 +601,6 @@ Now that the chatbot has been configured, the final step is to access the chatbo
 
 - The **RAG Pipeline** model that you defined in a previous section does use data from your custom dataset. Ask it a question relevant to your data and the chatbot should respond with an answer that is informed by the custom dataset you configured.
 
-
+
+
+The response time depends on a variety of factors. Using similar cluster resources and the same dataset as this guide, an estimated response time is between 6 to 70 seconds.
docs/guides/kubernetes/setting-up-harbor-registry-with-lke/index.md (1 addition & 1 deletion)
@@ -31,7 +31,7 @@ This guide shows how to set up a Harbor registry on a dedicated compute instance
 
 The Harbor installation in this guide assumes that you have [a domain name registered through a domain registrar](/docs/products/networking/dns-manager/get-started/#register-the-domain), and that you can edit the DNS records for this domain. This is so that SSL connections can be configured for the Harbor server. If you do not have a domain name, register one now.
 
-The infrastructure for this guide is created on the Akamai Connected Cloud platform. If you do not already have one, [create an account](/docs/products/platform/get-started/) for the platform.
+The infrastructure for this guide is created on the Akamai Cloud platform. If you do not already have one, [create an account](/docs/products/platform/get-started/) for the platform.
 
 The following is a summary of the infrastructure created in this guide. Instructions for creating these services are included later in the guide: