# Deploy ChatQnA on Kubernetes cluster

## Deploy on Xeon

```shell
export HFTOKEN="insert-your-huggingface-token-here"
helm install chatqna oci://ghcr.io/opea-project/charts/chatqna --set global.HUGGINGFACEHUB_API_TOKEN=${HFTOKEN} -f cpu-values.yaml
```

## Deploy on Gaudi

```shell
export HFTOKEN="insert-your-huggingface-token-here"
helm install chatqna oci://ghcr.io/opea-project/charts/chatqna --set global.HUGGINGFACEHUB_API_TOKEN=${HFTOKEN} -f gaudi-vllm-values.yaml
```

## Deploy variants of ChatQnA

ChatQnA is configurable, and you can enable or disable features by providing a values file. For example, to run with the TGI inference engine instead of vLLM on Gaudi hardware, use the gaudi-tgi-values.yaml file:

```shell
export HFTOKEN="insert-your-huggingface-token-here"
helm install chatqna oci://ghcr.io/opea-project/charts/chatqna --set global.HUGGINGFACEHUB_API_TOKEN=${HFTOKEN} -f gaudi-tgi-values.yaml
```

See the other *-values.yaml files in this directory for further examples.
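Instead of passing the token on the command line with `--set`, you can keep the same override in a small custom values file and layer it on top of one of the provided presets. This is a minimal sketch; the file name `my-values.yaml` is hypothetical, while the `global.HUGGINGFACEHUB_API_TOKEN` key is the one used by the commands above:

```yaml
# my-values.yaml (hypothetical file name)
# Overrides the Hugging Face token instead of passing it via --set.
global:
  HUGGINGFACEHUB_API_TOKEN: "insert-your-huggingface-token-here"
```

Helm applies `-f` files left to right, with later files taking precedence, so the custom file should come after the preset, e.g. `helm install chatqna oci://ghcr.io/opea-project/charts/chatqna -f cpu-values.yaml -f my-values.yaml`.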