
Fix workexec agent docker build issues and enable LLM Remote Endpoint #2103

Open · wants to merge 2 commits into main
33 changes: 30 additions & 3 deletions WorkflowExecAgent/README.md
@@ -72,24 +72,48 @@ And finally here are the results from the microservice logs:

### Start Agent Microservice

Workflow Executor will have a single docker image. First, build the agent docker image.
Workflow Executor will have a single docker image.

(Optional) Build the agent docker image with the latest changes.
By default, Workflow Executor uses the public [opea/agent](https://hub.docker.com/r/opea/agent) docker image if no locally built image exists.

```sh
export WORKDIR=$PWD
git clone https://github.com/opea-project/GenAIComps.git
git clone https://github.com/opea-project/GenAIExamples.git
cd GenAIExamples/WorkflowExecAgent/docker_image_build/
docker compose -f build.yaml build --no-cache
```
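
If the build succeeds, the agent image should appear in the local image list. A quick check (the `opea/agent` image name is an assumption based on the defaults in `build.yaml`):

```sh
# List the locally built agent image (image name opea/agent assumed from build.yaml defaults)
docker images | grep 'opea/agent'
```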

<details>
<summary> Using Remote LLM Endpoints </summary>
When models are deployed on a remote server, a base URL and an API key are required to access them. To set up a remote server and acquire the base URL and API key, refer to <a href="https://www.intel.com/content/www/us/en/products/docs/accelerator-engines/enterprise-ai.html"> Intel® AI for Enterprise Inference </a> offerings.

Set the following environment variables.

- `llm_endpoint_url` is the HTTPS endpoint of the remote server hosting the model of choice (e.g. https://api.inference.denvrdata.com). **Note:** If not using LiteLLM, the second part of the model card needs to be appended to the URL, e.g. `/Llama-3.3-70B-Instruct` from `meta-llama/Llama-3.3-70B-Instruct`.
- `llm_endpoint_api_key` is the access token or key to access the model(s) on the server.
- `LLM_MODEL_ID` is the model card, which may need to be overwritten depending on what it is set to in `set_env.sh`.

```bash
export llm_endpoint_url=<https-endpoint-of-remote-server>
export llm_endpoint_api_key=<your-api-key>
export LLM_MODEL_ID=<model-card>
```
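
As a quick sanity check, the remote endpoint can be queried directly. This is a minimal sketch assuming the server exposes an OpenAI-compatible API (as vLLM and LiteLLM deployments do):

```bash
# List the models served by the remote endpoint (assumes an OpenAI-compatible /v1/models route)
curl -s "${llm_endpoint_url}/v1/models" \
  -H "Authorization: Bearer ${llm_endpoint_api_key}"
```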

</details>

Configure the `GenAIExamples/WorkflowExecAgent/docker_compose/.env` file with the following. Replace the variables according to your use case.

```sh
export SDK_BASE_URL=${SDK_BASE_URL}
export SERVING_TOKEN=${SERVING_TOKEN}
export HF_TOKEN=${HF_TOKEN}
export llm_engine=${llm_engine}
export llm_engine=vllm
export llm_endpoint_url=${llm_endpoint_url}
export api_key=${llm_endpoint_api_key:-""}
export ip_address=$(hostname -I | awk '{print $1}')
export model="mistralai/Mistral-7B-Instruct-v0.3"
export model=${LLM_MODEL_ID:-"mistralai/Mistral-7B-Instruct-v0.3"}
export recursion_limit=${recursion_limit}
export temperature=0
export max_new_tokens=1000
@@ -99,6 +123,9 @@ export http_proxy=${http_proxy}
export https_proxy=${https_proxy}
```

> Note: `SDK_BASE_URL` and `SERVING_TOKEN` can be obtained from the Intel Data Insight Automation platform.
> For `llm_endpoint_url`, either a local vllm service or a remote vllm endpoint works for this example.
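
Before launching, it can help to confirm that the required variables are populated. A minimal sketch that sources the `.env` and flags empty values (variable names taken from the file above; bash indirect expansion assumed):

```bash
# Source the .env and warn on any empty required variable
source GenAIExamples/WorkflowExecAgent/docker_compose/.env
for var in SDK_BASE_URL SERVING_TOKEN HF_TOKEN llm_engine llm_endpoint_url model; do
  [ -n "${!var}" ] || echo "Warning: $var is empty"
done
```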

Launch service by running the docker compose command.

```sh
@@ -17,6 +17,7 @@ services:
recursion_limit: ${recursion_limit}
llm_engine: ${llm_engine}
llm_endpoint_url: ${llm_endpoint_url}
api_key: ${api_key}
model: ${model}
temperature: ${temperature}
max_new_tokens: ${max_new_tokens}
2 changes: 1 addition & 1 deletion WorkflowExecAgent/docker_image_build/build.yaml
@@ -4,7 +4,7 @@
services:
agent:
build:
context: GenAIComps
context: ${WORKDIR:-./}/GenAIComps
dockerfile: comps/agent/src/Dockerfile
args:
http_proxy: ${http_proxy}
8 changes: 4 additions & 4 deletions WorkflowExecAgent/tests/2_start_vllm_service.sh
@@ -37,7 +37,7 @@ function build_vllm_docker_image() {

function start_vllm_service() {
echo "start vllm service"
docker run -d -p ${vllm_port}:${vllm_port} --rm --network=host --name test-comps-vllm-service -v ~/.cache/huggingface:/root/.cache/huggingface -v ${WORKPATH}/tests/tool_chat_template_mistral_custom.jinja:/root/tool_chat_template_mistral_custom.jinja -e HF_TOKEN=$HF_TOKEN -e http_proxy=$http_proxy -e https_proxy=$https_proxy -it vllm-cpu-env --model ${model} --port ${vllm_port} --chat-template /root/tool_chat_template_mistral_custom.jinja --enable-auto-tool-choice --tool-call-parser mistral
docker run -d -p ${vllm_port}:${vllm_port} --rm --network=host --name test-comps-vllm-service -v ~/.cache/huggingface:/root/.cache/huggingface -v ${WORKPATH}/tests/tool_chat_template_mistral_custom.jinja:/root/tool_chat_template_mistral_custom.jinja -e HF_TOKEN=$HF_TOKEN -e http_proxy=$http_proxy -e https_proxy=$https_proxy -it public.ecr.aws/q9t5s3a7/vllm-cpu-release-repo:v0.9.1 --model ${model} --port ${vllm_port} --chat-template /root/tool_chat_template_mistral_custom.jinja --enable-auto-tool-choice --tool-call-parser mistral
echo ${LOG_PATH}/vllm-service.log
sleep 5s
echo "Waiting vllm ready"
@@ -59,9 +59,9 @@ function start_vllm_service() {
}

function main() {
echo "==================== Build vllm docker image ===================="
build_vllm_docker_image
echo "==================== Build vllm docker image completed ===================="
# echo "==================== Build vllm docker image ===================="
# build_vllm_docker_image
# echo "==================== Build vllm docker image completed ===================="

echo "==================== Start vllm docker service ===================="
start_vllm_service
1 change: 1 addition & 0 deletions WorkflowExecAgent/tests/3_launch_and_validate_agent.sh
@@ -16,6 +16,7 @@ export HF_TOKEN=${HF_TOKEN}
export llm_engine=vllm
export ip_address=$(hostname -I | awk '{print $1}')
export llm_endpoint_url=http://${ip_address}:${vllm_port}
export api_key=""
export model=mistralai/Mistral-7B-Instruct-v0.3
export recursion_limit=25
export temperature=0