diff --git a/WorkflowExecAgent/README.md b/WorkflowExecAgent/README.md
index 402913775d..5c0e66a5ad 100644
--- a/WorkflowExecAgent/README.md
+++ b/WorkflowExecAgent/README.md
@@ -72,24 +72,48 @@ And finally here are the results from the microservice logs:
 
 ### Start Agent Microservice
 
-Workflow Executor will have a single docker image. First, build the agent docker image.
+Workflow Executor uses a single docker image.
+
+(Optional) Build the agent docker image with the latest changes.
+By default, Workflow Executor uses the public [opea/agent](https://hub.docker.com/r/opea/agent) docker image if no locally built image exists.
 
 ```sh
+export WORKDIR=$PWD
+git clone https://github.com/opea-project/GenAIComps.git
 git clone https://github.com/opea-project/GenAIExamples.git
 cd GenAIExamples//WorkflowExecAgent/docker_image_build/
 docker compose -f build.yaml build --no-cache
 ```
+
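+If you skip the local build, you can optionally pre-pull the published image so the first launch does not have to download it (the tag below assumes the default `latest`):
+
+```sh
+# Pull the prebuilt agent image from Docker Hub (tag assumed to be "latest")
+docker pull opea/agent:latest
+```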
+
+<details>
+<summary> Using Remote LLM Endpoints </summary>
+When models are deployed on a remote server, a base URL and an API key are required to access them. To set up a remote server and acquire the base URL and API key, refer to Intel® AI for Enterprise Inference offerings.
+
+Set the following environment variables.
+
+- `llm_endpoint_url` is the HTTPS endpoint of the remote server with the model of choice (e.g. https://api.inference.denvrdata.com). **Note:** If not using LiteLLM, the second part of the model card needs to be appended to the URL, e.g. `/Llama-3.3-70B-Instruct` from `meta-llama/Llama-3.3-70B-Instruct`.
+- `llm_endpoint_api_key` is the access token or key to access the model(s) on the server.
+- `LLM_MODEL_ID` is the model card, which may need to be overwritten depending on what it is set to in `set_env.sh`.
+
+```bash
+export llm_endpoint_url=
+export llm_endpoint_api_key=
+export LLM_MODEL_ID=
+```
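+
+As a quick sanity check (assuming the remote server exposes an OpenAI-compatible API, as vLLM and LiteLLM endpoints do), you can list the models served behind the endpoint before wiring it into the agent:
+
+```bash
+# List the models available at the remote endpoint, using the values exported above
+curl -H "Authorization: Bearer ${llm_endpoint_api_key}" ${llm_endpoint_url}/v1/models
+```
+
+If the request returns the expected model list, the same URL and key can be used in the configuration below.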
+
+</details>
+
 Configure `GenAIExamples/WorkflowExecAgent/docker_compose/.env` file with the following. Replace the variables according to your usecase.
 
 ```sh
 export SDK_BASE_URL=${SDK_BASE_URL}
 export SERVING_TOKEN=${SERVING_TOKEN}
 export HF_TOKEN=${HF_TOKEN}
-export llm_engine=${llm_engine}
+export llm_engine=vllm
 export llm_endpoint_url=${llm_endpoint_url}
+export api_key=${llm_endpoint_api_key:-""}
 export ip_address=$(hostname -I | awk '{print $1}')
-export model="mistralai/Mistral-7B-Instruct-v0.3"
+export model=${LLM_MODEL_ID:-"mistralai/Mistral-7B-Instruct-v0.3"}
 export recursion_limit=${recursion_limit}
 export temperature=0
 export max_new_tokens=1000
@@ -99,6 +123,9 @@ export http_proxy=${http_proxy}
 export https_proxy=${https_proxy}
 ```
 
+> Note: SDK_BASE_URL and SERVING_TOKEN can be obtained from the Intel Data Insight Automation platform.
+> For llm_endpoint_url, either a local vLLM service or a remote vLLM endpoint works for this example.
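+
+As an optional pre-flight check, you can source the `.env` file and render the resolved compose configuration to confirm that the variables substituted as expected (the paths below assume you run this from `GenAIExamples/WorkflowExecAgent/docker_compose/`):
+
+```sh
+source .env
+# Print the compose config with ${...} variables resolved; no containers are started
+docker compose -f intel/cpu/xeon/compose_vllm.yaml config
+```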
+
 Launch service by running the docker compose command.
 
 ```sh
diff --git a/WorkflowExecAgent/docker_compose/intel/cpu/xeon/compose_vllm.yaml b/WorkflowExecAgent/docker_compose/intel/cpu/xeon/compose_vllm.yaml
index e69e9ef7f2..c1bf1448e4 100644
--- a/WorkflowExecAgent/docker_compose/intel/cpu/xeon/compose_vllm.yaml
+++ b/WorkflowExecAgent/docker_compose/intel/cpu/xeon/compose_vllm.yaml
@@ -17,6 +17,7 @@ services:
       recursion_limit: ${recursion_limit}
       llm_engine: ${llm_engine}
       llm_endpoint_url: ${llm_endpoint_url}
+      api_key: ${api_key}
       model: ${model}
       temperature: ${temperature}
       max_new_tokens: ${max_new_tokens}
diff --git a/WorkflowExecAgent/docker_image_build/build.yaml b/WorkflowExecAgent/docker_image_build/build.yaml
index 61f2b0dda5..1fe9f9edf7 100644
--- a/WorkflowExecAgent/docker_image_build/build.yaml
+++ b/WorkflowExecAgent/docker_image_build/build.yaml
@@ -4,7 +4,7 @@ services:
   agent:
     build:
-      context: GenAIComps
+      context: ${WORKDIR:-./}/GenAIComps
       dockerfile: comps/agent/src/Dockerfile
       args:
         http_proxy: ${http_proxy}
diff --git a/WorkflowExecAgent/tests/2_start_vllm_service.sh b/WorkflowExecAgent/tests/2_start_vllm_service.sh
index 9fb65541fc..ffc9398f20 100644
--- a/WorkflowExecAgent/tests/2_start_vllm_service.sh
+++ b/WorkflowExecAgent/tests/2_start_vllm_service.sh
@@ -37,7 +37,7 @@ function build_vllm_docker_image() {
 
 function start_vllm_service() {
     echo "start vllm service"
-    docker run -d -p ${vllm_port}:${vllm_port} --rm --network=host --name test-comps-vllm-service -v ~/.cache/huggingface:/root/.cache/huggingface -v ${WORKPATH}/tests/tool_chat_template_mistral_custom.jinja:/root/tool_chat_template_mistral_custom.jinja -e HF_TOKEN=$HF_TOKEN -e http_proxy=$http_proxy -e https_proxy=$https_proxy -it vllm-cpu-env --model ${model} --port ${vllm_port} --chat-template /root/tool_chat_template_mistral_custom.jinja --enable-auto-tool-choice --tool-call-parser mistral
+    docker run -d -p ${vllm_port}:${vllm_port} --rm --network=host --name test-comps-vllm-service -v ~/.cache/huggingface:/root/.cache/huggingface -v ${WORKPATH}/tests/tool_chat_template_mistral_custom.jinja:/root/tool_chat_template_mistral_custom.jinja -e HF_TOKEN=$HF_TOKEN -e http_proxy=$http_proxy -e https_proxy=$https_proxy -it public.ecr.aws/q9t5s3a7/vllm-cpu-release-repo:v0.9.1 --model ${model} --port ${vllm_port} --chat-template /root/tool_chat_template_mistral_custom.jinja --enable-auto-tool-choice --tool-call-parser mistral
     echo ${LOG_PATH}/vllm-service.log
     sleep 5s
     echo "Waiting vllm ready"
@@ -59,9 +59,9 @@
 }
 
 function main() {
-    echo "==================== Build vllm docker image ===================="
-    build_vllm_docker_image
-    echo "==================== Build vllm docker image completed ===================="
+    # echo "==================== Build vllm docker image ===================="
+    # build_vllm_docker_image
+    # echo "==================== Build vllm docker image completed ===================="
 
     echo "==================== Start vllm docker service ===================="
     start_vllm_service
diff --git a/WorkflowExecAgent/tests/3_launch_and_validate_agent.sh b/WorkflowExecAgent/tests/3_launch_and_validate_agent.sh
index 3fa75920c3..f537a7f503 100644
--- a/WorkflowExecAgent/tests/3_launch_and_validate_agent.sh
+++ b/WorkflowExecAgent/tests/3_launch_and_validate_agent.sh
@@ -16,6 +16,7 @@ export HF_TOKEN=${HF_TOKEN}
 export llm_engine=vllm
 export ip_address=$(hostname -I | awk '{print $1}')
 export llm_endpoint_url=http://${ip_address}:${vllm_port}
+export api_key=""
 export model=mistralai/Mistral-7B-Instruct-v0.3
 export recursion_limit=25
 export temperature=0