add test script #2199
base: main
Conversation
Signed-off-by: Ubuntu <azureuser@denvr-inf.kifxisxbiwme5gt4kkwqsfdjuh.dx.internal.cloudapp.net>
Dependency Review: ✅ No vulnerabilities or license issues found. Scanned Files: None
for more information, see https://pre-commit.ci
echo "Build all the images with --no-cache, check docker_image_build.log for details..." | ||
service_list="chatqna chatqna-ui dataprep retriever nginx" | ||
docker compose -f build.yaml build ${service_list} --no-cache > ${LOG_PATH}/docker_image_build.log | ||
docker run --env "VLLM_SKIP_WARMUP=true" -p 8000:8000 --ipc=host public.ecr.aws/q9t5s3a7/vllm-cpu-release-repo:v0.9.2 --model TinyLlama/TinyLlama-1.1B-Chat-v1.0 --api-key $TEST_KEY |
Where is TEST_KEY set? Or mention in the tests folder README that it needs to be set.
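If it is only a dummy key for the fake remote endpoint, one way to handle this would be to document in the tests folder README that the variable must be exported before running the script. A minimal sketch (the value shown is an arbitrary placeholder, not anything the PR defines):

# TEST_KEY is an arbitrary dummy key; it only has to match between the
# `docker run ... --api-key $TEST_KEY` endpoint and the services that call it.
export TEST_KEY="dummy-test-key"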
echo "Build all the images with --no-cache, check docker_image_build.log for details..." | ||
service_list="chatqna chatqna-ui dataprep retriever nginx" | ||
docker compose -f build.yaml build ${service_list} --no-cache > ${LOG_PATH}/docker_image_build.log | ||
docker run --env "VLLM_SKIP_WARMUP=true" -p 8000:8000 --ipc=host public.ecr.aws/q9t5s3a7/vllm-cpu-release-repo:v0.9.2 --model TinyLlama/TinyLlama-1.1B-Chat-v1.0 --api-key $TEST_KEY |
will we need to update the docker image for vllm-cpu over time?
service_list="codegen codegen-gradio-ui dataprep retriever embedding" | ||
|
||
docker compose -f build.yaml build ${service_list} --no-cache > ${LOG_PATH}/docker_image_build.log | ||
docker run --env "VLLM_SKIP_WARMUP=true" -p 8000:8000 --ipc=host public.ecr.aws/q9t5s3a7/vllm-cpu-release-repo:v0.9.2 --model TinyLlama/TinyLlama-1.1B-Chat-v1.0 --api-key $TEST_KEY |
same here regarding TEST_KEY and the vllm-cpu docker image
export MAX_INPUT_TOKENS=2048
export MAX_TOTAL_TOKENS=4096
#export REMOTE_ENDPOINT=
remove commented code
echo "Build all the images with --no-cache, check docker_image_build.log for details..." | ||
service_list="docsum docsum-gradio-ui whisper llm-docsum" | ||
docker compose -f build.yaml build ${service_list} --no-cache > ${LOG_PATH}/docker_image_build.log | ||
docker run --env "VLLM_SKIP_WARMUP=true" -p 8000:8000 --ipc=host public.ecr.aws/q9t5s3a7/vllm-cpu-release-repo:v0.9.2 --model TinyLlama/TinyLlama-1.1B-Chat-v1.0 --api-key $TEST_KEY |
same here regarding TEST_KEY and the vllm-cpu docker image
fi
}

function stop_docker() {
Add a line to stop the vllm-cpu docker container; a sketch follows.
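A possible implementation (a sketch: the container is looked up by its image, since the `docker run` command in this script does not assign a container name):

# Stop and remove the standalone vllm-cpu container started with `docker run`.
# It is matched by image because the run command does not use --name.
cid=$(docker ps -aq --filter "ancestor=public.ecr.aws/q9t5s3a7/vllm-cpu-release-repo:v0.9.2")
if [ -n "$cid" ]; then
    docker stop $cid && docker rm $cid
fi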
fi
}

function stop_docker() {
Add a line to stop the vllm-cpu docker container (see the sketch above).
"stream=False" | ||
} | ||
|
||
function stop_docker() { |
Add a line to stop the vllm-cpu docker container (see the sketch above).
echo "Build all the images with --no-cache, check docker_image_build.log for details..."
docker compose -f build.yaml build --no-cache > ${LOG_PATH}/docker_image_build.log
docker run --env "VLLM_SKIP_WARMUP=true" -p 8000:8000 --ipc=host public.ecr.aws/q9t5s3a7/vllm-cpu-release-repo:v0.9.2 --model TinyLlama/TinyLlama-1.1B-Chat-v1.0 --api-key $TEST_KEY
same here regarding TEST_KEY and the vllm-cpu docker image
fi
}

function stop_docker() {
Add a line to stop the vllm-cpu docker container (see the sketch above).
# Start Docker Containers
docker compose -f compose_remote.yaml -f compose.telemetry.yaml up -d --quiet-pull > ${LOG_PATH}/start_services_with_compose_remote.log
}
Add logic to wait for the slowest docker container to spin up, usually the backend service; a sketch follows. Example: https://github.com/opea-project/GenAIExamples/blob/main/ChatQnA/tests/test_compose_on_xeon.sh#L47
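Following the linked ChatQnA test, a closed-loop check might look like the sketch below. Note that the container name chatqna-backend-server and the readiness string "Application startup complete" are assumptions that would need to match the actual compose service:

# Poll the backend container's logs until it reports readiness, instead of a fixed sleep.
# NOTE: container name and log string are assumed; adjust to the actual backend service.
n=0
until [[ "$n" -ge 100 ]]; do
    docker logs chatqna-backend-server > ${LOG_PATH}/backend_startup.log 2>&1
    if grep -q "Application startup complete" ${LOG_PATH}/backend_startup.log; then
        break
    fi
    sleep 5s
    n=$((n+1))
done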
export API_KEY=$TEST_KEY
export LLM_MODEL_ID=TinyLlama/TinyLlama-1.1B-Chat-v1.0
# Start Docker Containers
docker compose -f compose_remote.yaml up -d > ${LOG_PATH}/start_services_with_compose.log
Add logic to wait for the slowest docker container to spin up, usually the backend service (see the sketch above).
export API_KEY=$TEST_KEY
export LLM_MODEL_ID=TinyLlama/TinyLlama-1.1B-Chat-v1.0
docker compose -f compose_remote.yaml up -d > ${LOG_PATH}/start_services_with_compose_remote.log
sleep 1m
Sleeping for 1 minute is OK in my experience for waiting for the docker containers to spin up. But if you want to make it a closed-loop check, add logic to wait for the slowest container, usually the backend service (see the sketch above).
# Start Docker Containers
docker compose -f compose_remote.yaml up -d > ${LOG_PATH}/start_services_with_compose_remote.log
sleep 30s
Sleeping for 30 seconds is OK, but if you want to make it a closed-loop check, add logic to wait for the slowest docker container to spin up, usually the backend service (see the sketch above).
TEST_KEY needs to be set somewhere; the vllm-cpu docker image used as a dummy remote endpoint uses a fixed-version image; and logic needs to be added to wait for the docker containers to spin up before proceeding to validate the services.
echo "Build all the images with --no-cache, check docker_image_build.log for details..." | ||
service_list="chatqna chatqna-ui dataprep retriever nginx" | ||
docker compose -f build.yaml build ${service_list} --no-cache > ${LOG_PATH}/docker_image_build.log | ||
docker run --env "VLLM_SKIP_WARMUP=true" -p 8000:8000 --ipc=host public.ecr.aws/q9t5s3a7/vllm-cpu-release-repo:v0.9.2 --model TinyLlama/TinyLlama-1.1B-Chat-v1.0 --api-key $TEST_KEY |
Suggested change:
docker run --env "VLLM_SKIP_WARMUP=true" -p 8000:8000 --ipc=host public.ecr.aws/q9t5s3a7/vllm-cpu-release-repo:v0.9.2 --model TinyLlama/TinyLlama-1.1B-Chat-v1.0 --api-key $TEST_KEY
docker run --env "VLLM_SKIP_WARMUP=true" -p 8000:8000 --ipc=host public.ecr.aws/q9t5s3a7/vllm-cpu-release-repo:latest --model TinyLlama/TinyLlama-1.1B-Chat-v1.0 --api-key $TEST_KEY
Maybe good to use the latest one.
echo "Build all the images with --no-cache, check docker_image_build.log for details..." | ||
service_list="chatqna chatqna-ui dataprep retriever nginx" | ||
docker compose -f build.yaml build ${service_list} --no-cache > ${LOG_PATH}/docker_image_build.log | ||
docker run --env "VLLM_SKIP_WARMUP=true" -p 8000:8000 --ipc=host public.ecr.aws/q9t5s3a7/vllm-cpu-release-repo:v0.9.2 --model TinyLlama/TinyLlama-1.1B-Chat-v1.0 --api-key $TEST_KEY |
Why is this docker run command inside the docker build function? You could move it to start_services and sleep in between; a sketch follows.
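A sketch of that restructuring, assuming start_services is the function that brings up the compose stack. Note the added -d flag so the vLLM container runs detached rather than blocking the script:

function start_services() {
    # Launch the dummy remote endpoint in the background, then give vLLM
    # time to load the model before the compose services try to reach it.
    docker run -d --env "VLLM_SKIP_WARMUP=true" -p 8000:8000 --ipc=host \
        public.ecr.aws/q9t5s3a7/vllm-cpu-release-repo:v0.9.2 \
        --model TinyLlama/TinyLlama-1.1B-Chat-v1.0 --api-key $TEST_KEY
    sleep 1m

    docker compose -f compose_remote.yaml up -d > ${LOG_PATH}/start_services_with_compose_remote.log
}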
docker build --no-cache -t ${REGISTRY}/comps-base:${TAG} --build-arg https_proxy=$https_proxy --build-arg http_proxy=$http_proxy -f Dockerfile .
popd && sleep 1s

git clone https://github.com/vllm-project/vllm.git && cd vllm
No need to rebuild vllm; just use the public release (see the sketch below).
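That is, instead of cloning and building from source, the script could pull the prebuilt image already used elsewhere in these tests (a sketch):

# Use the public vllm-cpu release image rather than building vllm from source.
docker pull public.ecr.aws/q9t5s3a7/vllm-cpu-release-repo:v0.9.2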
echo "Build all the images with --no-cache, check docker_image_build.log for details..." | ||
service_list="docsum docsum-gradio-ui whisper llm-docsum" | ||
docker compose -f build.yaml build ${service_list} --no-cache > ${LOG_PATH}/docker_image_build.log | ||
docker run --env "VLLM_SKIP_WARMUP=true" -p 8000:8000 --ipc=host public.ecr.aws/q9t5s3a7/vllm-cpu-release-repo:v0.9.2 --model TinyLlama/TinyLlama-1.1B-Chat-v1.0 --api-key $TEST_KEY |
same here.
service_list="codegen codegen-gradio-ui dataprep retriever embedding" | ||
|
||
docker compose -f build.yaml build ${service_list} --no-cache > ${LOG_PATH}/docker_image_build.log | ||
docker run --env "VLLM_SKIP_WARMUP=true" -p 8000:8000 --ipc=host public.ecr.aws/q9t5s3a7/vllm-cpu-release-repo:v0.9.2 --model TinyLlama/TinyLlama-1.1B-Chat-v1.0 --api-key $TEST_KEY |
same here
echo "Build all the images with --no-cache, check docker_image_build.log for details..." | ||
service_list="docsum docsum-gradio-ui whisper llm-docsum" | ||
docker compose -f build.yaml build ${service_list} --no-cache > ${LOG_PATH}/docker_image_build.log | ||
docker run --env "VLLM_SKIP_WARMUP=true" -p 8000:8000 --ipc=host public.ecr.aws/q9t5s3a7/vllm-cpu-release-repo:v0.9.2 --model TinyLlama/TinyLlama-1.1B-Chat-v1.0 --api-key $TEST_KEY |
same here
Description
Test script for remote endpoint
Issues
List the issue or RFC link this PR is working on. If there is no such link, please mark it as n/a.
Type of change
List the type of change like below. Please delete options that are not relevant.
Dependencies
NA
Tests
Describe the tests that you ran to verify your changes.