add test script #2199
base: main
Conversation
Signed-off-by: Ubuntu <azureuser@denvr-inf.kifxisxbiwme5gt4kkwqsfdjuh.dx.internal.cloudapp.net>
Dependency Review: ✅ No vulnerabilities or license issues found. Scanned Files: None
for more information, see https://pre-commit.ci
echo "Build all the images with --no-cache, check docker_image_build.log for details..." | ||
service_list="chatqna chatqna-ui dataprep retriever nginx" | ||
docker compose -f build.yaml build ${service_list} --no-cache > ${LOG_PATH}/docker_image_build.log | ||
docker run --env "VLLM_SKIP_WARMUP=true" -p 8000:8000 --ipc=host public.ecr.aws/q9t5s3a7/vllm-cpu-release-repo:v0.9.2 --model TinyLlama/TinyLlama-1.1B-Chat-v1.0 --api-key $TEST_KEY |
Where is TEST_KEY set? Or mention in the tests folder README that it needs to be set.
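If it is only a dummy key for the fake remote endpoint, one way to handle this would be to document in the tests folder README that the variable must be exported before running the script. A minimal sketch (the value shown is an arbitrary placeholder, not anything the PR defines):

# TEST_KEY is an arbitrary dummy key; it only has to match between the
# `docker run ... --api-key $TEST_KEY` endpoint and the services that call it.
export TEST_KEY="dummy-test-key"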
echo "Build all the images with --no-cache, check docker_image_build.log for details..." | ||
service_list="chatqna chatqna-ui dataprep retriever nginx" | ||
docker compose -f build.yaml build ${service_list} --no-cache > ${LOG_PATH}/docker_image_build.log | ||
docker run --env "VLLM_SKIP_WARMUP=true" -p 8000:8000 --ipc=host public.ecr.aws/q9t5s3a7/vllm-cpu-release-repo:v0.9.2 --model TinyLlama/TinyLlama-1.1B-Chat-v1.0 --api-key $TEST_KEY |
will we need to update the docker image for vllm-cpu over time?
service_list="codegen codegen-gradio-ui dataprep retriever embedding" | ||
|
||
docker compose -f build.yaml build ${service_list} --no-cache > ${LOG_PATH}/docker_image_build.log | ||
docker run --env "VLLM_SKIP_WARMUP=true" -p 8000:8000 --ipc=host public.ecr.aws/q9t5s3a7/vllm-cpu-release-repo:v0.9.2 --model TinyLlama/TinyLlama-1.1B-Chat-v1.0 --api-key $TEST_KEY |
same here regarding TEST_KEY and the vllm-cpu docker image
export MAX_INPUT_TOKENS=2048
export MAX_TOTAL_TOKENS=4096
#export REMOTE_ENDPOINT=
remove commented code
echo "Build all the images with --no-cache, check docker_image_build.log for details..." | ||
service_list="docsum docsum-gradio-ui whisper llm-docsum" | ||
docker compose -f build.yaml build ${service_list} --no-cache > ${LOG_PATH}/docker_image_build.log | ||
docker run --env "VLLM_SKIP_WARMUP=true" -p 8000:8000 --ipc=host public.ecr.aws/q9t5s3a7/vllm-cpu-release-repo:v0.9.2 --model TinyLlama/TinyLlama-1.1B-Chat-v1.0 --api-key $TEST_KEY |
same here regarding TEST_KEY and the vllm-cpu docker image
fi
}

function stop_docker() {
Add a line to stop the vllm-cpu docker container; a sketch follows.
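A possible implementation (a sketch: the container is looked up by its image, since the `docker run` command in this script does not assign a container name):

# Stop and remove the standalone vllm-cpu container started with `docker run`.
# It is matched by image because the run command does not use --name.
cid=$(docker ps -aq --filter "ancestor=public.ecr.aws/q9t5s3a7/vllm-cpu-release-repo:v0.9.2")
if [ -n "$cid" ]; then
    docker stop $cid && docker rm $cid
fi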
fi
}

function stop_docker() {
Add a line to stop the vllm-cpu docker container (see the sketch above).
"stream=False" | ||
} | ||
|
||
function stop_docker() { |
Add a line to stop the vllm-cpu docker container (see the sketch above).
echo "Build all the images with --no-cache, check docker_image_build.log for details..."
docker compose -f build.yaml build --no-cache > ${LOG_PATH}/docker_image_build.log
docker run --env "VLLM_SKIP_WARMUP=true" -p 8000:8000 --ipc=host public.ecr.aws/q9t5s3a7/vllm-cpu-release-repo:v0.9.2 --model TinyLlama/TinyLlama-1.1B-Chat-v1.0 --api-key $TEST_KEY
same here regarding TEST_KEY and the vllm-cpu docker image
fi
}

function stop_docker() {
Add a line to stop the vllm-cpu docker container (see the sketch above).
# Start Docker Containers
docker compose -f compose_remote.yaml -f compose.telemetry.yaml up -d --quiet-pull > ${LOG_PATH}/start_services_with_compose_remote.log
}
Add logic to wait for the slowest docker container to spin up, usually the backend service; a sketch follows. Example: https://github.com/opea-project/GenAIExamples/blob/main/ChatQnA/tests/test_compose_on_xeon.sh#L47
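Following the linked ChatQnA test, a closed-loop check might look like the sketch below. Note that the container name chatqna-backend-server and the readiness string "Application startup complete" are assumptions that would need to match the actual compose service:

# Poll the backend container's logs until it reports readiness, instead of a fixed sleep.
# NOTE: container name and log string are assumed; adjust to the actual backend service.
n=0
until [[ "$n" -ge 100 ]]; do
    docker logs chatqna-backend-server > ${LOG_PATH}/backend_startup.log 2>&1
    if grep -q "Application startup complete" ${LOG_PATH}/backend_startup.log; then
        break
    fi
    sleep 5s
    n=$((n+1))
done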
export API_KEY=$TEST_KEY
export LLM_MODEL_ID=TinyLlama/TinyLlama-1.1B-Chat-v1.0
# Start Docker Containers
docker compose -f compose_remote.yaml up -d > ${LOG_PATH}/start_services_with_compose.log
Add logic to wait for the slowest docker container to spin up, usually the backend service (see the sketch above).
export API_KEY=$TEST_KEY
export LLM_MODEL_ID=TinyLlama/TinyLlama-1.1B-Chat-v1.0
docker compose -f compose_remote.yaml up -d > ${LOG_PATH}/start_services_with_compose_remote.log
sleep 1m
Sleeping for 1 minute is OK in my experience for waiting for the docker containers to spin up. But if you want to make it a closed-loop check, add logic to wait for the slowest container, usually the backend service (see the sketch above).
# Start Docker Containers
docker compose -f compose_remote.yaml up -d > ${LOG_PATH}/start_services_with_compose_remote.log
sleep 30s
Sleeping for 30 seconds is OK, but if you want to make it a closed-loop check, add logic to wait for the slowest docker container to spin up, usually the backend service (see the sketch above).
TEST_KEY needs to be set somewhere; the vllm-cpu docker image used as a dummy remote endpoint uses a fixed-version image; and logic needs to be added to wait for the docker containers to spin up before proceeding to validate the services.
echo "Build all the images with --no-cache, check docker_image_build.log for details..." | ||
service_list="chatqna chatqna-ui dataprep retriever nginx" | ||
docker compose -f build.yaml build ${service_list} --no-cache > ${LOG_PATH}/docker_image_build.log | ||
docker run --env "VLLM_SKIP_WARMUP=true" -p 8000:8000 --ipc=host public.ecr.aws/q9t5s3a7/vllm-cpu-release-repo:v0.9.2 --model TinyLlama/TinyLlama-1.1B-Chat-v1.0 --api-key $TEST_KEY |
Suggested change:
docker run --env "VLLM_SKIP_WARMUP=true" -p 8000:8000 --ipc=host public.ecr.aws/q9t5s3a7/vllm-cpu-release-repo:v0.9.2 --model TinyLlama/TinyLlama-1.1B-Chat-v1.0 --api-key $TEST_KEY
docker run --env "VLLM_SKIP_WARMUP=true" -p 8000:8000 --ipc=host public.ecr.aws/q9t5s3a7/vllm-cpu-release-repo:latest --model TinyLlama/TinyLlama-1.1B-Chat-v1.0 --api-key $TEST_KEY
Maybe good to use the latest one.
echo "Build all the images with --no-cache, check docker_image_build.log for details..." | ||
service_list="chatqna chatqna-ui dataprep retriever nginx" | ||
docker compose -f build.yaml build ${service_list} --no-cache > ${LOG_PATH}/docker_image_build.log | ||
docker run --env "VLLM_SKIP_WARMUP=true" -p 8000:8000 --ipc=host public.ecr.aws/q9t5s3a7/vllm-cpu-release-repo:v0.9.2 --model TinyLlama/TinyLlama-1.1B-Chat-v1.0 --api-key $TEST_KEY |
Why is this docker run command inside the docker build function? You could move it to start_services and sleep in between; a sketch follows.
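A sketch of that restructuring, assuming start_services is the function that brings up the compose stack. Note the added -d flag so the vLLM container runs detached rather than blocking the script:

function start_services() {
    # Launch the dummy remote endpoint in the background, then give vLLM
    # time to load the model before the compose services try to reach it.
    docker run -d --env "VLLM_SKIP_WARMUP=true" -p 8000:8000 --ipc=host \
        public.ecr.aws/q9t5s3a7/vllm-cpu-release-repo:v0.9.2 \
        --model TinyLlama/TinyLlama-1.1B-Chat-v1.0 --api-key $TEST_KEY
    sleep 1m

    docker compose -f compose_remote.yaml up -d > ${LOG_PATH}/start_services_with_compose_remote.log
}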
docker build --no-cache -t ${REGISTRY}/comps-base:${TAG} --build-arg https_proxy=$https_proxy --build-arg http_proxy=$http_proxy -f Dockerfile .
popd && sleep 1s

git clone https://github.com/vllm-project/vllm.git && cd vllm
No need to rebuild vllm; just use the public release (see the sketch below).
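That is, instead of cloning and building from source, the script could pull the prebuilt image already used elsewhere in these tests (a sketch):

# Use the public vllm-cpu release image rather than building vllm from source.
docker pull public.ecr.aws/q9t5s3a7/vllm-cpu-release-repo:v0.9.2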
echo "Build all the images with --no-cache, check docker_image_build.log for details..." | ||
service_list="docsum docsum-gradio-ui whisper llm-docsum" | ||
docker compose -f build.yaml build ${service_list} --no-cache > ${LOG_PATH}/docker_image_build.log | ||
docker run --env "VLLM_SKIP_WARMUP=true" -p 8000:8000 --ipc=host public.ecr.aws/q9t5s3a7/vllm-cpu-release-repo:v0.9.2 --model TinyLlama/TinyLlama-1.1B-Chat-v1.0 --api-key $TEST_KEY |
same here.
service_list="codegen codegen-gradio-ui dataprep retriever embedding" | ||
|
||
docker compose -f build.yaml build ${service_list} --no-cache > ${LOG_PATH}/docker_image_build.log | ||
docker run --env "VLLM_SKIP_WARMUP=true" -p 8000:8000 --ipc=host public.ecr.aws/q9t5s3a7/vllm-cpu-release-repo:v0.9.2 --model TinyLlama/TinyLlama-1.1B-Chat-v1.0 --api-key $TEST_KEY |
same here
echo "Build all the images with --no-cache, check docker_image_build.log for details..." | ||
service_list="docsum docsum-gradio-ui whisper llm-docsum" | ||
docker compose -f build.yaml build ${service_list} --no-cache > ${LOG_PATH}/docker_image_build.log | ||
docker run --env "VLLM_SKIP_WARMUP=true" -p 8000:8000 --ipc=host public.ecr.aws/q9t5s3a7/vllm-cpu-release-repo:v0.9.2 --model TinyLlama/TinyLlama-1.1B-Chat-v1.0 --api-key $TEST_KEY |
same here
Description
Test script for remote endpoint
Issues
List the issue or RFC link this PR is working on. If there is no such link, please mark it as n/a.
Type of change
List the type of change like below. Please delete options that are not relevant.
Dependencies
NA
Tests
Describe the tests that you ran to verify your changes.