diff --git a/FinanceAgent/README.md b/FinanceAgent/README.md
index 64ce01cc0a..640f7113d0 100644
--- a/FinanceAgent/README.md
+++ b/FinanceAgent/README.md
@@ -1,6 +1,26 @@
-# Finance Agent
+# Finance Agent Example
 
-## 1. Overview
+## Table of Contents
+
+- [Overview](#overview)
+- [Problem Motivation](#problem-motivation)
+- [Architecture](#architecture)
+  - [High-Level Diagram](#high-level-diagram)
+  - [OPEA Microservices Diagram for Data Handling](#opea-microservices-diagram-for-data-handling)
+- [Deployment Options](#deployment-options)
+- [Contribution](#contribution)
+
+## Overview
+
+The Finance Agent exemplifies a hierarchical multi-agent system designed to streamline financial document processing and analysis for users. It offers three core functionalities: summarizing lengthy financial documents, answering queries related to these documents, and conducting research to generate investment reports on public companies.
+
+## Problem Motivation
+
+Navigating and analyzing extensive financial documents can be both challenging and time-consuming. Users often need concise summaries, answers to specific queries, or comprehensive investment reports. The Finance Agent effectively addresses these needs by automating document summarization, query answering, and research tasks, thereby enhancing productivity and decision-making efficiency.
+
+Users interact with the system through a graphical user interface (UI), where a supervisor agent manages requests by delegating tasks to worker agents or the summarization microservice. The system also supports document uploads via the UI for processing.
+
+## Architecture
+
+### High-Level Diagram
 
 The architecture of this Finance Agent example is shown in the figure below. The agent is a hierarchical multi-agent system and has 3 main functions:
 
@@ -12,6 +32,8 @@ The user interacts with the supervisor agent through the graphical UI. The super
 
 ![Finance Agent Architecture](assets/finance_agent_arch.png)
 
+### OPEA Microservices Diagram for Data Handling
+
 The architectural diagram of the `dataprep` microservice is shown below. We use [docling](https://github.com/docling-project/docling) to extract text from PDFs and URLs into markdown format. Both the full document content and tables are extracted. We then use an LLM to extract metadata from the document, including the company name, year, quarter, document type, and document title. The full document markdown then gets chunked, and LLM is used to summarize each chunk, and the summaries are embedded and saved to a vector database. Each table is also summarized by LLM and the summaries are embedded and saved to the vector database. The chunks and tables are also saved into a KV store. The pipeline is designed as such to improve retrieval accuracy of the `search_knowledge_base` tool used by the Question Answering worker agent.
 
 ![dataprep architecture](assets/fin_agent_dataprep.png)
 
@@ -30,154 +52,16 @@ The Question Answering worker agent uses `search_knowledge_base` tool to get rel
 
 ![finqa search tool arch](assets/finqa_tool.png)
 
-## 2. 
Getting started - -### 2.1 Download repos - -```bash -mkdir /path/to/your/workspace/ -export WORKDIR=/path/to/your/workspace/ -cd $WORKDIR -git clone https://github.com/opea-project/GenAIExamples.git -``` - -### 2.2 Set up env vars - -```bash -export ip_address="External_Public_IP" -export no_proxy=${your_no_proxy},${ip_address} -export HF_CACHE_DIR=/path/to/your/model/cache/ -export HF_TOKEN= -export FINNHUB_API_KEY= # go to https://finnhub.io/ to get your free api key -export FINANCIAL_DATASETS_API_KEY= # go to https://docs.financialdatasets.ai/ to get your free api key -``` - -### 2.3 [Optional] Build docker images - -Only needed when docker pull failed. - -```bash -cd $WORKDIR/GenAIExamples/FinanceAgent/docker_image_build -# get GenAIComps repo -git clone https://github.com/opea-project/GenAIComps.git -# build the images -docker compose -f build.yaml build --no-cache -``` - -If deploy on Gaudi, also need to build vllm image. - -```bash -cd $WORKDIR -git clone https://github.com/HabanaAI/vllm-fork.git -# get the latest release tag of vllm gaudi -cd vllm-fork -VLLM_VER=$(git describe --tags "$(git rev-list --tags --max-count=1)") -echo "Check out vLLM tag ${VLLM_VER}" -git checkout ${VLLM_VER} -docker build --no-cache -f Dockerfile.hpu -t opea/vllm-gaudi:latest --shm-size=128g . --build-arg https_proxy=$https_proxy --build-arg http_proxy=$http_proxy -``` - -## 3. Deploy with docker compose - -### 3.1 Launch vllm endpoint - -Below is the command to launch a vllm endpoint on Gaudi that serves `meta-llama/Llama-3.3-70B-Instruct` model on 4 Gaudi cards. - -```bash -cd $WORKDIR/GenAIExamples/FinanceAgent/docker_compose/intel/hpu/gaudi -bash launch_vllm.sh -``` - -### 3.2 Prepare knowledge base - -The commands below will upload some example files into the knowledge base. You can also upload files through UI. - -First, launch the redis databases and the dataprep microservice. - -```bash -# inside $WORKDIR/GenAIExamples/FinanceAgent/docker_compose/intel/hpu/gaudi/ -bash launch_dataprep.sh -``` - -Validate datat ingest data and retrieval from database: - -```bash -python $WORKDIR/GenAIExamples/FinanceAgent/tests/test_redis_finance.py --port 6007 --test_option ingest -python $WORKDIR/GenAIExamples/FinanceAgent/tests/test_redis_finance.py --port 6007 --test_option get -``` - -### 3.3 Launch the multi-agent system - -The command below will launch 3 agent microservices, 1 docsum microservice, 1 UI microservice. - -```bash -# inside $WORKDIR/GenAIExamples/FinanceAgent/docker_compose/intel/hpu/gaudi/ -bash launch_agents.sh -``` - -### 3.4 Validate agents - -FinQA Agent: - -```bash -export agent_port="9095" -prompt="What is Gap's revenue in 2024?" 
-python3 $WORKDIR/GenAIExamples/FinanceAgent/tests/test.py --prompt "$prompt" --agent_role "worker" --ext_port $agent_port
-```
-
-Research Agent:
-
-```bash
-export agent_port="9096"
-prompt="generate NVDA financial research report"
-python3 $WORKDIR/GenAIExamples/FinanceAgent/tests/test.py --prompt "$prompt" --agent_role "worker" --ext_port $agent_port --tool_choice "get_current_date" --tool_choice "get_share_performance"
-```
-
-Supervisor Agent single turns:
-
-```bash
-export agent_port="9090"
-python3 $WORKDIR/GenAIExamples/FinanceAgent/tests/test.py --agent_role "supervisor" --ext_port $agent_port --stream
-```
-
-Supervisor Agent multi turn:
-
-```bash
-python3 $WORKDIR/GenAIExamples/FinanceAgent/tests/test.py --agent_role "supervisor" --ext_port $agent_port --multi-turn --stream
-
-```
-
-## How to interact with the agent system with UI
-
-The UI microservice is launched in the previous step with the other microservices.
-To see the UI, open a web browser to `http://${ip_address}:5175` to access the UI. Note the `ip_address` here is the host IP of the UI microservice.
-
-1. Create Admin Account with a random value
-
-2. Enter the endpoints in the `Connections` settings
-
-   First, click on the user icon in the upper right corner to open `Settings`. Click on `Admin Settings`. Click on `Connections`.
-
-   Then, enter the supervisor agent endpoint in the `OpenAI API` section: `http://${ip_address}:9090/v1`. Enter the API key as "empty". Add an arbitrary model id in `Model IDs`, for example, "opea_agent". The `ip_address` here should be the host ip of the agent microservice.
-
-   Then, enter the dataprep endpoint in the `Icloud File API` section. You first need to enable `Icloud File API` by clicking on the button on the right to turn it into green and then enter the endpoint url, for example, `http://${ip_address}:6007/v1`. The `ip_address` here should be the host ip of the dataprep microservice.
-
-   You should see screen like the screenshot below when the settings are done.
-
-![opea-agent-setting](assets/ui_connections_settings.png)
-
-3. Upload documents with UI
-
-   Click on the `Workplace` icon in the top left corner. Click `Knowledge`. Click on the "+" sign to the right of `Icloud Knowledge`. You can paste an url in the left hand side of the pop-up window, or upload a local file by click on the cloud icon on the right hand side of the pop-up window. Then click on the `Upload Confirm` button. Wait till the processing is done and the pop-up window will be closed on its own when the data ingestion is done. See the screenshot below.
-
-   Note: the data ingestion may take a few minutes depending on the length of the document. Please wait patiently and do not close the pop-up window.
+## Deployment Options
 
-![upload-doc-ui](assets/upload_doc_ui.png)
+This Finance Agent example can be deployed manually using Docker Compose.
 
-4. Test agent with UI
+| Hardware                        | Deployment Mode      | Guide Link                                                                |
+| :------------------------------ | :------------------- | :------------------------------------------------------------------------ |
+| Intel® Gaudi® AI Accelerator   | Single Node (Docker) | [Gaudi Docker Compose Guide](./docker_compose/intel/hpu/gaudi/README.md)  |
 
-   After the settings are done and documents are ingested, you can start to ask questions to the agent. Click on the `New Chat` icon in the top left corner, and type in your questions in the text box in the middle of the UI. 
+_Note: Building custom microservice images can be done using the resources in [GenAIComps](https://github.com/opea-project/GenAIComps)._
 
-   The UI will stream the agent's response tokens. You need to expand the `Thinking` tab to see the agent's reasoning process. After the agent made tool calls, you would also see the tool output after the tool returns output to the agent. Note: it may take a while to get the tool output back if the tool execution takes time.
+## Contribution
 
-![opea-agent-test](assets/opea-agent-test.png)
+We welcome contributions to the OPEA project. Please refer to the [contribution guidelines](https://github.com/opea-project/docs/blob/main/community/CONTRIBUTING.md) for more information.
diff --git a/FinanceAgent/docker_compose/intel/hpu/gaudi/README.md b/FinanceAgent/docker_compose/intel/hpu/gaudi/README.md
new file mode 100644
index 0000000000..79f0a9dec9
--- /dev/null
+++ b/FinanceAgent/docker_compose/intel/hpu/gaudi/README.md
@@ -0,0 +1,205 @@
+# Deploy Finance Agent on Intel® Gaudi® AI Accelerator with Docker Compose
+
+This README provides instructions for deploying the Finance Agent application using Docker Compose on systems equipped with Intel® Gaudi® AI Accelerators.
+
+## Table of Contents
+
+- [Overview](#overview)
+- [Prerequisites](#prerequisites)
+- [Start Deployment](#start-deployment)
+- [Validate Services](#validate-services)
+- [Accessing the User Interface (UI)](#accessing-the-user-interface-ui)
+
+## Overview
+
+This guide focuses on running the pre-configured Finance Agent service using Docker Compose on Intel® Gaudi® AI Accelerators. It leverages containers optimized for Gaudi for the LLM serving component, along with CPU-based containers for other microservices like embedding, retrieval, data preparation, and the UI.
+
+## Prerequisites
+
+- Docker and Docker Compose installed.
+- Intel® Gaudi® AI Accelerator(s) with the necessary drivers and software stack installed on the host system. (Refer to the Intel Gaudi documentation.)
+- Git installed (for cloning the repository).
+- Hugging Face Hub API token (for downloading models).
+- Access to the internet (or a private model cache).
+- Finnhub API key. Go to https://finnhub.io/ to get your free API key.
+- Financial Datasets API key. Go to https://docs.financialdatasets.ai/ to get your free API key.
+
+Clone the GenAIExamples repository:
+
+```shell
+mkdir /path/to/your/workspace/
+export WORKDIR=/path/to/your/workspace/
+cd $WORKDIR
+git clone https://github.com/opea-project/GenAIExamples.git
+cd GenAIExamples/FinanceAgent/docker_compose/intel/hpu/gaudi
+```
+
+## Start Deployment
+
+This uses the default vLLM-based deployment profile (`vllm-gaudi-server`).
+
+### Configure Environment
+
+Set required environment variables in your shell:
+
+```shell
+# Path to your model cache
+export HF_CACHE_DIR="./data"
+# Some models from Hugging Face require approval beforehand. Ensure you have the necessary permissions to access them.
+export HF_TOKEN="your_huggingface_token"
+export FINNHUB_API_KEY="your-finnhub-api-key"
+export FINANCIAL_DATASETS_API_KEY="your-financial-datasets-api-key"
+
+# Optional: Configure HOST_IP if needed
+# Replace with your host's external IP address (do not use localhost or 127.0.0.1).
+# export HOST_IP=$(hostname -I | awk '{print $1}')
+
+# Optional: Configure proxy if needed
+# export HTTP_PROXY="${http_proxy}"
+# export HTTPS_PROXY="${https_proxy}"
+# export NO_PROXY="${NO_PROXY},${HOST_IP}"
+
+source ../../set_env.sh
+```
+
+Note: The compose file might read additional variables from set_env.sh. Ensure all required variables like ports (LLM_SERVICE_PORT, TEI_EMBEDDER_PORT, etc.) are set if not using defaults from the compose file. For instance, you can change the served LLM by overriding `LLM_MODEL_ID` before sourcing set_env.sh, as sketched below.
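+
+A minimal sketch of such an override (the variable names come from set_env.sh; the smaller model id shown here is only an illustrative assumption, pick any Gaudi-compatible model you have access to):
+
+```shell
+# Set overrides BEFORE sourcing set_env.sh; its ${VAR:-default}
+# fallbacks only apply when a variable is not already set.
+export LLM_MODEL_ID="meta-llama/Llama-3.1-8B-Instruct"  # hypothetical smaller model
+export NUM_CARDS=1                                      # Gaudi cards used for tensor parallelism
+export MAX_LEN=8192                                     # max sequence length to capture
+source ../../set_env.sh
+```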
+
+### Start Services
+
+#### Deploy with Docker Compose
+
+Below is the command to launch the following services:
+
+- vllm-gaudi-server
+- tei-embedding-serving
+- redis-vector-db
+- redis-kv-store
+- dataprep-redis-server-finance
+- finqa-agent-endpoint
+- research-agent-endpoint
+- docsum-vllm-gaudi
+- supervisor-agent-endpoint
+- agent-ui
+
+```shell
+docker compose -f compose.yaml up -d
+```
+
+#### [Optional] Build docker images
+
+This is only needed if a Docker image is unavailable or the pull operation fails.
+
+```bash
+cd $WORKDIR/GenAIExamples/FinanceAgent/docker_image_build
+# get GenAIComps repo
+git clone https://github.com/opea-project/GenAIComps.git
+# build the images
+docker compose -f build.yaml build --no-cache
+```
+
+If deploying on Gaudi, you also need to build the vLLM image.
+
+```bash
+cd $WORKDIR
+git clone https://github.com/HabanaAI/vllm-fork.git
+# get the latest release tag of vllm gaudi
+cd vllm-fork
+VLLM_VER=$(git describe --tags "$(git rev-list --tags --max-count=1)")
+echo "Check out vLLM tag ${VLLM_VER}"
+git checkout ${VLLM_VER}
+docker build --no-cache -f Dockerfile.hpu -t opea/vllm-gaudi:latest --shm-size=128g . --build-arg https_proxy=$https_proxy --build-arg http_proxy=$http_proxy
+```
+
+## Validate Services
+
+Wait several minutes for models to download and services to initialize (Gaudi initialization can take time). Check container logs with `docker compose logs -f <service_name>`, especially for `vllm-gaudi-server`:
+
+```bash
+docker logs --tail 2000 -f vllm-gaudi-server
+```
+
+> Below is the expected output of the `vllm-gaudi-server` service.
+
+```
+    INFO:     Started server process [1]
+    INFO:     Waiting for application startup.
+    INFO:     Application startup complete.
+    INFO:     Uvicorn running on http://0.0.0.0:8000 (Press CTRL+C to quit)
+    INFO:     : - "GET /health HTTP/1.1" 200 OK
+```
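+
+Once the log shows the server is up, you can optionally send a request directly to the LLM endpoint as a sanity check. This is a minimal sketch: port 8086 is the host-side mapping of the vLLM container (`8086:8000` in compose.yaml), and the route is vLLM's OpenAI-compatible chat completions API.
+
+```bash
+# Query the vLLM endpoint directly (the model id must match LLM_MODEL_ID)
+curl http://${HOST_IP}:8086/v1/chat/completions \
+    -H "Content-Type: application/json" \
+    -d '{
+        "model": "meta-llama/Llama-3.3-70B-Instruct",
+        "messages": [{"role": "user", "content": "What is a 10-K filing?"}],
+        "max_tokens": 64
+    }'
+```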
+
+### Validate Data Services
+
+Ingest example data and validate retrieval from the database:
+
+```bash
+python $WORKDIR/GenAIExamples/FinanceAgent/tests/test_redis_finance.py --port 6007 --test_option ingest
+python $WORKDIR/GenAIExamples/FinanceAgent/tests/test_redis_finance.py --port 6007 --test_option get
+```
+
+### Validate Agents
+
+FinQA Agent:
+
+```bash
+export agent_port="9095"
+prompt="What is Gap's revenue in 2024?"
+python3 $WORKDIR/GenAIExamples/FinanceAgent/tests/test.py --prompt "$prompt" --agent_role "worker" --ext_port $agent_port
+```
+
+Research Agent:
+
+```bash
+export agent_port="9096"
+prompt="generate NVDA financial research report"
+python3 $WORKDIR/GenAIExamples/FinanceAgent/tests/test.py --prompt "$prompt" --agent_role "worker" --ext_port $agent_port --tool_choice "get_current_date" --tool_choice "get_share_performance"
+```
+
+Supervisor Agent single turn:
+
+```bash
+export agent_port="9090"
+python3 $WORKDIR/GenAIExamples/FinanceAgent/tests/test.py --agent_role "supervisor" --ext_port $agent_port --stream
+```
+
+Supervisor Agent multi turn:
+
+```bash
+python3 $WORKDIR/GenAIExamples/FinanceAgent/tests/test.py --agent_role "supervisor" --ext_port $agent_port --multi-turn --stream
+```
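+
+The tests above drive the agents through test.py. Each agent endpoint also exposes an OpenAI-style chat completions route (the worker URLs in set_env.sh end in `/v1/chat/completions`, and the UI connects to the supervisor as an `OpenAI API` endpoint). A hand-rolled request to the supervisor might look like the sketch below; the payload shape is an assumption based on that OpenAI-compatible surface, and the model name is arbitrary.
+
+```bash
+# Minimal sketch of a direct request to the supervisor agent
+curl http://${HOST_IP}:9090/v1/chat/completions \
+    -H "Content-Type: application/json" \
+    -d '{
+        "model": "opea_agent",
+        "messages": [{"role": "user", "content": "What was Gap revenue in 2024?"}],
+        "stream": true
+    }'
+```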
+
+## Accessing the User Interface (UI)
+
+The UI microservice is launched in the previous step along with the other microservices.
+To access the UI, open a web browser and navigate to `http://${HOST_IP}:5175`. Note that `HOST_IP` here is the host IP of the UI microservice.
+
+1. Create an Admin Account with a random value
+
+2. Enter the endpoints in the `Connections` settings
+
+   First, click on the user icon in the upper right corner to open `Settings`. Click on `Admin Settings`. Click on `Connections`.
+
+   Then, enter the supervisor agent endpoint in the `OpenAI API` section: `http://${HOST_IP}:9090/v1`. Enter the API key as "empty". Add an arbitrary model id in `Model IDs`, for example, "opea_agent". The `HOST_IP` here should be the host IP of the agent microservice.
+
+   Then, enter the dataprep endpoint in the `iCloud File API` section. You first need to enable `iCloud File API` by clicking on the button on the right to turn it green, and then enter the endpoint url, for example, `http://${HOST_IP}:6007/v1`. The `HOST_IP` here should be the host IP of the dataprep microservice.
+
+   You should see a screen like the screenshot below when the settings are done.
+
+![opea-agent-setting](../../../../assets/ui_connections_settings.png)
+
+3. Upload documents with UI
+
+   Click on the `Workplace` icon in the top left corner. Click `Knowledge`. Click on the "+" sign to the right of `iCloud Knowledge`. You can paste a URL in the left-hand side of the pop-up window, or upload a local file by clicking on the cloud icon on the right-hand side of the pop-up window. Then click on the `Upload Confirm` button. Wait until the processing is done; the pop-up window will close on its own when the data ingestion is finished. See the screenshot below.
+
+   Note: the data ingestion may take a few minutes depending on the length of the document. Please wait patiently and do not close the pop-up window.
+
+![upload-doc-ui](../../../../assets/upload_doc_ui.png)
+
+4. Test agent with UI
+
+   After the settings are done and documents are ingested, you can start to ask questions to the agent. Click on the `New Chat` icon in the top left corner, and type in your questions in the text box in the middle of the UI.
+
+   The UI will stream the agent's response tokens. You need to expand the `Thinking` tab to see the agent's reasoning process. After the agent makes tool calls, you will also see the tool output once the tool returns it to the agent. Note: it may take a while to get the tool output back if the tool execution takes time.
+
+![opea-agent-test](../../../../assets/opea-agent-test.png)
diff --git a/FinanceAgent/docker_compose/intel/hpu/gaudi/compose.yaml b/FinanceAgent/docker_compose/intel/hpu/gaudi/compose.yaml
index 997aade843..e788c5899a 100644
--- a/FinanceAgent/docker_compose/intel/hpu/gaudi/compose.yaml
+++ b/FinanceAgent/docker_compose/intel/hpu/gaudi/compose.yaml
@@ -1,37 +1,146 @@
 # Copyright (C) 2024 Intel Corporation
 # SPDX-License-Identifier: Apache-2.0
+
+x-common-environment:
+  &common-env
+  no_proxy: ${NO_PROXY}
+  http_proxy: ${HTTP_PROXY}
+  https_proxy: ${HTTPS_PROXY}
+
+x-common-agent-environment:
+  &common-agent-env
+  <<: *common-env
+  HUGGINGFACEHUB_API_TOKEN: ${HUGGINGFACEHUB_API_TOKEN}
+  llm_endpoint_url: ${LLM_ENDPOINT}
+  model: ${LLM_MODEL_ID}
+  REDIS_URL_VECTOR: ${REDIS_URL_VECTOR}
+  REDIS_URL_KV: ${REDIS_URL_KV}
+  TEI_EMBEDDING_ENDPOINT: ${TEI_EMBEDDING_ENDPOINT}
+  ip_address: ${HOST_IP}
+  strategy: react_llama
+  require_human_feedback: false
+
 services:
+
+  vllm-service:
+    image: ${REGISTRY:-opea}/vllm-gaudi:${TAG:-latest}
+    container_name: vllm-gaudi-server
+    ports:
+      - "8086:8000"
+    volumes:
+      - ${HF_CACHE_DIR:-./data}:/data
+    environment:
+      <<: *common-env
+      HF_TOKEN: ${HF_TOKEN}
+      HUGGING_FACE_HUB_TOKEN: ${HF_TOKEN}
+      HF_HOME: /data
+      HABANA_VISIBLE_DEVICES: all
+      OMPI_MCA_btl_vader_single_copy_mechanism: none
+      LLM_MODEL_ID: ${LLM_MODEL_ID}
+      VLLM_TORCH_PROFILER_DIR: "/mnt"
+      VLLM_SKIP_WARMUP: true
+      PT_HPU_ENABLE_LAZY_COLLECTIVES: true
+    healthcheck:
+      test: ["CMD-SHELL", "curl -f http://$HOST_IP:8086/health || exit 1"]
+      interval: 10s
+      timeout: 10s
+      retries: 100
+    runtime: habana
+    cap_add:
+      - SYS_NICE
+    ipc: host
+    command: --model ${LLM_MODEL_ID} --tensor-parallel-size ${NUM_CARDS} --host 0.0.0.0 --port 8000 --max-seq-len-to-capture $MAX_LEN
+
+  tei-embedding-serving:
+    image: ghcr.io/huggingface/text-embeddings-inference:cpu-1.5
+    container_name: tei-embedding-serving
+    entrypoint: /bin/sh -c "apt-get update && apt-get install -y curl && text-embeddings-router --json-output --model-id ${EMBEDDING_MODEL_ID} --auto-truncate"
+    ports:
+      - "${TEI_EMBEDDER_PORT:-10221}:80"
+    volumes:
+      - ${HF_CACHE_DIR:-./data}:/data
+    shm_size: 1g
+    environment:
+      <<: *common-env
+      HF_TOKEN: ${HF_TOKEN}
+      host_ip: ${HOST_IP}
+    healthcheck:
+      test: ["CMD", "curl", "-f", "http://${HOST_IP}:${TEI_EMBEDDER_PORT}/health"]
+      interval: 10s
+      timeout: 6s
+      retries: 48
+
+  redis-vector-db:
+    image: redis/redis-stack:7.2.0-v9
+    container_name: redis-vector-db
+    ports:
+      - "${REDIS_PORT1:-6379}:6379"
+      - "${REDIS_PORT2:-8001}:8001"
+    environment:
+      <<: *common-env
+    healthcheck:
+      test: ["CMD", "redis-cli", "ping"]
+      timeout: 10s
+      retries: 3
+      start_period: 10s
+
+  redis-kv-store:
+    image: redis/redis-stack:7.2.0-v9
+    container_name: redis-kv-store
+    ports:
+      - "${REDIS_PORT3:-6380}:6379"
+      - "${REDIS_PORT4:-8002}:8001"
+    environment:
+      <<: *common-env
+    healthcheck:
+      test: ["CMD", "redis-cli", "ping"]
+      timeout: 10s
+      retries: 3
+      start_period: 10s
+
+  dataprep-redis-finance:
+    image: ${REGISTRY:-opea}/dataprep:${TAG:-latest}
+    container_name: dataprep-redis-server-finance
+    depends_on:
+      redis-vector-db:
+        condition: service_healthy
+      redis-kv-store:
+        condition: service_healthy
+      tei-embedding-serving:
+        condition: service_healthy
+    ports:
+      - "${DATAPREP_PORT:-6007}:5000"
+    environment:
+      <<: 
*common-env + DATAPREP_COMPONENT_NAME: ${DATAPREP_COMPONENT_NAME} + REDIS_URL_VECTOR: ${REDIS_URL_VECTOR} + REDIS_URL_KV: ${REDIS_URL_KV} + TEI_EMBEDDING_ENDPOINT: ${TEI_EMBEDDING_ENDPOINT} + LLM_ENDPOINT: ${LLM_ENDPOINT} + LLM_MODEL: ${LLM_MODEL_ID} + HUGGINGFACEHUB_API_TOKEN: ${HF_TOKEN} + HF_TOKEN: ${HF_TOKEN} + LOGFLAG: true + worker-finqa-agent: image: opea/agent:latest container_name: finqa-agent-endpoint volumes: - ${TOOLSET_PATH}:/home/user/tools/ - ${PROMPT_PATH}:/home/user/prompts/ + ipc: host ports: - "9095:9095" - ipc: host environment: - ip_address: ${ip_address} - strategy: react_llama + <<: *common-agent-env with_memory: false - recursion_limit: ${recursion_limit_worker} - llm_engine: vllm - HUGGINGFACEHUB_API_TOKEN: ${HUGGINGFACEHUB_API_TOKEN} - llm_endpoint_url: ${LLM_ENDPOINT_URL} - model: ${LLM_MODEL_ID} + recursion_limit: ${RECURSION_LIMIT_WORKER} temperature: ${TEMPERATURE} max_new_tokens: ${MAX_TOKENS} stream: false tools: /home/user/tools/finqa_agent_tools.yaml custom_prompt: /home/user/prompts/finqa_prompt.py - require_human_feedback: false - no_proxy: ${no_proxy} - http_proxy: ${http_proxy} - https_proxy: ${https_proxy} - REDIS_URL_VECTOR: $REDIS_URL_VECTOR - REDIS_URL_KV: $REDIS_URL_KV - TEI_EMBEDDING_ENDPOINT: $TEI_EMBEDDING_ENDPOINT port: 9095 worker-research-agent: @@ -40,67 +149,20 @@ services: volumes: - ${TOOLSET_PATH}:/home/user/tools/ - ${PROMPT_PATH}:/home/user/prompts/ + ipc: host ports: - "9096:9096" - ipc: host environment: - ip_address: ${ip_address} - strategy: react_llama + <<: *common-agent-env with_memory: false - recursion_limit: 25 - llm_engine: vllm - HUGGINGFACEHUB_API_TOKEN: ${HUGGINGFACEHUB_API_TOKEN} - llm_endpoint_url: ${LLM_ENDPOINT_URL} - model: ${LLM_MODEL_ID} + recursion_limit: ${RECURSION_LIMIT_WORKER} stream: false tools: /home/user/tools/research_agent_tools.yaml custom_prompt: /home/user/prompts/research_prompt.py - require_human_feedback: false - no_proxy: ${no_proxy} - http_proxy: ${http_proxy} - https_proxy: ${https_proxy} FINNHUB_API_KEY: ${FINNHUB_API_KEY} FINANCIAL_DATASETS_API_KEY: ${FINANCIAL_DATASETS_API_KEY} port: 9096 - supervisor-react-agent: - image: opea/agent:latest - container_name: supervisor-agent-endpoint - depends_on: - - worker-finqa-agent - - worker-research-agent - volumes: - - ${TOOLSET_PATH}:/home/user/tools/ - - ${PROMPT_PATH}:/home/user/prompts/ - ports: - - "9090:9090" - ipc: host - environment: - ip_address: ${ip_address} - strategy: react_llama - with_memory: true - recursion_limit: ${recursion_limit_supervisor} - llm_engine: vllm - HUGGINGFACEHUB_API_TOKEN: ${HUGGINGFACEHUB_API_TOKEN} - llm_endpoint_url: ${LLM_ENDPOINT_URL} - model: ${LLM_MODEL_ID} - temperature: ${TEMPERATURE} - max_new_tokens: ${MAX_TOKENS} - stream: true - tools: /home/user/tools/supervisor_agent_tools.yaml - custom_prompt: /home/user/prompts/supervisor_prompt.py - require_human_feedback: false - no_proxy: ${no_proxy} - http_proxy: ${http_proxy} - https_proxy: ${https_proxy} - WORKER_FINQA_AGENT_URL: $WORKER_FINQA_AGENT_URL - WORKER_RESEARCH_AGENT_URL: $WORKER_RESEARCH_AGENT_URL - DOCSUM_ENDPOINT: $DOCSUM_ENDPOINT - REDIS_URL_VECTOR: $REDIS_URL_VECTOR - REDIS_URL_KV: $REDIS_URL_KV - TEI_EMBEDDING_ENDPOINT: $TEI_EMBEDDING_ENDPOINT - port: 9090 - docsum-vllm-gaudi: image: opea/llm-docsum:latest container_name: docsum-vllm-gaudi @@ -108,26 +170,48 @@ services: - ${DOCSUM_PORT:-9000}:9000 ipc: host environment: - no_proxy: ${no_proxy} - http_proxy: ${http_proxy} - https_proxy: ${https_proxy} + <<: *common-env LLM_ENDPOINT: 
${LLM_ENDPOINT} LLM_MODEL_ID: ${LLM_MODEL_ID} HF_TOKEN: ${HF_TOKEN} LOGFLAG: ${LOGFLAG:-False} MAX_INPUT_TOKENS: ${MAX_INPUT_TOKENS} MAX_TOTAL_TOKENS: ${MAX_TOTAL_TOKENS} - DocSum_COMPONENT_NAME: ${DocSum_COMPONENT_NAME:-OpeaDocSumvLLM} + DocSum_COMPONENT_NAME: ${DOCSUM_COMPONENT_NAME:-OpeaDocSumvLLM} restart: unless-stopped + supervisor-react-agent: + image: opea/agent:latest + container_name: supervisor-agent-endpoint + volumes: + - ${TOOLSET_PATH}:/home/user/tools/ + - ${PROMPT_PATH}:/home/user/prompts/ + ipc: host + depends_on: + - worker-finqa-agent + - worker-research-agent + ports: + - "9090:9090" + environment: + <<: *common-agent-env + with_memory: "true" + recursion_limit: ${RECURSION_LIMIT_SUPERVISOR} + temperature: ${TEMPERATURE} + max_new_tokens: ${MAX_TOKENS} + stream: "true" + tools: /home/user/tools/supervisor_agent_tools.yaml + custom_prompt: /home/user/prompts/supervisor_prompt.py + WORKER_FINQA_AGENT_URL: ${WORKER_FINQA_AGENT_URL} + WORKER_RESEARCH_AGENT_URL: ${WORKER_RESEARCH_AGENT_URL} + DOCSUM_ENDPOINT: ${DOCSUM_ENDPOINT} + port: 9090 + agent-ui: image: opea/agent-ui:latest container_name: agent-ui environment: - host_ip: ${host_ip} - no_proxy: ${no_proxy} - http_proxy: ${http_proxy} - https_proxy: ${https_proxy} + <<: *common-env + host_ip: ${HOST_IP} ports: - "5175:8080" ipc: host diff --git a/FinanceAgent/docker_compose/intel/hpu/gaudi/dataprep_compose.yaml b/FinanceAgent/docker_compose/intel/hpu/gaudi/dataprep_compose.yaml deleted file mode 100644 index 5e4333c7d2..0000000000 --- a/FinanceAgent/docker_compose/intel/hpu/gaudi/dataprep_compose.yaml +++ /dev/null @@ -1,82 +0,0 @@ -# Copyright (C) 2025 Intel Corporation -# SPDX-License-Identifier: Apache-2.0 - -services: - tei-embedding-serving: - image: ghcr.io/huggingface/text-embeddings-inference:cpu-1.5 - container_name: tei-embedding-serving - entrypoint: /bin/sh -c "apt-get update && apt-get install -y curl && text-embeddings-router --json-output --model-id ${EMBEDDING_MODEL_ID} --auto-truncate" - ports: - - "${TEI_EMBEDDER_PORT:-10221}:80" - volumes: - - "./data:/data" - shm_size: 1g - environment: - no_proxy: ${no_proxy} - http_proxy: ${http_proxy} - https_proxy: ${https_proxy} - host_ip: ${host_ip} - HF_TOKEN: ${HF_TOKEN} - healthcheck: - test: ["CMD", "curl", "-f", "http://${host_ip}:${TEI_EMBEDDER_PORT}/health"] - interval: 10s - timeout: 6s - retries: 48 - - redis-vector-db: - image: redis/redis-stack:7.2.0-v9 - container_name: redis-vector-db - ports: - - "${REDIS_PORT1:-6379}:6379" - - "${REDIS_PORT2:-8001}:8001" - environment: - - no_proxy=${no_proxy} - - http_proxy=${http_proxy} - - https_proxy=${https_proxy} - healthcheck: - test: ["CMD", "redis-cli", "ping"] - timeout: 10s - retries: 3 - start_period: 10s - - redis-kv-store: - image: redis/redis-stack:7.2.0-v9 - container_name: redis-kv-store - ports: - - "${REDIS_PORT3:-6380}:6379" - - "${REDIS_PORT4:-8002}:8001" - environment: - - no_proxy=${no_proxy} - - http_proxy=${http_proxy} - - https_proxy=${https_proxy} - healthcheck: - test: ["CMD", "redis-cli", "ping"] - timeout: 10s - retries: 3 - start_period: 10s - - dataprep-redis-finance: - image: ${REGISTRY:-opea}/dataprep:${TAG:-latest} - container_name: dataprep-redis-server-finance - depends_on: - redis-vector-db: - condition: service_healthy - redis-kv-store: - condition: service_healthy - tei-embedding-serving: - condition: service_healthy - ports: - - "${DATAPREP_PORT:-6007}:5000" - environment: - no_proxy: ${no_proxy} - http_proxy: ${http_proxy} - https_proxy: ${https_proxy} - 
DATAPREP_COMPONENT_NAME: ${DATAPREP_COMPONENT_NAME} - REDIS_URL_VECTOR: ${REDIS_URL_VECTOR} - REDIS_URL_KV: ${REDIS_URL_KV} - TEI_EMBEDDING_ENDPOINT: ${TEI_EMBEDDING_ENDPOINT} - LLM_ENDPOINT: ${LLM_ENDPOINT} - LLM_MODEL: ${LLM_MODEL} - HUGGINGFACEHUB_API_TOKEN: ${HF_TOKEN} - HF_TOKEN: ${HF_TOKEN} - LOGFLAG: true diff --git a/FinanceAgent/docker_compose/intel/hpu/gaudi/launch_agents.sh b/FinanceAgent/docker_compose/intel/hpu/gaudi/launch_agents.sh deleted file mode 100644 index 55dcbb7d3d..0000000000 --- a/FinanceAgent/docker_compose/intel/hpu/gaudi/launch_agents.sh +++ /dev/null @@ -1,36 +0,0 @@ - -# Copyright (C) 2025 Intel Corporation -# SPDX-License-Identifier: Apache-2.0 - -export ip_address=$(hostname -I | awk '{print $1}') -export HUGGINGFACEHUB_API_TOKEN=${HF_TOKEN} -export TOOLSET_PATH=$WORKDIR/GenAIExamples/FinanceAgent/tools/ -echo "TOOLSET_PATH=${TOOLSET_PATH}" -export PROMPT_PATH=$WORKDIR/GenAIExamples/FinanceAgent/prompts/ -echo "PROMPT_PATH=${PROMPT_PATH}" -export recursion_limit_worker=12 -export recursion_limit_supervisor=10 - -vllm_port=8086 -export LLM_MODEL_ID="meta-llama/Llama-3.3-70B-Instruct" -export LLM_ENDPOINT_URL="http://${ip_address}:${vllm_port}" -export TEMPERATURE=0.5 -export MAX_TOKENS=4096 - -export WORKER_FINQA_AGENT_URL="http://${ip_address}:9095/v1/chat/completions" -export WORKER_RESEARCH_AGENT_URL="http://${ip_address}:9096/v1/chat/completions" - -export EMBEDDING_MODEL_ID="BAAI/bge-base-en-v1.5" -export TEI_EMBEDDING_ENDPOINT="http://${ip_address}:10221" -export REDIS_URL_VECTOR="redis://${ip_address}:6379" -export REDIS_URL_KV="redis://${ip_address}:6380" - -export MAX_INPUT_TOKENS=2048 -export MAX_TOTAL_TOKENS=4096 -export DocSum_COMPONENT_NAME="OpeaDocSumvLLM" -export DOCSUM_ENDPOINT="http://${ip_address}:9000/v1/docsum" - -export FINNHUB_API_KEY=${FINNHUB_API_KEY} -export FINANCIAL_DATASETS_API_KEY=${FINANCIAL_DATASETS_API_KEY} - -docker compose -f compose.yaml up -d diff --git a/FinanceAgent/docker_compose/intel/hpu/gaudi/launch_dataprep.sh b/FinanceAgent/docker_compose/intel/hpu/gaudi/launch_dataprep.sh deleted file mode 100644 index 9bb006c191..0000000000 --- a/FinanceAgent/docker_compose/intel/hpu/gaudi/launch_dataprep.sh +++ /dev/null @@ -1,15 +0,0 @@ -# Copyright (C) 2025 Intel Corporation -# SPDX-License-Identifier: Apache-2.0 - -export host_ip=${ip_address} -export DATAPREP_PORT="6007" -export TEI_EMBEDDER_PORT="10221" -export REDIS_URL_VECTOR="redis://${ip_address}:6379" -export REDIS_URL_KV="redis://${ip_address}:6380" -export LLM_MODEL=$model -export LLM_ENDPOINT="http://${ip_address}:${vllm_port}" -export DATAPREP_COMPONENT_NAME="OPEA_DATAPREP_REDIS_FINANCE" -export EMBEDDING_MODEL_ID="BAAI/bge-base-en-v1.5" -export TEI_EMBEDDING_ENDPOINT="http://${ip_address}:${TEI_EMBEDDER_PORT}" - -docker compose -f dataprep_compose.yaml up -d diff --git a/FinanceAgent/docker_compose/intel/hpu/gaudi/launch_vllm.sh b/FinanceAgent/docker_compose/intel/hpu/gaudi/launch_vllm.sh deleted file mode 100644 index 5d8d58641b..0000000000 --- a/FinanceAgent/docker_compose/intel/hpu/gaudi/launch_vllm.sh +++ /dev/null @@ -1,7 +0,0 @@ -# Copyright (C) 2025 Intel Corporation -# SPDX-License-Identifier: Apache-2.0 - -export LLM_MODEL_ID="meta-llama/Llama-3.3-70B-Instruct" -export MAX_LEN=16384 - -docker compose -f vllm_compose.yaml up -d diff --git a/FinanceAgent/docker_compose/intel/hpu/gaudi/vllm_compose.yaml b/FinanceAgent/docker_compose/intel/hpu/gaudi/vllm_compose.yaml deleted file mode 100644 index 8ca62e1e46..0000000000 --- 
a/FinanceAgent/docker_compose/intel/hpu/gaudi/vllm_compose.yaml +++ /dev/null @@ -1,35 +0,0 @@ - -# Copyright (C) 2025 Intel Corporation -# SPDX-License-Identifier: Apache-2.0 - -services: - vllm-service: - image: ${REGISTRY:-opea}/vllm-gaudi:${TAG:-latest} - container_name: vllm-gaudi-server - ports: - - "8086:8000" - volumes: - - ${HF_CACHE_DIR}:/data - environment: - no_proxy: ${no_proxy} - http_proxy: ${http_proxy} - https_proxy: ${https_proxy} - HF_TOKEN: ${HF_TOKEN} - HUGGING_FACE_HUB_TOKEN: ${HF_TOKEN} - HF_HOME: /data - HABANA_VISIBLE_DEVICES: all - OMPI_MCA_btl_vader_single_copy_mechanism: none - LLM_MODEL_ID: ${LLM_MODEL_ID} - VLLM_TORCH_PROFILER_DIR: "/mnt" - VLLM_SKIP_WARMUP: true - PT_HPU_ENABLE_LAZY_COLLECTIVES: true - healthcheck: - test: ["CMD-SHELL", "curl -f http://$host_ip:8086/health || exit 1"] - interval: 10s - timeout: 10s - retries: 100 - runtime: habana - cap_add: - - SYS_NICE - ipc: host - command: --model $LLM_MODEL_ID --tensor-parallel-size 4 --host 0.0.0.0 --port 8000 --max-seq-len-to-capture $MAX_LEN diff --git a/FinanceAgent/docker_compose/intel/set_env.sh b/FinanceAgent/docker_compose/intel/set_env.sh new file mode 100644 index 0000000000..16893f3ab5 --- /dev/null +++ b/FinanceAgent/docker_compose/intel/set_env.sh @@ -0,0 +1,89 @@ +#!/usr/bin/env bash + +# Copyright (C) 2024 Intel Corporation +# SPDX-License-Identifier: Apache-2.0 + +# Navigate to the parent directory and source the environment +pushd "../../" > /dev/null +source .set_env.sh +popd > /dev/null + +# Function to check if a variable is set +check_var() { + local var_name="$1" + local var_value="${!var_name}" + if [ -z "${var_value}" ]; then + echo "Error: ${var_name} is not set. Please set ${var_name}." + return 1 # Return an error code but do not exit the script + fi +} + +# Check critical variables +check_var "HF_TOKEN" +check_var "HOST_IP" + +# VLLM configuration +export VLLM_PORT="${VLLM_PORT:-8086}" +export VLLM_VOLUME="${VLLM_VOLUME:-/data2/huggingface}" +export VLLM_IMAGE="${VLLM_IMAGE:-opea/vllm-gaudi:latest}" +export LLM_MODEL_ID="${LLM_MODEL_ID:-meta-llama/Llama-3.3-70B-Instruct}" +export LLM_ENDPOINT="http://${HOST_IP}:${VLLM_PORT}" +export MAX_LEN="${MAX_LEN:-16384}" +export NUM_CARDS="${NUM_CARDS:-4}" +export HF_CACHE_DIR="${HF_CACHE_DIR:-"./data"}" + +# Data preparation and embedding configuration +export DATAPREP_PORT="${DATAPREP_PORT:-6007}" +export TEI_EMBEDDER_PORT="${TEI_EMBEDDER_PORT:-10221}" +export REDIS_URL_VECTOR="redis://${HOST_IP}:6379" +export REDIS_URL_KV="redis://${HOST_IP}:6380" +export DATAPREP_COMPONENT_NAME="${DATAPREP_COMPONENT_NAME:-OPEA_DATAPREP_REDIS_FINANCE}" +export EMBEDDING_MODEL_ID="${EMBEDDING_MODEL_ID:-BAAI/bge-base-en-v1.5}" +export TEI_EMBEDDING_ENDPOINT="http://${HOST_IP}:${TEI_EMBEDDER_PORT}" + +# Hugging Face API token +export HUGGINGFACEHUB_API_TOKEN="${HF_TOKEN}" + +# Recursion limits +export RECURSION_LIMIT_WORKER="${RECURSION_LIMIT_WORKER:-12}" +export RECURSION_LIMIT_SUPERVISOR="${RECURSION_LIMIT_SUPERVISOR:-10}" + +# LLM configuration +export TEMPERATURE="${TEMPERATURE:-0.5}" +export MAX_TOKENS="${MAX_TOKENS:-4096}" +export MAX_INPUT_TOKENS="${MAX_INPUT_TOKENS:-2048}" +export MAX_TOTAL_TOKENS="${MAX_TOTAL_TOKENS:-4096}" + +# Worker URLs +export WORKER_FINQA_AGENT_URL="http://${HOST_IP}:9095/v1/chat/completions" +export WORKER_RESEARCH_AGENT_URL="http://${HOST_IP}:9096/v1/chat/completions" + +# DocSum configuration +export DOCSUM_COMPONENT_NAME="${DOCSUM_COMPONENT_NAME:-"OpeaDocSumvLLM"}" +export 
DOCSUM_ENDPOINT="http://${HOST_IP}:9000/v1/docsum" + +# API keys +check_var "FINNHUB_API_KEY" +check_var "FINANCIAL_DATASETS_API_KEY" +export FINNHUB_API_KEY="${FINNHUB_API_KEY}" +export FINANCIAL_DATASETS_API_KEY="${FINANCIAL_DATASETS_API_KEY}" + + +# Toolset and prompt paths +if check_var "WORKDIR"; then + export TOOLSET_PATH=$WORKDIR/GenAIExamples/FinanceAgent/tools/ + export PROMPT_PATH=$WORKDIR/GenAIExamples/FinanceAgent/prompts/ + + echo "TOOLSET_PATH=${TOOLSET_PATH}" + echo "PROMPT_PATH=${PROMPT_PATH}" + + # Array of directories to check + REQUIRED_DIRS=("${TOOLSET_PATH}" "${PROMPT_PATH}") + + for dir in "${REQUIRED_DIRS[@]}"; do + if [ ! -d "${dir}" ]; then + echo "Error: Required directory does not exist: ${dir}" + exit 1 + fi + done +fi diff --git a/FinanceAgent/tests/test_compose_on_gaudi.sh b/FinanceAgent/tests/test_compose_on_gaudi.sh index 0f42813978..d534ffa122 100644 --- a/FinanceAgent/tests/test_compose_on_gaudi.sh +++ b/FinanceAgent/tests/test_compose_on_gaudi.sh @@ -6,33 +6,69 @@ set -xe export WORKPATH=$(dirname "$PWD") export WORKDIR=$WORKPATH/../../ echo "WORKDIR=${WORKDIR}" -export ip_address=$(hostname -I | awk '{print $1}') +export IP_ADDRESS=$(hostname -I | awk '{print $1}') +export HOST_IP=${IP_ADDRESS} LOG_PATH=$WORKPATH -#### env vars for LLM endpoint ############# -model=meta-llama/Llama-3.3-70B-Instruct -vllm_image=opea/vllm-gaudi:latest -vllm_port=8086 -vllm_image=$vllm_image -HF_CACHE_DIR=${model_cache:-"/data2/huggingface"} -vllm_volume=${HF_CACHE_DIR} -####################################### +# Proxy settings +export NO_PROXY="${NO_PROXY},${HOST_IP}" +export HTTP_PROXY="${http_proxy}" +export HTTPS_PROXY="${https_proxy}" + +export no_proxy="${no_proxy},${HOST_IP}" +export http_proxy="${http_proxy}" +export https_proxy="${https_proxy}" + +# VLLM configuration +MODEL=meta-llama/Llama-3.3-70B-Instruct +export VLLM_PORT="${VLLM_PORT:-8086}" + +# export HF_CACHE_DIR="${HF_CACHE_DIR:-"./data"}" +export HF_CACHE_DIR=${model_cache:-"./data2/huggingface"} +export VLLM_VOLUME="${HF_CACHE_DIR:-"./data2/huggingface"}" +export VLLM_IMAGE="${VLLM_IMAGE:-opea/vllm-gaudi:latest}" +export LLM_MODEL_ID="${LLM_MODEL_ID:-meta-llama/Llama-3.3-70B-Instruct}" +export LLM_MODEL=$LLM_MODEL_ID +export LLM_ENDPOINT="http://${IP_ADDRESS}:${VLLM_PORT}" +export MAX_LEN="${MAX_LEN:-16384}" +export NUM_CARDS="${NUM_CARDS:-4}" + +# Recursion limits +export RECURSION_LIMIT_WORKER="${RECURSION_LIMIT_WORKER:-12}" +export RECURSION_LIMIT_SUPERVISOR="${RECURSION_LIMIT_SUPERVISOR:-10}" + +# Hugging Face API token +export HUGGINGFACEHUB_API_TOKEN="${HF_TOKEN}" + +# LLM configuration +export TEMPERATURE="${TEMPERATURE:-0.5}" +export MAX_TOKENS="${MAX_TOKENS:-4096}" +export MAX_INPUT_TOKENS="${MAX_INPUT_TOKENS:-2048}" +export MAX_TOTAL_TOKENS="${MAX_TOTAL_TOKENS:-4096}" + +# Worker URLs +export WORKER_FINQA_AGENT_URL="http://${IP_ADDRESS}:9095/v1/chat/completions" +export WORKER_RESEARCH_AGENT_URL="http://${IP_ADDRESS}:9096/v1/chat/completions" + +# DocSum configuration +export DOCSUM_COMPONENT_NAME="${DOCSUM_COMPONENT_NAME:-"OpeaDocSumvLLM"}" +export DOCSUM_ENDPOINT="http://${IP_ADDRESS}:9000/v1/docsum" + +# Toolset and prompt paths +export TOOLSET_PATH=$WORKDIR/GenAIExamples/FinanceAgent/tools/ +export PROMPT_PATH=$WORKDIR/GenAIExamples/FinanceAgent/prompts/ #### env vars for dataprep ############# -export host_ip=${ip_address} export DATAPREP_PORT="6007" export TEI_EMBEDDER_PORT="10221" -export REDIS_URL_VECTOR="redis://${ip_address}:6379" -export REDIS_URL_KV="redis://${ip_address}:6380" 
-export LLM_MODEL=$model
-export LLM_ENDPOINT="http://${ip_address}:${vllm_port}"
+export REDIS_URL_VECTOR="redis://${IP_ADDRESS}:6379"
+export REDIS_URL_KV="redis://${IP_ADDRESS}:6380"
+
 export DATAPREP_COMPONENT_NAME="OPEA_DATAPREP_REDIS_FINANCE"
 export EMBEDDING_MODEL_ID="BAAI/bge-base-en-v1.5"
-export TEI_EMBEDDING_ENDPOINT="http://${ip_address}:${TEI_EMBEDDER_PORT}"
+export TEI_EMBEDDING_ENDPOINT="http://${IP_ADDRESS}:${TEI_EMBEDDER_PORT}"
 #######################################
-
-
 function get_genai_comps() {
     if [ ! -d "GenAIComps" ] ; then
         git clone --depth 1 --branch ${opea_branch:-"main"} https://github.com/opea-project/GenAIComps.git
@@ -48,7 +84,7 @@ function build_dataprep_agent_images() {
 
 function build_agent_image_local(){
     cd $WORKDIR/GenAIComps/
-    docker build -t opea/agent:latest -f comps/agent/src/Dockerfile . --build-arg https_proxy=$https_proxy --build-arg http_proxy=$http_proxy
+    docker build -t opea/agent:latest -f comps/agent/src/Dockerfile . --build-arg https_proxy=$HTTPS_PROXY --build-arg http_proxy=$HTTP_PROXY
 }
 
 function build_vllm_docker_image() {
@@ -62,24 +98,25 @@ function build_vllm_docker_image() {
     VLLM_FORK_VER=v0.6.6.post1+Gaudi-1.20.0
     git checkout ${VLLM_FORK_VER} &> /dev/null
 
-    docker build --no-cache -f Dockerfile.hpu -t $vllm_image --shm-size=128g . --build-arg https_proxy=$https_proxy --build-arg http_proxy=$http_proxy
+    docker build --no-cache -f Dockerfile.hpu -t $VLLM_IMAGE --shm-size=128g . --build-arg https_proxy=$HTTPS_PROXY --build-arg http_proxy=$HTTP_PROXY
     if [ $? -ne 0 ]; then
-        echo "$vllm_image failed"
+        echo "$VLLM_IMAGE failed"
         exit 1
     else
-        echo "$vllm_image successful"
+        echo "$VLLM_IMAGE successful"
     fi
}
 
+function stop_llm(){
+    cid=$(docker ps -aq --filter "name=vllm-gaudi-server")
+    echo "Stopping container $cid"
+    if [[ ! -z "$cid" ]]; then docker rm $cid -f && sleep 1s; fi
+
+}
+
+function start_all_services(){
+    docker compose -f $WORKPATH/docker_compose/intel/hpu/gaudi/compose.yaml up -d
 
-function start_vllm_service_70B() {
-    echo "token is ${HF_TOKEN}"
-    echo "start vllm gaudi service"
-    echo "**************model is $model**************"
-    docker run -d --runtime=habana --rm --name "vllm-gaudi-server" -e HABANA_VISIBLE_DEVICES=all -p $vllm_port:8000 -v $vllm_volume:/data -e HF_TOKEN=$HF_TOKEN -e HUGGING_FACE_HUB_TOKEN=$HF_TOKEN -e HF_HOME=/data -e OMPI_MCA_btl_vader_single_copy_mechanism=none -e PT_HPU_ENABLE_LAZY_COLLECTIVES=true -e http_proxy=$http_proxy -e https_proxy=$https_proxy -e no_proxy=$no_proxy -e VLLM_SKIP_WARMUP=true --cap-add=sys_nice --ipc=host $vllm_image --model ${model} --max-seq-len-to-capture 16384 --tensor-parallel-size 4
-    sleep 10s
-    echo "Waiting vllm gaudi ready"
-    n=0
+    n=0
     until [[ "$n" -ge 200 ]] || [[ $ready == true ]]; do
         docker logs vllm-gaudi-server &> ${LOG_PATH}/vllm-gaudi-service.log
         n=$((n+1))
@@ -96,19 +133,6 @@
     echo "Service started successfully"
 }
 
-function stop_llm(){
-    cid=$(docker ps -aq --filter "name=vllm-gaudi-server")
-    echo "Stopping container $cid"
-    if [[ ! -z "$cid" ]]; then docker rm $cid -f && sleep 1s; fi
-z "$cid" ]]; then docker rm $cid -f && sleep 1s; fi - -} - -function start_dataprep(){ - docker compose -f $WORKPATH/docker_compose/intel/hpu/gaudi/dataprep_compose.yaml up -d - sleep 1m -} - function validate() { local CONTENT="$1" local EXPECTED_RESULT="$2" @@ -155,16 +179,8 @@ function stop_dataprep() { } -function start_agents() { - echo "Starting Agent services" - cd $WORKDIR/GenAIExamples/FinanceAgent/docker_compose/intel/hpu/gaudi/ - bash launch_agents.sh - sleep 2m -} - - function validate_agent_service() { - # # test worker finqa agent + # test worker finqa agent echo "======================Testing worker finqa agent======================" export agent_port="9095" prompt="What is Gap's revenue in 2024?" @@ -178,7 +194,7 @@ function validate_agent_service() { exit 1 fi - # # test worker research agent + # test worker research agent echo "======================Testing worker research agent======================" export agent_port="9096" prompt="Johnson & Johnson" @@ -215,7 +231,6 @@ function validate_agent_service() { docker logs supervisor-agent-endpoint exit 1 fi - } function stop_agent_docker() { @@ -228,7 +243,6 @@ function stop_agent_docker() { done } - echo "workpath: $WORKPATH" echo "=================== Stop containers ====================" stop_llm @@ -238,24 +252,22 @@ stop_dataprep cd $WORKPATH/tests echo "=================== #1 Building docker images====================" -build_vllm_docker_image +# build_vllm_docker_image build_dataprep_agent_images -#### for local test -# build_agent_image_local -# echo "=================== #1 Building docker images completed====================" +# ## for local test +# # build_agent_image_local +echo "=================== #1 Building docker images completed====================" -echo "=================== #2 Start vllm endpoint====================" -start_vllm_service_70B -echo "=================== #2 vllm endpoint started====================" +echo "=================== #2 Start services ====================" +start_all_services +echo "=================== #2 Endpoints for services started====================" -echo "=================== #3 Start dataprep and ingest data ====================" -start_dataprep +echo "=================== #3 Validate ingest_validate_dataprep ====================" ingest_validate_dataprep echo "=================== #3 Data ingestion and validation completed====================" echo "=================== #4 Start agents ====================" -start_agents validate_agent_service echo "=================== #4 Agent test passed ===================="