
Support Denvr endpoints with LiteLLM #2085


Open

srinarayan-srikanthan wants to merge 17 commits into main

Conversation

srinarayan-srikanthan (Collaborator)

Description

Support remote inference with a Denvr endpoint for ChatQnA, with README updates.

Issues

#2084

Type of change

List the type of change like below. Please delete options that are not relevant.

  • Bug fix (non-breaking change which fixes an issue)
  • New feature (non-breaking change which adds new functionality)
  • Breaking change (fix or feature that would break existing design and interface)
  • Others (enhancement, documentation, validation, etc.)

Dependencies

N/A

Tests

Describe the tests that you ran to verify your changes.

Ubuntu added 3 commits June 20, 2025 02:52
Copilot AI review requested due to automatic review settings June 20, 2025 03:43

github-actions bot commented Jun 20, 2025

Dependency Review

✅ No vulnerabilities or license issues found.

Scanned Files

None


pre-commit-ci bot and others added 3 commits June 20, 2025 03:44
Copilot AI (Contributor) left a comment

Pull Request Overview

Adds support for deploying ChatQnA with remote Denvr inference endpoints and updates the streaming response parser for multi-chunk JSON outputs.

  • Introduce a new compose_remote.yaml workflow and environment variable instructions in the Xeon CPU Docker README.
  • Update align_generator in chatqna.py to split and process multiple JSON chunks per line.
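
For illustration, here is a minimal sketch of that multi-chunk handling, based on the snippets quoted in the review comments below; the surrounding structure of align_generator in chatqna.py may differ, and the "data:" prefix and "[DONE]" sentinel handling are assumptions about the OpenAI-style stream rather than code from this PR:

```python
import json


def align_generator(responses):
    """Yield text content from a stream whose lines may carry several SSE chunks."""
    for line in responses:
        # A single streamed line can hold multiple "data: {...}" chunks separated by blank lines.
        chunks = [chunk.strip() for chunk in line.split("\n\n") if chunk.strip()]
        for chunk in chunks:
            payload = chunk[len("data:"):].strip() if chunk.startswith("data:") else chunk
            if payload == "[DONE]":
                return
            json_data = json.loads(payload)
            choice = json_data["choices"][0]
            if choice.get("finish_reason"):
                # End of stream signalled by the endpoint; see the finish_reason comment below.
                return
            elif "content" in choice["delta"]:
                yield choice["delta"]["content"]
```

The review comment below about the removed finish_reason check refers to the early-return branch sketched here.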

Reviewed Changes

Copilot reviewed 2 out of 3 changed files in this pull request and generated no comments.

| File | Description |
| ---- | ----------- |
| ChatQnA/docker_compose/intel/cpu/xeon/README.md | Added remote endpoint deployment steps and updated compose table |
| ChatQnA/chatqna.py | Refactored align_generator to handle multi-chunk streaming JSON |

Comments suppressed due to low confidence (3)

ChatQnA/docker_compose/intel/cpu/xeon/README.md:78

  • [nitpick] Clarify whether REMOTE_ENDPOINT should include the /v1/chat/completions path or just the base URL to avoid confusion.
**Note**: Set REMOTE_ENDPOINT variable value to "https://api.inference.denvrdata.com" when the remote endpoint to access is "https://api.inference.denvrdata.com/v1/chat/completions"
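
As an illustration of this convention (a sketch only: it assumes an OpenAI-compatible API, a bearer-token header, and API_KEY/LLM_MODEL_ID already exported; none of these details are taken from this PR), the full chat completions URL is the base URL plus the /v1/chat/completions path:

```bash
# REMOTE_ENDPOINT holds only the base URL; the /v1/chat/completions path is appended.
export REMOTE_ENDPOINT="https://api.inference.denvrdata.com"
curl "${REMOTE_ENDPOINT}/v1/chat/completions" \
  -H "Authorization: Bearer ${API_KEY}" \
  -H "Content-Type: application/json" \
  -d '{"model": "'"${LLM_MODEL_ID}"'", "messages": [{"role": "user", "content": "Hello"}]}'
```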

ChatQnA/chatqna.py:178

  • [nitpick] The outer variable line is reused for the inner loop below, which can reduce readability; consider renaming the loop variable to chunk or similar.
        chunks = [chunk.strip() for chunk in line.split("\n\n") if chunk.strip()]

ChatQnA/chatqna.py:191

  • The previous finish_reason check was removed, which may cause tokens to be emitted after the stream should end; consider re-adding or documenting this behavior change.
                elif "content" in json_data["choices"][0]["delta"]:

Ubuntu and others added 5 commits June 23, 2025 17:47
pre-commit-ci bot and others added 5 commits June 26, 2025 19:36
```yaml
no_proxy: ${no_proxy}
http_proxy: ${http_proxy}
https_proxy: ${https_proxy}
LLM_ENDPOINT: ${LLM_ENDPOINT}
```
Collaborator:

Should this be:

LLM_ENDPOINT: ${REMOTE_ENDPOINT}

louie-tsai (Collaborator) left a comment

Minor comments. Overall looks good.


```bash
export REMOTE_ENDPOINT=<endpoint-url>
export LLM_MODEL_ID=<model-id>
```
Collaborator:

Good to put a notice explaining why LLM_MODEL_ID needs to be set again, and that users need to pick a model supported by the remote endpoint.
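
For instance, such a notice could point users at the endpoint's model list. The sketch below assumes the remote endpoint exposes the OpenAI-compatible /v1/models route and that API_KEY is already exported; neither detail comes from this PR:

```bash
# List the models served by the remote endpoint, then pick one for LLM_MODEL_ID.
curl -s "${REMOTE_ENDPOINT}/v1/models" \
  -H "Authorization: Bearer ${API_KEY}"
```

The ids in the response are typically the values LLM_MODEL_ID expects.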

```bash
export REMOTE_ENDPOINT=<endpoint-url>
export LLM_MODEL_ID=<model-id>
export OPENAI_API_KEY=<API-KEY>
```
Collaborator:

Good to change it from OPENAI_API_KEY to API_KEY, since it is not for OpenAI.

- **To Run:**

```bash
export OPENAI_API_KEY=<api-key>
```
Collaborator:

The OPENAI name here is confusing; good to remove the OPENAI term.


```bash
export LLM_ENDPOINT=<endpoint-url>
export LLM_MODEL_ID=<model-id>
```
Collaborator:

Better to mention how users can set LLM_MODEL_ID correctly by getting the supported model list.
