Fix: Proper RGBA -> RGB conversion for PIL images. #18508

huachenheli · 2025-05-21T22:22:14Z

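Converting an RGBA image to RGB directly with convert on a PIL.Image discards the alpha channel without compositing, so transparent regions show whatever color values were stored beneath them and the background looks wrong, as demonstrated in the picture: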

Original: https://upload.wikimedia.org/wikipedia/commons/4/47/PNG_transparency_demonstration_1.png
Bad conversion:
Desired conversion (this PR):
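
For readers without the screenshots, here is a minimal sketch of the difference, assuming Pillow is installed. The helper name rgba_to_rgb and the white default background are assumptions made for illustration, not necessarily the exact code in this PR. The idea is to paste the RGBA image onto a solid background using its alpha channel as the mask, instead of dropping the alpha channel.

```python
from PIL import Image


def rgba_to_rgb(image, background_color=(255, 255, 255)):
    """Flatten an RGBA image onto a solid background (hypothetical helper)."""
    if image.mode != "RGBA":
        raise ValueError(f"expected an RGBA image, got mode {image.mode!r}")
    # Start from an opaque canvas filled with the background color.
    flattened = Image.new("RGB", image.size, background_color)
    # The alpha channel acts as the paste mask: alpha 0 keeps the canvas,
    # alpha 255 takes the source pixel, values in between are blended.
    flattened.paste(image, mask=image.getchannel("A"))
    return flattened


# A fully transparent pixel whose stored color happens to be red.
transparent_red = Image.new("RGBA", (1, 1), (255, 0, 0, 0))

# Plain convert() drops alpha, so the hidden red becomes visible.
print(transparent_red.convert("RGB").getpixel((0, 0)))  # (255, 0, 0)

# Compositing onto white gives the background a viewer would expect.
print(rgba_to_rgb(transparent_red).getpixel((0, 0)))  # (255, 255, 255)
```

White is only one possible choice of background; a caller that renders on a dark canvas could pass a different background_color.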

Test plan:

Unit test: pytest tests/multimodal/test_image.py -s

Local server:
vllm serve /home/huachenheli/local/llm/huggingface/llama4/Llama-4-Scout-17B-16E-Instruct --gpu-memory-utilization 0.5 --tensor-parallel-size 8 --max-model-len 65536

curl -0 -v -X POST localhost:8000/v1/chat/completions \
    -H "Content-Type: application/json" \
    -d "@-" << EOF
{
    "model": "/home/huachenheli/local/llm/huggingface/llama4/Llama-4-Scout-17B-16E-Instruct",
    "messages": [{
        "role": "user",
        "content": [
            {"type": "text", "text": "What is the background color of this image, excluding four dice?"},
            {"type": "image_url", "image_url": {"url": "https://upload.wikimedia.org/wikipedia/commons/4/47/PNG_transparency_demonstration_1.png"}}
        ]
    }]
}
EOF

Without this change:

"content":"The background color of the image, excluding the four dice, appears to be a multicolored striped pattern. The colors include blue, white, green, purple, red, and yellow. However, if we had to choose a dominant color or a color that is most representative of the background, it would be difficult to pinpoint a single color due to the striped nature of the background.\n\nHowever, if the question is asking for a color that is visible behind the dice and not obstructed by them, then the answer could be inferred as black, since there is a black border around the image. \n\nTherefore, the background color of the image, excluding the four dice, is black."

With this change:

"content":"The background color of the image, excluding the four dice, is white."

mgoin · 2025-05-21T22:56:17Z

vllm/multimodal/image.py

Maybe you could just make an image_to_image_mode function that has this conditional inside of it.
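
A rough sketch of what such a wrapper could look like, assuming Pillow. The function name comes from the comment above; its signature, the rgba_to_rgb helper, and the white background are assumptions for illustration rather than the code that was eventually merged.

```python
from PIL import Image


def rgba_to_rgb(image, background_color=(255, 255, 255)):
    """Hypothetical helper: flatten RGBA onto a solid background."""
    flattened = Image.new("RGB", image.size, background_color)
    flattened.paste(image, mask=image.getchannel("A"))
    return flattened


def image_to_image_mode(image, to_mode):
    """Convert an image to to_mode, special-casing RGBA -> RGB."""
    if image.mode == to_mode:
        # Nothing to do; return the same object.
        return image
    if image.mode == "RGBA" and to_mode == "RGB":
        # Composite instead of silently dropping the alpha channel.
        return rgba_to_rgb(image)
    # Every other mode pair falls back to Pillow's own conversion.
    return image.convert(to_mode)
```

With the conditional in one place, call sites that previously used image.convert("RGB") can call image_to_image_mode(image, "RGB") and get the composited result for RGBA inputs without repeating the mode check.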

DarkLight1337 · 2025-05-22T01:52:31Z

Thanks for reporting and fixing this issue! Can you add a unit test to avoid future regressions?

huachenheli · 2025-05-22T02:34:54Z

> Thanks for reporting and fixing this issue! Can you add a unit test to avoid future regressions?

Done.

DarkLight1337 · 2025-05-22T02:40:22Z

Hmm actually, I see other places where .convert is used, can you update them as well?

huachenheli · 2025-05-22T15:37:49Z

> Hmm actually, I see other places where .convert is used, can you update them as well?

Let me check.

huachenheli · 2025-05-22T16:02:34Z

Updated existing call sites.

DarkLight1337 · 2025-05-22T16:03:29Z

Please also merge from main to fix CI failures.

mergify · 2025-05-22T20:11:29Z

This pull request has merge conflicts that must be resolved before it can be
merged. Please rebase the PR, @ChenheliHua.

https://docs.github.com/en/pull-requests/collaborating-with-pull-requests/working-with-forks/syncing-a-fork

Summary: Test Plan: Reviewers: Subscribers: Tasks: Tags: Signed-off-by: Chenheli Hua <[email protected]>

mergify bot added the multi-modality Related to multi-modality (#4194) label May 21, 2025

huachenheli marked this pull request as ready for review May 21, 2025 22:29

huachenheli requested review from DarkLight1337 and ywang96 as code owners May 21, 2025 22:29

mgoin reviewed May 21, 2025

huachenheli requested a review from mgoin May 21, 2025 23:56

DarkLight1337 approved these changes May 22, 2025

mergify bot added the documentation Improvements or additions to documentation label May 22, 2025

DarkLight1337 approved these changes May 22, 2025

DarkLight1337 enabled auto-merge (squash) May 22, 2025 16:03

github-actions bot added the ready ONLY add when PR is ready to merge/full CI is needed label May 22, 2025

ywang96 approved these changes May 22, 2025

mgoin approved these changes May 22, 2025

auto-merge was automatically disabled May 22, 2025 20:10
Head branch was pushed to by a user without write access

huachenheli force-pushed the rgba branch from 3c12599 to 1f02182 Compare May 22, 2025 20:10

huachenheli requested review from jeejeelee, youkaichao, russellb, robertgshaw2-redhat, tlrmchlsmth, WoosukKwon, simon-mo and njhill as code owners May 22, 2025 20:10

huachenheli requested review from comaniac, alexm-redhat and zhuohan123 as code owners May 22, 2025 20:10

mergify bot added ci/build frontend structured-output speculative-decoding v1 labels May 22, 2025

github-project-automation bot added this to Structured Output May 22, 2025

mergify bot added tpu Related to Google TPUs tool-calling labels May 22, 2025

mergify bot added the needs-rebase label May 22, 2025

github-project-automation bot added this to Tool Calling May 22, 2025

huachenheli closed this May 22, 2025

huachenheli force-pushed the rgba branch from b549997 to 6e588da Compare May 22, 2025 20:20

github-project-automation bot moved this to Done in Tool Calling May 22, 2025

github-project-automation bot moved this to Done in Structured Output May 22, 2025

mergify bot removed the tpu Related to Google TPUs label May 22, 2025

huachenheli mentioned this pull request May 22, 2025

Re-submit: Fix: Proper RGBA -> RGB conversion for PIL images. #18569

Merged

huachenheli added a commit to huachenheli/vllm that referenced this pull request May 22, 2025

Re-commit PR: vllm-project#18508

861fa47

Summary: Test Plan: Reviewers: Subscribers: Tasks: Tags: Signed-off-by: Chenheli Hua <[email protected]>

huachenheli deleted the rgba branch July 2, 2025 15:47
