[Bug] InternVL2 raises an error with more than 2 images #2263

@JixiangGao

Checklist

  • 1. I have searched related issues but cannot get the expected help.
  • 2. The bug has not been fixed in the latest version.
  • 3. Please note that if the bug-related issue you submitted lacks corresponding environment info and a minimal reproducible demo, it will be challenging for us to reproduce and resolve the issue, reducing the likelihood of receiving feedback.

Describe the bug

Passing 2 images in a single request works fine, but passing 3 or more raises an error.

2024-08-08 17:08:41,058 - lmdeploy - INFO - ImageEncoder received 3 images, left 3 images.
2024-08-08 17:08:41,058 - lmdeploy - INFO - ImageEncoder process 3 images, left 0 images.
2024-08-08 17:08:42,400 - lmdeploy - INFO - ImageEncoder forward 3 images, cost 1.341s
2024-08-08 17:08:42,401 - lmdeploy - INFO - ImageEncoder done 3 images, left 0 images.
2024-08-08 17:08:42,403 - lmdeploy - INFO - prompt='<|im_start|>system\n<|im_end|>\n<|im_start|>user\n<img><IMAGE_TOKEN><IMAGE_TOKEN><IMAGE_TOKEN></img>\n请简单描述下每张照片<|im_end|>\n<|im_start|>assistant\n', gen_config=EngineGenerationConfig(n=1, max_new_tokens=1024, top_p=0.8, top_k=40, temperature=0.6, repetition_penalty=1.1, ignore_eos=False, random_seed=14542548753086595270, stop_words=[92542, 92540], bad_words=None, min_new_tokens=None, skip_special_tokens=True, logprobs=None), prompt_token_id=[1, 92543, 9081, 364, 92542, 364, 92543, 1008, 364, 92544, 0, 0, 0, .........., 0, 92545, 364, 60836, 68435, 69401, 60380, 60619, 60862, 68805, 92542, 364, 92543, 525, 11353, 364], adapter_name=None.
2024-08-08 17:08:42,407 - lmdeploy - INFO - session_id=0, history_tokens=0, input_tokens=10009, max_new_tokens=1024, seq_start=True, seq_end=True, step=0, prep=True
2024-08-08 17:08:42,409 - lmdeploy - ERROR - Truncate max_new_tokens to 128
2024-08-08 17:08:42,410 - lmdeploy - ERROR - run out of tokens. session_id=0.
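
One possible reading of the log above, offered as an assumption rather than a confirmed diagnosis: the request reports input_tokens=10009, which already exceeds the session_len=8192 set in the reproduction script, so no generation length can fit. The sketch below reproduces that arithmetic. `check_token_budget` is a hypothetical helper, not lmdeploy code, and the floor of 128 is inferred only from the "Truncate max_new_tokens to 128" log line.

```python
def check_token_budget(input_tokens, max_new_tokens, session_len,
                       history_tokens=0, min_new_tokens=128):
    """Hypothetical helper: mirrors what the log lines suggest, not lmdeploy's code."""
    used = history_tokens + input_tokens
    remaining = session_len - used
    # If the request does not fit, clamp the generation length,
    # but never below the assumed floor of 128 tokens.
    if used + max_new_tokens > session_len:
        max_new_tokens = max(remaining, min_new_tokens)
    # The request only fits if prompt plus generation stays within the session.
    fits = used + max_new_tokens <= session_len
    return max_new_tokens, fits


# Values taken from the log and from the reproduction script.
new_tokens, fits = check_token_budget(
    input_tokens=10009, max_new_tokens=1024, session_len=8192)
print(new_tokens, fits)  # prints: 128 False
```

If this reading is right, a session_len larger than the prompt length plus max_new_tokens may avoid the error. That has not been verified here.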

Reproduction

import nest_asyncio
nest_asyncio.apply()
from lmdeploy import pipeline, TurbomindEngineConfig, ChatTemplateConfig, GenerationConfig, VisionConfig
from lmdeploy.vl import load_image
from lmdeploy.vl.constants import IMAGE_TOKEN

system_prompt = ''
chat_template_config = ChatTemplateConfig('internvl-internlm2')
chat_template_config.meta_instruction = system_prompt

backend_config = TurbomindEngineConfig(
    model_format='awq', 
    max_batch_size=32,
    enable_prefix_caching=True,
    cache_max_entry_count=0.5,
    session_len=8192
)

vision_config=VisionConfig(
    max_batch_size=8,
    thread_safe=True
)

pipe = pipeline(
    '/root/modelhub/InternVL2-8B-AWQ',
    chat_template_config=chat_template_config,
    backend_config=backend_config,
    vision_config=vision_config,
    log_level='INFO'
)

gen_config=GenerationConfig(
    max_new_tokens=1024,
    top_p=0.8,
    top_k=40,
    temperature=0.6,
    repetition_penalty=1.1
)

image_list = [
    'xxx.jpg', 'yyy.jpg', 'zzz.jpg'
]
images = [load_image(name) for name in image_list[:]]

# Prompt (Chinese): "Please briefly describe each photo"
prompt = '请简单描述下每张照片'
text = pipe((prompt, images), gen_config=gen_config).text
print(text)

Environment

lmdeploy==0.5.3


lmdeploy  check_env
Traceback (most recent call last):
  File "/opt/conda/bin/lmdeploy", line 8, in <module>
    sys.exit(run())
  File "/opt/conda/lib/python3.8/site-packages/lmdeploy/cli/entrypoint.py", line 36, in run
    args.run(args)
  File "/opt/conda/lib/python3.8/site-packages/lmdeploy/cli/cli.py", line 237, in check_env
    gpu_topo = get_gpu_topo()
  File "/opt/conda/lib/python3.8/site-packages/lmdeploy/cli/cli.py", line 223, in get_gpu_topo
    res = subprocess.run(['nvidia-smi', 'topo', '-m'],
  File "/opt/conda/lib/python3.8/subprocess.py", line 516, in run
    raise CalledProcessError(retcode, process.args,
subprocess.CalledProcessError: Command '['nvidia-smi', 'topo', '-m']' returned non-zero exit status 255.
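
Since `lmdeploy check_env` fails on the `nvidia-smi topo -m` call in this environment, the basic environment details could be gathered by hand instead. The following is a minimal sketch using only the Python standard library. `collect_env` and its default package list are hypothetical, and it does not reproduce the full output of `check_env`.

```python
import platform
import subprocess
import sys
from importlib import metadata


def collect_env(packages=("lmdeploy", "torch")):
    """Hypothetical helper: gather basic environment info without GPU topology."""
    info = {"python": sys.version.split()[0], "platform": platform.platform()}
    for name in packages:
        try:
            info[name] = metadata.version(name)
        except metadata.PackageNotFoundError:
            info[name] = "not installed"
    try:
        # `nvidia-smi -L` only lists GPUs; it does not query the topology.
        out = subprocess.run(["nvidia-smi", "-L"], capture_output=True,
                             text=True, check=True)
        info["gpus"] = out.stdout.strip()
    except (OSError, subprocess.CalledProcessError) as exc:
        # Missing binary or non-zero exit status: record it and carry on.
        info["gpus"] = f"unavailable ({exc.__class__.__name__})"
    return info


for key, value in collect_env().items():
    print(f"{key}: {value}")
```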


Error traceback

No response
