[Bug] InternVL2 raises an error with more than 2 images #2263

### Checklist

- [X] 1. I have searched related issues but cannot get the expected help.
- [X] 2. The bug has not been fixed in the latest version.
- [X] 3. Please note that if the bug-related issue you submitted lacks corresponding environment info and a minimal reproducible demo, it will be challenging for us to reproduce and resolve the issue, reducing the likelihood of receiving feedback.

### Describe the bug

Passing 2 images in a single request works fine, but passing 3 or more images raises an error.

```
2024-08-08 17:08:41,058 - lmdeploy - INFO - ImageEncoder received 3 images, left 3 images.
2024-08-08 17:08:41,058 - lmdeploy - INFO - ImageEncoder process 3 images, left 0 images.
2024-08-08 17:08:42,400 - lmdeploy - INFO - ImageEncoder forward 3 images, cost 1.341s
2024-08-08 17:08:42,401 - lmdeploy - INFO - ImageEncoder done 3 images, left 0 images.
2024-08-08 17:08:42,403 - lmdeploy - INFO - prompt='<|im_start|>system\n<|im_end|>\n<|im_start|>user\n<img><IMAGE_TOKEN><IMAGE_TOKEN><IMAGE_TOKEN></img>\n请简单描述下每张照片<|im_end|>\n<|im_start|>assistant\n', gen_config=EngineGenerationConfig(n=1, max_new_tokens=1024, top_p=0.8, top_k=40, temperature=0.6, repetition_penalty=1.1, ignore_eos=False, random_seed=14542548753086595270, stop_words=[92542, 92540], bad_words=None, min_new_tokens=None, skip_special_tokens=True, logprobs=None), prompt_token_id=[1, 92543, 9081, 364, 92542, 364, 92543, 1008, 364, 92544, 0, 0, 0, .........., 0, 92545, 364, 60836, 68435, 69401, 60380, 60619, 60862, 68805, 92542, 364, 92543, 525, 11353, 364], adapter_name=None.
2024-08-08 17:08:42,407 - lmdeploy - INFO - session_id=0, history_tokens=0, input_tokens=10009, max_new_tokens=1024, seq_start=True, seq_end=True, step=0, prep=True
2024-08-08 17:08:42,409 - lmdeploy - ERROR - Truncate max_new_tokens to 128
2024-08-08 17:08:42,410 - lmdeploy - ERROR - run out of tokens. session_id=0.
```


### Reproduction

```python
import nest_asyncio
nest_asyncio.apply()
from lmdeploy import pipeline, TurbomindEngineConfig, ChatTemplateConfig, GenerationConfig, VisionConfig
from lmdeploy.vl import load_image
from lmdeploy.vl.constants import IMAGE_TOKEN

system_prompt = ''
chat_template_config = ChatTemplateConfig('internvl-internlm2')
chat_template_config.meta_instruction = system_prompt

backend_config = TurbomindEngineConfig(
    model_format='awq', 
    max_batch_size=32,
    enable_prefix_caching=True,
    cache_max_entry_count=0.5,
    session_len=8192
)

vision_config=VisionConfig(
    max_batch_size=8,
    thread_safe=True
)

pipe = pipeline(
    '/root/modelhub/InternVL2-8B-AWQ',
    chat_template_config=chat_template_config,
    backend_config=backend_config,
    vision_config=vision_config,
    log_level='INFO'
)

gen_config=GenerationConfig(
    max_new_tokens=1024,
    top_p=0.8,
    top_k=40,
    temperature=0.6,
    repetition_penalty=1.1
)

# Placeholder file names; replace with real image paths.
image_list = [
    'xxx.jpg', 'yyy.jpg', 'zzz.jpg'
]
images = [load_image(name) for name in image_list[:]]

# The prompt is Chinese for "Please briefly describe each photo".
prompt = '请简单描述下每张照片'
text = pipe((prompt, images), gen_config=gen_config).text
print(text)
```
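
Comparing the log with this configuration: the log reports `input_tokens=10009`, while `session_len` is set to 8192 and `max_new_tokens` to 1024. The sketch below only does that arithmetic. It assumes, without confirmation from the lmdeploy source, that the "run out of tokens" error is related to the prompt exceeding the session length. The helper names `token_headroom` and `min_session_len` are hypothetical and are not part of lmdeploy.

```python
# Values copied from the log and the reproduction config above.
SESSION_LEN = 8192      # TurbomindEngineConfig(session_len=8192)
INPUT_TOKENS = 10009    # "input_tokens=10009" in the log
MAX_NEW_TOKENS = 1024   # GenerationConfig(max_new_tokens=1024)


def token_headroom(session_len: int, input_tokens: int) -> int:
    """Tokens left for generation after the prompt.

    A negative value means the prompt alone does not fit in the session.
    Hypothetical helper, not an lmdeploy function.
    """
    return session_len - input_tokens


def min_session_len(input_tokens: int, max_new_tokens: int) -> int:
    """Smallest session length that holds the prompt plus the full
    requested generation. Hypothetical helper, not an lmdeploy function.
    """
    return input_tokens + max_new_tokens


headroom = token_headroom(SESSION_LEN, INPUT_TOKENS)
needed = min_session_len(INPUT_TOKENS, MAX_NEW_TOKENS)

print(f"headroom with session_len={SESSION_LEN}: {headroom}")  # -1817
print(f"session_len needed for this request: {needed}")        # 11033
```

If the assumption holds, the 3-image prompt is 1817 tokens over the configured session length before any generation starts, and a `session_len` of at least 11033 would be needed for this particular request. Whether raising `session_len` actually resolves the error has not been verified here.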

### Environment

```Shell
lmdeploy==0.5.3


lmdeploy  check_env
Traceback (most recent call last):
  File "/opt/conda/bin/lmdeploy", line 8, in <module>
    sys.exit(run())
  File "/opt/conda/lib/python3.8/site-packages/lmdeploy/cli/entrypoint.py", line 36, in run
    args.run(args)
  File "/opt/conda/lib/python3.8/site-packages/lmdeploy/cli/cli.py", line 237, in check_env
    gpu_topo = get_gpu_topo()
  File "/opt/conda/lib/python3.8/site-packages/lmdeploy/cli/cli.py", line 223, in get_gpu_topo
    res = subprocess.run(['nvidia-smi', 'topo', '-m'],
  File "/opt/conda/lib/python3.8/subprocess.py", line 516, in run
    raise CalledProcessError(retcode, process.args,
subprocess.CalledProcessError: Command '['nvidia-smi', 'topo', '-m']' returned non-zero exit status 255.
```


### Error traceback

_No response_
