Eval bug: vulkan: regression: vram usage increased

### Name and Version

716bd6dec3e044e5c325386b5b0483392b24cefe
(found by bisecting)

### Operating systems

Linux

### GGML backends

Vulkan

### Hardware

AMD GPU (amdgpu), 8 GB VRAM

### Models

Qwen2.5-Coder-14B-Instruct-Q4_K_M
or any model of similar size

### Problem description & steps to reproduce

On c250ecb3157f3bae0a45f44c3c953b5414d4c2f7, the model weights fit into VRAM, leaving only the context/KV cache in GTT. Memory usage is 8166M VRAM + 2271M GTT.


On 716bd6dec3e044e5c325386b5b0483392b24cefe, memory usage is 6342M VRAM + 4107M GTT. VRAM usage is about 1.8G lower and GTT usage about 1.8G higher than before, and token generation (tg) is significantly slower.
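
For anyone trying to reproduce the comparison: the report does not say how the VRAM/GTT figures were collected, so the following is only one possible way, a minimal sketch assuming the amdgpu sysfs counters `mem_info_vram_used` and `mem_info_gtt_used` and assuming the GPU is `card0`. The helper names `to_mib` and `report_mem` are made up for this example. Run it while the model is loaded on each commit and compare the two lines.

```shell
#!/bin/sh
# Print VRAM and GTT usage in MiB for one amdgpu device.
# Assumption: the GPU is card0; pass another sysfs device directory as $1 if not.

# Convert a byte count to whole MiB (integer division).
to_mib() {
    echo $(( $1 / 1048576 ))
}

# Read the amdgpu memory counters (reported in bytes) from a sysfs device directory.
report_mem() {
    dev_dir=$1
    vram=$(cat "$dev_dir/mem_info_vram_used")
    gtt=$(cat "$dev_dir/mem_info_gtt_used")
    echo "vram: $(to_mib "$vram")M  gtt: $(to_mib "$gtt")M"
}

dev=${1:-/sys/class/drm/card0/device}
if [ -r "$dev/mem_info_vram_used" ] && [ -r "$dev/mem_info_gtt_used" ]; then
    report_mem "$dev"
else
    echo "no amdgpu memory counters under $dev" >&2
fi
```

These counters are device-wide, so other processes using the GPU are included in the totals.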

### First Bad Commit

716bd6dec3e044e5c325386b5b0483392b24cefe

### Relevant log output

```shell
no difference on log output
```

Eval bug: vulkan: regression: vram usage increased #11339
