Eval bug: vulkan: regression: vram usage increased #11339

@rhjdvsgsgks

Name and Version

716bd6d (bisected)

Operating systems

Linux

GGML backends

Vulkan

Hardware

amdgpu, 8 GB VRAM

Models

Qwen2.5-Coder-14B-Instruct-Q4_K_M
or any model of similar size

Problem description & steps to reproduce

On c250ecb, the model weights fit into VRAM, leaving only the context/KV cache in GTT. Memory usage is 8166M VRAM + 2271M GTT.

On 716bd6d, memory usage is 6342M VRAM + 4107M GTT. With more of the model spilling into GTT, token generation (tg) speed drops significantly.
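
For anyone trying to reproduce the comparison, here is a minimal sketch of how the two readings could be collected and compared. It assumes the amdgpu driver exposes `mem_info_vram_used` and `mem_info_gtt_used` (in bytes) under `/sys/class/drm/card0/device`; the card index may differ on other systems, and the helper names `read_amdgpu_mem_mib` and `usage_delta` are made up for this illustration. The issue does not say how the figures were measured, so this is only one possible way to do it.

```python
from pathlib import Path


def read_amdgpu_mem_mib(device_dir="/sys/class/drm/card0/device"):
    """Return current VRAM and GTT usage in MiB.

    Assumption: the amdgpu sysfs counters mem_info_vram_used and
    mem_info_gtt_used exist in device_dir and hold byte counts.
    """
    base = Path(device_dir)
    usage = {}
    for name in ("vram", "gtt"):
        raw = (base / f"mem_info_{name}_used").read_text().strip()
        usage[name] = int(raw) // (1024 * 1024)
    return usage


def usage_delta(before, after):
    """Per-pool change in usage (after minus before)."""
    return {pool: after[pool] - before[pool] for pool in before}


if __name__ == "__main__":
    # Figures as reported in this issue (units as given, "m").
    good = {"vram": 8166, "gtt": 2271}  # c250ecb
    bad = {"vram": 6342, "gtt": 4107}   # 716bd6d
    print(usage_delta(good, bad))
    print(sum(good.values()), sum(bad.values()))
```

With the reported numbers, VRAM usage falls by 1824 while GTT usage rises by 1836, and the totals stay almost the same (10437 vs 10449). That suggests the same data is being placed in GTT instead of VRAM, not that more memory is being allocated overall.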

First Bad Commit

716bd6d

Relevant log output

No difference in log output.
