[float8] Add support for blockwise fp8 quantization scheme used in DeepSeek v3 #1594

Open
@danielvegamyhre

Description

DeepSeek v3 uses a blockwise fp8 quantization strategy, in which the scaling factor is computed independently for each block of a tensor, rather than once per tensor or once per row. The code is available here.
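To make the idea concrete, here is a minimal pure-Python sketch of per-block scaling. It is not the DeepSeek code and not a torchao API: the function names `blockwise_scales` and `scale_blockwise`, the nested-list layout, and the zero-block fallback are all assumptions made for illustration. It takes 448 as the largest finite value of the e4m3 fp8 format and only computes the scales and the scaled values; it does not perform the actual cast to fp8.

```python
FP8_E4M3_MAX = 448.0  # largest finite magnitude in float8 e4m3 (fn variant)


def blockwise_scales(matrix, block_rows, block_cols):
    """Return one scale per (block_rows x block_cols) block of a 2D list.

    Hypothetical helper for illustration: scale = amax(block) / FP8_E4M3_MAX.
    """
    n_rows, n_cols = len(matrix), len(matrix[0])
    scales = []
    for r0 in range(0, n_rows, block_rows):
        row_scales = []
        for c0 in range(0, n_cols, block_cols):
            # Largest absolute value inside this block (edge blocks may be smaller).
            amax = max(
                abs(matrix[r][c])
                for r in range(r0, min(r0 + block_rows, n_rows))
                for c in range(c0, min(c0 + block_cols, n_cols))
            )
            # Assumption: fall back to a scale of 1.0 for an all-zero block.
            row_scales.append(amax / FP8_E4M3_MAX if amax > 0 else 1.0)
        scales.append(row_scales)
    return scales


def scale_blockwise(matrix, scales, block_rows, block_cols):
    """Divide each element by the scale of the block it belongs to."""
    return [
        [
            matrix[r][c] / scales[r // block_rows][c // block_cols]
            for c in range(len(matrix[0]))
        ]
        for r in range(len(matrix))
    ]


# Two 2x2 blocks with very different magnitudes each get their own scale.
example = [
    [1.0, 2.0, 100.0, 200.0],
    [3.0, 4.0, 300.0, 448.0],
]
example_scales = blockwise_scales(example, block_rows=2, block_cols=2)
example_scaled = scale_blockwise(example, example_scales, block_rows=2, block_cols=2)
```

With a single per-tensor scale, the small-magnitude block on the left would use only a tiny part of the fp8 range. With per-block scales, each block is stretched to fill the range on its own.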

It would be useful for torchao to support this as well, for users wishing to do research or development with this same quantization strategy.

cc @drisspg @vkuzo
