Non-deterministic outputs for llama2 #966

@normster

Description

For some adversarially optimized prompts, Llama 2 running on vLLM seems to return slightly different generations from run to run. Does anyone know what could be causing this, and whether it is possible to fix? My suspicion is that the model shards are not being reduced in the same order every time, which leads to different floating point values because floating point addition is not associative.
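
For reference, floating point non-associativity is easy to demonstrate on its own. The sketch below (illustration only, not vLLM code) shows that regrouping the same three additions changes the result; the same effect in a tensor-parallel all-reduce would perturb the logits and could flip an argmax between runs.

```python
# Floating point addition is not associative: regrouping the same
# operands can change the low-order bits of the result.
a, b, c = 0.1, 0.2, 0.3

left = (a + b) + c   # 0.6000000000000001
right = a + (b + c)  # 0.6

print(left == right)  # False: the grouping (i.e. reduction order) matters
```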

Labels: bug