Closed · Labels: bug (Something isn't working)

Description
For some adversarially optimized prompts, llama2 running on vLLM occasionally returns slightly different generations for the same input. Does anyone know what could be causing this, and whether it can be fixed? My suspicion is that the model shards are not reduced in the same order every time, which produces slightly different floating-point values due to the non-associativity of floating-point addition. A sketch of this effect is shown below.
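To make the suspicion concrete, here is a minimal sketch in plain Python (not vLLM internals) of why reduction order matters: IEEE-754 addition is not associative, so summing the same partial results in a different order can yield a slightly different total, which greedy argmax sampling can amplify into a different token near a tie. The `partials` list is a hypothetical stand-in for shard-local partial results, not actual vLLM tensors.

```python
import random

# Classic, guaranteed example of floating-point non-associativity:
print((0.1 + 0.2) + 0.3)  # 0.6000000000000001
print(0.1 + (0.2 + 0.3))  # 0.6

# Simulate "per-shard" partial results reduced in two different orders.
random.seed(0)
partials = [random.gauss(0.0, 1.0) for _ in range(10_000)]

in_order = sum(partials)

shuffled = list(partials)
random.shuffle(shuffled)
out_of_order = sum(shuffled)

print(in_order == out_of_order)      # usually False
print(abs(in_order - out_of_order))  # tiny, but can flip an argmax near a tie
```

If the all-reduce across tensor-parallel shards does not fix its reduction order, this kind of drift would explain run-to-run variation even with greedy decoding.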