Non-deterministic outputs for llama2 #966

@normster

Description

For some adversarially optimized prompts, Llama 2 running on vLLM seems to return slightly different generations from run to run. Does anyone know what could be causing this, and whether it is possible to fix? My suspicion is that the model shards are not being reduced in the same order every time, which leads to different floating point values because floating point addition is not associative.
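
For reference, floating point non-associativity is easy to demonstrate on its own. The sketch below (illustration only, not vLLM code) shows that regrouping the same three additions changes the result; the same effect in a tensor-parallel all-reduce would perturb the logits and could flip an argmax between runs.

```python
# Floating point addition is not associative: regrouping the same
# operands can change the low-order bits of the result.
a, b, c = 0.1, 0.2, 0.3

left = (a + b) + c   # 0.6000000000000001
right = a + (b + c)  # 0.6

print(left == right)  # False: the grouping (i.e. reduction order) matters
```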

Labels: bug