Question about integration with DeepSpeed-Ulysses

Hi developers,

Thanks for such a great project. It really demonstrates the power of the newly released features in torch.

I want to run a Llama 2 model with a 128k sequence length. How can I enable this? I have some experience with DeepSpeed-Ulysses, so my question is: does torchtitan support sequence parallelism in the style of DeepSpeed-Ulysses?

Thanks!