Author | SHA1 | Message | Date |
---|---|---|---|
Jinghan Yao | 3bdd187e71 | Fixing the reshape bug in sequence parallel alltoall, which corrupted all QKV data (#5664) | 4 months ago |
Kun Chen | f86824be81 | Add Ulysses DistributedAttention compatibility (#5525) | 5 months ago |
Reza Yazdani | 2afa1c7f2f | Communication Optimization for Large-Scale Training (#4695) | 11 months ago |
Sam Ade Jacobs | a855405e0b | DeepSpeed Ulysses release (#4198) | 1 year ago |