Author | SHA1 and commit message | Commit date |
---|---|---|
Jinghan Yao | 3bdd187e71 Fixing the reshape bug in sequence parallel alltoall, which corrupted all QKV data (#5664) | 4 months ago |
Kun Chen | f86824be81 Add Ulysses DistributedAttention compatibility (#5525) | 5 months ago |
Reza Yazdani | 2afa1c7f2f Communication Optimization for Large-Scale Training (#4695) | 11 months ago |
Sam Ade Jacobs | a855405e0b DeepSpeed Ulysses release (#4198) | 1 year ago |