Jinghan Yao 3bdd187e71 Fixing the reshape bug in sequence parallel alltoall, which corrupted all QKV data (#5664) 4 months ago
..
__init__.py a855405e0b DeepSpeed Ulysses release (#4198) 1 year ago
layer.py 3bdd187e71 Fixing the reshape bug in sequence parallel alltoall, which corrupted all QKV data (#5664) 4 months ago