Jinghan Yao 3bdd187e71 Fixing the reshape bug in sequence parallel alltoall, which corrupted all QKV data (#5664) | 4 months ago | |
---|---|---|
__init__.py | a855405e0b DeepSpeed Ulysses release (#4198) | 1 year ago |
layer.py | 3bdd187e71 Fixing the reshape bug in sequence parallel alltoall, which corrupted all QKV data (#5664) | 4 months ago |