title: "DeepSpeed Ulysses: System Optimizations for Enabling Training of Extreme Long Sequence Transformer Models"
excerpt: ""
link: https://github.com/microsoft/DeepSpeed/blob/master/blogs/deepspeed-ulysses/README.md
date: 2023-08-24 00:00:00
tags: training ZeRO English