Joe Mayer
|
d3550dc88a
Adagrad support in ZeRO (#3401)
|
1 年之前 |
Michael Wyatt
|
b361c72761
Update DeepSpeed copyright license to Apache 2.0 (#3111)
|
1 年之前 |
Jeff Rasley
|
91d63e0228
update formatter version and style settings (#3098)
|
1 年之前 |
Jeff Rasley
|
da84e60d98
add missing license info to top of all source code (#2889)
|
1 年之前 |
Ma, Guokai
|
98cc35b6a8
Abstract accelerator (step 3) (#2677)
|
1 年之前 |
Karim Foda
|
735406e536
fix import errors (#2026)
|
2 年之前 |
Ammar Ahmad Awan
|
36ad3119d5
DeepSpeed comm backend v1 (#1985)
|
2 年之前 |
Olatunji Ruwase
|
56c5223868
bf16+pipeline parallelism (#1801)
|
2 年之前 |
Justin Chiu
|
4912e0ad7e
Various ZeRO Stage3 Optimizations + Improvements (including bfloat16 support) (#1453)
|
2 年之前 |
Jeff Rasley
|
3293cf72a0
[ZeRO] Default disable elastic ckpt in stage 1+2 and reduce CPU memory overhead during ckpt load (#1525)
|
2 年之前 |
Stas Bekman
|
a029239812
clean up logging (#1190)
|
3 年之前 |
Samyam Rajbhandari
|
599258f979
ZeRO 3 Offload (#834)
|
3 年之前 |
Olatunji Ruwase
|
865104be85
Support optimizer AdamW type (#670)
|
3 年之前 |
Jeff Rasley
|
31f46feee2
DeepSpeed JIT op + PyPI support (#496)
|
4 年之前 |
Jeff Rasley
|
41db1c2f03
ZeRO-Offload release (#391)
|
4 年之前 |
Jeff Rasley
|
e5bbc2e559
Sparse attn + ops/runtime refactor + v0.3.0 (#343)
|
4 年之前 |