Farzan Taj | 9886d6d9e0 | Fix CPUAdam for when `vendor_id_raw` is not provided (#2836) | 1 year ago
Ma, Guokai | 98cc35b6a8 | Abstract accelerator (step 3) (#2677) | 1 year ago
Ma, Guokai | 9548d48f48 | Abstract accelerator (step 2) (#2560) | 1 year ago
Alex Hedges | 316c4a43e0 | Add flake8 to pre-commit checks (#2051) | 2 years ago
Michael Wyatt | 7bae53d154 | Fix for AMD unit tests (#2047) | 2 years ago
Jeff Rasley | 2422ec4885 | add segfault guard for cpu-adam/adagrad (#1681) | 2 years ago
Reza Yazdani | 559c4ce11a | Convert the fp16_params to group of parameters (#1651) | 2 years ago
Jeff Rasley | a10e4811fe | force set lf instead of crlf (https://github.com/pre-commit/pre-commit-hooks#mixed-line-ending) (#1598) | 2 years ago
Alex Hedges | be789b1665 | Fix many typos (#1423) | 3 years ago
Ammar Ahmad Awan | f28432441b | DeepSpeed MoE (#1310) | 3 years ago
Stas Bekman | a029239812 | clean up logging (#1190) | 3 years ago
Stas Bekman | c79184ebcc | fix cpu_adam memory leak on deepspeed re-use in the same process (#896) | 3 years ago
Jeff Rasley | dd03cff29f | set adamw_mode default true (follows FusedAdam and < 0.3.11 logic) (#844) | 3 years ago
Samyam Rajbhandari | 599258f979 | ZeRO 3 Offload (#834) | 3 years ago
Reza Yazdani | 9f52a36fad | tracking optimizer step in cpu-adam when loading checkpoint (#564) | 3 years ago
Jeff Rasley | 31f46feee2 | DeepSpeed JIT op + PyPI support (#496) | 4 years ago
Reza Yazdani | 7d4d742bf0 | Fixing CPU-Adam convergence issue (#503) | 4 years ago
Reza Yazdani | f5aa2547d8 | Add CPUAdam optimizer for zero-offload in deepspeed engine (#484) | 4 years ago
Jeff Rasley | 41db1c2f03 | ZeRO-Offload release (#391) | 4 years ago