Reza Yazdani | 2afa1c7f2f Communication Optimization for Large-Scale Training (#4695) | 11 months ago
Guanhua Wang | b1cb0dfc46 Guanhua/partial offload rebase v2 (#590) (#4636) | 11 months ago
Matthew Hoffman | 604d701e35 Introduce pydantic_v1 compatibility module for pydantic>=2.0.0 support (#4407) | 1 year ago
Yuxiang Wei | 986b5958e2 fix: wrong documentation of `ignore_unused_parameters` (#4418) | 1 year ago
Heyang Qin | 7711bdbbd2 MP ZeRO++ (#3954) | 1 year ago
leiwen83 | 1e0c39c6bf enable pipeline checkpoint loading mode (#3629) | 1 year ago
Olatunji Ruwase | 7f90ef4bdd Multiple zero stage 3 related fixes (#3886) | 1 year ago
Heyang Qin | d18aa2c79c ZeRO++ (#3784) | 1 year ago
Zhen Zhang | 2e99f6edf6 [DRAFT] Tentative implementation of MiCS (#2964) | 1 year ago
Olatunji Ruwase | 47f9f13bd3 DeepSpeed Chat (#3186) | 1 year ago
Michael Wyatt | b361c72761 Update DeepSpeed copyright license to Apache 2.0 (#3111) | 1 year ago
Jeff Rasley | 91d63e0228 update formatter version and style settings (#3098) | 1 year ago
Jeff Rasley | da84e60d98 add missing license info to top of all source code (#2889) | 1 year ago
Michael Wyatt | fe6785447d Add missing Inference sub-configs (#2518) | 1 year ago
Michael Wyatt | 43bf035cfc Update docs to autogenerate pydantic config model docs (#2509) | 1 year ago
Olatunji Ruwase | 5870f36c58 Correctly detect offload configuration (#2208) | 2 years ago
Olatunji Ruwase | 2210ebe70f Release swap buffers for persisted params (#2089) | 2 years ago
Michael Wyatt | 5997589683 Refactor ZeRO configs to use Pydantic (#2004) | 2 years ago
Justin Chiu | 4912e0ad7e Various ZeRO Stage3 Optimizations + Improvements (including bfloat16 support) (#1453) | 2 years ago
Alex Hedges | fc2f378ece Improve pre-commit hooks (#1602) | 2 years ago
Olatunji Ruwase | 97f7ed9e98 Use correct default for round robin gradients (#1258) | 3 years ago
Olatunji Ruwase | 4d420df565 Make round robin gradient partitioning configurable (default False) (#1256) | 3 years ago
Stas Bekman | a8d6dfe87c correct cpu_offload deprecation (#1140) | 3 years ago
Jeff Rasley | cfa63f5dad ZeRO stage 1 refresh (#1042) | 3 years ago
Olatunji Ruwase | 5b393f1555 Avoid unused parameters assert by default (#1039) | 3 years ago
hamlet | d0b61f1810 Add find_unused_parameters option to DeepSpeedEngine (#945) | 3 years ago
Jeff Rasley | 0d4a54a04d ZeRO-Infinity (#976) | 3 years ago
Stas Bekman | 5ca86ae4ed improved readability + typos (#895) | 3 years ago
Stas Bekman | 39013dd2b8 save_fp16_model consolidated for zero3 (#893) | 3 years ago
Olatunji Ruwase | 7bcd72a278 Make config objects json serializable (#862) | 3 years ago