Masahiro Tanaka | c3cfe96bb3 | Enable torch.compile with ZeRO (Experimental) (#4878) | 8 months ago
Masahiro Tanaka | 19e0dc39ba | Delay reduce-scatter for ZeRO3 leaf modules (#5008) | 8 months ago
Heyang Qin | 75ed63c94f | Enable hpz based on secondary tensor presence (#4906) | 9 months ago
Masahiro Tanaka | 96c5a873e6 | Add API to set a module as a leaf node when recursively setting Z3 hooks (#4966) | 9 months ago
ChangyueLiao | 18a04d04a5 | Use clearer naming (#2548) | 1 year ago
Heyang Qin | 462def451e | Enable hpz when running with torch.no_grad (#4232) | 1 year ago
Olatunji Ruwase | 6df158733d | Load z3 checkpoints for inference (#4171) | 1 year ago
Heyang Qin | 7711bdbbd2 | MP ZeRO++ (#3954) | 1 year ago
Ma, Guokai | 0f5406323c | [CPU] FusedAdam and CPU training support (#3991) | 1 year ago
Heyang Qin | 9377921a3f | Separate ZeRO3 InflightParamRegistry for train and eval (#3884) | 1 year ago
Heyang Qin | d18aa2c79c | ZeRO++ (#3784) | 1 year ago
Heyang Qin | 4716b0f769 | share inflight registry between PartitionedParameterCoordinators (#3462) | 1 year ago
Alexander van Eck | 29aea09a81 | feat: Add support for `NamedTuple` when sharding parameters [#3029] (#3037) | 1 year ago
Olatunji Ruwase | 47f9f13bd3 | DeepSpeed Chat (#3186) | 1 year ago
Michael Wyatt | b361c72761 | Update DeepSpeed copyright license to Apache 2.0 (#3111) | 1 year ago
Jeff Rasley | 91d63e0228 | update formatter version and style settings (#3098) | 1 year ago
Olatunji Ruwase | e80ae08886 | Empty ZeRO3 partition cache (#3060) | 1 year ago
Ma, Guokai | 98cc35b6a8 | Abstract accelerator (step 3) (#2677) | 1 year ago
Samyam Rajbhandari | a298a43af2 | [zero-3] Handle forward parameter return correctly in nested cases (#2642) | 1 year ago
Ammar Ahmad Awan | b5ac0d542d | [zero-3] print warning once and support torch parameter (#2127) | 2 years ago
Olatunji Ruwase | 5870f36c58 | Correctly detect offload configuration (#2208) | 2 years ago
Olatunji Ruwase | 2210ebe70f | Release swap buffers for persisted params (#2089) | 2 years ago
Michael Wyatt | 5997589683 | Refactor ZeRO configs to use Pydantic (#2004) | 2 years ago
Olatunji Ruwase | 678c3fe330 | Split parameter offload from z3 (#2009) | 2 years ago