Olatunji Ruwase
|
865104be85
Support optimizer AdamW type (#670)
|
3 years ago |
Jeff Rasley
|
f032e56f8a
Validate consistent ckpt tags across ranks (#667)
|
3 years ago |
Cheng Li
|
e2fbe4d238
squash latest flops profiling changes (#1) (#664)
|
3 years ago |
Jeff Rasley
|
81aeea361d
Elastic training support (#602)
|
3 years ago |
Olatunji Ruwase
|
be1147c08a
PLD release (#513)
|
4 years ago |
Reza Yazdani
|
f5aa2547d8
Add CPUAdam optimizer for zero-offload in deepspeed engine (#484)
|
4 years ago |
Shaden Smith
|
65c2f974d8
Pipeline parallel training engine. (#392)
|
4 years ago |
Jeff Rasley
|
41db1c2f03
ZeRO-Offload release (#391)
|
4 years ago |
Ammar Ahmad Awan
|
01726ce2b8
Add 1-bit Adam support to DeepSpeed (#380)
|
4 years ago |
Arash Ashari
|
a64b0abbcc
fixed a typo; this was fixed before but seems like it has been lost in the refactor (#364)
|
4 years ago |
Jeff Rasley
|
e5bbc2e559
Sparse attn + ops/runtime refactor + v0.3.0 (#343)
|
4 years ago |