Jeff Rasley | 0fc11fa0e6 | [squash] zero-ckpt-cpu-issue (#1673) | 2 years ago
Mikhail Druzhinin | d14baad940 | allreduce_always_fp16 (#1487) | 2 years ago
Rana Ali Amjad | 648f7bfa50 | Bfloat16 zero2 (#1398) | 3 years ago
Hari Prasad | c0b27fb019 | Added drop_last to DeepSpeedDataLoader (#1321) | 3 years ago
Ammar Ahmad Awan | f28432441b | DeepSpeed MoE (#1310) | 3 years ago
Conglong Li | b2b34ae342 | Curriculum learning (#1307) | 3 years ago
Reza Yazdani | ed3de0c21b | Quantization + inference release (#1091) | 3 years ago
Jeff Rasley | f032e56f8a | Validate consistent ckpt tags across ranks (#667) | 3 years ago
Jeff Rasley | 7435b2f10a | Ability to initialize distributed backend outside deepspeed runtime (#608) | 3 years ago
Olatunji Ruwase | be1147c08a | PLD release (#513) | 4 years ago
Jeff Rasley | 41db1c2f03 | ZeRO-Offload release (#391) | 4 years ago
Arash Ashari | a64b0abbcc | fixed a typo; this was fixed before but seems like it has been lost in the refactor (#364) | 4 years ago
Jeff Rasley | e5bbc2e559 | Sparse attn + ops/runtime refactor + v0.3.0 (#343) | 4 years ago