Samyam Rajbhandari | 599258f979 | ZeRO 3 Offload (#834) | 3 years ago
Jeff Rasley | f032e56f8a | Validate consistent ckpt tags across ranks (#667) | 3 years ago
Jeff Rasley | 7435b2f10a | Ability to initialize distributed backend outside deepspeed runtime (#608) | 3 years ago
Jeff Rasley | dce054dbba | backwards compatability w. v020 ckpts, fix issue with zero-1 ckpts (#543) | 3 years ago
Jeff Rasley | 31f46feee2 | DeepSpeed JIT op + PyPI support (#496) | 4 years ago
Reza Yazdani | f5aa2547d8 | Add CPUAdam optimizer for zero-offload in deepspeed engine (#484) | 4 years ago
Shaden Smith | 65c2f974d8 | Pipeline parallel training engine. (#392) | 4 years ago
Jeff Rasley | 41db1c2f03 | ZeRO-Offload release (#391) | 4 years ago
Jeff Rasley | e5bbc2e559 | Sparse attn + ops/runtime refactor + v0.3.0 (#343) | 4 years ago
Chunyang Wen | e1ad8803eb | Add log util (#230) | 4 years ago
Jeff Rasley | 734d8991c8 | Transformer kernel release (#242) | 4 years ago
Jeff Rasley | f2ac7eafd5 | ZeRO-2 (#217) | 4 years ago
Shaden Smith | dd166ee6b6 | README and RTD improvements. (#198) | 4 years ago
Jeff Rasley | 7e8132832f | MPI 3.x support via mpi4py (#107) | 4 years ago
Jeff Rasley | 5aa58b3878 | Init distributed torch only if needed (#108) | 4 years ago
Jeff Rasley | 5897091eb9 | add deprecated deepspeed flag for legacy code (#104) | 4 years ago
Jeff Rasley | 001abe2362 | Refactor simple model test, fix pythonpath issue (#96) | 4 years ago
Shaden Smith | 2abef1ef76 | Updating MPU docs (#92) | 4 years ago
Shaden Smith | 50ae149f82 | Moving to major/minor/patch versioning. (#51) | 4 years ago
Olatunji Ruwase | 8326aff279 | Improve doc string for add_XXX_arguments (#32) | 4 years ago
Shaden Smith | b18eae24e8 | Fixing file permissions (#1) | 4 years ago
Jeff Rasley | 6ef93347ed | add deepspeed init | 4 years ago