Justin Chiu
|
4912e0ad7e
Various ZeRO Stage3 Optimizations + Improvements (including bfloat16 support) (#1453)
|
2 years ago |
Jeff Rasley
|
e46d808a1b
MoE inference + PR-MoE model support (#1705)
|
2 years ago |
Alex Hedges
|
fc2f378ece
Improve pre-commit hooks (#1602)
|
2 years ago |
Jeff Rasley
|
2332cb31a7
Enables ZeRO-3 inference (#1514)
|
2 years ago |
Cheng Li
|
9caa74e577
Autotuning (#1554)
|
2 years ago |
Jeff Rasley
|
e2fdd254ed
Big science related changes (#1407)
|
3 years ago |
Ammar Ahmad Awan
|
ddffbae021
Remove duplicate clip grad function in deepspeed (#1333)
|
3 years ago |
Olatunji Ruwase
|
85acf14c58
Activation checkpointing improvements (#1254)
|
3 years ago |
Ammar Ahmad Awan
|
f28432441b
DeepSpeed MoE (#1310)
|
3 years ago |
Stas Bekman
|
32e85eda58
[see_memory_usage] fix deprecation (#1234)
|
3 years ago |
Shaden Smith
|
46f4573b1a
Seeded unit tests (#1072)
|
3 years ago |
Olatunji Ruwase
|
e88ebbcfc9
Use amp autocast in ZeRO3 linear (#990)
|
3 years ago |
Conglong Li
|
67a48aaa89
1-bit LAMB optimizer (#970)
|
3 years ago |
Stas Bekman
|
7f03282c51
[debug utils] see_memory_usage fixes (#890)
|
3 years ago |
Samyam Rajbhandari
|
599258f979
ZeRO 3 Offload (#834)
|
3 years ago |
Olatunji Ruwase
|
ec8b1cb0a0
Activation checkpointing for non-tensor arguments and return values (#741)
|
3 years ago |
Jeff Rasley
|
44bd538b11
Module replacement support (#586)
|
3 years ago |
Jeff Rasley
|
08c96a1bc6
ZeRO-1 tune max-elems + bug fix (#532)
|
3 years ago |
Shaden Smith
|
65c2f974d8
Pipeline parallel training engine. (#392)
|
4 years ago |
Ammar Ahmad Awan
|
01726ce2b8
Add 1-bit Adam support to DeepSpeed (#380)
|
4 years ago |
Jeff Rasley
|
e5bbc2e559
Sparse attn + ops/runtime refactor + v0.3.0 (#343)
|
4 years ago |