Yasyf Mohamedali
|
d3de737550
Remove deprecated `torch._six` imports (#2863)
|
1 year ago |
Ma, Guokai
|
98cc35b6a8
Abstract accelerator (step 3) (#2677)
|
1 year ago |
Guo Yejun
|
d0dbc95a90
call empty_cache to really free up GPU memory as described in comment (#2620)
|
1 year ago |
Alex Hedges
|
316c4a43e0
Add flake8 to pre-commit checks (#2051)
|
2 years ago |
Karim Foda
|
735406e536
fix import errors (#2026)
|
2 years ago |
Ammar Ahmad Awan
|
36ad3119d5
DeepSpeed comm backend v1 (#1985)
|
2 years ago |
Olatunji Ruwase
|
56c5223868
bf16+pipeline parallelism (#1801)
|
2 years ago |
Ammar Ahmad Awan
|
c0af6d90f7
Refactor MoE and Groups API to simplify model creation and mangement (#1798)
|
2 years ago |
Justin Chiu
|
4912e0ad7e
Various ZeRO Stage3 Optimizations + Improvements (including bfloat16 support) (#1453)
|
2 years ago |
Jeff Rasley
|
e46d808a1b
MoE inference + PR-MoE model support (#1705)
|
2 years ago |
Alex Hedges
|
fc2f378ece
Improve pre-commit hooks (#1602)
|
2 years ago |
Jeff Rasley
|
2332cb31a7
Enables ZeRO-3 inference (#1514)
|
2 years ago |
Cheng Li
|
9caa74e577
Autotuning (#1554)
|
2 years ago |
Jeff Rasley
|
e2fdd254ed
Big science related changes (#1407)
|
3 years ago |
Ammar Ahmad Awan
|
ddffbae021
Remove duplicate clip grad function in deepspeed (#1333)
|
3 years ago |
Olatunji Ruwase
|
85acf14c58
Activation checkpointing improvements (#1254)
|
3 years ago |
Ammar Ahmad Awan
|
f28432441b
DeepSpeed MoE (#1310)
|
3 years ago |
Stas Bekman
|
32e85eda58
[see_memory_usage] fix deprecation (#1234)
|
3 years ago |
Shaden Smith
|
46f4573b1a
Seeded unit tests (#1072)
|
3 years ago |
Olatunji Ruwase
|
e88ebbcfc9
Use amp autocast in ZeRO3 linear (#990)
|
3 years ago |
Conglong Li
|
67a48aaa89
1-bit LAMB optimizer (#970)
|
3 years ago |
Stas Bekman
|
7f03282c51
[debug utils] see_memory_usage fixes (#890)
|
3 years ago |
Samyam Rajbhandari
|
599258f979
ZeRO 3 Offload (#834)
|
3 years ago |
Olatunji Ruwase
|
ec8b1cb0a0
Activation checkpointing for non-tensor arguments and return values (#741)
|
3 years ago |
Jeff Rasley
|
44bd538b11
Module replacement support (#586)
|
3 years ago |
Jeff Rasley
|
08c96a1bc6
ZeRO-1 tune max-elems + bug fix (#532)
|
3 years ago |
Shaden Smith
|
65c2f974d8
Pipeline parallel training engine. (#392)
|
4 years ago |
Ammar Ahmad Awan
|
01726ce2b8
Add 1-bit Adam support to DeepSpeed (#380)
|
4 years ago |
Jeff Rasley
|
e5bbc2e559
Sparse attn + ops/runtime refactor + v0.3.0 (#343)
|
4 years ago |