Logan Adams
|
6dcced1d5c
Cleanup required_torch_version code and references. (#5370)
|
6 months ago |
Masahiro Tanaka
|
c56a4b9e0d
Improve universal checkpoint (#5289)
|
6 months ago |
Jackmin801
|
58a206059f
Small docstring fix (#4431)
|
1 year ago |
Jackmin801
|
2f73b834b5
change default set_to_none in zero_grad methods (#4438)
|
1 year ago |
Logan Adams
|
6b2365e4fa
Re-enable elastic training for torch 2+ (#4010)
|
1 year ago |
Michael Wyatt
|
b361c72761
Update DeepSpeed copyright license to Apache 2.0 (#3111)
|
1 year ago |
Jeff Rasley
|
91d63e0228
update formatter version and style settings (#3098)
|
1 year ago |
Ma, Guokai
|
98cc35b6a8
Abstract accelerator (step 3) (#2677)
|
1 year ago |
loadams
|
34a11688c4
Change zero_grad() argument to match pytorch (#2741)
|
1 year ago |
JackieWu
|
323c266cfe
[Bug Fixed] use torch.cuda.is_available() (#2661)
|
1 year ago |
Karim Foda
|
735406e536
fix import errors (#2026)
|
2 years ago |
Ammar Ahmad Awan
|
36ad3119d5
DeepSpeed comm backend v1 (#1985)
|
2 years ago |
Jeff Rasley
|
50893458d6
Fairseq support (#1915)
|
2 years ago |
Olatunji Ruwase
|
135a625619
Move param_shapes to model files (#1732)
|
2 years ago |
Alex Hedges
|
4cf970e6bb
Add codespell to pre-commit checks (#1717)
|
2 years ago |
Jeff Rasley
|
3293cf72a0
[ZeRO] Default disable elastic ckpt in stage 1+2 and reduce CPU memory overhead during ckpt load (#1525)
|
2 years ago |
Jeff Rasley
|
e2fdd254ed
Big science related changes (#1407)
|
3 years ago |
Ammar Ahmad Awan
|
f28432441b
DeepSpeed MoE (#1310)
|
3 years ago |
Reza Yazdani
|
ed3de0c21b
Quantization + inference release (#1091)
|
3 years ago |
Conglong Li
|
67a48aaa89
1-bit LAMB optimizer (#970)
|
3 years ago |
Stas Bekman
|
29853c3eed
less scary overflow notice (#833)
|
3 years ago |
Shaden Smith
|
65c2f974d8
Pipeline parallel training engine. (#392)
|
4 years ago |
Jeff Rasley
|
e5bbc2e559
Sparse attn + ops/runtime refactor + v0.3.0 (#343)
|
4 years ago |