Commit      Author              Date           Message
----------  ------------------  -------------  -------
c66bc4269e  inkcherry           6 months ago   set the default to use set_to_none for clearing gradients in BF16 optimizer. (#5434)
08e0733e4a  Moshe Island        6 months ago   Support MoE for pipeline models (#5338)
40009eb1c7  YangQun             6 months ago   BF16 optimizer: Clear lp grads after updating hp grads in hook (#5328)
c56a4b9e0d  Masahiro Tanaka     6 months ago   Improve universal checkpoint (#5289)
e5dd5501c1  inkcherry           6 months ago   support bf16_optimizer moe expert parallel training and moe EP grad_scale/grad_norm fix (#5259)
b112c99ea8  Masahiro Tanaka     7 months ago   Fix loading a universal checkpoint (#5263)
3c0bd31288  Max Kovalenko       8 months ago   BF16 optimizer: Improve device utilization by immediate grad update (#4975)
18179807f5  Masahiro Tanaka     8 months ago   Remove optimizer step on initialization (#5104)
d5a7c1e0b4  inkcherry           10 months ago  Capture short kernel sequences to graph (#4318)
ce6070800a  Nadav Elyahu        10 months ago  BF16_Optimizer: add support for bf16 grad acc (#4713)
fd0a52c1ac  taozhiwei           10 months ago  use all_gather_into_tensor instead of all_gather (#4705)
8ad187d84f  Moshe Island        11 months ago  Universal ckp fixes (#4588)
7ed952eff1  Kazuki Fujii        1 year ago     Fix bug in bfloat16 optimizer related to checkpointing (#4434)
5a5340d03b  mzl                 1 year ago     remove UtilsBuilder load, use torch (un)flatten ops (#3728)
dd8df20fe0  Olatunji Ruwase     1 year ago     zero3 checkpoint frozen params (#3205)
b361c72761  Michael Wyatt       1 year ago     Update DeepSpeed copyright license to Apache 2.0 (#3111)
91d63e0228  Jeff Rasley         1 year ago     update formatter version and style settings (#3098)
541e423ae6  Olatunji Ruwase     1 year ago     Enable tensor fragments for zero 2 & 3 (#2727)
799120e7e4  Olatunji Ruwase     2 years ago    Universal checkpoint for zero stage 1 (#2284)
f4a92a19a6  Olatunji Ruwase     2 years ago    Checkpoint backwards-compatbility workaround (#2384)
53182531ed  Olatunji Ruwase     2 years ago    Refactor universal checkpointing and tensor fragments (#2253)
4671cce558  Mikhail Druzhinin   2 years ago    Fix OrderedDict import for python3.6 (#2267)
57140e8e95  shjwudp             2 years ago    fix: fix BF16_Optimizer compatibility issue with optimizer state 0-dim tensor (#2152)
316c4a43e0  Alex Hedges         2 years ago    Add flake8 to pre-commit checks (#2051)
80d0a32f0b  Olatunji Ruwase     2 years ago    Checkpoint reshaping (#1953)
735406e536  Karim Foda          2 years ago    fix import errors (#2026)
36ad3119d5  Ammar Ahmad Awan    2 years ago    DeepSpeed comm backend v1 (#1985)
50893458d6  Jeff Rasley         2 years ago    Fairseq support (#1915)
af58f63dde  Olatunji Ruwase     2 years ago    bf16 inference (#1917)
56c5223868  Olatunji Ruwase     2 years ago    bf16+pipeline parallelism (#1801)