inkcherry
|
d5a7c1e0b4
Capture short kernel sequences to graph (#4318)
|
10 months ago |
Reza Yazdani
|
2afa1c7f2f
Communication Optimization for Large-Scale Training (#4695)
|
11 months ago |
Nadav Elyahu
|
c2074b3410
add option to disable pipeline partitioning (#4322)
|
11 months ago |
Reza Yazdani
|
ec029e7625
Fix the sequence-parallelism for the dense model architecture (#4530)
|
1 year ago |
Hongjiu "Enneamer" Zhang
|
8e64c3b550
feat: add Lion optimizer (#4331)
|
1 year ago |
Olatunji Ruwase
|
aa4a7401f8
ZeRO-Inference refresh (#4197)
|
1 year ago |
Michael Wyatt
|
9647ea791d
Add MuP optimizers (#2043)
|
1 year ago |
Quentin Anthony
|
0411a9f871
Expose Consecutive Hysteresis to Users (#3553)
|
1 year ago |
digger-yu
|
198166423d
fix spelling error with deepspeed/ (#3494)
|
1 year ago |
Zhen Zhang
|
2e99f6edf6
[DRAFT] Tentative implementation of MiCS (#2964)
|
1 year ago |
Olatunji Ruwase
|
47f9f13bd3
DeepSpeed Chat (#3186)
|
1 year ago |
Michael Wyatt
|
b361c72761
Update DeepSpeed copyright license to Apache 2.0 (#3111)
|
1 year ago |
Jeff Rasley
|
91d63e0228
update formatter version and style settings (#3098)
|
1 year ago |
Molly Smith
|
27e1b02deb
Remove bf16 from inference config dtye enum (#3010)
|
1 year ago |
Jeff Rasley
|
457850dc5a
[zero] prevent poor configs from running w. zero-offload (#2971)
|
1 year ago |
Jeff Rasley
|
da84e60d98
add missing license info to top of all source code (#2889)
|
1 year ago |
Michael Wyatt
|
4079077c2c
add support for hjson config files (#2783)
|
1 year ago |
Michael Wyatt
|
d923f7c895
Refactor/Pydantify monitoring config (#2640)
|
1 year ago |
Conglong Li
|
ef869377e9
DeepSpeed Data Efficiency Library (#2585)
|
1 year ago |
Cheng Li
|
abe4fc6b55
encoded ds config into command line argument when launching child processes in autotuning (#2524)
|
1 year ago |
Joe Mayer
|
21c2802964
Adding Gradient Accumulation Data Type Config (#2512)
|
1 year ago |
Joe Mayer
|
7d113633e4
Fix Bug #2319 (#2438)
|
2 years ago |
Adam Moody
|
b8fb9c3f1a
parallelize writing of layer checkpoint files across data parallel instances (#1419)
|
2 years ago |
Olatunji Ruwase
|
28dfca8a13
Log user config exactly (#2201)
|
2 years ago |
Jeff Rasley
|
a039e2261a
enable fp16 input autocasting (#2158)
|
2 years ago |
Arpan Jain
|
1ed5aa96a8
Elastic Training support in DeepSpeed (#2153) (#2156)
|
2 years ago |
trajep
|
e669aaf55b
Trajepl/nebula ckpt engine (#2085)
|
2 years ago |
Michael Wyatt
|
5997589683
Refactor ZeRO configs to use Pydantic (#2004)
|
2 years ago |
Quentin Anthony
|
5349347bb6
DeepSpeed Communication Profiling and Logging (#2012)
|
2 years ago |
Olatunji Ruwase
|
80d0a32f0b
Checkpoint reshaping (#1953)
|
2 years ago |