Olatunji Ruwase
|
7f90ef4bdd
Multiple zero stage 3 related fixes (#3886)
|
1 year ago |
Ma, Guokai
|
0f5406323c
[CPU] FusedAdam and CPU training support (#3991)
|
1 year ago |
digger yu
|
fc8de76f1d
Simplify chain comparisons, remove redundant parentheses (#3912)
|
1 year ago |
hipudding
|
7528035c1e
Use device_name instead of device index to support other device (#3933)
|
1 year ago |
Heyang Qin
|
e59f69a8ff
remove the call to param.ds_tensor from print (#3928)
|
1 year ago |
hipudding
|
e292343d7b
Del comment deepspeed.zero.Init() can be used as a decorator (#3894)
|
1 year ago |
Heyang Qin
|
f8551b439e
Fix racing condition in GatheredParameters (#3819)
|
1 year ago |
Masahiro Tanaka
|
203ac9d7ac
support model declaration in zero.Init context (#3592)
|
1 year ago |
Heyang Qin
|
d18aa2c79c
ZeRO++ (#3784)
|
1 year ago |
Olatunji Ruwase
|
046afcedb4
Increase tensor creator coverage (#3684)
|
1 year ago |
hablb
|
0977106ac9
zero3 performance optimizations (#3622)
|
1 year ago |
digger yu
|
5d14afd26c
fix typo deepspeed/runtime (#3663)
|
1 year ago |
Yizhou Wang
|
9f4a876360
Fix broadcast error on multi-node training with ZeroStage3 and TensorParallel=2 (#2999)
|
1 year ago |
Zhen Zhang
|
2e99f6edf6
[DRAFT] Tentative implementation of MiCS (#2964)
|
1 year ago |
Masahiro Tanaka
|
717c30203e
Nested zero.Init() and dynamically defined model class (#2989)
|
1 year ago |
Olatunji Ruwase
|
47f9f13bd3
DeepSpeed Chat (#3186)
|
1 year ago |
Olatunji Ruwase
|
4d27225f3e
zero.Init() should pin params in GPU memory as requested (#2953)
|
1 year ago |
Michael Wyatt
|
b361c72761
Update DeepSpeed copyright license to Apache 2.0 (#3111)
|
1 year ago |
Mayank Mishra
|
a6317eb509
♻️ replace deprecated functions for communication (#2995)
|
1 year ago |
Jeff Rasley
|
91d63e0228
update formatter version and style settings (#3098)
|
1 year ago |
Stas Bekman
|
30d3f5df7a
fix a mispelled attribute (#2750)
|
1 year ago |
Ma, Guokai
|
98cc35b6a8
Abstract accelerator (step 3) (#2677)
|
1 year ago |
Stas Bekman
|
ddd48b36ac
[GatheredParameters] fix memory leak (#2665)
|
1 year ago |
Stas Bekman
|
217cc07bb5
[GatheredParameters] add support for any iterator (#2664)
|
1 year ago |
iLeGend
|
06e00f61ce
Fix typos: deepseed -> deepspeed (#2499)
|
1 year ago |
Olatunji Ruwase
|
2210ebe70f
Release swap buffers for persisted params (#2089)
|
2 years ago |
Jeff Rasley
|
46401b3884
[zero-3] shutdown zero.Init from within ds.init (#2150)
|
2 years ago |
Michael Wyatt
|
5997589683
Refactor ZeRO configs to use Pydantic (#2004)
|
2 years ago |
Alex Hedges
|
316c4a43e0
Add flake8 to pre-commit checks (#2051)
|
2 years ago |
Quentin Anthony
|
5349347bb6
DeepSpeed Communication Profiling and Logging (#2012)
|
2 years ago |