BacharL
|
697f945a05
Split is_synchronized_device api to multiple apis (#5026)
|
8 months ago |
Logan Adams
|
3255569b78
Switch hasattr check from compile to compiler (#5096)
|
8 months ago |
Yun Dai
|
688239e3f2
[xs] fix ZEROPP convergence test (#5061)
|
8 months ago |
mmhab
|
961bc85624
optimize clip_grad_norm_ function (#4915)
|
8 months ago |
minchao
|
4f477328c4
[NPU] replace 'cuda' with get_accelerator().device_name() (#5095)
|
8 months ago |
Nadav Elyahu
|
b42a470615
HPU Accelerator: fix supported_dtypes API (#5094)
|
8 months ago |
Logan Adams
|
ec49222cd8
Update nv-accelerate to latest torch (#5040)
|
8 months ago |
Masahiro Tanaka
|
c3cfe96bb3
Enable torch.compile with ZeRO (Experimental) (#4878)
|
8 months ago |
Lev Kurilenko
|
e212845e39
Add backwards compatibility w/ older versions of diffusers (<0.25.0) (#5083)
|
8 months ago |
Logan Adams
|
e469e7d98c
Update torch version for nv-torch-latest-cpu (#5086)
|
8 months ago |
Logan Adams
|
55eb78ee1f
Revert "Update nv-torch-latest-version"
|
8 months ago |
Logan Adams
|
889620b0a4
Update nv-torch-latest-version
|
8 months ago |
Masahiro Tanaka
|
5a721de32c
Stop tracking backward chain of broadcast in initialization (#5075)
|
8 months ago |
Masahiro Tanaka
|
f02d7bdadf
Fix verification for ZeRO3 leaf module (#5074)
|
8 months ago |
Matthew Hoffman
|
9922270f47
Further refactor deepspeed.moe.utils + deepspeed.moe.layer type hints (#5060)
|
8 months ago |
ByronHsu
|
3e6d606957
[doc/1-line change] default stage3_param_persistence_threshold is wrong in the doc (#5073)
|
8 months ago |
segyges
|
dde64b000c
Make batch size documentation clearer (#5072)
|
8 months ago |
ByronHsu
|
592325abde
[Zero++ qgZ] Fall back to reduce_scatter if `tensor.numel() % (2 * global_world_size) != 0` (#5056)
|
8 months ago |
Nadav Elyahu
|
2eafe41be7
adding hccl to init_distributed function description (#5034)
|
8 months ago |
Michael Wyatt
|
a049370c0c
Update import for changes to latest diffusers (#5065)
|
8 months ago |
Polisetty V R K Jyothendra Varma
|
567f97b264
load linear layer weight with given dtype (#4044)
|
8 months ago |
Nadav Elyahu
|
61daaa1ea2
Optimize grad_norm calculations by reducing device/host dependency (#4974)
|
8 months ago |
Masahiro Tanaka
|
19e0dc39ba
Delay reduce-scatter for ZeRO3 leaf modules (#5008)
|
8 months ago |
CurryRice233
|
6de31de73f
[NPU] Change log level to debug (#5051)
|
8 months ago |
Michael Wyatt
|
449f9ad01f
Fix broken model names in inference CI (#5053)
|
8 months ago |
Yun Dai
|
76ec8b4927
[doc] update inference related docs from `mp_size` to `tensor_parallel` for TP (#5048)
|
8 months ago |
Matthew Hoffman
|
971d82b573
MoE type hints (#5043)
|
8 months ago |
CurryRice233
|
88cca60afb
[NPU] Add NPU to support hybrid engine (#4831)
|
8 months ago |
Michael Wyatt
|
93e9537d4c
Fix nv-torch-latest-cpu CI (#5045)
|
8 months ago |
Yizhou Wang
|
8f6277001a
launcher_helper: enable fds passing (#5042)
|
8 months ago |