Name | Commit | Last commit message | Updated
.. | | |
autotuning | c052545122 | DS #4993 #662 : autotune single node hostfile bugfix (#4996) | 9 months ago
checkpoint | 8998707a2f | Universal Checkpoint for Sequence Parallelism (#4752) | 10 months ago
comm | c3cfe96bb3 | Enable torch.compile with ZeRO (Experimental) (#4878) | 8 months ago
compression | 389bf69319 | fix: Remove duplicate word the (#4051) | 1 year ago
elasticity | d2e9adce39 | Fix error report of DSElasticAgent._set_master_addr_port() (#4985) | 9 months ago
inference | 1d35db76a0 | Refactor the Qwen positional emebdding config code (#4955) | 9 months ago
launcher | 8f6277001a | launcher_helper: enable fds passing (#5042) | 8 months ago
model_implementations | d5a7c1e0b4 | Capture short kernel sequences to graph (#4318) | 10 months ago
module_inject | e212845e39 | Add backwards compatibility w/ older versions of diffusers (<0.25.0) (#5083) | 8 months ago
moe | 9922270f47 | Further refactor deepspeed.moe.utils + deepspeed.moe.layer type hints (#5060) | 8 months ago
monitor | 604d701e35 | Introduce pydantic_v1 compatibility module for pydantic>=2.0.0 support (#4407) | 1 year ago
nebula | cd4e473ee6 | fix typo with deepspeed/ (#3547) | 1 year ago
ops | fd0a52c1ac | use all_gather_into_tensor instead of all_gather (#4705) | 10 months ago
pipe | b361c72761 | Update DeepSpeed copyright license to Apache 2.0 (#3111) | 1 year ago
profiling | 61391229c9 | Update flops profiler to recurse (#4374) | 11 months ago
runtime | 5ce448d326 | Switch hasattr to check for compiler and not compile since compile was introduced in torch 2.0 but compiler was introduced in torch 2.1, this fixes issues for those building with torch 2.0 | 8 months ago
sequence | 2afa1c7f2f | Communication Optimization for Large-Scale Training (#4695) | 11 months ago
utils | 19e0dc39ba | Delay reduce-scatter for ZeRO3 leaf modules (#5008) | 8 months ago
__init__.py | c3cfe96bb3 | Enable torch.compile with ZeRO (Experimental) (#4878) | 8 months ago
accelerator | 9548d48f48 | Abstract accelerator (step 2) (#2560) | 1 year ago
constants.py | 706a72562a | Allow env var for timeout (#4405) | 1 year ago
env_report.py | c1ba6a104f | [CANN] Support cpu offload optimizer for Ascend NPU (#4568) | 11 months ago
git_version_info.py | 57a27b0803 | add type checker ignore to resolve that pylance can't resolved noqa annotation (#4102) | 1 year ago
pydantic_v1.py | 604d701e35 | Introduce pydantic_v1 compatibility module for pydantic>=2.0.0 support (#4407) | 1 year ago