noabauma db15ef578a deepspeed.init_distributed() support for TCP protocols (#2905) 1 year ago
..
autotuning da84e60d98 add missing license info to top of all source code (#2889) 1 year ago
checkpoint da84e60d98 add missing license info to top of all source code (#2889) 1 year ago
comm db15ef578a deepspeed.init_distributed() support for TCP protocols (#2905) 1 year ago
compression da84e60d98 add missing license info to top of all source code (#2889) 1 year ago
elasticity da84e60d98 add missing license info to top of all source code (#2889) 1 year ago
inference 87eaf8f99a Check for local CUDA graphs when enable_cuda_graph=True (#2941) 1 year ago
launcher 8d53ac0cd3 Add MPICH Multinode Runner (#2839) 1 year ago
model_implementations 87eaf8f99a Check for local CUDA graphs when enable_cuda_graph=True (#2941) 1 year ago
module_inject 0acf7e9c48 [RFC] add device abstraction to allow other device than CUDA be used (#2221) 1 year ago
moe da84e60d98 add missing license info to top of all source code (#2889) 1 year ago
monitor 91d7090e47 Fixes `AttributeError` in #2853 (#2854) 1 year ago
nebula da84e60d98 add missing license info to top of all source code (#2889) 1 year ago
ops 17fa0876ad Always convert input mask to half (#2851) 1 year ago
pipe da84e60d98 add missing license info to top of all source code (#2889) 1 year ago
profiling da84e60d98 add missing license info to top of all source code (#2889) 1 year ago
runtime 80d8fcbdb3 Improve overflow handling (#2944) 1 year ago
utils 541e423ae6 Enable tensor fragments for zero 2 & 3 (#2727) 1 year ago
__init__.py 4abf637f96 Remove mutable default parameter in init_inference() (#2540) 1 year ago
accelerator 9548d48f48 Abstract accelerator (step 2) (#2560) 1 year ago
constants.py 1ed5aa96a8 Elastic Training support in DeepSpeed (#2153) (#2156) 2 years ago
env_report.py da84e60d98 add missing license info to top of all source code (#2889) 1 year ago
git_version_info.py da84e60d98 add missing license info to top of all source code (#2889) 1 year ago