Masahiro Tanaka c9fc34a4be Use file store for tests (#6632) 4 days ago
accelerator b20c46745b add missing methods to MPS_Accelerator (#5134) 8 months ago
autotuning b361c72761 Update DeepSpeed copyright license to Apache 2.0 (#3111) 1 year ago
checkpoint 774b897736 fix the missing argument in test and typo (#5730) 3 months ago
comm 19da95f783 [CPU] add fp16 support to shm inference_all_reduce (#5669) 3 months ago
compression 6dcced1d5c Cleanup required_torch_version code and references. (#5370) 6 months ago
elasticity 8e4f6e48db Skip the UT cases that use unimplemented op builders. (#5372) 5 months ago
hybrid_engine f69f8840fc Removal of cuda hardcoded string with get_device function (#5351) 6 months ago
inference 1a45bd8e8c Lock cache file of HF model list (#6628) 6 days ago
launcher 13c16c9562 Accept btl_tcp_if_include option through launcher_args (#6613) 1 week ago
linear 6e5d58d248 OptimizedLinear updates (#5791) 2 months ago
model_parallelism 6dcced1d5c Cleanup required_torch_version code and references. (#5370) 6 months ago
moe 9a3ede7079 add moe topk(k>2) gate support (#5881) 2 months ago
monitor 488a823f64 New integration - CometMonitor (#5466) 5 months ago
ops a1f98bdc70 AIO CPU Locked Tensor (#6592) 1 week ago
pipe 7ddc3b01dd Fix pipeline module evaluation when contiguous activation checkpointing is enabled (#3005) 1 year ago
profiling 6dcced1d5c Cleanup required_torch_version code and references. (#5370) 6 months ago
runtime 85b7469ea0 Add first Step in LR Schedulers (#6597) 1 week ago
sequence_parallelism 8b191d7ccf Long sequence parallelism (Ulysses) integration with HuggingFace (#5774) 2 months ago
utils 08e0733e4a Support MoE for pipeline models (#5338) 6 months ago
__init__.py b361c72761 Update DeepSpeed copyright license to Apache 2.0 (#3111) 1 year ago
alexnet_model.py 9bc4cd01b7 Store/Load CIFAR from local/offline (#6390) 1 month ago
common.py c9fc34a4be Use file store for tests (#6632) 4 days ago
ds_batch_config.json ff42743865 Refactor remaining distributed tests (#2216) 2 years ago
gpt2-merges.txt ff42743865 Refactor remaining distributed tests (#2216) 2 years ago
gpt2-vocab.json ff42743865 Refactor remaining distributed tests (#2216) 2 years ago
megatron_model.py 4b35833379 Revert "Update megatron GPT2Model" 1 year ago
modeling.py 180dd39714 Clean up modeling code (#4320) 1 year ago
modelingpreln.py 180dd39714 Clean up modeling code (#4320) 1 year ago
multi_output_model.py c08e69f212 Make op builder detection adapt to accelerator change (#5206) 7 months ago
simple_model.py c08e69f212 Make op builder detection adapt to accelerator change (#5206) 7 months ago
util.py 1ab1928d79 Enable dynamic shapes for pipeline parallel engine inputs (#5481) 2 months ago