Masahiro Tanaka 8f81634e45 Merge branch 'master' into tohtana/offload_zero_buffers 1 month ago
accelerator b20c46745b add missing methods to MPS_Accelerator (#5134) 8 months ago
autotuning b361c72761 Update DeepSpeed copyright license to Apache 2.0 (#3111) 1 year ago
checkpoint 774b897736 fix the missing argument in test and typo (#5730) 3 months ago
comm 19da95f783 [CPU] add fp16 support to shm inference_all_reduce (#5669) 3 months ago
compression 6dcced1d5c Cleanup required_torch_version code and references. (#5370) 6 months ago
elasticity 8e4f6e48db Skip the UT cases that use unimplemented op builders. (#5372) 5 months ago
hybrid_engine f69f8840fc Removal of cuda hardcoded string with get_device function (#5351) 6 months ago
inference 89c4d9f5a7 TestLowCpuMemUsage UT get device by device_name (#6397) 1 month ago
launcher c08e69f212 Make op builder detection adapt to accelerator change (#5206) 7 months ago
linear 6e5d58d248 OptimizedLinear updates (#5791) 2 months ago
model_parallelism 6dcced1d5c Cleanup required_torch_version code and references. (#5370) 6 months ago
moe 9a3ede7079 add moe topk(k>2) gate support (#5881) 2 months ago
monitor 488a823f64 New integration - CometMonitor (#5466) 5 months ago
ops b5cf30a085 Dtype support check for accelerator in UTs (#6360) 1 month ago
pipe 7ddc3b01dd Fix pipeline module evaluation when contiguous activation checkpointing is enabled (#3005) 1 year ago
profiling 6dcced1d5c Cleanup required_torch_version code and references. (#5370) 6 months ago
runtime d33807907a validate devcies of offload states 1 month ago
sequence_parallelism 8b191d7ccf Long sequence parallelism (Ulysses) integration with HuggingFace (#5774) 2 months ago
utils 08e0733e4a Support MoE for pipeline models (#5338) 6 months ago
__init__.py b361c72761 Update DeepSpeed copyright license to Apache 2.0 (#3111) 1 year ago
alexnet_model.py 9bc4cd01b7 Store/Load CIFAR from local/offline (#6390) 1 month ago
common.py 659f6be105 Avoid security issues of subprocess shell (#6498) 1 month ago
ds_batch_config.json ff42743865 Refactor remaining distributed tests (#2216) 2 years ago
gpt2-merges.txt ff42743865 Refactor remaining distributed tests (#2216) 2 years ago
gpt2-vocab.json ff42743865 Refactor remaining distributed tests (#2216) 2 years ago
megatron_model.py 4b35833379 Revert "Update megatron GPT2Model" 1 year ago
modeling.py 180dd39714 Clean up modeling code (#4320) 1 year ago
modelingpreln.py 180dd39714 Clean up modeling code (#4320) 1 year ago
multi_output_model.py c08e69f212 Make op builder detection adapt to accelerator change (#5206) 7 months ago
simple_model.py c08e69f212 Make op builder detection adapt to accelerator change (#5206) 7 months ago
util.py 1ab1928d79 Enable dynamic shapes for pipeline parallel engine inputs (#5481) 2 months ago