Zhen Zhang 8a63754bce save_non_zero_checkpoint on first partition group (#3787) 1 year ago
autotuning b361c72761 Update DeepSpeed copyright license to Apache 2.0 (#3111) 1 year ago
checkpoint 8a63754bce save_non_zero_checkpoint on first partition group (#3787) 1 year ago
comm 1bc3b78423 [CPU] Use allreduce_low_latency for AutoTP and implement low latency allreduce for CPU backend (single node) (#3919) 1 year ago
compression 6b2365e4fa Re-enable elastic training for torch 2+ (#4010) 1 year ago
elasticity 7290aace9b [CPU] Skip CPU support unimplemented error (#3633) 1 year ago
hybrid_engine 7290aace9b [CPU] Skip CPU support unimplemented error (#3633) 1 year ago
inference 76953a37b7 fix opt-350m shard loading issue in AutoTP (#3600) 1 year ago
launcher 1f72082fc0 [CPU] Support Intel CPU inference (#3041) 1 year ago
model_parallelism 6b2365e4fa Re-enable elastic training for torch 2+ (#4010) 1 year ago
moe 6b2365e4fa Re-enable elastic training for torch 2+ (#4010) 1 year ago
monitor b361c72761 Update DeepSpeed copyright license to Apache 2.0 (#3111) 1 year ago
ops 7290aace9b [CPU] Skip CPU support unimplemented error (#3633) 1 year ago
pipe 7ddc3b01dd Fix pipeline module evaluation when contiguous activation checkpointing is enabled (#3005) 1 year ago
profiling 6b2365e4fa Re-enable elastic training for torch 2+ (#4010) 1 year ago
runtime 7f90ef4bdd Multiple zero stage 3 related fixes (#3886) 1 year ago
utils b361c72761 Update DeepSpeed copyright license to Apache 2.0 (#3111) 1 year ago
__init__.py b361c72761 Update DeepSpeed copyright license to Apache 2.0 (#3111) 1 year ago
alexnet_model.py aef6c65ce3 Reduce Unit Test Times (Part 3) (#3850) 1 year ago
common.py 1bc3b78423 [CPU] Use allreduce_low_latency for AutoTP and implement low latency allreduce for CPU backend (single node) (#3919) 1 year ago
ds_batch_config.json ff42743865 Refactor remaining distributed tests (#2216) 2 years ago
gpt2-merges.txt ff42743865 Refactor remaining distributed tests (#2216) 2 years ago
gpt2-vocab.json ff42743865 Refactor remaining distributed tests (#2216) 2 years ago
megatron_model.py 4b35833379 Revert "Update megatron GPT2Model" 1 year ago
modeling.py b361c72761 Update DeepSpeed copyright license to Apache 2.0 (#3111) 1 year ago
modelingpreln.py b361c72761 Update DeepSpeed copyright license to Apache 2.0 (#3111) 1 year ago
multi_output_model.py b361c72761 Update DeepSpeed copyright license to Apache 2.0 (#3111) 1 year ago
simple_model.py 2ded2ff0be checking process_group before merging bucket ranges (#3521) (#3577) 1 year ago
util.py 7b850d3d04 Re-enable skipped unit tests (#3939) 1 year ago