Ma, Guokai f15cccfa0c [AutoTP] Make AutoTP work when num_heads not divisible by number of workers (#4011) 1 year ago
autotuning b361c72761 Update DeepSpeed copyright license to Apache 2.0 (#3111) 1 year ago
checkpoint 4294ea172c CI fix for torch 2.1 release (#4452) 1 year ago
comm 1bc3b78423 [CPU] Use allreduce_low_latency for AutoTP and implement low latency allreduce for CPU backend (single node) (#3919) 1 year ago
compression 9bf77782b2 Fix a bug in the implementation of dequantization for inference (#3433) 1 year ago
elasticity 7290aace9b [CPU] Skip CPU support unimplemented error (#3633) 1 year ago
hybrid_engine 388c84834f add CPU autotp UT (#4263) 1 year ago
inference f15cccfa0c [AutoTP] Make AutoTP work when num_heads not divisible by number of workers (#4011) 1 year ago
launcher 8145b5e41f added port argument for ssh (#4117) 1 year ago
model_parallelism 6b2365e4fa Re-enable elastic training for torch 2+ (#4010) 1 year ago
moe 6b2365e4fa Re-enable elastic training for torch 2+ (#4010) 1 year ago
monitor b361c72761 Update DeepSpeed copyright license to Apache 2.0 (#3111) 1 year ago
ops 8e64c3b550 feat: add Lion optimizer (#4331) 1 year ago
pipe 7ddc3b01dd Fix pipeline module evaluation when contiguous activation checkpointing is enabled (#3005) 1 year ago
profiling 6b2365e4fa Re-enable elastic training for torch 2+ (#4010) 1 year ago
runtime 604d701e35 Introduce pydantic_v1 compatibility module for pydantic>=2.0.0 support (#4407) 1 year ago
utils b361c72761 Update DeepSpeed copyright license to Apache 2.0 (#3111) 1 year ago
__init__.py b361c72761 Update DeepSpeed copyright license to Apache 2.0 (#3111) 1 year ago
alexnet_model.py aef6c65ce3 Reduce Unit Test Times (Part 3) (#3850) 1 year ago
common.py d9a889d559 Fix nv-nightly workflow (#4163) 1 year ago
ds_batch_config.json ff42743865 Refactor remaining distributed tests (#2216) 2 years ago
gpt2-merges.txt ff42743865 Refactor remaining distributed tests (#2216) 2 years ago
gpt2-vocab.json ff42743865 Refactor remaining distributed tests (#2216) 2 years ago
megatron_model.py 4b35833379 Revert "Update megatron GPT2Model" 1 year ago
modeling.py 180dd39714 Clean up modeling code (#4320) 1 year ago
modelingpreln.py 180dd39714 Clean up modeling code (#4320) 1 year ago
multi_output_model.py b361c72761 Update DeepSpeed copyright license to Apache 2.0 (#3111) 1 year ago
simple_model.py 2ded2ff0be checking process_group before merging bucket ranges (#3521) (#3577) 1 year ago
util.py 7b850d3d04 Re-enable skipped unit tests (#3939) 1 year ago