Nadav Elyahu 54c0687264 stage3: efficient compute of scaled_global_grad_norm (#5256) 6 months ago
__init__.py 2e99f6edf6 [DRAFT] Tentative implementation of MiCS (#2964) 1 year ago
config.py 2afa1c7f2f Communication Optimization for Large-Scale Training (#4695) 11 months ago
contiguous_memory_allocator.py 9ec55bd99b Fix f-string messages (#4865) 9 months ago
linear.py 28b9d5c231 Add condition when dimension is greater than 2 (#4390) 1 year ago
mics.py c3cfe96bb3 Enable torch.compile with ZeRO (Experimental) (#4878) 8 months ago
mics_utils.py 2e99f6edf6 [DRAFT] Tentative implementation of MiCS (#2964) 1 year ago
offload_config.py b1cb0dfc46 Guanhua/partial offload rebase v2 (#590) (#4636) 11 months ago
parameter_offload.py c3cfe96bb3 Enable torch.compile with ZeRO (Experimental) (#4878) 8 months ago
partition_parameters.py c08e69f212 Make op builder detection adapt to accelerator change (#5206) 7 months ago
partitioned_param_coordinator.py 697f945a05 Split is_synchronized_device api to multiple apis (#5026) 8 months ago
partitioned_param_profiler.py d18aa2c79c ZeRO++ (#3784) 1 year ago
stage3.py 54c0687264 stage3: efficient compute of scaled_global_grad_norm (#5256) 6 months ago
stage_1_and_2.py 7b5b06602d fix pagable h2d memcpy (#5301) 6 months ago
test.py b361c72761 Update DeepSpeed copyright license to Apache 2.0 (#3111) 1 year ago
tiling.py b361c72761 Update DeepSpeed copyright license to Apache 2.0 (#3111) 1 year ago
utils.py c3cfe96bb3 Enable torch.compile with ZeRO (Experimental) (#4878) 8 months ago