Logan Adams 5ce448d326 Switch hasattr to check for compiler and not compile since compile was introduced in torch 2.0 but compiler was introduced in torch 2.1, this fixes issues for those building with torch 2.0 8 months ago
activation_checkpointing 9e455d7651 Checkpointing: Avoid assigning tensor storage with different device (#4836) 10 months ago
checkpoint_engine c5edc91ecb change partititon_name to partition_name (#3700) 1 year ago
comm 592325abde [Zero++ qgZ] Fall back to reduce_scatter if `tensor.numel() % (2 * global_world_size) != 0` (#5056) 8 months ago
compression b361c72761 Update DeepSpeed copyright license to Apache 2.0 (#3111) 1 year ago
data_pipeline 736bf1853b bug fix (#3609) 1 year ago
fp16 2ce6bf8ce0 [NPU] Add HcclBackend for 1-bit adam, 1-bit lamb, 0/1 adam (#4733) 10 months ago
pipe ac84cf3ff1 Pipeline: Add support to eval micro bs configuration (#4859) 9 months ago
swap_tensor d058d4b39b Nvme offload checkpoint (#4707) 9 months ago
zero 4f477328c4 [NPU] replace 'cuda' with get_accelerator().device_name() (#5095) 8 months ago
__init__.py b361c72761 Update DeepSpeed copyright license to Apache 2.0 (#3111) 1 year ago
bf16_optimizer.py d5a7c1e0b4 Capture short kernel sequences to graph (#4318) 10 months ago
compiler.py 5ce448d326 Switch hasattr to check for compiler and not compile since compile was introduced in torch 2.0 but compiler was introduced in torch 2.1, this fixes issues for those building with torch 2.0 8 months ago
config.py c3cfe96bb3 Enable torch.compile with ZeRO (Experimental) (#4878) 8 months ago
config_utils.py 604d701e35 Introduce pydantic_v1 compatibility module for pydantic>=2.0.0 support (#4407) 1 year ago
constants.py d5a7c1e0b4 Capture short kernel sequences to graph (#4318) 10 months ago
dataloader.py b361c72761 Update DeepSpeed copyright license to Apache 2.0 (#3111) 1 year ago
eigenvalue.py b361c72761 Update DeepSpeed copyright license to Apache 2.0 (#3111) 1 year ago
engine.py c3cfe96bb3 Enable torch.compile with ZeRO (Experimental) (#4878) 8 months ago
hybrid_engine.py 5f41bd06dd Fix Hybrid Engine metrics printing (#4789) 10 months ago
lr_schedules.py ce0ebdade2 [Bug fix] WarmupCosineLR issues (#4688) 11 months ago
progressive_layer_drop.py b361c72761 Update DeepSpeed copyright license to Apache 2.0 (#3111) 1 year ago
quantize.py b361c72761 Update DeepSpeed copyright license to Apache 2.0 (#3111) 1 year ago
sparse_tensor.py c84c28d23b Support cpu tensors without direct device invocation (#3842) 9 months ago
state_dict_factory.py b361c72761 Update DeepSpeed copyright license to Apache 2.0 (#3111) 1 year ago
utils.py 961bc85624 optimize clip_grad_norm_ function (#4915) 8 months ago
weight_quantizer.py b361c72761 Update DeepSpeed copyright license to Apache 2.0 (#3111) 1 year ago
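
The latest commit on compiler.py (5ce448d326) describes switching a `hasattr` check from `compile` to `compiler`, because `compile` was introduced in torch 2.0 and `compiler` in torch 2.1. A minimal sketch of that kind of check follows. The helper name `is_compile_supported` and the stand-in objects are assumptions made for illustration, not a copy of the code in compiler.py. Stand-ins are used in place of a real torch import so the example runs without torch installed.

```python
from types import SimpleNamespace


def is_compile_supported(torch_module) -> bool:
    """Hypothetical helper: True only if the module exposes `compiler`.

    Checking for `compile` instead would also return True on torch 2.0,
    which (per the commit message) lacks the `compiler` attribute.
    """
    return hasattr(torch_module, "compiler")


# Stand-ins (assumptions) that mimic only the attributes named in the commit message.
fake_torch_2_0 = SimpleNamespace(compile=lambda fn: fn)  # has compile, no compiler
fake_torch_2_1 = SimpleNamespace(compile=lambda fn: fn, compiler=SimpleNamespace())

if __name__ == "__main__":
    print(is_compile_supported(fake_torch_2_0))
    print(is_compile_supported(fake_torch_2_1))
```

With a real installation, the same check would be called as `is_compile_supported(torch)` after `import torch`.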