Commit History

Author SHA1 Message Date
  Xinyu Lian d2b1d7fc08 Universal checkpoint for zero stage 3 (#5475) 3 months ago
  Chirag Jain 88b2ef71b3 Fix memory leak from _hp_mapping (#5643) 3 months ago
  inkcherry c66bc4269e set the default to use set_to_none for clearing gradients in BF16 optimizer. (#5434) 6 months ago
  Moshe Island 08e0733e4a Support MoE for pipeline models (#5338) 6 months ago
  YangQun 40009eb1c7 BF16 optimizer: Clear lp grads after updating hp grads in hook (#5328) 6 months ago
  Masahiro Tanaka c56a4b9e0d Improve universal checkpoint (#5289) 6 months ago
  inkcherry e5dd5501c1 support bf16_optimizer moe expert parallel training and moe EP grad_scale/grad_norm fix (#5259) 6 months ago
  Masahiro Tanaka b112c99ea8 Fix loading a universal checkpoint (#5263) 7 months ago
  Max Kovalenko 3c0bd31288 BF16 optimizer: Improve device utilization by immediate grad update (#4975) 8 months ago
  Masahiro Tanaka 18179807f5 Remove optimizer step on initialization (#5104) 8 months ago
  inkcherry d5a7c1e0b4 Capture short kernel sequences to graph (#4318) 10 months ago
  Nadav Elyahu ce6070800a BF16_Optimizer: add support for bf16 grad acc (#4713) 10 months ago
  taozhiwei fd0a52c1ac use all_gather_into_tensor instead of all_gather (#4705) 10 months ago
  Moshe Island 8ad187d84f Universal ckp fixes (#4588) 11 months ago
  Kazuki Fujii 7ed952eff1 Fix bug in bfloat16 optimizer related to checkpointing (#4434) 1 year ago
  mzl 5a5340d03b remove UtilsBuilder load, use torch (un)flatten ops (#3728) 1 year ago
  Olatunji Ruwase dd8df20fe0 zero3 checkpoint frozen params (#3205) 1 year ago
  Michael Wyatt b361c72761 Update DeepSpeed copyright license to Apache 2.0 (#3111) 1 year ago
  Jeff Rasley 91d63e0228 update formatter version and style settings (#3098) 1 year ago
  Olatunji Ruwase 541e423ae6 Enable tensor fragments for zero 2 & 3 (#2727) 1 year ago
  Olatunji Ruwase 799120e7e4 Universal checkpoint for zero stage 1 (#2284) 2 years ago
  Olatunji Ruwase f4a92a19a6 Checkpoint backwards-compatbility workaround (#2384) 2 years ago
  Olatunji Ruwase 53182531ed Refactor universal checkpointing and tensor fragments (#2253) 2 years ago
  Mikhail Druzhinin 4671cce558 Fix OrderedDict import for python3.6 (#2267) 2 years ago
  shjwudp 57140e8e95 fix: fix BF16_Optimizer compatibility issue with optimizer state 0-dim tensor (#2152) 2 years ago
  Alex Hedges 316c4a43e0 Add flake8 to pre-commit checks (#2051) 2 years ago
  Olatunji Ruwase 80d0a32f0b Checkpoint reshaping (#1953) 2 years ago
  Karim Foda 735406e536 fix import errors (#2026) 2 years ago
  Ammar Ahmad Awan 36ad3119d5 DeepSpeed comm backend v1 (#1985) 2 years ago
  Jeff Rasley 50893458d6 Fairseq support (#1915) 2 years ago