Commit History

| Author | SHA1 | Message | Date |
| --- | --- | --- | --- |
| Jeff Rasley | dbd08236a6 | formatting | 2 years ago |
| Jeff Rasley | 0fc11fa0e6 | [squash] zero-ckpt-cpu-issue (#1673) | 2 years ago |
| Olatunji Ruwase | 4354c3cc67 | Fix largest param numel calculation (#1623) | 2 years ago |
| Jeff Rasley | 1d295ff5f8 | Refactor ZeRO naming to reduce confusion (#1607) | 2 years ago |
| Alex Hedges | fc2f378ece | Improve pre-commit hooks (#1602) | 2 years ago |
| Jeff Rasley | a10e4811fe | force set lf instead of crlf (https://github.com/pre-commit/pre-commit-hooks#mixed-line-ending) (#1598) | 2 years ago |
| Mikhail Druzhinin | d14baad940 | allreduce_always_fp16 (#1487) | 2 years ago |
| Jeff Rasley | 2332cb31a7 | Enables ZeRO-3 inference (#1514) | 2 years ago |
| Olatunji Ruwase | 7567c76c05 | Update offload parameter names (#1536) | 2 years ago |
| Olatunji Ruwase | 488105ebd2 | Fix zinf none swapper (#1550) | 2 years ago |
| Zhen Zhang | c0eeb69dfb | ZeRO3, improved parameter all-gather operation (#1188) | 3 years ago |
| Olatunji Ruwase | 58a8e13ccd | Ensure single zero3 context (#1462) | 3 years ago |
| Alex Hedges | be789b1665 | Fix many typos (#1423) | 3 years ago |
| Jeff Rasley | e2fdd254ed | Big science related changes (#1407) | 3 years ago |
| Stas Bekman | 3fa24208c4 | [zero3] fix reference counting in backward over multiple forwards (#1227) | 3 years ago |
| Stas Bekman | 2a921069d7 | [model weights] zero_to_fp32 multiple improvements (#1181) | 3 years ago |
| Stas Bekman | 5127b2fa25 | improve debug (#1215) | 3 years ago |
| Stas Bekman | 91f58c068c | [zero3] params_to_reduce isn't always there (#1214) | 3 years ago |
| Stas Bekman | a029239812 | clean up logging (#1190) | 3 years ago |
| Stas Bekman | bc019a5339 | undo noise (#1191) | 3 years ago |
| Stas Bekman | c0c4ebf143 | introduce debug utils (#1136) | 3 years ago |
| Stas Bekman | 0c1802cc8b | ZeRO 2+3 memory estimators (#965) | 3 years ago |
| Samyam Rajbhandari | 4eaf910616 | Samyamr/largest partitioned params calculation fix (#1150) | 3 years ago |
| Olatunji Ruwase | e9e9d5b825 | ZeRO-Infinity: Swap into unaligned fp16 buffer (#1086) | 3 years ago |
| Olatunji Ruwase | d88d927995 | ZeRO-Infinity: support swapping misaligned sized fp16 tensors (#1076) | 3 years ago |
| Olatunji Ruwase | 6b49b60ec8 | Get correct fp16 reuse buffer size (#1071) | 3 years ago |
| Sean Naren | b3870363e0 | [Stage][Fix] Add additional conditions when checking types of output from the model (#1026) | 3 years ago |
| Olatunji Ruwase | 429dfa6c3d | Handle Norm allreduce when no mp (#1021) | 3 years ago |
| Samyam Rajbhandari | dad26428e3 | Samyamr/full precision for ZeRO Stage2 and Stage3 (#1004) | 3 years ago |
| William Buchwalter | a711878996 | Fix issue where gradient_predivide_factor was called as a func. (#996) | 3 years ago |