Commit History

Author | SHA1 | Message | Date
Jeff Rasley | a10e4811fe | force set lf instead of crlf (https://github.com/pre-commit/pre-commit-hooks#mixed-line-ending) (#1598) | 2 years ago
eltonzheng | 51d42ab9ec | fix partition activations issue when mp=2 and pp=2 (#1589) | 2 years ago
Hyunwoong Ko | d1e72f29d8 | Fix non-fp tensor bugs of contiguous activation checkpointing (#1376) | 3 years ago
Alex Hedges | be789b1665 | Fix many typos (#1423) | 3 years ago
Jeff Rasley | e2fdd254ed | Big science related changes (#1407) | 3 years ago
Olatunji Ruwase | 85acf14c58 | Activation checkpointing improvements (#1254) | 3 years ago
Olatunji Ruwase | b1669c0d8f | Avoid partitioning small activations (#1154) | 3 years ago
Cheng Li | 4544b7d2f1 | Improve flops profiler functionality (#1065) | 3 years ago
Sean Naren | 41ab660b5d | Refactor param_dict to config (#1008) | 3 years ago
Jeff Rasley | 0d4a54a04d | ZeRO-Infinity (#976) | 3 years ago
Samyam Rajbhandari | 599258f979 | ZeRO 3 Offload (#834) | 3 years ago
Olatunji Ruwase | ec8b1cb0a0 | Activation checkpointing for non-tensor arguments and return values (#741) | 3 years ago
Shaden Smith | adcfd2694d | Handle actvitation checkpointing args that are None or non-tensors (#660) | 3 years ago
Shaden Smith | a825f99688 | Fix activation checkpoint unit tests for GPU systems (#421) | 4 years ago
Jeff Rasley | a74a604a9e | Revert "Activation checkpointing bugfix and unit tests (#420)" (#422) | 4 years ago
Shaden Smith | 01b6e27e78 | Activation checkpointing bugfix and unit tests (#420) | 4 years ago
Shaden Smith | 65c2f974d8 | Pipeline parallel training engine. (#392) | 4 years ago
Jeff Rasley | e5bbc2e559 | Sparse attn + ops/runtime refactor + v0.3.0 (#343) | 4 years ago