Commit History

| Author | SHA1 | Message | Date |
| --- | --- | --- | --- |
| mzl | 8d53ac0cd3 | Add MPICH Multinode Runner (#2839) | 1 year ago |
| Jeff Rasley | da84e60d98 | add missing license info to top of all source code (#2889) | 1 year ago |
| Logan Adams | d038dbd268 | Fix Slurm launcher user args (#2806) | 1 year ago |
| Logan Adams | 4af1f76a99 | Add user defined launcher args for PDSH launcher (#2804) | 1 year ago |
| Ma, Guokai | 98cc35b6a8 | Abstract accelerator (step 3) (#2677) | 1 year ago |
| Dashiell Stander | 3db0b5e2de | Add SLURM Multinode Runner (#2404) | 2 years ago |
| Arpan Jain | 1ed5aa96a8 | Elastic Training support in DeepSpeed (#2153) (#2156) | 2 years ago |
| Alex Hedges | 316c4a43e0 | Add flake8 to pre-commit checks (#2051) | 2 years ago |
| Jerry Mannil | d0eae5ad7a | Propagate max errorcode to deepspeed when using PDSH launcher (#1994) | 2 years ago |
| Michael Wyatt | 3678ee1778 | [bug] Add user-defined launcher args for MPI launcher (#1933) | 2 years ago |
| Shuai Zheng | 4575b2b792 | fix launcher for reading env vars (#1907) | 2 years ago |
| Jeff Rasley | 9351266f78 | Multi-node save pid support + allow sparse-attn extra (#1728) | 2 years ago |
| liamcli | fead387f78 | support module and no python args for launcher (#1690) | 2 years ago |
| Jeff Rasley | a10e4811fe | force set lf instead of crlf (https://github.com/pre-commit/pre-commit-hooks#mixed-line-ending) (#1598) | 2 years ago |
| Chunyang Wen | 93c71831c7 | fstr for multnode_runner (#1532) | 2 years ago |
| Ammar Ahmad Awan | 01726ce2b8 | Add 1-bit Adam support to DeepSpeed (#380) | 4 years ago |