Commit History

Author           | SHA1       | Message | Date
Adam Moody       | 43d58d99eb | ckpt: create directories in checkpoint_engine (#2988) | 1 year ago
Jeff Rasley      | da84e60d98 | add missing license info to top of all source code (#2889) | 1 year ago
Ma, Guokai       | 98cc35b6a8 | Abstract accelerator (step 3) (#2677) | 1 year ago
Joe Mayer        | 18713c6838 | Updating API docs (#2586) | 1 year ago
iLeGend          | 06e00f61ce | Fix typos: deepseed -> deepspeed (#2499) | 1 year ago
Adam Moody       | b8fb9c3f1a | parallelize writing of layer checkpoint files across data parallel instances (#1419) | 2 years ago
trajep           | e669aaf55b | Trajepl/nebula ckpt engine (#2085) | 2 years ago
Alex Hedges      | 316c4a43e0 | Add flake8 to pre-commit checks (#2051) | 2 years ago
Karim Foda       | 735406e536 | fix import errors (#2026) | 2 years ago
Ammar Ahmad Awan | 36ad3119d5 | DeepSpeed comm backend v1 (#1985) | 2 years ago
Jeff Rasley      | b4fcd98ff0 | Inference PP changes for neox (#1899) | 2 years ago
Olatunji Ruwase  | 56c5223868 | bf16+pipeline parallelism (#1801) | 2 years ago
James Reed       | fafc827d64 | Render docs for pipe.ProcessTopology (#1505) | 2 years ago
Stas Bekman      | bf1725bb57 | [code readability] pipe (#1510) | 3 years ago
Alex Hedges      | be789b1665 | Fix many typos (#1423) | 3 years ago
Hyunwoong Ko     | 30965ea734 | Add flexibility of pipeline parallel module and engine (#1399) | 3 years ago
Jeff Rasley      | e2fdd254ed | Big science related changes (#1407) | 3 years ago
Adam Moody       | 4ad8019cdf | fix: support three digit layer numbers (#1377) | 3 years ago
Olatunji Ruwase  | 336dd089e5 | Use clone to avoid checkpoint bloat (#1326) | 3 years ago
Reza Yazdani     | ed3de0c21b | Quantization + inference release (#1091) | 3 years ago
Shaden Smith     | 46f4573b1a | Seeded unit tests (#1072) | 3 years ago
sdtblck          | 669028f0fd | Fix all Pipeline Module Parameters being sent to cuda:0 (#687) | 3 years ago
Shaden Smith     | fbece50b21 | assert no Z2/Z3 with pipeline and fix some docs links (#980) | 3 years ago
Shaden Smith     | c82756cd15 | readthedocs upgrade (#402) | 4 years ago
Shaden Smith     | 65c2f974d8 | Pipeline parallel training engine. (#392) | 4 years ago