Commit History

Author SHA1 Message Commit Date
  Mikhail Druzhinin b62e0cc5a8 Add gradient_average flag support for sparse grads (#2188) 2 years ago
  Reza Yazdani 54a9e1b924 Fix the layer-past for GPT based models (#2196) 2 years ago
  Tiago De Gaspari 0f5c2012ce Update README.md (#2192) 2 years ago
  Jeff Rasley 1223b13c9a Update nv-lightning-v100.yml (#2190) 2 years ago
  Jeff Rasley 33667e0e54 [docs] add more models to adoption (#2189) 2 years ago
  Rahil Bathwal ee5ce52460 fix missing import (#2175) 2 years ago
  Hanlin Tang 80b5b9259b Update README to latest Composer version (#2177) 2 years ago
  Ramya Ramineni 2e3769a1f4 Enable fused_lamb_cuda_kernel on ROCm (#2148) 2 years ago
  Olatunji Ruwase e419f7cbcd Match compute and reduce dtype (#2145) 2 years ago
  Reza Yazdani e7d9959540 fixing model partitioning without injection (#2179) 2 years ago
  Jeff Rasley fad0a4106d update offload docs to include stage 1 (#2178) 2 years ago
  Michael Wyatt d1cd18e5fb Update for AMD CI workflow (#2172) 2 years ago
  Jeff Rasley bb49dc73f5 [docs] adoption updates (#2173) 2 years ago
  Zion Wu 6bfcf3c694 Fix wrong unit of latency in flops-profiler (#2090) (#2095) 2 years ago
  Jeff Rasley 776e36988d delay torch import for inference compatability check (#2167) 2 years ago
  Michael Wyatt 1a71e77dc2 Fix for distributed tests on pytorch>=1.12 (#2141) 2 years ago
  Jeff Rasley b005db86fc bump to 0.7.1 2 years ago
  Siddharth Singh 5fe9d61065 Tensor parallelism for Mixture of Experts (#2074) 2 years ago
  Olatunji Ruwase 2210ebe70f Release swap buffers for persisted params (#2089) 2 years ago
  Jeff Rasley a039e2261a enable fp16 input autocasting (#2158) 2 years ago
  Jeff Rasley 46401b3884 [zero-3] shutdown zero.Init from within ds.init (#2150) 2 years ago
  Jeff Rasley 63f470eeb6 prevent cuda 10 builds of inference kernels on ampere (#2157) 2 years ago
  Arpan Jain 1ed5aa96a8 Elastic Training support in DeepSpeed (#2153) (#2156) 2 years ago
  Nicholas Cilfone ba67bd9a14 Added retain_graph as a kwarg to the main engine backward function (#1149) 2 years ago
  Reza Yazdani 556f005152 Fix random token-generation issue + MP-checkpoint loading/saving (#2132) 2 years ago
  shjwudp 57140e8e95 fix: fix BF16_Optimizer compatibility issue with optimizer state 0-dim tensor (#2152) 2 years ago
  Jerry Mannil 66d29b0a6c Graceful exit on failures for multi-node runs (#2008) 2 years ago
  trajep e669aaf55b Trajepl/nebula ckpt engine (#2085) 2 years ago
  Jeff Rasley a54661a06f force newer datasets version (#2147) 2 years ago
  Jeff Rasley b442264dc9 formatting fix for #1962 2 years ago