Commit History

| Author | SHA1 | Message | Date |
| --- | --- | --- | --- |
| Logan Adams | 101fff1da7 | Update to latest pydantic (2.0+) | 1 year ago |
| Logan Adams | 3c89576105 | Merge branch 'master' into loadams/update-pydantic | 1 year ago |
| Jinzhen Lin | 7e8bcc07d6 | fix "undefined symbol: curandCreateGenerator" for quantizer op (#3846) | 1 year ago |
| YiSheng5 | c520d47679 | [profiling][mics]Fix some issues for log_summary(). (#3899) | 1 year ago |
| Ramya Ramineni | d24629f4fd | [ROCm] Enable TestCUDABackward::test_backward unit tests (#3849) | 1 year ago |
| hipudding | c5e55d3d14 | Fix a typo of global variable in comm.py(#3852) (#3852) | 1 year ago |
| Michael Wyatt | b58e0fa92a | avoid init for deepspeed backend first (#3893) | 1 year ago |
| Alexander Grund | 9aeba94a8e | Avoid deprecation warnings in `CHECK_CUDA` (#3854) | 1 year ago |
| Michael Wyatt | 52844f4956 | Update workflows for merge queue (#3892) | 1 year ago |
| Pinstripe Potoroo | 3491e32d72 | fix rnn flop profiler to compute flops instead of macs (#3833) | 1 year ago |
| digger yu | 97e7b8410c | fix error :Dictionary expression not allowed in type annotation Pylance (#3708) | 1 year ago |
| Lev Kurilenko | cc3a7c9cba | Fix Meta Tensor checkpoint load for BLOOM models (#3885) | 1 year ago |
| Yejing-Lai | d6f622176d | Add GPTNeoX AutoTP support (#3778) | 1 year ago |
| Heyang Qin | 9377921a3f | Separate ZeRO3 InflightParamRegistry for train and eval (#3884) | 1 year ago |
| Ammar Ahmad Awan | db4638d157 | Extend HE-Lora test with Z3 support + Fix/add guard in HE for Z3 (#3883) | 1 year ago |
| Logan Adams | 59c9b0914f | Update apex installation to resolve apex's pyproject.toml issues. (#3745) | 1 year ago |
| Reza Yazdani | f3c93b056d | Add FALCON Auto-TP Support (#3640) | 1 year ago |
| Mashiro | 385e89d4a8 | update MMEngine support in README.md (#3879) | 1 year ago |
| Michael Wyatt | 7c126f431c | update lightning version in CI (#3882) | 1 year ago |
| Xingjian Shi | d81dfdabcc | Fix LoRA Fuse/Unfuse in Hybrid Engine (#3563) | 1 year ago |
| Pinstripe Potoroo | c1c1d2496f | fix retrieval of out_channels in _conv_trans_flops_compute (#3834) | 1 year ago |
| Jeff Rasley | 691d246e02 | [zero] revert PR #3166, it disabled grad clip for bf16 (#3790) | 1 year ago |
| hablb | d229ff175e | Zero3 Fix allreduce optimization for extra large tensor (#3832) | 1 year ago |
| Guo Yejun | 807d1b5dfc | scripts/check-torchcuda.py: add checking for tensor.is_cuda (#3843) | 1 year ago |
| Alexander Jipa | 2ded2ff0be | checking process_group before merging bucket ranges (#3521) (#3577) | 1 year ago |
| Ma, Guokai | 5d1124f2aa | [profiling]add show_straggler argument to log_summary() (#3579) | 1 year ago |
| Michael Wyatt | fd1d2c6447 | Reduce Unit Test Time (Part 2) (#3838) | 1 year ago |
| Logan Adams | c973e15711 | Disable AMD test flows in YML (#3847) | 1 year ago |
| Guo Yejun | b4626194e4 | zero/mics.py: use on_accelerator instead of cuda only (#3806) | 1 year ago |
| Heyang Qin | f8551b439e | Fix racing condition in GatheredParameters (#3819) | 1 year ago |