Commit History

| Author | SHA1 | Message | Date |
|---|---|---|---|
| Connor Holmes | 0a61d5d664 | Hybrid Engine Refactor and Llama Inference Support (#3425) | 1 year ago |
| Michael Wyatt | b361c72761 | Update DeepSpeed copyright license to Apache 2.0 (#3111) | 1 year ago |
| Jeff Rasley | 91d63e0228 | update formatter version and style settings (#3098) | 1 year ago |
| Ammar Ahmad Awan | 867da307d0 | Inference Refactor (replace_with_policy, model_implementations) (#2554) | 1 year ago |
| Reza Yazdani | 95d9a1b6c3 | Fix Opt injection (#2541) | 1 year ago |
| Jeff Rasley | d9b788d773 | tweaks to ds-attn, distilbert policy, and mup (#2649) | 1 year ago |
| Michael Wyatt | ccb8eb81fb | Add checkpoint sharding unit tests (#2561) | 1 year ago |
| Lev Kurilenko | 731965db33 | Fix MegatronLayerPolicy to have megatron_v2=True (#2579) | 1 year ago |
| Reza Yazdani | 35b350b28c | Fix quantized-inference & Add generic support of checkpoint loading (#2547) | 1 year ago |
| Connor Holmes | e7e7595502 | Stable Diffusion Enhancements (#2491) | 1 year ago |
| Jeff Rasley | ec13da6ba7 | add SD injection policy (#2381) | 2 years ago |
| Ammar Ahmad Awan | 993264388d | Inference profiling updates/fixes (#2348) (#2349) | 2 years ago |
| Arash Bakhtiari | fae896ef60 | Make OPT policy backward compatible with pre-OPT transformers versions (#2254) | 2 years ago |
| Arash Bakhtiari | 8b2a63717a | Add support of OPT models (#2205) | 2 years ago |
| Reza Yazdani | 556f005152 | Fix random token-generation issue + MP-checkpoint loading/saving (#2132) | 2 years ago |
| Alex Hedges | 316c4a43e0 | Add flake8 to pre-commit checks (#2051) | 2 years ago |
| Michael Wyatt | ee7ea3b805 | use HF NeoX (#2087) | 2 years ago |
| Reza Yazdani | aa88137b8d | Add Inference support for running the BigScience-BLOOM Architecture (#2083) | 2 years ago |
| Reza Yazdani | 8164ea9e6d | Fixing several bugs in the inference-api and the kernels (#1951) | 2 years ago |
| Jeff Rasley | b4fcd98ff0 | Inference PP changes for neox (#1899) | 2 years ago |
| Samyam Rajbhandari | c13457b756 | Supporting multiple modules injection with a single policy when they have identical architectures (#1869) | 2 years ago |
| Reza Yazdani | 94de0229fb | Fix inference api & add more description on inference engine tutorial (#1711) | 2 years ago |
| Jeff Rasley | e46d808a1b | MoE inference + PR-MoE model support (#1705) | 2 years ago |
| Reza Yazdani | 289c3f9ba4 | GPT-J inference support (#1670) | 2 years ago |
| Jeff Rasley | a10e4811fe | force set lf instead of crlf (https://github.com/pre-commit/pre-commit-hooks#mixed-line-ending) (#1598) | 2 years ago |
| Hyunwoong Ko | 429cbc89af | Fix bugs about non-contiguous tensor broadcasting (#1168) | 3 years ago |
| Jeff Rasley | 96eb5b12e3 | delay imports for replace policies and fix missing req (#1100) | 3 years ago |
| Reza Yazdani | ed3de0c21b | Quantization + inference release (#1091) | 3 years ago |