Commit History

Author SHA1 Message Commit Date
  Heyang Qin dc01cee5ca using container when loading inference checkpoints (#2875) 1 year ago
  Jeff Rasley da84e60d98 add missing license info to top of all source code (#2889) 1 year ago
  Lev Kurilenko fd1449c766 Port Reza's INT8-quantization fix to container architecture (#2725) 1 year ago
  Molly Smith 46784cb58e Fix auto TP for duplicate modules with different gems (#2784) 1 year ago
  Lev Kurilenko 10f3c301a0 Add container load checkpoint error reporting + refactor (#2792) 1 year ago
  Lev Kurilenko 0a73e6e613 Container param cleanup + remove qkv_merging (#2780) 1 year ago
  Reza Yazdani 9f41ffe4a6 Reset KV-cache at the beginning of text-generation (#2669) 1 year ago
  Ma, Guokai 98cc35b6a8 Abstract accelerator (step 3) (#2677) 1 year ago
  Ammar Ahmad Awan 867da307d0 Inference Refactor (replace_with_policy, model_implementations) (#2554) 1 year ago
  Jeff Rasley d9b788d773 tweaks to ds-attn, distilbert policy, and mup (#2649) 1 year ago
  Jeff Rasley e0aa84c5b5 Fix issue w. bloom when changing tp size (#2645) 1 year ago
  Lev Kurilenko 503706ac44 Remove GatheredParameters context from replace_with_policy (#2591) 1 year ago
  Jeff Rasley 35eabb0a33 Fix issues w. python 3.6 + add py-version checks to CI (#2589) 1 year ago
  Michael Wyatt ccb8eb81fb Add checkpoint sharding unit tests (#2561) 1 year ago
  Reza Yazdani 35b350b28c Fix quantized-inference & Add generic support of checkpoint loading (#2547) 1 year ago
  Ammar Ahmad Awan 90ae688442 Pass down the new DS inference config to replace_transformer_layer. (#2539) 1 year ago
  Ammar Ahmad Awan b5d18a6ab3 DeepSpeed inference config. (#2459) (#2472) 1 year ago
  lokoppakmsft f2710bbe1d Make data contiguous before the inplace reshape-copy_ function (#2489) 1 year ago
  Connor Holmes e7e7595502 Stable Diffusion Enhancements (#2491) 1 year ago
  Kevin Ko 6f77da1bae Add `scale_attn_by_inverse_layer_idx` feature (#2486) 1 year ago
  Reza Yazdani 9cfcf7431a Add correct memory-allocation at DeepSpeed-Attention (#2474) 1 year ago
  Connor Holmes 10e9d04c23 Cache Allocation and Softmax Fixes (#2433) 2 years ago
  Jeff Rasley ec13da6ba7 add SD injection policy (#2381) 2 years ago
  Andrey Chernykh cd3a70953a Fix GPT Neo-X multi-gpu inference (#2401) 2 years ago
  lekurile 46a886c068 Change type to tuple in replace_wo_policy isinstance check (#2387) 2 years ago
  Ammar Ahmad Awan 993264388d Inference profiling updates/fixes (#2348) (#2349) 2 years ago
  Stas Bekman b146aa3523 [ds-inference] fix progress bar (#2286) 2 years ago
  Reza Yazdani afdc72879f Ds-inference Int8 support through ZeroQuant technology (#2217) 2 years ago
  Molly Smith a7ee688a6f Update replace_module.py, test-gptj.py related fix (#2269) 2 years ago
  Reza Yazdani c35bfe89f6 fix ds-inference without policy (#2247) 2 years ago