Commit History

Author SHA1 Message Commit Date
  Lev Kurilenko 87eaf8f99a Check for local CUDA graphs when enable_cuda_graph=True (#2941) 1 year ago
  Ammar Ahmad Awan e4b3b610ba Refactor DS inference API. No longer need replace_method. (#2831) 1 year ago
  Reza Yazdani 9f41ffe4a6 Reset KV-cache at the beginning of text-generation (#2669) 1 year ago
  Molly Smith c5b983e92e Fix broken kernel inject bug (#2776) 1 year ago
  Ma, Guokai 98cc35b6a8 Abstract accelerator (step 3) (#2677) 1 year ago
  Molly Smith d59b572911 Automatic tensor parallelism v2 (#2670) 1 year ago
  Ammar Ahmad Awan 867da307d0 Inference Refactor (replace_with_policy, model_implementations) (#2554) 1 year ago
  Reza Yazdani 95d9a1b6c3 Fix Opt injection (#2541) 1 year ago
  Jeff Rasley 5676f5ec9c [inference] check for unsupported model generate args (#2627) 1 year ago
  Lev Kurilenko 503706ac44 Remove GatheredParameters context from replace_with_policy (#2591) 1 year ago
  Ammar Ahmad Awan 90ae688442 Pass down the new DS inference config to replace_transformer_layer. (#2539) 1 year ago
  Connor Holmes 57e0a55066 Ensure is initialized for SD (#2534) 1 year ago
  Ammar Ahmad Awan b5d18a6ab3 DeepSpeed inference config. (#2459) (#2472) 1 year ago
  Connor Holmes e7e7595502 Stable Diffusion Enhancements (#2491) 1 year ago
  Reza Yazdani 39bdc14195 fixing the checkpoint loading at inference-engine (#2429) 2 years ago
  Connor Holmes 10e9d04c23 Cache Allocation and Softmax Fixes (#2433) 2 years ago
  Michael Wyatt e772f16665 Use CUDA events for inference model profiling (#2371) 2 years ago
  Reza Yazdani 537e8581fe fix checkpoint loading when it is a dictionary (#2425) 2 years ago
  Jeff Rasley ec13da6ba7 add SD injection policy (#2381) 2 years ago
  Ammar Ahmad Awan 993264388d Inference profiling updates/fixes (#2348) (#2349) 2 years ago
  Jeff Rasley cf638be998 only override forward if using cuda-graph (#2291) 2 years ago
  Reza Yazdani afdc72879f Ds-inference Int8 support through ZeroQuant technology (#2217) 2 years ago
  Arash Bakhtiari 8b2a63717a Add support of OPT models (#2205) 2 years ago
  Reza Yazdani 556f005152 Fix random token-generation issue + MP-checkpoint loading/saving (#2132) 2 years ago
  trajep e669aaf55b Trajepl/nebula ckpt engine (#2085) 2 years ago
  Alex Hedges 316c4a43e0 Add flake8 to pre-commit checks (#2051) 2 years ago
  Jeff Rasley 844d9f31a9 reduce ds-inference log verbosity (#2111) 2 years ago
  Reza Yazdani aa88137b8d Add Inference support for running the BigScience-BLOOM Architecture (#2083) 2 years ago
  Karim Foda 735406e536 fix import errors (#2026) 2 years ago
  Jeff Rasley b666d5cd73 [inference] test suite for ds-kernels (bert, roberta, gpt2, gpt-neo, gpt-j) (#1992) 2 years ago