Commit History

| Author | SHA1 | Message | Date |
| --- | --- | --- | --- |
| Reza Yazdani | ee6a92c066 | Fixing the transformer APIs to return tuple as the output (if needed) (#1491) | 3 years ago |
| Kamal Raj | 466b0e638c | rm max_seq_length arg:DeepSpeedTransformerConfig (#1362) | 3 years ago |
| Alex Hedges | be789b1665 | Fix many typos (#1423) | 3 years ago |
| Reza Yazdani | ed3de0c21b | Quantization + inference release (#1091) | 3 years ago |
| Reza Yazdani | cab30aa61e | Transformer - fix unit test (#964) | 3 years ago |
| Reza Yazdani | e2dfcadf3b | Fix the bias-add and add the layer-norm-eps parameter (#791) | 3 years ago |
| Jeff Rasley | 44bd538b11 | Module replacement support (#586) | 3 years ago |
| Reza Yazdani | fd2f970bdf | Transformer-kernel - supporting any arbitrary sequence-length (#587) | 3 years ago |
| Jeff Rasley | 31f46feee2 | DeepSpeed JIT op + PyPI support (#496) | 4 years ago |
| RezaYazdaniAminabadi | f0f2a70268 | support dynamic sequence length in transformer kernels (#424) | 4 years ago |
| RezaYazdaniAminabadi | a148bd33d6 | Add configurable intermediate size to transformer kernels (#423) | 4 years ago |
| Jeff Rasley | 4ac9bf60a7 | Revert "supporting different intermediate sizes other than 4 * hidden_dim (#389)" (#404) | 4 years ago |
| RezaYazdaniAminabadi | e549be607c | supporting different intermediate sizes other than 4 * hidden_dim (#389) | 4 years ago |
| Jeff Rasley | e5bbc2e559 | Sparse attn + ops/runtime refactor + v0.3.0 (#343) | 4 years ago |