Reza Yazdani | ee6a92c066 | Fixing the transformer APIs to return tuple as the output (if needed) (#1491) | 3 years ago
Kamal Raj | 466b0e638c | rm max_seq_length arg:DeepSpeedTransformerConfig (#1362) | 3 years ago
Alex Hedges | be789b1665 | Fix many typos (#1423) | 3 years ago
Reza Yazdani | ed3de0c21b | Quantization + inference release (#1091) | 3 years ago
Reza Yazdani | cab30aa61e | Transformer - fix unit test (#964) | 3 years ago
Reza Yazdani | e2dfcadf3b | Fix the bias-add and add the layer-norm-eps parameter (#791) | 3 years ago
Jeff Rasley | 44bd538b11 | Module replacement support (#586) | 3 years ago
Reza Yazdani | fd2f970bdf | Transformer-kernel - supporting any arbitrary sequence-length (#587) | 3 years ago
Jeff Rasley | 31f46feee2 | DeepSpeed JIT op + PyPI support (#496) | 4 years ago
RezaYazdaniAminabadi | f0f2a70268 | support dynamic sequence length in transformer kernels (#424) | 4 years ago
RezaYazdaniAminabadi | a148bd33d6 | Add configurable intermediate size to transformer kernels (#423) | 4 years ago
Jeff Rasley | 4ac9bf60a7 | Revert "supporting different intermediate sizes other than 4 * hidden_dim (#389)" (#404) | 4 years ago
RezaYazdaniAminabadi | e549be607c | supporting different intermediate sizes other than 4 * hidden_dim (#389) | 4 years ago
Jeff Rasley | e5bbc2e559 | Sparse attn + ops/runtime refactor + v0.3.0 (#343) | 4 years ago