| Author | Commit | Message | Date |
|---|---|---|---|
| Jeff Rasley | c3c8d5dd93 | AMD support (#1430) | 2 years ago |
| Reza Yazdani | bc7778ea5b | Fix the workspace allocation for the transformer kernel (#1397) | 3 years ago |
| Reza Yazdani | ed3de0c21b | Quantization + inference release (#1091) | 3 years ago |
| Reza Yazdani | e721cb691f | Supporting different hidden dimensions for transformer kernels-v2 (#934) | 3 years ago |
| Reza Yazdani | 8295d7a89e | Fixing gelu_checkpointing memory issue (#812) | 3 years ago |
| Reza Yazdani | e2dfcadf3b | Fix the bias-add and add the layer-norm-eps parameter (#791) | 3 years ago |
| Conglong Li | 1fcc5f7a78 | Fix transformer kernel CUDA illegal memory access error (#765) | 3 years ago |
| Reza Yazdani | 981bc7d493 | Move workspace memory-allocation to PyTorch (#661) | 3 years ago |
| Reza Yazdani | fd2f970bdf | Transformer-kernel - supporting any arbitrary sequence-length (#587) | 3 years ago |
| Bruno | 95575579b3 | Use parentesis around min and max to enable Windows build (#449) | 4 years ago |
| RezaYazdaniAminabadi | f0f2a70268 | support dynamic sequence length in transformer kernels (#424) | 4 years ago |
| RezaYazdaniAminabadi | a148bd33d6 | Add configurable intermediate size to transformer kernels (#423) | 4 years ago |
| Jeff Rasley | 4ac9bf60a7 | Revert "supporting different intermediate sizes other than 4 * hidden_dim (#389)" (#404) | 4 years ago |
| RezaYazdaniAminabadi | e549be607c | supporting different intermediate sizes other than 4 * hidden_dim (#389) | 4 years ago |
| Jeff Rasley | 734d8991c8 | Transformer kernel release (#242) | 4 years ago |