Commit | Message | Author | Date
c7724c6181 | Switch from HIP_PLATFORM_HCC to HIP_PLATFORM_AMD (#4539) | Logan Adams | 1 year ago
9aeba94a8e | Avoid deprecation warnings in `CHECK_CUDA` (#3854) | Alexander Grund | 1 year ago
47f9f13bd3 | DeepSpeed Chat (#3186) | Olatunji Ruwase | 1 year ago
b361c72761 | Update DeepSpeed copyright license to Apache 2.0 (#3111) | Michael Wyatt | 1 year ago
c3c8d5dd93 | AMD support (#1430) | Jeff Rasley | 2 years ago
bc7778ea5b | Fix the workspace allocation for the transformer kernel (#1397) | Reza Yazdani | 3 years ago
ed3de0c21b | Quantization + inference release (#1091) | Reza Yazdani | 3 years ago
e721cb691f | Supporting different hidden dimensions for transformer kernels-v2 (#934) | Reza Yazdani | 3 years ago
8295d7a89e | Fixing gelu_checkpointing memory issue (#812) | Reza Yazdani | 3 years ago
e2dfcadf3b | Fix the bias-add and add the layer-norm-eps parameter (#791) | Reza Yazdani | 3 years ago
1fcc5f7a78 | Fix transformer kernel CUDA illegal memory access error (#765) | Conglong Li | 3 years ago
981bc7d493 | Move workspace memory-allocation to PyTorch (#661) | Reza Yazdani | 3 years ago
fd2f970bdf | Transformer-kernel - supporting any arbitrary sequence-length (#587) | Reza Yazdani | 3 years ago
95575579b3 | Use parentesis around min and max to enable Windows build (#449) | Bruno | 4 years ago
f0f2a70268 | support dynamic sequence length in transformer kernels (#424) | RezaYazdaniAminabadi | 4 years ago
a148bd33d6 | Add configurable intermediate size to transformer kernels (#423) | RezaYazdaniAminabadi | 4 years ago
4ac9bf60a7 | Revert "supporting different intermediate sizes other than 4 * hidden_dim (#389)" (#404) | Jeff Rasley | 4 years ago
e549be607c | supporting different intermediate sizes other than 4 * hidden_dim (#389) | RezaYazdaniAminabadi | 4 years ago
734d8991c8 | Transformer kernel release (#242) | Jeff Rasley | 4 years ago