Ramya Ramineni 7bcb4fabeb Enable CG headers on ROCm (#1821) 2 years ago
StopWatch.h 734d8991c8 Transformer kernel release (#242) 4 years ago
Timer.h a10e4811fe force set lf instead of crlf (https://github.com/pre-commit/pre-commit-hooks#mixed-line-ending) (#1598) 2 years ago
compat.h 8abdaee243 Add cpu adagrad (#1358) 3 years ago
context.h ee1ffe2e88 CPU-Adam fix for scalar mode (#735) 3 years ago
cpu_adagrad.h 74493b2bee support CPU Adam and Adagrad on Windows with SDK 10.0.22000 (#1634) 2 years ago
cpu_adam.h 74493b2bee support CPU Adam and Adagrad on Windows with SDK 10.0.22000 (#1634) 2 years ago
cublas_wrappers.h c3c8d5dd93 AMD support (#1430) 2 years ago
custom_cuda_layers.h c3c8d5dd93 AMD support (#1430) 2 years ago
dropout.h a10e4811fe force set lf instead of crlf (https://github.com/pre-commit/pre-commit-hooks#mixed-line-ending) (#1598) 2 years ago
ds_transformer_cuda.h bc7778ea5b Fix the workspace allocation for the transformer kernel (#1397) 3 years ago
feed_forward.h c3c8d5dd93 AMD support (#1430) 2 years ago
gelu.h a10e4811fe force set lf instead of crlf (https://github.com/pre-commit/pre-commit-hooks#mixed-line-ending) (#1598) 2 years ago
gemm_test.h c3c8d5dd93 AMD support (#1430) 2 years ago
general_kernels.h c3c8d5dd93 AMD support (#1430) 2 years ago
normalize_layer.h a10e4811fe force set lf instead of crlf (https://github.com/pre-commit/pre-commit-hooks#mixed-line-ending) (#1598) 2 years ago
quantizer.h ed3de0c21b Quantization + inference release (#1091) 3 years ago
simd.h 259936a76c Fix cpu-adam AVX performance (#1637) 2 years ago
softmax.h a10e4811fe force set lf instead of crlf (https://github.com/pre-commit/pre-commit-hooks#mixed-line-ending) (#1598) 2 years ago
strided_batch_gemm.h c3c8d5dd93 AMD support (#1430) 2 years ago
type_shim.h 648f7bfa50 Bfloat16 zero2 (#1398) 3 years ago