Connor Holmes 01080fc30e Merge branch 'master' into lokoppak/ln_schedule_update 1 year ago
StopWatch.h 734d8991c8 Transformer kernel release (#242) 4 years ago
Timer.h a10e4811fe force set lf instead of crlf (https://github.com/pre-commit/pre-commit-hooks#mixed-line-ending) (#1598) 2 years ago
compat.h 8abdaee243 Add cpu adagrad (#1358) 3 years ago
context.h ee1ffe2e88 CPU-Adam fix for scalar mode (#735) 3 years ago
conversion_utils.h 9aa7b638b7 Kernel Data Conversion Utility (#2327) 2 years ago
cpu_adagrad.h 74493b2bee support CPU Adam and Adagrad on Windows with SDK 10.0.22000 (#1634) 2 years ago
cpu_adam.h a04480e192 Fix the half-precision version of CPU-Adam (#2032) 2 years ago
cublas_wrappers.h c3c8d5dd93 AMD support (#1430) 2 years ago
custom_cuda_layers.h ef869377e9 DeepSpeed Data Efficiency Library (#2585) 1 year ago
dequantization_utils.h 30c8d8a881 Initial dequant library implementation (#2521) 1 year ago
dropout.h a10e4811fe force set lf instead of crlf (https://github.com/pre-commit/pre-commit-hooks#mixed-line-ending) (#1598) 2 years ago
ds_kernel_utils.h edd17bdaad format fixes 1 year ago
ds_transformer_cuda.h bc7778ea5b Fix the workspace allocation for the transformer kernel (#1397) 3 years ago
feed_forward.h c3c8d5dd93 AMD support (#1430) 2 years ago
gelu.h a10e4811fe force set lf instead of crlf (https://github.com/pre-commit/pre-commit-hooks#mixed-line-ending) (#1598) 2 years ago
gemm_test.h c3c8d5dd93 AMD support (#1430) 2 years ago
general_kernels.h c3c8d5dd93 AMD support (#1430) 2 years ago
memory_access_utils.h be4ffb82ad Reduction Kernel Utility (#2436) 2 years ago
normalize_layer.h a10e4811fe force set lf instead of crlf (https://github.com/pre-commit/pre-commit-hooks#mixed-line-ending) (#1598) 2 years ago
quantization.h 30c8d8a881 Initial dequant library implementation (#2521) 1 year ago
quantization_utils.h 30c8d8a881 Initial dequant library implementation (#2521) 1 year ago
quantizer.h ed3de0c21b Quantization + inference release (#1091) 3 years ago
reduction_utils.h 30c8d8a881 Initial dequant library implementation (#2521) 1 year ago
simd.h a04480e192 Fix the half-precision version of CPU-Adam (#2032) 2 years ago
softmax.h a10e4811fe force set lf instead of crlf (https://github.com/pre-commit/pre-commit-hooks#mixed-line-ending) (#1598) 2 years ago
strided_batch_gemm.h c3c8d5dd93 AMD support (#1430) 2 years ago
type_shim.h 648f7bfa50 Bfloat16 zero2 (#1398) 3 years ago