| File | Commit | Last commit message | Age |
|---|---|---|---|
| StopWatch.h | 734d8991c8 | Transformer kernel release (#242) | 4 years ago |
| Timer.h | a10e4811fe | force set lf instead of crlf (https://github.com/pre-commit/pre-commit-hooks#mixed-line-ending) (#1598) | 2 years ago |
| compat.h | 8abdaee243 | Add cpu adagrad (#1358) | 3 years ago |
| context.h | ee1ffe2e88 | CPU-Adam fix for scalar mode (#735) | 3 years ago |
| conversion_utils.h | 9aa7b638b7 | Kernel Data Conversion Utility (#2327) | 2 years ago |
| cpu_adagrad.h | 74493b2bee | support CPU Adam and Adagrad on Windows with SDK 10.0.22000 (#1634) | 2 years ago |
| cpu_adam.h | a04480e192 | Fix the half-precision version of CPU-Adam (#2032) | 2 years ago |
| cublas_wrappers.h | c3c8d5dd93 | AMD support (#1430) | 2 years ago |
| custom_cuda_layers.h | ef869377e9 | DeepSpeed Data Efficiency Library (#2585) | 1 year ago |
| dequantization_utils.h | 30c8d8a881 | Initial dequant library implementation (#2521) | 1 year ago |
| dropout.h | a10e4811fe | force set lf instead of crlf (https://github.com/pre-commit/pre-commit-hooks#mixed-line-ending) (#1598) | 2 years ago |
| ds_kernel_utils.h | edd17bdaad | format fixes | 1 year ago |
| ds_transformer_cuda.h | bc7778ea5b | Fix the workspace allocation for the transformer kernel (#1397) | 3 years ago |
| feed_forward.h | c3c8d5dd93 | AMD support (#1430) | 2 years ago |
| gelu.h | a10e4811fe | force set lf instead of crlf (https://github.com/pre-commit/pre-commit-hooks#mixed-line-ending) (#1598) | 2 years ago |
| gemm_test.h | c3c8d5dd93 | AMD support (#1430) | 2 years ago |
| general_kernels.h | c3c8d5dd93 | AMD support (#1430) | 2 years ago |
| memory_access_utils.h | be4ffb82ad | Reduction Kernel Utility (#2436) | 2 years ago |
| normalize_layer.h | a10e4811fe | force set lf instead of crlf (https://github.com/pre-commit/pre-commit-hooks#mixed-line-ending) (#1598) | 2 years ago |
| quantization.h | 30c8d8a881 | Initial dequant library implementation (#2521) | 1 year ago |
| quantization_utils.h | 30c8d8a881 | Initial dequant library implementation (#2521) | 1 year ago |
| quantizer.h | ed3de0c21b | Quantization + inference release (#1091) | 3 years ago |
| reduction_utils.h | 30c8d8a881 | Initial dequant library implementation (#2521) | 1 year ago |
| simd.h | a04480e192 | Fix the half-precision version of CPU-Adam (#2032) | 2 years ago |
| softmax.h | a10e4811fe | force set lf instead of crlf (https://github.com/pre-commit/pre-commit-hooks#mixed-line-ending) (#1598) | 2 years ago |
| strided_batch_gemm.h | c3c8d5dd93 | AMD support (#1430) | 2 years ago |
| type_shim.h | 648f7bfa50 | Bfloat16 zero2 (#1398) | 3 years ago |