StopWatch.h | 734d8991c8 | Transformer kernel release (#242) | 4 years ago
Timer.h | 734d8991c8 | Transformer kernel release (#242) | 4 years ago
context.h | ee1ffe2e88 | CPU-Adam fix for scalar mode (#735) | 3 years ago
cpu_adam.h | f28432441b | DeepSpeed MoE (#1310) | 3 years ago
cublas_wrappers.h | 734d8991c8 | Transformer kernel release (#242) | 4 years ago
custom_cuda_layers.h | be789b1665 | Fix many typos (#1423) | 3 years ago
dropout.h | f0f2a70268 | support dynamic sequence length in transformer kernels (#424) | 4 years ago
ds_transformer_cuda.h | bc7778ea5b | Fix the workspace allocation for the transformer kernel (#1397) | 3 years ago
feed_forward.h | 734d8991c8 | Transformer kernel release (#242) | 4 years ago
gelu.h | f0f2a70268 | support dynamic sequence length in transformer kernels (#424) | 4 years ago
gemm_test.h | 95575579b3 | Use parentesis around min and max to enable Windows build (#449) | 4 years ago
general_kernels.h | 734d8991c8 | Transformer kernel release (#242) | 4 years ago
normalize_layer.h | e2dfcadf3b | Fix the bias-add and add the layer-norm-eps parameter (#791) | 3 years ago
quantizer.h | ed3de0c21b | Quantization + inference release (#1091) | 3 years ago
softmax.h | be789b1665 | Fix many typos (#1423) | 3 years ago
strided_batch_gemm.h | f0f2a70268 | support dynamic sequence length in transformer kernels (#424) | 4 years ago
type_shim.h | be789b1665 | Fix many typos (#1423) | 3 years ago