op_binding | 468882fb68 | Add the policy to run llama model from the official repo (#4313) | 1 year ago
triton | 0e0748c579 | adds triton flash attention2 kernel (#4337) | 1 year ago
__init__.py | b361c72761 | Update DeepSpeed copyright license to Apache 2.0 (#3111) | 1 year ago
bias_add.py | b361c72761 | Update DeepSpeed copyright license to Apache 2.0 (#3111) | 1 year ago
config.py | 468882fb68 | Add the policy to run llama model from the official repo (#4313) | 1 year ago
diffusers_2d_transformer.py | b361c72761 | Update DeepSpeed copyright license to Apache 2.0 (#3111) | 1 year ago
diffusers_attention.py | 1ba4098918 | Fix Stable Diffusion Injection (#4078) | 1 year ago
diffusers_transformer_block.py | ce535945e6 | fix: change ==NONE to is (#3923) | 1 year ago
ds_attention.py | 468882fb68 | Add the policy to run llama model from the official repo (#4313) | 1 year ago
ds_mlp.py | 15f94ae756 | Engine side fix for loading llama checkpoint fine-tuned with zero3 (#3981) | 1 year ago
moe_inference.py | ce535945e6 | fix: change ==NONE to is (#3923) | 1 year ago
triton_ops.py | b361c72761 | Update DeepSpeed copyright license to Apache 2.0 (#3111) | 1 year ago