stephen youn 0e0748c579 adds triton flash attention2 kernel (#4337) 1 year ago
op_binding                      468882fb68  Add the policy to run llama model from the official repo (#4313)  1 year ago
triton                          0e0748c579  adds triton flash attention2 kernel (#4337)  1 year ago
__init__.py                     b361c72761  Update DeepSpeed copyright license to Apache 2.0 (#3111)  1 year ago
bias_add.py                     b361c72761  Update DeepSpeed copyright license to Apache 2.0 (#3111)  1 year ago
config.py                       468882fb68  Add the policy to run llama model from the official repo (#4313)  1 year ago
diffusers_2d_transformer.py     b361c72761  Update DeepSpeed copyright license to Apache 2.0 (#3111)  1 year ago
diffusers_attention.py          1ba4098918  Fix Stable Diffusion Injection (#4078)  1 year ago
diffusers_transformer_block.py  ce535945e6  fix: change ==NONE to is (#3923)  1 year ago
ds_attention.py                 468882fb68  Add the policy to run llama model from the official repo (#4313)  1 year ago
ds_mlp.py                       15f94ae756  Engine side fix for loading llama checkpoint fine-tuned with zero3 (#3981)  1 year ago
moe_inference.py                ce535945e6  fix: change ==NONE to is (#3923)  1 year ago
triton_ops.py                   b361c72761  Update DeepSpeed copyright license to Apache 2.0 (#3111)  1 year ago