stephen youn 0e0748c579 adds triton flash attention2 kernel (#4337) 1 year ago
..
op_binding 468882fb68 Add the policy to run llama model from the official repo (#4313) 1 year ago
triton 0e0748c579 adds triton flash attention2 kernel (#4337) 1 year ago
__init__.py b361c72761 Update DeepSpeed copyright license to Apache 2.0 (#3111) 1 year ago
bias_add.py b361c72761 Update DeepSpeed copyright license to Apache 2.0 (#3111) 1 year ago
config.py 468882fb68 Add the policy to run llama model from the official repo (#4313) 1 year ago
diffusers_2d_transformer.py b361c72761 Update DeepSpeed copyright license to Apache 2.0 (#3111) 1 year ago
diffusers_attention.py 1ba4098918 Fix Stable Diffusion Injection (#4078) 1 year ago
diffusers_transformer_block.py ce535945e6 fix: change ==NONE to is (#3923) 1 year ago
ds_attention.py 468882fb68 Add the policy to run llama model from the official repo (#4313) 1 year ago
ds_mlp.py 15f94ae756 Engine side fix for loading llama checkpoint fine-tuned with zero3 (#3981) 1 year ago
moe_inference.py ce535945e6 fix: change ==NONE to is (#3923) 1 year ago
triton_ops.py b361c72761 Update DeepSpeed copyright license to Apache 2.0 (#3111) 1 year ago