| File | Last commit | Commit message | Last updated |
| --- | --- | --- | --- |
| features | 389bf69319 | fix: Remove duplicate word the (#4051) | 1 year ago |
| __init__.py | 468882fb68 | Add the policy to run llama model from the official repo (#4313) | 1 year ago |
| base.py | 468882fb68 | Add the policy to run llama model from the official repo (#4313) | 1 year ago |
| base_moe.py | b361c72761 | Update DeepSpeed copyright license to Apache 2.0 (#3111) | 1 year ago |
| bert.py | 69d1b9f978 | DeepSpeed-Triton for Inference (#3748) | 1 year ago |
| bloom.py | d81dfdabcc | Fix LoRA Fuse/Unfuse in Hybrid Engine (#3563) | 1 year ago |
| clip.py | 0a61d5d664 | Hybrid Engine Refactor and Llama Inference Support (#3425) | 1 year ago |
| distil_bert.py | 69d1b9f978 | DeepSpeed-Triton for Inference (#3748) | 1 year ago |
| gpt2.py | 0a61d5d664 | Hybrid Engine Refactor and Llama Inference Support (#3425) | 1 year ago |
| gptj.py | d81dfdabcc | Fix LoRA Fuse/Unfuse in Hybrid Engine (#3563) | 1 year ago |
| gptneo.py | d81dfdabcc | Fix LoRA Fuse/Unfuse in Hybrid Engine (#3563) | 1 year ago |
| gptneox.py | d81dfdabcc | Fix LoRA Fuse/Unfuse in Hybrid Engine (#3563) | 1 year ago |
| internlm.py | 367d6f9cec | Support InternLM (#4137) | 1 year ago |
| llama.py | 4fc2c8e7d5 | Fix llama meta tensor loading in AutoTP and kernel injected inference (#3608) | 1 year ago |
| llama2.py | 468882fb68 | Add the policy to run llama model from the official repo (#4313) | 1 year ago |
| megatron_gpt.py | 6cbf666131 | fix MegatronLayerPolicy to be compatible with the newest ParallelTransformerLayer (#4236) | 1 year ago |
| megatron_gpt_moe.py | 0a61d5d664 | Hybrid Engine Refactor and Llama Inference Support (#3425) | 1 year ago |
| opt.py | d81dfdabcc | Fix LoRA Fuse/Unfuse in Hybrid Engine (#3563) | 1 year ago |
| unet.py | b361c72761 | Update DeepSpeed copyright license to Apache 2.0 (#3111) | 1 year ago |
| vae.py | 1ba4098918 | Fix Stable Diffusion Injection (#4078) | 1 year ago |