Xingjian Shi | d81dfdabcc | Fix LoRA Fuse/Unfuse in Hybrid Engine (#3563) | 1 year ago
Connor Holmes | 0a61d5d664 | Hybrid Engine Refactor and Llama Inference Support (#3425) | 1 year ago
Reza Yazdani | 3e8564645d | Add HE support for the rest of model containers (#3191) | 1 year ago
Olatunji Ruwase | 47f9f13bd3 | DeepSpeed Chat (#3186) | 1 year ago
Michael Wyatt | b361c72761 | Update DeepSpeed copyright license to Apache 2.0 (#3111) | 1 year ago
Jeff Rasley | 91d63e0228 | update formatter version and style settings (#3098) | 1 year ago
Heyang Qin | dc01cee5ca | using container when loading inference checkpoints (#2875) | 1 year ago
Jeff Rasley | da84e60d98 | add missing license info to top of all source code (#2889) | 1 year ago
Lev Kurilenko | fd1449c766 | Port Reza's INT8-quantization fix to container architecture (#2725) | 1 year ago
Lev Kurilenko | 10f3c301a0 | Add container load checkpoint error reporting + refactor (#2792) | 1 year ago
Lev Kurilenko | 0a73e6e613 | Container param cleanup + remove qkv_merging (#2780) | 1 year ago
Ammar Ahmad Awan | 867da307d0 | Inference Refactor (replace_with_policy, model_implementations) (#2554) | 1 year ago