Logan Adams ccfdb84e2a FP6 quantization end-to-end. (#5234) 7 months ago
..
attention c1e02052ac Refactor the positional emebdding config code (#4920) 9 months ago
embedding 7b818ee961 improve the way to determine whether a variable is None (#4782) 10 months ago
linear ccfdb84e2a FP6 quantization end-to-end. (#5234) 7 months ago
moe c00388a2ef Mixtral FastGen Support (#4828) 10 months ago
post_norm 5411030529 Inference Checkpoints in V2 (#4664) 11 months ago
pre_norm 5411030529 Inference Checkpoints in V2 (#4664) 11 months ago
unembed 834272531a Add support of Microsoft Phi-2 model to DeepSpeed-FastGen (#4812) 9 months ago
__init__.py 38b41dffa1 DeepSpeed-FastGen (#4604) 11 months ago