checkpoint | a37e59b590 | Update deprecated HuggingFace function (#5144) | 8 months ago
kernels | b3767d01d4 | Fixed Windows inference build. (#5609) | 4 months ago
model_implementations | ccfdb84e2a | FP6 quantization end-to-end. (#5234) | 7 months ago
modules | e3d873a00e | Fix the FP6 kernels compilation problem on non-Ampere GPUs. (#5333) | 6 months ago
ragged | 49359d0bc7 | Replace HIP_PLATFORM_HCC with HIP_PLATFORM_AMD (#5264) | 7 months ago
__init__.py | 5411030529 | Inference Checkpoints in V2 (#4664) | 11 months ago
allocator.py | 3c811c966b | 47% FastGen speedup for low workload - refactor allocator (#5090) | 8 months ago
config_v2.py | ccfdb84e2a | FP6 quantization end-to-end. (#5234) | 7 months ago
engine_factory.py | bcc617a000 | Add fp16 support of Qwen1.5 models (0.5B to 72B) to DeepSpeed-FastGen (#5219) | 7 months ago
engine_v2.py | 5dea776a84 | Enhance query APIs for text generation (#4965) | 9 months ago
inference_parameter.py | 5411030529 | Inference Checkpoints in V2 (#4664) | 11 months ago
inference_utils.py | 38b41dffa1 | DeepSpeed-FastGen (#4604) | 11 months ago
logging.py | 38b41dffa1 | DeepSpeed-FastGen (#4604) | 11 months ago
scheduling_utils.py | 38b41dffa1 | DeepSpeed-FastGen (#4604) | 11 months ago