quantization | 6c86ff393f | adding 8bit dequantization kernel for asym fine-grained block quantization in zero-inference (#4450) | 1 year ago
test_checkpoint_sharding.py | 76953a37b7 | fix opt-350m shard loading issue in AutoTP (#3600) | 1 year ago
test_inference.py | f15cccfa0c | [AutoTP] Make AutoTP work when num_heads not divisible by number of workers (#4011) | 1 year ago
test_inference_config.py | b361c72761 | Update DeepSpeed copyright license to Apache 2.0 (#3111) | 1 year ago
test_model_profiling.py | 7290aace9b | [CPU] Skip CPU support unimplemented error (#3633) | 1 year ago