In this folder, you can test the functionality and performance of the different backends for compressed allreduce, which is the main algorithm in one-bit optimizers such as One-Bit Adam, One-Bit Lamb and Zero-One Adam.
Your environment must have the relevant communication backend installed: either the NCCL backend of PyTorch distributed, or a Message Passing Interface (MPI) implementation such as MVAPICH2-GDR or OpenMPI. See the detailed pre-requisites.
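To make the idea of compressed allreduce more concrete, here is a minimal single-process sketch of sign compression with error feedback. It is an illustration only and not the code used by the backends in this folder: the function names `one_bit_compress` and `simulated_compressed_allreduce` are hypothetical, plain Python lists stand in for tensors, and a single "server" step replaces the chunked communication a real backend would perform.

```python
import math


def one_bit_compress(values, error):
    """Compress `values` to one scale plus signs, carrying `error` forward."""
    # Add back the compression error left over from the previous step.
    corrected = [v + e for v, e in zip(values, error)]
    # One shared magnitude per tensor: the RMS of the corrected values.
    scale = math.sqrt(sum(c * c for c in corrected) / len(corrected))
    # Zero is mapped to +1 so that every element fits in a single bit.
    signs = [1.0 if c >= 0 else -1.0 for c in corrected]
    # Whatever the compressed form fails to represent becomes the new error.
    new_error = [c - scale * s for c, s in zip(corrected, signs)]
    return scale, signs, new_error


def simulated_compressed_allreduce(worker_values, worker_errors, server_error):
    """Average the workers' compressed tensors, then compress the average again."""
    n = len(worker_values[0])
    total = [0.0] * n
    new_worker_errors = []
    for values, error in zip(worker_values, worker_errors):
        scale, signs, new_error = one_bit_compress(values, error)
        new_worker_errors.append(new_error)
        for i in range(n):
            total[i] += scale * signs[i]
    average = [t / len(worker_values) for t in total]
    # Second compression, with its own error buffer, before results are shared.
    scale, signs, new_server_error = one_bit_compress(average, server_error)
    result = [scale * s for s in signs]
    return result, new_worker_errors, new_server_error
```

Each element is reduced to a sign plus one shared scale, and the error buffers ensure that information dropped in one step is added back in the next. The accuracy tests in this folder compare a backend's output against a reference of this general kind.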
To test the accuracy and performance of the NCCL backend:
python test_nccl_backend.py
python test_nccl_perf.py
Similarly, for the MPI backend:
python test_mpi_backend.py
python test_mpi_perf.py
The CompressedBackend abstracts the generic part of one-bit optimizers and implements the accelerator-dependent part with the DeepSpeed custom op builder. To use and test CompressedBackend, make sure that your current accelerator supports PackbitsBuilder, so that it can be loaded to perform high-performance packing and unpacking between float and byte datatypes. An example can be found in Deepspeed/op_builder/xpu/packbits.py.
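As a rough picture of what packing and unpacking between float and byte datatypes means, the sketch below packs the sign of each float into one bit, eight signs per byte, and restores them as +1.0 / -1.0. This is a pure-Python illustration under stated assumptions, not the PackbitsBuilder op: the names `pack_signs` and `unpack_signs` are hypothetical, and the most-significant-bit-first layout is an assumption chosen for readability.

```python
def pack_signs(values):
    """Pack the sign of each float into bits, 8 signs per byte (MSB first)."""
    packed = bytearray((len(values) + 7) // 8)
    for i, v in enumerate(values):
        if v >= 0:  # non-negative values are stored as a 1 bit
            packed[i // 8] |= 0x80 >> (i % 8)
    return bytes(packed)


def unpack_signs(packed, count):
    """Recover `count` signs as +1.0 / -1.0 floats from the packed bytes."""
    return [
        1.0 if packed[i // 8] & (0x80 >> (i % 8)) else -1.0
        for i in range(count)
    ]


values = [0.5, -1.0, 2.0, -3.0, 0.0, -0.1, 4.0, 7.0, -2.0, 1.0]
packed = pack_signs(values)              # 10 floats fit in 2 bytes
restored = unpack_signs(packed, len(values))
```

The element count has to be passed to `unpack_signs` separately, because the padding bits in the last byte are indistinguishable from negative signs.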
The tests are run in the same way as the others:
python test_compressed_backend.py
python test_compressed_perf.py