Masahiro Tanaka
|
c3cfe96bb3
Enable torch.compile with ZeRO (Experimental) (#4878)
|
8 months ago |
Michael Wyatt
|
f0e5bece60
Fix for nightly torch CI (#5039)
|
8 months ago |
YiSheng5
|
752a50acdd
[MiCS]Add the path to support sequence_data_parallel on MiCS (#4926)
|
9 months ago |
Zhen Zhang
|
d7b764e3d8
Unit tests for MiCS (#4792)
|
9 months ago |
Guanhua Wang
|
ba6bfd0b76
fix mics run with offload++ (#4749)
|
10 months ago |
Heyang Qin
|
7711bdbbd2
MP ZeRO++ (#3954)
|
1 year ago |
leiwen83
|
1e0c39c6bf
enable pipeline checkpoint loading mode (#3629)
|
1 year ago |
Joe Mayer
|
8afcda2ac9
ZeRO Gradient Accumulation Dtype. (#2847)
|
1 year ago |
Guo Yejun
|
b4626194e4
zero/mics.py: use on_accelerator instead of cuda only (#3806)
|
1 year ago |
Heyang Qin
|
d18aa2c79c
ZeRO++ (#3784)
|
1 year ago |
Zhen Zhang
|
c88af21432
[MiCS] [Fix] saving and loading model checkpoint logic for MiCS sharding (#3440)
|
1 year ago |
Zhen Zhang
|
2e99f6edf6
[DRAFT] Tentative implementation of MiCS (#2964)
|
1 year ago |