Jeff Rasley
|
a10e4811fe
force set lf instead of crlf (https://github.com/pre-commit/pre-commit-hooks#mixed-line-ending) (#1598)
|
2 years ago |
eltonzheng
|
51d42ab9ec
fix partition activations issue when mp=2 and pp=2 (#1589)
|
2 years ago |
Hyunwoong Ko
|
d1e72f29d8
Fix non-fp tensor bugs of contiguous activation checkpointing (#1376)
|
3 years ago |
Alex Hedges
|
be789b1665
Fix many typos (#1423)
|
3 years ago |
Jeff Rasley
|
e2fdd254ed
Big science related changes (#1407)
|
3 years ago |
Olatunji Ruwase
|
85acf14c58
Activation checkpointing improvements (#1254)
|
3 years ago |
Olatunji Ruwase
|
b1669c0d8f
Avoid partitioning small activations (#1154)
|
3 years ago |
Cheng Li
|
4544b7d2f1
Improve flops profiler functionality (#1065)
|
3 years ago |
Sean Naren
|
41ab660b5d
Refactor param_dict to config (#1008)
|
3 years ago |
Jeff Rasley
|
0d4a54a04d
ZeRO-Infinity (#976)
|
3 years ago |
Samyam Rajbhandari
|
599258f979
ZeRO 3 Offload (#834)
|
3 years ago |
Olatunji Ruwase
|
ec8b1cb0a0
Activation checkpointing for non-tensor arguments and return values (#741)
|
3 years ago |
Shaden Smith
|
adcfd2694d
Handle actvitation checkpointing args that are None or non-tensors (#660)
|
3 years ago |
Shaden Smith
|
a825f99688
Fix activation checkpoint unit tests for GPU systems (#421)
|
4 years ago |
Jeff Rasley
|
a74a604a9e
Revert "Activation checkpointing bugfix and unit tests (#420)" (#422)
|
4 years ago |
Shaden Smith
|
01b6e27e78
Activation checkpointing bugfix and unit tests (#420)
|
4 years ago |
Shaden Smith
|
65c2f974d8
Pipeline parallel training engine. (#392)
|
4 years ago |
Jeff Rasley
|
e5bbc2e559
Sparse attn + ops/runtime refactor + v0.3.0 (#343)
|
4 years ago |