Sam Ade Jacobs
|
12e1cb8262
MOE matmult with memaccess (#2336)
|
2 years ago |
Sam Ade Jacobs
|
80b10d0c69
MOE residual matmult unit test (#2323)
|
2 years ago |
Jeff Rasley
|
0f0a7a5d0b
bump to 0.7.4
|
2 years ago |
Michael Wyatt
|
1592381018
Add more options to inference benchmark (#2325)
|
2 years ago |
Jeff Rasley
|
cf638be998
only override forward if using cuda-graph (#2291)
|
2 years ago |
Guanhua Wang
|
95d1151733
add quant unit test (#2315)
|
2 years ago |
Michael Wyatt
|
c199edac82
refactor to use mem_access (#2317)
|
2 years ago |
Olatunji Ruwase
|
060078ab14
ZeRO-Inference blog - Update README (#2322)
|
2 years ago |
Olatunji Ruwase
|
f5230be87c
ZeRO-Inference blog - wrap up (#2321)
|
2 years ago |
Olatunji Ruwase
|
276eec7beb
ZeRO-Inference blog (#2271)
|
2 years ago |
Michael Wyatt
|
18ee381eb3
Upgrade P40 tests to torch 1.8 (#2316)
|
2 years ago |
Jeff Rasley
|
9595dff6d7
add inference eval scripts (#2303)
|
2 years ago |
Arash Bakhtiari
|
9d541a63dc
Add unit tests for residual_add kernels (#2307)
|
2 years ago |
Arash Bakhtiari
|
efa8aded4a
Fix the residual add mp scaling for GPTNeoX (#2310)
|
2 years ago |
Michael Wyatt
|
a691ec605b
Add tensor parallel inference unit tests (#2232)
|
2 years ago |
Molly Smith
|
d0dfe38d53
Update relu.cu with mem_access_utils (#2306)
|
2 years ago |
Michael Wyatt
|
b2d550ab85
Unit test for bias add kernel (#2298)
|
2 years ago |
Quentin Anthony
|
4524b8dbc6
Fix for Zero3 when MP>1 and at least one batch param undefined (#2289)
|
2 years ago |
Michael Wyatt
|
7e085b6258
fix for pytest picking up local deepspeed dir instead of installed deepspeed (#2299)
|
2 years ago |
Molly Smith
|
5102c4aba9
fused bias relu unittest (#2297)
|
2 years ago |
Olatunji Ruwase
|
c864103210
Offload all gradients to nvme (#2282)
|
2 years ago |
Stas Bekman
|
b146aa3523
[ds-inference] fix progress bar (#2286)
|
2 years ago |
Olatunji Ruwase
|
53182531ed
Refactor universal checkpointing and tensor fragments (#2253)
|
2 years ago |
Reza Yazdani
|
47e030f54d
Fp32 accuracy bug fix (#2285)
|
2 years ago |
Connor Holmes
|
c84bca37b1
Memory Access Utility (#2276)
|
2 years ago |
Connor Holmes
|
5505e2473e
Add to codeowners file (#2279)
|
2 years ago |
Ammar Ahmad Awan
|
91eba15a01
add a new unit test for cuda ops (#2278)
|
2 years ago |
Reza Yazdani
|
afdc72879f
Ds-inference Int8 support through ZeroQuant technology (#2217)
|
2 years ago |
Jeff Rasley
|
088212a7a9
prevent torch ext folder mkdir at tmp (#2274)
|
2 years ago |
trajep
|
aca34a9b5b
Trajepl: nebula load fix (#2182)
|
2 years ago |