Cuong Nguyen
|
bd9dc1630f
[release] unbreak dask_on_ray_1tb_sort (#46120)
|
4 months ago |
Cuong Nguyen
|
0bf72a6d7c
[ci] deflake rllib release tests (#45901)
|
4 months ago |
Lonnie Liu
|
d59d1ef352
[finetune] change fine-tuning examples to use cuda 12.3 (#45879)
|
4 months ago |
Sven Mika
|
d49f15b111
[RLlib] Add "official" benchmark script for Atari PPO benchmarks (new API stack). (#45697)
|
4 months ago |
Sven Mika
|
4adb78b2bf
[RLlib] Activate DreamerV3 weekly release test (on Pong-v5 with the 100k setup). (#45654)
|
4 months ago |
Hongchao Deng
|
d4fc01c584
[core] add chaos_many_tasks/actors terminate instance cases (#45663)
|
4 months ago |
Sven Mika
|
c94140a3a4
[RLlib] Complete do-over of RLlib release tests (new API stack). (#45589)
|
4 months ago |
Stephanie Wang
|
ab2b442b34
[core][experimental] Fix GPU microbenchmark (#45426)
|
5 months ago |
Cuong Nguyen
|
90fa2895bd
[release] mark chaos_torch_batch_inference_16_gpu_300gb_raw as non-stable (#45387)
|
5 months ago |
Stephanie Wang
|
79f39957dc
[core][experimental] Accelerated DAG NCCL-based p2p channels for torch.Tensors (#45092)
|
5 months ago |
Cindy Zhang
|
863dc2392f
[serve] run all serve release tests in isolated cloud (#44939)
|
6 months ago |
Cindy Zhang
|
4449c5e3c6
[serve] remove old autoscaling release tests (#44785)
|
6 months ago |
Cindy Zhang
|
f38582fd76
[serve] improve 1k replica scalability release test (#44318)
|
6 months ago |
Cindy Zhang
|
6158f13cc6
[serve] Add microbenchmark release tests (#44327)
|
6 months ago |
Stephanie Wang
|
f80d4d330b
[tests] Increase timeout for single_node_oom test #44745
|
6 months ago |
Cuong Nguyen
|
7c115e8679
[ci] mark old-stack tf rllib release tests as non release-blocking (#44598)
|
6 months ago |
Cindy Zhang
|
f7fcfe0475
[serve] autoscaling release tests (#42421)
|
6 months ago |
Maheedhar Reddy Chappidi
|
0115c3bdeb
[train] Add TorchAwsNeuronXLABackend and XLAConfig (#39130)
|
6 months ago |
Justin Yu
|
6f4561700e
[tune][release-test] Remove `long_running_pbt` release test (#44270)
|
7 months ago |
Cuong Nguyen
|
4ee4a69082
[ci] mark rllib_learner_group_checkpointing_multinode as unstable (#44188)
|
7 months ago |
Cuong Nguyen
|
057d4a780d
[ci] mark rllib_learning_tests_ppo_new_api_stack_torch as unstable (#44116)
|
7 months ago |
Cuong Nguyen
|
933ac63db5
[ci] mark data 1b tests as unstable (#44115)
|
7 months ago |
Jiajun Yao
|
f11d8bda3a
[Core] Revamp many_drivers release test (#43886)
|
7 months ago |
Jiajun Yao
|
4a1b5e5853
[Core] Change runtime env release tests ownership to core (#43978)
|
7 months ago |
Justin Yu
|
3adbee4918
[train] Simplify `ray.train.xgboost/lightgbm` (6/n): Add core xgb/lgbm trainer release tests (#43693)
|
7 months ago |
Justin Yu
|
94bbf99cd5
[train+tune] Local directory refactor (2/n): Separate driver artifacts and trial working directories (#43403)
|
7 months ago |
Sven Mika
|
f8e59cba73
[RLlib] Do-over of release tests in light of rllib_contrib AND new- vs old API stack. (#43278)
|
7 months ago |
Cuong Nguyen
|
d950679c30
[release] support repeated run for release test (#43514)
|
7 months ago |
Yunxuan Xiao
|
4a7395721b
[Train] Colocate Trainer and rank 0 worker (#43115)
|
7 months ago |
Cuong Nguyen
|
8418009738
Revert "[ci][release] repeated run for release tests (#43472)" (#43510)
|
7 months ago |