Sven Mika d5bfb7b7da [RLlib] Preparatory PR for multi-agent multi-GPU learner (alpha-star style) #03 (#21652) 2 years ago
..
a3c 599e589481 [RLlib] Move existing fake multi-GPU learning tests into separate buildkite job. (#18065) 3 years ago
ars 2589309cf0 [RLlib] Make sure torch and tf behave the same wrt conv2d nets. (#8785) 4 years ago
cql e7f9e8ceec [RLlib] Report total_train_steps correctly for offline agents like CQL. (#20541) 2 years ago
ddpg 60b2219d72 [RLlib] Allow for evaluation to run by `timesteps` (alternative to `episodes`) and add auto-setting to make sure train doesn't ever have to wait for eval (e.g. long episodes) to finish. (#20757) 2 years ago
dqn d5bfb7b7da [RLlib] Preparatory PR for multi-agent multi-GPU learner (alpha-star style) #03 (#21652) 2 years ago
dreamer 4e9888ce2f [RLlib] Dreamer (#10172) 4 years ago
es b84575c092 [RLlib] 2 RLlib Flaky Tests (#14930) 3 years ago
impala 026bf01071 [RLlib] Upgrade gym version to 0.21 and deprecate pendulum-v0. (#19535) 3 years ago
maml 59bc1e6c09 [RLLib] MAML extension for all models except RNNs (#11337) 3 years ago
marwil 4b278c36fc [RLlib] Behavioral Cloning (from MARWIL). (#10619) 4 years ago
mbmpo 6e6c680f14 MBMPO Cartpole (#11832) 4 years ago
pg 853d10871c [RLlib] Issue 18499: PGTrainer with training_iteration fn does not support multi-GPU. (#21376) 2 years ago
ppo f3397b6f48 [RLlib] Minor fixes/cleanups; chop_into_sequences now handles nested data. (#19408) 3 years ago
qmix abd3bef63b [RLlib] QMIX better defaults + added to CI learning tests (#21332) 2 years ago
sac 63db0e3a7c [RLlib] Fix SAC learning test flakiness introduced in PR: "Sub-class `Trainer` (instead of `build_trainer()`): All remaining classes; soft-deprecate `build_trainer`." (#20985) 2 years ago
cleanup_experiment.py baa053496a [RLlib] Benchmark and regression test yaml cleanup and restructuring. (#8414) 4 years ago
compact-regression-test.yaml 93c0a5549b [RLlib] Deprecate `vf_share_layers` in top-level PPO/MAML/MB-MPO configs. (#13397) 3 years ago
create_plots.py baa053496a [RLlib] Benchmark and regression test yaml cleanup and restructuring. (#8414) 4 years ago