Sven Mika
|
d5bfb7b7da
[RLlib] Preparatory PR for multi-agent multi-GPU learner (alpha-star style) #03 (#21652)
|
2 years ago |
Sven Mika
|
b10d5533be
[RLlib] Issue 20920 (partial solution): contrib/MADDPG + pettingzoo coop-pong-v4 not working. (#21452)
|
2 years ago |
Sven Mika
|
b4790900f5
[RLlib] Sub-class `Trainer` (instead of `build_trainer()`): All remaining classes; soft-deprecate `build_trainer`. (#20725)
|
2 years ago |
Artur Niederfahrenhorst
|
d07e50e957
[RLlib] Replay buffer API (cleanups; docstrings; renames; move into `rllib/execution/buffers` dir) (#20552)
|
2 years ago |
gjoliver
|
99a0088233
[RLlib] Unify the way we create local replay buffer for all agents (#19627)
|
3 years ago |
gjoliver
|
89fbfc00f8
[RLlib] Some minor cleanups (buffer buffer_size -> capacity and others). (#19623)
|
3 years ago |
Sven Mika
|
732197e23a
[RLlib] Multi-GPU for tf-DQN/PG/A2C. (#13393)
|
3 years ago |
Sven Mika
|
8000258333
[RLlib] R2D2 Implementation. (#13933)
|
3 years ago |
desktable
|
5af745c90d
[RLlib] Implement the SlateQ algorithm (#11450)
|
4 years ago |