Sven Mika
|
5a3954eff7
[RLlib] APPO+new-stack (Atari benchmark) - Preparatory PR 01. (#34743)
|
1 year ago |
kourosh hakhamaneshi
|
8d2dc9a399
[RLlib] Change default framework from tf to torch (#33604)
|
1 year ago |
Sven Mika
|
8e680c483c
[RLlib] gymnasium support (new `Env.reset()/step()/seed()/render()` APIs). (#28369)
|
1 year ago |
Artur Niederfahrenhorst
|
0dceddb912
[RLlib] Move learning_starts logic from buffers into `training_step()`. (#26032)
|
2 years ago |
Sven Mika
|
7c39aa5fac
[RLlib] Trainer.training_iteration -> Trainer.training_step; Iterations vs reportings: Clarification of terms. (#25076)
|
2 years ago |
Artur Niederfahrenhorst
|
fb2915d26a
[RLlib] Replay Buffer API and Ape-X. (#24506)
|
2 years ago |
Sven Mika
|
f066180ed5
[RLlib] Deprecate `timesteps_per_iteration` config key (in favor of `min_[sample|train]_timesteps_per_reporting`). (#24372)
|
2 years ago |
Artur Niederfahrenhorst
|
9a64bd4e9b
[RLlib] Simple-Q uses training iteration fn (instead of execution_plan); ReplayBuffer API for Simple-Q (#22842)
|
2 years ago |
Sven Mika
|
2746fc0476
[RLlib] Auto-framework, retire `use_pytorch` in favor of `framework=...` (#8520)
|
4 years ago |
Sven Mika
|
baa053496a
[RLlib] Benchmark and regression test yaml cleanup and restructuring. (#8414)
|
4 years ago |