Sven Mika
|
827ab91741
[RLlib] Replace remaining mentions of "trainer" by "algorithm". (#36557)
|
1 year ago |
Sven Mika
|
a3ec4a936e
[RLlib] Enable `eager_tracing=True` by default. (#36556)
|
1 year ago |
Sven Mika
|
b218ae7e4a
[RLlib] Replace CartPole-v0 -> CartPole-v1 everywhere, incl. docs. (#29752)
|
2 years ago |
Sven Mika
|
e7a614f388
Revert "Revert "[RLlib] AlgorithmConfig: Next steps (volume 01); Algos, Rollo…" (#29747)
|
2 years ago |
Kai Fricke
|
12b579d95e
Revert "[RLlib] AlgorithmConfig: Next steps (volume 01); Algos, RolloutWorker, PolicyMap, WorkerSet use AlgorithmConfig objects under the hood. (#29395)" (#29742)
|
2 years ago |
Sven Mika
|
182744bbd1
[RLlib] AlgorithmConfig: Next steps (volume 01); Algos, RolloutWorker, PolicyMap, WorkerSet use AlgorithmConfig objects under the hood. (#29395)
|
2 years ago |
mgerstgrasser
|
7ba37885c6
Cast rewards as tf.float32 to fix error in DQN in tf2 (#28384)
|
2 years ago |
Artur Niederfahrenhorst
|
0dceddb912
[RLlib] Move learning_starts logic from buffers into `training_step()`. (#26032)
|
2 years ago |
Sven Mika
|
b5bc2b93c3
[RLlib] Move all remaining algos into `algorithms` directory. (#25366)
|
2 years ago |
kourosh hakhamaneshi
|
3815e52a61
[RLlib] Agents to algos: DQN w/o Apex and R2D2, DDPG/TD3, SAC, SlateQ, QMIX, PG, Bandits (#24896)
|
2 years ago |