Sven Mika
|
2ed09c5445
[RLlib] Move all config validation logic into AlgorithmConfig classes. (#29854)
|
1 year ago |
Sven Mika
|
96693055bd
[RLlib] More Trainer -> Algorithm renaming cleanups. (#25869)
|
2 years ago |
Sven Mika
|
130b7eeaba
[RLlib] `Trainer` to `Algorithm` renaming. (#25539)
|
2 years ago |
Sven Mika
|
7c39aa5fac
[RLlib] Trainer.training_iteration -> Trainer.training_step; Iterations vs reportings: Clarification of terms. (#25076)
|
2 years ago |
Sven Mika
|
1bc6419e0e
[RLlib] R2D2 training iteration fn AND switch off `execution_plan` API by default. (#24165)
|
2 years ago |
Sven Mika
|
92781c603e
[RLlib] A2C `training_iteration` method implementation (`_disable_execution_plan_api=True`) (#23735)
|
2 years ago |
Steven Morad
|
00922817b6
[RLlib] Rewrite PPO to use training_iteration + enable DD-PPO for Win32. (#23673)
|
2 years ago |
Sven Mika
|
434265edd0
[RLlib] Examples folder: All `training_iteration` translations. (#23712)
|
2 years ago |
Sven Mika
|
7cb86acce2
[RLlib] trainer_template.py: hard deprecation (error when used). (#23488)
|
2 years ago |
Balaji Veeramani
|
7f1bacc7dc
[CI] Format Python code with Black (#21975)
|
2 years ago |
Sven Mika
|
371fbb17e4
[RLlib] Make `policies_to_train` more flexible via callable option. (#20735)
|
2 years ago |
gjoliver
|
99a0088233
[RLlib] Unify the way we create local replay buffer for all agents (#19627)
|
3 years ago |
Richard Liaw
|
a78a2263e5
[RLlib] Fix reverted RockPaperScissors Pettingzoo example (#16896)
|
3 years ago |
Pierre TASSEL
|
66605cfcbd
[RLLib] Random Parametric Trainer (#11366)
|
4 years ago |