Sven Mika
|
62dbf26394
[RLlib] POC: Run PGTrainer w/o the distr. exec API (Trainer's new training_iteration method). (#20984)
|
2 years ago |
Kai Fricke
|
3e6ba5d6d2
Revert "Revert [RLlib] POC: `PGTrainer` class that works by sub-classing, not `trainer_template.py`." (#20285)
|
2 years ago |
Kai Fricke
|
246787cdd9
Revert "[RLlib] POC: `PGTrainer` class that works by sub-classing, not `trainer_template.py`. (#20055)" (#20284)
|
2 years ago |
Sven Mika
|
6f85af435f
[RLlib] POC: `PGTrainer` class that works by sub-classing, not `trainer_template.py`. (#20055)
|
2 years ago |
Sven Mika
|
b213565783
[RLlib] Fix failing test cases: Soft-deprecate ModelV2.from_batch (in favor of ModelV2.__call__). (#19693)
|
3 years ago |
Sven Mika
|
ef18893fb5
[RLlib] PPO, APPO, and DD-PPO code cleanup. (#10420)
|
4 years ago |
Sven Mika
|
d14b501692
[RLlib] First attempt at cleaning up algo code in RLlib: PG. (#10115)
|
4 years ago |
Barak Michener
|
8e76796fd0
ci: Redo `format.sh --all` script & backfill lint fixes (#9956)
|
4 years ago |
Sven Mika
|
fcdf410ae1
[RLlib] Tf2.x native. (#8752)
|
4 years ago |
Sven Mika
|
43043ee4d5
[RLlib] Tf2x preparation; part 2 (upgrading `try_import_tf()`). (#9136)
|
4 years ago |
Sven Mika
|
7008902cff
[RLlib] Minor `rllib.utils` cleanup. (#8932)
|
4 years ago |
roireshef
|
3c60caa448
[rllib] implemented compute_advantages without gae (#6941)
|
4 years ago |
Sven
|
f1b56fa5ee
PG unify/cleanup tf vs torch and PG functionality test cases (tf + torch). (#6650)
|
4 years ago |