Vince Jankovics | 7dc3de4eed | [RLlib] Fix config mismatch for train_one_step. num_sgd_iter instead of sgd_num_iter. (#21555) | 2 years ago
Sven Mika | 92f030331e | [RLlib] Initial code/comment cleanups in preparation for decentralized multi-agent learner. (#21420) | 2 years ago
Sven Mika | 853d10871c | [RLlib] Issue 18499: PGTrainer with training_iteration fn does not support multi-GPU. (#21376) | 2 years ago
Sven Mika | 62dbf26394 | [RLlib] POC: Run PGTrainer w/o the distr. exec API (Trainer's new training_iteration method). (#20984) | 2 years ago
gjoliver | e7f9e8ceec | [RLlib] Report total_train_steps correctly for offline agents like CQL. (#20541) | 2 years ago
Sven Mika | ed85f59194 | [RLlib] Unify all RLlib Trainer.train() -> results[info][learner][policy ID][learner_stats] and add structure tests. (#18879) | 3 years ago
Sven Mika | 924f11cd45 | [RLlib] Torch algos use now-framework-agnostic MultiGPUTrainOneStep execution op (~33% speedup for PPO-torch + GPU). (#17371) | 3 years ago
Sven Mika | 5a313ba3d6 | [RLlib] Refactor: All tf static graph code should reside inside Policy class. (#17169) | 3 years ago
Sven Mika | 18d173b172 | [RLlib] Implement policy_maps (multi-agent case) in RolloutWorkers as LRU caches. (#17031) | 3 years ago
Sven Mika | 55a90e670a | [RLlib] Trainer.add_policy() not working for tf, if added policy is trained afterwards. (#16927) | 3 years ago
Sven Mika | 7eb1a29426 | [RLlib] Fix ModelV2 custom metrics for torch. (#16734) | 3 years ago
Sven Mika | be6db06485 | [RLlib] Re-do: Trainer: Support add and delete Policies. (#16569) | 3 years ago
Amog Kamsetty | bd3cbfc56a | Revert "[RLlib] Allow policies to be added/deleted on the fly. (#16359)" (#16543) | 3 years ago
Sven Mika | e78ec370a9 | [RLlib] Allow policies to be added/deleted on the fly. (#16359) | 3 years ago
Sven Mika | 2303851c3c | [RLlib] Torch multi-GPU + LSTM/RNN bug fix. (#15492) | 3 years ago
Sven Mika | cecfc3b43b | [RLlib] Multi-GPU support for Torch algorithms. (#14709) | 3 years ago
Sven Mika | c3a15ecc0f | [RLlib] Issue #13802: Enhance metrics for `multiagent->count_steps_by=agent_steps` setting. (#14033) | 3 years ago
Sven Mika | 732197e23a | [RLlib] Multi-GPU for tf-DQN/PG/A2C. (#13393) | 3 years ago
Sven Mika | 775e685531 | [RLlib] Issue #13824: `compress_observations=True` crashes for all algos not using a replay buffer. (#14034) | 3 years ago
Sven Mika | eb0038612f | [RLlib] Extend on_learn_on_batch callback to allow for custom metrics to be added. (#13584) | 3 years ago
Michael Luo | a2d1215200 | [RLlib] Execution Annotation (#13036) | 3 years ago
Edward Oakes | cde711aaf1 | Revert "[RLLib] Execution-Folder Type Annotations (#12760)" (#12886) | 3 years ago
Michael Luo | becca1424d | [RLLib] Execution-Folder Type Annotations (#12760) | 3 years ago
Sven Mika | 62c7ab5182 | [RLlib] Trajectory view API: Enable by default for PPO, IMPALA, PG, A3C (tf and torch). (#11747) | 4 years ago
Sven Mika | 805dad3bc4 | [RLlib] SAC algo cleanup. (#10825) | 4 years ago
Sven Mika | 2256047876 | [RLlib] Rename rllib.utils.types into typing to match built-in python module's name. (#10114) | 4 years ago
Barak Michener | 8e76796fd0 | ci: Redo `format.sh --all` script & backfill lint fixes (#9956) | 4 years ago
Sven Mika | fcdf410ae1 | [RLlib] Tf2.x native. (#8752) | 4 years ago
Sven Mika | 43043ee4d5 | [RLlib] Tf2x preparation; part 2 (upgrading `try_import_tf()`). (#9136) | 4 years ago
Eric Liang | 1e0e1a45e6 | [rllib] Add type annotations for evaluation/, env/ packages (#9003) | 4 years ago