Sven Mika
|
698b4eeed3
[RLlib] POC: Separate losses for APPO/IMPALA. Enable TFPolicy to handle multiple optimizers/losses (like TorchPolicy). (#18669)
|
3 years ago |
Sven Mika
|
649580d735
[RLlib] Redo simplify multi agent config dict: Reverted b/c seemed to break test_typing (non RLlib test). (#17046)
|
3 years ago |
Amog Kamsetty
|
38b5b6d24c
Revert "[RLlib] Simplify multiagent config (automatically infer class/spaces/config). (#16565)" (#17036)
|
3 years ago |
Sven Mika
|
e4123fff27
[RLlib] Simplify multiagent config (automatically infer class/spaces/config). (#16565)
|
3 years ago |
Kai Fricke
|
10fd7111b3
[rllib] Improve test learning check, fix flaky two step qmix (#16843)
|
3 years ago |
Sven Mika
|
53206dd440
[RLlib] CQL BC loss fixes; PPO/PG/A2|3C action normalization fixes (#16531)
|
3 years ago |
Sven Mika
|
be6db06485
[RLlib] Re-do: Trainer: Support add and delete Policies. (#16569)
|
3 years ago |
Sven Mika
|
169ddabae7
[RLlib] Issue 15973: Trainer.with_updates(validate_config=...) behaves confusingly. (#16429)
|
3 years ago |
Amog Kamsetty
|
bd3cbfc56a
Revert "[RLlib] Allow policies to be added/deleted on the fly. (#16359)" (#16543)
|
3 years ago |
Sven Mika
|
e78ec370a9
[RLlib] Allow policies to be added/deleted on the fly. (#16359)
|
3 years ago |
Sven Mika
|
d2c755ccef
[RLlib] Examples scripts add argparse help and replace `--torch` with `--framework`. (#15832)
|
3 years ago |
Sven Mika
|
c17169dc11
[RLlib] Fix all example scripts to run on GPUs. (#11105)
|
4 years ago |
Sven Mika
|
78dfed2683
[RLlib] Issue 8384: QMIX doesn't learn anything. (#9527)
|
4 years ago |