Sven Mika
|
a931076f59
[RLlib] Tf2 + eager-tracing same speed as framework=tf; Add more test coverage for tf2+tracing. (#19981)
|
3 years ago |
Sven Mika
|
0b308719f8
[RLlib; Docs overhaul] Docstring cleanup: rllib/utils (#19829)
|
3 years ago |
Sven Mika
|
f2cb2ed203
[RLlib; Docs overhaul] Docstring cleanup: Policies, policy_templates. (#19759)
|
3 years ago |
Sven Mika
|
b213565783
[RLlib] Fix failing test cases: Soft-deprecate ModelV2.from_batch (in favor of ModelV2.__call__). (#19693)
|
3 years ago |
Sven Mika
|
61a1274619
[RLlib] No Preprocessors (part 2). (#18468)
|
3 years ago |
Sven Mika
|
698b4eeed3
[RLlib] POC: Separate losses for APPO/IMPALA. Enable TFPolicy to handle multiple optimizers/losses (like TorchPolicy). (#18669)
|
3 years ago |
Sven Mika
|
9883505e84
[RLlib] Add [LSTM=True + multi-GPU]-tests to nightly RLlib testing suite (for all algos supporting RNNs, except R2D2, RNNSAC, and DDPPO). (#18017)
|
3 years ago |
Sven Mika
|
494ddd98c1
[RLlib] Replace "seq_lens" w/ SampleBatch.SEQ_LENS. (#17928)
|
3 years ago |
Sven Mika
|
a428f10ebe
[RLlib] Add multi-GPU learning tests to nightly. (#17778)
|
3 years ago |
Sven Mika
|
924f11cd45
[RLlib] Torch algos use now-framework-agnostic MultiGPUTrainOneStep execution op (~33% speedup for PPO-torch + GPU). (#17371)
|
3 years ago |
Sven Mika
|
90b21ce27e
[RLlib] De-flake 3 test cases; Fix `config.simple_optimizer` and `SampleBatch.is_training` warnings. (#17321)
|
3 years ago |
Chris Bamford
|
29768a7c01
[RLLib] (P1 regression) Fixing view requirements in compute actions (#15856)
|
3 years ago |
Sven Mika
|
5a313ba3d6
[RLlib] Refactor: All tf static graph code should reside inside Policy class. (#17169)
|
3 years ago |
Sven Mika
|
18d173b172
[RLlib] Implement policy_maps (multi-agent case) in RolloutWorkers as LRU caches. (#17031)
|
3 years ago |
Sven Mika
|
be6db06485
[RLlib] Re-do: Trainer: Support add and delete Policies. (#16569)
|
3 years ago |
Sven Mika
|
e973b726c2
[RLlib] Support native tf.keras.Models (part 2) - Default keras models for Vision/RNN/Attention. (#15273)
|
3 years ago |
Sven Mika
|
bb8a286cbc
[RLlib] Support native tf.keras.Model (milestone toward obsoleting ModelV2 class). (#14684)
|
3 years ago |
Sven Mika
|
9c5a0cfd7a
[RLlib] Issue 14385: `Policy.compute_actions_from_input_dict` does not properly track accessed fields for Policy's view requirements. (#14386)
|
3 years ago |
Sven Mika
|
4f66309e19
[RLlib] Redo issue 14533 tf enable eager exec (#14984)
|
3 years ago |
SangBin Cho
|
fa5f961d5e
Revert "[RLlib] Issue 14533: `tf.enable_eager_execution()` must be called at beginning. (#14737)" (#14918)
|
3 years ago |
Sven Mika
|
3e389d5812
[RLlib] Issue 14533: `tf.enable_eager_execution()` must be called at beginning. (#14737)
|
3 years ago |
Sven Mika
|
04bc0a9828
[RLlib] Remove all non-trajectory view API code. (#14860)
|
3 years ago |
Sven Mika
|
69202c6a7d
[RLlib] Obsolete usage tracking dict via sample batch. (#13065)
|
3 years ago |
Sven Mika
|
ee4b6e7e3b
[RLlib] Unity3D example broken due to change in ML-Agents API. Attention-net prev-n-a/r. Attention-wrapper works with images. (#14569)
|
3 years ago |
Sven Mika
|
732197e23a
[RLlib] Multi-GPU for tf-DQN/PG/A2C. (#13393)
|
3 years ago |
Sven Mika
|
8000258333
[RLlib] R2D2 Implementation. (#13933)
|
3 years ago |
Sven Mika
|
4db86404ad
[RLlib] Issue #13507: Fix MB-MPO CartPole Env's reward function as well as MB-MPO running into a traj. view API related issue. (#14037)
|
3 years ago |
Sven Mika
|
d7301a51f4
[RLlib]: Trajectory View API: Keep env infos (e.g. for postprocessing callbacks), no matter what. (#13555)
|
3 years ago |
Sven Mika
|
391cdfae8c
[RLlib] Trajectory view API docs. (#12718)
|
3 years ago |
Sven Mika
|
b2bcab711d
[RLlib] Attention Nets: tf (#12753)
|
3 years ago |