Sven Mika
|
f82880eda1
Revert "Revert [RLlib] POC: Deprecate `build_policy` (policy template) for torch only; PPOTorchPolicy (#20061) (#20399)" (#20417)
|
2 years ago |
Amog Kamsetty
|
90dc5460d4
Revert "[RLlib] POC: Deprecate `build_policy` (policy template) for torch only; PPOTorchPolicy (#20061)" (#20399)
|
2 years ago |
Sven Mika
|
5b1c8e46e1
[RLlib] POC: Deprecate `build_policy` (policy template) for torch only; PPOTorchPolicy (#20061)
|
2 years ago |
Sven Mika
|
cf21c634a3
[RLlib] Fix deprecated warning for torch_ops.py (soft-replaced by torch_utils.py). (#19982)
|
3 years ago |
Sven Mika
|
0b308719f8
[RLlib; Docs overhaul] Docstring cleanup: rllib/utils (#19829)
|
3 years ago |
Sven Mika
|
9c73871da0
[RLlib; Docs overhaul] Docstring cleanup: Evaluation (#19783)
|
3 years ago |
Sven Mika
|
b213565783
[RLlib] Fix failing test cases: Soft-deprecate ModelV2.from_batch (in favor of ModelV2.__call__). (#19693)
|
3 years ago |
roireshef
|
9b0352f363
[RLlib] Added LearningRateSchedule and EntropyCoeffSchedule to TF and Torch versions of A3C and PPO (#19276)
|
3 years ago |
Sven Mika
|
b4300dd532
[RLlib] Issue 18812: Torch multi-GPU stats not protected against race conditions. (#18937)
|
3 years ago |
Sven Mika
|
9883505e84
[RLlib] Add [LSTM=True + multi-GPU]-tests to nightly RLlib testing suite (for all algos supporting RNNs, except R2D2, RNNSAC, and DDPPO). (#18017)
|
3 years ago |
Sven Mika
|
494ddd98c1
[RLlib] Replace "seq_lens" w/ SampleBatch.SEQ_LENS. (#17928)
|
3 years ago |
Sven Mika
|
a428f10ebe
[RLlib] Add multi-GPU learning tests to nightly. (#17778)
|
3 years ago |
Sven Mika
|
5107d16ae5
[RLlib] Add @Deprecated decorator to simplify/unify deprecation of classes, methods, functions. (#17530)
|
3 years ago |
Michael Luo
|
474f04e322
[RLlib] DDPG/TD3 + A3C/A2C + MARWIL/BC Annotation/Comments/Code Cleanup (#14707)
|
3 years ago |
Sven Mika
|
4b3add0066
[RLlib] Discussion 2021: PPO does not learn vf, iff use_gae=False (ignores use_critic setting). (#15610)
|
3 years ago |
mvindiola1
|
9330403200
[RLlib] Mask out padded values for A3C loss with recurrent policy (#15525)
|
3 years ago |
Sven Mika
|
2e3655e8a9
[RLlib] Issue 9071 A3C w/ RNN not working due to VF assuming no RNN. (#13238)
|
3 years ago |
Sven Mika
|
99ae7bae05
[RLlib] JAXPolicy prep. PR #1. (#13077)
|
3 years ago |
Sven Mika
|
62c7ab5182
[RLlib] Trajectory view API: Enable by default for PPO, IMPALA, PG, A3C (tf and torch). (#11747)
|
4 years ago |
Kingsley Kuan
|
d1dd5d578e
[RLlib] Fix PyTorch A3C / A2C loss function using mixed reduced sum / mean (#11449)
|
4 years ago |
Sven Mika
|
36bda8432b
[RLlib] Trajectory view API: Simple List Collector (on by default for PPO); LSTM-agnostic (#11056)
|
4 years ago |
Tomasz Wrona
|
f266318a01
[rllib] Do not store torch tensors when using grad clipping (#8509)
|
4 years ago |
Sven Mika
|
57544b1ff9
[RLlib] Examples folder restructuring (Model examples; final part). (#8278)
|
4 years ago |
Sven Mika
|
428516056a
[RLlib] SAC Torch (incl. Atari learning) (#7984)
|
4 years ago |
Sven Mika
|
5537fe13b0
[RLlib] Exploration API: ParamNoise Integration into DQN; working example/test cases. (#7814)
|
4 years ago |
Sven Mika
|
e2edca45d4
[RLlib] PPO torch memory leak and unnecessary torch.Tensor creation and gc'ing. (#7238)
|
4 years ago |
roireshef
|
3c60caa448
[rllib] implemented compute_advantages without gae (#6941)
|
4 years ago |
Sven Mika
|
ae9a3a2237
[RLlib] from_config util method for framework agnostic components; start moving RLlib tests into Bazel. (#6865)
|
4 years ago |
Sven Mika
|
303547f119
[RLlib] Policy-classes cleanup and torch/tf unification. (#6770)
|
4 years ago |
Sven
|
60d4d5e1aa
Remove future imports (#6724)
|
4 years ago |