Sven Mika | a931076f59 | [RLlib] Tf2 + eager-tracing same speed as framework=tf; Add more test coverage for tf2+tracing. (#19981) | 3 years ago
Sven Mika | 2d24ef0d32 | [RLlib] Add all simple learning tests as `framework=tf2`. (#19273) | 3 years ago
Sven Mika | 0b308719f8 | [RLlib; Docs overhaul] Docstring cleanup: rllib/utils (#19829) | 3 years ago
Sven Mika | 9c73871da0 | [RLlib; Docs overhaul] Docstring cleanup: Evaluation (#19783) | 3 years ago
Sven Mika | f2cb2ed203 | [RLlib; Docs overhaul] Docstring cleanup: Policies, policy_templates. (#19759) | 3 years ago
Sven Mika | ac3371a148 | [RLlib] Discussion 3644: Fix bug for complex obs spaces containing `Box([2D shape])` and discrete component. (#18917) | 3 years ago
Sven Mika | ed85f59194 | [RLlib] Unify all RLlib Trainer.train() -> results[info][learner][policy ID][learner_stats] and add structure tests. (#18879) | 3 years ago
Sven Mika | 61a1274619 | [RLlib] No Preprocessors (part 2). (#18468) | 3 years ago
Sven Mika | 698b4eeed3 | [RLlib] POC: Separate losses for APPO/IMPALA. Enable TFPolicy to handle multiple optimizers/losses (like TorchPolicy). (#18669) | 3 years ago
Sven Mika | 9883505e84 | [RLlib] Add [LSTM=True + multi-GPU]-tests to nightly RLlib testing suite (for all algos supporting RNNs, except R2D2, RNNSAC, and DDPPO). (#18017) | 3 years ago
Sven Mika | 494ddd98c1 | [RLlib] Replace "seq_lens" w/ SampleBatch.SEQ_LENS. (#17928) | 3 years ago
Sven Mika | 5107d16ae5 | [RLlib] Add @Deprecated decorator to simplify/unify deprecation of classes, methods, functions. (#17530) | 3 years ago
Sven Mika | 8a844ff840 | [RLlib] Issues: 17397, 17425, 16715, 17174. When on driver, Torch|TFPolicy should not use `ray.get_gpu_ids()` (b/c no GPUs assigned by ray). (#17444) | 3 years ago
Sven Mika | 5a313ba3d6 | [RLlib] Refactor: All tf static graph code should reside inside Policy class. (#17169) | 3 years ago
Sven Mika | 18d173b172 | [RLlib] Implement policy_maps (multi-agent case) in RolloutWorkers as LRU caches. (#17031) | 3 years ago
Sven Mika | 1fd0eb805e | [RLlib] Redo fix bug normalize vs unsquash actions (original PR made log-likelihood test flakey). (#17014) | 3 years ago
Kai Fricke | 27d80c4c88 | [RLlib] ONNX export for tensorflow (1.x) and torch (#16805) | 3 years ago
Amog Kamsetty | bc33dc7e96 | Revert "[RLlib] Fix bug in policy.py: normalize_actions=True has to call `unsquash_action`, not `normalize_action`." (#17002) | 3 years ago
Sven Mika | 7862dd64ea | [RLlib] Fix bug in policy.py: normalize_actions=True has to call `unsquash_action`, not `normalize_action`. (#16774) | 3 years ago
Sven Mika | 7eb1a29426 | [RLlib] Fix ModelV2 custom metrics for torch. (#16734) | 3 years ago
Sven Mika | d0014cd351 | [RLlib] Policies get/set_state fixes and enhancements. (#16354) | 3 years ago
Sven Mika | e80095591c | [RLlib] Entropy coeff schedule bug fix and git bisect script. (#15937) | 3 years ago
Amog Kamsetty | ebc44c3d76 | [CI] Upgrade flake8 to 3.9.1 (#15527) | 3 years ago
Sven Mika | e973b726c2 | [RLlib] Support native tf.keras.Models (part 2) - Default keras models for Vision/RNN/Attention. (#15273) | 3 years ago
Sven Mika | 78b776942f | [RLlib] Discussion 1928: Initial lr wrong if schedule used that includes ts=0 (both tf and torch). (#15538) | 3 years ago
Sven Mika | bb8a286cbc | [RLlib] Support native tf.keras.Model (milestone toward obsoleting ModelV2 class). (#14684) | 3 years ago
Sven Mika | 41968512ca | [RLlib] Partial GPU examples (for learner and workers). (#15334) | 3 years ago
Sven Mika | bbfa8ffec9 | [RLlib] Minor release 1.3 warnings cleanups. (#15272) | 3 years ago
Sven Mika | 9c5a0cfd7a | [RLlib] Issue 14385: `Policy.compute_actions_from_input_dict` does not properly track accessed fields for Policy's view requirements. (#14386) | 3 years ago
Sven Mika | 69202c6a7d | [RLlib] Obsolete usage tracking dict via sample batch. (#13065) | 3 years ago