Sven Mika
|
ea2bea7e30
[RLlib; Docs overhaul] Docstring cleanup: Offline. (#19808)
|
3 years ago |
Sven Mika
|
1fd0eb805e
[RLlib] Redo fix bug normalize vs unsquash actions (original PR made log-likelihood test flakey). (#17014)
|
3 years ago |
Amog Kamsetty
|
bc33dc7e96
Revert "[RLlib] Fix bug in policy.py: normalize_actions=True has to call `unsquash_action`, not `normalize_action`." (#17002)
|
3 years ago |
Sven Mika
|
7862dd64ea
[RLlib] Fix bug in policy.py: normalize_actions=True has to call `unsquash_action`, not `normalize_action`. (#16774)
|
3 years ago |
Sven Mika
|
8b3554e37e
[RLlib] Remove all (already soft-deprecated) `SampleBatch.data` from code. (#15335)
|
3 years ago |
Sven Mika
|
f6b84cb2f7
[RLlib] Fix offline logp vs prob bug in OffPolicyEstimator class. (#12158)
|
3 years ago |
Sven Mika
|
805dad3bc4
[RLlib] SAC algo cleanup. (#10825)
|
4 years ago |
Julius Frost
|
dc659ae89a
make action probabilities a numpy array (#10122)
|
4 years ago |
Sven Mika
|
2256047876
[RLlib] Rename rllib.utils.types into typing to match built-in python module's name. (#10114)
|
4 years ago |
Michael Luo
|
b51ab2af66
[RLlib] Offline Type Annotations (#9676)
|
4 years ago |
Sven Mika
|
83e06cd30a
[RLlib] DDPG refactor and Exploration API action noise classes. (#7314)
|
4 years ago |
Sven Mika
|
0db2046b0a
[RLlib] Policy.compute_log_likelihoods() and SAC refactor. (issue #7107) (#7124)
|
4 years ago |
Sven Mika
|
d537e9f0d8
[RLlib] Exploration API: merge deterministic flag with exploration classes (SoftQ and StochasticSampling). (#7155)
|
4 years ago |
Sven Mika
|
6e1c3ea824
[RLlib] Exploration API (+EpsilonGreedy sub-class). (#6974)
|
4 years ago |
Sven
|
60d4d5e1aa
Remove future imports (#6724)
|
4 years ago |
Robert Nishihara
|
39a3459886
Remove (object) from class declarations. (#6658)
|
4 years ago |
Eric Liang
|
5d7afe8092
[rllib] Try moving RLlib to top level dir (#5324)
|
5 years ago |