Artur Niederfahrenhorst bccabb5ebf [RLlib] More fixes around PPO release tests. (#36045) | 1 year ago | |
---|---|---|
yaml_files | bccabb5ebf [RLlib] More fixes around PPO release tests. (#36045) | 1 year ago |
README.md | c026374acb [RLlib] Fix the 2 failing RLlib release tests. (#25603) | 2 years ago |
run.py | 8e680c483c [RLlib] gymnasium support (new `Env.reset()/step()/seed()/render()` APIs). (#28369) | 1 year ago |
todo_tests_currently_not_covered.yaml | 7401b39720 [RLlib] Fix double '::' in RLlib release test yaml files. (#34865) | 1 year ago |
Tests the most important RLlib algorithms on sufficiently hard tasks to catch performance regressions.
Algorithms in this suite are split into multiple tests, so groups of tests can run in parallel. This is to ensure reasonable total runtime.
All learning tests have `stop` and `pass_criteria` configured, where `stop` specifies a fixed test duration, and `pass_criteria` specifies performance goals like minimum reward and minimum throughput.
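For illustration, a single test entry might look like the sketch below, written as a Python dict rather than the YAML the suite actually uses. The key names (`time_total_s`, `episode_reward_mean`, `timesteps_total`), the environment, the helper `has_required_sections`, and all values are assumptions made for this example, not copied from the test files.

```python
# Hypothetical learning-test entry. All key names and values are
# illustrative assumptions, not taken from the real YAML files.
example_test = {
    "env": "CartPole-v1",
    "run": "PPO",
    # Fixed test duration: the run always lasts this long.
    "stop": {"time_total_s": 600},
    # Performance goals checked against the finished run.
    "pass_criteria": {
        "episode_reward_mean": 150.0,  # minimum reward
        "timesteps_total": 100_000,    # minimum throughput over the run
    },
}


def has_required_sections(test_config: dict) -> bool:
    """Return True if both `stop` and `pass_criteria` are non-empty dicts."""
    return all(
        isinstance(test_config.get(key), dict) and bool(test_config.get(key))
        for key in ("stop", "pass_criteria")
    )


print(has_required_sections(example_test))  # True
```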
Unlike normal tuned examples, these learning tests always run for the full specified test duration and do NOT stop early when the `pass_criteria` is met.
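The behavior described above can be sketched roughly as follows. This is a minimal illustration, not the logic in `run.py`: the function `evaluate_run`, the metric names, and the numbers are all made up, and it assumes that each criterion is a minimum compared against the best value seen during the run.

```python
from typing import Dict, List


def evaluate_run(
    results: List[Dict[str, float]],
    pass_criteria: Dict[str, float],
) -> Dict[str, bool]:
    """Compare the best value of each metric over the whole run to its minimum.

    `results` holds one metrics dict per training iteration, collected until
    the `stop` duration elapsed. Nothing here ends the run early.
    """
    verdict = {}
    for metric, minimum in pass_criteria.items():
        # Peak over the full run; a metric that was never reported fails.
        values = [r[metric] for r in results if metric in r]
        verdict[metric] = bool(values) and max(values) >= minimum
    return verdict


# Made-up numbers: the reward goal of 150 is already met at the second
# iteration, but the run continues and the later peak of 195 is recorded.
history = [
    {"episode_reward_mean": 40.0, "timesteps_total": 20_000},
    {"episode_reward_mean": 155.0, "timesteps_total": 60_000},
    {"episode_reward_mean": 195.0, "timesteps_total": 120_000},
]
criteria = {"episode_reward_mean": 150.0, "timesteps_total": 100_000}
verdict = evaluate_run(history, criteria)
print(verdict, all(verdict.values()))
```

Because the check happens only after the full duration, the recorded peak (195 here) is independent of where the passing threshold was set.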
Running for the full duration makes them better performance regression tests: with the run length decoupled from `pass_criteria`, we can specify a relatively conservative `pass_criteria`, to avoid having flaky tests that pass and fail because of random fluctuations.
TODO: The time series chart currently does not show progress when an algorithm learns faster but reaches the same peak performance. To capture that, we need to plot multiple lines at different percentage time marks.
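One possible way to get the per-percentage numbers the TODO asks for is sketched below. It is only an assumption about how such data could be derived: the function `reward_at_time_marks`, the 25/50/75/100% marks, and the two learning curves are hypothetical.

```python
from typing import Dict, List, Sequence, Tuple


def reward_at_time_marks(
    curve: List[Tuple[float, float]],
    total_time_s: float,
    fractions: Sequence[float] = (0.25, 0.5, 0.75, 1.0),
) -> Dict[float, float]:
    """Best reward seen up to each fraction of the total test duration.

    `curve` is a list of (elapsed_seconds, reward) pairs in time order.
    """
    marks = {}
    for fraction in fractions:
        cutoff = fraction * total_time_s
        seen = [reward for t, reward in curve if t <= cutoff]
        # NaN marks a time mark before the first reported result.
        marks[fraction] = max(seen) if seen else float("nan")
    return marks


# Two made-up runs that reach the same peak (195) at different speeds.
slow = [(100, 20.0), (300, 80.0), (500, 150.0), (600, 195.0)]
fast = [(100, 90.0), (300, 190.0), (500, 195.0), (600, 195.0)]
print(reward_at_time_marks(slow, 600))
print(reward_at_time_marks(fast, 600))
```

Plotting one line per time mark would separate the two runs at the 25%, 50%, and 75% marks even though their values at 100% are identical.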
If you have any questions about these tests, ping jungong@.