
.. List of most important features of RLlib, with sigil-like buttons for each of the features.
   To be included into different rst files.

.. container:: clear-both

    .. container:: buttons-float-left

        .. https://docs.google.com/drawings/d/1i_yoxocyEOgiCxcfRZVKpNh0R_-2tQZOX4syquiytAI/edit?skip_itp2_check=true&pli=1

        .. image:: images/sigils/rllib-sigil-tf-and-torch.svg
            :width: 100
            :target: https://github.com/ray-project/ray/blob/master/rllib/examples/custom_tf_policy.py

    .. container::

        The most **popular deep-learning frameworks**: `PyTorch <https://github.com/ray-project/ray/blob/master/rllib/examples/custom_torch_policy.py>`_ and `TensorFlow
        (tf1.x/2.x static-graph/eager/traced) <https://github.com/ray-project/ray/blob/master/rllib/examples/custom_tf_policy.py>`_.
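
Switching between the two frameworks is a single config setting. Below is a minimal sketch; the algorithm, environment, and stopping criterion are illustrative placeholders, not part of the feature description above.

.. code-block:: python

    from ray import tune

    # Run PPO with the PyTorch framework; set "framework" to "tf2" or "tf"
    # to use TensorFlow instead.
    tune.run(
        "PPO",
        config={
            "env": "CartPole-v1",
            "framework": "torch",
        },
        stop={"training_iteration": 5},
    )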

.. container:: clear-both

    .. container:: buttons-float-left

        .. https://docs.google.com/drawings/d/1yEOfeHvuLi5EzZKtGFQMfQ2NINzi3bUBrU3Z7bCiuKs/edit

        .. image:: images/sigils/rllib-sigil-distributed-learning.svg
            :width: 100
            :target: https://github.com/ray-project/ray/blob/master/rllib/examples/tune/framework.py

    .. container::

        **Highly distributed learning**: Our RLlib algorithms (such as "PPO" or "IMPALA")
        allow you to set the ``num_workers`` config parameter, such that your workloads can run
        on hundreds of CPUs/nodes, thus parallelizing and speeding up learning.
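
For example, a rough sketch of scaling sample collection across many rollout workers (the worker count and environment here are arbitrary examples):

.. code-block:: python

    from ray import tune

    tune.run(
        "IMPALA",
        config={
            "env": "CartPole-v1",
            # Spin up 128 parallel rollout workers (Ray actors) that collect
            # samples concurrently; the central learner consumes their batches.
            "num_workers": 128,
        },
    )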

.. container:: clear-both

    .. container:: buttons-float-left

        .. https://docs.google.com/drawings/d/1b8uaRo0KjPH-x-elBmyvDwAA4I2oy8cj3dxNnUT3HTE/edit

        .. image:: images/sigils/rllib-sigil-vector-envs.svg
            :width: 100
            :target: https://github.com/ray-project/ray/blob/master/rllib/examples/env_rendering_and_recording.py

    .. container::

        **Vectorized (batched) and remote (parallel) environments**: RLlib auto-vectorizes
        your ``gym.Envs`` via the ``num_envs_per_worker`` config. Environment workers can
        then batch and thus significantly speed up the action-computing forward pass.
        On top of that, RLlib offers the ``remote_worker_envs`` config to create
        `single environments (within a vectorized one) as ray Actors <https://github.com/ray-project/ray/blob/master/rllib/examples/remote_envs_with_inference_done_on_main_node.py>`_,
        thus parallelizing even the env stepping process.
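
A rough sketch of combining both settings (the concrete worker and env counts are placeholders):

.. code-block:: python

    from ray import tune

    tune.run(
        "PPO",
        config={
            "env": "CartPole-v1",
            "num_workers": 4,
            # Each rollout worker steps 8 env copies and batches the
            # forward pass over all 8 observations at once.
            "num_envs_per_worker": 8,
            # Additionally turn each env copy into its own Ray actor,
            # so the env stepping itself also runs in parallel.
            "remote_worker_envs": True,
        },
    )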

.. container:: clear-both

    .. container:: buttons-float-left

        .. https://docs.google.com/drawings/d/1Lbi1Zf5SvczSliGEWuK4mjWeehPIArYY9XKys81EtHU/edit

        .. image:: images/sigils/rllib-sigil-multi-agent.svg
            :width: 100
            :target: https://github.com/ray-project/ray/blob/master/rllib/examples/multi_agent_independent_learning.py

    .. container::

        | **Multi-agent RL** (MARL): Convert your (custom) ``gym.Env`` into a multi-agent one
          via a few simple steps and start training your agents in any of the following fashions:
        | 1) Cooperative with `shared <https://github.com/ray-project/ray/blob/master/rllib/examples/centralized_critic.py>`_ or
          `separate <https://github.com/ray-project/ray/blob/master/rllib/examples/two_step_game.py>`_
          policies and/or value functions.
        | 2) Adversarial scenarios using `self-play <https://github.com/ray-project/ray/blob/master/rllib/examples/self_play_with_open_spiel.py>`_
          and `league-based training <https://github.com/ray-project/ray/blob/master/rllib/examples/self_play_league_based_with_open_spiel.py>`_.
        | 3) `Independent learning <https://github.com/ray-project/ray/blob/master/rllib/examples/multi_agent_independent_learning.py>`_
          of neutral/co-existing agents.
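
As an illustration of case 3), here is a rough sketch of independent learning with two separate policies. It uses RLlib's bundled ``MultiAgentCartPole`` example env; the policy names and the exact ``policy_mapping_fn`` signature are assumptions that may differ across RLlib versions.

.. code-block:: python

    from ray import tune
    from ray.rllib.examples.env.multi_agent import MultiAgentCartPole

    tune.run(
        "PPO",
        config={
            "env": MultiAgentCartPole,
            "env_config": {"num_agents": 2},
            "multiagent": {
                # Two independent policies, one per agent.
                "policies": {"policy_0", "policy_1"},
                # Map agent 0 -> "policy_0", agent 1 -> "policy_1".
                "policy_mapping_fn": (
                    lambda agent_id, *args, **kwargs: f"policy_{agent_id}"
                ),
            },
        },
    )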

.. container:: clear-both

    .. container:: buttons-float-left

        .. https://docs.google.com/drawings/d/1DY2IJUPo007mSRylz6IEs-dz_n1-rFh67RMi9PB2niY/edit

        .. image:: images/sigils/rllib-sigil-external-simulators.svg
            :width: 100
            :target: https://github.com/ray-project/ray/tree/master/rllib/examples/serving

    .. container::

        **External simulators**: Don't have your simulation running as a gym.Env in Python?
        No problem! RLlib supports an external environment API and comes with a pluggable,
        off-the-shelf
        `client <https://github.com/ray-project/ray/blob/master/rllib/examples/serving/cartpole_client.py>`_/
        `server <https://github.com/ray-project/ray/blob/master/rllib/examples/serving/cartpole_server.py>`_
        setup that allows you to run 100s of independent simulators on the "outside"
        (e.g. a Windows cloud) connecting to a central RLlib Policy-Server that learns
        and serves actions. Alternatively, actions can be computed on the client side
        to save on network traffic.
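
On the simulator side, the client loop looks roughly like the following sketch. The server address is a placeholder, and ``my_simulator`` stands in for your own (non-gym) simulation; see the linked client/server examples for a complete setup.

.. code-block:: python

    from ray.rllib.env.policy_client import PolicyClient

    # Connect this external simulator to a running RLlib policy server.
    # inference_mode="local" computes actions client-side to save traffic.
    client = PolicyClient("http://localhost:9900", inference_mode="local")

    episode_id = client.start_episode(training_enabled=True)
    obs = my_simulator.reset()  # `my_simulator` is a stand-in for your sim.
    done = False
    while not done:
        action = client.get_action(episode_id, obs)
        obs, reward, done, info = my_simulator.step(action)
        client.log_returns(episode_id, reward)
    client.end_episode(episode_id, obs)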

.. container:: clear-both

    .. container:: buttons-float-left

        .. https://docs.google.com/drawings/d/1VFuESSI5u9AK9zqe9zKSJIGX8taadijP7Qw1OLv2hSQ/edit

        .. image:: images/sigils/rllib-sigil-offline-rl.svg
            :width: 100
            :target: https://github.com/ray-project/ray/blob/master/rllib/examples/offline_rl.py

    .. container::

        **Offline RL and imitation learning/behavior cloning**: You don't have a simulator
        for your particular problem, but tons of historic data recorded by a legacy (maybe
        non-RL/ML) system? This branch of reinforcement learning is for you!
        RLlib comes with several `offline RL <https://github.com/ray-project/ray/blob/master/rllib/examples/offline_rl.py>`_
        algorithms (*CQL*, *MARWIL*, and *DQfD*), allowing you to either purely
        `behavior-clone <https://github.com/ray-project/ray/blob/master/rllib/algorithms/bc/tests/test_bc.py>`_
        your existing system or learn how to further improve over it.
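
For example, a rough sketch of training MARWIL purely from previously recorded experiences. The data path is a placeholder, and the ``beta`` setting (0.0 reduces MARWIL to plain behavior cloning) is an assumption about this algorithm's config.

.. code-block:: python

    from ray import tune

    tune.run(
        "MARWIL",
        config={
            # Previously recorded experiences (e.g. RLlib's JSON episode format).
            "input": "/tmp/my-offline-data/",
            # The env here is only used to infer observation/action spaces.
            "env": "CartPole-v1",
            # beta=0.0: pure behavior cloning; beta>0.0: advantage-weighted updates.
            "beta": 0.0,
        },
    )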