.. image:: https://github.com/ray-project/ray/raw/master/doc/source/images/ray_header_logo.png

.. image:: https://readthedocs.org/projects/ray/badge/?version=master
   :target: http://docs.ray.io/en/master/?badge=master

.. image:: https://img.shields.io/badge/Ray-Join%20Slack-blue
   :target: https://forms.gle/9TSdDYUgxYs8SA9e8

.. image:: https://img.shields.io/badge/Discuss-Ask%20Questions-blue
   :target: https://discuss.ray.io/

.. image:: https://img.shields.io/twitter/follow/raydistributed.svg?style=social&logo=twitter
   :target: https://twitter.com/raydistributed

|

**Ray provides a simple, universal API for building distributed applications.**

Ray is packaged with the following libraries for accelerating machine learning workloads:

- `Tune`_: Scalable Hyperparameter Tuning
- `RLlib`_: Scalable Reinforcement Learning
- `Train`_: Distributed Deep Learning (beta)
- `Datasets`_: Distributed Data Loading and Compute

As well as libraries for taking ML and distributed apps to production:

- `Serve`_: Scalable and Programmable Serving
- `Workflows`_: Fast, Durable Application Flows (alpha)

There are also many `community integrations <https://docs.ray.io/en/master/ray-libraries.html>`_ with Ray, including `Dask`_, `MARS`_, `Modin`_, `Horovod`_, `Hugging Face`_, `Scikit-learn`_, and others. Check out the `full list of Ray distributed libraries here <https://docs.ray.io/en/master/ray-libraries.html>`_.

Install Ray with: ``pip install ray``. For nightly wheels, see the
`Installation page <https://docs.ray.io/en/master/installation.html>`__.

.. _`Modin`: https://github.com/modin-project/modin
.. _`Hugging Face`: https://huggingface.co/transformers/main_classes/trainer.html#transformers.Trainer.hyperparameter_search
.. _`MARS`: https://docs.ray.io/en/latest/data/mars-on-ray.html
.. _`Dask`: https://docs.ray.io/en/latest/data/dask-on-ray.html
.. _`Horovod`: https://horovod.readthedocs.io/en/stable/ray_include.html
.. _`Scikit-learn`: https://docs.ray.io/en/master/joblib.html
.. _`Serve`: https://docs.ray.io/en/master/serve/index.html
.. _`Datasets`: https://docs.ray.io/en/master/data/dataset.html
.. _`Workflows`: https://docs.ray.io/en/master/workflows/concepts.html
.. _`Train`: https://docs.ray.io/en/master/train/train.html

Quick Start
-----------

Execute Python functions in parallel.

.. code-block:: python

    import ray
    ray.init()

    @ray.remote
    def f(x):
        return x * x

    futures = [f.remote(i) for i in range(4)]
    print(ray.get(futures))

To use Ray's actor model:

.. code-block:: python

    import ray
    ray.init()

    @ray.remote
    class Counter(object):
        def __init__(self):
            self.n = 0

        def increment(self):
            self.n += 1

        def read(self):
            return self.n

    counters = [Counter.remote() for i in range(4)]
    [c.increment.remote() for c in counters]
    futures = [c.read.remote() for c in counters]
    print(ray.get(futures))

Ray programs can run on a single machine, and can also seamlessly scale to large clusters. To execute the above Ray script in the cloud, just download `this configuration file <https://github.com/ray-project/ray/blob/master/python/ray/autoscaler/aws/example-full.yaml>`__, and run:

``ray submit [CLUSTER.YAML] example.py --start``

Read more about `launching clusters <https://docs.ray.io/en/master/cluster/index.html>`_.

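Once a cluster is running, the same driver code works unchanged. As a minimal sketch (assuming the script is launched on a node that is already part of a running Ray cluster, e.g. the head node), you can attach to the cluster instead of starting a new local Ray instance:

.. code-block:: python

    import ray

    # Connect to the existing cluster rather than starting a fresh local one.
    ray.init(address="auto")

    @ray.remote
    def f(x):
        return x * x

    # Same task-parallel code as in the local example above.
    print(ray.get([f.remote(i) for i in range(4)]))
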
Tune Quick Start
----------------

.. image:: https://github.com/ray-project/ray/raw/master/doc/source/images/tune-wide.png

`Tune`_ is a library for hyperparameter tuning at any scale.

- Launch a multi-node distributed hyperparameter sweep in less than 10 lines of code.
- Supports any deep learning framework, including PyTorch, `PyTorch Lightning <https://github.com/williamFalcon/pytorch-lightning>`_, TensorFlow, and Keras.
- Visualize results with `TensorBoard <https://www.tensorflow.org/tensorboard>`__.
- Choose among scalable SOTA algorithms such as `Population Based Training (PBT)`_, `Vizier's Median Stopping Rule`_, and `HyperBand/ASHA`_ (see the scheduler sketch after the example below).
- Tune integrates with many optimization libraries such as `Facebook Ax <http://ax.dev>`_, `HyperOpt <https://github.com/hyperopt/hyperopt>`_, and `Bayesian Optimization <https://github.com/fmfn/BayesianOptimization>`_ and enables you to scale them transparently.

To run this example, you will need to install the following:

.. code-block:: bash

    $ pip install "ray[tune]"

This example runs a parallel grid search to optimize an example objective function.

.. code-block:: python

    from ray import tune


    def objective(step, alpha, beta):
        return (0.1 + alpha * step / 100)**(-1) + beta * 0.1


    def training_function(config):
        # Hyperparameters
        alpha, beta = config["alpha"], config["beta"]
        for step in range(10):
            # Iterative training function - can be any arbitrary training procedure.
            intermediate_score = objective(step, alpha, beta)
            # Feed the score back to Tune.
            tune.report(mean_loss=intermediate_score)

    analysis = tune.run(
        training_function,
        config={
            "alpha": tune.grid_search([0.001, 0.01, 0.1]),
            "beta": tune.choice([1, 2, 3])
        })

    print("Best config: ", analysis.get_best_config(metric="mean_loss", mode="min"))

    # Get a dataframe for analyzing trial results.
    df = analysis.results_df

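As mentioned in the feature list above, Tune also ships trial schedulers such as ASHA for early stopping. Here is a rough sketch of plugging one into the example above (the search space and ``num_samples`` shown here are illustrative choices, not recommendations):

.. code-block:: python

    from ray.tune.schedulers import ASHAScheduler

    # Sample random configurations and stop poorly performing trials early,
    # based on the mean_loss reported by training_function above.
    analysis = tune.run(
        training_function,
        config={
            "alpha": tune.uniform(0.001, 0.1),
            "beta": tune.choice([1, 2, 3]),
        },
        num_samples=20,
        scheduler=ASHAScheduler(metric="mean_loss", mode="min"),
    )
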
If TensorBoard is installed, automatically visualize all trial results:

.. code-block:: bash

    tensorboard --logdir ~/ray_results

.. _`Tune`: https://docs.ray.io/en/master/tune.html
.. _`Population Based Training (PBT)`: https://docs.ray.io/en/master/tune/api_docs/schedulers.html#population-based-training-tune-schedulers-populationbasedtraining
.. _`Vizier's Median Stopping Rule`: https://docs.ray.io/en/master/tune/api_docs/schedulers.html#median-stopping-rule-tune-schedulers-medianstoppingrule
.. _`HyperBand/ASHA`: https://docs.ray.io/en/master/tune/api_docs/schedulers.html#asha-tune-schedulers-ashascheduler

RLlib Quick Start
-----------------

.. image:: https://github.com/ray-project/ray/raw/master/doc/source/rllib/images/rllib-logo.png

`RLlib`_ is an industry-grade library for reinforcement learning (RL), built on top of Ray.
It offers high scalability and unified APIs for a
`variety of industry and research applications <https://www.anyscale.com/event-category/ray-summit>`_.

.. code-block:: bash

    $ pip install "ray[rllib]" tensorflow # or torch

.. Do NOT edit the following code directly in this README! Instead, edit
   the ray/rllib/examples/documentation/rllib_on_ray_readme.py script and then
   copy the new code in here:

.. code-block:: python

    import gym
    from ray.rllib.algorithms.ppo import PPO

    # Define your problem using Python and OpenAI's gym API:
    class SimpleCorridor(gym.Env):
        """Corridor in which an agent must learn to move right to reach the exit.

        ---------------------
        | S | 1 | 2 | 3 | G |   S=start; G=goal; corridor_length=5
        ---------------------

        Possible actions to choose from are: 0=left; 1=right.
        Observations are floats indicating the current field index, e.g. 0.0 for
        the starting position, 1.0 for the field next to the starting position, etc.
        Rewards are -0.1 for all steps, except when reaching the goal (+1.0).
        """

        def __init__(self, config):
            self.end_pos = config["corridor_length"]
            self.cur_pos = 0
            self.action_space = gym.spaces.Discrete(2)  # left and right
            self.observation_space = gym.spaces.Box(0.0, self.end_pos, shape=(1,))

        def reset(self):
            """Resets the episode and returns the initial observation of the new one."""
            self.cur_pos = 0
            # Return initial observation.
            return [self.cur_pos]

        def step(self, action):
            """Takes a single step in the episode given `action`.

            Returns:
                New observation, reward, done-flag, info-dict (empty).
            """
            # Walk left.
            if action == 0 and self.cur_pos > 0:
                self.cur_pos -= 1
            # Walk right.
            elif action == 1:
                self.cur_pos += 1
            # Set `done` flag when end of corridor (goal) reached.
            done = self.cur_pos >= self.end_pos
            # +1.0 when the goal is reached, otherwise -0.1.
            reward = 1.0 if done else -0.1
            return [self.cur_pos], reward, done, {}

    # Create an RLlib Trainer instance.
    trainer = PPO(
        config={
            # Env class to use (here: our gym.Env sub-class from above).
            "env": SimpleCorridor,
            # Config dict to be passed to our custom env's constructor.
            "env_config": {
                # Use corridor with 20 fields (including S and G).
                "corridor_length": 20
            },
            # Parallelize environment rollouts.
            "num_workers": 3,
        })

    # Train for n iterations and report results (mean episode rewards).
    # Since we have to move at least 19 times in the env to reach the goal and
    # each move gives us -0.1 reward (except the last move at the end: +1.0),
    # we can expect to reach an optimal episode reward of -0.1*18 + 1.0 = -0.8
    for i in range(5):
        results = trainer.train()
        print(f"Iter: {i}; avg. reward={results['episode_reward_mean']}")

After training, you may want to perform action computations (inference) in your environment.
Here is a minimal example of how to do this. Also
`check out our more detailed examples here <https://github.com/ray-project/ray/tree/master/rllib/examples/inference_and_serving>`_
(in particular for `normal models <https://github.com/ray-project/ray/blob/master/rllib/examples/inference_and_serving/policy_inference_after_training.py>`_,
`LSTMs <https://github.com/ray-project/ray/blob/master/rllib/examples/inference_and_serving/policy_inference_after_training_with_lstm.py>`_,
and `attention nets <https://github.com/ray-project/ray/blob/master/rllib/examples/inference_and_serving/policy_inference_after_training_with_attention.py>`_).

.. code-block:: python

    # Perform inference (action computations) based on given env observations.
    # Note that we are using a slightly different env here (len 10 instead of 20),
    # however, this should still work as the agent has (hopefully) learned
    # to "just always walk right!"
    env = SimpleCorridor({"corridor_length": 10})
    # Get the initial observation (should be: [0.0] for the starting position).
    obs = env.reset()
    done = False
    total_reward = 0.0
    # Play one episode.
    while not done:
        # Compute a single action, given the current observation
        # from the environment.
        action = trainer.compute_single_action(obs)
        # Apply the computed action in the environment.
        obs, reward, done, info = env.step(action)
        # Sum up rewards for reporting purposes.
        total_reward += reward
    # Report results.
    print(f"Played 1 episode; total-reward={total_reward}")

.. _`RLlib`: https://docs.ray.io/en/master/rllib/index.html

Ray Serve Quick Start
---------------------

.. image:: https://raw.githubusercontent.com/ray-project/ray/master/doc/source/serve/logo.svg
   :width: 400

`Ray Serve`_ is a scalable model-serving library built on Ray. It is:

- Framework Agnostic: Use the same toolkit to serve everything from deep
  learning models built with frameworks like PyTorch or TensorFlow & Keras
  to Scikit-Learn models or arbitrary business logic.
- Python First: Configure your model serving declaratively in pure Python,
  without needing YAMLs or JSON configs.
- Performance Oriented: Turn on batching, pipelining, and GPU acceleration to
  increase the throughput of your model.
- Composition Native: Create "model pipelines" by composing multiple models
  together to drive a single prediction.
- Horizontally Scalable: Serve can linearly scale as you add more machines. Enable
  your ML-powered service to handle growing traffic.

To run this example, you will need to install the following:

.. code-block:: bash

    $ pip install scikit-learn
    $ pip install "ray[serve]"

This example serves a scikit-learn gradient boosting classifier.

.. code-block:: python

    import pickle
    import requests

    from sklearn.datasets import load_iris
    from sklearn.ensemble import GradientBoostingClassifier

    from ray import serve

    serve.start()

    # Train model.
    iris_dataset = load_iris()
    model = GradientBoostingClassifier()
    model.fit(iris_dataset["data"], iris_dataset["target"])


    @serve.deployment(route_prefix="/iris")
    class BoostingModel:
        def __init__(self, model):
            self.model = model
            self.label_list = iris_dataset["target_names"].tolist()

        async def __call__(self, request):
            payload = (await request.json())["vector"]
  243. print(f"Received flask request with data {payload}")
            prediction = self.model.predict([payload])[0]
            human_name = self.label_list[prediction]
            return {"result": human_name}


    # Deploy model.
    BoostingModel.deploy(model)

    # Query it!
    sample_request_input = {"vector": [1.2, 1.0, 1.1, 0.9]}
    response = requests.get("http://localhost:8000/iris", json=sample_request_input)
    print(response.text)
    # Result:
    # {
    #     "result": "versicolor"
    # }

.. _`Ray Serve`: https://docs.ray.io/en/master/serve/index.html

More Information
----------------

- `Documentation`_
- `Tutorial`_
- `Blog`_
- `Ray 1.0 Architecture whitepaper`_ **(new)**
- `Exoshuffle: large-scale data shuffle in Ray`_ **(new)**
- `RLlib paper`_
- `RLlib flow paper`_
- `Tune paper`_

*Older documents:*

- `Ray paper`_
- `Ray HotOS paper`_

.. _`Documentation`: http://docs.ray.io/en/master/index.html
.. _`Tutorial`: https://github.com/ray-project/tutorial
.. _`Blog`: https://medium.com/distributed-computing-with-ray
.. _`Ray 1.0 Architecture whitepaper`: https://docs.google.com/document/d/1lAy0Owi-vPz2jEqBSaHNQcy2IBSDEHyXNOQZlGuj93c/preview
.. _`Exoshuffle: large-scale data shuffle in Ray`: https://arxiv.org/abs/2203.05072
.. _`Ray paper`: https://arxiv.org/abs/1712.05889
.. _`Ray HotOS paper`: https://arxiv.org/abs/1703.03924
.. _`RLlib paper`: https://arxiv.org/abs/1712.09381
.. _`RLlib flow paper`: https://arxiv.org/abs/2011.12719
.. _`Tune paper`: https://arxiv.org/abs/1807.05118

Getting Involved
----------------

.. list-table::
   :widths: 25 50 25 25
   :header-rows: 1

   * - Platform
     - Purpose
     - Estimated Response Time
     - Support Level
   * - `Discourse Forum`_
     - For discussions about development and questions about usage.
     - < 1 day
     - Community
   * - `GitHub Issues`_
     - For reporting bugs and filing feature requests.
     - < 2 days
     - Ray OSS Team
   * - `Slack`_
     - For collaborating with other Ray users.
     - < 2 days
     - Community
   * - `StackOverflow`_
     - For asking questions about how to use Ray.
     - 3-5 days
     - Community
   * - `Meetup Group`_
     - For learning about Ray projects and best practices.
     - Monthly
     - Ray DevRel
   * - `Twitter`_
     - For staying up-to-date on new features.
     - Daily
     - Ray DevRel

.. _`Discourse Forum`: https://discuss.ray.io/
.. _`GitHub Issues`: https://github.com/ray-project/ray/issues
.. _`StackOverflow`: https://stackoverflow.com/questions/tagged/ray
.. _`Meetup Group`: https://www.meetup.com/Bay-Area-Ray-Meetup/
.. _`Twitter`: https://twitter.com/raydistributed
.. _`Slack`: https://forms.gle/9TSdDYUgxYs8SA9e8