A large-scale distributed computing framework designed for reinforcement learning and deep learning.


.. image:: https://github.com/ray-project/ray/raw/master/doc/source/images/ray_header_logo.png

.. image:: https://readthedocs.org/projects/ray/badge/?version=master
    :target: http://docs.ray.io/en/master/?badge=master

.. image:: https://img.shields.io/badge/Ray-Join%20Slack-blue
    :target: https://forms.gle/9TSdDYUgxYs8SA9e8

.. image:: https://img.shields.io/badge/Discuss-Ask%20Questions-blue
    :target: https://discuss.ray.io/

|


**Ray provides a simple, universal API for building distributed applications.**

Ray is packaged with the following libraries for accelerating machine learning workloads:

- `Tune`_: Scalable Hyperparameter Tuning
- `RLlib`_: Scalable Reinforcement Learning
- RaySGD: Distributed Training Wrappers
- `Ray Serve`_: Scalable and Programmable Serving

There are also many community integrations with Ray, including `Dask`_, `MARS`_, `Modin`_, `Horovod`_, `Hugging Face`_, `Scikit-learn`_, and others. The Ray documentation has the full list of Ray distributed libraries.

Install Ray with: ``pip install ray``. For nightly wheels, see the
Installation page of the Ray documentation.

.. _`Modin`: https://github.com/modin-project/modin
.. _`Hugging Face`: https://huggingface.co/transformers/main_classes/trainer.html#transformers.Trainer.hyperparameter_search
.. _`MARS`: https://docs.ray.io/en/master/mars-on-ray.html
.. _`Dask`: https://docs.ray.io/en/master/dask-on-ray.html
.. _`Horovod`: https://horovod.readthedocs.io/en/stable/ray_include.html
.. _`Scikit-learn`: https://docs.ray.io/en/master/joblib.html



Quick Start
-----------

Execute Python functions in parallel.

.. code-block:: python

    import ray
    ray.init()

    @ray.remote
    def f(x):
        return x * x

    futures = [f.remote(i) for i in range(4)]
    print(ray.get(futures))
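
For readers new to futures, the submit-then-collect shape of that snippet can be sketched with the standard library alone. This is only an analogy and does not use Ray: ``ThreadPoolExecutor`` runs threads inside one process, whereas Ray tasks run in separate worker processes. The helper name ``run_parallel`` is hypothetical.

```python
from concurrent.futures import ThreadPoolExecutor


def f(x):
    return x * x


def run_parallel(n=4):
    """Submit n calls of f, then collect the results in submission order."""
    with ThreadPoolExecutor(max_workers=n) as pool:
        # submit() returns a future right away, loosely like f.remote(i).
        futures = [pool.submit(f, i) for i in range(n)]
        # result() blocks until the value is ready, loosely like ray.get().
        return [future.result() for future in futures]


print(run_parallel())  # [0, 1, 4, 9]
```

The Ray version produces the same list of squares. The difference is where the work runs, not the calling pattern.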

To use Ray's actor model:

.. code-block:: python

    import ray
    ray.init()

    @ray.remote
    class Counter(object):
        def __init__(self):
            self.n = 0

        def increment(self):
            self.n += 1

        def read(self):
            return self.n

    counters = [Counter.remote() for i in range(4)]
    [c.increment.remote() for c in counters]
    futures = [c.read.remote() for c in counters]
    print(ray.get(futures))


Ray programs can run on a single machine, and can also seamlessly scale to large clusters. To execute the above Ray script in the cloud, supply a cluster configuration file (``CLUSTER.YAML`` below) and run:

``ray submit [CLUSTER.YAML] example.py --start``

Read more about launching clusters in the Ray documentation.

Tune Quick Start
----------------

.. image:: https://github.com/ray-project/ray/raw/master/doc/source/images/tune-wide.png

`Tune`_ is a library for hyperparameter tuning at any scale.

- Launch a multi-node distributed hyperparameter sweep in less than 10 lines of code.
- Supports any deep learning framework, including PyTorch, PyTorch Lightning, TensorFlow, and Keras.
- Visualize results with TensorBoard.
- Choose among scalable SOTA algorithms such as `Population Based Training (PBT)`_, `Vizier's Median Stopping Rule`_, and `HyperBand/ASHA`_.
- Tune integrates with many optimization libraries, such as Facebook Ax, HyperOpt, and Bayesian Optimization, and enables you to scale them transparently.

To run this example, you will need to install the following:

.. code-block:: bash

    $ pip install "ray[tune]"


This example runs a parallel grid search to optimize an example objective function.

.. code-block:: python

    from ray import tune


    def objective(step, alpha, beta):
        return (0.1 + alpha * step / 100)**(-1) + beta * 0.1


    def training_function(config):
        # Hyperparameters
        alpha, beta = config["alpha"], config["beta"]
        for step in range(10):
            # Iterative training function - can be any arbitrary training procedure.
            intermediate_score = objective(step, alpha, beta)
            # Feed the score back to Tune.
            tune.report(mean_loss=intermediate_score)


    analysis = tune.run(
        training_function,
        config={
            "alpha": tune.grid_search([0.001, 0.01, 0.1]),
            "beta": tune.choice([1, 2, 3])
        })

    print("Best config: ", analysis.get_best_config(metric="mean_loss", mode="min"))

    # Get a dataframe for analyzing trial results.
    df = analysis.results_df
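
As a rough cross-check that does not use Ray at all, the sketch below evaluates the same objective at the last training step for every combination of the ``alpha`` and ``beta`` values listed above. Enumerating the full product is an assumption made for illustration: in the Tune example only ``alpha`` is grid-searched, while ``beta`` is sampled with ``tune.choice``. It also assumes that the last reported ``mean_loss`` is the value being compared. The names ``ALPHAS``, ``BETAS``, ``LAST_STEP``, and ``final_losses`` are hypothetical.

```python
from itertools import product


def objective(step, alpha, beta):
    # Same formula as the example objective above.
    return (0.1 + alpha * step / 100) ** (-1) + beta * 0.1


ALPHAS = [0.001, 0.01, 0.1]
BETAS = [1, 2, 3]
LAST_STEP = 9  # range(10) in the training function ends at step 9


def final_losses():
    """Map each (alpha, beta) pair to its loss at the last step."""
    return {
        (alpha, beta): objective(LAST_STEP, alpha, beta)
        for alpha, beta in product(ALPHAS, BETAS)
    }


losses = final_losses()
best_alpha, best_beta = min(losses, key=losses.get)
print(best_alpha, best_beta)  # 0.1 1
```

Under these assumptions, ``alpha=0.1`` gives the lowest final loss whichever ``beta`` is drawn, so a Tune run would be expected to report ``alpha=0.1`` in its best config.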

If TensorBoard is installed, automatically visualize all trial results:

.. code-block:: bash

    tensorboard --logdir ~/ray_results

.. _`Tune`: https://docs.ray.io/en/master/tune.html
.. _`Population Based Training (PBT)`: https://docs.ray.io/en/master/tune-schedulers.html#population-based-training-pbt
.. _`Vizier's Median Stopping Rule`: https://docs.ray.io/en/master/tune-schedulers.html#median-stopping-rule
.. _`HyperBand/ASHA`: https://docs.ray.io/en/master/tune-schedulers.html#asynchronous-hyperband

RLlib Quick Start
-----------------

.. image:: https://github.com/ray-project/ray/raw/master/doc/source/images/rllib-wide.jpg

`RLlib`_ is an open-source library for reinforcement learning built on top of Ray that offers both high scalability and a unified API for a variety of applications.

.. code-block:: bash

    pip install tensorflow  # or tensorflow-gpu
    pip install "ray[rllib]"

.. code-block:: python

    import gym
    from gym.spaces import Discrete, Box
    from ray import tune

    class SimpleCorridor(gym.Env):
        def __init__(self, config):
            self.end_pos = config["corridor_length"]
            self.cur_pos = 0
            self.action_space = Discrete(2)
            self.observation_space = Box(0.0, self.end_pos, shape=(1, ))

        def reset(self):
            self.cur_pos = 0
            return [self.cur_pos]

        def step(self, action):
            if action == 0 and self.cur_pos > 0:
                self.cur_pos -= 1
            elif action == 1:
                self.cur_pos += 1
            done = self.cur_pos >= self.end_pos
            return [self.cur_pos], 1 if done else 0, done, {}

    tune.run(
        "PPO",
        config={
            "env": SimpleCorridor,
            "num_workers": 4,
            "env_config": {"corridor_length": 5}})
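
To see what the agent is being asked to learn, the corridor dynamics can be stepped by hand without gym or Ray. The sketch below copies the ``reset`` and ``step`` logic into a plain class and runs one episode with a fixed policy. ``CorridorSketch``, ``rollout``, ``always_right``, and the ``max_steps`` cap are hypothetical names introduced for illustration and are not part of RLlib.

```python
class CorridorSketch:
    """Plain-Python stand-in for SimpleCorridor (no gym dependency)."""

    def __init__(self, corridor_length):
        self.end_pos = corridor_length
        self.cur_pos = 0

    def reset(self):
        self.cur_pos = 0
        return [self.cur_pos]

    def step(self, action):
        # Action 0 moves left (not past the start), action 1 moves right.
        if action == 0 and self.cur_pos > 0:
            self.cur_pos -= 1
        elif action == 1:
            self.cur_pos += 1
        done = self.cur_pos >= self.end_pos
        return [self.cur_pos], 1 if done else 0, done, {}


def rollout(env, policy, max_steps=100):
    """Run one episode and return (total_reward, steps_taken)."""
    obs = env.reset()
    total_reward, steps, done = 0, 0, False
    while not done and steps < max_steps:
        obs, reward, done, _ = env.step(policy(obs))
        total_reward += reward
        steps += 1
    return total_reward, steps


def always_right(obs):
    return 1


print(rollout(CorridorSketch(5), always_right))  # (1, 5)
```

The only reward is the 1 received on reaching the end of the corridor, so a policy that always moves right finishes a length-5 corridor in five steps.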

.. _`RLlib`: https://docs.ray.io/en/master/rllib.html


Ray Serve Quick Start
---------------------

.. image:: https://raw.githubusercontent.com/ray-project/ray/master/doc/source/serve/logo.svg
    :width: 400

`Ray Serve`_ is a scalable model-serving library built on Ray. It is:

- Framework Agnostic: Use the same toolkit to serve everything from deep
  learning models built with frameworks like PyTorch or TensorFlow & Keras
  to Scikit-Learn models or arbitrary business logic.
- Python First: Configure your model serving declaratively in pure Python,
  without needing YAML or JSON configs.
- Performance Oriented: Turn on batching, pipelining, and GPU acceleration to
  increase the throughput of your model.
- Composition Native: Create "model pipelines" by composing multiple
  models together to drive a single prediction.
- Horizontally Scalable: Serve can scale linearly as you add more machines,
  enabling your ML-powered service to handle growing traffic.

To run this example, you will need to install the following:

.. code-block:: bash

    $ pip install scikit-learn
    $ pip install "ray[serve]"

This example serves a scikit-learn gradient boosting classifier.

.. code-block:: python

    from ray import serve
    import requests
    from sklearn.datasets import load_iris
    from sklearn.ensemble import GradientBoostingClassifier

    # Train model
    iris_dataset = load_iris()
    model = GradientBoostingClassifier()
    model.fit(iris_dataset["data"], iris_dataset["target"])

    # Define the Ray Serve model
    class BoostingModel:
        def __init__(self):
            self.model = model
            self.label_list = iris_dataset["target_names"].tolist()

        def __call__(self, flask_request):
            payload = flask_request.json["vector"]
            print("Worker: received flask request with data", payload)

            prediction = self.model.predict([payload])[0]
            human_name = self.label_list[prediction]
            return {"result": human_name}


    # Deploy model
    client = serve.start()
    client.create_backend("iris:v1", BoostingModel)
    client.create_endpoint("iris_classifier", backend="iris:v1", route="/iris")

    # Query it!
    sample_request_input = {"vector": [1.2, 1.0, 1.1, 0.9]}
    response = requests.get("http://localhost:8000/iris", json=sample_request_input)
    print(response.text)
    # Result:
    # {
    #  "result": "versicolor"
    # }


.. _`Ray Serve`: https://docs.ray.io/en/master/serve/index.html

More Information
----------------

- `Documentation`_
- `Tutorial`_
- `Blog`_
- `Ray 1.0 Architecture whitepaper`_ **(new)**
- `Ray Design Patterns`_ **(new)**
- `RLlib paper`_
- `RLlib flow paper`_
- `Tune paper`_

*Older documents:*

- `Ray paper`_
- `Ray HotOS paper`_
- `Blog (old)`_

.. _`Documentation`: http://docs.ray.io/en/master/index.html
.. _`Tutorial`: https://github.com/ray-project/tutorial
.. _`Blog (old)`: https://ray-project.github.io/
.. _`Blog`: https://medium.com/distributed-computing-with-ray
.. _`Ray 1.0 Architecture whitepaper`: https://docs.google.com/document/d/1lAy0Owi-vPz2jEqBSaHNQcy2IBSDEHyXNOQZlGuj93c/preview
.. _`Ray Design Patterns`: https://docs.google.com/document/d/167rnnDFIVRhHhK4mznEIemOtj63IOhtIPvSYaPgI4Fg/edit
.. _`Ray paper`: https://arxiv.org/abs/1712.05889
.. _`Ray HotOS paper`: https://arxiv.org/abs/1703.03924
.. _`RLlib paper`: https://arxiv.org/abs/1712.09381
.. _`RLlib flow paper`: https://arxiv.org/abs/2011.12719
.. _`Tune paper`: https://arxiv.org/abs/1807.05118

Getting Involved
----------------

- `Forum`_: For discussions about development, questions about usage, and feature requests.
- `GitHub Issues`_: For reporting bugs.
- `Twitter`_: Follow updates on Twitter.
- `Meetup Group`_: Join our meetup group.
- `StackOverflow`_: For questions about how to use Ray.

.. _`Forum`: https://discuss.ray.io/
.. _`GitHub Issues`: https://github.com/ray-project/ray/issues
.. _`StackOverflow`: https://stackoverflow.com/questions/tagged/ray
.. _`Meetup Group`: https://www.meetup.com/Bay-Area-Ray-Meetup/
.. _`Twitter`: https://twitter.com/raydistributed