Alpha Zero

Alpha Zero is a general reinforcement learning algorithm that achieved superhuman performance in chess, shogi, and Go by learning tabula rasa from games of self-play. It surpassed previous state-of-the-art programs, which relied on handcrafted evaluation functions and domain-specific adaptations.

Installation

conda create -n rllib-alpha-zero python=3.10
conda activate rllib-alpha-zero
pip install -r requirements.txt
pip install -e '.[development]'
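
After the commands above finish, a quick sanity check can confirm that the new environment sees the expected packages. The sketch below is only illustrative: the module names `ray` and `rllib_alpha_zero` are assumptions about what `requirements.txt` and the editable install provide, and `missing_modules` is a hypothetical helper, not part of this package.

```python
import importlib.util


def missing_modules(names):
    """Return the top-level module names that cannot be found in this environment."""
    return [name for name in names if importlib.util.find_spec(name) is None]


if __name__ == "__main__":
    # Assumed module names; adjust them to match what the install actually provides.
    expected = ["ray", "rllib_alpha_zero"]
    missing = missing_modules(expected)
    if missing:
        print("Not importable:", ", ".join(missing))
    else:
        print("All expected modules were found.")
```

Run it inside the activated `rllib-alpha-zero` environment. Any name reported as not importable suggests that one of the install steps did not complete, or that the assumed module name is wrong.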

Usage

[AlphaZero Example]()