# Leela Chess Zero

Leela Chess Zero is an algorithm for training agents on the Leela Chess engine. Its neural network is largely based on DeepMind's AlphaGo Zero and AlphaZero architectures, with some modifications. The agent is trained in competition with multiple versions of its past self.

The policy and model assume a multi-agent chess environment with a discrete action space that returns observations as a dictionary with two keys:

- `obs`: the observation, either as a state vector or as an image
- `action_mask`: a mask over the legal actions

The environment should also implement `get_state` and `set_state` methods, which are used by the MCTS implementation; a sketch of this interface follows.
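
A minimal sketch of what such an environment might look like. The class name, board encoding, and action-space size (4672, as in AlphaZero's move encoding) are illustrative assumptions, not the shipped environment:

```python
import copy

import numpy as np
from gymnasium.spaces import Box, Dict, Discrete
from ray.rllib.env.multi_agent_env import MultiAgentEnv


class ChessEnvSketch(MultiAgentEnv):
    """Hypothetical skeleton of the environment interface described above."""

    def __init__(self, config=None):
        super().__init__()
        # Illustrative sizes: an 8x8 board encoding and a fixed move vocabulary.
        self.num_actions = 4672
        self.action_space = Discrete(self.num_actions)
        self.observation_space = Dict(
            {
                "obs": Box(0.0, 1.0, shape=(8, 8, 13), dtype=np.float32),
                "action_mask": Box(0, 1, shape=(self.num_actions,), dtype=np.int8),
            }
        )
        self._board_state = None  # whatever encodes the full game state

    def get_state(self):
        # Snapshot the complete internal state so MCTS can branch from it.
        return copy.deepcopy(self._board_state)

    def set_state(self, state):
        # Restore a snapshot previously returned by get_state and hand back
        # the observation dict for the player to move.
        self._board_state = copy.deepcopy(state)
        return {
            "obs": np.zeros((8, 8, 13), dtype=np.float32),  # placeholder encoding
            "action_mask": np.ones(self.num_actions, dtype=np.int8),
        }
```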

The model used by the AlphaZero trainer should extend `TorchModelV2` and implement the method `compute_priors_and_value`.
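
A minimal sketch of such a model, assuming the dict observation space above. The class name, layer sizes, and masking scheme are illustrative assumptions; the shipped model differs:

```python
import numpy as np
import torch
import torch.nn as nn
from ray.rllib.models.torch.torch_modelv2 import TorchModelV2


class AlphaZeroModelSketch(TorchModelV2, nn.Module):
    """Hypothetical model satisfying the interface described above."""

    def __init__(self, obs_space, action_space, num_outputs, model_config, name):
        TorchModelV2.__init__(
            self, obs_space, action_space, num_outputs, model_config, name
        )
        nn.Module.__init__(self)
        # Assumes RLlib exposes the original dict space on the flattened one.
        obs_shape = obs_space.original_space["obs"].shape
        self.shared = nn.Sequential(nn.Linear(int(np.prod(obs_shape)), 256), nn.ReLU())
        self.policy_head = nn.Linear(256, num_outputs)
        self.value_head = nn.Linear(256, 1)
        self._value_out = None

    def forward(self, input_dict, state, seq_lens):
        x = self.shared(input_dict["obs"]["obs"].flatten(1).float())
        logits = self.policy_head(x)
        # Push logits of illegal actions toward -inf using the action mask.
        mask = input_dict["obs"]["action_mask"].float()
        inf_mask = torch.clamp(torch.log(mask), min=-1e10)
        self._value_out = self.value_head(x).squeeze(1)
        return logits + inf_mask, state

    def value_function(self):
        return self._value_out

    def compute_priors_and_value(self, obs):
        # Called by MCTS: returns action priors (softmax over masked logits)
        # and a value estimate for a batch of observation dicts.
        with torch.no_grad():
            masked_logits, _ = self.forward({"obs": obs}, None, None)
            return torch.softmax(masked_logits, dim=-1), self._value_out
```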

## Installation

```bash
conda create -n rllib-leela-chess python=3.10
conda activate rllib-leela-chess
pip install -r requirements.txt
pip install -e '.[development]'
```

## Usage

[Leela Chess Zero Example]()