# MADDPG (Multi-Agent Deep Deterministic Policy Gradient)

MADDPG extends DDPG to multi-agent settings by training a centralized (shared) critic for each agent alongside decentralized actors. The code here is adapted from https://github.com/openai/maddpg to integrate with RLlib's multi-agent APIs. See justinkterry/maddpg-rllib for examples and more information. Note that this implementation is based on OpenAI's and is intended for use with the discrete MPE (Multi-Agent Particle Environment) environments. Also note that MADDPG is widely regarded as difficult to get working, even after applying all optimizations suitable for the target environment. It should be viewed as a research method, useful mainly for reproducing the results of the paper that introduced it.
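The centralized-critic idea can be sketched in plain NumPy (a toy illustration of the input shapes involved, not the RLlib implementation; all variable names and dimensions below are made up for the example):

```python
import numpy as np

# Hypothetical dimensions for a 3-agent discrete MPE-style task.
n_agents, obs_dim, act_dim = 3, 4, 2

rng = np.random.default_rng(0)

# One observation and one one-hot (discrete) action per agent.
obs = [rng.random(obs_dim) for _ in range(n_agents)]
acts = [np.eye(act_dim)[rng.integers(act_dim)] for _ in range(n_agents)]

# Decentralized actor: each agent acts on its own observation only.
actor_input = obs[0]                       # shape (obs_dim,)

# Centralized critic: conditioned on ALL agents' observations and actions,
# which is what makes training stable despite other agents' changing policies.
critic_input = np.concatenate(obs + acts)  # shape (n_agents*obs_dim + n_agents*act_dim,)
```

During execution only the actors are needed, so the extra information the critics consume at training time does not have to be available at test time.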

## Installation

```bash
conda create -n rllib-maddpg python=3.10
conda activate rllib-maddpg
pip install -r requirements.txt
pip install -e '.[development]'
```

## Usage

[MADDPG Example]()