snake.ai - Multiplayer Snake AI
This project provides:
- an interface for a multi-player snake game inspired by slither.io
- implementations of various agents based on reinforcement learning (Q-learning, Policy Gradients) and game theory (adversarial search such as Minimax, Alpha-beta pruning ...)
The multi-agent nature of this game provides many opportunities to explore RL algorithms (e.g. curriculum learning, how opponents during training episodes impact learned behaviors, ...) and is also a nice way of assessing the relative performance of each method.
In addition, it is also possible to study how different reward functions or game rules will shape the agents' strategies: for example, is it better to grow as much as possible by eating candies or to try and kill other players to end the game as quickly as possible?
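To make the trade-off between growing and killing concrete, here is a minimal sketch of two reward functions that could be compared. The `StepOutcome` fields and all numeric weights are hypothetical and are not taken from the project's code; they only illustrate how a reward choice can favor one strategy over the other.

```python
from dataclasses import dataclass


@dataclass
class StepOutcome:
    """Hypothetical summary of what happened to one snake during one step."""
    candies_eaten: int = 0
    kills: int = 0
    died: bool = False


def growth_reward(outcome: StepOutcome) -> float:
    """Reward that favors eating candies; kills earn nothing."""
    if outcome.died:
        return -10.0
    return 1.0 * outcome.candies_eaten


def aggressive_reward(outcome: StepOutcome) -> float:
    """Reward that favors eliminating opponents over growing."""
    if outcome.died:
        return -10.0
    return 0.1 * outcome.candies_eaten + 5.0 * outcome.kills


# The same step is scored very differently by the two reward functions.
step = StepOutcome(candies_eaten=2, kills=1)
print(growth_reward(step), aggressive_reward(step))
```

Training the same agent under each function and comparing the resulting behaviors is one way to study the question above.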
Read this blog post to get an overview of this project as well as some details on one of the reinforcement learning methods implemented (Q-learning with function approximation by neural networks).
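For readers new to Q-learning with function approximation, the sketch below shows the standard update in its simplest, linear form, where Q(s, a) is a dot product between a weight vector and a feature vector. It is a generic textbook illustration, not the project's implementation; the function names, the step size, and the feature vectors are assumptions (the discount of 0.9 simply mirrors the example configuration).

```python
def q_value(weights, features):
    """Linear approximation: Q(s, a) = w . phi(s, a)."""
    return sum(w * f for w, f in zip(weights, features))


def q_learning_update(weights, features, reward, next_features,
                      discount=0.9, step_size=0.1):
    """Return the weights after one Q-learning step.

    features      -- phi(s, a) for the action that was taken
    next_features -- one feature vector phi(s', a') per action available
                     in the next state (empty if s' is terminal)
    """
    # Value of the best action in the next state (0 if the game is over).
    best_next = max((q_value(weights, f) for f in next_features), default=0.0)
    target = reward + discount * best_next
    td_error = target - q_value(weights, features)
    # Gradient of a linear Q with respect to the weights is the feature vector.
    return [w + step_size * td_error * f for w, f in zip(weights, features)]


weights = [0.0, 0.0, 0.0]
weights = q_learning_update(weights, [1.0, 0.0, 1.0], reward=1.0,
                            next_features=[[0.0, 1.0, 0.0], [1.0, 1.0, 0.0]])
print(weights)
```

With a neural network, the same target is used but the weight update is replaced by a gradient step on the network's parameters.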
Visualizing individual games
It is possible to run a single game with the GUI through the command
$ python controller.py [h]
If you use the h option, this will add a 'human player': an agent you can control with the keyboard.
The config file
The config file config.py lets you configure the different agents and the details of the experiments/simulations you would like to run.
Here is an example configuration:
agent = "RL"
filename = "rl-pg-linear-r6-1000"
game_hp = HP(grid_size = 20, max_iter = 3000, discount = 0.9)
rl_hp = RlHp(rl_type = "policy_gradients", radius = 6, filter_actions = False, lambda_ = None, q_type = "linear")
depth = lambda s, a : survivorDfunc(s, a, 2, 0.5)
evalFn = greedyEvaluationFunction
opponents = [SmartGreedyAgent, OpportunistAgent, searchAgent("alphabeta", depth, evalFn)]
num_trials = 1000
Setting agent to "RL" or "ES" will add the corresponding agent to the opponents list (after training if necessary).
Setting it to anything else will keep this list unchanged.
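The rule above can be pictured with a small sketch. It assumes that the setting in question is `agent` and that the recognized values are `"RL"` and `"ES"`; the function `resolve_opponents` and the `trained_agents` mapping are hypothetical names used for illustration only and do not come from the project's code.

```python
def resolve_opponents(agent, opponents, trained_agents):
    """Return the strategies that will actually play.

    agent          -- value of the `agent` setting in the config
    opponents      -- the `opponents` list from the config
    trained_agents -- hypothetical mapping from agent name to a trained agent
    """
    players = list(opponents)  # copy so the configured list is not mutated
    if agent in ("RL", "ES"):
        players.append(trained_agents[agent])
    return players


# With agent = "RL", the trained RL agent joins the configured opponents.
print(resolve_opponents("RL", ["SmartGreedyAgent", "OpportunistAgent"],
                        {"RL": "trained RL agent"}))
# Any other value leaves the list unchanged.
print(resolve_opponents("none", ["SmartGreedyAgent", "OpportunistAgent"], {}))
```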
Once you have filled in the config file, you can run 500 simulations (without the GUI) to get some statistics about how the AIs perform against each other:
$ python simulation.py 500 [load]
If you use the load parameter, this will load pre-trained weights for the RL agents; otherwise it will first run some trial games to learn such weights. In the latter case, learned weights will be saved in the data/ folder with the name provided in the config.
simple-pg-r6.p contain the weights of RL agents trained respectively via Q-learning
and Policy Gradients on 1,000 trials.
We recommend training agents against hard-coded strategies instead of search-based ones such as Minimax (at least at first) since it will be much faster.
Basic statistics will be printed in the terminal; these (and more) will also be saved in a file in experiments/ with the name set in the config file. Note that each snake's id corresponds to its strategy's index in the list
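Because snake ids are list indices, results keyed by id can be mapped back to strategy names with a simple lookup. The sketch below assumes a hypothetical list of winner ids, one per simulation; the function name and the input format are illustrative and do not describe the actual layout of the saved statistics.

```python
from collections import Counter


def win_rates(strategy_names, winner_ids):
    """Map each strategy name to its share of wins.

    strategy_names -- names in the same order as the configured list
    winner_ids     -- hypothetical list holding the winning snake id of each game
    """
    if not winner_ids:
        return {name: 0.0 for name in strategy_names}
    counts = Counter(winner_ids)
    total = len(winner_ids)
    # Snake id i refers to the strategy at index i.
    return {name: counts[i] / total for i, name in enumerate(strategy_names)}


print(win_rates(["SmartGreedyAgent", "OpportunistAgent", "alphabeta"],
                [0, 2, 2, 1]))
```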
- strategies.py implements hard-coded strategies, especially useful to train RL agents or as baselines
- minimax.py implements adversarial strategies that explore trees of possible moves
- rl.py provides the interface for RL-based algorithms
- rl_interface.py provides utilities to train and load RL agents
- policy_gradients.py implements a simple Policy Gradients algorithm for reinforcement learning
- qlearning.py implements Q-learning for reinforcement learning and supports both a simple linear model and neural nets
- es.py implements an Evolutionary Strategy algorithm
- FeatureExtractor derives useful features from any state and is used by RL agents
- hp.py contains the general code for the game