<h1 style="background-color: gray;
           color: black;
           padding: 20px;
           text-align: center;">INFO</h1>

This script is an extension of the `compare_all_randoms` script, with `Random4` added.

<h1 style="background-color: gray;
           color: black;
           padding: 20px;
           text-align: center;">IMPORTS</h1>

In [None]:
# External imports
import sys
import os
import tqdm.auto as tqdm
import matplotlib.pyplot as pyplot
import scipy.stats as scstats

# Add needed directories to the path
sys.path.append(os.path.join("..", "players"))

# PyRat imports
from pyrat import Game, GameMode
from Random1 import Random1
from Random2 import Random2
from Random3 import Random3
from Random4 import Random4

<h1 style="background-color: gray;
           color: black;
           padding: 20px;
           text-align: center;">CONSTANTS</h1>

In this script, we run many independent games. \
The goal is to collect enough statistics to conclude which algorithms perform better than the others. \
This constant defines how many games each player plays.

In [None]:
# Determines how many games will be played for each player
NB_GAMES = 1000

<h1 style="background-color: gray;
           color: black;
           padding: 20px;
           text-align: center;">RUN THE GAMES</h1>

First, let us configure the game with a dictionary. \
Later in the script, we will update the dictionary with the `random_seed` argument, so that a given game number uses the same maze and cheese location for every player. \
Note that we set the game mode to `SIMULATION` to run all games as fast as possible.

In [None]:
# Customize the game elements
config = {"mud_percentage": 0.0,
          "nb_cheese": 1,
          "game_mode": GameMode.SIMULATION}
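
To illustrate why reusing a seed makes games comparable across players, here is a minimal sketch based on Python's standard `random` module rather than on PyRat itself. The function `make_layout` and its `width`/`height` parameters are hypothetical stand-ins for maze generation; how PyRat actually consumes `random_seed` internally is not shown here.

```python
import random

def make_layout(seed, width=5, height=5):
    # Hypothetical stand-in for maze generation: draws one cheese cell
    # from a generator initialized with the given seed
    rng = random.Random(seed)
    return (rng.randrange(width), rng.randrange(height))

# Two calls with the same seed produce the same layout,
# so two players given the same seed face the same situation
layout_a = make_layout(42)
layout_b = make_layout(42)
same_seed_matches = layout_a == layout_b
```

With this kind of seeding, differences in the number of turns come from the players and not from one player drawing easier mazes than another.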

Let us now run all the games. \
For each game, we record the number of turns needed to complete it.

In [None]:
# Players to test (keys are legends to appear in the plot)
players = {"Random 1": {"class": Random1, "args": {}},
           "Random 2": {"class": Random2, "args": {}},
           "Random 3": {"class": Random3, "args": {}},
           "Random 4": {"class": Random4, "args": {}}}

# Run the games for each player
results = {player: [] for player in players}
for key in players:
    for seed in tqdm.tqdm(range(NB_GAMES), desc=key):
        
        # Update config to add the random seed
        config["random_seed"] = seed

        # Make the game
        game = Game(**config)
        player = players[key]["class"](**players[key]["args"])
        game.add_player(player)
        stats = game.start()
        
        # Store the number of turns needed
        results[key].append(stats["turns"])

<h1 style="background-color: gray;
           color: black;
           padding: 20px;
           text-align: center;">ANALYZE THE RESULTS</h1>
           
Now that all games have been played, we plot the percentage of games completed as a function of the number of turns elapsed.

In [None]:
# Plot, for each player, the cumulative percentage of games completed within a given number of turns
max_turn = max([max(results[player]) for player in results])
pyplot.figure(figsize=(10, 5))
for player in results:
    turns = [0] + sorted(results[player]) + [max_turn]
    games_completed_per_turn = [len([turn for turn in results[player] if turn <= t]) * 100.0 / NB_GAMES for t in turns]
    pyplot.plot(turns, games_completed_per_turn, label=player)
pyplot.title("Comparison of turns needed to complete all %d games" % (NB_GAMES))
pyplot.xlabel("Turns")
pyplot.ylabel("% of games completed")
pyplot.xscale("log")
pyplot.legend()
pyplot.show()
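
The cumulative percentages plotted above can be checked by hand on a small example. The sketch below uses made-up turn counts and a hypothetical helper `percent_completed`; it relies on `bisect_right` to count how many games finished within each threshold, which gives the same values as the list comprehension in the cell above.

```python
from bisect import bisect_right

def percent_completed(turn_counts, thresholds):
    # Sort once, then count the games finished in at most t turns with a binary search
    ordered = sorted(turn_counts)
    total = len(ordered)
    return [100.0 * bisect_right(ordered, t) / total for t in thresholds]

# Made-up results of 4 games (illustration only, not actual PyRat output)
example_turns = [12, 7, 30, 7]
curve = percent_completed(example_turns, [0, 7, 12, 30])
```

Here, half of the games are finished after 7 turns, three quarters after 12 turns, and all of them after 30 turns. A curve that rises earlier therefore corresponds to a player that tends to finish in fewer turns.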

Visualizing is great, but it may be hard to draw conclusions from a plot alone. \
Here, we perform a Mann-Whitney U test on each pair of players, which gives more insight into whether their numbers of turns differ significantly.

In [None]:
# Formal statistics to check if the differences between these curves are statistically significant
for i, player_1 in enumerate(results):
    for j, player_2 in enumerate(results):
        if j > i:
            test_result = scstats.mannwhitneyu(results[player_1], results[player_2], alternative="two-sided")
            print("Mann-Whitney U test between turns of program '%s' and of program '%s':" % (player_1, player_2), test_result)
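
As a rough intuition for what the test measures, the U statistic can be seen as a count over all pairs of games, one from each player. The sketch below is a naive pure-Python illustration with made-up turn counts, not SciPy's implementation; the helper `mann_whitney_u` is hypothetical and it does not compute a p-value.

```python
def mann_whitney_u(sample_1, sample_2):
    # Count the pairs in which the value from sample_1 exceeds the one from sample_2
    # (ties count for one half)
    u = 0.0
    for x in sample_1:
        for y in sample_2:
            if x > y:
                u += 1.0
            elif x == y:
                u += 0.5
    return u

# Made-up turn counts for two players (illustration only)
fast = [3, 5, 8]
slow = [10, 12, 20]
u_fast = mann_whitney_u(fast, slow)
u_slow = mann_whitney_u(slow, fast)
```

In this example, `fast` never needs more turns than `slow`, so its statistic is at one extreme and the statistic of `slow` is at the other; the two always sum to the number of pairs. When two players behave alike, both values lie close to half the number of pairs, and the test reports a large p-value.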