mabby is a library for simulating multi-armed bandits (MABs), a resource-allocation problem and framework in reinforcement learning. It allows users to quickly yet flexibly define and run bandit simulations, with the ability to:
- choose from a wide range of classic bandit algorithms
- configure environments with custom arm spaces and reward distributions
- collect and visualize simulation metrics like regret and optimality
Prerequisites: Python 3.9+ and pip
Install mabby with pip:
pip install mabby
The code example below demonstrates the basic steps of running a simulation with mabby. For more in-depth examples, please see the Usage Examples section of the mabby documentation.
import mabby as mb
# configure bandit arms
bandit = mb.BernoulliArm.bandit(p=[0.3, 0.6])
# configure bandit strategy
strategy = mb.strategies.EpsilonGreedyStrategy(eps=0.2)
# set up simulation
simulation = mb.Simulation(bandit=bandit, strategies=[strategy])
# run simulation
stats = simulation.run(trials=100, steps=300)
# plot regret statistics
stats.plot_regret()
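To make the regret and optimality metrics concrete, here is a minimal, library-independent sketch of roughly what an epsilon-greedy simulation on a two-armed Bernoulli bandit computes. It does not use mabby's API and is not how mabby is implemented: the function names `run_epsilon_greedy` and `average_over_trials` are hypothetical, and regret is assumed here to mean cumulative expected regret (the gap between the best arm's mean and the chosen arm's mean, summed over steps), which may differ from the definition mabby uses.

```python
import random


def run_epsilon_greedy(probs, eps, steps, rng):
    """Simulate one epsilon-greedy run on a Bernoulli bandit.

    Returns two lists of length `steps`: the cumulative expected regret
    and whether the best arm was chosen at each step.
    """
    n_arms = len(probs)
    counts = [0] * n_arms       # number of pulls per arm
    estimates = [0.0] * n_arms  # running mean reward per arm
    best_mean = max(probs)
    best_arm = probs.index(best_mean)

    regret, optimal = [], []
    total_regret = 0.0
    for _ in range(steps):
        if rng.random() < eps:
            arm = rng.randrange(n_arms)  # explore: pick a random arm
        else:
            # exploit: pick the arm with the highest estimated mean
            arm = max(range(n_arms), key=lambda a: estimates[a])
        reward = 1.0 if rng.random() < probs[arm] else 0.0
        counts[arm] += 1
        # incremental update of the sample mean
        estimates[arm] += (reward - estimates[arm]) / counts[arm]
        total_regret += best_mean - probs[arm]
        regret.append(total_regret)
        optimal.append(arm == best_arm)
    return regret, optimal


def average_over_trials(probs, eps, trials, steps, seed=0):
    """Average regret and optimality over independent trials."""
    rng = random.Random(seed)
    mean_regret = [0.0] * steps
    optimality = [0.0] * steps
    for _ in range(trials):
        regret, optimal = run_epsilon_greedy(probs, eps, steps, rng)
        for t in range(steps):
            mean_regret[t] += regret[t] / trials
            optimality[t] += optimal[t] / trials
    return mean_regret, optimality


if __name__ == "__main__":
    mean_regret, optimality = average_over_trials(
        [0.3, 0.6], eps=0.2, trials=100, steps=300
    )
    print(f"mean cumulative regret after 300 steps: {mean_regret[-1]:.2f}")
    print(f"share of trials picking the best arm at the last step: {optimality[-1]:.2f}")
```

The parameters mirror the example above (arm probabilities 0.3 and 0.6, epsilon 0.2, 100 trials of 300 steps), so the averaged regret curve should resemble, in shape if not in exact values, what a regret plot for such a simulation shows.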
Please see CONTRIBUTING for more information.
This software is licensed under the Apache 2.0 license. Please see LICENSE for more information.