Open Bandit Pipeline: a Python library for bandit algorithms and off-policy evaluation
SCOPE-RL: A Python library for offline reinforcement learning, off-policy evaluation, and selection
Implementations and examples of common offline policy evaluation methods in Python.