A professional Python framework for testing prompt strength, prompt quality, and the real-world effectiveness of autonomous agents.