Comprehensive AI model evaluation framework with support for multiple LLM providers.
A lightweight benchmark for action-oriented agents.