PyPI Stats

Search Packages

Find Python packages by name, description, or GitHub topic, or filter by metrics
langwatch
langwatch-scenario

Agentic testing for agentic codebases

65K 869 60
Giskard-AI
giskard

🐢 Open-Source Evaluation & Testing library for LLM Agents

40K 5K 446
jhd3197
prompture

Prompture is an API-first library for requesting structured JSON output from LLMs (or any structure), validating it against a schema, and running comparative tests between models.

11K 9 0
avansaber
tailtester

tailtest — the test + security validator that lives inside Claude Code. Never blocks, never lies.

4K 7 1
Pacific-AI-Corp
langtest

Pacific AI provides a library for delivering safe & effective NLP models.

3K 556 49
ksgisang
aat-devqa

AI-powered automated E2E testing. Just enter a URL — AI generates and runs test scenarios.

3K 5 1
JohnSnowLabs
nlptest

Deliver safe & effective language models

2K 556 49
tenro-ai
tenro

Open-source simulation harness for testing AI agents. Simulate LLM and tool calls to test edge cases, failure paths, and agent logic without live API calls.

1K 5 0
Harshit-J004
py-toolguard

The "Cloudflare for AI Agents". 7-layer security interceptor, real-time observability dashboard, and automated reliability testing for MCP and AI tool chains. Prevent hallucinations, prompt injection, and destructive tool calls.

1K 12 3
kdunee
intentguard

A Python library for verifying code properties using natural language assertions.

1K 35 0
Chatbot-TRACER
chatbot-tracer

An automated approach for exploring and testing conversational agents using large language models. TRACER discovers chatbot functionalities, generates user profiles, and creates comprehensive test suites for conversational AI systems.

1K 2 0
ctoapplymatic
sharingan-autotest

Autonomous testing agent for Claude Code. Discovers, tests, diagnoses, and fixes your web app.

919 1 0
radoslaw-sz
maia-test-framework

A pytest-based framework for testing multi-agent AI systems. It provides a flexible and extensible platform for creating and running complex multi-agent simulations and capturing the results.

681 1 0
alepot55
agentrial

Statistical evaluation framework for AI agents - pytest for agent trajectories

664 16 2
justinGrosvenor
alignmenter

Persona-aligned evaluation toolkit for auditing conversational AI authenticity, safety, and stability.

589 5 0
Swanand33
llm-behave

Behavioral testing for LLM applications. pytest plugin with semantic assertions, multi-turn conversation testing, and drift detection. No LLM judge needed.

586 1 0
Addepto
ccheck

MIT-licensed Framework for LLMs, RAGs, Chatbots testing. Configurable via YAML and integrable into CI pipelines for automated testing.

508 95 11
AetherLabCo
aetherlab

Official Python SDK for AetherLab's AI Guardrails and Compliance Platform

380 1 0
Rowusuduah
llm-sentry

Unified AI Reliability Platform. One install, 12 diagnostic engines. Zero-dependency LLM pipeline monitoring.

215 0 0
RahulMK22
pyllmtest

🚀 Comprehensive testing framework for LLM applications with semantic assertions, multi-provider support, RAG testing, and prompt optimization. Test AI the right way!

147 1 0
sazed5055
llmtest-framework

pytest for LLM apps - Test for grounding failures, prompt injection, safety violations, and regressions

138 3 0
Forge-NC
forge-nc

Local-first AI coding assistant with adversarial model testing, transparent context management, and cryptographic audit trails

107 2 0
awrshift
housemonkey

Chaos testing for AI apps. 18 extreme personas attack your AI to find edge cases before users do. OWASP LLM Top 10 coverage.

100 1 0
chigwell
llmtestr

A new package that helps developers integration-test AI and LLM applications by validating structured outputs. It takes a user's test scenario or prompt as input, sends it to an LLM, and uses pattern

93 1 0
    • Data from PyPI, GitHub, ClickHouse, and BigQuery