PyPI Stats

Search Packages

Find Python packages by name, description, or GitHub topic, or filter by metrics
fathom-lab
styxx

Cognitive observability for LLM agents. Nine calibrated cognometric instruments — pure-Python, MIT, no LLM required. 9-for-9 on K=1 phase transition. Every Mind Leaves Vitals (DOI 10.5281/zenodo.19777921).

28K 5 1
Basaltlabs-app
gauntlet-cli

Behavioral reliability under pressure. Test how LLMs behave when things get hard.

10K 6 0
cvs-health
uqlm

UQLM (Uncertainty Quantification for Language Models) is a Python package for UQ-based LLM hallucination detection

6K 1K 121
Nomadu27
insa-its

Runtime Security for Multi-Agent AI — Website & Documentation

6K 23 0
krlabsorg
lettucedetect

Lightweight hallucination detection framework for RAG applications

5K 568 39
hinanohart
yuragi

LLM Confidence Fragility Analyzer — Measure how fragile your AI's confidence really is

3K 0 0
uptrain-ai
uptrain

UpTrain is an open-source unified platform to evaluate and improve Generative AI applications. We provide grades for 20+ preconfigured checks (covering language, code, embedding use-cases), perform root cause analysis on failure cases and give insights on how to resolve them.

3K 2K 203
mattijsmoens
sovereign-shield

AI security framework: deterministic Immutable input filtering, adaptive rule learning, optional LLM veto verification. Zero dependencies. Works without an LLM. Patent Pending.

3K 19 7
ENDEVSOLS
longtracer

RAG verification guardrails — detect hallucinations in LLM responses using hybrid STS + NLI.

2K 29 4
anulum
director-ai

Real-time LLM hallucination guardrail — NLI + RAG fact-checking with token-level streaming halt

2K 0 0
MigoXLab
dingo-python

Dingo: A Comprehensive AI Data, Model and Application Quality Evaluation Tool

2K 691 71
QWED-AI
qwed

The Deterministic Verification Protocol for AI - 11 verification engines for math, logic, code, SQL, facts, images, and more. Now with Agentic Security Guards.

1K 55 8
TKCollective
langchain-agentoracle

Trust layer for AI agents. Verify before you act. Per-claim verification via 4 independent sources. x402-native on Base, SKALE, Stellar.

606 0 0
aimonlabs
hdm2

HalluciNot: Hallucination Detection Through Context and Common Knowledge Verification

414 11 0
mattijsmoens
sovereign-shield-adaptive

AI security framework: deterministic Immutable input filtering, adaptive rule learning, optional LLM veto verification. Zero dependencies. Works without an LLM. Patent Pending.

364 19 7
CertainLogicAI
certainlogic-guard

Linguistic confidence gate for AI responses. Catches hedging words (maybe, I think). Not fact verification. Zero dependencies.

355 0 0
mattijsmoens
sovereign-mcp

Deterministic MCP Security Architecture. FrozenNamespace as Root of Trust for Model Context Protocol tool verification.

324 3 4
anulum
backfire-kernel

Director-Class AI — Rust Backfire Kernel (50ms safety gate)

288 0 0
Seinarukiro2
wraith

Catches what your AI forgot to check. Deterministic linter for AI-generated Python code — hallucinated APIs, phantom packages, hardcoded secrets, taint analysis. 20 rules, zero config.

286 4 0
antrixsh
trusteval-ai

Enterprise LLM Evaluation & Responsible AI Framework — Benchmark bias, hallucination, PII leakage, and toxicity across Healthcare, BFSI, Retail & Legal industries. Supports OpenAI, Anthropic, Gemini & HuggingFace. Python SDK + CLI + Web Dashboard. 191 tests. Compliance-ready reports.

252 7 5
serhanwbahar
dep-hallucinator

Advanced security scanner for detecting AI-generated dependency confusion vulnerabilities with signature verification support

216 7 0
bdeva1975
hallucinationbench

Detect hallucinations in your RAG pipeline output — in two lines of Python.

195 2 0
syncreus
syncreus-eval

Evaluate your LLM apps with one function call. Hallucination detection, RAG scoring, and agent evals for OpenAI, Anthropic, and more. 14 evaluators, pytest plugin, composite trust scores.

164 2 0
Kernel-Dirichlet
cotarag

Agentic-AI framework w/o the headaches

155 9 0
    • Data from PyPI, GitHub, ClickHouse, and BigQuery