PyPI Stats

Hallucination Detection Python Packages

Python packages with the GitHub topic hallucination-detection, sorted by relevance. Monthly downloads and GitHub stars are shown for each package.
fathom-lab
styxx

Cognitive observability for LLM agents. Nine calibrated cognometric instruments: pure Python, MIT-licensed, no LLM required. 9-for-9 on the K=1 phase transition. Every Mind Leaves Vitals (DOI 10.5281/zenodo.19777921).

29K 5 1
Basaltlabs-app
gauntlet-cli

Behavioral reliability under pressure. Test how LLMs behave when things get hard.

10K 6 0
cvs-health
uqlm

UQLM (Uncertainty Quantification for Language Models) is a Python package for UQ-based LLM hallucination detection.

6K 1K 121
Nomadu27
insa-its

Runtime Security for Multi-Agent AI — Website & Documentation

6K 23 0
krlabsorg
lettucedetect

Lightweight hallucination detection framework for RAG applications

5K 568 39
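The core idea behind RAG hallucination detectors like lettucedetect is to check whether each claim in a generated answer is actually supported by the retrieved context. The sketch below illustrates that shape with a toy lexical-overlap heuristic; it is not lettucedetect's API (which uses trained token-level models), and the stop-word list and threshold are made up for the example.

```python
# Toy sketch of RAG hallucination detection: flag answer sentences with
# little lexical support in the retrieved context. Real frameworks use
# trained NLI/token-level models; this heuristic is illustrative only.
import re

STOP = {"the", "a", "an", "is", "was", "of", "in", "to", "and", "it"}

def support_score(sentence: str, context: str) -> float:
    """Fraction of the sentence's content words that appear in the context."""
    words = set(re.findall(r"[a-z]+", sentence.lower())) - STOP
    ctx = set(re.findall(r"[a-z]+", context.lower()))
    if not words:
        return 1.0
    return len(words & ctx) / len(words)

def flag_unsupported(answer: str, context: str, threshold: float = 0.5) -> list[str]:
    """Return answer sentences whose support score falls below the threshold."""
    sentences = [s.strip() for s in re.split(r"(?<=[.!?])\s+", answer) if s.strip()]
    return [s for s in sentences if support_score(s, context) < threshold]

context = "The Eiffel Tower was completed in 1889 and stands in Paris."
answer = "The Eiffel Tower is in Paris. It was painted gold in 1925."
print(flag_unsupported(answer, context))  # → ['It was painted gold in 1925.']
```

A production detector replaces the overlap score with an entailment model, but the per-sentence verify-against-context loop is the same.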
hinanohart
yuragi

LLM confidence-fragility analyzer. Perturbation-driven hallucination detection with real benchmarks (TruthfulQA, n=412, ensemble AUC 0.73; TriviaQA, n=200, confidence-inversion AUC 0.75).

4K 0 0
uptrain-ai
uptrain

UpTrain is an open-source unified platform to evaluate and improve Generative AI applications. We provide grades for 20+ preconfigured checks (covering language, code, embedding use-cases), perform root cause analysis on failure cases and give insights on how to resolve them.

3K 2K 203
mattijsmoens
sovereign-shield

AI security framework: deterministic, immutable input filtering; adaptive rule learning; optional LLM veto verification. Zero dependencies. Works without an LLM. Patent pending.

2K 19 7
anulum
director-ai

Real-time LLM hallucination guardrail — NLI + RAG fact-checking with token-level streaming halt

2K 0 0
MigoXLab
dingo-python

Dingo: A Comprehensive AI Data, Model and Application Quality Evaluation Tool

2K 691 71
ENDEVSOLS
longtracer

RAG verification guardrails — detect hallucinations in LLM responses using hybrid STS + NLI.

1K 29 4
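Hybrid STS + NLI verification, as described for longtracer, pairs a semantic-textual-similarity retrieval step with an entailment check. The STS half can be sketched as cosine similarity between text vectors; the bag-of-words vectors below are a stand-in for the sentence embeddings a real system would use, and the function names are invented for this example.

```python
# Minimal sketch of the STS (semantic textual similarity) half of a hybrid
# STS + NLI verifier: cosine similarity between bag-of-words vectors.
# Real systems embed sentences with a trained encoder; this shows only
# the shape of the evidence-selection step.
import math
from collections import Counter

def cosine_sim(a: str, b: str) -> float:
    """Cosine similarity between word-count vectors of two strings."""
    va, vb = Counter(a.lower().split()), Counter(b.lower().split())
    dot = sum(va[w] * vb[w] for w in va)
    na = math.sqrt(sum(c * c for c in va.values()))
    nb = math.sqrt(sum(c * c for c in vb.values()))
    return dot / (na * nb) if na and nb else 0.0

def best_evidence(claim: str, passages: list[str]) -> tuple[str, float]:
    """Return the retrieved passage most similar to the claim, with its score."""
    scored = [(p, cosine_sim(claim, p)) for p in passages]
    return max(scored, key=lambda x: x[1])

passages = [
    "the mitochondria is the powerhouse of the cell",
    "paris is the capital of france",
]
passage, score = best_evidence("france capital is paris", passages)
print(passage)  # → paris is the capital of france
```

In the hybrid setup, the top-scoring passage would then be passed with the claim to an NLI model, which decides entailment versus contradiction.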
QWED-AI
qwed

The Deterministic Verification Protocol for AI - 11 verification engines for math, logic, code, SQL, facts, images, and more. Now with Agentic Security Guards.

1K 55 8
TKCollective
langchain-agentoracle

Trust layer for AI agents. Verify before you act. Per-claim verification via 4 independent sources. x402-native on Base, SKALE, Stellar.

627 0 0
aimonlabs
hdm2

HalluciNot: Hallucination Detection Through Context and Common Knowledge Verification

430 11 0
CertainLogicAI
certainlogic-guard

Linguistic confidence gate for AI responses. Catches hedging words (maybe, I think). Not fact verification. Zero dependencies.

389 0 0
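A linguistic confidence gate of the kind certainlogic-guard describes scans model output for hedging phrases and blocks low-confidence responses; as its description notes, this checks wording, not facts. The sketch below is illustrative only: the phrase list, threshold, and function name are made up for this example and are not certainlogic-guard's API.

```python
# Illustrative linguistic confidence gate: flag responses containing
# hedging phrases ("maybe", "I think", ...). Checks wording only, not
# factual accuracy. Phrase list and threshold are example values.
import re

HEDGES = ["maybe", "i think", "possibly", "i believe", "not sure", "might be"]

def confidence_gate(response: str, max_hedges: int = 0) -> bool:
    """Return True if the response contains at most max_hedges hedging phrases."""
    text = response.lower()
    hits = sum(len(re.findall(re.escape(h), text)) for h in HEDGES)
    return hits <= max_hedges

print(confidence_gate("The capital of France is Paris."))       # → True
print(confidence_gate("I think it might be Paris, not sure."))  # → False
```

Because the gate is a pure string check, it needs zero dependencies and runs without an LLM, which matches the design trade-off these packages advertise.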
mattijsmoens
sovereign-shield-adaptive

AI security framework: deterministic, immutable input filtering; adaptive rule learning; optional LLM veto verification. Zero dependencies. Works without an LLM. Patent pending.

354 19 7
mattijsmoens
sovereign-mcp

Deterministic MCP Security Architecture. FrozenNamespace as Root of Trust for Model Context Protocol tool verification

309 3 4
anulum
backfire-kernel

Director-Class AI — Rust Backfire Kernel (50ms safety gate)

299 0 0
antrixsh
trusteval-ai

Enterprise LLM Evaluation & Responsible AI Framework — Benchmark bias, hallucination, PII leakage, and toxicity across Healthcare, BFSI, Retail & Legal industries. Supports OpenAI, Anthropic, Gemini & HuggingFace. Python SDK + CLI + Web Dashboard. 191 tests. Compliance-ready reports.

254 7 5
serhanwbahar
dep-hallucinator

Advanced security scanner for detecting AI-generated dependency confusion vulnerabilities with signature verification support

251 7 0
Seinarukiro2
wraith

Catches what your AI forgot to check. Deterministic linter for AI-generated Python code — hallucinated APIs, phantom packages, hardcoded secrets, taint analysis. 20 rules, zero config.

237 4 0
bdeva1975
hallucinationbench

Detect hallucinations in your RAG pipeline output — in two lines of Python.

202 2 0
syncreus
syncreus-eval

Evaluate your LLM apps with one function call. Hallucination detection, RAG scoring, and agent evals for OpenAI, Anthropic, and more. 14 evaluators, pytest plugin, composite trust scores.

166 2 0
sarvanithin
medguard-llm

Healthcare-specific LLM guardrails middleware for clinical safety

156 0 0
• Data from PyPI, GitHub, ClickHouse, and BigQuery