LLM Safety Python Packages

nemoguardrails

NeMo Guardrails is an open-source toolkit for easily adding programmable guardrails to LLM-based conversational systems.

263K 6K 666

deepteam

DeepTeam is a framework to red team LLMs and LLM systems.

58K 2K 253

styxx

Cognitive observability for LLM agents. Nine calibrated cognometric instruments — pure-Python, MIT, no LLM required. 9-for-9 on K=1 phase transition. Every Mind Leaves Vitals (DOI 10.5281/zenodo.19777921).

29K 5 1

agent-airlock

Open-source security firewall for AI agents — validates tool calls, strips ghost arguments, enforces type safety, PII masking, RBAC, cost tracking & sandbox isolation. Works with LangChain, OpenAI Agents SDK, PydanticAI & CrewAI.

8K 6 0

uqlm

UQLM (Uncertainty Quantification for Language Models) is a Python package for UQ-based LLM hallucination detection.

6K 1K 121

gateguard-ai

A fact-forcing hook gate for Claude Code. Makes the AI pause and investigate before editing.

4K 2 0

qwed-finance

Deterministic verification middleware for banking and financial AI. NPV, IRR, loan amortization, and interest calculations with QWED precision.

2K 2 1

agi-pragma

AI Action Firewall — seven-stage Decision Intelligence Core for safe agentic AI

2K 0 0

agent-audit

Static security scanner for LLM agents — prompt injection, MCP config auditing, taint analysis. 49 rules mapped to OWASP Agentic Top 10 (2026). Works with LangChain, CrewAI, AutoGen.

1K 161 16

qwed

The Deterministic Verification Protocol for AI - 11 verification engines for math, logic, code, SQL, facts, images, and more. Now with Agentic Security Guards.

1K 55 8

lintlang

Static linter for AI agent configs, tool descriptions, and system prompts with zero-LLM CI gating

1K 28 1

neurosym-ai

Neuro-symbolic guardrails for LLMs: rules + repair loops + (optional) SMT.

980 1 0

csl-core

Deterministic policy language for AI agents. Z3 + TLA+ dual-engine formal verification. Runtime enforcement <1ms.

943 10 9

brix-protocol

Runtime Reliability Infrastructure for LLM Pipelines

614 8 0

openbias

Reliability layer for AI agents - monitors workflow adherence and intervenes when agents deviate

551 67 2

guardex

Guardex - AI Control Plane for autonomous agents (closed source)

403 0 0

blackwall-llm-shield-python

Security middleware for Python LLM apps and services. Blocks prompt injection, masks PII, inspects outputs, and gates agent tools.

353 1 0

jailbreakeval

A collection of automated evaluators for assessing jailbreak attempts

272 191 12

opensentinel

Reliability layer for AI agents - monitors workflow adherence and intervenes when agents deviate

214 77 2

ioa-core

Intelligent Orchestration Architecture Core - Open-source platform for orchestrating modular AI agents with memory-driven collaboration and governance mechanisms

181 0 0

medguard-llm

Healthcare-specific LLM guardrails middleware for clinical safety

156 0 0

qwed-legal

🏛️ Deterministic rejection layer for computational legal claims. Verifies dates, amounts, and structured constraints; blocks unproven legal outputs.

134 2 3

custom-guardrails

NeMo Guardrails is an open-source toolkit for easily adding programmable guardrails to LLM-based conversational systems.

93 6K 672

reprobe

Phase-aware LLM activation steering and linear probing. A memory-efficient, practical implementation of Representation Engineering (RepE) for safety research.

92 2 0