LLM Observability Python Packages

logfire

AI observability platform for production LLM and agent systems.

25M 4K 230

opik

Debug, evaluate, and monitor your LLM applications, RAG systems, and agentic workflows with comprehensive tracing, automated evaluations, and production-ready dashboards.

5.1M 19K 1K

judgeval

The open source post-building layer for agents. Our environment data and evals power agent post-training (RL, SFT) and monitoring.

368K 1K 91

opik-optimizer

Debug, evaluate, and monitor your LLM applications, RAG systems, and agentic workflows with comprehensive tracing, automated evaluations, and production-ready dashboards.

58K 19K 1K

agenta

The open-source LLMOps platform: prompt playground, prompt management, LLM evaluation, and LLM observability all in one place.

56K 4K 517

helicone-helpers

🧊 Open source LLM observability platform. One line of code to monitor, evaluate, and experiment. YC W23 🍓

10K 6K 560

acontext

Agent Skills as a Memory Layer

5K 3K 314

genai-otel-instrument

GenAI OpenTelemetry Auto-Instrumentation Library. A comprehensive wrapper for automatic instrumentation of LLM/GenAI applications. Supports all major LLM providers and MCP (Model Context Protocol) tool calls.

5K 1 1

dunetrace

Real-time anomaly detection layer for AI agents. Privacy-safe by design.

4K 38 3

burnlens

Open-source LLM FinOps proxy — track OpenAI, Anthropic (Claude), and Google Gemini costs by feature, team, and customer. Zero code changes. pip install burnlens.

3K 1 0

helicone

🧊 Open source LLM observability platform. One line of code to monitor, evaluate, and experiment. YC W23 🍓

3K 6K 560

helicone-async

🧊 Open source LLM observability platform. One line of code to monitor, evaluate, and experiment. YC W23 🍓

2K 6K 560

pathlight

Visual debugging, execution traces, and observability for AI agents.

2K 15 3

observal-cli

Observal is an AI agent registry with a first-in-class observability and eval framework.

2K 839 79

comet-llm

Debug, evaluate, and monitor your LLM applications, RAG systems, and agentic workflows with comprehensive tracing, automated evaluations, and production-ready dashboards.

1K 19K 1K

peaky-peek

Lightweight tracing SDK for AI agents. Capture decisions, tool calls, and LLM events with one context manager.

1K 5 0

peaky-peek-server

Local-first agent debugger with replay, failure memory, smart highlights, and drift detection.

1K 5 0

otel-genai-graph

Project OpenTelemetry GenAI traces into a queryable graph (Neo4j or DuckDB) — agent delegation, cost attribution, blast radius.

1K 0 0

l9gpu

GPU telemetry with workload attribution. One OTLP agent per node ties hardware metrics (NVIDIA, AMD, Intel Gaudi) to the K8s pod or Slurm job burning the GPU — so you know who's paying for that idle H100.

1K 10 2