PyPI Stats

Search Packages

Find Python packages by name, description, or GitHub topic, or filter by metrics
chopratejas
headroom-ai

The Context Optimization Layer for LLM Applications

94K 2K 145
0-co
agent-friend

Find the MCP schema issues eating your context window. 156 checks. Grade A+ through F. ESLint for MCP.

15K 4 0
tokenpak
tokenpak

Slash LLM costs with intelligent context compression, smart routing, and cost tracking

11K 0 1
juyterman1000
entroly

Entroly-Daemon: Self-Evolving Daemon. Compress 2M-token repos into a razor-sharp Principal Engineer's context. 85–99% fewer tokens, 100% accuracy retention (verified by live API benchmarks). Built for Cursor, Claude Code, Opus, Codex, GPT & Custom Providers.

7K 330 58
elara-labs
code-context-engine

Claude re-reads your code every session. Make it stop. Save 70%+ on tokens. Local MCP server with AST indexing, hybrid search, and cross-session memory.

6K 4 2
juanlumanmx29
anvl-monitor

Session monitor for Claude Code — detects inflated sessions, saves quota with smart rotation and handoffs. Growth-aware health tracking with dual-signal waste detection.

6K 0 0
retospect
precis-mcp

MCP server giving LLM agents a seven-verb API over papers, documents, code, state, patents, and cached web/Wolfram/YouTube tool calls

6K 1 1
mohankrishnaalavala
context-router-cli

Memory-aware context engine for AI coding agents — up to 91% fewer tokens, 17/18 rank-1 across 6 OSS projects. MCP-native, multi-repo, with persistent observations & decisions.

4K 6 2
elevanaltd
octave-mcp

OCTAVE protocol - structured AI communication with 3-20x token reduction. MCP server with lenient-to-canonical pipeline and schema validation.

4K 49 4
ogallotti
rtk-hermes

RTK plugin for Hermes — rewrites shell commands for 60-90% LLM token savings

4K 36K 2K
ojuschugh1
sqz

Compress LLM context to save tokens and reduce costs

4K 176 7
juyterman1000
entroly-core

Entroly-Daemon: Self-Evolving Daemon. Compress 2M-token repos into a razor-sharp Principal Engineer's context. 85–99% fewer tokens, 100% accuracy retention (verified by live API benchmarks). Built for Cursor, Claude Code, Opus, Codex, GPT & Custom Providers.

3K 330 58
sayeem3051
ctxeng

Build perfect LLM context from your Python codebase — automatically

3K 2 0
jakeefr
prism-cc

Session intelligence for Claude Code - find excess token usage, see why your sessions fail, and learn how to fix them.

3K 19 1
tokmacher
tok-protocol

tok is an invisible bridge

2K 0 0
z3knayr0
tailspin-ai

Token optimization and compression for Claude API requests

2K 1K 123
SonAIengine
graph-tool-call

Graph-based tool retrieval for LLM agents — 248 tools → 82% accuracy, 79% fewer tokens. Zero dependencies. OpenAPI / MCP / LangChain.

2K 6 1
entroplain
entroplain

Entropy-based early exit for efficient agent reasoning

2K 2 0
amahi2001
python-token-killer

Minimize LLM tokens from Python objects, code, logs, diffs, and more. Zero deps. Ultra-Lightweight.

1K 12 0
castnettech
mnemosyne-engine

LLM Context Compression and Retrieval Engine -- zero dependencies, sub-100ms queries, document + code ingestion

1K 53 9
bhanuprasadthota
agentatlas

Shared browser interaction schema registry for AI agents. 80-100% token reduction.

998 0 0
Lap-Platform
lapsh

Lean API Platform -- Token-efficient API specs for AI agents

923 223 14
diawest82
fortress-optimizer

Cut your LLM API costs by 10-20%. Drop-in prompt optimization for Python.

853 0 0
JacobHuang91
llm-prompt-refiner

🚀 Lightweight Python library for building production LLM applications with smart context management and automatic token optimization. Save 10-20% on API costs while fitting RAG docs, chat history, and prompts into your token budget.

727 37 3
    • Data from PyPI, GitHub, ClickHouse, and BigQuery