CLI proxy that reduces LLM token consumption by 60-90% on common dev commands. Single Rust binary, zero dependencies
Updated Jun 5, 2026 - Rust
Compress tool outputs, logs, files, and RAG chunks before they reach the LLM. 60-95% fewer tokens, same answers. Library, proxy, MCP server.
LeanCTX — the Context OS for AI development. One local binary that compresses, remembers, routes, and verifies every token between your code and the model. 63 MCP tools, 10 read modes, up to 99% token savings. Works with Cursor, Claude Code, Copilot, Windsurf, Codex, Gemini.
Sharper context. Fewer tokens. Open-source middleware for Claude Code.
Find the ghost tokens. Fix them. Survive compaction. Avoid context quality decay.
Up to 71.5x fewer tokens per session on Claude Code with Obsidian + Graphify. Persistent memory, codebase knowledge graphs, and chat import pipeline. 🇧🇷 PT-BR included.
Reusable setup prompts for optimizing Claude Code documentation. Achieve 90% token savings on any project in 5 minutes.
Working memory for Claude Code - persistent context and multi-instance coordination
Intelligent token optimization for Claude Code - achieving 95%+ token reduction through caching, compression, and smart tool intelligence
Cut your Claude / OpenAI / Gemini bill 70–95% on AI coding. Local proxy that compresses context, keeps provider caches hot, and verifies LLM output ($0 hallucination guard). Drop-in for Cursor, Claude Code, Codex, Aider + 34 more and custom providers — 30s, no code changes
Stop Claude Code from burning through your quota in 20 minutes. Auto-rotates oversized sessions and preserves context.
Compress LLM context to save tokens and reduce costs
Your agents are guessing at APIs. Give them the actual agent-native spec. 1500+ ready-to-use API skills. Compile any API spec into a lean, agent-native format. 10× smaller. Supports OpenAPI, GraphQL, AsyncAPI, Protobuf, Postman.
An MCP server that executes Python code in isolated rootless containers with optional MCP server proxying. An implementation of Anthropic's and Cloudflare's ideas for reducing context bloat from MCP tool definitions.
CLI proxy that reduces LLM token usage by 60-90%. Declarative YAML filters for Claude Code, Cursor, Copilot, Gemini. rtk alternative in Go.
Production-ready modular Claude Code framework with 30+ commands, token optimization, and MCP server integration. Achieves 2-10x productivity gains through systematic command organization and hierarchical configuration.
Generate a compact codebase index for AI assistants — saves 50K+ tokens per conversation
A high-performance Semantic Signal Engine with Context OS for Agentic AI. Run your AI with zero noise, pure context, and 90% lower token costs.
Config-driven CLI tool that compresses command output before it reaches an LLM context
openlore provides persistent architectural memory for AI coding agents by turning codebases into queryable knowledge graphs. It features static analysis, living specs, automated drift detection, and graph-native MCP tools to eliminate context decay and cut orientation token costs.