TokenSaver intercepts your Claude Code prompts, analyzes your repo with a local LLM, and injects a structured task plan — invisibly, before Claude sees it.
// how it works
The hook fires on every prompt. You keep using Claude Code exactly as before.
Claude Code's UserPromptSubmit hook routes your prompt through TokenSaver before Claude sees it.
Fast file scanner finds candidates. Qwen2.5-Coder selects the truly relevant ones and structures a precise task plan.
Structured context arrives alongside your original prompt. Claude starts with the right files and a clear task — no repo exploration needed.
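The three steps above can be sketched as a small hook script. This is an illustrative sketch, not TokenSaver's actual code: it assumes the hook receives a JSON event with `prompt` and `cwd` fields and that whatever the hook prints is added to Claude's context. The path-keyword scoring stands in for the real scanner, the local LLM selection step is left out, and the names `find_candidates`, `build_context`, and `handle_event` are hypothetical.

```python
import json
import re
from pathlib import Path

# Directories a quick scan would normally ignore (assumed list).
SKIP_DIRS = {".git", "node_modules", "__pycache__", ".venv"}


def keywords(prompt: str) -> set[str]:
    """Lowercased identifier-like words of 3+ characters from the prompt."""
    return set(re.findall(r"[a-z_][a-z0-9_]{2,}", prompt.lower()))


def find_candidates(root: Path, prompt: str, limit: int = 5) -> list[str]:
    """Rank files by how many prompt keywords appear in their relative path."""
    words = keywords(prompt)
    scored = []
    for path in root.rglob("*"):
        if not path.is_file():
            continue
        rel = path.relative_to(root)
        if SKIP_DIRS & set(rel.parts):
            continue
        name = rel.as_posix()
        score = sum(1 for word in words if word in name.lower())
        if score:
            scored.append((score, name))
    # Highest score first, then alphabetical for a stable order.
    scored.sort(key=lambda item: (-item[0], item[1]))
    return [name for _, name in scored[:limit]]


def build_context(prompt: str, files: list[str]) -> str:
    """Format the text that would travel alongside the original prompt."""
    lines = ["Task: " + prompt, "Relevant files:"]
    lines += [f"- {name}" for name in files] or ["- (none found)"]
    return "\n".join(lines)


def handle_event(raw_event: str) -> str:
    """Turn the hook's JSON input into the context text to print."""
    event = json.loads(raw_event)
    prompt = event.get("prompt", "")
    root = Path(event.get("cwd", "."))
    return build_context(prompt, find_candidates(root, prompt))
```

A script registered as the hook would end with `print(handle_event(sys.stdin.read()))`, so the plan lands on stdout while your original prompt stays unchanged.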
// the transformation
See what Claude receives when TokenSaver is active.
Without TokenSaver, Claude spends tokens exploring the repo, guessing at the architecture, and re-learning conventions from scratch.
// persistent memory
The local LLM updates these files automatically after every prompt. Commit them to share context across your team.
// local llm
No data leaves your machine. No API keys. No cloud costs.
Code-aware · 400 MB · Runs on any MacBook
To use a different model, set model = "llama3.2" (for example) in config.toml. Any model installed in Ollama works.
Ollama offline? TokenSaver falls back to keyword-based analysis automatically. Your session is never blocked.
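One way such a fallback could work is sketched below, under stated assumptions: the request goes to Ollama's default local endpoint (`/api/generate` on port 11434), and any connection or parsing failure switches to a plain keyword extraction. The `analyze` and `keyword_plan` functions are hypothetical names for illustration, and the keyword step is a stand-in for whatever analysis TokenSaver really performs.

```python
import json
import re
import urllib.request

OLLAMA_URL = "http://localhost:11434/api/generate"  # Ollama's default local endpoint


def ask_ollama(prompt: str, model: str, timeout: float = 2.0) -> str:
    """Send one non-streaming generate request to a local Ollama server."""
    body = json.dumps({"model": model, "prompt": prompt, "stream": False}).encode()
    request = urllib.request.Request(
        OLLAMA_URL, data=body, headers={"Content-Type": "application/json"}
    )
    with urllib.request.urlopen(request, timeout=timeout) as response:
        return json.load(response)["response"]


def keyword_plan(prompt: str) -> str:
    """Stand-in fallback: list the distinct words of 4+ characters."""
    words = sorted(set(re.findall(r"[a-z_][a-z0-9_]{3,}", prompt.lower())))
    return "keywords: " + ", ".join(words)


def analyze(prompt: str, model: str = "llama3.2", ask=ask_ollama) -> str:
    """Try the local LLM first; never raise if it is unreachable."""
    try:
        return ask(prompt, model)
    except (OSError, ValueError, KeyError):
        # OSError covers refused connections and timeouts,
        # ValueError covers malformed JSON, KeyError a missing field.
        return keyword_plan(prompt)
```

The short timeout is the design choice that matters here: a slow or missing server costs at most a couple of seconds before the keyword path takes over.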
Your code, prompts, and memory files stay on your machine unless you choose to commit the memory files. No accounts, no cloud sync.
// install
Open Claude Code. Type a prompt. TokenSaver runs automatically.