Powered by Qwen2.5-Coder · Runs locally

Every prompt,
smarter.

TokenSaver intercepts your Claude Code prompts, analyzes your repo with a local LLM, and injects a structured task plan — invisibly, before Claude sees it.

// how it works

Three steps, zero friction.

The hook fires on every prompt. You keep using Claude Code exactly as before.

01 / intercept

Hook fires

Claude Code's UserPromptSubmit hook routes your prompt through TokenSaver before Claude sees it.

02 / analyze

Local LLM decides

A fast file scanner finds candidate files. Qwen2.5-Coder then selects the relevant ones and structures a precise task plan.
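The first half of this step, a cheap scan that narrows the repo before the model looks at it, can be sketched in a few lines. This is an illustration only, not TokenSaver's actual scanner: `scan_candidates`, the scoring rule, and the sample paths are hypothetical.

```python
import re
from collections import Counter


def tokenize(text: str) -> list[str]:
    """Lowercase alphanumeric words of three or more characters."""
    return [w for w in re.findall(r"[a-z0-9]+", text.lower()) if len(w) >= 3]


def scan_candidates(prompt: str, files: dict[str, str], limit: int = 5) -> list[str]:
    """Rank files by how often prompt keywords occur in their path or body.

    Hypothetical first-stage filter: the ranked list would then be handed
    to the local LLM, which makes the final selection.
    """
    keywords = set(tokenize(prompt))
    scores: dict[str, int] = {}
    for path, body in files.items():
        counts = Counter(tokenize(path + " " + body))
        score = sum(counts[k] for k in keywords)
        if score > 0:
            scores[path] = score
    # Highest score first; ties broken alphabetically for stable output.
    ranked = sorted(scores, key=lambda p: (-scores[p], p))
    return ranked[:limit]


if __name__ == "__main__":
    repo = {
        "src/auth/session.ts": "export function validateSession() { /* session expiry */ }",
        "src/routes/login.tsx": "redirect to login after session ends",
        "src/utils/format.ts": "export const formatDate = () => {}",
    }
    print(scan_candidates("fix login redirect after session expiry", repo))
```

Files with no keyword hits are dropped entirely, so the model only sees a short list.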

03 / inject

Claude gets context

Structured context arrives alongside your original prompt. Claude starts with the right files and a clear task — no repo exploration needed.
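A minimal sketch of the hook's input and output, assuming the JSON shapes described in Claude Code's hooks documentation: the event arrives on stdin with a `prompt` field, and the hook replies on stdout with `hookSpecificOutput.additionalContext`. Check the field names against the current docs before relying on them. `handle` and `make_context` are hypothetical names, not TokenSaver's API.

```python
import json


def build_hook_output(context: str) -> dict:
    """Wrap context text in the reply shape for a UserPromptSubmit hook."""
    return {
        "hookSpecificOutput": {
            "hookEventName": "UserPromptSubmit",
            "additionalContext": context,
        }
    }


def handle(raw_event: str, make_context) -> str:
    """Turn the hook's stdin payload into its stdout payload.

    `make_context` is a hypothetical callable (prompt -> context string)
    standing in for the scan and local LLM step.
    """
    event = json.loads(raw_event)
    prompt = event.get("prompt", "")
    try:
        context = make_context(prompt)
    except Exception:
        # Never block the session: on any failure, inject nothing.
        context = ""
    return json.dumps(build_hook_output(context))


if __name__ == "__main__":
    # In a real hook, raw_event would be read from stdin and the result
    # printed to stdout; a literal sample event is used here instead.
    sample = json.dumps({"hook_event_name": "UserPromptSubmit", "prompt": "fix login redirect"})
    print(handle(sample, lambda prompt: f"Task:\n{prompt}"))
```

The original prompt is left untouched. The context travels beside it rather than replacing it.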

// the transformation

From vague to precise.
Automatically.

See what Claude receives when TokenSaver is active.

Without TokenSaver
Claude Code prompt
"fix login redirect after session expiry"

Claude spends tokens exploring the repo, guessing the architecture, and re-learning conventions from scratch.

With TokenSaver
additionalContext injected
Task:
Fix login redirect after JWT session expiry.
Relevant Files:
- src/auth/session.ts
- src/middleware/auth.ts
- src/routes/login.tsx
Relevant Symbols:
- validateSession() [session.ts:34]
- requireAuth() [auth.ts:12]
Reasoning:
Session module validates JWT expiry.
Middleware controls redirect on failure.
Constraints:
- Authentication uses JWT, not cookies
- Do not modify database schema
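One way to picture how a block like the one above gets produced is a small data structure rendered to text. The sketch below is an assumption about the shape, not TokenSaver's internal type: `TaskPlan` and `render` are hypothetical names, and the section titles are copied from the example.

```python
from dataclasses import dataclass, field


@dataclass
class TaskPlan:
    """Hypothetical container for the fields shown in the injected context."""
    task: str
    files: list[str] = field(default_factory=list)
    symbols: list[str] = field(default_factory=list)
    reasoning: list[str] = field(default_factory=list)
    constraints: list[str] = field(default_factory=list)


def render(plan: TaskPlan) -> str:
    """Render a plan as plain text, omitting sections that are empty."""
    lines = ["Task:", plan.task]
    sections = [
        ("Relevant Files:", plan.files, True),
        ("Relevant Symbols:", plan.symbols, True),
        ("Reasoning:", plan.reasoning, False),  # reasoning lines carry no bullet
        ("Constraints:", plan.constraints, True),
    ]
    for title, items, bullet in sections:
        if not items:
            continue
        lines.append(title)
        lines.extend(f"- {item}" if bullet else item for item in items)
    return "\n".join(lines)


if __name__ == "__main__":
    plan = TaskPlan(
        task="Fix login redirect after JWT session expiry.",
        files=["src/auth/session.ts", "src/middleware/auth.ts", "src/routes/login.tsx"],
        symbols=["validateSession() [session.ts:34]", "requireAuth() [auth.ts:12]"],
        reasoning=["Session module validates JWT expiry.", "Middleware controls redirect on failure."],
        constraints=["Authentication uses JWT, not cookies", "Do not modify database schema"],
    )
    print(render(plan))
```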

// persistent memory

Your project never forgets.

The local LLM updates these files automatically after every prompt. Commit them to share context across your team.

📄 .tokensaver/memory.md (facts)
Backend uses FastAPI + JWT
No schema changes without migration
All routes use requireAuth()
Frontend: Next.js 14, App Router

📋 .tokensaver/changelog.md (history)
## 2026-05-12 14:22
Fixed JWT session expiry redirect
──────────────────
## 2026-05-11 09:15
Refactored auth middleware chain
──────────────────
## 2026-05-10 16:40
Added rate limiting to /api/login

📊 .tokensaver/tasks.jsonl (tasks)
{"id":"a1","status":"active","description":"Fix login redirect"}
{"id":"c3","status":"completed","description":"Auth middleware refactor"}
{"id":"e5","status":"active","description":"Rate limit /api/login"}

Updated automatically by the local LLM · Commit to share with your team
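Because tasks.jsonl holds one JSON object per line, teammates and scripts can read it without any TokenSaver tooling. A minimal sketch, assuming only the `id`, `status`, and `description` fields shown in the sample. The helper names are hypothetical.

```python
import json

# Sample records copied from the tasks.jsonl example on this page.
SAMPLE = """\
{"id":"a1","status":"active","description":"Fix login redirect"}
{"id":"c3","status":"completed","description":"Auth middleware refactor"}
{"id":"e5","status":"active","description":"Rate limit /api/login"}
"""


def load_tasks(text: str) -> list[dict]:
    """Parse JSON Lines text: one JSON object per non-empty line."""
    return [json.loads(line) for line in text.splitlines() if line.strip()]


def active_descriptions(tasks: list[dict]) -> list[str]:
    """Return the descriptions of tasks whose status is "active"."""
    return [t["description"] for t in tasks if t.get("status") == "active"]


if __name__ == "__main__":
    for description in active_descriptions(load_tasks(SAMPLE)):
        print(description)
```

In a project you would pass the contents of `.tokensaver/tasks.jsonl` instead of `SAMPLE`.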

// local llm

Local. Private. Fast.

No data leaves your machine. No API keys. No cloud costs.

Default Model
ready
qwen2.5-coder:0.5b

Code-aware · 400 MB · Runs on any MacBook

0.5B parameters · ~400 MB model size · <2s avg response · 100% offline

Swap any Ollama model

Set model = "llama3.2" in config.toml. Any model installed in Ollama works.

Graceful fallback

Ollama offline? TokenSaver uses keyword-based analysis automatically. Your session is never blocked.
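The fallback can be pictured as a try/except around the model call. The sketch below assumes Ollama's default local endpoint (`http://localhost:11434/api/generate`) and uses a deliberately naive keyword match. `select_files`, `keyword_fallback`, and the prompt wording are hypothetical, not TokenSaver's implementation.

```python
import json
import urllib.request

OLLAMA_URL = "http://localhost:11434/api/generate"  # Ollama's default local endpoint


def ask_ollama(prompt: str, model: str = "qwen2.5-coder:0.5b", timeout: float = 2.0) -> str:
    """Send one non-streaming generate request to a local Ollama server."""
    body = json.dumps({"model": model, "prompt": prompt, "stream": False}).encode()
    request = urllib.request.Request(
        OLLAMA_URL, data=body, headers={"Content-Type": "application/json"}
    )
    with urllib.request.urlopen(request, timeout=timeout) as response:
        return json.loads(response.read())["response"]


def keyword_fallback(prompt: str, paths: list[str]) -> list[str]:
    """Keep paths that contain any prompt word of three or more characters."""
    words = {w for w in prompt.lower().split() if len(w) >= 3}
    return [p for p in paths if any(w in p.lower() for w in words)]


def select_files(prompt: str, paths: list[str], llm=ask_ollama) -> list[str]:
    """Ask the model first; fall back to keywords if it is unreachable."""
    try:
        answer = llm("Pick the relevant files for: " + prompt + "\n" + "\n".join(paths))
        chosen = [p for p in paths if p in answer]
        if chosen:
            return chosen
    except (OSError, ValueError, KeyError):
        # Connection refused, timeout, or malformed reply: do not block.
        pass
    return keyword_fallback(prompt, paths)
```

Passing the model call in as `llm` keeps the fallback path easy to exercise without a running server.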

Zero telemetry

Your code, prompts, and memory files never leave your machine. No accounts, no cloud sync.

// install

Up in 5 minutes.

1. Install Ollama and pull the model
$ brew install ollama
$ ollama serve    # runs in the foreground; leave it running and use a second terminal
$ ollama pull qwen2.5-coder:0.5b

2. Build and install TokenSaver
$ git clone https://github.com/MoMicro-core/TokenSaver
$ cd TokenSaver && cargo install --path .

3. Initialize in your project
$ cd your-project
$ tokensaver init

4. Add memory and verify
$ tokensaver remember "Backend uses FastAPI with JWT"
$ tokensaver llm-status
$ tokensaver context "fix auth redirect"

Open Claude Code. Type a prompt. TokenSaver runs automatically.