Compress your AI context window
Local proxy that compresses tool outputs, deduplicates file reads, and strips noise. Save thousands of tokens per session with zero workflow changes.
Works with your tools
Auto-detects API format from request headers. Zero per-tool config.
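As a rough illustration of header-based detection, the sketch below guesses the upstream API format from the headers each provider's clients commonly send. The function name `detectApiFormat` and the exact rules are assumptions for illustration, not Squeezr's actual logic.

```typescript
// Hypothetical sketch: guess the upstream API format from request headers.
// The rules below are a heuristic, not Squeezr's real implementation.
type ApiFormat = "anthropic" | "openai" | "gemini" | "unknown";

function detectApiFormat(headers: Record<string, string>): ApiFormat {
  // HTTP header names are case-insensitive, so normalize before lookup.
  const h: Record<string, string> = {};
  for (const name of Object.keys(headers)) {
    h[name.toLowerCase()] = headers[name];
  }
  // Anthropic clients send `anthropic-version` and usually `x-api-key`.
  if ("anthropic-version" in h || "x-api-key" in h) return "anthropic";
  // Gemini clients can authenticate with `x-goog-api-key`.
  if ("x-goog-api-key" in h) return "gemini";
  // OpenAI-style clients send `Authorization: Bearer <key>`.
  const auth = (h["authorization"] || "").toLowerCase();
  if (auth.startsWith("bearer ")) return "openai";
  return "unknown";
}
```

A proxy built this way needs no per-tool setting: each request carries enough information to pick the right parser.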
See the difference
Real compression results from actual coding sessions. Every byte counts.
7-Layer Pipeline
Each request passes through seven independent stages. Each layer catches what the previous one missed.
01 System Prompt
~13 KB → 600 tokens
02 Read Dedup
Collapse duplicate reads
03 Noise Strip
ANSI, progress bars, spinners
04 Tool Patterns
30+ specific compressors
05 Line Dedup
Repeated lines & stacks
06 AI Compress
Haiku / GPT-mini / Flash
07 Session Cache
KV cache warming
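The layered design can be sketched as a list of text-to-text stages applied in order. The example below implements simplified stand-ins for two of the seven layers (noise stripping and line dedup). The names `Stage`, `stripAnsi`, `dedupLines`, and `runPipeline` are hypothetical, and the real layers are likely more thorough.

```typescript
// A stage takes text and returns (hopefully shorter) text.
type Stage = (text: string) => string;

// Layer 03 stand-in: remove ANSI CSI escape sequences (colors, cursor moves).
const stripAnsi: Stage = (text) => text.replace(/\x1b\[[0-9;?]*[A-Za-z]/g, "");

// Layer 05 stand-in: collapse runs of identical consecutive lines
// into a single line with a repeat count.
const dedupLines: Stage = (text) => {
  const out: string[] = [];
  let prev: string | null = null;
  let count = 0;
  const flush = (): void => {
    if (prev === null) return;
    out.push(count > 1 ? `${prev} [x${count}]` : prev);
  };
  for (const line of text.split("\n")) {
    if (line === prev) {
      count++;
      continue;
    }
    flush();
    prev = line;
    count = 1;
  }
  flush();
  return out.join("\n");
};

// Run the stages in order; each one sees the previous stage's output.
function runPipeline(stages: Stage[], input: string): string {
  return stages.reduce((text, stage) => stage(text), input);
}
```

Because every stage has the same signature, a stage can be added, removed, or reordered without touching the others.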
Everything you need
30+ Patterns
Git diffs, test runners, build tools, Docker, Terraform, package managers — each has a dedicated compressor that knows exactly what to keep.
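One way to picture a dedicated compressor is a small registry where each pattern decides whether it applies to a command and, if so, which lines to keep. The `ToolPattern` interface and the test-runner rules below are illustrative assumptions, not the shipped patterns.

```typescript
// Hypothetical shape of one tool-specific compressor.
interface ToolPattern {
  name: string;
  matches: (command: string) => boolean;
  compress: (output: string) => string;
}

// Illustrative pattern: for test runners, keep only failures and summaries.
const testRunnerPattern: ToolPattern = {
  name: "test-runner",
  matches: (command) => /\b(jest|vitest|pytest)\b/.test(command),
  compress: (output) =>
    output
      .split("\n")
      .filter((line) => /FAIL|Error|failed|passed/.test(line))
      .join("\n"),
};

// Returns the compressed output, or null when no pattern matches
// (the caller could then fall back to another strategy).
function compressToolOutput(
  patterns: ToolPattern[],
  command: string,
  output: string
): string | null {
  const pattern = patterns.find((p) => p.matches(command));
  return pattern ? pattern.compress(output) : null;
}
```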
AI Fallback
When no pattern matches, Haiku, GPT-4o-mini, or Gemini Flash compresses the output to under 150 tokens. The best model wins.
File Dedup
Read the same file 5 times? Only the latest stays full. Earlier reads become lightweight references.
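A minimal sketch of this idea, assuming each read is recorded as a path plus its content: keep the last read of every path intact and swap earlier ones for a short placeholder. The `FileRead` type and the placeholder wording are hypothetical.

```typescript
// Hypothetical record of one file-read tool result in the conversation.
interface FileRead {
  path: string;
  content: string;
}

function dedupFileReads(reads: FileRead[]): FileRead[] {
  // Remember the index of the most recent read for each path.
  const lastIndex = new Map<string, number>();
  reads.forEach((read, i) => lastIndex.set(read.path, i));

  // Keep the latest read in full; replace earlier ones with a reference.
  return reads.map((read, i) =>
    lastIndex.get(read.path) === i
      ? read
      : { path: read.path, content: `[earlier read of ${read.path}; latest version appears later]` }
  );
}
```

The output has the same length and order as the input, so the conversation structure is preserved while repeated content shrinks.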
Session Cache
Identical compressed strings reuse the API provider's KV cache, for up to 90% cost reduction on cache hits.
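Provider-side caching generally pays off only when the bytes sent are identical across requests, so one plausible approach is to memoize compression results: the same input always yields the same compressed string, even if the underlying compressor (for example an AI model) is not deterministic. The `CompressionCache` class below is an assumed sketch, not Squeezr's implementation.

```typescript
// Hypothetical in-memory cache that makes compression output stable per input.
class CompressionCache {
  private readonly cache = new Map<string, string>();
  private readonly compress: (text: string) => string;

  constructor(compress: (text: string) => string) {
    this.compress = compress;
  }

  // Return the stored result if this exact input was seen before;
  // otherwise compress once and remember the result.
  get(text: string): string {
    const hit = this.cache.get(text);
    if (hit !== undefined) return hit;
    const result = this.compress(text);
    this.cache.set(text, result);
    return result;
  }
}
```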
Expand Tool
The AI can call squeezr_expand() to retrieve any original content. Nothing is permanently lost.
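To show how expansion could work in principle, the sketch below stores each original under a short ID that can be embedded in the compressed text and looked up later. The `OriginalStore` class, the `sqz_` ID format, and the `expand` signature are assumptions; the actual `squeezr_expand()` parameters are not documented here.

```typescript
// Hypothetical store that keeps originals so compression stays reversible.
class OriginalStore {
  private readonly originals = new Map<string, string>();
  private nextId = 1;

  // Save the original text and return a short reference ID.
  save(original: string): string {
    const id = `sqz_${this.nextId++}`;
    this.originals.set(id, original);
    return id;
  }

  // Stand-in for the expand tool: return the original, or undefined
  // if the ID is unknown.
  expand(id: string): string | undefined {
    return this.originals.get(id);
  }
}
```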
Zero Config
One install, one command, works immediately. Optional TOML config for fine-grained control.
See the compression
Before and after from real coding sessions.
Three steps. Thirty seconds.
From install to savings in under a minute. No configuration required.
Install & Setup
One npm install, one setup command. Auto-detects your OS, configures env vars, and starts the daemon.
Proxy Intercepts
Your AI tool sends requests through localhost. Squeezr intercepts transparently — no code changes needed.
Savings Begin
Compressed requests go to the API. Your AI gets all essential info with a fraction of the tokens.
Estimate your savings
See how much you could save based on your usage.
Ready to compress?
Three commands. Thirty seconds. That's it.