TOON is an open format (not ours): https://github.com/toon-format/toon
It converts JSON like this:

  {"id": "cust_001", "name": "Acme", "mrr": 15000}

into this:

  id: cust_001
  name: Acme
  mrr: 15000

We integrated it into our gateway to automatically compress JSON in tool results, user messages, and tool call arguments before they hit the LLM.
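For anyone curious what the gateway-side conversion looks like, here is a minimal TypeScript sketch. It assumes the reference implementation in the repo above is published as "@toon-format/toon" and exposes an encode() function; check the repo for the actual package name and API.

  // Convert a tool result to TOON before it goes into the model context.
  // Package name and API are assumptions based on the linked repo.
  import { encode } from "@toon-format/toon";

  // A tool result as it would normally be serialized with JSON.stringify.
  const toolResult = {
    customers: [
      { id: "cust_001", name: "Acme", mrr: 15000 },
      { id: "cust_002", name: "Globex", mrr: 22000 },
    ],
  };

  // Encode to TOON and place the compact string in the prompt
  // instead of the raw JSON.
  const toonPayload = encode(toolResult);

  console.log(toonPayload);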
Benchmarks on real payloads:
- CRM query (10 records): 48% tokens saved
- E-commerce orders (4 orders): 34% saved
- API metrics (8 endpoints): 43% saved
Sub-100μs latency overhead. LLMs parse it correctly in our testing (GPT-4o, Claude, etc.).
Not a silver bullet — works best on arrays of objects with uniform schemas. Deeply nested or irregular JSON sees less benefit.
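To show where the savings come from: per the spec linked above, a uniform array of objects is encoded roughly like this (field names written once in a header, then one comma-separated row per record; see the repo for the exact syntax):

  customers[2]{id,name,mrr}:
    cust_001,Acme,15000
    cust_002,Globex,22000

The keys appear once instead of once per record, which is why uniform arrays benefit most.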
Curious what strategies others use for token compression. We considered CSV for tabular data but it doesn't handle nested structures.
https://www.costbase.ai