TOON is an open format (not ours): https://github.com/toon-format/toon
It converts JSON like this:

  {"id": "cust_001", "name": "Acme", "mrr": 15000}

into this:

  id: cust_001
  name: Acme
  mrr: 15000

We integrated it into our gateway to automatically compress JSON in tool results, user messages, and tool call arguments before they hit the LLM.
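For anyone curious what the gateway-side conversion looks like, here is a minimal TypeScript sketch. It assumes the reference implementation in the repo above is published as "@toon-format/toon" and exposes an encode() function; check the repo for the actual package name and API.

  // Convert a tool result to TOON before it goes into the model context.
  // Package name and API are assumptions based on the linked repo.
  import { encode } from "@toon-format/toon";

  // A tool result as it would normally be serialized with JSON.stringify.
  const toolResult = {
    customers: [
      { id: "cust_001", name: "Acme", mrr: 15000 },
      { id: "cust_002", name: "Globex", mrr: 22000 },
    ],
  };

  // Encode to TOON and place the compact string in the prompt
  // instead of the raw JSON.
  const toonPayload = encode(toolResult);

  console.log(toonPayload);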
Benchmarks on real payloads:
- CRM query (10 records): 48% tokens saved
- E-commerce orders (4 orders): 34% saved
- API metrics (8 endpoints): 43% saved
Sub-100μs latency overhead. LLMs parse it correctly in our testing (GPT-4o, Claude, etc.).
Not a silver bullet — works best on arrays of objects with uniform schemas. Deeply nested or irregular JSON sees less benefit.
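To show where the savings come from: per the spec linked above, a uniform array of objects is encoded roughly like this (field names written once in a header, then one comma-separated row per record; see the repo for the exact syntax):

  customers[2]{id,name,mrr}:
    cust_001,Acme,15000
    cust_002,Globex,22000

The keys appear once instead of once per record, which is why uniform arrays benefit most.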
Curious what strategies others use for token compression. We considered CSV for tabular data but it doesn't handle nested structures.
https://www.costbase.ai