Exploring TOON (Token-Oriented Object Notation), the new AI-optimized data format that reduces token usage by 30-60% compared to JSON.
Introduction
The evolution of data formats tells a fascinating story about how technology adapts to meet our changing needs. From humble .INI files that powered early configurations, through verbose but structured XML, lightweight JSON, friendly YAML, and now TOON, a token-optimized format built for the AI era, each has emerged to solve the challenges of its time.
Today, as LLMs redefine how we process and exchange information, token-level efficiency has become a new frontier. Let’s explore how TOON (Token-Oriented Object Notation) compares to JSON, and why TOON might become the preferred format for GenAI developers.
A brief history of data formats
INI Files
The .INI format was one of the earliest ways to store configurations. Simple and straightforward, it used key-value pairs grouped in sections. Despite its simplicity, INI files remain popular for configurations and Windows systems due to their direct approach.
XML
Then came XML (eXtensible Markup Language), offering structure, validation, and hierarchy. It became the backbone of early web services, SOAP APIs, and document systems. However, its verbosity came at a cost.
JSON
Enter JSON (JavaScript Object Notation): lightweight, human-readable, and easy to parse for machines. It found the sweet spot between structure and simplicity, quickly becoming the standard for APIs and data exchange.
YAML
As systems and automation grew, developers wanted something even more readable. YAML (YAML Ain’t Markup Language) adopted indentation and minimal punctuation, becoming the preferred choice for configuration files and CI/CD pipelines.
TOON: The new era
Now, as AI models process and reason about text, a new challenge emerged: token efficiency. Every character counts in LLMs, directly affecting cost and performance.
This led to the birth of TOON (Token-Oriented Object Notation), a format built for the LLM era.
TOON is not just another serialization format. It’s a data format for AI generation: compact, structured, and optimized for how language models “think”.
The modern challenge
Traditional formats like JSON are still excellent, but in LLM-driven workflows, verbosity equals cost. When every token matters, using 50% fewer tokens to represent the same data can significantly reduce expenses and processing time.
What is JSON?
JSON is a lightweight text-based format that represents structured data using key-value pairs. Originally derived from JavaScript, it’s now language-independent and universally supported.
Key features of JSON
- Syntax: Uses , [], :, and ,
- Readable: Easy for humans and machines
- Flexible: Supports complex nesting
- Compatible: Supported everywhere
- Verbose: Repetitive keys can increase size
JSON example
{
"tags": ["jazz", "chill", "lofi"]
}
What is TOON?
TOON (Token-Oriented Object Notation) is a next-generation format designed for AI and LLM applications. Its goal is to make structured data token-efficient, reducing the cost of processing data within language models.
Key features of TOON
- Syntax: Indentation-based with tabular structure
- Efficiency: Uses 30-60% fewer tokens than JSON
- Compactness: Eliminates redundant symbols and keys
- Readability: Clean, spreadsheet-like representation
- Optimization: Built specifically for AI data flows
TOON example
users[3]{id,name,role,email}:
1,ana,admin,[email protected]
2,miguel,admin,[email protected]
3,pedro,user,[email protected]
metadata{total,last_updated}:
3,2024-01-15T10:30:00Z
TOON vs JSON: Key differences
1. Syntax and Structure
JSON: Curly braces , square brackets [], colons, commas. TOON: Indentation and column headers, cleaner, less noise.
2. Token Efficiency
LLMs charge per token, so structure matters.
3. Readability
JSON is familiar and tool-rich. TOON feels new but becomes intuitive, especially for structured and repetitive data (like CSV meets JSON).
4. Use Cases
JSON is ideal for REST APIs, web applications, and systems requiring universal compatibility. TOON is perfect for LLM workflows, where token efficiency and cost matter.
Efficiency Comparison
| Format | Tokens | Savings |
|---|---|---|
| JSON | ~89 | — |
| TOON | ~45 | ~50% less |
Real-world comparison
In a practical example, a dataset that takes approximately 180 tokens in JSON is reduced to approximately 85 tokens in TOON, representing a ~53% savings.
This reduction not only cuts costs but also improves processing speed and allows handling more data within model context limits.
When to use each format
Use JSON when
You need compatibility and standardization. You’re building REST APIs or web applications. You use well-established toolchains. Team familiarity is critical.
Use TOON when
You work with LLMs and AI agents. Cost and token efficiency matter. You handle large datasets or repetitive data. You build systems that communicate with AI models.
Implementation and libraries
JSON Support: Universal across all languages. Extensive tooling (linters, validators). Built-in browser and backend support.
TOON Support: JavaScript/TypeScript: TOON on GitHub Python: toon-py
