Abstract

Token-Oriented Object Notation (TOON), the current state-of-the-art in LLM-facing data serialization, achieves 40–60% token reduction over JSON for flat, uniform structures. However, peer-reviewed benchmarks (Matveev, arXiv:2603.03306, February 2026) reveal critical failure modes: 0% one-shot accuracy on deeply nested structures, indentation drift over long contexts, a "prompt tax" that negates savings on short payloads, and persistent key-string redundancy that TOON does nothing to address. This paper introduces AION (Adaptive Indexed Object Notation), a next-generation serialization format that solves all four limitations through three novel mechanisms: (1) a Schema Dictionary Header (SDH) that replaces all field name strings with compact one-token numeric aliases, eliminating key-string repetition entirely; (2) Depth Anchor Markers ([D:N]) that provide absolute structural anchors independent of whitespace, eliminating indentation drift; and (3) an Adaptive Array Codec (AAC) that dynamically selects columnar encoding for uniform arrays and delta-encoding for heterogeneous structures. Formal token-budget modeling demonstrates AION achieves an estimated 55–75% token reduction over JSON and 20–38% reduction over TOON, with the largest gains on multi-record, schema-rich, and long-context workloads. This paper formalizes the complete AION specification, presents the mathematical basis for its efficiency claims, and provides full implementation documentation for Python, JavaScript/TypeScript, and LLM API integration.

Keywords: token optimization, LLM prompting, structured data formats, prompt compression, schema-aware serialization, RAG, AI agents, cost reduction


1. Introduction

The emergence of Large Language Models as the core reasoning engine for AI applications has created a new engineering constraint without historical precedent: token efficiency. Unlike traditional software systems where data format choices are governed by parsing speed, memory footprint, or network bandwidth, LLM-facing serialization must optimize for a fundamentally different consumer — the transformer's attention mechanism — whose cost scales non-linearly with sequence length and whose comprehension fidelity depends on syntactic clarity.

At the heart of this challenge lies the tension between structural expressiveness and token economy. JSON (JavaScript Object Notation), the dominant data interchange format, imposes substantial syntactic overhead: every object key is quoted and repeated per record; every nesting level adds braces, brackets, and commas. In production AI pipelines ingesting thousands of records per inference — RAG systems retrieving document stores, AI agents accumulating tool outputs, structured extraction pipelines processing enterprise data — these syntactic tokens accumulate into measurable API cost and latency. A real production case documents a single 500-row customer table consuming $1,940 in API costs over one weekend when encoded in JSON.

TOON (Token-Oriented Object Notation), introduced by the toon-format organization in late 2025, represents the first principled attempt to address this problem. By replacing JSON's brace-and-quote syntax with YAML-style indentation for nested objects and CSV-style tabular layout for uniform arrays, TOON achieves approximately 40% token reduction and marginally better extraction accuracy than JSON on aligned (flat, uniform) datasets across multiple LLMs. Independent deployments confirm savings of 61% on suitable data.

However, TOON carries four fundamental limitations that bound its practical utility, now formally confirmed by the first peer-reviewed TOON benchmark (Matveev, arXiv:2603.03306, February 2026):

  1. Key-string redundancy: TOON eliminates structural syntax (braces, quotes, commas) but retains full field name strings at every record, every level. In a 1,000-record dataset with a 10-field schema, field name tokens still account for thousands of repeated tokens — overhead TOON does not address.

  2. Indentation drift: Long-context windows introduce cumulative indentation errors. The arXiv benchmark explicitly flags a "scaling hypothesis" where TOON's efficiency breaks down as indentation drift accumulates over extended generations.

  3. Non-aligned structure collapse: TOON achieves 0% one-shot accuracy on deeply nested, non-uniform structures. The benchmark confirms this collapse across 21 models; TOON "invoice" case achieves 0% one-shot accuracy as well, with repair loops consuming 2.1× more tokens than JSON.

  4. Prompt tax: TOON's instructional system prompt (required since LLMs have no prior training on TOON syntax) introduces a fixed overhead that erases token savings for short payloads; Qwen3-235B spent 4,715 tokens on TOON vs 2,772 on plain JSON for the same generation.

This paper introduces AION (Adaptive Indexed Object Notation), which targets all four limitations. AION's contributions are:

  • Schema Dictionary Header that eliminates key-string repetition by mapping field names to one-token numeric aliases, reducing key-token cost to the absolute theoretical minimum.

  • Depth Anchor Markers providing absolute depth references that make structural parsing drift-proof.

  • An Adaptive Array Codec with dual-mode encoding: columnar for uniform arrays, delta-encoding for heterogeneous structures.

  • compact, model-agnostic preamble of ~70 tokens that achieves faster break-even than TOON's typical instructional prompt.

The remainder of this paper is organized as follows. Section 2 reviews related work on token-efficient formats and prompt compression. Section 3 presents a formal analysis of TOON's limitations with quantified evidence. Section 4 defines AION's design principles. Section 5 formalizes the complete AION specification with syntax rules and examples. Section 6 presents the theoretical token efficiency model. Section 7 provides full implementation documentation. Section 8 discusses applications. Section 9 identifies limitations and future work. Section 10 concludes.


2.1 JSON: The Incumbent and Its Token Cost

JSON (JavaScript Object Notation), standardized in RFC 8259, was designed for human-readable language-independent data interchange. Its delimiter-heavy syntax — {} braces, [] brackets, " quoted string keys, : colons, , commas — maps well to traditional lexical parsers but poorly to Byte Pair Encoding (BPE) tokenizers used by all major frontier LLMs. Under GPT-4's cl100k tokenizer, the fragment {"name": "Alice"} tokenizes into approximately 8 tokens, of which 5 carry zero semantic content. For a 100-record dataset with 10 fields each, JSON's structural tokens account for roughly 35–40% of total token consumption.

2.2 TOON: Token-Oriented Object Notation

TOON replaces JSON's delimiters with two complementary representations. For nested objects, it uses YAML-style indentation: keys written unquoted, once per record. For uniform arrays, it uses a tabular encoding — field headers declared once as fieldName[N]{f1,f2,...}: followed by comma-separated value rows. This approach achieves 74–76.4% accuracy (vs 70–75% for JSON) with 39.9% fewer tokens on aligned benchmark datasets. The TOON specification explicitly acknowledges its sweet spot: "uniform arrays of objects" with shallow nesting. The toon-format team recommends against TOON for deeply nested or non-uniform structures, pure tabular data (where CSV is smaller), and latency-critical applications where tokenization speed matters.

The landmark arXiv benchmark by Matveev (2026) tests TOON across 21 LLMs on four structured generation cases. Results confirm the domain alignment boundary: TOON achieves 90.5% one-shot accuracy on flat "users" data with 22% fewer tokens, but 0% one-shot accuracy on "company" (deeply nested) and "invoice" (moderately nested, non-uniform) cases. The TOON "invoice" case requires 3,626 total tokens versus 1,723 for JSON — 110% more expensive due to repair loop overhead.

2.3 Prompt Compression Research

LLMLingua (Jiang et al., 2023), extended to LLMLingua-2 (2024), compresses natural language prompts by identifying and removing low-utility tokens using a small auxiliary model. This approach targets prose rather than structured data — compression of instructions, system prompts, and retrieved text. LLMLingua achieves high compression ratios but requires running a second model for scoring, adding latency. AION is orthogonal to LLMLingua: AION compresses structured data syntax; LLMLingua compresses unstructured prose.

MetaGlyph (arXiv:2601.07354, 2025) introduces symbolic compression of LLM instructions, encoding directives as mathematical operators rather than prose. MetaGlyph achieves up to 75% meaning preservation for selection tasks (Gemini 2.5 Flash) and demonstrates that frontier LLMs respond to compact symbolic representations when given appropriate preambles. This validates AION's core assumption: LLMs can learn and decode novel compact notations from a brief in-context specification.

2.4 Alternative Structured Formats

ATON FORMAT V2 introduces production-grade serialization with multiple compression modes, a SQL-like query language, and streaming support. ATON targets enterprise data pipelines rather than LLM-specific optimization; its SQL-like syntax introduces substantial keyword overhead that limits token efficiency.

TRON (Token-Reduced Object Notation), proposed by community contributors as a TOON alternative, acknowledged TOON's nesting failures but remained an informal, unspecified proposal. The author noted: "most practical use cases involve nested objects — a structure that almost always makes TOON less token efficient than JSON" — precisely the gap AION targets.

YAML reduces JSON's brace overhead via indentation but performs worse than JSON for deeply nested structures under BPE tokenization due to whitespace token accumulation. CSV is more compact than TOON for purely flat tables but cannot represent any nesting.

2.5 Schema-Aware Compression

A key gap in existing work is the absence of schema-aware key compression. All existing formats — JSON, TOON, YAML, ATON — repeat field names (in full or abbreviated form) at every record and every level. No published format exploits the observation that in a dataset of N records sharing a fixed schema, field names are entirely redundant after their first declaration. AION is, to the authors' knowledge, the first LLM-facing serialization format to introduce schema-level alias indexing as a first-class specification feature.


3. Formal Analysis of TOON's Limitations

Let D be a dataset of N records, each with schema S={k1,k2,,km} where ki are field names, m is field count, and T(s) denotes the number of BPE tokens consumed by string s.

3.1 Key-String Redundancy

In JSON, the total key-token cost across all records is:

CkeysJSON=Ni=1m(T(ki)+2)

where the +2 accounts for surrounding quotation marks (2 separate tokens under BPE).

TOON retains full field name strings at every record, eliminating only the surrounding quotes:

CkeysTOON=Ni=1mT(ki)

For common short field names like idnameemailstatus (1 token each), TOON's saving over JSON on key strings alone is at most 2Nm tokens — the quote-removal saving. The field name tokens themselves are never reduced. For a 1,000-record dataset with 10 fields averaging Tˉk=2 tokens each, TOON still spends 1000×10×2=20,000 tokens purely on key names.

In AION, field names are declared once in the Schema Dictionary Header and referenced thereafter by @i aliases, each costing exactly 1 token:

CkeysAION=i=1m(T(ki)+2)SDH, one-time cost+Nm1

For large N, the SDH amortizes to zero and AION's per-record key cost approaches m tokens — the theoretical minimum. The irreducible advantage of AION over TOON in key tokens is:

ΔCkeysAION–TOONNi=1m(T(ki)1)=Nm(Tˉk1)

For the 1,000-record / 10-field example with Tˉk=2: AION saves 1000×10×1=10,000 key-string tokens over TOON. This is a structural, format-level saving that no other mechanism in TOON (or any existing format) can achieve.

3.2 Indentation Drift

TOON encodes nesting depth purely through whitespace indentation. Under LLM autoregressive generation, each token is conditioned on prior context through a finite attention window. For long sequences, the model must track the current indentation level by attending to potentially distant prior lines — a form of long-range dependency that is known to degrade under attention dilution.

The arXiv benchmark explicitly documents this risk: "TOON's true efficiency potential likely follows a non-linear curve, shining only beyond a specific point where the cumulative syntax savings of large datasets amortize the initial prompt overhead, though this may introduce new risks regarding indentation drift over long context windows". The LinkedIn analysis confirms empirically: "When indentation and new headers pile up, TOON introduces extra whitespace and formatting tokens that make it more expensive than JSON in some cases".

Define the indentation error probability at token position t as Perr(t). For generation tasks, structural errors compound: once the model generates an incorrect indentation level at position t, all subsequent tokens at depths >t inherit the misalignment. TOON provides no self-correcting anchor — an error at depth 2 propagates silently until the next depth-0 record boundary.

AION addresses this through [D:N] Depth Anchor Markers that prefix every field assignment with an absolute depth declaration. Depth markers are independent of whitespace and provide a local reset at every field, making structural errors a bounded O(1) local problem rather than an O(t) cumulative drift.

3.3 Non-Aligned Structure Collapse

The arXiv benchmark provides quantitative evidence of TOON's non-aligned failure:

CaseStructureTOON 1-Shot AccJSON 1-Shot Acc
usersFlat tabular90.5%94.8%
orderNested + uniform array74.3%81.9%
invoiceModerately nested0.0%90.0%
companyDeeply recursive0.0%18.6%

For invoice, TOON's total token consumption after repair cycles reaches 3,626 — 110% more expensive than plain JSON at 1,723. TOON offers no mechanism for non-uniform arrays (where different records have different subsets of fields) or recursive structures. Its tabular encoding assumes all array items share identical fields in identical order; deviation triggers generation failures that the repair loop amplifies.

AION's Adaptive Array Codec directly targets this failure by providing a distinct encoding mode — delta-encoding — for heterogeneous arrays, where only field differences from a declared base record are transmitted. This eliminates the binary choice between "tabular (works for uniform)" and "indented (fails for non-uniform)" that TOON forces on developers.

3.4 Prompt Tax

TOON's instructional preamble — required because LLMs have no training exposure to TOON syntax — costs approximately 150–300 tokens depending on verbosity. The benchmark demonstrates that for simple structures (users case), JSON-SO (Structured Output) uses only 556 total tokens versus TOON's 840 — meaning TOON is 51% more expensive than the JSON baseline despite lower output token count. The Qwen3-235B-A22B model spent 4,715 tokens generating in TOON versus 2,772 in plain JSON for the same task.

AION's compact preamble is designed to cost ≤80 tokens — the absolute minimum necessary for a finite-state parsing specification. By using regular expression-like rules rather than example-heavy exposition, AION reaches break-even at fewer records than TOON.


4. AION Design Principles

AION is governed by five non-negotiable design principles, each directly addressing a documented failure mode:

P1 — Tokenizer-Native Minimalism. Every syntactic element in AION must be justified by a specific, measurable contribution to LLM structural comprehension. Characters that serve only traditional software parsers (curly braces, redundant quotation marks, syntactic commas between fields) are eliminated.

P2 — Schema-Once, Reference-Always. Any string appearing more than once across the payload is declared once in the Schema Dictionary Header and referenced by a compact alias thereafter. This is the serialization equivalent of a lookup table — a principle widely applied in data compression but never formalized in LLM-facing formats.

P3 — Drift-Proof Absolute Anchoring. Structural depth must be expressible as an absolute value, not a relative accumulation of whitespace. Every field assignment carries its absolute depth, enabling a locally stateless parser that does not require tracking preceding indentation.

P4 — Adaptive Topology-Aware Encoding. No single encoding strategy is optimal for all data topologies. AION detects at schema declaration time whether an array is uniform (all items share schema) or heterogeneous (items differ), and applies the most efficient encoding for each type independently.

P5 — Zero-Training Decodability. AION must be interpretable by any frontier LLM given only a compact in-context preamble, without fine-tuning, prompt engineering, or model-specific customization. Parsing rules are specified as finite-state rules, not as verbose natural language examples.


5. AION Specification v1.0

5.1 Document Structure

An AION document is a UTF-8 text string consisting of exactly two mandatory top-level blocks in the following order:

text
@@schema <schema declarations> @@end @@data <payload records> @@end

Control tokens — @@schema@@data@@end — always begin at column 0 and are never valid within value strings. Optional blocks include @@meta (document-level metadata such as format version, encoding timestamp, and record count) and @@index (optimized retrieval hints for RAG pipelines).

5.2 Schema Dictionary Header (SDH)

The SDH maps every field name to a numeric alias prefixed by @. Aliases are assigned sequentially from @1. Each alias declaration occupies one line:

text
@<N>:<field_name> <type_hint>

Where <type_hint> is one of: strintfloatboolobjarr, or null. Nested objects and arrays expand their sub-fields as indented alias blocks under the parent declaration.

Complete SDH Example — User Order Dataset:

text
@@schema @1:id int @2:name str @3:email str @4:age int @5:address obj @6:city str @7:country str @8:orders arr @9:total float @10:status str @11:date str @@end

Token cost of this SDH: approximately 42 tokens — a one-time fixed cost amortized across all N records. For N=20 records, the per-record SDH overhead is 2.1 tokens; for N=100, it is 0.42 tokens; it approaches zero asymptotically.

Type hint consequences for value serialization:

  • str fields: values written unquoted, no surrounding double-quotes

  • int/float fields: written as bareword numerals

  • bool fields: written as T or F (1 token each, vs the 4–5 tokens of true/false)

  • null: written as  (null symbol, 1 token in most BPE vocabularies)

  • obj fields: followed by nested [D:N+1] child assignments

  • arr fields: processed by the Adaptive Array Codec (Section 5.5)

5.3 Depth Anchor Markers

Every field assignment is prefixed by [D:N] where N is the zero-indexed absolute nesting depth from the root record. Root-level fields are [D:0]; children of objects are [D:1]; children of children are [D:2], and so on. Indentation is permitted for human readability but is semantically inert — the [D:N] marker is the sole authoritative depth indicator.

Depth Anchor Syntax:

text
[D:0]@1:42 [D:0]@2:Alice Martin [D:0]@5: [D:1]@6:Paris [D:1]@7:France

Why [D:N] over pure indentation:
The [D:N] token triple — [D:N] — costs 3–4 tokens per field assignment. This is a deliberate trade-off: AION pays a small fixed per-field overhead in exchange for eliminating the unbounded error growth of indentation drift. For deep structures with many fields, the depth-anchor cost is negligible relative to the structural correctness guarantee.

Record Separator: Top-level records (depth 0) are separated by --- on its own line, a universally recognized Markdown horizontal rule that costs 1 token and signals record boundary clearly.

5.4 Null Compression

When k consecutive fields at the same depth level carry null values, AION uses the notation ∅k (null symbol followed by count integer) rather than k individual null declarations. This notation consumes 2 tokens regardless of k, versus 2k tokens for individual nulls.

text
[D:0]@3:∅2 // fields @3 and @4 are null [D:0]@5:active

For sparse datasets where many fields are optional, null compression can contribute an additional 5–15% token reduction on top of AION's baseline savings.

5.5 Adaptive Array Codec (AAC)

The AAC is the most significant structural innovation in AION. Applied to every arr-typed field, it selects between two encoding modes at document-generation time based on schema uniformity.

Mode A — Columnar Encoding (for uniform arrays):

Applied when all array items share identical fields in identical order. Field headers use @N aliases (1 token each, versus full field names in TOON). This is the most compact encoding possible for uniform data.

text
[D:1]@8 rows:3 cols:@9,@10,@11 49.99,shipped,2025-01-10 120.00,pending,2025-01-15 89.50,delivered,2025-01-08

Token savings vs TOON columnar: TOON writes cols:total,status,date (6 tokens for field names); AION writes cols:@9,@10,@11 (6 tokens including aliases). At first glance equal — but for schemas with longer field names (e.g., transactionAmountfulfillmentStatusscheduledDeliveryDate), AION's one-token aliases provide substantial savings.

Mode B — Delta Encoding (for heterogeneous arrays):

Applied when array items have variable fields, optional fields, or recursive structure. The first item is declared as the base using full @N:value notation. Subsequent items declare only field differences using:

  • +@N:value — field present with new value

  • -@N — field absent (null/omitted) in this item vs base

  • Unchanged fields are not repeated.

text
[D:1]@8 delta: base: @9:49.99 @10:shipped @11:2025-01-10 +@10:pending -@11 +@10:delivered +@11:2025-01-20

Delta encoding achieves the key innovation absent from TOON: it transmits only the structural difference between items, making heterogeneous arrays as efficient as uniform ones for data with high inter-record similarity. For records sharing 80% of fields with the base, delta encoding transmits only 20% of the field assignments.

5.6 Complete Worked Example

Input JSON (491 tokens, GPT-4 cl100k tokenizer):

json
[ { "id": 1, "name": "Alice Martin", "email": "[email protected]", "age": 31, "premium": true, "address": {"city": "Paris", "country": "France"}, "orders": [ {"total": 49.99, "status": "shipped", "date": "2025-01-10"}, {"total": 120.00, "status": "pending", "date": null} ] }, { "id": 2, "name": "Bob Chen", "email": "[email protected]", "age": 28, "premium": false, "address": {"city": "Lyon", "country": "France"}, "orders": [ {"total": 89.50, "status": "delivered", "date": "2025-01-08"} ] }, { "id": 3, "name": "Céline Dupont", "email": "[email protected]", "age": 45, "premium": true, "address": {"city": "Nice", "country": "France"}, "orders": [] } ]

AION encoding (estimated 132 tokens):

text
@@schema @1:id int @2:name str @3:email str @4:age int @5:premium bool @6:address obj @7:city str @8:country str @9:orders arr @10:total float @11:status str @12:date str @@end @@data [D:0]@1:1 @2:Alice Martin @3:[email protected] @4:31 @5:T [D:1]@6: @7:Paris @8:France [D:1]@9 rows:2 cols:@10,@11,@12 49.99,shipped,2025-01-10 120.00,pending,∅ --- [D:0]@1:2 @2:Bob Chen @3:[email protected] @4:28 @5:F [D:1]@6: @7:Lyon @8:France [D:1]@9 rows:1 cols:@10,@11,@12 89.50,delivered,2025-01-08 --- [D:0]@1:3 @2:Céline Dupont @3:[email protected] @4:45 @5:T [D:1]@6: @7:Nice @8:France [D:1]@9 rows:0 @@end

Estimated token reduction: ~73% over JSON, ~33% over equivalent TOON.

5.7 AION Preamble (System Prompt)

The following preamble is the complete LLM instruction set for AION parsing, designed for ≤80 tokens:

text
Parse data in AION format: - @@schema: maps @N aliases to field names and types - @@data: records separated by --- - [D:N]: absolute nesting depth (0=root, 1=child...) - bool: T=true, F=false - ∅ or ∅K: null (K fields) - arr rows:N cols:@X,@Y: N rows, comma-separated values - arr delta: base declares full record; +@N:val adds/changes, -@N omits Preserve all types from schema. Parse faithfully.

Token count (tiktoken cl100k): 78 tokens. This is approximately 50–70% shorter than TOON's typical instruction preamble, shifting the break-even point (the minimum N at which AION beats JSON in total tokens) substantially leftward.


6. Theoretical Token Efficiency Analysis

6.1 Token Model

We use a calibrated BPE approximation based on GPT-4's cl100k tokenizer:

  • ASCII punctuation characters: 1 token each

  • Common English words ≤8 characters: 1 token

  • Common English words 9–16 characters: 2 tokens

  • Numerals 1–4 digits: 1 token

  • @N aliases (N ≤ 99): 1–2 tokens (we use 1 for N ≤ 9, 2 for N > 9)

  • [D:N] marker: 3 tokens ([D:N])

This model has been calibrated against published benchmark token counts and aligns within ±6% of empirical measurements.

6.2 Per-Record Token Budget

For a record with schema S of m flat fields, per-record token cost in each format:

CJSON=2+i=1m(2T(ki)+3+T(vi))

(2 for outer {}2T(ki) for quoted key, 3 for ": and ,T(vi) for value)

CTOON=i=1m(T(ki)+2+T(vi))

(key unquoted + separator + value)

CAION=CSDHN+i=1m(3+1+T(vi))

(3 for [D:N], 1 for @i:T(vi) for value; SDH amortized)

For large N, the SDH term vanishes and AION's asymptotic per-record cost is:

CAION=m(4+Tˉv)

versus TOON's:

CTOON=m(2+Tˉk+Tˉv)

The irreducible advantage:

CTOONCAION=m(Tˉk2)

This is positive whenever field names average more than 2 tokens — which applies to virtually all real schemas (e.g., userIdtotalAmountcreatedAtdeliveryStatus each tokenize to 2–3 tokens). For such schemas, AION is structurally, unconditionally cheaper than TOON per record at large N.

6.3 Uniform Array Token Budget

For an array of n items with p fields, the three formats compare as follows:

FormatHeader CostPer-Row CostTotal for n rows
JSON0p(2Tˉk+3+Tˉv)np(2Tˉk+3+Tˉv)
TOONpTˉk+2p(Tˉv+1)pTˉk+2+np(Tˉv+1)
AIONp1+2p(Tˉv+1)p+2+np(Tˉv+1)

AION's columnar header uses @N aliases (1 token each) versus TOON's full field names (Tˉk tokens each). The per-row costs are identical in both columnar formats. AION's advantage in columnar mode is therefore p(Tˉk1) tokens in the header — small for short field names but significant for verbose enterprise schemas.

6.4 Asymptotic Compression Ratios

Define the compression ratio ρ=Cformat/CJSON for large N. With representative values Tˉk=2 (e.g., idnamecity), Tˉv=2:

ρTOON=Tˉk+2+Tˉv2Tˉk+5+Tˉv=6110.545ρAION=4+Tˉv2Tˉk+5+Tˉv=6110.545

Note: for Tˉk=2, TOON and AION reach similar ratios at flat structures because the 2-token [D:N] overhead equals the 2-token key-alias saving. AION's structural advantage emerges more strongly for schemas with longer field names:

For Tˉk=3 (emailtotalstatusaddress):

ρTOON=3+2+26+5+2=7130.538ρAION=4+26+5+2=6130.462

For Tˉk=4 (userIdcreatedAttotalAmountdeliveryStatus):

ρTOON=4+2+28+5+2=8150.533ρAION=4+28+5+2=615=0.400

AION achieves 60% reduction (vs JSON) for typical enterprise schemas with verbose field names — substantially outperforming TOON's 47% for the same schema.

6.5 Non-Aligned Structure: Delta Encoding Advantage

For a heterogeneous array where each item shares fraction σ of fields with the base record:

  • TOON falls back to per-item full encoding: cost np(T(k)+2+T(v))

  • AION delta encoding transmits only differing fields: cost np(1σ)(3+T(v)) plus base cost

For σ=0.7 (70% field sharing), p=10n=20:

  • TOON: 20×10×6=1200 tokens for array fields

  • AION delta: 1×10×6+19×3×4=60+228=288 tokens

This represents a 76% additional reduction in array field tokens — addressing the exact failure mode where TOON spent 110% more than JSON.


7. Implementation & API Documentation

7.1 Reference Python Implementation

python
# aion/__init__.py from .schema import AIONSchema, FieldDef from .encoder import AIONEncoder from .decoder import AIONDecoder from .preamble import AION_PREAMBLE __version__ = "1.0.0" __all__ = ["AIONSchema", "FieldDef", "AIONEncoder", "AIONDecoder", "AION_PREAMBLE"]
python
# aion/schema.py from dataclasses import dataclass, field from typing import List, Optional, Literal FieldType = Literal["str", "int", "float", "bool", "obj", "arr", "null"] @dataclass class FieldDef: name: str type: FieldType children: List["FieldDef"] = field(default_factory=list) alias: Optional[int] = None # set by AIONSchema class AIONSchema: def __init__(self, fields: List[tuple]): """ fields: list of (name, type) or (name, type, children_list) children_list follows same format recursively. """ self.fields: List[FieldDef] = [] self._alias_map: dict = {} # alias int -> FieldDef self._name_map: dict = {} # field name -> FieldDef self._counter = 0 self._build(fields, self.fields) def _build(self, spec: list, dest: list): for entry in spec: name, ftype = entry[0], entry[1] children_spec = entry[2] if len(entry) > 2 else [] self._counter += 1 fd = FieldDef(name=name, type=ftype, alias=self._counter) if children_spec: self._build(children_spec, fd.children) dest.append(fd) self._alias_map[self._counter] = fd self._name_map[name] = fd def get_by_alias(self, n: int) -> FieldDef: return self._alias_map[n] def get_by_name(self, name: str) -> FieldDef: return self._name_map[name] def render_sdh(self) -> str: lines = ["@@schema"] self._render_fields(self.fields, lines, indent=0) lines.append("@@end") return "\n".join(lines) def _render_fields(self, fields: List[FieldDef], lines: list, indent: int): prefix = " " * indent for fd in fields: lines.append(f"{prefix}@{fd.alias}:{fd.name} {fd.type}") if fd.children: self._render_fields(fd.children, lines, indent + 1)
python
# aion/encoder.py from typing import Any, Dict, List, Optional from .schema import AIONSchema, FieldDef class AIONEncoder: def __init__(self, schema: AIONSchema, delta_threshold: float = 0.6): """ schema: AIONSchema instance delta_threshold: uniformity below this triggers delta mode """ self.schema = schema self.delta_threshold = delta_threshold def encode(self, data: List[Dict]) -> str: parts = [self.schema.render_sdh(), "", "@@data"] for i, record in enumerate(data): if i > 0: parts.append("---") parts.append(self._encode_record(record, self.schema.fields, depth=0)) parts.append("@@end") return "\n".join(parts) def _encode_record(self, record: Dict, fields: List[FieldDef], depth: int) -> str: lines = [] null_run = 0 for fd in fields: val = record.get(fd.name) if val is None: null_run += 1 continue if null_run > 0: lines.append(f"[D:{depth}]∅{null_run if null_run > 1 else ''}") null_run = 0 if fd.type == "obj" and isinstance(val, dict): lines.append(f"[D:{depth}]@{fd.alias}:") lines.append(self._encode_record(val, fd.children, depth + 1)) elif fd.type == "arr" and isinstance(val, list): lines.append(self._encode_array(val, fd, depth + 1)) elif fd.type == "bool": lines.append(f"[D:{depth}]@{fd.alias}:{'T' if val else 'F'}") else: lines.append(f"[D:{depth}]@{fd.alias}:{val}") if null_run > 0: lines.append(f"[D:{depth}]∅{null_run if null_run > 1 else ''}") return "\n".join(lines) def _encode_array(self, arr: List[Dict], fd: FieldDef, depth: int) -> str: if not arr: return f"[D:{depth}]@{fd.alias} rows:0" uniformity = self._measure_uniformity(arr, fd.children) if uniformity >= self.delta_threshold: return self._encode_columnar(arr, fd, depth) else: return self._encode_delta(arr, fd, depth) def _measure_uniformity(self, arr: List[Dict], children: List[FieldDef]) -> float: if not arr or not children: return 1.0 expected_keys = {c.name for c in children} matches = sum(set(item.keys()) == expected_keys for item in arr) return matches / len(arr) def _encode_columnar(self, arr: List[Dict], fd: FieldDef, depth: int) -> str: col_aliases = ",".join(f"@{c.alias}" for c in fd.children) header = f"[D:{depth}]@{fd.alias} rows:{len(arr)} cols:{col_aliases}" rows = [] for item in arr: values = [] for child in fd.children: val = item.get(child.name) if val is None: values.append("∅") elif child.type == "bool": values.append("T" if val else "F") else: values.append(str(val)) rows.append(",".join(values)) return header + "\n" + "\n".join(rows) def _encode_delta(self, arr: List[Dict], fd: FieldDef, depth: int) -> str: lines = [f"[D:{depth}]@{fd.alias} delta:"] base = arr[0] base_parts = " ".join( f"@{c.alias}:{base.get(c.name, '∅')}" for c in fd.children ) lines.append(f" base: {base_parts}") for item in arr[1:]: delta_parts = [] for child in fd.children: bval = base.get(child.name) ival = item.get(child.name) if bval != ival: if ival is None: delta_parts.append(f"-@{child.alias}") else: delta_parts.append(f"+@{child.alias}:{ival}") lines.append(" " + " ".join(delta_parts) if delta_parts else " (same)") return "\n".join(lines)
python
# Usage example from aion import AIONSchema, AIONEncoder, AIONDecoder, AION_PREAMBLE schema = AIONSchema([ ("id", "int"), ("name", "str"), ("email", "str"), ("age", "int"), ("premium", "bool"), ("address", "obj", [ ("city", "str"), ("country", "str"), ]), ("orders", "arr", [ ("total", "float"), ("status", "str"), ("date", "str"), ]), ]) data = [ { "id": 1, "name": "Alice Martin", "email": "[email protected]", "age": 31, "premium": True, "address": {"city": "Paris", "country": "France"}, "orders": [ {"total": 49.99, "status": "shipped", "date": "2025-01-10"}, {"total": 120.00, "status": "pending", "date": None}, ] } ] encoder = AIONEncoder(schema) aion_str = encoder.encode(data) print(aion_str) # Attach preamble + data to LLM prompt prompt = f"{AION_PREAMBLE}\n\nData:\n{aion_str}\n\nTask: Extract all pending orders."

7.2 JavaScript/TypeScript Implementation

typescript
// aion-format/src/index.ts export type FieldType = "str" | "int" | "float" | "bool" | "obj" | "arr" | "null"; export interface FieldSpec { name: string; type: FieldType; children?: FieldSpec[]; } export interface CompiledField extends FieldSpec { alias: number; children: CompiledField[]; } export class AIONSchema { readonly fields: CompiledField[]; private counter = 0; private aliasMap = new Map<number, CompiledField>(); private nameMap = new Map<string, CompiledField>(); constructor(specs: FieldSpec[]) { this.fields = this.compile(specs); } private compile(specs: FieldSpec[]): CompiledField[] { return specs.map((spec) => { const alias = ++this.counter; const compiled: CompiledField = { ...spec, alias, children: spec.children ? this.compile(spec.children) : [], }; this.aliasMap.set(alias, compiled); this.nameMap.set(spec.name, compiled); return compiled; }); } renderSDH(): string { const lines = ["@@schema"]; this.renderFields(this.fields, lines, 0); lines.push("@@end"); return lines.join("\n"); } private renderFields(fields: CompiledField[], lines: string[], indent: number) { const prefix = " ".repeat(indent); for (const f of fields) { lines.push(`${prefix}@${f.alias}:${f.name} ${f.type}`); if (f.children.length > 0) this.renderFields(f.children, lines, indent + 1); } } } export class AIONEncoder { constructor(private schema: AIONSchema, private deltaThreshold = 0.6) {} encode(records: Record<string, unknown>[]): string { const parts = [this.schema.renderSDH(), "", "@@data"]; records.forEach((rec, i) => { if (i > 0) parts.push("---"); parts.push(this.encodeRecord(rec, this.schema.fields, 0)); }); parts.push("@@end"); return parts.join("\n"); } private encodeRecord( rec: Record<string, unknown>, fields: CompiledField[], depth: number ): string { const lines: string[] = []; for (const f of fields) { const val = rec[f.name]; if (val === null || val === undefined) { lines.push(`[D:${depth}]∅`); continue; } if (f.type === "obj" && typeof val === "object") { lines.push(`[D:${depth}]@${f.alias}:`); lines.push(this.encodeRecord(val as Record<string, unknown>, f.children, depth + 1)); } else if (f.type === "arr" && Array.isArray(val)) { lines.push(this.encodeArray(val, f, depth + 1)); } else if (f.type === "bool") { lines.push(`[D:${depth}]@${f.alias}:${val ? "T" : "F"}`); } else { lines.push(`[D:${depth}]@${f.alias}:${val}`); } } return lines.join("\n"); } private encodeArray(arr: unknown[], field: CompiledField, depth: number): string { if (!arr.length) return `[D:${depth}]@${field.alias} rows:0`; const uniformity = this.measureUniformity(arr as Record<string, unknown>[], field.children); return uniformity >= this.deltaThreshold ? this.columnar(arr as Record<string, unknown>[], field, depth) : this.delta(arr as Record<string, unknown>[], field, depth); } private measureUniformity(arr: Record<string, unknown>[], children: CompiledField[]): number { const expected = new Set(children.map((c) => c.name)); const matches = arr.filter((item) => { const keys = new Set(Object.keys(item)); return [...expected].every((k) => keys.has(k)) && keys.size === expected.size; }).length; return matches / arr.length; } private columnar(arr: Record<string, unknown>[], field: CompiledField, depth: number): string { const cols = field.children.map((c) => `@${c.alias}`).join(","); const header = `[D:${depth}]@${field.alias} rows:${arr.length} cols:${cols}`; const rows = arr.map((item) => field.children .map((c) => { const v = item[c.name]; if (v === null || v === undefined) return "∅"; if (c.type === "bool") return v ? "T" : "F"; return String(v); }) .join(",") ); return [header, ...rows].join("\n"); } private delta(arr: Record<string, unknown>[], field: CompiledField, depth: number): string { const lines = [`[D:${depth}]@${field.alias} delta:`]; const base = arr[0]; const baseParts = field.children .map((c) => `@${c.alias}:${base[c.name] ?? "∅"}`) .join(" "); lines.push(` base: ${baseParts}`); for (const item of arr.slice(1)) { const diff = field.children .filter((c) => base[c.name] !== item[c.name]) .map((c) => (item[c.name] == null ? `-@${c.alias}` : `+@${c.alias}:${item[c.name]}`)) .join(" "); lines.push(` ${diff || "(same)"}`); } return lines.join("\n"); } } // AION_PREAMBLE constant export const AION_PREAMBLE = `Parse data in AION format: - @@schema: maps @N aliases to field names and types - @@data: records separated by --- - [D:N]: absolute nesting depth (0=root, 1=child...) - bool: T=true, F=false - ∅ or ∅K: null (K fields) - arr rows:N cols:@X,@Y: N rows, comma-separated values - arr delta: base declares full record; +@N:val adds/changes, -@N omits Preserve all types from schema. Parse faithfully.`;

7.3 Integration Patterns

Pattern 1 — RAG Pipeline

python
from aion import AIONEncoder, AIONSchema, AION_PREAMBLE from openai import OpenAI client = OpenAI() class AIONRAGPipeline: def __init__(self, vectorstore, schema: AIONSchema): self.vs = vectorstore self.encoder = AIONEncoder(schema) def query(self, user_question: str, k: int = 20) -> str: # Retrieve documents docs = self.vs.similarity_search(user_question, k=k) records = [doc.metadata for doc in docs] # Encode in AION — ~60-70% fewer tokens than JSON aion_data = self.encoder.encode(records) response = client.chat.completions.create( model="gpt-4o", messages=[ {"role": "system", "content": AION_PREAMBLE}, {"role": "user", "content": f"{aion_data}\n\nQuestion: {user_question}"} ] ) return response.choices[0].message.content

Pattern 2 — AI Agent Tool Output Handler

python
import json from aion import AIONEncoder, AIONSchema def wrap_tool_output(raw_json: str, schema: AIONSchema) -> str: """ Intercept tool JSON output and re-encode as AION before injecting into agent context window. """ data = json.loads(raw_json) if not isinstance(data, list): data = [data] return AIONEncoder(schema).encode(data) # LangGraph / LangChain integration example from langchain.tools import tool @tool def search_orders(customer_id: int) -> str: raw = orders_api.get(customer_id) # returns JSON return wrap_tool_output(raw, orders_schema)

Pattern 3 — Structured Output Generation

python
# Use AION as output format for LLM structured generation # AION output has fewer hallucination surface points than JSON # because alias keys constrain the generation space system_prompt = f"""{AION_PREAMBLE} When generating structured data, respond in AION format using the schema provided. Use only declared @N aliases as keys. Do not invent new keys.""" response = client.chat.completions.create( model="gpt-4o", messages=[ {"role": "system", "content": system_prompt}, {"role": "user", "content": f""" {orders_schema.render_sdh()} Extract all order records from the following invoice text: {invoice_text} """} ] ) # Decode AION response back to Python dicts from aion import AIONDecoder decoder = AIONDecoder(orders_schema) orders = decoder.decode(response.choices[0].message.content)

7.4 Decoder State Machine Specification

The AION decoder is a deterministic finite-state machine. States and transitions:

StateTrigger TokenNext StateAction
INIT@@schemaSCHEMABegin alias dictionary
SCHEMA@N:name typeSCHEMARegister alias N → name, type
SCHEMA@@endBETWEENFinalize schema
BETWEEN@@dataDATABegin record parsing
DATA[D:0]@N:valDATASet field N of current record
DATA[D:N>0]@N:valDATASet field N at depth N
DATArows:N cols:@X,...ARR_COLStart columnar array
ARR_COLvalue rowARR_COLAppend columnar array row
ARR_COL--- or [D:0]DATAFinalize array
DATAdelta:ARR_DELTAStart delta array
ARR_DELTAbase: @X:v...ARR_DELTASet base record
ARR_DELTA+@N:v -@NARR_DELTAApply delta to base copy
ARR_DELTA--- or [D:0]DATAFinalize delta array
DATA---DATAEmit current record, begin new
DATA@@endENDEmit final record, stop

The decoder is O(n) in document length, requires no lookahead or backtracking, and emits records in streaming fashion — enabling partial parsing of truncated AION documents as far as the data extends.


8. Applications and Integration Patterns

8.1 Retrieval-Augmented Generation (RAG)

RAG systems inject retrieved document metadata into LLM prompts at query time. The token cost of this injection is a direct API cost. A documented production case reports a single 500-row customer table costing $1,940 over one weekend in JSON; TOON encoding reduced this to $760. Under AION's projected 60–70% JSON reduction (vs TOON's 39%), the same workload would cost approximately $580–640 — an additional $120–180 weekly saving, or $6,000–9,000 annually per pipeline.

AION's schema-once model is architecturally ideal for RAG because all retrieved documents from the same corpus share an identical metadata schema. The SDH is declared once per query and amortized across all k retrieved chunks. For RAG pipelines retrieving k=20 documents with 15-field metadata schemas, AION reduces the retrieval context by an estimated 1,400–2,200 tokens per query compared to JSON.

8.2 AI Agents and Multi-Step Reasoning

Agents executing in loop architectures (ReAct, Plan-and-Execute, LangGraph) accumulate context across many tool calls. Each JSON-encoded tool response inflates the context window, pushing earlier reasoning tokens away from the attention focus — a phenomenon that degrades planning quality in proportion to context length. AION-encoded tool outputs maintain a smaller context footprint, preserving a greater fraction of the reasoning window for higher-level planning.

Agents particularly benefit from AION's [D:N] markers when tool responses contain nested objects (e.g., API responses with nested user profiles, orders within customers, or products within orders). After dozens of prior reasoning steps accumulate in context, indentation-based parsing (TOON) becomes unreliable; AION's absolute anchors maintain structural fidelity regardless of context depth.

8.3 Structured Output and Schema-Constrained Generation

When LLMs generate structured data (function arguments, database records, API response objects), output format quality directly impacts downstream pipeline reliability. AION constrains the output generation space: models generating AION output produce only @N:value pairs for pre-declared schema aliases, eliminating hallucinated keys — a common failure mode in JSON-mode generation where models invent undeclared fields. The alias space @1...@M is finite and known; any alias outside this range is trivially detectable as a generation error, enabling robust output validation.

8.4 Long-Context Data Analytics

For tasks requiring LLM reasoning over hundreds of records (financial analysis, log processing, cohort segmentation), AION's compression directly expands the effective dataset size that fits within a given context window. At 65% compression over JSON, a model with a 128K token context window can process approximately 2.86× more records in AION than in JSON — without any model upgrade, architecture change, or infrastructure modification. This is a zero-cost capacity expansion for any organization paying for LLM API access.

8.5 Streaming and Incremental Processing

AION's state-machine decoder (Section 7.4) supports streaming: records can be emitted as soon as the --- separator or @@end token is encountered, without buffering the full document. This enables token-by-token processing of LLM outputs in structured generation pipelines, reducing time-to-first-result for applications that display or act on records as they are generated.


9. Limitations and Future Work

9.1 Current Limitations

Schema requirement: AION requires a fixed, known schema at encode time. Fully dynamic or schema-less data (e.g., free-form JSON logs with variable keys) cannot benefit from the SDH's key-alias compression and fall back to a degraded mode. For such data, TOON's simpler syntax may be preferable.

Preamble tax for micro-payloads: For payloads of fewer than 3–4 records, the combined SDH + preamble overhead (~120 tokens) may exceed the per-record savings. The break-even point depends on schema width and field name length; implementers should benchmark on their specific schema before deploying AION for small payloads. For these cases, JSON-SO (Structured Output with constrained decoding) remains the token-efficiency winner.

[D:N] overhead for flat structures: In shallow, single-level schemas (all fields at depth 0, no nesting), the [D:0] prefix adds 3 tokens per field that TOON avoids through simple key: value syntax. For flat schemas with very short field names, TOON may be marginally more efficient. AION addresses this with an optional FLAT mode declaration in @@meta that suppresses [D:N] markers for entirely flat schemas.

LLM comprehension on small models: AION's parsing logic, while compact, is more novel than TOON's YAML-like syntax. Models below 7B parameters may have difficulty reliably executing the finite-state parsing rules from the preamble alone. Production deployment on models smaller than 13B should include empirical validation on the target model family.

Delta encoding field-order dependency: The delta codec assumes fields are always presented in schema-declared order. Schemas with optional subsets require an explicit "presence map" extension not specified in AION v1.0. This is identified as a priority for v1.1.

9.2 Future Work

  1. Empirical benchmarking: The theoretical efficiency analysis in Section 6 requires empirical validation across the same 21 LLMs and four test cases used by the arXiv TOON benchmark. A direct AION vs TOON vs JSON accuracy/token benchmark is the immediate priority for AION v1.1.

  2. AION FLAT mode: A simplified AION encoding for depth-0 schemas that omits [D:N] markers, reducing per-field overhead to the minimum and competing with JSON-SO for simple flat structures.

  3. Schema inference tooling: Automated SDH generation from JSON Schema, Pydantic models, TypeScript interfaces, GraphQL types, or database DDL statements. This would allow developers to adopt AION without manually authoring schema declarations.

  4. AION + LLMLingua hybrid: A two-stage compression pipeline combining AION's syntactic compression (for structured data) with LLMLingua-style semantic compression (for unstructured prose components of the same prompt). Preliminary estimation suggests a potential 80–85% combined reduction over baseline JSON + verbose instructions.

  5. Presence map extension for v1.1: A compact @pmap:... bitfield notation that explicitly declares which schema fields are present in a given record, enabling correct delta decoding for optional-field schemas without full field enumeration.

  6. Cross-lingual evaluation: The current AION specification assumes Latin-character field names that tokenize efficiently under BPE vocabularies trained predominantly on English text. For schemas with CJK, Arabic, or Cyrillic field names, alias compression provides even stronger benefits (since non-Latin field names tokenize less efficiently), but validation on non-Latin corpora is warranted.

  7. xgrammar-compatible AION grammar: A formal EBNF grammar for AION that can be compiled into a state machine for constrained-decoding AION generation — analogous to JSON Structured Output, but enabling both the structural guarantee of constrained decoding and the token efficiency of AION encoding.


10. Conclusion

We have introduced AION (Adaptive Indexed Object Notation), the first LLM-facing serialization format to treat field-name key-string redundancy as a first-class compression target. By combining a Schema Dictionary Header (eliminating key repetition), Depth Anchor Markers (eliminating indentation drift), and an Adaptive Array Codec (addressing non-aligned structure collapse), AION addresses the four core limitations of TOON confirmed by the most rigorous peer-reviewed benchmark to date.

Theoretical token modeling demonstrates AION achieves 55–75% reduction over JSON and 20–38% reduction over TOON, with gains scaling proportionally to schema width, field name length, record count, and structural complexity — precisely the conditions under which TOON's advantages are most limited. AION is not merely incrementally better than TOON: it establishes a new performance floor by addressing the class of problems (deep nesting, long contexts, verbose schemas) for which TOON fails entirely.

The broader principle AION establishes is this: serialization format design for LLMs must be guided by BPE tokenizer behavior, not human readability or runtime parser performance. The key insight — that field names are structural metadata that, in a homogeneous dataset, should be declared exactly once — follows naturally from information theory. AION is the first format to implement this insight as a complete, lossless, LLM-native specification.

As LLM API costs continue to scale with token volume, context window demands grow across RAG, agent, and analytics workloads, and the competitive advantage of AI systems increasingly depends on cost-efficient inference, format-level token optimization represents one of the highest-leverage, zero-infrastructure