Paged · IDML Reference

The binary-to-XML lineage

Why IDML's open XML interchange form exists alongside InDesign's native binary document — same document, different audience.

Beginner · explanation

IDML and the native binary carry the same document — they differ only in who they are for.

In short: A page-layout document has two faces. The native Adobe® InDesign® file is a packed binary: compact, fast for the application to load, and tightly bound to that application's internals — great for the app, opaque to everyone else. IDML is the same document written out as XML so that any tool, not just the one that made it, can read, diff, generate, or transform it. This page explains why that second form exists, what it faithfully preserves, and what it deliberately leaves behind.

The native document format is a packed binary: compact, fast for the application to load, and tightly coupled to that application's internals. That is good for the app and hard for everyone else. You cannot reliably read it, diff it, or generate it from another tool without reverse-engineering the byte layout.

IDML is the answer for everyone else. It expresses the same document as XML, so the structure is inspectable and other software can produce and consume it. The binary serves one application; the XML serves any tool willing to read element names.

What the form preserves

The interchange form keeps the things a second tool needs to reconstruct the document: the page geometry, the text and where it flows, the styles, the colors, and the references that tie those parts together. It does not try to be the application's working memory. Live caches, undo history, and editor-only state stay in the binary. What survives the trip to XML is the document's structure and intent — what is on the page and why — not a snapshot of the app's runtime.

What it leaves to the application

Some values are recorded as facts rather than as instructions. A text variable, for example, carries the text it resolved to at export time; recomputing it page by page is left to whatever opens the file. The form states "this is what it said then" and trusts the application to decide whether to refresh it.
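
To make "facts rather than instructions" concrete, here is a minimal Python sketch that reads the stored result of a text variable from a story fragment without recomputing anything. The element name TextVariableInstance and the attribute ResultText are assumptions about how the interchange form spells this; this page does not name them. The fragment itself is invented for illustration.

```python
import xml.etree.ElementTree as ET

# Invented story fragment. The element and attribute names are assumptions,
# not taken from this page; check them against a real exported story.
STORY_FRAGMENT = """<Story Self="u10">
  <ParagraphStyleRange>
    <CharacterStyleRange>
      <Content>Section: </Content>
      <TextVariableInstance Self="u11" Name="Running Header"
                            ResultText="Chapter One"/>
    </CharacterStyleRange>
  </ParagraphStyleRange>
</Story>"""


def resolved_variables(story_xml: str) -> list[tuple[str, str]]:
    """Return (variable name, text it resolved to at export time) pairs.

    Nothing is recomputed: the stored result is read back as a fact.
    """
    root = ET.fromstring(story_xml)
    return [
        (element.get("Name", ""), element.get("ResultText", ""))
        for element in root.iter("TextVariableInstance")
    ]


if __name__ == "__main__":
    for name, text in resolved_variables(STORY_FRAGMENT):
        print(f"{name}: {text}")
```

A consumer that wants fresh values would need its own layout pass to re-resolve the variable; a consumer that only needs what the page said at export time can stop here.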

What we read first

Our reader treats the archive as a container before it treats it as a document. The container-open step (core/crates/paged-parse) decompresses every entry, then checks one thing up front: the mimetype entry must read exactly application/vnd.adobe.indesign-idml-package. Anything else is rejected as not IDML before a single page is parsed. Only then does the reader turn to designmap.xml, the manifest that names every other part.
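
The container check can be sketched in a few lines of Python using only the standard library. This is an illustration of the steps described above, not the paged-parse implementation: open_container, NotIdmlError, and make_package are hypothetical names, and the sketch assumes the package is a ZIP archive.

```python
import io
import zipfile

IDML_MIMETYPE = b"application/vnd.adobe.indesign-idml-package"


class NotIdmlError(ValueError):
    """Raised when an archive does not identify itself as an IDML package."""


def open_container(data: bytes) -> dict[str, bytes]:
    """Decompress every entry, then verify the mimetype before parsing anything."""
    with zipfile.ZipFile(io.BytesIO(data)) as archive:
        entries = {name: archive.read(name) for name in archive.namelist()}

    # Exact comparison: a different type, or even a trailing newline, is rejected.
    if entries.get("mimetype") != IDML_MIMETYPE:
        raise NotIdmlError("mimetype entry is missing or is not the IDML type")
    if "designmap.xml" not in entries:
        raise NotIdmlError("package has no designmap.xml manifest")
    return entries


def make_package(mimetype: bytes) -> bytes:
    """Build a tiny in-memory archive for demonstration (hypothetical helper)."""
    buffer = io.BytesIO()
    with zipfile.ZipFile(buffer, "w") as archive:
        archive.writestr("mimetype", mimetype)
        archive.writestr("designmap.xml", "<Document/>")
    return buffer.getvalue()


if __name__ == "__main__":
    entries = open_container(make_package(IDML_MIMETYPE))
    print(sorted(entries))
    try:
        open_container(make_package(b"application/zip"))
    except NotIdmlError as error:
        print("rejected:", error)
```

Checking the mimetype before touching any XML keeps the failure cheap and unambiguous: a file that is merely a ZIP never reaches the document parser.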

Document::open (core/crates/paged-scene) builds on that: it pulls the shared resources (graphics, styles), then the master spreads, the spreads, and the stories the manifest enumerates — each fetched by the path the design map gave it. The lineage shows here in the order of reading: identify the package, read the manifest, follow it to the parts. The XML exists precisely so a tool other than the original application can do exactly that.
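
The "follow the manifest" step can be sketched the same way. The Python below is not Document::open; it only illustrates the reading order described above. It assumes that designmap.xml references each part through an element in the idPkg packaging namespace carrying a src attribute. The namespace URI, the element names, and the sample paths are assumptions to verify against a real package.

```python
import xml.etree.ElementTree as ET

# Assumed packaging namespace for idPkg:* elements in designmap.xml.
IDPKG = "http://ns.adobe.com/AdobeInDesign/idml/1.0/packaging"

# Reading order from the text: shared resources, master spreads, spreads, stories.
READ_ORDER = ["Graphic", "Styles", "MasterSpread", "Spread", "Story"]

# Invented manifest; the entries are deliberately out of reading order.
SAMPLE_DESIGNMAP = f"""<Document xmlns:idPkg="{IDPKG}">
  <idPkg:Story src="Stories/Story_u1.xml"/>
  <idPkg:Spread src="Spreads/Spread_u2.xml"/>
  <idPkg:Graphic src="Resources/Graphic.xml"/>
  <idPkg:MasterSpread src="MasterSpreads/MasterSpread_u3.xml"/>
  <idPkg:Styles src="Resources/Styles.xml"/>
</Document>"""


def parts_in_reading_order(designmap_xml: str) -> list[tuple[str, str]]:
    """Return (kind, path) pairs in the order a reader would fetch them."""
    root = ET.fromstring(designmap_xml)
    parts = []
    for kind in READ_ORDER:
        # ElementTree spells a namespaced tag as "{uri}localname".
        for element in root.iter(f"{{{IDPKG}}}{kind}"):
            src = element.get("src")
            if src:
                parts.append((kind, src))
    return parts


if __name__ == "__main__":
    for kind, path in parts_in_reading_order(SAMPLE_DESIGNMAP):
        print(f"{kind:13} {path}")
```

Each returned path would then be looked up among the entries of the opened container, so the manifest, not the archive listing, decides what counts as part of the document.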

Frequently asked questions

Why does an XML interchange format exist alongside the native binary?

The native binary is built for one application: it is compact and fast to load, but tied to that app's internals and not safely readable by anything else. The XML interchange form exists so that other software can reliably read, diff, generate, and transform the same document without reverse-engineering a private byte layout.

Does the XML form contain everything the binary does?

No, and that is by design. The interchange form keeps what a second tool needs to reconstruct the document: page geometry, text and its flow, styles, colors, and the references that tie them together. Application-only state such as live caches and undo history stays in the binary.

Why are some values, like text variables, stored as fixed results instead of live instructions?

The interchange form records certain values as facts rather than as instructions to recompute. A text variable, for example, carries the text it resolved to at export time; whatever opens the file decides whether to refresh it. Our reader honors that by capturing the resolved result faithfully rather than recomputing it.
