Paged · IDML Reference
Test corpus

The test corpus is how Paged proves it reads IDML correctly: a license-clear set of generated fixtures diffed pixel by pixel against InDesign's own PDF exports, plus the hand-authored examples in these docs, validated against the real renderer in CI.

The test corpus is the evidence behind every claim this reference makes about what the renderer does.

In short: Paged keeps two bodies of test material, and this chapter explains both. The first is the fidelity corpus in the engine: a set of generated .idml fixtures, each paired with a PDF that Adobe InDesign exported from the same document, diffed pixel-for-pixel so the renderer's output can be measured against the reference that defines "correct." The second is the examples loop in these docs: the small, hand-authored IDML packages you see embedded throughout the reference, each one assembled and run through the real renderer on every build so a page can never quietly describe something the renderer rejects. One loop guards how faithfully we paint; the other guards how honestly we write. Together they are why this reference can say "the renderer does X" and mean it.

When a renderer claims to handle a format, the only claim worth anything is one you can re-run. "We support gradients" is marketing; "this gradient fixture diffs under 0.2 mean ΔE against InDesign's own export, and here is the script that checks it on every commit" is a fact. The test corpus is where Paged turns claims into facts — and where this reference borrows its right to make them.

Two loops, two questions

The two bodies of test material answer two different questions, and it is worth keeping them apart from the start.

The fidelity loop asks: do the pixels match? It lives in the engine repo (core). A generator emits IDML fixtures; InDesign exports a reference PDF from the same documents; the renderer rasterises its own version; an image-diff tool (ΔE2000 colour difference plus SSIM structural similarity) compares the renderer's pages with the reference, page by page, against per-fixture tolerances. If the renderer drifts away from InDesign's output, the numbers move past tolerance and a CI gate fails. This is a visual test: it renders actual pixels and compares them.
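To make the "per-fixture tolerances" idea concrete, here is a minimal sketch of the gating step only. It assumes an image-diff tool has already reported a mean ΔE2000 and an SSIM score for each page; the names `Tolerance`, `PageMetrics`, and `gate`, and the threshold values, are hypothetical and are not taken from the engine repo.

```python
from __future__ import annotations

from dataclasses import dataclass


@dataclass(frozen=True)
class Tolerance:
    """Per-fixture limits (hypothetical names and values)."""
    max_mean_delta_e: float  # mean ΔE2000 must stay at or below this
    min_ssim: float          # SSIM must stay at or above this


@dataclass(frozen=True)
class PageMetrics:
    """Metrics an image-diff tool is assumed to report for one page."""
    page: int
    mean_delta_e: float
    ssim: float


def gate(fixture: str, pages: list[PageMetrics], tol: Tolerance) -> list[str]:
    """Return one message per metric that falls outside the fixture's tolerance."""
    failures: list[str] = []
    for m in pages:
        if m.mean_delta_e > tol.max_mean_delta_e:
            failures.append(
                f"{fixture} p{m.page}: mean ΔE {m.mean_delta_e:.3f} "
                f"> {tol.max_mean_delta_e:.3f}"
            )
        if m.ssim < tol.min_ssim:
            failures.append(
                f"{fixture} p{m.page}: SSIM {m.ssim:.3f} < {tol.min_ssim:.3f}"
            )
    return failures


if __name__ == "__main__":
    # Illustrative numbers only; real tolerances live with each fixture.
    tol = Tolerance(max_mean_delta_e=0.2, min_ssim=0.98)
    pages = [PageMetrics(1, 0.12, 0.995), PageMetrics(2, 0.35, 0.991)]
    problems = gate("gradient", pages, tol)
    for line in problems:
        print(line)
    raise SystemExit(1 if problems else 0)
```

An empty list means the fixture passes. A non-zero exit status is what would fail a CI job in this sketch.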

The examples loop asks: does the renderer still accept what the docs describe? It lives in this docs repo. Every <ExampleEmbed> you see in the reference is a real, hand-authored IDML package; on every build, each one is zipped into a .idml and fed to the renderer's introspection tool. If the renderer stops accepting an example — a parser change, a removed feature — the docs build breaks. This is a structural test: it parses the package and builds the internal model, but it does not rasterise. No GPU, no fonts, no pixels.
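The "zipped into a .idml" step can be sketched with the Python standard library alone. This is an illustration, not the docs build script: `assemble_idml` is a hypothetical helper, and it assumes the usual IDML packaging convention that the `mimetype` entry comes first in the archive and is stored uncompressed. Handing the result to the renderer's introspection tool is left out, because the tool's command line is not described on this page.

```python
import zipfile
from pathlib import Path


def assemble_idml(parts_dir: Path, out_path: Path) -> Path:
    """Zip an unpacked IDML example directory into a single .idml file.

    Assumption: `mimetype`, when present, must be the first entry and
    must be stored without compression.
    """
    mimetype = parts_dir / "mimetype"
    # Collect the remaining parts before the archive is created, in a
    # stable order so repeated builds produce the same entry sequence.
    parts = sorted(
        p for p in parts_dir.rglob("*")
        if p.is_file()
        and p != mimetype
        and p.resolve() != out_path.resolve()
    )
    with zipfile.ZipFile(out_path, "w") as zf:
        if mimetype.is_file():
            zf.write(mimetype, "mimetype", compress_type=zipfile.ZIP_STORED)
        for path in parts:
            arcname = path.relative_to(parts_dir).as_posix()
            zf.write(path, arcname, compress_type=zipfile.ZIP_DEFLATED)
    return out_path
```

Because the function only reads files and writes a ZIP archive, it needs no fonts, GPU, or renderer, which matches the structural nature of this loop.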

The distinction matters because the two loops have different reach, and we are honest about it. The fidelity gate is pixel-deep but only covers the generated fixtures. The examples gate is broad — it touches every documented construct that has an example — but it is shallow: it proves the package parses and builds, not that it looks right. Neither alone is enough; this chapter is about how they fit together.

Where to go next

  • How we test — the fidelity loop in detail: the fixture generator, the InDesign reference PDFs, the ΔE2000 / SSIM gate, the CPU backend for headless CI, and the discipline of never loosening a threshold to hide a regression.
  • The docs examples loop — how the examples in this reference are stored as unzipped IDML parts, assembled to a real .idml, and validated against the pinned renderer in CI — the living-docs guarantee.

Frequently asked questions

Why keep two separate test corpora instead of one? Because they answer different questions and have different costs. The fidelity corpus needs InDesign in the loop to export reference PDFs and a rasteriser to produce comparable pixels, so it is expensive to grow and lives with the engine. The docs examples are cheap, hand-written, license-clear fragments that anyone can read in review, and they only need the renderer's parser and model-builder — no fonts, no GPU. Folding them into one corpus would either make the examples expensive or make the fidelity tests shallow. Keeping them apart lets each be the best version of itself.

Does a passing test corpus mean the renderer is bug-free? No, and the chapter is careful never to imply it. The fidelity gate only proves the generated fixtures match InDesign within tolerance; constructs no fixture exercises are simply untested visually. The examples gate only proves the renderer accepts and models each example; it never checks that the result looks right. A construct can be green on both gates and still have a rendering bug in a case neither corpus covers. The corpus narrows the unknowns; it does not eliminate them.

What is the difference between the fidelity loop and the examples loop? They guard different things and measure at different depths. The fidelity loop lives in the engine (core): it diffs the renderer's rasterised pages against InDesign's own reference PDFs using ΔE2000 and SSIM, so it is pixel-deep but only covers the generated fixtures. The examples loop lives in these docs: it runs every embedded IDML example through the renderer's introspection tool, checking the package still parses and builds — a structural test that is broad across documented constructs but never rasterises, so it says nothing about pixels. One proves the pages look right; the other proves the docs stay honest.
