The pipeline

How Paged's renderer turns an IDML package into pixels: a five-stage relay across focused Rust crates (parse, model, layout, compose, and rasterize), each owning one job and handing its output forward.

The pipeline is the five-stage relay that carries an IDML package from XML to pixels: parse, model, layout, compose, rasterize.

In short: The renderer is not one big function — it's a chain of small, focused crates, each owning a single transformation and handing its result to the next. The parse stage turns the ZIP's XML into a typed tree. The model stage resolves that tree into a usable document with styles applied and frame chains threaded. The layout stage shapes and breaks text into lines. The compose stage flattens the laid-out scene into a versioned display list. The rasterize stage paints that list to pixels. This page walks each stage, names the crate that owns it, and shows the top-level Rust entry point that drives the whole relay.

The shape of the engine follows the shape of the problem. IDML is layered — a package of XML parts, references between them, styles that cascade, text that flows through chained frames — so the renderer is layered to match. The data only ever moves in one direction:

paged-parse → paged-scene → paged-text → paged-compose → paged-gpu
                                              ↘ paged-renderer (drives all of it)
                                                   ↘ paged-fidelity (diffs the result)

paged-renderer sits on top and coordinates the others; paged-fidelity is the testing harness that diffs a render against an InDesign-exported reference. Let's walk the five stages in order.

Stage 1 — Parse: ZIP and XML to a typed tree

The first job is to get out of XML and into Rust types. The parse stage opens the IDML package (a ZIP archive), reads designmap.xml to learn the document's shape, and walks the parts it references — spreads, stories, styles, the graphic resources. Every element it understands becomes a typed struct: a TextFrame, a Rectangle, an Oval, a Polygon, a Story, a paragraph, a run.

This is also where the format's quirks get absorbed once, so nothing downstream has to think about them. Compound paths record their per-contour boundaries (so holes render as holes, not filled-in blobs). The BasedOn style cascade is captured. Bullets, numbering, tab stops, and ItemTransform matrices all land as data here. The output is a faithful, typed mirror of the package — no layout decisions made yet.
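To make the compound-path point concrete, here is a minimal sketch of one way to keep per-contour boundaries: all anchor points live in one flat list, and a second list records where each contour starts. The `CompoundPath` type and its fields are hypothetical illustrations, not the actual paged-parse structs.

```rust
/// Hypothetical stand-in for a parsed compound path: every anchor point in
/// one flat list, plus the index at which each contour begins.
#[derive(Debug, Clone, PartialEq)]
pub struct CompoundPath {
    pub points: Vec<(f64, f64)>,
    pub contour_starts: Vec<usize>,
}

impl CompoundPath {
    /// Flatten separate contours into one point list, recording boundaries.
    pub fn from_contours(contours: &[Vec<(f64, f64)>]) -> Self {
        let mut points = Vec::new();
        let mut contour_starts = Vec::new();
        for contour in contours {
            contour_starts.push(points.len());
            points.extend_from_slice(contour);
        }
        Self { points, contour_starts }
    }

    /// Iterate over the contours as slices of the flat point list.
    pub fn contours(&self) -> impl Iterator<Item = &[(f64, f64)]> + '_ {
        self.contour_starts.iter().enumerate().map(move |(i, &start)| {
            // A contour ends where the next one starts, or at the end of the list.
            let end = self
                .contour_starts
                .get(i + 1)
                .copied()
                .unwrap_or(self.points.len());
            &self.points[start..end]
        })
    }
}
```

With the boundaries preserved, a later stage can treat the second contour as a hole in the first instead of joining all the points into one outline.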

Stage 2 — Model: resolve the document

A typed tree still isn't a document you can lay out. References need following and styles need resolving. The model stage — entered through Document::open — turns the raw AST into a coherent Document: it resolves the style cascade so each paragraph knows its effective properties, matches each story to the frame it starts in, and threads chained frames into a frame_chain so a story that overflows one text frame continues into the next.
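Resolving a BasedOn cascade can be sketched as a walk up the parent chain that keeps the first value found for each property. The `Style` and `ResolvedStyle` types, their fields, and the `resolve` function below are assumptions made for illustration; the real paged-scene types are not shown on this page.

```rust
use std::collections::HashMap;

/// Hypothetical paragraph style: each property is optional, and unset
/// properties are inherited through `based_on`.
#[derive(Debug, Clone, Default)]
pub struct Style {
    pub based_on: Option<String>,
    pub point_size: Option<f32>,
    pub font_family: Option<String>,
}

/// Hypothetical fully resolved style with no optional properties left.
#[derive(Debug, Clone, PartialEq)]
pub struct ResolvedStyle {
    pub point_size: f32,
    pub font_family: String,
}

/// Walk the BasedOn chain upward from `name`, taking the nearest value for
/// each property and falling back to `defaults` for anything never set.
pub fn resolve(
    styles: &HashMap<String, Style>,
    name: &str,
    defaults: &ResolvedStyle,
) -> ResolvedStyle {
    let mut point_size = None;
    let mut font_family = None;
    let mut current = Some(name.to_string());
    let mut hops = 0;
    while let Some(key) = current {
        let Some(style) = styles.get(&key) else { break };
        point_size = point_size.or(style.point_size);
        font_family = font_family.or_else(|| style.font_family.clone());
        current = style.based_on.clone();
        // Guard against a cyclic BasedOn chain in a malformed package.
        hops += 1;
        if hops > styles.len() {
            break;
        }
    }
    ResolvedStyle {
        point_size: point_size.unwrap_or(defaults.point_size),
        font_family: font_family.unwrap_or_else(|| defaults.font_family.clone()),
    }
}
```

For example, a "Caption" style that sets only its point size and is based on "Body" would resolve to its own size and Body's font family.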

This is the stage that answers "where does this text actually go, and what does it look like?" — before a single glyph is shaped. Everything after this point reads a resolved Document and never has to chase a reference back through the package.
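Threading a story's frames can be pictured as following "next frame" links from the frame the story starts in. The sketch below assumes a hypothetical `TextFrame` with an `id` and an optional `next` id; it is not the real paged-scene representation of frame_chain.

```rust
use std::collections::{HashMap, HashSet};

/// Hypothetical text frame: its id and the id of the frame the story
/// continues into, if any.
pub struct TextFrame {
    pub id: String,
    pub next: Option<String>,
}

/// Follow `next` links from the story's first frame and collect the frame
/// ids in reading order. Stops at a missing frame or a repeated id.
pub fn thread_frame_chain(frames: &HashMap<String, TextFrame>, first: &str) -> Vec<String> {
    let mut chain = Vec::new();
    let mut seen = HashSet::new();
    let mut current = frames.get(first);
    while let Some(frame) = current {
        if !seen.insert(frame.id.as_str()) {
            break; // cyclic link: stop rather than loop forever
        }
        chain.push(frame.id.clone());
        current = frame.next.as_deref().and_then(|id| frames.get(id));
    }
    chain
}
```

Once the chain is an ordered list, the layout stage can pour overflow text into the next entry without looking at the package again.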

Stage 3 — Layout: shape and break text

Now the text becomes geometry. The layout stage shapes each run into positioned glyphs (the right glyph for each character in the chosen font, with the right advances and kerning), then breaks the shaped text into lines that fit the column.

Line-breaking uses Knuth–Plass — the same total-fit algorithm InDesign's paragraph composer is built on — calibrated against InDesign so the line breaks land where a designer expects, not just somewhere legal. Hyphenation, multi-font runs, and tab-stop alignment all happen here. The result is a set of laid-out lines: for each line, which glyphs sit where, at what baseline.
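The idea behind total-fit breaking can be shown with a much-reduced dynamic program. This is not Paged's calibrated implementation and not full Knuth–Plass: there are no boxes, glue, penalties, or hyphenation, only integer word widths and a cost equal to the squared leftover space on every line except the last. The function name and signature are assumptions for illustration.

```rust
/// Choose line breaks for words of the given `widths` in a column of
/// `max_width`, minimising the total squared slack over all lines except
/// the last. Returns the index of the first word on each line, or `None`
/// if some word is wider than the column.
pub fn break_lines(widths: &[u32], space: u32, max_width: u32) -> Option<Vec<usize>> {
    let n = widths.len();
    if n == 0 {
        return Some(Vec::new());
    }
    // best[j]: minimal cost of setting words[..j] as complete lines.
    let mut best: Vec<Option<u64>> = vec![None; n + 1];
    // prev[j]: first word of the last line in that best solution.
    let mut prev = vec![0usize; n + 1];
    best[0] = Some(0);
    for j in 1..=n {
        // Try every candidate line words[i..j], growing it backwards.
        let mut line_width = 0u32;
        for i in (0..j).rev() {
            line_width += widths[i] + if i + 1 < j { space } else { 0 };
            if line_width > max_width {
                break; // adding more words only makes the line wider
            }
            let Some(before) = best[i] else { continue };
            let slack = u64::from(max_width - line_width);
            // The final line is allowed to be short at no cost.
            let line_cost = if j == n { 0 } else { slack * slack };
            let total = before + line_cost;
            if best[j].map_or(true, |b| total < b) {
                best[j] = Some(total);
                prev[j] = i;
            }
        }
    }
    best[n]?; // no feasible set of breaks
    let mut starts = Vec::new();
    let mut j = n;
    while j > 0 {
        starts.push(prev[j]);
        j = prev[j];
    }
    starts.reverse();
    Some(starts)
}
```

With widths `[3, 2, 2, 5]`, a space of 1, and a column of 6, a greedy first-fit breaker would put the first two words on line one and strand the third word alone on line two. The total-fit search instead breaks after the first word, because the slightly looser first line buys a much tighter second one.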

Stage 4 — Compose: flatten to a display list

With text laid out and frame geometry known, the compose stage walks the whole scene — frames, fills, strokes, images, gradients, effects, and every glyph — and flattens it into a single flat command stream: the display list. Each entry is a DisplayCommand such as "fill this path with this paint under this transform" or "place this image here." Repeated shapes — glyphs above all — share interned path data through a path buffer, so the letter "e" is tessellated once and referenced many times.
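The interning described above can be sketched as a buffer that hands out a small id per distinct outline, so commands carry ids rather than geometry. Everything below is an illustrative assumption: the real DisplayCommand has more variants, and the outline is reduced to a string key here instead of tessellated path data.

```rust
use std::collections::HashMap;

/// Handle to an outline stored in the path buffer.
#[derive(Debug, Clone, Copy, PartialEq, Eq, Hash)]
pub struct PathId(pub usize);

/// Hypothetical path buffer: stores each distinct outline once.
#[derive(Default)]
pub struct PathBuffer {
    paths: Vec<String>,
    index: HashMap<String, PathId>,
}

impl PathBuffer {
    /// Return the existing id for `outline`, or store it and mint a new one.
    pub fn intern(&mut self, outline: &str) -> PathId {
        if let Some(&id) = self.index.get(outline) {
            return id;
        }
        let id = PathId(self.paths.len());
        self.paths.push(outline.to_string());
        self.index.insert(outline.to_string(), id);
        id
    }

    /// Number of distinct outlines stored.
    pub fn distinct_paths(&self) -> usize {
        self.paths.len()
    }
}

/// Hypothetical, single-variant stand-in for the real command set.
#[derive(Debug, Clone, PartialEq)]
pub enum DisplayCommand {
    FillPath { path: PathId, rgba: [u8; 4], translate: (f32, f32) },
}

/// Hypothetical display list: a version tag, a flat command stream, and
/// the shared path buffer the commands refer into.
#[derive(Default)]
pub struct DisplayList {
    pub version: u32,
    pub commands: Vec<DisplayCommand>,
    pub paths: PathBuffer,
}

impl DisplayList {
    /// Append a fill command, interning the outline on the way in.
    pub fn fill(&mut self, outline: &str, rgba: [u8; 4], translate: (f32, f32)) {
        let path = self.paths.intern(outline);
        self.commands.push(DisplayCommand::FillPath { path, rgba, translate });
    }
}
```

Filling the same glyph outline at two positions yields two commands that share one `PathId`, which is the property that keeps a page of text from storing every repeated letter separately.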

The display list is the renderer's pivot point. It is a small, self-contained, versioned intermediate representation of what to paint, deliberately decoupled from how it gets painted. That decoupling is what the next stage — and the whole rendering-backends story — depends on. It's important enough to have its own reference page.

Stage 5 — Rasterize: paint the pixels

The last stage consumes the display list and produces actual pixels: an RGBA8 image, one per page. It walks the command stream and asks a rasterizer to fill each path, stroke each outline, place each image, and composite each effect.

There are two rasterizer implementations behind one PathRasterizer trait. The WebGPU backend, built on Vello, is the forward path — it's what drives the SDK, the viewer, and this site's live preview. A CPU rasterizer (tiny-skia) is kept as the always-available, deterministic backend used by headless CI and the fidelity gate. Which one runs is a backend choice made at the edge; the display list feeding them is identical. The rendering-backends page covers the trade-offs.
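The "one trait, two backends" arrangement might look roughly like the sketch below. The methods of the real PathRasterizer trait are not shown on this page, so the `Rasterizer` trait, its single `fill_rect` method, the `Command` enum, and both backends are hypothetical stand-ins reduced to axis-aligned rectangles.

```rust
/// Hypothetical, much-reduced command set.
pub enum Command {
    FillRect { x: u32, y: u32, w: u32, h: u32, rgba: [u8; 4] },
}

/// Assumed trait shape standing in for PathRasterizer.
pub trait Rasterizer {
    fn fill_rect(&mut self, x: u32, y: u32, w: u32, h: u32, rgba: [u8; 4]);
}

/// Toy CPU backend that writes straight into an RGBA8 buffer.
pub struct CpuBackend {
    pub width: u32,
    pub height: u32,
    pub pixels: Vec<u8>,
}

impl CpuBackend {
    pub fn new(width: u32, height: u32) -> Self {
        Self { width, height, pixels: vec![0; (width * height * 4) as usize] }
    }
}

impl Rasterizer for CpuBackend {
    fn fill_rect(&mut self, x: u32, y: u32, w: u32, h: u32, rgba: [u8; 4]) {
        // Clip the rectangle to the surface, then overwrite each pixel.
        for py in y..(y + h).min(self.height) {
            for px in x..(x + w).min(self.width) {
                let i = ((py * self.width + px) * 4) as usize;
                self.pixels[i..i + 4].copy_from_slice(&rgba);
            }
        }
    }
}

/// Stand-in for a second backend: counts calls instead of painting.
#[derive(Default)]
pub struct RecordingBackend {
    pub calls: usize,
}

impl Rasterizer for RecordingBackend {
    fn fill_rect(&mut self, _x: u32, _y: u32, _w: u32, _h: u32, _rgba: [u8; 4]) {
        self.calls += 1;
    }
}

/// Walk the command stream once; the backend is chosen by the caller.
pub fn paint(commands: &[Command], backend: &mut dyn Rasterizer) {
    for command in commands {
        match command {
            Command::FillRect { x, y, w, h, rgba } => backend.fill_rect(*x, *y, *w, *h, *rgba),
        }
    }
}
```

The `paint` function never learns which backend it is driving, which mirrors the point made above: the choice happens at the edge, and the command stream is the same either way.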

The crate map

Each stage is a crate. Here's the whole relay, end to end, with the responsibility each crate owns.

Crate	Stage	Support	Notes
paged-parse	parse	Supported	ZIP + XML → typed AST. Container, designmap, spreads (TextFrame / Rectangle / Oval / Polygon / GraphicLine / Group), stories, graphic + gradients, ItemTransform, styles + BasedOn cascade, bullets/numbering, compound-path subpaths.
paged-scene	model	Supported	Document::open. Resolves the style cascade, matches stories to frames, threads chained frames (frame_chain). The resolved Document everything downstream reads.
paged-text	layout	Supported	Shaping (shape_run), paragraph composition, multi-font line layout. InDesign-calibrated Knuth–Plass line breaking, hyphenation, tab-stop alignment.
paged-compose	compose	Supported	Walks the laid-out scene and emits the versioned DisplayList (DisplayCommand stream). Owns the path buffer / glyph cache and the gradient + image pools.
paged-gpu	rasterize	Supported	Owns the PathRasterizer trait. WebGPU/Vello forward backend + tiny-skia CPU backend. DisplayList → RGBA8 pixels.
paged-renderer	coordinate	Supported	The top-level engine. build_document / render_document drive parse → … → raster; PipelineOptions carries the knobs. Re-exports Document, DisplayList, DisplayCommand so consumers depend on one crate.
paged-fidelity	verify	Supported	Not in the render path. The diff harness (ΔE2000 + SSIM) that compares a render against an InDesign-exported reference PDF in CI.

One call drives the whole relay

The top-level paged-renderer crate is the public Rust surface. A host — the WASM binding, a native tool, the fidelity harness — hands it a resolved Document plus a PipelineOptions, and gets back built pages (and, with the CPU backend, rasterized images):

use paged_renderer::{Document, PipelineOptions, pipeline};

// Document::open runs the parse + model stages.
let document = Document::open(idml_bytes)?;

let options = PipelineOptions {
    default_point_size: 12.0,
    ..PipelineOptions::default()
};

// build_document runs layout + compose: one BuiltPage per <Page>,
// each carrying its own versioned DisplayList.
let built = pipeline::build_document(&document, &options)?;

// Rasterize is a separate, backend-specific step (CPU shown here).
let images = built.pages
    .iter()
    .map(|page| pipeline::render_built_page(page, 300.0, paged_compose::Color::WHITE))
    .collect::<Vec<_>>();

Two things are worth noticing. First, build and rasterize are separate calls: build_document produces the display lists, and rasterizing is a deliberate second step — so the same built pages can be painted by the CPU backend in CI or by the WebGPU backend in the live preview, without rebuilding. Second, PipelineOptions carries everything tunable — font fallbacks, the CMYK ICC profile, image-decode caches, and the per-page line-layout capture the canvas uses for hit-testing and caret placement.

Frequently asked questions

Why split the renderer into so many crates? Because each transformation is genuinely separable, and separating them keeps each one honest. The parser never thinks about pixels; the rasterizer never re-reads XML. The clean seam between compose and rasterize — the display list — is what lets one engine drive both a deterministic CPU render and a live WebGPU preview from a single source of truth.

Does the pipeline ever loop back? No. It's a one-directional relay. Each stage reads the previous stage's output and produces the next stage's input; nothing flows backwards. (Incremental re-layout for the interactive editor reuses cached deltas rather than reversing the flow.)

What's the difference between build and build_document? build_document is the real entry point: it produces one BuiltPage per <Page> in the document, each with its own display list, dimensions, and stats. build is a historical single-page variant that unions all page bounds into one surface; new callers should use build_document.

Where does the fidelity harness fit in? Off to the side, after rasterization. paged-fidelity is not part of producing a page — it's how we check a page, diffing a render against an InDesign-exported reference PDF using ΔE2000 colour difference and SSIM. It's the reason the CPU backend stays around; see rendering backends.
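For a feel of what a structural-similarity score measures, here is a simplified sketch that computes a single global SSIM value over two greyscale buffers. This is an assumption-laden reduction, not the paged-fidelity implementation: real SSIM is normally computed over local windows and averaged, and the ΔE2000 colour comparison is not shown at all. The function name is hypothetical.

```rust
/// Global (single-window) SSIM between two 8-bit greyscale buffers.
/// Returns `None` if the buffers are empty or differ in length.
pub fn global_ssim(a: &[u8], b: &[u8]) -> Option<f64> {
    if a.is_empty() || a.len() != b.len() {
        return None;
    }
    let n = a.len() as f64;
    let mean = |v: &[u8]| v.iter().map(|&p| f64::from(p)).sum::<f64>() / n;
    let (mean_a, mean_b) = (mean(a), mean(b));

    // Accumulate variances and covariance in one pass.
    let (mut var_a, mut var_b, mut cov) = (0.0, 0.0, 0.0);
    for (&pa, &pb) in a.iter().zip(b) {
        let (da, db) = (f64::from(pa) - mean_a, f64::from(pb) - mean_b);
        var_a += da * da;
        var_b += db * db;
        cov += da * db;
    }
    var_a /= n;
    var_b /= n;
    cov /= n;

    // Standard stabilising constants for an 8-bit dynamic range.
    let c1 = (0.01f64 * 255.0).powi(2);
    let c2 = (0.03f64 * 255.0).powi(2);

    Some(
        ((2.0 * mean_a * mean_b + c1) * (2.0 * cov + c2))
            / ((mean_a * mean_a + mean_b * mean_b + c1) * (var_a + var_b + c2)),
    )
}
```

Identical buffers score 1.0, and the score falls as structure diverges, so a CI gate can compare it against a threshold.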
