Text variables
How TextVariableInstance and auto-page-number markers become characters in a run — frozen ResultText snapshots and private-use placeholder markers.
Text variables and page-number markers are computed characters that the parser inlines into a story's run text.
In short: Not every character in a story is typed — a running header, a chapter
number, a file name, or the current page number is computed. InDesign stores these
as text variables (TextVariableInstance, carrying a frozen ResultText) and
page-number markers (processing instructions inside Content). Our parser inlines
the variable's value and substitutes private-use placeholders for the page-number
markers, so by the time you have a run string a variable is just more characters. This
page is the reference for both.
Two pieces: the definition and the instance
A text variable has a definition in the design map and one or more instances in the stories.
The definition is a TextVariable element under Document. The parser records
its identity and type — paged-parse/src/designmap.rs:227.
| Attribute · TextVariable (designmap.xml) | Type / values | Support | Notes |
|---|---|---|---|
| Self | string id (TextVariable/…) | Supported | The id an instance references via AssociatedTextVariable. |
| Name | string | Supported | The variable’s display name (e.g. "Chapter Number"). |
| VariableType | string | Supported | What kind of variable it is (running header, file name, page number, …). |
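As a rough sketch of what recording a definition's identity and type could look like, the block below mirrors the three attributes in a plain struct and indexes definitions by their Self id. The struct, its field names, and `index_definitions` are hypothetical illustrations, not the types in paged-parse/src/designmap.rs.

```rust
use std::collections::HashMap;

/// Hypothetical mirror of the three attributes read from a `TextVariable`
/// element. The real parser's type may look different.
#[derive(Debug, Clone, PartialEq)]
pub struct TextVariableDef {
    /// `Self`: the id an instance references via `AssociatedTextVariable`.
    pub self_id: String,
    /// `Name`: the display name, e.g. "Chapter Number".
    pub name: String,
    /// `VariableType`: kept here as the raw attribute string.
    pub variable_type: String,
}

/// Index definitions by their `Self` id so that an instance's
/// back-reference can be looked up later.
pub fn index_definitions(defs: Vec<TextVariableDef>) -> HashMap<String, TextVariableDef> {
    defs.into_iter().map(|d| (d.self_id.clone(), d)).collect()
}
```

Keying on Self keeps the lookup a single map access when an instance's AssociatedTextVariable needs to be matched to its definition.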
The instance is a TextVariableInstance element sitting inside a
CharacterStyleRange, where the variable's value appears in the flowed text. It
carries the value verbatim in ResultText, plus a back-reference to its
definition. When the parser reaches one, it appends ResultText to the current
run — paged-parse/src/story.rs:1643.
| Attribute · TextVariableInstance (inside a CharacterStyleRange) | Type / values | Support | Notes |
|---|---|---|---|
| ResultText | string | Supported | The computed value, inlined into the run text exactly as written. |
| AssociatedTextVariable | string ref (TextVariable/…) | Parsed, not yet rendered | Points back at the definition. Read, but not used to recompute the value. |
ResultText is a snapshot. It is whatever InDesign computed the last time it
composed the document — so a "Running Header" instance carries the header text as
of export, and a "File Name" instance carries the file name as of export. The
renderer inlines that snapshot; it does not re-derive the value from the current
layout. A running header won't update if you move the paragraph it tracks, and a
page-number variable baked as text won't follow repagination.
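The inlining behaviour described above can be sketched as follows. This is a minimal illustration under assumptions: `RunChild` and `build_run_text` are hypothetical names standing in for whatever the parser uses while walking a CharacterStyleRange, and only the two child kinds discussed on this page are modelled.

```rust
/// Hypothetical, simplified view of the children of a CharacterStyleRange.
pub enum RunChild {
    /// Typed characters from a `Content` element.
    Content(String),
    /// A `TextVariableInstance` with its two attributes.
    TextVariableInstance {
        result_text: String,
        associated_text_variable: String,
    },
}

/// Build the run string. Typed content and frozen variable values are
/// appended in document order, so the result keeps no trace of which
/// characters were typed and which were computed.
pub fn build_run_text(children: &[RunChild]) -> String {
    let mut run = String::new();
    for child in children {
        match child {
            RunChild::Content(text) => run.push_str(text),
            // The back-reference is read but not used to recompute the value;
            // the snapshot in ResultText is appended exactly as written.
            RunChild::TextVariableInstance { result_text, .. } => run.push_str(result_text),
        }
    }
    run
}
```

Because the snapshot is appended as plain characters, a stale ResultText stays stale: nothing in this path consults the current layout.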
Auto-page-number markers
The current- and next-page numbers are a special case. InDesign doesn't store them
as TextVariableInstance — it writes a processing instruction inside Content:
<?ACE 18?> for the current page number and <?ACE 19?> for the next page number
(the kind familiar from "continued on page …" footers). These correspond to the
PageNumberType values AutoPageNumber and NextPageNumber.
Because the real number depends on where the frame lands, the parser can't inline a
literal. Instead it substitutes a private-use placeholder character that the
renderer replaces with the live page number at emit time —
AUTO_PAGE_NUMBER_MARKER (U+E018) and NEXT_PAGE_NUMBER_MARKER (U+E019),
defined at paged-parse/src/story.rs:45, written when the PI is seen at
paged-parse/src/story.rs:1729.
| Attribute · Page-number markers (PI inside Content) | Type / values | Support | Notes |
|---|---|---|---|
| <?ACE 18?> | processing instruction | Supported | Current-page-number marker → U+E018; renderer substitutes the live page number. |
| <?ACE 19?> | processing instruction | Supported | Next-page-number marker → U+E019; used in "continued on" footers. |
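The mapping in the table can be sketched as a small lookup. The constant names and code points come from this page; their `char` type, the `marker_for_pi` function, and the assumption that the XML reader hands over the PI target and data as separate strings are illustrative only.

```rust
/// Placeholder for the current page number (code point as documented above).
pub const AUTO_PAGE_NUMBER_MARKER: char = '\u{E018}';
/// Placeholder for the next page number (code point as documented above).
pub const NEXT_PAGE_NUMBER_MARKER: char = '\u{E019}';

/// Map a processing instruction, given as target and data (for example
/// "ACE" and "18"), to its placeholder character. Any other processing
/// instruction returns `None`; what the real parser does with those is
/// outside this sketch.
pub fn marker_for_pi(target: &str, data: &str) -> Option<char> {
    if target != "ACE" {
        return None;
    }
    match data.trim() {
        "18" => Some(AUTO_PAGE_NUMBER_MARKER),
        "19" => Some(NEXT_PAGE_NUMBER_MARKER),
        _ => None,
    }
}
```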
These markers are deliberately picked from the Unicode Private Use Area (U+E000 through U+F8FF), where Unicode assigns no standard characters, so ordinary authored text does not produce them. If you extract text and see U+E018 or U+E019, that is a live page-number marker, not stray data: map it back to a page number (or strip it) for a plain-text export.
Why this matters for extraction
When you extract all text, a
TextVariableInstance contributes its ResultText like any other characters, so a
running header shows up in your output as the frozen header string. The page-number
markers, by contrast, come through as the U+E018 / U+E019 placeholders — decide per
export whether to resolve them, leave them, or drop them.
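The three choices (resolve, leave, drop) can be sketched as a post-processing pass over extracted text. `MarkerPolicy` and `apply_marker_policy` are hypothetical helpers, not part of the parser, and representing page numbers as `u32` is a simplification: real documents may use roman numerals or section prefixes.

```rust
const AUTO_PAGE_NUMBER_MARKER: char = '\u{E018}';
const NEXT_PAGE_NUMBER_MARKER: char = '\u{E019}';

/// Hypothetical per-export choice for the two placeholder characters.
pub enum MarkerPolicy {
    /// Replace markers with known page numbers; `next` may be unknown.
    Resolve { current: u32, next: Option<u32> },
    /// Leave the placeholders in the output untouched.
    Keep,
    /// Remove the placeholders entirely.
    Strip,
}

/// Apply the chosen policy to a string of extracted run text.
pub fn apply_marker_policy(text: &str, policy: &MarkerPolicy) -> String {
    let mut out = String::with_capacity(text.len());
    for ch in text.chars() {
        match (ch, policy) {
            (AUTO_PAGE_NUMBER_MARKER, MarkerPolicy::Resolve { current, .. }) => {
                out.push_str(&current.to_string())
            }
            (NEXT_PAGE_NUMBER_MARKER, MarkerPolicy::Resolve { next: Some(n), .. }) => {
                out.push_str(&n.to_string())
            }
            // No next page known: drop the marker rather than guess.
            (NEXT_PAGE_NUMBER_MARKER, MarkerPolicy::Resolve { next: None, .. }) => {}
            (AUTO_PAGE_NUMBER_MARKER | NEXT_PAGE_NUMBER_MARKER, MarkerPolicy::Strip) => {}
            // Ordinary characters, and markers under `Keep`, pass through.
            _ => out.push(ch),
        }
    }
    out
}
```

Text that came from a TextVariableInstance needs no such pass, since its ResultText is already ordinary characters by this point.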
Frequently asked questions
What is a text variable in IDML?
A text variable is a computed piece of text — a running header, chapter number, file
name, and the like. It has a TextVariable definition in the design map and one or
more TextVariableInstance instances in the stories, each carrying the value in
ResultText.
Does the renderer recompute a text variable's value?
No. ResultText is a snapshot of whatever InDesign computed at the last export, and
the renderer inlines that snapshot rather than re-deriving it. A running header won't
update if you move the paragraph it tracks, and a baked page-number variable won't
follow repagination.
Why does extracted text contain U+E018 or U+E019?
Those are the private-use placeholders the parser substitutes for the current-page
(<?ACE 18?>) and next-page (<?ACE 19?>) markers, since the real number depends on
where the frame lands. They are chosen from the Unicode Private Use Area, where no
standard characters are assigned, so seeing one means a live page-number marker, not
stray data. Map it to a page number or strip it for plain-text export.
How are auto-page-number markers different from other text variables?
Auto-page numbers aren't stored as TextVariableInstance. InDesign writes them as
processing instructions inside Content, and because the value depends on layout the
parser can't inline a literal — it writes the U+E018 / U+E019 placeholder, which the
renderer replaces with the live page number at emit time.
Threading and overset
How one story flows across several frames in a chain, and why text that doesn't fit at the end is overset and dropped from the rendered page.
Extract all text from a document
A recipe for pulling the plain text out of every story in reading order — walk StoryList, then each story tree, concatenating Content.