Corrigenda Are Operations over Operations
Not editorial notes. Higher-order legal-state operations.
A corrigendum (oikaisuilmoitus in Finnish) is a published correction to an enacted text. Most legal information systems treat corrigenda as editorial annotations — purple text, footnotes, or silent corrections applied by consolidation editors.
They are more than that. A corrigendum is a higher-order operation: an operation that modifies a prior amendment operation, not just the base statute text.
Why this matters
Consider an amendment A that changes section 12 of a statute. A corrigendum C is published correcting the text of amendment A. The question is: what is the correct state of section 12?
If you model corrigenda as flat editorial corrections to the consolidated text, you get the right answer for "what does the text say now?" but you lose:
- The provenance chain — which version of the amendment instruction produced the current text
- The temporal semantics — the corrigendum was published at time T₂, but may correct an amendment published at T₁. Whether the correction is retroactive, prospective, or merely evidential depends on the jurisdiction and correction type.
- The composition — if another amendment B was applied between T₁ and T₂, and B was based on reading the uncorrected text of A, the interaction is invisible in a flat model.
In a compiler model, a corrigendum is a function on amendments: C(A) = A', where A is the original amendment and A' is the corrected amendment. The replay engine then applies A' instead of A, producing a different state from the same source chain.
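A minimal Python sketch of this model follows. The names Amendment, Corrigendum, and replay, and the example texts, are hypothetical illustrations and not LawVM's actual API. The sketch assumes an amendment simply replaces the text of one section and a corrigendum swaps one substring inside that amendment.

```python
from dataclasses import dataclass, replace
from typing import Optional


@dataclass(frozen=True)
class Amendment:
    """One amendment instruction: replace the text of a section (hypothetical shape)."""
    amendment_id: str
    section: str
    new_text: str
    corrected_by: Optional[str] = None  # provenance: corrigendum that produced this version


@dataclass(frozen=True)
class Corrigendum:
    """A correction that targets an amendment, not the consolidated statute text."""
    corrigendum_id: str
    target_amendment: str
    wrong: str
    right: str

    def apply(self, amendment: Amendment) -> Amendment:
        """C(A) = A': return the corrected amendment; other amendments pass through."""
        if amendment.amendment_id != self.target_amendment:
            return amendment
        if self.wrong not in amendment.new_text:
            raise ValueError(
                f"{self.corrigendum_id}: text to correct not found in {amendment.amendment_id}"
            )
        return replace(
            amendment,
            new_text=amendment.new_text.replace(self.wrong, self.right, 1),
            corrected_by=self.corrigendum_id,
        )


def replay(base, chain, corrigenda=()):
    """Apply the amendment chain to the base text, correcting each amendment first."""
    state = dict(base)
    for amendment in chain:
        for corrigendum in corrigenda:
            amendment = corrigendum.apply(amendment)
        state[amendment.section] = amendment.new_text
    return state


# Invented example data, for illustration only.
base = {"12": "The fee is 5 euros."}
a = Amendment("A", "12", "The fee is 10 euros.")
c = Corrigendum("C", "A", wrong="10 euros", right="100 euros")

uncorrected = replay(base, [a])      # state produced by A
corrected = replay(base, [a], [c])   # state produced by A' = C(A)
```

Because the corrigendum is applied to the amendment rather than to the final text, the corrected amendment keeps a pointer to the correction that produced it, and both replay results remain available from the same source chain.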
Bitemporal consequences
Corrigenda create bitemporal situations:
- Transaction time: when the corrigendum was published
- Valid time: the point in legal history that the correction applies to, which may be earlier than its publication date
Before the corrigendum, the observable publication surface is the state produced by A. After the corrigendum, the system may need to expose both the original publication-time state and the corrected legal/evidential state. The exact legal consequence is not universal; it must be classified by source regime and correction type.
A flat consolidation system can only show the current corrected state. A replay compiler can show both: the state before correction (for historical queries) and the state after correction (for current queries).
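The two query modes can be sketched as a small bitemporal lookup. Everything here is an assumption made for illustration: the Version record, the as_of function, the dates, and above all the choice to treat the corrigendum as retroactive to the amendment's entry into force. Under a prospective regime, the corrected version would instead get a valid_from equal to the corrigendum's own date.

```python
from dataclasses import dataclass
from datetime import date
from typing import Optional


@dataclass(frozen=True)
class Version:
    """One recorded state of a section (hypothetical shape)."""
    text: str
    valid_from: date   # valid time: when this text applies in legal history
    recorded_on: date  # transaction time: when this text was published


def as_of(versions, valid_at: date, known_at: date) -> Optional[str]:
    """Text in force on valid_at, using only what had been published by known_at."""
    candidates = [
        v for v in versions
        if v.valid_from <= valid_at and v.recorded_on <= known_at
    ]
    if not candidates:
        return None
    # Latest valid_from wins; among equals, the most recently published wins.
    return max(candidates, key=lambda v: (v.valid_from, v.recorded_on)).text


# Invented example data, for illustration only.
section_12 = [
    # Amendment A: published in January, in force from 1 February.
    Version("The fee is 10 euros.", date(2020, 2, 1), date(2020, 1, 10)),
    # Corrigendum C: published in March, assumed retroactive to 1 February.
    Version("The fee is 100 euros.", date(2020, 2, 1), date(2020, 3, 15)),
]

before_correction = as_of(section_12, date(2020, 2, 15), known_at=date(2020, 3, 1))
after_correction = as_of(section_12, date(2020, 2, 15), known_at=date(2020, 4, 1))
```

Both queries ask about the same day in legal history; only the publication cutoff differs, which is the distinction a flat consolidation cannot express.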
The Finnish corpus
LawVM's Finnish source corpus contains 471 corrigendum entries affecting 180 statutes. These are oikaisuilmoitukset published in Säädöskokoelma — official correction notices.
Finlex's treatment of corrigenda is architecturally interesting: correction notices are published as unstructured PDFs, not as machine-readable XML. The corrected text appears in the consolidated view, but the correction operation itself bypasses the machine-readable publication pipeline entirely. This means:
- The enacted XML contains the uncorrected text
- The consolidated view shows the corrected text
- No machine-readable artifact records the mapping between them
LawVM extracts corrigendum operations from the Säädöskokoelma source and applies them as explicit patches to the amendment instruction chain. In the Finnish corpus, corrigendum handling creates a distinct class of replay-vs-Finlex divergences where the source correction layer must be adjudicated explicitly rather than hidden as editorial cleanup.
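To make the missing artifact concrete, here is one hypothetical shape a machine-readable correction record could take. This is not how LawVM extracts corrigenda, which works from the published correction notices. It only diffs an uncorrected text against a corrected text at word level with Python's standard difflib module to show what an explicit mapping between the two might contain. The function name, record fields, and example sentences are all invented.

```python
import difflib


def correction_ops(enacted: str, corrected: str):
    """List the word-level edits that turn the enacted text into the corrected text."""
    old_words, new_words = enacted.split(), corrected.split()
    matcher = difflib.SequenceMatcher(a=old_words, b=new_words, autojunk=False)
    ops = []
    for tag, i1, i2, j1, j2 in matcher.get_opcodes():
        if tag == "equal":
            continue  # unchanged runs are not part of the correction
        ops.append({
            "op": tag,                        # "replace", "delete", or "insert"
            "at_word": i1,                    # word offset in the enacted text
            "old": " ".join(old_words[i1:i2]),
            "new": " ".join(new_words[j1:j2]),
        })
    return ops


# Invented example data, for illustration only.
enacted_text = "The fee is 10 euros per application."
corrected_text = "The fee is 100 euros per application."

ops = correction_ops(enacted_text, corrected_text)
```

A record like this can say what changed, but not why or on what authority. That is the reason the correction notice itself, rather than a diff, has to be the source of the patch.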
The paradox
LawVM without corrigendum patches may match Finlex better on the similarity metric, while LawVM with corrigendum patches may better reflect the correction source. The patched replay can therefore score worse against Finlex, because the benchmark measures agreement with Finlex, not legal correctness.
This is a concrete example of why a single similarity score is misleading. The system that agrees more with the oracle is not always the system that is more correct. The residual taxonomy exists to make this distinction visible.
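A toy illustration of the effect, using Python's standard difflib module as a stand-in similarity metric. The metric, the residual labels, and the example texts are assumptions made for this sketch. They are not LawVM's actual benchmark or taxonomy, and the oracle text is an invented case in which the oracle does not reflect the correction.

```python
import difflib


def similarity(a: str, b: str) -> float:
    """Character-level similarity in [0, 1]; a stand-in for the benchmark metric."""
    return difflib.SequenceMatcher(a=a, b=b, autojunk=False).ratio()


def classify_residual(replay_text: str, oracle_text: str, correction_source_text: str) -> str:
    """Label a replay-vs-oracle difference instead of reducing it to a score."""
    if replay_text == oracle_text:
        return "match"
    if replay_text == correction_source_text:
        # The replay disagrees with the oracle but agrees with the correction notice.
        return "corrigendum_divergence"
    return "unexplained"


# Invented example data, for illustration only.
oracle = "The fee is 10 euros."              # what the oracle shows in this invented case
correction_source = "The fee is 100 euros."  # what the correction notice says
unpatched = "The fee is 10 euros."           # replay without corrigendum patches
patched = "The fee is 100 euros."            # replay with corrigendum patches

unpatched_score = similarity(unpatched, oracle)
patched_score = similarity(patched, oracle)
patched_label = classify_residual(patched, oracle, correction_source)
```

Here the unpatched replay scores a perfect 1.0 while the patched replay scores lower, yet the label on the patched result records that the gap comes from following the correction source rather than from a replay error.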