Replay Bugs vs Editorial Artifacts

Telling them apart is the hard part.

Elias Kunnas

When LawVM's replayed text diverges from Finlex's editorial consolidation, the natural assumption is that LawVM is wrong. Sometimes it is. But a significant fraction of divergences turn out to be the other way around.

A replay bug

A replay bug is a defect in LawVM's parser or compiler. The source material is sufficient, but the system produced incorrect state.

Example: 1998/745 — chapter repeal lost in coordinate parsing. Amendment 2012/475 said "2 §, 3, 4, 6 ja 7 luku sekä 40 §" — section 2, chapters 3, 4, 6, 7, and section 40. The bare numbers "3, 4, 6, 7" inherit the trailing "luku" (chapter), but LawVM's parser degraded them into section repeals. Five entire chapters were being replayed as if only individual sections were repealed. Similarity: 29.8%.

This is unambiguously a LawVM bug. The amendment preamble is clear. The fix was in the frontend coordinate parser. After the fix, similarity jumped to 97.4%.
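The inheritance rule at the heart of this bug can be sketched as follows. This is an illustrative reconstruction, not LawVM's actual parser: the `Coordinate` type, the `parse_coordinates` helper, and the normalization of the Finnish connectives "ja"/"sekä" are all assumptions made for the example.

```python
import re
from dataclasses import dataclass

@dataclass(frozen=True)
class Coordinate:
    number: int
    unit: str  # "section" (§) or "chapter" (luku)

def parse_coordinates(text: str) -> list[Coordinate]:
    # Normalize the connectives "ja" (and) / "sekä" (as well as)
    # so the list splits cleanly on commas.
    text = text.replace(" sekä ", ", ").replace(" ja ", ", ")
    items = [p.strip() for p in text.split(",")]

    # First pass: record each item's number and explicit unit, if any.
    parsed = []
    for item in items:
        m = re.match(r"(\d+)\s*(§|luku)?", item)
        if not m:
            continue
        unit = {"§": "section", "luku": "chapter"}.get(m.group(2))
        parsed.append((int(m.group(1)), unit))

    # Second pass, right to left: a bare number inherits the unit of the
    # nearest item to its right that carries one. Defaulting bare numbers
    # to sections instead is exactly the degradation described above.
    result = []
    inherited = None
    for number, unit in reversed(parsed):
        if unit is not None:
            inherited = unit
        result.append(Coordinate(number, unit or inherited or "section"))
    return list(reversed(result))
```

Run on "2 §, 3, 4, 6 ja 7 luku sekä 40 §", this yields section 2, chapters 3, 4, 6, 7, and section 40, matching the amendment's intent.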

An editorial artifact

An editorial artifact is a divergence caused by editorial choices in the consolidation process — choices that are defensible but differ from strict replay.

Example: 1992/728 — future-effective amendment already applied. Finlex's consolidated metadata says the consolidation date is 2009-12-29, but section 3 already reflects amendment 2009/1710, which enters into force on 2010-01-01. LawVM's legal point-in-time mode carries the earlier wording because the amendment is not yet in force at the stated date.

Finlex editors ran their editorial process ahead of the strict legal effectivity date. This is pragmatic — users visiting Finlex in late December probably want to see the upcoming version. But it is not strict point-in-time state. LawVM documents the divergence and classifies it as oracle version drift.
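Strict point-in-time selection reduces to a date comparison. A minimal sketch, assuming a flat list of amendment records; the `Amendment` type and `effective_amendments` helper are illustrative names, not LawVM's actual API:

```python
from dataclasses import dataclass
from datetime import date

@dataclass(frozen=True)
class Amendment:
    identifier: str
    in_force: date  # date the amendment legally enters into force

def effective_amendments(history: list[Amendment], at: date) -> list[Amendment]:
    # Strict replay: an amendment contributes only once it is in force.
    return [a for a in history if a.in_force <= at]

history = [Amendment("2009/1710", date(2010, 1, 1))]

# At the stated consolidation date the amendment is not yet in force, so
# strict replay keeps the earlier wording; the Finlex editors applied it early.
assert effective_amendments(history, date(2009, 12, 29)) == []
assert effective_amendments(history, date(2010, 1, 1)) == history
```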

A candidate oracle issue

Sometimes the primary sources appear to support LawVM over the Finlex consolidation. Those cases are candidate findings until confirmed by Finlex or another competent authority.

Example: 2004/699 — candidate stale section heading. Amendment 2008/886 changed §8's heading from "Valvontatehtävän siirtäminen toiselle valvontaviranomaiselle" to "... ulkomaan valvontaviranomaiselle" ("another" → "foreign" supervisory authority). Finlex still shows the 2004 heading. No subsequent amendment found by LawVM changed it back. This is a high-confidence candidate finding, but still a candidate until authority review.

Example: 2014/716 — candidate missing COVID-19 emergency provisions. Temporary amendment 2020/697 modified four sections and added new §8b for enterprises in difficulty. LawVM's source-backed replay indicates the Finlex consolidated version omitted these changes during the amendment's period of force.

These are not editorial choices. They are omissions — content that was published in Säädöskokoelma but never made it into the consolidation.
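The temporary-amendment case hinges on a period of force: the changes should appear in the consolidation only between entry into force and expiry. A sketch of that window check, where the identifier comes from the text above but the dates are illustrative placeholders, not the actual period of force of 2020/697:

```python
from dataclasses import dataclass
from datetime import date
from typing import Optional

@dataclass(frozen=True)
class TemporaryAmendment:
    identifier: str
    in_force: date
    expires: Optional[date]  # None means permanent

def applies_at(a: TemporaryAmendment, at: date) -> bool:
    # A temporary amendment contributes only inside [in_force, expires].
    if at < a.in_force:
        return False
    return a.expires is None or at <= a.expires

# Placeholder dates for illustration only.
covid = TemporaryAmendment("2020/697", date(2020, 7, 1), date(2020, 12, 31))
assert applies_at(covid, date(2020, 8, 15))      # inside the window
assert not applies_at(covid, date(2021, 6, 1))   # after expiry
```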

Source pathology

Sometimes the source XML is itself broken. Neither system can produce correct output.

Example: 2014/1244 — corrupted XML in Finlex production pipeline. The amendment source XML (2018/1202) contains a stray "790" from the old frequency range embedded in the replacement payload: "470—694 790 megahertsiä" instead of "470—694 megahertsiä". A second defect produces "on ovat" instead of "on". LawVM faithfully replays the corrupted source. Finlex's consolidation shows the correct text — because editors manually corrected it.

In source pathology cases, LawVM is "correct" in the sense that it accurately reproduces the published source, and "wrong" in the sense that the source itself is broken. The system documents the pathology rather than silently correcting it.

Why typing matters

A single similarity score collapses all of these into one number. A 95% match says nothing about the remaining 5%, which could be:

  • 5% replay bugs (system is wrong)
  • 5% editorial artifacts (system is right, oracle made different choices)
  • 5% candidate oracle issues (primary sources appear to support the replay over the comparison surface)
  • or any mixture

Without typing, you cannot improve the system, validate the oracle, or publish findings. With typing, every divergence is an investigation that either improves LawVM, documents a candidate official-surface issue, or reveals a source pathology.
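The four-way taxonomy above can be made concrete as a typed record. This is a hedged sketch: the enum values mirror the section headings in this article, but the schema and the `triage` helper are illustrative, not LawVM's actual data model:

```python
from dataclasses import dataclass
from enum import Enum

class DivergenceType(Enum):
    REPLAY_BUG = "replay_bug"                  # defect in LawVM itself
    EDITORIAL_ARTIFACT = "editorial_artifact"  # defensible editorial choice
    CANDIDATE_ORACLE_ISSUE = "candidate_oracle_issue"  # sources favor replay
    SOURCE_PATHOLOGY = "source_pathology"      # the source XML is broken

@dataclass(frozen=True)
class Divergence:
    statute: str
    kind: DivergenceType
    note: str

def triage(findings: list[Divergence]) -> dict[DivergenceType, list[Divergence]]:
    # Group divergences by type so each bucket gets the right follow-up:
    # fix the parser, document the artifact, report the candidate,
    # or record the pathology.
    buckets: dict[DivergenceType, list[Divergence]] = {t: [] for t in DivergenceType}
    for f in findings:
        buckets[f.kind].append(f)
    return buckets

findings = [
    Divergence("1998/745", DivergenceType.REPLAY_BUG,
               "chapter repeal lost in coordinate parsing"),
    Divergence("1992/728", DivergenceType.EDITORIAL_ARTIFACT,
               "future-effective amendment applied early"),
    Divergence("2014/1244", DivergenceType.SOURCE_PATHOLOGY,
               "stray '790' in replacement payload"),
]
assert len(triage(findings)[DivergenceType.REPLAY_BUG]) == 1
```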

This is why LawVM maintains a residual taxonomy and a candidate-finding dataset. As of 2026-04-25, 22 high-confidence meaningful candidate findings have been reported to Finlex, with hundreds of additional divergences still being classified. The taxonomy turns opaque failure into analyzable structure.