Getting Started
Prerequisites
- Python 3.12+
- uv package manager
- Git
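The prerequisites above can be checked before installing. The following is a minimal sketch, not part of LawVM: the helper names are hypothetical, and it assumes the tools are invoked as uv and git on your PATH. Because uv can manage its own Python interpreters, the version check on the running interpreter is only indicative.

```python
import shutil
import sys

# Tool names as they are assumed to appear on PATH.
REQUIRED_TOOLS = ("uv", "git")
MIN_PYTHON = (3, 12)


def missing_tools(tools=REQUIRED_TOOLS):
    """Return the tools from `tools` that cannot be found on PATH."""
    return [tool for tool in tools if shutil.which(tool) is None]


def python_ok(version=None, minimum=MIN_PYTHON):
    """True if `version` (default: the running interpreter) meets `minimum`."""
    if version is None:
        version = sys.version_info[:2]
    return tuple(version[:2]) >= tuple(minimum)


if __name__ == "__main__":
    print("Python 3.12+:", "ok" if python_ok() else "too old")
    for tool in missing_tools():
        print("missing from PATH:", tool)
```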
Install
git clone https://github.com/eliask/lawvm.git
cd lawvm
uv sync
Import source archives
Replaying Finnish statutes requires locally archived sources. Import the public Finlex archives (~13 GB ZIP input, ~5 GB on disk after ingestion):
uv run lawvm import-zip \
--statute-zip https://www.finlex.fi/api/assets/open-data/archives/statute.zip \
--consolidated-zip https://www.finlex.fi/api/assets/open-data/archives/statute-consolidated.zip
This downloads the official Finlex open data archives and ingests them into local .farchive files. The import is a one-time operation; subsequent commands read from the local archive.
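Given the sizes quoted above, it may be worth checking free disk space before starting the import. This is a rough sketch with hypothetical helper names. The 18 GB default simply adds the ~13 GB input to the ~5 GB ingested size, which assumes the ZIPs are held on the same disk during ingestion; the real peak requirement may be lower or higher.

```python
import shutil


def free_gb(path="."):
    """Free space on the filesystem containing `path`, in decimal gigabytes."""
    return shutil.disk_usage(path).free / 1e9


def has_room(path=".", required_gb=18.0):
    """True if at least `required_gb` GB are free (18 GB is an assumed budget)."""
    return free_gb(path) >= required_gb


if __name__ == "__main__":
    print(f"free: {free_gb():.1f} GB, enough for import: {has_room()}")
```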
First replay
Replay statute 2002/738 (Työturvallisuuslaki / Occupational Safety Act) as of January 1, 2024:
uv run lawvm replay 2002/738 --as-of 2024-01-01
This compiles all amendment acts affecting 2002/738, replays them over the base statute, and materializes the point-in-time text. The output is the complete statute as it stood on that date.
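To make the replay idea concrete, here is a toy model of it. It is an illustrative sketch only and does not reflect LawVM's internal data model: the Operation and Amendment types, the three operation kinds, and the section keys are all assumptions made for the example. Amendments are applied in order of effective date, and anything taking effect after the as-of date is ignored.

```python
from dataclasses import dataclass
from datetime import date


@dataclass(frozen=True)
class Operation:
    kind: str       # assumed kinds: "replace", "insert", "repeal"
    section: str
    text: str = ""


@dataclass(frozen=True)
class Amendment:
    act_id: str     # hypothetical identifier
    effective: date
    operations: tuple


def replay(base, amendments, as_of):
    """Apply amendments effective on or before `as_of` to a copy of `base`."""
    sections = dict(base)
    for amendment in sorted(amendments, key=lambda a: a.effective):
        if amendment.effective > as_of:
            break  # later amendments are not yet in force
        for op in amendment.operations:
            if op.kind == "repeal":
                sections.pop(op.section, None)
            elif op.kind in ("replace", "insert"):
                sections[op.section] = op.text
            else:
                raise ValueError(f"unknown operation kind: {op.kind}")
    return sections


base = {"1": "Original purpose.", "2": "Original scope."}
amendments = [
    Amendment("amend-2025", date(2025, 6, 1), (Operation("repeal", "1"),)),
    Amendment("amend-2010", date(2010, 1, 1),
              (Operation("replace", "2", "Revised scope."),)),
]
print(replay(base, amendments, date(2024, 1, 1)))
# {'1': 'Original purpose.', '2': 'Revised scope.'}
```

In this toy run the 2025 repeal is skipped because it takes effect after the requested date, which is the point-in-time behaviour described above.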
First diff
Compare LawVM's replay against the Finlex consolidation:
uv run lawvm diff 2002/738
This shows section-by-section divergences. Green sections match. Red sections diverge. Each divergence is a starting point for investigation: the cause may be a replay defect, a source gap, an editorial convention, or a candidate issue in the official consolidation surface.
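A section-by-section comparison can be pictured as follows. This is a simplified sketch under stated assumptions, not the actual diff implementation: it treats both texts as plain mappings from section key to text, compares them for exact equality, and uses status labels invented for the example.

```python
def diff_sections(replayed, oracle):
    """Classify every section found in either the replayed or the oracle text."""
    report = {}
    for section in sorted(set(replayed) | set(oracle)):
        ours = replayed.get(section)
        theirs = oracle.get(section)
        if ours == theirs:
            report[section] = "match"
        elif ours is None:
            report[section] = "missing_in_replay"
        elif theirs is None:
            report[section] = "missing_in_oracle"
        else:
            report[section] = "diverge"
    return report


replayed = {"1": "Purpose.", "2": "Revised scope.", "3": "New duty."}
oracle = {"1": "Purpose.", "2": "Original scope."}
print(diff_sections(replayed, oracle))
# {'1': 'match', '2': 'diverge', '3': 'missing_in_oracle'}
```

A result like this only says where the two texts differ. Deciding which side is wrong is the investigation step described above.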
First explain
See the amendment chain and operation history:
uv run lawvm explain 2002/738
This shows which amendments affected the statute, what operations they compiled to, and the temporal sequence of changes.
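The kind of information involved can be sketched with a small example. The record layout below (act identifier, effective date, list of operation and section pairs) is an assumption made for illustration and is not the format that explain prints. The sketch flattens amendments into a chronological operation history and derives, for each section, the amendment that last touched it.

```python
from datetime import date


def operation_history(amendments):
    """Flatten (act_id, effective, ops) records into a date-ordered history."""
    history = []
    for act_id, effective, ops in sorted(amendments, key=lambda a: a[1]):
        for kind, section in ops:
            history.append((effective, act_id, kind, section))
    return history


def last_changed_by(history):
    """Map each section to the act that most recently changed it."""
    latest = {}
    for _effective, act_id, _kind, section in history:
        latest[section] = act_id  # later entries overwrite earlier ones
    return latest


# Hypothetical amendment records, deliberately given out of order.
amendments = [
    ("amend-B", date(2015, 3, 1), [("replace", "2"), ("insert", "2a")]),
    ("amend-A", date(2008, 1, 1), [("replace", "2")]),
]
history = operation_history(amendments)
for entry in history:
    print(entry)
print(last_changed_by(history))
# {'2': 'amend-B', '2a': 'amend-B'}
```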
What success looks like
When replay succeeds on a statute with dozens of amendments spanning decades, you get:
- Point-in-time text that matches the official consolidation character-for-character
- Full provenance: every provision traced to the amendment that changed it
- Temporal versioning: query any past date, get the text that was in force
When replay diverges, typed residuals classify the likely cause: replay defect, source pathology, editorial artifact, or oracle staleness.
Run the benchmark
uv run lawvm bench --mode finlex_oracle
This replays the configured Finnish alpha corpus and reports aggregate metrics. See Artifacts for methodology and interpretation.
Explore further
uv run lawvm --help
The CLI includes replay, diff, explain, benchmark, bisect, diagnose, and many more tools. The --help output documents the full command surface.
Architecture documentation lives in notes/ in the repository. Start with notes/SPEC_INDEX.md.