The set Trap in Lean 4: When Cached Builds Hide Real Errors
2026-05-27
A Lean 4 proof compiles for months. Then you run lake build from
scratch, and nothing works. The culprit: set P := ComplexDef, a
one-line tactic that creates a local definition the simplifier cannot
see through. This post documents the problem, why cached builds mask
it, and how to fix it.
The Symptom
You have a large formalization β say, 30,000 lines of population
protocol analysis. It builds. You make an edit in one file. You run
lake build. Lean recompiles the edited file against cached .olean
files for its dependencies. Everything passes.
Months later, something forces a full rebuild: a Lean toolchain bump, a Mathlib update, or a colleague cloning the repo fresh. Suddenly, files that “always worked” produce cascading errors:
Type mismatch
(transitionPEM_phase4_role_settled_or_resetting hsi hsj).left
has type
(transitionPEM_phase4 n Rmax ...).1.role = Role.Settled β¨ ...
The lemma produces the right statement, but Lean cannot unify it
with the goal. The goal mentions P.Ξ΄ where P is a local
definition, and the lemma mentions transitionPEM_phase4 β the
function that P.Ξ΄ ultimately calls through several layers of
definition. These are definitionally equal. So why does Lean refuse?
The Cause: set Creates an Opaque Abbreviation
The set tactic in Lean 4 does two things:
- Introduces a local definition:
let P := PEMProtocolCoupled ... - Replaces all occurrences of
PEMProtocolCoupled ...in the goal and context withP
After set, the goal contains P everywhere. When simp or rw
tries to apply a lemma that mentions PEMProtocolCoupled, it needs
to see that P = PEMProtocolCoupled .... In principle, P is
definitionally transparent β Lean’s kernel can unfold it. But the
pattern matcher used by simp and rw does not always unfold
local set definitions when searching for matching subterms.
The result: lemmas that should apply do not fire. The proof breaks β not because the mathematics is wrong, but because the automation cannot see through a name it introduced.
Why Cached Builds Hide the Problem
When Lean compiles a file, it produces an .olean artifact containing
the elaborated terms. Downstream files import the .olean, not the
source. The .olean stores fully elaborated, kernel-checked terms
where all local definitions have been resolved.
If file A.lean is compiled with a version of its dependencies where
set P unfolding happened to succeed (perhaps due to a slightly
different elaboration order, or a simpler context), the .olean for
A records the successful proof term. Downstream files see only this
resolved term. They never re-check the tactic proof.
When you lake build incrementally, only modified files are
recompiled. Unmodified files use their cached .olean. The broken
set P proof in A.lean is never re-executed β it lives in the
.olean as an already-checked term.
A full rebuild forces every file to be re-elaborated from source.
Now the set P proof must actually succeed during elaboration.
If the pattern matcher cannot unfold P, the proof fails.
Three Fixes
1. Remove set, inline the definition
The most reliable fix: do not use set for definitions that appear
in lemma applications. Write the full expression everywhere.
-- Before (breaks on full rebuild):
set P := PEMProtocolCoupled n Rmax Emax Dmax hn0
exact step_rank_preserved_of_InSswap hn0 hS w
-- After (works always):
exact step_rank_preserved_of_InSswap (Rmax := Rmax) hn0 hS w
This is verbose but unambiguous. Every lemma application directly
names PEMProtocolCoupled, and the pattern matcher finds the match.
2. Bridge the definitional gap explicitly
When a lemma talks about an inner function (like
transitionPEM_phase4) but the goal mentions the outer wrapper
(P.Ξ΄), prove the connection as a separate have:
have h_bridge : (P.Ξ΄ (D i, D j)).1.role =
(transitionPEM_phase4 n Rmax ((D i).1, (D j).1)
(D i).2 (D j).2).1.role := by
simp only [P, PEMProtocolCoupled, PEMProtocol, protocolPEM,
transitionPEM, transitionPEM_prePhase4,
hRD, hsi, hsj, and_self, ite_true,
role_settled_ne_resetting, ...]
rw [h_bridge]
exact (the_lemma hsi hsj).1
3. Unfold in the transition proof directly
For goals about transition function fields (.role, .rank,
.timer), skip the abstraction entirely and unfold the full
transition:
unfold transitionPEM transitionPEM_phase4 transitionPEM_prePhase4
phase4_swap phase4_decide phase4_propagate
simp only [hRD, hsi, hsj, ...]
by_cases hpar : n % 2 = 0
Β· simp only [hpar, if_true]
split_ifs <;> simp_all
Β· simp only [hpar, if_false]
split_ifs <;> simp_all
This is expensive (the simp_all after split_ifs may need
high maxRecDepth and maxHeartbeats), but it does not depend
on any lemma matching through set definitions.
Lessons
-
lake buildfrom scratch is the ground truth. Incremental builds with cached.oleanfiles can hide elaboration failures. Run a clean build periodically, especially before submitting or sharing a project. -
setis convenient but fragile. Use it for goals where you only needrflorcalcreasoning. Avoid it whensimp/rwmust match the defined name against library lemmas. -
Lean 4 is not Lean 3 here. In Lean 3,
setcreated a true local definition that tactics could unfold. In Lean 4, the behavior aroundsimpmatching is different. The definitional transparency is there for the kernel, but the tactic-level pattern matcher is more conservative. -
When a proof “always worked,” check the olean. If a file compiles only because its olean was cached, the proof is not reproducible. Treat “compiles from cache” as untested.
Mathematics is not the art of writing down long proofs. It is the art of making each step so small it cannot fail.
β adapted from Dijkstra
Zinan Huang, 2026-05-27