Next Session Handoff — B4-UNWRAP-IN-LOOP Deep Dive
Session type: Single-topic debug session (1 fresh context window)
Complexity: M-L (2-4 quartz-hours per the plan, given two failed fix attempts already burned ~1h of investigation)
Priority: Medium-high — real silent miscompile, but one-line workaround exists
Prerequisite: clean trunk at b61da0f4 (Batch A+B sprint complete), fixpoint 2281, smoke + regression sweep green
What you’re inheriting
The Apr 14, 2026 Batch A+B sprint landed 9 of 9 items (see docs/ROADMAP.md §“Batch A + Batch B sprint summary”). B4 was the one that reproduces reliably but the plan’s fix shape did NOT work. The bug, the minimal reproducer, the two failed fix attempts, and the critical narrowing observations are all documented below so you can start ~1 hour ahead of where I started.
Read first:
- docs/ROADMAP.md row B4-UNWRAP-IN-LOOP
- spec/qspec/unwrap_in_loop_spec.qz: the pending test documents the exact broken shape; the 3 passing tests document what does work
- Commit 258468f5: the B4 commit with the reproducer, IR analysis, and the two failed fix attempts in the message body
- self-hosted/frontend/macro_expand.qz:1220, expand_builtin_unwrap: the function that generates the match AST that miscompiles
The bug
Minimal 12-line reproducer (also at spec/qspec/unwrap_in_loop_spec.qz as an it_pending stub):
def main(): Int
  var step = Option::Some(10)
  var sum = 0
  var i = 0
  while i < 3
    sum += step! # ← miscompiles
    step = Option::Some(i + 100)
    i += 1
  end
  return 0 if sum == 211
  return 1
end
Expected: sum = 10 + 100 + 101 = 211
Observed: sum = 0 (cached constant), or a garbage pointer value in more complex cases
Every iteration, step! returns 0, not the current step’s Option payload. The sum is therefore 0+0+0.
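As a sanity check on the arithmetic, the loop can be modeled outside the compiler. The sketch below is Python rather than Quartz, with None standing in for Option::None. The names run_loop, correct_unwrap, and miscompiled_unwrap are hypothetical; miscompiled_unwrap only imitates the cached-constant symptom and is not the compiler's actual code path.

```python
from typing import Callable, Optional


def run_loop(unwrap: Callable[[Optional[int]], int]) -> int:
    """Mirror the reproducer: add the unwrapped step, then reassign step."""
    step: Optional[int] = 10          # var step = Option::Some(10)
    total = 0                         # var sum = 0
    for i in range(3):                # while i < 3
        total += unwrap(step)         # sum += step!
        step = i + 100                # step = Option::Some(i + 100)
    return total


def correct_unwrap(opt: Optional[int]) -> int:
    """Intended semantics: read the current payload, panic on None."""
    if opt is None:
        raise RuntimeError("unwrap failed")
    return opt


def miscompiled_unwrap(opt: Optional[int]) -> int:
    """Model of the observed symptom: the subject is ignored entirely."""
    return 0


expected = run_loop(correct_unwrap)      # 10 + 100 + 101
observed = run_loop(miscompiled_unwrap)  # 0 + 0 + 0
print(expected, observed)
```

The final Some(102) assigned on the last iteration is never read, which is why the expected total stops at 211.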
IR diagnosis
The loop body at the MIR level emits:
while_body2:
  %v10 = load i64, ptr %sum, align 8
  %v11 = add i64 %v10, %v0 ; %v0 is a CONSTANT defined at function entry
  store i64 %v11, ptr %sum, align 8
Where %v0 = add i64 0, 0 is defined once at fn_entry — it’s the constant 0. The step! unwrap isn’t loading from %step at all. The match is substituted with a cached constant.
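The same symptom can be checked mechanically on emitted IR text, which may help when testing candidate fixes. This is a rough sketch, not part of the Quartz toolchain. It assumes loop-body labels start with while_body (as in the snippet above) and that folded constants appear as an add of two integer literals. The function name and the sample text, including its br line, are made up for illustration.

```python
import re

LABEL = re.compile(r"^\s*([A-Za-z_][\w.]*):")
# A value defined as an add of two integer literals, e.g. "%v0 = add i64 0, 0".
CONST_DEF = re.compile(r"^\s*(%[\w.]+)\s*=\s*add\s+i64\s+(-?\d+),\s*(-?\d+)\s*$")
VALUE = re.compile(r"%[\w.]+")


def constants_read_in_loop_bodies(ir_text):
    """Return (block, value, folded_int) for each loop-body use of a
    literal-only constant that was defined in a different block."""
    block = None
    const_defs = {}  # value name -> (defining block, folded integer)
    hits = []
    for line in ir_text.splitlines():
        code = line.split(";", 1)[0]          # drop trailing IR comments
        label = LABEL.match(code)
        if label:
            block = label.group(1)
            continue
        definition = CONST_DEF.match(code)
        if definition:
            folded = int(definition.group(2)) + int(definition.group(3))
            const_defs[definition.group(1)] = (block, folded)
            continue
        if block and block.startswith("while_body"):
            operands = code.split("=", 1)[-1]  # right-hand side, or whole line
            for value in VALUE.findall(operands):
                if value in const_defs and const_defs[value][0] != block:
                    hits.append((block, value, const_defs[value][1]))
    return hits


# Illustrative sample rebuilt from the snippet above; not real compiler output.
SAMPLE_IR = """\
fn_entry:
  %v0 = add i64 0, 0
  br label %while_body2
while_body2:
  %v10 = load i64, ptr %sum, align 8
  %v11 = add i64 %v10, %v0 ; %v0 is a CONSTANT defined at function entry
  store i64 %v11, ptr %sum, align 8
"""

hits = constants_read_in_loop_bodies(SAMPLE_IR)
print(hits)
```

A correct lowering of step! should produce no hit for the unwrap result, because the payload would be loaded from %step inside the loop body.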
Critical triangulation — what makes the bug fire
The bug requires ALL THREE of these conditions. Any one missing and it works:
- $unwrap macro expansion (postfix ! or explicit $unwrap(opt); both fail identically)
- Inside a while loop body
- Subject reassigned inside the loop
Shapes that WORK (locked in by passing tests in unwrap_in_loop_spec.qz):
- step! outside a loop with reassignment → works (test 1)
- Explicit match step { Some(x) => x; None => 0; end in loop with reassignment → works (test 2)
- Inline sum += match step { ... end directly (not via var v = match) in loop → works (test 3)
The explicit match form works correctly in the SAME loop shape. So the bug is NOT about loops, reassignment, narrowing, or match-in-loop in general. It’s specifically in how the $unwrap macro-generated match construction interacts with MIR lowering.
Rules out
- ❌ is-narrowing interaction (the bug reproduces with a while true conditional too)
- ❌ reassignment of the match subject (explicit match handles it fine)
- ❌ loop + match in general (explicit match in loop works)
- ❌ inline vs. bound-to-var match (both inline and bound forms of explicit match work)
- ❌ The three-arm match with wildcard (hand-written 3-arm matches in loops work)
- ❌ Qualified vs. unqualified variant names (both hand-written forms work)
The bug is narrower than the plan originally framed it. It’s in the macro-generated AST specifically.
Failed fix attempts (from commit 258468f5, don’t retry these)
Attempt 1: Change macro payloads from vec_new() to 0
The parser for unqualified Some(x) patterns passes payloads = 0 (integer sentinel) at parser.qz:4858:
return ast::ast_enum_access(s, "", name, uq_bound_names, 0, ln, cl)
The $unwrap macro at macro_expand.qz:1233 passes payloads = vec_new() (empty vec, which is non-zero):
var some_pattern = ast::ast_enum_access(s, "Option", "Some", some_bound, some_payloads, line, col)
Hypothesis: 0 (sentinel) and vec_new() (empty vec) are handled differently downstream. Changed macro to pass 0. No effect — miscompile still fires.
Attempt 2: Hoist the match subject into an explicit var subj = expr block
Rewrote the macro to generate:
{
  var __unwrap_subj = expr
  match __unwrap_subj { Some(v) => v; None => panic; _ => panic end
}
Hypothesis: the match subject is evaluated once and cached instead of being re-evaluated on every iteration. Hoisting it into a fresh local should force a clean re-evaluation each time. No effect: the miscompile still fires through the NODE_BLOCK wrapper.
This rules out “the bug is about when/how the subject is evaluated.”
Recommended investigation approach (fresh session)
The fast path to root cause is side-by-side AST inspection of the macro-generated match vs. a hand-written match that compiles correctly. The ASTs look identical at the surface level — same NODE_MATCH, same NODE_MATCH_ARM, same NODE_ENUM_ACCESS patterns with variant="Some". There must be a subtle difference in one of: str1, str2, extras, children, int_val, ops, or lefts/rights slots.
Phase 1: Add a targeted AST dump (2 hours)
Quartz doesn’t have --dump-ast, but you can add a temporary debug print in mir_lower_match_expr (in mir_lower_expr_handlers.qz) that dumps the match node’s structure just before MIR lowering. Print:
- Match subject node kind + all its slot values
- For each arm: pattern node kind + all its slot values, guard, body kind + str1
- Also print what mir_ctx_lookup_var(subject_name) returns for the subject
Compile the minimal reproducer with this instrumentation. Also compile an equivalent hand-written match that compiles correctly (e.g. match step { Option::Some(x) => x; Option::None => 0; end). Diff the dumps. The difference is the bug.
Rebuild the compiler with the debug print (quake build), run both forms, capture output, revert the debug print, run quake guard to verify fixpoint returns to 2281.
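Once both dumps are captured, the diff step can be scripted so that only differing slots are shown. The sketch below uses Python's standard difflib. It assumes the temporary debug print writes one node or slot per line. The function name and the sample dump lines (including the extras=A / extras=B values) are placeholders invented to exercise the function, not real compiler output.

```python
import difflib


def differing_slots(macro_dump: str, handwritten_dump: str) -> list:
    """Return only the added/removed lines between the two AST dumps."""
    diff = difflib.unified_diff(
        macro_dump.splitlines(),
        handwritten_dump.splitlines(),
        fromfile="macro",
        tofile="handwritten",
        lineterm="",
        n=0,  # no context lines; only the slots that differ
    )
    return [
        line
        for line in diff
        if line.startswith(("+", "-")) and not line.startswith(("+++", "---"))
    ]


# Placeholder dumps: the format and the differing slot are invented.
MACRO_DUMP = """\
match kind=NODE_MATCH
arm0.pattern kind=NODE_ENUM_ACCESS str2=Some extras=A
arm0.body kind=NODE_IDENT
"""
HANDWRITTEN_DUMP = """\
match kind=NODE_MATCH
arm0.pattern kind=NODE_ENUM_ACCESS str2=Some extras=B
arm0.body kind=NODE_IDENT
"""

changes = differing_slots(MACRO_DUMP, HANDWRITTEN_DUMP)
for line in changes:
    print(line)
```

Expect some benign noise in a real run, such as the gensym binding name versus a hand-written x; those lines can be ignored or normalized before diffing.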
Phase 2: Check mir_ctx state at match emission time
Another angle: maybe the issue isn’t the AST shape — maybe it’s state in MirContext at the moment the macro-generated match is lowered. The macro runs at parse time, but the MIR lowering of its output happens inside the loop body during the normal walk. Check whether:
- ctx.mir_ctx_lookup_var("step") returns the right value when called from inside the macro-generated match’s subject evaluation
- The loop body’s scope stack matches what the explicit-match path sees
- The macro’s gensym name (__macro_N__) has any collision with a scope entry
Phase 3: Check if MIR constant folding is the culprit
The %v0 = add i64 0, 0 at function entry is a literal constant. If a MIR pass is const-folding the match result to 0 because the Some arm’s body is NODE_IDENT(gensym0) and the gensym binding somehow resolves to 0 at MIR time, that would produce exactly this symptom.
Grep for constant-folding passes in self-hosted/backend/mir_lower*.qz and self-hosted/backend/codegen*.qz. Look for anything that short-circuits a match with a single-ident-body arm to a constant. If you find such a pass, check whether it accounts for the Some arm’s binding scope correctly — the gensym0 should be a FRESH alloca per iteration, not a shared constant.
Phase 4: Consider desugaring ! to something other than a match
If Phases 1-3 don’t find the root cause in reasonable time, an alternative is to change the $unwrap macro to desugar to a different AST shape entirely:
Option A: Use NODE_TRY_EXPR directly. $try already exists (macro_expand.qz:1207) and works by emitting NODE_TRY_EXPR directly — no match at all. It early-returns on None instead of panicking, but the Ok-extraction MIR path is well-tested. Create a NODE_FORCE_UNWRAP variant that reuses the same extraction logic but panics on None. Medium-complexity — needs a new AST node kind, MIR lowering handler, and codegen.
Option B: Use an if-else tree. Lower $unwrap(e) to:
{
  var __subj = e
  if __subj is Option::Some then
    ## extract payload via field access
    load_offset(__subj, 1)
  else
    panic("unwrap failed")
  end
}
This avoids the match entirely and uses only the is narrowing + field access paths, which are well-tested. Might be simpler than the match-based expansion.
Option C: Replace ! with a builtin intrinsic call. Add option_unwrap to cg_intrinsic_core.qz that emits the raw tag check + payload load + panic branch. ! lowers to option_unwrap(e). Most direct — matches the stdlib opt.unwrap() UFCS path which already works.
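To make the intended semantics of Options B and C concrete, here is a small Python model of the tag check, payload load, and panic branch. It is a sketch under assumptions: the payload sitting at slot 1 follows the load_offset(__subj, 1) call in Option B, while the tag values SOME_TAG and NONE_TAG and the tuple representation are made up for illustration and may not match Quartz's real Option layout.

```python
SOME_TAG = 1  # assumed tag value for Option::Some (hypothetical)
NONE_TAG = 0  # assumed tag value for Option::None (hypothetical)


def option_unwrap(cell):
    """Model of the proposed intrinsic: check the tag, then load the payload.

    `cell` is a (tag, payload) tuple standing in for the Option's memory:
    slot 0 holds the tag, slot 1 holds the payload.
    """
    tag = cell[0]                 # raw tag check
    if tag != SOME_TAG:
        raise RuntimeError("unwrap failed")  # panic branch
    return cell[1]                # payload load at offset 1


value = option_unwrap((SOME_TAG, 42))
print(value)

try:
    option_unwrap((NONE_TAG, 0))
    panicked = False
except RuntimeError:
    panicked = True
print(panicked)
```

The point of this shape is that the payload is read from the subject on every call, so there is no match result for a later pass to replace with a constant.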
My recommendation: Start with the Phase 1 AST dump. If the difference is obvious (30 min), fix the macro directly. If the dumps look identical, the bug is in MIR state at emission time (Phases 2-3), so investigate there. If that also yields nothing, fall back to Option C (simplest desugaring change) rather than keep chasing the match path.
Success criteria
- The pending test in spec/qspec/unwrap_in_loop_spec.qz (“step! inside while loop with reassignment”) passes. Change it_pending → it and it runs green.
- The minimal reproducer (/tmp/b4_loop_simple.qz, paste from the commit message) returns 211, not 0.
- quake guard fixpoint still verifies (±30 functions tolerance from 2281).
- quake smoke green.
- Scheduler + impl_trait + all B-sprint regression specs still green.
- ROADMAP row B4-UNWRAP-IN-LOOP is updated with **RESOLVED** (date, commit SHA). ... in the RESOLVED-via-strikethrough format.
Why this is the right next target
- Real silent miscompile with a tight, verified, minimal reproducer. Users WILL hit this.
- Failed attempts documented — you don’t waste the 1 hour I spent on payloads=0 and hoist-into-block.
- Clean narrowing of the bug surface — three necessary conditions, all other variations work. Rules out 5+ hypotheses.
- Multiple viable fix paths (Phases 1-4 above). Unlikely to dead-end.
- Sized for one fresh session — 2-4 quartz-hours. Fits the “one topic, one session” model.
- Unblocks ! as a safe user-facing feature. Currently users have to know to switch to explicit match in loops, which is non-obvious.
Backup binary + cleanliness check before starting
cd /Users/mathisto/projects/quartz
# Verify baseline
git log --oneline -3 # should show b61da0f4 B2 at top
git status # should be clean
./self-hosted/bin/quake guard:check # "Fixpoint stamp valid"
./self-hosted/bin/quake smoke # 4/4 + 22/22
# Pre-session backup (per quake-guard rule 1)
cp self-hosted/bin/quartz self-hosted/bin/backups/quartz-pre-b4-fix-golden
Then read the commit message for 258468f5 (it’s the best summary of the investigation so far), the pending test in unwrap_in_loop_spec.qz, and expand_builtin_unwrap in macro_expand.qz. Start with Phase 1 (AST dump).
Good luck. This one is solvable — the bug has been narrowed to a small surface (one macro function + one MIR lowering path) and the three-condition trigger is specific enough to pin down quickly.