Bug 1: “Never type in let binding context” — exit 176
Spec file: spec/qspec/never_type_spec.qz
Failing test: "never type in let binding context" (line 98)
Observed: exit 176 instead of expected 42.
Verdict: the test name is misleading. The root cause is not Never-type handling. The root cause is a tail-call optimization miscompilation when a stack-allocated @value Option is passed to a non-inlined function that’s called in tail position. The Never-type arms just happen to prevent the inliner from eliding the call, which is what exposes the TCO bug.
1. Problem (as observed)
The test compiles and runs this program:
def get_or_die(opt: Int): Int
  return match opt
    Some(x) => x
    None => panic("was None")
    _ => panic("impossible")
  end
end

def main(): Int
  return get_or_die(Option::Some(42))
end
get_or_die(Option::Some(42)) is expected to return 42, but the program exits with 176. The 176 is not random: it is 0xb0, the low byte of a libc pointer (0x1f9cb80b0 = lsl::sAllocatorBuffer) that happens to land in the freed stack slot where the payload used to live.
0x1f9cb80b0 & 0xff = 0xb0 = 176.
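The arithmetic can be checked directly. This is a minimal sketch, assuming the process exit status keeps only the low 8 bits of the value returned from main (as POSIX wait statuses do); the helper name exit_status is hypothetical.

```python
def exit_status(return_value: int) -> int:
    """Model the truncation of a returned value to the low byte of the exit status."""
    return return_value & 0xFF


leaked_pointer = 0x1F9CB80B0  # value observed in x0 at the callee's ret

print(hex(exit_status(leaked_pointer)))  # 0xb0
print(exit_status(leaked_pointer))       # 176
print(exit_status(42))                   # 42, the value the test expected
```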
2. Reproduced minimal case (no Never at all)
def my_unwrap(opt: Int): Int
  var total = 0
  for i in 0..100
    total += i
  end
  for j in 0..100
    total += j * 2
  end
  return match opt
    Some(x) => x + total - total
    None => 0
    _ => 0
  end
end

def main(): Int
  return my_unwrap(Option::Some(42))
end
This exits 0 (garbage; the expected value is 42). No panic, no Never. The common ingredients are:
- Caller constructs a stack-allocated @value Option (alloca of [2 x i64]).
- Caller passes ptrtoint %alloc to i64 as the argument to a non-inlined callee.
- Caller’s final statement is return callee(...).
- Quartz codegen emits tail call i64 @callee(i64 %v1).
- LLVM performs tail call elimination: caller deallocates its frame, branches to callee.
- Callee’s own frame overlaps the freed caller frame; the payload is overwritten before it’s read.
A simpler version where unwrap is small enough to inline does NOT reproduce — the inliner eliminates the call, so TCO never happens.
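The frame overlap described in the list above can be modeled with a toy stack. This is only a sketch: the frame sizes, the slot positions, and the choice of which two slots the callee spills into are assumptions picked so that the payload slot is the one that gets clobbered. They are not the real ARM64 layout from the repro.

```python
STALE = 0x1F9CB80B0   # stand-in for whatever a callee-saved register held
STACK_SLOTS = 32      # toy stack of 8-byte slots; grows toward index 0
CALLER_SLOTS = 4      # assumed caller frame size
CALLEE_SLOTS = 10     # assumed callee frame size


def run(tail_call: bool) -> int:
    stack = [0] * STACK_SLOTS
    sp = STACK_SLOTS

    # Caller prologue: reserve a frame and build Option::Some(42) inside it.
    sp -= CALLER_SLOTS
    option_addr = sp + 1            # the integer handed to the callee
    stack[option_addr] = 0          # tag = 0 (Some)
    stack[option_addr + 1] = 42     # payload

    if tail_call:
        sp += CALLER_SLOTS          # TCO: the caller frame is freed before the branch

    # Callee prologue: reserve a frame and spill two registers at its top.
    sp -= CALLEE_SLOTS
    callee_top = sp + CALLEE_SLOTS
    stack[callee_top - 1] = STALE
    stack[callee_top - 2] = STALE

    # Callee body: check the tag, then load the payload through the passed address.
    tag = stack[option_addr]
    payload = stack[option_addr + 1]
    return payload if tag == 0 else -1


print(run(tail_call=False))          # 42
print(run(tail_call=True) & 0xFF)    # 176
```

Without the tail call the callee frame sits below the caller frame and the Option survives. With it, the callee frame starts where the caller frame used to be, so the spill lands on the payload while the tag still reads as Some.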
3. Current Quartz state
3.1 Where the tail call marker is emitted
self-hosted/backend/codegen_instr.qz:407-442
# Tail call detection: call result is block's return value
var is_tail = 0
if var_idx < 0 and callee_has_narrow_ret == 0
  var blk = state.current_block
  var term_kind = blk.mir_block_get_term_kind()
  if term_kind == mir::TERM_RETURN
    var ret_val = blk.mir_block_get_term_data()
    if ret_val == dest
      is_tail = 1
    end
  end
end
# Disable tail call for decomposed calls (arg count mismatch after expansion)
if callee_decompose == 1
  is_tail = 0
end
# Emit call ...
if is_tail == 1
  codegen_util::cg_emit(out, " = tail call i64 @")
else
  codegen_util::cg_emit(out, " = call i64 @")
end
The current rule: emit tail call whenever the call result is the return-value register. There is no check on whether any argument is a pointer into the caller’s stack frame. That’s the bug.
3.2 Where stack-allocated Options are constructed
@value enum constructors (Option::Some, etc.) allocate [N x i64] via alloca in the caller, then ptrtoint the pointer to i64 and pass it as a plain i64 argument. The generated IR looks like:
%alloc_1.p = alloca [2 x i64], align 8
%alloc_1 = bitcast [2 x i64]* %alloc_1.p to ptr
%v1 = ptrtoint ptr %alloc_1 to i64
store i64 %v0, ptr %alloc_1 ; tag = 0 (Some)
%sgep_4 = getelementptr ... %alloc_1, i64 1
store i64 %v3, ptr %sgep_4 ; payload = 42
%v6 = tail call i64 @get_or_die(i64 %v1) ; <-- BUG: TCO-eligible, arg is stack addr
ret i64 %v6
After LLVM’s tail-call pass: caller does add sp, sp, #0x20 then branches to callee; callee does sub sp, sp, #0x50 and stores callee-saved regs into the space that used to hold the Option payload.
3.3 ARM64 disassembly evidence
From the repro, get_or_die has epilog stp x0, x0, [sp, #0x8]; add sp, sp, #0x50; ret. At the ret, x0 (the return value) is 0x1f9cb80b0, not 42. Tracing backwards: the ldr x0, [x0, #0x8] that should load the payload at offset 8 actually reads from a stack slot that the callee clobbered with a callee-saved register during its prolog. That register’s old value is a libc pointer because the caller (qz_main) had a function pointer sitting there from a previous TLS access path.
4. External research
4.1 LLVM rules for tail call
From LLVM Language Reference — call instruction:
The optional tail and musttail markers indicate that the optimizers should perform tail call optimization (TCO). … tail is a hint that the TCO is eligible … The marker has the following semantics:
- The call will not cause unbounded stack growth if it is part of a recursive cycle in the call graph.
- Arguments with the in alloca or inalloca attribute are forbidden.
- The callee must not access any memory that is local to the caller, such as allocas or spill slots.
Quartz is violating the third bullet: the caller passes ptrtoint %alloca to i64, and the callee does access that memory (it loads the tag and payload). The tail marker is a promise made by the frontend, so LLVM does not re-verify it. Its escape analysis also cannot see through ptrtoint reliably: from LLVM’s perspective the i64 argument is just an integer, and it assumes the callee doesn’t touch caller-local memory. So LLVM performs TCO and the program miscompiles.
From LLVM issue #72555 — Subtle issue with [[clang::musttail]]:
If the function has any alloca instructions, safely keeping allocas in the entry block requires analysis to prove that the tail-called function does not read or write the stack object.
This is exactly the missing analysis in Quartz.
4.2 How other languages handle this
Rust: does not emit LLVM tail call markers by default. Rust’s codegen emits plain call for most user-level calls. The become keyword (RFC 1888, “Guaranteed TCO”) is still not available on stable, in part because of the alloca-escape issue: the MIR-to-LLVM layer needs to prove no caller-local memory escapes before it can emit musttail. See Rust tracking issue #112788.
Swift: emits tail only when SIL’s escape analysis proves no alloc_stack is reachable from the call arguments. @inout and UnsafePointer arguments to a tail-position call disable the marker.
Zig: emits call tail only for @call(.always_tail, ...) and enforces at comptime that no argument is a pointer to a caller-local. Ordinary tail-position calls get plain call.
GHC (Haskell): has its own STG tail-call discipline that never uses LLVM’s tail marker. Closures live on a separate heap, not the C stack.
Common thread: nobody trusts LLVM to figure this out from opaque i64 arguments. They either skip the marker entirely, or they do their own escape analysis in the frontend.
4.3 Rust’s actual handling of let x = if ... else { panic!() } (the surface pattern the test thought it was testing)
From Rust Reference — Never type and Never Type initiative:
- panic!() has type ! (never).
- When ! appears at a coercion site, the compiler inserts an implicit coercion absurd: ! -> T.
- The typechecker unifies the other branch’s type T with the overall expression type, and the ! branch is elided from value computation (it’s terminated by unreachable).
- Codegen lowers panics as normal calls that terminate with unreachable. There’s no “value of panic” flowing through phi nodes.
Quartz already does this correctly: mir_lower_expr_handlers.qz:2261-2266 marks panic/exit/unreachable calls as terminating their block with TERM_UNREACHABLE. The Never-type aspect of the failing test is not broken.
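The point that no “value of panic” reaches the merge can be shown with a small model of match lowering. This is a sketch only: the Arm record, the terminator strings, and merge_incoming are hypothetical names for illustration and do not correspond to Quartz’s MIR API.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class Arm:
    block: str                # label of the block that holds the arm body
    value: Optional[int]      # None when the arm never produces a value
    terminator: str           # "branch" (reaches the merge) or "unreachable"


def merge_incoming(arms):
    """Collect the (block, value) pairs that actually reach the merge block."""
    return [(a.block, a.value) for a in arms if a.terminator == "branch"]


arms = [
    Arm("some_arm", 42, "branch"),
    Arm("none_arm", None, "unreachable"),      # panic("was None")
    Arm("wild_arm", None, "unreachable"),      # panic("impossible")
]

print(merge_incoming(arms))  # [('some_arm', 42)]
```

Only the Some arm contributes a value, so the result of the match is well defined even though two of the three arms diverge.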
5. Root cause hypothesis
codegen_instr.qz:408-417 emits tail call purely based on “is the call result the block’s return value?” without checking whether any argument is a pointer into the caller’s stack frame.
When a tail-marked call does receive such a pointer, LLVM’s TCO pass (correctly, per its own rules) deallocates the caller’s frame before branching to the callee. The @value Option sitting in the caller’s alloca is overwritten by the callee’s prolog (callee-saved register spills). The callee loads garbage and returns it. The low byte of that garbage is 176 in the test and 0 in the simpler repro; both are wrong.
Why the Never test surfaced it: the panic arms prevent LLVM’s inliner from inlining get_or_die into main. Without inlining, there’s a real call site to optimize, which triggers the TCO. Non-panicking versions of the same pattern inline cleanly and don’t hit the bug.
6. Fix plan
Phase 1: Stop emitting tail call when any argument may point to caller stack (CORRECT FIX)
File: self-hosted/backend/codegen_instr.qz
Function: the call-emission path starting at line 407.
Change: add an argument-escape check before setting is_tail = 1. The check needs to answer: does any arg[i] transitively originate from an alloca in the current function?
Options for the check, in order of increasing precision:
- (a) Conservative: disable tail call whenever ANY argument is an MirReg whose def-site is a stack-alloca intrinsic (stack_alloc, value_struct_new, etc.) or a ptrtoint of one. This is easy to implement at the MIR level: when MIR lowers stack-allocated Option/struct constructors, tag the resulting reg with a “stack-pointer” bit. In codegen, if any call arg has that bit, force is_tail = 0.
- (b) Precise: compute a reaching-defs / taint set per register during MIR→LLVM lowering. Any reg whose taint set includes an alloca is a “stack pointer.” Tail-calls are rejected if any arg is a stack pointer.
- (c) Future-proof: stop using ptrtoint for stack-allocated struct values entirely. Pass them as typed ptr arguments. Then LLVM’s own escape analysis sees the alloca pointer and refuses TCO automatically. This is the right long-term shape (see Phase 3).
Recommended: implement (a) now. It is roughly 50 lines, errs on the safe side, and closes the bug. Add (c) as a follow-up because it pays off in other ways (better LLVM optimization, clearer IR).
Specific changes:
- self-hosted/backend/mir.qz: add a bit MirReg::escapes_via_stack (or reuse an existing flag byte). When MIR_STACK_ALLOC / MIR_VALUE_CTOR / any intrinsic that allocas produces a register, set the bit. Propagate through MIR_PTRTOINT and through any copy/move.
- self-hosted/backend/codegen_instr.qz:415: before is_tail = 1, loop over args[] and if any arg’s MirReg has the stack-pointer bit, keep is_tail = 0.
- Add a regression test covering the minimal repro (my_unwrap with a non-inlined body) so the bug cannot silently return.
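The bit propagation and the extra tail-call check from the list above could look roughly like this. It is a sketch on a made-up instruction record: Instr, the op strings, stack_pointer_regs, and may_tail_call are hypothetical names, not Quartz’s MIR. It also assumes straight-line code in which every definition precedes its uses; registers merged across blocks would need a fixpoint rather than a single pass.

```python
from dataclasses import dataclass, field

STACK_DEFS = {"stack_alloc", "value_ctor"}   # ops assumed to yield an alloca address
PROPAGATING = {"ptrtoint", "copy"}           # ops that forward their operand's bit


@dataclass
class Instr:
    op: str
    dest: int
    args: list = field(default_factory=list)


def stack_pointer_regs(instrs):
    """Return the set of registers that carry the stack-pointer bit."""
    tainted = set()
    for ins in instrs:
        if ins.op in STACK_DEFS:
            tainted.add(ins.dest)
        elif ins.op in PROPAGATING and any(a in tainted for a in ins.args):
            tainted.add(ins.dest)
    return tainted


def may_tail_call(call, returned_reg, tainted):
    """Existing rule (result is the return value) plus the proposed argument check."""
    return call.dest == returned_reg and not any(a in tainted for a in call.args)


body = [
    Instr("stack_alloc", 1),       # alloca for the Option
    Instr("ptrtoint", 2, [1]),     # address as i64
    Instr("const", 3),             # an ordinary integer
    Instr("call", 4, [2]),         # get_or_die(%2)
]
tainted = stack_pointer_regs(body)

print(sorted(tainted))                                    # [1, 2]
print(may_tail_call(body[3], 4, tainted))                 # False
print(may_tail_call(Instr("call", 5, [3]), 5, tainted))   # True
```

The check is deliberately one-sided: it can only remove a tail marker, never add one, so the worst outcome is a lost optimization.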
Quartz-time estimate: 0.5 day. Risk: low. Propagating one bit through MIR is mechanical. Worst case: some legitimately-tail-callable sites lose TCO; none of them are in hot loops in current Quartz code.
Phase 2: Activate the failing QSpec test
After Phase 1, the existing never_type_spec.qz test at line 98 should pass unchanged. Run the full never_type_spec.qz and stress_type_system_spec.qz suites to confirm.
Phase 3 (follow-up, not blocking): switch @value struct/enum args to typed ptr
Stop emitting ptrtoint %alloca to i64 followed by i64 arguments. Emit ptr arguments directly. Rewrite mir_sizeof_type-aware calls to use ptr types where the arg type is a stack-allocated struct. This requires threading struct-type information through the MIR→LLVM call lowering path. Benefits:
- LLVM’s own TCO escape analysis handles it (no manual tracking needed).
- Opt passes (DSE, GVN, memcpyopt) get better alias information.
- Debug info becomes more accurate.
- Matches what Rust/Swift/Zig do.
Quartz-time estimate: 2 days. Touches codegen_instr.qz, codegen_util.qz, and the MIR→LLVM type-name layer.
Risk: medium. Needs careful handling of mixed i64/ptr arg lists and of existing callees that expect i64.
Phase 4 (follow-up): do not emit the dead mir_emit_store_var after a terminated block
Unrelated but adjacent: in mir_lower_expr_handlers.qz:2989-2997 (match arm bodies) and similar sites, ctx.mir_emit_store_var(result_name, arm_val) is called even when the current block is already terminated by TERM_UNREACHABLE. The resulting store is dead code but adds IR noise. Guard these with a term_kind < 0 check before emitting. Not a correctness bug, just cleanup.
Quartz-time estimate: 0.25 day.
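The guard amounts to checking the terminator before appending the store. This is a minimal sketch, assuming a block with no terminator has term_kind below zero (as the term_kind < 0 check implies); the Block class, emit_store_var, and the value 3 standing in for TERM_UNREACHABLE are hypothetical.

```python
class Block:
    def __init__(self):
        self.instrs = []
        self.term_kind = -1      # assumed encoding: negative means no terminator yet


def emit_store_var(block, name, value):
    """Append a store only while the block is still open; report whether it was emitted."""
    if block.term_kind >= 0:
        return False             # block already terminated: the store would be dead
    block.instrs.append(("store_var", name, value))
    return True


open_block = Block()
dead_block = Block()
dead_block.term_kind = 3         # stand-in for TERM_UNREACHABLE after a panic call

print(emit_store_var(open_block, "result", 42))   # True
print(emit_store_var(dead_block, "result", 0))    # False
print(len(dead_block.instrs))                     # 0
```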
7. Out of scope
- Actual Never-type handling: already correct. Panic calls correctly terminate blocks with TERM_UNREACHABLE. Match arms correctly propagate the terminator. The never_type_spec.qz tests that pass today pass for the right reasons.
- The exit 176 magic number: incidental. It could just as easily be exit 0 or any other garbage value, depending on what the callee’s prolog spills into the slot. Once the TCO bug is fixed the test returns 42 and the specific garbage value no longer matters.
8. Citations
- LLVM Language Reference Manual — call instruction
- LLVM issue #72555 — Subtle issue with [[clang::musttail]]
- Rust RFC 1888 — Guaranteed TCO via become
- Rust tracking issue #112788 — explicit_tail_calls
- Rust Reference — Never type
- Never Type initiative RFC
- Never Type initiative — No inference changes
- A Look at the LLVM Tail Call Elimination Pass