Next session — PSQ-4 worst-case + PSQ-8 bundle
Baseline: 38eec1d8 (Batch D complete, all 5 of 5 landed, fixpoint 2288, smoke 4/4 + 22/22, B+C+D sweep 22/22 green)
Primary target: PSQ-4 worst-case (silent wrong-struct field reads through Vec<T> binding)
Secondary (if budget allows): PSQ-8 (typechecker accepts wrong-arity builtin calls → codegen crash)
Deliberately excluded: PSQ-2, PSQ-6, #11, #12, #19 — each needs its own session
Why this bundle
PSQ-4 is the highest-severity open bug on the board: silent wrong-data reads, no error, no crash, just garbage values. It’s the only remaining bug from the progress sprint that can corrupt production data without announcing itself. Every other open PSQ either crashes cleanly, fails to compile, or is cosmetic.
PSQ-8 sequences well with PSQ-4 because:
- Different subsystems — PSQ-4 lives in typecheck_expr_handlers.qz (type inference + field resolution); PSQ-8 lives in typecheck_builtins.qz + tc_expr_call (arity validation). No risk of the two fixes stepping on each other.
- Complementary cognitive load — PSQ-4 is deep and may require sustained focus; PSQ-8 is mechanical and good as a warm-up or wind-down.
- Both are typechecker work — once you’re in the typecheck files with the mental model loaded, PSQ-8 is ~50% marginal effort.
- Both are cheap to regression-lock — PSQ-4’s spec is in D1’s expand_node_audit_spec.qz (the worst-case repro is in PROGRESS_SPRINT_QUIRKS.md and trivial to promote to a spec); PSQ-8 needs a new one but it’s ~5 tests.
If PSQ-4 consumes the full session, PSQ-8 stays queued for the session after.
PSQ-4 — the primary target
Status at end of Batch D
PSQ-4 has TWO symptoms:
- Less-dangerous: cols[0].kind error “Unknown struct: Struct, unknown” inside closures. CLOSED by C2 (9517ef5b, Vec<T> ptype).
- Worse: two structs sharing a field name → silent wrong-offset reads through for item in Vec<T>. STILL OPEN — D1 verified it reproduces on current trunk.
The minimal repro is in docs/bugs/PROGRESS_SPRINT_QUIRKS.md:113 under PSQ-4 (updated Apr 15, 2026 with inline code block). Copy it to /tmp/psq4.qz and confirm total=0 instead of total=99 before making any changes.
The two-headed fix
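Per the PSQ-4 ROADMAP row, there are two bugs and they must be fixed together. Fixing #2 without #1 converts silent-wrong-data back into “Unknown struct”-style compile errors (QZ0603) on valid code, which is worse for UX than the current state; fixing #1 without #2 repairs this repro but leaves the name-search fallthrough live for every other receiver whose type is unknown.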
Bug #1: Type inference hole in tc_expr_index. When the indexed expression has type Vec<T>, the result type is T, not Int. Currently Vec<Live>[i] resolves to Int (or unknown), losing the element type. Fix in self-hosted/middle/typecheck_expr_handlers.qz — find tc_expr_index, propagate the generic type param from the Vec binding via tc_registry_get_vec_element_type (or the equivalent registry lookup) through the index expression’s result type.
Bug #2: Name-resolution fallthrough in tc_expr_field_access. When the receiver type is unknown or Int, the resolver searches all struct types in scope for a matching field name and picks the first match. This is the silent-wrong-struct behavior. Fix: when receiver is unknown, error with QZ0603 (cannot resolve struct for field access) instead of name-searching. Only dispatch to the struct whose type ID matches the receiver. QZ0603 is already registered in explain.qz for exactly this case.
The for item in Vec<Live> loop is the wedge between the two bugs: the loop desugars to var item = _vec[i], and _vec[i]’s type comes from bug #1. If bug #1 is fixed, item is correctly typed Live and the field access resolves to Live._ended via bug #2’s (already-correct) type-ID dispatch. Without bug #1’s fix, item is Int/unknown, and bug #2 triggers the name search.
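The interaction between the two bugs can be modelled in a few lines of Python. This is a toy sketch, not the Quartz typechecker: the struct layouts, the function names, and the fix_bug1 / fix_bug2 switches are all hypothetical, chosen only so that Progress is declared before Live and both carry an _ended field at different offsets.

```python
# Toy model of the PSQ-4 interaction. Everything here is hypothetical;
# the real logic lives in tc_expr_index and tc_expr_field_access.

class CheckError(Exception):
    """Stands in for a typechecker diagnostic."""

# Declaration order matters to the buggy name search (dicts keep insertion order).
# Assumed layouts: both structs have `_ended`, at different offsets.
STRUCTS = {
    "Progress": ["_ended", "total"],
    "Live": ["label", "_ended"],
}

def index_type(receiver_type: str, fix_bug1: bool) -> str:
    """Type of `receiver[i]`."""
    if fix_bug1 and receiver_type.startswith("Vec<") and receiver_type.endswith(">"):
        return receiver_type[len("Vec<"):-1]  # Vec<T>[i] has type T
    return "Int"  # bug #1: the element type is lost

def resolve_field(receiver_type: str, field: str, fix_bug2: bool):
    """Return (struct name, field offset) for `receiver.field`."""
    if receiver_type in STRUCTS:
        fields = STRUCTS[receiver_type]
        if field not in fields:
            raise CheckError(f"no field {field} on {receiver_type}")
        return receiver_type, fields.index(field)
    if fix_bug2:
        raise CheckError("QZ0603: cannot resolve struct for field access")
    # bug #2: search every struct by field name and take the first match
    for name, fields in STRUCTS.items():
        if field in fields:
            return name, fields.index(field)
    raise CheckError(f"no struct has field {field}")

def check_loop_field(vec_type: str, field: str, fix_bug1: bool, fix_bug2: bool):
    """`for item in vec` desugars to `var item = _vec[i]`, then `item.field`."""
    item_type = index_type(vec_type, fix_bug1)
    return resolve_field(item_type, field, fix_bug2)
```

In this model, check_loop_field("Vec<Live>", "_ended", False, False) returns ("Progress", 0), the wrong struct and the wrong offset, with no error. Enabling only fix_bug2 raises the QZ0603 diagnostic, and enabling both returns ("Live", 1).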
Order of operations
- Fix bug #2 first, alone. Confirm compile errors appear on the PSQ-4 repro (expected: “QZ0603: cannot resolve struct”). Confirm B+C+D sweep still passes — if any existing code relied on the name-search fallthrough, those callers need typed bindings too.
- Fix bug #1. The QZ0603 errors from step 1 should resolve automatically because item is now correctly typed.
- Confirm the PSQ-4 repro prints total=99.
- Add spec/qspec/vec_element_type_field_shadow_spec.qz — lock in both the two-struct case and at least 3 variations (different Vec binding shapes, different struct orderings, different field names) so this can’t regress.
Files to read first
- docs/bugs/PROGRESS_SPRINT_QUIRKS.md:113 — PSQ-4 row with minimal repro
- self-hosted/middle/typecheck_expr_handlers.qz — find tc_expr_index and tc_expr_field_access
- self-hosted/middle/typecheck_registry.qz — look for tc_registry_get_vec_element_type or equivalent Vec<T> element type helper
- self-hosted/middle/typecheck_generics.qz — may have additional Vec<T> inference helpers
- self-hosted/error/explain.qz — QZ0603 is already defined, confirm error message matches
Tensions and risks
- The name-search fallthrough may be load-bearing for legitimate code. Some existing callers may work only because unknown-type field access falls through to a struct name search. Run the B+C+D sweep early and often. If regressions appear, each one is a sign of a local binding that needs a type annotation.
- Generic Vec propagation touches type inference hot paths. H-M inference (typecheck_generics.qz) and the Vec<T> ptype system (C2’s fix) both interact with this. Keep the fix surgical — don’t refactor the inference algorithm.
- The fix may be more than one commit. If bug #2’s fix alone surfaces 10 regressions in the B+C+D sweep, don’t try to fix them all at once. Commit bug #2 with the regression-induced call site fixes bundled, then do bug #1 as its own commit.
Budget
Plan for 1 full session (4-8 quartz-hours). Don’t rush — this is the top-priority bug and a half-fix is worse than no fix.
PSQ-8 — the secondary target
Status
Filed Apr 15, 2026 during D5. Details in docs/bugs/PROGRESS_SPRINT_QUIRKS.md under PSQ-8. TL;DR:
def main(): Int
  var m = mutex_new() # 0 args, should be 1
  return 0
end
Crashes the compiler with index out of bounds: index 0, size 0 from cg_intrinsic_conc_sched.qz’s mutex_new handler. The typechecker doesn’t arity-check builtins.
Fix
Option 1 (correct, documented in the PSQ-8 row): extend tc_register_builtin to accept a required-count parameter, populate a parallel tc.registry.builtin_required_counts (or similar), and arity-check builtins at tc_expr_call entry. Emit QZ0170: Function X requires N arguments, got M.
Option 2 (band-aid): defensive args.size guards in every cg_intrinsic_*.qz handler. Don’t do this.
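The shape of Option 1 can be sketched as a toy model in Python. This is not the Quartz implementation: register_builtin, BUILTIN_REQUIRED_COUNTS, and check_builtin_call are hypothetical stand-ins for tc_register_builtin, the parallel required-counts table, and the check at tc_expr_call entry. The required counts are the ones implied by this document (mutex_new takes 1, channel_new takes a capacity, send takes a channel and a value), and treating the count as a minimum rather than an exact match is an assumption.

```python
# Toy model of builtin arity checking. All names are hypothetical stand-ins.

BUILTIN_REQUIRED_COUNTS: dict[str, int] = {}

def register_builtin(name: str, required_count: int) -> None:
    """Record how many arguments a builtin needs, alongside its registration."""
    BUILTIN_REQUIRED_COUNTS[name] = required_count

def check_builtin_call(name: str, args: list):
    """Return a diagnostic string for an under-supplied builtin call, else None."""
    required = BUILTIN_REQUIRED_COUNTS.get(name)
    if required is None:
        return None  # not a builtin; user-function arity is checked elsewhere
    if len(args) < required:  # assumption: required_count is a minimum
        return f"QZ0170: Function {name} requires {required} arguments, got {len(args)}"
    return None

# Counts implied by the handoff text.
register_builtin("mutex_new", 1)
register_builtin("channel_new", 1)
register_builtin("send", 2)
```

With this in place, check_builtin_call("mutex_new", []) yields the QZ0170 diagnostic before codegen ever indexes into an empty argument list, while check_builtin_call("mutex_new", [0]) returns None.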
Files to read
- self-hosted/middle/typecheck_builtins.qz — tc_register_builtin definition + ~400 builtin registrations
- self-hosted/middle/typecheck_expr_handlers.qz — tc_expr_call around lines 2488–2521 where user-function arity is checked
- self-hosted/backend/cg_intrinsic_conc_sched.qz:54 — the exact crash site (args[0].to_s() with empty args)
Budget
2-3 quartz-hours. Mechanical work once the registry field is added — the bulk is adding an arity argument to every tc_register_builtin call.
Regression lock spec
New spec/qspec/builtin_arity_spec.qz — ~5 tests:
- mutex_new() → QZ0170
- mutex_lock() → QZ0170
- channel_new() → QZ0170 (needs capacity arg)
- send(ch) → QZ0170 (needs value arg)
- Control: mutex_new(0) compiles
Adjacent items deliberately NOT in this bundle
- PSQ-6 (Vec.size reads 0 from I/O poller pthread) — shares “cross-thread Vec reads” territory with PSQ-4 at first glance, but the root cause is different. PSQ-4 is a typechecker type-inference hole; PSQ-6 is a codegen / memory model issue (non-atomic global loads, or register allocation hoisting). Investigate separately once PSQ-4 is closed.
- PSQ-2 (import progress cascade in std/quake.qz) — blocks sh_with_progress lift. Module resolver load-order bug. Own session, ~1 day.
- #11 Resolver full scope tracking — 1-2 days, own session.
- #12 Rust-style pattern matrix exhaustiveness — 3-5 days, multi-session.
- #19 Parser O(n²) fix (compiler memory opt Phase 3) — 1-2 weeks.
Session-start checklist
cd /Users/mathisto/projects/quartz
# 1. Verify baseline
git log --oneline -6 # should show 38eec1d8 D5 at top
git status # should be clean
./self-hosted/bin/quake guard:check # "Fixpoint stamp valid"
./self-hosted/bin/quake smoke 2>&1 | tail -6 # brainfuck 4/4, expr_eval 22/22
# 2. Read the key docs
cat docs/handoff/next-session-psq4-and-psq8.md # this document
sed -n '113,180p' docs/bugs/PROGRESS_SPRINT_QUIRKS.md # PSQ-4 full detail + repro
# 3. Reproduce PSQ-4 worst case to confirm baseline
# (copy the struct Progress/struct Live program from the PSQ-4 row to /tmp/psq4.qz)
./self-hosted/bin/quartz /tmp/psq4.qz 2>/dev/null > /tmp/psq4.ll
llc /tmp/psq4.ll -o /tmp/psq4.s && clang /tmp/psq4.s -o /tmp/psq4 -lm -lpthread && /tmp/psq4
# Expected: prints "total=0" (BUG). After the fix: prints "total=99".
# 4. Pre-PSQ-4 snapshot
rm -rf .quartz-cache
cp self-hosted/bin/quartz self-hosted/bin/backups/quartz-pre-psq4-golden
Success criteria
- Minimum viable: PSQ-4 worst-case committed (bug #1 + bug #2 fixed together), regression-locked by a new spec, B+C+D sweep still green. PSQ-8 may slip to the session after.
- Target: PSQ-4 worst-case committed + PSQ-8 committed in the same session. Both regression-locked.
- Stretch: above + PSQ-6 preliminary investigation (does the same root cause apply? write a minimal repro probe).
Each committed item must:
- Have quake guard fixpoint verified (2288 ± 30 functions)
- Pass quake smoke
- Pass the B+C+D regression sweep
- Have a regression-lock spec
- Have the PROGRESS_SPRINT_QUIRKS.md row updated with status + commit SHA
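For reference, the fixpoint tolerance above works out to an accepted range of 2258 to 2318 functions. A minimal sketch of that check follows; the helper name is hypothetical, treating the band as inclusive is an assumption, and how the function count is read out of quake guard is not specified here.

```python
BASELINE_FUNCTIONS = 2288  # fixpoint count at baseline 38eec1d8
TOLERANCE = 30             # allowed drift per committed item

def fixpoint_in_band(count: int,
                     baseline: int = BASELINE_FUNCTIONS,
                     tol: int = TOLERANCE) -> bool:
    """True when the fixpoint function count is within baseline ± tol (inclusive)."""
    return baseline - tol <= count <= baseline + tol
```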
Prime directives reminder (v2, Apr 12 2026)
- Pick highest-impact, not easiest — PSQ-4 is the top-priority open bug. Don’t get distracted by smaller items.
- Design before building; research what the world does — check how Rust / Swift / Go resolve Vec<T> element field access and handle the “unknown type + field name search” ambiguity. Swift’s existential-erasure story and Rust’s turbofish/never-fallthrough policy are both relevant prior art.
- Pragmatism ≠ cowardice; shortcuts = cowardice — if the name-search fallthrough has 20 legitimate callers, don’t quietly keep it. Add type annotations at each caller and make the fix complete.
- Work spans sessions — PSQ-4 may not finish in one session. Hand off cleanly.
- Report reality, not optimism — half-fixes are lies. “Bug #2 fixed, bug #1 not yet” is a valid report; “PSQ-4 closed” when total still reads 0 is not.
- Holes get filled or filed — any regression the sweep surfaces gets either fixed or filed as its own row.
- Delete freely, no compat layers — if bug #2’s fix breaks name-search fallthrough, the name-search code gets deleted, not deprecated.
- Binary discipline — quake guard mandatory, fix-specific backups mandatory.
- Quartz-time estimation — PSQ-4 is 4-8h. Don’t pad.
- Corrections are calibration — if I’m wrong about the two-headed diagnosis, update the plan and move. Don’t defend this document.