Quartz v5.25

Overnight Handoff — Binary DSL Phase 2 Track B kickoff (TA-F5 fixed)

Baseline: 04fddae3 on trunk (TA-F5 fix landed). This handoff: Start Track B — discriminated unions inside binary {}. Fixpoint: 2093 functions. 14 binary-DSL specs, 93 tests, all green.

Design doc (canonical): docs/design/BINARY_DSL.md — 12 locked decisions.

Prior handoffs (read for context as needed):


What changed since Track C

TA-F5 FIXED (04fddae3). Track A compute expressions now resolve self.field.size and self.field.length on Vec/Map/String/Set-typed binary fields. Two-part fix in mir_lower.qz:

  1. Binary-block struct registry now records the real runtime annotation (Vec<Int> for primitive array fields, String for pstring/cstring, Bytes for bytes) instead of a blanket "Int" placeholder. A local _mir_bin_field_annotation helper mirrors typecheck’s _tc_bin_field_annotation to avoid a cross-phase import.
  2. NODE_FIELD_ACCESS in MIR now falls back to the vec_size/map_size/str_size/set_size intrinsic when typecheck hasn’t rewritten .size/.length on a collection-typed base (which happens because compute ASTs skip typecheck re-entry).

For Track B: you can freely write count: u16 = self.options.size or self.payload.length inside compute expressions. No workaround needed. binary_arrays_spec.qz test 6 is the reference pattern.
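The fallback in part 2 can be pictured with a small Python model. This is only an illustration of the dispatch idea, not the code in mir_lower.qz: the four intrinsic names come from the description above, while `SIZE_INTRINSICS`, `base_kind`, and `lower_size_access` are hypothetical names invented for the sketch.

```python
# Illustrative Python model of the .size/.length fallback, not the compiler's code.
# Only the four intrinsic names come from the handoff; all other names are hypothetical.
SIZE_INTRINSICS = {
    "Vec": "vec_size",
    "Map": "map_size",
    "String": "str_size",
    "Set": "set_size",
}


def base_kind(annotation: str) -> str:
    """Strip generic arguments from an annotation: 'Vec<Int>' -> 'Vec'."""
    return annotation.split("<", 1)[0].strip()


def lower_size_access(field_annotation: str, member: str):
    """Return the intrinsic name for `.size` / `.length` on a collection-typed
    base, or None when the access should be lowered as an ordinary field read."""
    if member not in ("size", "length"):
        return None
    # Bytes is left out on purpose: the handoff does not say which intrinsic it uses.
    return SIZE_INTRINSICS.get(base_kind(field_annotation))
```

The sketch also shows why part 1 of the fix matters: with a blanket "Int" annotation in the registry, the lookup would find no intrinsic and the access would fall through.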


Track B — Discriminated unions

Surface (from BINARY_DSL.md, unchanged)

type Tcp = binary {
  source_port:  u16be
  dest_port:    u16be
  seq:          u32be
  ack:          u32be
  data_offset:  u4
  reserved:     u3
  flags:        u9be
  window:       u16be
  checksum:     u16be
  urgent_ptr:   u16be
  options:      [TcpOption]           # Track C [T] form — already available
}

type TcpOption = binary {
  kind: u8
  match kind
    0 => { }                           # END_OF_LIST
    1 => { }                           # NOP
    2 => { mss: u16be }                # MSS option
    8 => { tsval: u32be; tsecr: u32be }  # Timestamps
  end
}

Semantics

  • Discriminator is always the FIRST field, and must be a primitive integer (u8, u16be, etc.). Compile error otherwise.
  • Each match arm adds additional fields after the discriminator. Arms with an empty body { } are valid (like TCP NOP).
  • Decode: read the discriminator, match it against the arms, dispatch to that arm’s field layout. If no arm matches and there is no default arm, return Err(ParseError::InvalidDiscriminant).
  • Encode: the Quartz value is an enum (generated from the match arms). Pack emits the discriminator + the matched arm’s fields.
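As a rough reference model for these semantics, the Python sketch below decodes and encodes the TcpOption example. It follows the simplified layout shown above (there is no option-length byte, unlike real TCP options on the wire). The variant names, the exception classes, and the `unpack_option` / `pack_option` helpers are assumptions made for the sketch, not the generated Quartz API.

```python
import struct

# Discriminator value -> (variant name, big-endian struct format of the arm's fields).
# Variant names are hypothetical; the formats follow the arm field types above.
ARMS = {
    0: ("EndOfList", ">"),
    1: ("Nop", ">"),
    2: ("Mss", ">H"),          # mss: u16be
    8: ("Timestamps", ">II"),  # tsval: u32be; tsecr: u32be
}


class InvalidDiscriminant(Exception):
    """Stands in for ParseError::InvalidDiscriminant."""


class UnexpectedEof(Exception):
    """Stands in for the existing EOF error path."""


def unpack_option(buf: bytes, offset: int = 0):
    """Read the discriminator, pick the arm, decode that arm's fields.
    Returns ((variant_name, fields), next_offset)."""
    if offset >= len(buf):
        raise UnexpectedEof("no discriminator byte")
    kind = buf[offset]
    if kind not in ARMS:  # no matching arm and no default arm
        raise InvalidDiscriminant(kind)
    name, fmt = ARMS[kind]
    start = offset + 1
    end = start + struct.calcsize(fmt)
    if end > len(buf):
        raise UnexpectedEof(f"arm {name} needs {end - start} bytes")
    return (name, struct.unpack(fmt, buf[start:end])), end


def pack_option(name: str, fields: tuple) -> bytes:
    """Emit the discriminator followed by the matched arm's fields."""
    for kind, (arm_name, fmt) in ARMS.items():
        if arm_name == name:
            return bytes([kind]) + struct.pack(fmt, *fields)
    raise ValueError(f"unknown variant {name!r}")
```

For example, `pack_option("Mss", (1460,))` yields the three bytes `02 05 b4`, and feeding those bytes back through `unpack_option` returns the same variant with a next offset of 3. A model like this can serve as an oracle when writing the B5 roundtrip spec.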

Scope

  1. Parser — accept match <discriminator_field> inside a binary block. Each arm is a literal integer (or range?) => { field_list }. Grammar: binary_union ::= 'match' IDENT NEWLINE (integer_literal '=>' '{' binary_field* '}' NEWLINE)* 'end'. Stashes variant specs into the binary block’s children as a new kind of AST node (e.g. NODE_BINARY_UNION) so typecheck + MIR can walk it.

  2. Typecheck — register the arms as enum variants. The outer binary block becomes a tagged-union struct. Field-name collisions across arms are OK; arm fields are only in scope inside their own arm’s decode/encode path. Consider whether to emit an implicit enum type TcpOptionKind = enum { EndOfList, Nop, Mss(u16), Timestamps(u32, u32) } — that would make pattern matching on decoded values ergonomic.

  3. MIR + codegen — extend cg_intrinsic_binary.qz. The variable-path emitters already dispatch per-field via _cg_bin_var_spec_class. Add a new class (e.g. -20) for the match dispatch. Pack emits the discriminator first, then a switch on the enum tag to the matched arm’s field-list emitter. Unpack reads the discriminator, switches on its value, dispatches to the matched arm’s unpack emitter, and constructs the tagged-union result.
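To make the parser item concrete, here is a minimal line-based Python sketch of the binary_union grammar. It is a toy under stated assumptions, not the Quartz parser: it accepts integer-literal arms only (no ranges), treats `;` as the field separator inside a one-line arm as in the Timestamps example, rejects duplicate arm values (the handoff does not specify this), and returns a plain dict standing in for a NODE_BINARY_UNION node. The function name `parse_binary_union` is hypothetical.

```python
import re

# One arm per line: <integer literal> => { field; field; ... }
ARM_RE = re.compile(r"^(\d+)\s*=>\s*\{(.*)\}$")


def parse_binary_union(lines):
    """Parse ['match kind', '0 => { }', ..., 'end'] into a dict that stands in
    for a NODE_BINARY_UNION AST node (hypothetical shape)."""
    # Strip '#' comments and blank lines before matching the grammar.
    cleaned = [ln.split("#", 1)[0].strip() for ln in lines]
    cleaned = [ln for ln in cleaned if ln]
    head = cleaned[0].split() if cleaned else []
    if len(head) != 2 or head[0] != "match" or not head[1].isidentifier():
        raise SyntaxError("expected 'match <discriminator_field>'")
    if cleaned[-1] != "end":
        raise SyntaxError("expected 'end' to close the match block")
    arms = {}
    for ln in cleaned[1:-1]:
        m = ARM_RE.match(ln)
        if not m:
            raise SyntaxError(f"malformed arm: {ln!r}")
        value = int(m.group(1))
        if value in arms:  # assumption: duplicate arm values are an error
            raise SyntaxError(f"duplicate arm for discriminator {value}")
        fields = []
        for part in m.group(2).split(";"):
            part = part.strip()
            if not part:
                continue  # empty body '{ }' is valid
            name, sep, ty = part.partition(":")
            if not sep or not name.strip() or not ty.strip():
                raise SyntaxError(f"malformed field: {part!r}")
            fields.append((name.strip(), ty.strip()))
        arms[value] = fields
    return {"node": "NODE_BINARY_UNION", "discriminator": head[1], "arms": arms}
```

Run on the match block from the TcpOption example, this produces four arms keyed 0, 1, 2 and 8, with empty field lists for the first two. The same cases (literal arms, empty bodies, malformed input) map directly onto the tests suggested for STEP B1.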

Suggested STEP decomposition

  • STEP B1: Parser. Accept match field ... end inside a binary block. AST surface. QSpec: binary_union_parse_spec.qz — 4-6 tests covering literal arms, empty-body arms, position rules (match must follow at least one prior field, which serves as the discriminator), and error recovery.
  • STEP B2: Typecheck variant registration. Declare the implicit enum type. Check discriminator is integer-primitive and first. QSpec: binary_union_typecheck_spec.qz — 4-6 tests.
  • STEP B3: PACK codegen. Emit discriminator + arm-switch + arm field-list. Single-test smoke first.
  • STEP B4: UNPACK codegen. Read discriminator, switch, allocate tagged-union struct with matched arm’s fields.
  • STEP B5: End-to-end roundtrip spec. binary_union_spec.qz — TCP options (NOP + MSS + Timestamps) as the first full protocol. PE section headers and ELF section types as stretch goals.

Estimate (quartz-time): B1 ~2 quartz-hours, B2 ~2, B3 ~3, B4 ~3, B5 ~2. Total: ~12 quartz-hours, or 1-2 sessions.


Copy-paste handoff prompt

Read docs/handoff/overnight-binary-dsl-phase-2-trackb-kickoff.md FIRST.
TA-F5 is fixed; Track B (discriminated unions) is the remaining Phase 2
track.

Starting state (verified at handoff):
- Trunk clean. Guard stamp valid at 2093 functions. Smokes green.
- 14 binary-DSL specs, 93 tests, all green.
- Session backup: self-hosted/bin/backups/quartz-pre-binary-phase2-taf5-golden.
  Before touching the compiler, snapshot a new fix-specific copy:
    cp self-hosted/bin/quartz self-hosted/bin/backups/quartz-pre-binary-phase2-trackb-golden

NEVER overwrite a fix-specific backup until the attempted fix is
committed end-to-end with tests and smokes passing. The rolling
quartz-golden managed by `quake guard` gets overwritten on every
successful build — your fix-specific copy is the recovery hatch.

Recommended STEP order:
1. B1 — parser surface. Write spec first.
2. B2 — typecheck variant registration + position checks.
3. B3 — PACK codegen for match-inside-binary.
4. B4 — UNPACK codegen.
5. B5 — end-to-end TCP option roundtrip spec.

Workflow per STEP (identical to prior phases):
1. Write QSpec tests FIRST (red phase).
2. Implement the minimum to green.
3. Run `./self-hosted/bin/quake guard` before EVERY commit.
4. Smoke after every guard — brainfuck + expr_eval (both ~10s each).
5. Commit each STEP as a single coherent commit.

Prime Directives v2 compact:
1. Pick highest-impact, not easiest.
2. Design is locked (BINARY_DSL.md) — implement, don't redesign.
3. Pragmatism = sequencing correctly; shortcut = wrong thing.
4. Work spans sessions; don't compromise because context is ending.
5. Report reality. Partial = say partial.
6. Holes get filled or filed.
7. Delete freely. Pre-launch.
8. Binary discipline: guard mandatory, smokes + backups not optional.
9. Quartz-time = traditional ÷ 4.
10. Corrections = calibration, not conflict.

Stop conditions:
- Track B complete with fixpoint stable → write next handoff.
- Blocked on compiler bug → file in Discoveries, commit what works.
- Context limit → stop at next clean commit boundary, write handoff.

Pointers (verified post-TA-F5):
- Binary-block struct registry annotation: `mir_lower.qz:5573-5720`
  (`_mir_bin_field_annotation` helper). Extend for union arm fields
  if needed.
- NODE_FIELD_ACCESS intrinsic fallback: `mir_lower.qz:1796-1870`.
- Track C helpers for variable-path array dispatch:
  `cg_intrinsic_binary.qz` — `_cg_bin_var_spec_class`,
  `_cg_bin_parse_array_info`, `_cg_bin_array_count_field_name`,
  `_cg_bin_find_prior_field_slot`. Mirror this pattern for the new
  union class (e.g. -20).
- Variable-tail pack emitter: `cg_emit_binary_pack` dispatches to
  `_cg_bin_emit_pack_variable` around line 1190.
- Variable-tail unpack emitter: `_cg_bin_emit_unpack_variable` around
  line 1180.
- EOF branch pattern (alloca ret_a, icmp ult sz min, br err/ok, join):
  already in use; the invalid-discriminant case should follow the same
  shape with `ParseError::InvalidDiscriminant`.

Test status (unchanged from Track C + TA-F5 commit)

| File | Tests | Status |
|---|---|---|
| binary_parse_spec.qz | 14 | green |
| binary_typecheck_spec.qz | 19 | green |
| binary_mir_spec.qz | 10 | green |
| binary_types_spec.qz | 5 | green |
| binary_methods_spec.qz | 3 | green |
| binary_bitcast_spec.qz | 3 | green |
| binary_with_spec.qz | 3 | green |
| binary_roundtrip_spec.qz | 5 | green |
| binary_varwidth_spec.qz | 5 | green |
| binary_straddle_spec.qz | 3 | green |
| binary_eof_spec.qz | 4 | green |
| binary_strict_spec.qz | 6 | green |
| binary_computed_spec.qz | 6 | green |
| binary_arrays_spec.qz | 7 | green |
| Total | 93 | green |

Smokes (post-guard): examples/brainfuck.qz, examples/expr_eval.qz — both pass with the TA-F5 binary.

Full QSpec suite NOT run from Claude Code (CLAUDE.md protocol). Run ./self-hosted/bin/quake qspec in a separate terminal after Track B lands to catch cross-spec regressions before declaring Phase 2 done.


Safety rails (verify before starting Track B)

  1. Quake guard before every commit. Pre-commit hook enforces it.
  2. Smoke after every guard. brainfuck + expr_eval are enough.
  3. Fix-specific backup at self-hosted/bin/backups/quartz-pre-binary-phase2-trackb-golden (create at top of next session — see prompt above).
  4. Full QSpec NOT in Claude Code. The harness PTY can hang on large runs. Use quake qspec_file FILE=spec/qspec/<name>.qz for targeted runs (NOT FILE=... quake qspec — that ignores FILE and runs the whole suite).
  5. Crash reports first (CLAUDE.md): on silent SIGSEGV check ~/Library/Logs/DiagnosticReports/quartz-*.ips before ASAN/lldb.