Overnight Handoff — Binary DSL Phase 2 kickoff
Baseline: 620d3ffe on trunk (Phase 1.5 complete — all 5 worked examples roundtrip, 81 binary-DSL tests green, fixpoint 2087 functions).
Design doc (canonical): docs/design/BINARY_DSL.md — 335 lines, 12 locked decisions.
Prior handoffs (READ FIRST):
docs/handoff/overnight-binary-dsl-phase-1-5-kickoff.md— the plan that landed 5/5.docs/handoff/overnight-binary-dsl-phase-1-5.md— Phase 1.4 follow-ups + D1-D10.docs/handoff/overnight-binary-dsl-phase-1-4.md— Phase 1.4 scope.docs/handoff/overnight-binary-dsl-phase-1.md— D1-D5 discoveries.
What’s shipped in Phase 1.5 (this session, 5 commits)
| STEP | Commit | Spec | Tests | What it unblocks |
|---|---|---|---|---|
| 1 | 05a9fbb7 | binary_varwidth_spec.qz | 5 | bytes / bytes(N) / cstring / pstring(uN) — IPv4 payload, MAC, UUID, TLS/DNS length-prefixed records |
| 2 | 326237a3 | binary_straddle_spec.qz | 3 | Sub-byte fields that cross byte boundaries (IPv4 frag_off u13be) |
| 3 | 799d0873 | binary_eof_spec.qz | 4 | Err(UnexpectedEof) on short buffers (no more reads past end) |
| 4 | c208887b | binary_strict_spec.qz | 6 | QZ0954-QZ0959 — strict as + .with validation at typecheck |
| 5 | 620d3ffe | binary_roundtrip_spec.qz +1 | 1 | IPv4Header roundtrip — closes all 5 worked examples |
Total: 19 new tests, 81 binary-DSL green. Fast path and variable path share the same prefix emitters (straddle, sub-byte, byte-aligned) — Phase 1.5 also simplified codegen by ~140 lines net despite adding three new codepaths.
What user code can now do
# Full IPv4Header — sub-byte + straddle + variable-width payload.
type IPv4Header = binary {
version: u4
ihl: u4
tos: u8
total: u16be
id: u16be
flags: u3
frag_off: u13be
ttl: u8
proto: u8
checksum: u16be
src: u32be
dst: u32be
payload: bytes
}
# Short buffer → UnexpectedEof, not a crash.
match IPv4Header.decode(bytes)
Ok(h) => process(h.src, h.dst, h.payload)
Err(UnexpectedEof) => log("truncated")
Err(e) => log_other(e)
end
# Strict `as` typechecking — typos caught at compile time.
packed struct(u32) GpioModer
pin0_mode: 2
# ... 15 more pins
end
var m = GpioModer { ... }
var raw = m as u32 # OK — packed + backing matches.
var r2 = m as u16 # QZ0955 — width mismatch.
var tweak = m.with { pin5_mode = 1 } # OK — field exists.
var typo = m.with { pin5_mdoe = 1 } # QZ0959 — unknown field.
Copy-paste handoff prompt (paste into a fresh session)
Read docs/handoff/overnight-binary-dsl-phase-2-kickoff.md FIRST.
Previous handoff docs (phase-1 through phase-1.5) have the D1-D10
discoveries and architectural context — load them if anything is
unclear. Design is locked in docs/design/BINARY_DSL.md (12 decisions).
Starting state (verified at handoff 620d3ffe):
- Trunk clean. Guard stamp valid at 2087 functions. Smoke green.
- 12 binary-DSL specs, 81 tests, all green.
- Session backup from 1.5 overwritten — create a new one:
cp self-hosted/bin/quartz self-hosted/bin/backups/quartz-pre-binary-phase2-golden
NEVER overwrite quartz-pre-binary-phase2-golden until the last STEP
of this session is committed AND smoke/spec suites pass. The rolling
quartz-golden that quake guard manages gets overwritten on every
successful build — the fix-specific copy is your escape hatch.
Phase 2 has three parallel tracks. Pick ONE and ship it clean. Don't
start another until the first is committed end-to-end with tests.
TRACK A — Computed fields (recommended for first commit).
Surface: `checksum: u16be = ip_tcp_checksum(pseudo_header, body)`.
Semantics: on encode, evaluate the expression with prior fields in
scope and write. On decode (v1), skip + trust. v2 can validate.
Scope: parser (extend ps_parse_binary_block_field to accept `=
<expr>` after the type), typecheck (type-check the expression in
the block's scope where prior fields are locals), codegen (PACK
evaluates expr → writes; UNPACK skips the slot).
Spec: spec/qspec/binary_computed_spec.qz — TCP checksum, pstring
length-from-string, PE IMAGE_NT_HEADERS size, gzip trailer.
Size estimate: 400-600 lines (parser + typecheck + codegen).
TRACK B — Discriminated unions inside binary blocks.
Surface: `kind: u8` + `match kind { 1 => ...; 2 => ... end`.
Required for TCP options, ELF sections, PE chunks, USB descriptors.
Harder than A — needs match semantics inside the binary block's
layout model, discriminator-driven variant selection at both pack
and unpack. Allow tag-skipping (decode reads kind but all variants
emit it on encode). Separate MIR opcode or extend BINARY_* ops.
Size estimate: 800-1200 lines.
TRACK C — Array forms [T; n] / [T; field] / [T] (STEP 1b follow-up).
The only Phase 1 deliverable left out. Primitives only in the first
pass (no nested binary blocks — that's Phase 2d). The [T; field]
cross-field case needs read-time resolution of the count field;
encode-time is trivial since struct field is known.
Spec: spec/qspec/binary_arrays_spec.qz.
Size estimate: 200-400 lines.
Recommendation: Track A first. Unblocks TCP/UDP/PNG/gzip checksums,
which every network spec in the wild wants. Computed fields also
extend naturally into Track B's variant dispatch later. Track C is
important but its omission doesn't block any of the common formats
covered in the design doc's worked examples.
Workflow per STEP (identical to Phase 1.5):
1. Write QSpec tests FIRST (red phase) — failures must be specific.
2. Implement the minimum to green.
3. Run `./self-hosted/bin/quake guard` before EVERY commit.
4. Smoke after every guard — brainfuck, expr_eval (both in ~10s).
5. Commit each STEP as a single coherent commit.
Prime Directives v2 compact:
1. Pick highest-impact, not easiest.
2. Design is locked (BINARY_DSL.md) — implement, don't redesign.
3. Pragmatism = sequencing correctly; shortcut = wrong thing.
4. Work spans sessions; don't compromise because context is ending.
5. Report reality. Partial = say partial.
6. Holes get filled or filed.
7. Delete freely. Pre-launch.
8. Binary discipline: guard mandatory, smokes + backups not optional.
9. Quartz-time = traditional ÷ 4.
10. Corrections = calibration, not conflict.
Stop conditions:
- Track complete with fixpoint stable → write next handoff with
remaining tracks.
- Blocked on compiler bug → file in Discoveries, commit what works.
- Context limit → stop at next clean commit boundary, write handoff.
Pointers (verified in Phase 1.5):
- cg_intrinsic_binary.qz is now ~1280 lines. Pack/unpack dispatch to
the variable or fast path via _cg_bin_find_var_split; both call
_cg_bin_emit_{pack_prefix_stores,unpack_prefix_reads} for the
fixed prefix (with straddle, sub-byte, byte-aligned support).
- Variable tail uses a runtime cursor `%v<d>.c<k>` that advances per
field. bytes/cstring/pstring all use the same chain pattern.
- EOF check is a branch + alloca-ret pattern:
br i1 <sz lt min>, err, ok
err: build Err, store ret_a, br join
ok: <body>, store ret_a, br join
join: load ret_a → %v<d>
- Typecheck strictness errors are QZ0954-QZ0959 in typecheck_walk.qz
at NODE_TYPE_CAST / NODE_BINARY_WITH.
- New helper: `_cg_bin_var_spec_class(spec)` classifies variable
specs into -3..N: -3 (not variable), 0 (bytes rest), N>0 (bytes(N)),
-1 (cstring), -2/-4/-5 (pstring u8/u16le/u32le), -6/-7 (pstring be
variants — codegen exists but no spec covers them yet).
Discoveries — Phase 1.5 session notes (append to prior D1-D10)
D11 — vec_new_filled(N, val) with Vec element type fills with zero
Pre-existing codegen bug in self-hosted/backend/cg_intrinsic_vec.qz
— the ew==8 branch of vec_new_filled allocates the data region but
calls memset(data, 0, N*8) and never writes the fill value. The
ew==1 branch is correct.
Impact: vec_new_filled(6, 0xaa) with an unannotated result (default
Vecbinary_varwidth_spec.qz work around by using explicit push().
Fix owner: STEP 1 follow-up. Low priority — no in-the-wild use so far.
D12 — Quartz has str_contains but no str_index_of
std/string.qz:196 def str_contains(s, sub): Bool returns bool only.
Searching for a byte position inside a string requires manual
str_byte_at(s, i) loops. The Phase 1.5 STEP 4 strictness checks
use tc_lookup_struct with a match on Found/NotFound to handle
generic annotations (e.g., Option<Int>) — an annotation that’s
not a registered struct name simply resolves to NotFound and the
check falls through lenient. Cleaner than string surgery.
D13 — Vec inside Bytes is i64-per-slot; not contiguous bytes
Bytes._data: Vec<Int> stores each byte as a 64-bit slot (matches
Quartz’s uniform Vecmemcpy(dst, src, n * 8) — the *8 is
mandatory. Copying from a String (raw char buffer, 1 byte per char)
into Bytes requires a byte-to-slot loop (zext i8 → i64 per element).
Reverse direction (Bytes → String) needs trunc i64 → i8.
Emit helpers: _cg_bin_emit_str_to_slots_loop,
_cg_bin_emit_slots_to_str_loop, _cg_bin_emit_cstring_scan.
D14 — Fast-path / variable-path prefix emitter unification
The original Phase 1.4 inline field-walks in cg_emit_binary_pack /
cg_emit_binary_unpack got extracted in STEP 2 into
_cg_bin_emit_pack_prefix_stores / _cg_bin_emit_unpack_prefix_reads.
Both fast and variable paths now call them. Net simplification:
~200 lines of duplicate walk code removed when straddle support
was added. Keep this unified design — future STEP work (computed
fields, arrays) should extend the helpers, not re-inline.
D15 — Alloca in mid-function IR is LLVM-acceptable
For loop counters in codegen (e.g., cstring scan, byte-to-slot loop),
an alloca i64 in the middle of a function is OK by the verifier.
Not optimal — the alloca won’t be hoisted to the entry block — but
correctness is preserved. Used in _cg_bin_emit_str_to_slots_loop
et al. Phase 2 computed-field codegen can use the same pattern for
intermediate scratch space.
STEP 1 follow-ups (filed, not shipped)
Items left out of Phase 1.5 that should land in Phase 2 or later:
F1 — Array forms [T; n] / [T; field] / [T]
Variable-width arrays of primitive T. Fixed count is compile-time known; count-prefixed reads from a prior field at runtime; rest-of- stream computes count from remaining buffer size. Primitive-only in the first pass — nested binary blocks in array positions are harder. See Track C above.
F2 — Sub-byte field after a variable field
Currently the variable-path emits “not yet supported” for sub-byte fields in the tail. The fix requires bit-cursor tracking (not just byte-cursor) in the tail loop. No current worked example needs this; defer until a real consumer asks for it.
F3 — Pad field after a variable field
Similar to F2 — tail loop assumes all variable-tail fields are real named fields, not pad placeholders. Easy fix (skip with no store) but no consumer yet.
F4 — vec_new_filled(N, val) Vec fill bug
See D11. Cosmetically wrong but harmless in practice because Phase 1.5 tests work around it. Still file as a real compiler bug, not a binary-DSL follow-up.
F5 — LE straddle
The straddle emitter handles BE only (the design default). LE is
rare — no current worked example uses u13le-type fields. File as
STEP 2b.
F6 — pstring(u16be) / pstring(u32be) coverage
Classes -6 and -7 exist in _cg_bin_var_spec_class but no spec
exercises them. If a consumer needs BE-length-prefixed strings (one
or two candidates in older network formats), add a spec.
Safety rails (verify before starting Phase 2)
- Quake guard before every commit. Pre-commit hook enforces it.
- Smoke after every guard. brainfuck + expr_eval are enough.
- Fix-specific backup at
self-hosted/bin/backups/quartz-pre-binary-phase2-golden(create it at the top of the next session — see copy-paste block). - Full QSpec NOT in Claude Code. The harness PTY can hang on
large runs. Use targeted
FILE=...invocations for spec files. - Crash reports first (CLAUDE.md): on silent SIGSEGV check
~/Library/Logs/DiagnosticReports/quartz-*.ipsbefore ASAN/lldb.
Test status after Phase 1.5
| File | Tests | Status |
|---|---|---|
binary_parse_spec.qz | 14 | 🟢 green |
binary_typecheck_spec.qz | 19 | 🟢 green |
binary_mir_spec.qz | 10 | 🟢 green |
binary_types_spec.qz | 5 | 🟢 green |
binary_methods_spec.qz | 3 | 🟢 green |
binary_bitcast_spec.qz | 3 | 🟢 green |
binary_with_spec.qz | 3 | 🟢 green |
binary_roundtrip_spec.qz | 5 | 🟢 green |
binary_varwidth_spec.qz (new) | 5 | 🟢 green |
binary_straddle_spec.qz (new) | 3 | 🟢 green |
binary_eof_spec.qz (new) | 4 | 🟢 green |
binary_strict_spec.qz (new) | 6 | 🟢 green |
| Total | 80 | 🟢 green |
(5 worked examples + roundtrip spec accounts for the “81” figure used
elsewhere — the roundtrip file has 5 cases. The table above collapses
to per-file totals; cross-check with FILE=... quake qspec_file.)
Full QSpec suite NOT run from Claude Code (CLAUDE.md protocol). Run
./self-hosted/bin/quake qspec in a terminal before calling Phase
1.5 “truly complete” if you want to catch any cross-spec regressions
introduced by this session.