Overnight Handoff — Binary DSL Phase 1.5 (kickoff)
Baseline: a976ac5a on trunk (Phase 1.4 complete — 7 commits, 61 tests green, fixpoint 2072)
Design doc (canonical): docs/design/BINARY_DSL.md — 335 lines, 12 locked decisions
Prior handoffs (READ FIRST):
docs/handoff/overnight-binary-dsl-phase-1-5.md— status + 10 discoveries + gapsdocs/handoff/overnight-binary-dsl-phase-1-4.md— Phase 1.4 scopedocs/handoff/overnight-binary-dsl-phase-1.md— D1-D5 discoveries
Session goal: Close Phase 1.4 gaps (straddle + variable-width + bounds check + float + typecheck strictness) so all 5 worked examples round-trip, then start Phase 2 computed fields. Estimated 1000-1400 lines across 5 commits.
Copy-paste handoff prompt (paste this into a fresh session)
Read docs/handoff/overnight-binary-dsl-phase-1-5-kickoff.md FIRST, then
docs/handoff/overnight-binary-dsl-phase-1-5.md (prior handoff with
D1-D10 discoveries — load-bearing, you WILL hit them). Design is
docs/design/BINARY_DSL.md — 12 locked decisions, don't re-litigate.
Starting state (verified at handoff a976ac5a):
- Trunk clean at a976ac5a. Guard stamp valid. Smoke green.
- 61 binary-DSL tests green: parse 14, typecheck 19, mir 10, types 5,
methods 3, bitcast 3, roundtrip 4, with 3.
- Fixpoint 2072 functions (gen1 == gen2 byte-identical).
- Session backup pre-Phase-1.4: self-hosted/bin/backups/quartz-pre-binary-codegen-golden
(safe to keep, or overwrite with a fresh quartz-pre-binary-phase2-golden
before risky work).
SAVE a NEW fix-specific backup BEFORE you touch a single .qz:
cp self-hosted/bin/quartz self-hosted/bin/backups/quartz-pre-binary-phase15-golden
NEVER overwrite quartz-pre-binary-phase15-golden until the last STEP
of this session is committed AND the full binary spec suite passes.
The rolling quartz-golden that quake guard manages gets overwritten
on every successful build.
Phase 1.5 has 5 STEPs — Phase 1.4 gap closure. Ship as 5 small commits.
If a STEP blows out, stop at the next clean commit boundary and write
a partial handoff. Don't compromise scope for speed.
STEP 1 — Variable-width fields (bytes / cstring / pstring(uN) /
[T;n] / [T;field] / [T]). Biggest gap. Unlocks IPv4 payload + DNS
+ TLS + PE/ELF chunked formats. ~300-450 lines across typecheck +
codegen. Spec: spec/qspec/binary_varwidth_spec.qz.
STEP 2 — Straddling sub-byte fields (u13be-style across byte
boundaries). Unlocks IPv4 full roundtrip. ~150-250 lines in
cg_intrinsic_binary.qz (extend _cg_bin_emit paths to handle
cross-byte shifts). Spec: spec/qspec/binary_straddle_spec.qz.
STEP 3 — UnexpectedEof bounds check in UNPACK. Quick win. At the
top of cg_emit_binary_unpack, emit: `if bytes.size() < expected
return Err(UnexpectedEof)`. ~50-100 lines. Requires the fixed-bits
total to be known at codegen (it is, via the layout registry).
Spec: spec/qspec/binary_eof_spec.qz.
STEP 4 — Typecheck strictness for `as` and `.with`. Fire friendly
errors at typecheck time per design #11 / #12:
- `as uN`: receiver must be packed-struct with backing == N
- `as StructName`: source must be Int
- `.with { field = ... }`: each named field must exist on the
packed struct; typos should error
~80-120 lines in typecheck_walk.qz. Spec extends binary_bitcast
and binary_with with assert_compile_error cases.
STEP 5 — Round out IPv4Header. Adds IPv4 to binary_roundtrip_spec.qz
now that STEP 1+2 unblock it. If this compiles and runs clean,
write Phase 2 kickoff handoff. ~50 lines test only.
Phase 2+ roadmap (if STEPs 1-5 land with context remaining):
- Phase 2a: computed fields `value: u16be = checksum(payload)`
- Phase 2b: discriminated unions (TCP options, USB descriptors)
- Phase 3: dogfood — migrate cg_intrinsic_intmap.qz to
IntMapHeader.decode() (gates on Phase 2a + stable 1.5)
Workflow per STEP (identical to Phase 1.4):
1. Write QSpec tests FIRST (red phase) — failure must be specific
(wrong IR content, wrong exit code, specific error message).
2. Implement minimum to green them.
3. Run `./self-hosted/bin/quake guard` before EVERY commit. Never
skip. Never --no-verify.
4. Smoke tests after every guard — brainfuck + style_demo +
expr_eval. Restore from quartz-pre-binary-phase15-golden if
any regress.
5. Commit each STEP as a single coherent commit.
Prime Directives v2 compact:
1. Pick highest-impact, not easiest.
2. Design is locked (BINARY_DSL.md) — implement, don't redesign.
3. Pragmatism = sequencing correctly. Shortcut = wrong thing because
right is hard. Name the path.
4. Work spans sessions. Don't compromise because context is ending.
5. Report reality. Partial = say partial. "Should work" = a lie.
6. Holes get filled or filed (Discoveries section in handoff).
7. Delete freely. Pre-launch.
8. Binary discipline: guard mandatory, fixpoint + smoke not optional,
fix-specific backups before risky work.
9. Quartz-time = traditional ÷ 4.
10. Corrections = calibration, not conflict.
Stop conditions:
- STEP 5 complete with IPv4 roundtrip green and fixpoint stable →
done. Write Phase 2 kickoff handoff covering computed fields.
- Blocked on a compiler bug → file in Discoveries section with
minimal repro, commit what works, write partial handoff.
- Approaching context limit → stop at next clean commit boundary,
write handoff for next session.
Helpful pointers (verified in Phase 1.4 session):
- cg_intrinsic_binary.qz is ~720 lines. Pack/unpack emitters walk
the MIR layout registry field-by-field; extending for straddle
adds a third branch (neither byte-aligned nor single-byte).
- std/binary.qz exports ParseError with UnexpectedEof /
InvalidValue(field, expected, got) / LengthOverflow(field,
declared, remaining). Use those exact variants.
- _tc_bin_parse_numeric_width handles u/i/f<N>[le|be] generically
for N in 1..64.
- _cg_bin_parse_width_info exposes (width, is_float, is_signed,
is_le, has_endian) for codegen.
- Bytes is `struct Bytes { _data: Vec<Int> }`. Vec<Int>'s runtime
rep is [cap, size, data_ptr, elem_width] = 32-byte header +
malloc'd data region. Each byte is stored as an i64 slot (elem
width = 8) — design quirk, don't fight it.
- Result::Ok layout is [tag=0, payload_ptr]; Err is [tag=1, err_ptr].
ParseError::UnexpectedEof is a unit variant (tag=0 in ParseError's
enum, stored as Int).
- The `; === Binary DSL Layouts ===` IR manifest from 1.3 stays
useful — keep it; binary_mir_spec.qz asserts on it.
- NODE_BINARY_WITH = 96, NODE_TYPE_CAST = 97. Next free: 98.
- RESOLVE_TAG_BINARY_BLOCK = 13, RESOLVE_TAG_PACKED_STRUCT = 14.
For the caller (outside the prompt)
Recommended invocation: open a fresh Claude Code session, paste the block above, let it run overnight.
When it returns:
git log --oneline trunk— confirm STEP 1-5 commits (or partial).- Run
./self-hosted/bin/quake qspecin a terminal to catch cross-spec regressions. - Read any new Discoveries section (D11+) appended to the phase-1-5 handoff — new quirks to remember.
- If the session ended partial, paste the new handoff prompt it wrote.
Phase 1.5 STEP details
STEP 1 — Variable-width fields (target: 1 commit, ~300-450 lines)
Goal: bytes, bytes(n), cstring, pstring(uN), [T; n], [T; field],
[T] all pack and unpack correctly.
Scope:
_tc_bin_field_annotationin typecheck.qz: mapbytes/bytes(n)→"Bytes"cstring→"String"pstring(u8)/pstring(u16le)/ etc. →"String"[T; n]/[T; field]/[T]→"Vec<T>"(recurse on T)
cg_intrinsic_binary.qz:- PACK — when the field’s width is 0 (variable), load the
Bytes/String/Vec handle from the struct, extract its inner Vec
data pointer + size, then:
bytes(n): copy exactly n bytes to the output buffer (error at runtime if size != n? design is quiet; use n as limit).bytes(rest-of-stream): copy all bytes to the tail.cstring: copy bytes then append a 0 terminator.pstring(uN): write the length prefix as uN, then the bytes.[T; n]: emit n copies of T’s pack sequence, reading from the Vec’s data pointer at indices 0..n. [T; field]: same but n = value of the priorfield(look up at codegen time — requires cross-field awareness).[T]: all remaining elements.
- UNPACK — mirror the above; for length-prefixed and count-prefixed variants, read the length / prior-field value first then loop.
- PACK — when the field’s width is 0 (variable), load the
Bytes/String/Vec handle from the struct, extract its inner Vec
data pointer + size, then:
- The total-bytes computation in PACK needs to become “minimum
fixed bytes” plus runtime-computed variable-length additions.
For
bytes(n)and[T; n]the size is known; for the rest it’s runtime. Handle by:- Fixed prefix → same as STEP 4 codegen.
- Variable tail → compute the size at runtime by summing the variable field sizes, allocate a Vec of that size, pack.
Tests (spec/qspec/binary_varwidth_spec.qz):
bytes(n)fixed-length blob (e.g.,mac: bytes(6)) round-trips.bytesrest-of-stream (e.g., IPv4payload: bytes) round-trips.cstringround-trips with null terminator.pstring(u8)round-trips with single-byte length prefix.pstring(u16le)round-trips with 16-bit LE length prefix.[u8; 4](fixed) round-trips.[u16be; count](length-prefixed) round-trips — thecountfield comes earlier in the binary block.[u32be](rest-of-stream) round-trips.
STEP 2 — Straddling sub-byte fields (target: 1 commit, ~150-250 lines)
Goal: Fields with bit_in_byte + width > 8 pack/unpack correctly
MSB-first across byte boundaries.
Example: IPv4 frag_off: u13be at bit offset 51 straddles
bytes 6 and 7. It occupies:
- bits 3..7 of byte 6 (5 high bits of the field go here)
- bits 0..7 of byte 7 (8 low bits of the field go here)
Implementation sketch:
# In cg_emit_binary_pack, add a third branch after sub-byte-single-byte:
elif (bit_in_byte + width) > 8 and width <= 64:
# Compute which bytes the field spans.
first_byte = bit_offset / 8
last_byte = (bit_offset + width - 1) / 8
span = last_byte - first_byte + 1
total_bit_pos = bit_offset + width # MSB-first end position
# For MSB-first: the high bits of the field value go into the high
# bits of first_byte, and the low bits go into the low bits of
# last_byte.
# For each byte in [first_byte, last_byte]:
# bits_in_this_byte = min(8 - bit_in_byte (first only), 8)
# Shift field value right by (field_width - bits_written_so_far - bits_in_this_byte)
# Mask to bits_in_this_byte
# OR into the output byte at the correct bit position
For BE (default): the first byte gets the MSB chunk. For LE: the byte order reverses (byte 0 of the field’s bytes lands at byte last_byte in the output).
Tests (spec/qspec/binary_straddle_spec.qz):
flags: u3; frag_off: u13beafterflagsat bit 48 round-trips.- A 24-bit BE field straddling 3 bytes (bits 4..27 of a 4-byte block).
- A sub-byte LE field for completeness (rare in real formats).
- IPv4 partial: just the flags + frag_off pair.
STEP 3 — UnexpectedEof bounds check (target: 1 commit, ~50-100 lines)
Goal: UNPACK returns Err(ParseError::UnexpectedEof) when the
input Bytes buffer is smaller than the layout’s minimum expected size.
Implementation:
At the top of cg_emit_binary_unpack, after loading the Vec size:
%v<d>.needed = add i64 0, <total_bytes>
%v<d>.cmp = icmp ult i64 %v<d>.sz, %v<d>.needed
br i1 %v<d>.cmp, label %<d>.eof, label %<d>.ok
<d>.eof:
; Construct ParseError::UnexpectedEof (tag = 0 in ParseError enum)
%v<d>.errp = call ptr @malloc(i64 16)
store i64 0, ptr %v<d>.errp ; ParseError::UnexpectedEof tag
; Wrap in Result::Err (tag = 1)
%v<d>.rep = call ptr @malloc(i64 16)
store i64 1, ptr %v<d>.rep
%v<d>.rep2 = getelementptr i64, ptr %v<d>.rep, i64 1
%v<d>.errpi = ptrtoint ptr %v<d>.errp to i64
store i64 %v<d>.errpi, ptr %v<d>.rep2
%v<d>.retE = ptrtoint ptr %v<d>.rep to i64
br label %<d>.join
<d>.ok:
; existing unpack body ...
%v<d>.retO = ptrtoint ptr %v<d>.rp to i64
br label %<d>.join
<d>.join:
%v<d> = phi i64 [%v<d>.retE, %<d>.eof], [%v<d>.retO, %<d>.ok]
Tests (spec/qspec/binary_eof_spec.qz):
PngIhdr.decode(bytes_of_length_5)returnsErr(UnexpectedEof).IntMapHeader.decode(bytes_of_length_39)returnsErr.- Exact-length buffer returns
Ok.
STEP 4 — Typecheck strictness for as and .with (target: 1 commit, ~80-120 lines)
as checks (NODE_TYPE_CAST in typecheck_walk.qz):
- Target is
u8/u16/u32/u64:- Require source to be a packed struct. Error if source struct
is not DSL-kind 2 (
tc_struct_dsl_kind != 2). - Require source’s backing width == target width. Friendly error citing both widths.
- Error code:
QZ0954: 'as <target>' on non-packed type 'TypeName'orQZ0955: backing width mismatch — 'TypeName' is packed struct(uN), not uM.
- Require source to be a packed struct. Error if source struct
is not DSL-kind 2 (
- Target is a registered struct name:
- Require target to be DSL-kind 2 (packed).
- Require source type to be Int.
- Error codes: QZ0956, QZ0957.
.with {} checks (NODE_BINARY_WITH in typecheck_walk.qz):
- Receiver must be a packed struct. Error QZ0958 if not.
- Each override field name must exist on the struct. Error QZ0959 naming the unknown field + the valid field list.
Tests: Extend binary_bitcast_spec.qz and binary_with_spec.qz
with assert_compile_error cases per error code.
STEP 5 — IPv4Header roundtrip (target: 1 commit, ~50 lines spec)
Goal: Finish Phase 1.5 by extending binary_roundtrip_spec.qz
with the IPv4Header example from BINARY_DSL.md. Should pass as soon
as STEP 1 (variable-width payload: bytes) and STEP 2 (straddling
frag_off: u13be) ship.
it("IPv4Header: sub-byte + straddle + payload") do ->
assert_run_exits("""
import * from binary
import * from bytes
type IPv4Header = binary {
version: u4
ihl: u4
tos: u8
total: u16be
id: u16be
flags: u3
frag_off: u13be
ttl: u8
proto: u8
checksum: u16be
src: u32be
dst: u32be
payload: bytes
}
def main(): Int
var p = Bytes { _data: vec_new_filled(4, 42) }
var h = IPv4Header {
version: 4, ihl: 5, tos: 0, total: 24, id: 0x1234,
flags: 0b010, frag_off: 0x1abc, ttl: 64, proto: 6,
checksum: 0x9abc, src: 0x0a000001, dst: 0x0a000002,
payload: p,
}
var b = h.encode()
match IPv4Header.decode(b)
Ok(h2) => {
return 1 if h2.version != 4
return 2 if h2.ihl != 5
return 3 if h2.total != 24
return 4 if h2.flags != 0b010
return 5 if h2.frag_off != 0x1abc
return 6 if h2.src != 0x0a000001
return 7 if h2.payload.size() != 4
return 0
}
Err(_) => return 99
end
end
""", 0)
end
When this passes, all 5 worked examples from the design doc round-trip. Write a Phase 2 kickoff handoff if context remains.
Phase 2 preview (for the session that gets this far)
If STEPs 1-5 ship with context remaining, start Phase 2a. The design has this sketched but not implemented:
Phase 2a — Computed fields
type TcpHeader = binary {
src_port: u16be
dst_port: u16be
seq: u32be
ack: u32be
data_off: u4
rsvd: u3
ns: u1
flags: u8
window: u16be
checksum: u16be = ip_tcp_checksum(pseudo_header, body)
urgent: u16be
options: [u8; data_off * 4 - 20] # also computed: depends on data_off
body: bytes
}
The = expr suffix means: on encode, evaluate and write; on decode,
skip and validate (optional, v2).
Implementation path:
- Parser: extend
ps_parse_binary_block_fieldto accept= exprafter the type spec. Store the expr handle in the NODE_BINARY_FIELD. - Typecheck: type-check the expression in the block’s scope (where prior fields are in scope as locals).
- Codegen: PACK evaluates the expression (closure over prior field values) before writing; UNPACK either skips or validates.
Parse surface is the biggest risk — decide parser grammar early.
Phase 2b — Discriminated unions
See BINARY_DSL.md phasing section. Match on a discriminator to pick a variant layout. Large — separate handoff session.
Phase 3 — Dogfood intmap header
Once Phase 2a is stable + variable-width works, rewrite
cg_intrinsic_intmap.qz to use IntMapHeader.decode() instead of
manual getelementptr loads. Mechanical but proves the DSL is
production-ready.
Safety rails (verify before starting)
- Quake guard before every commit.
./self-hosted/bin/quake guard. Pre-commit hook enforces it. Never —no-verify. - Smoke after every guard. brainfuck + style_demo + expr_eval.
- Fix-specific backup at
quartz-pre-binary-phase15-golden. Don’t overwrite until Phase 1.5 is committed end-to-end. - Full QSpec NOT in Claude Code. Run targeted specs from the harness; give the user the full-suite command to paste in a terminal if a cross-spec regression is suspected.
- Crash reports first (CLAUDE.md): on silent SIGSEGV check
~/Library/Logs/DiagnosticReports/quartz-*.ipsbefore ASAN/lldb.