Handoff — Interactive Quartz kernel: PMM, Vec, Map, preemption, scheduler, serial RX, page fault
Head: a374961b on trunk. Fixpoint stable at 2138 functions.
One session, fourteen commits. The kernel went from “prints Hi under QEMU” to a preemptive, allocating, cooperatively-scheduled, interactive system that catches its own page faults.
Direct reproduction
$ timeout 4 qemu-system-x86_64 -kernel hello_x86.elf -serial stdio \
-display none -no-reboot
Hi
TRAP
PMM 5/5 (7/256 pgs)
VEC 4 sum=48
MAP 5 sum=1500
A ran 49, tick=50
B ran 50, tick=100
A ran 50, tick=150
B ran 50, tick=200
A ran 50, tick=250
sched done (rx=0)
With host-to-guest input:
$ printf "hello" | timeout 4 qemu-system-x86_64 -kernel hello_x86.elf \
-serial stdio -display none -no-reboot
Hi
TRAP
PMM 5/5 (7/256 pgs)
VEC 4 sum=48
MAP 5 sum=1500
[e][l][l][o]A ran 50, tick=50
B ran 50, tick=100
A ran 50, tick=150
B ran 50, tick=200
A ran 50, tick=250
sched done (rx=4)
Each marker proves something:
| Line | Proves |
|---|---|
| Hi | UART 16550 init + qz_main reached |
| TRAP | IDT[3] dispatch → x86_intrcc ISR → iret resumes main |
| PMM 5/5 (N/256) | 1 MiB bump allocator; pointer round-trips through .bss |
| VEC 4 sum=48 | Quartz stdlib Vec<Int>: malloc + realloc + memcpy work |
| MAP 5 sum=1500 | Quartz stdlib Map<K,V>: hash + rehash + bucket alloc work |
| [e][l][l][o] | Serial RX via IRQ4 → IDT[0x24] → serial_rx_isr echoes |
| A ran ... / B ran ... | Timer drives a two-task cooperative scheduler |
| sched done (rx=N) | Final completion + RX byte tally |
| (not triggered normally) PF at 0x.. err=0x.. | #PF handler installed as safety net; prints CR2 + error code |
Automated by quake baremetal:qemu_boot_x86_64, which now gates on all
markers, including A ran, B ran, and sched done.
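The err value in the PF marker is the architectural #PF error code. A minimal host-side sketch of decoding it follows; the names pf_info, pf_decode and pf_print are hypothetical and not part of hello_x86.qz, and the printf call stands in for the kernel's UART helpers.

```c
#include <stdbool.h>
#include <stdint.h>
#include <stdio.h>

/* Architectural x86 #PF error-code bits (low five bits only). */
enum {
    PF_ERR_PRESENT = 1u << 0, /* 0: page not present, 1: protection violation */
    PF_ERR_WRITE   = 1u << 1, /* 0: read access,      1: write access */
    PF_ERR_USER    = 1u << 2, /* 0: supervisor mode,  1: user mode */
    PF_ERR_RSVD    = 1u << 3, /* reserved bit set in a paging structure */
    PF_ERR_FETCH   = 1u << 4  /* fault was an instruction fetch */
};

/* Hypothetical decoded view of the error code. */
typedef struct {
    bool present, write, user, reserved_bit, instr_fetch;
} pf_info;

pf_info pf_decode(uint64_t err) {
    pf_info i;
    i.present      = (err & PF_ERR_PRESENT) != 0;
    i.write        = (err & PF_ERR_WRITE) != 0;
    i.user         = (err & PF_ERR_USER) != 0;
    i.reserved_bit = (err & PF_ERR_RSVD) != 0;
    i.instr_fetch  = (err & PF_ERR_FETCH) != 0;
    return i;
}

/* Host-side stand-in for the ISR's report line; the decoded suffix is an
 * illustration, the real handler prints only CR2 and the raw error code. */
void pf_print(uint64_t cr2, uint64_t err) {
    pf_info i = pf_decode(err);
    printf("PF at 0x%llx err=0x%llx (%s, %s, %s)\n",
           (unsigned long long)cr2, (unsigned long long)err,
           i.present ? "protection" : "not-present",
           i.write ? "write" : "read",
           i.user ? "user" : "supervisor");
}
```

Under these assumptions, a supervisor write to an unmapped address would report err=0x2.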
This session’s commits (on top of 7a61eb1a)
Fourteen commits. Three compiler changes, nine KERN.1 milestones, two handoff updates.
| SHA | Scope |
|---|---|
| c4e53893 | SYS.1 8a fix. extern "x86-interrupt" now emits ret void + ptr byval([5 x i64]) first param. Passes LLVM’s x86_intrcc validator. |
| 37c95240 | KERN.1: IDT skeleton. 4 KiB .bss + 10-byte IDTR + three asm shims. breakpoint_isr prints “TRAP” on int3. |
| 67b2dcf5 | KERN.1: div-by-zero + iret. Slot 0 gets divide_error_isr. Proves real hardware exceptions dispatch through the IDT. |
| 5f530add | SYS.1: wrmsr / rdmsr intrinsics. Hard-register constraints {ecx},{eax},{edx}. Unblocks APIC, IA32_EFER, SYSCALL. |
| 10c5c1dc | KERN.1: preemption. pic_init remaps 8259A to 0x20-0x2F. pit_init @100 Hz. timer_isr at 0x20. |
| a3efab1b | Handoff update. |
| 9fc06cde | KERN.1: PMM bump + 16 MiB identity map. 1 MiB pool; boot paging expanded to 8 × 2 MiB huge pages. |
| a3ac92e1 | KERN.1: Vec. libc_stubs.c: real malloc/realloc/memcpy backed by pmm_alloc_page + 16-byte header. |
| 026fe319 | KERN.1: Map<K,V> in kernel. map_new() + map_set × 5 works. |
| 8cb5238e | KERN.1: string interpolation. Freestanding qz_alloc_str + qz_str_get_len + hand-rolled libc-free to_str. |
| f98869ba | Handoff update after interp. |
| b139d3a9 | KERN.1: toy scheduler. Two tasks A/B alternate on 50-tick phase windows. Main dispatches; timer drives phase. |
| 1e7f8215 | KERN.1: serial RX via IRQ4. UART IER bit 0 + PIC mask 0xEE + serial_rx_isr at vector 0x24 + echo loop. |
| a374961b | KERN.1: page fault handler. read_cr2() + page_fault_isr(frame, err) at vector 14. 2-arg x86_intrcc signature proven. Safety net: halts loudly on any future paging bug. |
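The interrupt-controller constants in the table (PIC mask 0xEE, vector 0x24, 100 Hz) can be cross-checked with a little arithmetic. This is a sketch only: the helper names are hypothetical, and the PIT input clock of 1193182 Hz is the standard PC value rather than something stated in the commits.

```c
#include <assert.h>
#include <stdint.h>

#define PIT_INPUT_HZ 1193182u /* standard PC PIT oscillator, about 1.193182 MHz */

/* Master-PIC mask with only the listed IRQ lines (0..7) unmasked.
 * In the 8259A mask register a set bit disables the line. */
uint8_t pic_mask_unmasking(const uint8_t *irqs, int n) {
    uint8_t mask = 0xFF;
    for (int i = 0; i < n; i++) {
        assert(irqs[i] < 8);
        mask &= (uint8_t)~(1u << irqs[i]);
    }
    return mask;
}

/* IDT vector for an IRQ line after remapping the PIC to `base`. */
uint8_t irq_vector(uint8_t base, uint8_t irq) {
    return (uint8_t)(base + irq);
}

/* PIT channel 0 reload value for a target rate, rounded to nearest. */
uint16_t pit_divisor(uint32_t hz) {
    assert(hz > 0);
    uint32_t d = (PIT_INPUT_HZ + hz / 2) / hz;
    assert(d >= 1 && d <= 0xFFFF); /* must fit the 16-bit reload register */
    return (uint16_t)d;
}
```

Unmasking IRQ0 and IRQ4 gives 0xEE, IRQ4 at base 0x20 lands on vector 0x24, and 100 Hz needs a reload value of 11932.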
Kernel surface now (hello_x86.qz, ~340 LoC)
tools/baremetal/hello_x86.qz
├── UART 16550 driver (init, putc, puts, put_str, put_uint, put_hex)
├── @panic_handler (panic_halt)
├── extern asm shims (idt_*, pmm_pool_*)
├── ISRs
│ ├── breakpoint_isr (vec 3, "TRAP" + iret)
│ ├── divide_error_isr (vec 0, "DIV0" + halt)
│ ├── page_fault_isr (vec 14, 2-arg, CR2 + err + halt)
│ ├── timer_isr (vec 0x20, tick++ + EOI)
│ └── serial_rx_isr (vec 0x24, drain RX FIFO + echo + EOI)
├── read_cr2 (@c("mov %cr2, $0"))
├── IDT code (set_entry, zero, install — all Quartz)
├── PIC init (ICW1..ICW4 + unmask IRQ0, IRQ4)
├── PIT init (channel 0 @ 100 Hz)
├── PMM (init, alloc_page, zero_page, stats)
└── main (init everything, prove everything, run scheduler)
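For orientation, the long-mode IDT gate layout that set_entry has to produce can be sketched in C. This is not the Quartz code: idt_gate, idt_make_gate and idt_set_entry are hypothetical names, and the code-segment selector the kernel actually uses depends on the trampoline's GDT.

```c
#include <assert.h>
#include <stdint.h>

/* One 64-bit IDT gate descriptor. */
typedef struct {
    uint16_t offset_lo;  /* handler address bits 0..15 */
    uint16_t selector;   /* code-segment selector */
    uint8_t  ist;        /* bits 0..2: IST index, 0 = none */
    uint8_t  type_attr;  /* P | DPL | gate type */
    uint16_t offset_mid; /* handler address bits 16..31 */
    uint32_t offset_hi;  /* handler address bits 32..63 */
    uint32_t reserved;   /* must be zero */
} idt_gate;

_Static_assert(sizeof(idt_gate) == 16, "long-mode IDT gates are 16 bytes");

#define IDT_INTERRUPT_GATE 0x8Eu /* present, DPL 0, type 0xE (interrupt gate) */

idt_gate idt_make_gate(uint64_t handler, uint16_t selector) {
    idt_gate g;
    g.offset_lo  = (uint16_t)(handler & 0xFFFF);
    g.selector   = selector;
    g.ist        = 0;
    g.type_attr  = IDT_INTERRUPT_GATE;
    g.offset_mid = (uint16_t)((handler >> 16) & 0xFFFF);
    g.offset_hi  = (uint32_t)(handler >> 32);
    g.reserved   = 0;
    return g;
}

void idt_set_entry(idt_gate *idt, unsigned vec, uint64_t handler,
                   uint16_t selector) {
    assert(vec < 256);
    idt[vec] = idt_make_gate(handler, selector);
}
```

256 such gates occupy exactly 4 KiB, which matches the IDT storage size noted in 37c95240.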
Boot glue:
tools/baremetal/boot_trampoline_x86.s (~180 LoC asm)
├── pml4 / pdpt / pd (paging tables in .bss)
├── idt_storage / idtr_storage
├── pmm_pool_start / pmm_pool_end (1 MiB .bss)
├── _start (.code32) 32→64 long-mode transition
│ ├── GDT setup
│ ├── 8 × 2 MiB huge PDEs → 16 MiB identity map
│ └── CR4.PAE + CR3 + EFER.LME + CR0.PG + ljmp
└── long_mode_start (.code64) data seg reload + call qz_main + hlt
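The 8 × 2 MiB identity map amounts to eight page-directory entries. A sketch of the computation follows, assuming present + writable + huge-page flags (0x83); the flags the trampoline really sets are not shown in this handoff, and the function names are hypothetical.

```c
#include <stdint.h>

#define HUGE_PAGE_SIZE 0x200000ull /* 2 MiB */
#define PDE_PRESENT    0x001ull
#define PDE_WRITABLE   0x002ull
#define PDE_HUGE       0x080ull    /* PS bit: entry maps a 2 MiB page directly */

/* Fill the first `count` page-directory entries so that virtual == physical. */
void fill_identity_pd(uint64_t *pd, unsigned count) {
    for (unsigned i = 0; i < count; i++) {
        pd[i] = (uint64_t)i * HUGE_PAGE_SIZE
              | PDE_PRESENT | PDE_WRITABLE | PDE_HUGE;
    }
}

/* True if `addr` falls inside the identity-mapped range of `count` huge pages. */
int identity_mapped(uint64_t addr, unsigned count) {
    return addr < (uint64_t)count * HUGE_PAGE_SIZE;
}
```

With count = 8 the mapped range ends at 0x1000000 (16 MiB), so an address such as the APIC base 0xFEE00000 falls outside it.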
Libc surface:
tools/baremetal/libc_stubs.c
├── pmm_alloc_page (weak stub — overridden by Quartz def in kernel)
├── malloc → bump through pmm_alloc_page + 16-byte size header
├── realloc → malloc + copy via header
├── free → no-op
├── memcpy / memset (real byte-wise)
└── qsort (no-op)
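The malloc/realloc scheme listed above (bump allocation over PMM pages, with a 16-byte size header that realloc reads back) can be sketched on a host as follows. Everything prefixed sk_ is hypothetical; the static array stands in for the PMM pool, and the real libc_stubs.c may lay out its header differently.

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

#define PAGE_SIZE  4096u
#define POOL_PAGES 256u /* 1 MiB, matching the N/256 pgs marker */
#define HDR_SIZE   16u  /* size header; keeps payloads 16-byte aligned */

static _Alignas(16) uint8_t pool[POOL_PAGES * PAGE_SIZE];
static size_t next_page;
static uint8_t *heap_cur, *heap_end;

/* Stand-in for pmm_alloc_page: hands out pool pages in address order. */
void *sk_pmm_alloc_page(void) {
    if (next_page == POOL_PAGES) return NULL;
    return pool + PAGE_SIZE * next_page++;
}

void *sk_malloc(size_t size) {
    if (size > sizeof pool) return NULL;
    size_t need = HDR_SIZE + ((size + 15u) & ~(size_t)15u);
    /* Pull pages until the current run has room for header + payload. */
    while (heap_cur == NULL || (size_t)(heap_end - heap_cur) < need) {
        uint8_t *page = sk_pmm_alloc_page();
        if (page == NULL) return NULL;
        if (page != heap_end) heap_cur = page; /* non-adjacent: start a new run */
        heap_end = page + PAGE_SIZE;
    }
    memcpy(heap_cur, &size, sizeof size); /* record requested size in header */
    void *payload = heap_cur + HDR_SIZE;
    heap_cur += need;
    return payload;
}

/* Read the requested size back out of the header in front of `p`. */
size_t sk_alloc_size(const void *p) {
    size_t n;
    memcpy(&n, (const uint8_t *)p - HDR_SIZE, sizeof n);
    return n;
}

/* realloc = fresh malloc + copy of min(old size, new size); old block leaks. */
void *sk_realloc(void *old, size_t size) {
    void *fresh = sk_malloc(size);
    if (fresh != NULL && old != NULL) {
        size_t keep = sk_alloc_size(old);
        if (keep > size) keep = size;
        memcpy(fresh, old, keep);
    }
    return fresh;
}

void sk_free(void *p) { (void)p; } /* no-op, as in the stub */
```

The header is what lets realloc copy without the caller passing the old size. The cost of this design is that nothing is ever reclaimed, so a Vec that grows repeatedly consumes pool pages monotonically.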
Language / compiler additions this session
New intrinsics: wrmsr(msr, val), rdmsr(msr) — both with
hard-register constraints.
New freestanding runtime helpers (emitted automatically when
--target x86_64-unknown-none-elf):
- qz_alloc_str: length-prefixed allocation via malloc
- qz_str_get_len: read length prefix
- to_str: libc-free i64 → String with pointer-sniff + hand-rolled digit-fill. Makes interpolation work for Int values without any hosted runtime dependency.
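A rough C sketch of the length-prefix plus digit-fill idea behind these helpers is below. It assumes an 8-byte length stored directly in front of the bytes, which may not match the real Quartz string layout; the sk_ names are hypothetical, host malloc stands in for the kernel allocator, and the pointer-sniff step is not modelled.

```c
#include <stdint.h>
#include <stdlib.h>

/* Assumed layout: 8-byte length, then the bytes, then a NUL for convenience. */
typedef struct {
    int64_t len;
    char bytes[];
} sk_str;

sk_str *sk_alloc_str(int64_t len) {
    sk_str *s = malloc(sizeof *s + (size_t)len + 1);
    if (s == NULL) return NULL;
    s->len = len;
    s->bytes[len] = '\0';
    return s;
}

int64_t sk_str_get_len(const sk_str *s) { return s->len; }

/* i64 -> decimal string without snprintf or any other libc formatting. */
sk_str *sk_to_str(int64_t v) {
    char tmp[20]; /* the magnitude of an i64 has at most 19 digits */
    int neg = v < 0;
    /* Negate in unsigned arithmetic so INT64_MIN does not overflow. */
    uint64_t mag = neg ? 0 - (uint64_t)v : (uint64_t)v;
    int n = 0;
    do { /* emit digits least-significant first */
        tmp[n++] = (char)('0' + mag % 10);
        mag /= 10;
    } while (mag != 0);

    sk_str *s = sk_alloc_str(n + neg);
    if (s == NULL) return NULL;
    if (neg) s->bytes[0] = '-';
    for (int i = 0; i < n; i++) /* reverse into final position */
        s->bytes[neg + i] = tmp[n - 1 - i];
    return s;
}
```

Filling digits into a scratch buffer and reversing avoids needing to know the digit count before allocating.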
Compiler bug fixes:
- x86_intrcc emits ret void (not ret i64 0) when return type is Void; matters for ISRs with any trailing expression.
- x86_intrcc first param is ptr byval([5 x i64]) (not i64); required by LLVM’s validator.
What KERN.1 needs next
Current kernel proves the core primitives. What’s left are real subsystems. Pick ≥1 per session.
1. Real context-switching scheduler. Current scheduler is
dispatcher-in-a-loop — each task is a function call. Upgrade to
saved per-task state: task struct with rsp, a per-task stack
allocated from the PMM, a switch_to(from, to) asm helper that
pushes callee-saved regs, swaps RSP, pops, rets. 30-50 LoC asm
+ 50 LoC Quartz. Sets up for real green threads inside the kernel.
2. Remaining CPU exception handlers (1-31). Mechanical.
Cookie-cutter per vector. Error-code-carrying vectors (#DF, #TS,
#NP, #SS, #GP, #AC) use the 2-arg signature now proven in
a374961b.
3. APIC + LAPIC timer. wrmsr/rdmsr shipped. Replace PIC +
PIT with LAPIC timer. Needs MMIO mapping of the APIC base
(0xFEE00000) — our current 16 MiB identity map doesn’t cover it.
Add a 4 KiB mapping for the APIC page + use MMIO helpers. Higher
frequency, lower latency, SMP-ready. ~100 LoC including mapping
helpers.
4. Multiboot2 memory-map consumption. PVH boot path hands us
a start_info_t via rdi in long mode. Walk it, find usable RAM
ranges, resize the PMM pool. Turns 1 MiB pool into
“however-much-RAM-is-actually-there” (128+ MiB under QEMU default).
~50 LoC.
5. More stdlib types in kernel. Already have Vec
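For item 4, the "find usable RAM, resize the pool" step could look roughly like the sketch below. It assumes an E820-style entry layout (address, size, type, with type 1 meaning usable RAM); the struct and function names are hypothetical, and locating the table from whatever the boot path hands over is not shown.

```c
#include <stddef.h>
#include <stdint.h>

#define SK_PAGE    4096ull
#define SK_MEM_RAM 1u /* E820-style "usable RAM" type */

/* Assumed entry layout; the real boot-info structure may differ. */
typedef struct {
    uint64_t addr;
    uint64_t size;
    uint32_t type;
    uint32_t reserved;
} sk_mem_entry;

/* Pick the largest page-aligned usable range at or above `floor`
 * (for example, the end of the kernel image). Returns 1 on success. */
int sk_pick_pool(const sk_mem_entry *map, size_t n, uint64_t floor,
                 uint64_t *start, uint64_t *pages) {
    uint64_t best_lo = 0, best_len = 0;
    for (size_t i = 0; i < n; i++) {
        if (map[i].type != SK_MEM_RAM) continue;
        uint64_t lo = map[i].addr > floor ? map[i].addr : floor;
        uint64_t hi = map[i].addr + map[i].size;
        lo = (lo + SK_PAGE - 1) & ~(SK_PAGE - 1); /* round start up */
        hi &= ~(SK_PAGE - 1);                     /* round end down */
        if (hi > lo && hi - lo > best_len) {
            best_lo = lo;
            best_len = hi - lo;
        }
    }
    if (best_len == 0) return 0;
    *start = best_lo;
    *pages = best_len / SK_PAGE;
    return 1;
}
```

A pool chosen this way can extend past the current 16 MiB identity map, so the mapping would need to grow along with it.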
Nice-to-haves
- @[section(...)] codegen: let multiboot_headers.s + IDT storage become Quartz source.
- @panic_handler routing for prelude bounds / overflow / map-key panic helpers (currently stub to unreachable).
- read_cr2 / read_cr3 / write_cr3 / invlpg intrinsics: mirror of wrmsr. Current @c("mov %cr2, $0") works but dedicated intrinsics would be cleaner.
- More libc-free runtime helpers for String + Bool + F64 interpolation. Current to_str covers Int.
- PIC EOI as a helper (currently open-coded port_out8(0x20, 0x20) in three ISRs).
State of the tree
- Branch: trunk
- HEAD: a374961b
- Fixpoint: 2138 functions, gen1 == gen2 byte-identical
- Smokes: brainfuck + style_demo PASS
- QSpec: freestanding 17/17, port_io 8/8, msr 5/5, x86_intr 5/5, preserve_cconv 5/5
- Six baremetal tasks all green (qemu_boot_x86_64 asserts “Hi” + “TRAP” + “PMM 5/5” + “VEC 4 sum=48” + “MAP 5 sum=1500” + “A ran” + “B ran” + “sched done”)
- Backup binaries saved:
  - backups/quartz-pre-idt-golden
  - backups/quartz-pre-msr-golden
  - backups/quartz-pre-interp-golden
Recommended next thread
Context-switching scheduler. It’s the biggest remaining “real kernel” piece + the one that turns the toy scheduler into something that can host concurrent work. 1 session, high confidence, unlocks the effects-track landing path.
Alternative: APIC + LAPIC timer for modern interrupt delivery + SMP foundation. Bigger scope (MMIO mapping work) but high-visibility.
Alternative: Multiboot2 memory map for a real PMM consuming all of system RAM. Small, strategic.