Unikernel virtio-net RX ring stall (Apr 19 2026)
Severity: HIGH (full service hang on production VPS)
Status: Observed, not yet diagnosed. Workaround: `systemctl restart quartz-unikernel`.
First observed: Apr 19 2026, mattkelly.io VPS, after several hours of serving real
traffic since the Apr 18 deploy.
Symptoms
- Host `ping` to the VPS responds normally.
- `nc -zv 195.35.36.247 8080` — TCP handshake succeeds (QEMU hostfwd accepts on :8080, forwards to guest :80).
- `curl http://195.35.36.247:8080/` — connection opens, then 0 bytes of data received; the request times out.
- `connections_served` counter at last-known-good was 16,371 (from a curl just before the hang was noticed; stats.json was unreachable by the time the hang was confirmed).
Pre-restart kernel log signature
[rx: u=16444 c=8816895 len=68]
[rx: u=16444 c=8816896 len=64]
[rx: u=16444 c=8816897 len=679]
[rx: u=16444 c=8816898 len=64]
...
`u` (virtio-net `used.idx`) stops advancing. The `c` counter (a local RX-loop
counter) keeps incrementing because the device is still delivering frames into the
ring — but the guest-visible `used.idx` is stuck, so `virtio_net_rx_wait` never
returns. The final log line before we restarted was `TCP: SYN from 10.0.2.2:42610`
— meaning one frame DID get through near the end, but then the ring wedged.
The `u=16444` value looks suspicious at first glance: if it sat on a wrap-around
boundary of the virtio ring (rings are typically 256 entries), that would point
directly at an index-wrap bug. But 16444 mod 256 = 60, not a clean boundary, so
the numerical coincidence is probably misleading. The real smoking gun is that
`u` froze at one specific value while thousands of frames kept arriving.
NOT the 16-slot TCP-table theory
Initial hypothesis was “all 16 per-connection TCP slots leaked” (based on handoff doc calling out no TIME_WAIT, no retransmits). This is almost certainly wrong:
- If TCP slots had leaked, we'd still see `u` advancing — the virtio device would happily deliver frames, the kernel would parse them, `tcp_find_slot` would return 0 (unknown peer), and `tcp_handle_frame` would drop them.
- The actual log shows `u` itself stuck, meaning the problem is BELOW TCP — at the virtio-net driver layer, not the TCP slot layer.
Likely root causes (ranked)
- Descriptor-ring wrap bug. `virtio_net_rx_post()` posts descriptors back to `avail.idx` on each completion. If the avail/used indices drift (e.g., off-by-one over hours of traffic), the device runs out of posted descriptors. The driver then polls `used.idx` forever, waiting for a completion that can't arrive because the device has nothing to deliver into. An existing comment in `hello_x86.qzn` near `virtio_net_rx_post` already flags a similar concern ("naïvely re-posting every iteration inflates avail.idx far past used.idx").
- Avail-ring corruption from a non-DMA write. The current driver re-uses `g_vnet_rx_buf` — a single 4 KiB page — for every RX. If a DMA write ever straddled the buffer (it shouldn't — MTU is 1500) we'd clobber the ring metadata. Low probability.
- Host-side QEMU TCG + virtio-mmio quirk that only triggers after a specific number of cycles or interrupts. Observed only on the Ubuntu 5.15 VPS; the repro may be host-specific.
Repro strategy (not yet attempted)
- Let the unikernel run under local `qemu-system-x86_64 -M microvm` for N hours with a traffic generator (e.g., `wrk -t2 -c4 -d24h http://127.0.0.1:8093/`).
- Log `u` and `avail.idx` every 1000 frames to watch the gap grow.
- When a hang reproduces, diff the `used`/`avail`/descriptor tables to find the exact off-by-one.
Real fix
DEF-B (IOAPIC + IRQ-driven RX) in `docs/handoff/kern4-to-joy-demo-handoff.md`.
The current polling loop is fundamentally fragile because `used.idx` is the only
signal the driver looks at — if that stalls for ANY reason, the kernel wedges.
An IRQ-driven RX path would:
- Hang a `hlt` on the idle task instead of polling.
- Use virtio-mmio's `InterruptStatus` register to distinguish "RX completion" from "config change" interrupts.
- Re-post descriptors defensively on every ISR, not just on each loop iteration.
This moves the bug into a region where it can at least be isolated to a specific interrupt path rather than hiding inside a monotonic-counter polling loop.
Workaround (live)
`ssh mattkelly.io systemctl restart quartz-unikernel` — takes under 2 seconds;
the ELF reloads from `/opt/quartz/quartz-unikernel.elf`. Unikernel state is
cleared on restart (stats counters go to 0, PMM pool re-initialized). No
user-visible data is lost because the unikernel is stateless.
Consider adding systemd `Restart=on-failure` + `WatchdogSec` for automatic
recovery — but the current driver won't crash, it just hangs, so the watchdog
needs to be external (HTTP health check from Caddy or a cron).