Quartz Launch Readiness Audit — Full Report

Date: March 10, 2026 (original) | Last updated: April 6, 2026
Version Audited: v5.26.0-alpha (trunk branch)
Methodology: Adversarial codebase audit across 13 areas, 9 parallel investigation agents, cross-referenced against specs, tests, docs, and compiler source.

April 6, 2026 — Status Update

Revised Grade: B+ (up from B-)

Since the original audit, significant progress has been made:

Union type codegen: Implemented (15/15 tests now active)

Bounded generic dispatch: Implemented (dispatch logic added)

Record type parameters: Partially fixed

README rewritten: All build commands verified, accurate API examples, all 12 benchmarks shown

Error diagnostics: Wired up with Rust-style formatting, caret underlining, “Did you mean?” suggestions

WASM backend: Fully operational with playground

2,561 functions (up from 1,357), 6,400+ tests (up from 5,307), 479 spec files

New features: postfix ? operator, puts auto-coercion, string/map iteration, char literals, negative indexing

Table Stakes: 18/21 items complete (was 0/21 at audit time)

Remaining blockers:

Intermittent scheduler hang (~3%, needs park/wake refactor)

Diagnostic wiring still incomplete in some paths

No package manager

1. EXECUTIVE SUMMARY

Overall Readiness Grade: B- (see April 2026 update above for revised B+ grade)

Quartz is a technically impressive, genuinely self-hosted systems language with real innovations (structural typing, existential erasure, Ruby-familiar syntax). The compiler is battle-tested (6,400+ tests, fixpoint verified). ~~But marketing claims outrun implementation in several high-visibility areas, and onboarding friction would kill adoption on contact.~~

Top 3 Strengths

Self-hosting compiler is real and proven — gen2==gen3 byte-identical, 2,561 functions, fixpoint verified. This is the single strongest credibility signal.
Memory safety model works for single-threaded code — Move semantics, borrow checker, Drop/RAII all battle-tested through self-compilation. 286 safety tests, ~99% passing.
Performance is genuinely competitive — 9/12 benchmarks at C parity after optimization. json_parse 4.5x faster than C (length-prefixed strings).

Top 3 Blockers (March 2026 — most now resolved)

~~Union types advertised but codegen missing~~ — ✅ FIXED: Union codegen implemented, 15/15 tests active.
~~Onboarding is broken~~ — ✅ FIXED: README rewritten with verified commands, accurate examples.
~~Claims outrun reality~~ — Partially fixed. README updated to be accurate. Some edge cases remain.

2. DETAILED FINDINGS

AREA 1: The “Ruby-Like” Claim

Verdict: OVERSTATED. Reposition.

What’s genuinely Ruby-like:

def/end syntax, string interpolation ("Hello #{name}"), puts, postfix conditionals (return 0 if done), symbols (:ok), for-in loops, pattern matching, blocks/closures with do/end
These are real and work well — a Ruby dev will feel some familiarity

What’s not Ruby-like at all:

Mandatory type annotations on all function signatures (opposite of Ruby’s philosophy)
No metaprogramming (no method_missing, eval, define_method, send, open classes)
Collections require function calls (vec_len(), vec_get()) not method chains (.map { |x| })
No splat (*args), no keyword rest (**kwargs), no symbol-to-proc (&:upcase)
No duck typing — structural dispatch blocked by design (25 tests pending, incompatible with existential erasure)

Recommendation: Change tagline from “Write it like Ruby” to “Ruby-familiar syntax for systems code” or “Learn it in an afternoon if you know Ruby.” Add a prominent “NOT Ruby” section listing what doesn’t transfer.

AREA 2: The Performance Claim (“Run it like C”)

Verdict: TRUE WITH UNDISCLOSED CAVEATS. Fix or caveat.

Current benchmark reality (12 benchmarks):

Result	Benchmarks	Notes
At C parity (1.00x)	fibonacci, sum, sieve, matrix, string_concat, binary_trees_bump	6 benchmarks
Faster than C	json_parse (0.22x — 4.5x faster)	Length-prefixed strings
Slower than C	linked_list (1.42x), nbody (1.29x), binary_trees (1.20x), hash_map (1.16x)	4 benchmarks
Conditional	struct_heavy (1.0x with `--stack-alloc`, 6.0x without)	Non-default flag
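
The ratios in the table appear to be Quartz time divided by C time, so a value below 1.00 means Quartz is faster. A minimal Python sketch of that arithmetic, using only numbers from the table; the function names and the 5% parity tolerance are assumptions for illustration and are not taken from benchmarks/run.sh:

```python
# Ratios copied from the table above; assumed convention: Quartz time / C time.
RATIOS = {
    "json_parse": 0.22,
    "linked_list": 1.42,
    "nbody": 1.29,
    "binary_trees": 1.20,
    "hash_map": 1.16,
}


def speedup(ratio: float) -> float:
    """Convert a time ratio into a 'times faster than C' factor."""
    return 1.0 / ratio


def classify(ratio: float, tolerance: float = 0.05) -> str:
    """Bucket a ratio; the 5% tolerance is a hypothetical parity threshold."""
    if ratio < 1.0 - tolerance:
        return "faster"
    if ratio > 1.0 + tolerance:
        return "slower"
    return "parity"


for name, ratio in RATIOS.items():
    print(f"{name}: {ratio:.2f}x -> {classify(ratio)}")

# 1 / 0.22 is about 4.5, which is where the "4.5x faster" figure comes from.
print(f"json_parse speedup: {speedup(RATIOS['json_parse']):.1f}x")
```

Publishing the raw ratio alongside the derived "Nx faster" figure lets readers check the claim without rerunning the suite.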

Problems:

README shows stale data — Old numbers predating optimization work
README cherry-picks 6 of 12 benchmarks — Omits 4 slower ones
Requires non-default flags — opt -O2 and --stack-alloc needed for advertised numbers but not documented
BENCHMARKS.md contradicts itself — Claims “no opt passes” but run.sh runs opt -O2
Default compiler output is unoptimized — Users following README get worse performance than advertised

Recommendation: Update README with all 12 benchmarks, current numbers, and required flags. Change “beat C” to “matches or beats C on most benchmarks with LLVM optimization.”

AREA 3: The Type System

Verdict: CORE WORKS, EXPERIMENTAL FEATURES ARE LANDMINES. Fix before launch.

Feature status matrix:

Feature	Parse	TypeCheck	Codegen	Tests	Status
Structs, enums, Option/Result	Yes	Yes	Yes	100%	(a) Works end-to-end
Generics (single-param)	Yes	Yes	Yes	88%	(a) Works
Traits + UFCS dispatch	Yes	Yes	Yes	74%	(a) Mostly works
Type inference (local H-M)	Yes	Yes	Yes	~95%	(a) Works
Union types	Yes	Yes	No	0/15	(c) LANDMINE
Intersection types	Yes	Yes	Partial	12/17	(b) Partial
Record types	Yes	Partial	Partial	1/6	(c) LANDMINE
Bounded generic dispatch	Yes	Yes	No	0/9	(c) LANDMINE

Critical landmines:

Union types (15/15 pending): Parser accepts Int | String, typecheck validates subtyping, but codegen is completely missing. No tag/discriminant emission. User code silently fails at runtime. The reference docs say “codegen not yet implemented” but the parser doesn’t reject it.
Record type parameters (5/6 pending): docs/QUARTZ_REFERENCE.md shows def get_x(r: { x: Int }): Int as a working example. It doesn’t compile. Return position works; parameter position doesn’t.
Bounded generics (9/38 pending in traits): def max<T: Ord>(a: T, b: T): T parses and typechecks constraints but MIR has no dispatch logic. Can’t call trait methods inside generic body.
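
For readers unfamiliar with the term, "tag/discriminant emission" means storing a small marker next to each union value so that runtime code can tell which variant it holds. The Python sketch below illustrates the idea for Int | String. It is not Quartz's design; the tag values and the names `Tagged`, `wrap`, and `describe` are hypothetical.

```python
from dataclasses import dataclass
from typing import Union

TAG_INT, TAG_STRING = 0, 1  # hypothetical discriminant values


@dataclass(frozen=True)
class Tagged:
    """A union value as a lowered pair: discriminant plus payload."""
    tag: int
    payload: Union[int, str]


def wrap(value):
    """Attach the discriminant that a union codegen would need to emit."""
    if isinstance(value, bool) or not isinstance(value, (int, str)):
        raise TypeError(f"not a member of Int | String: {value!r}")
    tag = TAG_INT if isinstance(value, int) else TAG_STRING
    return Tagged(tag, value)


def describe(u: Tagged) -> str:
    """Dispatch on the tag, as a lowered match expression would."""
    if u.tag == TAG_INT:
        return f"Int({u.payload})"
    if u.tag == TAG_STRING:
        return f"String({u.payload!r})"
    raise ValueError(f"unknown tag {u.tag}")


print(describe(wrap(3)))
print(describe(wrap("hi")))
```

Without the tag, `describe` has nothing to branch on, which is the failure mode the audit describes: the program typechecks but cannot distinguish variants at runtime.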

Recommendation: Either implement union codegen before launch OR remove union/record/bounded-generic examples from all documentation and add compiler errors when users try to use them: “QZ9999: Union type codegen not yet implemented.”

AREA 4: Memory Safety

Verdict: STRONG FOR SINGLE-THREADED CODE. Concurrency story incomplete.

What works (battle-tested through self-compilation):

Move semantics: 69 tests, comprehensive tracking across assignments/calls/returns/conditionals/loops
Borrow checker: 30 tests, 5 core rules enforced (QZ1205-1211)
Drop/RAII: 30+ tests, LIFO ordering, moved values not double-dropped
Partial moves: Field-level granularity (QZ1216)

Critical gaps vs Rust:

No thread safety enforcement — No Send/Sync traits, no borrow checker integration with thread captures. Shared mutable state across taskgroup_spawn compiles without warning.
CPtr bypasses all safety — Documented as escape hatch but no warnings. Use-after-free trivially achievable.
Arena UAF not detected — Can use arena-allocated pointer after arena_destroy().
Parameter borrow lifetimes incomplete — Returning borrow of parameter may create dangling pointer in edge cases.

Recommendation: Qualify safety claims: “Memory-safe for single-threaded owned values. Thread safety and raw pointer safety are the programmer’s responsibility.” Don’t claim to prevent data races.

AREA 5: Error Messages & Diagnostics

Verdict: INFRASTRUCTURE IS WORLD-CLASS, INTEGRATION IS INCOMPLETE.

What exists:

34 error codes (QZ0101-QZ1218)
18 documented with --explain (53% coverage)
Rust-style diagnostic formatter in diagnostic.qz (colors, gutters, spans, suggestions, JSON mode)
“Did you mean?” fuzzy matching for struct names, function names, variable names
Multi-error reporting (up to 30 errors per compilation)

The problem: The diagnostic formatter is never called. Errors go directly via eputs("error[TYPE]: #{msg} at line #{line}, col #{col}"), bypassing colors, source context, caret underlining, and notes. The infrastructure exists but isn’t wired up.

What users see:

error[TYPE]: Unknown struct: Pount at line 5, col 3

What the infrastructure could produce:

error[QZ0301]: Unknown struct
  --> prog.qz:5:3
   |
 5 | p = Pount { x: 1, y: 2 }
   |     ^^^^^ Did you mean 'Point'?
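
The gap between the two outputs above is mostly formatting plus a nearest-name lookup. A minimal Python sketch of both pieces follows. It is not the code in diagnostic.qz: the function names `suggest` and `render` are hypothetical, the list of known structs is made up for illustration, and the column is computed from the source line rather than taken from the compiler.

```python
import difflib


def suggest(name, known):
    """Return the closest known name, or None if nothing is similar enough."""
    matches = difflib.get_close_matches(name, known, n=1, cutoff=0.6)
    return matches[0] if matches else None


def render(code, title, path, line_no, col, source_line, span_len, hint=None):
    """Format one diagnostic with a gutter and caret underline (col is 1-based)."""
    gutter = " " * (len(str(line_no)) + 1)
    underline = " " * (col - 1) + "^" * span_len
    if hint:
        underline += f" Did you mean '{hint}'?"
    return "\n".join([
        f"error[{code}]: {title}",
        f"{gutter}--> {path}:{line_no}:{col}",
        f"{gutter} |",
        f" {line_no} | {source_line}",
        f"{gutter} | {underline}",
    ])


source = "p = Pount { x: 1, y: 2 }"
col = source.index("Pount") + 1  # 1-based column of the unknown name
hint = suggest("Pount", ["Point", "Vec", "String"])  # illustrative struct list
print(render("QZ0301", "Unknown struct", "prog.qz", 5, col, source, len("Pount"), hint))
```

The similarity cutoff of 0.6 is difflib's default; a real compiler would likely tune it so that short identifiers do not produce noisy suggestions.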

Recommendation: Wire up diagnostic.qz to tc_print_errors(). This is a 2-3 hour task with massive DX payoff. Also document all 34 error codes and add the 16 missing explanations.

AREA 6-7: Developer Experience & Tooling

Verdict: WORLD-CLASS INTERNALS, BROKEN ONBOARDING.

Tool ratings:

Tool	Rating	Notes
Compiler	3.5/5	Good flags, but outputs LLVM IR to stdout (confusing for beginners)
Quake (build system)	4/5	Polished — task listing, fuzzy matching, dependency resolution, caching
QSpec (test framework)	4/5	RSpec-like ergonomics, 50+ assertions, fast (12s for 367 files)
Formatter	3/5	Works but underdocumented, limited rules
Linter	2.5/5	Primitive — only block balance, naming conventions, comments
VS Code extension	2/5	Built but NOT published to Marketplace. Requires manual build. No IntelliSense.
REPL	3.5/5	Works but hidden — no `quartz repl` command, must compile from `tools/repl.qz`
Debugger (lldb)	4/5	World-class DWARF support (355K `!dbg` annotations), but undocumented in public docs

Critical onboarding failures:

README says ./self-hosted/bin/quartz hello.qz -o hello — the -o flag doesn’t exist
Default compiler output is LLVM IR to stdout — beginners think it’s broken
No Getting Started tutorial — just a 3,346-line reference manual
LLVM PATH setup is temporary (per-session export), not persistent
No hello.qz example file provided

The single investment that would most improve DX: A unified quartz run hello.qz command that compiles and executes in one step (currently requires piping through llc and clang).
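
Until such a command exists, the three steps can be wrapped in a small script. The Python sketch below is a stand-in, not the proposed quartz run implementation. It assumes the compiler writes LLVM IR to stdout as described above, that llc and clang are on PATH, and that no extra runtime library has to be linked (the audit does not say either way). The helper names `build_commands` and `run` are hypothetical.

```python
import subprocess
from pathlib import Path


def build_commands(source, quartz="./self-hosted/bin/quartz"):
    """Return the three pipeline steps as argv lists; nothing is executed here."""
    src = Path(source)
    ir = src.with_suffix(".ll")
    obj = src.with_suffix(".o")
    exe = src.with_suffix("")
    return {
        "ir_path": str(ir),
        "compile": [quartz, str(src)],  # LLVM IR is expected on stdout
        "llc": ["llc", "-filetype=obj", str(ir), "-o", str(obj)],
        "link": ["clang", str(obj), "-o", str(exe)],
    }


def run(source):
    """Compile, assemble, link, then execute; returns the program's exit code."""
    cmds = build_commands(source)
    with open(cmds["ir_path"], "w") as ir_file:
        subprocess.run(cmds["compile"], stdout=ir_file, check=True)
    subprocess.run(cmds["llc"], check=True)
    subprocess.run(cmds["link"], check=True)
    exe = Path(cmds["link"][-1]).resolve()
    return subprocess.run([str(exe)]).returncode


# Show the planned commands without running them.
for step in ("compile", "llc", "link"):
    print(" ".join(build_commands("hello.qz")[step]))
```

The sketch leaves out opt -O2 and --stack-alloc, so a binary built this way would reflect default rather than advertised performance.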

AREA 8: Documentation

Verdict: COMPREHENSIVE REFERENCE, ZERO ONBOARDING. 5 critical gaps.

What exists (82 markdown files, 32K lines):

QUARTZ_REFERENCE.md — 3,346 lines, comprehensive but dense
BORROWING.md — Excellent (5 rules, clear examples, error codes)
STYLE.md — Complete conventions guide
ARCHITECTURE.md — Good compiler pipeline overview
31 auto-generated API docs (field/variant lists, no descriptions or examples)

What’s critically missing:

Getting Started tutorial — No path from “installed” to “running program”
Standard library guide — API docs are auto-generated field tables with zero narrative
Error handling guide — No guidance on Result vs panic, no $try examples
Testing guide — QSpec not documented in user-facing docs at all
FFI guide — No docs on calling C functions

Also stale or missing: learn-quartz-in-y-minutes.qz still references the old be compiler, there is no /examples directory, and the README Quick Start gives the wrong compilation command.

Recommendation: Write Getting Started guide (P0), stdlib guide with examples (P0), and testing guide (P1) before launch.

AREA 9: Competitive Landscape

Verdict: TECHNICALLY NOVEL, SOCIALLY EMBRYONIC.

Quartz’s unique strengths vs the field:

Structural dispatch + row polymorphism (no other systems language has both)
Existential type erasure (novel approach: compile-time richness, runtime simplicity)
Self-hosting at v5.26 (Crystal took years longer)
LLVM + C + WASM backends (only language with all three working)
Ruby-familiar syntax for systems code (Crystal is closest competitor)

Crystal is the most dangerous comparison:

Crystal has: package manager (Shards), 30K GitHub stars, web frameworks (Kemal, Lucky), ORM, full Ruby-port story, playground
Quartz has: better type system, better debugger, structural typing, WASM target
Reality: 95% of developers don’t care about structural typing. They care about “can I use my libraries?”

Rust comparison (the HN crowd):

Rust has: 125K+ crates, proven in Linux kernel/Cloudflare/etc., world-class tooling
Quartz has: better readability, simpler syntax, faster learning curve
Reality: Rust’s ecosystem gravity is unbeatable. Quartz can only win on onboarding speed.

Risk: Quartz could become “Nim 2.0” — amazing language, tiny community. Avoiding this requires a package manager and killer app within 12 months of launch.

AREA 10: The Five-Minute Test

Senior Rust Developer Simulation:

Clicks repo — impressed by tagline, skeptical of claims
Scans README — “structural typing” hooks them, “beat C” makes them suspicious
Tries to build — fails (README command doesn’t work, -o flag doesn’t exist)
Tests safety claim — finds CPtr bypasses everything, no thread safety
Closes tab. Build failure is fatal for first impression.

Ruby Developer Simulation:

Clicks repo — excited by familiar syntax
Scans README — sees def/end, interpolation, pattern matching
Tries to build — same failure as above
Writes greet("World") — works! Feels familiar.
Tries .map { |x| ... } — fails. Coin flip on closing tab. Syntax is familiar but collections aren’t.

AREA 11: Claims We Must Not Make

Claim	Status	Recommendation
“Beat C on benchmarks”	9/12 with non-default flags	Caveat: “matches or beats C on most benchmarks with LLVM optimization”
“True union types”	Parser/TC only, 0% codegen	Cut or implement. 15/15 tests pending.
“Row polymorphism”	Record params don’t work	Caveat: “in return position; parameter position pending”
“No runtime”	Runtime library exists (~1.2K lines)	Fix: “No garbage collector. Minimal runtime.”
“Zero-overhead allocator”	General allocator has overhead	Caveat: “Zero-overhead bump allocator available”
`quartz hello.qz -o hello`	`-o` flag doesn’t exist	Fix the README immediately
“Bare metal example”	Requires inline C assembly	Caveat or remove

AREA 12: README & First Impression

What’s excellent:

Tagline is memorable and shareable: “Write it like Ruby. Run it like C.”
First code example is well-chosen (pipeline operator, lambdas, clean syntax)
Badges (test count, self-hosted, MIT) project credibility
Professional presentation

What’s broken:

Quick Start build command doesn’t work
Claims need tempering (see the items listed in Area 11 above)
No “when to use Quartz vs X” section
No “what Quartz is NOT” section
Missing context for why each feature matters

AREA 13: What Would Make This a 500-Point HN Post

The Story: Not “we built a language” — that’s 50 points. The story is: “One developer + Claude built a self-hosting systems language in 90 days. Here’s what we learned about language design at AI speed.”

The single strongest demo: The self-compilation fixpoint. Show the same Quartz code compiling itself and producing byte-identical output. This is the “holy shit” moment that proves it’s real.
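
A fixpoint demo reduces to a byte-for-byte comparison of two compiler generations' output. The Python sketch below shows that check in isolation. The file names gen2.ll and gen3.ll and the helper names are hypothetical, and the demo uses stand-in files rather than real compiler output.

```python
import hashlib
import tempfile
from pathlib import Path


def sha256_of(path):
    """Hash a file's raw bytes."""
    return hashlib.sha256(Path(path).read_bytes()).hexdigest()


def is_fixpoint(gen2_output, gen3_output):
    """True when the two generations' outputs are byte-identical."""
    return sha256_of(gen2_output) == sha256_of(gen3_output)


# Demo with stand-in files; real use would pass the two compiler outputs.
with tempfile.TemporaryDirectory() as tmp:
    gen2 = Path(tmp, "gen2.ll")
    gen3 = Path(tmp, "gen3.ll")
    gen2.write_bytes(b"define i32 @main() { ret i32 0 }\n")
    gen3.write_bytes(gen2.read_bytes())
    print(is_fixpoint(gen2, gen3))  # True: identical bytes
```

Printing the two hashes in a launch demo gives viewers something they can reproduce and compare on their own machines.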

The narrative hook: “Rust is correct but painful. Ruby is joyful but slow. We built the language that splits the difference — and proved it by writing a compiler in it.”

The credibility signal: 1,286 commits in 90 days. 5,307 tests. Self-hosting. Three backends (LLVM, C, WASM). This isn’t vaporware.

The community entry point: Missing. Need a Discord/Matrix, a package registry, and one flagship example app.

3. THE KILL LIST

Things that MUST be fixed/removed/caveated before launch:

P0 — Launch Blockers

#	Item	Action	Effort
1	README build command broken (`-o` flag doesn’t exist)	Fix Quick Start with correct `llc`/`clang` pipeline OR add `quartz run` command	1-2 hours
2	Union types: 15/15 tests pending, codegen missing	Either implement codegen OR remove from docs + add compiler error when used	5-7 days (implement) or 1 hour (remove)
3	Record type param example in docs doesn’t compile	Fix documentation: “Parameter position not yet implemented”	30 min
4	Benchmark table in README is stale/cherry-picked	Update with all 12 benchmarks + current numbers + required flags	1 hour
5	“No runtime” claim is false	Change to “No garbage collector. Minimal runtime.”	5 min
6	No Getting Started tutorial	Write `docs/GETTING_STARTED.md` (install, hello world, first program, first test)	1 day

P1 — Fix Before HN Post

#	Item	Action	Effort
7	Bounded generic dispatch missing (39 tests pending)	Document as known limitation OR implement MIR dispatch	3-5 days (implement)
8	Error formatter not wired up	Connect `diagnostic.qz` to `tc_print_errors()`	2-3 hours
9	VS Code extension not published	Build .vsix, publish to Marketplace	2 hours
10	REPL not discoverable	Add `quartz repl` subcommand	1 hour
11	Debugger undocumented	Add `docs/DEBUGGING.md` with lldb examples	2 hours
12	Thread safety claims unqualified	Add caveat: “single-threaded safety only; concurrency is user’s responsibility”	30 min
13	16 error codes undocumented	Write explanations in `explain.qz`	3 hours

4. THE HIGHLIGHT REEL

Things that should be front-and-center in marketing:

Highlight	Why It Matters	Evidence
Self-hosting fixpoint	Proves the language is real, not a toy	gen2==gen3 byte-identical, 1,357 functions
1,286 commits in 90 days	Shows velocity and commitment	Git log
json_parse 4.5x faster than C	Concrete, verifiable, surprising	`benchmarks/run.sh` reproducible
Borrow checker without lifetime annotations	Addresses Rust’s #1 pain point	30 tests, NLL-lite, ephemeral borrows
Ruby-familiar syntax	Low barrier to entry for a huge audience	`def`/`end`, interpolation, pattern matching
LLVM + C + WASM backends	Cross-platform story	TGT.1 WASM complete, TGT.2 C backend complete
355K debug annotations	“We didn’t skip the hard parts”	DWARF 5, enum metadata, variable inspection
QSpec test framework	Dogfooding — tests written in Quartz itself	367 test files, 5,307 active tests
Structural typing	Genuine type system innovation	Row polymorphism in return position works
2.3s self-compile	Fast iteration, practical toolchain	After P.5 optimization (was 12.6s)

5. THE HONEST POSITIONING STATEMENT

Quartz is a statically typed, compiled systems language with Ruby-familiar syntax and C-competitive performance. It features a borrow checker that doesn’t require lifetime annotations, structural typing with row polymorphism, and an existential type model where rich compile-time types erase to efficient runtime representations. The compiler is self-hosted (written in Quartz, compiling itself to byte-identical output) and ships with a formatter, linter, test framework, build system, and LLVM/C/WASM backends. It is a pre-1.0 alpha with no package manager, no LSP, and incomplete support for union types and bounded generic dispatch. Memory safety is enforced for single-threaded owned values; thread safety and raw pointer safety are the programmer’s responsibility.

This statement would survive hostile scrutiny on HN because every claim is verifiable and every limitation is disclosed.

APPENDIX A: Pending Test Inventory

Total it_pending tests: 137

Category	Count	Root Cause
Bounded generic dispatch	39	MIR has no dispatch logic
Structural dispatch	25	Incompatible with existential erasure (by design)
Union type codegen	17	MIR/codegen not implemented
Record types/intersection	8	Parser done, runtime incomplete
Network/TLS infrastructure	8	Tests need local server setup
Platform-specific (lli JIT)	6	PCRE2 not linked, C macro limitations
Multi-param generics	4	Not implemented
Type model limitations	4	Existential erasure prevents runtime type distinction
Generic field access	3	TC resolution incomplete
@cfg advanced features	3	Arch/feature/AND/OR not parsed
Error message quality	3	Missing QZ codes, branch-level tracking
Exhaustiveness checking	2	Non-exhaustive match not fully detected
Loop features	2	Labeled break/continue, loop expressions
Misc	13	Custom calling conventions, circular imports, etc.
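
The per-category counts can be checked against the stated total with a short tally. The Python sketch below uses only the numbers from the table; the variable names are illustrative.

```python
# Counts copied from the table above.
PENDING = {
    "Bounded generic dispatch": 39,
    "Structural dispatch": 25,
    "Union type codegen": 17,
    "Record types/intersection": 8,
    "Network/TLS infrastructure": 8,
    "Platform-specific (lli JIT)": 6,
    "Multi-param generics": 4,
    "Type model limitations": 4,
    "Generic field access": 3,
    "@cfg advanced features": 3,
    "Error message quality": 3,
    "Exhaustiveness checking": 2,
    "Loop features": 2,
    "Misc": 13,
}

total = sum(PENDING.values())
top_two = PENDING["Bounded generic dispatch"] + PENDING["Structural dispatch"]

print(f"total pending: {total}")  # 137, matching the stated total
print(f"top two categories: {top_two} ({top_two / total:.0%})")  # 64 (47%)
```

The two dispatch categories account for nearly half of all pending tests, so progress there moves the headline number more than any other single area.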

APPENDIX B: Compiler TODO/FIXME Comments

Total: 21 (trivial — verification checks, output control, unimplemented C backend intrinsics). None are language bugs.

APPENDIX C: Tool Ratings Summary

Tool	Rating	Launch Ready?
Compiler	3.5/5	Yes (with CLI UX fix)
Quake	4/5	Yes
QSpec	4/5	Yes
Formatter	3/5	Yes (needs docs)
Linter	2.5/5	Yes (limited but functional)
VS Code	2/5	No (must publish to Marketplace)
REPL	3.5/5	No (must be discoverable)
Debugger	4/5	Yes (needs public docs)