| title | description | weight | type |
|---|---|---|---|
| Compiler Inspection Tools | Tools for inspecting and debugging the compiler pipeline | 50 | docs |
ƿit includes a set of tools for inspecting the compiler pipeline at every stage. These are useful for debugging, testing optimizations, and understanding what the compiler does with your code.
Pipeline Overview
The compiler runs in stages:
source → tokenize → parse → fold → mcode → streamline → output
Each stage has a corresponding CLI tool that lets you see its output.
| Stage | Tool | What it shows |
|---|---|---|
| tokenize | tokenize.ce | Token stream as JSON |
| parse | parse.ce | Unfolded AST as JSON |
| fold | fold.ce | Folded AST as JSON |
| mcode | mcode.ce | Raw mcode IR as JSON |
| mcode | mcode.ce --pretty | Human-readable mcode IR |
| streamline | streamline.ce | Full optimized IR as JSON |
| streamline | streamline.ce --types | Optimized IR with type annotations |
| streamline | streamline.ce --stats | Per-function summary stats |
| streamline | streamline.ce --ir | Human-readable canonical IR |
| all | ir_report.ce | Structured optimizer flight recorder |
All tools take a source file as input and run the pipeline up to the relevant stage.
Quick Start
# see raw mcode IR (pretty-printed)
cell mcode --pretty myfile.ce
# see optimized IR with type annotations
cell streamline --types myfile.ce
# full optimizer report with events
cell ir_report --full myfile.ce
fold.ce
Prints the folded AST as JSON. This is the output of the parser and constant folder, before mcode generation.
cell fold <file.ce|file.cm>
mcode.ce
Prints mcode IR. Default output is JSON; use --pretty for human-readable format with opcodes, operands, and program counter.
cell mcode <file.ce|file.cm> # JSON (default)
cell mcode --pretty <file.ce|file.cm> # human-readable IR
streamline.ce
Runs the full pipeline (tokenize, parse, fold, mcode, streamline) and outputs the optimized IR as JSON. Useful for piping to jq or saving for comparison.
cell streamline <file.ce|file.cm> # full JSON (default)
cell streamline --stats <file.ce|file.cm> # summary stats per function
cell streamline --ir <file.ce|file.cm> # human-readable IR
cell streamline --check <file.ce|file.cm> # warnings only
cell streamline --types <file.ce|file.cm> # IR with type annotations
cell streamline --diagnose <file.ce|file.cm> # compile-time diagnostics
| Flag | Description |
|---|---|
| (none) | Full optimized IR as JSON (backward compatible) |
| --stats | Per-function summary: args, slots, instruction counts by category, nops eliminated |
| --ir | Human-readable canonical IR (same format as ir_report.ce) |
| --check | Warnings only (e.g. nr_slots > 200 approaching 255 limit) |
| --types | Optimized IR with inferred type annotations per slot |
| --diagnose | Run compile-time diagnostics (type errors and warnings) |
Flags can be combined.
seed.ce
Regenerates the boot seed files in boot/. These are pre-compiled mcode IR (JSON) files that bootstrap the compilation pipeline on cold start.
cell seed # regenerate all boot seeds
cell seed --clean # also clear the build cache after
The script compiles each pipeline module (tokenize, parse, fold, mcode, streamline) and internal/bootstrap.cm through the current pipeline, encodes the output as JSON, and writes it to boot/<name>.cm.mcode.
When to regenerate seeds:
- Before a release or distribution
- When the pipeline source changes in a way that the existing seeds cannot compile (e.g. language-level changes)
- Seeds do NOT need regenerating for normal development — the engine recompiles pipeline modules from source automatically via the content-addressed cache
ir_report.ce
The optimizer flight recorder. Runs the full pipeline with structured logging and outputs machine-readable, diff-friendly JSON. This is the most detailed tool for understanding what the optimizer did and why.
cell ir_report [options] <file.ce|file.cm>
Options
| Flag | Description |
|---|---|
| --summary | Per-pass JSON summaries with instruction counts and timing (default) |
| --events | Include rewrite events showing each optimization applied |
| --types | Include type delta records showing inferred slot types |
| --ir-before=PASS | Print canonical IR before a specific pass |
| --ir-after=PASS | Print canonical IR after a specific pass |
| --ir-all | Print canonical IR before and after all passes |
| --full | Everything: summary + events + types + ir-all |
With no flags, --summary is the default.
Output Format
Output is line-delimited JSON. Each line is a self-contained JSON object with a type field:
type: "pass" — Per-pass summary with categorized instruction counts before and after:
{
"type": "pass",
"pass": "eliminate_type_checks",
"fn": "fib",
"ms": 0.12,
"changed": true,
"before": {"instr": 77, "nop": 0, "guard": 16, "branch": 28, ...},
"after": {"instr": 77, "nop": 1, "guard": 15, "branch": 28, ...},
"changes": {"guards_removed": 1, "nops_added": 1}
}
type: "event" — Individual rewrite event with before/after instructions and reasoning:
{
"type": "event",
"pass": "eliminate_type_checks",
"rule": "incompatible_type_forces_jump",
"at": 3,
"before": [["is_int", 5, 2, 4, 9], ["jump_false", 5, "rel_ni_2", 4, 9]],
"after": ["_nop_tc_1", ["jump", "rel_ni_2", 4, 9]],
"why": {"slot": 2, "known_type": "float", "checked_type": "int"}
}
type: "types" — Inferred type information for a function:
{
"type": "types",
"fn": "fib",
"param_types": {},
"slot_types": {"25": "null"}
}
type: "ir" — Canonical IR text for a function at a specific point:
{
"type": "ir",
"when": "before",
"pass": "all",
"fn": "fib",
"text": "fn fib (args=1, slots=26)\n @0 access s2, 2\n ..."
}
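Because every line is a self-contained object, a consumer only needs to split on newlines and dispatch on the type field. The following Python sketch groups records by type. The helper name group_records is hypothetical and not part of ƿit, and the sample lines are abbreviated copies of the records shown above.

```python
import json
from collections import defaultdict


def group_records(lines):
    """Group line-delimited JSON records by their "type" field."""
    groups = defaultdict(list)
    for line in lines:
        line = line.strip()
        if not line:
            continue  # tolerate blank lines between records
        record = json.loads(line)
        groups[record.get("type", "unknown")].append(record)
    return dict(groups)


# Abbreviated sample records, modelled on the examples above.
sample = [
    '{"type": "pass", "pass": "eliminate_type_checks", "fn": "fib", "changed": true}',
    '{"type": "event", "pass": "eliminate_type_checks", '
    '"rule": "incompatible_type_forces_jump", "at": 3}',
    '{"type": "types", "fn": "fib", "param_types": {}, "slot_types": {"25": "null"}}',
]

groups = group_records(sample)
for kind, records in groups.items():
    print(kind, len(records))
```

In practice the lines would come from a saved report file or from the standard output of ir_report.ce rather than from an inline list.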
Rewrite Rules
Each pass records events with named rules:
eliminate_type_checks:
- known_type_eliminates_guard: type already known, guard removed
- incompatible_type_forces_jump: type conflicts, conditional jump becomes unconditional
- num_subsumes_int_float: num check satisfied by int or float
- dynamic_to_field: load_dynamic/store_dynamic narrowed to field access
- dynamic_to_index: load_dynamic/store_dynamic narrowed to index access

simplify_algebra:
- add_zero, sub_zero, mul_one, div_one: identity operations become moves
- mul_zero: multiplication by zero becomes constant
- self_eq, self_ne: same-slot comparisons become constants

simplify_booleans:
- not_jump_false_fusion: not + jump_false fused into jump_true
- not_jump_true_fusion: not + jump_true fused into jump_false
- double_not: not + not collapsed to move

eliminate_moves:
- self_move: move to same slot becomes nop

eliminate_dead_jumps:
- jump_to_next: jump to immediately following label becomes nop
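Since each event carries both its pass and its rule name, a tally of which rules fired is a simple aggregation. This is a minimal sketch in Python. The function count_rules and the sample records are hypothetical; only the pass, rule, and type field names come from the event format described earlier.

```python
from collections import Counter


def count_rules(records):
    """Count how often each (pass, rule) pair appears among event records."""
    counts = Counter()
    for record in records:
        if record.get("type") != "event":
            continue  # skip pass, types, and ir records
        counts[(record["pass"], record["rule"])] += 1
    return counts


# Hypothetical records that reuse rule names from the list above.
events = [
    {"type": "event", "pass": "simplify_algebra", "rule": "add_zero"},
    {"type": "event", "pass": "simplify_algebra", "rule": "add_zero"},
    {"type": "event", "pass": "eliminate_moves", "rule": "self_move"},
    {"type": "pass", "pass": "eliminate_moves", "changed": True},
]

rule_counts = count_rules(events)
for (pass_name, rule), n in rule_counts.most_common():
    print(f"{pass_name}.{rule}: {n}")
```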
Canonical IR Format
The --ir-all, --ir-before, and --ir-after flags produce a deterministic text representation of the IR:
fn fib (args=1, slots=26)
@0 access s2, 2
@1 is_int s4, s1 ; [guard]
@2 jump_false s4, "rel_ni_2" ; [branch]
@3 --- nop (tc) ---
@4 jump "rel_ni_2" ; [branch]
@5 lt_int s3, s1, s2
@6 jump "rel_done_4" ; [branch]
rel_ni_2:
@8 is_num s4, s1 ; [guard]
Properties:
- @N is the raw array index, stable across passes (passes replace, never insert or delete)
- sN prefix distinguishes slot operands from literal values
- String operands are quoted
- Labels appear as indented headers with a colon
- Category tags in brackets: [guard], [branch], [load], [store], [call], [arith], [move], [const]
- Nops shown as --- nop (reason) --- with reason codes: tc (type check), bl (boolean), mv (move), dj (dead jump), ur (unreachable)
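The properties above make the text format easy to scan line by line. The sketch below is one possible Python parser, written only from the example and properties in this section. The regular expressions and the parse_ir_line helper are assumptions, not a grammar published by ƿit, and a string operand containing a semicolon would confuse this simple version.

```python
import re

# "@N", then the instruction body, then an optional "; [category]" tag.
INSTR = re.compile(r"^\s*@(\d+)\s+(.*?)\s*(?:;\s*\[(\w+)\])?\s*$")
# Nop placeholder such as "--- nop (tc) ---".
NOP = re.compile(r"^---\s*nop\s*\((\w+)\)\s*---$")
# Label header such as "rel_ni_2:".
LABEL = re.compile(r"^\s*([A-Za-z_]\w*):\s*$")


def parse_ir_line(line):
    """Classify one line of canonical IR text into a small dict."""
    m = INSTR.match(line)
    if m:
        index, body, tag = int(m.group(1)), m.group(2), m.group(3)
        nop = NOP.match(body)
        if nop:
            return {"kind": "nop", "index": index, "reason": nop.group(1)}
        op, _, operands = body.partition(" ")
        return {"kind": "instr", "index": index, "op": op,
                "operands": operands.strip(), "tag": tag}
    m = LABEL.match(line)
    if m:
        return {"kind": "label", "name": m.group(1)}
    return {"kind": "other", "text": line.strip()}


text = """fn fib (args=1, slots=26)
  @0 access s2, 2
  @1 is_int s4, s1 ; [guard]
  @3 --- nop (tc) ---
rel_ni_2:
"""

parsed = [parse_ir_line(line) for line in text.splitlines()]
for item in parsed:
    print(item)
```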
Examples
# what passes changed something?
cell ir_report --summary myfile.ce | jq 'select(.changed)'
# list all rewrite rules that fired
cell ir_report --events myfile.ce | jq 'select(.type == "event") | .rule'
# diff IR before and after optimization
cell ir_report --ir-all myfile.ce | jq -r 'select(.type == "ir") | .text'
# full report for analysis
cell ir_report --full myfile.ce > report.json
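For a real before/after comparison, the text fields of two ir records can be fed to a line diff. Because @N indices are stable across passes, changed instructions line up one to one. This Python sketch uses the standard difflib module. The diff_ir helper and the two IR snippets are hypothetical, and it assumes that the records for the "after" side carry the same text layout as the "before" example shown earlier.

```python
import difflib


def diff_ir(before_text, after_text, fn_name):
    """Return a unified diff between two canonical IR texts."""
    lines = difflib.unified_diff(
        before_text.splitlines(),
        after_text.splitlines(),
        fromfile=f"{fn_name} (before)",
        tofile=f"{fn_name} (after)",
        lineterm="",
    )
    return "\n".join(lines)


# Hypothetical IR for a tiny function, before and after eliminate_moves.
before_text = "fn f (args=1, slots=3)\n  @0 move s2, s2 ; [move]\n  @1 return s2"
after_text = "fn f (args=1, slots=3)\n  @0 --- nop (mv) ---\n  @1 return s2"

diff = diff_ir(before_text, after_text, "f")
print(diff)
```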
ir_stats.cm
A utility module used by ir_report.ce and available for custom tooling. Not a standalone tool.
var ir_stats = use("ir_stats")
ir_stats.detailed_stats(func) // categorized instruction counts
ir_stats.ir_fingerprint(func) // djb2 hash of instruction array
ir_stats.canonical_ir(func, name, opts) // deterministic text representation
ir_stats.type_snapshot(slot_types) // frozen copy of type map
ir_stats.type_delta(before_types, after_types) // compute type changes
ir_stats.category_tag(op) // classify an opcode
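To illustrate what a djb2 fingerprint of an instruction array can look like, here is a Python sketch of the classic djb2 string hash (start at 5381, multiply by 33, add each byte). How ir_fingerprint serializes the instruction array and which bit width it uses are not stated here, so the JSON encoding and the 32-bit truncation below are assumptions, and the resulting values should not be expected to match the module's output.

```python
import json


def djb2(text):
    """Classic djb2 hash over UTF-8 bytes, truncated to 32 bits (assumed width)."""
    h = 5381
    for byte in text.encode("utf-8"):
        h = (h * 33 + byte) & 0xFFFFFFFF
    return h


def fingerprint(instructions):
    """Hash an instruction array via its JSON encoding (assumed serialization)."""
    return djb2(json.dumps(instructions))


# Instruction arrays in the shape used by the event records above.
a = [["is_int", 5, 2, 4, 9], ["jump", "rel_ni_2", 4, 9]]
b = [["is_int", 5, 2, 4, 9], ["jump", "rel_ni_2", 4, 9]]
c = [["is_int", 5, 3, 4, 9], ["jump", "rel_ni_2", 4, 9]]

print(fingerprint(a) == fingerprint(b))  # identical arrays hash identically
print(fingerprint(a) == fingerprint(c))  # a one-character change alters the hash
```

A fingerprint like this gives a cheap way to tell whether a pass changed a function at all, without comparing the instruction arrays element by element.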
Instruction Categories
detailed_stats classifies each instruction into one of these categories:
| Category | Opcodes |
|---|---|
| load | load_field, load_index, load_dynamic, get, access (non-constant) |
| store | store_field, store_index, store_dynamic, set_var, put, push |
| branch | jump, jump_true, jump_false, jump_not_null |
| call | invoke, goinvoke |
| guard | is_int, is_text, is_num, is_bool, is_null, is_array, is_func, is_record, is_stone |
| arith | add_int, sub_int, ..., add_float, ..., concat, neg_int, neg_float, bitwise ops |
| move | move |
| const | int, true, false, null, access (with constant value) |
| label | string entries that are not nops |
| nop | strings starting with _nop_ |
| other | everything else (frame, setarg, array, record, function, return, etc.) |
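The table can be read as a lookup from opcode to category. The Python sketch below mirrors it for illustration and is not the ir_stats implementation. The arith set contains only the opcodes the table names explicitly, since the rest are elided. How access is judged constant is not specified here, so the caller passes that decision in as a hypothetical access_is_constant flag.

```python
from collections import Counter

CATEGORIES = {
    "load": {"load_field", "load_index", "load_dynamic", "get"},
    "store": {"store_field", "store_index", "store_dynamic", "set_var", "put", "push"},
    "branch": {"jump", "jump_true", "jump_false", "jump_not_null"},
    "call": {"invoke", "goinvoke"},
    "guard": {"is_int", "is_text", "is_num", "is_bool", "is_null",
              "is_array", "is_func", "is_record", "is_stone"},
    # Partial: only the arithmetic opcodes named explicitly in the table.
    "arith": {"add_int", "sub_int", "add_float", "concat", "neg_int", "neg_float"},
    "move": {"move"},
    "const": {"int", "true", "false", "null"},
}


def category(instr, access_is_constant=False):
    """Classify one instruction: a string (label or nop) or an [op, ...] list."""
    if isinstance(instr, str):
        return "nop" if instr.startswith("_nop_") else "label"
    op = instr[0]
    if op == "access":
        return "const" if access_is_constant else "load"
    for name, ops in CATEGORIES.items():
        if op in ops:
            return name
    return "other"


# Instruction shapes taken from the event example earlier in this document.
instrs = [["is_int", 5, 2, 4, 9], "_nop_tc_1", ["jump", "rel_ni_2", 4, 9], "rel_ni_2"]
tally = Counter(category(i) for i in instrs)
print(dict(tally))
```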