---
title: "Compiler Inspection Tools"
description: "Tools for inspecting and debugging the compiler pipeline"
weight: 50
type: "docs"
---

ƿit includes a set of tools for inspecting the compiler pipeline at every stage. These are useful for debugging, testing optimizations, and understanding what the compiler does with your code.

## Pipeline Overview

The compiler runs in stages:

```
source → tokenize → parse → fold → mcode → streamline → output
```

Each stage has a corresponding CLI tool that lets you see its output.

| Stage       | Tool                      | What it shows                          |
|-------------|---------------------------|----------------------------------------|
| tokenize    | `tokenize.ce`             | Token stream as JSON                   |
| parse       | `parse.ce`                | Unfolded AST as JSON                   |
| fold        | `fold.ce`                 | Folded AST as JSON                     |
| mcode       | `mcode.ce`                | Raw mcode IR as JSON                   |
| mcode       | `mcode.ce --pretty`       | Human-readable mcode IR                |
| streamline  | `streamline.ce`           | Full optimized IR as JSON              |
| streamline  | `streamline.ce --types`   | Optimized IR with type annotations     |
| streamline  | `streamline.ce --stats`   | Per-function summary stats             |
| streamline  | `streamline.ce --ir`      | Human-readable canonical IR            |
| all         | `ir_report.ce`            | Structured optimizer flight recorder   |

All tools take a source file as input and run the pipeline up to the relevant stage.

## Quick Start

```bash
# see raw mcode IR (pretty-printed)
cell mcode --pretty myfile.ce

# see optimized IR with type annotations
cell streamline --types myfile.ce

# full optimizer report with events
cell ir_report --full myfile.ce
```

## fold.ce

Prints the folded AST as JSON. This is the output of the parser and constant folder, before mcode generation.

```bash
cell fold
```

## mcode.ce

Prints mcode IR. Default output is JSON; use `--pretty` for human-readable format with opcodes, operands, and program counter.

```bash
cell mcode            # JSON (default)
cell mcode --pretty   # human-readable IR
```

## streamline.ce

Runs the full pipeline (tokenize, parse, fold, mcode, streamline) and outputs the optimized IR as JSON. Useful for piping to `jq` or saving for comparison.

```bash
cell streamline            # full JSON (default)
cell streamline --stats    # summary stats per function
cell streamline --ir       # human-readable IR
cell streamline --check    # warnings only
cell streamline --types    # IR with type annotations
```

| Flag | Description |
|------|-------------|
| (none) | Full optimized IR as JSON (backward compatible) |
| `--stats` | Per-function summary: args, slots, instruction counts by category, nops eliminated |
| `--ir` | Human-readable canonical IR (same format as `ir_report.ce`) |
| `--check` | Warnings only (e.g. `nr_slots > 200` approaching 255 limit) |
| `--types` | Optimized IR with inferred type annotations per slot |

Flags can be combined.

## seed.ce

Regenerates the boot seed files in `boot/`. These are pre-compiled mcode IR (JSON) files that bootstrap the compilation pipeline on cold start.

```bash
cell seed           # regenerate all boot seeds
cell seed --clean   # also clear the build cache after
```

The script compiles each pipeline module (tokenize, parse, fold, mcode, streamline) and `internal/bootstrap.cm` through the current pipeline, encodes the output as JSON, and writes it to `boot/.cm.mcode`.

**When to regenerate seeds:**

- Before a release or distribution
- When the pipeline source changes in a way that the existing seeds cannot compile (e.g. language-level changes)

Seeds do NOT need regenerating for normal development: the engine recompiles pipeline modules from source automatically via the content-addressed cache.

## ir_report.ce

The optimizer flight recorder. Runs the full pipeline with structured logging and outputs machine-readable, diff-friendly JSON. This is the most detailed tool for understanding what the optimizer did and why.

```bash
cell ir_report [options]
```

### Options

| Flag | Description |
|------|-------------|
| `--summary` | Per-pass JSON summaries with instruction counts and timing (default) |
| `--events` | Include rewrite events showing each optimization applied |
| `--types` | Include type delta records showing inferred slot types |
| `--ir-before=PASS` | Print canonical IR before a specific pass |
| `--ir-after=PASS` | Print canonical IR after a specific pass |
| `--ir-all` | Print canonical IR before and after all passes |
| `--full` | Everything: summary + events + types + ir-all |

With no flags, `--summary` is the default.

### Output Format

Output is line-delimited JSON. Each line is a self-contained JSON object with a `type` field:

**`type: "pass"`** — Per-pass summary with categorized instruction counts before and after:

```json
{
  "type": "pass",
  "pass": "eliminate_type_checks",
  "fn": "fib",
  "ms": 0.12,
  "changed": true,
  "before": {"instr": 77, "nop": 0, "guard": 16, "branch": 28, ...},
  "after": {"instr": 77, "nop": 1, "guard": 15, "branch": 28, ...},
  "changes": {"guards_removed": 1, "nops_added": 1}
}
```

**`type: "event"`** — Individual rewrite event with before/after instructions and reasoning:

```json
{
  "type": "event",
  "pass": "eliminate_type_checks",
  "rule": "incompatible_type_forces_jump",
  "at": 3,
  "before": [["is_int", 5, 2, 4, 9], ["jump_false", 5, "rel_ni_2", 4, 9]],
  "after": ["_nop_tc_1", ["jump", "rel_ni_2", 4, 9]],
  "why": {"slot": 2, "known_type": "float", "checked_type": "int"}
}
```

**`type: "types"`** — Inferred type information for a function:

```json
{
  "type": "types",
  "fn": "fib",
  "param_types": {},
  "slot_types": {"25": "null"}
}
```

**`type: "ir"`** — Canonical IR text for a function at a specific point:

```json
{
  "type": "ir",
  "when": "before",
  "pass": "all",
  "fn": "fib",
  "text": "fn fib (args=1, slots=26)\n @0 access s2, 2\n ..."
}
```

### Rewrite Rules

Each pass records events with named rules:

**eliminate_type_checks:**

- `known_type_eliminates_guard` — type already known, guard removed
- `incompatible_type_forces_jump` — type conflicts, conditional jump becomes unconditional
- `num_subsumes_int_float` — num check satisfied by int or float
- `dynamic_to_field` — load_dynamic/store_dynamic narrowed to field access
- `dynamic_to_index` — load_dynamic/store_dynamic narrowed to index access

**simplify_algebra:**

- `add_zero`, `sub_zero`, `mul_one`, `div_one` — identity operations become moves
- `mul_zero` — multiplication by zero becomes constant
- `self_eq`, `self_ne` — same-slot comparisons become constants

**simplify_booleans:**

- `not_jump_false_fusion` — not + jump_false fused into jump_true
- `not_jump_true_fusion` — not + jump_true fused into jump_false
- `double_not` — not + not collapsed to move

**eliminate_moves:**

- `self_move` — move to same slot becomes nop

**eliminate_dead_jumps:**

- `jump_to_next` — jump to immediately following label becomes nop

### Canonical IR Format

The `--ir-all`, `--ir-before`, and `--ir-after` flags produce a deterministic text representation of the IR:

```
fn fib (args=1, slots=26)
  @0 access s2, 2
  @1 is_int s4, s1 ; [guard]
  @2 jump_false s4, "rel_ni_2" ; [branch]
  @3 --- nop (tc) ---
  @4 jump "rel_ni_2" ; [branch]
  @5 lt_int s3, s1, s2
  @6 jump "rel_done_4" ; [branch]
  rel_ni_2:
  @8 is_num s4, s1 ; [guard]
```

Properties:

- `@N` is the raw array index, stable across passes (passes replace, never insert or delete)
- `sN` prefix distinguishes slot operands from literal values
- String operands are quoted
- Labels appear as indented headers with a colon
- Category tags in brackets: `[guard]`, `[branch]`, `[load]`, `[store]`, `[call]`, `[arith]`, `[move]`, `[const]`
- Nops shown as `--- nop (reason) ---` with reason codes: `tc` (type check), `bl` (boolean), `mv` (move), `dj` (dead jump), `ur` (unreachable)

### Examples

```bash
# what passes changed something?
cell ir_report --summary myfile.ce | jq 'select(.changed)'

# list all rewrite rules that fired
cell ir_report --events myfile.ce | jq 'select(.type == "event") | .rule'

# dump canonical IR text before and after all passes, ready for diffing
cell ir_report --ir-all myfile.ce | jq -r 'select(.type == "ir") | .text'

# full report for analysis
cell ir_report --full myfile.ce > report.json
```

## ir_stats.cm

A utility module used by `ir_report.ce` and available for custom tooling. Not a standalone tool.

```javascript
var ir_stats = use("ir_stats")

ir_stats.detailed_stats(func)                   // categorized instruction counts
ir_stats.ir_fingerprint(func)                   // djb2 hash of instruction array
ir_stats.canonical_ir(func, name, opts)         // deterministic text representation
ir_stats.type_snapshot(slot_types)              // frozen copy of type map
ir_stats.type_delta(before_types, after_types)  // compute type changes
ir_stats.category_tag(op)                       // classify an opcode
```

### Instruction Categories

`detailed_stats` classifies each instruction into one of these categories:

| Category | Opcodes |
|----------|---------|
| load | `load_field`, `load_index`, `load_dynamic`, `get`, `access` (non-constant) |
| store | `store_field`, `store_index`, `store_dynamic`, `set_var`, `put`, `push` |
| branch | `jump`, `jump_true`, `jump_false`, `jump_not_null` |
| call | `invoke`, `goinvoke` |
| guard | `is_int`, `is_text`, `is_num`, `is_bool`, `is_null`, `is_array`, `is_func`, `is_record`, `is_stone` |
| arith | `add_int`, `sub_int`, ..., `add_float`, ..., `concat`, `neg_int`, `neg_float`, bitwise ops |
| move | `move` |
| const | `int`, `true`, `false`, `null`, `access` (with constant value) |
| label | string entries that are not nops |
| nop | strings starting with `_nop_` |
| other | everything else (`frame`, `setarg`, `array`, `record`, `function`, `return`, etc.) |
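
For external tooling that post-processes the JSON output, the table above can be mirrored in a few lines. The following Python sketch is illustrative only and is not the `ir_stats` implementation: `category_of` and `count_categories` are hypothetical names, the arithmetic set contains only the opcodes the table spells out (the elided ones would fall through to `other`), and the rule used to split `access` between `const` and `load` is a guess about the operand layout.

```python
# Hypothetical sketch mirroring the category table; not the real ir_stats code.

CATEGORY_BY_OPCODE = {}


def _register(category, opcodes):
    """Map every opcode in `opcodes` to `category`."""
    for op in opcodes:
        CATEGORY_BY_OPCODE[op] = category


_register("load", ["load_field", "load_index", "load_dynamic", "get"])
_register("store", ["store_field", "store_index", "store_dynamic",
                    "set_var", "put", "push"])
_register("branch", ["jump", "jump_true", "jump_false", "jump_not_null"])
_register("call", ["invoke", "goinvoke"])
_register("guard", ["is_int", "is_text", "is_num", "is_bool", "is_null",
                    "is_array", "is_func", "is_record", "is_stone"])
# Only the arithmetic opcodes named in the table; the elided ones are omitted.
_register("arith", ["add_int", "sub_int", "add_float", "concat",
                    "neg_int", "neg_float"])
_register("move", ["move"])
_register("const", ["int", "true", "false", "null"])


def category_of(instr):
    """Classify one IR entry: a string (label or nop) or an opcode array."""
    if isinstance(instr, str):
        return "nop" if instr.startswith("_nop_") else "label"
    op = instr[0]
    if op == "access":
        # ASSUMPTION: layout is [op, dest_slot, value, ...] and a plain
        # number, string, or boolean value means a constant access.
        value = instr[2] if len(instr) > 2 else None
        is_constant = isinstance(value, (int, float, str, bool))
        return "const" if is_constant else "load"
    return CATEGORY_BY_OPCODE.get(op, "other")


def count_categories(instructions):
    """Return a {category: count} dict for a list of IR entries."""
    counts = {}
    for instr in instructions:
        category = category_of(instr)
        counts[category] = counts.get(category, 0) + 1
    return counts


# Entries taken from the "event" example earlier in this page.
sample = [
    ["is_int", 5, 2, 4, 9],
    ["jump_false", 5, "rel_ni_2", 4, 9],
    "_nop_tc_1",
    ["jump", "rel_ni_2", 4, 9],
    "rel_ni_2",
]
print(count_categories(sample))
# {'guard': 1, 'branch': 2, 'nop': 1, 'label': 1}
```

A lookup table keeps the sketch easy to compare against the documentation; when exact agreement with the compiler matters, prefer the counts reported by `ir_report.ce` or `ir_stats.detailed_stats`.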