better sem analysis
This commit is contained in:
@@ -15,65 +15,51 @@ The compiler runs in stages:
|
||||
source → tokenize → parse → fold → mcode → streamline → output
|
||||
```
|
||||
|
||||
Each stage has a corresponding dump tool that lets you see its output.
|
||||
Each stage has a corresponding CLI tool that lets you see its output.
|
||||
|
||||
| Stage | Tool | What it shows |
|
||||
|-------------|-------------------|----------------------------------------|
|
||||
| fold | `dump_ast.cm` | Folded AST as JSON |
|
||||
| mcode | `dump_mcode.cm` | Raw mcode IR before optimization |
|
||||
| streamline | `dump_stream.cm` | Before/after instruction counts + IR |
|
||||
| streamline | `dump_types.cm` | Optimized IR with type annotations |
|
||||
| streamline | `streamline.ce` | Full optimized IR as JSON |
|
||||
| all | `ir_report.ce` | Structured optimizer flight recorder |
|
||||
| Stage | Tool | What it shows |
|
||||
|-------------|---------------------------|----------------------------------------|
|
||||
| tokenize | `tokenize.ce` | Token stream as JSON |
|
||||
| parse | `parse.ce` | Unfolded AST as JSON |
|
||||
| fold | `fold.ce` | Folded AST as JSON |
|
||||
| mcode | `mcode.ce` | Raw mcode IR as JSON |
|
||||
| mcode | `mcode.ce --pretty` | Human-readable mcode IR |
|
||||
| streamline | `streamline.ce` | Full optimized IR as JSON |
|
||||
| streamline | `streamline.ce --types` | Optimized IR with type annotations |
|
||||
| streamline | `streamline.ce --stats` | Per-function summary stats |
|
||||
| streamline | `streamline.ce --ir` | Human-readable canonical IR |
|
||||
| all | `ir_report.ce` | Structured optimizer flight recorder |
|
||||
|
||||
All tools take a source file as input and run the pipeline up to the relevant stage.
|
||||
|
||||
## Quick Start
|
||||
|
||||
```bash
|
||||
# see raw mcode IR
|
||||
./cell --core . dump_mcode.cm myfile.ce
|
||||
# see raw mcode IR (pretty-printed)
|
||||
cell mcode --pretty myfile.ce
|
||||
|
||||
# see what the optimizer changed
|
||||
./cell --core . dump_stream.cm myfile.ce
|
||||
# see optimized IR with type annotations
|
||||
cell streamline --types myfile.ce
|
||||
|
||||
# full optimizer report with events
|
||||
./cell --core . ir_report.ce --full myfile.ce
|
||||
cell ir_report --full myfile.ce
|
||||
```
|
||||
|
||||
## dump_ast.cm
|
||||
## fold.ce
|
||||
|
||||
Prints the folded AST as JSON. This is the output of the parser and constant folder, before mcode generation.
|
||||
|
||||
```bash
|
||||
./cell --core . dump_ast.cm <file.ce|file.cm>
|
||||
cell fold <file.ce|file.cm>
|
||||
```
|
||||
|
||||
## dump_mcode.cm
|
||||
## mcode.ce
|
||||
|
||||
Prints the raw mcode IR before any optimization. Shows the instruction array as formatted text with opcode, operands, and program counter.
|
||||
Prints mcode IR. Default output is JSON; use `--pretty` for human-readable format with opcodes, operands, and program counter.
|
||||
|
||||
```bash
|
||||
./cell --core . dump_mcode.cm <file.ce|file.cm>
|
||||
```
|
||||
|
||||
## dump_stream.cm
|
||||
|
||||
Shows a before/after comparison of the optimizer. For each function, prints:
|
||||
- Instruction count before and after
|
||||
- Number of eliminated instructions
|
||||
- The streamlined IR (nops hidden by default)
|
||||
|
||||
```bash
|
||||
./cell --core . dump_stream.cm <file.ce|file.cm>
|
||||
```
|
||||
|
||||
## dump_types.cm
|
||||
|
||||
Shows the optimized IR with type annotations. Each instruction is followed by the known types of its slot operands, inferred by walking the instruction stream.
|
||||
|
||||
```bash
|
||||
./cell --core . dump_types.cm <file.ce|file.cm>
|
||||
cell mcode <file.ce|file.cm> # JSON (default)
|
||||
cell mcode --pretty <file.ce|file.cm> # human-readable IR
|
||||
```
|
||||
|
||||
## streamline.ce
|
||||
@@ -81,10 +67,11 @@ Shows the optimized IR with type annotations. Each instruction is followed by th
|
||||
Runs the full pipeline (tokenize, parse, fold, mcode, streamline) and outputs the optimized IR as JSON. Useful for piping to `jq` or saving for comparison.
|
||||
|
||||
```bash
|
||||
./cell --core . streamline.ce <file.ce|file.cm> # full JSON (default)
|
||||
./cell --core . streamline.ce --stats <file.ce|file.cm> # summary stats per function
|
||||
./cell --core . streamline.ce --ir <file.ce|file.cm> # human-readable IR
|
||||
./cell --core . streamline.ce --check <file.ce|file.cm> # warnings only
|
||||
cell streamline <file.ce|file.cm> # full JSON (default)
|
||||
cell streamline --stats <file.ce|file.cm> # summary stats per function
|
||||
cell streamline --ir <file.ce|file.cm> # human-readable IR
|
||||
cell streamline --check <file.ce|file.cm> # warnings only
|
||||
cell streamline --types <file.ce|file.cm> # IR with type annotations
|
||||
```
|
||||
|
||||
| Flag | Description |
|
||||
@@ -93,6 +80,7 @@ Runs the full pipeline (tokenize, parse, fold, mcode, streamline) and outputs th
|
||||
| `--stats` | Per-function summary: args, slots, instruction counts by category, nops eliminated |
|
||||
| `--ir` | Human-readable canonical IR (same format as `ir_report.ce`) |
|
||||
| `--check` | Warnings only (e.g. `nr_slots > 200` approaching 255 limit) |
|
||||
| `--types` | Optimized IR with inferred type annotations per slot |
|
||||
|
||||
Flags can be combined.
|
||||
|
||||
@@ -101,8 +89,8 @@ Flags can be combined.
|
||||
Regenerates the boot seed files in `boot/`. These are pre-compiled mcode IR (JSON) files that bootstrap the compilation pipeline on cold start.
|
||||
|
||||
```bash
|
||||
./cell --core . seed.ce # regenerate all boot seeds
|
||||
./cell --core . seed.ce --clean # also clear the build cache after
|
||||
cell seed # regenerate all boot seeds
|
||||
cell seed --clean # also clear the build cache after
|
||||
```
|
||||
|
||||
The script compiles each pipeline module (tokenize, parse, fold, mcode, streamline) and `internal/bootstrap.cm` through the current pipeline, encodes the output as JSON, and writes it to `boot/<name>.cm.mcode`.
|
||||
@@ -117,7 +105,7 @@ The script compiles each pipeline module (tokenize, parse, fold, mcode, streamli
|
||||
The optimizer flight recorder. Runs the full pipeline with structured logging and outputs machine-readable, diff-friendly JSON. This is the most detailed tool for understanding what the optimizer did and why.
|
||||
|
||||
```bash
|
||||
./cell --core . ir_report.ce [options] <file.ce|file.cm>
|
||||
cell ir_report [options] <file.ce|file.cm>
|
||||
```
|
||||
|
||||
### Options
|
||||
@@ -246,16 +234,16 @@ Properties:
|
||||
|
||||
```bash
|
||||
# what passes changed something?
|
||||
./cell --core . ir_report.ce --summary myfile.ce | jq 'select(.changed)'
|
||||
cell ir_report --summary myfile.ce | jq 'select(.changed)'
|
||||
|
||||
# list all rewrite rules that fired
|
||||
./cell --core . ir_report.ce --events myfile.ce | jq 'select(.type == "event") | .rule'
|
||||
cell ir_report --events myfile.ce | jq 'select(.type == "event") | .rule'
|
||||
|
||||
# diff IR before and after optimization
|
||||
./cell --core . ir_report.ce --ir-all myfile.ce | jq -r 'select(.type == "ir") | .text'
|
||||
cell ir_report --ir-all myfile.ce | jq -r 'select(.type == "ir") | .text'
|
||||
|
||||
# full report for analysis
|
||||
./cell --core . ir_report.ce --full myfile.ce > report.json
|
||||
cell ir_report --full myfile.ce > report.json
|
||||
```
|
||||
|
||||
## ir_stats.cm
|
||||
|
||||
@@ -130,9 +130,9 @@ Seeds are used during cold start (empty cache) to compile the pipeline modules f
|
||||
|
||||
| File | Purpose |
|
||||
|------|---------|
|
||||
| `dump_mcode.cm` | Print raw Mcode IR before streamlining |
|
||||
| `dump_stream.cm` | Print IR after streamlining with before/after stats |
|
||||
| `dump_types.cm` | Print streamlined IR with type annotations |
|
||||
| `mcode.ce --pretty` | Print raw Mcode IR before streamlining |
|
||||
| `streamline.ce --types` | Print streamlined IR with type annotations |
|
||||
| `streamline.ce --stats` | Print IR after streamlining with before/after stats |
|
||||
|
||||
## Test Files
|
||||
|
||||
|
||||
@@ -257,17 +257,17 @@ The `+` operator is excluded from target slot propagation when it would use the
|
||||
|
||||
## Debugging Tools
|
||||
|
||||
Three dump tools inspect the IR at different stages:
|
||||
CLI tools inspect the IR at different stages:
|
||||
|
||||
- **`dump_mcode.cm`** — prints the raw Mcode IR after `mcode.cm`, before streamlining
|
||||
- **`dump_stream.cm`** — prints the IR after streamlining, with before/after instruction counts
|
||||
- **`dump_types.cm`** — prints the streamlined IR with type annotations on each instruction
|
||||
- **`cell mcode --pretty`** — prints the raw Mcode IR after `mcode.cm`, before streamlining
|
||||
- **`cell streamline --stats`** — prints the IR after streamlining, with before/after instruction counts
|
||||
- **`cell streamline --types`** — prints the streamlined IR with type annotations on each instruction
|
||||
|
||||
Usage:
|
||||
```
|
||||
./cell --core . dump_mcode.cm <file.ce|file.cm>
|
||||
./cell --core . dump_stream.cm <file.ce|file.cm>
|
||||
./cell --core . dump_types.cm <file.ce|file.cm>
|
||||
cell mcode --pretty <file.ce|file.cm>
|
||||
cell streamline --stats <file.ce|file.cm>
|
||||
cell streamline --types <file.ce|file.cm>
|
||||
```
|
||||
|
||||
## Tail Call Marking
|
||||
|
||||
Reference in New Issue
Block a user