Merge branch 'sem_grab'

This commit is contained in:
2026-02-18 10:35:25 -06:00
16 changed files with 664 additions and 572 deletions

View File

@@ -15,65 +15,51 @@ The compiler runs in stages:
source → tokenize → parse → fold → mcode → streamline → output
```
Each stage has a corresponding dump tool that lets you see its output.
Each stage has a corresponding CLI tool that lets you see its output.
| Stage | Tool | What it shows |
|-------------|-------------------|----------------------------------------|
| fold | `dump_ast.cm` | Folded AST as JSON |
| mcode | `dump_mcode.cm` | Raw mcode IR before optimization |
| streamline | `dump_stream.cm` | Before/after instruction counts + IR |
| streamline | `dump_types.cm` | Optimized IR with type annotations |
| streamline | `streamline.ce` | Full optimized IR as JSON |
| all | `ir_report.ce` | Structured optimizer flight recorder |
| Stage | Tool | What it shows |
|-------------|---------------------------|----------------------------------------|
| tokenize | `tokenize.ce` | Token stream as JSON |
| parse | `parse.ce` | Unfolded AST as JSON |
| fold | `fold.ce` | Folded AST as JSON |
| mcode | `mcode.ce` | Raw mcode IR as JSON |
| mcode | `mcode.ce --pretty` | Human-readable mcode IR |
| streamline | `streamline.ce` | Full optimized IR as JSON |
| streamline | `streamline.ce --types` | Optimized IR with type annotations |
| streamline | `streamline.ce --stats` | Per-function summary stats |
| streamline | `streamline.ce --ir` | Human-readable canonical IR |
| all | `ir_report.ce` | Structured optimizer flight recorder |
All tools take a source file as input and run the pipeline up to the relevant stage.
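The prefix-running behavior can be sketched as follows. This is a hypothetical model of the staged pipeline, not the real implementation — the stage functions are toy stand-ins:

```python
# Sketch: each CLI tool runs a prefix of the same staged pipeline.
# The stage names mirror the compiler's stages; the implementations
# below are illustrative stand-ins that just record what ran.

STAGES = ["tokenize", "parse", "fold", "mcode", "streamline"]

def run_until(source, stage, impls):
    """Run pipeline stages in order, stopping after `stage`."""
    result = source
    for name in STAGES:
        result = impls[name](result)
        if name == stage:
            break
    return result

# Toy implementations that append their stage name to the result.
impls = {name: (lambda n: lambda r: r + [n])(name) for name in STAGES}

print(run_until([], "fold", impls))        # stages up to and including fold
print(run_until([], "streamline", impls))  # the full pipeline
```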
## Quick Start
```bash
# see raw mcode IR
./cell --core . dump_mcode.cm myfile.ce
# see raw mcode IR (pretty-printed)
cell mcode --pretty myfile.ce
# see what the optimizer changed
./cell --core . dump_stream.cm myfile.ce
# see optimized IR with type annotations
cell streamline --types myfile.ce
# full optimizer report with events
./cell --core . ir_report.ce --full myfile.ce
cell ir_report --full myfile.ce
```
## dump_ast.cm
## fold.ce
Prints the folded AST as JSON. This is the output of the parser and constant folder, before mcode generation.
```bash
./cell --core . dump_ast.cm <file.ce|file.cm>
cell fold <file.ce|file.cm>
```
## dump_mcode.cm
## mcode.ce
Prints the raw mcode IR before any optimization. Shows the instruction array as formatted text with opcode, operands, and program counter.
Prints mcode IR. Default output is JSON; use `--pretty` for human-readable format with opcodes, operands, and program counter.
```bash
./cell --core . dump_mcode.cm <file.ce|file.cm>
```
## dump_stream.cm
Shows a before/after comparison of the optimizer. For each function, prints:
- Instruction count before and after
- Number of eliminated instructions
- The streamlined IR (nops hidden by default)
```bash
./cell --core . dump_stream.cm <file.ce|file.cm>
```
## dump_types.cm
Shows the optimized IR with type annotations. Each instruction is followed by the known types of its slot operands, inferred by walking the instruction stream.
```bash
./cell --core . dump_types.cm <file.ce|file.cm>
cell mcode <file.ce|file.cm> # JSON (default)
cell mcode --pretty <file.ce|file.cm> # human-readable IR
```
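The type annotations come from a single forward walk over the instruction stream: each instruction's destination slot is assigned the type implied by its opcode, and ops whose result can't be predicted reset the slot to unknown. A minimal sketch of that idea — the opcodes and type lattice here are illustrative, not the real IR:

```python
# Sketch of forward type tracking over a linear instruction stream.
# Opcodes and types are illustrative, not the compiler's actual IR.

INT_OPS = {"int", "length", "bitand"}
BOOL_OPS = {"eq_int", "not"}

def track_types(instructions):
    """Return {slot: type} after one in-order walk of the stream."""
    slot_types = {}
    for op, dest, *rest in instructions:
        if op in INT_OPS:
            slot_types[dest] = "int"
        elif op in BOOL_OPS:
            slot_types[dest] = "bool"
        elif op == "concat":
            slot_types[dest] = "text"
        elif op == "move":
            # A move copies whatever type the source slot had.
            slot_types[dest] = slot_types.get(rest[0], "unknown")
        else:
            # Calls, loads, generic arithmetic: result type unknown.
            slot_types[dest] = "unknown"
    return slot_types

stream = [("int", 0), ("move", 1, 0), ("eq_int", 2, 0, 1), ("invoke", 3)]
print(track_types(stream))  # {0: 'int', 1: 'int', 2: 'bool', 3: 'unknown'}
```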
## streamline.ce
@@ -81,10 +67,11 @@ Shows the optimized IR with type annotations. Each instruction is followed by th
Runs the full pipeline (tokenize, parse, fold, mcode, streamline) and outputs the optimized IR as JSON. Useful for piping to `jq` or saving for comparison.
```bash
./cell --core . streamline.ce <file.ce|file.cm> # full JSON (default)
./cell --core . streamline.ce --stats <file.ce|file.cm> # summary stats per function
./cell --core . streamline.ce --ir <file.ce|file.cm> # human-readable IR
./cell --core . streamline.ce --check <file.ce|file.cm> # warnings only
cell streamline <file.ce|file.cm> # full JSON (default)
cell streamline --stats <file.ce|file.cm> # summary stats per function
cell streamline --ir <file.ce|file.cm> # human-readable IR
cell streamline --check <file.ce|file.cm> # warnings only
cell streamline --types <file.ce|file.cm> # IR with type annotations
```
| Flag | Description |
@@ -93,6 +80,7 @@ Runs the full pipeline (tokenize, parse, fold, mcode, streamline) and outputs th
| `--stats` | Per-function summary: args, slots, instruction counts by category, nops eliminated |
| `--ir` | Human-readable canonical IR (same format as `ir_report.ce`) |
| `--check` | Warnings only (e.g. `nr_slots > 200` approaching 255 limit) |
| `--types` | Optimized IR with inferred type annotations per slot |
Flags can be combined.
@@ -101,8 +89,8 @@ Flags can be combined.
Regenerates the boot seed files in `boot/`. These are pre-compiled mcode IR (JSON) files that bootstrap the compilation pipeline on cold start.
```bash
./cell --core . seed.ce # regenerate all boot seeds
./cell --core . seed.ce --clean # also clear the build cache after
cell seed # regenerate all boot seeds
cell seed --clean # also clear the build cache after
```
The script compiles each pipeline module (tokenize, parse, fold, mcode, streamline) and `internal/bootstrap.cm` through the current pipeline, encodes the output as JSON, and writes it to `boot/<name>.cm.mcode`.
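That regeneration loop can be sketched as follows. `compile_module` is a hypothetical stand-in for running the real pipeline; the module list and the `boot/<name>.cm.mcode` layout follow the description above:

```python
# Sketch of the seed-regeneration loop: compile each pipeline module,
# encode the result as JSON, write it to boot/<name>.cm.mcode.
import json
import os
import tempfile

MODULES = ["tokenize", "parse", "fold", "mcode", "streamline", "internal/bootstrap"]

def regenerate_seeds(boot_dir, compile_module):
    os.makedirs(boot_dir, exist_ok=True)
    written = []
    for mod in MODULES:
        ir = compile_module(mod)              # run the current pipeline
        name = mod.split("/")[-1]             # internal/bootstrap -> bootstrap
        path = os.path.join(boot_dir, f"{name}.cm.mcode")
        with open(path, "w") as f:
            json.dump(ir, f)                  # pre-compiled IR as JSON
        written.append(path)
    return written

# Usage with a stub compiler:
with tempfile.TemporaryDirectory() as d:
    paths = regenerate_seeds(d, lambda mod: {"module": mod, "instructions": []})
    print([os.path.basename(p) for p in paths])
```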
@@ -117,7 +105,7 @@ The script compiles each pipeline module (tokenize, parse, fold, mcode, streamli
The optimizer flight recorder. Runs the full pipeline with structured logging and outputs machine-readable, diff-friendly JSON. This is the most detailed tool for understanding what the optimizer did and why.
```bash
./cell --core . ir_report.ce [options] <file.ce|file.cm>
cell ir_report [options] <file.ce|file.cm>
```
### Options
@@ -246,16 +234,16 @@ Properties:
```bash
# what passes changed something?
./cell --core . ir_report.ce --summary myfile.ce | jq 'select(.changed)'
cell ir_report --summary myfile.ce | jq 'select(.changed)'
# list all rewrite rules that fired
./cell --core . ir_report.ce --events myfile.ce | jq 'select(.type == "event") | .rule'
cell ir_report --events myfile.ce | jq 'select(.type == "event") | .rule'
# diff IR before and after optimization
./cell --core . ir_report.ce --ir-all myfile.ce | jq -r 'select(.type == "ir") | .text'
cell ir_report --ir-all myfile.ce | jq -r 'select(.type == "ir") | .text'
# full report for analysis
./cell --core . ir_report.ce --full myfile.ce > report.json
cell ir_report --full myfile.ce > report.json
```
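The jq filters work because the report is newline-delimited JSON: one self-describing object per line. The same queries are easy to reproduce in a script — a sketch, where the record shapes follow the examples above and any field beyond `type`, `changed`, and `rule` is an assumption:

```python
import json

# One JSON object per line, as in the ir_report output queried above.
report = """\
{"type": "pass", "name": "fold_const", "changed": true}
{"type": "pass", "name": "dead_code", "changed": false}
{"type": "event", "rule": "add_to_inc"}
"""

records = [json.loads(line) for line in report.splitlines()]

# Equivalent of: jq 'select(.changed)'
changed = [r["name"] for r in records if r.get("changed")]

# Equivalent of: jq 'select(.type == "event") | .rule'
rules = [r["rule"] for r in records if r.get("type") == "event"]

print(changed)  # ['fold_const']
print(rules)    # ['add_to_inc']
```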
## ir_stats.cm

View File

@@ -130,9 +130,9 @@ Seeds are used during cold start (empty cache) to compile the pipeline modules f
| File | Purpose |
|------|---------|
| `dump_mcode.cm` | Print raw Mcode IR before streamlining |
| `dump_stream.cm` | Print IR after streamlining with before/after stats |
| `dump_types.cm` | Print streamlined IR with type annotations |
| `mcode.ce --pretty` | Print raw Mcode IR before streamlining |
| `streamline.ce --types` | Print streamlined IR with type annotations |
| `streamline.ce --stats` | Print per-function summary stats after streamlining |
## Test Files

View File

@@ -257,17 +257,17 @@ The `+` operator is excluded from target slot propagation when it would use the
## Debugging Tools
Three dump tools inspect the IR at different stages:
CLI tools inspect the IR at different stages:
- **`dump_mcode.cm`** — prints the raw Mcode IR after `mcode.cm`, before streamlining
- **`dump_stream.cm`** — prints the IR after streamlining, with before/after instruction counts
- **`dump_types.cm`** — prints the streamlined IR with type annotations on each instruction
- **`cell mcode --pretty`** — prints the raw Mcode IR after `mcode.cm`, before streamlining
- **`cell streamline --stats`** — prints per-function summary stats after streamlining (instruction counts by category, nops eliminated)
- **`cell streamline --types`** — prints the streamlined IR with type annotations on each instruction
Usage:
```
./cell --core . dump_mcode.cm <file.ce|file.cm>
./cell --core . dump_stream.cm <file.ce|file.cm>
./cell --core . dump_types.cm <file.ce|file.cm>
cell mcode --pretty <file.ce|file.cm>
cell streamline --stats <file.ce|file.cm>
cell streamline --types <file.ce|file.cm>
```
## Tail Call Marking

View File

@@ -1,16 +0,0 @@
// dump_ast.cm — pretty-print the folded AST as JSON
//
// Usage: ./cell --core . dump_ast.cm <file.ce|file.cm>
var fd = use("fd")
var json = use("json")
var tokenize = use("tokenize")
var parse = use("parse")
var fold = use("fold")
var filename = args[0]
var src = text(fd.slurp(filename))
var tok = tokenize(src, filename)
var ast = parse(tok.tokens, src, filename, tokenize)
var folded = fold(ast)
print(json.encode(folded))

View File

@@ -1,117 +0,0 @@
// dump_mcode.cm — pretty-print mcode IR (before streamlining)
//
// Usage: ./cell --core . dump_mcode.cm <file.ce|file.cm>
var fd = use("fd")
var json = use("json")
var tokenize = use("tokenize")
var parse = use("parse")
var fold = use("fold")
var mcode = use("mcode")
if (length(args) < 1) {
print("usage: cell --core . dump_mcode.cm <file>")
return
}
var filename = args[0]
var src = text(fd.slurp(filename))
var tok = tokenize(src, filename)
var ast = parse(tok.tokens, src, filename, tokenize)
var folded = fold(ast)
var compiled = mcode(folded)
var pad_right = function(s, w) {
var r = s
while (length(r) < w) {
r = r + " "
}
return r
}
var fmt_val = function(v) {
if (is_null(v)) {
return "null"
}
if (is_number(v)) {
return text(v)
}
if (is_text(v)) {
return `"${v}"`
}
if (is_object(v)) {
return json.encode(v)
}
if (is_logical(v)) {
return v ? "true" : "false"
}
return text(v)
}
var dump_function = function(func, name) {
var nr_args = func.nr_args != null ? func.nr_args : 0
var nr_slots = func.nr_slots != null ? func.nr_slots : 0
var nr_close = func.nr_close_slots != null ? func.nr_close_slots : 0
var instrs = func.instructions
var i = 0
var pc = 0
var instr = null
var op = null
var n = 0
var parts = null
var j = 0
var operands = null
var pc_str = null
var op_str = null
print(`\n=== ${name} (args=${text(nr_args)}, slots=${text(nr_slots)}, closures=${text(nr_close)}) ===`)
if (instrs == null || length(instrs) == 0) {
print(" (empty)")
return null
}
while (i < length(instrs)) {
instr = instrs[i]
if (is_text(instr)) {
if (!starts_with(instr, "_nop_")) {
print(`${instr}:`)
}
} else if (is_array(instr)) {
op = instr[0]
n = length(instr)
parts = []
j = 1
while (j < n - 2) {
push(parts, fmt_val(instr[j]))
j = j + 1
}
operands = text(parts, ", ")
pc_str = pad_right(text(pc), 5)
op_str = pad_right(op, 14)
print(` ${pc_str} ${op_str} ${operands}`)
pc = pc + 1
}
i = i + 1
}
return null
}
var main_name = null
var fi = 0
var func = null
var fname = null
// Dump main
if (compiled.main != null) {
main_name = compiled.name != null ? compiled.name : "<main>"
dump_function(compiled.main, main_name)
}
// Dump sub-functions
if (compiled.functions != null) {
fi = 0
while (fi < length(compiled.functions)) {
func = compiled.functions[fi]
fname = func.name != null ? func.name : `<func_${text(fi)}>`
dump_function(func, `[${text(fi)}] ${fname}`)
fi = fi + 1
}
}

View File

@@ -1,237 +0,0 @@
// dump_types.cm — show streamlined IR with type annotations
//
// Usage: ./cell --core . dump_types.cm <file.ce|file.cm>
var fd = use("fd")
var json = use("json")
var tokenize = use("tokenize")
var parse = use("parse")
var fold = use("fold")
var mcode = use("mcode")
var streamline = use("streamline")
if (length(args) < 1) {
print("usage: cell --core . dump_types.cm <file>")
return
}
var filename = args[0]
var src = text(fd.slurp(filename))
var tok = tokenize(src, filename)
var ast = parse(tok.tokens, src, filename, tokenize)
var folded = fold(ast)
var compiled = mcode(folded)
var optimized = streamline(compiled)
// Type constants
def T_UNKNOWN = "unknown"
def T_INT = "int"
def T_FLOAT = "float"
def T_NUM = "num"
def T_TEXT = "text"
def T_BOOL = "bool"
def T_NULL = "null"
def T_ARRAY = "array"
def T_RECORD = "record"
def T_FUNCTION = "function"
def int_result_ops = {
bitnot: true, bitand: true, bitor: true,
bitxor: true, shl: true, shr: true, ushr: true
}
def bool_result_ops = {
eq_int: true, ne_int: true, lt_int: true, gt_int: true,
le_int: true, ge_int: true,
eq_float: true, ne_float: true, lt_float: true, gt_float: true,
le_float: true, ge_float: true,
eq_text: true, ne_text: true, lt_text: true, gt_text: true,
le_text: true, ge_text: true,
eq_bool: true, ne_bool: true,
not: true, and: true, or: true,
is_int: true, is_text: true, is_num: true,
is_bool: true, is_null: true, is_identical: true,
is_array: true, is_func: true, is_record: true, is_stone: true
}
var access_value_type = function(val) {
if (is_number(val)) {
return is_integer(val) ? T_INT : T_FLOAT
}
if (is_text(val)) {
return T_TEXT
}
return T_UNKNOWN
}
var track_types = function(slot_types, instr) {
var op = instr[0]
var src_type = null
if (op == "access") {
slot_types[text(instr[1])] = access_value_type(instr[2])
} else if (op == "int") {
slot_types[text(instr[1])] = T_INT
} else if (op == "true" || op == "false") {
slot_types[text(instr[1])] = T_BOOL
} else if (op == "null") {
slot_types[text(instr[1])] = T_NULL
} else if (op == "move") {
src_type = slot_types[text(instr[2])]
slot_types[text(instr[1])] = src_type != null ? src_type : T_UNKNOWN
} else if (int_result_ops[op] == true) {
slot_types[text(instr[1])] = T_INT
} else if (op == "concat") {
slot_types[text(instr[1])] = T_TEXT
} else if (bool_result_ops[op] == true) {
slot_types[text(instr[1])] = T_BOOL
} else if (op == "typeof") {
slot_types[text(instr[1])] = T_TEXT
} else if (op == "array") {
slot_types[text(instr[1])] = T_ARRAY
} else if (op == "record") {
slot_types[text(instr[1])] = T_RECORD
} else if (op == "function") {
slot_types[text(instr[1])] = T_FUNCTION
} else if (op == "invoke" || op == "tail_invoke") {
slot_types[text(instr[2])] = T_UNKNOWN
} else if (op == "load_field" || op == "load_index" || op == "load_dynamic") {
slot_types[text(instr[1])] = T_UNKNOWN
} else if (op == "pop" || op == "get") {
slot_types[text(instr[1])] = T_UNKNOWN
} else if (op == "length") {
slot_types[text(instr[1])] = T_INT
} else if (op == "add" || op == "subtract" || op == "multiply" ||
op == "divide" || op == "modulo" || op == "pow" || op == "negate") {
slot_types[text(instr[1])] = T_UNKNOWN
}
return null
}
var pad_right = function(s, w) {
var r = s
while (length(r) < w) {
r = r + " "
}
return r
}
var fmt_val = function(v) {
if (is_null(v)) {
return "null"
}
if (is_number(v)) {
return text(v)
}
if (is_text(v)) {
return `"${v}"`
}
if (is_object(v)) {
return json.encode(v)
}
if (is_logical(v)) {
return v ? "true" : "false"
}
return text(v)
}
// Build type annotation string for an instruction
var type_annotation = function(slot_types, instr) {
var n = length(instr)
var parts = []
var j = 1
var v = null
var t = null
while (j < n - 2) {
v = instr[j]
if (is_number(v)) {
t = slot_types[text(v)]
if (t != null && t != T_UNKNOWN) {
push(parts, `s${text(v)}:${t}`)
}
}
j = j + 1
}
if (length(parts) == 0) {
return ""
}
return text(parts, " ")
}
var dump_function_typed = function(func, name) {
var nr_args = func.nr_args != null ? func.nr_args : 0
var nr_slots = func.nr_slots != null ? func.nr_slots : 0
var instrs = func.instructions
var slot_types = {}
var i = 0
var pc = 0
var instr = null
var op = null
var n = 0
var annotation = null
var operand_parts = null
var j = 0
var operands = null
var pc_str = null
var op_str = null
var line = null
print(`\n=== ${name} (args=${text(nr_args)}, slots=${text(nr_slots)}) ===`)
if (instrs == null || length(instrs) == 0) {
print(" (empty)")
return null
}
while (i < length(instrs)) {
instr = instrs[i]
if (is_text(instr)) {
if (starts_with(instr, "_nop_")) {
i = i + 1
continue
}
slot_types = {}
print(`${instr}:`)
} else if (is_array(instr)) {
op = instr[0]
n = length(instr)
annotation = type_annotation(slot_types, instr)
operand_parts = []
j = 1
while (j < n - 2) {
push(operand_parts, fmt_val(instr[j]))
j = j + 1
}
operands = text(operand_parts, ", ")
pc_str = pad_right(text(pc), 5)
op_str = pad_right(op, 14)
line = pad_right(` ${pc_str} ${op_str} ${operands}`, 50)
if (length(annotation) > 0) {
print(`${line} ; ${annotation}`)
} else {
print(line)
}
track_types(slot_types, instr)
pc = pc + 1
}
i = i + 1
}
return null
}
var main_name = null
var fi = 0
var func = null
var fname = null
// Dump main
if (optimized.main != null) {
main_name = optimized.name != null ? optimized.name : "<main>"
dump_function_typed(optimized.main, main_name)
}
// Dump sub-functions
if (optimized.functions != null) {
fi = 0
while (fi < length(optimized.functions)) {
func = optimized.functions[fi]
fname = func.name != null ? func.name : `<func_${text(fi)}>`
dump_function_typed(func, `[${text(fi)}] ${fname}`)
fi = fi + 1
}
}

View File

@@ -8,36 +8,9 @@
var fd = use('fd')
var json = use('json')
var tokenize_mod = use('tokenize')
var parse_mod = use('parse')
var fold_mod = use('fold')
var index_mod = use('index')
var explain_mod = use('explain')
var shop = use('internal/shop')
// Resolve import paths on an index in-place.
var resolve_imports = function(idx_obj, fname) {
var fi = shop.file_info(fd.realpath(fname))
var ctx = fi.package
var ri = 0
var rp = null
var lp = null
while (ri < length(idx_obj.imports)) {
rp = shop.resolve_use_path(idx_obj.imports[ri].module_path, ctx)
// Fallback: check sibling files in the same directory.
if (rp == null) {
lp = fd.dirname(fd.realpath(fname)) + '/' + idx_obj.imports[ri].module_path + '.cm'
if (fd.is_file(lp)) {
rp = lp
}
}
if (rp != null) {
idx_obj.imports[ri].resolved_path = rp
}
ri = ri + 1
}
}
var mode = null
var span_arg = null
var symbol_name = null
@@ -47,12 +20,10 @@ var parts = null
var filename = null
var line = null
var col = null
var src = null
var idx = null
var indexes = []
var explain = null
var result = null
var pipeline = {tokenize: tokenize_mod, parse: parse_mod, fold: fold_mod}
for (i = 0; i < length(args); i++) {
if (args[i] == '--span') {
@@ -108,9 +79,7 @@ if (mode == "span") {
$stop()
}
src = text(fd.slurp(filename))
idx = index_mod.index_file(src, filename, pipeline)
resolve_imports(idx, filename)
idx = shop.index_file(filename)
explain = explain_mod.make(idx)
result = explain.at_span(line, col)
@@ -139,11 +108,8 @@ if (mode == "symbol") {
}
if (length(files) == 1) {
// Single file: use by_symbol for a focused result.
filename = files[0]
src = text(fd.slurp(filename))
idx = index_mod.index_file(src, filename, pipeline)
resolve_imports(idx, filename)
idx = shop.index_file(filename)
explain = explain_mod.make(idx)
result = explain.by_symbol(symbol_name)
@@ -154,13 +120,10 @@ if (mode == "symbol") {
print("\n")
}
} else if (length(files) > 1) {
// Multiple files: index each and search across all.
indexes = []
i = 0
while (i < length(files)) {
src = text(fd.slurp(files[i]))
idx = index_mod.index_file(src, files[i], pipeline)
resolve_imports(idx, files[i])
idx = shop.index_file(files[i])
indexes[] = idx
i = i + 1
}

fold.ce
View File

@@ -1,13 +1,5 @@
var fd = use("fd")
var json = use("json")
var shop = use("internal/shop")
var filename = args[0]
var src = text(fd.slurp(filename))
var tokenize = use("tokenize")
var parse = use("parse")
var fold = use("fold")
var tok_result = tokenize(src, filename)
var ast = parse(tok_result.tokens, src, filename, tokenize)
var folded = fold(ast)
var folded = shop.analyze_file(filename)
print(json.encode(folded))

View File

@@ -7,19 +7,11 @@
var fd = use('fd')
var json = use('json')
var tokenize_mod = use('tokenize')
var parse_mod = use('parse')
var fold_mod = use('fold')
var index_mod = use('index')
var shop = use('internal/shop')
var filename = null
var output_path = null
var i = 0
var file_info = null
var pkg_ctx = null
var resolved = null
var local_path = null
for (i = 0; i < length(args); i++) {
if (args[i] == '-o' || args[i] == '--output') {
@@ -53,29 +45,7 @@ if (!fd.is_file(filename)) {
$stop()
}
var src = text(fd.slurp(filename))
var pipeline = {tokenize: tokenize_mod, parse: parse_mod, fold: fold_mod}
var idx = index_mod.index_file(src, filename, pipeline)
// Resolve import paths to filesystem locations.
file_info = shop.file_info(fd.realpath(filename))
pkg_ctx = file_info.package
i = 0
while (i < length(idx.imports)) {
resolved = shop.resolve_use_path(idx.imports[i].module_path, pkg_ctx)
// Fallback: check sibling files in the same directory.
if (resolved == null) {
local_path = fd.dirname(fd.realpath(filename)) + '/' + idx.imports[i].module_path + '.cm'
if (fd.is_file(local_path)) {
resolved = local_path
}
}
if (resolved != null) {
idx.imports[i].resolved_path = resolved
}
i = i + 1
}
var idx = shop.index_file(filename)
var out = json.encode(idx, true)
if (output_path != null) {

View File

@@ -22,6 +22,7 @@ var index_ast = function(ast, tokens, filename) {
var references = []
var call_sites = []
var exports_list = []
var intrinsic_refs = []
var node_counter = 0
var fn_map = {}
var _i = 0
@@ -147,6 +148,29 @@ var index_ast = function(ast, tokens, filename) {
nid = next_id()
// this keyword
if (node.kind == "this") {
references[] = {
node_id: nid,
name: "this",
symbol_id: null,
span: make_span(node),
enclosing: enclosing,
ref_kind: "read"
}
return
}
// Capture intrinsic refs with positions (intrinsics lack function_nr).
if (node.kind == "name" && node.name != null && node.intrinsic == true) {
intrinsic_refs[] = {
node_id: nid,
name: node.name,
span: make_span(node),
enclosing: enclosing
}
}
// Name reference — has function_nr when it's a true variable reference.
if (node.kind == "name" && node.name != null && node.function_nr != null) {
if (node.intrinsic != true) {
@@ -208,6 +232,17 @@ var index_ast = function(ast, tokens, filename) {
}
}
// Capture intrinsic callee refs (e.g., print, length).
if (node.expression != null && node.expression.kind == "name" &&
node.expression.intrinsic == true && node.expression.name != null) {
intrinsic_refs[] = {
node_id: nid,
name: node.expression.name,
span: make_span(node.expression),
enclosing: enclosing
}
}
// Walk callee expression (skip name — already recorded above).
if (node.expression != null && node.expression.kind != "name") {
walk_expr(node.expression, enclosing, false)
@@ -596,6 +631,7 @@ var index_ast = function(ast, tokens, filename) {
imports: imports,
symbols: symbols,
references: references,
intrinsic_refs: intrinsic_refs,
call_sites: call_sites,
exports: exports_list,
reverse_refs: reverse

View File

@@ -510,6 +510,139 @@ function inject_env(inject) {
return env
}
// --- Pipeline API ---
// Lazy-loaded pipeline modules from use_cache (no re-entrancy risk).
var _tokenize_mod = null
var _parse_mod = null
var _fold_mod = null
var _index_mod = null
var _token_cache = {}
var _ast_cache = {}
var _analyze_cache = {}
var _index_cache = {}
var get_tokenize = function() {
if (!_tokenize_mod) _tokenize_mod = use_cache['core/tokenize'] || use_cache['tokenize']
return _tokenize_mod
}
var get_parse = function() {
if (!_parse_mod) _parse_mod = use_cache['core/parse'] || use_cache['parse']
return _parse_mod
}
var get_fold = function() {
if (!_fold_mod) _fold_mod = use_cache['core/fold'] || use_cache['fold']
return _fold_mod
}
var get_index = function() {
if (!_index_mod) {
_index_mod = use_cache['core/index'] || use_cache['index']
if (!_index_mod) _index_mod = Shop.use('index', 'core')
}
return _index_mod
}
Shop.tokenize_file = function(path) {
var src = text(fd.slurp(path))
var key = content_hash(stone(blob(src)))
if (_token_cache[key]) return _token_cache[key]
var result = get_tokenize()(src, path)
_token_cache[key] = result
return result
}
Shop.parse_file = function(path) {
var src = text(fd.slurp(path))
var key = content_hash(stone(blob(src)))
if (_ast_cache[key]) return _ast_cache[key]
var tok = Shop.tokenize_file(path)
var ast = get_parse()(tok.tokens, src, path, get_tokenize())
_ast_cache[key] = ast
return ast
}
Shop.analyze_file = function(path) {
var src = text(fd.slurp(path))
var key = content_hash(stone(blob(src)))
if (_analyze_cache[key]) return _analyze_cache[key]
var ast = Shop.parse_file(path)
var folded = get_fold()(ast)
_analyze_cache[key] = folded
return folded
}
// Resolve import paths on an index in-place.
Shop.resolve_imports = function(idx_obj, fname) {
var fi = Shop.file_info(fd.realpath(fname))
var ctx = fi.package
var ri = 0
var rp = null
var lp = null
while (ri < length(idx_obj.imports)) {
rp = Shop.resolve_use_path(idx_obj.imports[ri].module_path, ctx)
if (rp == null) {
lp = fd.dirname(fd.realpath(fname)) + '/' + idx_obj.imports[ri].module_path + '.cm'
if (fd.is_file(lp)) {
rp = lp
}
}
if (rp != null) {
idx_obj.imports[ri].resolved_path = rp
}
ri = ri + 1
}
}
Shop.index_file = function(path) {
var src = text(fd.slurp(path))
var key = content_hash(stone(blob(src)))
if (_index_cache[key]) return _index_cache[key]
var tok = Shop.tokenize_file(path)
var pipeline = {tokenize: get_tokenize(), parse: get_parse(), fold: get_fold()}
var idx = get_index().index_file(src, path, pipeline)
Shop.resolve_imports(idx, path)
_index_cache[key] = idx
return idx
}
Shop.pipeline = function() {
return {
tokenize: get_tokenize(),
parse: get_parse(),
fold: get_fold(),
mcode: use_cache['core/mcode'] || use_cache['mcode'],
streamline: use_cache['core/streamline'] || use_cache['streamline']
}
}
Shop.all_script_paths = function() {
var packages = Shop.list_packages()
var result = []
var i = 0
var j = 0
var scripts = null
var pkg_dir = null
var has_core = false
for (i = 0; i < length(packages); i++) {
if (packages[i] == 'core') has_core = true
}
if (!has_core) {
packages = array(packages, ['core'])
}
for (i = 0; i < length(packages); i++) {
pkg_dir = starts_with(packages[i], '/') ? packages[i] : get_packages_dir() + '/' + safe_package_path(packages[i])
scripts = get_package_scripts(packages[i])
for (j = 0; j < length(scripts); j++) {
result[] = {
package: packages[i],
rel_path: scripts[j],
full_path: pkg_dir + '/' + scripts[j]
}
}
}
return result
}
// Lazy-loaded compiler modules for on-the-fly compilation
var _mcode_mod = null
var _streamline_mod = null

ls.ce
View File

@@ -1,35 +1,131 @@
// list modules and actors in a package
// if args[0] is a package alias, list that one
// otherwise, list the local one
// list modules and actors in packages
//
// Usage:
// cell ls [<package>] List modules and programs
// cell ls --all List across all packages
// cell ls --modules|-m [<package>] Modules only
// cell ls --programs|-p [<package>] Programs only
// cell ls --paths [<package>] One absolute path per line
var shop = use('internal/shop')
var package = use('package')
var ctx = null
var pkg = args[0] || package.find_package_dir('.')
var modules = package.list_modules(pkg)
var programs = package.list_programs(pkg)
var show_all = false
var show_modules = true
var show_programs = true
var show_paths = false
var filter_modules = false
var filter_programs = false
var pkg_arg = null
var show_help = false
var i = 0
log.console("Modules in " + pkg + ":")
modules = sort(modules)
if (length(modules) == 0) {
log.console(" (none)")
} else {
for (i = 0; i < length(modules); i++) {
log.console(" " + modules[i])
for (i = 0; i < length(args); i++) {
if (args[i] == '--all' || args[i] == '-a') {
show_all = true
} else if (args[i] == '--modules' || args[i] == '-m') {
filter_modules = true
} else if (args[i] == '--programs' || args[i] == '-p') {
filter_programs = true
} else if (args[i] == '--paths') {
show_paths = true
} else if (args[i] == '--help' || args[i] == '-h') {
show_help = true
} else if (!starts_with(args[i], '-')) {
pkg_arg = args[i]
}
}
log.console("")
log.console("Programs in " + pkg + ":")
programs = sort(programs)
if (length(programs) == 0) {
log.console(" (none)")
} else {
for (i = 0; i < length(programs); i++) {
log.console(" " + programs[i])
if (filter_modules || filter_programs) {
show_modules = filter_modules
show_programs = filter_programs
}
var list_one_package = function(pkg) {
var pkg_dir = null
var modules = null
var programs = null
var j = 0
if (starts_with(pkg, '/')) {
pkg_dir = pkg
} else {
pkg_dir = shop.get_package_dir(pkg)
}
if (show_modules) {
modules = sort(package.list_modules(pkg))
if (show_paths) {
for (j = 0; j < length(modules); j++) {
log.console(pkg_dir + '/' + modules[j] + '.cm')
}
} else {
if (!filter_modules || show_all) {
log.console("Modules in " + pkg + ":")
}
if (length(modules) == 0) {
log.console(" (none)")
} else {
for (j = 0; j < length(modules); j++) {
log.console(" " + modules[j])
}
}
}
}
if (show_programs) {
programs = sort(package.list_programs(pkg))
if (show_paths) {
for (j = 0; j < length(programs); j++) {
log.console(pkg_dir + '/' + programs[j] + '.ce')
}
} else {
if (!show_paths && show_modules && !filter_programs) {
log.console("")
}
if (!filter_programs || show_all) {
log.console("Programs in " + pkg + ":")
}
if (length(programs) == 0) {
log.console(" (none)")
} else {
for (j = 0; j < length(programs); j++) {
log.console(" " + programs[j])
}
}
}
}
}
var packages = null
var pkg = null
if (show_help) {
log.console("Usage: cell ls [options] [<package>]")
log.console("")
log.console("Options:")
log.console(" --all, -a List across all installed packages")
log.console(" --modules, -m Show modules only")
log.console(" --programs, -p Show programs only")
log.console(" --paths Output one absolute path per line")
} else if (show_all) {
packages = shop.list_packages()
if (find(packages, function(p) { return p == 'core' }) == null) {
packages[] = 'core'
}
packages = sort(packages)
for (i = 0; i < length(packages); i++) {
if (!show_paths && i > 0) {
log.console("")
}
if (!show_paths) {
log.console("--- " + packages[i] + " ---")
}
list_one_package(packages[i])
}
} else {
pkg = pkg_arg || package.find_package_dir('.')
list_one_package(pkg)
}
$stop()

mcode.ce
View File

@@ -1,13 +1,124 @@
// mcode.ce — compile to mcode IR
//
// Usage:
// cell mcode <file> Full mcode IR as JSON (default)
// cell mcode --pretty <file> Pretty-printed human-readable IR
var fd = use("fd")
var json = use("json")
var tokenize = use("tokenize")
var parse = use("parse")
var fold = use("fold")
var mcode = use("mcode")
var filename = args[0]
var src = text(fd.slurp(filename))
var result = tokenize(src, filename)
var ast = parse(result.tokens, src, filename, tokenize)
var folded = fold(ast)
var compiled = mcode(folded)
print(json.encode(compiled))
var shop = use("internal/shop")
var show_pretty = false
var filename = null
var i = 0
for (i = 0; i < length(args); i++) {
if (args[i] == '--pretty') {
show_pretty = true
} else if (args[i] == '--help' || args[i] == '-h') {
log.console("Usage: cell mcode [--pretty] <file>")
$stop()
} else if (!starts_with(args[i], '-')) {
filename = args[i]
}
}
if (!filename) {
log.console("Usage: cell mcode [--pretty] <file>")
$stop()
}
var folded = shop.analyze_file(filename)
var pl = shop.pipeline()
var compiled = pl.mcode(folded)
if (!show_pretty) {
print(json.encode(compiled))
$stop()
}
// Pretty-print helpers (from dump_mcode.cm)
var pad_right = function(s, w) {
var r = s
while (length(r) < w) {
r = r + " "
}
return r
}
var fmt_val = function(v) {
if (is_null(v)) return "null"
if (is_number(v)) return text(v)
if (is_text(v)) return `"${v}"`
if (is_object(v)) return json.encode(v)
if (is_logical(v)) return v ? "true" : "false"
return text(v)
}
var dump_function = function(func, name) {
var nr_args = func.nr_args != null ? func.nr_args : 0
var nr_slots = func.nr_slots != null ? func.nr_slots : 0
var nr_close = func.nr_close_slots != null ? func.nr_close_slots : 0
var instrs = func.instructions
var i = 0
var pc = 0
var instr = null
var op = null
var n = 0
var parts = null
var j = 0
var operands = null
var pc_str = null
var op_str = null
print(`\n=== ${name} (args=${text(nr_args)}, slots=${text(nr_slots)}, closures=${text(nr_close)}) ===`)
if (instrs == null || length(instrs) == 0) {
print(" (empty)")
return null
}
while (i < length(instrs)) {
instr = instrs[i]
if (is_text(instr)) {
if (!starts_with(instr, "_nop_")) {
print(`${instr}:`)
}
} else if (is_array(instr)) {
op = instr[0]
n = length(instr)
parts = []
j = 1
while (j < n - 2) {
push(parts, fmt_val(instr[j]))
j = j + 1
}
operands = text(parts, ", ")
pc_str = pad_right(text(pc), 5)
op_str = pad_right(op, 14)
print(` ${pc_str} ${op_str} ${operands}`)
pc = pc + 1
}
i = i + 1
}
return null
}
var main_name = null
var fi = 0
var func = null
var fname = null
if (compiled.main != null) {
main_name = compiled.name != null ? compiled.name : "<main>"
dump_function(compiled.main, main_name)
}
if (compiled.functions != null) {
fi = 0
while (fi < length(compiled.functions)) {
func = compiled.functions[fi]
fname = func.name != null ? func.name : `<func_${text(fi)}>`
dump_function(func, `[${text(fi)}] ${fname}`)
fi = fi + 1
}
}
$stop()

View File

@@ -1,9 +1,5 @@
var fd = use("fd")
var json = use("json")
var tokenize = use("tokenize")
var parse = use("parse")
var shop = use("internal/shop")
var filename = args[0]
var src = text(fd.slurp(filename))
var result = tokenize(src, filename)
var ast = parse(result.tokens, src, filename, tokenize)
var ast = shop.parse_file(filename)
print(json.encode(ast, true))

View File

@@ -1,22 +1,20 @@
// streamline.ce — run the full compile + optimize pipeline
//
// Usage:
// pit streamline <file> Full optimized IR as JSON (default)
// pit streamline --stats <file> Summary stats per function
// pit streamline --ir <file> Human-readable IR
// pit streamline --check <file> Warnings only (e.g. high slot count)
// cell streamline <file> Full optimized IR as JSON (default)
// cell streamline --stats <file> Summary stats per function
// cell streamline --ir <file> Human-readable IR
// cell streamline --check <file> Warnings only (e.g. high slot count)
// cell streamline --types <file> Optimized IR with type annotations
var fd = use("fd")
var json = use("json")
var tokenize = use("tokenize")
var parse = use("parse")
var fold = use("fold")
var mcode = use("mcode")
var streamline = use("streamline")
var shop = use("internal/shop")
var show_stats = false
var show_ir = false
var show_check = false
var show_types = false
var filename = null
var i = 0
@@ -27,21 +25,24 @@ for (i = 0; i < length(args); i++) {
show_ir = true
} else if (args[i] == '--check') {
show_check = true
} else if (args[i] == '--types') {
show_types = true
} else if (args[i] == '--help' || args[i] == '-h') {
log.console("Usage: cell streamline [--stats] [--ir] [--check] [--types] <file>")
$stop()
} else if (!starts_with(args[i], '-')) {
filename = args[i]
}
}
if (!filename) {
print("usage: pit streamline [--stats] [--ir] [--check] <file>")
    print("Usage: cell streamline [--stats] [--ir] [--check] [--types] <file>")
$stop()
}
var src = text(fd.slurp(filename))
var result = tokenize(src, filename)
var ast = parse(result.tokens, src, filename, tokenize)
var folded = fold(ast)
var compiled = mcode(folded)
var folded = shop.analyze_file(filename)
var pl = shop.pipeline()
var compiled = pl.mcode(folded)
// Deep copy for before snapshot (needed by --stats)
var before = null
@@ -49,18 +50,16 @@ if (show_stats) {
before = json.decode(json.encode(compiled))
}
var optimized = streamline(compiled)
var optimized = pl.streamline(compiled)
// If no flags, default to full JSON output
if (!show_stats && !show_ir && !show_check) {
if (!show_stats && !show_ir && !show_check && !show_types) {
print(json.encode(optimized, true))
$stop()
}
// --- Helpers ---
var ir_stats = use("ir_stats")
var pad_right = function(s, w) {
var r = s
while (length(r) < w) {
@@ -69,6 +68,15 @@ var pad_right = function(s, w) {
return r
}
var fmt_val = function(v) {
if (is_null(v)) return "null"
if (is_number(v)) return text(v)
if (is_text(v)) return `"${v}"`
if (is_object(v)) return json.encode(v)
if (is_logical(v)) return v ? "true" : "false"
return text(v)
}
var count_nops = function(func) {
var instrs = func.instructions
var nops = 0
@@ -83,6 +91,13 @@ var count_nops = function(func) {
return nops
}
// --- Stats mode ---
var ir_stats = null
if (show_stats || show_ir) {
ir_stats = use("ir_stats")
}
var print_func_stats = function(func, before_func, name) {
var nr_args = func.nr_args != null ? func.nr_args : 0
var nr_slots = func.nr_slots != null ? func.nr_slots : 0
@@ -118,6 +133,164 @@ var check_func = function(func, name) {
}
}
// --- Types mode (ported from dump_types.cm) ---
def T_UNKNOWN = "unknown"
def T_INT = "int"
def T_FLOAT = "float"
def T_NUM = "num"
def T_TEXT = "text"
def T_BOOL = "bool"
def T_NULL = "null"
def T_ARRAY = "array"
def T_RECORD = "record"
def T_FUNCTION = "function"
def int_result_ops = {
bitnot: true, bitand: true, bitor: true,
bitxor: true, shl: true, shr: true, ushr: true
}
def bool_result_ops = {
eq_int: true, ne_int: true, lt_int: true, gt_int: true,
le_int: true, ge_int: true,
eq_float: true, ne_float: true, lt_float: true, gt_float: true,
le_float: true, ge_float: true,
eq_text: true, ne_text: true, lt_text: true, gt_text: true,
le_text: true, ge_text: true,
eq_bool: true, ne_bool: true,
not: true, and: true, or: true,
is_int: true, is_text: true, is_num: true,
is_bool: true, is_null: true, is_identical: true,
is_array: true, is_func: true, is_record: true, is_stone: true
}
var access_value_type = function(val) {
if (is_number(val)) return is_integer(val) ? T_INT : T_FLOAT
if (is_text(val)) return T_TEXT
return T_UNKNOWN
}
var track_types = function(slot_types, instr) {
var op = instr[0]
var src_type = null
if (op == "access") {
slot_types[text(instr[1])] = access_value_type(instr[2])
} else if (op == "int") {
slot_types[text(instr[1])] = T_INT
} else if (op == "true" || op == "false") {
slot_types[text(instr[1])] = T_BOOL
} else if (op == "null") {
slot_types[text(instr[1])] = T_NULL
} else if (op == "move") {
src_type = slot_types[text(instr[2])]
slot_types[text(instr[1])] = src_type != null ? src_type : T_UNKNOWN
} else if (int_result_ops[op] == true) {
slot_types[text(instr[1])] = T_INT
} else if (op == "concat") {
slot_types[text(instr[1])] = T_TEXT
} else if (bool_result_ops[op] == true) {
slot_types[text(instr[1])] = T_BOOL
} else if (op == "typeof") {
slot_types[text(instr[1])] = T_TEXT
} else if (op == "array") {
slot_types[text(instr[1])] = T_ARRAY
} else if (op == "record") {
slot_types[text(instr[1])] = T_RECORD
} else if (op == "function") {
slot_types[text(instr[1])] = T_FUNCTION
} else if (op == "invoke" || op == "tail_invoke") {
slot_types[text(instr[2])] = T_UNKNOWN
} else if (op == "load_field" || op == "load_index" || op == "load_dynamic") {
slot_types[text(instr[1])] = T_UNKNOWN
} else if (op == "pop" || op == "get") {
slot_types[text(instr[1])] = T_UNKNOWN
} else if (op == "length") {
slot_types[text(instr[1])] = T_INT
} else if (op == "add" || op == "subtract" || op == "multiply" ||
op == "divide" || op == "modulo" || op == "pow" || op == "negate") {
slot_types[text(instr[1])] = T_UNKNOWN
}
return null
}
var type_annotation = function(slot_types, instr) {
var n = length(instr)
var parts = []
var j = 1
var v = null
var t = null
while (j < n - 2) {
v = instr[j]
if (is_number(v)) {
t = slot_types[text(v)]
if (t != null && t != T_UNKNOWN) {
push(parts, `s${text(v)}:${t}`)
}
}
j = j + 1
}
if (length(parts) == 0) return ""
return text(parts, " ")
}
var dump_function_typed = function(func, name) {
var nr_args = func.nr_args != null ? func.nr_args : 0
var nr_slots = func.nr_slots != null ? func.nr_slots : 0
var instrs = func.instructions
var slot_types = {}
var i = 0
var pc = 0
var instr = null
var op = null
var n = 0
var annotation = null
var operand_parts = null
var j = 0
var operands = null
var pc_str = null
var op_str = null
var line = null
print(`\n=== ${name} (args=${text(nr_args)}, slots=${text(nr_slots)}) ===`)
if (instrs == null || length(instrs) == 0) {
print(" (empty)")
return null
}
while (i < length(instrs)) {
instr = instrs[i]
if (is_text(instr)) {
if (starts_with(instr, "_nop_")) {
i = i + 1
continue
}
slot_types = {}
print(`${instr}:`)
} else if (is_array(instr)) {
op = instr[0]
n = length(instr)
annotation = type_annotation(slot_types, instr)
operand_parts = []
j = 1
while (j < n - 2) {
push(operand_parts, fmt_val(instr[j]))
j = j + 1
}
operands = text(operand_parts, ", ")
pc_str = pad_right(text(pc), 5)
op_str = pad_right(op, 14)
line = pad_right(` ${pc_str} ${op_str} ${operands}`, 50)
if (length(annotation) > 0) {
print(`${line} ; ${annotation}`)
} else {
print(line)
}
track_types(slot_types, instr)
pc = pc + 1
}
i = i + 1
}
return null
}
// --- Process functions ---
var main_name = optimized.name != null ? optimized.name : "<main>"
@@ -141,6 +314,9 @@ if (optimized.main != null) {
if (show_check) {
check_func(optimized.main, main_name)
}
if (show_types) {
dump_function_typed(optimized.main, main_name)
}
}
// Sub-functions
@@ -160,6 +336,9 @@ if (optimized.functions != null) {
if (show_check) {
check_func(func, fname)
}
if (show_types) {
dump_function_typed(func, fname)
}
fi = fi + 1
}
}

View File

@@ -1,7 +1,5 @@
var fd = use("fd")
var json = use("json")
var tokenize = use("tokenize")
var shop = use("internal/shop")
var filename = args[0]
var src = text(fd.slurp(filename))
var result = tokenize(src, filename)
var result = shop.tokenize_file(filename)
print(json.encode({filename: result.filename, tokens: result.tokens}))