John Alanbrook 900db912a5 streamline mcode

2026-02-12 09:43:13 -06:00

| title | description |
| --- | --- |
| Streamline Optimizer | Mcode IR optimization passes |

Overview

The streamline optimizer (`streamline.cm`) runs a series of passes over the Mcode IR to eliminate redundant operations. Each pass is a standalone function that can be enabled, disabled, or reordered. Passes communicate through the instruction array they mutate in place, replacing eliminated instructions with nop strings (e.g., `_nop_tc_1`). The one exception is the `param_types` map that `infer_param_types` returns and `eliminate_type_checks` consumes.

The optimizer runs after mcode.cm generates the IR and before the result is lowered to the Mach VM or emitted as QBE IL.

Fold (AST) → Mcode (JSON IR) → Streamline → Mach VM / QBE

Type Lattice

The optimizer tracks a type for each slot in the register file:

| Type | Meaning |
| --- | --- |
| `unknown` | No type information |
| `int` | Integer |
| `float` | Floating-point |
| `num` | Number (subsumes int and float) |
| `text` | String |
| `bool` | Logical (true/false) |
| `null` | Null value |
| `array` | Array |
| `record` | Record (object) |
| `function` | Function |
| `blob` | Binary blob |

Subsumption: int and float both satisfy a num check.

Passes

1. infer_param_types (backward type inference)

Scans all typed operators to determine what types their operands must be. For example, `add_int dest, a, b` implies that both `a` and `b` are integers.

When a parameter slot (1..nr_args) is consistently inferred as a single type, that type is recorded. Since parameters are immutable (def), the inferred type holds for the entire function and persists across label join points (loop headers, branch targets).

Backward inference rules:

| Operator class | Operand type inferred |
| --- | --- |
| `add_int`, `sub_int`, `mul_int`, `div_int`, `mod_int`, `eq_int` and other integer comparisons, bitwise operators | T_INT |
| `add_float`, `sub_float`, `mul_float`, `div_float`, `mod_float`, float comparisons | T_FLOAT |
| `concat`, text comparisons | T_TEXT |
| `eq_bool`, `ne_bool`, `not`, `and`, `or` | T_BOOL |
| `store_index` (object operand) | T_ARRAY |
| `store_index` (index operand) | T_INT |
| `store_field` (object operand) | T_RECORD |
| `push` (array operand) | T_ARRAY |

When a slot receives conflicting inferences (e.g., it is used in both `add_int` and `concat` across different type-dispatch branches), its inferred type becomes `unknown`. The exception is a conflict between T_INT and T_FLOAT, which resolves to `num`.
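The merge rule above can be sketched in Python as follows. This is an illustrative model only, not the code in `streamline.cm`: the instruction shape (`[op, dest, operand, ...]` lists, with labels and nops as plain strings), the reduced `OPERAND_TYPE` table, and the helper name `merge_inferred` are all assumptions made for the example.

```python
UNKNOWN, INT, FLOAT, NUM, TEXT = "unknown", "int", "float", "num", "text"

# Hypothetical subset of the backward inference table.
OPERAND_TYPE = {
    "add_int": INT, "sub_int": INT, "mul_int": INT,
    "add_float": FLOAT, "mul_float": FLOAT,
    "concat": TEXT,
}

def merge_inferred(current, new):
    """Combine the type already inferred for a slot with a new inference."""
    if current is None:                       # first use of this slot
        return new
    if current == new:
        return current
    if {current, new} <= {INT, FLOAT, NUM}:   # int/float disagreement
        return NUM
    return UNKNOWN                            # any other conflict

def infer_param_types(instructions, nr_args):
    """Return {param_slot: type} for parameter slots 1..nr_args."""
    param_types = {}
    for instr in instructions:
        if isinstance(instr, str):            # label or nop string (assumed shape)
            continue
        op, _dest, *operands = instr
        wanted = OPERAND_TYPE.get(op)
        if wanted is None:
            continue
        for slot in operands:
            if isinstance(slot, int) and 1 <= slot <= nr_args:
                param_types[slot] = merge_inferred(param_types.get(slot), wanted)
    return param_types
```

In this model, a parameter used by both `add_int` and `add_float` ends up as `num`, while one used by both `add_float` and `concat` ends up as `unknown`.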

Nop prefix: none (analysis only, does not modify instructions)

2. eliminate_type_checks (type-check + jump elimination)

Forward pass that tracks the known type of each slot. When a type check (is_int, is_text, is_num, etc.) is followed by a conditional jump, and the slot's type is already known, the check and jump can be eliminated or converted to an unconditional jump.

Three cases:

- Known match (e.g., `is_int` on a slot known to be `int`): both the check and the conditional jump are eliminated (nop'd).
- Known mismatch (e.g., `is_text` on a slot known to be `int`): the check is nop'd and the conditional jump is rewritten to an unconditional jump.
- Unknown: the check remains, but on fallthrough, the slot's type is narrowed to the checked type (enabling downstream eliminations).

This pass also reduces load_dynamic/store_dynamic to load_field/store_field or load_index/store_index when the key slot's type is known.

At label join points, all type information is reset except for parameter types seeded by the backward inference pass.

Nop prefix: _nop_tc_
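A minimal Python sketch of the three cases, under stated assumptions: checks have the shape `[is_x, dest, src]`, the conditional jump is `["jump_false", cond, label]` (so a passing check falls through), the unconditional jump is `["jump", label]`, and any non-nop string is a label. None of these shapes are confirmed by the source, and the `load_dynamic`/`store_dynamic` reduction is omitted.

```python
CHECKS = {"is_int": "int", "is_float": "float", "is_num": "num", "is_text": "text"}

def check_outcome(known, checked):
    """True = always passes, False = always fails, None = cannot tell."""
    if known == "unknown":
        return None
    if known == checked or (checked == "num" and known in ("int", "float")):
        return True
    if known == "num" and checked in ("int", "float"):
        return None                          # a num may or may not be an int
    return False

def eliminate_type_checks(code, param_types):
    types = dict(param_types)                # slot -> known type
    nops = 0
    i = 0
    while i < len(code):
        instr = code[i]
        if isinstance(instr, str):
            if not instr.startswith("_nop_"):
                types = dict(param_types)    # label: keep only parameter types
            i += 1
            continue
        nxt = code[i + 1] if i + 1 < len(code) else None
        checked = CHECKS.get(instr[0])
        paired = (checked is not None and isinstance(nxt, list)
                  and nxt[0] == "jump_false" and nxt[1] == instr[1])
        if not paired:
            dest = instr[1]
            if not instr[0].startswith("jump") and dest not in param_types:
                types.pop(dest, None)        # slot overwritten: forget its type
            i += 1
            continue
        src = instr[2]
        outcome = check_outcome(types.get(src, "unknown"), checked)
        if outcome is None:
            types[src] = checked             # narrowed on the fallthrough path
        else:
            nops += 1
            code[i] = f"_nop_tc_{nops}"
            if outcome:
                nops += 1
                code[i + 1] = f"_nop_tc_{nops}"   # jump_false is never taken
            else:
                code[i + 1] = ["jump", nxt[2]]    # jump_false is always taken
        i += 2
    return code
```

Note that a slot known only as `num` is treated as undetermined for an `is_int` check rather than as a mismatch, which follows from the subsumption rule in the type lattice.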

3. simplify_algebra (algebraic identity + comparison folding)

Tracks known constant values alongside types. Rewrites identity operations:

| Pattern | Rewrite |
| --- | --- |
| `add_int dest, x, 0` | `move dest, x` |
| `add_int dest, 0, x` | `move dest, x` |
| `sub_int dest, x, 0` | `move dest, x` |
| `mul_int dest, x, 1` | `move dest, x` |
| `mul_int dest, 1, x` | `move dest, x` |
| `mul_int dest, x, 0` | `int dest, 0` |
| `div_int dest, x, 1` | `move dest, x` |
| `add_float dest, x, 0` | `move dest, x` |
| `mul_float dest, x, 1` | `move dest, x` |
| `div_float dest, x, 1` | `move dest, x` |

Float multiplication by zero is intentionally not optimized: when `x` is NaN or Inf, `x * 0` evaluates to NaN rather than 0, so rewriting it to a constant zero would be unsafe.

Same-slot comparison folding:

| Pattern | Rewrite |
| --- | --- |
| `eq_* dest, x, x` | `true dest` |
| `le_* dest, x, x` | `true dest` |
| `ge_* dest, x, x` | `true dest` |
| `is_identical dest, x, x` | `true dest` |
| `ne_* dest, x, x` | `false dest` |
| `lt_* dest, x, x` | `false dest` |
| `gt_* dest, x, x` | `false dest` |

Nop prefix: none (rewrites in place, does not create nops)
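The integer identities and same-slot comparison folds can be modeled with a small Python sketch. It assumes that `["int", dest, value]` loads an integer constant (the `int dest, 0` form appears in the rewrite table), that binary operators are `[op, dest, a, b]`, and that constants are forgotten at labels. Float identities are left out because the source does not show how a float constant is loaded.

```python
def simplify_algebra(code):
    consts = {}                                   # slot -> known integer constant
    for i, instr in enumerate(code):
        if isinstance(instr, str):                # label or nop string
            if not instr.startswith("_nop_"):
                consts.clear()                    # forget constants at join points
            continue
        op, dest, *args = instr
        if len(args) == 2:
            a, b = args
            ca, cb = consts.get(a), consts.get(b)
            family = op.split("_")[0]             # "eq_int" -> "eq"
            if op in ("add_int", "sub_int") and cb == 0:
                instr = ["move", dest, a]
            elif op == "add_int" and ca == 0:
                instr = ["move", dest, b]
            elif op == "mul_int" and 0 in (ca, cb):
                instr = ["int", dest, 0]
            elif op in ("mul_int", "div_int") and cb == 1:
                instr = ["move", dest, a]
            elif op == "mul_int" and ca == 1:
                instr = ["move", dest, b]
            elif a == b and (family in ("eq", "le", "ge") or op == "is_identical"):
                instr = ["true", dest]
            elif a == b and family in ("ne", "lt", "gt"):
                instr = ["false", dest]
        code[i] = instr                           # rewrite in place, no nops
        consts.pop(dest, None)                    # dest now holds a new value
        if instr[0] == "int":
            consts[dest] = instr[2]
    return code
```

A rewrite such as `add_int 3, 3, z` with `z` known to be zero yields `move 3, 3`, which is the kind of self-move a later pass can remove.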

4. simplify_booleans (not + jump fusion)

Peephole pass that eliminates unnecessary not instructions:

| Pattern | Rewrite |
| --- | --- |
| `not d, x; jump_false d, L` | nop; `jump_true x, L` |
| `not d, x; jump_true d, L` | nop; `jump_false x, L` |
| `not d1, x; not d2, d1` | nop; `move d2, x` |

This is particularly effective on `if (!cond)` patterns, which the compiler generates as `not; jump_false`. After this pass, they become a single `jump_true`.

Nop prefix: _nop_bl_
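The three peephole patterns can be sketched in Python as below. The instruction shapes (`["not", dest, src]`, `["jump_false", cond, label]`, `["move", dest, src]`) are assumptions for illustration, and the sketch follows the table literally: it does not check whether the intermediate slot `d` is read again later.

```python
def simplify_booleans(code):
    nops = 0
    for i in range(len(code) - 1):
        first, second = code[i], code[i + 1]
        if not (isinstance(first, list) and isinstance(second, list)):
            continue
        if first[0] != "not":
            continue
        _, d, x = first
        if second[0] in ("jump_false", "jump_true") and second[1] == d:
            # not + conditional jump: flip the jump and test x directly
            flipped = "jump_true" if second[0] == "jump_false" else "jump_false"
            code[i + 1] = [flipped, x, second[2]]
        elif second[0] == "not" and second[2] == d:
            # double negation collapses to a move
            code[i + 1] = ["move", second[1], x]
        else:
            continue
        nops += 1
        code[i] = f"_nop_bl_{nops}"
    return code
```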

5. eliminate_moves (self-move elimination)

Removes `move a, a` instructions, where the source and destination are the same slot. These can arise from earlier passes rewriting binary operations into moves.

Nop prefix: _nop_mv_

6. eliminate_unreachable (dead code after return/disrupt)

Currently disabled. When enabled, this pass nops every instruction after a `return` or `disrupt`, up to the next real label.

Disabled because disruption handler code is placed after the return/disrupt instruction without a label boundary. The VM dispatches to handlers via the disruption_pc offset, not through normal control flow. Re-enabling this pass requires the mcode compiler to emit labels at disruption handler entry points.

Nop prefix: _nop_ur_

7. eliminate_dead_jumps (jump-to-next-label elimination)

Removes `jump L` instructions where `L` is the immediately following label (skipping over any intervening nop strings). These are common after other passes eliminate conditional branches, leaving behind jumps that fall through naturally.

Nop prefix: _nop_dj_
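A short Python sketch of the jump-to-next-label check, assuming an unconditional jump is `["jump", label]` and that labels are plain strings in the instruction array that do not start with `_nop_` (both are assumptions about the IR encoding):

```python
def is_nop(instr):
    return isinstance(instr, str) and instr.startswith("_nop_")

def eliminate_dead_jumps(code):
    nops = 0
    for i, instr in enumerate(code):
        if not (isinstance(instr, list) and instr[0] == "jump"):
            continue
        j = i + 1
        while j < len(code) and is_nop(code[j]):   # skip intervening nops
            j += 1
        if j < len(code) and code[j] == instr[1]:  # next real entry is the target
            nops += 1
            code[i] = f"_nop_dj_{nops}"
    return code
```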

Pass Composition

All passes run in sequence in optimize_function:

infer_param_types      → returns param_types map
eliminate_type_checks   → uses param_types
simplify_algebra
simplify_booleans
eliminate_moves
(eliminate_unreachable) → disabled
eliminate_dead_jumps

Apart from the `param_types` hand-off, the passes share no state, so individual passes can be commented out for testing or benchmarking.
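One way to picture this composition is a driver that walks a list of passes with an enable flag each. The Python sketch below is hypothetical: the `(function, enabled)` pass list is an illustration device rather than how `optimize_function` is written, only two simplified passes are modeled, and the `["return", slot]` instruction shape is an assumption.

```python
def eliminate_moves(code):
    nops = 0
    for i, instr in enumerate(code):
        if isinstance(instr, list) and instr[0] == "move" and instr[1] == instr[2]:
            nops += 1
            code[i] = f"_nop_mv_{nops}"
    return code

def eliminate_unreachable(code):
    dead, nops = False, 0
    for i, instr in enumerate(code):
        if isinstance(instr, str):
            if not instr.startswith("_nop_"):
                dead = False                  # a real label ends the dead region
        elif dead:
            nops += 1
            code[i] = f"_nop_ur_{nops}"
        elif instr[0] in ("return", "disrupt"):
            dead = True
    return code

PASSES = [
    (eliminate_moves, True),
    (eliminate_unreachable, False),           # disabled, as described above
]

def optimize_function(code, passes=PASSES):
    for run, enabled in passes:
        if enabled:
            code = run(code)                  # each pass mutates the same array
    return code
```

Flipping a flag in the list is the equivalent of commenting a pass in or out when measuring its effect.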

Intrinsic Inlining

Before streamlining, mcode.cm recognizes calls to built-in intrinsic functions and emits direct opcodes instead of the generic frame/setarg/invoke call sequence. This reduces a 6-instruction call pattern to a single instruction:

| Call | Emitted opcode |
| --- | --- |
| `is_array(x)` | `is_array dest, src` |
| `is_function(x)` | `is_func dest, src` |
| `is_object(x)` | `is_record dest, src` |
| `is_stone(x)` | `is_stone dest, src` |
| `is_integer(x)` | `is_int dest, src` |
| `is_text(x)` | `is_text dest, src` |
| `is_number(x)` | `is_num dest, src` |
| `is_logical(x)` | `is_bool dest, src` |
| `is_null(x)` | `is_null dest, src` |
| `length(x)` | `length dest, src` |
| `push(arr, val)` | `push arr, val` |

These inlined opcodes have corresponding Mach VM implementations in mach.c.
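The table amounts to a lookup from intrinsic name to opcode. A minimal Python sketch follows; the function name `lower_call` and the list-based instruction shape are hypothetical, and returning `None` stands in for falling back to the generic call sequence.

```python
# Intrinsic name -> opcode, copied from the table above.
INTRINSICS = {
    "is_array": "is_array", "is_function": "is_func", "is_object": "is_record",
    "is_stone": "is_stone", "is_integer": "is_int", "is_text": "is_text",
    "is_number": "is_num", "is_logical": "is_bool", "is_null": "is_null",
    "length": "length",
}

def lower_call(name, dest, args):
    """Return a single direct instruction, or None to use the generic call path."""
    if name == "push" and len(args) == 2:
        return ["push", args[0], args[1]]     # push has no dest operand
    op = INTRINSICS.get(name)
    if op is not None and len(args) == 1:
        return [op, dest, args[0]]
    return None
```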

Debugging Tools

Three dump tools inspect the IR at different stages:

- `dump_mcode.cm`: prints the raw Mcode IR after `mcode.cm`, before streamlining
- `dump_stream.cm`: prints the IR after streamlining, with before/after instruction counts
- `dump_types.cm`: prints the streamlined IR with type annotations on each instruction

Usage:

./cell --core . dump_mcode.cm <file.ce|file.cm>
./cell --core . dump_stream.cm <file.ce|file.cm>
./cell --core . dump_types.cm <file.ce|file.cm>

Nop Convention

Eliminated instructions are replaced with strings matching _nop_<prefix>_<counter>. The prefix identifies which pass created the nop. Nop strings are:

- Skipped during interpretation (the VM ignores them)
- Skipped during QBE emission
- Not counted in instruction statistics
- Preserved in the instruction array to maintain positional stability for jump targets
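As an illustration of the naming scheme, the Python sketch below recognizes nop strings and tallies them by pass. The regular expression assumes prefixes are lowercase letters and counters are decimal digits, which matches the prefixes listed in this document but is not otherwise confirmed.

```python
import re
from collections import Counter

NOP_RE = re.compile(r"^_nop_([a-z]+)_(\d+)$")

def nop_prefix(instr):
    """Return the pass prefix ('tc', 'bl', ...) for a nop string, else None."""
    if not isinstance(instr, str):
        return None
    match = NOP_RE.match(instr)
    return match.group(1) if match else None

def nops_by_pass(code):
    """Count how many instructions each pass has eliminated."""
    return Counter(p for p in map(nop_prefix, code) if p is not None)
```

Because nops stay in the array, the index of every surviving instruction is the same before and after a pass runs.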
