clean up bytecode

2026-02-13 09:03:00 -06:00
parent 0acaabd5fa
commit 3795533554
14 changed files with 23087 additions and 29051 deletions
--- a/docs/spec/streamline.md
+++ b/docs/spec/streamline.md
@@ -37,7 +37,7 @@ Subsumption: `int` and `float` both satisfy a `num` check.

 ### 1. infer_param_types (backward type inference)

-Scans all typed operators to determine what types their operands must be. For example, `add_int dest, a, b` implies both `a` and `b` are integers.
+Scans typed operators and generic arithmetic to determine what types their operands must be. For example, `subtract dest, a, b` implies both `a` and `b` are numbers.

 When a parameter slot (1..nr_args) is consistently inferred as a single type, that type is recorded. Since parameters are immutable (`def`), the inferred type holds for the entire function and persists across label join points (loop headers, branch targets).

@@ -45,8 +45,9 @@ Backward inference rules:

 | Operator class | Operand type inferred |
 |---|---|
-| `add_int`, `sub_int`, `mul_int`, `div_int`, `mod_int`, `eq_int`, comparisons, bitwise | T_INT |
-| `add_float`, `sub_float`, `mul_float`, `div_float`, `mod_float`, float comparisons | T_FLOAT |
+| `subtract`, `multiply`, `divide`, `modulo`, `pow`, `negate` | T_NUM |
+| `eq_int`, `ne_int`, `lt_int`, `gt_int`, `le_int`, `ge_int`, bitwise ops | T_INT |
+| `eq_float`, `ne_float`, `lt_float`, `gt_float`, `le_float`, `ge_float` | T_FLOAT |
 | `concat`, text comparisons | T_TEXT |
 | `eq_bool`, `ne_bool`, `not`, `and`, `or` | T_BOOL |
 | `store_index` (object operand) | T_ARRAY |
@@ -58,7 +59,9 @@ Backward inference rules:
 | `load_field` (object operand) | T_RECORD |
 | `pop` (array operand) | T_ARRAY |

-When a slot appears with conflicting type inferences (e.g., used in both `add_int` and `concat` across different type-dispatch branches), the result is `unknown`. INT + FLOAT conflicts produce `num`.
+Note: `add` is excluded from backward inference because it is polymorphic — it handles both numeric addition and text concatenation. Only operators that are unambiguously numeric can infer T_NUM.
+
+When a slot appears with conflicting type inferences, the result is `unknown`, with one exception: an INT + FLOAT conflict produces `num`.

 **Nop prefix:** none (analysis only, does not modify instructions)

@@ -83,11 +86,10 @@ Write type mapping:
 | `record` | T_RECORD |
 | `function` | T_FUNCTION |
 | `length` | T_INT |
-| int arithmetic, `neg_int`, bitwise ops | T_INT |
-| float arithmetic, `neg_float` | T_FLOAT |
+| bitwise ops | T_INT |
 | `concat` | T_TEXT |
 | bool ops, comparisons, `in` | T_BOOL |
-| generic arithmetic (`add`, `subtract`, etc.) | T_UNKNOWN |
+| generic arithmetic (`add`, `subtract`, `negate`, etc.) | T_UNKNOWN |
 | `move`, `load_field`, `load_index`, `load_dynamic`, `pop`, `get` | T_UNKNOWN |
 | `invoke`, `tail_invoke` | T_UNKNOWN |

@@ -95,11 +97,12 @@ The result is a map of slot→type for slots where all writes agree on a single

 Common patterns this enables:

- **Loop counters** (`var i = 0; ... i = i + 1`): written by `int` (T_INT) and `add_int` (T_INT) → invariant T_INT
 - **Length variables** (`var len = length(arr)`): written by `length` (T_INT) only → invariant T_INT
 - **Boolean flags** (`var found = false; ... found = true`): written by `false` and `true` → invariant T_BOOL
 - **Locally-created containers** (`var arr = []`): written by `array` only → invariant T_ARRAY

+Note: Loop counters (`var i = 0; i = i + 1`) are NOT invariant: the slot is written by `int` (T_INT) and by `add` (T_UNKNOWN), so its writes do not agree on a single type. However, if `i` is a function parameter used in arithmetic, backward inference from `subtract`/`multiply`/etc. will infer T_NUM for it, and that type persists across labels.
+
 **Nop prefix:** none (analysis only, does not modify instructions)

 ### 3. eliminate_type_checks (type-check + jump elimination)
@@ -118,26 +121,9 @@ At label join points, all type information is reset except for parameter types f

 **Nop prefix:** `_nop_tc_`

-### 4. simplify_algebra (algebraic identity + comparison folding)
+### 4. simplify_algebra (same-slot comparison folding)

-Tracks known constant values alongside types. Rewrites identity operations:
-
-| Pattern | Rewrite |
-|---------|---------|
-| `add_int dest, x, 0` | `move dest, x` |
-| `add_int dest, 0, x` | `move dest, x` |
-| `sub_int dest, x, 0` | `move dest, x` |
-| `mul_int dest, x, 1` | `move dest, x` |
-| `mul_int dest, 1, x` | `move dest, x` |
-| `mul_int dest, x, 0` | `int dest, 0` |
-| `div_int dest, x, 1` | `move dest, x` |
-| `add_float dest, x, 0` | `move dest, x` |
-| `mul_float dest, x, 1` | `move dest, x` |
-| `div_float dest, x, 1` | `move dest, x` |
-
-Float multiplication by zero is intentionally not optimized because it is not safe with NaN and Inf values.
-
-Same-slot comparison folding:
+Tracks known constant values. Folds same-slot comparisons:

 | Pattern | Rewrite |
 |---------|---------|
@@ -222,6 +208,16 @@ Before streamlining, `mcode.cm` recognizes calls to built-in intrinsic functions

 These inlined opcodes have corresponding Mach VM implementations in `mach.c`.

+## Unified Arithmetic
+
+Arithmetic operations use generic opcodes: `add`, `subtract`, `multiply`, `divide`, `modulo`, `pow`, `negate`. There are no type-dispatched variants (e.g., no `add_int`/`add_float`).
+
+The Mach VM dispatches at runtime with an int-first fast path via `reg_vm_binop()`: it checks `JS_VALUE_IS_BOTH_INT` first for fast integer arithmetic, then falls back to float conversion, text concatenation (for `add` only), or type error.
+
+Bitwise operations (`shl`, `shr`, `ushr`, `bitand`, `bitor`, `bitxor`, `bitnot`) remain integer-only and disrupt if operands are not integers.
+
+The QBE/native backend maps generic arithmetic to helper calls (`qbe.add`, `qbe.sub`, etc.). The goal for the native path is that, given sufficient type inference, the backend can unbox proven-numeric values to raw registers, operate on them directly, and rebox only at boundaries (returns, calls, stores).
+
 ## Debugging Tools

 Three dump tools inspect the IR at different stages:
@@ -279,7 +275,7 @@ The streamline optimizer uses a numeric type lattice (`T_INT`, `T_FLOAT`, `T_TEX

 - **Backward inference** (pass 1): Scans typed operators to infer parameter types. Since parameters are `def` (immutable), inferred types persist across label boundaries.
 - **Write-type invariance** (pass 2): Scans all instructions to find local slots where every write produces the same type. These invariant types persist across label boundaries alongside parameter types.
- **Forward tracking** (pass 3): `track_types` follows instruction execution order, tracking the type of each slot. Typed arithmetic results set their destination type. Type checks on unknown slots narrow the type on fallthrough.
+- **Forward tracking** (pass 3): `track_types` follows instruction execution order, tracking the type of each slot. Known-type operations set their destination type (e.g., `concat` → T_TEXT, `length` → T_INT). Generic arithmetic produces T_UNKNOWN. Type checks on unknown slots narrow the type on fallthrough.
 - **Type check elimination** (pass 3): When a slot's type is already known, `is_<type>` + conditional jump pairs are eliminated or converted to unconditional jumps.
 - **Dynamic access narrowing** (pass 3): `load_dynamic`/`store_dynamic` are narrowed to `load_field`/`store_field` or `load_index`/`store_index` when the key type is known.

@@ -301,7 +297,7 @@ The current purity set is conservative (only `is_*`). It could be expanded by:

 ### Forward Type Narrowing from Typed Operations

-After a typed operation like `add_int dest, a, b` executes successfully, we know `a` and `b` are integers. This could be used to eliminate subsequent type checks on the same slots within a basic block. An implementation was attempted but caused intermittent GC crashes during self-hosting, suggesting the type narrowing interacted badly with the runtime's garbage collector (possibly through changed instruction timing or register pressure). The approach is sound in principle but needs careful investigation of the GC interaction.
+With unified arithmetic (generic `add`/`subtract`/`multiply`/`divide`/`modulo`/`pow`/`negate` instead of typed variants), forward narrowing from typed arithmetic is no longer applicable. Typed comparisons (`eq_int`, `lt_float`, etc.) still exist and their operands have known types, but these are already handled by backward inference.

 ### Guard Hoisting for Parameters

@@ -330,16 +326,31 @@ Currently all type inference is intraprocedural (within a single function). Cros

 ### Strength Reduction

-Common patterns that could be lowered to cheaper operations:
+Common patterns that could be lowered to cheaper operations when operand types are known:

- `mul_int x, 2` → `add_int x, x` (shift left)
- `div_int x, 2` → arithmetic shift right
- `mod_int x, power_of_2` → bitwise and
+- `multiply x, 2` with a proven-int operand → shift left
+- `divide x, 2` with a proven-int operand → arithmetic shift right
+- `modulo x, power_of_2` with a proven-int operand → bitwise and
+
+### Numeric Unboxing (QBE/native path)
+
+With unified arithmetic and backward type inference, the native backend can identify regions where numeric values remain in registers without boxing/unboxing:
+
+1. **Guard once**: When backward inference proves a parameter is T_NUM, emit a single type guard at function entry.
+2. **Unbox**: Convert the tagged JSValue to a raw double register.
+3. **Operate**: Use native FP/int instructions directly (no function calls, no tag checks).
+4. **Rebox**: Convert back to a tagged JSValue only at boundaries (function returns, calls, stores to arrays/records).
+
+This requires inserting `unbox`/`rebox` IR annotations (no-ops in the Mach VM, meaningful only to QBE).

 ### Loop-Invariant Code Motion

 Type checks that are invariant across loop iterations (checking a variable that doesn't change in the loop body) could be hoisted above the loop. This would require identifying loop boundaries and proving invariance.

+### Algebraic Identity Optimization
+
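+With unified arithmetic, algebraic identities (x+0→x, x*1→x, x*0→0, x/1→x) can no longer be matched on typed opcodes; they require knowing operand values at compile time. The constant-tracking logic in `simplify_algebra` could be extended to handle them for known-constant slots. Two identities would also need proven operand types: x+0→x requires a proven-numeric `x` because `add` is polymorphic, and x*0→0 requires a proven-int `x` because float multiplication by zero is not safe with NaN and Inf values.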
+
 ## Nop Convention

 Eliminated instructions are replaced with strings matching `_nop_<prefix>_<counter>`. The prefix identifies which pass created the nop. Nop strings are: