more detail on broken pipeline and vm suite tests

2026-02-15 11:51:23 -06:00
parent ee646db394
commit 7de20b39da
16 changed files with 16527 additions and 15432 deletions


@@ -17,12 +17,25 @@ itself, but the only available compiler (old pitweb) miscompiles it.
## Current State
- `boot/tokenize.cm.mcode` through `boot/streamline.cm.mcode` contain the **old pitweb**
pipeline mcode (pre-merge). These work correctly — 641/641 vm_suite tests pass.
pipeline mcode (pre-merge). These pass 641/641 vm_suite tests.
- All other boot mcode files (engine, bootstrap, seed_bootstrap, plus core modules like
fd, time, toml, etc.) are compiled from the merged source and work correctly.
- The merged pipeline `.cm` source has changes from fix_gc that are **not active** — the
runtime uses the old pitweb pipeline mcode.
**The old pitweb pipeline is NOT fully working.** While it passes the test suite, it
miscompiles nested function declarations. This breaks:
- `toml.encode()` — the encoder uses nested `function` declarations inside `encode_toml`
- `Shop.save_lock()` — calls `toml.encode()`, so any lock.toml mutation fails
- Any other `.cm` module that uses nested named function declarations
This means the **ID-based package symbol naming** (Phase 2 in the plan) is blocked: it
needs `save_lock()` to persist package IDs to lock.toml.
The shop.cm changes for ID-based naming are already written and correct — they just need
a working pipeline underneath. Once the pipeline is fixed, the ID system will work.
## What Changed in the Pipeline
The fix_gc merge brought these changes to the pipeline `.cm` files:
@@ -88,8 +101,12 @@ compiled by the current (old pitweb) pipeline. The process:
- Use `--seed` mode for testing compilation — it bypasses the engine entirely and
loads the pipeline directly from boot mcode.
- The failure mode is silent: the old compiler compiles the new source without errors
but produces wrong bytecode. Start debugging with the simplest failing case
(`var x = 42; print(x)`) and work up.
but produces wrong bytecode.
- Known broken patterns with the old pitweb pipeline:
  - `var x = 42; print(x)` fails when compiled by the regenerated pipeline mcode
- Nested named function declarations (`function foo() {}` inside another function)
produce "not a function" errors — this breaks `toml.encode()`
- Test with: `echo 'var toml = use("toml"); print(toml.encode({a: 1}))' > /tmp/t.ce && ./cell --dev /tmp/t.ce`
- The most likely culprits are the mcode.cm changes (type-guarded arithmetic, array/record
count args) since these change the bytecode format. The fold.cm changes (lookup tables)
are more likely safe refactors.
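
The `toml.encode()` check listed above can be wrapped in a small script so it is easy to re-run after each pipeline change. This is only a sketch: the function names `write_repro` and `run_repro` and the `CELL` override are hypothetical, and it assumes `./cell` exits non-zero when the script hits the "not a function" error, which these notes do not confirm. If that assumption does not hold, inspect the printed output instead of the exit status.

```shell
#!/bin/sh
# Hypothetical helper around the repro command from the notes.
# CELL defaults to ./cell; override it to point at another build.
CELL="${CELL:-./cell}"
repro_file="${TMPDIR:-/tmp}/t.ce"

# Write the one-line repro that exercises toml.encode().
write_repro() {
  printf '%s\n' 'var toml = use("toml"); print(toml.encode({a: 1}))' > "$repro_file"
}

# Run the repro in --dev mode and report the result.
# Assumption: a non-zero exit status signals the failure.
run_repro() {
  write_repro
  if "$CELL" --dev "$repro_file"; then
    echo "repro: ok"
  else
    status=$?
    echo "repro: failed (exit $status)"
    return "$status"
  fi
}
```

Source the file and call `run_repro`, or set `CELL=/path/to/other/cell` first to compare two builds against the same input.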