failsafe boot mode

2026-02-15 11:44:33 -06:00
parent ff80e0d30d
commit ee646db394
22 changed files with 290669 additions and 594234 deletions

File diff suppressed because it is too large (19 files in total), including:

2244 boot/fd.cm.mcode Normal file
59300 boot/internal_shop.cm.mcode Normal file
14083 boot/link.cm.mcode Normal file
14693 boot/package.cm.mcode Normal file
8189 boot/pronto.cm.mcode Normal file
14564 boot/time.cm.mcode Normal file
12640 boot/toml.cm.mcode Normal file
4422 boot/toolchains.cm.mcode Normal file

95
fix_pipeline.md Normal file

@@ -0,0 +1,95 @@
# Fix Compilation Pipeline Bootstrap
## Problem
After merging `fix_gc` into `pitweb`, the compilation pipeline `.cm` source files
(tokenize.cm, parse.cm, fold.cm, mcode.cm, streamline.cm) cannot bootstrap themselves.
The old pitweb pipeline mcode compiles the merged `.cm` source without errors, but the
resulting new pipeline mcode is **semantically broken** — it can't even compile
`var x = 42; print(x)`.
Both branches worked independently. The merge introduced no syntax errors, but the old
pitweb compiler produces incorrect bytecode from the merged pipeline source. This is a
classic bootstrapping problem: the new pipeline needs a compatible compiler to build
itself, but the only available compiler (old pitweb) miscompiles it.
## Current State
- `boot/tokenize.cm.mcode` through `boot/streamline.cm.mcode` contain the **old pitweb**
pipeline mcode (pre-merge). These work correctly — 641/641 vm_suite tests pass.
- All other boot mcode files (engine, bootstrap, seed_bootstrap, plus core modules such as
fd, time, and toml) are compiled from the merged source and work correctly.
- The merged pipeline `.cm` source has changes from fix_gc that are **not active** — the
runtime uses the old pitweb pipeline mcode.
## What Changed in the Pipeline
The fix_gc merge brought these changes to the pipeline `.cm` files:
- **mcode.cm**: Type-guarded arithmetic (`emit_add_decomposed` now generates `is_text`/`is_num`
checks instead of letting the VM dispatch), `emit_numeric_binop` for subtract/multiply/etc.,
`sensory_ops` lookup table, array/record literal count args (`["array", dest, count]`
instead of `["array", dest, 0]`)
- **fold.cm**: Lookup tables (`binary_ops`, `unary_ops`, `assign_ops`, etc.) replacing
if-chains, combined `"array"` and `"text literal"` handling
- **tokenize.cm**: ~500 lines of changes
- **streamline.cm**: ~700 lines of changes
- **parse.cm**: ~40 lines of changes (minor)
## Regen Flags
`regen.ce` now has two modes:
```
./cell --dev --seed regen # default: skip pipeline files
./cell --dev --seed regen --all # include pipeline files (tokenize/parse/fold/mcode/streamline)
```
The default mode is safe — it regenerates everything except the 5 pipeline files,
preserving the working old pitweb pipeline mcode.
## How to Fix
The goal is to get the merged pipeline `.cm` source to produce working mcode when
compiled by the current (old pitweb) pipeline. The process:
1. Start from the current repo state (old pitweb pipeline mcode in boot/)
2. Edit one or more pipeline `.cm` files to fix the issue
3. Regen with `--all` to recompile everything including pipeline:
```
./cell --dev --seed regen --all
```
4. Test the new pipeline with a simple sanity check:
```
rm -rf .cell/build/*
echo 'var x = 42; print(x)' > /tmp/test.ce
./cell --dev --seed /tmp/test
```
5. If that works, run the full test suite:
```
rm -rf .cell/build/*
./cell --dev vm_suite
```
6. If tests pass, regen again (the new pipeline compiles itself):
```
./cell --dev --seed regen --all
```
7. Repeat steps 4-6 until the result is **idempotent**: two consecutive `regen --all` runs
produce identical boot mcode and all tests pass.
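
The fixed-point check in step 7 could be scripted along these lines. This is a minimal sketch, not part of the repository: `boot_fingerprint` and `same_fingerprint` are hypothetical helper names, and it assumes all boot mcode lives in files ending in `.mcode` under a single directory.

```shell
#!/bin/sh
# Hypothetical helper (not in the repo): print a single fingerprint covering
# every *.mcode file under the given directory.
boot_fingerprint() {
    # Sort the file list so the fingerprint does not depend on directory order.
    find "$1" -type f -name '*.mcode' | LC_ALL=C sort | while IFS= read -r f; do
        cksum "$f"      # per-file checksum, size, and path
    done | cksum        # checksum of the combined listing
}

# Succeeds only when two fingerprints are identical.
same_fingerprint() {
    [ "$1" = "$2" ]
}

# Intended use (assumes the regen command from the steps above):
#   ./cell --dev --seed regen --all; first=$(boot_fingerprint boot)
#   ./cell --dev --seed regen --all; second=$(boot_fingerprint boot)
#   same_fingerprint "$first" "$second" && echo "fixed point reached"
```

Matching fingerprints only show that the pipeline reproduces itself; the test suite still has to pass for the result to count as fixed.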
## Debugging Tips
- The old pitweb pipeline mcode is always available via:
```
git checkout HEAD^1 -- boot/tokenize.cm.mcode boot/parse.cm.mcode \
boot/fold.cm.mcode boot/mcode.cm.mcode boot/streamline.cm.mcode
```
- Use `--seed` mode for testing compilation — it bypasses the engine entirely and
loads the pipeline directly from boot mcode.
- The failure mode is silent: the old compiler compiles the new source without errors
but produces wrong bytecode. Start debugging with the simplest failing case
(`var x = 42; print(x)`) and work up.
- The most likely culprits are the mcode.cm changes (type-guarded arithmetic, array/record
count args) since these change the bytecode format. The fold.cm changes (lookup tables)
are more likely safe refactors.
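
When experimenting with uncommitted changes, a local snapshot of the five pipeline mcode files may be handier than going through git. The sketch below is an assumption-laden illustration: `save_pipeline` and `restore_pipeline` are hypothetical names, and the file names follow the `boot/<stage>.cm.mcode` pattern used in this document.

```shell
#!/bin/sh
# Hypothetical helper: copy the five pipeline mcode files from boot_dir
# into backup_dir before running `regen --all`.
save_pipeline() {
    boot_dir=$1; backup_dir=$2
    mkdir -p "$backup_dir" || return 1
    for stage in tokenize parse fold mcode streamline; do
        cp "$boot_dir/$stage.cm.mcode" "$backup_dir/" || return 1
    done
}

# Hypothetical helper: copy the saved pipeline mcode back over boot_dir,
# for example after a regen produced a broken pipeline.
restore_pipeline() {
    boot_dir=$1; backup_dir=$2
    for stage in tokenize parse fold mcode streamline; do
        cp "$backup_dir/$stage.cm.mcode" "$boot_dir/" || return 1
    done
}
```

A typical session would call `save_pipeline boot /tmp/pipeline_backup` once, then `restore_pipeline boot /tmp/pipeline_backup` whenever the sanity check fails; the backup location is arbitrary.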


@@ -68,6 +68,8 @@ function use_core(path) {
   var result = null
   var script = null
   var ast = null
+  var mcode_path = null
+  var mcode_blob = null
   // Build env: merge core_extras
   env = {use: use_core}
@@ -78,6 +80,26 @@ function use_core(path) {
   var mach_blob = null
   var source_blob = null
+  // Check for pre-compiled .cm.mcode JSON IR (generated by regen)
+  mcode_path = core_path + '/boot/' + replace(path, '/', '_') + '.cm.mcode'
+  if (fd.is_file(mcode_path)) {
+    mcode_blob = fd.slurp(mcode_path)
+    hash = content_hash(mcode_blob)
+    cached_path = cache_path(hash)
+    if (cached_path && fd.is_file(cached_path)) {
+      result = mach_load(fd.slurp(cached_path), env)
+    } else {
+      mach_blob = mach_compile_mcode_bin('core:' + path, text(mcode_blob))
+      if (cached_path) {
+        ensure_build_dir()
+        fd.slurpwrite(cached_path, mach_blob)
+      }
+      result = mach_load(mach_blob, env)
+    }
+    use_cache[cache_key] = result
+    return result
+  }
   // Compile from source .cm file
   var file_path = core_path + '/' + path + MOD_EXT
   if (fd.is_file(file_path)) {
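
The lookup added in this hunk maps a module path to a boot mcode file and then keys the compiled blob by the content of that file. The shell sketch below illustrates both ideas under stated assumptions: `mcode_path_for` and `cache_path_for` are hypothetical names, every `/` in the module path is assumed to become `_`, and `cksum` merely stands in for `content_hash`, whose actual algorithm is not shown in the diff.

```shell
#!/bin/sh
# Sketch of the path mapping: "internal/shop" -> "<core>/boot/internal_shop.cm.mcode".
# Assumption: every '/' in the module path is replaced by '_'.
mcode_path_for() {
    core_path=$1; module=$2
    printf '%s/boot/%s.cm.mcode\n' "$core_path" \
        "$(printf '%s' "$module" | tr '/' '_')"
}

# Sketch of a content-addressed cache path. The name is derived from the
# mcode content, so regenerating the mcode yields a different cache entry
# and the stale compiled blob is simply never looked up again.
cache_path_for() {
    cache_dir=$1; mcode_file=$2
    # cksum prints "<crc> <size>"; join both fields into one token.
    hash=$(cksum < "$mcode_file" | awk '{print $1 "-" $2}')
    printf '%s/%s\n' "$cache_dir" "$hash"
}
```

Keying the cache on content rather than on modification time is what lets `regen` overwrite boot mcode without an explicit cache invalidation step, although the sanity-check steps in `fix_pipeline.md` still clear `.cell/build` by hand.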


@@ -9,22 +9,41 @@ var fold = use("fold")
 var mcode = use("mcode")
 var streamline = use("streamline")
-var files = [
+// Pipeline files (tokenize/parse/fold/mcode/streamline) are only regenerated
+// with --all flag since they require a self-consistent compiler to bootstrap.
+var pipeline_files = [
   {src: "tokenize.cm", name: "tokenize", out: "boot/tokenize.cm.mcode"},
   {src: "parse.cm", name: "parse", out: "boot/parse.cm.mcode"},
   {src: "fold.cm", name: "fold", out: "boot/fold.cm.mcode"},
   {src: "mcode.cm", name: "mcode", out: "boot/mcode.cm.mcode"},
-  {src: "streamline.cm", name: "streamline", out: "boot/streamline.cm.mcode"},
+  {src: "streamline.cm", name: "streamline", out: "boot/streamline.cm.mcode"}
+]
+var files = [
   {src: "qbe.cm", name: "qbe", out: "boot/qbe.cm.mcode"},
   {src: "qbe_emit.cm", name: "qbe_emit", out: "boot/qbe_emit.cm.mcode"},
   {src: "verify_ir.cm", name: "verify_ir", out: "boot/verify_ir.cm.mcode"},
   {src: "internal/bootstrap.cm", name: "bootstrap", out: "boot/bootstrap.cm.mcode"},
   {src: "internal/engine.cm", name: "engine", out: "boot/engine.cm.mcode"},
-  {src: "boot/seed_bootstrap.cm", name: "seed_bootstrap", out: "boot/seed_bootstrap.cm.mcode"}
+  {src: "boot/seed_bootstrap.cm", name: "seed_bootstrap", out: "boot/seed_bootstrap.cm.mcode"},
+  {src: "fd.cm", name: "fd", out: "boot/fd.cm.mcode"},
+  {src: "time.cm", name: "time", out: "boot/time.cm.mcode"},
+  {src: "pronto.cm", name: "pronto", out: "boot/pronto.cm.mcode"},
+  {src: "toml.cm", name: "toml", out: "boot/toml.cm.mcode"},
+  {src: "link.cm", name: "link", out: "boot/link.cm.mcode"},
+  {src: "toolchains.cm", name: "toolchains", out: "boot/toolchains.cm.mcode"},
+  {src: "package.cm", name: "package", out: "boot/package.cm.mcode"},
+  {src: "internal/shop.cm", name: "internal_shop", out: "boot/internal_shop.cm.mcode"}
 ]
-// Resolve shop_path for cache writes
+// Include pipeline files with --all flag
 var os = use('os')
+var regen_all = args != null && length(args) > 0 && args[0] == "--all"
+if (regen_all) {
+  files = array(pipeline_files, files)
+}
+// Resolve shop_path for cache writes
 var shop = os.getenv('CELL_SHOP')
 var home = null
 var cache_dir = null
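
The effect of the `--all` flag on the work list can be mimicked in a few lines of shell. This is only an illustration: `regen_targets` is a hypothetical name, and it assumes `array(pipeline_files, files)` concatenates the two lists with the pipeline entries first.

```shell
#!/bin/sh
# Print the module names regen would process, one per line.
# Only the first argument is inspected, mirroring the args[0] check in the hunk.
regen_targets() {
    if [ "${1:-}" = "--all" ]; then
        # Assumption: pipeline entries are placed ahead of the regular list.
        printf '%s\n' tokenize parse fold mcode streamline
    fi
    printf '%s\n' qbe qbe_emit verify_ir bootstrap engine seed_bootstrap \
        fd time pronto toml link toolchains package internal_shop
}
```

Because only the first argument is checked, a call such as `regen_targets --verbose --all` would behave like the default mode in this sketch, and the same appears to hold for the `args[0]` comparison in the hunk.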