bootstrap init

This commit is contained in:
2026-02-16 01:33:48 -06:00
parent 46c345d34e
commit 788ea98651
25 changed files with 3943 additions and 410026 deletions

View File

@@ -9,18 +9,36 @@ The shop is the module resolution and loading engine behind `use()`. It handles
## Startup Pipeline
When `pit` runs a program, three layers bootstrap in sequence:
When `pit` runs a program, startup takes one of two paths:
### Fast path (warm cache)
```
bootstrap.cm → engine.cm → shop.cm → user program
C runtime → engine.cm (from cache) → shop.cm → user program
```
**bootstrap.cm** loads the compiler toolchain (tokenize, parse, fold, mcode, streamline) from pre-compiled bytecode. It defines `analyze()` (source to AST) and `compile_to_blob()` (AST to binary blob). It then loads engine.cm.
The C runtime hashes the source of `internal/engine.cm` with BLAKE2 and looks up the hash in the content-addressed cache (`~/.pit/build/<hash>`). On a cache hit, engine.cm loads directly — no bootstrap involved.
**engine.cm** creates the actor runtime (`$_`), defines `use_core()` for loading core modules, and populates the environment that shop receives. It then loads shop.cm via `use_core('internal/shop')`.
### Cold path (first run or cache cleared)
```
C runtime → bootstrap.cm → (seeds cache) → engine.cm (from cache) → shop.cm → user program
```
On a cache miss, the C runtime loads `boot/bootstrap.cm.mcode` (a pre-compiled seed). Bootstrap compiles engine.cm and the pipeline modules (tokenize, parse, fold, mcode, streamline) from source and caches the results. The C runtime then retries the engine cache lookup, which now succeeds.
### Engine
**engine.cm** is self-sufficient. It loads its own compilation pipeline from the content-addressed cache, with fallback to the pre-compiled seeds in `boot/`. It defines `analyze()` (source to AST), `compile_to_blob()` (AST to binary blob), and `use_core()` for loading core modules. It creates the actor runtime and loads shop.cm via `use_core('internal/shop')`.
### Shop
**shop.cm** receives its dependencies through the module environment — `analyze`, `run_ast_fn`, `use_cache`, `shop_path`, `runtime_env`, `content_hash`, `cache_path`, and others. It defines `Shop.use()`, which is the function behind every `use()` call in user code.
### Cache invalidation
All caching is content-addressed by BLAKE2 hash of the source. When any source file changes, its hash changes and the old cache entry is simply never looked up again. No manual invalidation is needed. To force a full rebuild, delete `~/.pit/build/`.
## Module Resolution
When `use('path')` is called from a package context, the shop resolves the module through a multi-layer search. Both the `.cm` script file and C symbol are resolved independently, and the one with the narrowest scope wins.
@@ -90,7 +108,7 @@ This scheme provides automatic cache invalidation: when source changes, its hash
### Core Module Caching
Core modules loaded via `use_core()` in engine.cm follow the same pattern. On first startup after a fresh install, core modules are compiled from `.cm.mcode` JSON IR and cached as `.mach` blobs. Subsequent startups load from cache, skipping the JSON parse and compile steps entirely.
Core modules loaded via `use_core()` in engine.cm follow the same content-addressed pattern. On first use, a module is compiled from source and cached by the BLAKE2 hash of its source content. Subsequent loads with unchanged source hit the cache directly.
User scripts (`.ce` files) are also cached. The first run compiles and caches; subsequent runs with unchanged source load from cache.
@@ -196,9 +214,10 @@ If `shop.toml` is missing or has no `[policy]` section, all methods are enabled
| File | Role |
|------|------|
| `internal/bootstrap.cm` | Loads compiler, defines `analyze()` and `compile_to_blob()` |
| `internal/engine.cm` | Actor runtime, `use_core()`, environment setup |
| `internal/bootstrap.cm` | Minimal cache seeder (cold start only) |
| `internal/engine.cm` | Self-sufficient entry point: compilation pipeline, actor runtime, `use_core()` |
| `internal/shop.cm` | Module resolution, compilation, caching, C extension loading |
| `internal/os.c` | OS intrinsics: dylib ops, internal symbol lookup, embedded modules |
| `package.cm` | Package directory detection, alias resolution, file listing |
| `link.cm` | Development link management (link.toml read/write) |
| `boot/*.cm.mcode` | Pre-compiled pipeline seeds (tokenize, parse, fold, mcode, bootstrap) |

View File

@@ -104,7 +104,8 @@ pit --emit-qbe script.ce > output.ssa
| `streamline.cm` | Mcode IR optimizer |
| `qbe_emit.cm` | Mcode IR → QBE IL emitter |
| `qbe.cm` | QBE IL operation templates |
| `internal/bootstrap.cm` | Pipeline orchestrator |
| `internal/bootstrap.cm` | Cache seeder (cold start only) |
| `internal/engine.cm` | Self-sufficient pipeline loader and orchestrator |
## Debug Tools