Files
cell/CLAUDE.md
2026-02-21 03:01:26 -06:00

238 lines
11 KiB
Markdown

# ƿit (pit) Language Project
## Building
Build (or rebuild after changes): `make`
Install to system: `make install`
Run `cell --help` to see all CLI flags.
## Code Style
All code uses 2 spaces for indentation. K&R style for C and Javascript.
## ƿit Script Quick Reference
ƿit script files: `.ce` (actors) and `.cm` (modules). The syntax is similar to JavaScript with important differences listed below.
### Key Differences from JavaScript
- `var` (mutable) and `def` (constant) — no `let` or `const`
- `==` and `!=` are strict (no `===` or `!==`)
- No `undefined` — only `null`
- No classes — only objects and prototypes (`meme()`, `proto()`, `isa()`)
- No `switch`/`case` — use record dispatch (a record keyed by case, values are functions or results) instead of if/else chains
- No `for...in`, `for...of`, spread (`...`), rest params, or default params
- Functions have a maximum of 4 parameters — use a record for more
- Variables must be declared at function body level only (not in if/while/for/blocks)
- All variables must be initialized at declaration (`var x` alone is an error; use `var x = null`)
- No `try`/`catch`/`throw` — use `disrupt`/`disruption`
- No arraybuffers — only `blob` (works with bits; must `stone(blob)` before reading)
- Identifiers can contain `?` and `!` (e.g., `nil?`, `set!`, `is?valid`)
- Prefer backticks for string interpolation; otherwise use `text()` to convert non-strings
- Everything should be lowercase
### Intrinsic Functions (always available, no `use()` needed)
The creator functions are **polymorphic** — behavior depends on argument types:
- `array(number)` — create array of size N filled with null
- `array(number, value_or_fn)` — create array with initial values
- `array(array)` — copy array
- `array(array, fn)` — map
- `array(array, array)` — concatenate
- `array(array, from, to)` — slice
- `array(record)` — get keys as array of text
- **`array(text)` — split text into individual characters** (e.g., `array("hello")``["h","e","l","l","o"]`)
- `array(text, separator)` — split by separator
- `array(text, length)` — split into chunks of length
- `text(array, separator)` — join array into text
- `text(number)` or `text(number, radix)` — number to text
- `text(text, from, to)` — substring
- `number(text)` or `number(text, radix)` — parse text to number
- `number(logical)` — boolean to number
- `record(record)` — copy
- `record(record, another)` — merge
- `record(array_of_keys)` — create record from keys
Other key intrinsics: `length()`, `stone()`, `is_stone()`, `print()`, `filter()`, `find()`, `reduce()`, `sort()`, `reverse()`, `some()`, `every()`, `starts_with()`, `ends_with()`, `meme()`, `proto()`, `isa()`, `splat()`, `apply()`, `extract()`, `replace()`, `search()`, `format()`, `lower()`, `upper()`, `trim()`
Sensory functions: `is_array()`, `is_text()`, `is_number()`, `is_object()`, `is_function()`, `is_null()`, `is_logical()`, `is_integer()`, `is_stone()`, etc.
### Standard Library (loaded with `use()`)
- `blob` — binary data (bits, not bytes)
- `time` — time constants and conversions
- `math` — trig, logarithms, roots (`math/radians`, `math/turns`)
- `json` — JSON encoding/decoding
- `random` — random number generation
### Actor Model
- `.ce` files are actors (independent execution units, don't return values)
- `.cm` files are modules (return a value, cached and frozen)
- Actors never share memory; communicate via `$send()` message passing
- Actor intrinsics start with `$`: `$me`, `$stop()`, `$send()`, `$start()`, `$delay()`, `$receiver()`, `$clock()`, `$portal()`, `$contact()`, `$couple()`, `$unneeded()`, `$connection()`, `$time_limit()`
### Requestors (async composition)
`sequence()`, `parallel()`, `race()`, `fallback()` — compose asynchronous operations. See docs/requestors.md.
### Error Handling
```javascript
var fn = function() {
disrupt // bare keyword, no value
} disruption {
// handle error; can re-raise with disrupt
}
```
### Push/Pop Syntax
```javascript
var a = [1, 2]
a[] = 3 // push: [1, 2, 3]
var v = a[] // pop: v is 3, a is [1, 2]
```
## C Integration
- Declare everything `static` that can be
- Most files don't have headers; files in a package are not shared between packages
- No undefined in C API: use `JS_IsNull` and `JS_NULL` only
- A C file with correct macros (`CELL_USE_FUNCS` etc) is loaded as a module by its name (e.g., `png.c` in a package → `use('<package>/png')`)
- C symbol naming: `js_<pkg>_<file>_use` (e.g., `js_core_math_radians_use` for `core/math/radians`)
- Core is the `core` package — its symbols follow the same `js_core_<name>_use` pattern as all other packages
- Package directories should contain only source files (no `.mach`/`.mcode` alongside source)
- Build cache files in `build/` are bare hashes (no extensions)
### MANDATORY: GC Rooting for C Functions
This project uses a **copying garbage collector**. ANY JS allocation (`JS_NewObject`, `JS_NewString`, `JS_NewArray`, `JS_NewInt32`, `JS_SetPropertyStr`, `js_new_blob_stoned_copy`, etc.) can trigger GC, which **invalidates all unrooted JSValue locals**. This is not theoretical — it causes real crashes.
**Before writing or modifying ANY C function**, apply this checklist:
1. Count the number of `JS_New*`, `JS_SetProperty*`, and `js_new_blob*` calls in the function
2. If there are 2 or more, the function MUST use `JS_FRAME`/`JS_ROOT`/`JS_RETURN`
3. Every JSValue that is held across an allocating call must be rooted
**Pattern — object with properties:**
```c
JS_FRAME(js);
JS_ROOT(obj, JS_NewObject(js));
JS_SetPropertyStr(js, obj.val, "x", JS_NewInt32(js, 42));
JSValue name = JS_NewString(js, "hello");
JS_SetPropertyStr(js, obj.val, "name", name);
JS_RETURN(obj.val);
```
**Pattern — array with loop (declare root BEFORE the loop):**
```c
JS_FRAME(js);
JS_ROOT(arr, JS_NewArray(js));
JSGCRef item = { .val = JS_NULL, .prev = NULL };
JS_PushGCRef(js, &item);
for (int i = 0; i < count; i++) {
item.val = JS_NewObject(js);
JS_SetPropertyStr(js, item.val, "v", JS_NewInt32(js, i));
JS_SetPropertyNumber(js, arr.val, i, item.val);
}
JS_RETURN(arr.val);
```
**Rules:**
- Access rooted values via `.val` (e.g., `obj.val`, not `obj`)
- NEVER put `JS_ROOT` inside a loop — it pushes the same stack address twice, corrupting the GC chain
- Error returns before `JS_FRAME` use plain `return`
- Error returns after `JS_FRAME` must use `JS_RETURN_EX()` or `JS_RETURN_NULL()`
**CRITICAL — C argument evaluation order bug:**
Allocating functions (`JS_NewString`, `JS_NewFloat64`, `js_new_blob_stoned_copy`, etc.) used as arguments to `JS_SetPropertyStr` can crash because C evaluates arguments in unspecified order. The compiler may read `obj.val` BEFORE the allocating call, then GC moves the object, leaving a stale pointer.
```c
// UNSAFE — intermittent crash:
JS_SetPropertyStr(js, obj.val, "format", JS_NewString(js, "rgba32"));
JS_SetPropertyStr(js, obj.val, "pixels", js_new_blob_stoned_copy(js, data, len));
// SAFE — separate the allocation:
JSValue fmt = JS_NewString(js, "rgba32");
JS_SetPropertyStr(js, obj.val, "format", fmt);
JSValue pixels = js_new_blob_stoned_copy(js, data, len);
JS_SetPropertyStr(js, obj.val, "pixels", pixels);
```
`JS_NewInt32`, `JS_NewUint32`, and `JS_NewBool` do NOT allocate and are safe inline.
See `docs/c-modules.md` for the full GC safety reference.
## Project Layout
- `source/` — C source for the cell runtime and CLI
- `docs/` — master documentation (Markdown), reflected on the website
- `website/` — Hugo site; theme at `website/themes/knr/`
- `internal/` — internal ƿit scripts (engine.cm etc.)
- `packages/` — core packages
- `Makefile` — build system (`make` to rebuild, `make bootstrap` for first build)
## Package Management (Shop CLI)
**Two shops:** `cell <cmd>` uses the global shop at `~/.cell/packages/`. `cell --dev <cmd>` uses the local shop at `.cell/packages/`. Linked packages (via `cell link`) are symlinked into the shop — edit the source directory directly.
```
cell add <path> # add a package (local path or remote)
cell remove <path> # remove a package (cleans lock, symlink, dylibs)
cell build <path> # build C modules for a package
cell build <path> --force # force rebuild (ignore stat cache)
cell test package <path> # run tests for a package
cell list # list installed packages
cell link # list linked packages
```
The build step compiles C files to content-addressed dylibs in `~/.cell/build/<hash>` and writes a per-package manifest so the runtime can find them. C files in `src/` are support files linked into module dylibs, not standalone modules.
## Debugging Compiler Issues
When investigating bugs in compiled output (wrong values, missing operations, incorrect comparisons), **start from the optimizer down, not the VM up**. The compiler inspection tools will usually identify the problem faster than adding C-level tracing:
```
./cell --dev streamline --types <file> # show inferred slot types — look for wrong types
./cell --dev ir_report --events <file> # show every optimization applied and why
./cell --dev ir_report --types <file> # show type inference results per function
./cell --dev mcode --pretty <file> # show raw IR before optimization
./cell --dev streamline --ir <file> # show human-readable optimized IR
```
**Triage order:**
1. `streamline --types` — are slot types correct? Wrong type inference causes wrong optimizations.
2. `ir_report --events` — are type checks being incorrectly eliminated? Look for `known_type_eliminates_guard` on slots that shouldn't have known types.
3. `mcode --pretty` — is the raw IR correct before optimization? If so, the bug is in streamline.
4. Only dig into `source/mach.c` if the IR looks correct at all levels.
See `docs/compiler-tools.md` for the full tool reference and `docs/spec/streamline.md` for pass details.
## Testing
After any C runtime changes, run all three test suites before considering the work done:
```
make # rebuild
./cell --dev vm_suite # VM-level tests (641 tests)
./cell --dev test suite # language-level tests (493 tests)
./cell --dev fuzz # fuzzer (100 iterations)
```
All three must pass with 0 failures.
## Documentation
The `docs/` folder is the single source of truth. The website at `website/` mounts it via Hugo. Key files:
- `docs/language.md` — language syntax reference
- `docs/functions.md` — all built-in intrinsic functions
- `docs/actors.md` — actor model and actor intrinsics
- `docs/requestors.md` — async requestor pattern
- `docs/library/*.md` — intrinsic type reference (text, number, array, object) and standard library modules