# ƿit (pit) Language Project

## Building

Build (or rebuild after changes): `make`

Install to system: `make install`

Run `cell --help` to see all CLI flags.

## Code Style

All code uses 2 spaces for indentation. K&R style for C and JavaScript.

## ƿit Script Quick Reference

ƿit script files: `.ce` (actors) and `.cm` (modules). The syntax is similar to JavaScript with important differences listed below.

### Key Differences from JavaScript

- `var` (mutable) and `def` (constant) — no `let` or `const`
- `==` and `!=` are strict (no `===` or `!==`)
- No `undefined` — only `null`
- No classes — only objects and prototypes (`meme()`, `proto()`, `isa()`)
- No `switch`/`case` — use record dispatch (a record keyed by case, values are functions or results) instead of if/else chains
- No `for...in`, `for...of`, spread (`...`), rest params, or default params
- Functions have a maximum of 4 parameters — use a record for more
- Variables must be declared at function body level only (not in if/while/for/blocks)
- All variables must be initialized at declaration (`var x` alone is an error; use `var x = null`)
- No `try`/`catch`/`throw` — use `disrupt`/`disruption`
- No arraybuffers — only `blob` (works with bits; must `stone(blob)` before reading)
- Identifiers can contain `?` and `!` (e.g., `nil?`, `set!`, `is?valid`)
- Prefer backticks for string interpolation; otherwise use `text()` to convert non-strings
- Everything should be lowercase

### Intrinsic Functions (always available, no `use()` needed)

The creator functions are **polymorphic** — behavior depends on argument types:

- `array(number)` — create array of size N filled with null
- `array(number, value_or_fn)` — create array with initial values
- `array(array)` — copy array
- `array(array, fn)` — map
- `array(array, array)` — concatenate
- `array(array, from, to)` — slice
- `array(record)` — get keys as array of text
- **`array(text)` — split text into individual characters** (e.g., `array("hello")` → `["h","e","l","l","o"]`)
- `array(text, separator)` — split by separator
- `array(text, length)` — split into chunks of length
- `text(array, separator)` — join array into text
- `text(number)` or `text(number, radix)` — number to text
- `text(text, from, to)` — substring
- `number(text)` or `number(text, radix)` — parse text to number
- `number(logical)` — boolean to number
- `record(record)` — copy
- `record(record, another)` — merge
- `record(array_of_keys)` — create record from keys

Other key intrinsics: `length()`, `stone()`, `is_stone()`, `print()`, `filter()`, `find()`, `reduce()`, `sort()`, `reverse()`, `some()`, `every()`, `starts_with()`, `ends_with()`, `meme()`, `proto()`, `isa()`, `splat()`, `apply()`, `extract()`, `replace()`, `search()`, `format()`, `lower()`, `upper()`, `trim()`

Sensory functions: `is_array()`, `is_text()`, `is_number()`, `is_object()`, `is_function()`, `is_null()`, `is_logical()`, `is_integer()`, `is_stone()`, etc.

### Standard Library (loaded with `use()`)

- `blob` — binary data (bits, not bytes)
- `time` — time constants and conversions
- `math` — trig, logarithms, roots (`math/radians`, `math/turns`)
- `json` — JSON encoding/decoding
- `random` — random number generation

### Actor Model

- `.ce` files are actors (independent execution units, don't return values)
- `.cm` files are modules (return a value, cached and frozen)
- Actors never share memory; communicate via `$send()` message passing
- Actor intrinsics start with `$`: `$me`, `$stop()`, `$send()`, `$start()`, `$delay()`, `$receiver()`, `$clock()`, `$portal()`, `$contact()`, `$couple()`, `$unneeded()`, `$connection()`, `$time_limit()`

### Requestors (async composition)

`sequence()`, `parallel()`, `race()`, `fallback()` — compose asynchronous operations. See docs/requestors.md.

### Error Handling

```javascript
var fn = function() {
  disrupt // bare keyword, no value
} disruption {
  // handle error; can re-raise with disrupt
}
```

### Push/Pop Syntax

```javascript
var a = [1, 2]
a[] = 3      // push: [1, 2, 3]
var v = a[]  // pop: v is 3, a is [1, 2]
```

## C Integration

- Declare everything `static` that can be
- Most files don't have headers; files in a package are not shared between packages
- No undefined in C API: use `JS_IsNull` and `JS_NULL` only
- A C file with correct macros (`CELL_USE_FUNCS` etc) is loaded as a module by its name (e.g., `png.c` in a package → `use('/png')`)
- C symbol naming: `js___use` (e.g., `js_core_math_radians_use` for `core/math/radians`)
- Core is the `core` package — its symbols follow the same `js_core__use` pattern as all other packages
- Package directories should contain only source files (no `.mach`/`.mcode` alongside source)
- Build cache files in `build/` are bare hashes (no extensions)

### MANDATORY: GC Rooting for C Functions

This project uses a **copying garbage collector**. ANY JS allocation (`JS_NewObject`, `JS_NewString`, `JS_NewArray`, `JS_NewInt32`, `JS_SetPropertyStr`, `js_new_blob_stoned_copy`, etc.) can trigger GC, which **invalidates all unrooted JSValue locals**. This is not theoretical — it causes real crashes.

**Before writing or modifying ANY C function**, apply this checklist:

1. Count the number of `JS_New*`, `JS_SetProperty*`, and `js_new_blob*` calls in the function
2. If there are 2 or more, the function MUST use `JS_FRAME`/`JS_ROOT`/`JS_RETURN`
3. Every JSValue that is held across an allocating call must be rooted

**Pattern — object with properties:**

```c
JS_FRAME(js);
JS_ROOT(obj, JS_NewObject(js));
JS_SetPropertyStr(js, obj.val, "x", JS_NewInt32(js, 42));
JS_SetPropertyStr(js, obj.val, "name", JS_NewString(js, "hello"));
JS_RETURN(obj.val);
```

**Pattern — array with loop:**

```c
JS_FRAME(js);
JS_ROOT(arr, JS_NewArray(js));
for (int i = 0; i < count; i++) {
  JS_ROOT(item, JS_NewObject(js));
  JS_SetPropertyStr(js, item.val, "v", JS_NewInt32(js, i));
  JS_SetPropertyNumber(js, arr.val, i, item.val);
}
JS_RETURN(arr.val);
```

**Rules:**

- Access rooted values via `.val` (e.g., `obj.val`, not `obj`)
- Error returns before `JS_FRAME` use plain `return`
- Error returns after `JS_FRAME` must use `JS_RETURN_EX()` or `JS_RETURN_NULL()`
- When calling a helper that itself returns a JSValue, that return value is safe to pass directly into `JS_SetPropertyStr` — no need to root temporaries that aren't stored in a local

**Common mistake — UNSAFE (will crash under GC pressure):**

```c
JSValue obj = JS_NewObject(js); // NOT rooted
JS_SetPropertyStr(js, obj, "pixels", js_new_blob_stoned_copy(js, data, len));
//                    ^^^ blob allocation can GC, invalidating obj
return obj; // obj may be a dangling pointer
```

See `docs/c-modules.md` for the full GC safety reference.

## Project Layout

- `source/` — C source for the cell runtime and CLI
- `docs/` — master documentation (Markdown), reflected on the website
- `website/` — Hugo site; theme at `website/themes/knr/`
- `internal/` — internal ƿit scripts (engine.cm etc.)
- `packages/` — core packages
- `Makefile` — build system (`make` to rebuild, `make bootstrap` for first build)

## Package Management (Shop CLI)

When running locally with `./cell --dev`, these commands manage packages:

```
./cell --dev add            # add a package (local path or remote)
./cell --dev remove         # remove a package (cleans lock, symlink, dylibs)
./cell --dev build          # build C modules for a package
./cell --dev test package   # run tests for a package
./cell --dev list           # list installed packages
```

Local paths are symlinked into `.cell/packages/`. The build step compiles C files to `.cell/lib//.dylib`. C files in `src/` are support files linked into module dylibs, not standalone modules.

## Debugging Compiler Issues

When investigating bugs in compiled output (wrong values, missing operations, incorrect comparisons), **start from the optimizer down, not the VM up**. The compiler inspection tools will usually identify the problem faster than adding C-level tracing:

```
./cell --dev streamline --types   # show inferred slot types — look for wrong types
./cell --dev ir_report --events   # show every optimization applied and why
./cell --dev ir_report --types    # show type inference results per function
./cell --dev mcode --pretty       # show raw IR before optimization
./cell --dev streamline --ir      # show human-readable optimized IR
```

**Triage order:**

1. `streamline --types` — are slot types correct? Wrong type inference causes wrong optimizations.
2. `ir_report --events` — are type checks being incorrectly eliminated? Look for `known_type_eliminates_guard` on slots that shouldn't have known types.
3. `mcode --pretty` — is the raw IR correct before optimization? If so, the bug is in streamline.
4. Only dig into `source/mach.c` if the IR looks correct at all levels.

See `docs/compiler-tools.md` for the full tool reference and `docs/spec/streamline.md` for pass details.

## Testing

After any C runtime changes, run all three test suites before considering the work done:

```
make                      # rebuild
./cell --dev vm_suite     # VM-level tests (641 tests)
./cell --dev test suite   # language-level tests (493 tests)
./cell --dev fuzz         # fuzzer (100 iterations)
```

All three must pass with 0 failures.

## Documentation

The `docs/` folder is the single source of truth. The website at `website/` mounts it via Hugo. Key files:

- `docs/language.md` — language syntax reference
- `docs/functions.md` — all built-in intrinsic functions
- `docs/actors.md` — actor model and actor intrinsics
- `docs/requestors.md` — async requestor pattern
- `docs/library/*.md` — intrinsic type reference (text, number, array, object) and standard library modules
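
The rebuild and three-suite sequence listed under Testing could be wrapped in a small shell helper that stops at the first failing step. This is only a sketch and not part of the project: the `run_checks` function and the `CELL` and `MAKE` override variables are hypothetical, and it assumes each command reports failure through a non-zero exit status, which this guide does not state.

```shell
#!/bin/sh
# Hypothetical helper: runs the rebuild and the three suites listed under "Testing".
# CELL and MAKE are illustrative overrides, not project conventions.
CELL="${CELL:-./cell}"
MAKE="${MAKE:-make}"

# Assumption: every step signals failure through a non-zero exit status.
run_checks() {
  "$MAKE" || { echo "build failed"; return 1; }
  "$CELL" --dev vm_suite || { echo "vm_suite failed"; return 1; }
  "$CELL" --dev test suite || { echo "test suite failed"; return 1; }
  "$CELL" --dev fuzz || { echo "fuzz failed"; return 1; }
  echo "all checks passed"
}
```

Calling `run_checks` from the repository root after a C runtime change then gives a single pass/fail result and names the first step that failed.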