# ƿit (pit) Language Project

## Building

- Build (or rebuild after changes): `make`
- Install to system: `make install`
- Run `cell --help` to see all CLI flags.

## Code Style

All code uses 2 spaces for indentation. K&R style for C and JavaScript.

## ƿit Script Quick Reference

ƿit script files: `.ce` (actors) and `.cm` (modules). The syntax is similar to JavaScript with important differences listed below.

### Key Differences from JavaScript

- `var` (mutable) and `def` (constant) — no `let` or `const`
- `==` and `!=` are strict (no `===` or `!==`)
- No `undefined` — only `null`
- No classes — only objects and prototypes (`meme()`, `proto()`, `isa()`)
- No `switch`/`case` — use record dispatch (a record keyed by case, values are functions or results) instead of if/else chains
- No `for...in`, `for...of`, spread (`...`), rest params, or default params
- Functions have a maximum of 4 parameters — use a record for more
- Variables must be declared at function body level only (not in if/while/for/blocks)
- All variables must be initialized at declaration (`var x` alone is an error; use `var x = null`)
- No `try`/`catch`/`throw` — use `disrupt`/`disruption`
- No arraybuffers — only `blob` (works with bits; must `stone(blob)` before reading)
- Identifiers can contain `?` and `!` (e.g., `nil?`, `set!`, `is?valid`)
- Prefer backticks for string interpolation; otherwise use `text()` to convert non-strings
- Everything should be lowercase

### Intrinsic Functions (always available, no `use()` needed)

The creator functions are **polymorphic** — behavior depends on argument types:

- `array(number)` — create array of size N filled with null
- `array(number, value_or_fn)` — create array with initial values
- `array(array)` — copy array
- `array(array, fn)` — map
- `array(array, array)` — concatenate
- `array(array, from, to)` — slice
- `array(record)` — get keys as array of text
- **`array(text)` — split text into individual characters** (e.g., `array("hello")` → `["h","e","l","l","o"]`)
- `array(text, separator)` — split by separator
- `array(text, length)` — split into chunks of length

- `text(array, separator)` — join array into text
- `text(number)` or `text(number, radix)` — number to text
- `text(text, from, to)` — substring

- `number(text)` or `number(text, radix)` — parse text to number
- `number(logical)` — boolean to number

- `record(record)` — copy
- `record(record, another)` — merge
- `record(array_of_keys)` — create record from keys

Other key intrinsics: `length()`, `stone()`, `is_stone()`, `print()`, `filter()`, `find()`, `reduce()`, `sort()`, `reverse()`, `some()`, `every()`, `starts_with()`, `ends_with()`, `meme()`, `proto()`, `isa()`, `splat()`, `apply()`, `extract()`, `replace()`, `search()`, `format()`, `lower()`, `upper()`, `trim()`

Sensory functions: `is_array()`, `is_text()`, `is_number()`, `is_object()`, `is_function()`, `is_null()`, `is_logical()`, `is_integer()`, `is_stone()`, etc.

### Standard Library (loaded with `use()`)

- `blob` — binary data (bits, not bytes)
- `time` — time constants and conversions
- `math` — trig, logarithms, roots (`math/radians`, `math/turns`)
- `json` — JSON encoding/decoding
- `random` — random number generation

### Actor Model

- `.ce` files are actors (independent execution units, don't return values)
- `.cm` files are modules (return a value, cached and frozen)
- Actors never share memory; communicate via `$send()` message passing
- Actor intrinsics start with `$`: `$me`, `$stop()`, `$send()`, `$start()`, `$delay()`, `$receiver()`, `$clock()`, `$portal()`, `$contact()`, `$couple()`, `$unneeded()`, `$connection()`, `$time_limit()`

### Requestors (async composition)

`sequence()`, `parallel()`, `race()`, `fallback()` — compose asynchronous operations. See `docs/requestors.md`.

### Error Handling

```javascript
var fn = function() {
  disrupt // bare keyword, no value
} disruption {
  // handle error; can re-raise with disrupt
}
```

### Push/Pop Syntax

```javascript
var a = [1, 2]
a[] = 3 // push: [1, 2, 3]
var v = a[] // pop: v is 3, a is [1, 2]
```

## C Integration

- Declare everything `static` that can be
- Most files don't have headers; files in a package are not shared between packages
- No undefined in C API: use `JS_IsNull` and `JS_NULL` only
- A C file with correct macros (`CELL_USE_FUNCS` etc) is loaded as a module by its name (e.g., `png.c` in a package → `use('<package>/png')`)
- C symbol naming: `js_<pkg>_<file>_use` (e.g., `js_core_math_radians_use` for `core/math/radians`)
- Core is the `core` package — its symbols follow the same `js_core_<name>_use` pattern as all other packages
- Package directories should contain only source files (no `.mach`/`.mcode` alongside source)
- Build cache files in `build/` are bare hashes (no extensions)

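The symbol naming rule above can be sketched as a small string helper. This is an illustration only: `module_symbol_name` is a hypothetical name, not part of the cell API, and it assumes that `/` is the only character in a module path that needs translating, which is all the `core/math/radians` example shows.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

/* Hypothetical helper, not part of the cell API: builds the expected entry
 * symbol for a module path by wrapping it in "js_" ... "_use" and replacing
 * every '/' with '_'. Returns 0 on success, -1 if the buffer is too small. */
static int module_symbol_name(const char *module_path, char *out, size_t out_size) {
  assert(module_path != NULL && out != NULL);
  int n = snprintf(out, out_size, "js_%s_use", module_path);
  if (n < 0 || (size_t)n >= out_size) {
    return -1;
  }
  for (char *p = out; *p != '\0'; p++) {
    if (*p == '/') {
      *p = '_';
    }
  }
  return 0;
}
```

Called with `"core/math/radians"` and a large enough buffer, the helper writes `js_core_math_radians_use`, matching the example in the list above.
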
### MANDATORY: GC Rooting for C Functions

This project uses a **copying garbage collector**. ANY allocating JS call (`JS_NewObject`, `JS_NewString`, `JS_NewArray`, `JS_NewFloat64`, `JS_SetPropertyStr`, `js_new_blob_stoned_copy`, etc.) can trigger GC, which **invalidates all unrooted JSValue locals**. This is not theoretical — it causes real crashes.

**Before writing or modifying ANY C function**, apply this checklist:

1. Count the number of `JS_New*`, `JS_SetProperty*`, and `js_new_blob*` calls in the function
2. If there are 2 or more, the function MUST use `JS_FRAME`/`JS_ROOT`/`JS_RETURN`
3. Every JSValue that is held across an allocating call must be rooted

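To make the failure mode concrete, here is a toy model of a copying collector. It is a sketch under stated assumptions, not the cell runtime: every `Toy*` name is hypothetical, the heap holds a handful of integers, and every allocation forces a collection to model the worst case. It shows that a value reached through a rooted reference follows the object when it moves, while a raw copy of the same pointer is left pointing at dead memory.

```c
#include <assert.h>
#include <stddef.h>

#define TOY_CAP 8
#define TOY_POISON (-1)

/* Toy stand-ins for illustration only; these are not the cell runtime types. */
typedef struct {
  int payload;
} ToyObject;

typedef struct ToyRef {
  ToyObject *val;
  struct ToyRef *prev;
} ToyRef;

typedef struct {
  ToyObject space[2][TOY_CAP]; /* two semispaces */
  int active;                  /* semispace currently in use */
  int used;                    /* objects allocated in the active semispace */
  ToyRef *roots;               /* top of the root chain */
} ToyHeap;

static void toy_push_root(ToyHeap *h, ToyRef *ref) {
  ref->prev = h->roots;
  h->roots = ref;
}

/* Copy every rooted object into the other semispace, point the root at the
 * new address, then poison everything left behind in the old semispace. */
static void toy_collect(ToyHeap *h) {
  int to = 1 - h->active;
  int copied = 0;
  for (ToyRef *r = h->roots; r != NULL; r = r->prev) {
    if (r->val != NULL) {
      assert(copied < TOY_CAP);
      h->space[to][copied] = *r->val;
      r->val = &h->space[to][copied];
      copied++;
    }
  }
  for (int i = 0; i < h->used; i++) {
    h->space[h->active][i].payload = TOY_POISON;
  }
  h->active = to;
  h->used = copied;
}

/* Worst-case model: every allocation triggers a collection first. */
static ToyObject *toy_alloc(ToyHeap *h, int payload) {
  toy_collect(h);
  assert(h->used < TOY_CAP);
  ToyObject *o = &h->space[h->active][h->used++];
  o->payload = payload;
  return o;
}

/* Returns 1 if the rooted object moved away from the unrooted copy. */
static int toy_demo(int *rooted_payload, int *stale_payload) {
  ToyHeap h = {0};
  ToyRef obj = { .val = NULL, .prev = NULL };
  toy_push_root(&h, &obj);
  obj.val = toy_alloc(&h, 42);
  ToyObject *unrooted = obj.val; /* raw copy, invisible to the collector */
  (void)toy_alloc(&h, 7);        /* second allocation moves the object */
  *rooted_payload = obj.val->payload;
  *stale_payload = unrooted->payload;
  return obj.val != unrooted;
}
```

After `toy_demo` runs, the rooted reference still reads 42 and the raw copy reads the poison value. In the real runtime the stale read would hit memory that may already be reused, which is why the checklist asks for every value held across an allocating call to be rooted.
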
**Pattern — object with properties:**
```c
JS_FRAME(js);
JS_ROOT(obj, JS_NewObject(js));
JS_SetPropertyStr(js, obj.val, "x", JS_NewInt32(js, 42));
JSValue name = JS_NewString(js, "hello");
JS_SetPropertyStr(js, obj.val, "name", name);
JS_RETURN(obj.val);
```

**Pattern — array with loop (declare root BEFORE the loop):**
```c
JS_FRAME(js);
JS_ROOT(arr, JS_NewArray(js));
// One manual root, pushed once before the loop and reused on every iteration
JSGCRef item = { .val = JS_NULL, .prev = NULL };
JS_PushGCRef(js, &item);
for (int i = 0; i < count; i++) {
  item.val = JS_NewObject(js);
  JS_SetPropertyStr(js, item.val, "v", JS_NewInt32(js, i));
  JS_SetPropertyNumber(js, arr.val, i, item.val);
}
JS_RETURN(arr.val);
```

**Rules:**
- Access rooted values via `.val` (e.g., `obj.val`, not `obj`)
- NEVER put `JS_ROOT` inside a loop — it pushes the same stack address twice, corrupting the GC chain
- Error returns before `JS_FRAME` use plain `return`
- Error returns after `JS_FRAME` must use `JS_RETURN_EX()` or `JS_RETURN_NULL()`

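The loop rule is easier to remember with a picture of what a double push does to a linked root chain. The sketch below is a toy model, assuming the chain is a singly linked list threaded through a `prev` field as the `JSGCRef` initializer above suggests. `ChainRef`, `chain_push`, and the demo functions are hypothetical names, and the second push is written out explicitly to stand in for a root declared inside a loop body, where each iteration reuses the same stack slot.

```c
#include <assert.h>
#include <stddef.h>

/* Toy root chain node; a stand-in, not the real JSGCRef. */
typedef struct ChainRef {
  int val;
  struct ChainRef *prev;
} ChainRef;

static void chain_push(ChainRef **top, ChainRef *ref) {
  assert(top != NULL && ref != NULL);
  ref->prev = *top;
  *top = ref;
}

/* Walks the chain; returns the number of nodes, or -1 if the walk has not
 * reached the end after `limit` nodes (the chain is probably cyclic). */
static int chain_length(const ChainRef *top, int limit) {
  int n = 0;
  for (const ChainRef *r = top; r != NULL; r = r->prev) {
    if (++n > limit) {
      return -1;
    }
  }
  return n;
}

/* Same address pushed twice: the node's prev now points at itself. */
static int demo_double_push(void) {
  ChainRef *top = NULL;
  ChainRef item = { .val = 0, .prev = NULL };
  chain_push(&top, &item);
  chain_push(&top, &item);
  return chain_length(top, 16);
}

/* Root pushed once before the loop; only its value changes per iteration. */
static int demo_single_push(int iterations) {
  ChainRef *top = NULL;
  ChainRef item = { .val = 0, .prev = NULL };
  chain_push(&top, &item);
  for (int i = 0; i < iterations; i++) {
    item.val = i;
  }
  return chain_length(top, 16);
}
```

In this model `demo_double_push` never reaches the end of the chain because the node links to itself, while `demo_single_push` leaves a chain of exactly one node no matter how many iterations run. A collector walking the real chain would face the same kind of corruption.
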
**CRITICAL — C argument evaluation order bug:**

Allocating functions (`JS_NewString`, `JS_NewFloat64`, `js_new_blob_stoned_copy`, etc.) used as arguments to `JS_SetPropertyStr` can crash because C leaves the evaluation order of function arguments unspecified. The compiler may read `obj.val` BEFORE the allocating call, then GC moves the object, leaving a stale pointer.

```c
// UNSAFE — intermittent crash:
JS_SetPropertyStr(js, obj.val, "format", JS_NewString(js, "rgba32"));
JS_SetPropertyStr(js, obj.val, "pixels", js_new_blob_stoned_copy(js, data, len));

// SAFE — separate the allocation:
JSValue fmt = JS_NewString(js, "rgba32");
JS_SetPropertyStr(js, obj.val, "format", fmt);
JSValue pixels = js_new_blob_stoned_copy(js, data, len);
JS_SetPropertyStr(js, obj.val, "pixels", pixels);
```

`JS_NewInt32`, `JS_NewUint32`, and `JS_NewBool` do NOT allocate and are safe inline.

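Because the inline form leaves the order up to the compiler, the bug cannot be reproduced on demand. The toy model below instead writes out both orders as separate statements so the difference is deterministic. All names (`MovingObject`, `allocating_call`, and so on) are hypothetical, and the "allocation" simply moves a single integer between two slots to imitate a copying collection.

```c
#include <assert.h>
#include <stddef.h>

#define MOVED_POISON (-1)

/* Toy object that lives in one of two slots and moves on every allocation. */
typedef struct {
  int storage[2];
  int live; /* index of the slot that currently holds the object */
} MovingObject;

/* Models reading obj.val: the current address of the object. */
static int *handle_read(MovingObject *o) {
  assert(o != NULL);
  return &o->storage[o->live];
}

/* Models an allocating call that triggers a moving collection. */
static int allocating_call(MovingObject *o, int value) {
  int next = 1 - o->live;
  o->storage[next] = o->storage[o->live];
  o->storage[o->live] = MOVED_POISON;
  o->live = next;
  return value;
}

/* One order the compiler may pick for the inline form: handle first. */
static int store_handle_first(MovingObject *o, int value) {
  int *target = handle_read(o);      /* address captured before the move */
  int v = allocating_call(o, value); /* object moves */
  *target = v;                       /* write lands in the old slot */
  return *handle_read(o);            /* what the live object holds now */
}

/* The order forced by hoisting the allocation into its own statement. */
static int store_alloc_first(MovingObject *o, int value) {
  int v = allocating_call(o, value); /* object moves */
  int *target = handle_read(o);      /* address read after the move */
  *target = v;
  return *handle_read(o);
}
```

Starting from an object holding 1, `store_handle_first` leaves the live object at 1 because the write went to the abandoned slot, while `store_alloc_first` leaves it at the new value. Splitting the allocation into its own statement, as in the SAFE block above, is what guarantees the second order.
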
See `docs/c-modules.md` for the full GC safety reference.

## Project Layout

- `source/` — C source for the cell runtime and CLI
- `docs/` — master documentation (Markdown), reflected on the website
- `website/` — Hugo site; theme at `website/themes/knr/`
- `internal/` — internal ƿit scripts (engine.cm etc.)
- `packages/` — core packages
- `Makefile` — build system (`make` to rebuild, `make bootstrap` for first build)

## Package Management (Shop CLI)

**Two shops:** `cell <cmd>` uses the global shop at `~/.cell/packages/`. `cell --dev <cmd>` uses the local shop at `.cell/packages/`. Linked packages (via `cell link`) are symlinked into the shop — edit the source directory directly.

```
cell add <path>            # add a package (local path or remote)
cell remove <path>         # remove a package (cleans lock, symlink, dylibs)
cell build <path>          # build C modules for a package
cell build <path> --force  # force rebuild (ignore stat cache)
cell test package <path>   # run tests for a package
cell list                  # list installed packages
cell link                  # list linked packages
```

The build step compiles C files to content-addressed dylibs in `~/.cell/build/<hash>` and writes a per-package manifest so the runtime can find them. C files in `src/` are support files linked into module dylibs, not standalone modules.

## Debugging Compiler Issues

When investigating bugs in compiled output (wrong values, missing operations, incorrect comparisons), **start from the optimizer down, not the VM up**. The compiler inspection tools will usually identify the problem faster than adding C-level tracing:

```
./cell --dev streamline --types <file>   # show inferred slot types — look for wrong types
./cell --dev ir_report --events <file>   # show every optimization applied and why
./cell --dev ir_report --types <file>    # show type inference results per function
./cell --dev mcode --pretty <file>       # show raw IR before optimization
./cell --dev streamline --ir <file>      # show human-readable optimized IR
```

**Triage order:**
1. `streamline --types` — are slot types correct? Wrong type inference causes wrong optimizations.
2. `ir_report --events` — are type checks being incorrectly eliminated? Look for `known_type_eliminates_guard` on slots that shouldn't have known types.
3. `mcode --pretty` — is the raw IR correct before optimization? If so, the bug is in streamline.
4. Only dig into `source/mach.c` if the IR looks correct at all levels.

See `docs/compiler-tools.md` for the full tool reference and `docs/spec/streamline.md` for pass details.

## Testing

After any C runtime changes, run all three test suites before considering the work done:

```
make                     # rebuild
./cell --dev vm_suite    # VM-level tests (641 tests)
./cell --dev test suite  # language-level tests (493 tests)
./cell --dev fuzz        # fuzzer (100 iterations)
```

All three must pass with 0 failures.

## Documentation

The `docs/` folder is the single source of truth. The website at `website/` mounts it via Hugo. Key files:

- `docs/language.md` — language syntax reference
- `docs/functions.md` — all built-in intrinsic functions
- `docs/actors.md` — actor model and actor intrinsics
- `docs/requestors.md` — async requestor pattern
- `docs/library/*.md` — intrinsic type reference (text, number, array, object) and standard library modules