---
title: Testing
description: Writing and running tests in ƿit
weight: 45
type: docs
---
ƿit has built-in support for writing and running tests. Tests live in the `tests/` directory of a package and are `.cm` modules that return a record of test functions.
## Writing Tests
A test file returns a record where each key starting with `test_` is a test function. A test passes if it returns null (or nothing); it fails if it returns a text string describing the failure.
```
// tests/math.cm
return {
    test_addition: function() {
        if (1 + 2 != 3) return "expected 3"
    },
    test_division: function() {
        if (10 / 3 != 3.333333333333333333) return "unexpected result"
    }
}
```
Test functions take no arguments. Use early returns with a failure message to report errors:
```
test_array_push: function() {
    var a = [1, 2]
    a[] = 3
    if (length(a) != 3) return "expected length 3, got " + text(length(a))
    if (a[2] != 3) return "expected a[2] to be 3"
}
```
## Running Tests
```
pit test                         # run all tests in current package
pit test suite                   # run a specific test file (tests/suite.cm)
pit test tests/math              # same, with explicit path
pit test all                     # run all tests in current package
pit test package <name>          # run all tests in a named package
pit test package <name> <test>   # run a specific test in a named package
pit test package all             # run tests from all installed packages
```
### Flags
```
pit test suite -g         # run GC after each test (useful for detecting leaks)
pit test suite --verify   # enable IR verification during compilation
pit test suite --diff     # run each test optimized and unoptimized, compare results
```
`--verify` and `--diff` can be combined:

```
pit test suite --verify --diff
```
## IR Verification
The `--verify` flag enables structural validation of the compiler's intermediate representation after each optimizer pass. This catches bugs like invalid slot references, broken jump targets, and malformed instructions.
When verification fails, errors are printed with the pass name that introduced them:
```
[verify_ir] slot_bounds: slot 12 out of range 0..9 in instruction add_int
[verify_ir] 1 errors after dead_code_elimination
```
IR verification adds overhead and is intended for development, not production use.
## Differential Testing
Differential testing runs each test through two paths, once with the optimizer enabled and once with it disabled, and compares the results. Any mismatch between the two indicates an optimizer bug.
### Inline Mode
The `--diff` flag on `pit test` runs each test module through both paths during a normal test run:

```
pit test suite --diff
```
Output includes a mismatch count at the end:
```
Tests: 493 passed, 0 failed, 493 total
Diff mismatches: 0
```
### Standalone Mode
`pit diff` is a dedicated differential testing tool with detailed mismatch reporting:

```
pit diff              # diff all test files in current package
pit diff suite        # diff a specific test file
pit diff tests/math   # same, with explicit path
```
For each test function, it reports whether the optimized and unoptimized results match:
```
tests/suite.cm: 493 passed, 0 failed
----------------------------------------
Diff: 493 passed, 0 failed, 493 total
```
When a mismatch is found:
```
tests/suite.cm: 492 passed, 1 failed
MISMATCH: test_foo: result mismatch opt=42 noopt=43
```
## ASAN for Native AOT
When debugging native (`shop.use_native`) crashes, there are two useful sanitizer workflows.
### 1) AOT-only sanitizer (fastest loop)
Enable sanitizer flags for generated native modules by creating a marker file:
```
touch .cell/asan_aot
cell --dev bench --native fibonacci
```
This adds `-fsanitize=address -fno-omit-frame-pointer` to AOT module compilation.
Disable it with:
```
rm -f .cell/asan_aot
```
### 2) Full runtime sanitizer (CLI + runtime + AOT)
Build an ASAN-instrumented `cell` binary:
```
meson setup build-asan -Dbuildtype=debug -Db_sanitize=address
CCACHE_DISABLE=1 meson compile -C build-asan
ASAN_OPTIONS=abort_on_error=1:detect_leaks=0 ./build-asan/cell --dev bench --native fibonacci
```
This catches bugs crossing the boundary between generated dylibs and runtime helpers.
If stale native artifacts are suspected after compiler/runtime changes, clear build outputs first:
```
cell --dev clean shop --build
```
## Fuzz Testing
The fuzzer generates random self-checking programs, compiles them, and runs them through both optimized and unoptimized paths. Each generated program contains test functions that validate their own expected results, so failures catch both correctness bugs and optimizer mismatches.
```
pit fuzz                  # 100 iterations, random seed
pit fuzz 500              # 500 iterations, random seed
pit fuzz --seed 42        # 100 iterations, deterministic seed
pit fuzz 1000 --seed 42   # 1000 iterations, deterministic seed
```
The fuzzer generates programs that exercise:
- Integer and float arithmetic with known expected results
- Control flow (if/else, while loops)
- Closures and captured variable mutation
- Records and property access
- Arrays and iteration
- Higher-order functions
- Disruption handling
- Text concatenation
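
A generated program is an ordinary self-checking test module. As a rough sketch (the exact shape of the generator's output is an assumption; only the constructs shown elsewhere on this page are used):

```
// hypothetical sketch of a generated self-checking module
return {
    test_fuzz_0: function() {
        var x = 3 + 4
        if (x != 7) return "expected 7, got " + text(x)
    }
}
```

Because the expected value is baked into the program, a wrong result fails the test under either compilation path, and a disagreement between the two paths surfaces as a diff mismatch.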
On failure, the generated source is saved to `tests/fuzz_failures/` for reproduction:
```
Fuzzing: 1000 iterations, starting seed=42
FAIL seed=57: diff fuzz_3: opt=10 noopt=11
  saved to tests/fuzz_failures/seed_57.cm
----------------------------------------
Fuzz: 999 passed, 1 failed, 1000 total
Failures saved to tests/fuzz_failures/
```
Saved failure files are valid `.cm` modules that can be run directly or added to the test suite.
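
For example, a saved failure can be re-run through the standard commands (the `seed_57` name matches the sample output above; this assumes the explicit-path form of `pit test` and `pit diff` also resolves files in subdirectories of `tests/`):

```
pit test tests/fuzz_failures/seed_57   # re-run the failing program as a normal test
pit diff tests/fuzz_failures/seed_57   # reproduce the mismatch with detailed reporting
```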
## Compile-Time Diagnostics Tests
The `tests/compile.cm` test suite verifies that the type checker catches provably wrong operations at compile time. It compiles source snippets through the pipeline with `_warn` enabled and checks that the expected diagnostics are emitted.
```
var shop = use('internal/shop')
var streamline = use('streamline')

// tmpfile: path to a scratch file the snippet is written to (setup elided)
function get_diagnostics(src) {
    fd.slurpwrite(tmpfile, stone(blob(src)))
    var compiled = shop.mcode_file(tmpfile)
    compiled._warn = true
    var optimized = streamline(compiled)
    if (optimized._diagnostics == null) return []
    return optimized._diagnostics
}
```
The suite covers:
- Store errors: storing named property on array, numeric index on record, property/index on text, push on text/record
- Invoke errors: invoking null, number, text
- Warnings: named property access on array/text, record key on record
- Clean code: valid operations produce no diagnostics
Run the compile diagnostics tests with:
```
pit test compile
```
## Test File Organization
Tests live in the `tests/` directory of a package:
```
mypackage/
├── pit.toml
├── math.cm
└── tests/
    ├── suite.cm     # main test suite
    ├── math.cm      # math-specific tests
    └── disrupt.cm   # disruption tests
```
All `.cm` files under `tests/` are discovered automatically by `pit test`.