---
title: Testing
description: Writing and running tests in ƿit
weight: 45
type: docs
---
ƿit has built-in support for writing and running tests. Tests live in the `tests/` directory of a package and are `.cm` modules that return a record of test functions.
## Writing Tests
A test file returns a record where each key starting with `test_` is a test function. A test passes if it returns null (or nothing); it fails if it returns a text string describing the failure.
```
// tests/math.cm
return {
    test_addition: function() {
        if (1 + 2 != 3) return "expected 3"
    },
    test_division: function() {
        if (10 / 3 != 3.333333333333333333) return "unexpected result"
    }
}
```
Test functions take no arguments. Use early returns with a failure message to report errors:
```
test_array_push: function() {
    var a = [1, 2]
    a[] = 3
    if (length(a) != 3) return "expected length 3, got " + text(length(a))
    if (a[2] != 3) return "expected a[2] to be 3"
}
```
## Running Tests
```
pit test                         # run all tests in current package
pit test suite                   # run a specific test file (tests/suite.cm)
pit test tests/math              # same, with explicit path
pit test all                     # run all tests in current package
pit test package <name>          # run all tests in a named package
pit test package <name> <test>   # run a specific test in a named package
pit test package all             # run tests from all installed packages
```
### Flags
```
pit test suite -g         # run GC after each test (useful for detecting leaks)
pit test suite --verify   # enable IR verification during compilation
pit test suite --diff     # run each test optimized and unoptimized, compare results
```
`--verify` and `--diff` can be combined:

```
pit test suite --verify --diff
```
## IR Verification
The `--verify` flag enables structural validation of the compiler's intermediate representation after each optimizer pass. This catches bugs like invalid slot references, broken jump targets, and malformed instructions.
When verification fails, errors are printed with the pass name that introduced them:
```
[verify_ir] slot_bounds: slot 12 out of range 0..9 in instruction add_int
[verify_ir] 1 errors after dead_code_elimination
```
IR verification adds overhead and is intended for development, not production use.
## Differential Testing
Differential testing runs each test through two paths, once with the optimizer enabled and once with it disabled, and compares the results. Any mismatch between the two indicates an optimizer bug.
### Inline Mode
The `--diff` flag on `pit test` runs each test module through both paths during a normal test run:

```
pit test suite --diff
```
Output includes a mismatch count at the end:
```
Tests: 493 passed, 0 failed, 493 total
Diff mismatches: 0
```
### Standalone Mode
`pit diff` is a dedicated differential testing tool with detailed mismatch reporting:

```
pit diff              # diff all test files in current package
pit diff suite        # diff a specific test file
pit diff tests/math   # same, with explicit path
```
For each test function, it reports whether the optimized and unoptimized results match:
```
tests/suite.cm: 493 passed, 0 failed
----------------------------------------
Diff: 493 passed, 0 failed, 493 total
```
When a mismatch is found:
```
tests/suite.cm: 492 passed, 1 failed
MISMATCH: test_foo: result mismatch opt=42 noopt=43
```
## ASAN for Native AOT
When debugging native (`shop.use_native`) crashes, there are two useful sanitizer workflows.
### 1) AOT-only sanitizer (fastest loop)
Enable sanitizer flags for generated native modules by creating a marker file:
```
touch .cell/asan_aot
cell --dev bench --native fibonacci
```
This adds `-fsanitize=address -fno-omit-frame-pointer` to AOT module compilation.
Disable it with:
```
rm -f .cell/asan_aot
```
### 2) Full runtime sanitizer (CLI + runtime + AOT)
Build an ASAN-instrumented `cell` binary:
```
meson setup build-asan -Dbuildtype=debug -Db_sanitize=address
CCACHE_DISABLE=1 meson compile -C build-asan
ASAN_OPTIONS=abort_on_error=1:detect_leaks=0 ./build-asan/cell --dev bench --native fibonacci
```
This catches bugs crossing the boundary between generated dylibs and runtime helpers.
If stale native artifacts are suspected after compiler/runtime changes, clear build outputs first:
```
cell --dev clean shop --build
```
## Fuzz Testing
The fuzzer generates random self-checking programs, compiles them, and runs them through both optimized and unoptimized paths. Each generated program contains test functions that validate their own expected results, so failures catch both correctness bugs and optimizer mismatches.
```
pit fuzz                  # 100 iterations, random seed
pit fuzz 500              # 500 iterations, random seed
pit fuzz --seed 42        # 100 iterations, deterministic seed
pit fuzz 1000 --seed 42   # 1000 iterations, deterministic seed
```
The fuzzer generates programs that exercise:
- Integer and float arithmetic with known expected results
- Control flow (if/else, while loops)
- Closures and captured variable mutation
- Records and property access
- Arrays and iteration
- Higher-order functions
- Disruption handling
- Text concatenation
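
A generated program is an ordinary self-checking test module. As a rough sketch (the exact shape of the generator's output is an assumption; only the constructs shown elsewhere on this page are used):

```
// hypothetical sketch of a generated self-checking module
return {
    test_fuzz_0: function() {
        var x = 3 + 4
        if (x != 7) return "expected 7, got " + text(x)
    }
}
```

Because the expected value is baked into the program, a wrong result fails the test under either compilation path, and a disagreement between the two paths surfaces as a diff mismatch.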
On failure, the generated source is saved to `tests/fuzz_failures/` for reproduction:
```
Fuzzing: 1000 iterations, starting seed=42
FAIL seed=57: diff fuzz_3: opt=10 noopt=11
  saved to tests/fuzz_failures/seed_57.cm
----------------------------------------
Fuzz: 999 passed, 1 failed, 1000 total
Failures saved to tests/fuzz_failures/
```
Saved failure files are valid `.cm` modules that can be run directly or added to the test suite.
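
For example, a saved failure can be re-run through the standard commands (the `seed_57` name matches the sample output above; this assumes the explicit-path form of `pit test` and `pit diff` also resolves files in subdirectories of `tests/`):

```
pit test tests/fuzz_failures/seed_57   # re-run the failing program as a normal test
pit diff tests/fuzz_failures/seed_57   # reproduce the mismatch with detailed reporting
```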
## Compile-Time Diagnostics Tests
The `tests/compile.cm` test suite verifies that the type checker catches provably wrong operations at compile time. It compiles source snippets through the pipeline with `_warn` enabled and checks that the expected diagnostics are emitted.
```
var shop = use('internal/shop')
var streamline = use('streamline')

// tmpfile: path to a scratch file the snippet is written to (setup elided)
function get_diagnostics(src) {
    fd.slurpwrite(tmpfile, stone(blob(src)))
    var compiled = shop.mcode_file(tmpfile)
    compiled._warn = true
    var optimized = streamline(compiled)
    if (optimized._diagnostics == null) return []
    return optimized._diagnostics
}
```
The suite covers:
- Store errors: storing named property on array, numeric index on record, property/index on text, push on text/record
- Invoke errors: invoking null, number, text
- Warnings: named property access on array/text, record key on record
- Clean code: valid operations produce no diagnostics
Run the compile diagnostics tests with:
```
pit test compile
```
## Test File Organization
Tests live in the `tests/` directory of a package:
```
mypackage/
├── pit.toml
├── math.cm
└── tests/
    ├── suite.cm     # main test suite
    ├── math.cm      # math-specific tests
    └── disrupt.cm   # disruption tests
```
All `.cm` files under `tests/` are discovered automatically by `pit test`.