cell/docs/spec/mcode.md
2026-02-08 08:25:48 -06:00

title: Mcode IR
description: JSON-based intermediate representation

Overview

Mcode is a JSON-based intermediate representation that can be interpreted directly. It represents the same operations as the Mach register VM but uses string-based instruction dispatch rather than binary opcodes. Mcode is intended as an intermediate step toward native code compilation.

Pipeline

Source → Tokenize → Parse (AST) → Mcode (JSON) → Interpret
                                                → Compile to Mach (planned)
                                                → Compile to native (planned)

Mcode is produced by the JS_Mcode compiler pass, which emits a cJSON tree. The mcode interpreter walks this tree directly, dispatching on instruction name strings.

JSMCode Structure

struct JSMCode {
  uint16_t nr_args;        // argument count
  uint16_t nr_slots;       // register count
  cJSON **instrs;          // pre-flattened instruction array
  uint32_t instr_count;    // number of instructions

  struct {
    const char *name;      // label name
    uint32_t index;        // instruction index
  } *labels;
  uint32_t label_count;

  struct JSMCode **functions; // nested functions
  uint32_t func_count;

  cJSON *json_root;        // keeps JSON alive
  const char *name;        // function name
  const char *filename;    // source file
  uint16_t disruption_pc;  // exception handler offset
};
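Because each JSMCode holds its nested functions in `functions`, a whole program forms a tree. As a minimal sketch of walking that tree, the helper below totals the instructions of a function and everything nested inside it. The struct is copied from above with `cJSON` left opaque; `mcode_total_instrs` is a hypothetical name used only for illustration and is not part of the runtime.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

typedef struct cJSON cJSON;  /* opaque here; the real type comes from cJSON.h */

struct JSMCode {
  uint16_t nr_args;
  uint16_t nr_slots;
  cJSON **instrs;
  uint32_t instr_count;

  struct {
    const char *name;
    uint32_t index;
  } *labels;
  uint32_t label_count;

  struct JSMCode **functions;
  uint32_t func_count;

  cJSON *json_root;
  const char *name;
  const char *filename;
  uint16_t disruption_pc;
};

/* Hypothetical helper: counts the instructions in `code` plus those in
   every nested function, recursing through the `functions` array. */
static uint64_t mcode_total_instrs(const struct JSMCode *code) {
  if (code == NULL) return 0;
  /* A non-zero func_count must come with a functions array. */
  assert(code->func_count == 0 || code->functions != NULL);
  uint64_t total = code->instr_count;
  for (uint32_t i = 0; i < code->func_count; i++)
    total += mcode_total_instrs(code->functions[i]);
  return total;
}
```

The same recursive shape would apply to any per-function pass, such as dumping or validating every nested function.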

Instruction Format

Each instruction is a JSON array. The first element is the instruction name (string), followed by operands:

["LOADK", 0, 42]
["ADD", 2, 0, 1]
["JMPFALSE", 3, "else_label"]
["CALL", 0, 2, 1]

The instruction set mirrors the Mach VM opcodes — same operations, same register semantics, but with string dispatch instead of numeric opcodes.
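To illustrate what string dispatch looks like in practice, here is a minimal sketch of an interpreter loop for two of the instructions shown above. It makes several assumptions: `DemoInstr` is a hypothetical flat stand-in for a cJSON array, `["LOADK", dst, constant]` is taken to load an inline constant into a register, and `["ADD", dst, lhs, rhs]` is taken to add two registers. The real interpreter reads operands from cJSON nodes and supports many more instructions.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical stand-in for one instruction such as ["ADD", 2, 0, 1]. */
typedef struct {
  const char *op;  /* instruction name, e.g. "LOADK" */
  int a, b, c;     /* operands; unused ones are 0 */
} DemoInstr;

enum { DEMO_SLOTS = 8 };  /* size of the demo register file */

/* Runs a straight-line program over `regs`.
   Returns 0 on success, -1 if an instruction name is not recognized. */
static int demo_run(const DemoInstr *code, size_t count, long *regs) {
  for (size_t pc = 0; pc < count; pc++) {
    const DemoInstr *in = &code[pc];
    assert(in->a >= 0 && in->a < DEMO_SLOTS);
    if (strcmp(in->op, "LOADK") == 0) {
      /* Assumed form: ["LOADK", dst, constant] */
      regs[in->a] = in->b;
    } else if (strcmp(in->op, "ADD") == 0) {
      /* Assumed form: ["ADD", dst, lhs, rhs] */
      assert(in->b >= 0 && in->b < DEMO_SLOTS);
      assert(in->c >= 0 && in->c < DEMO_SLOTS);
      regs[in->a] = regs[in->b] + regs[in->c];
    } else {
      return -1;  /* unknown instruction name */
    }
  }
  return 0;
}
```

Each step pays for one or more `strcmp` calls before any work is done, which is the dispatch cost that a numeric opcode avoids.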

Labels

Control flow uses named labels instead of numeric offsets:

["LABEL", "loop_start"]
["ADD", 1, 1, 2]
["JMPFALSE", 3, "loop_end"]
["JMP", "loop_start"]
["LABEL", "loop_end"]

Labels are collected into a name-to-index map during loading, enabling O(1) jump resolution.
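The collection pass can be sketched as follows. This is an illustration under assumptions, not the loader's actual code: `LabelDemoInstr`, `collect_labels`, and `find_label` are hypothetical names, and the lookup here is a plain linear scan over a small table. Constant-time resolution would need a hash table, or each jump's target index cached after its first lookup.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical flattened instruction: a name plus an optional label operand. */
typedef struct {
  const char *op;      /* e.g. "LABEL", "JMP", "ADD" */
  const char *target;  /* label name for LABEL and jumps, else NULL */
} LabelDemoInstr;

/* Mirrors the { name, index } pairs stored in JSMCode.labels. */
typedef struct {
  const char *name;
  uint32_t index;
} LabelDemoEntry;

/* Records the instruction index of every LABEL.
   Returns the number of entries written to `out`. */
static uint32_t collect_labels(const LabelDemoInstr *code, uint32_t count,
                               LabelDemoEntry *out, uint32_t cap) {
  uint32_t n = 0;
  for (uint32_t i = 0; i < count; i++) {
    if (strcmp(code[i].op, "LABEL") == 0) {
      assert(n < cap);  /* caller must size the table for all labels */
      out[n].name = code[i].target;
      out[n].index = i;
      n++;
    }
  }
  return n;
}

/* Resolves a label name to its instruction index by linear scan.
   Returns UINT32_MAX when the label does not exist. */
static uint32_t find_label(const LabelDemoEntry *labels, uint32_t n,
                           const char *name) {
  for (uint32_t i = 0; i < n; i++)
    if (strcmp(labels[i].name, name) == 0) return labels[i].index;
  return UINT32_MAX;
}
```

Applied to the loop example above, `loop_start` resolves to index 0 and `loop_end` to index 4, so `["JMP", "loop_start"]` sets the program counter to 0.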

Differences from Mach

| Property | Mcode | Mach |
|---|---|---|
| Instructions | cJSON arrays | 32-bit binary |
| Dispatch | String comparison | Switch on opcode byte |
| Constants | Inline in JSON | Separate constant pool |
| Jump targets | Named labels | Numeric offsets |
| Memory | Heap (cJSON nodes) | Off-heap (malloc) |
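The dispatch row can be made concrete with a small sketch that translates an instruction name into a numeric opcode once, then switches on the number afterwards. The enum values here are invented for illustration, since this document does not give the real Mach opcode numbering, and `demo_opcode_from_name` and `demo_is_jump` are hypothetical helpers.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical numbering; not the real Mach opcode values. */
typedef enum {
  OP_LOADK, OP_ADD, OP_JMP, OP_JMPFALSE, OP_CALL, OP_INVALID
} DemoOpcode;

/* Names listed in the same order as the enum above. */
static const char *const demo_op_names[] = {
  "LOADK", "ADD", "JMP", "JMPFALSE", "CALL"
};

/* Mcode-style step: string comparison, done once per instruction
   when translating rather than on every execution. */
static DemoOpcode demo_opcode_from_name(const char *name) {
  assert(name != NULL);
  size_t n = sizeof demo_op_names / sizeof demo_op_names[0];
  for (size_t i = 0; i < n; i++)
    if (strcmp(name, demo_op_names[i]) == 0) return (DemoOpcode)i;
  return OP_INVALID;
}

/* Mach-style step: a switch on a small integer, no string work. */
static int demo_is_jump(DemoOpcode op) {
  switch (op) {
  case OP_JMP:
  case OP_JMPFALSE:
    return 1;
  default:
    return 0;
  }
}
```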

Purpose

Mcode serves as an inspectable, debuggable intermediate format:

  • Human-readable: the JSON representation can be printed and examined
  • Language-independent: any tool that produces the correct JSON can target the ƿit runtime
  • Compilation target: the planned Mach compiler and future native code generators can both consume the same mcode representation as input

The cost of string-based dispatch makes mcode slower than the binary Mach VM, so it is primarily useful during development and as a compilation intermediate rather than for production execution.