Instruction set reference for the JSON-based intermediate representation
Overview
Mcode is the intermediate representation at the center of the ƿit compilation pipeline. All source code is lowered to mcode before execution or native compilation. The mcode instruction set is the authoritative reference for the operations supported by the ƿit runtime — the Mach VM bytecode is a direct binary encoding of these same instructions.
Mcode is produced by mcode.cm, optimized by streamline.cm, then either serialized to 32-bit bytecode for the Mach VM (mach.c), or lowered to QBE/LLVM IL for native compilation (qbe_emit.cm). See Compilation Pipeline for the full overview.
Module Structure
An .mcode file is a JSON object representing a compiled module:
| Field | Type | Description |
|---|---|---|
| name | string | Module name (typically the source filename) |
| filename | string | Source filename |
| data | object | Constant pool: string and number literals used by instructions |
| main | function | The top-level function (module body) |
| functions | array | Nested function definitions (referenced by function dest, id) |
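As a rough illustration of the module layout above, the following Python sketch parses a small module and checks that the top-level fields are present. The module contents, the `<main>` function name, and the helpers `load_module` and `all_functions` are hypothetical and are not part of the ƿit toolchain.

```python
import json

# Hypothetical minimal module. Field names follow the table above;
# the concrete values (including the "<main>" name) are invented.
MODULE_TEXT = """
{
  "name": "demo.cm",
  "filename": "demo.cm",
  "data": {},
  "main": {
    "name": "<main>",
    "filename": "demo.cm",
    "nr_args": 0,
    "nr_slots": 2,
    "nr_close_slots": 0,
    "disruption_pc": 0,
    "instructions": [["int", 1, 42, 1, 1], ["return", 1, 1, 1]]
  },
  "functions": []
}
"""

REQUIRED_FIELDS = ("name", "filename", "data", "main", "functions")


def load_module(text):
    """Parse an .mcode document and check the top-level fields exist."""
    module = json.loads(text)
    missing = [field for field in REQUIRED_FIELDS if field not in module]
    if missing:
        raise ValueError(f"missing module fields: {missing}")
    return module


def all_functions(module):
    """Return main followed by the nested functions."""
    return [module["main"]] + list(module["functions"])


module = load_module(MODULE_TEXT)
```

Here `all_functions(module)` returns a one-element list, since the example module has no nested functions.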
Function Record
Each function (both main and entries in functions) has:
| Field | Type | Description |
|---|---|---|
| name | string | Function name ("<anonymous>" for lambdas) |
| filename | string | Source filename |
| nr_args | integer | Number of parameters |
| nr_slots | integer | Total register slots needed (args + locals + temporaries) |
| nr_close_slots | integer | Number of closure slots captured from parent scope |
| disruption_pc | integer | Instruction index of the disruption handler (0 if none) |
| instructions | array | Instruction arrays and label strings |
Slot 0 is reserved. Slots 1 through nr_args hold parameters. Remaining slots up to nr_slots - 1 are locals and temporaries.
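The slot layout can be expressed as a small classifier. This is only a sketch of the rule stated above; the function name `classify_slot` and its return strings are made up for illustration.

```python
def classify_slot(slot, nr_args, nr_slots):
    """Classify a register slot using the layout rule described above."""
    if not 0 <= slot < nr_slots:
        raise ValueError(f"slot {slot} is outside 0..{nr_slots - 1}")
    if slot == 0:
        return "reserved"
    if slot <= nr_args:
        return "parameter"
    return "local or temporary"


# A function with 2 parameters and 5 slots in total:
layout = [classify_slot(s, nr_args=2, nr_slots=5) for s in range(5)]
```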
Instruction Format
Each instruction is a JSON array. The first element is the instruction name (a string), followed by its operands. The last two elements are the line and column numbers used for source mapping.
Operands are register slot numbers (integers), constant values (strings, numbers), or label names (strings).
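A minimal sketch of how a consumer might split an instruction array into its parts, assuming only the layout described above (name first, line and column last). The helper name `split_instruction` and the sample instructions are hypothetical.

```python
def split_instruction(instr):
    """Split an mcode instruction array into (name, operands, line, col)."""
    if not isinstance(instr, list) or len(instr) < 3:
        raise ValueError("expected [name, ...operands, line, col]")
    name, *operands, line, col = instr
    return name, operands, line, col


# Hypothetical instruction: add_int dest=3, a=1, b=2 at line 10, column 5.
parsed = split_instruction(["add_int", 3, 1, 2, 10, 5])

# An instruction without operands still carries its source position.
bare = split_instruction(["disrupt", 7, 3])
```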
Instruction Reference
Loading and Constants
| Instruction | Operands | Description |
|---|---|---|
| access | dest, name | Load variable by name (intrinsic or environment) |
| int | dest, value | Load integer constant |
| true | dest | Load boolean true |
| false | dest | Load boolean false |
| null | dest | Load null |
| move | dest, src | Copy register value |
| function | dest, id | Load nested function by index |
| regexp | dest, pattern | Create regexp object |
Arithmetic — Integer
| Instruction | Operands | Description |
|---|---|---|
| add_int | dest, a, b | dest = a + b (integer) |
| sub_int | dest, a, b | dest = a - b (integer) |
| mul_int | dest, a, b | dest = a * b (integer) |
| div_int | dest, a, b | dest = a / b (integer) |
| mod_int | dest, a, b | dest = a % b (integer) |
| neg_int | dest, src | dest = -src (integer) |
Arithmetic — Float
| Instruction | Operands | Description |
|---|---|---|
| add_float | dest, a, b | dest = a + b (float) |
| sub_float | dest, a, b | dest = a - b (float) |
| mul_float | dest, a, b | dest = a * b (float) |
| div_float | dest, a, b | dest = a / b (float) |
| mod_float | dest, a, b | dest = a % b (float) |
| neg_float | dest, src | dest = -src (float) |
Arithmetic — Generic
| Instruction | Operands | Description |
|---|---|---|
| pow | dest, a, b | dest = a ^ b (exponentiation) |
Text
| Instruction | Operands | Description |
|---|---|---|
| concat | dest, a, b | dest = a ~ b (text concatenation) |
Comparison — Integer
| Instruction | Operands | Description |
|---|---|---|
| eq_int | dest, a, b | dest = a == b (integer) |
| ne_int | dest, a, b | dest = a != b (integer) |
| lt_int | dest, a, b | dest = a < b (integer) |
| le_int | dest, a, b | dest = a <= b (integer) |
| gt_int | dest, a, b | dest = a > b (integer) |
| ge_int | dest, a, b | dest = a >= b (integer) |
Comparison — Float
| Instruction | Operands | Description |
|---|---|---|
| eq_float | dest, a, b | dest = a == b (float) |
| ne_float | dest, a, b | dest = a != b (float) |
| lt_float | dest, a, b | dest = a < b (float) |
| le_float | dest, a, b | dest = a <= b (float) |
| gt_float | dest, a, b | dest = a > b (float) |
| ge_float | dest, a, b | dest = a >= b (float) |
Comparison — Text
| Instruction | Operands | Description |
|---|---|---|
| eq_text | dest, a, b | dest = a == b (text) |
| ne_text | dest, a, b | dest = a != b (text) |
| lt_text | dest, a, b | dest = a < b (lexicographic) |
| le_text | dest, a, b | dest = a <= b (lexicographic) |
| gt_text | dest, a, b | dest = a > b (lexicographic) |
| ge_text | dest, a, b | dest = a >= b (lexicographic) |
Comparison — Boolean
| Instruction | Operands | Description |
|---|---|---|
| eq_bool | dest, a, b | dest = a == b (boolean) |
| ne_bool | dest, a, b | dest = a != b (boolean) |
Comparison — Special
| Instruction | Operands | Description |
|---|---|---|
| is_identical | dest, a, b | Object identity check (same reference) |
| eq_tol | dest, a, b | Equality with tolerance |
| ne_tol | dest, a, b | Inequality with tolerance |
Type Checks
Inlined from intrinsic function calls. Each sets dest to true or false.
| Instruction | Operands | Description |
|---|---|---|
| is_int | dest, src | Check if integer |
| is_num | dest, src | Check if number (integer or float) |
| is_text | dest, src | Check if text |
| is_bool | dest, src | Check if logical |
| is_null | dest, src | Check if null |
| is_array | dest, src | Check if array |
| is_func | dest, src | Check if function |
| is_record | dest, src | Check if record (object) |
| is_stone | dest, src | Check if stone (immutable) |
| is_proxy | dest, src | Check if function proxy (arity 2) |
Logical
| Instruction | Operands | Description |
|---|---|---|
| not | dest, src | Logical NOT |
| and | dest, a, b | Logical AND |
| or | dest, a, b | Logical OR |
Bitwise
| Instruction | Operands | Description |
|---|---|---|
| bitand | dest, a, b | Bitwise AND |
| bitor | dest, a, b | Bitwise OR |
| bitxor | dest, a, b | Bitwise XOR |
| bitnot | dest, src | Bitwise NOT |
| shl | dest, a, b | Shift left |
| shr | dest, a, b | Arithmetic shift right |
| ushr | dest, a, b | Unsigned shift right |
Property Access
Memory operations come in typed variants. The compiler selects the appropriate variant based on type_tag and access_kind annotations from parse and fold.
| Instruction | Operands | Description |
|---|---|---|
| load_field | dest, obj, key | Load record property by string key |
| store_field | obj, val, key | Store record property by string key |
| load_index | dest, obj, idx | Load array element by integer index |
| store_index | obj, val, idx | Store array element by integer index |
| load_dynamic | dest, obj, key | Load property (dispatches at runtime) |
| store_dynamic | obj, val, key | Store property (dispatches at runtime) |
| delete | obj, key | Delete property |
| in | dest, obj, key | Check if property exists |
| length | dest, src | Get length of array or text |
Object and Array Construction
| Instruction | Operands | Description |
|---|---|---|
| record | dest | Create empty record {} |
| array | dest, n | Create empty array (elements added via push) |
| push | arr, val | Push value to array |
| pop | dest, arr | Pop value from array |
Function Calls
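A function call is decomposed into three instructions (frame, one setarg per argument, and invoke); goframe and goinvoke take the place of frame and invoke for async/concurrent calls: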
| Instruction | Operands | Description |
|---|---|---|
| frame | dest, fn, argc | Allocate call frame for fn with argc arguments |
| setarg | frame, idx, val | Set argument idx in call frame |
| invoke | frame, result | Execute the call, store result |
| goframe | dest, fn, argc | Allocate frame for async/concurrent call |
| goinvoke | frame, result | Invoke async/concurrent call |
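The decomposition can be sketched as a small emitter that produces the frame/setarg/invoke sequence for one call. The helper `emit_call` is hypothetical, and the index of the first argument is an assumption: it defaults to 1 here to mirror the slot layout in which parameters start at slot 1, but this page does not state the real convention.

```python
def emit_call(frame_slot, fn_slot, arg_slots, result_slot, line, col,
              first_index=1):
    """Emit frame/setarg/invoke instructions for a single call.

    first_index is an ASSUMPTION (not confirmed by the reference):
    it is the idx operand used for the first argument.
    """
    code = [["frame", frame_slot, fn_slot, len(arg_slots), line, col]]
    for offset, val in enumerate(arg_slots):
        code.append(["setarg", frame_slot, first_index + offset, val, line, col])
    code.append(["invoke", frame_slot, result_slot, line, col])
    return code


# Call the function in slot 2 with arguments from slots 3 and 4,
# using slot 5 for the frame and slot 6 for the result.
call = emit_call(5, 2, [3, 4], 6, line=9, col=1)
```

An async variant would presumably swap in goframe and goinvoke at the same positions, since their operands match frame and invoke in the table.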
Variable Resolution
| Instruction | Operands | Description |
|---|---|---|
| access | dest, name | Load variable (intrinsic or module environment) |
| get | dest, level, slot | Get closure variable from parent scope |
| put | level, slot, src | Set closure variable in parent scope |
Control Flow
| Instruction | Operands | Description |
|---|---|---|
| LABEL | name | Define a named label (not executed) |
| jump | label | Unconditional jump |
| jump_true | cond, label | Jump if cond is true |
| jump_false | cond, label | Jump if cond is false |
| jump_not_null | val, label | Jump if val is not null |
| return | src | Return value from function |
| disrupt | (none) | Trigger disruption (error) |
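To show how the control-flow instructions interact with registers and labels, here is a toy Python interpreter for a handful of opcodes from this reference. It is a sketch and not the Mach VM: it assumes labels appear as bare strings in the instruction array, and it implements only the opcodes used by the sample loop.

```python
def run(instructions, nr_slots):
    """Interpret a tiny subset of mcode and return the returned value."""
    # Labels are assumed to be bare strings; map each to its array index.
    labels = {item: i for i, item in enumerate(instructions)
              if isinstance(item, str)}
    regs = [None] * nr_slots
    pc = 0
    while pc < len(instructions):
        instr = instructions[pc]
        pc += 1
        if isinstance(instr, str):
            continue  # labels are not executed
        op, *args = instr[:-2]  # drop trailing line and column
        if op == "int":
            regs[args[0]] = args[1]
        elif op == "add_int":
            regs[args[0]] = regs[args[1]] + regs[args[2]]
        elif op == "lt_int":
            regs[args[0]] = regs[args[1]] < regs[args[2]]
        elif op == "jump":
            pc = labels[args[0]]
        elif op == "jump_false":
            if regs[args[0]] is False:
                pc = labels[args[1]]
        elif op == "return":
            return regs[args[0]]
        else:
            raise NotImplementedError(op)
    return None


# Sum the integers 0..4. Slots: 1 = i, 2 = sum, 3 = limit, 4 = one, 5 = cond.
program = [
    ["int", 1, 0, 1, 1],
    ["int", 2, 0, 1, 1],
    ["int", 3, 5, 1, 1],
    ["int", 4, 1, 1, 1],
    "loop",
    ["lt_int", 5, 1, 3, 2, 1],
    ["jump_false", 5, "done", 2, 1],
    ["add_int", 2, 2, 1, 3, 1],
    ["add_int", 1, 1, 4, 4, 1],
    ["jump", "loop", 5, 1],
    "done",
    ["return", 2, 6, 1],
]
result = run(program, nr_slots=6)  # 0 + 1 + 2 + 3 + 4 = 10
```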
Typed Instruction Design
A key design principle of mcode is that every type check is an explicit instruction. Arithmetic and comparison operations come in type-specialized variants (add_int, add_float, eq_text, etc.) rather than a single polymorphic instruction.
When type information is available from the fold stage, the compiler emits the typed variant directly. When the type is unknown, the compiler emits a type-check/dispatch pattern that tests the operand types at runtime and branches to the matching typed instruction.
The Streamline Optimizer eliminates dead branches when types are statically known, collapsing the dispatch to a single typed instruction.
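The exact sequence emitted by mcode.cm is not reproduced on this page, so the following Python sketch shows only one plausible shape for such a dispatch, built from instructions documented in this reference. The helper `emit_add`, the label names, the use of a scratch slot, and the choice to disrupt when neither branch applies are all assumptions.

```python
def emit_add(dest, a, b, tmp, line, col, known_type=None, label_id=0):
    """Emit an addition, typed directly or via a runtime dispatch.

    ASSUMPTIONS: label naming, the scratch slot tmp, and falling back to
    disrupt for non-numeric operands are illustrative, not confirmed.
    """
    pos = [line, col]
    if known_type in ("int", "float"):
        # Type known statically: a single typed instruction suffices.
        return [[f"add_{known_type}", dest, a, b, *pos]]
    to_float = f"add_float_{label_id}"
    fail = f"add_fail_{label_id}"
    done = f"add_done_{label_id}"
    return [
        ["is_int", tmp, a, *pos],
        ["jump_false", tmp, to_float, *pos],
        ["is_int", tmp, b, *pos],
        ["jump_false", tmp, to_float, *pos],
        ["add_int", dest, a, b, *pos],
        ["jump", done, *pos],
        to_float,
        ["is_num", tmp, a, *pos],
        ["jump_false", tmp, fail, *pos],
        ["is_num", tmp, b, *pos],
        ["jump_false", tmp, fail, *pos],
        ["add_float", dest, a, b, *pos],
        ["jump", done, *pos],
        fail,
        ["disrupt", *pos],
        done,
    ]


typed = emit_add(3, 1, 2, 4, 7, 4, known_type="int")
generic = emit_add(3, 1, 2, 4, 7, 4)
```

With `known_type="int"` the result is the single `add_int` instruction, which is the form the dispatch collapses to once the types are statically known.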
Intrinsic Inlining
The mcode compiler recognizes calls to built-in intrinsic functions and emits direct opcodes instead of the generic frame/setarg/invoke call sequence:
| Source call | Emitted instruction |
|---|---|
| is_array(x) | is_array dest, src |
| is_function(x) | is_func dest, src |
| is_object(x) | is_record dest, src |
| is_stone(x) | is_stone dest, src |
| is_integer(x) | is_int dest, src |
| is_text(x) | is_text dest, src |
| is_number(x) | is_num dest, src |
| is_logical(x) | is_bool dest, src |
| is_null(x) | is_null dest, src |
| length(x) | length dest, src |
| push(arr, val) | push arr, val |
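The table above amounts to a lookup from intrinsic name to opcode. A minimal Python sketch of that lowering step follows; the helper `lower_intrinsic` and its `None` fallback convention are hypothetical, while the name-to-opcode pairs are taken directly from the table.

```python
# Intrinsic name -> opcode, for the unary "dest, src" intrinsics in the table.
UNARY_INTRINSICS = {
    "is_array": "is_array",
    "is_function": "is_func",
    "is_object": "is_record",
    "is_stone": "is_stone",
    "is_integer": "is_int",
    "is_text": "is_text",
    "is_number": "is_num",
    "is_logical": "is_bool",
    "is_null": "is_null",
    "length": "length",
}


def lower_intrinsic(name, dest, arg_slots, line, col):
    """Return inlined instructions for a known intrinsic, else None.

    A None result means the caller should fall back to the generic
    frame/setarg/invoke sequence.
    """
    if name in UNARY_INTRINSICS and len(arg_slots) == 1:
        return [[UNARY_INTRINSICS[name], dest, arg_slots[0], line, col]]
    if name == "push" and len(arg_slots) == 2:
        return [["push", arg_slots[0], arg_slots[1], line, col]]
    return None


inlined = lower_intrinsic("is_function", 4, [1], 3, 9)
not_inlined = lower_intrinsic("my_helper", 4, [1], 3, 9)
```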
Function Proxy Decomposition
When the compiler encounters a method call obj.method(args), it emits a branching pattern to handle ƿit's function proxy protocol. An arity-2 function used as a proxy target receives the method name and the argument array instead of a normal method call.
Labels are collected into a name-to-index map during loading, enabling O(1) jump resolution. The Mach serializer converts label names to numeric offsets in the binary bytecode.
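A sketch of that label handling in Python, under two assumptions that this page does not confirm: labels appear as bare strings in the instruction array, and a numeric offset is the absolute index of the target among the non-label instructions. The helper `resolve_labels` and the operand-position table are hypothetical.

```python
# Position of the label operand within each jump instruction array.
JUMP_LABEL_OPERAND = {
    "jump": 1,
    "jump_true": 2,
    "jump_false": 2,
    "jump_not_null": 2,
}


def resolve_labels(instructions):
    """Strip label strings and rewrite jump targets to numeric offsets."""
    offsets = {}
    real = []
    for item in instructions:
        if isinstance(item, str):
            # A label names the next real instruction.
            offsets[item] = len(real)
        else:
            real.append(list(item))
    for instr in real:
        pos = JUMP_LABEL_OPERAND.get(instr[0])
        if pos is not None:
            instr[pos] = offsets[instr[pos]]  # O(1) dictionary lookup
    return real, offsets


code = [
    ["jump", "end", 1, 1],
    "mid",
    ["null", 1, 2, 1],
    "end",
    ["return", 1, 3, 1],
]
resolved, offsets = resolve_labels(code)
```

Any other bare string in the array, such as an optimizer nop, would also land in the map under this sketch, which is harmless because no jump refers to it.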
Nop Convention
The streamline optimizer replaces eliminated instructions with nop strings (e.g., _nop_tc_1, _nop_bl_2). Nop strings are skipped during interpretation and native code emission but preserved in the instruction array to maintain positional stability for jump targets.
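The point of the convention is that replacing an instruction in place leaves every other index unchanged. A small Python sketch, assuming that nops can be recognized by the `_nop_` prefix seen in the example names; the helpers and the sample instructions are hypothetical.

```python
NOP_PREFIX = "_nop_"  # assumed from the example names _nop_tc_1, _nop_bl_2


def is_nop(item):
    """True for optimizer nop strings, False for labels and instructions."""
    return isinstance(item, str) and item.startswith(NOP_PREFIX)


def replace_with_nop(instructions, index, tag, serial):
    """Overwrite one instruction in place so later indexes do not shift."""
    instructions[index] = f"{NOP_PREFIX}{tag}_{serial}"


def live_instructions(instructions):
    """Return (index, instruction) pairs, skipping nops and labels."""
    return [(i, item) for i, item in enumerate(instructions)
            if isinstance(item, list)]


code = [
    ["is_int", 3, 1, 1, 1],
    ["jump_false", 3, "slow", 1, 1],
    ["add_int", 2, 1, 1, 1, 1],
    "slow",
    ["return", 2, 2, 1],
]
# Pretend the optimizer proved slot 1 is an integer and removed the check.
replace_with_nop(code, 0, "tc", 1)
replace_with_nop(code, 1, "tc", 2)
live = live_instructions(code)
```

After the two replacements the array still has five entries, and `add_int` and `return` keep their original indexes 2 and 4.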
Internal Structures
JSMCode (Mcode Interpreter)
struct JSMCode {
    uint16_t nr_args;            // argument count
    uint16_t nr_slots;           // register count
    cJSON **instrs;              // instruction array
    uint32_t instr_count;        // number of instructions
    struct {
        const char *name;        // label name
        uint32_t index;          // instruction index
    } *labels;
    uint32_t label_count;
    struct JSMCode **functions;  // nested functions
    uint32_t func_count;
    cJSON *json_root;            // keeps JSON alive
    const char *name;            // function name
    const char *filename;        // source file
    uint16_t disruption_pc;      // disruption handler offset
};
JSCodeRegister (Mach VM Bytecode)
struct JSCodeRegister {
    uint16_t arity;              // argument count
    uint16_t nr_slots;           // total register count
    uint32_t cpool_count;        // constant pool size
    JSValue *cpool;              // constant pool
    uint32_t instr_count;        // instruction count
    MachInstr32 *instructions;   // 32-bit instruction array
    uint32_t func_count;         // nested function count
    JSCodeRegister **functions;  // nested function table
    JSValue name;                // function name
    uint16_t disruption_pc;      // disruption handler offset
};
The Mach serializer (mach.c) converts the JSON mcode into compact 32-bit instructions with a constant pool. See Register VM for the binary encoding formats.