| title | description |
|---|---|
| Register VM | Binary encoding of the Mach bytecode interpreter |
Overview
The Mach VM is a register-based virtual machine that interprets the Mcode IR instruction set directly, encoded as compact 32-bit binary bytecode. It is modeled after Lua's register VM: operands are register indices rather than stack positions, which reduces the instruction count compared with a stack-based design and improves performance.
The Mach serializer (mach.c) converts streamlined mcode JSON into binary instructions. Since the Mach bytecode is a direct encoding of the mcode, the Mcode IR reference is the authoritative instruction set documentation.
Instruction Formats
All instructions are 32 bits wide. Four encoding formats are used:
iABC — Three-Register
[op: 8][A: 8][B: 8][C: 8]
Used for operations on three registers: R(A) = R(B) op R(C).
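As a rough illustration of the iABC layout, the sketch below packs and unpacks the four 8-bit fields. The bit order is an assumption (op in the lowest byte, then A, B, C, as in Lua); the document does not state which end of the word holds the opcode. The `demo_` helper names are hypothetical and not taken from mach.c.

```c
#include <assert.h>
#include <stdint.h>

typedef uint32_t MachInstr32;

/* Assumed layout (not confirmed by the source):
 * bits 0-7 = op, bits 8-15 = A, bits 16-23 = B, bits 24-31 = C. */
static inline MachInstr32 demo_pack_abc(unsigned op, unsigned a,
                                        unsigned b, unsigned c) {
    /* Every field must fit in 8 bits. */
    assert(op <= 0xFF && a <= 0xFF && b <= 0xFF && c <= 0xFF);
    return (MachInstr32)op
         | ((MachInstr32)a << 8)
         | ((MachInstr32)b << 16)
         | ((MachInstr32)c << 24);
}

/* Field extractors for the same assumed layout. */
static inline unsigned demo_op(MachInstr32 i) { return i & 0xFF; }
static inline unsigned demo_a(MachInstr32 i)  { return (i >> 8) & 0xFF; }
static inline unsigned demo_b(MachInstr32 i)  { return (i >> 16) & 0xFF; }
static inline unsigned demo_c(MachInstr32 i)  { return (i >> 24) & 0xFF; }
```

With this layout, an interpreter loop would read `demo_op(i)` to dispatch and then use the A, B, and C fields as indices into the current frame's register array.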
iABx — Register + Constant
[op: 8][A: 8][Bx: 16]
Used for loading constants: R(A) = K(Bx).
iAsBx — Register + Signed Offset
[op: 8][A: 8][sBx: 16]
Used for conditional jumps: if R(A) then jump by sBx.
isJ — Signed Jump
[op: 8][sJ: 24]
Used for unconditional jumps with a 24-bit signed offset.
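The signed fields need sign extension when decoded. The sketch below shows one way to do that for the 16-bit sBx and 24-bit sJ fields. It assumes two's-complement storage and an op-in-low-byte layout; the actual encoding could differ (Lua, for example, uses a biased representation), and the `demo_` names are hypothetical.

```c
#include <assert.h>
#include <stdint.h>

typedef uint32_t MachInstr32;

/* Assumed layout: op in bits 0-7, A in bits 8-15, sBx in bits 16-31. */
static inline MachInstr32 demo_pack_asbx(unsigned op, unsigned a, int32_t sbx) {
    assert(op <= 0xFF && a <= 0xFF);
    assert(sbx >= INT16_MIN && sbx <= INT16_MAX);
    return (MachInstr32)op | ((MachInstr32)a << 8)
         | ((MachInstr32)(uint16_t)sbx << 16);
}

/* Assumed layout: op in bits 0-7, sJ in bits 8-31. */
static inline MachInstr32 demo_pack_sj(unsigned op, int32_t sj) {
    assert(op <= 0xFF);
    assert(sj >= -(1 << 23) && sj < (1 << 23));
    return (MachInstr32)op | (((MachInstr32)sj & 0xFFFFFFu) << 8);
}

/* Portable sign extension: flip the sign bit, then subtract its weight. */
static inline int32_t demo_sbx(MachInstr32 i) {
    uint32_t u = i >> 16;                        /* 16-bit field */
    return (int32_t)(u ^ 0x8000u) - 0x8000;
}

static inline int32_t demo_sj(MachInstr32 i) {
    uint32_t u = i >> 8;                         /* 24-bit field */
    return (int32_t)(u ^ 0x800000u) - 0x800000;
}
```

Under these assumptions a conditional jump can move up to 32767 instructions forward or 32768 back, while an unconditional jump reaches roughly 8.4 million instructions in either direction.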
Registers
Each function frame has a fixed number of register slots, determined at compile time:
- R(0): `this` binding
- R(1)..R(arity): function arguments
- R(arity+1)..: local variables and temporaries
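The frame layout above can be expressed as a pair of index helpers. This is only a sketch: the helper names are hypothetical, and the zero-based numbering of arguments and locals is an assumption made for the example.

```c
#include <assert.h>
#include <stdint.h>

/* R(0) always holds the this binding. */
enum { DEMO_REG_THIS = 0 };

/* Register holding argument i (0-based), i.e. one of R(1)..R(arity). */
static inline uint16_t demo_arg_reg(uint16_t arity, uint16_t i) {
    assert(i < arity);
    return (uint16_t)(1u + i);
}

/* Register holding the j-th local or temporary (0-based),
 * i.e. R(arity+1) onward, bounded by the frame's slot count. */
static inline uint16_t demo_local_reg(uint16_t arity, uint16_t nr_slots,
                                      uint16_t j) {
    uint32_t r = (uint32_t)arity + 1u + j;
    assert(r < nr_slots);
    return (uint16_t)r;
}
```

For a two-argument function with five slots, the arguments land in R(1) and R(2), and the first two locals in R(3) and R(4).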
JSCodeRegister
The compiled output for a function:
struct JSCodeRegister {
    uint16_t arity;               // argument count
    uint16_t nr_slots;            // total register count
    uint32_t cpool_count;         // constant pool size
    JSValue *cpool;               // constant pool
    uint32_t instr_count;         // instruction count
    MachInstr32 *instructions;    // 32-bit instruction array
    uint32_t func_count;          // nested function count
    JSCodeRegister **functions;   // nested function table
    JSValue name;                 // function name
    uint16_t disruption_pc;       // disruption handler offset
};
The constant pool holds all non-immediate values referenced by LOADK instructions: strings, large numbers, and other constants.
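To make the relationship between LOADK and the constant pool concrete, here is a minimal sketch of executing `R(A) = K(Bx)` with bounds checks. It uses a cut-down stand-in struct rather than the real `JSCodeRegister`, a `double` in place of `JSValue`, and an assumed op-in-low-byte field layout; none of the `Demo` names come from the Mach sources.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

typedef uint32_t MachInstr32;
typedef double DemoValue;            /* stand-in for JSValue */

/* Only the fields this example needs (hypothetical subset). */
typedef struct {
    uint16_t nr_slots;               /* total register count */
    uint32_t cpool_count;            /* constant pool size */
    const DemoValue *cpool;          /* constant pool */
} DemoCode;

/* Executes R(A) = K(Bx) for an iABx instruction.
 * Assumed layout: A in bits 8-15, Bx in bits 16-31.
 * Returns false if either index is out of range. */
static bool demo_exec_loadk(const DemoCode *code, DemoValue *regs,
                            MachInstr32 instr) {
    assert(code != 0 && regs != 0);
    uint32_t a  = (instr >> 8) & 0xFF;
    uint32_t bx = instr >> 16;
    if (a >= code->nr_slots || bx >= code->cpool_count)
        return false;
    regs[a] = code->cpool[bx];
    return true;
}
```

Because Bx is 16 bits wide, a single LOADK can address up to 65536 constant pool entries in this layout.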
Constant Pool Index Overflow
Named property instructions (LOAD_FIELD, STORE_FIELD, DELETE) use the iABC format, where the constant pool key index occupies an 8-bit field (max 255). When a key's index does not fit in that field, for example because a function references more than 256 unique property names, the serializer automatically falls back to a two-instruction sequence:
1. LOADK tmp, key_index: load the key string into a temporary register (iABx, 16-bit index)
2. LOAD_DYNAMIC / STORE_DYNAMIC / DELETEINDEX: use the register-based variant
This is transparent to the mcode compiler and streamline optimizer.
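The fallback can be sketched as an emit helper that chooses between the one-instruction and two-instruction forms. The opcode numbers, the operand order (A = destination, B = object, C = key), the bit layout, and the way the temporary register is supplied are all assumptions made for illustration; the real logic lives in mach.c and may differ.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

typedef uint32_t MachInstr32;

/* Hypothetical opcode numbers, for illustration only. */
enum { DEMO_OP_LOADK = 1, DEMO_OP_LOAD_FIELD = 2, DEMO_OP_LOAD_DYNAMIC = 3 };

/* Assumed layout: op in the low byte, then A, then B/C or Bx. */
static MachInstr32 demo_abc(unsigned op, unsigned a, unsigned b, unsigned c) {
    return (MachInstr32)op | ((MachInstr32)a << 8)
         | ((MachInstr32)b << 16) | ((MachInstr32)c << 24);
}

static MachInstr32 demo_abx(unsigned op, unsigned a, unsigned bx) {
    return (MachInstr32)op | ((MachInstr32)a << 8) | ((MachInstr32)bx << 16);
}

/* Emits a named property load into out[] and returns the number of
 * instructions written: 1 if key_index fits in 8 bits, otherwise 2. */
static size_t demo_emit_load_field(MachInstr32 *out, unsigned dest,
                                   unsigned obj, unsigned key_index,
                                   unsigned tmp) {
    assert(dest <= 0xFF && obj <= 0xFF && tmp <= 0xFF);
    assert(key_index <= 0xFFFF);     /* must fit the 16-bit Bx field */
    if (key_index <= 0xFF) {
        out[0] = demo_abc(DEMO_OP_LOAD_FIELD, dest, obj, key_index);
        return 1;
    }
    /* Fallback: load the key into a temporary, then use the
     * register-based variant. */
    out[0] = demo_abx(DEMO_OP_LOADK, tmp, key_index);
    out[1] = demo_abc(DEMO_OP_LOAD_DYNAMIC, dest, obj, tmp);
    return 2;
}
```

Keeping the decision inside the emitter is what would make the fallback invisible to earlier compiler stages: they always request a named property access and never see which form was chosen.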