| title | description |
|---|---|
| Register VM | Binary encoding of the Mach bytecode interpreter |
Overview
The Mach VM is a register-based virtual machine that directly interprets the Mcode IR instruction set, encoded as compact 32-bit binary bytecode. It is modeled after Lua's register VM: operands are register indices rather than stack positions, which reduces instruction count and improves performance.
The Mach serializer (mach.c) converts streamlined mcode JSON into binary instructions. Since the Mach bytecode is a direct encoding of the mcode, the Mcode IR reference is the authoritative instruction set documentation.
Instruction Formats
All instructions are 32 bits wide. Four encoding formats are used:
iABC — Three-Register
[op: 8][A: 8][B: 8][C: 8]
Used for operations on three registers: R(A) = R(B) op R(C).
iABx — Register + Constant
[op: 8][A: 8][Bx: 16]
Used for loading constants: R(A) = K(Bx).
iAsBx — Register + Signed Offset
[op: 8][A: 8][sBx: 16]
Used for conditional jumps: if R(A) then jump by sBx.
isJ — Signed Jump
[op: 8][sJ: 24]
Used for unconditional jumps with a 24-bit signed offset.
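The four layouts can be illustrated with a small set of pack and unpack helpers. This is only a sketch: the helper names are hypothetical, and it assumes that fields are packed from the least significant bit upward (op in bits 0-7) and that signed fields use two's complement. The actual bit order and signed encoding in mach.c may differ.

```c
#include <stdint.h>

typedef uint32_t MachInstr32;

/* ASSUMPTION: op occupies the low byte and later fields follow upward;
   signed fields are two's complement. Helper names are illustrative. */

static inline MachInstr32 enc_abc(uint8_t op, uint8_t a, uint8_t b, uint8_t c) {
    return (MachInstr32)op | ((MachInstr32)a << 8) |
           ((MachInstr32)b << 16) | ((MachInstr32)c << 24);
}

static inline MachInstr32 enc_abx(uint8_t op, uint8_t a, uint16_t bx) {
    return (MachInstr32)op | ((MachInstr32)a << 8) | ((MachInstr32)bx << 16);
}

static inline MachInstr32 enc_asbx(uint8_t op, uint8_t a, int16_t sbx) {
    /* Reuse the iABx layout; the 16-bit field carries the signed bits. */
    return enc_abx(op, a, (uint16_t)sbx);
}

/* sj must lie in [-8388608, 8388607] to fit the 24-bit field. */
static inline MachInstr32 enc_sj(uint8_t op, int32_t sj) {
    return (MachInstr32)op | (((MachInstr32)sj & 0xFFFFFFu) << 8);
}

static inline uint8_t  dec_op(MachInstr32 i) { return (uint8_t)(i & 0xFF); }
static inline uint8_t  dec_a(MachInstr32 i)  { return (uint8_t)((i >> 8) & 0xFF); }
static inline uint8_t  dec_b(MachInstr32 i)  { return (uint8_t)((i >> 16) & 0xFF); }
static inline uint8_t  dec_c(MachInstr32 i)  { return (uint8_t)(i >> 24); }
static inline uint16_t dec_bx(MachInstr32 i) { return (uint16_t)(i >> 16); }

static inline int16_t dec_sbx(MachInstr32 i) {
    /* Sign-extend bit 15 of the 16-bit field. */
    return (int16_t)((int32_t)((i >> 16) ^ 0x8000u) - 0x8000);
}

static inline int32_t dec_sj(MachInstr32 i) {
    /* Sign-extend bit 23 of the 24-bit field. */
    return (int32_t)((i >> 8) ^ 0x800000u) - 0x800000;
}
```

Under these assumptions, an iAsBx offset ranges over -32768..32767 and an isJ offset over -8388608..8388607, which is why unconditional jumps get their own wider format.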
Registers
Each function frame has a fixed number of register slots, determined at compile time:
- R(0): this binding
- R(1)..R(arity): function arguments
- R(arity+1)..: local variables and temporaries
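As a quick illustration of this layout, the register index for each kind of slot can be computed as below. The helper names are hypothetical and do not come from the Mach sources.

```c
#include <stdint.h>

/* Hypothetical helpers that mirror the frame layout described above. */

/* R(0) always holds the this binding. */
static inline uint16_t reg_this(void) { return 0; }

/* i is the 0-based argument position, with i < arity. */
static inline uint16_t reg_arg(uint16_t i) { return (uint16_t)(1 + i); }

/* j is the 0-based index among locals and temporaries. */
static inline uint16_t reg_local(uint16_t arity, uint16_t j) {
    return (uint16_t)(arity + 1 + j);
}
```

For a function with arity 2, the arguments occupy R(1) and R(2), and the first local lands in R(3).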
JSCodeRegister
The compiled output for a function:
struct JSCodeRegister {
    uint16_t arity;              // argument count
    uint16_t nr_slots;           // total register count
    uint32_t cpool_count;        // constant pool size
    JSValue *cpool;              // constant pool
    uint32_t instr_count;        // instruction count
    MachInstr32 *instructions;   // 32-bit instruction array
    uint32_t func_count;         // nested function count
    JSCodeRegister **functions;  // nested function table
    JSValue name;                // function name
    uint16_t disruption_pc;      // disruption handler offset
};
The constant pool holds all non-immediate values referenced by LOADK instructions: strings, large numbers, and other constants.
Constant Pool Index Overflow
Named property instructions (LOAD_FIELD, STORE_FIELD, DELETE) use the iABC format, where the constant pool key index occupies an 8-bit field (max 255). When a key's index does not fit in that field, for instance when a function references more than 256 unique property names, the serializer automatically falls back to a two-instruction sequence:
1. LOADK tmp, key_index: load the key string into a temporary register (iABx, 16-bit index)
2. LOAD_DYNAMIC/STORE_DYNAMIC/DELETEINDEX: use the register-based variant
This is transparent to the mcode compiler and streamline optimizer.
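The fallback can be sketched as an emitter for the load case. Everything here is illustrative: the opcode numbers, the operand order R(dest) = R(obj)[key], the field packing (op in the low byte), and the function names are assumptions, not the serializer's actual code.

```c
#include <stddef.h>
#include <stdint.h>

typedef uint32_t MachInstr32;

/* Hypothetical opcode numbers, for illustration only. */
enum { OP_LOADK = 1, OP_LOAD_FIELD = 2, OP_LOAD_DYNAMIC = 3 };

/* Assumed packing: op in the low byte, then A, then B and C (or Bx). */
static MachInstr32 pack_abc(uint8_t op, uint8_t a, uint8_t b, uint8_t c) {
    return (MachInstr32)op | ((MachInstr32)a << 8) |
           ((MachInstr32)b << 16) | ((MachInstr32)c << 24);
}

static MachInstr32 pack_abx(uint8_t op, uint8_t a, uint16_t bx) {
    return (MachInstr32)op | ((MachInstr32)a << 8) | ((MachInstr32)bx << 16);
}

/* Writes the instructions for R(dest) = R(obj)[K(key_index)] into out[]
   and returns how many were written (1 or 2). tmp is a scratch register
   supplied by the caller; out must have room for two instructions. */
static size_t emit_load_field(MachInstr32 *out, uint8_t dest, uint8_t obj,
                              uint16_t key_index, uint8_t tmp) {
    if (key_index <= 0xFF) {
        /* Key index fits the 8-bit C field: single iABC instruction. */
        out[0] = pack_abc(OP_LOAD_FIELD, dest, obj, (uint8_t)key_index);
        return 1;
    }
    /* Otherwise load the key through the 16-bit Bx field first,
       then use the register-based variant. */
    out[0] = pack_abx(OP_LOADK, tmp, key_index);
    out[1] = pack_abc(OP_LOAD_DYNAMIC, dest, obj, tmp);
    return 2;
}
```

The store and delete cases would presumably follow the same shape with STORE_DYNAMIC and DELETEINDEX as the second instruction.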
Arithmetic Dispatch
Arithmetic ops (ADD, SUB, MUL, DIV, MOD, POW) are executed inline without calling the polymorphic reg_vm_binop() helper. Because mcode's type guard dispatch guarantees that both operands are numbers, only two paths are needed:
- Int-int fast path: JS_VALUE_IS_BOTH_INT → native integer arithmetic with int32 overflow check. Overflow promotes to float64.
- Float fallback: JS_ToFloat64 → native floating-point operation. Non-finite results produce null.
DIV and MOD check for zero divisor (→ null). POW uses pow() with non-finite handling for finite inputs.
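The two paths can be sketched with a simplified tagged value. The Val type and every function name below are stand-ins invented for this example; the real VM operates on JSValue with JS_VALUE_IS_BOTH_INT and JS_ToFloat64. The sketch also assumes that DIV always computes in floating point, which the text above does not specify.

```c
#include <math.h>
#include <stdint.h>

/* Simplified stand-in for the VM's value type; the real JSValue differs. */
typedef enum { V_NULL, V_INT, V_FLOAT } VTag;
typedef struct {
    VTag tag;
    union { int32_t i; double f; } u;
} Val;

static Val v_null(void)      { Val v; v.tag = V_NULL;  v.u.i = 0; return v; }
static Val v_int(int32_t i)  { Val v; v.tag = V_INT;   v.u.i = i; return v; }
static Val v_float(double f) { Val v; v.tag = V_FLOAT; v.u.f = f; return v; }

/* Callers guarantee a numeric operand, as the type guards do in mcode. */
static double to_f64(Val v) { return v.tag == V_INT ? (double)v.u.i : v.u.f; }

/* Non-finite float results produce null. */
static Val finite_or_null(double r) {
    return isfinite(r) ? v_float(r) : v_null();
}

static Val vm_add(Val a, Val b) {
    if (a.tag == V_INT && b.tag == V_INT) {
        /* Int-int fast path: widen to 64 bits to detect int32 overflow. */
        int64_t r = (int64_t)a.u.i + (int64_t)b.u.i;
        if (r >= INT32_MIN && r <= INT32_MAX) return v_int((int32_t)r);
        return v_float((double)r); /* overflow promotes to float64 */
    }
    /* Float fallback. */
    return finite_or_null(to_f64(a) + to_f64(b));
}

static Val vm_div(Val a, Val b) {
    double d = to_f64(b);
    if (d == 0.0) return v_null(); /* zero divisor yields null */
    /* ASSUMPTION: division is always carried out in floating point. */
    return finite_or_null(to_f64(a) / d);
}
```

Widening to int64_t is one portable way to perform the overflow check; a compiler builtin such as `__builtin_add_overflow` would be an alternative where available.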
Comparison ops (EQ through GE) and bitwise ops still use reg_vm_binop() for their slow paths, as they handle a wider range of type combinations (string comparisons, null equality, etc.).
String Concatenation
CONCAT has a three-tier dispatch for self-assign patterns (concat R(A), R(A), R(C) where dest equals the left operand):
- In-place append: If R(A) is a mutable heap text (S bit clear) with length + rhs_length <= cap56, characters are appended directly. Zero allocation, zero GC.
- Growth allocation (JS_ConcatStringGrow): Allocates a new text with 2x capacity and does not stone the result, leaving it mutable for subsequent appends.
- Exact-fit stoned (JS_ConcatString): Used when dest differs from the left operand (normal non-self-assign concat).
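The tier selection can be modeled with a toy text object. The Text struct, the Tier enum, and both functions are hypothetical stand-ins for this example; the real heap text layout, JS_ConcatStringGrow, and JS_ConcatString are not reproduced here. The sketch assumes "2x capacity" means twice the combined length, and it leaves the old buffer in the growth tier unfreed, where the VM would rely on its garbage collector.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdlib.h>
#include <string.h>

/* Toy model of a heap text; the real object layout differs. */
typedef struct {
    size_t len, cap;
    bool stoned;   /* models the S bit */
    char *data;
} Text;

typedef enum { TIER_IN_PLACE, TIER_GROW, TIER_EXACT } Tier;

/* Allocates a text with the given capacity and copies len bytes of s.
   Requires cap >= len. */
static Text *text_new(const char *s, size_t len, size_t cap, bool stoned) {
    Text *t = malloc(sizeof *t);
    if (!t) return NULL;
    t->data = malloc(cap ? cap : 1);
    if (!t->data) { free(t); return NULL; }
    memcpy(t->data, s, len);
    t->len = len;
    t->cap = cap;
    t->stoned = stoned;
    return t;
}

/* Computes left + right and reports which tier was taken. self_assign is
   true when the destination register is also the left operand. */
static Text *concat(Text *left, const Text *right, bool self_assign, Tier *tier) {
    size_t total = left->len + right->len;
    if (self_assign && !left->stoned && total <= left->cap) {
        /* Tier 1: append into spare capacity, no allocation. */
        memcpy(left->data + left->len, right->data, right->len);
        left->len = total;
        *tier = TIER_IN_PLACE;
        return left;
    }
    /* Tier 2 over-allocates and stays mutable; tier 3 is exact-fit and stoned.
       ASSUMPTION: growth capacity is twice the combined length. */
    Text *t = self_assign ? text_new(left->data, left->len, total * 2, false)
                          : text_new(left->data, left->len, total, true);
    if (!t) return NULL;
    memcpy(t->data + t->len, right->data, right->len);
    t->len = total;
    *tier = self_assign ? TIER_GROW : TIER_EXACT;
    return t;
}
```

In a loop that repeatedly appends to the same register, the first overflow takes the growth tier and the following appends hit the in-place tier until the doubled capacity is exhausted, which keeps the total copying cost roughly linear in the final length.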
The stone_text instruction (iABC, B=0, C=0) sets the S bit on a mutable heap text in R(A). For non-pointer values or already-stoned text, it is a no-op. This instruction is emitted by the streamline optimizer at escape points; see Streamline — insert_stone_text and Stone Memory — Mutable Text.