---
title: "Register VM"
description: "Binary encoding of the Mach bytecode interpreter"
---

## Overview

The Mach VM is a register-based virtual machine that directly interprets the [Mcode IR](mcode.md) instruction set, encoded as compact 32-bit binary bytecode. It is modeled after Lua's register VM: operands are register indices rather than stack positions, which reduces the instruction count relative to a stack-based encoding and improves performance.

The Mach serializer (`mach.c`) converts streamlined mcode JSON into binary instructions. Because Mach bytecode is a direct encoding of mcode, the [Mcode IR](mcode.md) reference is the authoritative documentation for the instruction set.

## Instruction Formats

All instructions are 32 bits wide. Four encoding formats are used:

### iABC — Three-Register

```
[op: 8][A: 8][B: 8][C: 8]
```

Used for operations on three registers: `R(A) = R(B) op R(C)`.

### iABx — Register + Constant

```
[op: 8][A: 8][Bx: 16]
```

Used for loading constants: `R(A) = K(Bx)`.

### iAsBx — Register + Signed Offset

```
[op: 8][A: 8][sBx: 16]
```

Used for conditional jumps: test `R(A)` and, if the branch is taken, jump by the signed 16-bit offset `sBx`.

### isJ — Signed Jump

```
[op: 8][sJ: 24]
```

Used for unconditional jumps with a 24-bit signed offset.

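The four layouts above can be packed and unpacked with a few shifts and masks. The following is a minimal sketch, not the actual `mach.c` code. It assumes that fields are packed from the low bits upward (opcode in bits 0..7) and that signed fields are stored as two's complement; the document specifies only the field widths, so both choices are assumptions. All helper names (`mk_abc`, `get_sj`, and so on) are hypothetical.

```c
#include <assert.h>
#include <stdint.h>

typedef uint32_t MachInstr32;

/* ASSUMPTION: op occupies bits 0..7, A bits 8..15, B bits 16..23, C bits 24..31. */
static inline MachInstr32 mk_abc(uint8_t op, uint8_t a, uint8_t b, uint8_t c) {
    return (uint32_t)op | (uint32_t)a << 8 | (uint32_t)b << 16 | (uint32_t)c << 24;
}

static inline MachInstr32 mk_abx(uint8_t op, uint8_t a, uint16_t bx) {
    return (uint32_t)op | (uint32_t)a << 8 | (uint32_t)bx << 16;
}

/* ASSUMPTION: signed offsets are stored as two's complement. */
static inline MachInstr32 mk_asbx(uint8_t op, uint8_t a, int16_t sbx) {
    return (uint32_t)op | (uint32_t)a << 8 | (uint32_t)(uint16_t)sbx << 16;
}

static inline MachInstr32 mk_sj(uint8_t op, int32_t sj) {
    assert(sj >= -(1 << 23) && sj < (1 << 23)); /* must fit in 24 bits */
    return (uint32_t)op | ((uint32_t)sj & 0xFFFFFFu) << 8;
}

static inline uint8_t  get_op(MachInstr32 i) { return (uint8_t)(i & 0xFFu); }
static inline uint8_t  get_a(MachInstr32 i)  { return (uint8_t)(i >> 8); }
static inline uint8_t  get_b(MachInstr32 i)  { return (uint8_t)(i >> 16); }
static inline uint8_t  get_c(MachInstr32 i)  { return (uint8_t)(i >> 24); }
static inline uint16_t get_bx(MachInstr32 i) { return (uint16_t)(i >> 16); }

/* Sign-extend the 16-bit field: flip the sign bit, then subtract its weight. */
static inline int16_t get_sbx(MachInstr32 i) {
    uint32_t raw = i >> 16;
    return (int16_t)((int32_t)(raw ^ 0x8000u) - 0x8000);
}

/* Sign-extend the 24-bit field the same way. */
static inline int32_t get_sj(MachInstr32 i) {
    uint32_t raw = i >> 8;
    return (int32_t)(raw ^ 0x800000u) - 0x800000;
}
```

With this layout an interpreter loop would switch on `get_op(instr)` and read only the fields that the opcode's format defines.
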
## Registers

Each function frame has a fixed number of register slots, determined at compile time:

- **R(0)** — `this` binding
- **R(1)..R(arity)** — function arguments
- **R(arity+1)..** — local variables and temporaries

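The frame layout above maps to simple index arithmetic. The helpers below are a hypothetical illustration (none of these names come from the Mach source), and they assume that the total slot count includes `R(0)` and the argument registers.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical helpers illustrating the frame layout; not part of the Mach source. */
enum { REG_THIS = 0 }; /* R(0) holds the `this` binding */

/* Register holding the i-th argument (0-based): R(1)..R(arity). */
static inline uint16_t reg_for_arg(uint16_t arity, uint16_t i) {
    assert(i < arity);
    return (uint16_t)(1 + i);
}

/* First register available for locals and temporaries: R(arity+1). */
static inline uint16_t first_local_reg(uint16_t arity) {
    return (uint16_t)(arity + 1);
}

/* ASSUMPTION: nr_slots counts R(0) and the arguments as well as the locals. */
static inline uint16_t local_slot_count(uint16_t arity, uint16_t nr_slots) {
    assert(nr_slots >= arity + 1);
    return (uint16_t)(nr_slots - arity - 1);
}
```
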
## JSCodeRegister

The compiled output for a function:

```c
// Forward typedef so the struct can refer to itself by its short name.
typedef struct JSCodeRegister JSCodeRegister;

struct JSCodeRegister {
    uint16_t arity;              // argument count
    uint16_t nr_slots;           // total register count
    uint32_t cpool_count;        // constant pool size
    JSValue *cpool;              // constant pool
    uint32_t instr_count;        // instruction count
    MachInstr32 *instructions;   // 32-bit instruction array
    uint32_t func_count;         // nested function count
    JSCodeRegister **functions;  // nested function table
    JSValue name;                // function name
    uint16_t disruption_pc;      // disruption handler offset
};
```

The constant pool holds all non-immediate values referenced by `LOADK` instructions: strings, large numbers, and other constants.

### Constant Pool Index Overflow

Named property instructions (`LOAD_FIELD`, `STORE_FIELD`, `DELETE`) use the iABC format, in which the constant pool key index occupies an 8-bit field (maximum value 255). When a key's constant pool index exceeds 255 (for example, in a function that references more than 256 unique property names), the serializer automatically falls back to a two-instruction sequence:

1. `LOADK tmp, key_index` — load the key string into a temporary register (iABx, 16-bit index)
2. `LOAD_DYNAMIC` / `STORE_DYNAMIC` / `DELETEINDEX` — use the register-based variant

This fallback is transparent to the mcode compiler and the streamline optimizer.

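The fallback can be sketched as a single emit routine that picks the one-instruction or two-instruction form based on the key index. This is an illustration only: the opcode numbers, the operand order (A = destination, B = object, C = key), the low-bits-first packing, and the function names are all assumptions rather than details taken from `mach.c`.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical opcode numbers; the real values live in the Mach source. */
enum { OP_LOADK = 1, OP_LOAD_FIELD = 2, OP_LOAD_DYNAMIC = 3 };

/* ASSUMPTION: opcode in the low byte, followed by A, B, C (or Bx). */
static uint32_t enc_abc(uint8_t op, uint8_t a, uint8_t b, uint8_t c) {
    return (uint32_t)op | (uint32_t)a << 8 | (uint32_t)b << 16 | (uint32_t)c << 24;
}

static uint32_t enc_abx(uint8_t op, uint8_t a, uint16_t bx) {
    return (uint32_t)op | (uint32_t)a << 8 | (uint32_t)bx << 16;
}

/* Emit "dest = obj[key]" into out[]; returns the number of instructions written. */
static size_t emit_load_field(uint32_t out[2], uint8_t dest, uint8_t obj,
                              uint16_t key_index, uint8_t tmp) {
    assert(out != NULL);
    if (key_index <= UINT8_MAX) {
        /* Key index fits in the 8-bit C field: single iABC instruction. */
        out[0] = enc_abc(OP_LOAD_FIELD, dest, obj, (uint8_t)key_index);
        return 1;
    }
    /* Otherwise load the key into a temporary (16-bit Bx index),
       then use the register-based variant. */
    out[0] = enc_abx(OP_LOADK, tmp, key_index);
    out[1] = enc_abc(OP_LOAD_DYNAMIC, dest, obj, tmp);
    return 2;
}
```

The same shape would apply to `STORE_FIELD` and `DELETE`, substituting `STORE_DYNAMIC` and `DELETEINDEX` for the second instruction.
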
## Arithmetic Dispatch

Arithmetic ops (ADD, SUB, MUL, DIV, MOD, POW) are executed inline, without calling the polymorphic `reg_vm_binop()` helper. Because mcode's type-guard dispatch guarantees that both operands are numbers, only two paths are needed:

1. **Int-int fast path**: `JS_VALUE_IS_BOTH_INT` → native integer arithmetic with an int32 overflow check. On overflow, the result is promoted to float64.
2. **Float fallback**: `JS_ToFloat64` → native floating-point operation. Non-finite results produce null.

DIV and MOD check for a zero divisor (→ null). POW calls `pow()` and applies the non-finite handling when the inputs are finite.

Comparison ops (EQ through GE) and bitwise ops still use `reg_vm_binop()` for their slow paths, because they handle a wider range of type combinations (string comparisons, null equality, etc.).

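The two-path dispatch can be sketched for ADD and DIV as follows. The `Value` type here is a hypothetical stand-in for the VM's tagged `JSValue`, and the helpers do not use the real `JS_VALUE_IS_BOTH_INT` or `JS_ToFloat64` macros. As a simplification, DIV is shown going straight to floating point; whether the real VM has an integer path for division is not stated in this document.

```c
#include <assert.h>
#include <math.h>
#include <stdint.h>

/* Hypothetical tagged value; the real VM uses JSValue. */
typedef enum { V_NULL, V_INT, V_FLOAT } Tag;
typedef struct { Tag tag; union { int32_t i; double f; } u; } Value;

static Value v_null(void)      { Value v; v.tag = V_NULL;  v.u.i = 0; return v; }
static Value v_int(int32_t i)  { Value v; v.tag = V_INT;   v.u.i = i; return v; }
static Value v_float(double f) { Value v; v.tag = V_FLOAT; v.u.f = f; return v; }

/* Operands are guaranteed to be numbers by the caller (type guards). */
static double to_f64(Value v) {
    assert(v.tag != V_NULL);
    return v.tag == V_INT ? (double)v.u.i : v.u.f;
}

static Value vm_add(Value a, Value b) {
    if (a.tag == V_INT && b.tag == V_INT) {
        /* Int-int fast path: compute in 64 bits, then check the int32 range. */
        int64_t r = (int64_t)a.u.i + (int64_t)b.u.i;
        if (r >= INT32_MIN && r <= INT32_MAX)
            return v_int((int32_t)r);
        return v_float((double)r); /* overflow promotes to float64 */
    }
    /* Float fallback: non-finite results produce null. */
    double r = to_f64(a) + to_f64(b);
    return isfinite(r) ? v_float(r) : v_null();
}

static Value vm_div(Value a, Value b) {
    double d = to_f64(b);
    if (d == 0.0)
        return v_null(); /* zero divisor produces null */
    double r = to_f64(a) / d;
    return isfinite(r) ? v_float(r) : v_null();
}
```
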
## String Concatenation

CONCAT uses a three-tier dispatch. The first two tiers apply to self-assign patterns (`concat R(A), R(A), R(C)`, where the destination equals the left operand); the third handles every other case:

1. **In-place append**: If `R(A)` is a mutable heap text (S bit clear) with `length + rhs_length <= cap56`, characters are appended directly. This needs no allocation and no GC work.
2. **Growth allocation** (`JS_ConcatStringGrow`): Allocates a new text with 2x capacity and does **not** stone the result, leaving it mutable for subsequent appends.
3. **Exact-fit stoned** (`JS_ConcatString`): Used when the destination differs from the left operand (normal non-self-assign concat).

The `stone_text` instruction (iABC, B=0, C=0) sets the S bit on a mutable heap text in `R(A)`. For non-pointer values or already-stoned text, it is a no-op. This instruction is emitted by the streamline optimizer at escape points; see [Streamline — insert_stone_text](streamline.md#7-insert_stone_text-mutable-text-escape-analysis) and [Stone Memory — Mutable Text](stone.md#mutable-text-concatenation).

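The three tiers can be modeled with a small mutable text buffer. This is a sketch under stated assumptions, not the VM's implementation: `Text` is a hypothetical stand-in for the heap text object, `stoned` stands in for the S bit, "2x capacity" is interpreted as twice the combined length, and memory is never freed because the real VM relies on its garbage collector.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical heap text; `stoned` models the S bit. */
typedef struct {
    bool stoned;
    size_t length;
    size_t cap;
    char *data;
} Text;

static Text *text_new(const char *s, size_t len, size_t cap, bool stoned) {
    assert(cap >= len);
    Text *t = malloc(sizeof *t);
    assert(t != NULL);
    t->data = malloc(cap ? cap : 1);
    assert(t->data != NULL);
    memcpy(t->data, s, len);
    t->stoned = stoned;
    t->length = len;
    t->cap = cap;
    return t;
}

/* Self-assign concat: returns the text that should end up in R(A). */
static Text *concat_self(Text *dst, const Text *rhs) {
    size_t need = dst->length + rhs->length;
    if (!dst->stoned && need <= dst->cap) {
        /* Tier 1: append in place, no allocation. */
        memcpy(dst->data + dst->length, rhs->data, rhs->length);
        dst->length = need;
        return dst;
    }
    /* Tier 2: allocate with headroom and leave the result mutable.
       ASSUMPTION: "2x capacity" means twice the combined length. */
    Text *grown = text_new(dst->data, dst->length, need * 2, false);
    memcpy(grown->data + grown->length, rhs->data, rhs->length);
    grown->length = need;
    return grown;
}

/* Tier 3: ordinary concat, exact-fit allocation, result stoned. */
static Text *concat_exact(const Text *lhs, const Text *rhs) {
    size_t need = lhs->length + rhs->length;
    Text *t = text_new(lhs->data, lhs->length, need, true);
    memcpy(t->data + lhs->length, rhs->data, rhs->length);
    t->length = need;
    return t;
}

/* Models the stone_text instruction: set the S bit (no-op if already set). */
static void stone_text(Text *t) { t->stoned = true; }
```

In a loop that repeatedly appends to the same register, only the occasional tier 2 call allocates; the appends in between take tier 1 until the capacity is exhausted.
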