91 lines
3.0 KiB
Markdown
91 lines
3.0 KiB
Markdown
---
|
|
title: "Mcode IR"
|
|
description: "JSON-based intermediate representation"
|
|
---
|
|
|
|
## Overview
|
|
|
|
Mcode is a JSON-based intermediate representation that can be interpreted directly. It represents the same operations as the Mach register VM but uses string-based instruction dispatch rather than binary opcodes. Mcode is intended as an intermediate step toward native code compilation.
|
|
|
|
## Pipeline
|
|
|
|
```
|
|
Source → Tokenize → Parse (AST) → Mcode (JSON) → Interpret
|
|
→ Compile to Mach (planned)
|
|
→ Compile to native (planned)
|
|
```
|
|
|
|
Mcode is produced by the `JS_Mcode` compiler pass, which emits a cJSON tree. The mcode interpreter walks this tree directly, dispatching on instruction name strings.
|
|
|
|
## JSMCode Structure
|
|
|
|
```c
|
|
struct JSMCode {
|
|
uint16_t nr_args; // argument count
|
|
uint16_t nr_slots; // register count
|
|
cJSON **instrs; // pre-flattened instruction array
|
|
uint32_t instr_count; // number of instructions
|
|
|
|
struct {
|
|
const char *name; // label name
|
|
uint32_t index; // instruction index
|
|
} *labels;
|
|
uint32_t label_count;
|
|
|
|
struct JSMCode **functions; // nested functions
|
|
uint32_t func_count;
|
|
|
|
cJSON *json_root; // keeps JSON alive
|
|
const char *name; // function name
|
|
const char *filename; // source file
|
|
uint16_t disruption_pc; // exception handler offset
|
|
};
|
|
```
|
|
|
|
## Instruction Format
|
|
|
|
Each instruction is a JSON array. The first element is the instruction name (string), followed by operands:
|
|
|
|
```json
|
|
["LOADK", 0, 42]
|
|
["ADD", 2, 0, 1]
|
|
["JMPFALSE", 3, "else_label"]
|
|
["CALL", 0, 2, 1]
|
|
```
|
|
|
|
The instruction set mirrors the Mach VM opcodes — same operations, same register semantics, but with string dispatch instead of numeric opcodes.
|
|
|
|
## Labels
|
|
|
|
Control flow uses named labels instead of numeric offsets:
|
|
|
|
```json
|
|
["LABEL", "loop_start"]
|
|
["ADD", 1, 1, 2]
|
|
["JMPFALSE", 3, "loop_end"]
|
|
["JMP", "loop_start"]
|
|
["LABEL", "loop_end"]
|
|
```
|
|
|
|
Labels are collected into a name-to-index map during loading, enabling O(1) jump resolution.
|
|
|
|
## Differences from Mach
|
|
|
|
| Property | Mcode | Mach |
|
|
|----------|-------|------|
|
|
| Instructions | cJSON arrays | 32-bit binary |
|
|
| Dispatch | String comparison | Switch on opcode byte |
|
|
| Constants | Inline in JSON | Separate constant pool |
|
|
| Jump targets | Named labels | Numeric offsets |
|
|
| Memory | Heap (cJSON nodes) | Off-heap (malloc) |
|
|
|
|
## Purpose
|
|
|
|
Mcode serves as an inspectable, debuggable intermediate format:
|
|
|
|
- **Human-readable** — the JSON representation can be printed and examined
|
|
- **Language-independent** — any tool that produces the correct JSON can target the ƿit runtime
|
|
- **Compilation target** — the Mach compiler can consume mcode as input, and future native code generators can work from the same representation
|
|
|
|
The cost of string-based dispatch makes mcode slower than the binary Mach VM, so it is primarily useful during development and as a compilation intermediate rather than for production execution.
|