update
This commit is contained in:
90
docs/spec/mcode.md
Normal file
90
docs/spec/mcode.md
Normal file
@@ -0,0 +1,90 @@
|
||||
---
|
||||
title: "Mcode IR"
|
||||
description: "JSON-based intermediate representation"
|
||||
---
|
||||
|
||||
## Overview
|
||||
|
||||
Mcode is a JSON-based intermediate representation that can be interpreted directly. It represents the same operations as the Mach register VM but uses string-based instruction dispatch rather than binary opcodes. Mcode is intended as an intermediate step toward native code compilation.
|
||||
|
||||
## Pipeline
|
||||
|
||||
```
|
||||
Source → Tokenize → Parse (AST) → Mcode (JSON) → Interpret
|
||||
→ Compile to Mach (planned)
|
||||
→ Compile to native (planned)
|
||||
```
|
||||
|
||||
Mcode is produced by the `JS_Mcode` compiler pass, which emits a cJSON tree. The mcode interpreter walks this tree directly, dispatching on instruction name strings.
|
||||
|
||||
## JSMCode Structure
|
||||
|
||||
```c
|
||||
struct JSMCode {
|
||||
uint16_t nr_args; // argument count
|
||||
uint16_t nr_slots; // register count
|
||||
cJSON **instrs; // pre-flattened instruction array
|
||||
uint32_t instr_count; // number of instructions
|
||||
|
||||
struct {
|
||||
const char *name; // label name
|
||||
uint32_t index; // instruction index
|
||||
} *labels;
|
||||
uint32_t label_count;
|
||||
|
||||
struct JSMCode **functions; // nested functions
|
||||
uint32_t func_count;
|
||||
|
||||
cJSON *json_root; // keeps JSON alive
|
||||
const char *name; // function name
|
||||
const char *filename; // source file
|
||||
uint16_t disruption_pc; // exception handler offset
|
||||
};
|
||||
```
|
||||
|
||||
## Instruction Format
|
||||
|
||||
Each instruction is a JSON array. The first element is the instruction name (string), followed by operands:
|
||||
|
||||
```json
|
||||
["LOADK", 0, 42]
|
||||
["ADD", 2, 0, 1]
|
||||
["JMPFALSE", 3, "else_label"]
|
||||
["CALL", 0, 2, 1]
|
||||
```
|
||||
|
||||
The instruction set mirrors the Mach VM opcodes — same operations, same register semantics, but with string dispatch instead of numeric opcodes.
|
||||
|
||||
## Labels
|
||||
|
||||
Control flow uses named labels instead of numeric offsets:
|
||||
|
||||
```json
|
||||
["LABEL", "loop_start"]
|
||||
["ADD", 1, 1, 2]
|
||||
["JMPFALSE", 3, "loop_end"]
|
||||
["JMP", "loop_start"]
|
||||
["LABEL", "loop_end"]
|
||||
```
|
||||
|
||||
Labels are collected into a name-to-index map during loading, enabling O(1) jump resolution.
|
||||
|
||||
## Differences from Mach
|
||||
|
||||
| Property | Mcode | Mach |
|
||||
|----------|-------|------|
|
||||
| Instructions | cJSON arrays | 32-bit binary |
|
||||
| Dispatch | String comparison | Switch on opcode byte |
|
||||
| Constants | Inline in JSON | Separate constant pool |
|
||||
| Jump targets | Named labels | Numeric offsets |
|
||||
| Memory | Heap (cJSON nodes) | Off-heap (malloc) |
|
||||
|
||||
## Purpose
|
||||
|
||||
Mcode serves as an inspectable, debuggable intermediate format:
|
||||
|
||||
- **Human-readable** — the JSON representation can be printed and examined
|
||||
- **Language-independent** — any tool that produces the correct JSON can target the ƿit runtime
|
||||
- **Compilation target** — the Mach compiler can consume mcode as input, and future native code generators can work from the same representation
|
||||
|
||||
The cost of string-based dispatch makes mcode slower than the binary Mach VM, so it is primarily useful during development and as a compilation intermediate rather than for production execution.
|
||||
Reference in New Issue
Block a user