cell/todo/jit.md
2025-12-17 02:53:01 -06:00


Yep — what you're describing is *exactly* how fast JS engines make “normal-looking arrays” fast: **an array carries an internal “elements kind”**, e.g. “all int32”, “all doubles”, “boxed values”, and it *transitions* to a more general representation the moment you store something that doesn't fit. V8 literally documents this “elements kinds” + transitions model (packed smi → packed double → packed elements, plus “holey” variants). ([V8][1])
### 1) “Numbers-only until polluted” arrays: do it — but keep it brutally simple
For CellScript, you can get most of the win with just **three** internal kinds:
* **PACKED_I32** (or “fit int” if you want)
* **PACKED_F64**
* **PACKED_VALUE** (boxed JSValue)
Rules:
* On write, if kind is I32 and value is i32 → store unboxed.
* If value is non-i32 number → upgrade whole backing store to F64.
* If value is non-number → upgrade to VALUE.
* Never downgrade (keep it one-way like V8 does). ([V8][1])
Extra credit that matters more than it sounds: **forbid sparse arrays** (or define them away). If an out-of-range write extends the array, *fill with null* so the storage remains dense. That keeps iteration tight and avoids “holey” variants (which are a real perf cliff in engines). ([V8][1])
Your standard library already nudges toward dense arrays (constructors like `array(n)` fill with null).
### 2) Your “fast property op assuming data properties only” is the biggest bang
Since you've banned Proxy and all descriptor/accessor machinery, you can add VM ops that assume the world is sane:
* `GET_PROP_PLAIN`
* `SET_PROP_PLAIN`
Then slap an **inline cache** (IC) on them: cache `(shape_id, slot_offset)` for a given property name/key. On hit → direct load/store by offset. On miss → slow path resolves, updates cache.
This is not hypothetical: the QuickJS-NG project has discussed polymorphic inline caches (PolyIC), and some forks report big wins from them. ([GitHub][2])
Even if you keep “objects are fully dynamic”, ICs still work great because most call sites are monomorphic in practice.
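A minimal monomorphic IC for `GET_PROP_PLAIN` could look like the following C sketch. All names are hypothetical, the shape is modeled as a flat key list, and the slow path is just a linear scan:

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

#define MAX_PROPS 8

typedef struct {
    uint32_t shape_id;                 /* unique per layout */
    int nprops;
    const char *keys[MAX_PROPS];
} Shape;

typedef struct {
    const Shape *shape;
    double slots[MAX_PROPS];           /* data properties only: no getters */
} Obj;

typedef struct {                       /* one cache per bytecode site */
    uint32_t cached_shape_id;          /* 0 = empty cache */
    int cached_offset;
} InlineCache;

static int slow_lookup(const Shape *s, const char *key) {
    for (int i = 0; i < s->nprops; i++)
        if (strcmp(s->keys[i], key) == 0) return i;
    return -1;
}

static double get_prop_plain(InlineCache *ic, const Obj *o, const char *key) {
    if (ic->cached_shape_id == o->shape->shape_id)
        return o->slots[ic->cached_offset];        /* hit: direct load */
    int off = slow_lookup(o->shape, key);          /* miss: resolve */
    assert(off >= 0);                              /* sketch: key must exist */
    ic->cached_shape_id = o->shape->shape_id;
    ic->cached_offset = off;
    return o->slots[off];
}
```

Because the key at a given bytecode site is fixed, the guard is a single integer compare; only a shape mismatch falls back to the scan.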
### 3) “What else should I remove to make JITing easier + faster?”
The best deletions are the ones that eliminate **invalidation** (stuff that forces “anything could change”):
1. **Prototype mutation** (you already forbid it; `meme` creates, “prototypes cannot be changed”).
2. **Accessors / defineProperty / descriptors** (you already forbid it).
3. **Proxy / Reflect** (already gone).
4. **Property enumeration order guarantees** — and you already *don't* guarantee key order for `array(object)`.
That's secretly huge: you can store properties in whatever layout is fastest (hash + compact slots) without “insertion order” bookkeeping.
5. **Sparse arrays / hole semantics** (if you delete this, your array JIT story becomes *way* easier).
Stuff that's *less* important than people think:
* Keeping `delete` as a keyword is fine *if* you implement it in a JIT-friendly way (next point).
### 4) You can keep `delete` without wrecking shapes: make it a “logical delete”
If you want “`obj[k] = null` deletes it”, you can implement deletion as:
* keep the slot/offset **stable**
* store **null** and mark the property as “absent for enumeration / membership”
So the shape doesn't thrash and cached offsets stay valid. `delete obj[k]` becomes the same thing.
That's the trick: you keep the *semantics* of deletion, but avoid the worst-case performance behavior (shape churn) that makes JITs sad.
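A logical delete can be as small as a presence bitmask sitting next to the slots. A sketch under the assumption of at most 32 data properties, with all names hypothetical:

```c
#include <assert.h>
#include <stdint.h>

#define MAX_PROPS 32

typedef struct {
    double slots[MAX_PROPS];
    uint32_t present;                  /* bit i set => property i is live */
} PlainObj;

static void set_slot(PlainObj *o, int off, double v) {
    o->slots[off] = v;
    o->present |= 1u << off;
}

/* `delete obj[k]` and `obj[k] = null` compile to the same thing:
   null the slot and clear the bit; the offset (and shape) never change. */
static void logical_delete(PlainObj *o, int off) {
    o->slots[off] = 0.0;               /* stand-in for null */
    o->present &= ~(1u << off);
}

static int has_prop(const PlainObj *o, int off) {
    return (o->present >> off) & 1u;
}
```

Enumeration and membership checks consult `present`; cached `(shape_id, offset)` pairs stay valid across any number of deletes and re-adds.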
### 5) What assumptions do “meme + immutable prototypes” unlock?
Two big ones:
* **Prototype chain links never change**, so once you've specialized a load, you don't need “prototype changed” invalidation machinery.
* If your prototypes are usually **stone** (module exports from `use()` are stone), then prototype *contents* don't change either. That means caching “property X lives on prototype P at offset Y” is stable forever.
In a JIT or even in an interpreter with ICs, you can:
* guard receiver shape once per loop (or hoist it)
* do direct loads either from receiver or a known prototype object
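With immutable chain links and stone prototype contents, a prototype-load cache needs only a receiver-shape guard and no invalidation lists at all. A toy C sketch, with hypothetical names and the chain walk collapsed to a single slot:

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

typedef struct { double slots[4]; } Proto;     /* stone: contents frozen */

typedef struct {
    uint32_t shape_id;
    const Proto *proto;                        /* link never changes */
    double slots[4];
} Obj;

typedef struct {
    uint32_t shape_id;                         /* the only guard needed */
    const Proto *proto;                        /* cached forever once set */
    int offset;
} ProtoIC;

static double get_from_proto(ProtoIC *ic, const Obj *o) {
    if (ic->shape_id == o->shape_id)
        return ic->proto->slots[ic->offset];   /* no "proto changed" check */
    ic->shape_id = o->shape_id;                /* miss: resolve once */
    ic->proto = o->proto;
    ic->offset = 0;                            /* toy: property lives at slot 0 */
    return ic->proto->slots[ic->offset];
}
```

In a JS-style engine this cache would need to hang off invalidation lists for `__proto__` writes and prototype mutation; here the cached `(proto, offset)` pair can never go stale.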
### 6) What do `stone` and `def` buy you, concretely?
**stone(value)** is a promise: “no more mutations, deep.”
That unlocks:
* hoisting shape checks out of loops (because the receiver won't change shape mid-loop)
* for stone arrays: no push/pop → stable length + stable element kind
* for stone objects: stable slot layout; you can treat them like read-only structs *when the key is known*
But: **stone doesn't magically remove the need to identify which layout you're looking at.** If the receiver is not a compile-time constant, you still need *some* guard (shape id or pointer class id). The win is you can often make that guard **once**, then blast through the loop.
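That “guard once, then blast through the loop” pattern, sketched in C with hypothetical names; `EXPECTED_SHAPE` stands in for the shape id the specialized code was compiled against:

```c
#include <assert.h>
#include <stdint.h>

typedef struct { uint32_t shape_id; double scale; } StoneObj;

#define EXPECTED_SHAPE 7u

/* out[i] = o->scale * xs[i]; returns 0 to signal "bail to slow path". */
static int scale_all(const StoneObj *o, const double *xs, double *out, int n) {
    if (o->shape_id != EXPECTED_SHAPE) return 0;   /* one guard, up front */
    double s = o->scale;       /* hoistable: stone => no mid-loop mutation */
    for (int i = 0; i < n; i++) out[i] = s * xs[i];
    return 1;
}
```

The loop body has no guards, no boxing, and no property lookups; only a receiver with an unexpected layout ever reaches the generic path.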
**def** is about *bindings*, not object mutability:
* a `def` global / module binding can be constant-folded and inlined
* a `def` that holds a `key()` capability makes `obj[that_key]` an excellent JIT target: the key identity is constant, so the lookup can be cached very aggressively.
### 7) LuaJIT comparison: what it's doing, and where you could beat it
LuaJIT is fast largely because it's a **tracing JIT**: it records a hot path, emits IR, and inserts **guards** that bail out if assumptions break. ([GitHub][3])
Tables also have a split **array part + hash part** representation, which is why “array-ish” use is fast. ([Percona Community][4])
Could CellScript beat LuaJIT? Not as an interpreter. But with:
* unboxed dense arrays (like above),
* plain-data-property ICs,
* immutable prototypes,
* plus either a trace JIT or a simple baseline JIT…
…you can absolutely be in “LuaJIT-ish” territory for the patterns you care about (actors + data + tight array loops). The big JS engines are still monsters in general-purpose optimization, but your *constraints* are real leverage if you cash them in at the VM/JIT level.
If you implement only two performance features this year, make them **(1) dense unboxed arrays with one-way kind transitions, and (2) inline-cached GET/SET for plain properties**. Everything else is garnish.
[1]: https://v8.dev/blog/elements-kinds "Elements kinds in V8"
[2]: https://github.com/quickjs-ng/quickjs/issues/116 "Optimization: Add support for Poly IC · Issue #116 · quickjs-ng/quickjs"
[3]: https://github.com/tarantool/tarantool/wiki/LuaJIT-SSA-IR "LuaJIT SSA IR"
[4]: https://percona.community/blog/2020/04/29/the-anatomy-of-luajit-tables-and-whats-special-about-them/ "The Anatomy of LuaJIT Tables and What's Special About Them"