docs revamp
Yep — what you’re describing is *exactly* how fast JS engines make “normal-looking arrays” fast: **an array carries an internal “elements kind”**, e.g. “all int32”, “all doubles”, “boxed values”, and it *transitions* to a more general representation the moment you store something that doesn’t fit. V8 literally documents this “elements kinds” + transitions model (packed smi → packed double → packed elements, plus “holey” variants). ([V8][1])

### 1) “Numbers-only until polluted” arrays: do it — but keep it brutally simple

For CellScript, you can get most of the win with just **three** internal kinds:

* **PACKED_I32** (or “fit int” if you want)
* **PACKED_F64**
* **PACKED_VALUE** (boxed JSValue)

Rules:

* On write, if kind is I32 and value is i32 → store unboxed.
* If value is non-i32 number → upgrade whole backing store to F64.
* If value is non-number → upgrade to VALUE.
* Never downgrade (keep it one-way like V8 does). ([V8][1])

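The rules above fit in a few lines. A minimal sketch in Python, assuming illustrative names (`Kind`, `kind_for`, `DenseArray`) that are not actual CellScript VM types:

```python
from enum import IntEnum

class Kind(IntEnum):
    PACKED_I32 = 0    # unboxed 32-bit ints
    PACKED_F64 = 1    # unboxed doubles
    PACKED_VALUE = 2  # boxed values

def kind_for(value):
    """Most specific kind that can hold `value` unboxed."""
    if isinstance(value, bool):
        return Kind.PACKED_VALUE
    if isinstance(value, int) and -(2**31) <= value < 2**31:
        return Kind.PACKED_I32
    if isinstance(value, (int, float)):
        return Kind.PACKED_F64
    return Kind.PACKED_VALUE

class DenseArray:
    def __init__(self):
        self.kind = Kind.PACKED_I32
        self.storage = []  # a real VM keeps this unboxed per self.kind

    def push(self, value):
        needed = kind_for(value)
        if needed > self.kind:
            # One-way transition: rewrite the backing store once, never downgrade.
            self.kind = needed
        self.storage.append(value)
```

Once an array has upgraded to `PACKED_VALUE`, a later `push(3)` leaves it there: transitions only move toward the general end.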
Extra credit that matters more than it sounds: **forbid sparse arrays** (or define them away). If an out-of-range write extends the array, *fill with null* so the storage remains dense. That keeps iteration tight and avoids “holey” variants (which are a real perf cliff in engines). ([V8][1])

Your standard library already nudges toward dense arrays (constructors like `array(n)` fill with null).

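That extend-and-fill rule is small enough to sketch on a plain Python list, with `None` standing in for CellScript’s null:

```python
def dense_extend_write(storage, index, value, null=None):
    """Out-of-range write: grow the backing store and fill the gap with null,
    so the storage stays dense and no "holey" kind is ever needed. Sketch only."""
    if index >= len(storage):
        storage.extend([null] * (index + 1 - len(storage)))
    storage[index] = value
    return storage
```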
### 2) Your “fast property op assuming data properties only” is the biggest bang

Since you’ve banned Proxy and all descriptor/accessor machinery, you can add VM ops that assume the world is sane:

* `GET_PROP_PLAIN`
* `SET_PROP_PLAIN`

Then slap an **inline cache** (IC) on them: cache `(shape_id, slot_offset)` for a given property name/key. On hit → direct load/store by offset. On miss → slow path resolves, updates cache.

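A monomorphic version of that cache might look like this (the shape ids, slot storage, and cache fields are illustrative assumptions, not the real VM layout):

```python
# Monomorphic inline cache for a GET_PROP_PLAIN-style op.
class Shape:
    def __init__(self, id, slot_of):
        self.id = id
        self.slot_of = slot_of            # property name -> slot offset

class PlainObject:
    def __init__(self, shape, slots):
        self.shape = shape
        self.slots = slots                # dense slot storage

class GetPropIC:
    def __init__(self, name):
        self.name = name
        self.cached_shape_id = None       # the (shape_id, slot_offset) cache
        self.cached_offset = None

    def get(self, obj):
        if obj.shape.id == self.cached_shape_id:
            return obj.slots[self.cached_offset]   # hit: direct load by offset
        offset = obj.shape.slot_of[self.name]      # miss: slow-path resolve...
        self.cached_shape_id = obj.shape.id        # ...then update the cache
        self.cached_offset = offset
        return obj.slots[offset]
```

A PolyIC is the same idea with a small array of `(shape_id, offset)` pairs instead of one.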
This is not hypothetical: QuickJS-NG has discussed adding polymorphic inline caches (PolyIC), and some forks report big wins from them. ([GitHub][2])

Even if you keep “objects are fully dynamic”, ICs still work great because most call sites are monomorphic in practice.

### 3) “What else should I remove to make JITing easier + faster?”

The best deletions are the ones that eliminate **invalidation** (stuff that forces “anything could change”):

1. **Prototype mutation** (you already forbid it; `meme` creates them, and “prototypes cannot be changed”).
2. **Accessors / defineProperty / descriptors** (you already forbid it).
3. **Proxy / Reflect** (already gone).
4. **Property enumeration order guarantees** — and you already *don’t* guarantee key order for `array(object)`.
   That’s secretly huge: you can store properties in whatever layout is fastest (hash + compact slots) without “insertion order” bookkeeping.
5. **Sparse arrays / hole semantics** (if you delete this, your array JIT story becomes *way* easier).

Stuff that’s *less* important than people think:

* Keeping `delete` as a keyword is fine *if* you implement it in a JIT-friendly way (next point).

### 4) You can keep `delete` without wrecking shapes: make it a “logical delete”

If you want “`obj[k] = null` deletes it”, you can implement deletion as:

* keep the slot/offset **stable**
* store **null** and mark the property as “absent for enumeration / membership”

So the shape doesn’t thrash and cached offsets stay valid. `delete obj[k]` becomes the same thing.

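One way to sketch logical delete (the `absent` set and class names are illustrative stand-ins for however the VM actually tracks membership):

```python
# Logical delete: the slot/offset never moves, so shapes and cached
# offsets stay valid across deletions.
class PlainObject:
    def __init__(self, slot_of):
        self.slot_of = dict(slot_of)      # name -> offset, never shrinks
        self.slots = [None] * len(self.slot_of)
        self.absent = set()

    def set(self, name, value):
        if value is None:                 # `obj[k] = null` is deletion
            self.delete(name)
            return
        self.slots[self.slot_of[name]] = value
        self.absent.discard(name)         # writing revives a deleted property

    def delete(self, name):               # `delete obj[k]` does the same thing
        self.slots[self.slot_of[name]] = None
        self.absent.add(name)             # only membership changes, not layout

    def has(self, name):
        return name in self.slot_of and name not in self.absent

    def keys(self):
        return [k for k in self.slot_of if k not in self.absent]
```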
That’s the trick: you keep the *semantics* of deletion, but avoid the worst-case performance behavior (shape churn) that makes JITs sad.

### 5) What assumptions do “meme + immutable prototypes” unlock?

Two big ones:

* **Prototype chain links never change**, so once you’ve specialized a load, you don’t need “prototype changed” invalidation machinery.
* If your prototypes are usually **stone** (module exports from `use()` are stone), then prototype *contents* don’t change either. That means caching “property X lives on prototype P at offset Y” is stable forever.

In a JIT or even in an interpreter with ICs, you can:

* guard receiver shape once per loop (or hoist it)
* do direct loads either from receiver or a known prototype object

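That stability permits a prototype-aware cache with no invalidation path at all. A sketch, with illustrative `Shape`/`Obj` names:

```python
class Shape:
    def __init__(self, id, slot_of):
        self.id = id
        self.slot_of = slot_of            # name -> slot offset

class Obj:
    def __init__(self, shape, slots, proto=None):
        self.shape, self.slots, self.proto = shape, slots, proto

class ProtoIC:
    """Caches receiver-shape-id -> (holder object, offset). Because proto
    links and stone prototype contents never change, entries stay valid."""
    def __init__(self, name):
        self.name = name
        self.cache = {}

    def get(self, obj):
        hit = self.cache.get(obj.shape.id)
        if hit is not None:
            holder, off = hit
            return holder.slots[off]      # fast path: no chain walk
        holder = obj
        while holder is not None:         # slow path: walk the chain once
            off = holder.shape.slot_of.get(self.name)
            if off is not None:
                self.cache[obj.shape.id] = (holder, off)
                return holder.slots[off]
            holder = holder.proto
        return None
```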
### 6) What do `stone` and `def` buy you, concretely?

**stone(value)** is a promise: “no more mutations, deep.” That unlocks:

* hoisting shape checks out of loops (because the receiver won’t change shape mid-loop)
* for stone arrays: no push/pop → stable length + stable element kind
* for stone objects: stable slot layout; you can treat them like read-only structs *when the key is known*

But: **stone doesn’t magically remove the need to identify which layout you’re looking at.** If the receiver is not a compile-time constant, you still need *some* guard (shape id or pointer class id). The win is you can often make that guard **once**, then blast through the loop.

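In miniature, the hoisted-guard pattern looks like this (illustrative `Shape`/`Obj` names; in a real VM the guard is one integer compare):

```python
class Shape:
    def __init__(self, id, slot_of):
        self.id, self.slot_of = id, slot_of

class Obj:
    def __init__(self, shape, slots):
        self.shape, self.slots = shape, slots

def sum_prop(receiver, name, n, expected_shape_id):
    # Guard once, before the loop: legal because a stone receiver
    # cannot change shape mid-loop.
    if receiver.shape.id != expected_shape_id:
        return sum_prop_generic(receiver, name, n)
    offset = receiver.shape.slot_of[name]   # resolved once
    total = 0
    for _ in range(n):
        total += receiver.slots[offset]     # direct load, no per-iteration check
    return total

def sum_prop_generic(receiver, name, n):
    # Fallback: full lookup every iteration.
    return sum(receiver.slots[receiver.shape.slot_of[name]] for _ in range(n))
```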
**def** is about *bindings*, not object mutability:

* a `def` global / module binding can be constant-folded and inlined
* a `def` that holds a `key()` capability makes `obj[that_key]` an excellent JIT target: the key identity is constant, so the lookup can be cached very aggressively.

### 7) LuaJIT comparison: what it’s doing, and where you could beat it

LuaJIT is fast largely because it’s a **tracing JIT**: it records a hot path, emits IR, and inserts **guards** that bail out if assumptions break. ([GitHub][3])

Tables also have a split **array part + hash part** representation, which is why “array-ish” use is fast. ([Percona Community][4])

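The guard/side-exit mechanic in miniature (a Python stand-in; a real trace resumes at the exit point instead of restarting, which this sketch simplifies away):

```python
class SideExit(Exception):
    """Raised when a trace's assumption no longer holds."""

def guard(cond):
    if not cond:
        raise SideExit()

def interpreter_sum(xs):
    # Generic path: handles any element types.
    return sum(x for x in xs if isinstance(x, (int, float)))

def traced_sum(xs):
    # Trace specialized for "every element is an int", as recorded on a hot run.
    total = 0
    try:
        for x in xs:
            guard(type(x) is int)   # assumption baked into the trace
            total += x
    except SideExit:
        return interpreter_sum(xs)  # bail out; this sketch restarts from scratch
    return total
```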
Could CellScript beat LuaJIT? Not as an interpreter. But with:

* unboxed dense arrays (like above),
* plain-data-property ICs,
* immutable prototypes,
* plus either a trace JIT or a simple baseline JIT…

…you can absolutely be in “LuaJIT-ish” territory for the patterns you care about (actors + data + tight array loops). The big JS engines are still monsters in general-purpose optimization, but your *constraints* are real leverage if you cash them in at the VM/JIT level.

If you implement only two performance features this year, make them **(1) dense unboxed arrays with one-way kind transitions, and (2) inline-cached GET/SET for plain properties**. Everything else is garnish.

[1]: https://v8.dev/blog/elements-kinds?utm_source=chatgpt.com "Elements kinds in V8"
[2]: https://github.com/quickjs-ng/quickjs/issues/116?utm_source=chatgpt.com "Optimization: Add support for Poly IC · Issue #116 · quickjs- ..."
[3]: https://github.com/tarantool/tarantool/wiki/LuaJIT-SSA-IR?utm_source=chatgpt.com "LuaJIT SSA IR"
[4]: https://percona.community/blog/2020/04/29/the-anatomy-of-luajit-tables-and-whats-special-about-them/?utm_source=chatgpt.com "The Anatomy of LuaJIT Tables and What's Special About Them"