str stone; concat

This commit is contained in:
2026-02-20 21:54:19 -06:00
parent a82c13170f
commit fca1041e52
9 changed files with 389 additions and 15 deletions

View File

@@ -77,6 +77,30 @@ Messages between actors are stoned before delivery, ensuring actors never share
Literal objects and arrays that can be determined at compile time may be allocated directly in stone memory.
## Mutable Text Concatenation
String concatenation in a loop (`s = s + "x"`) is optimized to O(n) amortized by leaving concat results **unstoned** with over-allocated capacity. On the next concatenation, if the destination text is mutable (S bit clear) and has enough room, the VM appends in-place with zero allocation.
### How It Works
When the VM executes `concat dest, dest, src` (same destination and left operand — a self-assign pattern):
1. **Inline fast path**: If `dest` holds a heap text, is not stoned, and `length + src_length <= capacity` — append characters in place, update length, done. No allocation, no GC possible.
2. **Growth path** (`JS_ConcatStringGrow`): Allocate a new text with `capacity = max(new_length * 2, 16)`, copy both operands, and return the result **without stoning** it. The 2x growth factor means a loop of N concatenations does O(log N) allocations totaling O(N) character copies.
3. **Exact-fit path** (`JS_ConcatString`): When `dest != left` (not self-assign), the existing exact-fit stoned path is used. This is the normal case for expressions like `var c = a + b`.
### Safety Invariant
**An unstoned heap text is uniquely referenced by exactly one slot.** This is enforced by the `stone_text` mcode instruction, which the [streamline optimizer](streamline.md#7-insert_stone_text-mutable-text-escape-analysis) inserts before any instruction that would create a second reference to the value (move, store, push, setarg, put). Two VM-level guards cover cases where the compiler cannot prove the type: `get` (closure reads) and `return` (inter-frame returns).
### Why Over-Allocation Is GC-Safe
- The copying collector copies based on `cap56` (the object header's capacity field), not `length`. Over-allocated capacity survives GC.
- `js_alloc_string` zero-fills the packed data region, so padding beyond `length` is always clean.
- String comparisons, hashing, and interning all use `length`, not `cap56`. Extra capacity is invisible to string operations.
## Relationship to GC
The Cheney copying collector only operates on the mutable heap. During collection, when the collector encounters a pointer to stone memory (S bit set), it skips it — stone objects are roots that never move. This means stone memory acts as a permanent root set with zero GC overhead.