2026-01-30 20:16:08 -06:00
parent a49b94e0a1
commit b3f3bc8a5f
2 changed files with 817 additions and 803 deletions

660
plan.md

@@ -1,547 +1,215 @@
# Refactoring QuickJS to Mist Memory Format
# Cell/QuickJS Refactoring Plan: Remove Atoms, Shapes, and Dual-Encoding
## Summary
## Overview
Complete rework of `quickjs.h` and `quickjs.c` to align with `docs/memory.md` and the new JSValue encoding scheme using LSB-based type discrimination with short floats.
Refactor `source/quickjs.c` to match `docs/memory.md` specification:
- Remove JSAtom system (171 references → ~41 remaining)
- Remove JSShape system (94 references) ✓
- Remove IC caches (shape-based inline caches) ✓
- Remove `is_wide_char` dual-encoding (18 locations) ✓
- Use JSValue texts directly as property keys
- Reference: `mquickjs.c` shows the target pattern
## Key Design Decisions (from user)
## Completed Phases
1. **Remove NaN-boxing entirely** - Use LSB-based type tags instead
2. **Short float for numbers** - Truncated double (3 fewer exponent bits), out-of-range → NULL
3. **Optional 32-bit float mode** - Compile-time option, stored like ints
4. **Remove KeyId** - Use JSValue directly as keys in objects
5. **Remove JSStringRope** - No lazy concatenation, immediate text creation
6. **Remove JSObject/shapes** - Move to JSRecord only with direct key/value storage
7. **Remove atoms from objects** - String interning for literals/properties only
### Phase 1: Remove is_wide_char Remnants ✓
### Phase 2: Remove IC Caches ✓
### Phase 3: Remove JSShape System ✓
### Phase 4: Complete Property Access with JSValue Keys ✓
## New JSValue Encoding (64-bit)
Completed:
- Removed JS_GC_OBJ_TYPE_JS_OBJECT fallbacks from OP_get_field
- Removed JS_GC_OBJ_TYPE_JS_OBJECT fallbacks from OP_put_field
- Removed JS_GC_OBJ_TYPE_JS_OBJECT fallbacks from OP_define_field
- Created emit_key() function that adds JSValue to cpool and emits index
Based on the provided header, using LSB-based discrimination:
---
```
LSB = 0 → 31-bit signed integer (value >> 1)
LSB = 01 → 61-bit pointer
LSB = 101 → Short float (truncated double, 3 fewer exponent bits)
LSB = 11 → Special tag (next 3 bits for subtype, 5 bits total)
```
## Phase 5: Convert JSAtom to JSValue Text (IN PROGRESS)
**Special tags (5 bits, LSB = 11):**
- `00011` (3) = JS_TAG_BOOL (payload bit 5 = value)
- `00111` (7) = JS_TAG_NULL
- `01011` (11) = JS_TAG_UNDEFINED (may not be needed - use NULL)
- `01111` (15) = JS_TAG_EXCEPTION
- `10111` (23) = JS_TAG_UNINITIALIZED
- `11011` (27) = JS_TAG_STRING_ASCII (immediate string: 3-bit len + up to 7 ASCII bytes)
- `11111` (31) = JS_TAG_CATCH_OFFSET
This is the core transformation. All identifier handling moves from atoms to JSValue.
## Critical Files
### Completed Items
- `/Users/johnalanbrook/work/cell/source/quickjs.h` - Complete rewrite of JSValue encoding
- `/Users/johnalanbrook/work/cell/source/quickjs.c` - Remove shapes, atoms from objects, string ropes
**Token and Parser Infrastructure:**
- [x] Change JSToken.u.ident.atom to JSToken.u.ident.str (JSValue)
- [x] Change parse_ident() to return JSValue
- [x] Create emit_key() function (cpool-based)
- [x] Create JS_KEY_* macros for common names (lines ~279-335 in quickjs.c)
- [x] Update all token.u.ident.atom references to .str
- [x] Create keyword lookup table (js_keywords[]) with string comparison
- [x] Rewrite update_token_ident() to use js_keyword_lookup()
- [x] Rewrite is_strict_future_keyword() to use JSValue
- [x] Update token_is_pseudo_keyword() to use JSValue and js_key_equal()
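The keyword table and string-comparison lookup listed above could look roughly like the sketch below. This is only an illustration: `SketchKeyword`, `sketch_keywords`, `sketch_keyword_lookup`, and the `SKETCH_TOK_*` values are hypothetical names, not the actual `js_keywords[]` or `js_keyword_lookup()` in `quickjs.c`, and the real table may be sorted or hashed rather than scanned linearly.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical token ids; the real parser has its own TOK_* values. */
enum {
    SKETCH_TOK_IDENT = 0,
    SKETCH_TOK_IF,
    SKETCH_TOK_RETURN,
    SKETCH_TOK_VAR
};

typedef struct {
    const char *name;  /* keyword spelling */
    int token;         /* token id returned for this keyword */
} SketchKeyword;

static const SketchKeyword sketch_keywords[] = {
    { "if",     SKETCH_TOK_IF },
    { "return", SKETCH_TOK_RETURN },
    { "var",    SKETCH_TOK_VAR },
};

/* Return the keyword token for ident[0..len), or SKETCH_TOK_IDENT
 * when the identifier is not a keyword. Plain string comparison,
 * no atom table involved. */
static int sketch_keyword_lookup(const char *ident, size_t len) {
    assert(ident != NULL);
    size_t count = sizeof(sketch_keywords) / sizeof(sketch_keywords[0]);
    for (size_t i = 0; i < count; i++) {
        const char *name = sketch_keywords[i].name;
        if (strlen(name) == len && memcmp(name, ident, len) == 0)
            return sketch_keywords[i].token;
    }
    return SKETCH_TOK_IDENT;
}
```

A linear scan is probably adequate for a few dozen keywords; a sorted table with `bsearch` would be a drop-in alternative if profiling says otherwise.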
## Implementation Plan
**Function Declaration Parsing:**
- [x] Update js_parse_function_decl() signature to use JSValue func_name
- [x] Update js_parse_function_decl2() to use JSValue func_name throughout
- [x] Update js_parse_function_check_names() to use JSValue
- [x] Convert JS_DupAtom/JS_FreeAtom to JS_DupValue/JS_FreeValue in function parsing
### Phase 1: New JSValue Encoding in quickjs.h
**Variable Definition and Lookup:**
- [x] Update find_global_var() to use JSValue and js_key_equal()
- [x] Update find_lexical_global_var() to use JSValue
- [x] Update find_lexical_decl() to use JSValue and js_key_equal()
- [x] Update js_define_var() to use JSValue
- [x] Update js_parse_check_duplicate_parameter() to use JSValue and js_key_equal()
- [x] Update js_parse_destructuring_var() to return JSValue
- [x] Update js_parse_var() to use JSValue for variable names
Replace the entire JSValue system with LSB-based tags:
**Comparison Helpers:**
- [x] Create js_key_equal_str() for comparing JSValue with C string literals
- [x] Update is_var_in_arg_scope() to use js_key_equal/js_key_equal_str
- [x] Update has_with_scope() to use js_key_equal_str
- [x] Update closure variable comparisons (cv->var_name) to use js_key_equal_str
**Property Access:**
- [x] Fix JS_GetPropertyStr to create proper JSValue keys
- [x] Fix JS_SetPropertyInternal callers to use JS_KEY_* instead of JS_ATOM_*
### JS_KEY_* Macros Added
Compile-time immediate ASCII string constants (≤7 chars):
```c
#if INTPTR_MAX >= INT64_MAX
#define JS_PTR64
typedef uint64_t JSValue;
#define JSW 8
#define JS_USE_SHORT_FLOAT
#else
typedef uint32_t JSValue;
#define JSW 4
#endif
enum {
JS_TAG_INT = 0, /* LSB = 0, 31-bit int */
JS_TAG_PTR = 1, /* LSB = 01, pointer */
JS_TAG_SPECIAL = 3, /* LSB = 11, special values */
JS_TAG_BOOL = JS_TAG_SPECIAL | (0 << 2), /* 5 bits */
JS_TAG_NULL = JS_TAG_SPECIAL | (1 << 2),
JS_TAG_EXCEPTION = JS_TAG_SPECIAL | (3 << 2),
JS_TAG_UNINITIALIZED = JS_TAG_SPECIAL | (5 << 2),
JS_TAG_STRING_ASCII = JS_TAG_SPECIAL | (6 << 2), /* immediate ASCII string */
JS_TAG_CATCH_OFFSET = JS_TAG_SPECIAL | (7 << 2),
#ifdef JS_USE_SHORT_FLOAT
JS_TAG_SHORT_FLOAT = 5, /* LSB = 101 */
#endif
};
/* Value extraction */
#define JS_VALUE_GET_INT(v) ((int32_t)(v) >> 1)
#define JS_VALUE_GET_PTR(v) ((void *)((v) & ~(JSW - 1)))
#define JS_VALUE_GET_SPECIAL_TAG(v) ((v) & 0x1F)
#define JS_VALUE_GET_SPECIAL_VALUE(v) ((int32_t)(v) >> 5)
/* Value creation */
#define JS_MKINT(val) (((JSValue)(val) << 1) | JS_TAG_INT)
#define JS_MKPTR(ptr) (((JSValue)(uintptr_t)(ptr)) | JS_TAG_PTR)
#define JS_MKSPECIAL(tag, val) ((JSValue)(tag) | ((JSValue)(val) << 5))
/* Type checks */
static inline JS_BOOL JS_IsInt(JSValue v) { return (v & 1) == JS_TAG_INT; }
static inline JS_BOOL JS_IsPtr(JSValue v) { return (v & (JSW-1)) == JS_TAG_PTR; }
static inline JS_BOOL JS_IsNull(JSValue v) { return v == JS_MKSPECIAL(JS_TAG_NULL, 0); }
static inline JS_BOOL JS_IsException(JSValue v) { return JS_VALUE_GET_SPECIAL_TAG(v) == JS_TAG_EXCEPTION; }
#ifdef JS_USE_SHORT_FLOAT
static inline JS_BOOL JS_IsShortFloat(JSValue v) { return (v & 7) == JS_TAG_SHORT_FLOAT; }
#endif
/* Constants */
#define JS_NULL JS_MKSPECIAL(JS_TAG_NULL, 0)
#define JS_FALSE JS_MKSPECIAL(JS_TAG_BOOL, 0)
#define JS_TRUE JS_MKSPECIAL(JS_TAG_BOOL, 1)
#define JS_EXCEPTION JS_MKSPECIAL(JS_TAG_EXCEPTION, 0)
#define JS_UNINITIALIZED JS_MKSPECIAL(JS_TAG_UNINITIALIZED, 0)
JS_KEY_empty, JS_KEY_name, JS_KEY_message, JS_KEY_stack,
JS_KEY_errors, JS_KEY_Error, JS_KEY_cause, JS_KEY_length,
JS_KEY_value, JS_KEY_get, JS_KEY_set, JS_KEY_raw,
JS_KEY_flags, JS_KEY_source, JS_KEY_exec, JS_KEY_toJSON,
JS_KEY_eval, JS_KEY_this, JS_KEY_true, JS_KEY_false,
JS_KEY_null, JS_KEY_NaN, JS_KEY_default, JS_KEY_index,
JS_KEY_input, JS_KEY_groups, JS_KEY_indices, JS_KEY_let,
JS_KEY_var, JS_KEY_new, JS_KEY_of, JS_KEY_yield,
JS_KEY_async, JS_KEY_target, JS_KEY_from, JS_KEY_meta,
JS_KEY_as, JS_KEY_with
```
### Phase 2: Short Float Implementation
Short float uses 3 fewer exponent bits than double. Numbers outside range become NULL.
Runtime macro for strings >7 chars:
```c
/* Short float: 61 bits = 1 sign + 8 exp + 52 mantissa (vs double's 11 exp)
* Range: approximately +-3.4e38 (vs double's +-1.8e308)
* Out of range values become JS_NULL
* Zero and subnormals: 0.0 is representable, subnormals become 0.0
*/
static inline JSValue JS_NewFloat64(JSContext *ctx, double d) {
union { double d; uint64_t u; } u;
u.d = d;
/* Extract sign, exponent, mantissa */
uint64_t sign = u.u >> 63;
int exp = (u.u >> 52) & 0x7FF;
uint64_t mantissa = u.u & ((1ULL << 52) - 1);
/* Special case: zero (exp=0, mantissa=0) */
if (exp == 0 && mantissa == 0) {
/* Encode +0.0 or -0.0 */
return (sign << 63) | JS_TAG_SHORT_FLOAT; /* short_exp=0, mantissa=0 */
}
/* Check for NaN/Inf (exp=0x7FF) */
if (exp == 0x7FF) {
return JS_NULL; /* NaN or Infinity → null */
}
/* Subnormals (exp=0, mantissa!=0): flush to zero */
if (exp == 0) {
return (sign << 63) | JS_TAG_SHORT_FLOAT; /* becomes +/-0.0 */
}
/* Normal numbers: convert exponent bias */
/* Double bias = 1023, short float bias = 127 */
int short_exp = exp - 1023 + 127;
if (short_exp < 1 || short_exp > 254) {
return JS_NULL; /* Out of range (short_exp 0 and 255 are special) */
}
/* Check if it fits in a 31-bit int (prefer integer encoding).
 * JS_VALUE_GET_INT recovers only 31 bits, so int32 range is too wide. */
if (d >= -(double)(1 << 30) && d < (double)(1 << 30)) {
int32_t i = (int32_t)d;
if ((double)i == d) {
return JS_MKINT(i);
}
}
/* Encode as short float:
* [sign:1][short_exp:8][mantissa:52][tag:3] */
JSValue v = (sign << 63) | ((uint64_t)short_exp << 55) | (mantissa << 3) | JS_TAG_SHORT_FLOAT;
return v;
}
static inline double JS_VALUE_GET_FLOAT64(JSValue v) {
/* Decode short float back to double */
uint64_t sign = v >> 63;
uint64_t short_exp = (v >> 55) & 0xFF;
uint64_t mantissa = (v >> 3) & ((1ULL << 52) - 1);
union { double d; uint64_t u; } u;
/* short_exp == 0 encodes +/-0.0 (subnormals are flushed on encode) */
if (short_exp == 0) {
u.u = sign << 63;
return u.d;
}
/* Convert exponent: short bias 127 → double bias 1023 */
uint64_t exp = short_exp - 127 + 1023;
u.u = (sign << 63) | (exp << 52) | mantissa;
return u.d;
}
#define JS_KEY_STR(ctx, str) JS_NewStringLen((ctx), (str), sizeof(str) - 1)
```
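One caveat with `JS_KEY_STR`: `sizeof(str) - 1` only equals the text length when the argument is a string literal (or a `char` array). With a `const char *` it silently yields the pointer size minus one. The sketch below shows the difference; `SKETCH_LITERAL_LEN` and the two helper functions are hypothetical names used for illustration only.

```c
#include <assert.h>
#include <stddef.h>

/* Same length trick as JS_KEY_STR: size of the array minus the NUL. */
#define SKETCH_LITERAL_LEN(str) (sizeof(str) - 1)

/* For a literal the length is known at compile time. */
static_assert(SKETCH_LITERAL_LEN("get") == 3, "literal length excludes NUL");

/* Literal argument: sizeof sees char[10], so the result is 9. */
static size_t sketch_len_of_literal(void) {
    return SKETCH_LITERAL_LEN("prototype");
}

/* Pointer argument: sizeof sees char *, so the result is the pointer
 * size minus one, unrelated to the text length. */
static size_t sketch_len_of_pointer(void) {
    const char *p = "prototype";
    return SKETCH_LITERAL_LEN(p);
}
```

If runtime strings ever need to go through the same path, calling `JS_NewStringLen` with `strlen` directly avoids the trap.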
### Phase 3: Immediate ASCII String (JS_TAG_STRING_ASCII)
Up to 7 ASCII characters stored directly in JSValue payload.
**Layout (64-bit):**
- Bits 0-4: Tag (JS_TAG_STRING_ASCII = 27)
- Bits 5-7: Length (0-7)
- Bits 8-63: Up to 7 ASCII bytes (char[0] in bits 8-15, etc.)
Helper function for comparing JSValue with C string literals:
```c
#define JS_ASCII_MAX_LEN 7
/* Check if value is immediate ASCII string */
static inline JS_BOOL JS_IsImmediateASCII(JSValue v) {
return JS_VALUE_GET_SPECIAL_TAG(v) == JS_TAG_STRING_ASCII;
}
/* Get immediate ASCII string length (bits 5-7) */
static inline size_t JS_GetImmediateASCIILen(JSValue v) {
return (v >> 5) & 0x7;
}
/* Get immediate ASCII string character at index */
static inline char JS_GetImmediateASCIIChar(JSValue v, int idx) {
return (char)((v >> (8 + idx * 8)) & 0xFF);
}
/* Try to create immediate ASCII string, returns JS_NULL if doesn't fit */
static inline JSValue JS_TryNewImmediateASCII(const char *str, size_t len) {
if (len > JS_ASCII_MAX_LEN) return JS_NULL;
for (size_t i = 0; i < len; i++) {
if ((uint8_t)str[i] >= 0x80) return JS_NULL; /* non-ASCII */
}
/* Tag (5 bits) | Length (3 bits) | chars (56 bits) */
JSValue v = JS_TAG_STRING_ASCII | ((JSValue)len << 5);
for (size_t i = 0; i < len; i++) {
v |= ((JSValue)(uint8_t)str[i]) << (8 + i * 8);
}
return v;
}
/* Hash an immediate ASCII string (hash the entire JSValue) */
static inline uint64_t js_hash_immediate_ascii(JSValue v) {
fash64_state s;
fash64_begin(&s);
fash64_word(&s, v);
return fash64_end(&s);
}
static JS_BOOL js_key_equal_str(JSValue a, const char *str);
```
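Only the declaration of `js_key_equal_str` appears here. A minimal sketch of how the immediate-ASCII case might work is shown below, assuming the tag/length/character layout described for `JS_TAG_STRING_ASCII` (tag in bits 0-4, length in bits 5-7, characters from bit 8). All `sketch_*` names are hypothetical, the heap-text case is deliberately left out, and `0` is used as a "does not fit" sentinel purely for this example.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

typedef uint64_t JSValue;             /* 64-bit build assumed */
#define SKETCH_TAG_STRING_ASCII 27u   /* mirrors JS_TAG_STRING_ASCII */
#define SKETCH_ASCII_MAX_LEN 7

/* Pack up to 7 ASCII bytes into an immediate value.
 * Returns 0 when the text does not fit; 0 can never be a valid
 * immediate string because its tag bits would be nonzero. */
static JSValue sketch_pack_ascii(const char *str, size_t len) {
    if (len > SKETCH_ASCII_MAX_LEN)
        return 0;
    JSValue v = SKETCH_TAG_STRING_ASCII | ((JSValue)len << 5);
    for (size_t i = 0; i < len; i++) {
        if ((uint8_t)str[i] >= 0x80)
            return 0;                 /* non-ASCII byte */
        v |= (JSValue)(uint8_t)str[i] << (8 + i * 8);
    }
    return v;
}

/* Compare a key with a C string. Only immediate ASCII keys are
 * handled; a real implementation would also walk heap text. */
static bool sketch_key_equal_str(JSValue key, const char *str) {
    assert(str != NULL);
    if ((key & 0x1F) != SKETCH_TAG_STRING_ASCII)
        return false;
    JSValue packed = sketch_pack_ascii(str, strlen(str));
    return packed != 0 && packed == key;
}
```

Because the length is part of the packed word, two immediate strings are equal exactly when their 64-bit values are equal, so no per-character loop is needed on the comparison side.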
### Phase 4: Remove JSStringRope
### Remaining Work
Delete `JSStringRope` structure and all rope-related functions:
- `js_new_string_rope()` (line 4815)
- `js_rebalancee_string_rope()` (line 4952)
- `string_rope_iter_*` functions
- `JS_TAG_STRING_ROPE` usage
#### 5.3 Update js_parse_property_name()
- [ ] Change return type from JSAtom* to JSValue*
- [ ] Update all callers (js_parse_object_literal, etc.)
- [ ] This is a larger change affecting many functions
String concatenation creates new `mist_text` objects immediately.
#### 5.4 Replace remaining emit_atom() calls with emit_key()
- [ ] Many emit_atom calls remain in bytecode generation
- [ ] emit_atom is currently a wrapper that calls emit_key
- [ ] Eventually remove emit_atom entirely
### Phase 5: UTF-32 Text Objects (mist_text)
#### 5.5 Update Variable Opcode Format in quickjs-opcode.h
- [ ] Change `atom` format opcodes to `key` format
- [ ] Change `atom_u8` and `atom_u16` to `key_u8` and `key_u16`
The `mist_text` structure already exists. Complete integration:
#### 5.6 Update VM Opcode Handlers
These handlers currently read atoms from bytecode using get_u32(). They need to read cpool indices instead:
- [ ] OP_check_var, OP_get_var_undef, OP_get_var
- [ ] OP_put_var, OP_put_var_init, OP_put_var_strict
- [ ] OP_set_name, OP_make_var_ref, OP_delete_var
- [ ] OP_define_var, OP_define_func, OP_throw_error
- [ ] OP_make_loc_ref, OP_make_arg_ref
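The handler change amounts to treating the 32-bit operand as an index into the function's constant pool instead of as an atom. A rough sketch, assuming a little-endian operand and a flat `JSValue` array for the pool; `sketch_get_u32` and `sketch_read_key` are hypothetical helpers, not the VM's real `get_u32()` or dispatch code:

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

typedef uint64_t JSValue;   /* 64-bit build assumed */

/* Read a 32-bit little-endian operand from the bytecode stream. */
static uint32_t sketch_get_u32(const uint8_t *pc) {
    return (uint32_t)pc[0]
         | ((uint32_t)pc[1] << 8)
         | ((uint32_t)pc[2] << 16)
         | ((uint32_t)pc[3] << 24);
}

/* Resolve an opcode operand to its key by indexing the constant pool.
 * Returns 0 on success, -1 if the index is out of range. */
static int sketch_read_key(const uint8_t *pc, const JSValue *cpool,
                           uint32_t cpool_count, JSValue *out) {
    assert(pc != NULL && cpool != NULL && out != NULL);
    uint32_t idx = sketch_get_u32(pc);
    if (idx >= cpool_count)
        return -1;
    *out = cpool[idx];
    return 0;
}
```

The bounds check matters mostly for bytecode loaded from disk; for bytecode the compiler just emitted, it could be reduced to a debug assertion.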
```c
/* Text object: UTF-32 packed 2 chars per 64-bit word
* Pretext (mutable, stone=0): hdr.cap = char capacity, length field = current length
* Text (immutable, stone=1): hdr.cap = length, length field = hash
*/
typedef struct mist_text {
objhdr_t hdr; /* type=OBJ_TEXT, cap=char count, stone bit */
uint64_t length; /* pretext: char count | text: hash */
uint64_t packed[]; /* UTF-32 chars, 2 per word (high then low) */
} mist_text;
#### 5.7 Update resolve_scope_var()
- [ ] Currently reads var_name as atom from bytecode
- [ ] Compares with JS_ATOM_* constants
- [ ] Need to change to read cpool index and compare with JSValue
/* Create new text from UTF-8 C string */
JSValue JS_NewStringLen(JSContext *ctx, const char *str, size_t len) {
/* Try immediate text first */
JSValue imm = JS_TryNewImmediateASCII(str, len);
if (!JS_IsNull(imm)) return imm;
#### 5.8 Convert Remaining JS_ATOM_* Usages (~41 comparisons remain)
Categories:
- Bytecode reading (get_u32 reads atoms) - will change with opcode format
- js_parse_property_name callers - need function update first
- Stub atom functions - will be removed in Phase 7
/* Convert UTF-8 to UTF-32 */
uint32_t *utf32 = js_malloc(ctx, len * sizeof(uint32_t));
size_t utf32_len = utf8_to_utf32(str, len, utf32);
---
/* Allocate mist_text */
size_t word_count = (utf32_len + 1) / 2;
mist_text *text = js_mallocz(ctx, sizeof(mist_text) + word_count * sizeof(uint64_t));
text->hdr = objhdr_make(utf32_len, OBJ_TEXT, false, false, false, false);
text->length = utf32_len;
## Phase 6: Update Bytecode Serialization
/* Pack UTF-32 into words */
for (size_t i = 0; i < utf32_len; i += 2) {
uint64_t hi = utf32[i];
uint64_t lo = (i + 1 < utf32_len) ? utf32[i + 1] : 0;
text->packed[i / 2] = (hi << 32) | lo;
}
### 6.1 JS_WriteObjectTag Changes
- [ ] Write cpool values as JSValue (text serialization)
- [ ] Variable opcodes reference cpool indices (already u32)
js_free(ctx, utf32);
/* Add to GC list and return as JSValue */
return JS_MKPTR(text);
}
```
### 6.2 JS_ReadObject Changes
- [ ] Read cpool values as JSValue
- [ ] Variable opcode operands are cpool indices
### Phase 6: Remove JSObject, Use JSRecord Only
### 6.3 Version Bump
- [ ] Increment bytecode version to indicate new format
**Delete:**
- `JSObject` structure (line 1664)
- `JSShape` and `JSShapeProperty` structures
- Shape hash table in JSRuntime
- All shape-related functions
- `find_own_property()` and shape-based property access
---
**Keep only JSRecord with direct key/value storage:**
## Phase 7: Final Cleanup
```c
/* Record: open-addressing hash table with JSValue keys
* Slot 0 reserved: key[0] = class_id<<32 | rec_key_id, value[0] = opaque
*/
/* Slot: key/value pair stored together */
typedef struct JSSlot {
JSValue key;
JSValue val;
} JSSlot;
### 7.1 Remove JSAtom Type and Functions
- [ ] Remove `typedef uint32_t JSAtom`
- [ ] Remove JS_NewAtom, JS_NewAtomString
- [ ] Remove JS_FreeAtom, JS_DupAtom (currently stubs)
- [ ] Remove JS_AtomToValue, JS_ValueToAtom
- [ ] Remove JS_AtomToCString, JS_AtomGetStr
- [ ] Remove all JS_ATOM_* constants
- [ ] Remove JSAtomStruct and related
typedef struct JSRecord {
JSGCObjectHeader header;
objhdr_t mist_hdr; /* type=OBJ_RECORD, cap=slot_mask */
struct JSRecord *proto; /* prototype chain */
uint32_t len; /* number of live entries */
uint32_t tombs; /* tombstone count */
JSSlot *slots; /* key/value pairs, size = mask+1 */
} JSRecord;
### 7.2 Remove quickjs-atom.h
- [ ] Delete or convert to JSValue text initialization
/* Three key types for property lookup:
* 1. Immediate ASCII (JS_TAG_STRING_ASCII): hash from JSValue itself
* 2. Text object (mist_text pointer): hash from object's stored hash
* 3. Record object used as key: hash from monotonic ID in record's key[0]
*
* Per memory.md: when a record is used as a key, it gets assigned a
* monotonically increasing 32-bit ID stored in lower 32 bits of keys[0].
*/
### 7.3 Remove Legacy JSObject Type
- [ ] Remove JS_GC_OBJ_TYPE_JS_OBJECT if unused
- [ ] Remove JSObject struct (replaced by JSRecord)
/* Get hash for any key JSValue */
static uint64_t js_key_hash(JSValue key) {
if (JS_IsImmediateASCII(key)) {
/* Hash the entire JSValue for immediate ASCII */
return fash64_hash_one(key);
}
### 7.4 Update quickjs.h Public API
- [ ] Remove JSAtom references from public API
- [ ] Ensure all property functions use JSValue keys or const char*
if (!JS_IsPtr(key))
return 0; /* Invalid key */
---
void *ptr = JS_VALUE_GET_PTR(key);
objhdr_t hdr = *(objhdr_t *)ptr; /* Read object header */
uint8_t type = objhdr_type(hdr);
## Current Build Status
if (type == OBJ_TEXT) {
/* Text object: hash stored in length field (if stoned) or computed */
mist_text *text = (mist_text *)ptr;
return get_text_hash(text);
}
**Build: SUCCEEDS** with warnings (unused variables, labels)
if (type == OBJ_RECORD) {
/* Record used as key: hash from monotonic ID in slots[0].key */
JSRecord *rec = (JSRecord *)ptr;
uint32_t rec_id = (uint32_t)rec->slots[0].key; /* lower 32 bits */
return fash64_hash_one(rec_id);
}
**Statistics:**
- JS_ATOM_* comparisons: ~41 remaining (down from 171+)
- Most remaining are in bytecode reading code (will change with opcode format)
return 0; /* Unknown type */
}
**What Works:**
- Keyword detection via string comparison
- Function declaration parsing with JSValue names
- Variable definition with JSValue names
- Property access with JSValue keys
- Closure variable tracking with JSValue names
/* Ensure record has a key ID assigned (for use as property key) */
static void rec_ensure_key_id(JSRuntime *rt, JSRecord *rec) {
uint32_t id = (uint32_t)rec->slots[0].key;
if (id == 0) {
/* Assign new monotonically increasing ID */
id = ++rt->rec_key_next;
if (id == 0) id = ++rt->rec_key_next; /* Skip 0 */
rec->slots[0].key = (rec->slots[0].key & 0xFFFFFFFF00000000ULL) | id;
}
}
**Next Priority:**
1. Update js_parse_property_name() to use JSValue
2. Update VM opcode handlers to read from cpool
3. Convert remaining bytecode-related JS_ATOM_* usages
/* Compare two keys for equality */
static JS_BOOL js_key_equal(JSValue a, JSValue b) {
/* Fast path: identical values */
if (a == b) return TRUE;
/* Immediate ASCII: must be identical (handled above) */
if (JS_IsImmediateASCII(a) || JS_IsImmediateASCII(b))
return FALSE;
/* Both must be pointers */
if (!JS_IsPtr(a) || !JS_IsPtr(b))
return FALSE;
void *pa = JS_VALUE_GET_PTR(a);
void *pb = JS_VALUE_GET_PTR(b);
uint8_t ta = objhdr_type(*(objhdr_t *)pa);
uint8_t tb = objhdr_type(*(objhdr_t *)pb);
/* Record keys: pointer equality (identity) */
if (ta == OBJ_RECORD || tb == OBJ_RECORD)
return FALSE; /* Already checked a == b above */
/* Text objects: string content comparison */
if (ta == OBJ_TEXT && tb == OBJ_TEXT)
return text_content_equal((mist_text *)pa, (mist_text *)pb);
return FALSE;
}
/* Property lookup using open-addressing hash table */
static int rec_find_slot(JSRecord *rec, JSValue key) {
uint32_t mask = (uint32_t)objhdr_cap56(rec->mist_hdr);
uint64_t hash = js_key_hash(key);
uint32_t idx = hash & mask;
if (idx == 0) idx = 1; /* slot 0 reserved */
for (uint32_t i = 0; i <= mask; i++) {
JSValue k = rec->slots[idx].key;
if (JS_IsNull(k)) return -1; /* empty, not found */
if (k == JS_EXCEPTION) { /* tombstone, continue */
idx = (idx + 1) & mask;
if (idx == 0) idx = 1;
continue;
}
if (js_key_equal(k, key)) return idx;
idx = (idx + 1) & mask;
if (idx == 0) idx = 1;
}
return -1;
}
```
### Phase 7: Update Property Access Functions
Replace all JSObject property functions with JSRecord equivalents:
```c
JSValue JS_GetPropertyInternal(JSContext *ctx, JSValueConst this_obj,
JSValue prop, JS_BOOL throw_ref_error) {
if (!JS_IsPtr(this_obj)) {
if (throw_ref_error)
return JS_ThrowTypeError(ctx, "not an object");
return JS_NULL;
}
JSRecord *rec = (JSRecord *)JS_VALUE_GET_PTR(this_obj);
while (rec) {
int idx = rec_find_slot(rec, prop);
if (idx >= 0) {
return rec->slots[idx].val; /* No dup needed if no ref counting */
}
rec = rec->proto;
}
if (throw_ref_error)
return JS_ThrowReferenceError(ctx, "property not found");
return JS_NULL;
}
int JS_SetPropertyInternal(JSContext *ctx, JSValueConst this_obj,
JSValue prop, JSValue val) {
if (!JS_IsPtr(this_obj))
return -1;
JSRecord *rec = (JSRecord *)JS_VALUE_GET_PTR(this_obj);
int idx = rec_find_slot(rec, prop);
if (idx >= 0) {
rec->slots[idx].val = val;
return 0;
}
/* Add new property */
return rec_add_property(ctx, rec, prop, val);
}
```
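`rec_add_property` is called above but not shown. One way the insertion side of the open-addressing table could work is sketched below, assuming the same conventions as `rec_find_slot`: slot 0 is reserved, probing is linear, and tombstones may be reused. Everything here is hypothetical (`SketchRec`, `sketch_put`, the sentinel values, the identity hash, the fixed 8-slot table), and growth/rehashing is omitted.

```c
#include <assert.h>
#include <stdint.h>

#define SKETCH_EMPTY 0u   /* stand-in for a JS_NULL key */
#define SKETCH_TOMB  1u   /* stand-in for the JS_EXCEPTION tombstone */

typedef struct { uint64_t key, val; } SketchSlot;

typedef struct {
    uint32_t mask;         /* slot count - 1 */
    uint32_t len;          /* live entries */
    uint32_t tombs;        /* tombstone count */
    SketchSlot slots[8];   /* slot 0 is reserved and never probed */
} SketchRec;

/* Advance the probe index, skipping reserved slot 0. */
static uint32_t sketch_next(uint32_t idx, uint32_t mask) {
    idx = (idx + 1) & mask;
    return idx == 0 ? 1 : idx;
}

/* Insert or overwrite key. Returns the slot index used, or -1 when
 * no free slot exists. Keys must be greater than the sentinels. */
static int sketch_put(SketchRec *rec, uint64_t key, uint64_t val) {
    assert(key > SKETCH_TOMB);
    uint32_t idx = (uint32_t)key & rec->mask;  /* identity "hash" */
    if (idx == 0) idx = 1;
    int target = -1;                           /* first reusable slot */
    for (uint32_t i = 0; i < rec->mask; i++) { /* mask usable slots */
        uint64_t k = rec->slots[idx].key;
        if (k == key) {                        /* existing key: overwrite */
            rec->slots[idx].val = val;
            return (int)idx;
        }
        if (k == SKETCH_EMPTY) {               /* key cannot be further on */
            if (target < 0) target = (int)idx;
            break;
        }
        if (k == SKETCH_TOMB && target < 0)
            target = (int)idx;                 /* remember, keep probing */
        idx = sketch_next(idx, rec->mask);
    }
    if (target < 0)
        return -1;                             /* table full */
    if (rec->slots[target].key == SKETCH_TOMB)
        rec->tombs--;
    rec->slots[target].key = key;
    rec->slots[target].val = val;
    rec->len++;
    return target;
}
```

The probe has to continue past tombstones before inserting, otherwise a key that already exists further along the chain would be duplicated.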
### Phase 8: C Class Storage in Slot 0
Per memory.md, slot 0 is reserved for internal use:
```c
/* slots[0].key: lower 32 bits = rec_key_id (for identity-based keys)
* upper 32 bits = class_id (C class)
* slots[0].val: opaque C pointer
*/
void JS_SetOpaque(JSContext *ctx, JSValue obj, void *opaque) {
JSRecord *rec = (JSRecord *)JS_VALUE_GET_PTR(obj);
rec->slots[0].val = (JSValue)(uintptr_t)opaque;
}
void *JS_GetOpaque(JSContext *ctx, JSValue obj, uint32_t class_id) {
JSRecord *rec = (JSRecord *)JS_VALUE_GET_PTR(obj);
uint32_t stored_class = (uint32_t)(rec->slots[0].key >> 32);
if (stored_class != class_id) return NULL;
return (void *)(uintptr_t)rec->slots[0].val;
}
void JS_SetClassID(JSRecord *rec, uint32_t class_id) {
rec->slots[0].key = (rec->slots[0].key & 0xFFFFFFFF) | ((uint64_t)class_id << 32);
}
```
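Since `slots[0].key` carries two independent 32-bit fields, each setter has to preserve the other half. The sketch below isolates that bit manipulation on a plain `uint64_t` so it can be checked on its own; the `sketch_*` helpers are hypothetical and do not touch a real `JSRecord`.

```c
#include <assert.h>
#include <stdint.h>

/* Replace the upper 32 bits (class_id), keep the lower 32 (rec_key_id). */
static uint64_t sketch_set_class_id(uint64_t word, uint32_t class_id) {
    return (word & 0xFFFFFFFFULL) | ((uint64_t)class_id << 32);
}

/* Replace the lower 32 bits (rec_key_id), keep the upper 32 (class_id).
 * An id of 0 means "not assigned yet" in the plan, so reject it here. */
static uint64_t sketch_set_key_id(uint64_t word, uint32_t key_id) {
    assert(key_id != 0);
    return (word & 0xFFFFFFFF00000000ULL) | key_id;
}

static uint32_t sketch_class_id(uint64_t word) {
    return (uint32_t)(word >> 32);
}

static uint32_t sketch_key_id(uint64_t word) {
    return (uint32_t)word;
}
```

Setting either field in any order should leave the other one unchanged, which is the property `JS_SetClassID` and `rec_ensure_key_id` both rely on.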
### Phase 9: Update GC
The GC needs updates for the new object format:
```c
static void mark_children(JSRuntime *rt, JSGCObjectHeader *gp, ...) {
switch (gp->gc_obj_type) {
case JS_GC_OBJ_TYPE_RECORD:
{
JSRecord *rec = (JSRecord *)gp;
uint32_t mask = objhdr_cap56(rec->mist_hdr);
if (rec->proto)
mark_func(rt, &rec->proto->header);
for (uint32_t i = 1; i <= mask; i++) {
if (!JS_IsNull(rec->slots[i].key) &&
rec->slots[i].key != JS_EXCEPTION) { /* tombstone */
/* Mark key if it's a pointer */
if (JS_IsPtr(rec->slots[i].key))
JS_MarkValue(rt, rec->slots[i].key, mark_func);
/* Mark value if it's a pointer */
if (JS_IsPtr(rec->slots[i].val))
JS_MarkValue(rt, rec->slots[i].val, mark_func);
}
}
}
break;
// ... other cases
}
}
```
## Cleanup - Items to Remove
1. **quickjs.h:**
- Old NaN-boxing macros (JS_VALUE_GET_TAG, JS_MKVAL, etc.)
- JS_TAG_STRING, JS_TAG_STRING_ROPE, JS_TAG_OBJECT, JS_TAG_ARRAY, JS_TAG_FUNCTION
- JSValueConst (just use JSValue)
2. **quickjs.c:**
- JSStringRope structure and functions
- JSShape and JSShapeProperty structures
- Shape hash table and functions
- Atom-based property access (keep atoms for parser/compiler)
- JSObject structure (replace with JSRecord)
- `find_own_property()`, `add_shape_property()`, etc.
## Verification
1. **Build:** `make` completes without errors
2. **Basic test:** Create objects, set/get properties
3. **Number test:** Verify short float encoding/decoding, out-of-range → null
4. **String test:** Immediate text for short strings, mist_text for long
5. **GC test:** Create cycles, verify collection works
6. **C class test:** SetOpaque/GetOpaque work with slot 0 storage
---
## Notes
- This is a **major rework** affecting most of the codebase
- Atoms remain for parser/compiler but not for object property storage
- Reference counting may be simplified since there are fewer pointer types
- The short float range (+-3.4e38) covers most practical use cases
- Out-of-range numbers becoming NULL is intentional per memory.md
- JSVarDef.var_name is JSValue
- JSClosureVar.var_name is JSValue
- JSGlobalVar.var_name is JSValue
- JSFunctionDef.func_name is JSValue
- BlockEnv.label_name is JSValue
- OP_get_field/put_field/define_field already use cpool index format
- JSRecord with open addressing is fully implemented
- js_key_hash and js_key_equal work with both immediate and heap text
- js_key_equal_str enables comparison with C string literals for internal names
---
## Testing Strategy
After each sub-phase:
1. `make` - verify compilation
2. Run basic eval: `./cell -e "1+1"`
3. Run property test: `./cell -e "var o = {a:1}; o.a"`
4. Run function test: `./cell -e "(function f(x){return x*2})(3)"`
5. Run closure test: `./cell -e "var f = (function(){var x=1;return function(){return x++}})(); f(); f()"`

File diff suppressed because it is too large