rm atoms
660
plan.md
@@ -1,547 +1,215 @@
|
||||
# Refactoring QuickJS to Mist Memory Format
|
||||
# Cell/QuickJS Refactoring Plan: Remove Atoms, Shapes, and Dual-Encoding
|
||||
|
||||
## Summary
|
||||
## Overview
|
||||
|
||||
Complete rework of `quickjs.h` and `quickjs.c` to align with `docs/memory.md` and the new JSValue encoding scheme using LSB-based type discrimination with short floats.
|
||||
Refactor `source/quickjs.c` to match `docs/memory.md` specification:
|
||||
- Remove JSAtom system (171 references → ~41 remaining)
|
||||
- Remove JSShape system (94 references) ✓
|
||||
- Remove IC caches (shape-based inline caches) ✓
|
||||
- Remove `is_wide_char` dual-encoding (18 locations) ✓
|
||||
- Use JSValue texts directly as property keys
|
||||
- Reference: `mquickjs.c` shows the target pattern
|
||||
|
||||
## Key Design Decisions (from user)
|
||||
## Completed Phases
|
||||
|
||||
1. **Remove NaN-boxing entirely** - Use LSB-based type tags instead
|
||||
2. **Short float for numbers** - Truncated double (3 fewer exponent bits), out-of-range → NULL
|
||||
3. **Optional 32-bit float mode** - Compile-time option, stored like ints
|
||||
4. **Remove KeyId** - Use JSValue directly as keys in objects
|
||||
5. **Remove JSStringRope** - No lazy concatenation, immediate text creation
|
||||
6. **Remove JSObject/shapes** - Move to JSRecord only with direct key/value storage
|
||||
7. **Remove atoms from objects** - String interning for literals/properties only
|
||||
### Phase 1: Remove is_wide_char Remnants ✓
|
||||
### Phase 2: Remove IC Caches ✓
|
||||
### Phase 3: Remove JSShape System ✓
|
||||
### Phase 4: Complete Property Access with JSValue Keys ✓
|
||||
|
||||
## New JSValue Encoding (64-bit)
|
||||
Completed:
|
||||
- Removed JS_GC_OBJ_TYPE_JS_OBJECT fallbacks from OP_get_field
|
||||
- Removed JS_GC_OBJ_TYPE_JS_OBJECT fallbacks from OP_put_field
|
||||
- Removed JS_GC_OBJ_TYPE_JS_OBJECT fallbacks from OP_define_field
|
||||
- Created emit_key() function that adds JSValue to cpool and emits index
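The `emit_key()` step described above (add a key to the constant pool, then emit its index) can be sketched as follows. This is a minimal illustration, not the code in `quickjs.c`: `const_pool`, `cpool_add_key`, and `emit_u32` are hypothetical names, the pool has a fixed capacity, keys are treated as raw 64-bit words, and both the reuse of existing entries and the little-endian operand layout are assumptions.

```c
#include <stdint.h>
#include <stddef.h>

#define CPOOL_MAX 256  /* fixed capacity to keep the sketch short */

typedef struct {
    uint64_t values[CPOOL_MAX];
    size_t count;
} const_pool;

/* Return the index of key in the pool, appending it if absent; -1 when full.
 * Reusing an existing entry is an assumption, and comparing raw 64-bit words
 * only matches immediate values or identical pointers. */
static int cpool_add_key(const_pool *cp, uint64_t key) {
    for (size_t i = 0; i < cp->count; i++)
        if (cp->values[i] == key) return (int)i;
    if (cp->count == CPOOL_MAX) return -1;
    cp->values[cp->count] = key;
    return (int)cp->count++;
}

/* Write the pool index into the bytecode buffer as a little-endian u32.
 * Returns the number of bytes written. */
static size_t emit_u32(uint8_t *buf, uint32_t idx) {
    for (int i = 0; i < 4; i++) buf[i] = (uint8_t)(idx >> (8 * i));
    return 4;
}
```

A caller would combine the two: look up or add the key, then emit the returned index as the opcode operand.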
|
||||
|
||||
Based on the provided header, using LSB-based discrimination:
|
||||
---
|
||||
|
||||
```
|
||||
LSB = 0 → 31-bit signed integer (value >> 1)
|
||||
LSB = 001 → 61-bit pointer (8-byte aligned; low 3 bits distinguish it from short float)
|
||||
LSB = 101 → Short float (truncated double, 3 fewer exponent bits)
|
||||
LSB = 11 → Special tag (next 3 bits for subtype, 5 bits total)
|
||||
```
|
||||
## Phase 5: Convert JSAtom to JSValue Text (IN PROGRESS)
|
||||
|
||||
**Special tags (5 bits, LSB = 11):**
|
||||
- `00011` (3) = JS_TAG_BOOL (payload bit 5 = value)
|
||||
- `00111` (7) = JS_TAG_NULL
|
||||
- `01011` (11) = JS_TAG_UNDEFINED (may not be needed - use NULL)
|
||||
- `01111` (15) = JS_TAG_EXCEPTION
|
||||
- `10111` (23) = JS_TAG_UNINITIALIZED
|
||||
- `11011` (27) = JS_TAG_STRING_ASCII (immediate string: 3-bit len + up to 7 ASCII bytes)
|
||||
- `11111` (31) = JS_TAG_CATCH_OFFSET
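The tag layout above can be sketched as a small classifier. This is only an illustration, assuming a 64-bit value and the bit patterns listed in this plan; `value_kind`, `classify_value`, `special_tag`, and `special_payload` are hypothetical names, not part of `quickjs.h`.

```c
#include <stdint.h>

/* Hypothetical names for illustration; not taken from quickjs.h. */
typedef enum { KIND_INT, KIND_PTR, KIND_SHORT_FLOAT, KIND_SPECIAL } value_kind;

/* Classify a 64-bit value by its low bits, following the layout in this plan. */
static value_kind classify_value(uint64_t v) {
    if ((v & 1) == 0) return KIND_INT;       /* ...0 : 31-bit integer */
    if ((v & 3) == 3) return KIND_SPECIAL;   /* ..11 : special, 5-bit tag */
    /* Low two bits are 01 here; bit 2 separates 001 (pointer) from 101. */
    return (v & 4) ? KIND_SHORT_FLOAT : KIND_PTR;
}

/* For special values the subtype is the low 5 bits; the payload starts at bit 5. */
static unsigned special_tag(uint64_t v)     { return (unsigned)(v & 0x1F); }
static uint64_t special_payload(uint64_t v) { return v >> 5; }
```

With this layout, `true` is the BOOL tag (3) with payload 1, i.e. the word 35, and `null` is the word 7.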
|
||||
This is the core transformation. All identifier handling moves from atoms to JSValue.
|
||||
|
||||
## Critical Files
|
||||
### Completed Items
|
||||
|
||||
- `/Users/johnalanbrook/work/cell/source/quickjs.h` - Complete rewrite of JSValue encoding
|
||||
- `/Users/johnalanbrook/work/cell/source/quickjs.c` - Remove shapes, atoms from objects, string ropes
|
||||
**Token and Parser Infrastructure:**
|
||||
- [x] Change JSToken.u.ident.atom to JSToken.u.ident.str (JSValue)
|
||||
- [x] Change parse_ident() to return JSValue
|
||||
- [x] Create emit_key() function (cpool-based)
|
||||
- [x] Create JS_KEY_* macros for common names (lines ~279-335 in quickjs.c)
|
||||
- [x] Update all token.u.ident.atom references to .str
|
||||
- [x] Create keyword lookup table (js_keywords[]) with string comparison
|
||||
- [x] Rewrite update_token_ident() to use js_keyword_lookup()
|
||||
- [x] Rewrite is_strict_future_keyword() to use JSValue
|
||||
- [x] Update token_is_pseudo_keyword() to use JSValue and js_key_equal()
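Keyword detection by string comparison, as listed above, might look like the sketch below. The plan does not show how `js_keywords[]` is organised, so the table layout, the linear scan, the token ids, and the names `keyword_entry` and `keyword_lookup` are all assumptions made for illustration.

```c
#include <stddef.h>
#include <string.h>

/* Hypothetical token ids; the real values live in quickjs.c. */
enum { TOK_IDENT = 0, TOK_IF, TOK_RETURN, TOK_VAR, TOK_WHILE };

typedef struct { const char *name; int token; } keyword_entry;

/* Illustrative subset only. */
static const keyword_entry keywords[] = {
    { "if", TOK_IF }, { "return", TOK_RETURN },
    { "var", TOK_VAR }, { "while", TOK_WHILE },
};

/* Return the keyword token for an identifier of length len, or TOK_IDENT. */
static int keyword_lookup(const char *ident, size_t len) {
    for (size_t i = 0; i < sizeof(keywords) / sizeof(keywords[0]); i++) {
        const char *k = keywords[i].name;
        /* Compare lengths first so "va" and "variable" do not match "var". */
        if (strlen(k) == len && memcmp(k, ident, len) == 0)
            return keywords[i].token;
    }
    return TOK_IDENT;
}
```

A linear scan is enough for a table of a few dozen keywords; a sorted table with binary search would be a drop-in replacement if profiling ever showed this to matter.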
|
||||
|
||||
## Implementation Plan
|
||||
**Function Declaration Parsing:**
|
||||
- [x] Update js_parse_function_decl() signature to use JSValue func_name
|
||||
- [x] Update js_parse_function_decl2() to use JSValue func_name throughout
|
||||
- [x] Update js_parse_function_check_names() to use JSValue
|
||||
- [x] Convert JS_DupAtom/JS_FreeAtom to JS_DupValue/JS_FreeValue in function parsing
|
||||
|
||||
### Phase 1: New JSValue Encoding in quickjs.h
|
||||
**Variable Definition and Lookup:**
|
||||
- [x] Update find_global_var() to use JSValue and js_key_equal()
|
||||
- [x] Update find_lexical_global_var() to use JSValue
|
||||
- [x] Update find_lexical_decl() to use JSValue and js_key_equal()
|
||||
- [x] Update js_define_var() to use JSValue
|
||||
- [x] Update js_parse_check_duplicate_parameter() to use JSValue and js_key_equal()
|
||||
- [x] Update js_parse_destructuring_var() to return JSValue
|
||||
- [x] Update js_parse_var() to use JSValue for variable names
|
||||
|
||||
Replace the entire JSValue system with LSB-based tags:
|
||||
**Comparison Helpers:**
|
||||
- [x] Create js_key_equal_str() for comparing JSValue with C string literals
|
||||
- [x] Update is_var_in_arg_scope() to use js_key_equal/js_key_equal_str
|
||||
- [x] Update has_with_scope() to use js_key_equal_str
|
||||
- [x] Update closure variable comparisons (cv->var_name) to use js_key_equal_str
|
||||
|
||||
**Property Access:**
|
||||
- [x] Fix JS_GetPropertyStr to create proper JSValue keys
|
||||
- [x] Fix JS_SetPropertyInternal callers to use JS_KEY_* instead of JS_ATOM_*
|
||||
|
||||
### JS_KEY_* Macros Added
|
||||
|
||||
Compile-time immediate ASCII string constants (≤7 chars):
|
||||
```c
|
||||
#if INTPTR_MAX >= INT64_MAX
|
||||
#define JS_PTR64
|
||||
typedef uint64_t JSValue;
|
||||
#define JSW 8
|
||||
#define JS_USE_SHORT_FLOAT
|
||||
#else
|
||||
typedef uint32_t JSValue;
|
||||
#define JSW 4
|
||||
#endif
|
||||
|
||||
enum {
|
||||
JS_TAG_INT = 0, /* LSB = 0, 31-bit int */
|
||||
JS_TAG_PTR = 1, /* LSB = 01, pointer */
|
||||
JS_TAG_SPECIAL = 3, /* LSB = 11, special values */
|
||||
JS_TAG_BOOL = JS_TAG_SPECIAL | (0 << 2), /* 5 bits */
|
||||
JS_TAG_NULL = JS_TAG_SPECIAL | (1 << 2),
|
||||
JS_TAG_EXCEPTION = JS_TAG_SPECIAL | (3 << 2),
|
||||
JS_TAG_UNINITIALIZED = JS_TAG_SPECIAL | (5 << 2),
|
||||
JS_TAG_STRING_ASCII = JS_TAG_SPECIAL | (6 << 2), /* immediate ASCII string */
|
||||
JS_TAG_CATCH_OFFSET = JS_TAG_SPECIAL | (7 << 2),
|
||||
#ifdef JS_USE_SHORT_FLOAT
|
||||
JS_TAG_SHORT_FLOAT = 5, /* LSB = 101 */
|
||||
#endif
|
||||
};
|
||||
|
||||
/* Value extraction */
|
||||
#define JS_VALUE_GET_INT(v) ((int32_t)(v) >> 1)
|
||||
#define JS_VALUE_GET_PTR(v) ((void *)((v) & ~(JSW - 1)))
|
||||
#define JS_VALUE_GET_SPECIAL_TAG(v) ((v) & 0x1F)
|
||||
#define JS_VALUE_GET_SPECIAL_VALUE(v) ((int32_t)(v) >> 5)
|
||||
|
||||
/* Value creation */
|
||||
#define JS_MKINT(val) (((JSValue)(val) << 1) | JS_TAG_INT)
|
||||
#define JS_MKPTR(ptr) (((JSValue)(uintptr_t)(ptr)) | JS_TAG_PTR)
|
||||
#define JS_MKSPECIAL(tag, val) ((JSValue)(tag) | ((JSValue)(val) << 5))
|
||||
|
||||
/* Type checks */
|
||||
static inline JS_BOOL JS_IsInt(JSValue v) { return (v & 1) == JS_TAG_INT; }
|
||||
static inline JS_BOOL JS_IsPtr(JSValue v) { return (v & (JSW-1)) == JS_TAG_PTR; }
|
||||
static inline JS_BOOL JS_IsNull(JSValue v) { return v == JS_MKSPECIAL(JS_TAG_NULL, 0); }
|
||||
static inline JS_BOOL JS_IsException(JSValue v) { return JS_VALUE_GET_SPECIAL_TAG(v) == JS_TAG_EXCEPTION; }
|
||||
|
||||
#ifdef JS_USE_SHORT_FLOAT
|
||||
static inline JS_BOOL JS_IsShortFloat(JSValue v) { return (v & 7) == JS_TAG_SHORT_FLOAT; }
|
||||
#endif
|
||||
|
||||
/* Constants */
|
||||
#define JS_NULL JS_MKSPECIAL(JS_TAG_NULL, 0)
|
||||
#define JS_FALSE JS_MKSPECIAL(JS_TAG_BOOL, 0)
|
||||
#define JS_TRUE JS_MKSPECIAL(JS_TAG_BOOL, 1)
|
||||
#define JS_EXCEPTION JS_MKSPECIAL(JS_TAG_EXCEPTION, 0)
|
||||
#define JS_UNINITIALIZED JS_MKSPECIAL(JS_TAG_UNINITIALIZED, 0)
|
||||
JS_KEY_empty, JS_KEY_name, JS_KEY_message, JS_KEY_stack,
|
||||
JS_KEY_errors, JS_KEY_Error, JS_KEY_cause, JS_KEY_length,
|
||||
JS_KEY_value, JS_KEY_get, JS_KEY_set, JS_KEY_raw,
|
||||
JS_KEY_flags, JS_KEY_source, JS_KEY_exec, JS_KEY_toJSON,
|
||||
JS_KEY_eval, JS_KEY_this, JS_KEY_true, JS_KEY_false,
|
||||
JS_KEY_null, JS_KEY_NaN, JS_KEY_default, JS_KEY_index,
|
||||
JS_KEY_input, JS_KEY_groups, JS_KEY_indices, JS_KEY_let,
|
||||
JS_KEY_var, JS_KEY_new, JS_KEY_of, JS_KEY_yield,
|
||||
JS_KEY_async, JS_KEY_target, JS_KEY_from, JS_KEY_meta,
|
||||
JS_KEY_as, JS_KEY_with
|
||||
```
|
||||
|
||||
### Phase 2: Short Float Implementation
|
||||
|
||||
Short float uses 3 fewer exponent bits than double. Numbers outside range become NULL.
|
||||
|
||||
Runtime macro for strings >7 chars:
|
||||
```c
|
||||
/* Short float: 61 bits = 1 sign + 8 exp + 52 mantissa (vs double's 11 exp)
|
||||
* Range: approximately +-3.4e38 (vs double's +-1.8e308)
|
||||
* Out of range values become JS_NULL
|
||||
* Zero and subnormals: 0.0 is representable, subnormals become 0.0
|
||||
*/
|
||||
static inline JSValue JS_NewFloat64(JSContext *ctx, double d) {
|
||||
union { double d; uint64_t u; } u;
|
||||
u.d = d;
|
||||
|
||||
/* Extract sign, exponent, mantissa */
|
||||
uint64_t sign = u.u >> 63;
|
||||
int exp = (u.u >> 52) & 0x7FF;
|
||||
uint64_t mantissa = u.u & ((1ULL << 52) - 1);
|
||||
|
||||
/* Special case: zero (exp=0, mantissa=0) */
|
||||
if (exp == 0 && mantissa == 0) {
|
||||
/* Encode +0.0 or -0.0 */
|
||||
return (sign << 63) | JS_TAG_SHORT_FLOAT; /* short_exp=0, mantissa=0 */
|
||||
}
|
||||
|
||||
/* Check for NaN/Inf (exp=0x7FF) */
|
||||
if (exp == 0x7FF) {
|
||||
return JS_NULL; /* NaN or Infinity → null */
|
||||
}
|
||||
|
||||
/* Subnormals (exp=0, mantissa!=0): flush to zero */
|
||||
if (exp == 0) {
|
||||
return (sign << 63) | JS_TAG_SHORT_FLOAT; /* becomes +/-0.0 */
|
||||
}
|
||||
|
||||
/* Normal numbers: convert exponent bias */
|
||||
/* Double bias = 1023, short float bias = 127 */
|
||||
int short_exp = exp - 1023 + 127;
|
||||
if (short_exp < 1 || short_exp > 254) {
|
||||
return JS_NULL; /* Out of range (short_exp 0 and 255 are special) */
|
||||
}
|
||||
|
||||
/* Check if it fits in int32 (prefer integer encoding) */
|
||||
if (d >= INT32_MIN && d <= INT32_MAX) {
|
||||
int32_t i = (int32_t)d;
|
||||
if ((double)i == d) {
|
||||
return JS_MKINT(i);
|
||||
}
|
||||
}
|
||||
|
||||
/* Encode as short float:
|
||||
* [sign:1][short_exp:8][mantissa:52][tag:3] */
|
||||
JSValue v = (sign << 63) | ((uint64_t)short_exp << 55) | (mantissa << 3) | JS_TAG_SHORT_FLOAT;
|
||||
return v;
|
||||
}
|
||||
|
||||
static inline double JS_VALUE_GET_FLOAT64(JSValue v) {
|
||||
/* Decode short float back to double */
|
||||
uint64_t sign = v >> 63;
|
||||
uint64_t short_exp = (v >> 55) & 0xFF;
|
||||
uint64_t mantissa = (v >> 3) & ((1ULL << 52) - 1);
|
||||
|
||||
/* Convert exponent: short bias 127 → double bias 1023 */
|
||||
uint64_t exp = short_exp ? short_exp - 127 + 1023 : 0; /* short_exp 0 encodes +/-0.0 and must not be rebiased */
|
||||
|
||||
union { double d; uint64_t u; } u;
|
||||
u.u = (sign << 63) | (exp << 52) | mantissa;
|
||||
return u.d;
|
||||
}
|
||||
#define JS_KEY_STR(ctx, str) JS_NewStringLen((ctx), (str), sizeof(str) - 1)
|
||||
```
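The short float scheme above can be checked with a self-contained round trip. This sketch mirrors the layout described in the plan (1 sign bit, 8 exponent bits, 52 mantissa bits, 3 tag bits) but is not the code in `quickjs.h`: `sf_encode` and `sf_decode` are hypothetical names, failure is reported with a return flag instead of `JS_NULL`, and the integer fast path is omitted. It also treats a stored exponent of 0 as zero on decode, which the encoding relies on.

```c
#include <stdint.h>
#include <string.h>

#define SF_TAG 5u  /* low bits 101, as in the plan */

/* Encode a double; returns 0 on failure (NaN, Infinity, or out of range).
 * The plan returns JS_NULL in that case; a flag keeps this self-contained. */
static int sf_encode(double d, uint64_t *out) {
    uint64_t u;
    memcpy(&u, &d, sizeof u);
    uint64_t sign = u >> 63;
    int exp = (int)((u >> 52) & 0x7FF);
    uint64_t mant = u & ((1ULL << 52) - 1);

    if (exp == 0x7FF) return 0;            /* NaN or Infinity */
    if (exp == 0) {                        /* zero or subnormal: flush to +/-0.0 */
        *out = (sign << 63) | SF_TAG;
        return 1;
    }
    int short_exp = exp - 1023 + 127;      /* rebias 11-bit exponent to 8 bits */
    if (short_exp < 1 || short_exp > 254) return 0;
    *out = (sign << 63) | ((uint64_t)short_exp << 55) | (mant << 3) | SF_TAG;
    return 1;
}

static double sf_decode(uint64_t v) {
    uint64_t sign = v >> 63;
    uint64_t short_exp = (v >> 55) & 0xFF;
    uint64_t mant = (v >> 3) & ((1ULL << 52) - 1);
    /* short_exp 0 is reserved for zero, so it must not be rebiased. */
    uint64_t exp = short_exp ? short_exp - 127 + 1023 : 0;
    uint64_t u = (sign << 63) | (exp << 52) | mant;
    double d;
    memcpy(&d, &u, sizeof d);
    return d;
}
```

Because all 52 mantissa bits are kept, any in-range double round-trips exactly; only the exponent range is reduced.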
|
||||
|
||||
### Phase 3: Immediate ASCII String (JS_TAG_STRING_ASCII)
|
||||
|
||||
Up to 7 ASCII characters stored directly in JSValue payload.
|
||||
|
||||
**Layout (64-bit):**
|
||||
- Bits 0-4: Tag (JS_TAG_STRING_ASCII = 27)
|
||||
- Bits 5-7: Length (0-7)
|
||||
- Bits 8-63: Up to 7 ASCII bytes (char[0] in bits 8-15, etc.)
|
||||
|
||||
Helper function for comparing JSValue with C string literals:
|
||||
```c
|
||||
#define JS_ASCII_MAX_LEN 7
|
||||
|
||||
/* Check if value is immediate ASCII string */
|
||||
static inline JS_BOOL JS_IsImmediateASCII(JSValue v) {
|
||||
return JS_VALUE_GET_SPECIAL_TAG(v) == JS_TAG_STRING_ASCII;
|
||||
}
|
||||
|
||||
/* Get immediate ASCII string length (bits 5-7) */
|
||||
static inline size_t JS_GetImmediateASCIILen(JSValue v) {
|
||||
return (v >> 5) & 0x7;
|
||||
}
|
||||
|
||||
/* Get immediate ASCII string character at index */
|
||||
static inline char JS_GetImmediateASCIIChar(JSValue v, int idx) {
|
||||
return (char)((v >> (8 + idx * 8)) & 0xFF);
|
||||
}
|
||||
|
||||
/* Try to create immediate ASCII string, returns JS_NULL if doesn't fit */
|
||||
static inline JSValue JS_TryNewImmediateASCII(const char *str, size_t len) {
|
||||
if (len > JS_ASCII_MAX_LEN) return JS_NULL;
|
||||
for (size_t i = 0; i < len; i++) {
|
||||
if ((uint8_t)str[i] >= 0x80) return JS_NULL; /* non-ASCII */
|
||||
}
|
||||
/* Tag (5 bits) | Length (3 bits) | chars (56 bits) */
|
||||
JSValue v = JS_TAG_STRING_ASCII | ((JSValue)len << 5);
|
||||
for (size_t i = 0; i < len; i++) {
|
||||
v |= ((JSValue)(uint8_t)str[i]) << (8 + i * 8);
|
||||
}
|
||||
return v;
|
||||
}
|
||||
|
||||
/* Hash an immediate ASCII string (hash the entire JSValue) */
|
||||
static inline uint64_t js_hash_immediate_ascii(JSValue v) {
|
||||
fash64_state s;
|
||||
fash64_begin(&s);
|
||||
fash64_word(&s, v);
|
||||
return fash64_end(&s);
|
||||
}
|
||||
static JS_BOOL js_key_equal_str(JSValue a, const char *str);
|
||||
```
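`js_key_equal_str()` is only declared above, so here is one way the immediate-ASCII case could work: pack the C literal using the same layout and compare the two words. This is a sketch under that assumption; `imm_pack` and `imm_equal_str` are hypothetical names, values are plain `uint64_t`, and the heap-text case (strings longer than 7 characters) is left out.

```c
#include <stdint.h>
#include <stddef.h>
#include <string.h>

#define IMM_TAG 27u      /* JS_TAG_STRING_ASCII in the plan */
#define IMM_MAX_LEN 7

/* Pack up to 7 ASCII bytes: tag in bits 0-4, length in bits 5-7, bytes from bit 8.
 * Returns 0 when the string does not fit; 0 is never a valid packed value
 * because the tag bits are non-zero. */
static uint64_t imm_pack(const char *s, size_t len) {
    if (len > IMM_MAX_LEN) return 0;
    uint64_t v = IMM_TAG | ((uint64_t)len << 5);
    for (size_t i = 0; i < len; i++) {
        unsigned char c = (unsigned char)s[i];
        if (c >= 0x80) return 0;           /* non-ASCII */
        v |= (uint64_t)c << (8 + 8 * i);
    }
    return v;
}

/* Compare a packed value with a C string by packing the string and comparing words. */
static int imm_equal_str(uint64_t v, const char *s) {
    uint64_t packed = imm_pack(s, strlen(s));
    return packed != 0 && packed == v;
}
```

Since equal short strings always produce the same word, equality is a single integer comparison and no interning table is needed for them.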
|
||||
|
||||
### Phase 4: Remove JSStringRope
|
||||
### Remaining Work
|
||||
|
||||
Delete `JSStringRope` structure and all rope-related functions:
|
||||
- `js_new_string_rope()` (line 4815)
|
||||
- `js_rebalancee_string_rope()` (line 4952)
|
||||
- `string_rope_iter_*` functions
|
||||
- `JS_TAG_STRING_ROPE` usage
|
||||
#### 5.3 Update js_parse_property_name()
|
||||
- [ ] Change return type from JSAtom* to JSValue*
|
||||
- [ ] Update all callers (js_parse_object_literal, etc.)
|
||||
- [ ] This is a larger change affecting many functions
|
||||
|
||||
String concatenation creates new `mist_text` objects immediately.
|
||||
#### 5.4 Replace remaining emit_atom() calls with emit_key()
|
||||
- [ ] Many emit_atom calls remain in bytecode generation
|
||||
- [ ] emit_atom is currently a wrapper that calls emit_key
|
||||
- [ ] Eventually remove emit_atom entirely
|
||||
|
||||
### Phase 5: UTF-32 Text Objects (mist_text)
|
||||
#### 5.5 Update Variable Opcode Format in quickjs-opcode.h
|
||||
- [ ] Change `atom` format opcodes to `key` format
|
||||
- [ ] Change `atom_u8` and `atom_u16` to `key_u8` and `key_u16`
|
||||
|
||||
The `mist_text` structure already exists. Complete integration:
|
||||
#### 5.6 Update VM Opcode Handlers
|
||||
These read atoms from bytecode using get_u32(). Need to change to read cpool indices:
|
||||
- [ ] OP_check_var, OP_get_var_undef, OP_get_var
|
||||
- [ ] OP_put_var, OP_put_var_init, OP_put_var_strict
|
||||
- [ ] OP_set_name, OP_make_var_ref, OP_delete_var
|
||||
- [ ] OP_define_var, OP_define_func, OP_throw_error
|
||||
- [ ] OP_make_loc_ref, OP_make_arg_ref
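The handler change listed above (read a cpool index instead of an atom) could be sketched like this. It is an illustration only: `read_u32` and `fetch_key` are hypothetical helpers, the operand is assumed to be a little-endian u32, and the constant pool is modelled as a plain array of 64-bit values.

```c
#include <stdint.h>
#include <stddef.h>

/* Read a little-endian u32 operand from the bytecode stream. */
static uint32_t read_u32(const uint8_t *pc) {
    return (uint32_t)pc[0] | ((uint32_t)pc[1] << 8) |
           ((uint32_t)pc[2] << 16) | ((uint32_t)pc[3] << 24);
}

/* Hypothetical helper: fetch the key a variable opcode refers to.
 * Returns 0 if the index is outside the pool, 1 on success. */
static int fetch_key(const uint8_t *pc, const uint64_t *cpool, size_t cpool_len,
                     uint64_t *key_out) {
    uint32_t idx = read_u32(pc);
    if (idx >= cpool_len) return 0;
    *key_out = cpool[idx];
    return 1;
}
```

The bounds check matters mostly for bytecode loaded from disk, where a corrupt operand would otherwise index past the pool.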
|
||||
|
||||
```c
|
||||
/* Text object: UTF-32 packed 2 chars per 64-bit word
|
||||
* Pretext (mutable, stone=0): hdr.cap = char capacity, length field = current length
|
||||
* Text (immutable, stone=1): hdr.cap = length, length field = hash
|
||||
*/
|
||||
typedef struct mist_text {
|
||||
objhdr_t hdr; /* type=OBJ_TEXT, cap=char count, stone bit */
|
||||
uint64_t length; /* pretext: char count | text: hash */
|
||||
uint64_t packed[]; /* UTF-32 chars, 2 per word (high then low) */
|
||||
} mist_text;
|
||||
#### 5.7 Update resolve_scope_var()
|
||||
- [ ] Currently reads var_name as atom from bytecode
|
||||
- [ ] Compares with JS_ATOM_* constants
|
||||
- [ ] Need to change to read cpool index and compare with JSValue
|
||||
|
||||
/* Create new text from UTF-8 C string */
|
||||
JSValue JS_NewStringLen(JSContext *ctx, const char *str, size_t len) {
|
||||
/* Try immediate text first */
|
||||
JSValue imm = JS_TryNewImmediateASCII(str, len);
|
||||
if (!JS_IsNull(imm)) return imm;
|
||||
#### 5.8 Convert Remaining JS_ATOM_* Usages (~41 comparisons remain)
|
||||
Categories:
|
||||
- Bytecode reading (get_u32 reads atoms) - will change with opcode format
|
||||
- js_parse_property_name callers - need function update first
|
||||
- Stub atom functions - will be removed in Phase 7
|
||||
|
||||
/* Convert UTF-8 to UTF-32 */
|
||||
uint32_t *utf32 = js_malloc(ctx, len * sizeof(uint32_t));
|
||||
size_t utf32_len = utf8_to_utf32(str, len, utf32);
|
||||
---
|
||||
|
||||
/* Allocate mist_text */
|
||||
size_t word_count = (utf32_len + 1) / 2;
|
||||
mist_text *text = js_mallocz(ctx, sizeof(mist_text) + word_count * sizeof(uint64_t));
|
||||
text->hdr = objhdr_make(utf32_len, OBJ_TEXT, false, false, false, false);
|
||||
text->length = utf32_len;
|
||||
## Phase 6: Update Bytecode Serialization
|
||||
|
||||
/* Pack UTF-32 into words */
|
||||
for (size_t i = 0; i < utf32_len; i += 2) {
|
||||
uint64_t hi = utf32[i];
|
||||
uint64_t lo = (i + 1 < utf32_len) ? utf32[i + 1] : 0;
|
||||
text->packed[i / 2] = (hi << 32) | lo;
|
||||
}
|
||||
### 6.1 JS_WriteObjectTag Changes
|
||||
- [ ] Write cpool values as JSValue (text serialization)
|
||||
- [ ] Variable opcodes reference cpool indices (already u32)
|
||||
|
||||
js_free(ctx, utf32);
|
||||
/* Add to GC list and return as JSValue */
|
||||
return JS_MKPTR(text);
|
||||
}
|
||||
```
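The two-characters-per-word packing used by `mist_text` above can be sketched in isolation. The helper names `packed_words`, `packed_set`, and `packed_get` are hypothetical; the only thing taken from the plan is the layout, with the even-indexed character in the high 32 bits and the odd-indexed one in the low 32 bits.

```c
#include <stdint.h>
#include <stddef.h>

/* Number of 64-bit words needed for n UTF-32 code points, two per word. */
static size_t packed_words(size_t n) { return (n + 1) / 2; }

/* Store code point c at index i: even indices use the high half of the word. */
static void packed_set(uint64_t *words, size_t i, uint32_t c) {
    unsigned shift = (i & 1) ? 0 : 32;
    words[i / 2] = (words[i / 2] & ~(0xFFFFFFFFULL << shift)) | ((uint64_t)c << shift);
}

/* Read the code point at index i. */
static uint32_t packed_get(const uint64_t *words, size_t i) {
    unsigned shift = (i & 1) ? 0 : 32;
    return (uint32_t)(words[i / 2] >> shift);
}
```

For an odd-length text the low half of the last word stays zero, which matches the padding in the packing loop above.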
|
||||
### 6.2 JS_ReadObject Changes
|
||||
- [ ] Read cpool values as JSValue
|
||||
- [ ] Variable opcode operands are cpool indices
|
||||
|
||||
### Phase 6: Remove JSObject, Use JSRecord Only
|
||||
### 6.3 Version Bump
|
||||
- [ ] Increment bytecode version to indicate new format
|
||||
|
||||
**Delete:**
|
||||
- `JSObject` structure (line 1664)
|
||||
- `JSShape` and `JSShapeProperty` structures
|
||||
- Shape hash table in JSRuntime
|
||||
- All shape-related functions
|
||||
- `find_own_property()` and shape-based property access
|
||||
---
|
||||
|
||||
**Keep only JSRecord with direct key/value storage:**
|
||||
## Phase 7: Final Cleanup
|
||||
|
||||
```c
|
||||
/* Record: open-addressing hash table with JSValue keys
|
||||
* Slot 0 reserved: key[0] = class_id<<32 | rec_key_id, value[0] = opaque
|
||||
*/
|
||||
/* Slot: key/value pair stored together */
|
||||
typedef struct JSSlot {
|
||||
JSValue key;
|
||||
JSValue val;
|
||||
} JSSlot;
|
||||
### 7.1 Remove JSAtom Type and Functions
|
||||
- [ ] Remove `typedef uint32_t JSAtom`
|
||||
- [ ] Remove JS_NewAtom, JS_NewAtomString
|
||||
- [ ] Remove JS_FreeAtom, JS_DupAtom (currently stubs)
|
||||
- [ ] Remove JS_AtomToValue, JS_ValueToAtom
|
||||
- [ ] Remove JS_AtomToCString, JS_AtomGetStr
|
||||
- [ ] Remove all JS_ATOM_* constants
|
||||
- [ ] Remove JSAtomStruct and related
|
||||
|
||||
typedef struct JSRecord {
|
||||
JSGCObjectHeader header;
|
||||
objhdr_t mist_hdr; /* type=OBJ_RECORD, cap=slot_mask */
|
||||
struct JSRecord *proto; /* prototype chain */
|
||||
uint32_t len; /* number of live entries */
|
||||
uint32_t tombs; /* tombstone count */
|
||||
JSSlot *slots; /* key/value pairs, size = mask+1 */
|
||||
} JSRecord;
|
||||
### 7.2 Remove quickjs-atom.h
|
||||
- [ ] Delete or convert to JSValue text initialization
|
||||
|
||||
/* Three key types for property lookup:
|
||||
* 1. Immediate ASCII (JS_TAG_STRING_ASCII): hash from JSValue itself
|
||||
* 2. Text object (mist_text pointer): hash from object's stored hash
|
||||
* 3. Record object used as key: hash from monotonic ID in record's key[0]
|
||||
*
|
||||
* Per memory.md: when a record is used as a key, it gets assigned a
|
||||
* monotonically increasing 32-bit ID stored in lower 32 bits of keys[0].
|
||||
*/
|
||||
### 7.3 Remove Legacy JSObject Type
|
||||
- [ ] Remove JS_GC_OBJ_TYPE_JS_OBJECT if unused
|
||||
- [ ] Remove JSObject struct (replaced by JSRecord)
|
||||
|
||||
/* Get hash for any key JSValue */
|
||||
static uint64_t js_key_hash(JSValue key) {
|
||||
if (JS_IsImmediateASCII(key)) {
|
||||
/* Hash the entire JSValue for immediate ASCII */
|
||||
return fash64_hash_one(key);
|
||||
}
|
||||
### 7.4 Update quickjs.h Public API
|
||||
- [ ] Remove JSAtom references from public API
|
||||
- [ ] Ensure all property functions use JSValue keys or const char*
|
||||
|
||||
if (!JS_IsPtr(key))
|
||||
return 0; /* Invalid key */
|
||||
---
|
||||
|
||||
void *ptr = JS_VALUE_GET_PTR(key);
|
||||
objhdr_t hdr = *(objhdr_t *)ptr; /* Read object header */
|
||||
uint8_t type = objhdr_type(hdr);
|
||||
## Current Build Status
|
||||
|
||||
if (type == OBJ_TEXT) {
|
||||
/* Text object: hash stored in length field (if stoned) or computed */
|
||||
mist_text *text = (mist_text *)ptr;
|
||||
return get_text_hash(text);
|
||||
}
|
||||
**Build: SUCCEEDS** with warnings (unused variables, labels)
|
||||
|
||||
if (type == OBJ_RECORD) {
|
||||
/* Record used as key: hash from monotonic ID in slots[0].key */
|
||||
JSRecord *rec = (JSRecord *)ptr;
|
||||
uint32_t rec_id = (uint32_t)rec->slots[0].key; /* lower 32 bits */
|
||||
return fash64_hash_one(rec_id);
|
||||
}
|
||||
**Statistics:**
|
||||
- JS_ATOM_* comparisons: ~41 remaining (down from 171+)
|
||||
- Most remaining are in bytecode reading code (will change with opcode format)
|
||||
|
||||
return 0; /* Unknown type */
|
||||
}
|
||||
**What Works:**
|
||||
- Keyword detection via string comparison
|
||||
- Function declaration parsing with JSValue names
|
||||
- Variable definition with JSValue names
|
||||
- Property access with JSValue keys
|
||||
- Closure variable tracking with JSValue names
|
||||
|
||||
/* Ensure record has a key ID assigned (for use as property key) */
|
||||
static void rec_ensure_key_id(JSRuntime *rt, JSRecord *rec) {
|
||||
uint32_t id = (uint32_t)rec->slots[0].key;
|
||||
if (id == 0) {
|
||||
/* Assign new monotonically increasing ID */
|
||||
id = ++rt->rec_key_next;
|
||||
if (id == 0) id = ++rt->rec_key_next; /* Skip 0 */
|
||||
rec->slots[0].key = (rec->slots[0].key & 0xFFFFFFFF00000000ULL) | id;
|
||||
}
|
||||
}
|
||||
**Next Priority:**
|
||||
1. Update js_parse_property_name() to use JSValue
|
||||
2. Update VM opcode handlers to read from cpool
|
||||
3. Convert remaining bytecode-related JS_ATOM_* usages
|
||||
|
||||
/* Compare two keys for equality */
|
||||
static JS_BOOL js_key_equal(JSValue a, JSValue b) {
|
||||
/* Fast path: identical values */
|
||||
if (a == b) return TRUE;
|
||||
|
||||
/* Immediate ASCII: must be identical (handled above) */
|
||||
if (JS_IsImmediateASCII(a) || JS_IsImmediateASCII(b))
|
||||
return FALSE;
|
||||
|
||||
/* Both must be pointers */
|
||||
if (!JS_IsPtr(a) || !JS_IsPtr(b))
|
||||
return FALSE;
|
||||
|
||||
void *pa = JS_VALUE_GET_PTR(a);
|
||||
void *pb = JS_VALUE_GET_PTR(b);
|
||||
uint8_t ta = objhdr_type(*(objhdr_t *)pa);
|
||||
uint8_t tb = objhdr_type(*(objhdr_t *)pb);
|
||||
|
||||
/* Record keys: pointer equality (identity) */
|
||||
if (ta == OBJ_RECORD || tb == OBJ_RECORD)
|
||||
return FALSE; /* Already checked a == b above */
|
||||
|
||||
/* Text objects: string content comparison */
|
||||
if (ta == OBJ_TEXT && tb == OBJ_TEXT)
|
||||
return text_content_equal((mist_text *)pa, (mist_text *)pb);
|
||||
|
||||
return FALSE;
|
||||
}
|
||||
|
||||
/* Property lookup using open-addressing hash table */
|
||||
static int rec_find_slot(JSRecord *rec, JSValue key) {
|
||||
uint32_t mask = (uint32_t)objhdr_cap56(rec->mist_hdr);
|
||||
uint64_t hash = js_key_hash(key);
|
||||
uint32_t idx = hash & mask;
|
||||
if (idx == 0) idx = 1; /* slot 0 reserved */
|
||||
|
||||
for (uint32_t i = 0; i <= mask; i++) {
|
||||
JSValue k = rec->slots[idx].key;
|
||||
if (JS_IsNull(k)) return -1; /* empty, not found */
|
||||
if (k == JS_EXCEPTION) { /* tombstone, continue */
|
||||
idx = (idx + 1) & mask;
|
||||
if (idx == 0) idx = 1;
|
||||
continue;
|
||||
}
|
||||
if (js_key_equal(k, key)) return idx;
|
||||
idx = (idx + 1) & mask;
|
||||
if (idx == 0) idx = 1;
|
||||
}
|
||||
return -1;
|
||||
}
|
||||
```
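The probing rules used by `rec_find_slot()` above (slot 0 reserved, empty slots end a probe, tombstones do not) can be exercised with a standalone table. This is a simplified sketch, not the JSRecord code: keys are plain 64-bit integers, the hash is passed in by the caller, `KEY_EMPTY` and `KEY_TOMB` stand in for `JS_NULL` and the `JS_EXCEPTION` tombstone, and all function names are hypothetical. Real keys are assumed never to equal either sentinel.

```c
#include <stdint.h>

#define KEY_EMPTY 0ULL      /* stands in for JS_NULL */
#define KEY_TOMB  (~0ULL)   /* stands in for the JS_EXCEPTION tombstone */

typedef struct { uint64_t key, val; } slot;

/* Next probe position; wraps with the mask and skips reserved slot 0. */
static uint32_t next_idx(uint32_t idx, uint32_t mask) {
    idx = (idx + 1) & mask;
    return idx ? idx : 1;
}

/* Find key; returns the slot index or -1. */
static int tbl_find(const slot *s, uint32_t mask, uint64_t key, uint64_t hash) {
    uint32_t idx = (uint32_t)hash & mask;
    if (idx == 0) idx = 1;
    for (uint32_t i = 0; i < mask; i++, idx = next_idx(idx, mask)) {
        if (s[idx].key == KEY_EMPTY) return -1;  /* never-used slot ends the probe */
        if (s[idx].key == key) return (int)idx;  /* tombstones never match a real key */
    }
    return -1;
}

/* Insert or overwrite; reuses the first tombstone seen. Returns -1 if full. */
static int tbl_put(slot *s, uint32_t mask, uint64_t key, uint64_t hash, uint64_t val) {
    uint32_t idx = (uint32_t)hash & mask;
    int free_idx = -1;
    if (idx == 0) idx = 1;
    for (uint32_t i = 0; i < mask; i++, idx = next_idx(idx, mask)) {
        if (s[idx].key == key) { s[idx].val = val; return (int)idx; }
        if (s[idx].key == KEY_TOMB && free_idx < 0) free_idx = (int)idx;
        if (s[idx].key == KEY_EMPTY) { if (free_idx < 0) free_idx = (int)idx; break; }
    }
    if (free_idx < 0) return -1;
    s[free_idx].key = key;
    s[free_idx].val = val;
    return free_idx;
}

/* Delete by leaving a tombstone so later probes keep going past this slot. */
static void tbl_del(slot *s, uint32_t mask, uint64_t key, uint64_t hash) {
    int idx = tbl_find(s, mask, key, hash);
    if (idx >= 0) { s[idx].key = KEY_TOMB; s[idx].val = 0; }
}
```

The tombstone is what keeps a colliding key reachable after an earlier entry in its probe sequence is deleted; without it, the lookup would stop at the freed slot.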
|
||||
|
||||
### Phase 7: Update Property Access Functions
|
||||
|
||||
Replace all JSObject property functions with JSRecord equivalents:
|
||||
|
||||
```c
|
||||
JSValue JS_GetPropertyInternal(JSContext *ctx, JSValueConst this_obj,
|
||||
JSValue prop, JS_BOOL throw_ref_error) {
|
||||
if (!JS_IsPtr(this_obj)) {
|
||||
if (throw_ref_error)
|
||||
return JS_ThrowTypeError(ctx, "not an object");
|
||||
return JS_NULL;
|
||||
}
|
||||
|
||||
JSRecord *rec = (JSRecord *)JS_VALUE_GET_PTR(this_obj);
|
||||
|
||||
while (rec) {
|
||||
int idx = rec_find_slot(rec, prop);
|
||||
if (idx >= 0) {
|
||||
return rec->slots[idx].val; /* No dup needed if no ref counting */
|
||||
}
|
||||
rec = rec->proto;
|
||||
}
|
||||
|
||||
if (throw_ref_error)
|
||||
return JS_ThrowReferenceError(ctx, "property not found");
|
||||
return JS_NULL;
|
||||
}
|
||||
|
||||
int JS_SetPropertyInternal(JSContext *ctx, JSValueConst this_obj,
|
||||
JSValue prop, JSValue val) {
|
||||
if (!JS_IsPtr(this_obj))
|
||||
return -1;
|
||||
|
||||
JSRecord *rec = (JSRecord *)JS_VALUE_GET_PTR(this_obj);
|
||||
int idx = rec_find_slot(rec, prop);
|
||||
|
||||
if (idx >= 0) {
|
||||
rec->slots[idx].val = val;
|
||||
return 0;
|
||||
}
|
||||
|
||||
/* Add new property */
|
||||
return rec_add_property(ctx, rec, prop, val);
|
||||
}
|
||||
```
|
||||
|
||||
### Phase 8: C Class Storage in Slot 0
|
||||
|
||||
Per memory.md, slot 0 is reserved for internal use:
|
||||
|
||||
```c
|
||||
/* slots[0].key: lower 32 bits = rec_key_id (for identity-based keys)
|
||||
* upper 32 bits = class_id (C class)
|
||||
* slots[0].val: opaque C pointer
|
||||
*/
|
||||
|
||||
void JS_SetOpaque(JSContext *ctx, JSValue obj, void *opaque) {
|
||||
JSRecord *rec = (JSRecord *)JS_VALUE_GET_PTR(obj);
|
||||
rec->slots[0].val = (JSValue)(uintptr_t)opaque;
|
||||
}
|
||||
|
||||
void *JS_GetOpaque(JSContext *ctx, JSValue obj, uint32_t class_id) {
|
||||
JSRecord *rec = (JSRecord *)JS_VALUE_GET_PTR(obj);
|
||||
uint32_t stored_class = (uint32_t)(rec->slots[0].key >> 32);
|
||||
if (stored_class != class_id) return NULL;
|
||||
return (void *)(uintptr_t)rec->slots[0].val;
|
||||
}
|
||||
|
||||
void JS_SetClassID(JSRecord *rec, uint32_t class_id) {
|
||||
rec->slots[0].key = (rec->slots[0].key & 0xFFFFFFFF) | ((uint64_t)class_id << 32);
|
||||
}
|
||||
```
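The slot 0 key layout above packs two 32-bit fields into one 64-bit word. A minimal sketch of the masking, using hypothetical helper names and a plain `uint64_t` in place of `rec->slots[0].key`:

```c
#include <stdint.h>

/* slots[0].key layout per the plan: class_id in the upper 32 bits,
 * rec_key_id in the lower 32 bits. Each setter preserves the other field. */
static uint64_t slot0_set_class(uint64_t key0, uint32_t class_id) {
    return (key0 & 0xFFFFFFFFULL) | ((uint64_t)class_id << 32);
}

static uint64_t slot0_set_key_id(uint64_t key0, uint32_t id) {
    return (key0 & 0xFFFFFFFF00000000ULL) | id;
}

static uint32_t slot0_class(uint64_t key0)  { return (uint32_t)(key0 >> 32); }
static uint32_t slot0_key_id(uint64_t key0) { return (uint32_t)key0; }
```

Keeping both setters mask-based means the class id can be assigned at creation and the key id lazily, in either order.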
|
||||
|
||||
### Phase 9: Update GC
|
||||
|
||||
The GC needs updates for the new object format:
|
||||
|
||||
```c
|
||||
static void mark_children(JSRuntime *rt, JSGCObjectHeader *gp, ...) {
|
||||
switch (gp->gc_obj_type) {
|
||||
case JS_GC_OBJ_TYPE_RECORD:
|
||||
{
|
||||
JSRecord *rec = (JSRecord *)gp;
|
||||
uint32_t mask = objhdr_cap56(rec->mist_hdr);
|
||||
|
||||
if (rec->proto)
|
||||
mark_func(rt, &rec->proto->header);
|
||||
|
||||
for (uint32_t i = 1; i <= mask; i++) {
|
||||
if (!JS_IsNull(rec->slots[i].key) &&
rec->slots[i].key != JS_EXCEPTION) { /* tombstone */
/* Mark key if it's a pointer */
if (JS_IsPtr(rec->slots[i].key))
JS_MarkValue(rt, rec->slots[i].key, mark_func);
/* Mark value if it's a pointer */
if (JS_IsPtr(rec->slots[i].val))
JS_MarkValue(rt, rec->slots[i].val, mark_func);
|
||||
}
|
||||
}
|
||||
}
|
||||
break;
|
||||
// ... other cases
|
||||
}
|
||||
}
|
||||
```
|
||||
|
||||
## Cleanup - Items to Remove
|
||||
|
||||
1. **quickjs.h:**
|
||||
- Old NaN-boxing macros (JS_VALUE_GET_TAG, JS_MKVAL, etc.)
|
||||
- JS_TAG_STRING, JS_TAG_STRING_ROPE, JS_TAG_OBJECT, JS_TAG_ARRAY, JS_TAG_FUNCTION
|
||||
- JSValueConst (just use JSValue)
|
||||
|
||||
2. **quickjs.c:**
|
||||
- JSStringRope structure and functions
|
||||
- JSShape and JSShapeProperty structures
|
||||
- Shape hash table and functions
|
||||
- Atom-based property access (keep atoms for parser/compiler)
|
||||
- JSObject structure (replace with JSRecord)
|
||||
- `find_own_property()`, `add_shape_property()`, etc.
|
||||
|
||||
## Verification
|
||||
|
||||
1. **Build:** `make` completes without errors
|
||||
2. **Basic test:** Create objects, set/get properties
|
||||
3. **Number test:** Verify short float encoding/decoding, out-of-range → null
|
||||
4. **String test:** Immediate text for short strings, mist_text for long
|
||||
5. **GC test:** Create cycles, verify collection works
|
||||
6. **C class test:** SetOpaque/GetOpaque work with slot 0 storage
|
||||
---
|
||||
|
||||
## Notes
|
||||
|
||||
- This is a **major rework** affecting most of the codebase
|
||||
- Atoms remain for parser/compiler but not for object property storage
|
||||
- Reference counting may be simplified since fewer pointer types
|
||||
- The short float range (+-3.4e38) covers most practical use cases
|
||||
- Out-of-range numbers becoming NULL is intentional per memory.md
|
||||
- JSVarDef.var_name is JSValue
|
||||
- JSClosureVar.var_name is JSValue
|
||||
- JSGlobalVar.var_name is JSValue
|
||||
- JSFunctionDef.func_name is JSValue
|
||||
- BlockEnv.label_name is JSValue
|
||||
- OP_get_field/put_field/define_field already use cpool index format
|
||||
- JSRecord with open addressing is fully implemented
|
||||
- js_key_hash and js_key_equal work with both immediate and heap text
|
||||
- js_key_equal_str enables comparison with C string literals for internal names
|
||||
|
||||
---
|
||||
|
||||
## Testing Strategy
|
||||
|
||||
After each sub-phase:
|
||||
1. `make` - verify compilation
|
||||
2. Run basic eval: `./cell -e "1+1"`
|
||||
3. Run property test: `./cell -e "var o = {a:1}; o.a"`
|
||||
4. Run function test: `./cell -e "(function f(x){return x*2})(3)"`
|
||||
5. Run closure test: `./cell -e "var f = (function(){var x=1;return function(){return x++}})(); f(); f()"`
|
||||
|
||||
960
source/quickjs.c
File diff suppressed because it is too large