339 lines
14 KiB
Markdown
339 lines
14 KiB
Markdown
# Cell/QuickJS Refactoring Plan: Remove Atoms, Shapes, and Dual-Encoding
|
|
|
|
## Overview
|
|
|
|
Refactor `source/quickjs.c` to match `docs/memory.md` specification:
|
|
- Remove JSAtom system (171 references → ~41 remaining)
|
|
- Remove JSShape system (94 references) ✓
|
|
- Remove IC caches (shape-based inline caches) ✓
|
|
- Remove `is_wide_char` dual-encoding (18 locations) ✓
|
|
- Use JSValue texts directly as property keys
|
|
- Reference: `mquickjs.c` shows the target pattern
|
|
|
|
## Completed Phases
|
|
|
|
### Phase 1: Remove is_wide_char Remnants ✓
|
|
### Phase 2: Remove IC Caches ✓
|
|
### Phase 3: Remove JSShape System ✓
|
|
### Phase 4: Complete Property Access with JSValue Keys ✓
|
|
|
|
Completed:
|
|
- Removed JS_GC_OBJ_TYPE_JS_OBJECT fallbacks from OP_get_field
|
|
- Removed JS_GC_OBJ_TYPE_JS_OBJECT fallbacks from OP_put_field
|
|
- Removed JS_GC_OBJ_TYPE_JS_OBJECT fallbacks from OP_define_field
|
|
- Created emit_key() function that adds JSValue to cpool and emits index
|
|
|
|
---
|
|
|
|
## Phase 5: Convert JSAtom to JSValue Text (IN PROGRESS)
|
|
|
|
This is the core transformation. All identifier handling moves from atoms to JSValue.
|
|
|
|
### Completed Items
|
|
|
|
**Token and Parser Infrastructure:**
|
|
- [x] Change JSToken.u.ident.atom to JSToken.u.ident.str (JSValue)
|
|
- [x] Change parse_ident() to return JSValue
|
|
- [x] Create emit_key() function (cpool-based)
|
|
- [x] Create JS_KEY_* macros for common names (lines ~279-335 in quickjs.c)
|
|
- [x] Update all token.u.ident.atom references to .str
|
|
- [x] Create keyword lookup table (js_keywords[]) with string comparison
|
|
- [x] Rewrite update_token_ident() to use js_keyword_lookup()
|
|
- [x] Rewrite is_strict_future_keyword() to use JSValue
|
|
- [x] Update token_is_pseudo_keyword() to use JSValue and js_key_equal()
|
|
|
|
**Function Declaration Parsing:**
|
|
- [x] Update js_parse_function_decl() signature to use JSValue func_name
|
|
- [x] Update js_parse_function_decl2() to use JSValue func_name throughout
|
|
- [x] Update js_parse_function_check_names() to use JSValue
|
|
- [x] Convert JS_DupAtom/JS_FreeAtom to JS_DupValue/JS_FreeValue in function parsing
|
|
|
|
**Variable Definition and Lookup:**
|
|
- [x] Update find_global_var() to use JSValue and js_key_equal()
|
|
- [x] Update find_lexical_global_var() to use JSValue
|
|
- [x] Update find_lexical_decl() to use JSValue and js_key_equal()
|
|
- [x] Update js_define_var() to use JSValue
|
|
- [x] Update js_parse_check_duplicate_parameter() to use JSValue and js_key_equal()
|
|
- [x] Update js_parse_destructuring_var() to return JSValue
|
|
- [x] Update js_parse_var() to use JSValue for variable names
|
|
|
|
**Comparison Helpers:**
|
|
- [x] Create js_key_equal_str() for comparing JSValue with C string literals
|
|
- [x] Update is_var_in_arg_scope() to use js_key_equal/js_key_equal_str
|
|
- [x] Update has_with_scope() to use js_key_equal_str
|
|
- [x] Update closure variable comparisons (cv->var_name) to use js_key_equal_str
|
|
|
|
**Property Access:**
|
|
- [x] Fix JS_GetPropertyStr to create proper JSValue keys
|
|
- [x] Fix JS_SetPropertyInternal callers to use JS_KEY_* instead of JS_ATOM_*
|
|
|
|
### JS_KEY_* Macros Added
|
|
|
|
Compile-time immediate ASCII string constants (≤7 chars):
|
|
```c
|
|
JS_KEY_empty, JS_KEY_name, JS_KEY_message, JS_KEY_stack,
|
|
JS_KEY_errors, JS_KEY_Error, JS_KEY_cause, JS_KEY_length,
|
|
JS_KEY_value, JS_KEY_get, JS_KEY_set, JS_KEY_raw,
|
|
JS_KEY_flags, JS_KEY_source, JS_KEY_exec, JS_KEY_toJSON,
|
|
JS_KEY_eval, JS_KEY_this, JS_KEY_true, JS_KEY_false,
|
|
JS_KEY_null, JS_KEY_NaN, JS_KEY_default, JS_KEY_index,
|
|
JS_KEY_input, JS_KEY_groups, JS_KEY_indices, JS_KEY_let,
|
|
JS_KEY_var, JS_KEY_new, JS_KEY_of, JS_KEY_yield,
|
|
JS_KEY_async, JS_KEY_target, JS_KEY_from, JS_KEY_meta,
|
|
JS_KEY_as, JS_KEY_with
|
|
```
|
|
|
|
Runtime macro for strings >7 chars:
|
|
```c
|
|
#define JS_KEY_STR(ctx, str) JS_NewStringLen((ctx), (str), sizeof(str) - 1)
|
|
```
|
|
|
|
Helper function for comparing JSValue with C string literals:
|
|
```c
|
|
static JS_BOOL js_key_equal_str(JSValue a, const char *str);
|
|
```
|
|
|
|
### Remaining Work
|
|
|
|
#### 5.3 Update js_parse_property_name() ✓
|
|
- [x] Change return type from JSAtom* to JSValue*
|
|
- [x] Update all callers (js_parse_object_literal, etc.)
|
|
- [x] Updated get_lvalue(), put_lvalue(), js_parse_destructuring_element()
|
|
|
|
#### 5.4 Replace remaining emit_atom() calls with emit_key() ✓
|
|
- [x] Removed emit_atom wrapper function
|
|
- [x] Changed last emit_atom(JS_ATOM_this) to emit_key(JS_KEY_this)
|
|
|
|
#### 5.5 Update Variable Opcode Format in quickjs-opcode.h
|
|
- [ ] Change `atom` format opcodes to `key` format
|
|
- [ ] Change `atom_u8` and `atom_u16` to `key_u8` and `key_u16`
|
|
|
|
#### 5.6 Update VM Opcode Handlers ✓
|
|
These now read cpool indices and look up JSValue:
|
|
- [x] OP_check_var, OP_get_var_undef, OP_get_var
|
|
- [x] OP_put_var, OP_put_var_init, OP_put_var_strict
|
|
- [x] OP_set_name, OP_make_var_ref, OP_delete_var
|
|
- [x] OP_define_var, OP_define_func, OP_throw_error
|
|
- [x] OP_make_loc_ref, OP_make_arg_ref
|
|
- [x] OP_define_method, OP_define_method_computed
|
|
|
|
#### 5.7 Update resolve_scope_var() ✓
|
|
- [x] Changed signature to use JSValue var_name
|
|
- [x] Updated all comparisons to use js_key_equal()/js_key_equal_str()
|
|
- [x] Updated var_object_test() to use JSValue
|
|
- [x] Updated optimize_scope_make_global_ref() to use JSValue
|
|
- [x] Updated resolve_variables() callers to read from cpool
|
|
|
|
#### 5.8 Convert Remaining JS_ATOM_* Usages
|
|
Categories remaining:
|
|
- Some debug/print functions still use JSAtom
|
|
- Some function signatures not yet converted
|
|
- Will be addressed in Phase 7 cleanup
|
|
|
|
---
|
|
|
|
## Phase 6: Update Bytecode Serialization ✓
|
|
|
|
### 6.1 JS_WriteObjectTag Changes ✓
|
|
- [x] Changed JS_WriteObjectTag to use bc_put_key() directly for property keys
|
|
- [x] Removed JS_ValueToAtom/bc_put_atom path (was broken anyway)
|
|
- [x] cpool values serialized via JS_WriteObjectRec()
|
|
|
|
### 6.2 JS_ReadObject Changes ✓
|
|
- [x] Changed JS_ReadObjectTag to use bc_get_key() for property keys
|
|
- [x] Uses JS_SetPropertyInternal with JSValue keys
|
|
|
|
### 6.3 Opcode Format Updates ✓
|
|
- [x] Added OP_FMT_key_u8, OP_FMT_key_u16, OP_FMT_key_label_u16 formats
|
|
- [x] Updated variable opcodes to use key formats instead of atom formats
|
|
- [x] Updated bc_byte_swap() to handle new key formats
|
|
- [x] Updated JS_WriteFunctionBytecode() to skip key format opcodes
|
|
- [x] Updated JS_ReadFunctionBytecode() to skip key format opcodes
|
|
|
|
### 6.4 Version Bump ✓
|
|
- [x] Incremented BC_VERSION from 5 to 6
|
|
|
|
---
|
|
|
|
## Phase 7: Final Cleanup ✓
|
|
|
|
### 7.1 Additional Parser/Compiler Fixes ✓
|
|
- [x] Fixed TOK_IDENT case to use JSValue name, JS_DupValue, emit_key
|
|
- [x] Fixed TOK_TRY catch clause to use JSValue name
|
|
- [x] Fixed js_parse_statement_or_decl label_name to use JSValue
|
|
- [x] Fixed OP_scope_get_var "eval" check to use js_key_equal_str
|
|
- [x] Fixed js_parse_delete to use JSValue for name comparison
|
|
- [x] Fixed JSON parsing to use js_key_from_string for property names
|
|
- [x] Added js_key_from_string() helper function
|
|
- [x] Added JS_KEY__ret_, JS_KEY__eval_, JS_KEY__var_ for internal names
|
|
- [x] Updated add_closure_var, get_closure_var2, get_closure_var to use JSValue var_name
|
|
- [x] Updated set_closure_from_var to use JS_DupValue
|
|
- [x] Updated add_closure_variables to use JS_DupValue
|
|
- [x] Updated instantiate_hoisted_definitions to use fd_cpool_add for keys
|
|
- [x] Updated resolve_variables to use js_key_equal and fd_cpool_add
|
|
|
|
### 7.1.1 Property Access and Runtime Fixes ✓
|
|
- [x] Fixed JS_GetPropertyValue to use js_key_from_string instead of JS_ValueToAtom
|
|
- [x] Fixed JS_GetPropertyKey to use js_key_from_string for string keys
|
|
- [x] Fixed JS_SetPropertyKey to use js_key_from_string for string keys
|
|
- [x] Fixed JS_HasPropertyKey to use js_key_from_string for string keys
|
|
- [x] Fixed JS_DeletePropertyKey to use js_key_from_string for string keys
|
|
- [x] Updated JS_HasProperty signature to take JSValue prop
|
|
- [x] Fixed OP_get_ref_value handler to use JSValue key
|
|
- [x] Fixed OP_put_ref_value handler to use JSValue key
|
|
- [x] Updated free_func_def to use JS_FreeValue for JSValue fields
|
|
|
|
### 7.2 Remove JSAtom Type and Functions ✓
|
|
- [x] Removed most JS_ATOM_* constants (kept JS_ATOM_NULL, JS_ATOM_END for BC compat)
|
|
- [x] JS_NewAtomString now returns JSValue using js_key_new
|
|
- [x] JS_FreeAtom, JS_DupAtom are stubs (no-op for backward compat)
|
|
- [x] JS_AtomToValue, JS_ValueToAtom are stubs (minimal BC compat)
|
|
- [x] Replaced JS_ATOM_* usages with JS_KEY_* or JS_GetPropertyStr
|
|
|
|
### 7.3 Additional Runtime Fixes ✓
|
|
- [x] Fixed free_function_bytecode to use JS_FreeValueRT for JSValue fields
|
|
- [x] Fixed JS_SetPropertyFunctionList to use JSValue keys via find_key()
|
|
- [x] Fixed JS_InstantiateFunctionListItem to use JSValue keys
|
|
- [x] Fixed internalize_json_property to use JSValue names
|
|
- [x] Fixed emit_break and push_break_entry to use JSValue label_name
|
|
- [x] Implemented JS_Invoke to use JSValue method key
|
|
|
|
### 7.4 Remaining Stubs (kept for bytecode backward compatibility)
|
|
- JSAtom typedef (uint32_t) - used in BC serialization
|
|
- JS_ATOM_NULL, JS_ATOM_END - bytecode format markers
|
|
- JS_FreeAtom, JS_DupAtom - no-op stubs
|
|
- JS_FreeAtomRT, JS_DupAtomRT - no-op stubs
|
|
- Legacy BC reader (idx_to_atom array) - for reading old bytecode
|
|
|
|
---
|
|
|
|
## Current Build Status
|
|
|
|
**Build: SUCCEEDS** with warnings (unused variables, labels)
|
|
|
|
**Statistics:**
|
|
- JS_ATOM_* usages: Minimal (only BC serialization compat)
|
|
- Property access uses JS_KEY_* macros or JS_GetPropertyStr
|
|
- BC_VERSION: 6 (updated for new key-based format)
|
|
|
|
**What Works:**
|
|
- All property access via JSValue keys
|
|
- Keyword detection via string comparison
|
|
- Function declaration parsing with JSValue names
|
|
- Variable definition with JSValue names
|
|
- Closure variable tracking with JSValue names
|
|
- VM opcode handlers read cpool indices and look up JSValue
|
|
- resolve_scope_var() uses JSValue throughout
|
|
- js_parse_property_name() returns JSValue
|
|
- Bytecode serialization uses bc_put_key/bc_get_key for property keys
|
|
- Variable opcodes use key format (cpool indices)
|
|
- JSON parsing uses JSValue keys
|
|
- Internal variable names use JS_KEY__ret_, JS_KEY__eval_, JS_KEY__var_
|
|
- JS_SetPropertyFunctionList uses JSValue keys
|
|
- JS_Invoke uses JSValue method keys
|
|
- break/continue labels use JSValue
|
|
|
|
---
|
|
|
|
## Phase 8: Migrate to New Tagging System (IN PROGRESS)
|
|
|
|
**Problem**: `JS_VALUE_GET_TAG` returns `JS_TAG_PTR` for all pointers, but ~200 places check for obsolete tags like `JS_TAG_OBJECT`, `JS_TAG_STRING`, `JS_TAG_FUNCTION`, etc., which are never returned. This causes crashes.
|
|
|
|
**Target Design** (from memory.md):
|
|
- JSValue tags: Only `JS_TAG_INT`, `JS_TAG_PTR`, `JS_TAG_SHORT_FLOAT`, `JS_TAG_SPECIAL`
|
|
- Pointer types determined by `objhdr_t` header at offset 8 in heap objects
|
|
- mist_obj_type: `OBJ_ARRAY(0)`, `OBJ_BLOB(1)`, `OBJ_TEXT(2)`, `OBJ_RECORD(3)`, `OBJ_FUNCTION(4)`, etc.
|
|
|
|
### 8.1 Unified Heap Object Layout ✓
|
|
- [x] Updated mist_text structure to have objhdr_t at offset 8:
|
|
```c
|
|
typedef struct mist_text {
|
|
JSRefCountHeader _dummy_header; /* unused, for offset alignment */
|
|
uint32_t _pad; /* padding to align objhdr_t to offset 8 */
|
|
objhdr_t hdr; /* NOW at offset 8, like JSString */
|
|
word_t length;
|
|
word_t packed[];
|
|
} mist_text;
|
|
```
|
|
- [x] JSString already has objhdr_t at offset 8
|
|
|
|
### 8.2 Type-Checking Helper Functions ✓
|
|
Added lowercase internal helpers (to avoid conflict with quickjs.h declarations):
|
|
```c
|
|
static inline JS_BOOL js_is_gc_object(JSValue v) { return JS_IsPtr(v); }
|
|
static inline JSGCObjectTypeEnum js_get_gc_type(JSValue v) {
|
|
return ((JSGCObjectHeader *)JS_VALUE_GET_PTR(v))->gc_obj_type;
|
|
}
|
|
static inline JS_BOOL js_is_record(JSValue v) {
|
|
if (!JS_IsPtr(v)) return FALSE;
|
|
return js_get_gc_type(v) == JS_GC_OBJ_TYPE_RECORD;
|
|
}
|
|
static inline JS_BOOL js_is_array(JSValue v) {
|
|
if (!JS_IsPtr(v)) return FALSE;
|
|
return js_get_gc_type(v) == JS_GC_OBJ_TYPE_ARRAY;
|
|
}
|
|
static inline JS_BOOL js_is_function(JSValue v) {
|
|
if (!JS_IsPtr(v)) return FALSE;
|
|
return js_get_gc_type(v) == JS_GC_OBJ_TYPE_FUNCTION;
|
|
}
|
|
static inline JS_BOOL js_is_object(JSValue v) {
|
|
if (!JS_IsPtr(v)) return FALSE;
|
|
JSGCObjectTypeEnum t = js_get_gc_type(v);
|
|
return t == JS_GC_OBJ_TYPE_RECORD || t == JS_GC_OBJ_TYPE_ARRAY;
|
|
}
|
|
```
|
|
|
|
### 8.3 Updated Core Functions ✓
|
|
- [x] Updated JS_IsString to read objhdr_t from offset 8
|
|
- [x] Updated js_key_hash to read objhdr_t from offset 8
|
|
- [x] Updated js_key_equal to read objhdr_t from offset 8
|
|
- [x] Updated __JS_FreeValueRT to use objhdr_type for type dispatch
|
|
- [x] Updated JS_MarkValue, JS_MarkValueEdgeEx for GC
|
|
- [x] Added JS_SetPropertyValue function
|
|
- [x] Changed quickjs.h JS_IsFunction/JS_IsObject from inline to extern declarations
|
|
|
|
### 8.4 Tag Check Migration (PARTIAL)
|
|
Updated some critical tag checks:
|
|
- [x] Some JS_TAG_OBJECT checks → js_is_object() or js_is_record()
|
|
- [ ] Many more JS_TAG_OBJECT checks remain (~200 total)
|
|
- [ ] JS_TAG_FUNCTION checks → js_is_function()
|
|
- [ ] JS_TAG_STRING checks (some already use JS_IsString)
|
|
|
|
### 8.5 Remaining Work
|
|
- [ ] Fix ASAN memory corruption error (attempting free on address not malloc'd)
|
|
- Crash occurs in js_def_realloc during js_realloc_array
|
|
- Address is 112 bytes inside a JSFunctionDef allocation
|
|
- [ ] Complete remaining ~200 tag check migrations
|
|
- [ ] Add mist_hdr to JSFunction (optional, gc_obj_type already works)
|
|
- [ ] Remove obsolete tag definitions from quickjs.h:
|
|
- JS_TAG_STRING = -8
|
|
- JS_TAG_ARRAY = -6
|
|
- JS_TAG_FUNCTION = -5
|
|
- JS_TAG_FUNCTION_BYTECODE = -2
|
|
- JS_TAG_OBJECT = -1
|
|
|
|
### Current Status
|
|
|
|
**Build: SUCCEEDS** with warnings
|
|
|
|
**Runtime: CRASHES** with ASAN error:
|
|
```
|
|
==16122==ERROR: AddressSanitizer: attempting free on address which was not malloc()-ed
|
|
```
|
|
The crash occurs during test execution in `js_def_realloc` called from `js_realloc_array`.
|
|
Root cause not yet identified - likely a pointer being passed to realloc that wasn't allocated with malloc.
|
|
|
|
---
|
|
|
|
## Notes
|
|
|
|
- JSVarDef.var_name is JSValue
|
|
- JSClosureVar.var_name is JSValue
|
|
- JSGlobalVar.var_name is JSValue
|
|
- JSFunctionDef.func_name is JSValue
|
|
- BlockEnv.label_name is JSValue
|
|
- OP_get_field/put_field/define_field already use cpool index format
|
|
- JSRecord with open addressing is fully implemented
|
|
- js_key_hash and js_key_equal work with both immediate and heap text
|
|
- js_key_equal_str enables comparison with C string literals for internal names
|