cell/gc_plan.md
2026-01-31 14:43:21 -06:00

GC Refactoring Plan: Cheney Copying Collector

Overview

Replace the current reference-counting GC with a simple two-space Cheney copying collector. This removes JSGCObjectHeader, reference counts, and cycle detection; what remains is bump allocation plus copying live objects into a fresh block when the current one fills up.

Architecture

JSRuntime (256 MB pool)
    ├── Buddy allocator for block management
    └── JSContext #1 (actor)
         ├── Current block (64KB initially)
         ├── heap_base: start of block
         ├── heap_free: bump pointer
         └── On memory pressure: request new block, copy live objects, return old block

Memory Model (from docs/memory.md)

Object Header (objhdr_t - 64 bits)

[56 bits: capacity] [1 bit: R flag] [3 bits: reserved] [1 bit: stone] [3 bits: type]   (most significant bits first; type occupies the low 3 bits)

Object Types

  • 0: OBJ_ARRAY - Header, Length, Elements[]
  • 1: OBJ_BLOB - Header, Length (bits), BitWords[]
  • 2: OBJ_TEXT - Header, Length/Hash, PackedChars[]
  • 3: OBJ_RECORD - Header, Prototype, Length, Key/Value pairs
  • 4: OBJ_FUNCTION - Header, Code, Outer (always stone, 3 words)
  • 5: OBJ_CODE - Header, Arity, Size, ClosureSize, Entry, Disruption
  • 6: OBJ_FRAME - Header, Function, Caller, ReturnAddr, Slots[]
  • 7: OBJ_FORWARD - Forwarding pointer (used during GC)
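
The header layout above can be illustrated with a small set of pack/unpack helpers. This is only a sketch: the demo_ names are hypothetical, and the exact bit positions (type in bits 0-2, stone in bit 3, R in bit 7, capacity in bits 8-63) are an assumption read off the layout line, not taken from docs/memory.md.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical stand-in for objhdr_t; bit positions are assumptions. */
typedef uint64_t demo_objhdr_t;

enum {
  DEMO_STONE_BIT = 3,  /* 1 bit: stone */
  DEMO_R_BIT     = 7,  /* 1 bit: R flag (bits 4-6 are reserved) */
  DEMO_CAP_SHIFT = 8   /* 56 bits: capacity */
};

/* Pack capacity, type, and flags into one 64-bit header word. */
static demo_objhdr_t demo_hdr_make(uint64_t cap, unsigned type, bool stone, bool r) {
  assert(cap < ((uint64_t)1 << 56));  /* capacity must fit in 56 bits */
  assert(type < 8);                   /* type must fit in 3 bits */
  return (cap << DEMO_CAP_SHIFT)
       | ((uint64_t)r << DEMO_R_BIT)
       | ((uint64_t)stone << DEMO_STONE_BIT)
       | (uint64_t)type;
}

static unsigned demo_hdr_type(demo_objhdr_t h)  { return (unsigned)(h & 7); }
static bool     demo_hdr_stone(demo_objhdr_t h) { return (h >> DEMO_STONE_BIT) & 1; }
static bool     demo_hdr_r(demo_objhdr_t h)     { return (h >> DEMO_R_BIT) & 1; }
static uint64_t demo_hdr_cap(demo_objhdr_t h)   { return h >> DEMO_CAP_SHIFT; }
```

Keeping the type in the low 3 bits is what lets a forwarding pointer reuse the header word: the remaining 61 bits can hold an address while the type field still reads as OBJ_FORWARD.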

Phase 1: Add Buddy Allocator to JSRuntime

File: source/quickjs.c

1.1 Add buddy allocator structures

#define BUDDY_MIN_ORDER 16    /* 64KB minimum block */
#define BUDDY_MAX_ORDER 28    /* 256MB maximum */
#define BUDDY_LEVELS (BUDDY_MAX_ORDER - BUDDY_MIN_ORDER + 1)

typedef struct BuddyBlock {
  struct BuddyBlock *next;
  struct BuddyBlock *prev;
  uint8_t order;      /* log2 of size */
  uint8_t is_free;
} BuddyBlock;

typedef struct BuddyAllocator {
  uint8_t *base;              /* 256MB base address */
  size_t total_size;          /* 256MB */
  BuddyBlock *free_lists[BUDDY_LEVELS];
} BuddyAllocator;

1.2 Update JSRuntime

struct JSRuntime {
  BuddyAllocator buddy;
  /* ... keep: class_count, class_array, context_list ... */
  /* REMOVE: gc_obj_list, gc_zero_ref_count_list, gc_phase, malloc_gc_threshold */
};

1.3 Implement buddy functions

  • buddy_init(BuddyAllocator *b) - allocate 256MB, initialize free lists
  • buddy_alloc(BuddyAllocator *b, size_t size) - allocate block of given size
  • buddy_free(BuddyAllocator *b, void *ptr, size_t size) - return block
  • buddy_destroy(BuddyAllocator *b) - free the 256MB
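
As a rough illustration of how these four functions could fit together, here is a minimal buddy allocator sketch. It is not the planned implementation: it uses a 1MB demo pool instead of 256MB, keeps singly linked free lists inside the free blocks themselves rather than the BuddyBlock struct above, and searches the free list linearly when merging. All demo_ and Demo names are hypothetical.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <stdlib.h>

#define DEMO_MIN_ORDER 16   /* 64KB minimum block, as in the plan */
#define DEMO_MAX_ORDER 20   /* 1MB pool, shrunk for the demo (assumption) */
#define DEMO_LEVELS (DEMO_MAX_ORDER - DEMO_MIN_ORDER + 1)

typedef struct DemoBlock { struct DemoBlock *next; } DemoBlock;

typedef struct {
  uint8_t *base;
  DemoBlock *free_lists[DEMO_LEVELS];  /* one list per order */
} DemoBuddy;

/* Smallest order whose block holds `size`; > DEMO_MAX_ORDER means too large. */
static int demo_order_for(size_t size) {
  int order = DEMO_MIN_ORDER;
  while (order <= DEMO_MAX_ORDER && ((size_t)1 << order) < size) order++;
  return order;
}

static int demo_buddy_init(DemoBuddy *b) {
  b->base = malloc((size_t)1 << DEMO_MAX_ORDER);
  if (!b->base) return -1;
  for (int i = 0; i < DEMO_LEVELS; i++) b->free_lists[i] = NULL;
  DemoBlock *whole = (DemoBlock *)b->base;   /* start with one maximal block */
  whole->next = NULL;
  b->free_lists[DEMO_LEVELS - 1] = whole;
  return 0;
}

static void *demo_buddy_alloc(DemoBuddy *b, size_t size) {
  int order = demo_order_for(size);
  if (order > DEMO_MAX_ORDER) return NULL;
  int o = order;                             /* find first non-empty list */
  while (o <= DEMO_MAX_ORDER && !b->free_lists[o - DEMO_MIN_ORDER]) o++;
  if (o > DEMO_MAX_ORDER) return NULL;
  DemoBlock *blk = b->free_lists[o - DEMO_MIN_ORDER];
  b->free_lists[o - DEMO_MIN_ORDER] = blk->next;
  while (o > order) {                        /* split, freeing the upper half */
    o--;
    DemoBlock *upper = (DemoBlock *)((uint8_t *)blk + ((size_t)1 << o));
    upper->next = b->free_lists[o - DEMO_MIN_ORDER];
    b->free_lists[o - DEMO_MIN_ORDER] = upper;
  }
  return blk;
}

static void demo_buddy_free(DemoBuddy *b, void *ptr, size_t size) {
  int order = demo_order_for(size);
  size_t off = (size_t)((uint8_t *)ptr - b->base);
  while (order < DEMO_MAX_ORDER) {
    size_t buddy_off = off ^ ((size_t)1 << order);  /* buddy differs in one bit */
    DemoBlock **pp = &b->free_lists[order - DEMO_MIN_ORDER];
    while (*pp && (uint8_t *)*pp != b->base + buddy_off) pp = &(*pp)->next;
    if (!*pp) break;                         /* buddy is in use: stop merging */
    *pp = (*pp)->next;                       /* unlink buddy and merge */
    if (buddy_off < off) off = buddy_off;
    order++;
  }
  DemoBlock *blk = (DemoBlock *)(b->base + off);
  blk->next = b->free_lists[order - DEMO_MIN_ORDER];
  b->free_lists[order - DEMO_MIN_ORDER] = blk;
}

static void demo_buddy_destroy(DemoBuddy *b) {
  free(b->base);
  b->base = NULL;
}
```

Passing the size to the free function, as buddy_free does in the list above, is what lets the allocator recover the block's order without a per-block header.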

Phase 2: Restructure JSContext for Bump Allocation

File: source/quickjs.c

2.1 Update JSContext

struct JSContext {
  JSRuntime *rt;
  struct list_head link;

  /* Actor memory block */
  uint8_t *heap_base;         /* start of current block */
  uint8_t *heap_free;         /* bump pointer */
  uint8_t *heap_end;          /* end of block */
  size_t current_block_size;  /* 64KB initially */
  size_t next_block_size;     /* doubles if <10% recovered */

  /* Stack (VM execution) */
  JSValue *value_stack;
  int value_stack_top;
  int value_stack_capacity;
  struct VMFrame *frame_stack;
  int frame_stack_top;
  int frame_stack_capacity;

  /* Roots */
  JSValue global_obj;
  JSValue *class_proto;
  JSValue current_exception;

  /* Stone arena (immutable interned strings) */
  struct StoneArenaPage *st_pages;
  /* ... stone interning fields ... */

  /* Other context state */
  uint16_t class_count;
  int interrupt_counter;
  void *user_opaque;
  /* REMOVE: JSGCObjectHeader header at start */
};

2.2 Implement bump allocator

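static void *ctx_alloc(JSContext *ctx, size_t size) {
  if (size > SIZE_MAX - 7) return NULL;  /* rounding up would overflow */
  size = (size + 7) & ~(size_t)7;        /* round up to 8-byte alignment */
  /* Compare sizes, not pointers: heap_free + size can overflow for huge requests */
  if (size > (size_t)(ctx->heap_end - ctx->heap_free)) {
    if (ctx_gc(ctx) < 0) return NULL;    /* collect into a fresh block */
    if (size > (size_t)(ctx->heap_end - ctx->heap_free)) {
      return NULL;  /* still OOM after GC */
    }
  }
  void *ptr = ctx->heap_free;
  ctx->heap_free += size;
  return ptr;
}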

Phase 3: Unify Object Headers (Remove JSGCObjectHeader)

3.1 New unified object layout

All heap objects start with just objhdr_t:

/* Array */
typedef struct {
  objhdr_t hdr;          /* type=OBJ_ARRAY, cap=element_capacity */
  uint64_t len;
  JSValue elem[];
} MistArray;

/* Text */
typedef struct {
  objhdr_t hdr;          /* type=OBJ_TEXT, cap=char_capacity, s=stone bit */
  uint64_t len_or_hash;  /* len if pretext, hash if stoned */
  uint64_t packed[];     /* 2 UTF32 chars per word */
} MistText;

/* Record (object) */
typedef struct MistRecord {
  objhdr_t hdr;          /* type=OBJ_RECORD, cap=mask, s=stone bit */
  struct MistRecord *proto;
  uint64_t len;
  uint64_t tombs;
  uint16_t class_id;
  uint16_t _pad;
  uint32_t rec_id;       /* for record-as-key hashing */
  JSValue slots[];       /* key[0], val[0], key[1], val[1], ... */
} MistRecord;

/* Function */
typedef struct {
  objhdr_t hdr;          /* type=OBJ_FUNCTION, always stone */
  JSValue code;          /* pointer to MistCode */
  JSValue outer;         /* pointer to MistFrame */
} MistFunction;

/* Frame */
typedef struct {
  objhdr_t hdr;          /* type=OBJ_FRAME, cap=slot_count */
  JSValue function;      /* MistFunction */
  JSValue caller;        /* MistFrame or null */
  uint64_t return_addr;
  JSValue slots[];       /* args, locals, temporaries */
} MistFrame;

/* Code (always in stone/immutable memory) */
typedef struct {
  objhdr_t hdr;          /* type=OBJ_CODE, always stone */
  uint32_t arity;
  uint32_t frame_size;
  uint32_t closure_size;
  uint64_t entry_point;
  uint64_t disruption_point;
  uint8_t bytecode[];
} MistCode;

3.2 Delete JSGCObjectHeader usage

Remove from:

  • JSRecord, JSArray, JSFunction - remove JSGCObjectHeader header field
  • All p->header.ref_count, p->header.gc_obj_type, p->header.mark accesses
  • add_gc_object(), remove_gc_object() functions
  • gc_obj_list, gc_zero_ref_count_list in JSRuntime

Phase 4: Implement Cheney Copying GC

4.1 Core GC function

static int ctx_gc(JSContext *ctx) {
  size_t old_used = ctx->heap_free - ctx->heap_base;

  /* Request new block from runtime */
  size_t new_size = ctx->next_block_size;
  uint8_t *new_block = buddy_alloc(&ctx->rt->buddy, new_size);
  if (!new_block) return -1;

  uint8_t *to_base = new_block;
  uint8_t *to_free = new_block;
  uint8_t *to_end = new_block + new_size;

  /* Copy roots */
  ctx->global_obj = gc_copy_value(ctx, ctx->global_obj, &to_free, to_end);
  ctx->current_exception = gc_copy_value(ctx, ctx->current_exception, &to_free, to_end);
  for (int i = 0; i < ctx->class_count; i++) {
    ctx->class_proto[i] = gc_copy_value(ctx, ctx->class_proto[i], &to_free, to_end);
  }

  /* Copy stack */
  for (int i = 0; i < ctx->value_stack_top; i++) {
    ctx->value_stack[i] = gc_copy_value(ctx, ctx->value_stack[i], &to_free, to_end);
  }

  /* Scan copied objects (Cheney scan pointer) */
  uint8_t *scan = to_base;
  while (scan < to_free) {
    gc_scan_object(ctx, scan, &to_free, to_end);
    scan += gc_object_size(scan);
  }

  /* Return old block */
  buddy_free(&ctx->rt->buddy, ctx->heap_base, ctx->current_block_size);

  /* Update context */
  size_t new_used = to_free - to_base;
  size_t recovered = old_used - new_used;

  ctx->heap_base = to_base;
  ctx->heap_free = to_free;
  ctx->heap_end = to_end;
  ctx->current_block_size = new_size;

  /* If <10% recovered, double next block size */
  if (recovered < old_used / 10) {
    ctx->next_block_size = new_size * 2;
  }

  return 0;
}
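
ctx_gc relies on gc_object_size to step the scan pointer, but this plan does not define it. The following is a hedged sketch of what it might compute, derived from the struct layouts in Phase 3. The demo_ names are hypothetical, and the sketch assumes that JSValue and pointers are each one 64-bit word, that capacity sits in header bits 8-63, and that blob capacity is counted in bits.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Type tags as listed under "Object Types". */
enum { DEMO_ARRAY = 0, DEMO_BLOB = 1, DEMO_TEXT = 2, DEMO_RECORD = 3,
       DEMO_FUNCTION = 4, DEMO_CODE = 5, DEMO_FRAME = 6, DEMO_FORWARD = 7 };

/* Assumed header accessors: type in the low 3 bits, capacity in the top 56. */
#define DEMO_TYPE(h) ((unsigned)((h) & 7))
#define DEMO_CAP(h)  ((uint64_t)((h) >> 8))

/* Size in bytes of the heap object starting at ptr, or 0 if it has no
   copyable size (code lives in stone memory; forwards are never sized). */
static size_t demo_object_size(const void *ptr) {
  uint64_t hdr = *(const uint64_t *)ptr;
  uint64_t cap = DEMO_CAP(hdr);
  const size_t W = sizeof(uint64_t);  /* one 64-bit word */
  switch (DEMO_TYPE(hdr)) {
    case DEMO_ARRAY:    /* hdr, len, then cap elements */
      return (size_t)(2 * W + cap * W);
    case DEMO_BLOB:     /* hdr, len, then cap bits rounded up to words */
      return (size_t)(2 * W + ((cap + 63) / 64) * W);
    case DEMO_TEXT:     /* hdr, len_or_hash, then 2 chars per word */
      return (size_t)(2 * W + ((cap + 1) / 2) * W);
    case DEMO_RECORD:   /* 5 fixed words, then (mask + 1) key/value pairs */
      return (size_t)(5 * W + (cap + 1) * 2 * W);
    case DEMO_FUNCTION: /* hdr, code, outer */
      return 3 * W;
    case DEMO_FRAME:    /* hdr, function, caller, return_addr, then slots */
      return (size_t)(4 * W + cap * W);
    default:
      return 0;
  }
}
```

Every result is a multiple of 8, which keeps the scan pointer aligned with the 8-byte rounding done in ctx_alloc.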

4.2 Copy functions per type

static JSValue gc_copy_value(JSContext *ctx, JSValue v, uint8_t **to_free, uint8_t *to_end) {
  if (!JS_IsPtr(v)) return v;  /* immediate value */

  void *ptr = JS_VALUE_GET_PTR(v);
  if (is_stone_ptr(ptr)) return v;  /* stone memory, don't copy */

  objhdr_t hdr = *(objhdr_t *)ptr;

  /* Already forwarded? */
  if (objhdr_type(hdr) == OBJ_FORWARD) {
    return JS_MKPTR(JS_TAG_PTR, (void *)(hdr >> 3));  /* extract forwarding address */
  }

  /* Copy object */
  size_t size = gc_object_size(ptr);
  if (*to_free + size > to_end) abort();  /* shouldn't happen */

  void *new_ptr = *to_free;
  memcpy(new_ptr, ptr, size);
  *to_free += size;

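  /* Install forwarding pointer. The << 3 discards the top 3 address bits,
     so this encoding assumes they are zero for every heap pointer. */
  *(objhdr_t *)ptr = ((objhdr_t)(uintptr_t)new_ptr << 3) | OBJ_FORWARD;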

  return JS_MKPTR(JS_TAG_PTR, new_ptr);
}

static void gc_scan_object(JSContext *ctx, void *ptr, uint8_t **to_free, uint8_t *to_end) {
  objhdr_t hdr = *(objhdr_t *)ptr;
  switch (objhdr_type(hdr)) {
    case OBJ_ARRAY: {
      MistArray *arr = ptr;
      for (uint64_t i = 0; i < arr->len; i++) {
        arr->elem[i] = gc_copy_value(ctx, arr->elem[i], to_free, to_end);
      }
      break;
    }
    case OBJ_RECORD: {
      MistRecord *rec = ptr;
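      if (rec->proto) {  /* proto is a raw pointer, not a JSValue: wrap, copy, unwrap */
        JSValue proto = gc_copy_value(ctx, JS_MKPTR(JS_TAG_PTR, rec->proto), to_free, to_end);
        rec->proto = JS_VALUE_GET_PTR(proto);
      }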
      uint64_t mask = objhdr_cap56(hdr);
      for (uint64_t i = 0; i <= mask; i++) {
        rec->slots[i*2] = gc_copy_value(ctx, rec->slots[i*2], to_free, to_end);
        rec->slots[i*2+1] = gc_copy_value(ctx, rec->slots[i*2+1], to_free, to_end);
      }
      break;
    }
    case OBJ_FUNCTION: {
      MistFunction *fn = ptr;
      fn->code = gc_copy_value(ctx, fn->code, to_free, to_end);
      fn->outer = gc_copy_value(ctx, fn->outer, to_free, to_end);
      break;
    }
    case OBJ_FRAME: {
      MistFrame *fr = ptr;
      fr->function = gc_copy_value(ctx, fr->function, to_free, to_end);
      fr->caller = gc_copy_value(ctx, fr->caller, to_free, to_end);
      uint64_t cap = objhdr_cap56(hdr);
      for (uint64_t i = 0; i < cap; i++) {
        fr->slots[i] = gc_copy_value(ctx, fr->slots[i], to_free, to_end);
      }
      break;
    }
    case OBJ_TEXT:
    case OBJ_BLOB:
    case OBJ_CODE:
      /* No references to scan */
      break;
  }
}
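
To show the copy, forward, and scan steps working together outside of QuickJS, here is a self-contained toy version of the same algorithm. It is a sketch under simplifying assumptions, not the planned code: every object is a node of raw child pointers, there is one root, the Toy and toy_ names are hypothetical, and the forwarding encoding assumes a 64-bit platform where the top 3 address bits are zero.

```c
#include <assert.h>
#include <stdint.h>
#include <stdlib.h>
#include <string.h>

enum { TOY_NODE = 0, TOY_FORWARD = 7 };

/* Toy object: one header word followed by child pointers (NULL when empty). */
typedef struct ToyObj {
  uint64_t hdr;            /* low 3 bits: type; bits 8..63: slot count */
  struct ToyObj *slots[];
} ToyObj;

static size_t toy_size(const ToyObj *o) {
  return sizeof(ToyObj) + (size_t)(o->hdr >> 8) * sizeof(ToyObj *);
}

/* Bump-allocate a node with nslots empty slots. */
static ToyObj *toy_new(uint8_t **free_ptr, uint64_t nslots) {
  ToyObj *o = (ToyObj *)*free_ptr;
  o->hdr = (nslots << 8) | TOY_NODE;
  memset(o->slots, 0, (size_t)nslots * sizeof(ToyObj *));
  *free_ptr += toy_size(o);
  return o;
}

/* Copy one object into to-space, or return its existing copy. */
static ToyObj *toy_copy(ToyObj *o, uint8_t **to_free) {
  if (!o) return NULL;
  if ((o->hdr & 7) == TOY_FORWARD)             /* already moved */
    return (ToyObj *)(uintptr_t)(o->hdr >> 3);
  size_t size = toy_size(o);
  ToyObj *copy = (ToyObj *)*to_free;
  memcpy(copy, o, size);
  *to_free += size;
  o->hdr = ((uint64_t)(uintptr_t)copy << 3) | TOY_FORWARD;  /* leave forward */
  return copy;
}

/* Collect everything reachable from *root; returns bytes used in to-space. */
static size_t toy_collect(ToyObj **root, uint8_t *to_space) {
  uint8_t *to_free = to_space;
  uint8_t *scan = to_space;
  *root = toy_copy(*root, &to_free);
  while (scan < to_free) {                     /* Cheney scan pointer */
    ToyObj *o = (ToyObj *)scan;
    uint64_t n = o->hdr >> 8;
    for (uint64_t i = 0; i < n; i++)
      o->slots[i] = toy_copy(o->slots[i], &to_free);
    scan += toy_size(o);
  }
  return (size_t)(to_free - to_space);
}

/* Builds a -> b (twice), b -> a, plus one unreachable node, then collects.
   Returns 0 if sharing and the cycle survive and the garbage is dropped. */
static int toy_demo(void) {
  uint8_t *from = malloc(512), *to = malloc(512);
  if (!from || !to) { free(from); free(to); return -1; }
  uint8_t *fp = from;
  ToyObj *a = toy_new(&fp, 2);
  ToyObj *garbage = toy_new(&fp, 4);
  ToyObj *b = toy_new(&fp, 1);
  (void)garbage;                               /* never linked from the root */
  a->slots[0] = b;
  a->slots[1] = b;
  b->slots[0] = a;
  ToyObj *root = a;
  size_t used = toy_collect(&root, to);
  int ok = (uint8_t *)root == to
        && root->slots[0] == root->slots[1]
        && root->slots[0]->slots[0] == root
        && used == toy_size(root) + toy_size(root->slots[0]);
  free(from);
  free(to);
  return ok ? 0 : -1;
}
```

The forwarding header is what makes shared references and cycles safe: the second visit to b, and the back edge to a, both resolve to the existing copy instead of duplicating it.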

Phase 5: Remove Old GC Infrastructure

Files: source/quickjs.c, source/quickjs.h

Delete entirely:

  • RC_TRACE conditional code (~100 lines)
  • RcEvent struct and rc_log array
  • rc_log_event, rc_trace_inc_gc, rc_trace_dec_gc, rc_dump_history
  • gc_decref, gc_decref_child, gc_decref_child_dbg, gc_decref_child_edge
  • gc_free_cycles, free_zero_refcount, free_gc_object
  • add_gc_object, remove_gc_object
  • JS_RunGCInternal, JS_GC_PHASE_* enum
  • mark_children, gc_mark (the old marking functions)
  • JSGCPhaseEnum, gc_phase field in JSRuntime
  • gc_obj_list, gc_zero_ref_count_list in JSRuntime

Update:

  • JS_FreeValue - no ref counting; becomes a no-op (a copying collector needs no per-value marking or release)
  • JS_DupValue - no ref counting, just return value
  • __JS_FreeValueRT - simplified, no ref count checks

Phase 6: Update Allocation Sites

6.1 Replace js_malloc with ctx_alloc

All object allocations change from:

JSRecord *rec = js_mallocz(ctx, sizeof(JSRecord));

to:

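MistRecord *rec = ctx_alloc(ctx, sizeof(MistRecord) + (mask+1) * 2 * sizeof(JSValue));
rec->hdr = objhdr_make(mask, OBJ_RECORD, false, false, false, false);

Unlike js_mallocz, ctx_alloc does not zero the memory it returns and can return NULL. Each allocation site must therefore check the result and initialize every field and slot before the next allocation, since that allocation may trigger a GC that scans the object.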

6.2 Update object creation functions

  • JS_NewObject - use ctx_alloc, set hdr
  • JS_NewArray - use ctx_alloc, set hdr
  • JS_NewStringLen - use ctx_alloc for heap strings
  • js_create_function - use ctx_alloc
  • String concatenation, array push, etc.

Phase 7: Update Type Checks

Replace JSGCObjectHeader.gc_obj_type checks with objhdr_type:

/* Old */
((JSGCObjectHeader *)ptr)->gc_obj_type == JS_GC_OBJ_TYPE_RECORD

/* New */
objhdr_type(*(objhdr_t *)ptr) == OBJ_RECORD

Update helper functions:

  • js_is_record(v) - check objhdr_type == OBJ_RECORD
  • js_is_array(v) - check objhdr_type == OBJ_ARRAY
  • js_is_function(v) - check objhdr_type == OBJ_FUNCTION
  • JS_IsString(v) - check objhdr_type == OBJ_TEXT

Phase 8: Handle C Opaque Objects

Per docs/memory.md, C opaque objects need special handling:

8.1 Track live opaque objects

typedef struct {
  void *opaque;
  JSClassID class_id;
  uint8_t alive;
} OpaqueRef;

/* In JSContext */
OpaqueRef *opaque_refs;
int opaque_ref_count;
int opaque_ref_capacity;

8.2 During GC

  1. When copying a MistRecord with opaque data, mark it alive
  2. After GC, iterate opaque_refs and call finalizer for those with alive=0
  3. Clear all alive flags for next cycle
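
Steps 2 and 3 can be sketched as a single sweep over the table after the copy phase. This is an illustrative sketch only: the Demo and demo_ names, the finalizer callback signature, and the choice to compact the table in place are assumptions, and how step 1 locates the OpaqueRef for a given record is left open here.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Mirrors OpaqueRef above; class_id is a plain integer for the demo. */
typedef struct {
  void *opaque;
  uint32_t class_id;
  uint8_t alive;      /* set during GC when the owning record is copied */
} DemoOpaqueRef;

/* Hypothetical finalizer callback signature. */
typedef void (*DemoFinalizer)(void *opaque, uint32_t class_id);

/* Finalize every ref not marked alive, keep the rest, and clear their
   alive flags for the next cycle. Returns the new ref count. */
static int demo_sweep_opaque(DemoOpaqueRef *refs, int count, DemoFinalizer fin) {
  int kept = 0;
  for (int i = 0; i < count; i++) {
    if (refs[i].alive) {
      refs[i].alive = 0;          /* reset for the next GC cycle */
      refs[kept++] = refs[i];     /* compact survivors to the front */
    } else {
      fin(refs[i].opaque, refs[i].class_id);
    }
  }
  return kept;
}

/* Example finalizer that only counts its invocations. */
static int demo_finalized;
static void demo_count_finalizer(void *opaque, uint32_t class_id) {
  (void)opaque;
  (void)class_id;
  demo_finalized++;
}
```

Calling demo_sweep_opaque(ctx_refs, ctx_count, demo_count_finalizer) after the scan loop would finalize each dead entry exactly once, because dead entries are dropped from the table in the same pass.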

File Changes Summary

source/quickjs.c

  • Remove ~500 lines: RC_TRACE, gc_decref, gc_free_cycles, JSGCObjectHeader usage
  • Add ~300 lines: buddy allocator, Cheney GC, new object layouts
  • Modify ~200 lines: allocation sites, type checks

source/quickjs.h

  • Remove: JSGCObjectHeader from public API
  • Update: JS_FreeValue, JS_DupValue to be no-ops or trivial

Verification

  1. Build: make should compile without errors
  2. Basic test: ./cell test suite should pass
  3. Memory test: Run with ASAN to verify no leaks or corruption (objects inside the 256MB pool are invisible to ASAN unless freed blocks are manually poisoned)
  4. GC trigger: Test that GC runs when memory fills, objects survive correctly

Dependencies / Order of Work

  1. Phase 1 (Buddy) - independent, implement first
  2. Phase 2 (JSContext) - depends on Phase 1
  3. Phase 3 (Headers) - major refactor, careful testing needed
  4. Phase 4 (Cheney GC) - depends on Phases 1-3
  5. Phase 5 (Remove old GC) - after Phase 4 works
  6. Phase 6 (Allocation sites) - incremental, with Phase 3
  7. Phase 7 (Type checks) - with Phase 3
  8. Phase 8 (Opaque) - last, once basic GC works

Notes

  • Stone arena (immutable interned strings) remains unchanged - not subject to GC
  • OBJ_CODE lives in stone memory, never copied
  • Frames use caller=null to signal returnable (can be shrunk during GC)
  • Forward pointer type (7) used during GC to mark copied objects