| title | description | weight | type |
|---|---|---|---|
| Semantic Index | Index and query symbols, references, and call sites in source files | 55 | docs |
ƿit includes a semantic indexer that extracts symbols, references, call sites, and imports from source files. The index powers LSP features such as find references and rename, and is also exposed through the CLI for scripting and debugging.
Overview
The indexer walks the parsed AST without modifying it. It produces a JSON structure that maps every declaration, every reference to that declaration, and every call site in a file.
    source → tokenize → parse → fold → index
                                         ↓
                                symbols, references,
                                call sites, imports,
                                exports, reverse refs
Two CLI commands expose this:
| Command | Purpose |
|---|---|
| pit index <file> | Produce the full semantic index as JSON |
| pit explain | Query the index for a specific symbol or position |
pit index
Index a source file and print the result as JSON.
pit index <file.ce|file.cm>
pit index <file> -o output.json
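For scripting, the JSON printed by pit index can be consumed from another language. The following is a minimal Python sketch, assuming pit is on the PATH and writes the index to standard output as described above. The names load_index and run are hypothetical and exist only for this illustration.

```python
import json
import subprocess


def load_index(path, run=subprocess.run):
    """Run `pit index <path>` and parse the JSON it prints.

    `run` is injectable (hypothetical parameter) so the function can be
    exercised without pit installed.
    """
    proc = run(["pit", "index", path], capture_output=True, text=True, check=True)
    return json.loads(proc.stdout)
```

With check=True, a non-zero exit status raises subprocess.CalledProcessError instead of returning partial output.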
Output
The index contains these sections:
| Section | Description |
|---|---|
| imports | All use() calls with local name, module path, and span |
| symbols | Every declaration: vars, defs, functions, params |
| references | Every use of a name, classified as read, write, or call |
| call_sites | Every function call with callee, args count, and enclosing function |
| exports | For .cm modules, the keys of the top-level return record |
| reverse_refs | Inverted index: name to list of reference spans |
Example
Given a file graph.ce with functions make_node, connect, and build_graph:
pit index graph.ce
{
  "version": 1,
  "path": "graph.ce",
  "is_actor": true,
  "imports": [
    {"local_name": "json", "module_path": "json", "span": {"from_row": 2, "from_col": 0, "to_row": 2, "to_col": 22}}
  ],
  "symbols": [
    {
      "symbol_id": "graph.ce:make_node:fn",
      "name": "make_node",
      "kind": "fn",
      "params": ["name", "kind"],
      "doc_comment": "// A node in the graph.",
      "decl_span": {"from_row": 6, "from_col": 0, "to_row": 8, "to_col": 1},
      "scope_fn_nr": 0
    }
  ],
  "references": [
    {"node_id": 20, "name": "make_node", "ref_kind": "call", "span": {"from_row": 17, "from_col": 13, "to_row": 17, "to_col": 22}}
  ],
  "call_sites": [
    {"node_id": 20, "callee": "make_node", "args_count": 2, "span": {"from_row": 17, "from_col": 22, "to_row": 17, "to_col": 40}}
  ],
  "exports": [],
  "reverse_refs": {
    "make_node": [
      {"node_id": 20, "ref_kind": "call", "span": {"from_row": 17, "from_col": 13, "to_row": 17, "to_col": 22}}
    ]
  }
}
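The relationship between references and reverse_refs in the example can be illustrated with a short Python sketch. It is not the indexer's implementation. It assumes that each reverse_refs entry is the matching reference with its name field removed, which is what the example output shows. build_reverse_refs is a hypothetical helper name.

```python
from collections import defaultdict


def build_reverse_refs(references):
    """Group references by name, dropping the name from each entry."""
    reverse = defaultdict(list)
    for ref in references:
        entry = {key: value for key, value in ref.items() if key != "name"}
        reverse[ref["name"]].append(entry)
    return dict(reverse)


# The single reference from the graph.ce example.
references = [
    {"node_id": 20, "name": "make_node", "ref_kind": "call",
     "span": {"from_row": 17, "from_col": 13, "to_row": 17, "to_col": 22}},
]
reverse_refs = build_reverse_refs(references)
```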
Symbol Kinds
| Kind | Description |
|---|---|
| fn | Function (var or def with function value) |
| var | Mutable variable |
| def | Constant |
| param | Function parameter |
Each symbol has a symbol_id in the format filename:name:kind and a decl_span with from_row, from_col, to_row, to_col (0-based).
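When consuming the index from a script, the symbol_id and decl_span usually need a little unpacking. A minimal Python sketch follows. The helper names are hypothetical, and splitting from the right assumes that symbol names and kinds never contain a colon, which the format above suggests but does not state.

```python
def split_symbol_id(symbol_id):
    """Split 'filename:name:kind' into its three parts.

    Splitting from the right keeps a filename that itself contains a
    colon intact (an assumption about possible paths, not a documented case).
    """
    filename, name, kind = symbol_id.rsplit(":", 2)
    return filename, name, kind


def line_range(decl_span):
    """Convert a 0-based decl_span to 1-based (first_line, last_line)."""
    return decl_span["from_row"] + 1, decl_span["to_row"] + 1
```

For the make_node symbol in the example, line_range turns rows 6 to 8 into editor lines 7 to 9.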
Reference Kinds
| Kind | Description |
|---|---|
| read | Value is read |
| write | Value is assigned |
| call | Used as a function call target |
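As an illustration of how the reference kinds can be used, the Python sketch below tallies reads, writes, and calls per name. The sample data is hypothetical and trimmed to the two fields the function needs. ref_kind_counts is not part of ƿit.

```python
from collections import Counter


def ref_kind_counts(references):
    """Return {name: Counter({ref_kind: n})} for a list of index references."""
    counts = {}
    for ref in references:
        counts.setdefault(ref["name"], Counter())[ref["ref_kind"]] += 1
    return counts


# Hypothetical references, trimmed to the fields used here.
sample = [
    {"name": "total", "ref_kind": "write"},
    {"name": "total", "ref_kind": "read"},
    {"name": "total", "ref_kind": "read"},
    {"name": "make_node", "ref_kind": "call"},
]
counts = ref_kind_counts(sample)
```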
Module Exports
For .cm files, the indexer detects the top-level return statement. If it returns a record literal, each key becomes an export linked to its symbol:
// math_utils.cm
var add = function(a, b) { return a + b }
var sub = function(a, b) { return a - b }
return {add: add, sub: sub}
pit index math_utils.cm
The exports section will contain:
[
  {"name": "add", "symbol_id": "math_utils.cm:add:fn"},
  {"name": "sub", "symbol_id": "math_utils.cm:sub:fn"}
]
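Because each export carries a symbol_id, a script can join exports back to their declarations. A minimal Python sketch follows. resolve_exports is a hypothetical helper, and the symbols entries are an assumed, abbreviated shape modeled on the graph.ce example rather than actual output for math_utils.cm.

```python
def resolve_exports(index):
    """Map each exported name to its symbol entry, or None if unmatched."""
    by_id = {sym["symbol_id"]: sym for sym in index.get("symbols", [])}
    return {exp["name"]: by_id.get(exp["symbol_id"]) for exp in index.get("exports", [])}


# Assumed, abbreviated index for math_utils.cm.
index = {
    "symbols": [
        {"symbol_id": "math_utils.cm:add:fn", "name": "add", "kind": "fn", "params": ["a", "b"]},
        {"symbol_id": "math_utils.cm:sub:fn", "name": "sub", "kind": "fn", "params": ["a", "b"]},
    ],
    "exports": [
        {"name": "add", "symbol_id": "math_utils.cm:add:fn"},
        {"name": "sub", "symbol_id": "math_utils.cm:sub:fn"},
    ],
}
exported = resolve_exports(index)
```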
pit explain
Query the semantic index for a specific symbol or cursor position. This is the targeted query interface: instead of dumping the full index, it answers a specific question.
pit explain --span <file>:<line>:<col>
pit explain --symbol <name> <file>
--span: What is at this position?
Point at a line and column (0-based) to find out what symbol or reference is there.
pit explain --span demo.ce:6:4
If the position lands on a declaration, that symbol is returned along with all its references and call sites. If it lands on a reference, the indexer traces back to the declaration and returns the same information.
The result includes:
| Field | Description |
|---|---|
| symbol | The resolved declaration (name, kind, params, doc comment, span) |
| reference | The reference at the cursor, if the cursor was on a reference |
| references | All references to this symbol across the file |
| call_sites | All call sites for this symbol |
| imports | The file's imports (for context) |
{
  "symbol": {
    "name": "build_graph",
    "symbol_id": "demo.ce:build_graph:fn",
    "kind": "fn",
    "params": [],
    "doc_comment": "// Build a sample graph and return it."
  },
  "references": [
    {"node_id": 71, "ref_kind": "call", "span": {"from_row": 39, "from_col": 12, "to_row": 39, "to_col": 23}}
  ],
  "call_sites": []
}
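The position lookup can be approximated against a full index in a few lines of Python. This is a rough sketch under stated assumptions, not how pit explain is implemented: span ends are treated as exclusive, references are matched to declarations by name only (ignoring scoping), and span_contains and lookup are hypothetical names.

```python
def span_contains(span, row, col):
    """True if (row, col) lies inside the span; the end is treated as exclusive."""
    start = (span["from_row"], span["from_col"])
    end = (span["to_row"], span["to_col"])
    return start <= (row, col) < end


def lookup(index, row, col):
    """Resolve a 0-based position to a symbol name, or None."""
    # References first: their spans cover only the identifier.
    for ref in index.get("references", []):
        if span_contains(ref["span"], row, col):
            return ref["name"]
    hits = [s for s in index.get("symbols", []) if span_contains(s["decl_span"], row, col)]
    if not hits:
        return None
    # Among nested declarations, the innermost one starts last.
    inner = max(hits, key=lambda s: (s["decl_span"]["from_row"], s["decl_span"]["from_col"]))
    return inner["name"]


# Trimmed copy of the graph.ce index shown earlier.
index = {
    "symbols": [{"name": "make_node",
                 "decl_span": {"from_row": 6, "from_col": 0, "to_row": 8, "to_col": 1}}],
    "references": [{"name": "make_node", "ref_kind": "call",
                    "span": {"from_row": 17, "from_col": 13, "to_row": 17, "to_col": 22}}],
}
```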
--symbol: Find a symbol by name
Look up a symbol by name, returning all matching declarations and every reference.
pit explain --symbol connect demo.ce
{
  "symbols": [
    {
      "name": "connect",
      "symbol_id": "demo.ce:connect:fn",
      "kind": "fn",
      "params": ["from", "to", "label"],
      "doc_comment": "// Connect two nodes with a labeled edge."
    }
  ],
  "references": [
    {"node_id": 29, "ref_kind": "call", "span": {"from_row": 21, "from_col": 2, "to_row": 21, "to_col": 9}},
    {"node_id": 33, "ref_kind": "call", "span": {"from_row": 22, "from_col": 2, "to_row": 22, "to_col": 9}},
    {"node_id": 37, "ref_kind": "call", "span": {"from_row": 23, "from_col": 2, "to_row": 23, "to_col": 9}}
  ],
  "call_sites": [
    {"callee": "connect", "args_count": 3, "span": {"from_row": 21, "from_col": 9, "to_row": 21, "to_col": 29}},
    {"callee": "connect", "args_count": 3, "span": {"from_row": 22, "from_col": 9, "to_row": 22, "to_col": 31}},
    {"callee": "connect", "args_count": 3, "span": {"from_row": 23, "from_col": 9, "to_row": 23, "to_col": 29}}
  ]
}
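This tells you that connect is a function taking (from, to, label) and that it is called three times. The declaration line (line 11) and the enclosing function (build_graph) come from the symbol's span and the call sites' enclosing-function field described earlier, which are not shown in the output above.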
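The same reading can be automated. The Python sketch below summarizes a --symbol style result and flags call sites whose args_count differs from the declared parameter count. summarize is a hypothetical helper, and the sample is a trimmed copy of the connect output with spans omitted.

```python
def summarize(result):
    """Return one summary line per declaration in a --symbol style result."""
    lines = []
    for sym in result["symbols"]:
        calls = [c for c in result["call_sites"] if c["callee"] == sym["name"]]
        # Calls whose argument count differs from the declared parameter count.
        odd = [c for c in calls if c["args_count"] != len(sym["params"])]
        lines.append(
            f"{sym['name']}({', '.join(sym['params'])}): "
            f"{len(calls)} call(s), {len(odd)} with a different argument count"
        )
    return lines


# Trimmed copy of the connect result (spans omitted).
result = {
    "symbols": [{"name": "connect", "kind": "fn", "params": ["from", "to", "label"]}],
    "call_sites": [
        {"callee": "connect", "args_count": 3},
        {"callee": "connect", "args_count": 3},
        {"callee": "connect", "args_count": 3},
    ],
}
summary = summarize(result)
```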
Programmatic Use
The index and explain modules can be used directly from ƿit scripts:
index.cm
var tokenize_mod = use('tokenize')
var parse_mod = use('parse')
var fold_mod = use('fold')
var index_mod = use('index')
var pipeline = {tokenize: tokenize_mod, parse: parse_mod, fold: fold_mod}
var idx = index_mod.index_file(src, filename, pipeline)
index_file runs the full pipeline (tokenize, parse, fold) and returns the index. If you already have a parsed AST and tokens, use index_ast instead:
var idx = index_mod.index_ast(ast, tokens, filename)
explain.cm
var explain_mod = use('explain')
var expl = explain_mod.make(idx)
// What is at line 10, column 5?
var at_pos = expl.at_span(10, 5)
// Find all symbols named "connect"
var matches = expl.by_symbol("connect")
// Get callers and callees of a symbol
var chain = expl.call_chain("demo.ce:connect:fn", 2)
For cross-file queries:
var result = explain_mod.explain_across([idx1, idx2, idx3], "connect")
LSP Integration
The semantic index powers these LSP features:
| Feature | LSP Method | Description |
|---|---|---|
| Find References | textDocument/references | All references to the symbol under the cursor |
| Rename | textDocument/rename | Rename a symbol and all its references |
| Prepare Rename | textDocument/prepareRename | Validate that the cursor is on a renameable symbol |
| Go to Definition | textDocument/definition | Jump to a symbol's declaration (index-backed with AST fallback) |
These work automatically in any editor with ƿit LSP support. The index is rebuilt on every file change.
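To make the connection between index spans and LSP responses concrete, the Python sketch below converts reverse_refs entries into an LSP WorkspaceEdit for a rename. It is an illustration only, not the ƿit LSP's code. It assumes that index rows and columns line up with LSP line and character positions (both are 0-based, but the column encoding is an assumption), and it covers references only, since the index shown here does not give the span of the declared name itself. span_to_range and rename_edits are hypothetical names.

```python
def span_to_range(span):
    """Convert an index span to an LSP Range."""
    return {
        "start": {"line": span["from_row"], "character": span["from_col"]},
        "end": {"line": span["to_row"], "character": span["to_col"]},
    }


def rename_edits(index, uri, old_name, new_name):
    """Build a WorkspaceEdit replacing every recorded reference to old_name."""
    refs = index.get("reverse_refs", {}).get(old_name, [])
    edits = [{"range": span_to_range(ref["span"]), "newText": new_name} for ref in refs]
    return {"changes": {uri: edits}}


# reverse_refs from the graph.ce example; the URI is a placeholder.
index = {
    "reverse_refs": {
        "make_node": [
            {"node_id": 20, "ref_kind": "call",
             "span": {"from_row": 17, "from_col": 13, "to_row": 17, "to_col": 22}}
        ]
    }
}
edit = rename_edits(index, "file:///graph.ce", "make_node", "new_node")
```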