System · architecture
Three lanes: index time, storage, query time.
Index time runs under skygrep index or
skygrep watch. Storage is the only state shared between the index-time
and query-time lanes. Query time runs per call to skygrep search.
System · recall
0.5.13 adds bounded candidate recall before final ranking.
The intent router ranks retrieval lanes, but it is no longer the only
component that decides which files can be seen. For semantic, lexical,
and mixed queries, skygrep now builds a cheap candidate file pool from
independent signals before the expensive cascade makes the final choice:
explicit include scope, indexed path tokens, indexed symbols, SQLite
chunk text, and a bounded rg -il -F pass.
Candidate recall is additive. It can vote likely files into the pool,
but semantic scoring, reranking, deduplication, and diversification still
decide ordering. For --content and agent calls, the selected
file can also carry a small same-file support pack: related constants,
symbol definitions, or assertion anchors that make the returned context
sufficient without widening to a raw tree dump.
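The additive voting described above can be sketched roughly as follows. This is a minimal illustration, not skygrep's actual code: the function name build_candidate_pool, the signal names, the one-vote-per-signal rule, and the limit parameter are all assumptions made for the example.

```python
from collections import Counter
from typing import Dict, Iterable, List


def build_candidate_pool(signals: Dict[str, Iterable[str]], limit: int = 200) -> List[str]:
    """Merge file votes from independent signals into one bounded pool.

    Hypothetical helper: each signal maps to the file paths it matched.
    """
    votes = Counter()
    for paths in signals.values():
        # A signal votes at most once per file, so one noisy signal
        # cannot dominate the pool on its own.
        for path in set(paths):
            votes[path] += 1
    # Most-voted files first; ties are broken by path for determinism.
    ranked = sorted(votes, key=lambda p: (-votes[p], p))
    # The bound keeps the later, expensive cascade cheap.
    return ranked[:limit]


# Illustrative inputs only; real signals would come from the index and rg.
pool = build_candidate_pool(
    {
        "include_scope": ["src/auth.py"],
        "path_tokens": ["src/auth.py", "tests/test_auth.py"],
        "symbols": ["src/auth.py"],
        "chunk_text": ["src/session.py"],
        "rg_fixed_string": ["src/auth.py", "src/session.py"],
    },
    limit=3,
)
print(pool)
```

The pool only decides which files are eligible. In this sketch the vote count is used to apply the bound, while final ordering would still be left to scoring, reranking, deduplication, and diversification.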
System · storage
Two SQLite tables joined 1-to-1.
The schema is set up in storage.init_db. The
chunks table holds source location and verbatim text;
vectors holds the embedding as a binary float32 buffer.
The two tables are kept in lockstep at write time
(store_chunks_batch inserts both rows in the same
transaction) and joined on id at read time. There is no
declared foreign key constraint; the join is enforced by application
code.
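A minimal sketch of this layout, using Python's standard sqlite3 module, might look like the following. The column names (path, start_line, end_line, text, embedding) and the function store_chunks_batch_sketch are assumptions for illustration; only the two table names, the shared id, the float32 buffer, and the single-transaction write come from the description above.

```python
import sqlite3
from array import array

conn = sqlite3.connect(":memory:")
# Assumed schema: two tables sharing an id, with no declared foreign key.
conn.executescript(
    """
    CREATE TABLE chunks (
        id INTEGER PRIMARY KEY,
        path TEXT, start_line INTEGER, end_line INTEGER, text TEXT
    );
    CREATE TABLE vectors (id INTEGER PRIMARY KEY, embedding BLOB);
    """
)


def store_chunks_batch_sketch(conn, rows):
    """Insert each chunk and its vector inside one transaction."""
    with conn:  # commits on success, rolls back both tables on error
        for path, start, end, text, embedding in rows:
            cur = conn.execute(
                "INSERT INTO chunks (path, start_line, end_line, text) "
                "VALUES (?, ?, ?, ?)",
                (path, start, end, text),
            )
            # Pack the embedding as a raw float32 buffer.
            blob = array("f", embedding).tobytes()
            conn.execute(
                "INSERT INTO vectors (id, embedding) VALUES (?, ?)",
                (cur.lastrowid, blob),
            )


store_chunks_batch_sketch(
    conn, [("src/auth.py", 10, 24, "def login(): ...", [0.5, 0.25, -1.0])]
)

# Read side: join on id and decode the buffer back into floats.
path, blob = conn.execute(
    "SELECT c.path, v.embedding FROM chunks c JOIN vectors v ON v.id = c.id"
).fetchone()
vec = array("f")
vec.frombytes(blob)
```

Because nothing in the schema ties the two tables together, the 1-to-1 pairing holds only as long as every write goes through a path like this one.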
System · ranking