skylakegrep Architecture

System · architecture

Three lanes: index time, storage, query time.

Three-lane schematic showing the indexing pipeline writing into a SQLite store, and the query pipeline reading from the same store.
Index time runs offline via skygrep index or skygrep watch. Storage is the only state shared between the index-time and query-time lanes. Query time runs once per call to skygrep search.

System · recall

0.5.13 adds bounded candidate recall before final ranking.

The intent router still ranks retrieval lanes, but it no longer decides alone which files can be seen. For semantic, lexical, and mixed queries, skygrep now builds a cheap candidate file pool from independent signals before the expensive cascade makes the final choice. The signals are explicit include scope, indexed path tokens, indexed symbols, SQLite chunk text, and a bounded rg -il -F pass.

Candidate recall is additive. It can vote likely files into the pool, but semantic scoring, reranking, deduplication, and diversification still decide ordering. For --content and agent calls, the selected file can also carry a small same-file support pack: related constants, symbol definitions, or assertion anchors that make the returned context sufficient without widening to a raw tree dump.
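One way to picture the additive voting described above is the minimal sketch below. It is not skygrep's actual code: the function name build_candidate_pool, the MAX_CANDIDATES bound, the signal labels, and the one-vote-per-signal rule are all assumptions made for illustration.

```python
from collections import Counter

# Hypothetical bound on pool size; the real limit is not stated in this document.
MAX_CANDIDATES = 50


def build_candidate_pool(signals, max_candidates=MAX_CANDIDATES):
    """Merge files proposed by independent signals into one bounded pool.

    `signals` maps a signal name to the file paths it proposes. Each signal
    can only add votes for a file; no signal can remove a file that another
    signal proposed.
    """
    votes = Counter()
    for _name, files in signals.items():
        for path in set(files):  # a signal votes at most once per file
            votes[path] += 1
    # Most-voted files first; ties broken by path so the result is deterministic.
    ranked = sorted(votes.items(), key=lambda item: (-item[1], item[0]))
    return [path for path, _count in ranked[:max_candidates]]


# Illustrative inputs only; the paths and signal labels are made up.
signals = {
    "include_scope": ["src/auth.py"],
    "path_tokens": ["src/auth.py", "tests/test_auth.py"],
    "symbols": ["src/auth.py", "src/session.py"],
    "chunk_text": ["src/session.py"],
    "rg": ["tests/test_auth.py", "docs/auth.md"],
}
pool = build_candidate_pool(signals, max_candidates=3)
```

The pool only limits which files the later stages look at. Ordering within the final results is still left to semantic scoring, reranking, deduplication, and diversification.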

System · storage

Two SQLite tables joined 1-to-1.

SQLite schema with chunks and vectors tables, joined on id.
Schema as initialized by storage.init_db. The chunks table holds source location and verbatim text; vectors holds the embedding as a binary float32 buffer.

The two tables are kept in lockstep at write time (store_chunks_batch inserts both rows in the same transaction) and joined on id at read time. There is no declared foreign key constraint; the 1-to-1 correspondence is maintained by application code.
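The write-together, join-on-read pattern can be sketched with Python's standard sqlite3 module. Only the table names, the shared id, and the function names init_db and store_chunks_batch come from the text above. The column names, the row layout, and the load_chunks helper are assumptions for illustration, and the real schema may differ.

```python
import sqlite3
from array import array


def init_db(conn):
    # Assumed columns: the source only says chunks holds location and text,
    # and vectors holds a binary float32 buffer.
    conn.execute(
        "CREATE TABLE IF NOT EXISTS chunks ("
        "id INTEGER PRIMARY KEY, path TEXT NOT NULL, "
        "start_line INTEGER, end_line INTEGER, text TEXT NOT NULL)"
    )
    # No FOREIGN KEY clause: the pairing is maintained by the writer below.
    conn.execute(
        "CREATE TABLE IF NOT EXISTS vectors ("
        "id INTEGER PRIMARY KEY, embedding BLOB NOT NULL)"
    )


def store_chunks_batch(conn, rows):
    # `with conn` commits on success and rolls back on error, so a chunk row
    # never lands without its vector row.
    with conn:
        for path, start, end, text, embedding in rows:
            cur = conn.execute(
                "INSERT INTO chunks (path, start_line, end_line, text) "
                "VALUES (?, ?, ?, ?)",
                (path, start, end, text),
            )
            blob = array("f", embedding).tobytes()  # pack as 32-bit floats
            conn.execute(
                "INSERT INTO vectors (id, embedding) VALUES (?, ?)",
                (cur.lastrowid, blob),
            )


def load_chunks(conn):
    # Hypothetical read helper: join the two tables on id.
    query = (
        "SELECT c.path, c.text, v.embedding "
        "FROM chunks c JOIN vectors v ON v.id = c.id ORDER BY c.id"
    )
    out = []
    for path, text, blob in conn.execute(query):
        vec = array("f")
        vec.frombytes(blob)
        out.append((path, text, list(vec)))
    return out


conn = sqlite3.connect(":memory:")
init_db(conn)
store_chunks_batch(conn, [("a.py", 1, 3, "def f(): pass", [0.5, 0.25])])
loaded = load_chunks(conn)
```

Because nothing in the schema enforces the pairing, any code path that writes to one table without the other would silently drop rows from the join, which is why both inserts share one transaction.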

System · ranking

Cosine, lexical adjustment, dedup, diversify.

Ranking pipeline showing semantic and lexical scores combined and then post-processed with sort, dedup, and diversification.
The combined score weight (λ = 0.2) and the per-file cap (MAX_RESULTS_PER_FILE = 2) are module-level constants.
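The ranking stages named above can be sketched as follows. The two constants and their values come from the caption. Everything else is an assumption made for illustration: the combination formula (cosine plus λ times a term-overlap fraction), the dedup key, the chunk dictionary layout, and the function names. The actual lexical adjustment may be computed differently.

```python
import math

LAMBDA = 0.2               # combined score weight, value from the caption
MAX_RESULTS_PER_FILE = 2   # per-file cap, value from the caption


def cosine(a, b):
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def rank(query_vec, query_terms, chunks, top_k=10):
    scored = []
    for chunk in chunks:
        semantic = cosine(query_vec, chunk["vec"])
        # Assumed lexical signal: fraction of query terms present in the chunk.
        words = set(chunk["text"].lower().split())
        hits = sum(1 for term in query_terms if term.lower() in words)
        lexical = hits / len(query_terms) if query_terms else 0.0
        scored.append((semantic + LAMBDA * lexical, chunk))

    # Sort by combined score, best first.
    scored.sort(key=lambda pair: pair[0], reverse=True)

    seen, per_file, results = set(), {}, []
    for score, chunk in scored:
        key = (chunk["path"], chunk["text"])  # assumed dedup key
        if key in seen:
            continue  # dedup: drop repeated chunks
        seen.add(key)
        if per_file.get(chunk["path"], 0) >= MAX_RESULTS_PER_FILE:
            continue  # diversify: cap the results taken from one file
        per_file[chunk["path"]] = per_file.get(chunk["path"], 0) + 1
        results.append((score, chunk["path"], chunk["text"]))
        if len(results) == top_k:
            break
    return results


# Illustrative data only.
chunks = [
    {"path": "a.py", "text": "parse token", "vec": [1.0, 0.0]},
    {"path": "a.py", "text": "parse token", "vec": [1.0, 0.0]},  # duplicate
    {"path": "a.py", "text": "read file", "vec": [0.6, 0.8]},
    {"path": "a.py", "text": "write file", "vec": [0.8, 0.6]},
    {"path": "b.py", "text": "token cache", "vec": [0.0, 1.0]},
]
ranked = rank([1.0, 0.0], ["token"], chunks)
```

In this example the duplicate chunk is dropped by deduplication, and the third distinct chunk from a.py is dropped by the per-file cap, which lets the lower-scoring b.py chunk into the results.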