skylakegrep 0.1.0 — release notes
The first public release of skylakegrep. These notes describe what the
project is and does today; they are not a historical, incremental
changelog.
License: PolyForm Noncommercial 1.0.0. Personal / academic / research / hobby use is fully permitted. Commercial use requires a separate license — contact the maintainers.
What it does
skygrep is a local semantic code-search tool. It builds a SQLite
vector index from a repository's source files and answers
natural-language queries with line-cited snippets — fully offline,
backed by a local Ollama server.
$ skygrep "how does the auth token get refreshed?"
╭─ auth/refresh.py:34-51 ──────────────────────────── python 0.93
│ def refresh_token(refresh_jwt: str) -> AccessToken:
│ claims = jwt.decode(refresh_jwt, key, algorithms=ALGS)
│ ...
╰────────────────────────────────────────────────────────────────────
╭─ auth/middleware.py:78-94 ──────────────────────── python 0.81
│ async def renew_session(req: Request) -> Response:
│ if req.cookies.get("rt") and access_expired(req):
│ ...
╰────────────────────────────────────────────────────────────────────
Capabilities
LLM-driven query routing
A small local Ollama model (default qwen2.5:3b) reads each query
and returns a structured {intent, primary_token, skip_cascade,
extract_content, confidence} JSON. The CLI uses this to decide
which retrieval tiers to run.
Three-layer fallback chain:
primary : LLM router (Ollama HTTP, 500 ms timeout)
↓ on any failure
fallback-1 : hand-rolled rule-based classifier
↓ on any failure
fallback-2 : intent="mixed" — every tier runs, no smart routing
Confidence threshold protects accuracy: skip_cascade is honored
only when LLM confidence ≥ 0.7. Below that, cascade always runs.
Per-session SQLite cache so the same query never pays the LLM cost
twice.
Override with --no-llm-router to force the rule-based path
(useful for benchmarking or air-gapped use).
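The three-layer chain above can be sketched as follows. This is an illustrative Python sketch, not skygrep's actual internals: the function names (`llm_route`, `rule_route`, `route`) and the rule-based heuristics are assumptions; only the fallback order, the `intent="mixed"` terminal fallback, and the 0.7 confidence gate come from the notes.

```python
MIN_CONFIDENCE = 0.7  # below this, skip_cascade is ignored and the cascade runs


def llm_route(query: str) -> dict:
    # Placeholder for the Ollama HTTP call (500 ms timeout in skygrep).
    # Simulated here as unreachable so the fallback path is exercised.
    raise TimeoutError("Ollama not reachable")


def rule_route(query: str) -> dict:
    # Hand-rolled fallback: crude keyword heuristics (illustrative only).
    if query.lower().startswith(("where is", "find file")):
        return {"intent": "filename", "confidence": 1.0, "skip_cascade": True}
    return {"intent": "semantic", "confidence": 1.0, "skip_cascade": False}


def route(query: str) -> dict:
    # primary -> fallback-1; any exception moves to the next router
    for router in (llm_route, rule_route):
        try:
            decision = router(query)
            break
        except Exception:
            continue
    else:
        # fallback-2: no router answered; run every tier, no smart routing
        return {"intent": "mixed", "confidence": 0.0, "skip_cascade": False}
    if decision["confidence"] < MIN_CONFIDENCE:
        decision["skip_cascade"] = False  # confidence gate: cascade always runs
    return decision
```

In this sketch a low-confidence router answer is still used for ranking; only its `skip_cascade` shortcut is revoked.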
Three retrieval tiers, ranked by intent
L-2: filename lookup ~10 ms `where is X file?`
L-1: lexical content ~50 ms ripgrep-friendly content
L0+: semantic cascade ~0.5-3s vocabulary-mismatch
Each tier has a conservative four-condition gate; borderline queries
fall through. Every enabled tier always runs; results are
deduplicated by path and ranked by classify_intent(query) so the
right tier wins top slots.
Confidence-gated semantic cascade
A sub-second cheap path (file-mean cosine) handles most queries, escalating to HyDE-augmented retrieval + cross-encoder rerank + PageRank tiebreaker only on uncertain ones. Full pipeline:
- L0 — ripgrep cold-start fallback. When the index is being built in the background, the first query in a fresh project gets ripgrep results in ~100 ms.
- L1 — file-mean cosine. Whole-corpus cosine on file-level embeddings; cheap path returns when top-1 vs top-2 gap > τ.
- L2 — symbol-aware retrieval. Tree-sitter extracts function / class / method symbols; symbol matches are boosted.
- L3 — HyDE + doc2query. When the cheap path is uncertain, a local LLM rewrites the query into a hypothetical document for cosine retrieval over the union.
- L4 — PageRank tiebreaker. Per-file import / export graph produces a centrality prior that breaks score ties.
- Cross-encoder rerank. A second-stage rerank over the candidate pool for the highest-precision tier.
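The L1 decision point, returning early when the top-1 vs top-2 gap exceeds τ, can be sketched like this. The function names, data shapes, and the default τ value are assumptions for illustration; only the gap-threshold rule itself is from the notes.

```python
import math


def cosine(a, b):
    # Plain cosine similarity between two equal-length vectors.
    dot = sum(x * y for x, y in zip(a, b))
    na = math.sqrt(sum(x * x for x in a))
    nb = math.sqrt(sum(x * x for x in b))
    return dot / (na * nb)


def cheap_path(query_vec, file_vecs, tau=0.05):
    """Return (ranked results, escalate). escalate=True means the cheap
    path is uncertain and the L3/L4/rerank stages should run."""
    scored = sorted(
        ((cosine(query_vec, v), path) for path, v in file_vecs.items()),
        reverse=True,
    )
    gap = scored[0][0] - scored[1][0] if len(scored) > 1 else 1.0
    return scored, gap <= tau  # small gap: ambiguous winner, so escalate
```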
Binary file content extraction
Lazy on-demand text extraction for filename matches in
--detail=full:
| Type | Extractor | Fallback |
|---|---|---|
| `.pdf` (text layer) | `pdftotext` (poppler) | `pypdf` (pure Python) |
| `.pdf` (scanned) | `tesseract` (opt-in, `--ocr`) | hint to user |
| `.docx` | `python-docx` | error hint |
| `.txt`/`.md`/`.csv`/`.tsv` | direct read | n/a |
Scanned-PDF detection: text-layer extraction returning < 100
characters yields a friendly hint suggesting --ocr. OCR is
never auto-run (5–30 s/page is too slow for default).
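The scanned-PDF heuristic is simple enough to state as code. A minimal sketch, in which `extract_text_layer` stands in for the real pdftotext/pypdf call and the function name is hypothetical; the 100-character threshold and the never-auto-run-OCR policy are from the notes.

```python
MIN_TEXT_CHARS = 100  # below this, assume the PDF has no usable text layer


def pdf_body_or_hint(path: str, extract_text_layer) -> str:
    # extract_text_layer: pdftotext with a pypdf fallback in practice.
    text = extract_text_layer(path)
    if len(text.strip()) < MIN_TEXT_CHARS:
        # Scanned PDF: hint instead of auto-running tesseract (5-30 s/page).
        return f"{path}: looks like a scanned PDF; re-run with --ocr"
    return text
```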
Editor / app session-lock files (~$*.docx MS Word locks,
*.swp vim swap, .#* emacs lock, *~ backup) are filtered at
both the find layer and the extract layer.
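The lock-file patterns above map directly onto glob matching; a sketch, assuming skygrep uses glob-style matching (the pattern list is from the notes, the implementation is not):

```python
import fnmatch
import os

# Session-lock / swap / backup patterns filtered at find and extract time.
LOCK_PATTERNS = ("~$*.docx", "*.swp", ".#*", "*~")


def is_lock_file(path: str) -> bool:
    # Match against the basename only, so directory names are ignored.
    name = os.path.basename(path)
    return any(fnmatch.fnmatch(name, p) for p in LOCK_PATTERNS)
```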
Hierarchical merge with intent-aware ranking
Results from all tiers are deduplicated by path and ranked by the LLM-detected intent. Filename-intent queries pin the filename match to top while still surfacing relevant code chunks. Semantic-intent queries put cascade results first and surface filename matches only as low-priority context.
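The merge step can be sketched as dedupe-then-stable-rank. The tier names, tuple shape, and priority map here are illustrative assumptions; the behaviour shown (dedupe by path, intent decides which tier wins the top slots) is what the notes describe.

```python
def merge(results, intent):
    # results: list of (tier, path, score), already ordered within each tier.
    priority = {
        "filename": ["filename", "semantic", "lexical"],
        "semantic": ["semantic", "lexical", "filename"],
    }.get(intent, ["semantic", "filename", "lexical"])

    # Dedupe by path: the first tier to produce a path keeps it.
    seen, deduped = set(), []
    for tier, path, score in results:
        if path not in seen:
            seen.add(path)
            deduped.append((tier, path, score))

    # Intent-preferred tier first, then by score within a tier.
    return sorted(deduped, key=lambda r: (priority.index(r[0]), -r[2]))
```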
Framed terminal output
Each result wrapped in a rounded card:
╭─ <path:lines> ────────────────────────── <pill> <score>
│ <symbol if present>
│
│ <Pygments-highlighted body>
╰────────────────────────────────────────────────────────────
- Cyan repo-relative paths
- Right-aligned cyan-pill language tag + bold-green score
- Pygments `Terminal256Formatter` (`monokai` default; override via
  `SKYGREP_PYGMENTS_STYLE`), 300+ languages
- Dim grey separator rule
- Auto-detected: ANSI only when stdout is a TTY and `NO_COLOR` is
  unset; `SKYGREP_FORCE_COLOR=1` forces colour for piped output
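The colour decision rule above is small enough to show whole. A sketch of the stated precedence (force flag, then `NO_COLOR`, then TTY detection); the function name is illustrative:

```python
import os
import sys


def use_color(stream=sys.stdout) -> bool:
    # Highest precedence: explicit force for piped output.
    if os.environ.get("SKYGREP_FORCE_COLOR") == "1":
        return True
    # Standard opt-out: NO_COLOR present disables ANSI.
    if "NO_COLOR" in os.environ:
        return False
    # Default: colour only for interactive terminals.
    return stream.isatty()
```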
--detail levels
| Level | Body |
|---|---|
| `brief` | header only, no body |
| `standard` (default) | + ~10 lines of code body / metadata |
| `full` | + extracted PDF/docx content for filename matches |
| `summary` | + 1-line truncated preview |
skygrep setup — auto-register with LLM CLIs
Detects installed coding agents (Claude Code, Codex, OpenCode,
Gemini CLI, Cursor) and writes a markdown snippet into each one's
user-level instructions file so the agent prefers skygrep over
rg for natural-language code questions. Snippet lives between
explicit BEGIN/END markers; skygrep setup --uninstall removes
it cleanly.
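The marker mechanism makes install idempotent and uninstall exact. A sketch under assumptions: the marker strings and function names below are invented for illustration; only the BEGIN/END-delimited, cleanly removable block is from the notes.

```python
BEGIN = "<!-- BEGIN skygrep -->"
END = "<!-- END skygrep -->"


def install(text: str, snippet: str) -> str:
    block = f"{BEGIN}\n{snippet}\n{END}"
    if BEGIN in text and END in text:
        # Re-running setup replaces the existing block in place.
        head, rest = text.split(BEGIN, 1)
        _, tail = rest.split(END, 1)
        return head + block + tail
    # First run: append the block to the instructions file.
    return text.rstrip("\n") + "\n\n" + block + "\n"


def uninstall(text: str) -> str:
    if BEGIN not in text:
        return text
    head, rest = text.split(BEGIN, 1)
    _, tail = rest.split(END, 1)
    return head.rstrip("\n") + tail
```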
Multi-language indexing
Tree-sitter grammars for Python, JavaScript, TypeScript are
bundled by default (declared in pyproject.toml). The indexer can
also use grammars for Go, Rust, Java, C, C++, C#, Ruby, PHP,
Swift, Kotlin, Scala if the corresponding tree-sitter-<lang>
package is installed manually (pip install tree-sitter-rust etc.).
Files in unsupported languages fall back to generic line-based
chunking.
.gitignore / .skygrepignore precedence honored.
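The optional-grammar behaviour amounts to an importability probe per language. A sketch, with a hypothetical strategy function and a truncated grammar map; only the `tree-sitter-<lang>` package naming and the line-based fallback are from the notes.

```python
import importlib.util

# Optional grammars: used only if the package is installed (illustrative subset).
OPTIONAL_GRAMMARS = {"go": "tree_sitter_go", "rust": "tree_sitter_rust"}


def chunking_strategy(lang: str) -> str:
    pkg = OPTIONAL_GRAMMARS.get(lang)
    if pkg and importlib.util.find_spec(pkg) is not None:
        return "tree-sitter"  # symbol-aware chunking
    return "line-based"       # generic fallback for everything else
```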
Per-project, fully offline
Per-project SQLite database at ~/.skylakegrep/repos/<basename>-<hash>.db.
Indexes / embeddings / queries / answers all run on the local host
via Ollama. No remote service required. No telemetry.
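The per-project DB path scheme can be sketched as below. The hash algorithm and digest length are assumptions (the notes only specify `<basename>-<hash>.db` under `~/.skylakegrep/repos/`):

```python
import hashlib
from pathlib import Path


def db_path(repo_root: str) -> Path:
    # Hash the resolved absolute path so same-named repos in different
    # directories get distinct databases.
    root = Path(repo_root).resolve()
    digest = hashlib.sha256(str(root).encode()).hexdigest()[:8]
    return Path.home() / ".skylakegrep" / "repos" / f"{root.name}-{digest}.db"
```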
CLI flags
skygrep "<query>" # bare query (default subcommand)
skygrep search "<query>" -m 10 # explicit search subcommand
skygrep index # build / refresh the index
skygrep watch # incremental file watcher
skygrep stats # per-project index info
skygrep doctor # runtime + integrations health check
skygrep setup # register with LLM CLIs
skygrep enrich # one-time chunk doc2query enrichment
skygrep serve # daemon mode (avoids reranker cold-load)
Common flags:
--top, -m N number of results
--json machine-readable output
--detail brief|standard|full|summary verbosity
--ocr tesseract OCR for scanned PDFs (slow)
--rg-shortcut/--no-rg-shortcut lexical pre-gate (default on)
--filename-shortcut/ filename pre-gate (default on)
--no-filename-shortcut
--llm-router/--no-llm-router LLM routing (default on)
--cascade/--no-cascade semantic cascade (default on)
--rerank/--no-rerank cross-encoder rerank (default on)
--hyde/--no-hyde HyDE rewrite
--multi-resolution/ two-stage file → chunk retrieval
--no-multi-resolution
--rank-by chunk|file ranking strategy
--include / --exclude path filters
--language language filter
--agentic local Ollama subquery decomposition
--answer synthesise an answer from results
--daemon-url use a running skygrep daemon
Environment variables
OLLAMA_URL default http://localhost:11434
OLLAMA_EMBED_MODEL default nomic-embed-text
OLLAMA_LLM_MODEL default qwen2.5:3b
OLLAMA_HYDE_MODEL default qwen2.5:3b
OLLAMA_KEEP_ALIVE default -1 (keep models warm)
SKYGREP_DB_PATH override per-project DB path
SKYGREP_FORCE_COLOR force ANSI in non-TTY output
SKYGREP_LLM_ROUTER_MODEL override LLM routing model
SKYGREP_LLM_ROUTER_TIMEOUT_SECONDS default 0.5
SKYGREP_LLM_ROUTER_MIN_CONFIDENCE default 0.7
SKYGREP_PYGMENTS_STYLE default monokai
SKYGREP_AUTO_REFRESH_THROTTLE_SECONDS default 30
NO_COLOR standard ANSI opt-out
Install
pip install skylakegrep
skygrep setup # one-time: register with detected LLM CLIs
skygrep "where is the config file?"
Requires Python 3.9+ and a local Ollama server with at least one
embedding model pulled (ollama pull nomic-embed-text). Pull a
generation model only if you use --answer, --agentic, the LLM
router, or the cascade's HyDE escalation
(ollama pull qwen2.5:3b).
Compatibility / honesty
- 120 unit tests pass.
- PolyForm Noncommercial 1.0.0 is not OSI-approved open source. GitHub will mark this repo as "non-permissive". Personal use remains fully free.
- `pdftotext`, `tesseract`, and `pdftoppm` are only required when using the corresponding extraction features; install via `brew install poppler tesseract` on macOS if needed.
Why a fresh start?
This codebase has prior history — earlier iterations explored
cascade tuning, multi-language benchmarks, end-to-end agent
benchmarks, and a 5-tier retrieval architecture. The historical
incremental release notes are not preserved here because the
project has been re-licensed (MIT → PolyForm NC) and rebranded;
the current source tree, retroactively rewritten git history, and
this 0.1.0 release notes file are the single source of truth for
what skylakegrep is today.