skylakegrep v0.1.0


The first public release of skylakegrep. These notes describe what the project is and does today; they are not an incremental historical changelog.

License: PolyForm Noncommercial 1.0.0. Personal / academic / research / hobby use is fully permitted. Commercial use requires a separate license — contact the maintainers.

What it does

skygrep — the skylakegrep CLI — is a local semantic code-search tool. It builds a SQLite vector index from a repository's source files and answers natural-language queries with line-cited snippets — fully offline, backed by a local Ollama server.

$ skygrep "how does the auth token get refreshed?"

╭─ auth/refresh.py:34-51 ────────────────────────────  python   0.93
│ def refresh_token(refresh_jwt: str) -> AccessToken:
│     claims = jwt.decode(refresh_jwt, key, algorithms=ALGS)
│     ...
╰────────────────────────────────────────────────────────────────────

╭─ auth/middleware.py:78-94 ────────────────────────  python   0.81
│ async def renew_session(req: Request) -> Response:
│     if req.cookies.get("rt") and access_expired(req):
│     ...
╰────────────────────────────────────────────────────────────────────

Capabilities

LLM-driven query routing

A small local Ollama model (default qwen2.5:3b) reads each query and returns a structured {intent, primary_token, skip_cascade, extract_content, confidence} JSON. The CLI uses this to decide which retrieval tiers to run.

Three-layer fallback chain:

primary    : LLM router (Ollama HTTP, 500 ms timeout)
                  ↓ on any failure
fallback-1 : hand-rolled rule-based classifier
                  ↓ on any failure
fallback-2 : intent="mixed" — every tier runs, no smart routing

A confidence threshold protects accuracy: skip_cascade is honored only when LLM confidence is ≥ 0.7; below that, the cascade always runs. A per-session SQLite cache ensures the same query never pays the LLM cost twice.

Override with --no-llm-router to force the rule-based path (useful for benchmarking or air-gapped use).
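The three-layer routing chain can be sketched as follows. The rule-based classifier and the threshold constant here are illustrative stand-ins, not skygrep's actual internals:

```python
import json

MIN_CONFIDENCE = 0.7  # mirrors SKYGREP_LLM_ROUTER_MIN_CONFIDENCE (assumption)

def rule_based_classify(query: str) -> dict:
    # fallback-1: a deliberately tiny keyword heuristic, purely illustrative
    if query.lower().startswith("where is"):
        return {"intent": "filename", "confidence": 1.0, "skip_cascade": True}
    return {"intent": "mixed", "confidence": 0.0, "skip_cascade": False}

def route(query: str, llm_call=None) -> dict:
    # primary: LLM router; any failure falls through to the next layer
    if llm_call is not None:
        try:
            decision = json.loads(llm_call(query))
            # honor skip_cascade only above the confidence threshold
            if decision.get("confidence", 0.0) < MIN_CONFIDENCE:
                decision["skip_cascade"] = False
            return decision
        except Exception:
            pass
    # fallback-1: rule-based; fallback-2: intent="mixed", everything runs
    try:
        return rule_based_classify(query)
    except Exception:
        return {"intent": "mixed", "confidence": 0.0, "skip_cascade": False}
```

A low-confidence LLM answer keeps the cascade on even when the model asked to skip it, and a router crash degrades silently to the rule-based path.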

Three retrieval tiers, ranked by intent

L-2: filename lookup    ~10 ms    `where is X file?`
L-1: lexical content    ~50 ms    ripgrep-friendly literal content
L0+: semantic cascade   ~0.5–3 s  vocabulary-mismatch queries

Each tier's shortcut has a conservative four-condition gate; borderline queries fall through to the next tier. Every enabled tier still runs; results are deduplicated by path and ranked by classify_intent(query) so the right tier wins the top slots.

Confidence-gated semantic cascade

Sub-second cheap path (file-mean cosine) on most queries, escalating to HyDE-augmented retrieval + cross-encoder rerank + PageRank tiebreaker only on uncertain queries. Full pipeline:

  1. L0 — ripgrep cold-start fallback. When the index is being built in the background, the first query in a fresh project gets ripgrep results in ~100 ms.
  2. L1 — file-mean cosine. Whole-corpus cosine on file-level embeddings; cheap path returns when top-1 vs top-2 gap > τ.
  3. L2 — symbol-aware retrieval. Tree-sitter extracts function / class / method symbols; symbol matches are boosted.
  4. L3 — HyDE + doc2query. When the cheap path is uncertain, a local LLM rewrites the query into a hypothetical document for cosine retrieval over the union.
  5. L4 — PageRank tiebreaker. Per-file import / export graph produces a centrality prior that breaks score ties.
  6. Cross-encoder rerank. A second-stage rerank over the candidate pool for the highest-precision tier.
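The gate between the cheap path and the expensive escalation can be sketched as below; the τ value and the pure-Python cosine are illustrative assumptions, not the shipped implementation:

```python
import math

TAU = 0.08  # illustrative top-1 vs top-2 gap threshold (assumption)

def cosine(a, b):
    dot = sum(x * y for x, y in zip(a, b))
    na = math.sqrt(sum(x * x for x in a))
    nb = math.sqrt(sum(y * y for y in b))
    return dot / (na * nb)

def cheap_path(query_vec, file_means):
    # L1: whole-corpus cosine over file-mean embeddings
    scored = sorted(
        ((cosine(query_vec, v), path) for path, v in file_means.items()),
        reverse=True,
    )
    if len(scored) < 2 or scored[0][0] - scored[1][0] > TAU:
        return scored[:1], False  # confident: return the cheap result
    return scored, True           # uncertain: escalate to HyDE + rerank
```

A clear winner short-circuits in well under a second; a tie hands the full candidate list to the L3/L4 stages.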

Binary file content extraction

Lazy on-demand text extraction for filename matches in --detail=full:

Type                 Extractor                   Fallback
.pdf (text layer)    pdftotext (poppler)         pypdf (pure Python)
.pdf (scanned)       tesseract (opt-in, --ocr)   hint to user
.docx                python-docx                 error hint
.txt/.md/.csv/.tsv   direct read                 n/a

Scanned-PDF detection: when text-layer extraction returns < 100 characters, skygrep prints a friendly hint suggesting --ocr. OCR never runs automatically (5–30 s per page is too slow for a default).

Editor / app session-lock files (MS Word ~$*.docx locks, vim *.swp swap files, emacs .#* locks, *~ backups) are filtered at both the find layer and the extract layer.
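A minimal sketch of the lock-file filter and the extractor fallback chain, assuming illustrative glob patterns and injected extractor callables (the real extractors shell out to poppler or use pypdf):

```python
import fnmatch

LOCK_PATTERNS = ["~$*", "*.swp", ".#*", "*~"]  # Word, vim, emacs, backups

def is_lock_file(name: str) -> bool:
    return any(fnmatch.fnmatch(name, p) for p in LOCK_PATTERNS)

def extract_pdf_text(path, pdftotext=None, pypdf_extract=None) -> str:
    # try the primary extractor, then the pure-Python fallback
    for extractor in (pdftotext, pypdf_extract):
        if extractor is None:
            continue
        try:
            text = extractor(path)
        except Exception:
            continue  # extractor failed: fall through to the next one
        if len(text) < 100:
            # likely a scanned PDF: the text layer is nearly empty
            return "[hint] almost no text layer found; retry with --ocr"
        return text
    return "[hint] no PDF extractor available"
```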

Hierarchical merge with intent-aware ranking

Results from all tiers are deduplicated by path and ranked by the LLM-detected intent. Filename-intent queries pin the filename match to the top while still surfacing relevant code chunks; semantic-intent queries put cascade results first and surface filename matches only as low-priority context.
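The merge can be sketched as follows; the tier names and priority orders here are assumptions for illustration:

```python
def merge_results(tiers: dict, intent: str) -> list:
    # tiers: tier name -> list of (path, score)
    # dedup by path; the first (highest-priority) tier to claim a path wins
    priority = {
        "filename": ["filename", "lexical", "semantic"],
        "semantic": ["semantic", "lexical", "filename"],
    }.get(intent, ["semantic", "lexical", "filename"])
    seen, merged = set(), []
    for tier in priority:
        for path, score in sorted(tiers.get(tier, []), key=lambda r: -r[1]):
            if path not in seen:
                seen.add(path)
                merged.append((path, tier, score))
    return merged
```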

Framed terminal output

Each result wrapped in a rounded card:

╭─ <path:lines> ──────────────────────────  <pill>   <score>
│ <symbol if present>
│
│ <Pygments-highlighted body>
╰────────────────────────────────────────────────────────────
  • Cyan repo-relative paths
  • Right-aligned cyan-pill language tag + bold-green score
  • Pygments Terminal256Formatter (monokai default; override via SKYGREP_PYGMENTS_STYLE) — 300+ languages
  • Dim grey separator rule
  • Auto-detected: ANSI only when stdout is a TTY and NO_COLOR is unset; SKYGREP_FORCE_COLOR=1 forces colour for piped output
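The colour decision above can be sketched as:

```python
import os
import sys

def use_color(stream=sys.stdout) -> bool:
    # SKYGREP_FORCE_COLOR=1 wins, then NO_COLOR opts out, then TTY detection
    if os.environ.get("SKYGREP_FORCE_COLOR") == "1":
        return True
    if "NO_COLOR" in os.environ:
        return False
    return hasattr(stream, "isatty") and stream.isatty()
```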

--detail levels

Level               Body
brief               header only, no body
standard (default)  + ~10 lines of code body / metadata
full                + extracted PDF/docx content for filename matches
summary             + 1-line truncated preview

skygrep setup — auto-register with LLM CLIs

Detects installed coding agents (Claude Code, Codex, OpenCode, Gemini CLI, Cursor) and writes a markdown snippet into each one's user-level instructions file so the agent prefers skygrep over rg for natural-language code questions. Snippet lives between explicit BEGIN/END markers; skygrep setup --uninstall removes it cleanly.
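The marker-based install/uninstall can be sketched as follows; the exact marker strings are assumptions:

```python
BEGIN = "<!-- BEGIN skygrep -->"
END = "<!-- END skygrep -->"

def uninstall_snippet(text: str) -> str:
    # remove the managed block if both markers are present
    start = text.find(BEGIN)
    end = text.find(END)
    if start == -1 or end == -1:
        return text
    return text[:start] + text[end + len(END):]

def install_snippet(text: str, snippet: str) -> str:
    # idempotent: replace any existing block, then append a fresh one
    cleaned = uninstall_snippet(text)
    return cleaned.rstrip("\n") + f"\n\n{BEGIN}\n{snippet}\n{END}\n"
```

Because install first strips any existing block, re-running setup never duplicates the snippet, and uninstall leaves the user's own instructions untouched.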

Multi-language indexing

Tree-sitter grammars for Python, JavaScript, TypeScript are bundled by default (declared in pyproject.toml). The indexer can also use grammars for Go, Rust, Java, C, C++, C#, Ruby, PHP, Swift, Kotlin, Scala if the corresponding tree-sitter-<lang> package is installed manually (pip install tree-sitter-rust etc.). Files in unsupported languages fall back to generic line-based chunking.

.gitignore / .skygrepignore precedence honored.
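Grammar availability can be probed as sketched below. The tree_sitter_<lang> module-name convention matches the PyPI packages, but the exact lookup skygrep performs is an assumption:

```python
import importlib

BUNDLED = {"python", "javascript", "typescript"}          # declared in pyproject.toml
OPTIONAL = {"go", "rust", "java", "c", "cpp", "c_sharp",  # module names are assumptions
            "ruby", "php", "swift", "kotlin", "scala"}

def chunking_strategy(lang: str) -> str:
    if lang in BUNDLED:
        return "tree-sitter"
    if lang in OPTIONAL:
        try:
            importlib.import_module(f"tree_sitter_{lang}")
            return "tree-sitter"
        except ImportError:
            pass
    return "line-based"  # generic fallback for unsupported languages
```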

Per-project, fully offline

Per-project SQLite database at ~/.skylakegrep/repos/<basename>-<hash>.db. Indexes / embeddings / queries / answers all run on the local host via Ollama. No remote service required. No telemetry.
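The per-project path derivation can be sketched as follows; the hash function and digest length are assumptions:

```python
import hashlib
from pathlib import Path

def project_db_path(repo_root: str) -> Path:
    # <basename>-<hash>.db keyed on the resolved repo path, so two repos
    # with the same basename still get distinct databases
    root = Path(repo_root).resolve()
    digest = hashlib.sha1(str(root).encode()).hexdigest()[:8]
    return Path.home() / ".skylakegrep" / "repos" / f"{root.name}-{digest}.db"
```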

CLI flags

skygrep "<query>"                    # bare query (default subcommand)
skygrep search "<query>" -m 10       # explicit search subcommand
skygrep index                         # build / refresh the index
skygrep watch                         # incremental file watcher
skygrep stats                         # per-project index info
skygrep doctor                        # runtime + integrations health check
skygrep setup                         # register with LLM CLIs
skygrep enrich                        # one-time chunk doc2query enrichment
skygrep serve                         # daemon mode (avoids reranker cold-load)

Common flags:
  --top, -m N                          number of results
  --json                               machine-readable output
  --detail brief|standard|full|summary verbosity
  --ocr                                tesseract OCR for scanned PDFs (slow)
  --rg-shortcut/--no-rg-shortcut       lexical pre-gate (default on)
  --filename-shortcut/                 filename pre-gate (default on)
    --no-filename-shortcut
  --llm-router/--no-llm-router         LLM routing (default on)
  --cascade/--no-cascade               semantic cascade (default on)
  --rerank/--no-rerank                 cross-encoder rerank (default on)
  --hyde/--no-hyde                     HyDE rewrite
  --multi-resolution/                  two-stage file → chunk retrieval
    --no-multi-resolution
  --rank-by chunk|file                 ranking strategy
  --include / --exclude                path filters
  --language                           language filter
  --agentic                            local Ollama subquery decomposition
  --answer                             synthesise an answer from results
  --daemon-url                         use a running skygrep daemon

Environment variables

OLLAMA_URL                  default http://localhost:11434
OLLAMA_EMBED_MODEL          default nomic-embed-text
OLLAMA_LLM_MODEL            default qwen2.5:3b
OLLAMA_HYDE_MODEL           default qwen2.5:3b
OLLAMA_KEEP_ALIVE           default -1 (keep models warm)

SKYGREP_DB_PATH             override per-project DB path
SKYGREP_FORCE_COLOR         force ANSI in non-TTY output
SKYGREP_LLM_ROUTER_MODEL    override LLM routing model
SKYGREP_LLM_ROUTER_TIMEOUT_SECONDS    default 0.5
SKYGREP_LLM_ROUTER_MIN_CONFIDENCE     default 0.7
SKYGREP_PYGMENTS_STYLE      default monokai
SKYGREP_AUTO_REFRESH_THROTTLE_SECONDS default 30
NO_COLOR                    standard ANSI opt-out

Install

pip install skylakegrep
skygrep setup        # one-time: register with detected LLM CLIs
skygrep "where is the config file?"

Requires Python 3.9+ and a local Ollama server with at least one embedding model pulled (ollama pull nomic-embed-text). Pull a generation model only if you use --answer, --agentic, the LLM router, or the cascade's HyDE escalation (ollama pull qwen2.5:3b).

Compatibility / honesty

  • 120 unit tests pass.
  • PolyForm Noncommercial 1.0.0 is not OSI-approved open source. GitHub will mark this repo as "non-permissive". Personal use remains fully free.
  • pdftotext, tesseract, pdftoppm are only required when using the corresponding extraction features; install via brew install poppler tesseract on macOS if needed.

Why a fresh start?

This codebase has prior history — earlier iterations explored cascade tuning, multi-language benchmarks, end-to-end agent benchmarks, and a 5-tier retrieval architecture. The historical incremental release notes are not preserved here because the project has been re-licensed (MIT → PolyForm NC) and rebranded; the current source tree, retroactively rewritten git history, and this 0.1.0 release notes file are the single source of truth for what skylakegrep is today.