skylakegrep Release notes

What's new in skylakegrep.

Every release has its own themed page (linked from each card below) and is also published on the GitHub Releases page with attached wheel + sdist artifacts. The most recent shipped versions, in reverse chronological order (newest first):

0.5.13 · adaptive candidate recall · agent-grade context benchmark

Semantic and mixed queries now get bounded recall before cascade decides final ranking

This release adds a generic candidate recall substrate for semantic, lexical, and mixed queries. Explicit include scope, indexed path tokens, indexed symbols, SQLite chunk text, and a small rg -il -F pass can all vote files into the candidate pool before the normal scorer and reranker decide final ordering. Content and agent calls now receive a bounded same-file support pack when needed, so a downstream LLM sees compact evidence rather than just paths. The new agent tool-context benchmark shows 6.12× fewer tool calls, 37.74× less context, and 31.27× higher sufficiency density than a raw-rg agent baseline on 24 generic depth tasks.
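
A minimal sketch of how several recall sources could vote files into a bounded pool. The function name, the one-vote-per-source tally, and the alphabetical tie-break are assumptions for illustration, not skylakegrep's actual implementation; the normal scorer and reranker would still decide final ordering afterwards.

```python
from collections import Counter


def recall_candidates(votes_by_source, limit):
    """Merge per-source file votes into a bounded candidate pool.

    votes_by_source maps a source label (hypothetical names such as
    "path_tokens" or "rg") to the file paths that source proposed.
    """
    tally = Counter()
    for paths in votes_by_source.values():
        # Each source votes at most once per file.
        for path in set(paths):
            tally[path] += 1
    # Most-voted files first; ties broken by path so the output is stable.
    ranked = sorted(tally, key=lambda p: (-tally[p], p))
    return ranked[:limit]
```

The `limit` argument is what keeps recall bounded: however many sources vote, the pool handed to the scorer never exceeds it.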

Full notes →
0.5.12 · bounded cold semantic routing · public example verification

Cold semantic search is now budgeted, scoped, and verified against the public examples

This release hardens the path that previously let broad cold-start semantic searches spend too long exploring home and sibling roots. The lazy cwd and cross-folder lanes now have separate wall-clock budgets, hidden and dependency-cache trees are pruned before traversal, ripgrep fallback runs in one bounded pass, warm cross-folder expansion no longer pollutes local answers, and relative include globs such as --include "src/**" match absolute indexed paths. The public README / GitHub Pages examples were rerun on a fictional project, and --answer was tightened so direct evidence does not receive a contradictory missing-answer caveat.
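
One way a relative include glob can be matched against absolute indexed paths is to relativise the path against the search root first. This is a sketch under that assumption; matches_include and its parameters are hypothetical names, and POSIX-style paths are assumed.

```python
from fnmatch import fnmatchcase
from pathlib import PurePosixPath


def matches_include(abs_path, root, pattern):
    """Return True if abs_path (inside root) matches the include pattern."""
    path = PurePosixPath(abs_path)
    if pattern.startswith("/"):
        # Absolute patterns are compared against the absolute path directly.
        return fnmatchcase(str(path), pattern)
    try:
        relative = path.relative_to(root)
    except ValueError:
        # The file lives outside the search root, so a relative glob cannot match.
        return False
    return fnmatchcase(str(relative), pattern)
```

With this shape, `matches_include("/home/u/proj/src/pkg/mod.py", "/home/u/proj", "src/**")` is true, while a file under `tests/` or outside the root is rejected.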

Full notes →
0.5.11 · fast scoped discovery · Python 3.10 CI hotfix

0.5.10's fast scoped discovery release, hardened for the full CI matrix

This release supersedes 0.5.10. The generic scoped descriptor + metadata file-discovery lane, background refresh deferral, wall-time footer, and automatic setup-instruction refresh remain the main user-facing behavior. The proactive enhancer now also catches Python 3.10 concurrent.futures.TimeoutError budget exhaustion explicitly, so a total-budget timeout is recorded as telemetry instead of escaping as an exception in CI.
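
Before Python 3.11, concurrent.futures.TimeoutError is a separate class from the built-in TimeoutError, so a handler written only for the built-in misses it on 3.10. The sketch below catches both and records the timeout as telemetry; run_with_budget, slow_enhancer, and the telemetry dict are hypothetical names, not the project's code.

```python
import concurrent.futures
import time


def run_with_budget(fn, budget_s):
    """Run fn in a worker thread; report a budget overrun instead of raising."""
    telemetry = {"timed_out": False}
    pool = concurrent.futures.ThreadPoolExecutor(max_workers=1)
    future = pool.submit(fn)
    try:
        result = future.result(timeout=budget_s)
    except (concurrent.futures.TimeoutError, TimeoutError):
        # On 3.10 these are two distinct classes; on 3.11+ they are the same.
        telemetry["timed_out"] = True
        result = None
    finally:
        # Do not block the caller on a worker that is still running.
        pool.shutdown(wait=False)
    return result, telemetry


def slow_enhancer():
    """Hypothetical enhancer that overruns a small budget."""
    time.sleep(0.3)
    return "late"
```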

Full notes →
0.5.10 · fast scoped file discovery · agent instruction depth

Scoped file-location questions return fast, while agents learn the right output depth

This release adds a generic scoped descriptor + metadata file-discovery lane for path-depth questions that name a folder, target descriptors, and metadata such as created / modified / opened. Those searches can now finish from bounded filesystem evidence without waiting for semantic cascade, while content, answer, and agentic queries still keep deeper retrieval alive. Large foreground refreshes defer to background indexing, footer timing now reports command wall time, and skygrep setup writes the information-depth ladder into Claude Code, Codex, OpenCode, Gemini CLI, and Cursor instructions. Existing managed setup snippets auto-refresh after upgrade without touching user-authored text.

Full notes →
0.5.9 · generic adaptive routing · scoped search performance

Scope is now a first-class query-plan facet, so scoped searches stay fast and precise

This release upgrades routing from a single terminal intent into a facet-based plan: scope, target, metadata, and answer depth are handled independently. Folder / repo / workspace clauses are resolved to a real local root and stripped from router text before fast-intent, LLM fallback, metadata, and lexical gates run. Metadata remains instant when it fully answers the query, but becomes only a modifier when the user also names a target. Scoped semantic and JSON/agent queries can now finish from strong lexical evidence without waiting for expensive cascade/rerank, while CJK and mixed-language scope forms are handled generically. The release gate includes a 12-case synthetic CLI benchmark covering cold/warm, filename, semantic, metadata terminal/modifier, CJK, wrong-directory proactive, and JSON output.

Full notes →
0.5.8.7 · adaptive query-plan routing · metadata terminal/modifier facets

Metadata is now a routing facet: fast when it fully answers, deeper when it only constrains

This release turns filesystem metadata into a structured query-plan facet instead of a mutually exclusive intent. Pure metadata queries such as recently opened or recently created files still return from the fast filesystem lane, while composite queries keep searching and use metadata only as a cheap reranking signal inside already-relevant candidates. It adds created-time metadata, CJK terminal/modifier separation, a code-identifier collision guard for tokens such as created_at, optional document evidence fields for JSON and agent paths, and expands the privacy release gate to benchmark files.

Full notes →
0.5.8.6 · answer-depth routing · progressive semantic refinement

Fast path answers stay fast, while semantic questions continue past filename anchors

This release makes filename evidence answer-depth aware. A concrete filename hit can finish a path-depth query, but semantic questions that mention a filename keep that file as an anchor while lazy/cascade refinement continues in the same invocation. Human output now shows the active router lane by default, cold-start semantic queries can show a bounded content preview from the anchor before refinement, metadata questions use a fast filesystem lane, and cross-folder diffusion is suppressed when the current scope already has a concrete anchor.

Full notes →
0.5.8.5 · multilingual routing · bare-query CLI · bounded proactive diffusion

Natural-language queries work bare, quoted, or smart-quoted while semantic recall remains intact

This release hardens the intelligent router around the real cases users type at a shell: skygrep where is my case42 file in Downloads, skygrep -x where is my case42 file in Downloads, smart-quoted pasted text, and Chinese or mixed-language filename questions such as 我的 CASE42 文件在哪 and 我的合同文件在哪. Filename answers can return immediately when the filename layer has a clear answer, while literal/rg evidence still joins the normal semantic cascade instead of suppressing it. Proactive outside-path diffusion is now bounded to filename intent, and obvious semantic questions take a cheap semantic pre-router path before the LLM router.

Full notes →
0.5.8.3 · hot-fix · filename skip-cascade UnboundLocalError

High-confidence filename lookups can skip cascade without crashing later in the warm cross-folder gate

The LLM router correctly skips semantic cascade for pure filename lookups such as skygrep -x "where is my case42 file". In 0.5.8.2 the cascade branch initialised queries, but the cascade-skipped path did not; a later warm cross-folder condition still read len(queries). 0.5.8.3 initialises queries = [query] before that branch and adds regression coverage for the high-confidence filename skip path. No schema change, no index rebuild, public CLI and JSON shape unchanged.
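
The bug pattern and its fix can be sketched as follows. Only queries = [query] comes from the release note; the function name, the stand-in query expansion, and the gate condition are hypothetical.

```python
def search(query, skip_cascade):
    """Illustrate initialising `queries` before the cascade branch."""
    # Initialised up front so every later gate can read it on both paths.
    queries = [query]
    lanes = []
    if not skip_cascade:
        # Hypothetical stand-in for the query expansion done inside cascade.
        queries = [query, query.lower()]
        lanes.append("cascade")
    # A later gate that reads len(queries) whether or not cascade ran.
    if len(queries) > 1:
        lanes.append("warm-cross-folder")
    return {"queries": queries, "lanes": lanes}
```

If the first assignment lived only inside the `if not skip_cascade:` branch, the skip path would reach `len(queries)` with the name unbound and raise UnboundLocalError.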

Full notes →
0.5.8 · why-this-matched (--explain) · Ollama autostart · ES comparison

Every result now carries the full retrieval provenance — pass --explain and skygrep tells you why

Three layers of "why" answer three different questions: (a) a router rationale at the top — 🧭 router: <intent> · primary_token=… · conf=… · source=… plus a one-sentence reason from the LLM router; (b) a per-result via: line — which channel(s) contributed (cosine cascade · symbol RRF · filename-lookup · ripgrep), what symbol terms matched, the score; and (c) a 🛤 cascade lane: summary at the bottom with the σ-adaptive evidence (gap=…, tau=…). No new model calls, no extra retrieval — every signal was already in the pipeline; we just stopped throwing it away at render time. Off by default: existing UX is byte-identical to 0.5.7. Bonus: if Ollama isn't running but is installed, skygrep autostarts ollama serve in the background (5 s budget, env-tunable) and tells you. Two latent LLM-router bugs were also fixed along the way: keep_alive was being sent as the string "-1" which recent Ollama rejects with HTTP 400, and LLM_TIMEOUT_SECONDS defaulted to 0.5 s which timed out 100 % of cold qwen2.5:3b calls — both had been silently forcing the rule-based fallback on most queries. README + Pages now ship a dedicated "How skylakegrep differs from Elasticsearch" section answering the most common second-question. 207/207 unit tests pass; head-to-head vs 0.5.7 PyPI on the same query produces byte-identical paths and scores when --explain is off.
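
The keep_alive fix can be illustrated with a small payload helper. This is a sketch only: normalize_keep_alive and build_router_payload are hypothetical names, and the payload fields are assumed to follow Ollama's generate request shape (model, prompt, stream, keep_alive). The idea is to send a bare integer such as -1 as a JSON number and to leave strings that already carry a unit (for example "5m") untouched.

```python
import json


def normalize_keep_alive(value):
    """Turn unit-less numeric strings like "-1" into integers."""
    if isinstance(value, str):
        text = value.strip()
        try:
            return int(text)
        except ValueError:
            # Strings with a unit, such as "5m", are passed through unchanged.
            return text
    return value


def build_router_payload(model, prompt, keep_alive=-1):
    """Serialise a request body with keep_alive as a number, not a string."""
    return json.dumps(
        {
            "model": model,
            "prompt": prompt,
            "stream": False,
            "keep_alive": normalize_keep_alive(keep_alive),
        }
    )
```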

Full notes →
0.5.7 · hot-fix · cross-folder lazy SQLite cross-thread error

Re-applies the 0.5.6 worker-thread SQLite-conn pattern to all three lazy worker call sites

0.5.6 introduced the parallel proactive umbrella, cascade-in-worker-thread, and a dedicated SQLite connection inside cascade's worker. The same fix was missed for the two cross-folder lazy paths (the cold + wrong-folder branch's _run_cross and the warm + low-confidence cross-folder path) plus the cold-start _run_cwd branch. All three passed the main-thread conn into a ThreadPoolExecutor worker, triggering "SQLite objects created in a thread can only be used in that same thread", which was silently caught and turned into "0 results". The proactive umbrella's filename_extend tier still answered in 1 s on the wrong-path scenario, so the user-visible regression was small, but the lazy semantic tier was a dead path. 0.5.7 replicates the 0.5.6 cascade pattern across all three sites: each worker opens init_db(db_path), uses that conn for the call, and closes it in finally. Verified: wrong-path wall 1 s with cross-folder returning 5 cosine-ranked Django files (no SQLite stderr). Hard bench 4/10 preserved. Pytest 217/217.
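
The per-worker connection pattern looks roughly like this. The sketch uses sqlite3.connect as a stand-in for the project's init_db(db_path); the chunks table and both function names are assumptions made for the example.

```python
import os
import sqlite3
import tempfile
from concurrent.futures import ThreadPoolExecutor


def count_chunks(db_path):
    """Worker: open a connection in this thread, use it, close it in finally."""
    conn = sqlite3.connect(db_path)
    try:
        return conn.execute("SELECT COUNT(*) FROM chunks").fetchone()[0]
    finally:
        conn.close()


def demo():
    """Create a throwaway index on the main thread, then query it from a worker."""
    with tempfile.TemporaryDirectory() as tmp:
        db_path = os.path.join(tmp, "index.db")
        main_conn = sqlite3.connect(db_path)
        main_conn.execute("CREATE TABLE chunks (path TEXT)")
        main_conn.executemany(
            "INSERT INTO chunks VALUES (?)", [("a.py",), ("b.py",)]
        )
        main_conn.commit()
        with ThreadPoolExecutor(max_workers=1) as pool:
            # Pass the path, not main_conn: connections are bound to their thread.
            count = pool.submit(count_chunks, db_path).result()
        main_conn.close()
        return count
```

Passing `main_conn` into the worker instead of `db_path` is what raises sqlite3's same-thread ProgrammingError under the default check_same_thread setting.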

Full notes →
0.5.6 · proactive umbrella runs in parallel with cascade (12:50 → 26 s)

Cascade and proactive umbrella subprocesses fire at t = 0; whoever has an answer streams first

0.5.4 still had a sequential chain: filename + rg → cascade (100 ms ~ 60 s) → cross-folder (5–30 s) → proactive enhancers (≤ 1 s) → render. On a vocabulary-mismatch query (e.g. asking about a PDF in ~/Downloads from inside a code repo) the user reported a 12-minute-50-second wall clock, with the right answer hidden behind 99.7 s of cascade rerank. 0.5.6 refactors to the conceptual model in docs/proactive-umbrella-framework.md: cascade and every proactive subprocess (filename_extend, lazy_cross_folder, lazy_cwd) fire in parallel via ThreadPoolExecutor, and each streams its results with a route + quality label as soon as they are ready. New 30 s hard timeout on cascade (worker-thread + dedicated SQLite connection) plus the existing 8 s cap on cross-folder. Same query: 12:50 → 26 s; first answer at ~1–2 s from the proactive umbrella block. Pytest 217/217.
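
A simplified sketch of the fire-everything-at-t=0 model, using concurrent.futures.as_completed so that whichever lane finishes first is emitted first. It applies a single overall budget rather than the per-lane 30 s and 8 s caps described above; run_lanes, the emit callback, and the two example lanes are hypothetical.

```python
import concurrent.futures as cf
import time


def run_lanes(lanes, budget_s, emit=print):
    """Start every lane at once and emit results in completion order.

    lanes maps a route label to a zero-argument callable.
    Returns the labels of the lanes that finished within budget_s.
    """
    pool = cf.ThreadPoolExecutor(max_workers=max(1, len(lanes)))
    futures = {pool.submit(fn): name for name, fn in lanes.items()}
    finished = []
    try:
        for future in cf.as_completed(futures, timeout=budget_s):
            name = futures[future]
            emit(name, future.result())
            finished.append(name)
    except cf.TimeoutError:
        # Lanes still running at the deadline are not reported.
        pass
    finally:
        pool.shutdown(wait=False)
    return finished


def fast_filename_lane():
    """Hypothetical proactive lane that answers almost immediately."""
    return ["~/Downloads/report.pdf"]


def slow_cascade_lane():
    """Hypothetical cascade lane that takes noticeably longer."""
    time.sleep(0.5)
    return ["src/app.py"]
```

Because emission follows completion order, the fast lane's answer is visible well before the slow lane returns, which is the property the release relies on.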

Full notes →
0.5.4 · doc-surface completion + streaming cold-start UX

No more silent 30-second prompt — preliminary results stream as soon as rg returns

Two threads. (1) Doc surfaces: hero eyebrow on the GitHub Pages homepage bumped from a stale v0.2.13 to v0.5.4, bench-stats summary now carries a +4 / 10 lazy auto-trigger over rg cold-start tile alongside the 30 / 30 fully-indexed peak, og:description + twitter:description updated for accurate social cards, cli.html documents --lazy / --no-lazy + SKYGREP_PROACTIVE_DIRS + the env-var table, benchmarks.html grew a Cold-start lazy auto-trigger section with the real-CLI table, README headline tagline carries the +30 % lazy line. (2) Streaming cold-start UX: skygrep "<query>" on a never-indexed dir no longer sits silent for 5–30 s. It prints an immediate 🔍 scanning… line, the preliminary rg / filename hits as soon as rg returns (≤ 1 s), the 🌊 / 💧 / ⚡ lazy progress to stderr, and the lazy-refined matches under a ▾ refined matches… header — already-printed paths are de-duplicated. Pytest 217/217.

Full notes →
0.5.3 · lazy seed selection that actually works (1/10 → 4/10)

Real, measurable hit-rate improvement over rg cold-start on the Django oracle

0.5.0–0.5.2 shipped lazy auto-trigger but the real CLI bench showed it at 1/10 — barely above pure ripgrep. 0.5.3 fixes the root cause: token-shortcut DEDUP for numeric-prefix migration families, numeric-prefix scoring penalty, deterministic dir-token picker (LLM-router-independent), test-path penalty, weighted per-dir budget (top dir gets 8 files, then 4/2/2), and a critical Ollama keep_alive bug fix that had been silently zero-ing every LLM-router call (HTTP 400, "missing unit in duration"). Plus regex import diffusion, ThreadPool I/O parallelism, progressive stderr progress lines, cold-+-wrong-folder lazy_cwd ∥ lazy_cross_folder branch, and warm-cascade low-confidence cross-folder augmentation. Hard CLI bench: rg-only 0/10 → auto-trigger 4/10 (+30%) at +16 s/query on fresh Django. Pytest 217/217.

Full notes →
0.5.2 · themed HTML release notes + de-hardcoded proactive dirs

Closing the doc-surface gap from 0.5.0/0.5.1 + zero personal-filesystem assumptions

0.5.0 and 0.5.1 shipped with .md release notes only — the changelog "Full notes →" links pointed at .html pages that didn't exist. Closed in 0.5.2: a reusable scripts/render_release_notes.py wraps each .md in the same themed sidebar / topbar layout as 0.4.2.html so future releases don't repeat the lapse. Both skylakegrep-0.5.0.html and skylakegrep-0.5.1.html now exist alongside this page. Second change: SKYGREP_PROACTIVE_DIRS (colon-separated absolute paths) replaces hardcoded ~/Downloads / ~/Desktop / ~/Documents in proactive._default_search_dirs AND in lazy_indexer.lazy_explore_cross_folder's default candidate_roots. Both lists were the maintainer's personal layout; the codebase now contains zero personal-filesystem assumptions. Pytest 201 / 201.
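
Reading SKYGREP_PROACTIVE_DIRS could look like the sketch below. Only the variable name and its colon-separated absolute-path format come from the note above; the function name, the de-duplication, the dropping of relative entries, and the empty default are assumptions.

```python
import os
import posixpath


def proactive_dirs(env=None):
    """Parse SKYGREP_PROACTIVE_DIRS into an ordered list of absolute paths."""
    env = os.environ if env is None else env
    raw = env.get("SKYGREP_PROACTIVE_DIRS", "")
    dirs = []
    for part in raw.split(":"):
        part = part.strip()
        # Skip empty segments, relative paths, and repeats.
        if part and posixpath.isabs(part) and part not in dirs:
            dirs.append(part)
    return dirs
```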

Full notes →
0.5.1 · auto-trigger lazy on cold-start (no flag needed)

The user shouldn't have to know which tier they need

0.5.0 shipped --lazy as opt-in. Wrong: the whole premise of lazy is that the user doesn't know whether they're in the right folder or whether their query aligns with the code's vocabulary, so they can't be expected to add the flag. 0.5.1 flips it to auto-trigger. skygrep "<query>" on a never-indexed project now: runs rg immediately; if rg paths cover ≥ 2 distinct query tokens (PascalCase / snake_case split), returns the instant keyword answer; otherwise also fires the LLM-routed lazy semantic tier (~5 s) and merges. --no-lazy is the new opt-out for benchmarking. Verified end-to-end on a fresh Django checkout via real CLI (not python API) — 6 scenarios, 5 of which exercise the auto-trigger from both directions; e.g. "cache invalidation strategy" returns django/middleware/cache.py, cached_db.py, etc. in 8 s. Pytest 201 / 201.
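
The coverage gate can be sketched as a token-overlap check, assuming "cover" means that tokens split out of the rg hit paths overlap at least two distinct query tokens. Both function names and the exact splitting regex are illustrative, and stop-word handling is left out.

```python
import re


def split_tokens(text):
    """Lower-cased tokens after splitting on separators and PascalCase humps."""
    tokens = set()
    for word in re.split(r"[^A-Za-z0-9]+", text):
        # Acronym runs, capitalised or lower-case words, and digit runs.
        for piece in re.findall(r"[A-Z]+(?![a-z])|[A-Z]?[a-z]+|[0-9]+", word):
            tokens.add(piece.lower())
    return tokens


def covers_query(query, rg_paths, min_tokens=2):
    """True when the hit paths share at least min_tokens tokens with the query."""
    query_tokens = split_tokens(query)
    path_tokens = set()
    for path in rg_paths:
        path_tokens |= split_tokens(path)
    return len(query_tokens & path_tokens) >= min_tokens
```

When `covers_query` is false, the keyword answer alone is considered too thin, which is the case where the lazy semantic tier would also fire.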

Full notes →
0.5.0 · adaptive lazy index for cold-start semantic search

Real-semantic answer in ~5 s on a project that has never been indexed

0.4.x removed graph-walk after a production crash and didn't actually improve recall. 0.5.0 returns to the original design intent: the user shouldn't have to run skygrep index . and wait 5–10 minutes for the first semantic question. New lazy_indexer module: an LLM router (qwen2.5:3b) picks 5–15 likely entry-point paths from the directory tree alone, the embedder batch-embeds them in one bge-m3 call (~5× speedup over per-file), and a σ-validated cosine top-K is returned with confidence telemetry. Django bench (10 queries, fresh DB): 4 / 10 hits, p50 5.0 s, max 9.1 s. Architecture: lazy is purely additive — fills the previously-empty middle of the latency / recall curve between rg cold-start (~100 ms keyword) and full eager index (5–10 min upfront, 30 / 30 recall). 0.4.x graph_expand stays removed.

Full notes →
0.4.2 · P0 hot-fix · KeyError 'snippet'

Critical crash on every escalated query that graph_expand contributed to

0.4.0 / 0.4.1 shipped a production crash: KeyError: 'snippet' at cli.py:92 merge_results, fired on every CLI search query that escalated AND for which graph_expand returned candidates. It was hidden by tests that called cascade_search directly, never through the CLI. Hot-fix: emit the missing snippet (and the full result-dict shape) from _expand_via_reference_graph(). Verified by a real CLI search query on the local index: escalation ran end-to-end (7.3s, no crash). Public OSS bench backfill is deferred to 0.4.3 (the bench wrapper currently stalls on Django, root cause under investigation). New auto-memory rule: every release MUST exercise skygrep search on a real index before tagging.

Full notes →
0.4.1 · real-corpus bench backfill

Honest end-to-end measurement of 0.4.0 graph_expand on real bge-m3

Backfills the real-CLI bench that 0.4.0 should have included before shipping. Indexed 27 Python files of skylakegrep itself with real bge-m3 embeddings, populated 108 graph nodes + 190 reference edges, ran 5 representative semantic queries with tau=0 forced escalation. graph_expand fired correctly on 3/3 escalated queries (4–9 candidates contributed each), correctly skipped on 2/2 cheap-path queries, latency invariant held (cheap path ~7ms, escalation 1.7–2.6s). Top-5 hit rate 3/5 on this bench: the substrate works, but doesn't magically improve recall on queries where the answer is 2+ hops from cosine top-K. Honest framing now propagated to all GH surfaces. New auto-memory rule: every release MUST include a real-corpus end-to-end run before any public-surface update. Full bench: benchmarks/release-0.4.0-real-corpus.md.

Full notes →
0.4.0 · holistic graph-aware retrieval (zero new hyperparameters)

1-hop reference-graph expansion in cascade escalation — by-construction additive, by-construction latency-neutral

Closes the v2 design that 0.3.0 attempted with 9+ preset hyperparameters (rolled back in 0.3.1). 0.4.0 redoes it holistically per the holistic graph-aware retrieval plan and the "intelligence-is-conditional" principle just locked into auto-memory. Zero new hyperparameters: every weight is cosine, every threshold is the existing CASCADE_TAU_FLOOR, every edge is a reference-graph ref. ~50 LoC added; ~800 LoC of phased-design scaffolding deleted (graph_walk.py · query_seeds.py · graph_substrate.py · per-component tests). End-to-end integration test covers the whole; no isolated-component tests. Cheap path identical to 0.2.21; escalation adds ≤ 2 ms latency to union 1-hop neighbours into the rerank pool.

Full notes →
0.3.1 · principled rollback

Graph-walk integration rolled back; preset hyperparameters stripped

0.3.0 shipped on by-construction arguments without an end-to-end bench. First real-corpus run (5 queries on skylakegrep/src/) showed only 2/5 hits — the seed mapper correctly ranked the answer first (graph_walk.py at 55% seed mass), but PPR walk diluted it through 9+ preset edge weights into structurally-adjacent siblings. 0.3.1 reverts the cascade integration (production behaviour ≡ 0.2.21), strips all score_per_hit constants, derives path_prox weight from path depth. Substrate modules + tests stay; integration deferred until weights are derived from corpus stats. Full benchmark and rollback rationale in benchmarks/release-0.3.0-graph-walk.md.

Full notes →
0.3.0 · graph-walk retrieval (v2 MVP)

Knowledge-graph substrate · cold-start seeds · bounded PPR walk

First new code-path release since 0.2.0. Implements the v2 retrieval substrate from the graph-walk plan: heterogeneous knowledge graph (file/folder/chunk/symbol/token nodes, 8 edge types, 4 cheap edges built at index time), bounded forward-push Personalized PageRank (Andersen-Chung-Lang 2006, σ-stop, max 200 nodes), and cold-start query → seeds via 4 matchers (filename/symbol/semantic/path-token) — first query produces rich seeds with zero history. Gated behind SKYGREP_GRAPH_WALK=1; default unchanged. Tests 221/221 (was 201). Latency invariant (cheap path unchanged) + accuracy invariant (purely additive to candidate pool) by construction.
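
For orientation, a bounded forward-push approximation of Personalized PageRank can be sketched as below. This is a generic textbook-style variant, not the project's module: the alpha and eps defaults are assumed values, the σ-stop rule is omitted, and only the 200-node cap is taken from the description above.

```python
from collections import deque


def forward_push(graph, seeds, alpha=0.15, eps=1e-4, max_nodes=200):
    """Approximate PPR by pushing residual mass outward from the seeds.

    graph maps a node to its list of neighbours; seeds maps a node to its
    starting mass (expected to sum to 1). Returns (estimate, residual).
    """
    estimate = {}
    residual = dict(seeds)
    queue = deque(residual)
    queued = set(residual)
    while queue:
        node = queue.popleft()
        queued.discard(node)
        mass = residual.get(node, 0.0)
        neighbours = graph.get(node, [])
        if not neighbours:
            # Dangling node: settle everything it holds.
            estimate[node] = estimate.get(node, 0.0) + mass
            residual[node] = 0.0
            continue
        if mass < eps * len(neighbours):
            # Too little mass per edge to be worth pushing.
            continue
        residual[node] = 0.0
        estimate[node] = estimate.get(node, 0.0) + alpha * mass
        share = (1.0 - alpha) * mass / len(neighbours)
        for neighbour in neighbours:
            if neighbour not in residual and len(residual) >= max_nodes:
                # Node cap reached: keep this share on the current node.
                estimate[node] += share
                continue
            residual[neighbour] = residual.get(neighbour, 0.0) + share
            if neighbour not in queued:
                queue.append(neighbour)
                queued.add(neighbour)
    return estimate, residual
```

Every push either settles mass or moves it to a neighbour, so settled plus residual mass always equals the seed mass; the node cap bounds how far the walk can spread.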

Full notes →
0.2.21 · comparison label trim · perf banner relayout

Comparison row-3 label trimmed; performance aggregate banner now 3 clean columns

Reviewer caught two more overlaps. (1) The row-3 capability label Content — code · md · PDF · docx · images (~330 px) ran into the first data column's code · md · PDF · docx payload — trimmed to Content — multimodal (the cells already enumerate per tool). (2) The aggregate banner had three big numbers + three small labels jammed onto a single line that overflowed the 1032 px banner width — rebuilt as a 3-column "headline above caption" layout (banner 100 → 120 px tall) so the eye compares 30/30 · 60×–770× · −82% in parallel.

Full notes →
0.2.20 · hero geometry · raster stability

Third result row no longer collides with terminal frame; pill row stable in raster too

Reviewer caught the row-3 issue in one screenshot: utils/jwt.py overlapped its own :22-39 line range, and the third result card overflowed the terminal bottom border by 6 px. Path range moved x=100→120 (+14 px gap); card height 358→380 px (16 px bottom margin). Plus three issues caught in the sweep — XML-invalid <query> / & in cli-cheatsheet + configuration descs, and the cairosvg-only pill-row tspan glitch on hero + og-image (which broke social-preview unfurls). Both pill rows flattened to plain text.

Full notes →
0.2.19 · themed everywhere

Every clickable link now stays in the themed shell

25 markdown documents (20 release notes · principles · parity-benchmarks · token-benchmarking · roadmap · releasing) rendered as themed HTML pages with the same shared chrome. 83 link rewrites across 33 files (changelog "Full notes →", sidebar "Principles" link in every subpage, parity-benchmarks references, README internal links). Clicking any link no longer drops the visitor onto a raw GitHub markdown blob — the slate-blue frosted-glass shell stays consistent end-to-end.

Full notes →
0.2.18 · layout fix

Homepage now uses full horizontal width — Cody column no longer clipped

The 0.2.17 homepage was capped at --content-max: 800px with the legacy right-TOC still in the grid, so the 5-column comparison matrix overflowed and clipped the Sourcegraph Cody column on wide screens. Right TOC removed (homepage already has sidebar navigation), .docs-shell overridden to a 2-column grid up to 1500px, themed cards now span 100% of the content area. No more empty right margin.

Full notes →
0.2.17 · lifted theme · cheatsheet · config

Slate-blue glass · CLI cheatsheet + Configuration visualised · "custom type" rename

Dark theme lifted: --bg #0a0d12 → #13192a, panels 1.6 – 2.5× more opaque, themed-card backdrop-filter blur bumped to 36px so frosted glass reads through to slate-blue depth. Two new themed SVG cards: cli-cheatsheet.svg (bare form featured + 8 secondary tiles) and configuration.svg (three grouped env-var panels). The "Custom type" tile dropped from content-types.svg: 5 built-in types + extensibility footer banner; content-agnostic shouldn't sound like the user has to customise.

Full notes →
0.2.16 · mgrep correction · softer palette

mgrep corrected · softer frosted glass · workflow + content-types upgraded

0.2.15 mis-labeled mgrep as a "predecessor"; it is the Mixedbread AI cloud-backed paid CLI, the closest commercial competitor. Comparison row corrected (cloud-backed · sub + usage · npm + acct). Palette softened: #67e8f9 → #a5f3fc, #22d3ee → #38bdf8, multi-stop frosted borders + backdrop-filter: blur. Two new themed SVGs: workflow-diagram.svg (3-stage pipeline) and content-types.svg (6-tile grid) replace the README's ASCII workflow + plain content-type table.

Full notes →
0.2.15 · comparison + visual

Named-tool comparison · themed SVG cards

Comparison surface sized against four named alternatives — ripgrep, mgrep, autodev-codebase, and Sourcegraph Cody — instead of generic "Cloud RAG". New themed SVG cards (comparison-matrix.svg + performance-matrix.svg) anchor the README with the same frosted-glass aesthetic as the hero. (Note: 0.2.15 mis-labeled mgrep as a predecessor; corrected in 0.2.16.)

Full notes →
0.2.14 · homepage

Homepage redesign + 6-page split

Animated terminal hero, three audience scenarios, comparison panel, three-step how-it-works diagram, bench-headline section, honesty list. 1405 → 849 lines on the home page; reference content moved to 6 new shared-chrome subpages (concepts · architecture · cli · reference · benchmarks · changelog).

Full notes →
0.2.13 · privacy

User-personal references swept from public docs

Sanitisation pass across release notes, plan documents, README, and GitHub Pages. No code change. Every public artefact now free of user-personal example tokens.

Full notes →
0.2.12 · proactive

Morphology fallback when LLM is unreachable

filename_extend's should-fire gate now extracts a candidate token via content-shape morphology when the LLM didn't supply primary_token. Also filed the conversational session-state plan.

Full notes →
0.2.11 · proactive

Second built-in enhancer: recovery_progress_hint

Content queries during a partial-index re-embed now get live progress + ETA + retry guidance instead of "no matches yet". Plus ProactiveContext infrastructure for future enhancers.

Full notes →
0.2.7 - 0.2.10 · proactive

Proactive enhancement framework + 4 iterations

Content-agnostic enhancer registry that runs in parallel after the cascade with bounded latency budget. Built-in filename_extend searches sibling home dirs when the in-project search returned 0 hits. Four bug-fix iterations (0.2.8 - 0.2.10) recorded as Principle 1 receipts.

0.2.7 → 0.2.10 →
0.2.6 · understanding

LLM-driven scope classification

RouterDecision.out_of_scope set by the same LLM router call. Replaces the keyword _METADATA_TOKENS list. Principle 1 ✓ shipped — substrate understanding over enumeration.

Full notes →
0.2.0 · substrate

bge-m3 + content-agnostic graph + 30 / 30

Default embedder switched from mxbai-embed-large to bge-m3 (1024-d, multilingual, symmetric XLM-RoBERTa). Content-agnostic reference_graph.register_extractor() registry. σ-adaptive cascade. 30 / 30 public-OSS recall (was 28 / 30).

Full notes →

All releases on GitHub →