Release notes
What's new in skylakegrep.
Every release has its own themed page (linked from each card below) and is also published on the GitHub Releases page with attached wheel + sdist artifacts. The most recent shipped versions, newest first:
Semantic and mixed queries now get bounded recall before cascade decides final ranking
This release adds a generic candidate recall substrate for semantic,
lexical, and mixed queries. Explicit include scope, indexed path tokens,
indexed symbols, SQLite chunk text, and a small rg -il -F
pass can all vote files into the candidate pool before the normal scorer
and reranker decide final ordering. Content and agent calls now receive a
bounded same-file support pack when needed, so a downstream LLM sees
compact evidence rather than just paths. The new agent tool-context
benchmark shows 6.12× fewer tool calls, 37.74× less context, and 31.27×
higher sufficiency density than a raw-rg agent baseline on
24 generic depth tasks.
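The voting idea above can be pictured with a minimal sketch. It assumes each recall source simply returns a list of file paths and that a file's rank in the pool is its vote count; the function name, the source labels, and the tie-break rule are hypothetical and are not taken from the skylakegrep code.

```python
from __future__ import annotations

from collections import Counter
from typing import Iterable, Mapping


def build_candidate_pool(votes_by_source: Mapping[str, Iterable[str]], limit: int) -> list[str]:
    """Return at most `limit` files, most-voted first (hypothetical helper)."""
    votes = Counter()
    for _source, paths in votes_by_source.items():
        # Each source votes at most once per file, however often it saw it.
        for path in set(paths):
            votes[path] += 1
    # Sort by descending vote count, then by path so the order is deterministic.
    ranked = sorted(votes, key=lambda p: (-votes[p], p))
    return ranked[:limit]


pool = build_candidate_pool(
    {
        "path_tokens": ["a.py", "b.py"],
        "symbols": ["b.py"],
        "rg": ["b.py", "c.py"],
    },
    limit=2,
)
```

The pool is only a bounded shortlist; in the release described above, final ordering is still left to the normal scorer and reranker.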
Cold semantic search is now budgeted, scoped, and verified against the public examples
This release hardens the path that previously let broad cold-start
semantic searches spend too long exploring home and sibling roots. The
lazy cwd and cross-folder lanes now have separate wall-clock budgets,
hidden and dependency-cache trees are pruned before traversal, ripgrep
fallback runs in one bounded pass, warm cross-folder expansion no longer
pollutes local answers, and relative include globs such as
--include "src/**" match absolute indexed paths. The public
README / GitHub Pages examples were rerun on a fictional project, and
--answer was tightened so direct evidence does not receive a
contradictory missing-answer caveat.
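One way a relative include glob can be made to match absolute indexed paths is to anchor it at any directory boundary. The sketch below uses the standard-library fnmatch module and POSIX-style paths; the helper name and the anchoring rule are assumptions for illustration, not the matcher skylakegrep actually ships.

```python
from fnmatch import fnmatchcase


def include_matches(abs_path: str, pattern: str) -> bool:
    """Check an absolute POSIX path against an include glob (hypothetical helper)."""
    if pattern.startswith("/"):
        # Absolute patterns are compared as written.
        return fnmatchcase(abs_path, pattern)
    # Relative patterns may start at any directory boundary of the absolute path.
    return fnmatchcase(abs_path, "*/" + pattern)
```

With this rule, `src/**` matches `/home/u/proj/src/a/b.py` but not `/home/u/proj/mysrc/a.py`, because the pattern must begin right after a path separator.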
0.5.10's fast scoped discovery release, hardened for the full CI matrix
This release supersedes 0.5.10. The generic scoped descriptor + metadata
file-discovery lane, background refresh deferral, wall-time footer, and
automatic setup-instruction refresh remain the main user-facing behavior.
The proactive enhancer now also catches Python 3.10
concurrent.futures.TimeoutError budget exhaustion explicitly,
so a total-budget timeout is recorded as telemetry instead of escaping as
an exception in CI.
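For context: on Python 3.10 and earlier, concurrent.futures.TimeoutError is a separate class from the built-in TimeoutError, and it only became an alias in 3.11. A minimal sketch of catching both and recording the timeout as telemetry follows; the function name and the telemetry dict are hypothetical, not the enhancer's real interface.

```python
import concurrent.futures
import time


def run_with_budget(fn, budget_seconds):
    """Run fn in a worker thread; record a timeout instead of raising (sketch)."""
    telemetry = {"timed_out": False}
    pool = concurrent.futures.ThreadPoolExecutor(max_workers=1)
    future = pool.submit(fn)
    try:
        result = future.result(timeout=budget_seconds)
    except (concurrent.futures.TimeoutError, TimeoutError):
        # Naming both classes keeps the handler correct on 3.10 and on 3.11+.
        telemetry["timed_out"] = True
        result = None
    finally:
        # Do not block on the worker; it finishes in the background.
        pool.shutdown(wait=False)
    return result, telemetry


def slow_enhancer():
    time.sleep(0.3)
    return "late"
```

Calling `run_with_budget(slow_enhancer, 0.05)` returns `(None, {"timed_out": True})` rather than letting the exception escape.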
Scoped file-location questions return fast, while agents learn the right output depth
This release adds a generic scoped descriptor + metadata file-discovery
lane for path-depth questions that name a folder, target descriptors, and
metadata such as created / modified / opened. Those searches can now
finish from bounded filesystem evidence without waiting for semantic
cascade, while content, answer, and agentic queries still keep deeper
retrieval alive. Large foreground refreshes defer to background indexing,
footer timing now reports command wall time, and skygrep setup
writes the information-depth ladder into Claude Code, Codex, OpenCode,
Gemini CLI, and Cursor instructions. Existing managed setup snippets
auto-refresh after upgrade without touching user-authored text.
Scope is now a first-class query-plan facet, so scoped searches stay fast and precise
This release upgrades routing from a single terminal intent into a facet-based plan: scope, target, metadata, and answer depth are handled independently. Folder / repo / workspace clauses are resolved to a real local root and stripped from router text before fast-intent, LLM fallback, metadata, and lexical gates run. Metadata remains instant when it fully answers the query, but becomes only a modifier when the user also names a target. Scoped semantic and JSON/agent queries can now finish from strong lexical evidence without waiting for expensive cascade/rerank, while CJK and mixed-language scope forms are handled generically. The release gate includes a 12-case synthetic CLI benchmark covering cold/warm, filename, semantic, metadata terminal/modifier, CJK, wrong-directory proactive, and JSON output.
Metadata is now a routing facet: fast when it fully answers, deeper when it only constrains
This release turns filesystem metadata into a structured query-plan
facet instead of a mutually exclusive intent. Pure metadata queries
such as recently opened or recently created files still return from
the fast filesystem lane, while composite queries keep searching and
use metadata only as a cheap reranking signal inside already-relevant
candidates. It adds created-time metadata, CJK terminal/modifier
separation, a code-identifier collision guard for tokens such as
created_at, optional document evidence fields for JSON
and agent paths, and expands the privacy release gate to benchmark
files.
Fast path answers stay fast, while semantic questions continue past filename anchors
This release makes filename evidence answer-depth aware. A concrete filename hit can finish a path-depth query, but semantic questions that mention a filename keep that file as an anchor while lazy/cascade refinement continues in the same invocation. Human output now shows the active router lane by default, cold-start semantic queries can show a bounded content preview from the anchor before refinement, metadata questions use a fast filesystem lane, and cross-folder diffusion is suppressed when the current scope already has a concrete anchor.
Natural-language queries work bare, quoted, or smart-quoted while semantic recall remains intact
This release hardens the intelligent router around the real cases
users type at a shell: skygrep where is my case42 file in Downloads,
skygrep -x where is my case42 file in Downloads, smart-quoted
pasted text, and Chinese or mixed-language filename questions such as
我的 CASE42 文件在哪 and 我的合同文件在哪. Filename
answers can return immediately when the filename layer has a clear
answer, while literal/rg evidence still joins the normal semantic
cascade instead of suppressing it. Proactive outside-path diffusion is
now bounded to filename intent, and obvious semantic questions take a
cheap semantic pre-router path before the LLM router.
High-confidence filename lookups can skip cascade without crashing later in the warm cross-folder gate
The LLM router correctly skips semantic cascade for pure filename
lookups such as skygrep -x "where is my case42 file".
In 0.5.8.2 the cascade branch initialised queries,
but the cascade-skipped path did not; a later warm cross-folder
condition still read len(queries). 0.5.8.3 initialises
queries = [query] before that branch and adds regression
coverage for the high-confidence filename skip path. No schema
change, no index rebuild, public CLI and JSON shape unchanged.
Every result now carries the full retrieval provenance: pass --explain and skygrep tells you why
Three layers of "why" answer three different questions: (a) a
router rationale at the top — 🧭 router:
<intent> · primary_token=… · conf=… · source=… plus
a one-sentence reason from the LLM router; (b) a per-result
via: line — which channel(s)
contributed (cosine cascade · symbol RRF · filename-lookup ·
ripgrep), what symbol terms matched, the score; and (c) a
🛤 cascade lane: summary at the
bottom with the σ-adaptive evidence (gap=…, tau=…).
No new model calls, no extra retrieval — every signal was already
in the pipeline; we just stopped throwing it away at render time.
Off by default: existing UX is byte-identical to 0.5.7.
Bonus: if Ollama isn't running but is installed,
skygrep autostarts ollama serve in the background
(5 s budget, env-tunable) and tells you. Two latent LLM-router
bugs were also fixed along the way:
keep_alive was being sent as the string
"-1" which recent Ollama rejects with HTTP 400, and
LLM_TIMEOUT_SECONDS defaulted to 0.5 s which timed
out 100 % of cold qwen2.5:3b calls — both had been silently
forcing the rule-based fallback on most queries. README + Pages
now ship a dedicated "How skylakegrep differs from
Elasticsearch" section answering the most common
second-question. 207/207 unit tests pass; head-to-head vs 0.5.7
PyPI on the same query produces byte-identical paths and scores
when --explain is off.
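The keep_alive fix can be illustrated with a small request-body builder. Ollama's documented keep_alive values are a duration string with a unit (for example "5m") or a number, with negative numbers keeping the model loaded, so a bare string "-1" has no unit to parse. The helper name and the choice of fields below are assumptions; the source does not say which endpoint or payload shape the router uses.

```python
import json


def build_router_payload(model, prompt, keep_loaded=True):
    """Build a JSON-serialisable request body for an Ollama call (hypothetical helper)."""
    payload = {"model": model, "prompt": prompt, "stream": False}
    # Send keep_alive as the JSON number -1, or as a duration string with a unit.
    # The bare string "-1" is the form that was rejected with HTTP 400.
    payload["keep_alive"] = -1 if keep_loaded else "5m"
    return payload


body = json.dumps(build_router_payload("qwen2.5:3b", "classify this query"))
```

Serialising through json.dumps makes the difference visible: the body carries `"keep_alive": -1`, not `"keep_alive": "-1"`.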
Re-applies the 0.5.6 worker-thread SQLite-conn pattern to all three lazy worker call sites
0.5.6 introduced parallel proactive umbrella + cascade-in-worker-thread
+ dedicated SQLite connection inside cascade's worker.
The same fix was missed for the two cross-folder lazy paths (cold +
wrong-folder branch's _run_cross and warm +
low-confidence cross-folder) plus the cold-start
_run_cwd branch. All three passed the main-thread
conn into a ThreadPoolExecutor worker,
triggering "SQLite objects created in a thread can only be
used in that same thread", which was silently caught and turned into "0
results". The proactive umbrella's filename_extend
tier still answered in 1 s on the wrong-path scenario, so the
user-visible regression was small — but the lazy semantic tier
was a dead path. 0.5.7 replicates the 0.5.6 cascade pattern
across all three sites: each worker opens
init_db(db_path), uses that conn for the call, and
closes in finally. Verified: wrong-path wall 1 s
with cross-folder returning 5 cosine-ranked Django files (no
SQLite stderr). Hard bench 4/10 preserved. Pytest 217/217.
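The per-worker connection pattern described above can be sketched with the standard-library sqlite3 module. The init_db here is a stand-in that only mirrors the name used in the notes; its body, the chunks table, and the demo function are assumptions made for illustration.

```python
import os
import sqlite3
import tempfile
from concurrent.futures import ThreadPoolExecutor


def init_db(db_path):
    """Stand-in for the project's init_db: open a connection and ensure a table exists."""
    conn = sqlite3.connect(db_path)
    conn.execute("CREATE TABLE IF NOT EXISTS chunks (path TEXT)")
    return conn


def count_chunks_in_worker(db_path):
    """Worker body: open a dedicated connection, use it, close it in finally."""
    conn = init_db(db_path)
    try:
        return conn.execute("SELECT COUNT(*) FROM chunks").fetchone()[0]
    finally:
        conn.close()


def demo():
    with tempfile.TemporaryDirectory() as tmp:
        db_path = os.path.join(tmp, "index.db")
        main_conn = init_db(db_path)
        main_conn.executemany("INSERT INTO chunks VALUES (?)", [("a.py",), ("b.py",)])
        main_conn.commit()
        with ThreadPoolExecutor(max_workers=1) as pool:
            # Correct: the worker receives the path and opens its own connection.
            count = pool.submit(count_chunks_in_worker, db_path).result()
            # Incorrect: reusing the main-thread connection inside the worker.
            try:
                pool.submit(main_conn.execute, "SELECT 1").result()
                shared_conn_failed = False
            except sqlite3.ProgrammingError:
                shared_conn_failed = True
        main_conn.close()
        return count, shared_conn_failed
```

The second submit shows why the lazy tier went dark: with sqlite3's default check_same_thread setting, the shared connection raises ProgrammingError inside the worker, and if that error is swallowed the caller sees an empty result.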
Cascade and proactive umbrella subprocesses fire at t = 0; whoever has an answer streams first
0.5.4 still had a sequential chain — filename + rg → cascade
(100 ms ~ 60 s) → cross-folder (5–30 s) → proactive enhancers
(≤ 1 s) → render. On a vocabulary-mismatch query (e.g. asking
about a PDF in ~/Downloads from inside a code repo)
the user reported a 12-minute-50-second wall clock, with the
right answer hidden behind 99.7 s of cascade rerank. 0.5.6
refactors to the conceptual model in
docs/proactive-umbrella-framework.md: cascade and
every proactive subprocess (filename_extend, lazy_cross_folder,
lazy_cwd) fire in parallel via ThreadPoolExecutor, and each streams
its results with a route + quality label as soon as they are ready.
New 30 s hard timeout on cascade (worker-thread + dedicated
SQLite connection) plus the existing 8 s cap on cross-folder.
Same query: 12:50 → 26 s; first answer at ~1–2 s
from the proactive umbrella block. Pytest 217/217.
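The "fire everything at t = 0, stream whoever finishes first" model can be sketched with ThreadPoolExecutor and as_completed. This is a simplified illustration: it uses one overall deadline rather than the separate per-lane caps mentioned above, and the function and lane names are hypothetical.

```python
import concurrent.futures
import time
from concurrent.futures import ThreadPoolExecutor, as_completed


def stream_lanes(lanes, overall_timeout):
    """Yield (lane_name, result) pairs in completion order, not submission order."""
    pool = ThreadPoolExecutor(max_workers=len(lanes))
    futures = {pool.submit(fn): name for name, fn in lanes.items()}
    try:
        for future in as_completed(futures, timeout=overall_timeout):
            yield futures[future], future.result()
    except (concurrent.futures.TimeoutError, TimeoutError):
        # Lanes that miss the deadline are dropped; faster lanes were already yielded.
        pass
    finally:
        pool.shutdown(wait=False)


def slow_cascade():
    time.sleep(0.3)
    return "deep semantic hits"


def fast_filename_lane():
    return "filename hits"


lanes = {"cascade": slow_cascade, "filename_extend": fast_filename_lane}
ordered = [name for name, _ in stream_lanes(lanes, overall_timeout=5.0)]
```

Even though the cascade lane is submitted first, the filename lane is yielded first, which is the property that lets a quick answer reach the user before a slow rerank completes.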
No more silent 30-second prompt — preliminary results stream as soon as rg returns
Two threads. (1) Doc surfaces: hero eyebrow on the GitHub Pages
homepage bumped from a stale v0.2.13 to
v0.5.4, bench-stats summary now carries a
+4 / 10 lazy auto-trigger over rg cold-start tile
alongside the 30 / 30 fully-indexed peak, og:description
+ twitter:description updated for accurate social
cards, cli.html documents --lazy / --no-lazy
+ SKYGREP_PROACTIVE_DIRS + the env-var table,
benchmarks.html grew a Cold-start lazy
auto-trigger section with the real-CLI table, README headline
tagline carries the +30 % lazy line. (2)
Streaming cold-start UX: skygrep "<query>" on a
never-indexed dir no longer sits silent for 5–30 s. It prints an
immediate 🔍 scanning… line, the preliminary rg /
filename hits as soon as rg returns (≤ 1 s), the
🌊 / 💧 / ⚡ lazy progress to stderr, and the
lazy-refined matches under a ▾ refined matches…
header — already-printed paths are de-duplicated. Pytest 217/217.
Real, measurable hit-rate improvement over rg cold-start on the Django oracle
0.5.0–0.5.2 shipped lazy auto-trigger but the real CLI bench showed it at 1/10 — barely above pure ripgrep. 0.5.3 fixes the root cause: token-shortcut DEDUP for numeric-prefix migration families, numeric-prefix scoring penalty, deterministic dir-token picker (LLM-router-independent), test-path penalty, weighted per-dir budget (top dir gets 8 files, then 4/2/2), and a critical Ollama keep_alive bug fix that had been silently zero-ing every LLM-router call (HTTP 400, "missing unit in duration"). Plus regex import diffusion, ThreadPool I/O parallelism, progressive stderr progress lines, cold-+-wrong-folder lazy_cwd ∥ lazy_cross_folder branch, and warm-cascade low-confidence cross-folder augmentation. Hard CLI bench: rg-only 0/10 → auto-trigger 4/10 (+30%) at +16 s/query on fresh Django. Pytest 217/217.
Closing the doc-surface gap from 0.5.0/0.5.1 + zero personal-filesystem assumptions
0.5.0 and 0.5.1 shipped with .md release notes only —
the changelog "Full notes →" links pointed at .html
pages that didn't exist. Closed in 0.5.2: a reusable
scripts/render_release_notes.py wraps each
.md in the same themed sidebar / topbar layout as
0.4.2.html so future releases don't repeat the lapse. Both
skylakegrep-0.5.0.html and
skylakegrep-0.5.1.html now exist alongside this page.
Second change: SKYGREP_PROACTIVE_DIRS (colon-separated
absolute paths) replaces hardcoded
~/Downloads / ~/Desktop / ~/Documents in
proactive._default_search_dirs AND in
lazy_indexer.lazy_explore_cross_folder's default
candidate_roots. Both lists were the maintainer's
personal layout; the codebase now contains zero personal-filesystem
assumptions. Pytest 201 / 201.
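A colon-separated list of absolute paths can be parsed in a few lines. The sketch below is an assumption about how such a variable might be read, not the project's parser; in particular, returning an empty list when the variable is unset, and silently dropping relative entries, are illustrative choices.

```python
import os
import posixpath


def proactive_dirs(environ=None):
    """Parse SKYGREP_PROACTIVE_DIRS into a list of absolute paths (hypothetical helper)."""
    env = os.environ if environ is None else environ
    raw = env.get("SKYGREP_PROACTIVE_DIRS", "")
    # Skip empty segments (for example from "a::b") and anything that is not absolute.
    return [p for p in raw.split(":") if p and posixpath.isabs(p)]
```

Passing a mapping instead of reading os.environ directly keeps the function easy to test, for example `proactive_dirs({"SKYGREP_PROACTIVE_DIRS": "/data/docs:/srv/notes"})`.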
The user shouldn't have to know which tier they need
0.5.0 shipped --lazy as opt-in. Wrong: the whole
premise of lazy is that the user doesn't know whether they're in
the right folder or whether their query aligns with the code's
vocabulary, so they can't be expected to add the flag. 0.5.1
flips it to auto-trigger. skygrep "<query>" on
a never-indexed project now: runs rg immediately;
if rg paths cover ≥ 2 distinct query tokens (PascalCase /
snake_case split), returns the instant keyword answer; otherwise
also fires the LLM-routed lazy semantic tier (~5 s) and merges.
--no-lazy is the new opt-out for benchmarking.
Verified end-to-end on a fresh Django checkout via real CLI (not
python API) — 6 scenarios, 5 of which exercise the auto-trigger
from both directions; e.g. "cache invalidation strategy"
returns django/middleware/cache.py,
cached_db.py, etc. in 8 s. Pytest 201 / 201.
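The coverage gate described above can be approximated as follows. This sketch assumes the PascalCase / snake_case split is applied to both the query and the rg paths and that coverage is a simple set intersection; the function names and the regular expression are illustrative, not the shipped implementation.

```python
import re


def split_tokens(text):
    """Lower-cased tokens after splitting on separators and on case boundaries."""
    tokens = set()
    # Non-alphanumerics handle snake_case, path separators, and file extensions.
    for word in re.split(r"[^A-Za-z0-9]+", text):
        # Acronym runs, Capitalised or lower-case words, and digit runs.
        for part in re.findall(r"[A-Z]+(?![a-z])|[A-Z]?[a-z]+|[0-9]+", word):
            tokens.add(part.lower())
    return tokens


def keyword_answer_is_enough(query, rg_paths, min_tokens=2):
    """True when the rg paths cover at least min_tokens distinct query tokens."""
    path_tokens = set()
    for path in rg_paths:
        path_tokens |= split_tokens(path)
    return len(split_tokens(query) & path_tokens) >= min_tokens
```

Under these assumptions, the query "CacheMiddleware" against django/middleware/cache.py covers two tokens and returns the instant keyword answer, while "cache invalidation strategy" covers only one and would fall through to the lazy semantic tier.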
Real-semantic answer in ~5 s on a project that has never been indexed
0.4.x removed graph-walk after a production crash and didn't
actually improve recall. 0.5.0 returns to the original design
intent: the user shouldn't have to run
skygrep index . and wait 5–10 minutes for the first
semantic question. New lazy_indexer module: an LLM
router (qwen2.5:3b) picks 5–15 likely entry-point paths from the
directory tree alone, the embedder batch-embeds them in one
bge-m3 call (~5× speedup over per-file), and a σ-validated cosine
top-K is returned with confidence telemetry. Django bench
(10 queries, fresh DB): 4 / 10 hits, p50 5.0 s, max 9.1 s.
Architecture: lazy is purely additive — fills the previously-empty
middle of the latency / recall curve between rg cold-start
(~100 ms keyword) and full eager index (5–10 min upfront,
30 / 30 recall). 0.4.x graph_expand stays removed.
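The final ranking step of the lazy tier, a cosine top-K over freshly embedded files, can be sketched in plain Python. The vectors below are toy two-dimensional values standing in for bge-m3 embeddings, the function names are hypothetical, and the σ-validation step mentioned above is not modelled.

```python
import math


def cosine(a, b):
    """Cosine similarity of two equal-length vectors; 0.0 if either has zero norm."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0


def top_k(query_vec, file_vecs, k):
    """Return the k best (path, score) pairs, highest cosine first."""
    scored = [(path, cosine(query_vec, vec)) for path, vec in file_vecs.items()]
    scored.sort(key=lambda item: (-item[1], item[0]))
    return scored[:k]


hits = top_k([1.0, 0.0], {"a.py": [1.0, 0.0], "b.py": [0.0, 1.0], "c.py": [1.0, 1.0]}, k=2)
```

Embedding all candidate files in one batch, as the notes describe, changes how the vectors are produced but not this ranking step.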
Critical crash on every escalated query that graph_expand contributed to
0.4.0 / 0.4.1 shipped a production crash:
KeyError: 'snippet' at cli.py:92 merge_results
— fired on every CLI search query that escalated AND
graph_expand returned candidates. Hidden by tests that called
cascade_search directly, never through the CLI.
Hot-fix: emit the missing snippet (and full result-dict
shape) from _expand_via_reference_graph().
Verified by real CLI search query on the local index — escalation
ran end-to-end (7.3s, no crash). Public OSS bench backfill is
deferred to 0.4.3 (the bench wrapper currently stalls on Django,
root cause under investigation). New auto-memory rule: every
release MUST exercise skygrep search on a real index
before tagging.
Honest end-to-end measurement of 0.4.0 graph_expand on real bge-m3
Backfills the real-CLI bench that 0.4.0 should have included before shipping. Indexed 27 Python files of skylakegrep itself with real bge-m3 embeddings, populated 108 graph nodes + 190 reference edges, ran 5 representative semantic queries with tau=0 forced escalation. graph_expand fired correctly on 3/3 escalated queries (4–9 candidates contributed each), correctly skipped on 2/2 cheap-path queries, latency invariant held (cheap path ~7ms, escalation 1.7–2.6s). Top-5 hit rate 3/5 on this bench — substrate works, but doesn't magically improve recall on queries where the answer is 2+ hops from cosine top-K. Honest framing now propagated to all GH surfaces. New auto-memory rule: every release MUST include real corpus end-to-end run before public-surface update. Full bench: benchmarks/release-0.4.0-real-corpus.md.
1-hop reference-graph expansion in cascade escalation: by-construction additive, by-construction latency-neutral
Closes the v2 design that 0.3.0 attempted with 9+ preset
hyperparameters (rolled back in 0.3.1). 0.4.0 redoes it
holistically per the holistic graph-aware retrieval plan and the
"intelligence-is-conditional" principle just locked into
auto-memory. Zero new hyperparameters: every
weight is cosine, every threshold is the existing
CASCADE_TAU_FLOOR, every edge is a reference-graph
ref. ~50 LoC added; ~800 LoC of phased-design scaffolding deleted
(graph_walk.py · query_seeds.py ·
graph_substrate.py · per-component tests). End-to-end
integration test covers the whole; no isolated-component tests.
Cheap path identical to 0.2.21; escalation adds ≤ 2 ms latency
to union 1-hop neighbours into the rerank pool.
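The additive property of 1-hop expansion is easy to see in a sketch: the original candidates keep their positions, and neighbours are only appended. The adjacency dict and function name below are assumptions, and the cosine weighting and CASCADE_TAU_FLOOR threshold mentioned above are left out.

```python
def expand_one_hop(top_candidates, reference_edges):
    """Union each candidate's direct neighbours into the pool without reordering it."""
    pool = list(top_candidates)
    seen = set(top_candidates)
    for node in top_candidates:
        for neighbour in reference_edges.get(node, ()):
            if neighbour not in seen:
                # New nodes are appended, so existing candidates are never displaced.
                seen.add(neighbour)
                pool.append(neighbour)
    return pool


expanded = expand_one_hop(["a.py", "b.py"], {"a.py": ["c.py", "b.py"], "b.py": ["d.py"]})
```

Because the function only appends, a reranker that runs afterwards sees a superset of the original pool, which is what "purely additive" means here.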
Graph-walk integration rolled back; preset hyperparameters stripped
0.3.0 shipped on by-construction arguments without an end-to-end
bench. First real-corpus run (5 queries on
skylakegrep/src/) showed only 2/5
hits — the seed mapper correctly ranked the answer first
(graph_walk.py at 55% seed mass), but PPR walk diluted it through
9+ preset edge weights into structurally-adjacent siblings.
0.3.1 reverts the cascade integration (production behaviour ≡
0.2.21), strips all score_per_hit constants,
derives path_prox weight from path depth. Substrate
modules + tests stay; integration deferred until weights are
derived from corpus stats. Full benchmark and rollback rationale
in benchmarks/release-0.3.0-graph-walk.md.
Knowledge-graph substrate · cold-start seeds · bounded PPR walk
First new code-path release since 0.2.0. Implements the v2 retrieval
substrate from the graph-walk plan:
heterogeneous knowledge graph (file/folder/chunk/symbol/token nodes,
8 edge types, 4 cheap edges built at index time), bounded forward-push
Personalized PageRank (Andersen-Chung-Lang 2006, σ-stop, max 200
nodes), and cold-start query → seeds via 4 matchers
(filename/symbol/semantic/path-token) — first query produces rich
seeds with zero history. Gated behind SKYGREP_GRAPH_WALK=1;
default unchanged. Tests 221/221 (was 201). Latency invariant
(cheap path unchanged) + accuracy invariant (purely additive to
candidate pool) by construction.
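For readers unfamiliar with forward push, the sketch below shows a simplified, non-lazy variant of the technique with a node cap. It is not the project's graph_walk code: the σ-stop rule is not modelled, the graph is a plain adjacency dict, and the handling of dangling nodes and of the cap (mass is absorbed locally) is an illustrative choice.

```python
from collections import deque


def forward_push_ppr(graph, seeds, alpha=0.15, eps=1e-4, max_nodes=200):
    """Approximate Personalized PageRank by pushing residual mass outward from seeds."""
    estimate = {}
    residual = dict(seeds)
    queue = deque(residual)
    queued = set(residual)
    while queue:
        node = queue.popleft()
        queued.discard(node)
        mass = residual.get(node, 0.0)
        neighbours = graph.get(node, [])
        if mass == 0.0:
            continue
        if not neighbours:
            # Dangling node: nothing to push to, so it absorbs its residual.
            estimate[node] = estimate.get(node, 0.0) + mass
            residual[node] = 0.0
            continue
        if mass < eps * len(neighbours):
            # Below the push threshold: leave the small residual in place.
            continue
        residual[node] = 0.0
        estimate[node] = estimate.get(node, 0.0) + alpha * mass
        share = (1.0 - alpha) * mass / len(neighbours)
        for nxt in neighbours:
            if nxt not in residual and len(residual) >= max_nodes:
                # Node cap reached: keep the mass here instead of growing the frontier.
                estimate[node] += share
                continue
            residual[nxt] = residual.get(nxt, 0.0) + share
            if nxt not in queued:
                queue.append(nxt)
                queued.add(nxt)
    return estimate, residual


triangle = {"a": ["b", "c"], "b": ["a"], "c": ["a"]}
est, res = forward_push_ppr(triangle, {"a": 1.0})
```

Every push moves mass between the two dicts without creating or destroying any, so the estimates plus the leftover residuals always sum to the seed mass, and the seed node ends up with the largest score.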
Comparison row-3 label trimmed; performance aggregate banner now 3 clean columns
Reviewer caught two more overlaps. (1) The row-3 capability
label Content — code · md · PDF · docx · images
(~330 px) ran into the first data column's
code · md · PDF · docx payload — trimmed to
Content — multimodal (the cells already enumerate
per tool). (2) The aggregate banner had three big numbers +
three small labels jammed onto a single line that overflowed
the 1032 px banner width — rebuilt as a 3-column "headline
above caption" layout (banner 100 → 120 px tall) so the eye
compares 30/30 · 60×–770× ·
−82% in parallel.
Third result row no longer collides with terminal frame; pill row stable in raster too
Reviewer caught the row-3 issue in one screenshot:
utils/jwt.py overlapped its own :22-39
line range, and the third result card overflowed the terminal
bottom border by 6 px. Path range moved x=100→120 (+14 px gap);
card height 358→380 px (16 px bottom margin). Plus three issues
caught in the sweep — XML-invalid <query> /
& in cli-cheatsheet + configuration descs, and
the cairosvg-only pill-row tspan glitch on hero + og-image
(which broke social-preview unfurls). Both pill rows
flattened to plain text.
Every clickable link now stays in the themed shell
25 markdown documents (20 release notes · principles · parity-benchmarks · token-benchmarking · roadmap · releasing) rendered as themed HTML pages with the same shared chrome. 83 link rewrites across 33 files (changelog "Full notes →", sidebar "Principles" link in every subpage, parity-benchmarks references, README internal links). Clicking any link no longer drops the visitor onto a raw GitHub markdown blob — the slate-blue frosted-glass shell stays consistent end-to-end.
Homepage now uses full horizontal width; Cody column no longer clipped
The 0.2.17 homepage was capped at --content-max: 800px
with the legacy right-TOC still in the grid, so the
5-column comparison matrix overflowed and clipped the Sourcegraph
Cody column on wide screens. Right TOC removed (homepage already
has sidebar navigation), .docs-shell overridden to
a 2-column grid up to 1500px, themed cards now span 100% of the
content area. No more empty right margin.
Slate-blue glass · CLI cheatsheet + Configuration visualised · "custom type" rename
Dark theme lifted: --bg #0a0d12 →
#13192a, panels 1.6 – 2.5× more opaque, themed-card
backdrop-filter blur bumped to 36px so frosted glass
reads through to slate-blue depth. Two new themed SVG cards —
cli-cheatsheet.svg (bare form featured + 8 secondary
tiles) and configuration.svg (three grouped env-var
panels). The "Custom type" tile dropped from
content-types.svg — 5 built-in types + extensibility
footer banner; content-agnostic shouldn't sound like the user
has to customise.
mgrep corrected · softer frosted glass · workflow + content-types upgraded
0.2.15 mis-labeled mgrep as a "predecessor"; it is the Mixedbread AI
cloud-backed paid CLI, the closest commercial competitor.
Comparison row corrected (cloud-backed · sub + usage · npm + acct).
Palette softened: #67e8f9 → #a5f3fc,
#22d3ee → #38bdf8, multi-stop frosted
borders + backdrop-filter: blur. Two new themed SVGs:
workflow-diagram.svg (3-stage pipeline) and
content-types.svg (6-tile grid) replace the README's
ASCII workflow + plain content-type table.
Named-tool comparison · themed SVG cards
Comparison surface sized against four named alternatives
(ripgrep, mgrep, autodev-codebase, and Sourcegraph Cody) instead of generic "Cloud RAG". New
themed SVG cards (comparison-matrix.svg +
performance-matrix.svg) anchor the README with
the same frosted-glass aesthetic as the hero. (Note: 0.2.15
mis-labeled mgrep as a predecessor; corrected in 0.2.16.)
Homepage redesign + 6-page split
Animated terminal hero, three audience scenarios, comparison panel, three-step how-it-works diagram, bench-headline section, honesty list. 1405 → 849 lines on the home page; reference content moved to 6 new shared-chrome subpages (concepts · architecture · cli · reference · benchmarks · changelog).
User-personal references swept from public docs
Sanitisation pass across release notes, plan documents, README, and GitHub Pages. No code change. Every public artefact now free of user-personal example tokens.
Morphology fallback when LLM is unreachable
filename_extend's should-fire gate now extracts
a candidate token via content-shape morphology when the LLM did not
supply primary_token. Also filed the conversational
session-state plan.
Second built-in enhancer: recovery_progress_hint
Content queries during a partial-index re-embed now get live
progress + ETA + retry guidance instead of "no matches yet".
Plus ProactiveContext infrastructure for future
enhancers.
Proactive enhancement framework + 4 iterations
Content-agnostic enhancer registry that runs in parallel after
the cascade with a bounded latency budget. Built-in
filename_extend searches sibling home dirs when
the in-project search returned 0 hits. Four bug-fix iterations
(0.2.8 - 0.2.10) recorded as Principle 1 receipts.
LLM-driven scope classification
RouterDecision.out_of_scope set by the same LLM
router call. Replaces the keyword _METADATA_TOKENS
list. Principle 1 ✓ shipped — substrate
understanding over enumeration.
bge-m3 + content-agnostic graph + 30 / 30
Default embedder switched from mxbai-embed-large
to bge-m3 (1024-d, multilingual, symmetric XLM-RoBERTa).
Content-agnostic reference_graph.register_extractor()
registry. σ-adaptive cascade. 30 / 30 public-OSS recall
(was 28 / 30).