skylakegrep JSON schema

Reference · output schema

JSON shape.

--json writes a JSON array to stdout. Each element has the following fields:

[
 {
 "path": "skylakegrep/src/storage.py",
 "start_line": 197,
 "end_line": 271,
 "language": "python",
 "score": 0.824,
 "snippet": "def search(conn, query_embedding, top_k=10, …): …"
 }
]
path
Absolute file path on the indexing host.
start_line, end_line
Inclusive 1-based line range of the chunk within path.
language
Tree-sitter language key, or the extension-derived key if the file used the fallback splitter.
score
Combined score (cosine plus lexical adjustment unless --semantic-only).
snippet
Verbatim source text of the chunk.

Optional evidence fields

Semantic-depth filename anchors can include extra evidence fields when the matched file has extractable text. These fields are additive and may be absent; scripts should treat them as optional:

fallback, filename_token
Retrieval provenance for filename-anchor results.
candidate_recall, candidate_recall_lanes
Optional 0.5.13 provenance for results surfaced by the adaptive candidate recall substrate. Lanes can include include-scope, path-token, symbol, chunk-text, or bounded ripgrep recall. Treat these as explanatory metadata; the required path / line / score / snippet fields are unchanged.
query_excerpts
List of query-focused passages extracted from the anchored text/PDF/DOCX file.
content_excerpt
Joined query-focused passages, suitable for agent context.
content_preview, content_preview_truncated
Bounded full-text preview and whether it was clipped.
extracted_text_source, extraction_note
Extractor backend metadata, for example text passthrough, PDF text layer, DOCX parser, OCR note, or friendly extraction failure hint.