turnstone

pyr0ball 3155bde4ce feat: hybrid BM25 + vector re-ranking for diagnose search (#15)

Adds late-fusion hybrid search to Turnstone's log retrieval layer:

    hybrid_score = 0.6 * bm25_normalized + 0.4 * cosine_similarity

Implementation:
- _bm25_search() extracts the existing FTS5 BM25 path as a named helper
- _hybrid_search() fetches an oversized BM25 candidate pool (5x limit, min 100), embeds the query and each candidate text in-process via the existing embeddings service, normalizes BM25 rank to [0,1], combines with cosine similarity, and re-ranks
- search() gets semantic=False param that dispatches to _hybrid_search() when True; pure BM25 remains the default for all existing call sites
- diagnose_stream() enables semantic=True so symptom-based queries ("database connection failed") surface semantically equivalent entries ("ECONNREFUSED", "backend gone away", "max retries exceeded")
- /api/search REST endpoint exposes ?semantic=true query param

Graceful degradation: falls back silently to pure BM25 when the embedding backend is unavailable (EMBEDDING_AVAILABLE=False) or when embed_batch raises an exception.

No new infra: in-process numpy cosine, no vector DB.

11 new tests: BM25 helper, hybrid re-ranking, fallback paths, dispatcher.
372 + 11 = 383 tests passing.

Closes: #15

2026-06-01 18:13:09 -07:00
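
The scoring and fallback behaviour described in the commit message above can be sketched roughly as follows. This is an illustrative sketch, not the code in search.py: the 0.6/0.4 weights and the "5x limit, min 100" pool size come from the commit message, but the function names (pool_size, normalize_bm25, hybrid_rerank), the candidate shape of (text, bm25_rank) pairs, the min-max normalization, and the embed_batch callable signature are all assumptions made for the example. It also uses plain Python in place of numpy to stay dependency-free, and it assumes the SQLite FTS5 convention that a lower BM25 rank means a better match.

```python
import math
from typing import Callable, List, Sequence, Tuple

# Weights as stated in the commit message.
BM25_WEIGHT = 0.6
COSINE_WEIGHT = 0.4

Vector = Sequence[float]
Candidate = Tuple[str, float]  # (text, bm25_rank); hypothetical shape


def pool_size(limit: int) -> int:
    """Oversized candidate pool: 5x the requested limit, at least 100."""
    return max(5 * limit, 100)


def cosine(a: Vector, b: Vector) -> float:
    """Cosine similarity; returns 0.0 if either vector has zero length."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0.0 or norm_b == 0.0:
        return 0.0
    return dot / (norm_a * norm_b)


def normalize_bm25(ranks: Sequence[float]) -> List[float]:
    """Min-max normalize to [0, 1] so the best (lowest) rank maps to 1.0.

    Assumption: min-max scaling; the commit only says "normalizes to [0,1]".
    """
    lo, hi = min(ranks), max(ranks)
    if hi == lo:
        return [1.0] * len(ranks)
    return [(hi - r) / (hi - lo) for r in ranks]


def hybrid_rerank(
    query: str,
    candidates: List[Candidate],  # already in BM25 order, best first
    embed_batch: Callable[[List[str]], List[Vector]],  # hypothetical signature
    limit: int,
    embedding_available: bool = True,
) -> List[str]:
    """Re-rank BM25 candidates by 0.6 * bm25_normalized + 0.4 * cosine."""
    if not candidates:
        return []
    bm25_only = [text for text, _ in candidates[:limit]]
    if not embedding_available:
        return bm25_only
    try:
        vectors = embed_batch([query] + [text for text, _ in candidates])
    except Exception:
        # Graceful degradation: keep the pure BM25 ordering.
        return bm25_only
    query_vec, cand_vecs = vectors[0], vectors[1:]
    bm25_norm = normalize_bm25([rank for _, rank in candidates])
    scored = [
        (BM25_WEIGHT * b + COSINE_WEIGHT * cosine(query_vec, v), text)
        for (text, _), b, v in zip(candidates, bm25_norm, cand_vecs)
    ]
    scored.sort(key=lambda pair: pair[0], reverse=True)
    return [text for _, text in scored[:limit]]
```

With these weights, a candidate that ranks poorly on keywords but is semantically close to the query can overtake a candidate with a better BM25 rank and no semantic overlap, which is the effect the commit describes for symptom-based queries.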
..
diagnose	feat: hybrid BM25 + vector re-ranking for diagnose search (#15)	2026-06-01 18:13:09 -07:00
__init__.py	feat: initial Turnstone POC — ingest, FTS search, MCP server	2026-05-08 12:12:34 -07:00
blocklist.py	fix(db): add timeout=30s to all sqlite3.connect() calls across app	2026-05-26 23:12:48 -07:00
discover.py	feat: bundle PII sanitization, onboarding wizard, NL source addition (#51, #52, #53)	2026-05-29 14:14:28 -07:00
embeddings.py	refactor: extract embeddings service layer — decouple context embedder from Ollama	2026-05-25 11:01:25 -07:00
incidents.py	feat: bundle PII sanitization, onboarding wizard, NL source addition (#51, #52, #53)	2026-05-29 14:14:28 -07:00
llm.py	fix(diagnose): add max_tokens to all LLM calls; fix reasoning card contrast	2026-05-27 22:23:36 -07:00
models.py	feat: bundle PII sanitization, onboarding wizard, NL source addition (#51, #52, #53)	2026-05-29 14:14:28 -07:00
nl_source.py	feat: bundle PII sanitization, onboarding wizard, NL source addition (#51, #52, #53)	2026-05-29 14:14:28 -07:00
pihole.py	feat(blocklist): 6 REST endpoints + Pi-hole settings fields	2026-05-15 21:15:09 -07:00
search.py	feat: hybrid BM25 + vector re-ranking for diagnose search (#15)	2026-06-01 18:13:09 -07:00