Overview
A command-line tool (ns) that searches my Obsidian vaults the way I
actually remember things: sometimes by an exact weird phrase, sometimes by the
vague shape of an idea. It runs embeddings locally on my own GPU (or CPU), stores the
index as JSON + numpy right next to the notes, and that index syncs across machines
along with the notes themselves (Resilio, in my case). Nothing leaves the machine.
It exists because keyword search alone never found what I wanted in a pile of notes this big, and I didn't want to hand the whole vault to a cloud service to fix that.
How It Works
Notes are chunked by heading structure, so each chunk carries its heading path as
context. Each chunk is embedded locally with nomic-embed-text-v1.5. At
query time three ranking signals are combined via Reciprocal Rank Fusion:
- Semantic: cosine similarity between the query vector and each chunk. Handles conceptual matches ("growing tomatoes" finds "planting vegetables").
- Lexical (BM25): classic keyword scoring, for when you remember a specific weird word.
- Wikilink graph boost (on by default): the top candidates' outgoing [[wikilinks]] promote their targets, surfacing notes that are topologically adjacent to a great match even when they don't use the query words.
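The fusion of the three signals above can be sketched roughly as follows. This is a minimal illustration of Reciprocal Rank Fusion, not the tool's actual code: the function names, the chunk ids, and the constant k=60 (a conventional default for RRF) are assumptions, and the graph boost is modelled here simply as a third ranked list.

```python
from collections import defaultdict


def rrf_fuse(rankings, k=60):
    """Fuse ranked lists with Reciprocal Rank Fusion.

    rankings: dict mapping a signal name to a list of chunk ids, best first.
    k: smoothing constant (60 is a common default; assumed, not from ns).
    Returns a list of (chunk_id, fused_score, {signal: rank}) tuples, best first.
    """
    scores = defaultdict(float)
    attribution = defaultdict(dict)
    for signal, ranked_ids in rankings.items():
        for rank, chunk_id in enumerate(ranked_ids, start=1):
            # Each signal contributes 1 / (k + rank) for every chunk it ranked.
            scores[chunk_id] += 1.0 / (k + rank)
            attribution[chunk_id][signal] = rank
    # Sort by fused score, breaking ties by id so the order is deterministic.
    ordered = sorted(scores, key=lambda cid: (-scores[cid], cid))
    return [(cid, scores[cid], attribution[cid]) for cid in ordered]


def format_attribution(ranks):
    """Render per-signal ranks in the style '[sem #1, lex #3, link #2]'."""
    return "[" + ", ".join(f"{sig} #{rank}" for sig, rank in ranks.items()) + "]"


# Hypothetical ranked lists for one query (ids are made up for illustration).
rankings = {
    "sem": ["a.md#Intro", "b.md#Setup", "c.md#Notes"],
    "lex": ["c.md#Notes", "d.md#Misc", "a.md#Intro"],
    "link": ["b.md#Setup", "a.md#Intro"],
}
results = rrf_fuse(rankings)
for chunk_id, score, ranks in results:
    print(f"{score:.4f}  {chunk_id}  {format_attribution(ranks)}")
```

Because RRF only looks at ranks, the cosine and BM25 scores never need to be normalised against each other, which is the usual reason for choosing it over a weighted sum.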
Each result shows which signals contributed ([sem #1, lex #3, link #2])
and an obsidian:// deep link to click straight into the note. There's
also a related command to find notes similar to any given one.
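The heading-based chunking described at the start of this section might look something like the sketch below. It is an assumption about the approach, not the tool's real chunker: the function name, the " > " path separator, and the output shape are invented for illustration, and it ignores edge cases such as "#" lines inside fenced code.

```python
import re

# ATX-style markdown headings: one to six '#' characters, then the title.
HEADING_RE = re.compile(r"^(#{1,6})\s+(.*\S)\s*$")


def chunk_by_headings(markdown_text):
    """Split a note into chunks, one per heading section.

    Each chunk keeps the full heading path (e.g. 'Garden > Tomatoes') so the
    embedded text carries its context. Sections with no body are skipped.
    """
    chunks = []
    path = []  # stack of (level, title) for the headings currently in scope
    body = []  # lines collected since the last heading

    def flush():
        text = "\n".join(body).strip()
        if text:
            heading_path = " > ".join(title for _, title in path)
            chunks.append({"heading_path": heading_path, "text": text})
        body.clear()

    for line in markdown_text.splitlines():
        match = HEADING_RE.match(line)
        if match:
            flush()
            level = len(match.group(1))
            # Drop headings at the same or a deeper level before descending.
            while path and path[-1][0] >= level:
                path.pop()
            path.append((level, match.group(2)))
        else:
            body.append(line)
    flush()
    return chunks


note = """# Garden
Plans for spring.
## Tomatoes
Start seeds indoors.
## Beans
Sow directly.
# Tools
Sharpen the hoe.
"""
chunks = chunk_by_headings(note)
for chunk in chunks:
    print(chunk["heading_path"], "|", chunk["text"])
```

One plausible way to use the path is to prepend it to the chunk text before embedding, so that a short section under "Garden > Tomatoes" still matches a query about tomatoes.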
    ns index                              # build / update the index
    ns search neural network ideas        # hybrid: semantic + BM25 + graph boost
    ns search "state machines" -n 20      # more results
    ns related Projects/Ideas/foo.md      # notes similar to this one
Indexing is incremental: only changed files get re-embedded. A first index of
a few thousand notes on a 3090 takes a couple of minutes; later runs are seconds. The
index lives at <notes_root>/.note-search/ and syncs with the notes,
so other machines just run ns search with no indexing of their own.
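One way to decide which files count as "changed" is to compare content hashes against a manifest saved by the previous run. The sketch below assumes that approach; ns may well use modification times or something else, and the function names and manifest shape here are hypothetical.

```python
import hashlib
import tempfile
from pathlib import Path


def file_digest(path):
    """SHA-256 of a file's bytes, used as a cheap 'did this change?' marker."""
    return hashlib.sha256(Path(path).read_bytes()).hexdigest()


def plan_reindex(notes_root, manifest):
    """Compare notes on disk with the previous manifest.

    manifest: dict of relative path -> digest from the last run (assumed shape).
    Returns (changed, removed, new_manifest); only 'changed' needs re-embedding.
    """
    root = Path(notes_root)
    current = {}
    for note in sorted(root.rglob("*.md")):
        rel = note.relative_to(root).as_posix()
        if rel.startswith(".note-search/"):
            continue  # never index the index directory itself
        current[rel] = file_digest(note)
    changed = [rel for rel, digest in current.items() if manifest.get(rel) != digest]
    removed = [rel for rel in manifest if rel not in current]
    return changed, removed, current


# Demo on a throwaway vault: first run sees everything, second run only the edits.
with tempfile.TemporaryDirectory() as tmp:
    vault = Path(tmp)
    (vault / "a.md").write_text("alpha", encoding="utf-8")
    (vault / "b.md").write_text("beta", encoding="utf-8")
    changed_first, removed_first, manifest = plan_reindex(vault, {})

    (vault / "b.md").write_text("beta, edited", encoding="utf-8")
    (vault / "a.md").unlink()
    changed_second, removed_second, _ = plan_reindex(vault, manifest)

print(changed_first, changed_second, removed_second)
```

Hashing content rather than trusting timestamps has one appeal for a synced vault: a file that a sync tool rewrites with identical bytes would not trigger a re-embed.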
Current Status
Active and in daily use. Installed as a package on each machine and dropped into my
PowerShell profile so ns works from any window. Repo is private/local
for now, so there is no public link yet.
- Hybrid search (semantic + BM25 + graph boost) with per-signal attribution.
- Incremental indexing; index syncs across machines along with the notes.
- Known wrinkle: concurrent indexing on two machines is last-write-wins, so avoid running ns index on both at once.