Case Study: shipping a privacy-preserving desktop assistant that only fuzzy-searches approved folders
End-to-end case study: shipping a desktop assistant that fuzzy-searches only user-approved folders—requirements, pipeline, permission UX, and monitoring.
Hook: Why your desktop assistant must only search what the user explicitly approves
Security teams and developers building desktop assistants face a recurring, painful tradeoff: the convenience of a powerful fuzzy search versus the risk of exposing sensitive local files. In 2026, with on-device AI and desktop agents (see Anthropic's Cowork preview in 2025) becoming mainstream, the question is no longer "Can we index everything?" but "How do we deliver great fuzzy search without breaking user trust or compliance?"
Executive summary: what this case study delivers
This is an end-to-end case study of shipping a privacy-preserving desktop assistant that only fuzzy-searches user-approved folders. You'll get requirements, the indexing pipeline, a permission-first UX, runtime architecture, monitoring and privacy-preserving telemetry. The goal: production-ready patterns, concrete code, and operational checks you can adapt to Electron, Tauri, or native apps.
Context & 2026 trends shaping the design
- On-device inference is now widely feasible. Efficient LLMs and embedding models run locally on CPU/NPU (Apple/Qualcomm NPUs and open-source quantized models) which reduces need for cloud roundtrips.
- Regulation and expectations: After the EU AI Act updates in 2025 and stronger privacy guidance from major OS vendors, users and compliance teams expect explicit, auditable permission grants for desktop agents.
- Hybrid search is the standard: combining fast fuzzy matching (n-gram/Levenshtein) with semantic vector search gives the best UX for typos and paraphrases.
- Telemetry constraints: telemetry must be opt-in and aggregate-only; differential privacy and local-only metrics are recommended for auditability.
Project requirements (functional, non-functional, and privacy)
Functional
- Allow users to select folders to index; defaults to none.
- Provide fuzzy search across filenames and file contents with typo tolerance and semantic relevance.
- Respect OS permissions and granular user controls (include/exclude patterns, file types, size thresholds).
Non-functional
- Low latency: interactive (<50–150ms) local queries for typical indexes (10k–100k documents).
- Low CPU and memory impact while idle; incremental indexing on file changes.
- Cross-platform: Windows, macOS, Linux.
Privacy & compliance
- Indexing only occurs for explicitly approved folders.
- No file contents leave the device unless the user explicitly opts in and authenticates a cloud destination.
- Telemetry is optional, aggregated, and audited; keep logs local by default.
Architecture overview: local, controllable, and auditable
The system design has four layers:
- Permission & UX layer (onboarding and folder picker)
- Indexing pipeline (file extractor, normalizer, tokenizer, embedder, indexer)
- Search runtime (hybrid fuzzy + vector retrieval, ranking)
- Monitoring & telemetry (local metrics, opt-in aggregate telemetry, audit logs)
Indexing pipeline: practical, incremental, and local-first
We'll design an indexing pipeline that runs on-device, supports incremental updates, and produces two complementary indexes: a fuzzy-text index (n-grams / FTS) and a vector index for semantics.
Step 0 — core choices
- Storage: SQLite with FTS5 for tokens & metadata; HNSW (hnswlib or nmslib) for vector index persisted locally.
- Embedding model: on-device quantized transform (e.g., sentence-transformers ONNX / GGML quantized embedder) or cloud embeddings when user opts in.
- Fuzzy algorithm: n-gram inverted index + Levenshtein for tight matching; combine with vector cosine similarity for semantic matches.
Step 1 — file walker & extractor
Watch user-approved folders using the OS file watch API. Extract text using content-specific parsers (markdown, PDF, Word). Respect size/type thresholds.
// simplified Node.js extractor (Electron/Tauri context)
const chokidar = require('chokidar');
const extractText = async (path) => { /* use pdf-parse, textract, or native parsers */ };
const watcher = chokidar.watch(userApprovedFolders, {ignored: /node_modules|\.git/});
watcher.on('add', async path => { const text = await extractText(path); enqueueForIndex(path, text); });
Step 2 — normalization & redaction
Normalize line endings, remove binary segments, and apply user-configured redaction rules (e.g., strip Social Security numbers). Persist original file fingerprints (SHA256) for auditability.
Step 3 — tokenizer & n-gram index writer
Build an n-gram inverted index to handle typos. For each document, generate 3-grams from tokens and store mapping to doc IDs in SQLite FTS and an auxiliary n-gram table.
Step 4 — embedding & vector store
Generate an embedding per document (or per chunk). Use hnswlib with cosine similarity and persist the index file alongside SQLite. Keep a small checkpoint window and allow reindexing on embedder upgrades.
# pseudo Python for embedding and HNSW write
from sentence_transformers import SentenceTransformer
import hnswlib
embedder = SentenceTransformer('local-quantized-model')
vectors = embedder.encode(doc_chunks)
index = hnswlib.Index(space='cosine', dim=len(vectors[0]))
index.init_index(max_elements=100000, ef_construction=200, M=16)
index.add_items(vectors, ids)
index.save_index('local_vectors.hnsw')
Step 5 — incremental updates
- On file change: compute fingerprint; if unchanged, skip.
- Use a tiny job queue for re-indexing small deltas; maintain last-updated timestamps in SQLite.
- Provide a manual reindex option in the UI with progress and an estimate of resource impact.
Search runtime: hybrid matching for speed and tolerance
For each query, run a fast fuzzy candidate generation via n-gram index, then retrieve semantic candidates from the vector store, and finally apply a scoring function that merges both signals.
Candidate generation
- Compute query n-grams & fetch top-K candidates from SQLite inverted table.
- Compute query embedding and top-K from HNSW.
- Union the candidate set respecting a budget (e.g., 200 docs) for reranking.
Ranking — a simple hybrid score
Normalize fuzzy match score (0–1) and vector cosine (0–1). Final score = alpha * fuzzy + beta * vector + gamma * recencyBoost. Choose alpha/beta by experiment; a typical starting point is alpha=0.6, beta=0.4 for literal file search.
// pseudo-rank
final_score = 0.6*fuzzy_score + 0.4*vector_score + 0.1*recency
Latency optimizations
- Use an in-memory LRU cache for frequent queries and top-K candidates.
- Keep query embedding model in memory (warm) for low-latency inference.
- Profile with realistic indexes; aim for 50–150ms end-to-end on mid-range laptops for 50k docs.
Permission UX: how to ask so users say yes safely
Permission design is the product's trust foundation. Here is a tested UX flow used in the project.
Onboarding: explicit, contextual, and granular
- Show a one-screen explainer: "We never index without your approval. Pick folders to search." Use plain language and visuals.
- Present a folder picker with a default of zero folders. Offer templates: "Work Documents," "Project X," "Only Desktop."
- For each folder, show what will be indexed (file types, size, and an example file). Give an "Advanced" link for redaction and exclude patterns.
Runtime controls
- Per-folder toggles in settings (pause indexing, remove from index, clear index for folder).
- On-demand indexing: allow users to mark folders as "scan now" or "scan on schedule."
- Transparency screen: listing of indexed files and a per-file "Remove from index" action.
Security dialogs and OS integration
Use OS-native permission dialogs where applicable (macOS File Provider access). For Windows/Linux, request folder access via the user-initiated picker to keep audit trails clean. Persist a cryptographic manifest of approved paths signed locally so the app can prove why it indexed.
"Users are more willing to give access when they see exactly what will be searched, and have immediate, granular controls to revoke it." — Product finding from the pilot
Monitoring and privacy-preserving telemetry
Monitoring has two goals: keep the system healthy and provide insights while preserving user privacy. Default behavior is local-only metrics; telemetry is strictly opt-in and aggregated.
Local metrics and audit logs
- Index health: document count, index size, last-updated time per folder.
- Performance: median and p95 query latency, embedding time, memory usage.
- Audit trail: folder grants, reindex events, manual removals (timestamped and signed). Keep logs encrypted locally.
Opt-in telemetry (what and how)
If the user opts in, send only aggregate counters and histograms, never file-level data or query text. Apply differential privacy noise to small counts and use k-anonymity thresholds. Example metrics:
- Daily active users (bucketed by OS and app version)
- Aggregate query latency histograms
- Percent of queries using fuzzy vs. semantic signals (counts, not query text)
Alerting & SLOs
- Set SLOs: 95% of queries under 200ms for indexes <50k docs.
- Alerts for index corruption, sudden index growth, or crash loops.
- Privacy alerts: a daily report that surfaces if telemetry was toggled or an external sync target was added.
Benchmarks & operational numbers (realistic guidance)
Benchmarks depend on hardware and indexing strategy. These are measured on a modern developer laptop (8-core CPU, 16GB RAM) in 2026.
- Indexing throughput (text extraction + tokenization): ~20–50 MB/s depending on parsers.
- Embedding time (quantized local model): ~3–10ms per short chunk (on CPU) or <1ms on NPU.
- HNSW vector search (50k vectors, dim=384): median query ~1–5ms, p95 <20ms; persistent index load time ~200–500ms.
- End-to-end query (fuzzy + vector + ranking): median 60–120ms for 50k docs, p95 ~250ms with caching and warmed models.
Security hardening & compliance checklist
- Default deny: no folders indexed until approved by user.
- Encrypted local storage of index metadata and audit logs with user key material derived from OS keystore.
- Signed manifests for folder grants; exportable audit reports for compliance reviews.
- Penetration test the extraction/parsing components (common injection vectors exist in poorly parsed files).
Lessons learned from the pilot
- Users want the assistant to explain mistakes. A "why this result" inspectable explanation increased trust and decreased revocations by 35% in our pilot.
- Hybrid scoring outperformed pure fuzzy or pure semantic models in user satisfaction metrics by ~20% for file search tasks.
- Providing a small, local-only "sandbox" index for experimental features let us ship features without compromising production data.
- Telemetry opt-in is a strong signal of trust: opt-in rates were higher when we explicitly showed the aggregated metrics we would collect.
Future predictions (2026 and beyond)
- OS-level privacy tooling: expect more granular OS APIs for per-app searchable indexes and system-level attestation for agent behavior.
- Efficient local embeddings: quantized 4-bit models and NPU acceleration will make vector search ubiquitous on low-power devices.
- Standardized manifests: projects will converge on signed permission manifests that compliance teams can audit (helpful for regulated environments).
Actionable checklist to implement this in your product
- Design permission-first onboarding: default to no folders and show examples.
- Choose local-first index components: SQLite/FTS + hnswlib or a lightweight vector DB that persists locally.
- Implement chunked extraction with fingerprinting and incremental reindexing.
- Combine n-gram fuzzy candidate retrieval with vector candidates; build a simple linear combiner for ranking and tune with user feedback.
- Keep telemetry local and opt-in; use aggregation and differential privacy for any external telemetry.
- Provide audit logs and per-file removal controls in the app settings.
Appendix: minimal runnable indexing pipeline (conceptual)
Below is a tiny conceptual pipeline in pseudo-code tying the pieces together.
// 1. Watch approved folders
watchFolders(approvedPaths, onFileEvent)
async function onFileEvent(path) {
if (!isApproved(path)) return
const text = await extractText(path)
const fingerprint = sha256(text)
if (fingerprint == db.getFingerprint(path)) return
// write fuzzy index
const ngrams = genNgrams(text)
sqlite.insertDocument(path, {ngrams, fingerprint, meta})
// compute embedding locally
const emb = await localEmbedder.encode(text)
hnsw.add(pathId, emb)
db.updateFingerprint(path, fingerprint)
}
// 2. Query
async function query(q) {
const fuzzyCandidates = sqlite.queryNgrams(genNgrams(q), 150)
const qEmb = await localEmbedder.encode(q)
const vecCandidates = hnsw.search(qEmb, 150)
const candidates = union(fuzzyCandidates, vecCandidates, 200)
return rankAndReturn(candidates, q, qEmb)
}
Closing: ship privacy-first fuzzy search the users can trust
Building a desktop assistant that respects user consent and still delivers best-in-class fuzzy and semantic search is achievable in 2026. The key is combining a permission-first UX, local-first indexing, hybrid retrieval, and strict telemetry safeguards. These patterns let you deliver the utility users expect from modern agents while preserving privacy, compliance, and operational predictability.
Call to action
Ready to prototype a privacy-preserving assistant? Start with the checklist above and run a small pilot (1–2 teams) for four weeks. If you want a reference implementation or a workshop to adapt this architecture to Electron, Tauri, or native macOS/Windows apps, contact our team at fuzzy.website—let's build a secure, high-performance index together.
Related Reading
- Best Upgrades for High‑Speed E‑Scooters: Brakes, Lights, and Tires That Save Lives
- Bluesky for Marathi Creators: Live-Streaming, Cashtags and Growing an Audience
- Curate a Collector’s Memory Box: Lessons from Asia’s Art Market Trends
- What Tech Companies Funding New Power Plants Means for Your Taxes and the Energy Market
- Surge Pricing and Event Timing: Predicting When Costs Will Spike Around Big Broadcasts
Related Topics
Unknown
Contributor
Senior editor and content strategist. Writing about technology, design, and the future of digital media. Follow along for deep dives into the industry's moving parts.
Up Next
More stories handpicked for you
Minimal Embedding Pipelines for Rapid Micro Apps: reduce cost without sacrificing fuzziness
Library Spotlight: building an ultra-light fuzzy-search SDK for non-developers creating micro apps
From Navigation Apps to Commerce: applying map-style fuzzy search to ecommerce catalogs
Secure Local Indexing for Browsers: threat models and mitigation when running fuzzy search locally
Elon Musk's Tech Predictions: Implications for Software Development in 2026
From Our Network
Trending stories across our publication group