Hybrid Search
NEW in v0.8.6: See Hybrid Search below for BM25+Vector RRF.
Use your Long Term Core (LTC) as a high-level index for RAG-like consciousness. Search for relevant memories, then re-hydrate full context from git history.
Why Semantic Search?
Vant generates the LTC by pruning: a “distilled” consciousness that keeps only important learnings and decisions. Sometimes, though, you need the full context.
This is like having a map (LTC) + archive (git history). Query the map, retrieve from the archive!
How It Works
Query → Search LTC → Identify topics → Re-hydrate from git → Full context
Usage
const search = require('vant').search;
// Get search summary
const summary = search.getSummary();
console.log(summary);
// { hasLTC: true, learnings: 10, decisions: 5, ... }
// Search LTC
const results = search.searchLTC('python', { limit: 5 });
console.log(results);
// [{ type: 'learnings', summary: '...', relevance: 3 }]
// Query + Re-hydrate (await must run inside an async function)
const { results: matches, context } = await search.query('authentication');
console.log(context); // Full matching content
Query Options
All query modes:
// Full search + rehydrate (default)
const full = await search.query('python');
// full.results: matches, full.context: re-hydrated content
// Compact mode: summaries only (faster)
const compact = await search.query('python', { compact: true });
// compact.context: "- Learned X\n- Decided Y..."
// Hybrid search with caching
const r1 = await search.hybrid('python'); // First call
const r2 = await search.hybrid('python'); // Cached!
// Cache management
console.log(search.getCacheStats()); // { size: 1, max: 50 }
search.clearCache(); // Clear session cache
// HyDE transform
const hydeQuery = await search.hyde('python');
Session Caching
Hybrid search results are cached per-session (50 max, MD5 keyed):
// First call - actual search
const r1 = await search.hybrid('python');
// Second call - cached (instant)
const r2 = await search.hybrid('python');
// Cache stats
search.getCacheStats(); // { size: 1, max: 50 }
search.clearCache(); // Clear all cached
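The caching behaviour described above (MD5-keyed, capped at 50 entries) could be sketched roughly as follows. This is an illustration only, not Vant's actual implementation: `cacheKey`, `cached`, and the oldest-first eviction policy are assumptions.

```javascript
// Hypothetical sketch: not Vant's actual cache implementation.
const crypto = require('crypto');

const MAX_ENTRIES = 50;  // matches the documented cap
const cache = new Map(); // a Map iterates in insertion order

// Derive a fixed-length key from the query text.
function cacheKey(query) {
  return crypto.createHash('md5').update(query).digest('hex');
}

// Return the cached value for a query, or compute and store it.
// If compute returns a Promise, the Promise itself is cached.
function cached(query, compute) {
  const key = cacheKey(query);
  if (cache.has(key)) return cache.get(key);
  if (cache.size >= MAX_ENTRIES) {
    // Assumed eviction policy: drop the oldest entry.
    cache.delete(cache.keys().next().value);
  }
  const value = compute(query);
  cache.set(key, value);
  return value;
}

function getCacheStats() {
  return { size: cache.size, max: MAX_ENTRIES };
}

function clearCache() {
  cache.clear();
}
```

Hashing the query keeps keys a fixed length regardless of query size; the cap keeps a long session from growing the cache without bound.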
Re-hydrate
Re-hydrate fetches full matching files from git history:
const { context } = await search.rehydrate(results);
// Returns: combined context up to 50KB
Security limits:
- Max 50KB re-hydrated content
- Only reads from the models/vX/ directory
- No arbitrary command execution
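As a rough illustration of the 50KB cap, the sketch below combines file sections until a byte budget is reached. `combineContext` is a hypothetical helper, and stopping at whole-file boundaries (rather than cutting mid-file) is an assumption; the `=== path ===` header follows the RAG response example on this page.

```javascript
// Hypothetical helper: not part of Vant's API.
const MAX_BYTES = 50 * 1024; // 51200 bytes, the documented cap

// Concatenate file sections until the byte budget would be exceeded.
function combineContext(files, maxBytes = MAX_BYTES) {
  let out = '';
  let used = 0;
  for (const { path, content } of files) {
    const section = `=== ${path} ===\n${content}\n`;
    // Measure bytes, not characters: multi-byte UTF-8 text counts in full.
    const size = Buffer.byteLength(section, 'utf8');
    if (used + size > maxBytes) break; // assumed: stop at a file boundary
    out += section;
    used += size;
  }
  return out;
}
```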
LTC Structure
Generated by pruning at models/vX/_core.json:
{
"version": "1.0",
"updated": "2026-05-05T12:00:00Z",
"core": {
"learnings": [...],
"decisions": [...],
"preferences": {...}
},
"stats": {
"pruned": 10,
"kept": 5
}
}
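To make the structure concrete, here is a minimal sketch of a keyword search over an object shaped like `_core.json`. It assumes learnings and decisions are arrays of plain strings and scores relevance as the number of query-term occurrences; `searchCore` is a hypothetical name, and Vant's real scoring may differ.

```javascript
// Hypothetical sketch: Vant's real scoring may differ.
// Assumes learnings and decisions are arrays of plain strings.
function searchCore(ltc, query, limit = 5) {
  const terms = query.toLowerCase().split(/\s+/).filter(Boolean);
  const hits = [];
  for (const type of ['learnings', 'decisions']) {
    for (const summary of ltc.core[type] || []) {
      const text = summary.toLowerCase();
      // Relevance = total occurrences of the query terms in this entry.
      let relevance = 0;
      for (const term of terms) relevance += text.split(term).length - 1;
      if (relevance > 0) hits.push({ type, summary, relevance });
    }
  }
  // Highest relevance first, then cut to the requested limit.
  return hits.sort((a, b) => b.relevance - a.relevance).slice(0, limit);
}

const ltc = {
  version: '1.0',
  core: {
    learnings: ['Python venv keeps python deps isolated', 'Use git tags for releases'],
    decisions: ['Prefer Python 3.12'],
    preferences: {},
  },
};
console.log(searchCore(ltc, 'python'));
```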
Use Cases
- Deep research: Search LTC, re-hydrate full context
- Decision review: Find all decisions on topic
- Context recovery: Need more than LTC provides
- RAG pipeline: Build AI responses with full context
Integration with AI
// In agent/loop.js
const search = require('vant').search;
async function answerWithContext(query) {
// 1. Quick LTC check
const ltc = search.getSummary();
if (!ltc.hasLTC) {
return 'No LTC available - run prune first';
}
// 2. Search + re-hydrate
const { results, context } = await search.query(query);
if (!context) {
// 3. Fallback to current brain only
return null;
}
// 4. Use context in AI prompt
return context;
}
Caveats
- Requires LTC (run prune first)
- Limited to pruned versions
- 50KB max prevents context overflow
Rerank (RAG)
Keyword reranking and compression for LLM context
Rerank is separate from search and focuses on keyword scoring and token optimization:
- Rerank: Score memories by keyword match to query
- Compress: Strip markdown fluff, truncate to token budget
- Pipeline: Rerank + compress in sequence
See Rerank Guide for full documentation.
vant rerank "lessons learned" # Rerank
vant rerank pipeline "security" -t 4000 # Rerank + compress
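The rerank, compress, and pipeline steps could look roughly like the sketch below. It is not Vant's implementation: the function names are hypothetical, the score is a simple count of matching keywords, and the token budget assumes about 4 characters per token (a common rule of thumb).

```javascript
// Hypothetical sketch of rerank + compress; not Vant's implementation.

// Score each memory by how many distinct query keywords it contains.
function rerank(query, memories) {
  const keywords = query.toLowerCase().split(/\s+/).filter(Boolean);
  return memories
    .map((text) => {
      const lower = text.toLowerCase();
      const score = keywords.filter((k) => lower.includes(k)).length;
      return { text, score };
    })
    .filter((m) => m.score > 0)
    .sort((a, b) => b.score - a.score);
}

// Strip common markdown markers, then cut to a rough token budget.
// Assumption: about 4 characters per token.
function compress(text, maxTokens) {
  const plain = text
    .replace(/^#+\s*/gm, '')  // heading markers
    .replace(/[*`]/g, '')     // emphasis and inline-code markers
    .replace(/\n{2,}/g, '\n') // blank lines
    .trim();
  return plain.slice(0, maxTokens * 4);
}

// Rerank, then compress the ranked memories into one context string.
function pipeline(query, memories, maxTokens = 2000) {
  const ranked = rerank(query, memories);
  return compress(ranked.map((m) => m.text).join('\n'), maxTokens);
}
```

A character-based budget is only an approximation; a real tokenizer would give exact counts at the cost of an extra dependency.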
Search vs Rerank
| Feature | Search | Rerank |
|---|---|---|
| Type | Semantic (BM25+Vector) | Keyword |
| Use case | Find relevant memories | Prepare for LLM |
| Input | Query | Query + memories |
| Output | Candidates | Ranked + compressed |
Use search to find candidates, rerank to optimize for LLM context.
Usage
const search = require('vant').search;
const results = await search.hybrid('herbalism plants');
// { sparse: [], dense: [], fused: [], sources: [] }
CLI
vant search "query" # Default hybrid
vant search --hybrid "q" # Explicit hybrid
vant search --hyde "q" # HyDE
vant search --stats # Index stats
vant search "q" -r # Search + rerank
vant search "q" -r -t 4000 # Search + rerank w/ 4000 tokens
Rerank Integration
Use -r or --rerank to pipeline search results through rerank:
# Search hybrid, then keyword rerank + compress
vant search "lessons" -r -t 4000
# Pipeline: Hybrid search → Rerank → Compress
# Returns keyword-scored, token-compressed memories
| Flag | Description | Default |
|---|---|---|
| -r, --rerank | Rerank results | false |
| -t, --max-tokens | Max tokens | 2000 |
All three search modes support --rerank: --mode basic, --mode rag, and --mode hybrid (the default).
Integration
Use with query transformation:
const query = require('./lib/query');
const citations = require('./lib/citations');
const result = await query.hyde('herbalism');
// Uses HyDE: fake answer → search real
citations.addSource(result.sources[0].commit);
MCP Tool
Search is available via MCP as vant_search with three modes:
{
name: 'vant_search',
arguments: {
query: 'search term',
mode: 'basic' | 'rag' | 'hybrid', // default: 'basic'
files: ['file.md'], // basic mode only
limit: 5 // rag mode (default: 5); also accepted in hybrid mode
}
}
Basic Mode
Fast text search. Default mode.
// Search all files
await vant_search({ query: 'lessons', mode: 'basic' });
// Search specific files
await vant_search({ query: 'lessons', mode: 'basic', files: ['identity.md', 'goals.md'] });
Returns:
{
"mode": "basic",
"query": "lessons",
"filesSearched": 24,
"hits": 5,
"results": [
{ "file": "identity.md", "line": 91, "text": "3. **Load brain**" },
{ "file": "lessons.md", "line": 1, "text": "# PROJECT LESSONS" }
]
}
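For intuition, a line-oriented text search that produces a response of this shape might look like the sketch below. It works on in-memory file contents and matches case-insensitively; both choices, and the name `basicSearch`, are assumptions rather than Vant's actual behaviour.

```javascript
// Hypothetical sketch of line-oriented text search over in-memory files.
// files: object mapping file name -> file content.
function basicSearch(files, query) {
  const needle = query.toLowerCase();
  const results = [];
  for (const [file, content] of Object.entries(files)) {
    content.split('\n').forEach((text, index) => {
      if (text.toLowerCase().includes(needle)) {
        // Line numbers are 1-based, as in the response above.
        results.push({ file, line: index + 1, text: text.trim() });
      }
    });
  }
  return {
    mode: 'basic',
    query,
    filesSearched: Object.keys(files).length,
    hits: results.length,
    results,
  };
}
```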
RAG Mode
Semantic search using LTC. Searches indexed learnings and decisions, then rehydrates full context.
// Semantic search
await vant_search({ query: 'python', mode: 'rag', limit: 3 });
Returns:
{
"mode": "rag",
"query": "python",
"results": 2,
"hits": [
{ "type": "learnings", "summary": "Use requests library" },
{ "type": "decisions", "summary": "Prefer venv" }
],
"context": "=== learnings/python.md ===\n...",
"compressed": "[COMPRESSED:12345]...",
"ltc": { "hasLTC": true, "learnings": 42, "decisions": 15 }
}
Compression: RAG mode applies compression when the context exceeds 5KB (the COMPRESSION_THRESHOLD setting).
Hybrid Mode
BM25 + Vector + RRF for full-text capability. Best for general search.
// Hybrid search (BM25 + Vector + RRF via unified lib)
await vant_search({ query: 'lessons', mode: 'hybrid', limit: 5 });
Returns:
{
"mode": "hybrid",
"query": "lessons",
"sparse": 10,
"dense": 5,
"fused": 8,
"results": [
{ "id": "abc123", "rrf": "0.815", "content": "Project lessons learned..." }
]
}
Best for: General queries where you want both keyword and semantic matches.
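Reciprocal Rank Fusion (RRF) merges the sparse (BM25) and dense (vector) rankings by summing 1 / (k + rank) for each document across the lists. A minimal sketch, assuming the conventional constant k = 60 and a hypothetical `rrfFuse` helper (Vant's constant and scaling may differ):

```javascript
// Minimal Reciprocal Rank Fusion sketch.
// k = 60 is the conventional constant; Vant's value is an assumption here.
function rrfFuse(rankings, k = 60) {
  const scores = new Map();
  for (const ranking of rankings) {
    ranking.forEach((id, index) => {
      const rank = index + 1; // ranks are 1-based
      scores.set(id, (scores.get(id) || 0) + 1 / (k + rank));
    });
  }
  // Highest fused score first.
  return [...scores.entries()]
    .map(([id, rrf]) => ({ id, rrf }))
    .sort((a, b) => b.rrf - a.rrf);
}

const sparse = ['a', 'b', 'c']; // e.g. BM25 order
const dense = ['b', 'd', 'a'];  // e.g. vector-similarity order
console.log(rrfFuse([sparse, dense]).map((r) => r.id));
// → [ 'b', 'a', 'd', 'c' ]
```

Because RRF uses only ranks, it needs no score normalization between BM25 and vector similarity, and documents that appear in both lists rise to the top.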
Unified API
All modes are available through a single module, lib/search.js:
const search = require('vant').search;
// Basic: text search
const basic = await search.searchLTC('query', { limit: 5 });
// RAG: semantic + rehydrate
const { results, context } = await search.query('query', { limit: 5 });
// Hybrid: BM25 + Vector + RRF
const hybrid = await search.hybrid('query');
// HyDE: query transformation
const hyde = await search.hyde('query');
// Get settings
const settings = search.getSettings();
CLI
# Basic search (default)
vant search "lessons"
# RAG search
vant search --mode rag "python"
vant search --mode rag --limit 3 "context"
Settings
Configure search via settings.ini:
# Search (RAG)
REHYDRATE_MAX_SIZE=51200 # bytes, max 1MB (default 50KB)
COMPRESSION_THRESHOLD=5120 # bytes trigger for compression hint (default 5KB)
RAG_LIMIT_MAX=20 # max results from LTC search (default 20)
| Setting | Default | Description |
|---|---|---|
| REHYDRATE_MAX_SIZE | 50KB | Max bytes returned in RAG context |
| COMPRESSION_THRESHOLD | 5KB | When to show compression hint |
| RAG_LIMIT_MAX | 20 | Max results from LTC query |
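A loader for these settings might be sketched as follows. `parseSettings` is hypothetical and Vant's own loader may behave differently; the sketch strips `#` comments, falls back to the documented defaults, and enforces the 1MB ceiling on REHYDRATE_MAX_SIZE.

```javascript
// Hypothetical parser sketch; Vant's own settings loader may differ.
const DEFAULTS = {
  REHYDRATE_MAX_SIZE: 51200,   // 50KB
  COMPRESSION_THRESHOLD: 5120, // 5KB
  RAG_LIMIT_MAX: 20,
};

function parseSettings(iniText) {
  const settings = { ...DEFAULTS };
  for (const rawLine of iniText.split('\n')) {
    const line = rawLine.split('#')[0].trim(); // drop inline and full-line comments
    if (!line) continue;
    const [key, value] = line.split('=').map((s) => s.trim());
    const num = Number.parseInt(value, 10);
    // Only known keys with positive integer values override a default.
    const known = Object.prototype.hasOwnProperty.call(DEFAULTS, key);
    if (known && Number.isInteger(num) && num > 0) settings[key] = num;
  }
  // Documented ceiling: REHYDRATE_MAX_SIZE may not exceed 1MB.
  settings.REHYDRATE_MAX_SIZE = Math.min(settings.REHYDRATE_MAX_SIZE, 1024 * 1024);
  return settings;
}
```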
RAG response includes current settings:
{
"mode": "rag",
"settings": {
"rehydrateMaxSize": 51200,
"compressionThreshold": 5120,
"ragLimitMax": 20
}
}
Security
- Path traversal: Blocked in basic mode
- Null bytes: Content rejected
- Absolute paths: Blocked in files filter
- Query length: Max 500 chars via vaf
- RAG limit bounds: 1-20 enforced
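The checks listed above could be approximated as in the sketch below. All function names are hypothetical, and applying the null-byte check to queries and file paths is one interpretation of the list, not a statement of what Vant does internally.

```javascript
// Hypothetical validation sketch mirroring the listed checks; not Vant's code.
const path = require('path');

const MAX_QUERY_LENGTH = 500;

function validateQuery(query) {
  if (typeof query !== 'string' || query.length === 0) {
    throw new Error('query must be a non-empty string');
  }
  if (query.includes('\0')) throw new Error('null byte in query');
  if (query.length > MAX_QUERY_LENGTH) throw new Error('query too long');
  return query;
}

// Validate one entry of the files filter and return its normalized form.
function validateFile(file) {
  if (file.includes('\0')) throw new Error('null byte in path');
  if (path.isAbsolute(file)) throw new Error('absolute paths are not allowed');
  // Normalize separators, then collapse "." and ".." segments.
  const normalized = path.posix.normalize(file.replace(/\\/g, '/'));
  if (normalized === '..' || normalized.startsWith('../')) {
    throw new Error('path traversal is not allowed');
  }
  return normalized;
}

// Keep the RAG limit inside the documented 1-20 range.
function clampLimit(limit, max = 20) {
  const n = Number.isInteger(limit) ? limit : 5; // 5 is the documented RAG default
  return Math.min(Math.max(n, 1), max);
}
```

Normalizing before the traversal check matters: a path such as `a/../../b` only reveals its leading `..` after the inner segments are collapsed.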
Mode Comparison
| Feature | Basic | RAG/LTC | Hybrid |
|---|---|---|---|
| Speed | ~1ms | ~5ms | ~10ms |
| Type | Text match | Semantic | RRF |
| Context | Current brain | Git history | Full |
| Compression | N/A | >5KB trigger | N/A |
| Requires LTC | No | Yes | Optional |
| Tokens | Full file | Distilled | Full |
Future: Islands + vpatch
Combining with Islands architecture:
- Islands: Componentized brain (lazy-load on trigger)
- vpatch: Compact diff format vs full file
- Benefit: Smaller token context, faster hydration
Potential workflow:
- Query triggers island(s) → lazy-hydrate only needed components
- Return compressed vpatch diffs → smaller context
- Result: Faster RAG with lower tokens
See: Islands for architecture.
Related
- Islands - Componentized brain
- Audit - Activity logging
- Citations - Git-backed citations
- Hybrid Sync - Public/Private split