Vant Docs

Hybrid Search

NEW in v0.8.6: See Hybrid Search below for BM25+Vector RRF.

Use your Long Term Core (LTC) as a high-level index for RAG-like consciousness. Search for relevant memories, then re-hydrate full context from git history.

Vant generates the LTC by pruning: the result is a “distilled” consciousness that keeps only important learnings and decisions. Sometimes, though, you need the full context.

This is like having a map (LTC) + archive (git history). Query the map, retrieve from the archive!

How It Works

Query → Search LTC → Identify topics → Re-hydrate from git → Full context

Usage

const search = require('vant').search;

// Get search summary
const summary = search.getSummary();
console.log(summary);
// { hasLTC: true, learnings: 10, decisions: 5, ... }

// Search LTC
const results = search.searchLTC('python', { limit: 5 });
console.log(results);
// [{ type: 'learnings', summary: '...', relevance: 3 }]

// Query + Re-hydrate
// (renamed on destructuring: `results` is already declared above)
const { results: matches, context } = await search.query('authentication');
console.log(context);  // Full matching content

Query Options

All query modes:

// Full search + rehydrate (default)
const { results, context } = await search.query('python');

// Compact mode: summaries only (faster)
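// (renamed on destructuring to avoid redeclaring `results` and `context`)
const { results: compactResults, context: compactContext } = await search.query('python', { compact: true });
// compactContext: "- Learned X\n- Decided Y..."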

// Hybrid search with caching
const r1 = await search.hybrid('python');   // First call
const r2 = await search.hybrid('python');   // Cached!

// Cache management
console.log(search.getCacheStats());  // { size: 1, max: 50 }
search.clearCache();               // Clear session cache

// HyDE transform
const hydeQuery = await search.hyde('python');

Session Caching

Hybrid search results are cached per-session (50 max, MD5 keyed):

// First call - actual search
const r1 = await search.hybrid('python');

// Second call - cached (instant)
const r2 = await search.hybrid('python');

// Cache stats
search.getCacheStats();  // { size: 1, max: 50 }
search.clearCache();  // Clear all cached
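
To make the caching behaviour concrete, here is a minimal sketch of a bounded, MD5-keyed cache with the same 50-entry limit. It is not Vant's actual implementation: createSessionCache and withCache are hypothetical names, and evicting the oldest entry first is an assumption.

```javascript
const crypto = require('crypto');

// Hypothetical helper: a bounded, MD5-keyed cache (not Vant's actual code).
function createSessionCache(max = 50) {
    const entries = new Map(); // a Map preserves insertion order

    const keyFor = (query) =>
        crypto.createHash('md5').update(query).digest('hex');

    return {
        get(query) {
            return entries.get(keyFor(query));
        },
        set(query, value) {
            const key = keyFor(query);
            // Deleting first moves an existing key to the newest position
            entries.delete(key);
            // Evict the oldest entry once the cache is full (assumed policy)
            if (entries.size >= max) {
                entries.delete(entries.keys().next().value);
            }
            entries.set(key, value);
        },
        stats() {
            return { size: entries.size, max };
        },
        clear() {
            entries.clear();
        },
    };
}

// Wrap any async search function so repeated queries are served from the cache
function withCache(searchFn, cache = createSessionCache()) {
    return async (query) => {
        const hit = cache.get(query);
        if (hit !== undefined) return hit;
        const result = await searchFn(query);
        cache.set(query, result);
        return result;
    };
}
```

Hashing the query keeps keys at a fixed length regardless of query size; MD5 is acceptable here only because the key is a lookup handle, not a security boundary.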

Re-hydrate

Re-hydrate fetches full matching files from git history:

const { context } = await search.rehydrate(results);
// Returns: combined context up to 50KB
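
The 50KB cap can be illustrated with a small sketch that joins file contents under a byte budget. combineContext is a hypothetical helper, not part of Vant; the "=== path ===" header follows the sample RAG output on this page, and stopping at the first file that does not fit is an assumption.

```javascript
// Hypothetical helper: join file contents under a byte budget.
// The "=== path ===" header mirrors the sample RAG output; the rest is assumed.
function combineContext(files, maxBytes = 51200) {
    let context = '';
    let used = 0;
    const included = [];

    for (const { path, content } of files) {
        const chunk = `=== ${path} ===\n${content}\n`;
        const size = Buffer.byteLength(chunk, 'utf8');
        // Stop before exceeding the budget rather than cutting a file in half
        if (used + size > maxBytes) break;
        context += chunk;
        used += size;
        included.push(path);
    }

    return { context, bytes: used, included };
}
```

Measuring with Buffer.byteLength rather than string length matters for non-ASCII content, where one character can occupy several bytes.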

Security limits:

LTC Structure

Generated by pruning at models/vX/_core.json:

{
    "version": "1.0",
    "updated": "2026-05-05T12:00:00Z",
    "core": {
        "learnings": [...],
        "decisions": [...],
        "preferences": {...}
    },
    "stats": {
        "pruned": 10,
        "kept": 5
    }
}
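
As a rough illustration of how a keyword search over this structure could produce results shaped like { type, summary, relevance }, consider the sketch below. searchCore is a hypothetical function, and it assumes learnings and decisions are arrays of plain strings; the real entries may carry more fields.

```javascript
// Hypothetical sketch: keyword search over the "core" object of an LTC file.
// Assumes learnings and decisions are arrays of strings.
function searchCore(core, query, { limit = 5 } = {}) {
    const terms = query.toLowerCase().split(/\s+/).filter(Boolean);
    const results = [];

    for (const type of ['learnings', 'decisions']) {
        for (const summary of core[type] || []) {
            const text = String(summary).toLowerCase();
            // Relevance = total number of occurrences of any query term
            let relevance = 0;
            for (const term of terms) {
                relevance += text.split(term).length - 1;
            }
            if (relevance > 0) results.push({ type, summary, relevance });
        }
    }

    // Highest relevance first, then cut to the requested limit
    return results.sort((a, b) => b.relevance - a.relevance).slice(0, limit);
}

const core = {
    learnings: [
        'Use the requests library in Python',
        'Python venv keeps Python deps isolated',
    ],
    decisions: ['Prefer venv'],
};
console.log(searchCore(core, 'python', { limit: 5 }));
```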

Use Cases

  1. Deep research: Search LTC, re-hydrate full context
  2. Decision review: Find all decisions on topic
  3. Context recovery: Need more than LTC provides
  4. RAG pipeline: Build AI responses with full context

Integration with AI

// In agent/loop.js
const search = require('vant').search;

async function answerWithContext(query) {
    // 1. Quick LTC check
    const ltc = search.getSummary();
    if (!ltc.hasLTC) {
        return 'No LTC available - run prune first';
    }
    
    // 2. Search + re-hydrate
    const { results, context } = await search.query(query);
    
    if (!context) {
        // 3. Fallback to current brain only
        return null;
    }
    
    // 4. Use context in AI prompt
    return context;
}

Caveats


Rerank (RAG)

Keyword reranking and compression for LLM context

Rerank is separate from search: it focuses on keyword scoring and token optimization.

See Rerank Guide for full documentation.

vant rerank "lessons learned"          # Rerank
vant rerank pipeline "security" -t 4000  # Rerank + compress

Search vs Rerank

| Feature | Search | Rerank |
| --- | --- | --- |
| Type | Semantic (BM25+Vector) | Keyword |
| Use case | Find relevant memories | Prepare for LLM |
| Input | Query | Query + memories |
| Output | Candidates | Ranked + compressed |

Use search to find candidates, rerank to optimize for LLM context.
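
The rerank half of that division of labour can be sketched as keyword scoring followed by trimming to a token budget. This is an illustration only: rerankAndCompress and estimateTokens are hypothetical names, the scoring rule is assumed, and tokens are approximated as characters divided by four.

```javascript
// Hypothetical sketch of keyword rerank + token-budget trimming.
// Token counts are approximated as characters / 4, which is an assumption.
const estimateTokens = (text) => Math.ceil(text.length / 4);

function rerankAndCompress(query, memories, maxTokens = 2000) {
    const terms = query.toLowerCase().split(/\s+/).filter(Boolean);

    const scored = memories
        .map((text) => {
            const lower = text.toLowerCase();
            // Score = number of distinct query terms present in the memory
            const score = terms.filter((t) => lower.includes(t)).length;
            return { text, score };
        })
        .filter((m) => m.score > 0)
        .sort((a, b) => b.score - a.score);

    // Keep the best-scoring memories that still fit in the budget
    const kept = [];
    let tokens = 0;
    for (const m of scored) {
        const cost = estimateTokens(m.text);
        if (tokens + cost > maxTokens) continue;
        kept.push(m);
        tokens += cost;
    }
    return { kept, tokens };
}
```

Skipping an oversized memory instead of stopping lets smaller, lower-ranked memories still use the remaining budget.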

Usage

const search = require('vant').search;

const results = await search.hybrid('herbalism plants');
// { sparse: [], dense: [], fused: [], sources: [] }
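
For readers unfamiliar with the sparse side of a hybrid search, the following is a minimal BM25 scorer. It is a sketch, not Vant's indexer: bm25Rank and tokenize are hypothetical names, the tokenizer handles only ASCII letters and digits, and k1 = 1.2 and b = 0.75 are common defaults rather than confirmed Vant settings.

```javascript
// Hypothetical BM25 scorer for the sparse side of a hybrid search.
// k1 and b use common defaults; Vant's tokenizer and parameters may differ.
function tokenize(text) {
    return text.toLowerCase().match(/[a-z0-9]+/g) || [];
}

function bm25Rank(query, docs, { k1 = 1.2, b = 0.75 } = {}) {
    const tokenized = docs.map((d) => tokenize(d.content));
    const avgLen =
        tokenized.reduce((sum, t) => sum + t.length, 0) / (docs.length || 1);

    // Document frequency: in how many documents each term appears
    const df = new Map();
    for (const tokens of tokenized) {
        for (const term of new Set(tokens)) {
            df.set(term, (df.get(term) || 0) + 1);
        }
    }

    const terms = tokenize(query);
    return docs
        .map((doc, i) => {
            const tokens = tokenized[i];
            let score = 0;
            for (const term of terms) {
                const tf = tokens.filter((t) => t === term).length;
                if (tf === 0) continue;
                const n = df.get(term);
                // Rare terms get a higher inverse document frequency
                const idf = Math.log(1 + (docs.length - n + 0.5) / (n + 0.5));
                // Length normalization penalizes long documents
                const norm = tf + k1 * (1 - b + (b * tokens.length) / avgLen);
                score += (idf * tf * (k1 + 1)) / norm;
            }
            return { id: doc.id, score };
        })
        .filter((r) => r.score > 0)
        .sort((x, y) => y.score - x.score);
}
```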

CLI

vant search "query"           # Default hybrid
vant search --hybrid "q"   # Explicit hybrid
vant search --hyde "q"      # HyDE
vant search --stats        # Index stats
vant search "q" -r         # Search + rerank
vant search "q" -r -t 4000 # Search + rerank w/ 4000 tokens

Rerank Integration

Use -r or --rerank to pipeline search results through rerank:

# Search hybrid, then keyword rerank + compress
vant search "lessons" -r -t 4000

# Pipeline: Hybrid search → Rerank → Compress
# Returns keyword-scored, token-compressed memories

| Flag | Description | Default |
| --- | --- | --- |
| -r, --rerank | Rerank results | false |
| -t, --max-tokens | Max tokens | 2000 |

All search modes support --rerank: the default mode, --mode basic, --mode rag, and --mode hybrid.

Integration

Use with query transformation:

const query = require('./lib/query');
const citations = require('./lib/citations');

const result = await query.hyde('herbalism');
// Uses HyDE: fake answer → search real

citations.addSource(result.sources[0].commit);
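
The HyDE idea (generate a fake answer, then search with it) can be sketched independently of any library. In this illustration hydeSearch is a hypothetical function, and generateAnswer and searchFn are placeholders supplied by the caller; none of them are Vant APIs.

```javascript
// Hypothetical HyDE flow. `generateAnswer` and `searchFn` are placeholders
// supplied by the caller; neither is a Vant API.
async function hydeSearch(query, generateAnswer, searchFn) {
    // 1. Ask a model for a plausible (possibly wrong) answer
    const hypothetical = await generateAnswer(query);
    // 2. Search with the fake answer instead of the short query, since it
    //    tends to share more vocabulary with real documents
    const results = await searchFn(hypothetical);
    return { query, hypothetical, results };
}
```

The hypothetical answer does not need to be correct; it only needs to resemble the documents you hope to retrieve.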

MCP Tool

Search is available via MCP as vant_search with three modes (basic, rag, and hybrid):

{
  name: 'vant_search',
  arguments: {
    query: 'search term',
    mode: 'basic' | 'rag' | 'hybrid',  // default: 'basic'
    files: ['file.md'],       // basic mode only
    limit: 5                // rag mode, default: 5
  }
}

Basic Mode

Fast text search. Default mode.

// Search all files
await vant_search({ query: 'lessons', mode: 'basic' });

// Search specific files
await vant_search({ query: 'lessons', mode: 'basic', files: ['identity.md', 'goals.md'] });

Returns:

{
  "mode": "basic",
  "query": "lessons",
  "filesSearched": 24,
  "hits": 5,
  "results": [
    { "file": "identity.md", "line": 91, "text": "3. **Load brain**" },
    { "file": "lessons.md", "line": 1, "text": "# PROJECT LESSONS" }
  ]
}
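
A response of this shape could come from a simple line-by-line text match. The sketch below is an assumption about how such a search might work, not Vant's code: basicSearch is a hypothetical function that takes file contents from an in-memory object instead of reading the brain from disk.

```javascript
// Hypothetical line-based text search over in-memory files.
// `files` maps a file name to its content; reading from disk is left out.
function basicSearch(query, files, only = null) {
    const needle = query.toLowerCase();
    const names = only || Object.keys(files);
    const results = [];
    let filesSearched = 0;

    for (const file of names) {
        // Skip requested names that were never loaded
        if (!Object.prototype.hasOwnProperty.call(files, file)) continue;
        filesSearched += 1;
        files[file].split('\n').forEach((text, index) => {
            if (text.toLowerCase().includes(needle)) {
                // Line numbers are 1-based, as in the sample response
                results.push({ file, line: index + 1, text: text.trim() });
            }
        });
    }

    return { mode: 'basic', query, filesSearched, hits: results.length, results };
}
```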

RAG Mode

Semantic search using LTC. Searches indexed learnings and decisions, then rehydrates full context.

// Semantic search
await vant_search({ query: 'python', mode: 'rag', limit: 3 });

Returns:

{
  "mode": "rag",
  "query": "python",
  "results": 2,
  "hits": [
    { "type": "learnings", "summary": "Use requests library" },
    { "type": "decisions", "summary": "Prefer venv" }
  ],
  "context": "=== learnings/python.md ===\n...",
  "compressed": "[COMPRESSED:12345]...",
  "ltc": { "hasLTC": true, "learnings": 42, "decisions": 15 }
}

Compression: RAG mode applies compression when context > 5KB.

Hybrid Mode

BM25 + Vector + RRF for full-text capability. Best for general search.

// Hybrid search (BM25 + Vector + RRF via unified lib)
await vant_search({ query: 'lessons', mode: 'hybrid', limit: 5 });

Returns:

{
  "mode": "hybrid",
  "query": "lessons",
  "sparse": 10,
  "dense": 5,
  "fused": 8,
  "results": [
    { "id": "abc123", "rrf": "0.815", "content": "Project lessons learned..." }
  ]
}

Best for: General queries where you want both keyword and semantic matches.
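
Reciprocal Rank Fusion (RRF) merges the sparse and dense rankings by summing 1 / (k + rank) for each document across lists. The sketch below shows the technique in isolation; reciprocalRankFusion is a hypothetical name, and k = 60 is the commonly used constant, so the constant and score scale Vant uses may differ.

```javascript
// Hypothetical RRF fusion: each ranking is an array of ids, best match first.
// k = 60 is the commonly used constant; Vant's value and score scaling may differ.
function reciprocalRankFusion(rankings, k = 60) {
    const scores = new Map();
    for (const ranking of rankings) {
        ranking.forEach((id, index) => {
            // Ranks are 1-based, so the top item contributes 1 / (k + 1)
            const contribution = 1 / (k + index + 1);
            scores.set(id, (scores.get(id) || 0) + contribution);
        });
    }
    return [...scores.entries()]
        .map(([id, rrf]) => ({ id, rrf }))
        .sort((a, b) => b.rrf - a.rrf);
}

const sparse = ['doc1', 'doc2', 'doc3']; // e.g. BM25 order
const dense = ['doc3', 'doc1', 'doc4'];  // e.g. vector-similarity order
const fused = reciprocalRankFusion([sparse, dense]);
console.log(fused.map((r) => r.id));
```

Because RRF uses only ranks, it needs no score normalization between the keyword and vector sides, and documents found by both rise to the top.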


Unified API

All modes are available through a single module, lib/search.js:

const search = require('vant').search;

// Basic: text search
const basic = await search.searchLTC('query', { limit: 5 });

// RAG: semantic + rehydrate
const { results, context } = await search.query('query', { limit: 5 });

// Hybrid: BM25 + Vector + RRF
const hybrid = await search.hybrid('query');

// HyDE: query transformation
const hyde = await search.hyde('query');

// Get settings
const settings = search.getSettings();

CLI

# Basic search (default)
vant search "lessons"

# RAG search
vant search --mode rag "python"
vant search --mode rag --limit 3 "context"

Settings

Configure search via settings.ini:

# Search (RAG)
REHYDRATE_MAX_SIZE=51200    # bytes, max 1MB (default 50KB)
COMPRESSION_THRESHOLD=5120  # bytes trigger for compression hint (default 5KB)
RAG_LIMIT_MAX=20             # max results from LTC search (default 20)

| Setting | Default | Description |
| --- | --- | --- |
| REHYDRATE_MAX_SIZE | 50KB | Max bytes returned in RAG context |
| COMPRESSION_THRESHOLD | 5KB | When to show compression hint |
| RAG_LIMIT_MAX | 20 | Max results from LTC query |
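
To show how KEY=VALUE lines like those above might be read and clamped, here is a minimal sketch. parseSearchSettings is a hypothetical function; the defaults and the 1MB ceiling come from this page, while the parsing rules (trailing # comments, positive integers only) are assumptions.

```javascript
// Hypothetical parser for KEY=VALUE settings lines.
// Defaults and the 1MB ceiling come from this page; parsing rules are assumed.
const DEFAULTS = {
    REHYDRATE_MAX_SIZE: 51200,
    COMPRESSION_THRESHOLD: 5120,
    RAG_LIMIT_MAX: 20,
};
const REHYDRATE_CEILING = 1024 * 1024; // 1MB

function parseSearchSettings(iniText) {
    const settings = { ...DEFAULTS };
    for (const rawLine of iniText.split('\n')) {
        const line = rawLine.split('#')[0].trim(); // drop trailing comments
        if (!line.includes('=')) continue;
        const [key, value] = line.split('=').map((part) => part.trim());
        const number = Number.parseInt(value, 10);
        const known = Object.prototype.hasOwnProperty.call(DEFAULTS, key);
        // Ignore unknown keys and values that are not positive integers
        if (known && Number.isInteger(number) && number > 0) {
            settings[key] = number;
        }
    }
    // Enforce the documented upper bound on rehydrated context
    settings.REHYDRATE_MAX_SIZE = Math.min(settings.REHYDRATE_MAX_SIZE, REHYDRATE_CEILING);
    return settings;
}
```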

RAG response includes current settings:

{
  "mode": "rag",
  "settings": {
    "rehydrateMaxSize": 51200,
    "compressionThreshold": 5120,
    "ragLimitMax": 20
  }
}

Security


Mode Comparison

| Feature | Basic | RAG/LTC | Hybrid |
| --- | --- | --- | --- |
| Speed | ~1ms | ~5ms | ~10ms |
| Type | Text match | Semantic | RRF |
| Context | Current brain | Git history | Full |
| Compression | N/A | >5KB trigger | N/A |
| Requires LTC | No | Yes | Optional |
| Tokens | Full file | Distilled | Full |

Future: Islands + vpatch

Combining with Islands architecture:

Potential workflow:

  1. Query triggers island(s) → lazy-hydrate only needed components
  2. Return compressed vpatch diffs → smaller context
  3. Result: Faster RAG with lower tokens

See: Islands for architecture.