📊 Embeddings
/docs/programmatic-usage/embeddings
Provider Comparison
| Provider | Latency | Quality | Cost | Offline | Best For |
|---|---|---|---|---|---|
| Agentic-Flow (ONNX) | ~3ms | Good | Free | ✅ | Production (faster with ONNX runtime) |
| OpenAI | ~50-100ms | Excellent | $0.02-0.13/1M | ❌ | Highest quality |
| Transformers.js | ~230ms | Good | Free | ✅ | Local development |
| Mock | <1ms | N/A | Free | ✅ | Testing |
Basic Usage
typescriptimport { createEmbeddingService, cosineSimilarity } from '@claude-flow/embeddings'; // Auto-selects best provider (agentic-flow ONNX preferred) const embeddings = await createEmbeddingService({ provider: 'auto', // agentic-flow → transformers → mock autoInstall: true, // Auto-install agentic-flow if missing dimensions: 384, cache: { enabled: true, maxSize: 10000 } }); // Generate embeddings const result = await embeddings.embed('authentication patterns'); console.log(`Generated in ${result.latencyMs}ms`); // Batch processing with cache stats const batch = await embeddings.embedBatch([ 'user login flow', 'password reset', 'session management' ]); console.log(`Cache hits: ${batch.cacheStats?.hits}`); // Compare similarity const similarity = cosineSimilarity(batch.embeddings[0], batch.embeddings[1]); // 0.94 (high similarity)
Document Chunking
Split long documents into overlapping chunks:
typescriptimport { chunkText, estimateTokens } from '@claude-flow/embeddings'; const result = chunkText(longDocument, { maxChunkSize: 512, overlap: 50, strategy: 'sentence', // 'character' | 'sentence' | 'paragraph' | 'token' minChunkSize: 100, }); console.log(`Created ${result.totalChunks} chunks`); result.chunks.forEach((chunk, i) => { console.log(`Chunk ${i}: ${chunk.length} chars, ~${chunk.tokenCount} tokens`); });
Normalization Options
Normalize embeddings for consistent similarity:
typescriptimport { l2Normalize, l1Normalize, minMaxNormalize, zScoreNormalize } from '@claude-flow/embeddings'; // L2 normalize (unit vector - most common for cosine similarity) const l2 = l2Normalize(embedding); // [0.6, 0.8, 0] // Other normalizations const l1 = l1Normalize(embedding); // Manhattan norm = 1 const minMax = minMaxNormalize(embedding); // Values in [0, 1] const zScore = zScoreNormalize(embedding); // Mean 0, std 1
Hyperbolic Embeddings (Poincaré Ball)
Better representation for hierarchical code structures:
typescriptimport { euclideanToPoincare, hyperbolicDistance, hyperbolicCentroid, mobiusAdd, } from '@claude-flow/embeddings'; // Convert to hyperbolic space (better for tree-like structures) const poincare = euclideanToPoincare(embedding); // Hyperbolic distance (geodesic in Poincaré ball) const dist = hyperbolicDistance(embedding1, embedding2); // Hyperbolic centroid (Fréchet mean) const centroid = hyperbolicCentroid([embed1, embed2, embed3]); // Why hyperbolic? Better for: // - Parent-child relationships (class inheritance) // - Directory hierarchies // - Taxonomy structures // - Lower distortion for tree-like data
Neural Substrate Integration (Fine-Tuning)
Access neural features for embedding adaptation:
typescriptimport { createNeuralService, isNeuralAvailable } from '@claude-flow/embeddings'; // Check availability const available = await isNeuralAvailable(); // Create neural service const neural = createNeuralService({ dimension: 384 }); await neural.init(); if (neural.isAvailable()) { // Semantic drift detection (catches context drift) await neural.setDriftBaseline('Initial context'); const drift = await neural.detectDrift('New input to check'); console.log('Drift:', drift?.trend); // 'stable' | 'drifting' | 'accelerating' // Memory with interference detection const stored = await neural.storeMemory('mem-1', 'Important pattern'); console.log('Interference:', stored?.interference); // Recall by similarity const memories = await neural.recallMemories('query', 5); // Coherence calibration (fine-tune quality detection) await neural.calibrateCoherence(['good output 1', 'good output 2']); const coherence = await neural.checkCoherence('Output to verify'); // Swarm coordination via embeddings await neural.addSwarmAgent('agent-1', 'researcher'); const coordination = await neural.coordinateSwarm('Complex task'); }
Persistent SQLite Cache
Long-term embedding storage with LRU eviction:
typescriptimport { PersistentEmbeddingCache } from '@claude-flow/embeddings'; const cache = new PersistentEmbeddingCache({ dbPath: './embeddings.db', maxSize: 10000, ttlMs: 7 * 24 * 60 * 60 * 1000, // 7 days }); await cache.init(); await cache.set('my text', new Float32Array([0.1, 0.2, 0.3])); const embedding = await cache.get('my text'); const stats = await cache.getStats(); console.log(`Hit rate: ${(stats.hitRate * 100).toFixed(1)}%`);
CLI Commands
bash# Generate embedding ruflo embeddings embed "Your text here" # Batch embed from file ruflo embeddings batch documents.txt -o embeddings.json # Similarity search ruflo embeddings search "query" --index ./vectors # Document chunking ruflo embeddings chunk document.txt --strategy sentence --max-size 512 # Normalize embeddings ruflo embeddings normalize embeddings.json --type l2 -o normalized.json # Convert to hyperbolic ruflo embeddings hyperbolic embeddings.json -o poincare.json # Neural operations ruflo embeddings neural drift --baseline "context" --input "check" ruflo embeddings neural store --id mem-1 --content "data" ruflo embeddings neural recall "query" --top-k 5 # Model management ruflo embeddings models list ruflo embeddings models download all-MiniLM-L6-v2 # Cache management ruflo embeddings cache stats ruflo embeddings cache clear --older-than 7d
Available Models
| Provider | Model | Dimensions | Best For |
|---|---|---|---|
| Agentic-Flow | default | 384 | General purpose (fastest) |
| OpenAI | text-embedding-3-small | 1536 | Cost-effective, high quality |
| OpenAI | text-embedding-3-large | 3072 | Highest quality |
| Transformers.js | Xenova/all-MiniLM-L6-v2 | 384 | Fast, offline |
| Transformers.js | Xenova/all-mpnet-base-v2 | 768 | Higher quality offline |
| Transformers.js | Xenova/bge-small-en-v1.5 | 384 | Retrieval optimized |