📊 Performance Benchmarking

/docs/200---2026-01-15/performance-benchmarking

2026-01-15/performance-benchmarking" parent: "200---2026-01-15" order: 7

Statistical benchmarking, memory tracking, regression detection, and V3 performance target validation.

Features

Feature	Description	Performance
Statistical Analysis	Mean, median, P95, P99, stddev, outlier removal	Real-time
Memory Tracking	Heap, RSS, external, array buffers	Per-iteration
Auto-Calibration	Adjusts iterations for statistical significance	Automatic
Regression Detection	Compare against baselines with significance testing	<10ms
V3 Targets	Built-in targets for all performance metrics	Preconfigured
Flash Attention	Validate optimized attention (WASM-accelerated when available) targets	Integrated

Quick Start

typescript
import { benchmark, BenchmarkRunner, V3_PERFORMANCE_TARGETS } from '@claude-flow/performance';

// Single benchmark
const result = await benchmark('vector-search', async () => {
  await index.search(queryVector, 10);
}, { iterations: 100, warmup: 10 });

console.log(`Mean: ${result.mean}ms, P99: ${result.p99}ms`);

// Check against V3 target
if (result.mean <= V3_PERFORMANCE_TARGETS['vector-search']) {
  console.log('✅ Target met!');
}

V3 Performance Targets

typescript
import { V3_PERFORMANCE_TARGETS, meetsTarget } from '@claude-flow/performance';

// Built-in targets
V3_PERFORMANCE_TARGETS = {
  // Startup Performance
  'cli-cold-start': 500,        // <500ms (5x faster)
  'cli-warm-start': 100,        // <100ms
  'mcp-server-init': 400,       // <400ms (4.5x faster)
  'agent-spawn': 200,           // <200ms (4x faster)

  // Memory Operations
  'vector-search': 1,           // <1ms (150x faster)
  'hnsw-indexing': 10,          // <10ms
  'memory-write': 5,            // <5ms (10x faster)
  'cache-hit': 0.1,             // <0.1ms

  // Swarm Coordination
  'agent-coordination': 50,     // <50ms
  'task-decomposition': 20,     // <20ms
  'consensus-latency': 100,     // <100ms (5x faster)
  'message-throughput': 0.1,    // <0.1ms per message

  // SONA Learning
  'sona-adaptation': 0.05       // <0.05ms
};

// Check if target is met
const { met, target, ratio } = meetsTarget('vector-search', 0.8);
// { met: true, target: 1, ratio: 0.8 }

Benchmark Suite

typescript
import { BenchmarkRunner } from '@claude-flow/performance';

const runner = new BenchmarkRunner('Memory Operations');

// Run individual benchmarks
await runner.run('vector-search', async () => {
  await index.search(query, 10);
});

await runner.run('memory-write', async () => {
  await store.write(entry);
});

// Run all at once
const suite = await runner.runAll([
  { name: 'search', fn: () => search() },
  { name: 'write', fn: () => write() },
  { name: 'index', fn: () => index() }
]);

// Print formatted results
runner.printResults();

// Export as JSON
const json = runner.toJSON();

Comparison & Regression Detection

typescript
import { compareResults, printComparisonReport } from '@claude-flow/performance';

// Compare current vs baseline
const comparisons = compareResults(baselineResults, currentResults, {
  'vector-search': 1,      // Target: <1ms
  'memory-write': 5,       // Target: <5ms
  'cli-startup': 500       // Target: <500ms
});

// Print formatted report
printComparisonReport(comparisons);

// Programmatic access
for (const comp of comparisons) {
  if (!comp.targetMet) {
    console.error(`${comp.benchmark} missed target!`);
  }
  if (comp.significant && !comp.improved) {
    console.warn(`${comp.benchmark} regressed by ${comp.changePercent}%`);
  }
}

Result Structure

typescript
interface BenchmarkResult {
  name: string;
  iterations: number;
  mean: number;           // Average time (ms)
  median: number;         // Median time (ms)
  p95: number;            // 95th percentile
  p99: number;            // 99th percentile
  min: number;
  max: number;
  stdDev: number;         // Standard deviation
  opsPerSecond: number;   // Operations/second
  memoryUsage: {
    heapUsed: number;
    heapTotal: number;
    external: number;
    arrayBuffers: number;
    rss: number;
  };
  memoryDelta: number;    // Memory change during benchmark
  timestamp: number;
}

Formatting Utilities

typescript
import { formatBytes, formatTime } from '@claude-flow/performance';

formatTime(0.00005);  // '50.00 ns'
formatTime(0.5);      // '500.00 µs'
formatTime(5);        // '5.00 ms'
formatTime(5000);     // '5.00 s'

formatBytes(1024);          // '1.00 KB'
formatBytes(1048576);       // '1.00 MB'
formatBytes(1073741824);    // '1.00 GB'

CLI Commands

bash
# Run all benchmarks
npm run bench

# Run attention benchmarks
npm run bench:attention

# Run startup benchmarks
npm run bench:startup

# Performance report
npx ruflo@latest performance report

# Benchmark specific suite
npx ruflo@latest performance benchmark --suite memory

CLI Commands

🧪 Testing Framework