# Advanced Usage

## Benchmark Engines

ModestBench provides two engines with different performance characteristics and statistical approaches.
### Engine Selection

Choose an engine based on your requirements:
```bash
# Tinybench engine (default) - fast development iteration
modestbench --engine tinybench

# Accurate engine - high-precision measurements
node --allow-natives-syntax ./node_modules/.bin/modestbench --engine accurate
```

### Statistical Improvements
Both engines now use IQR (Interquartile Range) outlier removal to filter extreme values caused by:
- Garbage collection pauses
- System interruptions
- Background processes
- OS scheduler variations
This results in more stable and reliable measurements compared to raw statistical analysis.
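For reference, here is a minimal sketch of that IQR filter: samples outside `[Q1 - 1.5×IQR, Q3 + 1.5×IQR]` are dropped before statistics are computed. This is an illustration, not ModestBench's internal code; the linear-interpolation quantile method is an assumption.

```js
// Illustrative IQR outlier filter (not ModestBench's internal implementation).
// Drops samples outside [Q1 - 1.5*IQR, Q3 + 1.5*IQR].
function removeOutliers(samples) {
  const sorted = [...samples].sort((a, b) => a - b);
  const quantile = (q) => {
    const pos = (sorted.length - 1) * q;
    const base = Math.floor(pos);
    const rest = pos - base;
    return sorted[base] + rest * ((sorted[base + 1] ?? sorted[base]) - sorted[base]);
  };
  const iqr = quantile(0.75) - quantile(0.25);
  const lo = quantile(0.25) - 1.5 * iqr;
  const hi = quantile(0.75) + 1.5 * iqr;
  return sorted.filter((s) => s >= lo && s <= hi);
}
```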
### AccurateEngine Statistical Features

The accurate engine provides enhanced statistical analysis:
- **V8 Optimization Guards**: uses V8 intrinsics (`%NeverOptimizeFunction`) to prevent JIT compiler interference with measurements
- **IQR Outlier Removal**: automatically removes extreme outliers (beyond Q1 - 1.5×IQR and Q3 + 1.5×IQR)
- **Comprehensive Statistics**:
  - Mean, min, max execution times
  - Standard deviation and variance
  - Coefficient of Variation (CV): measures relative variability (`stdDev / mean × 100`)
  - 95th and 99th percentiles
  - Margin of error (95% confidence interval)
### Coefficient of Variation (CV)

The CV metric helps assess benchmark quality:

- CV < 5% → Excellent (very stable)
- CV 5-10% → Good (acceptable variance)
- CV 10-20% → Fair (consider more samples)
- CV > 20% → Poor (investigate noise sources)

Example output showing CV:
```bash
$ modestbench --engine accurate --allow-natives-syntax --reporters json
```

```jsonc
{
  "name": "Array.push()",
  "mean": 810050,          // nanoseconds
  "stdDev": 19842,
  "cv": 2.45,              // 2.45% - excellent stability
  "marginOfError": 0.024,
  "p95": 845200,
  "p99": 862100
}
```
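If you want to reproduce the `cv` figure from raw samples, the calculation is straightforward. The sketch below follows the `stdDev / mean × 100` formula above; the sample values are made up for illustration.

```js
// Compute mean, standard deviation, and CV (%) from raw samples.
// cv = stdDev / mean × 100, matching the formula above.
function coefficientOfVariation(samples) {
  const mean = samples.reduce((sum, s) => sum + s, 0) / samples.length;
  const variance =
    samples.reduce((sum, s) => sum + (s - mean) ** 2, 0) / samples.length;
  const stdDev = Math.sqrt(variance);
  return { mean, stdDev, cv: (stdDev / mean) * 100 };
}

coefficientOfVariation([810050, 798200, 825400]);
// → { mean: ~811217, stdDev: ~11135, cv: ~1.37 } - excellent stability
```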
### Performance Comparison

Real-world comparison using `examples/benchmarks`:
```bash
# Tinybench (fast iteration)
$ modestbench --engine tinybench --reporters json
# Typical run time: 3-5 seconds for 5 benchmark files

# Accurate (high precision)
$ node --allow-natives-syntax ./node_modules/.bin/modestbench --engine accurate --reporters json
# Typical run time: 8-12 seconds for 5 benchmark files
```

The accurate engine takes roughly 2-3x longer but provides:
- More consistent results between runs
- Better outlier filtering with V8 guards
- Higher confidence in micro-optimizations
### Choosing the Right Engine

| Use Case | Recommended Engine |
|---|---|
| Development iteration | tinybench |
| CI/CD regression tests | tinybench |
| Blog post/publication | accurate |
| Library optimization | accurate |
| Micro-benchmark comparison | accurate |
| Algorithm selection | Either (results typically consistent) |
## Multiple Suites

Organize related benchmarks into separate suites with independent setup and teardown:
```js
const state = {
  data: [],
  sortedData: [],
};

export default {
  suites: {
    Sorting: {
      setup() {
        state.data = generateTestData(1000);
      },
      teardown() {
        state.data = [];
      },
      benchmarks: {
        'Quick Sort': () => quickSort(state.data),
        'Merge Sort': () => mergeSort(state.data),
        'Bubble Sort': () => bubbleSort(state.data),
      },
    },

    Searching: {
      setup() {
        state.sortedData = generateSortedData(10000);
      },
      teardown() {
        state.sortedData = [];
      },
      benchmarks: {
        'Binary Search': () => binarySearch(state.sortedData, 5000),
        'Linear Search': () => linearSearch(state.sortedData, 5000),
        'Jump Search': () => jumpSearch(state.sortedData, 5000),
      },
    },
  },
};
```

### Suite Lifecycle
1. `setup()` - called once before any tasks in the suite run
2. Tasks execute - each task runs with its configured iterations
3. `teardown()` - called once after all tasks complete
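As a quick illustration of this ordering (the benchmark bodies and log messages are only illustrative):

```js
export default {
  suites: {
    Lifecycle: {
      setup() {
        console.log('1: setup runs once, before any task');
      },
      teardown() {
        console.log('3: teardown runs once, after all tasks');
      },
      benchmarks: {
        // 2: each task then runs for its configured iterations
        'Task A': () => JSON.stringify({ ok: true }),
        'Task B': () => JSON.parse('{"ok":true}'),
      },
    },
  },
};
```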
## Async Operations

ModestBench fully supports asynchronous benchmarks.

### Async Functions
Section titled “Async Functions”export default { suites: { 'Async Performance': { benchmarks: { // Simple async benchmark 'Promise.resolve()': async () => { return await Promise.resolve('test'); },
// With configuration 'Fetch Simulation': { async fn() { const response = await simulateApiCall(); return response.json(); }, config: { iterations: 100, // Fewer iterations for slow operations }, }, }, }, },};Async Setup/Teardown
```js
export default {
  suites: {
    'Database Operations': {
      async setup() {
        this.db = await connectDatabase();
        await this.db.seed();
      },

      async teardown() {
        await this.db.close();
      },

      benchmarks: {
        'Read Query': async function () {
          return await this.db.query('SELECT * FROM users LIMIT 100');
        },

        'Write Query': async function () {
          return await this.db.insert({ name: 'Test User' });
        },
      },
    },
  },
};
```

## Tagging and Filtering
### Tag Cascading

Tags cascade from file → suite → task levels:
```js
export default {
  // File-level tags (inherited by all suites and tasks)
  tags: ['performance', 'core'],

  suites: {
    'String Operations': {
      // Suite-level tags (inherited by all tasks in this suite)
      tags: ['string', 'fast'],

      benchmarks: {
        // Task inherits: ['performance', 'core', 'string', 'fast', 'regex']
        'RegExp Test': {
          fn: () => /pattern/.test(str),
          tags: ['regex'], // Task-specific tags
        },

        // Task inherits: ['performance', 'core', 'string', 'fast']
        'String Includes': () => str.includes('pattern'),
      },
    },

    'Array Operations': {
      tags: ['array', 'slow'],

      benchmarks: {
        // Task inherits: ['performance', 'core', 'array', 'slow']
        'Array spread': () => {
          let arr = [];
          for (let i = 0; i < 1000; i++) {
            arr = [...arr, i];
          }
          return arr;
        },
      },
    },
  },
};
```

### Filtering Examples
```bash
# Run only fast benchmarks
modestbench --tags fast
# Runs: 'RegExp Test', 'String Includes'

# Run string OR array benchmarks
modestbench --tags string,array
# Runs: All tasks in 'String Operations' and 'Array Operations'

# Exclude slow benchmarks
modestbench --exclude-tags slow
# Runs: Only 'String Operations' tasks

# Combine: run fast benchmarks except experimental
modestbench --tags fast --exclude-tags experimental
```

### Suite Lifecycle with Filtering
Suite `setup()` and `teardown()` only run if at least one task in the suite matches the filter:
```js
export default {
  suites: {
    'Expensive Setup': {
      setup() {
        console.log('This only runs if at least one task will execute');
        this.expensiveResource = createExpensiveResource();
      },

      teardown() {
        console.log('This only runs if setup ran');
        this.expensiveResource.destroy();
      },

      benchmarks: {
        'Fast Task': {
          fn() { /* ... */ },
          tags: ['fast'],
        },
        'Slow Task': {
          fn() { /* ... */ },
          tags: ['slow'],
        },
      },
    },
  },
};
```

```bash
# Setup and teardown run (Fast Task matches)
modestbench --tags fast

# Setup and teardown DON'T run (both tasks excluded, so nothing matches)
modestbench --exclude-tags fast,slow
```

## Custom Task Configuration
Configure individual tasks with specific settings:
```js
export default {
  suites: {
    'Custom Configs': {
      benchmarks: {
        // Default configuration
        'Standard Task': () => someOperation(),

        // Custom iterations
        'High Sample Task': {
          fn: () => criticalOperation(),
          config: {
            iterations: 10000,
            warmup: 200,
          },
        },

        // Custom timeout for slow operations
        'Slow Operation': {
          fn: async () => await slowAsyncOperation(),
          config: {
            timeout: 60000, // 60 seconds
            iterations: 10, // Fewer samples
          },
        },
      },
    },
  },
};
```

## Environment-Specific Benchmarks
Use JavaScript config files for dynamic configuration:
```js
const isCI = process.env.CI === 'true';
const isProd = process.env.NODE_ENV === 'production';

export default {
  iterations: isCI ? 5000 : 100,
  warmup: isCI ? 100 : 0,
  reporters: isCI ? ['json', 'csv'] : ['simple'], // JSON/CSV in CI, simple reporter locally
  quiet: isCI,
  outputDir: isCI ? './benchmark-results' : undefined,

  // Only run critical benchmarks in CI
  tags: isCI ? ['critical'] : [],

  // Exclude slow benchmarks in development
  excludeTags: isProd ? [] : ['slow'],
};
```

## CI/CD Integration
### GitHub Actions

```yaml
name: Performance Tests
on: [push, pull_request]

jobs:
  benchmark:
    runs-on: ubuntu-latest
    steps:
      - uses: actions/checkout@v3

      - uses: actions/setup-node@v3
        with:
          node-version: 20

      - name: Install dependencies
        run: npm ci

      - name: Build project
        run: npm run build

      - name: Run benchmarks
        run: |
          modestbench \
            --reporters json,csv \
            --output ./results \
            --quiet \
            --tags critical

      - name: Upload results
        uses: actions/upload-artifact@v3
        with:
          name: benchmark-results
          path: ./results/

      - name: Check for regressions
        run: node scripts/check-regression.js
```

### Performance Regression Detection
Section titled “Performance Regression Detection”import { execSync } from 'child_process';import { readFileSync } from 'fs';
// Run current benchmarksexecSync('modestbench --reporters json --output ./current', { stdio: 'inherit',});
const current = JSON.parse( readFileSync('./current/results.json', 'utf8'));
// Load baseline resultsconst baseline = JSON.parse( readFileSync('./baseline/results.json', 'utf8'));
let hasRegression = false;
// Check for significant regressionsfor (const result of current.results) { const baselineResult = baseline.results.find( (r) => r.file === result.file && r.task === result.task );
if (baselineResult) { const regression = (baselineResult.opsPerSecond - result.opsPerSecond) / baselineResult.opsPerSecond;
if (regression > 0.1) { // 10% regression threshold console.error( `❌ Performance regression in ${result.task}: ${( regression * 100 ).toFixed(1)}% slower` ); console.error(` Baseline: ${baselineResult.opsPerSecond.toFixed(2)} ops/sec`); console.error(` Current: ${result.opsPerSecond.toFixed(2)} ops/sec`); hasRegression = true; } else if (regression < -0.1) { // 10% improvement console.log( `✅ Performance improvement in ${result.task}: ${( Math.abs(regression) * 100 ).toFixed(1)}% faster` ); } }}
if (hasRegression) { console.error('\n❌ Performance regressions detected!'); process.exit(1);} else { console.log('\n✅ No performance regressions detected!');}Historical Tracking
ModestBench automatically saves results to `.modestbench/history/`. Use the history commands for analysis:
### View History

```bash
# List recent runs
modestbench history list

# Show specific run
modestbench history show run-2025-10-07-001

# Compare two runs
modestbench history compare \
  run-2025-10-07-001 \
  run-2025-10-07-002
```

### Export Historical Data
```bash
# Export to CSV for analysis
modestbench history export \
  --format csv \
  --output historical-data.csv

# Export to JSON
modestbench history export \
  --format json \
  --output historical-data.json
```

### Cleanup Old Data
```bash
# Clean runs older than 30 days
modestbench history clean --older-than 30d

# Keep only last 10 runs
modestbench history clean --keep 10

# Clean by size
modestbench history clean --max-size 100mb
```

## Programmatic API
Use ModestBench programmatically in your own tools:
```js
import { modestbench, HumanReporter } from 'modestbench';

// Initialize the engine
const engine = modestbench();

// Register reporters
engine.registerReporter('human', new HumanReporter());

// Execute benchmarks
const result = await engine.execute({
  pattern: '**/*.bench.js',
  iterations: 1000,
  warmup: 50,
  reporters: ['human'],
});

// Process results
if (result.summary.failedTasks > 0) {
  console.error('Some benchmarks failed');
  process.exit(1);
}
```
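To go beyond the summary, you can also iterate per-task results. The sketch below assumes the result exposes a `results` array with the same `task`/`opsPerSecond` fields as the JSON reporter output used in the regression script above; verify the exact shape against your installed version.

```js
// Print per-task throughput from the execute() result.
// Assumes result.results mirrors the JSON reporter's shape (an assumption).
for (const task of result.results) {
  console.log(`${task.task}: ${task.opsPerSecond.toFixed(2)} ops/sec`);
}
```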
## Handling Fast Operations

Extremely fast operations (<1 ns) can cause overflow errors. ModestBench handles this automatically:
```js
export default {
  suites: {
    'Ultra Fast Operations': {
      benchmarks: {
        // ModestBench will automatically adjust the time budget for very fast ops
        'Variable Read': () => {
          const x = 42;
          return x;
        },

        // For ultra-fast operations, reduce iterations
        'Constant Return': {
          fn: () => 42,
          config: {
            iterations: 100, // Lower sample count
          },
        },
      },
    },
  },
};
```

## Memory Profiling Context
Benchmark results include memory information:
{ "environment": { "memory": { "total": 51539607552, "totalGB": 48.0, "free": 12884901888, "freeGB": 12.0 } }}Track memory usage across runs to identify memory-intensive operations.
## Concurrent Execution

Run benchmark files concurrently for faster execution:

```bash
modestbench --concurrent
```

Considerations:
- Files run in parallel, but tasks within a file run sequentially
- May cause resource contention on systems with limited CPU/memory
- Results may vary between runs due to system load
- Not recommended for accurate performance measurements
## Troubleshooting

### High Margin of Error

If benchmarks show a high margin of error (>5%):
- Increase warmup iterations: `--warmup 100`
- Increase sample size: `--iterations 2000`
- Close other applications to reduce system load
- Use time-based limiting: `--time 10000 --limit-by time`
### Timeouts

If benchmarks time out:

- Increase the timeout: `--timeout 60000`
- Reduce iterations: `--iterations 10`
- Check for infinite loops in benchmark code
### Inconsistent Results

If results vary significantly between runs:

- Use warmup iterations: `--warmup 100`
- Increase sample size: `--iterations 5000`
- Run in isolation (no other processes)
- Check for async operations completing outside the benchmark scope
## Best Practices

### 1. Isolate Benchmarks

Each benchmark should test one specific operation:
```js
// ❌ Bad: Testing multiple things
'Bad Benchmark': () => {
  const arr = [];
  for (let i = 0; i < 1000; i++) {
    arr.push(i);
  }
  return arr.sort();
},

// ✅ Good: Isolated operations
'Array Push': () => {
  const arr = [];
  for (let i = 0; i < 1000; i++) {
    arr.push(i);
  }
  return arr;
},
'Array Sort': () => {
  const arr = Array.from({ length: 1000 }, (_, i) => i);
  return arr.sort();
},
```

### 2. Avoid Side Effects
Keep benchmarks pure and repeatable:
```js
// ❌ Bad: Modifying external state
let counter = 0;
'Bad Benchmark': () => {
  counter++;
  return counter;
},

// ✅ Good: No external state
'Good Benchmark': () => {
  let counter = 0;
  counter++;
  return counter;
},
```

### 3. Use Warmup for JIT
Enable warmup for operations that benefit from JIT optimization:
```js
export default {
  suites: {
    'JIT-Optimized Operations': {
      benchmarks: {
        'Math Operations': {
          fn: () => Math.sqrt(42) * Math.PI,
          config: {
            warmup: 100,
            iterations: 5000,
          },
        },
      },
    },
  },
};
```

### 4. Tag Strategically
Use tags to organize and filter benchmarks:
```js
export default {
  tags: ['core'], // Project-wide tag

  suites: {
    'Critical Path': {
      tags: ['critical', 'fast'], // Important, quick benchmarks
      benchmarks: { /* ... */ },
    },

    'Edge Cases': {
      tags: ['edge-case', 'slow'], // Thorough but slow tests
      benchmarks: { /* ... */ },
    },
  },
};
```

## Next Steps
- Review Configuration for all options
- Check CLI Reference for command details
- See Output Formats for reporter integration
- Read Architecture for internals