# Advanced Usage

## Benchmark Engines

ModestBench provides two engines with different performance characteristics and statistical approaches.
### Engine Selection

Choose an engine based on your requirements:
```bash
# Tinybench engine (default) - fast development iteration
modestbench --engine tinybench

# Accurate engine - high-precision measurements
node --allow-natives-syntax ./node_modules/.bin/modestbench --engine accurate
```

### Statistical Improvements
Both engines now use IQR (Interquartile Range) outlier removal to filter extreme values caused by:
- Garbage collection pauses
- System interruptions
- Background processes
- OS scheduler variations
This results in more stable and reliable measurements compared to raw statistical analysis.
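For reference, here is a minimal sketch of that IQR filter: samples outside `[Q1 - 1.5×IQR, Q3 + 1.5×IQR]` are dropped before statistics are computed. This is an illustration, not ModestBench's internal code; the linear-interpolation quantile method is an assumption.

```js
// Illustrative IQR outlier filter (not ModestBench's internal implementation).
// Drops samples outside [Q1 - 1.5*IQR, Q3 + 1.5*IQR].
function removeOutliers(samples) {
  const sorted = [...samples].sort((a, b) => a - b);
  const quantile = (q) => {
    const pos = (sorted.length - 1) * q;
    const base = Math.floor(pos);
    const rest = pos - base;
    return sorted[base] + rest * ((sorted[base + 1] ?? sorted[base]) - sorted[base]);
  };
  const iqr = quantile(0.75) - quantile(0.25);
  const lo = quantile(0.25) - 1.5 * iqr;
  const hi = quantile(0.75) + 1.5 * iqr;
  return sorted.filter((s) => s >= lo && s <= hi);
}
```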
### AccurateEngine Statistical Features

The accurate engine provides enhanced statistical analysis:
- **V8 Optimization Guards**: uses V8 intrinsics (`%NeverOptimizeFunction`) to prevent JIT compiler interference with measurements
- **IQR Outlier Removal**: automatically removes extreme outliers (beyond Q1 - 1.5×IQR and Q3 + 1.5×IQR)
- **Comprehensive Statistics**:
  - Mean, min, max execution times
  - Standard deviation and variance
  - Coefficient of Variation (CV): measures relative variability (`stdDev / mean × 100`)
  - 95th and 99th percentiles
  - Margin of error (95% confidence interval)
### Coefficient of Variation (CV)

The CV metric helps assess benchmark quality:

- CV < 5% → Excellent (very stable)
- CV 5-10% → Good (acceptable variance)
- CV 10-20% → Fair (consider more samples)
- CV > 20% → Poor (investigate noise sources)

Example output showing CV:
```bash
$ modestbench --engine accurate --allow-natives-syntax --reporters json
```

```jsonc
{
  "name": "Array.push()",
  "mean": 810050,          // nanoseconds
  "stdDev": 19842,
  "cv": 2.45,              // 2.45% - excellent stability
  "marginOfError": 0.024,
  "p95": 845200,
  "p99": 862100
}
```
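If you want to reproduce the `cv` figure from raw samples, the calculation is straightforward. The sketch below follows the `stdDev / mean × 100` formula above; the sample values are made up for illustration.

```js
// Compute mean, standard deviation, and CV (%) from raw samples.
// cv = stdDev / mean × 100, matching the formula above.
function coefficientOfVariation(samples) {
  const mean = samples.reduce((sum, s) => sum + s, 0) / samples.length;
  const variance =
    samples.reduce((sum, s) => sum + (s - mean) ** 2, 0) / samples.length;
  const stdDev = Math.sqrt(variance);
  return { mean, stdDev, cv: (stdDev / mean) * 100 };
}

coefficientOfVariation([810050, 798200, 825400]);
// → { mean: ~811217, stdDev: ~11135, cv: ~1.37 } - excellent stability
```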
### Performance Comparison

Real-world comparison using `examples/benchmarks`:
```bash
# Tinybench (fast iteration)
$ modestbench --engine tinybench --reporters json
# Typical run time: 3-5 seconds for 5 benchmark files

# Accurate (high precision)
$ node --allow-natives-syntax ./node_modules/.bin/modestbench --engine accurate --reporters json
# Typical run time: 8-12 seconds for 5 benchmark files
```

The accurate engine takes roughly 2-3x longer but provides:
- More consistent results between runs
- Better outlier filtering with V8 guards
- Higher confidence in micro-optimizations
### Choosing the Right Engine

| Use Case | Recommended Engine |
|---|---|
| Development iteration | tinybench |
| CI/CD regression tests | tinybench |
| Blog post/publication | accurate |
| Library optimization | accurate |
| Micro-benchmark comparison | accurate |
| Algorithm selection | Either (results typically consistent) |
## Multiple Suites

Organize related benchmarks into separate suites with independent setup and teardown:
```js
const state = {
  data: [],
  sortedData: [],
};

export default {
  suites: {
    Sorting: {
      setup() {
        state.data = generateTestData(1000);
      },
      teardown() {
        state.data = [];
      },
      benchmarks: {
        'Quick Sort': () => quickSort(state.data),
        'Merge Sort': () => mergeSort(state.data),
        'Bubble Sort': () => bubbleSort(state.data),
      },
    },

    Searching: {
      setup() {
        state.sortedData = generateSortedData(10000);
      },
      teardown() {
        state.sortedData = [];
      },
      benchmarks: {
        'Binary Search': () => binarySearch(state.sortedData, 5000),
        'Linear Search': () => linearSearch(state.sortedData, 5000),
        'Jump Search': () => jumpSearch(state.sortedData, 5000),
      },
    },
  },
};
```

### Suite Lifecycle
1. `setup()` - called once before any tasks in the suite run
2. Tasks execute - each task runs with its configured iterations
3. `teardown()` - called once after all tasks complete
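As a quick illustration of this ordering (the benchmark bodies and log messages are only illustrative):

```js
export default {
  suites: {
    Lifecycle: {
      setup() {
        console.log('1: setup runs once, before any task');
      },
      teardown() {
        console.log('3: teardown runs once, after all tasks');
      },
      benchmarks: {
        // 2: each task then runs for its configured iterations
        'Task A': () => JSON.stringify({ ok: true }),
        'Task B': () => JSON.parse('{"ok":true}'),
      },
    },
  },
};
```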
## Async Operations

ModestBench fully supports asynchronous benchmarks.

### Async Functions
Section titled “Async Functions”export default { suites: { 'Async Performance': { benchmarks: { // Simple async benchmark 'Promise.resolve()': async () => { return await Promise.resolve('test'); },
// With configuration 'Fetch Simulation': { async fn() { const response = await simulateApiCall(); return response.json(); }, config: { iterations: 100, // Fewer iterations for slow operations }, }, }, }, },};Async Setup/Teardown
```js
export default {
  suites: {
    'Database Operations': {
      async setup() {
        this.db = await connectDatabase();
        await this.db.seed();
      },

      async teardown() {
        await this.db.close();
      },

      benchmarks: {
        'Read Query': async function () {
          return await this.db.query('SELECT * FROM users LIMIT 100');
        },

        'Write Query': async function () {
          return await this.db.insert({ name: 'Test User' });
        },
      },
    },
  },
};
```

## Tagging and Filtering
### Tag Cascading

Tags cascade from file → suite → task levels:
```js
export default {
  // File-level tags (inherited by all suites and tasks)
  tags: ['performance', 'core'],

  suites: {
    'String Operations': {
      // Suite-level tags (inherited by all tasks in this suite)
      tags: ['string', 'fast'],

      benchmarks: {
        // Task inherits: ['performance', 'core', 'string', 'fast', 'regex']
        'RegExp Test': {
          fn: () => /pattern/.test(str),
          tags: ['regex'], // Task-specific tags
        },

        // Task inherits: ['performance', 'core', 'string', 'fast']
        'String Includes': () => str.includes('pattern'),
      },
    },

    'Array Operations': {
      tags: ['array', 'slow'],

      benchmarks: {
        // Task inherits: ['performance', 'core', 'array', 'slow']
        'Array spread': () => {
          let arr = [];
          for (let i = 0; i < 1000; i++) {
            arr = [...arr, i];
          }
          return arr;
        },
      },
    },
  },
};
```

### Filtering Examples
```bash
# Run only fast benchmarks
modestbench --tags fast
# Runs: 'RegExp Test', 'String Includes'

# Run string OR array benchmarks
modestbench --tags string,array
# Runs: All tasks in 'String Operations' and 'Array Operations'

# Exclude slow benchmarks
modestbench --exclude-tags slow
# Runs: Only 'String Operations' tasks

# Combine: run fast benchmarks except experimental
modestbench --tags fast --exclude-tags experimental
```

### Suite Lifecycle with Filtering
Suite `setup()` and `teardown()` only run if at least one task in the suite matches the filter:
```js
export default {
  suites: {
    'Expensive Setup': {
      setup() {
        console.log('This only runs if at least one task will execute');
        this.expensiveResource = createExpensiveResource();
      },

      teardown() {
        console.log('This only runs if setup ran');
        this.expensiveResource.destroy();
      },

      benchmarks: {
        'Fast Task': {
          fn() { /* ... */ },
          tags: ['fast'],
        },
        'Slow Task': {
          fn() { /* ... */ },
          tags: ['slow'],
        },
      },
    },
  },
};
```

```bash
# Setup and teardown run (Fast Task matches)
modestbench --tags fast

# Setup and teardown DON'T run (both tasks excluded, so nothing matches)
modestbench --exclude-tags fast,slow
```

## Custom Task Configuration
Configure individual tasks with specific settings:
```js
export default {
  suites: {
    'Custom Configs': {
      benchmarks: {
        // Default configuration
        'Standard Task': () => someOperation(),

        // Custom iterations
        'High Sample Task': {
          fn: () => criticalOperation(),
          config: {
            iterations: 10000,
            warmup: 200,
          },
        },

        // Custom timeout for slow operations
        'Slow Operation': {
          fn: async () => await slowAsyncOperation(),
          config: {
            timeout: 60000, // 60 seconds
            iterations: 10, // Fewer samples
          },
        },
      },
    },
  },
};
```

## Environment-Specific Benchmarks
Use JavaScript config files for dynamic configuration:
```js
const isCI = process.env.CI === 'true';
const isProd = process.env.NODE_ENV === 'production';

export default {
  iterations: isCI ? 5000 : 100,
  warmup: isCI ? 100 : 0,
  reporters: isCI ? ['json', 'csv'] : ['simple'], // JSON/CSV in CI, simple reporter locally
  quiet: isCI,
  outputDir: isCI ? './benchmark-results' : undefined,

  // Only run critical benchmarks in CI
  tags: isCI ? ['critical'] : [],

  // Exclude slow benchmarks in development
  excludeTags: isProd ? [] : ['slow'],
};
```

## CI/CD Integration
### GitHub Actions

```yaml
name: Performance Tests
on: [push, pull_request]

jobs:
  benchmark:
    runs-on: ubuntu-latest
    steps:
      - uses: actions/checkout@v3

      - uses: actions/setup-node@v3
        with:
          node-version: 20

      - name: Install dependencies
        run: npm ci

      - name: Build project
        run: npm run build

      - name: Run benchmarks
        run: |
          modestbench \
            --reporters json,csv \
            --output ./results \
            --quiet \
            --tags critical

      - name: Upload results
        uses: actions/upload-artifact@v3
        with:
          name: benchmark-results
          path: ./results/

      - name: Check for regressions
        run: node scripts/check-regression.js
```

### Performance Regression Detection
Section titled “Performance Regression Detection”import { execSync } from 'child_process';import { readFileSync } from 'fs';
// Run current benchmarksexecSync('modestbench --reporters json --output ./current', { stdio: 'inherit',});
const current = JSON.parse( readFileSync('./current/results.json', 'utf8'));
// Load baseline resultsconst baseline = JSON.parse( readFileSync('./baseline/results.json', 'utf8'));
let hasRegression = false;
// Check for significant regressionsfor (const result of current.results) { const baselineResult = baseline.results.find( (r) => r.file === result.file && r.task === result.task );
if (baselineResult) { const regression = (baselineResult.opsPerSecond - result.opsPerSecond) / baselineResult.opsPerSecond;
if (regression > 0.1) { // 10% regression threshold console.error( `❌ Performance regression in ${result.task}: ${( regression * 100 ).toFixed(1)}% slower` ); console.error(` Baseline: ${baselineResult.opsPerSecond.toFixed(2)} ops/sec`); console.error(` Current: ${result.opsPerSecond.toFixed(2)} ops/sec`); hasRegression = true; } else if (regression < -0.1) { // 10% improvement console.log( `✅ Performance improvement in ${result.task}: ${( Math.abs(regression) * 100 ).toFixed(1)}% faster` ); } }}
if (hasRegression) { console.error('\n❌ Performance regressions detected!'); process.exit(1);} else { console.log('\n✅ No performance regressions detected!');}Historical Tracking
ModestBench automatically saves results to `.modestbench/history/`. Use the history commands for analysis:
### View History

```bash
# List recent runs
modestbench history list

# Show specific run
modestbench history show run-2025-10-07-001

# Compare two runs
modestbench history compare \
  run-2025-10-07-001 \
  run-2025-10-07-002
```

### Export Historical Data
```bash
# Export to CSV for analysis
modestbench history export \
  --format csv \
  --output historical-data.csv

# Export to JSON
modestbench history export \
  --format json \
  --output historical-data.json
```

### Cleanup Old Data
```bash
# Clean runs older than 30 days
modestbench history clean --older-than 30d

# Keep only last 10 runs
modestbench history clean --keep 10

# Clean by size
modestbench history clean --max-size 100mb
```

## Programmatic API
Use ModestBench programmatically in your own tools:
```js
import { modestbench, HumanReporter } from 'modestbench';

// Initialize the engine
const engine = modestbench();

// Register reporters
engine.registerReporter('human', new HumanReporter());

// Execute benchmarks
const result = await engine.execute({
  pattern: '**/*.bench.js',
  iterations: 1000,
  warmup: 50,
  reporters: ['human'],
});

// Process results
if (result.summary.failedTasks > 0) {
  console.error('Some benchmarks failed');
  process.exit(1);
}
```
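To go beyond the summary, you can also iterate per-task results. The sketch below assumes the result exposes a `results` array with the same `task`/`opsPerSecond` fields as the JSON reporter output used in the regression script above; verify the exact shape against your installed version.

```js
// Print per-task throughput from the execute() result.
// Assumes result.results mirrors the JSON reporter's shape (an assumption).
for (const task of result.results) {
  console.log(`${task.task}: ${task.opsPerSecond.toFixed(2)} ops/sec`);
}
```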
## Handling Fast Operations

Extremely fast operations (<1 ns) can cause overflow errors. ModestBench handles this automatically:
```js
export default {
  suites: {
    'Ultra Fast Operations': {
      benchmarks: {
        // ModestBench will automatically adjust the time budget for very fast ops
        'Variable Read': () => {
          const x = 42;
          return x;
        },

        // For ultra-fast operations, reduce iterations
        'Constant Return': {
          fn: () => 42,
          config: {
            iterations: 100, // Lower sample count
          },
        },
      },
    },
  },
};
```

## Memory Profiling Context
Benchmark results include memory information:
{ "environment": { "memory": { "total": 51539607552, "totalGB": 48.0, "free": 12884901888, "freeGB": 12.0 } }}Track memory usage across runs to identify memory-intensive operations.
## Concurrent Execution

Run benchmark files concurrently for faster execution:

```bash
modestbench --concurrent
```

Considerations:
- Files run in parallel, but tasks within a file run sequentially
- May cause resource contention on systems with limited CPU/memory
- Results may vary between runs due to system load
- Not recommended for accurate performance measurements
## Troubleshooting

### High Margin of Error

If benchmarks show a high margin of error (>5%):
- Increase warmup iterations: `--warmup 100`
- Increase sample size: `--iterations 2000`
- Close other applications to reduce system load
- Use time-based limiting: `--time 10000 --limit-by time`
### Timeouts

If benchmarks time out:

- Increase the timeout: `--timeout 60000`
- Reduce iterations: `--iterations 10`
- Check for infinite loops in benchmark code
### Inconsistent Results

If results vary significantly between runs:

- Use warmup iterations: `--warmup 100`
- Increase sample size: `--iterations 5000`
- Run in isolation (no other processes)
- Check for async operations completing outside the benchmark scope
## Best Practices

### 1. Isolate Benchmarks

Each benchmark should test one specific operation:
```js
// ❌ Bad: Testing multiple things
'Bad Benchmark': () => {
  const arr = [];
  for (let i = 0; i < 1000; i++) {
    arr.push(i);
  }
  return arr.sort();
},

// ✅ Good: Isolated operations
'Array Push': () => {
  const arr = [];
  for (let i = 0; i < 1000; i++) {
    arr.push(i);
  }
  return arr;
},
'Array Sort': () => {
  const arr = Array.from({ length: 1000 }, (_, i) => i);
  return arr.sort();
},
```

### 2. Avoid Side Effects
Keep benchmarks pure and repeatable:
```js
// ❌ Bad: Modifying external state
let counter = 0;
'Bad Benchmark': () => {
  counter++;
  return counter;
},

// ✅ Good: No external state
'Good Benchmark': () => {
  let counter = 0;
  counter++;
  return counter;
},
```

### 3. Use Warmup for JIT
Enable warmup for operations that benefit from JIT optimization:
```js
export default {
  suites: {
    'JIT-Optimized Operations': {
      benchmarks: {
        'Math Operations': {
          fn: () => Math.sqrt(42) * Math.PI,
          config: {
            warmup: 100,
            iterations: 5000,
          },
        },
      },
    },
  },
};
```

### 4. Tag Strategically
Use tags to organize and filter benchmarks:
```js
export default {
  tags: ['core'], // Project-wide tag

  suites: {
    'Critical Path': {
      tags: ['critical', 'fast'], // Important, quick benchmarks
      benchmarks: { /* ... */ },
    },

    'Edge Cases': {
      tags: ['edge-case', 'slow'], // Thorough but slow tests
      benchmarks: { /* ... */ },
    },
  },
};
```

## Next Steps
- Review Configuration for all options
- Check CLI Reference for command details
- See Output Formats for reporter integration
- Read Architecture for internals