Architecture

ModestBench is a TypeScript-based benchmarking framework that wraps tinybench to provide structure, CLI tooling, historical tracking, and multiple output formats for JavaScript/TypeScript performance testing. The architecture follows a dependency injection pattern with clear separation of concerns across subsystems.

Core Technology: Node.js 18+, TypeScript, tinybench 2.6.0

Key Dependencies:

  • tinybench - Core benchmarking engine
  • yargs - CLI argument parsing
  • glob - File discovery
  • cli-progress - Terminal progress bars
  • consola - Logging

| Subsystem | Purpose | Key Files | Stateful? |
| --- | --- | --- | --- |
| CLI | Command-line interface and command routing | cli/index.ts, cli/commands/*.ts | No |
| Core | Benchmark orchestration and execution | core/engine.ts, core/loader.ts, core/error-manager.ts | Yes (ErrorManager) |
| Config | Configuration loading and merging | config/manager.ts | No |
| Progress | Real-time progress tracking | progress/manager.ts | Yes |
| Reporters | Output formatting (human/JSON/CSV) | reporters/*.ts | Yes (HumanReporter) |
| Storage | Historical benchmark data persistence | storage/history.ts | Yes |
| Types | TypeScript interfaces and types | types/*.ts | No |

Entry Point: src/cli/index.ts

Source: src/cli/commands/run.ts

The CLI creates a CliContext object (lines 44-52 in src/cli/index.ts) containing all initialized services:

export interface CliContext {
  readonly configManager: ConfigurationManager;
  readonly engine: BenchmarkEngine;
  readonly errorManager: ErrorManager;
  readonly historyStorage: HistoryStorage;
  readonly options: GlobalOptions;
  readonly progressManager: ProgressManager;
  readonly reporterRegistry: ReporterRegistry;
}

This context is passed to all command handlers, enabling:

  • Testability: Easy to mock dependencies
  • Flexibility: Services can be swapped without changing commands
  • Separation of Concerns: Each service has a single responsibility
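
The testability benefit can be illustrated with a minimal sketch. The `SketchEngine`, `SketchLogger`, `SketchContext`, and `handleRunSketch` names below are hypothetical stand-ins rather than ModestBench's actual types; the point is only that a handler which receives its services through a context can be exercised with in-memory fakes.

```typescript
// Hypothetical, trimmed-down service interfaces (not the real ModestBench types).
interface SketchEngine {
  execute(pattern: string): number; // returns the number of tasks run
}
interface SketchLogger {
  info(message: string): void;
}
interface SketchContext {
  readonly engine: SketchEngine;
  readonly logger: SketchLogger;
}

// A command handler depends only on the context it is handed.
function handleRunSketch(ctx: SketchContext, pattern: string): number {
  const tasks = ctx.engine.execute(pattern);
  ctx.logger.info(`ran ${tasks} tasks for ${pattern}`);
  return tasks;
}

// In a test, both services are replaced with in-memory fakes.
const loggedMessages: string[] = [];
const fakeContext: SketchContext = {
  engine: { execute: () => 3 },
  logger: {
    info: (message) => {
      loggedMessages.push(message);
    },
  },
};
const taskCount = handleRunSketch(fakeContext, '**/*.bench.js');
```

Swapping the fake engine for a real one requires no change to the handler, which is the flexibility the list above describes.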

ModestBench uses an abstract base class pattern to support multiple benchmark execution strategies. The architecture consists of three layers:

Abstract Base Class: src/core/engine.ts

  • Provides all orchestration logic: file discovery, validation, suite/task iteration, progress tracking, reporter lifecycle
  • Defines single abstract method: executeBenchmarkTask() for concrete engines to implement
  • Handles setup/teardown, error recovery, history storage, and result aggregation

Concrete Implementations: src/core/engines/

  1. TinybenchEngine - Wraps external tinybench library
  2. AccurateEngine - Custom measurement implementation with V8 optimization guards
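
A rough sketch of the pattern, under simplifying assumptions: the `SketchTaskResult` shape, the synchronous signature, and the `FixedCountEngine` class are all hypothetical, and the real base class handles far more (discovery, reporters, history). It shows only how shared orchestration delegates to a single abstract hook.

```typescript
// Hypothetical simplified shapes; the real TaskResult carries many more fields.
interface SketchTaskResult {
  name: string;
  samples: number[];
}
type BenchFn = () => unknown;

abstract class EngineBaseSketch {
  // Shared orchestration: iterate tasks and delegate measurement to the hook.
  run(tasks: Record<string, BenchFn>): SketchTaskResult[] {
    return Object.keys(tasks).map((name) =>
      this.executeBenchmarkTask(name, tasks[name]),
    );
  }

  // The single hook each concrete engine implements.
  protected abstract executeBenchmarkTask(name: string, fn: BenchFn): SketchTaskResult;
}

// A toy engine that always takes five wall-clock samples.
class FixedCountEngine extends EngineBaseSketch {
  protected executeBenchmarkTask(name: string, fn: BenchFn): SketchTaskResult {
    const samples: number[] = [];
    for (let i = 0; i < 5; i++) {
      const start = Date.now();
      fn();
      samples.push(Date.now() - start);
    }
    return { name, samples };
  }
}

const sketchResults = new FixedCountEngine().run({ noop: () => undefined });
```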

3.2 TinybenchEngine: Wrapper Implementation

Location: src/core/engines/tinybench-engine.ts

Strategy: Thin wrapper around the tinybench library

How It Works:

  1. Creates a Bench instance from tinybench with configured time/iterations
  2. Adds the benchmark function to the bench instance
  3. Runs the benchmark (tinybench handles timing internally)
  4. Extracts raw samples from tinybench results
  5. Post-processes samples with IQR outlier removal
  6. Calculates statistics from cleaned samples
  7. Returns standardized TaskResult

Key Features:

  • Leverages tinybench’s mature timing and iteration logic
  • Handles tinybench’s “Invalid array length” errors for extremely fast operations (automatic retry with minimal time)
  • Supports abort signals for task cancellation
  • Progress updates during execution (500ms interval)

Configuration Mapping (limitBy modes):

| ModestBench Config | TinybenchEngine Behavior |
| --- | --- |
| limitBy: 'all' | Both time AND iterations must complete (default) |
| limitBy: 'any' | Minimal time (1ms), iterations-limited |
| limitBy: 'time' | Time-limited, minimal iterations (1) |
| limitBy: 'iterations' | Iterations-limited, minimal time (1ms) |
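
The table can be read as a small mapping function. The sketch below is a hypothetical helper (`resolveLimits` is not a ModestBench function) and assumes tinybench keeps running until both its time and iteration thresholds are met, so setting one of them to a minimal value effectively disables it.

```typescript
type LimitBy = 'all' | 'any' | 'time' | 'iterations';

interface EffectiveLimits {
  time: number; // milliseconds
  iterations: number;
}

// Hypothetical helper mirroring the table above; not the engine's actual code.
function resolveLimits(limitBy: LimitBy, time: number, iterations: number): EffectiveLimits {
  const MINIMAL_TIME_MS = 1;
  const MINIMAL_ITERATIONS = 1;
  switch (limitBy) {
    case 'all':
      // Both thresholds are passed through unchanged.
      return { time, iterations };
    case 'any':
    case 'iterations':
      // Time is reduced to the minimum, so iterations decide when to stop.
      return { time: MINIMAL_TIME_MS, iterations };
    case 'time':
      // Iterations are reduced to the minimum, so time decides when to stop.
      return { time, iterations: MINIMAL_ITERATIONS };
  }
}
```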

Location: src/core/engines/accurate-engine.ts

Strategy: Custom measurement using Node.js process.hrtime.bigint and V8 optimization guards

Inspiration: Adapted from bench-node measurement techniques

How It Works:

  1. Check V8 intrinsics availability (requires --allow-natives-syntax flag)
  2. Calculate adaptive iterations based on quick 30-iteration test
  3. Run optional warmup (min 10 samples or warmup time)
  4. Main benchmark loop:
    • Execute function N times in a batch (max 10,000 per round)
    • Time each batch with process.hrtime.bigint
    • Calculate per-operation duration
    • Push samples to array
    • Adjust iterations for next round based on remaining time
  5. Apply IQR outlier removal to raw samples
  6. Calculate statistics from cleaned samples
  7. Return standardized TaskResult
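
Step 4 can be sketched as follows. This is a simplified illustration rather than the engine's code: `measureBatches` is a hypothetical name, warmup, abort handling, and V8 guards are omitted, and the starting batch size of 30 is an assumption that merely echoes the probe size mentioned above.

```typescript
// Hypothetical helper: times `fn` in batches and returns per-operation durations in ns.
function measureBatches(
  fn: () => unknown,
  targetMs: number,
  now: () => bigint = () => process.hrtime.bigint(),
): number[] {
  const MAX_BATCH = 10_000; // per-round cap from the list above
  const budgetNs = BigInt(Math.round(targetMs * 1e6));
  const samples: number[] = [];
  const startedAt = now();
  let batchSize = 30; // assumed starting size

  while (now() - startedAt < budgetNs) {
    // Time one whole batch, then derive the per-operation duration.
    const batchStart = now();
    for (let i = 0; i < batchSize; i++) fn();
    const perOpNs = Number(now() - batchStart) / batchSize;
    samples.push(perOpNs);

    // Size the next batch to the remaining budget, clamped to [1, MAX_BATCH].
    const remainingNs = Number(budgetNs - (now() - startedAt));
    const fit = perOpNs > 0 ? Math.floor(remainingNs / perOpNs) : MAX_BATCH;
    batchSize = Math.max(1, Math.min(MAX_BATCH, fit));
  }
  return samples;
}

const batchSamples = measureBatches(() => Math.sqrt(2), 5);
```

Timing a batch rather than a single call keeps the timer overhead small relative to the measured work, which matters most for very fast functions.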

V8 Optimization Guards (when available):

// Created using V8 intrinsics
const DoNotOptimize = new Function('x', 'return x');
const NeverOptimize = new Function(
  'fn',
  '%NeverOptimizeFunction(fn); return fn;',
);

// Prevents V8 from optimizing away benchmark code
for (let i = 0; i < iterations; i++) {
  const result = fn();
  guardedDoNotOptimize(result); // Forces V8 to keep result
}

Key Features:

  • Higher accuracy through V8 optimization guards (prevents JIT artifacts)
  • Adaptive iteration calculation matches operation speed to target duration
  • Nanosecond precision using BigInt hrtime
  • Fallback mode when --allow-natives-syntax not available
  • Bounded iterations (max 10,000 per round) to prevent memory issues
  • Progress updates every 100 samples
  • Full abort signal support

Requirements:

  • Node.js >= 20
  • --allow-natives-syntax flag (optional but recommended)
  • Falls back gracefully without flag (prints warning once)
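
One way the graceful fallback could work is sketched below. `createNeverOptimize` is a hypothetical helper, not the engine's actual code. It relies on the fact that compiling a function body containing `%NeverOptimizeFunction(...)` throws a SyntaxError unless Node.js was started with --allow-natives-syntax.

```typescript
type FnGuard = <T extends (...args: never[]) => unknown>(fn: T) => T;

// Builds a "never optimize" guard, falling back to the identity function
// when V8 natives syntax is unavailable.
function createNeverOptimize(): { guard: FnGuard; native: boolean } {
  try {
    const nativeGuard = new Function(
      'fn',
      '%NeverOptimizeFunction(fn); return fn;',
    ) as unknown as FnGuard;
    return { guard: nativeGuard, native: true };
  } catch {
    // Without --allow-natives-syntax the body above does not parse.
    return { guard: (fn) => fn, native: false };
  }
}

const { guard: neverOptimize, native: hasNatives } = createNeverOptimize();
const guardedTarget = () => 1;
const guardedResult = neverOptimize(guardedTarget);
```

Callers can check `hasNatives` once and print a single warning when it is false.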

Trade-offs vs TinybenchEngine:

| Aspect | TinybenchEngine | AccurateEngine |
| --- | --- | --- |
| Accuracy | Good (tinybench’s timing) | Excellent (V8 guards) |
| Setup | No special flags needed | Requires --allow-natives-syntax for best results |
| Speed | Fast | Slower (more iterations) |
| Maturity | Production-ready (tinybench) | Custom implementation |
| Maintenance | External dependency | Internal code |

Both engines use the same statistical processing pipeline (src/core/stats-utils.ts):

  1. IQR Outlier Removal - Removes samples outside the range [Q1 - 1.5 × IQR, Q3 + 1.5 × IQR]
  2. Statistics Calculation - mean, stdDev, variance, CV, percentiles (p95, p99)

This ensures consistent result quality regardless of engine choice.
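
A minimal sketch of such a pipeline is shown below. The function names (`quantile`, `removeOutliers`, `summarize`) are hypothetical, and the choices of linear-interpolated quartiles and sample variance (n - 1) are assumptions; the actual stats-utils.ts may differ in both.

```typescript
// Linear-interpolated quantile over an ascending-sorted array.
function quantile(sorted: number[], q: number): number {
  const pos = (sorted.length - 1) * q;
  const lower = Math.floor(pos);
  const upper = Math.ceil(pos);
  return sorted[lower] + (sorted[upper] - sorted[lower]) * (pos - lower);
}

// Keeps samples inside [Q1 - 1.5 * IQR, Q3 + 1.5 * IQR].
function removeOutliers(samples: number[]): number[] {
  if (samples.length < 4) return samples.slice();
  const sorted = samples.slice().sort((a, b) => a - b);
  const q1 = quantile(sorted, 0.25);
  const q3 = quantile(sorted, 0.75);
  const iqr = q3 - q1;
  const low = q1 - 1.5 * iqr;
  const high = q3 + 1.5 * iqr;
  return sorted.filter((x) => x >= low && x <= high);
}

// Mean, sample variance, standard deviation, and coefficient of variation.
function summarize(samples: number[]) {
  const n = samples.length;
  const mean = samples.reduce((sum, x) => sum + x, 0) / n;
  const variance =
    n > 1 ? samples.reduce((sum, x) => sum + (x - mean) * (x - mean), 0) / (n - 1) : 0;
  const stdDev = Math.sqrt(variance);
  return { mean, variance, stdDev, cv: mean === 0 ? 0 : stdDev / mean };
}

const cleaned = removeOutliers([10, 11, 12, 13, 14, 100]);
const stats = summarize(cleaned);
```

With these inputs, Q1 is 11.25 and Q3 is 13.75, so the fences are 7.5 and 17.5 and the sample at 100 is dropped before the statistics are computed.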


Location: src/core/engines/tinybench-engine.ts (lines 27-334)

ModestBench wraps TinyBench at the task execution level only. The integration is isolated to the TinybenchEngine.executeBenchmarkTask() method.

TinybenchEngine maps ModestBench configuration to TinyBench options:

| ModestBench Config | TinyBench Option | Default | Notes |
| --- | --- | --- | --- |
| time | time | 1000ms | Capped at 2000ms to prevent overflow |
| warmup | warmupTime | 0 | Warmup duration in ms, capped at 500ms |
| iterations | iterations | 100 | Direct mapping |
| limitBy | N/A | 'iterations' | Controls how time/iterations interact (see section 3.2) |
| timeout | N/A | 30000ms | Enforced at task level, not TinyBench |

Source: src/core/engines/tinybench-engine.ts (lines 77-82)

const bench = new Bench({
  iterations: effectiveIterations,
  time: effectiveTime,
  warmupIterations: config.warmup,
  warmupTime: config.warmup > 0 ? Math.min(config.warmup || 0, 500) : 0,
});

TinyBench returns results with the following structure, which TinybenchEngine transforms:

TinyBench → ModestBench Mapping:

| TinyBench Field | ModestBench Field | Transformation |
| --- | --- | --- |
| latency.samples | samples | Converted ms → ns, IQR filtered |
| latency.samples (filtered) | mean | Calculated from cleaned samples |
| latency.samples (filtered) | min | Calculated from cleaned samples |
| latency.samples (filtered) | max | Calculated from cleaned samples |
| latency.samples (filtered) | stdDev | Calculated from cleaned samples |
| latency.samples (filtered) | variance | Calculated from cleaned samples |
| latency.samples (filtered) | p95 | Calculated from cleaned samples |
| latency.samples (filtered) | p99 | Calculated from cleaned samples |
| latency.samples (filtered) | cv | Calculated from cleaned samples |
| latency.samples (filtered) | marginOfError | Calculated from cleaned samples |
| throughput.mean | opsPerSecond | Direct from tinybench |
| latency.samples.length (after IQR) | iterations | After outlier removal |

Note: TinybenchEngine applies IQR outlier removal to raw samples before calculating most statistics. Only opsPerSecond comes directly from tinybench.

Source: src/core/engines/tinybench-engine.ts (lines 287-310)

TinybenchEngine implements special error handling for TinyBench edge cases:

Array Length Overflow: When operations are extremely fast (<1ns), TinyBench can throw “Invalid array length” errors. TinybenchEngine automatically retries with minimal time (1-10ms depending on limitBy mode) and gracefully reports if still failing.

Source: src/core/engines/tinybench-engine.ts (lines 98-178)

catch (error) {
  if (errorMessage.includes('Invalid array length')) {
    // Retry with minimal time for extremely fast operations
    const minimalBench = new Bench({ time: retryTime, iterations: config.iterations });
    await minimalBench.run();
    // ... apply IQR and return results
  }
}

ModestBench does not currently expose a documented programmatic API for library consumers. The CLI is the primary interface.

However, the architecture supports programmatic use through the exported engine:

Potential API Usage (not officially documented):

import { TinybenchEngine, AccurateEngine } from 'modestbench';
import {
  ModestBenchConfigurationManager,
  BenchmarkFileLoader,
  FileHistoryStorage,
  ModestBenchProgressManager,
  ModestBenchReporterRegistry,
  ModestBenchErrorManager,
} from 'modestbench';

// Manual setup required
const engine = new TinybenchEngine({
  // or new AccurateEngine({
  configManager: new ModestBenchConfigurationManager(),
  fileLoader: new BenchmarkFileLoader(),
  historyStorage: new FileHistoryStorage(),
  progressManager: new ModestBenchProgressManager(),
  reporterRegistry: new ModestBenchReporterRegistry(),
  errorManager: new ModestBenchErrorManager(),
});

// Execute benchmarks
const result = await engine.execute({
  pattern: '**/*.bench.js',
  iterations: 1000,
});

Package Exports (package.json lines 24-30):

{
  "exports": {
    ".": {
      "import": "./dist/index.js",
      "require": "./dist/index.cjs",
      "types": "./dist/index.d.cts"
    }
  },
  "main": "./dist/index.cjs",
  "module": "./dist/index.js",
  "types": "./dist/index.d.cts"
}

The project currently does not have an src/index.ts file that aggregates exports for library consumers. This would need to be created to support programmatic API usage.

Opinion: To support programmatic usage, create src/index.ts:

// Proposed src/index.ts
export { ModestBenchEngine } from './core/engine.js';
export { TinybenchEngine } from './core/engines/tinybench-engine.js';
export { AccurateEngine } from './core/engines/accurate-engine.js';
export { BenchmarkFileLoader } from './core/loader.js';
export { ModestBenchConfigurationManager } from './config/manager.js';
export { FileHistoryStorage } from './storage/history.js';
export { ModestBenchProgressManager } from './progress/manager.js';
export {
  ModestBenchReporterRegistry,
  BaseReporter,
} from './reporters/registry.js';
export { HumanReporter } from './reporters/human.js';
export { JsonReporter } from './reporters/json.js';
export { CsvReporter } from './reporters/csv.js';
export { ModestBenchErrorManager } from './core/error-manager.js';
// Export all types
export * from './types/index.js';

6. Bespoke Systems: Replacement Candidates

Current Implementation: src/config/manager.ts using cosmiconfig

Status: ✅ Migrated from custom implementation

What it does:

  • Discovers config files in directory tree using cosmiconfig
  • Loads JSON, YAML, JavaScript, and TypeScript config files
  • Merges CLI arguments, config-file values, and defaults
  • Validates configuration

Implementation:

import { cosmiconfig } from 'cosmiconfig';

private createExplorer() {
  return cosmiconfig('modestbench', {
    loaders: {
      '.ts': async (filepath: string) => {
        // Use dynamic import for TypeScript files
        const module = await import(filepath);
        return module.default || module;
      },
    },
    searchPlaces: [
      'package.json',
      '.modestbenchrc',
      '.modestbenchrc.json',
      '.modestbenchrc.yaml',
      '.modestbenchrc.yml',
      'modestbench.config.json',
      'modestbench.config.yaml',
      'modestbench.config.yml',
      'modestbench.config.js',
      'modestbench.config.mjs',
      'modestbench.config.ts',
    ],
  });
}

Benefits Realized:

  • ✅ Full support for JSON, YAML, JS, TS, package.json, RC files
  • ✅ Smart caching and directory traversal handled by cosmiconfig
  • ✅ ~150 lines of code removed (discovery and loading logic)
  • ✅ Standard tool used by ESLint, Prettier, Babel, etc.
  • ✅ Automatic parent directory searching

Current Implementation: src/core/loader.ts

What it does:

  • Uses glob for file discovery ✅
  • Validates file syntax (bracket matching) ⚠️
  • Validates structure (benchmark patterns) ⚠️

Opinion: ⚠️ PARTIAL REPLACEMENT CANDIDATE

File Discovery: ✅ Keep as-is - glob package is the standard

Syntax Validation (lines 363-413): ⚠️ Consider removing

  • Current implementation is simplistic (counts braces/parens)
  • Doesn’t catch real syntax errors
  • Dynamic import will fail anyway if syntax is invalid
  • False positives: Braces inside strings or comments skew the count, so valid files can be flagged

Structure Validation (lines 298-358): ✅ Keep

  • Useful warnings for missing benchmark patterns
  • Low complexity

Recommendation: Remove syntax validation, keep structure validation
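
The weakness of brace counting is easy to demonstrate. The function below is a hypothetical re-creation of such a check, not the loader's actual code. It rejects a valid file whose string literal contains a brace and accepts a file with a genuine syntax error.

```typescript
// Hypothetical re-creation of a brace-counting check.
function bracesBalanced(source: string): boolean {
  let depth = 0;
  for (let i = 0; i < source.length; i++) {
    const ch = source[i];
    if (ch === '{') depth++;
    if (ch === '}') depth--;
    if (depth < 0) return false; // closing brace with no opener
  }
  return depth === 0;
}

// Valid JavaScript, but the brace inside the string unbalances the count.
const validButFlagged = 'export const closing = "}";';
// Balanced braces, but `const = 1` is a syntax error the counter cannot see.
const invalidButAccepted = 'const x = {}; const = 1;';

const flaggedResult = bracesBalanced(validButFlagged);
const acceptedResult = bracesBalanced(invalidButAccepted);
```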


The history system provides persistent storage of benchmark runs with querying, cleanup, and export capabilities.

Implementation: src/storage/history.ts

Storage Index (lines 28-46 in src/storage/history.ts)

interface StorageIndex {
  readonly version: string; // Index schema version
  readonly created: Date; // Index creation timestamp
  readonly lastModified: Date; // Last update timestamp
  readonly entries: IndexEntry[]; // All run entries
}

interface IndexEntry {
  readonly id: string; // Unique run ID
  readonly filename: string; // run-YYYY-MM-DD-HASH.json
  readonly date: Date; // Run timestamp
  readonly sizeBytes: number; // File size
  readonly summary: string; // "10 files, 50 tasks"
  readonly tags: string[]; // User tags
}

Each run is stored as a complete JSON serialization of BenchmarkRun (src/types/core.ts lines 12-37):

{
  "ci": {
    "provider": "GitHub Actions"
    /* ... */
  },
  "config": {
    /* ... */
  },
  "duration": 150000,
  "endTime": "2025-10-18T12:02:30.000Z",
  "environment": {
    "nodeVersion": "v18.0.0",
    "platform": "darwin",
    "arch": "arm64",
    "cpu": {
      /* ... */
    },
    "memory": {
      /* ... */
    }
  },
  "files": [
    /* all file results */
  ],
  "git": {
    "commit": "abc123",
    "branch": "main"
    /* ... */
  },
  "id": "run-1729257600000-abc123def",
  "startTime": "2025-10-18T12:00:00.000Z",
  "summary": {
    "totalFiles": 10,
    "totalSuites": 25,
    "totalTasks": 150,
    "passedTasks": 148,
    "failedTasks": 2
    /* ... */
  }
}

Save Run (lines 353-382 in src/storage/history.ts)

Query Runs (lines 283-348 in src/storage/history.ts)

Supports filtering by:

  • Date range: since, until
  • Pattern matching: Regex on summary
  • Tags: Filter by benchmark tags
  • Sorting: By date, name, duration
  • Pagination: offset, limit

const runs = await history.queryRuns({
  since: new Date('2025-10-01'),
  until: new Date('2025-10-18'),
  tags: ['performance', 'algorithm'],
  pattern: 'sorting',
  sortBy: 'date',
  sort: 'desc',
  limit: 10,
  offset: 0,
});
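
How such a query might be evaluated against index entries is sketched below. `queryEntries` and its types are hypothetical. The sketch assumes that a run matches when it carries any of the requested tags and always sorts newest first, which may differ from the real implementation.

```typescript
interface EntrySketch {
  id: string;
  date: Date;
  summary: string;
  tags: string[];
}
interface QuerySketch {
  since?: Date;
  until?: Date;
  tags?: string[];
  pattern?: string;
  limit?: number;
  offset?: number;
}

// Hypothetical filter over index entries: filter, sort newest first, paginate.
function queryEntries(entries: EntrySketch[], q: QuerySketch): EntrySketch[] {
  const pattern = q.pattern ? new RegExp(q.pattern) : undefined;
  const wantedTags = q.tags;
  const matched = entries.filter((e) => {
    const time = e.date.getTime();
    if (q.since && time < q.since.getTime()) return false;
    if (q.until && time > q.until.getTime()) return false;
    // Assumption: a run matches if it has at least one requested tag.
    if (wantedTags && !wantedTags.some((t) => e.tags.indexOf(t) !== -1)) return false;
    if (pattern && !pattern.test(e.summary)) return false;
    return true;
  });
  matched.sort((a, b) => b.date.getTime() - a.date.getTime());
  const offset = q.offset === undefined ? 0 : q.offset;
  const end = q.limit === undefined ? undefined : offset + q.limit;
  return matched.slice(offset, end);
}
```

Because the filter runs on index entries only, full run files need to be read just for the entries that survive it.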

Cleanup (lines 77-153 in src/storage/history.ts)

Implements retention policies to manage storage:

interface RetentionPolicy {
  maxAge?: number; // Max age in milliseconds
  maxRuns?: number; // Max number of runs to keep
  maxSize?: number; // Max total storage in bytes
}

Cleanup Algorithm:

  1. Load index
  2. Sort entries by date (oldest first)
  3. For each entry, check if it violates any policy:
    • Age > maxAge
    • Total runs > maxRuns
    • Total size > maxSize
  4. Remove violating files and update index
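
The algorithm above can be sketched as a pure selection function. `selectForRemoval` and its types are hypothetical. The sketch assumes the run-count and size checks are evaluated against what remains after earlier removals, so the oldest runs are dropped first until the policy is satisfied.

```typescript
interface RunEntrySketch {
  id: string;
  date: Date;
  sizeBytes: number;
}
interface PolicySketch {
  maxAge?: number; // milliseconds
  maxRuns?: number;
  maxSize?: number; // bytes
}

// Returns the ids to delete, walking entries oldest-first.
function selectForRemoval(entries: RunEntrySketch[], policy: PolicySketch, now: Date): string[] {
  const oldestFirst = entries.slice().sort((a, b) => a.date.getTime() - b.date.getTime());
  let remainingRuns = oldestFirst.length;
  let remainingBytes = oldestFirst.reduce((sum, e) => sum + e.sizeBytes, 0);
  const removed: string[] = [];

  for (const entry of oldestFirst) {
    const ageMs = now.getTime() - entry.date.getTime();
    const tooOld = policy.maxAge !== undefined && ageMs > policy.maxAge;
    const tooMany = policy.maxRuns !== undefined && remainingRuns > policy.maxRuns;
    const tooBig = policy.maxSize !== undefined && remainingBytes > policy.maxSize;
    if (tooOld || tooMany || tooBig) {
      removed.push(entry.id);
      remainingRuns--;
      remainingBytes -= entry.sizeBytes;
    }
  }
  return removed;
}
```

Keeping the selection separate from file deletion makes the policy logic testable without touching the file system.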

Export (lines 158-174 in src/storage/history.ts)

Supports:

  • JSON: Pretty-printed benchmark runs
  • CSV: Tabular format for spreadsheets

CSV fields (lines 397-430):

runId, startTime, endTime, duration, files, suites, tasks,
passed, failed, nodeVersion, platform, arch, gitCommit, gitBranch

Default: .modestbench/history/ in current working directory

Configurable: Pass storageDir option to FileHistoryStorage constructor

new FileHistoryStorage({
  storageDir: '/path/to/custom/history',
  maxFileSize: 10 * 1024 * 1024, // 10MB
});

The index is cached in memory after first load (line 52 in src/storage/history.ts):

private index: null | StorageIndex = null;

Invalidation: Index is rewritten on every saveRun() and cleanup() operation.

Performance: Caching is efficient for CLI usage (short-lived processes), but in long-running programmatic use the cached index can go stale, and simultaneous writes from multiple processes can overwrite each other's index updates.


8. Rarely-Used Features: Removal Candidates

Command: modestbench init [type]
Location: src/cli/commands/init.ts

Current Features:

  • Project type templates: basic, advanced, library
  • Config file generation: JSON, YAML, JS, TS
  • Example benchmark files

Issues:

  1. Inconsistent support: Only JSON configs are loadable; YAML/JS/TS are templates only
  2. Maintenance overhead: Each template needs updates when config schema changes
  3. Limited differentiation: “basic” vs “advanced” vs “library” types aren’t significantly different

Recommendation:

  • Keep single init command with JSON config only
  • Remove template variations
  • Focus on one high-quality example

Location: src/core/engine.ts (lines 896-900)

Current Status:

private async getGitInfo(): Promise<GitInfo | undefined> {
  // TODO: Implement Git information extraction
  // This would use child_process to run git commands
  return undefined;
}

The GitInfo type is defined (src/types/core.ts lines 182-200) but never populated.

Options:

  1. Implement: Use simple-git package or child_process
  2. Remove: Delete GitInfo type and references if not critical

Recommendation: Either implement fully or remove the placeholder. Git info is useful for CI tracking but not critical for local benchmarking.
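
If the child_process route is chosen, the implementation could look roughly like the sketch below. It is synchronous for brevity, whereas the placeholder is async. `GitInfoSketch` and `readGitInfo` are hypothetical names, and only the `commit` and `branch` fields shown in the stored run JSON are populated.

```typescript
import { execFileSync } from 'node:child_process';

// Hypothetical shape; the real GitInfo type may define more fields.
interface GitInfoSketch {
  commit: string;
  branch: string;
}

// Runs a git subcommand and returns trimmed stdout; throws on any failure.
function runGit(args: string[], cwd: string): string {
  return execFileSync('git', args, {
    cwd,
    encoding: 'utf8',
    stdio: ['ignore', 'pipe', 'ignore'],
  }).trim();
}

// Returns undefined when git is missing or `cwd` is not inside a repository.
function readGitInfo(cwd: string = process.cwd()): GitInfoSketch | undefined {
  try {
    return {
      commit: runGit(['rev-parse', 'HEAD'], cwd),
      branch: runGit(['rev-parse', '--abbrev-ref', 'HEAD'], cwd),
    };
  } catch {
    return undefined;
  }
}

const gitInfo = readGitInfo();
```

Returning undefined on failure matches the placeholder's current contract, so callers would not need to change.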


Command: modestbench history compare <run-id1> <run-id2>
Location: src/cli/commands/history.ts

Status: Command is registered but actual comparison logic implementation unclear.

Recommendation: If not implemented, remove from CLI until implemented. If implemented, document thoroughly as it’s a key differentiator.


| Subsystem | Statefulness | Lifecycle | Persistence |
| --- | --- | --- | --- |
| ErrorManager | ✅ Yes | Per run | None (in-memory) |
| ProgressManager | ✅ Yes | Per run | None (in-memory) |
| HistoryStorage | ✅ Yes | Persistent | File system |
| ReporterRegistry | ✅ Yes | Process | None (in-memory) |
| HumanReporter | ✅ Yes | Per run | None (in-memory) |
| ConfigManager | ❌ No | Stateless | N/A |
| BenchmarkEngine | ❌ No | Stateless | N/A |
| FileLoader | ❌ No | Stateless | N/A |

Location: src/core/error-manager.ts

State Stored:

  • errors: ExecutionError[] - Array of all handled errors (line 81)
  • handlers: ErrorHandler[] - Registered error callbacks (line 83)
  • maxRecentErrors = 50 - Memory limit (line 85)

Lifecycle:

  • Created per CLI invocation
  • Accumulates errors during benchmark run
  • Automatically trims to last 50 errors (lines 300-302)

Memory Safety: ✅ Bounded by maxRecentErrors

Location: src/progress/manager.ts

State Stored:

  • state: ProgressState - Current progress (line 41)
  • callbacks: ProgressCallback[] - Registered listeners (line 33)
  • metrics: ProgressMetrics | null - Throughput calculations (line 39)
  • lastUpdate: number - Throttling timestamp (line 35)

Throttling: Updates limited to every 100ms (line 43)

Lifecycle:

  • initialize(run) - Set totals (lines 208-243)
  • update(changes) - Incremental updates (lines 298-322)
  • cleanup() - Reset state (lines 52-56)

Memory Safety: ✅ Bounded state, cleared after run
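
The throttling behavior can be sketched with a small class. `UpdateThrottle` is a hypothetical name, and the injectable clock is an assumption made so the example can be tested deterministically.

```typescript
// Hypothetical throttle: forwards an update only if `intervalMs` has elapsed
// since the last forwarded one.
class UpdateThrottle {
  private lastUpdate = Number.NEGATIVE_INFINITY;
  private readonly intervalMs: number;
  private readonly now: () => number;

  constructor(intervalMs: number, now: () => number = () => Date.now()) {
    this.intervalMs = intervalMs;
    this.now = now;
  }

  // Calls `emit` and returns true when the update is allowed through.
  tryUpdate(emit: () => void): boolean {
    const current = this.now();
    if (current - this.lastUpdate < this.intervalMs) return false;
    this.lastUpdate = current;
    emit();
    return true;
  }
}

// Usage with a fake clock: updates at 0ms and 100ms pass, the one at 50ms is dropped.
let fakeTime = 0;
let emitted = 0;
const throttle = new UpdateThrottle(100, () => fakeTime);
const first = throttle.tryUpdate(() => emitted++);
fakeTime = 50;
const second = throttle.tryUpdate(() => emitted++);
fakeTime = 100;
const third = throttle.tryUpdate(() => emitted++);
```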

Location: src/storage/history.ts

In-Memory State:

  • index: StorageIndex | null - Cached index (line 52)

Persistent State:

  • .modestbench/history/index.json - Run metadata
  • .modestbench/history/run-*.json - Full benchmark results

Concurrency: ⚠️ Not safe for concurrent writers
Multiple processes writing simultaneously could corrupt the index.

Size Limits:

  • Default max file size: 10MB
  • Automatic cleanup via retention policies

Location: src/reporters/human.ts

State Stored:

  • startTime - Run start timestamp (line 52)
  • lastProgressLine - For terminal clearing (line 44)
  • progressTimer - Spinner animation interval (line 46)
  • spinnerIndex - Animation frame counter (line 50)

Lifecycle:

  • onStart() - Initialize (lines 200-244)
  • onProgress() - Update display (lines 166-198)
  • onEnd() - Finalize (lines 78-123)

Memory Safety: ✅ Minimal state, timer cleaned up


| Variable | Purpose | Location | Default | Impact |
| --- | --- | --- | --- | --- |
| DEBUG | Show stack traces on errors | src/cli/index.ts | undefined | Error verbosity |
| CI | Detect CI environment | src/core/engine.ts | 'false' | Enable CI info collection |
| NODE_ENV | Environment mode | src/core/engine.ts | 'development' | Stored in environment info |
| FORCE_COLOR | Force color output | src/reporters/human.ts | undefined | Override color detection |
| NO_COLOR | Disable color output | src/reporters/human.ts | undefined | Override color detection |
| GitHub Actions | CI provider detection | src/core/engine.ts | N/A | See below |

Usage: DEBUG=1 modestbench run

Behavior:

  • Shows full error stack traces (lines 477-479 in src/cli/index.ts)
  • Prints error details on uncaught exceptions (lines 494-496)

if (process.env.DEBUG) {
  console.error(err.stack);
}

Primary Detection: src/core/engine.ts (lines 724-759)

if (!process.env.CI) {
  return undefined; // Not in CI
}

GitHub Actions Detection:

When GITHUB_ACTIONS is set, captures:

| GitHub Variable | ModestBench Field | Purpose |
| --- | --- | --- |
| GITHUB_RUN_NUMBER | buildNumber | Job number |
| GITHUB_REPOSITORY | Used to build buildUrl | e.g., owner/repo |
| GITHUB_RUN_ID | Used to build buildUrl | Job run ID |
| GITHUB_REF_NAME | branch | Branch or PR ref |
| GITHUB_EVENT_NAME | Determines pullRequest | Event type |
| GITHUB_SHA | commit | Commit SHA |

Other CI Providers:

Falls back to generic detection:

  • BRANCH → branch
  • COMMIT → commit
  • Provider shown as “Unknown CI”

Output in Results:

{
  "ci": {
    "provider": "GitHub Actions",
    "buildNumber": "42",
    "buildUrl": "https://github.com/owner/repo/actions/runs/123456",
    "branch": "main",
    "commit": "abc123",
    "pullRequest": "refs/pull/42/merge"
  }
}
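
The detection rules above can be sketched as a pure function over an environment map. `detectCi` and `CiInfoSketch` are hypothetical names, and `pullRequest` is omitted because its exact derivation is not spelled out here.

```typescript
type EnvMap = Record<string, string | undefined>;

interface CiInfoSketch {
  provider: string;
  buildNumber?: string;
  buildUrl?: string;
  branch?: string;
  commit?: string;
}

// Hypothetical detector following the variable mapping described above.
function detectCi(env: EnvMap): CiInfoSketch | undefined {
  if (!env.CI) return undefined; // not in CI

  if (env.GITHUB_ACTIONS) {
    const repo = env.GITHUB_REPOSITORY;
    const runId = env.GITHUB_RUN_ID;
    return {
      provider: 'GitHub Actions',
      buildNumber: env.GITHUB_RUN_NUMBER,
      // The URL is only built when both parts are present.
      buildUrl: repo && runId ? `https://github.com/${repo}/actions/runs/${runId}` : undefined,
      branch: env.GITHUB_REF_NAME,
      commit: env.GITHUB_SHA,
    };
  }

  // Generic fallback for other providers.
  return { provider: 'Unknown CI', branch: env.BRANCH, commit: env.COMMIT };
}
```

Passing the environment in as an argument, rather than reading process.env directly, keeps the function easy to test.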

Location: src/reporters/human.ts (lines 68-72)

Detection Logic:

this.useColor =
  options.color ??
  (process.stdout.isTTY &&
    process.env.FORCE_COLOR !== '0' &&
    process.env.NO_COLOR == null);
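
The expression can be restated as a pure function to make its behavior easier to check. `resolveUseColor` and `ColorInputs` are hypothetical names; the logic is copied from the expression above.

```typescript
interface ColorInputs {
  flag?: boolean; // value of --color / --no-color, if given
  isTTY: boolean;
  env: Record<string, string | undefined>;
}

// Same logic as the expression above, with its inputs made explicit.
function resolveUseColor({ flag, isTTY, env }: ColorInputs): boolean {
  return flag ?? (isTTY && env.FORCE_COLOR !== '0' && env.NO_COLOR == null);
}

const colorOnTty = resolveUseColor({ isTTY: true, env: {} });
const colorWithNoColor = resolveUseColor({ isTTY: true, env: { NO_COLOR: '1' } });
const colorForcedOff = resolveUseColor({ isTTY: true, env: { FORCE_COLOR: '0' } });
const colorPipedForced = resolveUseColor({ isTTY: false, env: { FORCE_COLOR: '1' } });
const colorPipedFlag = resolveUseColor({ flag: true, isTTY: false, env: {} });
```

Under this expression, `colorPipedForced` is false: FORCE_COLOR=1 does not enable color when stdout is not a TTY, whereas an explicit flag does.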

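Priority:

  1. Explicit --color / --no-color CLI flag (options.color), which overrides everything else
  2. Otherwise, per the expression above, color is enabled only when all three conditions hold: stdout is a TTY, FORCE_COLOR is not '0', and NO_COLOR is unset

In this expression FORCE_COLOR can only disable color (when set to '0'); other values do not override a non-TTY stdout.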

Examples:

# Force color in CI (the --color flag takes precedence over TTY detection)
modestbench run --color
# Disable color
NO_COLOR=1 modestbench run

Usage: Stored in environment info but does not change behavior

env: {
  CI: process.env.CI || 'false',
  NODE_ENV: process.env.NODE_ENV || 'development',
}

This is captured for historical tracking but doesn’t affect benchmark execution.



Why: Support multiple benchmark execution strategies without code duplication
How: Abstract base class with single executeBenchmarkTask() hook
Trade-off: Easier to add new engines, but requires understanding the abstraction

Why: Enables testing and flexibility
How: Services passed to ModestBenchEngine constructor
Trade-off: More verbose setup for programmatic use

12.3 Synchronous File I/O in HistoryStorage

Why: Simplicity
Where: Uses fs.readFileSync, fs.writeFileSync
Trade-off: Could block in high-frequency scenarios
Mitigation: CLI usage is typically one-shot

Why: Leverage maintained library, avoid duplication
How: Thin wrapper in TinybenchEngine.executeBenchmarkTask()
Trade-off: Dependent on TinyBench API stability
Alternative: AccurateEngine provides custom implementation option

Why: Consistent result quality across engines
How: Both engines use same IQR filtering and statistics calculation
Trade-off: Requires standardizing on nanosecond-precision samples

Why: Simple, portable, no database dependencies
Where: JSON files in .modestbench/history/
Trade-off: No multi-process safety, query performance limited
Alternative considered: SQLite


Implementation: Uses glob package
Performance: O(n) where n = number of files scanned
Typical: <100ms for 1000 files

Throttling: 100ms minimum between updates
Impact: Reduces terminal I/O overhead
UI responsiveness: Acceptable for human perception

Index loading: O(1) with in-memory cache
Filtering: O(n) linear scan of entries
Run loading: O(m) where m = matching runs
Optimization: Index filters before loading full run files

TinybenchEngine Overhead: Minimal wrapper around TinyBench
AccurateEngine Overhead: Custom measurement loop with V8 guards
Progress updates: Every 500ms (TinybenchEngine) or every 100 samples (AccurateEngine)
Reporter callbacks: Synchronous execution could add overhead if reporters are slow


Risk: Benchmark files are dynamically imported
Mitigation: Limited to files matching glob patterns
Recommendation: Run benchmarks in isolated environments if executing untrusted code

History storage: Writes to .modestbench/history/
Reporter output: Writes to configured outputDir
Risk: Path traversal if user controls paths
Current mitigation: Paths resolved relative to CWD

Risk: JS/TS config files execute arbitrary code via dynamic imports
Mitigation: Config files are treated as trusted code (like package.json scripts)
Implementation: Uses cosmiconfig with dynamic import() for TypeScript files

Risk: AccurateEngine uses new Function() with V8 intrinsics
Mitigation: String is hardcoded, never influenced by user input
Alternative: Falls back to basic mode without --allow-natives-syntax


Location: /test/

Structure:

  • unit/ - Pure function tests
  • integration/ - Component interaction tests
  • contract/ - Interface compliance tests

| Test File | Coverage |
| --- | --- |
| test/contract/tinybench-engine.test.ts | TinybenchEngine implementation |
| test/contract/accurate-engine.test.ts | AccurateEngine implementation |
| test/integration/engine-comparison.test.ts | Engine compatibility |
| test/integration/test_reporters.test.ts | Reporter output |
| test/integration/test_configuration.test.ts | Config loading |

Both concrete engines (TinybenchEngine and AccurateEngine) are tested against the same contract to ensure API compatibility, so they can be swapped without breaking user code that relies only on that contract.

Location: test/util.ts
Provides test helpers and fixtures


| Subsystem | Primary File | Lines of Code | Key Classes/Functions |
| --- | --- | --- | --- |
| CLI Entry | src/cli/index.ts | 650 | cli(), main(), createCliContext() |
| Run Command | src/cli/commands/run.ts | 305 | handleRunCommand() |
| Engine Base | src/core/engine.ts | 891 | ModestBenchEngine (abstract) |
| Tinybench | src/core/engines/tinybench-engine.ts | 336 | TinybenchEngine |
| Accurate | src/core/engines/accurate-engine.ts | 408 | AccurateEngine |
| Stats | src/core/stats-utils.ts | ~150 | calculateStatistics, removeOutliersIQR |
| Loader | src/core/loader.ts | 416 | BenchmarkFileLoader |
| Error Manager | src/core/error-manager.ts | 373 | ModestBenchErrorManager |
| Config | src/config/manager.ts | 465 | ModestBenchConfigurationManager |
| History | src/storage/history.ts | 605 | FileHistoryStorage |
| Progress | src/progress/manager.ts | 413 | ModestBenchProgressManager |
| Reporters | src/reporters/ | ~800 | HumanReporter, JsonReporter, CsvReporter |
| Types | src/types/ | ~600 | Interface definitions |

Total Source Code: ~6,500 lines


  1. ✅ Replace configuration loading with cosmiconfig - Enables YAML/JS/TS support, reduces code
  2. ✅ Create public API entry point - Document programmatic usage
  3. ⚠️ Complete or remove Git info collection - Half-implemented feature
  4. ⚠️ Add concurrency control to HistoryStorage - Prevent index corruption

  1. 🤔 Remove syntax validation - Redundant, error-prone
  2. 🤔 Simplify init command - Remove unused template variations

  1. Document environment variables in README
  2. Add programmatic API examples
  3. Consider SQLite for history storage (performance + safety)

| Term | Definition |
| --- | --- |
| Benchmark Run | Complete execution of all discovered benchmark files |
| Suite | Collection of related benchmark tasks |
| Task | Single benchmark operation (one function to measure) |
| Reporter | Output formatter (human, JSON, CSV) |
| History Storage | Persistent benchmark result storage |
| Progress State | Real-time execution progress tracking |
| Execution Phase | Stage of benchmark execution (discovery, validation, execution, etc.) |
| TinyBench | External benchmark library wrapped by TinybenchEngine |
| AccurateEngine | Custom benchmark engine with V8 optimization guards |
| TinybenchEngine | Engine that wraps the tinybench library |
| IQR Filtering | Interquartile Range outlier removal for sample cleanup |
| V8 Intrinsics | Low-level V8 functions for optimization control |
| CliContext | Dependency injection container for CLI commands |

This architectural overview provides a comprehensive understanding of ModestBench’s internal structure, design decisions, and areas for improvement. The system is well-architected with clear separation of concerns through the engine abstraction pattern, enabling both tinybench integration and custom measurement approaches.