StackBench

CLI Commands Reference

Essential Commands

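Main Workflow: stackbench setup, stackbench print-prompt, stackbench analyze

Management: stackbench list, stackbench status, stackbench clean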

Main Commands

stackbench setup

Clone a repository and extract its use cases, preparing the run for execution in an IDE.

stackbench setup <repository-url> -a cursor -i docs,examples -l python

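Common options, as used in the example above: -a selects the agent (cursor), -i takes a comma-separated list of folders to include (docs,examples), and -l sets the language (python).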

Returns: Run ID for use with other commands

stackbench print-prompt

Get formatted prompt for manual IDE execution.

# Get use case 1 prompt and copy to clipboard
stackbench print-prompt <run-id> -u 1 --copy

# Just print without copying
stackbench print-prompt <run-id> -u 1
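Because each use case has its own prompt, stepping through them by hand gets repetitive. The following is a minimal sketch of one way to loop over them, not something StackBench ships: the function name `copy_prompts` is hypothetical, and the number of use cases is passed in by hand because this reference does not show a way to query it.

```shell
# Hypothetical helper (not part of StackBench): copy each use-case prompt in
# turn, pausing so it can be pasted into the IDE before the next one is copied.
# Usage: copy_prompts <run-id> <number-of-use-cases>
copy_prompts() {
    run_id=$1
    count=$2
    i=1
    while [ "$i" -le "$count" ]; do
        # Same invocation as the documented example, with the use case number varied.
        stackbench print-prompt "$run_id" -u "$i" --copy || return 1
        # The pause message goes to stderr so stdout carries only stackbench output.
        printf 'Use case %s copied. Press Enter for the next one... ' "$i" >&2
        read -r reply
        i=$((i + 1))
    done
}
```

For a run with five use cases this would be called as `copy_prompts <run-id> 5`.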

stackbench analyze

Analyze implementations and generate reports.

# Analyze all use cases
stackbench analyze <run-id>

# Analyze specific use case
stackbench analyze <run-id> --use-case 2

# Force re-analysis
stackbench analyze <run-id> --force
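When only a few use cases need analysis, the `--use-case` form above can be wrapped in a loop. This is a sketch under assumptions: `analyze_selected` is a hypothetical name, and it assumes `stackbench analyze` exits with a non-zero status on failure, which this reference does not confirm.

```shell
# Hypothetical helper: analyze only the listed use cases, one at a time,
# and report which ones failed instead of stopping at the first error.
# Usage: analyze_selected <run-id> <use-case-number>...
analyze_selected() {
    run_id=$1
    shift
    failed=""
    for uc in "$@"; do
        # Assumption: a failed analysis is signalled by a non-zero exit status.
        if ! stackbench analyze "$run_id" --use-case "$uc"; then
            failed="$failed $uc"
        fi
    done
    if [ -n "$failed" ]; then
        echo "analysis failed for use case(s):$failed" >&2
        return 1
    fi
}
```

For example, `analyze_selected <run-id> 1 3 5` would analyze use cases 1, 3, and 5 and leave the others untouched.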

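Prerequisites: the use cases must already be implemented in the IDE, and ANTHROPIC_API_KEY must be set (see Configuration).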

Management Commands

stackbench list

List all benchmark runs.

# List all runs
stackbench list

# Filter by phase or agent
stackbench list --phase completed
stackbench list --agent cursor

stackbench status

Check run status and progress.

stackbench status <run-id>

Shows current phase, use case execution status, and next steps.

stackbench clean

Clean up old benchmark runs.

# Remove runs older than 30 days (default)
stackbench clean

# Remove runs older than 7 days
stackbench clean --older-than 7

# Preview what would be deleted
stackbench clean --dry-run
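Since `clean` deletes data, it can be worth previewing before every real deletion. The wrapper below is a sketch of that habit. `clean_with_preview` is a hypothetical name, and the sketch assumes `--older-than` and `--dry-run` can be combined in one invocation, which the examples above suggest but do not show.

```shell
# Hypothetical wrapper: show the dry-run preview, then delete only after an explicit "y".
# Usage: clean_with_preview [days]   (falls back to 30, the documented default)
clean_with_preview() {
    days=${1:-30}
    # Assumption: --older-than and --dry-run may be passed together.
    stackbench clean --older-than "$days" --dry-run || return 1
    printf 'Delete these runs? [y/N] ' >&2
    read -r answer
    case $answer in
        y|Y) stackbench clean --older-than "$days" ;;
        *)   echo "Nothing deleted." ;;
    esac
}
```

Anything other than `y` or `Y`, including just pressing Enter, leaves the runs in place.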

Configuration

Required environment variables:

OPENAI_API_KEY=your_openai_key_here     # For use case extraction
ANTHROPIC_API_KEY=your_anthropic_key    # For analysis

Common settings (optional):

NUM_USE_CASES=10                        # Default: 5
ANALYSIS_MAX_WORKERS=5                  # Default: 3
DSPY_MODEL=gpt-4o                       # Default: gpt-4o-mini

Quick Start Workflow

# 1. Set up repository
stackbench setup https://github.com/user/awesome-lib -i docs

# 2. Execute use cases in Cursor IDE
stackbench print-prompt <run-id> -u 1 --copy
# ⚠️ Wait for Cursor to finish indexing before you paste!
# Paste into Cursor, let it implement the use case, then repeat for every use case

# 3. Analyze results
stackbench analyze <run-id>

# 4. View results
cat ./data/<run-id>/results.md
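Step 4 fails with a bare `cat` error if the analysis has not produced a report yet. The sketch below wraps it with a clearer message. `show_results` is a hypothetical name, and the path layout is taken directly from the `cat` command above.

```shell
# Hypothetical helper: print the results report for a run, or explain why it is missing.
# Usage: show_results <run-id>
show_results() {
    results="./data/$1/results.md"
    if [ ! -f "$results" ]; then
        echo "no results at $results; has 'stackbench analyze $1' finished?" >&2
        return 1
    fi
    cat "$results"
}
```

Like the original command, it resolves `./data` relative to the current directory, so it should be run from the same place as the other steps.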

Next Steps