CLI Commands Reference
Essential Commands
Main Workflow:
- stackbench setup: Set up repository for IDE execution (clone + extract)
- stackbench print-prompt: Get use case prompts for manual execution
- stackbench analyze: Analyze implementations and generate reports
Management:
- stackbench list: List all benchmark runs
- stackbench status: Check run status and progress
- stackbench clean: Clean up old runs
Main Commands
stackbench setup
Set up repository for IDE execution (clone + extract use cases).
stackbench setup <repository-url> -a cursor -i docs,examples -l python
Common options:
- -i docs,examples: Focus on specific folders
- -b develop: Use specific branch
- -l javascript: Specify programming language (python/py, javascript/js, typescript/ts)
- --num-use-cases 10: Generate more use cases
Returns: Run ID for use with other commands
stackbench print-prompt
Get formatted prompt for manual IDE execution.
# Get use case 1 prompt and copy to clipboard
stackbench print-prompt <run-id> -u 1 --copy
# Just print without copying
stackbench print-prompt <run-id> -u 1
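Because each use case needs its own prompt, it can help to loop over the use case numbers. The following is a minimal sketch, not part of StackBench: the helper name print_all_prompts and the STACKBENCH_BIN override are hypothetical, and the default count of 5 is taken from the documented NUM_USE_CASES default.

```shell
# Hypothetical helper: print the prompt for use cases 1..N of a run.
# STACKBENCH_BIN is an assumption of this sketch so the command can be
# swapped out (for example with `echo`) to preview what would run.
print_all_prompts() {
    run_id="$1"
    count="${2:-5}"   # 5 matches the documented NUM_USE_CASES default
    u=1
    while [ "$u" -le "$count" ]; do
        "${STACKBENCH_BIN:-stackbench}" print-prompt "$run_id" -u "$u"
        u=$((u + 1))
    done
}

# Usage: print_all_prompts <run-id> 5
```

Setting STACKBENCH_BIN=echo before calling the helper prints the commands instead of running them.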
stackbench analyze
Analyze implementations and generate reports.
# Analyze all use cases
stackbench analyze <run-id>
# Analyze specific use case
stackbench analyze <run-id> --use-case 2
# Force re-analysis
stackbench analyze <run-id> --force
Prerequisites:
- Claude Code CLI: npm install -g @anthropic-ai/claude-code
- Set ANTHROPIC_API_KEY in environment
- Use case solution files must exist
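The first two prerequisites can be checked before running the analysis. This is a rough sketch under stated assumptions: it assumes the Claude Code CLI is installed on PATH as claude, and the function name and the CLAUDE_BIN override are hypothetical. It does not check for solution files, since their location is not specified here.

```shell
# Sketch of a preflight check before running `stackbench analyze`.
# Assumption: the Claude Code CLI is available on PATH as `claude`.
# CLAUDE_BIN is a hypothetical override for a different binary name.
check_analyze_prereqs() {
    missing=0
    if ! command -v "${CLAUDE_BIN:-claude}" >/dev/null 2>&1; then
        echo "missing: Claude Code CLI (npm install -g @anthropic-ai/claude-code)"
        missing=1
    fi
    if [ -z "${ANTHROPIC_API_KEY:-}" ]; then
        echo "missing: ANTHROPIC_API_KEY"
        missing=1
    fi
    return "$missing"
}

# Usage: check_analyze_prereqs && stackbench analyze <run-id>
```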
Management Commands
stackbench list
List all benchmark runs.
# List all runs
stackbench list
# Filter by phase or agent
stackbench list --phase completed
stackbench list --agent cursor
stackbench status
Check run status and progress.
stackbench status <run-id>
Shows current phase, use case execution status, and next steps.
stackbench clean
Clean up old benchmark runs.
# Remove runs older than 30 days (default)
stackbench clean
# Remove runs older than 7 days
stackbench clean --older-than 7
# Preview what would be deleted
stackbench clean --dry-run
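Previewing before deleting can be wrapped in one step. The sketch below is illustrative only: confirm_clean and STACKBENCH_BIN are hypothetical names, and it assumes that --older-than and --dry-run can be combined in a single call, which the examples above do not confirm.

```shell
# Hypothetical wrapper: preview a cleanup, then ask before deleting.
# Assumption: --older-than and --dry-run can be combined in one call.
confirm_clean() {
    days="${1:-30}"   # 30 days is the documented default
    "${STACKBENCH_BIN:-stackbench}" clean --older-than "$days" --dry-run
    printf 'Delete these runs? [y/N] '
    read -r answer
    if [ "$answer" = "y" ]; then
        "${STACKBENCH_BIN:-stackbench}" clean --older-than "$days"
    else
        echo "Nothing deleted."
    fi
}

# Usage: confirm_clean 7
```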
Configuration
Required environment variables:
OPENAI_API_KEY=your_openai_key_here # For use case extraction
ANTHROPIC_API_KEY=your_anthropic_key # For analysis
Common settings (optional):
NUM_USE_CASES=10 # Default: 5
ANALYSIS_MAX_WORKERS=5 # Default: 3
DSPY_MODEL=gpt-4o # Default: gpt-4o-mini
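To see which settings would apply, a small shell check can report missing required keys and fall back to the documented defaults for the optional ones. This is a sketch that assumes the variables are read from the process environment. The function name check_stackbench_env is hypothetical.

```shell
# Sketch: report missing required keys, then print the effective optional
# settings using the defaults documented above.
check_stackbench_env() {
    missing=""
    [ -n "${OPENAI_API_KEY:-}" ] || missing="$missing OPENAI_API_KEY"
    [ -n "${ANTHROPIC_API_KEY:-}" ] || missing="$missing ANTHROPIC_API_KEY"
    if [ -n "$missing" ]; then
        echo "missing:$missing"
        return 1
    fi
    echo "NUM_USE_CASES=${NUM_USE_CASES:-5}"
    echo "ANALYSIS_MAX_WORKERS=${ANALYSIS_MAX_WORKERS:-3}"
    echo "DSPY_MODEL=${DSPY_MODEL:-gpt-4o-mini}"
}

# Usage: check_stackbench_env
```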
Quick Start Workflow
# 1. Set up repository
stackbench setup https://github.com/user/awesome-lib -i docs
# 2. Execute use cases in Cursor IDE
# ⚠️ Wait for Cursor indexing to complete before pasting!
stackbench print-prompt <run-id> -u 1 --copy
# Paste in Cursor, let it implement, repeat for all use cases
# 3. Analyze results
stackbench analyze <run-id>
# 4. View results
cat ./data/<run-id>/results.md
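The last step fails with a bare cat error if the analysis has not produced a report yet. A minimal sketch of a friendlier check follows. The show_results name and the DATA_DIR override are hypothetical, and the ./data/<run-id>/results.md layout is taken from the example above.

```shell
# Sketch: print a run's report if it exists.
# DATA_DIR is a hypothetical override; ./data comes from the example above.
show_results() {
    report="${DATA_DIR:-./data}/$1/results.md"
    if [ -f "$report" ]; then
        cat "$report"
    else
        echo "No report at $report; run 'stackbench analyze $1' first." >&2
        return 1
    fi
}

# Usage: show_results <run-id>
```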
Next Steps
- Getting Started - Complete tutorial walkthrough
- How StackBench Works - Understanding the internals