Documentation
Everything you need to run multi-model pipelines with an independent referee.
Quick Start
Get your first pipeline running in under three minutes.
# Install CRTX
pip install crtx

# Set your API keys (use the providers you have)
export ANTHROPIC_API_KEY=sk-ant-...
export OPENAI_API_KEY=sk-...
export GOOGLE_API_KEY=AIza...
export XAI_API_KEY=xai-...

# Run your first pipeline
crtx run \
  "Build a REST API with JWT auth and rate limiting" \
  --arbiter bookend \
  --route quality_first
That's it. CRTX selects the best model for each pipeline stage, runs the Architect → Implement → Refactor → Verify sequence, and the Arbiter independently reviews the first and last stage outputs.
Installation
CRTX requires Python 3.12 or higher.
# Standard install
pip install crtx

# With the optional dashboard (real-time web visualization)
pip install crtx[dashboard]

# Verify installation
crtx --version
crtx --help
CRTX is a BYOK (Bring Your Own Keys) tool. You need at least one API key from a supported provider. The more providers you configure, the more options the routing engine has for model selection.
Configuration
CRTX uses TOML configuration files. Create a crtx.toml in your project root or pass --config path/to/config.toml.
# crtx.toml — Example configuration

[pipeline]
mode = "sequential"        # sequential | parallel | debate
arbiter = "bookend"        # bookend | full | final | off

[routing]
strategy = "quality_first" # quality_first | cost_optimized | speed_first | hybrid
min_fitness = 0.6          # Minimum fitness score to consider a model

[models]
# Override default model assignments per stage
architect = "gemini-2.5-pro"
implement = "gpt-4o"
refactor = "claude-opus-4.6"
arbiter = "grok-4"

[arbiter]
max_retries = 2            # Max retries on REJECT verdict
inject_flags = true        # Inject FLAG warnings into next stage

[domain_rules]
# Custom rules the Arbiter enforces (see Domain Rules section)
rules = [
  "All database operations must use connection pooling",
  "All API endpoints must validate input with Pydantic models",
  "Error responses must use RFC 7807 problem details format",
]
CLI Reference
crtx run
Execute a pipeline. This is the primary command.
crtx run [TASK] [OPTIONS]
Required:
TASK Task description (positional argument)
Options:
--mode TEXT Pipeline mode: sequential, parallel, debate
[default: sequential]
--arbiter TEXT Arbiter mode: bookend, full, final, off
[default: bookend]
--route TEXT Routing strategy: quality_first, cost_optimized,
speed_first, hybrid [default: quality_first]
--models TEXT Comma-separated model list (for parallel/debate)
--context-dir PATH Project directory for context injection
--context-budget INT Max tokens for injected context [default: 20000]
--apply Enable apply mode (write to codebase)
--confirm Actually write files (requires --apply)
--branch TEXT Create git branch before applying
--apply-include TEXT Glob patterns for files to write
--apply-exclude TEXT Glob patterns for files to skip
--rollback-on-fail Auto-revert if post-apply tests fail [default: true]
--test-command TEXT Test command to run after apply
--no-stream Disable streaming display
--config PATH Path to TOML config file
--output PATH Save pipeline output to file
--verbose Show detailed pipeline execution logs
--dry-run Show model assignments without running
--help Show this message and exit
crtx models
List available models and their fitness scores for each stage.
crtx models [OPTIONS]
Options:
--route TEXT Show scores for a specific routing strategy
--stage TEXT Filter to a specific stage
--help Show this message and exit
crtx dashboard
Start the real-time web dashboard. Requires the dashboard optional dependency.
crtx dashboard [OPTIONS]
Options:
--port INT Server port [default: 8420]
--no-browser Don't auto-open browser
--help Show this message and exit
crtx repl
Start an interactive REPL session. Run multiple pipelines without restarting, with session history and tab completion.
crtx repl [OPTIONS]
Options:
--mode TEXT Default pipeline mode
--arbiter TEXT Default arbiter mode
--route TEXT Default routing strategy
--context-dir PATH Project directory for context injection
--help Show this message and exit
crtx setup
Interactive setup wizard. Configures API keys, tests provider connectivity, and creates your initial crtx.toml.
crtx setup
Pipeline Modes
CRTX supports five pipeline modes. Each is suited to a different class of problem.
Sequential
The default mode. Tasks flow through four stages linearly, each building on the previous output.
The Architect designs the solution structure. The Implementer writes code against that scaffold. The Refactorer improves quality and adds tests. The Verifier validates the complete output. The Arbiter reviews at configured checkpoints.
crtx run "Build a user auth service" --mode sequential
Best for: standard development tasks, feature implementation, bug fixes, refactoring.
Parallel
Fan the same task out to multiple models simultaneously. Each model produces an independent solution, then cross-reviews the others' work.
The cross-review scoring evaluates each output on architecture (1-10), implementation quality (1-10), and correctness (1-10). The highest-scoring output wins. A synthesis step then merges the best ideas from all outputs into one.
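For intuition, here is a minimal Python sketch of the cross-review selection step. It is an illustration only; the model names, the scores, and the sum-then-pick selection are assumptions, not CRTX's internals.

# Illustrative sketch of cross-review scoring (not CRTX's actual implementation).
# Peers score each candidate on three 1-10 axes; the highest average total wins.
from statistics import mean

# Hypothetical peer scores: reviewer -> candidate -> (architecture, implementation, correctness)
peer_scores = {
    "claude-opus": {"gpt-4o": (8, 7, 9), "gemini-pro": (7, 8, 8)},
    "gpt-4o": {"claude-opus": (9, 8, 9), "gemini-pro": (8, 7, 8)},
    "gemini-pro": {"claude-opus": (9, 9, 8), "gpt-4o": (7, 8, 8)},
}

def total_score(candidate: str) -> float:
    """Average the summed axes across every peer review of this candidate."""
    reviews = [scores[candidate] for scores in peer_scores.values() if candidate in scores]
    return mean(sum(axes) for axes in reviews)

candidates = {c for scores in peer_scores.values() for c in scores}
winner = max(candidates, key=total_score)
print(f"winner: {winner} ({total_score(winner):.1f}/30)")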
crtx run \
  "Design a caching layer with TTL and invalidation" \
  --mode parallel \
  --models claude-opus,gpt-4o,gemini-pro
Best for: tasks where the approach matters more than speed — data pipelines, complex algorithms, system design.
Debate
Two models take opposing positions on a question. Each writes a proposal, rebuts the other's position, then makes a final argument incorporating the criticisms. A third model serves as judge.
crtx run \
  "Microservices vs monolith for our API gateway" \
  --mode debate
Best for: architectural decisions, technology selection, design tradeoffs. The structured adversarial reasoning produces insights that neither model would generate alone.
Review
Multi-model code analysis with cross-review. Multiple models independently review your code for issues — security vulnerabilities, performance problems, architectural concerns — then cross-review each other's findings. The Arbiter synthesizes all reviews into a unified report.
crtx review-code src/

# With specific focus
crtx review-code src/ --focus security,performance
Best for: code review, security audits, pre-merge quality checks, compliance verification.
Improve
Generate improvements, vote on best, synthesize. Models independently propose improvements to your code, then vote on which changes are most impactful. The best improvements are synthesized into a single changeset.
crtx improve src/

# Apply improvements directly
crtx improve src/ --apply --confirm
Best for: refactoring, optimization, code quality improvement, technical debt reduction.
The Arbiter
The Arbiter is an independent model that reviews pipeline stage outputs. It always uses a different model from the one that produced the work — no model ever grades itself. This cross-model enforcement is the core of CRTX's quality assurance.
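The constraint is easy to picture. The sketch below is illustrative only (the model names, fitness numbers, and function are hypothetical): the Arbiter is always drawn from the models that did not produce the stage output.

# Illustrative only: the Arbiter is never the model that produced the work.
# Model names and fitness numbers here are hypothetical.
ARBITER_FITNESS = {"grok-4": 0.92, "claude-opus": 0.90, "gpt-4o": 0.88}

def pick_arbiter(producer: str) -> str:
    eligible = {m: f for m, f in ARBITER_FITNESS.items() if m != producer}
    if not eligible:
        raise RuntimeError("cross-model review needs at least two configured models")
    return max(eligible, key=eligible.get)

print(pick_arbiter("claude-opus"))  # picks "grok-4", never "claude-opus" itself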
Verdicts
The Arbiter produces one of four verdicts for each review:
Arbiter Modes
Control when the Arbiter reviews:
bookend   Reviews after the first stage (Architect) and the last stage (Verify). Best balance of quality and cost.
full      Reviews after every stage. Maximum quality assurance, higher cost.
final     Reviews only the final output. Lowest overhead.
off       No Arbiter reviews. Not recommended for production work.

# Bookend (default — reviews first and last)
crtx run "..." --arbiter bookend

# Full (reviews every stage)
crtx run "..." --arbiter full

# Final only
crtx run "..." --arbiter final
Smart Routing
CRTX's routing engine uses fitness scores to assign the best available model to each pipeline stage. Fitness scores are computed based on the model's strengths, the stage requirements, and the selected strategy.
Routing Strategies
quality_first    Picks the highest-scoring model for each stage regardless of cost. Best output quality.
cost_optimized   Minimizes total pipeline cost while maintaining a minimum fitness threshold.
speed_first      Selects the fastest models. Useful for iteration and prototyping.
hybrid           Balanced approach — weights quality, cost, and speed equally.

# See model fitness scores for your configured providers
crtx models --route quality_first

# Dry run to see assignments without executing
crtx run "..." --route cost_optimized --dry-run
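As a mental model, strategy-based routing can be thought of as a weighted score per model with a min_fitness floor. The sketch below is an assumption-heavy illustration; the weights and per-model numbers are made up and do not reflect the real routing engine.

# Hypothetical sketch of strategy-weighted model selection (not the real routing engine).
MODELS = {
    # model: (quality, speed, cost_efficiency), each normalized to 0-1 (made-up numbers)
    "claude-opus": (0.95, 0.55, 0.40),
    "gpt-4o": (0.88, 0.75, 0.65),
    "gemini-pro": (0.85, 0.80, 0.80),
}

WEIGHTS = {
    # strategy: (quality_weight, speed_weight, cost_weight)
    "quality_first": (1.0, 0.0, 0.0),
    "speed_first": (0.0, 1.0, 0.0),
    "cost_optimized": (0.2, 0.1, 0.7),
    "hybrid": (1 / 3, 1 / 3, 1 / 3),
}

def fitness(model: str, strategy: str) -> float:
    q, s, c = MODELS[model]
    wq, ws, wc = WEIGHTS[strategy]
    return wq * q + ws * s + wc * c

def route(strategy: str, min_fitness: float = 0.6) -> str:
    scored = {m: fitness(m, strategy) for m in MODELS}
    eligible = {m: f for m, f in scored.items() if f >= min_fitness} or scored
    return max(eligible, key=eligible.get)

print(route("cost_optimized"))  # picks the cheapest eligible model above the fitness floor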
Domain Rules
Domain rules are custom enforcement criteria the Arbiter checks against every stage output. Define them in your crtx.toml to enforce your team's standards.
[domain_rules]
rules = [
  "All database operations must use connection pooling",
  "All API endpoints must validate input with Pydantic models",
  "Error responses must use RFC 7807 problem details format",
  "All async operations must have timeout handling",
  "SQL queries must use parameterized statements, never string formatting",
]
The Arbiter evaluates each rule against the stage output and includes violations in its verdict. FLAG verdicts from domain rule violations are injected into the next stage so the model can address them.
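A rough sketch of what flag injection amounts to, with a hypothetical prompt format and function name; the real injection wording may differ.

# Hypothetical sketch of FLAG injection into the next stage's prompt.
def inject_flags(next_stage_prompt: str, flags: list[str]) -> str:
    """Prepend Arbiter FLAG findings so the next model must address them."""
    if not flags:
        return next_stage_prompt
    header = "The previous stage was flagged by the Arbiter. Address these issues:\n"
    bullets = "\n".join(f"- {flag}" for flag in flags)
    return f"{header}{bullets}\n\n{next_stage_prompt}"

print(inject_flags(
    "Refactor the auth module and add tests.",
    ["Database calls bypass the connection pool",
     "The /users endpoint does not validate input with a Pydantic model"],
))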
Supported Models
CRTX currently supports models from four providers: Anthropic, OpenAI, Google, and xAI.
You only need API keys for the providers you want to use. CRTX works with a single provider, but the routing engine benefits from having multiple options.
API Keys
Set your API keys as environment variables. CRTX auto-detects which providers are available.
# Set in your shell profile or .env file
export ANTHROPIC_API_KEY=sk-ant-...
export OPENAI_API_KEY=sk-...
export GOOGLE_API_KEY=AIza...
export XAI_API_KEY=xai-...

# Or pass via config
# crtx.toml
[keys]
anthropic = "sk-ant-..."
openai = "sk-..."
google = "AIza..."
xai = "xai-..."
Never commit API keys to version control. Use environment variables or a .env file (add it to your .gitignore).
Context Injection
CRTX can scan your project directory and inject relevant code context into the pipeline. Models see your actual codebase — imports, patterns, conventions — and generate code that fits.
# Inject project context (scans Python files by default)
crtx run "Add error handling to the API routes" \
  --context-dir ./backend

# Custom budget (tokens allocated to context)
crtx run "Generate missing tests" \
  --context-dir . \
  --context-budget 32000

# With file filters
crtx run "Refactor the auth module" \
  --context-dir ./src \
  --include "*.py" \
  --exclude "**/tests/**"
The context injector uses AST scanning to extract file signatures, class definitions, function signatures, imports, and docstrings. The top 10 most relevant files are included in full; remaining files contribute signatures only. Budget defaults to 20,000 tokens.
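To make the AST scanning concrete, here is a small standalone sketch built on Python's standard ast module. It only hints at the idea; the real injector's relevance ranking and output format are not shown.

# Minimal sketch of signature extraction with the standard-library ast module.
# Illustrative only; the real injector's ranking and output format may differ.
import ast
from pathlib import Path

def file_signatures(path: Path) -> list[str]:
    """Return import lines, class names, and function signatures for one file."""
    tree = ast.parse(path.read_text(encoding="utf-8"))
    sigs: list[str] = []
    for node in ast.walk(tree):
        if isinstance(node, (ast.Import, ast.ImportFrom)):
            sigs.append(ast.unparse(node))
        elif isinstance(node, ast.ClassDef):
            sigs.append(f"class {node.name}:")
        elif isinstance(node, (ast.FunctionDef, ast.AsyncFunctionDef)):
            args = ", ".join(a.arg for a in node.args.args)
            sigs.append(f"def {node.name}({args}): ...")
    return sigs

for py_file in Path("./src").rglob("*.py"):
    print(py_file, *file_signatures(py_file), sep="\n  ")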
Apply Mode
Apply mode writes pipeline output directly to your codebase with mandatory safety checks. Instead of copying files from crtx-output/, the pipeline resolves file paths and writes them in place.
Safe Direct Write (Phase 1)
The basic apply flow: resolve paths, check git state, preview diffs, write files, run tests, rollback on failure.
# Preview what would be written (dry run)
crtx run "Add error handling" \
  --context-dir ./backend \
  --apply

# Actually write after confirmation
crtx run "Add error handling" \
  --context-dir ./backend \
  --apply --confirm

# Write to a new branch with post-apply testing
crtx run "Refactor auth" \
  --context-dir ./backend \
  --apply --confirm \
  --branch crtx/refactor-auth \
  --test-command "pytest tests/ -q" \
  --rollback-on-fail
Safety gates run before any file is written: a git repository is required, a dirty working tree triggers a warning, writes to protected branches (main/master) are blocked, an Arbiter REJECT verdict blocks the apply, and files modified since the context scan are flagged as conflicts.
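Most of these gates reduce to a few git checks before anything touches disk. The sketch below shows the general shape with plain git commands; the exact commands, messages, and ordering are assumptions, not CRTX's implementation.

# Hypothetical sketch of pre-write safety gates using plain git commands.
import subprocess

def git(*args: str) -> str:
    result = subprocess.run(["git", *args], capture_output=True, text=True, check=True)
    return result.stdout.strip()

def check_safety_gates(protected: tuple[str, ...] = ("main", "master")) -> list[str]:
    problems: list[str] = []
    try:
        git("rev-parse", "--is-inside-work-tree")
    except subprocess.CalledProcessError:
        return ["not a git repository: apply mode requires one"]
    if git("status", "--porcelain"):
        problems.append("working tree is dirty (warning)")
    if git("rev-parse", "--abbrev-ref", "HEAD") in protected:
        problems.append("refusing to write to a protected branch")
    return problems

print(check_safety_gates())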
Intelligent Patching (Phase 2)
For existing files, CRTX can apply surgical edits using AST-aware structured patches instead of whole-file replacement. Patch anchors use function signatures and class names rather than line numbers, so they work even if the file has changed.
Seven patch operations are supported: insert_after, insert_before, replace, delete, insert_import, insert_method, and wrap. Post-patch validation checks syntax, import preservation, and signature integrity.
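To illustrate what an anchor-based operation looks like, here is a simplified sketch of insert_after keyed on a function name rather than a line number, using the standard ast module. It is a toy version of the idea, not the actual patch engine.

# Illustrative sketch of an anchor-based insert_after patch (not the real patch engine).
import ast

def insert_after_function(source: str, anchor_func: str, new_code: str) -> str:
    """Insert new_code immediately after the function named anchor_func."""
    tree = ast.parse(source)
    for node in ast.walk(tree):
        if isinstance(node, (ast.FunctionDef, ast.AsyncFunctionDef)) and node.name == anchor_func:
            lines = source.splitlines()
            # end_lineno is 1-based and points at the function's last line,
            # so inserting at that index places the new code right after it.
            lines[node.end_lineno:node.end_lineno] = ["", new_code]
            patched = "\n".join(lines)
            ast.parse(patched)  # post-patch syntax validation
            return patched
    raise ValueError(f"anchor function {anchor_func!r} not found")

before = "def login(user):\n    return token_for(user)\n"
print(insert_after_function(before, "login", "def logout(user):\n    revoke_tokens(user)"))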
Streaming UI
By default, CRTX renders a real-time multi-panel display showing code as it generates, arbiter reasoning as it reviews, and running cost per model.
The display has four panels: pipeline progress (stage indicators with completion percentage), live output (syntax-highlighted code or refactor diffs), activity log (timestamped events with arbiter feedback), and cost ticker (per-model token counts and costs).
# Disable streaming (use classic status table)
crtx run "Build an API" --no-stream
Streaming is enabled by default for sequential pipelines in interactive terminals (80x24 minimum). It falls back to the classic display for non-interactive environments, parallel/debate modes, or when --no-stream is passed.
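The fallback condition is essentially a terminal-capability check. A minimal sketch of that kind of detection (not CRTX's exact logic):

# Minimal sketch of interactive-terminal detection for the streaming display.
import shutil
import sys

def can_stream(mode: str, no_stream_flag: bool) -> bool:
    cols, rows = shutil.get_terminal_size(fallback=(0, 0))
    return (
        not no_stream_flag
        and mode == "sequential"
        and sys.stdout.isatty()
        and cols >= 80
        and rows >= 24
    )

print(can_stream("sequential", no_stream_flag=False))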
REPL Mode
Start an interactive session to run multiple pipelines without restarting. Session history and tab completion are built in.
crtx repl

# With defaults
crtx repl --mode sequential --arbiter bookend --context-dir ./src
Inside the REPL, type your task directly — no quotes needed. Use /mode, /arbiter, and /route commands to change settings mid-session. /history shows past runs.
Session History
Every pipeline run is persisted to a local SQLite database with full metadata: task, model assignments, arbiter verdicts, costs, tokens, and timing.
# List recent sessions
crtx history

# View a specific session
crtx history show <session-id>

# Replay a session (re-run with same config)
crtx replay <session-id>
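Under the hood it is plain SQLite, so the data is easy to reason about. The snippet below uses a hypothetical schema and an in-memory database purely to illustrate what a persisted run record might look like; the real table layout and file location may differ.

# Hypothetical schema sketch of a persisted run record (not CRTX's actual layout).
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    """CREATE TABLE sessions (
        id INTEGER PRIMARY KEY,
        task TEXT,
        models TEXT,
        total_cost REAL,
        total_tokens INTEGER,
        duration_s REAL
    )"""
)
conn.execute(
    "INSERT INTO sessions (task, models, total_cost, total_tokens, duration_s) "
    "VALUES (?, ?, ?, ?, ?)",
    ("Build a user auth service", "claude-opus,gpt-4o", 0.42, 51230, 187.5),
)
for row in conn.execute("SELECT id, task, total_cost FROM sessions"):
    print(row)
conn.close()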
Auto-Fallback
When a model provider is unavailable (rate limits, outages, timeouts), CRTX automatically falls back to the next-best model by fitness score. Provider health is tracked per-session — once a model fails, it is skipped immediately for subsequent stages.
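The essence of that loop can be sketched in a few lines. Everything below is an illustrative assumption: the error handling, the model list, and the fake provider call stand in for the real implementation.

# Illustrative sketch of fitness-ordered fallback with per-session health tracking.
failed_models: set[str] = set()  # persists across stages within one session

def call_with_fallback(ranked_models: list[str], call_model) -> str:
    """Try models in descending fitness order, skipping any that already failed."""
    for model in ranked_models:
        if model in failed_models:
            continue
        try:
            return call_model(model)
        except Exception as exc:  # rate limit, outage, timeout, ...
            failed_models.add(model)
            print(f"{model} unavailable ({exc}); falling back")
    raise RuntimeError("all configured models are unavailable")

# Fake provider call for demonstration: the first model times out, the second succeeds.
def fake_call(model: str) -> str:
    if model == "gemini-pro":
        raise TimeoutError("provider timeout")
    return f"{model}: ok"

print(call_with_fallback(["gemini-pro", "o3", "claude-sonnet"], fake_call))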
In testing, CRTX completed a full pipeline with both Google Gemini and Claude Opus simultaneously down, falling back to o3 and Claude Sonnet automatically. No configuration needed — fallback is always active.