AI Agents & Executors

Introduction

Forge supports two types of agents that work together: AI coding agents and specialized agents. Understanding the distinction is key to using Forge well.

The Key Distinction:

AI Coding Agents = The execution platforms (CLI tools that run AI models)
Specialized Agents = Custom prompts that work with ANY coding agent
Example: Your “test-writer” specialized agent can run on Claude today, Gemini tomorrow

AI Coding Agents vs Specialized Agents

The Concept

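Think of it like this: the AI coding agent is the engine that executes the task, and the specialized agent is the set of instructions you hand to it. Because the instructions are just a prompt, the same specialized agent works with any coding agent.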

Quick Comparison

Aspect	AI Coding Agents	Specialized Agents
What it is	The AI execution platform	Custom prompt/instructions
Examples	Claude Code, Gemini, Cursor CLI	test-writer, security-expert
How many per task	Pick ONE	Optional, pick ZERO or ONE
Reusability	Locked to that agent	Works with ANY coding agent
Configuration	API keys, models	Custom prompts, instructions
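The portability described in the table can be sketched with a small shell helper. `attempt_cmd` is a hypothetical function for illustration, not part of Forge; it only assembles the `forge task attempt` command line used throughout this page and prints it instead of running it.

```shell
#!/bin/sh
# Hypothetical helper: print (do not run) a forge attempt command.
# Usage: attempt_cmd TASK_ID CODING_AGENT [SPECIALIZED_AGENT]
attempt_cmd() {
  if [ -n "${3:-}" ]; then
    printf 'forge task attempt %s --llm %s --specialized %s\n' "$1" "$2" "$3"
  else
    printf 'forge task attempt %s --llm %s\n' "$1" "$2"
  fi
}

# The same specialized agent paired with two different coding agents.
attempt_cmd 42 claude test-writer
attempt_cmd 42 gemini test-writer
```

Only the `--llm` value changes between the two printed commands; the specialized agent stays the same.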

The 8 AI Coding Agents

Forge can execute tasks using these AI coding agents, including open-source and LLM-agnostic options:

Claude Code

Provider: Anthropic
Type: Commercial
Best for: Complex reasoning, refactoring, architecture

forge task attempt 42 --llm claude

Claude Code Router

Provider: Open-source
Type: LLM-agnostic router
Best for: Using ANY model instead of Claude

forge task attempt 42 --llm claude-router

Use this to route to local models, Groq, or any other provider!

Cursor CLI

Provider: Cursor
Type: Commercial
Best for: Fast iterations, UI work, quick fixes

forge task attempt 42 --llm cursor

Gemini

Provider: Google
Type: Commercial
Best for: Cost-effective, multimodal, large context

forge task attempt 42 --llm gemini

Codex

Provider: OpenAI
Type: Commercial
Best for: Code completion, established patterns

forge task attempt 42 --llm codex

Amp

Provider: Sourcegraph
Type: Commercial
Best for: Code intelligence, search-augmented

forge task attempt 42 --llm amp

OpenCode

Provider: Open-source
Type: Fully local
Best for: Privacy, offline work, cost savings

forge task attempt 42 --llm opencode

Qwen Code

Provider: Alibaba (open-source)
Type: Fully local
Best for: Multilingual, Asian languages, local execution

forge task attempt 42 --llm qwen

The Power of Choice

Not Locked to Subscriptions

Unlike other platforms, Forge doesn’t force you into specific subscriptions:

Use open-source models - OpenCode and Qwen Code run fully locally
Route to any LLM - Claude Code Router lets you use any provider
Bring your own API keys - Pay providers directly, not through Forge
Mix and match - Use Claude for architecture, Gemini for tests, OpenCode for refactoring

Compare Agents on Same Task

Try multiple agents and choose the best result:

# Attempt 1 with Claude
forge task attempt 42 --llm claude

# Attempt 2 with Gemini
forge task attempt 42 --llm gemini

# Attempt 3 with Cursor
forge task attempt 42 --llm cursor

# Compare all three
forge task compare 42

# Merge the best one
forge task merge 42 2  # Gemini won!
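When the same task should go to several agents, a loop saves retyping. The following is a minimal sketch: the `DRY_RUN` variable and the `run` and `attempt_all` helpers are conventions of this example, not Forge features. By default it only prints the commands it would run.

```shell
#!/bin/sh
# Sketch: attempt one task with several coding agents, then compare.
# DRY_RUN, run() and attempt_all() are assumptions of this example.
DRY_RUN="${DRY_RUN:-1}"

run() {
  if [ "$DRY_RUN" = "1" ]; then
    printf '%s\n' "$*"      # print the command only
  else
    "$@"                    # execute it for real
  fi
}

# Usage: attempt_all TASK_ID AGENT [AGENT...]
attempt_all() {
  task_id="$1"
  shift
  for agent in "$@"; do
    run forge task attempt "$task_id" --llm "$agent"
  done
  run forge task compare "$task_id"
}

attempt_all 42 claude gemini cursor
```

Setting `DRY_RUN=0` makes the helper execute the commands instead of printing them. Merging is left as a manual step, since choosing the winning attempt requires reviewing the comparison first.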

Specialized Agent System

Specialized agents are custom prompts that modify how any AI coding agent approaches a task. They’re reusable, portable, and work with ALL coding agents.

How They Work

Built-in Specialized Agents

Agent	Purpose	What it does
test-writer	Testing focus	Ensures comprehensive test coverage
security-expert	Security hardening	Reviews for vulnerabilities, adds security measures
pr-reviewer	Code review	Analyzes code quality, suggests improvements
auth-specialist	Authentication	Expert in OAuth, JWT, session management
performance-optimizer	Optimization	Focuses on speed, memory, efficiency
documentation-writer	Documentation	Adds JSDoc, README, examples

Using Specialized Agents

# Use with any coding agent
forge task attempt 42 --llm claude --specialized test-writer
forge task attempt 42 --llm gemini --specialized security-expert
forge task attempt 42 --llm cursor --specialized pr-reviewer

# Or configure as default for a task type
forge config set task.type.test.specialized test-writer

Creating Custom Specialized Agents

Create your own in .forge/specialized/:

# .forge/specialized/api-expert.yaml
name: api-expert
description: Expert in REST API design and implementation
prompt: |
  You are an expert in REST API design. When implementing this task:

  1. Follow RESTful principles strictly
  2. Use proper HTTP methods and status codes
  3. Implement comprehensive error handling
  4. Add request/response validation
  5. Include OpenAPI/Swagger documentation
  6. Write integration tests for all endpoints

  Focus on:
  - Clean resource naming
  - Versioning strategy
  - Authentication/authorization
  - Rate limiting
  - Caching headers

Use it:

forge task attempt 42 --llm claude --specialized api-expert
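If you create specialized agents often, a scaffold script can write the skeleton for you. This is a sketch under stated assumptions: `new_specialized_agent` is a hypothetical helper, `db-expert` is a made-up agent name, and the file contains only the three fields shown in the example above (`name`, `description`, `prompt`). The placeholder prompt is meant to be replaced by hand.

```shell
#!/bin/sh
# Sketch: scaffold a specialized agent file with the three fields shown above.
# new_specialized_agent is a hypothetical helper, not a Forge command.
# Usage: new_specialized_agent PROJECT_DIR NAME DESCRIPTION
new_specialized_agent() {
  dir="$1/.forge/specialized"
  mkdir -p "$dir" || return 1
  cat > "$dir/$2.yaml" <<EOF
name: $2
description: $3
prompt: |
  You are $2. When implementing this task:

  1. (replace with your own instructions)
EOF
  # Print the path of the file that was written.
  printf '%s\n' "$dir/$2.yaml"
}

# Example: write into a throwaway directory and show the result.
demo_dir="$(mktemp -d)"
cat "$(new_specialized_agent "$demo_dir" db-expert "Expert in database schema design")"
```

The description is written unquoted, so keep it free of YAML-significant sequences such as `: ` or a leading `#`, or add quoting to the template.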

Agent Selection Matrix

Choose the right agent for the job:

By Task Type

Task Type	Recommended Agent	Why
Architecture	Claude Code	Best reasoning, handles complexity
Refactoring	Claude Code, Cursor	Strong code understanding
New Features	Gemini, Cursor	Fast, cost-effective
Bug Fixes	Cursor CLI	Quick iterations
UI Work	Cursor CLI	Visual focus
Security	Claude + security-expert	Deep analysis
Testing	Gemini + test-writer	Comprehensive, cheap
Documentation	Any + documentation-writer	Consistent style
Privacy-sensitive	OpenCode, Qwen Code	Fully local

By Priority

Priority	Strategy	Agents
Speed	Fast iteration	Cursor CLI, Gemini
Quality	Multiple attempts	Claude + Gemini + Cursor
Cost	Cheap models	Gemini, OpenCode, Qwen
Privacy	Local execution	OpenCode, Qwen Code
Reliability	Proven models	Claude Code

By Budget

Budget	Approach	Mix
No budget	Open-source only	OpenCode, Qwen Code
Limited	Cheap + selective premium	Gemini (default), Claude (critical)
Flexible	Best tool for job	Claude (complex), Cursor (UI), Gemini (tests)
Unlimited	Quality first	Claude Code everywhere + multiple attempts
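The "By Task Type" recommendations can be encoded as a lookup so scripts pick flags consistently. This is an illustrative sketch: `recommend_flags` and the task-type keys (`architecture`, `bug-fix`, and so on) are hypothetical names chosen for this example, and where the table lists several agents the sketch simply takes the first one. The "Documentation" row is omitted because it does not name a specific coding agent.

```shell
#!/bin/sh
# Sketch: map a task type to forge flags, following the "By Task Type" table.
# recommend_flags and its task-type keys are assumptions of this example.
recommend_flags() {
  case "$1" in
    architecture|refactoring) printf '%s\n' '--llm claude' ;;
    new-feature)              printf '%s\n' '--llm gemini' ;;
    bug-fix|ui)               printf '%s\n' '--llm cursor' ;;
    security)                 printf '%s\n' '--llm claude --specialized security-expert' ;;
    testing)                  printf '%s\n' '--llm gemini --specialized test-writer' ;;
    privacy)                  printf '%s\n' '--llm opencode' ;;
    *)                        printf 'unknown task type: %s\n' "$1" >&2; return 1 ;;
  esac
}

# Print the resulting command for a testing task.
printf 'forge task attempt 42 %s\n' "$(recommend_flags testing)"
```

Treat the mapping as a starting point and adjust it to your own priority and budget.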

Performance Comparison

Real-world data from Namastex Labs (50+ projects):

Speed (Average task completion)

Cursor CLI:     3-5 min
Gemini:         4-6 min
Codex:          4-6 min
Claude Code:    5-8 min
OpenCode:       8-12 min (local)

Quality (Code review score)

Claude Code:       ██████████ 9.2/10
Cursor CLI:        █████████░ 8.7/10
Gemini:            ████████░░ 8.3/10
Codex:             ████████░░ 8.1/10
OpenCode:          ███████░░░ 7.5/10

Cost (Per 1000 tasks)

OpenCode:       Free (local hardware)
Qwen Code:      Free (local hardware)
Gemini:         $45-60
Codex:          $80-120
Cursor CLI:     $120-160
Claude Code:    $180-240

Costs are estimates and vary based on task complexity, context size, and model versions. Always monitor your actual usage.
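To turn the per-1000-task figures into a rough budget, scale them linearly by your expected task count. The sketch below assumes cost grows proportionally with the number of tasks, which is a simplification; `estimate_cost` is a hypothetical helper, and the rates are copied from the list above.

```shell
#!/bin/sh
# Sketch: scale the "per 1000 tasks" figures above to a given task count.
# estimate_cost is a hypothetical helper; linear scaling is an assumption.
# Usage: estimate_cost AGENT TASK_COUNT
estimate_cost() {
  case "$1" in
    opencode|qwen) lo=0;   hi=0 ;;    # free apart from local hardware
    gemini)        lo=45;  hi=60 ;;
    codex)         lo=80;  hi=120 ;;
    cursor)        lo=120; hi=160 ;;
    claude)        lo=180; hi=240 ;;
    *) printf 'no cost data for: %s\n' "$1" >&2; return 1 ;;
  esac
  awk -v n="$2" -v lo="$lo" -v hi="$hi" \
    'BEGIN { printf "$%.2f-$%.2f\n", lo * n / 1000, hi * n / 1000 }'
}

estimate_cost gemini 200   # 200 tasks at $45-60 per 1000
```

For 200 Gemini tasks this prints `$9.00-$12.00`, since 45 × 200 / 1000 = 9 and 60 × 200 / 1000 = 12.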

Best Practices

Multiple Attempts Strategy

For important features, use this proven workflow:

Fast prototype

# Attempt 1: Quick iteration with Cursor
forge task attempt 42 --llm cursor

Get a working prototype fast to validate the approach.

Quality improvement

# Attempt 2: Refined version with Claude
forge task attempt 42 --llm claude --specialized security-expert

Get production-quality code with security review.

Cost-effective alternative

# Attempt 3: Budget option with Gemini
forge task attempt 42 --llm gemini

See if a cheaper model can match the quality.

Compare and choose

# Review all attempts side-by-side
forge task compare 42

# Merge the best
forge task merge 42 2

Agent-Specific Tips

Claude Code: Use for complex reasoning

Claude excels at:

System architecture decisions
Complex refactoring
Security-critical code
Understanding large codebases

# Good use case
forge task create \
  --title "Redesign authentication system" \
  --llm claude \
  --specialized security-expert

Gemini: Best value for testing

Gemini is perfect for:

Writing comprehensive tests
Documentation generation
Simple feature implementation
High-volume tasks

# Excellent cost/quality ratio
forge task create \
  --title "Add unit tests for API" \
  --llm gemini \
  --specialized test-writer

Cursor CLI: Speed matters

Cursor is fastest for:

Quick bug fixes
UI tweaks
Iterative development
Real-time experimentation

# When speed is priority
forge task create \
  --title "Fix button alignment" \
  --llm cursor

OpenCode/Qwen: Privacy + cost savings

Use local models when you need:

Complete privacy (sensitive code)
Offline development
Zero API costs
Full control

# For sensitive projects
forge task create \
  --title "Implement payment processing" \
  --llm opencode  # Runs 100% locally

Next Steps

Agent Setup Guides

Configure each AI coding agent

Creating Tasks

Start using agents in your workflow

Task Attempts

Master the multiple attempts system

Workflows

See agents in action
