Multi-LLM Routing
Overview
Armament supports multiple LLM providers simultaneously. Each channel/agent can be configured with a specific provider and model. The ProviderPool manages provider instances and handles deduplication.
Provider Pool
Providers are configured in ~/.armament/config.yaml:
providers:
  - type: bedrock
    region: us-west-2
    models:
      - name: claude-sonnet-4-20250514
  - type: openai
    apiKey: ${OPENAI_API_KEY}
    models:
      - name: gpt-4o
      - name: gpt-4o-mini
  - type: anthropic
    apiKey: ${ANTHROPIC_API_KEY}
    models:
      - name: claude-sonnet-4-20250514
  - type: gemini
    apiKey: ${GEMINI_API_KEY}
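
The overview says the ProviderPool handles deduplication but does not say what is deduplicated. One plausible reading is that channels sharing the same provider type and settings reuse a single provider instance. The sketch below illustrates that reading only; the class shape, the `get` method, and the `factory` callback are assumptions for illustration, not Armament's actual API.

```python
from dataclasses import dataclass, field
from typing import Any, Callable, Dict


@dataclass
class ProviderPool:
    """Hypothetical pool: one instance per distinct (type, settings) pair."""

    # Assumed callback that builds a provider client from its type and settings.
    factory: Callable[[str, dict], Any]
    _instances: Dict[tuple, Any] = field(default_factory=dict)

    def get(self, provider_type: str, **settings: str) -> Any:
        # Sort the settings so that keyword order does not affect the cache key.
        key = (provider_type, tuple(sorted(settings.items())))
        if key not in self._instances:
            self._instances[key] = self.factory(provider_type, dict(settings))
        return self._instances[key]
```

Under this assumption, two channels that both request `bedrock` in `us-west-2` would share one instance, while a request for a different region would create a second one.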
Per-Agent Model Selection
When spawning an agent, specify the provider and model:
/spawn reviewer --model claude-sonnet-4-20250514 --provider anthropic
Or change the model for an existing channel:
/model gpt-4o
/provider openai
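
In the example configuration, `claude-sonnet-4-20250514` is listed under both `bedrock` and `anthropic`, which is one reason a model name alone may not identify a provider. The following is a minimal sketch of how a model and optional provider could be resolved; the `CATALOG` mapping and the `resolve` function are hypothetical and only mirror the config shown earlier.

```python
from typing import Dict, List, Optional, Tuple

# Hypothetical catalog mirroring the example config (provider -> model names).
CATALOG: Dict[str, List[str]] = {
    "bedrock": ["claude-sonnet-4-20250514"],
    "openai": ["gpt-4o", "gpt-4o-mini"],
    "anthropic": ["claude-sonnet-4-20250514"],
}


def resolve(model: str, provider: Optional[str] = None,
            catalog: Dict[str, List[str]] = CATALOG) -> Tuple[str, str]:
    """Return (provider, model), requiring an explicit provider when ambiguous."""
    candidates = [p for p, models in catalog.items() if model in models]
    if provider is not None:
        if provider not in candidates:
            raise ValueError(f"{provider!r} does not offer {model!r}")
        return provider, model
    if not candidates:
        raise ValueError(f"no configured provider offers {model!r}")
    if len(candidates) > 1:
        raise ValueError(f"{model!r} is ambiguous; choose one of {candidates}")
    return candidates[0], model
```

With this catalog, `resolve("gpt-4o")` picks `openai` on its own, while the Claude model needs an explicit provider, as in the `/spawn` example.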
Supported Providers
| Provider | Auth | Models |
|---|---|---|
| Bedrock | AWS SSO / IAM | Claude, Mistral, Llama |
| Anthropic | API Key | Claude Opus, Sonnet, Haiku |
| OpenAI | API Key | GPT-4o, GPT-4o-mini, o-series |
| Gemini | API Key | Gemini Pro, Flash |
| Ollama | Local | Any local model |
Fallback Chain
Configure a fallback chain to try alternative models if the primary is unavailable:
/fallback gpt-4o claude-sonnet-4-20250514 gemini-pro
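
Conceptually, a fallback chain tries each model in order and stops at the first one that responds. The sketch below shows that loop under stated assumptions: `ModelUnavailable`, `complete_with_fallback`, and the `call` callback are hypothetical names, and the source does not say which errors count as "unavailable".

```python
from typing import Callable, List, Sequence, Tuple


class ModelUnavailable(Exception):
    """Hypothetical error raised when a model cannot serve a request."""


def complete_with_fallback(prompt: str, chain: Sequence[str],
                           call: Callable[[str, str], str]) -> Tuple[str, str]:
    """Try each model in order; return (model_used, response) from the first success."""
    errors: List[str] = []
    for model in chain:
        try:
            return model, call(model, prompt)
        except ModelUnavailable as exc:
            # Record the failure and move on to the next model in the chain.
            errors.append(f"{model}: {exc}")
    raise ModelUnavailable("all models in the chain failed: " + "; ".join(errors))
```

Returning the model name alongside the response lets the caller see which entry in the chain actually handled the request.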
Rate Limiting
Rate limits are tracked per provider. When a provider returns rate limit errors, the system waits and retries automatically.
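
The source does not describe the wait strategy. A common approach is exponential backoff with a little jitter, honoring a server-provided delay when one is available. The sketch below assumes that approach; `RateLimited`, `call_with_retry`, and all default values are hypothetical, and it covers only the wait-and-retry step, not per-provider tracking.

```python
import random
import time
from typing import Callable, Optional, TypeVar

T = TypeVar("T")


class RateLimited(Exception):
    """Hypothetical error for a provider's rate limit response."""

    def __init__(self, retry_after: Optional[float] = None):
        super().__init__("rate limited")
        self.retry_after = retry_after  # seconds suggested by the provider, if any


def call_with_retry(fn: Callable[[], T], max_attempts: int = 5,
                    base_delay: float = 1.0, max_delay: float = 30.0,
                    sleep: Callable[[float], None] = time.sleep) -> T:
    """Call fn, waiting and retrying on RateLimited up to max_attempts times."""
    for attempt in range(max_attempts):
        try:
            return fn()
        except RateLimited as exc:
            if attempt == max_attempts - 1:
                raise  # out of attempts; surface the error to the caller
            if exc.retry_after is not None:
                delay = exc.retry_after
            else:
                delay = min(max_delay, base_delay * 2 ** attempt)
            # Add up to 10% jitter so concurrent callers do not retry in lockstep.
            sleep(delay + random.uniform(0, delay * 0.1))
    raise ValueError("max_attempts must be at least 1")
```

The `sleep` parameter is injectable so the retry logic can be exercised in tests without actually waiting.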
Planned Enhancements
- Cost-optimized routing: Select cheapest model meeting capability threshold
- Quality-first routing: Route complex tasks to best model, simple tasks to cheaper models
- Rule-based routing: Match tasks to models based on complexity, type, context size
- Observability dashboard: /route stats for a model usage breakdown
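
Since these enhancements are only planned, their design is not specified. As an illustration of the first item, cost-optimized routing can be read as "filter by capability, then take the minimum cost". The sketch below assumes that reading; `ModelInfo`, `cheapest_capable`, the model names, and the cost and capability numbers are all placeholders, not real pricing or benchmark data.

```python
from dataclasses import dataclass
from typing import Iterable


@dataclass(frozen=True)
class ModelInfo:
    name: str
    cost: float        # placeholder relative cost per request
    capability: float  # placeholder capability score between 0 and 1


def cheapest_capable(models: Iterable[ModelInfo], threshold: float) -> ModelInfo:
    """Return the lowest-cost model whose capability meets the threshold."""
    eligible = [m for m in models if m.capability >= threshold]
    if not eligible:
        raise LookupError(f"no model meets capability threshold {threshold}")
    return min(eligible, key=lambda m: m.cost)


# Placeholder entries for illustration only.
MODELS = [
    ModelInfo("model-small", cost=1.0, capability=0.5),
    ModelInfo("model-medium", cost=3.0, capability=0.7),
    ModelInfo("model-large", cost=10.0, capability=0.9),
]
```

With these placeholder values, a threshold of 0.6 selects `model-medium`: `model-small` is cheaper but falls below the threshold.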