Model Fallbacks
Fallback chains define the order of models to try when the primary model is unavailable. This keeps your workflows running when a provider experiences downtime, as long as at least one model in the chain is still reachable.
Concept
A fallback chain is an ordered list of models:
1. claude-sonnet-4-20250514 (primary)
2. gpt-4-turbo (first fallback)
3. claude-3-5-haiku-latest (second fallback)
4. gpt-4o-mini (last resort)
When you make a request, Viben tries each model in the chain in order until one succeeds.
Commands
List Fallback Chain
View the current fallback chain:
viben model fallbacks list
Output:
Fallback Chain:
1. claude-sonnet-4-20250514 (anthropic-main)
2. gpt-4-turbo (openai-main)
3. claude-3-5-haiku-latest (anthropic-main)
For JSON output:
viben model fallbacks list --json
Add to Fallback Chain
Add a model to the end of the fallback chain:
viben model fallbacks create -n <model>
Examples:
# Build a fallback chain step by step
viben model fallbacks create -n claude-sonnet-4-20250514
viben model fallbacks create -n gpt-4-turbo
viben model fallbacks create -n claude-3-5-haiku-latest
viben model fallbacks create -n gpt-4o-mini
Remove from Fallback Chain
Remove a model from the fallback chain:
viben model fallbacks remove -n <model>
Example:
viben model fallbacks remove -n gpt-4o-mini
Clear Fallback Chain
Remove all models from the fallback chain:
viben model fallbacks clear
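When a chain is maintained by a script rather than typed by hand, the clear and add commands above can be generated from a list. The following is a minimal sketch, assuming Python is available and `viben` is on the PATH. The helper names `rebuild_chain_commands` and `apply_chain` are hypothetical, and only the `clear` and `create -n` subcommands shown on this page are used.

```python
import subprocess


def rebuild_chain_commands(models):
    """Return the argv lists that clear the chain and re-add models in order."""
    commands = [["viben", "model", "fallbacks", "clear"]]
    for model in models:
        commands.append(["viben", "model", "fallbacks", "create", "-n", model])
    return commands


def apply_chain(models, dry_run=True):
    """Print each command; execute it only when dry_run is False."""
    for argv in rebuild_chain_commands(models):
        print(" ".join(argv))
        if not dry_run:
            # check=True stops at the first failing command
            subprocess.run(argv, check=True)


if __name__ == "__main__":
    # Dry run by default: prints the commands without changing any configuration
    apply_chain(["claude-sonnet-4-20250514", "gpt-4-turbo", "gpt-4o-mini"])
```

Running with the default `dry_run=True` only prints the commands, which makes it easy to review the resulting order before applying it.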
Configuration File
Fallbacks are stored in ~/.viben/models.yaml:
# ~/.viben/models.yaml
version: 1
default: claude-sonnet-4-20250514
fallbacks:
- claude-sonnet-4-20250514 # Primary choice
- gpt-4-turbo # First fallback
- claude-3-5-haiku-latest # Second fallback
- gpt-4o-mini # Last resort
Fallback Strategies
Cross-Provider Fallbacks
Ensure resilience by including models from different providers:
fallbacks:
- claude-sonnet-4-20250514 # Anthropic (primary)
- gpt-4-turbo # OpenAI (fallback)
- gemini-1.5-pro # Google (second fallback)
This protects against provider-specific outages.
Local Fallback
Include a local model as the last resort:
fallbacks:
- claude-sonnet-4-20250514 # Cloud (primary)
- gpt-4-turbo # Cloud (fallback)
- llama3 # Local Ollama (last resort)
This keeps requests working without internet connectivity, provided the local model is installed and running.
Cost-Optimized Fallbacks
Order models by cost, starting with the cheapest:
fallbacks:
- gpt-4o-mini # Cheapest
- claude-3-5-haiku-latest # Inexpensive
- gpt-4o # Moderate
- claude-sonnet-4-20250514 # Higher quality
- claude-opus-4-20250514 # Highest quality (last resort)
Quality-Optimized Fallbacks
Arrange by quality, starting with the best:
fallbacks:
- claude-opus-4-20250514 # Best quality
- claude-sonnet-4-20250514 # High quality
- gpt-4-turbo # Good quality
- claude-3-5-haiku-latest # Faster but lower quality
- gpt-4o-mini # Fast but basic
How Fallbacks Work
When a request is made:
1. Viben tries the first model in the chain.
2. If that model fails (timeout, rate limit, or provider error), Viben moves to the next model.
3. This continues until a model succeeds or the chain is exhausted.
4. If all models fail, an error is returned.
Request
|
v
claude-sonnet-4-20250514 --> FAIL (rate limited)
|
v
gpt-4-turbo --> FAIL (timeout)
|
v
claude-3-5-haiku-latest --> SUCCESS
|
v
Response
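The flow above can be sketched as a simple loop. This is an illustration of the described behavior, not Viben's actual implementation: `run_with_fallbacks`, `ModelUnavailable`, `ChainExhausted`, and the `fake_call` backend are all hypothetical names introduced for the example.

```python
class ModelUnavailable(Exception):
    """Stands in for a timeout, rate limit, or provider error."""


class ChainExhausted(Exception):
    """Raised when every model in the chain has failed."""


def run_with_fallbacks(chain, call_model, prompt):
    """Try each model in order; return (model, response) from the first success."""
    failures = []
    for model in chain:
        try:
            return model, call_model(model, prompt)
        except ModelUnavailable as exc:
            # Record the failure and move on to the next model
            failures.append((model, str(exc)))
    raise ChainExhausted(failures)


def fake_call(model, prompt):
    """Hypothetical backend that reproduces the failures in the diagram."""
    outcomes = {
        "claude-sonnet-4-20250514": "rate limited",
        "gpt-4-turbo": "timeout",
    }
    if model in outcomes:
        raise ModelUnavailable(outcomes[model])
    return f"response from {model}"


chain = ["claude-sonnet-4-20250514", "gpt-4-turbo", "claude-3-5-haiku-latest"]
model, response = run_with_fallbacks(chain, fake_call, "hello")
print(model)  # claude-3-5-haiku-latest
```

Collecting the per-model failures before raising means the final error can report why each model in the chain was skipped.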
Combining with Aliases
Fallbacks can use aliases instead of full model names:
aliases:
smart: claude-sonnet-4-20250514
gpt: gpt-4-turbo
fast: claude-3-5-haiku-latest
fallbacks:
- smart # Resolves to claude-sonnet-4-20250514
- gpt # Resolves to gpt-4-turbo
- fast # Resolves to claude-3-5-haiku-latest
This makes configuration more readable and easier to maintain.
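Alias resolution amounts to a lookup performed before the chain is used. The sketch below is illustrative only: `resolve_chain` is a hypothetical helper, and it assumes that aliases are resolved one level deep and that names without an alias are treated as full model names, neither of which is confirmed by this page.

```python
def resolve_chain(fallbacks, aliases):
    """Replace each alias with its target; leave other names unchanged."""
    return [aliases.get(name, name) for name in fallbacks]


aliases = {
    "smart": "claude-sonnet-4-20250514",
    "gpt": "gpt-4-turbo",
    "fast": "claude-3-5-haiku-latest",
}

# Assumption: a chain may mix aliases with full model names
fallbacks = ["smart", "gpt", "fast", "gpt-4o-mini"]

resolved = resolve_chain(fallbacks, aliases)
print(resolved)
```

Because the chain refers to `smart` rather than a dated model name, upgrading the primary model only requires changing the alias target.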
Best Practices
Mix Providers
Include at least two different providers in your fallback chain:
# Good - multiple providers
fallbacks:
- claude-sonnet-4-20250514 # Anthropic
- gpt-4-turbo # OpenAI
- gemini-1.5-pro # Google
# Less robust - single provider
fallbacks:
- claude-opus-4-20250514
- claude-sonnet-4-20250514
- claude-3-5-haiku-latest
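A provider mix can be checked automatically before a chain is saved. The sketch below is an assumption-laden illustration: it guesses the provider from the model name prefix, using the same pairings as the comments on this page (claude to Anthropic, gpt to OpenAI, gemini to Google, llama to a local model). The names `PROVIDER_PREFIXES`, `provider_of`, and `has_provider_mix` are hypothetical, and a real check would read the provider from the Viben configuration instead.

```python
# Assumption: provider inferred from the model name prefix
PROVIDER_PREFIXES = {
    "claude-": "anthropic",
    "gpt-": "openai",
    "gemini-": "google",
    "llama": "local",
}


def provider_of(model):
    """Return the guessed provider for a model name, or None if unknown."""
    for prefix, provider in PROVIDER_PREFIXES.items():
        if model.startswith(prefix):
            return provider
    return None


def has_provider_mix(chain, minimum=2):
    """True if the chain covers at least `minimum` distinct known providers."""
    providers = {provider_of(model) for model in chain}
    providers.discard(None)  # unknown models do not count toward the mix
    return len(providers) >= minimum


mixed = ["claude-sonnet-4-20250514", "gpt-4-turbo", "gemini-1.5-pro"]
single = ["claude-opus-4-20250514", "claude-sonnet-4-20250514", "claude-3-5-haiku-latest"]

print(has_provider_mix(mixed))   # True
print(has_provider_mix(single))  # False
```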
Include a Fast Option
Include a fast, inexpensive model as a last resort, so that a request can still complete when the higher-quality models are unavailable:
fallbacks:
- claude-sonnet-4-20250514 # Quality first
- gpt-4-turbo # Backup
- gpt-4o-mini # Fast fallback
Keep Chain Reasonable
A chain of 3-5 models is usually sufficient:
# Good - focused chain
fallbacks:
- claude-sonnet-4-20250514
- gpt-4-turbo
- claude-3-5-haiku-latest
# May be excessive - too many fallbacks
fallbacks:
- model1
- model2
- model3
- model4
- model5
- model6
- model7
Test Your Chain
Periodically test that all models in your fallback chain are configured correctly:
viben model status
Recommended Fallback Chains
General Purpose
fallbacks:
- claude-sonnet-4-20250514
- gpt-4-turbo
- claude-3-5-haiku-latest
High Availability
fallbacks:
- claude-sonnet-4-20250514 # Anthropic
- gpt-4o # OpenAI
- gemini-1.5-pro # Google
- llama3 # Local (no internet needed)
Cost-Conscious
fallbacks:
- gpt-4o-mini
- claude-3-5-haiku-latest
- gpt-4o
- claude-sonnet-4-20250514
Development Environment
fallbacks:
- llama3 # Local first (free, fast)
- claude-3-5-haiku-latest # Cloud fallback (cheap)
- gpt-4o-mini # Alternative (cheap)
JSON Output
viben model fallbacks list --json
{
"success": true,
"data": {
"fallbacks": [
{
"order": 1,
"model": "claude-sonnet-4-20250514",
"provider": "anthropic-main",
"status": "available"
},
{
"order": 2,
"model": "gpt-4-turbo",
"provider": "openai-main",
"status": "available"
},
{
"order": 3,
"model": "claude-3-5-haiku-latest",
"provider": "anthropic-main",
"status": "available"
}
]
}
}
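The JSON form is convenient for scripted health checks. The following sketch parses a payload shaped like the one above and reports models that are not available. `unavailable_models` is a hypothetical helper, and the `"unavailable"` status value in the sample is an assumption: this page only shows `"available"`, so the code treats any other value as a problem.

```python
import json


def unavailable_models(payload):
    """Return models, in chain order, whose status is not 'available'."""
    doc = json.loads(payload)
    if not doc.get("success"):
        raise ValueError("payload reports success=false")
    entries = sorted(doc["data"]["fallbacks"], key=lambda entry: entry["order"])
    return [entry["model"] for entry in entries if entry["status"] != "available"]


# Sample payload; the "unavailable" status is a hypothetical value
sample = """
{
  "success": true,
  "data": {
    "fallbacks": [
      {"order": 1, "model": "claude-sonnet-4-20250514",
       "provider": "anthropic-main", "status": "available"},
      {"order": 2, "model": "gpt-4-turbo",
       "provider": "openai-main", "status": "unavailable"}
    ]
  }
}
"""

print(unavailable_models(sample))  # ['gpt-4-turbo']
```

In practice the payload would come from the output of `viben model fallbacks list --json` rather than from an inline string.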
Related
- Model Aliases - Create convenient shortcuts for model names
- Model Management - Configure model settings
- Provider Management - Configure providers for your models