
Model Management

Models are the AI services you interact with. Each model is associated with a provider and can have custom configuration settings.

Commands

List Models

List all available models from configured providers:

viben model list

Output:

Available Models:
Provider: anthropic-main
claude-opus-4-20250514 200K context $15/$75
claude-sonnet-4-20250514* 200K context $3/$15
claude-3-5-haiku-latest 200K context $0.25/$1.25

Provider: openai-main
gpt-4-turbo 128K context $10/$30
gpt-4o 128K context $2.5/$10
gpt-4o-mini 128K context $0.15/$0.6

* = default model

Filter by provider:

viben model list --provider anthropic-main

For JSON output:

viben model list --json

Check Model Status

View the availability status of configured models:

# Check all models
viben model status

# Check specific model
viben model status -n claude-sonnet-4-20250514

Output:

Model Status:
Default: claude-sonnet-4-20250514

claude-sonnet-4-20250514 anthropic-main ✓ available
gpt-4-turbo openai-main ✓ available
claude-3-5-haiku-latest anthropic-main ✓ available
local-llama local-ollama ✗ provider offline

Set Default Model

Set a model as the default for all operations:

viben model set-default -n <model>

Example:

viben model set-default -n claude-sonnet-4-20250514

Configuration File

Models are configured in ~/.viben/models.yaml:

# ~/.viben/models.yaml
version: 1

# Default model
default: claude-sonnet-4-20250514

# ============================================================
# Model Aliases
# Use short names to reference commonly used models
# ============================================================
aliases:
  # Speed-focused
  fast: claude-3-5-haiku-latest
  quick: gpt-4o-mini

  # Quality-focused
  smart: claude-sonnet-4-20250514
  balanced: gpt-4o

  # Maximum capability
  best: claude-opus-4-20250514
  powerful: gpt-4-turbo

  # Purpose-specific
  code: claude-sonnet-4-20250514
  chat: claude-3-5-haiku-latest
  reasoning: o1-preview

  # Provider-specific
  gpt: gpt-4-turbo
  claude: claude-sonnet-4-20250514
  gemini: gemini-1.5-pro

# ============================================================
# Fallback Chain
# Models to try in order when primary is unavailable
# ============================================================
fallbacks:
  - claude-sonnet-4-20250514   # Primary
  - gpt-4-turbo                # First fallback
  - claude-3-5-haiku-latest    # Second fallback
  - gpt-4o-mini                # Last resort

# ============================================================
# Model-specific Configuration
# Override default parameters for each model
# ============================================================
model_config:
  claude-sonnet-4-20250514:
    provider: anthropic-main
    max_tokens: 8192
    temperature: 0.7

  gpt-4-turbo:
    provider: openai-main
    max_tokens: 4096
    temperature: 0.7
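
The fallback chain can be read as "use the first model in the list that is currently available". The Python sketch below illustrates that rule. It assumes per-model status strings like those reported by viben model status; `pick_model` is a hypothetical helper written for illustration, not part of viben, and the tool's actual fallback logic may differ.

```python
from typing import Optional


def pick_model(fallbacks: list[str], statuses: dict[str, str]) -> Optional[str]:
    """Return the first model in the chain reported as available, else None."""
    for name in fallbacks:
        # A model missing from the status map is treated as unavailable.
        if statuses.get(name) == "available":
            return name
    return None


# Chain copied from the `fallbacks` list in models.yaml.
fallbacks = [
    "claude-sonnet-4-20250514",
    "gpt-4-turbo",
    "claude-3-5-haiku-latest",
    "gpt-4o-mini",
]

# Hypothetical situation: the Anthropic provider is offline.
statuses = {
    "claude-sonnet-4-20250514": "offline",
    "gpt-4-turbo": "available",
    "claude-3-5-haiku-latest": "offline",
    "gpt-4o-mini": "available",
}

print(pick_model(fallbacks, statuses))  # gpt-4-turbo
```

Because the chain is ordered, placing cheaper or local models last keeps them as a last resort rather than a routine choice.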

Model Configuration

Each model can have custom settings that override defaults:

model_config:
  claude-sonnet-4-20250514:
    provider: anthropic-main   # Which provider to use
    max_tokens: 8192           # Maximum output tokens
    temperature: 0.7           # Temperature (0.0-1.0)
    # Optional parameters
    # top_p: 0.9
    # top_k: 40
    # stop_sequences: ["\n\nHuman:"]

  claude-opus-4-20250514:
    provider: anthropic-main
    max_tokens: 4096
    temperature: 0.5           # Lower temperature for more deterministic output

  claude-3-5-haiku-latest:
    provider: anthropic-main
    max_tokens: 4096
    temperature: 0.8

  gpt-4-turbo:
    provider: openai-main
    max_tokens: 4096
    temperature: 0.7

  gpt-4o:
    provider: openai-main
    max_tokens: 4096
    temperature: 0.7

  gpt-4o-mini:
    provider: openai-main
    max_tokens: 4096
    temperature: 0.8

  # Azure-hosted model
  azure-gpt-4:
    provider: azure-gpt4       # Uses Azure provider
    max_tokens: 4096
    temperature: 0.7

  # Google Gemini model
  gemini-1.5-pro:
    provider: google-gemini
    max_tokens: 8192
    temperature: 0.7

  # Local Ollama model
  llama3:
    provider: local-ollama
    max_tokens: 4096
    temperature: 0.8

  # DeepSeek
  deepseek-chat:
    provider: deepseek
    max_tokens: 4096
    temperature: 0.7

  # Groq (LLaMA)
  llama-3.1-70b-versatile:
    provider: groq
    max_tokens: 4096
    temperature: 0.7
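
To illustrate what "override defaults" means in practice, here is a small Python sketch that layers a model's own settings over a set of defaults. The values in `DEFAULTS` are hypothetical (this page does not state viben's built-in defaults), and `effective_config` is an illustrative helper, not a viben API.

```python
# Hypothetical global defaults; viben's real defaults are not documented here.
DEFAULTS = {"max_tokens": 4096, "temperature": 0.7}

# A subset of model_config, showing only the keys each model overrides.
MODEL_CONFIG = {
    "claude-sonnet-4-20250514": {"provider": "anthropic-main", "max_tokens": 8192},
    "claude-opus-4-20250514": {"provider": "anthropic-main", "temperature": 0.5},
}


def effective_config(model: str) -> dict:
    """Layer a model's own settings over the defaults; model values win."""
    return {**DEFAULTS, **MODEL_CONFIG.get(model, {})}


print(effective_config("claude-sonnet-4-20250514"))
print(effective_config("claude-opus-4-20250514"))
```

Under this reading, a model entry only needs to list the parameters that differ from the defaults.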

Model Capabilities

You can define model capabilities for intelligent model selection:

model_capabilities:
  claude-sonnet-4-20250514:
    context_window: 200000
    supports_vision: true
    supports_tools: true
    supports_streaming: true
    cost_per_1k_input: 0.003
    cost_per_1k_output: 0.015

  claude-opus-4-20250514:
    context_window: 200000
    supports_vision: true
    supports_tools: true
    supports_streaming: true
    cost_per_1k_input: 0.015
    cost_per_1k_output: 0.075

  gpt-4-turbo:
    context_window: 128000
    supports_vision: true
    supports_tools: true
    supports_streaming: true
    cost_per_1k_input: 0.01
    cost_per_1k_output: 0.03

  gpt-4o-mini:
    context_window: 128000
    supports_vision: true
    supports_tools: true
    supports_streaming: true
    cost_per_1k_input: 0.00015
    cost_per_1k_output: 0.0006

These capabilities can be used by agents to intelligently select models based on task requirements.
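
As one possible illustration of capability-based selection, the Python sketch below filters models by task requirements and then picks the cheapest match. The capability values are copied from the configuration above; `select_model` and its ranking rule (sum of input and output cost per 1K tokens) are assumptions made for this example, not how viben agents necessarily choose.

```python
from typing import Optional

# Values copied from model_capabilities (streaming omitted for brevity).
CAPABILITIES = {
    "claude-sonnet-4-20250514": {
        "context_window": 200000, "supports_vision": True, "supports_tools": True,
        "cost_per_1k_input": 0.003, "cost_per_1k_output": 0.015,
    },
    "claude-opus-4-20250514": {
        "context_window": 200000, "supports_vision": True, "supports_tools": True,
        "cost_per_1k_input": 0.015, "cost_per_1k_output": 0.075,
    },
    "gpt-4-turbo": {
        "context_window": 128000, "supports_vision": True, "supports_tools": True,
        "cost_per_1k_input": 0.01, "cost_per_1k_output": 0.03,
    },
    "gpt-4o-mini": {
        "context_window": 128000, "supports_vision": True, "supports_tools": True,
        "cost_per_1k_input": 0.00015, "cost_per_1k_output": 0.0006,
    },
}


def select_model(min_context: int, needs_vision: bool = False,
                 needs_tools: bool = False) -> Optional[str]:
    """Return the cheapest model meeting the requirements, or None."""
    candidates = []
    for name, caps in CAPABILITIES.items():
        if caps["context_window"] < min_context:
            continue
        if needs_vision and not caps["supports_vision"]:
            continue
        if needs_tools and not caps["supports_tools"]:
            continue
        # Illustrative ranking: combined input + output price per 1K tokens.
        cost = caps["cost_per_1k_input"] + caps["cost_per_1k_output"]
        candidates.append((cost, name))
    return min(candidates)[1] if candidates else None


print(select_model(min_context=100_000, needs_tools=True))  # gpt-4o-mini
print(select_model(min_context=150_000))                    # claude-sonnet-4-20250514
```

A request that needs more than 128K tokens of context rules out both OpenAI entries, so the cheapest remaining option is the Sonnet model.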

Anthropic Models

| Model | Context | Input Cost | Output Cost | Notes |
| --- | --- | --- | --- | --- |
| claude-opus-4-20250514 | 200K | $15/1M | $75/1M | Most capable |
| claude-sonnet-4-20250514 | 200K | $3/1M | $15/1M | Balanced |
| claude-3-5-haiku-latest | 200K | $0.25/1M | $1.25/1M | Fastest |

OpenAI Models

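| Model | Context | Input Cost | Output Cost | Notes |
| --- | --- | --- | --- | --- |
| gpt-4-turbo | 128K | $10/1M | $30/1M | Most capable |
| gpt-4o | 128K | $2.5/1M | $10/1M | Balanced |
| gpt-4o-mini | 128K | $0.15/1M | $0.6/1M | Fastest |
| o1-preview | 128K | $15/1M | $60/1M | Reasoning |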

Google Models

| Model | Context | Input Cost | Output Cost | Notes |
| --- | --- | --- | --- | --- |
| gemini-1.5-pro | 2M | $1.25/1M | $5/1M | Large context |
| gemini-1.5-flash | 1M | $0.075/1M | $0.3/1M | Fast |
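
The per-1M-token prices in the tables above translate into a per-request cost with simple arithmetic. The Python sketch below works through one case; `estimate_cost` is a hypothetical helper for illustration, and the token counts are made-up inputs.

```python
# (input, output) prices in USD per 1M tokens, copied from the tables above.
PRICES = {
    "claude-sonnet-4-20250514": (3.00, 15.00),
    "gpt-4o-mini": (0.15, 0.60),
    "gemini-1.5-flash": (0.075, 0.30),
}


def estimate_cost(model: str, input_tokens: int, output_tokens: int) -> float:
    """Estimate the USD cost of one request from per-1M-token prices."""
    in_price, out_price = PRICES[model]
    return (input_tokens * in_price + output_tokens * out_price) / 1_000_000


# 10,000 input tokens and 2,000 output tokens on the Sonnet model:
# (10,000 * 3 + 2,000 * 15) / 1,000,000 = 0.06
cost = estimate_cost("claude-sonnet-4-20250514", 10_000, 2_000)
print(f"${cost:.4f}")  # $0.0600
```

Note that the cost_per_1k_* values in model_capabilities express the same prices per 1K tokens, so $3/1M corresponds to 0.003 there.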

Local Models (Ollama)

| Model | Context | Cost | Notes |
| --- | --- | --- | --- |
| llama3 | 8K | Free | Open-source |
| llama3.1 | 128K | Free | Extended context |
| mistral | 32K | Free | Fast |
| codellama | 16K | Free | Code-focused |

Using Models with Aliases

Instead of remembering full model names, use aliases:

# Instead of this:
viben model set-default -n claude-sonnet-4-20250514

# Use an alias:
viben model set-default -n smart

Common alias conventions:

| Alias | Typical Model | Use Case |
| --- | --- | --- |
| fast | claude-3-5-haiku-latest | Quick responses |
| smart | claude-sonnet-4-20250514 | Balanced quality |
| best | claude-opus-4-20250514 | Maximum quality |
| code | claude-sonnet-4-20250514 | Coding tasks |
| chat | claude-3-5-haiku-latest | Casual conversation |
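
Conceptually, an alias is a lookup from a short name to a full model name, with full names passed through unchanged. The Python sketch below shows that idea using a few aliases from models.yaml; `resolve` is a hypothetical function for illustration and says nothing about how viben implements alias resolution internally.

```python
# A subset of the `aliases` mapping from models.yaml.
ALIASES = {
    "fast": "claude-3-5-haiku-latest",
    "smart": "claude-sonnet-4-20250514",
    "best": "claude-opus-4-20250514",
}


def resolve(name: str) -> str:
    """Map an alias to its full model name; return other names unchanged."""
    return ALIASES.get(name, name)


print(resolve("smart"))   # claude-sonnet-4-20250514
print(resolve("gpt-4o"))  # gpt-4o
```

Keeping aliases in one place means a model upgrade only requires changing the mapping, not every command or agent that refers to the alias.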

Agent Integration

Models can be configured per-agent:

# Set model for specific agent
viben agent config -n my-agent set model claude-sonnet-4-20250514

# Or use an alias
viben agent config -n my-agent set model smart

JSON Output

All commands support the --json flag:

viben model list --json

{
  "success": true,
  "data": {
    "default": "claude-sonnet-4-20250514",
    "models": [
      {
        "name": "claude-sonnet-4-20250514",
        "provider": "anthropic-main",
        "context_window": 200000,
        "status": "available"
      },
      {
        "name": "gpt-4-turbo",
        "provider": "openai-main",
        "context_window": 128000,
        "status": "available"
      }
    ]
  }
}

viben model status --json

{
  "success": true,
  "data": {
    "default": "claude-sonnet-4-20250514",
    "models": [
      {
        "name": "claude-sonnet-4-20250514",
        "provider": "anthropic-main",
        "status": "available"
      },
      {
        "name": "local-llama",
        "provider": "local-ollama",
        "status": "offline",
        "error": "Provider not running"
      }
    ]
  }
}
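
JSON output is mainly useful for scripting. As a minimal sketch, the Python snippet below parses a payload shaped like the viben model status --json example and reports the models that are not available. The payload is embedded as a string so the snippet runs on its own; in a real script the text would come from the command's standard output. `unavailable_models` is a hypothetical helper, and the field names are taken from the example rather than from a published schema.

```python
import json

# Sample payload copied from the `viben model status --json` example.
raw = """
{
  "success": true,
  "data": {
    "default": "claude-sonnet-4-20250514",
    "models": [
      {"name": "claude-sonnet-4-20250514", "provider": "anthropic-main",
       "status": "available"},
      {"name": "local-llama", "provider": "local-ollama",
       "status": "offline", "error": "Provider not running"}
    ]
  }
}
"""


def unavailable_models(payload: str) -> list[tuple[str, str]]:
    """Return (model name, reason) pairs for models that are not available."""
    doc = json.loads(payload)
    if not doc.get("success"):
        raise RuntimeError("command reported a failure")
    return [
        # Fall back to the status string when no error message is present.
        (m["name"], m.get("error", m["status"]))
        for m in doc["data"]["models"]
        if m["status"] != "available"
    ]


print(unavailable_models(raw))  # [('local-llama', 'Provider not running')]
```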

Next Steps