Model Management

A model is an AI service that viben sends requests to. Each model belongs to a provider and can carry its own configuration settings.

Commands

List Models

List all available models from configured providers:

viben model list

Output (prices are input/output cost in USD per 1M tokens):

Available Models:
  Provider: anthropic-main
    claude-opus-4-20250514        200K context   $15/$75
    claude-sonnet-4-20250514*     200K context   $3/$15
    claude-3-5-haiku-latest       200K context   $0.25/$1.25

  Provider: openai-main
    gpt-4-turbo                   128K context   $10/$30
    gpt-4o                        128K context   $2.5/$10
    gpt-4o-mini                   128K context   $0.15/$0.6

* = default model

Filter by provider:

viben model list --provider anthropic-main

For JSON output:

viben model list --json

Check Model Status

View the availability status of configured models:

# Check all models
viben model status

# Check specific model
viben model status -n claude-sonnet-4-20250514

Output:

Model Status:
  Default: claude-sonnet-4-20250514

  claude-sonnet-4-20250514   anthropic-main   ✓ available
  gpt-4-turbo                openai-main      ✓ available
  claude-3-5-haiku-latest    anthropic-main   ✓ available
  local-llama                local-ollama     ✗ provider offline

Set Default Model

Set a model as the default for all operations:

viben model set-default -n <model>

Example:

viben model set-default -n claude-sonnet-4-20250514

Configuration File

Models are configured in ~/.viben/models.yaml:

# ~/.viben/models.yaml
version: 1

# Default model
default: claude-sonnet-4-20250514

# ============================================================
# Model Aliases
# Use short names to reference commonly used models
# ============================================================
aliases:
  # Speed-focused
  fast: claude-3-5-haiku-latest
  quick: gpt-4o-mini

  # Quality-focused
  smart: claude-sonnet-4-20250514
  balanced: gpt-4o

  # Maximum capability
  best: claude-opus-4-20250514
  powerful: gpt-4-turbo

  # Purpose-specific
  code: claude-sonnet-4-20250514
  chat: claude-3-5-haiku-latest
  reasoning: o1-preview

  # Provider-specific
  gpt: gpt-4-turbo
  claude: claude-sonnet-4-20250514
  gemini: gemini-1.5-pro

# ============================================================
# Fallback Chain
# Models to try, in order, when the primary is unavailable
# ============================================================
fallbacks:
  - claude-sonnet-4-20250514      # Primary
  - gpt-4-turbo                    # First fallback
  - claude-3-5-haiku-latest        # Second fallback
  - gpt-4o-mini                    # Last resort

# ============================================================
# Model-specific Configuration
# Override default parameters for each model
# ============================================================
model_config:
  claude-sonnet-4-20250514:
    provider: anthropic-main
    max_tokens: 8192
    temperature: 0.7

  gpt-4-turbo:
    provider: openai-main
    max_tokens: 4096
    temperature: 0.7
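
The fallback chain above can be read as "use the first model in the list that is currently available". The following is a minimal sketch of that reading, not viben's actual resolution logic, which this page does not specify. The function name `pick_model` and the `status` mapping are hypothetical; the model names and the "available" status string are taken from the examples on this page.

```python
from typing import Mapping, Optional, Sequence


def pick_model(fallbacks: Sequence[str], status: Mapping[str, str]) -> Optional[str]:
    """Return the first model in the chain whose status is 'available'.

    Models missing from the status mapping are treated as unavailable.
    """
    for name in fallbacks:
        if status.get(name) == "available":
            return name
    return None


# Mirrors the `fallbacks` list in ~/.viben/models.yaml.
fallbacks = [
    "claude-sonnet-4-20250514",   # Primary
    "gpt-4-turbo",                # First fallback
    "claude-3-5-haiku-latest",    # Second fallback
    "gpt-4o-mini",                # Last resort
]

# Hypothetical status snapshot in which the primary is down.
status = {
    "claude-sonnet-4-20250514": "offline",
    "gpt-4-turbo": "available",
    "claude-3-5-haiku-latest": "available",
}

print(pick_model(fallbacks, status))
```

Because the list order decides the result, the primary model belongs first and the cheapest last-resort model belongs at the end.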

Model Configuration

Each model can have custom settings that override defaults:

model_config:
  claude-sonnet-4-20250514:
    provider: anthropic-main        # Which provider to use
    max_tokens: 8192                # Maximum output tokens
    temperature: 0.7                # Temperature (0.0-1.0)
    # Optional parameters
    # top_p: 0.9
    # top_k: 40
    # stop_sequences: ["\n\nHuman:"]

  claude-opus-4-20250514:
    provider: anthropic-main
    max_tokens: 4096
    temperature: 0.5                # Lower temperature for more deterministic output

  claude-3-5-haiku-latest:
    provider: anthropic-main
    max_tokens: 4096
    temperature: 0.8

  gpt-4-turbo:
    provider: openai-main
    max_tokens: 4096
    temperature: 0.7

  gpt-4o:
    provider: openai-main
    max_tokens: 4096
    temperature: 0.7

  gpt-4o-mini:
    provider: openai-main
    max_tokens: 4096
    temperature: 0.8

  # Azure-hosted model
  azure-gpt-4:
    provider: azure-gpt4            # Uses Azure provider
    max_tokens: 4096
    temperature: 0.7

  # Google Gemini model
  gemini-1.5-pro:
    provider: google-gemini
    max_tokens: 8192
    temperature: 0.7

  # Local Ollama model
  llama3:
    provider: local-ollama
    max_tokens: 4096
    temperature: 0.8

  # DeepSeek
  deepseek-chat:
    provider: deepseek
    max_tokens: 4096
    temperature: 0.7

  # Groq (LLaMA)
  llama-3.1-70b-versatile:
    provider: groq
    max_tokens: 4096
    temperature: 0.7
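
One way to picture "custom settings that override defaults" is a dictionary merge in which per-model values win over global ones. The sketch below assumes that behavior; the `DEFAULTS` values and the `effective_settings` helper are hypothetical and are not part of viben. The per-model entries copy two of the entries shown above.

```python
from typing import Any, Dict

# Hypothetical global defaults; viben's real defaults are not documented here.
DEFAULTS: Dict[str, Any] = {"max_tokens": 4096, "temperature": 0.7}

# Two entries copied from the model_config example.
MODEL_CONFIG: Dict[str, Dict[str, Any]] = {
    "claude-sonnet-4-20250514": {
        "provider": "anthropic-main",
        "max_tokens": 8192,
        "temperature": 0.7,
    },
    "claude-opus-4-20250514": {
        "provider": "anthropic-main",
        "max_tokens": 4096,
        "temperature": 0.5,
    },
}


def effective_settings(model: str) -> Dict[str, Any]:
    """Start from the defaults, then apply the model's own overrides."""
    merged = dict(DEFAULTS)                      # copy so DEFAULTS is never mutated
    merged.update(MODEL_CONFIG.get(model, {}))   # per-model values win
    return merged


print(effective_settings("claude-opus-4-20250514"))
```

A model with no `model_config` entry simply receives the defaults unchanged.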

Model Capabilities

You can define model capabilities for intelligent model selection:

model_capabilities:
  claude-sonnet-4-20250514:
    context_window: 200000
    supports_vision: true
    supports_tools: true
    supports_streaming: true
    cost_per_1k_input: 0.003
    cost_per_1k_output: 0.015

  claude-opus-4-20250514:
    context_window: 200000
    supports_vision: true
    supports_tools: true
    supports_streaming: true
    cost_per_1k_input: 0.015
    cost_per_1k_output: 0.075

  gpt-4-turbo:
    context_window: 128000
    supports_vision: true
    supports_tools: true
    supports_streaming: true
    cost_per_1k_input: 0.01
    cost_per_1k_output: 0.03

  gpt-4o-mini:
    context_window: 128000
    supports_vision: true
    supports_tools: true
    supports_streaming: true
    cost_per_1k_input: 0.00015
    cost_per_1k_output: 0.0006

Agents can use these fields to choose a model that meets a task's requirements, such as a minimum context window or vision support.
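
As an illustration of capability-based selection, the sketch below filters models by required features and then picks the cheapest match by input cost. The selection rule and the `select_model` function are assumptions made for this example; the page does not describe how agents actually choose. The capability values are copied from the `model_capabilities` example.

```python
from typing import Any, Dict, Optional

MODEL_CAPABILITIES: Dict[str, Dict[str, Any]] = {
    "claude-sonnet-4-20250514": {
        "context_window": 200000, "supports_vision": True, "supports_tools": True,
        "cost_per_1k_input": 0.003, "cost_per_1k_output": 0.015,
    },
    "claude-opus-4-20250514": {
        "context_window": 200000, "supports_vision": True, "supports_tools": True,
        "cost_per_1k_input": 0.015, "cost_per_1k_output": 0.075,
    },
    "gpt-4-turbo": {
        "context_window": 128000, "supports_vision": True, "supports_tools": True,
        "cost_per_1k_input": 0.01, "cost_per_1k_output": 0.03,
    },
    "gpt-4o-mini": {
        "context_window": 128000, "supports_vision": True, "supports_tools": True,
        "cost_per_1k_input": 0.00015, "cost_per_1k_output": 0.0006,
    },
}


def select_model(min_context: int, needs_vision: bool = False,
                 needs_tools: bool = False) -> Optional[str]:
    """Return the cheapest model (by input cost) that meets the requirements."""
    candidates = [
        name for name, caps in MODEL_CAPABILITIES.items()
        if caps["context_window"] >= min_context
        and (caps["supports_vision"] or not needs_vision)
        and (caps["supports_tools"] or not needs_tools)
    ]
    if not candidates:
        return None
    return min(candidates, key=lambda n: MODEL_CAPABILITIES[n]["cost_per_1k_input"])


print(select_model(min_context=100_000))                     # fits every model
print(select_model(min_context=150_000, needs_vision=True))  # needs a 200K window
```

Ranking by input cost alone is a simplification; a task that produces long outputs would weigh `cost_per_1k_output` more heavily.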

Popular Models Reference

Anthropic Models

Model	Context	Input Cost	Output Cost	Notes
`claude-opus-4-20250514`	200K	$15/1M	$75/1M	Most capable
`claude-sonnet-4-20250514`	200K	$3/1M	$15/1M	Balanced
`claude-3-5-haiku-latest`	200K	$0.25/1M	$1.25/1M	Fastest

OpenAI Models

Model	Context	Input Cost	Output Cost	Notes
`gpt-4-turbo`	128K	$10/1M	$30/1M	Most capable
`gpt-4o`	128K	$2.5/1M	$10/1M	Balanced
`gpt-4o-mini`	128K	$0.15/1M	$0.6/1M	Fastest
`o1-preview`	128K	$15/1M	$60/1M	Reasoning

Google Models

Model	Context	Input Cost	Output Cost	Notes
`gemini-1.5-pro`	2M	$1.25/1M	$5/1M	Large context
`gemini-1.5-flash`	1M	$0.075/1M	$0.3/1M	Fast

Local Models (Ollama)

Model	Context	Cost	Notes
`llama3`	8K	Free	Open-source
`llama3.1`	128K	Free	Extended context
`mistral`	32K	Free	Fast
`codellama`	16K	Free	Code-focused
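
The reference tables quote prices per 1M tokens, while the `model_capabilities` fields are per 1K tokens, so a table price is divided by 1,000 before it goes into the config. The sketch below shows that conversion and a per-request cost estimate. Both helper names are hypothetical, and the token counts are made-up inputs for illustration.

```python
def per_million_to_per_1k(price_per_million: float) -> float:
    """Convert a USD price per 1M tokens to a USD price per 1K tokens."""
    return price_per_million / 1000


def estimate_cost(input_tokens: int, output_tokens: int,
                  cost_per_1k_input: float, cost_per_1k_output: float) -> float:
    """Estimate the USD cost of one request from per-1K token prices."""
    return (input_tokens / 1000) * cost_per_1k_input \
        + (output_tokens / 1000) * cost_per_1k_output


# claude-sonnet-4-20250514 is listed at $3/1M input and $15/1M output.
sonnet_in = per_million_to_per_1k(3.0)     # matches cost_per_1k_input: 0.003
sonnet_out = per_million_to_per_1k(15.0)   # matches cost_per_1k_output: 0.015

# Hypothetical request: 12,000 input tokens and 2,000 output tokens.
cost = estimate_cost(12_000, 2_000, sonnet_in, sonnet_out)
print(f"${cost:.4f}")
```

For this request the estimate is 12 × 0.003 + 2 × 0.015 = 0.066 USD.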

Using Models with Aliases

Instead of remembering full model names, use aliases:

# Instead of this:
viben model set-default -n claude-sonnet-4-20250514

# Use an alias:
viben model set-default -n smart

Common alias conventions:

Alias	Typical Model	Use Case
`fast`	claude-3-5-haiku-latest	Quick responses
`smart`	claude-sonnet-4-20250514	Balanced quality
`best`	claude-opus-4-20250514	Maximum quality
`code`	claude-sonnet-4-20250514	Coding tasks
`chat`	claude-3-5-haiku-latest	Casual conversation
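
Conceptually, an alias is a lookup that falls back to the name itself when no alias matches. The sketch below assumes single-level aliases (an alias never points to another alias), which this page does not confirm; `resolve_model` is a hypothetical helper, and the mapping copies the table above.

```python
from typing import Dict

ALIASES: Dict[str, str] = {
    "fast": "claude-3-5-haiku-latest",
    "smart": "claude-sonnet-4-20250514",
    "best": "claude-opus-4-20250514",
    "code": "claude-sonnet-4-20250514",
    "chat": "claude-3-5-haiku-latest",
}


def resolve_model(name: str) -> str:
    """Return the full model name for an alias, or the name unchanged."""
    return ALIASES.get(name, name)


print(resolve_model("smart"))    # an alias
print(resolve_model("gpt-4o"))   # already a full model name
```

Several aliases may point at the same model, as `smart` and `code` do here, so a task-oriented name can be retargeted later without touching the scripts that use it.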

Agent Integration

Models can be configured per-agent:

# Set model for specific agent
viben agent config -n my-agent set model claude-sonnet-4-20250514

# Or use an alias
viben agent config -n my-agent set model smart

JSON Output

All commands support the --json flag:

viben model list --json

{
  "success": true,
  "data": {
    "default": "claude-sonnet-4-20250514",
    "models": [
      {
        "name": "claude-sonnet-4-20250514",
        "provider": "anthropic-main",
        "context_window": 200000,
        "status": "available"
      },
      {
        "name": "gpt-4-turbo",
        "provider": "openai-main",
        "context_window": 128000,
        "status": "available"
      }
    ]
  }
}

viben model status --json

{
  "success": true,
  "data": {
    "default": "claude-sonnet-4-20250514",
    "models": [
      {
        "name": "claude-sonnet-4-20250514",
        "provider": "anthropic-main",
        "status": "available"
      },
      {
        "name": "local-llama",
        "provider": "local-ollama",
        "status": "offline",
        "error": "Provider not running"
      }
    ]
  }
}
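
The JSON form is convenient for scripting. The sketch below parses a payload shaped like the `viben model status --json` example above and lists the models that are not available. The `unavailable_models` helper is hypothetical, and the sample string is an abridged copy of the example output rather than live CLI output.

```python
import json
from typing import List, Tuple

# Abridged copy of the `viben model status --json` example output.
SAMPLE = """
{
  "success": true,
  "data": {
    "default": "claude-sonnet-4-20250514",
    "models": [
      {"name": "claude-sonnet-4-20250514", "provider": "anthropic-main",
       "status": "available"},
      {"name": "local-llama", "provider": "local-ollama",
       "status": "offline", "error": "Provider not running"}
    ]
  }
}
"""


def unavailable_models(payload: str) -> List[Tuple[str, str]]:
    """Return (name, reason) pairs for models whose status is not 'available'."""
    doc = json.loads(payload)
    if not doc.get("success"):
        raise RuntimeError("status payload reported failure")
    return [
        # Prefer the explicit error message; fall back to the status string.
        (m["name"], m.get("error", m["status"]))
        for m in doc["data"]["models"]
        if m["status"] != "available"
    ]


for name, reason in unavailable_models(SAMPLE):
    print(f"{name}: {reason}")
```

In a real script the payload would come from the CLI instead of a string literal, for example by capturing the command's standard output with Python's `subprocess.run`.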

Next Steps

Model Aliases - Create convenient shortcuts for model names
Model Fallbacks - Set up automatic fallback chains
Provider Management - Configure providers for your models
