Choosing the Right Model

Learn how to select the best AI model for your specific use case and how to balance performance against cost.

Model Categories

Chat Models
Conversational AI optimized for dialogue

Best for: Chatbots, customer support, interactive applications

Popular models: GPT-4, Claude 3, Gemini Pro

Code Models
Specialized for programming tasks

Best for: Code generation, debugging, documentation

Popular models: GPT-4, Claude 3, CodeLlama

Vision Models
Process both text and images

Best for: Image analysis, OCR, visual Q&A

Popular models: GPT-4 Vision, Gemini Vision

Embedding Models
Convert text to vector representations

Best for: Semantic search, similarity matching, RAG

Popular models: text-embedding-3, voyage-2
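Embedding models turn text into numeric vectors that are typically compared with cosine similarity. A minimal sketch in plain Python, using toy 4-dimensional vectors as stand-ins for real model output (production embeddings have hundreds or thousands of dimensions):

```python
import math

def cosine_similarity(a, b):
    """Cosine similarity between two equal-length embedding vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)

# Toy "embeddings" -- a real model would return these for input texts.
query = [0.1, 0.3, 0.5, 0.1]
doc_a = [0.1, 0.3, 0.5, 0.1]   # same direction as the query
doc_b = [0.5, 0.1, 0.1, 0.3]   # different direction

print(round(cosine_similarity(query, doc_a), 4))  # 1.0
print(cosine_similarity(query, doc_b) < 1.0)      # True
```

Semantic search and RAG pipelines do exactly this at scale: embed the query, then rank stored document vectors by similarity.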

Key Selection Factors

Performance vs Cost
  • Premium models (GPT-4, Claude 3 Opus): Best quality, highest cost
  • Balanced models (GPT-3.5, Claude 3 Sonnet): Good quality, moderate cost
  • Efficient models (Llama, Mistral): Lower cost, good for high volume
Context Window Size
  • Standard (4K-8K): Most conversations and tasks
  • Extended (32K-128K): Long documents, code repositories
  • Ultra (200K-1M): Entire books, massive codebases
Response Speed
  • Real-time apps: Use smaller, faster models
  • Batch processing: Can use larger, slower models
  • Streaming: Most models support token streaming
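The factors above can be combined into a simple selection heuristic. The sketch below is illustrative only: the ~4 characters per token estimate is a rough rule of thumb, and the tier names are placeholders, not real model IDs.

```python
def estimate_tokens(text: str) -> int:
    """Rough heuristic: roughly 4 characters per token for English text."""
    return max(1, len(text) // 4)

def pick_tier(prompt: str, realtime: bool) -> str:
    """Choose a model tier from context size and latency needs (placeholder names)."""
    tokens = estimate_tokens(prompt)
    if tokens > 32_000:
        return "ultra-context-model"      # entire books, massive codebases
    if tokens > 8_000:
        return "extended-context-model"   # long documents, code repositories
    # Short prompts: prefer a small, fast model when latency matters.
    return "efficient-model" if realtime else "premium-model"

print(pick_tier("Hi there", realtime=True))      # efficient-model
print(pick_tier("x" * 200_000, realtime=False))  # ultra-context-model
```

Real token counts depend on the model's tokenizer, so use the provider's tokenizer library for anything cost- or limit-sensitive.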

Optimization Strategies

Model Routing

Automatically select the best model based on your requirements:

{
  "model": "router/auto",
  "route": {
    "preferences": ["quality", "speed"],
    "fallbacks": ["gpt-3.5-turbo"]
  }
}
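In application code, routing like this amounts to trying an ordered list of preferred models and falling back on failure. The sketch below is a hypothetical illustration, not the actual `router/auto` implementation; the `call_model` stub stands in for a real API client.

```python
def route_request(prompt, candidates, fallbacks, call_model):
    """Try preferred models in order, then fallbacks; return the first success."""
    for model in list(candidates) + list(fallbacks):
        try:
            return model, call_model(model, prompt)
        except RuntimeError:
            continue  # model unavailable or over quota; try the next one
    raise RuntimeError("all models failed")

# Stub client: pretend the premium model is currently down.
def call_model(model, prompt):
    if model == "premium-model":
        raise RuntimeError("unavailable")
    return f"{model} says: ok"

model, reply = route_request("Hello", ["premium-model"], ["gpt-3.5-turbo"], call_model)
print(model)  # gpt-3.5-turbo
```

The same loop structure also covers retries with backoff or per-model timeouts if you extend the exception handling.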

A/B Testing

Compare models side-by-side to find the best fit:

  • Test quality with sample prompts
  • Compare response times
  • Analyze cost per request
  • Monitor user satisfaction
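The checklist above can be scripted: run the same prompts through both models and tally latency and cost. The harness below uses a stubbed `call_model` and made-up per-1K-token prices; swap in a real client and real pricing for an actual test.

```python
import time

PRICES = {"model-a": 0.03, "model-b": 0.002}  # hypothetical $ per 1K tokens

def call_model(model, prompt):
    """Stub standing in for a real API call; returns (reply, tokens_used)."""
    return f"{model}: answer", len(prompt) // 4 + 10

def compare(models, prompts):
    """Run every prompt through every model and report latency and cost."""
    results = {}
    for model in models:
        start = time.perf_counter()
        tokens = sum(call_model(model, p)[1] for p in prompts)
        elapsed = time.perf_counter() - start
        results[model] = {
            "latency_s": round(elapsed, 4),
            "cost_usd": round(tokens / 1000 * PRICES[model], 6),
        }
    return results

report = compare(["model-a", "model-b"], ["What is RAG?", "Summarize this doc."])
print(report["model-b"]["cost_usd"] < report["model-a"]["cost_usd"])  # True
```

Pair the cost and latency numbers with human or automated quality ratings before committing to the cheaper model.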

Next Steps

  • Explore Models: Browse our full catalog with live pricing
  • Try It Out: Test models in our interactive playground
  • Integration Guide: Learn how to integrate models in your app