Choosing the Right Model
Learn how to select the best AI model for your specific use case and balance quality, speed, and cost.
Model Categories
Chat Models
Conversational AI optimized for dialogue
Best for: Chatbots, customer support, interactive applications
Popular models: GPT-4, Claude 3, Gemini Pro
Code Models
Specialized for programming tasks
Best for: Code generation, debugging, documentation
Popular models: GPT-4, Claude 3, CodeLlama
Vision Models
Process both text and images
Best for: Image analysis, OCR, visual Q&A
Popular models: GPT-4 Vision, Gemini Vision
Embedding Models
Convert text to vector representations
Best for: Semantic search, similarity matching, RAG
Popular models: text-embedding-3, voyage-2
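Embedding workflows typically reduce to a nearest-neighbor search over vectors. A minimal sketch using cosine similarity, with illustrative 3-dimensional vectors standing in for real embeddings (which have hundreds or thousands of dimensions):

```python
import math

# Toy semantic-search sketch: real embedding models return vectors with
# hundreds or thousands of dimensions; these 3-D vectors are illustrative.
def cosine_similarity(a, b):
    """Cosine of the angle between two vectors -- 1.0 means identical direction."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)

query = [0.9, 0.1, 0.0]
documents = {
    "refund policy":  [0.8, 0.2, 0.1],
    "weather report": [0.1, 0.1, 0.9],
}

# Rank documents by similarity to the query, highest first.
best = max(documents, key=lambda d: cosine_similarity(query, documents[d]))
print(best)  # "refund policy"
```

The same ranking step powers RAG: retrieve the top-scoring documents and pass them to a chat model as context.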
Key Selection Factors
Performance vs Cost
- Premium models (GPT-4, Claude 3 Opus): Best quality, highest cost
- Balanced models (GPT-3.5, Claude 3 Sonnet): Good quality, moderate cost
- Efficient models (Llama, Mistral): Lower cost, good for high volume
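Cost differences between tiers compound quickly at volume. A rough cost-per-request sketch; the per-million-token prices and tier names below are illustrative placeholders, not live pricing:

```python
# Rough cost-per-request estimator. The per-million-token prices below are
# illustrative placeholders, not live pricing -- check your provider's catalog.
PRICES_PER_M_TOKENS = {           # (input_price, output_price) in USD
    "premium":   (10.00, 30.00),
    "balanced":  (0.50, 1.50),
    "efficient": (0.10, 0.30),
}

def estimate_cost(tier: str, input_tokens: int, output_tokens: int) -> float:
    """Return the estimated USD cost of one request for a pricing tier."""
    in_price, out_price = PRICES_PER_M_TOKENS[tier]
    return (input_tokens * in_price + output_tokens * out_price) / 1_000_000

# One request with 1,000 input tokens and 500 output tokens:
print(f"premium:   ${estimate_cost('premium', 1000, 500):.4f}")
print(f"efficient: ${estimate_cost('efficient', 1000, 500):.4f}")
```

Multiplying these figures by expected daily request counts makes the tier trade-off concrete before you commit to a model.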
Context Window Size
- Standard (4K-8K tokens): Most conversations and tasks
- Extended (32K-128K tokens): Long documents, code repositories
- Ultra (200K-1M tokens): Entire books, massive codebases
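Before sending a long prompt, it helps to check that it fits the window. A sketch using the common 4-characters-per-token approximation for English text (a provider's tokenizer gives exact counts); the window sizes here are illustrative:

```python
# Rough context-fit check. The 4-characters-per-token heuristic is a common
# approximation for English text; use your provider's tokenizer for exact counts.
CONTEXT_WINDOWS = {      # illustrative window sizes in tokens
    "standard": 8_000,
    "extended": 128_000,
    "ultra": 1_000_000,
}

def rough_token_count(text: str) -> int:
    return max(1, len(text) // 4)

def fits(text: str, tier: str, reserved_for_output: int = 1_000) -> bool:
    """True if the prompt plus a reserved output budget fits in the window."""
    return rough_token_count(text) + reserved_for_output <= CONTEXT_WINDOWS[tier]

prompt = "Summarize this report." * 100   # ~2,200 characters, ~550 tokens
print(fits(prompt, "standard"))  # True
```

Reserving a slice of the window for output matters: a prompt that exactly fills the context leaves no room for the model to respond.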
Response Speed
- Real-time apps: Use smaller, faster models
- Batch processing: Can use larger, slower models
- Streaming: Most models support token streaming
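Token streaming delivers output incrementally, which improves perceived latency even when total generation time is unchanged. A stand-in sketch using a plain generator; real provider SDKs expose a similar iterator of text deltas:

```python
import time

# Simulated token streaming. Real provider SDKs return a similar iterator of
# text deltas; this stand-in just yields words with an optional fake delay.
def fake_stream(text: str, delay: float = 0.0):
    """Yield a response one token at a time, like a streaming API."""
    for token in text.split():
        time.sleep(delay)        # network latency stand-in
        yield token + " "

# Consume the stream incrementally -- users see output as it arrives,
# which matters far more for perceived speed than total generation time.
chunks = []
for chunk in fake_stream("Streaming keeps real-time apps responsive"):
    chunks.append(chunk)         # a real UI would render each chunk here
print("".join(chunks).strip())
```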
Optimization Strategies
Model Routing
Automatically select the best model based on your requirements:
{
  "model": "router/auto",
  "route": {
    "preferences": ["quality", "speed"],
    "fallbacks": ["gpt-3.5-turbo"]
  }
}
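Conceptually, a router like this ranks candidate models by a weighted score over your stated preferences. A minimal client-side sketch; the model names and quality/speed scores below are illustrative assumptions, not benchmark data:

```python
# Minimal sketch of preference-based model routing. The candidate scores are
# illustrative assumptions; a real router would use live benchmark and
# availability data.
CANDIDATES = {
    "gpt-4":         {"quality": 0.95, "speed": 0.62},
    "claude-3":      {"quality": 0.93, "speed": 0.64},
    "gpt-3.5-turbo": {"quality": 0.80, "speed": 0.90},
}

def route(preferences: list[str], fallbacks: list[str]) -> str:
    """Pick the best-scoring model; earlier preferences carry more weight."""
    weights = {pref: 1.0 / (i + 1) for i, pref in enumerate(preferences)}

    def score(model: str) -> float:
        return sum(w * CANDIDATES[model][p] for p, w in weights.items())

    ranked = sorted(CANDIDATES, key=score, reverse=True)
    # Fall back to the configured default if no candidate is available.
    return ranked[0] if ranked else fallbacks[0]

print(route(["quality", "speed"], ["gpt-3.5-turbo"]))  # "gpt-4"
```

Reordering the preferences changes the winner: asking for `["speed"]` alone selects the fastest candidate instead.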
A/B Testing
Compare models side-by-side to find the best fit:
- Test quality with sample prompts
- Compare response times
- Analyze cost per request
- Monitor user satisfaction
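The comparison steps above can be wired into a small harness. The two "models" here are stand-in functions; swap in real API calls to measure actual latency, then pair the results with your per-request cost data:

```python
import time
import statistics

# Minimal A/B harness. The two "models" are stand-in functions; replace them
# with real API calls to measure actual latency and output quality.
def model_a(prompt: str) -> str:
    time.sleep(0.001)            # pretend network/generation latency
    return prompt.upper()

def model_b(prompt: str) -> str:
    return prompt.lower()

def benchmark(model, prompts):
    """Return (responses, mean latency in seconds) over a batch of prompts."""
    responses, latencies = [], []
    for p in prompts:
        start = time.perf_counter()
        responses.append(model(p))
        latencies.append(time.perf_counter() - start)
    return responses, statistics.mean(latencies)

prompts = ["Hello", "What is RAG?"]
_, mean_a = benchmark(model_a, prompts)
_, mean_b = benchmark(model_b, prompts)
print(f"A: {mean_a:.4f}s  B: {mean_b:.4f}s")
```

Run the same prompt set through both candidates so quality, latency, and cost comparisons are apples-to-apples.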
Next Steps
Explore Models
Browse our full catalog with live pricing
Try It Out
Test models in our interactive playground
Integration Guide
Learn how to integrate models in your app