How It Works
Adaptive’s model routing is powered by a sophisticated evaluation system that analyzes model performance across diverse benchmarks:1
Benchmark Clustering
We take top benchmarks and cluster questions by embedding each one, creating semantic clusters that group similar tasks together
2
Model Evaluation
Each LLM is evaluated on every cluster, generating performance profiles that show which models excel at specific types of tasks
3
Inference Routing
When a prompt arrives, we embed it to find its cluster match, then select the best-performing model for that cluster
4
Continuous Learning
The system continuously updates profiles as new models and benchmarks become available. Benchmarks are updated based on production workloads, and models are continuously evaluated to guard against performance degradation.
Coming Soon: Custom evaluations for each user, allowing you to define your own benchmarks and evaluation criteria for personalized model routing.
Quick Start
Simply leave the model field empty to enable model routing:Costs shown include Adaptive overhead (0.20/1M output). With
BYOK (custom API keys), you only pay the overhead.
Simple Greeting
“Hello, how are you?”Routes to: Gemini Flash
Cost: $0.10 per 1M tokens
Savings: 97% vs GPT-4
Code Generation
“Write a React component…”Routes to: DeepSeek Coder
Cost: $0.34 per 1M tokens
Savings: 87% vs GPT-4
Complex Analysis
“Analyze this dataset…”Routes to: Claude Sonnet
Cost: $2.19 per 1M tokens
Savings: 72% vs GPT-4
Function Calling
“What’s the weather?” + toolsRoutes to: GPT-5 Mini
Prioritizes function calling support
Smart tool-capable model selection
Configuration Options
Function Calling Support
When tools are provided, Adaptive automatically prioritizes models with function calling capabilities:Control Cost vs Performance
Balance between cost savings and response quality:Limit Available Providers
Restrict routing to specific providers or models:Routing Performance
Accuracy
94% accurate model selection based on prompt analysis
Speed
<1ms routing decision time with zero added latency
Reliability
99.9% uptime with automatic failover mechanisms
Preview Routing Decisions
Want to see which model would be selected before making the request? Use our model selection preview:Response Information
Every response includes provider information:Advanced Use Cases
Enterprise Optimization
Custom provider contracts: Use model routing with your own API
keys and enterprise pricing
Local Deployment
On-premise inference: Get cloud-quality routing decisions for local
model deployments
A/B Testing
Model comparison: Preview different routing strategies before
implementing them
Cost Monitoring
Budget control: Set cost thresholds and optimize spending automatically
Best Practices
Tip: Start with
cost_bias: 0.3 for most applications. This provides
excellent cost savings while maintaining high quality responses.


