How It Works
1. Request Analysis
Our ML models analyze your prompt’s complexity, length, task type, and function-calling requirements in real time.
2. Provider Selection
The routing engine weighs available providers, cost, performance metrics, and function-calling support.
3. Optimal Match
The best model is selected and your request is routed to it automatically.
4. Response Delivery
You receive a standard response, with provider information showing which model was used.
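The four steps above can be caricatured as a toy router. This is purely illustrative (simple keyword and length heuristics, hypothetical model names), not Adaptive's actual ML models:

```python
def route(prompt: str, has_tools: bool = False) -> str:
    # 1. Request analysis: extract simple features from the prompt.
    features = {
        "length": len(prompt),
        "mentions_code": any(
            kw in prompt.lower() for kw in ("code", "function", "component")
        ),
        "needs_tools": has_tools,
    }
    # 2-3. Provider selection + optimal match: pick the cheapest
    #      candidate that satisfies the request's requirements.
    if features["needs_tools"]:
        return "gpt-4o-mini"      # prioritizes function-calling support
    if features["mentions_code"]:
        return "deepseek-coder"   # code-specialized, low cost
    if features["length"] > 500:
        return "claude-sonnet"    # long/complex prompts
    return "gemini-flash"         # simple requests go to the cheapest model
    # 4. Response delivery would attach the chosen model to the response.
```

In the real system the features and selection policy are learned, but the shape of the decision (requirements first, then cost) is the same.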
Quick Start
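A minimal sketch of an empty-model request. The payload shape is an assumption (an OpenAI-style chat completions body); check Adaptive's API reference for the exact schema:

```python
import json

# Hypothetical request payload: leaving "model" empty is what enables
# intelligent routing. Field names follow the common OpenAI-style shape
# and are assumptions, not Adaptive's documented API.
payload = {
    "model": "",  # empty model field -> the router picks one for you
    "messages": [
        {"role": "user", "content": "Hello, how are you?"},
    ],
}

body = json.dumps(payload)
# The body would then be POSTed to the chat completions endpoint.
```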
Simply leave the model field empty to enable intelligent routing.

Real Examples
Simple Greeting
“Hello, how are you?”
Routes to: Gemini Flash
Cost: $0.10 per 1M tokens
Savings: 97% vs GPT-4
Code Generation
“Write a React component…”
Routes to: DeepSeek Coder
Cost: $0.34 per 1M tokens
Savings: 87% vs GPT-4
Complex Analysis
“Analyze this dataset…”
Routes to: Claude Sonnet
Cost: $2.19 per 1M tokens
Savings: 72% vs GPT-4
Function Calling
“What’s the weather?” + tools
Routes to: GPT-4o Mini
Prioritizes function calling support
Smart tool-capable model selection
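The function-calling case can be sketched as a request that carries a tool definition. The tool schema follows the common OpenAI-style shape; the field names are assumptions, not Adaptive's documented format:

```python
# Hypothetical payload: when a non-empty "tools" list is present,
# routing prefers tool-capable models.
payload = {
    "model": "",  # empty -> intelligent routing
    "messages": [{"role": "user", "content": "What's the weather?"}],
    "tools": [
        {
            "type": "function",
            "function": {
                "name": "get_weather",
                "description": "Look up current weather for a city",
                "parameters": {
                    "type": "object",
                    "properties": {"city": {"type": "string"}},
                    "required": ["city"],
                },
            },
        }
    ],
}

# A router checks for tools before selecting a model.
needs_tools = bool(payload.get("tools"))
```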
Configuration Options
Function Calling Support
When tools are provided, Adaptive automatically prioritizes models with function-calling capabilities.

Control Cost vs Performance
Balance between cost savings and response quality.

Limit Available Providers
Restrict routing to specific providers or models.

Routing Performance
Accuracy
94% accurate model selection based on prompt analysis
Speed
Routing decisions in under 1 ms, adding effectively zero latency
Reliability
99.9% uptime with automatic failover mechanisms
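The configuration options described above can be sketched in a single request payload. The parameter names `cost_bias` (assumed range 0.0 = cheapest possible, 1.0 = best quality) and `providers` are assumptions; consult Adaptive's API reference for the real schema:

```python
# Hypothetical payload combining the configuration options.
payload = {
    "model": "",  # empty -> intelligent routing
    "messages": [{"role": "user", "content": "Summarize this report."}],
    "cost_bias": 0.3,                      # lean toward cost savings
    "providers": ["openai", "anthropic"],  # restrict routing to these
}

def validate(p: dict) -> None:
    # Basic sanity checks a client might run before sending the request.
    assert 0.0 <= p["cost_bias"] <= 1.0, "cost_bias must be in [0, 1]"
    assert p["providers"], "at least one provider must be allowed"

validate(payload)
```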
Preview Routing Decisions
Want to see which model would be selected before making the request? Use our model selection preview.

Response Information
Every response includes provider information.

Advanced Use Cases
Enterprise Optimization
Custom provider contracts: Use intelligent routing with your own API keys and enterprise pricing
Local Deployment
On-premise inference: Get cloud-quality routing decisions for local model deployments
A/B Testing
Model comparison: Preview different routing strategies before implementing them
Cost Monitoring
Budget control: Set cost thresholds and optimize spending automatically
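The provider information described above is what makes use cases like cost monitoring and A/B testing possible. A hedged sketch of reading it back, where the `provider` and `model` fields are assumptions about the response shape:

```python
# Hypothetical response: the "provider" and "model" fields are
# assumptions about where the routing information appears.
response = {
    "id": "resp_123",
    "provider": "google",
    "model": "gemini-flash",
    "choices": [
        {"message": {"role": "assistant", "content": "Hi! I'm doing well."}}
    ],
}

# Pull out which provider/model actually served the request,
# e.g. for cost monitoring or comparing routing strategies.
served_by = (response.get("provider"), response.get("model"))
```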
Best Practices
Tip: Start with cost_bias: 0.3 for most applications. This provides excellent cost savings while maintaining high-quality responses.

Important: Always handle the case where no suitable model is found. The API will return an error with suggested alternatives.
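Putting the two notes above together, error handling for the no-suitable-model case might look like the following. The error shape, the `suggested_alternatives` idea, and the model names are all assumptions, not Adaptive's documented error format:

```python
class NoSuitableModelError(Exception):
    """Raised when routing cannot find a model for the request."""
    def __init__(self, alternatives):
        super().__init__("no suitable model found")
        self.alternatives = alternatives

def send(payload: dict) -> dict:
    # Stand-in for the real API call; here we just simulate the error
    # the API would return, with its suggested alternatives.
    raise NoSuitableModelError(alternatives=["gpt-4o-mini", "claude-haiku"])

def send_with_fallback(payload: dict) -> dict:
    try:
        return send(payload)
    except NoSuitableModelError as err:
        # Retry with the first suggested alternative pinned explicitly.
        retry = dict(payload, model=err.alternatives[0])
        return {"routed_to": retry["model"]}

result = send_with_fallback({
    "model": "",
    "cost_bias": 0.3,  # recommended starting point from the tip above
    "messages": [{"role": "user", "content": "hi"}],
})
```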