Get Adaptive’s intelligent model selection without using our inference. Provider-agnostic by design: works with any model, any provider, any infrastructure.
Why Use This?
Use Adaptive’s intelligence, run inference wherever you want:
“I have my own OpenAI/Anthropic accounts” - Get optimal model selection, pay your providers directly
“I run models on-premise” - Get routing decisions for your local infrastructure
“I have enterprise contracts” - Use your existing provider relationships with intelligent routing
“I need data privacy” - Keep inference local while getting smart model selection
Request
Provider-agnostic format: send your available models and prompt, get an intelligent selection back.
models (array, required): The models available for selection. For known models (GPT-4, Claude, Gemini, etc.), just specify provider and model_name; Adaptive knows the rest. Only provide full specs for custom/unknown models.
Model specification options:
Provider + model (for known models): { "provider": "openai", "model_name": "gpt-4o-mini" }
Provider-only (let Adaptive choose the best model): { "provider": "anthropic" }
Model-only (if the provider is obvious): { "model_name": "gpt-4o-mini" }
Full specification (for custom models):
provider (string, required): Provider name (e.g., “openai”, “anthropic”, “local”, “custom”)
model_name (string): Model identifier (required unless provider-only)
cost_per_1m_input_tokens (number): Cost per 1M input tokens (auto-filled for known models)
cost_per_1m_output_tokens (number): Cost per 1M output tokens (auto-filled for known models)
max_context_tokens (number): Maximum context window size (auto-filled for known models)
supports_tool_calling (boolean): Whether the model supports tool/function calling (auto-filled for known models)
max_output_tokens (number): Maximum output tokens (optional)
complexity (string): Model complexity tier: “low”, “medium”, “high” (optional)
task_type (string): Optimized task type (optional)
prompt (string, required): The prompt text to analyze for optimal model selection.
User identifier (string, optional): Enables user-specific cache hits for caching optimization.
cost_bias (number, optional): Cost optimization preference (0.0 = cheapest, 1.0 = best performance). Defaults to the server configuration; override to prioritize cost savings or performance for this specific selection.
tools (array, optional): Available tool definitions for function-calling detection. Tool definitions help Adaptive understand whether your prompt requires function-calling capabilities, steering selection toward models that support tools.
Tool definition properties:
type (string): Type of tool (always “function”)
function (object): Function definition, containing:
name (string): Name of the function
description (string): Description of what the function does
parameters (object): JSON Schema object defining the function parameters
Current tool call (optional): The tool call being made, if any. If this request is part of a tool-calling sequence, provide the current tool call context to help optimize model selection. A hedged request sketch follows the property list below.
Tool call properties:
id (string): Unique identifier for the tool call
type (string): Type of tool call (always “function”)
function (object): Function call details, containing:
name (string): Name of the function being called
arguments (string): JSON string containing the function arguments
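A minimal sketch of a request carrying tool context. The top-level tool_call key is an assumption inferred from the property names above (check the API reference for the exact key); the nested shape mirrors the documented properties:
// Hedged sketch: selection request with tool definitions plus the in-flight
// tool call. "tool_call" as the top-level key is an assumption; the nested
// id/type/function shape follows the properties documented above.
const response = await fetch('https://llmadaptive.uk/api/v1/select-model', {
  method: 'POST',
  headers: { 'X-Stainless-API-Key': apiKey, 'Content-Type': 'application/json' },
  body: JSON.stringify({
    models: [
      { provider: 'openai', model_name: 'gpt-4o-mini' },
      { provider: 'anthropic', model_name: 'claude-3-5-sonnet' }
    ],
    prompt: 'Get the weather, then summarize it in one sentence.',
    tools: [ /* same format as the function-calling example below */ ],
    tool_call: {                // assumed field name
      id: 'call_abc123',
      type: 'function',
      function: {
        name: 'get_weather',
        arguments: '{"location": "San Francisco"}'
      }
    }
  })
});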
Semantic cache configuration (optional): Per-request semantic cache settings. See the hedged sketch below this list.
Semantic cache properties:
Enabled override (boolean): Whether to use semantic caching for this specific request (overrides the server configuration)
Similarity threshold override (number): Similarity threshold for cache hits (0.0-1.0; higher = stricter matching)
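A hedged sketch of a per-request cache override. The semantic_cache, enabled, and similarity_threshold names are hypothetical illustrations; the docs above describe the behavior but not the exact keys:
// Hedged sketch: per-request semantic cache override. "semantic_cache",
// "enabled", and "similarity_threshold" are hypothetical field names; only
// the behavior (on/off override plus a 0.0-1.0 threshold) is documented.
const response = await fetch('https://llmadaptive.uk/api/v1/select-model', {
  method: 'POST',
  headers: { 'X-Stainless-API-Key': apiKey, 'Content-Type': 'application/json' },
  body: JSON.stringify({
    models: [{ provider: 'openai', model_name: 'gpt-4o-mini' }],
    prompt: 'Hello, how are you?',
    semantic_cache: {               // hypothetical key
      enabled: true,                // override the server configuration
      similarity_threshold: 0.95    // stricter matching = fewer cache hits
    }
  })
});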
Response
provider (string): Selected provider name, i.e., the provider chosen for this prompt (e.g., “openai”, “anthropic”, “local”)
model (string): Selected model identifier, i.e., the specific model chosen (e.g., “gpt-4”, “claude-3-5-sonnet”, “llama-3-8b”)
alternatives (array, optional): Alternative provider/model combinations to fall back on if the primary selection is unavailable; each entry has a provider and model
Quick Examples
"Known models - just specify what you have"
# Mix and match specification styles
response=$(curl -s -w "\n%{http_code}" https://llmadaptive.uk/api/v1/select-model \
  -H "X-Stainless-API-Key: $API_KEY" \
  -H "Content-Type: application/json" \
  -d '{
    "models": [
      {"provider": "openai", "model_name": "gpt-4o-mini"},
      {"model_name": "claude-3-5-sonnet"},
      {"provider": "google"}
    ],
    "prompt": "Hello, how are you?"
  }')

http_code=$(echo "$response" | tail -n1)
response_body=$(echo "$response" | head -n -1)

if [ "$http_code" -ge 200 ] && [ "$http_code" -lt 300 ]; then
  echo "Success: $response_body"
else
  echo "Error $http_code: $response_body" >&2
  exit 1
fi
# Success response:
{
  "provider": "openai",
  "model": "gpt-4o-mini"
}
"Just specify providers - let Adaptive choose"
# Even simpler - just say what providers you have access to
response=$(curl -s -w "\n%{http_code}" https://llmadaptive.uk/api/v1/select-model \
  -H "X-Stainless-API-Key: $API_KEY" \
  -H "Content-Type: application/json" \
  -d '{
    "models": [
      {"provider": "openai"},
      {"provider": "anthropic"}
    ],
    "prompt": "Write a complex analysis of market trends"
  }')

http_code=$(echo "$response" | tail -n1)
response_body=$(echo "$response" | head -n -1)

if [ "$http_code" -ge 200 ] && [ "$http_code" -lt 300 ]; then
  echo "Success: $response_body"
else
  echo "Error $http_code: $response_body" >&2
  exit 1
fi
# Success response:
{
  "provider": "anthropic",
  "model": "claude-3-5-sonnet-20241022",
  "alternatives": [
    { "provider": "openai", "model": "gpt-4o" }
  ]
}
"Custom models - specify full details"
# Only specify details for custom/unknown models
response=$(curl -s -w "\n%{http_code}" https://llmadaptive.uk/api/v1/select-model \
  -H "X-Stainless-API-Key: $API_KEY" \
  -H "Content-Type: application/json" \
  -d '{
    "models": [
      {"provider": "openai", "model_name": "gpt-4o-mini"},
      {
        "provider": "local",
        "model_name": "my-custom-llama-fine-tune",
        "cost_per_1m_input_tokens": 0.0,
        "cost_per_1m_output_tokens": 0.0,
        "max_context_tokens": 4096,
        "supports_tool_calling": false,
        "complexity": "medium"
      }
    ],
    "prompt": "Hello, how are you?"
  }')

http_code=$(echo "$response" | tail -n1)
response_body=$(echo "$response" | head -n -1)

if [ "$http_code" -ge 200 ] && [ "$http_code" -lt 300 ]; then
  echo "Success: $response_body"
else
  echo "Error $http_code: $response_body" >&2
  exit 1
fi
# Known models use Adaptive's specs, custom models use yours
"Test cost optimization"
// Will cost_bias actually pick cheaper models?
const response = await fetch('/api/v1/select-model', {
  method: 'POST',
  headers: {
    'X-Stainless-API-Key': apiKey,
    'Content-Type': 'application/json'
  },
  body: JSON.stringify({
    models: [
      {
        provider: "openai",
        model_name: "gpt-4o-mini",
        cost_per_1m_input_tokens: 0.15,
        cost_per_1m_output_tokens: 0.6,
        max_context_tokens: 128000,
        supports_tool_calling: true
      },
      {
        provider: "openai",
        model_name: "gpt-4o",
        cost_per_1m_input_tokens: 2.5,
        cost_per_1m_output_tokens: 10.0,
        max_context_tokens: 128000,
        supports_tool_calling: true
      }
    ],
    prompt: "Analyze this complex dataset and provide insights...",
    cost_bias: 0.1 // Maximize cost savings
  })
});

if (!response.ok) {
  const errorBody = await response.text();
  throw new Error(`HTTP ${response.status}: ${errorBody}`);
}

const result = await response.json();
console.log(result);
// Check if it picked the cheaper model despite complexity
"Function calling optimization"
# Models with function calling will be prioritized when tools are provided
response=$(curl -s -w "\n%{http_code}" https://llmadaptive.uk/api/v1/select-model \
  -H "X-Stainless-API-Key: $API_KEY" \
  -H "Content-Type: application/json" \
  -d '{
    "models": [
      {"provider": "openai", "model_name": "gpt-4o-mini"},
      {"provider": "anthropic", "model_name": "claude-3-haiku"},
      {"provider": "openai", "model_name": "gpt-3.5-turbo"}
    ],
    "prompt": "What is the weather like in San Francisco?",
    "tools": [
      {
        "type": "function",
        "function": {
          "name": "get_weather",
          "description": "Get current weather for a location",
          "parameters": {
            "type": "object",
            "properties": {
              "location": {"type": "string", "description": "City name"}
            },
            "required": ["location"]
          }
        }
      }
    ]
  }')

http_code=$(echo "$response" | tail -n1)
response_body=$(echo "$response" | head -n -1)

if [ "$http_code" -ge 200 ] && [ "$http_code" -lt 300 ]; then
  echo "Success: $response_body"
else
  echo "Error $http_code: $response_body" >&2
  exit 1
fi
# Success response - will prefer models that support function calling:
{
  "provider": "openai",
  "model": "gpt-4o-mini",
  "alternatives": [
    { "provider": "openai", "model": "gpt-3.5-turbo" }
  ]
}
"Compare different configurations”
import requests
import os

# Configuration
BASE_URL = "https://api.yourdomain.com"  # Replace with your actual domain
API_TOKEN = os.getenv("ADAPTIVE_API_TOKEN", "your-api-token-here")  # Set via environment variable
TIMEOUT = 30  # Request timeout in seconds

# Define available models
models = [
    {
        "provider": "openai",
        "model_name": "gpt-4o-mini",
        "cost_per_1m_input_tokens": 0.15,
        "cost_per_1m_output_tokens": 0.6,
        "max_context_tokens": 128000,
        "supports_tool_calling": True,
        "complexity": "low"
    },
    {
        "provider": "openai",
        "model_name": "gpt-4o",
        "cost_per_1m_input_tokens": 2.5,
        "cost_per_1m_output_tokens": 10.0,
        "max_context_tokens": 128000,
        "supports_tool_calling": True,
        "complexity": "high"
    }
]

base_request = {
    "models": models,
    "prompt": "Write Python code to analyze customer data"
}

# Headers for authentication
headers = {
    "Authorization": f"Bearer {API_TOKEN}",
    "Content-Type": "application/json"
}

# Test cost-focused vs performance-focused
configs = [
    {"cost_bias": 0.1, "name": "cost-optimized"},
    {"cost_bias": 0.9, "name": "performance-focused"}
]

for config in configs:
    try:
        response = requests.post(
            f"{BASE_URL}/api/v1/select-model",
            json={**base_request, "cost_bias": config["cost_bias"]},
            headers=headers,
            timeout=TIMEOUT
        )

        # Check if request was successful
        if response.ok:
            result = response.json()
            print(f"{config['name']}: {result['provider']}/{result['model']}")
        else:
            print(f"Error for {config['name']}: HTTP {response.status_code} - {response.text}")
    except requests.exceptions.Timeout:
        print(f"Timeout error for {config['name']}: request took longer than {TIMEOUT} seconds")
    except requests.exceptions.ConnectionError:
        print(f"Connection error for {config['name']}: unable to connect to {BASE_URL}")
    except requests.exceptions.RequestException as e:
        print(f"Request error for {config['name']}: {e}")
    except Exception as e:
        print(f"Unexpected error for {config['name']}: {e}")
Real-World Integration Patterns
1. Use Your Own Provider Accounts
// Define your available models with your own pricing
const availableModels = [
  {
    provider: "openai",
    model_name: "gpt-4o-mini",
    cost_per_1m_input_tokens: 0.15,
    cost_per_1m_output_tokens: 0.6,
    max_context_tokens: 128000,
    supports_tool_calling: true
  },
  {
    provider: "anthropic",
    model_name: "claude-3-5-sonnet-20241022",
    cost_per_1m_input_tokens: 3.0,
    cost_per_1m_output_tokens: 15.0,
    max_context_tokens: 200000,
    supports_tool_calling: true
  }
];

// Get intelligent selection
const selection = await fetch('/api/v1/select-model', {
  method: 'POST',
  headers: { 'X-Stainless-API-Key': adaptiveKey, 'Content-Type': 'application/json' },
  body: JSON.stringify({
    models: availableModels,
    prompt: userMessage
  })
});
const result = await selection.json();

// Route to your own provider accounts
if (result.provider === "openai") {
  const completion = await yourOpenAI.chat.completions.create({
    model: result.model,
    messages: [{ role: "user", content: userMessage }]
  });
} else if (result.provider === "anthropic") {
  const completion = await yourAnthropic.messages.create({
    model: result.model,
    messages: [{ role: "user", content: userMessage }],
    max_tokens: 4096
  });
}
2. On-Premise Model Routing
// Tell Adaptive about your local models (plus a cloud fallback)
const res = await fetch('https://llmadaptive.uk/api/v1/select-model', {
  method: 'POST',
  headers: { 'X-Stainless-API-Key': adaptiveKey, 'Content-Type': 'application/json' },
  body: JSON.stringify({
    models: [
      { provider: "local", model_name: "llama-3-8b" },
      { provider: "local", model_name: "llama-3-70b" },
      { provider: "openai", model_name: "gpt-4" } // Cloud fallback
    ],
    prompt: userMessage
  })
});
const selection = await res.json();

// Route to the right infrastructure using provider/model
if (selection.provider === "local" && selection.model === "llama-3-8b") {
  await yourLocalServer.infer({ model: selection.model, messages: [{ role: "user", content: userMessage }] });
} else if (selection.provider === "openai") {
  await yourOpenAI.chat.completions.create({ model: selection.model, messages: [{ role: "user", content: userMessage }] });
}
3. Enterprise Contract Optimization
// Maximize usage of your enterprise contracts
const res = await fetch('https://llmadaptive.uk/api/v1/select-model', {
  method: 'POST',
  headers: { 'X-Stainless-API-Key': adaptiveKey, 'Content-Type': 'application/json' },
  body: JSON.stringify({
    models: [
      { provider: "anthropic" }, // Your enterprise contract
      { provider: "openai" },    // Your enterprise contract
      { provider: "google" }     // Pay-per-use fallback
    ],
    prompt: userMessage,
    cost_bias: 0.8
  })
});
const selection = await res.json();

// Always use your own accounts
const client = yourProviderClients[selection.provider];
const completion = await client.create({
  model: selection.model,
  messages: [{ role: "user", content: userMessage }]
});
4. Data Privacy & Compliance
// Keep sensitive data local while getting smart routing
const selection = await selectModel({
  models: [
    { provider: "local", model_name: "llama-3-70b" },
    { provider: "local", model_name: "llama-3-8b" }
  ],
  prompt: "NON_SENSITIVE_TASK_DESCRIPTION"
  // Don't send actual sensitive data to Adaptive
});

// Run inference on your secure infrastructure
if (selection.model === "llama-3-70b") {
  // Use your high-end local model
  const result = await yourLocalGPU.infer(actualSensitiveData);
} else {
  // Use your efficient local model
  const result = await yourLocalCPU.infer(actualSensitiveData);
}
Understanding the Response
What You Get Back
{
  "provider": "anthropic",
  "model": "claude-3-5-sonnet-20241022",
  "alternatives": [
    { "provider": "openai", "model": "gpt-4o" }
  ]
}
Key Insights
provider - Which API service to call
model - The specific model identifier to use with that provider
alternatives - Fallback options if the primary selection is unavailable
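A hedged sketch of consuming alternatives as retry fallbacks. callProvider is a hypothetical wrapper around your own provider clients; only the response shape comes from the fields above:
// Hedged sketch: try the primary selection, then each alternative in order.
// callProvider() is a hypothetical stand-in for your own client routing.
async function completeWithFallback(selection, userMessage) {
  const candidates = [
    { provider: selection.provider, model: selection.model },
    ...(selection.alternatives ?? [])
  ];
  for (const candidate of candidates) {
    try {
      return await callProvider(candidate.provider, candidate.model, userMessage);
    } catch (err) {
      console.warn(`${candidate.provider}/${candidate.model} failed, trying next`, err);
    }
  }
  throw new Error("All selected models failed");
}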
Common Patterns
Before/After Comparison
// See what changes with different parameters
const baseline = await selectModel(request);
const withConstraints = await selectModel({
  ...request,
  cost_bias: 0.1
});

console.log(`Baseline: ${baseline.model}`);
console.log(`Cost-optimized: ${withConstraints.model}`);
Validate Your Setup
// Make sure your routing rules work
const shouldUseCheap = await fetch('https://llmadaptive.uk/api/v1/select-model', {
  method: 'POST',
  headers: { 'X-Stainless-API-Key': adaptiveKey, 'Content-Type': 'application/json' },
  body: JSON.stringify({
    models: [{ provider: "openai", model_name: "gpt-4o-mini" }, { provider: "openai", model_name: "gpt-4o" }],
    prompt: "Hi"
  })
}).then(r => r.json());

const shouldUseExpensive = await fetch('https://llmadaptive.uk/api/v1/select-model', {
  method: 'POST',
  headers: { 'X-Stainless-API-Key': adaptiveKey, 'Content-Type': 'application/json' },
  body: JSON.stringify({
    models: [{ provider: "openai", model_name: "gpt-4o-mini" }, { provider: "openai", model_name: "gpt-4o" }],
    prompt: "Analyze this complex dataset..."
  })
}).then(r => r.json());

// Verify different complexity tasks get different models
Authentication
Same as chat completions:
# Any of these work
-H "X-Stainless-API-Key: your-key"
-H "Authorization: Bearer your-key"
No Inference = Fast & Cheap
This endpoint:
✅ Fast - No LLM inference, just routing logic
✅ Cheap - Doesn’t count against token usage
✅ Accurate - Uses the exact same selection logic as real completions
Perfect for testing, debugging, and cost planning without burning through your budget.
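Because selection calls are cheap, you can dry-run routing over a sample of real prompts to forecast spend before shifting traffic. A hedged sketch; the prompts are illustrative, and only the request/response shapes come from this page:
// Hedged sketch: tally which models Adaptive would pick for sample prompts.
// The sample prompts are illustrative, not official figures.
const samplePrompts = ["Hi", "Summarize this contract...", "Write a SQL query..."];
const tally = {};

for (const prompt of samplePrompts) {
  const selection = await fetch('https://llmadaptive.uk/api/v1/select-model', {
    method: 'POST',
    headers: { 'X-Stainless-API-Key': adaptiveKey, 'Content-Type': 'application/json' },
    body: JSON.stringify({
      models: [
        { provider: "openai", model_name: "gpt-4o-mini" },
        { provider: "openai", model_name: "gpt-4o" }
      ],
      prompt
    })
  }).then(r => r.json());
  tally[selection.model] = (tally[selection.model] ?? 0) + 1;
}

console.log("Routing distribution:", tally);
// Combine the distribution with your providers' per-token prices and average
// token counts to estimate spend for each cost_bias setting.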