POST /v1/select-model
Select Model
curl --request POST \
  --url https://api.llmadaptive.uk/v1/select-model \
  --header 'Authorization: Bearer <token>' \
  --header 'Content-Type: application/json' \
  --data '{
  "models": [
    "<provider>:<model_name>"
  ],
  "prompt": "<string>",
  "user": "<string>",
  "cost_bias": 123,
  "tools": [
    {
      "type": "<string>",
      "function": {
        "name": "<string>",
        "description": "<string>",
        "parameters": {}
      }
    }
  ],
  "tool_call": {
    "id": "<string>",
    "type": "<string>",
    "function": {
      "name": "<string>",
      "arguments": "<string>"
    }
  },
  "model_router_cache": {
    "enabled": true,
    "semantic_threshold": 123
  }
}'
{
  "provider": "<string>",
  "model": "<string>",
  "alternatives": [
    {
      "provider": "<string>",
      "model": "<string>"
    }
  ]
}
Get Adaptive’s intelligent model selection without using our inference. Provider-agnostic by design: works with any models, any providers, and any infrastructure.

Why Use This?

Use Adaptive’s intelligence, run inference wherever you want:
  • “I have my own OpenAI/Anthropic accounts” - Get optimal model selection, pay your providers directly
  • “I run models on-premise” - Get routing decisions for your local infrastructure
  • “I have enterprise contracts” - Use your existing provider relationships with intelligent routing
  • “I need data privacy” - Keep inference local while getting smart model selection

Request

Provider-agnostic format - send your available models and prompt, get intelligent selection back.
models
array
required
Array of available model specifications in provider:model_name format. Adaptive automatically queries the Model Registry to fill in pricing, capabilities, and other details for known models.
prompt
string
required
The prompt text to analyze for optimal model selection
user
string
Optional user identifier for caching optimization (enables user-specific cache hits)
cost_bias
number
Cost optimization preference (0.0 = cheapest, 1.0 = best performance). Defaults to the server configuration; override to prioritize cost savings or performance for this specific selection.
tools
object[]
Available tool definitions for function calling detection. Tool definitions help Adaptive understand whether your prompt requires function calling capabilities, steering selection toward models that support tools.
tool_call
object
Current tool call being made (if any). If this request is part of a tool calling sequence, provide the current tool call context to help optimize model selection.
model_router_cache
object
Semantic cache configuration for this request
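
For example, a complete request combining shorthand model strings, a user identifier for user-scoped cache hits, and the semantic cache (the threshold value here is illustrative; the valid range depends on your server configuration):

curl https://api.llmadaptive.uk/v1/select-model \
  -H "X-Stainless-API-Key: $API_KEY" \
  -H "Content-Type: application/json" \
  -d '{
    "models": ["openai:gpt-5-mini", "anthropic:claude-sonnet-4-5"],
    "prompt": "Hello, how are you?",
    "user": "user-123",
    "cost_bias": 0.5,
    "model_router_cache": {
      "enabled": true,
      "semantic_threshold": 0.85
    }
  }'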

Response

provider
string
Selected provider name. The provider that was chosen for this prompt (e.g., “openai”, “anthropic”, “gemini”)
model
string
Selected model identifier. The specific model that was chosen (e.g., “gpt-5-mini”, “claude-sonnet-4-5”, “glm-4.6”)
alternatives
array
Alternative provider/model combinations (optional). Fallback options if the primary selection is unavailable

Quick Examples

"Known models - just specify what you have"

curl https://api.llmadaptive.uk/v1/select-model \
  -H "X-Stainless-API-Key: $API_KEY" \
  -H "Content-Type: application/json" \
  -d '{
    "models": [
      "openai:gpt-5-mini",
      "anthropic:claude-sonnet-4-5",
      "gemini:gemini-2.5-flash-lite"
    ],
    "prompt": "Hello, how are you?"
  }'

# Response:
{
  "provider": "openai",
  "model": "gpt-5-mini"
}

"Just specify providers - let Adaptive choose"

curl https://api.llmadaptive.uk/v1/select-model \
  -H "X-Stainless-API-Key: $API_KEY" \
  -H "Content-Type: application/json" \
  -d '{
    "models": [
      "openai:gpt-5-mini",
      "anthropic:claude-sonnet-4-5"
    ],
    "prompt": "Write a complex analysis of market trends"
  }'

# Response:
{
  "provider": "anthropic",
  "model": "claude-sonnet-4-5",
  "alternatives": [
    {"provider": "openai", "model": "gpt-5-mini"}
  ]
}

"Test cost optimization"

// Will cost_bias actually pick cheaper models?
const response = await fetch("https://api.llmadaptive.uk/v1/select-model", {
  method: "POST",
  headers: {
    "X-Stainless-API-Key": apiKey,
    "Content-Type": "application/json",
  },
  body: JSON.stringify({
    models: ["openai:gpt-4.1-nano", "openai:gpt-5-mini"],
    prompt: "Analyze this complex dataset and provide insights...",
    cost_bias: 0.1, // Maximize cost savings
  }),
});

if (!response.ok) {
  const errorBody = await response.text();
  throw new Error(`HTTP ${response.status}: ${errorBody}`);
}

const result = await response.json();
console.log(result);
// Check if it picked the cheaper model despite complexity

"Function calling optimization"

curl https://api.llmadaptive.uk/v1/select-model \
  -H "X-Stainless-API-Key: $API_KEY" \
  -H "Content-Type: application/json" \
  -d '{
    "models": [
      "openai:gpt-5-codex",
      "anthropic:claude-opus-4-1",
      "openai:gpt-5-nano"
    ],
    "prompt": "What is the weather like in San Francisco?",
    "tools": [
      {
        "type": "function",
        "function": {
          "name": "get_weather",
          "description": "Get current weather for a location",
          "parameters": {
            "type": "object",
            "properties": {
              "location": {"type": "string", "description": "City name"}
            },
            "required": ["location"]
          }
        }
      }
    ]
  }'

# Response - will prefer models that support function calling:
{
  "provider": "openai",
  "model": "gpt-5-codex",
  "alternatives": [
    {"provider": "anthropic", "model": "claude-opus-4-1"}
  ]
}
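
If the request is itself part of an ongoing tool-calling sequence, you can also pass the current tool call as context via the tool_call field (the id and arguments here are illustrative):

curl https://api.llmadaptive.uk/v1/select-model \
  -H "X-Stainless-API-Key: $API_KEY" \
  -H "Content-Type: application/json" \
  -d '{
    "models": ["openai:gpt-5-codex", "anthropic:claude-opus-4-1"],
    "prompt": "What is the weather like in San Francisco?",
    "tool_call": {
      "id": "call_abc123",
      "type": "function",
      "function": {
        "name": "get_weather",
        "arguments": "{\"location\": \"San Francisco\"}"
      }
    }
  }'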

"Compare different configurations”

import requests
import os

# Configuration
BASE_URL = "https://api.llmadaptive.uk"  # Adaptive API base URL
API_TOKEN = os.getenv("ADAPTIVE_API_TOKEN", "your-api-token-here")  # Set via environment variable
TIMEOUT = 30  # Request timeout in seconds

# Define available models
models = [
    "openai:gpt-4.1-nano",
    "openai:gpt-5-mini"
]

base_request = {
    "models": models,
    "prompt": "Write Python code to analyze customer data"
}

# Headers for authentication
headers = {
    "Authorization": f"Bearer {API_TOKEN}",
    "Content-Type": "application/json"
}

# Test cost-focused vs performance-focused
configs = [
    {"cost_bias": 0.1, "name": "cost-optimized"},
    {"cost_bias": 0.9, "name": "performance-focused"}
]

for config in configs:
    try:
        response = requests.post(
            f"{BASE_URL}/api/v1/select-model",
            json={
                **base_request,
                "cost_bias": config["cost_bias"]
            },
            headers=headers,
            timeout=TIMEOUT
        )

        # Check if request was successful
        if response.ok:
            result = response.json()
            print(f"{config['name']}: {result['provider']}/{result['model']}")
        else:
            print(f"Error for {config['name']}: HTTP {response.status_code} - {response.text}")

    except requests.exceptions.Timeout:
        print(f"Timeout error for {config['name']}: Request took longer than {TIMEOUT} seconds")
    except requests.exceptions.ConnectionError:
        print(f"Connection error for {config['name']}: Unable to connect to {BASE_URL}")
    except requests.exceptions.RequestException as e:
        print(f"Request error for {config['name']}: {e}")
    except Exception as e:
        print(f"Unexpected error for {config['name']}: {e}")

Real-World Integration Patterns

1. Use Your Own Provider Accounts

// Define your available models with your own pricing
const availableModels = [
  "openai:gpt-5-mini",
  "anthropic:claude-sonnet-4-5",
  "gemini:gemini-2.5-flash-lite",
];

// Get intelligent selection
const selection = await fetch("https://api.llmadaptive.uk/v1/select-model", {
  method: "POST",
  headers: {
    "X-Stainless-API-Key": adaptiveKey,
    "Content-Type": "application/json",
  },
  body: JSON.stringify({
    models: availableModels,
    prompt: userMessage,
  }),
});

const result = await selection.json();

// Route to your own provider accounts
if (result.provider === "openai") {
  const completion = await yourOpenAI.chat.completions.create({
    model: result.model,
    messages: [{ role: "user", content: userMessage }],
  });
} else if (result.provider === "anthropic") {
  const completion = await yourAnthropic.messages.create({
    model: result.model,
    messages: [{ role: "user", content: userMessage }],
    max_tokens: 4096,
  });
}

2. Multi-Provider Routing

// Tell Adaptive about your preferred providers (plus a fallback)
const res = await fetch("https://api.llmadaptive.uk/v1/select-model", {
  method: "POST",
  headers: {
    "X-Stainless-API-Key": adaptiveKey,
    "Content-Type": "application/json",
  },
  body: JSON.stringify({
    models: [
      "anthropic:claude-opus-4-1",
      "z-ai:glm-4.6",
      "openai:gpt-5-mini", // Cloud fallback
    ],
    prompt: userMessage,
  }),
});
const selection = await res.json();

// Route to the right provider using provider/model
if (selection.provider === "anthropic") {
  await yourAnthropic.messages.create({
    model: selection.model,
    messages: [{ role: "user", content: userMessage }],
    max_tokens: 4096,
  });
} else if (selection.provider === "openai") {
  await yourOpenAI.chat.completions.create({
    model: selection.model,
    messages: [{ role: "user", content: userMessage }],
  });
} else if (selection.provider === "z-ai") {
  await yourZAIClient.generate({
    model: selection.model,
    messages: [{ role: "user", content: userMessage }],
  });
}

3. Enterprise Contract Optimization

// Maximize usage of your enterprise contracts
const res = await fetch("https://api.llmadaptive.uk/v1/select-model", {
  method: "POST",
  headers: {
    "X-Stainless-API-Key": adaptiveKey,
    "Content-Type": "application/json",
  },
  body: JSON.stringify({
    models: [
      "anthropic:claude-opus-4-1", // Your enterprise contract
      "openai:gpt-5-mini", // Your enterprise contract
      "gemini:gemini-2.5-flash-lite", // Pay-per-use fallback
    ],
    prompt: userMessage,
    cost_bias: 0.8,
  }),
});
const selection = await res.json();

// Always use your own accounts
const client = yourProviderClients[selection.provider];
const completion = await client.create({
  model: selection.model,
  messages: [{ role: "user", content: userMessage }],
});

4. Data Privacy & Compliance
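
This pattern and the Common Patterns snippets below reuse a small selectModel helper. A minimal sketch, assuming the same endpoint and header conventions as the examples above:

// Minimal helper wrapping the select-model endpoint (sketch; adjust auth to your setup)
async function selectModel(request) {
  const res = await fetch("https://api.llmadaptive.uk/v1/select-model", {
    method: "POST",
    headers: {
      "X-Stainless-API-Key": adaptiveKey,
      "Content-Type": "application/json",
    },
    body: JSON.stringify(request),
  });
  if (!res.ok) throw new Error(`HTTP ${res.status}: ${await res.text()}`);
  return res.json();
}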

// Prefer privacy-focused providers while keeping prompts redacted
const selection = await selectModel({
  models: [
    "z-ai:glm-4.6",              // Zero-retention provider
    "anthropic:claude-haiku-4-5" // Privacy-focused Anthropic model
  ],
  prompt: "NON_SENSITIVE_TASK_DESCRIPTION",
  // Don't send actual sensitive data to Adaptive
});

// Route sensitive content only to providers you approve
if (selection.provider === "z-ai") {
  const result = await yourZAIClient.generate({
    model: selection.model,
    messages: actualSensitiveData,
  });
} else if (selection.provider === "anthropic") {
  const result = await yourAnthropic.messages.create({
    model: selection.model,
    messages: actualSensitiveData,
  });
}

Understanding the Response

What You Get Back

{
  "provider": "openai",
  "model": "gpt-5-mini",
  "alternatives": [{ "provider": "anthropic", "model": "claude-sonnet-4-5" }]
}

Key Insights

  • provider - Which API service should be called
  • model - The specific model identifier to use with that provider
  • alternatives - Fallback options if the primary selection is unavailable
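
One way to use alternatives is as a fallback chain. A minimal sketch, where callProvider is a hypothetical dispatcher over your own provider clients:

// Try the primary selection first, then each alternative in order
const candidates = [
  { provider: result.provider, model: result.model },
  ...(result.alternatives ?? []),
];

let completion;
for (const { provider, model } of candidates) {
  try {
    // callProvider is hypothetical: route to your own client for each provider
    completion = await callProvider(provider, model, userMessage);
    break; // stop at the first provider/model that succeeds
  } catch (err) {
    console.warn(`${provider}/${model} failed, trying next option`);
  }
}
if (!completion) throw new Error("All provider/model candidates failed");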

Common Patterns

Before/After Comparison

// See what changes with different parameters
const baseline = await selectModel(request);
const withConstraints = await selectModel({
  ...request,
  cost_bias: 0.1,
});

console.log(`Baseline: ${baseline.model}`);
console.log(`Cost-optimized: ${withConstraints.model}`);

Validate Your Setup

// Make sure your routing rules work
const shouldUseCheap = await fetch(
  "https://api.llmadaptive.uk/v1/select-model",
  {
    method: "POST",
    headers: {
      "X-Stainless-API-Key": adaptiveKey,
      "Content-Type": "application/json",
    },
    body: JSON.stringify({
      models: ["openai:gpt-4.1-nano", "openai:gpt-5-mini"],
      prompt: "Hi",
    }),
  },
).then((r) => r.json());

const shouldUseExpensive = await fetch(
  "https://api.llmadaptive.uk/v1/select-model",
  {
    method: "POST",
    headers: {
      "X-Stainless-API-Key": adaptiveKey,
      "Content-Type": "application/json",
    },
    body: JSON.stringify({
      models: ["openai:gpt-4.1-nano", "openai:gpt-5-mini"],
      prompt: "Analyze this complex dataset...",
    }),
  },
).then((r) => r.json());

// Verify different complexity tasks get different models

Authentication

Same as chat completions:
# Any of these work
-H "X-Stainless-API-Key: your-key"
-H "Authorization: Bearer your-key"

No Inference = Fast & Cheap

This endpoint:
  • Fast - No LLM inference, just routing logic
  • Cheap - Doesn’t count against token usage
  • Accurate - Uses the exact same selection logic as real completions
Perfect for testing, debugging, and cost planning without burning through your budget.