Get Adaptive’s intelligent model selection without using our inference. Provider-agnostic design - works with any models, any providers, any infrastructure.
Why Use This?
Use Adaptive’s intelligence, run inference wherever you want:
“I have my own OpenAI/Anthropic accounts” - Get optimal model selection, pay your providers directly
“I run models on-premise” - Get routing decisions for your local infrastructure
“I have enterprise contracts” - Use your existing provider relationships with intelligent routing
“I need data privacy” - Keep inference local while getting smart model selection
Request
Provider-agnostic format - send your available models and prompt, get intelligent selection back.
Array of available model specifications in provider:model_name format. Adaptive automatically queries the Model Registry to fill in pricing, capabilities, and other details for known models.
["openai:gpt-5-mini", "anthropic:claude-sonnet-4-5", "gemini:gemini-2.5-flash-lite"]
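A quick client-side check can catch malformed entries before they are sent. The sketch below is only an illustration: `parse_model_spec` is a hypothetical helper, and splitting on the first colon is an assumption about how the provider:model_name format should be read.

```python
def parse_model_spec(spec: str) -> tuple[str, str]:
    """Split a 'provider:model_name' string into (provider, model_name)."""
    # Assumption: the provider is everything before the first colon.
    provider, sep, model_name = spec.partition(":")
    if not sep or not provider or not model_name:
        raise ValueError(f"expected 'provider:model_name', got {spec!r}")
    return provider, model_name


models = [
    "openai:gpt-5-mini",
    "anthropic:claude-sonnet-4-5",
    "gemini:gemini-2.5-flash-lite",
]
parsed = [parse_model_spec(m) for m in models]
```

Here `parsed` holds one (provider, model_name) pair per entry, and a string without a colon raises `ValueError` instead of reaching the API.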
The prompt text to analyze for optimal model selection
Optional user identifier for caching optimization (enables user-specific cache hits)
Cost optimization preference (0.0 = cheapest, 1.0 = best performance). Default: uses the server configuration. Override it to prioritize cost savings or performance for this specific selection.
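As a rough illustration of how the override fits into a request body, the sketch below builds the payload and leaves `cost_bias` out when no override is wanted, so the server default applies. The field names `models`, `prompt`, and `cost_bias` come from the examples on this page; `build_selection_payload` and its client-side range check are assumptions, since the page does not say how the server treats out-of-range values.

```python
def build_selection_payload(models, prompt, cost_bias=None):
    """Build a select-model request body, optionally overriding cost_bias."""
    payload = {"models": list(models), "prompt": prompt}
    if cost_bias is not None:
        # Assumption: reject values outside the documented 0.0-1.0 range locally.
        if not 0.0 <= cost_bias <= 1.0:
            raise ValueError("cost_bias must be between 0.0 and 1.0")
        payload["cost_bias"] = cost_bias
    return payload


available = ["openai:gpt-4.1-nano", "openai:gpt-5-mini"]
cost_focused = build_selection_payload(available, "Hi", cost_bias=0.1)
server_default = build_selection_payload(available, "Hi")
```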
Available tool definitions for function calling detection. Tool definitions help Adaptive understand whether your prompt requires function calling capabilities, steering model selection towards models that support tools.
Type of tool (always “function”)
Function definition object
Description of what the function does
JSON Schema object defining the function parameters
Current tool call being made (if any). If this request is part of a tool calling sequence, provide the current tool call context to help with model selection optimization.
Unique identifier for the tool call
Type of tool call (always “function”)
Function call details
Name of the function being called
JSON string containing the function arguments
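Note that the arguments travel as a JSON-encoded string, not as a nested object. The sketch below shows one plausible shape for such a tool call. The key names (`id`, `type`, `function`, `name`, `arguments`) follow the common OpenAI-style convention and are an assumption, since this page lists only the descriptions; the identifier value is made up.

```python
import json

tool_call = {
    "id": "call_123",  # hypothetical identifier
    "type": "function",
    "function": {
        "name": "get_weather",
        # Arguments are serialized to a JSON string before being sent.
        "arguments": json.dumps({"location": "San Francisco"}),
    },
}

# Decoding the string recovers the original arguments.
decoded_arguments = json.loads(tool_call["function"]["arguments"])
```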
Semantic cache configuration for this request
Override whether to use semantic caching for this specific request (overrides server configuration)
Override similarity threshold for cache hits (0.0-1.0, higher = more strict matching)
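To build intuition for what a stricter threshold means, here is a toy sketch of threshold-based cache matching. It assumes cosine similarity over embedding vectors, which is a common choice for semantic caches. This page does not state how Adaptive computes similarity, so treat the metric, the vectors, and the function names as illustrative only.

```python
import math


def cosine_similarity(a, b):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0


def is_cache_hit(query_vec, cached_vec, threshold):
    """A cached entry matches only if similarity reaches the threshold."""
    return cosine_similarity(query_vec, cached_vec) >= threshold


query = [1.0, 0.0]
cached = [0.8, 0.6]  # similarity with `query` is about 0.8

loose_hit = is_cache_hit(query, cached, threshold=0.7)
strict_hit = is_cache_hit(query, cached, threshold=0.9)
```

With these toy vectors the looser threshold accepts the cached entry and the stricter one rejects it, which is the trade-off the threshold override controls.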
Response
Selected provider name. The provider that was chosen for this prompt (e.g., “openai”, “anthropic”, “gemini”)
Selected model identifier. The specific model that was chosen (e.g., “gpt-5-mini”, “claude-sonnet-4-5”, “glm-4.6”)
Alternative provider/model combinations (optional). Fallback options if the primary selection is unavailable
Alternative provider name
Alternative model identifier
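Because the alternatives list is optional, client code should not assume it is present. A minimal parsing sketch, using the `provider`, `model`, and `alternatives` keys shown in the examples on this page (the `Selection` class and `parse_selection` function are hypothetical names):

```python
from dataclasses import dataclass, field


@dataclass
class Selection:
    provider: str
    model: str
    # (provider, model) pairs; empty when the response has no alternatives.
    alternatives: list = field(default_factory=list)


def parse_selection(body: dict) -> Selection:
    """Convert a decoded select-model response into a Selection."""
    alternatives = [
        (alt["provider"], alt["model"]) for alt in body.get("alternatives", [])
    ]
    return Selection(body["provider"], body["model"], alternatives)


simple = parse_selection({"provider": "openai", "model": "gpt-5-mini"})
with_fallback = parse_selection({
    "provider": "anthropic",
    "model": "claude-sonnet-4-5",
    "alternatives": [{"provider": "openai", "model": "gpt-5-mini"}],
})
```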
Quick Examples
"Known models - just specify what you have"
curl https://api.llmadaptive.uk/v1/select-model \
  -H "X-Stainless-API-Key: $API_KEY" \
  -H "Content-Type: application/json" \
  -d '{
    "models": [
      "openai:gpt-5-mini",
      "anthropic:claude-sonnet-4-5",
      "gemini:gemini-2.5-flash-lite"
    ],
    "prompt": "Hello, how are you?"
  }'
# Response:
{
  "provider": "openai",
  "model": "gpt-5-mini"
}
"Just specify providers - let Adaptive choose"
curl https://api.llmadaptive.uk/v1/select-model \
  -H "X-Stainless-API-Key: $API_KEY" \
  -H "Content-Type: application/json" \
  -d '{
    "models": [
      "openai:gpt-5-mini",
      "anthropic:claude-sonnet-4-5"
    ],
    "prompt": "Write a complex analysis of market trends"
  }'
# Response:
{
  "provider": "anthropic",
  "model": "claude-sonnet-4-5",
  "alternatives": [
    { "provider": "openai", "model": "gpt-5-mini" }
  ]
}
"Test cost optimization"
// Will cost_bias actually pick cheaper models?
const response = await fetch("/api/v1/select-model", {
  method: "POST",
  headers: {
    "X-Stainless-API-Key": apiKey,
    "Content-Type": "application/json",
  },
  body: JSON.stringify({
    models: ["openai:gpt-4.1-nano", "openai:gpt-5-mini"],
    prompt: "Analyze this complex dataset and provide insights...",
    cost_bias: 0.1, // Maximize cost savings
  }),
});
if (!response.ok) {
  const errorBody = await response.text();
  throw new Error(`HTTP ${response.status}: ${errorBody}`);
}
const result = await response.json();
console.log(result);
// Check if it picked the cheaper model despite complexity
"Function calling optimization"
curl https://api.llmadaptive.uk/v1/select-model \
  -H "X-Stainless-API-Key: $API_KEY" \
  -H "Content-Type: application/json" \
  -d '{
    "models": [
      "openai:gpt-5-codex",
      "anthropic:claude-opus-4-1",
      "openai:gpt-5-nano"
    ],
    "prompt": "What is the weather like in San Francisco?",
    "tools": [
      {
        "type": "function",
        "function": {
          "name": "get_weather",
          "description": "Get current weather for a location",
          "parameters": {
            "type": "object",
            "properties": {
              "location": {"type": "string", "description": "City name"}
            },
            "required": ["location"]
          }
        }
      }
    ]
  }'
# Response - will prefer models that support function calling:
{
  "provider": "openai",
  "model": "gpt-5-codex",
  "alternatives": [
    { "provider": "anthropic", "model": "claude-opus-4-1" }
  ]
}
"Compare different configurations"
import os
import requests
# Configuration
BASE_URL = "https://api.yourdomain.com"  # Replace with your actual domain
API_TOKEN = os.getenv("ADAPTIVE_API_TOKEN", "your-api-token-here")  # Set via environment variable
TIMEOUT = 30  # Request timeout in seconds
# Define available models
models = [
    "openai:gpt-4.1-nano",
    "openai:gpt-5-mini",
]
base_request = {
    "models": models,
    "prompt": "Write Python code to analyze customer data",
}
# Headers for authentication
headers = {
    "Authorization": f"Bearer {API_TOKEN}",
    "Content-Type": "application/json",
}
# Test cost-focused vs performance-focused
configs = [
    {"cost_bias": 0.1, "name": "cost-optimized"},
    {"cost_bias": 0.9, "name": "performance-focused"},
]
for config in configs:
    try:
        response = requests.post(
            f"{BASE_URL}/api/v1/select-model",
            json={
                **base_request,
                "cost_bias": config["cost_bias"],
            },
            headers=headers,
            timeout=TIMEOUT,
        )
        # Check if request was successful
        if response.ok:
            result = response.json()
            print(f"{config['name']}: {result['provider']}/{result['model']}")
        else:
            print(f"Error for {config['name']}: HTTP {response.status_code} - {response.text}")
    except requests.exceptions.Timeout:
        print(f"Timeout error for {config['name']}: Request took longer than {TIMEOUT} seconds")
    except requests.exceptions.ConnectionError:
        print(f"Connection error for {config['name']}: Unable to connect to {BASE_URL}")
    except requests.exceptions.RequestException as e:
        print(f"Request error for {config['name']}: {e}")
    except Exception as e:
        print(f"Unexpected error for {config['name']}: {e}")
Real-World Integration Patterns
1. Use Your Own Provider Accounts
// Define the models available through your own provider accounts
const availableModels = [
  "openai:gpt-5-mini",
  "anthropic:claude-sonnet-4-5",
  "gemini:gemini-2.5-flash-lite",
];
// Get intelligent selection
const selection = await fetch("/api/v1/select-model", {
  method: "POST",
  headers: {
    "X-Stainless-API-Key": adaptiveKey,
    "Content-Type": "application/json",
  },
  body: JSON.stringify({
    models: availableModels,
    prompt: userMessage,
  }),
});
const result = await selection.json();
// Route to your own provider accounts
let completion;
if (result.provider === "openai") {
  completion = await yourOpenAI.chat.completions.create({
    model: result.model,
    messages: [{ role: "user", content: userMessage }],
  });
} else if (result.provider === "anthropic") {
  completion = await yourAnthropic.messages.create({
    model: result.model,
    messages: [{ role: "user", content: userMessage }],
    max_tokens: 4096,
  });
}
2. Multi-Provider Routing
// Tell Adaptive about your preferred providers (plus a fallback)
const res = await fetch("https://api.llmadaptive.uk/v1/select-model", {
  method: "POST",
  headers: {
    "X-Stainless-API-Key": adaptiveKey,
    "Content-Type": "application/json",
  },
  body: JSON.stringify({
    models: [
      "anthropic:claude-opus-4-1",
      "z-ai:glm-4.6",
      "openai:gpt-5-mini", // Cloud fallback
    ],
    prompt: userMessage,
  }),
});
const selection = await res.json();
// Route to the right provider using provider/model
if (selection.provider === "anthropic") {
  await yourAnthropic.messages.create({
    model: selection.model,
    messages: [{ role: "user", content: userMessage }],
    max_tokens: 4096,
  });
} else if (selection.provider === "openai") {
  await yourOpenAI.chat.completions.create({
    model: selection.model,
    messages: [{ role: "user", content: userMessage }],
  });
} else if (selection.provider === "z-ai") {
  await yourZAIClient.generate({
    model: selection.model,
    messages: [{ role: "user", content: userMessage }],
  });
}
3. Enterprise Contract Optimization
// Maximize usage of your enterprise contracts
const res = await fetch("https://api.llmadaptive.uk/v1/select-model", {
  method: "POST",
  headers: {
    "X-Stainless-API-Key": adaptiveKey,
    "Content-Type": "application/json",
  },
  body: JSON.stringify({
    models: [
      "anthropic:claude-opus-4-1", // Your enterprise contract
      "openai:gpt-5-mini", // Your enterprise contract
      "gemini:gemini-2.5-flash-lite", // Pay-per-use fallback
    ],
    prompt: userMessage,
    cost_bias: 0.8,
  }),
});
const selection = await res.json();
// Always use your own accounts
const client = yourProviderClients[selection.provider];
const completion = await client.create({
  model: selection.model,
  messages: [{ role: "user", content: userMessage }],
});
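The lookup by provider name in the snippet above can be sketched as a dispatch table. The Python version below is only an illustration: the handler functions are stand-ins for your real provider clients, and `route` is a hypothetical name. The point it makes is that an unknown provider should fail loudly rather than silently skip the request.

```python
def call_openai(model, prompt):
    # Stand-in for a real OpenAI client call.
    return f"openai handled {model}"


def call_anthropic(model, prompt):
    # Stand-in for a real Anthropic client call.
    return f"anthropic handled {model}"


PROVIDER_CLIENTS = {
    "openai": call_openai,
    "anthropic": call_anthropic,
}


def route(selection, prompt):
    """Send the prompt to whichever provider the selection names."""
    provider = selection["provider"]
    handler = PROVIDER_CLIENTS.get(provider)
    if handler is None:
        raise ValueError(f"no client configured for provider {provider!r}")
    return handler(selection["model"], prompt)


reply = route({"provider": "anthropic", "model": "claude-opus-4-1"}, "Hello")
```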
4. Data Privacy & Compliance
// Prefer privacy-focused providers while keeping prompts redacted
const selection = await selectModel({
  models: [
    "z-ai:glm-4.6", // Zero-retention provider
    "anthropic:claude-haiku-4-5", // Privacy-focused Anthropic model
  ],
  prompt: "NON_SENSITIVE_TASK_DESCRIPTION",
  // Don't send actual sensitive data to Adaptive
});
// Route sensitive content only to providers you approve
if (selection.provider === "z-ai") {
  const result = await yourZAIClient.generate({
    model: selection.model,
    messages: actualSensitiveData,
  });
} else if (selection.provider === "anthropic") {
  const result = await yourAnthropic.messages.create({
    model: selection.model,
    messages: actualSensitiveData,
    max_tokens: 4096, // Required by the Anthropic Messages API
  });
}
Understanding the Response
What You Get Back
{
  "provider": "openai",
  "model": "gpt-5-mini",
  "alternatives": [{ "provider": "anthropic", "model": "claude-sonnet-4-5" }]
}
Key Insights
provider - Which API service should be called
model - The specific model identifier to use with that provider
alternatives - Fallback options if the primary selection is unavailable
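One way to use the alternatives is to try the primary selection first and then walk the fallbacks in order. The sketch below assumes the response shape shown above. `complete_with_fallback` and the `call_provider` callback are hypothetical, and catching every exception is a simplification you would likely narrow in real code.

```python
def candidates(selection):
    """Yield the primary (provider, model) pair, then each alternative."""
    yield selection["provider"], selection["model"]
    for alt in selection.get("alternatives", []):
        yield alt["provider"], alt["model"]


def complete_with_fallback(selection, call_provider):
    """Try each candidate in order and return the first successful result."""
    last_error = None
    for provider, model in candidates(selection):
        try:
            return call_provider(provider, model)
        except Exception as exc:  # Simplification: narrow this in real code.
            last_error = exc
    raise RuntimeError("all candidates failed") from last_error


def flaky_call(provider, model):
    # Stand-in for real inference: pretend the OpenAI account is unavailable.
    if provider == "openai":
        raise ConnectionError("provider unavailable")
    return f"{provider}/{model}"


selection = {
    "provider": "openai",
    "model": "gpt-5-mini",
    "alternatives": [{"provider": "anthropic", "model": "claude-sonnet-4-5"}],
}
result = complete_with_fallback(selection, flaky_call)
```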
Common Patterns
Before/After Comparison
// See what changes with different parameters
const baseline = await selectModel(request);
const withConstraints = await selectModel({
  ...request,
  cost_bias: 0.1,
});
console.log(`Baseline: ${baseline.model}`);
console.log(`Cost-optimized: ${withConstraints.model}`);
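The `selectModel` helper used in the snippets on this page is not defined here. A rough Python equivalent, using only the standard library, might look like the sketch below. The endpoint URL and the Bearer header are taken from this page. The function names, the keyword-argument style, and the 30-second timeout are assumptions.

```python
import json
import urllib.request

SELECT_MODEL_URL = "https://api.llmadaptive.uk/v1/select-model"


def build_select_request(api_key, models, prompt, **options):
    """Build the POST request; extra options (e.g. cost_bias) join the body."""
    payload = {"models": models, "prompt": prompt, **options}
    return urllib.request.Request(
        SELECT_MODEL_URL,
        data=json.dumps(payload).encode("utf-8"),
        headers={
            "Authorization": f"Bearer {api_key}",
            "Content-Type": "application/json",
        },
        method="POST",
    )


def select_model(api_key, models, prompt, **options):
    """Send the request and return the decoded JSON selection."""
    request = build_select_request(api_key, models, prompt, **options)
    with urllib.request.urlopen(request, timeout=30) as response:
        return json.load(response)


# Building the request does not touch the network, so it is easy to inspect.
request = build_select_request(
    "your-key", ["openai:gpt-4.1-nano", "openai:gpt-5-mini"], "Hi", cost_bias=0.1
)
```

Keeping request construction separate from sending makes the payload easy to unit-test without calling the API.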
Validate Your Setup
// Make sure your routing rules work
const shouldUseCheap = await fetch(
  "https://api.llmadaptive.uk/v1/select-model",
  {
    method: "POST",
    headers: {
      "X-Stainless-API-Key": adaptiveKey,
      "Content-Type": "application/json",
    },
    body: JSON.stringify({
      models: ["openai:gpt-4.1-nano", "openai:gpt-5-mini"],
      prompt: "Hi",
    }),
  },
).then((r) => r.json());
const shouldUseExpensive = await fetch(
  "https://api.llmadaptive.uk/v1/select-model",
  {
    method: "POST",
    headers: {
      "X-Stainless-API-Key": adaptiveKey,
      "Content-Type": "application/json",
    },
    body: JSON.stringify({
      models: ["openai:gpt-4.1-nano", "openai:gpt-5-mini"],
      prompt: "Analyze this complex dataset...",
    }),
  },
).then((r) => r.json());
// Verify different complexity tasks get different models
console.log(`Simple prompt: ${shouldUseCheap.model}`);
console.log(`Complex prompt: ${shouldUseExpensive.model}`);
Authentication
Same as chat completions:
# Any of these work
-H "X-Stainless-API-Key: your-key"
-H "Authorization: Bearer your-key"
No Inference = Fast & Cheap
This endpoint:
✅ Fast - No LLM inference, just routing logic
✅ Cheap - Doesn’t count against token usage
✅ Accurate - Uses exact same selection logic as real completions
Perfect for testing, debugging, and cost planning without burning through your budget.