Supported Models

Adaptive supports the latest models from all major AI providers, automatically updated with new releases. Our intelligent routing system selects the optimal model based on your prompt, cost preferences, and performance requirements.
New Models Available: We’ve added support for GPT-5, GPT-4.1, Claude Opus 4, Gemini 2.5, DeepSeek-V3.1, Grok 4, and other leading 2025 releases. GPT-5 is now available as OpenAI’s most advanced model series.

OpenAI Models

GPT-5 Series (Latest 2025)

OpenAI’s most advanced models featuring unified reasoning and superior intelligence across all domains:
  • gpt-5 - The flagship model with state-of-the-art performance across coding, math, writing, health, and visual perception
  • gpt-5-mini - Faster, cost-efficient version of GPT-5 for well-defined tasks
  • gpt-5-nano - Fastest, most cost-efficient version for simple tasks and high-volume usage
  • gpt-5-chat-latest - GPT-5 variant optimized for ChatGPT (API access available)
Key Features:
  • Context window: Up to 272,000 tokens input / 128,000 tokens output
  • Unified system with fast and deeper reasoning modes
  • 94.6% accuracy on AIME 2025 math problems
  • 74.9% on SWE-bench Verified coding tasks
  • 45% fewer factual errors vs GPT-4o, 80% fewer vs o3 when reasoning
  • Support for custom tools with plaintext instead of JSON
  • Reasoning effort levels: minimal, low, medium, high
  • Advanced multimodal capabilities (text + vision)
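The reasoning effort levels above can be selected per request. The sketch below builds a request body for Adaptive’s OpenAI-compatible endpoint; the `reasoning_effort` field follows OpenAI’s chat-completions parameter of the same name, but confirm support in the Adaptive API reference before relying on it.

```python
# Hypothetical helper: build a GPT-5 request with an explicit reasoning
# effort level. The "reasoning_effort" values mirror the levels listed
# above (minimal, low, medium, high).
def build_gpt5_request(prompt: str, effort: str = "medium") -> dict:
    allowed = {"minimal", "low", "medium", "high"}
    if effort not in allowed:
        raise ValueError(f"reasoning effort must be one of {sorted(allowed)}")
    return {
        "model": "openai:gpt-5",
        "reasoning_effort": effort,
        "messages": [{"role": "user", "content": prompt}],
    }

request = build_gpt5_request("Prove that sqrt(2) is irrational", effort="high")
print(request["reasoning_effort"])  # high
```

Use "minimal" or "low" for latency-sensitive tasks and "high" for problems that benefit from deeper deliberation.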
Pricing:
  • GPT-5: $1.25 input / $10.00 output per 1M tokens
  • GPT-5-mini: $0.25 input / $2.00 output per 1M tokens
  • GPT-5-nano: $0.05 input / $0.40 output per 1M tokens
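Applying the per-million-token rates above to actual token counts is simple arithmetic; the small estimator below encodes the listed GPT-5 prices (verify them against OpenAI’s current pricing page before budgeting).

```python
# Per-1M-token rates (input, output) in USD, mirroring the prices listed
# above; treat these as a snapshot, not an authoritative price source.
GPT5_PRICES = {
    "gpt-5": (1.25, 10.00),
    "gpt-5-mini": (0.25, 2.00),
    "gpt-5-nano": (0.05, 0.40),
}

def estimate_cost(model: str, input_tokens: int, output_tokens: int) -> float:
    """Estimate request cost in USD from token counts."""
    in_rate, out_rate = GPT5_PRICES[model]
    return (input_tokens * in_rate + output_tokens * out_rate) / 1_000_000

# Example: 10K input + 2K output tokens on gpt-5-mini
print(round(estimate_cost("gpt-5-mini", 10_000, 2_000), 6))  # 0.0065
```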

GPT-4.1 Series (2025)

Previous generation models with significant improvements in coding and reasoning:
  • gpt-4.1 - The flagship model outperforming GPT-4o across all benchmarks
  • gpt-4.1-mini - 83% cheaper than GPT-4o with near-GPT-4 performance
  • gpt-4.1-nano - Fastest and cheapest with 1M token context window
Key Features:
  • Context window: Up to 1M tokens
  • Knowledge cutoff: June 2024
  • Major coding improvements (54.6% vs 33.2% on SWE-bench)
  • Exceptional instruction following

GPT-4o Series

Multimodal models integrating text and images:
  • gpt-4o - Main multimodal model matching GPT-4 Turbo performance
  • gpt-4o-mini - Cost-efficient alternative to GPT-3.5 Turbo
  • gpt-4o-audio models - Audio input and output capabilities

GPT-4 Turbo

  • gpt-4-turbo - Fast, cost-efficient variant for text tasks
  • gpt-4-turbo-preview - Preview version with latest features

Reasoning Models (o-series)

  • o3 - Most powerful reasoning model for logical and technical tasks
  • o3-pro - Extended reasoning time for complex problems
  • o4-mini - Enhanced reasoning with improved performance

Anthropic Claude Models

Claude 4 Family (Latest 2025)

The newest generation setting new standards for AI capabilities:
  • claude-opus-4.1 - World’s best coding model (72.5% on SWE-bench)
  • claude-opus-4 - Advanced reasoning and AI agent capabilities
  • claude-sonnet-4 - Improved coding with 72.7% on SWE-bench
Pricing: Consistent with previous Opus/Sonnet models
  • Opus 4: $15 input / $75 output per 1M tokens
  • Sonnet 4: $3 input / $15 output per 1M tokens

Claude 3.7 Family

  • claude-sonnet-3.7 - Most intelligent model with extended thinking capabilities

Claude 3.5 Family

  • claude-3-5-sonnet-20241022 - Enhanced performance across all tasks
  • claude-3-5-haiku-20241022 - Fastest model surpassing Claude 3 Opus benchmarks

Claude 3 Family (Original)

  • claude-3-opus-20240229 - Most intelligent with best-in-market complex task performance
  • claude-3-sonnet-20240229 - Balanced intelligence and speed for enterprise
  • claude-3-haiku-20240307 - Fastest, most compact for near-instant responses

Google Gemini Models

Gemini 2.5 Series (Latest)

Google’s most advanced thinking models with adaptive capabilities:
  • gemini-2.5-pro - State-of-the-art thinking model with adaptive reasoning
  • gemini-2.5-flash - Best price-performance model with thinking capabilities
  • gemini-2.5-flash-lite - Most cost-efficient and fastest 2.5 model
Special Features:
  • Adaptive thinking mode shows reasoning process
  • Superior code, math, and STEM reasoning
  • Long context for large datasets and documents

Gemini 2.0 Series

  • gemini-2.0-flash - Next-gen features with 1M token context
  • gemini-2.0-flash-live - Low-latency voice and video interactions

Gemini 1.5 Series (Deprecated – Removed May 2025)

Gemini 1.5 models have been fully deprecated and removed from the Adaptive platform; they can no longer be selected for new or existing projects. Please migrate to Gemini 2.x or later models.

DeepSeek Models

DeepSeek-V3.1 (Latest Hybrid 2025)

The most advanced hybrid model combining reasoning and efficiency:
  • deepseek-chat (V3.1) - Hybrid model with thinking/non-thinking modes
  • deepseek-reasoner (V3.1) - Enhanced reasoning mode for complex problems
  • deepseek-v3-0324 - Improved post-training with better reasoning
Key Features:
  • 671B total parameters (37B activated)
  • 128K context window
  • Dual-mode operation (thinking vs direct)
  • Outperforms GPT-4.5 in math and coding

DeepSeek Specialized Models

  • deepseek-coder-v2 - 338 programming languages, 128K context
  • deepseek-r1 - Dedicated reasoning model for complex logic
  • deepseek-r1-0528 - Advanced reasoning with 23K token reasoning chains

Available Sizes

  • 1.5B, 7B, 8B, 14B, 32B, 70B - Distilled models for various deployment needs

Groq Models (Ultra-Fast Inference)

Latest Llama Models on Groq

High-performance inference with Groq’s LPU™ technology:
  • llama-3.3-70b-versatile - Flagship model with exceptional speed
  • llama-3.1-8b-instant - Exceptional price-performance ratio
  • llama-3-groq-70b-tool-use - Specialized for function calling
  • deepseek-r1-distill-llama-70b - Reasoning optimized with 128K context
Performance Benefits:
  • 5-15x faster than other API providers
  • Up to 814 tokens/second
  • Sub-second response times

Additional Groq Models

  • gemma2-9b-it - Google’s efficient model (being deprecated)
  • llama-guard-4-12b - AI content moderation
  • gpt-oss, kimi-k2, qwen3-32b - Various open-source options

xAI Grok Models

Grok 4 Series (Latest 2025)

xAI’s most intelligent models with real-time capabilities:
  • grok-4 - “Most intelligent model in the world” with native tool use
  • grok-4-heavy - Most powerful version of Grok 4
  • grok-4-fast - Cost-efficient reasoning with 2M token context
  • grok-code-fast-1 - Specialized for agentic coding tasks
Features:
  • Real-time X/web search integration
  • 256K context window (2M for fast variants)
  • Native multimodal understanding

Grok 3 Series

  • grok-3 - Superior reasoning with extensive knowledge
  • grok-3-mini - Efficient model for standard tasks
  • grok-3-reasoning - Enhanced logical reasoning capabilities

Perplexity Sonar Models

Latest Sonar (2025)

Built on Llama 3.3 70B with search optimization:
  • sonar-latest - Latest Sonar model optimized for answer quality
  • llama-3.1-sonar-large-128k-online - Large online search model
  • llama-3.1-sonar-small-128k-online - Efficient online model
Deprecation: llama-3.1-sonar-large-128k-online will be discontinued February 22, 2025.
Performance: 1200 tokens/second with Cerebras infrastructure

Together AI & HuggingFace Models

Qwen3 Series (2025)

Advanced reasoning models with dual-mode capabilities:
  • qwen3-235b-a22b - Large MoE model (235B total, 22B active)
  • qwen3-30b-a3b - Smaller MoE model (30B total, 3B active)
  • qwen3-coder-480b-a35b - Largest open-source coding model
  • qwen2.5-vl - Visual reasoning and video understanding
Key Features:
  • Dual-mode: Instant responses vs deep reasoning
  • Apache 2.0 license
  • Outperforms OpenAI o3 on key benchmarks

Llama Models via Together AI

  • llama-3.3-70b-instruct-turbo - Recommended general-purpose model
  • llama-4-scout-17b - Vision model for multimodal tasks
  • Various fine-tuned and specialized variants

HuggingFace Models

Access to 200+ open-source models including:
  • meta-llama/Llama-3.1-8B-Instruct - Efficient general-purpose
  • deepseek-ai/DeepSeek-R1-Distill-Qwen-14B - Reasoning optimized
  • Custom and fine-tuned models for specialized domains

Model Selection Intelligence

Automatic Routing

Adaptive’s AI system automatically selects the optimal model based on:
  • Task Type: Code, math, creative writing, analysis, etc.
  • Complexity: Simple queries vs complex reasoning tasks
  • Cost Preference: Your cost_bias setting (0.0 = cheapest, 1.0 = best)
  • Context Length: Required context window size
  • Tool Use: Function calling capabilities when needed
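The `cost_bias` knob above ranges from 0.0 (cheapest) to 1.0 (best). A minimal sketch of a request body combining it with the provider list from the Getting Started section follows; the exact schema is an assumption here, so check the Adaptive API reference for the authoritative shape.

```python
# Hypothetical helper: build an Adaptive routing payload. The "models" and
# "cost_bias" fields follow the request shape shown in this document.
def build_routing_payload(prompt: str, cost_bias: float = 0.5) -> dict:
    if not 0.0 <= cost_bias <= 1.0:
        raise ValueError("cost_bias must be between 0.0 (cheapest) and 1.0 (best)")
    return {
        "models": [
            {"provider": "openai"},
            {"provider": "anthropic"},
            {"provider": "google"},
        ],
        "cost_bias": cost_bias,
        "messages": [{"role": "user", "content": prompt}],
    }

# A low cost_bias steers routing toward cheaper models.
payload = build_routing_payload("Summarize this report", cost_bias=0.2)
print(payload["cost_bias"])  # 0.2
```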

Cost Optimization

Our intelligent routing typically saves 60-80% on costs by:
  • Using efficient models for simple tasks
  • Reserving premium models for complex reasoning
  • Automatic fallback when providers are unavailable
  • Real-time cost-performance analysis
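The automatic fallback behavior above can be pictured as trying providers in preference order and moving on when one fails. This is an illustrative sketch, not Adaptive’s actual implementation; `fake_send` is a stub standing in for a real API call.

```python
# Illustrative provider-fallback loop: return the first successful
# response, collecting errors from providers that fail.
def call_with_fallback(providers, send):
    """`send(provider)` returns a response dict or raises on failure."""
    errors = {}
    for provider in providers:
        try:
            return provider, send(provider)
        except Exception as exc:  # real code would catch specific error types
            errors[provider] = exc
    raise RuntimeError(f"All providers failed: {errors}")

# Stubbed sender: "openai" is unavailable, "anthropic" succeeds.
def fake_send(provider):
    if provider == "openai":
        raise TimeoutError("provider unavailable")
    return {"provider": provider, "content": "ok"}

winner, resp = call_with_fallback(["openai", "anthropic", "google"], fake_send)
print(winner)  # anthropic
```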

Performance Tiers

  • Best for: Simple queries, basic tasks, high-volume usage
  • Models: GPT-5-nano, DeepSeek-Chat, GPT-4.1-nano, Grok-3-mini, Groq Llama models
  • Typical Cost: $0.15-$2.50 per 1M tokens

Getting Started

Using Supported Models

You can specify models in three ways:
  1. Let Adaptive choose (recommended):
    {
      "models": [
        {"provider": "openai"},
        {"provider": "anthropic"},
        {"provider": "google"}
      ]
    }
    
  2. Specify exact models:
    {
      "models": [
        {"provider": "openai", "model_name": "gpt-5-mini"},
        {"provider": "anthropic", "model_name": "claude-sonnet-4-20250514"}
      ]
    }
    
  3. OpenAI-compatible direct calls:
    cURL
    curl -sS -X POST https://llmadaptive.uk/api/v1/chat/completions \
      -H "Authorization: Bearer $ADAPTIVE_API_KEY" \
      -H "Content-Type: application/json" \
      -d '{"model": "openai:gpt-5-mini", "messages": [{"role": "user", "content": "Hello"}]}'
    
    Python
    import os
    import requests
    
    response = requests.post(
        "https://llmadaptive.uk/api/v1/chat/completions",
        headers={
            "Authorization": f"Bearer {os.getenv('ADAPTIVE_API_KEY')}",
            "Content-Type": "application/json"
        },
        json={
            "model": "openai:gpt-5-mini",
            "messages": [{"role": "user", "content": "Hello"}]
        },
        timeout=30
    )
    
    if not response.ok:
        raise Exception(f"HTTP {response.status_code}: {response.text}")
    
    print(response.json())
    
    JavaScript (Node 18+)
    const response = await fetch("https://llmadaptive.uk/api/v1/chat/completions", {
      method: "POST",
      headers: {
        "Authorization": `Bearer ${process.env.ADAPTIVE_API_KEY}`,
        "Content-Type": "application/json"
      },
      body: JSON.stringify({
        model: "openai:gpt-5-mini",
        messages: [{ role: "user", content: "Hello" }]
      })
    });
    
    if (!response.ok) {
      throw new Error(`HTTP ${response.status}: ${await response.text()}`);
    }
    
    console.log(await response.json());
    

Model Updates

Automatic Updates: New models are added automatically as providers release them
Backward Compatibility: Existing model names continue to work with automatic fallbacks
Performance Monitoring: We continuously monitor model performance and update recommendations

Next Steps