Supported Models
Adaptive supports the latest models from all major AI providers, automatically updated with new releases. Our intelligent routing system selects the optimal model based on your prompt, cost preferences, and performance requirements.
OpenAI Models
GPT-5 Series (Latest 2025)
OpenAI’s most advanced models featuring unified reasoning and superior intelligence across all domains:
GPT-5 Models
- gpt-5 - The flagship model with state-of-the-art performance across coding, math, writing, health, and visual perception
- gpt-5-mini - Faster, cost-efficient version of GPT-5 for well-defined tasks
- gpt-5-nano - Fastest, most cost-efficient version for simple tasks and high-volume usage
- gpt-5-chat-latest - GPT-5 variant optimized for ChatGPT (API access available)
- Context window: Up to 272,000 tokens input / 128,000 tokens output
- Unified system with fast and deeper reasoning modes
- 94.6% accuracy on AIME 2025 math problems
- 74.9% on SWE-bench Verified coding tasks
- 45% fewer factual errors vs GPT-4o, 80% fewer vs o3 when reasoning
- Support for custom tools with plaintext instead of JSON
- Reasoning effort levels: minimal, low, medium, high
- Advanced multimodal capabilities (text + vision)
- Semantic caching reduces cached input costs by 90% ($0.125 per million cached tokens)
- GPT-5: $10 per million output tokens, $0.125 per million cached input tokens
- GPT-5-mini: $2 per million output tokens
- GPT-5-nano: $0.40 per million output tokens
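The caching bullet above implies a simple relationship between cached and uncached input pricing. The following is a minimal Python sketch of that arithmetic, using only the 90% reduction and the $0.125 cached rate quoted above. The helper names are hypothetical, and the uncached rate is inferred from those two figures, not quoted from a price list.

```python
# Figures quoted in the feature list above.
CACHED_RATE_USD_PER_M = 0.125  # cached input, USD per million tokens
CACHE_DISCOUNT = 0.90          # caching reduces cached input cost by 90%


def implied_uncached_rate(cached_rate: float, discount: float) -> float:
    """Back out the uncached per-million rate implied by a cached rate and a discount."""
    if not 0.0 <= discount < 1.0:
        raise ValueError("discount must be in [0, 1)")
    return cached_rate / (1.0 - discount)


def input_cost_usd(total_tokens: int, cached_tokens: int) -> float:
    """Input cost of one request in which `cached_tokens` of the prompt hit the cache."""
    if not 0 <= cached_tokens <= total_tokens:
        raise ValueError("cached_tokens must be between 0 and total_tokens")
    uncached_rate = implied_uncached_rate(CACHED_RATE_USD_PER_M, CACHE_DISCOUNT)
    uncached_tokens = total_tokens - cached_tokens
    total = uncached_tokens * uncached_rate + cached_tokens * CACHED_RATE_USD_PER_M
    return total / 1_000_000
```

Under these assumptions, the more of a long prompt that is served from cache, the closer the input cost gets to one tenth of the uncached cost.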
Anthropic Claude Models
Claude 4 Family (Latest 2025)
The newest generation setting new standards for AI capabilities:
Claude 4 Models
- claude-haiku-4.5 - Latest release (October 2025): fast, affordable, and available on the free tier ($5 per million output tokens)
- claude-opus-4.1 - World’s best coding model (72.5% on SWE-bench)
- claude-opus-4 - Advanced reasoning and AI agent capabilities
- claude-sonnet-4.5 - Anthropic’s best Sonnet model for complex agents and coding with vision support
- claude-sonnet-4 - Improved coding with 72.7% on SWE-bench
- Haiku 4.5: $5 per million output tokens; now available free on Claude.ai
- Sonnet 4 and 4.5: $15 per million output tokens
- Opus 4.1: $80 per million tokens + $40 per million thinking tokens
Claude 3.5 Family (Previous Generation)
- claude-3-5-sonnet - Enhanced performance across all tasks
- claude-3-5-haiku - Previous Haiku version (superseded by 4.5)
Google Gemini Models
Gemini 2.5 Series (Latest)
Google’s most advanced thinking models with adaptive capabilities:
Gemini 2.5 Models
- gemini-2.5-pro - State-of-the-art thinking model with adaptive reasoning
- gemini-2.5-flash - Best price-performance model with thinking capabilities
- gemini-2.5-flash-lite - Most cost-efficient and fastest 2.5 model ($0.02 per million tokens)
- 2.5 Pro: $10 per million output tokens (up to 200K context), $15 (over 200K)
- 2.5 Flash: Similar pricing to 2.5 Flash-Lite
- 2.5 Flash-Lite: $0.02 per million tokens
- Adaptive thinking mode shows reasoning process
- Superior code, math, and STEM reasoning
- Long context for large datasets and documents
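The 2.5 Pro entry above prices requests in two tiers depending on context size. As a rough illustration, the tier lookup can be sketched in Python as follows. The function names are hypothetical, and two details are assumptions not stated above: that the tier is chosen by prompt length, and that a prompt of exactly 200,000 tokens still falls in the lower tier.

```python
TIER_THRESHOLD_TOKENS = 200_000  # "up to 200K context" boundary (assumed inclusive)
RATE_UP_TO_THRESHOLD = 10.0      # USD per million tokens, as listed above
RATE_OVER_THRESHOLD = 15.0       # USD per million tokens, as listed above


def rate_for_prompt(prompt_tokens: int) -> float:
    """Return the listed per-million rate for a request with this prompt size."""
    if prompt_tokens < 0:
        raise ValueError("prompt_tokens must be non-negative")
    if prompt_tokens <= TIER_THRESHOLD_TOKENS:
        return RATE_UP_TO_THRESHOLD
    return RATE_OVER_THRESHOLD


def billed_usd(prompt_tokens: int, billed_tokens: int) -> float:
    """Cost of `billed_tokens` at the tier selected by `prompt_tokens`."""
    return billed_tokens * rate_for_prompt(prompt_tokens) / 1_000_000
```

Because the whole request moves to the higher rate once the threshold is crossed, trimming a prompt that sits just over 200K tokens can lower the bill by a third under this model.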
Gemini 2.0 Series
- gemini-2.0-flash - Next-gen features with 1M token context ($0.40 per million output tokens)
- gemini-2.0-flash-live - Low-latency voice and video interactions
Gemini 1.5 Series (Deprecated – Removed May 2025)
Gemini 1.5 models have been fully deprecated and removed from the Adaptive platform. Please migrate to Gemini 2.x or later models.
DeepSeek Models
DeepSeek V3.2-Exp (Latest September 2025)
The most affordable and advanced hybrid model with 50% price reduction:
DeepSeek Latest Models
- deepseek-v3.2-exp - Latest experimental model with 50% lower pricing ($0.028 per million input tokens)
- deepseek-chat (V3) - Production-ready chat model ($1.10 per million output tokens)
- deepseek-reasoner - Specialized reasoning mode with enhanced thinking capabilities
- deepseek-v3 - Standard V3 model ($0.28 per million output tokens)
- V3.2-Exp: $0.028 per million input tokens (50% cheaper than V3)
- DeepSeek-Chat: $0.27 input (cache miss), $1.10 output per million tokens
- DeepSeek V3: $0.28 per million output tokens
- 671B total parameters (MoE; 37B active)
- 128K context window
- Dual-mode operation (thinking vs direct)
- Most affordable frontier model pricing
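Since input and output tokens are billed at different rates, a per-request estimate has to weight the two separately. Here is a minimal Python sketch using the DeepSeek-Chat cache-miss input and output figures listed above. The `Rates` type and `request_cost_usd` helper are hypothetical names introduced for illustration.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Rates:
    """Per-million-token prices in USD."""
    input_usd_per_m: float
    output_usd_per_m: float


# DeepSeek-Chat figures from the list above (input is the cache-miss rate).
DEEPSEEK_CHAT = Rates(input_usd_per_m=0.27, output_usd_per_m=1.10)


def request_cost_usd(rates: Rates, input_tokens: int, output_tokens: int) -> float:
    """Estimate the cost of one request, assuming every input token misses the cache."""
    if input_tokens < 0 or output_tokens < 0:
        raise ValueError("token counts must be non-negative")
    cost = input_tokens * rates.input_usd_per_m + output_tokens * rates.output_usd_per_m
    return cost / 1_000_000
```

For example, `request_cost_usd(DEEPSEEK_CHAT, 10_000, 2_000)` prices a 10,000-token prompt with a 2,000-token reply. Cache hits would lower the input term, so this is an upper bound under the stated assumption.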
DeepSeek Specialized Models
- deepseek-coder-v2 - 338 programming languages, 128K context
- deepseek-r1 - Dedicated reasoning model for complex logic
- deepseek-r1-0528 - Advanced reasoning with 23K token reasoning chains
Available Sizes
- 1.5B, 7B, 8B, 14B, 32B, 70B - Distilled models for various deployment needs
Groq Models (Ultra-Fast Inference)
Latest Models on Groq (2025)
High-performance inference with Groq’s LPU™ technology:
Groq High-Speed Models
- llama-4-scout - Latest April 2025 Llama 4 model (17B active, 109B total parameters)
- llama-4-maverick - Larger Llama 4 variant (17B active, 400B total parameters)
- gpt-oss-120b - OpenAI open-source model on Groq (500+ tokens/sec, 128K context)
- gpt-oss-20b - Smaller OpenAI OSS model (1000+ tokens/sec)
- llama-3.3-70b-versatile - Flagship Llama 3 model with exceptional speed
- llama-3-groq-70b-tool-use - Specialized for function calling
- llama-3-groq-8b-tool-use - Efficient tool use variant
- kimi-k2-0905 - 256K context window with agentic coding capabilities
- gpt-oss-120b: $0.75 per million output tokens
- gpt-oss-20b: $0.50 per million output tokens
- 5-15× faster than other API providers
- Up to 1000+ tokens/second
- Sub-second response times with LPU acceleration
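The throughput figures above translate directly into generation time for a given response length. The following Python sketch works through that arithmetic, treating the quoted "500+" and "1000+" tokens per second as lower bounds. The function and dictionary names are hypothetical, and the estimate ignores network latency and time to first token.

```python
# Lower-bound throughput quoted above, in output tokens per second.
THROUGHPUT_TOKENS_PER_SEC = {
    "gpt-oss-120b": 500.0,
    "gpt-oss-20b": 1000.0,
}


def generation_seconds(output_tokens: int, tokens_per_second: float) -> float:
    """Time to stream `output_tokens` at a steady decode rate."""
    if output_tokens < 0:
        raise ValueError("output_tokens must be non-negative")
    if tokens_per_second <= 0:
        raise ValueError("tokens_per_second must be positive")
    return output_tokens / tokens_per_second


def estimate_for(model: str, output_tokens: int) -> float:
    """Look up the quoted throughput for `model` and estimate generation time."""
    return generation_seconds(output_tokens, THROUGHPUT_TOKENS_PER_SEC[model])
```

At these rates a 2,000-token response takes about 4 seconds on gpt-oss-120b and about 2 seconds on gpt-oss-20b, before any network overhead.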
Additional Groq Models
- llama-guard-4-12b - AI content moderation
- Compound on GroqCloud - Production-ready agentic AI with research, code execution, and browser control
xAI Grok Models
Grok 4 Series (Latest 2025)
xAI’s most intelligent models with real-time capabilities:
Grok 4 Models
- grok-4 - “Most intelligent model in the world” with native tool use
- grok-4-heavy - Most powerful version of Grok 4
- grok-4-fast - Cost-efficient reasoning with 2M token context
- grok-code-fast-1 - Specialized for agentic coding tasks
- Grok 4: $15.00 per million output tokens, $0.75 per million cached input tokens
- Grok 4 Fast: $0.40 input (tiered), $1.00 output, $0.05 cached per million tokens
- Real-time X/web search integration
- 256K context window (2M for fast variants)
- Native multimodal understanding and tool use
Z.AI GLM Models
GLM-4 Family (Latest 2025)
Z.AI’s advanced language models with competitive pricing and prompt caching:
GLM-4 Models
- glm-4.6 - Latest flagship model with enhanced capabilities
- glm-4.5 - Advanced model for complex tasks
- glm-4.5v - Vision-enabled multimodal model
- glm-4.5-x - Extra-large model for demanding applications
- glm-4.5-air - Fast and cost-efficient for everyday tasks
- glm-4.5-airx - Balanced performance and cost
- glm-4-32b-0414-128k - 128K context window, ultra-low pricing
- glm-4.5-flash - Free tier model with zero cost
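- GLM-4.6: $2.2 per million output tokens; $0.11 per million cached input tokens
- GLM-4.5: $2.2 per million output tokens; $0.11 per million cached input tokens
- GLM-4.5V: $1.8 per million output tokens; $0.11 per million cached input tokens
- GLM-4.5-X: $8.9 per million output tokens; $0.45 per million cached input tokens
- GLM-4.5-Air: $1.1 per million output tokens; $0.03 per million cached input tokens
- GLM-4.5-AirX: $4.5 per million output tokens; $0.22 per million cached input tokens
- GLM-4-32B-0414-128K: $0.1 per million output tokens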
- GLM-4.5-Flash: Free (input/output/cached)
- Prompt caching for reduced input costs (storage currently free)
- Built-in web search tool ($0.01 per use)
- 128K context window on select models
- Multimodal capabilities (GLM-4.5V)
Perplexity Sonar Models
Latest Sonar (2025)
Built on Llama 3.3 70B with search optimization:
- sonar-latest - Latest Sonar model optimized for answer quality
- llama-3.1-sonar-large-128k-online - Large online search model
- llama-3.1-sonar-small-128k-online - Efficient online model
Together AI & HuggingFace Models
Qwen3 Series (2025)
Advanced reasoning models with dual-mode capabilities:
Qwen3 Models
- qwen3-235b-a22b - Large MoE model (235B total, 22B active)
- qwen3-30b-a3b - Smaller MoE model (30B total, 3B active)
- qwen3-coder-480b-a35b - Largest open-source coding model
- qwen2.5-vl - Visual reasoning and video understanding
- Dual-mode: Instant responses vs deep reasoning
- Apache 2.0 license
- Outperforms OpenAI o3 on key benchmarks
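The MoE entries above give both total and active parameter counts, and the ratio between them shows how much of the model is used per token. A short Python sketch of that calculation follows, using the two Qwen3 models whose counts are stated explicitly above. The function name and dictionary are hypothetical.

```python
# (total, active) parameter counts in billions, as listed above.
QWEN3_MOE = {
    "qwen3-235b-a22b": (235.0, 22.0),
    "qwen3-30b-a3b": (30.0, 3.0),
}


def active_fraction(total_b: float, active_b: float) -> float:
    """Fraction of a mixture-of-experts model's parameters active per token."""
    if total_b <= 0 or not 0 <= active_b <= total_b:
        raise ValueError("expected 0 <= active_b <= total_b and total_b > 0")
    return active_b / total_b


for name, (total_b, active_b) in QWEN3_MOE.items():
    print(f"{name}: {active_fraction(total_b, active_b):.1%} of parameters active per token")
```

Both models activate roughly a tenth of their parameters per token, which is why per-token compute tracks the active count even though memory requirements track the total.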
Llama Models via Together AI
- llama-3.3-70b-instruct-turbo - Recommended general-purpose model
- llama-4-scout-17b - Vision model for multimodal tasks
- Various fine-tuned and specialized variants
HuggingFace Models
Access to 200+ open-source models including:
- meta-llama/Llama-3.1-8B-Instruct - Efficient general-purpose
- deepseek-ai/DeepSeek-R1-Distill-Qwen-14B - Reasoning optimized
- Custom and fine-tuned models for specialized domains
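The introduction describes routing by prompt, cost preference, and performance requirement. As a rough illustration only, and not a description of Adaptive's actual routing algorithm, the Python sketch below filters a small catalog by context window and an optional price ceiling, then picks the cheapest match. The `ModelEntry` type and `cheapest_fit` function are hypothetical, the context windows treat "K" as 1,000 tokens, and the prices are per-million figures as they appear on this page.

```python
from dataclasses import dataclass
from typing import Optional, Sequence


@dataclass(frozen=True)
class ModelEntry:
    name: str
    context_tokens: int  # maximum prompt size, in tokens
    usd_per_m: float     # listed per-million-token figure


# A few entries taken from the lists on this page (illustrative subset).
CATALOG = [
    ModelEntry("gpt-5", 272_000, 10.0),
    ModelEntry("deepseek-chat", 128_000, 1.10),
    ModelEntry("grok-4", 256_000, 15.0),
    ModelEntry("grok-4-fast", 2_000_000, 1.00),
]


def cheapest_fit(
    catalog: Sequence[ModelEntry],
    prompt_tokens: int,
    max_usd_per_m: Optional[float] = None,
) -> Optional[ModelEntry]:
    """Return the cheapest model whose context fits the prompt, or None if none qualifies."""
    candidates = [
        m for m in catalog
        if m.context_tokens >= prompt_tokens
        and (max_usd_per_m is None or m.usd_per_m <= max_usd_per_m)
    ]
    return min(candidates, key=lambda m: m.usd_per_m, default=None)
```

A real router would also weigh prompt difficulty and quality requirements. This sketch covers only the hard constraints (context fit and budget), which is the part that can be read directly off the catalog.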



