Provider Resiliency

Adaptive's provider resiliency system helps keep your applications online when individual AI providers experience outages. It combines automatic failover with circuit breakers, so a failing provider is detected, bypassed, and re-integrated once it recovers.

Cost Consideration: Fallback is disabled by default to control costs. Enable it when you need maximum reliability and can accept potentially higher costs from calls to multiple providers.

SDK Setup

These resiliency examples rely on the standard OpenAI SDK configured to talk to Adaptive:

import OpenAI from 'openai';

const openai = new OpenAI({
  apiKey: process.env.ADAPTIVE_API_KEY || 'your-adaptive-api-key',
  baseURL: 'https://api.llmadaptive.uk/v1'
});

How Resiliency Works

Health Monitoring

Continuous monitoring of provider availability, response times, and error rates

Failure Detection

Instant detection of timeouts, rate limits, service errors, and degraded performance

Automatic Failover

Seamless switching to backup providers based on your configured fallback strategy

Recovery Tracking

Automatic re-integration of recovered providers back into the rotation

Failover Strategies

Race Mode (Fastest, Higher Cost)

Send requests to multiple providers simultaneously and use the first successful response:

const completion = await openai.chat.completions.create({
  model: "adaptive/auto",
  messages: [{ role: "user", content: "Hello!" }],
  fallback: {
    mode: "race" // Try multiple providers simultaneously
  }
});

Benefits

Ultra-low latency: Get responses from the fastest available provider
Maximum reliability: Multiple providers increase success probability

Trade-offs

Higher costs: Multiple API calls are made simultaneously
Resource usage: Increased bandwidth and compute utilization
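
To make the latency and cost trade-off concrete, here is a toy model of race semantics. It is not Adaptive's implementation (the race happens on Adaptive's side, not in your code); `simulateRace` and the `{ name, latencyMs, ok }` records are hypothetical and exist only for illustration. The fastest successful provider wins, and every provider counts as a call.

```javascript
// Toy model of race mode: all providers are called at once.
// Each outcome is a made-up { name, latencyMs, ok } record, not real API data.
function simulateRace(outcomes) {
  const successes = outcomes.filter((o) => o.ok);
  if (successes.length === 0) {
    // Every provider failed: the caller waits for the slowest failure.
    const slowest = Math.max(...outcomes.map((o) => o.latencyMs));
    return { winner: null, latencyMs: slowest, callsMade: outcomes.length };
  }
  // The fastest successful response wins; slower ones are discarded.
  const fastest = successes.reduce((a, b) => (b.latencyMs < a.latencyMs ? b : a));
  return { winner: fastest.name, latencyMs: fastest.latencyMs, callsMade: outcomes.length };
}

const raceResult = simulateRace([
  { name: "provider-a", latencyMs: 300, ok: false },
  { name: "provider-b", latencyMs: 450, ok: true },
  { name: "provider-c", latencyMs: 900, ok: true },
]);
console.log(raceResult);
```

In this made-up scenario the request completes after 450 ms, but three provider calls were made, which is where the higher cost comes from.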

Sequential Mode (Cost-Effective)

Try providers one after another until one succeeds:

const completion = await openai.chat.completions.create({
  model: "adaptive/auto",
  messages: [{ role: "user", content: "Hello!" }],
  fallback: {
    mode: "sequential" // Try providers one by one
  }
});

Benefits

Lower costs: Providers are called one at a time, so no parallel calls are made
Predictable: Clear understanding of provider order and costs

Trade-offs

Higher latency: Additional delay when primary provider fails
Sequential delays: Each failed attempt adds to total response time
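
The sequential trade-off can be sketched with a similar toy model. Again, this only illustrates the semantics and is not Adaptive's implementation; `simulateSequential` and the sample records are hypothetical. Providers are tried in order, and each failed attempt adds its latency (capped by the timeout) to the total.

```javascript
// Toy model of sequential mode: providers are tried one after another.
// A failed attempt costs its latency (capped by the timeout) before the next starts.
function simulateSequential(outcomes, timeoutMs) {
  let elapsedMs = 0;
  let callsMade = 0;
  for (const o of outcomes) {
    callsMade += 1;
    const timedOut = o.latencyMs > timeoutMs;
    elapsedMs += Math.min(o.latencyMs, timeoutMs);
    if (o.ok && !timedOut) {
      return { winner: o.name, latencyMs: elapsedMs, callsMade };
    }
  }
  // No provider succeeded within its timeout.
  return { winner: null, latencyMs: elapsedMs, callsMade };
}

const sequentialResult = simulateSequential(
  [
    { name: "provider-a", latencyMs: 300, ok: false },
    { name: "provider-b", latencyMs: 450, ok: true },
    { name: "provider-c", latencyMs: 900, ok: true },
  ],
  30000
);
console.log(sequentialResult);
```

Here the first provider fails after 300 ms and the second succeeds after a further 450 ms, so the request takes 750 ms in total but only two calls are made; the third provider is never contacted.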

Disabled (Default)

Omit the fallback object to leave fallback disabled, which keeps costs under control:

const completion = await openai.chat.completions.create({
  model: "adaptive/auto",
  messages: [{ role: "user", content: "Hello!" }]
  // No fallback configuration = disabled
});

Circuit Breaker Patterns

Automatic Circuit Breaking

Adaptive implements intelligent circuit breakers to prevent cascading failures:

Failure Threshold

Five failures within 60 seconds open the circuit breaker for that provider

Recovery Time

30-second cooldown before attempting to use the provider again

Health Checks

Continuous monitoring to detect when providers recover

Circuit Breaker States

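Closed (Normal)

State: All requests flow through normally
Condition: Provider is healthy and responding successfully
Behavior: No restrictions on request routing

Open (Blocked)

Entered when the failure threshold is reached. Requests are not routed to the provider during the cooldown.

Half-Open (Testing)

Entered after the cooldown, when the provider is tried again to check whether it has recovered.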
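
The states and thresholds above can be sketched as a small state machine. This is a minimal illustration, not Adaptive's internal implementation: the `CircuitBreaker` class, its method names, and the injectable `now` clock are assumptions, and a production breaker would typically limit how many trial requests pass while half-open.

```javascript
// Minimal circuit breaker sketch using the thresholds quoted above
// (5 failures within 60 s open the breaker; 30 s cooldown before a retry).
// Class name, method names, and clock injection are illustrative assumptions.
class CircuitBreaker {
  constructor({ failureThreshold = 5, windowMs = 60000, cooldownMs = 30000, now = Date.now } = {}) {
    this.failureThreshold = failureThreshold;
    this.windowMs = windowMs;
    this.cooldownMs = cooldownMs;
    this.now = now;
    this.failures = []; // timestamps of recent failures
    this.openedAt = null; // set while the breaker is open or half-open
  }

  state() {
    if (this.openedAt === null) return "closed";
    return this.now() - this.openedAt >= this.cooldownMs ? "half-open" : "open";
  }

  // Closed and half-open let a request through; open blocks it.
  allowRequest() {
    return this.state() !== "open";
  }

  // Any success (including a half-open trial) closes the breaker.
  recordSuccess() {
    this.failures = [];
    this.openedAt = null;
  }

  recordFailure() {
    const t = this.now();
    if (this.state() === "half-open") {
      this.openedAt = t; // trial failed: restart the cooldown
      return;
    }
    // Keep only failures inside the sliding window, then add this one.
    this.failures = this.failures.filter((ts) => t - ts < this.windowMs);
    this.failures.push(t);
    if (this.failures.length >= this.failureThreshold) this.openedAt = t;
  }
}

// Walk through the three states with a fake clock.
let fakeTime = 0;
const breaker = new CircuitBreaker({ now: () => fakeTime });
for (let i = 0; i < 5; i++) breaker.recordFailure();
const afterFailures = breaker.state(); // "open"
fakeTime += 30000;
const afterCooldown = breaker.state(); // "half-open"
breaker.recordSuccess();
const afterSuccess = breaker.state(); // "closed"
console.log(afterFailures, afterCooldown, afterSuccess);
```

Injecting the clock keeps the sketch deterministic, so the transitions can be exercised without waiting for real time to pass.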

Reliability Metrics

Uptime: 99.95% across all providers
Failover Speed: <500ms detection and switch time
Recovery Time: <30s provider re-integration
Success Rate: 99.9% with fallback enabled

Configuration Options

Basic Fallback Configuration

fallback (object)

Configuration for provider fallback behavior. Properties:

mode (string, required)

Fallback strategy: "race", "sequential", or "" (disabled)

providers (array, optional)

Custom list of providers to use for fallback

timeout_ms (integer)

Request timeout in milliseconds (default: 30000)
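
If you build the fallback object dynamically, a small client-side check can catch typos before the request is sent. The helper below is a hypothetical sketch, not part of any SDK: `buildFallback` and its error messages are assumptions, and it only encodes the properties and the default listed above.

```javascript
// Hypothetical helper: validates a fallback object against the properties listed above.
const VALID_MODES = ["race", "sequential", ""];

function buildFallback({ mode, providers, timeout_ms = 30000 } = {}) {
  if (!VALID_MODES.includes(mode)) {
    throw new Error(`mode must be "race", "sequential", or "" (got ${JSON.stringify(mode)})`);
  }
  if (!Number.isInteger(timeout_ms) || timeout_ms <= 0) {
    throw new Error("timeout_ms must be a positive integer (milliseconds)");
  }
  const fallback = { mode, timeout_ms };
  // providers is optional; include it only when the caller supplies a list.
  if (providers !== undefined) {
    if (!Array.isArray(providers) || providers.length === 0) {
      throw new Error("providers must be a non-empty array when given");
    }
    fallback.providers = [...providers];
  }
  return fallback;
}

const checkedFallback = buildFallback({
  mode: "sequential",
  providers: ["openai", "anthropic"],
});
console.log(checkedFallback);
```

The returned object can be passed as the `fallback` field of the request shown in the examples on this page.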

Advanced Configuration

const completion = await openai.chat.completions.create({
  model: "adaptive/auto",
  messages: [{ role: "user", content: "Critical business request" }],
  fallback: {
    mode: "sequential",
    providers: ["openai", "anthropic", "deepseek"], // Custom provider order
    timeout_ms: 45000, // Extended timeout for critical requests
    max_retries: 3 // Maximum retry attempts per provider
  }
});

Error Handling

Comprehensive Error Management

try {
  const completion = await openai.chat.completions.create({
    model: "adaptive/auto",
    messages: [{ role: "user", content: "Hello!" }],
    fallback: {
      mode: "sequential"
    }
  });
  
  // Check which provider was used
  console.log(`Used provider: ${completion.provider}`);
  
} catch (error) {
  if (error.code === 'all_providers_failed') {
    console.error('All configured providers are currently unavailable');
    // Implement your fallback strategy (cached response, error message, etc.)
  } else if (error.code === 'timeout') {
    console.error('Request timed out across all providers');
  } else {
    console.error('Unexpected error:', error.message);
  }
}

Error Codes

all_providers_failed

Description: All configured providers returned errors or are unavailable
Action: Implement application-level fallback (cached responses, error messages)

timeout

Description: Request timed out across all attempted providers
Action: Consider increasing timeout_ms or checking network connectivity

rate_limit_exceeded

Description: Rate limits hit across all providers simultaneously
Action: Implement request queuing or backoff strategies

insufficient_quota

Description: Credit/quota exhausted across all providers
Action: Check billing and quota limits on provider accounts
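
One way to act on these codes is to map each one to an application-level reaction. The sketch below is an assumption-laden illustration: `planRecovery`, the action names, and the backoff parameters are hypothetical and should be adapted to your application. It uses exponential backoff for `rate_limit_exceeded`, as suggested above.

```javascript
// Hypothetical mapping from the error codes above to an application-level reaction.
// Action names and default delays are illustrative, not part of the Adaptive API.
function planRecovery(code, attempt = 0, baseDelayMs = 500, maxDelayMs = 30000) {
  switch (code) {
    case "rate_limit_exceeded": {
      // Exponential backoff: 500 ms, 1 s, 2 s, ... capped at maxDelayMs.
      const delayMs = Math.min(baseDelayMs * 2 ** attempt, maxDelayMs);
      return { action: "retry_after_delay", delayMs };
    }
    case "timeout":
      // The docs suggest a larger timeout_ms or a connectivity check.
      return { action: "retry_with_longer_timeout", delayMs: 0 };
    case "all_providers_failed":
      // Fall back to something the application controls, e.g. a cached response.
      return { action: "serve_cached_response", delayMs: 0 };
    case "insufficient_quota":
      // Retrying will not help until billing or quota is fixed.
      return { action: "alert_billing", delayMs: 0 };
    default:
      return { action: "fail", delayMs: 0 };
  }
}

const firstRetry = planRecovery("rate_limit_exceeded", 0);
const fourthRetry = planRecovery("rate_limit_exceeded", 3);
console.log(firstRetry, fourthRetry);
```

Call it from the `catch` block with `error.code` and the number of attempts made so far, then schedule the retry or fallback it suggests.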

Monitoring and Observability

Real-time Metrics

Track resiliency performance in your Adaptive dashboard:

Provider Health

Real-time status: Availability, response times, and error rates for each provider

Failover Events

Event tracking: When, why, and how often failovers occur

Circuit Breaker Status

State monitoring: Current state and history of circuit breakers

Success Rates

Reliability metrics: Success rates with and without fallback enabled

Alerts and Notifications

Provider Outages

Automatic alerts when providers go down or experience degraded performance

Failover Events

Notifications when automatic failover is triggered for your requests

Recovery Events

Updates when providers recover and are re-integrated into rotation

Quota Warnings

Proactive alerts before hitting rate limits or quota exhaustion

Best Practices

When to Enable Fallback

Critical Applications

High-availability needs: Customer-facing applications, real-time systems

Production Workloads

Business-critical: Revenue-generating applications, SLA requirements

Batch Processing

Large-scale operations: Long-running jobs that can’t afford to fail

Emergency Systems

Zero-downtime requirements: Safety-critical or emergency response systems

When to Keep Disabled

Cost-Sensitive Apps

Budget constraints: Development environments, cost-optimized applications

Non-Critical Workloads

Testing environments: Experimental features, internal tools

Batch Jobs

Delay-tolerant: Operations that can retry later without business impact

Development

Local development: Testing and debugging scenarios

Performance Impact

Race Mode Performance

Latency

Best case: 50ms faster than single provider
Worst case: Same as slowest provider

Cost

Typical: 2-3x single provider cost
Maximum: N providers × base cost

Reliability

Failure rate: Exponentially decreased
Uptime: 99.99%+ effective availability
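
The "exponentially decreased" failure rate follows from a simple model: if providers fail independently with the same probability, a race fails only when all of them fail. The functions below are a back-of-the-envelope sketch under that assumption (real outages can be correlated); the function names and the sample failure rate are hypothetical.

```javascript
// Probability that every provider fails, assuming independent failures
// with the same per-provider failure probability (a simplifying assumption).
function combinedFailureRate(perProviderFailureRate, providerCount) {
  return perProviderFailureRate ** providerCount;
}

// Race mode calls every provider, so the upper bound on cost is
// the provider count times the cost of a single call.
function raceCostMultiplier(providerCount) {
  return providerCount;
}

// Made-up example: each provider fails 1% of the time, three providers raced.
const raceFailureRate = combinedFailureRate(0.01, 3);
const raceCost = raceCostMultiplier(3);
console.log(raceFailureRate, raceCost);
```

With these made-up numbers, the chance that all three providers fail is roughly one in a million, at up to three times the cost of a single call.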

Sequential Mode Performance

Latency

Best case: Same as single provider
Worst case: Sum of all timeouts

Cost

Typical: 1.1-1.3x single provider
Maximum: Same as race mode on full failures

Reliability

Failure rate: Significantly decreased
Uptime: 99.9%+ effective availability
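
The sequential figures can be estimated with the same kind of model. Assuming independent failures with an equal per-provider failure probability, one attempt per provider, and no retries, the sketch below computes the expected number of calls per request and the worst-case latency. The function names and sample values are hypothetical.

```javascript
// Expected number of provider calls per request in sequential mode:
// attempt i happens only if the previous i providers all failed.
function expectedSequentialCalls(perProviderFailureRate, providerCount) {
  let expected = 0;
  for (let i = 0; i < providerCount; i++) {
    expected += perProviderFailureRate ** i;
  }
  return expected;
}

// Worst case: every provider runs into the timeout before the request fails.
function worstCaseLatencyMs(timeoutMs, providerCount) {
  return timeoutMs * providerCount;
}

// Made-up example: each provider fails 10% of the time, three providers in order,
// with the default 30000 ms timeout.
const expectedCalls = expectedSequentialCalls(0.1, 3);
const worstLatencyMs = worstCaseLatencyMs(30000, 3);
console.log(expectedCalls, worstLatencyMs);
```

With a 10% per-provider failure rate this gives about 1.11 calls per request, in line with the typical 1.1-1.3x cost range, while the worst case with the default timeout is 90 seconds. This is why reducing timeout_ms speeds up failover.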

Troubleshooting

Common Issues

All Providers Failing

Symptoms: Consistent all_providers_failed errors

Possible Causes:

Network connectivity issues
API key problems across multiple providers
Widespread provider outages
Request format issues

Solutions:

Check network connectivity and DNS resolution
Verify API keys and quotas for all providers
Check provider status pages for outages
Review request format and parameters

High Latency with Sequential Mode

Symptoms: Slow responses when fallback is enabled

Possible Causes:

Primary provider consistently failing
Long timeout values
Network latency to backup providers

Solutions:

Review provider health metrics
Reduce timeout_ms for faster failover
Consider switching to race mode for critical requests
Check provider selection order

Unexpected Costs

Symptoms: Higher than expected API costs

Possible Causes:

Race mode calling multiple providers
Frequent failovers due to provider issues
Misconfigured fallback settings

Solutions:

Review fallback mode configuration
Monitor provider health to identify problematic providers
Consider sequential mode for cost optimization
Set appropriate timeout values

Next Steps

Performance Features

Learn about performance optimizations and caching

Intelligent Routing

Understand how requests are routed to optimal providers
