Overview
LlamaIndex is the leading framework for building context-augmented LLM applications, enabling RAG (Retrieval-Augmented Generation), agents, and workflows over your data. By integrating Adaptive with LlamaIndex, you get intelligent model routing while building powerful data-aware applications.
Key Benefits
- Drop-in replacement - Works with existing LlamaIndex code
- Intelligent routing - Automatic model selection for queries and agents
- Cost optimization - 30-70% cost reduction across RAG pipelines
- RAG-optimized - Adaptive selects models based on query complexity
- Agent support - Smart routing for function-calling agents
- Streaming support - Real-time responses in chat applications
- Multi-modal - Support for text, images, and structured outputs
Installation
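A typical setup with pip, assuming the current split-package layout (the core package plus the OpenAI LLM integration; the `openai-like` package is only needed for Method 2 below):

```shell
pip install llama-index llama-index-llms-openai

# Optional: the OpenAILike wrapper used in Method 2
pip install llama-index-llms-openai-like
```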
Install LlamaIndex with OpenAI support.
Basic Usage
Method 1: Using OpenAI with Custom Base URL (Recommended)
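A minimal sketch: the stock OpenAI LLM pointed at Adaptive's OpenAI-compatible endpoint. The base URL below is a placeholder for the endpoint shown in your Adaptive dashboard, and `ADAPTIVE_API_KEY` is an assumed environment variable name:

```python
import os

from llama_index.core import Settings
from llama_index.llms.openai import OpenAI

llm = OpenAI(
    model="",  # empty model string: let Adaptive choose the model
    api_base="https://your-adaptive-endpoint/v1",  # placeholder URL
    api_key=os.environ["ADAPTIVE_API_KEY"],        # assumed env var
)

# Route every LlamaIndex component through Adaptive by default.
Settings.llm = llm
```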
Configure the standard OpenAI LLM to use Adaptive’s endpoint.
Method 2: Using OpenAILike (Alternative)
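A sketch using `OpenAILike` (from the `llama-index-llms-openai-like` package), which skips OpenAI-specific model-name validation; that is useful when the model string is empty or not an OpenAI model name. Base URL and env var are placeholders as before:

```python
import os

from llama_index.llms.openai_like import OpenAILike

llm = OpenAILike(
    model="",                        # empty: let Adaptive route
    api_base="https://your-adaptive-endpoint/v1",  # placeholder
    api_key=os.environ["ADAPTIVE_API_KEY"],
    is_chat_model=True,              # use the chat-completions API
    is_function_calling_model=True,  # allow tool/function calling
)
```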
Use the OpenAILike class for more explicit configuration.
RAG Examples
Simple RAG Pipeline
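An end-to-end sketch, assuming `Settings.llm` is already pointed at Adaptive as in Basic Usage and that a local `./data` directory holds your documents:

```python
from llama_index.core import VectorStoreIndex, SimpleDirectoryReader

# Load and index local files; queries run through Settings.llm.
documents = SimpleDirectoryReader("data").load_data()
index = VectorStoreIndex.from_documents(documents)

# Each query is answered by whichever model Adaptive routes it to.
query_engine = index.as_query_engine()
response = query_engine.query("What does the document say about pricing?")
print(response)
```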
RAG with Streaming
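The same pipeline with token-by-token output. `streaming=True` makes `query()` return a streaming response whose `response_gen` yields tokens as they arrive:

```python
from llama_index.core import VectorStoreIndex, SimpleDirectoryReader

index = VectorStoreIndex.from_documents(
    SimpleDirectoryReader("data").load_data()
)

query_engine = index.as_query_engine(streaming=True)
streaming_response = query_engine.query("Summarize the key findings.")

# Print tokens as they arrive instead of waiting for the full answer.
for token in streaming_response.response_gen:
    print(token, end="", flush=True)
```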
Advanced RAG with Custom Retrieval
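One way to customize retrieval: over-fetch candidates, then drop weak matches before synthesis. A sketch; the `top_k` and similarity cutoff are arbitrary example values:

```python
from llama_index.core import VectorStoreIndex, SimpleDirectoryReader
from llama_index.core.postprocessor import SimilarityPostprocessor
from llama_index.core.query_engine import RetrieverQueryEngine
from llama_index.core.retrievers import VectorIndexRetriever

index = VectorStoreIndex.from_documents(
    SimpleDirectoryReader("data").load_data()
)

# Fetch more candidates than usual...
retriever = VectorIndexRetriever(index=index, similarity_top_k=10)

# ...then filter low-similarity nodes before the LLM sees them.
query_engine = RetrieverQueryEngine.from_args(
    retriever,
    node_postprocessors=[SimilarityPostprocessor(similarity_cutoff=0.7)],
)

response = query_engine.query("Which sections discuss security?")
```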
Agent Examples
Simple Function-Calling Agent
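A sketch assuming a LlamaIndex release that ships `FunctionCallingAgent` (agent APIs have moved between versions; newer releases use workflow-based agents instead). Endpoint URL and env var are placeholders:

```python
import os

from llama_index.core.agent import FunctionCallingAgent
from llama_index.core.tools import FunctionTool
from llama_index.llms.openai_like import OpenAILike

llm = OpenAILike(
    model="",  # let Adaptive route
    api_base="https://your-adaptive-endpoint/v1",  # placeholder
    api_key=os.environ["ADAPTIVE_API_KEY"],
    is_chat_model=True,
    is_function_calling_model=True,  # required for tool use
)

def multiply(a: float, b: float) -> float:
    """Multiply two numbers and return the result."""
    return a * b

agent = FunctionCallingAgent.from_tools(
    [FunctionTool.from_defaults(fn=multiply)],
    llm=llm,
    verbose=True,
)
response = agent.chat("What is 21.5 times 4?")
```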
RAG Agent with Document Search
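A sketch wrapping a query engine as a tool, so the agent decides when to search the documents (same version caveat and placeholders as above):

```python
import os

from llama_index.core import VectorStoreIndex, SimpleDirectoryReader
from llama_index.core.agent import FunctionCallingAgent
from llama_index.core.tools import QueryEngineTool
from llama_index.llms.openai_like import OpenAILike

llm = OpenAILike(model="", api_base="https://your-adaptive-endpoint/v1",
                 api_key=os.environ["ADAPTIVE_API_KEY"],
                 is_chat_model=True, is_function_calling_model=True)

index = VectorStoreIndex.from_documents(
    SimpleDirectoryReader("data").load_data()
)

# Expose the query engine as a tool the agent can choose to call.
search_tool = QueryEngineTool.from_defaults(
    query_engine=index.as_query_engine(llm=llm),
    name="document_search",
    description="Answers questions about the indexed documents.",
)

agent = FunctionCallingAgent.from_tools([search_tool], llm=llm)
response = agent.chat("What do the docs say about rate limits?")
```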
Multi-Tool Agent
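Combining a plain Python function with a document-search tool in one agent; a sketch under the same version and placeholder assumptions, with the tool names and directory chosen for illustration:

```python
import os

from llama_index.core import VectorStoreIndex, SimpleDirectoryReader
from llama_index.core.agent import FunctionCallingAgent
from llama_index.core.tools import FunctionTool, QueryEngineTool
from llama_index.llms.openai_like import OpenAILike

llm = OpenAILike(model="", api_base="https://your-adaptive-endpoint/v1",
                 api_key=os.environ["ADAPTIVE_API_KEY"],
                 is_chat_model=True, is_function_calling_model=True)

def convert_usd(amount: float, rate: float) -> float:
    """Convert a USD amount using the given exchange rate."""
    return amount * rate

index = VectorStoreIndex.from_documents(
    SimpleDirectoryReader("data").load_data()
)

tools = [
    FunctionTool.from_defaults(fn=convert_usd),
    QueryEngineTool.from_defaults(
        query_engine=index.as_query_engine(llm=llm),
        name="document_search",
        description="Answers questions about the indexed documents.",
    ),
]

# The routed model decides per step which tool (if any) to call.
agent = FunctionCallingAgent.from_tools(tools, llm=llm, verbose=True)
response = agent.chat("Find the listed price and convert it to EUR at 0.9.")
```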
Advanced Patterns
Custom Query Engine with Settings
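A sketch of tuning the pipeline through `Settings` (chunking values are example numbers; URL and env var are placeholders):

```python
import os

from llama_index.core import Settings, VectorStoreIndex, SimpleDirectoryReader
from llama_index.llms.openai import OpenAI

# Global defaults picked up by every index and query engine.
Settings.llm = OpenAI(model="", api_base="https://your-adaptive-endpoint/v1",
                      api_key=os.environ["ADAPTIVE_API_KEY"])
Settings.chunk_size = 512     # smaller chunks, more precise retrieval
Settings.chunk_overlap = 50   # keep context across chunk boundaries

index = VectorStoreIndex.from_documents(
    SimpleDirectoryReader("data").load_data()
)
query_engine = index.as_query_engine(
    similarity_top_k=3,
    response_mode="tree_summarize",  # hierarchical answer synthesis
)
```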
Multi-Document Agents
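One index per corpus, each wrapped as a named tool so the agent can pick the right source per question. A sketch; directory and tool names are examples, and `FunctionCallingAgent` carries the same version caveat as above:

```python
import os

from llama_index.core import VectorStoreIndex, SimpleDirectoryReader
from llama_index.core.agent import FunctionCallingAgent
from llama_index.core.tools import QueryEngineTool
from llama_index.llms.openai_like import OpenAILike

llm = OpenAILike(model="", api_base="https://your-adaptive-endpoint/v1",
                 api_key=os.environ["ADAPTIVE_API_KEY"],
                 is_chat_model=True, is_function_calling_model=True)

notes = VectorStoreIndex.from_documents(
    SimpleDirectoryReader("notes").load_data()
)
specs = VectorStoreIndex.from_documents(
    SimpleDirectoryReader("specs").load_data()
)

tools = [
    QueryEngineTool.from_defaults(
        query_engine=notes.as_query_engine(llm=llm),
        name="meeting_notes",
        description="Questions about meeting notes.",
    ),
    QueryEngineTool.from_defaults(
        query_engine=specs.as_query_engine(llm=llm),
        name="product_specs",
        description="Questions about product specifications.",
    ),
]

agent = FunctionCallingAgent.from_tools(tools, llm=llm)
```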
Chat Engine with Memory
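A sketch of a conversational interface over the index, assuming `Settings.llm` is pointed at Adaptive; the token limit is an example value:

```python
from llama_index.core import VectorStoreIndex, SimpleDirectoryReader
from llama_index.core.memory import ChatMemoryBuffer

index = VectorStoreIndex.from_documents(
    SimpleDirectoryReader("data").load_data()
)

# Keep roughly the last 3000 tokens of conversation history.
memory = ChatMemoryBuffer.from_defaults(token_limit=3000)
chat_engine = index.as_chat_engine(chat_mode="context", memory=memory)

print(chat_engine.chat("What is this project about?"))
print(chat_engine.chat("Who maintains it?"))  # resolved against memory
```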
Configuration Options
LLM Parameters
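A sketch of the commonly used parameters (values are examples; URL and env var are placeholders):

```python
import os

from llama_index.llms.openai import OpenAI

llm = OpenAI(
    model="",            # or a specific model name to pin routing
    api_base="https://your-adaptive-endpoint/v1",  # placeholder
    api_key=os.environ["ADAPTIVE_API_KEY"],
    temperature=0.2,     # lower = more deterministic output
    max_tokens=1024,     # cap on generated tokens
    timeout=60.0,        # seconds before a request is abandoned
    max_retries=3,       # automatic retries on transient failures
    additional_kwargs={"top_p": 0.9},  # passed through to the API
)
```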
All standard OpenAI parameters are supported.
Model Selection Strategy
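The two strategies side by side; a sketch in which the pinned model name is only an example:

```python
import os

from llama_index.llms.openai import OpenAI

base = "https://your-adaptive-endpoint/v1"  # placeholder
key = os.environ["ADAPTIVE_API_KEY"]        # assumed env var

# Empty model string: Adaptive chooses a model per request, trading
# cost against the complexity of each query.
routed = OpenAI(model="", api_base=base, api_key=key)

# Explicit model name: routing is bypassed, behavior is deterministic.
pinned = OpenAI(model="gpt-4o-mini", api_base=base, api_key=key)
```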
Global vs Local Configuration
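A sketch contrasting the two scopes (placeholders as above):

```python
import os

from llama_index.core import Settings, VectorStoreIndex, SimpleDirectoryReader
from llama_index.llms.openai import OpenAI

base = "https://your-adaptive-endpoint/v1"  # placeholder
key = os.environ["ADAPTIVE_API_KEY"]

# Global: the default for every component without its own llm.
Settings.llm = OpenAI(model="", api_base=base, api_key=key)

index = VectorStoreIndex.from_documents(
    SimpleDirectoryReader("data").load_data()
)

# Local: override a single component, e.g. pin a model for one engine.
pinned = OpenAI(model="gpt-4o-mini", api_base=base, api_key=key)  # example
query_engine = index.as_query_engine(llm=pinned)
```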
Embeddings Configuration
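A sketch of the recommended setup, sending embeddings straight to OpenAI since a fixed embedding model gains nothing from routing (requires the `llama-index-embeddings-openai` package):

```python
import os

from llama_index.core import Settings
from llama_index.embeddings.openai import OpenAIEmbedding

Settings.embed_model = OpenAIEmbedding(
    model="text-embedding-3-small",
    api_key=os.environ["OPENAI_API_KEY"],
)
```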
For embeddings, you can use either OpenAI directly or Adaptive.
Best Practices
- Use empty model string for intelligent routing across your RAG pipeline
- Set Settings.llm globally for consistent configuration
- Use specific models when you need deterministic behavior
- Leverage streaming for better user experience in chat applications
- Enable function calling for agents with is_function_calling_model=True
- Use standard OpenAI for embeddings to avoid unnecessary routing overhead
- Configure timeout and retries for production reliability
Error Handling
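A sketch catching the error types raised by the underlying `openai` client (v1+); the query engine is built as in the RAG examples:

```python
from llama_index.core import VectorStoreIndex, SimpleDirectoryReader
from openai import APIError, APITimeoutError, RateLimitError

index = VectorStoreIndex.from_documents(
    SimpleDirectoryReader("data").load_data()
)
query_engine = index.as_query_engine()  # uses Settings.llm -> Adaptive

try:
    print(query_engine.query("What does the document say about pricing?"))
except RateLimitError:
    print("Rate limited; back off and retry.")
except APITimeoutError:
    print("Timed out; consider raising timeout= on the LLM.")
except APIError as err:
    print(f"Upstream API error: {err}")
```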
Debugging and Logging
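A sketch using standard Python logging; at DEBUG level the underlying OpenAI/httpx clients log each HTTP request, which may show the endpoint and model each request resolved to:

```python
import logging
import sys

# Send LlamaIndex logs to stdout at DEBUG level.
logging.basicConfig(stream=sys.stdout, level=logging.DEBUG)
logging.getLogger("llama_index").setLevel(logging.DEBUG)

# The OpenAI client and its HTTP layer log request details at DEBUG.
logging.getLogger("openai").setLevel(logging.DEBUG)
logging.getLogger("httpx").setLevel(logging.DEBUG)
```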
Enable verbose logging to see which models Adaptive selects.
Migration Guide
From Standard OpenAI
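A before/after sketch: the only changes are the base URL, the key, and optionally clearing the model string so Adaptive routes each request (URL and env var are placeholders):

```python
import os

from llama_index.llms.openai import OpenAI

# Before: direct OpenAI
# llm = OpenAI(model="gpt-4o", api_key=os.environ["OPENAI_API_KEY"])

# After: point at Adaptive and let it pick the model.
llm = OpenAI(
    model="",
    api_base="https://your-adaptive-endpoint/v1",  # placeholder
    api_key=os.environ["ADAPTIVE_API_KEY"],
)
```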
From Azure OpenAI
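A before/after sketch; Adaptive speaks the plain OpenAI API, so the Azure deployment ("engine") and api_version settings drop away (the Azure values shown are examples):

```python
import os

from llama_index.llms.openai import OpenAI

# Before: Azure requires a deployment name and an API version.
# from llama_index.llms.azure_openai import AzureOpenAI
# llm = AzureOpenAI(
#     engine="my-gpt4-deployment",
#     azure_endpoint="https://myresource.openai.azure.com/",
#     api_key=os.environ["AZURE_OPENAI_API_KEY"],
#     api_version="2024-02-01",
# )

# After: a single OpenAI-compatible endpoint, no deployment plumbing.
llm = OpenAI(
    model="",
    api_base="https://your-adaptive-endpoint/v1",  # placeholder
    api_key=os.environ["ADAPTIVE_API_KEY"],
)
```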
Complete Example
See the complete LlamaIndex example for a full working implementation including:
- Multi-document RAG pipeline
- Function-calling agents
- Streaming chat interface
- Custom retrieval and re-ranking
- Error handling and logging
- Production-ready configuration
TypeScript/JavaScript Support
LlamaIndex.TS also supports Adaptive.
Next Steps
- Explore LlamaIndex documentation for advanced patterns
- Build production RAG applications with cost optimization
- Create multi-agent systems with intelligent routing
- Implement custom retrievers for specialized use cases
- Try structured outputs with Pydantic models