Overview
The Gemini Generate Content endpoint provides a Google Gemini API-compatible interface for generating text, code, and structured content. Use this endpoint with the official @google/genai SDK or any Gemini-compatible client.
This endpoint is fully compatible with Google’s Gemini API, allowing you to use the official Google Gen AI SDK while benefiting from Adaptive’s intelligent routing, cost optimization, and multi-provider support.
Authentication
x-goog-api-key: Your Adaptive API key. Also supported: Authorization: Bearer, X-API-Key, and api-key headers.
Path Parameters
model: The model to use for generation. Supports Gemini model names and Adaptive's intelligent routing. Examples:
- gemini-2.5-pro: Latest Gemini Pro model
- gemini-2.5-flash: Fast Gemini Flash model
- gemini-1.5-pro: Gemini 1.5 Pro model
- Custom model aliases configured in Adaptive
Request Body
contents: An array of content parts representing the conversation history or prompt.

"contents": [
  {
    "role": "user",
    "parts": [
      { "text": "Explain quantum computing in simple terms" }
    ]
  }
]
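The contents array can be built programmatically before sending a request. A minimal sketch; the Part/Content types and the userTurn helper are illustrative, not part of the SDK:

```typescript
// Minimal shape of a Gemini-style content entry (illustrative types).
type Part = { text: string };
type Content = { role: 'user' | 'model'; parts: Part[] };

// Hypothetical helper that wraps a plain string as a user turn.
function userTurn(text: string): Content {
  return { role: 'user', parts: [{ text }] };
}

const contents: Content[] = [
  userTurn('Explain quantum computing in simple terms')
];
console.log(JSON.stringify({ contents }));
```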
generationConfig: Generation configuration parameters.
- temperature: Controls randomness in generation (0.0 to 2.0). Default: 1.0
- topP: Nucleus sampling parameter (0.0 to 1.0). Default: 0.95
- topK: Top-K sampling parameter. Default: 40
- maxOutputTokens: Maximum tokens to generate. Default: 8192
- stopSequences: Sequences that stop generation when encountered.
- candidateCount: Number of response candidates to generate. Default: 1
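The documented ranges can be enforced client-side before a request is sent. A sketch assuming the standard Gemini generationConfig field names; the validateConfig helper is illustrative, not part of the SDK:

```typescript
interface GenerationConfig {
  temperature?: number;     // 0.0 to 2.0, default 1.0
  topP?: number;            // 0.0 to 1.0, default 0.95
  topK?: number;            // default 40
  maxOutputTokens?: number; // default 8192
  candidateCount?: number;  // default 1
  stopSequences?: string[];
}

// Illustrative client-side check of the documented parameter ranges.
function validateConfig(cfg: GenerationConfig): string[] {
  const errors: string[] = [];
  if (cfg.temperature !== undefined && (cfg.temperature < 0 || cfg.temperature > 2)) {
    errors.push('temperature must be in [0.0, 2.0]');
  }
  if (cfg.topP !== undefined && (cfg.topP < 0 || cfg.topP > 1)) {
    errors.push('topP must be in [0.0, 1.0]');
  }
  return errors;
}
```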
provider_configs (Adaptive Extension): Provider-specific configuration overrides.

"provider_configs": {
  "anthropic": { "temperature": 0.7 },
  "openai": { "temperature": 0.8 }
}
model_router (Adaptive Extension): Control intelligent routing behavior.
- Enable/disable intelligent routing. Default: true
- model_router.fallback_models: List of fallback models if the primary model fails.
- model_router.cost_optimization: Enable cost-based model selection. Default: true
semantic_cache (Adaptive Extension): Semantic caching configuration.

"semantic_cache": {
  "enabled": true,
  "similarity_threshold": 0.95
}

prompt_cache (Adaptive Extension): Prompt caching configuration.

"prompt_cache": {
  "enabled": true,
  "ttl": 3600
}

fallback (Adaptive Extension): Fallback configuration for provider failures.

"fallback": {
  "enabled": true,
  "max_retries": 3
}
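Putting the extensions together, a full request body combines the standard Gemini fields with Adaptive's extension blocks. A sketch using the field names documented above; the values are examples only:

```typescript
// Example request body: standard Gemini fields plus Adaptive extensions.
const requestBody = {
  contents: [{ role: 'user', parts: [{ text: 'Hello' }] }],
  generationConfig: { temperature: 0.7, maxOutputTokens: 1024 },
  provider_configs: { anthropic: { temperature: 0.7 } },
  semantic_cache: { enabled: true, similarity_threshold: 0.95 },
  prompt_cache: { enabled: true, ttl: 3600 },
  fallback: { enabled: true, max_retries: 3 },
};
console.log(JSON.stringify(requestBody, null, 2));
```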
Response
candidates: Array of generated response candidates. Each candidate carries the generated content.

"content": {
  "parts": [
    { "text": "Quantum computing uses quantum..." }
  ],
  "role": "model"
}
finishReason: Reason the generation stopped: STOP, MAX_TOKENS, SAFETY, RECITATION, or OTHER.
safetyRatings: Safety classification ratings for the generated content.
citationMetadata: Citation information for referenced sources.
usageMetadata: Token usage information.
- promptTokenCount: Number of tokens in the prompt.
- candidatesTokenCount: Number of tokens in the generated response.
- totalTokenCount: Total tokens used (prompt + completion).
- cache_tier (Adaptive Extension): Cache tier used (none, prompt, or semantic).
modelVersion: The actual model version used for generation.
provider (Adaptive Extension): The provider that handled the request (e.g., google, anthropic, openai).
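A response with these fields can be unpacked as follows. The sample object is illustrative; the token-count field names assume the standard Gemini usageMetadata shape, while cache_tier and provider are the Adaptive extensions described above:

```typescript
// Illustrative response shape based on the fields documented above.
const sampleResponse = {
  candidates: [
    {
      content: { parts: [{ text: 'Quantum computing uses quantum...' }], role: 'model' },
      finishReason: 'STOP',
    },
  ],
  usageMetadata: { promptTokenCount: 8, candidatesTokenCount: 42, totalTokenCount: 50, cache_tier: 'semantic' },
  provider: 'google',
};

// Pull out the generated text and the token accounting.
const text = sampleResponse.candidates[0].content.parts[0].text;
const total = sampleResponse.usageMetadata.totalTokenCount;
console.log(text, total);
```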
Code Examples
TypeScript (Google Gen AI SDK)
import { GoogleGenAI } from '@google/genai';

const ai = new GoogleGenAI({
  apiKey: process.env.GEMINI_API_KEY,
  httpOptions: {
    baseUrl: 'https://www.llmadaptive.uk/api/v1beta'
  }
});

const response = await ai.models.generateContent({
  model: 'gemini-2.5-pro',
  contents: [
    {
      role: 'user',
      parts: [
        { text: 'Explain quantum computing in simple terms' }
      ]
    }
  ],
  config: {
    temperature: 0.7,
    maxOutputTokens: 1024
  }
});

console.log(response.candidates[0].content.parts[0].text);
console.log('Provider:', response.provider);
console.log('Tokens used:', response.usageMetadata.totalTokenCount);
Advanced Examples
Multi-Turn Conversation
const response = await ai.models.generateContent({
  model: 'gemini-2.5-pro',
  contents: [
    {
      role: 'user',
      parts: [{ text: 'What is the capital of France?' }]
    },
    {
      role: 'model',
      parts: [{ text: 'The capital of France is Paris.' }]
    },
    {
      role: 'user',
      parts: [{ text: 'What is its population?' }]
    }
  ]
});
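For longer conversations, the history can be kept as an array of alternating turns and resent as the contents field on each request. A sketch; the history holder and addTurn helper are illustrative, not the SDK's chat API:

```typescript
type Part = { text: string };
type Content = { role: 'user' | 'model'; parts: Part[] };

// Illustrative history holder: append each user turn and model reply,
// then send the whole array as `contents` on the next request.
const history: Content[] = [];
function addTurn(role: 'user' | 'model', text: string): void {
  history.push({ role, parts: [{ text }] });
}

addTurn('user', 'What is the capital of France?');
addTurn('model', 'The capital of France is Paris.');
addTurn('user', 'What is its population?');
```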
With Adaptive Extensions
const response = await ai.models.generateContent({
  model: 'gemini-2.5-pro',
  contents: [
    {
      role: 'user',
      parts: [{ text: 'Write a sorting algorithm in Python' }]
    }
  ],
  config: {
    temperature: 0.3,
    maxOutputTokens: 2048
  },
  // Adaptive-specific features
  semantic_cache: {
    enabled: true,
    similarity_threshold: 0.95
  },
  fallback: {
    enabled: true,
    max_retries: 3
  },
  model_router: {
    cost_optimization: true,
    fallback_models: ['claude-sonnet-4-20250514', 'gpt-4o']
  }
});

console.log('Cache tier:', response.usageMetadata.cache_tier);
console.log('Provider:', response.provider);
Error Responses
error: Error information when the request fails.
- code: HTTP status code (400, 401, 429, 500, etc.)
- message: Human-readable error message.
- status: Error status: INVALID_ARGUMENT, UNAUTHENTICATED, PERMISSION_DENIED, RESOURCE_EXHAUSTED, or INTERNAL.
Common Errors
{
  "error": {
    "code": 401,
    "message": "API key required. Provide it via x-goog-api-key, Authorization: Bearer, X-API-Key, or api-key header",
    "status": "UNAUTHENTICATED"
  }
}

Solution: Provide a valid API key in the x-goog-api-key header or one of the other supported headers.
{
  "error": {
    "code": 400,
    "message": "Invalid request format",
    "status": "INVALID_ARGUMENT"
  }
}

Solution: Check your request body format. Ensure the contents array is properly formatted with valid roles and parts.
{
  "error": {
    "code": 429,
    "message": "Rate limit exceeded",
    "status": "RESOURCE_EXHAUSTED"
  }
}

Solution: Reduce your request rate or upgrade your plan for higher limits. Adaptive's load balancing helps distribute requests across providers.
{
  "error": {
    "code": 500,
    "message": "Internal server error",
    "status": "INTERNAL"
  }
}

Solution: Temporary server issue. Adaptive's fallback system will automatically retry with alternative providers.
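A client can branch on the status field to decide whether a failed request is worth retrying. A minimal sketch; the retry classification is an illustrative policy, not part of the API:

```typescript
// Statuses worth retrying: rate limits (RESOURCE_EXHAUSTED) and transient
// server errors (INTERNAL). Client-side errors should be fixed, not retried.
function shouldRetry(status: string): boolean {
  return status === 'RESOURCE_EXHAUSTED' || status === 'INTERNAL';
}
```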
Features & Benefits
- Google SDK Compatible: Drop-in replacement for Google's Gemini API; use the official @google/genai SDK without changes.
- Multi-Provider Routing: Access models from Google, Anthropic, OpenAI, and more through a single endpoint.
- Intelligent Caching: Semantic and prompt caching reduce costs by up to 90% for similar requests.
- Automatic Fallbacks: Provider failures automatically route to alternative models for high reliability.
- Cost Optimization: Intelligent routing selects the most cost-effective model for each request.
- Usage Analytics: Detailed token usage, costs, and performance metrics in the dashboard.
SDK Integration
For full SDK integration guide with code examples and best practices, see: