If you're building AI agents or applications that use large language models, you've probably wrestled with the multi-provider problem: OpenAI has GPT, Anthropic has Claude, Google has Gemini, Meta has Llama, Mistral has its lineup, and each one has its own API, billing system, rate limits, and quirks. Managing keys, handling outages, comparing costs, and switching models means juggling half a dozen integrations.
OpenRouter solves this. It's a unified API gateway that sits between your application and all the major LLM providers. You get one API key, one endpoint, one billing system, and access to 400+ models from 60+ providers. Your existing OpenAI SDK code works unchanged; you just swap the base URL and key.
1. What Is OpenRouter?
OpenRouter is an API aggregation platform: think of it as a smart proxy for LLMs. Founded as an independent company (not affiliated with any model provider), it provides a single OpenAI-compatible REST endpoint that routes your requests to the underlying provider of whichever model you choose.
The core value proposition:
- Unified access: 400+ models from OpenAI, Anthropic, Google, Meta, Mistral, Cohere, DeepSeek, and many more, all through one `/v1/chat/completions` endpoint
- OpenAI SDK compatible: Change your base URL and API key; your existing code works
- Intelligent routing: Automatic fallbacks when providers go down, model-level routing, provider preferences
- Transparent pricing: Pass-through provider rates with a flat 5.5% platform fee on credit purchases
- Privacy-first: Prompt logging off by default, Zero Data Retention (ZDR) mode available
2. How It Works
The architecture is straightforward:
- You send a request to `https://openrouter.ai/api/v1/chat/completions` with a model identifier like `anthropic/claude-sonnet-4` or `openai/gpt-5`
- OpenRouter normalizes your request to the target provider's format
- The request is routed to the best available provider for that model (some models are available from multiple providers)
- The response is normalized back to OpenAI-compatible format and returned to you
- If a provider fails, OpenRouter can automatically retry with a fallback provider or model
All of this happens transparently. From your application's perspective, you're calling a single API that always responds in the same format, regardless of which model or provider is actually serving the request.
```bash
# The simplest possible OpenRouter request
curl https://openrouter.ai/api/v1/chat/completions \
  -H "Authorization: Bearer $OPENROUTER_API_KEY" \
  -H "Content-Type: application/json" \
  -d '{
    "model": "anthropic/claude-sonnet-4",
    "messages": [
      {"role": "user", "content": "Explain quantum computing in 3 sentences."}
    ]
  }'
```
The response comes back in standard OpenAI format (`choices[0].message.content`), regardless of which provider served it.
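Because every response is normalized to the same schema, a parser written once works for every model. A minimal sketch; the payload below is a hypothetical example trimmed to the fields most apps read:

```python
# Parse a normalized OpenAI-style chat completion payload from OpenRouter.
# The sample payload is illustrative; real responses carry more fields.
sample = {
    "id": "gen-abc123",
    "model": "anthropic/claude-sonnet-4",
    "choices": [
        {"message": {"role": "assistant", "content": "Quantum computing uses qubits..."}}
    ],
    "usage": {"prompt_tokens": 12, "completion_tokens": 48, "total_tokens": 60},
}

def extract_reply(payload: dict):
    """Return the assistant text and total token count from a completion payload."""
    text = payload["choices"][0]["message"]["content"]
    tokens = payload.get("usage", {}).get("total_tokens", 0)
    return text, tokens

reply, used = extract_reply(sample)
print(used)  # 60
```

The same two lines of extraction work whether the request was served by Anthropic, OpenAI, or any other provider behind the gateway.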
3. Supported Models & Providers
OpenRouter's catalog is massive and constantly growing. As of February 2026, it includes 400+ models from 60+ providers. Here are the major families:
| Provider | Notable Models | Category |
|---|---|---|
| OpenAI | GPT-5, GPT-4o, GPT-4o-mini, o1, o3-mini | Frontier / Reasoning |
| Anthropic | Claude Opus 4, Claude Sonnet 4, Claude 3.5 Haiku | Frontier / Code |
| Google | Gemini 2.5 Pro, Gemini 2.5 Flash, Gemini 2.0 | Multimodal / Reasoning |
| Meta | Llama 3.3 70B, Llama 3.1 405B, Llama 4 Scout/Maverick | Open-weight |
| Mistral | Mistral Large, Mistral Medium, Codestral | Efficient / Code |
| DeepSeek | DeepSeek-V3, DeepSeek-R1 | Reasoning / Budget |
| Cohere | Command R+, Command R | Enterprise / RAG |
| Microsoft | Phi-4, WizardLM | Small / Efficient |
| xAI | Grok-2, Grok-3 | General Purpose |
| Various | Qwen, Yi, Gemma, Nous, Dolphin, etc. | Open-source / Fine-tuned |
Model identifiers follow a consistent `provider/model-name` format. You can browse the full catalog at openrouter.ai/models or query it programmatically:
```bash
# List all available models via API
curl https://openrouter.ai/api/v1/models \
  -H "Authorization: Bearer $OPENROUTER_API_KEY" | jq '.data[] | {id, pricing}'
```
The free tier gives you access to 25+ models from 4 providers. The paid tier unlocks the full 400+ model catalog across all 60+ providers.
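The free-tier variants are marked with a `:free` suffix in their model ids (as in `meta-llama/llama-3.3-70b-instruct:free`) and report zero per-token pricing. A small sketch of filtering the `/v1/models` listing for them; the sample array below mimics the endpoint's `data` field and is illustrative:

```python
# Filter an OpenRouter /v1/models listing for zero-cost models.
# The sample below mimics the response's "data" array; fields are illustrative.
models = [
    {"id": "meta-llama/llama-3.3-70b-instruct:free",
     "pricing": {"prompt": "0", "completion": "0"}},
    {"id": "anthropic/claude-sonnet-4",
     "pricing": {"prompt": "0.000003", "completion": "0.000015"}},
]

def free_models(data: list) -> list:
    """Return model ids that cost nothing per token (the ':free' variants)."""
    return [
        m["id"] for m in data
        if float(m["pricing"]["prompt"]) == 0.0
        and float(m["pricing"]["completion"]) == 0.0
    ]

print(free_models(models))  # ['meta-llama/llama-3.3-70b-instruct:free']
```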
4. API Compatibility: Drop-In OpenAI Replacement
This is OpenRouter's killer feature for developers: it's fully OpenAI SDK compatible. If your code already works with the OpenAI API, switching to OpenRouter is a two-line change:
```python
# Before: direct OpenAI
from openai import OpenAI

client = OpenAI(api_key="sk-...")

# After: OpenRouter; change base_url and key, nothing else
from openai import OpenAI

client = OpenAI(
    base_url="https://openrouter.ai/api/v1",
    api_key="sk-or-...",  # your OpenRouter key
)

# The same code works for ANY model
response = client.chat.completions.create(
    model="anthropic/claude-sonnet-4",  # or openai/gpt-5, google/gemini-2.5-pro, etc.
    messages=[{"role": "user", "content": "Hello!"}],
    stream=True,
)
for chunk in response:
    if chunk.choices[0].delta.content:
        print(chunk.choices[0].delta.content, end="")
```
This compatibility extends to:
- Chat completions: `/v1/chat/completions` (text, vision, streaming)
- Model listing: `/v1/models`
- Streaming via SSE: same `stream: true` parameter, same event format
- Function calling / tool use: works with models that support it
- JSON mode / structured output: supported where the underlying model supports it
- Multimodal inputs: images, audio, PDFs (model-dependent)
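Since tool use follows the OpenAI schema, a tool-calling request body looks the same regardless of the target model. A sketch that assembles such a body; the `get_weather` tool is hypothetical, and only models that support tool use will act on it:

```python
# Assemble an OpenAI-style tool-calling request body, as you would POST it
# to OpenRouter's /v1/chat/completions. The get_weather tool is hypothetical.
def build_tool_request(model: str, question: str) -> dict:
    return {
        "model": model,
        "messages": [{"role": "user", "content": question}],
        "tools": [{
            "type": "function",
            "function": {
                "name": "get_weather",
                "description": "Look up current weather for a city",
                "parameters": {
                    "type": "object",
                    "properties": {"city": {"type": "string"}},
                    "required": ["city"],
                },
            },
        }],
    }

body = build_tool_request("anthropic/claude-sonnet-4", "Weather in Oslo?")
# body can be sent with any HTTP client, or passed field-by-field
# to client.chat.completions.create(...)
```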
5. Key Features
Automatic Fallbacks
When a provider goes down or rate-limits you, OpenRouter can automatically fail over to another provider serving the same model, or to an entirely different fallback model you specify:
```bash
# Model fallbacks: if Claude fails, try GPT-5, then Gemini
curl https://openrouter.ai/api/v1/chat/completions \
  -H "Authorization: Bearer $OPENROUTER_API_KEY" \
  -H "Content-Type: application/json" \
  -d '{
    "model": "anthropic/claude-sonnet-4",
    "messages": [{"role": "user", "content": "Analyze this data..."}],
    "models": [
      "anthropic/claude-sonnet-4",
      "openai/gpt-5",
      "google/gemini-2.5-pro"
    ],
    "route": "fallback"
  }'
```
If the first model is unavailable, rate-limited, or errors out, OpenRouter automatically tries the next one in the list. Your application never sees the failure.
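The same fallback list can be sent from the OpenAI SDK, which forwards fields it doesn't know about through its `extra_body` parameter. A sketch under that assumption; no request is made here, the kwargs are just assembled:

```python
# Assemble kwargs for client.chat.completions.create() that carry
# OpenRouter's fallback fields ("models", "route") via extra_body.
def fallback_kwargs(prompt: str, models: list) -> dict:
    return {
        "model": models[0],  # primary model
        "messages": [{"role": "user", "content": prompt}],
        "extra_body": {
            "models": models,  # ordered fallback list
            "route": "fallback",
        },
    }

kwargs = fallback_kwargs(
    "Analyze this data...",
    ["anthropic/claude-sonnet-4", "openai/gpt-5", "google/gemini-2.5-pro"],
)
# response = client.chat.completions.create(**kwargs)
```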
Provider Routing & Preferences
Many models (especially open-weight ones like Llama) are served by multiple inference providers. OpenRouter lets you control which providers to use, avoid, or prefer:
```
# Provider routing: prefer specific providers, require ZDR
{
  "model": "meta-llama/llama-3.3-70b-instruct",
  "messages": [...],
  "provider": {
    "order": ["Together", "Fireworks"],
    "allow_fallbacks": true,
    "require_parameters": true,
    "data_collection": "deny"
  }
}
```
Auto Router
OpenRouter's Auto Router uses a lightweight classifier to analyze your prompt and automatically select the best model for the task, optimizing for quality, speed, or cost based on your preferences. Use `openrouter/auto` as your model:
```
# Let OpenRouter pick the best model automatically
{
  "model": "openrouter/auto",
  "messages": [{"role": "user", "content": "Write a haiku about Kubernetes"}]
}
```
BYOK: Bring Your Own Keys
If you already have API keys with providers like OpenAI or Anthropic, you can configure them in OpenRouter and route through your own accounts. Benefits:
- Use your existing enterprise agreements and volume discounts
- Up to 1M free requests/month on BYOK before the 5% fee kicks in
- Still get OpenRouter's routing, fallbacks, and unified billing dashboard
Usage Tracking & Analytics
OpenRouter provides a detailed dashboard showing cost per model, token usage over time, and per-request breakdowns. You can also query usage programmatically via the /api/v1/auth/key endpoint to check your remaining credits and rate limits.
Privacy Controls
- Prompt logging off by default: OpenRouter doesn't store your prompts unless you opt in
- Zero Data Retention (ZDR): enforce that providers don't store your data
- SOC 2 Type I certified (mid-2026)
- Minimal metadata retention: only timestamps, token counts, and performance metrics
6. Pricing & Tiers
OpenRouter's pricing is refreshingly transparent: you pay the exact same per-token rate as going direct to the provider, plus a flat platform fee on credit purchases.
Free Tier
- No platform fees
- Access to 25+ free models from 4 providers
- ~50 requests/day on free models
- Great for experimentation and testing
Pay-as-you-Go
- 5.5% platform fee on credit purchases (minimum $0.80)
- 5% fee on crypto purchases (no minimum)
- Access to all 400+ models from 60+ providers
- Credits expire after 365 days
- Refunds within 24 hours (card purchases)
- Email support
Enterprise
- Volume discounts on credit purchases
- Contract SLAs and dedicated support
- SSO/SAML integration
- Admin-controlled API keys and permissions
- Budget controls, prompt caching, cost governance
- Custom invoicing
Pricing Example: Real Math
Let's say you buy $100 in credits via card:
- You pay: $105.50 ($100 + 5.5% fee)
- You receive: $100 in spendable credits
- Using Claude Sonnet 4: At $3/$15 per 1M input/output tokens, $100 gets you ~6.6M output tokens or ~33M input tokens
- BYOK scenario: If routing through your own Anthropic key, an additional 5% usage fee applies ($5.00 on $100 of usage = $5.00 deducted from credits)
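The token math above can be reproduced with two lines of arithmetic. A small helper, using the Claude Sonnet 4 rates quoted in the text ($3 input / $15 output per 1M tokens):

```python
# Worked version of the pricing math: token budget for a credit balance.
def millions_of_tokens(credits_usd: float, price_per_million: float) -> float:
    """How many millions of tokens credits_usd buys at price_per_million $/1M."""
    return credits_usd / price_per_million

# Claude Sonnet 4 at $3 / $15 per 1M input/output tokens (rates from the text)
print(round(millions_of_tokens(100, 3), 1))   # 33.3 (million input tokens)
print(round(millions_of_tokens(100, 15), 1))  # 6.7 (million output tokens)
```

These match the ~33M input / ~6.6M output figures above (the output figure rounds to 6.7 at one decimal place).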
Rate Limits
Rate limits are dynamic and depend on your tier and credit balance:
- Free tier: ~50 requests/day, ~1,000 tokens/minute on free models
- Paid tier: Much higher limits, scaled with credit balance
- BYOK: Up to 1M free requests/month, subject to underlying provider limits
- Rate limit headers (`X-RateLimit-*`) are included in every response
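Those headers can be read off any response to drive client-side throttling. A sketch; the `X-RateLimit-*` prefix comes from the docs, but the specific header names in the sample dict are illustrative, so inspect a real response for the exact set:

```python
# Collect X-RateLimit-* headers from a response's header mapping.
# The specific names below are illustrative; only the prefix is documented.
sample_headers = {
    "X-RateLimit-Limit": "200",
    "X-RateLimit-Remaining": "173",
    "Content-Type": "application/json",
}

def rate_limit_info(headers: dict) -> dict:
    """Extract all rate-limit headers into a plain dict of ints."""
    return {
        k: int(v) for k, v in headers.items()
        if k.lower().startswith("x-ratelimit-")
    }

print(rate_limit_info(sample_headers))
```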
7. OpenClaw Integration: Using OpenRouter with Your Agents
This is where it gets interesting for us. OpenClaw is an AI agent framework that supports any OpenAI-compatible provider, which means OpenRouter works as a drop-in provider. Here's why you'd want to do this and exactly how to set it up.
Why Use OpenRouter with OpenClaw?
- Model flexibility: switch your agent between Claude, GPT, Gemini, or any other model by changing a single config value, with no API key juggling
- Cost optimization: use expensive models (Claude Opus) for complex reasoning tasks and cheap models (Llama 70B, GPT-4o-mini) for routine work, all through one provider
- Redundancy: if Anthropic's API goes down, your OpenClaw agent keeps running via fallback models with no manual intervention
- Experimentation: test how different models handle your agent's tasks without reconfiguring anything
- Unified billing: one bill instead of separate accounts with OpenAI, Anthropic, Google, etc.
Configuration
Since OpenClaw uses the OpenAI SDK under the hood, configuring OpenRouter is straightforward. Set the base URL and API key either via environment variables or in your OpenClaw configuration:
Option 1: Environment Variables
```bash
# Set OpenRouter as your LLM provider
export OPENAI_BASE_URL="https://openrouter.ai/api/v1"
export OPENAI_API_KEY="sk-or-v1-your-openrouter-key-here"

# Now start OpenClaw; it will route all LLM calls through OpenRouter
openclaw gateway start
```
Option 2: Per-Model Configuration
OpenClaw lets you specify different models for different tasks. With OpenRouter, you can use any model identifier from the OpenRouter catalog:
```yaml
# In your OpenClaw configuration, use OpenRouter model identifiers:

# Main agent model (complex reasoning)
model: "anthropic/claude-opus-4"

# Default/fallback model (cost-effective)
default_model: "anthropic/claude-sonnet-4"

# For subagents or lighter tasks
subagent_model: "openai/gpt-4o-mini"
```
Option 3: Dynamic Model Switching in Agent Code
Because OpenRouter is OpenAI-compatible, any OpenClaw skill or tool that makes LLM calls can specify a model at request time:
```python
# In a Python skill, switch models per task
import os

from openai import OpenAI

client = OpenAI(
    base_url="https://openrouter.ai/api/v1",
    api_key=os.environ["OPENROUTER_API_KEY"],
)

# Heavy reasoning task: use the best model
analysis = client.chat.completions.create(
    model="anthropic/claude-opus-4",
    messages=[{"role": "user", "content": complex_prompt}],
)

# Simple formatting task: use a cheap model
formatted = client.chat.completions.create(
    model="openai/gpt-4o-mini",
    messages=[{"role": "user", "content": simple_prompt}],
)
```
Practical Use Cases for OpenClaw + OpenRouter
| Use Case | Recommended Model | Why |
|---|---|---|
| Complex reasoning / code | `anthropic/claude-opus-4` | Best at nuanced tasks, long context |
| General agent tasks | `anthropic/claude-sonnet-4` | Best balance of quality and cost |
| Quick tool calls / routing | `openai/gpt-4o-mini` | Fast, cheap, good enough for simple tasks |
| Research / web analysis | `google/gemini-2.5-pro` | 1M+ token context, strong at analysis |
| Budget batch processing | `deepseek/deepseek-chat-v3` | Excellent quality at very low cost |
| Auto-select best model | `openrouter/auto` | Let OpenRouter's classifier choose |
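This mapping is simple enough to encode directly in agent code. A tiny routing helper following the table; the task-category names are an illustrative convention, not anything OpenClaw or OpenRouter defines:

```python
# Map a task category to an OpenRouter model id, mirroring the table above.
# The category names are an illustrative convention; adjust to your agent.
TASK_MODELS = {
    "reasoning": "anthropic/claude-opus-4",
    "general": "anthropic/claude-sonnet-4",
    "tool_call": "openai/gpt-4o-mini",
    "research": "google/gemini-2.5-pro",
    "batch": "deepseek/deepseek-chat-v3",
}

def pick_model(task: str) -> str:
    """Fall back to OpenRouter's Auto Router for unrecognized task types."""
    return TASK_MODELS.get(task, "openrouter/auto")

print(pick_model("tool_call"))  # openai/gpt-4o-mini
print(pick_model("poetry"))     # openrouter/auto
```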
8. Getting Started
Getting up and running with OpenRouter takes about 2 minutes:
Step 1: Create an Account
Go to openrouter.ai and sign up. You can use Google, GitHub, or email.
Step 2: Get Your API Key
Navigate to Settings → Keys and create a new API key. It starts with `sk-or-v1-`.
Step 3: Add Credits (Optional)
For free models, no credits needed. For paid models, add credits via card or crypto at Settings → Credits.
Step 4: Make Your First Request
```bash
# Test with a free model (no credits needed)
curl https://openrouter.ai/api/v1/chat/completions \
  -H "Authorization: Bearer sk-or-v1-YOUR-KEY" \
  -H "Content-Type: application/json" \
  -d '{
    "model": "meta-llama/llama-3.3-70b-instruct:free",
    "messages": [
      {"role": "user", "content": "What is OpenRouter?"}
    ]
  }'
```
Step 5: Use with OpenAI SDK
```bash
# Install the OpenAI SDK
pip install openai
```

```python
# Use it with OpenRouter
from openai import OpenAI

client = OpenAI(
    base_url="https://openrouter.ai/api/v1",
    api_key="sk-or-v1-YOUR-KEY",
)

response = client.chat.completions.create(
    model="openai/gpt-4o-mini",
    messages=[{"role": "user", "content": "Hello from OpenRouter!"}],
)
print(response.choices[0].message.content)
```
Step 6: Wire into OpenClaw
```bash
# Set environment variables and start your agent
export OPENAI_BASE_URL="https://openrouter.ai/api/v1"
export OPENAI_API_KEY="sk-or-v1-YOUR-KEY"

# Start OpenClaw; all LLM calls now go through OpenRouter
openclaw gateway start

# Verify it's working
openclaw gateway status
```
9. Pros & Cons vs. Going Direct
Advantages of OpenRouter
- One integration for all providers: massive time savings
- Automatic fallbacks: your app stays up when providers go down
- Easy model comparison: swap model names, compare results
- Unified billing: one dashboard, one bill
- Privacy controls: ZDR, logging opt-in, provider filtering
- Free tier: test models without spending anything
- OpenAI SDK compatible: zero code changes for most tools
- BYOK support: use your existing provider keys through OpenRouter
- Auto Router: AI-powered model selection per prompt
Disadvantages
- 5.5% platform fee: adds up at high volume (enterprise discounts available)
- Additional latency: ~25-40ms routing overhead per request
- Dependency risk: another service in your stack that can go down
- Credit expiration: unused credits expire after 365 days
- Feature lag: the newest provider features (like OpenAI's latest beta endpoints) may take days to weeks to appear
- Not all endpoints: primarily chat completions; specialized endpoints (fine-tuning, embeddings, images) may not be supported for all providers
When to Use OpenRouter vs. Direct
| Scenario | Recommendation |
|---|---|
| You use one model from one provider exclusively | Go direct; save the 5.5% fee |
| You use multiple models or want to experiment | OpenRouter; unified access is worth the fee |
| Uptime is critical for your agent | OpenRouter; automatic fallbacks are invaluable |
| You're building a multi-model agent framework | OpenRouter; one integration instead of many |
| You need specialized APIs (fine-tuning, embeddings) | Go direct; OpenRouter focuses on chat completions |
| You want to minimize latency (<10ms matters) | Go direct; avoid the routing overhead |
| You're prototyping or testing | OpenRouter; free tier, instant access to everything |
References
- OpenRouter: The Unified Interface for LLMs (openrouter.ai)
- OpenRouter Quickstart Guide (openrouter.ai/docs)
- OpenRouter Models Documentation (400+ models overview)
- Model Fallbacks Documentation (automatic failover configuration)
- Provider Routing Guide (intelligent multi-provider routing)
- Model Routing / Auto Router (dynamic model selection)
- OpenRouter Pricing (current pricing and tier details)
- OpenRouter FAQ (rate limits, fees, and BYOK details)
- OpenRouter Review 2025, Skywork AI (independent review)
- OpenClaw AI Agent Framework (openclaw.ai)