Cheaper Model for OpenClaw: Free and Budget LLMs That Actually Work

← Back to Blog

OpenClaw Setup

Cheaper Model for OpenClaw: Free and Budget LLMs That Actually Work

OpenClaw ships with sensible defaults, but those defaults tend to point at frontier models. GPT-5.4, Claude Sonnet 4, Gemini Pro -- capable, but expensive when you have agents running on cron schedules, processing batches, or handling routine tasks that do not require top-tier reasoning. The good news: most OpenClaw workloads run just as well on models that cost a fraction of the price, and several excellent models are completely free.

This article covers every practical way to cut your OpenClaw model costs: free models through OpenRouter, using subscriptions you already pay for via OAuth, per-agent model routing so expensive models only run where they matter, and fallback chains that keep your agents running when a provider hiccups.

The Three Ways to Pay for OpenClaw Models

Before choosing a cheaper model, it helps to understand the three billing paths OpenClaw supports. Each has different cost characteristics and tradeoffs:

Method How it works Best for Typical cost
API Key Get a key from a provider (OpenAI, Anthropic, Google, Groq). Pay per token consumed. OpenAI offers a Flex tier at ~50% off standard pricing. Heavy users who need precise cost control and high rate limits $1-15 per million tokens (Flex tier cuts OpenAI prices ~50%)
OAuth Log in with your existing ChatGPT Plus/Pro subscription. No per-token charges. (OpenAI only -- Anthropic removed third-party OAuth in April 2026.) ChatGPT subscribers who want to avoid double-paying for API access $0 extra (included in $20-200/mo subscription)
OpenRouter Single API key to 300+ models from every major provider. Many models are free. Cost-conscious users, multi-model setups, anyone who wants to experiment Free to $15/M tokens depending on model

You are not locked into one method. OpenClaw supports multiple providers simultaneously -- you can use OAuth for your primary model, OpenRouter for a free fallback, and a direct API key for a specific agent that needs high throughput. The sections below cover each option in detail.

Free and Near-Free Models That Actually Work

The landscape of free LLM access has improved dramatically. These models are available at zero cost through OpenRouter and are genuinely capable for agentic workloads:

Model Provider Strengths Good for
Gemini 2.5 Flash Google Fast, high quality, large context window General assistant, data extraction, summarization
Claude Haiku 4.5 Anthropic Fast, good instruction following High-volume extraction, formatting, classification
GPT-4o-mini OpenAI Reliable, consistent output quality Daily tasks, monitoring agents, structured output
Llama 3.3 70B Meta (open source) Strong reasoning for an open model Code generation, analysis, tasks needing open-weight models
DeepSeek Chat v3.1 DeepSeek Strong reasoning, competitive with frontier models on benchmarks Complex analysis, multi-step reasoning, research tasks

These are not toy models. Gemini 2.5 Flash, for example, handles tool calling, multi-step reasoning, and long documents well enough that it serves as the default model in Zedly's own internal agent routing. For most OpenClaw users running daily automation, monitoring, or extraction workflows, one of these free models is the right starting point.

The key insight: start with a free model and only upgrade specific agents that demonstrably need more capability. Most people discover that 80% of their agents run fine on budget models.

OpenRouter: One API Key, 300+ Models

OpenRouter is a unified API gateway that sits between your OpenClaw instance and LLM providers. Instead of managing separate API keys for OpenAI, Anthropic, Google, and Meta, you get a single sk-or-... key that routes to any of 300+ models.

Why this matters for OpenClaw users:

  • Free models: Several high-quality models cost nothing. You can run agents indefinitely on Gemini Flash or GPT-4o-mini without spending a dollar.
  • Easy model switching: Changing from one model to another is a config change, not a provider migration. No new API keys, no new billing accounts.
  • Credit-based billing: For paid models, OpenRouter charges per token with no minimum commitment. Add $5 of credit and experiment with frontier models without a subscription.
  • Rate limit pooling: OpenRouter distributes requests across multiple provider accounts, which can give you higher effective rate limits than a single direct API key.

Setting up OpenRouter in OpenClaw takes one command after you have your API key:

openclaw models add openrouter --api-key sk-or-your-key-here

Then set your preferred model as the default:

openclaw models set-default openrouter/google/gemini-2.5-flash

That is it. Every agent that does not have a model override will now use Gemini Flash at zero cost through OpenRouter.

Use Your Existing Subscription (OAuth)

If you already pay for ChatGPT Plus ($20/mo) or ChatGPT Pro ($200/mo), you are paying for model access you might not be fully using. OpenClaw can route requests through your existing OpenAI subscription via OAuth -- no additional API charges.

Important: As of April 4, 2026, Anthropic no longer allows Claude Pro or Claude Max subscriptions to be used with OpenClaw or other third-party tools. If you see older guides suggesting Claude OAuth as an option, that path is closed. For Anthropic models, you now need either a direct API key or access through OpenRouter.

The tradeoff for OpenAI OAuth is straightforward:

Factor OAuth (ChatGPT subscription) API key (pay-per-token)
Cost $0 extra beyond your subscription $2-15 per million tokens (or ~50% less with Flex tier)
Rate limits Consumer-tier limits (may throttle under heavy use) Higher limits, scales with spend
Setup Browser login required (one-time) Paste API key (no browser needed)
Models available OpenAI models included in your subscription tier All models the provider offers via API

For individuals and small teams running a few agents, OAuth is often the cheapest path to frontier OpenAI models. A ChatGPT Plus subscriber gets GPT-5.4 access through OpenClaw at no extra cost. The rate limits are adequate for agents that run a few times a day. If you hit rate limits frequently, that is when a direct API key makes more sense.

OAuth setup requires a one-time browser login on your machine. The Zedly Setup Assistant initiates the auth flow and verifies it works -- you just complete the browser login when prompted.

OpenAI Flex Tier: Half-Price API Access

If you want API-level rate limits without the full API price, OpenAI offers Flex tier through platform.openai.com. Flex tier gives you access to the same GPT models at roughly half the standard per-token price, with the tradeoff that your requests get lower priority during peak demand periods.

For OpenClaw agents that are not latency-sensitive -- cron jobs, batch processing, overnight research tasks -- Flex tier is an excellent middle ground. You get the full capability of GPT-5.4 or GPT-4o at a fraction of the cost, and the slightly longer response times during peak hours rarely matter for background automation. Create an API key at platform.openai.com, configure it in OpenClaw, and requests automatically use Flex pricing.

Per-Agent Model Routing

Not every agent needs the same model. A contract analysis agent that handles nuanced legal language benefits from a frontier model. A daily scraping agent that checks a website and saves a summary does not. OpenClaw lets you assign different models to different agents, so you only spend on capability where it matters.

Agent task Recommended model Why Approx. cost per run
Contract clause extraction GPT-5.4 / Claude Sonnet 4 Needs strong reasoning and precision $0.10-0.50
Daily web scraping GPT-4o-mini (free via OpenRouter) Simple extraction, high frequency $0.00
Trend monitoring (X, Reddit) Gemini 2.5 Flash (free via OpenRouter) Fast, handles large context $0.00
Code review Claude Sonnet 4 / GPT-5.4 Needs code understanding and nuance $0.05-0.30
Email drafting Claude Haiku 4.5 (free via OpenRouter) Good writing quality at zero cost $0.00
Data pipeline monitoring GPT-4o-mini (free via OpenRouter) Reliable structured output $0.00

The pattern is clear: reserve frontier models for agents that do complex reasoning, and let everything else run on free or near-free models. For most users, this means 2-3 agents on a paid model and the rest on free models, cutting total spend by 70-90%.

For a deeper look at model routing in production environments, see OpenClaw for Enterprise. For tracking whether your routing decisions are paying off, the cost dashboard shows spend broken down by agent, model, and job.

Fallback Chains for Reliability and Cost

A cheap model is not useful if it goes down and takes your agents with it. OpenClaw supports fallback chains: if the primary model fails or is rate-limited, it automatically switches to a backup model without interrupting the workflow.

A good fallback strategy for cost-optimized setups:

  1. Primary: Free model via OpenRouter (e.g. Gemini 2.5 Flash)
  2. Fallback 1: Another free model via OpenRouter (e.g. GPT-4o-mini)
  3. Fallback 2: A paid model via direct API key (e.g. GPT-5.4-mini) for when free options are both down

With this setup, your agents run for free under normal conditions. If Gemini Flash has an outage, GPT-4o-mini picks up. If OpenRouter itself has issues, the direct API key to OpenAI takes over. You pay only when the free options are unavailable -- which in practice is rare.

Fallback chains also protect against rate limiting. Free-tier models on OpenRouter have lower rate limits than paid tiers. If an agent hits the limit on the primary model, the fallback kicks in immediately rather than waiting for the rate limit to reset.

What This Looks Like in Practice

Here is a simplified view of an openclaw.json configured for cost-optimized model routing with OpenRouter as the primary provider and fallbacks:

{
  "providers": {
    "openrouter": {
      "type": "openrouter",
      "apiKey": "sk-or-your-key-here"
    },
    "openai": {
      "type": "openai",
      "apiKey": "sk-proj-your-key-here"
    }
  },
  "models": {
    "default": "openrouter/google/gemini-2.5-flash",
    "fallbacks": [
      "openrouter/openai/gpt-4o-mini",
      "openai/gpt-5.4-mini"
    ]
  },
  "agents": {
    "contract-reviewer": {
      "model": "openai/gpt-5.4"
    },
    "daily-scraper": {
      "model": "openrouter/google/gemini-2.5-flash"
    },
    "trend-monitor": {
      "model": "openrouter/anthropic/claude-haiku-4.5"
    }
  }
}

In this configuration:

  • Most agents use Gemini Flash (free) by default
  • If Gemini Flash is unavailable, GPT-4o-mini (free) takes over
  • If both free models are down, GPT-5.4-mini (cheap, direct API) serves as the last resort
  • The contract reviewer gets GPT-5.4 directly because it needs the reasoning quality
  • The daily scraper and trend monitor are explicitly pinned to free models

The actual configuration has a few more fields (endpoints, model lists, auth profiles), but the structure above captures the cost-relevant decisions. The OpenClaw documentation covers the full config schema.

Let Us Configure This for You

Model selection, OpenRouter setup, OAuth configuration, per-agent routing, and fallback chains -- there are a lot of knobs to turn. The Zedly Setup Assistant handles all of it in a single session. An AI-assisted engineer connects to your machine via a secure reverse SSH tunnel, configures your providers, sets up cost-optimized model routing, and verifies everything works end-to-end.

  • 15-30 minute session -- providers configured, models assigned, fallbacks tested
  • Per-session SSH keys -- unique key per session, auto-revoked on disconnect
  • Full session log -- every command logged and exportable
Join the Setup Waitlist See Full Details

Frequently Asked Questions

Can I use OpenClaw for free?

Yes. OpenClaw itself is free and open source. The cost comes from LLM providers. If you use OpenRouter, several high-quality models are available at zero cost, including Google Gemini 2.5 Flash, GPT-4o-mini, and Claude Haiku 4.5. Alternatively, if you already pay for a ChatGPT Plus subscription, you can connect OpenClaw via OAuth and use OpenAI models at no additional per-token cost. Note that Anthropic discontinued third-party OAuth access in April 2026, so Claude subscriptions cannot be used this way.

What is OpenRouter and how does it work with OpenClaw?

OpenRouter is a unified API gateway that provides access to 300+ LLM models from multiple providers through a single API key. You sign up at openrouter.ai, get an API key starting with sk-or-, and configure it in OpenClaw. OpenRouter handles provider routing, rate limits, and billing. Many models on OpenRouter are completely free to use.

How do I change the default model in OpenClaw?

Run 'openclaw models set-default <provider>/<model>' from the terminal. For example, 'openclaw models set-default openrouter/google/gemini-2.5-flash' sets Gemini Flash as the default. You can also configure per-agent models in openclaw.json so different agents use different models based on task complexity.

Is the free tier good enough for production use?

For many workloads, yes. Gemini 2.5 Flash and GPT-4o-mini handle daily assistant tasks, data extraction, content generation, and monitoring workflows well. For tasks requiring deep reasoning, complex code generation, or nuanced legal analysis, you may want a frontier model for those specific agents while keeping everything else on budget models.

Can different OpenClaw agents use different models?

Yes. OpenClaw supports per-agent model configuration. You can assign GPT-5.4 to a contract analysis agent that needs strong reasoning, GPT-4o-mini to a daily scraping agent that processes high volume, and Gemini Flash to a monitoring agent that runs on a cron schedule. Each agent uses the model best suited to its task and budget.