Models · OpenClaw · Local AI · April 3, 2026 · 8 min read

Run OpenClaw Agents with Qwen3 Locally

Qwen3 from Alibaba is the most stable local model for tool calling in 2026 — the capability that matters most for OpenClaw agents. This guide shows you how to set up OpenClaw with Qwen3 via Ollama: zero API cost, native tool calling, 128K context, Apache 2.0 license.

Why Qwen3 for OpenClaw Agents

Most local models have a tool calling problem. They either hallucinate tool calls (call a tool that does not exist), drop parameters (call the right tool but forget to pass a required argument), or lose track of where they are mid-workflow. Qwen3 consistently outperforms other local models on exactly these failure modes.

For OpenClaw agents that use skills — web browsing, file management, API calls — tool calling reliability is the difference between an agent that works and one that you have to babysit. Qwen3 30B-A3B delivers cloud-model-level tool calling reliability on local hardware.

- Most stable tool calling: ranks first in 2026 local LLM tool-calling evaluations; rarely drops parameters or hallucinates tool calls.
- MoE efficiency: 30B-A3B activates only 3B parameters per token, so it runs at 3B speed with 30B quality.
- 128K context window: holds long agent conversations and large documents without truncation.
- Apache 2.0 license: commercial use permitted; no royalties, no usage restrictions.
- Zero API cost: runs entirely on your hardware; no per-token charges.
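The RAM figures in the table below follow mostly from quantized weight size. A rough back-of-envelope sketch (assuming roughly 4-bit quantization, typical of Ollama's default builds; exact overhead varies with context length and runtime):

```python
def quantized_weight_gb(params_billion: float, bits_per_weight: float = 4) -> float:
    """Approximate in-RAM size of the model weights in GB."""
    return params_billion * 1e9 * bits_per_weight / 8 / 1e9

# 30B total parameters at ~4 bits/weight -> ~15 GB of weights;
# KV cache and runtime buffers push the practical need toward ~20 GB.
print(round(quantized_weight_gb(30), 1))  # prints 15.0
```

The same arithmetic explains why the 8B variant fits comfortably in 8GB machines: ~4GB of weights plus headroom.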

Qwen3 Model Variants: Which One to Pick

| Model | RAM needed | Context | License | Best for |
|---|---|---|---|---|
| qwen3:1.7b | 4GB | 32K | Apache 2.0 | Triage only, very low RAM |
| qwen3:8b | 8GB | 128K | Apache 2.0 | Good all-rounder, 8GB machines |
| qwen3:30b-a3b | 20GB | 128K | Apache 2.0 | Best choice (recommended) |
| qwen3:235b-a22b | 140GB+ | 128K | Qwen Research | High-end servers only |

Recommendation: qwen3:30b-a3b. The MoE design activates only ~3B of its 30B parameters per token, so it generates at small-model speed while its full weights (roughly 20GB in RAM) deliver the quality of a much larger dense model. Apple Silicon Macs (M2 Pro, M3, M4) handle this comfortably. If your machine has less than 16GB of RAM, use qwen3:8b instead.

Setup: OpenClaw + Qwen3 via Ollama

Step 1: Install Ollama and pull Qwen3

# Install Ollama
brew install ollama   # macOS
# or download from ollama.ai for Windows/Linux

# Start the Ollama server (pull and run both need it running)
ollama serve &

# Pull Qwen3 30B-A3B (recommended)
ollama pull qwen3:30b-a3b

# Or 8B if RAM is limited
ollama pull qwen3:8b

# Verify
ollama run qwen3:30b-a3b "What tools do you have available?"

Step 2: Add Qwen3 to OpenClaw config

~/.openclaw/openclaw.json
{
  "models": {
    "providers": {
      "ollama": {
        "baseUrl": "http://127.0.0.1:11434",
        "apiKey": "ollama-local",
        "api": "ollama",
        "models": [
          {
            "id": "qwen3:30b-a3b",
            "name": "Qwen3 30B MoE",
            "contextWindow": 128000,
            "maxOutput": 8192,
            "toolCalling": true
          },
          {
            "id": "qwen3:8b",
            "name": "Qwen3 8B",
            "contextWindow": 128000,
            "maxOutput": 4096,
            "toolCalling": true
          }
        ]
      }
    }
  }
}

Important: set "api": "ollama" to use Ollama's native API, not its OpenAI-compatibility layer. Qwen3's tool calling is significantly more reliable through the native API.
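For reference, a native-mode request follows the shape of Ollama's /api/chat payload, with OpenAI-style function schemas under "tools". A minimal sketch of that body (the get_weather tool is hypothetical, purely for illustration, and not an OpenClaw skill):

```python
import json

# Minimal Ollama /api/chat request body with one tool attached.
# The get_weather tool is a made-up example, not an OpenClaw skill.
payload = {
    "model": "qwen3:30b-a3b",
    "messages": [{"role": "user", "content": "What's the weather in Berlin?"}],
    "tools": [{
        "type": "function",
        "function": {
            "name": "get_weather",
            "description": "Look up current weather for a city",
            "parameters": {
                "type": "object",
                "properties": {"city": {"type": "string"}},
                "required": ["city"],
            },
        },
    }],
    "stream": False,  # one complete JSON response instead of a token stream
}

print(sorted(payload))  # prints ['messages', 'model', 'stream', 'tools']
```

POST this to http://127.0.0.1:11434/api/chat; when the model decides to use a tool, the invocation comes back under the response's message.tool_calls field.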

Step 3: Configure your SOUL.md

agents/researcher/SOUL.md
# Research Analyst

## Identity
- Name: Radar
- Role: Market Research Analyst
- Model: ollama/qwen3:30b-a3b
- Timezone: UTC

## Personality
- Thorough and data-driven
- Cites sources with every factual claim
- Uses structured output with headers and bullet points

## Rules
- Search the web before answering factual questions
- Include source URL and date for every reference
- Summarize findings in: Key Points → Details → Sources
- Flag information older than 6 months

## Skills
- browser: Search and read web pages
- files: Read and write local files

## Channels
- Telegram:
    token: ${TELEGRAM_BOT_TOKEN}
    allowed_users: [${ALLOWED_USER_ID}]
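The ${TELEGRAM_BOT_TOKEN} and ${ALLOWED_USER_ID} placeholders are expanded from the environment. A small pre-flight check (assuming those variable names) catches missing values before the agent starts:

```python
import os

def missing_vars(names, env=None):
    """Return the names that are unset or empty in the given environment."""
    env = os.environ if env is None else env
    return [n for n in names if not env.get(n)]

# Variables referenced in the SOUL.md above.
required = ["TELEGRAM_BOT_TOKEN", "ALLOWED_USER_ID"]

# Deterministic demo against a fake environment:
print(missing_vars(required, {"TELEGRAM_BOT_TOKEN": "123:abc"}))
# prints ['ALLOWED_USER_ID']
```

Call missing_vars(required) with no env argument to check your real shell environment before registering the agent.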

Step 4: Register and start

# Ensure Ollama is running
ollama serve &

# Register agent
openclaw agents add radar --workspace ./agents/researcher

# Start gateway
openclaw gateway start

# Test
openclaw agent --agent radar --message "Find the top 3 AI agent frameworks in 2026"

Best Agent Types for Qwen3

Multi-tool workflow agents (best fit)

Qwen3's standout strength. Agents that need to call browser → read results → call files → write summary stay on task through the entire chain without dropping steps. This is where Qwen3 separates itself from other local models.
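Conceptually, such a chain is a sequence of tool calls where each step's output feeds the next. A toy dispatcher (hypothetical handlers standing in for OpenClaw's skills, which manage this internally) shows why a single dropped parameter breaks the whole run:

```python
# Toy stand-ins for OpenClaw skills -- not the real implementations.
def browser(query: str) -> str:
    return f"results for {query!r}"

def files(path: str, content: str) -> str:
    return f"wrote {len(content)} chars to {path}"

TOOLS = {"browser": browser, "files": files}

def run_chain(calls):
    """Run tool calls in order; a missing required argument raises TypeError."""
    results = []
    for name, kwargs in calls:
        results.append(TOOLS[name](**kwargs))
    return results

out = run_chain([
    ("browser", {"query": "AI agent frameworks 2026"}),
    ("files", {"path": "summary.md", "content": "top 3 frameworks ..."}),
])
print(len(out))  # prints 2
```

A model that forgets the "content" argument on the second call fails the entire workflow, which is exactly the failure mode Qwen3 avoids.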

Research and analysis agents (best fit)

128K context + reliable browser tool use = a research agent that can read multiple sources, synthesize findings, and produce structured output consistently.

Code review and documentation agents (strong fit)

Qwen3 performs well on code understanding. Feed it a PR diff or a function, and it produces accurate, useful review comments and documentation.

Private data processing agents (strong fit)

Runs entirely offline: agents can process internal documents, emails, or confidential business data without any data leaving your machine.

Qwen3 vs Other Local Models for OpenClaw

| Model | Tool calling | Context | RAM | Verdict |
|---|---|---|---|---|
| Qwen3 30B-A3B | Best | 128K | 20GB | Top pick for agents |
| Gemma 4 26B | Excellent | 256K | 16GB | Better context, similar quality |
| Llama 3.1 8B | Good | 128K | 8GB | Solid, less RAM |
| Mistral 7B | Partial | 32K | 8GB | Limited context, weaker tools |
| Qwen3 8B | Very good | 128K | 8GB | Best in class at 8GB |

Known Limitations

RAM requirement

Qwen3 30B-A3B needs ~20GB RAM. If your machine has 16GB, the model may work but with limited headroom. Use qwen3:8b on 16GB machines for a comfortable experience.

No real-time internet access

Like all local models, Qwen3 has no built-in internet. Add the browser skill to your SOUL.md so agents can search and read the web. Without it, the model reasons from training data only.

235B requires server hardware

The largest Qwen3 variant (235B-A22B) needs 140GB+ RAM. This is not a consumer device model. For individual use, stick with 30B-A3B or 8B.

Slower than cloud APIs

On Apple Silicon M2 Pro, Qwen3 30B-A3B generates at roughly 15-25 tokens/second. Fast enough for async agents, but noticeably slower than cloud APIs for interactive real-time use.
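Those throughput numbers translate directly into wait time. A quick sketch for estimating response latency at a given decode speed (figures from the paragraph above):

```python
def gen_seconds(tokens: int, tokens_per_sec: float) -> float:
    """Seconds to generate `tokens` at a steady decode rate."""
    return tokens / tokens_per_sec

# A 500-token answer at the observed 15-25 tok/s range:
print(round(gen_seconds(500, 25)), "to", round(gen_seconds(500, 15)), "seconds")
# prints: 20 to 33 seconds
```

Fine for a background research agent; noticeable in a live chat.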


Frequently Asked Questions

Which Qwen3 model is best for OpenClaw agents?

Qwen3 30B-A3B is the recommended choice. It uses Mixture of Experts architecture that activates only 3B parameters during inference, so it runs at the speed of a 3B model while delivering the quality of a 30B model. It needs about 20GB RAM. If you have less than 16GB RAM, Qwen3 8B is a solid alternative — good tool calling, fits in 8GB RAM.

How does Qwen3 tool calling compare to other local models?

Qwen3 consistently ranks as the most stable local model for tool calling in 2026 evaluations. It rarely hallucinates tool calls or drops parameters mid-call — the two most common failure modes in local model tool use. Gemma 4 26B and Qwen3 30B-A3B are roughly comparable, with Qwen3 having a slight edge on complex multi-tool workflows.

Can I use Qwen3 with the browser skill in OpenClaw?

Yes. Qwen3's native tool calling works with OpenClaw's browser skill. Set up the browser skill in your SOUL.md, use Ollama native API mode (not OpenAI compatibility), and Qwen3 will reliably call the browser tool to search and read web pages. This is where Qwen3 outperforms many other local models — it stays on task during multi-step browser interactions.

Is Qwen3 free to use commercially?

Yes. Qwen3 models up to 30B are released under the Apache 2.0 license, which permits commercial use. The 235B model uses a custom Qwen Research License. For OpenClaw agents, the 30B-A3B variant is the sweet spot — Apache 2.0, fits on consumer hardware, excellent performance.

What is the difference between Qwen3 and Qwen3.5?

Qwen3 is the standard text model family (0.6B to 235B). Qwen3.5 is a multimodal variant with vision capabilities (up to 397B total parameters, 17B active). For most OpenClaw agent tasks, Qwen3 30B-A3B is the right choice — it handles text and tool use extremely well without the hardware requirements of Qwen3.5.

Pre-configured agent templates optimized for Qwen3

CrewClaw templates come with the right model, skills, and rules pre-configured. Download, point at your local Qwen3 instance, and deploy in minutes.
