How to Run OpenClaw with Ollama: Free Local AI Agents
Run AI agents on your own machine without paying for API calls. This guide shows you how to connect OpenClaw to Ollama, choose the right local model, and run a fully functional AI agent with zero external dependencies.
Why Run AI Agents Locally?
Cloud APIs like Anthropic and OpenAI charge per token. For an agent that processes hundreds of messages per day, costs add up quickly. Running OpenClaw with Ollama eliminates API costs entirely — your agents run on your hardware, your data stays on your machine, and there are no rate limits or usage caps.
Local agents are also ideal for development and testing. You can iterate on SOUL.md configurations, test multi-agent workflows, and debug agent behavior without burning through API credits. Once your agents work correctly, you can optionally switch to a cloud provider for production use.
Prerequisites
Node.js 18+ — Required for OpenClaw. Run node --version to check (see the quick verification snippet after this list).
Ollama — Download from ollama.com. Available for macOS, Linux, and Windows.
8 GB+ RAM — Minimum for small models. 16 GB recommended for Llama 3 8B.
GPU (optional) — Dramatically speeds up inference. Any modern NVIDIA or Apple Silicon GPU works.
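Before installing anything, a quick sanity check confirms your machine meets these requirements. The exact commands vary by platform; the snippet below assumes macOS or Linux.
# Check Node.js version (should print v18 or higher)
node --version
# Check total memory
free -h                # Linux
sysctl -n hw.memsize   # macOS (prints total RAM in bytes)
# Check for an NVIDIA GPU (optional)
nvidia-smi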
Step 1: Install Ollama and Pull a Model
Ollama is a lightweight runtime for running large language models locally. Install it, then download a model that your agent will use as its brain.
# macOS (using Homebrew)
brew install ollama
# Linux
curl -fsSL https://ollama.com/install.sh | sh
# Start the Ollama service
ollama serve
# Pull a model (in a new terminal)
ollama pull llama3
# Or for a smaller model:
ollama pull mistral
# Or for a tiny model (2B params):
ollama pull gemma:2b
The download size depends on the model: Gemma 2B is about 1.4 GB, Mistral 7B is about 4 GB, and Llama 3 8B is about 4.7 GB. Downloads happen once and are cached locally.
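To confirm everything is in place before wiring up OpenClaw, list the downloaded models and ping the Ollama API directly. These are standard Ollama commands, and the default endpoint is http://localhost:11434.
# List models that are downloaded and ready to use
ollama list
# Confirm the Ollama server is responding on the default port
curl http://localhost:11434/api/tags
# Optional: quick smoke test of the model itself
ollama run llama3 "Say hello in one sentence."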
Step 2: Configure OpenClaw for Ollama
Point OpenClaw to your local Ollama instance instead of a cloud provider. No API key is needed — Ollama runs on your machine and OpenClaw connects to it via localhost.
# Initialize OpenClaw (skip if already done)
npx openclaw init
# Configure Ollama as the model provider
openclaw models auth paste-token --provider ollama
# When prompted for a token, just press Enter (no key needed)
OpenClaw automatically detects Ollama running on the default port (11434). If you changed the Ollama port or are running it on a different machine, you may need to set the endpoint in your configuration.
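If Ollama is not on the default port or runs on another machine, the OLLAMA_HOST environment variable controls where the Ollama server binds. The sketch below covers the Ollama side only; how you point OpenClaw at a non-default endpoint depends on your OpenClaw configuration, so treat the last step as an assumption to verify against your own setup.
# Run Ollama on a non-default address and port (server side)
OLLAMA_HOST=0.0.0.0:11500 ollama serve
# Verify the endpoint is reachable from the machine running OpenClaw
# (replace 192.168.1.50 with the host actually running Ollama)
curl http://192.168.1.50:11500/api/tags
# Assumption: set OpenClaw's model endpoint to the same URL in its configuration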
Step 3: Create and Register Your Agent
Create a SOUL.md file that defines your agent's personality and behavior, then register it with OpenClaw. You can generate a SOUL.md using the CrewClaw Generator or write one from scratch.
# Create agent workspace
mkdir -p agents/researcher
# Create a minimal SOUL.md
cat > agents/researcher/SOUL.md << 'EOF'
# Researcher Agent
## Identity
You are a research specialist who finds, analyzes, and summarizes information.
## Rules
- Always cite sources when making claims
- Present findings in a structured format
- Ask clarifying questions when the research topic is too broad
## Tone
Professional, thorough, and objective.
EOF
# Register the agent
openclaw agents add researcher --workspace ./agents/researcher --non-interactive
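The same pattern scales to additional agents: each one gets its own workspace directory and SOUL.md and is registered with the same command. A minimal second agent might look like the sketch below; the writer name and rules are illustrative, not required.
# Create a second agent workspace
mkdir -p agents/writer
cat > agents/writer/SOUL.md << 'EOF'
# Writer Agent
## Identity
You are a content writer who turns research notes into clear, readable drafts.
## Rules
- Keep paragraphs short
- Flag any claim that needs a citation
EOF
# Register it alongside the researcher
openclaw agents add writer --workspace ./agents/writer --non-interactive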
Step 4: Start Chatting with Your Local Agent
# Send a message directly
openclaw agent --agent researcher --message "Summarize the key differences between REST and GraphQL"
# Or start the gateway for web/Telegram access
openclaw gateway start
# Then open http://localhost:18789
The first message may take a few seconds as Ollama loads the model into memory. Subsequent messages are significantly faster. On Apple Silicon Macs, expect 20-40 tokens per second with Llama 3 8B.
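If responses feel slow, check whether the delay is in Ollama itself rather than OpenClaw. Calling the Ollama API directly isolates raw model speed from anything the agent layer adds; both commands below are standard Ollama tooling.
# See which models are currently loaded into memory
ollama ps
# Time a raw generation against the model, bypassing OpenClaw entirely
curl http://localhost:11434/api/generate -d '{
  "model": "llama3",
  "prompt": "List three uses of GraphQL.",
  "stream": false
}'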
Recommended Models for OpenClaw Agents
| Model | Size | RAM | Best For |
|---|---|---|---|
| Gemma 2B | 1.4 GB | 8 GB | Quick tasks, classification |
| Mistral 7B | 4.1 GB | 16 GB | General purpose, fast |
| Llama 3 8B | 4.7 GB | 16 GB | Recommended — best quality/speed balance |
| Mixtral 8x7B | 26 GB | 32 GB | Complex reasoning, multilingual |
| Llama 3 70B | 40 GB | 64 GB | Near cloud-quality output |
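Each of these models is available from the Ollama registry. The tags below are the common defaults, though tag names can change over time, so confirm against the Ollama model library if a pull fails.
ollama pull gemma:2b       # smallest, fastest
ollama pull mistral        # 7B general purpose
ollama pull llama3         # 8B, recommended default
ollama pull mixtral:8x7b   # needs ~32 GB RAM
ollama pull llama3:70b     # needs ~64 GB RAM or a large GPU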
Running on Windows with WSL
OpenClaw does not run natively on Windows but works perfectly through WSL 2 (Windows Subsystem for Linux). The setup takes about 10 minutes and gives you a full Linux environment inside Windows.
# 1. Install WSL (run in admin PowerShell)
wsl --install
# 2. Restart your computer, then open Ubuntu from Start menu
# 3. Install Node.js inside WSL
curl -fsSL https://deb.nodesource.com/setup_20.x | sudo -E bash -
sudo apt-get install -y nodejs
# 4. Install Ollama inside WSL
curl -fsSL https://ollama.com/install.sh | sh
# 5. Now follow the standard OpenClaw setup
npx openclaw init
ollama pull llama3
openclaw models auth paste-token --provider ollama
If you already have Ollama installed natively on Windows, WSL can usually reach it via http://localhost:11434 or http://host.docker.internal:11434, depending on your WSL networking configuration.
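A quick way to see which Ollama instance WSL is actually talking to is to query the API from inside the WSL shell; whichever endpoint answers is the one OpenClaw will use.
# From inside WSL: check the default local endpoint first
curl http://localhost:11434/api/tags
# If that fails and Ollama runs natively on Windows, try the Windows host
# (the exact address depends on your WSL networking mode)
curl http://host.docker.internal:11434/api/tags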
Frequently Asked Questions
Can I run OpenClaw completely offline with Ollama?
Yes. Once Ollama has downloaded a model, both Ollama and OpenClaw run entirely on your machine with no internet connection required. This makes it suitable for air-gapped environments, sensitive data processing, and situations where you cannot send data to external APIs. The only time you need internet is for the initial model download and OpenClaw installation.
Which Ollama model works best with OpenClaw agents?
For most agent tasks, Llama 3 8B offers the best balance of quality and speed. It handles content writing, research summaries, and code review well on machines with 16 GB RAM. If you have a GPU with 24+ GB VRAM, Llama 3 70B delivers quality closer to Claude or GPT-4. For lightweight tasks like classification or simple Q&A, Mistral 7B or Gemma 2B run faster with less memory.
How much RAM do I need to run Ollama with OpenClaw?
A minimum of 8 GB RAM is needed for small models like Gemma 2B or Phi-3 Mini. For the recommended Llama 3 8B model, 16 GB RAM is ideal. Larger models like Llama 3 70B or Mixtral 8x7B need 32-64 GB RAM or a dedicated GPU. OpenClaw itself uses minimal resources — the memory requirement is almost entirely driven by the Ollama model size.
Does OpenClaw with Ollama work on Windows?
OpenClaw runs on Windows through WSL 2 (Windows Subsystem for Linux). Install WSL with 'wsl --install' from an admin PowerShell, then install Node.js and Ollama inside the WSL environment. Ollama for Windows can also run natively and be accessed from WSL via the localhost API endpoint. The setup takes about 10 minutes and works identically to the Linux experience.
Can I mix Ollama and cloud providers in the same OpenClaw team?
Yes. OpenClaw supports per-agent model configuration. You can run your content writer on Claude Sonnet for high-quality writing while using a local Ollama model for your research agent to save costs. Each agent's SOUL.md can specify which model provider and model to use independently. This hybrid approach optimizes both cost and quality.
Is local Ollama as good as Claude or GPT-4 for OpenClaw agents?
For simple tasks like summarization, classification, and structured data extraction, local models perform comparably. For complex reasoning, creative writing, and nuanced instruction following, cloud models like Claude Sonnet and GPT-4 still outperform most local alternatives. The practical approach is to start with Ollama for development and testing, then switch to cloud models for production agents that need top-tier output quality.
Build Your Agent Config in Seconds
Use CrewClaw's SOUL.md Generator to create a complete agent configuration. Works with Ollama, Anthropic, OpenAI, and any provider OpenClaw supports.