Tags: providers, ollama, local-ai, privacy
# Run Local LLMs with Ollama
Ishi Labs • January 17, 2026 • 2 min read

Want AI automation with zero cloud dependency? Ishi + Ollama gives you unlimited local inference. No API keys, no usage limits, complete privacy.
## Why Local AI?

- **100% Private** — data never leaves your machine
- **No API Costs** — unlimited inference after setup
- **Offline Ready** — works without internet
- **Air-Gap Compatible** — perfect for sensitive environments
## Setting Up Ollama

### Install Ollama

```bash
# macOS / Linux
curl -fsSL https://ollama.com/install.sh | sh

# On Windows, download the installer from ollama.com
```
### Pull a Model

```bash
# Fast and capable
ollama pull llama3.2

# Reasoning powerhouse
ollama pull deepseek-r1

# Code specialist
ollama pull qwen2.5-coder
```
### Configure Ishi

```json
{
  "provider": "ollama",
  "model": "llama3.2"
}
```
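Before pointing Ishi at Ollama, it's worth checking that the server is actually listening. A minimal probe against Ollama's REST API on its default port (11434):

```shell
# Probe /api/tags (the endpoint that lists pulled models) on Ollama's default port.
if curl -s --max-time 2 http://localhost:11434/api/tags > /dev/null 2>&1; then
  status="running"
else
  status="not reachable"
fi
echo "Ollama is $status"
# If it's not reachable, start the server with: ollama serve
```

If the probe fails, run `ollama serve` (or launch the desktop app) and try again.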
## Recommended Models

| Model | Size | VRAM | Best For |
|-------|------|------|----------|
| llama3.2:3b | 2GB | 4GB | Quick tasks |
| llama3.1:8b | 5GB | 8GB | General use |
| deepseek-r1 | 8GB | 10GB | Reasoning |
| qwen2.5-coder:7b | 5GB | 8GB | Code tasks |
## Performance Tips

### Maximize Speed

```json
{
  "provider": "ollama",
  "model": "llama3.2:3b",
  "options": {
    "num_ctx": 4096,
    "num_gpu": 99
  }
}
```
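These two options map directly to Ollama runtime settings: `num_ctx` caps the context window in tokens (smaller windows use less memory), and `num_gpu` sets how many model layers to offload to the GPU (a large value like 99 means "as many as fit"). Assuming Ishi passes them through unchanged, the equivalent raw request to Ollama is this sketch:

```shell
# Request body for Ollama's /api/generate endpoint carrying the same options.
payload='{"model":"llama3.2:3b","prompt":"Hello","stream":false,"options":{"num_ctx":4096,"num_gpu":99}}'
echo "$payload"
# To send it: curl -s http://localhost:11434/api/generate -d "$payload"
```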
### M-Series Mac Optimization

Apple Silicon runs Ollama exceptionally well. Approximate throughput for 8B-class models:

- M1: 8-15 tok/s
- M2 Pro: 20-30 tok/s
- M3 Max: 40-50 tok/s
## Real-World Example: Offline File Organization

**Scenario:** organizing confidential HR documents.

You: "Sort these employee files by department"

Ishi + Ollama (fully offline):

1. Reads file contents locally
2. The LLM categorizes each file by department
3. Ghost preview shows the proposed folder structure
4. Execute — no data ever leaves your machine
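The categorization step (2) can be sketched against Ollama directly. `build_prompt` here is a hypothetical helper, not part of Ishi, and the request (shown commented out) assumes Ollama's default endpoint:

```shell
# build_prompt is a hypothetical helper: it wraps a file's text in a question
# the local model can answer with a single department name.
build_prompt() {
  printf 'Which department does this employee file belong to? Answer with one word.\n\n%s\n' "$1"
}

prompt=$(build_prompt "Name: A. Smith. Role: Payroll specialist.")
echo "$prompt"
# Send it to the local model -- everything stays on localhost:
# curl -s http://localhost:11434/api/generate \
#   -d "{\"model\": \"llama3.2\", \"prompt\": \"<prompt text>\", \"stream\": false}"
```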
## Hybrid Mode: Best of Both

Use Ollama for privacy-sensitive tasks, cloud models for complex reasoning:

```json
{
  "providers": {
    "default": "ollama",
    "reasoning": "anthropic"
  }
}
```
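In a hybrid setup only the cloud provider needs credentials; the local side stays key-free. The shape below is an assumption about Ishi's config format (the `anthropic.apiKey` field is hypothetical), shown only to illustrate where a cloud key would live:

```json
{
  "providers": {
    "default": "ollama",
    "reasoning": "anthropic"
  },
  "anthropic": {
    "apiKey": "${ANTHROPIC_API_KEY}"
  }
}
```

Keeping the key in an environment variable rather than the config file keeps secrets out of version control.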
Get started: Download Ishi | Ollama Docs