Why ISHI Runs Locally: Your Computer, Your Context, Your Intelligence

We've received great questions from the community about ISHI's emphasis on local computing. Here's a deep dive into why we believe local-first AI isn't just a feature—it's fundamental to how developers should work with AI.
Reason 1: Context Lives Where You Work
Everything you work with is on your computer. Your files, your project structures, your environment variables, your SSH keys, your git history. When you use cloud-only AI tools, you're constantly uploading fragments of this context—but the AI never truly understands your workspace.
With ISHI running locally:
- Your entire codebase is accessible — No manual file uploads
- Environment context is preserved — ISHI sees what you see
- Cross-project understanding — Reference patterns from other local projects
- IDE integration — Seamless connection to your development environment
Reason 2: Data Ownership & Persistence
Here's a scenario that happens more often than you'd think: Your cloud AI provider bans your account, or has an outage, or changes their terms. Suddenly, all those carefully crafted conversations and context? Gone.
With local-first:
- Your data stays yours — No third-party can revoke access
- Persistent memory — ISHI remembers across sessions
- Behavior customization — Drag a file to steer AI behavior instead of repeating prompts
- No cloud dependency — Work even when the internet is unreliable
Reason 3: Leverage Intelligence You Already Pay For
This is the big one. If you're already paying for Claude Pro, ChatGPT Plus, or Gemini Advanced, you're paying for that intelligence. ISHI lets you use these subscriptions as your local AI backend.
On desktop, we use techniques to route through the frontier models you already subscribe to. It's like getting a powerful AI agent for free—or very cheaply—because you already pay for the intelligence elsewhere.
With cloud platforms, this isn't possible. You'd be paying twice: once for your personal subscription, and again for cloud AI costs.
What About AgenticFlow Cloud?
Great question! Here's the distinction:
- Local ISHI → Your personal development, uses your subscriptions, your files and context, experimentation & iteration
- AgenticFlow Cloud → Production workloads, managed infrastructure, deployed to teams/customers, scalable workflows
The workflow: Build and perfect your AI workflows locally with ISHI. When they work well and you want to share them with your team, clients, or customers—deploy to AgenticFlow Cloud.
Running Your Own LLM with Ollama
Want complete independence from cloud providers? ISHI integrates seamlessly with Ollama, the leading open-source local LLM runtime. Here's how to set it up:
Step 1: Install Ollama
```shell
# macOS / Linux
curl -fsSL https://ollama.com/install.sh | sh

# Or download from https://ollama.com/download
```
Step 2: Pull a Model
```shell
# For coding tasks, we recommend Qwen 2.5 Coder or DeepSeek Coder
ollama pull qwen2.5-coder:7b

# For general tasks, try Llama 3.2
ollama pull llama3.2:3b

# For powerful reasoning (requires 32GB+ RAM)
ollama pull deepseek-r1:14b
```
Step 3: Configure ISHI
Add this to your ISHI config:
```json
{
  "$schema": "https://ishi.so/config.json",
  "provider": {
    "ollama": {
      "npm": "@ai-sdk/openai-compatible",
      "name": "Ollama (local)",
      "options": {
        "baseURL": "http://localhost:11434/v1"
      },
      "models": {
        "qwen2.5-coder:7b": { "name": "Qwen 2.5 Coder 7B" },
        "llama3.2:3b": { "name": "Llama 3.2 3B" }
      }
    }
  }
}
```
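Before wiring the baseURL into ISHI, it can help to confirm that Ollama's OpenAI-compatible endpoint is actually reachable. Here's a minimal Python sketch; the helper names are illustrative and not part of ISHI or Ollama, and it assumes the daemon is running on its default port:

```python
import json
from urllib import request

# Matches the baseURL in the config above; Ollama serves an
# OpenAI-compatible API under /v1 by default.
OLLAMA_BASE_URL = "http://localhost:11434/v1"

def build_chat_request(model: str, prompt: str) -> request.Request:
    """Build a chat-completions request without sending it."""
    payload = {"model": model, "messages": [{"role": "user", "content": prompt}]}
    return request.Request(
        f"{OLLAMA_BASE_URL}/chat/completions",
        data=json.dumps(payload).encode(),
        headers={"Content-Type": "application/json"},
    )

def chat(model: str, prompt: str) -> str:
    """Send the request. Requires a running Ollama daemon."""
    with request.urlopen(build_chat_request(model, prompt)) as resp:
        return json.load(resp)["choices"][0]["message"]["content"]
```

If something like `chat("qwen2.5-coder:7b", "Say hello")` returns text, ISHI's provider block should work against the same baseURL.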
Why Local LLMs Matter
The open-source LLM community has exploded. Projects like llama.cpp, Ollama, and LM Studio make it trivial to run powerful models on consumer hardware:
- Llama 3.2 — Meta's latest, excellent general performance
- Qwen 2.5 Coder — Alibaba's coding-optimized model
- DeepSeek R1 — Reasoning model rivaling o1
- Mistral/Mixtral — European open-weight champions
- Phi-3 — Microsoft's small but mighty models
With a modern MacBook (M1/M2/M3/M4), you can run 7B-14B parameter models at conversational speeds. With 64GB+ RAM, even 70B models become practical.
Pro tip: If tool calls aren't working with Ollama, try increasing num_ctx to 16k-32k in your Ollama configuration.
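One way to raise num_ctx permanently is a custom Modelfile. A sketch, assuming Ollama's Modelfile syntax; the qwen2.5-coder-16k name below is just an example:

```text
FROM qwen2.5-coder:7b
PARAMETER num_ctx 16384
```

Save this as Modelfile, run `ollama create qwen2.5-coder-16k -f Modelfile`, and then reference qwen2.5-coder-16k in your ISHI models block instead of the base tag.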
Hardware Considerations
We're building ISHI to work fully offline, which means local LLMs powered by your own machine. For optimal performance:
- RAM matters — 16GB minimum, 32GB+ recommended for larger models
- Apple Silicon excels — M1/M2/M3/M4 Macs offer exceptional AI performance per dollar
- GPU acceleration — NVIDIA GPUs with 8GB+ VRAM on Linux/Windows
- Storage — Models range from 4GB to 40GB+, SSD recommended
MacBooks, perhaps surprisingly, offer the best price/performance ratio for local AI workloads due to their unified memory architecture. The M3 Max with 128GB RAM can run models that previously required dedicated GPU servers.
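As a back-of-the-envelope guide (our own rough rule of thumb, not a vendor figure), a quantized model needs about params * bits/8 bytes for its weights, plus overhead for the KV cache and runtime:

```python
# Rough RAM estimate for a quantized model. The 20% overhead factor is an
# assumption; real usage grows with context length and quantization choice.
def estimate_ram_gb(params_billions: float, quant_bits: int = 4) -> float:
    weights_gb = params_billions * quant_bits / 8  # 1B params at 8 bits = 1 GB
    return round(weights_gb * 1.2, 1)              # ~20% overhead (assumption)

print(estimate_ram_gb(7))    # 7B at 4-bit  -> about 4.2 GB
print(estimate_ram_gb(70))   # 70B at 4-bit -> about 42 GB
```

This is why 7B models run comfortably on 16GB machines while 70B models want 64GB+. Long contexts push real usage well above these floors, which is why conservative guidance (like 32GB+ for a 14B reasoning model) is sensible.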
The Bottom Line
Local-first isn't about avoiding the cloud—it's about putting you in control. Your context, your data, your intelligence subscriptions, all working together on your machine.
Cloud platforms like AgenticFlow have their place for production and team deployments. But for the work you do every day? That should run where your work lives: locally.
Get started: Download ISHI | Ollama Setup Guide | All Providers