Why ISHI Runs Locally: Your Computer, Your Context, Your Intelligence

We've received great questions from the community about ISHI's emphasis on local computing. Here's a deep dive into why we believe local-first AI isn't just a feature—it's fundamental to how developers should work with AI.
Reason 1: Context Lives Where You Work
Everything you work with is on your computer. Your files, your project structures, your environment variables, your SSH keys, your git history. When you use cloud-only AI tools, you're constantly uploading fragments of this context—but the AI never truly understands your workspace.
With ISHI running locally:
- Your entire codebase is accessible — No manual file uploads
- Environment context is preserved — ISHI sees what you see
- Cross-project understanding — Reference patterns from other local projects
- IDE integration — Seamless connection to your development environment
Reason 2: Data Ownership & Persistence
Here's a scenario that happens more often than you'd think: Your cloud AI provider bans your account, or has an outage, or changes their terms. Suddenly, all those carefully crafted conversations and context? Gone.
With local-first:
- Your data stays yours — No third-party can revoke access
- Persistent memory — ISHI remembers across sessions
- Behavior customization — Drag a file to steer AI behavior instead of repeating prompts
- No cloud dependency — Work even when the internet is unreliable
Reason 3: Leverage Intelligence You Already Pay For
This is the big one. If you're already paying for Claude Pro, ChatGPT Plus, or Gemini Advanced, you're paying for that intelligence. ISHI lets you use these subscriptions as your local AI backend.
On desktop, we use techniques to route through the frontier models you already subscribe to. It's like getting a powerful AI agent for free—or very cheaply—because you already pay for the intelligence elsewhere.
With cloud platforms, this isn't possible. You'd be paying twice: once for your personal subscription, and again for cloud AI costs.
What About AgenticFlow Cloud?
Great question! Here's the distinction:
- Local ISHI → Your personal development, uses your subscriptions, your files and context, experimentation & iteration
- AgenticFlow Cloud → Production workloads, managed infrastructure, deployed to teams/customers, scalable workflows
The workflow: Build and perfect your AI workflows locally with ISHI. When they work well and you want to share them with your team, clients, or customers—deploy to AgenticFlow Cloud.
Running Your Own LLM with Ollama
Want complete independence from cloud providers? ISHI integrates seamlessly with Ollama, the leading open-source local LLM runtime. Here's how to set it up:
Step 1: Install Ollama
```shell
# macOS / Linux
curl -fsSL https://ollama.com/install.sh | sh

# Or download from https://ollama.com/download
```
Step 2: Pull a Model
```shell
# For coding tasks, we recommend Qwen 2.5 Coder or DeepSeek Coder
ollama pull qwen2.5-coder:7b

# For general tasks, try Llama 3.2
ollama pull llama3.2:3b

# For powerful reasoning (requires 32GB+ RAM)
ollama pull deepseek-r1:14b
```
Step 3: Configure ISHI
Add this to your ISHI config:
```json
{
  "$schema": "https://ishi.so/config.json",
  "provider": {
    "ollama": {
      "npm": "@ai-sdk/openai-compatible",
      "name": "Ollama (local)",
      "options": {
        "baseURL": "http://localhost:11434/v1"
      },
      "models": {
        "qwen2.5-coder:7b": { "name": "Qwen 2.5 Coder 7B" },
        "llama3.2:3b": { "name": "Llama 3.2 3B" }
      }
    }
  }
}
```
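Before wiring the baseURL into ISHI, it can help to confirm that Ollama's OpenAI-compatible endpoint is actually reachable. Here's a minimal Python sketch; the helper names are illustrative and not part of ISHI or Ollama, and it assumes the daemon is running on its default port:

```python
import json
from urllib import request

# Matches the baseURL in the config above; Ollama serves an
# OpenAI-compatible API under /v1 by default.
OLLAMA_BASE_URL = "http://localhost:11434/v1"

def build_chat_request(model: str, prompt: str) -> request.Request:
    """Build a chat-completions request without sending it."""
    payload = {"model": model, "messages": [{"role": "user", "content": prompt}]}
    return request.Request(
        f"{OLLAMA_BASE_URL}/chat/completions",
        data=json.dumps(payload).encode(),
        headers={"Content-Type": "application/json"},
    )

def chat(model: str, prompt: str) -> str:
    """Send the request. Requires a running Ollama daemon."""
    with request.urlopen(build_chat_request(model, prompt)) as resp:
        return json.load(resp)["choices"][0]["message"]["content"]
```

If something like `chat("qwen2.5-coder:7b", "Say hello")` returns text, ISHI's provider block should work against the same baseURL.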
Why Local LLMs Matter
The open-source LLM community has exploded. Projects like llama.cpp, Ollama, and LM Studio make it trivial to run powerful models on consumer hardware:
- Llama 3.2 — Meta's latest, excellent general performance
- Qwen 2.5 Coder — Alibaba's coding-optimized model
- DeepSeek R1 — Reasoning model rivaling o1
- Mistral/Mixtral — European open-weight champions
- Phi-3 — Microsoft's small but mighty models
With a modern MacBook (M1/M2/M3/M4), you can run 7B-14B parameter models at conversational speeds. With 64GB+ RAM, even 70B models become practical.
Pro tip: If tool calls aren't working with Ollama, try increasing num_ctx to 16k-32k in your Ollama configuration.
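One way to raise num_ctx permanently is a custom Modelfile. A sketch, assuming Ollama's Modelfile syntax; the qwen2.5-coder-16k name below is just an example:

```text
FROM qwen2.5-coder:7b
PARAMETER num_ctx 16384
```

Save this as Modelfile, run `ollama create qwen2.5-coder-16k -f Modelfile`, and then reference qwen2.5-coder-16k in your ISHI models block instead of the base tag.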
Hardware Considerations
We're building ISHI to work fully offline, which means local LLMs powered by your own machine. For optimal performance:
- RAM matters — 16GB minimum, 32GB+ recommended for larger models
- Apple Silicon excels — M1/M2/M3/M4 Macs offer exceptional AI performance per dollar
- GPU acceleration — NVIDIA GPUs with 8GB+ VRAM on Linux/Windows
- Storage — Models range from 4GB to 40GB+, SSD recommended
MacBooks, perhaps surprisingly, offer the best price/performance ratio for local AI workloads due to their unified memory architecture. The M3 Max with 128GB RAM can run models that previously required dedicated GPU servers.
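As a back-of-the-envelope guide (our own rough rule of thumb, not a vendor figure), a quantized model needs about params * bits/8 bytes for its weights, plus overhead for the KV cache and runtime:

```python
# Rough RAM estimate for a quantized model. The 20% overhead factor is an
# assumption; real usage grows with context length and quantization choice.
def estimate_ram_gb(params_billions: float, quant_bits: int = 4) -> float:
    weights_gb = params_billions * quant_bits / 8  # 1B params at 8 bits = 1 GB
    return round(weights_gb * 1.2, 1)              # ~20% overhead (assumption)

print(estimate_ram_gb(7))    # 7B at 4-bit  -> about 4.2 GB
print(estimate_ram_gb(70))   # 70B at 4-bit -> about 42 GB
```

This is why 7B models run comfortably on 16GB machines while 70B models want 64GB+. Long contexts push real usage well above these floors, which is why conservative guidance (like 32GB+ for a 14B reasoning model) is sensible.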
The Bottom Line
Local-first isn't about avoiding the cloud—it's about putting you in control. Your context, your data, your intelligence subscriptions, all working together on your machine.
Cloud platforms like AgenticFlow have their place for production and team deployments. But for the work you do every day? That should run where your work lives: locally.
Get started: Download ISHI | Ollama Setup Guide | All Providers