2025-09-22

Agentic AI running locally on a Raspberry Pi 5

Got agentic AI running 100% locally on my Arm-based Raspberry Pi 5. It can check its own system state and run actions independently. Adding more tools means loads of potential, and of course security considerations. (It's using the LLM's training on shell commands; I didn't provide it a list.)

The loop is: Voice, Plan, Act, Observe, Summarise, Speak, Remember.

How it works

Rather than one big prompt doing everything, the agent is three narrow LLM calls that hand off to each other — a planner, a judge, and a summariser — all the same small model (Qwen3 1.7B) running under Ollama on the Pi. Keeping each call to a single job is what makes a 1.7B model dependable: every step has one checkable output.

How the agent works: one small model in three roles on the Raspberry Pi 5 — a planner picks the next shell command, the command runs locally, a judge scores whether it answered the question and loops back to try another command if not, then a summariser turns the output into one spoken sentence. — *The same small model plays three roles, and nothing leaves the Pi.*

When you speak, a quick check happens first: the question is embedded and compared against past answers in a semantic cache, so anything I've asked before comes straight back without running the model. On a miss, the loop kicks in:

Plan — the planner proposes a single shell command and a one-line reason, as strict JSON.
Act — the command runs locally, with sudo stripped, a 5-second timeout, and a dry-run option.
Observe — the judge scores the output from 0 to 1: did that actually answer the question? If not, it loops back and tries a different command, up to eight steps.
Summarise & Speak — once the judge is satisfied, the summariser turns the raw terminal output into one plain sentence, which Piper speaks aloud.
Remember — the question and answer go into the cache, so next time the same ask is instant.

The stack

LLM: Google DeepMind Gemma 3 1B, or Qwen3 1.7B (Q4_K_M), via Ollama
Tools: shell command
Embeddings: all-MiniLM-L6-v2 (a semantic cache for fast responses)
Speech-to-text: Moonshine AI
Text-to-speech: Piper

Also published on LinkedIn.

#edge-ai #agentic-ai #raspberry-pi

Agentic AI running locally on a Raspberry Pi 5

#How it works

#The stack

How it works

The stack