Agentic AI running locally on a Raspberry Pi 5
Got agentic AI running 100% locally on my Arm-based Raspberry Pi 5. It can check its own system state and run actions independently. Adding more tools opens up loads of potential, along with the obvious security considerations. (It relies on what the LLM learned about shell commands in training; I didn't give it a list of commands.)
The loop is: Voice, Plan, Act, Observe, Summarise, Speak, Remember.
# How it works
Rather than one big prompt doing everything, the agent is three narrow LLM calls that hand off to each other — a planner, a judge, and a summariser — all the same small model (Qwen3 1.7B) running under Ollama on the Pi. Keeping each call to a single job is what makes a 1.7B model dependable: every step has one checkable output.
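A minimal sketch of how three narrow calls could share one model through Ollama's local HTTP API. The model tag `qwen3:1.7b`, the role prompts, and the helper names `build_request` and `call_role` are assumptions for illustration, not the code actually running on the Pi.

```python
import json
import urllib.request

OLLAMA_URL = "http://localhost:11434/api/generate"  # Ollama's default local endpoint
MODEL = "qwen3:1.7b"  # assumed Ollama tag for Qwen3 1.7B

# Hypothetical system prompts, one per role; the wording is illustrative only.
ROLES = {
    "planner": 'Reply with JSON only: {"command": "<one shell command>", "reason": "<one line>"}',
    "judge": 'Reply with JSON only: {"score": <number from 0 to 1>}',
    "summariser": "Summarise the terminal output in one plain sentence.",
}


def build_request(role: str, prompt: str) -> urllib.request.Request:
    """Build a non-streaming POST request for one narrow role."""
    payload = {
        "model": MODEL,
        "system": ROLES[role],
        "prompt": prompt,
        "stream": False,
    }
    # Planner and judge must return machine-checkable JSON; the summariser returns prose.
    if role != "summariser":
        payload["format"] = "json"
    return urllib.request.Request(
        OLLAMA_URL,
        data=json.dumps(payload).encode("utf-8"),
        headers={"Content-Type": "application/json"},
    )


def call_role(role: str, prompt: str, timeout: float = 60.0) -> str:
    """Send the request to a running Ollama server and return the model's text."""
    with urllib.request.urlopen(build_request(role, prompt), timeout=timeout) as resp:
        return json.loads(resp.read())["response"]
```

With Ollama running, `call_role("planner", "How hot is the CPU?")` would return the planner's JSON string. Each role differs only in its system prompt and output format, so every call has a single output to check.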
When you speak, a quick check happens first: the question is embedded and compared against past answers in a semantic cache, so anything I've asked before comes straight back without running the model. On a miss, the loop kicks in:
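The cache check can be sketched roughly as follows. The 0.85 similarity threshold, the `SemanticCache` class, and the letter-counting `toy_embed` function are all assumptions made to keep the example self-contained; the real setup embeds questions with all-MiniLM-L6-v2.

```python
import math
from typing import Callable, List, Optional, Tuple

Vector = List[float]


def cosine(a: Vector, b: Vector) -> float:
    """Cosine similarity between two equal-length vectors (0.0 if either is all zeros)."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0


class SemanticCache:
    """Stores (embedding, answer) pairs and returns the closest answer above a threshold."""

    def __init__(self, embed: Callable[[str], Vector], threshold: float = 0.85):
        self.embed = embed
        self.threshold = threshold  # assumed cut-off, not taken from the post
        self.entries: List[Tuple[Vector, str]] = []

    def get(self, question: str) -> Optional[str]:
        query = self.embed(question)
        best_score, best_answer = 0.0, None
        for vector, answer in self.entries:
            score = cosine(query, vector)
            if score > best_score:
                best_score, best_answer = score, answer
        # A miss returns None, which is the signal to run the full agent loop.
        return best_answer if best_score >= self.threshold else None

    def put(self, question: str, answer: str) -> None:
        self.entries.append((self.embed(question), answer))


def toy_embed(text: str) -> Vector:
    """Stand-in embedding (letter counts) so the sketch runs without a model."""
    counts = [0.0] * 26
    for ch in text.lower():
        if "a" <= ch <= "z":
            counts[ord(ch) - ord("a")] += 1.0
    return counts


cache = SemanticCache(toy_embed)
cache.put("what is the cpu temperature", "The CPU is at 48 degrees.")
hit = cache.get("What is the CPU temperature?")  # same letters, so this is a hit
miss = cache.get("zzz")                          # nothing similar stored
```

Swapping `toy_embed` for a function that wraps a real sentence-embedding model is what would let differently worded versions of the same question match.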
- Plan: the planner proposes a single shell command and a one-line reason, as strict JSON.
- Act: the command runs locally, with `sudo` stripped, a 5-second timeout, and a dry-run option.
- Observe: the judge scores the output from 0 to 1: did that actually answer the question? If not, it loops back and tries a different command, up to eight steps.
- Summarise & Speak: once the judge is satisfied, the summariser turns the raw terminal output into one plain sentence, which Piper speaks aloud.
- Remember: the question and answer go into the cache, so next time the same ask is instant.
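The steps above can be sketched as a small control loop. This is an illustration under stated assumptions, not the actual implementation: the `planner`, `judge`, and `summariser` callables stand in for the three LLM calls, the 0.7 pass score is a guess (the post doesn't say what counts as "satisfied"), and the regex used to strip `sudo` is deliberately crude.

```python
import json
import re
import subprocess
from typing import Callable, List, Optional, Tuple

MAX_STEPS = 8      # from the post: up to eight steps
TIMEOUT_S = 5      # from the post: 5-second timeout
PASS_SCORE = 0.7   # assumed threshold for "the judge is satisfied"

History = List[Tuple[str, str]]  # (command, output) pairs from earlier attempts


def strip_sudo(command: str) -> str:
    """Remove any standalone 'sudo' word so nothing runs with elevated rights."""
    return re.sub(r"\bsudo\b\s*", "", command).strip()


def run_command(command: str, dry_run: bool = False) -> str:
    """Run one shell command with sudo stripped and a timeout, or just echo it."""
    safe = strip_sudo(command)
    if dry_run:
        return f"[dry-run] {safe}"
    try:
        done = subprocess.run(
            safe, shell=True, capture_output=True, text=True, timeout=TIMEOUT_S
        )
        return done.stdout + done.stderr
    except subprocess.TimeoutExpired:
        return f"[timeout after {TIMEOUT_S}s] {safe}"


def agent_loop(
    question: str,
    planner: Callable[[str, History], str],
    judge: Callable[[str, str], float],
    summariser: Callable[[str, str], str],
    dry_run: bool = False,
) -> Optional[str]:
    """Plan, act, observe; repeat until the judge is satisfied or steps run out."""
    history: History = []
    for _ in range(MAX_STEPS):
        try:
            plan = json.loads(planner(question, history))  # expects {"command", "reason"}
            command = plan["command"]
        except (json.JSONDecodeError, KeyError, TypeError):
            continue  # malformed plan: spend a step and ask again
        output = run_command(command, dry_run)
        history.append((command, output))  # lets the planner try something different
        if judge(question, output) >= PASS_SCORE:
            return summariser(question, output)
    return None  # no satisfactory answer within MAX_STEPS
```

Passing `dry_run=True` returns the command that would have run instead of executing it, which is a handy way to see what a small model proposes before trusting it with a real shell.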
# The stack
- LLM: Google DeepMind Gemma 3 1B, or Qwen3 1.7B (Q4_K_M), via Ollama
- Tools: shell command
- Embeddings: all-MiniLM-L6-v2 (a semantic cache for fast responses)
- Speech-to-text: Moonshine AI
- Text-to-speech: Piper
Also published on LinkedIn.