~/your-repo · zsh
# 1. scaffold the four Ralph artifacts in your repo
$ npx ralpharium init
↳ created PROMPT.md, AGENTS.md, IMPLEMENTATION_PLAN.md, specs/, .ralph/
 
# 2. boot the daemon — opens browser at localhost:3000
$ npx ralpharium start
↳ daemon up · websocket open · http://localhost:3000
 
# 3. (optional) other entry points
$ npx ralpharium dashboard
$ npx ralpharium ram
$ npx ralpharium start --port=4000
Live on npm · npx ralpharium init · runs entirely on your machine

AI coding loops, under control.

Ralpharium is a small local app that runs AI coding loops — Claude, Codex, Aider, or any shell command — and gives you a dashboard to start, watch, and stop them safely. See what the AI changed, why it stopped, and whether it's safe to keep going. Nothing leaves your machine.

  • Local · runs on your machine
  • Any AI · Claude · Codex · Aider · custom
  • One-click · start / pause / stop
  • Test-gated · stops on test failure
  • Replay · inspect any iteration
About Ralpharium

The Ralph Loop, made observable.

Ralpharium is a local-first control plane for autonomous AI coding loops. You point it at a repo, pick a runner (Claude / Codex / Aider / any shell command), and click Start. It runs the AI in a tight cycle, gates each pass on tests and lint, and shows you exactly what changed — every iteration, every commit, every failure — in a dashboard you control. Nothing leaves your machine.

What it is, in one screen

A local-first dashboard for autonomous AI coding loops.

A small Python daemon

Runs in your terminal at localhost:3000. Spawns your AI runner as a subprocess, watches stdout, parses output, persists every iteration. No cloud, no accounts, no telemetry.
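The core move (spawn the runner as a subprocess, watch its stdout) can be sketched in a few lines of Python. This is an illustrative sketch, not Ralpharium's source: the "error" sniffing stands in for the real output parser, and the demo runs a harmless Python one-liner instead of an AI CLI.

```python
import subprocess
import sys

def run_iteration(cmd: list[str], cwd: str = ".") -> tuple[int, list[str]]:
    """Spawn a runner command and stream its stdout line by line."""
    proc = subprocess.Popen(
        cmd, cwd=cwd,
        stdout=subprocess.PIPE, stderr=subprocess.STDOUT, text=True,
    )
    lines = []
    for line in proc.stdout:            # arrives as the runner prints it
        line = line.rstrip("\n")
        lines.append(line)
        if "error" in line.lower():     # naive stand-in for the real parser
            print("! possible failure:", line)
    proc.wait()
    return proc.returncode, lines

# demo with a harmless command instead of a real AI runner
code, out = run_iteration([sys.executable, "-c", "print('hello'); print('done')"])
```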

An operator dashboard

Start / pause / panic-stop the loop. Iteration timeline, plan health, validation backpressure, spec coverage, and a live event stream — all updating without a refresh.

8 specialized agents in shared memory

Spec writer, researcher, planner, builder, reviewer, debugger, magpie, tagger. Each owns a slot in a 64KB multiprocessing.shared_memory segment and is observable on the RAM page.

Replayable iteration history

Every loop pass is appended to .ralph/iterations.jsonl: mode, status, files changed, commit SHA, validation, last 4KB of runner output. Click any row to inspect.
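Because the log is plain JSONL, replay tooling is a few lines of Python. A sketch, assuming the default log path and the field names from the iteration schema (status, commit_sha, and so on):

```python
import json
from pathlib import Path

def load_iterations(path: str = ".ralph/iterations.jsonl") -> list[dict]:
    """Read the append-only log: one JSON object per line."""
    p = Path(path)
    if not p.exists():
        return []
    return [json.loads(ln) for ln in p.read_text().splitlines() if ln.strip()]

def failures(iterations: list[dict]) -> list[dict]:
    """Filter to failed passes, e.g. to hunt a recurring error."""
    return [it for it in iterations if it.get("status") == "failed"]
```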

Not a replacement for Claude / Codex / Aider — it runs them. Pick the runner, configure the command, click Start. Ralpharium provides the context, gates, observation, and stop conditions around it.
The original

Vanilla Ralph Loop — powerful, but rough.

The Ralph Wiggum technique by @ghuntley is a methodology: run Claude in a bash while loop, treat it as a capable-but-naive executor, steer behavior with engineered context (PROMPT.md, AGENTS.md, IMPLEMENTATION_PLAN.md, specs/*.md) re-read every iteration. It works — but it has real pain points:

  • Black box. All you see is bash stdout streaming past. No way to inspect intermediate state.
  • One monolithic prompt. Plan, code, validate, classify — all done in a single call. No specialization.
  • Manual stop only. If Ralph rewrites the same file 5 times with the same error, you don't notice until you Ctrl-C and read the log.
  • No replay. Past iterations live in git log at best. The reasoning, validation output, and prompts used are gone.
  • No structured guardrails. Stop conditions are whatever your bash script remembers to check.

Original playbook: github.com/ghuntley/how-to-ralph-wiggum

Ralpharium's upgrades

What we kept, fixed, and added.

  • From black box → glass box.

    Every part of an iteration is observable in real time. Live blackboard, hex view of the shared-memory segment, event ring buffer, per-agent drill-down with prompt + decision history, replayable JSONL on disk.

  • From 1 monolithic loop → 8 specialized agents.

    spec_writer + researcher handle phase-1 context. planner + builder drive the build cycle. reviewer + debugger own validation backpressure. magpie + tagger capture and classify each iteration's output. Each has its own state, latency, and history.

  • From .md files on disk → 64KB shared memory.

    Agents pass structured work through an OS-level multiprocessing.shared_memory segment instead of re-reading the same files every iteration. Fewer disk hits, instant cross-agent visibility, observable as live JSON.

  • From manual Ctrl-C → real stop conditions.

    One-toggle gates: stop_on_failure, stop_if_no_commit, stop_if_dirty_before_run, max_iterations. Plus a thrash detector that pauses the loop when the same files are modified 3+ times with the same failure — before more tokens get burned.

  • From "tail bash logs" → replayable iterations.

    Append-only .ralph/iterations.jsonl stores every iteration's mode, status, files changed, validation result, command output (last 4KB), commit SHA, and failure reason. Click any row in the timeline to inspect or re-run.

  • From single-shell session → REST + WebSocket API.

    Any tool can POST /api/iterations to start a pass and PATCH when it finishes. The dashboard subscribes to a live WebSocket — multiple tabs and external clients see the same state.
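Driving the API from a script looks roughly like this. The POST /api/iterations endpoint is the one named above; the JSON body shown here (a single mode field) is a guess for illustration, not the documented payload.

```python
import json
import urllib.request

BASE = "http://localhost:3000"

def start_iteration(mode: str = "build") -> urllib.request.Request:
    # Body fields are illustrative guesses; only the endpoint is documented.
    payload = json.dumps({"mode": mode}).encode()
    return urllib.request.Request(
        f"{BASE}/api/iterations", data=payload,
        headers={"Content-Type": "application/json"}, method="POST",
    )

try:
    with urllib.request.urlopen(start_iteration(), timeout=2) as resp:
        print(resp.status)
except OSError:
    print("daemon not running")  # expected outside a live session
```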

Inside an iteration

What actually happens when you click Start.

A single iteration is one full pass of the 8-agent cycle. The daemon orchestrates it; the dashboard streams every step over WebSocket. Here's the sequence, end to end:

  1. A · Pre-flight — Planner + Researcher

    The Planner reads IMPLEMENTATION_PLAN.md and picks the first unchecked task. The Researcher scans the repo (branch, dirty state, spec count) and writes that context to the shared blackboard so every other agent sees the same snapshot.

  2. B · Build phase — Builder runs the subprocess

    The Builder spawns your runner command (e.g. claude -p "$(cat PROMPT.md)") with cwd set to your repo. Stdout streams line-by-line into the RAM event log, parsed for errors and test mentions. The blackboard's last_error, test_output, and pid slots update in real time.

  3. C · Backpressure — Reviewer + Debugger

    When the subprocess exits, the Reviewer runs your validation gates (npm test / lint / typecheck / build) auto-detected from package.json. If anything fails, the Debugger classifies the failure (which check, what kind of error) and writes it to the blackboard. With stop_on_failure on, the loop halts here.

  4. D · Post-loop — Magpie + Tagger

    If the iteration produced a real commit, the Magpie collects the artifacts (commit SHA, files changed, scratchpad notes) and the Tagger classifies the iteration (feature / fix / refactor / docs) by looking at file paths and the iteration summary.

  5. E · Persist + broadcast

    The full iteration record (mode, status, files changed, commit SHA, validation results, last 4KB of runner output, failure reason) is appended to .ralph/iterations.jsonl. The daemon broadcasts a fresh snapshot over WebSocket — your dashboard timeline, RAM agent grid, plan health, and event stream all update without a refresh.

  6. F · Loop or stop

    The loop checks stop conditions (max iterations hit, panic, dirty tree, no commit, validation failed). If clean, it goes back to step A and picks the next plan task. The thrash detector watches for "same files modified with the same error 3+ times" and pauses the loop before tokens are wasted.

Key idea: the AI runner does the actual code work. Ralpharium provides the context, validation, observation, and stop conditions around it — turning a one-shot prompt into a steerable, replayable loop.
How it works

From terminal to dashboard in five steps.

The whole product fits on one chalkboard. No accounts, no cloud — Ralpharium runs on your machine, talks to your AI CLI, and shows you what happened.

  1. Scaffold the repo

    Run inside any folder. Creates PROMPT.md, AGENTS.md, IMPLEMENTATION_PLAN.md, specs/, and .ralph/.

    npx ralpharium init
  2. Boot the daemon

    Spawns the FastAPI server, opens the dashboard automatically at localhost:3000.

    npx ralpharium start
  3. Pick a runner + click Start

    In Loop configuration, choose Claude / Codex / Aider / custom. Empty command = safe fallback (so you can demo without auth).

  4. Watch it work

    Iteration timeline, plan health, validation, agent activity stream live. Open npx ralpharium ram for the under-the-hood debug view.

    npx ralpharium ram
  5. Stop or continue

    Pause between iterations with Space, panic-stop with Ctrl/⌘+P, or let it loop until your stop conditions hit.

Under the hood

The tech.

Backend

Python 3.11+ FastAPI daemon. WebSocket for live updates, REST for first paint. Single Uvicorn process — no database, no message queue.

Shared memory

An OS-level multiprocessing.shared_memory segment holds the live blackboard JSON. The 8 agents read and write structured slots; everything is observable in real time.
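A minimal version of that blackboard is easy to sketch with the standard library. Assumptions in this sketch: the segment holds NUL-terminated JSON and there is no cross-process locking; the real slot layout and synchronization are Ralpharium's own.

```python
import json
from multiprocessing import shared_memory

SIZE = 65536  # 64KB, as above

def write_blackboard(shm: shared_memory.SharedMemory, state: dict) -> None:
    raw = json.dumps(state).encode()
    assert len(raw) < SIZE, "blackboard overflow"
    shm.buf[:len(raw)] = raw
    shm.buf[len(raw)] = 0     # NUL terminator marks the end of the JSON
    # a real implementation also needs a lock between writer processes

def read_blackboard(shm: shared_memory.SharedMemory) -> dict:
    raw = bytes(shm.buf).split(b"\x00", 1)[0]
    return json.loads(raw) if raw else {}

shm = shared_memory.SharedMemory(create=True, size=SIZE)
try:
    write_blackboard(shm, {"planner": {"task": "next unchecked item"}})
    print(read_blackboard(shm))
finally:
    shm.close()
    shm.unlink()
```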

Frontend

Vanilla JS + CSS. No framework, no build step, no bundler. The dashboard at /dashboard and the live debug view at /ram are static files served by the daemon.

Persistence

Append-only JSONL at .ralph/iterations.jsonl. Every iteration's mode, status, files changed, validation result, and commit SHA — replayable, greppable, gitignorable.

Backpressure

Auto-detects npm test / lint / typecheck / build from package.json (and Python equivalents). One-click validation, with optional stop-on-failure.
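The auto-detection can be approximated by scanning package.json scripts for well-known names. A sketch: the script names checked come from the list above, and mapping each to npm run <name> is an assumption, not Ralpharium's exact logic.

```python
import json
from pathlib import Path

CANDIDATES = ("test", "lint", "typecheck", "build")

def detect_checks(repo: str = ".") -> list[str]:
    """Map package.json scripts to runnable validation commands."""
    pkg = Path(repo) / "package.json"
    if not pkg.exists():
        return []
    scripts = json.loads(pkg.read_text()).get("scripts", {})
    return [f"npm run {name}" for name in CANDIDATES if name in scripts]
```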

CLI

Tiny Node launcher (bin/ralph-studio.js) that spawns the Python daemon. The npm bin is ralpharium (and ralph-studio for back-compat).

Plain English

Five words you'll see everywhere.

Runner
The AI coding CLI you choose — Claude Code, Codex, Aider, OpenRouter via aider, or any custom shell command.
Loop
One run of the runner. A "continuous loop" is repeated runs with stop conditions you set.
Dashboard
The control room at /dashboard. Start, pause, stop. Watch iterations stream in.
RAM
The live debug page at /ram. Real-time internals — 8 agents, blackboard, event stream, hex viewer.
Backpressure
Tests, lint, typecheck, and git checks that gate the loop. If they fail, Ralpharium stops the run before bad code lands.
Install

Two commands to a working dashboard.

Live on npm as ralpharium. Drop it into any repo, init scaffolds the files, start opens the control room at localhost:3000. Nothing leaves your machine.

Requires
  • Node 18+ — comes with npx. Check: node -v
  • Python 3.11+ — backend daemon. Check: python --version (or python3 --version). Get it at python.org if missing.
  • Optional: Claude / Codex / Aider CLI — only needed for real iterations. Empty runner command = safe fallback so you can demo without auth.

FastAPI & uvicorn auto-install on first run — you don't have to do anything.

npm install -g ralpharium for a global install. Requires Node 18+ and Python 3.11+.
Per iteration

Every run leaves a trail.

Eight things are tracked every time the AI runs: the prompt it read, the plan tasks it touched, the specs in scope, what it ran, what passed or failed, what it committed, the output you can replay, and the guardrails suggested from it.

  • 01 Prompt · PROMPT.md re-read · input
  • 02 Plan · tasks parsed · state
  • 03 Spec match · specs/* mapped · scope
  • 04 Run · runner subprocess · execute
  • 05 Validate · tests · lint · build · backpressure
  • 06 Commit · atomic git commit · record
  • 07 Replay · drill into output · audit
  • 08 Guardrail · rules suggested · harden
Files the AI reads

A few markdown files, parsed for you.

The AI reads a small set of files in your repo each iteration — Ralpharium reads them too, so you always know what the AI is working from.

PROMPT.md

per-iteration instruction

Re-read at the start of every iteration. Studio shows file presence, last-modified, and the active prompt mode.

artifact · required

AGENTS.md

operational rules

Build/test commands, scope limits, commit hygiene. Studio surfaces it and proposes new rules from failure history.

artifact · recommended

IMPLEMENTATION_PLAN.md

persistent state

Parsed into tasks: completed, pending, blocked, stale. Drift and repeated-work warnings raised automatically.

artifact · parsed

specs/*.md

source of truth

Each spec is mapped against plan tasks and recent commits — covered, partial, drifting, or ignored.

artifact · coverage

Iteration log

append-only history

Every iteration's mode, status, files changed, commit SHA, validation result, and command output — replayable.

history · JSONL

Backpressure

validation gate

Auto-detects npm test, lint, typecheck, build, plus working-tree status. One click to run.

checks · auto-detected

Commits

atomic record

Recent commits surfaced inline. Iterations link to the SHA they produced — or flag if no commit was created.

git · per iteration

Guardrails

learned rules

Repeat-failure heuristics turn into AGENTS.md suggestions. "Tests failed 3×, require npm test."

suggestions · history-driven
  iteration anatomy · JSONL

  field            type           role
  id · number      str · int      unique handle, monotonic counter
  mode             plan / build   which prompt was used
  status           enum           running · passed · failed · stopped
  files_changed    list[str]      derived from commit
  commit_sha       str?           linked to git history
  validation       list[check]    tests/lint/typecheck results
  command_output   str (tail)     last 4KB of runner stdout
  failure_reason   str?           only on failure
Iteration anatomy

Every loop pass is replayable.

When Ralph finishes — or fails — Studio writes a structured record to .ralph/iterations.jsonl. Click any row in the timeline to see the prompt used, the plan diff, the validation output, and the resulting commit.

  • Append-only.

    Local JSONL. Easy to tail -f, easy to ship to wherever later.

  • External hooks.

    POST /api/iterations from any CLI to start an iteration; PATCH when it finishes.

  • Convergence signals.

    Failure clustering, dirty trees, missing commits — surfaced as guardrail suggestions.

Dashboard

What the dashboard does.

Six clearly-labelled cards: loop status, iteration timeline, plan health, validation, spec coverage, and a live runtime panel. Click anything for full detail.

Loop status header

Mode, iteration, runner, branch, dirty tree — and one-click start, pause, stop, panic.

controls · WebSocket

Iteration timeline

Recent passes with mode, status, duration, files changed, commit SHA. Click to replay.

history · local

Plan health

Tasks parsed from IMPLEMENTATION_PLAN.md. Next, blocked, stale. Drift warnings.

parser · regex

Backpressure

Auto-detected validation. Tests, lint, typecheck, build, git clean. Run any of them on demand.

subprocess · 1-click

Spec coverage map

Each spec is covered, partial, drifting, or ignored — derived from plan + commits.

heuristic · per file

Guardrails

Surfaces PROMPT.md and AGENTS.md, plus rules learned from repeat failures.

suggestions · history-driven
Roadmap

What's shipping next.

Built in the open. Every milestone is real work — no vapor, no "trust me bro." If a phase says shipped, it's already on your machine when you install. If it says next, the spec is written and the wiring is half-done.

Phase 0 · Foundations

Shipped
  • Local FastAPI daemon + WebSocket dashboard
  • Iteration JSONL store at .ralph/iterations.jsonl
  • Plan parser, spec coverage, validation backpressure
  • RAM page: blackboard, event stream, scratchpad, checkpoints
Phase 1 · 8 agents, made observable

Shipped
  • AgentRoster with 8 fixed roles wired into the iteration lifecycle
  • 8-card grid with click-through prompt + decision history
  • Thrash detector — "Ralph keeps Ralphing the wrong thing"
  • Live broadcast on every agent transition + localStorage cache
Phase 2 · Real autonomy

Next · in progress
  • Cost telemetry per iteration. Parse runner stdout for tokens, surface $/iteration, $/session, $/spec_completed.
  • Git-worktree-per-iteration sandbox. No edits land on main until validation passes. Auto-rollback on failure.
  • Bidirectional question channel. Agent writes a question, loop pauses, you reply, loop resumes — without losing context.
  • Time + token budgets. Kill runaway iterations after N seconds or M tokens.
Phase 3 · Multiplayer + resilience

Soon
  • Daemon crash recovery. Replay iterations.jsonl on boot — pick up exactly where the loop died.
  • Multi-client presence. Two devs watching the same loop see each other's cursors and selections.
  • Replay on a different model. Re-run a past iteration on Sonnet vs Opus vs Haiku and diff the outputs.
  • Real per-agent LLM calls. Phase 1 synthesizes agent activity from lifecycle events. Phase 3 = each of the 8 agents is a real LLM call with its own system prompt and history.

North star

The dream
  • Self-spec'ing loops. The runner writes specs/*.md from a chat thread, then builds against them. Phase-1 of Wiggum, automated.
  • Multi-repo orchestration. One Ralpharium daemon coordinating loops across a monorepo's services or a fleet of microrepos.
  • Visual diff approval. Each commit lands on a feature branch; dashboard shows the diff with one-click approve/revert.
  • Public agent marketplace. Drop-in custom agents (security_reviewer, perf_auditor, accessibility_critic) you compose into your roster.
Want something on this list sooner? File an issue on GitHub or open a PR. Local-first stays local — but the roadmap is open.
Ready when you are

Run a loop. Watch what changes.

Open the dashboard, pick a runner, click Start. If something fails, the loop stops. If it commits, you see the SHA. Everything stays on your machine.